From: Russell King rmk+kernel@armlinux.org.uk
[ Upstream commit ffd9a1ba9fdb7f2bd1d1ad9b9243d34e96756ba2 ]
DMA got broken a while back in two different ways: 1) a change in the behaviour of disable_irq() to wait for the interrupt to finish executing causes us to deadlock at the end of DMA. 2) a change to avoid modifying the scatterlist left the first transfer uninitialised.
DMA is only used with expansion cards, so has gone unnoticed.
Fixes: fa4e99899932 ("[ARM] dma: RiscPC: don't modify DMA SG entries") Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-rpc/dma.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mach-rpc/dma.c b/arch/arm/mach-rpc/dma.c index 488d5c3b37f4..799e0b016b62 100644 --- a/arch/arm/mach-rpc/dma.c +++ b/arch/arm/mach-rpc/dma.c @@ -128,7 +128,7 @@ static irqreturn_t iomd_dma_handle(int irq, void *dev_id) } while (1);
idma->state = ~DMA_ST_AB; - disable_irq(irq); + disable_irq_nosync(irq);
return IRQ_HANDLED; } @@ -177,6 +177,9 @@ static void iomd_enable_dma(unsigned int chan, dma_t *dma) DMA_FROM_DEVICE : DMA_TO_DEVICE); }
+ idma->dma_addr = idma->dma.sg->dma_address; + idma->dma_len = idma->dma.sg->length; + iomd_writeb(DMA_CR_C, dma_base + CR); idma->state = DMA_ST_AB; }
From: Douglas Anderson dianders@chromium.org
[ Upstream commit 1c0479023412ab7834f2e98b796eb0d8c627cd62 ]
As some point hs200 was failing on rk3288-veyron-minnie. See commit 984926781122 ("ARM: dts: rockchip: temporarily remove emmc hs200 speed from rk3288 minnie"). Although I didn't track down exactly when it started working, it seems to work OK now, so let's turn it back on.
To test this, I booted from SD card and then used this script to stress the enumeration process after fixing a memory leak [1]: cd /sys/bus/platform/drivers/dwmmc_rockchip for i in $(seq 1 3000); do echo "========================" $i echo ff0f0000.dwmmc > unbind sleep .5 echo ff0f0000.dwmmc > bind while true; do if [ -e /dev/mmcblk2 ]; then break; fi sleep .1 done done
It worked fine.
[1] https://lkml.kernel.org/r/20190503233526.226272-1-dianders@chromium.org
Signed-off-by: Douglas Anderson dianders@chromium.org Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/rk3288-veyron-minnie.dts | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/arch/arm/boot/dts/rk3288-veyron-minnie.dts b/arch/arm/boot/dts/rk3288-veyron-minnie.dts index 468a1818545d..ce57881625ec 100644 --- a/arch/arm/boot/dts/rk3288-veyron-minnie.dts +++ b/arch/arm/boot/dts/rk3288-veyron-minnie.dts @@ -90,10 +90,6 @@ pwm-off-delay-ms = <200>; };
-&emmc { - /delete-property/mmc-hs200-1_8v; -}; - &gpio_keys { pinctrl-0 = <&pwr_key_l &ap_lid_int_l &volum_down_l &volum_up_l>;
From: Douglas Anderson dianders@chromium.org
[ Upstream commit 99fa066710f75f18f4d9a5bc5f6a711968a581d5 ]
When I try to boot rk3288-veyron-mickey I totally fail to make the eMMC work. Specifically my logs (on Chrome OS 4.19):
mmc_host mmc1: card is non-removable. mmc_host mmc1: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0) mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0) mmc1: switch to bus width 8 failed mmc1: switch to bus width 4 failed mmc1: new high speed MMC card at address 0001 mmcblk1: mmc1:0001 HAG2e 14.7 GiB mmcblk1boot0: mmc1:0001 HAG2e partition 1 4.00 MiB mmcblk1boot1: mmc1:0001 HAG2e partition 2 4.00 MiB mmcblk1rpmb: mmc1:0001 HAG2e partition 3 4.00 MiB, chardev (243:0) mmc_host mmc1: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0) mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0) mmc1: switch to bus width 8 failed mmc1: switch to bus width 4 failed mmc1: tried to HW reset card, got error -110 mmcblk1: error -110 requesting status mmcblk1: recovery failed! print_req_error: I/O error, dev mmcblk1, sector 0 ...
When I remove the '/delete-property/mmc-hs200-1_8v' then everything is hunky dory.
That line comes from the original submission of the mickey dts upstream, so presumably at the time the HS200 was failing and just enumerating things as a high speed device was fine. ...or maybe it's just that some mickey devices work when enumerating at "high speed", just not mine?
In any case, hs200 seems good now. Let's turn it on.
Signed-off-by: Douglas Anderson dianders@chromium.org Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/rk3288-veyron-mickey.dts | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/arch/arm/boot/dts/rk3288-veyron-mickey.dts b/arch/arm/boot/dts/rk3288-veyron-mickey.dts index e852594417b5..b13f87792e9f 100644 --- a/arch/arm/boot/dts/rk3288-veyron-mickey.dts +++ b/arch/arm/boot/dts/rk3288-veyron-mickey.dts @@ -128,10 +128,6 @@ }; };
-&emmc { - /delete-property/mmc-hs200-1_8v; -}; - &i2c2 { status = "disabled"; };
From: Jerome Brunet jbrunet@baylibre.com
[ Upstream commit f9b3eeebef6aabaa37a351715374de53b6da860c ]
The bit 'SSEN' available on some MPLL DSS outputs is not related to the fractional part of the divider but to the function called 'Spread Spectrum'.
This function might be used to solve EM issues by adding a jitter on clock signal. This widens the signal spectrum and weakens the peaks in it.
While spread spectrum might be useful for some application, it is problematic for others, such as audio.
This patch introduce a new flag to the MPLL driver to enable (or not) the spread spectrum function.
Fixes: 1f737ffa13ef ("clk: meson: mpll: fix mpll0 fractional part ignored") Tested-by: Martin Blumenstinglmartin.blumenstingl@googlemail.com Signed-off-by: Jerome Brunet jbrunet@baylibre.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/meson/clk-mpll.c | 9 ++++++--- drivers/clk/meson/clk-mpll.h | 1 + 2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/clk/meson/clk-mpll.c b/drivers/clk/meson/clk-mpll.c index f76850d99e59..d3f42e086431 100644 --- a/drivers/clk/meson/clk-mpll.c +++ b/drivers/clk/meson/clk-mpll.c @@ -119,9 +119,12 @@ static int mpll_set_rate(struct clk_hw *hw, meson_parm_write(clk->map, &mpll->sdm, sdm); meson_parm_write(clk->map, &mpll->sdm_en, 1);
- /* Set additional fractional part enable if required */ - if (MESON_PARM_APPLICABLE(&mpll->ssen)) - meson_parm_write(clk->map, &mpll->ssen, 1); + /* Set spread spectrum if possible */ + if (MESON_PARM_APPLICABLE(&mpll->ssen)) { + unsigned int ss = + mpll->flags & CLK_MESON_MPLL_SPREAD_SPECTRUM ? 1 : 0; + meson_parm_write(clk->map, &mpll->ssen, ss); + }
/* Set the integer divider part */ meson_parm_write(clk->map, &mpll->n2, n2); diff --git a/drivers/clk/meson/clk-mpll.h b/drivers/clk/meson/clk-mpll.h index cf79340006dd..0f948430fed4 100644 --- a/drivers/clk/meson/clk-mpll.h +++ b/drivers/clk/meson/clk-mpll.h @@ -23,6 +23,7 @@ struct meson_clk_mpll_data { };
#define CLK_MESON_MPLL_ROUND_CLOSEST BIT(0) +#define CLK_MESON_MPLL_SPREAD_SPECTRUM BIT(1)
extern const struct clk_ops meson_clk_mpll_ro_ops; extern const struct clk_ops meson_clk_mpll_ops;
From: Douglas Anderson dianders@chromium.org
[ Upstream commit 8ef1ba39a9fa53d2205e633bc9b21840a275908e ]
This is similar to commit e6186820a745 ("arm64: dts: rockchip: Arch counter doesn't tick in system suspend"). Specifically on the rk3288 it can be seen that the timer stops ticking in suspend if we end up running through the "osc_disable" path in rk3288_slp_mode_set(). In that path the 24 MHz clock will turn off and the timer stops.
To test this, I ran this on a Chrome OS filesystem: before=$(date); \ suspend_stress_test -c1 --suspend_min=30 --suspend_max=31; \ echo ${before}; date
...and I found that unless I plug in a device that requests USB wakeup to be active that the two calls to "date" would show that fewer than 30 seconds passed.
NOTE: deep suspend (where the 24 MHz clock gets disabled) isn't supported yet on upstream Linux so this was tested on a downstream kernel.
Signed-off-by: Douglas Anderson dianders@chromium.org Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/rk3288.dtsi | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm/boot/dts/rk3288.dtsi b/arch/arm/boot/dts/rk3288.dtsi index aa017abf4f42..f7bc886a4b51 100644 --- a/arch/arm/boot/dts/rk3288.dtsi +++ b/arch/arm/boot/dts/rk3288.dtsi @@ -231,6 +231,7 @@ <GIC_PPI 11 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>, <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>; clock-frequency = <24000000>; + arm,no-tick-in-suspend; };
timer: timer@ff810000 {
From: Cheng Jian cj.chengjian@huawei.com
[ Upstream commit a124692b698b00026a58d89831ceda2331b2e1d0 ]
Custom trampolines can only be enabled if there is only a single ops attached to it. If there's only a single callback registered to a function, and the ops has a trampoline registered for it, then we can call the trampoline directly. This is very useful for improving the performance of ftrace and livepatch.
If more than one callback is registered to a function, the general trampoline is used, and the custom trampoline is not restored back to the direct call even if all the other callbacks were unregistered and we are back to one callback for the function.
To fix this, set FTRACE_FL_TRAMP flag if rec count is decremented to one, and the ops that left has a trampoline.
Testing After this patch :
insmod livepatch_unshare_files.ko cat /sys/kernel/debug/tracing/enabled_functions
unshare_files (1) R I tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0
echo unshare_files > /sys/kernel/debug/tracing/set_ftrace_filter echo function > /sys/kernel/debug/tracing/current_tracer cat /sys/kernel/debug/tracing/enabled_functions
unshare_files (2) R I ->ftrace_ops_list_func+0x0/0x150
echo nop > /sys/kernel/debug/tracing/current_tracer cat /sys/kernel/debug/tracing/enabled_functions
unshare_files (1) R I tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0
Link: http://lkml.kernel.org/r/1556969979-111047-1-git-send-email-cj.chengjian@hua...
Signed-off-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/ftrace.c | 28 +++++++++++++++------------- 1 file changed, 15 insertions(+), 13 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 576c41644e77..208220d526e8 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -1622,6 +1622,11 @@ static bool test_rec_ops_needs_regs(struct dyn_ftrace *rec) return keep_regs; }
+static struct ftrace_ops * +ftrace_find_tramp_ops_any(struct dyn_ftrace *rec); +static struct ftrace_ops * +ftrace_find_tramp_ops_next(struct dyn_ftrace *rec, struct ftrace_ops *ops); + static bool __ftrace_hash_rec_update(struct ftrace_ops *ops, int filter_hash, bool inc) @@ -1750,15 +1755,17 @@ static bool __ftrace_hash_rec_update(struct ftrace_ops *ops, }
/* - * If the rec had TRAMP enabled, then it needs to - * be cleared. As TRAMP can only be enabled iff - * there is only a single ops attached to it. - * In otherwords, always disable it on decrementing. - * In the future, we may set it if rec count is - * decremented to one, and the ops that is left - * has a trampoline. + * The TRAMP needs to be set only if rec count + * is decremented to one, and the ops that is + * left has a trampoline. As TRAMP can only be + * enabled if there is only a single ops attached + * to it. */ - rec->flags &= ~FTRACE_FL_TRAMP; + if (ftrace_rec_count(rec) == 1 && + ftrace_find_tramp_ops_any(rec)) + rec->flags |= FTRACE_FL_TRAMP; + else + rec->flags &= ~FTRACE_FL_TRAMP;
/* * flags will be cleared in ftrace_check_record() @@ -1951,11 +1958,6 @@ static void print_ip_ins(const char *fmt, const unsigned char *p) printk(KERN_CONT "%s%02x", i ? ":" : "", p[i]); }
-static struct ftrace_ops * -ftrace_find_tramp_ops_any(struct dyn_ftrace *rec); -static struct ftrace_ops * -ftrace_find_tramp_ops_next(struct dyn_ftrace *rec, struct ftrace_ops *ops); - enum ftrace_bug_type ftrace_bug_type; const void *ftrace_expected;
From: Niklas Cassel niklas.cassel@linaro.org
[ Upstream commit 887b528c958f40b064d53edd0bfa9fea3a69eccd ]
The current l3 min voltage level is not supported by the regulator (the voltage is not a multiple of the regulator step size), so a driver requesting this exact voltage would fail, see discussion in: https://patchwork.kernel.org/comment/22461199/
It was agreed upon to set a min voltage level that is a multiple of the regulator step size.
There was actually a patch sent that did this: https://patchwork.kernel.org/patch/10819313/
However, the commit 331ab98f8c4a ("arm64: dts: qcom: qcs404: Fix voltages l3") that was applied is not identical to that patch.
Signed-off-by: Niklas Cassel niklas.cassel@linaro.org Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Andy Gross agross@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/qcs404-evb.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/qcom/qcs404-evb.dtsi b/arch/arm64/boot/dts/qcom/qcs404-evb.dtsi index 2c3127167e3c..d987d6741e40 100644 --- a/arch/arm64/boot/dts/qcom/qcs404-evb.dtsi +++ b/arch/arm64/boot/dts/qcom/qcs404-evb.dtsi @@ -118,7 +118,7 @@ };
vreg_l3_1p05: l3 { - regulator-min-microvolt = <1050000>; + regulator-min-microvolt = <1048000>; regulator-max-microvolt = <1160000>; };
From: Sibi Sankar sibis@codeaurora.org
[ Upstream commit 8b3344422f097debe52296b87a39707d56ca3abe ]
Remoteproc q6v5-mss calls set_performance_state with INT_MAX on rpmpd. This is currently ignored since it is greater than the max supported state. Fixup rpmpd state to max if the required state is greater than all the supported states.
Fixes: 075d3db8d10d ("soc: qcom: rpmpd: Add support for get/set performance state") Reviewed-by: Marc Gonzalez marc.w.gonzalez@free.fr Reviewed-by: Vinod Koul vkoul@kernel.org Reviewed-by: Jeffrey Hugo jhugo@codeaurora.org Signed-off-by: Sibi Sankar sibis@codeaurora.org Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Andy Gross agross@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/qcom/rpmpd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soc/qcom/rpmpd.c b/drivers/soc/qcom/rpmpd.c index 005326050c23..235d01870dd8 100644 --- a/drivers/soc/qcom/rpmpd.c +++ b/drivers/soc/qcom/rpmpd.c @@ -226,7 +226,7 @@ static int rpmpd_set_performance(struct generic_pm_domain *domain, struct rpmpd *pd = domain_to_rpmpd(domain);
if (state > MAX_RPMPD_STATE) - goto out; + state = MAX_RPMPD_STATE;
mutex_lock(&rpmpd_lock);
From: Heinrich Schuchardt xypron.glpk@gmx.de
[ Upstream commit d3446b266a8c72a7bbc94b65f5fc6d206be77d24 ]
Running a graphics adapter on the MACCHIATObin fails due to an insufficiently sized memory window.
Enlarge the memory window for the PCIe slot to 512 MiB.
With the patch I am able to use a GT710 graphics adapter with 1 GB onboard memory.
These are the mapped memory areas that the graphics adapter is actually using:
Region 0: Memory at cc000000 (32-bit, non-prefetchable) [size=16M] Region 1: Memory at c0000000 (64-bit, prefetchable) [size=128M] Region 3: Memory at c8000000 (64-bit, prefetchable) [size=32M] Region 5: I/O ports at 1000 [size=128] Expansion ROM at ca000000 [disabled] [size=512K]
Signed-off-by: Heinrich Schuchardt xypron.glpk@gmx.de Signed-off-by: Gregory CLEMENT gregory.clement@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/marvell/armada-8040-mcbin.dtsi | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/arm64/boot/dts/marvell/armada-8040-mcbin.dtsi b/arch/arm64/boot/dts/marvell/armada-8040-mcbin.dtsi index 329f8ceeebea..205071b45a32 100644 --- a/arch/arm64/boot/dts/marvell/armada-8040-mcbin.dtsi +++ b/arch/arm64/boot/dts/marvell/armada-8040-mcbin.dtsi @@ -184,6 +184,8 @@ num-lanes = <4>; num-viewport = <8>; reset-gpios = <&cp0_gpio2 20 GPIO_ACTIVE_LOW>; + ranges = <0x81000000 0x0 0xf9010000 0x0 0xf9010000 0x0 0x10000 + 0x82000000 0x0 0xc0000000 0x0 0xc0000000 0x0 0x20000000>; status = "okay"; };
From: Anson Huang Anson.Huang@nxp.com
[ Upstream commit 4c396a604a57da8f883a8b3368d83181680d6907 ]
Current implementation of i.MX8 SoC driver returns -ENODEV for all cases of error during initialization, this is incorrect. This patch fixes them using correct return value according to different errors.
Signed-off-by: Anson Huang Anson.Huang@nxp.com Reviewed-by: Leonard Crestez leonard.crestez@nxp.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/imx/soc-imx8.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/soc/imx/soc-imx8.c b/drivers/soc/imx/soc-imx8.c index fc6429f9170a..e567d866a9d3 100644 --- a/drivers/soc/imx/soc-imx8.c +++ b/drivers/soc/imx/soc-imx8.c @@ -73,7 +73,7 @@ static int __init imx8_soc_init(void)
soc_dev_attr = kzalloc(sizeof(*soc_dev_attr), GFP_KERNEL); if (!soc_dev_attr) - return -ENODEV; + return -ENOMEM;
soc_dev_attr->family = "Freescale i.MX";
@@ -83,8 +83,10 @@ static int __init imx8_soc_init(void) goto free_soc;
id = of_match_node(imx8_soc_match, root); - if (!id) + if (!id) { + ret = -ENODEV; goto free_soc; + }
of_node_put(root);
@@ -96,12 +98,16 @@ static int __init imx8_soc_init(void) }
soc_dev_attr->revision = imx8_revision(soc_rev); - if (!soc_dev_attr->revision) + if (!soc_dev_attr->revision) { + ret = -ENOMEM; goto free_soc; + }
soc_dev = soc_device_register(soc_dev_attr); - if (IS_ERR(soc_dev)) + if (IS_ERR(soc_dev)) { + ret = PTR_ERR(soc_dev); goto free_rev; + }
return 0;
@@ -110,6 +116,6 @@ static int __init imx8_soc_init(void) free_soc: kfree(soc_dev_attr); of_node_put(root); - return -ENODEV; + return ret; } device_initcall(imx8_soc_init);
From: Dmitry Osipenko digetx@gmail.com
[ Upstream commit dc161064beb83c668e0f85766b92b1e7ed186e58 ]
Apparently driver was never tested with DMA_PREP_INTERRUPT flag being unset since it completely disables interrupt handling instead of skipping the callbacks invocations, hence putting channel into unusable state.
The flag is always set by all of kernel drivers that use APB DMA, so let's error out in otherwise case for consistency. It won't be difficult to support that case properly if ever will be needed.
Signed-off-by: Dmitry Osipenko digetx@gmail.com Acked-by: Jon Hunter jonathanh@nvidia.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/tegra20-apb-dma.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c index ef317c90fbe1..79e9593815f1 100644 --- a/drivers/dma/tegra20-apb-dma.c +++ b/drivers/dma/tegra20-apb-dma.c @@ -977,8 +977,12 @@ static struct dma_async_tx_descriptor *tegra_dma_prep_slave_sg( csr |= tdc->slave_id << TEGRA_APBDMA_CSR_REQ_SEL_SHIFT; }
- if (flags & DMA_PREP_INTERRUPT) + if (flags & DMA_PREP_INTERRUPT) { csr |= TEGRA_APBDMA_CSR_IE_EOC; + } else { + WARN_ON_ONCE(1); + return NULL; + }
apb_seq |= TEGRA_APBDMA_APBSEQ_WRAP_WORD_1;
@@ -1120,8 +1124,12 @@ static struct dma_async_tx_descriptor *tegra_dma_prep_dma_cyclic( csr |= tdc->slave_id << TEGRA_APBDMA_CSR_REQ_SEL_SHIFT; }
- if (flags & DMA_PREP_INTERRUPT) + if (flags & DMA_PREP_INTERRUPT) { csr |= TEGRA_APBDMA_CSR_IE_EOC; + } else { + WARN_ON_ONCE(1); + return NULL; + }
apb_seq |= TEGRA_APBDMA_APBSEQ_WRAP_WORD_1;
From: Helen Koike helen.koike@collabora.com
[ Upstream commit c432a29d3fc9ee928caeca2f5cf68b3aebfa6817 ]
isp iommu requires wrapper variants of the clocks. noc variants are always on and using the wrapper variants will activate {A,H}CLK_ISP{0,1} due to the hierarchy.
Tested using the pending isp patch set (which is not upstream yet). Without this patch, streaming from the isp stalls.
Also add the respective power domain and remove the "disabled" status.
Refer: RK3399 TRM v1.4 Fig. 2-4 RK3399 Clock Architecture Diagram RK3399 TRM v1.4 Fig. 8-1 RK3399 Power Domain Partition
Signed-off-by: Helen Koike helen.koike@collabora.com Tested-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/rockchip/rk3399.dtsi | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi index 196ac9b78076..89594a7276f4 100644 --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi @@ -1706,11 +1706,11 @@ reg = <0x0 0xff914000 0x0 0x100>, <0x0 0xff915000 0x0 0x100>; interrupts = <GIC_SPI 43 IRQ_TYPE_LEVEL_HIGH 0>; interrupt-names = "isp0_mmu"; - clocks = <&cru ACLK_ISP0_NOC>, <&cru HCLK_ISP0_NOC>; + clocks = <&cru ACLK_ISP0_WRAPPER>, <&cru HCLK_ISP0_WRAPPER>; clock-names = "aclk", "iface"; #iommu-cells = <0>; + power-domains = <&power RK3399_PD_ISP0>; rockchip,disable-mmu-reset; - status = "disabled"; };
isp1_mmu: iommu@ff924000 { @@ -1718,11 +1718,11 @@ reg = <0x0 0xff924000 0x0 0x100>, <0x0 0xff925000 0x0 0x100>; interrupts = <GIC_SPI 44 IRQ_TYPE_LEVEL_HIGH 0>; interrupt-names = "isp1_mmu"; - clocks = <&cru ACLK_ISP1_NOC>, <&cru HCLK_ISP1_NOC>; + clocks = <&cru ACLK_ISP1_WRAPPER>, <&cru HCLK_ISP1_WRAPPER>; clock-names = "aclk", "iface"; #iommu-cells = <0>; + power-domains = <&power RK3399_PD_ISP1>; rockchip,disable-mmu-reset; - status = "disabled"; };
hdmi_sound: hdmi-sound {
From: Prarit Bhargava prarit@redhat.com
[ Upstream commit 6e6de3dee51a439f76eb73c22ae2ffd2c9384712 ]
Microsoft HyperV disables the X86_FEATURE_SMCA bit on AMD systems, and linux guests boot with repeated errors:
amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2) amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
The warnings occur because the module code erroneously returns -EEXIST for modules that have failed to load and are in the process of being removed from the module list.
module amd64_edac_mod has a dependency on module edac_mce_amd. Using modules.dep, systemd will load edac_mce_amd for every request of amd64_edac_mod. When the edac_mce_amd module loads, the module has state MODULE_STATE_UNFORMED and once the module load fails and the state becomes MODULE_STATE_GOING. Another request for edac_mce_amd module executes and add_unformed_module() will erroneously return -EEXIST even though the previous instance of edac_mce_amd has MODULE_STATE_GOING. Upon receiving -EEXIST, systemd attempts to load amd64_edac_mod, which fails because of unknown symbols from edac_mce_amd.
add_unformed_module() must wait to return for any case other than MODULE_STATE_LIVE to prevent a race between multiple loads of dependent modules.
Signed-off-by: Prarit Bhargava prarit@redhat.com Signed-off-by: Barret Rhoden brho@google.com Cc: David Arcari darcari@redhat.com Cc: Jessica Yu jeyu@kernel.org Cc: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Jessica Yu jeyu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/module.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/kernel/module.c b/kernel/module.c index 80c7c09584cf..8431c3d47c97 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -3385,8 +3385,7 @@ static bool finished_loading(const char *name) sched_annotate_sleep(); mutex_lock(&module_mutex); mod = find_module_all(name, strlen(name), true); - ret = !mod || mod->state == MODULE_STATE_LIVE - || mod->state == MODULE_STATE_GOING; + ret = !mod || mod->state == MODULE_STATE_LIVE; mutex_unlock(&module_mutex);
return ret; @@ -3576,8 +3575,7 @@ static int add_unformed_module(struct module *mod) mutex_lock(&module_mutex); old = find_module_all(mod->name, strlen(mod->name), true); if (old != NULL) { - if (old->state == MODULE_STATE_COMING - || old->state == MODULE_STATE_UNFORMED) { + if (old->state != MODULE_STATE_LIVE) { /* Wait in case it fails to load. */ mutex_unlock(&module_mutex); err = wait_event_interruptible(module_wq,
On 7/26/19 9:38 AM, Sasha Levin wrote:
From: Prarit Bhargava prarit@redhat.com
[ Upstream commit 6e6de3dee51a439f76eb73c22ae2ffd2c9384712 ]
Microsoft HyperV disables the X86_FEATURE_SMCA bit on AMD systems, and linux guests boot with repeated errors:
Hey Sasha, I'd prefer to leave this out of stable branches for now. Linus is a bit nervous about it and I like to get see more soak time before the patch is backported to stable.
https://lkml.org/lkml/2019/7/18/680
FWIW we've been running this in RHEL for some time now without any issues.
P.
amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2) amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
The warnings occur because the module code erroneously returns -EEXIST for modules that have failed to load and are in the process of being removed from the module list.
module amd64_edac_mod has a dependency on module edac_mce_amd. Using modules.dep, systemd will load edac_mce_amd for every request of amd64_edac_mod. When the edac_mce_amd module loads, the module has state MODULE_STATE_UNFORMED and once the module load fails and the state becomes MODULE_STATE_GOING. Another request for edac_mce_amd module executes and add_unformed_module() will erroneously return -EEXIST even though the previous instance of edac_mce_amd has MODULE_STATE_GOING. Upon receiving -EEXIST, systemd attempts to load amd64_edac_mod, which fails because of unknown symbols from edac_mce_amd.
add_unformed_module() must wait to return for any case other than MODULE_STATE_LIVE to prevent a race between multiple loads of dependent modules.
Signed-off-by: Prarit Bhargava prarit@redhat.com Signed-off-by: Barret Rhoden brho@google.com Cc: David Arcari darcari@redhat.com Cc: Jessica Yu jeyu@kernel.org Cc: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Jessica Yu jeyu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org
kernel/module.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/kernel/module.c b/kernel/module.c index 80c7c09584cf..8431c3d47c97 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -3385,8 +3385,7 @@ static bool finished_loading(const char *name) sched_annotate_sleep(); mutex_lock(&module_mutex); mod = find_module_all(name, strlen(name), true);
- ret = !mod || mod->state == MODULE_STATE_LIVE
|| mod->state == MODULE_STATE_GOING;
- ret = !mod || mod->state == MODULE_STATE_LIVE; mutex_unlock(&module_mutex);
return ret; @@ -3576,8 +3575,7 @@ static int add_unformed_module(struct module *mod) mutex_lock(&module_mutex); old = find_module_all(mod->name, strlen(mod->name), true); if (old != NULL) {
if (old->state == MODULE_STATE_COMING
|| old->state == MODULE_STATE_UNFORMED) {
if (old->state != MODULE_STATE_LIVE) { /* Wait in case it fails to load. */ mutex_unlock(&module_mutex); err = wait_event_interruptible(module_wq,
From: Jean-Philippe Brucker jean-philippe.brucker@arm.com
[ Upstream commit 59b099a6c75e4ddceeaf9676422d8d91d0049755 ]
For PCI devices that have an OF node, set the fwnode as well. This way drivers that rely on fwnode don't need the special case described by commit f94277af03ea ("of/platform: Initialise dev->fwnode appropriately").
Acked-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Jean-Philippe Brucker jean-philippe.brucker@arm.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/of.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/pci/of.c b/drivers/pci/of.c index 73d5adec0a28..bc7b27a28795 100644 --- a/drivers/pci/of.c +++ b/drivers/pci/of.c @@ -22,12 +22,15 @@ void pci_set_of_node(struct pci_dev *dev) return; dev->dev.of_node = of_pci_find_child_device(dev->bus->dev.of_node, dev->devfn); + if (dev->dev.of_node) + dev->dev.fwnode = &dev->dev.of_node->fwnode; }
void pci_release_of_node(struct pci_dev *dev) { of_node_put(dev->dev.of_node); dev->dev.of_node = NULL; + dev->dev.fwnode = NULL; }
void pci_set_bus_of_node(struct pci_bus *bus) @@ -41,13 +44,18 @@ void pci_set_bus_of_node(struct pci_bus *bus) if (node && of_property_read_bool(node, "external-facing")) bus->self->untrusted = true; } + bus->dev.of_node = node; + + if (bus->dev.of_node) + bus->dev.fwnode = &bus->dev.of_node->fwnode; }
void pci_release_bus_of_node(struct pci_bus *bus) { of_node_put(bus->dev.of_node); bus->dev.of_node = NULL; + bus->dev.fwnode = NULL; }
struct device_node * __weak pcibios_get_phb_of_node(struct pci_bus *bus)
From: Jean-Philippe Brucker jean-philippe.brucker@arm.com
[ Upstream commit 92e074acf6f7694e96204265eb18ac113f546e80 ]
Since commit 85f1abe0019f ("kthread, sched/wait: Fix kthread_parkme() completion issue"), kthreads that are bound to a CPU must be parked before being stopped. At the moment the PSCI checker calls kthread_stop() directly on the suspend kthread, which triggers the following warning:
[ 6.068288] WARNING: CPU: 1 PID: 1 at kernel/kthread.c:398 __kthread_bind_mask+0x20/0x78 ... [ 6.190151] Call trace: [ 6.192566] __kthread_bind_mask+0x20/0x78 [ 6.196615] kthread_unpark+0x74/0x80 [ 6.200235] kthread_stop+0x44/0x1d8 [ 6.203769] psci_checker+0x3bc/0x484 [ 6.207389] do_one_initcall+0x48/0x260 [ 6.211180] kernel_init_freeable+0x2c8/0x368 [ 6.215488] kernel_init+0x10/0x100 [ 6.218935] ret_from_fork+0x10/0x1c [ 6.222467] ---[ end trace e05e22863d043cd3 ]---
kthread_unpark() tries to bind the thread to its CPU and aborts with a WARN() if the thread wasn't in TASK_PARKED state. Park the kthreads before stopping them.
Fixes: 85f1abe0019f ("kthread, sched/wait: Fix kthread_parkme() completion issue") Signed-off-by: Jean-Philippe Brucker jean-philippe.brucker@arm.com Reviewed-by: Sudeep Holla sudeep.holla@arm.com Acked-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Olof Johansson olof@lixom.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/psci/psci_checker.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/firmware/psci/psci_checker.c b/drivers/firmware/psci/psci_checker.c index 08c85099d4d0..f3659443f8c2 100644 --- a/drivers/firmware/psci/psci_checker.c +++ b/drivers/firmware/psci/psci_checker.c @@ -359,16 +359,16 @@ static int suspend_test_thread(void *arg) for (;;) { /* Needs to be set first to avoid missing a wakeup. */ set_current_state(TASK_INTERRUPTIBLE); - if (kthread_should_stop()) { - __set_current_state(TASK_RUNNING); + if (kthread_should_park()) break; - } schedule(); }
pr_info("CPU %d suspend test results: success %d, shallow states %d, errors %d\n", cpu, nb_suspend, nb_shallow_sleep, nb_err);
+ kthread_parkme(); + return nb_err; }
@@ -433,8 +433,10 @@ static int suspend_tests(void)
/* Stop and destroy all threads, get return status. */ - for (i = 0; i < nb_threads; ++i) + for (i = 0; i < nb_threads; ++i) { + err += kthread_park(threads[i]); err += kthread_stop(threads[i]); + } out: cpuidle_resume_and_unlock(); kfree(threads);
From: Anson Huang Anson.Huang@nxp.com
[ Upstream commit 1bcbe7300815e91fef18ee905b04f65490ad38c9 ]
When SoC's revision value is 0, SoC driver will print out "unknown" in sysfs's revision node, this "unknown" is a static string which can NOT be freed, this will caused below kernel dump in later error path which calls kfree:
kernel BUG at mm/slub.c:3942! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc4-next-20190611-00023-g705146c-dirty #2197 Hardware name: NXP i.MX8MQ EVK (DT) pstate: 60000005 (nZCv daif -PAN -UAO) pc : kfree+0x170/0x1b0 lr : imx8_soc_init+0xc0/0xe4 sp : ffff00001003bd10 x29: ffff00001003bd10 x28: ffff00001121e0a0 x27: ffff000011482000 x26: ffff00001117068c x25: ffff00001121e100 x24: ffff000011482000 x23: ffff000010fe2b58 x22: ffff0000111b9ab0 x21: ffff8000bd9dfba0 x20: ffff0000111b9b70 x19: ffff7e000043f880 x18: 0000000000001000 x17: ffff000010d05fa0 x16: ffff0000122e0000 x15: 0140000000000000 x14: 0000000030360000 x13: ffff8000b94b5bb0 x12: 0000000000000038 x11: ffffffffffffffff x10: ffffffffffffffff x9 : 0000000000000003 x8 : ffff8000b9488147 x7 : ffff00001003bc00 x6 : 0000000000000000 x5 : 0000000000000003 x4 : 0000000000000003 x3 : 0000000000000003 x2 : b8793acd604edf00 x1 : ffff7e000043f880 x0 : ffff7e000043f888 Call trace: kfree+0x170/0x1b0 imx8_soc_init+0xc0/0xe4 do_one_initcall+0x58/0x1b8 kernel_init_freeable+0x1cc/0x288 kernel_init+0x10/0x100 ret_from_fork+0x10/0x18
This patch fixes this potential kernel dump when a chip's revision is "unknown", it is done by checking whether the revision space can be freed.
Fixes: a7e26f356ca1 ("soc: imx: Add generic i.MX8 SoC driver") Signed-off-by: Anson Huang Anson.Huang@nxp.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/soc/imx/soc-imx8.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/soc/imx/soc-imx8.c b/drivers/soc/imx/soc-imx8.c index e567d866a9d3..79a3d922a4a9 100644 --- a/drivers/soc/imx/soc-imx8.c +++ b/drivers/soc/imx/soc-imx8.c @@ -112,7 +112,8 @@ static int __init imx8_soc_init(void) return 0;
free_rev: - kfree(soc_dev_attr->revision); + if (strcmp(soc_dev_attr->revision, "unknown")) + kfree(soc_dev_attr->revision); free_soc: kfree(soc_dev_attr); of_node_put(root);
From: Andy Gross agross@kernel.org
[ Upstream commit 0763d0c2273a3c72247d325c48fbac3d918d6b87 ]
This patch adds a reset-cells property to the gcc controller on the QCS404. Without this in place, we get warnings like the following if nodes reference a gcc reset:
arch/arm64/boot/dts/qcom/qcs404.dtsi:261.38-310.5: Warning (resets_property): /soc@0/remoteproc@b00000: Missing property '#reset-cells' in node /soc@0/clock-controller@1800000 or bad phandle (referred from resets[0]) also defined at arch/arm64/boot/dts/qcom/qcs404-evb.dtsi:82.18-84.3 DTC arch/arm64/boot/dts/qcom/qcs404-evb-4000.dtb arch/arm64/boot/dts/qcom/qcs404.dtsi:261.38-310.5: Warning (resets_property): /soc@0/remoteproc@b00000: Missing property '#reset-cells' in node /soc@0/clock-controller@1800000 or bad phandle (referred from resets[0]) also defined at arch/arm64/boot/dts/qcom/qcs404-evb.dtsi:82.18-84.3
Signed-off-by: Andy Gross agross@kernel.org Reviewed-by: Niklas Cassel niklas.cassel@linaro.org Reviewed-by: Vinod Koul vkoul@kernel.org Signed-off-by: Olof Johansson olof@lixom.net Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/qcs404.dtsi | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/boot/dts/qcom/qcs404.dtsi b/arch/arm64/boot/dts/qcom/qcs404.dtsi index ffedf9640af7..65a2cbeb28be 100644 --- a/arch/arm64/boot/dts/qcom/qcs404.dtsi +++ b/arch/arm64/boot/dts/qcom/qcs404.dtsi @@ -383,6 +383,7 @@ compatible = "qcom,gcc-qcs404"; reg = <0x01800000 0x80000>; #clock-cells = <1>; + #reset-cells = <1>;
assigned-clocks = <&gcc GCC_APSS_AHB_CLK_SRC>; assigned-clock-rates = <19200000>;
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 9c106119f6538f65bdddb7948a157d90625effa7 ]
On architectures that have a larger dma_addr_t than phys_addr_t, the swiotlb_tbl_map_single() function truncates its return code in the failure path, making it impossible to identify the error later, as we compare to the original value:
kernel/dma/swiotlb.c:551:9: error: implicit conversion from 'dma_addr_t' (aka 'unsigned long long') to 'phys_addr_t' (aka 'unsigned int') changes value from 18446744073709551615 to 4294967295 [-Werror,-Wconstant-conversion] return DMA_MAPPING_ERROR;
Use an explicit typecast here to convert it to the narrower type, and use the same expression in the error handling later.
Fixes: b907e20508d0 ("swiotlb: remove SWIOTLB_MAP_ERROR") Acked-by: Stefano Stabellini sstabellini@kernel.org Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/xen/swiotlb-xen.c | 2 +- kernel/dma/swiotlb.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c index d53f3493a6b9..cfbe46785a3b 100644 --- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -402,7 +402,7 @@ static dma_addr_t xen_swiotlb_map_page(struct device *dev, struct page *page,
map = swiotlb_tbl_map_single(dev, start_dma_addr, phys, size, dir, attrs); - if (map == DMA_MAPPING_ERROR) + if (map == (phys_addr_t)DMA_MAPPING_ERROR) return DMA_MAPPING_ERROR;
dev_addr = xen_phys_to_bus(map); diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 13f0cb080a4d..5f4e1b78babb 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -546,7 +546,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit()) dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n", size, io_tlb_nslabs, tmp_io_tlb_used); - return DMA_MAPPING_ERROR; + return (phys_addr_t)DMA_MAPPING_ERROR; found: io_tlb_used += nslots; spin_unlock_irqrestore(&io_tlb_lock, flags); @@ -664,7 +664,7 @@ bool swiotlb_map(struct device *dev, phys_addr_t *phys, dma_addr_t *dma_addr, /* Oh well, have to allocate and map a bounce buffer. */ *phys = swiotlb_tbl_map_single(dev, __phys_to_dma(dev, io_tlb_start), *phys, size, dir, attrs); - if (*phys == DMA_MAPPING_ERROR) + if (*phys == (phys_addr_t)DMA_MAPPING_ERROR) return false;
/* Ensure that the address returned is DMA'ble */
From: Petr Cvek petrcvekcz@gmail.com
[ Upstream commit ba1bc0fcdeaf3bf583c1517bd2e3e29cf223c969 ]
The modification of EXIN register doesn't clean the bitfield before the writing of a new value. After a few modifications the bitfield would accumulate only '1's.
Signed-off-by: Petr Cvek petrcvekcz@gmail.com Signed-off-by: Paul Burton paul.burton@mips.com Cc: hauke@hauke-m.de Cc: john@phrozen.org Cc: linux-mips@vger.kernel.org Cc: openwrt-devel@lists.openwrt.org Cc: pakahmar@hotmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/lantiq/irq.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/mips/lantiq/irq.c b/arch/mips/lantiq/irq.c index cfd87e662fcf..9c95097557c7 100644 --- a/arch/mips/lantiq/irq.c +++ b/arch/mips/lantiq/irq.c @@ -154,8 +154,9 @@ static int ltq_eiu_settype(struct irq_data *d, unsigned int type) if (edge) irq_set_handler(d->hwirq, handle_edge_irq);
- ltq_eiu_w32(ltq_eiu_r32(LTQ_EIU_EXIN_C) | - (val << (i * 4)), LTQ_EIU_EXIN_C); + ltq_eiu_w32((ltq_eiu_r32(LTQ_EIU_EXIN_C) & + (~(7 << (i * 4)))) | (val << (i * 4)), + LTQ_EIU_EXIN_C); } }
From: Geert Uytterhoeven geert+renesas@glider.be
[ Upstream commit 78efb76ab4dfb8f74f290ae743f34162cd627f19 ]
While the .device_prep_slave_sg() callback rejects empty scatterlists, it still accepts single-entry scatterlists with a zero-length segment. These may happen if a driver calls dmaengine_prep_slave_single() with a zero len parameter. The corresponding DMA request will never complete, leading to messages like:
rcar-dmac e7300000.dma-controller: Channel Address Error happen
and DMA timeouts.
Although requesting a zero-length DMA request is a driver bug, rejecting it early eases debugging. Note that the .device_prep_dma_memcpy() callback already rejects requests to copy zero bytes.
Reported-by: Eugeniu Rosca erosca@de.adit-jv.com Analyzed-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/sh/rcar-dmac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/sh/rcar-dmac.c b/drivers/dma/sh/rcar-dmac.c index 33ab1b607e2b..54de669c38b8 100644 --- a/drivers/dma/sh/rcar-dmac.c +++ b/drivers/dma/sh/rcar-dmac.c @@ -1165,7 +1165,7 @@ rcar_dmac_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl, struct rcar_dmac_chan *rchan = to_rcar_dmac_chan(chan);
/* Someone calling slave DMA on a generic channel? */ - if (rchan->mid_rid < 0 || !sg_len) { + if (rchan->mid_rid < 0 || !sg_len || !sg_dma_len(sgl)) { dev_warn(chan->device->dev, "%s: bad parameter: len=%d, id=%d\n", __func__, sg_len, rchan->mid_rid);
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 24d2c73ff28bcda48607eacc4bc804002dbf78d9 ]
We get a link error for configurations that enable an Exynos SoC that does not require MCPM, but then manually enable MCPM anyway without also turning on the arm-cci:
arch/arm/mach-exynos/mcpm-exynos.o: In function `exynos_pm_power_up_setup': mcpm-exynos.c:(.text+0x8): undefined reference to `cci_enable_port_for_self'
Change it back to only build the code we actually need, by introducing a CONFIG_EXYNOS_MCPM that serves the same purpose as the older CONFIG_EXYNOS5420_MCPM.
Fixes: 2997520c2d4e ("ARM: exynos: Set MCPM as mandatory for Exynos542x/5800 SoCs") Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Krzysztof Kozlowski krzk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-exynos/Kconfig | 6 +++++- arch/arm/mach-exynos/Makefile | 2 +- arch/arm/mach-exynos/suspend.c | 6 +++--- 3 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/arch/arm/mach-exynos/Kconfig b/arch/arm/mach-exynos/Kconfig index 1c518b8ee520..21a59efd1a2c 100644 --- a/arch/arm/mach-exynos/Kconfig +++ b/arch/arm/mach-exynos/Kconfig @@ -106,7 +106,7 @@ config SOC_EXYNOS5420 bool "SAMSUNG EXYNOS5420" default y depends on ARCH_EXYNOS5 - select MCPM if SMP + select EXYNOS_MCPM if SMP select ARM_CCI400_PORT_CTRL select ARM_CPU_SUSPEND
@@ -115,6 +115,10 @@ config SOC_EXYNOS5800 default y depends on SOC_EXYNOS5420
+config EXYNOS_MCPM + bool + select MCPM + config EXYNOS_CPU_SUSPEND bool select ARM_CPU_SUSPEND diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile index 264dbaa89c3d..5abf3db23912 100644 --- a/arch/arm/mach-exynos/Makefile +++ b/arch/arm/mach-exynos/Makefile @@ -18,5 +18,5 @@ plus_sec := $(call as-instr,.arch_extension sec,+sec) AFLAGS_exynos-smc.o :=-Wa,-march=armv7-a$(plus_sec) AFLAGS_sleep.o :=-Wa,-march=armv7-a$(plus_sec)
-obj-$(CONFIG_MCPM) += mcpm-exynos.o +obj-$(CONFIG_EXYNOS_MCPM) += mcpm-exynos.o CFLAGS_mcpm-exynos.o += -march=armv7-a diff --git a/arch/arm/mach-exynos/suspend.c b/arch/arm/mach-exynos/suspend.c index be122af0de8f..8b1e6ab8504f 100644 --- a/arch/arm/mach-exynos/suspend.c +++ b/arch/arm/mach-exynos/suspend.c @@ -268,7 +268,7 @@ static int exynos5420_cpu_suspend(unsigned long arg) unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1); unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
- if (IS_ENABLED(CONFIG_MCPM)) { + if (IS_ENABLED(CONFIG_EXYNOS_MCPM)) { mcpm_set_entry_vector(cpu, cluster, exynos_cpu_resume); mcpm_cpu_suspend(); } @@ -351,7 +351,7 @@ static void exynos5420_pm_prepare(void) exynos_pm_enter_sleep_mode();
/* ensure at least INFORM0 has the resume address */ - if (IS_ENABLED(CONFIG_MCPM)) + if (IS_ENABLED(CONFIG_EXYNOS_MCPM)) pmu_raw_writel(__pa_symbol(mcpm_entry_point), S5P_INFORM0);
tmp = pmu_raw_readl(EXYNOS_L2_OPTION(0)); @@ -455,7 +455,7 @@ static void exynos5420_prepare_pm_resume(void) mpidr = read_cpuid_mpidr(); cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
- if (IS_ENABLED(CONFIG_MCPM)) + if (IS_ENABLED(CONFIG_EXYNOS_MCPM)) WARN_ON(mcpm_cpu_powered_up());
if (IS_ENABLED(CONFIG_HW_PERF_EVENTS) && cluster != 0) {
From: JC Kuo jckuo@nvidia.com
[ Upstream commit 0d34dfbf3023cf119b83f6470692c0b10c832495 ]
Full-speed and low-speed USB devices do not work with Tegra210 platforms because of incorrect PLLU/PLLU_OUT1 clock settings.
When full-speed device is connected: [ 14.059886] usb 1-3: new full-speed USB device number 2 using tegra-xusb [ 14.196295] usb 1-3: device descriptor read/64, error -71 [ 14.436311] usb 1-3: device descriptor read/64, error -71 [ 14.675749] usb 1-3: new full-speed USB device number 3 using tegra-xusb [ 14.812335] usb 1-3: device descriptor read/64, error -71 [ 15.052316] usb 1-3: device descriptor read/64, error -71 [ 15.164799] usb usb1-port3: attempt power cycle
When low-speed device is connected: [ 37.610949] usb usb1-port3: Cannot enable. Maybe the USB cable is bad? [ 38.557376] usb usb1-port3: Cannot enable. Maybe the USB cable is bad? [ 38.564977] usb usb1-port3: attempt power cycle
This commit fixes the issue by: 1. initializing PLLU_OUT1 before initializing XUSB_FS_SRC clock because PLLU_OUT1 is parent of XUSB_FS_SRC. 2. changing PLLU post-divider to /2 (DIVP=1) according to Technical Reference Manual.
Fixes: e745f992cf4b ("clk: tegra: Rework pll_u") Signed-off-by: JC Kuo jckuo@nvidia.com Acked-By: Peter De Schrijver pdeschrijver@nvidia.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/tegra/clk-tegra210.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/clk/tegra/clk-tegra210.c b/drivers/clk/tegra/clk-tegra210.c index ac1d27a8c650..e5470a6bbf55 100644 --- a/drivers/clk/tegra/clk-tegra210.c +++ b/drivers/clk/tegra/clk-tegra210.c @@ -2204,9 +2204,9 @@ static struct div_nmp pllu_nmp = { };
static struct tegra_clk_pll_freq_table pll_u_freq_table[] = { - { 12000000, 480000000, 40, 1, 0, 0 }, - { 13000000, 480000000, 36, 1, 0, 0 }, /* actual: 468.0 MHz */ - { 38400000, 480000000, 25, 2, 0, 0 }, + { 12000000, 480000000, 40, 1, 1, 0 }, + { 13000000, 480000000, 36, 1, 1, 0 }, /* actual: 468.0 MHz */ + { 38400000, 480000000, 25, 2, 1, 0 }, { 0, 0, 0, 0, 0, 0 }, };
@@ -3333,6 +3333,7 @@ static struct tegra_clk_init_table init_table[] __initdata = { { TEGRA210_CLK_DFLL_REF, TEGRA210_CLK_PLL_P, 51000000, 1 }, { TEGRA210_CLK_SBC4, TEGRA210_CLK_PLL_P, 12000000, 1 }, { TEGRA210_CLK_PLL_RE_VCO, TEGRA210_CLK_CLK_MAX, 672000000, 1 }, + { TEGRA210_CLK_PLL_U_OUT1, TEGRA210_CLK_CLK_MAX, 48000000, 1 }, { TEGRA210_CLK_XUSB_GATE, TEGRA210_CLK_CLK_MAX, 0, 1 }, { TEGRA210_CLK_XUSB_SS_SRC, TEGRA210_CLK_PLL_U_480M, 120000000, 0 }, { TEGRA210_CLK_XUSB_FS_SRC, TEGRA210_CLK_PLL_U_48M, 48000000, 0 }, @@ -3357,7 +3358,6 @@ static struct tegra_clk_init_table init_table[] __initdata = { { TEGRA210_CLK_PLL_DP, TEGRA210_CLK_CLK_MAX, 270000000, 0 }, { TEGRA210_CLK_SOC_THERM, TEGRA210_CLK_PLL_P, 51000000, 0 }, { TEGRA210_CLK_CCLK_G, TEGRA210_CLK_CLK_MAX, 0, 1 }, - { TEGRA210_CLK_PLL_U_OUT1, TEGRA210_CLK_CLK_MAX, 48000000, 1 }, { TEGRA210_CLK_PLL_U_OUT2, TEGRA210_CLK_CLK_MAX, 60000000, 1 }, { TEGRA210_CLK_SPDIF_IN_SYNC, TEGRA210_CLK_CLK_MAX, 24576000, 0 }, { TEGRA210_CLK_I2S0_SYNC, TEGRA210_CLK_CLK_MAX, 24576000, 0 },
From: Russell King rmk+kernel@armlinux.org.uk
[ Upstream commit 5808b14a1f52554de612fee85ef517199855e310 ]
Fix a use-after-free bug during filesystem initialisation, where we access the disc record (which is stored in a buffer) after we have released the buffer.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/adfs/super.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/adfs/super.c b/fs/adfs/super.c index ffb669f9bba7..ce0fbbe002bf 100644 --- a/fs/adfs/super.c +++ b/fs/adfs/super.c @@ -360,6 +360,7 @@ static int adfs_fill_super(struct super_block *sb, void *data, int silent) struct buffer_head *bh; struct object_info root_obj; unsigned char *b_data; + unsigned int blocksize; struct adfs_sb_info *asb; struct inode *root; int ret = -EINVAL; @@ -411,8 +412,10 @@ static int adfs_fill_super(struct super_block *sb, void *data, int silent) goto error_free_bh; }
+ blocksize = 1 << dr->log2secsize; brelse(bh); - if (sb_set_blocksize(sb, 1 << dr->log2secsize)) { + + if (sb_set_blocksize(sb, blocksize)) { bh = sb_bread(sb, ADFS_DISCRECORD / sb->s_blocksize); if (!bh) { adfs_error(sb, "couldn't read superblock on "
From: Chunyan Zhang zhang.chunyan@linaro.org
[ Upstream commit c974c48deeb969c5e4250e4f06af91edd84b1f10 ]
sprd_clk_regmap_init() doesn't always return success, adding check for its return value should make the code more strong.
Signed-off-by: Chunyan Zhang zhang.chunyan@linaro.org Reviewed-by: Baolin Wang baolin.wang@linaro.org [sboyd@kernel.org: Add a missing int ret] Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/sprd/sc9860-clk.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/clk/sprd/sc9860-clk.c b/drivers/clk/sprd/sc9860-clk.c index 9980ab55271b..f76305b4bc8d 100644 --- a/drivers/clk/sprd/sc9860-clk.c +++ b/drivers/clk/sprd/sc9860-clk.c @@ -2023,6 +2023,7 @@ static int sc9860_clk_probe(struct platform_device *pdev) { const struct of_device_id *match; const struct sprd_clk_desc *desc; + int ret;
match = of_match_node(sprd_sc9860_clk_ids, pdev->dev.of_node); if (!match) { @@ -2031,7 +2032,9 @@ static int sc9860_clk_probe(struct platform_device *pdev) }
desc = match->data; - sprd_clk_regmap_init(pdev, desc); + ret = sprd_clk_regmap_init(pdev, desc); + if (ret) + return ret;
return sprd_clk_probe(&pdev->dev, desc->hw_clks); }
From: Vicente Bergas vicencb@gmail.com
[ Upstream commit e1d9149e8389f1690cdd4e4056766dd26488a0fe ]
Before this patch, the Type-C port on the Sapphire board is dead. If setting the 'regulator-always-on' property to 'vcc5v0_typec0' then the port works for about 4 seconds at start-up. This is a sample trace with a memory stick plugged in: 1.- The memory stick LED lights on and kernel reports: [ 4.782999] scsi 0:0:0:0: Direct-Access USB DISK PMAP PQ: 0 ANSI: 4 [ 5.904580] sd 0:0:0:0: [sdb] 3913344 512-byte logical blocks: (2.00 GB/1.87 GiB) [ 5.906860] sd 0:0:0:0: [sdb] Write Protect is off [ 5.908973] sd 0:0:0:0: [sdb] Mode Sense: 23 00 00 00 [ 5.909122] sd 0:0:0:0: [sdb] No Caching mode page found [ 5.911214] sd 0:0:0:0: [sdb] Assuming drive cache: write through [ 5.951585] sdb: sdb1 [ 5.954816] sd 0:0:0:0: [sdb] Attached SCSI removable disk 2.- 4 seconds later the memory stick LED lights off and kernel reports: [ 9.082822] phy phy-ff770000.syscon:usb2-phy@e450.2: charger = USB_DCP_CHARGER 3.- After a minute the kernel reports: [ 71.666761] usb 5-1: USB disconnect, device number 2 It has been checked that, although the LED is off, VBUS is present.
If, instead, the dr_mode is changed to host and the phy-supply changed accordingly, then it works. It has only been tested in host mode.
Signed-off-by: Vicente Bergas vicencb@gmail.com Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/rockchip/rk3399-sapphire.dtsi | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3399-sapphire.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-sapphire.dtsi index 04623e52ac5d..1bc1579674e5 100644 --- a/arch/arm64/boot/dts/rockchip/rk3399-sapphire.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3399-sapphire.dtsi @@ -565,12 +565,11 @@ status = "okay";
u2phy0_otg: otg-port { - phy-supply = <&vcc5v0_typec0>; status = "okay"; };
u2phy0_host: host-port { - phy-supply = <&vcc5v0_host>; + phy-supply = <&vcc5v0_typec0>; status = "okay"; }; }; @@ -620,7 +619,7 @@
&usbdrd_dwc3_0 { status = "okay"; - dr_mode = "otg"; + dr_mode = "host"; };
&usbdrd3_1 {
From: Qu Wenruo wqu@suse.com
[ Upstream commit 4c094c33c9ed4b8d0d814bd1d7ff78e123d15d00 ]
Under certain conditions, we could have strange file extent item in log tree like:
item 18 key (69599 108 397312) itemoff 15208 itemsize 53 extent data disk bytenr 0 nr 0 extent data offset 0 nr 18446744073709547520 ram 18446744073709547520
The num_bytes + ram_bytes overflow 64 bit type.
For num_bytes part, we can detect such overflow along with file offset (key->offset), as file_offset + num_bytes should never go beyond u64.
For ram_bytes part, it's about the decompressed size of the extent, not directly related to the size. In theory it is OK to have a large value, and put extra limitation on RAM bytes may cause unexpected false alerts.
So in tree-checker, we only check if the file offset and num bytes overflow.
Signed-off-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/tree-checker.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c index 96fce4bef4e7..ccd5706199d7 100644 --- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -132,6 +132,7 @@ static int check_extent_data_item(struct extent_buffer *leaf, struct btrfs_file_extent_item *fi; u32 sectorsize = fs_info->sectorsize; u32 item_size = btrfs_item_size_nr(leaf, slot); + u64 extent_end;
if (!IS_ALIGNED(key->offset, sectorsize)) { file_extent_err(leaf, slot, @@ -207,6 +208,16 @@ static int check_extent_data_item(struct extent_buffer *leaf, CHECK_FE_ALIGNED(leaf, slot, fi, num_bytes, sectorsize)) return -EUCLEAN;
+ /* Catch extent end overflow */ + if (check_add_overflow(btrfs_file_extent_num_bytes(leaf, fi), + key->offset, &extent_end)) { + file_extent_err(leaf, slot, + "extent end overflow, have file offset %llu extent num bytes %llu", + key->offset, + btrfs_file_extent_num_bytes(leaf, fi)); + return -EUCLEAN; + } + /* * Check that no two consecutive file extent items, in the same leaf, * present ranges that overlap each other.
From: David Sterba dsterba@suse.com
[ Upstream commit 0ee5f8ae082e1f675a2fb6db601c31ac9958a134 ]
The list of profiles in btrfs_chunk_max_errors lists DUP as a profile DUP able to tolerate 1 device missing. Though this profile is special with 2 copies, it still needs the device, unlike the others.
Looking at the history of changes, thre's no clear reason why DUP is there, functions were refactored and blocks of code merged to one helper.
d20983b40e828 Btrfs: fix writing data into the seed filesystem - factor code to a helper
de11cc12df173 Btrfs: don't pre-allocate btrfs bio - unrelated change, DUP still in the list with max errors 1
a236aed14ccb0 Btrfs: Deal with failed writes in mirrored configurations - introduced the max errors, leaves DUP and RAID1 in the same group
Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 1c2a6e4b39da..8508f6028c8d 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -5328,8 +5328,7 @@ static inline int btrfs_chunk_max_errors(struct map_lookup *map)
if (map->type & (BTRFS_BLOCK_GROUP_RAID1 | BTRFS_BLOCK_GROUP_RAID10 | - BTRFS_BLOCK_GROUP_RAID5 | - BTRFS_BLOCK_GROUP_DUP)) { + BTRFS_BLOCK_GROUP_RAID5)) { max_errors = 1; } else if (map->type & BTRFS_BLOCK_GROUP_RAID6) { max_errors = 2;
From: Qu Wenruo wqu@suse.com
[ Upstream commit a94d1d0cb3bf1983fcdf05b59d914dbff4f1f52c ]
[BUG] The following script can cause unexpected fsync failure:
#!/bin/bash
dev=/dev/test/test mnt=/mnt/btrfs
mkfs.btrfs -f $dev -b 512M > /dev/null mount $dev $mnt -o nospace_cache
# Prealloc one extent xfs_io -f -c "falloc 8k 64m" $mnt/file1 # Fill the remaining data space xfs_io -f -c "pwrite 0 -b 4k 512M" $mnt/padding sync
# Write into the prealloc extent xfs_io -c "pwrite 1m 16m" $mnt/file1
# Reflink then fsync, fsync would fail due to ENOSPC xfs_io -c "reflink $mnt/file1 8k 0 4k" -c "fsync" $mnt/file1 umount $dev
The fsync fails with ENOSPC, and the last page of the buffered write is lost.
[CAUSE] This is caused by: - Btrfs' back reference only has extent level granularity So write into shared extent must be COWed even only part of the extent is shared.
So for above script we have: - fallocate Create a preallocated extent where we can do NOCOW write.
- fill all the remaining data and unallocated space
- buffered write into preallocated space As we have not enough space available for data and the extent is not shared (yet) we fall into NOCOW mode.
- reflink Now part of the large preallocated extent is shared, later write into that extent must be COWed.
- fsync triggers writeback But now the extent is shared and therefore we must fallback into COW mode, which fails with ENOSPC since there's not enough space to allocate data extents.
[WORKAROUND] The workaround is to ensure any buffered write in the related extents (not just the reflink source range) get flushed before reflink/dedupe, so that NOCOW writes succeed that happened before reflinking succeed.
The workaround is expensive, we could do it better by only flushing NOCOW range, but that needs extra accounting for NOCOW range. For now, fix the possible data loss first.
Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/ioctl.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 2a1be0d1a698..5b4beebf138c 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3999,6 +3999,27 @@ static int btrfs_remap_file_range_prep(struct file *file_in, loff_t pos_in, if (!same_inode) inode_dio_wait(inode_out);
+ /* + * Workaround to make sure NOCOW buffered write reach disk as NOCOW. + * + * Btrfs' back references do not have a block level granularity, they + * work at the whole extent level. + * NOCOW buffered write without data space reserved may not be able + * to fall back to CoW due to lack of data space, thus could cause + * data loss. + * + * Here we take a shortcut by flushing the whole inode, so that all + * nocow write should reach disk as nocow before we increase the + * reference of the extent. We could do better by only flushing NOCOW + * data, but that needs extra accounting. + * + * Also we don't need to check ASYNC_EXTENT, as async extent will be + * CoWed anyway, not affecting nocow part. + */ + ret = filemap_flush(inode_in->i_mapping); + if (ret < 0) + return ret; + ret = btrfs_wait_ordered_range(inode_in, ALIGN_DOWN(pos_in, bs), wb_len); if (ret < 0)
From: Clement Leger cleger@kalray.eu
[ Upstream commit 72f64cabc4bd6985c7355f5547bd3637c82762ac ]
When preparing the subdevice for the vdev, also copy dma_pfn_offset since this is used for sub device dma allocations. Without that, there is incoherency between the parent dma settings and the childs one, potentially leading to dma_alloc_coherent failure (due to phys_to_dma using dma_pfn_offset for translation).
Fixes: 086d08725d34 ("remoteproc: create vdev subdevice with specific dma memory pool") Signed-off-by: Clement Leger cleger@kalray.eu Acked-by: Loic Pallardy loic.pallardy@st.com Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/remoteproc/remoteproc_core.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c index 8b5363223eaa..5031c6806908 100644 --- a/drivers/remoteproc/remoteproc_core.c +++ b/drivers/remoteproc/remoteproc_core.c @@ -512,6 +512,7 @@ static int rproc_handle_vdev(struct rproc *rproc, struct fw_rsc_vdev *rsc, /* Initialise vdev subdevice */ snprintf(name, sizeof(name), "vdev%dbuffer", rvdev->index); rvdev->dev.parent = rproc->dev.parent; + rvdev->dev.dma_pfn_offset = rproc->dev.parent->dma_pfn_offset; rvdev->dev.release = rproc_rvdev_release; dev_set_name(&rvdev->dev, "%s#%s", dev_name(rvdev->dev.parent), name); dev_set_drvdata(&rvdev->dev, rvdev);
From: Qu Wenruo wqu@suse.com
[ Upstream commit e88439debd0a7f969b3ddba6f147152cd0732676 ]
[BUG] Lockdep will report the following circular locking dependency:
WARNING: possible circular locking dependency detected 5.2.0-rc2-custom #24 Tainted: G O ------------------------------------------------------ btrfs/8631 is trying to acquire lock: 000000002536438c (&fs_info->qgroup_ioctl_lock#2){+.+.}, at: btrfs_qgroup_inherit+0x40/0x620 [btrfs]
but task is already holding lock: 000000003d52cc23 (&fs_info->tree_log_mutex){+.+.}, at: create_pending_snapshot+0x8b6/0xe60 [btrfs]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&fs_info->tree_log_mutex){+.+.}: __mutex_lock+0x76/0x940 mutex_lock_nested+0x1b/0x20 btrfs_commit_transaction+0x475/0xa00 [btrfs] btrfs_commit_super+0x71/0x80 [btrfs] close_ctree+0x2bd/0x320 [btrfs] btrfs_put_super+0x15/0x20 [btrfs] generic_shutdown_super+0x72/0x110 kill_anon_super+0x18/0x30 btrfs_kill_super+0x16/0xa0 [btrfs] deactivate_locked_super+0x3a/0x80 deactivate_super+0x51/0x60 cleanup_mnt+0x3f/0x80 __cleanup_mnt+0x12/0x20 task_work_run+0x94/0xb0 exit_to_usermode_loop+0xd8/0xe0 do_syscall_64+0x210/0x240 entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #1 (&fs_info->reloc_mutex){+.+.}: __mutex_lock+0x76/0x940 mutex_lock_nested+0x1b/0x20 btrfs_commit_transaction+0x40d/0xa00 [btrfs] btrfs_quota_enable+0x2da/0x730 [btrfs] btrfs_ioctl+0x2691/0x2b40 [btrfs] do_vfs_ioctl+0xa9/0x6d0 ksys_ioctl+0x67/0x90 __x64_sys_ioctl+0x1a/0x20 do_syscall_64+0x65/0x240 entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #0 (&fs_info->qgroup_ioctl_lock#2){+.+.}: lock_acquire+0xa7/0x190 __mutex_lock+0x76/0x940 mutex_lock_nested+0x1b/0x20 btrfs_qgroup_inherit+0x40/0x620 [btrfs] create_pending_snapshot+0x9d7/0xe60 [btrfs] create_pending_snapshots+0x94/0xb0 [btrfs] btrfs_commit_transaction+0x415/0xa00 [btrfs] btrfs_mksubvol+0x496/0x4e0 [btrfs] btrfs_ioctl_snap_create_transid+0x174/0x180 [btrfs] btrfs_ioctl_snap_create_v2+0x11c/0x180 [btrfs] btrfs_ioctl+0xa90/0x2b40 [btrfs] do_vfs_ioctl+0xa9/0x6d0 ksys_ioctl+0x67/0x90 __x64_sys_ioctl+0x1a/0x20 do_syscall_64+0x65/0x240 entry_SYSCALL_64_after_hwframe+0x49/0xbe
other info that might help us debug this:
Chain exists of: &fs_info->qgroup_ioctl_lock#2 --> &fs_info->reloc_mutex --> &fs_info->tree_log_mutex
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&fs_info->tree_log_mutex); lock(&fs_info->reloc_mutex); lock(&fs_info->tree_log_mutex); lock(&fs_info->qgroup_ioctl_lock#2);
*** DEADLOCK ***
6 locks held by btrfs/8631: #0: 00000000ed8f23f6 (sb_writers#12){.+.+}, at: mnt_want_write_file+0x28/0x60 #1: 000000009fb1597a (&type->i_mutex_dir_key#10/1){+.+.}, at: btrfs_mksubvol+0x70/0x4e0 [btrfs] #2: 0000000088c5ad88 (&fs_info->subvol_sem){++++}, at: btrfs_mksubvol+0x128/0x4e0 [btrfs] #3: 000000009606fc3e (sb_internal#2){.+.+}, at: start_transaction+0x37a/0x520 [btrfs] #4: 00000000f82bbdf5 (&fs_info->reloc_mutex){+.+.}, at: btrfs_commit_transaction+0x40d/0xa00 [btrfs] #5: 000000003d52cc23 (&fs_info->tree_log_mutex){+.+.}, at: create_pending_snapshot+0x8b6/0xe60 [btrfs]
[CAUSE] Due to the delayed subvolume creation, we need to call btrfs_qgroup_inherit() inside commit transaction code, with a lot of other mutex hold. This hell of lock chain can lead to above problem.
[FIX] On the other hand, we don't really need to hold qgroup_ioctl_lock if we're in the context of create_pending_snapshot(). As in that context, we're the only one being able to modify qgroup.
All other qgroup functions which needs qgroup_ioctl_lock are either holding a transaction handle, or will start a new transaction: Functions will start a new transaction(): * btrfs_quota_enable() * btrfs_quota_disable() Functions hold a transaction handler: * btrfs_add_qgroup_relation() * btrfs_del_qgroup_relation() * btrfs_create_qgroup() * btrfs_remove_qgroup() * btrfs_limit_qgroup() * btrfs_qgroup_inherit() call inside create_subvol()
So we have a higher level protection provided by transaction, thus we don't need to always hold qgroup_ioctl_lock in btrfs_qgroup_inherit().
Only the btrfs_qgroup_inherit() call in create_subvol() needs to hold qgroup_ioctl_lock, while the btrfs_qgroup_inherit() call in create_pending_snapshot() is already protected by transaction.
So the fix is to detect the context by checking trans->transaction->state. If we're at TRANS_STATE_COMMIT_DOING, then we're in commit transaction context and no need to get the mutex.
Reported-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/qgroup.c | 24 ++++++++++++++++++++++-- 1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index 3e6ffbbd8b0a..f8a3c1b0a15a 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -2614,6 +2614,7 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid, int ret = 0; int i; u64 *i_qgroups; + bool committing = false; struct btrfs_fs_info *fs_info = trans->fs_info; struct btrfs_root *quota_root; struct btrfs_qgroup *srcgroup; @@ -2621,7 +2622,25 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid, u32 level_size = 0; u64 nums;
- mutex_lock(&fs_info->qgroup_ioctl_lock); + /* + * There are only two callers of this function. + * + * One in create_subvol() in the ioctl context, which needs to hold + * the qgroup_ioctl_lock. + * + * The other one in create_pending_snapshot() where no other qgroup + * code can modify the fs as they all need to either start a new trans + * or hold a trans handler, thus we don't need to hold + * qgroup_ioctl_lock. + * This would avoid long and complex lock chain and make lockdep happy. + */ + spin_lock(&fs_info->trans_lock); + if (trans->transaction->state == TRANS_STATE_COMMIT_DOING) + committing = true; + spin_unlock(&fs_info->trans_lock); + + if (!committing) + mutex_lock(&fs_info->qgroup_ioctl_lock); if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags)) goto out;
@@ -2785,7 +2804,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid, unlock: spin_unlock(&fs_info->qgroup_lock); out: - mutex_unlock(&fs_info->qgroup_ioctl_lock); + if (!committing) + mutex_unlock(&fs_info->qgroup_ioctl_lock); return ret; }
From: Ronnie Sahlberg lsahlber@redhat.com
[ Upstream commit f2caf901c1b7ce65f9e6aef4217e3241039db768 ]
There is a race condition with how we send (or supress and don't send) smb echos that will cause the client to incorrectly think the server is unresponsive and thus needs to be reconnected.
Summary of the race condition: 1) Daisy chaining scheduling creates a gap. 2) If traffic comes unfortunate shortly after the last echo, the planned echo is suppressed. 3) Due to the gap, the next echo transmission is delayed until after the timeout, which is set hard to twice the echo interval.
This is fixed by changing the timeouts from 2 to three times the echo interval.
Detailed description of the bug: https://lutz.donnerhacke.de/eng/Blog/Groundhog-Day-with-SMB-remount
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/connect.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index 8dd6637a3cbb..0872188ce3c6 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -706,10 +706,10 @@ static bool server_unresponsive(struct TCP_Server_Info *server) { /* - * We need to wait 2 echo intervals to make sure we handle such + * We need to wait 3 echo intervals to make sure we handle such * situations right: * 1s client sends a normal SMB request - * 2s client gets a response + * 3s client gets a response * 30s echo workqueue job pops, and decides we got a response recently * and don't need to send another * ... @@ -718,9 +718,9 @@ server_unresponsive(struct TCP_Server_Info *server) */ if ((server->tcpStatus == CifsGood || server->tcpStatus == CifsNeedNegotiate) && - time_after(jiffies, server->lstrp + 2 * server->echo_interval)) { + time_after(jiffies, server->lstrp + 3 * server->echo_interval)) { cifs_dbg(VFS, "Server %s has not responded in %lu seconds. Reconnecting...\n", - server->hostname, (2 * server->echo_interval) / HZ); + server->hostname, (3 * server->echo_interval) / HZ); cifs_reconnect(server); wake_up(&server->response_q); return true;
From: David Disseldorp ddiss@suse.de
[ Upstream commit 2b2abcac8c251d1c77a4cc9d9f248daefae0fb4e ]
ceph_listxattr() incorrectly returns a length based on the static ceph_vxattrs_name_size() value, which only takes into account whether vxattrs are hidden, ignoring vxattr.exists_cb().
When filling the xattr buffer ceph_listxattr() checks VXATTR_FLAG_HIDDEN and vxattr.exists_cb(). If both are false, we return an incorrect (oversize) length.
Fix this behaviour by always calculating the vxattrs length at runtime, taking both vxattr.hidden and vxattr.exists_cb() into account.
This bug is only exposed with the new "ceph.snap.btime" vxattr, as all other vxattrs with a non-null exists_cb also carry VXATTR_FLAG_HIDDEN.
Signed-off-by: David Disseldorp ddiss@suse.de Reviewed-by: "Yan, Zheng" zyan@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/xattr.c | 54 +++++++++++++++++++++++++++---------------------- 1 file changed, 30 insertions(+), 24 deletions(-)
diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c index 0cc42c8879e9..b9df49e13d9c 100644 --- a/fs/ceph/xattr.c +++ b/fs/ceph/xattr.c @@ -897,10 +897,9 @@ ssize_t ceph_listxattr(struct dentry *dentry, char *names, size_t size) struct inode *inode = d_inode(dentry); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_vxattr *vxattrs = ceph_inode_vxattrs(inode); - u32 vir_namelen = 0; + bool len_only = (size == 0); u32 namelen; int err; - u32 len; int i;
spin_lock(&ci->i_ceph_lock); @@ -919,38 +918,45 @@ ssize_t ceph_listxattr(struct dentry *dentry, char *names, size_t size) err = __build_xattrs(inode); if (err < 0) goto out; - /* - * Start with virtual dir xattr names (if any) (including - * terminating '\0' characters for each). - */ - vir_namelen = ceph_vxattrs_name_size(vxattrs);
- /* adding 1 byte per each variable due to the null termination */ + /* add 1 byte for each xattr due to the null termination */ namelen = ci->i_xattrs.names_size + ci->i_xattrs.count; - err = -ERANGE; - if (size && vir_namelen + namelen > size) - goto out; - - err = namelen + vir_namelen; - if (size == 0) - goto out; + if (!len_only) { + if (namelen > size) { + err = -ERANGE; + goto out; + } + names = __copy_xattr_names(ci, names); + size -= namelen; + }
- names = __copy_xattr_names(ci, names);
/* virtual xattr names, too */ - err = namelen; if (vxattrs) { for (i = 0; vxattrs[i].name; i++) { - if (!(vxattrs[i].flags & VXATTR_FLAG_HIDDEN) && - !(vxattrs[i].exists_cb && - !vxattrs[i].exists_cb(ci))) { - len = sprintf(names, "%s", vxattrs[i].name); - names += len + 1; - err += len + 1; + size_t this_len; + + if (vxattrs[i].flags & VXATTR_FLAG_HIDDEN) + continue; + if (vxattrs[i].exists_cb && !vxattrs[i].exists_cb(ci)) + continue; + + this_len = strlen(vxattrs[i].name) + 1; + namelen += this_len; + if (len_only) + continue; + + if (this_len > size) { + err = -ERANGE; + goto out; } + + memcpy(names, vxattrs[i].name, this_len); + names += this_len; + size -= this_len; } } - + err = namelen; out: spin_unlock(&ci->i_ceph_lock); return err;
From: Andrea Parri andrea.parri@amarulasolutions.com
[ Upstream commit 749607731e26dfb2558118038c40e9c0c80d23b5 ]
This barrier only applies to the read-modify-write operations; in particular, it does not apply to the atomic64_set() primitive.
Replace the barrier with an smp_mb().
Fixes: fdd4e15838e59 ("ceph: rework dcache readdir") Reported-by: "Paul E. McKenney" paulmck@linux.ibm.com Reported-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Andrea Parri andrea.parri@amarulasolutions.com Reviewed-by: "Yan, Zheng" zyan@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/super.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 5f27e1f7f2d6..172ce582b995 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -544,7 +544,12 @@ static inline void __ceph_dir_set_complete(struct ceph_inode_info *ci, long long release_count, long long ordered_count) { - smp_mb__before_atomic(); + /* + * Makes sure operations that setup readdir cache (update page + * cache and i_size) are strongly ordered w.r.t. the following + * atomic64_set() operations. + */ + smp_mb(); atomic64_set(&ci->i_complete_seq[0], release_count); atomic64_set(&ci->i_complete_seq[1], ordered_count); }
From: "Yan, Zheng" zyan@redhat.com
[ Upstream commit feab6ac25dbfe3ab96299cb741925dc8d2da0caf ]
It should call __ceph_dentry_dir_lease_touch() under dentry->d_lock. Besides, ceph_dentry(dentry) can be NULL when called by LOOKUP_RCU d_revalidate()
Signed-off-by: "Yan, Zheng" zyan@redhat.com Reviewed-by: Jeff Layton jlayton@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/dir.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-)
diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 0637149fb9f9..1271024a3797 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -1512,18 +1512,26 @@ static int __dir_lease_try_check(const struct dentry *dentry) static int dir_lease_is_valid(struct inode *dir, struct dentry *dentry) { struct ceph_inode_info *ci = ceph_inode(dir); - struct ceph_dentry_info *di = ceph_dentry(dentry); - int valid = 0; + int valid; + int shared_gen;
spin_lock(&ci->i_ceph_lock); - if (atomic_read(&ci->i_shared_gen) == di->lease_shared_gen) - valid = __ceph_caps_issued_mask(ci, CEPH_CAP_FILE_SHARED, 1); + valid = __ceph_caps_issued_mask(ci, CEPH_CAP_FILE_SHARED, 1); + shared_gen = atomic_read(&ci->i_shared_gen); spin_unlock(&ci->i_ceph_lock); - if (valid) - __ceph_dentry_dir_lease_touch(di); - dout("dir_lease_is_valid dir %p v%u dentry %p v%u = %d\n", - dir, (unsigned)atomic_read(&ci->i_shared_gen), - dentry, (unsigned)di->lease_shared_gen, valid); + if (valid) { + struct ceph_dentry_info *di; + spin_lock(&dentry->d_lock); + di = ceph_dentry(dentry); + if (dir == d_inode(dentry->d_parent) && + di && di->lease_shared_gen == shared_gen) + __ceph_dentry_dir_lease_touch(di); + else + valid = 0; + spin_unlock(&dentry->d_lock); + } + dout("dir_lease_is_valid dir %p v%u dentry %p = %d\n", + dir, (unsigned)atomic_read(&ci->i_shared_gen), dentry, valid); return valid; }
From: Jeff Layton jlayton@kernel.org
[ Upstream commit 3b421018f48c482bdc9650f894aa1747cf90e51d ]
The getxattr manpage states that we should return ERANGE if the destination buffer size is too small to hold the value. ceph_vxattrcb_layout does this internally, but we should be doing this for all vxattrs.
Fix the only caller of getxattr_cb to check the returned size against the buffer length and return -ERANGE if it doesn't fit. Drop the same check in ceph_vxattrcb_layout and just rely on the caller to handle it.
Signed-off-by: Jeff Layton jlayton@kernel.org Reviewed-by: "Yan, Zheng" zyan@redhat.com Acked-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/xattr.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c index b9df49e13d9c..0d2f8f9eb592 100644 --- a/fs/ceph/xattr.c +++ b/fs/ceph/xattr.c @@ -79,7 +79,7 @@ static size_t ceph_vxattrcb_layout(struct ceph_inode_info *ci, char *val, const char *ns_field = " pool_namespace="; char buf[128]; size_t len, total_len = 0; - int ret; + ssize_t ret;
pool_ns = ceph_try_get_string(ci->i_layout.pool_ns);
@@ -103,11 +103,8 @@ static size_t ceph_vxattrcb_layout(struct ceph_inode_info *ci, char *val, if (pool_ns) total_len += strlen(ns_field) + pool_ns->len;
- if (!size) { - ret = total_len; - } else if (total_len > size) { - ret = -ERANGE; - } else { + ret = total_len; + if (size >= total_len) { memcpy(val, buf, len); ret = len; if (pool_name) { @@ -835,8 +832,11 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value, if (err) return err; err = -ENODATA; - if (!(vxattr->exists_cb && !vxattr->exists_cb(ci))) + if (!(vxattr->exists_cb && !vxattr->exists_cb(ci))) { err = vxattr->getxattr_cb(ci, value, size); + if (size && size < err) + err = -ERANGE; + } return err; }
From: Ihor Matushchak ihor.matushchak@foobox.net
[ Upstream commit 5e663f0410fa2f355042209154029842ba1abd43 ]
in vm_find_vqs() irq has a wrong type so, in case of no IRQ resource defined, wrong parameter will be passed to request_irq()
Signed-off-by: Ihor Matushchak ihor.matushchak@foobox.net Signed-off-by: Michael S. Tsirkin mst@redhat.com Reviewed-by: Ivan T. Ivanov iivanov.xz@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/virtio/virtio_mmio.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c index f363fbeb5ab0..e09edb5c5e06 100644 --- a/drivers/virtio/virtio_mmio.c +++ b/drivers/virtio/virtio_mmio.c @@ -463,9 +463,14 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs, struct irq_affinity *desc) { struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev); - unsigned int irq = platform_get_irq(vm_dev->pdev, 0); + int irq = platform_get_irq(vm_dev->pdev, 0); int i, err, queue_idx = 0;
+ if (irq < 0) { + dev_err(&vdev->dev, "Cannot get IRQ resource\n"); + return irq; + } + err = request_irq(irq, vm_interrupt, IRQF_SHARED, dev_name(&vdev->dev), vm_dev); if (err)
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit b80d6a42bdc97bdb6139107d6034222e9843c6e2 ]
When CONFIG_DMI is disabled, we only have a tentative declaration, which causes a warning from clang:
drivers/acpi/blacklist.c:20:35: error: tentative array definition assumed to have one element [-Werror] static const struct dmi_system_id acpi_rev_dmi_table[] __initconst;
As the variable is not actually used here, hide it entirely in an #ifdef to shut up the warning.
Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/blacklist.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/acpi/blacklist.c b/drivers/acpi/blacklist.c index ad2c565f5cbe..a86a770c9b79 100644 --- a/drivers/acpi/blacklist.c +++ b/drivers/acpi/blacklist.c @@ -17,7 +17,9 @@
#include "internal.h"
+#ifdef CONFIG_DMI static const struct dmi_system_id acpi_rev_dmi_table[] __initconst; +#endif
/* * POLICY: If *anything* doesn't work, put it on the blacklist. @@ -61,7 +63,9 @@ int __init acpi_blacklisted(void) }
(void)early_acpi_osi_init(); +#ifdef CONFIG_DMI dmi_check_system(acpi_rev_dmi_table); +#endif
return blacklisted; }
From: Benjamin Block bblock@linux.ibm.com
[ Upstream commit 484647088826f2f651acbda6bcf9536b8a466703 ]
GCC v9 emits this warning: CC drivers/s390/scsi/zfcp_erp.o drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_action_enqueue': drivers/s390/scsi/zfcp_erp.c:217:26: warning: 'erp_action' may be used uninitialized in this function [-Wmaybe-uninitialized] 217 | struct zfcp_erp_action *erp_action; | ^~~~~~~~~~
This is a possible false positive case, as also documented in the GCC documentations: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wmaybe-uniniti...
The actual code-sequence is like this: Various callers can invoke the function below with the argument "want" being one of: ZFCP_ERP_ACTION_REOPEN_ADAPTER, ZFCP_ERP_ACTION_REOPEN_PORT_FORCED, ZFCP_ERP_ACTION_REOPEN_PORT, or ZFCP_ERP_ACTION_REOPEN_LUN.
zfcp_erp_action_enqueue(want, ...) ... need = zfcp_erp_required_act(want, ...) need = want ... maybe: need = ZFCP_ERP_ACTION_REOPEN_PORT maybe: need = ZFCP_ERP_ACTION_REOPEN_ADAPTER ... return need ... zfcp_erp_setup_act(need, ...) struct zfcp_erp_action *erp_action; // <== line 217 ... switch(need) { case ZFCP_ERP_ACTION_REOPEN_LUN: ... erp_action = &zfcp_sdev->erp_action; WARN_ON_ONCE(erp_action->port != port); // <== access ... break; case ZFCP_ERP_ACTION_REOPEN_PORT: case ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: ... erp_action = &port->erp_action; WARN_ON_ONCE(erp_action->port != port); // <== access ... break; case ZFCP_ERP_ACTION_REOPEN_ADAPTER: ... erp_action = &adapter->erp_action; WARN_ON_ONCE(erp_action->port != NULL); // <== access ... break; } ... WARN_ON_ONCE(erp_action->adapter != adapter); // <== access
When zfcp_erp_setup_act() is called, 'need' will never be anything else than one of the 4 possible enumeration-names that are used in the switch-case, and 'erp_action' is initialized for every one of them, before it is used. Thus the warning is a false positive, as documented.
We introduce the extra if{} in the beginning to create an extra code-flow, so the compiler can be convinced that the switch-case will never see any other value.
BUG_ON()/BUG() is intentionally not used to not crash anything, should this ever happen anyway - right now it's impossible, as argued above; and it doesn't introduce a 'default:' switch-case to retain warnings should 'enum zfcp_erp_act_type' ever be extended and no explicit case be introduced. See also v5.0 commit 399b6c8bc9f7 ("scsi: zfcp: drop old default switch case which might paper over missing case").
Signed-off-by: Benjamin Block bblock@linux.ibm.com Reviewed-by: Jens Remus jremus@linux.ibm.com Reviewed-by: Steffen Maier maier@linux.ibm.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/s390/scsi/zfcp_erp.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c index e8fc28dba8df..96f0d34e9459 100644 --- a/drivers/s390/scsi/zfcp_erp.c +++ b/drivers/s390/scsi/zfcp_erp.c @@ -11,6 +11,7 @@ #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
#include <linux/kthread.h> +#include <linux/bug.h> #include "zfcp_ext.h" #include "zfcp_reqlist.h"
@@ -217,6 +218,12 @@ static struct zfcp_erp_action *zfcp_erp_setup_act(enum zfcp_erp_act_type need, struct zfcp_erp_action *erp_action; struct zfcp_scsi_dev *zfcp_sdev;
+ if (WARN_ON_ONCE(need != ZFCP_ERP_ACTION_REOPEN_LUN && + need != ZFCP_ERP_ACTION_REOPEN_PORT && + need != ZFCP_ERP_ACTION_REOPEN_PORT_FORCED && + need != ZFCP_ERP_ACTION_REOPEN_ADAPTER)) + return NULL; + switch (need) { case ZFCP_ERP_ACTION_REOPEN_LUN: zfcp_sdev = sdev_to_zfcp(sdev);
From: Ilya Leoshkevich iii@linux.ibm.com
[ Upstream commit 9cae4ace80ef39005da106fbb89c952b27d7b89e ]
When compiling an eBPF prog fails, make still returns 0, because failing clang command's output is piped to llc and therefore its exit status is ignored.
When clang fails, pipe the string "clang failed" to llc. This will make llc fail with an informative error message. This solution was chosen over using pipefail, having separate targets or getting rid of llc invocation due to its simplicity.
In addition, pull Kbuild.include in order to get .DELETE_ON_ERROR target, which would cause partial .o files to be removed.
Signed-off-by: Ilya Leoshkevich iii@linux.ibm.com Acked-by: Andrii Nakryiko andriin@fb.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/bpf/Makefile | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index e36356e2377e..e375f399b7a6 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -1,4 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 +include ../../../../scripts/Kbuild.include
LIBDIR := ../../../lib BPFDIR := $(LIBDIR)/bpf @@ -185,8 +186,8 @@ $(ALU32_BUILD_DIR)/test_progs_32: prog_tests/*.c
$(ALU32_BUILD_DIR)/%.o: progs/%.c $(ALU32_BUILD_DIR) \ $(ALU32_BUILD_DIR)/test_progs_32 - $(CLANG) $(CLANG_FLAGS) \ - -O2 -target bpf -emit-llvm -c $< -o - | \ + ($(CLANG) $(CLANG_FLAGS) -O2 -target bpf -emit-llvm -c $< -o - || \ + echo "clang failed") | \ $(LLC) -march=bpf -mattr=+alu32 -mcpu=$(CPU) $(LLC_FLAGS) \ -filetype=obj -o $@ ifeq ($(DWARF2BTF),y) @@ -197,16 +198,16 @@ endif # Have one program compiled without "-target bpf" to test whether libbpf loads # it successfully $(OUTPUT)/test_xdp.o: progs/test_xdp.c - $(CLANG) $(CLANG_FLAGS) \ - -O2 -emit-llvm -c $< -o - | \ + ($(CLANG) $(CLANG_FLAGS) -O2 -emit-llvm -c $< -o - || \ + echo "clang failed") | \ $(LLC) -march=bpf -mcpu=$(CPU) $(LLC_FLAGS) -filetype=obj -o $@ ifeq ($(DWARF2BTF),y) $(BTF_PAHOLE) -J $@ endif
$(OUTPUT)/%.o: progs/%.c - $(CLANG) $(CLANG_FLAGS) \ - -O2 -target bpf -emit-llvm -c $< -o - | \ + ($(CLANG) $(CLANG_FLAGS) -O2 -target bpf -emit-llvm -c $< -o - || \ + echo "clang failed") | \ $(LLC) -march=bpf -mcpu=$(CPU) $(LLC_FLAGS) -filetype=obj -o $@ ifeq ($(DWARF2BTF),y) $(BTF_PAHOLE) -J $@
From: Nicholas Kazlauskas nicholas.kazlauskas@amd.com
[ Upstream commit 5fdb7c4c7f2691efd760b0b0dc00da4a3699f1a6 ]
[Why] In order to give pin notifications to the sound driver from DM we need to know whether audio is enabled on a stream and what pin it's using from DC.
[How] Expose the instance via stream status if it's a mapped resource for the stream. It will be -1 if there's no audio mapped.
Cc: Leo Li sunpeng.li@amd.com Cc: Harry Wentland harry.wentland@amd.com Signed-off-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 3 +++ drivers/gpu/drm/amd/display/dc/dc_stream.h | 1 + 2 files changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c index eac7186e4f08..12142d13f22f 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c @@ -2034,6 +2034,9 @@ enum dc_status resource_map_pool_resources( if (context->streams[i] == stream) { context->stream_status[i].primary_otg_inst = pipe_ctx->stream_res.tg->inst; context->stream_status[i].stream_enc_inst = pipe_ctx->stream_res.stream_enc->id; + context->stream_status[i].audio_inst = + pipe_ctx->stream_res.audio ? pipe_ctx->stream_res.audio->inst : -1; + return DC_OK; }
diff --git a/drivers/gpu/drm/amd/display/dc/dc_stream.h b/drivers/gpu/drm/amd/display/dc/dc_stream.h index 189bdab929a5..c20803b71fa5 100644 --- a/drivers/gpu/drm/amd/display/dc/dc_stream.h +++ b/drivers/gpu/drm/amd/display/dc/dc_stream.h @@ -42,6 +42,7 @@ struct dc_stream_status { int primary_otg_inst; int stream_enc_inst; int plane_count; + int audio_inst; struct timing_sync_info timing_sync_info; struct dc_plane_state *plane_states[MAX_SURFACE_NUM]; };
From: Ronnie Sahlberg lsahlber@redhat.com
[ Upstream commit ce465bf94b70f03136171a62b607864f00093b19 ]
RHBZ: 1649907
Fix a crash that happens while attempting to mount a DFS referral from the same server on the root of a filesystem.
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/connect.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index 0872188ce3c6..e508613f09ff 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -4459,11 +4459,13 @@ cifs_are_all_path_components_accessible(struct TCP_Server_Info *server, unsigned int xid, struct cifs_tcon *tcon, struct cifs_sb_info *cifs_sb, - char *full_path) + char *full_path, + int added_treename) { int rc; char *s; char sep, tmp; + int skip = added_treename ? 1 : 0;
sep = CIFS_DIR_SEP(cifs_sb); s = full_path; @@ -4478,7 +4480,14 @@ cifs_are_all_path_components_accessible(struct TCP_Server_Info *server, /* next separator */ while (*s && *s != sep) s++; - + /* + * if the treename is added, we then have to skip the first + * part within the separators + */ + if (skip) { + skip = 0; + continue; + } /* * temporarily null-terminate the path at the end of * the current component @@ -4526,8 +4535,7 @@ static int is_path_remote(struct cifs_sb_info *cifs_sb, struct smb_vol *vol,
if (rc != -EREMOTE) { rc = cifs_are_all_path_components_accessible(server, xid, tcon, - cifs_sb, - full_path); + cifs_sb, full_path, tcon->Flags & SMB_SHARE_IS_IN_DFS); if (rc != 0) { cifs_dbg(VFS, "cannot query dirs between root and final path, " "enabling CIFS_MOUNT_USE_PREFIX_PATH\n");
From: Ravi Bangoria ravi.bangoria@linux.ibm.com
[ Upstream commit 916c31fff946fae0e05862f9b2435fdb29fd5090 ]
'perf version' on powerpc segfaults when used with non-supported option: # perf version -a Segmentation fault (core dumped)
Fix this.
Signed-off-by: Ravi Bangoria ravi.bangoria@linux.ibm.com Reviewed-by: Kamalesh Babulal kamalesh@linux.vnet.ibm.com Tested-by: Mamatha Inamdar mamatha4@linux.vnet.ibm.com Cc: Jiri Olsa jolsa@redhat.com Cc: Kamalesh Babulal kamalesh@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/20190611030109.20228-1-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-version.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/perf/builtin-version.c b/tools/perf/builtin-version.c index f470144d1a70..bf114ca9ca87 100644 --- a/tools/perf/builtin-version.c +++ b/tools/perf/builtin-version.c @@ -19,6 +19,7 @@ static struct version version; static struct option version_options[] = { OPT_BOOLEAN(0, "build-options", &version.build_options, "display the build options"), + OPT_END(), };
static const char * const version_usage[] = {
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit a6a6d3b1f867d34ba5bd61aa7bb056b48ca67cff ]
clang finds a contruct suspicious that converts an unsigned character to a signed integer and back, causing an overflow:
arch/x86/kvm/mmu.c:4605:39: error: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -205 to 51 [-Werror,-Wconstant-conversion] u8 wf = (pfec & PFERR_WRITE_MASK) ? ~w : 0; ~~ ^~ arch/x86/kvm/mmu.c:4607:38: error: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -241 to 15 [-Werror,-Wconstant-conversion] u8 uf = (pfec & PFERR_USER_MASK) ? ~u : 0; ~~ ^~ arch/x86/kvm/mmu.c:4609:39: error: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -171 to 85 [-Werror,-Wconstant-conversion] u8 ff = (pfec & PFERR_FETCH_MASK) ? ~x : 0; ~~ ^~
Add an explicit cast to tell clang that everything works as intended here.
Signed-off-by: Arnd Bergmann arnd@arndb.de Link: https://github.com/ClangBuiltLinux/linux/issues/95 Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/mmu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 98f6e4f88b04..8d95c81b2c82 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -4593,11 +4593,11 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu, */
/* Faults from writes to non-writable pages */ - u8 wf = (pfec & PFERR_WRITE_MASK) ? ~w : 0; + u8 wf = (pfec & PFERR_WRITE_MASK) ? (u8)~w : 0; /* Faults from user mode accesses to supervisor pages */ - u8 uf = (pfec & PFERR_USER_MASK) ? ~u : 0; + u8 uf = (pfec & PFERR_USER_MASK) ? (u8)~u : 0; /* Faults from fetches of non-executable pages*/ - u8 ff = (pfec & PFERR_FETCH_MASK) ? ~x : 0; + u8 ff = (pfec & PFERR_FETCH_MASK) ? (u8)~x : 0; /* Faults from kernel mode fetches of user pages */ u8 smepf = 0; /* Faults from kernel mode accesses of user pages */
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit dfd6f9ad36368b8dbd5f5a2b2f0a4705ae69a323 ]
clang gets confused by an uninitialized variable in what looks to it like a never executed code path:
arch/x86/kernel/acpi/boot.c:618:13: error: variable 'polarity' is uninitialized when used here [-Werror,-Wuninitialized] polarity = polarity ? ACPI_ACTIVE_LOW : ACPI_ACTIVE_HIGH; ^~~~~~~~ arch/x86/kernel/acpi/boot.c:606:32: note: initialize the variable 'polarity' to silence this warning int rc, irq, trigger, polarity; ^ = 0 arch/x86/kernel/acpi/boot.c:617:12: error: variable 'trigger' is uninitialized when used here [-Werror,-Wuninitialized] trigger = trigger ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE; ^~~~~~~ arch/x86/kernel/acpi/boot.c:606:22: note: initialize the variable 'trigger' to silence this warning int rc, irq, trigger, polarity; ^ = 0
This is unfortunately a design decision in clang and won't be fixed.
Changing the acpi_get_override_irq() macro to an inline function reliably avoids the issue.
Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/acpi.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/linux/acpi.h b/include/linux/acpi.h index d315d86844e4..872ab208c8ad 100644 --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -317,7 +317,10 @@ void acpi_set_irq_model(enum acpi_irq_model_id model, #ifdef CONFIG_X86_IO_APIC extern int acpi_get_override_irq(u32 gsi, int *trigger, int *polarity); #else -#define acpi_get_override_irq(gsi, trigger, polarity) (-1) +static inline int acpi_get_override_irq(u32 gsi, int *trigger, int *polarity) +{ + return -1; +} #endif /* * This function undoes the effect of one call to acpi_register_gsi().
From: Phong Tran tranmanphong@gmail.com
[ Upstream commit f384e62a82ba5d85408405fdd6aeff89354deaa9 ]
The syzbot test with random endpoint address which made the idx is overflow in the table of endpoint configuations.
this adds the checking for fixing the error report from syzbot
KASAN: stack-out-of-bounds Read in hfcsusb_probe [1] The patch tested by syzbot [2]
Reported-by: syzbot+8750abbc3a46ef47d509@syzkaller.appspotmail.com
[1]: https://syzkaller.appspot.com/bug?id=30a04378dac680c5d521304a00a86156bb91352... [2]: https://groups.google.com/d/msg/syzkaller-bugs/_6HBdge8F3E/OJn7wVNpBAAJ
Signed-off-by: Phong Tran tranmanphong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/isdn/hardware/mISDN/hfcsusb.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/isdn/hardware/mISDN/hfcsusb.c b/drivers/isdn/hardware/mISDN/hfcsusb.c index 4c99739b937e..0e224232f746 100644 --- a/drivers/isdn/hardware/mISDN/hfcsusb.c +++ b/drivers/isdn/hardware/mISDN/hfcsusb.c @@ -1955,6 +1955,9 @@ hfcsusb_probe(struct usb_interface *intf, const struct usb_device_id *id)
/* get endpoint base */ idx = ((ep_addr & 0x7f) - 1) * 2; + if (idx > 15) + return -EIO; + if (ep_addr & 0x80) idx++; attr = ep->desc.bmAttributes;
From: Liran Alon liran.alon@oracle.com
[ Upstream commit 6694e48012826351036fd10fc506ca880023e25f ]
As reported by Maxime at https://bugzilla.kernel.org/show_bug.cgi?id=204175:
In vmx/nested.c::get_vmx_mem_address(), when the guest runs in long mode, the base address of the memory operand is computed with a simple: *ret = s.base + off;
This is incorrect, the base applies only to FS and GS, not to the others. Because of that, if the guest uses a VMX instruction based on DS and has a DS.base that is non-zero, KVM wrongfully adds the base to the resulting address.
Reported-by: Maxime Villard max@m00nbsd.net Reviewed-by: Joao Martins joao.m.martins@oracle.com Signed-off-by: Liran Alon liran.alon@oracle.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/vmx/nested.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 46af3a5e9209..aa949ab7850c 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -4068,7 +4068,10 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification, * mode, e.g. a 32-bit address size can yield a 64-bit virtual * address when using FS/GS with a non-zero base. */ - *ret = s.base + off; + if (seg_reg == VCPU_SREG_FS || seg_reg == VCPU_SREG_GS) + *ret = s.base + off; + else + *ret = off;
/* Long mode: #GP(0)/#SS(0) if the memory address is in a * non-canonical form. This is the only check on the memory
From: Andrii Nakryiko andriin@fb.com
[ Upstream commit 1acc5d5c5832da9a98b22374a8fae08ffe31b3f8 ]
BTF verifier has a size resolution bug which in some circumstances leads to invalid size resolution for, e.g., TYPEDEF modifier. This happens if we have [1] PTR -> [2] TYPEDEF -> [3] ARRAY, in which case due to being in pointer context ARRAY size won't be resolved (because for pointer it doesn't matter, so it's a sink in pointer context), but it will be permanently remembered as zero for TYPEDEF and TYPEDEF will be marked as RESOLVED. Eventually ARRAY size will be resolved correctly, but TYPEDEF resolved_size won't be updated anymore. This, subsequently, will lead to erroneous map creation failure, if that TYPEDEF is specified as either key or value, as key_size/value_size won't correspond to resolved size of TYPEDEF (kernel will believe it's zero).
Note, that if BTF was ordered as [1] ARRAY <- [2] TYPEDEF <- [3] PTR, this won't be a problem, as by the time we get to TYPEDEF, ARRAY's size is already calculated and stored.
This bug manifests itself in rejecting BTF-defined maps that use array typedef as a value type:
typedef int array_t[16];
struct { __uint(type, BPF_MAP_TYPE_ARRAY); __type(value, array_t); /* i.e., array_t *value; */ } test_map SEC(".maps");
The fix consists on not relying on modifier's resolved_size and instead using modifier's resolved_id (type ID for "concrete" type to which modifier eventually resolves) and doing size determination for that resolved type. This allow to preserve existing "early DFS termination" logic for PTR or STRUCT_OR_ARRAY contexts, but still do correct size determination for modifier types.
Fixes: eb3f595dab40 ("bpf: btf: Validate type reference") Cc: Martin KaFai Lau kafai@fb.com Signed-off-by: Andrii Nakryiko andriin@fb.com Acked-by: Martin KaFai Lau kafai@fb.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/btf.c | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index cad09858a5f2..8614102971da 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -1073,11 +1073,18 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, !btf_type_is_var(size_type))) return NULL;
- size = btf->resolved_sizes[size_type_id]; size_type_id = btf->resolved_ids[size_type_id]; size_type = btf_type_by_id(btf, size_type_id); if (btf_type_nosize_or_null(size_type)) return NULL; + else if (btf_type_has_size(size_type)) + size = size_type->size; + else if (btf_type_is_array(size_type)) + size = btf->resolved_sizes[size_type_id]; + else if (btf_type_is_ptr(size_type)) + size = sizeof(void *); + else + return NULL; }
*type_id = size_type_id; @@ -1602,7 +1609,6 @@ static int btf_modifier_resolve(struct btf_verifier_env *env, const struct btf_type *next_type; u32 next_type_id = t->type; struct btf *btf = env->btf; - u32 next_type_size = 0;
next_type = btf_type_by_id(btf, next_type_id); if (!next_type || btf_type_is_resolve_source_only(next_type)) { @@ -1620,7 +1626,7 @@ static int btf_modifier_resolve(struct btf_verifier_env *env, * save us a few type-following when we use it later (e.g. in * pretty print). */ - if (!btf_type_id_size(btf, &next_type_id, &next_type_size)) { + if (!btf_type_id_size(btf, &next_type_id, NULL)) { if (env_type_is_resolved(env, next_type_id)) next_type = btf_type_id_resolve(btf, &next_type_id);
@@ -1633,7 +1639,7 @@ static int btf_modifier_resolve(struct btf_verifier_env *env, } }
- env_stack_pop_resolved(env, next_type_id, next_type_size); + env_stack_pop_resolved(env, next_type_id, 0);
return 0; } @@ -1645,7 +1651,6 @@ static int btf_var_resolve(struct btf_verifier_env *env, const struct btf_type *t = v->t; u32 next_type_id = t->type; struct btf *btf = env->btf; - u32 next_type_size;
next_type = btf_type_by_id(btf, next_type_id); if (!next_type || btf_type_is_resolve_source_only(next_type)) { @@ -1675,12 +1680,12 @@ static int btf_var_resolve(struct btf_verifier_env *env, * forward types or similar that would resolve to size of * zero is allowed. */ - if (!btf_type_id_size(btf, &next_type_id, &next_type_size)) { + if (!btf_type_id_size(btf, &next_type_id, NULL)) { btf_verifier_log_type(env, v->t, "Invalid type_id"); return -EINVAL; }
- env_stack_pop_resolved(env, next_type_id, next_type_size); + env_stack_pop_resolved(env, next_type_id, 0);
return 0; }
From: Andrii Nakryiko andriin@fb.com
[ Upstream commit 763ff0e7d9c72e7094b31e7fb84a859be9325635 ]
Similar issue was fixed in cdfc7f888c2a ("libbpf: fix GCC8 warning for strncpy") already. This one was missed. Fixing now.
Cc: Magnus Karlsson magnus.karlsson@intel.com Signed-off-by: Andrii Nakryiko andriin@fb.com Acked-by: Magnus Karlsson magnus.karlsson@intel.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/xsk.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c index 38667b62f1fe..aa37005209aa 100644 --- a/tools/lib/bpf/xsk.c +++ b/tools/lib/bpf/xsk.c @@ -561,7 +561,8 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname, err = -errno; goto out_socket; } - strncpy(xsk->ifname, ifname, IFNAMSIZ); + strncpy(xsk->ifname, ifname, IFNAMSIZ - 1); + xsk->ifname[IFNAMSIZ - 1] = '\0';
err = xsk_set_xdp_socket_config(&xsk->config, usr_config); if (err)
From: Benjamin Poirier bpoirier@suse.com
[ Upstream commit 7429c6c0d9cb086d8e79f0d2a48ae14851d2115e ]
While changing the number of interrupt channels, be2net stops adapter operation (including netif_tx_disable()) but it doesn't signal that it cannot transmit. This may lead dev_watchdog() to falsely trigger during that time.
Add the missing call to netif_carrier_off(), following the pattern used in many other drivers. netif_carrier_on() is already taken care of in be_open().
Signed-off-by: Benjamin Poirier bpoirier@suse.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/emulex/benet/be_main.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c index 82015c8a5ed7..b7a246b33599 100644 --- a/drivers/net/ethernet/emulex/benet/be_main.c +++ b/drivers/net/ethernet/emulex/benet/be_main.c @@ -4697,8 +4697,12 @@ int be_update_queues(struct be_adapter *adapter) struct net_device *netdev = adapter->netdev; int status;
- if (netif_running(netdev)) + if (netif_running(netdev)) { + /* device cannot transmit now, avoid dev_watchdog timeouts */ + netif_carrier_off(netdev); + be_close(netdev); + }
be_cancel_worker(adapter);
From: Vitaly Wool vitalywool@gmail.com
[ Upstream commit bb9a374dfa3a2f46581455ab66cd1d24c5e3d183 ]
As reported by Henry Burns:
Running z3fold stress testing with address sanitization showed zhdr->slots was being used after it was freed.
z3fold_free(z3fold_pool, handle) free_handle(handle) kmem_cache_free(pool->c_handle, zhdr->slots) release_z3fold_page_locked_list(kref) __release_z3fold_page(zhdr, true) zhdr_to_pool(zhdr) slots_to_pool(zhdr->slots) *BOOM*
To fix this, add pointer to the pool back to z3fold_header and modify zhdr_to_pool to return zhdr->pool.
Link: http://lkml.kernel.org/r/20190708134808.e89f3bfadd9f6ffd7eff9ba9@gmail.com Fixes: 7c2b8baa61fe ("mm/z3fold.c: add structure for buddy handles") Signed-off-by: Vitaly Wool vitalywool@gmail.com Reported-by: Henry Burns henryburns@google.com Reviewed-by: Shakeel Butt shakeelb@google.com Cc: Jonathan Adams jwadams@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/z3fold.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/z3fold.c b/mm/z3fold.c index 985732c8b025..e1686bf6d689 100644 --- a/mm/z3fold.c +++ b/mm/z3fold.c @@ -101,6 +101,7 @@ struct z3fold_buddy_slots { * @refcount: reference count for the z3fold page * @work: work_struct for page layout optimization * @slots: pointer to the structure holding buddy slots + * @pool: pointer to the containing pool * @cpu: CPU which this page "belongs" to * @first_chunks: the size of the first buddy in chunks, 0 if free * @middle_chunks: the size of the middle buddy in chunks, 0 if free @@ -114,6 +115,7 @@ struct z3fold_header { struct kref refcount; struct work_struct work; struct z3fold_buddy_slots *slots; + struct z3fold_pool *pool; short cpu; unsigned short first_chunks; unsigned short middle_chunks; @@ -320,6 +322,7 @@ static struct z3fold_header *init_z3fold_page(struct page *page, zhdr->start_middle = 0; zhdr->cpu = -1; zhdr->slots = slots; + zhdr->pool = pool; INIT_LIST_HEAD(&zhdr->buddy); INIT_WORK(&zhdr->work, compact_page_work); return zhdr; @@ -426,7 +429,7 @@ static enum buddy handle_to_buddy(unsigned long handle)
static inline struct z3fold_pool *zhdr_to_pool(struct z3fold_header *zhdr) { - return slots_to_pool(zhdr->slots); + return zhdr->pool; }
static void __release_z3fold_page(struct z3fold_header *zhdr, bool locked)
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit a07057dce2823e10d64a2b73cefbf09d8645efe9 ]
Clang gets rather confused about two variables in the same special section when one of them is not initialized, leading to an assembler warning later:
/tmp/slab_common-18f869.s: Assembler messages: /tmp/slab_common-18f869.s:7526: Warning: ignoring changed section attributes for .data..ro_after_init
Adding an initialization to kmalloc_caches is rather silly here but does avoid the issue.
Link: https://bugs.llvm.org/show_bug.cgi?id=42570 Link: http://lkml.kernel.org/r/20190712090455.266021-1-arnd@arndb.de Signed-off-by: Arnd Bergmann arnd@arndb.de Acked-by: David Rientjes rientjes@google.com Reviewed-by: Andrew Morton akpm@linux-foundation.org Cc: Christoph Lameter cl@linux.com Cc: Pekka Enberg penberg@kernel.org Cc: Joonsoo Kim iamjoonsoo.kim@lge.com Cc: Stephen Rothwell sfr@canb.auug.org.au Cc: Roman Gushchin guro@fb.com Cc: Shakeel Butt shakeelb@google.com Cc: Vladimir Davydov vdavydov.dev@gmail.com Cc: Andrey Konovalov andreyknvl@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/slab_common.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/slab_common.c b/mm/slab_common.c index 58251ba63e4a..cbd3411f644e 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -1003,7 +1003,8 @@ struct kmem_cache *__init create_kmalloc_cache(const char *name, }
struct kmem_cache * -kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1] __ro_after_init; +kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1] __ro_after_init = +{ /* initialization for https://bugs.llvm.org/show_bug.cgi?id=42570 */ }; EXPORT_SYMBOL(kmalloc_caches);
/*
From: Yafang Shao laoar.shao@gmail.com
[ Upstream commit 766a4c19d880887c457811b86f1f68525e416965 ]
After commit 815744d75152 ("mm: memcontrol: don't batch updates of local VM stats and events"), the local VM counter are not in sync with the hierarchical ones.
Below is one example in a leaf memcg on my server (with 8 CPUs):
inactive_file 3567570944 total_inactive_file 3568029696
We find that the deviation is very great because the 'val' in __mod_memcg_state() is in pages while the effective value in memcg_stat_show() is in bytes.
So the maximum of this deviation between local VM stats and total VM stats can be (32 * number_of_cpu * PAGE_SIZE), that may be an unacceptably great value.
We should keep the local VM stats in sync with the total stats. In order to keep this behavior the same across counters, this patch updates __mod_lruvec_state() and __count_memcg_events() as well.
Link: http://lkml.kernel.org/r/1562851979-10610-1-git-send-email-laoar.shao@gmail.... Signed-off-by: Yafang Shao laoar.shao@gmail.com Acked-by: Johannes Weiner hannes@cmpxchg.org Cc: Michal Hocko mhocko@kernel.org Cc: Vladimir Davydov vdavydov.dev@gmail.com Cc: Yafang Shao shaoyafang@didiglobal.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/memcontrol.c | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c index ba9138a4a1de..07b4ca559bcc 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -691,12 +691,15 @@ void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val) if (mem_cgroup_disabled()) return;
- __this_cpu_add(memcg->vmstats_local->stat[idx], val); - x = val + __this_cpu_read(memcg->vmstats_percpu->stat[idx]); if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) { struct mem_cgroup *mi;
+ /* + * Batch local counters to keep them in sync with + * the hierarchical ones. + */ + __this_cpu_add(memcg->vmstats_local->stat[idx], x); for (mi = memcg; mi; mi = parent_mem_cgroup(mi)) atomic_long_add(x, &mi->vmstats[idx]); x = 0; @@ -745,13 +748,15 @@ void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx, /* Update memcg */ __mod_memcg_state(memcg, idx, val);
- /* Update lruvec */ - __this_cpu_add(pn->lruvec_stat_local->count[idx], val); - x = val + __this_cpu_read(pn->lruvec_stat_cpu->count[idx]); if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) { struct mem_cgroup_per_node *pi;
+ /* + * Batch local counters to keep them in sync with + * the hierarchical ones. + */ + __this_cpu_add(pn->lruvec_stat_local->count[idx], x); for (pi = pn; pi; pi = parent_nodeinfo(pi, pgdat->node_id)) atomic_long_add(x, &pi->lruvec_stat[idx]); x = 0; @@ -773,12 +778,15 @@ void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx, if (mem_cgroup_disabled()) return;
- __this_cpu_add(memcg->vmstats_local->events[idx], count); - x = count + __this_cpu_read(memcg->vmstats_percpu->events[idx]); if (unlikely(x > MEMCG_CHARGE_BATCH)) { struct mem_cgroup *mi;
+ /* + * Batch local counters to keep them in sync with + * the hierarchical ones. + */ + __this_cpu_add(memcg->vmstats_local->events[idx], x); for (mi = memcg; mi; mi = parent_mem_cgroup(mi)) atomic_long_add(x, &mi->vmevents[idx]); x = 0;
From: Henry Burns henryburns@google.com
[ Upstream commit c92d2f38563db20c20c8db2f98fa1349290477d5 ]
z3fold_page_migration() calls memcpy(new_zhdr, zhdr, PAGE_SIZE). However, zhdr contains fields that can't be directly coppied over (ex: list_head, a circular linked list). We only need to initialize the linked lists in new_zhdr, as z3fold_isolate_page() already ensures that these lists are empty
Additionally it is possible that zhdr->work has been placed in a workqueue. In this case we shouldn't migrate the page, as zhdr->work references zhdr as opposed to new_zhdr.
Link: http://lkml.kernel.org/r/20190716000520.230595-1-henryburns@google.com Fixes: 1f862989b04ade61d3 ("mm/z3fold.c: support page migration") Signed-off-by: Henry Burns henryburns@google.com Reviewed-by: Shakeel Butt shakeelb@google.com Cc: Vitaly Vul vitaly.vul@sony.com Cc: Vitaly Wool vitalywool@gmail.com Cc: Jonathan Adams jwadams@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/z3fold.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/mm/z3fold.c b/mm/z3fold.c index e1686bf6d689..7e764b0d8c8a 100644 --- a/mm/z3fold.c +++ b/mm/z3fold.c @@ -1350,12 +1350,22 @@ static int z3fold_page_migrate(struct address_space *mapping, struct page *newpa unlock_page(page); return -EBUSY; } + if (work_pending(&zhdr->work)) { + z3fold_page_unlock(zhdr); + return -EAGAIN; + } new_zhdr = page_address(newpage); memcpy(new_zhdr, zhdr, PAGE_SIZE); newpage->private = page->private; page->private = 0; z3fold_page_unlock(zhdr); spin_lock_init(&new_zhdr->page_lock); + INIT_WORK(&new_zhdr->work, compact_page_work); + /* + * z3fold_page_isolate() ensures that new_zhdr->buddy is empty, + * so we only have to reinitialize it. + */ + INIT_LIST_HEAD(&new_zhdr->buddy); new_mapping = page_mapping(page); __ClearPageMovable(page); ClearPagePrivate(page);
From: Qian Cai cai@lca.pw
[ Upstream commit ec6335586953b0df32f83ef696002063090c7aef ]
There are many compiler warnings like this,
In file included from ./arch/x86/include/asm/smp.h:13, from ./arch/x86/include/asm/mmzone_64.h:11, from ./arch/x86/include/asm/mmzone.h:5, from ./include/linux/mmzone.h:969, from ./include/linux/gfp.h:6, from ./include/linux/mm.h:10, from arch/x86/kernel/apic/io_apic.c:34: arch/x86/kernel/apic/io_apic.c: In function 'check_timer': ./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] if ((v) <= apic_verbosity) \ ^~ arch/x86/kernel/apic/io_apic.c:2160:2: note: in expansion of macro 'apic_printk' apic_printk(APIC_QUIET, KERN_INFO "..TIMER: vector=0x%02X " ^~~~~~~~~~~ ./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] if ((v) <= apic_verbosity) \ ^~ arch/x86/kernel/apic/io_apic.c:2207:4: note: in expansion of macro 'apic_printk' apic_printk(APIC_QUIET, KERN_ERR "..MP-BIOS bug: " ^~~~~~~~~~~
APIC_QUIET is 0, so silence them by making apic_verbosity type int.
Signed-off-by: Qian Cai cai@lca.pw Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/1562621805-24789-1-git-send-email-cai@lca.pw Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/apic.h | 2 +- arch/x86/kernel/apic/apic.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 1340fa53b575..2e599384abd8 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -49,7 +49,7 @@ static inline void generic_apic_probe(void)
#ifdef CONFIG_X86_LOCAL_APIC
-extern unsigned int apic_verbosity; +extern int apic_verbosity; extern int local_apic_timer_c2_ok;
extern int disable_apic; diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 16c21ed97cb2..530cf1fd68a2 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -183,7 +183,7 @@ EXPORT_SYMBOL_GPL(local_apic_timer_c2_ok); /* * Debug level, exported for io_apic.c */ -unsigned int apic_verbosity; +int apic_verbosity;
int pic_mode;
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 29e7e9664aec17b94a9c8c5a75f8d216a206aa3a ]
clang warns about a few parts of the math-emu implementation where a 16-bit integer becomes negative during assignment:
arch/x86/math-emu/poly_tan.c:88:35: error: implicit conversion from 'int' to 'short' changes value from 49216 to -16320 [-Werror,-Wconstant-conversion] (0x41 + EXTENDED_Ebias) | SIGN_Negative); ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~ arch/x86/math-emu/fpu_emu.h:180:58: note: expanded from macro 'setexponent16' #define setexponent16(x,y) { (*(short *)&((x)->exp)) = (y); } ~ ^ arch/x86/math-emu/reg_constant.c:37:32: error: implicit conversion from 'int' to 'short' changes value from 49085 to -16451 [-Werror,-Wconstant-conversion] FPU_REG const CONST_PI2extra = MAKE_REG(NEG, -66, ^~~~~~~~~~~~~~~~~~ arch/x86/math-emu/reg_constant.c:21:25: note: expanded from macro 'MAKE_REG' ((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) } ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~ arch/x86/math-emu/reg_constant.c:48:28: error: implicit conversion from 'int' to 'short' changes value from 65535 to -1 [-Werror,-Wconstant-conversion] FPU_REG const CONST_QNaN = MAKE_REG(NEG, EXP_OVER, 0x00000000, 0xC0000000); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/x86/math-emu/reg_constant.c:21:25: note: expanded from macro 'MAKE_REG' ((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) } ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
The code is correct as is, so add a typecast to shut up the warnings.
Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/20190712090816.350668-1-arnd@arndb.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/math-emu/fpu_emu.h | 2 +- arch/x86/math-emu/reg_constant.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/math-emu/fpu_emu.h b/arch/x86/math-emu/fpu_emu.h index a5a41ec58072..0c122226ca56 100644 --- a/arch/x86/math-emu/fpu_emu.h +++ b/arch/x86/math-emu/fpu_emu.h @@ -177,7 +177,7 @@ static inline void reg_copy(FPU_REG const *x, FPU_REG *y) #define setexponentpos(x,y) { (*(short *)&((x)->exp)) = \ ((y) + EXTENDED_Ebias) & 0x7fff; } #define exponent16(x) (*(short *)&((x)->exp)) -#define setexponent16(x,y) { (*(short *)&((x)->exp)) = (y); } +#define setexponent16(x,y) { (*(short *)&((x)->exp)) = (u16)(y); } #define addexponent(x,y) { (*(short *)&((x)->exp)) += (y); } #define stdexp(x) { (*(short *)&((x)->exp)) += EXTENDED_Ebias; }
diff --git a/arch/x86/math-emu/reg_constant.c b/arch/x86/math-emu/reg_constant.c index 8dc9095bab22..742619e94bdf 100644 --- a/arch/x86/math-emu/reg_constant.c +++ b/arch/x86/math-emu/reg_constant.c @@ -18,7 +18,7 @@ #include "control_w.h"
#define MAKE_REG(s, e, l, h) { l, h, \ - ((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) } + (u16)((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) }
FPU_REG const CONST_1 = MAKE_REG(POS, 0, 0x00000000, 0x80000000); #if 0
From: Doug Berger opendmb@gmail.com
[ Upstream commit c633324e311243586675e732249339685e5d6faa ]
The description of cma_declare_contiguous() indicates that if the 'fixed' argument is true the reserved contiguous area must be exactly at the address of the 'base' argument.
However, the function currently allows the 'base', 'size', and 'limit' arguments to be silently adjusted to meet alignment constraints. This commit enforces the documented behavior through explicit checks that return an error if the region does not fit within a specified region.
Link: http://lkml.kernel.org/r/1561422051-16142-1-git-send-email-opendmb@gmail.com Fixes: 5ea3b1b2f8ad ("cma: add placement specifier for "cma=" kernel parameter") Signed-off-by: Doug Berger opendmb@gmail.com Acked-by: Michal Nazarewicz mina86@mina86.com Cc: Yue Hu huyue2@yulong.com Cc: Mike Rapoport rppt@linux.ibm.com Cc: Laura Abbott labbott@redhat.com Cc: Peng Fan peng.fan@nxp.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Marek Szyprowski m.szyprowski@samsung.com Cc: Andrey Konovalov andreyknvl@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/cma.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/mm/cma.c b/mm/cma.c index 3340ef34c154..4973d253dc83 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -278,6 +278,12 @@ int __init cma_declare_contiguous(phys_addr_t base, */ alignment = max(alignment, (phys_addr_t)PAGE_SIZE << max_t(unsigned long, MAX_ORDER - 1, pageblock_order)); + if (fixed && base & (alignment - 1)) { + ret = -EINVAL; + pr_err("Region at %pa must be aligned to %pa bytes\n", + &base, &alignment); + goto err; + } base = ALIGN(base, alignment); size = ALIGN(size, alignment); limit &= ~(alignment - 1); @@ -308,6 +314,13 @@ int __init cma_declare_contiguous(phys_addr_t base, if (limit == 0 || limit > memblock_end) limit = memblock_end;
+ if (base + size > limit) { + ret = -EINVAL; + pr_err("Size (%pa) of region at %pa exceeds limit (%pa)\n", + &size, &base, &limit); + goto err; + } + /* Reserve memory */ if (fixed) { if (memblock_is_region_reserved(base, size) ||
From: Kees Cook keescook@chromium.org
[ Upstream commit 8e060c21ae2c265a2b596e9e7f9f97ec274151a4 ]
This adds __GFP_NOWARN to the kmalloc()-portions of the overflow test to avoid tainting the kernel. Additionally fixes up the math on wrap size to be architecture and page size agnostic.
Link: http://lkml.kernel.org/r/201905282012.0A8767E24@keescook Fixes: ca90800a91ba ("test_overflow: Add memory allocation overflow tests") Signed-off-by: Kees Cook keescook@chromium.org Reported-by: Randy Dunlap rdunlap@infradead.org Suggested-by: Rasmus Villemoes linux@rasmusvillemoes.dk Cc: Joe Perches joe@perches.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/test_overflow.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/lib/test_overflow.c b/lib/test_overflow.c index fc680562d8b6..7a4b6f6c5473 100644 --- a/lib/test_overflow.c +++ b/lib/test_overflow.c @@ -486,16 +486,17 @@ static int __init test_overflow_shift(void) * Deal with the various forms of allocator arguments. See comments above * the DEFINE_TEST_ALLOC() instances for mapping of the "bits". */ -#define alloc010(alloc, arg, sz) alloc(sz, GFP_KERNEL) -#define alloc011(alloc, arg, sz) alloc(sz, GFP_KERNEL, NUMA_NO_NODE) +#define alloc_GFP (GFP_KERNEL | __GFP_NOWARN) +#define alloc010(alloc, arg, sz) alloc(sz, alloc_GFP) +#define alloc011(alloc, arg, sz) alloc(sz, alloc_GFP, NUMA_NO_NODE) #define alloc000(alloc, arg, sz) alloc(sz) #define alloc001(alloc, arg, sz) alloc(sz, NUMA_NO_NODE) -#define alloc110(alloc, arg, sz) alloc(arg, sz, GFP_KERNEL) +#define alloc110(alloc, arg, sz) alloc(arg, sz, alloc_GFP) #define free0(free, arg, ptr) free(ptr) #define free1(free, arg, ptr) free(arg, ptr)
-/* Wrap around to 8K */ -#define TEST_SIZE (9 << PAGE_SHIFT) +/* Wrap around to 16K */ +#define TEST_SIZE (5 * 4096)
#define DEFINE_TEST_ALLOC(func, free_func, want_arg, want_gfp, want_node)\ static int __init test_ ## func (void *arg) \
From: Peter Rosin peda@axentia.se
[ Upstream commit 33d6e0ff68af74be0c846c8e042e84a9a1a0561e ]
If a memsetXX implementation is completely broken and fails in the first iteration, when i, j, and k are all zero, the failure is masked as zero is returned. Failing in the first iteration is perhaps the most likely failure, so this makes the tests pretty much useless. Avoid the situation by always setting a random unused bit in the result on failure.
Link: http://lkml.kernel.org/r/20190506124634.6807-3-peda@axentia.se Fixes: 03270c13c5ff ("lib/string.c: add testcases for memset16/32/64") Signed-off-by: Peter Rosin peda@axentia.se Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/test_string.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/lib/test_string.c b/lib/test_string.c index bf8def01ed20..b5117ae59693 100644 --- a/lib/test_string.c +++ b/lib/test_string.c @@ -36,7 +36,7 @@ static __init int memset16_selftest(void) fail: kfree(p); if (i < 256) - return (i << 24) | (j << 16) | k; + return (i << 24) | (j << 16) | k | 0x8000; return 0; }
@@ -72,7 +72,7 @@ static __init int memset32_selftest(void) fail: kfree(p); if (i < 256) - return (i << 24) | (j << 16) | k; + return (i << 24) | (j << 16) | k | 0x8000; return 0; }
@@ -108,7 +108,7 @@ static __init int memset64_selftest(void) fail: kfree(p); if (i < 256) - return (i << 24) | (j << 16) | k; + return (i << 24) | (j << 16) | k | 0x8000; return 0; }
From: Anshuman Khandual anshuman.khandual@arm.com
[ Upstream commit 6b95ab4218bfa59bc315105127ffe03aef3b5742 ]
Virtual address alignment is essential in ensuring correct clearing for all intermediate level pgtable entries and freeing associated pgtable pages. An unaligned address can end up randomly freeing pgtable page that potentially still contains valid mappings. Hence also check it's alignment along with existing phys_addr check.
Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Toshi Kani toshi.kani@hpe.com Cc: Will Deacon will.deacon@arm.com Cc: Chintan Pandya cpandya@codeaurora.org Cc: Thomas Gleixner tglx@linutronix.de Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/ioremap.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/lib/ioremap.c b/lib/ioremap.c index 063213685563..a95161d9c883 100644 --- a/lib/ioremap.c +++ b/lib/ioremap.c @@ -86,6 +86,9 @@ static int ioremap_try_huge_pmd(pmd_t *pmd, unsigned long addr, if ((end - addr) != PMD_SIZE) return 0;
+ if (!IS_ALIGNED(addr, PMD_SIZE)) + return 0; + if (!IS_ALIGNED(phys_addr, PMD_SIZE)) return 0;
@@ -126,6 +129,9 @@ static int ioremap_try_huge_pud(pud_t *pud, unsigned long addr, if ((end - addr) != PUD_SIZE) return 0;
+ if (!IS_ALIGNED(addr, PUD_SIZE)) + return 0; + if (!IS_ALIGNED(phys_addr, PUD_SIZE)) return 0;
@@ -166,6 +172,9 @@ static int ioremap_try_huge_p4d(p4d_t *p4d, unsigned long addr, if ((end - addr) != P4D_SIZE) return 0;
+ if (!IS_ALIGNED(addr, P4D_SIZE)) + return 0; + if (!IS_ALIGNED(phys_addr, P4D_SIZE)) return 0;
From: Zhouyang Jia jiazhouyang09@gmail.com
[ Upstream commit 02551c23bcd85f0c68a8259c7b953d49d44f86af ]
When fget fails, the lack of error-handling code may cause unexpected results.
This patch adds error-handling code after calling fget.
Link: http://lkml.kernel.org/r/2514ec03df9c33b86e56748513267a80dd8004d9.1558117389... Signed-off-by: Zhouyang Jia jiazhouyang09@gmail.com Signed-off-by: Jan Harkes jaharkes@cs.cmu.edu Cc: Arnd Bergmann arnd@arndb.de Cc: Colin Ian King colin.king@canonical.com Cc: Dan Carpenter dan.carpenter@oracle.com Cc: David Howells dhowells@redhat.com Cc: Fabian Frederick fabf@skynet.be Cc: Mikko Rapeli mikko.rapeli@iki.fi Cc: Sam Protsenko semen.protsenko@linaro.org Cc: Yann Droneaud ydroneaud@opteya.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/coda/psdev.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/coda/psdev.c b/fs/coda/psdev.c index 0ceef32e6fae..241f7e04ad04 100644 --- a/fs/coda/psdev.c +++ b/fs/coda/psdev.c @@ -182,8 +182,11 @@ static ssize_t coda_psdev_write(struct file *file, const char __user *buf, if (req->uc_opcode == CODA_OPEN_BY_FD) { struct coda_open_by_fd_out *outp = (struct coda_open_by_fd_out *)req->uc_data; - if (!outp->oh.result) + if (!outp->oh.result) { outp->fh = fget(outp->fd); + if (!outp->fh) + return -EBADF; + } }
wake_up(&req->uc_sleep);
From: Sam Protsenko semen.protsenko@linaro.org
[ Upstream commit b2a57e334086602be56b74958d9f29b955cd157f ]
The kernel is self-contained project and can be built with bare-metal toolchain. But bare-metal toolchain doesn't define __linux__. Because of this u_quad_t type is not defined when using bare-metal toolchain and codafs build fails. This patch fixes it by defining u_quad_t type unconditionally.
Link: http://lkml.kernel.org/r/3cbb40b0a57b6f9923a9d67b53473c0b691a3eaa.1558117389... Signed-off-by: Sam Protsenko semen.protsenko@linaro.org Signed-off-by: Jan Harkes jaharkes@cs.cmu.edu Cc: Arnd Bergmann arnd@arndb.de Cc: Colin Ian King colin.king@canonical.com Cc: Dan Carpenter dan.carpenter@oracle.com Cc: David Howells dhowells@redhat.com Cc: Fabian Frederick fabf@skynet.be Cc: Mikko Rapeli mikko.rapeli@iki.fi Cc: Yann Droneaud ydroneaud@opteya.com Cc: Zhouyang Jia jiazhouyang09@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/coda.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/include/linux/coda.h b/include/linux/coda.h index d30209b9cef8..0ca0c83fdb1c 100644 --- a/include/linux/coda.h +++ b/include/linux/coda.h @@ -58,8 +58,7 @@ Mellon the rights to redistribute these changes without encumbrance. #ifndef _CODA_HEADER_ #define _CODA_HEADER_
-#if defined(__linux__) typedef unsigned long long u_quad_t; -#endif + #include <uapi/linux/coda.h> #endif
From: Mikko Rapeli mikko.rapeli@iki.fi
[ Upstream commit f90fb3c7e2c13ae829db2274b88b845a75038b8a ]
Only users of upc_req in kernel side fs/coda/psdev.c and fs/coda/upcall.c already include linux/coda_psdev.h.
Suggested by Jan Harkes jaharkes@cs.cmu.edu in https://lore.kernel.org/lkml/20150531111913.GA23377@cs.cmu.edu/
Fixes these include/uapi/linux/coda_psdev.h compilation errors in userspace:
linux/coda_psdev.h:12:19: error: field `uc_chain' has incomplete type struct list_head uc_chain; ^ linux/coda_psdev.h:13:2: error: unknown type name `caddr_t' caddr_t uc_data; ^ linux/coda_psdev.h:14:2: error: unknown type name `u_short' u_short uc_flags; ^ linux/coda_psdev.h:15:2: error: unknown type name `u_short' u_short uc_inSize; /* Size is at most 5000 bytes */ ^ linux/coda_psdev.h:16:2: error: unknown type name `u_short' u_short uc_outSize; ^ linux/coda_psdev.h:17:2: error: unknown type name `u_short' u_short uc_opcode; /* copied from data to save lookup */ ^ linux/coda_psdev.h:19:2: error: unknown type name `wait_queue_head_t' wait_queue_head_t uc_sleep; /* process' wait queue */ ^
Link: http://lkml.kernel.org/r/9f99f5ce6a0563d5266e6cf7aa9585aac2cae971.1558117389... Signed-off-by: Mikko Rapeli mikko.rapeli@iki.fi Signed-off-by: Jan Harkes jaharkes@cs.cmu.edu Cc: Arnd Bergmann arnd@arndb.de Cc: Colin Ian King colin.king@canonical.com Cc: Dan Carpenter dan.carpenter@oracle.com Cc: David Howells dhowells@redhat.com Cc: Fabian Frederick fabf@skynet.be Cc: Sam Protsenko semen.protsenko@linaro.org Cc: Yann Droneaud ydroneaud@opteya.com Cc: Zhouyang Jia jiazhouyang09@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/coda_psdev.h | 11 +++++++++++ include/uapi/linux/coda_psdev.h | 13 ------------- 2 files changed, 11 insertions(+), 13 deletions(-)
diff --git a/include/linux/coda_psdev.h b/include/linux/coda_psdev.h index 15170954aa2b..57d2b2faf6a3 100644 --- a/include/linux/coda_psdev.h +++ b/include/linux/coda_psdev.h @@ -19,6 +19,17 @@ struct venus_comm { struct mutex vc_mutex; };
+/* messages between coda filesystem in kernel and Venus */ +struct upc_req { + struct list_head uc_chain; + caddr_t uc_data; + u_short uc_flags; + u_short uc_inSize; /* Size is at most 5000 bytes */ + u_short uc_outSize; + u_short uc_opcode; /* copied from data to save lookup */ + int uc_unique; + wait_queue_head_t uc_sleep; /* process' wait queue */ +};
static inline struct venus_comm *coda_vcp(struct super_block *sb) { diff --git a/include/uapi/linux/coda_psdev.h b/include/uapi/linux/coda_psdev.h index aa6623efd2dd..d50d51a57fe4 100644 --- a/include/uapi/linux/coda_psdev.h +++ b/include/uapi/linux/coda_psdev.h @@ -7,19 +7,6 @@ #define CODA_PSDEV_MAJOR 67 #define MAX_CODADEVS 5 /* how many do we allow */
- -/* messages between coda filesystem in kernel and Venus */ -struct upc_req { - struct list_head uc_chain; - caddr_t uc_data; - u_short uc_flags; - u_short uc_inSize; /* Size is at most 5000 bytes */ - u_short uc_outSize; - u_short uc_opcode; /* copied from data to save lookup */ - int uc_unique; - wait_queue_head_t uc_sleep; /* process' wait queue */ -}; - #define CODA_REQ_ASYNC 0x1 #define CODA_REQ_READ 0x2 #define CODA_REQ_WRITE 0x4
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit 156e0b1a8112b76e351684ac948c59757037ac36 ]
The dev_info.name[] array has space for RIO_MAX_DEVNAME_SZ + 1 characters. But the problem here is that we don't ensure that the user put a NUL terminator on the end of the string. It could lead to an out of bounds read.
Link: http://lkml.kernel.org/r/20190529110601.GB19119@mwanda Fixes: e8de370188d0 ("rapidio: add mport char device driver") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Acked-by: Alexandre Bounine alex.bou9@gmail.com Cc: Ira Weiny ira.weiny@intel.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/rapidio/devices/rio_mport_cdev.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c index ce7a90e68042..8155f59ece38 100644 --- a/drivers/rapidio/devices/rio_mport_cdev.c +++ b/drivers/rapidio/devices/rio_mport_cdev.c @@ -1686,6 +1686,7 @@ static int rio_mport_add_riodev(struct mport_cdev_priv *priv,
if (copy_from_user(&dev_info, arg, sizeof(dev_info))) return -EFAULT; + dev_info.name[sizeof(dev_info.name) - 1] = '\0';
rmcd_debug(RDEV, "name:%s ct:0x%x did:0x%x hc:0x%x", dev_info.name, dev_info.comptag, dev_info.destid, dev_info.hopcount); @@ -1817,6 +1818,7 @@ static int rio_mport_del_riodev(struct mport_cdev_priv *priv, void __user *arg)
if (copy_from_user(&dev_info, arg, sizeof(dev_info))) return -EFAULT; + dev_info.name[sizeof(dev_info.name) - 1] = '\0';
mport = priv->md->mport;
From: Miroslav Lichvar mlichvar@redhat.com
[ Upstream commit 5515e9a6273b8c02034466bcbd717ac9f53dab99 ]
The PPS assert/clear offset corrections are set by the PPS_SETPARAMS ioctl in the pps_ktime structs, which also contain flags. The flags are not initialized by applications (using the timepps.h header) and they are not used by the kernel for anything except returning them back in the PPS_GETPARAMS ioctl.
Set the flags to zero to make it clear they are unused and avoid leaking uninitialized data of the PPS_SETPARAMS caller to other applications that have a read access to the PPS device.
Link: http://lkml.kernel.org/r/20190702092251.24303-1-mlichvar@redhat.com Signed-off-by: Miroslav Lichvar mlichvar@redhat.com Reviewed-by: Thomas Gleixner tglx@linutronix.de Acked-by: Rodolfo Giometti giometti@enneenne.com Cc: Greg KH greg@kroah.com Cc: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pps/pps.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/pps/pps.c b/drivers/pps/pps.c index 3a546ec10d90..22a65ad4e46e 100644 --- a/drivers/pps/pps.c +++ b/drivers/pps/pps.c @@ -152,6 +152,14 @@ static long pps_cdev_ioctl(struct file *file, pps->params.mode |= PPS_CANWAIT; pps->params.api_version = PPS_API_VERS;
+ /* + * Clear unused fields of pps_kparams to avoid leaking + * uninitialized data of the PPS_SETPARAMS caller via + * PPS_GETPARAMS + */ + pps->params.assert_off_tu.flags = 0; + pps->params.clear_off_tu.flags = 0; + spin_unlock_irq(&pps->lock);
break;
From: Kees Cook keescook@chromium.org
[ Upstream commit a318f12ed8843cfac53198390c74a565c632f417 ]
Andreas Christoforou reported:
UBSAN: Undefined behaviour in ipc/mqueue.c:414:49 signed integer overflow: 9 * 2305843009213693951 cannot be represented in type 'long int' ... Call Trace: mqueue_evict_inode+0x8e7/0xa10 ipc/mqueue.c:414 evict+0x472/0x8c0 fs/inode.c:558 iput_final fs/inode.c:1547 [inline] iput+0x51d/0x8c0 fs/inode.c:1573 mqueue_get_inode+0x8eb/0x1070 ipc/mqueue.c:320 mqueue_create_attr+0x198/0x440 ipc/mqueue.c:459 vfs_mkobj+0x39e/0x580 fs/namei.c:2892 prepare_open ipc/mqueue.c:731 [inline] do_mq_open+0x6da/0x8e0 ipc/mqueue.c:771
Which could be triggered by:
struct mq_attr attr = { .mq_flags = 0, .mq_maxmsg = 9, .mq_msgsize = 0x1fffffffffffffff, .mq_curmsgs = 0, };
if (mq_open("/testing", 0x40, 3, &attr) == (mqd_t) -1) perror("mq_open");
mqueue_get_inode() was correctly rejecting the giant mq_msgsize, and preparing to return -EINVAL. During the cleanup, it calls mqueue_evict_inode() which performed resource usage tracking math for updating "user", before checking if there was a valid "user" at all (which would indicate that the calculations would be sane). Instead, delay this check to after seeing a valid "user".
The overflow was real, but the results went unused, so while the flaw is harmless, it's noisy for kernel fuzzers, so just fix it by moving the calculation under the non-NULL "user" where it actually gets used.
Link: http://lkml.kernel.org/r/201906072207.ECB65450@keescook Signed-off-by: Kees Cook keescook@chromium.org Reported-by: Andreas Christoforou andreaschristofo@gmail.com Acked-by: "Eric W. Biederman" ebiederm@xmission.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: Arnd Bergmann arnd@arndb.de Cc: Davidlohr Bueso dave@stgolabs.net Cc: Manfred Spraul manfred@colorfullife.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- ipc/mqueue.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/ipc/mqueue.c b/ipc/mqueue.c index 216cad1ff0d0..65c351564ad0 100644 --- a/ipc/mqueue.c +++ b/ipc/mqueue.c @@ -438,7 +438,6 @@ static void mqueue_evict_inode(struct inode *inode) { struct mqueue_inode_info *info; struct user_struct *user; - unsigned long mq_bytes, mq_treesize; struct ipc_namespace *ipc_ns; struct msg_msg *msg, *nmsg; LIST_HEAD(tmp_msg); @@ -461,16 +460,18 @@ static void mqueue_evict_inode(struct inode *inode) free_msg(msg); }
- /* Total amount of bytes accounted for the mqueue */ - mq_treesize = info->attr.mq_maxmsg * sizeof(struct msg_msg) + - min_t(unsigned int, info->attr.mq_maxmsg, MQ_PRIO_MAX) * - sizeof(struct posix_msg_tree_node); - - mq_bytes = mq_treesize + (info->attr.mq_maxmsg * - info->attr.mq_msgsize); - user = info->user; if (user) { + unsigned long mq_bytes, mq_treesize; + + /* Total amount of bytes accounted for the mqueue */ + mq_treesize = info->attr.mq_maxmsg * sizeof(struct msg_msg) + + min_t(unsigned int, info->attr.mq_maxmsg, MQ_PRIO_MAX) * + sizeof(struct posix_msg_tree_node); + + mq_bytes = mq_treesize + (info->attr.mq_maxmsg * + info->attr.mq_msgsize); + spin_lock(&mq_lock); user->mq_bytes -= mq_bytes; /*
From: "Dmitry V. Levin" ldv@altlinux.org
[ Upstream commit 33644b95eb342201511fc951d8fcd10362bd435b ]
PTRACE_GET_SYSCALL_INFO is a generic ptrace API that lets ptracer obtain details of the syscall the tracee is blocked in.
There are two reasons for a special syscall-related ptrace request.
Firstly, with the current ptrace API there are cases when ptracer cannot retrieve necessary information about syscalls. Some examples include:
* The notorious int-0x80-from-64-bit-task issue. See [1] for details. In short, if a 64-bit task performs a syscall through int 0x80, its tracer has no reliable means to find out that the syscall was, in fact, a compat syscall, and misidentifies it.
* Syscall-enter-stop and syscall-exit-stop look the same for the tracer. Common practice is to keep track of the sequence of ptrace-stops in order not to mix the two syscall-stops up. But it is not as simple as it looks; for example, strace had a (just recently fixed) long-standing bug where attaching strace to a tracee that is performing the execve system call led to the tracer identifying the following syscall-exit-stop as syscall-enter-stop, which messed up all the state tracking.
* Since the introduction of commit 84d77d3f06e7 ("ptrace: Don't allow accessing an undumpable mm"), both PTRACE_PEEKDATA and process_vm_readv become unavailable when the process dumpable flag is cleared. On such architectures as ia64 this results in all syscall arguments being unavailable for the tracer.
Secondly, ptracers also have to support a lot of arch-specific code for obtaining information about the tracee. For some architectures, this requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall argument and return value.
PTRACE_GET_SYSCALL_INFO returns the following structure:
struct ptrace_syscall_info { __u8 op; /* PTRACE_SYSCALL_INFO_* */ __u32 arch __attribute__((__aligned__(sizeof(__u32)))); __u64 instruction_pointer; __u64 stack_pointer; union { struct { __u64 nr; __u64 args[6]; } entry; struct { __s64 rval; __u8 is_error; } exit; struct { __u64 nr; __u64 args[6]; __u32 ret_data; } seccomp; }; };
The structure was chosen according to [2], except for the following changes:
* seccomp substructure was added as a superset of entry substructure
* the type of nr field was changed from int to __u64 because syscall numbers are, as a practical matter, 64 bits
* stack_pointer field was added along with instruction_pointer field since it is readily available and can save the tracer from extra PTRACE_GETREGS/PTRACE_GETREGSET calls
* arch is always initialized to aid with tracing system calls such as execve()
* instruction_pointer and stack_pointer are always initialized so they could be easily obtained for non-syscall stops
* a boolean is_error field was added along with rval field, this way the tracer can more reliably distinguish a return value from an error value
strace has been ported to PTRACE_GET_SYSCALL_INFO. Starting with release 4.26, strace uses PTRACE_GET_SYSCALL_INFO API as the preferred mechanism of obtaining syscall information.
[1] https://lore.kernel.org/lkml/CA+55aFzcSVmdDj9Lh_gdbz1OzHyEm6ZrGPBDAJnywm2LF_... [2] https://lore.kernel.org/lkml/CAObL_7GM0n80N7J_DFw_eQyfLyzq+sf4y2AvsCCV88Tb3A...
This patch (of 7):
All syscall_get_*() and syscall_set_*() functions must be defined as static inline as on all other architectures, otherwise asm/syscall.h cannot be included in more than one compilation unit.
This bug has to be fixed in order to extend the generic ptrace API with PTRACE_GET_SYSCALL_INFO request.
Link: http://lkml.kernel.org/r/20190510152749.GA28558@altlinux.org Fixes: 1932fbe36e02 ("nds32: System calls handling") Signed-off-by: Dmitry V. Levin ldv@altlinux.org Reported-by: kbuild test robot lkp@intel.com Acked-by: Greentime Hu greentime@andestech.com Cc: Vincent Chen deanbo422@gmail.com Cc: Elvira Khabirova lineprinter@altlinux.org Cc: Eugene Syromyatnikov esyr@redhat.com Cc: Oleg Nesterov oleg@redhat.com Cc: Andy Lutomirski luto@kernel.org Cc: Benjamin Herrenschmidt benh@kernel.crashing.org Cc: Helge Deller deller@gmx.de [parisc] Cc: James E.J. Bottomley jejb@parisc-linux.org Cc: James Hogan jhogan@kernel.org Cc: Kees Cook keescook@chromium.org Cc: Michael Ellerman mpe@ellerman.id.au Cc: Paul Burton paul.burton@mips.com Cc: Paul Mackerras paulus@samba.org Cc: Ralf Baechle ralf@linux-mips.org Cc: Richard Kuo rkuo@codeaurora.org Cc: Shuah Khan shuah@kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/nds32/include/asm/syscall.h | 27 +++++++++++++++++---------- 1 file changed, 17 insertions(+), 10 deletions(-)
diff --git a/arch/nds32/include/asm/syscall.h b/arch/nds32/include/asm/syscall.h index 899b2fb4b52f..7b5180d78e20 100644 --- a/arch/nds32/include/asm/syscall.h +++ b/arch/nds32/include/asm/syscall.h @@ -26,7 +26,8 @@ struct pt_regs; * * It's only valid to call this when @task is known to be blocked. */ -int syscall_get_nr(struct task_struct *task, struct pt_regs *regs) +static inline int +syscall_get_nr(struct task_struct *task, struct pt_regs *regs) { return regs->syscallno; } @@ -47,7 +48,8 @@ int syscall_get_nr(struct task_struct *task, struct pt_regs *regs) * system call instruction. This may not be the same as what the * register state looked like at system call entry tracing. */ -void syscall_rollback(struct task_struct *task, struct pt_regs *regs) +static inline void +syscall_rollback(struct task_struct *task, struct pt_regs *regs) { regs->uregs[0] = regs->orig_r0; } @@ -62,7 +64,8 @@ void syscall_rollback(struct task_struct *task, struct pt_regs *regs) * It's only valid to call this when @task is stopped for tracing on exit * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT. */ -long syscall_get_error(struct task_struct *task, struct pt_regs *regs) +static inline long +syscall_get_error(struct task_struct *task, struct pt_regs *regs) { unsigned long error = regs->uregs[0]; return IS_ERR_VALUE(error) ? error : 0; @@ -79,7 +82,8 @@ long syscall_get_error(struct task_struct *task, struct pt_regs *regs) * It's only valid to call this when @task is stopped for tracing on exit * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT. */ -long syscall_get_return_value(struct task_struct *task, struct pt_regs *regs) +static inline long +syscall_get_return_value(struct task_struct *task, struct pt_regs *regs) { return regs->uregs[0]; } @@ -99,8 +103,9 @@ long syscall_get_return_value(struct task_struct *task, struct pt_regs *regs) * It's only valid to call this when @task is stopped for tracing on exit * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT. */ -void syscall_set_return_value(struct task_struct *task, struct pt_regs *regs, - int error, long val) +static inline void +syscall_set_return_value(struct task_struct *task, struct pt_regs *regs, + int error, long val) { regs->uregs[0] = (long)error ? error : val; } @@ -118,8 +123,9 @@ void syscall_set_return_value(struct task_struct *task, struct pt_regs *regs, * entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT. */ #define SYSCALL_MAX_ARGS 6 -void syscall_get_arguments(struct task_struct *task, struct pt_regs *regs, - unsigned long *args) +static inline void +syscall_get_arguments(struct task_struct *task, struct pt_regs *regs, + unsigned long *args) { args[0] = regs->orig_r0; args++; @@ -138,8 +144,9 @@ void syscall_get_arguments(struct task_struct *task, struct pt_regs *regs, * It's only valid to call this when @task is stopped for tracing on * entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT. */ -void syscall_set_arguments(struct task_struct *task, struct pt_regs *regs, - const unsigned long *args) +static inline void +syscall_set_arguments(struct task_struct *task, struct pt_regs *regs, + const unsigned long *args) { regs->orig_r0 = args[0]; args++;
From: Pavel Tatashin pasha.tatashin@soleen.com
[ Upstream commit 31e4ca92a7dd4cdebd7fe1456b3b0b6ace9a816f ]
Patch series ""Hotremove" persistent memory", v6.
Recently, adding a persistent memory to be used like a regular RAM was added to Linux. This work extends this functionality to also allow hot removing persistent memory.
We (Microsoft) have an important use case for this functionality.
The requirement is for physical machines with small amount of RAM (~8G) to be able to reboot in a very short period of time (<1s). Yet, there is a userland state that is expensive to recreate (~2G).
The solution is to boot machines with 2G preserved for persistent memory.
Copy the state, and hotadd the persistent memory so machine still has all 8G available for runtime. Before reboot, offline and hotremove device-dax 2G, copy the memory that is needed to be preserved to pmem0 device, and reboot.
The series of operations look like this:
1. After boot restore /dev/pmem0 to ramdisk to be consumed by apps. and free ramdisk. 2. Convert raw pmem0 to devdax ndctl create-namespace --mode devdax --map mem -e namespace0.0 -f 3. Hotadd to System RAM echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id echo online_movable > /sys/devices/system/memoryXXX/state 4. Before reboot hotremove device-dax memory from System RAM echo offline > /sys/devices/system/memoryXXX/state echo dax0.0 > /sys/bus/dax/drivers/kmem/unbind 5. Create raw pmem0 device ndctl create-namespace --mode raw -e namespace0.0 -f 6. Copy the state that was stored by apps to ramdisk to pmem device 7. Do kexec reboot or reboot through firmware if firmware does not zero memory in pmem0 region (These machines have only regular volatile memory). So to have pmem0 device either memmap kernel parameter is used, or devices nodes in dtb are specified.
This patch (of 3):
When add_memory() fails, the resource and the memory should be freed.
Link: http://lkml.kernel.org/r/20190517215438.6487-2-pasha.tatashin@soleen.com Fixes: c221c0b0308f ("device-dax: "Hotplug" persistent memory for use like normal RAM") Signed-off-by: Pavel Tatashin pasha.tatashin@soleen.com Reviewed-by: Dave Hansen dave.hansen@intel.com Cc: Bjorn Helgaas bhelgaas@google.com Cc: Borislav Petkov bp@suse.de Cc: Dan Williams dan.j.williams@intel.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Dave Jiang dave.jiang@intel.com Cc: David Hildenbrand david@redhat.com Cc: Fengguang Wu fengguang.wu@intel.com Cc: Huang Ying ying.huang@intel.com Cc: James Morris jmorris@namei.org Cc: Jérôme Glisse jglisse@redhat.com Cc: Keith Busch keith.busch@intel.com Cc: Michal Hocko mhocko@suse.com Cc: Ross Zwisler zwisler@kernel.org Cc: Sasha Levin sashal@kernel.org Cc: Takashi Iwai tiwai@suse.de Cc: Tom Lendacky thomas.lendacky@amd.com Cc: Vishal Verma vishal.l.verma@intel.com Cc: Yaowei Bai baiyaowei@cmss.chinamobile.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dax/kmem.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c index a02318c6d28a..4c0131857133 100644 --- a/drivers/dax/kmem.c +++ b/drivers/dax/kmem.c @@ -66,8 +66,11 @@ int dev_dax_kmem_probe(struct device *dev) new_res->name = dev_name(dev);
rc = add_memory(numa_node, new_res->start, resource_size(new_res)); - if (rc) + if (rc) { + release_resource(new_res); + kfree(new_res); return rc; + }
return 0; }
From: Pavel Tatashin pasha.tatashin@soleen.com
[ Upstream commit eca499ab3749a4537dee77ffead47a1a2c0dee19 ]
Presently the remove_memory() interface is inherently broken. It tries to remove memory but panics if some memory is not offline. The problem is that it is impossible to ensure that all memory blocks are offline as this function also takes lock_device_hotplug that is required to change memory state via sysfs.
So, between calling this function and offlining all memory blocks there is always a window when lock_device_hotplug is released, and therefore, there is always a chance for a panic during this window.
Make this interface to return an error if memory removal fails. This way it is safe to call this function without panicking machine, and also makes it symmetric to add_memory() which already returns an error.
Link: http://lkml.kernel.org/r/20190517215438.6487-3-pasha.tatashin@soleen.com Signed-off-by: Pavel Tatashin pasha.tatashin@soleen.com Reviewed-by: David Hildenbrand david@redhat.com Acked-by: Michal Hocko mhocko@suse.com Cc: Bjorn Helgaas bhelgaas@google.com Cc: Borislav Petkov bp@suse.de Cc: Dan Williams dan.j.williams@intel.com Cc: Dave Hansen dave.hansen@intel.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Dave Jiang dave.jiang@intel.com Cc: Fengguang Wu fengguang.wu@intel.com Cc: Huang Ying ying.huang@intel.com Cc: James Morris jmorris@namei.org Cc: Jérôme Glisse jglisse@redhat.com Cc: Keith Busch keith.busch@intel.com Cc: Ross Zwisler zwisler@kernel.org Cc: Sasha Levin sashal@kernel.org Cc: Takashi Iwai tiwai@suse.de Cc: Tom Lendacky thomas.lendacky@amd.com Cc: Vishal Verma vishal.l.verma@intel.com Cc: Yaowei Bai baiyaowei@cmss.chinamobile.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/memory_hotplug.h | 8 +++-- mm/memory_hotplug.c | 64 +++++++++++++++++++++++----------- 2 files changed, 49 insertions(+), 23 deletions(-)
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index ae892eef8b82..988fde33cd7f 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -324,7 +324,7 @@ static inline void pgdat_resize_init(struct pglist_data *pgdat) {} extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages); extern void try_offline_node(int nid); extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages); -extern void remove_memory(int nid, u64 start, u64 size); +extern int remove_memory(int nid, u64 start, u64 size); extern void __remove_memory(int nid, u64 start, u64 size);
#else @@ -341,7 +341,11 @@ static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages) return -EINVAL; }
-static inline void remove_memory(int nid, u64 start, u64 size) {} +static inline int remove_memory(int nid, u64 start, u64 size) +{ + return -EBUSY; +} + static inline void __remove_memory(int nid, u64 start, u64 size) {} #endif /* CONFIG_MEMORY_HOTREMOVE */
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index e096c987d261..77d1f69cdead 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1736,9 +1736,10 @@ static int check_memblock_offlined_cb(struct memory_block *mem, void *arg) endpa = PFN_PHYS(section_nr_to_pfn(mem->end_section_nr + 1))-1; pr_warn("removing memory fails, because memory [%pa-%pa] is onlined\n", &beginpa, &endpa); - }
- return ret; + return -EBUSY; + } + return 0; }
static int check_cpu_on_node(pg_data_t *pgdat) @@ -1821,19 +1822,9 @@ static void __release_memory_resource(resource_size_t start, } }
-/** - * remove_memory - * @nid: the node ID - * @start: physical address of the region to remove - * @size: size of the region to remove - * - * NOTE: The caller must call lock_device_hotplug() to serialize hotplug - * and online/offline operations before this call, as required by - * try_offline_node(). - */ -void __ref __remove_memory(int nid, u64 start, u64 size) +static int __ref try_remove_memory(int nid, u64 start, u64 size) { - int ret; + int rc = 0;
BUG_ON(check_hotplug_memory_range(start, size));
@@ -1841,13 +1832,13 @@ void __ref __remove_memory(int nid, u64 start, u64 size)
/* * All memory blocks must be offlined before removing memory. Check - * whether all memory blocks in question are offline and trigger a BUG() + * whether all memory blocks in question are offline and return error * if this is not the case. */ - ret = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1), NULL, - check_memblock_offlined_cb); - if (ret) - BUG(); + rc = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1), NULL, + check_memblock_offlined_cb); + if (rc) + goto done;
/* remove memmap entry */ firmware_map_remove(start, start + size, "System RAM"); @@ -1859,14 +1850,45 @@ void __ref __remove_memory(int nid, u64 start, u64 size)
try_offline_node(nid);
+done: mem_hotplug_done(); + return rc; }
-void remove_memory(int nid, u64 start, u64 size) +/** + * remove_memory + * @nid: the node ID + * @start: physical address of the region to remove + * @size: size of the region to remove + * + * NOTE: The caller must call lock_device_hotplug() to serialize hotplug + * and online/offline operations before this call, as required by + * try_offline_node(). + */ +void __remove_memory(int nid, u64 start, u64 size) +{ + + /* + * trigger BUG() is some memory is not offlined prior to calling this + * function + */ + if (try_remove_memory(nid, start, size)) + BUG(); +} + +/* + * Remove memory if every memory block is offline, otherwise return -EBUSY is + * some memory is not offline + */ +int remove_memory(int nid, u64 start, u64 size) { + int rc; + lock_device_hotplug(); - __remove_memory(nid, start, size); + rc = try_remove_memory(nid, start, size); unlock_device_hotplug(); + + return rc; } EXPORT_SYMBOL_GPL(remove_memory); #endif /* CONFIG_MEMORY_HOTREMOVE */
From: Denis Efremov efremov@ispras.ru
[ Upstream commit f3554aeb991214cbfafd17d55e2bfddb50282e32 ]
This fixes a divide by zero error in the setup_format_params function of the floppy driver.
Two consecutive ioctls can trigger the bug: The first one should set the drive geometry with such .sect and .rate values for the F_SECT_PER_TRACK to become zero. Next, the floppy format operation should be called.
A floppy disk is not required to be inserted. An unprivileged user could trigger the bug if the device is accessible.
The patch checks F_SECT_PER_TRACK for a non-zero value in the set_geometry function. The proper check should involve a reasonable upper limit for the .sect and .rate fields, but it could change the UAPI.
The patch also checks F_SECT_PER_TRACK in the setup_format_params, and cancels the formatting operation in case of zero.
The bug was found by syzkaller.
Signed-off-by: Denis Efremov efremov@ispras.ru Tested-by: Willy Tarreau w@1wt.eu Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/floppy.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c index 9fb9b312ab6b..51246bc9709a 100644 --- a/drivers/block/floppy.c +++ b/drivers/block/floppy.c @@ -2120,6 +2120,9 @@ static void setup_format_params(int track) raw_cmd->kernel_data = floppy_track_buffer; raw_cmd->length = 4 * F_SECT_PER_TRACK;
+ if (!F_SECT_PER_TRACK) + return; + /* allow for about 30ms for data transport per track */ head_shift = (F_SECT_PER_TRACK + 5) / 6;
@@ -3232,6 +3235,8 @@ static int set_geometry(unsigned int cmd, struct floppy_struct *g, /* sanity checking for parameters. */ if (g->sect <= 0 || g->head <= 0 || + /* check for zero in F_SECT_PER_TRACK */ + (unsigned char)((g->sect << 2) >> FD_SIZECODE(g)) == 0 || g->track <= 0 || g->track > UDP->tracks >> STRETCH(g) || /* check if reserved bits are set */ (g->stretch & ~(FD_STRETCH | FD_SWAPSIDES | FD_SECTBASEMASK)) != 0)
From: Denis Efremov efremov@ispras.ru
[ Upstream commit da99466ac243f15fbba65bd261bfc75ffa1532b6 ]
This fixes a global out-of-bounds read access in the copy_buffer function of the floppy driver.
The FDDEFPRM ioctl allows one to set the geometry of a disk. The sect and head fields (unsigned int) of the floppy_drive structure are used to compute the max_sector (int) in the make_raw_rw_request function. It is possible to overflow the max_sector. Next, max_sector is passed to the copy_buffer function and used in one of the memcpy calls.
An unprivileged user could trigger the bug if the device is accessible, but requires a floppy disk to be inserted.
The patch adds the check for the .sect * .head multiplication for not overflowing in the set_geometry function.
The bug was found by syzkaller.
Signed-off-by: Denis Efremov efremov@ispras.ru Tested-by: Willy Tarreau w@1wt.eu Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/floppy.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c index 51246bc9709a..6787fe0c06f8 100644 --- a/drivers/block/floppy.c +++ b/drivers/block/floppy.c @@ -3233,8 +3233,10 @@ static int set_geometry(unsigned int cmd, struct floppy_struct *g, int cnt;
/* sanity checking for parameters. */ - if (g->sect <= 0 || - g->head <= 0 || + if ((int)g->sect <= 0 || + (int)g->head <= 0 || + /* check for overflow in max_sector */ + (int)(g->sect * g->head) <= 0 || /* check for zero in F_SECT_PER_TRACK */ (unsigned char)((g->sect << 2) >> FD_SIZECODE(g)) == 0 || g->track <= 0 || g->track > UDP->tracks >> STRETCH(g) ||
From: Petr Machata petrm@mellanox.com
[ Upstream commit dedfde2fe1c4ccf27179fcb234e2112d065c39bb ]
Spectrum systems use DSCP rewrite map to update DSCP field in egressing packets to correspond to priority that the packet has. Whether rewriting will take place is determined at the point when the packet ingresses the switch: if the port is in Trust L3 mode, packet priority is determined from the DSCP map at the port, and DSCP rewrite will happen. If the port is in Trust L2 mode, 802.1p is used for packet prioritization, and no DSCP rewrite will happen.
The driver determines the port trust mode based on whether any DSCP prioritization rules are in effect at given port. If there are any, trust level is L3, otherwise it's L2. When the last DSCP rule is removed, the port is switched to trust L2. Under that scenario, if DSCP of a packet should be rewritten, it should be rewritten to 0.
However, when switching to Trust L2, the driver neglects to also update the DSCP rewrite map. The last DSCP rule thus remains in effect, and packets egressing through this port, if they have the right priority, will have their DSCP set according to this rule.
Fix by first configuring the rewrite map, and only then switching to trust L2 and bailing out.
Fixes: b2b1dab6884e ("mlxsw: spectrum: Support ieee_setapp, ieee_delapp") Signed-off-by: Petr Machata petrm@mellanox.com Reported-by: Alex Veber alexve@mellanox.com Tested-by: Alex Veber alexve@mellanox.com Signed-off-by: Ido Schimmel idosch@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlxsw/spectrum_dcb.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c index b25048c6c761..21296fa7f7fb 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_dcb.c @@ -408,14 +408,6 @@ static int mlxsw_sp_port_dcb_app_update(struct mlxsw_sp_port *mlxsw_sp_port) have_dscp = mlxsw_sp_port_dcb_app_prio_dscp_map(mlxsw_sp_port, &prio_map);
- if (!have_dscp) { - err = mlxsw_sp_port_dcb_toggle_trust(mlxsw_sp_port, - MLXSW_REG_QPTS_TRUST_STATE_PCP); - if (err) - netdev_err(mlxsw_sp_port->dev, "Couldn't switch to trust L2\n"); - return err; - } - mlxsw_sp_port_dcb_app_dscp_prio_map(mlxsw_sp_port, default_prio, &dscp_map); err = mlxsw_sp_port_dcb_app_update_qpdpm(mlxsw_sp_port, @@ -432,6 +424,14 @@ static int mlxsw_sp_port_dcb_app_update(struct mlxsw_sp_port *mlxsw_sp_port) return err; }
+ if (!have_dscp) { + err = mlxsw_sp_port_dcb_toggle_trust(mlxsw_sp_port, + MLXSW_REG_QPTS_TRUST_STATE_PCP); + if (err) + netdev_err(mlxsw_sp_port->dev, "Couldn't switch to trust L2\n"); + return err; + } + err = mlxsw_sp_port_dcb_toggle_trust(mlxsw_sp_port, MLXSW_REG_QPTS_TRUST_STATE_DSCP); if (err) {
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit cac9b9a4b08304f11daace03b8b48659355e44c1 ]
When walking userspace stacks, USER_DS needs to be set, otherwise access_ok() will not function as expected.
Reported-by: Vegard Nossum vegard.nossum@oracle.com Reported-by: Eiichi Tsukata devel@etsukata.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Vegard Nossum vegard.nossum@oracle.com Reviewed-by: Joel Fernandes (Google) joel@joelfernandes.org Link: https://lkml.kernel.org/r/20190718085754.GM3402@hirez.programming.kicks-ass.... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/stacktrace.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c index 36139de0a3c4..899b726c9e98 100644 --- a/kernel/stacktrace.c +++ b/kernel/stacktrace.c @@ -226,12 +226,17 @@ unsigned int stack_trace_save_user(unsigned long *store, unsigned int size) .store = store, .size = size, }; + mm_segment_t fs;
/* Trace user stack if not a kernel thread */ if (!current->mm) return 0;
+ fs = get_fs(); + set_fs(USER_DS); arch_stack_walk_user(consume_entry, &c, task_pt_regs(current)); + set_fs(fs); + return c.len; } #endif
From: David Rientjes rientjes@google.com
[ Upstream commit 83bf42510d7f7e1daa692c096e8e9919334d7b57 ]
SEV_VERSION_GREATER_OR_EQUAL() will fail if upgrading from 2.2 to 3.1, for example, because the minor version is not equal to or greater than the major.
Fix this and move to a static inline function for appropriate type checking.
Fixes: edd303ff0e9e ("crypto: ccp - Add DOWNLOAD_FIRMWARE SEV command") Reported-by: Cfir Cohen cfir@google.com Signed-off-by: David Rientjes rientjes@google.com Acked-by: Tom Lendacky thomas.lendacky@amd.com Acked-by: Gary R Hook gary.hook@amd.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/ccp/psp-dev.c | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/crypto/ccp/psp-dev.c b/drivers/crypto/ccp/psp-dev.c index de5a8ca70d3d..6b17d179ef8a 100644 --- a/drivers/crypto/ccp/psp-dev.c +++ b/drivers/crypto/ccp/psp-dev.c @@ -24,10 +24,6 @@ #include "sp-dev.h" #include "psp-dev.h"
-#define SEV_VERSION_GREATER_OR_EQUAL(_maj, _min) \ - ((psp_master->api_major) >= _maj && \ - (psp_master->api_minor) >= _min) - #define DEVICE_NAME "sev" #define SEV_FW_FILE "amd/sev.fw" #define SEV_FW_NAME_SIZE 64 @@ -47,6 +43,15 @@ MODULE_PARM_DESC(psp_probe_timeout, " default timeout value, in seconds, during static bool psp_dead; static int psp_timeout;
+static inline bool sev_version_greater_or_equal(u8 maj, u8 min) +{ + if (psp_master->api_major > maj) + return true; + if (psp_master->api_major == maj && psp_master->api_minor >= min) + return true; + return false; +} + static struct psp_device *psp_alloc_struct(struct sp_device *sp) { struct device *dev = sp->dev; @@ -588,7 +593,7 @@ static int sev_ioctl_do_get_id2(struct sev_issue_cmd *argp) int ret;
/* SEV GET_ID is available from SEV API v0.16 and up */ - if (!SEV_VERSION_GREATER_OR_EQUAL(0, 16)) + if (!sev_version_greater_or_equal(0, 16)) return -ENOTSUPP;
if (copy_from_user(&input, (void __user *)argp->data, sizeof(input))) @@ -651,7 +656,7 @@ static int sev_ioctl_do_get_id(struct sev_issue_cmd *argp) int ret;
/* SEV GET_ID available from SEV API v0.16 and up */ - if (!SEV_VERSION_GREATER_OR_EQUAL(0, 16)) + if (!sev_version_greater_or_equal(0, 16)) return -ENOTSUPP;
/* SEV FW expects the buffer it fills with the ID to be @@ -1053,7 +1058,7 @@ void psp_pci_init(void) psp_master->sev_state = SEV_STATE_UNINIT; }
- if (SEV_VERSION_GREATER_OR_EQUAL(0, 15) && + if (sev_version_greater_or_equal(0, 15) && sev_update_firmware(psp_master->dev) == 0) sev_get_api_version();
From: Juergen Gross jgross@suse.com
[ Upstream commit a1078e821b605813b63bf6bca414a85f804d5c66 ]
Instead of trying to allocate pages with GFP_USER in add_ballooned_pages() check the available free memory via si_mem_available(). GFP_USER is far less limiting memory exhaustion than the test via si_mem_available().
This will avoid dom0 running out of memory due to excessive foreign page mappings especially on ARM and on x86 in PVH mode, as those don't have a pre-ballooned area which can be used for foreign mappings.
As the normal ballooning suffers from the same problem don't balloon down more than si_mem_available() pages in one iteration. At the same time limit the default maximum number of retries.
This is part of XSA-300.
Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/xen/balloon.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index d37dd5bb7a8f..559768dc2567 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -538,8 +538,15 @@ static void balloon_process(struct work_struct *work) state = reserve_additional_memory(); }
- if (credit < 0) - state = decrease_reservation(-credit, GFP_BALLOON); + if (credit < 0) { + long n_pages; + + n_pages = min(-credit, si_mem_available()); + state = decrease_reservation(n_pages, GFP_BALLOON); + if (state == BP_DONE && n_pages != -credit && + n_pages < totalreserve_pages) + state = BP_EAGAIN; + }
state = update_schedule(state);
@@ -578,6 +585,9 @@ static int add_ballooned_pages(int nr_pages) } }
+ if (si_mem_available() < nr_pages) + return -ENOMEM; + st = decrease_reservation(nr_pages, GFP_USER); if (st != BP_DONE) return -ENOMEM; @@ -710,7 +720,7 @@ static int __init balloon_init(void) balloon_stats.schedule_delay = 1; balloon_stats.max_schedule_delay = 32; balloon_stats.retry_count = 1; - balloon_stats.max_retry_count = RETRY_UNLIMITED; + balloon_stats.max_retry_count = 4;
#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG set_online_page_callback(&xen_online_page);
From: Zhenzhong Duan zhenzhong.duan@oracle.com
[ Upstream commit b23e5844dfe78a80ba672793187d3f52e4b528d7 ]
Commit 7457c0da024b ("x86/alternatives: Add int3_emulate_call() selftest") is used to ensure there is a gap setup in int3 exception stack which could be used for inserting call return address.
This gap is missed in XEN PV int3 exception entry path, then below panic triggered:
[ 0.772876] general protection fault: 0000 [#1] SMP NOPTI [ 0.772886] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #11 [ 0.772893] RIP: e030:int3_magic+0x0/0x7 [ 0.772905] RSP: 3507:ffffffff82203e98 EFLAGS: 00000246 [ 0.773334] Call Trace: [ 0.773334] alternative_instructions+0x3d/0x12e [ 0.773334] check_bugs+0x7c9/0x887 [ 0.773334] ? __get_locked_pte+0x178/0x1f0 [ 0.773334] start_kernel+0x4ff/0x535 [ 0.773334] ? set_init_arg+0x55/0x55 [ 0.773334] xen_start_kernel+0x571/0x57a
For 64bit PV guests, Xen's ABI enters the kernel with using SYSRET, with %rcx/%r11 on the stack. To convert back to "normal" looking exceptions, the xen thunks do 'xen_*: pop %rcx; pop %r11; jmp *'.
E.g. Extracting 'xen_pv_trap xenint3' we have: xen_xenint3: pop %rcx; pop %r11; jmp xenint3
As xenint3 and int3 entry code are same except xenint3 doesn't generate a gap, we can fix it by using int3 and drop useless xenint3.
Signed-off-by: Zhenzhong Duan zhenzhong.duan@oracle.com Reviewed-by: Juergen Gross jgross@suse.com Cc: Boris Ostrovsky boris.ostrovsky@oracle.com Cc: Juergen Gross jgross@suse.com Cc: Stefano Stabellini sstabellini@kernel.org Cc: Andy Lutomirski luto@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: Andrew Cooper andrew.cooper3@citrix.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/entry/entry_64.S | 1 - arch/x86/include/asm/traps.h | 2 +- arch/x86/xen/enlighten_pv.c | 2 +- arch/x86/xen/xen-asm_64.S | 1 - 4 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 8dbca86c249b..5c033dc0c2c7 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -1171,7 +1171,6 @@ idtentry stack_segment do_stack_segment has_error_code=1 #ifdef CONFIG_XEN_PV idtentry xennmi do_nmi has_error_code=0 idtentry xendebug do_debug has_error_code=0 -idtentry xenint3 do_int3 has_error_code=0 #endif
idtentry general_protection do_general_protection has_error_code=1 diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index 7d6f3f3fad78..f2bd284abc16 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -40,7 +40,7 @@ asmlinkage void simd_coprocessor_error(void); asmlinkage void xen_divide_error(void); asmlinkage void xen_xennmi(void); asmlinkage void xen_xendebug(void); -asmlinkage void xen_xenint3(void); +asmlinkage void xen_int3(void); asmlinkage void xen_overflow(void); asmlinkage void xen_bounds(void); asmlinkage void xen_invalid_op(void); diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index 4722ba2966ac..30c14cb343fc 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -596,12 +596,12 @@ struct trap_array_entry {
static struct trap_array_entry trap_array[] = { { debug, xen_xendebug, true }, - { int3, xen_xenint3, true }, { double_fault, xen_double_fault, true }, #ifdef CONFIG_X86_MCE { machine_check, xen_machine_check, true }, #endif { nmi, xen_xennmi, true }, + { int3, xen_int3, false }, { overflow, xen_overflow, false }, #ifdef CONFIG_IA32_EMULATION { entry_INT80_compat, xen_entry_INT80_compat, false }, diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S index 1e9ef0ba30a5..ebf610b49c06 100644 --- a/arch/x86/xen/xen-asm_64.S +++ b/arch/x86/xen/xen-asm_64.S @@ -32,7 +32,6 @@ xen_pv_trap divide_error xen_pv_trap debug xen_pv_trap xendebug xen_pv_trap int3 -xen_pv_trap xenint3 xen_pv_trap xennmi xen_pv_trap overflow xen_pv_trap bounds
From: Josh Poimboeuf jpoimboe@redhat.com
[ Upstream commit 3a6ab4bcc52263dd5b1d2fd2e4ce95a38c798b4d ]
After an objtool improvement, it's complaining about the CLAC in copy_user_handle_tail():
arch/x86/lib/copy_user_64.o: warning: objtool: .altinstr_replacement+0x12: redundant UACCESS disable arch/x86/lib/copy_user_64.o: warning: objtool: copy_user_handle_tail()+0x6: (alt) arch/x86/lib/copy_user_64.o: warning: objtool: copy_user_handle_tail()+0x2: (alt) arch/x86/lib/copy_user_64.o: warning: objtool: copy_user_handle_tail()+0x0: <=== (func)
copy_user_handle_tail() is incorrectly marked as a callable function, so objtool is rightfully concerned about the CLAC with no corresponding STAC.
Remove the ELF function annotation. The copy_user_handle_tail() code path is already verified by objtool because it's jumped to by other callable asm code (which does the corresponding STAC).
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/6b6e436774678b4b9873811ff023bd29935bee5b.156341331... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/lib/copy_user_64.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S index 378a1f70ae7d..4fe1601dbc5d 100644 --- a/arch/x86/lib/copy_user_64.S +++ b/arch/x86/lib/copy_user_64.S @@ -239,7 +239,7 @@ copy_user_handle_tail: ret
_ASM_EXTABLE_UA(1b, 2b) -ENDPROC(copy_user_handle_tail) +END(copy_user_handle_tail)
/* * copy_user_nocache - Uncached memory copy with exception handling
From: Josh Poimboeuf jpoimboe@redhat.com
[ Upstream commit a7e47f26039c26312a4144c3001b4e9fa886bd45 ]
After an objtool improvement, it's reporting that __memcpy_mcsafe() is calling mcsafe_handle_tail() with AC=1:
arch/x86/lib/memcpy_64.o: warning: objtool: .fixup+0x13: call to mcsafe_handle_tail() with UACCESS enabled arch/x86/lib/memcpy_64.o: warning: objtool: __memcpy_mcsafe()+0x34: (alt) arch/x86/lib/memcpy_64.o: warning: objtool: __memcpy_mcsafe()+0xb: (branch) arch/x86/lib/memcpy_64.o: warning: objtool: __memcpy_mcsafe()+0x0: <=== (func)
mcsafe_handle_tail() is basically an extension of __memcpy_mcsafe(), so AC=1 is supposed to be set. Add mcsafe_handle_tail() to the uaccess safe list.
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Nick Desaulniers ndesaulniers@google.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/035c38f7eac845281d3c3d36749144982e06e58c.156341331... Signed-off-by: Sasha Levin sashal@kernel.org --- tools/objtool/check.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/objtool/check.c b/tools/objtool/check.c index 172f99195726..f5c3b97051a7 100644 --- a/tools/objtool/check.c +++ b/tools/objtool/check.c @@ -488,6 +488,7 @@ static const char *uaccess_safe_builtin[] = { /* misc */ "csum_partial_copy_generic", "__memcpy_mcsafe", + "mcsafe_handle_tail", "ftrace_likely_update", /* CONFIG_TRACE_BRANCH_PROFILING */ NULL };
From: Josh Poimboeuf jpoimboe@redhat.com
[ Upstream commit 3901336ed9887b075531bffaeef7742ba614058b ]
After making a change to improve objtool's sibling call detection, it started showing the following warning:
arch/x86/kvm/vmx/nested.o: warning: objtool: .fixup+0x15: sibling call from callable instruction with modified stack frame
The problem is the ____kvm_handle_fault_on_reboot() macro. It does a fake call by pushing a fake RIP and doing a jump. That tricks the unwinder into printing the function which triggered the exception, rather than the .fixup code.
Instead of the hack to make it look like the original function made the call, just change the macro so that the original function actually does make the call. This allows removal of the hack, and also makes objtool happy.
I triggered a vmx instruction exception and verified that the stack trace is still sane:
kernel BUG at arch/x86/kvm/x86.c:358! invalid opcode: 0000 [#1] SMP PTI CPU: 28 PID: 4096 Comm: qemu-kvm Not tainted 5.2.0+ #16 Hardware name: Lenovo THINKSYSTEM SD530 -[7X2106Z000]-/-[7X2106Z000]-, BIOS -[TEE113Z-1.00]- 07/17/2017 RIP: 0010:kvm_spurious_fault+0x5/0x10 Code: 00 00 00 00 00 8b 44 24 10 89 d2 45 89 c9 48 89 44 24 10 8b 44 24 08 48 89 44 24 08 e9 d4 40 22 00 0f 1f 40 00 0f 1f 44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41 RSP: 0018:ffffbf91c683bd00 EFLAGS: 00010246 RAX: 000061f040000000 RBX: ffff9e159c77bba0 RCX: ffff9e15a5c87000 RDX: 0000000665c87000 RSI: ffff9e15a5c87000 RDI: ffff9e159c77bba0 RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9e15a5c87000 R10: 0000000000000000 R11: fffff8f2d99721c0 R12: ffff9e159c77bba0 R13: ffffbf91c671d960 R14: ffff9e159c778000 R15: 0000000000000000 FS: 00007fa341cbe700(0000) GS:ffff9e15b7400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdd38356804 CR3: 00000006759de003 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: loaded_vmcs_init+0x4f/0xe0 alloc_loaded_vmcs+0x38/0xd0 vmx_create_vcpu+0xf7/0x600 kvm_vm_ioctl+0x5e9/0x980 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 ? free_one_page+0x13f/0x4e0 do_vfs_ioctl+0xa4/0x630 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x55/0x1c0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fa349b1ee5b
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Paolo Bonzini pbonzini@redhat.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/64a9b64d127e87b6920a97afde8e96ea76f6524e.156341331... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/kvm_host.h | 34 ++++++++++++++++++--------------- 1 file changed, 19 insertions(+), 15 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 26d1eb83f72a..e09e3c82d0a0 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1490,25 +1490,29 @@ enum { #define kvm_arch_vcpu_memslots_id(vcpu) ((vcpu)->arch.hflags & HF_SMM_MASK ? 1 : 0) #define kvm_memslots_for_spte_role(kvm, role) __kvm_memslots(kvm, (role).smm)
+asmlinkage void __noreturn kvm_spurious_fault(void); + /* * Hardware virtualization extension instructions may fault if a * reboot turns off virtualization while processes are running. - * Trap the fault and ignore the instruction if that happens. + * Usually after catching the fault we just panic; during reboot + * instead the instruction is ignored. */ -asmlinkage void kvm_spurious_fault(void); - -#define ____kvm_handle_fault_on_reboot(insn, cleanup_insn) \ - "666: " insn "\n\t" \ - "668: \n\t" \ - ".pushsection .fixup, "ax" \n" \ - "667: \n\t" \ - cleanup_insn "\n\t" \ - "cmpb $0, kvm_rebooting \n\t" \ - "jne 668b \n\t" \ - __ASM_SIZE(push) " $666b \n\t" \ - "jmp kvm_spurious_fault \n\t" \ - ".popsection \n\t" \ - _ASM_EXTABLE(666b, 667b) +#define ____kvm_handle_fault_on_reboot(insn, cleanup_insn) \ + "666: \n\t" \ + insn "\n\t" \ + "jmp 668f \n\t" \ + "667: \n\t" \ + "call kvm_spurious_fault \n\t" \ + "668: \n\t" \ + ".pushsection .fixup, "ax" \n\t" \ + "700: \n\t" \ + cleanup_insn "\n\t" \ + "cmpb $0, kvm_rebooting\n\t" \ + "je 667b \n\t" \ + "jmp 668b \n\t" \ + ".popsection \n\t" \ + _ASM_EXTABLE(666b, 700b)
#define __kvm_handle_fault_on_reboot(insn) \ ____kvm_handle_fault_on_reboot(insn, "")
From: Josh Poimboeuf jpoimboe@redhat.com
[ Upstream commit 083db6764821996526970e42d09c1ab2f4155dd4 ]
The __raw_callee_save_*() functions have an ELF symbol size of zero, which confuses objtool and other tools.
Fixes a bunch of warnings like the following:
arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_pte_val() is missing an ELF size annotation arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_pgd_val() is missing an ELF size annotation arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_make_pte() is missing an ELF size annotation arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_make_pgd() is missing an ELF size annotation
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Juergen Gross jgross@suse.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/afa6d49bb07497ca62e4fc3b27a2d0cece545b4e.156341331... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/paravirt.h | 1 + arch/x86/kernel/kvm.c | 1 + 2 files changed, 2 insertions(+)
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h index c25c38a05c1c..d6f5ae2c79ab 100644 --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -746,6 +746,7 @@ bool __raw_callee_save___native_vcpu_is_preempted(long cpu); PV_RESTORE_ALL_CALLER_REGS \ FRAME_END \ "ret;" \ + ".size " PV_THUNK_NAME(func) ", .-" PV_THUNK_NAME(func) ";" \ ".popsection")
/* Get a reference to a callee-save function */ diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 5169b8cc35bb..320b70acb211 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -817,6 +817,7 @@ asm( "cmpb $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);" "setne %al;" "ret;" +".size __raw_callee_save___kvm_vcpu_is_preempted, .-__raw_callee_save___kvm_vcpu_is_preempted;" ".popsection");
#endif
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit 9b3d15e6b05e0b916be5fbd915f90300a403098b ]
Unlike legacy chips, 57500 chips don't need additional VNIC resources for aRFS/ntuple. Fix the code accordingly so that we don't reserve and allocate additional VNICs on 57500 chips. Without this patch, the driver is failing to initialize when it tries to allocate extra VNICs.
Fixes: ac33906c67e2 ("bnxt_en: Add support for aRFS on 57500 chips.") Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index f758b2e0591f..9a2a0d24d20d 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -3022,7 +3022,7 @@ static int bnxt_alloc_vnics(struct bnxt *bp) int num_vnics = 1;
#ifdef CONFIG_RFS_ACCEL - if (bp->flags & BNXT_FLAG_RFS) + if ((bp->flags & (BNXT_FLAG_RFS | BNXT_FLAG_CHIP_P5)) == BNXT_FLAG_RFS) num_vnics += bp->rx_nr_rings; #endif
@@ -7124,6 +7124,9 @@ static int bnxt_alloc_rfs_vnics(struct bnxt *bp) #ifdef CONFIG_RFS_ACCEL int i, rc = 0;
+ if (bp->flags & BNXT_FLAG_CHIP_P5) + return 0; + for (i = 0; i < bp->rx_nr_rings; i++) { struct bnxt_vnic_info *vnic; u16 vnic_id = i + 1; @@ -9587,7 +9590,7 @@ int bnxt_check_rings(struct bnxt *bp, int tx, int rx, bool sh, int tcs, return -ENOMEM;
vnics = 1; - if (bp->flags & BNXT_FLAG_RFS) + if ((bp->flags & (BNXT_FLAG_RFS | BNXT_FLAG_CHIP_P5)) == BNXT_FLAG_RFS) vnics += rx_rings;
if (bp->flags & BNXT_FLAG_AGG_RINGS)
From: Zhenzhong Duan zhenzhong.duan@oracle.com
[ Upstream commit 8c5477e8046ca139bac250386c08453da37ec1ae ]
Kernel build warns: 'sanitize_boot_params' defined but not used [-Wunused-function]
at below files: arch/x86/boot/compressed/cmdline.c arch/x86/boot/compressed/error.c arch/x86/boot/compressed/early_serial_console.c arch/x86/boot/compressed/acpi.c
That's becausethey each include misc.h which includes a definition of sanitize_boot_params() via bootparam_utils.h.
Remove the inclusion from misc.h and have the c file including bootparam_utils.h directly.
Signed-off-by: Zhenzhong Duan zhenzhong.duan@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/1563283092-1189-1-git-send-email-zhenzhong.duan@or... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/boot/compressed/misc.c | 1 + arch/x86/boot/compressed/misc.h | 1 - 2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c index 5a237e8dbf8d..0de54a1d25c0 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -17,6 +17,7 @@ #include "pgtable.h" #include "../string.h" #include "../voffset.h" +#include <asm/bootparam_utils.h>
/* * WARNING!! diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h index d2f184165934..c8181392f70d 100644 --- a/arch/x86/boot/compressed/misc.h +++ b/arch/x86/boot/compressed/misc.h @@ -23,7 +23,6 @@ #include <asm/page.h> #include <asm/boot.h> #include <asm/bootparam.h> -#include <asm/bootparam_utils.h>
#define BOOT_CTYPE_H #include <linux/acpi.h>
From: Josh Poimboeuf jpoimboe@redhat.com
[ Upstream commit 3193c0836f203a91bef96d88c64cccf0be090d9c ]
On x86-64, with CONFIG_RETPOLINE=n, GCC's "global common subexpression elimination" optimization results in ___bpf_prog_run()'s jumptable code changing from this:
select_insn: jmp *jumptable(, %rax, 8) ... ALU64_ADD_X: ... jmp *jumptable(, %rax, 8) ALU_ADD_X: ... jmp *jumptable(, %rax, 8)
to this:
select_insn: mov jumptable, %r12 jmp *(%r12, %rax, 8) ... ALU64_ADD_X: ... jmp *(%r12, %rax, 8) ALU_ADD_X: ... jmp *(%r12, %rax, 8)
The jumptable address is placed in a register once, at the beginning of the function. The function execution can then go through multiple indirect jumps which rely on that same register value. This has a few issues:
1) Objtool isn't smart enough to be able to track such a register value across multiple recursive indirect jumps through the jump table.
2) With CONFIG_RETPOLINE enabled, this optimization actually results in a small slowdown. I measured a ~4.7% slowdown in the test_bpf "tcpdump port 22" selftest.
This slowdown is actually predicted by the GCC manual:
Note: When compiling a program using computed gotos, a GCC extension, you may get better run-time performance if you disable the global common subexpression elimination pass by adding -fno-gcse to the command line.
So just disable the optimization for this function.
Fixes: e55a73251da3 ("bpf: Fix ORC unwinding in non-JIT BPF code") Reported-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Alexei Starovoitov ast@kernel.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/30c3ca29ba037afcbd860a8672eef0021addf9fe.156341331... Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/compiler-gcc.h | 2 ++ include/linux/compiler_types.h | 4 ++++ kernel/bpf/core.c | 2 +- 3 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h index e8579412ad21..d7ee4c6bad48 100644 --- a/include/linux/compiler-gcc.h +++ b/include/linux/compiler-gcc.h @@ -170,3 +170,5 @@ #else #define __diag_GCC_8(s) #endif + +#define __no_fgcse __attribute__((optimize("-fno-gcse"))) diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h index 19e58b9138a0..0454d82f8bd8 100644 --- a/include/linux/compiler_types.h +++ b/include/linux/compiler_types.h @@ -187,6 +187,10 @@ struct ftrace_likely_data { #define asm_volatile_goto(x...) asm goto(x) #endif
+#ifndef __no_fgcse +# define __no_fgcse +#endif + /* Are two types/vars the same type (ignoring qualifiers)? */ #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 080e2bb644cc..ebfd189916dc 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1295,7 +1295,7 @@ bool bpf_opcode_in_insntable(u8 code) * * Decode and execute eBPF instructions. */ -static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack) +static u64 __no_fgcse ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn, u64 *stack) { #define BPF_INSN_2_LBL(x, y) [BPF_##x | BPF_##y] = &&x##_##y #define BPF_INSN_3_LBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = &&x##_##y##_##z
From: Yongxin Liu yongxin.liu@windriver.com
[ Upstream commit 09b90e2fe35faeace2488234e2a7728f2ea8ba26 ]
In nouveau_conn_reset(), if connector->state is true, __drm_atomic_helper_connector_destroy_state() will be called, but the memory pointed by asyc isn't freed. Memory leak happens in the following function __drm_atomic_helper_connector_reset(), where newly allocated asyc->state will be assigned to connector->state.
So using nouveau_conn_atomic_destroy_state() instead of __drm_atomic_helper_connector_destroy_state to free the "old" asyc.
Here the is the log showing memory leak.
unreferenced object 0xffff8c5480483c80 (size 192): comm "kworker/0:2", pid 188, jiffies 4294695279 (age 53.179s) hex dump (first 32 bytes): 00 f0 ba 7b 54 8c ff ff 00 00 00 00 00 00 00 00 ...{T........... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000005005c0d0>] kmem_cache_alloc_trace+0x195/0x2c0 [<00000000a122baed>] nouveau_conn_reset+0x25/0xc0 [nouveau] [<000000004fd189a2>] nouveau_connector_create+0x3a7/0x610 [nouveau] [<00000000c73343a8>] nv50_display_create+0x343/0x980 [nouveau] [<000000002e2b03c3>] nouveau_display_create+0x51f/0x660 [nouveau] [<00000000c924699b>] nouveau_drm_device_init+0x182/0x7f0 [nouveau] [<00000000cc029436>] nouveau_drm_probe+0x20c/0x2c0 [nouveau] [<000000007e961c3e>] local_pci_probe+0x47/0xa0 [<00000000da14d569>] work_for_cpu_fn+0x1a/0x30 [<0000000028da4805>] process_one_work+0x27c/0x660 [<000000001d415b04>] worker_thread+0x22b/0x3f0 [<0000000003b69f1f>] kthread+0x12f/0x150 [<00000000c94c29b7>] ret_from_fork+0x3a/0x50
Signed-off-by: Yongxin Liu yongxin.liu@windriver.com Signed-off-by: Ben Skeggs bskeggs@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/nouveau/nouveau_connector.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c index 4116ee62adaf..f69ff22beee0 100644 --- a/drivers/gpu/drm/nouveau/nouveau_connector.c +++ b/drivers/gpu/drm/nouveau/nouveau_connector.c @@ -252,7 +252,7 @@ nouveau_conn_reset(struct drm_connector *connector) return;
if (connector->state) - __drm_atomic_helper_connector_destroy_state(connector->state); + nouveau_conn_atomic_destroy_state(connector, connector->state); __drm_atomic_helper_connector_reset(connector, &asyc->state); asyc->dither.mode = DITHERING_MODE_AUTO; asyc->dither.depth = DITHERING_DEPTH_AUTO;
From: Fugang Duan fugang.duan@nxp.com
[ Upstream commit 449fa54d6815be8c2c1f68fa9dbbae9384a7c03e ]
dma_map_sg() may use swiotlb buffer when the kernel command line includes "swiotlb=force" or the dma_addr is out of dev->dma_mask range. After DMA complete the memory moving from device to memory, then user call dma_sync_sg_for_cpu() to sync with DMA buffer, and copy the original virtual buffer to other space.
So dma_direct_sync_sg_for_cpu() should use swiotlb physical addr, not the original physical addr from sg_phys(sg).
dma_direct_sync_sg_for_device() also has the same issue, correct it as well.
Fixes: 55897af63091("dma-direct: merge swiotlb_dma_ops into the dma_direct code") Signed-off-by: Fugang Duan fugang.duan@nxp.com Reviewed-by: Robin Murphy robin.murphy@arm.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/dma/direct.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 2c2772e9702a..2c6d51a62251 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -231,12 +231,14 @@ void dma_direct_sync_sg_for_device(struct device *dev, int i;
for_each_sg(sgl, sg, nents, i) { - if (unlikely(is_swiotlb_buffer(sg_phys(sg)))) - swiotlb_tbl_sync_single(dev, sg_phys(sg), sg->length, + phys_addr_t paddr = dma_to_phys(dev, sg_dma_address(sg)); + + if (unlikely(is_swiotlb_buffer(paddr))) + swiotlb_tbl_sync_single(dev, paddr, sg->length, dir, SYNC_FOR_DEVICE);
if (!dev_is_dma_coherent(dev)) - arch_sync_dma_for_device(dev, sg_phys(sg), sg->length, + arch_sync_dma_for_device(dev, paddr, sg->length, dir); } } @@ -268,11 +270,13 @@ void dma_direct_sync_sg_for_cpu(struct device *dev, int i;
for_each_sg(sgl, sg, nents, i) { + phys_addr_t paddr = dma_to_phys(dev, sg_dma_address(sg)); + if (!dev_is_dma_coherent(dev)) - arch_sync_dma_for_cpu(dev, sg_phys(sg), sg->length, dir); - - if (unlikely(is_swiotlb_buffer(sg_phys(sg)))) - swiotlb_tbl_sync_single(dev, sg_phys(sg), sg->length, dir, + arch_sync_dma_for_cpu(dev, paddr, sg->length, dir); + + if (unlikely(is_swiotlb_buffer(paddr))) + swiotlb_tbl_sync_single(dev, paddr, sg->length, dir, SYNC_FOR_CPU); }
From: Ralph Campbell rcampbell@nvidia.com
[ Upstream commit d304654bd79332ace9ac46b9a3d8de60acb15da3 ]
In nouveau_dmem_pages_alloc(), the drm->dmem->mutex is unlocked before calling nouveau_dmem_chunk_alloc() as shown when CONFIG_PROVE_LOCKING is enabled:
[ 1294.871933] ===================================== [ 1294.876656] WARNING: bad unlock balance detected! [ 1294.881375] 5.2.0-rc3+ #5 Not tainted [ 1294.885048] ------------------------------------- [ 1294.889773] test-malloc-vra/6299 is trying to release lock (&drm->dmem->mutex) at: [ 1294.897482] [<ffffffffa01a220f>] nouveau_dmem_migrate_alloc_and_copy+0x79f/0xbf0 [nouveau] [ 1294.905782] but there are no more locks to release! [ 1294.910690] [ 1294.910690] other info that might help us debug this: [ 1294.917249] 1 lock held by test-malloc-vra/6299: [ 1294.921881] #0: 0000000016e10454 (&mm->mmap_sem#2){++++}, at: nouveau_svmm_bind+0x142/0x210 [nouveau] [ 1294.931313] [ 1294.931313] stack backtrace: [ 1294.935702] CPU: 4 PID: 6299 Comm: test-malloc-vra Not tainted 5.2.0-rc3+ #5 [ 1294.942786] Hardware name: ASUS X299-A/PRIME X299-A, BIOS 1401 05/21/2018 [ 1294.949590] Call Trace: [ 1294.952059] dump_stack+0x7c/0xc0 [ 1294.955469] ? nouveau_dmem_migrate_alloc_and_copy+0x79f/0xbf0 [nouveau] [ 1294.962213] print_unlock_imbalance_bug.cold.52+0xca/0xcf [ 1294.967641] lock_release+0x306/0x380 [ 1294.971383] ? nouveau_dmem_migrate_alloc_and_copy+0x79f/0xbf0 [nouveau] [ 1294.978089] ? lock_downgrade+0x2d0/0x2d0 [ 1294.982121] ? find_held_lock+0xac/0xd0 [ 1294.985979] __mutex_unlock_slowpath+0x8f/0x3f0 [ 1294.990540] ? wait_for_completion+0x230/0x230 [ 1294.995002] ? rwlock_bug.part.2+0x60/0x60 [ 1294.999197] nouveau_dmem_migrate_alloc_and_copy+0x79f/0xbf0 [nouveau] [ 1295.005751] ? page_mapping+0x98/0x110 [ 1295.009511] migrate_vma+0xa74/0x1090 [ 1295.013186] ? move_to_new_page+0x480/0x480 [ 1295.017400] ? __kmalloc+0x153/0x300 [ 1295.021052] ? nouveau_dmem_migrate_vma+0xd8/0x1e0 [nouveau] [ 1295.026796] nouveau_dmem_migrate_vma+0x157/0x1e0 [nouveau] [ 1295.032466] ? nouveau_dmem_init+0x490/0x490 [nouveau] [ 1295.037612] ? vmacache_find+0xc2/0x110 [ 1295.041537] nouveau_svmm_bind+0x1b4/0x210 [nouveau] [ 1295.046583] ? nouveau_svm_fault+0x13e0/0x13e0 [nouveau] [ 1295.051912] drm_ioctl_kernel+0x14d/0x1a0 [ 1295.055930] ? drm_setversion+0x330/0x330 [ 1295.059971] drm_ioctl+0x308/0x530 [ 1295.063384] ? drm_version+0x150/0x150 [ 1295.067153] ? find_held_lock+0xac/0xd0 [ 1295.070996] ? __pm_runtime_resume+0x3f/0xa0 [ 1295.075285] ? mark_held_locks+0x29/0xa0 [ 1295.079230] ? _raw_spin_unlock_irqrestore+0x3c/0x50 [ 1295.084232] ? lockdep_hardirqs_on+0x17d/0x250 [ 1295.088768] nouveau_drm_ioctl+0x9a/0x100 [nouveau] [ 1295.093661] do_vfs_ioctl+0x137/0x9a0 [ 1295.097341] ? ioctl_preallocate+0x140/0x140 [ 1295.101623] ? match_held_lock+0x1b/0x230 [ 1295.105646] ? match_held_lock+0x1b/0x230 [ 1295.109660] ? find_held_lock+0xac/0xd0 [ 1295.113512] ? __do_page_fault+0x324/0x630 [ 1295.117617] ? lock_downgrade+0x2d0/0x2d0 [ 1295.121648] ? mark_held_locks+0x79/0xa0 [ 1295.125583] ? handle_mm_fault+0x352/0x430 [ 1295.129687] ksys_ioctl+0x60/0x90 [ 1295.133020] ? mark_held_locks+0x29/0xa0 [ 1295.136964] __x64_sys_ioctl+0x3d/0x50 [ 1295.140726] do_syscall_64+0x68/0x250 [ 1295.144400] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 1295.149465] RIP: 0033:0x7f1a3495809b [ 1295.153053] Code: 0f 1e fa 48 8b 05 ed bd 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd bd 0c 00 f7 d8 64 89 01 48 [ 1295.171850] RSP: 002b:00007ffef7ed1358 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 1295.179451] RAX: ffffffffffffffda RBX: 00007ffef7ed1628 RCX: 00007f1a3495809b [ 1295.186601] RDX: 00007ffef7ed13b0 RSI: 0000000040406449 RDI: 0000000000000004 [ 1295.193759] RBP: 00007ffef7ed13b0 R08: 0000000000000000 R09: 000000000157e770 [ 1295.200917] R10: 000000000151c010 R11: 0000000000000246 R12: 0000000040406449 [ 1295.208083] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000000
Reacquire the lock before continuing to the next page.
Signed-off-by: Ralph Campbell rcampbell@nvidia.com Signed-off-by: Ben Skeggs bskeggs@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/nouveau/nouveau_dmem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c index 40c47d6a7d78..745e197a4775 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c @@ -385,9 +385,10 @@ nouveau_dmem_pages_alloc(struct nouveau_drm *drm, ret = nouveau_dmem_chunk_alloc(drm); if (ret) { if (c) - break; + return 0; return ret; } + mutex_lock(&drm->dmem->mutex); continue; }
linux-stable-mirror@lists.linaro.org