This is the start of the stable review cycle for the 4.14.137 release. There are 53 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 07 Aug 2019 12:47:58 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.137-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.14.137-rc1
Andy Lutomirski luto@kernel.org x86/vdso: Prevent segfaults due to hoisted vclock reads
Linus Torvalds torvalds@linux-foundation.org gcc-9: properly declare the {pv,hv}clock_page storage
Josh Poimboeuf jpoimboe@redhat.com objtool: Support GCC 9 cold subfunction naming scheme
Jean Delvare jdelvare@suse.de eeprom: at24: make spd world-readable again
John Fleck john.fleck@intel.com IB/hfi1: Check for error on call to alloc_rsm_map_table
Yishai Hadas yishaih@mellanox.com IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specification
Yishai Hadas yishaih@mellanox.com IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cache
Yishai Hadas yishaih@mellanox.com IB/mlx5: Use direct mkey destroy command upon UMR unreg failure
Yishai Hadas yishaih@mellanox.com IB/mlx5: Fix unreg_umr to ignore the mkey state
Juergen Gross jgross@suse.com xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()
Munehisa Kamata kamatam@amazon.com nbd: replace kill_bdev() with __invalidate_device() again
Will Deacon will@kernel.org drivers/perf: arm_pmu: Fix failure path in PM notifier
Helge Deller deller@gmx.de parisc: Fix build of compressed kernel even with debug enabled
Stefan Haberland sth@linux.ibm.com s390/dasd: fix endless loop after read unit address configuration
Ondrej Mosnacek omosnace@redhat.com selinux: fix memory leak in policydb_init()
Gustavo A. R. Silva gustavo@embeddedor.com IB/hfi1: Fix Spectre v1 vulnerability
Michael Wu michael.wu@vatics.com gpiolib: fix incorrect IRQ requesting of an active-low lineevent
Douglas Anderson dianders@chromium.org mmc: dw_mmc: Fix occasional hang after tuning on eMMC
Filipe Manana fdmanana@suse.com Btrfs: fix race leading to fs corruption after transaction abort
Filipe Manana fdmanana@suse.com Btrfs: fix incremental send failure after deduplication
Masahiro Yamada yamada.masahiro@socionext.com kbuild: initialize CLANG_FLAGS correctly in the top Makefile
Yongxin Liu yongxin.liu@windriver.com drm/nouveau: fix memory leak in nouveau_conn_reset()
Zhenzhong Duan zhenzhong.duan@oracle.com x86, boot: Remove multiple copy of static function sanitize_boot_params()
Josh Poimboeuf jpoimboe@redhat.com x86/paravirt: Fix callee-saved function ELF sizes
Josh Poimboeuf jpoimboe@redhat.com x86/kvm: Don't call kvm_spurious_fault() from .fixup
Zhenzhong Duan zhenzhong.duan@oracle.com xen/pv: Fix a boot up hang revealed by int3 self test
Kees Cook keescook@chromium.org ipc/mqueue.c: only perform resource calculation if user valid
Dan Carpenter dan.carpenter@oracle.com drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings
Mikko Rapeli mikko.rapeli@iki.fi uapi linux/coda_psdev.h: move upc_req definition from uapi to kernel side headers
Sam Protsenko semen.protsenko@linaro.org coda: fix build using bare-metal toolchain
Zhouyang Jia jiazhouyang09@gmail.com coda: add error handling for fget
Doug Berger opendmb@gmail.com mm/cma.c: fail if fixed declaration can't be honored
Arnd Bergmann arnd@arndb.de x86: math-emu: Hide clang warnings for 16-bit overflow
Qian Cai cai@lca.pw x86/apic: Silence -Wtype-limits compiler warnings
Benjamin Poirier bpoirier@suse.com be2net: Signal that the device cannot transmit during reconfiguration
Arnd Bergmann arnd@arndb.de ACPI: fix false-positive -Wuninitialized warning
Arnd Bergmann arnd@arndb.de x86: kvm: avoid constant-conversion warning
Benjamin Block bblock@linux.ibm.com scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized
Arnd Bergmann arnd@arndb.de ACPI: blacklist: fix clang warning for unused DMI table
Jeff Layton jlayton@kernel.org ceph: return -ERANGE if virtual xattr value didn't fit in buffer
Andrea Parri andrea.parri@amarulasolutions.com ceph: fix improper use of smp_mb__before_atomic()
Ronnie Sahlberg lsahlber@redhat.com cifs: Fix a race condition with cifs_echo_request
David Sterba dsterba@suse.com btrfs: fix minimum number of chunk errors for DUP
Russell King rmk+kernel@armlinux.org.uk fs/adfs: super: fix use-after-free bug
JC Kuo jckuo@nvidia.com clk: tegra210: fix PLLU and PLLU_OUT1
Geert Uytterhoeven geert+renesas@glider.be dmaengine: rcar-dmac: Reject zero-length slave DMA requests
Petr Cvek petrcvekcz@gmail.com MIPS: lantiq: Fix bitfield masking
Prarit Bhargava prarit@redhat.com kernel/module.c: Only return -EEXIST for modules that have finished loading
Cheng Jian cj.chengjian@huawei.com ftrace: Enable trampoline when rec count returns back to one
Douglas Anderson dianders@chromium.org ARM: dts: rockchip: Mark that the rk3288 timer might stop in suspend
Douglas Anderson dianders@chromium.org ARM: dts: rockchip: Make rk3288-veyron-mickey's emmc work again
Douglas Anderson dianders@chromium.org ARM: dts: rockchip: Make rk3288-veyron-minnie run at hs200
Russell King rmk+kernel@armlinux.org.uk ARM: riscpc: fix DMA
-------------
Diffstat:
Makefile | 7 +-- arch/arm/boot/dts/rk3288-veyron-mickey.dts | 4 -- arch/arm/boot/dts/rk3288-veyron-minnie.dts | 4 -- arch/arm/boot/dts/rk3288.dtsi | 1 + arch/arm/mach-rpc/dma.c | 5 +- arch/mips/lantiq/irq.c | 5 +- arch/parisc/boot/compressed/vmlinux.lds.S | 4 +- arch/x86/boot/compressed/misc.c | 1 + arch/x86/boot/compressed/misc.h | 1 - arch/x86/entry/entry_64.S | 1 - arch/x86/entry/vdso/vclock_gettime.c | 19 +++++-- arch/x86/include/asm/apic.h | 2 +- arch/x86/include/asm/kvm_host.h | 34 +++++++------ arch/x86/include/asm/paravirt.h | 1 + arch/x86/include/asm/traps.h | 2 +- arch/x86/kernel/apic/apic.c | 2 +- arch/x86/kernel/kvm.c | 1 + arch/x86/kvm/mmu.c | 6 +-- arch/x86/math-emu/fpu_emu.h | 2 +- arch/x86/math-emu/reg_constant.c | 2 +- arch/x86/xen/enlighten_pv.c | 2 +- arch/x86/xen/xen-asm_64.S | 1 - drivers/acpi/blacklist.c | 4 ++ drivers/block/nbd.c | 2 +- drivers/clk/tegra/clk-tegra210.c | 8 +-- drivers/dma/sh/rcar-dmac.c | 2 +- drivers/gpio/gpiolib.c | 6 ++- drivers/gpu/drm/nouveau/nouveau_connector.c | 2 +- drivers/infiniband/hw/hfi1/chip.c | 11 ++++- drivers/infiniband/hw/hfi1/verbs.c | 2 + drivers/infiniband/hw/mlx5/mlx5_ib.h | 1 + drivers/infiniband/hw/mlx5/mr.c | 17 ++++--- drivers/infiniband/hw/mlx5/qp.c | 13 +++-- drivers/misc/eeprom/at24.c | 2 +- drivers/mmc/host/dw_mmc.c | 3 +- drivers/net/ethernet/emulex/benet/be_main.c | 6 ++- drivers/perf/arm_pmu.c | 2 +- drivers/rapidio/devices/rio_mport_cdev.c | 2 + drivers/s390/block/dasd_alias.c | 22 ++++++--- drivers/s390/scsi/zfcp_erp.c | 7 +++ drivers/xen/swiotlb-xen.c | 4 +- fs/adfs/super.c | 5 +- fs/btrfs/send.c | 77 ++++++----------------------- fs/btrfs/transaction.c | 10 ++++ fs/btrfs/volumes.c | 3 +- fs/ceph/super.h | 7 ++- fs/ceph/xattr.c | 14 +++--- fs/cifs/connect.c | 8 +-- fs/coda/psdev.c | 5 +- include/linux/acpi.h | 5 +- include/linux/coda.h | 3 +- include/linux/coda_psdev.h | 11 +++++ include/uapi/linux/coda_psdev.h | 13 ----- ipc/mqueue.c | 19 +++---- kernel/module.c | 6 +-- kernel/trace/ftrace.c | 28 ++++++----- mm/cma.c | 13 +++++ security/selinux/ss/policydb.c | 6 ++- tools/objtool/elf.c | 2 +- 59 files changed, 254 insertions(+), 204 deletions(-)
[ Upstream commit ffd9a1ba9fdb7f2bd1d1ad9b9243d34e96756ba2 ]
DMA got broken a while back in two different ways: 1) a change in the behaviour of disable_irq() to wait for the interrupt to finish executing causes us to deadlock at the end of DMA. 2) a change to avoid modifying the scatterlist left the first transfer uninitialised.
DMA is only used with expansion cards, so has gone unnoticed.
Fixes: fa4e99899932 ("[ARM] dma: RiscPC: don't modify DMA SG entries") Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/mach-rpc/dma.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mach-rpc/dma.c b/arch/arm/mach-rpc/dma.c index fb48f3141fb4d..c4c96661eb89a 100644 --- a/arch/arm/mach-rpc/dma.c +++ b/arch/arm/mach-rpc/dma.c @@ -131,7 +131,7 @@ static irqreturn_t iomd_dma_handle(int irq, void *dev_id) } while (1);
idma->state = ~DMA_ST_AB; - disable_irq(irq); + disable_irq_nosync(irq);
return IRQ_HANDLED; } @@ -174,6 +174,9 @@ static void iomd_enable_dma(unsigned int chan, dma_t *dma) DMA_FROM_DEVICE : DMA_TO_DEVICE); }
+ idma->dma_addr = idma->dma.sg->dma_address; + idma->dma_len = idma->dma.sg->length; + iomd_writeb(DMA_CR_C, dma_base + CR); idma->state = DMA_ST_AB; }
[ Upstream commit 1c0479023412ab7834f2e98b796eb0d8c627cd62 ]
As some point hs200 was failing on rk3288-veyron-minnie. See commit 984926781122 ("ARM: dts: rockchip: temporarily remove emmc hs200 speed from rk3288 minnie"). Although I didn't track down exactly when it started working, it seems to work OK now, so let's turn it back on.
To test this, I booted from SD card and then used this script to stress the enumeration process after fixing a memory leak [1]: cd /sys/bus/platform/drivers/dwmmc_rockchip for i in $(seq 1 3000); do echo "========================" $i echo ff0f0000.dwmmc > unbind sleep .5 echo ff0f0000.dwmmc > bind while true; do if [ -e /dev/mmcblk2 ]; then break; fi sleep .1 done done
It worked fine.
[1] https://lkml.kernel.org/r/20190503233526.226272-1-dianders@chromium.org
Signed-off-by: Douglas Anderson dianders@chromium.org Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/rk3288-veyron-minnie.dts | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/arch/arm/boot/dts/rk3288-veyron-minnie.dts b/arch/arm/boot/dts/rk3288-veyron-minnie.dts index 544de6027aaa0..6000dca1cf054 100644 --- a/arch/arm/boot/dts/rk3288-veyron-minnie.dts +++ b/arch/arm/boot/dts/rk3288-veyron-minnie.dts @@ -125,10 +125,6 @@ power-supply = <&backlight_regulator>; };
-&emmc { - /delete-property/mmc-hs200-1_8v; -}; - &gpio_keys { pinctrl-0 = <&pwr_key_l &ap_lid_int_l &volum_down_l &volum_up_l>;
[ Upstream commit 99fa066710f75f18f4d9a5bc5f6a711968a581d5 ]
When I try to boot rk3288-veyron-mickey I totally fail to make the eMMC work. Specifically my logs (on Chrome OS 4.19):
mmc_host mmc1: card is non-removable. mmc_host mmc1: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0) mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0) mmc1: switch to bus width 8 failed mmc1: switch to bus width 4 failed mmc1: new high speed MMC card at address 0001 mmcblk1: mmc1:0001 HAG2e 14.7 GiB mmcblk1boot0: mmc1:0001 HAG2e partition 1 4.00 MiB mmcblk1boot1: mmc1:0001 HAG2e partition 2 4.00 MiB mmcblk1rpmb: mmc1:0001 HAG2e partition 3 4.00 MiB, chardev (243:0) mmc_host mmc1: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0) mmc_host mmc1: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0) mmc1: switch to bus width 8 failed mmc1: switch to bus width 4 failed mmc1: tried to HW reset card, got error -110 mmcblk1: error -110 requesting status mmcblk1: recovery failed! print_req_error: I/O error, dev mmcblk1, sector 0 ...
When I remove the '/delete-property/mmc-hs200-1_8v' then everything is hunky dory.
That line comes from the original submission of the mickey dts upstream, so presumably at the time the HS200 was failing and just enumerating things as a high speed device was fine. ...or maybe it's just that some mickey devices work when enumerating at "high speed", just not mine?
In any case, hs200 seems good now. Let's turn it on.
Signed-off-by: Douglas Anderson dianders@chromium.org Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/rk3288-veyron-mickey.dts | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/arch/arm/boot/dts/rk3288-veyron-mickey.dts b/arch/arm/boot/dts/rk3288-veyron-mickey.dts index f0994f0e57745..d6ca67866bc00 100644 --- a/arch/arm/boot/dts/rk3288-veyron-mickey.dts +++ b/arch/arm/boot/dts/rk3288-veyron-mickey.dts @@ -161,10 +161,6 @@ }; };
-&emmc { - /delete-property/mmc-hs200-1_8v; -}; - &i2c2 { status = "disabled"; };
[ Upstream commit 8ef1ba39a9fa53d2205e633bc9b21840a275908e ]
This is similar to commit e6186820a745 ("arm64: dts: rockchip: Arch counter doesn't tick in system suspend"). Specifically on the rk3288 it can be seen that the timer stops ticking in suspend if we end up running through the "osc_disable" path in rk3288_slp_mode_set(). In that path the 24 MHz clock will turn off and the timer stops.
To test this, I ran this on a Chrome OS filesystem: before=$(date); \ suspend_stress_test -c1 --suspend_min=30 --suspend_max=31; \ echo ${before}; date
...and I found that unless I plug in a device that requests USB wakeup to be active that the two calls to "date" would show that fewer than 30 seconds passed.
NOTE: deep suspend (where the 24 MHz clock gets disabled) isn't supported yet on upstream Linux so this was tested on a downstream kernel.
Signed-off-by: Douglas Anderson dianders@chromium.org Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/rk3288.dtsi | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm/boot/dts/rk3288.dtsi b/arch/arm/boot/dts/rk3288.dtsi index 5a7888581eea9..23907d9ce89ad 100644 --- a/arch/arm/boot/dts/rk3288.dtsi +++ b/arch/arm/boot/dts/rk3288.dtsi @@ -213,6 +213,7 @@ <GIC_PPI 11 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>, <GIC_PPI 10 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_HIGH)>; clock-frequency = <24000000>; + arm,no-tick-in-suspend; };
timer: timer@ff810000 {
[ Upstream commit a124692b698b00026a58d89831ceda2331b2e1d0 ]
Custom trampolines can only be enabled if there is only a single ops attached to it. If there's only a single callback registered to a function, and the ops has a trampoline registered for it, then we can call the trampoline directly. This is very useful for improving the performance of ftrace and livepatch.
If more than one callback is registered to a function, the general trampoline is used, and the custom trampoline is not restored back to the direct call even if all the other callbacks were unregistered and we are back to one callback for the function.
To fix this, set FTRACE_FL_TRAMP flag if rec count is decremented to one, and the ops that left has a trampoline.
Testing After this patch :
insmod livepatch_unshare_files.ko cat /sys/kernel/debug/tracing/enabled_functions
unshare_files (1) R I tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0
echo unshare_files > /sys/kernel/debug/tracing/set_ftrace_filter echo function > /sys/kernel/debug/tracing/current_tracer cat /sys/kernel/debug/tracing/enabled_functions
unshare_files (2) R I ->ftrace_ops_list_func+0x0/0x150
echo nop > /sys/kernel/debug/tracing/current_tracer cat /sys/kernel/debug/tracing/enabled_functions
unshare_files (1) R I tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0
Link: http://lkml.kernel.org/r/1556969979-111047-1-git-send-email-cj.chengjian@hua...
Signed-off-by: Cheng Jian cj.chengjian@huawei.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/ftrace.c | 28 +++++++++++++++------------- 1 file changed, 15 insertions(+), 13 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index c4a0ad18c8593..7420f5f360947 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -1712,6 +1712,11 @@ static bool test_rec_ops_needs_regs(struct dyn_ftrace *rec) return keep_regs; }
+static struct ftrace_ops * +ftrace_find_tramp_ops_any(struct dyn_ftrace *rec); +static struct ftrace_ops * +ftrace_find_tramp_ops_next(struct dyn_ftrace *rec, struct ftrace_ops *ops); + static bool __ftrace_hash_rec_update(struct ftrace_ops *ops, int filter_hash, bool inc) @@ -1840,15 +1845,17 @@ static bool __ftrace_hash_rec_update(struct ftrace_ops *ops, }
/* - * If the rec had TRAMP enabled, then it needs to - * be cleared. As TRAMP can only be enabled iff - * there is only a single ops attached to it. - * In otherwords, always disable it on decrementing. - * In the future, we may set it if rec count is - * decremented to one, and the ops that is left - * has a trampoline. + * The TRAMP needs to be set only if rec count + * is decremented to one, and the ops that is + * left has a trampoline. As TRAMP can only be + * enabled if there is only a single ops attached + * to it. */ - rec->flags &= ~FTRACE_FL_TRAMP; + if (ftrace_rec_count(rec) == 1 && + ftrace_find_tramp_ops_any(rec)) + rec->flags |= FTRACE_FL_TRAMP; + else + rec->flags &= ~FTRACE_FL_TRAMP;
/* * flags will be cleared in ftrace_check_record() @@ -2041,11 +2048,6 @@ static void print_ip_ins(const char *fmt, const unsigned char *p) printk(KERN_CONT "%s%02x", i ? ":" : "", p[i]); }
-static struct ftrace_ops * -ftrace_find_tramp_ops_any(struct dyn_ftrace *rec); -static struct ftrace_ops * -ftrace_find_tramp_ops_next(struct dyn_ftrace *rec, struct ftrace_ops *ops); - enum ftrace_bug_type ftrace_bug_type; const void *ftrace_expected;
[ Upstream commit 6e6de3dee51a439f76eb73c22ae2ffd2c9384712 ]
Microsoft HyperV disables the X86_FEATURE_SMCA bit on AMD systems, and linux guests boot with repeated errors:
amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2) amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2) amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
The warnings occur because the module code erroneously returns -EEXIST for modules that have failed to load and are in the process of being removed from the module list.
module amd64_edac_mod has a dependency on module edac_mce_amd. Using modules.dep, systemd will load edac_mce_amd for every request of amd64_edac_mod. When the edac_mce_amd module loads, the module has state MODULE_STATE_UNFORMED and once the module load fails and the state becomes MODULE_STATE_GOING. Another request for edac_mce_amd module executes and add_unformed_module() will erroneously return -EEXIST even though the previous instance of edac_mce_amd has MODULE_STATE_GOING. Upon receiving -EEXIST, systemd attempts to load amd64_edac_mod, which fails because of unknown symbols from edac_mce_amd.
add_unformed_module() must wait to return for any case other than MODULE_STATE_LIVE to prevent a race between multiple loads of dependent modules.
Signed-off-by: Prarit Bhargava prarit@redhat.com Signed-off-by: Barret Rhoden brho@google.com Cc: David Arcari darcari@redhat.com Cc: Jessica Yu jeyu@kernel.org Cc: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Jessica Yu jeyu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/module.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/kernel/module.c b/kernel/module.c index 94528b8910278..4b372c14d9a1f 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -3391,8 +3391,7 @@ static bool finished_loading(const char *name) sched_annotate_sleep(); mutex_lock(&module_mutex); mod = find_module_all(name, strlen(name), true); - ret = !mod || mod->state == MODULE_STATE_LIVE - || mod->state == MODULE_STATE_GOING; + ret = !mod || mod->state == MODULE_STATE_LIVE; mutex_unlock(&module_mutex);
return ret; @@ -3560,8 +3559,7 @@ again: mutex_lock(&module_mutex); old = find_module_all(mod->name, strlen(mod->name), true); if (old != NULL) { - if (old->state == MODULE_STATE_COMING - || old->state == MODULE_STATE_UNFORMED) { + if (old->state != MODULE_STATE_LIVE) { /* Wait in case it fails to load. */ mutex_unlock(&module_mutex); err = wait_event_interruptible(module_wq,
[ Upstream commit ba1bc0fcdeaf3bf583c1517bd2e3e29cf223c969 ]
The modification of EXIN register doesn't clean the bitfield before the writing of a new value. After a few modifications the bitfield would accumulate only '1's.
Signed-off-by: Petr Cvek petrcvekcz@gmail.com Signed-off-by: Paul Burton paul.burton@mips.com Cc: hauke@hauke-m.de Cc: john@phrozen.org Cc: linux-mips@vger.kernel.org Cc: openwrt-devel@lists.openwrt.org Cc: pakahmar@hotmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/lantiq/irq.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/mips/lantiq/irq.c b/arch/mips/lantiq/irq.c index c4ef1c31e0c4f..37caeadb2964c 100644 --- a/arch/mips/lantiq/irq.c +++ b/arch/mips/lantiq/irq.c @@ -156,8 +156,9 @@ static int ltq_eiu_settype(struct irq_data *d, unsigned int type) if (edge) irq_set_handler(d->hwirq, handle_edge_irq);
- ltq_eiu_w32(ltq_eiu_r32(LTQ_EIU_EXIN_C) | - (val << (i * 4)), LTQ_EIU_EXIN_C); + ltq_eiu_w32((ltq_eiu_r32(LTQ_EIU_EXIN_C) & + (~(7 << (i * 4)))) | (val << (i * 4)), + LTQ_EIU_EXIN_C); } }
[ Upstream commit 78efb76ab4dfb8f74f290ae743f34162cd627f19 ]
While the .device_prep_slave_sg() callback rejects empty scatterlists, it still accepts single-entry scatterlists with a zero-length segment. These may happen if a driver calls dmaengine_prep_slave_single() with a zero len parameter. The corresponding DMA request will never complete, leading to messages like:
rcar-dmac e7300000.dma-controller: Channel Address Error happen
and DMA timeouts.
Although requesting a zero-length DMA request is a driver bug, rejecting it early eases debugging. Note that the .device_prep_dma_memcpy() callback already rejects requests to copy zero bytes.
Reported-by: Eugeniu Rosca erosca@de.adit-jv.com Analyzed-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/sh/rcar-dmac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/sh/rcar-dmac.c b/drivers/dma/sh/rcar-dmac.c index 77b126525daca..19c7433e83097 100644 --- a/drivers/dma/sh/rcar-dmac.c +++ b/drivers/dma/sh/rcar-dmac.c @@ -1129,7 +1129,7 @@ rcar_dmac_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl, struct rcar_dmac_chan *rchan = to_rcar_dmac_chan(chan);
/* Someone calling slave DMA on a generic channel? */ - if (rchan->mid_rid < 0 || !sg_len) { + if (rchan->mid_rid < 0 || !sg_len || !sg_dma_len(sgl)) { dev_warn(chan->device->dev, "%s: bad parameter: len=%d, id=%d\n", __func__, sg_len, rchan->mid_rid);
[ Upstream commit 0d34dfbf3023cf119b83f6470692c0b10c832495 ]
Full-speed and low-speed USB devices do not work with Tegra210 platforms because of incorrect PLLU/PLLU_OUT1 clock settings.
When full-speed device is connected: [ 14.059886] usb 1-3: new full-speed USB device number 2 using tegra-xusb [ 14.196295] usb 1-3: device descriptor read/64, error -71 [ 14.436311] usb 1-3: device descriptor read/64, error -71 [ 14.675749] usb 1-3: new full-speed USB device number 3 using tegra-xusb [ 14.812335] usb 1-3: device descriptor read/64, error -71 [ 15.052316] usb 1-3: device descriptor read/64, error -71 [ 15.164799] usb usb1-port3: attempt power cycle
When low-speed device is connected: [ 37.610949] usb usb1-port3: Cannot enable. Maybe the USB cable is bad? [ 38.557376] usb usb1-port3: Cannot enable. Maybe the USB cable is bad? [ 38.564977] usb usb1-port3: attempt power cycle
This commit fixes the issue by: 1. initializing PLLU_OUT1 before initializing XUSB_FS_SRC clock because PLLU_OUT1 is parent of XUSB_FS_SRC. 2. changing PLLU post-divider to /2 (DIVP=1) according to Technical Reference Manual.
Fixes: e745f992cf4b ("clk: tegra: Rework pll_u") Signed-off-by: JC Kuo jckuo@nvidia.com Acked-By: Peter De Schrijver pdeschrijver@nvidia.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/tegra/clk-tegra210.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/clk/tegra/clk-tegra210.c b/drivers/clk/tegra/clk-tegra210.c index b92867814e2d5..cb2be154db3bc 100644 --- a/drivers/clk/tegra/clk-tegra210.c +++ b/drivers/clk/tegra/clk-tegra210.c @@ -2057,9 +2057,9 @@ static struct div_nmp pllu_nmp = { };
static struct tegra_clk_pll_freq_table pll_u_freq_table[] = { - { 12000000, 480000000, 40, 1, 0, 0 }, - { 13000000, 480000000, 36, 1, 0, 0 }, /* actual: 468.0 MHz */ - { 38400000, 480000000, 25, 2, 0, 0 }, + { 12000000, 480000000, 40, 1, 1, 0 }, + { 13000000, 480000000, 36, 1, 1, 0 }, /* actual: 468.0 MHz */ + { 38400000, 480000000, 25, 2, 1, 0 }, { 0, 0, 0, 0, 0, 0 }, };
@@ -2983,6 +2983,7 @@ static struct tegra_clk_init_table init_table[] __initdata = { { TEGRA210_CLK_DFLL_REF, TEGRA210_CLK_PLL_P, 51000000, 1 }, { TEGRA210_CLK_SBC4, TEGRA210_CLK_PLL_P, 12000000, 1 }, { TEGRA210_CLK_PLL_RE_VCO, TEGRA210_CLK_CLK_MAX, 672000000, 1 }, + { TEGRA210_CLK_PLL_U_OUT1, TEGRA210_CLK_CLK_MAX, 48000000, 1 }, { TEGRA210_CLK_XUSB_GATE, TEGRA210_CLK_CLK_MAX, 0, 1 }, { TEGRA210_CLK_XUSB_SS_SRC, TEGRA210_CLK_PLL_U_480M, 120000000, 0 }, { TEGRA210_CLK_XUSB_FS_SRC, TEGRA210_CLK_PLL_U_48M, 48000000, 0 }, @@ -3008,7 +3009,6 @@ static struct tegra_clk_init_table init_table[] __initdata = { { TEGRA210_CLK_PLL_DP, TEGRA210_CLK_CLK_MAX, 270000000, 0 }, { TEGRA210_CLK_SOC_THERM, TEGRA210_CLK_PLL_P, 51000000, 0 }, { TEGRA210_CLK_CCLK_G, TEGRA210_CLK_CLK_MAX, 0, 1 }, - { TEGRA210_CLK_PLL_U_OUT1, TEGRA210_CLK_CLK_MAX, 48000000, 1 }, { TEGRA210_CLK_PLL_U_OUT2, TEGRA210_CLK_CLK_MAX, 60000000, 1 }, /* This MUST be the last entry. */ { TEGRA210_CLK_CLK_MAX, TEGRA210_CLK_CLK_MAX, 0, 0 },
[ Upstream commit 5808b14a1f52554de612fee85ef517199855e310 ]
Fix a use-after-free bug during filesystem initialisation, where we access the disc record (which is stored in a buffer) after we have released the buffer.
Signed-off-by: Russell King rmk+kernel@armlinux.org.uk Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/adfs/super.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/adfs/super.c b/fs/adfs/super.c index c9fdfb1129335..e42c300015090 100644 --- a/fs/adfs/super.c +++ b/fs/adfs/super.c @@ -368,6 +368,7 @@ static int adfs_fill_super(struct super_block *sb, void *data, int silent) struct buffer_head *bh; struct object_info root_obj; unsigned char *b_data; + unsigned int blocksize; struct adfs_sb_info *asb; struct inode *root; int ret = -EINVAL; @@ -419,8 +420,10 @@ static int adfs_fill_super(struct super_block *sb, void *data, int silent) goto error_free_bh; }
+ blocksize = 1 << dr->log2secsize; brelse(bh); - if (sb_set_blocksize(sb, 1 << dr->log2secsize)) { + + if (sb_set_blocksize(sb, blocksize)) { bh = sb_bread(sb, ADFS_DISCRECORD / sb->s_blocksize); if (!bh) { adfs_error(sb, "couldn't read superblock on "
[ Upstream commit 0ee5f8ae082e1f675a2fb6db601c31ac9958a134 ]
The list of profiles in btrfs_chunk_max_errors lists DUP as a profile DUP able to tolerate 1 device missing. Though this profile is special with 2 copies, it still needs the device, unlike the others.
Looking at the history of changes, thre's no clear reason why DUP is there, functions were refactored and blocks of code merged to one helper.
d20983b40e828 Btrfs: fix writing data into the seed filesystem - factor code to a helper
de11cc12df173 Btrfs: don't pre-allocate btrfs bio - unrelated change, DUP still in the list with max errors 1
a236aed14ccb0 Btrfs: Deal with failed writes in mirrored configurations - introduced the max errors, leaves DUP and RAID1 in the same group
Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 85294fef10514..358e930df4acd 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -5019,8 +5019,7 @@ static inline int btrfs_chunk_max_errors(struct map_lookup *map)
if (map->type & (BTRFS_BLOCK_GROUP_RAID1 | BTRFS_BLOCK_GROUP_RAID10 | - BTRFS_BLOCK_GROUP_RAID5 | - BTRFS_BLOCK_GROUP_DUP)) { + BTRFS_BLOCK_GROUP_RAID5)) { max_errors = 1; } else if (map->type & BTRFS_BLOCK_GROUP_RAID6) { max_errors = 2;
[ Upstream commit f2caf901c1b7ce65f9e6aef4217e3241039db768 ]
There is a race condition with how we send (or supress and don't send) smb echos that will cause the client to incorrectly think the server is unresponsive and thus needs to be reconnected.
Summary of the race condition: 1) Daisy chaining scheduling creates a gap. 2) If traffic comes unfortunate shortly after the last echo, the planned echo is suppressed. 3) Due to the gap, the next echo transmission is delayed until after the timeout, which is set hard to twice the echo interval.
This is fixed by changing the timeouts from 2 to three times the echo interval.
Detailed description of the bug: https://lutz.donnerhacke.de/eng/Blog/Groundhog-Day-with-SMB-remount
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Reviewed-by: Pavel Shilovsky pshilov@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/connect.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index 33cd844579aed..57c62ff4e8d6d 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -554,10 +554,10 @@ static bool server_unresponsive(struct TCP_Server_Info *server) { /* - * We need to wait 2 echo intervals to make sure we handle such + * We need to wait 3 echo intervals to make sure we handle such * situations right: * 1s client sends a normal SMB request - * 2s client gets a response + * 3s client gets a response * 30s echo workqueue job pops, and decides we got a response recently * and don't need to send another * ... @@ -566,9 +566,9 @@ server_unresponsive(struct TCP_Server_Info *server) */ if ((server->tcpStatus == CifsGood || server->tcpStatus == CifsNeedNegotiate) && - time_after(jiffies, server->lstrp + 2 * server->echo_interval)) { + time_after(jiffies, server->lstrp + 3 * server->echo_interval)) { cifs_dbg(VFS, "Server %s has not responded in %lu seconds. Reconnecting...\n", - server->hostname, (2 * server->echo_interval) / HZ); + server->hostname, (3 * server->echo_interval) / HZ); cifs_reconnect(server); wake_up(&server->response_q); return true;
[ Upstream commit 749607731e26dfb2558118038c40e9c0c80d23b5 ]
This barrier only applies to the read-modify-write operations; in particular, it does not apply to the atomic64_set() primitive.
Replace the barrier with an smp_mb().
Fixes: fdd4e15838e59 ("ceph: rework dcache readdir") Reported-by: "Paul E. McKenney" paulmck@linux.ibm.com Reported-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Andrea Parri andrea.parri@amarulasolutions.com Reviewed-by: "Yan, Zheng" zyan@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/super.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 3e27a28aa44ad..60b70f0985f67 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -517,7 +517,12 @@ static inline void __ceph_dir_set_complete(struct ceph_inode_info *ci, long long release_count, long long ordered_count) { - smp_mb__before_atomic(); + /* + * Makes sure operations that setup readdir cache (update page + * cache and i_size) are strongly ordered w.r.t. the following + * atomic64_set() operations. + */ + smp_mb(); atomic64_set(&ci->i_complete_seq[0], release_count); atomic64_set(&ci->i_complete_seq[1], ordered_count); }
[ Upstream commit 3b421018f48c482bdc9650f894aa1747cf90e51d ]
The getxattr manpage states that we should return ERANGE if the destination buffer size is too small to hold the value. ceph_vxattrcb_layout does this internally, but we should be doing this for all vxattrs.
Fix the only caller of getxattr_cb to check the returned size against the buffer length and return -ERANGE if it doesn't fit. Drop the same check in ceph_vxattrcb_layout and just rely on the caller to handle it.
Signed-off-by: Jeff Layton jlayton@kernel.org Reviewed-by: "Yan, Zheng" zyan@redhat.com Acked-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/xattr.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c index e1c4e0b12b4cd..0376db8a74f85 100644 --- a/fs/ceph/xattr.c +++ b/fs/ceph/xattr.c @@ -75,7 +75,7 @@ static size_t ceph_vxattrcb_layout(struct ceph_inode_info *ci, char *val, const char *ns_field = " pool_namespace="; char buf[128]; size_t len, total_len = 0; - int ret; + ssize_t ret;
pool_ns = ceph_try_get_string(ci->i_layout.pool_ns);
@@ -99,11 +99,8 @@ static size_t ceph_vxattrcb_layout(struct ceph_inode_info *ci, char *val, if (pool_ns) total_len += strlen(ns_field) + pool_ns->len;
- if (!size) { - ret = total_len; - } else if (total_len > size) { - ret = -ERANGE; - } else { + ret = total_len; + if (size >= total_len) { memcpy(val, buf, len); ret = len; if (pool_name) { @@ -761,8 +758,11 @@ ssize_t __ceph_getxattr(struct inode *inode, const char *name, void *value, if (err) return err; err = -ENODATA; - if (!(vxattr->exists_cb && !vxattr->exists_cb(ci))) + if (!(vxattr->exists_cb && !vxattr->exists_cb(ci))) { err = vxattr->getxattr_cb(ci, value, size); + if (size && size < err) + err = -ERANGE; + } return err; }
[ Upstream commit b80d6a42bdc97bdb6139107d6034222e9843c6e2 ]
When CONFIG_DMI is disabled, we only have a tentative declaration, which causes a warning from clang:
drivers/acpi/blacklist.c:20:35: error: tentative array definition assumed to have one element [-Werror] static const struct dmi_system_id acpi_rev_dmi_table[] __initconst;
As the variable is not actually used here, hide it entirely in an #ifdef to shut up the warning.
Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/blacklist.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/acpi/blacklist.c b/drivers/acpi/blacklist.c index 995c4d8922b12..761f0c19a4512 100644 --- a/drivers/acpi/blacklist.c +++ b/drivers/acpi/blacklist.c @@ -30,7 +30,9 @@
#include "internal.h"
+#ifdef CONFIG_DMI static const struct dmi_system_id acpi_rev_dmi_table[] __initconst; +#endif
/* * POLICY: If *anything* doesn't work, put it on the blacklist. @@ -74,7 +76,9 @@ int __init acpi_blacklisted(void) }
(void)early_acpi_osi_init(); +#ifdef CONFIG_DMI dmi_check_system(acpi_rev_dmi_table); +#endif
return blacklisted; }
[ Upstream commit 484647088826f2f651acbda6bcf9536b8a466703 ]
GCC v9 emits this warning: CC drivers/s390/scsi/zfcp_erp.o drivers/s390/scsi/zfcp_erp.c: In function 'zfcp_erp_action_enqueue': drivers/s390/scsi/zfcp_erp.c:217:26: warning: 'erp_action' may be used uninitialized in this function [-Wmaybe-uninitialized] 217 | struct zfcp_erp_action *erp_action; | ^~~~~~~~~~
This is a possible false positive case, as also documented in the GCC documentations: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wmaybe-uniniti...
The actual code-sequence is like this: Various callers can invoke the function below with the argument "want" being one of: ZFCP_ERP_ACTION_REOPEN_ADAPTER, ZFCP_ERP_ACTION_REOPEN_PORT_FORCED, ZFCP_ERP_ACTION_REOPEN_PORT, or ZFCP_ERP_ACTION_REOPEN_LUN.
zfcp_erp_action_enqueue(want, ...) ... need = zfcp_erp_required_act(want, ...) need = want ... maybe: need = ZFCP_ERP_ACTION_REOPEN_PORT maybe: need = ZFCP_ERP_ACTION_REOPEN_ADAPTER ... return need ... zfcp_erp_setup_act(need, ...) struct zfcp_erp_action *erp_action; // <== line 217 ... switch(need) { case ZFCP_ERP_ACTION_REOPEN_LUN: ... erp_action = &zfcp_sdev->erp_action; WARN_ON_ONCE(erp_action->port != port); // <== access ... break; case ZFCP_ERP_ACTION_REOPEN_PORT: case ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: ... erp_action = &port->erp_action; WARN_ON_ONCE(erp_action->port != port); // <== access ... break; case ZFCP_ERP_ACTION_REOPEN_ADAPTER: ... erp_action = &adapter->erp_action; WARN_ON_ONCE(erp_action->port != NULL); // <== access ... break; } ... WARN_ON_ONCE(erp_action->adapter != adapter); // <== access
When zfcp_erp_setup_act() is called, 'need' will never be anything else than one of the 4 possible enumeration-names that are used in the switch-case, and 'erp_action' is initialized for every one of them, before it is used. Thus the warning is a false positive, as documented.
We introduce the extra if{} in the beginning to create an extra code-flow, so the compiler can be convinced that the switch-case will never see any other value.
BUG_ON()/BUG() is intentionally not used to not crash anything, should this ever happen anyway - right now it's impossible, as argued above; and it doesn't introduce a 'default:' switch-case to retain warnings should 'enum zfcp_erp_act_type' ever be extended and no explicit case be introduced. See also v5.0 commit 399b6c8bc9f7 ("scsi: zfcp: drop old default switch case which might paper over missing case").
Signed-off-by: Benjamin Block bblock@linux.ibm.com Reviewed-by: Jens Remus jremus@linux.ibm.com Reviewed-by: Steffen Maier maier@linux.ibm.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/s390/scsi/zfcp_erp.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c index 6d5065f679acf..64d70de98cdb6 100644 --- a/drivers/s390/scsi/zfcp_erp.c +++ b/drivers/s390/scsi/zfcp_erp.c @@ -11,6 +11,7 @@ #define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
#include <linux/kthread.h> +#include <linux/bug.h> #include "zfcp_ext.h" #include "zfcp_reqlist.h"
@@ -245,6 +246,12 @@ static struct zfcp_erp_action *zfcp_erp_setup_act(int need, u32 act_status, struct zfcp_erp_action *erp_action; struct zfcp_scsi_dev *zfcp_sdev;
+ if (WARN_ON_ONCE(need != ZFCP_ERP_ACTION_REOPEN_LUN && + need != ZFCP_ERP_ACTION_REOPEN_PORT && + need != ZFCP_ERP_ACTION_REOPEN_PORT_FORCED && + need != ZFCP_ERP_ACTION_REOPEN_ADAPTER)) + return NULL; + switch (need) { case ZFCP_ERP_ACTION_REOPEN_LUN: zfcp_sdev = sdev_to_zfcp(sdev);
[ Upstream commit a6a6d3b1f867d34ba5bd61aa7bb056b48ca67cff ]
clang finds a contruct suspicious that converts an unsigned character to a signed integer and back, causing an overflow:
arch/x86/kvm/mmu.c:4605:39: error: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -205 to 51 [-Werror,-Wconstant-conversion] u8 wf = (pfec & PFERR_WRITE_MASK) ? ~w : 0; ~~ ^~ arch/x86/kvm/mmu.c:4607:38: error: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -241 to 15 [-Werror,-Wconstant-conversion] u8 uf = (pfec & PFERR_USER_MASK) ? ~u : 0; ~~ ^~ arch/x86/kvm/mmu.c:4609:39: error: implicit conversion from 'int' to 'u8' (aka 'unsigned char') changes value from -171 to 85 [-Werror,-Wconstant-conversion] u8 ff = (pfec & PFERR_FETCH_MASK) ? ~x : 0; ~~ ^~
Add an explicit cast to tell clang that everything works as intended here.
Signed-off-by: Arnd Bergmann arnd@arndb.de Link: https://github.com/ClangBuiltLinux/linux/issues/95 Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/mmu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index f97b533bc6e68..87a0601b1c204 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -4313,11 +4313,11 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu, */
/* Faults from writes to non-writable pages */ - u8 wf = (pfec & PFERR_WRITE_MASK) ? ~w : 0; + u8 wf = (pfec & PFERR_WRITE_MASK) ? (u8)~w : 0; /* Faults from user mode accesses to supervisor pages */ - u8 uf = (pfec & PFERR_USER_MASK) ? ~u : 0; + u8 uf = (pfec & PFERR_USER_MASK) ? (u8)~u : 0; /* Faults from fetches of non-executable pages*/ - u8 ff = (pfec & PFERR_FETCH_MASK) ? ~x : 0; + u8 ff = (pfec & PFERR_FETCH_MASK) ? (u8)~x : 0; /* Faults from kernel mode fetches of user pages */ u8 smepf = 0; /* Faults from kernel mode accesses of user pages */
[ Upstream commit dfd6f9ad36368b8dbd5f5a2b2f0a4705ae69a323 ]
clang gets confused by an uninitialized variable in what looks to it like a never executed code path:
arch/x86/kernel/acpi/boot.c:618:13: error: variable 'polarity' is uninitialized when used here [-Werror,-Wuninitialized] polarity = polarity ? ACPI_ACTIVE_LOW : ACPI_ACTIVE_HIGH; ^~~~~~~~ arch/x86/kernel/acpi/boot.c:606:32: note: initialize the variable 'polarity' to silence this warning int rc, irq, trigger, polarity; ^ = 0 arch/x86/kernel/acpi/boot.c:617:12: error: variable 'trigger' is uninitialized when used here [-Werror,-Wuninitialized] trigger = trigger ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE; ^~~~~~~ arch/x86/kernel/acpi/boot.c:606:22: note: initialize the variable 'trigger' to silence this warning int rc, irq, trigger, polarity; ^ = 0
This is unfortunately a design decision in clang and won't be fixed.
Changing the acpi_get_override_irq() macro to an inline function reliably avoids the issue.
Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Nathan Chancellor natechancellor@gmail.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/acpi.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/linux/acpi.h b/include/linux/acpi.h index 13c105121a185..d7a9700b93339 100644 --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -324,7 +324,10 @@ void acpi_set_irq_model(enum acpi_irq_model_id model, #ifdef CONFIG_X86_IO_APIC extern int acpi_get_override_irq(u32 gsi, int *trigger, int *polarity); #else -#define acpi_get_override_irq(gsi, trigger, polarity) (-1) +static inline int acpi_get_override_irq(u32 gsi, int *trigger, int *polarity) +{ + return -1; +} #endif /* * This function undoes the effect of one call to acpi_register_gsi().
[ Upstream commit 7429c6c0d9cb086d8e79f0d2a48ae14851d2115e ]
While changing the number of interrupt channels, be2net stops adapter operation (including netif_tx_disable()) but it doesn't signal that it cannot transmit. This may lead dev_watchdog() to falsely trigger during that time.
Add the missing call to netif_carrier_off(), following the pattern used in many other drivers. netif_carrier_on() is already taken care of in be_open().
Signed-off-by: Benjamin Poirier bpoirier@suse.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/emulex/benet/be_main.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c index 39f399741647f..cabeb1790db76 100644 --- a/drivers/net/ethernet/emulex/benet/be_main.c +++ b/drivers/net/ethernet/emulex/benet/be_main.c @@ -4600,8 +4600,12 @@ int be_update_queues(struct be_adapter *adapter) struct net_device *netdev = adapter->netdev; int status;
- if (netif_running(netdev)) + if (netif_running(netdev)) { + /* device cannot transmit now, avoid dev_watchdog timeouts */ + netif_carrier_off(netdev); + be_close(netdev); + }
be_cancel_worker(adapter);
[ Upstream commit ec6335586953b0df32f83ef696002063090c7aef ]
There are many compiler warnings like this,
In file included from ./arch/x86/include/asm/smp.h:13, from ./arch/x86/include/asm/mmzone_64.h:11, from ./arch/x86/include/asm/mmzone.h:5, from ./include/linux/mmzone.h:969, from ./include/linux/gfp.h:6, from ./include/linux/mm.h:10, from arch/x86/kernel/apic/io_apic.c:34: arch/x86/kernel/apic/io_apic.c: In function 'check_timer': ./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] if ((v) <= apic_verbosity) \ ^~ arch/x86/kernel/apic/io_apic.c:2160:2: note: in expansion of macro 'apic_printk' apic_printk(APIC_QUIET, KERN_INFO "..TIMER: vector=0x%02X " ^~~~~~~~~~~ ./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] if ((v) <= apic_verbosity) \ ^~ arch/x86/kernel/apic/io_apic.c:2207:4: note: in expansion of macro 'apic_printk' apic_printk(APIC_QUIET, KERN_ERR "..MP-BIOS bug: " ^~~~~~~~~~~
APIC_QUIET is 0, so silence them by making apic_verbosity type int.
Signed-off-by: Qian Cai cai@lca.pw Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/1562621805-24789-1-git-send-email-cai@lca.pw Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/apic.h | 2 +- arch/x86/kernel/apic/apic.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index a1ed92aae12a6..25a5a5c6ae90a 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -48,7 +48,7 @@ static inline void generic_apic_probe(void)
#ifdef CONFIG_X86_LOCAL_APIC
-extern unsigned int apic_verbosity; +extern int apic_verbosity; extern int local_apic_timer_c2_ok;
extern int disable_apic; diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 2e64178f284da..ae410f7585f16 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -182,7 +182,7 @@ EXPORT_SYMBOL_GPL(local_apic_timer_c2_ok); /* * Debug level, exported for io_apic.c */ -unsigned int apic_verbosity; +int apic_verbosity;
int pic_mode;
[ Upstream commit 29e7e9664aec17b94a9c8c5a75f8d216a206aa3a ]
clang warns about a few parts of the math-emu implementation where a 16-bit integer becomes negative during assignment:
arch/x86/math-emu/poly_tan.c:88:35: error: implicit conversion from 'int' to 'short' changes value from 49216 to -16320 [-Werror,-Wconstant-conversion] (0x41 + EXTENDED_Ebias) | SIGN_Negative); ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~ arch/x86/math-emu/fpu_emu.h:180:58: note: expanded from macro 'setexponent16' #define setexponent16(x,y) { (*(short *)&((x)->exp)) = (y); } ~ ^ arch/x86/math-emu/reg_constant.c:37:32: error: implicit conversion from 'int' to 'short' changes value from 49085 to -16451 [-Werror,-Wconstant-conversion] FPU_REG const CONST_PI2extra = MAKE_REG(NEG, -66, ^~~~~~~~~~~~~~~~~~ arch/x86/math-emu/reg_constant.c:21:25: note: expanded from macro 'MAKE_REG' ((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) } ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~ arch/x86/math-emu/reg_constant.c:48:28: error: implicit conversion from 'int' to 'short' changes value from 65535 to -1 [-Werror,-Wconstant-conversion] FPU_REG const CONST_QNaN = MAKE_REG(NEG, EXP_OVER, 0x00000000, 0xC0000000); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/x86/math-emu/reg_constant.c:21:25: note: expanded from macro 'MAKE_REG' ((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) } ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
The code is correct as is, so add a typecast to shut up the warnings.
Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/20190712090816.350668-1-arnd@arndb.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/math-emu/fpu_emu.h | 2 +- arch/x86/math-emu/reg_constant.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/math-emu/fpu_emu.h b/arch/x86/math-emu/fpu_emu.h index a5a41ec580721..0c122226ca56f 100644 --- a/arch/x86/math-emu/fpu_emu.h +++ b/arch/x86/math-emu/fpu_emu.h @@ -177,7 +177,7 @@ static inline void reg_copy(FPU_REG const *x, FPU_REG *y) #define setexponentpos(x,y) { (*(short *)&((x)->exp)) = \ ((y) + EXTENDED_Ebias) & 0x7fff; } #define exponent16(x) (*(short *)&((x)->exp)) -#define setexponent16(x,y) { (*(short *)&((x)->exp)) = (y); } +#define setexponent16(x,y) { (*(short *)&((x)->exp)) = (u16)(y); } #define addexponent(x,y) { (*(short *)&((x)->exp)) += (y); } #define stdexp(x) { (*(short *)&((x)->exp)) += EXTENDED_Ebias; }
diff --git a/arch/x86/math-emu/reg_constant.c b/arch/x86/math-emu/reg_constant.c index 8dc9095bab224..742619e94bdf2 100644 --- a/arch/x86/math-emu/reg_constant.c +++ b/arch/x86/math-emu/reg_constant.c @@ -18,7 +18,7 @@ #include "control_w.h"
#define MAKE_REG(s, e, l, h) { l, h, \ - ((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) } + (u16)((EXTENDED_Ebias+(e)) | ((SIGN_##s != 0)*0x8000)) }
FPU_REG const CONST_1 = MAKE_REG(POS, 0, 0x00000000, 0x80000000); #if 0
[ Upstream commit c633324e311243586675e732249339685e5d6faa ]
The description of cma_declare_contiguous() indicates that if the 'fixed' argument is true the reserved contiguous area must be exactly at the address of the 'base' argument.
However, the function currently allows the 'base', 'size', and 'limit' arguments to be silently adjusted to meet alignment constraints. This commit enforces the documented behavior through explicit checks that return an error if the region does not fit within a specified region.
Link: http://lkml.kernel.org/r/1561422051-16142-1-git-send-email-opendmb@gmail.com Fixes: 5ea3b1b2f8ad ("cma: add placement specifier for "cma=" kernel parameter") Signed-off-by: Doug Berger opendmb@gmail.com Acked-by: Michal Nazarewicz mina86@mina86.com Cc: Yue Hu huyue2@yulong.com Cc: Mike Rapoport rppt@linux.ibm.com Cc: Laura Abbott labbott@redhat.com Cc: Peng Fan peng.fan@nxp.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Marek Szyprowski m.szyprowski@samsung.com Cc: Andrey Konovalov andreyknvl@google.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/cma.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/mm/cma.c b/mm/cma.c index 56761e40d1918..c4a34c813d470 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -277,6 +277,12 @@ int __init cma_declare_contiguous(phys_addr_t base, */ alignment = max(alignment, (phys_addr_t)PAGE_SIZE << max_t(unsigned long, MAX_ORDER - 1, pageblock_order)); + if (fixed && base & (alignment - 1)) { + ret = -EINVAL; + pr_err("Region at %pa must be aligned to %pa bytes\n", + &base, &alignment); + goto err; + } base = ALIGN(base, alignment); size = ALIGN(size, alignment); limit &= ~(alignment - 1); @@ -307,6 +313,13 @@ int __init cma_declare_contiguous(phys_addr_t base, if (limit == 0 || limit > memblock_end) limit = memblock_end;
+ if (base + size > limit) { + ret = -EINVAL; + pr_err("Size (%pa) of region at %pa exceeds limit (%pa)\n", + &size, &base, &limit); + goto err; + } + /* Reserve memory */ if (fixed) { if (memblock_is_region_reserved(base, size) ||
[ Upstream commit 02551c23bcd85f0c68a8259c7b953d49d44f86af ]
When fget fails, the lack of error-handling code may cause unexpected results.
This patch adds error-handling code after calling fget.
Link: http://lkml.kernel.org/r/2514ec03df9c33b86e56748513267a80dd8004d9.1558117389... Signed-off-by: Zhouyang Jia jiazhouyang09@gmail.com Signed-off-by: Jan Harkes jaharkes@cs.cmu.edu Cc: Arnd Bergmann arnd@arndb.de Cc: Colin Ian King colin.king@canonical.com Cc: Dan Carpenter dan.carpenter@oracle.com Cc: David Howells dhowells@redhat.com Cc: Fabian Frederick fabf@skynet.be Cc: Mikko Rapeli mikko.rapeli@iki.fi Cc: Sam Protsenko semen.protsenko@linaro.org Cc: Yann Droneaud ydroneaud@opteya.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/coda/psdev.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/coda/psdev.c b/fs/coda/psdev.c index f40e3953e7fe3..a6d9e841a375c 100644 --- a/fs/coda/psdev.c +++ b/fs/coda/psdev.c @@ -187,8 +187,11 @@ static ssize_t coda_psdev_write(struct file *file, const char __user *buf, if (req->uc_opcode == CODA_OPEN_BY_FD) { struct coda_open_by_fd_out *outp = (struct coda_open_by_fd_out *)req->uc_data; - if (!outp->oh.result) + if (!outp->oh.result) { outp->fh = fget(outp->fd); + if (!outp->fh) + return -EBADF; + } }
wake_up(&req->uc_sleep);
[ Upstream commit b2a57e334086602be56b74958d9f29b955cd157f ]
The kernel is self-contained project and can be built with bare-metal toolchain. But bare-metal toolchain doesn't define __linux__. Because of this u_quad_t type is not defined when using bare-metal toolchain and codafs build fails. This patch fixes it by defining u_quad_t type unconditionally.
Link: http://lkml.kernel.org/r/3cbb40b0a57b6f9923a9d67b53473c0b691a3eaa.1558117389... Signed-off-by: Sam Protsenko semen.protsenko@linaro.org Signed-off-by: Jan Harkes jaharkes@cs.cmu.edu Cc: Arnd Bergmann arnd@arndb.de Cc: Colin Ian King colin.king@canonical.com Cc: Dan Carpenter dan.carpenter@oracle.com Cc: David Howells dhowells@redhat.com Cc: Fabian Frederick fabf@skynet.be Cc: Mikko Rapeli mikko.rapeli@iki.fi Cc: Yann Droneaud ydroneaud@opteya.com Cc: Zhouyang Jia jiazhouyang09@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/coda.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/include/linux/coda.h b/include/linux/coda.h index d30209b9cef81..0ca0c83fdb1c4 100644 --- a/include/linux/coda.h +++ b/include/linux/coda.h @@ -58,8 +58,7 @@ Mellon the rights to redistribute these changes without encumbrance. #ifndef _CODA_HEADER_ #define _CODA_HEADER_
-#if defined(__linux__) typedef unsigned long long u_quad_t; -#endif + #include <uapi/linux/coda.h> #endif
[ Upstream commit f90fb3c7e2c13ae829db2274b88b845a75038b8a ]
Only users of upc_req in kernel side fs/coda/psdev.c and fs/coda/upcall.c already include linux/coda_psdev.h.
Suggested by Jan Harkes jaharkes@cs.cmu.edu in https://lore.kernel.org/lkml/20150531111913.GA23377@cs.cmu.edu/
Fixes these include/uapi/linux/coda_psdev.h compilation errors in userspace:
linux/coda_psdev.h:12:19: error: field `uc_chain' has incomplete type struct list_head uc_chain; ^ linux/coda_psdev.h:13:2: error: unknown type name `caddr_t' caddr_t uc_data; ^ linux/coda_psdev.h:14:2: error: unknown type name `u_short' u_short uc_flags; ^ linux/coda_psdev.h:15:2: error: unknown type name `u_short' u_short uc_inSize; /* Size is at most 5000 bytes */ ^ linux/coda_psdev.h:16:2: error: unknown type name `u_short' u_short uc_outSize; ^ linux/coda_psdev.h:17:2: error: unknown type name `u_short' u_short uc_opcode; /* copied from data to save lookup */ ^ linux/coda_psdev.h:19:2: error: unknown type name `wait_queue_head_t' wait_queue_head_t uc_sleep; /* process' wait queue */ ^
Link: http://lkml.kernel.org/r/9f99f5ce6a0563d5266e6cf7aa9585aac2cae971.1558117389... Signed-off-by: Mikko Rapeli mikko.rapeli@iki.fi Signed-off-by: Jan Harkes jaharkes@cs.cmu.edu Cc: Arnd Bergmann arnd@arndb.de Cc: Colin Ian King colin.king@canonical.com Cc: Dan Carpenter dan.carpenter@oracle.com Cc: David Howells dhowells@redhat.com Cc: Fabian Frederick fabf@skynet.be Cc: Sam Protsenko semen.protsenko@linaro.org Cc: Yann Droneaud ydroneaud@opteya.com Cc: Zhouyang Jia jiazhouyang09@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/coda_psdev.h | 11 +++++++++++ include/uapi/linux/coda_psdev.h | 13 ------------- 2 files changed, 11 insertions(+), 13 deletions(-)
diff --git a/include/linux/coda_psdev.h b/include/linux/coda_psdev.h index 15170954aa2b3..57d2b2faf6a3e 100644 --- a/include/linux/coda_psdev.h +++ b/include/linux/coda_psdev.h @@ -19,6 +19,17 @@ struct venus_comm { struct mutex vc_mutex; };
+/* messages between coda filesystem in kernel and Venus */ +struct upc_req { + struct list_head uc_chain; + caddr_t uc_data; + u_short uc_flags; + u_short uc_inSize; /* Size is at most 5000 bytes */ + u_short uc_outSize; + u_short uc_opcode; /* copied from data to save lookup */ + int uc_unique; + wait_queue_head_t uc_sleep; /* process' wait queue */ +};
static inline struct venus_comm *coda_vcp(struct super_block *sb) { diff --git a/include/uapi/linux/coda_psdev.h b/include/uapi/linux/coda_psdev.h index aa6623efd2dd0..d50d51a57fe4e 100644 --- a/include/uapi/linux/coda_psdev.h +++ b/include/uapi/linux/coda_psdev.h @@ -7,19 +7,6 @@ #define CODA_PSDEV_MAJOR 67 #define MAX_CODADEVS 5 /* how many do we allow */
- -/* messages between coda filesystem in kernel and Venus */ -struct upc_req { - struct list_head uc_chain; - caddr_t uc_data; - u_short uc_flags; - u_short uc_inSize; /* Size is at most 5000 bytes */ - u_short uc_outSize; - u_short uc_opcode; /* copied from data to save lookup */ - int uc_unique; - wait_queue_head_t uc_sleep; /* process' wait queue */ -}; - #define CODA_REQ_ASYNC 0x1 #define CODA_REQ_READ 0x2 #define CODA_REQ_WRITE 0x4
[ Upstream commit 156e0b1a8112b76e351684ac948c59757037ac36 ]
The dev_info.name[] array has space for RIO_MAX_DEVNAME_SZ + 1 characters. But the problem here is that we don't ensure that the user put a NUL terminator on the end of the string. It could lead to an out of bounds read.
Link: http://lkml.kernel.org/r/20190529110601.GB19119@mwanda Fixes: e8de370188d0 ("rapidio: add mport char device driver") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Acked-by: Alexandre Bounine alex.bou9@gmail.com Cc: Ira Weiny ira.weiny@intel.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/rapidio/devices/rio_mport_cdev.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c index 76afe1449cab1..ecd71efe8ea00 100644 --- a/drivers/rapidio/devices/rio_mport_cdev.c +++ b/drivers/rapidio/devices/rio_mport_cdev.c @@ -1742,6 +1742,7 @@ static int rio_mport_add_riodev(struct mport_cdev_priv *priv,
if (copy_from_user(&dev_info, arg, sizeof(dev_info))) return -EFAULT; + dev_info.name[sizeof(dev_info.name) - 1] = '\0';
rmcd_debug(RDEV, "name:%s ct:0x%x did:0x%x hc:0x%x", dev_info.name, dev_info.comptag, dev_info.destid, dev_info.hopcount); @@ -1873,6 +1874,7 @@ static int rio_mport_del_riodev(struct mport_cdev_priv *priv, void __user *arg)
if (copy_from_user(&dev_info, arg, sizeof(dev_info))) return -EFAULT; + dev_info.name[sizeof(dev_info.name) - 1] = '\0';
mport = priv->md->mport;
[ Upstream commit a318f12ed8843cfac53198390c74a565c632f417 ]
Andreas Christoforou reported:
UBSAN: Undefined behaviour in ipc/mqueue.c:414:49 signed integer overflow: 9 * 2305843009213693951 cannot be represented in type 'long int' ... Call Trace: mqueue_evict_inode+0x8e7/0xa10 ipc/mqueue.c:414 evict+0x472/0x8c0 fs/inode.c:558 iput_final fs/inode.c:1547 [inline] iput+0x51d/0x8c0 fs/inode.c:1573 mqueue_get_inode+0x8eb/0x1070 ipc/mqueue.c:320 mqueue_create_attr+0x198/0x440 ipc/mqueue.c:459 vfs_mkobj+0x39e/0x580 fs/namei.c:2892 prepare_open ipc/mqueue.c:731 [inline] do_mq_open+0x6da/0x8e0 ipc/mqueue.c:771
Which could be triggered by:
struct mq_attr attr = { .mq_flags = 0, .mq_maxmsg = 9, .mq_msgsize = 0x1fffffffffffffff, .mq_curmsgs = 0, };
if (mq_open("/testing", 0x40, 3, &attr) == (mqd_t) -1) perror("mq_open");
mqueue_get_inode() was correctly rejecting the giant mq_msgsize, and preparing to return -EINVAL. During the cleanup, it calls mqueue_evict_inode() which performed resource usage tracking math for updating "user", before checking if there was a valid "user" at all (which would indicate that the calculations would be sane). Instead, delay this check to after seeing a valid "user".
The overflow was real, but the results went unused, so while the flaw is harmless, it's noisy for kernel fuzzers, so just fix it by moving the calculation under the non-NULL "user" where it actually gets used.
Link: http://lkml.kernel.org/r/201906072207.ECB65450@keescook Signed-off-by: Kees Cook keescook@chromium.org Reported-by: Andreas Christoforou andreaschristofo@gmail.com Acked-by: "Eric W. Biederman" ebiederm@xmission.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: Arnd Bergmann arnd@arndb.de Cc: Davidlohr Bueso dave@stgolabs.net Cc: Manfred Spraul manfred@colorfullife.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- ipc/mqueue.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/ipc/mqueue.c b/ipc/mqueue.c index 5c0ae912f2f25..dccd4ecb786ac 100644 --- a/ipc/mqueue.c +++ b/ipc/mqueue.c @@ -372,7 +372,6 @@ static void mqueue_evict_inode(struct inode *inode) { struct mqueue_inode_info *info; struct user_struct *user; - unsigned long mq_bytes, mq_treesize; struct ipc_namespace *ipc_ns; struct msg_msg *msg, *nmsg; LIST_HEAD(tmp_msg); @@ -395,16 +394,18 @@ static void mqueue_evict_inode(struct inode *inode) free_msg(msg); }
- /* Total amount of bytes accounted for the mqueue */ - mq_treesize = info->attr.mq_maxmsg * sizeof(struct msg_msg) + - min_t(unsigned int, info->attr.mq_maxmsg, MQ_PRIO_MAX) * - sizeof(struct posix_msg_tree_node); - - mq_bytes = mq_treesize + (info->attr.mq_maxmsg * - info->attr.mq_msgsize); - user = info->user; if (user) { + unsigned long mq_bytes, mq_treesize; + + /* Total amount of bytes accounted for the mqueue */ + mq_treesize = info->attr.mq_maxmsg * sizeof(struct msg_msg) + + min_t(unsigned int, info->attr.mq_maxmsg, MQ_PRIO_MAX) * + sizeof(struct posix_msg_tree_node); + + mq_bytes = mq_treesize + (info->attr.mq_maxmsg * + info->attr.mq_msgsize); + spin_lock(&mq_lock); user->mq_bytes -= mq_bytes; /*
[ Upstream commit b23e5844dfe78a80ba672793187d3f52e4b528d7 ]
Commit 7457c0da024b ("x86/alternatives: Add int3_emulate_call() selftest") is used to ensure there is a gap setup in int3 exception stack which could be used for inserting call return address.
This gap is missed in XEN PV int3 exception entry path, then below panic triggered:
[ 0.772876] general protection fault: 0000 [#1] SMP NOPTI [ 0.772886] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #11 [ 0.772893] RIP: e030:int3_magic+0x0/0x7 [ 0.772905] RSP: 3507:ffffffff82203e98 EFLAGS: 00000246 [ 0.773334] Call Trace: [ 0.773334] alternative_instructions+0x3d/0x12e [ 0.773334] check_bugs+0x7c9/0x887 [ 0.773334] ? __get_locked_pte+0x178/0x1f0 [ 0.773334] start_kernel+0x4ff/0x535 [ 0.773334] ? set_init_arg+0x55/0x55 [ 0.773334] xen_start_kernel+0x571/0x57a
For 64bit PV guests, Xen's ABI enters the kernel with using SYSRET, with %rcx/%r11 on the stack. To convert back to "normal" looking exceptions, the xen thunks do 'xen_*: pop %rcx; pop %r11; jmp *'.
E.g. Extracting 'xen_pv_trap xenint3' we have: xen_xenint3: pop %rcx; pop %r11; jmp xenint3
As xenint3 and int3 entry code are same except xenint3 doesn't generate a gap, we can fix it by using int3 and drop useless xenint3.
Signed-off-by: Zhenzhong Duan zhenzhong.duan@oracle.com Reviewed-by: Juergen Gross jgross@suse.com Cc: Boris Ostrovsky boris.ostrovsky@oracle.com Cc: Juergen Gross jgross@suse.com Cc: Stefano Stabellini sstabellini@kernel.org Cc: Andy Lutomirski luto@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: Andrew Cooper andrew.cooper3@citrix.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/entry/entry_64.S | 1 - arch/x86/include/asm/traps.h | 2 +- arch/x86/xen/enlighten_pv.c | 2 +- arch/x86/xen/xen-asm_64.S | 1 - 4 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index e09ba4bc8b98f..b2524d349595c 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -1113,7 +1113,6 @@ idtentry stack_segment do_stack_segment has_error_code=1 #ifdef CONFIG_XEN idtentry xennmi do_nmi has_error_code=0 idtentry xendebug do_debug has_error_code=0 -idtentry xenint3 do_int3 has_error_code=0 #endif
idtentry general_protection do_general_protection has_error_code=1 diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h index afbc87206886e..b771bb3d159bc 100644 --- a/arch/x86/include/asm/traps.h +++ b/arch/x86/include/asm/traps.h @@ -40,7 +40,7 @@ asmlinkage void simd_coprocessor_error(void); asmlinkage void xen_divide_error(void); asmlinkage void xen_xennmi(void); asmlinkage void xen_xendebug(void); -asmlinkage void xen_xenint3(void); +asmlinkage void xen_int3(void); asmlinkage void xen_overflow(void); asmlinkage void xen_bounds(void); asmlinkage void xen_invalid_op(void); diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index 481d7920ea244..f79a0cdc6b4e7 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -598,12 +598,12 @@ struct trap_array_entry {
static struct trap_array_entry trap_array[] = { { debug, xen_xendebug, true }, - { int3, xen_xenint3, true }, { double_fault, xen_double_fault, true }, #ifdef CONFIG_X86_MCE { machine_check, xen_machine_check, true }, #endif { nmi, xen_xennmi, true }, + { int3, xen_int3, false }, { overflow, xen_overflow, false }, #ifdef CONFIG_IA32_EMULATION { entry_INT80_compat, xen_entry_INT80_compat, false }, diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S index 417b339e5c8e1..3a6feed76dfc1 100644 --- a/arch/x86/xen/xen-asm_64.S +++ b/arch/x86/xen/xen-asm_64.S @@ -30,7 +30,6 @@ xen_pv_trap divide_error xen_pv_trap debug xen_pv_trap xendebug xen_pv_trap int3 -xen_pv_trap xenint3 xen_pv_trap xennmi xen_pv_trap overflow xen_pv_trap bounds
[ Upstream commit 3901336ed9887b075531bffaeef7742ba614058b ]
After making a change to improve objtool's sibling call detection, it started showing the following warning:
arch/x86/kvm/vmx/nested.o: warning: objtool: .fixup+0x15: sibling call from callable instruction with modified stack frame
The problem is the ____kvm_handle_fault_on_reboot() macro. It does a fake call by pushing a fake RIP and doing a jump. That tricks the unwinder into printing the function which triggered the exception, rather than the .fixup code.
Instead of the hack to make it look like the original function made the call, just change the macro so that the original function actually does make the call. This allows removal of the hack, and also makes objtool happy.
I triggered a vmx instruction exception and verified that the stack trace is still sane:
kernel BUG at arch/x86/kvm/x86.c:358! invalid opcode: 0000 [#1] SMP PTI CPU: 28 PID: 4096 Comm: qemu-kvm Not tainted 5.2.0+ #16 Hardware name: Lenovo THINKSYSTEM SD530 -[7X2106Z000]-/-[7X2106Z000]-, BIOS -[TEE113Z-1.00]- 07/17/2017 RIP: 0010:kvm_spurious_fault+0x5/0x10 Code: 00 00 00 00 00 8b 44 24 10 89 d2 45 89 c9 48 89 44 24 10 8b 44 24 08 48 89 44 24 08 e9 d4 40 22 00 0f 1f 40 00 0f 1f 44 00 00 <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 55 49 89 fd 41 RSP: 0018:ffffbf91c683bd00 EFLAGS: 00010246 RAX: 000061f040000000 RBX: ffff9e159c77bba0 RCX: ffff9e15a5c87000 RDX: 0000000665c87000 RSI: ffff9e15a5c87000 RDI: ffff9e159c77bba0 RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9e15a5c87000 R10: 0000000000000000 R11: fffff8f2d99721c0 R12: ffff9e159c77bba0 R13: ffffbf91c671d960 R14: ffff9e159c778000 R15: 0000000000000000 FS: 00007fa341cbe700(0000) GS:ffff9e15b7400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdd38356804 CR3: 00000006759de003 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: loaded_vmcs_init+0x4f/0xe0 alloc_loaded_vmcs+0x38/0xd0 vmx_create_vcpu+0xf7/0x600 kvm_vm_ioctl+0x5e9/0x980 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 ? free_one_page+0x13f/0x4e0 do_vfs_ioctl+0xa4/0x630 ksys_ioctl+0x60/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x55/0x1c0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fa349b1ee5b
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Paolo Bonzini pbonzini@redhat.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/64a9b64d127e87b6920a97afde8e96ea76f6524e.156341331... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/kvm_host.h | 34 ++++++++++++++++++--------------- 1 file changed, 19 insertions(+), 15 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index f9a4b85d7309b..9f3eb334c818e 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1353,25 +1353,29 @@ enum { #define kvm_arch_vcpu_memslots_id(vcpu) ((vcpu)->arch.hflags & HF_SMM_MASK ? 1 : 0) #define kvm_memslots_for_spte_role(kvm, role) __kvm_memslots(kvm, (role).smm)
+asmlinkage void __noreturn kvm_spurious_fault(void); + /* * Hardware virtualization extension instructions may fault if a * reboot turns off virtualization while processes are running. - * Trap the fault and ignore the instruction if that happens. + * Usually after catching the fault we just panic; during reboot + * instead the instruction is ignored. */ -asmlinkage void kvm_spurious_fault(void); - -#define ____kvm_handle_fault_on_reboot(insn, cleanup_insn) \ - "666: " insn "\n\t" \ - "668: \n\t" \ - ".pushsection .fixup, "ax" \n" \ - "667: \n\t" \ - cleanup_insn "\n\t" \ - "cmpb $0, kvm_rebooting \n\t" \ - "jne 668b \n\t" \ - __ASM_SIZE(push) " $666b \n\t" \ - "jmp kvm_spurious_fault \n\t" \ - ".popsection \n\t" \ - _ASM_EXTABLE(666b, 667b) +#define ____kvm_handle_fault_on_reboot(insn, cleanup_insn) \ + "666: \n\t" \ + insn "\n\t" \ + "jmp 668f \n\t" \ + "667: \n\t" \ + "call kvm_spurious_fault \n\t" \ + "668: \n\t" \ + ".pushsection .fixup, "ax" \n\t" \ + "700: \n\t" \ + cleanup_insn "\n\t" \ + "cmpb $0, kvm_rebooting\n\t" \ + "je 667b \n\t" \ + "jmp 668b \n\t" \ + ".popsection \n\t" \ + _ASM_EXTABLE(666b, 700b)
#define __kvm_handle_fault_on_reboot(insn) \ ____kvm_handle_fault_on_reboot(insn, "")
[ Upstream commit 083db6764821996526970e42d09c1ab2f4155dd4 ]
The __raw_callee_save_*() functions have an ELF symbol size of zero, which confuses objtool and other tools.
Fixes a bunch of warnings like the following:
arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_pte_val() is missing an ELF size annotation arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_pgd_val() is missing an ELF size annotation arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_make_pte() is missing an ELF size annotation arch/x86/xen/mmu_pv.o: warning: objtool: __raw_callee_save_xen_make_pgd() is missing an ELF size annotation
Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Juergen Gross jgross@suse.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/afa6d49bb07497ca62e4fc3b27a2d0cece545b4e.156341331... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/paravirt.h | 1 + arch/x86/kernel/kvm.c | 1 + 2 files changed, 2 insertions(+)
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h index c83a2f418cea0..4471f0da6ed76 100644 --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -758,6 +758,7 @@ static __always_inline bool pv_vcpu_is_preempted(long cpu) PV_RESTORE_ALL_CALLER_REGS \ FRAME_END \ "ret;" \ + ".size " PV_THUNK_NAME(func) ", .-" PV_THUNK_NAME(func) ";" \ ".popsection")
/* Get a reference to a callee-save function */ diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 652bdd867782c..5853eb50138e7 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -631,6 +631,7 @@ asm( "cmpb $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);" "setne %al;" "ret;" +".size __raw_callee_save___kvm_vcpu_is_preempted, .-__raw_callee_save___kvm_vcpu_is_preempted;" ".popsection");
#endif
[ Upstream commit 8c5477e8046ca139bac250386c08453da37ec1ae ]
Kernel build warns: 'sanitize_boot_params' defined but not used [-Wunused-function]
at below files: arch/x86/boot/compressed/cmdline.c arch/x86/boot/compressed/error.c arch/x86/boot/compressed/early_serial_console.c arch/x86/boot/compressed/acpi.c
That's becausethey each include misc.h which includes a definition of sanitize_boot_params() via bootparam_utils.h.
Remove the inclusion from misc.h and have the c file including bootparam_utils.h directly.
Signed-off-by: Zhenzhong Duan zhenzhong.duan@oracle.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/1563283092-1189-1-git-send-email-zhenzhong.duan@or... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/boot/compressed/misc.c | 1 + arch/x86/boot/compressed/misc.h | 1 - 2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c index 252fee3208166..fb07cfa3f2f90 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -16,6 +16,7 @@ #include "error.h" #include "../string.h" #include "../voffset.h" +#include <asm/bootparam_utils.h>
/* * WARNING!! diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h index 32d4ec2e0243c..5380d45b1c6e4 100644 --- a/arch/x86/boot/compressed/misc.h +++ b/arch/x86/boot/compressed/misc.h @@ -19,7 +19,6 @@ #include <asm/page.h> #include <asm/boot.h> #include <asm/bootparam.h> -#include <asm/bootparam_utils.h>
#define BOOT_BOOT_H #include "../ctype.h"
[ Upstream commit 09b90e2fe35faeace2488234e2a7728f2ea8ba26 ]
In nouveau_conn_reset(), if connector->state is true, __drm_atomic_helper_connector_destroy_state() will be called, but the memory pointed by asyc isn't freed. Memory leak happens in the following function __drm_atomic_helper_connector_reset(), where newly allocated asyc->state will be assigned to connector->state.
So using nouveau_conn_atomic_destroy_state() instead of __drm_atomic_helper_connector_destroy_state to free the "old" asyc.
Here the is the log showing memory leak.
unreferenced object 0xffff8c5480483c80 (size 192): comm "kworker/0:2", pid 188, jiffies 4294695279 (age 53.179s) hex dump (first 32 bytes): 00 f0 ba 7b 54 8c ff ff 00 00 00 00 00 00 00 00 ...{T........... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000005005c0d0>] kmem_cache_alloc_trace+0x195/0x2c0 [<00000000a122baed>] nouveau_conn_reset+0x25/0xc0 [nouveau] [<000000004fd189a2>] nouveau_connector_create+0x3a7/0x610 [nouveau] [<00000000c73343a8>] nv50_display_create+0x343/0x980 [nouveau] [<000000002e2b03c3>] nouveau_display_create+0x51f/0x660 [nouveau] [<00000000c924699b>] nouveau_drm_device_init+0x182/0x7f0 [nouveau] [<00000000cc029436>] nouveau_drm_probe+0x20c/0x2c0 [nouveau] [<000000007e961c3e>] local_pci_probe+0x47/0xa0 [<00000000da14d569>] work_for_cpu_fn+0x1a/0x30 [<0000000028da4805>] process_one_work+0x27c/0x660 [<000000001d415b04>] worker_thread+0x22b/0x3f0 [<0000000003b69f1f>] kthread+0x12f/0x150 [<00000000c94c29b7>] ret_from_fork+0x3a/0x50
Signed-off-by: Yongxin Liu yongxin.liu@windriver.com Signed-off-by: Ben Skeggs bskeggs@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/nouveau/nouveau_connector.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c index 2c6d196836886..4a7d50a96d36f 100644 --- a/drivers/gpu/drm/nouveau/nouveau_connector.c +++ b/drivers/gpu/drm/nouveau/nouveau_connector.c @@ -251,7 +251,7 @@ nouveau_conn_reset(struct drm_connector *connector) return;
if (connector->state) - __drm_atomic_helper_connector_destroy_state(connector->state); + nouveau_conn_atomic_destroy_state(connector, connector->state); __drm_atomic_helper_connector_reset(connector, &asyc->state); asyc->dither.mode = DITHERING_MODE_AUTO; asyc->dither.depth = DITHERING_DEPTH_AUTO;
From: Masahiro Yamada yamada.masahiro@socionext.com
commit 5241ab4cf42d3a93b933b55d3d53f43049081fa1 upstream.
CLANG_FLAGS is initialized by the following line:
CLANG_FLAGS := --target=$(notdir $(CROSS_COMPILE:%-=%))
..., which is run only when CROSS_COMPILE is set.
Some build targets (bindeb-pkg etc.) recurse to the top Makefile.
When you build the kernel with Clang but without CROSS_COMPILE, the same compiler flags such as -no-integrated-as are accumulated into CLANG_FLAGS.
If you run 'make CC=clang' and then 'make CC=clang bindeb-pkg', Kbuild will recompile everything needlessly due to the build command change.
Fix this by correctly initializing CLANG_FLAGS.
Fixes: 238bcbc4e07f ("kbuild: consolidate Clang compiler flags") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Masahiro Yamada yamada.masahiro@socionext.com Reviewed-by: Nathan Chancellor natechancellor@gmail.com Acked-by: Nick Desaulniers ndesaulniers@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/Makefile +++ b/Makefile @@ -427,6 +427,7 @@ KBUILD_AFLAGS_MODULE := -DMODULE KBUILD_CFLAGS_MODULE := -DMODULE KBUILD_LDFLAGS_MODULE := -T $(srctree)/scripts/module-common.lds GCC_PLUGINS_CFLAGS := +CLANG_FLAGS :=
export ARCH SRCARCH CONFIG_SHELL HOSTCC HOSTCFLAGS CROSS_COMPILE AS LD CC export CPP AR NM STRIP OBJCOPY OBJDUMP HOSTLDFLAGS HOST_LOADLIBES @@ -479,7 +480,7 @@ endif
ifeq ($(cc-name),clang) ifneq ($(CROSS_COMPILE),) -CLANG_FLAGS := --target=$(notdir $(CROSS_COMPILE:%-=%)) +CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%)) GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit)) CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR) GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..)
From: Filipe Manana fdmanana@suse.com
commit b4f9a1a87a48c255bb90d8a6c3d555a1abb88130 upstream.
When doing an incremental send operation we can fail if we previously did deduplication operations against a file that exists in both snapshots. In that case we will fail the send operation with -EIO and print a message to dmesg/syslog like the following:
BTRFS error (device sdc): Send: inconsistent snapshot, found updated \ extent for inode 257 without updated inode item, send root is 258, \ parent root is 257
This requires that we deduplicate to the same file in both snapshots for the same amount of times on each snapshot. The issue happens because a deduplication only updates the iversion of an inode and does not update any other field of the inode, therefore if we deduplicate the file on each snapshot for the same amount of time, the inode will have the same iversion value (stored as the "sequence" field on the inode item) on both snapshots, therefore it will be seen as unchanged between in the send snapshot while there are new/updated/deleted extent items when comparing to the parent snapshot. This makes the send operation return -EIO and print an error message.
Example reproducer:
$ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt
# Create our first file. The first half of the file has several 64Kb # extents while the second half as a single 512Kb extent. $ xfs_io -f -s -c "pwrite -S 0xb8 -b 64K 0 512K" /mnt/foo $ xfs_io -c "pwrite -S 0xb8 512K 512K" /mnt/foo
# Create the base snapshot and the parent send stream from it. $ btrfs subvolume snapshot -r /mnt /mnt/mysnap1 $ btrfs send -f /tmp/1.snap /mnt/mysnap1
# Create our second file, that has exactly the same data as the first # file. $ xfs_io -f -c "pwrite -S 0xb8 0 1M" /mnt/bar
# Create the second snapshot, used for the incremental send, before # doing the file deduplication. $ btrfs subvolume snapshot -r /mnt /mnt/mysnap2
# Now before creating the incremental send stream: # # 1) Deduplicate into a subrange of file foo in snapshot mysnap1. This # will drop several extent items and add a new one, also updating # the inode's iversion (sequence field in inode item) by 1, but not # any other field of the inode; # # 2) Deduplicate into a different subrange of file foo in snapshot # mysnap2. This will replace an extent item with a new one, also # updating the inode's iversion by 1 but not any other field of the # inode. # # After these two deduplication operations, the inode items, for file # foo, are identical in both snapshots, but we have different extent # items for this inode in both snapshots. We want to check this doesn't # cause send to fail with an error or produce an incorrect stream.
$ xfs_io -r -c "dedupe /mnt/bar 0 0 512K" /mnt/mysnap1/foo $ xfs_io -r -c "dedupe /mnt/bar 512K 512K 512K" /mnt/mysnap2/foo
# Create the incremental send stream. $ btrfs send -p /mnt/mysnap1 -f /tmp/2.snap /mnt/mysnap2 ERROR: send ioctl failed with -5: Input/output error
This issue started happening back in 2015 when deduplication was updated to not update the inode's ctime and mtime and update only the iversion. Back then we would hit a BUG_ON() in send, but later in 2016 send was updated to return -EIO and print the error message instead of doing the BUG_ON().
A test case for fstests follows soon.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203933 Fixes: 1c919a5e13702c ("btrfs: don't update mtime/ctime on deduped inodes") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/send.c | 77 ++++++++++---------------------------------------------- 1 file changed, 15 insertions(+), 62 deletions(-)
--- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -6130,68 +6130,21 @@ static int changed_extent(struct send_ct { int ret = 0;
- if (sctx->cur_ino != sctx->cmp_key->objectid) { - - if (result == BTRFS_COMPARE_TREE_CHANGED) { - struct extent_buffer *leaf_l; - struct extent_buffer *leaf_r; - struct btrfs_file_extent_item *ei_l; - struct btrfs_file_extent_item *ei_r; - - leaf_l = sctx->left_path->nodes[0]; - leaf_r = sctx->right_path->nodes[0]; - ei_l = btrfs_item_ptr(leaf_l, - sctx->left_path->slots[0], - struct btrfs_file_extent_item); - ei_r = btrfs_item_ptr(leaf_r, - sctx->right_path->slots[0], - struct btrfs_file_extent_item); - - /* - * We may have found an extent item that has changed - * only its disk_bytenr field and the corresponding - * inode item was not updated. This case happens due to - * very specific timings during relocation when a leaf - * that contains file extent items is COWed while - * relocation is ongoing and its in the stage where it - * updates data pointers. So when this happens we can - * safely ignore it since we know it's the same extent, - * but just at different logical and physical locations - * (when an extent is fully replaced with a new one, we - * know the generation number must have changed too, - * since snapshot creation implies committing the current - * transaction, and the inode item must have been updated - * as well). - * This replacement of the disk_bytenr happens at - * relocation.c:replace_file_extents() through - * relocation.c:btrfs_reloc_cow_block(). - */ - if (btrfs_file_extent_generation(leaf_l, ei_l) == - btrfs_file_extent_generation(leaf_r, ei_r) && - btrfs_file_extent_ram_bytes(leaf_l, ei_l) == - btrfs_file_extent_ram_bytes(leaf_r, ei_r) && - btrfs_file_extent_compression(leaf_l, ei_l) == - btrfs_file_extent_compression(leaf_r, ei_r) && - btrfs_file_extent_encryption(leaf_l, ei_l) == - btrfs_file_extent_encryption(leaf_r, ei_r) && - btrfs_file_extent_other_encoding(leaf_l, ei_l) == - btrfs_file_extent_other_encoding(leaf_r, ei_r) && - btrfs_file_extent_type(leaf_l, ei_l) == - btrfs_file_extent_type(leaf_r, ei_r) && - btrfs_file_extent_disk_bytenr(leaf_l, ei_l) != - btrfs_file_extent_disk_bytenr(leaf_r, ei_r) && - btrfs_file_extent_disk_num_bytes(leaf_l, ei_l) == - btrfs_file_extent_disk_num_bytes(leaf_r, ei_r) && - btrfs_file_extent_offset(leaf_l, ei_l) == - btrfs_file_extent_offset(leaf_r, ei_r) && - btrfs_file_extent_num_bytes(leaf_l, ei_l) == - btrfs_file_extent_num_bytes(leaf_r, ei_r)) - return 0; - } - - inconsistent_snapshot_error(sctx, result, "extent"); - return -EIO; - } + /* + * We have found an extent item that changed without the inode item + * having changed. This can happen either after relocation (where the + * disk_bytenr of an extent item is replaced at + * relocation.c:replace_file_extents()) or after deduplication into a + * file in both the parent and send snapshots (where an extent item can + * get modified or replaced with a new one). Note that deduplication + * updates the inode item, but it only changes the iversion (sequence + * field in the inode item) of the inode, so if a file is deduplicated + * the same amount of times in both the parent and send snapshots, its + * iversion becames the same in both snapshots, whence the inode item is + * the same on both snapshots. + */ + if (sctx->cur_ino != sctx->cmp_key->objectid) + return 0;
if (!sctx->cur_inode_new_gen && !sctx->cur_inode_deleted) { if (result != BTRFS_COMPARE_TREE_DELETED)
From: Filipe Manana fdmanana@suse.com
commit cb2d3daddbfb6318d170e79aac1f7d5e4d49f0d7 upstream.
When one transaction is finishing its commit, it is possible for another transaction to start and enter its initial commit phase as well. If the first ends up getting aborted, we have a small time window where the second transaction commit does not notice that the previous transaction aborted and ends up committing, writing a superblock that points to btrees that reference extent buffers (nodes and leafs) that were not persisted to disk. The consequence is that after mounting the filesystem again, we will be unable to load some btree nodes/leafs, either because the content on disk is either garbage (or just zeroes) or corresponds to the old content of a previouly COWed or deleted node/leaf, resulting in the well known error messages "parent transid verify failed on ...". The following sequence diagram illustrates how this can happen.
CPU 1 CPU 2
<at transaction N>
btrfs_commit_transaction() (...) --> sets transaction state to TRANS_STATE_UNBLOCKED --> sets fs_info->running_transaction to NULL
(...) btrfs_start_transaction() start_transaction() wait_current_trans() --> returns immediately because fs_info->running_transaction is NULL join_transaction() --> creates transaction N + 1 --> sets fs_info->running_transaction to transaction N + 1 --> adds transaction N + 1 to the fs_info->trans_list list --> returns transaction handle pointing to the new transaction N + 1 (...)
btrfs_sync_file() btrfs_start_transaction() --> returns handle to transaction N + 1 (...)
btrfs_write_and_wait_transaction() --> writeback of some extent buffer fails, returns an error btrfs_handle_fs_error() --> sets BTRFS_FS_STATE_ERROR in fs_info->fs_state --> jumps to label "scrub_continue" cleanup_transaction() btrfs_abort_transaction(N) --> sets BTRFS_FS_STATE_TRANS_ABORTED flag in fs_info->fs_state --> sets aborted field in the transaction and transaction handle structures, for transaction N only --> removes transaction from the list fs_info->trans_list btrfs_commit_transaction(N + 1) --> transaction N + 1 was not aborted, so it proceeds (...) --> sets the transaction's state to TRANS_STATE_COMMIT_START --> does not find the previous transaction (N) in the fs_info->trans_list, so it doesn't know that transaction was aborted, and the commit of transaction N + 1 proceeds (...) --> sets transaction N + 1 state to TRANS_STATE_UNBLOCKED btrfs_write_and_wait_transaction() --> succeeds writing all extent buffers created in the transaction N + 1 write_all_supers() --> succeeds --> we now have a superblock on disk that points to trees that refer to at least one extent buffer that was never persisted
So fix this by updating the transaction commit path to check if the flag BTRFS_FS_STATE_TRANS_ABORTED is set on fs_info->fs_state if after setting the transaction to the TRANS_STATE_COMMIT_START we do not find any previous transaction in the fs_info->trans_list. If the flag is set, just fail the transaction commit with -EROFS, as we do in other places. The exact error code for the previous transaction abort was already logged and reported.
Fixes: 49b25e0540904b ("btrfs: enhance transaction abort infrastructure") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/btrfs/transaction.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -2052,6 +2052,16 @@ int btrfs_commit_transaction(struct btrf } } else { spin_unlock(&fs_info->trans_lock); + /* + * The previous transaction was aborted and was already removed + * from the list of transactions at fs_info->trans_list. So we + * abort to prevent writing a new superblock that reflects a + * corrupt state (pointing to trees with unwritten nodes/leafs). + */ + if (test_bit(BTRFS_FS_STATE_TRANS_ABORTED, &fs_info->fs_state)) { + ret = -EROFS; + goto cleanup_transaction; + } }
extwriter_counter_dec(cur_trans, trans->type);
From: Douglas Anderson dianders@chromium.org
commit ba2d139b02ba684c6c101de42fed782d6cd2b997 upstream.
In commit 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after response errors.") we fixed a tuning-induced hang that I saw when stress testing tuning on certain SD cards. I won't re-hash that whole commit, but the summary is that as a normal part of tuning you need to deal with transfer errors and there were cases where these transfer errors was putting my system into a bad state causing all future transfers to fail. That commit fixed handling of the transfer errors for me.
In downstream Chrome OS my fix landed and had the same behavior for all SD/MMC commands. However, it looks like when the commit landed upstream we limited it to only SD tuning commands. Presumably this was to try to get around problems that Alim Akhtar reported on exynos [1].
Unfortunately while stress testing reboots (and suspend/resume) on some rk3288-based Chromebooks I found the same problem on the eMMC on some of my Chromebooks (the ones with Hynix eMMC). Since the eMMC tuning command is different (MMC_SEND_TUNING_BLOCK_HS200 vs. MMC_SEND_TUNING_BLOCK) we were basically getting back into the same situation.
I'm hoping that whatever problems exynos was having in the past are somehow magically fixed now and we can make the behavior the same for all commands.
[1] https://lkml.kernel.org/r/CAGOxZ53WfNbaMe0_AM0qBqU47kAfgmPBVZC8K8Y-_J3mDMqW4...
Fixes: 46d179525a1f ("mmc: dw_mmc: Wait for data transfer after response errors.") Signed-off-by: Douglas Anderson dianders@chromium.org Cc: Marek Szyprowski m.szyprowski@samsung.com Cc: Alim Akhtar alim.akhtar@gmail.com Cc: Enric Balletbo i Serra enric.balletbo@collabora.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/mmc/host/dw_mmc.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -2046,8 +2046,7 @@ static void dw_mci_tasklet_func(unsigned * delayed. Allowing the transfer to take place * avoids races and keeps things simple. */ - if ((err != -ETIMEDOUT) && - (cmd->opcode == MMC_SEND_TUNING_BLOCK)) { + if (err != -ETIMEDOUT) { state = STATE_SENDING_DATA; continue; }
From: Michael Wu michael.wu@vatics.com
commit 223ecaf140b1dd1c1d2a1a1d96281efc5c906984 upstream.
When a pin is active-low, logical trigger edge should be inverted to match the same interrupt opportunity.
For example, a button pushed triggers falling edge in ACTIVE_HIGH case; in ACTIVE_LOW case, the button pushed triggers rising edge. For user space the IRQ requesting doesn't need to do any modification except to configuring GPIOHANDLE_REQUEST_ACTIVE_LOW.
For example, we want to catch the event when the button is pushed. The button on the original board drives level to be low when it is pushed, and drives level to be high when it is released.
In user space we can do:
req.handleflags = GPIOHANDLE_REQUEST_INPUT; req.eventflags = GPIOEVENT_REQUEST_FALLING_EDGE;
while (1) { read(fd, &dat, sizeof(dat)); if (dat.id == GPIOEVENT_EVENT_FALLING_EDGE) printf("button pushed\n"); }
Run the same logic on another board which the polarity of the button is inverted; it drives level to be high when pushed, and level to be low when released. For this inversion we add flag GPIOHANDLE_REQUEST_ACTIVE_LOW:
req.handleflags = GPIOHANDLE_REQUEST_INPUT | GPIOHANDLE_REQUEST_ACTIVE_LOW; req.eventflags = GPIOEVENT_REQUEST_FALLING_EDGE;
At the result, there are no any events caught when the button is pushed. By the way, button releasing will emit a "falling" event. The timing of "falling" catching is not expected.
Cc: stable@vger.kernel.org Signed-off-by: Michael Wu michael.wu@vatics.com Tested-by: Bartosz Golaszewski bgolaszewski@baylibre.com Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpio/gpiolib.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -835,9 +835,11 @@ static int lineevent_create(struct gpio_ }
if (eflags & GPIOEVENT_REQUEST_RISING_EDGE) - irqflags |= IRQF_TRIGGER_RISING; + irqflags |= test_bit(FLAG_ACTIVE_LOW, &desc->flags) ? + IRQF_TRIGGER_FALLING : IRQF_TRIGGER_RISING; if (eflags & GPIOEVENT_REQUEST_FALLING_EDGE) - irqflags |= IRQF_TRIGGER_FALLING; + irqflags |= test_bit(FLAG_ACTIVE_LOW, &desc->flags) ? + IRQF_TRIGGER_RISING : IRQF_TRIGGER_FALLING; irqflags |= IRQF_ONESHOT; irqflags |= IRQF_SHARED;
From: Gustavo A. R. Silva gustavo@embeddedor.com
commit 6497d0a9c53df6e98b25e2b79f2295d7caa47b6e upstream.
sl is controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability.
Fix this by sanitizing sl before using it to index ibp->sl_to_sc.
Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1].
[1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/
Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com Link: https://lore.kernel.org/r/20190731175428.GA16736@embeddedor Signed-off-by: Doug Ledford dledford@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/hw/hfi1/verbs.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/infiniband/hw/hfi1/verbs.c +++ b/drivers/infiniband/hw/hfi1/verbs.c @@ -54,6 +54,7 @@ #include <linux/mm.h> #include <linux/vmalloc.h> #include <rdma/opa_addr.h> +#include <linux/nospec.h>
#include "hfi.h" #include "common.h" @@ -1587,6 +1588,7 @@ static int hfi1_check_ah(struct ib_devic sl = rdma_ah_get_sl(ah_attr); if (sl >= ARRAY_SIZE(ibp->sl_to_sc)) return -EINVAL; + sl = array_index_nospec(sl, ARRAY_SIZE(ibp->sl_to_sc));
sc5 = ibp->sl_to_sc[sl]; if (sc_to_vlt(dd, sc5) > num_vls && sc_to_vlt(dd, sc5) != 0xf)
Hi Greg,
Can you please add these other two patches to stable:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Thanks! -- Gustavo
On 8/5/19 8:03 AM, Greg Kroah-Hartman wrote:
From: Gustavo A. R. Silva gustavo@embeddedor.com
commit 6497d0a9c53df6e98b25e2b79f2295d7caa47b6e upstream.
sl is controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability.
Fix this by sanitizing sl before using it to index ibp->sl_to_sc.
Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1].
[1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/
Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com Link: https://lore.kernel.org/r/20190731175428.GA16736@embeddedor Signed-off-by: Doug Ledford dledford@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
drivers/infiniband/hw/hfi1/verbs.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/infiniband/hw/hfi1/verbs.c +++ b/drivers/infiniband/hw/hfi1/verbs.c @@ -54,6 +54,7 @@ #include <linux/mm.h> #include <linux/vmalloc.h> #include <rdma/opa_addr.h> +#include <linux/nospec.h> #include "hfi.h" #include "common.h" @@ -1587,6 +1588,7 @@ static int hfi1_check_ah(struct ib_devic sl = rdma_ah_get_sl(ah_attr); if (sl >= ARRAY_SIZE(ibp->sl_to_sc)) return -EINVAL;
- sl = array_index_nospec(sl, ARRAY_SIZE(ibp->sl_to_sc));
sc5 = ibp->sl_to_sc[sl]; if (sc_to_vlt(dd, sc5) > num_vls && sc_to_vlt(dd, sc5) != 0xf)
On Mon, Aug 05, 2019 at 09:16:36AM -0500, Gustavo A. R. Silva wrote:
Hi Greg,
Can you please add these other two patches to stable:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Does not apply :(
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Also does not apply :(
Can you provide backports that work if they really are needed?
thanks,
greg k-h
From: Ondrej Mosnacek omosnace@redhat.com
commit 45385237f65aeee73641f1ef737d7273905a233f upstream.
Since roles_init() adds some entries to the role hash table, we need to destroy also its keys/values on error, otherwise we get a memory leak in the error path.
Cc: stable@vger.kernel.org Reported-by: syzbot+fee3a14d4cdf92646287@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- security/selinux/ss/policydb.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/security/selinux/ss/policydb.c +++ b/security/selinux/ss/policydb.c @@ -275,6 +275,8 @@ static int rangetr_cmp(struct hashtab *h return v; }
+static int (*destroy_f[SYM_NUM]) (void *key, void *datum, void *datap); + /* * Initialize a policy database structure. */ @@ -322,8 +324,10 @@ static int policydb_init(struct policydb out: hashtab_destroy(p->filename_trans); hashtab_destroy(p->range_tr); - for (i = 0; i < SYM_NUM; i++) + for (i = 0; i < SYM_NUM; i++) { + hashtab_map(p->symtab[i].table, destroy_f[i], NULL); hashtab_destroy(p->symtab[i].table); + } return rc; }
From: Stefan Haberland sth@linux.ibm.com
commit 41995342b40c418a47603e1321256d2c4a2ed0fb upstream.
After getting a storage server event that causes the DASD device driver to update its unit address configuration during a device shutdown there is the possibility of an endless loop in the device driver.
In the system log there will be ongoing DASD error messages with RC: -19.
The reason is that the loop starting the ruac request only terminates when the retry counter is decreased to 0. But in the sleep_on function there are early exit paths that do not decrease the retry counter.
Prevent an endless loop by handling those cases separately.
Remove the unnecessary do..while loop since the sleep_on function takes care of retries by itself.
Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1") Cc: stable@vger.kernel.org # 2.6.25+ Signed-off-by: Stefan Haberland sth@linux.ibm.com Reviewed-by: Jan Hoeppner hoeppner@linux.ibm.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/s390/block/dasd_alias.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-)
--- a/drivers/s390/block/dasd_alias.c +++ b/drivers/s390/block/dasd_alias.c @@ -383,6 +383,20 @@ suborder_not_supported(struct dasd_ccw_r char msg_format; char msg_no;
+ /* + * intrc values ENODEV, ENOLINK and EPERM + * will be optained from sleep_on to indicate that no + * IO operation can be started + */ + if (cqr->intrc == -ENODEV) + return 1; + + if (cqr->intrc == -ENOLINK) + return 1; + + if (cqr->intrc == -EPERM) + return 1; + sense = dasd_get_sense(&cqr->irb); if (!sense) return 0; @@ -447,12 +461,8 @@ static int read_unit_address_configurati lcu->flags &= ~NEED_UAC_UPDATE; spin_unlock_irqrestore(&lcu->lock, flags);
- do { - rc = dasd_sleep_on(cqr); - if (rc && suborder_not_supported(cqr)) - return -EOPNOTSUPP; - } while (rc && (cqr->retries > 0)); - if (rc) { + rc = dasd_sleep_on(cqr); + if (rc && !suborder_not_supported(cqr)) { spin_lock_irqsave(&lcu->lock, flags); lcu->flags |= NEED_UAC_UPDATE; spin_unlock_irqrestore(&lcu->lock, flags);
From: Helge Deller deller@gmx.de
commit 3fe6c873af2f2247544debdbe51ec29f690a2ccf upstream.
With debug info enabled (CONFIG_DEBUG_INFO=y) the resulting vmlinux may get that huge that we need to increase the start addresss for the decompression text section otherwise one will face a linker error.
Reported-by: Sven Schnelle svens@stackframe.org Tested-by: Sven Schnelle svens@stackframe.org Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/parisc/boot/compressed/vmlinux.lds.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/parisc/boot/compressed/vmlinux.lds.S +++ b/arch/parisc/boot/compressed/vmlinux.lds.S @@ -40,8 +40,8 @@ SECTIONS #endif _startcode_end = .;
- /* bootloader code and data starts behind area of extracted kernel */ - . = (SZ_end - SZparisc_kernel_start + KERNEL_BINARY_TEXT_START); + /* bootloader code and data starts at least behind area of extracted kernel */ + . = MAX(ABSOLUTE(.), (SZ_end - SZparisc_kernel_start + KERNEL_BINARY_TEXT_START));
/* align on next page boundary */ . = ALIGN(4096);
From: Will Deacon will@kernel.org
commit 0d7fd70f26039bd4b33444ca47f0e69ce3ae0354 upstream.
Handling of the CPU_PM_ENTER_FAILED transition in the Arm PMU PM notifier code incorrectly skips restoration of the counters. Fix the logic so that CPU_PM_ENTER_FAILED follows the same path as CPU_PM_EXIT.
Cc: stable@vger.kernel.org Fixes: da4e4f18afe0f372 ("drivers/perf: arm_pmu: implement CPU_PM notifier") Reported-by: Anders Roxell anders.roxell@linaro.org Acked-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/perf/arm_pmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/perf/arm_pmu.c +++ b/drivers/perf/arm_pmu.c @@ -751,8 +751,8 @@ static int cpu_pm_pmu_notify(struct noti cpu_pm_pmu_setup(armpmu, cmd); break; case CPU_PM_EXIT: - cpu_pm_pmu_setup(armpmu, cmd); case CPU_PM_ENTER_FAILED: + cpu_pm_pmu_setup(armpmu, cmd); armpmu->start(armpmu); break; default:
From: Munehisa Kamata kamatam@amazon.com
commit 2b5c8f0063e4b263cf2de82029798183cf85c320 upstream.
Commit abbbdf12497d ("replace kill_bdev() with __invalidate_device()") once did this, but 29eaadc03649 ("nbd: stop using the bdev everywhere") resurrected kill_bdev() and it has been there since then. So buffer_head mappings still get killed on a server disconnection, and we can still hit the BUG_ON on a filesystem on the top of the nbd device.
EXT4-fs (nbd0): mounted filesystem with ordered data mode. Opts: (null) block nbd0: Receive control failed (result -32) block nbd0: shutting down sockets print_req_error: I/O error, dev nbd0, sector 66264 flags 3000 EXT4-fs warning (device nbd0): htree_dirblock_to_tree:979: inode #2: lblock 0: comm ls: error -5 reading directory block print_req_error: I/O error, dev nbd0, sector 2264 flags 3000 EXT4-fs error (device nbd0): __ext4_get_inode_loc:4690: inode #2: block 283: comm ls: unable to read itable block EXT4-fs error (device nbd0) in ext4_reserve_inode_write:5894: IO failure ------------[ cut here ]------------ kernel BUG at fs/buffer.c:3057! invalid opcode: 0000 [#1] SMP PTI CPU: 7 PID: 40045 Comm: jbd2/nbd0-8 Not tainted 5.1.0-rc3+ #4 Hardware name: Amazon EC2 m5.12xlarge/, BIOS 1.0 10/16/2017 RIP: 0010:submit_bh_wbc+0x18b/0x190 ... Call Trace: jbd2_write_superblock+0xf1/0x230 [jbd2] ? account_entity_enqueue+0xc5/0xf0 jbd2_journal_update_sb_log_tail+0x94/0xe0 [jbd2] jbd2_journal_commit_transaction+0x12f/0x1d20 [jbd2] ? __switch_to_asm+0x40/0x70 ... ? lock_timer_base+0x67/0x80 kjournald2+0x121/0x360 [jbd2] ? remove_wait_queue+0x60/0x60 kthread+0xf8/0x130 ? commit_timeout+0x10/0x10 [jbd2] ? kthread_bind+0x10/0x10 ret_from_fork+0x35/0x40
With __invalidate_device(), I no longer hit the BUG_ON with sync or unmount on the disconnected device.
Fixes: 29eaadc03649 ("nbd: stop using the bdev everywhere") Cc: linux-block@vger.kernel.org Cc: Ratna Manoj Bolla manoj.br@gmail.com Cc: nbd@other.debian.org Cc: stable@vger.kernel.org Cc: David Woodhouse dwmw@amazon.com Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Munehisa Kamata kamatam@amazon.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/block/nbd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -1207,7 +1207,7 @@ static void nbd_clear_sock_ioctl(struct struct block_device *bdev) { sock_shutdown(nbd); - kill_bdev(bdev); + __invalidate_device(bdev, true); nbd_bdev_reset(bdev); if (test_and_clear_bit(NBD_HAS_CONFIG_REF, &nbd->config->runtime_flags))
From: Juergen Gross jgross@suse.com
commit 50f6393f9654c561df4cdcf8e6cfba7260143601 upstream.
The condition in xen_swiotlb_free_coherent() for deciding whether to call xen_destroy_contiguous_region() is wrong: in case the region to be freed is not contiguous calling xen_destroy_contiguous_region() is the wrong thing to do: it would result in inconsistent mappings of multiple PFNs to the same MFN. This will lead to various strange crashes or data corruption.
Instead of calling xen_destroy_contiguous_region() in that case a warning should be issued as that situation should never occur.
Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com Reviewed-by: Jan Beulich jbeulich@suse.com Acked-by: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/xen/swiotlb-xen.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/xen/swiotlb-xen.c +++ b/drivers/xen/swiotlb-xen.c @@ -371,8 +371,8 @@ xen_swiotlb_free_coherent(struct device /* Convert the size to actually allocated. */ size = 1UL << (order + XEN_PAGE_SHIFT);
- if (((dev_addr + size - 1 <= dma_mask)) || - range_straddles_page_boundary(phys, size)) + if (!WARN_ON((dev_addr + size - 1 > dma_mask) || + range_straddles_page_boundary(phys, size))) xen_destroy_contiguous_region(phys, order);
xen_free_coherent_pages(hwdev, size, vaddr, (dma_addr_t)phys, attrs);
From: Yishai Hadas yishaih@mellanox.com
commit 6a053953739d23694474a5f9c81d1a30093da81a upstream.
Fix unreg_umr to ignore the mkey state and do not fail if was freed. This prevents a case that a user space application already changed the mkey state to free and then the UMR operation will fail leaving the mkey in an inappropriate state.
Link: https://lore.kernel.org/r/20190723065733.4899-3-leon@kernel.org Cc: stable@vger.kernel.org # 3.19 Fixes: 968e78dd9644 ("IB/mlx5: Enhance UMR support to allow partial page table update") Signed-off-by: Yishai Hadas yishaih@mellanox.com Reviewed-by: Artemy Kovalyov artemyko@mellanox.com Signed-off-by: Leon Romanovsky leonro@mellanox.com Reviewed-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/hw/mlx5/mlx5_ib.h | 1 + drivers/infiniband/hw/mlx5/mr.c | 4 ++-- drivers/infiniband/hw/mlx5/qp.c | 12 ++++++++---- 3 files changed, 11 insertions(+), 6 deletions(-)
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -427,6 +427,7 @@ struct mlx5_umr_wr { u64 length; int access_flags; u32 mkey; + u8 ignore_free_state:1; };
static inline struct mlx5_umr_wr *umr_wr(struct ib_send_wr *wr) --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1302,10 +1302,10 @@ static int unreg_umr(struct mlx5_ib_dev if (mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) return 0;
- umrwr.wr.send_flags = MLX5_IB_SEND_UMR_DISABLE_MR | - MLX5_IB_SEND_UMR_FAIL_IF_FREE; + umrwr.wr.send_flags = MLX5_IB_SEND_UMR_DISABLE_MR; umrwr.wr.opcode = MLX5_IB_WR_UMR; umrwr.mkey = mr->mmkey.key; + umrwr.ignore_free_state = 1;
return mlx5_ib_post_send_wait(dev, &umrwr); } --- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -3265,10 +3265,14 @@ static void set_reg_umr_segment(struct m
memset(umr, 0, sizeof(*umr));
- if (wr->send_flags & MLX5_IB_SEND_UMR_FAIL_IF_FREE) - umr->flags = MLX5_UMR_CHECK_FREE; /* fail if free */ - else - umr->flags = MLX5_UMR_CHECK_NOT_FREE; /* fail if not free */ + if (!umrwr->ignore_free_state) { + if (wr->send_flags & MLX5_IB_SEND_UMR_FAIL_IF_FREE) + /* fail if free */ + umr->flags = MLX5_UMR_CHECK_FREE; + else + /* fail if not free */ + umr->flags = MLX5_UMR_CHECK_NOT_FREE; + }
umr->xlt_octowords = cpu_to_be16(get_xlt_octo(umrwr->xlt_size)); if (wr->send_flags & MLX5_IB_SEND_UMR_UPDATE_XLT) {
From: Yishai Hadas yishaih@mellanox.com
commit afd1417404fba6dbfa6c0a8e5763bd348da682e4 upstream.
Use a direct firmware command to destroy the mkey in case the unreg UMR operation has failed.
This prevents a case that a mkey will leak out from the cache post a failure to be destroyed by a UMR WR.
In case the MR cache limit didn't reach a call to add another entry to the cache instead of the destroyed one is issued.
In addition, replaced a warn message to WARN_ON() as this flow is fatal and can't happen unless some bug around.
Link: https://lore.kernel.org/r/20190723065733.4899-4-leon@kernel.org Cc: stable@vger.kernel.org # 4.10 Fixes: 49780d42dfc9 ("IB/mlx5: Expose MR cache for mlx5_ib") Signed-off-by: Yishai Hadas yishaih@mellanox.com Reviewed-by: Artemy Kovalyov artemyko@mellanox.com Signed-off-by: Leon Romanovsky leonro@mellanox.com Reviewed-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/hw/mlx5/mr.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
--- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -538,13 +538,16 @@ void mlx5_mr_cache_free(struct mlx5_ib_d int c;
c = order2idx(dev, mr->order); - if (c < 0 || c >= MAX_MR_CACHE_ENTRIES) { - mlx5_ib_warn(dev, "order %d, cache index %d\n", mr->order, c); - return; - } + WARN_ON(c < 0 || c >= MAX_MR_CACHE_ENTRIES);
- if (unreg_umr(dev, mr)) + if (unreg_umr(dev, mr)) { + mr->allocated_from_cache = false; + destroy_mkey(dev, mr); + ent = &cache->ent[c]; + if (ent->cur < ent->limit) + queue_work(cache->wq, &ent->work); return; + }
ent = &cache->ent[c]; spin_lock_irq(&ent->lock);
From: Yishai Hadas yishaih@mellanox.com
commit 9ec4483a3f0f71a228a5933bc040441322bfb090 upstream.
Fix unreg_umr to move the MR to a kernel owned PD (i.e. the UMR PD) which can't be accessed by userspace.
This ensures that nothing can continue to access the MR once it has been placed in the kernels cache for reuse.
MRs in the cache continue to have their HW state, including DMA tables, present. Even though the MR has been invalidated, changing the PD provides an additional layer of protection against use of the MR.
Link: https://lore.kernel.org/r/20190723065733.4899-5-leon@kernel.org Cc: stable@vger.kernel.org # 3.10 Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Yishai Hadas yishaih@mellanox.com Reviewed-by: Artemy Kovalyov artemyko@mellanox.com Signed-off-by: Leon Romanovsky leonro@mellanox.com Reviewed-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/hw/mlx5/mr.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1305,8 +1305,10 @@ static int unreg_umr(struct mlx5_ib_dev if (mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) return 0;
- umrwr.wr.send_flags = MLX5_IB_SEND_UMR_DISABLE_MR; + umrwr.wr.send_flags = MLX5_IB_SEND_UMR_DISABLE_MR | + MLX5_IB_SEND_UMR_UPDATE_PD_ACCESS; umrwr.wr.opcode = MLX5_IB_WR_UMR; + umrwr.pd = dev->umrc.pd; umrwr.mkey = mr->mmkey.key; umrwr.ignore_free_state = 1;
From: Yishai Hadas yishaih@mellanox.com
commit b7165bd0d6cbb93732559be6ea8774653b204480 upstream.
The specification for the Toeplitz function doesn't require to set the key explicitly to be symmetric. In case a symmetric functionality is required a symmetric key can be simply used.
Wrongly forcing the algorithm to symmetric causes the wrong packet distribution and a performance degradation.
Link: https://lore.kernel.org/r/20190723065733.4899-7-leon@kernel.org Cc: stable@vger.kernel.org # 4.7 Fixes: 28d6137008b2 ("IB/mlx5: Add RSS QP support") Signed-off-by: Yishai Hadas yishaih@mellanox.com Reviewed-by: Alex Vainman alexv@mellanox.com Signed-off-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/hw/mlx5/qp.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -1425,7 +1425,6 @@ static int create_rss_raw_qp_tir(struct }
MLX5_SET(tirc, tirc, rx_hash_fn, MLX5_RX_HASH_FN_TOEPLITZ); - MLX5_SET(tirc, tirc, rx_hash_symmetric, 1); memcpy(rss_key, ucmd.rx_hash_key, len); break; }
From: John Fleck john.fleck@intel.com
commit cd48a82087231fdba0e77521102386c6ed0168d6 upstream.
The call to alloc_rsm_map_table does not check if the kmalloc fails. Check for a NULL on alloc, and bail if it fails.
Fixes: 372cc85a13c9 ("IB/hfi1: Extract RSM map table init from QOS") Link: https://lore.kernel.org/r/20190715164521.74174.27047.stgit@awfm-01.aw.intel.... Cc: stable@vger.kernel.org Reviewed-by: Mike Marciniszyn mike.marciniszyn@intel.com Signed-off-by: John Fleck john.fleck@intel.com Signed-off-by: Mike Marciniszyn mike.marciniszyn@intel.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/hw/hfi1/chip.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/drivers/infiniband/hw/hfi1/chip.c +++ b/drivers/infiniband/hw/hfi1/chip.c @@ -14566,7 +14566,7 @@ void hfi1_deinit_vnic_rsm(struct hfi1_de clear_rcvctrl(dd, RCV_CTRL_RCV_RSM_ENABLE_SMASK); }
-static void init_rxe(struct hfi1_devdata *dd) +static int init_rxe(struct hfi1_devdata *dd) { struct rsm_map_table *rmt; u64 val; @@ -14575,6 +14575,9 @@ static void init_rxe(struct hfi1_devdata write_csr(dd, RCV_ERR_MASK, ~0ull);
rmt = alloc_rsm_map_table(dd); + if (!rmt) + return -ENOMEM; + /* set up QOS, including the QPN map table */ init_qos(dd, rmt); init_user_fecn_handling(dd, rmt); @@ -14599,6 +14602,7 @@ static void init_rxe(struct hfi1_devdata val = read_csr(dd, RCV_BYPASS); val |= (4ull << 16); write_csr(dd, RCV_BYPASS, val); + return 0; }
static void init_other(struct hfi1_devdata *dd) @@ -15154,7 +15158,10 @@ struct hfi1_devdata *hfi1_init_dd(struct goto bail_cleanup;
/* set initial RXE CSRs */ - init_rxe(dd); + ret = init_rxe(dd); + if (ret) + goto bail_cleanup; + /* set initial TXE CSRs */ init_txe(dd); /* set initial non-RXE, non-TXE CSRs */
From: Jean Delvare jdelvare@suse.de
commit 25e5ef302c24a6fead369c0cfe88c073d7b97ca8 upstream.
The integration of the at24 driver into the nvmem framework broke the world-readability of spd EEPROMs. Fix it.
Signed-off-by: Jean Delvare jdelvare@suse.de Cc: stable@vger.kernel.org Fixes: 57d155506dd5 ("eeprom: at24: extend driver to plug into the NVMEM framework") Cc: Andrew Lunn andrew@lunn.ch Cc: Srinivas Kandagatla srinivas.kandagatla@linaro.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Bartosz Golaszewski brgl@bgdev.pl Cc: Arnd Bergmann arnd@arndb.de Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com [Bartosz: backported the patch to older branches] Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/misc/eeprom/at24.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/misc/eeprom/at24.c +++ b/drivers/misc/eeprom/at24.c @@ -834,7 +834,7 @@ static int at24_probe(struct i2c_client at24->nvmem_config.name = dev_name(&client->dev); at24->nvmem_config.dev = &client->dev; at24->nvmem_config.read_only = !writable; - at24->nvmem_config.root_only = true; + at24->nvmem_config.root_only = !(chip.flags & AT24_FLAG_IRUGO); at24->nvmem_config.owner = THIS_MODULE; at24->nvmem_config.compat = true; at24->nvmem_config.base_dev = &client->dev;
From: Josh Poimboeuf jpoimboe@redhat.com
commit bcb6fb5da77c2a228adf07cc9cb1a0c2aa2001c6 upstream.
Starting with GCC 8, a lot of unlikely code was moved out of line to "cold" subfunctions in .text.unlikely.
For example, the unlikely bits of:
irq_do_set_affinity()
are moved out to the following subfunction:
irq_do_set_affinity.cold.49()
Starting with GCC 9, the numbered suffix has been removed. So in the above example, the cold subfunction is instead:
irq_do_set_affinity.cold()
Tweak the objtool subfunction detection logic so that it detects both GCC 8 and GCC 9 naming schemes.
Reported-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/015e9544b1f188d36a7f02fa31e9e95629aa5f50.154104080... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- tools/objtool/elf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/objtool/elf.c +++ b/tools/objtool/elf.c @@ -305,7 +305,7 @@ static int read_symbols(struct elf *elf) if (sym->type != STT_FUNC) continue; sym->pfunc = sym->cfunc = sym; - coldstr = strstr(sym->name, ".cold."); + coldstr = strstr(sym->name, ".cold"); if (!coldstr) continue;
From: Linus Torvalds torvalds@linux-foundation.org
commit 459e3a21535ae3c7a9a123650e54f5c882b8fcbf upstream.
The pvlock_page and hvclock_page variables are (as the name implies) addresses to pages, created by the linker script.
But we declared them as just "extern u8" variables, which _works_, but now that gcc does some more bounds checking, it causes warnings like
warning: array subscript 1 is outside array bounds of ‘u8[1]’
when we then access more than one byte from those variables.
Fix this by simply making the declaration of the variables match reality, which makes the compiler happy too.
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/vdso/vclock_gettime.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/entry/vdso/vclock_gettime.c +++ b/arch/x86/entry/vdso/vclock_gettime.c @@ -29,12 +29,12 @@ extern int __vdso_gettimeofday(struct ti extern time_t __vdso_time(time_t *t);
#ifdef CONFIG_PARAVIRT_CLOCK -extern u8 pvclock_page +extern u8 pvclock_page[PAGE_SIZE] __attribute__((visibility("hidden"))); #endif
#ifdef CONFIG_HYPERV_TSCPAGE -extern u8 hvclock_page +extern u8 hvclock_page[PAGE_SIZE] __attribute__((visibility("hidden"))); #endif
From: Andy Lutomirski luto@kernel.org
commit ff17bbe0bb405ad8b36e55815d381841f9fdeebc upstream.
GCC 5.5.0 sometimes cleverly hoists reads of the pvclock and/or hvclock pages before the vclock mode checks. This creates a path through vclock_gettime() in which no vclock is enabled at all (due to disabled TSC on old CPUs, for example) but the pvclock or hvclock page nevertheless read. This will segfault on bare metal.
This fixes commit 459e3a21535a ("gcc-9: properly declare the {pv,hv}clock_page storage") in the sense that, before that commit, GCC didn't seem to generate the offending code. There was nothing wrong with that commit per se, and -stable maintainers should backport this to all supported kernels regardless of whether the offending commit was present, since the same crash could just as easily be triggered by the phase of the moon.
On GCC 9.1.1, this doesn't seem to affect the generated code at all, so I'm not too concerned about performance regressions from this fix.
Cc: stable@vger.kernel.org Cc: x86@kernel.org Cc: Borislav Petkov bp@alien8.de Reported-by: Duncan Roe duncan_roe@optusnet.com.au Signed-off-by: Andy Lutomirski luto@kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/entry/vdso/vclock_gettime.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)
--- a/arch/x86/entry/vdso/vclock_gettime.c +++ b/arch/x86/entry/vdso/vclock_gettime.c @@ -191,13 +191,24 @@ notrace static inline u64 vgetsns(int *m
if (gtod->vclock_mode == VCLOCK_TSC) cycles = vread_tsc(); + + /* + * For any memory-mapped vclock type, we need to make sure that gcc + * doesn't cleverly hoist a load before the mode check. Otherwise we + * might end up touching the memory-mapped page even if the vclock in + * question isn't enabled, which will segfault. Hence the barriers. + */ #ifdef CONFIG_PARAVIRT_CLOCK - else if (gtod->vclock_mode == VCLOCK_PVCLOCK) + else if (gtod->vclock_mode == VCLOCK_PVCLOCK) { + barrier(); cycles = vread_pvclock(mode); + } #endif #ifdef CONFIG_HYPERV_TSCPAGE - else if (gtod->vclock_mode == VCLOCK_HVCLOCK) + else if (gtod->vclock_mode == VCLOCK_HVCLOCK) { + barrier(); cycles = vread_hvclock(mode); + } #endif else return 0;
On 8/5/19 7:02 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.137 release. There are 53 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 07 Aug 2019 12:47:58 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.137-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
On Mon, 5 Aug 2019 at 18:38, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 4.14.137 release. There are 53 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 07 Aug 2019 12:47:58 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.137-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 4.14.137-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-4.14.y git commit: 20d3ec30650b0c33377164def17390367716d4c8 git describe: v4.14.136-54-g20d3ec30650b Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.14-oe/build/v4.14.136-5...
No regressions (compared to build v4.14.136)
No fixes (compared to build v4.14.136)
Ran 21564 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - arm64 - hi6220-hikey - arm64 - i386 - juno-r2 - arm64 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - arm - x86_64
Test Suites ----------- * build * install-android-platform-tools-r2600 * kselftest * libhugetlbfs * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * perf * spectre-meltdown-checker-test * v4l2-compliance * ltp-ipc-tests * ltp-timers-tests * network-basic-tests * ltp-open-posix-tests * kvm-unit-tests * ssuite * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
Greg Kroah-Hartman gregkh@linuxfoundation.org 于2019年8月5日周一 下午3:14写道:
This is the start of the stable review cycle for the 4.14.137 release. There are 53 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 07 Aug 2019 12:47:58 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.137-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
Merge, and regression tested on my test machines, all looks good!
Thanks, Jack Wang
On Mon, Aug 05, 2019 at 03:02:25PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.137 release. There are 53 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 07 Aug 2019 12:47:58 PM UTC. Anything received after that time might be too late.
Build results: total: 172 pass: 172 fail: 0 Qemu test results: total: 346 pass: 346 fail: 0
Guenter
On 05/08/2019 14:02, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 4.14.137 release. There are 53 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 07 Aug 2019 12:47:58 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.137-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v4.14: 8 builds: 8 pass, 0 fail 16 boots: 16 pass, 0 fail 24 tests: 24 pass, 0 fail
Linux version: 4.14.137-rc1-g20d3ec30650b Boards tested: tegra124-jetson-tk1, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Cheers Jon
linux-stable-mirror@lists.linaro.org