This is the start of the stable review cycle for the 5.3.9 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 06 Nov 2019 09:14:04 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.3.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.3.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.3.9-rc1
Qian Cai cai@lca.pw sched/fair: Fix -Wunused-but-set-variable warnings
Jason Gunthorpe jgg@ziepe.ca RDMA/mlx5: Use irq xarray locking for mkey_table
Justin Song flyingecar@gmail.com ALSA: usb-audio: Add DSD support for Gustard U16/X26 USB Interface
Jussi Laako jussi@sonarnerd.net ALSA: usb-audio: Update DSD support quirks for Oppo and Rotel
Jussi Laako jussi@sonarnerd.net ALSA: usb-audio: DSD auto-detection for Playback Designs
Dave Chiluk chiluk+linux@indeed.com sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices
Luca Coelho luciano.coelho@intel.com iwlwifi: exclude GEO SAR support for 3168
Vlad Buslov vladbu@mellanox.com net: sched: sch_sfb: don't call qdisc_put() while holding tree lock
Eric Dumazet edumazet@google.com sch_netem: fix rcu splat in netem_enqueue()
Valentin Vidic vvidic@valentin-vidic.from.hr net: usb: sr9800: fix uninitialized local variable
Eric Dumazet edumazet@google.com netfilter: conntrack: avoid possible false sharing
Eric Dumazet edumazet@google.com bonding: fix potential NULL deref in bond_update_slave_arr
Johan Hovold johan@kernel.org NFC: pn533: fix use-after-free and memleaks
David Howells dhowells@redhat.com rxrpc: Fix trace-after-put looking at the put peer record
David Howells dhowells@redhat.com rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record
David Howells dhowells@redhat.com rxrpc: Fix call ref leak
Eric Biggers ebiggers@google.com llc: fix sk_buff leak in llc_conn_service()
Eric Biggers ebiggers@google.com llc: fix sk_buff leak in llc_sap_state_process()
Sven Eckelmann sven@narfation.org batman-adv: Avoid free/alloc race when handling OGM buffer
John Donnelly John.P.Donnelly@Oracle.com iommu/vt-d: Fix panic after kexec -p for kdump
Jens Axboe axboe@kernel.dk io_uring: ensure we clear io_kiocb->result before each issue
Trond Myklebust trondmy@gmail.com NFS: Fix an RCU lock leak in nfs4_refresh_delegation_stateid()
chen gong curry.gong@amd.com drm/amdgpu: Fix SDMA hang when performing VKexample test
Pelle van Gils pelle@vangils.xyz drm/amdgpu/powerplay/vega10: allow undervolting in p7
Tianci.Yin tianci.yin@amd.com drm/amdgpu/gfx10: update gfx golden settings
Ville Syrjälä ville.syrjala@linux.intel.com drm/i915: Fix PCH reference clock for FDI on HSW/BDW
Alex Deucher alexander.deucher@amd.com drm/amdgpu/gmc10: properly set BANK_SELECT and FRAGMENT_SIZE
Tony Lindgren tony@atomide.com dmaengine: cppi41: Fix cppi41_dma_prep_slave_sg() when idle
Robin Gong yibin.gong@nxp.com dmaengine: imx-sdma: fix size check for sdma script_number
Sameer Pujar spujar@nvidia.com dmaengine: tegra210-adma: fix transfer failure
Jeffrey Hugo jeffrey.l.hugo@gmail.com dmaengine: qcom: bam_dma: Fix resource leak
Paolo Bonzini pbonzini@redhat.com KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is active
Laura Abbott labbott@redhat.com rtlwifi: Fix potential overflow on P2P code
Larry Finger Larry.Finger@lwfinger.net rtlwifi: rtl_pci: Fix problem of too small skb->len
Marvin Liu yong.liu@intel.com virtio_ring: fix stalls for packed rings
Bjorn Andersson bjorn.andersson@linaro.org arm64: cpufeature: Enable Qualcomm Falkor/Kryo errata 1003
Catalin Marinas catalin.marinas@arm.com arm64: Ensure VM_WRITE|VM_SHARED ptes are clean by default
Kaike Wan kaike.wan@intel.com IB/hfi1: Avoid excessive retry for TID RDMA READ request
Alexey Brodkin Alexey.Brodkin@synopsys.com ARC: perf: Accommodate big-endian CPU
Heiko Carstens heiko.carstens@de.ibm.com s390/idle: fix cpu idle time calculation
Yihui ZENG yzeng56@asu.edu s390/cmm: fix information leak in cmm_timeout_handler()
Ilya Leoshkevich iii@linux.ibm.com s390/unwind: fix mixing regs and sp
Anton Ivanov anton.ivanov@cambridgegreys.com um-ubd: Entrust re-queue to the upper layers
Andrey Smirnov andrew.smirnov@gmail.com HID: logitech-hidpp: do all FF cleanup in hidpp_ff_destroy()
Andrey Smirnov andrew.smirnov@gmail.com HID: logitech-hidpp: rework device validation
Andrey Smirnov andrew.smirnov@gmail.com HID: logitech-hidpp: split g920_get_config()
Michał Mirosław mirq-linux@rere.qmqm.pl HID: fix error message in hid_open_report()
Alan Stern stern@rowland.harvard.edu HID: Fix assumption that devices have inputs
Hans de Goede hdegoede@redhat.com HID: i2c-hid: add Trekstor Primebook C11B to descriptor override
Bart Van Assche bvanassche@acm.org scsi: target: cxgbit: Fix cxgbit_fw4_ack()
Quinn Tran qutran@marvell.com scsi: qla2xxx: Fix partial flash write of MBI
Mathias Nyman mathias.nyman@linux.intel.com xhci: Fix use-after-free regression in xhci clear hub TT implementation
Johan Hovold johan@kernel.org USB: serial: whiteheat: fix line-speed endianness
Johan Hovold johan@kernel.org USB: serial: whiteheat: fix potential slab corruption
Ben Dooks (Codethink) ben.dooks@codethink.co.uk usb: xhci: fix __le32/__le64 accessors in debugfs code
Samuel Holland samuel@sholland.org usb: xhci: fix Immediate Data Transfer endianness
Johan Hovold johan@kernel.org USB: ldusb: fix control-message timeout
Johan Hovold johan@kernel.org USB: ldusb: fix ring-buffer locking
Alan Stern stern@rowland.harvard.edu usb-storage: Revert commit 747668dbc061 ("usb-storage: Set virt_boundary_mask to avoid SG overflows")
Alan Stern stern@rowland.harvard.edu USB: gadget: Reject endpoints with 0 maxpacket value
Markus Theil markus.theil@tu-ilmenau.de nl80211: fix validation of mesh path nexthop
Alan Stern stern@rowland.harvard.edu UAS: Revert commit 3ae62a42090f ("UAS: fix alignment of scatter/gather segments")
Miaoqing Pan miaoqing@codeaurora.org ath10k: fix latency issue for QCA988x
Kailang Yang kailang@realtek.com ALSA: hda/realtek - Add support for ALC623
Aaron Ma aaron.ma@canonical.com ALSA: hda/realtek - Fix 2 front mics of codec 0x623
Takashi Iwai tiwai@suse.de ALSA: timer: Fix mutex deadlock at releasing card
Takashi Sakamoto o-takashi@sakamocchi.jp ALSA: bebob: Fix prototype of helper function to return negative value
Miklos Szeredi mszeredi@redhat.com fuse: truncate pending writes on O_TRUNC
Miklos Szeredi mszeredi@redhat.com fuse: flush dirty data/metadata before non-truncate setattr
Hui Peng benquike@gmail.com ath6kl: fix a NULL-ptr-deref bug in ath6kl_usb_alloc_urb_from_pipe()
Mika Westerberg mika.westerberg@linux.intel.com thunderbolt: Use 32-bit writes when writing ring producer/consumer
Mika Westerberg mika.westerberg@linux.intel.com thunderbolt: Correct path indices for PCIe tunnel
Sebastian Ott sebott@linux.ibm.com s390/pci: fix MSI message data
Joe Perches joe@perches.com rtw88: Fix misuse of GENMASK macro
Jeffrey Hugo jeffrey.l.hugo@gmail.com arm64: dts: qcom: Add Asus NovaGo TP370QL
Jeffrey Hugo jeffrey.l.hugo@gmail.com arm64: dts: qcom: Add HP Envy x2
Jeffrey Hugo jeffrey.l.hugo@gmail.com arm64: dts: qcom: Add Lenovo Miix 630
Mike Christie mchristi@redhat.com nbd: verify socket is supported during setup
Dan Carpenter dan.carpenter@oracle.com USB: legousbtower: fix a signedness bug in tower_probe()
Thomas Richter tmricht@linux.ibm.com perf/aux: Fix tracking of auxiliary trace buffer allocation
Gustavo A. R. Silva gustavo@embeddedor.com perf annotate: Fix multiple memory and file descriptor leaks
Petr Mladek pmladek@suse.com tracing: Initialize iter->seq after zeroing in tracing_read_pipe()
Christian Borntraeger borntraeger@de.ibm.com s390/uaccess: avoid (false positive) compiler warnings
Benjamin Coddington bcodding@redhat.com SUNRPC: fix race to sk_err after xs_error_report
Chuck Lever chuck.lever@oracle.com NFSv4: Fix leak of clp->cl_acceptor string
Xiubo Li xiubli@redhat.com nbd: fix possible sysfs duplicate warning
Navid Emamdoost navid.emamdoost@gmail.com virt: vbox: fix memory leak in hgcm_call_preprocess_linaddr
Halil Pasic pasic@linux.ibm.com s390/cio: fix virtio-ccw DMA without PV
Thomas Bogendoerfer tbogendoerfer@suse.de MIPS: fw: sni: Fix out of bounds init of o32 stack
Thomas Bogendoerfer tbogendoerfer@suse.de MIPS: include: Mark __xchg as __always_inline
Lorenzo Bianconi lorenzo@kernel.org iio: imu: st_lsm6dsx: fix waitime for st_lsm6dsx i2c controller
Navid Emamdoost navid.emamdoost@gmail.com iio: imu: adis16400: fix memory leak
Navid Emamdoost navid.emamdoost@gmail.com iio: imu: adis16400: release allocated memory on failure
Nirmoy Das nirmoy.das@amd.com drm/amdgpu: fix memory leak
Tom Lendacky thomas.lendacky@amd.com perf/x86/amd: Change/fix NMI latency mitigation to use a timestamp
Song Liu songliubraving@fb.com perf/core: Fix corner case in perf_rotate_context()
Song Liu songliubraving@fb.com perf/core: Rework memory accounting in perf_mmap()
Frederic Weisbecker frederic@kernel.org sched/vtime: Fix guest/system mis-accounting on task switch
Xuewei Zhang xueweiz@google.com sched/fair: Scale bandwidth quota and period without losing quota/period ratio precision
Kan Liang kan.liang@linux.intel.com x86/cpu: Add Comet Lake to the Intel CPU models header
Yunfeng Ye yeyunfeng@huawei.com arm64: armv8_deprecated: Checking return value for memory allocation
Austin Kim austindh.kim@gmail.com btrfs: silence maybe-uninitialized warning in clone_range
Jia-Ju Bai baijiaju1990@gmail.com fs: ocfs2: fix a possible null-pointer dereference in ocfs2_info_scan_inode_alloc()
Jia-Ju Bai baijiaju1990@gmail.com fs: ocfs2: fix a possible null-pointer dereference in ocfs2_write_end_nolock()
Jia-Ju Bai baijiaju1990@gmail.com fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()
Jia Guo guojia12@huawei.com ocfs2: clear zero in unaligned direct IO
Boris Ostrovsky boris.ostrovsky@oracle.com x86/xen: Return from panic notifier
Vincent Chen vincent.chen@sifive.com riscv: Correct the handling of unexpected ebreak in do_trap_break()
Vincent Chen vincent.chen@sifive.com riscv: avoid sending a SIGTRAP to a user thread trapped in WARN()
Vincent Chen vincent.chen@sifive.com riscv: avoid kernel hangs when trapped in BUG()
Thomas Bogendoerfer tbogendoerfer@suse.de MIPS: include: Mark __cmpxchg as __always_inline
Dave Young dyoung@redhat.com efi/x86: Do not clean dummy variable in kexec path
Lukas Wunner lukas@wunner.de efi/cper: Fix endianness of PCIe class code
Will Deacon will@kernel.org arm64: vdso32: Don't use KBUILD_CPPFLAGS unconditionally
Will Deacon will@kernel.org arm64: Default to building compat vDSO with clang when CONFIG_CC_IS_CLANG
Adam Ford aford173@gmail.com serial: 8250_omap: Fix gpio check for auto RTS/CTS
Adam Ford aford173@gmail.com serial: mctrl_gpio: Check for NULL pointer
Vincenzo Frascino vincenzo.frascino@arm.com arm64: vdso32: Detect binutils support for dmb ishld
Vincenzo Frascino vincenzo.frascino@arm.com arm64: vdso32: Fix broken compat vDSO build warnings
Austin Kim austindh.kim@gmail.com fs: cifs: mute -Wunused-const-variable message
Thierry Reding treding@nvidia.com gpio: max77620: Use correct unit for debounce times
Jason Gunthorpe jgg@ziepe.ca RDMA/mlx5: Add missing synchronize_srcu() for MW cases
Jason Gunthorpe jgg@ziepe.ca RDMA/mlx5: Order num_pending_prefetch properly with synchronize_srcu
Jason Gunthorpe jgg@ziepe.ca RDMA/mlx5: Do not allow rereg of a ODP MR
Leon Romanovsky leon@kernel.org RDMA/nldev: Reshuffle the code to avoid need to rebind QP in error path
Jack Morgenstein jackm@dev.mellanox.co.il RDMA/cm: Fix memory leak in cm_add/remove_one
Christophe JAILLET christophe.jaillet@wanadoo.fr RDMA/core: Fix an error handling path in 'res_get_common_doit()'
Navid Emamdoost navid.emamdoost@gmail.com misc: fastrpc: prevent memory leak in fastrpc_dma_buf_attach
Randy Dunlap rdunlap@infradead.org tty: n_hdlc: fix build on SPARC
Christoph Hellwig hch@lst.de serial/sifive: select SERIAL_EARLYCON
Christophe JAILLET christophe.jaillet@wanadoo.fr tty: serial: rda: Fix the link time qualifier of 'rda_uart_exit()'
Christophe JAILLET christophe.jaillet@wanadoo.fr tty: serial: owl: Fix the link time qualifier of 'owl_uart_exit()'
James Morse james.morse@arm.com arm64: ftrace: Ensure synchronisation in PLT setup for Neoverse-N1 #1542419
James Morse james.morse@arm.com arm64: Fix incorrect irqflag restore for priority masking for compat
Julien Grall julien.grall@arm.com arm64: cpufeature: Effectively expose FRINT capability to userspace
ZhangXiaoxu zhangxiaoxu5@huawei.com nfs: Fix nfsi->nrequests count error on nfs_inode_remove_request
Kees Cook keescook@chromium.org selftests/kselftest/runner.sh: Add 45 second timeout per test
Cristian Marussi cristian.marussi@arm.com kselftest: exclude failed TARGETS from runlist
Dexuan Cui decui@microsoft.com HID: hyperv: Use in-place iterator API in the channel callback
Bart Van Assche bvanassche@acm.org RDMA/iwcm: Fix a lock inversion issue
Potnuri Bharat Teja bharat@chelsio.com RDMA/iw_cxgb4: fix SRQ access from dump_qp()
Navid Emamdoost navid.emamdoost@gmail.com RDMA/hfi1: Prevent memory leak in sdma_init
Krishnamraju Eraparaju krishna2@chelsio.com RDMA/siw: Fix serialization issue in write_space()
Connor Kuehl connor.kuehl@canonical.com staging: rtl8188eu: fix null dereference when kzalloc fails
Arnaldo Carvalho de Melo acme@redhat.com perf annotate: Don't return -1 for error when doing BPF disassembly
Arnaldo Carvalho de Melo acme@redhat.com perf annotate: Return appropriate error code for allocation failures
Arnaldo Carvalho de Melo acme@redhat.com perf annotate: Fix arch specific ->init() failure errors
Arnaldo Carvalho de Melo acme@redhat.com perf annotate: Propagate the symbol__annotate() error return
Arnaldo Carvalho de Melo acme@redhat.com perf annotate: Fix the signedness of failure returns
Arnaldo Carvalho de Melo acme@redhat.com perf annotate: Propagate perf_env__arch() error
Arnaldo Carvalho de Melo acme@redhat.com perf tools: Propagate get_cpuid() error
Andi Kleen ak@linux.intel.com perf jevents: Fix period for Intel fixed counters
Andi Kleen ak@linux.intel.com perf script brstackinsn: Fix recovery from LBR/binary mismatch
Steve MacLean Steve.MacLean@microsoft.com perf map: Fix overlapped map handling
Ian Rogers irogers@google.com perf tests: Avoid raising SEGV using an obvious NULL dereference
Ian Rogers irogers@google.com libsubcmd: Make _FORTIFY_SOURCE defines dependent on the feature
Pascal Bouwmann bouwmann@tau-tec.de iio: fix center temperature of bmc150-accel-core
Remi Pommarel repk@triplefau.lt iio: adc: meson_saradc: Fix memory allocation order
Qu Wenruo wqu@suse.com btrfs: qgroup: Always free PREALLOC META reserve in btrfs_delalloc_release_extents()
Filipe Manana fdmanana@suse.com Btrfs: fix inode cache block reserve leak on failure to allocate data space
Mikulas Patocka mpatocka@redhat.com dm snapshot: rework COW throttling to fix deadlock
Mikulas Patocka mpatocka@redhat.com dm snapshot: introduce account_start_copy() and account_end_copy()
Jens Axboe axboe@kernel.dk io_uring: fix up O_NONBLOCK handling for sockets
-------------
Diffstat:
Documentation/admin-guide/kernel-parameters.txt | 4 + Documentation/scheduler/sched-bwc.rst | 74 ++++-- Makefile | 4 +- arch/arc/kernel/perf_event.c | 4 +- arch/arm64/Kconfig | 2 +- arch/arm64/Makefile | 22 +- arch/arm64/boot/dts/qcom/Makefile | 3 + .../boot/dts/qcom/msm8998-asus-novago-tp370ql.dts | 47 ++++ arch/arm64/boot/dts/qcom/msm8998-clamshell.dtsi | 240 ++++++++++++++++++++ arch/arm64/boot/dts/qcom/msm8998-hp-envy-x2.dts | 30 +++ .../boot/dts/qcom/msm8998-lenovo-miix-630.dts | 30 +++ arch/arm64/include/asm/pgtable-prot.h | 15 +- arch/arm64/include/asm/vdso/compat_barrier.h | 2 +- arch/arm64/kernel/armv8_deprecated.c | 5 + arch/arm64/kernel/cpu_errata.c | 1 + arch/arm64/kernel/cpufeature.c | 1 + arch/arm64/kernel/entry.S | 1 + arch/arm64/kernel/ftrace.c | 12 +- arch/arm64/kernel/vdso32/Makefile | 17 +- arch/mips/fw/sni/sniprom.c | 2 +- arch/mips/include/asm/cmpxchg.h | 9 +- arch/riscv/kernel/traps.c | 14 +- arch/s390/include/asm/uaccess.h | 4 +- arch/s390/include/asm/unwind.h | 1 + arch/s390/kernel/idle.c | 29 ++- arch/s390/kernel/unwind_bc.c | 18 +- arch/s390/mm/cmm.c | 12 +- arch/s390/pci/pci_irq.c | 2 +- arch/um/drivers/ubd_kern.c | 8 +- arch/x86/events/amd/core.c | 30 +-- arch/x86/include/asm/intel-family.h | 3 + arch/x86/kvm/svm.c | 10 +- arch/x86/kvm/vmx/vmx.c | 14 +- arch/x86/platform/efi/efi.c | 3 - arch/x86/xen/enlighten.c | 28 ++- drivers/block/nbd.c | 25 ++- drivers/dma/imx-sdma.c | 8 + drivers/dma/qcom/bam_dma.c | 19 ++ drivers/dma/tegra210-adma.c | 7 + drivers/dma/ti/cppi41.c | 21 +- drivers/firmware/efi/cper.c | 2 +- drivers/gpio/gpio-max77620.c | 6 +- drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 14 +- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c | 9 + drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c | 9 + drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 1 + drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 4 +- drivers/gpu/drm/i915/display/intel_display.c | 11 +- drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 15 ++ drivers/gpu/drm/i915/i915_drv.h | 2 + drivers/hid/hid-axff.c | 11 +- drivers/hid/hid-core.c | 7 +- drivers/hid/hid-dr.c | 12 +- drivers/hid/hid-emsff.c | 12 +- drivers/hid/hid-gaff.c | 12 +- drivers/hid/hid-holtekff.c | 12 +- drivers/hid/hid-hyperv.c | 56 +---- drivers/hid/hid-lg2ff.c | 12 +- drivers/hid/hid-lg3ff.c | 11 +- drivers/hid/hid-lg4ff.c | 11 +- drivers/hid/hid-lgff.c | 11 +- drivers/hid/hid-logitech-hidpp.c | 248 ++++++++++++--------- drivers/hid/hid-microsoft.c | 12 +- drivers/hid/hid-sony.c | 12 +- drivers/hid/hid-tmff.c | 12 +- drivers/hid/hid-zpff.c | 12 +- drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c | 19 ++ drivers/iio/accel/bmc150-accel-core.c | 2 +- drivers/iio/adc/meson_saradc.c | 10 +- drivers/iio/imu/adis_buffer.c | 10 +- drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c | 4 +- drivers/infiniband/core/cm.c | 3 + drivers/infiniband/core/cma.c | 3 +- drivers/infiniband/core/nldev.c | 12 +- drivers/infiniband/hw/cxgb4/device.c | 7 +- drivers/infiniband/hw/cxgb4/qp.c | 10 +- drivers/infiniband/hw/hfi1/sdma.c | 5 +- drivers/infiniband/hw/hfi1/tid_rdma.c | 5 - drivers/infiniband/hw/mlx5/devx.c | 58 ++--- drivers/infiniband/hw/mlx5/mlx5_ib.h | 1 - drivers/infiniband/hw/mlx5/mr.c | 32 ++- drivers/infiniband/sw/siw/siw_qp.c | 15 +- drivers/iommu/intel-iommu.c | 2 +- drivers/md/dm-snap.c | 94 ++++++-- drivers/misc/fastrpc.c | 1 + drivers/net/bonding/bond_main.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/mr.c | 8 +- drivers/net/usb/sr9800.c | 2 +- drivers/net/wireless/ath/ath10k/core.c | 15 +- drivers/net/wireless/ath/ath6kl/usb.c | 8 + drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 16 +- drivers/net/wireless/realtek/rtlwifi/pci.c | 3 +- drivers/net/wireless/realtek/rtlwifi/ps.c | 6 + drivers/net/wireless/realtek/rtw88/rtw8822b.c | 2 +- drivers/nfc/pn533/usb.c | 9 +- drivers/s390/cio/cio.h | 1 + drivers/s390/cio/css.c | 7 +- drivers/s390/cio/device.c | 2 +- drivers/scsi/qla2xxx/qla_attr.c | 7 +- drivers/staging/rtl8188eu/os_dep/usb_intf.c | 6 +- drivers/target/iscsi/cxgbit/cxgbit_cm.c | 3 +- drivers/thunderbolt/nhi.c | 22 +- drivers/thunderbolt/tunnel.c | 4 +- drivers/tty/n_hdlc.c | 5 + drivers/tty/serial/8250/8250_omap.c | 5 +- drivers/tty/serial/Kconfig | 1 + drivers/tty/serial/owl-uart.c | 2 +- drivers/tty/serial/rda-uart.c | 2 +- drivers/tty/serial/serial_mctrl_gpio.c | 3 + drivers/usb/gadget/udc/core.c | 11 + drivers/usb/host/xhci-debugfs.c | 24 +- drivers/usb/host/xhci-ring.c | 2 + drivers/usb/host/xhci.c | 54 ++++- drivers/usb/misc/ldusb.c | 6 +- drivers/usb/misc/legousbtower.c | 2 +- drivers/usb/serial/whiteheat.c | 13 +- drivers/usb/serial/whiteheat.h | 2 +- drivers/usb/storage/scsiglue.c | 10 - drivers/usb/storage/uas.c | 20 -- drivers/virt/vboxguest/vboxguest_utils.c | 3 +- drivers/virtio/virtio_ring.c | 7 +- fs/btrfs/ctree.h | 3 +- fs/btrfs/delalloc-space.c | 6 +- fs/btrfs/file.c | 7 +- fs/btrfs/inode-map.c | 5 +- fs/btrfs/inode.c | 12 +- fs/btrfs/ioctl.c | 6 +- fs/btrfs/relocation.c | 9 +- fs/btrfs/send.c | 2 +- fs/cifs/netmisc.c | 4 - fs/fuse/dir.c | 13 ++ fs/fuse/file.c | 10 +- fs/io_uring.c | 58 +++-- fs/nfs/delegation.c | 2 +- fs/nfs/nfs4proc.c | 1 + fs/nfs/write.c | 5 +- fs/ocfs2/aops.c | 25 ++- fs/ocfs2/ioctl.c | 2 +- fs/ocfs2/xattr.c | 56 ++--- include/linux/platform_data/dma-imx-sdma.h | 3 + include/linux/sunrpc/xprtsock.h | 1 + include/net/llc_conn.h | 2 +- include/net/sch_generic.h | 5 + include/trace/events/rxrpc.h | 6 +- kernel/events/core.c | 45 +++- kernel/sched/cputime.c | 6 +- kernel/sched/fair.c | 127 +++-------- kernel/sched/sched.h | 4 - kernel/trace/trace.c | 1 + net/batman-adv/bat_iv_ogm.c | 61 ++++- net/batman-adv/hard-interface.c | 2 + net/batman-adv/types.h | 3 + net/llc/llc_c_ac.c | 8 +- net/llc/llc_conn.c | 32 +-- net/llc/llc_s_ac.c | 12 +- net/llc/llc_sap.c | 23 +- net/netfilter/nf_conntrack_core.c | 4 +- net/rxrpc/peer_object.c | 16 +- net/rxrpc/sendmsg.c | 1 + net/sched/sch_netem.c | 2 +- net/sched/sch_sfb.c | 7 +- net/sunrpc/xprtsock.c | 17 +- net/wireless/nl80211.c | 2 +- sound/core/timer.c | 24 +- sound/firewire/bebob/bebob_stream.c | 3 +- sound/pci/hda/patch_realtek.c | 11 + sound/usb/quirks.c | 13 +- tools/lib/subcmd/Makefile | 8 +- tools/perf/arch/arm/annotate/instructions.c | 4 +- tools/perf/arch/arm64/annotate/instructions.c | 4 +- tools/perf/arch/powerpc/util/header.c | 3 +- tools/perf/arch/s390/annotate/instructions.c | 6 +- tools/perf/arch/s390/util/header.c | 9 +- tools/perf/arch/x86/annotate/instructions.c | 6 +- tools/perf/arch/x86/util/header.c | 3 +- tools/perf/builtin-kvm.c | 7 +- tools/perf/builtin-script.c | 6 +- tools/perf/pmu-events/jevents.c | 12 +- tools/perf/tests/perf-hooks.c | 3 +- tools/perf/util/annotate.c | 35 ++- tools/perf/util/annotate.h | 4 + tools/perf/util/map.c | 3 + tools/testing/selftests/Makefile | 4 + tools/testing/selftests/kselftest/runner.sh | 36 ++- tools/testing/selftests/rtc/settings | 1 + 186 files changed, 1848 insertions(+), 880 deletions(-)
From: Jens Axboe axboe@kernel.dk
[ Upstream commit 491381ce07ca57f68c49c79a8a43da5b60749e32 ]
We've got two issues with the non-regular file handling for non-blocking IO:
1) We don't want to re-do a short read in full for a non-regular file, as we can't just read the data again. 2) For non-regular files that don't support non-blocking IO attempts, we need to punt to async context even if the file is opened as non-blocking. Otherwise the caller always gets -EAGAIN.
Add two new request flags to handle these cases. One is just a cache of the inode S_ISREG() status, the other tells io_uring that we always need to punt this request to async context, even if REQ_F_NOWAIT is set.
Cc: stable@vger.kernel.org Reported-by: Hrvoje Zeba zeba.hrvoje@gmail.com Tested-by: Hrvoje Zeba zeba.hrvoje@gmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 57 +++++++++++++++++++++++++++++++++++---------------- 1 file changed, 39 insertions(+), 18 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index ed223c33dd898..59925b6583ba0 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -338,6 +338,8 @@ struct io_kiocb { #define REQ_F_LINK 64 /* linked sqes */ #define REQ_F_LINK_DONE 128 /* linked sqes done */ #define REQ_F_FAIL_LINK 256 /* fail rest of links */ +#define REQ_F_ISREG 2048 /* regular file */ +#define REQ_F_MUST_PUNT 4096 /* must be punted even for NONBLOCK */ u64 user_data; u32 result; u32 sequence; @@ -885,26 +887,26 @@ static int io_iopoll_check(struct io_ring_ctx *ctx, unsigned *nr_events, return ret; }
-static void kiocb_end_write(struct kiocb *kiocb) +static void kiocb_end_write(struct io_kiocb *req) { - if (kiocb->ki_flags & IOCB_WRITE) { - struct inode *inode = file_inode(kiocb->ki_filp); + /* + * Tell lockdep we inherited freeze protection from submission + * thread. + */ + if (req->flags & REQ_F_ISREG) { + struct inode *inode = file_inode(req->file);
- /* - * Tell lockdep we inherited freeze protection from submission - * thread. - */ - if (S_ISREG(inode->i_mode)) - __sb_writers_acquired(inode->i_sb, SB_FREEZE_WRITE); - file_end_write(kiocb->ki_filp); + __sb_writers_acquired(inode->i_sb, SB_FREEZE_WRITE); } + file_end_write(req->file); }
static void io_complete_rw(struct kiocb *kiocb, long res, long res2) { struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw);
- kiocb_end_write(kiocb); + if (kiocb->ki_flags & IOCB_WRITE) + kiocb_end_write(req);
if ((req->flags & REQ_F_LINK) && res != req->result) req->flags |= REQ_F_FAIL_LINK; @@ -916,7 +918,8 @@ static void io_complete_rw_iopoll(struct kiocb *kiocb, long res, long res2) { struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw);
- kiocb_end_write(kiocb); + if (kiocb->ki_flags & IOCB_WRITE) + kiocb_end_write(req);
if ((req->flags & REQ_F_LINK) && res != req->result) req->flags |= REQ_F_FAIL_LINK; @@ -1030,8 +1033,17 @@ static int io_prep_rw(struct io_kiocb *req, const struct sqe_submit *s, if (!req->file) return -EBADF;
- if (force_nonblock && !io_file_supports_async(req->file)) - force_nonblock = false; + if (S_ISREG(file_inode(req->file)->i_mode)) + req->flags |= REQ_F_ISREG; + + /* + * If the file doesn't support async, mark it as REQ_F_MUST_PUNT so + * we know to async punt it even if it was opened O_NONBLOCK + */ + if (force_nonblock && !io_file_supports_async(req->file)) { + req->flags |= REQ_F_MUST_PUNT; + return -EAGAIN; + }
kiocb->ki_pos = READ_ONCE(sqe->off); kiocb->ki_flags = iocb_flags(kiocb->ki_filp); @@ -1052,7 +1064,8 @@ static int io_prep_rw(struct io_kiocb *req, const struct sqe_submit *s, return ret;
/* don't allow async punt if RWF_NOWAIT was requested */ - if (kiocb->ki_flags & IOCB_NOWAIT) + if ((kiocb->ki_flags & IOCB_NOWAIT) || + (req->file->f_flags & O_NONBLOCK)) req->flags |= REQ_F_NOWAIT;
if (force_nonblock) @@ -1286,7 +1299,9 @@ static int io_read(struct io_kiocb *req, const struct sqe_submit *s, * need async punt anyway, so it's more efficient to do it * here. */ - if (force_nonblock && ret2 > 0 && ret2 < read_size) + if (force_nonblock && !(req->flags & REQ_F_NOWAIT) && + (req->flags & REQ_F_ISREG) && + ret2 > 0 && ret2 < read_size) ret2 = -EAGAIN; /* Catch -EAGAIN return for forced non-blocking submission */ if (!force_nonblock || ret2 != -EAGAIN) { @@ -1353,7 +1368,7 @@ static int io_write(struct io_kiocb *req, const struct sqe_submit *s, * released so that it doesn't complain about the held lock when * we return to userspace. */ - if (S_ISREG(file_inode(file)->i_mode)) { + if (req->flags & REQ_F_ISREG) { __sb_start_write(file_inode(file)->i_sb, SB_FREEZE_WRITE, true); __sb_writers_release(file_inode(file)->i_sb, @@ -2096,7 +2111,13 @@ static int io_queue_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req, }
ret = __io_submit_sqe(ctx, req, s, true); - if (ret == -EAGAIN && !(req->flags & REQ_F_NOWAIT)) { + + /* + * We async punt it if the file wasn't marked NOWAIT, or if the file + * doesn't support non-blocking read/write attempts + */ + if (ret == -EAGAIN && (!(req->flags & REQ_F_NOWAIT) || + (req->flags & REQ_F_MUST_PUNT))) { struct io_uring_sqe *sqe_copy;
sqe_copy = kmalloc(sizeof(*sqe_copy), GFP_KERNEL);
From: Mikulas Patocka mpatocka@redhat.com
[ Upstream commit a2f83e8b0c82c9500421a26c49eb198b25fcdea3 ]
This simple refactoring moves code for modifying the semaphore cow_count into separate functions to prepare for changes that will extend these methods to provide for a more sophisticated mechanism for COW throttling.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Reviewed-by: Nikos Tsironis ntsironis@arrikto.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-snap.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c index f150f5c5492b9..da3bd1794ee05 100644 --- a/drivers/md/dm-snap.c +++ b/drivers/md/dm-snap.c @@ -1512,6 +1512,16 @@ static void snapshot_dtr(struct dm_target *ti) kfree(s); }
+static void account_start_copy(struct dm_snapshot *s) +{ + down(&s->cow_count); +} + +static void account_end_copy(struct dm_snapshot *s) +{ + up(&s->cow_count); +} + /* * Flush a list of buffers. */ @@ -1732,7 +1742,7 @@ static void copy_callback(int read_err, unsigned long write_err, void *context) rb_link_node(&pe->out_of_order_node, parent, p); rb_insert_color(&pe->out_of_order_node, &s->out_of_order_tree); } - up(&s->cow_count); + account_end_copy(s); }
/* @@ -1756,7 +1766,7 @@ static void start_copy(struct dm_snap_pending_exception *pe) dest.count = src.count;
/* Hand over to kcopyd */ - down(&s->cow_count); + account_start_copy(s); dm_kcopyd_copy(s->kcopyd_client, &src, 1, &dest, 0, copy_callback, pe); }
@@ -1776,7 +1786,7 @@ static void start_full_bio(struct dm_snap_pending_exception *pe, pe->full_bio = bio; pe->full_bio_end_io = bio->bi_end_io;
- down(&s->cow_count); + account_start_copy(s); callback_data = dm_kcopyd_prepare_callback(s->kcopyd_client, copy_callback, pe);
@@ -1866,7 +1876,7 @@ static void zero_callback(int read_err, unsigned long write_err, void *context) struct bio *bio = context; struct dm_snapshot *s = bio->bi_private;
- up(&s->cow_count); + account_end_copy(s); bio->bi_status = write_err ? BLK_STS_IOERR : 0; bio_endio(bio); } @@ -1880,7 +1890,7 @@ static void zero_exception(struct dm_snapshot *s, struct dm_exception *e, dest.sector = bio->bi_iter.bi_sector; dest.count = s->store->chunk_size;
- down(&s->cow_count); + account_start_copy(s); WARN_ON_ONCE(bio->bi_private); bio->bi_private = s; dm_kcopyd_zero(s->kcopyd_client, 1, &dest, 0, zero_callback, bio);
From: Mikulas Patocka mpatocka@redhat.com
[ Upstream commit b21555786f18cd77f2311ad89074533109ae3ffa ]
Commit 721b1d98fb517a ("dm snapshot: Fix excessive memory usage and workqueue stalls") introduced a semaphore to limit the maximum number of in-flight kcopyd (COW) jobs.
The implementation of this throttling mechanism is prone to a deadlock:
1. One or more threads write to the origin device causing COW, which is performed by kcopyd.
2. At some point some of these threads might reach the s->cow_count semaphore limit and block in down(&s->cow_count), holding a read lock on _origins_lock.
3. Someone tries to acquire a write lock on _origins_lock, e.g., snapshot_ctr(), which blocks because the threads at step (2) already hold a read lock on it.
4. A COW operation completes and kcopyd runs dm-snapshot's completion callback, which ends up calling pending_complete(). pending_complete() tries to resubmit any deferred origin bios. This requires acquiring a read lock on _origins_lock, which blocks.
This happens because the read-write semaphore implementation gives priority to writers, meaning that as soon as a writer tries to enter the critical section, no readers will be allowed in, until all writers have completed their work.
So, pending_complete() waits for the writer at step (3) to acquire and release the lock. This writer waits for the readers at step (2) to release the read lock and those readers wait for pending_complete() (the kcopyd thread) to signal the s->cow_count semaphore: DEADLOCK.
The above was thoroughly analyzed and documented by Nikos Tsironis as part of his initial proposal for fixing this deadlock, see: https://www.redhat.com/archives/dm-devel/2019-October/msg00001.html
Fix this deadlock by reworking COW throttling so that it waits without holding any locks. Add a variable 'in_progress' that counts how many kcopyd jobs are running. A function wait_for_in_progress() will sleep if 'in_progress' is over the limit. It drops _origins_lock in order to avoid the deadlock.
Reported-by: Guruswamy Basavaiah guru2018@gmail.com Reported-by: Nikos Tsironis ntsironis@arrikto.com Reviewed-by: Nikos Tsironis ntsironis@arrikto.com Tested-by: Nikos Tsironis ntsironis@arrikto.com Fixes: 721b1d98fb51 ("dm snapshot: Fix excessive memory usage and workqueue stalls") Cc: stable@vger.kernel.org # v5.0+ Depends-on: 4a3f111a73a8c ("dm snapshot: introduce account_start_copy() and account_end_copy()") Signed-off-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Mike Snitzer snitzer@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-snap.c | 78 ++++++++++++++++++++++++++++++++++++-------- 1 file changed, 64 insertions(+), 14 deletions(-)
diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c index da3bd1794ee05..4fb1a40e68a08 100644 --- a/drivers/md/dm-snap.c +++ b/drivers/md/dm-snap.c @@ -18,7 +18,6 @@ #include <linux/vmalloc.h> #include <linux/log2.h> #include <linux/dm-kcopyd.h> -#include <linux/semaphore.h>
#include "dm.h"
@@ -107,8 +106,8 @@ struct dm_snapshot { /* The on disk metadata handler */ struct dm_exception_store *store;
- /* Maximum number of in-flight COW jobs. */ - struct semaphore cow_count; + unsigned in_progress; + struct wait_queue_head in_progress_wait;
struct dm_kcopyd_client *kcopyd_client;
@@ -162,8 +161,8 @@ struct dm_snapshot { */ #define DEFAULT_COW_THRESHOLD 2048
-static int cow_threshold = DEFAULT_COW_THRESHOLD; -module_param_named(snapshot_cow_threshold, cow_threshold, int, 0644); +static unsigned cow_threshold = DEFAULT_COW_THRESHOLD; +module_param_named(snapshot_cow_threshold, cow_threshold, uint, 0644); MODULE_PARM_DESC(snapshot_cow_threshold, "Maximum number of chunks being copied on write");
DECLARE_DM_KCOPYD_THROTTLE_WITH_MODULE_PARM(snapshot_copy_throttle, @@ -1327,7 +1326,7 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv) goto bad_hash_tables; }
- sema_init(&s->cow_count, (cow_threshold > 0) ? cow_threshold : INT_MAX); + init_waitqueue_head(&s->in_progress_wait);
s->kcopyd_client = dm_kcopyd_client_create(&dm_kcopyd_throttle); if (IS_ERR(s->kcopyd_client)) { @@ -1509,17 +1508,54 @@ static void snapshot_dtr(struct dm_target *ti)
dm_put_device(ti, s->origin);
+ WARN_ON(s->in_progress); + kfree(s); }
static void account_start_copy(struct dm_snapshot *s) { - down(&s->cow_count); + spin_lock(&s->in_progress_wait.lock); + s->in_progress++; + spin_unlock(&s->in_progress_wait.lock); }
static void account_end_copy(struct dm_snapshot *s) { - up(&s->cow_count); + spin_lock(&s->in_progress_wait.lock); + BUG_ON(!s->in_progress); + s->in_progress--; + if (likely(s->in_progress <= cow_threshold) && + unlikely(waitqueue_active(&s->in_progress_wait))) + wake_up_locked(&s->in_progress_wait); + spin_unlock(&s->in_progress_wait.lock); +} + +static bool wait_for_in_progress(struct dm_snapshot *s, bool unlock_origins) +{ + if (unlikely(s->in_progress > cow_threshold)) { + spin_lock(&s->in_progress_wait.lock); + if (likely(s->in_progress > cow_threshold)) { + /* + * NOTE: this throttle doesn't account for whether + * the caller is servicing an IO that will trigger a COW + * so excess throttling may result for chunks not required + * to be COW'd. But if cow_threshold was reached, extra + * throttling is unlikely to negatively impact performance. + */ + DECLARE_WAITQUEUE(wait, current); + __add_wait_queue(&s->in_progress_wait, &wait); + __set_current_state(TASK_UNINTERRUPTIBLE); + spin_unlock(&s->in_progress_wait.lock); + if (unlock_origins) + up_read(&_origins_lock); + io_schedule(); + remove_wait_queue(&s->in_progress_wait, &wait); + return false; + } + spin_unlock(&s->in_progress_wait.lock); + } + return true; }
/* @@ -1537,7 +1573,7 @@ static void flush_bios(struct bio *bio) } }
-static int do_origin(struct dm_dev *origin, struct bio *bio); +static int do_origin(struct dm_dev *origin, struct bio *bio, bool limit);
/* * Flush a list of buffers. @@ -1550,7 +1586,7 @@ static void retry_origin_bios(struct dm_snapshot *s, struct bio *bio) while (bio) { n = bio->bi_next; bio->bi_next = NULL; - r = do_origin(s->origin, bio); + r = do_origin(s->origin, bio, false); if (r == DM_MAPIO_REMAPPED) generic_make_request(bio); bio = n; @@ -1926,6 +1962,11 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio) if (!s->valid) return DM_MAPIO_KILL;
+ if (bio_data_dir(bio) == WRITE) { + while (unlikely(!wait_for_in_progress(s, false))) + ; /* wait_for_in_progress() has slept */ + } + down_read(&s->lock); dm_exception_table_lock(&lock);
@@ -2122,7 +2163,7 @@ redirect_to_origin:
if (bio_data_dir(bio) == WRITE) { up_write(&s->lock); - return do_origin(s->origin, bio); + return do_origin(s->origin, bio, false); }
out_unlock: @@ -2497,15 +2538,24 @@ next_snapshot: /* * Called on a write from the origin driver. */ -static int do_origin(struct dm_dev *origin, struct bio *bio) +static int do_origin(struct dm_dev *origin, struct bio *bio, bool limit) { struct origin *o; int r = DM_MAPIO_REMAPPED;
+again: down_read(&_origins_lock); o = __lookup_origin(origin->bdev); - if (o) + if (o) { + if (limit) { + struct dm_snapshot *s; + list_for_each_entry(s, &o->snapshots, list) + if (unlikely(!wait_for_in_progress(s, true))) + goto again; + } + r = __origin_write(&o->snapshots, bio->bi_iter.bi_sector, bio); + } up_read(&_origins_lock);
return r; @@ -2618,7 +2668,7 @@ static int origin_map(struct dm_target *ti, struct bio *bio) dm_accept_partial_bio(bio, available_sectors);
/* Only tell snapshots if this is a write */ - return do_origin(o->dev, bio); + return do_origin(o->dev, bio, true); }
/*
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 29d47d00e0ae61668ee0c5d90bef2893c8abbafa ]
If we failed to allocate the data extent(s) for the inode space cache, we were bailing out without releasing the previously reserved metadata. This was triggering the following warnings when unmounting a filesystem:
$ cat -n fs/btrfs/inode.c (...) 9268 void btrfs_destroy_inode(struct inode *inode) 9269 { (...) 9276 WARN_ON(BTRFS_I(inode)->block_rsv.reserved); 9277 WARN_ON(BTRFS_I(inode)->block_rsv.size); (...) 9281 WARN_ON(BTRFS_I(inode)->csum_bytes); 9282 WARN_ON(BTRFS_I(inode)->defrag_bytes); (...)
Several fstests test cases triggered this often, such as generic/083, generic/102, generic/172, generic/269 and generic/300 at least, producing stack traces like the following in dmesg/syslog:
[82039.079546] WARNING: CPU: 2 PID: 13167 at fs/btrfs/inode.c:9276 btrfs_destroy_inode+0x203/0x270 [btrfs] (...) [82039.081543] CPU: 2 PID: 13167 Comm: umount Tainted: G W 5.2.0-rc4-btrfs-next-50 #1 [82039.081912] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014 [82039.082673] RIP: 0010:btrfs_destroy_inode+0x203/0x270 [btrfs] (...) [82039.083913] RSP: 0018:ffffac0b426a7d30 EFLAGS: 00010206 [82039.084320] RAX: ffff8ddf77691158 RBX: ffff8dde29b34660 RCX: 0000000000000002 [82039.084736] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8dde29b34660 [82039.085156] RBP: ffff8ddf5fbec000 R08: 0000000000000000 R09: 0000000000000000 [82039.085578] R10: ffffac0b426a7c90 R11: ffffffffb9aad768 R12: ffffac0b426a7db0 [82039.086000] R13: ffff8ddf5fbec0a0 R14: dead000000000100 R15: 0000000000000000 [82039.086416] FS: 00007f8db96d12c0(0000) GS:ffff8de036b00000(0000) knlGS:0000000000000000 [82039.086837] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [82039.087253] CR2: 0000000001416108 CR3: 00000002315cc001 CR4: 00000000003606e0 [82039.087672] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [82039.088089] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [82039.088504] Call Trace: [82039.088918] destroy_inode+0x3b/0x70 [82039.089340] btrfs_free_fs_root+0x16/0xa0 [btrfs] [82039.089768] btrfs_free_fs_roots+0xd8/0x160 [btrfs] [82039.090183] ? wait_for_completion+0x65/0x1a0 [82039.090607] close_ctree+0x172/0x370 [btrfs] [82039.091021] generic_shutdown_super+0x6c/0x110 [82039.091427] kill_anon_super+0xe/0x30 [82039.091832] btrfs_kill_super+0x12/0xa0 [btrfs] [82039.092233] deactivate_locked_super+0x3a/0x70 [82039.092636] cleanup_mnt+0x3b/0x80 [82039.093039] task_work_run+0x93/0xc0 [82039.093457] exit_to_usermode_loop+0xfa/0x100 [82039.093856] do_syscall_64+0x162/0x1d0 [82039.094244] entry_SYSCALL_64_after_hwframe+0x49/0xbe [82039.094634] RIP: 0033:0x7f8db8fbab37 (...) [82039.095876] RSP: 002b:00007ffdce35b468 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [82039.096290] RAX: 0000000000000000 RBX: 0000560d20b00060 RCX: 00007f8db8fbab37 [82039.096700] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000560d20b00240 [82039.097110] RBP: 0000560d20b00240 R08: 0000560d20b00270 R09: 0000000000000015 [82039.097522] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007f8db94bce64 [82039.097937] R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffdce35b6f0 [82039.098350] irq event stamp: 0 [82039.098750] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [82039.099150] hardirqs last disabled at (0): [<ffffffffb7884ff2>] copy_process.part.33+0x7f2/0x1f00 [82039.099545] softirqs last enabled at (0): [<ffffffffb7884ff2>] copy_process.part.33+0x7f2/0x1f00 [82039.099925] softirqs last disabled at (0): [<0000000000000000>] 0x0 [82039.100292] ---[ end trace f2521afa616ddccc ]--- [82039.100707] WARNING: CPU: 2 PID: 13167 at fs/btrfs/inode.c:9277 btrfs_destroy_inode+0x1ac/0x270 [btrfs] (...) [82039.103050] CPU: 2 PID: 13167 Comm: umount Tainted: G W 5.2.0-rc4-btrfs-next-50 #1 [82039.103428] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014 [82039.104203] RIP: 0010:btrfs_destroy_inode+0x1ac/0x270 [btrfs] (...) [82039.105461] RSP: 0018:ffffac0b426a7d30 EFLAGS: 00010206 [82039.105866] RAX: ffff8ddf77691158 RBX: ffff8dde29b34660 RCX: 0000000000000002 [82039.106270] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8dde29b34660 [82039.106673] RBP: ffff8ddf5fbec000 R08: 0000000000000000 R09: 0000000000000000 [82039.107078] R10: ffffac0b426a7c90 R11: ffffffffb9aad768 R12: ffffac0b426a7db0 [82039.107487] R13: ffff8ddf5fbec0a0 R14: dead000000000100 R15: 0000000000000000 [82039.107894] FS: 00007f8db96d12c0(0000) GS:ffff8de036b00000(0000) knlGS:0000000000000000 [82039.108309] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [82039.108723] CR2: 0000000001416108 CR3: 00000002315cc001 CR4: 00000000003606e0 [82039.109146] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [82039.109567] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [82039.109989] Call Trace: [82039.110405] destroy_inode+0x3b/0x70 [82039.110830] btrfs_free_fs_root+0x16/0xa0 [btrfs] [82039.111257] btrfs_free_fs_roots+0xd8/0x160 [btrfs] [82039.111675] ? wait_for_completion+0x65/0x1a0 [82039.112101] close_ctree+0x172/0x370 [btrfs] [82039.112519] generic_shutdown_super+0x6c/0x110 [82039.112988] kill_anon_super+0xe/0x30 [82039.113439] btrfs_kill_super+0x12/0xa0 [btrfs] [82039.113861] deactivate_locked_super+0x3a/0x70 [82039.114278] cleanup_mnt+0x3b/0x80 [82039.114685] task_work_run+0x93/0xc0 [82039.115083] exit_to_usermode_loop+0xfa/0x100 [82039.115476] do_syscall_64+0x162/0x1d0 [82039.115863] entry_SYSCALL_64_after_hwframe+0x49/0xbe [82039.116254] RIP: 0033:0x7f8db8fbab37 (...) [82039.117463] RSP: 002b:00007ffdce35b468 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [82039.117882] RAX: 0000000000000000 RBX: 0000560d20b00060 RCX: 00007f8db8fbab37 [82039.118330] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000560d20b00240 [82039.118743] RBP: 0000560d20b00240 R08: 0000560d20b00270 R09: 0000000000000015 [82039.119159] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007f8db94bce64 [82039.119574] R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffdce35b6f0 [82039.119987] irq event stamp: 0 [82039.120387] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [82039.120787] hardirqs last disabled at (0): [<ffffffffb7884ff2>] copy_process.part.33+0x7f2/0x1f00 [82039.121182] softirqs last enabled at (0): [<ffffffffb7884ff2>] copy_process.part.33+0x7f2/0x1f00 [82039.121563] softirqs last disabled at (0): [<0000000000000000>] 0x0 [82039.121933] ---[ end trace f2521afa616ddccd ]--- [82039.122353] WARNING: CPU: 2 PID: 13167 at fs/btrfs/inode.c:9278 btrfs_destroy_inode+0x1bc/0x270 [btrfs] (...) [82039.124606] CPU: 2 PID: 13167 Comm: umount Tainted: G W 5.2.0-rc4-btrfs-next-50 #1 [82039.125008] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014 [82039.125801] RIP: 0010:btrfs_destroy_inode+0x1bc/0x270 [btrfs] (...) [82039.126998] RSP: 0018:ffffac0b426a7d30 EFLAGS: 00010202 [82039.127399] RAX: ffff8ddf77691158 RBX: ffff8dde29b34660 RCX: 0000000000000002 [82039.127803] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8dde29b34660 [82039.128206] RBP: ffff8ddf5fbec000 R08: 0000000000000000 R09: 0000000000000000 [82039.128611] R10: ffffac0b426a7c90 R11: ffffffffb9aad768 R12: ffffac0b426a7db0 [82039.129020] R13: ffff8ddf5fbec0a0 R14: dead000000000100 R15: 0000000000000000 [82039.129428] FS: 00007f8db96d12c0(0000) GS:ffff8de036b00000(0000) knlGS:0000000000000000 [82039.129846] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [82039.130261] CR2: 0000000001416108 CR3: 00000002315cc001 CR4: 00000000003606e0 [82039.130684] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [82039.131142] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [82039.131561] Call Trace: [82039.131990] destroy_inode+0x3b/0x70 [82039.132417] btrfs_free_fs_root+0x16/0xa0 [btrfs] [82039.132844] btrfs_free_fs_roots+0xd8/0x160 [btrfs] [82039.133262] ? wait_for_completion+0x65/0x1a0 [82039.133688] close_ctree+0x172/0x370 [btrfs] [82039.134157] generic_shutdown_super+0x6c/0x110 [82039.134575] kill_anon_super+0xe/0x30 [82039.134997] btrfs_kill_super+0x12/0xa0 [btrfs] [82039.135415] deactivate_locked_super+0x3a/0x70 [82039.135832] cleanup_mnt+0x3b/0x80 [82039.136239] task_work_run+0x93/0xc0 [82039.136637] exit_to_usermode_loop+0xfa/0x100 [82039.137029] do_syscall_64+0x162/0x1d0 [82039.137418] entry_SYSCALL_64_after_hwframe+0x49/0xbe [82039.137812] RIP: 0033:0x7f8db8fbab37 (...) [82039.139059] RSP: 002b:00007ffdce35b468 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [82039.139475] RAX: 0000000000000000 RBX: 0000560d20b00060 RCX: 00007f8db8fbab37 [82039.139890] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000560d20b00240 [82039.140302] RBP: 0000560d20b00240 R08: 0000560d20b00270 R09: 0000000000000015 [82039.140719] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007f8db94bce64 [82039.141138] R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffdce35b6f0 [82039.141597] irq event stamp: 0 [82039.142043] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [82039.142443] hardirqs last disabled at (0): [<ffffffffb7884ff2>] copy_process.part.33+0x7f2/0x1f00 [82039.142839] softirqs last enabled at (0): [<ffffffffb7884ff2>] copy_process.part.33+0x7f2/0x1f00 [82039.143220] softirqs last disabled at (0): [<0000000000000000>] 0x0 [82039.143588] ---[ end trace f2521afa616ddcce ]--- [82039.167472] WARNING: CPU: 3 PID: 13167 at fs/btrfs/extent-tree.c:10120 btrfs_free_block_groups+0x30d/0x460 [btrfs] (...) [82039.173800] CPU: 3 PID: 13167 Comm: umount Tainted: G W 5.2.0-rc4-btrfs-next-50 #1 [82039.174847] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014 [82039.177031] RIP: 0010:btrfs_free_block_groups+0x30d/0x460 [btrfs] (...) [82039.180397] RSP: 0018:ffffac0b426a7dd8 EFLAGS: 00010206 [82039.181574] RAX: ffff8de010a1db40 RBX: ffff8de010a1db40 RCX: 0000000000170014 [82039.182711] RDX: ffff8ddff4380040 RSI: ffff8de010a1da58 RDI: 0000000000000246 [82039.183817] RBP: ffff8ddf5fbec000 R08: 0000000000000000 R09: 0000000000000000 [82039.184925] R10: ffff8de036404380 R11: ffffffffb8a5ea00 R12: ffff8de010a1b2b8 [82039.186090] R13: ffff8de010a1b2b8 R14: 0000000000000000 R15: dead000000000100 [82039.187208] FS: 00007f8db96d12c0(0000) GS:ffff8de036b80000(0000) knlGS:0000000000000000 [82039.188345] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [82039.189481] CR2: 00007fb044005170 CR3: 00000002315cc006 CR4: 00000000003606e0 [82039.190674] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [82039.191829] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [82039.192978] Call Trace: [82039.194160] close_ctree+0x19a/0x370 [btrfs] [82039.195315] generic_shutdown_super+0x6c/0x110 [82039.196486] kill_anon_super+0xe/0x30 [82039.197645] btrfs_kill_super+0x12/0xa0 [btrfs] [82039.198696] deactivate_locked_super+0x3a/0x70 [82039.199619] cleanup_mnt+0x3b/0x80 [82039.200559] task_work_run+0x93/0xc0 [82039.201505] exit_to_usermode_loop+0xfa/0x100 [82039.202436] do_syscall_64+0x162/0x1d0 [82039.203339] entry_SYSCALL_64_after_hwframe+0x49/0xbe [82039.204091] RIP: 0033:0x7f8db8fbab37 (...) [82039.206360] RSP: 002b:00007ffdce35b468 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [82039.207132] RAX: 0000000000000000 RBX: 0000560d20b00060 RCX: 00007f8db8fbab37 [82039.207906] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000560d20b00240 [82039.208621] RBP: 0000560d20b00240 R08: 0000560d20b00270 R09: 0000000000000015 [82039.209285] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007f8db94bce64 [82039.209984] R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffdce35b6f0 [82039.210642] irq event stamp: 0 [82039.211306] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [82039.211971] hardirqs last disabled at (0): [<ffffffffb7884ff2>] copy_process.part.33+0x7f2/0x1f00 [82039.212643] softirqs last enabled at (0): [<ffffffffb7884ff2>] copy_process.part.33+0x7f2/0x1f00 [82039.213304] softirqs last disabled at (0): [<0000000000000000>] 0x0 [82039.213875] ---[ end trace f2521afa616ddccf ]---
Fix this by releasing the reserved metadata on failure to allocate data extent(s) for the inode cache.
Fixes: 69fe2d75dd91d0 ("btrfs: make the delalloc block rsv per inode") Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/inode-map.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c index 2e8bb402050b9..10a73c97f8966 100644 --- a/fs/btrfs/inode-map.c +++ b/fs/btrfs/inode-map.c @@ -485,6 +485,7 @@ again: prealloc, prealloc, &alloc_hint); if (ret) { btrfs_delalloc_release_extents(BTRFS_I(inode), prealloc, true); + btrfs_delalloc_release_metadata(BTRFS_I(inode), prealloc, true); goto out_put; }
From: Qu Wenruo wqu@suse.com
[ Upstream commit 8702ba9396bf7bbae2ab93c94acd4bd37cfa4f09 ]
[Background] Btrfs qgroup uses two types of reserved space for METADATA space, PERTRANS and PREALLOC.
PERTRANS is metadata space reserved for each transaction started by btrfs_start_transaction(). While PREALLOC is for delalloc, where we reserve space before joining a transaction, and finally it will be converted to PERTRANS after the writeback is done.
[Inconsistency] However there is inconsistency in how we handle PREALLOC metadata space.
The most obvious one is: In btrfs_buffered_write(): btrfs_delalloc_release_extents(BTRFS_I(inode), reserve_bytes, true);
We always free qgroup PREALLOC meta space.
While in btrfs_truncate_block(): btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, (ret != 0));
We only free qgroup PREALLOC meta space when something went wrong.
[The Correct Behavior] The correct behavior should be the one in btrfs_buffered_write(), we should always free PREALLOC metadata space.
The reason is, the btrfs_delalloc_* mechanism works by: - Reserve metadata first, even it's not necessary In btrfs_delalloc_reserve_metadata()
- Free the unused metadata space Normally in: btrfs_delalloc_release_extents() |- btrfs_inode_rsv_release() Here we do calculation on whether we should release or not.
E.g. for 64K buffered write, the metadata rsv works like:
/* The first page */ reserve_meta: num_bytes=calc_inode_reservations() free_meta: num_bytes=0 total: num_bytes=calc_inode_reservations() /* The first page caused one outstanding extent, thus needs metadata rsv */
/* The 2nd page */ reserve_meta: num_bytes=calc_inode_reservations() free_meta: num_bytes=calc_inode_reservations() total: not changed /* The 2nd page doesn't cause new outstanding extent, needs no new meta rsv, so we free what we have reserved */
/* The 3rd~16th pages */ reserve_meta: num_bytes=calc_inode_reservations() free_meta: num_bytes=calc_inode_reservations() total: not changed (still space for one outstanding extent)
This means, if btrfs_delalloc_release_extents() determines to free some space, then those space should be freed NOW. So for qgroup, we should call btrfs_qgroup_free_meta_prealloc() other than btrfs_qgroup_convert_reserved_meta().
The good news is: - The callers are not that hot The hottest caller is in btrfs_buffered_write(), which is already fixed by commit 336a8bb8e36a ("btrfs: Fix wrong btrfs_delalloc_release_extents parameter"). Thus it's not that easy to cause false EDQUOT.
- The trans commit in advance for qgroup would hide the bug Since commit f5fef4593653 ("btrfs: qgroup: Make qgroup async transaction commit more aggressive"), when btrfs qgroup metadata free space is slow, it will try to commit transaction and free the wrongly converted PERTRANS space, so it's not that easy to hit such bug.
[FIX] So to fix the problem, remove the @qgroup_free parameter for btrfs_delalloc_release_extents(), and always pass true to btrfs_inode_rsv_release().
Reported-by: Filipe Manana fdmanana@suse.com Fixes: 43b18595d660 ("btrfs: qgroup: Use separate meta reservation type for delalloc") CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/ctree.h | 3 +-- fs/btrfs/delalloc-space.c | 6 ++---- fs/btrfs/file.c | 7 +++---- fs/btrfs/inode-map.c | 4 ++-- fs/btrfs/inode.c | 12 ++++++------ fs/btrfs/ioctl.c | 6 ++---- fs/btrfs/relocation.c | 9 ++++----- 7 files changed, 20 insertions(+), 27 deletions(-)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index e7a1ec075c652..180749080fd85 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -2758,8 +2758,7 @@ int btrfs_subvolume_reserve_metadata(struct btrfs_root *root, int nitems, bool use_global_rsv); void btrfs_subvolume_release_metadata(struct btrfs_fs_info *fs_info, struct btrfs_block_rsv *rsv); -void btrfs_delalloc_release_extents(struct btrfs_inode *inode, u64 num_bytes, - bool qgroup_free); +void btrfs_delalloc_release_extents(struct btrfs_inode *inode, u64 num_bytes);
int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes); int btrfs_inc_block_group_ro(struct btrfs_block_group_cache *cache); diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c index 934521fe7e71d..8d2bb28ff5e0e 100644 --- a/fs/btrfs/delalloc-space.c +++ b/fs/btrfs/delalloc-space.c @@ -407,7 +407,6 @@ void btrfs_delalloc_release_metadata(struct btrfs_inode *inode, u64 num_bytes, * btrfs_delalloc_release_extents - release our outstanding_extents * @inode: the inode to balance the reservation for. * @num_bytes: the number of bytes we originally reserved with - * @qgroup_free: do we need to free qgroup meta reservation or convert them. * * When we reserve space we increase outstanding_extents for the extents we may * add. Once we've set the range as delalloc or created our ordered extents we @@ -415,8 +414,7 @@ void btrfs_delalloc_release_metadata(struct btrfs_inode *inode, u64 num_bytes, * temporarily tracked outstanding_extents. This _must_ be used in conjunction * with btrfs_delalloc_reserve_metadata. */ -void btrfs_delalloc_release_extents(struct btrfs_inode *inode, u64 num_bytes, - bool qgroup_free) +void btrfs_delalloc_release_extents(struct btrfs_inode *inode, u64 num_bytes) { struct btrfs_fs_info *fs_info = inode->root->fs_info; unsigned num_extents; @@ -430,7 +428,7 @@ void btrfs_delalloc_release_extents(struct btrfs_inode *inode, u64 num_bytes, if (btrfs_is_testing(fs_info)) return;
- btrfs_inode_rsv_release(inode, qgroup_free); + btrfs_inode_rsv_release(inode, true); }
/** diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index d68add0bf3462..a8a2adaf222f4 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1692,7 +1692,7 @@ again: force_page_uptodate); if (ret) { btrfs_delalloc_release_extents(BTRFS_I(inode), - reserve_bytes, true); + reserve_bytes); break; }
@@ -1704,7 +1704,7 @@ again: if (extents_locked == -EAGAIN) goto again; btrfs_delalloc_release_extents(BTRFS_I(inode), - reserve_bytes, true); + reserve_bytes); ret = extents_locked; break; } @@ -1772,8 +1772,7 @@ again: else free_extent_state(cached_state);
- btrfs_delalloc_release_extents(BTRFS_I(inode), reserve_bytes, - true); + btrfs_delalloc_release_extents(BTRFS_I(inode), reserve_bytes); if (ret) { btrfs_drop_pages(pages, num_pages); break; diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c index 10a73c97f8966..e2f49615c4297 100644 --- a/fs/btrfs/inode-map.c +++ b/fs/btrfs/inode-map.c @@ -484,13 +484,13 @@ again: ret = btrfs_prealloc_file_range_trans(inode, trans, 0, 0, prealloc, prealloc, prealloc, &alloc_hint); if (ret) { - btrfs_delalloc_release_extents(BTRFS_I(inode), prealloc, true); + btrfs_delalloc_release_extents(BTRFS_I(inode), prealloc); btrfs_delalloc_release_metadata(BTRFS_I(inode), prealloc, true); goto out_put; }
ret = btrfs_write_out_ino_cache(root, trans, path, inode); - btrfs_delalloc_release_extents(BTRFS_I(inode), prealloc, false); + btrfs_delalloc_release_extents(BTRFS_I(inode), prealloc); out_put: iput(inode); out_release: diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index b3453adb214da..1b85278471f62 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2167,7 +2167,7 @@ again:
ClearPageChecked(page); set_page_dirty(page); - btrfs_delalloc_release_extents(BTRFS_I(inode), PAGE_SIZE, false); + btrfs_delalloc_release_extents(BTRFS_I(inode), PAGE_SIZE); out: unlock_extent_cached(&BTRFS_I(inode)->io_tree, page_start, page_end, &cached_state); @@ -4912,7 +4912,7 @@ again: if (!page) { btrfs_delalloc_release_space(inode, data_reserved, block_start, blocksize, true); - btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, true); + btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize); ret = -ENOMEM; goto out; } @@ -4980,7 +4980,7 @@ out_unlock: if (ret) btrfs_delalloc_release_space(inode, data_reserved, block_start, blocksize, true); - btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize, (ret != 0)); + btrfs_delalloc_release_extents(BTRFS_I(inode), blocksize); unlock_page(page); put_page(page); out: @@ -8685,7 +8685,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) } else if (ret >= 0 && (size_t)ret < count) btrfs_delalloc_release_space(inode, data_reserved, offset, count - (size_t)ret, true); - btrfs_delalloc_release_extents(BTRFS_I(inode), count, false); + btrfs_delalloc_release_extents(BTRFS_I(inode), count); } out: if (wakeup) @@ -9038,7 +9038,7 @@ again: unlock_extent_cached(io_tree, page_start, page_end, &cached_state);
if (!ret2) { - btrfs_delalloc_release_extents(BTRFS_I(inode), PAGE_SIZE, true); + btrfs_delalloc_release_extents(BTRFS_I(inode), PAGE_SIZE); sb_end_pagefault(inode->i_sb); extent_changeset_free(data_reserved); return VM_FAULT_LOCKED; @@ -9047,7 +9047,7 @@ again: out_unlock: unlock_page(page); out: - btrfs_delalloc_release_extents(BTRFS_I(inode), PAGE_SIZE, (ret != 0)); + btrfs_delalloc_release_extents(BTRFS_I(inode), PAGE_SIZE); btrfs_delalloc_release_space(inode, data_reserved, page_start, reserved_space, (ret != 0)); out_noreserve: diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 818f7ec8bb0ee..8dad66df74ed7 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1360,8 +1360,7 @@ again: unlock_page(pages[i]); put_page(pages[i]); } - btrfs_delalloc_release_extents(BTRFS_I(inode), page_cnt << PAGE_SHIFT, - false); + btrfs_delalloc_release_extents(BTRFS_I(inode), page_cnt << PAGE_SHIFT); extent_changeset_free(data_reserved); return i_done; out: @@ -1372,8 +1371,7 @@ out: btrfs_delalloc_release_space(inode, data_reserved, start_index << PAGE_SHIFT, page_cnt << PAGE_SHIFT, true); - btrfs_delalloc_release_extents(BTRFS_I(inode), page_cnt << PAGE_SHIFT, - true); + btrfs_delalloc_release_extents(BTRFS_I(inode), page_cnt << PAGE_SHIFT); extent_changeset_free(data_reserved); return ret;
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 074947bebd165..572314aebdf11 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -3277,7 +3277,7 @@ static int relocate_file_extent_cluster(struct inode *inode, btrfs_delalloc_release_metadata(BTRFS_I(inode), PAGE_SIZE, true); btrfs_delalloc_release_extents(BTRFS_I(inode), - PAGE_SIZE, true); + PAGE_SIZE); ret = -ENOMEM; goto out; } @@ -3298,7 +3298,7 @@ static int relocate_file_extent_cluster(struct inode *inode, btrfs_delalloc_release_metadata(BTRFS_I(inode), PAGE_SIZE, true); btrfs_delalloc_release_extents(BTRFS_I(inode), - PAGE_SIZE, true); + PAGE_SIZE); ret = -EIO; goto out; } @@ -3327,7 +3327,7 @@ static int relocate_file_extent_cluster(struct inode *inode, btrfs_delalloc_release_metadata(BTRFS_I(inode), PAGE_SIZE, true); btrfs_delalloc_release_extents(BTRFS_I(inode), - PAGE_SIZE, true); + PAGE_SIZE);
clear_extent_bits(&BTRFS_I(inode)->io_tree, page_start, page_end, @@ -3343,8 +3343,7 @@ static int relocate_file_extent_cluster(struct inode *inode, put_page(page);
index++; - btrfs_delalloc_release_extents(BTRFS_I(inode), PAGE_SIZE, - false); + btrfs_delalloc_release_extents(BTRFS_I(inode), PAGE_SIZE); balance_dirty_pages_ratelimited(inode->i_mapping); btrfs_throttle(fs_info); }
From: Remi Pommarel repk@triplefau.lt
[ Upstream commit de10ac47597e7a3596b27631d0d5ce5f48d2c099 ]
meson_saradc's irq handler uses priv->regmap so make sure that it is allocated before the irq get enabled.
This also fixes crash when CONFIG_DEBUG_SHIRQ is enabled, as device managed resources are freed in the inverted order they had been allocated, priv->regmap was freed before the spurious fake irq that CONFIG_DEBUG_SHIRQ adds called the handler.
Fixes: 3af109131b7eb8 ("iio: adc: meson-saradc: switch from polling to interrupt mode") Reported-by: Elie Roudninski xademax@gmail.com Signed-off-by: Remi Pommarel repk@triplefau.lt Reviewed-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Tested-by: Elie ROUDNINSKI xademax@gmail.com Reviewed-by: Kevin Hilman khilman@baylibre.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/adc/meson_saradc.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/iio/adc/meson_saradc.c b/drivers/iio/adc/meson_saradc.c index 7b28d045d2719..7b27306330a35 100644 --- a/drivers/iio/adc/meson_saradc.c +++ b/drivers/iio/adc/meson_saradc.c @@ -1219,6 +1219,11 @@ static int meson_sar_adc_probe(struct platform_device *pdev) if (IS_ERR(base)) return PTR_ERR(base);
+ priv->regmap = devm_regmap_init_mmio(&pdev->dev, base, + priv->param->regmap_config); + if (IS_ERR(priv->regmap)) + return PTR_ERR(priv->regmap); + irq = irq_of_parse_and_map(pdev->dev.of_node, 0); if (!irq) return -EINVAL; @@ -1228,11 +1233,6 @@ static int meson_sar_adc_probe(struct platform_device *pdev) if (ret) return ret;
- priv->regmap = devm_regmap_init_mmio(&pdev->dev, base, - priv->param->regmap_config); - if (IS_ERR(priv->regmap)) - return PTR_ERR(priv->regmap); - priv->clkin = devm_clk_get(&pdev->dev, "clkin"); if (IS_ERR(priv->clkin)) { dev_err(&pdev->dev, "failed to get clkin\n");
From: Pascal Bouwmann bouwmann@tau-tec.de
[ Upstream commit 6c59a962e081df6d8fe43325bbfabec57e0d4751 ]
The center temperature of the supported devices stored in the constant BMC150_ACCEL_TEMP_CENTER_VAL is not 24 degrees but 23 degrees.
It seems that some datasheets were inconsistent on this value leading to the error. For most usecases will only make minor difference so not queued for stable.
Signed-off-by: Pascal Bouwmann bouwmann@tau-tec.de Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/accel/bmc150-accel-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/accel/bmc150-accel-core.c b/drivers/iio/accel/bmc150-accel-core.c index cf6c0e3a83d38..121b4e89f038c 100644 --- a/drivers/iio/accel/bmc150-accel-core.c +++ b/drivers/iio/accel/bmc150-accel-core.c @@ -117,7 +117,7 @@ #define BMC150_ACCEL_SLEEP_1_SEC 0x0F
#define BMC150_ACCEL_REG_TEMP 0x08 -#define BMC150_ACCEL_TEMP_CENTER_VAL 24 +#define BMC150_ACCEL_TEMP_CENTER_VAL 23
#define BMC150_ACCEL_AXIS_TO_REG(axis) (BMC150_ACCEL_REG_XOUT_L + (axis * 2)) #define BMC150_AUTO_SUSPEND_DELAY_MS 2000
From: Ian Rogers irogers@google.com
[ Upstream commit 4b0b2b096da9d296e0e5668cdfba8613bd6f5bc8 ]
Unconditionally defining _FORTIFY_SOURCE can break tools that don't work with it, such as memory sanitizers:
https://github.com/google/sanitizers/wiki/AddressSanitizer#faq
Fixes: 4b6ab94eabe4 ("perf subcmd: Create subcmd library") Signed-off-by: Ian Rogers irogers@google.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Andi Kleen ak@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Link: http://lore.kernel.org/lkml/20190925195924.152834-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/subcmd/Makefile | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/tools/lib/subcmd/Makefile b/tools/lib/subcmd/Makefile index ed61fb3a46c08..5b2cd5e58df09 100644 --- a/tools/lib/subcmd/Makefile +++ b/tools/lib/subcmd/Makefile @@ -20,7 +20,13 @@ MAKEFLAGS += --no-print-directory LIBFILE = $(OUTPUT)libsubcmd.a
CFLAGS := $(EXTRA_WARNINGS) $(EXTRA_CFLAGS) -CFLAGS += -ggdb3 -Wall -Wextra -std=gnu99 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fPIC +CFLAGS += -ggdb3 -Wall -Wextra -std=gnu99 -fPIC + +ifeq ($(DEBUG),0) + ifeq ($(feature-fortify-source), 1) + CFLAGS += -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 + endif +endif
ifeq ($(CC_NO_CLANG), 0) CFLAGS += -O3
From: Ian Rogers irogers@google.com
[ Upstream commit e3e2cf3d5b1fe800b032e14c0fdcd9a6fb20cf3b ]
An optimized build such as:
make -C tools/perf CLANG=1 CC=clang EXTRA_CFLAGS="-O3
will turn the dereference operation into a ud2 instruction, raising a SIGILL rather than a SIGSEGV. Use raise(..) for correctness and clarity.
Similar issues were addressed in Numfor Mbiziwo-Tiapo's patch:
https://lkml.org/lkml/2019/7/8/1234
Committer testing:
Before:
[root@quaco ~]# perf test hooks 55: perf hooks : Ok [root@quaco ~]# perf test -v hooks 55: perf hooks : --- start --- test child forked, pid 17092 SIGSEGV is observed as expected, try to recover. Fatal error (SEGFAULT) in perf hook 'test' test child finished with 0 ---- end ---- perf hooks: Ok [root@quaco ~]#
After:
[root@quaco ~]# perf test hooks 55: perf hooks : Ok [root@quaco ~]# perf test -v hooks 55: perf hooks : --- start --- test child forked, pid 17909 SIGSEGV is observed as expected, try to recover. Fatal error (SEGFAULT) in perf hook 'test' test child finished with 0 ---- end ---- perf hooks: Ok [root@quaco ~]#
Fixes: a074865e60ed ("perf tools: Introduce perf hooks") Signed-off-by: Ian Rogers irogers@google.com Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Andi Kleen ak@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Cc: Wang Nan wangnan0@huawei.com Link: http://lore.kernel.org/lkml/20190925195924.152834-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/perf-hooks.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/perf/tests/perf-hooks.c b/tools/perf/tests/perf-hooks.c index a693bcf017ea2..44c16fd11bf6e 100644 --- a/tools/perf/tests/perf-hooks.c +++ b/tools/perf/tests/perf-hooks.c @@ -20,12 +20,11 @@ static void sigsegv_handler(int sig __maybe_unused) static void the_hook(void *_hook_flags) { int *hook_flags = _hook_flags; - int *p = NULL;
*hook_flags = 1234;
/* Generate a segfault, test perf_hooks__recover */ - *p = 0; + raise(SIGSEGV); }
int test__perf_hooks(struct test *test __maybe_unused, int subtest __maybe_unused)
From: Steve MacLean Steve.MacLean@microsoft.com
[ Upstream commit ee212d6ea20887c0ef352be8563ca13dbf965906 ]
Whenever an mmap/mmap2 event occurs, the map tree must be updated to add a new entry. If a new map overlaps a previous map, the overlapped section of the previous map is effectively unmapped, but the non-overlapping sections are still valid.
maps__fixup_overlappings() is responsible for creating any new map entries from the previously overlapped map. It optionally creates a before and an after map.
When creating the after map the existing code failed to adjust the map.pgoff. This meant the new after map would incorrectly calculate the file offset for the ip. This results in incorrect symbol name resolution for any ip in the after region.
Make maps__fixup_overlappings() correctly populate map.pgoff.
Add an assert that new mapping matches old mapping at the beginning of the after map.
Committer-testing:
Validated correct parsing of libcoreclr.so symbols from .NET Core 3.0 preview9 (which didn't strip symbols).
Preparation:
~/dotnet3.0-preview9/dotnet new webapi -o perfSymbol cd perfSymbol ~/dotnet3.0-preview9/dotnet publish perf record ~/dotnet3.0-preview9/dotnet \ bin/Debug/netcoreapp3.0/publish/perfSymbol.dll ^C
Before:
perf script --show-mmap-events 2>&1 | grep -e MMAP -e unknown |\ grep libcoreclr.so | head -n 4 dotnet 1907 373352.698780: PERF_RECORD_MMAP2 1907/1907: \ [0x7fe615726000(0x768000) @ 0 08:02 5510620 765057155]: \ r-xp .../3.0.0-preview9-19423-09/libcoreclr.so dotnet 1907 373352.701091: PERF_RECORD_MMAP2 1907/1907: \ [0x7fe615974000(0x1000) @ 0x24e000 08:02 5510620 765057155]: \ rwxp .../3.0.0-preview9-19423-09/libcoreclr.so dotnet 1907 373352.701241: PERF_RECORD_MMAP2 1907/1907: \ [0x7fe615c42000(0x1000) @ 0x51c000 08:02 5510620 765057155]: \ rwxp .../3.0.0-preview9-19423-09/libcoreclr.so dotnet 1907 373352.705249: 250000 cpu-clock: \ 7fe6159a1f99 [unknown] \ (.../3.0.0-preview9-19423-09/libcoreclr.so)
After:
perf script --show-mmap-events 2>&1 | grep -e MMAP -e unknown |\ grep libcoreclr.so | head -n 4 dotnet 1907 373352.698780: PERF_RECORD_MMAP2 1907/1907: \ [0x7fe615726000(0x768000) @ 0 08:02 5510620 765057155]: \ r-xp .../3.0.0-preview9-19423-09/libcoreclr.so dotnet 1907 373352.701091: PERF_RECORD_MMAP2 1907/1907: \ [0x7fe615974000(0x1000) @ 0x24e000 08:02 5510620 765057155]: \ rwxp .../3.0.0-preview9-19423-09/libcoreclr.so dotnet 1907 373352.701241: PERF_RECORD_MMAP2 1907/1907: \ [0x7fe615c42000(0x1000) @ 0x51c000 08:02 5510620 765057155]: \ rwxp .../3.0.0-preview9-19423-09/libcoreclr.so
All the [unknown] symbols were resolved.
Signed-off-by: Steve MacLean Steve.MacLean@Microsoft.com Tested-by: Brian Robbins brianrob@microsoft.com Acked-by: Jiri Olsa jolsa@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Andi Kleen ak@linux.intel.com Cc: Davidlohr Bueso dave@stgolabs.net Cc: Eric Saint-Etienne eric.saint.etienne@oracle.com Cc: John Keeping john@metanate.com Cc: John Salem josalem@microsoft.com Cc: Leo Yan leo.yan@linaro.org Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Song Liu songliubraving@fb.com Cc: Stephane Eranian eranian@google.com Cc: Tom McDonald thomas.mcdonald@microsoft.com Link: http://lore.kernel.org/lkml/BN8PR21MB136270949F22A6A02335C238F7800@BN8PR21MB... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/map.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c index 7666206d06fa7..f18113581cf03 100644 --- a/tools/perf/util/map.c +++ b/tools/perf/util/map.c @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 #include "symbol.h" +#include <assert.h> #include <errno.h> #include <inttypes.h> #include <limits.h> @@ -847,6 +848,8 @@ static int maps__fixup_overlappings(struct maps *maps, struct map *map, FILE *fp }
after->start = map->end; + after->pgoff += map->end - pos->start; + assert(pos->map_ip(pos, map->end) == after->map_ip(after, map->end)); __map_groups__insert(pos->groups, after); if (verbose >= 2 && !use_browser) map__fprintf(after, fp);
From: Andi Kleen ak@linux.intel.com
[ Upstream commit e98df280bc2a499fd41d7f9e2d6733884de69902 ]
When the LBR data and the instructions in a binary do not match the loop printing instructions could get confused and print a long stream of bogus <bad> instructions.
The problem was that if the instruction decoder cannot decode an instruction it ilen wasn't initialized, so the loop going through the basic block would continue with the previous value.
Harden the code to avoid such problems:
- Make sure ilen is always freshly initialized and is 0 for bad instructions.
- Do not overrun the code buffer while printing instructions
- Print a warning message if the final jump is not on an instruction boundary.
Signed-off-by: Andi Kleen ak@linux.intel.com Cc: Jiri Olsa jolsa@kernel.org Link: http://lore.kernel.org/lkml/20190927233546.11533-1-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-script.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c index 0140ddb8dd0bd..c14a1cdad80c0 100644 --- a/tools/perf/builtin-script.c +++ b/tools/perf/builtin-script.c @@ -1054,7 +1054,7 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample, continue;
insn = 0; - for (off = 0;; off += ilen) { + for (off = 0; off < (unsigned)len; off += ilen) { uint64_t ip = start + off;
printed += ip__fprintf_sym(ip, thread, x.cpumode, x.cpu, &lastsym, attr, fp); @@ -1065,6 +1065,7 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample, printed += print_srccode(thread, x.cpumode, ip); break; } else { + ilen = 0; printed += fprintf(fp, "\t%016" PRIx64 "\t%s\n", ip, dump_insn(&x, ip, buffer + off, len - off, &ilen)); if (ilen == 0) @@ -1074,6 +1075,8 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample, insn++; } } + if (off != (unsigned)len) + printed += fprintf(fp, "\tmismatch of LBR data and executable\n"); }
/* @@ -1114,6 +1117,7 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample, goto out; } for (off = 0; off <= end - start; off += ilen) { + ilen = 0; printed += fprintf(fp, "\t%016" PRIx64 "\t%s\n", start + off, dump_insn(&x, start + off, buffer + off, len - off, &ilen)); if (ilen == 0)
From: Andi Kleen ak@linux.intel.com
[ Upstream commit 6bdfd9f118bd59cf0f85d3bf4b72b586adea17c1 ]
The Intel fixed counters use a special table to override the JSON information.
During this override the period information from the JSON file got dropped, which results in inst_retired.any and similar running with frequency mode instead of a period.
Just specify the expected period in the table.
Signed-off-by: Andi Kleen ak@linux.intel.com Cc: Jiri Olsa jolsa@kernel.org Link: http://lore.kernel.org/lkml/20190927233546.11533-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/pmu-events/jevents.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c index d413761621b09..fa85e33762f72 100644 --- a/tools/perf/pmu-events/jevents.c +++ b/tools/perf/pmu-events/jevents.c @@ -449,12 +449,12 @@ static struct fixed { const char *name; const char *event; } fixed[] = { - { "inst_retired.any", "event=0xc0" }, - { "inst_retired.any_p", "event=0xc0" }, - { "cpu_clk_unhalted.ref", "event=0x0,umask=0x03" }, - { "cpu_clk_unhalted.thread", "event=0x3c" }, - { "cpu_clk_unhalted.core", "event=0x3c" }, - { "cpu_clk_unhalted.thread_any", "event=0x3c,any=1" }, + { "inst_retired.any", "event=0xc0,period=2000003" }, + { "inst_retired.any_p", "event=0xc0,period=2000003" }, + { "cpu_clk_unhalted.ref", "event=0x0,umask=0x03,period=2000003" }, + { "cpu_clk_unhalted.thread", "event=0x3c,period=2000003" }, + { "cpu_clk_unhalted.core", "event=0x3c,period=2000003" }, + { "cpu_clk_unhalted.thread_any", "event=0x3c,any=1,period=2000003" }, { NULL, NULL}, };
From: Arnaldo Carvalho de Melo acme@redhat.com
[ Upstream commit f67001a4a08eb124197ed4376941e1da9cf94b42 ]
For consistency, propagate the exact cause for get_cpuid() to have failed.
Cc: Adrian Hunter adrian.hunter@intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Link: https://lkml.kernel.org/n/tip-9ig269f7ktnhh99g4l15vpu2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/arch/powerpc/util/header.c | 3 ++- tools/perf/arch/s390/util/header.c | 9 +++++---- tools/perf/arch/x86/util/header.c | 3 ++- tools/perf/builtin-kvm.c | 7 ++++--- 4 files changed, 13 insertions(+), 9 deletions(-)
diff --git a/tools/perf/arch/powerpc/util/header.c b/tools/perf/arch/powerpc/util/header.c index 0b242664f5ea7..e46be9ef5a688 100644 --- a/tools/perf/arch/powerpc/util/header.c +++ b/tools/perf/arch/powerpc/util/header.c @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 #include <sys/types.h> +#include <errno.h> #include <unistd.h> #include <stdio.h> #include <stdlib.h> @@ -31,7 +32,7 @@ get_cpuid(char *buffer, size_t sz) buffer[nb-1] = '\0'; return 0; } - return -1; + return ENOBUFS; }
char * diff --git a/tools/perf/arch/s390/util/header.c b/tools/perf/arch/s390/util/header.c index 8b0b018d896ab..7933f6871c818 100644 --- a/tools/perf/arch/s390/util/header.c +++ b/tools/perf/arch/s390/util/header.c @@ -8,6 +8,7 @@ */
#include <sys/types.h> +#include <errno.h> #include <unistd.h> #include <stdio.h> #include <string.h> @@ -54,7 +55,7 @@ int get_cpuid(char *buffer, size_t sz)
sysinfo = fopen(SYSINFO, "r"); if (sysinfo == NULL) - return -1; + return errno;
while ((read = getline(&line, &line_sz, sysinfo)) != -1) { if (!strncmp(line, SYSINFO_MANU, strlen(SYSINFO_MANU))) { @@ -89,7 +90,7 @@ int get_cpuid(char *buffer, size_t sz)
/* Missing manufacturer, type or model information should not happen */ if (!manufacturer[0] || !type[0] || !model[0]) - return -1; + return EINVAL;
/* * Scan /proc/service_levels and return the CPU-MF counter facility @@ -133,14 +134,14 @@ skip_sysinfo: else nbytes = snprintf(buffer, sz, "%s,%s,%s", manufacturer, type, model); - return (nbytes >= sz) ? -1 : 0; + return (nbytes >= sz) ? ENOBUFS : 0; }
char *get_cpuid_str(struct perf_pmu *pmu __maybe_unused) { char *buf = malloc(128);
- if (buf && get_cpuid(buf, 128) < 0) + if (buf && get_cpuid(buf, 128)) zfree(&buf); return buf; } diff --git a/tools/perf/arch/x86/util/header.c b/tools/perf/arch/x86/util/header.c index af9a9f2600be4..a089af60906a0 100644 --- a/tools/perf/arch/x86/util/header.c +++ b/tools/perf/arch/x86/util/header.c @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 #include <sys/types.h> +#include <errno.h> #include <unistd.h> #include <stdio.h> #include <stdlib.h> @@ -57,7 +58,7 @@ __get_cpuid(char *buffer, size_t sz, const char *fmt) buffer[nb-1] = '\0'; return 0; } - return -1; + return ENOBUFS; }
int diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c index b33c834891208..44ff3ea1da23f 100644 --- a/tools/perf/builtin-kvm.c +++ b/tools/perf/builtin-kvm.c @@ -699,14 +699,15 @@ static int process_sample_event(struct perf_tool *tool,
static int cpu_isa_config(struct perf_kvm_stat *kvm) { - char buf[64], *cpuid; + char buf[128], *cpuid; int err;
if (kvm->live) { err = get_cpuid(buf, sizeof(buf)); if (err != 0) { - pr_err("Failed to look up CPU type\n"); - return err; + pr_err("Failed to look up CPU type: %s\n", + str_error_r(err, buf, sizeof(buf))); + return -err; } cpuid = buf; } else
From: Arnaldo Carvalho de Melo acme@redhat.com
[ Upstream commit a66fa0619a0ae3585ef09e9c33ecfb5c7c6cb72b ]
The callers of symbol__annotate2() use symbol__strerror_disassemble() to convert its failure returns into a human readable string, so propagate error values from functions it calls, starting with perf_env__arch() that when fails the right thing to do is to look at 'errno' to see why its possible call to uname() failed.
Reported-by: Russell King - ARM Linux admin linux@armlinux.org.uk Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org, Cc: Will Deacon will@kernel.org Link: https://lkml.kernel.org/n/tip-it5d83kyusfhb1q1b0l4pxzs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/annotate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 163536720149b..6d9642c082e52 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -2065,7 +2065,7 @@ int symbol__annotate(struct symbol *sym, struct map *map, int err;
if (!arch_name) - return -1; + return errno;
args.arch = arch = arch__find(arch_name); if (arch == NULL)
From: Arnaldo Carvalho de Melo acme@redhat.com
[ Upstream commit 28f4417c3333940b242af03d90214f713bbef232 ]
Callers of symbol__annotate() expect a errno value or some other extended error value range in symbol__strerror_disassemble() to convert to a proper error string, fix it when propagating a failure to find the arch specific annotation routines via arch__find(arch_name).
Reported-by: Russell King - ARM Linux admin linux@armlinux.org.uk Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org, Cc: Will Deacon will@kernel.org Link: https://lkml.kernel.org/n/tip-o0k6dw7cas0vvmjjvgsyvu1i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/annotate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 6d9642c082e52..2dac51e9e4180 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -2069,7 +2069,7 @@ int symbol__annotate(struct symbol *sym, struct map *map,
args.arch = arch = arch__find(arch_name); if (arch == NULL) - return -ENOTSUP; + return ENOTSUP;
if (parch) *parch = arch;
From: Arnaldo Carvalho de Melo acme@redhat.com
[ Upstream commit 211f493b611eef012841f795166c38ec7528738d ]
We were just returning -1 in symbol__annotate() when symbol__annotate() failed, propagate its error as it is used later to pass to symbol__strerror_disassemble() to present a error message to the user, that in some cases were getting:
"Invalid -1 error code"
Fix it to propagate the error.
Reported-by: Russell King - ARM Linux admin linux@armlinux.org.uk Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org, Cc: Will Deacon will@kernel.org Link: https://lkml.kernel.org/n/tip-0tj89rs9g7nbcyd5skadlvuu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/annotate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 2dac51e9e4180..ca8b517e38fa8 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -2991,7 +2991,7 @@ int symbol__annotate2(struct symbol *sym, struct map *map, struct perf_evsel *ev
out_free_offsets: zfree(¬es->offsets); - return -1; + return err; }
#define ANNOTATION__CFG(n) \
From: Arnaldo Carvalho de Melo acme@redhat.com
[ Upstream commit 42d7a9107d83223a5fcecc6732d626a6c074cbc2 ]
They are called from symbol__annotate() and to propagate errors that can help understand the problem make them return what symbol__strerror_disassemble() known, i.e. errno codes and other annotation specific errors in a special, out of errnos, range.
Reported-by: Russell King - ARM Linux admin linux@armlinux.org.uk Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org, Cc: Will Deacon will@kernel.org Link: https://lkml.kernel.org/n/tip-pqx7srcv7tixgid251aeboj6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/arch/arm/annotate/instructions.c | 4 ++-- tools/perf/arch/arm64/annotate/instructions.c | 4 ++-- tools/perf/arch/s390/annotate/instructions.c | 6 ++++-- tools/perf/arch/x86/annotate/instructions.c | 6 ++++-- tools/perf/util/annotate.c | 6 ++++++ tools/perf/util/annotate.h | 2 ++ 6 files changed, 20 insertions(+), 8 deletions(-)
diff --git a/tools/perf/arch/arm/annotate/instructions.c b/tools/perf/arch/arm/annotate/instructions.c index c7d1a69b894fe..19ac54758c713 100644 --- a/tools/perf/arch/arm/annotate/instructions.c +++ b/tools/perf/arch/arm/annotate/instructions.c @@ -36,7 +36,7 @@ static int arm__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
arm = zalloc(sizeof(*arm)); if (!arm) - return -1; + return ENOMEM;
#define ARM_CONDS "(cc|cs|eq|ge|gt|hi|le|ls|lt|mi|ne|pl|vc|vs)" err = regcomp(&arm->call_insn, "^blx?" ARM_CONDS "?$", REG_EXTENDED); @@ -58,5 +58,5 @@ out_free_call: regfree(&arm->call_insn); out_free_arm: free(arm); - return -1; + return SYMBOL_ANNOTATE_ERRNO__ARCH_INIT_REGEXP; } diff --git a/tools/perf/arch/arm64/annotate/instructions.c b/tools/perf/arch/arm64/annotate/instructions.c index 8f70a1b282dfb..223e2f161f414 100644 --- a/tools/perf/arch/arm64/annotate/instructions.c +++ b/tools/perf/arch/arm64/annotate/instructions.c @@ -94,7 +94,7 @@ static int arm64__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
arm = zalloc(sizeof(*arm)); if (!arm) - return -1; + return ENOMEM;
/* bl, blr */ err = regcomp(&arm->call_insn, "^blr?$", REG_EXTENDED); @@ -117,5 +117,5 @@ out_free_call: regfree(&arm->call_insn); out_free_arm: free(arm); - return -1; + return SYMBOL_ANNOTATE_ERRNO__ARCH_INIT_REGEXP; } diff --git a/tools/perf/arch/s390/annotate/instructions.c b/tools/perf/arch/s390/annotate/instructions.c index 89bb8f2c54cee..a50e70baf9183 100644 --- a/tools/perf/arch/s390/annotate/instructions.c +++ b/tools/perf/arch/s390/annotate/instructions.c @@ -164,8 +164,10 @@ static int s390__annotate_init(struct arch *arch, char *cpuid __maybe_unused) if (!arch->initialized) { arch->initialized = true; arch->associate_instruction_ops = s390__associate_ins_ops; - if (cpuid) - err = s390__cpuid_parse(arch, cpuid); + if (cpuid) { + if (s390__cpuid_parse(arch, cpuid)) + err = SYMBOL_ANNOTATE_ERRNO__ARCH_INIT_CPUID_PARSING; + } }
return err; diff --git a/tools/perf/arch/x86/annotate/instructions.c b/tools/perf/arch/x86/annotate/instructions.c index 44f5aba78210e..7eb5621c021d8 100644 --- a/tools/perf/arch/x86/annotate/instructions.c +++ b/tools/perf/arch/x86/annotate/instructions.c @@ -196,8 +196,10 @@ static int x86__annotate_init(struct arch *arch, char *cpuid) if (arch->initialized) return 0;
- if (cpuid) - err = x86__cpuid_parse(arch, cpuid); + if (cpuid) { + if (x86__cpuid_parse(arch, cpuid)) + err = SYMBOL_ANNOTATE_ERRNO__ARCH_INIT_CPUID_PARSING; + }
arch->initialized = true; return err; diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index ca8b517e38fa8..b475449f955ef 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1625,6 +1625,12 @@ int symbol__strerror_disassemble(struct symbol *sym __maybe_unused, struct map * case SYMBOL_ANNOTATE_ERRNO__NO_LIBOPCODES_FOR_BPF: scnprintf(buf, buflen, "Please link with binutils's libopcode to enable BPF annotation"); break; + case SYMBOL_ANNOTATE_ERRNO__ARCH_INIT_REGEXP: + scnprintf(buf, buflen, "Problems with arch specific instruction name regular expressions."); + break; + case SYMBOL_ANNOTATE_ERRNO__ARCH_INIT_CPUID_PARSING: + scnprintf(buf, buflen, "Problems while parsing the CPUID in the arch specific initialization."); + break; default: scnprintf(buf, buflen, "Internal error: Invalid %d error code\n", errnum); break; diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index 5bc0cf655d377..a1191995fe77e 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -370,6 +370,8 @@ enum symbol_disassemble_errno {
SYMBOL_ANNOTATE_ERRNO__NO_VMLINUX = __SYMBOL_ANNOTATE_ERRNO__START, SYMBOL_ANNOTATE_ERRNO__NO_LIBOPCODES_FOR_BPF, + SYMBOL_ANNOTATE_ERRNO__ARCH_INIT_CPUID_PARSING, + SYMBOL_ANNOTATE_ERRNO__ARCH_INIT_REGEXP,
__SYMBOL_ANNOTATE_ERRNO__END, };
From: Arnaldo Carvalho de Melo acme@redhat.com
[ Upstream commit 16ed3c1e91159e28b02f11f71ff4ce4cbc6f99e4 ]
We should return errno or the annotation extra range understood by symbol__strerror_disassemble() instead of -1, fix it, returning ENOMEM instead.
Reported-by: Russell King - ARM Linux admin linux@armlinux.org.uk Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org, Cc: Will Deacon will@kernel.org Link: https://lkml.kernel.org/n/tip-8of1cmj3rz0mppfcshc9bbqq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/annotate.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index b475449f955ef..ab7851ec0ce53 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1662,7 +1662,7 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil
build_id_path = strdup(filename); if (!build_id_path) - return -1; + return ENOMEM;
/* * old style build-id cache has name of XX/XXXXXXX.. while @@ -2971,7 +2971,7 @@ int symbol__annotate2(struct symbol *sym, struct map *map, struct perf_evsel *ev
notes->offsets = zalloc(size * sizeof(struct annotation_line *)); if (notes->offsets == NULL) - return -1; + return ENOMEM;
if (perf_evsel__is_group_event(evsel)) nr_pcnt = evsel->nr_members;
From: Arnaldo Carvalho de Melo acme@redhat.com
[ Upstream commit 11aad897f6d1a28eae3b7e5b293647c522d65819 ]
Return errno when open_memstream() fails and add two new speciall error codes for when an invalid, non BPF file or one without BTF is passed to symbol__disassemble_bpf(), so that its callers can rely on symbol__strerror_disassemble() to convert that to a human readable error message that can help figure out what is wrong, with hints even.
Cc: Russell King - ARM Linux admin linux@armlinux.org.uk Cc: Song Liu songliubraving@fb.com Cc: Alexei Starovoitov ast@kernel.org Cc: Daniel Borkmann daniel@iogearbox.net Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org, Cc: Will Deacon will@kernel.org Link: https://lkml.kernel.org/n/tip-usevw9r2gcipfcrbpaueurw0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/annotate.c | 19 +++++++++++++++---- tools/perf/util/annotate.h | 2 ++ 2 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index ab7851ec0ce53..fb8756026a805 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1631,6 +1631,13 @@ int symbol__strerror_disassemble(struct symbol *sym __maybe_unused, struct map * case SYMBOL_ANNOTATE_ERRNO__ARCH_INIT_CPUID_PARSING: scnprintf(buf, buflen, "Problems while parsing the CPUID in the arch specific initialization."); break; + case SYMBOL_ANNOTATE_ERRNO__BPF_INVALID_FILE: + scnprintf(buf, buflen, "Invalid BPF file: %s.", dso->long_name); + break; + case SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF: + scnprintf(buf, buflen, "The %s BPF file has no BTF section, compile with -g or use pahole -J.", + dso->long_name); + break; default: scnprintf(buf, buflen, "Internal error: Invalid %d error code\n", errnum); break; @@ -1713,13 +1720,13 @@ static int symbol__disassemble_bpf(struct symbol *sym, char tpath[PATH_MAX]; size_t buf_size; int nr_skip = 0; - int ret = -1; char *buf; bfd *bfdf; + int ret; FILE *s;
if (dso->binary_type != DSO_BINARY_TYPE__BPF_PROG_INFO) - return -1; + return SYMBOL_ANNOTATE_ERRNO__BPF_INVALID_FILE;
pr_debug("%s: handling sym %s addr %" PRIx64 " len %" PRIx64 "\n", __func__, sym->name, sym->start, sym->end - sym->start); @@ -1732,8 +1739,10 @@ static int symbol__disassemble_bpf(struct symbol *sym, assert(bfd_check_format(bfdf, bfd_object));
s = open_memstream(&buf, &buf_size); - if (!s) + if (!s) { + ret = errno; goto out; + } init_disassemble_info(&info, s, (fprintf_ftype) fprintf);
@@ -1742,8 +1751,10 @@ static int symbol__disassemble_bpf(struct symbol *sym,
info_node = perf_env__find_bpf_prog_info(dso->bpf_prog.env, dso->bpf_prog.id); - if (!info_node) + if (!info_node) { + return SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF; goto out; + } info_linear = info_node->info_linear; sub_id = dso->bpf_prog.sub_id;
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index a1191995fe77e..2004e2cf0211b 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -372,6 +372,8 @@ enum symbol_disassemble_errno { SYMBOL_ANNOTATE_ERRNO__NO_LIBOPCODES_FOR_BPF, SYMBOL_ANNOTATE_ERRNO__ARCH_INIT_CPUID_PARSING, SYMBOL_ANNOTATE_ERRNO__ARCH_INIT_REGEXP, + SYMBOL_ANNOTATE_ERRNO__BPF_INVALID_FILE, + SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF,
__SYMBOL_ANNOTATE_ERRNO__END, };
From: Connor Kuehl connor.kuehl@canonical.com
[ Upstream commit 955c1532a34305f2f780b47f0c40cc7c65500810 ]
If kzalloc() returns NULL, the error path doesn't stop the flow of control from entering rtw_hal_read_chip_version() which dereferences the null pointer. Fix this by adding a 'goto' to the error path to more gracefully handle the issue and avoid proceeding with initialization steps that we're no longer prepared to handle.
Also update the debug message to be more consistent with the other debug messages in this function.
Addresses-Coverity: ("Dereference after null check")
Signed-off-by: Connor Kuehl connor.kuehl@canonical.com Link: https://lore.kernel.org/r/20190927214415.899-1-connor.kuehl@canonical.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/staging/rtl8188eu/os_dep/usb_intf.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/staging/rtl8188eu/os_dep/usb_intf.c b/drivers/staging/rtl8188eu/os_dep/usb_intf.c index 664d93a7f90df..4fac9dca798e8 100644 --- a/drivers/staging/rtl8188eu/os_dep/usb_intf.c +++ b/drivers/staging/rtl8188eu/os_dep/usb_intf.c @@ -348,8 +348,10 @@ static struct adapter *rtw_usb_if1_init(struct dvobj_priv *dvobj, }
padapter->HalData = kzalloc(sizeof(struct hal_data_8188e), GFP_KERNEL); - if (!padapter->HalData) - DBG_88E("cant not alloc memory for HAL DATA\n"); + if (!padapter->HalData) { + DBG_88E("Failed to allocate memory for HAL data\n"); + goto free_adapter; + }
/* step read_chip_version */ rtw_hal_read_chip_version(padapter);
From: Krishnamraju Eraparaju krishna2@chelsio.com
[ Upstream commit df791c54d627bae53c9be3be40a69594c55de487 ]
In siw_qp_llp_write_space(), 'sock' members should be accessed with sk_callback_lock held, otherwise, it could race with siw_sk_restore_upcalls(). And this could cause "NULL deref" panic. Below panic is due to the NULL cep returned from sk_to_cep(sk):
Call Trace: <IRQ> siw_qp_llp_write_space+0x11/0x40 [siw] tcp_check_space+0x4c/0xf0 tcp_rcv_established+0x52b/0x630 tcp_v4_do_rcv+0xf4/0x1e0 tcp_v4_rcv+0x9b8/0xab0 ip_protocol_deliver_rcu+0x2c/0x1c0 ip_local_deliver_finish+0x44/0x50 ip_local_deliver+0x6b/0xf0 ? ip_protocol_deliver_rcu+0x1c0/0x1c0 ip_rcv+0x52/0xd0 ? ip_rcv_finish_core.isra.14+0x390/0x390 __netif_receive_skb_one_core+0x83/0xa0 netif_receive_skb_internal+0x73/0xb0 napi_gro_frags+0x1ff/0x2b0 t4_ethrx_handler+0x4a7/0x740 [cxgb4] process_responses+0x2c9/0x590 [cxgb4] ? t4_sge_intr_msix+0x1d/0x30 [cxgb4] ? handle_irq_event_percpu+0x51/0x70 ? handle_irq_event+0x41/0x60 ? handle_edge_irq+0x97/0x1a0 napi_rx_handler+0x14/0xe0 [cxgb4] net_rx_action+0x2af/0x410 __do_softirq+0xda/0x2a8 do_softirq_own_stack+0x2a/0x40 </IRQ> do_softirq+0x50/0x60 __local_bh_enable_ip+0x50/0x60 ip_finish_output2+0x18f/0x520 ip_output+0x6e/0xf0 ? __ip_finish_output+0x1f0/0x1f0 __ip_queue_xmit+0x14f/0x3d0 ? __slab_alloc+0x4b/0x58 __tcp_transmit_skb+0x57d/0xa60 tcp_write_xmit+0x23b/0xfd0 __tcp_push_pending_frames+0x2e/0xf0 tcp_sendmsg_locked+0x939/0xd50 tcp_sendmsg+0x27/0x40 sock_sendmsg+0x57/0x80 siw_tx_hdt+0x894/0xb20 [siw] ? find_busiest_group+0x3e/0x5b0 ? common_interrupt+0xa/0xf ? common_interrupt+0xa/0xf ? common_interrupt+0xa/0xf siw_qp_sq_process+0xf1/0xe60 [siw] ? __wake_up_common_lock+0x87/0xc0 siw_sq_resume+0x33/0xe0 [siw] siw_run_sq+0xac/0x190 [siw] ? remove_wait_queue+0x60/0x60 kthread+0xf8/0x130 ? siw_sq_resume+0xe0/0xe0 [siw] ? kthread_bind+0x10/0x10 ret_from_fork+0x35/0x40
Fixes: f29dd55b0236 ("rdma/siw: queue pair methods") Link: https://lore.kernel.org/r/20190923101112.32685-1-krishna2@chelsio.com Signed-off-by: Krishnamraju Eraparaju krishna2@chelsio.com Reviewed-by: Bernard Metzler bmt@zurich.ibm.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/sw/siw/siw_qp.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/sw/siw/siw_qp.c b/drivers/infiniband/sw/siw/siw_qp.c index 430314c8abd94..52d402f39df93 100644 --- a/drivers/infiniband/sw/siw/siw_qp.c +++ b/drivers/infiniband/sw/siw/siw_qp.c @@ -182,12 +182,19 @@ void siw_qp_llp_close(struct siw_qp *qp) */ void siw_qp_llp_write_space(struct sock *sk) { - struct siw_cep *cep = sk_to_cep(sk); + struct siw_cep *cep;
- cep->sk_write_space(sk); + read_lock(&sk->sk_callback_lock); + + cep = sk_to_cep(sk); + if (cep) { + cep->sk_write_space(sk);
- if (!test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) - (void)siw_sq_start(cep->qp); + if (!test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) + (void)siw_sq_start(cep->qp); + } + + read_unlock(&sk->sk_callback_lock); }
static int siw_qp_readq_init(struct siw_qp *qp, int irq_size, int orq_size)
From: Navid Emamdoost navid.emamdoost@gmail.com
[ Upstream commit 34b3be18a04ecdc610aae4c48e5d1b799d8689f6 ]
In sdma_init if rhashtable_init fails the allocated memory for tmp_sdma_rht should be released.
Fixes: 5a52a7acf7e2 ("IB/hfi1: NULL pointer dereference when freeing rhashtable") Link: https://lore.kernel.org/r/20190925144543.10141-1-navid.emamdoost@gmail.com Signed-off-by: Navid Emamdoost navid.emamdoost@gmail.com Acked-by: Dennis Dalessandro dennis.dalessandro@intel.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hfi1/sdma.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c index 2395fd4233a7e..2ed7bfd5feea6 100644 --- a/drivers/infiniband/hw/hfi1/sdma.c +++ b/drivers/infiniband/hw/hfi1/sdma.c @@ -1526,8 +1526,11 @@ int sdma_init(struct hfi1_devdata *dd, u8 port) }
ret = rhashtable_init(tmp_sdma_rht, &sdma_rht_params); - if (ret < 0) + if (ret < 0) { + kfree(tmp_sdma_rht); goto bail; + } + dd->sdma_rht = tmp_sdma_rht;
dd_dev_info(dd, "SDMA num_sdma: %u\n", dd->num_sdma);
From: Potnuri Bharat Teja bharat@chelsio.com
[ Upstream commit 91724c1e5afe45b64970036170659726e7dc5cff ]
dump_qp() is wrongly trying to dump SRQ structures as QP when SRQ is used by the application. This patch matches the QPID before dumping them. Also removes unwanted SRQ id addition to QP id xarray.
Fixes: 2f43129127e6 ("cxgb4: Convert qpidr to XArray") Link: https://lore.kernel.org/r/20190930074119.20046-1-bharat@chelsio.com Signed-off-by: Rahul Kundu rahul.kundu@chelsio.com Signed-off-by: Potnuri Bharat Teja bharat@chelsio.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/cxgb4/device.c | 7 +++++-- drivers/infiniband/hw/cxgb4/qp.c | 10 +--------- 2 files changed, 6 insertions(+), 11 deletions(-)
diff --git a/drivers/infiniband/hw/cxgb4/device.c b/drivers/infiniband/hw/cxgb4/device.c index a8b9548bd1a26..599340c1f0b82 100644 --- a/drivers/infiniband/hw/cxgb4/device.c +++ b/drivers/infiniband/hw/cxgb4/device.c @@ -242,10 +242,13 @@ static void set_ep_sin6_addrs(struct c4iw_ep *ep, } }
-static int dump_qp(struct c4iw_qp *qp, struct c4iw_debugfs_data *qpd) +static int dump_qp(unsigned long id, struct c4iw_qp *qp, + struct c4iw_debugfs_data *qpd) { int space; int cc; + if (id != qp->wq.sq.qid) + return 0;
space = qpd->bufsize - qpd->pos - 1; if (space == 0) @@ -350,7 +353,7 @@ static int qp_open(struct inode *inode, struct file *file)
xa_lock_irq(&qpd->devp->qps); xa_for_each(&qpd->devp->qps, index, qp) - dump_qp(qp, qpd); + dump_qp(index, qp, qpd); xa_unlock_irq(&qpd->devp->qps);
qpd->buf[qpd->pos++] = 0; diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c index eb9368be28c1d..bbcac539777a2 100644 --- a/drivers/infiniband/hw/cxgb4/qp.c +++ b/drivers/infiniband/hw/cxgb4/qp.c @@ -2737,15 +2737,11 @@ int c4iw_create_srq(struct ib_srq *ib_srq, struct ib_srq_init_attr *attrs, if (CHELSIO_CHIP_VERSION(rhp->rdev.lldi.adapter_type) > CHELSIO_T6) srq->flags = T4_SRQ_LIMIT_SUPPORT;
- ret = xa_insert_irq(&rhp->qps, srq->wq.qid, srq, GFP_KERNEL); - if (ret) - goto err_free_queue; - if (udata) { srq_key_mm = kmalloc(sizeof(*srq_key_mm), GFP_KERNEL); if (!srq_key_mm) { ret = -ENOMEM; - goto err_remove_handle; + goto err_free_queue; } srq_db_key_mm = kmalloc(sizeof(*srq_db_key_mm), GFP_KERNEL); if (!srq_db_key_mm) { @@ -2789,8 +2785,6 @@ err_free_srq_db_key_mm: kfree(srq_db_key_mm); err_free_srq_key_mm: kfree(srq_key_mm); -err_remove_handle: - xa_erase_irq(&rhp->qps, srq->wq.qid); err_free_queue: free_srq_queue(srq, ucontext ? &ucontext->uctx : &rhp->rdev.uctx, srq->wr_waitp); @@ -2813,8 +2807,6 @@ void c4iw_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata) rhp = srq->rhp;
pr_debug("%s id %d\n", __func__, srq->wq.qid); - - xa_erase_irq(&rhp->qps, srq->wq.qid); ucontext = rdma_udata_to_drv_context(udata, struct c4iw_ucontext, ibucontext); free_srq_queue(srq, ucontext ? &ucontext->uctx : &rhp->rdev.uctx,
From: Bart Van Assche bvanassche@acm.org
[ Upstream commit b66f31efbdad95ec274345721d99d1d835e6de01 ]
This patch fixes the lock inversion complaint:
============================================ WARNING: possible recursive locking detected 5.3.0-rc7-dbg+ #1 Not tainted -------------------------------------------- kworker/u16:6/171 is trying to acquire lock: 00000000035c6e6c (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x78/0x4a0 [rdma_cm]
but task is already holding lock: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm]
other info that might help us debug this: Possible unsafe locking scenario:
CPU0 ---- lock(&id_priv->handler_mutex); lock(&id_priv->handler_mutex);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by kworker/u16:6/171: #0: 00000000e2eaa773 ((wq_completion)iw_cm_wq){+.+.}, at: process_one_work+0x472/0xac0 #1: 000000001efd357b ((work_completion)(&work->work)#3){+.+.}, at: process_one_work+0x476/0xac0 #2: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm]
stack backtrace: CPU: 3 PID: 171 Comm: kworker/u16:6 Not tainted 5.3.0-rc7-dbg+ #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: iw_cm_wq cm_work_handler [iw_cm] Call Trace: dump_stack+0x8a/0xd6 __lock_acquire.cold+0xe1/0x24d lock_acquire+0x106/0x240 __mutex_lock+0x12e/0xcb0 mutex_lock_nested+0x1f/0x30 rdma_destroy_id+0x78/0x4a0 [rdma_cm] iw_conn_req_handler+0x5c9/0x680 [rdma_cm] cm_work_handler+0xe62/0x1100 [iw_cm] process_one_work+0x56d/0xac0 worker_thread+0x7a/0x5d0 kthread+0x1bc/0x210 ret_from_fork+0x24/0x30
This is not a bug as there are actually two lock classes here.
Link: https://lore.kernel.org/r/20190930231707.48259-3-bvanassche@acm.org Fixes: de910bd92137 ("RDMA/cma: Simplify locking needed for serialization of callbacks") Signed-off-by: Bart Van Assche bvanassche@acm.org Reviewed-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/cma.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index a68d0ccf67a43..2e48b59926c19 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -2396,9 +2396,10 @@ static int iw_conn_req_handler(struct iw_cm_id *cm_id, conn_id->cm_id.iw = NULL; cma_exch(conn_id, RDMA_CM_DESTROYING); mutex_unlock(&conn_id->handler_mutex); + mutex_unlock(&listen_id->handler_mutex); cma_deref_id(conn_id); rdma_destroy_id(&conn_id->id); - goto out; + return ret; }
mutex_unlock(&conn_id->handler_mutex);
From: Dexuan Cui decui@microsoft.com
[ Upstream commit 6a297c90efa68b2864483193b8bfb0d19478600c ]
Simplify the ring buffer handling with the in-place API.
Also avoid the dynamic allocation and the memory leak in the channel callback function.
Signed-off-by: Dexuan Cui decui@microsoft.com Acked-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-hyperv.c | 56 +++++++--------------------------------- 1 file changed, 10 insertions(+), 46 deletions(-)
diff --git a/drivers/hid/hid-hyperv.c b/drivers/hid/hid-hyperv.c index 7795831d37c21..f363163200759 100644 --- a/drivers/hid/hid-hyperv.c +++ b/drivers/hid/hid-hyperv.c @@ -314,60 +314,24 @@ static void mousevsc_on_receive(struct hv_device *device,
static void mousevsc_on_channel_callback(void *context) { - const int packet_size = 0x100; - int ret; struct hv_device *device = context; - u32 bytes_recvd; - u64 req_id; struct vmpacket_descriptor *desc; - unsigned char *buffer; - int bufferlen = packet_size; - - buffer = kmalloc(bufferlen, GFP_ATOMIC); - if (!buffer) - return; - - do { - ret = vmbus_recvpacket_raw(device->channel, buffer, - bufferlen, &bytes_recvd, &req_id); - - switch (ret) { - case 0: - if (bytes_recvd <= 0) { - kfree(buffer); - return; - } - desc = (struct vmpacket_descriptor *)buffer; - - switch (desc->type) { - case VM_PKT_COMP: - break; - - case VM_PKT_DATA_INBAND: - mousevsc_on_receive(device, desc); - break; - - default: - pr_err("unhandled packet type %d, tid %llx len %d\n", - desc->type, req_id, bytes_recvd); - break; - }
+ foreach_vmbus_pkt(desc, device->channel) { + switch (desc->type) { + case VM_PKT_COMP: break;
- case -ENOBUFS: - kfree(buffer); - /* Handle large packet */ - bufferlen = bytes_recvd; - buffer = kmalloc(bytes_recvd, GFP_ATOMIC); - - if (!buffer) - return; + case VM_PKT_DATA_INBAND: + mousevsc_on_receive(device, desc); + break;
+ default: + pr_err("Unhandled packet type %d, tid %llx len %d\n", + desc->type, desc->trans_id, desc->len8 * 8); break; } - } while (1); - + } }
static int mousevsc_connect_to_vsp(struct hv_device *device)
From: Cristian Marussi cristian.marussi@arm.com
[ Upstream commit 131b30c94fbc0adb15f911609884dd39dada8f00 ]
A TARGET which failed to be built/installed should not be included in the runlist generated inside the run_kselftest.sh script.
Signed-off-by: Cristian Marussi cristian.marussi@arm.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/Makefile | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index 25b43a8c2b159..1779923d7a7b8 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -198,8 +198,12 @@ ifdef INSTALL_PATH echo " cat /dev/null > $$logfile" >> $(ALL_SCRIPT) echo "fi" >> $(ALL_SCRIPT)
+ @# While building run_kselftest.sh skip also non-existent TARGET dirs: + @# they could be the result of a build failure and should NOT be + @# included in the generated runlist. for TARGET in $(TARGETS); do \ BUILD_TARGET=$$BUILD/$$TARGET; \ + [ ! -d $$INSTALL_PATH/$$TARGET ] && echo "Skipping non-existent dir: $$TARGET" && continue; \ echo "[ -w /dev/kmsg ] && echo "kselftest: Running tests in $$TARGET" >> /dev/kmsg" >> $(ALL_SCRIPT); \ echo "cd $$TARGET" >> $(ALL_SCRIPT); \ echo -n "run_many" >> $(ALL_SCRIPT); \
From: Kees Cook keescook@chromium.org
[ Upstream commit 852c8cbf34d3b3130a05c38064dd98614f97d3a8 ]
Commit a745f7af3cbd ("selftests/harness: Add 30 second timeout per test") solves the problem of kselftest_harness.h-using binary tests possibly hanging forever. However, scripts and other binaries can still hang forever. This adds a global timeout to each test script run.
To make this configurable (e.g. as needed in the "rtc" test case), include a new per-test-directory "settings" file (similar to "config") that can contain kselftest-specific settings. The first recognized field is "timeout".
Additionally, this splits the reporting for timeouts into a specific "TIMEOUT" not-ok (and adds exit code reporting in the remaining case).
Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/kselftest/runner.sh | 36 +++++++++++++++++++-- tools/testing/selftests/rtc/settings | 1 + 2 files changed, 34 insertions(+), 3 deletions(-) create mode 100644 tools/testing/selftests/rtc/settings
diff --git a/tools/testing/selftests/kselftest/runner.sh b/tools/testing/selftests/kselftest/runner.sh index 00c9020bdda8b..84de7bc74f2cf 100644 --- a/tools/testing/selftests/kselftest/runner.sh +++ b/tools/testing/selftests/kselftest/runner.sh @@ -3,9 +3,14 @@ # # Runs a set of tests in a given subdirectory. export skip_rc=4 +export timeout_rc=124 export logfile=/dev/stdout export per_test_logging=
+# Defaults for "settings" file fields: +# "timeout" how many seconds to let each test run before failing. +export kselftest_default_timeout=45 + # There isn't a shell-agnostic way to find the path of a sourced file, # so we must rely on BASE_DIR being set to find other tools. if [ -z "$BASE_DIR" ]; then @@ -24,6 +29,16 @@ tap_prefix() fi }
+tap_timeout() +{ + # Make sure tests will time out if utility is available. + if [ -x /usr/bin/timeout ] ; then + /usr/bin/timeout "$kselftest_timeout" "$1" + else + "$1" + fi +} + run_one() { DIR="$1" @@ -32,6 +47,18 @@ run_one()
BASENAME_TEST=$(basename $TEST)
+ # Reset any "settings"-file variables. + export kselftest_timeout="$kselftest_default_timeout" + # Load per-test-directory kselftest "settings" file. + settings="$BASE_DIR/$DIR/settings" + if [ -r "$settings" ] ; then + while read line ; do + field=$(echo "$line" | cut -d= -f1) + value=$(echo "$line" | cut -d= -f2-) + eval "kselftest_$field"="$value" + done < "$settings" + fi + TEST_HDR_MSG="selftests: $DIR: $BASENAME_TEST" echo "# $TEST_HDR_MSG" if [ ! -x "$TEST" ]; then @@ -44,14 +71,17 @@ run_one() echo "not ok $test_num $TEST_HDR_MSG" else cd `dirname $TEST` > /dev/null - (((((./$BASENAME_TEST 2>&1; echo $? >&3) | + ((((( tap_timeout ./$BASENAME_TEST 2>&1; echo $? >&3) | tap_prefix >&4) 3>&1) | (read xs; exit $xs)) 4>>"$logfile" && echo "ok $test_num $TEST_HDR_MSG") || - (if [ $? -eq $skip_rc ]; then \ + (rc=$?; \ + if [ $rc -eq $skip_rc ]; then \ echo "not ok $test_num $TEST_HDR_MSG # SKIP" + elif [ $rc -eq $timeout_rc ]; then \ + echo "not ok $test_num $TEST_HDR_MSG # TIMEOUT" else - echo "not ok $test_num $TEST_HDR_MSG" + echo "not ok $test_num $TEST_HDR_MSG # exit=$rc" fi) cd - >/dev/null fi diff --git a/tools/testing/selftests/rtc/settings b/tools/testing/selftests/rtc/settings new file mode 100644 index 0000000000000..ba4d85f74cd6b --- /dev/null +++ b/tools/testing/selftests/rtc/settings @@ -0,0 +1 @@ +timeout=90
From: ZhangXiaoxu zhangxiaoxu5@huawei.com
[ Upstream commit 33ea5aaa87cdae0f9af4d6b7ee4f650a1a36fd1d ]
When xfstests testing, there are some WARNING as below:
WARNING: CPU: 0 PID: 6235 at fs/nfs/inode.c:122 nfs_clear_inode+0x9c/0xd8 Modules linked in: CPU: 0 PID: 6235 Comm: umount.nfs Hardware name: linux,dummy-virt (DT) pstate: 60000005 (nZCv daif -PAN -UAO) pc : nfs_clear_inode+0x9c/0xd8 lr : nfs_evict_inode+0x60/0x78 sp : fffffc000f68fc00 x29: fffffc000f68fc00 x28: fffffe00c53155c0 x27: fffffe00c5315000 x26: fffffc0009a63748 x25: fffffc000f68fd18 x24: fffffc000bfaaf40 x23: fffffc000936d3c0 x22: fffffe00c4ff5e20 x21: fffffc000bfaaf40 x20: fffffe00c4ff5d10 x19: fffffc000c056000 x18: 000000000000003c x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000040 x14: 0000000000000228 x13: fffffc000c3a2000 x12: 0000000000000045 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : 0000000000000000 x7 : 0000000000000000 x6 : fffffc00084b027c x5 : fffffc0009a64000 x4 : fffffe00c0e77400 x3 : fffffc000c0563a8 x2 : fffffffffffffffb x1 : 000000000000764e x0 : 0000000000000001 Call trace: nfs_clear_inode+0x9c/0xd8 nfs_evict_inode+0x60/0x78 evict+0x108/0x380 dispose_list+0x70/0xa0 evict_inodes+0x194/0x210 generic_shutdown_super+0xb0/0x220 nfs_kill_super+0x40/0x88 deactivate_locked_super+0xb4/0x120 deactivate_super+0x144/0x160 cleanup_mnt+0x98/0x148 __cleanup_mnt+0x38/0x50 task_work_run+0x114/0x160 do_notify_resume+0x2f8/0x308 work_pending+0x8/0x14
The nrequest should be increased/decreased only if PG_INODE_REF flag was setted.
But in the nfs_inode_remove_request function, it maybe decrease when no PG_INODE_REF flag, this maybe lead nrequests count error.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: ZhangXiaoxu zhangxiaoxu5@huawei.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/write.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 85ca49549b39b..52cab65f91cf0 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -786,7 +786,6 @@ static void nfs_inode_remove_request(struct nfs_page *req) struct nfs_inode *nfsi = NFS_I(inode); struct nfs_page *head;
- atomic_long_dec(&nfsi->nrequests); if (nfs_page_group_sync_on_bit(req, PG_REMOVE)) { head = req->wb_head;
@@ -799,8 +798,10 @@ static void nfs_inode_remove_request(struct nfs_page *req) spin_unlock(&mapping->private_lock); }
- if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags)) + if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags)) { nfs_release_request(req); + atomic_long_dec(&nfsi->nrequests); + } }
static void
From: Julien Grall julien.grall@arm.com
[ Upstream commit 7230f7e99fecc684180322b056fad3853d1029d3 ]
The HWCAP framework will detect a new capability based on the sanitized version of the ID registers.
Sanitization is based on a whitelist, so any field not described will end up to be zeroed.
At the moment, ID_AA64ISAR1_EL1.FRINTTS is not described in ftr_id_aa64isar1. This means the field will be zeroed and therefore the userspace will not be able to see the HWCAP even if the hardware supports the feature.
This can be fixed by describing the field in ftr_id_aa64isar1.
Fixes: ca9503fc9e98 ("arm64: Expose FRINT capabilities to userspace") Signed-off-by: Julien Grall julien.grall@arm.com Cc: mark.brown@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/cpufeature.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 9323bcc40a58a..cabebf1a79768 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -136,6 +136,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar0[] = {
static const struct arm64_ftr_bits ftr_id_aa64isar1[] = { ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_SB_SHIFT, 4, 0), + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_FRINTTS_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH), FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_GPI_SHIFT, 4, 0), ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
From: James Morse james.morse@arm.com
[ Upstream commit f46f27a576cc3b1e3d45ea50bc06287aa46b04b2 ]
Commit bd82d4bd2188 ("arm64: Fix incorrect irqflag restore for priority masking") added a macro to the entry.S call paths that leave the PSTATE.I bit set. This tells the pPNMI masking logic that interrupts are masked by the CPU, not by the PMR. This value is read back by local_daif_save().
Commit bd82d4bd2188 added this call to el0_svc, as el0_svc_handler is called with interrupts masked. el0_svc_compat was missed, but should be covered in the same way as both of these paths end up in el0_svc_common(), which expects to unmask interrupts.
Fixes: bd82d4bd2188 ("arm64: Fix incorrect irqflag restore for priority masking") Signed-off-by: James Morse james.morse@arm.com Cc: Julien Thierry julien.thierry.kdev@gmail.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/entry.S | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 109894bd31948..239f6841a7412 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -775,6 +775,7 @@ el0_sync_compat: b.ge el0_dbg b el0_inv el0_svc_compat: + gic_prio_kentry_setup tmp=x1 mov x0, sp bl el0_svc_compat_handler b ret_to_user
From: James Morse james.morse@arm.com
[ Upstream commit dd8a1f13488438c6c220b7cafa500baaf21a6e53 ]
CPUs affected by Neoverse-N1 #1542419 may execute a stale instruction if it was recently modified. The affected sequence requires freshly written instructions to be executable before a branch to them is updated.
There are very few places in the kernel that modify executable text, all but one come with sufficient synchronisation: * The module loader's flush_module_icache() calls flush_icache_range(), which does a kick_all_cpus_sync() * bpf_int_jit_compile() calls flush_icache_range(). * Kprobes calls aarch64_insn_patch_text(), which does its work in stop_machine(). * static keys and ftrace both patch between nops and branches to existing kernel code (not generated code).
The affected sequence is the interaction between ftrace and modules. The module PLT is cleaned using __flush_icache_range() as the trampoline shouldn't be executable until we update the branch to it.
Drop the double-underscore so that this path runs kick_all_cpus_sync() too.
Signed-off-by: James Morse james.morse@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/ftrace.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c index 1717732579742..06e56b4703153 100644 --- a/arch/arm64/kernel/ftrace.c +++ b/arch/arm64/kernel/ftrace.c @@ -121,10 +121,16 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
/* * Ensure updated trampoline is visible to instruction - * fetch before we patch in the branch. + * fetch before we patch in the branch. Although the + * architecture doesn't require an IPI in this case, + * Neoverse-N1 erratum #1542419 does require one + * if the TLB maintenance in module_enable_ro() is + * skipped due to rodata_enabled. It doesn't seem worth + * it to make it conditional given that this is + * certainly not a fast-path. */ - __flush_icache_range((unsigned long)&dst[0], - (unsigned long)&dst[1]); + flush_icache_range((unsigned long)&dst[0], + (unsigned long)&dst[1]); } addr = (unsigned long)dst; #else /* CONFIG_ARM64_MODULE_PLTS */
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 6264dab6efd6069f0387efb078a9960b5642377b ]
'exit' functions should be marked as __exit, not __init.
Fixes: fc60a8b675bd ("tty: serial: owl: Implement console driver") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Link: https://lore.kernel.org/r/20190910041129.6978-1-christophe.jaillet@wanadoo.f... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/owl-uart.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/serial/owl-uart.c b/drivers/tty/serial/owl-uart.c index 29a6dc6a8d23c..73fcc6bdb0312 100644 --- a/drivers/tty/serial/owl-uart.c +++ b/drivers/tty/serial/owl-uart.c @@ -742,7 +742,7 @@ static int __init owl_uart_init(void) return ret; }
-static void __init owl_uart_exit(void) +static void __exit owl_uart_exit(void) { platform_driver_unregister(&owl_uart_platform_driver); uart_unregister_driver(&owl_uart_driver);
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 5080d127127ac5b610b57900774d9559ae55e817 ]
'exit' functions should be marked as __exit, not __init.
Fixes: c10b13325ced ("tty: serial: Add RDA8810PL UART driver") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Link: https://lore.kernel.org/r/20190910041702.7357-1-christophe.jaillet@wanadoo.f... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/rda-uart.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/serial/rda-uart.c b/drivers/tty/serial/rda-uart.c index 284623eefaeba..ba5e488a03742 100644 --- a/drivers/tty/serial/rda-uart.c +++ b/drivers/tty/serial/rda-uart.c @@ -817,7 +817,7 @@ static int __init rda_uart_init(void) return ret; }
-static void __init rda_uart_exit(void) +static void __exit rda_uart_exit(void) { platform_driver_unregister(&rda_uart_platform_driver); uart_unregister_driver(&rda_uart_driver);
From: Christoph Hellwig hch@lst.de
[ Upstream commit 7e2a165de5a52003d10a611ee3884cdb5c44e8cd ]
The sifive serial driver implements earlycon support, but unless another driver is built in that supports earlycon support it won't be usable. Explicitly select SERIAL_EARLYCON instead.
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Paul Walmsley paul.walmsley@sifive.com Link: https://lore.kernel.org/r/20190910055923.28384-1-hch@lst.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig index 3083dbae35f7e..3b436ccd29dad 100644 --- a/drivers/tty/serial/Kconfig +++ b/drivers/tty/serial/Kconfig @@ -1075,6 +1075,7 @@ config SERIAL_SIFIVE_CONSOLE bool "Console on SiFive UART" depends on SERIAL_SIFIVE=y select SERIAL_CORE_CONSOLE + select SERIAL_EARLYCON help Select this option if you would like to use a SiFive UART as the system console.
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit 47a7e5e97d4edd7b14974d34f0e5a5560fad2915 ]
Fix tty driver build on SPARC by not using __exitdata. It appears that SPARC does not support section .exit.data.
Fixes these build errors:
`.exit.data' referenced in section `.exit.text' of drivers/tty/n_hdlc.o: defined in discarded section `.exit.data' of drivers/tty/n_hdlc.o `.exit.data' referenced in section `.exit.text' of drivers/tty/n_hdlc.o: defined in discarded section `.exit.data' of drivers/tty/n_hdlc.o `.exit.data' referenced in section `.exit.text' of drivers/tty/n_hdlc.o: defined in discarded section `.exit.data' of drivers/tty/n_hdlc.o `.exit.data' referenced in section `.exit.text' of drivers/tty/n_hdlc.o: defined in discarded section `.exit.data' of drivers/tty/n_hdlc.o
Reported-by: kbuild test robot lkp@intel.com Fixes: 063246641d4a ("format-security: move static strings to const") Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: Kees Cook keescook@chromium.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: "David S. Miller" davem@davemloft.net Cc: Andrew Morton akpm@linux-foundation.org Link: https://lore.kernel.org/r/675e7bd9-955b-3ff3-1101-a973b58b5b75@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/n_hdlc.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/tty/n_hdlc.c b/drivers/tty/n_hdlc.c index e55c79eb64309..98361acd3053f 100644 --- a/drivers/tty/n_hdlc.c +++ b/drivers/tty/n_hdlc.c @@ -968,6 +968,11 @@ static int __init n_hdlc_init(void) } /* end of init_module() */
+#ifdef CONFIG_SPARC +#undef __exitdata +#define __exitdata +#endif + static const char hdlc_unregister_ok[] __exitdata = KERN_INFO "N_HDLC: line discipline unregistered\n"; static const char hdlc_unregister_fail[] __exitdata =
From: Navid Emamdoost navid.emamdoost@gmail.com
[ Upstream commit fc739a058d99c9297ef6bfd923b809d85855b9a9 ]
In fastrpc_dma_buf_attach if dma_get_sgtable fails the allocated memory for a should be released.
Signed-off-by: Navid Emamdoost navid.emamdoost@gmail.com Link: https://lore.kernel.org/r/20190925152742.16258-1-navid.emamdoost@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/fastrpc.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c index 98603e235cf04..a76b6c6fd660f 100644 --- a/drivers/misc/fastrpc.c +++ b/drivers/misc/fastrpc.c @@ -499,6 +499,7 @@ static int fastrpc_dma_buf_attach(struct dma_buf *dmabuf, FASTRPC_PHYS(buffer->phys), buffer->size); if (ret < 0) { dev_err(buffer->dev, "failed to get scatterlist from DMA API\n"); + kfree(a); return -EINVAL; }
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit ab59ca3eb4e7059727df85eee68bda169d26c8f8 ]
According to surrounding error paths, it is likely that 'goto err_get;' is expected here. Otherwise, a call to 'rdma_restrack_put(res);' would be missing.
Fixes: c5dfe0ea6ffa ("RDMA/nldev: Add resource tracker doit callback") Link: https://lore.kernel.org/r/20190818091044.8845-1-christophe.jaillet@wanadoo.f... Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Reviewed-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/nldev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c index 020c269765584..91e2cc9ddb9f8 100644 --- a/drivers/infiniband/core/nldev.c +++ b/drivers/infiniband/core/nldev.c @@ -1230,7 +1230,7 @@ static int res_get_common_doit(struct sk_buff *skb, struct nlmsghdr *nlh, msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg) { ret = -ENOMEM; - goto err; + goto err_get; }
nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
From: Jack Morgenstein jackm@dev.mellanox.co.il
[ Upstream commit 94635c36f3854934a46d9e812e028d4721bbb0e6 ]
In the process of moving the debug counters sysfs entries, the commit mentioned below eliminated the cm_infiniband sysfs directory.
This sysfs directory was tied to the cm_port object allocated in procedure cm_add_one().
Before the commit below, this cm_port object was freed via a call to kobject_put(port->kobj) in procedure cm_remove_port_fs().
Since port no longer uses its kobj, kobject_put(port->kobj) was eliminated. This, however, meant that kfree was never called for the cm_port buffers.
Fix this by adding explicit kfree(port) calls to functions cm_add_one() and cm_remove_one().
Note: the kfree call in the first chunk below (in the cm_add_one error flow) fixes an old, undetected memory leak.
Fixes: c87e65cfb97c ("RDMA/cm: Move debug counters to be under relevant IB device") Link: https://lore.kernel.org/r/20190916071154.20383-2-leon@kernel.org Signed-off-by: Jack Morgenstein jackm@dev.mellanox.co.il Signed-off-by: Leon Romanovsky leonro@mellanox.com Reviewed-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/cm.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c index da10e6ccb43cd..5920c0085d35b 100644 --- a/drivers/infiniband/core/cm.c +++ b/drivers/infiniband/core/cm.c @@ -4399,6 +4399,7 @@ error2: error1: port_modify.set_port_cap_mask = 0; port_modify.clr_port_cap_mask = IB_PORT_CM_SUP; + kfree(port); while (--i) { if (!rdma_cap_ib_cm(ib_device, i)) continue; @@ -4407,6 +4408,7 @@ error1: ib_modify_port(ib_device, port->port_num, 0, &port_modify); ib_unregister_mad_agent(port->mad_agent); cm_remove_port_fs(port); + kfree(port); } free: kfree(cm_dev); @@ -4460,6 +4462,7 @@ static void cm_remove_one(struct ib_device *ib_device, void *client_data) spin_unlock_irq(&cm.state_lock); ib_unregister_mad_agent(cur_mad_agent); cm_remove_port_fs(port); + kfree(port); }
kfree(cm_dev);
From: Leon Romanovsky leonro@mellanox.com
[ Upstream commit 594e6c5d41ed2471ab0b90f3f0b66cdf618b7ac9 ]
Properly unwind QP counter rebinding in case of failure.
Trying to rebind the counter after unbiding it is not going to work reliably, move the unbind to the end so it doesn't have to be unwound.
Fixes: b389327df905 ("RDMA/nldev: Allow counter manual mode configration through RDMA netlink") Link: https://lore.kernel.org/r/20191002115627.16740-1-leon@kernel.org Reviewed-by: Mark Zhang markz@mellanox.com Signed-off-by: Leon Romanovsky leonro@mellanox.com Reviewed-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/core/nldev.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c index 91e2cc9ddb9f8..f42e856f30729 100644 --- a/drivers/infiniband/core/nldev.c +++ b/drivers/infiniband/core/nldev.c @@ -1787,10 +1787,6 @@ static int nldev_stat_del_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
cntn = nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID]); qpn = nla_get_u32(tb[RDMA_NLDEV_ATTR_RES_LQPN]); - ret = rdma_counter_unbind_qpn(device, port, qpn, cntn); - if (ret) - goto err_unbind; - if (fill_nldev_handle(msg, device) || nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port) || nla_put_u32(msg, RDMA_NLDEV_ATTR_STAT_COUNTER_ID, cntn) || @@ -1799,13 +1795,15 @@ static int nldev_stat_del_doit(struct sk_buff *skb, struct nlmsghdr *nlh, goto err_fill; }
+ ret = rdma_counter_unbind_qpn(device, port, qpn, cntn); + if (ret) + goto err_fill; + nlmsg_end(msg, nlh); ib_device_put(device); return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
err_fill: - rdma_counter_bind_qpn(device, port, qpn, cntn); -err_unbind: nlmsg_free(msg); err: ib_device_put(device);
From: Jason Gunthorpe jgg@mellanox.com
[ Upstream commit 880505cfef1d086d18b59d2920eb2160429ffa1f ]
This code is completely broken, the umem of a ODP MR simply cannot be discarded without a lot more locking, nor can an ODP mkey be blithely destroyed via destroy_mkey().
Fixes: 6aec21f6a832 ("IB/mlx5: Page faults handling infrastructure") Link: https://lore.kernel.org/r/20191001153821.23621-2-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov artemyko@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mlx5/mr.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index 3401f5f6792e6..c4ba8838d2c46 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1423,6 +1423,9 @@ int mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start, if (!mr->umem) return -EINVAL;
+ if (is_odp_mr(mr)) + return -EOPNOTSUPP; + if (flags & IB_MR_REREG_TRANS) { addr = virt_addr; len = length; @@ -1468,8 +1471,6 @@ int mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start, }
mr->allocated_from_cache = 0; - if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) - mr->live = 1; } else { /* * Send a UMR WQE @@ -1498,7 +1499,6 @@ int mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start,
set_mr_fields(dev, mr, npages, len, access_flags);
- update_odp_mr(mr); return 0;
err:
From: Jason Gunthorpe jgg@mellanox.com
[ Upstream commit aa116b810ac9077a263ed8679fb4d595f180e0eb ]
During destroy setting live = 0 and then synchronize_srcu() prevents num_pending_prefetch from incrementing, and also, ensures that all work holding that count is queued on the WQ. Testing before causes races of the form:
CPU0 CPU1 dereg_mr() mlx5_ib_advise_mr_prefetch() srcu_read_lock() num_pending_prefetch_inc() if (!live) live = 0 atomic_read() == 0 // skip flush_workqueue() atomic_inc() queue_work(); srcu_read_unlock() WARN_ON(atomic_read()) // Fails
Swap the order so that the synchronize_srcu() prevents this.
Fixes: a6bc3875f176 ("IB/mlx5: Protect against prefetch of invalid MR") Link: https://lore.kernel.org/r/20191001153821.23621-5-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov artemyko@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mlx5/mr.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index c4ba8838d2c46..96c8a6835592d 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1591,13 +1591,14 @@ static void dereg_mr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr) */ mr->live = 0;
+ /* Wait for all running page-fault handlers to finish. */ + synchronize_srcu(&dev->mr_srcu); + /* dequeue pending prefetch requests for the mr */ if (atomic_read(&mr->num_pending_prefetch)) flush_workqueue(system_unbound_wq); WARN_ON(atomic_read(&mr->num_pending_prefetch));
- /* Wait for all running page-fault handlers to finish. */ - synchronize_srcu(&dev->mr_srcu); /* Destroy all page mappings */ if (umem_odp->page_list) mlx5_ib_invalidate_range(umem_odp,
From: Jason Gunthorpe jgg@mellanox.com
[ Upstream commit 0417791536ae1e28d7f0418f1d20048ec4d3c6cf ]
While MR uses live as the SRCU 'update', the MW case uses the xarray directly, xa_erase() causes the MW to become inaccessible to the pagefault thread.
Thus whenever a MW is removed from the xarray we must synchronize_srcu() before freeing it.
This must be done before freeing the mkey as re-use of the mkey while the pagefault thread is using the stale mkey is undesirable.
Add the missing synchronizes to MW and DEVX indirect mkey and delete the bogus protection against double destroy in mlx5_core_destroy_mkey()
Fixes: 534fd7aac56a ("IB/mlx5: Manage indirection mkey upon DEVX flow for ODP") Fixes: 6aec21f6a832 ("IB/mlx5: Page faults handling infrastructure") Link: https://lore.kernel.org/r/20191001153821.23621-7-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov artemyko@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mlx5/devx.c | 58 ++++++-------------- drivers/infiniband/hw/mlx5/mlx5_ib.h | 1 - drivers/infiniband/hw/mlx5/mr.c | 21 +++++-- drivers/net/ethernet/mellanox/mlx5/core/mr.c | 8 +-- 4 files changed, 33 insertions(+), 55 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c index af5bbb35c0589..ef7ba0133d28a 100644 --- a/drivers/infiniband/hw/mlx5/devx.c +++ b/drivers/infiniband/hw/mlx5/devx.c @@ -1275,29 +1275,6 @@ static int devx_handle_mkey_create(struct mlx5_ib_dev *dev, return 0; }
-static void devx_free_indirect_mkey(struct rcu_head *rcu) -{ - kfree(container_of(rcu, struct devx_obj, devx_mr.rcu)); -} - -/* This function to delete from the radix tree needs to be called before - * destroying the underlying mkey. Otherwise a race might occur in case that - * other thread will get the same mkey before this one will be deleted, - * in that case it will fail via inserting to the tree its own data. - * - * Note: - * An error in the destroy is not expected unless there is some other indirect - * mkey which points to this one. In a kernel cleanup flow it will be just - * destroyed in the iterative destruction call. In a user flow, in case - * the application didn't close in the expected order it's its own problem, - * the mkey won't be part of the tree, in both cases the kernel is safe. - */ -static void devx_cleanup_mkey(struct devx_obj *obj) -{ - xa_erase(&obj->ib_dev->mdev->priv.mkey_table, - mlx5_base_mkey(obj->devx_mr.mmkey.key)); -} - static void devx_cleanup_subscription(struct mlx5_ib_dev *dev, struct devx_event_subscription *sub) { @@ -1339,8 +1316,16 @@ static int devx_obj_cleanup(struct ib_uobject *uobject, int ret;
dev = mlx5_udata_to_mdev(&attrs->driver_udata); - if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) - devx_cleanup_mkey(obj); + if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) { + /* + * The pagefault_single_data_segment() does commands against + * the mmkey, we must wait for that to stop before freeing the + * mkey, as another allocation could get the same mkey #. + */ + xa_erase(&obj->ib_dev->mdev->priv.mkey_table, + mlx5_base_mkey(obj->devx_mr.mmkey.key)); + synchronize_srcu(&dev->mr_srcu); + }
if (obj->flags & DEVX_OBJ_FLAGS_DCT) ret = mlx5_core_destroy_dct(obj->ib_dev->mdev, &obj->core_dct); @@ -1359,12 +1344,6 @@ static int devx_obj_cleanup(struct ib_uobject *uobject, devx_cleanup_subscription(dev, sub_entry); mutex_unlock(&devx_event_table->event_xa_lock);
- if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) { - call_srcu(&dev->mr_srcu, &obj->devx_mr.rcu, - devx_free_indirect_mkey); - return ret; - } - kfree(obj); return ret; } @@ -1468,26 +1447,21 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_OBJ_CREATE)( &obj_id); WARN_ON(obj->dinlen > MLX5_MAX_DESTROY_INBOX_SIZE_DW * sizeof(u32));
- if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) { - err = devx_handle_mkey_indirect(obj, dev, cmd_in, cmd_out); - if (err) - goto obj_destroy; - } - err = uverbs_copy_to(attrs, MLX5_IB_ATTR_DEVX_OBJ_CREATE_CMD_OUT, cmd_out, cmd_out_len); if (err) - goto err_copy; + goto obj_destroy;
if (opcode == MLX5_CMD_OP_CREATE_GENERAL_OBJECT) obj_type = MLX5_GET(general_obj_in_cmd_hdr, cmd_in, obj_type); - obj->obj_id = get_enc_obj_id(opcode | obj_type << 16, obj_id);
+ if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) { + err = devx_handle_mkey_indirect(obj, dev, cmd_in, cmd_out); + if (err) + goto obj_destroy; + } return 0;
-err_copy: - if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) - devx_cleanup_mkey(obj); obj_destroy: if (obj->flags & DEVX_OBJ_FLAGS_DCT) mlx5_core_destroy_dct(obj->ib_dev->mdev, &obj->core_dct); diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 9ae587b74b121..43c7353b98123 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -638,7 +638,6 @@ struct mlx5_ib_mw { struct mlx5_ib_devx_mr { struct mlx5_core_mkey mmkey; int ndescs; - struct rcu_head rcu; };
struct mlx5_ib_umr_context { diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index 96c8a6835592d..c15d05f61cc7b 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1970,14 +1970,25 @@ free:
int mlx5_ib_dealloc_mw(struct ib_mw *mw) { + struct mlx5_ib_dev *dev = to_mdev(mw->device); struct mlx5_ib_mw *mmw = to_mmw(mw); int err;
- err = mlx5_core_destroy_mkey((to_mdev(mw->device))->mdev, - &mmw->mmkey); - if (!err) - kfree(mmw); - return err; + if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) { + xa_erase(&dev->mdev->priv.mkey_table, + mlx5_base_mkey(mmw->mmkey.key)); + /* + * pagefault_single_data_segment() may be accessing mmw under + * SRCU if the user bound an ODP MR to this MW. + */ + synchronize_srcu(&dev->mr_srcu); + } + + err = mlx5_core_destroy_mkey(dev->mdev, &mmw->mmkey); + if (err) + return err; + kfree(mmw); + return 0; }
int mlx5_ib_check_mr_status(struct ib_mr *ibmr, u32 check_mask, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mr.c b/drivers/net/ethernet/mellanox/mlx5/core/mr.c index 9231b39d18b21..c501bf2a02521 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/mr.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/mr.c @@ -112,17 +112,11 @@ int mlx5_core_destroy_mkey(struct mlx5_core_dev *dev, u32 out[MLX5_ST_SZ_DW(destroy_mkey_out)] = {0}; u32 in[MLX5_ST_SZ_DW(destroy_mkey_in)] = {0}; struct xarray *mkeys = &dev->priv.mkey_table; - struct mlx5_core_mkey *deleted_mkey; unsigned long flags;
xa_lock_irqsave(mkeys, flags); - deleted_mkey = __xa_erase(mkeys, mlx5_base_mkey(mkey->key)); + __xa_erase(mkeys, mlx5_base_mkey(mkey->key)); xa_unlock_irqrestore(mkeys, flags); - if (!deleted_mkey) { - mlx5_core_dbg(dev, "failed xarray delete of mkey 0x%x\n", - mlx5_base_mkey(mkey->key)); - return -ENOENT; - }
MLX5_SET(destroy_mkey_in, in, opcode, MLX5_CMD_OP_DESTROY_MKEY); MLX5_SET(destroy_mkey_in, in, mkey_index, mlx5_mkey_to_idx(mkey->key));
From: Thierry Reding treding@nvidia.com
[ Upstream commit fffa6af94894126994a7600c6f6f09b892e89fa9 ]
The gpiod_set_debounce() function takes the debounce time in microseconds. Adjust the switch/case values in the MAX77620 GPIO to use the correct unit.
Signed-off-by: Thierry Reding treding@nvidia.com Link: https://lore.kernel.org/r/20191002122825.3948322-1-thierry.reding@gmail.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-max77620.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpio/gpio-max77620.c b/drivers/gpio/gpio-max77620.c index b7d89e30131e2..06e8caaafa811 100644 --- a/drivers/gpio/gpio-max77620.c +++ b/drivers/gpio/gpio-max77620.c @@ -192,13 +192,13 @@ static int max77620_gpio_set_debounce(struct max77620_gpio *mgpio, case 0: val = MAX77620_CNFG_GPIO_DBNC_None; break; - case 1 ... 8: + case 1000 ... 8000: val = MAX77620_CNFG_GPIO_DBNC_8ms; break; - case 9 ... 16: + case 9000 ... 16000: val = MAX77620_CNFG_GPIO_DBNC_16ms; break; - case 17 ... 32: + case 17000 ... 32000: val = MAX77620_CNFG_GPIO_DBNC_32ms; break; default:
From: Austin Kim austindh.kim@gmail.com
[ Upstream commit dd19c106a36690b47bb1acc68372f2b472b495b8 ]
After 'Initial git repository build' commit, 'mapping_table_ERRHRD' variable has not been used.
So 'mapping_table_ERRHRD' const variable could be removed to mute below warning message:
fs/cifs/netmisc.c:120:40: warning: unused variable 'mapping_table_ERRHRD' [-Wunused-const-variable] static const struct smb_to_posix_error mapping_table_ERRHRD[] = { ^ Signed-off-by: Austin Kim austindh.kim@gmail.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/netmisc.c | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/fs/cifs/netmisc.c b/fs/cifs/netmisc.c index ed92958e842d3..657f409d4de06 100644 --- a/fs/cifs/netmisc.c +++ b/fs/cifs/netmisc.c @@ -117,10 +117,6 @@ static const struct smb_to_posix_error mapping_table_ERRSRV[] = { {0, 0} };
-static const struct smb_to_posix_error mapping_table_ERRHRD[] = { - {0, 0} -}; - /* * Convert a string containing text IPv4 or IPv6 address to binary form. *
From: Vincenzo Frascino vincenzo.frascino@arm.com
[ Upstream commit e0de01aafc3dd7b73308106b056ead2d48391905 ]
The .config file and the generated include/config/auto.conf can end up out of sync after a set of commands since CONFIG_CROSS_COMPILE_COMPAT_VDSO is not updated correctly.
The sequence can be reproduced as follows:
$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- defconfig [...] $ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- menuconfig [set CONFIG_CROSS_COMPILE_COMPAT_VDSO="arm-linux-gnueabihf-"] $ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
Which results in:
arch/arm64/Makefile:62: CROSS_COMPILE_COMPAT not defined or empty, the compat vDSO will not be built
even though the compat vDSO has been built:
$ file arch/arm64/kernel/vdso32/vdso.so arch/arm64/kernel/vdso32/vdso.so: ELF 32-bit LSB pie executable, ARM, EABI5 version 1 (SYSV), dynamically linked, BuildID[sha1]=c67f6c786f2d2d6f86c71f708595594aa25247f6, stripped
A similar case that involves changing the configuration parameter multiple times can be reconducted to the same family of problems.
Remove the use of CONFIG_CROSS_COMPILE_COMPAT_VDSO altogether and instead rely on the cross-compiler prefix coming from the environment via CROSS_COMPILE_COMPAT, much like we do for the rest of the kernel.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Reported-by: Will Deacon will@kernel.org Signed-off-by: Vincenzo Frascino vincenzo.frascino@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/Kconfig | 2 +- arch/arm64/Makefile | 18 +++++------------- arch/arm64/kernel/vdso32/Makefile | 2 -- 3 files changed, 6 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index e8cf562838716..f63b824cdc2d3 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -111,7 +111,7 @@ config ARM64 select GENERIC_STRNLEN_USER select GENERIC_TIME_VSYSCALL select GENERIC_GETTIMEOFDAY - select GENERIC_COMPAT_VDSO if (!CPU_BIG_ENDIAN && COMPAT) + select GENERIC_COMPAT_VDSO if (!CPU_BIG_ENDIAN && COMPAT && "$(CROSS_COMPILE_COMPAT)" != "") select HANDLE_DOMAIN_IRQ select HARDIRQS_SW_RESEND select HAVE_PCI diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile index 61de992bbea3f..9743b50bdee7d 100644 --- a/arch/arm64/Makefile +++ b/arch/arm64/Makefile @@ -47,20 +47,12 @@ $(warning Detected assembler with broken .inst; disassembly will be unreliable) endif endif
+COMPATCC ?= $(CROSS_COMPILE_COMPAT)gcc +export COMPATCC + ifeq ($(CONFIG_GENERIC_COMPAT_VDSO), y) - CROSS_COMPILE_COMPAT ?= $(CONFIG_CROSS_COMPILE_COMPAT_VDSO:"%"=%) - - ifeq ($(CONFIG_CC_IS_CLANG), y) - $(warning CROSS_COMPILE_COMPAT is clang, the compat vDSO will not be built) - else ifeq ($(strip $(CROSS_COMPILE_COMPAT)),) - $(warning CROSS_COMPILE_COMPAT not defined or empty, the compat vDSO will not be built) - else ifeq ($(shell which $(CROSS_COMPILE_COMPAT)gcc 2> /dev/null),) - $(error $(CROSS_COMPILE_COMPAT)gcc not found, check CROSS_COMPILE_COMPAT) - else - export CROSS_COMPILE_COMPAT - export CONFIG_COMPAT_VDSO := y - compat_vdso := -DCONFIG_COMPAT_VDSO=1 - endif + export CONFIG_COMPAT_VDSO := y + compat_vdso := -DCONFIG_COMPAT_VDSO=1 endif
KBUILD_CFLAGS += -mgeneral-regs-only $(lseinstr) $(brokengasinst) $(compat_vdso) diff --git a/arch/arm64/kernel/vdso32/Makefile b/arch/arm64/kernel/vdso32/Makefile index 1fba0776ed40e..19e0d3115ffe0 100644 --- a/arch/arm64/kernel/vdso32/Makefile +++ b/arch/arm64/kernel/vdso32/Makefile @@ -8,8 +8,6 @@ ARCH_REL_TYPE_ABS := R_ARM_JUMP_SLOT|R_ARM_GLOB_DAT|R_ARM_ABS32 include $(srctree)/lib/vdso/Makefile
-COMPATCC := $(CROSS_COMPILE_COMPAT)gcc - # Same as cc-*option, but using COMPATCC instead of CC cc32-option = $(call try-run,\ $(COMPATCC) $(1) -c -x c /dev/null -o "$$TMP",$(1),$(2))
From: Vincenzo Frascino vincenzo.frascino@arm.com
[ Upstream commit 0df2c90eba60791148cee1823c0bf5fc66e3465c ]
Older versions of binutils (prior to 2.24) do not support the "ISHLD" option for memory barrier instructions, which leads to a build failure when assembling the vdso32 library.
Add a compilation time mechanism that detects if binutils supports those instructions and configure the kernel accordingly.
Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Reported-by: Will Deacon will@kernel.org Signed-off-by: Vincenzo Frascino vincenzo.frascino@arm.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Tested-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/vdso/compat_barrier.h | 2 +- arch/arm64/kernel/vdso32/Makefile | 9 +++++++++ 2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/vdso/compat_barrier.h b/arch/arm64/include/asm/vdso/compat_barrier.h index fb60a88b5ed41..3fd8fd6d8fc25 100644 --- a/arch/arm64/include/asm/vdso/compat_barrier.h +++ b/arch/arm64/include/asm/vdso/compat_barrier.h @@ -20,7 +20,7 @@
#define dmb(option) __asm__ __volatile__ ("dmb " #option : : : "memory")
-#if __LINUX_ARM_ARCH__ >= 8 +#if __LINUX_ARM_ARCH__ >= 8 && defined(CONFIG_AS_DMB_ISHLD) #define aarch32_smp_mb() dmb(ish) #define aarch32_smp_rmb() dmb(ishld) #define aarch32_smp_wmb() dmb(ishst) diff --git a/arch/arm64/kernel/vdso32/Makefile b/arch/arm64/kernel/vdso32/Makefile index 19e0d3115ffe0..77aa613403747 100644 --- a/arch/arm64/kernel/vdso32/Makefile +++ b/arch/arm64/kernel/vdso32/Makefile @@ -15,6 +15,8 @@ cc32-disable-warning = $(call try-run,\ $(COMPATCC) -W$(strip $(1)) -c -x c /dev/null -o "$$TMP",-Wno-$(strip $(1))) cc32-ldoption = $(call try-run,\ $(COMPATCC) $(1) -nostdlib -x c /dev/null -o "$$TMP",$(1),$(2)) +cc32-as-instr = $(call try-run,\ + printf "%b\n" "$(1)" | $(COMPATCC) $(VDSO_AFLAGS) -c -x assembler -o "$$TMP" -,$(2),$(3))
# We cannot use the global flags to compile the vDSO files, the main reason # being that the 32-bit compiler may be older than the main (64-bit) compiler @@ -53,6 +55,7 @@ endif VDSO_CAFLAGS += -fPIC -fno-builtin -fno-stack-protector VDSO_CAFLAGS += -DDISABLE_BRANCH_PROFILING
+ # Try to compile for ARMv8. If the compiler is too old and doesn't support it, # fall back to v7. There is no easy way to check for what architecture the code # is being compiled, so define a macro specifying that (see arch/arm/Makefile). @@ -89,6 +92,12 @@ VDSO_CFLAGS += -Wno-int-to-pointer-cast VDSO_AFLAGS := $(VDSO_CAFLAGS) VDSO_AFLAGS += -D__ASSEMBLY__
+# Check for binutils support for dmb ishld +dmbinstr := $(call cc32-as-instr,dmb ishld,-DCONFIG_AS_DMB_ISHLD=1) + +VDSO_CFLAGS += $(dmbinstr) +VDSO_AFLAGS += $(dmbinstr) + VDSO_LDFLAGS := $(VDSO_CPPFLAGS) # From arm vDSO Makefile VDSO_LDFLAGS += -Wl,-Bsymbolic -Wl,--no-undefined -Wl,-soname=linux-vdso.so.1
From: Adam Ford aford173@gmail.com
[ Upstream commit 37e3ab00e4734acc15d96b2926aab55c894f4d9c ]
When using mctrl_gpio_to_gpiod, it dereferences gpios into a single requested GPIO. This dereferencing can break if gpios is NULL, so this patch adds a NULL check before dereferencing it. If gpios is NULL, this function will also return NULL.
Signed-off-by: Adam Ford aford173@gmail.com Reviewed-by: Yegor Yefremov yegorslists@googlemail.com Link: https://lore.kernel.org/r/20191006163314.23191-1-aford173@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/serial_mctrl_gpio.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/tty/serial/serial_mctrl_gpio.c b/drivers/tty/serial/serial_mctrl_gpio.c index 2b400189be911..54c43e02e3755 100644 --- a/drivers/tty/serial/serial_mctrl_gpio.c +++ b/drivers/tty/serial/serial_mctrl_gpio.c @@ -61,6 +61,9 @@ EXPORT_SYMBOL_GPL(mctrl_gpio_set); struct gpio_desc *mctrl_gpio_to_gpiod(struct mctrl_gpios *gpios, enum mctrl_gpio_idx gidx) { + if (gpios == NULL) + return NULL; + return gpios->gpio[gidx]; } EXPORT_SYMBOL_GPL(mctrl_gpio_to_gpiod);
From: Adam Ford aford173@gmail.com
[ Upstream commit fc64f7abbef2dae7ee4c94702fb3cf9a2be5431a ]
There are two checks to see if the manual gpio is configured, but these the check is seeing if the structure is NULL instead it should check to see if there are CTS and/or RTS pins defined.
This patch uses checks for those individual pins instead of checking for the structure itself to restore auto RTS/CTS.
Signed-off-by: Adam Ford aford173@gmail.com Reviewed-by: Yegor Yefremov yegorslists@googlemail.com Link: https://lore.kernel.org/r/20191006163314.23191-2-aford173@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/8250/8250_omap.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c index 3ef65cbd2478a..e4b08077f8757 100644 --- a/drivers/tty/serial/8250/8250_omap.c +++ b/drivers/tty/serial/8250/8250_omap.c @@ -141,7 +141,7 @@ static void omap8250_set_mctrl(struct uart_port *port, unsigned int mctrl)
serial8250_do_set_mctrl(port, mctrl);
- if (!up->gpios) { + if (!mctrl_gpio_to_gpiod(up->gpios, UART_GPIO_RTS)) { /* * Turn off autoRTS if RTS is lowered and restore autoRTS * setting if RTS is raised @@ -456,7 +456,8 @@ static void omap_8250_set_termios(struct uart_port *port, up->port.status &= ~(UPSTAT_AUTOCTS | UPSTAT_AUTORTS | UPSTAT_AUTOXOFF);
if (termios->c_cflag & CRTSCTS && up->port.flags & UPF_HARD_FLOW && - !up->gpios) { + !mctrl_gpio_to_gpiod(up->gpios, UART_GPIO_RTS) && + !mctrl_gpio_to_gpiod(up->gpios, UART_GPIO_CTS)) { /* Enable AUTOCTS (autoRTS is enabled when RTS is raised) */ up->port.status |= UPSTAT_AUTOCTS | UPSTAT_AUTORTS; priv->efr |= UART_EFR_CTS;
From: Will Deacon will@kernel.org
[ Upstream commit 24ee01a927bfe56c66429ec4b1df6955a814adc8 ]
Rather than force the use of GCC for the compat cross-compiler, instead extract the target from CROSS_COMPILE_COMPAT and pass it to clang if the main compiler is clang.
Acked-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/Makefile | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile index 9743b50bdee7d..5858d6e449268 100644 --- a/arch/arm64/Makefile +++ b/arch/arm64/Makefile @@ -47,7 +47,11 @@ $(warning Detected assembler with broken .inst; disassembly will be unreliable) endif endif
+ifeq ($(CONFIG_CC_IS_CLANG), y) +COMPATCC ?= $(CC) --target=$(notdir $(CROSS_COMPILE_COMPAT:%-=%)) +else COMPATCC ?= $(CROSS_COMPILE_COMPAT)gcc +endif export COMPATCC
ifeq ($(CONFIG_GENERIC_COMPAT_VDSO), y)
From: Will Deacon will@kernel.org
[ Upstream commit c71e88c437962c1ec43d4d23a0ebf4c9cf9bee0d ]
KBUILD_CPPFLAGS is defined differently depending on whether the main compiler is clang or not. This means that it is not possible to build the compat vDSO with GCC if the rest of the kernel is built with clang.
Define VDSO_CPPFLAGS directly to break this dependency and allow a clang kernel to build a compat vDSO with GCC:
$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- \ CROSS_COMPILE_COMPAT=arm-linux-gnueabihf- CC=clang \ COMPATCC=arm-linux-gnueabihf-gcc
Acked-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/vdso32/Makefile | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kernel/vdso32/Makefile b/arch/arm64/kernel/vdso32/Makefile index 77aa613403747..aa171b043287b 100644 --- a/arch/arm64/kernel/vdso32/Makefile +++ b/arch/arm64/kernel/vdso32/Makefile @@ -25,11 +25,9 @@ cc32-as-instr = $(call try-run,\ # arm64 one. # As a result we set our own flags here.
-# From top-level Makefile -# NOSTDINC_FLAGS -VDSO_CPPFLAGS := -nostdinc -isystem $(shell $(COMPATCC) -print-file-name=include) +# KBUILD_CPPFLAGS and NOSTDINC_FLAGS from top-level Makefile +VDSO_CPPFLAGS := -D__KERNEL__ -nostdinc -isystem $(shell $(COMPATCC) -print-file-name=include) VDSO_CPPFLAGS += $(LINUXINCLUDE) -VDSO_CPPFLAGS += $(KBUILD_CPPFLAGS)
# Common C and assembly flags # From top-level Makefile
From: Lukas Wunner lukas@wunner.de
[ Upstream commit 6fb9367a15d1a126d222d738b2702c7958594a5f ]
The CPER parser assumes that the class code is big endian, but at least on this edk2-derived Intel Purley platform it's little endian:
efi: EFI v2.50 by EDK II BIOS ID:PLYDCRB1.86B.0119.R05.1701181843 DMI: Intel Corporation PURLEY/PURLEY, BIOS PLYDCRB1.86B.0119.R05.1701181843 01/18/2017
{1}[Hardware Error]: device_id: 0000:5d:00.0 {1}[Hardware Error]: slot: 0 {1}[Hardware Error]: secondary_bus: 0x5e {1}[Hardware Error]: vendor_id: 0x8086, device_id: 0x2030 {1}[Hardware Error]: class_code: 000406 ^^^^^^ (should be 060400)
Signed-off-by: Lukas Wunner lukas@wunner.de Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Cc: Ben Dooks ben.dooks@codethink.co.uk Cc: Dave Young dyoung@redhat.com Cc: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Cc: Jerry Snitselaar jsnitsel@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Lyude Paul lyude@redhat.com Cc: Matthew Garrett mjg59@google.com Cc: Octavian Purdila octavian.purdila@intel.com Cc: Peter Jones pjones@redhat.com Cc: Peter Zijlstra peterz@infradead.org Cc: Scott Talbert swt@techie.net Cc: Thomas Gleixner tglx@linutronix.de Cc: linux-efi@vger.kernel.org Cc: linux-integrity@vger.kernel.org Link: https://lkml.kernel.org/r/20191002165904.8819-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/efi/cper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c index addf0749dd8b6..b1af0de2e1008 100644 --- a/drivers/firmware/efi/cper.c +++ b/drivers/firmware/efi/cper.c @@ -381,7 +381,7 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie, printk("%s""vendor_id: 0x%04x, device_id: 0x%04x\n", pfx, pcie->device_id.vendor_id, pcie->device_id.device_id); p = pcie->device_id.class_code; - printk("%s""class_code: %02x%02x%02x\n", pfx, p[0], p[1], p[2]); + printk("%s""class_code: %02x%02x%02x\n", pfx, p[2], p[1], p[0]); } if (pcie->validation_bits & CPER_PCIE_VALID_SERIAL_NUMBER) printk("%s""serial number: 0x%04x, 0x%04x\n", pfx,
From: Dave Young dyoung@redhat.com
[ Upstream commit 2ecb7402cfc7f22764e7bbc80790e66eadb20560 ]
kexec reboot fails randomly in UEFI based KVM guest. The firmware just resets while calling efi_delete_dummy_variable(); Unfortunately I don't know how to debug the firmware, it is also possible a potential problem on real hardware as well although nobody reproduced it.
The intention of the efi_delete_dummy_variable is to trigger garbage collection when entering virtual mode. But SetVirtualAddressMap can only run once for each physical reboot, thus kexec_enter_virtual_mode() is not necessarily a good place to clean a dummy object.
Drop the efi_delete_dummy_variable so that kexec reboot can work.
Signed-off-by: Dave Young dyoung@redhat.com Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org Acked-by: Matthew Garrett mjg59@google.com Cc: Ben Dooks ben.dooks@codethink.co.uk Cc: Jarkko Sakkinen jarkko.sakkinen@linux.intel.com Cc: Jerry Snitselaar jsnitsel@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Lukas Wunner lukas@wunner.de Cc: Lyude Paul lyude@redhat.com Cc: Octavian Purdila octavian.purdila@intel.com Cc: Peter Jones pjones@redhat.com Cc: Peter Zijlstra peterz@infradead.org Cc: Scott Talbert swt@techie.net Cc: Thomas Gleixner tglx@linutronix.de Cc: linux-efi@vger.kernel.org Cc: linux-integrity@vger.kernel.org Link: https://lkml.kernel.org/r/20191002165904.8819-8-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/platform/efi/efi.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c index a7189a3b4d70f..3304f61538a26 100644 --- a/arch/x86/platform/efi/efi.c +++ b/arch/x86/platform/efi/efi.c @@ -894,9 +894,6 @@ static void __init kexec_enter_virtual_mode(void)
if (efi_enabled(EFI_OLD_MEMMAP) && (__supported_pte_mask & _PAGE_NX)) runtime_code_page_mkexec(); - - /* clean DUMMY object */ - efi_delete_dummy_variable(); #endif }
From: Thomas Bogendoerfer tbogendoerfer@suse.de
[ Upstream commit 88356d09904bc606182c625575237269aeece22e ]
Commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly") allows compiler to uninline functions marked as 'inline'. In cace of cmpxchg this would cause to reference function __cmpxchg_called_with_bad_pointer, which is a error case for catching bugs and will not happen for correct code, if __cmpxchg is inlined.
Signed-off-by: Thomas Bogendoerfer tbogendoerfer@suse.de [paul.burton@mips.com: s/__cmpxchd/__cmpxchg in subject] Signed-off-by: Paul Burton paul.burton@mips.com Cc: Ralf Baechle ralf@linux-mips.org Cc: James Hogan jhogan@kernel.org Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/include/asm/cmpxchg.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/mips/include/asm/cmpxchg.h b/arch/mips/include/asm/cmpxchg.h index c8a47d18f6288..319522fa3a45e 100644 --- a/arch/mips/include/asm/cmpxchg.h +++ b/arch/mips/include/asm/cmpxchg.h @@ -153,8 +153,9 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x, extern unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old, unsigned long new, unsigned int size);
-static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old, - unsigned long new, unsigned int size) +static __always_inline +unsigned long __cmpxchg(volatile void *ptr, unsigned long old, + unsigned long new, unsigned int size) { switch (size) { case 1:
From: Vincent Chen vincent.chen@sifive.com
[ Upstream commit 8b04825ed205da38754f86f4c07ea8600d8c2a65 ]
When the CONFIG_GENERIC_BUG is disabled by disabling CONFIG_BUG, if a kernel thread is trapped by BUG(), the whole system will be in the loop that infinitely handles the ebreak exception instead of entering the die function. To fix this problem, the do_trap_break() will always call the die() to deal with the break exception as the type of break is BUG_TRAP_TYPE_BUG.
Signed-off-by: Vincent Chen vincent.chen@sifive.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Paul Walmsley paul.walmsley@sifive.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/traps.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 424eb72d56b10..055a937aca70a 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -124,23 +124,23 @@ static inline unsigned long get_break_insn_length(unsigned long pc)
asmlinkage void do_trap_break(struct pt_regs *regs) { -#ifdef CONFIG_GENERIC_BUG if (!user_mode(regs)) { enum bug_trap_type type;
type = report_bug(regs->sepc, regs); switch (type) { +#ifdef CONFIG_GENERIC_BUG case BUG_TRAP_TYPE_NONE: break; case BUG_TRAP_TYPE_WARN: regs->sepc += get_break_insn_length(regs->sepc); break; case BUG_TRAP_TYPE_BUG: +#endif /* CONFIG_GENERIC_BUG */ + default: die(regs, "Kernel BUG"); } } -#endif /* CONFIG_GENERIC_BUG */ - force_sig_fault(SIGTRAP, TRAP_BRKPT, (void __user *)(regs->sepc)); }
From: Vincent Chen vincent.chen@sifive.com
[ Upstream commit e0c0fc18f10d5080cddde0e81505fd3e952c20c4 ]
On RISC-V, when the kernel runs code on behalf of a user thread, and the kernel executes a WARN() or WARN_ON(), the user thread will be sent a bogus SIGTRAP. Fix the RISC-V kernel code to not send a SIGTRAP when a WARN()/WARN_ON() is executed.
Signed-off-by: Vincent Chen vincent.chen@sifive.com Reviewed-by: Christoph Hellwig hch@lst.de [paul.walmsley@sifive.com: fixed subject] Signed-off-by: Paul Walmsley paul.walmsley@sifive.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/traps.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 055a937aca70a..82f42a55451eb 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -134,7 +134,7 @@ asmlinkage void do_trap_break(struct pt_regs *regs) break; case BUG_TRAP_TYPE_WARN: regs->sepc += get_break_insn_length(regs->sepc); - break; + return; case BUG_TRAP_TYPE_BUG: #endif /* CONFIG_GENERIC_BUG */ default:
From: Vincent Chen vincent.chen@sifive.com
[ Upstream commit 8bb0daef64e5a92db63ad1d3bbf9e280a7b3612a ]
For the kernel space, all ebreak instructions are determined at compile time because the kernel space debugging module is currently unsupported. Hence, it should be treated as a bug if an ebreak instruction which does not belong to BUG_TRAP_TYPE_WARN or BUG_TRAP_TYPE_BUG is executed in kernel space. For the userspace, debugging module or user problem may intentionally insert an ebreak instruction to trigger a SIGTRAP signal. To approach the above two situations, the do_trap_break() will direct the BUG_TRAP_TYPE_NONE ebreak exception issued in kernel space to die() and will send a SIGTRAP to the trapped process only when the ebreak is in userspace.
Signed-off-by: Vincent Chen vincent.chen@sifive.com Reviewed-by: Christoph Hellwig hch@lst.de [paul.walmsley@sifive.com: fixed checkpatch issue] Signed-off-by: Paul Walmsley paul.walmsley@sifive.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/traps.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 82f42a55451eb..93742df9067fb 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -130,8 +130,6 @@ asmlinkage void do_trap_break(struct pt_regs *regs) type = report_bug(regs->sepc, regs); switch (type) { #ifdef CONFIG_GENERIC_BUG - case BUG_TRAP_TYPE_NONE: - break; case BUG_TRAP_TYPE_WARN: regs->sepc += get_break_insn_length(regs->sepc); return; @@ -140,8 +138,10 @@ asmlinkage void do_trap_break(struct pt_regs *regs) default: die(regs, "Kernel BUG"); } + } else { + force_sig_fault(SIGTRAP, TRAP_BRKPT, + (void __user *)(regs->sepc)); } - force_sig_fault(SIGTRAP, TRAP_BRKPT, (void __user *)(regs->sepc)); }
#ifdef CONFIG_GENERIC_BUG
From: Boris Ostrovsky boris.ostrovsky@oracle.com
[ Upstream commit c6875f3aacf2a5a913205accddabf0bfb75cac76 ]
Currently execution of panic() continues until Xen's panic notifier (xen_panic_event()) is called at which point we make a hypercall that never returns.
This means that any notifier that is supposed to be called later as well as significant part of panic() code (such as pstore writes from kmsg_dump()) is never executed.
There is no reason for xen_panic_event() to be this last point in execution since panic()'s emergency_restart() will call into xen_emergency_restart() from where we can perform our hypercall.
Nevertheless, we will provide xen_legacy_crash boot option that will preserve original behavior during crash. This option could be used, for example, if running kernel dumper (which happens after panic notifiers) is undesirable.
Reported-by: James Dingwall james@dingwall.me.uk Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../admin-guide/kernel-parameters.txt | 4 +++ arch/x86/xen/enlighten.c | 28 +++++++++++++++++-- 2 files changed, 29 insertions(+), 3 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 4c1971960afa3..5ea005c9e2d60 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5267,6 +5267,10 @@ the unplug protocol never -- do not unplug even if version check succeeds
+ xen_legacy_crash [X86,XEN] + Crash from Xen panic notifier, without executing late + panic() code such as dumping handler. + xen_nopvspin [X86,XEN] Disables the ticketlock slowpath using Xen PV optimizations. diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 750f46ad018a0..205b1176084f5 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -269,19 +269,41 @@ void xen_reboot(int reason) BUG(); }
+static int reboot_reason = SHUTDOWN_reboot; +static bool xen_legacy_crash; void xen_emergency_restart(void) { - xen_reboot(SHUTDOWN_reboot); + xen_reboot(reboot_reason); }
static int xen_panic_event(struct notifier_block *this, unsigned long event, void *ptr) { - if (!kexec_crash_loaded()) - xen_reboot(SHUTDOWN_crash); + if (!kexec_crash_loaded()) { + if (xen_legacy_crash) + xen_reboot(SHUTDOWN_crash); + + reboot_reason = SHUTDOWN_crash; + + /* + * If panic_timeout==0 then we are supposed to wait forever. + * However, to preserve original dom0 behavior we have to drop + * into hypervisor. (domU behavior is controlled by its + * config file) + */ + if (panic_timeout == 0) + panic_timeout = -1; + } return NOTIFY_DONE; }
+static int __init parse_xen_legacy_crash(char *arg) +{ + xen_legacy_crash = true; + return 0; +} +early_param("xen_legacy_crash", parse_xen_legacy_crash); + static struct notifier_block xen_panic_block = { .notifier_call = xen_panic_event, .priority = INT_MIN
From: Jia Guo guojia12@huawei.com
[ Upstream commit 7a243c82ea527cd1da47381ad9cd646844f3b693 ]
Unused portion of a part-written fs-block-sized block is not set to zero in unaligned append direct write.This can lead to serious data inconsistencies.
Ocfs2 manage disk with cluster size(for example, 1M), part-written in one cluster will change the cluster state from UN-WRITTEN to WRITTEN, VFS(function dio_zero_block) doesn't do the cleaning because bh's state is not set to NEW in function ocfs2_dio_wr_get_block when we write a WRITTEN cluster. For example, the cluster size is 1M, file size is 8k and we direct write from 14k to 15k, then 12k~14k and 15k~16k will contain dirty data.
We have to deal with two cases: 1.The starting position of direct write is outside the file. 2.The starting position of direct write is located in the file.
We need set bh's state to NEW in the first case. In the second case, we need mapped twice because bh's state of area out file should be set to NEW while area in file not.
[akpm@linux-foundation.org: coding style fixes] Link: http://lkml.kernel.org/r/5292e287-8f1a-fd4a-1a14-661e555e0bed@huawei.com Signed-off-by: Jia Guo guojia12@huawei.com Reviewed-by: Yiwen Jiang jiangyiwen@huawei.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Joseph Qi joseph.qi@huawei.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ocfs2/aops.c | 22 +++++++++++++++++++++- 1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c index a4c905d6b5755..3e0a93e799ea1 100644 --- a/fs/ocfs2/aops.c +++ b/fs/ocfs2/aops.c @@ -2139,13 +2139,30 @@ static int ocfs2_dio_wr_get_block(struct inode *inode, sector_t iblock, struct ocfs2_dio_write_ctxt *dwc = NULL; struct buffer_head *di_bh = NULL; u64 p_blkno; - loff_t pos = iblock << inode->i_sb->s_blocksize_bits; + unsigned int i_blkbits = inode->i_sb->s_blocksize_bits; + loff_t pos = iblock << i_blkbits; + sector_t endblk = (i_size_read(inode) - 1) >> i_blkbits; unsigned len, total_len = bh_result->b_size; int ret = 0, first_get_block = 0;
len = osb->s_clustersize - (pos & (osb->s_clustersize - 1)); len = min(total_len, len);
+ /* + * bh_result->b_size is count in get_more_blocks according to write + * "pos" and "end", we need map twice to return different buffer state: + * 1. area in file size, not set NEW; + * 2. area out file size, set NEW. + * + * iblock endblk + * |--------|---------|---------|--------- + * |<-------area in file------->| + */ + + if ((iblock <= endblk) && + ((iblock + ((len - 1) >> i_blkbits)) > endblk)) + len = (endblk - iblock + 1) << i_blkbits; + mlog(0, "get block of %lu at %llu:%u req %u\n", inode->i_ino, pos, len, total_len);
@@ -2229,6 +2246,9 @@ static int ocfs2_dio_wr_get_block(struct inode *inode, sector_t iblock, if (desc->c_needs_zero) set_buffer_new(bh_result);
+ if (iblock > endblk) + set_buffer_new(bh_result); + /* May sleep in end_io. It should not happen in a irq context. So defer * it to dio work queue. */ set_buffer_defer_completion(bh_result);
From: Jia-Ju Bai baijiaju1990@gmail.com
[ Upstream commit 56e94ea132bb5c2c1d0b60a6aeb34dcb7d71a53d ]
In ocfs2_xa_prepare_entry(), there is an if statement on line 2136 to check whether loc->xl_entry is NULL:
if (loc->xl_entry)
When loc->xl_entry is NULL, it is used on line 2158:
ocfs2_xa_add_entry(loc, name_hash); loc->xl_entry->xe_name_hash = cpu_to_le32(name_hash); loc->xl_entry->xe_name_offset = cpu_to_le16(loc->xl_size);
and line 2164:
ocfs2_xa_add_namevalue(loc, xi); loc->xl_entry->xe_value_size = cpu_to_le64(xi->xi_value_len); loc->xl_entry->xe_name_len = xi->xi_name_len;
Thus, possible null-pointer dereferences may occur.
To fix these bugs, if loc-xl_entry is NULL, ocfs2_xa_prepare_entry() abnormally returns with -EINVAL.
These bugs are found by a static analysis tool STCheck written by us.
[akpm@linux-foundation.org: remove now-unused ocfs2_xa_add_entry()] Link: http://lkml.kernel.org/r/20190726101447.9153-1-baijiaju1990@gmail.com Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Jun Piao piaojun@huawei.com Cc: Stephen Rothwell sfr@canb.auug.org.au Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ocfs2/xattr.c | 56 ++++++++++++++++++++---------------------------- 1 file changed, 23 insertions(+), 33 deletions(-)
diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c index 90c830e3758e2..d8507972ee135 100644 --- a/fs/ocfs2/xattr.c +++ b/fs/ocfs2/xattr.c @@ -1490,18 +1490,6 @@ static int ocfs2_xa_check_space(struct ocfs2_xa_loc *loc, return loc->xl_ops->xlo_check_space(loc, xi); }
-static void ocfs2_xa_add_entry(struct ocfs2_xa_loc *loc, u32 name_hash) -{ - loc->xl_ops->xlo_add_entry(loc, name_hash); - loc->xl_entry->xe_name_hash = cpu_to_le32(name_hash); - /* - * We can't leave the new entry's xe_name_offset at zero or - * add_namevalue() will go nuts. We set it to the size of our - * storage so that it can never be less than any other entry. - */ - loc->xl_entry->xe_name_offset = cpu_to_le16(loc->xl_size); -} - static void ocfs2_xa_add_namevalue(struct ocfs2_xa_loc *loc, struct ocfs2_xattr_info *xi) { @@ -2133,29 +2121,31 @@ static int ocfs2_xa_prepare_entry(struct ocfs2_xa_loc *loc, if (rc) goto out;
- if (loc->xl_entry) { - if (ocfs2_xa_can_reuse_entry(loc, xi)) { - orig_value_size = loc->xl_entry->xe_value_size; - rc = ocfs2_xa_reuse_entry(loc, xi, ctxt); - if (rc) - goto out; - goto alloc_value; - } + if (!loc->xl_entry) { + rc = -EINVAL; + goto out; + }
- if (!ocfs2_xattr_is_local(loc->xl_entry)) { - orig_clusters = ocfs2_xa_value_clusters(loc); - rc = ocfs2_xa_value_truncate(loc, 0, ctxt); - if (rc) { - mlog_errno(rc); - ocfs2_xa_cleanup_value_truncate(loc, - "overwriting", - orig_clusters); - goto out; - } + if (ocfs2_xa_can_reuse_entry(loc, xi)) { + orig_value_size = loc->xl_entry->xe_value_size; + rc = ocfs2_xa_reuse_entry(loc, xi, ctxt); + if (rc) + goto out; + goto alloc_value; + } + + if (!ocfs2_xattr_is_local(loc->xl_entry)) { + orig_clusters = ocfs2_xa_value_clusters(loc); + rc = ocfs2_xa_value_truncate(loc, 0, ctxt); + if (rc) { + mlog_errno(rc); + ocfs2_xa_cleanup_value_truncate(loc, + "overwriting", + orig_clusters); + goto out; } - ocfs2_xa_wipe_namevalue(loc); - } else - ocfs2_xa_add_entry(loc, name_hash); + } + ocfs2_xa_wipe_namevalue(loc);
/* * If we get here, we have a blank entry. Fill it. We grow our
From: Jia-Ju Bai baijiaju1990@gmail.com
[ Upstream commit 583fee3e12df0e6f1f66f063b989d8e7fed0e65a ]
In ocfs2_write_end_nolock(), there are an if statement on lines 1976, 2047 and 2058, to check whether handle is NULL:
if (handle)
When handle is NULL, it is used on line 2045:
ocfs2_update_inode_fsync_trans(handle, inode, 1); oi->i_sync_tid = handle->h_transaction->t_tid;
Thus, a possible null-pointer dereference may occur.
To fix this bug, handle is checked before calling ocfs2_update_inode_fsync_trans().
This bug is found by a static analysis tool STCheck written by us.
Link: http://lkml.kernel.org/r/20190726033705.32307-1-baijiaju1990@gmail.com Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Jun Piao piaojun@huawei.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ocfs2/aops.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c index 3e0a93e799ea1..9b827143a3504 100644 --- a/fs/ocfs2/aops.c +++ b/fs/ocfs2/aops.c @@ -2042,7 +2042,8 @@ out_write_size: inode->i_mtime = inode->i_ctime = current_time(inode); di->i_mtime = di->i_ctime = cpu_to_le64(inode->i_mtime.tv_sec); di->i_mtime_nsec = di->i_ctime_nsec = cpu_to_le32(inode->i_mtime.tv_nsec); - ocfs2_update_inode_fsync_trans(handle, inode, 1); + if (handle) + ocfs2_update_inode_fsync_trans(handle, inode, 1); } if (handle) ocfs2_journal_dirty(handle, wc->w_di_bh);
From: Jia-Ju Bai baijiaju1990@gmail.com
[ Upstream commit 2abb7d3b12d007c30193f48bebed781009bebdd2 ]
In ocfs2_info_scan_inode_alloc(), there is an if statement on line 283 to check whether inode_alloc is NULL:
if (inode_alloc)
When inode_alloc is NULL, it is used on line 287:
ocfs2_inode_lock(inode_alloc, &bh, 0); ocfs2_inode_lock_full_nested(inode, ...) struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
Thus, a possible null-pointer dereference may occur.
To fix this bug, inode_alloc is checked on line 286.
This bug is found by a static analysis tool STCheck written by us.
Link: http://lkml.kernel.org/r/20190726033717.32359-1-baijiaju1990@gmail.com Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Gang He ghe@suse.com Cc: Jun Piao piaojun@huawei.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ocfs2/ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ocfs2/ioctl.c b/fs/ocfs2/ioctl.c index d6f7b299eb236..efeea208fdebd 100644 --- a/fs/ocfs2/ioctl.c +++ b/fs/ocfs2/ioctl.c @@ -283,7 +283,7 @@ static int ocfs2_info_scan_inode_alloc(struct ocfs2_super *osb, if (inode_alloc) inode_lock(inode_alloc);
- if (o2info_coherent(&fi->ifi_req)) { + if (inode_alloc && o2info_coherent(&fi->ifi_req)) { status = ocfs2_inode_lock(inode_alloc, &bh, 0); if (status < 0) { mlog_errno(status);
From: Austin Kim austindh.kim@gmail.com
[ Upstream commit 431d39887d6273d6d84edf3c2eab09f4200e788a ]
GCC throws warning message as below:
‘clone_src_i_size’ may be used uninitialized in this function [-Wmaybe-uninitialized] #define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0) ^ fs/btrfs/send.c:5088:6: note: ‘clone_src_i_size’ was declared here u64 clone_src_i_size; ^ The clone_src_i_size is only used as call-by-reference in a call to get_inode_info().
Silence the warning by initializing clone_src_i_size to 0.
Note that the warning is a false positive and reported by older versions of GCC (eg. 7.x) but not eg 9.x. As there have been numerous people, the patch is applied. Setting clone_src_i_size to 0 does not otherwise make sense and would not do any action in case the code changes in the future.
Signed-off-by: Austin Kim austindh.kim@gmail.com Reviewed-by: David Sterba dsterba@suse.com [ add note ] Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/send.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index c3c0c064c25da..91c702b4cae9d 100644 --- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -5070,7 +5070,7 @@ static int clone_range(struct send_ctx *sctx, struct btrfs_path *path; struct btrfs_key key; int ret; - u64 clone_src_i_size; + u64 clone_src_i_size = 0;
/* * Prevent cloning from a zero offset with a length matching the sector
From: Yunfeng Ye yeyunfeng@huawei.com
[ Upstream commit 3e7c93bd04edfb0cae7dad1215544c9350254b8f ]
There are no return value checking when using kzalloc() and kcalloc() for memory allocation. so add it.
Signed-off-by: Yunfeng Ye yeyunfeng@huawei.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/armv8_deprecated.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c index 2ec09debc2bb1..ca158be21f833 100644 --- a/arch/arm64/kernel/armv8_deprecated.c +++ b/arch/arm64/kernel/armv8_deprecated.c @@ -174,6 +174,9 @@ static void __init register_insn_emulation(struct insn_emulation_ops *ops) struct insn_emulation *insn;
insn = kzalloc(sizeof(*insn), GFP_KERNEL); + if (!insn) + return; + insn->ops = ops; insn->min = INSN_UNDEF;
@@ -233,6 +236,8 @@ static void __init register_insn_emulation_sysctl(void)
insns_sysctl = kcalloc(nr_insn_emulated + 1, sizeof(*sysctl), GFP_KERNEL); + if (!insns_sysctl) + return;
raw_spin_lock_irqsave(&insn_emulation_lock, flags); list_for_each_entry(insn, &insn_emulation, node) {
From: Kan Liang kan.liang@linux.intel.com
[ Upstream commit 8d7c6ac3b2371eb1cbc9925a88f4d10efff374de ]
Comet Lake is the new 10th Gen Intel processor. Add two new CPU model numbers to the Intel family list.
The CPU model numbers are not published in the SDM yet but they come from an authoritative internal source.
[ bp: Touch up commit message. ]
Signed-off-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Tony Luck tony.luck@intel.com Cc: ak@linux.intel.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: Ingo Molnar mingo@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: x86-ml x86@kernel.org Link: https://lkml.kernel.org/r/1570549810-25049-2-git-send-email-kan.liang@linux.... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/intel-family.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h index 9ae1c0f05fd2d..3525014c71da9 100644 --- a/arch/x86/include/asm/intel-family.h +++ b/arch/x86/include/asm/intel-family.h @@ -76,6 +76,9 @@ #define INTEL_FAM6_TIGERLAKE_L 0x8C #define INTEL_FAM6_TIGERLAKE 0x8D
+#define INTEL_FAM6_COMETLAKE 0xA5 +#define INTEL_FAM6_COMETLAKE_L 0xA6 + /* "Small Core" Processors (Atom) */
#define INTEL_FAM6_ATOM_BONNELL 0x1C /* Diamondville, Pineview */
From: Xuewei Zhang xueweiz@google.com
[ Upstream commit 4929a4e6faa0f13289a67cae98139e727f0d4a97 ]
The quota/period ratio is used to ensure a child task group won't get more bandwidth than the parent task group, and is calculated as:
normalized_cfs_quota() = [(quota_us << 20) / period_us]
If the quota/period ratio was changed during this scaling due to precision loss, it will cause inconsistency between parent and child task groups.
See below example:
A userspace container manager (kubelet) does three operations:
1) Create a parent cgroup, set quota to 1,000us and period to 10,000us. 2) Create a few children cgroups. 3) Set quota to 1,000us and period to 10,000us on a child cgroup.
These operations are expected to succeed. However, if the scaling of 147/128 happens before step 3, quota and period of the parent cgroup will be changed:
new_quota: 1148437ns, 1148us new_period: 11484375ns, 11484us
And when step 3 comes in, the ratio of the child cgroup will be 104857, which will be larger than the parent cgroup ratio (104821), and will fail.
Scaling them by a factor of 2 will fix the problem.
Tested-by: Phil Auld pauld@redhat.com Signed-off-by: Xuewei Zhang xueweiz@google.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Phil Auld pauld@redhat.com Cc: Anton Blanchard anton@ozlabs.org Cc: Ben Segall bsegall@google.com Cc: Dietmar Eggemann dietmar.eggemann@arm.com Cc: Juri Lelli juri.lelli@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Mel Gorman mgorman@suse.de Cc: Peter Zijlstra peterz@infradead.org Cc: Steven Rostedt rostedt@goodmis.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Vincent Guittot vincent.guittot@linaro.org Fixes: 2e8e19226398 ("sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup") Link: https://lkml.kernel.org/r/20191004001243.140897-1-xueweiz@google.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 36 ++++++++++++++++++++++-------------- 1 file changed, 22 insertions(+), 14 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 86cfc5d5129ce..16b5d29bd7300 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4995,20 +4995,28 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer) if (++count > 3) { u64 new, old = ktime_to_ns(cfs_b->period);
- new = (old * 147) / 128; /* ~115% */ - new = min(new, max_cfs_quota_period); - - cfs_b->period = ns_to_ktime(new); - - /* since max is 1s, this is limited to 1e9^2, which fits in u64 */ - cfs_b->quota *= new; - cfs_b->quota = div64_u64(cfs_b->quota, old); - - pr_warn_ratelimited( - "cfs_period_timer[cpu%d]: period too short, scaling up (new cfs_period_us %lld, cfs_quota_us = %lld)\n", - smp_processor_id(), - div_u64(new, NSEC_PER_USEC), - div_u64(cfs_b->quota, NSEC_PER_USEC)); + /* + * Grow period by a factor of 2 to avoid losing precision. + * Precision loss in the quota/period ratio can cause __cfs_schedulable + * to fail. + */ + new = old * 2; + if (new < max_cfs_quota_period) { + cfs_b->period = ns_to_ktime(new); + cfs_b->quota *= 2; + + pr_warn_ratelimited( + "cfs_period_timer[cpu%d]: period too short, scaling up (new cfs_period_us = %lld, cfs_quota_us = %lld)\n", + smp_processor_id(), + div_u64(new, NSEC_PER_USEC), + div_u64(cfs_b->quota, NSEC_PER_USEC)); + } else { + pr_warn_ratelimited( + "cfs_period_timer[cpu%d]: period too short, but cannot scale up without losing precision (cfs_period_us = %lld, cfs_quota_us = %lld)\n", + smp_processor_id(), + div_u64(old, NSEC_PER_USEC), + div_u64(cfs_b->quota, NSEC_PER_USEC)); + }
/* reset count so we don't come right back in here */ count = 0;
From: Frederic Weisbecker frederic@kernel.org
[ Upstream commit 68e7a4d66b0ce04bf18ff2ffded5596ab3618585 ]
vtime_account_system() assumes that the target task to account cputime to is always the current task. This is most often true indeed except on task switch where we call:
vtime_common_task_switch(prev) vtime_account_system(prev)
Here prev is the scheduling-out task where we account the cputime to. It doesn't match current that is already the scheduling-in task at this stage of the context switch.
So we end up checking the wrong task flags to determine if we are accounting guest or system time to the previous task.
As a result the wrong task is used to check if the target is running in guest mode. We may then spuriously account or leak either system or guest time on task switch.
Fix this assumption and also turn vtime_guest_enter/exit() to use the task passed in parameter as well to avoid future similar issues.
Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rik van Riel riel@redhat.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Wanpeng Li wanpengli@tencent.com Fixes: 2a42eb9594a1 ("sched/cputime: Accumulate vtime on top of nsec clocksource") Link: https://lkml.kernel.org/r/20190925214242.21873-1-frederic@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/cputime.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index 2305ce89a26cf..46ed4e1383e21 100644 --- a/kernel/sched/cputime.c +++ b/kernel/sched/cputime.c @@ -740,7 +740,7 @@ void vtime_account_system(struct task_struct *tsk)
write_seqcount_begin(&vtime->seqcount); /* We might have scheduled out from guest path */ - if (current->flags & PF_VCPU) + if (tsk->flags & PF_VCPU) vtime_account_guest(tsk, vtime); else __vtime_account_system(tsk, vtime); @@ -783,7 +783,7 @@ void vtime_guest_enter(struct task_struct *tsk) */ write_seqcount_begin(&vtime->seqcount); __vtime_account_system(tsk, vtime); - current->flags |= PF_VCPU; + tsk->flags |= PF_VCPU; write_seqcount_end(&vtime->seqcount); } EXPORT_SYMBOL_GPL(vtime_guest_enter); @@ -794,7 +794,7 @@ void vtime_guest_exit(struct task_struct *tsk)
write_seqcount_begin(&vtime->seqcount); vtime_account_guest(tsk, vtime); - current->flags &= ~PF_VCPU; + tsk->flags &= ~PF_VCPU; write_seqcount_end(&vtime->seqcount); } EXPORT_SYMBOL_GPL(vtime_guest_exit);
From: Song Liu songliubraving@fb.com
[ Upstream commit d44248a41337731a111374822d7d4451b64e73e4 ]
perf_mmap() always increases user->locked_vm. As a result, "extra" could grow bigger than "user_extra", which doesn't make sense. Here is an example case:
(Note: Assume "user_lock_limit" is very small.)
| # of perf_mmap calls |vma->vm_mm->pinned_vm|user->locked_vm| | 0 | 0 | 0 | | 1 | user_extra | user_extra | | 2 | 3 * user_extra | 2 * user_extra| | 3 | 6 * user_extra | 3 * user_extra| | 4 | 10 * user_extra | 4 * user_extra|
Fix this by maintaining proper user_extra and extra.
Reviewed-By: Hechao Li hechaol@fb.com Reported-by: Hechao Li hechaol@fb.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: kernel-team@fb.com Cc: Jie Meng jmeng@fb.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Link: https://lkml.kernel.org/r/20190904214618.3795672-1-songliubraving@fb.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/core.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c index a2a50b668ef32..c0a6a50e01af6 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -5585,7 +5585,8 @@ again: * undo the VM accounting. */
- atomic_long_sub((size >> PAGE_SHIFT) + 1, &mmap_user->locked_vm); + atomic_long_sub((size >> PAGE_SHIFT) + 1 - mmap_locked, + &mmap_user->locked_vm); atomic64_sub(mmap_locked, &vma->vm_mm->pinned_vm); free_uid(mmap_user);
@@ -5729,8 +5730,20 @@ accounting:
user_locked = atomic_long_read(&user->locked_vm) + user_extra;
- if (user_locked > user_lock_limit) + if (user_locked <= user_lock_limit) { + /* charge all to locked_vm */ + } else if (atomic_long_read(&user->locked_vm) >= user_lock_limit) { + /* charge all to pinned_vm */ + extra = user_extra; + user_extra = 0; + } else { + /* + * charge locked_vm until it hits user_lock_limit; + * charge the rest from pinned_vm + */ extra = user_locked - user_lock_limit; + user_extra -= extra; + }
lock_limit = rlimit(RLIMIT_MEMLOCK); lock_limit >>= PAGE_SHIFT;
From: Song Liu songliubraving@fb.com
[ Upstream commit 7fa343b7fdc4f351de4e3f28d5c285937dd1f42f ]
In perf_rotate_context(), when the first cpu flexible event fail to schedule, cpu_rotate is 1, while cpu_event is NULL. Since cpu_event is NULL, perf_rotate_context will _NOT_ call cpu_ctx_sched_out(), thus cpuctx->ctx.is_active will have EVENT_FLEXIBLE set. Then, the next perf_event_sched_in() will skip all cpu flexible events because of the EVENT_FLEXIBLE bit.
In the next call of perf_rotate_context(), cpu_rotate stays 1, and cpu_event stays NULL, so this process repeats. The end result is, flexible events on this cpu will not be scheduled (until another event being added to the cpuctx).
Here is an easy repro of this issue. On Intel CPUs, where ref-cycles could only use one counter, run one pinned event for ref-cycles, one flexible event for ref-cycles, and one flexible event for cycles. The flexible ref-cycles is never scheduled, which is expected. However, because of this issue, the cycles event is never scheduled either.
$ perf stat -e ref-cycles:D,ref-cycles,cycles -C 5 -I 1000
time counts unit events 1.000152973 15,412,480 ref-cycles:D 1.000152973 <not counted> ref-cycles (0.00%) 1.000152973 <not counted> cycles (0.00%) 2.000486957 18,263,120 ref-cycles:D 2.000486957 <not counted> ref-cycles (0.00%) 2.000486957 <not counted> cycles (0.00%)
To fix this, when the flexible_active list is empty, try rotate the first event in the flexible_groups. Also, rename ctx_first_active() to ctx_event_to_rotate(), which is more accurate.
Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: kernel-team@fb.com Cc: Arnaldo Carvalho de Melo acme@kernel.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Sasha Levin sashal@kernel.org Cc: Thomas Gleixner tglx@linutronix.de Fixes: 8d5bce0c37fa ("perf/core: Optimize perf_rotate_context() event scheduling") Link: https://lkml.kernel.org/r/20191008165949.920548-1-songliubraving@fb.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/core.c | 22 +++++++++++++++++----- 1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c index c0a6a50e01af6..2ff44d7308dd7 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -3694,11 +3694,23 @@ static void rotate_ctx(struct perf_event_context *ctx, struct perf_event *event) perf_event_groups_insert(&ctx->flexible_groups, event); }
+/* pick an event from the flexible_groups to rotate */ static inline struct perf_event * -ctx_first_active(struct perf_event_context *ctx) +ctx_event_to_rotate(struct perf_event_context *ctx) { - return list_first_entry_or_null(&ctx->flexible_active, - struct perf_event, active_list); + struct perf_event *event; + + /* pick the first active flexible event */ + event = list_first_entry_or_null(&ctx->flexible_active, + struct perf_event, active_list); + + /* if no active flexible event, pick the first event */ + if (!event) { + event = rb_entry_safe(rb_first(&ctx->flexible_groups.tree), + typeof(*event), group_node); + } + + return event; }
static bool perf_rotate_context(struct perf_cpu_context *cpuctx) @@ -3723,9 +3735,9 @@ static bool perf_rotate_context(struct perf_cpu_context *cpuctx) perf_pmu_disable(cpuctx->ctx.pmu);
if (task_rotate) - task_event = ctx_first_active(task_ctx); + task_event = ctx_event_to_rotate(task_ctx); if (cpu_rotate) - cpu_event = ctx_first_active(&cpuctx->ctx); + cpu_event = ctx_event_to_rotate(&cpuctx->ctx);
/* * As per the order given at ctx_resched() first 'pop' task flexible
From: Tom Lendacky thomas.lendacky@amd.com
[ Upstream commit df4d29732fdad43a51284f826bec3e6ded177540 ]
It turns out that the NMI latency workaround from commit:
6d3edaae16c6 ("x86/perf/amd: Resolve NMI latency issues for active PMCs")
ends up being too conservative and results in the perf NMI handler claiming NMIs too easily on AMD hardware when the NMI watchdog is active.
This has an impact, for example, on the hpwdt (HPE watchdog timer) module. This module can produce an NMI that is used to reset the system. It registers an NMI handler for the NMI_UNKNOWN type and relies on the fact that nothing has claimed an NMI so that its handler will be invoked when the watchdog device produces an NMI. After the referenced commit, the hpwdt module is unable to process its generated NMI if the NMI watchdog is active, because the current NMI latency mitigation results in the NMI being claimed by the perf NMI handler.
Update the AMD perf NMI latency mitigation workaround to, instead, use a window of time. Whenever a PMC is handled in the perf NMI handler, set a timestamp which will act as a perf NMI window. Any NMIs arriving within that window will be claimed by perf. Anything outside that window will not be claimed by perf. The value for the NMI window is set to 100 msecs. This is a conservative value that easily covers any NMI latency in the hardware. While this still results in a window in which the hpwdt module will not receive its NMI, the window is now much, much smaller.
Signed-off-by: Tom Lendacky thomas.lendacky@amd.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@kernel.org Cc: Borislav Petkov bp@alien8.de Cc: Jerry Hoemann jerry.hoemann@hpe.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Fixes: 6d3edaae16c6 ("x86/perf/amd: Resolve NMI latency issues for active PMCs") Link: https://lkml.kernel.org/r/Message-ID: Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/events/amd/core.c | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-)
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c index e7d35f60d53fc..64c3e70b0556b 100644 --- a/arch/x86/events/amd/core.c +++ b/arch/x86/events/amd/core.c @@ -5,12 +5,14 @@ #include <linux/init.h> #include <linux/slab.h> #include <linux/delay.h> +#include <linux/jiffies.h> #include <asm/apicdef.h> #include <asm/nmi.h>
#include "../perf_event.h"
-static DEFINE_PER_CPU(unsigned int, perf_nmi_counter); +static DEFINE_PER_CPU(unsigned long, perf_nmi_tstamp); +static unsigned long perf_nmi_window;
static __initconst const u64 amd_hw_cache_event_ids [PERF_COUNT_HW_CACHE_MAX] @@ -641,11 +643,12 @@ static void amd_pmu_disable_event(struct perf_event *event) * handler when multiple PMCs are active or PMC overflow while handling some * other source of an NMI. * - * Attempt to mitigate this by using the number of active PMCs to determine - * whether to return NMI_HANDLED if the perf NMI handler did not handle/reset - * any PMCs. The per-CPU perf_nmi_counter variable is set to a minimum of the - * number of active PMCs or 2. The value of 2 is used in case an NMI does not - * arrive at the LAPIC in time to be collapsed into an already pending NMI. + * Attempt to mitigate this by creating an NMI window in which un-handled NMIs + * received during this window will be claimed. This prevents extending the + * window past when it is possible that latent NMIs should be received. The + * per-CPU perf_nmi_tstamp will be set to the window end time whenever perf has + * handled a counter. When an un-handled NMI is received, it will be claimed + * only if arriving within that window. */ static int amd_pmu_handle_irq(struct pt_regs *regs) { @@ -663,21 +666,19 @@ static int amd_pmu_handle_irq(struct pt_regs *regs) handled = x86_pmu_handle_irq(regs);
/* - * If a counter was handled, record the number of possible remaining - * NMIs that can occur. + * If a counter was handled, record a timestamp such that un-handled + * NMIs will be claimed if arriving within that window. */ if (handled) { - this_cpu_write(perf_nmi_counter, - min_t(unsigned int, 2, active)); + this_cpu_write(perf_nmi_tstamp, + jiffies + perf_nmi_window);
return handled; }
- if (!this_cpu_read(perf_nmi_counter)) + if (time_after(jiffies, this_cpu_read(perf_nmi_tstamp))) return NMI_DONE;
- this_cpu_dec(perf_nmi_counter); - return NMI_HANDLED; }
@@ -909,6 +910,9 @@ static int __init amd_core_pmu_init(void) if (!boot_cpu_has(X86_FEATURE_PERFCTR_CORE)) return 0;
+ /* Avoid calulating the value each time in the NMI handler */ + perf_nmi_window = msecs_to_jiffies(100); + switch (boot_cpu_data.x86) { case 0x15: pr_cont("Fam15h ");
From: Nirmoy Das nirmoy.das@amd.com
[ Upstream commit 083164dbdb17c5ea4ad92c1782b59c9d75567790 ]
cleanup error handling code and make sure temporary info array with the handles are freed by amdgpu_bo_list_put() on idr_replace()'s failure.
Signed-off-by: Nirmoy Das nirmoy.das@amd.com Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c index 7bcf86c619995..61e38e43ad1d5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c @@ -270,7 +270,7 @@ int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data,
r = amdgpu_bo_create_list_entry_array(&args->in, &info); if (r) - goto error_free; + return r;
switch (args->in.operation) { case AMDGPU_BO_LIST_OP_CREATE: @@ -283,8 +283,7 @@ int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data, r = idr_alloc(&fpriv->bo_list_handles, list, 1, 0, GFP_KERNEL); mutex_unlock(&fpriv->bo_list_lock); if (r < 0) { - amdgpu_bo_list_put(list); - return r; + goto error_put_list; }
handle = r; @@ -306,9 +305,8 @@ int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data, mutex_unlock(&fpriv->bo_list_lock);
if (IS_ERR(old)) { - amdgpu_bo_list_put(list); r = PTR_ERR(old); - goto error_free; + goto error_put_list; }
amdgpu_bo_list_put(old); @@ -325,8 +323,10 @@ int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data,
return 0;
+error_put_list: + amdgpu_bo_list_put(list); + error_free: - if (info) - kvfree(info); + kvfree(info); return r; }
From: Navid Emamdoost navid.emamdoost@gmail.com
[ Upstream commit ab612b1daf415b62c58e130cb3d0f30b255a14d0 ]
In adis_update_scan_mode, if allocation for adis->buffer fails, previously allocated adis->xfer needs to be released.
Signed-off-by: Navid Emamdoost navid.emamdoost@gmail.com Reviewed-by: Alexandru Ardelean alexandru.ardelean@analog.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/imu/adis_buffer.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/iio/imu/adis_buffer.c b/drivers/iio/imu/adis_buffer.c index 9ac8356d9a954..f446ff4978091 100644 --- a/drivers/iio/imu/adis_buffer.c +++ b/drivers/iio/imu/adis_buffer.c @@ -78,8 +78,11 @@ int adis_update_scan_mode(struct iio_dev *indio_dev, return -ENOMEM;
adis->buffer = kcalloc(indio_dev->scan_bytes, 2, GFP_KERNEL); - if (!adis->buffer) + if (!adis->buffer) { + kfree(adis->xfer); + adis->xfer = NULL; return -ENOMEM; + }
rx = adis->buffer; tx = rx + scan_count;
From: Navid Emamdoost navid.emamdoost@gmail.com
[ Upstream commit 9c0530e898f384c5d279bfcebd8bb17af1105873 ]
In adis_update_scan_mode_burst, if adis->buffer allocation fails release the adis->xfer.
Signed-off-by: Navid Emamdoost navid.emamdoost@gmail.com Reviewed-by: Alexandru Ardelean alexandru.ardelean@analog.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/imu/adis_buffer.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/iio/imu/adis_buffer.c b/drivers/iio/imu/adis_buffer.c index f446ff4978091..4998a89d083d5 100644 --- a/drivers/iio/imu/adis_buffer.c +++ b/drivers/iio/imu/adis_buffer.c @@ -35,8 +35,11 @@ static int adis_update_scan_mode_burst(struct iio_dev *indio_dev, return -ENOMEM;
adis->buffer = kzalloc(burst_length + sizeof(u16), GFP_KERNEL); - if (!adis->buffer) + if (!adis->buffer) { + kfree(adis->xfer); + adis->xfer = NULL; return -ENOMEM; + }
tx = adis->buffer + burst_length; tx[0] = ADIS_READ_REG(adis->burst->reg_cmd);
From: Lorenzo Bianconi lorenzo@kernel.org
[ Upstream commit fdb828e2c71a09bb9e865f41b015597c5f671705 ]
i2c controller available in st_lsm6dsx series performs i2c slave configuration using accel clock as trigger. st_lsm6dsx_shub_wait_complete routine is used to wait the controller has carried out the requested configuration. However if the accel sensor is not enabled we should not use its configured odr to estimate a proper timeout
Fixes: c91c1c844ebd ("iio: imu: st_lsm6dsx: add i2c embedded controller support") Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c index 66fbcd94642d4..4c754a02717b3 100644 --- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c +++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_shub.c @@ -92,9 +92,11 @@ static const struct st_lsm6dsx_ext_dev_settings st_lsm6dsx_ext_dev_table[] = { static void st_lsm6dsx_shub_wait_complete(struct st_lsm6dsx_hw *hw) { struct st_lsm6dsx_sensor *sensor; + u16 odr;
sensor = iio_priv(hw->iio_devs[ST_LSM6DSX_ID_ACC]); - msleep((2000U / sensor->odr) + 1); + odr = (hw->enable_mask & BIT(ST_LSM6DSX_ID_ACC)) ? sensor->odr : 13; + msleep((2000U / odr) + 1); }
/**
From: Thomas Bogendoerfer tbogendoerfer@suse.de
[ Upstream commit 46f1619500d022501a4f0389f9f4c349ab46bb86 ]
Commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly") allows compiler to uninline functions marked as 'inline'. In cace of __xchg this would cause to reference function __xchg_called_with_bad_pointer, which is an error case for catching bugs and will not happen for correct code, if __xchg is inlined.
Signed-off-by: Thomas Bogendoerfer tbogendoerfer@suse.de Reviewed-by: Philippe Mathieu-Daudé f4bug@amsat.org Signed-off-by: Paul Burton paul.burton@mips.com Cc: Ralf Baechle ralf@linux-mips.org Cc: James Hogan jhogan@kernel.org Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/include/asm/cmpxchg.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/mips/include/asm/cmpxchg.h b/arch/mips/include/asm/cmpxchg.h index 319522fa3a45e..2b61052e10c94 100644 --- a/arch/mips/include/asm/cmpxchg.h +++ b/arch/mips/include/asm/cmpxchg.h @@ -77,8 +77,8 @@ extern unsigned long __xchg_called_with_bad_pointer(void) extern unsigned long __xchg_small(volatile void *ptr, unsigned long val, unsigned int size);
-static inline unsigned long __xchg(volatile void *ptr, unsigned long x, - int size) +static __always_inline +unsigned long __xchg(volatile void *ptr, unsigned long x, int size) { switch (size) { case 1:
From: Thomas Bogendoerfer tbogendoerfer@suse.de
[ Upstream commit efcb529694c3b707dc0471b312944337ba16e4dd ]
Use ARRAY_SIZE to caluculate the top of the o32 stack.
Signed-off-by: Thomas Bogendoerfer tbogendoerfer@suse.de Signed-off-by: Paul Burton paul.burton@mips.com Cc: Ralf Baechle ralf@linux-mips.org Cc: James Hogan jhogan@kernel.org Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/fw/sni/sniprom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/mips/fw/sni/sniprom.c b/arch/mips/fw/sni/sniprom.c index 8772617b64cef..80112f2298b68 100644 --- a/arch/mips/fw/sni/sniprom.c +++ b/arch/mips/fw/sni/sniprom.c @@ -43,7 +43,7 @@
/* O32 stack has to be 8-byte aligned. */ static u64 o32_stk[4096]; -#define O32_STK &o32_stk[sizeof(o32_stk)] +#define O32_STK (&o32_stk[ARRAY_SIZE(o32_stk)])
#define __PROM_O32(fun, arg) fun arg __asm__(#fun); \ __asm__(#fun " = call_o32")
From: Halil Pasic pasic@linux.ibm.com
[ Upstream commit 05668e1d74b84c53fbe0f28565e4c9502a6b8a67 ]
Commit 37db8985b211 ("s390/cio: add basic protected virtualization support") breaks virtio-ccw devices with VIRTIO_F_IOMMU_PLATFORM for non Protected Virtualization (PV) guests. The problem is that the dma_mask of the ccw device, which is used by virtio core, gets changed from 64 to 31 bit, because some of the DMA allocations do require 31 bit addressable memory. For PV the only drawback is that some of the virtio structures must end up in ZONE_DMA because we have the bounce the buffers mapped via DMA API anyway.
But for non PV guests we have a problem: because of the 31 bit mask guests bigger than 2G are likely to try bouncing buffers. The swiotlb however is only initialized for PV guests, because we don't want to bounce anything for non PV guests. The first such map kills the guest.
Since the DMA API won't allow us to specify for each allocation whether we need memory from ZONE_DMA (31 bit addressable) or any DMA capable memory will do, let us use coherent_dma_mask (which is used for allocations) to force allocating form ZONE_DMA while changing dma_mask to DMA_BIT_MASK(64) so that at least the streaming API will regard the whole memory DMA capable.
Signed-off-by: Halil Pasic pasic@linux.ibm.com Reported-by: Marc Hartmayer mhartmay@linux.ibm.com Suggested-by: Robin Murphy robin.murphy@arm.com Fixes: 37db8985b211 ("s390/cio: add basic protected virtualization support") Link: https://lore.kernel.org/lkml/20190930153803.7958-1-pasic@linux.ibm.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Cornelia Huck cohuck@redhat.com Signed-off-by: Christian Borntraeger borntraeger@de.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/s390/cio/cio.h | 1 + drivers/s390/cio/css.c | 7 ++++++- drivers/s390/cio/device.c | 2 +- 3 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/s390/cio/cio.h b/drivers/s390/cio/cio.h index ba7d2480613b9..dcdaba689b20c 100644 --- a/drivers/s390/cio/cio.h +++ b/drivers/s390/cio/cio.h @@ -113,6 +113,7 @@ struct subchannel { enum sch_todo todo; struct work_struct todo_work; struct schib_config config; + u64 dma_mask; char *driver_override; /* Driver name to force a match */ } __attribute__ ((aligned(8)));
diff --git a/drivers/s390/cio/css.c b/drivers/s390/cio/css.c index 1fbfb0a93f5f1..831850435c23b 100644 --- a/drivers/s390/cio/css.c +++ b/drivers/s390/cio/css.c @@ -232,7 +232,12 @@ struct subchannel *css_alloc_subchannel(struct subchannel_id schid, * belong to a subchannel need to fit 31 bit width (e.g. ccw). */ sch->dev.coherent_dma_mask = DMA_BIT_MASK(31); - sch->dev.dma_mask = &sch->dev.coherent_dma_mask; + /* + * But we don't have such restrictions imposed on the stuff that + * is handled by the streaming API. + */ + sch->dma_mask = DMA_BIT_MASK(64); + sch->dev.dma_mask = &sch->dma_mask; return sch;
err: diff --git a/drivers/s390/cio/device.c b/drivers/s390/cio/device.c index c421899be20f2..027ef1dde5a7a 100644 --- a/drivers/s390/cio/device.c +++ b/drivers/s390/cio/device.c @@ -710,7 +710,7 @@ static struct ccw_device * io_subchannel_allocate_dev(struct subchannel *sch) if (!cdev->private) goto err_priv; cdev->dev.coherent_dma_mask = sch->dev.coherent_dma_mask; - cdev->dev.dma_mask = &cdev->dev.coherent_dma_mask; + cdev->dev.dma_mask = sch->dev.dma_mask; dma_pool = cio_gp_dma_create(&cdev->dev, 1); if (!dma_pool) goto err_dma_pool;
From: Navid Emamdoost navid.emamdoost@gmail.com
[ Upstream commit e0b0cb9388642c104838fac100a4af32745621e2 ]
In hgcm_call_preprocess_linaddr memory is allocated for bounce_buf but is not released if copy_form_user fails. In order to prevent memory leak in case of failure, the assignment to bounce_buf_ret is moved before the error check. This way the allocated bounce_buf will be released by the caller.
Fixes: 579db9d45cb4 ("virt: Add vboxguest VMMDEV communication code") Signed-off-by: Navid Emamdoost navid.emamdoost@gmail.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20190930204223.3660-1-navid.emamdoost@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/virt/vboxguest/vboxguest_utils.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/virt/vboxguest/vboxguest_utils.c b/drivers/virt/vboxguest/vboxguest_utils.c index 75fd140b02ff8..43c391626a000 100644 --- a/drivers/virt/vboxguest/vboxguest_utils.c +++ b/drivers/virt/vboxguest/vboxguest_utils.c @@ -220,6 +220,8 @@ static int hgcm_call_preprocess_linaddr( if (!bounce_buf) return -ENOMEM;
+ *bounce_buf_ret = bounce_buf; + if (copy_in) { ret = copy_from_user(bounce_buf, (void __user *)buf, len); if (ret) @@ -228,7 +230,6 @@ static int hgcm_call_preprocess_linaddr( memset(bounce_buf, 0, len); }
- *bounce_buf_ret = bounce_buf; hgcm_call_add_pagelist_size(bounce_buf, len, extra); return 0; }
From: Xiubo Li xiubli@redhat.com
[ Upstream commit 862488105b84ca744b3d8ff131e0fcfe10644be1 ]
1. nbd_put takes the mutex and drops nbd->ref to 0. It then does idr_remove and drops the mutex.
2. nbd_genl_connect takes the mutex. idr_find/idr_for_each fails to find an existing device, so it does nbd_dev_add.
3. just before the nbd_put could call nbd_dev_remove or not finished totally, but if nbd_dev_add try to add_disk, we can hit:
debugfs: Directory 'nbd1' with parent 'block' already present!
This patch will make sure all the disk add/remove stuff are done by holding the nbd_index_mutex lock.
Reported-by: Mike Christie mchristi@redhat.com Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Xiubo Li xiubli@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/nbd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 0b727f7432f9e..bd164192045b0 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -230,8 +230,8 @@ static void nbd_put(struct nbd_device *nbd) if (refcount_dec_and_mutex_lock(&nbd->refs, &nbd_index_mutex)) { idr_remove(&nbd_index_idr, nbd->index); - mutex_unlock(&nbd_index_mutex); nbd_dev_remove(nbd); + mutex_unlock(&nbd_index_mutex); } }
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit 1047ec868332034d1fbcb2fae19fe6d4cb869ff2 ]
Our client can issue multiple SETCLIENTID operations to the same server in some circumstances. Ensure that calls to nfs4_proc_setclientid() after the first one do not overwrite the previously allocated cl_acceptor string.
unreferenced object 0xffff888461031800 (size 32): comm "mount.nfs", pid 2227, jiffies 4294822467 (age 1407.749s) hex dump (first 32 bytes): 6e 66 73 40 6b 6c 69 6d 74 2e 69 62 2e 31 30 31 nfs@klimt.ib.101 35 67 72 61 6e 67 65 72 2e 6e 65 74 00 00 00 00 5granger.net.... backtrace: [<00000000ab820188>] __kmalloc+0x128/0x176 [<00000000eeaf4ec8>] gss_stringify_acceptor+0xbd/0x1a7 [auth_rpcgss] [<00000000e85e3382>] nfs4_proc_setclientid+0x34e/0x46c [nfsv4] [<000000003d9cf1fa>] nfs40_discover_server_trunking+0x7a/0xed [nfsv4] [<00000000b81c3787>] nfs4_discover_server_trunking+0x81/0x244 [nfsv4] [<000000000801b55f>] nfs4_init_client+0x1b0/0x238 [nfsv4] [<00000000977daf7f>] nfs4_set_client+0xfe/0x14d [nfsv4] [<0000000053a68a2a>] nfs4_create_server+0x107/0x1db [nfsv4] [<0000000088262019>] nfs4_remote_mount+0x2c/0x59 [nfsv4] [<00000000e84a2fd0>] legacy_get_tree+0x2d/0x4c [<00000000797e947c>] vfs_get_tree+0x20/0xc7 [<00000000ecabaaa8>] fc_mount+0xe/0x36 [<00000000f15fafc2>] vfs_kern_mount+0x74/0x8d [<00000000a3ff4e26>] nfs_do_root_mount+0x8a/0xa3 [nfsv4] [<00000000d1c2b337>] nfs4_try_mount+0x58/0xad [nfsv4] [<000000004c9bddee>] nfs_fs_mount+0x820/0x869 [nfs]
Fixes: f11b2a1cfbf5 ("nfs4: copy acceptor name from context ... ") Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs4proc.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 1406858bae6c9..e1e7d2724b971 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -6058,6 +6058,7 @@ int nfs4_proc_setclientid(struct nfs_client *clp, u32 program, } status = task->tk_status; if (setclientid.sc_cred) { + kfree(clp->cl_acceptor); clp->cl_acceptor = rpcauth_stringify_acceptor(setclientid.sc_cred); put_rpccred(setclientid.sc_cred); }
From: Benjamin Coddington bcodding@redhat.com
[ Upstream commit af84537dbd1b39505d1f3d8023029b4a59666513 ]
Since commit 4f8943f80883 ("SUNRPC: Replace direct task wakeups from softirq context") there has been a race to the value of the sk_err if both XPRT_SOCK_WAKE_ERROR and XPRT_SOCK_WAKE_DISCONNECT are set. In that case, we may end up losing the sk_err value that existed when xs_error_report was called.
Fix this by reverting to the previous behavior: instead of using SO_ERROR to retrieve the value at a later time (which might also return sk_err_soft), copy the sk_err value onto struct sock_xprt, and use that value to wake pending tasks.
Signed-off-by: Benjamin Coddington bcodding@redhat.com Fixes: 4f8943f80883 ("SUNRPC: Replace direct task wakeups from softirq context") Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/sunrpc/xprtsock.h | 1 + net/sunrpc/xprtsock.c | 17 ++++++++--------- 2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/include/linux/sunrpc/xprtsock.h b/include/linux/sunrpc/xprtsock.h index 7638dbe7bc500..a940de03808dd 100644 --- a/include/linux/sunrpc/xprtsock.h +++ b/include/linux/sunrpc/xprtsock.h @@ -61,6 +61,7 @@ struct sock_xprt { struct mutex recv_mutex; struct sockaddr_storage srcaddr; unsigned short srcport; + int xprt_err;
/* * UDP socket buffer size parameters diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index e2176c167a579..4e0b5bed6c737 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -1243,19 +1243,21 @@ static void xs_error_report(struct sock *sk) { struct sock_xprt *transport; struct rpc_xprt *xprt; - int err;
read_lock_bh(&sk->sk_callback_lock); if (!(xprt = xprt_from_sock(sk))) goto out;
transport = container_of(xprt, struct sock_xprt, xprt); - err = -sk->sk_err; - if (err == 0) + transport->xprt_err = -sk->sk_err; + if (transport->xprt_err == 0) goto out; dprintk("RPC: xs_error_report client %p, error=%d...\n", - xprt, -err); - trace_rpc_socket_error(xprt, sk->sk_socket, err); + xprt, -transport->xprt_err); + trace_rpc_socket_error(xprt, sk->sk_socket, transport->xprt_err); + + /* barrier ensures xprt_err is set before XPRT_SOCK_WAKE_ERROR */ + smp_mb__before_atomic(); xs_run_error_worker(transport, XPRT_SOCK_WAKE_ERROR); out: read_unlock_bh(&sk->sk_callback_lock); @@ -2470,7 +2472,6 @@ static void xs_wake_write(struct sock_xprt *transport) static void xs_wake_error(struct sock_xprt *transport) { int sockerr; - int sockerr_len = sizeof(sockerr);
if (!test_bit(XPRT_SOCK_WAKE_ERROR, &transport->sock_state)) return; @@ -2479,9 +2480,7 @@ static void xs_wake_error(struct sock_xprt *transport) goto out; if (!test_and_clear_bit(XPRT_SOCK_WAKE_ERROR, &transport->sock_state)) goto out; - if (kernel_getsockopt(transport->sock, SOL_SOCKET, SO_ERROR, - (char *)&sockerr, &sockerr_len) != 0) - goto out; + sockerr = xchg(&transport->xprt_err, 0); if (sockerr < 0) xprt_wake_pending_tasks(&transport->xprt, sockerr); out:
From: Christian Borntraeger borntraeger@de.ibm.com
[ Upstream commit 062795fcdcb2d22822fb42644b1d76a8ad8439b3 ]
Depending on inlining decisions by the compiler, __get/put_user_fn might become out of line. Then the compiler is no longer able to tell that size can only be 1,2,4 or 8 due to the check in __get/put_user resulting in false positives like
./arch/s390/include/asm/uaccess.h: In function ‘__put_user_fn’: ./arch/s390/include/asm/uaccess.h:113:9: warning: ‘rc’ may be used uninitialized in this function [-Wmaybe-uninitialized] 113 | return rc; | ^~ ./arch/s390/include/asm/uaccess.h: In function ‘__get_user_fn’: ./arch/s390/include/asm/uaccess.h:143:9: warning: ‘rc’ may be used uninitialized in this function [-Wmaybe-uninitialized] 143 | return rc; | ^~
These functions are supposed to be always inlined. Mark it as such.
Signed-off-by: Christian Borntraeger borntraeger@de.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/include/asm/uaccess.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h index bd2fd9a7821da..a470f1fa9f2af 100644 --- a/arch/s390/include/asm/uaccess.h +++ b/arch/s390/include/asm/uaccess.h @@ -83,7 +83,7 @@ raw_copy_to_user(void __user *to, const void *from, unsigned long n); __rc; \ })
-static inline int __put_user_fn(void *x, void __user *ptr, unsigned long size) +static __always_inline int __put_user_fn(void *x, void __user *ptr, unsigned long size) { unsigned long spec = 0x010000UL; int rc; @@ -113,7 +113,7 @@ static inline int __put_user_fn(void *x, void __user *ptr, unsigned long size) return rc; }
-static inline int __get_user_fn(void *x, const void __user *ptr, unsigned long size) +static __always_inline int __get_user_fn(void *x, const void __user *ptr, unsigned long size) { unsigned long spec = 0x01UL; int rc;
From: Petr Mladek pmladek@suse.com
[ Upstream commit d303de1fcf344ff7c15ed64c3f48a991c9958775 ]
A customer reported the following softlockup:
[899688.160002] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [test.sh:16464] [899688.160002] CPU: 0 PID: 16464 Comm: test.sh Not tainted 4.12.14-6.23-azure #1 SLE12-SP4 [899688.160002] RIP: 0010:up_write+0x1a/0x30 [899688.160002] Kernel panic - not syncing: softlockup: hung tasks [899688.160002] RIP: 0010:up_write+0x1a/0x30 [899688.160002] RSP: 0018:ffffa86784d4fde8 EFLAGS: 00000257 ORIG_RAX: ffffffffffffff12 [899688.160002] RAX: ffffffff970fea00 RBX: 0000000000000001 RCX: 0000000000000000 [899688.160002] RDX: ffffffff00000001 RSI: 0000000000000080 RDI: ffffffff970fea00 [899688.160002] RBP: ffffffffffffffff R08: ffffffffffffffff R09: 0000000000000000 [899688.160002] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b59014720d8 [899688.160002] R13: ffff8b59014720c0 R14: ffff8b5901471090 R15: ffff8b5901470000 [899688.160002] tracing_read_pipe+0x336/0x3c0 [899688.160002] __vfs_read+0x26/0x140 [899688.160002] vfs_read+0x87/0x130 [899688.160002] SyS_read+0x42/0x90 [899688.160002] do_syscall_64+0x74/0x160
It caught the process in the middle of trace_access_unlock(). There is no loop. So, it must be looping in the caller tracing_read_pipe() via the "waitagain" label.
Crashdump analyze uncovered that iter->seq was completely zeroed at this point, including iter->seq.seq.size. It means that print_trace_line() was never able to print anything and there was no forward progress.
The culprit seems to be in the code:
/* reset all but tr, trace, and overruns */ memset(&iter->seq, 0, sizeof(struct trace_iterator) - offsetof(struct trace_iterator, seq));
It was added by the commit 53d0aa773053ab182877 ("ftrace: add logic to record overruns"). It was v2.6.27-rc1. It was the time when iter->seq looked like:
struct trace_seq { unsigned char buffer[PAGE_SIZE]; unsigned int len; };
There was no "size" variable and zeroing was perfectly fine.
The solution is to reinitialize the structure after or without zeroing.
Link: http://lkml.kernel.org/r/20191011142134.11997-1-pmladek@suse.com
Signed-off-by: Petr Mladek pmladek@suse.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 04458ed44a557..a5e27f1c35a1f 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -6012,6 +6012,7 @@ waitagain: sizeof(struct trace_iterator) - offsetof(struct trace_iterator, seq)); cpumask_clear(iter->started); + trace_seq_init(&iter->seq); iter->pos = -1;
trace_event_read_lock();
From: Gustavo A. R. Silva gustavo@embeddedor.com
[ Upstream commit f948eb45e3af9fb18a0487d0797a773897ef6929 ]
Store SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF in variable *ret*, instead of returning in the middle of the function and leaking multiple resources: prog_linfo, btf, s and bfdf.
Addresses-Coverity-ID: 1454832 ("Structurally dead code") Fixes: 11aad897f6d1 ("perf annotate: Don't return -1 for error when doing BPF disassembly") Signed-off-by: Gustavo A. R. Silva gustavo@embeddedor.com Acked-by: Jiri Olsa jolsa@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/20191014171047.GA30850@embeddedor Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/annotate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index fb8756026a805..2e02d2a0176a8 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1752,7 +1752,7 @@ static int symbol__disassemble_bpf(struct symbol *sym, info_node = perf_env__find_bpf_prog_info(dso->bpf_prog.env, dso->bpf_prog.id); if (!info_node) { - return SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF; + ret = SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF; goto out; } info_linear = info_node->info_linear;
From: Thomas Richter tmricht@linux.ibm.com
[ Upstream commit 5e6c3c7b1ec217c1c4c95d9148182302b9969b97 ]
The following commit from the v5.4 merge window:
d44248a41337 ("perf/core: Rework memory accounting in perf_mmap()")
... breaks auxiliary trace buffer tracking.
If I run command 'perf record -e rbd000' to record samples and saving them in the **auxiliary** trace buffer then the value of 'locked_vm' becomes negative after all trace buffers have been allocated and released:
During allocation the values increase:
[52.250027] perf_mmap user->locked_vm:0x87 pinned_vm:0x0 ret:0 [52.250115] perf_mmap user->locked_vm:0x107 pinned_vm:0x0 ret:0 [52.250251] perf_mmap user->locked_vm:0x188 pinned_vm:0x0 ret:0 [52.250326] perf_mmap user->locked_vm:0x208 pinned_vm:0x0 ret:0 [52.250441] perf_mmap user->locked_vm:0x289 pinned_vm:0x0 ret:0 [52.250498] perf_mmap user->locked_vm:0x309 pinned_vm:0x0 ret:0 [52.250613] perf_mmap user->locked_vm:0x38a pinned_vm:0x0 ret:0 [52.250715] perf_mmap user->locked_vm:0x408 pinned_vm:0x2 ret:0 [52.250834] perf_mmap user->locked_vm:0x408 pinned_vm:0x83 ret:0 [52.250915] perf_mmap user->locked_vm:0x408 pinned_vm:0x103 ret:0 [52.251061] perf_mmap user->locked_vm:0x408 pinned_vm:0x184 ret:0 [52.251146] perf_mmap user->locked_vm:0x408 pinned_vm:0x204 ret:0 [52.251299] perf_mmap user->locked_vm:0x408 pinned_vm:0x285 ret:0 [52.251383] perf_mmap user->locked_vm:0x408 pinned_vm:0x305 ret:0 [52.251544] perf_mmap user->locked_vm:0x408 pinned_vm:0x386 ret:0 [52.251634] perf_mmap user->locked_vm:0x408 pinned_vm:0x406 ret:0 [52.253018] perf_mmap user->locked_vm:0x408 pinned_vm:0x487 ret:0 [52.253197] perf_mmap user->locked_vm:0x408 pinned_vm:0x508 ret:0 [52.253374] perf_mmap user->locked_vm:0x408 pinned_vm:0x589 ret:0 [52.253550] perf_mmap user->locked_vm:0x408 pinned_vm:0x60a ret:0 [52.253726] perf_mmap user->locked_vm:0x408 pinned_vm:0x68b ret:0 [52.253903] perf_mmap user->locked_vm:0x408 pinned_vm:0x70c ret:0 [52.254084] perf_mmap user->locked_vm:0x408 pinned_vm:0x78d ret:0 [52.254263] perf_mmap user->locked_vm:0x408 pinned_vm:0x80e ret:0
The value of user->locked_vm increases to a limit then the memory is tracked by pinned_vm.
During deallocation the size is subtracted from pinned_vm until it hits a limit. Then a larger value is subtracted from locked_vm leading to a large number (because of type unsigned):
[64.267797] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x78d [64.267826] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x70c [64.267848] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x68b [64.267869] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x60a [64.267891] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x589 [64.267911] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x508 [64.267933] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x487 [64.267952] perf_mmap_close mmap_user->locked_vm:0x408 pinned_vm:0x406 [64.268883] perf_mmap_close mmap_user->locked_vm:0x307 pinned_vm:0x406 [64.269117] perf_mmap_close mmap_user->locked_vm:0x206 pinned_vm:0x406 [64.269433] perf_mmap_close mmap_user->locked_vm:0x105 pinned_vm:0x406 [64.269536] perf_mmap_close mmap_user->locked_vm:0x4 pinned_vm:0x404 [64.269797] perf_mmap_close mmap_user->locked_vm:0xffffffffffffff84 pinned_vm:0x303 [64.270105] perf_mmap_close mmap_user->locked_vm:0xffffffffffffff04 pinned_vm:0x202 [64.270374] perf_mmap_close mmap_user->locked_vm:0xfffffffffffffe84 pinned_vm:0x101 [64.270628] perf_mmap_close mmap_user->locked_vm:0xfffffffffffffe04 pinned_vm:0x0
This value sticks for the user until system is rebooted, causing follow-on system calls using locked_vm resource limit to fail.
Note: There is no issue using the normal trace buffer.
In fact the issue is in perf_mmap_close(). During allocation auxiliary trace buffer memory is either traced as 'extra' and added to 'pinned_vm' or trace as 'user_extra' and added to 'locked_vm'. This applies for normal trace buffers and auxiliary trace buffer.
However in function perf_mmap_close() all auxiliary trace buffer is subtraced from 'locked_vm' and never from 'pinned_vm'. This breaks the ballance.
Signed-off-by: Thomas Richter tmricht@linux.ibm.com Acked-by: Peter Zijlstra a.p.zijlstra@chello.nl Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: acme@kernel.org Cc: gor@linux.ibm.com Cc: hechaol@fb.com Cc: heiko.carstens@de.ibm.com Cc: linux-perf-users@vger.kernel.org Cc: songliubraving@fb.com Fixes: d44248a41337 ("perf/core: Rework memory accounting in perf_mmap()") Link: https://lkml.kernel.org/r/20191021083354.67868-1-tmricht@linux.ibm.com [ Minor readability edits. ] Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/core.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c index 2ff44d7308dd7..53173883513c1 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -5524,8 +5524,10 @@ static void perf_mmap_close(struct vm_area_struct *vma) perf_pmu_output_stop(event);
/* now it's safe to free the pages */ - atomic_long_sub(rb->aux_nr_pages, &mmap_user->locked_vm); - atomic64_sub(rb->aux_mmap_locked, &vma->vm_mm->pinned_vm); + if (!rb->aux_mmap_locked) + atomic_long_sub(rb->aux_nr_pages, &mmap_user->locked_vm); + else + atomic64_sub(rb->aux_mmap_locked, &vma->vm_mm->pinned_vm);
/* this has to be the last one */ rb_free_aux(rb);
From: Dan Carpenter dan.carpenter@oracle.com
[ Upstream commit fd47a417e75e2506eb3672ae569b1c87e3774155 ]
The problem is that sizeof() is unsigned long so negative error codes are type promoted to high positive values and the condition becomes false.
Fixes: 1d427be4a39d ("USB: legousbtower: fix slab info leak at probe") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Acked-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20191011141115.GA4521@mwanda Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/misc/legousbtower.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/misc/legousbtower.c b/drivers/usb/misc/legousbtower.c index 62dab2441ec4f..23061f1526b4e 100644 --- a/drivers/usb/misc/legousbtower.c +++ b/drivers/usb/misc/legousbtower.c @@ -878,7 +878,7 @@ static int tower_probe (struct usb_interface *interface, const struct usb_device get_version_reply, sizeof(*get_version_reply), 1000); - if (result < sizeof(*get_version_reply)) { + if (result != sizeof(*get_version_reply)) { if (result >= 0) result = -EIO; dev_err(idev, "get version request failed: %d\n", result);
From: Mike Christie mchristi@redhat.com
[ Upstream commit cf1b2326b734896734c6e167e41766f9cee7686a ]
nbd requires socket families to support the shutdown method so the nbd recv workqueue can be woken up from its sock_recvmsg call. If the socket does not support the callout we will leave recv works running or get hangs later when the device or module is removed.
This adds a check during socket connection/reconnection to make sure the socket being passed in supports the needed callout.
Reported-by: syzbot+24c12fa8d218ed26011a@syzkaller.appspotmail.com Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs") Tested-by: Richard W.M. Jones rjones@redhat.com Signed-off-by: Mike Christie mchristi@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/nbd.c | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index bd164192045b0..9650777d0aaf1 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -935,6 +935,25 @@ static blk_status_t nbd_queue_rq(struct blk_mq_hw_ctx *hctx, return ret; }
+static struct socket *nbd_get_socket(struct nbd_device *nbd, unsigned long fd, + int *err) +{ + struct socket *sock; + + *err = 0; + sock = sockfd_lookup(fd, err); + if (!sock) + return NULL; + + if (sock->ops->shutdown == sock_no_shutdown) { + dev_err(disk_to_dev(nbd->disk), "Unsupported socket: shutdown callout must be supported.\n"); + *err = -EINVAL; + return NULL; + } + + return sock; +} + static int nbd_add_socket(struct nbd_device *nbd, unsigned long arg, bool netlink) { @@ -944,7 +963,7 @@ static int nbd_add_socket(struct nbd_device *nbd, unsigned long arg, struct nbd_sock *nsock; int err;
- sock = sockfd_lookup(arg, &err); + sock = nbd_get_socket(nbd, arg, &err); if (!sock) return err;
@@ -996,7 +1015,7 @@ static int nbd_reconnect_socket(struct nbd_device *nbd, unsigned long arg) int i; int err;
- sock = sockfd_lookup(arg, &err); + sock = nbd_get_socket(nbd, arg, &err); if (!sock) return err;
From: Jeffrey Hugo jeffrey.l.hugo@gmail.com
[ Upstream commit 2c6d2d3a580a852fe0a694e13af502a862293e0e ]
This adds the initial DT for the Lenovo Miix 630 laptop. Supported functionality includes USB (host), microSD-card, keyboard, and trackpad.
Signed-off-by: Jeffrey Hugo jeffrey.l.hugo@gmail.com Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/Makefile | 1 + .../boot/dts/qcom/msm8998-clamshell.dtsi | 240 ++++++++++++++++++ .../boot/dts/qcom/msm8998-lenovo-miix-630.dts | 30 +++ 3 files changed, 271 insertions(+) create mode 100644 arch/arm64/boot/dts/qcom/msm8998-clamshell.dtsi create mode 100644 arch/arm64/boot/dts/qcom/msm8998-lenovo-miix-630.dts
diff --git a/arch/arm64/boot/dts/qcom/Makefile b/arch/arm64/boot/dts/qcom/Makefile index 0a7e5dfce6f79..c38ca859f2e02 100644 --- a/arch/arm64/boot/dts/qcom/Makefile +++ b/arch/arm64/boot/dts/qcom/Makefile @@ -6,6 +6,7 @@ dtb-$(CONFIG_ARCH_QCOM) += msm8916-mtp.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8992-bullhead-rev-101.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8994-angler-rev-101.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8996-mtp.dtb +dtb-$(CONFIG_ARCH_QCOM) += msm8998-lenovo-miix-630.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8998-mtp.dtb dtb-$(CONFIG_ARCH_QCOM) += sdm845-cheza-r1.dtb dtb-$(CONFIG_ARCH_QCOM) += sdm845-cheza-r2.dtb diff --git a/arch/arm64/boot/dts/qcom/msm8998-clamshell.dtsi b/arch/arm64/boot/dts/qcom/msm8998-clamshell.dtsi new file mode 100644 index 0000000000000..9682d4dd7496e --- /dev/null +++ b/arch/arm64/boot/dts/qcom/msm8998-clamshell.dtsi @@ -0,0 +1,240 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2019, Jeffrey Hugo. All rights reserved. */ + +/* + * Common include for MSM8998 clamshell devices, ie the Lenovo Miix 630, + * Asus NovaGo TP370QL, and HP Envy x2. All three devices are basically the + * same, with differences in peripherals. + */ + +#include "msm8998.dtsi" +#include "pm8998.dtsi" +#include "pm8005.dtsi" + +/ { + chosen { + }; + + vph_pwr: vph-pwr-regulator { + compatible = "regulator-fixed"; + regulator-name = "vph_pwr"; + regulator-always-on; + regulator-boot-on; + }; +}; + +&qusb2phy { + status = "okay"; + + vdda-pll-supply = <&vreg_l12a_1p8>; + vdda-phy-dpdm-supply = <&vreg_l24a_3p075>; +}; + +&rpm_requests { + pm8998-regulators { + compatible = "qcom,rpm-pm8998-regulators"; + + vdd_s1-supply = <&vph_pwr>; + vdd_s2-supply = <&vph_pwr>; + vdd_s3-supply = <&vph_pwr>; + vdd_s4-supply = <&vph_pwr>; + vdd_s5-supply = <&vph_pwr>; + vdd_s6-supply = <&vph_pwr>; + vdd_s7-supply = <&vph_pwr>; + vdd_s8-supply = <&vph_pwr>; + vdd_s9-supply = <&vph_pwr>; + vdd_s10-supply = <&vph_pwr>; + vdd_s11-supply = <&vph_pwr>; + vdd_s12-supply = <&vph_pwr>; + vdd_s13-supply = <&vph_pwr>; + vdd_l1_l27-supply = <&vreg_s7a_1p025>; + vdd_l2_l8_l17-supply = <&vreg_s3a_1p35>; + vdd_l3_l11-supply = <&vreg_s7a_1p025>; + vdd_l4_l5-supply = <&vreg_s7a_1p025>; + vdd_l6-supply = <&vreg_s5a_2p04>; + vdd_l7_l12_l14_l15-supply = <&vreg_s5a_2p04>; + vdd_l9-supply = <&vph_pwr>; + vdd_l10_l23_l25-supply = <&vph_pwr>; + vdd_l13_l19_l21-supply = <&vph_pwr>; + vdd_l16_l28-supply = <&vph_pwr>; + vdd_l18_l22-supply = <&vph_pwr>; + vdd_l20_l24-supply = <&vph_pwr>; + vdd_l26-supply = <&vreg_s3a_1p35>; + vdd_lvs1_lvs2-supply = <&vreg_s4a_1p8>; + + vreg_s3a_1p35: s3 { + regulator-min-microvolt = <1352000>; + regulator-max-microvolt = <1352000>; + }; + vreg_s4a_1p8: s4 { + regulator-min-microvolt = <1800000>; + regulator-max-microvolt = <1800000>; + regulator-allow-set-load; + }; + vreg_s5a_2p04: s5 { + regulator-min-microvolt = <1904000>; + regulator-max-microvolt = <2040000>; + }; + vreg_s7a_1p025: s7 { + regulator-min-microvolt = <900000>; + regulator-max-microvolt = <1028000>; + }; + vreg_l1a_0p875: l1 { + regulator-min-microvolt = <880000>; + regulator-max-microvolt = <880000>; + regulator-allow-set-load; + }; + vreg_l2a_1p2: l2 { + regulator-min-microvolt = <1200000>; + regulator-max-microvolt = <1200000>; + regulator-allow-set-load; + }; + vreg_l3a_1p0: l3 { + regulator-min-microvolt = <1000000>; + regulator-max-microvolt = <1000000>; + }; + vreg_l5a_0p8: l5 { + regulator-min-microvolt = <800000>; + regulator-max-microvolt = <800000>; + }; + vreg_l6a_1p8: l6 { + regulator-min-microvolt = <1808000>; + regulator-max-microvolt = <1808000>; + }; + vreg_l7a_1p8: l7 { + regulator-min-microvolt = <1800000>; + regulator-max-microvolt = <1800000>; + }; + vreg_l8a_1p2: l8 { + regulator-min-microvolt = <1200000>; + regulator-max-microvolt = <1200000>; + }; + vreg_l9a_1p8: l9 { + regulator-min-microvolt = <1808000>; + regulator-max-microvolt = <2960000>; + }; + vreg_l10a_1p8: l10 { + regulator-min-microvolt = <1808000>; + regulator-max-microvolt = <2960000>; + }; + vreg_l11a_1p0: l11 { + regulator-min-microvolt = <1000000>; + regulator-max-microvolt = <1000000>; + }; + vreg_l12a_1p8: l12 { + regulator-min-microvolt = <1800000>; + regulator-max-microvolt = <1800000>; + }; + vreg_l13a_2p95: l13 { + regulator-min-microvolt = <1808000>; + regulator-max-microvolt = <2960000>; + }; + vreg_l14a_1p88: l14 { + regulator-min-microvolt = <1880000>; + regulator-max-microvolt = <1880000>; + }; + vreg_15a_1p8: l15 { + regulator-min-microvolt = <1800000>; + regulator-max-microvolt = <1800000>; + }; + vreg_l16a_2p7: l16 { + regulator-min-microvolt = <2704000>; + regulator-max-microvolt = <2704000>; + }; + vreg_l17a_1p3: l17 { + regulator-min-microvolt = <1304000>; + regulator-max-microvolt = <1304000>; + }; + vreg_l18a_2p7: l18 { + regulator-min-microvolt = <2704000>; + regulator-max-microvolt = <2704000>; + }; + vreg_l19a_3p0: l19 { + regulator-min-microvolt = <3008000>; + regulator-max-microvolt = <3008000>; + }; + vreg_l20a_2p95: l20 { + regulator-min-microvolt = <2960000>; + regulator-max-microvolt = <2960000>; + regulator-allow-set-load; + }; + vreg_l21a_2p95: l21 { + regulator-min-microvolt = <2960000>; + regulator-max-microvolt = <2960000>; + regulator-allow-set-load; + regulator-system-load = <800000>; + }; + vreg_l22a_2p85: l22 { + regulator-min-microvolt = <2864000>; + regulator-max-microvolt = <2864000>; + }; + vreg_l23a_3p3: l23 { + regulator-min-microvolt = <3312000>; + regulator-max-microvolt = <3312000>; + }; + vreg_l24a_3p075: l24 { + regulator-min-microvolt = <3088000>; + regulator-max-microvolt = <3088000>; + }; + vreg_l25a_3p3: l25 { + regulator-min-microvolt = <3104000>; + regulator-max-microvolt = <3312000>; + }; + vreg_l26a_1p2: l26 { + regulator-min-microvolt = <1200000>; + regulator-max-microvolt = <1200000>; + }; + vreg_l28_3p0: l28 { + regulator-min-microvolt = <3008000>; + regulator-max-microvolt = <3008000>; + }; + + vreg_lvs1a_1p8: lvs1 { + regulator-min-microvolt = <1800000>; + regulator-max-microvolt = <1800000>; + }; + + vreg_lvs2a_1p8: lvs2 { + regulator-min-microvolt = <1800000>; + regulator-max-microvolt = <1800000>; + }; + + }; +}; + +&tlmm { + gpio-reserved-ranges = <0 4>, <81 4>; + + touchpad: touchpad { + config { + pins = "gpio123"; + bias-pull-up; /* pull up */ + }; + }; +}; + +&sdhc2 { + status = "okay"; + + vmmc-supply = <&vreg_l21a_2p95>; + vqmmc-supply = <&vreg_l13a_2p95>; + + pinctrl-names = "default", "sleep"; + pinctrl-0 = <&sdc2_clk_on &sdc2_cmd_on &sdc2_data_on &sdc2_cd_on>; + pinctrl-1 = <&sdc2_clk_off &sdc2_cmd_off &sdc2_data_off &sdc2_cd_off>; +}; + +&usb3 { + status = "okay"; +}; + +&usb3_dwc3 { + dr_mode = "host"; /* Force to host until we have Type-C hooked up */ +}; + +&usb3phy { + status = "okay"; + + vdda-phy-supply = <&vreg_l1a_0p875>; + vdda-pll-supply = <&vreg_l2a_1p2>; +}; diff --git a/arch/arm64/boot/dts/qcom/msm8998-lenovo-miix-630.dts b/arch/arm64/boot/dts/qcom/msm8998-lenovo-miix-630.dts new file mode 100644 index 0000000000000..407c6a32911cc --- /dev/null +++ b/arch/arm64/boot/dts/qcom/msm8998-lenovo-miix-630.dts @@ -0,0 +1,30 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2019, Jeffrey Hugo. All rights reserved. */ + +/dts-v1/; + +#include "msm8998-clamshell.dtsi" + +/ { + model = "Lenovo Miix 630"; + compatible = "lenovo,miix-630", "qcom,msm8998"; +}; + +&blsp1_i2c6 { + status = "okay"; + + keyboard@3a { + compatible = "hid-over-i2c"; + interrupt-parent = <&tlmm>; + interrupts = <0x79 IRQ_TYPE_LEVEL_LOW>; + reg = <0x3a>; + hid-descr-addr = <0x0001>; + + pinctrl-names = "default"; + pinctrl-0 = <&touchpad>; + }; +}; + +&sdhc2 { + cd-gpios = <&tlmm 95 GPIO_ACTIVE_HIGH>; +};
From: Jeffrey Hugo jeffrey.l.hugo@gmail.com
[ Upstream commit 3f527d311932791fde67ffec32536d22d5dd3030 ]
This adds the initial DT for the HP Envy x2 laptop. Supported functionality includes USB (host), microSD-card, keyboard, and trackpad.
Signed-off-by: Jeffrey Hugo jeffrey.l.hugo@gmail.com Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/Makefile | 1 + .../boot/dts/qcom/msm8998-hp-envy-x2.dts | 30 +++++++++++++++++++ 2 files changed, 31 insertions(+) create mode 100644 arch/arm64/boot/dts/qcom/msm8998-hp-envy-x2.dts
diff --git a/arch/arm64/boot/dts/qcom/Makefile b/arch/arm64/boot/dts/qcom/Makefile index c38ca859f2e02..28695e7c0cc3d 100644 --- a/arch/arm64/boot/dts/qcom/Makefile +++ b/arch/arm64/boot/dts/qcom/Makefile @@ -6,6 +6,7 @@ dtb-$(CONFIG_ARCH_QCOM) += msm8916-mtp.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8992-bullhead-rev-101.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8994-angler-rev-101.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8996-mtp.dtb +dtb-$(CONFIG_ARCH_QCOM) += msm8998-hp-envy-x2.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8998-lenovo-miix-630.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8998-mtp.dtb dtb-$(CONFIG_ARCH_QCOM) += sdm845-cheza-r1.dtb diff --git a/arch/arm64/boot/dts/qcom/msm8998-hp-envy-x2.dts b/arch/arm64/boot/dts/qcom/msm8998-hp-envy-x2.dts new file mode 100644 index 0000000000000..24073127091f6 --- /dev/null +++ b/arch/arm64/boot/dts/qcom/msm8998-hp-envy-x2.dts @@ -0,0 +1,30 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2019, Jeffrey Hugo. All rights reserved. */ + +/dts-v1/; + +#include "msm8998-clamshell.dtsi" + +/ { + model = "HP Envy x2"; + compatible = "hp,envy-x2", "qcom,msm8998"; +}; + +&blsp1_i2c6 { + status = "okay"; + + keyboard@3a { + compatible = "hid-over-i2c"; + interrupt-parent = <&tlmm>; + interrupts = <0x79 IRQ_TYPE_LEVEL_LOW>; + reg = <0x3a>; + hid-descr-addr = <0x0001>; + + pinctrl-names = "default"; + pinctrl-0 = <&touchpad>; + }; +}; + +&sdhc2 { + cd-gpios = <&tlmm 95 GPIO_ACTIVE_LOW>; +};
From: Jeffrey Hugo jeffrey.l.hugo@gmail.com
[ Upstream commit 722eb2f65acc4cebeb710fc7cc98f51513e90f1f ]
This adds the initial DT for the Asus NovaGo TP370QL laptop. Supported functionality includes USB (host), microSD-card, keyboard, and trackpad.
Signed-off-by: Jeffrey Hugo jeffrey.l.hugo@gmail.com Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/Makefile | 1 + .../dts/qcom/msm8998-asus-novago-tp370ql.dts | 47 +++++++++++++++++++ 2 files changed, 48 insertions(+) create mode 100644 arch/arm64/boot/dts/qcom/msm8998-asus-novago-tp370ql.dts
diff --git a/arch/arm64/boot/dts/qcom/Makefile b/arch/arm64/boot/dts/qcom/Makefile index 28695e7c0cc3d..954d75de617ba 100644 --- a/arch/arm64/boot/dts/qcom/Makefile +++ b/arch/arm64/boot/dts/qcom/Makefile @@ -6,6 +6,7 @@ dtb-$(CONFIG_ARCH_QCOM) += msm8916-mtp.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8992-bullhead-rev-101.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8994-angler-rev-101.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8996-mtp.dtb +dtb-$(CONFIG_ARCH_QCOM) += msm8998-asus-novago-tp370ql.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8998-hp-envy-x2.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8998-lenovo-miix-630.dtb dtb-$(CONFIG_ARCH_QCOM) += msm8998-mtp.dtb diff --git a/arch/arm64/boot/dts/qcom/msm8998-asus-novago-tp370ql.dts b/arch/arm64/boot/dts/qcom/msm8998-asus-novago-tp370ql.dts new file mode 100644 index 0000000000000..db5821be1e2fe --- /dev/null +++ b/arch/arm64/boot/dts/qcom/msm8998-asus-novago-tp370ql.dts @@ -0,0 +1,47 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2019, Jeffrey Hugo. All rights reserved. */ + +/dts-v1/; + +#include "msm8998-clamshell.dtsi" + +/ { + model = "Asus NovaGo TP370QL"; + compatible = "asus,novago-tp370ql", "qcom,msm8998"; +}; + +&blsp1_i2c6 { + status = "okay"; + + touchpad@15 { + compatible = "hid-over-i2c"; + interrupt-parent = <&tlmm>; + interrupts = <0x7b IRQ_TYPE_LEVEL_LOW>; + reg = <0x15>; + hid-descr-addr = <0x0001>; + + pinctrl-names = "default"; + pinctrl-0 = <&touchpad>; + }; + + keyboard@3a { + compatible = "hid-over-i2c"; + interrupt-parent = <&tlmm>; + interrupts = <0x25 IRQ_TYPE_LEVEL_LOW>; + reg = <0x3a>; + hid-descr-addr = <0x0001>; + }; +}; + +&sdhc2 { + cd-gpios = <&tlmm 95 GPIO_ACTIVE_HIGH>; +}; + +&tlmm { + touchpad: touchpad { + config { + pins = "gpio123"; + bias-pull-up; + }; + }; +};
From: Joe Perches joe@perches.com
[ Upstream commit 5ff29d836d1beb347080bd96e6321c811a8e3f62 ]
Arguments are supposed to be ordered high then low.
Signed-off-by: Joe Perches joe@perches.com Acked-by: Yan-Hsuan Chuang yhchuang@realtek.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/realtek/rtw88/rtw8822b.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/realtek/rtw88/rtw8822b.c b/drivers/net/wireless/realtek/rtw88/rtw8822b.c index 1172f6c0605b3..d61d534396c73 100644 --- a/drivers/net/wireless/realtek/rtw88/rtw8822b.c +++ b/drivers/net/wireless/realtek/rtw88/rtw8822b.c @@ -997,7 +997,7 @@ static void rtw8822b_do_iqk(struct rtw_dev *rtwdev) rtw_write_rf(rtwdev, RF_PATH_A, RF_DTXLOK, RFREG_MASK, 0x0);
reload = !!rtw_read32_mask(rtwdev, REG_IQKFAILMSK, BIT(16)); - iqk_fail_mask = rtw_read32_mask(rtwdev, REG_IQKFAILMSK, GENMASK(0, 7)); + iqk_fail_mask = rtw_read32_mask(rtwdev, REG_IQKFAILMSK, GENMASK(7, 0)); rtw_dbg(rtwdev, RTW_DBG_PHY, "iqk counter=%d reload=%d do_iqk_cnt=%d n_iqk_fail(mask)=0x%02x\n", counter, reload, ++do_iqk_cnt, iqk_fail_mask);
From: Sebastian Ott sebott@linux.ibm.com
[ Upstream commit cf2c4a3f35b75d38cebb4afbd578f1594f068d1e ]
After recent changes the MSI message data needs to specify the function-relative IRQ number.
Reported-and-tested-by: Alexander Schmidt alexs@linux.ibm.com Signed-off-by: Sebastian Ott sebott@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/pci/pci_irq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c index d80616ae8dd8a..fbe97ab2e2286 100644 --- a/arch/s390/pci/pci_irq.c +++ b/arch/s390/pci/pci_irq.c @@ -284,7 +284,7 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type) return rc; irq_set_chip_and_handler(irq, &zpci_irq_chip, handle_percpu_irq); - msg.data = hwirq; + msg.data = hwirq - bit; if (irq_delivery == DIRECTED) { msg.address_lo = zdev->msi_addr & 0xff0000ff; msg.address_lo |= msi->affinity ?
From: Mika Westerberg mika.westerberg@linux.intel.com
[ Upstream commit ce19f91eae43e39d5a1da55344756ab5a3c7e8d1 ]
PCIe tunnel path indices got mixed up when we added support for tunnels between switches that are not adjacent. This did not affect the functionality as it is just an index but fix it now nevertheless to make the code easier to understand.
Reported-by: Rajmohan Mani rajmohan.mani@intel.com Fixes: 8c7acaaf020f ("thunderbolt: Extend tunnel creation to more than 2 adjacent switches") Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Reviewed-by: Yehezkel Bernat YehezkelShB@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thunderbolt/tunnel.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/thunderbolt/tunnel.c b/drivers/thunderbolt/tunnel.c index 31d0234837e45..5a99234826e73 100644 --- a/drivers/thunderbolt/tunnel.c +++ b/drivers/thunderbolt/tunnel.c @@ -211,7 +211,7 @@ struct tb_tunnel *tb_tunnel_alloc_pci(struct tb *tb, struct tb_port *up, return NULL; } tb_pci_init_path(path); - tunnel->paths[TB_PCI_PATH_UP] = path; + tunnel->paths[TB_PCI_PATH_DOWN] = path;
path = tb_path_alloc(tb, up, TB_PCI_HOPID, down, TB_PCI_HOPID, 0, "PCIe Up"); @@ -220,7 +220,7 @@ struct tb_tunnel *tb_tunnel_alloc_pci(struct tb *tb, struct tb_port *up, return NULL; } tb_pci_init_path(path); - tunnel->paths[TB_PCI_PATH_DOWN] = path; + tunnel->paths[TB_PCI_PATH_UP] = path;
return tunnel; }
From: Mika Westerberg mika.westerberg@linux.intel.com
[ Upstream commit 943795219d3cb9f8ce6ce51cad3ffe1f61e95c6b ]
The register access should be using 32-bit reads/writes according to the datasheet. With the previous generation hardware 16-bit writes have been working but starting with ICL this is not the case anymore so fix producer/consumer register update to use correct width register address.
Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Reviewed-by: Yehezkel Bernat YehezkelShB@gmail.com Tested-by: Mario Limonciello mario.limonciello@dell.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thunderbolt/nhi.c | 22 ++++++++++++++++++---- 1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c index 27fbe62c7ddd4..9c782706e652f 100644 --- a/drivers/thunderbolt/nhi.c +++ b/drivers/thunderbolt/nhi.c @@ -143,9 +143,20 @@ static void __iomem *ring_options_base(struct tb_ring *ring) return io; }
-static void ring_iowrite16desc(struct tb_ring *ring, u32 value, u32 offset) +static void ring_iowrite_cons(struct tb_ring *ring, u16 cons) { - iowrite16(value, ring_desc_base(ring) + offset); + /* + * The other 16-bits in the register is read-only and writes to it + * are ignored by the hardware so we can save one ioread32() by + * filling the read-only bits with zeroes. + */ + iowrite32(cons, ring_desc_base(ring) + 8); +} + +static void ring_iowrite_prod(struct tb_ring *ring, u16 prod) +{ + /* See ring_iowrite_cons() above for explanation */ + iowrite32(prod << 16, ring_desc_base(ring) + 8); }
static void ring_iowrite32desc(struct tb_ring *ring, u32 value, u32 offset) @@ -197,7 +208,10 @@ static void ring_write_descriptors(struct tb_ring *ring) descriptor->sof = frame->sof; } ring->head = (ring->head + 1) % ring->size; - ring_iowrite16desc(ring, ring->head, ring->is_tx ? 10 : 8); + if (ring->is_tx) + ring_iowrite_prod(ring, ring->head); + else + ring_iowrite_cons(ring, ring->head); } }
@@ -662,7 +676,7 @@ void tb_ring_stop(struct tb_ring *ring)
ring_iowrite32options(ring, 0, 0); ring_iowrite64desc(ring, 0, 0); - ring_iowrite16desc(ring, 0, ring->is_tx ? 10 : 8); + ring_iowrite32desc(ring, 0, 8); ring_iowrite32desc(ring, 0, 12); ring->head = 0; ring->tail = 0;
From: Hui Peng benquike@gmail.com
[ Upstream commit 39d170b3cb62ba98567f5c4f40c27b5864b304e5 ]
The `ar_usb` field of `ath6kl_usb_pipe_usb_pipe` objects are initialized to point to the containing `ath6kl_usb` object according to endpoint descriptors read from the device side, as shown below in `ath6kl_usb_setup_pipe_resources`:
for (i = 0; i < iface_desc->desc.bNumEndpoints; ++i) { endpoint = &iface_desc->endpoint[i].desc;
// get the address from endpoint descriptor pipe_num = ath6kl_usb_get_logical_pipe_num(ar_usb, endpoint->bEndpointAddress, &urbcount); ...... // select the pipe object pipe = &ar_usb->pipes[pipe_num];
// initialize the ar_usb field pipe->ar_usb = ar_usb; }
The driver assumes that the addresses reported in endpoint descriptors from device side to be complete. If a device is malicious and does not report complete addresses, it may trigger NULL-ptr-deref `ath6kl_usb_alloc_urb_from_pipe` and `ath6kl_usb_free_urb_to_pipe`.
This patch fixes the bug by preventing potential NULL-ptr-deref (CVE-2019-15098).
Signed-off-by: Hui Peng benquike@gmail.com Reported-by: Hui Peng benquike@gmail.com Reported-by: Mathias Payer mathias.payer@nebelwelt.net Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath6kl/usb.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/net/wireless/ath/ath6kl/usb.c b/drivers/net/wireless/ath/ath6kl/usb.c index 4defb7a0330f4..53b66e9434c99 100644 --- a/drivers/net/wireless/ath/ath6kl/usb.c +++ b/drivers/net/wireless/ath/ath6kl/usb.c @@ -132,6 +132,10 @@ ath6kl_usb_alloc_urb_from_pipe(struct ath6kl_usb_pipe *pipe) struct ath6kl_urb_context *urb_context = NULL; unsigned long flags;
+ /* bail if this pipe is not initialized */ + if (!pipe->ar_usb) + return NULL; + spin_lock_irqsave(&pipe->ar_usb->cs_lock, flags); if (!list_empty(&pipe->urb_list_head)) { urb_context = @@ -150,6 +154,10 @@ static void ath6kl_usb_free_urb_to_pipe(struct ath6kl_usb_pipe *pipe, { unsigned long flags;
+ /* bail if this pipe is not initialized */ + if (!pipe->ar_usb) + return; + spin_lock_irqsave(&pipe->ar_usb->cs_lock, flags); pipe->urb_cnt++;
From: Miklos Szeredi mszeredi@redhat.com
commit b24e7598db62386a95a3c8b9c75630c5d56fe077 upstream.
If writeback cache is enabled, then writes might get reordered with chmod/chown/utimes. The problem with this is that performing the write in the fuse daemon might itself change some of these attributes. In such case the following sequence of operations will result in file ending up with the wrong mode, for example:
int fd = open ("suid", O_WRONLY|O_CREAT|O_EXCL); write (fd, "1", 1); fchown (fd, 0, 0); fchmod (fd, 04755); close (fd);
This patch fixes this by flushing pending writes before performing chown/chmod/utimes.
Reported-by: Giuseppe Scrivano gscrivan@redhat.com Tested-by: Giuseppe Scrivano gscrivan@redhat.com Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/dir.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -1476,6 +1476,19 @@ int fuse_do_setattr(struct dentry *dentr is_truncate = true; }
+ /* Flush dirty data/metadata before non-truncate SETATTR */ + if (is_wb && S_ISREG(inode->i_mode) && + attr->ia_valid & + (ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_MTIME_SET | + ATTR_TIMES_SET)) { + err = write_inode_now(inode, true); + if (err) + return err; + + fuse_set_nowrite(inode); + fuse_release_nowrite(inode); + } + if (is_truncate) { fuse_set_nowrite(inode); set_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
From: Miklos Szeredi mszeredi@redhat.com
commit e4648309b85a78f8c787457832269a8712a8673e upstream.
Make sure cached writes are not reordered around open(..., O_TRUNC), with the obvious wrong results.
Fixes: 4d99ff8f12eb ("fuse: Turn writeback cache on") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/fuse/file.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -201,7 +201,7 @@ int fuse_open_common(struct inode *inode { struct fuse_conn *fc = get_fuse_conn(inode); int err; - bool lock_inode = (file->f_flags & O_TRUNC) && + bool is_wb_truncate = (file->f_flags & O_TRUNC) && fc->atomic_o_trunc && fc->writeback_cache;
@@ -209,16 +209,20 @@ int fuse_open_common(struct inode *inode if (err) return err;
- if (lock_inode) + if (is_wb_truncate) { inode_lock(inode); + fuse_set_nowrite(inode); + }
err = fuse_do_open(fc, get_node_id(inode), file, isdir);
if (!err) fuse_finish_open(inode, file);
- if (lock_inode) + if (is_wb_truncate) { + fuse_release_nowrite(inode); inode_unlock(inode); + }
return err; }
From: Takashi Sakamoto o-takashi@sakamocchi.jp
commit f2bbdbcb075f3977a53da3bdcb7cd460bc8ae5f2 upstream.
A helper function of ALSA bebob driver returns negative value in a function which has a prototype to return unsigned value.
This commit fixes it by changing the prototype.
Fixes: eb7b3a056cd8 ("ALSA: bebob: Add commands and connections/streams management") Cc: stable@vger.kernel.org # v3.16+ Signed-off-by: Takashi Sakamoto o-takashi@sakamocchi.jp Link: https://lore.kernel.org/r/20191026030620.12077-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/firewire/bebob/bebob_stream.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/sound/firewire/bebob/bebob_stream.c +++ b/sound/firewire/bebob/bebob_stream.c @@ -252,8 +252,7 @@ end: return err; }
-static unsigned int -map_data_channels(struct snd_bebob *bebob, struct amdtp_stream *s) +static int map_data_channels(struct snd_bebob *bebob, struct amdtp_stream *s) { unsigned int sec, sections, ch, channels; unsigned int pcm, midi, location;
From: Takashi Iwai tiwai@suse.de
commit a39331867335d4a94b6165e306265c9e24aca073 upstream.
When a card is disconnected while in use, the system waits until all opened files are closed then releases the card. This is done via put_device() of the card device in each device release code.
The recently reported mutex deadlock bug happens in this code path; snd_timer_close() for the timer device deals with the global register_mutex and it calls put_device() there. When this timer device is the last one, the card gets freed and it eventually calls snd_timer_free(), which has again the protection with the global register_mutex -- boom.
Basically put_device() call itself is race-free, so a relative simple workaround is to move this put_device() call out of the mutex. For achieving that, in this patch, snd_timer_close_locked() got a new argument to store the card device pointer in return, and each caller invokes put_device() with the returned object after the mutex unlock.
Reported-and-tested-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/core/timer.c | 24 +++++++++++++++++------- 1 file changed, 17 insertions(+), 7 deletions(-)
--- a/sound/core/timer.c +++ b/sound/core/timer.c @@ -226,7 +226,8 @@ static int snd_timer_check_master(struct return 0; }
-static int snd_timer_close_locked(struct snd_timer_instance *timeri); +static int snd_timer_close_locked(struct snd_timer_instance *timeri, + struct device **card_devp_to_put);
/* * open a timer instance @@ -238,6 +239,7 @@ int snd_timer_open(struct snd_timer_inst { struct snd_timer *timer; struct snd_timer_instance *timeri = NULL; + struct device *card_dev_to_put = NULL; int err;
mutex_lock(®ister_mutex); @@ -261,7 +263,7 @@ int snd_timer_open(struct snd_timer_inst list_add_tail(&timeri->open_list, &snd_timer_slave_list); err = snd_timer_check_slave(timeri); if (err < 0) { - snd_timer_close_locked(timeri); + snd_timer_close_locked(timeri, &card_dev_to_put); timeri = NULL; } goto unlock; @@ -313,7 +315,7 @@ int snd_timer_open(struct snd_timer_inst timeri = NULL;
if (timer->card) - put_device(&timer->card->card_dev); + card_dev_to_put = &timer->card->card_dev; module_put(timer->module); goto unlock; } @@ -323,12 +325,15 @@ int snd_timer_open(struct snd_timer_inst timer->num_instances++; err = snd_timer_check_master(timeri); if (err < 0) { - snd_timer_close_locked(timeri); + snd_timer_close_locked(timeri, &card_dev_to_put); timeri = NULL; }
unlock: mutex_unlock(®ister_mutex); + /* put_device() is called after unlock for avoiding deadlock */ + if (card_dev_to_put) + put_device(card_dev_to_put); *ti = timeri; return err; } @@ -338,7 +343,8 @@ EXPORT_SYMBOL(snd_timer_open); * close a timer instance * call this with register_mutex down. */ -static int snd_timer_close_locked(struct snd_timer_instance *timeri) +static int snd_timer_close_locked(struct snd_timer_instance *timeri, + struct device **card_devp_to_put) { struct snd_timer *timer = timeri->timer; struct snd_timer_instance *slave, *tmp; @@ -395,7 +401,7 @@ static int snd_timer_close_locked(struct timer->hw.close(timer); /* release a card refcount for safe disconnection */ if (timer->card) - put_device(&timer->card->card_dev); + *card_devp_to_put = &timer->card->card_dev; module_put(timer->module); }
@@ -407,14 +413,18 @@ static int snd_timer_close_locked(struct */ int snd_timer_close(struct snd_timer_instance *timeri) { + struct device *card_dev_to_put = NULL; int err;
if (snd_BUG_ON(!timeri)) return -ENXIO;
mutex_lock(®ister_mutex); - err = snd_timer_close_locked(timeri); + err = snd_timer_close_locked(timeri, &card_dev_to_put); mutex_unlock(®ister_mutex); + /* put_device() is called after unlock for avoiding deadlock */ + if (card_dev_to_put) + put_device(card_dev_to_put); return err; } EXPORT_SYMBOL(snd_timer_close);
From: Aaron Ma aaron.ma@canonical.com
commit 8a6c55d0f883e9a7e7c91841434f3b6bbf932bb2 upstream.
These 2 ThinkCentres installed a new realtek codec ID 0x623, it has 2 front mics with the same location on pin 0x18 and 0x19.
Apply fixup ALC283_FIXUP_HEADSET_MIC to change 1 front mic location to right, then pulseaudio can handle them. One "Front Mic" and one "Mic" will be shown, and audio output works fine.
Signed-off-by: Aaron Ma aaron.ma@canonical.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191024114439.31522-1-aaron.ma@canonical.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -7187,6 +7187,8 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x17aa, 0x312f, "ThinkCentre Station", ALC294_FIXUP_LENOVO_MIC_LOCATION), SND_PCI_QUIRK(0x17aa, 0x313c, "ThinkCentre Station", ALC294_FIXUP_LENOVO_MIC_LOCATION), SND_PCI_QUIRK(0x17aa, 0x3151, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC), + SND_PCI_QUIRK(0x17aa, 0x3176, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC), + SND_PCI_QUIRK(0x17aa, 0x3178, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x17aa, 0x3902, "Lenovo E50-80", ALC269_FIXUP_DMIC_THINKPAD_ACPI), SND_PCI_QUIRK(0x17aa, 0x3977, "IdeaPad S210", ALC283_FIXUP_INT_MIC), SND_PCI_QUIRK(0x17aa, 0x3978, "Lenovo B50-70", ALC269_FIXUP_DMIC_THINKPAD_ACPI),
From: Kailang Yang kailang@realtek.com
commit f0778871a13889b86a65d4ad34bef8340af9d082 upstream.
Support new codec ALC623.
Signed-off-by: Kailang Yang kailang@realtek.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/ed97b6a8bd9445ecb48bc763d9aaba7a@realtek.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/pci/hda/patch_realtek.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -409,6 +409,9 @@ static void alc_fill_eapd_coef(struct hd case 0x10ec0672: alc_update_coef_idx(codec, 0xd, 0, 1<<14); /* EAPD Ctrl */ break; + case 0x10ec0623: + alc_update_coef_idx(codec, 0x19, 1<<13, 0); + break; case 0x10ec0668: alc_update_coef_idx(codec, 0x7, 3<<13, 0); break; @@ -2919,6 +2922,7 @@ enum { ALC269_TYPE_ALC225, ALC269_TYPE_ALC294, ALC269_TYPE_ALC300, + ALC269_TYPE_ALC623, ALC269_TYPE_ALC700, };
@@ -2954,6 +2958,7 @@ static int alc269_parse_auto_config(stru case ALC269_TYPE_ALC225: case ALC269_TYPE_ALC294: case ALC269_TYPE_ALC300: + case ALC269_TYPE_ALC623: case ALC269_TYPE_ALC700: ssids = alc269_ssids; break; @@ -7976,6 +7981,9 @@ static int patch_alc269(struct hda_codec spec->codec_variant = ALC269_TYPE_ALC300; spec->gen.mixer_nid = 0; /* no loopback on ALC300 */ break; + case 0x10ec0623: + spec->codec_variant = ALC269_TYPE_ALC623; + break; case 0x10ec0700: case 0x10ec0701: case 0x10ec0703: @@ -9103,6 +9111,7 @@ static const struct hda_device_id snd_hd HDA_CODEC_ENTRY(0x10ec0298, "ALC298", patch_alc269), HDA_CODEC_ENTRY(0x10ec0299, "ALC299", patch_alc269), HDA_CODEC_ENTRY(0x10ec0300, "ALC300", patch_alc269), + HDA_CODEC_ENTRY(0x10ec0623, "ALC623", patch_alc269), HDA_CODEC_REV_ENTRY(0x10ec0861, 0x100340, "ALC660", patch_alc861), HDA_CODEC_ENTRY(0x10ec0660, "ALC660-VD", patch_alc861vd), HDA_CODEC_ENTRY(0x10ec0861, "ALC861", patch_alc861),
From: Miaoqing Pan miaoqing@codeaurora.org
commit d79749f7716d9dc32fa2d5075f6ec29aac63c76d upstream.
(kvalo: cherry picked from commit 1340cc631bd00431e2f174525c971f119df9efa1 in wireless-drivers-next to wireless-drivers as this a frequently reported regression)
Bad latency is found on QCA988x, the issue was introduced by commit 4504f0e5b571 ("ath10k: sdio: workaround firmware UART pin configuration bug"). If uart_pin_workaround is false, this change will set uart pin even if uart_print is false.
Tested HW: QCA9880 Tested FW: 10.2.4-1.0-00037
Fixes: 4504f0e5b571 ("ath10k: sdio: workaround firmware UART pin configuration bug") Signed-off-by: Miaoqing Pan miaoqing@codeaurora.org Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/ath/ath10k/core.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
--- a/drivers/net/wireless/ath/ath10k/core.c +++ b/drivers/net/wireless/ath/ath10k/core.c @@ -2118,12 +2118,15 @@ static int ath10k_init_uart(struct ath10 return ret; }
- if (!uart_print && ar->hw_params.uart_pin_workaround) { - ret = ath10k_bmi_write32(ar, hi_dbg_uart_txpin, - ar->hw_params.uart_pin); - if (ret) { - ath10k_warn(ar, "failed to set UART TX pin: %d", ret); - return ret; + if (!uart_print) { + if (ar->hw_params.uart_pin_workaround) { + ret = ath10k_bmi_write32(ar, hi_dbg_uart_txpin, + ar->hw_params.uart_pin); + if (ret) { + ath10k_warn(ar, "failed to set UART TX pin: %d", + ret); + return ret; + } }
return 0;
From: Alan Stern stern@rowland.harvard.edu
commit 1186f86a71130a7635a20843e355bb880c7349b2 upstream.
Commit 3ae62a42090f ("UAS: fix alignment of scatter/gather segments"), copying a similar commit for usb-storage, attempted to solve a problem involving scatter-gather I/O and USB/IP by setting the virt_boundary_mask for mass-storage devices.
However, it now turns out that the analogous change in usb-storage interacted badly with commit 09324d32d2a0 ("block: force an unlimited segment size on queues with a virt boundary"), which was added later. A typical error message is:
ehci-pci 0000:00:13.2: swiotlb buffer is full (sz: 327680 bytes), total 32768 (slots), used 97 (slots)
There is no longer any reason to keep the virt_boundary_mask setting in the uas driver. It was needed in the first place only for handling devices with a block size smaller than the maxpacket size and where the host controller was not capable of fully general scatter-gather operation (that is, able to merge two SG segments into a single USB packet). But:
High-speed or slower connections never use a bulk maxpacket value larger than 512;
The SCSI layer does not handle block devices with a block size smaller than 512 bytes;
All the host controllers capable of SuperSpeed operation can handle fully general SG;
Since commit ea44d190764b ("usbip: Implement SG support to vhci-hcd and stub driver") was merged, the USB/IP driver can also handle SG.
Therefore all supported device/controller combinations should be okay with no need for any special virt_boundary_mask. So in order to head off potential problems similar to those affecting usb-storage, this patch reverts commit 3ae62a42090f.
Signed-off-by: Alan Stern stern@rowland.harvard.edu CC: Oliver Neukum oneukum@suse.com CC: stable@vger.kernel.org Acked-by: Christoph Hellwig hch@lst.de Fixes: 3ae62a42090f ("UAS: fix alignment of scatter/gather segments") Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910231132470.1878-100000@iolanthe... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/storage/uas.c | 20 -------------------- 1 file changed, 20 deletions(-)
--- a/drivers/usb/storage/uas.c +++ b/drivers/usb/storage/uas.c @@ -789,30 +789,10 @@ static int uas_slave_alloc(struct scsi_d { struct uas_dev_info *devinfo = (struct uas_dev_info *)sdev->host->hostdata; - int maxp;
sdev->hostdata = devinfo;
/* - * We have two requirements here. We must satisfy the requirements - * of the physical HC and the demands of the protocol, as we - * definitely want no additional memory allocation in this path - * ruling out using bounce buffers. - * - * For a transmission on USB to continue we must never send - * a package that is smaller than maxpacket. Hence the length of each - * scatterlist element except the last must be divisible by the - * Bulk maxpacket value. - * If the HC does not ensure that through SG, - * the upper layer must do that. We must assume nothing - * about the capabilities off the HC, so we use the most - * pessimistic requirement. - */ - - maxp = usb_maxpacket(devinfo->udev, devinfo->data_in_pipe, 0); - blk_queue_virt_boundary(sdev->request_queue, maxp - 1); - - /* * The protocol has no requirements on alignment in the strict sense. * Controllers may or may not have alignment restrictions. * As this is not exported, we use an extremely conservative guess.
From: Markus Theil markus.theil@tu-ilmenau.de
commit 1fab1b89e2e8f01204a9c05a39fd0b6411a48593 upstream.
Mesh path nexthop should be a ethernet address, but current validation checks against 4 byte integers.
Cc: stable@vger.kernel.org Fixes: 2ec600d672e74 ("nl80211/cfg80211: support for mesh, sta dumping") Signed-off-by: Markus Theil markus.theil@tu-ilmenau.de Link: https://lore.kernel.org/r/20191029093003.10355-1-markus.theil@tu-ilmenau.de Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/wireless/nl80211.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -377,7 +377,7 @@ const struct nla_policy nl80211_policy[N [NL80211_ATTR_MNTR_FLAGS] = { /* NLA_NESTED can't be empty */ }, [NL80211_ATTR_MESH_ID] = { .type = NLA_BINARY, .len = IEEE80211_MAX_MESH_ID_LEN }, - [NL80211_ATTR_MPATH_NEXT_HOP] = { .type = NLA_U32 }, + [NL80211_ATTR_MPATH_NEXT_HOP] = NLA_POLICY_ETH_ADDR_COMPAT,
[NL80211_ATTR_REG_ALPHA2] = { .type = NLA_STRING, .len = 2 }, [NL80211_ATTR_REG_RULES] = { .type = NLA_NESTED },
From: Alan Stern stern@rowland.harvard.edu
commit 54f83b8c8ea9b22082a496deadf90447a326954e upstream.
Endpoints with a maxpacket length of 0 are probably useless. They can't transfer any data, and it's not at all unlikely that a UDC will crash or hang when trying to handle a non-zero-length usb_request for such an endpoint. Indeed, dummy-hcd gets a divide error when trying to calculate the remainder of a transfer length by the maxpacket value, as discovered by the syzbot fuzzer.
Currently the gadget core does not check for endpoints having a maxpacket value of 0. This patch adds a check to usb_ep_enable(), preventing such endpoints from being used.
As far as I know, none of the gadget drivers in the kernel tries to create an endpoint with maxpacket = 0, but until now there has been nothing to prevent userspace programs under gadgetfs or configfs from doing it.
Signed-off-by: Alan Stern stern@rowland.harvard.edu Reported-and-tested-by: syzbot+8ab8bf161038a8768553@syzkaller.appspotmail.com CC: stable@vger.kernel.org Acked-by: Felipe Balbi balbi@kernel.org Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910281052370.1485-100000@iolanthe... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/gadget/udc/core.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/drivers/usb/gadget/udc/core.c +++ b/drivers/usb/gadget/udc/core.c @@ -98,6 +98,17 @@ int usb_ep_enable(struct usb_ep *ep) if (ep->enabled) goto out;
+ /* UDC drivers can't handle endpoints with maxpacket size 0 */ + if (usb_endpoint_maxp(ep->desc) == 0) { + /* + * We should log an error message here, but we can't call + * dev_err() because there's no way to find the gadget + * given only ep. + */ + ret = -EINVAL; + goto out; + } + ret = ep->ops->enable(ep, ep->desc); if (ret) goto out;
From: Alan Stern stern@rowland.harvard.edu
commit 9a976949613132977098fc49510b46fa8678d864 upstream.
Commit 747668dbc061 ("usb-storage: Set virt_boundary_mask to avoid SG overflows") attempted to solve a problem involving scatter-gather I/O and USB/IP by setting the virt_boundary_mask for mass-storage devices.
However, it now turns out that this interacts badly with commit 09324d32d2a0 ("block: force an unlimited segment size on queues with a virt boundary"), which was added later. A typical error message is:
ehci-pci 0000:00:13.2: swiotlb buffer is full (sz: 327680 bytes), total 32768 (slots), used 97 (slots)
There is no longer any reason to keep the virt_boundary_mask setting for usb-storage. It was needed in the first place only for handling devices with a block size smaller than the maxpacket size and where the host controller was not capable of fully general scatter-gather operation (that is, able to merge two SG segments into a single USB packet). But:
High-speed or slower connections never use a bulk maxpacket value larger than 512;
The SCSI layer does not handle block devices with a block size smaller than 512 bytes;
All the host controllers capable of SuperSpeed operation can handle fully general SG;
Since commit ea44d190764b ("usbip: Implement SG support to vhci-hcd and stub driver") was merged, the USB/IP driver can also handle SG.
Therefore all supported device/controller combinations should be okay with no need for any special virt_boundary_mask. So in order to fix the swiotlb problem, this patch reverts commit 747668dbc061.
Reported-and-tested-by: Piergiorgio Sartor piergiorgio.sartor@nexgo.de Link: https://marc.info/?l=linux-usb&m=157134199501202&w=2 Signed-off-by: Alan Stern stern@rowland.harvard.edu CC: Seth Bollinger Seth.Bollinger@digi.com CC: stable@vger.kernel.org Fixes: 747668dbc061 ("usb-storage: Set virt_boundary_mask to avoid SG overflows") Acked-by: Christoph Hellwig hch@lst.de Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910211145520.1673-100000@iolanthe... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/storage/scsiglue.c | 10 ---------- 1 file changed, 10 deletions(-)
--- a/drivers/usb/storage/scsiglue.c +++ b/drivers/usb/storage/scsiglue.c @@ -67,7 +67,6 @@ static const char* host_info(struct Scsi static int slave_alloc (struct scsi_device *sdev) { struct us_data *us = host_to_us(sdev->host); - int maxp;
/* * Set the INQUIRY transfer length to 36. We don't use any of @@ -77,15 +76,6 @@ static int slave_alloc (struct scsi_devi sdev->inquiry_len = 36;
/* - * USB has unusual scatter-gather requirements: the length of each - * scatterlist element except the last must be divisible by the - * Bulk maxpacket value. Fortunately this value is always a - * power of 2. Inform the block layer about this requirement. - */ - maxp = usb_maxpacket(us->pusb_dev, us->recv_bulk_pipe, 0); - blk_queue_virt_boundary(sdev->request_queue, maxp - 1); - - /* * Some host controllers may have alignment requirements. * We'll play it safe by requiring 512-byte alignment always. */
From: Johan Hovold johan@kernel.org
commit d98ee2a19c3334e9343df3ce254b496f1fc428eb upstream.
The custom ring-buffer implementation was merged without any locking or explicit memory barriers, but a spinlock was later added by commit 9d33efd9a791 ("USB: ldusb bugfix").
The lock did not cover the update of the tail index once the entry had been processed, something which could lead to memory corruption on weakly ordered architectures or due to compiler optimisations.
Specifically, a completion handler running on another CPU might observe the incremented tail index and update the entry before ld_usb_read() is done with it.
Fixes: 2824bd250f0b ("[PATCH] USB: add ldusb driver") Fixes: 9d33efd9a791 ("USB: ldusb bugfix") Cc: stable stable@vger.kernel.org # 2.6.13 Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20191022143203.5260-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/misc/ldusb.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/misc/ldusb.c +++ b/drivers/usb/misc/ldusb.c @@ -495,11 +495,11 @@ static ssize_t ld_usb_read(struct file * retval = -EFAULT; goto unlock_exit; } - dev->ring_tail = (dev->ring_tail+1) % ring_buffer_size; - retval = bytes_to_read;
spin_lock_irq(&dev->rbsl); + dev->ring_tail = (dev->ring_tail + 1) % ring_buffer_size; + if (dev->buffer_overflow) { dev->buffer_overflow = 0; spin_unlock_irq(&dev->rbsl);
From: Johan Hovold johan@kernel.org
commit 52403cfbc635d28195167618690595013776ebde upstream.
USB control-message timeouts are specified in milliseconds, not jiffies. Waiting 83 minutes for a transfer to complete is a bit excessive.
Fixes: 2824bd250f0b ("[PATCH] USB: add ldusb driver") Cc: stable stable@vger.kernel.org # 2.6.13 Reported-by: syzbot+a4fbb3bb76cda0ea4e58@syzkaller.appspotmail.com Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20191022153127.22295-1-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/misc/ldusb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/usb/misc/ldusb.c +++ b/drivers/usb/misc/ldusb.c @@ -580,7 +580,7 @@ static ssize_t ld_usb_write(struct file 1 << 8, 0, dev->interrupt_out_buffer, bytes_to_write, - USB_CTRL_SET_TIMEOUT * HZ); + USB_CTRL_SET_TIMEOUT); if (retval < 0) dev_err(&dev->intf->dev, "Couldn't submit HID_REQ_SET_REPORT %d\n",
From: Samuel Holland samuel@sholland.org
commit bfa3dbb343f664573292afb9e44f9abeb81a19de upstream.
The arguments to queue_trb are always byteswapped to LE for placement in the ring, but this should not happen in the case of immediate data; the bytes copied out of transfer_buffer are already in the correct order. Add a complementary byteswap so the bytes end up in the ring correctly.
This was observed on BE ppc64 with a "Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller [104c:8241]" as a ch341 usb-serial adapter ("1a86:7523 QinHeng Electronics HL-340 USB-Serial adapter") always transmitting the same character (generally NUL) over the serial link regardless of the key pressed.
Cc: stable@vger.kernel.org # 5.2+ Fixes: 33e39350ebd2 ("usb: xhci: add Immediate Data Transfer support") Signed-off-by: Samuel Holland samuel@sholland.org Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/1572013829-14044-3-git-send-email-mathias.nyman@li... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/host/xhci-ring.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -3330,6 +3330,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd * if (xhci_urb_suitable_for_idt(urb)) { memcpy(&send_addr, urb->transfer_buffer, trb_buff_len); + le64_to_cpus(&send_addr); field |= TRB_IDT; } } @@ -3475,6 +3476,7 @@ int xhci_queue_ctrl_tx(struct xhci_hcd * if (xhci_urb_suitable_for_idt(urb)) { memcpy(&addr, urb->transfer_buffer, urb->transfer_buffer_length); + le64_to_cpus(&addr); field |= TRB_IDT; } else { addr = (u64) urb->transfer_dma;
From: Ben Dooks (Codethink) ben.dooks@codethink.co.uk
commit d5501d5c29a2e684640507cfee428178d6fd82ca upstream.
It looks like some of the xhci debug code is passing u32 to functions directly from __le32/__le64 fields. Fix this by using le{32,64}_to_cpu() on these to fix the following sparse warnings;
xhci-debugfs.c:205:62: warning: incorrect type in argument 1 (different base types) xhci-debugfs.c:205:62: expected unsigned int [usertype] field0 xhci-debugfs.c:205:62: got restricted __le32 xhci-debugfs.c:206:62: warning: incorrect type in argument 2 (different base types) xhci-debugfs.c:206:62: expected unsigned int [usertype] field1 xhci-debugfs.c:206:62: got restricted __le32 ...
[Trim down commit message, sparse warnings were similar -Mathias] Cc: stable@vger.kernel.org # 4.15+ Signed-off-by: Ben Dooks ben.dooks@codethink.co.uk Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/1572013829-14044-4-git-send-email-mathias.nyman@li... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/host/xhci-debugfs.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-)
--- a/drivers/usb/host/xhci-debugfs.c +++ b/drivers/usb/host/xhci-debugfs.c @@ -202,10 +202,10 @@ static void xhci_ring_dump_segment(struc trb = &seg->trbs[i]; dma = seg->dma + i * sizeof(*trb); seq_printf(s, "%pad: %s\n", &dma, - xhci_decode_trb(trb->generic.field[0], - trb->generic.field[1], - trb->generic.field[2], - trb->generic.field[3])); + xhci_decode_trb(le32_to_cpu(trb->generic.field[0]), + le32_to_cpu(trb->generic.field[1]), + le32_to_cpu(trb->generic.field[2]), + le32_to_cpu(trb->generic.field[3]))); } }
@@ -263,10 +263,10 @@ static int xhci_slot_context_show(struct xhci = hcd_to_xhci(bus_to_hcd(dev->udev->bus)); slot_ctx = xhci_get_slot_ctx(xhci, dev->out_ctx); seq_printf(s, "%pad: %s\n", &dev->out_ctx->dma, - xhci_decode_slot_context(slot_ctx->dev_info, - slot_ctx->dev_info2, - slot_ctx->tt_info, - slot_ctx->dev_state)); + xhci_decode_slot_context(le32_to_cpu(slot_ctx->dev_info), + le32_to_cpu(slot_ctx->dev_info2), + le32_to_cpu(slot_ctx->tt_info), + le32_to_cpu(slot_ctx->dev_state)));
return 0; } @@ -286,10 +286,10 @@ static int xhci_endpoint_context_show(st ep_ctx = xhci_get_ep_ctx(xhci, dev->out_ctx, dci); dma = dev->out_ctx->dma + dci * CTX_SIZE(xhci->hcc_params); seq_printf(s, "%pad: %s\n", &dma, - xhci_decode_ep_context(ep_ctx->ep_info, - ep_ctx->ep_info2, - ep_ctx->deq, - ep_ctx->tx_info)); + xhci_decode_ep_context(le32_to_cpu(ep_ctx->ep_info), + le32_to_cpu(ep_ctx->ep_info2), + le64_to_cpu(ep_ctx->deq), + le32_to_cpu(ep_ctx->tx_info))); }
return 0;
From: Johan Hovold johan@kernel.org
commit 1251dab9e0a2c4d0d2d48370ba5baa095a5e8774 upstream.
Fix a user-controlled slab buffer overflow due to a missing sanity check on the bulk-out transfer buffer used for control requests.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20191029102354.2733-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/serial/whiteheat.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/usb/serial/whiteheat.c +++ b/drivers/usb/serial/whiteheat.c @@ -559,6 +559,10 @@ static int firm_send_command(struct usb_
command_port = port->serial->port[COMMAND_PORT]; command_info = usb_get_serial_port_data(command_port); + + if (command_port->bulk_out_size < datasize + 1) + return -EIO; + mutex_lock(&command_info->mutex); command_info->command_finished = false;
From: Johan Hovold johan@kernel.org
commit 84968291d7924261c6a0624b9a72f952398e258b upstream.
Add missing endianness conversion when setting the line speed so that this driver might work also on big-endian machines.
Also use an unsigned format specifier in the corresponding debug message.
Signed-off-by: Johan Hovold johan@kernel.org Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20191029102354.2733-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/serial/whiteheat.c | 9 ++++++--- drivers/usb/serial/whiteheat.h | 2 +- 2 files changed, 7 insertions(+), 4 deletions(-)
--- a/drivers/usb/serial/whiteheat.c +++ b/drivers/usb/serial/whiteheat.c @@ -636,6 +636,7 @@ static void firm_setup_port(struct tty_s struct device *dev = &port->dev; struct whiteheat_port_settings port_settings; unsigned int cflag = tty->termios.c_cflag; + speed_t baud;
port_settings.port = port->port_number + 1;
@@ -696,11 +697,13 @@ static void firm_setup_port(struct tty_s dev_dbg(dev, "%s - XON = %2x, XOFF = %2x\n", __func__, port_settings.xon, port_settings.xoff);
/* get the baud rate wanted */ - port_settings.baud = tty_get_baud_rate(tty); - dev_dbg(dev, "%s - baud rate = %d\n", __func__, port_settings.baud); + baud = tty_get_baud_rate(tty); + port_settings.baud = cpu_to_le32(baud); + dev_dbg(dev, "%s - baud rate = %u\n", __func__, baud);
/* fixme: should set validated settings */ - tty_encode_baud_rate(tty, port_settings.baud, port_settings.baud); + tty_encode_baud_rate(tty, baud, baud); + /* handle any settings that aren't specified in the tty structure */ port_settings.lloop = 0;
--- a/drivers/usb/serial/whiteheat.h +++ b/drivers/usb/serial/whiteheat.h @@ -87,7 +87,7 @@ struct whiteheat_simple {
struct whiteheat_port_settings { __u8 port; /* port number (1 to N) */ - __u32 baud; /* any value 7 - 460800, firmware calculates + __le32 baud; /* any value 7 - 460800, firmware calculates best fit; arrives little endian */ __u8 bits; /* 5, 6, 7, or 8 */ __u8 stop; /* 1 or 2, default 1 (2 = 1.5 if bits = 5) */
From: Mathias Nyman mathias.nyman@linux.intel.com
commit 18b74067ac78a2dea65783314c13df98a53d071c upstream.
commit ef513be0a905 ("usb: xhci: Add Clear_TT_Buffer") schedules work to clear TT buffer, but causes a use-after-free regression at the same time
Make sure hub_tt_work finishes before endpoint is disabled, otherwise the work will dereference already freed endpoint and device related pointers.
This was triggered when usb core failed to read the configuration descriptor of a FS/LS device during enumeration. xhci driver queued clear_tt_work while usb core freed and reallocated a new device for the next enumeration attempt.
EHCI driver implents ehci_endpoint_disable() that makes sure clear_tt_work has finished before it returns, but xhci lacks this support. usb core will call hcd->driver->endpoint_disable() callback before disabling endpoints, so we want this in xhci as well.
The added xhci_endpoint_disable() is based on ehci_endpoint_disable()
Fixes: ef513be0a905 ("usb: xhci: Add Clear_TT_Buffer") Cc: stable@vger.kernel.org # v5.3 Reported-by: Johan Hovold johan@kernel.org Suggested-by: Johan Hovold johan@kernel.org Reviewed-by: Johan Hovold johan@kernel.org Tested-by: Johan Hovold johan@kernel.org Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/1572013829-14044-2-git-send-email-mathias.nyman@li... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/usb/host/xhci.c | 54 ++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 45 insertions(+), 9 deletions(-)
--- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -3071,6 +3071,48 @@ void xhci_cleanup_stalled_ring(struct xh } }
+static void xhci_endpoint_disable(struct usb_hcd *hcd, + struct usb_host_endpoint *host_ep) +{ + struct xhci_hcd *xhci; + struct xhci_virt_device *vdev; + struct xhci_virt_ep *ep; + struct usb_device *udev; + unsigned long flags; + unsigned int ep_index; + + xhci = hcd_to_xhci(hcd); +rescan: + spin_lock_irqsave(&xhci->lock, flags); + + udev = (struct usb_device *)host_ep->hcpriv; + if (!udev || !udev->slot_id) + goto done; + + vdev = xhci->devs[udev->slot_id]; + if (!vdev) + goto done; + + ep_index = xhci_get_endpoint_index(&host_ep->desc); + ep = &vdev->eps[ep_index]; + if (!ep) + goto done; + + /* wait for hub_tt_work to finish clearing hub TT */ + if (ep->ep_state & EP_CLEARING_TT) { + spin_unlock_irqrestore(&xhci->lock, flags); + schedule_timeout_uninterruptible(1); + goto rescan; + } + + if (ep->ep_state) + xhci_dbg(xhci, "endpoint disable with ep_state 0x%x\n", + ep->ep_state); +done: + host_ep->hcpriv = NULL; + spin_unlock_irqrestore(&xhci->lock, flags); +} + /* * Called after usb core issues a clear halt control message. * The host side of the halt should already be cleared by a reset endpoint @@ -5237,20 +5279,13 @@ static void xhci_clear_tt_buffer_complet unsigned int ep_index; unsigned long flags;
- /* - * udev might be NULL if tt buffer is cleared during a failed device - * enumeration due to a halted control endpoint. Usb core might - * have allocated a new udev for the next enumeration attempt. - */ - xhci = hcd_to_xhci(hcd); + + spin_lock_irqsave(&xhci->lock, flags); udev = (struct usb_device *)ep->hcpriv; - if (!udev) - return; slot_id = udev->slot_id; ep_index = xhci_get_endpoint_index(&ep->desc);
- spin_lock_irqsave(&xhci->lock, flags); xhci->devs[slot_id]->eps[ep_index].ep_state &= ~EP_CLEARING_TT; xhci_ring_doorbell_for_active_rings(xhci, slot_id, ep_index); spin_unlock_irqrestore(&xhci->lock, flags); @@ -5287,6 +5322,7 @@ static const struct hc_driver xhci_hc_dr .free_streams = xhci_free_streams, .add_endpoint = xhci_add_endpoint, .drop_endpoint = xhci_drop_endpoint, + .endpoint_disable = xhci_endpoint_disable, .endpoint_reset = xhci_endpoint_reset, .check_bandwidth = xhci_check_bandwidth, .reset_bandwidth = xhci_reset_bandwidth,
From: Quinn Tran qutran@marvell.com
commit 8d8b83f5be2a3bdac3695a94e6cb5e50bd114869 upstream.
For new adapters with multiple flash regions to write to, current code allows FW & Boot regions to be written, while other regions are blocked via sysfs. The fix is to block all flash read/write through sysfs interface.
Fixes: e81d1bcbde06 ("scsi: qla2xxx: Further limit FLASH region write access from SysFS") Cc: stable@vger.kernel.org # 5.2 Link: https://lore.kernel.org/r/20191022193643.7076-3-hmadhani@marvell.com Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Girish Basrur gbasrur@marvell.com Signed-off-by: Himanshu Madhani hmadhani@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/scsi/qla2xxx/qla_attr.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_attr.c +++ b/drivers/scsi/qla2xxx/qla_attr.c @@ -441,9 +441,6 @@ qla2x00_sysfs_write_optrom_ctl(struct fi valid = 0; if (ha->optrom_size == OPTROM_SIZE_2300 && start == 0) valid = 1; - else if (start == (ha->flt_region_boot * 4) || - start == (ha->flt_region_fw * 4)) - valid = 1; else if (IS_QLA24XX_TYPE(ha) || IS_QLA25XX(ha)) valid = 1; if (!valid) { @@ -491,8 +488,10 @@ qla2x00_sysfs_write_optrom_ctl(struct fi "Writing flash region -- 0x%x/0x%x.\n", ha->optrom_region_start, ha->optrom_region_size);
- ha->isp_ops->write_optrom(vha, ha->optrom_buffer, + rval = ha->isp_ops->write_optrom(vha, ha->optrom_buffer, ha->optrom_region_start, ha->optrom_region_size); + if (rval) + rval = -EIO; break; default: rval = -EINVAL;
From: Bart Van Assche bvanassche@acm.org
commit fc5b220b2dcf8b512d9bd46fd17f82257e49bf89 upstream.
Use the pointer 'p' after having tested that pointer instead of before.
Fixes: 5cadafb236df ("target/cxgbit: Fix endianness annotations") Cc: Varun Prakash varun@chelsio.com Cc: Nicholas Bellinger nab@linux-iscsi.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191023202150.22173-1-bvanassche@acm.org Reported-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/target/iscsi/cxgbit/cxgbit_cm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/target/iscsi/cxgbit/cxgbit_cm.c +++ b/drivers/target/iscsi/cxgbit/cxgbit_cm.c @@ -1831,7 +1831,7 @@ static void cxgbit_fw4_ack(struct cxgbit
while (credits) { struct sk_buff *p = cxgbit_sock_peek_wr(csk); - const u32 csum = (__force u32)p->csum; + u32 csum;
if (unlikely(!p)) { pr_err("csk 0x%p,%u, cr %u,%u+%u, empty.\n", @@ -1840,6 +1840,7 @@ static void cxgbit_fw4_ack(struct cxgbit break; }
+ csum = (__force u32)p->csum; if (unlikely(credits < csum)) { pr_warn("csk 0x%p,%u, cr %u,%u+%u, < %u.\n", csk, csk->tid,
From: Hans de Goede hdegoede@redhat.com
commit 09f3dbe474735df13dd8a66d3d1231048d9b373f upstream.
The Primebook C11B uses the SIPODEV SP1064 touchpad. There are 2 versions of this 2-in-1 and the touchpad in the older version does not supply descriptors, so it has to be added to the override list.
Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Benjamin Tissoires benjamin.tissoires@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
--- a/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c +++ b/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c @@ -323,6 +323,25 @@ static const struct dmi_system_id i2c_hi .driver_data = (void *)&sipodev_desc }, { + /* + * There are at least 2 Primebook C11B versions, the older + * version has a product-name of "Primebook C11B", and a + * bios version / release / firmware revision of: + * V2.1.2 / 05/03/2018 / 18.2 + * The new version has "PRIMEBOOK C11B" as product-name and a + * bios version / release / firmware revision of: + * CFALKSW05_BIOS_V1.1.2 / 11/19/2018 / 19.2 + * Only the older version needs this quirk, note the newer + * version will not match as it has a different product-name. + */ + .ident = "Trekstor Primebook C11B", + .matches = { + DMI_EXACT_MATCH(DMI_SYS_VENDOR, "TREKSTOR"), + DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Primebook C11B"), + }, + .driver_data = (void *)&sipodev_desc + }, + { .ident = "Direkt-Tek DTLAPY116-2", .matches = { DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Direkt-Tek"),
From: Alan Stern stern@rowland.harvard.edu
commit d9d4b1e46d9543a82c23f6df03f4ad697dab361b upstream.
The syzbot fuzzer found a slab-out-of-bounds write bug in the hid-gaff driver. The problem is caused by the driver's assumption that the device must have an input report. While this will be true for all normal HID input devices, a suitably malicious device can violate the assumption.
The same assumption is present in over a dozen other HID drivers. This patch fixes them by checking that the list of hid_inputs for the hid_device is nonempty before allowing it to be used.
Reported-and-tested-by: syzbot+403741a091bf41d4ae79@syzkaller.appspotmail.com Signed-off-by: Alan Stern stern@rowland.harvard.edu CC: stable@vger.kernel.org Signed-off-by: Benjamin Tissoires benjamin.tissoires@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hid/hid-axff.c | 11 +++++++++-- drivers/hid/hid-dr.c | 12 +++++++++--- drivers/hid/hid-emsff.c | 12 +++++++++--- drivers/hid/hid-gaff.c | 12 +++++++++--- drivers/hid/hid-holtekff.c | 12 +++++++++--- drivers/hid/hid-lg2ff.c | 12 +++++++++--- drivers/hid/hid-lg3ff.c | 11 +++++++++-- drivers/hid/hid-lg4ff.c | 11 +++++++++-- drivers/hid/hid-lgff.c | 11 +++++++++-- drivers/hid/hid-logitech-hidpp.c | 11 +++++++++-- drivers/hid/hid-microsoft.c | 12 +++++++++--- drivers/hid/hid-sony.c | 12 +++++++++--- drivers/hid/hid-tmff.c | 12 +++++++++--- drivers/hid/hid-zpff.c | 12 +++++++++--- 14 files changed, 126 insertions(+), 37 deletions(-)
--- a/drivers/hid/hid-axff.c +++ b/drivers/hid/hid-axff.c @@ -63,13 +63,20 @@ static int axff_init(struct hid_device * { struct axff_device *axff; struct hid_report *report; - struct hid_input *hidinput = list_first_entry(&hid->inputs, struct hid_input, list); + struct hid_input *hidinput; struct list_head *report_list =&hid->report_enum[HID_OUTPUT_REPORT].report_list; - struct input_dev *dev = hidinput->input; + struct input_dev *dev; int field_count = 0; int i, j; int error;
+ if (list_empty(&hid->inputs)) { + hid_err(hid, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_first_entry(&hid->inputs, struct hid_input, list); + dev = hidinput->input; + if (list_empty(report_list)) { hid_err(hid, "no output reports found\n"); return -ENODEV; --- a/drivers/hid/hid-dr.c +++ b/drivers/hid/hid-dr.c @@ -75,13 +75,19 @@ static int drff_init(struct hid_device * { struct drff_device *drff; struct hid_report *report; - struct hid_input *hidinput = list_first_entry(&hid->inputs, - struct hid_input, list); + struct hid_input *hidinput; struct list_head *report_list = &hid->report_enum[HID_OUTPUT_REPORT].report_list; - struct input_dev *dev = hidinput->input; + struct input_dev *dev; int error;
+ if (list_empty(&hid->inputs)) { + hid_err(hid, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_first_entry(&hid->inputs, struct hid_input, list); + dev = hidinput->input; + if (list_empty(report_list)) { hid_err(hid, "no output reports found\n"); return -ENODEV; --- a/drivers/hid/hid-emsff.c +++ b/drivers/hid/hid-emsff.c @@ -47,13 +47,19 @@ static int emsff_init(struct hid_device { struct emsff_device *emsff; struct hid_report *report; - struct hid_input *hidinput = list_first_entry(&hid->inputs, - struct hid_input, list); + struct hid_input *hidinput; struct list_head *report_list = &hid->report_enum[HID_OUTPUT_REPORT].report_list; - struct input_dev *dev = hidinput->input; + struct input_dev *dev; int error;
+ if (list_empty(&hid->inputs)) { + hid_err(hid, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_first_entry(&hid->inputs, struct hid_input, list); + dev = hidinput->input; + if (list_empty(report_list)) { hid_err(hid, "no output reports found\n"); return -ENODEV; --- a/drivers/hid/hid-gaff.c +++ b/drivers/hid/hid-gaff.c @@ -64,14 +64,20 @@ static int gaff_init(struct hid_device * { struct gaff_device *gaff; struct hid_report *report; - struct hid_input *hidinput = list_entry(hid->inputs.next, - struct hid_input, list); + struct hid_input *hidinput; struct list_head *report_list = &hid->report_enum[HID_OUTPUT_REPORT].report_list; struct list_head *report_ptr = report_list; - struct input_dev *dev = hidinput->input; + struct input_dev *dev; int error;
+ if (list_empty(&hid->inputs)) { + hid_err(hid, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_entry(hid->inputs.next, struct hid_input, list); + dev = hidinput->input; + if (list_empty(report_list)) { hid_err(hid, "no output reports found\n"); return -ENODEV; --- a/drivers/hid/hid-holtekff.c +++ b/drivers/hid/hid-holtekff.c @@ -124,13 +124,19 @@ static int holtekff_init(struct hid_devi { struct holtekff_device *holtekff; struct hid_report *report; - struct hid_input *hidinput = list_entry(hid->inputs.next, - struct hid_input, list); + struct hid_input *hidinput; struct list_head *report_list = &hid->report_enum[HID_OUTPUT_REPORT].report_list; - struct input_dev *dev = hidinput->input; + struct input_dev *dev; int error;
+ if (list_empty(&hid->inputs)) { + hid_err(hid, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_entry(hid->inputs.next, struct hid_input, list); + dev = hidinput->input; + if (list_empty(report_list)) { hid_err(hid, "no output report found\n"); return -ENODEV; --- a/drivers/hid/hid-lg2ff.c +++ b/drivers/hid/hid-lg2ff.c @@ -50,11 +50,17 @@ int lg2ff_init(struct hid_device *hid) { struct lg2ff_device *lg2ff; struct hid_report *report; - struct hid_input *hidinput = list_entry(hid->inputs.next, - struct hid_input, list); - struct input_dev *dev = hidinput->input; + struct hid_input *hidinput; + struct input_dev *dev; int error;
+ if (list_empty(&hid->inputs)) { + hid_err(hid, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_entry(hid->inputs.next, struct hid_input, list); + dev = hidinput->input; + /* Check that the report looks ok */ report = hid_validate_values(hid, HID_OUTPUT_REPORT, 0, 0, 7); if (!report) --- a/drivers/hid/hid-lg3ff.c +++ b/drivers/hid/hid-lg3ff.c @@ -117,12 +117,19 @@ static const signed short ff3_joystick_a
int lg3ff_init(struct hid_device *hid) { - struct hid_input *hidinput = list_entry(hid->inputs.next, struct hid_input, list); - struct input_dev *dev = hidinput->input; + struct hid_input *hidinput; + struct input_dev *dev; const signed short *ff_bits = ff3_joystick_ac; int error; int i;
+ if (list_empty(&hid->inputs)) { + hid_err(hid, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_entry(hid->inputs.next, struct hid_input, list); + dev = hidinput->input; + /* Check that the report looks ok */ if (!hid_validate_values(hid, HID_OUTPUT_REPORT, 0, 0, 35)) return -ENODEV; --- a/drivers/hid/hid-lg4ff.c +++ b/drivers/hid/hid-lg4ff.c @@ -1253,8 +1253,8 @@ static int lg4ff_handle_multimode_wheel(
int lg4ff_init(struct hid_device *hid) { - struct hid_input *hidinput = list_entry(hid->inputs.next, struct hid_input, list); - struct input_dev *dev = hidinput->input; + struct hid_input *hidinput; + struct input_dev *dev; struct list_head *report_list = &hid->report_enum[HID_OUTPUT_REPORT].report_list; struct hid_report *report = list_entry(report_list->next, struct hid_report, list); const struct usb_device_descriptor *udesc = &(hid_to_usb_dev(hid)->descriptor); @@ -1266,6 +1266,13 @@ int lg4ff_init(struct hid_device *hid) int mmode_ret, mmode_idx = -1; u16 real_product_id;
+ if (list_empty(&hid->inputs)) { + hid_err(hid, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_entry(hid->inputs.next, struct hid_input, list); + dev = hidinput->input; + /* Check that the report looks ok */ if (!hid_validate_values(hid, HID_OUTPUT_REPORT, 0, 0, 7)) return -1; --- a/drivers/hid/hid-lgff.c +++ b/drivers/hid/hid-lgff.c @@ -115,12 +115,19 @@ static void hid_lgff_set_autocenter(stru
int lgff_init(struct hid_device* hid) { - struct hid_input *hidinput = list_entry(hid->inputs.next, struct hid_input, list); - struct input_dev *dev = hidinput->input; + struct hid_input *hidinput; + struct input_dev *dev; const signed short *ff_bits = ff_joystick; int error; int i;
+ if (list_empty(&hid->inputs)) { + hid_err(hid, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_entry(hid->inputs.next, struct hid_input, list); + dev = hidinput->input; + /* Check that the report looks ok */ if (!hid_validate_values(hid, HID_OUTPUT_REPORT, 0, 0, 7)) return -ENODEV; --- a/drivers/hid/hid-logitech-hidpp.c +++ b/drivers/hid/hid-logitech-hidpp.c @@ -2084,8 +2084,8 @@ static void hidpp_ff_destroy(struct ff_d static int hidpp_ff_init(struct hidpp_device *hidpp, u8 feature_index) { struct hid_device *hid = hidpp->hid_dev; - struct hid_input *hidinput = list_entry(hid->inputs.next, struct hid_input, list); - struct input_dev *dev = hidinput->input; + struct hid_input *hidinput; + struct input_dev *dev; const struct usb_device_descriptor *udesc = &(hid_to_usb_dev(hid)->descriptor); const u16 bcdDevice = le16_to_cpu(udesc->bcdDevice); struct ff_device *ff; @@ -2094,6 +2094,13 @@ static int hidpp_ff_init(struct hidpp_de int error, j, num_slots; u8 version;
+ if (list_empty(&hid->inputs)) { + hid_err(hid, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_entry(hid->inputs.next, struct hid_input, list); + dev = hidinput->input; + if (!dev) { hid_err(hid, "Struct input_dev not set!\n"); return -EINVAL; --- a/drivers/hid/hid-microsoft.c +++ b/drivers/hid/hid-microsoft.c @@ -328,11 +328,17 @@ static int ms_play_effect(struct input_d
static int ms_init_ff(struct hid_device *hdev) { - struct hid_input *hidinput = list_entry(hdev->inputs.next, - struct hid_input, list); - struct input_dev *input_dev = hidinput->input; + struct hid_input *hidinput; + struct input_dev *input_dev; struct ms_data *ms = hid_get_drvdata(hdev);
+ if (list_empty(&hdev->inputs)) { + hid_err(hdev, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_entry(hdev->inputs.next, struct hid_input, list); + input_dev = hidinput->input; + if (!(ms->quirks & MS_QUIRK_FF)) return 0;
--- a/drivers/hid/hid-sony.c +++ b/drivers/hid/hid-sony.c @@ -2254,9 +2254,15 @@ static int sony_play_effect(struct input
static int sony_init_ff(struct sony_sc *sc) { - struct hid_input *hidinput = list_entry(sc->hdev->inputs.next, - struct hid_input, list); - struct input_dev *input_dev = hidinput->input; + struct hid_input *hidinput; + struct input_dev *input_dev; + + if (list_empty(&sc->hdev->inputs)) { + hid_err(sc->hdev, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_entry(sc->hdev->inputs.next, struct hid_input, list); + input_dev = hidinput->input;
input_set_capability(input_dev, EV_FF, FF_RUMBLE); return input_ff_create_memless(input_dev, NULL, sony_play_effect); --- a/drivers/hid/hid-tmff.c +++ b/drivers/hid/hid-tmff.c @@ -124,12 +124,18 @@ static int tmff_init(struct hid_device * struct tmff_device *tmff; struct hid_report *report; struct list_head *report_list; - struct hid_input *hidinput = list_entry(hid->inputs.next, - struct hid_input, list); - struct input_dev *input_dev = hidinput->input; + struct hid_input *hidinput; + struct input_dev *input_dev; int error; int i;
+ if (list_empty(&hid->inputs)) { + hid_err(hid, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_entry(hid->inputs.next, struct hid_input, list); + input_dev = hidinput->input; + tmff = kzalloc(sizeof(struct tmff_device), GFP_KERNEL); if (!tmff) return -ENOMEM; --- a/drivers/hid/hid-zpff.c +++ b/drivers/hid/hid-zpff.c @@ -54,11 +54,17 @@ static int zpff_init(struct hid_device * { struct zpff_device *zpff; struct hid_report *report; - struct hid_input *hidinput = list_entry(hid->inputs.next, - struct hid_input, list); - struct input_dev *dev = hidinput->input; + struct hid_input *hidinput; + struct input_dev *dev; int i, error;
+ if (list_empty(&hid->inputs)) { + hid_err(hid, "no inputs found\n"); + return -ENODEV; + } + hidinput = list_entry(hid->inputs.next, struct hid_input, list); + dev = hidinput->input; + for (i = 0; i < 4; i++) { report = hid_validate_values(hid, HID_OUTPUT_REPORT, 0, i, 1); if (!report)
From: Michał Mirosław mirq-linux@rere.qmqm.pl
commit b3a81c777dcb093020680490ab970d85e2f6f04f upstream.
On HID report descriptor parsing error the code displays bogus pointer instead of error offset (subtracts start=NULL from end). Make the message more useful by displaying correct error offset and include total buffer size for reference.
This was carried over from ancient times - "Fixed" commit just promoted the message from DEBUG to ERROR.
Cc: stable@vger.kernel.org Fixes: 8c3d52fc393b ("HID: make parser more verbose about parsing errors by default") Signed-off-by: Michał Mirosław mirq-linux@rere.qmqm.pl Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hid/hid-core.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/hid/hid-core.c +++ b/drivers/hid/hid-core.c @@ -1139,6 +1139,7 @@ int hid_open_report(struct hid_device *d __u8 *start; __u8 *buf; __u8 *end; + __u8 *next; int ret; static int (*dispatch_type[])(struct hid_parser *parser, struct hid_item *item) = { @@ -1192,7 +1193,8 @@ int hid_open_report(struct hid_device *d device->collection_size = HID_DEFAULT_NUM_COLLECTIONS;
ret = -EINVAL; - while ((start = fetch_item(start, end, &item)) != NULL) { + while ((next = fetch_item(start, end, &item)) != NULL) { + start = next;
if (item.format != HID_ITEM_FORMAT_SHORT) { hid_err(device, "unexpected long global item\n"); @@ -1230,7 +1232,8 @@ int hid_open_report(struct hid_device *d } }
- hid_err(device, "item fetching failed at offset %d\n", (int)(end - start)); + hid_err(device, "item fetching failed at offset %u/%u\n", + size - (unsigned int)(end - start), size); err: kfree(parser->collection_stack); alloc_err:
From: Andrey Smirnov andrew.smirnov@gmail.com
commit abdd3d0b344fdf72a4904d09b97bc964d74c4419 upstream.
Original version of g920_get_config() contained two kind of actions:
1. Device specific communication to query/set some parameters which requires active communication channel with the device, or, put in other way, for the call to be sandwiched between hid_device_io_start() and hid_device_io_stop().
2. Input subsystem specific FF controller initialization which, in order to access a valid 'struct hid_input' via 'hid->inputs.next', requires claimed hidinput which means be executed after the call to hid_hw_start() with connect_mask containing HID_CONNECT_HIDINPUT.
Location of g920_get_config() can only fulfill requirements for #1 and not #2, which might result in following backtrace:
[ 88.312258] logitech-hidpp-device 0003:046D:C262.0005: HID++ 4.2 device connected. [ 88.320298] BUG: kernel NULL pointer dereference, address: 0000000000000018 [ 88.320304] #PF: supervisor read access in kernel mode [ 88.320307] #PF: error_code(0x0000) - not-present page [ 88.320309] PGD 0 P4D 0 [ 88.320315] Oops: 0000 [#1] SMP PTI [ 88.320320] CPU: 1 PID: 3080 Comm: systemd-udevd Not tainted 5.4.0-rc1+ #31 [ 88.320322] Hardware name: Apple Inc. MacBookPro11,1/Mac-189A3D4F975D5FFC, BIOS 149.0.0.0.0 09/17/2018 [ 88.320334] RIP: 0010:hidpp_probe+0x61f/0x948 [hid_logitech_hidpp] [ 88.320338] Code: 81 00 00 48 89 ef e8 f0 d6 ff ff 41 89 c6 85 c0 75 b5 0f b6 44 24 28 48 8b 5d 00 88 44 24 1e 89 44 24 0c 48 8b 83 18 1c 00 00 <48> 8b 48 18 48 8b 83 10 19 00 00 48 8b 40 40 48 89 0c 24 0f b7 80 [ 88.320341] RSP: 0018:ffffb0a6824aba68 EFLAGS: 00010246 [ 88.320345] RAX: 0000000000000000 RBX: ffff93a50756e000 RCX: 0000000000010408 [ 88.320347] RDX: 0000000000000000 RSI: ffff93a51f0ad0a0 RDI: 000000000002d0a0 [ 88.320350] RBP: ffff93a50416da28 R08: ffff93a50416da70 R09: ffff93a50416da70 [ 88.320352] R10: 000000148ae9e60c R11: 00000000000f1525 R12: ffff93a50756e000 [ 88.320354] R13: ffff93a50756f8d0 R14: 0000000000000000 R15: ffff93a50756fc38 [ 88.320358] FS: 00007f8d8c1e0940(0000) GS:ffff93a51f080000(0000) knlGS:0000000000000000 [ 88.320361] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 88.320363] CR2: 0000000000000018 CR3: 00000003996d8003 CR4: 00000000001606e0 [ 88.320366] Call Trace: [ 88.320377] ? _cond_resched+0x15/0x30 [ 88.320387] ? create_pinctrl+0x2f/0x3c0 [ 88.320393] ? kernfs_link_sibling+0x94/0xe0 [ 88.320398] ? _cond_resched+0x15/0x30 [ 88.320402] ? kernfs_activate+0x5f/0x80 [ 88.320406] ? kernfs_add_one+0xe2/0x130 [ 88.320411] hid_device_probe+0x106/0x170 [ 88.320419] really_probe+0x147/0x3c0 [ 88.320424] driver_probe_device+0xb6/0x100 [ 88.320428] device_driver_attach+0x53/0x60 [ 88.320433] __driver_attach+0x8a/0x150 [ 88.320437] ? device_driver_attach+0x60/0x60 [ 88.320440] bus_for_each_dev+0x78/0xc0 [ 88.320445] bus_add_driver+0x14d/0x1f0 [ 88.320450] driver_register+0x6c/0xc0 [ 88.320453] ? 0xffffffffc0d67000 [ 88.320457] __hid_register_driver+0x4c/0x80 [ 88.320464] do_one_initcall+0x46/0x1f4 [ 88.320469] ? _cond_resched+0x15/0x30 [ 88.320474] ? kmem_cache_alloc_trace+0x162/0x220 [ 88.320481] ? do_init_module+0x23/0x230 [ 88.320486] do_init_module+0x5c/0x230 [ 88.320491] load_module+0x26e1/0x2990 [ 88.320502] ? ima_post_read_file+0xf0/0x100 [ 88.320508] ? __do_sys_finit_module+0xaa/0x110 [ 88.320512] __do_sys_finit_module+0xaa/0x110 [ 88.320520] do_syscall_64+0x5b/0x180 [ 88.320525] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 88.320528] RIP: 0033:0x7f8d8d1f01fd [ 88.320532] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 5b 8c 0c 00 f7 d8 64 89 01 48 [ 88.320535] RSP: 002b:00007ffefa3bb068 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 88.320539] RAX: ffffffffffffffda RBX: 000055922040cb40 RCX: 00007f8d8d1f01fd [ 88.320541] RDX: 0000000000000000 RSI: 00007f8d8ce4984d RDI: 0000000000000006 [ 88.320543] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000007 [ 88.320545] R10: 0000000000000006 R11: 0000000000000246 R12: 00007f8d8ce4984d [ 88.320547] R13: 0000000000000000 R14: 000055922040efc0 R15: 000055922040cb40 [ 88.320551] Modules linked in: hid_logitech_hidpp(+) fuse rfcomm ccm xt_CHECKSUM xt_MASQUERADE bridge stp llc nf_nat_tftp nf_conntrack_tftp nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat tun iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep sunrpc dm_crypt nls_utf8 hfsplus intel_rapl_msr intel_rapl_common ath9k_htc ath9k_common x86_pkg_temp_thermal intel_powerclamp b43 ath9k_hw coretemp snd_hda_codec_hdmi cordic kvm_intel snd_hda_codec_cirrus mac80211 snd_hda_codec_generic ledtrig_audio kvm snd_hda_intel snd_intel_nhlt irqbypass snd_hda_codec btusb btrtl snd_hda_core ath btbcm ssb snd_hwdep btintel snd_seq crct10dif_pclmul iTCO_wdt snd_seq_device crc32_pclmul bluetooth mmc_core iTCO_vendor_support joydev cfg80211 [ 88.320602] applesmc ghash_clmulni_intel ecdh_generic snd_pcm input_polldev intel_cstate ecc intel_uncore thunderbolt snd_timer i2c_i801 libarc4 rfkill intel_rapl_perf lpc_ich mei_me pcspkr bcm5974 snd bcma mei soundcore acpi_als sbs kfifo_buf sbshc industrialio apple_bl i915 i2c_algo_bit drm_kms_helper drm uas crc32c_intel usb_storage video hid_apple [ 88.320630] CR2: 0000000000000018 [ 88.320633] ---[ end trace 933491c8a4fadeb7 ]--- [ 88.320642] RIP: 0010:hidpp_probe+0x61f/0x948 [hid_logitech_hidpp] [ 88.320645] Code: 81 00 00 48 89 ef e8 f0 d6 ff ff 41 89 c6 85 c0 75 b5 0f b6 44 24 28 48 8b 5d 00 88 44 24 1e 89 44 24 0c 48 8b 83 18 1c 00 00 <48> 8b 48 18 48 8b 83 10 19 00 00 48 8b 40 40 48 89 0c 24 0f b7 80 [ 88.320647] RSP: 0018:ffffb0a6824aba68 EFLAGS: 00010246 [ 88.320650] RAX: 0000000000000000 RBX: ffff93a50756e000 RCX: 0000000000010408 [ 88.320652] RDX: 0000000000000000 RSI: ffff93a51f0ad0a0 RDI: 000000000002d0a0 [ 88.320655] RBP: ffff93a50416da28 R08: ffff93a50416da70 R09: ffff93a50416da70 [ 88.320657] R10: 000000148ae9e60c R11: 00000000000f1525 R12: ffff93a50756e000 [ 88.320659] R13: ffff93a50756f8d0 R14: 0000000000000000 R15: ffff93a50756fc38 [ 88.320662] FS: 00007f8d8c1e0940(0000) GS:ffff93a51f080000(0000) knlGS:0000000000000000 [ 88.320664] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 88.320667] CR2: 0000000000000018 CR3: 00000003996d8003 CR4: 00000000001606e0
To solve this issue:
1. Split g920_get_config() such that all of the device specific communication remains a part of the function and input subsystem initialization bits go to hidpp_ff_init()
2. Move call to hidpp_ff_init() from being a part of g920_get_config() to be the last step of .probe(), right after a call to hid_hw_start() with connect_mask containing HID_CONNECT_HIDINPUT.
Fixes: 91cf9a98ae41 ("HID: logitech-hidpp: make .probe usbhid capable") Signed-off-by: Andrey Smirnov andrew.smirnov@gmail.com Tested-by: Sam Bazley sambazley@fastmail.com Cc: Jiri Kosina jikos@kernel.org Cc: Benjamin Tissoires benjamin.tissoires@redhat.com Cc: Henrik Rydberg rydberg@bitmath.org Cc: Pierre-Loup A. Griffais pgriffais@valvesoftware.com Cc: Austin Palmer austinp@valvesoftware.com Cc: linux-input@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # 5.2+ Signed-off-by: Benjamin Tissoires benjamin.tissoires@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hid/hid-logitech-hidpp.c | 150 ++++++++++++++++++++++++--------------- 1 file changed, 96 insertions(+), 54 deletions(-)
--- a/drivers/hid/hid-logitech-hidpp.c +++ b/drivers/hid/hid-logitech-hidpp.c @@ -1669,6 +1669,7 @@ static void hidpp_touchpad_raw_xy_event(
#define HIDPP_FF_EFFECTID_NONE -1 #define HIDPP_FF_EFFECTID_AUTOCENTER -2 +#define HIDPP_AUTOCENTER_PARAMS_LENGTH 18
#define HIDPP_FF_MAX_PARAMS 20 #define HIDPP_FF_RESERVED_SLOTS 1 @@ -2009,7 +2010,7 @@ static int hidpp_ff_erase_effect(struct static void hidpp_ff_set_autocenter(struct input_dev *dev, u16 magnitude) { struct hidpp_ff_private_data *data = dev->ff->private; - u8 params[18]; + u8 params[HIDPP_AUTOCENTER_PARAMS_LENGTH];
dbg_hid("Setting autocenter to %d.\n", magnitude);
@@ -2081,7 +2082,8 @@ static void hidpp_ff_destroy(struct ff_d kfree(data->effect_ids); }
-static int hidpp_ff_init(struct hidpp_device *hidpp, u8 feature_index) +static int hidpp_ff_init(struct hidpp_device *hidpp, + struct hidpp_ff_private_data *data) { struct hid_device *hid = hidpp->hid_dev; struct hid_input *hidinput; @@ -2089,9 +2091,7 @@ static int hidpp_ff_init(struct hidpp_de const struct usb_device_descriptor *udesc = &(hid_to_usb_dev(hid)->descriptor); const u16 bcdDevice = le16_to_cpu(udesc->bcdDevice); struct ff_device *ff; - struct hidpp_report response; - struct hidpp_ff_private_data *data; - int error, j, num_slots; + int error, j, num_slots = data->num_effects; u8 version;
if (list_empty(&hid->inputs)) { @@ -2116,27 +2116,17 @@ static int hidpp_ff_init(struct hidpp_de for (j = 0; hidpp_ff_effects_v2[j] >= 0; j++) set_bit(hidpp_ff_effects_v2[j], dev->ffbit);
- /* Read number of slots available in device */ - error = hidpp_send_fap_command_sync(hidpp, feature_index, - HIDPP_FF_GET_INFO, NULL, 0, &response); - if (error) { - if (error < 0) - return error; - hid_err(hidpp->hid_dev, "%s: received protocol error 0x%02x\n", - __func__, error); - return -EPROTO; - } - - num_slots = response.fap.params[0] - HIDPP_FF_RESERVED_SLOTS; - error = input_ff_create(dev, num_slots);
if (error) { hid_err(dev, "Failed to create FF device!\n"); return error; } - - data = kzalloc(sizeof(*data), GFP_KERNEL); + /* + * Create a copy of passed data, so we can transfer memory + * ownership to FF core + */ + data = kmemdup(data, sizeof(*data), GFP_KERNEL); if (!data) return -ENOMEM; data->effect_ids = kcalloc(num_slots, sizeof(int), GFP_KERNEL); @@ -2152,10 +2142,7 @@ static int hidpp_ff_init(struct hidpp_de }
data->hidpp = hidpp; - data->feature_index = feature_index; data->version = version; - data->slot_autocenter = 0; - data->num_effects = num_slots; for (j = 0; j < num_slots; j++) data->effect_ids[j] = -1;
@@ -2169,37 +2156,14 @@ static int hidpp_ff_init(struct hidpp_de ff->set_autocenter = hidpp_ff_set_autocenter; ff->destroy = hidpp_ff_destroy;
- - /* reset all forces */ - error = hidpp_send_fap_command_sync(hidpp, feature_index, - HIDPP_FF_RESET_ALL, NULL, 0, &response); - - /* Read current Range */ - error = hidpp_send_fap_command_sync(hidpp, feature_index, - HIDPP_FF_GET_APERTURE, NULL, 0, &response); - if (error) - hid_warn(hidpp->hid_dev, "Failed to read range from device!\n"); - data->range = error ? 900 : get_unaligned_be16(&response.fap.params[0]); - /* Create sysfs interface */ error = device_create_file(&(hidpp->hid_dev->dev), &dev_attr_range); if (error) hid_warn(hidpp->hid_dev, "Unable to create sysfs interface for "range", errno %d!\n", error);
- /* Read the current gain values */ - error = hidpp_send_fap_command_sync(hidpp, feature_index, - HIDPP_FF_GET_GLOBAL_GAINS, NULL, 0, &response); - if (error) - hid_warn(hidpp->hid_dev, "Failed to read gain values from device!\n"); - data->gain = error ? 0xffff : get_unaligned_be16(&response.fap.params[0]); - /* ignore boost value at response.fap.params[2] */ - /* init the hardware command queue */ atomic_set(&data->workqueue_size, 0);
- /* initialize with zero autocenter to get wheel in usable state */ - hidpp_ff_set_autocenter(dev, 0); - hid_info(hid, "Force feedback support loaded (firmware release %d).\n", version);
@@ -2732,24 +2696,93 @@ static int k400_connect(struct hid_devic
#define HIDPP_PAGE_G920_FORCE_FEEDBACK 0x8123
-static int g920_get_config(struct hidpp_device *hidpp) +static int g920_ff_set_autocenter(struct hidpp_device *hidpp, + struct hidpp_ff_private_data *data) { + struct hidpp_report response; + u8 params[HIDPP_AUTOCENTER_PARAMS_LENGTH] = { + [1] = HIDPP_FF_EFFECT_SPRING | HIDPP_FF_EFFECT_AUTOSTART, + }; + int ret; + + /* initialize with zero autocenter to get wheel in usable state */ + + dbg_hid("Setting autocenter to 0.\n"); + ret = hidpp_send_fap_command_sync(hidpp, data->feature_index, + HIDPP_FF_DOWNLOAD_EFFECT, + params, ARRAY_SIZE(params), + &response); + if (ret) + hid_warn(hidpp->hid_dev, "Failed to autocenter device!\n"); + else + data->slot_autocenter = response.fap.params[0]; + + return ret; +} + +static int g920_get_config(struct hidpp_device *hidpp, + struct hidpp_ff_private_data *data) +{ + struct hidpp_report response; u8 feature_type; - u8 feature_index; int ret;
+ memset(data, 0, sizeof(*data)); + /* Find feature and store for later use */ ret = hidpp_root_get_feature(hidpp, HIDPP_PAGE_G920_FORCE_FEEDBACK, - &feature_index, &feature_type); + &data->feature_index, &feature_type); if (ret) return ret;
- ret = hidpp_ff_init(hidpp, feature_index); + /* Read number of slots available in device */ + ret = hidpp_send_fap_command_sync(hidpp, data->feature_index, + HIDPP_FF_GET_INFO, + NULL, 0, + &response); + if (ret) { + if (ret < 0) + return ret; + hid_err(hidpp->hid_dev, + "%s: received protocol error 0x%02x\n", __func__, ret); + return -EPROTO; + } + + data->num_effects = response.fap.params[0] - HIDPP_FF_RESERVED_SLOTS; + + /* reset all forces */ + ret = hidpp_send_fap_command_sync(hidpp, data->feature_index, + HIDPP_FF_RESET_ALL, + NULL, 0, + &response); if (ret) - hid_warn(hidpp->hid_dev, "Unable to initialize force feedback support, errno %d\n", - ret); + hid_warn(hidpp->hid_dev, "Failed to reset all forces!\n");
- return 0; + ret = hidpp_send_fap_command_sync(hidpp, data->feature_index, + HIDPP_FF_GET_APERTURE, + NULL, 0, + &response); + if (ret) { + hid_warn(hidpp->hid_dev, + "Failed to read range from device!\n"); + } + data->range = ret ? + 900 : get_unaligned_be16(&response.fap.params[0]); + + /* Read the current gain values */ + ret = hidpp_send_fap_command_sync(hidpp, data->feature_index, + HIDPP_FF_GET_GLOBAL_GAINS, + NULL, 0, + &response); + if (ret) + hid_warn(hidpp->hid_dev, + "Failed to read gain values from device!\n"); + data->gain = ret ? + 0xffff : get_unaligned_be16(&response.fap.params[0]); + + /* ignore boost value at response.fap.params[2] */ + + return g920_ff_set_autocenter(hidpp, data); }
/* -------------------------------------------------------------------------- */ @@ -3512,6 +3545,7 @@ static int hidpp_probe(struct hid_device int ret; bool connected; unsigned int connect_mask = HID_CONNECT_DEFAULT; + struct hidpp_ff_private_data data;
/* report_fixup needs drvdata to be set before we call hid_parse */ hidpp = devm_kzalloc(&hdev->dev, sizeof(*hidpp), GFP_KERNEL); @@ -3621,7 +3655,7 @@ static int hidpp_probe(struct hid_device if (ret) goto hid_hw_init_fail; } else if (connected && (hidpp->quirks & HIDPP_QUIRK_CLASS_G920)) { - ret = g920_get_config(hidpp); + ret = g920_get_config(hidpp, &data); if (ret) goto hid_hw_init_fail; } @@ -3643,6 +3677,14 @@ static int hidpp_probe(struct hid_device goto hid_hw_start_fail; }
+ if (hidpp->quirks & HIDPP_QUIRK_CLASS_G920) { + ret = hidpp_ff_init(hidpp, &data); + if (ret) + hid_warn(hidpp->hid_dev, + "Unable to initialize force feedback support, errno %d\n", + ret); + } + return ret;
hid_hw_init_fail:
From: Andrey Smirnov andrew.smirnov@gmail.com
commit 905d754c53a522aacf806ea1d3e7c929148c1910 upstream.
G920 device only advertises REPORT_ID_HIDPP_LONG and REPORT_ID_HIDPP_VERY_LONG in its HID report descriptor, so querying for REPORT_ID_HIDPP_SHORT with optional=false will always fail and prevent G920 to be recognized as a valid HID++ device.
To fix this and improve some other aspects, modify hidpp_validate_device() as follows:
- Inline the code of hidpp_validate_report() to simplify distingushing between non-present and invalid report descriptors
- Drop the check for id >= HID_MAX_IDS || id < 0 since all of our IDs are static and known to satisfy that at compile time
- Change the algorithms to check all possible report types (including very long report) and deem the device as a valid HID++ device if it supports at least one
- Treat invalid report length as a hard stop for the validation algorithm, meaning that if any of the supported reports has invalid length we assume the worst and treat the device as a generic HID device.
- Fold initialization of hidpp->very_long_report_length into hidpp_validate_device() since it already fetches very long report length and validates its value
Fixes: fe3ee1ec007b ("HID: logitech-hidpp: allow non HID++ devices to be handled by this module") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204191 Reported-by: Sam Bazely sambazley@fastmail.com Signed-off-by: Andrey Smirnov andrew.smirnov@gmail.com Cc: Jiri Kosina jikos@kernel.org Cc: Benjamin Tissoires benjamin.tissoires@redhat.com Cc: Henrik Rydberg rydberg@bitmath.org Cc: Pierre-Loup A. Griffais pgriffais@valvesoftware.com Cc: Austin Palmer austinp@valvesoftware.com Cc: linux-input@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # 5.2+ Signed-off-by: Benjamin Tissoires benjamin.tissoires@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hid/hid-logitech-hidpp.c | 54 +++++++++++++++++++++------------------ 1 file changed, 30 insertions(+), 24 deletions(-)
--- a/drivers/hid/hid-logitech-hidpp.c +++ b/drivers/hid/hid-logitech-hidpp.c @@ -3498,34 +3498,45 @@ static int hidpp_get_report_length(struc return report->field[0]->report_count + 1; }
-static bool hidpp_validate_report(struct hid_device *hdev, int id, - int expected_length, bool optional) +static bool hidpp_validate_device(struct hid_device *hdev) { - int report_length; + struct hidpp_device *hidpp = hid_get_drvdata(hdev); + int id, report_length, supported_reports = 0; + + id = REPORT_ID_HIDPP_SHORT; + report_length = hidpp_get_report_length(hdev, id); + if (report_length) { + if (report_length < HIDPP_REPORT_SHORT_LENGTH) + goto bad_device;
- if (id >= HID_MAX_IDS || id < 0) { - hid_err(hdev, "invalid HID report id %u\n", id); - return false; + supported_reports++; }
+ id = REPORT_ID_HIDPP_LONG; report_length = hidpp_get_report_length(hdev, id); - if (!report_length) - return optional; + if (report_length) { + if (report_length < HIDPP_REPORT_LONG_LENGTH) + goto bad_device;
- if (report_length < expected_length) { - hid_warn(hdev, "not enough values in hidpp report %d\n", id); - return false; + supported_reports++; }
- return true; -} + id = REPORT_ID_HIDPP_VERY_LONG; + report_length = hidpp_get_report_length(hdev, id); + if (report_length) { + if (report_length < HIDPP_REPORT_LONG_LENGTH || + report_length > HIDPP_REPORT_VERY_LONG_MAX_LENGTH) + goto bad_device;
-static bool hidpp_validate_device(struct hid_device *hdev) -{ - return hidpp_validate_report(hdev, REPORT_ID_HIDPP_SHORT, - HIDPP_REPORT_SHORT_LENGTH, false) && - hidpp_validate_report(hdev, REPORT_ID_HIDPP_LONG, - HIDPP_REPORT_LONG_LENGTH, true); + supported_reports++; + hidpp->very_long_report_length = report_length; + } + + return supported_reports; + +bad_device: + hid_warn(hdev, "not enough values in hidpp report %d\n", id); + return false; }
static bool hidpp_application_equals(struct hid_device *hdev, @@ -3572,11 +3583,6 @@ static int hidpp_probe(struct hid_device return hid_hw_start(hdev, HID_CONNECT_DEFAULT); }
- hidpp->very_long_report_length = - hidpp_get_report_length(hdev, REPORT_ID_HIDPP_VERY_LONG); - if (hidpp->very_long_report_length > HIDPP_REPORT_VERY_LONG_MAX_LENGTH) - hidpp->very_long_report_length = HIDPP_REPORT_VERY_LONG_MAX_LENGTH; - if (id->group == HID_GROUP_LOGITECH_DJ_DEVICE) hidpp->quirks |= HIDPP_QUIRK_UNIFYING;
From: Andrey Smirnov andrew.smirnov@gmail.com
commit 08c453f6d073f069cf8e30e03cd3c16262c9b953 upstream.
All of the FF-related resources belong to corresponding FF device, so they should be freed as a part of hidpp_ff_destroy() to avoid potential race condidions.
Fixes: ff21a635dd1a ("HID: logitech-hidpp: Force feedback support for the Logitech G920") Suggested-by: Benjamin Tissoires benjamin.tissoires@redhat.com Signed-off-by: Andrey Smirnov andrew.smirnov@gmail.com Cc: Jiri Kosina jikos@kernel.org Cc: Benjamin Tissoires benjamin.tissoires@redhat.com Cc: Henrik Rydberg rydberg@bitmath.org Cc: Pierre-Loup A. Griffais pgriffais@valvesoftware.com Cc: Austin Palmer austinp@valvesoftware.com Cc: linux-input@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # 5.2+ Signed-off-by: Benjamin Tissoires benjamin.tissoires@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/hid/hid-logitech-hidpp.c | 33 +++++---------------------------- 1 file changed, 5 insertions(+), 28 deletions(-)
--- a/drivers/hid/hid-logitech-hidpp.c +++ b/drivers/hid/hid-logitech-hidpp.c @@ -2078,7 +2078,12 @@ static DEVICE_ATTR(range, S_IRUSR | S_IW static void hidpp_ff_destroy(struct ff_device *ff) { struct hidpp_ff_private_data *data = ff->private; + struct hid_device *hid = data->hidpp->hid_dev;
+ hid_info(hid, "Unloading HID++ force feedback.\n"); + + device_remove_file(&hid->dev, &dev_attr_range); + destroy_workqueue(data->wq); kfree(data->effect_ids); }
@@ -2170,31 +2175,6 @@ static int hidpp_ff_init(struct hidpp_de return 0; }
-static int hidpp_ff_deinit(struct hid_device *hid) -{ - struct hid_input *hidinput = list_entry(hid->inputs.next, struct hid_input, list); - struct input_dev *dev = hidinput->input; - struct hidpp_ff_private_data *data; - - if (!dev) { - hid_err(hid, "Struct input_dev not found!\n"); - return -EINVAL; - } - - hid_info(hid, "Unloading HID++ force feedback.\n"); - data = dev->ff->private; - if (!data) { - hid_err(hid, "Private data not found!\n"); - return -EINVAL; - } - - destroy_workqueue(data->wq); - device_remove_file(&hid->dev, &dev_attr_range); - - return 0; -} - - /* ************************************************************************** */ /* */ /* Device Support */ @@ -3713,9 +3693,6 @@ static void hidpp_remove(struct hid_devi
sysfs_remove_group(&hdev->dev.kobj, &ps_attribute_group);
- if (hidpp->quirks & HIDPP_QUIRK_CLASS_G920) - hidpp_ff_deinit(hdev); - hid_hw_stop(hdev); cancel_work_sync(&hidpp->work); mutex_destroy(&hidpp->send_mutex);
From: Anton Ivanov anton.ivanov@cambridgegreys.com
commit d848074b2f1eb11a38691285f7366bce83087014 upstream.
Fixes crashes due to ubd requeue logic conflicting with the block-mq logic. Crash is reproducible in 5.0 - 5.3.
Fixes: 53766defb8c8 ("um: Clean-up command processing in UML UBD driver") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Anton Ivanov anton.ivanov@cambridgegreys.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/um/drivers/ubd_kern.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/arch/um/drivers/ubd_kern.c +++ b/arch/um/drivers/ubd_kern.c @@ -1403,8 +1403,12 @@ static blk_status_t ubd_queue_rq(struct
spin_unlock_irq(&ubd_dev->lock);
- if (ret < 0) - blk_mq_requeue_request(req, true); + if (ret < 0) { + if (ret == -ENOMEM) + res = BLK_STS_RESOURCE; + else + res = BLK_STS_DEV_RESOURCE; + }
return res; }
From: Ilya Leoshkevich iii@linux.ibm.com
commit a1d863ac3e1085e1fea9caafd87252d08731de2e upstream.
unwind_for_each_frame stops after the first frame if regs->gprs[15] <= sp.
The reason is that in case regs are specified, the first frame should be regs->psw.addr and the second frame should be sp->gprs[8]. However, currently the second frame is regs->gprs[15], which confuses outside_of_stack().
Fix by introducing a flag to distinguish this special case from unwinding the interrupt handler, for which the current behavior is appropriate.
Fixes: 78c98f907413 ("s390/unwind: introduce stack unwind API") Signed-off-by: Ilya Leoshkevich iii@linux.ibm.com Cc: stable@vger.kernel.org # v5.2+ Reviewed-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/include/asm/unwind.h | 1 + arch/s390/kernel/unwind_bc.c | 18 +++++++++++++----- 2 files changed, 14 insertions(+), 5 deletions(-)
--- a/arch/s390/include/asm/unwind.h +++ b/arch/s390/include/asm/unwind.h @@ -35,6 +35,7 @@ struct unwind_state { struct task_struct *task; struct pt_regs *regs; unsigned long sp, ip; + bool reuse_sp; int graph_idx; bool reliable; bool error; --- a/arch/s390/kernel/unwind_bc.c +++ b/arch/s390/kernel/unwind_bc.c @@ -46,10 +46,15 @@ bool unwind_next_frame(struct unwind_sta
regs = state->regs; if (unlikely(regs)) { - sp = READ_ONCE_NOCHECK(regs->gprs[15]); - if (unlikely(outside_of_stack(state, sp))) { - if (!update_stack_info(state, sp)) - goto out_err; + if (state->reuse_sp) { + sp = state->sp; + state->reuse_sp = false; + } else { + sp = READ_ONCE_NOCHECK(regs->gprs[15]); + if (unlikely(outside_of_stack(state, sp))) { + if (!update_stack_info(state, sp)) + goto out_err; + } } sf = (struct stack_frame *) sp; ip = READ_ONCE_NOCHECK(sf->gprs[8]); @@ -107,9 +112,9 @@ void __unwind_start(struct unwind_state { struct stack_info *info = &state->stack_info; unsigned long *mask = &state->stack_mask; + bool reliable, reuse_sp; struct stack_frame *sf; unsigned long ip; - bool reliable;
memset(state, 0, sizeof(*state)); state->task = task; @@ -134,10 +139,12 @@ void __unwind_start(struct unwind_state if (regs) { ip = READ_ONCE_NOCHECK(regs->psw.addr); reliable = true; + reuse_sp = true; } else { sf = (struct stack_frame *) sp; ip = READ_ONCE_NOCHECK(sf->gprs[8]); reliable = false; + reuse_sp = false; }
#ifdef CONFIG_FUNCTION_GRAPH_TRACER @@ -151,5 +158,6 @@ void __unwind_start(struct unwind_state state->sp = sp; state->ip = ip; state->reliable = reliable; + state->reuse_sp = reuse_sp; } EXPORT_SYMBOL_GPL(__unwind_start);
From: Yihui ZENG yzeng56@asu.edu
commit b8e51a6a9db94bc1fb18ae831b3dab106b5a4b5f upstream.
The problem is that we were putting the NUL terminator too far:
buf[sizeof(buf) - 1] = '\0';
If the user input isn't NUL terminated and they haven't initialized the whole buffer then it leads to an info leak. The NUL terminator should be:
buf[len - 1] = '\0';
Signed-off-by: Yihui Zeng yzeng56@asu.edu Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter dan.carpenter@oracle.com [heiko.carstens@de.ibm.com: keep semantics of how *lenp and *ppos are handled] Signed-off-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/mm/cmm.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/arch/s390/mm/cmm.c +++ b/arch/s390/mm/cmm.c @@ -298,16 +298,16 @@ static int cmm_timeout_handler(struct ct }
if (write) { - len = *lenp; - if (copy_from_user(buf, buffer, - len > sizeof(buf) ? sizeof(buf) : len)) + len = min(*lenp, sizeof(buf)); + if (copy_from_user(buf, buffer, len)) return -EFAULT; - buf[sizeof(buf) - 1] = '\0'; + buf[len - 1] = '\0'; cmm_skip_blanks(buf, &p); nr = simple_strtoul(p, &p, 0); cmm_skip_blanks(p, &p); seconds = simple_strtoul(p, &p, 0); cmm_set_timeout(nr, seconds); + *ppos += *lenp; } else { len = sprintf(buf, "%ld %ld\n", cmm_timeout_pages, cmm_timeout_seconds); @@ -315,9 +315,9 @@ static int cmm_timeout_handler(struct ct len = *lenp; if (copy_to_user(buffer, buf, len)) return -EFAULT; + *lenp = len; + *ppos += len; } - *lenp = len; - *ppos += len; return 0; }
From: Heiko Carstens heiko.carstens@de.ibm.com
commit 3d7efa4edd07be5c5c3ffa95ba63e97e070e1f3f upstream.
The idle time reported in /proc/stat sometimes incorrectly contains huge values on s390. This is caused by a bug in arch_cpu_idle_time().
The kernel tries to figure out when a different cpu entered idle by accessing its per-cpu data structure. There is an ordering problem: if the remote cpu has an idle_enter value which is not zero, and an idle_exit value which is zero, it is assumed it is idle since "now". The "now" timestamp however is taken before the idle_enter value is read.
Which in turn means that "now" can be smaller than idle_enter of the remote cpu. Unconditionally subtracting idle_enter from "now" can thus lead to a negative value (aka large unsigned value).
Fix this by moving the get_tod_clock() invocation out of the loop. While at it also make the code a bit more readable.
A similar bug also exists for show_idle_time(). Fix this is as well.
Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens heiko.carstens@de.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/s390/kernel/idle.c | 29 ++++++++++++++++++++++------- 1 file changed, 22 insertions(+), 7 deletions(-)
--- a/arch/s390/kernel/idle.c +++ b/arch/s390/kernel/idle.c @@ -69,18 +69,26 @@ DEVICE_ATTR(idle_count, 0444, show_idle_ static ssize_t show_idle_time(struct device *dev, struct device_attribute *attr, char *buf) { + unsigned long long now, idle_time, idle_enter, idle_exit, in_idle; struct s390_idle_data *idle = &per_cpu(s390_idle, dev->id); - unsigned long long now, idle_time, idle_enter, idle_exit; unsigned int seq;
do { - now = get_tod_clock(); seq = read_seqcount_begin(&idle->seqcount); idle_time = READ_ONCE(idle->idle_time); idle_enter = READ_ONCE(idle->clock_idle_enter); idle_exit = READ_ONCE(idle->clock_idle_exit); } while (read_seqcount_retry(&idle->seqcount, seq)); - idle_time += idle_enter ? ((idle_exit ? : now) - idle_enter) : 0; + in_idle = 0; + now = get_tod_clock(); + if (idle_enter) { + if (idle_exit) { + in_idle = idle_exit - idle_enter; + } else if (now > idle_enter) { + in_idle = now - idle_enter; + } + } + idle_time += in_idle; return sprintf(buf, "%llu\n", idle_time >> 12); } DEVICE_ATTR(idle_time_us, 0444, show_idle_time, NULL); @@ -88,17 +96,24 @@ DEVICE_ATTR(idle_time_us, 0444, show_idl u64 arch_cpu_idle_time(int cpu) { struct s390_idle_data *idle = &per_cpu(s390_idle, cpu); - unsigned long long now, idle_enter, idle_exit; + unsigned long long now, idle_enter, idle_exit, in_idle; unsigned int seq;
do { - now = get_tod_clock(); seq = read_seqcount_begin(&idle->seqcount); idle_enter = READ_ONCE(idle->clock_idle_enter); idle_exit = READ_ONCE(idle->clock_idle_exit); } while (read_seqcount_retry(&idle->seqcount, seq)); - - return cputime_to_nsecs(idle_enter ? ((idle_exit ?: now) - idle_enter) : 0); + in_idle = 0; + now = get_tod_clock(); + if (idle_enter) { + if (idle_exit) { + in_idle = idle_exit - idle_enter; + } else if (now > idle_enter) { + in_idle = now - idle_enter; + } + } + return cputime_to_nsecs(in_idle); }
void arch_cpu_idle_enter(void)
From: Alexey Brodkin Alexey.Brodkin@synopsys.com
commit 5effc09c4907901f0e71e68e5f2e14211d9a203f upstream.
8-letter strings representing ARC perf events are stores in two 32-bit registers as ASCII characters like that: "IJMP", "IALL", "IJMPTAK" etc.
And the same order of bytes in the word is used regardless CPU endianness.
Which means in case of big-endian CPU core we need to swap bytes to get the same order as if it was on little-endian CPU.
Otherwise we're seeing the following error message on boot: ------------------------->8---------------------- ARC perf : 8 counters (32 bits), 40 conditions, [overflow IRQ support] sysfs: cannot create duplicate filename '/devices/arc_pct/events/pmji' CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3 Stack Trace: arc_unwind_core+0xd4/0xfc dump_stack+0x64/0x80 sysfs_warn_dup+0x46/0x58 sysfs_add_file_mode_ns+0xb2/0x168 create_files+0x70/0x2a0 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at kernel/events/core.c:12144 perf_event_sysfs_init+0x70/0xa0 Failed to register pmu: arc_pct, reason -17 Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3 Stack Trace: arc_unwind_core+0xd4/0xfc dump_stack+0x64/0x80 __warn+0x9c/0xd4 warn_slowpath_fmt+0x22/0x2c perf_event_sysfs_init+0x70/0xa0 ---[ end trace a75fb9a9837bd1ec ]--- ------------------------->8----------------------
What happens here we're trying to register more than one raw perf event with the same name "PMJI". Why? Because ARC perf events are 4 to 8 letters and encoded into two 32-bit words. In this particular case we deal with 2 events: * "IJMP____" which counts all jump & branch instructions * "IJMPC___" which counts only conditional jumps & branches
Those strings are split in two 32-bit words this way "IJMP" + "____" & "IJMP" + "C___" correspondingly. Now if we read them swapped due to CPU core being big-endian then we read "PMJI" + "____" & "PMJI" + "___C".
And since we interpret read array of ASCII letters as a null-terminated string on big-endian CPU we end up with 2 events of the same name "PMJI".
Signed-off-by: Alexey Brodkin abrodkin@synopsys.com Cc: stable@vger.kernel.org Signed-off-by: Vineet Gupta vgupta@synopsys.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arc/kernel/perf_event.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arc/kernel/perf_event.c +++ b/arch/arc/kernel/perf_event.c @@ -614,8 +614,8 @@ static int arc_pmu_device_probe(struct p /* loop thru all available h/w condition indexes */ for (i = 0; i < cc_bcr.c; i++) { write_aux_reg(ARC_REG_CC_INDEX, i); - cc_name.indiv.word0 = read_aux_reg(ARC_REG_CC_NAME0); - cc_name.indiv.word1 = read_aux_reg(ARC_REG_CC_NAME1); + cc_name.indiv.word0 = le32_to_cpu(read_aux_reg(ARC_REG_CC_NAME0)); + cc_name.indiv.word1 = le32_to_cpu(read_aux_reg(ARC_REG_CC_NAME1));
arc_pmu_map_hw_event(i, cc_name.str); arc_pmu_add_raw_event_attr(i, cc_name.str);
From: Kaike Wan kaike.wan@intel.com
commit 9ed5bd7d22241ad232fd3a5be404e83eb6cadc04 upstream.
A TID RDMA READ request could be retried under one of the following conditions: - The RC retry timer expires; - A later TID RDMA READ RESP packet is received before the next expected one. For the latter, under normal conditions, the PSN in IB space is used for comparison. More specifically, the IB PSN in the incoming TID RDMA READ RESP packet is compared with the last IB PSN of a given TID RDMA READ request to determine if the request should be retried. This is similar to the retry logic for noraml RDMA READ request.
However, if a TID RDMA READ RESP packet is lost due to congestion, header suppresion will be disabled and each incoming packet will raise an interrupt until the hardware flow is reloaded. Under this condition, each packet KDETH PSN will be checked by software against r_next_psn and a retry will be requested if the packet KDETH PSN is later than r_next_psn. Since each TID RDMA READ segment could have up to 64 packets and each TID RDMA READ request could have many segments, we could make far more retries under such conditions, and thus leading to RETRY_EXC_ERR status.
This patch fixes the issue by removing the retry when the incoming packet KDETH PSN is later than r_next_psn. Instead, it resorts to RC timer and normal IB PSN comparison for any request retry.
Fixes: 9905bf06e890 ("IB/hfi1: Add functions to receive TID RDMA READ response") Cc: stable@vger.kernel.org Reviewed-by: Mike Marciniszyn mike.marciniszyn@intel.com Signed-off-by: Kaike Wan kaike.wan@intel.com Signed-off-by: Dennis Dalessandro dennis.dalessandro@intel.com Link: https://lore.kernel.org/r/20191004204035.26542.41684.stgit@awfm-01.aw.intel.... Signed-off-by: Doug Ledford dledford@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/infiniband/hw/hfi1/tid_rdma.c | 5 ----- 1 file changed, 5 deletions(-)
--- a/drivers/infiniband/hw/hfi1/tid_rdma.c +++ b/drivers/infiniband/hw/hfi1/tid_rdma.c @@ -2728,11 +2728,6 @@ static bool handle_read_kdeth_eflags(str diff = cmp_psn(psn, flow->flow_state.r_next_psn); if (diff > 0) { - if (!(qp->r_flags & RVT_R_RDMAR_SEQ)) - restart_tid_rdma_read_req(rcd, - qp, - wqe); - /* Drop the packet.*/ goto s_unlock; } else if (diff < 0) {
From: Catalin Marinas catalin.marinas@arm.com
commit aa57157be69fb599bd4c38a4b75c5aad74a60ec0 upstream.
Shared and writable mappings (__S.1.) should be clean (!dirty) initially and made dirty on a subsequent write either through the hardware DBM (dirty bit management) mechanism or through a write page fault. A clean pte for the arm64 kernel is one that has PTE_RDONLY set and PTE_DIRTY clear.
The PAGE_SHARED{,_EXEC} attributes have PTE_WRITE set (PTE_DBM) and PTE_DIRTY clear. Prior to commit 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()"), it was the responsibility of set_pte_at() to set the PTE_RDONLY bit and mark the pte clean if the software PTE_DIRTY bit was not set. However, the above commit removed the pte_sw_dirty() check and the subsequent setting of PTE_RDONLY in set_pte_at() while leaving the PAGE_SHARED{,_EXEC} definitions unchanged. The result is that shared+writable mappings are now dirty by default
Fix the above by explicitly setting PTE_RDONLY in PAGE_SHARED{,_EXEC}. In addition, remove the superfluous PTE_DIRTY bit from the kernel PROT_* attributes.
Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") Cc: stable@vger.kernel.org # 4.14.x- Cc: Will Deacon will@kernel.org Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/include/asm/pgtable-prot.h | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-)
--- a/arch/arm64/include/asm/pgtable-prot.h +++ b/arch/arm64/include/asm/pgtable-prot.h @@ -32,11 +32,11 @@ #define PROT_DEFAULT (_PROT_DEFAULT | PTE_MAYBE_NG) #define PROT_SECT_DEFAULT (_PROT_SECT_DEFAULT | PMD_MAYBE_NG)
-#define PROT_DEVICE_nGnRnE (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRnE)) -#define PROT_DEVICE_nGnRE (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRE)) -#define PROT_NORMAL_NC (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_NC)) -#define PROT_NORMAL_WT (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_WT)) -#define PROT_NORMAL (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL)) +#define PROT_DEVICE_nGnRnE (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRnE)) +#define PROT_DEVICE_nGnRE (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRE)) +#define PROT_NORMAL_NC (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_NC)) +#define PROT_NORMAL_WT (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_WT)) +#define PROT_NORMAL (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL))
#define PROT_SECT_DEVICE_nGnRE (PROT_SECT_DEFAULT | PMD_SECT_PXN | PMD_SECT_UXN | PMD_ATTRINDX(MT_DEVICE_nGnRE)) #define PROT_SECT_NORMAL (PROT_SECT_DEFAULT | PMD_SECT_PXN | PMD_SECT_UXN | PMD_ATTRINDX(MT_NORMAL)) @@ -80,8 +80,9 @@ #define PAGE_S2_DEVICE __pgprot(_PROT_DEFAULT | PAGE_S2_MEMATTR(DEVICE_nGnRE) | PTE_S2_RDONLY | PAGE_S2_XN)
#define PAGE_NONE __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN) -#define PAGE_SHARED __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE) -#define PAGE_SHARED_EXEC __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_WRITE) +/* shared+writable pages are clean by default, hence PTE_RDONLY|PTE_WRITE */ +#define PAGE_SHARED __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE) +#define PAGE_SHARED_EXEC __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_WRITE) #define PAGE_READONLY __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN) #define PAGE_READONLY_EXEC __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_RDONLY | PTE_NG | PTE_PXN) #define PAGE_EXECONLY __pgprot(_PAGE_DEFAULT | PTE_RDONLY | PTE_NG | PTE_PXN)
From: Bjorn Andersson bjorn.andersson@linaro.org
commit d4af3c4b81f4cd5662baa6f1492f998d89783318 upstream.
With the introduction of 'cce360b54ce6 ("arm64: capabilities: Filter the entries based on a given mask")' the Qualcomm Falkor/Kryo errata 1003 is no long applied.
The result of not applying errata 1003 is that MSM8996 runs into various RCU stalls and fails to boot most of the times.
Give 1003 a "type" to ensure they are not filtered out in update_cpu_capabilities().
Fixes: cce360b54ce6 ("arm64: capabilities: Filter the entries based on a given mask") Cc: stable@vger.kernel.org Reported-by: Mark Brown broonie@kernel.org Suggested-by: Will Deacon will@kernel.org Signed-off-by: Bjorn Andersson bjorn.andersson@linaro.org Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/arm64/kernel/cpu_errata.c | 1 + 1 file changed, 1 insertion(+)
--- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -816,6 +816,7 @@ const struct arm64_cpu_capabilities arm6 { .desc = "Qualcomm Technologies Falkor/Kryo erratum 1003", .capability = ARM64_WORKAROUND_QCOM_FALKOR_E1003, + .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM, .matches = cpucap_multi_entry_cap_matches, .match_list = qcom_erratum_1003_list, },
From: Marvin Liu yong.liu@intel.com
commit 40ce7919d8730f5936da2bc8a21b46bd07db6411 upstream.
When VIRTIO_F_RING_EVENT_IDX is negotiated, virtio devices can use virtqueue_enable_cb_delayed_packed to reduce the number of device interrupts. At the moment, this is the case for virtio-net when the napi_tx module parameter is set to false.
In this case, the virtio driver selects an event offset and expects that the device will send a notification when rolling over the event offset in the ring. However, if this roll-over happens before the event suppression structure update, the notification won't be sent. To address this race condition the driver needs to check wether the device rolled over the offset after updating the event suppression structure.
With VIRTIO_F_RING_PACKED, the virtio driver did this by reading the flags field of the descriptor at the specified offset.
Unfortunately, checking at the event offset isn't reliable: if descriptors are chained (e.g. when INDIRECT is off) not all descriptors are overwritten by the device, so it's possible that the device skipped the specific descriptor driver is checking when writing out used descriptors. If this happens, the driver won't detect the race condition and will incorrectly expect the device to send a notification.
For virtio-net, the result will be a TX queue stall, with the transmission getting blocked forever.
With the packed ring, it isn't easy to find a location which is guaranteed to change upon the roll-over, except the next device descriptor, as described in the spec:
Writes of device and driver descriptors can generally be reordered, but each side (driver and device) are only required to poll (or test) a single location in memory: the next device descriptor after the one they processed previously, in circular order.
while this might be sub-optimal, let's do exactly this for now.
Cc: stable@vger.kernel.org Cc: Jason Wang jasowang@redhat.com Fixes: f51f982682e2a ("virtio_ring: leverage event idx in packed ring") Signed-off-by: Marvin Liu yong.liu@intel.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/virtio/virtio_ring.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -1499,9 +1499,6 @@ static bool virtqueue_enable_cb_delayed_ * counter first before updating event flags. */ virtio_wmb(vq->weak_barriers); - } else { - used_idx = vq->last_used_idx; - wrap_counter = vq->packed.used_wrap_counter; }
if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DISABLE) { @@ -1518,7 +1515,9 @@ static bool virtqueue_enable_cb_delayed_ */ virtio_mb(vq->weak_barriers);
- if (is_used_desc_packed(vq, used_idx, wrap_counter)) { + if (is_used_desc_packed(vq, + vq->last_used_idx, + vq->packed.used_wrap_counter)) { END_USE(vq); return false; }
From: Larry Finger Larry.Finger@lwfinger.net
commit b43f4a169f220e459edf3ea8f8cd3ec4ae7fa82d upstream.
In commit 8020919a9b99 ("mac80211: Properly handle SKB with radiotap only"), buffers whose length is too short cause a WARN_ON(1) to be executed. This change exposed a fault in rtlwifi drivers, which is fixed by regarding packets with skb->len <= FCS_LEN as though they are in error and dropping them. The test is now annotated as likely.
Cc: Stable stable@vger.kernel.org # v5.0+ Signed-off-by: Larry Finger Larry.Finger@lwfinger.net Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/realtek/rtlwifi/pci.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/wireless/realtek/rtlwifi/pci.c +++ b/drivers/net/wireless/realtek/rtlwifi/pci.c @@ -822,7 +822,7 @@ static void _rtl_pci_rx_interrupt(struct hdr = rtl_get_hdr(skb); fc = rtl_get_fc(skb);
- if (!stats.crc && !stats.hwerror) { + if (!stats.crc && !stats.hwerror && (skb->len > FCS_LEN)) { memcpy(IEEE80211_SKB_RXCB(skb), &rx_status, sizeof(rx_status));
@@ -859,6 +859,7 @@ static void _rtl_pci_rx_interrupt(struct _rtl_pci_rx_to_mac80211(hw, skb, rx_status); } } else { + /* drop packets with errors or those too short */ dev_kfree_skb_any(skb); } new_trx_end:
From: Laura Abbott labbott@redhat.com
commit 8c55dedb795be8ec0cf488f98c03a1c2176f7fb1 upstream.
Nicolas Waisman noticed that even though noa_len is checked for a compatible length it's still possible to overrun the buffers of p2pinfo since there's no check on the upper bound of noa_num. Bound noa_num against P2P_MAX_NOA_NUM.
Reported-by: Nicolas Waisman nico@semmle.com Signed-off-by: Laura Abbott labbott@redhat.com Acked-by: Ping-Ke Shih pkshih@realtek.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/realtek/rtlwifi/ps.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/net/wireless/realtek/rtlwifi/ps.c +++ b/drivers/net/wireless/realtek/rtlwifi/ps.c @@ -754,6 +754,9 @@ static void rtl_p2p_noa_ie(struct ieee80 return; } else { noa_num = (noa_len - 2) / 13; + if (noa_num > P2P_MAX_NOA_NUM) + noa_num = P2P_MAX_NOA_NUM; + } noa_index = ie[3]; if (rtlpriv->psc.p2p_ps_info.p2p_ps_mode == @@ -848,6 +851,9 @@ static void rtl_p2p_action_ie(struct iee return; } else { noa_num = (noa_len - 2) / 13; + if (noa_num > P2P_MAX_NOA_NUM) + noa_num = P2P_MAX_NOA_NUM; + } noa_index = ie[3]; if (rtlpriv->psc.p2p_ps_info.p2p_ps_mode ==
From: Paolo Bonzini pbonzini@redhat.com
commit 9167ab79936206118cc60e47dcb926c3489f3bd5 upstream.
VMX already does so if the host has SMEP, in order to support the combination of CR0.WP=1 and CR4.SMEP=1. However, it is perfectly safe to always do so, and in fact VMX already ends up running with EFER.NXE=1 on old processors that lack the "load EFER" controls, because it may help avoiding a slow MSR write. Removing all the conditionals simplifies the code.
SVM does not have similar code, but it should since recent AMD processors do support SMEP. So this patch also makes the code for the two vendors more similar while fixing NPT=0, CR0.WP=1 and CR4.SMEP=1 on AMD processors.
Cc: stable@vger.kernel.org Cc: Joerg Roedel jroedel@suse.de Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/kvm/svm.c | 10 ++++++++-- arch/x86/kvm/vmx/vmx.c | 14 +++----------- 2 files changed, 11 insertions(+), 13 deletions(-)
--- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -736,8 +736,14 @@ static int get_npt_level(struct kvm_vcpu static void svm_set_efer(struct kvm_vcpu *vcpu, u64 efer) { vcpu->arch.efer = efer; - if (!npt_enabled && !(efer & EFER_LMA)) - efer &= ~EFER_LME; + + if (!npt_enabled) { + /* Shadow paging assumes NX to be available. */ + efer |= EFER_NX; + + if (!(efer & EFER_LMA)) + efer &= ~EFER_LME; + }
to_svm(vcpu)->vmcb->save.efer = efer | EFER_SVME; mark_dirty(to_svm(vcpu)->vmcb, VMCB_CR); --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -897,17 +897,9 @@ static bool update_transition_efer(struc u64 guest_efer = vmx->vcpu.arch.efer; u64 ignore_bits = 0;
- if (!enable_ept) { - /* - * NX is needed to handle CR0.WP=1, CR4.SMEP=1. Testing - * host CPUID is more efficient than testing guest CPUID - * or CR4. Host SMEP is anyway a requirement for guest SMEP. - */ - if (boot_cpu_has(X86_FEATURE_SMEP)) - guest_efer |= EFER_NX; - else if (!(guest_efer & EFER_NX)) - ignore_bits |= EFER_NX; - } + /* Shadow paging assumes NX to be available. */ + if (!enable_ept) + guest_efer |= EFER_NX;
/* * LMA and LME handled by hardware; SCE meaningless outside long mode.
From: Jeffrey Hugo jeffrey.l.hugo@gmail.com
commit 7667819385457b4aeb5fac94f67f52ab52cc10d5 upstream.
bam_dma_terminate_all() will leak resources if any of the transactions are committed to the hardware (present in the desc fifo), and not complete. Since bam_dma_terminate_all() does not cause the hardware to be updated, the hardware will still operate on any previously committed transactions. This can cause memory corruption if the memory for the transaction has been reassigned, and will cause a sync issue between the BAM and its client(s).
Fix this by properly updating the hardware in bam_dma_terminate_all().
Fixes: e7c0fe2a5c84 ("dmaengine: add Qualcomm BAM dma driver") Signed-off-by: Jeffrey Hugo jeffrey.l.hugo@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191017152606.34120-1-jeffrey.l.hugo@gmail.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/dma/qcom/bam_dma.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
--- a/drivers/dma/qcom/bam_dma.c +++ b/drivers/dma/qcom/bam_dma.c @@ -694,6 +694,25 @@ static int bam_dma_terminate_all(struct
/* remove all transactions, including active transaction */ spin_lock_irqsave(&bchan->vc.lock, flag); + /* + * If we have transactions queued, then some might be committed to the + * hardware in the desc fifo. The only way to reset the desc fifo is + * to do a hardware reset (either by pipe or the entire block). + * bam_chan_init_hw() will trigger a pipe reset, and also reinit the + * pipe. If the pipe is left disabled (default state after pipe reset) + * and is accessed by a connected hardware engine, a fatal error in + * the BAM will occur. There is a small window where this could happen + * with bam_chan_init_hw(), but it is assumed that the caller has + * stopped activity on any attached hardware engine. Make sure to do + * this first so that the BAM hardware doesn't cause memory corruption + * by accessing freed resources. + */ + if (!list_empty(&bchan->desc_list)) { + async_desc = list_first_entry(&bchan->desc_list, + struct bam_async_desc, desc_node); + bam_chan_init_hw(bchan, async_desc->dir); + } + list_for_each_entry_safe(async_desc, tmp, &bchan->desc_list, desc_node) { list_add(&async_desc->vd.node, &bchan->vc.desc_issued);
From: Sameer Pujar spujar@nvidia.com
commit 9ec691f48b5ef741a48af8932ccaec859c67e8f1 upstream.
From Tegra186 onwards OUTSTANDING_REQUESTS field is added in channel
configuration register(bits 7:4) which defines the maximum number of reads from the source and writes to the destination that may be outstanding at any given point of time. This field must be programmed with a value between 1 and 8. A value of 0 will prevent any transfers from happening.
Thus added 'has_outstanding_reqs' bool member in chip data structure and is set to false for Tegra210, since the field is not applicable. For Tegra186 it is set to true and channel configuration is updated with maximum outstanding requests.
Fixes: 433de642a76c ("dmaengine: tegra210-adma: add support for Tegra186/Tegra194") Cc: stable@vger.kernel.org Signed-off-by: Sameer Pujar spujar@nvidia.com Acked-by: Jon Hunter jonathanh@nvidia.com Link: https://lore.kernel.org/r/1568626513-16541-1-git-send-email-spujar@nvidia.co... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/dma/tegra210-adma.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/dma/tegra210-adma.c +++ b/drivers/dma/tegra210-adma.c @@ -40,6 +40,7 @@ #define ADMA_CH_CONFIG_MAX_BURST_SIZE 16 #define ADMA_CH_CONFIG_WEIGHT_FOR_WRR(val) ((val) & 0xf) #define ADMA_CH_CONFIG_MAX_BUFS 8 +#define TEGRA186_ADMA_CH_CONFIG_OUTSTANDING_REQS(reqs) (reqs << 4)
#define ADMA_CH_FIFO_CTRL 0x2c #define TEGRA210_ADMA_CH_FIFO_CTRL_OFLWTHRES(val) (((val) & 0xf) << 24) @@ -85,6 +86,7 @@ struct tegra_adma; * @ch_req_tx_shift: Register offset for AHUB transmit channel select. * @ch_req_rx_shift: Register offset for AHUB receive channel select. * @ch_base_offset: Register offset of DMA channel registers. + * @has_outstanding_reqs: If DMA channel can have outstanding requests. * @ch_fifo_ctrl: Default value for channel FIFO CTRL register. * @ch_req_mask: Mask for Tx or Rx channel select. * @ch_req_max: Maximum number of Tx or Rx channels available. @@ -103,6 +105,7 @@ struct tegra_adma_chip_data { unsigned int ch_req_max; unsigned int ch_reg_size; unsigned int nr_channels; + bool has_outstanding_reqs; };
/* @@ -602,6 +605,8 @@ static int tegra_adma_set_xfer_params(st ADMA_CH_CTRL_FLOWCTRL_EN; ch_regs->config |= cdata->adma_get_burst_config(burst_size); ch_regs->config |= ADMA_CH_CONFIG_WEIGHT_FOR_WRR(1); + if (cdata->has_outstanding_reqs) + ch_regs->config |= TEGRA186_ADMA_CH_CONFIG_OUTSTANDING_REQS(8); ch_regs->fifo_ctrl = cdata->ch_fifo_ctrl; ch_regs->tc = desc->period_len & ADMA_CH_TC_COUNT_MASK;
@@ -786,6 +791,7 @@ static const struct tegra_adma_chip_data .ch_req_tx_shift = 28, .ch_req_rx_shift = 24, .ch_base_offset = 0, + .has_outstanding_reqs = false, .ch_fifo_ctrl = TEGRA210_FIFO_CTRL_DEFAULT, .ch_req_mask = 0xf, .ch_req_max = 10, @@ -800,6 +806,7 @@ static const struct tegra_adma_chip_data .ch_req_tx_shift = 27, .ch_req_rx_shift = 22, .ch_base_offset = 0x10000, + .has_outstanding_reqs = true, .ch_fifo_ctrl = TEGRA186_FIFO_CTRL_DEFAULT, .ch_req_mask = 0x1f, .ch_req_max = 20,
From: Robin Gong yibin.gong@nxp.com
commit bd73dfabdda280fc5f05bdec79b6721b4b2f035f upstream.
Illegal memory will be touch if SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V3 (41) exceed the size of structure sdma_script_start_addrs(40), thus cause memory corrupt such as slob block header so that kernel trap into while() loop forever in slob_free(). Please refer to below code piece in imx-sdma.c: for (i = 0; i < sdma->script_number; i++) if (addr_arr[i] > 0) saddr_arr[i] = addr_arr[i]; /* memory corrupt here */ That issue was brought by commit a572460be9cf ("dmaengine: imx-sdma: Add support for version 3 firmware") because SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V3 (38->41 3 scripts added) not align with script number added in sdma_script_start_addrs(2 scripts).
Fixes: a572460be9cf ("dmaengine: imx-sdma: Add support for version 3 firmware") Cc: stable@vger.kernel Link: https://www.spinics.net/lists/arm-kernel/msg754895.html Signed-off-by: Robin Gong yibin.gong@nxp.com Reported-by: Jurgen Lambrecht J.Lambrecht@TELEVIC.com Link: https://lore.kernel.org/r/1569347584-3478-1-git-send-email-yibin.gong@nxp.co... [vkoul: update the patch title] Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/dma/imx-sdma.c | 8 ++++++++ include/linux/platform_data/dma-imx-sdma.h | 3 +++ 2 files changed, 11 insertions(+)
--- a/drivers/dma/imx-sdma.c +++ b/drivers/dma/imx-sdma.c @@ -1707,6 +1707,14 @@ static void sdma_add_scripts(struct sdma if (!sdma->script_number) sdma->script_number = SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V1;
+ if (sdma->script_number > sizeof(struct sdma_script_start_addrs) + / sizeof(s32)) { + dev_err(sdma->dev, + "SDMA script number %d not match with firmware.\n", + sdma->script_number); + return; + } + for (i = 0; i < sdma->script_number; i++) if (addr_arr[i] > 0) saddr_arr[i] = addr_arr[i]; --- a/include/linux/platform_data/dma-imx-sdma.h +++ b/include/linux/platform_data/dma-imx-sdma.h @@ -51,7 +51,10 @@ struct sdma_script_start_addrs { /* End of v2 array */ s32 zcanfd_2_mcu_addr; s32 zqspi_2_mcu_addr; + s32 mcu_2_ecspi_addr; /* End of v3 array */ + s32 mcu_2_zqspi_addr; + /* End of v4 array */ };
/**
From: Tony Lindgren tony@atomide.com
commit bacdcb6675e170bb2e8d3824da220e10274f42a7 upstream.
Yegor Yefremov yegorslists@googlemail.com reported that musb and ftdi uart can fail for the first open of the uart unless connected using a hub.
This is because the first dma call done by musb_ep_program() must wait if cppi41 is PM runtime suspended. Otherwise musb_ep_program() continues with other non-dma packets before the DMA transfer is started causing at least ftdi uarts to fail to receive data.
Let's fix the issue by waking up cppi41 with PM runtime calls added to cppi41_dma_prep_slave_sg() and return NULL if still idled. This way we have musb_ep_program() continue with PIO until cppi41 is awake.
Fixes: fdea2d09b997 ("dmaengine: cppi41: Add basic PM runtime support") Reported-by: Yegor Yefremov yegorslists@googlemail.com Signed-off-by: Tony Lindgren tony@atomide.com Cc: stable@vger.kernel.org # v4.9+ Link: https://lore.kernel.org/r/20191023153138.23442-1-tony@atomide.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/dma/ti/cppi41.c | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-)
--- a/drivers/dma/ti/cppi41.c +++ b/drivers/dma/ti/cppi41.c @@ -586,9 +586,22 @@ static struct dma_async_tx_descriptor *c enum dma_transfer_direction dir, unsigned long tx_flags, void *context) { struct cppi41_channel *c = to_cpp41_chan(chan); + struct dma_async_tx_descriptor *txd = NULL; + struct cppi41_dd *cdd = c->cdd; struct cppi41_desc *d; struct scatterlist *sg; unsigned int i; + int error; + + error = pm_runtime_get(cdd->ddev.dev); + if (error < 0) { + pm_runtime_put_noidle(cdd->ddev.dev); + + return NULL; + } + + if (cdd->is_suspended) + goto err_out_not_ready;
d = c->desc; for_each_sg(sgl, sg, sg_len, i) { @@ -611,7 +624,13 @@ static struct dma_async_tx_descriptor *c d++; }
- return &c->txd; + txd = &c->txd; + +err_out_not_ready: + pm_runtime_mark_last_busy(cdd->ddev.dev); + pm_runtime_put_autosuspend(cdd->ddev.dev); + + return txd; }
static void cppi41_compute_td_desc(struct cppi41_desc *d)
From: Alex Deucher alexander.deucher@amd.com
commit 30ef5c7eaba0ddafc6c23eca65ebe52169dfcc60 upstream.
These were not aligned for optimal performance for GPUVM.
Acked-by: Christian König christian.koenig@amd.com Reviewed-by: Tianci Yin tianci.yin@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c | 9 +++++++++ drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c | 9 +++++++++ 2 files changed, 18 insertions(+)
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c @@ -151,6 +151,15 @@ static void gfxhub_v2_0_init_cache_regs( WREG32_SOC15(GC, 0, mmGCVM_L2_CNTL2, tmp);
tmp = mmGCVM_L2_CNTL3_DEFAULT; + if (adev->gmc.translate_further) { + tmp = REG_SET_FIELD(tmp, GCVM_L2_CNTL3, BANK_SELECT, 12); + tmp = REG_SET_FIELD(tmp, GCVM_L2_CNTL3, + L2_CACHE_BIGK_FRAGMENT_SIZE, 9); + } else { + tmp = REG_SET_FIELD(tmp, GCVM_L2_CNTL3, BANK_SELECT, 9); + tmp = REG_SET_FIELD(tmp, GCVM_L2_CNTL3, + L2_CACHE_BIGK_FRAGMENT_SIZE, 6); + } WREG32_SOC15(GC, 0, mmGCVM_L2_CNTL3, tmp);
tmp = mmGCVM_L2_CNTL4_DEFAULT; --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c @@ -137,6 +137,15 @@ static void mmhub_v2_0_init_cache_regs(s WREG32_SOC15(MMHUB, 0, mmMMVM_L2_CNTL2, tmp);
tmp = mmMMVM_L2_CNTL3_DEFAULT; + if (adev->gmc.translate_further) { + tmp = REG_SET_FIELD(tmp, MMVM_L2_CNTL3, BANK_SELECT, 12); + tmp = REG_SET_FIELD(tmp, MMVM_L2_CNTL3, + L2_CACHE_BIGK_FRAGMENT_SIZE, 9); + } else { + tmp = REG_SET_FIELD(tmp, MMVM_L2_CNTL3, BANK_SELECT, 9); + tmp = REG_SET_FIELD(tmp, MMVM_L2_CNTL3, + L2_CACHE_BIGK_FRAGMENT_SIZE, 6); + } WREG32_SOC15(MMHUB, 0, mmMMVM_L2_CNTL3, tmp);
tmp = mmMMVM_L2_CNTL4_DEFAULT;
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit 59cd826fb5e7889515bf5771e295e0624c348571 upstream.
The change to skip the PCH reference initialization during fastboot did end up breaking FDI. To fix that let's try to do the PCH reference init whenever we're disabling a DPLL that was using said reference previously.
Cc: stable@vger.kernel.org Tested-by: Andrija akijo97@gmail.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112084 Fixes: b16c7ed95caf ("drm/i915: Do not touch the PCH SSC reference if a PLL is using it") Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20191022185643.1483-1-ville.sy... Reviewed-by: Imre Deak imre.deak@intel.com (cherry picked from commit dd5279c71405533d4ddbb9453effc60f0f5bf211) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/i915/display/intel_display.c | 11 ++++++----- drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 15 +++++++++++++++ drivers/gpu/drm/i915/i915_drv.h | 2 ++ 3 files changed, 23 insertions(+), 5 deletions(-)
--- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -9186,7 +9186,6 @@ static bool wrpll_uses_pch_ssc(struct dr static void lpt_init_pch_refclk(struct drm_i915_private *dev_priv) { struct intel_encoder *encoder; - bool pch_ssc_in_use = false; bool has_fdi = false;
for_each_intel_encoder(&dev_priv->drm, encoder) { @@ -9214,22 +9213,24 @@ static void lpt_init_pch_refclk(struct d * clock hierarchy. That would also allow us to do * clock bending finally. */ + dev_priv->pch_ssc_use = 0; + if (spll_uses_pch_ssc(dev_priv)) { DRM_DEBUG_KMS("SPLL using PCH SSC\n"); - pch_ssc_in_use = true; + dev_priv->pch_ssc_use |= BIT(DPLL_ID_SPLL); }
if (wrpll_uses_pch_ssc(dev_priv, DPLL_ID_WRPLL1)) { DRM_DEBUG_KMS("WRPLL1 using PCH SSC\n"); - pch_ssc_in_use = true; + dev_priv->pch_ssc_use |= BIT(DPLL_ID_WRPLL1); }
if (wrpll_uses_pch_ssc(dev_priv, DPLL_ID_WRPLL2)) { DRM_DEBUG_KMS("WRPLL2 using PCH SSC\n"); - pch_ssc_in_use = true; + dev_priv->pch_ssc_use |= BIT(DPLL_ID_WRPLL2); }
- if (pch_ssc_in_use) + if (dev_priv->pch_ssc_use) return;
if (has_fdi) { --- a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c +++ b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c @@ -498,16 +498,31 @@ static void hsw_ddi_wrpll_disable(struct val = I915_READ(WRPLL_CTL(id)); I915_WRITE(WRPLL_CTL(id), val & ~WRPLL_PLL_ENABLE); POSTING_READ(WRPLL_CTL(id)); + + /* + * Try to set up the PCH reference clock once all DPLLs + * that depend on it have been shut down. + */ + if (dev_priv->pch_ssc_use & BIT(id)) + intel_init_pch_refclk(dev_priv); }
static void hsw_ddi_spll_disable(struct drm_i915_private *dev_priv, struct intel_shared_dpll *pll) { + enum intel_dpll_id id = pll->info->id; u32 val;
val = I915_READ(SPLL_CTL); I915_WRITE(SPLL_CTL, val & ~SPLL_PLL_ENABLE); POSTING_READ(SPLL_CTL); + + /* + * Try to set up the PCH reference clock once all DPLLs + * that depend on it have been shut down. + */ + if (dev_priv->pch_ssc_use & BIT(id)) + intel_init_pch_refclk(dev_priv); }
static bool hsw_ddi_wrpll_get_hw_state(struct drm_i915_private *dev_priv, --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1881,6 +1881,8 @@ struct drm_i915_private { struct work_struct idle_work; } gem;
+ u8 pch_ssc_use; + /* For i945gm vblank irq vs. C3 workaround */ struct { struct work_struct work;
From: Tianci.Yin tianci.yin@amd.com
commit f52ebe1f888dfae68d7cffabf5ac898f8cb64fb3 upstream.
update registers: mmCGTT_SPI_CLK_CTRL
Reviewed-by: Feifei Xu Feifei.Xu@amd.com Signed-off-by: Tianci.Yin tianci.yin@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -67,7 +67,7 @@ static const struct soc15_reg_golden gol { SOC15_REG_GOLDEN_VALUE(GC, 0, mmCB_HW_CONTROL_4, 0xffffffff, 0x00400014), SOC15_REG_GOLDEN_VALUE(GC, 0, mmCGTT_CPF_CLK_CTRL, 0xfcff8fff, 0xf8000100), - SOC15_REG_GOLDEN_VALUE(GC, 0, mmCGTT_SPI_CLK_CTRL, 0xc0000000, 0xc0000100), + SOC15_REG_GOLDEN_VALUE(GC, 0, mmCGTT_SPI_CLK_CTRL, 0xcd000000, 0x0d000100), SOC15_REG_GOLDEN_VALUE(GC, 0, mmCGTT_SQ_CLK_CTRL, 0x60000ff0, 0x60000100), SOC15_REG_GOLDEN_VALUE(GC, 0, mmCGTT_SQG_CLK_CTRL, 0x40000000, 0x40000100), SOC15_REG_GOLDEN_VALUE(GC, 0, mmCGTT_VGT_CLK_CTRL, 0xffff8fff, 0xffff8100),
From: Pelle van Gils pelle@vangils.xyz
commit e6f4e274c1e52d1f0bfe293fb44ddf59de6c0374 upstream.
The vega10_odn_update_soc_table() function does not allow the SCLK dependent voltage to be set for power-state 7 to a value below the default in pptable. Change the for-loop condition to allow undervolting in the highest state.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205277 Signed-off-by: Pelle van Gils pelle@vangils.xyz Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c @@ -5097,9 +5097,7 @@ static void vega10_odn_update_soc_table(
if (type == PP_OD_EDIT_SCLK_VDDC_TABLE) { podn_vdd_dep = &data->odn_dpm_table.vdd_dep_on_sclk; - for (i = 0; i < podn_vdd_dep->count - 1; i++) - od_vddc_lookup_table->entries[i].us_vdd = podn_vdd_dep->entries[i].vddc; - if (od_vddc_lookup_table->entries[i].us_vdd < podn_vdd_dep->entries[i].vddc) + for (i = 0; i < podn_vdd_dep->count; i++) od_vddc_lookup_table->entries[i].us_vdd = podn_vdd_dep->entries[i].vddc; } else if (type == PP_OD_EDIT_MCLK_VDDC_TABLE) { podn_vdd_dep = &data->odn_dpm_table.vdd_dep_on_mclk;
From: chen gong curry.gong@amd.com
commit e5574f61e9d8274c49e9a5d943abde8e938d57e1 upstream.
VKexample test hang during Occlusion/SDMA/Varia runs. Clear XNACK_WATERMK in reg SDMA0_UTCL1_WATERMK to fix this issue.
Signed-off-by: chen gong curry.gong@amd.com Reviewed-by: Aaron Liu aaron.liu@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c @@ -159,6 +159,7 @@ static const struct soc15_reg_golden gol SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_RLC7_RB_RPTR_ADDR_LO, 0xfffffffd, 0x00000001), SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_RLC7_RB_WPTR_POLL_CNTL, 0xfffffff7, 0x00403000), SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_UTCL1_PAGE, 0x000003ff, 0x000003c0), + SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_UTCL1_WATERMK, 0xfc000000, 0x00000000) };
static const struct soc15_reg_golden golden_settings_sdma1_4_2[] = {
From: Trond Myklebust trondmy@gmail.com
commit 79cc55422ce99be5964bde208ba8557174720893 upstream.
A typo in nfs4_refresh_delegation_stateid() means we're leaking an RCU lock, and always returning a value of 'false'. As the function description states, we were always supposed to return 'true' if a matching delegation was found.
Fixes: 12f275cdd163 ("NFSv4: Retry CLOSE and DELEGRETURN on NFS4ERR_OLD_STATEID.") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/nfs/delegation.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/nfs/delegation.c +++ b/fs/nfs/delegation.c @@ -1181,7 +1181,7 @@ bool nfs4_refresh_delegation_stateid(nfs if (delegation != NULL && nfs4_stateid_match_other(dst, &delegation->stateid)) { dst->seqid = delegation->stateid.seqid; - return ret; + ret = true; } rcu_read_unlock(); out:
From: Jens Axboe axboe@kernel.dk
commit 6873e0bd6a9cb14ecfadd89d9ed9698ff1761902 upstream.
We use io_kiocb->result == -EAGAIN as a way to know if we need to re-submit a polled request, as -EAGAIN reporting happens out-of-line for IO submission failures. This field is cleared when we originally allocate the request, but it isn't reset when we retry the submission from async context. This can cause issues where we think something needs a re-issue, but we're really just reading stale data.
Reset ->result whenever we re-prep a request for polled submission.
Cc: stable@vger.kernel.org Fixes: 9e645e1105ca ("io_uring: add support for sqe links") Reported-by: Bijan Mottahedeh bijan.mottahedeh@oracle.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/io_uring.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1078,6 +1078,7 @@ static int io_prep_rw(struct io_kiocb *r
kiocb->ki_flags |= IOCB_HIPRI; kiocb->ki_complete = io_complete_rw_iopoll; + req->result = 0; } else { if (kiocb->ki_flags & IOCB_HIPRI) return -EINVAL;
On 11/4/2019 1:45 PM, Greg Kroah-Hartman wrote:
From: Jens Axboe axboe@kernel.dk
commit 6873e0bd6a9cb14ecfadd89d9ed9698ff1761902 upstream.
We use io_kiocb->result == -EAGAIN as a way to know if we need to re-submit a polled request, as -EAGAIN reporting happens out-of-line for IO submission failures. This field is cleared when we originally allocate the request, but it isn't reset when we retry the submission from async context. This can cause issues where we think something needs a re-issue, but we're really just reading stale data.
Reset ->result whenever we re-prep a request for polled submission.
Cc: stable@vger.kernel.org Fixes: 9e645e1105ca ("io_uring: add support for sqe links") Reported-by: Bijan Mottahedeh bijan.mottahedeh@oracle.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
fs/io_uring.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1078,6 +1078,7 @@ static int io_prep_rw(struct io_kiocb *r kiocb->ki_flags |= IOCB_HIPRI; kiocb->ki_complete = io_complete_rw_iopoll;
} else { if (kiocb->ki_flags & IOCB_HIPRI) return -EINVAL;req->result = 0;
Reviewd-by: Bijan Mottahedeh bijan.mottahedeh@oracle.com
From: John Donnelly John.P.Donnelly@Oracle.com
commit 160c63f909ffbc797c0bbe23310ac1eaf2349d2f upstream.
This cures a panic on restart after a kexec operation on 5.3 and 5.4 kernels.
The underlying state of the iommu registers (iommu->flags & VTD_FLAG_TRANS_PRE_ENABLED) on a restart results in a domain being marked as "DEFER_DEVICE_DOMAIN_INFO" that produces an Oops in identity_mapping().
[ 43.654737] BUG: kernel NULL pointer dereference, address: 0000000000000056 [ 43.655720] #PF: supervisor read access in kernel mode [ 43.655720] #PF: error_code(0x0000) - not-present page [ 43.655720] PGD 0 P4D 0 [ 43.655720] Oops: 0000 [#1] SMP PTI [ 43.655720] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.3.2-1940.el8uek.x86_64 #1 [ 43.655720] Hardware name: Oracle Corporation ORACLE SERVER X5-2/ASM,MOTHERBOARD,1U, BIOS 30140300 09/20/2018 [ 43.655720] RIP: 0010:iommu_need_mapping+0x29/0xd0 [ 43.655720] Code: 00 0f 1f 44 00 00 48 8b 97 70 02 00 00 48 83 fa ff 74 53 48 8d 4a ff b8 01 00 00 00 48 83 f9 fd 76 01 c3 48 8b 35 7f 58 e0 01 <48> 39 72 58 75 f2 55 48 89 e5 41 54 53 48 8b 87 28 02 00 00 4c 8b [ 43.655720] RSP: 0018:ffffc9000001b9b0 EFLAGS: 00010246 [ 43.655720] RAX: 0000000000000001 RBX: 0000000000001000 RCX: fffffffffffffffd [ 43.655720] RDX: fffffffffffffffe RSI: ffff8880719b8000 RDI: ffff8880477460b0 [ 43.655720] RBP: ffffc9000001b9e8 R08: 0000000000000000 R09: ffff888047c01700 [ 43.655720] R10: 00002194036fc692 R11: 0000000000000000 R12: 0000000000000000 [ 43.655720] R13: ffff8880477460b0 R14: 0000000000000cc0 R15: ffff888072d2b558 [ 43.655720] FS: 0000000000000000(0000) GS:ffff888071c00000(0000) knlGS:0000000000000000 [ 43.655720] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 43.655720] CR2: 0000000000000056 CR3: 000000007440a002 CR4: 00000000001606b0 [ 43.655720] Call Trace: [ 43.655720] ? intel_alloc_coherent+0x2a/0x180 [ 43.655720] ? __schedule+0x2c2/0x650 [ 43.655720] dma_alloc_attrs+0x8c/0xd0 [ 43.655720] dma_pool_alloc+0xdf/0x200 [ 43.655720] ehci_qh_alloc+0x58/0x130 [ 43.655720] ehci_setup+0x287/0x7ba [ 43.655720] ? _dev_info+0x6c/0x83 [ 43.655720] ehci_pci_setup+0x91/0x436 [ 43.655720] usb_add_hcd.cold.48+0x1d4/0x754 [ 43.655720] usb_hcd_pci_probe+0x2bc/0x3f0 [ 43.655720] ehci_pci_probe+0x39/0x40 [ 43.655720] local_pci_probe+0x47/0x80 [ 43.655720] pci_device_probe+0xff/0x1b0 [ 43.655720] really_probe+0xf5/0x3a0 [ 43.655720] driver_probe_device+0xbb/0x100 [ 43.655720] device_driver_attach+0x58/0x60 [ 43.655720] __driver_attach+0x8f/0x150 [ 43.655720] ? device_driver_attach+0x60/0x60 [ 43.655720] bus_for_each_dev+0x74/0xb0 [ 43.655720] driver_attach+0x1e/0x20 [ 43.655720] bus_add_driver+0x151/0x1f0 [ 43.655720] ? ehci_hcd_init+0xb2/0xb2 [ 43.655720] ? do_early_param+0x95/0x95 [ 43.655720] driver_register+0x70/0xc0 [ 43.655720] ? ehci_hcd_init+0xb2/0xb2 [ 43.655720] __pci_register_driver+0x57/0x60 [ 43.655720] ehci_pci_init+0x6a/0x6c [ 43.655720] do_one_initcall+0x4a/0x1fa [ 43.655720] ? do_early_param+0x95/0x95 [ 43.655720] kernel_init_freeable+0x1bd/0x262 [ 43.655720] ? rest_init+0xb0/0xb0 [ 43.655720] kernel_init+0xe/0x110 [ 43.655720] ret_from_fork+0x24/0x50
Fixes: 8af46c784ecfe ("iommu/vt-d: Implement is_attach_deferred iommu ops entry") Cc: stable@vger.kernel.org # v5.3+
Signed-off-by: John Donnelly john.p.donnelly@oracle.com Reviewed-by: Lu Baolu baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/iommu/intel-iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -2783,7 +2783,7 @@ static int identity_mapping(struct devic struct device_domain_info *info;
info = dev->archdata.iommu; - if (info && info != DUMMY_DEVICE_DOMAIN_INFO) + if (info && info != DUMMY_DEVICE_DOMAIN_INFO && info != DEFER_DEVICE_DOMAIN_INFO) return (info->domain == si_domain);
return 0;
From: Sven Eckelmann sven@narfation.org
commit 40e220b4218bb3d278e5e8cc04ccdfd1c7ff8307 upstream.
Each slave interface of an B.A.T.M.A.N. IV virtual interface has an OGM packet buffer which is initialized using data from netdevice notifier and other rtnetlink related hooks. It is sent regularly via various slave interfaces of the batadv virtual interface and in this process also modified (realloced) to integrate additional state information via TVLV containers.
It must be avoided that the worker item is executed without a common lock with the netdevice notifier/rtnetlink helpers. Otherwise it can either happen that half modified/freed data is sent out or functions modifying the OGM buffer try to access already freed memory regions.
Reported-by: syzbot+0cc629f19ccb8534935b@syzkaller.appspotmail.com Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Signed-off-by: Sven Eckelmann sven@narfation.org Signed-off-by: Simon Wunderlich sw@simonwunderlich.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/batman-adv/bat_iv_ogm.c | 61 ++++++++++++++++++++++++++++++++++------ net/batman-adv/hard-interface.c | 2 + net/batman-adv/types.h | 3 + 3 files changed, 57 insertions(+), 9 deletions(-)
--- a/net/batman-adv/bat_iv_ogm.c +++ b/net/batman-adv/bat_iv_ogm.c @@ -22,6 +22,8 @@ #include <linux/kernel.h> #include <linux/kref.h> #include <linux/list.h> +#include <linux/lockdep.h> +#include <linux/mutex.h> #include <linux/netdevice.h> #include <linux/netlink.h> #include <linux/pkt_sched.h> @@ -193,14 +195,18 @@ static int batadv_iv_ogm_iface_enable(st unsigned char *ogm_buff; u32 random_seqno;
+ mutex_lock(&hard_iface->bat_iv.ogm_buff_mutex); + /* randomize initial seqno to avoid collision */ get_random_bytes(&random_seqno, sizeof(random_seqno)); atomic_set(&hard_iface->bat_iv.ogm_seqno, random_seqno);
hard_iface->bat_iv.ogm_buff_len = BATADV_OGM_HLEN; ogm_buff = kmalloc(hard_iface->bat_iv.ogm_buff_len, GFP_ATOMIC); - if (!ogm_buff) + if (!ogm_buff) { + mutex_unlock(&hard_iface->bat_iv.ogm_buff_mutex); return -ENOMEM; + }
hard_iface->bat_iv.ogm_buff = ogm_buff;
@@ -212,35 +218,59 @@ static int batadv_iv_ogm_iface_enable(st batadv_ogm_packet->reserved = 0; batadv_ogm_packet->tq = BATADV_TQ_MAX_VALUE;
+ mutex_unlock(&hard_iface->bat_iv.ogm_buff_mutex); + return 0; }
static void batadv_iv_ogm_iface_disable(struct batadv_hard_iface *hard_iface) { + mutex_lock(&hard_iface->bat_iv.ogm_buff_mutex); + kfree(hard_iface->bat_iv.ogm_buff); hard_iface->bat_iv.ogm_buff = NULL; + + mutex_unlock(&hard_iface->bat_iv.ogm_buff_mutex); }
static void batadv_iv_ogm_iface_update_mac(struct batadv_hard_iface *hard_iface) { struct batadv_ogm_packet *batadv_ogm_packet; - unsigned char *ogm_buff = hard_iface->bat_iv.ogm_buff; + void *ogm_buff;
- batadv_ogm_packet = (struct batadv_ogm_packet *)ogm_buff; + mutex_lock(&hard_iface->bat_iv.ogm_buff_mutex); + + ogm_buff = hard_iface->bat_iv.ogm_buff; + if (!ogm_buff) + goto unlock; + + batadv_ogm_packet = ogm_buff; ether_addr_copy(batadv_ogm_packet->orig, hard_iface->net_dev->dev_addr); ether_addr_copy(batadv_ogm_packet->prev_sender, hard_iface->net_dev->dev_addr); + +unlock: + mutex_unlock(&hard_iface->bat_iv.ogm_buff_mutex); }
static void batadv_iv_ogm_primary_iface_set(struct batadv_hard_iface *hard_iface) { struct batadv_ogm_packet *batadv_ogm_packet; - unsigned char *ogm_buff = hard_iface->bat_iv.ogm_buff; + void *ogm_buff;
- batadv_ogm_packet = (struct batadv_ogm_packet *)ogm_buff; + mutex_lock(&hard_iface->bat_iv.ogm_buff_mutex); + + ogm_buff = hard_iface->bat_iv.ogm_buff; + if (!ogm_buff) + goto unlock; + + batadv_ogm_packet = ogm_buff; batadv_ogm_packet->ttl = BATADV_TTL; + +unlock: + mutex_unlock(&hard_iface->bat_iv.ogm_buff_mutex); }
/* when do we schedule our own ogm to be sent */ @@ -742,7 +772,11 @@ batadv_iv_ogm_slide_own_bcast_window(str } }
-static void batadv_iv_ogm_schedule(struct batadv_hard_iface *hard_iface) +/** + * batadv_iv_ogm_schedule_buff() - schedule submission of hardif ogm buffer + * @hard_iface: interface whose ogm buffer should be transmitted + */ +static void batadv_iv_ogm_schedule_buff(struct batadv_hard_iface *hard_iface) { struct batadv_priv *bat_priv = netdev_priv(hard_iface->soft_iface); unsigned char **ogm_buff = &hard_iface->bat_iv.ogm_buff; @@ -753,9 +787,7 @@ static void batadv_iv_ogm_schedule(struc u16 tvlv_len = 0; unsigned long send_time;
- if (hard_iface->if_status == BATADV_IF_NOT_IN_USE || - hard_iface->if_status == BATADV_IF_TO_BE_REMOVED) - return; + lockdep_assert_held(&hard_iface->bat_iv.ogm_buff_mutex);
/* the interface gets activated here to avoid race conditions between * the moment of activating the interface in @@ -823,6 +855,17 @@ out: batadv_hardif_put(primary_if); }
+static void batadv_iv_ogm_schedule(struct batadv_hard_iface *hard_iface) +{ + if (hard_iface->if_status == BATADV_IF_NOT_IN_USE || + hard_iface->if_status == BATADV_IF_TO_BE_REMOVED) + return; + + mutex_lock(&hard_iface->bat_iv.ogm_buff_mutex); + batadv_iv_ogm_schedule_buff(hard_iface); + mutex_unlock(&hard_iface->bat_iv.ogm_buff_mutex); +} + /** * batadv_iv_orig_ifinfo_sum() - Get bcast_own sum for originator over iterface * @orig_node: originator which reproadcasted the OGMs directly --- a/net/batman-adv/hard-interface.c +++ b/net/batman-adv/hard-interface.c @@ -18,6 +18,7 @@ #include <linux/kref.h> #include <linux/limits.h> #include <linux/list.h> +#include <linux/mutex.h> #include <linux/netdevice.h> #include <linux/printk.h> #include <linux/rculist.h> @@ -929,6 +930,7 @@ batadv_hardif_add_interface(struct net_d INIT_LIST_HEAD(&hard_iface->list); INIT_HLIST_HEAD(&hard_iface->neigh_list);
+ mutex_init(&hard_iface->bat_iv.ogm_buff_mutex); spin_lock_init(&hard_iface->neigh_list_lock); kref_init(&hard_iface->refcount);
--- a/net/batman-adv/types.h +++ b/net/batman-adv/types.h @@ -81,6 +81,9 @@ struct batadv_hard_iface_bat_iv {
/** @ogm_seqno: OGM sequence number - used to identify each OGM */ atomic_t ogm_seqno; + + /** @ogm_buff_mutex: lock protecting ogm_buff and ogm_buff_len */ + struct mutex ogm_buff_mutex; };
/**
From: Eric Biggers ebiggers@google.com
commit c6ee11c39fcc1fb55130748990a8f199e76263b4 upstream.
syzbot reported:
BUG: memory leak unreferenced object 0xffff888116270800 (size 224): comm "syz-executor641", pid 7047, jiffies 4294947360 (age 13.860s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 20 e1 2a 81 88 ff ff 00 40 3d 2a 81 88 ff ff . .*.....@=*.... backtrace: [<000000004d41b4cc>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<000000004d41b4cc>] slab_post_alloc_hook mm/slab.h:439 [inline] [<000000004d41b4cc>] slab_alloc_node mm/slab.c:3269 [inline] [<000000004d41b4cc>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579 [<00000000506a5965>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198 [<000000001ba5a161>] alloc_skb include/linux/skbuff.h:1058 [inline] [<000000001ba5a161>] alloc_skb_with_frags+0x5f/0x250 net/core/skbuff.c:5327 [<0000000047d9c78b>] sock_alloc_send_pskb+0x269/0x2a0 net/core/sock.c:2225 [<000000003828fe54>] sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2242 [<00000000e34d94f9>] llc_ui_sendmsg+0x10a/0x540 net/llc/af_llc.c:933 [<00000000de2de3fb>] sock_sendmsg_nosec net/socket.c:652 [inline] [<00000000de2de3fb>] sock_sendmsg+0x54/0x70 net/socket.c:671 [<000000008fe16e7a>] __sys_sendto+0x148/0x1f0 net/socket.c:1964 [...]
The bug is that llc_sap_state_process() always takes an extra reference to the skb, but sometimes neither llc_sap_next_state() nor llc_sap_state_process() itself drops this reference.
Fix it by changing llc_sap_next_state() to never consume a reference to the skb, rather than sometimes do so and sometimes not. Then remove the extra skb_get() and kfree_skb() from llc_sap_state_process().
Reported-by: syzbot+6bf095f9becf5efef645@syzkaller.appspotmail.com Reported-by: syzbot+31c16aa4202dace3812e@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Jakub Kicinski jakub.kicinski@netronome.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/llc/llc_s_ac.c | 12 +++++++++--- net/llc/llc_sap.c | 23 ++++++++--------------- 2 files changed, 17 insertions(+), 18 deletions(-)
--- a/net/llc/llc_s_ac.c +++ b/net/llc/llc_s_ac.c @@ -58,8 +58,10 @@ int llc_sap_action_send_ui(struct llc_sa ev->daddr.lsap, LLC_PDU_CMD); llc_pdu_init_as_ui_cmd(skb); rc = llc_mac_hdr_init(skb, ev->saddr.mac, ev->daddr.mac); - if (likely(!rc)) + if (likely(!rc)) { + skb_get(skb); rc = dev_queue_xmit(skb); + } return rc; }
@@ -81,8 +83,10 @@ int llc_sap_action_send_xid_c(struct llc ev->daddr.lsap, LLC_PDU_CMD); llc_pdu_init_as_xid_cmd(skb, LLC_XID_NULL_CLASS_2, 0); rc = llc_mac_hdr_init(skb, ev->saddr.mac, ev->daddr.mac); - if (likely(!rc)) + if (likely(!rc)) { + skb_get(skb); rc = dev_queue_xmit(skb); + } return rc; }
@@ -135,8 +139,10 @@ int llc_sap_action_send_test_c(struct ll ev->daddr.lsap, LLC_PDU_CMD); llc_pdu_init_as_test_cmd(skb); rc = llc_mac_hdr_init(skb, ev->saddr.mac, ev->daddr.mac); - if (likely(!rc)) + if (likely(!rc)) { + skb_get(skb); rc = dev_queue_xmit(skb); + } return rc; }
--- a/net/llc/llc_sap.c +++ b/net/llc/llc_sap.c @@ -197,29 +197,22 @@ out: * After executing actions of the event, upper layer will be indicated * if needed(on receiving an UI frame). sk can be null for the * datalink_proto case. + * + * This function always consumes a reference to the skb. */ static void llc_sap_state_process(struct llc_sap *sap, struct sk_buff *skb) { struct llc_sap_state_ev *ev = llc_sap_ev(skb);
- /* - * We have to hold the skb, because llc_sap_next_state - * will kfree it in the sending path and we need to - * look at the skb->cb, where we encode llc_sap_state_ev. - */ - skb_get(skb); ev->ind_cfm_flag = 0; llc_sap_next_state(sap, skb); - if (ev->ind_cfm_flag == LLC_IND) { - if (skb->sk->sk_state == TCP_LISTEN) - kfree_skb(skb); - else { - llc_save_primitive(skb->sk, skb, ev->prim);
- /* queue skb to the user. */ - if (sock_queue_rcv_skb(skb->sk, skb)) - kfree_skb(skb); - } + if (ev->ind_cfm_flag == LLC_IND && skb->sk->sk_state != TCP_LISTEN) { + llc_save_primitive(skb->sk, skb, ev->prim); + + /* queue skb to the user. */ + if (sock_queue_rcv_skb(skb->sk, skb) == 0) + return; } kfree_skb(skb); }
From: Eric Biggers ebiggers@google.com
commit b74555de21acd791f12c4a1aeaf653dd7ac21133 upstream.
syzbot reported:
BUG: memory leak unreferenced object 0xffff88811eb3de00 (size 224): comm "syz-executor559", pid 7315, jiffies 4294943019 (age 10.300s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 a0 38 24 81 88 ff ff 00 c0 f2 15 81 88 ff ff ..8$............ backtrace: [<000000008d1c66a1>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<000000008d1c66a1>] slab_post_alloc_hook mm/slab.h:439 [inline] [<000000008d1c66a1>] slab_alloc_node mm/slab.c:3269 [inline] [<000000008d1c66a1>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579 [<00000000447d9496>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198 [<000000000cdbf82f>] alloc_skb include/linux/skbuff.h:1058 [inline] [<000000000cdbf82f>] llc_alloc_frame+0x66/0x110 net/llc/llc_sap.c:54 [<000000002418b52e>] llc_conn_ac_send_sabme_cmd_p_set_x+0x2f/0x140 net/llc/llc_c_ac.c:777 [<000000001372ae17>] llc_exec_conn_trans_actions net/llc/llc_conn.c:475 [inline] [<000000001372ae17>] llc_conn_service net/llc/llc_conn.c:400 [inline] [<000000001372ae17>] llc_conn_state_process+0x1ac/0x640 net/llc/llc_conn.c:75 [<00000000f27e53c1>] llc_establish_connection+0x110/0x170 net/llc/llc_if.c:109 [<00000000291b2ca0>] llc_ui_connect+0x10e/0x370 net/llc/af_llc.c:477 [<000000000f9c740b>] __sys_connect+0x11d/0x170 net/socket.c:1840 [...]
The bug is that most callers of llc_conn_send_pdu() assume it consumes a reference to the skb, when actually due to commit b85ab56c3f81 ("llc: properly handle dev_queue_xmit() return value") it doesn't.
Revert most of that commit, and instead make the few places that need llc_conn_send_pdu() to *not* consume a reference call skb_get() before.
Fixes: b85ab56c3f81 ("llc: properly handle dev_queue_xmit() return value") Reported-by: syzbot+6b825a6494a04cc0e3f7@syzkaller.appspotmail.com Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Jakub Kicinski jakub.kicinski@netronome.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/net/llc_conn.h | 2 +- net/llc/llc_c_ac.c | 8 ++++++-- net/llc/llc_conn.c | 32 +++++++++----------------------- 3 files changed, 16 insertions(+), 26 deletions(-)
--- a/include/net/llc_conn.h +++ b/include/net/llc_conn.h @@ -104,7 +104,7 @@ void llc_sk_reset(struct sock *sk);
/* Access to a connection */ int llc_conn_state_process(struct sock *sk, struct sk_buff *skb); -int llc_conn_send_pdu(struct sock *sk, struct sk_buff *skb); +void llc_conn_send_pdu(struct sock *sk, struct sk_buff *skb); void llc_conn_rtn_pdu(struct sock *sk, struct sk_buff *skb); void llc_conn_resend_i_pdu_as_cmd(struct sock *sk, u8 nr, u8 first_p_bit); void llc_conn_resend_i_pdu_as_rsp(struct sock *sk, u8 nr, u8 first_f_bit); --- a/net/llc/llc_c_ac.c +++ b/net/llc/llc_c_ac.c @@ -372,6 +372,7 @@ int llc_conn_ac_send_i_cmd_p_set_1(struc llc_pdu_init_as_i_cmd(skb, 1, llc->vS, llc->vR); rc = llc_mac_hdr_init(skb, llc->dev->dev_addr, llc->daddr.mac); if (likely(!rc)) { + skb_get(skb); llc_conn_send_pdu(sk, skb); llc_conn_ac_inc_vs_by_1(sk, skb); } @@ -389,7 +390,8 @@ static int llc_conn_ac_send_i_cmd_p_set_ llc_pdu_init_as_i_cmd(skb, 0, llc->vS, llc->vR); rc = llc_mac_hdr_init(skb, llc->dev->dev_addr, llc->daddr.mac); if (likely(!rc)) { - rc = llc_conn_send_pdu(sk, skb); + skb_get(skb); + llc_conn_send_pdu(sk, skb); llc_conn_ac_inc_vs_by_1(sk, skb); } return rc; @@ -406,6 +408,7 @@ int llc_conn_ac_send_i_xxx_x_set_0(struc llc_pdu_init_as_i_cmd(skb, 0, llc->vS, llc->vR); rc = llc_mac_hdr_init(skb, llc->dev->dev_addr, llc->daddr.mac); if (likely(!rc)) { + skb_get(skb); llc_conn_send_pdu(sk, skb); llc_conn_ac_inc_vs_by_1(sk, skb); } @@ -916,7 +919,8 @@ static int llc_conn_ac_send_i_rsp_f_set_ llc_pdu_init_as_i_cmd(skb, llc->ack_pf, llc->vS, llc->vR); rc = llc_mac_hdr_init(skb, llc->dev->dev_addr, llc->daddr.mac); if (likely(!rc)) { - rc = llc_conn_send_pdu(sk, skb); + skb_get(skb); + llc_conn_send_pdu(sk, skb); llc_conn_ac_inc_vs_by_1(sk, skb); } return rc; --- a/net/llc/llc_conn.c +++ b/net/llc/llc_conn.c @@ -30,7 +30,7 @@ #endif
static int llc_find_offset(int state, int ev_type); -static int llc_conn_send_pdus(struct sock *sk, struct sk_buff *skb); +static void llc_conn_send_pdus(struct sock *sk); static int llc_conn_service(struct sock *sk, struct sk_buff *skb); static int llc_exec_conn_trans_actions(struct sock *sk, struct llc_conn_state_trans *trans, @@ -193,11 +193,11 @@ out_skb_put: return rc; }
-int llc_conn_send_pdu(struct sock *sk, struct sk_buff *skb) +void llc_conn_send_pdu(struct sock *sk, struct sk_buff *skb) { /* queue PDU to send to MAC layer */ skb_queue_tail(&sk->sk_write_queue, skb); - return llc_conn_send_pdus(sk, skb); + llc_conn_send_pdus(sk); }
/** @@ -255,7 +255,7 @@ void llc_conn_resend_i_pdu_as_cmd(struct if (howmany_resend > 0) llc->vS = (llc->vS + 1) % LLC_2_SEQ_NBR_MODULO; /* any PDUs to re-send are queued up; start sending to MAC */ - llc_conn_send_pdus(sk, NULL); + llc_conn_send_pdus(sk); out:; }
@@ -296,7 +296,7 @@ void llc_conn_resend_i_pdu_as_rsp(struct if (howmany_resend > 0) llc->vS = (llc->vS + 1) % LLC_2_SEQ_NBR_MODULO; /* any PDUs to re-send are queued up; start sending to MAC */ - llc_conn_send_pdus(sk, NULL); + llc_conn_send_pdus(sk); out:; }
@@ -340,16 +340,12 @@ out: /** * llc_conn_send_pdus - Sends queued PDUs * @sk: active connection - * @hold_skb: the skb held by caller, or NULL if does not care * - * Sends queued pdus to MAC layer for transmission. When @hold_skb is - * NULL, always return 0. Otherwise, return 0 if @hold_skb is sent - * successfully, or 1 for failure. + * Sends queued pdus to MAC layer for transmission. */ -static int llc_conn_send_pdus(struct sock *sk, struct sk_buff *hold_skb) +static void llc_conn_send_pdus(struct sock *sk) { struct sk_buff *skb; - int ret = 0;
while ((skb = skb_dequeue(&sk->sk_write_queue)) != NULL) { struct llc_pdu_sn *pdu = llc_pdu_sn_hdr(skb); @@ -361,20 +357,10 @@ static int llc_conn_send_pdus(struct soc skb_queue_tail(&llc_sk(sk)->pdu_unack_q, skb); if (!skb2) break; - dev_queue_xmit(skb2); - } else { - bool is_target = skb == hold_skb; - int rc; - - if (is_target) - skb_get(skb); - rc = dev_queue_xmit(skb); - if (is_target) - ret = rc; + skb = skb2; } + dev_queue_xmit(skb); } - - return ret; }
/**
From: David Howells dhowells@redhat.com
commit c48fc11b69e95007109206311b0187a3090591f3 upstream.
When sendmsg() finds a call to continue on with, if the call is in an inappropriate state, it doesn't release the ref it just got on that call before returning an error.
This causes the following symptom to show up with kasan:
BUG: KASAN: use-after-free in rxrpc_send_keepalive+0x8a2/0x940 net/rxrpc/output.c:635 Read of size 8 at addr ffff888064219698 by task kworker/0:3/11077
where line 635 is:
whdr.epoch = htonl(peer->local->rxnet->epoch);
The local endpoint (which cannot be pinned by the call) has been released, but not the peer (which is pinned by the call).
Fix this by releasing the call in the error path.
Fixes: 37411cad633f ("rxrpc: Fix potential NULL-pointer exception") Reported-by: syzbot+d850c266e3df14da1d31@syzkaller.appspotmail.com Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/rxrpc/sendmsg.c | 1 + 1 file changed, 1 insertion(+)
--- a/net/rxrpc/sendmsg.c +++ b/net/rxrpc/sendmsg.c @@ -661,6 +661,7 @@ int rxrpc_do_sendmsg(struct rxrpc_sock * case RXRPC_CALL_SERVER_PREALLOC: case RXRPC_CALL_SERVER_SECURING: case RXRPC_CALL_SERVER_ACCEPTING: + rxrpc_put_call(call, rxrpc_call_put); ret = -EBUSY; goto error_release_sock; default:
From: David Howells dhowells@redhat.com
commit 9ebeddef58c41bd700419cdcece24cf64ce32276 upstream.
The rxrpc_peer record needs to hold a reference on the rxrpc_local record it points as the peer is used as a base to access information in the rxrpc_local record.
This can cause problems in __rxrpc_put_peer(), where we need the network namespace pointer, and in rxrpc_send_keepalive(), where we need to access the UDP socket, leading to symptoms like:
BUG: KASAN: use-after-free in __rxrpc_put_peer net/rxrpc/peer_object.c:411 [inline] BUG: KASAN: use-after-free in rxrpc_put_peer+0x685/0x6a0 net/rxrpc/peer_object.c:435 Read of size 8 at addr ffff888097ec0058 by task syz-executor823/24216
Fix this by taking a ref on the local record for the peer record.
Fixes: ace45bec6d77 ("rxrpc: Fix firewall route keepalive") Fixes: 2baec2c3f854 ("rxrpc: Support network namespacing") Reported-by: syzbot+b9be979c55f2bea8ed30@syzkaller.appspotmail.com Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/rxrpc/peer_object.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/net/rxrpc/peer_object.c +++ b/net/rxrpc/peer_object.c @@ -216,7 +216,7 @@ struct rxrpc_peer *rxrpc_alloc_peer(stru peer = kzalloc(sizeof(struct rxrpc_peer), gfp); if (peer) { atomic_set(&peer->usage, 1); - peer->local = local; + peer->local = rxrpc_get_local(local); INIT_HLIST_HEAD(&peer->error_targets); peer->service_conns = RB_ROOT; seqlock_init(&peer->service_conn_lock); @@ -307,7 +307,6 @@ void rxrpc_new_incoming_peer(struct rxrp unsigned long hash_key;
hash_key = rxrpc_peer_hash_key(local, &peer->srx); - peer->local = local; rxrpc_init_peer(rx, peer, hash_key);
spin_lock(&rxnet->peer_hash_lock); @@ -417,6 +416,7 @@ static void __rxrpc_put_peer(struct rxrp list_del_init(&peer->keepalive_link); spin_unlock_bh(&rxnet->peer_hash_lock);
+ rxrpc_put_local(peer->local); kfree_rcu(peer, rcu); }
@@ -450,6 +450,7 @@ void rxrpc_put_peer_locked(struct rxrpc_ if (n == 0) { hash_del_rcu(&peer->hash_link); list_del_init(&peer->keepalive_link); + rxrpc_put_local(peer->local); kfree_rcu(peer, rcu); } }
From: David Howells dhowells@redhat.com
commit 55f6c98e3674ce16038a1949c3f9ca5a9a99f289 upstream.
rxrpc_put_peer() calls trace_rxrpc_peer() after it has done the decrement of the refcount - which looks at the debug_id in the peer record. But unless the refcount was reduced to zero, we no longer have the right to look in the record and, indeed, it may be deleted by some other thread.
Fix this by getting the debug_id out before decrementing the refcount and then passing that into the tracepoint.
This can cause the following symptoms:
BUG: KASAN: use-after-free in __rxrpc_put_peer net/rxrpc/peer_object.c:411 [inline] BUG: KASAN: use-after-free in rxrpc_put_peer+0x685/0x6a0 net/rxrpc/peer_object.c:435 Read of size 8 at addr ffff888097ec0058 by task syz-executor823/24216
Fixes: 1159d4b496f5 ("rxrpc: Add a tracepoint to track rxrpc_peer refcounting") Reported-by: syzbot+b9be979c55f2bea8ed30@syzkaller.appspotmail.com Signed-off-by: David Howells dhowells@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/trace/events/rxrpc.h | 6 +++--- net/rxrpc/peer_object.c | 11 +++++++---- 2 files changed, 10 insertions(+), 7 deletions(-)
--- a/include/trace/events/rxrpc.h +++ b/include/trace/events/rxrpc.h @@ -519,10 +519,10 @@ TRACE_EVENT(rxrpc_local, );
TRACE_EVENT(rxrpc_peer, - TP_PROTO(struct rxrpc_peer *peer, enum rxrpc_peer_trace op, + TP_PROTO(unsigned int peer_debug_id, enum rxrpc_peer_trace op, int usage, const void *where),
- TP_ARGS(peer, op, usage, where), + TP_ARGS(peer_debug_id, op, usage, where),
TP_STRUCT__entry( __field(unsigned int, peer ) @@ -532,7 +532,7 @@ TRACE_EVENT(rxrpc_peer, ),
TP_fast_assign( - __entry->peer = peer->debug_id; + __entry->peer = peer_debug_id; __entry->op = op; __entry->usage = usage; __entry->where = where; --- a/net/rxrpc/peer_object.c +++ b/net/rxrpc/peer_object.c @@ -381,7 +381,7 @@ struct rxrpc_peer *rxrpc_get_peer(struct int n;
n = atomic_inc_return(&peer->usage); - trace_rxrpc_peer(peer, rxrpc_peer_got, n, here); + trace_rxrpc_peer(peer->debug_id, rxrpc_peer_got, n, here); return peer; }
@@ -395,7 +395,7 @@ struct rxrpc_peer *rxrpc_get_peer_maybe( if (peer) { int n = atomic_fetch_add_unless(&peer->usage, 1, 0); if (n > 0) - trace_rxrpc_peer(peer, rxrpc_peer_got, n + 1, here); + trace_rxrpc_peer(peer->debug_id, rxrpc_peer_got, n + 1, here); else peer = NULL; } @@ -426,11 +426,13 @@ static void __rxrpc_put_peer(struct rxrp void rxrpc_put_peer(struct rxrpc_peer *peer) { const void *here = __builtin_return_address(0); + unsigned int debug_id; int n;
if (peer) { + debug_id = peer->debug_id; n = atomic_dec_return(&peer->usage); - trace_rxrpc_peer(peer, rxrpc_peer_put, n, here); + trace_rxrpc_peer(debug_id, rxrpc_peer_put, n, here); if (n == 0) __rxrpc_put_peer(peer); } @@ -443,10 +445,11 @@ void rxrpc_put_peer(struct rxrpc_peer *p void rxrpc_put_peer_locked(struct rxrpc_peer *peer) { const void *here = __builtin_return_address(0); + unsigned int debug_id = peer->debug_id; int n;
n = atomic_dec_return(&peer->usage); - trace_rxrpc_peer(peer, rxrpc_peer_put, n, here); + trace_rxrpc_peer(debug_id, rxrpc_peer_put, n, here); if (n == 0) { hash_del_rcu(&peer->hash_link); list_del_init(&peer->keepalive_link);
From: Johan Hovold johan@kernel.org
commit 6af3aa57a0984e061f61308fe181a9a12359fecc upstream.
The driver would fail to deregister and its class device and free related resources on late probe errors.
Reported-by: syzbot+cb035c75c03dbe34b796@syzkaller.appspotmail.com Fixes: 32ecc75ded72 ("NFC: pn533: change order operations in dev registation") Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Jakub Kicinski jakub.kicinski@netronome.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/nfc/pn533/usb.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/drivers/nfc/pn533/usb.c +++ b/drivers/nfc/pn533/usb.c @@ -547,18 +547,25 @@ static int pn533_usb_probe(struct usb_in
rc = pn533_finalize_setup(priv); if (rc) - goto error; + goto err_deregister;
usb_set_intfdata(interface, phy);
return 0;
+err_deregister: + pn533_unregister_device(phy->priv); error: + usb_kill_urb(phy->in_urb); + usb_kill_urb(phy->out_urb); + usb_kill_urb(phy->ack_urb); + usb_free_urb(phy->in_urb); usb_free_urb(phy->out_urb); usb_free_urb(phy->ack_urb); usb_put_dev(phy->udev); kfree(in_buf); + kfree(phy->ack_buffer);
return rc; }
From: Eric Dumazet edumazet@google.com
commit a7137534b597b7c303203e6bc3ed87e87a273bb8 upstream.
syzbot got a NULL dereference in bond_update_slave_arr() [1], happening after a failure to allocate bond->slave_arr
A workqueue (bond_slave_arr_handler) is supposed to retry the allocation later, but if the slave is removed before the workqueue had a chance to complete, bond->slave_arr can still be NULL.
[1]
Failed to build slave-array. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN PTI Modules linked in: Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:bond_update_slave_arr.cold+0xc6/0x198 drivers/net/bonding/bond_main.c:4039 RSP: 0018:ffff88018fe33678 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc9000290b000 RDX: 0000000000000000 RSI: ffffffff82b63037 RDI: ffff88019745ea20 RBP: ffff88018fe33760 R08: ffff880170754280 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff88019745ea00 R14: 0000000000000000 R15: ffff88018fe338b0 FS: 00007febd837d700(0000) GS:ffff8801dad00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004540a0 CR3: 00000001c242e005 CR4: 00000000001626f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: [<ffffffff82b5b45e>] __bond_release_one+0x43e/0x500 drivers/net/bonding/bond_main.c:1923 [<ffffffff82b5b966>] bond_release drivers/net/bonding/bond_main.c:2039 [inline] [<ffffffff82b5b966>] bond_do_ioctl+0x416/0x870 drivers/net/bonding/bond_main.c:3562 [<ffffffff83ae25f4>] dev_ifsioc+0x6f4/0x940 net/core/dev_ioctl.c:328 [<ffffffff83ae2e58>] dev_ioctl+0x1b8/0xc70 net/core/dev_ioctl.c:495 [<ffffffff83995ffd>] sock_do_ioctl+0x1bd/0x300 net/socket.c:1088 [<ffffffff83996a80>] sock_ioctl+0x300/0x5d0 net/socket.c:1196 [<ffffffff81b124db>] vfs_ioctl fs/ioctl.c:47 [inline] [<ffffffff81b124db>] file_ioctl fs/ioctl.c:501 [inline] [<ffffffff81b124db>] do_vfs_ioctl+0xacb/0x1300 fs/ioctl.c:688 [<ffffffff81b12dc6>] SYSC_ioctl fs/ioctl.c:705 [inline] [<ffffffff81b12dc6>] SyS_ioctl+0xb6/0xe0 fs/ioctl.c:696 [<ffffffff8101ccc8>] do_syscall_64+0x528/0x770 arch/x86/entry/common.c:305 [<ffffffff84400091>] entry_SYSCALL_64_after_hwframe+0x42/0xb7
Fixes: ee6377147409 ("bonding: Simplify the xmit function for modes that use xmit_hash") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Cc: Mahesh Bandewar maheshb@google.com Signed-off-by: Jakub Kicinski jakub.kicinski@netronome.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/bonding/bond_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -4039,7 +4039,7 @@ out: * this to-be-skipped slave to send a packet out. */ old_arr = rtnl_dereference(bond->slave_arr); - for (idx = 0; idx < old_arr->count; idx++) { + for (idx = 0; old_arr != NULL && idx < old_arr->count; idx++) { if (skipslave == old_arr->arr[idx]) { old_arr->arr[idx] = old_arr->arr[old_arr->count-1];
From: Eric Dumazet edumazet@google.com
commit e37542ba111f3974dc622ae0a21c1787318de500 upstream.
As hinted by KCSAN, we need at least one READ_ONCE() to prevent a compiler optimization.
More details on : https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE#it-may-improve...
sysbot report : BUG: KCSAN: data-race in __nf_ct_refresh_acct / __nf_ct_refresh_acct
read to 0xffff888123eb4f08 of 4 bytes by interrupt on cpu 0: __nf_ct_refresh_acct+0xd4/0x1b0 net/netfilter/nf_conntrack_core.c:1796 nf_ct_refresh_acct include/net/netfilter/nf_conntrack.h:201 [inline] nf_conntrack_tcp_packet+0xd40/0x3390 net/netfilter/nf_conntrack_proto_tcp.c:1161 nf_conntrack_handle_packet net/netfilter/nf_conntrack_core.c:1633 [inline] nf_conntrack_in+0x410/0xaa0 net/netfilter/nf_conntrack_core.c:1727 ipv4_conntrack_in+0x27/0x40 net/netfilter/nf_conntrack_proto.c:178 nf_hook_entry_hookfn include/linux/netfilter.h:135 [inline] nf_hook_slow+0x83/0x160 net/netfilter/core.c:512 nf_hook include/linux/netfilter.h:260 [inline] NF_HOOK include/linux/netfilter.h:303 [inline] ip_rcv+0x12f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5004 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5118 netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5208 napi_skb_finish net/core/dev.c:5671 [inline] napi_gro_receive+0x28f/0x330 net/core/dev.c:5704 receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061 virtnet_receive drivers/net/virtio_net.c:1323 [inline] virtnet_poll+0x436/0x7d0 drivers/net/virtio_net.c:1428 napi_poll net/core/dev.c:6352 [inline] net_rx_action+0x3ae/0xa50 net/core/dev.c:6418 __do_softirq+0x115/0x33f kernel/softirq.c:292
write to 0xffff888123eb4f08 of 4 bytes by task 7191 on cpu 1: __nf_ct_refresh_acct+0xfb/0x1b0 net/netfilter/nf_conntrack_core.c:1797 nf_ct_refresh_acct include/net/netfilter/nf_conntrack.h:201 [inline] nf_conntrack_tcp_packet+0xd40/0x3390 net/netfilter/nf_conntrack_proto_tcp.c:1161 nf_conntrack_handle_packet net/netfilter/nf_conntrack_core.c:1633 [inline] nf_conntrack_in+0x410/0xaa0 net/netfilter/nf_conntrack_core.c:1727 ipv4_conntrack_local+0xbe/0x130 net/netfilter/nf_conntrack_proto.c:200 nf_hook_entry_hookfn include/linux/netfilter.h:135 [inline] nf_hook_slow+0x83/0x160 net/netfilter/core.c:512 nf_hook include/linux/netfilter.h:260 [inline] __ip_local_out+0x1f7/0x2b0 net/ipv4/ip_output.c:114 ip_local_out+0x31/0x90 net/ipv4/ip_output.c:123 __ip_queue_xmit+0x3a8/0xa40 net/ipv4/ip_output.c:532 ip_queue_xmit+0x45/0x60 include/net/ip.h:236 __tcp_transmit_skb+0xdeb/0x1cd0 net/ipv4/tcp_output.c:1158 __tcp_send_ack+0x246/0x300 net/ipv4/tcp_output.c:3685 tcp_send_ack+0x34/0x40 net/ipv4/tcp_output.c:3691 tcp_cleanup_rbuf+0x130/0x360 net/ipv4/tcp.c:1575
Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 7191 Comm: syz-fuzzer Not tainted 5.3.0+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: cc16921351d8 ("netfilter: conntrack: avoid same-timeout update") Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Cc: Jozsef Kadlecsik kadlec@netfilter.org Cc: Florian Westphal fw@strlen.de Acked-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Jakub Kicinski jakub.kicinski@netronome.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/netfilter/nf_conntrack_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -1793,8 +1793,8 @@ void __nf_ct_refresh_acct(struct nf_conn if (nf_ct_is_confirmed(ct)) extra_jiffies += nfct_time_stamp;
- if (ct->timeout != extra_jiffies) - ct->timeout = extra_jiffies; + if (READ_ONCE(ct->timeout) != extra_jiffies) + WRITE_ONCE(ct->timeout, extra_jiffies); acct: if (do_acct) nf_ct_acct_update(ct, ctinfo, skb->len);
From: Valentin Vidic vvidic@valentin-vidic.from.hr
commit 77b6d09f4ae66d42cd63b121af67780ae3d1a5e9 upstream.
Make sure res does not contain random value if the call to sr_read_cmd fails for some reason.
Reported-by: syzbot+f1842130bbcfb335bac1@syzkaller.appspotmail.com Signed-off-by: Valentin Vidic vvidic@valentin-vidic.from.hr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/usb/sr9800.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/usb/sr9800.c +++ b/drivers/net/usb/sr9800.c @@ -336,7 +336,7 @@ static void sr_set_multicast(struct net_ static int sr_mdio_read(struct net_device *net, int phy_id, int loc) { struct usbnet *dev = netdev_priv(net); - __le16 res; + __le16 res = 0;
mutex_lock(&dev->phy_mutex); sr_set_sw_mii(dev);
From: Eric Dumazet edumazet@google.com
commit 159d2c7d8106177bd9a986fd005a311fe0d11285 upstream.
qdisc_root() use from netem_enqueue() triggers a lockdep warning.
__dev_queue_xmit() uses rcu_read_lock_bh() which is not equivalent to rcu_read_lock() + local_bh_disable_bh as far as lockdep is concerned.
WARNING: suspicious RCU usage 5.3.0-rc7+ #0 Not tainted ----------------------------- include/net/sch_generic.h:492 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 3 locks held by syz-executor427/8855: #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline] #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x2dc/0x2570 net/ipv4/ip_output.c:214 #1: 00000000b5525c01 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x3650 net/core/dev.c:3804 #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_xmit_skb net/core/dev.c:3502 [inline] #2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_queue_xmit+0x14b8/0x3650 net/core/dev.c:3838
stack backtrace: CPU: 0 PID: 8855 Comm: syz-executor427 Not tainted 5.3.0-rc7+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5357 qdisc_root include/net/sch_generic.h:492 [inline] netem_enqueue+0x1cfb/0x2d80 net/sched/sch_netem.c:479 __dev_xmit_skb net/core/dev.c:3527 [inline] __dev_queue_xmit+0x15d2/0x3650 net/core/dev.c:3838 dev_queue_xmit+0x18/0x20 net/core/dev.c:3902 neigh_hh_output include/net/neighbour.h:500 [inline] neigh_output include/net/neighbour.h:509 [inline] ip_finish_output2+0x1726/0x2570 net/ipv4/ip_output.c:228 __ip_finish_output net/ipv4/ip_output.c:308 [inline] __ip_finish_output+0x5fc/0xb90 net/ipv4/ip_output.c:290 ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:318 NF_HOOK_COND include/linux/netfilter.h:294 [inline] ip_mc_output+0x292/0xf40 net/ipv4/ip_output.c:417 dst_output include/net/dst.h:436 [inline] ip_local_out+0xbb/0x190 net/ipv4/ip_output.c:125 ip_send_skb+0x42/0xf0 net/ipv4/ip_output.c:1555 udp_send_skb.isra.0+0x6b2/0x1160 net/ipv4/udp.c:887 udp_sendmsg+0x1e96/0x2820 net/ipv4/udp.c:1174 inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:657 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413 __do_sys_sendmmsg net/socket.c:2442 [inline] __se_sys_sendmmsg net/socket.c:2439 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Signed-off-by: Eric Dumazet edumazet@google.com Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- include/net/sch_generic.h | 5 +++++ net/sched/sch_netem.c | 2 +- 2 files changed, 6 insertions(+), 1 deletion(-)
--- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -520,6 +520,11 @@ static inline struct Qdisc *qdisc_root(c return q; }
+static inline struct Qdisc *qdisc_root_bh(const struct Qdisc *qdisc) +{ + return rcu_dereference_bh(qdisc->dev_queue->qdisc); +} + static inline struct Qdisc *qdisc_root_sleeping(const struct Qdisc *qdisc) { return qdisc->dev_queue->qdisc_sleeping; --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -476,7 +476,7 @@ static int netem_enqueue(struct sk_buff * skb will be queued. */ if (count > 1 && (skb2 = skb_clone(skb, GFP_ATOMIC)) != NULL) { - struct Qdisc *rootq = qdisc_root(sch); + struct Qdisc *rootq = qdisc_root_bh(sch); u32 dupsave = q->duplicate; /* prevent duplicating a dup... */
q->duplicate = 0;
From: Vlad Buslov vladbu@mellanox.com
commit e3ae1f96accd21405715fe9c56b4d83bc7d96d44 upstream.
Recent changes that removed rtnl dependency from rules update path of tc also made tcf_block_put() function sleeping. This function is called from ops->destroy() of several Qdisc implementations, which in turn is called by qdisc_put(). Some Qdiscs call qdisc_put() while holding sch tree spinlock, which results sleeping-while-atomic BUG.
Steps to reproduce for sfb:
tc qdisc add dev ens1f0 handle 1: root sfb tc qdisc add dev ens1f0 parent 1:10 handle 50: sfq perturb 10 tc qdisc change dev ens1f0 root handle 1: sfb
Resulting dmesg:
[ 7265.938717] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909 [ 7265.940152] in_atomic(): 1, irqs_disabled(): 0, pid: 28579, name: tc [ 7265.941455] INFO: lockdep is turned off. [ 7265.942744] CPU: 11 PID: 28579 Comm: tc Tainted: G W 5.3.0-rc8+ #721 [ 7265.944065] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017 [ 7265.945396] Call Trace: [ 7265.946709] dump_stack+0x85/0xc0 [ 7265.947994] ___might_sleep.cold+0xac/0xbc [ 7265.949282] __mutex_lock+0x5b/0x960 [ 7265.950543] ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0 [ 7265.951803] ? tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0 [ 7265.953022] tcf_chain0_head_change_cb_del.isra.0+0x1b/0xf0 [ 7265.954248] tcf_block_put_ext.part.0+0x21/0x50 [ 7265.955478] tcf_block_put+0x50/0x70 [ 7265.956694] sfq_destroy+0x15/0x50 [sch_sfq] [ 7265.957898] qdisc_destroy+0x5f/0x160 [ 7265.959099] sfb_change+0x175/0x330 [sch_sfb] [ 7265.960304] tc_modify_qdisc+0x324/0x840 [ 7265.961503] rtnetlink_rcv_msg+0x170/0x4b0 [ 7265.962692] ? netlink_deliver_tap+0x95/0x400 [ 7265.963876] ? rtnl_dellink+0x2d0/0x2d0 [ 7265.965064] netlink_rcv_skb+0x49/0x110 [ 7265.966251] netlink_unicast+0x171/0x200 [ 7265.967427] netlink_sendmsg+0x224/0x3f0 [ 7265.968595] sock_sendmsg+0x5e/0x60 [ 7265.969753] ___sys_sendmsg+0x2ae/0x330 [ 7265.970916] ? ___sys_recvmsg+0x159/0x1f0 [ 7265.972074] ? do_wp_page+0x9c/0x790 [ 7265.973233] ? __handle_mm_fault+0xcd3/0x19e0 [ 7265.974407] __sys_sendmsg+0x59/0xa0 [ 7265.975591] do_syscall_64+0x5c/0xb0 [ 7265.976753] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 7265.977938] RIP: 0033:0x7f229069f7b8 [ 7265.979117] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 89 5 4 [ 7265.981681] RSP: 002b:00007ffd7ed2d158 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 7265.983001] RAX: ffffffffffffffda RBX: 000000005d813ca1 RCX: 00007f229069f7b8 [ 7265.984336] RDX: 0000000000000000 RSI: 00007ffd7ed2d1c0 RDI: 0000000000000003 [ 7265.985682] RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000165c9a0 [ 7265.987021] R10: 0000000000404eda R11: 0000000000000246 R12: 0000000000000001 [ 7265.988309] R13: 000000000047f640 R14: 0000000000000000 R15: 0000000000000000
In sfb_change() function use qdisc_purge_queue() instead of qdisc_tree_flush_backlog() to properly reset old child Qdisc and save pointer to it into local temporary variable. Put reference to Qdisc after sch tree lock is released in order not to call potentially sleeping cls API in atomic section. This is safe to do because Qdisc has already been reset by qdisc_purge_queue() inside sch tree lock critical section.
Reported-by: syzbot+ac54455281db908c581e@syzkaller.appspotmail.com Fixes: c266f64dbfa2 ("net: sched: protect block state with mutex") Suggested-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: Vlad Buslov vladbu@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sched/sch_sfb.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/net/sched/sch_sfb.c +++ b/net/sched/sch_sfb.c @@ -488,7 +488,7 @@ static int sfb_change(struct Qdisc *sch, struct netlink_ext_ack *extack) { struct sfb_sched_data *q = qdisc_priv(sch); - struct Qdisc *child; + struct Qdisc *child, *old; struct nlattr *tb[TCA_SFB_MAX + 1]; const struct tc_sfb_qopt *ctl = &sfb_default_ops; u32 limit; @@ -518,8 +518,8 @@ static int sfb_change(struct Qdisc *sch, qdisc_hash_add(child, true); sch_tree_lock(sch);
- qdisc_tree_flush_backlog(q->qdisc); - qdisc_put(q->qdisc); + qdisc_purge_queue(q->qdisc); + old = q->qdisc; q->qdisc = child;
q->rehash_interval = msecs_to_jiffies(ctl->rehash_interval); @@ -542,6 +542,7 @@ static int sfb_change(struct Qdisc *sch, sfb_init_perturbation(1, q);
sch_tree_unlock(sch); + qdisc_put(old);
return 0; }
From: Luca Coelho luciano.coelho@intel.com
commit 12e36d98d3e5acf5fc57774e0a15906d55f30cb9 upstream.
We currently support two NICs in FW version 29, namely 7265D and 3168. Out of these, only 7265D supports GEO SAR, so adjust the function that checks for it accordingly.
Signed-off-by: Luca Coelho luciano.coelho@intel.com Fixes: f5a47fae6aa3 ("iwlwifi: mvm: fix version check for GEO_TX_POWER_LIMIT support") Signed-off-by: Luca Coelho luciano.coelho@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
--- a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c @@ -887,15 +887,17 @@ static bool iwl_mvm_sar_geo_support(stru * firmware versions. Unfortunately, we don't have a TLV API * flag to rely on, so rely on the major version which is in * the first byte of ucode_ver. This was implemented - * initially on version 38 and then backported to29 and 17. - * The intention was to have it in 36 as well, but not all - * 8000 family got this feature enabled. The 8000 family is - * the only one using version 36, so skip this version - * entirely. + * initially on version 38 and then backported to 17. It was + * also backported to 29, but only for 7265D devices. The + * intention was to have it in 36 as well, but not all 8000 + * family got this feature enabled. The 8000 family is the + * only one using version 36, so skip this version entirely. */ return IWL_UCODE_SERIAL(mvm->fw->ucode_ver) >= 38 || - IWL_UCODE_SERIAL(mvm->fw->ucode_ver) == 29 || - IWL_UCODE_SERIAL(mvm->fw->ucode_ver) == 17; + IWL_UCODE_SERIAL(mvm->fw->ucode_ver) == 17 || + (IWL_UCODE_SERIAL(mvm->fw->ucode_ver) == 29 && + ((mvm->trans->hw_rev & CSR_HW_REV_TYPE_MSK) == + CSR_HW_REV_TYPE_7265D)); }
int iwl_mvm_get_sar_geo_profile(struct iwl_mvm *mvm)
From: Dave Chiluk chiluk+linux@indeed.com
commit de53fd7aedb100f03e5d2231cfce0e4993282425 upstream.
It has been observed, that highly-threaded, non-cpu-bound applications running under cpu.cfs_quota_us constraints can hit a high percentage of periods throttled while simultaneously not consuming the allocated amount of quota. This use case is typical of user-interactive non-cpu bound applications, such as those running in kubernetes or mesos when run on multiple cpu cores.
This has been root caused to cpu-local run queue being allocated per cpu bandwidth slices, and then not fully using that slice within the period. At which point the slice and quota expires. This expiration of unused slice results in applications not being able to utilize the quota for which they are allocated.
The non-expiration of per-cpu slices was recently fixed by 'commit 512ac999d275 ("sched/fair: Fix bandwidth timer clock drift condition")'. Prior to that it appears that this had been broken since at least 'commit 51f2176d74ac ("sched/fair: Fix unlocked reads of some cfs_b->quota/period")' which was introduced in v3.16-rc1 in 2014. That added the following conditional which resulted in slices never being expired.
if (cfs_rq->runtime_expires != cfs_b->runtime_expires) { /* extend local deadline, drift is bounded above by 2 ticks */ cfs_rq->runtime_expires += TICK_NSEC;
Because this was broken for nearly 5 years, and has recently been fixed and is now being noticed by many users running kubernetes (https://github.com/kubernetes/kubernetes/issues/67577) it is my opinion that the mechanisms around expiring runtime should be removed altogether.
This allows quota already allocated to per-cpu run-queues to live longer than the period boundary. This allows threads on runqueues that do not use much CPU to continue to use their remaining slice over a longer period of time than cpu.cfs_period_us. However, this helps prevent the above condition of hitting throttling while also not fully utilizing your cpu quota.
This theoretically allows a machine to use slightly more than its allotted quota in some periods. This overflow would be bounded by the remaining quota left on each per-cpu runqueueu. This is typically no more than min_cfs_rq_runtime=1ms per cpu. For CPU bound tasks this will change nothing, as they should theoretically fully utilize all of their quota in each period. For user-interactive tasks as described above this provides a much better user/application experience as their cpu utilization will more closely match the amount they requested when they hit throttling. This means that cpu limits no longer strictly apply per period for non-cpu bound applications, but that they are still accurate over longer timeframes.
This greatly improves performance of high-thread-count, non-cpu bound applications with low cfs_quota_us allocation on high-core-count machines. In the case of an artificial testcase (10ms/100ms of quota on 80 CPU machine), this commit resulted in almost 30x performance improvement, while still maintaining correct cpu quota restrictions. That testcase is available at https://github.com/indeedeng/fibtest.
Fixes: 512ac999d275 ("sched/fair: Fix bandwidth timer clock drift condition") Signed-off-by: Dave Chiluk chiluk+linux@indeed.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Phil Auld pauld@redhat.com Reviewed-by: Ben Segall bsegall@google.com Cc: Ingo Molnar mingo@redhat.com Cc: John Hammond jhammond@indeed.com Cc: Jonathan Corbet corbet@lwn.net Cc: Kyle Anderson kwa@yelp.com Cc: Gabriel Munos gmunoz@netflix.com Cc: Peter Oskolkov posk@posk.io Cc: Cong Wang xiyou.wangcong@gmail.com Cc: Brendan Gregg bgregg@netflix.com Link: https://lkml.kernel.org/r/1563900266-19734-2-git-send-email-chiluk+linux@ind... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- Documentation/scheduler/sched-bwc.rst | 74 +++++++++++++++++++++++++++------- kernel/sched/fair.c | 72 +++------------------------------ kernel/sched/sched.h | 4 - 3 files changed, 67 insertions(+), 83 deletions(-)
--- a/Documentation/scheduler/sched-bwc.rst +++ b/Documentation/scheduler/sched-bwc.rst @@ -9,15 +9,16 @@ CFS bandwidth control is a CONFIG_FAIR_G specification of the maximum CPU bandwidth available to a group or hierarchy.
The bandwidth allowed for a group is specified using a quota and period. Within -each given "period" (microseconds), a group is allowed to consume only up to -"quota" microseconds of CPU time. When the CPU bandwidth consumption of a -group exceeds this limit (for that period), the tasks belonging to its -hierarchy will be throttled and are not allowed to run again until the next -period. - -A group's unused runtime is globally tracked, being refreshed with quota units -above at each period boundary. As threads consume this bandwidth it is -transferred to cpu-local "silos" on a demand basis. The amount transferred +each given "period" (microseconds), a task group is allocated up to "quota" +microseconds of CPU time. That quota is assigned to per-cpu run queues in +slices as threads in the cgroup become runnable. Once all quota has been +assigned any additional requests for quota will result in those threads being +throttled. Throttled threads will not be able to run again until the next +period when the quota is replenished. + +A group's unassigned quota is globally tracked, being refreshed back to +cfs_quota units at each period boundary. As threads consume this bandwidth it +is transferred to cpu-local "silos" on a demand basis. The amount transferred within each of these updates is tunable and described as the "slice".
Management @@ -35,12 +36,12 @@ The default values are::
A value of -1 for cpu.cfs_quota_us indicates that the group does not have any bandwidth restriction in place, such a group is described as an unconstrained -bandwidth group. This represents the traditional work-conserving behavior for +bandwidth group. This represents the traditional work-conserving behavior for CFS.
Writing any (valid) positive value(s) will enact the specified bandwidth limit. -The minimum quota allowed for the quota or period is 1ms. There is also an -upper bound on the period length of 1s. Additional restrictions exist when +The minimum quota allowed for the quota or period is 1ms. There is also an +upper bound on the period length of 1s. Additional restrictions exist when bandwidth limits are used in a hierarchical fashion, these are explained in more detail below.
@@ -53,8 +54,8 @@ unthrottled if it is in a constrained st System wide settings -------------------- For efficiency run-time is transferred between the global pool and CPU local -"silos" in a batch fashion. This greatly reduces global accounting pressure -on large systems. The amount transferred each time such an update is required +"silos" in a batch fashion. This greatly reduces global accounting pressure +on large systems. The amount transferred each time such an update is required is described as the "slice".
This is tunable via procfs:: @@ -97,6 +98,51 @@ There are two ways in which a group may In case b) above, even though the child may have runtime remaining it will not be allowed to until the parent's runtime is refreshed.
+CFS Bandwidth Quota Caveats +--------------------------- +Once a slice is assigned to a cpu it does not expire. However all but 1ms of +the slice may be returned to the global pool if all threads on that cpu become +unrunnable. This is configured at compile time by the min_cfs_rq_runtime +variable. This is a performance tweak that helps prevent added contention on +the global lock. + +The fact that cpu-local slices do not expire results in some interesting corner +cases that should be understood. + +For cgroup cpu constrained applications that are cpu limited this is a +relatively moot point because they will naturally consume the entirety of their +quota as well as the entirety of each cpu-local slice in each period. As a +result it is expected that nr_periods roughly equal nr_throttled, and that +cpuacct.usage will increase roughly equal to cfs_quota_us in each period. + +For highly-threaded, non-cpu bound applications this non-expiration nuance +allows applications to briefly burst past their quota limits by the amount of +unused slice on each cpu that the task group is running on (typically at most +1ms per cpu or as defined by min_cfs_rq_runtime). This slight burst only +applies if quota had been assigned to a cpu and then not fully used or returned +in previous periods. This burst amount will not be transferred between cores. +As a result, this mechanism still strictly limits the task group to quota +average usage, albeit over a longer time window than a single period. This +also limits the burst ability to no more than 1ms per cpu. This provides +better more predictable user experience for highly threaded applications with +small quota limits on high core count machines. It also eliminates the +propensity to throttle these applications while simultanously using less than +quota amounts of cpu. Another way to say this, is that by allowing the unused +portion of a slice to remain valid across periods we have decreased the +possibility of wastefully expiring quota on cpu-local silos that don't need a +full slice's amount of cpu time. + +The interaction between cpu-bound and non-cpu-bound-interactive applications +should also be considered, especially when single core usage hits 100%. If you +gave each of these applications half of a cpu-core and they both got scheduled +on the same CPU it is theoretically possible that the non-cpu bound application +will use up to 1ms additional quota in some periods, thereby preventing the +cpu-bound application from fully using its quota by that same amount. In these +instances it will be up to the CFS algorithm (see sched-design-CFS.rst) to +decide which application is chosen to run, as they will both be runnable and +have remaining quota. This runtime discrepancy will be made up in the following +periods when the interactive application idles. + Examples -------- 1. Limit a group to 1 CPU worth of runtime:: --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4370,8 +4370,6 @@ void __refill_cfs_bandwidth_runtime(stru
now = sched_clock_cpu(smp_processor_id()); cfs_b->runtime = cfs_b->quota; - cfs_b->runtime_expires = now + ktime_to_ns(cfs_b->period); - cfs_b->expires_seq++; }
static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg) @@ -4393,8 +4391,7 @@ static int assign_cfs_rq_runtime(struct { struct task_group *tg = cfs_rq->tg; struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(tg); - u64 amount = 0, min_amount, expires; - int expires_seq; + u64 amount = 0, min_amount;
/* note: this is a positive sum as runtime_remaining <= 0 */ min_amount = sched_cfs_bandwidth_slice() - cfs_rq->runtime_remaining; @@ -4411,61 +4408,17 @@ static int assign_cfs_rq_runtime(struct cfs_b->idle = 0; } } - expires_seq = cfs_b->expires_seq; - expires = cfs_b->runtime_expires; raw_spin_unlock(&cfs_b->lock);
cfs_rq->runtime_remaining += amount; - /* - * we may have advanced our local expiration to account for allowed - * spread between our sched_clock and the one on which runtime was - * issued. - */ - if (cfs_rq->expires_seq != expires_seq) { - cfs_rq->expires_seq = expires_seq; - cfs_rq->runtime_expires = expires; - }
return cfs_rq->runtime_remaining > 0; }
-/* - * Note: This depends on the synchronization provided by sched_clock and the - * fact that rq->clock snapshots this value. - */ -static void expire_cfs_rq_runtime(struct cfs_rq *cfs_rq) -{ - struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg); - - /* if the deadline is ahead of our clock, nothing to do */ - if (likely((s64)(rq_clock(rq_of(cfs_rq)) - cfs_rq->runtime_expires) < 0)) - return; - - if (cfs_rq->runtime_remaining < 0) - return; - - /* - * If the local deadline has passed we have to consider the - * possibility that our sched_clock is 'fast' and the global deadline - * has not truly expired. - * - * Fortunately we can check determine whether this the case by checking - * whether the global deadline(cfs_b->expires_seq) has advanced. - */ - if (cfs_rq->expires_seq == cfs_b->expires_seq) { - /* extend local deadline, drift is bounded above by 2 ticks */ - cfs_rq->runtime_expires += TICK_NSEC; - } else { - /* global deadline is ahead, expiration has passed */ - cfs_rq->runtime_remaining = 0; - } -} - static void __account_cfs_rq_runtime(struct cfs_rq *cfs_rq, u64 delta_exec) { /* dock delta_exec before expiring quota (as it could span periods) */ cfs_rq->runtime_remaining -= delta_exec; - expire_cfs_rq_runtime(cfs_rq);
if (likely(cfs_rq->runtime_remaining > 0)) return; @@ -4658,8 +4611,7 @@ void unthrottle_cfs_rq(struct cfs_rq *cf resched_curr(rq); }
-static u64 distribute_cfs_runtime(struct cfs_bandwidth *cfs_b, - u64 remaining, u64 expires) +static u64 distribute_cfs_runtime(struct cfs_bandwidth *cfs_b, u64 remaining) { struct cfs_rq *cfs_rq; u64 runtime; @@ -4684,7 +4636,6 @@ static u64 distribute_cfs_runtime(struct remaining -= runtime;
cfs_rq->runtime_remaining += runtime; - cfs_rq->runtime_expires = expires;
/* we check whether we're throttled above */ if (cfs_rq->runtime_remaining > 0) @@ -4709,7 +4660,7 @@ next: */ static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun, unsigned long flags) { - u64 runtime, runtime_expires; + u64 runtime; int throttled;
/* no need to continue the timer with no bandwidth constraint */ @@ -4737,8 +4688,6 @@ static int do_sched_cfs_period_timer(str /* account preceding periods in which throttling occurred */ cfs_b->nr_throttled += overrun;
- runtime_expires = cfs_b->runtime_expires; - /* * This check is repeated as we are holding onto the new bandwidth while * we unthrottle. This can potentially race with an unthrottled group @@ -4751,8 +4700,7 @@ static int do_sched_cfs_period_timer(str cfs_b->distribute_running = 1; raw_spin_unlock_irqrestore(&cfs_b->lock, flags); /* we can't nest cfs_b->lock while distributing bandwidth */ - runtime = distribute_cfs_runtime(cfs_b, runtime, - runtime_expires); + runtime = distribute_cfs_runtime(cfs_b, runtime); raw_spin_lock_irqsave(&cfs_b->lock, flags);
cfs_b->distribute_running = 0; @@ -4834,8 +4782,7 @@ static void __return_cfs_rq_runtime(stru return;
raw_spin_lock(&cfs_b->lock); - if (cfs_b->quota != RUNTIME_INF && - cfs_rq->runtime_expires == cfs_b->runtime_expires) { + if (cfs_b->quota != RUNTIME_INF) { cfs_b->runtime += slack_runtime;
/* we are under rq->lock, defer unthrottling using a timer */ @@ -4868,7 +4815,6 @@ static void do_sched_cfs_slack_timer(str { u64 runtime = 0, slice = sched_cfs_bandwidth_slice(); unsigned long flags; - u64 expires;
/* confirm we're still not at a refresh boundary */ raw_spin_lock_irqsave(&cfs_b->lock, flags); @@ -4886,7 +4832,6 @@ static void do_sched_cfs_slack_timer(str if (cfs_b->quota != RUNTIME_INF && cfs_b->runtime > slice) runtime = cfs_b->runtime;
- expires = cfs_b->runtime_expires; if (runtime) cfs_b->distribute_running = 1;
@@ -4895,11 +4840,10 @@ static void do_sched_cfs_slack_timer(str if (!runtime) return;
- runtime = distribute_cfs_runtime(cfs_b, runtime, expires); + runtime = distribute_cfs_runtime(cfs_b, runtime);
raw_spin_lock_irqsave(&cfs_b->lock, flags); - if (expires == cfs_b->runtime_expires) - lsub_positive(&cfs_b->runtime, runtime); + lsub_positive(&cfs_b->runtime, runtime); cfs_b->distribute_running = 0; raw_spin_unlock_irqrestore(&cfs_b->lock, flags); } @@ -5064,8 +5008,6 @@ void start_cfs_bandwidth(struct cfs_band
cfs_b->period_active = 1; overrun = hrtimer_forward_now(&cfs_b->period_timer, cfs_b->period); - cfs_b->runtime_expires += (overrun + 1) * ktime_to_ns(cfs_b->period); - cfs_b->expires_seq++; hrtimer_start_expires(&cfs_b->period_timer, HRTIMER_MODE_ABS_PINNED); }
--- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -335,8 +335,6 @@ struct cfs_bandwidth { u64 quota; u64 runtime; s64 hierarchical_quota; - u64 runtime_expires; - int expires_seq;
u8 idle; u8 period_active; @@ -556,8 +554,6 @@ struct cfs_rq {
#ifdef CONFIG_CFS_BANDWIDTH int runtime_enabled; - int expires_seq; - u64 runtime_expires; s64 runtime_remaining;
u64 throttled_clock;
From: Jussi Laako jussi@sonarnerd.net
[ Upstream commit eb7505d52a2f8b0cfc3fd7146d8cb2dab5a73f0d ]
Add DSD support auto-detection for newer Playback Designs devices. Older device generations have a different USB interface implementation.
Keep the auto-detection VID whitelist sorted.
Signed-off-by: Jussi Laako jussi@sonarnerd.net Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/usb/quirks.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index b6f7b13768a10..6f228b4ec8c3e 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1563,7 +1563,8 @@ u64 snd_usb_interface_dsd_format_quirks(struct snd_usb_audio *chip, struct usb_interface *iface;
/* Playback Designs */ - if (USB_ID_VENDOR(chip->usb_id) == 0x23ba) { + if (USB_ID_VENDOR(chip->usb_id) == 0x23ba && + USB_ID_PRODUCT(chip->usb_id) < 0x0110) { switch (fp->altsetting) { case 1: fp->dsd_dop = true; @@ -1651,8 +1652,9 @@ u64 snd_usb_interface_dsd_format_quirks(struct snd_usb_audio *chip, * from XMOS/Thesycon */ switch (USB_ID_VENDOR(chip->usb_id)) { - case 0x20b1: /* XMOS based devices */ case 0x152a: /* Thesycon devices */ + case 0x20b1: /* XMOS based devices */ + case 0x23ba: /* Playback Designs */ case 0x25ce: /* Mytek devices */ case 0x2ab6: /* T+A devices */ case 0x3842: /* EVGA */
From: Jussi Laako jussi@sonarnerd.net
[ Upstream commit 0067e154b11e236d62a7a8205f321b097c21a35b ]
Oppo has issued firmware updates that change alt setting used for DSD support. However, these devices seem to support auto-detection, so support is moved from explicit whitelisting to auto-detection.
Also Rotel devices have USB interfaces that support DSD with auto-detection.
Signed-off-by: Jussi Laako jussi@sonarnerd.net Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/usb/quirks.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index 6f228b4ec8c3e..33d52ab6ebad0 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1581,9 +1581,6 @@ u64 snd_usb_interface_dsd_format_quirks(struct snd_usb_audio *chip, /* XMOS based USB DACs */ switch (chip->usb_id) { case USB_ID(0x1511, 0x0037): /* AURALiC VEGA */ - case USB_ID(0x22d9, 0x0416): /* OPPO HA-1 */ - case USB_ID(0x22d9, 0x0436): /* OPPO Sonica */ - case USB_ID(0x22d9, 0x0461): /* OPPO UDP-205 */ case USB_ID(0x2522, 0x0012): /* LH Labs VI DAC Infinity */ case USB_ID(0x2772, 0x0230): /* Pro-Ject Pre Box S2 Digital */ if (fp->altsetting == 2) @@ -1597,7 +1594,6 @@ u64 snd_usb_interface_dsd_format_quirks(struct snd_usb_audio *chip, case USB_ID(0x16d0, 0x0733): /* Furutech ADL Stratos */ case USB_ID(0x16d0, 0x09db): /* NuPrime Audio DAC-9 */ case USB_ID(0x1db5, 0x0003): /* Bryston BDA3 */ - case USB_ID(0x22d9, 0x0426): /* OPPO HA-2 */ case USB_ID(0x22e1, 0xca01): /* HDTA Serenade DSD */ case USB_ID(0x249c, 0x9326): /* M2Tech Young MkIII */ case USB_ID(0x2616, 0x0106): /* PS Audio NuWave DAC */ @@ -1654,8 +1650,10 @@ u64 snd_usb_interface_dsd_format_quirks(struct snd_usb_audio *chip, switch (USB_ID_VENDOR(chip->usb_id)) { case 0x152a: /* Thesycon devices */ case 0x20b1: /* XMOS based devices */ + case 0x22d9: /* Oppo */ case 0x23ba: /* Playback Designs */ case 0x25ce: /* Mytek devices */ + case 0x278b: /* Rotel? */ case 0x2ab6: /* T+A devices */ case 0x3842: /* EVGA */ case 0xc502: /* HiBy devices */
From: Justin Song flyingecar@gmail.com
[ Upstream commit e2995b95a914bbc6b5352be27d5d5f33ec802d2c ]
This patch adds native DSD support for Gustard U16/X26 USB Interface. Tested using VID and fp->dsd_raw method.
Signed-off-by: Justin Song flyingecar@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/CA+9XP1ipsFn+r3bCBKRinQv-JrJ+EHOGBdZWZoMwxFv0R8Y1M... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/usb/quirks.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c index 33d52ab6ebad0..059b70313f352 100644 --- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1654,6 +1654,7 @@ u64 snd_usb_interface_dsd_format_quirks(struct snd_usb_audio *chip, case 0x23ba: /* Playback Designs */ case 0x25ce: /* Mytek devices */ case 0x278b: /* Rotel? */ + case 0x292b: /* Gustard/Ess based devices */ case 0x2ab6: /* T+A devices */ case 0x3842: /* EVGA */ case 0xc502: /* HiBy devices */
From: Jason Gunthorpe jgg@mellanox.com
[ Upstream commit 1524b12a6e02a85264af4ed208b034a2239ef374 ]
The mkey_table xarray is touched by the reg_mr_callback() function which is called from a hard irq. Thus all other uses of xa_lock must use the _irq variants.
WARNING: inconsistent lock state 5.4.0-rc1 #12 Not tainted -------------------------------- inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. python3/343 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff888182be1d40 (&(&xa->xa_lock)->rlock#3){?.-.}, at: xa_erase+0x12/0x30 {IN-HARDIRQ-W} state was registered at: lock_acquire+0xe1/0x200 _raw_spin_lock_irqsave+0x35/0x50 reg_mr_callback+0x2dd/0x450 [mlx5_ib] mlx5_cmd_exec_cb_handler+0x2c/0x70 [mlx5_core] mlx5_cmd_comp_handler+0x355/0x840 [mlx5_core] [..]
Possible unsafe locking scenario:
CPU0 ---- lock(&(&xa->xa_lock)->rlock#3); <Interrupt> lock(&(&xa->xa_lock)->rlock#3);
*** DEADLOCK ***
2 locks held by python3/343: #0: ffff88818eb4bd38 (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_ioctl+0xe5/0x1e0 [ib_uverbs] #1: ffff888176c76d38 (&file->hw_destroy_rwsem){++++}, at: uobj_destroy+0x2d/0x90 [ib_uverbs]
stack backtrace: CPU: 3 PID: 343 Comm: python3 Not tainted 5.4.0-rc1 #12 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x86/0xca print_usage_bug.cold.50+0x2e5/0x355 mark_lock+0x871/0xb50 ? match_held_lock+0x20/0x250 ? check_usage_forwards+0x240/0x240 __lock_acquire+0x7de/0x23a0 ? __kasan_check_read+0x11/0x20 ? mark_lock+0xae/0xb50 ? mark_held_locks+0xb0/0xb0 ? find_held_lock+0xca/0xf0 lock_acquire+0xe1/0x200 ? xa_erase+0x12/0x30 _raw_spin_lock+0x2a/0x40 ? xa_erase+0x12/0x30 xa_erase+0x12/0x30 mlx5_ib_dealloc_mw+0x55/0xa0 [mlx5_ib] uverbs_dealloc_mw+0x3c/0x70 [ib_uverbs] uverbs_free_mw+0x1a/0x20 [ib_uverbs] destroy_hw_idr_uobject+0x49/0xa0 [ib_uverbs] [..]
Fixes: 0417791536ae ("RDMA/mlx5: Add missing synchronize_srcu() for MW cases") Link: https://lore.kernel.org/r/20191024234910.GA9038@ziepe.ca Acked-by: Leon Romanovsky leonro@mellanox.com Signed-off-by: Jason Gunthorpe jgg@mellanox.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/mlx5/mr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index c15d05f61cc7b..a6198fe7f3768 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1975,8 +1975,8 @@ int mlx5_ib_dealloc_mw(struct ib_mw *mw) int err;
if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) { - xa_erase(&dev->mdev->priv.mkey_table, - mlx5_base_mkey(mmw->mmkey.key)); + xa_erase_irq(&dev->mdev->priv.mkey_table, + mlx5_base_mkey(mmw->mmkey.key)); /* * pagefault_single_data_segment() may be accessing mmw under * SRCU if the user bound an ODP MR to this MW.
From: Qian Cai cai@lca.pw
[ Upstream commit 763a9ec06c409dcde2a761aac4bb83ff3938e0b3 ]
Commit:
de53fd7aedb1 ("sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices")
introduced a few compilation warnings:
kernel/sched/fair.c: In function '__refill_cfs_bandwidth_runtime': kernel/sched/fair.c:4365:6: warning: variable 'now' set but not used [-Wunused-but-set-variable] kernel/sched/fair.c: In function 'start_cfs_bandwidth': kernel/sched/fair.c:4992:6: warning: variable 'overrun' set but not used [-Wunused-but-set-variable]
Also, __refill_cfs_bandwidth_runtime() does no longer update the expiration time, so fix the comments accordingly.
Signed-off-by: Qian Cai cai@lca.pw Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Reviewed-by: Ben Segall bsegall@google.com Reviewed-by: Dave Chiluk chiluk+linux@indeed.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: pauld@redhat.com Fixes: de53fd7aedb1 ("sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices") Link: https://lkml.kernel.org/r/1566326455-8038-1-git-send-email-cai@lca.pw Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 19 ++++++------------- 1 file changed, 6 insertions(+), 13 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index a11a9c2d7793e..649c6b60929e2 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4355,21 +4355,16 @@ static inline u64 sched_cfs_bandwidth_slice(void) }
/* - * Replenish runtime according to assigned quota and update expiration time. - * We use sched_clock_cpu directly instead of rq->clock to avoid adding - * additional synchronization around rq->lock. + * Replenish runtime according to assigned quota. We use sched_clock_cpu + * directly instead of rq->clock to avoid adding additional synchronization + * around rq->lock. * * requires cfs_b->lock */ void __refill_cfs_bandwidth_runtime(struct cfs_bandwidth *cfs_b) { - u64 now; - - if (cfs_b->quota == RUNTIME_INF) - return; - - now = sched_clock_cpu(smp_processor_id()); - cfs_b->runtime = cfs_b->quota; + if (cfs_b->quota != RUNTIME_INF) + cfs_b->runtime = cfs_b->quota; }
static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg) @@ -4999,15 +4994,13 @@ static void init_cfs_rq_runtime(struct cfs_rq *cfs_rq)
void start_cfs_bandwidth(struct cfs_bandwidth *cfs_b) { - u64 overrun; - lockdep_assert_held(&cfs_b->lock);
if (cfs_b->period_active) return;
cfs_b->period_active = 1; - overrun = hrtimer_forward_now(&cfs_b->period_timer, cfs_b->period); + hrtimer_forward_now(&cfs_b->period_timer, cfs_b->period); hrtimer_start_expires(&cfs_b->period_timer, HRTIMER_MODE_ABS_PINNED); }
stable-rc/linux-5.3.y boot: 128 boots: 1 failed, 120 passed with 7 offline (v5.3.8-164-g75c9913bbf6e)
Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-5.3.y/kernel/v5.3.8... Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-5.3.y/kernel/v5.3.8-164-g7...
Tree: stable-rc Branch: linux-5.3.y Git Describe: v5.3.8-164-g75c9913bbf6e Git Commit: 75c9913bbf6e9e64cb669236571e6af45cddfd68 Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git Tested: 77 unique boards, 25 SoC families, 16 builds out of 208
Boot Failure Detected:
arm64: defconfig: gcc-8: meson-gxl-s805x-libretech-ac: 1 failed lab
Offline Platforms:
arm:
sunxi_defconfig: gcc-8 sun5i-r8-chip: 1 offline lab sun7i-a20-bananapi: 1 offline lab
multi_v7_defconfig: gcc-8 qcom-apq8064-cm-qs600: 1 offline lab sun5i-r8-chip: 1 offline lab sun7i-a20-bananapi: 1 offline lab
davinci_all_defconfig: gcc-8 dm365evm,legacy: 1 offline lab
qcom_defconfig: gcc-8 qcom-apq8064-cm-qs600: 1 offline lab
--- For more info write to info@kernelci.org
On Tue, 5 Nov 2019 at 03:34, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.3.9 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 06 Nov 2019 09:14:04 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.3.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.3.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary ------------------------------------------------------------------------
kernel: 5.3.9-rc1 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git branch: linux-5.3.y git commit: 75c9913bbf6e9e64cb669236571e6af45cddfd68 git describe: v5.3.8-164-g75c9913bbf6e Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.3-oe/build/v5.3.8-164-g...
No regressions (compared to build v5.3.8)
No fixes (compared to build v5.3.8)
Ran 24209 total tests in the following environments and test suites.
Environments -------------- - dragonboard-410c - hi6220-hikey - i386 - juno-r2 - qemu_arm - qemu_arm64 - qemu_i386 - qemu_x86_64 - x15 - x86
Test Suites ----------- * build * install-android-platform-tools-r2600 * kselftest * libgpiod * libhugetlbfs * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-cpuhotplug-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * network-basic-tests * perf * spectre-meltdown-checker-test * v4l2-compliance * ltp-open-posix-tests * kvm-unit-tests * ssuite * kselftest-vsyscall-mode-native * kselftest-vsyscall-mode-none
On Tue, Nov 05, 2019 at 11:56:57AM +0530, Naresh Kamboju wrote:
On Tue, 5 Nov 2019 at 03:34, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.3.9 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 06 Nov 2019 09:14:04 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.3.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.3.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Thanks for testing all of these so quickly and letting me know.
greg k-h
On 11/4/19 1:43 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.3.9 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 06 Nov 2019 09:14:04 PM UTC. Anything received after that time might be too late.
Build results: total: 158 pass: 158 fail: 0 Qemu test results: total: 391 pass: 391 fail: 0
Guenter
On Tue, Nov 05, 2019 at 06:26:27AM -0800, Guenter Roeck wrote:
On 11/4/19 1:43 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.3.9 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 06 Nov 2019 09:14:04 PM UTC. Anything received after that time might be too late.
Build results: total: 158 pass: 158 fail: 0 Qemu test results: total: 391 pass: 391 fail: 0
THanks for testing all of these and letting me know.
greg k-h
On 11/4/19 2:43 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.3.9 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 06 Nov 2019 09:14:04 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.3.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.3.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks, -- Shuah
On Tue, Nov 05, 2019 at 09:51:43AM -0700, shuah wrote:
On 11/4/19 2:43 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.3.9 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 06 Nov 2019 09:14:04 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.3.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.3.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Thanks for testing all of these and letting me know.
greg k-h
On 04/11/2019 21:43, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.3.9 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 06 Nov 2019 09:14:04 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.3.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.3.y and the diffstat can be found below.
thanks,
greg k-h
All tests for Tegra are passing ...
Test results for stable-v5.3: 12 builds: 12 pass, 0 fail 22 boots: 22 pass, 0 fail 38 tests: 38 pass, 0 fail
Linux version: 5.3.9-rc1-g75c9913bbf6e Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Cheers Jon
On Tue, Nov 05, 2019 at 11:41:15PM +0000, Jon Hunter wrote:
On 04/11/2019 21:43, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.3.9 release. There are 163 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed 06 Nov 2019 09:14:04 PM UTC. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.3.9-rc1.g... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.3.y and the diffstat can be found below.
thanks,
greg k-h
All tests for Tegra are passing ...
Test results for stable-v5.3: 12 builds: 12 pass, 0 fail 22 boots: 22 pass, 0 fail 38 tests: 38 pass, 0 fail
Linux version: 5.3.9-rc1-g75c9913bbf6e Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra20-ventana, tegra210-p2371-2180, tegra30-cardhu-a04
Thanks for testing all of these and letting me know.
greg k-h
linux-stable-mirror@lists.linaro.org