This is the start of the stable review cycle for the 5.14.8 release. There are 100 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Sep 2021 12:43:20 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.8-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.14.8-rc1
Guenter Roeck linux@roeck-us.net drm/nouveau/nvkm: Replace -ENOSYS with -ENODEV
Paul Moore paul@paul-moore.com selinux,smack: fix subjective/objective credential use mixups
Hao Xu haoxu@linux.alibaba.com io_uring: fix off-by-one in BUILD_BUG_ON check of __REQ_F_LAST_BIT
Enzo Matsumiya ematsumiya@suse.de cifs: properly invalidate cached root handle when closing it
Sebastian Andrzej Siewior bigeasy@linutronix.de sched/idle: Make the idle timer expire in hard interrupt context
Yu-Tung Chang mtwget@gmail.com rtc: rx8010: select REGMAP_I2C
Song Liu songliubraving@fb.com blk-mq: allow 4x BLK_MAX_REQUEST_COUNT at blk_plug for multiple_queues
Li Jinlin lijinlin3@huawei.com blk-throttle: fix UAF by deleteing timer in blk_throtl_exit()
Tetsuo Handa penguin-kernel@i-love.sakura.ne.jp block: genhd: don't call blkdev_show() with major_names_lock held
Hannes Reinecke hare@suse.de nvmet: fixup buffer overrun in nvmet_subsys_attr_serial()
Uwe Kleine-König u.kleine-koenig@pengutronix.de pwm: stm32-lp: Don't modify HW state in .remove() callback
Uwe Kleine-König u.kleine-koenig@pengutronix.de pwm: rockchip: Don't modify HW state in .remove() callback
Uwe Kleine-König u.kleine-koenig@pengutronix.de pwm: img: Don't modify HW state in .remove() callback
farah kassabri fkassabri@habana.ai habanalabs: cannot sleep while holding spinlock
Omer Shpigelman oshpigelman@habana.ai habanalabs: add "in device creation" status
Yuri Nudelman ynudelman@habana.ai habanalabs: fix mmu node address resolution in debugfs
Ofir Bitton obitton@habana.ai habanalabs: add validity check for event ID received from F/W
Philip Yang Philip.Yang@amd.com drm/amdgpu: fix fdinfo race with process exit
Anson Jacob Anson.Jacob@amd.com drm/amd/display: Fix memory leak reported by coverity
Luben Tuikov luben.tuikov@amd.com drm/amdgpu: Fixes to returning VBIOS RAS EEPROM address
Koby Elbaz kelbaz@habana.ai habanalabs: fix race between soft reset and heartbeat
Tomer Tayar ttayar@habana.ai habanalabs: fix nullifying of destroyed mmu pgt pool
Niklas Söderlund niklas.soderlund+renesas@ragnatech.se thermal/drivers/rcar_gen3_thermal: Store TSC id as unsigned int
Nanyong Sun sunnanyong@huawei.com nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group
Nanyong Sun sunnanyong@huawei.com nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
Nanyong Sun sunnanyong@huawei.com nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
Nanyong Sun sunnanyong@huawei.com nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
Nanyong Sun sunnanyong@huawei.com nilfs2: fix NULL pointer in nilfs_##name##_attr_release
Nanyong Sun sunnanyong@huawei.com nilfs2: fix memory leak in nilfs_sysfs_create_device_group
Anand Jain anand.jain@oracle.com btrfs: fix lockdep warning while mounting sprout fs
Josef Bacik josef@toxicpanda.com btrfs: delay blkdev_put until after the device remove
Josef Bacik josef@toxicpanda.com btrfs: update the bdev time directly when closing
Vasily Gorbik gor@linux.ibm.com s390/unwind: use current_frame_address() to unwind current task
Jeff Layton jlayton@kernel.org ceph: lockdep annotations for try_nonblocking_invalidate
Xiubo Li xiubli@redhat.com ceph: remove the capsnaps when removing caps
Jeff Layton jlayton@kernel.org ceph: request Fw caps before updating the mtime in ceph_write_iter
Jeff Layton jlayton@kernel.org ceph: fix memory leak on decode error in ceph_handle_caps
Mario Limonciello mario.limonciello@amd.com ACPI: PM: s2idle: Run both AMD and Microsoft methods if both are supported
Kuninori Morimoto kuninori.morimoto.gx@renesas.com ASoC: audio-graph: respawn Platform Support
Sven Schnelle svens@linux.ibm.com s390: add kmemleak annotation in stack_alloc()
Radhey Shyam Pandey radhey.shyam.pandey@xilinx.com dmaengine: xilinx_dma: Set DMA mask for coherent APIs
Johannes Berg johannes.berg@intel.com dmaengine: ioat: depends on !UML
Dan Williams dan.j.williams@intel.com cxl/pci: Introduce cdevm_file_operations
Ben Widawsky ben.widawsky@intel.com cxl: Move cxl_core to new directory
Zou Wei zou_wei@huawei.com dmaengine: sprd: Add missing MODULE_DEVICE_TABLE
Johannes Berg johannes.berg@intel.com dmaengine: idxd: depends on !UML
Adrian Hunter adrian.hunter@intel.com perf tools: Fix hybrid config terms list corruption
Geert Uytterhoeven geert@linux-m68k.org riscv: dts: microchip: mpfs-icicle: Fix serial console
Saravana Kannan saravanak@google.com of: property: Disable fw_devlink DT support for X86
xinhui pan xinhui.pan@amd.com drm/ttm: Fix a deadlock if the target BO is not idle during swap
Ard Biesheuvel ardb@kernel.org arm64: mm: limit linear region to 51 bits for KVM in nVHE mode
Fenghua Yu fenghua.yu@intel.com iommu/vt-d: Fix a deadlock in intel_svm_drain_prq()
Fenghua Yu fenghua.yu@intel.com iommu/vt-d: Fix PASID leak in intel_svm_unbind_mm()
Wei Huang wei.huang2@amd.com iommu/amd: Relocate GAMSup check to early_enable_iommus
Guenter Roeck linux@roeck-us.net parisc: Move pci_dev_is_behind_card_dino to where it is used
Geert Uytterhoeven geert@linux-m68k.org dma-buf: DMABUF_DEBUG should depend on DMA_SHARED_BUFFER
Geert Uytterhoeven geert@linux-m68k.org dma-buf: DMABUF_MOVE_NOTIFY should depend on DMA_SHARED_BUFFER
Thomas Gleixner tglx@linutronix.de drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()
Koba Ko koba.ko@canonical.com drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform
Arnd Bergmann arnd@arndb.de thermal/core: Fix thermal_cooling_device_register() prototype
Masami Hiramatsu mhiramat@kernel.org tracing/boot: Fix to loop on only subkeys
Masami Hiramatsu mhiramat@kernel.org tools/bootconfig: Fix tracing_on option checking in ftrace2bconf.sh
Lukas Bulwahn lukas.bulwahn@gmail.com Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH
Rasmus Villemoes linux@rasmusvillemoes.dk init: move usermodehelper_enable() to populate_rootfs()
Geert Uytterhoeven geert@linux-m68k.org math: RATIONAL_KUNIT_TEST should depend on RATIONAL instead of selecting it
NeilBrown neilb@suse.de SUNRPC: don't pause on incomplete allocation
Heiko Carstens hca@linux.ibm.com s390/entry: make oklabel within CHKSTG macro local
Gwendal Grignou gwendal@chromium.org platform/chrome: cros_ec_trace: Fix format warnings
Gwendal Grignou gwendal@chromium.org platform/chrome: sensorhub: Add trace events for sample
Dave Jiang dave.jiang@intel.com dmaengine: idxd: clear block on fault flag when clear wq
Dave Jiang dave.jiang@intel.com dmaengine: idxd: fix abort status check
Dave Jiang dave.jiang@intel.com dmaengine: idxd: fix wq slot allocation index check
Dave Jiang dave.jiang@intel.com dmaengine: idxd: have command status always set
Dave Jiang dave.jiang@intel.com dmanegine: idxd: cleanup all device related bits after disabling device
Uwe Kleine-König u.kleine-koenig@pengutronix.de pwm: mxs: Don't modify HW state in .probe() after the PWM chip was registered
Uwe Kleine-König u.kleine-koenig@pengutronix.de pwm: lpc32xx: Don't modify HW state in .probe() after the PWM chip was registered
Jeff Layton jlayton@kernel.org ceph: cancel delayed work instead of flushing on mdsc teardown
Matthias Kaehlcke mka@chromium.org thermal/drivers/qcom/spmi-adc-tm5: Don't abort probing if a sensor is not used
Prasad Sodagudi psodagud@codeaurora.org PM: sleep: core: Avoid setting power.must_resume to false
Pavel Skripkin paskripkin@gmail.com profiling: fix shift-out-of-bounds bugs
Zhen Lei thunder.leizhen@huawei.com nilfs2: use refcount_dec_and_lock() to fix potential UAF
Cyrill Gorcunov gorcunov@gmail.com prctl: allow to setup brk for et_dyn executables
Uwe Kleine-König u.kleine-koenig@pengutronix.de pwm: ab8500: Fix register offset calculation to not depend on probe order
Xie Yongji xieyongji@bytedance.com 9p/trans_virtio: Remove sysfs file on probe failure
Dan Carpenter dan.carpenter@oracle.com thermal/drivers/exynos: Fix an error code in exynos_tmu_probe()
Yang Yingliang yangyingliang@huawei.com n64cart: fix return value check in n64cart_probe()
Fabio Aiuto fabioaiuto83@gmail.com staging: rtl8723bs: fix wpa_set_auth_algs() function
Namhyung Kim namhyung@kernel.org perf tools: Allow build-id with trailing zeros
Remi Bernon rbernon@codeweavers.com perf symbol: Look for ImageBase in PE file to compute .text offset
Michael Petlan mpetlan@redhat.com perf test: Fix bpf test sample mismatch reporting
Andy Shevchenko andriy.shevchenko@linux.intel.com dmaengine: acpi: Avoid comparison GSI with Linux vIRQ
Niklas Schnelle schnelle@linux.ibm.com RDMA/mlx5: Fix xlt_chunk_align calculation
Yixing Liu liuyixing1@huawei.com RDMA/hns: Enable stash feature of HIP09
Johannes Berg johannes.berg@intel.com um: virtio_uml: fix memory leak on init failures
QiuXi qiuxi1@huawei.com coredump: fix memleak in dump_vma_snapshot()
Johannes Berg johannes.berg@intel.com um: fix stub location calculation
Nathan Chancellor nathan@kernel.org staging: rtl8192u: Fix bitwise vs logical operator in TranslateRxSignalStuff819xUsb()
nick black dankamongmen@gmail.com console: consume APC, DM, DCS
Pali Rohár pali@kernel.org PCI: aardvark: Fix reporting CRS value
Pali Rohár pali@kernel.org PCI: pci-bridge-emul: Add PCIe Root Capabilities Register
-------------
Diffstat:
Documentation/driver-api/cxl/memory-devices.rst | 2 +- Makefile | 4 +- arch/arm64/kernel/cacheinfo.c | 7 +- arch/arm64/mm/init.c | 16 +++- arch/mips/kernel/cacheinfo.c | 7 +- .../dts/microchip/microchip-mpfs-icicle-kit.dts | 6 +- arch/riscv/kernel/cacheinfo.c | 7 +- arch/s390/include/asm/stacktrace.h | 20 ++--- arch/s390/include/asm/unwind.h | 8 +- arch/s390/kernel/entry.S | 4 +- arch/s390/kernel/setup.c | 10 ++- arch/um/drivers/virtio_uml.c | 4 +- arch/um/kernel/skas/clone.c | 3 +- arch/x86/kernel/cpu/cacheinfo.c | 7 +- arch/x86/um/shared/sysdep/stub_32.h | 12 +++ arch/x86/um/shared/sysdep/stub_64.h | 12 +++ arch/x86/um/stub_segv.c | 3 +- block/blk-mq.c | 14 +++- block/blk-throttle.c | 1 + block/genhd.c | 9 +- drivers/acpi/x86/s2idle.c | 67 ++++++++------- drivers/base/power/main.c | 2 +- drivers/block/n64cart.c | 4 +- drivers/cxl/Makefile | 4 +- drivers/cxl/core/Makefile | 5 ++ drivers/cxl/{core.c => core/bus.c} | 4 +- drivers/cxl/{mem.h => cxlmem.h} | 15 ++++ drivers/cxl/pci.c | 67 ++++++++------- drivers/cxl/pmem.c | 2 +- drivers/dma-buf/Kconfig | 2 + drivers/dma/Kconfig | 4 +- drivers/dma/acpi-dma.c | 10 ++- drivers/dma/idxd/device.c | 98 ++++++++++++++++------ drivers/dma/idxd/idxd.h | 6 +- drivers/dma/idxd/irq.c | 16 +++- drivers/dma/idxd/submit.c | 2 +- drivers/dma/idxd/sysfs.c | 22 ++--- drivers/dma/sprd-dma.c | 1 + drivers/dma/xilinx/xilinx_dma.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c | 50 +++++++---- drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 11 ++- .../drm/amd/display/dc/dcn303/dcn303_resource.c | 6 +- .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 17 +++- drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c | 2 +- drivers/gpu/drm/ttm/ttm_bo.c | 6 +- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 1 + drivers/infiniband/hw/mlx5/mr.c | 2 +- drivers/iommu/amd/init.c | 31 +++++-- drivers/iommu/intel/svm.c | 15 +++- drivers/misc/habanalabs/common/command_buffer.c | 2 - drivers/misc/habanalabs/common/debugfs.c | 2 +- drivers/misc/habanalabs/common/device.c | 56 ++++++++++--- drivers/misc/habanalabs/common/firmware_if.c | 18 ++-- drivers/misc/habanalabs/common/habanalabs.h | 6 +- drivers/misc/habanalabs/common/habanalabs_drv.c | 8 +- drivers/misc/habanalabs/common/hw_queue.c | 30 +++---- drivers/misc/habanalabs/common/memory.c | 2 +- drivers/misc/habanalabs/common/mmu/mmu_v1.c | 12 +-- drivers/misc/habanalabs/common/sysfs.c | 20 ++--- drivers/misc/habanalabs/gaudi/gaudi.c | 6 ++ drivers/misc/habanalabs/goya/goya.c | 6 ++ drivers/nvme/target/configfs.c | 3 +- drivers/of/property.c | 3 + drivers/parisc/dino.c | 18 ++-- drivers/pci/controller/pci-aardvark.c | 67 ++++++++++++++- drivers/pci/pci-bridge-emul.h | 2 +- drivers/platform/chrome/Makefile | 2 +- drivers/platform/chrome/cros_ec_sensorhub_ring.c | 14 ++++ drivers/platform/chrome/cros_ec_trace.h | 94 +++++++++++++++++++++ drivers/pwm/pwm-ab8500.c | 17 +++- drivers/pwm/pwm-img.c | 16 ---- drivers/pwm/pwm-lpc32xx.c | 10 +-- drivers/pwm/pwm-mxs.c | 13 ++- drivers/pwm/pwm-rockchip.c | 14 ---- drivers/pwm/pwm-stm32-lp.c | 2 - drivers/rtc/Kconfig | 1 + drivers/staging/rtl8192u/r8192U_core.c | 2 +- drivers/staging/rtl8723bs/os_dep/ioctl_linux.c | 6 +- drivers/thermal/qcom/qcom-spmi-adc-tm5.c | 6 ++ drivers/thermal/rcar_gen3_thermal.c | 7 +- drivers/thermal/samsung/exynos_tmu.c | 1 + drivers/tty/vt/vt.c | 31 ++++++- fs/btrfs/ioctl.c | 15 +++- fs/btrfs/volumes.c | 48 +++++++---- fs/btrfs/volumes.h | 3 +- fs/ceph/caps.c | 75 ++++++++++++----- fs/ceph/file.c | 32 +++---- fs/ceph/mds_client.c | 32 ++++++- fs/ceph/metric.c | 4 +- fs/ceph/super.h | 6 ++ fs/cifs/smb2ops.c | 20 +++-- fs/coredump.c | 4 +- fs/io_uring.c | 2 +- fs/nilfs2/sysfs.c | 26 +++--- fs/nilfs2/the_nilfs.c | 9 +- include/linux/cacheinfo.h | 18 ---- include/linux/thermal.h | 5 +- include/uapi/misc/habanalabs.h | 4 +- init/initramfs.c | 2 + init/main.c | 1 - init/noinitramfs.c | 2 + kernel/profile.c | 21 ++--- kernel/sched/idle.c | 4 +- kernel/sys.c | 7 -- kernel/trace/trace_boot.c | 6 +- lib/Kconfig.debug | 4 +- net/9p/trans_virtio.c | 4 +- net/sunrpc/svc_xprt.c | 13 +-- security/selinux/hooks.c | 4 +- security/smack/smack_lsm.c | 4 +- sound/soc/generic/audio-graph-card.c | 6 ++ tools/bootconfig/scripts/ftrace2bconf.sh | 4 +- tools/perf/tests/bpf.c | 2 +- tools/perf/util/dso.c | 10 +++ tools/perf/util/parse-events-hybrid.c | 18 +++- tools/perf/util/parse-events.c | 18 ++-- tools/perf/util/symbol.c | 20 ++++- 117 files changed, 1066 insertions(+), 514 deletions(-)
From: Pali Rohár pali@kernel.org
commit e902bb7c24a7099d0eb0eb4cba06f2d91e9299f3 upstream.
The 16-bit Root Capabilities register is at offset 0x1e in the PCIe Capability. Rename current 'rsvd' struct member to 'rootcap'.
Link: https://lore.kernel.org/r/20210722144041.12661-4-pali@kernel.org Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Reviewed-by: Marek Behún kabel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pci-bridge-emul.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/pci/pci-bridge-emul.h +++ b/drivers/pci/pci-bridge-emul.h @@ -54,7 +54,7 @@ struct pci_bridge_emul_pcie_conf { __le16 slotctl; __le16 slotsta; __le16 rootctl; - __le16 rsvd; + __le16 rootcap; __le32 rootsta; __le32 devcap2; __le16 devctl2;
From: Pali Rohár pali@kernel.org
commit 43f5c77bcbd27cce70bf33c2b86d6726ce95dd66 upstream.
Set CRSVIS flag in emulated root PCI bridge to indicate support for Completion Retry Status.
Add check for CRSSVE flag from root PCI brige when issuing Configuration Read Request via PIO to correctly returns fabricated CRS value as it is required by PCIe spec.
Link: https://lore.kernel.org/r/20210722144041.12661-5-pali@kernel.org Fixes: 8a3ebd8de328 ("PCI: aardvark: Implement emulated root PCI bridge config space") Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Lorenzo Pieralisi lorenzo.pieralisi@arm.com Cc: stable@vger.kernel.org # e0d9d30b7354 ("PCI: pci-bridge-emul: Fix big-endian support") Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-aardvark.c | 67 +++++++++++++++++++++++++++++++--- 1 file changed, 63 insertions(+), 4 deletions(-)
--- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -218,6 +218,8 @@
#define MSI_IRQ_NUM 32
+#define CFG_RD_CRS_VAL 0xffff0001 + struct advk_pcie { struct platform_device *pdev; void __iomem *base; @@ -587,7 +589,7 @@ static void advk_pcie_setup_hw(struct ad advk_writel(pcie, reg, PCIE_CORE_CMD_STATUS_REG); }
-static int advk_pcie_check_pio_status(struct advk_pcie *pcie, u32 *val) +static int advk_pcie_check_pio_status(struct advk_pcie *pcie, bool allow_crs, u32 *val) { struct device *dev = &pcie->pdev->dev; u32 reg; @@ -629,9 +631,30 @@ static int advk_pcie_check_pio_status(st strcomp_status = "UR"; break; case PIO_COMPLETION_STATUS_CRS: + if (allow_crs && val) { + /* PCIe r4.0, sec 2.3.2, says: + * If CRS Software Visibility is enabled: + * For a Configuration Read Request that includes both + * bytes of the Vendor ID field of a device Function's + * Configuration Space Header, the Root Complex must + * complete the Request to the host by returning a + * read-data value of 0001h for the Vendor ID field and + * all '1's for any additional bytes included in the + * request. + * + * So CRS in this case is not an error status. + */ + *val = CFG_RD_CRS_VAL; + strcomp_status = NULL; + break; + } /* PCIe r4.0, sec 2.3.2, says: * If CRS Software Visibility is not enabled, the Root Complex * must re-issue the Configuration Request as a new Request. + * If CRS Software Visibility is enabled: For a Configuration + * Write Request or for any other Configuration Read Request, + * the Root Complex must re-issue the Configuration Request as + * a new Request. * A Root Complex implementation may choose to limit the number * of Configuration Request/CRS Completion Status loops before * determining that something is wrong with the target of the @@ -700,6 +723,7 @@ advk_pci_bridge_emul_pcie_conf_read(stru case PCI_EXP_RTCTL: { u32 val = advk_readl(pcie, PCIE_ISR0_MASK_REG); *value = (val & PCIE_MSG_PM_PME_MASK) ? 0 : PCI_EXP_RTCTL_PMEIE; + *value |= PCI_EXP_RTCAP_CRSVIS << 16; return PCI_BRIDGE_EMUL_HANDLED; }
@@ -781,6 +805,7 @@ static struct pci_bridge_emul_ops advk_p static int advk_sw_pci_bridge_init(struct advk_pcie *pcie) { struct pci_bridge_emul *bridge = &pcie->bridge; + int ret;
bridge->conf.vendor = cpu_to_le16(advk_readl(pcie, PCIE_CORE_DEV_ID_REG) & 0xffff); @@ -804,7 +829,15 @@ static int advk_sw_pci_bridge_init(struc bridge->data = pcie; bridge->ops = &advk_pci_bridge_emul_ops;
- return pci_bridge_emul_init(bridge, 0); + /* PCIe config space can be initialized after pci_bridge_emul_init() */ + ret = pci_bridge_emul_init(bridge, 0); + if (ret < 0) + return ret; + + /* Indicates supports for Completion Retry Status */ + bridge->pcie_conf.rootcap = cpu_to_le16(PCI_EXP_RTCAP_CRSVIS); + + return 0; }
static bool advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus, @@ -856,6 +889,7 @@ static int advk_pcie_rd_conf(struct pci_ int where, int size, u32 *val) { struct advk_pcie *pcie = bus->sysdata; + bool allow_crs; u32 reg; int ret;
@@ -868,7 +902,24 @@ static int advk_pcie_rd_conf(struct pci_ return pci_bridge_emul_conf_read(&pcie->bridge, where, size, val);
+ /* + * Completion Retry Status is possible to return only when reading all + * 4 bytes from PCI_VENDOR_ID and PCI_DEVICE_ID registers at once and + * CRSSVE flag on Root Bridge is enabled. + */ + allow_crs = (where == PCI_VENDOR_ID) && (size == 4) && + (le16_to_cpu(pcie->bridge.pcie_conf.rootctl) & + PCI_EXP_RTCTL_CRSSVE); + if (advk_pcie_pio_is_running(pcie)) { + /* + * If it is possible return Completion Retry Status so caller + * tries to issue the request again instead of failing. + */ + if (allow_crs) { + *val = CFG_RD_CRS_VAL; + return PCIBIOS_SUCCESSFUL; + } *val = 0xffffffff; return PCIBIOS_SET_FAILED; } @@ -896,12 +947,20 @@ static int advk_pcie_rd_conf(struct pci_
ret = advk_pcie_wait_pio(pcie); if (ret < 0) { + /* + * If it is possible return Completion Retry Status so caller + * tries to issue the request again instead of failing. + */ + if (allow_crs) { + *val = CFG_RD_CRS_VAL; + return PCIBIOS_SUCCESSFUL; + } *val = 0xffffffff; return PCIBIOS_SET_FAILED; }
/* Check PIO status and get the read result */ - ret = advk_pcie_check_pio_status(pcie, val); + ret = advk_pcie_check_pio_status(pcie, allow_crs, val); if (ret < 0) { *val = 0xffffffff; return PCIBIOS_SET_FAILED; @@ -970,7 +1029,7 @@ static int advk_pcie_wr_conf(struct pci_ if (ret < 0) return PCIBIOS_SET_FAILED;
- ret = advk_pcie_check_pio_status(pcie, NULL); + ret = advk_pcie_check_pio_status(pcie, false, NULL); if (ret < 0) return PCIBIOS_SET_FAILED;
From: nick black dankamongmen@gmail.com
commit 3a2b2eb55681158d3e3ef464fbf47574cf0c517c upstream.
The Linux console's VT102 implementation already consumes OSC ("Operating System Command") sequences, probably because that's how palette changes are transmitted.
In addition to OSC, there are three other major clases of ANSI control strings: APC ("Application Program Command"), PM ("Privacy Message"), and DCS ("Device Control String"). They are handled similarly to OSC in terms of termination.
Source: vt100.net
Add three new enumerated states, one for each of these types. All three are handled the same way right now--they simply consume input until terminated. I hope to expand upon this firmament in the future. Add new predicate ansi_control_string(), returning true for any of these states. Replace explicit checks against ESosc with calls to this function. Transition to these states appropriately from the escape initiation (ESesc) state.
This was motivated by the following Notcurses bugs:
https://github.com/dankamongmen/notcurses/issues/2050 https://github.com/dankamongmen/notcurses/issues/1828 https://github.com/dankamongmen/notcurses/issues/2069
where standard VT sequences are not consumed by the Linux console. It's not necessary that the Linux console *support* these sequences, but it ought *consume* these well-specified classes of sequences.
Tested by sending a variety of escape sequences to the console, and verifying that they still worked, or were now properly consumed. Verified that the escapes were properly terminated at a generic level. Verified that the Notcurses tools continued to show expected output on the Linux console, except now without escape bleedthrough.
Link: https://lore.kernel.org/lkml/YSydL0q8iaUfkphg@schwarzgerat.orthanc/ Signed-off-by: nick black dankamongmen@gmail.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Jiri Slaby jirislaby@kernel.org Cc: Tetsuo Handa penguin-kernel@i-love.sakura.ne.jp Cc: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/vt/vt.c | 31 +++++++++++++++++++++++++++---- 1 file changed, 27 insertions(+), 4 deletions(-)
--- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -2059,7 +2059,7 @@ static void restore_cur(struct vc_data *
enum { ESnormal, ESesc, ESsquare, ESgetpars, ESfunckey, EShash, ESsetG0, ESsetG1, ESpercent, EScsiignore, ESnonstd, - ESpalette, ESosc }; + ESpalette, ESosc, ESapc, ESpm, ESdcs };
/* console_lock is held (except via vc_init()) */ static void reset_terminal(struct vc_data *vc, int do_clear) @@ -2133,20 +2133,28 @@ static void vc_setGx(struct vc_data *vc, vc->vc_translate = set_translate(*charset, vc); }
+/* is this state an ANSI control string? */ +static bool ansi_control_string(unsigned int state) +{ + if (state == ESosc || state == ESapc || state == ESpm || state == ESdcs) + return true; + return false; +} + /* console_lock is held */ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c) { /* * Control characters can be used in the _middle_ - * of an escape sequence. + * of an escape sequence, aside from ANSI control strings. */ - if (vc->vc_state == ESosc && c>=8 && c<=13) /* ... except for OSC */ + if (ansi_control_string(vc->vc_state) && c >= 8 && c <= 13) return; switch (c) { case 0: return; case 7: - if (vc->vc_state == ESosc) + if (ansi_control_string(vc->vc_state)) vc->vc_state = ESnormal; else if (vc->vc_bell_duration) kd_mksound(vc->vc_bell_pitch, vc->vc_bell_duration); @@ -2207,6 +2215,12 @@ static void do_con_trol(struct tty_struc case ']': vc->vc_state = ESnonstd; return; + case '_': + vc->vc_state = ESapc; + return; + case '^': + vc->vc_state = ESpm; + return; case '%': vc->vc_state = ESpercent; return; @@ -2224,6 +2238,9 @@ static void do_con_trol(struct tty_struc if (vc->state.x < VC_TABSTOPS_COUNT) set_bit(vc->state.x, vc->vc_tab_stop); return; + case 'P': + vc->vc_state = ESdcs; + return; case 'Z': respond_ID(tty); return; @@ -2520,8 +2537,14 @@ static void do_con_trol(struct tty_struc vc_setGx(vc, 1, c); vc->vc_state = ESnormal; return; + case ESapc: + return; case ESosc: return; + case ESpm: + return; + case ESdcs: + return; default: vc->vc_state = ESnormal; }
From: Nathan Chancellor nathan@kernel.org
commit 099ec97ac92911abfb102bb5c68ed270fc12e0dd upstream.
clang warns:
drivers/staging/rtl8192u/r8192U_core.c:4268:20: warning: bitwise and of boolean expressions; did you mean logical and? [-Wbool-operation-and] bpacket_toself = bpacket_match_bssid & ^~~~~~~~~~~~~~~~~~~~~ && 1 warning generated.
Replace the bitwise AND with a logical one to clear up the warning, as that is clearly what was intended.
Fixes: 8fc8598e61f6 ("Staging: Added Realtek rtl8192u driver to staging") Signed-off-by: Nathan Chancellor nathan@kernel.org Link: https://lore.kernel.org/r/20210814235625.1780033-1-nathan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/rtl8192u/r8192U_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/rtl8192u/r8192U_core.c +++ b/drivers/staging/rtl8192u/r8192U_core.c @@ -4265,7 +4265,7 @@ static void TranslateRxSignalStuff819xUs bpacket_match_bssid = (type != IEEE80211_FTYPE_CTL) && (ether_addr_equal(priv->ieee80211->current_network.bssid, (fc & IEEE80211_FCTL_TODS) ? hdr->addr1 : (fc & IEEE80211_FCTL_FROMDS) ? hdr->addr2 : hdr->addr3)) && (!pstats->bHwError) && (!pstats->bCRC) && (!pstats->bICV); - bpacket_toself = bpacket_match_bssid & + bpacket_toself = bpacket_match_bssid && (ether_addr_equal(praddr, priv->ieee80211->dev->dev_addr));
if (WLAN_FC_GET_FRAMETYPE(fc) == IEEE80211_STYPE_BEACON)
From: Johannes Berg johannes.berg@intel.com
commit adf9ae0d159d3dc94f58d788fc4757c8749ac0df upstream.
In commit 9f0b4807a44f ("um: rework userspace stubs to not hard-code stub location") I changed stub_segv_handler() to do a calculation with a pointer to a stack variable to find the data page that we're using for the stack and the rest of the data. This same commit was meant to do it as well for stub_clone_handler(), but the change inadvertently went into commit 84b2789d6115 ("um: separate child and parent errors in clone stub") instead.
This was reported to not be compiled correctly by gcc 5, causing the code to crash here. I'm not sure why, perhaps it's UB because the var isn't initialized? In any case, this trick always seemed bad, so just create a new inline function that does the calculation in assembly.
Reported-by: subashab@codeaurora.org Fixes: 9f0b4807a44f ("um: rework userspace stubs to not hard-code stub location") Fixes: 84b2789d6115 ("um: separate child and parent errors in clone stub") Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Richard Weinberger richard@nod.at Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/um/kernel/skas/clone.c | 3 +-- arch/x86/um/shared/sysdep/stub_32.h | 12 ++++++++++++ arch/x86/um/shared/sysdep/stub_64.h | 12 ++++++++++++ arch/x86/um/stub_segv.c | 3 +-- 4 files changed, 26 insertions(+), 4 deletions(-)
--- a/arch/um/kernel/skas/clone.c +++ b/arch/um/kernel/skas/clone.c @@ -24,8 +24,7 @@ void __attribute__ ((__section__ (".__syscall_stub"))) stub_clone_handler(void) { - int stack; - struct stub_data *data = (void *) ((unsigned long)&stack & ~(UM_KERN_PAGE_SIZE - 1)); + struct stub_data *data = get_stub_page(); long err;
err = stub_syscall2(__NR_clone, CLONE_PARENT | CLONE_FILES | SIGCHLD, --- a/arch/x86/um/shared/sysdep/stub_32.h +++ b/arch/x86/um/shared/sysdep/stub_32.h @@ -101,4 +101,16 @@ static inline void remap_stack_and_trap( "memory"); }
+static __always_inline void *get_stub_page(void) +{ + unsigned long ret; + + asm volatile ( + "movl %%esp,%0 ;" + "andl %1,%0" + : "=a" (ret) + : "g" (~(UM_KERN_PAGE_SIZE - 1))); + + return (void *)ret; +} #endif --- a/arch/x86/um/shared/sysdep/stub_64.h +++ b/arch/x86/um/shared/sysdep/stub_64.h @@ -108,4 +108,16 @@ static inline void remap_stack_and_trap( __syscall_clobber, "r10", "r8", "r9"); }
+static __always_inline void *get_stub_page(void) +{ + unsigned long ret; + + asm volatile ( + "movq %%rsp,%0 ;" + "andq %1,%0" + : "=a" (ret) + : "g" (~(UM_KERN_PAGE_SIZE - 1))); + + return (void *)ret; +} #endif --- a/arch/x86/um/stub_segv.c +++ b/arch/x86/um/stub_segv.c @@ -11,9 +11,8 @@ void __attribute__ ((__section__ (".__syscall_stub"))) stub_segv_handler(int sig, siginfo_t *info, void *p) { - int stack; + struct faultinfo *f = get_stub_page(); ucontext_t *uc = p; - struct faultinfo *f = (void *)(((unsigned long)&stack) & ~(UM_KERN_PAGE_SIZE - 1));
GET_FAULTINFO_FROM_MC(*f, &uc->uc_mcontext); trap_myself();
From: QiuXi qiuxi1@huawei.com
commit 6fcac87e1f9e5b27805a2a404f4849194bb51de8 upstream.
dump_vma_snapshot() allocs memory for *vma_meta, when dump_vma_snapshot() returns -EFAULT, the memory will be leaked, so we free it correctly.
Link: https://lkml.kernel.org/r/20210810020441.62806-1-qiuxi1@huawei.com Fixes: a07279c9a8cd7 ("binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot") Signed-off-by: QiuXi qiuxi1@huawei.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: Jann Horn jannh@google.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/coredump.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/coredump.c +++ b/fs/coredump.c @@ -1127,8 +1127,10 @@ int dump_vma_snapshot(struct coredump_pa
mmap_write_unlock(mm);
- if (WARN_ON(i != *vma_count)) + if (WARN_ON(i != *vma_count)) { + kvfree(*vma_meta); return -EFAULT; + }
*vma_data_size_ptr = vma_data_size; return 0;
From: Johannes Berg johannes.berg@intel.com
commit 7ad28e0df7ee9dbcb793bb88dd81d4d22bb9a10e upstream.
If initialization fails, e.g. because the connection failed, we leak the 'vu_dev'. Fix that. Reported by smatch.
Fixes: 5d38f324993f ("um: drivers: Add virtio vhost-user driver") Signed-off-by: Johannes Berg johannes.berg@intel.com Acked-By: Anton Ivanov anton.ivanov@cambridgegreys.com Signed-off-by: Richard Weinberger richard@nod.at Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/um/drivers/virtio_uml.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/um/drivers/virtio_uml.c +++ b/arch/um/drivers/virtio_uml.c @@ -1139,7 +1139,7 @@ static int virtio_uml_probe(struct platf rc = os_connect_socket(pdata->socket_path); } while (rc == -EINTR); if (rc < 0) - return rc; + goto error_free; vu_dev->sock = rc;
spin_lock_init(&vu_dev->sock_lock); @@ -1160,6 +1160,8 @@ static int virtio_uml_probe(struct platf
error_init: os_close_file(vu_dev->sock); +error_free: + kfree(vu_dev); return rc; }
From: Yixing Liu liuyixing1@huawei.com
commit 260f64a40198309008026447f7fda277a73ed8c3 upstream.
The stash feature is enabled by default on HIP09.
Fixes: f93c39bc9547 ("RDMA/hns: Add support for QP stash") Fixes: bfefae9f108d ("RDMA/hns: Add support for CQ stash") Link: https://lore.kernel.org/r/1629539607-33217-3-git-send-email-liangwenpeng@hua... Signed-off-by: Yixing Liu liuyixing1@huawei.com Signed-off-by: Wenpeng Liang liangwenpeng@huawei.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -2004,6 +2004,7 @@ static void set_default_caps(struct hns_ caps->gid_table_len[0] = HNS_ROCE_V2_GID_INDEX_NUM;
if (hr_dev->pci_dev->revision >= PCI_REVISION_ID_HIP09) { + caps->flags |= HNS_ROCE_CAP_FLAG_STASH; caps->max_sq_inline = HNS_ROCE_V3_MAX_SQ_INLINE; } else { caps->max_sq_inline = HNS_ROCE_V2_MAX_SQ_INLINE;
From: Niklas Schnelle schnelle@linux.ibm.com
commit f4c6f31011eafe027abddf6cee1288a1b5a05b73 upstream.
The XLT chunk alignment depends on ent_size not sizeof(ent_size) aka sizeof(size_t). The incoming ent_size is either 8 or 16, so the miscalculation when 16 is required is only an over-alignment and functional harmless.
Fixes: 8010d74b9965 ("RDMA/mlx5: Split the WR setup out of mlx5_ib_update_xlt()") Link: https://lore.kernel.org/r/20210908081849.7948-2-schnelle@linux.ibm.com Signed-off-by: Niklas Schnelle schnelle@linux.ibm.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/infiniband/hw/mlx5/mr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -995,7 +995,7 @@ static struct mlx5_ib_mr *alloc_cacheabl static void *mlx5_ib_alloc_xlt(size_t *nents, size_t ent_size, gfp_t gfp_mask) { const size_t xlt_chunk_align = - MLX5_UMR_MTT_ALIGNMENT / sizeof(ent_size); + MLX5_UMR_MTT_ALIGNMENT / ent_size; size_t size; void *res = NULL;
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit 67db87dc8284070adb15b3c02c1c31d5cf51c5d6 upstream.
Currently the CRST parsing relies on the fact that on most of x86 devices the IRQ mapping is 1:1 with Linux vIRQ. However, it may be not true for some. Fix this by converting GSI to Linux vIRQ before checking it.
Fixes: ee8209fd026b ("dma: acpi-dma: parse CSRT to extract additional resources") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20210730202715.24375-1-andriy.shevchenko@linux.int... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/acpi-dma.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
--- a/drivers/dma/acpi-dma.c +++ b/drivers/dma/acpi-dma.c @@ -70,10 +70,14 @@ static int acpi_dma_parse_resource_group
si = (const struct acpi_csrt_shared_info *)&grp[1];
- /* Match device by MMIO and IRQ */ + /* Match device by MMIO */ if (si->mmio_base_low != lower_32_bits(mem) || - si->mmio_base_high != upper_32_bits(mem) || - si->gsi_interrupt != irq) + si->mmio_base_high != upper_32_bits(mem)) + return 0; + + /* Match device by Linux vIRQ */ + ret = acpi_register_gsi(NULL, si->gsi_interrupt, si->interrupt_mode, si->interrupt_polarity); + if (ret != irq) return 0;
dev_dbg(&adev->dev, "matches with %.4s%04X (rev %u)\n",
From: Michael Petlan mpetlan@redhat.com
commit 3e11300cdfd5f1bc13a05dfc6dccf69aca5dd1dc upstream.
When the expected sample count in the condition changed, the message needs to be changed too, otherwise we'll get:
0x1001f2091d8: mmap mask[0]: BPF filter result incorrect, expected 56, got 56 samples
Fixes: 4b04e0decd2518e5 ("perf test: Fix basic bpf filtering test") Signed-off-by: Michael Petlan mpetlan@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Sumanth Korikkar sumanthk@linux.ibm.com Link: https //lore.kernel.org/r/20210805160611.5542-1-mpetlan@redhat.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/tests/bpf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/perf/tests/bpf.c +++ b/tools/perf/tests/bpf.c @@ -192,7 +192,7 @@ static int do_test(struct bpf_object *ob }
if (count != expect * evlist->core.nr_entries) { - pr_debug("BPF filter result incorrect, expected %d, got %d samples\n", expect, count); + pr_debug("BPF filter result incorrect, expected %d, got %d samples\n", expect * evlist->core.nr_entries, count); goto out_delete_evlist; }
From: Remi Bernon rbernon@codeweavers.com
commit d2930ede5218be28413a00130a6895d14393c325 upstream.
Instead of using the file offset in the debug file.
This fixes a regression from 00a3423492bc90be ("perf symbols: Make dso__load_bfd_symbols() load PE files from debug cache only"), causing incorrect symbol resolution when debug file have been stripped from non-debug sections (in which case its .text section is empty and doesn't have any file position).
The debug files could also be created with a different file alignment, and have different file positions from the mmap-ed binary, or have the section reordered.
This instead looks for the file image base, using the corresponding bfd *ABS* symbols. As PE symbols only have 4 bytes, it also needs to keep .text section vma high bits.
Signed-off-by: Remi Bernon rbernon@codeweavers.com Fixes: 00a3423492bc90be ("perf symbols: Make dso__load_bfd_symbols() load PE files from debug cache only") Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Nicholas Fraser nfraser@codeweavers.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/20210909192637.4139125-1-rbernon@codeweavers.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/util/symbol.c | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-)
--- a/tools/perf/util/symbol.c +++ b/tools/perf/util/symbol.c @@ -1581,10 +1581,6 @@ int dso__load_bfd_symbols(struct dso *ds if (bfd_get_flavour(abfd) == bfd_target_elf_flavour) goto out_close;
- section = bfd_get_section_by_name(abfd, ".text"); - if (section) - dso->text_offset = section->vma - section->filepos; - symbols_size = bfd_get_symtab_upper_bound(abfd); if (symbols_size == 0) { bfd_close(abfd); @@ -1602,6 +1598,22 @@ int dso__load_bfd_symbols(struct dso *ds if (symbols_count < 0) goto out_free;
+ section = bfd_get_section_by_name(abfd, ".text"); + if (section) { + for (i = 0; i < symbols_count; ++i) { + if (!strcmp(bfd_asymbol_name(symbols[i]), "__ImageBase") || + !strcmp(bfd_asymbol_name(symbols[i]), "__image_base__")) + break; + } + if (i < symbols_count) { + /* PE symbols can only have 4 bytes, so use .text high bits */ + dso->text_offset = section->vma - (u32)section->vma; + dso->text_offset += (u32)bfd_asymbol_value(symbols[i]); + } else { + dso->text_offset = section->vma - section->filepos; + } + } + qsort(symbols, symbols_count, sizeof(asymbol *), bfd_symbols__cmpvalue);
#ifdef bfd_get_section
From: Namhyung Kim namhyung@kernel.org
commit 4a86d41404005a3c7e7b6065e8169ac6202887a9 upstream.
Currently perf saves a build-id with size but old versions assumes the size of 20. In case the build-id is less than 20 (like for MD5), it'd fill the rest with 0s.
I saw a problem when old version of perf record saved a binary in the build-id cache and new version of perf reads the data. The symbols should be read from the build-id cache (as the path no longer has the same binary) but it failed due to mismatch in the build-id.
symsrc__init: build id mismatch for /home/namhyung/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf.
The build-id event in the data has 20 byte build-ids, but it saw a different size (16) when it reads the build-id of the elf file in the build-id cache.
$ readelf -n ~/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf
Displaying notes found in: .note.gnu.build-id Owner Data size Description GNU 0x00000010 NT_GNU_BUILD_ID (unique build ID bitstring) Build ID: 53e4c2f42a4c61a2d632d92a72afa08f
Let's fix this by allowing trailing zeros if the size is different.
Fixes: 39be8d0115b321ed ("perf tools: Pass build_id object to dso__build_id_equal()") Signed-off-by: Namhyung Kim namhyung@kernel.org Acked-by: Jiri Olsa jolsa@redhat.com Cc: Andi Kleen ak@linux.intel.com Cc: Ian Rogers irogers@google.com Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/20210910224630.1084877-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/util/dso.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/tools/perf/util/dso.c +++ b/tools/perf/util/dso.c @@ -1349,6 +1349,16 @@ void dso__set_build_id(struct dso *dso,
bool dso__build_id_equal(const struct dso *dso, struct build_id *bid) { + if (dso->bid.size > bid->size && dso->bid.size == BUILD_ID_SIZE) { + /* + * For the backward compatibility, it allows a build-id has + * trailing zeros. + */ + return !memcmp(dso->bid.data, bid->data, bid->size) && + !memchr_inv(&dso->bid.data[bid->size], 0, + dso->bid.size - bid->size); + } + return dso->bid.size == bid->size && memcmp(dso->bid.data, bid->data, dso->bid.size) == 0; }
From: Fabio Aiuto fabioaiuto83@gmail.com
commit b658acbf64ae38b8fca982c2929ccc0bf4eb1ea2 upstream.
fix authentication algorithm constants. wpa_set_auth_algs() function contains some conditional statements masking the checked value with the wrong constants. This produces some unintentional dead code. Mask the value with the right macros.
Fixes: 5befa937e8da ("staging: rtl8723bs: Fix IEEE80211 authentication algorithm constants.") Reported-by: Colin Ian King colin.king@canonical.com Tested-on: Lenovo Ideapad MiiX 300-10IBY Signed-off-by: Fabio Aiuto fabioaiuto83@gmail.com Link: https://lore.kernel.org/r/20210715145700.9427-1-fabioaiuto83@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/rtl8723bs/os_dep/ioctl_linux.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c +++ b/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c @@ -349,16 +349,16 @@ static int wpa_set_auth_algs(struct net_ struct adapter *padapter = rtw_netdev_priv(dev); int ret = 0;
- if ((value & WLAN_AUTH_SHARED_KEY) && (value & WLAN_AUTH_OPEN)) { + if ((value & IW_AUTH_ALG_SHARED_KEY) && (value & IW_AUTH_ALG_OPEN_SYSTEM)) { padapter->securitypriv.ndisencryptstatus = Ndis802_11Encryption1Enabled; padapter->securitypriv.ndisauthtype = Ndis802_11AuthModeAutoSwitch; padapter->securitypriv.dot11AuthAlgrthm = dot11AuthAlgrthm_Auto; - } else if (value & WLAN_AUTH_SHARED_KEY) { + } else if (value & IW_AUTH_ALG_SHARED_KEY) { padapter->securitypriv.ndisencryptstatus = Ndis802_11Encryption1Enabled;
padapter->securitypriv.ndisauthtype = Ndis802_11AuthModeShared; padapter->securitypriv.dot11AuthAlgrthm = dot11AuthAlgrthm_Shared; - } else if (value & WLAN_AUTH_OPEN) { + } else if (value & IW_AUTH_ALG_OPEN_SYSTEM) { /* padapter->securitypriv.ndisencryptstatus = Ndis802_11EncryptionDisabled; */ if (padapter->securitypriv.ndisauthtype < Ndis802_11AuthModeWPAPSK) { padapter->securitypriv.ndisauthtype = Ndis802_11AuthModeOpen;
From: Yang Yingliang yangyingliang@huawei.com
commit 221e8360834c59f0c9952630fa5904a94ebd2bb8 upstream.
In case of error, the function devm_platform_ioremap_resource() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR().
Fixes: d9b2a2bbbb4d ("block: Add n64 cart driver") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Chaitanya Kulkarni kch@nvidia.com Link: https://lore.kernel.org/r/20210909090608.2989716-1-yangyingliang@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/n64cart.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/block/n64cart.c +++ b/drivers/block/n64cart.c @@ -129,8 +129,8 @@ static int __init n64cart_probe(struct p }
reg_base = devm_platform_ioremap_resource(pdev, 0); - if (!reg_base) - return -EINVAL; + if (IS_ERR(reg_base)) + return PTR_ERR(reg_base);
disk = blk_alloc_disk(NUMA_NO_NODE); if (!disk)
From: Dan Carpenter dan.carpenter@oracle.com
commit 02d438f62c05f0d055ceeedf12a2f8796b258c08 upstream.
This error path return success but it should propagate the negative error code from devm_clk_get().
Fixes: 6c247393cfdd ("thermal: exynos: Add TMU support for Exynos7 SoC") Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@canonical.com Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Link: https://lore.kernel.org/r/20210810084413.GA23810@kili Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/thermal/samsung/exynos_tmu.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/thermal/samsung/exynos_tmu.c +++ b/drivers/thermal/samsung/exynos_tmu.c @@ -1073,6 +1073,7 @@ static int exynos_tmu_probe(struct platf data->sclk = devm_clk_get(&pdev->dev, "tmu_sclk"); if (IS_ERR(data->sclk)) { dev_err(&pdev->dev, "Failed to get sclk\n"); + ret = PTR_ERR(data->sclk); goto err_clk; } else { ret = clk_prepare_enable(data->sclk);
From: Xie Yongji xieyongji@bytedance.com
commit f997ea3b7afc108eb9761f321b57de2d089c7c48 upstream.
This ensures we don't leak the sysfs file if we failed to allocate chan->vc_wq during probe.
Link: http://lkml.kernel.org/r/20210517083557.172-1-xieyongji@bytedance.com Fixes: 86c8437383ac ("net/9p: Add sysfs mount_tag file for virtio 9P device") Signed-off-by: Xie Yongji xieyongji@bytedance.com Signed-off-by: Dominique Martinet asmadeus@codewreck.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/9p/trans_virtio.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/9p/trans_virtio.c +++ b/net/9p/trans_virtio.c @@ -610,7 +610,7 @@ static int p9_virtio_probe(struct virtio chan->vc_wq = kmalloc(sizeof(wait_queue_head_t), GFP_KERNEL); if (!chan->vc_wq) { err = -ENOMEM; - goto out_free_tag; + goto out_remove_file; } init_waitqueue_head(chan->vc_wq); chan->ring_bufs_avail = 1; @@ -628,6 +628,8 @@ static int p9_virtio_probe(struct virtio
return 0;
+out_remove_file: + sysfs_remove_file(&vdev->dev.kobj, &dev_attr_mount_tag.attr); out_free_tag: kfree(tag); out_free_vq:
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
commit eb41f334589d66b9da6f2b1acf7963ef8ca8d94e upstream.
The assumption that lead to commit 5e5da1e9fbee ("pwm: ab8500: Explicitly allocate pwm chip base dynamically") was wrong: The pwm-ab8500 devices are not directly instantiated from device tree, but from the ab8500 mfd driver. So the pdev->id isn't -1, but a number between 1 and 3. Now that pwmchip ids are always allocated dynamically, this cannot easily be reverted.
Introduce a new member in the driver data struct that tracks the hardware id and use this to calculate the register offset.
Side-note: Using chip->base to calculate the offset was never robust because if there was already a PWM with id 1 at the time ab8500-pwm.1 was probed, the associated pwmchip would get assigned chip->base = 2 (or something bigger).
Fixes: 5e5da1e9fbee ("pwm: ab8500: Explicitly allocate pwm chip base dynamically") Fixes: 6173f8f4ed9c ("pwm: Move AB8500 PWM driver to PWM framework") Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Acked-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pwm/pwm-ab8500.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-)
--- a/drivers/pwm/pwm-ab8500.c +++ b/drivers/pwm/pwm-ab8500.c @@ -22,14 +22,21 @@
struct ab8500_pwm_chip { struct pwm_chip chip; + unsigned int hwid; };
+static struct ab8500_pwm_chip *ab8500_pwm_from_chip(struct pwm_chip *chip) +{ + return container_of(chip, struct ab8500_pwm_chip, chip); +} + static int ab8500_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm, const struct pwm_state *state) { int ret; u8 reg; unsigned int higher_val, lower_val; + struct ab8500_pwm_chip *ab8500 = ab8500_pwm_from_chip(chip);
if (state->polarity != PWM_POLARITY_NORMAL) return -EINVAL; @@ -37,7 +44,7 @@ static int ab8500_pwm_apply(struct pwm_c if (!state->enabled) { ret = abx500_mask_and_set_register_interruptible(chip->dev, AB8500_MISC, AB8500_PWM_OUT_CTRL7_REG, - 1 << (chip->base - 1), 0); + 1 << ab8500->hwid, 0);
if (ret < 0) dev_err(chip->dev, "%s: Failed to disable PWM, Error %d\n", @@ -56,7 +63,7 @@ static int ab8500_pwm_apply(struct pwm_c */ higher_val = ((state->duty_cycle & 0x0300) >> 8);
- reg = AB8500_PWM_OUT_CTRL1_REG + ((chip->base - 1) * 2); + reg = AB8500_PWM_OUT_CTRL1_REG + (ab8500->hwid * 2);
ret = abx500_set_register_interruptible(chip->dev, AB8500_MISC, reg, (u8)lower_val); @@ -70,7 +77,7 @@ static int ab8500_pwm_apply(struct pwm_c
ret = abx500_mask_and_set_register_interruptible(chip->dev, AB8500_MISC, AB8500_PWM_OUT_CTRL7_REG, - 1 << (chip->base - 1), 1 << (chip->base - 1)); + 1 << ab8500->hwid, 1 << ab8500->hwid); if (ret < 0) dev_err(chip->dev, "%s: Failed to enable PWM, Error %d\n", pwm->label, ret); @@ -88,6 +95,9 @@ static int ab8500_pwm_probe(struct platf struct ab8500_pwm_chip *ab8500; int err;
+ if (pdev->id < 1 || pdev->id > 31) + return dev_err_probe(&pdev->dev, EINVAL, "Invalid device id %d\n", pdev->id); + /* * Nothing to be done in probe, this is required to get the * device which is required for ab8500 read and write @@ -99,6 +109,7 @@ static int ab8500_pwm_probe(struct platf ab8500->chip.dev = &pdev->dev; ab8500->chip.ops = &ab8500_pwm_ops; ab8500->chip.npwm = 1; + ab8500->hwid = pdev->id - 1;
err = pwmchip_add(&ab8500->chip); if (err < 0)
From: Cyrill Gorcunov gorcunov@gmail.com
commit e1fbbd073137a9d63279f6bf363151a938347640 upstream.
Keno Fischer reported that when a binray loaded via ld-linux-x the prctl(PR_SET_MM_MAP) doesn't allow to setup brk value because it lays before mm:end_data.
For example a test program shows
| # ~/t | | start_code 401000 | end_code 401a15 | start_stack 7ffce4577dd0 | start_data 403e10 | end_data 40408c | start_brk b5b000 | sbrk(0) b5b000
and when executed via ld-linux
| # /lib64/ld-linux-x86-64.so.2 ~/t | | start_code 7fc25b0a4000 | end_code 7fc25b0c4524 | start_stack 7fffcc6b2400 | start_data 7fc25b0ce4c0 | end_data 7fc25b0cff98 | start_brk 55555710c000 | sbrk(0) 55555710c000
This of course prevent criu from restoring such programs. Looking into how kernel operates with brk/start_brk inside brk() syscall I don't see any problem if we allow to setup brk/start_brk without checking for end_data. Even if someone pass some weird address here on a purpose then the worst possible result will be an unexpected unmapping of existing vma (own vma, since prctl works with the callers memory) but test for RLIMIT_DATA is still valid and a user won't be able to gain more memory in case of expanding VMAs via new values shipped with prctl call.
Link: https://lkml.kernel.org/r/20210121221207.GB2174@grain Fixes: bbdc6076d2e5 ("binfmt_elf: move brk out of mmap when doing direct loader exec") Signed-off-by: Cyrill Gorcunov gorcunov@gmail.com Reported-by: Keno Fischer keno@juliacomputing.com Acked-by: Andrey Vagin avagin@gmail.com Tested-by: Andrey Vagin avagin@gmail.com Cc: Dmitry Safonov 0x7f454c46@gmail.com Cc: Kirill Tkhai ktkhai@virtuozzo.com Cc: Eric W. Biederman ebiederm@xmission.com Cc: Pavel Tikhomirov ptikhomirov@virtuozzo.com Cc: Alexander Mikhalitsyn alexander.mikhalitsyn@virtuozzo.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sys.c | 7 ------- 1 file changed, 7 deletions(-)
--- a/kernel/sys.c +++ b/kernel/sys.c @@ -1960,13 +1960,6 @@ static int validate_prctl_map_addr(struc error = -EINVAL;
/* - * @brk should be after @end_data in traditional maps. - */ - if (prctl_map->start_brk <= prctl_map->end_data || - prctl_map->brk <= prctl_map->end_data) - goto out; - - /* * Neither we should allow to override limits if they set. */ if (check_data_rlimit(rlimit(RLIMIT_DATA), prctl_map->brk,
From: Zhen Lei thunder.leizhen@huawei.com
commit 98e2e409e76ef7781d8511f997359e9c504a95c1 upstream.
When the refcount is decreased to 0, the resource reclamation branch is entered. Before CPU0 reaches the race point (1), CPU1 may obtain the spinlock and traverse the rbtree to find 'root', see nilfs_lookup_root().
Although CPU1 will call refcount_inc() to increase the refcount, it is obviously too late. CPU0 will release 'root' directly, CPU1 then accesses 'root' and triggers UAF.
Use refcount_dec_and_lock() to ensure that both the operations of decrease refcount to 0 and link deletion are lock protected eliminates this risk.
CPU0 CPU1 nilfs_put_root(): <-------- (1) spin_lock(&nilfs->ns_cptree_lock); rb_erase(&root->rb_node, &nilfs->ns_cptree); spin_unlock(&nilfs->ns_cptree_lock);
kfree(root); <-------- use-after-free
refcount_t: underflow; use-after-free. WARNING: CPU: 2 PID: 9476 at lib/refcount.c:28 \ refcount_warn_saturate+0x1cf/0x210 lib/refcount.c:28 Modules linked in: CPU: 2 PID: 9476 Comm: syz-executor.0 Not tainted 5.10.45-rc1+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ... RIP: 0010:refcount_warn_saturate+0x1cf/0x210 lib/refcount.c:28 ... ... Call Trace: __refcount_sub_and_test include/linux/refcount.h:283 [inline] __refcount_dec_and_test include/linux/refcount.h:315 [inline] refcount_dec_and_test include/linux/refcount.h:333 [inline] nilfs_put_root+0xc1/0xd0 fs/nilfs2/the_nilfs.c:795 nilfs_segctor_destroy fs/nilfs2/segment.c:2749 [inline] nilfs_detach_log_writer+0x3fa/0x570 fs/nilfs2/segment.c:2812 nilfs_put_super+0x2f/0xf0 fs/nilfs2/super.c:467 generic_shutdown_super+0xcd/0x1f0 fs/super.c:464 kill_block_super+0x4a/0x90 fs/super.c:1446 deactivate_locked_super+0x6a/0xb0 fs/super.c:335 deactivate_super+0x85/0x90 fs/super.c:366 cleanup_mnt+0x277/0x2e0 fs/namespace.c:1118 __cleanup_mnt+0x15/0x20 fs/namespace.c:1125 task_work_run+0x8e/0x110 kernel/task_work.c:151 tracehook_notify_resume include/linux/tracehook.h:188 [inline] exit_to_user_mode_loop kernel/entry/common.c:164 [inline] exit_to_user_mode_prepare+0x13c/0x170 kernel/entry/common.c:191 syscall_exit_to_user_mode+0x16/0x30 kernel/entry/common.c:266 do_syscall_64+0x45/0x80 arch/x86/entry/common.c:56 entry_SYSCALL_64_after_hwframe+0x44/0xa9
There is no reproduction program, and the above is only theoretical analysis.
Link: https://lkml.kernel.org/r/1629859428-5906-1-git-send-email-konishi.ryusuke@g... Fixes: ba65ae4729bf ("nilfs2: add checkpoint tree to nilfs object") Link: https://lkml.kernel.org/r/20210723012317.4146-1-thunder.leizhen@huawei.com Signed-off-by: Zhen Lei thunder.leizhen@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nilfs2/the_nilfs.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
--- a/fs/nilfs2/the_nilfs.c +++ b/fs/nilfs2/the_nilfs.c @@ -792,14 +792,13 @@ nilfs_find_or_create_root(struct the_nil
void nilfs_put_root(struct nilfs_root *root) { - if (refcount_dec_and_test(&root->count)) { - struct the_nilfs *nilfs = root->nilfs; + struct the_nilfs *nilfs = root->nilfs;
- nilfs_sysfs_delete_snapshot_group(root); - - spin_lock(&nilfs->ns_cptree_lock); + if (refcount_dec_and_lock(&root->count, &nilfs->ns_cptree_lock)) { rb_erase(&root->rb_node, &nilfs->ns_cptree); spin_unlock(&nilfs->ns_cptree_lock); + + nilfs_sysfs_delete_snapshot_group(root); iput(root->ifile);
kfree(root);
From: Pavel Skripkin paskripkin@gmail.com
commit 2d186afd04d669fe9c48b994c41a7405a3c9f16d upstream.
Syzbot reported shift-out-of-bounds bug in profile_init(). The problem was in incorrect prof_shift. Since prof_shift value comes from userspace we need to clamp this value into [0, BITS_PER_LONG -1] boundaries.
Second possible shiht-out-of-bounds was found by Tetsuo: sample_step local variable in read_profile() had "unsigned int" type, but prof_shift allows to make a BITS_PER_LONG shift. So, to prevent possible shiht-out-of-bounds sample_step type was changed to "unsigned long".
Also, "unsigned short int" will be sufficient for storing [0, BITS_PER_LONG] value, that's why there is no need for "unsigned long" prof_shift.
Link: https://lkml.kernel.org/r/20210813140022.5011-1-paskripkin@gmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-and-tested-by: syzbot+e68c89a9510c159d9684@syzkaller.appspotmail.com Suggested-by: Tetsuo Handa penguin-kernel@i-love.sakura.ne.jp Signed-off-by: Pavel Skripkin paskripkin@gmail.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Steven Rostedt rostedt@goodmis.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/profile.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
--- a/kernel/profile.c +++ b/kernel/profile.c @@ -41,7 +41,8 @@ struct profile_hit { #define NR_PROFILE_GRP (NR_PROFILE_HIT/PROFILE_GRPSZ)
static atomic_t *prof_buffer; -static unsigned long prof_len, prof_shift; +static unsigned long prof_len; +static unsigned short int prof_shift;
int prof_on __read_mostly; EXPORT_SYMBOL_GPL(prof_on); @@ -67,8 +68,8 @@ int profile_setup(char *str) if (str[strlen(sleepstr)] == ',') str += strlen(sleepstr) + 1; if (get_option(&str, &par)) - prof_shift = par; - pr_info("kernel sleep profiling enabled (shift: %ld)\n", + prof_shift = clamp(par, 0, BITS_PER_LONG - 1); + pr_info("kernel sleep profiling enabled (shift: %u)\n", prof_shift); #else pr_warn("kernel sleep profiling requires CONFIG_SCHEDSTATS\n"); @@ -78,21 +79,21 @@ int profile_setup(char *str) if (str[strlen(schedstr)] == ',') str += strlen(schedstr) + 1; if (get_option(&str, &par)) - prof_shift = par; - pr_info("kernel schedule profiling enabled (shift: %ld)\n", + prof_shift = clamp(par, 0, BITS_PER_LONG - 1); + pr_info("kernel schedule profiling enabled (shift: %u)\n", prof_shift); } else if (!strncmp(str, kvmstr, strlen(kvmstr))) { prof_on = KVM_PROFILING; if (str[strlen(kvmstr)] == ',') str += strlen(kvmstr) + 1; if (get_option(&str, &par)) - prof_shift = par; - pr_info("kernel KVM profiling enabled (shift: %ld)\n", + prof_shift = clamp(par, 0, BITS_PER_LONG - 1); + pr_info("kernel KVM profiling enabled (shift: %u)\n", prof_shift); } else if (get_option(&str, &par)) { - prof_shift = par; + prof_shift = clamp(par, 0, BITS_PER_LONG - 1); prof_on = CPU_PROFILING; - pr_info("kernel profiling enabled (shift: %ld)\n", + pr_info("kernel profiling enabled (shift: %u)\n", prof_shift); } return 1; @@ -468,7 +469,7 @@ read_profile(struct file *file, char __u unsigned long p = *ppos; ssize_t read; char *pnt; - unsigned int sample_step = 1 << prof_shift; + unsigned long sample_step = 1UL << prof_shift;
profile_flip_buffers(); if (p >= (prof_len+1)*sizeof(unsigned int))
From: Prasad Sodagudi psodagud@codeaurora.org
commit 4a9344cd0aa4499beb3772bbecb40bb78888c0e1 upstream.
There are variables(power.may_skip_resume and dev->power.must_resume) and DPM_FLAG_MAY_SKIP_RESUME flags to control the resume of devices after a system wide suspend transition.
Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows its "noirq" and "early" resume callbacks to be skipped if the device can be left in suspend after a system-wide transition into the working state. PM core determines that the driver's "noirq" and "early" resume callbacks should be skipped or not with dev_pm_skip_resume() function by checking power.may_skip_resume variable.
power.must_resume variable is getting set to false in __device_suspend() function without checking device's DPM_FLAG_MAY_SKIP_RESUME settings. In problematic scenario, where all the devices in the suspend_late stage are successful and some device can fail to suspend in suspend_noirq phase. So some devices successfully suspended in suspend_late stage are not getting chance to execute __device_suspend_noirq() to set dev->power.must_resume variable to true and not getting resumed in early_resume phase.
Add a check for device's DPM_FLAG_MAY_SKIP_RESUME flag before setting power.must_resume variable in __device_suspend function.
Fixes: 6e176bf8d461 ("PM: sleep: core: Do not skip callbacks in the resume phase") Signed-off-by: Prasad Sodagudi psodagud@codeaurora.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/power/main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -1642,7 +1642,7 @@ static int __device_suspend(struct devic }
dev->power.may_skip_resume = true; - dev->power.must_resume = false; + dev->power.must_resume = !dev_pm_test_driver_flags(dev, DPM_FLAG_MAY_SKIP_RESUME);
dpm_watchdog_set(&wd, dev); device_lock(dev);
From: Matthias Kaehlcke mka@chromium.org
commit 70ee251ded6ba24c15537f4abb8a318e233d0d1a upstream.
adc_tm5_register_tzd() registers the thermal zone sensors for all channels of the thermal monitor. If the registration of one channel fails the function skips the processing of the remaining channels and returns an error, which results in _probe() being aborted.
One of the reasons the registration could fail is that none of the thermal zones is using the channel/sensor, which hardly is a critical error (if it is an error at all). If this case is detected emit a warning and continue with processing the remaining channels.
Fixes: ca66dca5eda6 ("thermal: qcom: add support for adc-tm5 PMIC thermal monitor") Signed-off-by: Matthias Kaehlcke mka@chromium.org Reported-by: Stephen Boyd swboyd@chromium.org Reviewed-by: Stephen Boyd swboyd@chromium.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Link: https://lore.kernel.org/r/20210823134726.1.I1dd23ddf77e5b3568625d80d6827653a... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/thermal/qcom/qcom-spmi-adc-tm5.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/thermal/qcom/qcom-spmi-adc-tm5.c +++ b/drivers/thermal/qcom/qcom-spmi-adc-tm5.c @@ -359,6 +359,12 @@ static int adc_tm5_register_tzd(struct a &adc_tm->channels[i], &adc_tm5_ops); if (IS_ERR(tzd)) { + if (PTR_ERR(tzd) == -ENODEV) { + dev_warn(adc_tm->dev, "thermal sensor on channel %d is not used\n", + adc_tm->channels[i].channel); + continue; + } + dev_err(adc_tm->dev, "Error registering TZ zone for channel %d: %ld\n", adc_tm->channels[i].channel, PTR_ERR(tzd)); return PTR_ERR(tzd);
From: Jeff Layton jlayton@kernel.org
commit b4002173b7989588b6feaefc42edaf011b596782 upstream.
The first thing metric_delayed_work does is check mdsc->stopping, and then return immediately if it's set. That's good since we would have already torn down the metric structures at this point, otherwise, but there is no locking around mdsc->stopping.
It's possible that the ceph_metric_destroy call could race with the delayed_work, in which case we could end up with the delayed_work accessing destroyed percpu variables.
At this point in the mdsc teardown, the "stopping" flag has already been set, so there's no benefit to flushing the work. Move the work cancellation in ceph_metric_destroy ahead of the percpu variable destruction, and eliminate the flush_delayed_work call in ceph_mdsc_destroy.
Fixes: 18f473b384a6 ("ceph: periodically send perf metrics to MDSes") Signed-off-by: Jeff Layton jlayton@kernel.org Reviewed-by: Xiubo Li xiubli@redhat.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ceph/mds_client.c | 1 - fs/ceph/metric.c | 4 ++-- 2 files changed, 2 insertions(+), 3 deletions(-)
--- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -4912,7 +4912,6 @@ void ceph_mdsc_destroy(struct ceph_fs_cl
ceph_metric_destroy(&mdsc->metric);
- flush_delayed_work(&mdsc->metric.delayed_work); fsc->mdsc = NULL; kfree(mdsc); dout("mdsc_destroy %p done\n", mdsc); --- a/fs/ceph/metric.c +++ b/fs/ceph/metric.c @@ -302,6 +302,8 @@ void ceph_metric_destroy(struct ceph_cli if (!m) return;
+ cancel_delayed_work_sync(&m->delayed_work); + percpu_counter_destroy(&m->total_inodes); percpu_counter_destroy(&m->opened_inodes); percpu_counter_destroy(&m->i_caps_mis); @@ -309,8 +311,6 @@ void ceph_metric_destroy(struct ceph_cli percpu_counter_destroy(&m->d_lease_mis); percpu_counter_destroy(&m->d_lease_hit);
- cancel_delayed_work_sync(&m->delayed_work); - ceph_put_mds_session(m->session); }
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
commit 3d2813fb17e5fd0d73c1d1442ca0192bde4af10e upstream.
This fixes a race condition: After pwmchip_add() is called there might already be a consumer and then modifying the hardware behind the consumer's back is bad. So set the default before.
(Side-note: I don't know what this register setting actually does, if this modifies the polarity there is an inconsistency because the inversed polarity isn't considered if the PWM is already running during .probe().)
Fixes: acfd92fdfb93 ("pwm: lpc32xx: Set PWM_PIN_LEVEL bit to default value") Cc: Sylvain Lemieux slemieux@tycoint.com Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pwm/pwm-lpc32xx.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/drivers/pwm/pwm-lpc32xx.c +++ b/drivers/pwm/pwm-lpc32xx.c @@ -117,17 +117,17 @@ static int lpc32xx_pwm_probe(struct plat lpc32xx->chip.ops = &lpc32xx_pwm_ops; lpc32xx->chip.npwm = 1;
+ /* If PWM is disabled, configure the output to the default value */ + val = readl(lpc32xx->base + (lpc32xx->chip.pwms[0].hwpwm << 2)); + val &= ~PWM_PIN_LEVEL; + writel(val, lpc32xx->base + (lpc32xx->chip.pwms[0].hwpwm << 2)); + ret = pwmchip_add(&lpc32xx->chip); if (ret < 0) { dev_err(&pdev->dev, "failed to add PWM chip, error %d\n", ret); return ret; }
- /* When PWM is disable, configure the output to the default value */ - val = readl(lpc32xx->base + (lpc32xx->chip.pwms[0].hwpwm << 2)); - val &= ~PWM_PIN_LEVEL; - writel(val, lpc32xx->base + (lpc32xx->chip.pwms[0].hwpwm << 2)); - platform_set_drvdata(pdev, lpc32xx);
return 0;
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
commit 020162d6f49f2963062229814a56a89c86cbeaa8 upstream.
This fixes a race condition: After pwmchip_add() is called there might already be a consumer and then modifying the hardware behind the consumer's back is bad. So reset before calling pwmchip_add().
Note that reseting the hardware isn't the right thing to do if the PWM is already running as it might e.g. disable (or even enable) a backlight that is supposed to be on (or off).
Fixes: 4dce82c1e840 ("pwm: add pwm-mxs support") Cc: Sascha Hauer s.hauer@pengutronix.de Cc: Shawn Guo shawnguo@kernel.org Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pwm/pwm-mxs.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-)
--- a/drivers/pwm/pwm-mxs.c +++ b/drivers/pwm/pwm-mxs.c @@ -145,6 +145,11 @@ static int mxs_pwm_probe(struct platform return ret; }
+ /* FIXME: Only do this if the PWM isn't already running */ + ret = stmp_reset_block(mxs->base); + if (ret) + return dev_err_probe(&pdev->dev, ret, "failed to reset PWM\n"); + ret = pwmchip_add(&mxs->chip); if (ret < 0) { dev_err(&pdev->dev, "failed to add pwm chip %d\n", ret); @@ -153,15 +158,7 @@ static int mxs_pwm_probe(struct platform
platform_set_drvdata(pdev, mxs);
- ret = stmp_reset_block(mxs->base); - if (ret) - goto pwm_remove; - return 0; - -pwm_remove: - pwmchip_remove(&mxs->chip); - return ret; }
static int mxs_pwm_remove(struct platform_device *pdev)
From: Dave Jiang dave.jiang@intel.com
[ Upstream commit 0dcfe41e9a4ca759ccc87a48e3bb9cc3b08ff1e8 ]
The previous state cleanup patch only performed wq state cleanups. This does not go far enough as when device is disabled or reset, the state for groups and engines must also be cleaned up. Add additional state cleanup beyond wq cleanup. Tie those cleanups directly to device disable and reset, and wq disable and reset.
Fixes: da32b28c95a7 ("dmaengine: idxd: cleanup workqueue config after disabling") Signed-off-by: Dave Jiang dave.jiang@intel.com Link: https://lore.kernel.org/r/162285154108.2096632.5572805472362321307.stgit@dji... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/idxd/device.c | 88 +++++++++++++++++++++++++++++---------- drivers/dma/idxd/idxd.h | 6 +-- drivers/dma/idxd/irq.c | 4 +- drivers/dma/idxd/sysfs.c | 22 +++------- 4 files changed, 74 insertions(+), 46 deletions(-)
diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c index 420b93fe5feb..928c2c8817f0 100644 --- a/drivers/dma/idxd/device.c +++ b/drivers/dma/idxd/device.c @@ -15,6 +15,8 @@
static void idxd_cmd_exec(struct idxd_device *idxd, int cmd_code, u32 operand, u32 *status); +static void idxd_device_wqs_clear_state(struct idxd_device *idxd); +static void idxd_wq_disable_cleanup(struct idxd_wq *wq);
/* Interrupt control bits */ void idxd_mask_msix_vector(struct idxd_device *idxd, int vec_id) @@ -234,7 +236,7 @@ int idxd_wq_enable(struct idxd_wq *wq) return 0; }
-int idxd_wq_disable(struct idxd_wq *wq) +int idxd_wq_disable(struct idxd_wq *wq, bool reset_config) { struct idxd_device *idxd = wq->idxd; struct device *dev = &idxd->pdev->dev; @@ -255,6 +257,8 @@ int idxd_wq_disable(struct idxd_wq *wq) return -ENXIO; }
+ if (reset_config) + idxd_wq_disable_cleanup(wq); wq->state = IDXD_WQ_DISABLED; dev_dbg(dev, "WQ %d disabled\n", wq->id); return 0; @@ -289,6 +293,7 @@ void idxd_wq_reset(struct idxd_wq *wq)
operand = BIT(wq->id % 16) | ((wq->id / 16) << 16); idxd_cmd_exec(idxd, IDXD_CMD_RESET_WQ, operand, NULL); + idxd_wq_disable_cleanup(wq); wq->state = IDXD_WQ_DISABLED; }
@@ -337,7 +342,7 @@ int idxd_wq_set_pasid(struct idxd_wq *wq, int pasid) unsigned int offset; unsigned long flags;
- rc = idxd_wq_disable(wq); + rc = idxd_wq_disable(wq, false); if (rc < 0) return rc;
@@ -364,7 +369,7 @@ int idxd_wq_disable_pasid(struct idxd_wq *wq) unsigned int offset; unsigned long flags;
- rc = idxd_wq_disable(wq); + rc = idxd_wq_disable(wq, false); if (rc < 0) return rc;
@@ -383,11 +388,11 @@ int idxd_wq_disable_pasid(struct idxd_wq *wq) return 0; }
-void idxd_wq_disable_cleanup(struct idxd_wq *wq) +static void idxd_wq_disable_cleanup(struct idxd_wq *wq) { struct idxd_device *idxd = wq->idxd;
- lockdep_assert_held(&idxd->dev_lock); + lockdep_assert_held(&wq->wq_lock); memset(wq->wqcfg, 0, idxd->wqcfg_size); wq->type = IDXD_WQT_NONE; wq->size = 0; @@ -548,22 +553,6 @@ int idxd_device_enable(struct idxd_device *idxd) return 0; }
-void idxd_device_wqs_clear_state(struct idxd_device *idxd) -{ - int i; - - lockdep_assert_held(&idxd->dev_lock); - - for (i = 0; i < idxd->max_wqs; i++) { - struct idxd_wq *wq = idxd->wqs[i]; - - if (wq->state == IDXD_WQ_ENABLED) { - idxd_wq_disable_cleanup(wq); - wq->state = IDXD_WQ_DISABLED; - } - } -} - int idxd_device_disable(struct idxd_device *idxd) { struct device *dev = &idxd->pdev->dev; @@ -585,7 +574,7 @@ int idxd_device_disable(struct idxd_device *idxd) }
spin_lock_irqsave(&idxd->dev_lock, flags); - idxd_device_wqs_clear_state(idxd); + idxd_device_clear_state(idxd); idxd->state = IDXD_DEV_CONF_READY; spin_unlock_irqrestore(&idxd->dev_lock, flags); return 0; @@ -597,7 +586,7 @@ void idxd_device_reset(struct idxd_device *idxd)
idxd_cmd_exec(idxd, IDXD_CMD_RESET_DEVICE, 0, NULL); spin_lock_irqsave(&idxd->dev_lock, flags); - idxd_device_wqs_clear_state(idxd); + idxd_device_clear_state(idxd); idxd->state = IDXD_DEV_CONF_READY; spin_unlock_irqrestore(&idxd->dev_lock, flags); } @@ -685,6 +674,59 @@ int idxd_device_release_int_handle(struct idxd_device *idxd, int handle, }
/* Device configuration bits */ +static void idxd_engines_clear_state(struct idxd_device *idxd) +{ + struct idxd_engine *engine; + int i; + + lockdep_assert_held(&idxd->dev_lock); + for (i = 0; i < idxd->max_engines; i++) { + engine = idxd->engines[i]; + engine->group = NULL; + } +} + +static void idxd_groups_clear_state(struct idxd_device *idxd) +{ + struct idxd_group *group; + int i; + + lockdep_assert_held(&idxd->dev_lock); + for (i = 0; i < idxd->max_groups; i++) { + group = idxd->groups[i]; + memset(&group->grpcfg, 0, sizeof(group->grpcfg)); + group->num_engines = 0; + group->num_wqs = 0; + group->use_token_limit = false; + group->tokens_allowed = 0; + group->tokens_reserved = 0; + group->tc_a = -1; + group->tc_b = -1; + } +} + +static void idxd_device_wqs_clear_state(struct idxd_device *idxd) +{ + int i; + + lockdep_assert_held(&idxd->dev_lock); + for (i = 0; i < idxd->max_wqs; i++) { + struct idxd_wq *wq = idxd->wqs[i]; + + if (wq->state == IDXD_WQ_ENABLED) { + idxd_wq_disable_cleanup(wq); + wq->state = IDXD_WQ_DISABLED; + } + } +} + +void idxd_device_clear_state(struct idxd_device *idxd) +{ + idxd_groups_clear_state(idxd); + idxd_engines_clear_state(idxd); + idxd_device_wqs_clear_state(idxd); +} + void idxd_msix_perm_setup(struct idxd_device *idxd) { union msix_perm mperm; diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h index fc708be7ad9a..0f27374eae4b 100644 --- a/drivers/dma/idxd/idxd.h +++ b/drivers/dma/idxd/idxd.h @@ -428,9 +428,8 @@ int idxd_device_init_reset(struct idxd_device *idxd); int idxd_device_enable(struct idxd_device *idxd); int idxd_device_disable(struct idxd_device *idxd); void idxd_device_reset(struct idxd_device *idxd); -void idxd_device_cleanup(struct idxd_device *idxd); +void idxd_device_clear_state(struct idxd_device *idxd); int idxd_device_config(struct idxd_device *idxd); -void idxd_device_wqs_clear_state(struct idxd_device *idxd); void idxd_device_drain_pasid(struct idxd_device *idxd, int pasid); int idxd_device_load_config(struct idxd_device *idxd); int idxd_device_request_int_handle(struct idxd_device *idxd, int idx, int *handle, @@ -443,12 +442,11 @@ void idxd_wqs_unmap_portal(struct idxd_device *idxd); int idxd_wq_alloc_resources(struct idxd_wq *wq); void idxd_wq_free_resources(struct idxd_wq *wq); int idxd_wq_enable(struct idxd_wq *wq); -int idxd_wq_disable(struct idxd_wq *wq); +int idxd_wq_disable(struct idxd_wq *wq, bool reset_config); void idxd_wq_drain(struct idxd_wq *wq); void idxd_wq_reset(struct idxd_wq *wq); int idxd_wq_map_portal(struct idxd_wq *wq); void idxd_wq_unmap_portal(struct idxd_wq *wq); -void idxd_wq_disable_cleanup(struct idxd_wq *wq); int idxd_wq_set_pasid(struct idxd_wq *wq, int pasid); int idxd_wq_disable_pasid(struct idxd_wq *wq); void idxd_wq_quiesce(struct idxd_wq *wq); diff --git a/drivers/dma/idxd/irq.c b/drivers/dma/idxd/irq.c index 4e3a7198c0ca..2924819ca8f3 100644 --- a/drivers/dma/idxd/irq.c +++ b/drivers/dma/idxd/irq.c @@ -59,7 +59,7 @@ static void idxd_device_reinit(struct work_struct *work) return;
out: - idxd_device_wqs_clear_state(idxd); + idxd_device_clear_state(idxd); }
static void idxd_device_fault_work(struct work_struct *work) @@ -192,7 +192,7 @@ static int process_misc_interrupts(struct idxd_device *idxd, u32 cause) spin_lock_bh(&idxd->dev_lock); idxd_wqs_quiesce(idxd); idxd_wqs_unmap_portal(idxd); - idxd_device_wqs_clear_state(idxd); + idxd_device_clear_state(idxd); dev_err(&idxd->pdev->dev, "idxd halted, need %s.\n", gensts.reset_type == IDXD_DEVICE_RESET_FLR ? diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c index bb4df63906a7..528cde54724b 100644 --- a/drivers/dma/idxd/sysfs.c +++ b/drivers/dma/idxd/sysfs.c @@ -129,7 +129,7 @@ static int enable_wq(struct idxd_wq *wq) rc = idxd_wq_map_portal(wq); if (rc < 0) { dev_warn(dev, "wq portal mapping failed: %d\n", rc); - rc = idxd_wq_disable(wq); + rc = idxd_wq_disable(wq, false); if (rc < 0) dev_warn(dev, "IDXD wq disable failed\n"); mutex_unlock(&wq->wq_lock); @@ -262,8 +262,6 @@ static void disable_wq(struct idxd_wq *wq)
static int idxd_config_bus_remove(struct device *dev) { - int rc; - dev_dbg(dev, "%s called for %s\n", __func__, dev_name(dev));
/* disable workqueue here */ @@ -288,22 +286,12 @@ static int idxd_config_bus_remove(struct device *dev) }
idxd_unregister_dma_device(idxd); - rc = idxd_device_disable(idxd); - if (test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags)) { - for (i = 0; i < idxd->max_wqs; i++) { - struct idxd_wq *wq = idxd->wqs[i]; - - mutex_lock(&wq->wq_lock); - idxd_wq_disable_cleanup(wq); - mutex_unlock(&wq->wq_lock); - } - } + idxd_device_disable(idxd); + if (test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags)) + idxd_device_reset(idxd); module_put(THIS_MODULE); - if (rc < 0) - dev_warn(dev, "Device disable failed\n"); - else - dev_info(dev, "Device %s disabled\n", dev_name(dev));
+ dev_info(dev, "Device %s disabled\n", dev_name(dev)); }
return 0;
From: Dave Jiang dave.jiang@intel.com
[ Upstream commit 53499d1fc11267e078557fc68ce943c1eb3b7a37 ]
The cached command status is only set when the write back status is is passed in. Move the variable set outside of the check so it is always set.
Fixes: 0d5c10b4c84d ("dmaengine: idxd: add work queue drain support") Reported-by: Ramesh Thomas ramesh.thomas@intel.com Signed-off-by: Dave Jiang dave.jiang@intel.com Link: https://lore.kernel.org/r/162274329740.1822314.3443875665504707588.stgit@dji... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/idxd/device.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c index 928c2c8817f0..c8cf1de72176 100644 --- a/drivers/dma/idxd/device.c +++ b/drivers/dma/idxd/device.c @@ -486,6 +486,7 @@ static void idxd_cmd_exec(struct idxd_device *idxd, int cmd_code, u32 operand, union idxd_command_reg cmd; DECLARE_COMPLETION_ONSTACK(done); unsigned long flags; + u32 stat;
if (idxd_device_is_halted(idxd)) { dev_warn(&idxd->pdev->dev, "Device is HALTED!\n"); @@ -518,11 +519,11 @@ static void idxd_cmd_exec(struct idxd_device *idxd, int cmd_code, u32 operand, */ spin_unlock_irqrestore(&idxd->cmd_lock, flags); wait_for_completion(&done); + stat = ioread32(idxd->reg_base + IDXD_CMDSTS_OFFSET); spin_lock_irqsave(&idxd->cmd_lock, flags); - if (status) { - *status = ioread32(idxd->reg_base + IDXD_CMDSTS_OFFSET); - idxd->cmd_status = *status & GENMASK(7, 0); - } + if (status) + *status = stat; + idxd->cmd_status = stat & GENMASK(7, 0);
__clear_bit(IDXD_FLAG_CMD_RUNNING, &idxd->flags); /* Wake up other pending commands */
From: Dave Jiang dave.jiang@intel.com
[ Upstream commit 673d812d30be67942762bb9e8548abb26a3ba4a7 ]
The sbitmap wait and allocate routine checks the index that is returned from sbitmap_queue_get(). It should be idxd >= 0 as 0 is also a valid index. This fixes issue where submission path hangs when WQ size is 1.
Fixes: 0705107fcc80 ("dmaengine: idxd: move submission to sbitmap_queue") Signed-off-by: Dave Jiang dave.jiang@intel.com Link: https://lore.kernel.org/r/162697645067.3478714.506720687816951762.stgit@djia... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/idxd/submit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/idxd/submit.c b/drivers/dma/idxd/submit.c index 36c9c1a89b7e..196d6cf11965 100644 --- a/drivers/dma/idxd/submit.c +++ b/drivers/dma/idxd/submit.c @@ -67,7 +67,7 @@ struct idxd_desc *idxd_alloc_desc(struct idxd_wq *wq, enum idxd_op_type optype) if (signal_pending_state(TASK_INTERRUPTIBLE, current)) break; idx = sbitmap_queue_get(sbq, &cpu); - if (idx > 0) + if (idx >= 0) break; schedule(); }
From: Dave Jiang dave.jiang@intel.com
[ Upstream commit b60bb6e2bfc192091b8f792781b83b5e0f9324f6 ]
Coverity static analysis of linux-next found issue.
The check (status == IDXD_COMP_DESC_ABORT) is always false since status was previously masked with 0x7f and IDXD_COMP_DESC_ABORT is 0xff.
Fixes: 6b4b87f2c31a ("dmaengine: idxd: fix submission race window") Reported-by: Colin Ian King colin.king@canonical.com Signed-off-by: Dave Jiang dave.jiang@intel.com Link: https://lore.kernel.org/r/162698465160.3560828.18173186265683415384.stgit@dj... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/idxd/irq.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/dma/idxd/irq.c b/drivers/dma/idxd/irq.c index 2924819ca8f3..ba839d3569cd 100644 --- a/drivers/dma/idxd/irq.c +++ b/drivers/dma/idxd/irq.c @@ -269,7 +269,11 @@ static int irq_process_pending_llist(struct idxd_irq_entry *irq_entry, u8 status = desc->completion->status & DSA_COMP_STATUS_MASK;
if (status) { - if (unlikely(status == IDXD_COMP_DESC_ABORT)) { + /* + * Check against the original status as ABORT is software defined + * and 0xff, which DSA_COMP_STATUS_MASK can mask out. + */ + if (unlikely(desc->completion->status == IDXD_COMP_DESC_ABORT)) { complete_desc(desc, IDXD_COMPLETE_ABORT); (*processed)++; continue; @@ -333,7 +337,11 @@ static int irq_process_work_list(struct idxd_irq_entry *irq_entry, list_for_each_entry(desc, &flist, list) { u8 status = desc->completion->status & DSA_COMP_STATUS_MASK;
- if (unlikely(status == IDXD_COMP_DESC_ABORT)) { + /* + * Check against the original status as ABORT is software defined + * and 0xff, which DSA_COMP_STATUS_MASK can mask out. + */ + if (unlikely(desc->completion->status == IDXD_COMP_DESC_ABORT)) { complete_desc(desc, IDXD_COMPLETE_ABORT); continue; }
From: Dave Jiang dave.jiang@intel.com
[ Upstream commit bd2f4ae5e019efcfadd6b491204fd60adf14f4a3 ]
The block on fault flag is not cleared when we disable or reset wq. This causes it to remain set if the user does not clear it on the next configuration load. Add clear of flag in dxd_wq_disable_cleanup() routine.
Fixes: da32b28c95a7 ("dmaengine: idxd: cleanup workqueue config after disabling") Signed-off-by: Dave Jiang dave.jiang@intel.com Link: https://lore.kernel.org/r/162803023553.3086015.8158952172068868803.stgit@dji... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/idxd/device.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c index c8cf1de72176..9c6760ae5aef 100644 --- a/drivers/dma/idxd/device.c +++ b/drivers/dma/idxd/device.c @@ -401,6 +401,7 @@ static void idxd_wq_disable_cleanup(struct idxd_wq *wq) wq->priority = 0; wq->ats_dis = 0; clear_bit(WQ_FLAG_DEDICATED, &wq->flags); + clear_bit(WQ_FLAG_BLOCK_ON_FAULT, &wq->flags); memset(wq->name, 0, WQ_NAME_SIZE); }
From: Gwendal Grignou gwendal@chromium.org
[ Upstream commit d453ceb6549af8798913de6a20444cb7200fdb69 ]
Add trace event to report samples and their timestamp coming from the EC. It allows to check if the timestamps are correct and the filter is working correctly without introducing too much latency.
To enable these events:
cd /sys/kernel/debug/tracing/ echo 1 > events/cros_ec/enable echo 0 > events/cros_ec/cros_ec_request_start/enable echo 0 > events/cros_ec/cros_ec_request_done/enable echo 1 > tracing_on cat trace_pipe Observe event flowing: irq/105-chromeo-95 [000] .... 613.659758: cros_ec_sensorhub_timestamp: ... irq/105-chromeo-95 [000] .... 613.665219: cros_ec_sensorhub_filter: dx: ...
Signed-off-by: Gwendal Grignou gwendal@chromium.org Signed-off-by: Enric Balletbo i Serra enric.balletbo@collabora.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/chrome/Makefile | 2 +- .../platform/chrome/cros_ec_sensorhub_ring.c | 14 +++ drivers/platform/chrome/cros_ec_trace.h | 94 +++++++++++++++++++ 3 files changed, 109 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/chrome/Makefile b/drivers/platform/chrome/Makefile index 41baccba033f..f901d2e43166 100644 --- a/drivers/platform/chrome/Makefile +++ b/drivers/platform/chrome/Makefile @@ -20,7 +20,7 @@ obj-$(CONFIG_CROS_EC_CHARDEV) += cros_ec_chardev.o obj-$(CONFIG_CROS_EC_LIGHTBAR) += cros_ec_lightbar.o obj-$(CONFIG_CROS_EC_VBC) += cros_ec_vbc.o obj-$(CONFIG_CROS_EC_DEBUGFS) += cros_ec_debugfs.o -cros-ec-sensorhub-objs := cros_ec_sensorhub.o cros_ec_sensorhub_ring.o +cros-ec-sensorhub-objs := cros_ec_sensorhub.o cros_ec_sensorhub_ring.o cros_ec_trace.o obj-$(CONFIG_CROS_EC_SENSORHUB) += cros-ec-sensorhub.o obj-$(CONFIG_CROS_EC_SYSFS) += cros_ec_sysfs.o obj-$(CONFIG_CROS_USBPD_LOGGER) += cros_usbpd_logger.o diff --git a/drivers/platform/chrome/cros_ec_sensorhub_ring.c b/drivers/platform/chrome/cros_ec_sensorhub_ring.c index 8921f24e83ba..98e37080f760 100644 --- a/drivers/platform/chrome/cros_ec_sensorhub_ring.c +++ b/drivers/platform/chrome/cros_ec_sensorhub_ring.c @@ -17,6 +17,8 @@ #include <linux/sort.h> #include <linux/slab.h>
+#include "cros_ec_trace.h" + /* Precision of fixed point for the m values from the filter */ #define M_PRECISION BIT(23)
@@ -291,6 +293,7 @@ cros_ec_sensor_ring_ts_filter_update(struct cros_ec_sensors_ts_filter_state state->median_m = 0; state->median_error = 0; } + trace_cros_ec_sensorhub_filter(state, dx, dy); }
/** @@ -427,6 +430,11 @@ cros_ec_sensor_ring_process_event(struct cros_ec_sensorhub *sensorhub, if (new_timestamp - *current_timestamp > 0) *current_timestamp = new_timestamp; } + trace_cros_ec_sensorhub_timestamp(in->timestamp, + fifo_info->timestamp, + fifo_timestamp, + *current_timestamp, + now); }
if (in->flags & MOTIONSENSE_SENSOR_FLAG_ODR) { @@ -460,6 +468,12 @@ cros_ec_sensor_ring_process_event(struct cros_ec_sensorhub *sensorhub,
/* Regular sample */ out->sensor_id = in->sensor_num; + trace_cros_ec_sensorhub_data(in->sensor_num, + fifo_info->timestamp, + fifo_timestamp, + *current_timestamp, + now); + if (*current_timestamp - now > 0) { /* * This fix is needed to overcome the timestamp filter putting diff --git a/drivers/platform/chrome/cros_ec_trace.h b/drivers/platform/chrome/cros_ec_trace.h index f744b21bc655..f50b9f9b8610 100644 --- a/drivers/platform/chrome/cros_ec_trace.h +++ b/drivers/platform/chrome/cros_ec_trace.h @@ -15,6 +15,7 @@ #include <linux/types.h> #include <linux/platform_data/cros_ec_commands.h> #include <linux/platform_data/cros_ec_proto.h> +#include <linux/platform_data/cros_ec_sensorhub.h>
#include <linux/tracepoint.h>
@@ -70,6 +71,99 @@ TRACE_EVENT(cros_ec_request_done, __entry->retval) );
+TRACE_EVENT(cros_ec_sensorhub_timestamp, + TP_PROTO(u32 ec_sample_timestamp, u32 ec_fifo_timestamp, s64 fifo_timestamp, + s64 current_timestamp, s64 current_time), + TP_ARGS(ec_sample_timestamp, ec_fifo_timestamp, fifo_timestamp, current_timestamp, + current_time), + TP_STRUCT__entry( + __field(u32, ec_sample_timestamp) + __field(u32, ec_fifo_timestamp) + __field(s64, fifo_timestamp) + __field(s64, current_timestamp) + __field(s64, current_time) + __field(s64, delta) + ), + TP_fast_assign( + __entry->ec_sample_timestamp = ec_sample_timestamp; + __entry->ec_fifo_timestamp = ec_fifo_timestamp; + __entry->fifo_timestamp = fifo_timestamp; + __entry->current_timestamp = current_timestamp; + __entry->current_time = current_time; + __entry->delta = current_timestamp - current_time; + ), + TP_printk("ec_ts: %12lld, ec_fifo_ts: %12lld, fifo_ts: %12lld, curr_ts: %12lld, curr_time: %12lld, delta %12lld", + __entry->ec_sample_timestamp, + __entry->ec_fifo_timestamp, + __entry->fifo_timestamp, + __entry->current_timestamp, + __entry->current_time, + __entry->delta + ) +); + +TRACE_EVENT(cros_ec_sensorhub_data, + TP_PROTO(u32 ec_sensor_num, u32 ec_fifo_timestamp, s64 fifo_timestamp, + s64 current_timestamp, s64 current_time), + TP_ARGS(ec_sensor_num, ec_fifo_timestamp, fifo_timestamp, current_timestamp, current_time), + TP_STRUCT__entry( + __field(u32, ec_sensor_num) + __field(u32, ec_fifo_timestamp) + __field(s64, fifo_timestamp) + __field(s64, current_timestamp) + __field(s64, current_time) + __field(s64, delta) + ), + TP_fast_assign( + __entry->ec_sensor_num = ec_sensor_num; + __entry->ec_fifo_timestamp = ec_fifo_timestamp; + __entry->fifo_timestamp = fifo_timestamp; + __entry->current_timestamp = current_timestamp; + __entry->current_time = current_time; + __entry->delta = current_timestamp - current_time; + ), + TP_printk("ec_num: %4d, ec_fifo_ts: %12lld, fifo_ts: %12lld, curr_ts: %12lld, curr_time: %12lld, delta %12lld", + __entry->ec_sensor_num, + __entry->ec_fifo_timestamp, + __entry->fifo_timestamp, + __entry->current_timestamp, + __entry->current_time, + __entry->delta + ) +); + +TRACE_EVENT(cros_ec_sensorhub_filter, + TP_PROTO(struct cros_ec_sensors_ts_filter_state *state, s64 dx, s64 dy), + TP_ARGS(state, dx, dy), + TP_STRUCT__entry( + __field(s64, dx) + __field(s64, dy) + __field(s64, median_m) + __field(s64, median_error) + __field(s64, history_len) + __field(s64, x) + __field(s64, y) + ), + TP_fast_assign( + __entry->dx = dx; + __entry->dy = dy; + __entry->median_m = state->median_m; + __entry->median_error = state->median_error; + __entry->history_len = state->history_len; + __entry->x = state->x_offset; + __entry->y = state->y_offset; + ), + TP_printk("dx: %12lld. dy: %12lld median_m: %12lld median_error: %12lld len: %d x: %12lld y: %12lld", + __entry->dx, + __entry->dy, + __entry->median_m, + __entry->median_error, + __entry->history_len, + __entry->x, + __entry->y + ) +); +
#endif /* _CROS_EC_TRACE_H_ */
From: Gwendal Grignou gwendal@chromium.org
[ Upstream commit 4665584888ad2175831c972c004115741ec799e9 ]
Fix printf format issues in new tracing events.
Fixes: 814318242687 ("platform/chrome: cros_ec_trace: Add fields to command traces")
Signed-off-by: Gwendal Grignou gwendal@chromium.org Link: https://lore.kernel.org/r/20210830180050.2077261-1-gwendal@chromium.org Signed-off-by: Benson Leung bleung@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/chrome/cros_ec_trace.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/platform/chrome/cros_ec_trace.h b/drivers/platform/chrome/cros_ec_trace.h index f50b9f9b8610..7e7cfc98657a 100644 --- a/drivers/platform/chrome/cros_ec_trace.h +++ b/drivers/platform/chrome/cros_ec_trace.h @@ -92,7 +92,7 @@ TRACE_EVENT(cros_ec_sensorhub_timestamp, __entry->current_time = current_time; __entry->delta = current_timestamp - current_time; ), - TP_printk("ec_ts: %12lld, ec_fifo_ts: %12lld, fifo_ts: %12lld, curr_ts: %12lld, curr_time: %12lld, delta %12lld", + TP_printk("ec_ts: %9u, ec_fifo_ts: %9u, fifo_ts: %12lld, curr_ts: %12lld, curr_time: %12lld, delta %12lld", __entry->ec_sample_timestamp, __entry->ec_fifo_timestamp, __entry->fifo_timestamp, @@ -122,7 +122,7 @@ TRACE_EVENT(cros_ec_sensorhub_data, __entry->current_time = current_time; __entry->delta = current_timestamp - current_time; ), - TP_printk("ec_num: %4d, ec_fifo_ts: %12lld, fifo_ts: %12lld, curr_ts: %12lld, curr_time: %12lld, delta %12lld", + TP_printk("ec_num: %4u, ec_fifo_ts: %9u, fifo_ts: %12lld, curr_ts: %12lld, curr_time: %12lld, delta %12lld", __entry->ec_sensor_num, __entry->ec_fifo_timestamp, __entry->fifo_timestamp, @@ -153,7 +153,7 @@ TRACE_EVENT(cros_ec_sensorhub_filter, __entry->x = state->x_offset; __entry->y = state->y_offset; ), - TP_printk("dx: %12lld. dy: %12lld median_m: %12lld median_error: %12lld len: %d x: %12lld y: %12lld", + TP_printk("dx: %12lld. dy: %12lld median_m: %12lld median_error: %12lld len: %lld x: %12lld y: %12lld", __entry->dx, __entry->dy, __entry->median_m,
From: Heiko Carstens hca@linux.ibm.com
[ Upstream commit 15256194eff64f9a774b33b7817ea663e352394a ]
Make the oklabel within the CHKSTG macro local. This makes sure that tools like objdump and the crash debugging tool still disassemble full functions where the macro has been used instead of stopping half way where such a global label is used and one has to guess how to disassemble the rest of such a function:
E.g.:
0000000000cb0270 <mcck_int_handler>: cb0270: b2 05 03 20 stck 800 ... cb0354: a7 74 00 97 jne cb0482 <oklabel270+0xe2>
0000000000cb0358 <oklabel243>: cb0358: c0 e0 00 22 4e 8f larl %r14,10fa076 <opcode+0x2558> ...
Fixes: d35925b34996 ("s390/mcck: move storage error checks to assembler") Reviewed-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/entry.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S index b9716a7e326d..4c9b967290ae 100644 --- a/arch/s390/kernel/entry.S +++ b/arch/s390/kernel/entry.S @@ -140,10 +140,10 @@ _LPP_OFFSET = __LC_LPP TSTMSK __LC_MCCK_CODE,(MCCK_CODE_STG_ERROR|MCCK_CODE_STG_KEY_ERROR) jnz \errlabel TSTMSK __LC_MCCK_CODE,MCCK_CODE_STG_DEGRAD - jz oklabel@ + jz .Loklabel@ TSTMSK __LC_MCCK_CODE,MCCK_CODE_STG_FAIL_ADDR jnz \errlabel -oklabel@: +.Loklabel@: .endm
#if IS_ENABLED(CONFIG_KVM)
From: NeilBrown neilb@suse.de
[ Upstream commit e38b3f20059426a0adbde014ff71071739ab5226 ]
alloc_pages_bulk_array() attempts to allocate at least one page based on the provided pages, and then opportunistically allocates more if that can be done without dropping the spinlock.
So if it returns fewer than requested, that could just mean that it needed to drop the lock. In that case, try again immediately.
Only pause for a time if no progress could be made.
Reported-and-tested-by: Mike Javorski mike.javorski@gmail.com Reported-and-tested-by: Lothar Paltins lopa@mailbox.org Fixes: f6e70aab9dfe ("SUNRPC: refresh rq_pages using a bulk page allocator") Signed-off-by: NeilBrown neilb@suse.de Acked-by: Mel Gorman mgorman@suse.com Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sunrpc/svc_xprt.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c index dbb41821b1b8..cd5a2b186f0d 100644 --- a/net/sunrpc/svc_xprt.c +++ b/net/sunrpc/svc_xprt.c @@ -662,7 +662,7 @@ static int svc_alloc_arg(struct svc_rqst *rqstp) { struct svc_serv *serv = rqstp->rq_server; struct xdr_buf *arg = &rqstp->rq_arg; - unsigned long pages, filled; + unsigned long pages, filled, ret;
pages = (serv->sv_max_mesg + 2 * PAGE_SIZE) >> PAGE_SHIFT; if (pages > RPCSVC_MAXPAGES) { @@ -672,11 +672,12 @@ static int svc_alloc_arg(struct svc_rqst *rqstp) pages = RPCSVC_MAXPAGES; }
- for (;;) { - filled = alloc_pages_bulk_array(GFP_KERNEL, pages, - rqstp->rq_pages); - if (filled == pages) - break; + for (filled = 0; filled < pages; filled = ret) { + ret = alloc_pages_bulk_array(GFP_KERNEL, pages, + rqstp->rq_pages); + if (ret > filled) + /* Made progress, don't sleep yet */ + continue;
set_current_state(TASK_INTERRUPTIBLE); if (signalled() || kthread_should_stop()) {
From: Geert Uytterhoeven geert@linux-m68k.org
[ Upstream commit 8ba739ede49dec361ddcb70afe24986b4b8cfe17 ]
RATIONAL_KUNIT_TEST selects RATIONAL, thus enabling an optional feature the user may not want to have enabled. Fix this by making the test depend on RATIONAL instead.
Link: https://lkml.kernel.org/r/20210706100945.3803694-3-geert@linux-m68k.org Fixes: b6c75c4afceb8bc0 ("lib/math/rational: add Kunit test cases") Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org Cc: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: Brendan Higgins brendanhiggins@google.com Cc: Colin Ian King colin.king@canonical.com Cc: Trent Piepho tpiepho@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/Kconfig.debug | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 5ddd575159fb..021bc9cd43da 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -2460,8 +2460,7 @@ config SLUB_KUNIT_TEST
config RATIONAL_KUNIT_TEST tristate "KUnit test for rational.c" if !KUNIT_ALL_TESTS - depends on KUNIT - select RATIONAL + depends on KUNIT && RATIONAL default KUNIT_ALL_TESTS help This builds the rational math unit test.
From: Rasmus Villemoes linux@rasmusvillemoes.dk
[ Upstream commit b234ed6d629420827e2839c8c8935be85a0867fd ]
Currently, usermodehelper is enabled right before PID1 starts going through the initcalls. However, any call of a usermodehelper from a pure_, core_, postcore_, arch_, subsys_ or fs_ initcall is futile, as there is no filesystem contents yet.
Up until commit e7cb072eb988 ("init/initramfs.c: do unpacking asynchronously"), such calls, whether via some request_module(), a legacy uevent "/sbin/hotplug" notification or something else, would just fail silently with (presumably) -ENOENT from kernel_execve(). However, that commit introduced the wait_for_initramfs() synchronization hook which must be called from the usermodehelper exec path right before the kernel_execve, in order that request_module() et al done from *after* rootfs_initcall() time (i.e. device_ and late_ initcalls) would continue to find a populated initramfs as they used to.
Any call of wait_for_initramfs() done before the unpacking has been scheduled (i.e. before rootfs_initcall time) must just return immediately [and let the caller find an empty file system] in order not to deadlock the machine. I mistakenly thought, and my limited testing confirmed, that there were no such calls, so I added a pr_warn_once() in wait_for_initramfs(). It turns out that one can indeed hit request_module() as well as kobject_uevent_env() during those early init calls, leading to a user-visible warning in the kernel log emitted consistently for certain configurations.
We could just remove the pr_warn_once(), but I think it's better to postpone enabling the usermodehelper framework until there is at least some chance of finding the executable. That is also a little more efficient in that a lot of work done in umh.c will be elided. However, it does change the error seen by those early callers from -ENOENT to -EBUSY, so there is a risk of a regression if any caller care about the exact error value.
Link: https://lkml.kernel.org/r/20210728134638.329060-1-linux@rasmusvillemoes.dk Fixes: e7cb072eb988 ("init/initramfs.c: do unpacking asynchronously") Signed-off-by: Rasmus Villemoes linux@rasmusvillemoes.dk Reported-by: Alexander Egorenkov egorenar@linux.ibm.com Reported-by: Bruno Goncalves bgoncalv@redhat.com Reported-by: Heiner Kallweit hkallweit1@gmail.com Cc: Luis Chamberlain mcgrof@kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- init/initramfs.c | 2 ++ init/main.c | 1 - init/noinitramfs.c | 2 ++ 3 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/init/initramfs.c b/init/initramfs.c index af27abc59643..a842c0544745 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -15,6 +15,7 @@ #include <linux/mm.h> #include <linux/namei.h> #include <linux/init_syscalls.h> +#include <linux/umh.h>
static ssize_t __init xwrite(struct file *file, const char *p, size_t count, loff_t *pos) @@ -727,6 +728,7 @@ static int __init populate_rootfs(void) { initramfs_cookie = async_schedule_domain(do_populate_rootfs, NULL, &initramfs_domain); + usermodehelper_enable(); if (!initramfs_async) wait_for_initramfs(); return 0; diff --git a/init/main.c b/init/main.c index 8d97aba78c3a..90733a916791 100644 --- a/init/main.c +++ b/init/main.c @@ -1392,7 +1392,6 @@ static void __init do_basic_setup(void) driver_init(); init_irq_proc(); do_ctors(); - usermodehelper_enable(); do_initcalls(); }
diff --git a/init/noinitramfs.c b/init/noinitramfs.c index 3d62b07f3bb9..d1d26b93d25c 100644 --- a/init/noinitramfs.c +++ b/init/noinitramfs.c @@ -10,6 +10,7 @@ #include <linux/kdev_t.h> #include <linux/syscalls.h> #include <linux/init_syscalls.h> +#include <linux/umh.h>
/* * Create a simple rootfs that is similar to the default initramfs @@ -18,6 +19,7 @@ static int __init default_rootfs(void) { int err;
+ usermodehelper_enable(); err = init_mkdir("/dev", 0755); if (err < 0) goto out;
From: Lukas Bulwahn lukas.bulwahn@gmail.com
[ Upstream commit 6fe26259b4884b657cbc233fb9cdade9d704976e ]
Commit 05a4a9527931 ("kernel/watchdog: split up config options") adds a new config HARDLOCKUP_DETECTOR, which selects the non-existing config HARDLOCKUP_DETECTOR_ARCH.
Hence, ./scripts/checkkconfigsymbols.py warns:
HARDLOCKUP_DETECTOR_ARCH Referencing files: lib/Kconfig.debug
Simply drop selecting the non-existing HARDLOCKUP_DETECTOR_ARCH.
Link: https://lkml.kernel.org/r/20210806115618.22088-1-lukas.bulwahn@gmail.com Fixes: 05a4a9527931 ("kernel/watchdog: split up config options") Signed-off-by: Lukas Bulwahn lukas.bulwahn@gmail.com Cc: Nicholas Piggin npiggin@gmail.com Cc: Masahiro Yamada masahiroy@kernel.org Cc: Babu Moger babu.moger@oracle.com Cc: Don Zickus dzickus@redhat.com Cc: Randy Dunlap rdunlap@infradead.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/Kconfig.debug | 1 - 1 file changed, 1 deletion(-)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 021bc9cd43da..ffd22e499997 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1062,7 +1062,6 @@ config HARDLOCKUP_DETECTOR depends on HAVE_HARDLOCKUP_DETECTOR_PERF || HAVE_HARDLOCKUP_DETECTOR_ARCH select LOCKUP_DETECTOR select HARDLOCKUP_DETECTOR_PERF if HAVE_HARDLOCKUP_DETECTOR_PERF - select HARDLOCKUP_DETECTOR_ARCH if HAVE_HARDLOCKUP_DETECTOR_ARCH help Say Y here to enable the kernel to act as a watchdog to detect hard lockups.
From: Masami Hiramatsu mhiramat@kernel.org
[ Upstream commit 32ba9f0fb027cc43074e3ea26fcf831adeee8e03 ]
Since tracing_on indicates only "1" (default) or "0", ftrace2bconf.sh only need to check the value is "0".
Link: https://lkml.kernel.org/r/163077087144.222577.6888011847727968737.stgit@devn...
Fixes: 55ed4560774d ("tools/bootconfig: Add tracing_on support to helper scripts") Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/bootconfig/scripts/ftrace2bconf.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/bootconfig/scripts/ftrace2bconf.sh b/tools/bootconfig/scripts/ftrace2bconf.sh index a0c3bcc6da4f..fb201d5afe2c 100755 --- a/tools/bootconfig/scripts/ftrace2bconf.sh +++ b/tools/bootconfig/scripts/ftrace2bconf.sh @@ -222,8 +222,8 @@ instance_options() { # [instance-name] emit_kv $PREFIX.cpumask = $val fi val=`cat $INSTANCE/tracing_on` - if [ `echo $val | sed -e s/f//g`x != x ]; then - emit_kv $PREFIX.tracing_on = $val + if [ "$val" = "0" ]; then + emit_kv $PREFIX.tracing_on = 0 fi
val=
From: Masami Hiramatsu mhiramat@kernel.org
[ Upstream commit cfd799837dbc48499abb05d1891b3d9992354d3a ]
Since the commit e5efaeb8a8f5 ("bootconfig: Support mixing a value and subkeys under a key") allows to co-exist a value node and key nodes under a node, xbc_node_for_each_child() is not only returning key node but also a value node. In the boot-time tracing using xbc_node_for_each_child() to iterate the events, groups and instances, but those must be key nodes. Thus it must use xbc_node_for_each_subkey().
Link: https://lkml.kernel.org/r/163112988361.74896.2267026262061819145.stgit@devno...
Fixes: e5efaeb8a8f5 ("bootconfig: Support mixing a value and subkeys under a key") Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace_boot.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/trace/trace_boot.c b/kernel/trace/trace_boot.c index d713714cba67..4bd8f94a56c6 100644 --- a/kernel/trace/trace_boot.c +++ b/kernel/trace/trace_boot.c @@ -235,14 +235,14 @@ trace_boot_init_events(struct trace_array *tr, struct xbc_node *node) if (!node) return; /* per-event key starts with "event.GROUP.EVENT" */ - xbc_node_for_each_child(node, gnode) { + xbc_node_for_each_subkey(node, gnode) { data = xbc_node_get_data(gnode); if (!strcmp(data, "enable")) { enable_all = true; continue; } enable = false; - xbc_node_for_each_child(gnode, enode) { + xbc_node_for_each_subkey(gnode, enode) { data = xbc_node_get_data(enode); if (!strcmp(data, "enable")) { enable = true; @@ -338,7 +338,7 @@ trace_boot_init_instances(struct xbc_node *node) if (!node) return;
- xbc_node_for_each_child(node, inode) { + xbc_node_for_each_subkey(node, inode) { p = xbc_node_get_data(inode); if (!p || *p == '\0') continue;
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit fb83610762dd5927212aa62a468dd3b756b57a88 ]
There are two pairs of declarations for thermal_cooling_device_register() and thermal_of_cooling_device_register(), and only one set was changed in a recent patch, so the other one now causes a compile-time warning:
drivers/net/wireless/mediatek/mt76/mt7915/init.c: In function 'mt7915_thermal_init': drivers/net/wireless/mediatek/mt76/mt7915/init.c:134:48: error: passing argument 1 of 'thermal_cooling_device_register' discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers] 134 | cdev = thermal_cooling_device_register(wiphy_name(wiphy), phy, | ^~~~~~~~~~~~~~~~~ In file included from drivers/net/wireless/mediatek/mt76/mt7915/init.c:7: include/linux/thermal.h:407:39: note: expected 'char *' but argument is of type 'const char *' 407 | thermal_cooling_device_register(char *type, void *devdata, | ~~~~~~^~~~
Change the dummy helper functions to have the same arguments as the normal version.
Fixes: f991de53a8ab ("thermal: make device_register's type argument const") Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Jean-Francois Dagenais jeff.dagenais@gmail.com Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Link: https://lore.kernel.org/r/20210722090717.1116748-1-arnd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/thermal.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/include/linux/thermal.h b/include/linux/thermal.h index d296f3b88fb9..8050d929a5b4 100644 --- a/include/linux/thermal.h +++ b/include/linux/thermal.h @@ -404,12 +404,13 @@ static inline void thermal_zone_device_unregister( struct thermal_zone_device *tz) { } static inline struct thermal_cooling_device * -thermal_cooling_device_register(char *type, void *devdata, +thermal_cooling_device_register(const char *type, void *devdata, const struct thermal_cooling_device_ops *ops) { return ERR_PTR(-ENODEV); } static inline struct thermal_cooling_device * thermal_of_cooling_device_register(struct device_node *np, - char *type, void *devdata, const struct thermal_cooling_device_ops *ops) + const char *type, void *devdata, + const struct thermal_cooling_device_ops *ops) { return ERR_PTR(-ENODEV); } static inline struct thermal_cooling_device * devm_thermal_of_cooling_device_register(struct device *dev,
From: Koba Ko koba.ko@canonical.com
[ Upstream commit b3dc549986eb7b38eba4a144e979dc93f386751f ]
Due to high latency in PCIE clock switching on RKL platforms, switching the PCIE clock dynamically at runtime can lead to HDMI/DP audio problems. On newer asics this is handled in the SMU firmware. For SMU7-based asics, disable PCIE clock switching to avoid the issue.
AMD provide a parameter to disable PICE_DPM.
modprobe amdgpu ppfeaturemask=0xfff7bffb
It's better to contorl PCIE_DPM in amd gpu driver, switch PCI_DPM by determining intel RKL platform for SMU7-based asics.
Fixes: 1a31474cdb48 ("drm/amd/pm: workaround for audio noise issue") Ref: https://lists.freedesktop.org/archives/amd-gfx/2021-August/067413.html Signed-off-by: Koba Ko koba.ko@canonical.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c index 0541bfc81c1b..1d76cf7cd85d 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c @@ -27,6 +27,9 @@ #include <linux/pci.h> #include <linux/slab.h> #include <asm/div64.h> +#if IS_ENABLED(CONFIG_X86_64) +#include <asm/intel-family.h> +#endif #include <drm/amdgpu_drm.h> #include "ppatomctrl.h" #include "atombios.h" @@ -1733,6 +1736,17 @@ static int smu7_disable_dpm_tasks(struct pp_hwmgr *hwmgr) return result; }
+static bool intel_core_rkl_chk(void) +{ +#if IS_ENABLED(CONFIG_X86_64) + struct cpuinfo_x86 *c = &cpu_data(0); + + return (c->x86 == 6 && c->x86_model == INTEL_FAM6_ROCKETLAKE); +#else + return false; +#endif +} + static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr) { struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend); @@ -1758,7 +1772,8 @@ static void smu7_init_dpm_defaults(struct pp_hwmgr *hwmgr)
data->mclk_dpm_key_disabled = hwmgr->feature_mask & PP_MCLK_DPM_MASK ? false : true; data->sclk_dpm_key_disabled = hwmgr->feature_mask & PP_SCLK_DPM_MASK ? false : true; - data->pcie_dpm_key_disabled = hwmgr->feature_mask & PP_PCIE_DPM_MASK ? false : true; + data->pcie_dpm_key_disabled = + intel_core_rkl_chk() || !(hwmgr->feature_mask & PP_PCIE_DPM_MASK); /* need to set voltage control types before EVV patching */ data->voltage_control = SMU7_VOLTAGE_CONTROL_NONE; data->vddci_control = SMU7_VOLTAGE_CONTROL_NONE;
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit 4b92d4add5f6dcf21275185c997d6ecb800054cd ]
DEFINE_SMP_CALL_CACHE_FUNCTION() was usefel before the CPU hotplug rework to ensure that the cache related functions are called on the upcoming CPU because the notifier itself could run on any online CPU.
The hotplug state machine guarantees that the callbacks are invoked on the upcoming CPU. So there is no need to have this SMP function call obfuscation. That indirection was missed when the hotplug notifiers were converted.
This also solves the problem of ARM64 init_cache_level() invoking ACPI functions which take a semaphore in that context. That's invalid as SMP function calls run with interrupts disabled. Running it just from the callback in context of the CPU hotplug thread solves this.
Fixes: 8571890e1513 ("arm64: Add support for ACPI based firmware tables") Reported-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Acked-by: Will Deacon will@kernel.org Acked-by: Peter Zijlstra peterz@infradead.org Link: https://lore.kernel.org/r/871r69ersb.ffs@tglx Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/cacheinfo.c | 7 ++----- arch/mips/kernel/cacheinfo.c | 7 ++----- arch/riscv/kernel/cacheinfo.c | 7 ++----- arch/x86/kernel/cpu/cacheinfo.c | 7 ++----- include/linux/cacheinfo.h | 18 ------------------ 5 files changed, 8 insertions(+), 38 deletions(-)
diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c index 7fa6828bb488..587543c6c51c 100644 --- a/arch/arm64/kernel/cacheinfo.c +++ b/arch/arm64/kernel/cacheinfo.c @@ -43,7 +43,7 @@ static void ci_leaf_init(struct cacheinfo *this_leaf, this_leaf->type = type; }
-static int __init_cache_level(unsigned int cpu) +int init_cache_level(unsigned int cpu) { unsigned int ctype, level, leaves, fw_level; struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu); @@ -78,7 +78,7 @@ static int __init_cache_level(unsigned int cpu) return 0; }
-static int __populate_cache_leaves(unsigned int cpu) +int populate_cache_leaves(unsigned int cpu) { unsigned int level, idx; enum cache_type type; @@ -97,6 +97,3 @@ static int __populate_cache_leaves(unsigned int cpu) } return 0; } - -DEFINE_SMP_CALL_CACHE_FUNCTION(init_cache_level) -DEFINE_SMP_CALL_CACHE_FUNCTION(populate_cache_leaves) diff --git a/arch/mips/kernel/cacheinfo.c b/arch/mips/kernel/cacheinfo.c index 53d8ea7d36e6..495dd058231d 100644 --- a/arch/mips/kernel/cacheinfo.c +++ b/arch/mips/kernel/cacheinfo.c @@ -17,7 +17,7 @@ do { \ leaf++; \ } while (0)
-static int __init_cache_level(unsigned int cpu) +int init_cache_level(unsigned int cpu) { struct cpuinfo_mips *c = ¤t_cpu_data; struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu); @@ -74,7 +74,7 @@ static void fill_cpumask_cluster(int cpu, cpumask_t *cpu_map) cpumask_set_cpu(cpu1, cpu_map); }
-static int __populate_cache_leaves(unsigned int cpu) +int populate_cache_leaves(unsigned int cpu) { struct cpuinfo_mips *c = ¤t_cpu_data; struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu); @@ -114,6 +114,3 @@ static int __populate_cache_leaves(unsigned int cpu)
return 0; } - -DEFINE_SMP_CALL_CACHE_FUNCTION(init_cache_level) -DEFINE_SMP_CALL_CACHE_FUNCTION(populate_cache_leaves) diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c index d86781357044..90deabfe63ea 100644 --- a/arch/riscv/kernel/cacheinfo.c +++ b/arch/riscv/kernel/cacheinfo.c @@ -113,7 +113,7 @@ static void fill_cacheinfo(struct cacheinfo **this_leaf, } }
-static int __init_cache_level(unsigned int cpu) +int init_cache_level(unsigned int cpu) { struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu); struct device_node *np = of_cpu_device_node_get(cpu); @@ -155,7 +155,7 @@ static int __init_cache_level(unsigned int cpu) return 0; }
-static int __populate_cache_leaves(unsigned int cpu) +int populate_cache_leaves(unsigned int cpu) { struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu); struct cacheinfo *this_leaf = this_cpu_ci->info_list; @@ -187,6 +187,3 @@ static int __populate_cache_leaves(unsigned int cpu)
return 0; } - -DEFINE_SMP_CALL_CACHE_FUNCTION(init_cache_level) -DEFINE_SMP_CALL_CACHE_FUNCTION(populate_cache_leaves) diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c index d66af2950e06..b5e36bd0425b 100644 --- a/arch/x86/kernel/cpu/cacheinfo.c +++ b/arch/x86/kernel/cpu/cacheinfo.c @@ -985,7 +985,7 @@ static void ci_leaf_init(struct cacheinfo *this_leaf, this_leaf->priv = base->nb; }
-static int __init_cache_level(unsigned int cpu) +int init_cache_level(unsigned int cpu) { struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
@@ -1014,7 +1014,7 @@ static void get_cache_id(int cpu, struct _cpuid4_info_regs *id4_regs) id4_regs->id = c->apicid >> index_msb; }
-static int __populate_cache_leaves(unsigned int cpu) +int populate_cache_leaves(unsigned int cpu) { unsigned int idx, ret; struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu); @@ -1033,6 +1033,3 @@ static int __populate_cache_leaves(unsigned int cpu)
return 0; } - -DEFINE_SMP_CALL_CACHE_FUNCTION(init_cache_level) -DEFINE_SMP_CALL_CACHE_FUNCTION(populate_cache_leaves) diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h index 4f72b47973c3..2f909ed084c6 100644 --- a/include/linux/cacheinfo.h +++ b/include/linux/cacheinfo.h @@ -79,24 +79,6 @@ struct cpu_cacheinfo { bool cpu_map_populated; };
-/* - * Helpers to make sure "func" is executed on the cpu whose cache - * attributes are being detected - */ -#define DEFINE_SMP_CALL_CACHE_FUNCTION(func) \ -static inline void _##func(void *ret) \ -{ \ - int cpu = smp_processor_id(); \ - *(int *)ret = __##func(cpu); \ -} \ - \ -int func(unsigned int cpu) \ -{ \ - int ret; \ - smp_call_function_single(cpu, _##func, &ret, true); \ - return ret; \ -} - struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu); int init_cache_level(unsigned int cpu); int populate_cache_leaves(unsigned int cpu);
From: Geert Uytterhoeven geert@linux-m68k.org
[ Upstream commit c4f3a3460a5daebc772d9263500e4099b11e7300 ]
Move notify between drivers is an option of DMA-BUF. Enabling DMABUF_MOVE_NOTIFY without DMA_SHARED_BUFFER does not have any impact, as drivers/dma-buf/ is not entered during the build when DMA_SHARED_BUFFER is disabled.
Fixes: bb42df4662a44765 ("dma-buf: add dynamic DMA-buf handling v15") Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Daniel Vetter daniel.vetter@ffwll.ch Link: https://patchwork.freedesktop.org/patch/msgid/20210902124913.2698760-2-geert... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma-buf/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/dma-buf/Kconfig b/drivers/dma-buf/Kconfig index 4e16c71c24b7..6e13cc941cd2 100644 --- a/drivers/dma-buf/Kconfig +++ b/drivers/dma-buf/Kconfig @@ -42,6 +42,7 @@ config UDMABUF config DMABUF_MOVE_NOTIFY bool "Move notify between drivers (EXPERIMENTAL)" default n + depends on DMA_SHARED_BUFFER help Don't pin buffers if the dynamic DMA-buf interface is available on both the exporter as well as the importer. This fixes a security
From: Geert Uytterhoeven geert@linux-m68k.org
[ Upstream commit cca62758ebdd71fcfb6d589d6487a7f26398d50d ]
DMA-BUF debug checks are an option of DMA-BUF. Enabling DMABUF_DEBUG without DMA_SHARED_BUFFER does not have any impact, as drivers/dma-buf/ is not entered during the build when DMA_SHARED_BUFFER is disabled.
Fixes: 84335675f2223cbd ("dma-buf: Add debug option") Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Sumit Semwal sumit.semwal@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20210902124913.2698760-3-geert... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma-buf/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/dma-buf/Kconfig b/drivers/dma-buf/Kconfig index 6e13cc941cd2..6eb4d13f426e 100644 --- a/drivers/dma-buf/Kconfig +++ b/drivers/dma-buf/Kconfig @@ -53,6 +53,7 @@ config DMABUF_MOVE_NOTIFY
config DMABUF_DEBUG bool "DMA-BUF debug checks" + depends on DMA_SHARED_BUFFER default y if DMA_API_DEBUG help This option enables additional checks for DMA-BUF importers and
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit 907872baa9f1538eed02ec737b8e89eba6c6e4b9 ]
parisc build test images fail to compile with the following error.
drivers/parisc/dino.c:160:12: error: 'pci_dev_is_behind_card_dino' defined but not used
Move the function just ahead of its only caller to avoid the error.
Fixes: 5fa1659105fa ("parisc: Disable HP HSC-PCI Cards to prevent kernel crash") Cc: Helge Deller deller@gmx.de Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/parisc/dino.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/parisc/dino.c b/drivers/parisc/dino.c index 889d7ce282eb..952a92504df6 100644 --- a/drivers/parisc/dino.c +++ b/drivers/parisc/dino.c @@ -156,15 +156,6 @@ static inline struct dino_device *DINO_DEV(struct pci_hba_data *hba) return container_of(hba, struct dino_device, hba); }
-/* Check if PCI device is behind a Card-mode Dino. */ -static int pci_dev_is_behind_card_dino(struct pci_dev *dev) -{ - struct dino_device *dino_dev; - - dino_dev = DINO_DEV(parisc_walk_tree(dev->bus->bridge)); - return is_card_dino(&dino_dev->hba.dev->id); -} - /* * Dino Configuration Space Accessor Functions */ @@ -447,6 +438,15 @@ static void quirk_cirrus_cardbus(struct pci_dev *dev) DECLARE_PCI_FIXUP_ENABLE(PCI_VENDOR_ID_CIRRUS, PCI_DEVICE_ID_CIRRUS_6832, quirk_cirrus_cardbus );
#ifdef CONFIG_TULIP +/* Check if PCI device is behind a Card-mode Dino. */ +static int pci_dev_is_behind_card_dino(struct pci_dev *dev) +{ + struct dino_device *dino_dev; + + dino_dev = DINO_DEV(parisc_walk_tree(dev->bus->bridge)); + return is_card_dino(&dino_dev->hba.dev->id); +} + static void pci_fixup_tulip(struct pci_dev *dev) { if (!pci_dev_is_behind_card_dino(dev))
From: Wei Huang wei.huang2@amd.com
[ Upstream commit c3811a50addd23b9bb5a36278609ee1638debcf6 ]
Currently, iommu_init_ga() checks and disables IOMMU VAPIC support (i.e. AMD AVIC support in IOMMU) when GAMSup feature bit is not set. However it forgets to clear IRQ_POSTING_CAP from the previously set amd_iommu_irq_ops.capability.
This triggers an invalid page fault bug during guest VM warm reboot if AVIC is enabled since the irq_remapping_cap(IRQ_POSTING_CAP) is incorrectly set, and crash the system with the following kernel trace.
BUG: unable to handle page fault for address: 0000000000400dd8 RIP: 0010:amd_iommu_deactivate_guest_mode+0x19/0xbc Call Trace: svm_set_pi_irte_mode+0x8a/0xc0 [kvm_amd] ? kvm_make_all_cpus_request_except+0x50/0x70 [kvm] kvm_request_apicv_update+0x10c/0x150 [kvm] svm_toggle_avic_for_irq_window+0x52/0x90 [kvm_amd] svm_enable_irq_window+0x26/0xa0 [kvm_amd] vcpu_enter_guest+0xbbe/0x1560 [kvm] ? avic_vcpu_load+0xd5/0x120 [kvm_amd] ? kvm_arch_vcpu_load+0x76/0x240 [kvm] ? svm_get_segment_base+0xa/0x10 [kvm_amd] kvm_arch_vcpu_ioctl_run+0x103/0x590 [kvm] kvm_vcpu_ioctl+0x22a/0x5d0 [kvm] __x64_sys_ioctl+0x84/0xc0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae
Fixes by moving the initializing of AMD IOMMU interrupt remapping mode (amd_iommu_guest_ir) earlier before setting up the amd_iommu_irq_ops.capability with appropriate IRQ_POSTING_CAP flag.
[joro: Squashed the two patches and limited check_features_on_all_iommus() to CONFIG_IRQ_REMAP to fix a compile warning.]
Signed-off-by: Wei Huang wei.huang2@amd.com Co-developed-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Signed-off-by: Suravee Suthikulpanit suravee.suthikulpanit@amd.com Link: https://lore.kernel.org/r/20210820202957.187572-2-suravee.suthikulpanit@amd.... Link: https://lore.kernel.org/r/20210820202957.187572-3-suravee.suthikulpanit@amd.... Fixes: 8bda0cfbdc1a ("iommu/amd: Detect and initialize guest vAPIC log") Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/amd/init.c | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index 46280e6e1535..5c21f1ee5098 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -298,6 +298,22 @@ int amd_iommu_get_num_iommus(void) return amd_iommus_present; }
+#ifdef CONFIG_IRQ_REMAP +static bool check_feature_on_all_iommus(u64 mask) +{ + bool ret = false; + struct amd_iommu *iommu; + + for_each_iommu(iommu) { + ret = iommu_feature(iommu, mask); + if (!ret) + return false; + } + + return true; +} +#endif + /* * For IVHD type 0x11/0x40, EFR is also available via IVHD. * Default to IVHD EFR since it is available sooner @@ -854,13 +870,6 @@ static int iommu_init_ga(struct amd_iommu *iommu) int ret = 0;
#ifdef CONFIG_IRQ_REMAP - /* Note: We have already checked GASup from IVRS table. - * Now, we need to make sure that GAMSup is set. - */ - if (AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) && - !iommu_feature(iommu, FEATURE_GAM_VAPIC)) - amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY_GA; - ret = iommu_init_ga_log(iommu); #endif /* CONFIG_IRQ_REMAP */
@@ -2477,6 +2486,14 @@ static void early_enable_iommus(void) }
#ifdef CONFIG_IRQ_REMAP + /* + * Note: We have already checked GASup from IVRS table. + * Now, we need to make sure that GAMSup is set. + */ + if (AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir) && + !check_feature_on_all_iommus(FEATURE_GAM_VAPIC)) + amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY_GA; + if (AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir)) amd_iommu_irq_ops.capability |= (1 << IRQ_POSTING_CAP); #endif
From: Fenghua Yu fenghua.yu@intel.com
[ Upstream commit a21518cb23a3c7d49bafcb59862fd389fd829d4b ]
The mm->pasid will be used in intel_svm_free_pasid() after load_pasid() during unbinding mm. Clearing it in load_pasid() will cause PASID cannot be freed in intel_svm_free_pasid().
Additionally mm->pasid was updated already before load_pasid() during pasid allocation. No need to update it again in load_pasid() during binding mm. Don't update mm->pasid to avoid the issues in both binding mm and unbinding mm.
Fixes: 4048377414162 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers") Reported-and-tested-by: Dave Jiang dave.jiang@intel.com Co-developed-by: Jacob Pan jacob.jun.pan@linux.intel.com Signed-off-by: Jacob Pan jacob.jun.pan@linux.intel.com Signed-off-by: Fenghua Yu fenghua.yu@intel.com Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210828070622.2437559-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/intel/svm.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index 4b9b3f35ba0e..ceeca633a5f9 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -516,9 +516,6 @@ static void load_pasid(struct mm_struct *mm, u32 pasid) { mutex_lock(&mm->context.lock);
- /* Synchronize with READ_ONCE in update_pasid(). */ - smp_store_release(&mm->pasid, pasid); - /* Update PASID MSR on all CPUs running the mm's tasks. */ on_each_cpu_mask(mm_cpumask(mm), _load_pasid, NULL, true);
From: Fenghua Yu fenghua.yu@intel.com
[ Upstream commit 6ef0505158f7ca1b32763a3b038b5d11296b642b ]
pasid_mutex and dev->iommu->param->lock are held while unbinding mm is flushing IO page fault workqueue and waiting for all page fault works to finish. But an in-flight page fault work also need to hold the two locks while unbinding mm are holding them and waiting for the work to finish. This may cause an ABBA deadlock issue as shown below:
idxd 0000:00:0a.0: unbind PASID 2 ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc7+ #549 Not tainted [ 186.615245] ---------- dsa_test/898 is trying to acquire lock: ffff888100d854e8 (¶m->lock){+.+.}-{3:3}, at: iopf_queue_flush_dev+0x29/0x60 but task is already holding lock: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0 which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (pasid_mutex){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 intel_svm_page_response+0x8e/0x260 iommu_page_response+0x122/0x200 iopf_handle_group+0x1c2/0x240 process_one_work+0x2a5/0x5a0 worker_thread+0x55/0x400 kthread+0x13b/0x160 ret_from_fork+0x22/0x30
-> #1 (¶m->fault_param->lock){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iommu_report_device_fault+0xc2/0x170 prq_event_thread+0x28a/0x580 irq_thread_fn+0x28/0x60 irq_thread+0xcf/0x180 kthread+0x13b/0x160 ret_from_fork+0x22/0x30
-> #0 (¶m->lock){+.+.}-{3:3}: __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200 [idxd] __fput+0x9c/0x250 ____fput+0xe/0x10 task_work_run+0x64/0xa0 exit_to_user_mode_prepare+0x227/0x230 syscall_exit_to_user_mode+0x2c/0x60 do_syscall_64+0x48/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of: ¶m->lock --> ¶m->fault_param->lock --> pasid_mutex
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(pasid_mutex); lock(¶m->fault_param->lock); lock(pasid_mutex); lock(¶m->lock);
*** DEADLOCK ***
2 locks held by dsa_test/898: #0: ffff888100cc1cc0 (&group->mutex){+.+.}-{3:3}, at: iommu_sva_unbind_device+0x53/0x80 #1: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0
stack backtrace: CPU: 2 PID: 898 Comm: dsa_test Not tainted 5.14.0-rc7+ #549 Hardware name: Intel Corporation Kabylake Client platform/KBL S DDR4 UD IMM CRB, BIOS KBLSE2R1.R00.X050.P01.1608011715 08/01/2016 Call Trace: dump_stack_lvl+0x5b/0x74 dump_stack+0x10/0x12 print_circular_bug.cold+0x13d/0x142 check_noncircular+0xf1/0x110 __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xde/0x240 __mutex_lock+0x75/0x730 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xfd/0x240 ? iopf_queue_flush_dev+0x29/0x60 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 ? intel_pasid_tear_down_entry+0x22e/0x240 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200
pasid_mutex protects pasid and svm data mapping data. It's unnecessary to hold pasid_mutex while flushing the workqueue. To fix the deadlock issue, unlock pasid_pasid during flushing the workqueue to allow the works to be handled.
Fixes: d5b9e4bfe0d8 ("iommu/vt-d: Report prq to io-pgfault framework") Reported-and-tested-by: Dave Jiang dave.jiang@intel.com Signed-off-by: Fenghua Yu fenghua.yu@intel.com Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210828070622.2437559-3-baolu.lu@linux.intel.com [joro: Removed timing information from kernel log messages] Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/intel/svm.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index ceeca633a5f9..d575082567ca 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -793,7 +793,19 @@ prq_retry: goto prq_retry; }
+ /* + * A work in IO page fault workqueue may try to lock pasid_mutex now. + * Holding pasid_mutex while waiting in iopf_queue_flush_dev() for + * all works in the workqueue to finish may cause deadlock. + * + * It's unnecessary to hold pasid_mutex in iopf_queue_flush_dev(). + * Unlock it to allow the works to be handled while waiting for + * them to finish. + */ + lockdep_assert_held(&pasid_mutex); + mutex_unlock(&pasid_mutex); iopf_queue_flush_dev(dev); + mutex_lock(&pasid_mutex);
/* * Perform steps described in VT-d spec CH7.10 to drain page
From: Ard Biesheuvel ardb@kernel.org
[ Upstream commit 88053ec8cb1b91df566353cd3116470193797e00 ]
KVM in nVHE mode divides up its VA space into two equal halves, and picks the half that does not conflict with the HYP ID map to map its linear region. This worked fine when the kernel's linear map itself was guaranteed to cover precisely as many bits of VA space, but this was changed by commit f4693c2716b35d08 ("arm64: mm: extend linear region for 52-bit VA configurations").
The result is that, depending on the placement of the ID map, kernel-VA to hyp-VA translations may produce addresses that either conflict with other HYP mappings (including the ID map itself) or generate addresses outside of the 52-bit addressable range, neither of which is likely to lead to anything useful.
Given that 52-bit capable cores are guaranteed to implement VHE, this only affects configurations such as pKVM where we opt into non-VHE mode even if the hardware is VHE capable. So just for these configurations, let's limit the kernel linear map to 51 bits and work around the problem.
Fixes: f4693c2716b3 ("arm64: mm: extend linear region for 52-bit VA configurations") Signed-off-by: Ard Biesheuvel ardb@kernel.org Link: https://lore.kernel.org/r/20210826165613.60774-1-ardb@kernel.org Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/mm/init.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 1fdb7bb7c198..0ad4afc9359b 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -319,7 +319,21 @@ static void __init fdt_enforce_memory_region(void)
void __init arm64_memblock_init(void) { - const s64 linear_region_size = PAGE_END - _PAGE_OFFSET(vabits_actual); + s64 linear_region_size = PAGE_END - _PAGE_OFFSET(vabits_actual); + + /* + * Corner case: 52-bit VA capable systems running KVM in nVHE mode may + * be limited in their ability to support a linear map that exceeds 51 + * bits of VA space, depending on the placement of the ID map. Given + * that the placement of the ID map may be randomized, let's simply + * limit the kernel's linear map to 51 bits as well if we detect this + * configuration. + */ + if (IS_ENABLED(CONFIG_KVM) && vabits_actual == 52 && + is_hyp_mode_available() && !is_kernel_in_hyp_mode()) { + pr_info("Capping linear region to 51 bits for KVM in nVHE mode on LVA capable hardware.\n"); + linear_region_size = min_t(u64, linear_region_size, BIT(51)); + }
/* Handle linux,usable-memory-range property */ fdt_enforce_memory_region();
From: xinhui pan xinhui.pan@amd.com
[ Upstream commit 70982eef4d7eebb47a3b1ef25ec1bc742f3a21cf ]
The ret value might be -EBUSY, caller will think lru lock is still locked but actually NOT. So return -ENOSPC instead. Otherwise we hit list corruption.
ttm_bo_cleanup_refs might fail too if BO is not idle. If we return 0, caller(ttm_tt_populate -> ttm_global_swapout ->ttm_device_swapout) will be stuck as we actually did not free any BO memory. This usually happens when the fence is not signaled for a long time.
Signed-off-by: xinhui pan xinhui.pan@amd.com Reviewed-by: Christian König christian.koenig@amd.com Fixes: ebd59851c796 ("drm/ttm: move swapout logic around v3") Link: https://patchwork.freedesktop.org/patch/msgid/20210907040832.1107747-1-xinhu... Signed-off-by: Christian König christian.koenig@amd.com Signed-off-by: Dave Airlie airlied@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/ttm/ttm_bo.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 32202385073a..b47a5053eb85 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -1157,9 +1157,9 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx, }
if (bo->deleted) { - ttm_bo_cleanup_refs(bo, false, false, locked); + ret = ttm_bo_cleanup_refs(bo, false, false, locked); ttm_bo_put(bo); - return 0; + return ret == -EBUSY ? -ENOSPC : ret; }
ttm_bo_del_from_lru(bo); @@ -1213,7 +1213,7 @@ out: if (locked) dma_resv_unlock(bo->base.resv); ttm_bo_put(bo); - return ret; + return ret == -EBUSY ? -ENOSPC : ret; }
void ttm_bo_tt_destroy(struct ttm_buffer_object *bo)
From: Saravana Kannan saravanak@google.com
[ Upstream commit 4a48b66b3f52aa1a8aaa8a8863891eed35769731 ]
Andre reported fw_devlink=on breaking OLPC XO-1.5 [1].
OLPC XO-1.5 is an X86 system that uses a mix of ACPI and OF to populate devices. The root cause seems to be ISA devices not setting their fwnode field. But trying to figure out how to fix that doesn't seem worth the trouble because the OLPC devicetree is very sparse/limited and fw_devlink only adds the links causing this issue. Considering that there aren't many users of OF in an X86 system, simply fw_devlink DT support for X86.
[1] - https://lore.kernel.org/lkml/3c1f2473-92ad-bfc4-258e-a5a08ad73dd0@web.de/
Fixes: ea718c699055 ("Revert "Revert "driver core: Set fw_devlink=on by default""") Signed-off-by: Saravana Kannan saravanak@google.com Cc: Andre Muller andre.muller@web.de Acked-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Tested-by: Andre Müller andre.muller@web.de Link: https://lore.kernel.org/r/20210910011446.3208894-1-saravanak@google.com Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/of/property.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/of/property.c b/drivers/of/property.c index 6c028632f425..0b9c2fb843e7 100644 --- a/drivers/of/property.c +++ b/drivers/of/property.c @@ -1434,6 +1434,9 @@ static int of_fwnode_add_links(struct fwnode_handle *fwnode) struct property *p; struct device_node *con_np = to_of_node(fwnode);
+ if (IS_ENABLED(CONFIG_X86)) + return 0; + if (!con_np) return -EINVAL;
From: Geert Uytterhoeven geert@linux-m68k.org
[ Upstream commit cbba17870881cd17bca24673ccb72859431da5bd ]
Currently, nothing is output on the serial console, unless "console=ttyS0,115200n8" or "earlycon" are appended to the kernel command line. Enable automatic console selection using chosen/stdout-path by adding a proper alias, and configure the expected serial rate.
While at it, add aliases for the other three serial ports, which are provided on the same micro-USB connector as the first one.
Fixes: 0fa6107eca4186ad ("RISC-V: Initial DTS for Microchip ICICLE board") Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org Reviewed-by: Bin Meng bmeng.cn@gmail.com Reviewed-by: Conor Dooley conor.dooley@microchip.com Signed-off-by: Palmer Dabbelt palmerdabbelt@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/boot/dts/microchip/microchip-mpfs-icicle-kit.dts | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/boot/dts/microchip/microchip-mpfs-icicle-kit.dts b/arch/riscv/boot/dts/microchip/microchip-mpfs-icicle-kit.dts index baea7d204639..b254c60589a1 100644 --- a/arch/riscv/boot/dts/microchip/microchip-mpfs-icicle-kit.dts +++ b/arch/riscv/boot/dts/microchip/microchip-mpfs-icicle-kit.dts @@ -16,10 +16,14 @@
aliases { ethernet0 = &emac1; + serial0 = &serial0; + serial1 = &serial1; + serial2 = &serial2; + serial3 = &serial3; };
chosen { - stdout-path = &serial0; + stdout-path = "serial0:115200n8"; };
cpus {
From: Adrian Hunter adrian.hunter@intel.com
[ Upstream commit 99fc5941b835d662eb2e91d8b61249e9a51df9f0 ]
A config terms list was spliced twice, resulting in a never-ending loop when the list was traversed. Fix by using list_splice_init() and copying and freeing the lists as necessary.
This patch also depends on patch "perf tools: Factor out copy_config_terms() and free_config_terms()"
Example on ADL:
Before:
# perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname & # jobs [1]+ Running perf record -e "{intel_pt//,cycles/aux-sample-size=4096/pp}" uname # perf top -E 10 PerfTop: 4071 irqs/sec kernel: 6.9% exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles], (all, 24 CPUs) ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
97.60% perf [.] __evsel__get_config_term 0.25% [kernel] [k] kallsyms_expand_symbol.constprop.13 0.24% perf [.] kallsyms__parse 0.15% [kernel] [k] _raw_spin_lock 0.14% [kernel] [k] number 0.13% [kernel] [k] advance_transaction 0.08% [kernel] [k] format_decode 0.08% perf [.] map__process_kallsym_symbol 0.08% perf [.] rb_insert_color 0.08% [kernel] [k] vsnprintf exiting. # kill %1
After:
# perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname & Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.060 MB perf.data ] # perf script | head perf-exec 604 [001] 1827.312293: psb: psb offs: 0 ffffffffb8415e87 pt_config_start+0x37 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856a3bd event_sched_in.isra.133+0xfd ([kernel.kallsyms]) => ffffffffb856a9a0 perf_pmu_nop_void+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856b10e merge_sched_in+0x26e ([kernel.kallsyms]) => ffffffffb856a2c0 event_sched_in.isra.133+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856a45d event_sched_in.isra.133+0x19d ([kernel.kallsyms]) => ffffffffb8568b80 perf_event_set_state.part.61+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb8568b86 perf_event_set_state.part.61+0x6 ([kernel.kallsyms]) => ffffffffb85662a0 perf_event_update_time+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856a35c event_sched_in.isra.133+0x9c ([kernel.kallsyms]) => ffffffffb8567610 perf_log_itrace_start+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856a377 event_sched_in.isra.133+0xb7 ([kernel.kallsyms]) => ffffffffb8403b40 x86_pmu_add+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb8403b86 x86_pmu_add+0x46 ([kernel.kallsyms]) => ffffffffb8403940 collect_events+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb8403a7b collect_events+0x13b ([kernel.kallsyms]) => ffffffffb8402cd0 collect_event+0x0 ([kernel.kallsyms])
Fixes: 30def61f64bac5 ("perf parse-events Create two hybrid cache events") Fixes: 94da591b1c7913 ("perf parse-events Create two hybrid raw events") Fixes: 9cbfa2f64c04d9 ("perf parse-events Create two hybrid hardware events") Signed-off-by: Adrian Hunter adrian.hunter@intel.com Acked-by: Jiri Olsa jolsa@redhat.com Cc: Jin Yao yao.jin@linux.intel.com Cc: Kan Liang kan.liang@linux.intel.com Link: https //lore.kernel.org/r/20210909125508.28693-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/parse-events-hybrid.c | 18 +++++++++++++++--- tools/perf/util/parse-events.c | 18 ++++++++++++------ 2 files changed, 27 insertions(+), 9 deletions(-)
diff --git a/tools/perf/util/parse-events-hybrid.c b/tools/perf/util/parse-events-hybrid.c index 10160ab126f9..b234d95fb10a 100644 --- a/tools/perf/util/parse-events-hybrid.c +++ b/tools/perf/util/parse-events-hybrid.c @@ -76,12 +76,16 @@ static int add_hw_hybrid(struct parse_events_state *parse_state, int ret;
perf_pmu__for_each_hybrid_pmu(pmu) { + LIST_HEAD(terms); + if (pmu_cmp(parse_state, pmu)) continue;
+ copy_config_terms(&terms, config_terms); ret = create_event_hybrid(PERF_TYPE_HARDWARE, &parse_state->idx, list, attr, name, - config_terms, pmu); + &terms, pmu); + free_config_terms(&terms); if (ret) return ret; } @@ -115,11 +119,15 @@ static int add_raw_hybrid(struct parse_events_state *parse_state, int ret;
perf_pmu__for_each_hybrid_pmu(pmu) { + LIST_HEAD(terms); + if (pmu_cmp(parse_state, pmu)) continue;
+ copy_config_terms(&terms, config_terms); ret = create_raw_event_hybrid(&parse_state->idx, list, attr, - name, config_terms, pmu); + name, &terms, pmu); + free_config_terms(&terms); if (ret) return ret; } @@ -165,11 +173,15 @@ int parse_events__add_cache_hybrid(struct list_head *list, int *idx,
*hybrid = true; perf_pmu__for_each_hybrid_pmu(pmu) { + LIST_HEAD(terms); + if (pmu_cmp(parse_state, pmu)) continue;
+ copy_config_terms(&terms, config_terms); ret = create_event_hybrid(PERF_TYPE_HW_CACHE, idx, list, - attr, name, config_terms, pmu); + attr, name, &terms, pmu); + free_config_terms(&terms); if (ret) return ret; } diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index e5eae23cfceb..790b72f2f31f 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -387,7 +387,7 @@ __add_event(struct list_head *list, int *idx, evsel->name = strdup(name);
if (config_terms) - list_splice(config_terms, &evsel->config_terms); + list_splice_init(config_terms, &evsel->config_terms);
if (list) list_add_tail(&evsel->core.node, list); @@ -535,9 +535,12 @@ int parse_events_add_cache(struct list_head *list, int *idx, config_name ? : name, &config_terms, &hybrid, parse_state); if (hybrid) - return ret; + goto out_free_terms;
- return add_event(list, idx, &attr, config_name ? : name, &config_terms); + ret = add_event(list, idx, &attr, config_name ? : name, &config_terms); +out_free_terms: + free_config_terms(&config_terms); + return ret; }
static void tracepoint_error(struct parse_events_error *e, int err, @@ -1457,10 +1460,13 @@ int parse_events_add_numeric(struct parse_events_state *parse_state, get_config_name(head_config), &config_terms, &hybrid); if (hybrid) - return ret; + goto out_free_terms;
- return add_event(list, &parse_state->idx, &attr, - get_config_name(head_config), &config_terms); + ret = add_event(list, &parse_state->idx, &attr, + get_config_name(head_config), &config_terms); +out_free_terms: + free_config_terms(&config_terms); + return ret; }
int parse_events_add_tool(struct parse_events_state *parse_state,
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit b2296eeac91555bd13f774efa7ab7d4b12fb71ef ]
Now that UML has PCI support, this driver must depend also on !UML since it pokes at X86_64 architecture internals that don't exist on ARCH=um.
Reported-by: kernel test robot lkp@intel.com Signed-off-by: Johannes Berg johannes.berg@intel.com Acked-by: Dave Jiang dave.jiang@intel.com Acked-By: Anton Ivanov anton.ivanov@cambridgegreys.com Link: https://lore.kernel.org/r/20210625103810.fe877ae0aef4.If240438e3f50ae226f3f7... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig index 39b5b46e880f..f450e4231db7 100644 --- a/drivers/dma/Kconfig +++ b/drivers/dma/Kconfig @@ -279,7 +279,7 @@ config INTEL_IDMA64
config INTEL_IDXD tristate "Intel Data Accelerators support" - depends on PCI && X86_64 + depends on PCI && X86_64 && !UML depends on PCI_MSI depends on SBITMAP select DMA_ENGINE
From: Zou Wei zou_wei@huawei.com
[ Upstream commit 4faee8b65ec32346f8096e64c5fa1d5a73121742 ]
This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module.
Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Zou Wei zou_wei@huawei.com Reviewed-by: Baolin Wang baolin.wang7@gmail.com Link: https://lore.kernel.org/r/1620094977-70146-1-git-send-email-zou_wei@huawei.c... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/sprd-dma.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/dma/sprd-dma.c b/drivers/dma/sprd-dma.c index 0ef5ca81ba4d..4357d2395e6b 100644 --- a/drivers/dma/sprd-dma.c +++ b/drivers/dma/sprd-dma.c @@ -1265,6 +1265,7 @@ static const struct of_device_id sprd_dma_match[] = { { .compatible = "sprd,sc9860-dma", }, {}, }; +MODULE_DEVICE_TABLE(of, sprd_dma_match);
static int __maybe_unused sprd_dma_runtime_suspend(struct device *dev) {
From: Ben Widawsky ben.widawsky@intel.com
[ Upstream commit 5161a55c069f53d88da49274cbef6e3c74eadea9 ]
CXL core is growing, and it's already arguably unmanageable. To support future growth, move core functionality to a new directory and rename the file to represent just bus support. Future work will remove non-bus functionality.
Note that mem.h is renamed to cxlmem.h to avoid a namespace collision with the global ARCH=um mem.h header.
Reported-by: kernel test robot lkp@intel.com Signed-off-by: Ben Widawsky ben.widawsky@intel.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Link: https://lore.kernel.org/r/162792537866.368511.8915631504621088321.stgit@dwil... Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/driver-api/cxl/memory-devices.rst | 2 +- drivers/cxl/Makefile | 4 +--- drivers/cxl/core/Makefile | 5 +++++ drivers/cxl/{core.c => core/bus.c} | 4 ++-- drivers/cxl/{mem.h => cxlmem.h} | 0 drivers/cxl/pci.c | 2 +- drivers/cxl/pmem.c | 2 +- 7 files changed, 11 insertions(+), 8 deletions(-) create mode 100644 drivers/cxl/core/Makefile rename drivers/cxl/{core.c => core/bus.c} (99%) rename drivers/cxl/{mem.h => cxlmem.h} (100%)
diff --git a/Documentation/driver-api/cxl/memory-devices.rst b/Documentation/driver-api/cxl/memory-devices.rst index 487ce4f41d77..a86e2c7c551a 100644 --- a/Documentation/driver-api/cxl/memory-devices.rst +++ b/Documentation/driver-api/cxl/memory-devices.rst @@ -36,7 +36,7 @@ CXL Core .. kernel-doc:: drivers/cxl/cxl.h :internal:
-.. kernel-doc:: drivers/cxl/core.c +.. kernel-doc:: drivers/cxl/core/bus.c :doc: cxl core
External Interfaces diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile index 32954059b37b..d1aaabc940f3 100644 --- a/drivers/cxl/Makefile +++ b/drivers/cxl/Makefile @@ -1,11 +1,9 @@ # SPDX-License-Identifier: GPL-2.0 -obj-$(CONFIG_CXL_BUS) += cxl_core.o +obj-$(CONFIG_CXL_BUS) += core/ obj-$(CONFIG_CXL_MEM) += cxl_pci.o obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o
-ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -cxl_core-y := core.o cxl_pci-y := pci.o cxl_acpi-y := acpi.o cxl_pmem-y := pmem.o diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile new file mode 100644 index 000000000000..ad137f96e5c8 --- /dev/null +++ b/drivers/cxl/core/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_CXL_BUS) += cxl_core.o + +ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=CXL -I$(srctree)/drivers/cxl +cxl_core-y := bus.o diff --git a/drivers/cxl/core.c b/drivers/cxl/core/bus.c similarity index 99% rename from drivers/cxl/core.c rename to drivers/cxl/core/bus.c index a2e4d54fc7bc..0815eec23944 100644 --- a/drivers/cxl/core.c +++ b/drivers/cxl/core/bus.c @@ -6,8 +6,8 @@ #include <linux/pci.h> #include <linux/slab.h> #include <linux/idr.h> -#include "cxl.h" -#include "mem.h" +#include <cxlmem.h> +#include <cxl.h>
/** * DOC: cxl core diff --git a/drivers/cxl/mem.h b/drivers/cxl/cxlmem.h similarity index 100% rename from drivers/cxl/mem.h rename to drivers/cxl/cxlmem.h diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index 145ad4bc305f..e7ee148ad7ee 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -12,9 +12,9 @@ #include <linux/pci.h> #include <linux/io.h> #include <linux/io-64-nonatomic-lo-hi.h> +#include "cxlmem.h" #include "pci.h" #include "cxl.h" -#include "mem.h"
/** * DOC: cxl pci diff --git a/drivers/cxl/pmem.c b/drivers/cxl/pmem.c index 0088e41dd2f3..9652c3ee41e7 100644 --- a/drivers/cxl/pmem.c +++ b/drivers/cxl/pmem.c @@ -6,7 +6,7 @@ #include <linux/ndctl.h> #include <linux/async.h> #include <linux/slab.h> -#include "mem.h" +#include "cxlmem.h" #include "cxl.h"
/*
From: Dan Williams dan.j.williams@intel.com
[ Upstream commit 9cc238c7a526dba9ee8c210fa2828886fc65db66 ]
In preparation for moving cxl_memdev allocation to the core, introduce cdevm_file_operations to coordinate file operations shutdown relative to driver data release.
The motivation for moving cxl_memdev allocation to the core (beyond better file organization of sysfs attributes in core/ and drivers in cxl/), is that device lifetime is longer than module lifetime. The cxl_pci module should be free to come and go without needing to coordinate with devices that need the text associated with cxl_memdev_release() to stay resident. The move will fix a use after free bug when looping driver load / unload with CONFIG_DEBUG_KOBJECT_RELEASE=y.
Another motivation for passing in file_operations to the core cxl_memdev creation flow is to allow for alternate drivers, like unit test code, to define their own ioctl backends.
Signed-off-by: Ben Widawsky ben.widawsky@intel.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Link: https://lore.kernel.org/r/162792539962.368511.2962268954245340288.stgit@dwil... Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cxl/cxlmem.h | 15 ++++++++++ drivers/cxl/pci.c | 65 ++++++++++++++++++++++++++------------------ 2 files changed, 53 insertions(+), 27 deletions(-)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index 8f02d02b26b4..0cd463de1342 100644 --- a/drivers/cxl/cxlmem.h +++ b/drivers/cxl/cxlmem.h @@ -34,6 +34,21 @@ */ #define CXL_MEM_MAX_DEVS 65536
+/** + * struct cdevm_file_operations - devm coordinated cdev file operations + * @fops: file operations that are synchronized against @shutdown + * @shutdown: disconnect driver data + * + * @shutdown is invoked in the devres release path to disconnect any + * driver instance data from @dev. It assumes synchronization with any + * fops operation that requires driver data. After @shutdown an + * operation may only reference @device data. + */ +struct cdevm_file_operations { + struct file_operations fops; + void (*shutdown)(struct device *dev); +}; + /** * struct cxl_memdev - CXL bus object representing a Type-3 Memory Device * @dev: driver core device object diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index e7ee148ad7ee..e809596049b6 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -806,13 +806,30 @@ static int cxl_memdev_release_file(struct inode *inode, struct file *file) return 0; }
-static const struct file_operations cxl_memdev_fops = { - .owner = THIS_MODULE, - .unlocked_ioctl = cxl_memdev_ioctl, - .open = cxl_memdev_open, - .release = cxl_memdev_release_file, - .compat_ioctl = compat_ptr_ioctl, - .llseek = noop_llseek, +static struct cxl_memdev *to_cxl_memdev(struct device *dev) +{ + return container_of(dev, struct cxl_memdev, dev); +} + +static void cxl_memdev_shutdown(struct device *dev) +{ + struct cxl_memdev *cxlmd = to_cxl_memdev(dev); + + down_write(&cxl_memdev_rwsem); + cxlmd->cxlm = NULL; + up_write(&cxl_memdev_rwsem); +} + +static const struct cdevm_file_operations cxl_memdev_fops = { + .fops = { + .owner = THIS_MODULE, + .unlocked_ioctl = cxl_memdev_ioctl, + .open = cxl_memdev_open, + .release = cxl_memdev_release_file, + .compat_ioctl = compat_ptr_ioctl, + .llseek = noop_llseek, + }, + .shutdown = cxl_memdev_shutdown, };
static inline struct cxl_mem_command *cxl_mem_find_command(u16 opcode) @@ -1161,11 +1178,6 @@ free_maps: return ret; }
-static struct cxl_memdev *to_cxl_memdev(struct device *dev) -{ - return container_of(dev, struct cxl_memdev, dev); -} - static void cxl_memdev_release(struct device *dev) { struct cxl_memdev *cxlmd = to_cxl_memdev(dev); @@ -1281,24 +1293,22 @@ static const struct device_type cxl_memdev_type = { .groups = cxl_memdev_attribute_groups, };
-static void cxl_memdev_shutdown(struct cxl_memdev *cxlmd) -{ - down_write(&cxl_memdev_rwsem); - cxlmd->cxlm = NULL; - up_write(&cxl_memdev_rwsem); -} - static void cxl_memdev_unregister(void *_cxlmd) { struct cxl_memdev *cxlmd = _cxlmd; struct device *dev = &cxlmd->dev; + struct cdev *cdev = &cxlmd->cdev; + const struct cdevm_file_operations *cdevm_fops; + + cdevm_fops = container_of(cdev->ops, typeof(*cdevm_fops), fops); + cdevm_fops->shutdown(dev);
cdev_device_del(&cxlmd->cdev, dev); - cxl_memdev_shutdown(cxlmd); put_device(dev); }
-static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm) +static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm, + const struct file_operations *fops) { struct pci_dev *pdev = cxlm->pdev; struct cxl_memdev *cxlmd; @@ -1324,7 +1334,7 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm) device_set_pm_not_required(dev);
cdev = &cxlmd->cdev; - cdev_init(cdev, &cxl_memdev_fops); + cdev_init(cdev, fops); return cxlmd;
err: @@ -1332,15 +1342,16 @@ err: return ERR_PTR(rc); }
-static struct cxl_memdev *devm_cxl_add_memdev(struct device *host, - struct cxl_mem *cxlm) +static struct cxl_memdev * +devm_cxl_add_memdev(struct device *host, struct cxl_mem *cxlm, + const struct cdevm_file_operations *cdevm_fops) { struct cxl_memdev *cxlmd; struct device *dev; struct cdev *cdev; int rc;
- cxlmd = cxl_memdev_alloc(cxlm); + cxlmd = cxl_memdev_alloc(cxlm, &cdevm_fops->fops); if (IS_ERR(cxlmd)) return cxlmd;
@@ -1370,7 +1381,7 @@ err: * The cdev was briefly live, shutdown any ioctl operations that * saw that state. */ - cxl_memdev_shutdown(cxlmd); + cdevm_fops->shutdown(dev); put_device(dev); return ERR_PTR(rc); } @@ -1611,7 +1622,7 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id) if (rc) return rc;
- cxlmd = devm_cxl_add_memdev(&pdev->dev, cxlm); + cxlmd = devm_cxl_add_memdev(&pdev->dev, cxlm, &cxl_memdev_fops); if (IS_ERR(cxlmd)) return PTR_ERR(cxlmd);
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit bbac7a92a46f0876e588722ebe552ddfe6fd790f ]
Now that UML has PCI support, this driver must depend also on !UML since it pokes at X86_64 architecture internals that don't exist on ARCH=um.
Reported-by: Geert Uytterhoeven geert@linux-m68k.org Signed-off-by: Johannes Berg johannes.berg@intel.com Acked-by: Dave Jiang dave.jiang@intel.com Link: https://lore.kernel.org/r/20210809112409.a3a0974874d2.I2ffe3d11ed37f735da2f3... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig index f450e4231db7..4f70cf57471a 100644 --- a/drivers/dma/Kconfig +++ b/drivers/dma/Kconfig @@ -315,7 +315,7 @@ config INTEL_IDXD_PERFMON
config INTEL_IOATDMA tristate "Intel I/OAT DMA support" - depends on PCI && X86_64 + depends on PCI && X86_64 && !UML select DMA_ENGINE select DMA_ENGINE_RAID select DCA
From: Radhey Shyam Pandey radhey.shyam.pandey@xilinx.com
[ Upstream commit aac6c0f90799d66b8989be1e056408f33fd99fe6 ]
The xilinx dma driver uses the consistent allocations, so for correct operation also set the DMA mask for coherent APIs. It fixes the below kernel crash with dmatest client when DMA IP is configured with 64-bit address width and linux is booted from high (>4GB) memory.
Call trace: [ 489.531257] dma_alloc_from_pool+0x8c/0x1c0 [ 489.535431] dma_direct_alloc+0x284/0x330 [ 489.539432] dma_alloc_attrs+0x80/0xf0 [ 489.543174] dma_pool_alloc+0x160/0x2c0 [ 489.547003] xilinx_cdma_prep_memcpy+0xa4/0x180 [ 489.551524] dmatest_func+0x3cc/0x114c [ 489.555266] kthread+0x124/0x130 [ 489.558486] ret_from_fork+0x10/0x3c [ 489.562051] ---[ end trace 248625b2d596a90a ]---
Signed-off-by: Radhey Shyam Pandey radhey.shyam.pandey@xilinx.com Reviewed-by: Harini Katakam harini.katakam@xilinx.com Link: https://lore.kernel.org/r/1629363528-30347-1-git-send-email-radhey.shyam.pan... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/xilinx/xilinx_dma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/dma/xilinx/xilinx_dma.c b/drivers/dma/xilinx/xilinx_dma.c index 4b9530a7bf65..434b1ff22e31 100644 --- a/drivers/dma/xilinx/xilinx_dma.c +++ b/drivers/dma/xilinx/xilinx_dma.c @@ -3077,7 +3077,7 @@ static int xilinx_dma_probe(struct platform_device *pdev) xdev->ext_addr = false;
/* Set the dma mask bits */ - dma_set_mask(xdev->dev, DMA_BIT_MASK(addr_width)); + dma_set_mask_and_coherent(xdev->dev, DMA_BIT_MASK(addr_width));
/* Initialize the DMA engine */ xdev->common.dev = &pdev->dev;
From: Sven Schnelle svens@linux.ibm.com
[ Upstream commit 436fc4feeabbf103d78d50a8e091b3aac28cc37f ]
kmemleak with enabled auto scanning reports that our stack allocation is lost. This is because we're saving the pointer + STACK_INIT_OFFSET to lowcore. When kmemleak now scans the objects, it thinks that this one is lost because it can't find a corresponding pointer.
Reported-by: Marc Hartmayer mhartmay@linux.ibm.com Signed-off-by: Sven Schnelle svens@linux.ibm.com Tested-by: Marc Hartmayer mhartmay@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/setup.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index ee23908f1b96..6f0d2d4dea74 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -50,6 +50,7 @@ #include <linux/compat.h> #include <linux/start_kernel.h> #include <linux/hugetlb.h> +#include <linux/kmemleak.h>
#include <asm/boot_data.h> #include <asm/ipl.h> @@ -312,9 +313,12 @@ void *restart_stack; unsigned long stack_alloc(void) { #ifdef CONFIG_VMAP_STACK - return (unsigned long)__vmalloc_node(THREAD_SIZE, THREAD_SIZE, - THREADINFO_GFP, NUMA_NO_NODE, - __builtin_return_address(0)); + void *ret; + + ret = __vmalloc_node(THREAD_SIZE, THREAD_SIZE, THREADINFO_GFP, + NUMA_NO_NODE, __builtin_return_address(0)); + kmemleak_not_leak(ret); + return (unsigned long)ret; #else return __get_free_pages(GFP_KERNEL, THREAD_SIZE_ORDER); #endif
From: Kuninori Morimoto kuninori.morimoto.gx@renesas.com
[ Upstream commit 5f939f49771002f347039edf984aca42f30fc31a ]
commit 63f2f9cceb09f8 ("ASoC: audio-graph: remove Platform support") removed Platform support from audio-graph, because it doesn't have "plat" support on DT (simple-card has). But, Platform support is needed if user is using snd_dmaengine_pcm_register() which adds generic DMA as Platform. And this Platform dev is using CPU dev.
Without this patch, at least STM32MP15 audio sound card is no more functional (v5.13 or later). This patch respawn Platform Support on audio-graph again.
Reported-by: Olivier MOYSAN olivier.moysan@foss.st.com Signed-off-by: Kuninori Morimoto kuninori.morimoto.gx@renesas.com Tested-by: Olivier MOYSAN olivier.moysan@foss.st.com Link: https://lore.kernel.org/r/878s0jzrpf.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/generic/audio-graph-card.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/sound/soc/generic/audio-graph-card.c b/sound/soc/generic/audio-graph-card.c index 5e71382467e8..546f6fd0609e 100644 --- a/sound/soc/generic/audio-graph-card.c +++ b/sound/soc/generic/audio-graph-card.c @@ -285,6 +285,7 @@ static int graph_dai_link_of_dpcm(struct asoc_simple_priv *priv, if (li->cpu) { struct snd_soc_card *card = simple_priv_to_card(priv); struct snd_soc_dai_link_component *cpus = asoc_link_to_cpu(dai_link, 0); + struct snd_soc_dai_link_component *platforms = asoc_link_to_platform(dai_link, 0); int is_single_links = 0;
/* Codec is dummy */ @@ -313,6 +314,7 @@ static int graph_dai_link_of_dpcm(struct asoc_simple_priv *priv, dai_link->no_pcm = 1;
asoc_simple_canonicalize_cpu(cpus, is_single_links); + asoc_simple_canonicalize_platform(platforms, cpus); } else { struct snd_soc_codec_conf *cconf = simple_props_to_codec_conf(dai_props, 0); struct snd_soc_dai_link_component *codecs = asoc_link_to_codec(dai_link, 0); @@ -366,6 +368,7 @@ static int graph_dai_link_of(struct asoc_simple_priv *priv, struct snd_soc_dai_link *dai_link = simple_priv_to_link(priv, li->link); struct snd_soc_dai_link_component *cpus = asoc_link_to_cpu(dai_link, 0); struct snd_soc_dai_link_component *codecs = asoc_link_to_codec(dai_link, 0); + struct snd_soc_dai_link_component *platforms = asoc_link_to_platform(dai_link, 0); char dai_name[64]; int ret, is_single_links = 0;
@@ -383,6 +386,7 @@ static int graph_dai_link_of(struct asoc_simple_priv *priv, "%s-%s", cpus->dai_name, codecs->dai_name);
asoc_simple_canonicalize_cpu(cpus, is_single_links); + asoc_simple_canonicalize_platform(platforms, cpus);
ret = graph_link_init(priv, cpu_ep, codec_ep, li, dai_name); if (ret < 0) @@ -608,6 +612,7 @@ static int graph_count_noml(struct asoc_simple_priv *priv,
li->num[li->link].cpus = 1; li->num[li->link].codecs = 1; + li->num[li->link].platforms = 1;
li->link += 1; /* 1xCPU-Codec */
@@ -630,6 +635,7 @@ static int graph_count_dpcm(struct asoc_simple_priv *priv,
if (li->cpu) { li->num[li->link].cpus = 1; + li->num[li->link].platforms = 1;
li->link++; /* 1xCPU-dummy */ } else {
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit fa209644a7124b3f4cf811ced55daef49ae39ac6 ]
It was reported that on "HP ENVY x360" that power LED does not come back, certain keys like brightness controls do not work, and the fan never spins up, even under load on 5.14 final.
In analysis of the SSDT it's clear that the Microsoft UUID doesn't provide functional support, but rather the AMD UUID should be supporting this system.
Because this is a gap in the expected logic, we checked back with internal team. The conclusion was that on Windows AMD uPEP *does* run even when Microsoft UUID present, but most OEM systems have adopted value of "0x3" for supported functions and hence nothing runs.
Henceforth add support for running both Microsoft and AMD methods. This approach will also allow the same logic on Intel systems if desired at a future time as well by pulling the evaluation of `lps0_dsm_func_mask_microsoft` out of the `if` block for `acpi_s2idle_vendor_amd`.
Link: https://gitlab.freedesktop.org/drm/amd/uploads/9fbcd7ec3a385cc6949c9bacf45dc... BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1691 Reported-by: Maxwell Beck max@ryt.one Signed-off-by: Mario Limonciello mario.limonciello@amd.com [ rjw: Edits of the new comments ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/x86/s2idle.c | 67 +++++++++++++++++++++++---------------- 1 file changed, 39 insertions(+), 28 deletions(-)
diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c index 3a308461246a..bd92b549fd5a 100644 --- a/drivers/acpi/x86/s2idle.c +++ b/drivers/acpi/x86/s2idle.c @@ -449,25 +449,30 @@ int acpi_s2idle_prepare_late(void) if (pm_debug_messages_on) lpi_check_constraints();
- if (lps0_dsm_func_mask_microsoft > 0) { + /* Screen off */ + if (lps0_dsm_func_mask > 0) + acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ? + ACPI_LPS0_SCREEN_OFF_AMD : + ACPI_LPS0_SCREEN_OFF, + lps0_dsm_func_mask, lps0_dsm_guid); + + if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF, lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); - acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_ENTRY, - lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); + + /* LPS0 entry */ + if (lps0_dsm_func_mask > 0) + acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ? + ACPI_LPS0_ENTRY_AMD : + ACPI_LPS0_ENTRY, + lps0_dsm_func_mask, lps0_dsm_guid); + if (lps0_dsm_func_mask_microsoft > 0) { acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY, lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); - } else if (acpi_s2idle_vendor_amd()) { - acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF_AMD, - lps0_dsm_func_mask, lps0_dsm_guid); - acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY_AMD, - lps0_dsm_func_mask, lps0_dsm_guid); - } else { - acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_OFF, - lps0_dsm_func_mask, lps0_dsm_guid); - acpi_sleep_run_lps0_dsm(ACPI_LPS0_ENTRY, - lps0_dsm_func_mask, lps0_dsm_guid); + /* modern standby entry */ + acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_ENTRY, + lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); } - return 0; }
@@ -476,24 +481,30 @@ void acpi_s2idle_restore_early(void) if (!lps0_device_handle || sleep_no_lps0) return;
- if (lps0_dsm_func_mask_microsoft > 0) { - acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT, - lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); + /* Modern standby exit */ + if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_MS_EXIT, lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); - acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON, - lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); - } else if (acpi_s2idle_vendor_amd()) { - acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT_AMD, - lps0_dsm_func_mask, lps0_dsm_guid); - acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON_AMD, - lps0_dsm_func_mask, lps0_dsm_guid); - } else { + + /* LPS0 exit */ + if (lps0_dsm_func_mask > 0) + acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ? + ACPI_LPS0_EXIT_AMD : + ACPI_LPS0_EXIT, + lps0_dsm_func_mask, lps0_dsm_guid); + if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_EXIT, - lps0_dsm_func_mask, lps0_dsm_guid); + lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); + + /* Screen on */ + if (lps0_dsm_func_mask_microsoft > 0) acpi_sleep_run_lps0_dsm(ACPI_LPS0_SCREEN_ON, - lps0_dsm_func_mask, lps0_dsm_guid); - } + lps0_dsm_func_mask_microsoft, lps0_dsm_guid_microsoft); + if (lps0_dsm_func_mask > 0) + acpi_sleep_run_lps0_dsm(acpi_s2idle_vendor_amd() ? + ACPI_LPS0_SCREEN_ON_AMD : + ACPI_LPS0_SCREEN_ON, + lps0_dsm_func_mask, lps0_dsm_guid); }
static const struct platform_s2idle_ops acpi_s2idle_ops_lps0 = {
From: Jeff Layton jlayton@kernel.org
[ Upstream commit 2ad32cf09bd28a21e6ad1595355a023ed631b529 ]
If we hit a decoding error late in the frame, then we might exit the function without putting the pool_ns string. Ensure that we always put that reference on the way out of the function.
Signed-off-by: Jeff Layton jlayton@kernel.org Reviewed-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/caps.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index ba562efdf07b..1f3d67133958 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -4137,8 +4137,9 @@ void ceph_handle_caps(struct ceph_mds_session *session, done: mutex_unlock(&session->s_mutex); done_unlocked: - ceph_put_string(extra_info.pool_ns); iput(inode); +out: + ceph_put_string(extra_info.pool_ns); return;
flush_cap_releases: @@ -4153,7 +4154,7 @@ flush_cap_releases: bad: pr_err("ceph_handle_caps: corrupt message\n"); ceph_msg_dump(msg); - return; + goto out; }
/*
From: Jeff Layton jlayton@kernel.org
[ Upstream commit b11ed50346683a749632ea664959b28d524d7395 ]
The current code will update the mtime and then try to get caps to handle the write. If we end up having to request caps from the MDS, then the mtime in the cap grant will clobber the updated mtime and it'll be lost.
This is most noticable when two clients are alternately writing to the same file. Fw caps are continually being granted and revoked, and the mtime ends up stuck because the updated mtimes are always being overwritten with the old one.
Fix this by changing the order of operations in ceph_write_iter to get the caps before updating the times. Also, make sure we check the pool full conditions before even getting any caps or uninlining.
URL: https://tracker.ceph.com/issues/46574 Reported-by: Jozef Kováč kovac@firma.zoznam.sk Signed-off-by: Jeff Layton jlayton@kernel.org Reviewed-by: Xiubo Li xiubli@redhat.com Reviewed-by: Luis Henriques lhenriques@suse.de Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/file.c | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-)
diff --git a/fs/ceph/file.c b/fs/ceph/file.c index d1755ac1d964..3daebfaec8c6 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -1722,32 +1722,26 @@ retry_snap: goto out; }
- err = file_remove_privs(file); - if (err) + down_read(&osdc->lock); + map_flags = osdc->osdmap->flags; + pool_flags = ceph_pg_pool_flags(osdc->osdmap, ci->i_layout.pool_id); + up_read(&osdc->lock); + if ((map_flags & CEPH_OSDMAP_FULL) || + (pool_flags & CEPH_POOL_FLAG_FULL)) { + err = -ENOSPC; goto out; + }
- err = file_update_time(file); + err = file_remove_privs(file); if (err) goto out;
- inode_inc_iversion_raw(inode); - if (ci->i_inline_version != CEPH_INLINE_NONE) { err = ceph_uninline_data(file, NULL); if (err < 0) goto out; }
- down_read(&osdc->lock); - map_flags = osdc->osdmap->flags; - pool_flags = ceph_pg_pool_flags(osdc->osdmap, ci->i_layout.pool_id); - up_read(&osdc->lock); - if ((map_flags & CEPH_OSDMAP_FULL) || - (pool_flags & CEPH_POOL_FLAG_FULL)) { - err = -ENOSPC; - goto out; - } - dout("aio_write %p %llx.%llx %llu~%zd getting caps. i_size %llu\n", inode, ceph_vinop(inode), pos, count, i_size_read(inode)); if (fi->fmode & CEPH_FILE_MODE_LAZY) @@ -1759,6 +1753,12 @@ retry_snap: if (err < 0) goto out;
+ err = file_update_time(file); + if (err) + goto out_caps; + + inode_inc_iversion_raw(inode); + dout("aio_write %p %llx.%llx %llu~%zd got cap refs on %s\n", inode, ceph_vinop(inode), pos, count, ceph_cap_string(got));
@@ -1842,6 +1842,8 @@ retry_snap: }
goto out_unlocked; +out_caps: + ceph_put_cap_refs(ci, got); out: if (direct_lock) ceph_end_io_direct(inode);
From: Xiubo Li xiubli@redhat.com
[ Upstream commit a6d37ccdd240e80f26aaea0e62cda310e0e184d7 ]
capsnaps will take inode references via ihold when queueing to flush. When force unmounting, the client will just close the sessions and may never get a flush reply, causing a leak and inode ref leak.
Fix this by removing the capsnaps for an inode when removing the caps.
URL: https://tracker.ceph.com/issues/52295 Signed-off-by: Xiubo Li xiubli@redhat.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/caps.c | 68 +++++++++++++++++++++++++++++++++----------- fs/ceph/mds_client.c | 31 +++++++++++++++++++- fs/ceph/super.h | 6 ++++ 3 files changed, 87 insertions(+), 18 deletions(-)
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 1f3d67133958..c619a926dc18 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -3117,7 +3117,16 @@ void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr, break; } } - BUG_ON(!found); + + if (!found) { + /* + * The capsnap should already be removed when removing + * auth cap in the case of a forced unmount. + */ + WARN_ON_ONCE(ci->i_auth_cap); + goto unlock; + } + capsnap->dirty_pages -= nr; if (capsnap->dirty_pages == 0) { complete_capsnap = true; @@ -3139,6 +3148,7 @@ void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr, complete_capsnap ? " (complete capsnap)" : ""); }
+unlock: spin_unlock(&ci->i_ceph_lock);
if (last) { @@ -3609,6 +3619,43 @@ out: iput(inode); }
+void __ceph_remove_capsnap(struct inode *inode, struct ceph_cap_snap *capsnap, + bool *wake_ci, bool *wake_mdsc) +{ + struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_mds_client *mdsc = ceph_sb_to_client(inode->i_sb)->mdsc; + bool ret; + + lockdep_assert_held(&ci->i_ceph_lock); + + dout("removing capsnap %p, inode %p ci %p\n", capsnap, inode, ci); + + list_del_init(&capsnap->ci_item); + ret = __detach_cap_flush_from_ci(ci, &capsnap->cap_flush); + if (wake_ci) + *wake_ci = ret; + + spin_lock(&mdsc->cap_dirty_lock); + if (list_empty(&ci->i_cap_flush_list)) + list_del_init(&ci->i_flushing_item); + + ret = __detach_cap_flush_from_mdsc(mdsc, &capsnap->cap_flush); + if (wake_mdsc) + *wake_mdsc = ret; + spin_unlock(&mdsc->cap_dirty_lock); +} + +void ceph_remove_capsnap(struct inode *inode, struct ceph_cap_snap *capsnap, + bool *wake_ci, bool *wake_mdsc) +{ + struct ceph_inode_info *ci = ceph_inode(inode); + + lockdep_assert_held(&ci->i_ceph_lock); + + WARN_ON_ONCE(capsnap->dirty_pages || capsnap->writing); + __ceph_remove_capsnap(inode, capsnap, wake_ci, wake_mdsc); +} + /* * Handle FLUSHSNAP_ACK. MDS has flushed snap data to disk and we can * throw away our cap_snap. @@ -3646,23 +3693,10 @@ static void handle_cap_flushsnap_ack(struct inode *inode, u64 flush_tid, capsnap, capsnap->follows); } } - if (flushed) { - WARN_ON(capsnap->dirty_pages || capsnap->writing); - dout(" removing %p cap_snap %p follows %lld\n", - inode, capsnap, follows); - list_del(&capsnap->ci_item); - wake_ci |= __detach_cap_flush_from_ci(ci, &capsnap->cap_flush); - - spin_lock(&mdsc->cap_dirty_lock); - - if (list_empty(&ci->i_cap_flush_list)) - list_del_init(&ci->i_flushing_item); - - wake_mdsc |= __detach_cap_flush_from_mdsc(mdsc, - &capsnap->cap_flush); - spin_unlock(&mdsc->cap_dirty_lock); - } + if (flushed) + ceph_remove_capsnap(inode, capsnap, &wake_ci, &wake_mdsc); spin_unlock(&ci->i_ceph_lock); + if (flushed) { ceph_put_snap_context(capsnap->context); ceph_put_cap_snap(capsnap); diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index dc98de0acd28..52b3ddc5f199 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -1583,14 +1583,39 @@ out: return ret; }
+static int remove_capsnaps(struct ceph_mds_client *mdsc, struct inode *inode) +{ + struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_cap_snap *capsnap; + int capsnap_release = 0; + + lockdep_assert_held(&ci->i_ceph_lock); + + dout("removing capsnaps, ci is %p, inode is %p\n", ci, inode); + + while (!list_empty(&ci->i_cap_snaps)) { + capsnap = list_first_entry(&ci->i_cap_snaps, + struct ceph_cap_snap, ci_item); + __ceph_remove_capsnap(inode, capsnap, NULL, NULL); + ceph_put_snap_context(capsnap->context); + ceph_put_cap_snap(capsnap); + capsnap_release++; + } + wake_up_all(&ci->i_cap_wq); + wake_up_all(&mdsc->cap_flushing_wq); + return capsnap_release; +} + static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, void *arg) { struct ceph_fs_client *fsc = (struct ceph_fs_client *)arg; + struct ceph_mds_client *mdsc = fsc->mdsc; struct ceph_inode_info *ci = ceph_inode(inode); LIST_HEAD(to_remove); bool dirty_dropped = false; bool invalidate = false; + int capsnap_release = 0;
dout("removing cap %p, ci is %p, inode is %p\n", cap, ci, &ci->vfs_inode); @@ -1598,7 +1623,6 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, __ceph_remove_cap(cap, false); if (!ci->i_auth_cap) { struct ceph_cap_flush *cf; - struct ceph_mds_client *mdsc = fsc->mdsc;
if (READ_ONCE(fsc->mount_state) >= CEPH_MOUNT_SHUTDOWN) { if (inode->i_data.nrpages > 0) @@ -1662,6 +1686,9 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, list_add(&ci->i_prealloc_cap_flush->i_list, &to_remove); ci->i_prealloc_cap_flush = NULL; } + + if (!list_empty(&ci->i_cap_snaps)) + capsnap_release = remove_capsnaps(mdsc, inode); } spin_unlock(&ci->i_ceph_lock); while (!list_empty(&to_remove)) { @@ -1678,6 +1705,8 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, ceph_queue_invalidate(inode); if (dirty_dropped) iput(inode); + while (capsnap_release--) + iput(inode); return 0; }
diff --git a/fs/ceph/super.h b/fs/ceph/super.h index b1a363641beb..2200ed76b123 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -1163,6 +1163,12 @@ extern void ceph_put_cap_refs_no_check_caps(struct ceph_inode_info *ci, int had); extern void ceph_put_wrbuffer_cap_refs(struct ceph_inode_info *ci, int nr, struct ceph_snap_context *snapc); +extern void __ceph_remove_capsnap(struct inode *inode, + struct ceph_cap_snap *capsnap, + bool *wake_ci, bool *wake_mdsc); +extern void ceph_remove_capsnap(struct inode *inode, + struct ceph_cap_snap *capsnap, + bool *wake_ci, bool *wake_mdsc); extern void ceph_flush_snaps(struct ceph_inode_info *ci, struct ceph_mds_session **psession); extern bool __ceph_should_report_size(struct ceph_inode_info *ci);
From: Jeff Layton jlayton@kernel.org
[ Upstream commit 3eaf5aa1cfa8c97c72f5824e2e9263d6cc977b03 ]
Signed-off-by: Jeff Layton jlayton@kernel.org Reviewed-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/caps.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index c619a926dc18..3296a93be907 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -1859,6 +1859,8 @@ static u64 __mark_caps_flushing(struct inode *inode, * try to invalidate mapping pages without blocking. */ static int try_nonblocking_invalidate(struct inode *inode) + __releases(ci->i_ceph_lock) + __acquires(ci->i_ceph_lock) { struct ceph_inode_info *ci = ceph_inode(inode); u32 invalidating_gen = ci->i_rdcache_gen;
From: Vasily Gorbik gor@linux.ibm.com
[ Upstream commit 88b604263f3d6eedae0b1c2c3bbd602d1e2e8775 ]
current_stack_pointer() simply returns current value of %r15. If current_stack_pointer() caller allocates stack (which is the case in unwind code) %r15 points to a stack frame allocated for callees, meaning current_stack_pointer() caller (e.g. stack_trace_save) will end up in the stacktrace. This is not expected by stack_trace_save*() callers and causes problems.
current_frame_address() on the other hand returns function stack frame address, which matches %r15 upon function invocation. Using it in get_stack_pointer() makes it more aligned with x86 implementation (according to BACKTRACE_SELF_TEST output) and meets stack_trace_save*() caller's expectations, notably KCSAN.
Also make sure unwind_start is always inlined.
Reported-by: Nathan Chancellor nathan@kernel.org Suggested-by: Marco Elver elver@google.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Tested-by: Marco Elver elver@google.com Tested-by: Nathan Chancellor nathan@kernel.org Link: https://lore.kernel.org/r/patch.git-04dd26be3043.your-ad-here.call-016305048... Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/include/asm/stacktrace.h | 20 ++++++++++---------- arch/s390/include/asm/unwind.h | 8 ++++---- 2 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/s390/include/asm/stacktrace.h b/arch/s390/include/asm/stacktrace.h index 3d8a4b94c620..dd00d98804ec 100644 --- a/arch/s390/include/asm/stacktrace.h +++ b/arch/s390/include/asm/stacktrace.h @@ -34,16 +34,6 @@ static inline bool on_stack(struct stack_info *info, return addr >= info->begin && addr + len <= info->end; }
-static __always_inline unsigned long get_stack_pointer(struct task_struct *task, - struct pt_regs *regs) -{ - if (regs) - return (unsigned long) kernel_stack_pointer(regs); - if (task == current) - return current_stack_pointer(); - return (unsigned long) task->thread.ksp; -} - /* * Stack layout of a C stack frame. */ @@ -74,6 +64,16 @@ struct stack_frame { ((unsigned long)__builtin_frame_address(0) - \ offsetof(struct stack_frame, back_chain))
+static __always_inline unsigned long get_stack_pointer(struct task_struct *task, + struct pt_regs *regs) +{ + if (regs) + return (unsigned long)kernel_stack_pointer(regs); + if (task == current) + return current_frame_address(); + return (unsigned long)task->thread.ksp; +} + /* * To keep this simple mark register 2-6 as being changed (volatile) * by the called function, even though register 6 is saved/nonvolatile. diff --git a/arch/s390/include/asm/unwind.h b/arch/s390/include/asm/unwind.h index de9006b0cfeb..5ebf534ef753 100644 --- a/arch/s390/include/asm/unwind.h +++ b/arch/s390/include/asm/unwind.h @@ -55,10 +55,10 @@ static inline bool unwind_error(struct unwind_state *state) return state->error; }
-static inline void unwind_start(struct unwind_state *state, - struct task_struct *task, - struct pt_regs *regs, - unsigned long first_frame) +static __always_inline void unwind_start(struct unwind_state *state, + struct task_struct *task, + struct pt_regs *regs, + unsigned long first_frame) { task = task ?: current; first_frame = first_frame ?: get_stack_pointer(task, regs);
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 8f96a5bfa1503e0a5f3c78d51e993a1794d4aff1 ]
We update the ctime/mtime of a block device when we remove it so that blkid knows the device changed. However we do this by re-opening the block device and calling filp_update_time. This is more correct because it'll call the inode->i_op->update_time if it exists, but the block dev inodes do not do this. Instead call generic_update_time() on the bd_inode in order to avoid the blkdev_open path and get rid of the following lockdep splat:
====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc2+ #406 Not tainted ------------------------------------------------------ losetup/11596 is trying to acquire lock: ffff939640d2f538 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x67/0x5e0
but task is already holding lock: ffff939655510c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&lo->lo_mutex){+.+.}-{3:3}: __mutex_lock+0x7d/0x750 lo_open+0x28/0x60 [loop] blkdev_get_whole+0x25/0xf0 blkdev_get_by_dev.part.0+0x168/0x3c0 blkdev_open+0xd2/0xe0 do_dentry_open+0x161/0x390 path_openat+0x3cc/0xa20 do_filp_open+0x96/0x120 do_sys_openat2+0x7b/0x130 __x64_sys_openat+0x46/0x70 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (&disk->open_mutex){+.+.}-{3:3}: __mutex_lock+0x7d/0x750 blkdev_get_by_dev.part.0+0x56/0x3c0 blkdev_open+0xd2/0xe0 do_dentry_open+0x161/0x390 path_openat+0x3cc/0xa20 do_filp_open+0x96/0x120 file_open_name+0xc7/0x170 filp_open+0x2c/0x50 btrfs_scratch_superblocks.part.0+0x10f/0x170 btrfs_rm_device.cold+0xe8/0xed btrfs_ioctl+0x2a31/0x2e70 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#12){.+.+}-{0:0}: lo_write_bvec+0xc2/0x240 [loop] loop_process_work+0x238/0xd00 [loop] process_one_work+0x26b/0x560 worker_thread+0x55/0x3c0 kthread+0x140/0x160 ret_from_fork+0x1f/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}: process_one_work+0x245/0x560 worker_thread+0x55/0x3c0 kthread+0x140/0x160 ret_from_fork+0x1f/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}: __lock_acquire+0x10ea/0x1d90 lock_acquire+0xb5/0x2b0 flush_workqueue+0x91/0x5e0 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x660 [loop] block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of: (wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&lo->lo_mutex); lock(&disk->open_mutex); lock(&lo->lo_mutex); lock((wq_completion)loop0);
*** DEADLOCK ***
1 lock held by losetup/11596: #0: ffff939655510c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
stack backtrace: CPU: 1 PID: 11596 Comm: losetup Not tainted 5.14.0-rc2+ #406 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Call Trace: dump_stack_lvl+0x57/0x72 check_noncircular+0xcf/0xf0 ? stack_trace_save+0x3b/0x50 __lock_acquire+0x10ea/0x1d90 lock_acquire+0xb5/0x2b0 ? flush_workqueue+0x67/0x5e0 ? lockdep_init_map_type+0x47/0x220 flush_workqueue+0x91/0x5e0 ? flush_workqueue+0x67/0x5e0 ? verify_cpu+0xf0/0x100 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x660 [loop] ? blkdev_ioctl+0x8d/0x2a0 block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 10dd2d210b0f..d4e0c4aa6d4d 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1928,15 +1928,17 @@ out: * Function to update ctime/mtime for a given device path. * Mainly used for ctime/mtime based probe like libblkid. */ -static void update_dev_time(const char *path_name) +static void update_dev_time(struct block_device *bdev) { - struct file *filp; + struct inode *inode = bdev->bd_inode; + struct timespec64 now;
- filp = filp_open(path_name, O_RDWR, 0); - if (IS_ERR(filp)) + /* Shouldn't happen but just in case. */ + if (!inode) return; - file_update_time(filp); - filp_close(filp, NULL); + + now = current_time(inode); + generic_update_time(inode, &now, S_MTIME | S_CTIME); }
static int btrfs_rm_dev_item(struct btrfs_device *device) @@ -2116,7 +2118,7 @@ void btrfs_scratch_superblocks(struct btrfs_fs_info *fs_info, btrfs_kobject_uevent(bdev, KOBJ_CHANGE);
/* Update ctime/mtime for device path for libblkid */ - update_dev_time(device_path); + update_dev_time(bdev); }
int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path, @@ -2769,7 +2771,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path btrfs_forget_devices(device_path);
/* Update ctime/mtime for blkid or udev */ - update_dev_time(device_path); + update_dev_time(bdev);
return ret;
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 3fa421dedbc82f985f030c5a6480ea2d784334c3 ]
When removing the device we call blkdev_put() on the device once we've removed it, and because we have an EXCL open we need to take the ->open_mutex on the block device to clean it up. Unfortunately during device remove we are holding the sb writers lock, which results in the following lockdep splat:
====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc2+ #407 Not tainted ------------------------------------------------------ losetup/11595 is trying to acquire lock: ffff973ac35dd138 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x67/0x5e0
but task is already holding lock: ffff973ac9812c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&lo->lo_mutex){+.+.}-{3:3}: __mutex_lock+0x7d/0x750 lo_open+0x28/0x60 [loop] blkdev_get_whole+0x25/0xf0 blkdev_get_by_dev.part.0+0x168/0x3c0 blkdev_open+0xd2/0xe0 do_dentry_open+0x161/0x390 path_openat+0x3cc/0xa20 do_filp_open+0x96/0x120 do_sys_openat2+0x7b/0x130 __x64_sys_openat+0x46/0x70 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (&disk->open_mutex){+.+.}-{3:3}: __mutex_lock+0x7d/0x750 blkdev_put+0x3a/0x220 btrfs_rm_device.cold+0x62/0xe5 btrfs_ioctl+0x2a31/0x2e70 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#12){.+.+}-{0:0}: lo_write_bvec+0xc2/0x240 [loop] loop_process_work+0x238/0xd00 [loop] process_one_work+0x26b/0x560 worker_thread+0x55/0x3c0 kthread+0x140/0x160 ret_from_fork+0x1f/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}: process_one_work+0x245/0x560 worker_thread+0x55/0x3c0 kthread+0x140/0x160 ret_from_fork+0x1f/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}: __lock_acquire+0x10ea/0x1d90 lock_acquire+0xb5/0x2b0 flush_workqueue+0x91/0x5e0 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x660 [loop] block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of: (wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&lo->lo_mutex); lock(&disk->open_mutex); lock(&lo->lo_mutex); lock((wq_completion)loop0);
*** DEADLOCK ***
1 lock held by losetup/11595: #0: ffff973ac9812c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
stack backtrace: CPU: 0 PID: 11595 Comm: losetup Not tainted 5.14.0-rc2+ #407 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Call Trace: dump_stack_lvl+0x57/0x72 check_noncircular+0xcf/0xf0 ? stack_trace_save+0x3b/0x50 __lock_acquire+0x10ea/0x1d90 lock_acquire+0xb5/0x2b0 ? flush_workqueue+0x67/0x5e0 ? lockdep_init_map_type+0x47/0x220 flush_workqueue+0x91/0x5e0 ? flush_workqueue+0x67/0x5e0 ? verify_cpu+0xf0/0x100 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x660 [loop] ? blkdev_ioctl+0x8d/0x2a0 block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fc21255d4cb
So instead save the bdev and do the put once we've dropped the sb writers lock in order to avoid the lockdep recursion.
Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/ioctl.c | 15 +++++++++++---- fs/btrfs/volumes.c | 23 +++++++++++++++++------ fs/btrfs/volumes.h | 3 ++- 3 files changed, 30 insertions(+), 11 deletions(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 0ba98e08a029..50e12989e84a 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3205,6 +3205,8 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg) struct inode *inode = file_inode(file); struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_ioctl_vol_args_v2 *vol_args; + struct block_device *bdev = NULL; + fmode_t mode; int ret; bool cancel = false;
@@ -3237,9 +3239,9 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg) /* Exclusive operation is now claimed */
if (vol_args->flags & BTRFS_DEVICE_SPEC_BY_ID) - ret = btrfs_rm_device(fs_info, NULL, vol_args->devid); + ret = btrfs_rm_device(fs_info, NULL, vol_args->devid, &bdev, &mode); else - ret = btrfs_rm_device(fs_info, vol_args->name, 0); + ret = btrfs_rm_device(fs_info, vol_args->name, 0, &bdev, &mode);
btrfs_exclop_finish(fs_info);
@@ -3255,6 +3257,8 @@ out: kfree(vol_args); err_drop: mnt_drop_write_file(file); + if (bdev) + blkdev_put(bdev, mode); return ret; }
@@ -3263,6 +3267,8 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg) struct inode *inode = file_inode(file); struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_ioctl_vol_args *vol_args; + struct block_device *bdev = NULL; + fmode_t mode; int ret; bool cancel;
@@ -3284,7 +3290,7 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg) ret = exclop_start_or_cancel_reloc(fs_info, BTRFS_EXCLOP_DEV_REMOVE, cancel); if (ret == 0) { - ret = btrfs_rm_device(fs_info, vol_args->name, 0); + ret = btrfs_rm_device(fs_info, vol_args->name, 0, &bdev, &mode); if (!ret) btrfs_info(fs_info, "disk deleted %s", vol_args->name); btrfs_exclop_finish(fs_info); @@ -3293,7 +3299,8 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg) kfree(vol_args); out_drop_write: mnt_drop_write_file(file); - + if (bdev) + blkdev_put(bdev, mode); return ret; }
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index d4e0c4aa6d4d..d6ffb38a0869 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2122,7 +2122,7 @@ void btrfs_scratch_superblocks(struct btrfs_fs_info *fs_info, }
int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path, - u64 devid) + u64 devid, struct block_device **bdev, fmode_t *mode) { struct btrfs_device *device; struct btrfs_fs_devices *cur_devices; @@ -2236,15 +2236,26 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path, mutex_unlock(&fs_devices->device_list_mutex);
/* - * at this point, the device is zero sized and detached from - * the devices list. All that's left is to zero out the old - * supers and free the device. + * At this point, the device is zero sized and detached from the + * devices list. All that's left is to zero out the old supers and + * free the device. + * + * We cannot call btrfs_close_bdev() here because we're holding the sb + * write lock, and blkdev_put() will pull in the ->open_mutex on the + * block device and it's dependencies. Instead just flush the device + * and let the caller do the final blkdev_put. */ - if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) + if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) { btrfs_scratch_superblocks(fs_info, device->bdev, device->name->str); + if (device->bdev) { + sync_blockdev(device->bdev); + invalidate_bdev(device->bdev); + } + }
- btrfs_close_bdev(device); + *bdev = device->bdev; + *mode = device->mode; synchronize_rcu(); btrfs_free_device(device);
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 55a8ba244716..f77f869dfd2c 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -472,7 +472,8 @@ struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info, const u8 *uuid); void btrfs_free_device(struct btrfs_device *device); int btrfs_rm_device(struct btrfs_fs_info *fs_info, - const char *device_path, u64 devid); + const char *device_path, u64 devid, + struct block_device **bdev, fmode_t *mode); void __exit btrfs_cleanup_fs_uuids(void); int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len); int btrfs_grow_device(struct btrfs_trans_handle *trans,
From: Anand Jain anand.jain@oracle.com
[ Upstream commit c124706900c20dee70f921bb3a90492431561a0a ]
Following test case reproduces lockdep warning.
Test case:
$ mkfs.btrfs -f <dev1> $ btrfstune -S 1 <dev1> $ mount <dev1> <mnt> $ btrfs device add <dev2> <mnt> -f $ umount <mnt> $ mount <dev2> <mnt> $ umount <mnt>
The warning claims a possible ABBA deadlock between the threads initiated by [#1] btrfs device add and [#0] the mount.
[ 540.743122] WARNING: possible circular locking dependency detected [ 540.743129] 5.11.0-rc7+ #5 Not tainted [ 540.743135] ------------------------------------------------------ [ 540.743142] mount/2515 is trying to acquire lock: [ 540.743149] ffffa0c5544c2ce0 (&fs_devs->device_list_mutex){+.+.}-{4:4}, at: clone_fs_devices+0x6d/0x210 [btrfs] [ 540.743458] but task is already holding lock: [ 540.743461] ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, at: __btrfs_tree_read_lock+0x32/0x200 [btrfs] [ 540.743541] which lock already depends on the new lock. [ 540.743543] the existing dependency chain (in reverse order) is:
[ 540.743546] -> #1 (btrfs-chunk-00){++++}-{4:4}: [ 540.743566] down_read_nested+0x48/0x2b0 [ 540.743585] __btrfs_tree_read_lock+0x32/0x200 [btrfs] [ 540.743650] btrfs_read_lock_root_node+0x70/0x200 [btrfs] [ 540.743733] btrfs_search_slot+0x6c6/0xe00 [btrfs] [ 540.743785] btrfs_update_device+0x83/0x260 [btrfs] [ 540.743849] btrfs_finish_chunk_alloc+0x13f/0x660 [btrfs] <--- device_list_mutex [ 540.743911] btrfs_create_pending_block_groups+0x18d/0x3f0 [btrfs] [ 540.743982] btrfs_commit_transaction+0x86/0x1260 [btrfs] [ 540.744037] btrfs_init_new_device+0x1600/0x1dd0 [btrfs] [ 540.744101] btrfs_ioctl+0x1c77/0x24c0 [btrfs] [ 540.744166] __x64_sys_ioctl+0xe4/0x140 [ 540.744170] do_syscall_64+0x4b/0x80 [ 540.744174] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 540.744180] -> #0 (&fs_devs->device_list_mutex){+.+.}-{4:4}: [ 540.744184] __lock_acquire+0x155f/0x2360 [ 540.744188] lock_acquire+0x10b/0x5c0 [ 540.744190] __mutex_lock+0xb1/0xf80 [ 540.744193] mutex_lock_nested+0x27/0x30 [ 540.744196] clone_fs_devices+0x6d/0x210 [btrfs] [ 540.744270] btrfs_read_chunk_tree+0x3c7/0xbb0 [btrfs] [ 540.744336] open_ctree+0xf6e/0x2074 [btrfs] [ 540.744406] btrfs_mount_root.cold.72+0x16/0x127 [btrfs] [ 540.744472] legacy_get_tree+0x38/0x90 [ 540.744475] vfs_get_tree+0x30/0x140 [ 540.744478] fc_mount+0x16/0x60 [ 540.744482] vfs_kern_mount+0x91/0x100 [ 540.744484] btrfs_mount+0x1e6/0x670 [btrfs] [ 540.744536] legacy_get_tree+0x38/0x90 [ 540.744537] vfs_get_tree+0x30/0x140 [ 540.744539] path_mount+0x8d8/0x1070 [ 540.744541] do_mount+0x8d/0xc0 [ 540.744543] __x64_sys_mount+0x125/0x160 [ 540.744545] do_syscall_64+0x4b/0x80 [ 540.744547] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 540.744551] other info that might help us debug this: [ 540.744552] Possible unsafe locking scenario:
[ 540.744553] CPU0 CPU1 [ 540.744554] ---- ---- [ 540.744555] lock(btrfs-chunk-00); [ 540.744557] lock(&fs_devs->device_list_mutex); [ 540.744560] lock(btrfs-chunk-00); [ 540.744562] lock(&fs_devs->device_list_mutex); [ 540.744564] *** DEADLOCK ***
[ 540.744565] 3 locks held by mount/2515: [ 540.744567] #0: ffffa0c56bf7a0e0 (&type->s_umount_key#42/1){+.+.}-{4:4}, at: alloc_super.isra.16+0xdf/0x450 [ 540.744574] #1: ffffffffc05a9628 (uuid_mutex){+.+.}-{4:4}, at: btrfs_read_chunk_tree+0x63/0xbb0 [btrfs] [ 540.744640] #2: ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, at: __btrfs_tree_read_lock+0x32/0x200 [btrfs] [ 540.744708] stack backtrace: [ 540.744712] CPU: 2 PID: 2515 Comm: mount Not tainted 5.11.0-rc7+ #5
But the device_list_mutex in clone_fs_devices() is redundant, as explained below. Two threads [1] and [2] (below) could lead to clone_fs_device().
[1] open_ctree <== mount sprout fs btrfs_read_chunk_tree() mutex_lock(&uuid_mutex) <== global lock read_one_dev() open_seed_devices() clone_fs_devices() <== seed fs_devices mutex_lock(&orig->device_list_mutex) <== seed fs_devices
[2] btrfs_init_new_device() <== sprouting mutex_lock(&uuid_mutex); <== global lock btrfs_prepare_sprout() lockdep_assert_held(&uuid_mutex) clone_fs_devices(seed_fs_device) <== seed fs_devices
Both of these threads hold uuid_mutex which is sufficient to protect getting the seed device(s) freed while we are trying to clone it for sprouting [2] or mounting a sprout [1] (as above). A mounted seed device can not free/write/replace because it is read-only. An unmounted seed device can be freed by btrfs_free_stale_devices(), but it needs uuid_mutex. So this patch removes the unnecessary device_list_mutex in clone_fs_devices(). And adds a lockdep_assert_held(&uuid_mutex) in clone_fs_devices().
Reported-by: Su Yue l@damenly.su Tested-by: Su Yue l@damenly.su Signed-off-by: Anand Jain anand.jain@oracle.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index d6ffb38a0869..682416d4edef 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -570,6 +570,8 @@ static int btrfs_free_stale_devices(const char *path, struct btrfs_device *device, *tmp_device; int ret = 0;
+ lockdep_assert_held(&uuid_mutex); + if (path) ret = -ENOENT;
@@ -1000,11 +1002,12 @@ static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig) struct btrfs_device *orig_dev; int ret = 0;
+ lockdep_assert_held(&uuid_mutex); + fs_devices = alloc_fs_devices(orig->fsid, NULL); if (IS_ERR(fs_devices)) return fs_devices;
- mutex_lock(&orig->device_list_mutex); fs_devices->total_devices = orig->total_devices;
list_for_each_entry(orig_dev, &orig->devices, dev_list) { @@ -1036,10 +1039,8 @@ static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig) device->fs_devices = fs_devices; fs_devices->num_devices++; } - mutex_unlock(&orig->device_list_mutex); return fs_devices; error: - mutex_unlock(&orig->device_list_mutex); free_fs_devices(fs_devices); return ERR_PTR(ret); }
From: Nanyong Sun sunnanyong@huawei.com
[ Upstream commit 5f5dec07aca7067216ed4c1342e464e7307a9197 ]
Patch series "nilfs2: fix incorrect usage of kobject".
This patchset from Nanyong Sun fixes memory leak issues and a NULL pointer dereference issue caused by incorrect usage of kboject in nilfs2 sysfs implementation.
This patch (of 6):
Reported by syzkaller:
BUG: memory leak unreferenced object 0xffff888100ca8988 (size 8): comm "syz-executor.1", pid 1930, jiffies 4294745569 (age 18.052s) hex dump (first 8 bytes): 6c 6f 6f 70 31 00 ff ff loop1... backtrace: kstrdup+0x36/0x70 mm/util.c:60 kstrdup_const+0x35/0x60 mm/util.c:83 kvasprintf_const+0xf1/0x180 lib/kasprintf.c:48 kobject_set_name_vargs+0x56/0x150 lib/kobject.c:289 kobject_add_varg lib/kobject.c:384 [inline] kobject_init_and_add+0xc9/0x150 lib/kobject.c:473 nilfs_sysfs_create_device_group+0x150/0x7d0 fs/nilfs2/sysfs.c:986 init_nilfs+0xa21/0xea0 fs/nilfs2/the_nilfs.c:637 nilfs_fill_super fs/nilfs2/super.c:1046 [inline] nilfs_mount+0x7b4/0xe80 fs/nilfs2/super.c:1316 legacy_get_tree+0x105/0x210 fs/fs_context.c:592 vfs_get_tree+0x8e/0x2d0 fs/super.c:1498 do_new_mount fs/namespace.c:2905 [inline] path_mount+0xf9b/0x1990 fs/namespace.c:3235 do_mount+0xea/0x100 fs/namespace.c:3248 __do_sys_mount fs/namespace.c:3456 [inline] __se_sys_mount fs/namespace.c:3433 [inline] __x64_sys_mount+0x14b/0x1f0 fs/namespace.c:3433 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
If kobject_init_and_add return with error, then the cleanup of kobject is needed because memory may be allocated in kobject_init_and_add without freeing.
And the place of cleanup_dev_kobject should use kobject_put to free the memory associated with the kobject. As the section "Kobject removal" of "Documentation/core-api/kobject.rst" says, kobject_del() just makes the kobject "invisible", but it is not cleaned up. And no more cleanup will do after cleanup_dev_kobject, so kobject_put is needed here.
Link: https://lkml.kernel.org/r/1625651306-10829-1-git-send-email-konishi.ryusuke@... Link: https://lkml.kernel.org/r/1625651306-10829-2-git-send-email-konishi.ryusuke@... Reported-by: Hulk Robot hulkci@huawei.com Link: https://lkml.kernel.org/r/20210629022556.3985106-2-sunnanyong@huawei.com Signed-off-by: Nanyong Sun sunnanyong@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/sysfs.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/nilfs2/sysfs.c b/fs/nilfs2/sysfs.c index 68e8d61e28dd..d2d8ea89937a 100644 --- a/fs/nilfs2/sysfs.c +++ b/fs/nilfs2/sysfs.c @@ -986,7 +986,7 @@ int nilfs_sysfs_create_device_group(struct super_block *sb) err = kobject_init_and_add(&nilfs->ns_dev_kobj, &nilfs_dev_ktype, NULL, "%s", sb->s_id); if (err) - goto free_dev_subgroups; + goto cleanup_dev_kobject;
err = nilfs_sysfs_create_mounted_snapshots_group(nilfs); if (err) @@ -1023,9 +1023,7 @@ delete_mounted_snapshots_group: nilfs_sysfs_delete_mounted_snapshots_group(nilfs);
cleanup_dev_kobject: - kobject_del(&nilfs->ns_dev_kobj); - -free_dev_subgroups: + kobject_put(&nilfs->ns_dev_kobj); kfree(nilfs->ns_dev_subgroups);
failed_create_device_group:
From: Nanyong Sun sunnanyong@huawei.com
[ Upstream commit dbc6e7d44a514f231a64d9d5676e001b660b6448 ]
In nilfs_##name##_attr_release, kobj->parent should not be referenced because it is a NULL pointer. The release() method of kobject is always called in kobject_put(kobj), in the implementation of kobject_put(), the kobj->parent will be assigned as NULL before call the release() method. So just use kobj to get the subgroups, which is more efficient and can fix a NULL pointer reference problem.
Link: https://lkml.kernel.org/r/20210629022556.3985106-3-sunnanyong@huawei.com Link: https://lkml.kernel.org/r/1625651306-10829-3-git-send-email-konishi.ryusuke@... Signed-off-by: Nanyong Sun sunnanyong@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/sysfs.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/fs/nilfs2/sysfs.c b/fs/nilfs2/sysfs.c index d2d8ea89937a..ec85ac53720d 100644 --- a/fs/nilfs2/sysfs.c +++ b/fs/nilfs2/sysfs.c @@ -51,11 +51,9 @@ static const struct sysfs_ops nilfs_##name##_attr_ops = { \ #define NILFS_DEV_INT_GROUP_TYPE(name, parent_name) \ static void nilfs_##name##_attr_release(struct kobject *kobj) \ { \ - struct nilfs_sysfs_##parent_name##_subgroups *subgroups; \ - struct the_nilfs *nilfs = container_of(kobj->parent, \ - struct the_nilfs, \ - ns_##parent_name##_kobj); \ - subgroups = nilfs->ns_##parent_name##_subgroups; \ + struct nilfs_sysfs_##parent_name##_subgroups *subgroups = container_of(kobj, \ + struct nilfs_sysfs_##parent_name##_subgroups, \ + sg_##name##_kobj); \ complete(&subgroups->sg_##name##_kobj_unregister); \ } \ static struct kobj_type nilfs_##name##_ktype = { \
From: Nanyong Sun sunnanyong@huawei.com
[ Upstream commit 24f8cb1ed057c840728167dab33b32e44147c86f ]
If kobject_init_and_add return with error, kobject_put() is needed here to avoid memory leak, because kobject_init_and_add may return error without freeing the memory associated with the kobject it allocated.
Link: https://lkml.kernel.org/r/20210629022556.3985106-4-sunnanyong@huawei.com Link: https://lkml.kernel.org/r/1625651306-10829-4-git-send-email-konishi.ryusuke@... Signed-off-by: Nanyong Sun sunnanyong@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/sysfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/nilfs2/sysfs.c b/fs/nilfs2/sysfs.c index ec85ac53720d..6305e4ef7e39 100644 --- a/fs/nilfs2/sysfs.c +++ b/fs/nilfs2/sysfs.c @@ -79,8 +79,8 @@ static int nilfs_sysfs_create_##name##_group(struct the_nilfs *nilfs) \ err = kobject_init_and_add(kobj, &nilfs_##name##_ktype, parent, \ #name); \ if (err) \ - return err; \ - return 0; \ + kobject_put(kobj); \ + return err; \ } \ static void nilfs_sysfs_delete_##name##_group(struct the_nilfs *nilfs) \ { \
From: Nanyong Sun sunnanyong@huawei.com
[ Upstream commit a3e181259ddd61fd378390977a1e4e2316853afa ]
The kobject_put() should be used to cleanup the memory associated with the kobject instead of kobject_del. See the section "Kobject removal" of "Documentation/core-api/kobject.rst".
Link: https://lkml.kernel.org/r/20210629022556.3985106-5-sunnanyong@huawei.com Link: https://lkml.kernel.org/r/1625651306-10829-5-git-send-email-konishi.ryusuke@... Signed-off-by: Nanyong Sun sunnanyong@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nilfs2/sysfs.c b/fs/nilfs2/sysfs.c index 6305e4ef7e39..d989e6500bd7 100644 --- a/fs/nilfs2/sysfs.c +++ b/fs/nilfs2/sysfs.c @@ -84,7 +84,7 @@ static int nilfs_sysfs_create_##name##_group(struct the_nilfs *nilfs) \ } \ static void nilfs_sysfs_delete_##name##_group(struct the_nilfs *nilfs) \ { \ - kobject_del(&nilfs->ns_##parent_name##_subgroups->sg_##name##_kobj); \ + kobject_put(&nilfs->ns_##parent_name##_subgroups->sg_##name##_kobj); \ }
/************************************************************************
From: Nanyong Sun sunnanyong@huawei.com
[ Upstream commit b2fe39c248f3fa4bbb2a20759b4fdd83504190f7 ]
If kobject_init_and_add returns with error, kobject_put() is needed here to avoid memory leak, because kobject_init_and_add may return error without freeing the memory associated with the kobject it allocated.
Link: https://lkml.kernel.org/r/20210629022556.3985106-6-sunnanyong@huawei.com Link: https://lkml.kernel.org/r/1625651306-10829-6-git-send-email-konishi.ryusuke@... Signed-off-by: Nanyong Sun sunnanyong@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/sysfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/nilfs2/sysfs.c b/fs/nilfs2/sysfs.c index d989e6500bd7..5ba87573ad3b 100644 --- a/fs/nilfs2/sysfs.c +++ b/fs/nilfs2/sysfs.c @@ -195,9 +195,9 @@ int nilfs_sysfs_create_snapshot_group(struct nilfs_root *root) }
if (err) - return err; + kobject_put(&root->snapshot_kobj);
- return 0; + return err; }
void nilfs_sysfs_delete_snapshot_group(struct nilfs_root *root)
From: Nanyong Sun sunnanyong@huawei.com
[ Upstream commit 17243e1c3072b8417a5ebfc53065d0a87af7ca77 ]
kobject_put() should be used to cleanup the memory associated with the kobject instead of kobject_del(). See the section "Kobject removal" of "Documentation/core-api/kobject.rst".
Link: https://lkml.kernel.org/r/20210629022556.3985106-7-sunnanyong@huawei.com Link: https://lkml.kernel.org/r/1625651306-10829-7-git-send-email-konishi.ryusuke@... Signed-off-by: Nanyong Sun sunnanyong@huawei.com Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nilfs2/sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nilfs2/sysfs.c b/fs/nilfs2/sysfs.c index 5ba87573ad3b..62f8a7ac19c8 100644 --- a/fs/nilfs2/sysfs.c +++ b/fs/nilfs2/sysfs.c @@ -202,7 +202,7 @@ int nilfs_sysfs_create_snapshot_group(struct nilfs_root *root)
void nilfs_sysfs_delete_snapshot_group(struct nilfs_root *root) { - kobject_del(&root->snapshot_kobj); + kobject_put(&root->snapshot_kobj); }
/************************************************************************
From: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se
[ Upstream commit d3a2328e741bf6e9e6bda750e0a63832fa365a74 ]
The TSC id and number of TSC ids should be stored as unsigned int as they can't be negative. Fix the datatype of the loop counter 'i' and rcar_gen3_thermal_tsc.id to reflect this.
Signed-off-by: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Link: https://lore.kernel.org/r/20210804091818.2196806-3-niklas.soderlund+renesas@... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thermal/rcar_gen3_thermal.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/thermal/rcar_gen3_thermal.c b/drivers/thermal/rcar_gen3_thermal.c index fdf16aa34eb4..702696cf58b6 100644 --- a/drivers/thermal/rcar_gen3_thermal.c +++ b/drivers/thermal/rcar_gen3_thermal.c @@ -84,7 +84,7 @@ struct rcar_gen3_thermal_tsc { struct thermal_zone_device *zone; struct equation_coefs coef; int tj_t; - int id; /* thermal channel id */ + unsigned int id; /* thermal channel id */ };
struct rcar_gen3_thermal_priv { @@ -310,7 +310,8 @@ static int rcar_gen3_thermal_probe(struct platform_device *pdev) const int *ths_tj_1 = of_device_get_match_data(dev); struct resource *res; struct thermal_zone_device *zone; - int ret, i; + unsigned int i; + int ret;
/* default values if FUSEs are missing */ /* TODO: Read values from hardware on supported platforms */ @@ -376,7 +377,7 @@ static int rcar_gen3_thermal_probe(struct platform_device *pdev) if (ret < 0) goto error_unregister;
- dev_info(dev, "TSC%d: Loaded %d trip points\n", i, ret); + dev_info(dev, "TSC%u: Loaded %d trip points\n", i, ret); }
priv->num_tscs = i;
From: Tomer Tayar ttayar@habana.ai
[ Upstream commit 89aad770d692e4d2d9a604c1674e9dfa69421430 ]
In case of host-resident MMU, when the page tables pool is destroyed, its pointer is not nullified correctly. As a result, on a device fini which happens after a failing reset, the already destroyed pool is accessed, which leads to a kernel panic. The patch fixes the setting of the pool pointer to NULL.
Signed-off-by: Tomer Tayar ttayar@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/common/mmu/mmu_v1.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/misc/habanalabs/common/mmu/mmu_v1.c b/drivers/misc/habanalabs/common/mmu/mmu_v1.c index c5e93ff32586..0f536f79dd9c 100644 --- a/drivers/misc/habanalabs/common/mmu/mmu_v1.c +++ b/drivers/misc/habanalabs/common/mmu/mmu_v1.c @@ -470,13 +470,13 @@ static void hl_mmu_v1_fini(struct hl_device *hdev) if (!ZERO_OR_NULL_PTR(hdev->mmu_priv.hr.mmu_shadow_hop0)) { kvfree(hdev->mmu_priv.dr.mmu_shadow_hop0); gen_pool_destroy(hdev->mmu_priv.dr.mmu_pgt_pool); - }
- /* Make sure that if we arrive here again without init was called we - * won't cause kernel panic. This can happen for example if we fail - * during hard reset code at certain points - */ - hdev->mmu_priv.dr.mmu_shadow_hop0 = NULL; + /* Make sure that if we arrive here again without init was + * called we won't cause kernel panic. This can happen for + * example if we fail during hard reset code at certain points + */ + hdev->mmu_priv.dr.mmu_shadow_hop0 = NULL; + } }
/**
From: Koby Elbaz kelbaz@habana.ai
[ Upstream commit 8bb8b505761238be0d6a83dc41188867d65e5d4c ]
There is a scenario where an ongoing soft reset would race with an ongoing heartbeat routine, eventually causing heartbeat to fail and thus to escalate into a hard reset.
With this fix, soft-reset procedure will disable heartbeat CPU messages and flush the (ongoing) current one before continuing with reset code.
Signed-off-by: Koby Elbaz kelbaz@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/common/device.c | 53 +++++++++++++++----- drivers/misc/habanalabs/common/firmware_if.c | 18 +++++-- drivers/misc/habanalabs/common/habanalabs.h | 4 +- drivers/misc/habanalabs/common/hw_queue.c | 30 ++++------- 4 files changed, 67 insertions(+), 38 deletions(-)
diff --git a/drivers/misc/habanalabs/common/device.c b/drivers/misc/habanalabs/common/device.c index ff4cbde289c0..0a788a13f2c1 100644 --- a/drivers/misc/habanalabs/common/device.c +++ b/drivers/misc/habanalabs/common/device.c @@ -682,6 +682,44 @@ out: return rc; }
+static void take_release_locks(struct hl_device *hdev) +{ + /* Flush anyone that is inside the critical section of enqueue + * jobs to the H/W + */ + hdev->asic_funcs->hw_queues_lock(hdev); + hdev->asic_funcs->hw_queues_unlock(hdev); + + /* Flush processes that are sending message to CPU */ + mutex_lock(&hdev->send_cpu_message_lock); + mutex_unlock(&hdev->send_cpu_message_lock); + + /* Flush anyone that is inside device open */ + mutex_lock(&hdev->fpriv_list_lock); + mutex_unlock(&hdev->fpriv_list_lock); +} + +static void cleanup_resources(struct hl_device *hdev, bool hard_reset) +{ + if (hard_reset) + device_late_fini(hdev); + + /* + * Halt the engines and disable interrupts so we won't get any more + * completions from H/W and we won't have any accesses from the + * H/W to the host machine + */ + hdev->asic_funcs->halt_engines(hdev, hard_reset); + + /* Go over all the queues, release all CS and their jobs */ + hl_cs_rollback_all(hdev); + + /* Release all pending user interrupts, each pending user interrupt + * holds a reference to user context + */ + hl_release_pending_user_interrupts(hdev); +} + /* * hl_device_suspend - initiate device suspend * @@ -707,16 +745,7 @@ int hl_device_suspend(struct hl_device *hdev) /* This blocks all other stuff that is not blocked by in_reset */ hdev->disabled = true;
- /* - * Flush anyone that is inside the critical section of enqueue - * jobs to the H/W - */ - hdev->asic_funcs->hw_queues_lock(hdev); - hdev->asic_funcs->hw_queues_unlock(hdev); - - /* Flush processes that are sending message to CPU */ - mutex_lock(&hdev->send_cpu_message_lock); - mutex_unlock(&hdev->send_cpu_message_lock); + take_release_locks(hdev);
rc = hdev->asic_funcs->suspend(hdev); if (rc) @@ -894,8 +923,8 @@ int hl_device_reset(struct hl_device *hdev, u32 flags) return 0; }
- hard_reset = (flags & HL_RESET_HARD) != 0; - from_hard_reset_thread = (flags & HL_RESET_FROM_RESET_THREAD) != 0; + hard_reset = !!(flags & HL_RESET_HARD); + from_hard_reset_thread = !!(flags & HL_RESET_FROM_RESET_THREAD);
if (!hard_reset && !hdev->supports_soft_reset) { hard_instead_soft = true; diff --git a/drivers/misc/habanalabs/common/firmware_if.c b/drivers/misc/habanalabs/common/firmware_if.c index 2e4d04ec6b53..653e8f5ef6ac 100644 --- a/drivers/misc/habanalabs/common/firmware_if.c +++ b/drivers/misc/habanalabs/common/firmware_if.c @@ -240,11 +240,15 @@ int hl_fw_send_cpu_message(struct hl_device *hdev, u32 hw_queue_id, u32 *msg, /* set fence to a non valid value */ pkt->fence = cpu_to_le32(UINT_MAX);
- rc = hl_hw_queue_send_cb_no_cmpl(hdev, hw_queue_id, len, pkt_dma_addr); - if (rc) { - dev_err(hdev->dev, "Failed to send CB on CPU PQ (%d)\n", rc); - goto out; - } + /* + * The CPU queue is a synchronous queue with an effective depth of + * a single entry (although it is allocated with room for multiple + * entries). We lock on it using 'send_cpu_message_lock' which + * serializes accesses to the CPU queue. + * Which means that we don't need to lock the access to the entire H/W + * queues module when submitting a JOB to the CPU queue. + */ + hl_hw_queue_submit_bd(hdev, queue, 0, len, pkt_dma_addr);
if (prop->fw_app_cpu_boot_dev_sts0 & CPU_BOOT_DEV_STS0_PKT_PI_ACK_EN) expected_ack_val = queue->pi; @@ -2235,6 +2239,10 @@ static int hl_fw_dynamic_init_cpu(struct hl_device *hdev, dev_info(hdev->dev, "Loading firmware to device, may take some time...\n");
+ /* + * In this stage, "cpu_dyn_regs" contains only LKD's hard coded values! + * It will be updated from FW after hl_fw_dynamic_request_descriptor(). + */ dyn_regs = &fw_loader->dynamic_loader.comm_desc.cpu_dyn_regs;
rc = hl_fw_dynamic_send_protocol_cmd(hdev, fw_loader, COMMS_RST_STATE, diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h index 6b3cdd7e068a..c63e26da5135 100644 --- a/drivers/misc/habanalabs/common/habanalabs.h +++ b/drivers/misc/habanalabs/common/habanalabs.h @@ -2436,7 +2436,9 @@ void destroy_hdev(struct hl_device *hdev); int hl_hw_queues_create(struct hl_device *hdev); void hl_hw_queues_destroy(struct hl_device *hdev); int hl_hw_queue_send_cb_no_cmpl(struct hl_device *hdev, u32 hw_queue_id, - u32 cb_size, u64 cb_ptr); + u32 cb_size, u64 cb_ptr); +void hl_hw_queue_submit_bd(struct hl_device *hdev, struct hl_hw_queue *q, + u32 ctl, u32 len, u64 ptr); int hl_hw_queue_schedule_cs(struct hl_cs *cs); u32 hl_hw_queue_add_ptr(u32 ptr, u16 val); void hl_hw_queue_inc_ci_kernel(struct hl_device *hdev, u32 hw_queue_id); diff --git a/drivers/misc/habanalabs/common/hw_queue.c b/drivers/misc/habanalabs/common/hw_queue.c index bcabfdbf1e01..0afead229e97 100644 --- a/drivers/misc/habanalabs/common/hw_queue.c +++ b/drivers/misc/habanalabs/common/hw_queue.c @@ -65,7 +65,7 @@ void hl_hw_queue_update_ci(struct hl_cs *cs) }
/* - * ext_and_hw_queue_submit_bd() - Submit a buffer descriptor to an external or a + * hl_hw_queue_submit_bd() - Submit a buffer descriptor to an external or a * H/W queue. * @hdev: pointer to habanalabs device structure * @q: pointer to habanalabs queue structure @@ -80,8 +80,8 @@ void hl_hw_queue_update_ci(struct hl_cs *cs) * This function must be called when the scheduler mutex is taken * */ -static void ext_and_hw_queue_submit_bd(struct hl_device *hdev, - struct hl_hw_queue *q, u32 ctl, u32 len, u64 ptr) +void hl_hw_queue_submit_bd(struct hl_device *hdev, struct hl_hw_queue *q, + u32 ctl, u32 len, u64 ptr) { struct hl_bd *bd;
@@ -222,8 +222,8 @@ static int hw_queue_sanity_checks(struct hl_device *hdev, struct hl_hw_queue *q, * @cb_size: size of CB * @cb_ptr: pointer to CB location * - * This function sends a single CB, that must NOT generate a completion entry - * + * This function sends a single CB, that must NOT generate a completion entry. + * Sending CPU messages can be done instead via 'hl_hw_queue_submit_bd()' */ int hl_hw_queue_send_cb_no_cmpl(struct hl_device *hdev, u32 hw_queue_id, u32 cb_size, u64 cb_ptr) @@ -231,16 +231,7 @@ int hl_hw_queue_send_cb_no_cmpl(struct hl_device *hdev, u32 hw_queue_id, struct hl_hw_queue *q = &hdev->kernel_queues[hw_queue_id]; int rc = 0;
- /* - * The CPU queue is a synchronous queue with an effective depth of - * a single entry (although it is allocated with room for multiple - * entries). Therefore, there is a different lock, called - * send_cpu_message_lock, that serializes accesses to the CPU queue. - * As a result, we don't need to lock the access to the entire H/W - * queues module when submitting a JOB to the CPU queue - */ - if (q->queue_type != QUEUE_TYPE_CPU) - hdev->asic_funcs->hw_queues_lock(hdev); + hdev->asic_funcs->hw_queues_lock(hdev);
if (hdev->disabled) { rc = -EPERM; @@ -258,11 +249,10 @@ int hl_hw_queue_send_cb_no_cmpl(struct hl_device *hdev, u32 hw_queue_id, goto out; }
- ext_and_hw_queue_submit_bd(hdev, q, 0, cb_size, cb_ptr); + hl_hw_queue_submit_bd(hdev, q, 0, cb_size, cb_ptr);
out: - if (q->queue_type != QUEUE_TYPE_CPU) - hdev->asic_funcs->hw_queues_unlock(hdev); + hdev->asic_funcs->hw_queues_unlock(hdev);
return rc; } @@ -328,7 +318,7 @@ static void ext_queue_schedule_job(struct hl_cs_job *job) cq->pi = hl_cq_inc_ptr(cq->pi);
submit_bd: - ext_and_hw_queue_submit_bd(hdev, q, ctl, len, ptr); + hl_hw_queue_submit_bd(hdev, q, ctl, len, ptr); }
/* @@ -407,7 +397,7 @@ static void hw_queue_schedule_job(struct hl_cs_job *job) else ptr = (u64) (uintptr_t) job->user_cb;
- ext_and_hw_queue_submit_bd(hdev, q, ctl, len, ptr); + hl_hw_queue_submit_bd(hdev, q, ctl, len, ptr); }
static int init_signal_cs(struct hl_device *hdev,
On Fri, Sep 24, 2021 at 02:44:29PM +0200, Greg Kroah-Hartman wrote:
From: Koby Elbaz kelbaz@habana.ai
[ Upstream commit 8bb8b505761238be0d6a83dc41188867d65e5d4c ]
There is a scenario where an ongoing soft reset would race with an ongoing heartbeat routine, eventually causing heartbeat to fail and thus to escalate into a hard reset.
With this fix, soft-reset procedure will disable heartbeat CPU messages and flush the (ongoing) current one before continuing with reset code.
Signed-off-by: Koby Elbaz kelbaz@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org
drivers/misc/habanalabs/common/device.c | 53 +++++++++++++++----- drivers/misc/habanalabs/common/firmware_if.c | 18 +++++-- drivers/misc/habanalabs/common/habanalabs.h | 4 +- drivers/misc/habanalabs/common/hw_queue.c | 30 ++++------- 4 files changed, 67 insertions(+), 38 deletions(-)
This change adds a build warning so I'm going to drop it from the tree for now.
thanks,
greg k-h
From: Luben Tuikov luben.tuikov@amd.com
[ Upstream commit a6a355a22f7a0efa6a11bc90b5161f394d51fe95 ]
1) Generalize the function--if the user didn't set i2c_address, still return true/false to indicate whether VBIOS contains the RAS EEPROM address. This function shouldn't evaluate whether the user set the i2c_address pointer or not.
2) Don't touch the caller's i2c_address, unless you have to--this function shouldn't have side effects.
3) Correctly set the function comment as a kernel-doc comment.
Cc: John Clements john.clements@amd.com Cc: Hawking Zhang Hawking.Zhang@amd.com Cc: Alex Deucher Alexander.Deucher@amd.com Signed-off-by: Luben Tuikov luben.tuikov@amd.com Reviewed-by: Alex Deucher Alexander.Deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c | 50 ++++++++++++------- 1 file changed, 33 insertions(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c index 8f53837d4d3e..97178b307ed6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c @@ -468,14 +468,18 @@ bool amdgpu_atomfirmware_dynamic_boot_config_supported(struct amdgpu_device *ade return (fw_cap & ATOM_FIRMWARE_CAP_DYNAMIC_BOOT_CFG_ENABLE) ? true : false; }
-/* - * Helper function to query RAS EEPROM address - * - * @adev: amdgpu_device pointer +/** + * amdgpu_atomfirmware_ras_rom_addr -- Get the RAS EEPROM addr from VBIOS + * adev: amdgpu_device pointer + * i2c_address: pointer to u8; if not NULL, will contain + * the RAS EEPROM address if the function returns true * - * Return true if vbios supports ras rom address reporting + * Return true if VBIOS supports RAS EEPROM address reporting, + * else return false. If true and @i2c_address is not NULL, + * will contain the RAS ROM address. */ -bool amdgpu_atomfirmware_ras_rom_addr(struct amdgpu_device *adev, uint8_t* i2c_address) +bool amdgpu_atomfirmware_ras_rom_addr(struct amdgpu_device *adev, + u8 *i2c_address) { struct amdgpu_mode_info *mode_info = &adev->mode_info; int index; @@ -483,27 +487,39 @@ bool amdgpu_atomfirmware_ras_rom_addr(struct amdgpu_device *adev, uint8_t* i2c_a union firmware_info *firmware_info; u8 frev, crev;
- if (i2c_address == NULL) - return false; - - *i2c_address = 0; - index = get_index_into_master_table(atom_master_list_of_data_tables_v2_1, - firmwareinfo); + firmwareinfo);
if (amdgpu_atom_parse_data_header(adev->mode_info.atom_context, - index, &size, &frev, &crev, &data_offset)) { + index, &size, &frev, &crev, + &data_offset)) { /* support firmware_info 3.4 + */ if ((frev == 3 && crev >=4) || (frev > 3)) { firmware_info = (union firmware_info *) (mode_info->atom_context->bios + data_offset); - *i2c_address = firmware_info->v34.ras_rom_i2c_slave_addr; + /* The ras_rom_i2c_slave_addr should ideally + * be a 19-bit EEPROM address, which would be + * used as is by the driver; see top of + * amdgpu_eeprom.c. + * + * When this is the case, 0 is of course a + * valid RAS EEPROM address, in which case, + * we'll drop the first "if (firm...)" and only + * leave the check for the pointer. + * + * The reason this works right now is because + * ras_rom_i2c_slave_addr contains the EEPROM + * device type qualifier 1010b in the top 4 + * bits. + */ + if (firmware_info->v34.ras_rom_i2c_slave_addr) { + if (i2c_address) + *i2c_address = firmware_info->v34.ras_rom_i2c_slave_addr; + return true; + } } }
- if (*i2c_address != 0) - return true; - return false; }
From: Anson Jacob Anson.Jacob@amd.com
[ Upstream commit 03388a347fe7cf7c3bdf68b0823ba316d177d470 ]
Free memory allocated if any of the previous allocations failed.
CID 1487129: Resource leaks (RESOURCE_LEAK) Variable "vpg" going out of scope leaks the storage it points to.
Addresses-Coverity-ID: 1487129: ("Resource leaks")
Reviewed-by: Aric Cyr aric.cyr@amd.com Acked-by: Mikita Lipski mikita.lipski@amd.com Signed-off-by: Anson Jacob Anson.Jacob@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c b/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c index dc7823d23ba8..dd38796ba30a 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c @@ -510,8 +510,12 @@ static struct stream_encoder *dcn303_stream_encoder_create(enum engine_id eng_id vpg = dcn303_vpg_create(ctx, vpg_inst); afmt = dcn303_afmt_create(ctx, afmt_inst);
- if (!enc1 || !vpg || !afmt) + if (!enc1 || !vpg || !afmt) { + kfree(enc1); + kfree(vpg); + kfree(afmt); return NULL; + }
dcn30_dio_stream_encoder_construct(enc1, ctx, ctx->dc_bios, eng_id, vpg, afmt, &stream_enc_regs[eng_id], &se_shift, &se_mask);
From: Philip Yang Philip.Yang@amd.com
[ Upstream commit d7eff46c214c036606dd3cd305bd5a128aecfe8c ]
Get process vm root BO ref in case process is exiting and root BO is freed, to avoid NULL pointer dereference backtrace:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 Call Trace: amdgpu_show_fdinfo+0xfe/0x2a0 [amdgpu] seq_show+0x12c/0x180 seq_read+0x153/0x410 vfs_read+0x91/0x140[ 3427.206183] ksys_read+0x4f/0xb0 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x65/0xca
Signed-off-by: Philip Yang Philip.Yang@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c index d94c5419ec25..5a6857c44bb6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c @@ -59,6 +59,7 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f) uint64_t vram_mem = 0, gtt_mem = 0, cpu_mem = 0; struct drm_file *file = f->private_data; struct amdgpu_device *adev = drm_to_adev(file->minor->dev); + struct amdgpu_bo *root; int ret;
ret = amdgpu_file_to_fpriv(f, &fpriv); @@ -69,13 +70,19 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f) dev = PCI_SLOT(adev->pdev->devfn); fn = PCI_FUNC(adev->pdev->devfn);
- ret = amdgpu_bo_reserve(fpriv->vm.root.bo, false); + root = amdgpu_bo_ref(fpriv->vm.root.bo); + if (!root) + return; + + ret = amdgpu_bo_reserve(root, false); if (ret) { DRM_ERROR("Fail to reserve bo\n"); return; } amdgpu_vm_get_memory(&fpriv->vm, &vram_mem, >t_mem, &cpu_mem); - amdgpu_bo_unreserve(fpriv->vm.root.bo); + amdgpu_bo_unreserve(root); + amdgpu_bo_unref(&root); + seq_printf(m, "pdev:\t%04x:%02x:%02x.%d\npasid:\t%u\n", domain, bus, dev, fn, fpriv->vm.pasid); seq_printf(m, "vram mem:\t%llu kB\n", vram_mem/1024UL);
From: Ofir Bitton obitton@habana.ai
[ Upstream commit a6c849012b0f51c674f52384bd9a4f3dc0a33c31 ]
Currently there is no validity check for event ID received from F/W, Thus exposing driver to memory overrun.
Signed-off-by: Ofir Bitton obitton@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/gaudi/gaudi.c | 6 ++++++ drivers/misc/habanalabs/goya/goya.c | 6 ++++++ 2 files changed, 12 insertions(+)
diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c index aa8a0ca5aca2..409f05c962f2 100644 --- a/drivers/misc/habanalabs/gaudi/gaudi.c +++ b/drivers/misc/habanalabs/gaudi/gaudi.c @@ -7809,6 +7809,12 @@ static void gaudi_handle_eqe(struct hl_device *hdev, u8 cause; bool reset_required;
+ if (event_type >= GAUDI_EVENT_SIZE) { + dev_err(hdev->dev, "Event type %u exceeds maximum of %u", + event_type, GAUDI_EVENT_SIZE - 1); + return; + } + gaudi->events_stat[event_type]++; gaudi->events_stat_aggregate[event_type]++;
diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c index 755e08cf2ecc..bfb22f96c1a3 100644 --- a/drivers/misc/habanalabs/goya/goya.c +++ b/drivers/misc/habanalabs/goya/goya.c @@ -4797,6 +4797,12 @@ void goya_handle_eqe(struct hl_device *hdev, struct hl_eq_entry *eq_entry) >> EQ_CTL_EVENT_TYPE_SHIFT); struct goya_device *goya = hdev->asic_specific;
+ if (event_type >= GOYA_ASYNC_EVENT_ID_SIZE) { + dev_err(hdev->dev, "Event type %u exceeds maximum of %u", + event_type, GOYA_ASYNC_EVENT_ID_SIZE - 1); + return; + } + goya->events_stat[event_type]++; goya->events_stat_aggregate[event_type]++;
From: Yuri Nudelman ynudelman@habana.ai
[ Upstream commit 09ae43043c748423a5dcdc7bb1e63e4dcabe9bd6 ]
The address resolution via debugfs was not taking into consideration the page offset, resulting in a wrong address.
Signed-off-by: Yuri Nudelman ynudelman@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/common/debugfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/misc/habanalabs/common/debugfs.c b/drivers/misc/habanalabs/common/debugfs.c index 703d79fb6f3f..379529bffc70 100644 --- a/drivers/misc/habanalabs/common/debugfs.c +++ b/drivers/misc/habanalabs/common/debugfs.c @@ -349,7 +349,7 @@ static int mmu_show(struct seq_file *s, void *data) return 0; }
- phys_addr = hops_info.hop_info[hops_info.used_hops - 1].hop_pte_val; + hl_mmu_va_to_pa(ctx, virt_addr, &phys_addr);
if (hops_info.scrambled_vaddr && (dev_entry->mmu_addr != hops_info.scrambled_vaddr))
From: Omer Shpigelman oshpigelman@habana.ai
[ Upstream commit 71731090ab17a208a58020e4b342fdfee280458a ]
On init, the disabled state is cleared right before hw_init and that causes the device to report on "Operational" state before the device initialization is finished. Although the char device is not yet exposed to the user at this stage, the sysfs entries are exposed.
This can cause errors in monitoring applications that use the sysfs entries.
In order to avoid this, a new state "in device creation" is introduced to ne reported when the device is not disabled but is still in init flow.
Signed-off-by: Omer Shpigelman oshpigelman@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/common/device.c | 3 +++ drivers/misc/habanalabs/common/habanalabs.h | 2 +- .../misc/habanalabs/common/habanalabs_drv.c | 8 ++++++-- drivers/misc/habanalabs/common/sysfs.c | 20 +++++++------------ include/uapi/misc/habanalabs.h | 4 +++- 5 files changed, 20 insertions(+), 17 deletions(-)
diff --git a/drivers/misc/habanalabs/common/device.c b/drivers/misc/habanalabs/common/device.c index 0a788a13f2c1..846a7e78582c 100644 --- a/drivers/misc/habanalabs/common/device.c +++ b/drivers/misc/habanalabs/common/device.c @@ -23,6 +23,8 @@ enum hl_device_status hl_device_status(struct hl_device *hdev) status = HL_DEVICE_STATUS_NEEDS_RESET; else if (hdev->disabled) status = HL_DEVICE_STATUS_MALFUNCTION; + else if (!hdev->init_done) + status = HL_DEVICE_STATUS_IN_DEVICE_CREATION; else status = HL_DEVICE_STATUS_OPERATIONAL;
@@ -44,6 +46,7 @@ bool hl_device_operational(struct hl_device *hdev, case HL_DEVICE_STATUS_NEEDS_RESET: return false; case HL_DEVICE_STATUS_OPERATIONAL: + case HL_DEVICE_STATUS_IN_DEVICE_CREATION: default: return true; } diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h index c63e26da5135..c48d130a9049 100644 --- a/drivers/misc/habanalabs/common/habanalabs.h +++ b/drivers/misc/habanalabs/common/habanalabs.h @@ -1798,7 +1798,7 @@ struct hl_dbg_device_entry {
#define HL_STR_MAX 32
-#define HL_DEV_STS_MAX (HL_DEVICE_STATUS_NEEDS_RESET + 1) +#define HL_DEV_STS_MAX (HL_DEVICE_STATUS_LAST + 1)
/* Theoretical limit only. A single host can only contain up to 4 or 8 PCIe * x16 cards. In extreme cases, there are hosts that can accommodate 16 cards. diff --git a/drivers/misc/habanalabs/common/habanalabs_drv.c b/drivers/misc/habanalabs/common/habanalabs_drv.c index 4194cda2d04c..536451a9a16c 100644 --- a/drivers/misc/habanalabs/common/habanalabs_drv.c +++ b/drivers/misc/habanalabs/common/habanalabs_drv.c @@ -318,12 +318,16 @@ int create_hdev(struct hl_device **dev, struct pci_dev *pdev, hdev->asic_prop.fw_security_enabled = false;
/* Assign status description string */ - strncpy(hdev->status[HL_DEVICE_STATUS_MALFUNCTION], - "disabled", HL_STR_MAX); + strncpy(hdev->status[HL_DEVICE_STATUS_OPERATIONAL], + "operational", HL_STR_MAX); strncpy(hdev->status[HL_DEVICE_STATUS_IN_RESET], "in reset", HL_STR_MAX); + strncpy(hdev->status[HL_DEVICE_STATUS_MALFUNCTION], + "disabled", HL_STR_MAX); strncpy(hdev->status[HL_DEVICE_STATUS_NEEDS_RESET], "needs reset", HL_STR_MAX); + strncpy(hdev->status[HL_DEVICE_STATUS_IN_DEVICE_CREATION], + "in device creation", HL_STR_MAX);
hdev->major = hl_major; hdev->reset_on_lockup = reset_on_lockup; diff --git a/drivers/misc/habanalabs/common/sysfs.c b/drivers/misc/habanalabs/common/sysfs.c index db72df282ef8..34f9f2779962 100644 --- a/drivers/misc/habanalabs/common/sysfs.c +++ b/drivers/misc/habanalabs/common/sysfs.c @@ -9,8 +9,7 @@
#include <linux/pci.h>
-long hl_get_frequency(struct hl_device *hdev, u32 pll_index, - bool curr) +long hl_get_frequency(struct hl_device *hdev, u32 pll_index, bool curr) { struct cpucp_packet pkt; u32 used_pll_idx; @@ -44,8 +43,7 @@ long hl_get_frequency(struct hl_device *hdev, u32 pll_index, return (long) result; }
-void hl_set_frequency(struct hl_device *hdev, u32 pll_index, - u64 freq) +void hl_set_frequency(struct hl_device *hdev, u32 pll_index, u64 freq) { struct cpucp_packet pkt; u32 used_pll_idx; @@ -285,16 +283,12 @@ static ssize_t status_show(struct device *dev, struct device_attribute *attr, char *buf) { struct hl_device *hdev = dev_get_drvdata(dev); - char *str; + char str[HL_STR_MAX];
- if (atomic_read(&hdev->in_reset)) - str = "In reset"; - else if (hdev->disabled) - str = "Malfunction"; - else if (hdev->needs_reset) - str = "Needs Reset"; - else - str = "Operational"; + strscpy(str, hdev->status[hl_device_status(hdev)], HL_STR_MAX); + + /* use uppercase for backward compatibility */ + str[0] = 'A' + (str[0] - 'a');
return sprintf(buf, "%s\n", str); } diff --git a/include/uapi/misc/habanalabs.h b/include/uapi/misc/habanalabs.h index a47a731e4527..b4b681b81df8 100644 --- a/include/uapi/misc/habanalabs.h +++ b/include/uapi/misc/habanalabs.h @@ -276,7 +276,9 @@ enum hl_device_status { HL_DEVICE_STATUS_OPERATIONAL, HL_DEVICE_STATUS_IN_RESET, HL_DEVICE_STATUS_MALFUNCTION, - HL_DEVICE_STATUS_NEEDS_RESET + HL_DEVICE_STATUS_NEEDS_RESET, + HL_DEVICE_STATUS_IN_DEVICE_CREATION, + HL_DEVICE_STATUS_LAST = HL_DEVICE_STATUS_IN_DEVICE_CREATION };
/* Opcode for management ioctl
From: farah kassabri fkassabri@habana.ai
[ Upstream commit 607b1468c2263e082d74c1a3e71399a9026b41ce ]
Fix 2 areas in the code where it's possible the code will go to sleep while holding a spinlock.
Reported-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: farah kassabri fkassabri@habana.ai Reviewed-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Oded Gabbay ogabbay@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/habanalabs/common/command_buffer.c | 2 -- drivers/misc/habanalabs/common/memory.c | 2 +- 2 files changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/misc/habanalabs/common/command_buffer.c b/drivers/misc/habanalabs/common/command_buffer.c index 719168c980a4..402ac2395fc8 100644 --- a/drivers/misc/habanalabs/common/command_buffer.c +++ b/drivers/misc/habanalabs/common/command_buffer.c @@ -314,8 +314,6 @@ int hl_cb_create(struct hl_device *hdev, struct hl_cb_mgr *mgr,
spin_lock(&mgr->cb_lock); rc = idr_alloc(&mgr->cb_handles, cb, 1, 0, GFP_ATOMIC); - if (rc < 0) - rc = idr_alloc(&mgr->cb_handles, cb, 1, 0, GFP_KERNEL); spin_unlock(&mgr->cb_lock);
if (rc < 0) { diff --git a/drivers/misc/habanalabs/common/memory.c b/drivers/misc/habanalabs/common/memory.c index af339ce1ab4f..fcadde594a58 100644 --- a/drivers/misc/habanalabs/common/memory.c +++ b/drivers/misc/habanalabs/common/memory.c @@ -124,7 +124,7 @@ static int alloc_device_memory(struct hl_ctx *ctx, struct hl_mem_in *args,
spin_lock(&vm->idr_lock); handle = idr_alloc(&vm->phys_pg_pack_handles, phys_pg_pack, 1, 0, - GFP_KERNEL); + GFP_ATOMIC); spin_unlock(&vm->idr_lock);
if (handle < 0) {
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit c68eb29c8e9067c08175dd0414f6984f236f719d ]
A consumer is expected to disable a PWM before calling pwm_put(). And if they didn't there is hopefully a good reason (or the consumer needs fixing). Also if disabling an enabled PWM was the right thing to do, this should better be done in the framework instead of in each low level driver.
Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pwm/pwm-img.c | 16 ---------------- 1 file changed, 16 deletions(-)
diff --git a/drivers/pwm/pwm-img.c b/drivers/pwm/pwm-img.c index 11b16ecc4f96..18d8e34d0d08 100644 --- a/drivers/pwm/pwm-img.c +++ b/drivers/pwm/pwm-img.c @@ -326,23 +326,7 @@ err_pm_disable: static int img_pwm_remove(struct platform_device *pdev) { struct img_pwm_chip *pwm_chip = platform_get_drvdata(pdev); - u32 val; - unsigned int i; - int ret; - - ret = pm_runtime_get_sync(&pdev->dev); - if (ret < 0) { - pm_runtime_put(&pdev->dev); - return ret; - } - - for (i = 0; i < pwm_chip->chip.npwm; i++) { - val = img_pwm_readl(pwm_chip, PWM_CTRL_CFG); - val &= ~BIT(i); - img_pwm_writel(pwm_chip, PWM_CTRL_CFG, val); - }
- pm_runtime_put(&pdev->dev); pm_runtime_disable(&pdev->dev); if (!pm_runtime_status_suspended(&pdev->dev)) img_pwm_runtime_suspend(&pdev->dev);
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit 9d768cd7fd42bb0be16f36aec48548fca5260759 ]
A consumer is expected to disable a PWM before calling pwm_put(). And if they didn't there is hopefully a good reason (or the consumer needs fixing). Also if disabling an enabled PWM was the right thing to do, this should better be done in the framework instead of in each low level driver.
Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pwm/pwm-rockchip.c | 14 -------------- 1 file changed, 14 deletions(-)
diff --git a/drivers/pwm/pwm-rockchip.c b/drivers/pwm/pwm-rockchip.c index cbe900877724..8fcef29948d7 100644 --- a/drivers/pwm/pwm-rockchip.c +++ b/drivers/pwm/pwm-rockchip.c @@ -384,20 +384,6 @@ static int rockchip_pwm_remove(struct platform_device *pdev) { struct rockchip_pwm_chip *pc = platform_get_drvdata(pdev);
- /* - * Disable the PWM clk before unpreparing it if the PWM device is still - * running. This should only happen when the last PWM user left it - * enabled, or when nobody requested a PWM that was previously enabled - * by the bootloader. - * - * FIXME: Maybe the core should disable all PWM devices in - * pwmchip_remove(). In this case we'd only have to call - * clk_unprepare() after pwmchip_remove(). - * - */ - if (pwm_is_enabled(pc->chip.pwms)) - clk_disable(pc->clk); - clk_unprepare(pc->pclk); clk_unprepare(pc->clk);
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit d44084c93427bb0a9261432db1a8ca76a42d805e ]
A consumer is expected to disable a PWM before calling pwm_put(). And if they didn't there is hopefully a good reason (or the consumer needs fixing). Also if disabling an enabled PWM was the right thing to do, this should better be done in the framework instead of in each low level driver.
Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pwm/pwm-stm32-lp.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/pwm/pwm-stm32-lp.c b/drivers/pwm/pwm-stm32-lp.c index 93dd03618465..e4a10aac354d 100644 --- a/drivers/pwm/pwm-stm32-lp.c +++ b/drivers/pwm/pwm-stm32-lp.c @@ -222,8 +222,6 @@ static int stm32_pwm_lp_remove(struct platform_device *pdev) { struct stm32_pwm_lp *priv = platform_get_drvdata(pdev);
- pwm_disable(&priv->chip.pwms[0]); - return pwmchip_remove(&priv->chip); }
From: Hannes Reinecke hare@suse.de
[ Upstream commit f04064814c2a15c22ed9c803f9b634ef34f91092 ]
The serial number is copied into the buffer via memcpy_and_pad() with the length NVMET_SN_MAX_SIZE. So when printing out we also need to take just that length as anything beyond that will be uninitialized.
Signed-off-by: Hannes Reinecke hare@suse.de Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/target/configfs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c index 273555127188..fa88bf9cba4d 100644 --- a/drivers/nvme/target/configfs.c +++ b/drivers/nvme/target/configfs.c @@ -1067,7 +1067,8 @@ static ssize_t nvmet_subsys_attr_serial_show(struct config_item *item, { struct nvmet_subsys *subsys = to_subsys(item);
- return snprintf(page, PAGE_SIZE, "%s\n", subsys->serial); + return snprintf(page, PAGE_SIZE, "%*s\n", + NVMET_SN_MAX_SIZE, subsys->serial); }
static ssize_t
From: Tetsuo Handa penguin-kernel@i-love.sakura.ne.jp
[ Upstream commit dfbb3409b27fa42b96f5727a80d3ceb6a8663991 ]
If CONFIG_BLK_DEV_LOOP && CONFIG_MTD (at least; there might be other combinations), lockdep complains circular locking dependency at __loop_clr_fd(), for major_names_lock serves as a locking dependency aggregating hub across multiple block modules.
====================================================== WARNING: possible circular locking dependency detected 5.14.0+ #757 Tainted: G E ------------------------------------------------------ systemd-udevd/7568 is trying to acquire lock: ffff88800f334d48 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x70/0x560
but task is already holding lock: ffff888014a7d4a0 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x4d/0x400 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #6 (&lo->lo_mutex){+.+.}-{3:3}: lock_acquire+0xbe/0x1f0 __mutex_lock_common+0xb6/0xe10 mutex_lock_killable_nested+0x17/0x20 lo_open+0x23/0x50 [loop] blkdev_get_by_dev+0x199/0x540 blkdev_open+0x58/0x90 do_dentry_open+0x144/0x3a0 path_openat+0xa57/0xda0 do_filp_open+0x9f/0x140 do_sys_openat2+0x71/0x150 __x64_sys_openat+0x78/0xa0 do_syscall_64+0x3d/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #5 (&disk->open_mutex){+.+.}-{3:3}: lock_acquire+0xbe/0x1f0 __mutex_lock_common+0xb6/0xe10 mutex_lock_nested+0x17/0x20 bd_register_pending_holders+0x20/0x100 device_add_disk+0x1ae/0x390 loop_add+0x29c/0x2d0 [loop] blk_request_module+0x5a/0xb0 blkdev_get_no_open+0x27/0xa0 blkdev_get_by_dev+0x5f/0x540 blkdev_open+0x58/0x90 do_dentry_open+0x144/0x3a0 path_openat+0xa57/0xda0 do_filp_open+0x9f/0x140 do_sys_openat2+0x71/0x150 __x64_sys_openat+0x78/0xa0 do_syscall_64+0x3d/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #4 (major_names_lock){+.+.}-{3:3}: lock_acquire+0xbe/0x1f0 __mutex_lock_common+0xb6/0xe10 mutex_lock_nested+0x17/0x20 blkdev_show+0x19/0x80 devinfo_show+0x52/0x60 seq_read_iter+0x2d5/0x3e0 proc_reg_read_iter+0x41/0x80 vfs_read+0x2ac/0x330 ksys_read+0x6b/0xd0 do_syscall_64+0x3d/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (&p->lock){+.+.}-{3:3}: lock_acquire+0xbe/0x1f0 __mutex_lock_common+0xb6/0xe10 mutex_lock_nested+0x17/0x20 seq_read_iter+0x37/0x3e0 generic_file_splice_read+0xf3/0x170 splice_direct_to_actor+0x14e/0x350 do_splice_direct+0x84/0xd0 do_sendfile+0x263/0x430 __se_sys_sendfile64+0x96/0xc0 do_syscall_64+0x3d/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#3){.+.+}-{0:0}: lock_acquire+0xbe/0x1f0 lo_write_bvec+0x96/0x280 [loop] loop_process_work+0xa68/0xc10 [loop] process_one_work+0x293/0x480 worker_thread+0x23d/0x4b0 kthread+0x163/0x180 ret_from_fork+0x1f/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}: lock_acquire+0xbe/0x1f0 process_one_work+0x280/0x480 worker_thread+0x23d/0x4b0 kthread+0x163/0x180 ret_from_fork+0x1f/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}: validate_chain+0x1f0d/0x33e0 __lock_acquire+0x92d/0x1030 lock_acquire+0xbe/0x1f0 flush_workqueue+0x8c/0x560 drain_workqueue+0x80/0x140 destroy_workqueue+0x47/0x4f0 __loop_clr_fd+0xb4/0x400 [loop] blkdev_put+0x14a/0x1d0 blkdev_close+0x1c/0x20 __fput+0xfd/0x220 task_work_run+0x69/0xc0 exit_to_user_mode_prepare+0x1ce/0x1f0 syscall_exit_to_user_mode+0x26/0x60 do_syscall_64+0x4c/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of: (wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&lo->lo_mutex); lock(&disk->open_mutex); lock(&lo->lo_mutex); lock((wq_completion)loop0);
*** DEADLOCK ***
2 locks held by systemd-udevd/7568: #0: ffff888012554128 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_put+0x4c/0x1d0 #1: ffff888014a7d4a0 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x4d/0x400 [loop]
stack backtrace: CPU: 0 PID: 7568 Comm: systemd-udevd Tainted: G E 5.14.0+ #757 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020 Call Trace: dump_stack_lvl+0x79/0xbf print_circular_bug+0x5d6/0x5e0 ? stack_trace_save+0x42/0x60 ? save_trace+0x3d/0x2d0 check_noncircular+0x10b/0x120 validate_chain+0x1f0d/0x33e0 ? __lock_acquire+0x953/0x1030 ? __lock_acquire+0x953/0x1030 __lock_acquire+0x92d/0x1030 ? flush_workqueue+0x70/0x560 lock_acquire+0xbe/0x1f0 ? flush_workqueue+0x70/0x560 flush_workqueue+0x8c/0x560 ? flush_workqueue+0x70/0x560 ? sched_clock_cpu+0xe/0x1a0 ? drain_workqueue+0x41/0x140 drain_workqueue+0x80/0x140 destroy_workqueue+0x47/0x4f0 ? blk_mq_freeze_queue_wait+0xac/0xd0 __loop_clr_fd+0xb4/0x400 [loop] ? __mutex_unlock_slowpath+0x35/0x230 blkdev_put+0x14a/0x1d0 blkdev_close+0x1c/0x20 __fput+0xfd/0x220 task_work_run+0x69/0xc0 exit_to_user_mode_prepare+0x1ce/0x1f0 syscall_exit_to_user_mode+0x26/0x60 do_syscall_64+0x4c/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f0fd4c661f7 Code: 00 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 13 fc ff ff RSP: 002b:00007ffd1c9e9fd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 00007f0fd46be6c8 RCX: 00007f0fd4c661f7 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000006 RBP: 0000000000000006 R08: 000055fff1eaf400 R09: 0000000000000000 R10: 00007f0fd46be6c8 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000002f08 R15: 00007ffd1c9ea050
Commit 1c500ad706383f1a ("loop: reduce the loop_ctl_mutex scope") is for breaking "loop_ctl_mutex => &lo->lo_mutex" dependency chain. But enabling a different block module results in forming circular locking dependency due to shared major_names_lock mutex.
The simplest fix is to call probe function without holding major_names_lock [1], but Christoph Hellwig does not like such idea. Therefore, instead of holding major_names_lock in blkdev_show(), introduce a different lock for blkdev_show() in order to break "sb_writers#$N => &p->lock => major_names_lock" dependency chain.
Link: https://lkml.kernel.org/r/b2af8a5b-3c1b-204e-7f56-bea0b15848d6@i-love.sakura... [1] Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Link: https://lore.kernel.org/r/18a02da2-0bf3-550e-b071-2b4ab13c49f0@i-love.sakura... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/genhd.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/block/genhd.c b/block/genhd.c index 298ee78c1bda..9aba65404416 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -164,6 +164,7 @@ static struct blk_major_name { void (*probe)(dev_t devt); } *major_names[BLKDEV_MAJOR_HASH_SIZE]; static DEFINE_MUTEX(major_names_lock); +static DEFINE_SPINLOCK(major_names_spinlock);
/* index in the above - for now: assume no multimajor ranges */ static inline int major_to_index(unsigned major) @@ -176,11 +177,11 @@ void blkdev_show(struct seq_file *seqf, off_t offset) { struct blk_major_name *dp;
- mutex_lock(&major_names_lock); + spin_lock(&major_names_spinlock); for (dp = major_names[major_to_index(offset)]; dp; dp = dp->next) if (dp->major == offset) seq_printf(seqf, "%3d %s\n", dp->major, dp->name); - mutex_unlock(&major_names_lock); + spin_unlock(&major_names_spinlock); } #endif /* CONFIG_PROC_FS */
@@ -252,6 +253,7 @@ int __register_blkdev(unsigned int major, const char *name, p->next = NULL; index = major_to_index(major);
+ spin_lock(&major_names_spinlock); for (n = &major_names[index]; *n; n = &(*n)->next) { if ((*n)->major == major) break; @@ -260,6 +262,7 @@ int __register_blkdev(unsigned int major, const char *name, *n = p; else ret = -EBUSY; + spin_unlock(&major_names_spinlock);
if (ret < 0) { printk("register_blkdev: cannot get major %u for %s\n", @@ -279,6 +282,7 @@ void unregister_blkdev(unsigned int major, const char *name) int index = major_to_index(major);
mutex_lock(&major_names_lock); + spin_lock(&major_names_spinlock); for (n = &major_names[index]; *n; n = &(*n)->next) if ((*n)->major == major) break; @@ -288,6 +292,7 @@ void unregister_blkdev(unsigned int major, const char *name) p = *n; *n = p->next; } + spin_unlock(&major_names_spinlock); mutex_unlock(&major_names_lock); kfree(p); }
From: Li Jinlin lijinlin3@huawei.com
[ Upstream commit 884f0e84f1e3195b801319c8ec3d5774e9bf2710 ]
The pending timer has been set up in blk_throtl_init(). However, the timer is not deleted in blk_throtl_exit(). This means that the timer handler may still be running after freeing the timer, which would result in a use-after-free.
Fix by calling del_timer_sync() to delete the timer in blk_throtl_exit().
Signed-off-by: Li Jinlin lijinlin3@huawei.com Link: https://lore.kernel.org/r/20210907121242.2885564-1-lijinlin3@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-throttle.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 55c49015e533..7c4e7993ba97 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -2458,6 +2458,7 @@ int blk_throtl_init(struct request_queue *q) void blk_throtl_exit(struct request_queue *q) { BUG_ON(!q->td); + del_timer_sync(&q->td->service_queue.pending_timer); throtl_shutdown_wq(q); blkcg_deactivate_policy(q, &blkcg_policy_throtl); free_percpu(q->td->latency_buckets[READ]);
From: Song Liu songliubraving@fb.com
[ Upstream commit 7f2a6a69f7ced6db8220298e0497cf60482a9d4b ]
Limiting number of request to BLK_MAX_REQUEST_COUNT at blk_plug hurts performance for large md arrays. [1] shows resync speed of md array drops for md array with more than 16 HDDs.
Fix this by allowing more request at plug queue. The multiple_queue flag is used to only apply higher limit to multiple queue cases.
[1] https://lore.kernel.org/linux-raid/CAFDAVznS71BXW8Jxv6k9dXc2iR3ysX3iZRBww_rz... Tested-by: Marcin Wanat marcin.wanat@gmail.com Signed-off-by: Song Liu songliubraving@fb.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-mq.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c index 9d4fdc2be88a..9c64f0025a56 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2135,6 +2135,18 @@ static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq) } }
+/* + * Allow 4x BLK_MAX_REQUEST_COUNT requests on plug queue for multiple + * queues. This is important for md arrays to benefit from merging + * requests. + */ +static inline unsigned short blk_plug_max_rq_count(struct blk_plug *plug) +{ + if (plug->multiple_queues) + return BLK_MAX_REQUEST_COUNT * 4; + return BLK_MAX_REQUEST_COUNT; +} + /** * blk_mq_submit_bio - Create and send a request to block device. * @bio: Bio pointer. @@ -2231,7 +2243,7 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio) else last = list_entry_rq(plug->mq_list.prev);
- if (request_count >= BLK_MAX_REQUEST_COUNT || (last && + if (request_count >= blk_plug_max_rq_count(plug) || (last && blk_rq_bytes(last) >= BLK_PLUG_FLUSH_SIZE)) { blk_flush_plug_list(plug, false); trace_block_plug(q);
From: Yu-Tung Chang mtwget@gmail.com
[ Upstream commit 0c45d3e24ef3d3d87c5e0077b8f38d1372af7176 ]
The rtc-rx8010 uses the I2C regmap but doesn't select it in Kconfig so depending on the configuration the build may fail. Fix it.
Signed-off-by: Yu-Tung Chang mtwget@gmail.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Link: https://lore.kernel.org/r/20210830052532.40356-1-mtwget@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/rtc/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/rtc/Kconfig b/drivers/rtc/Kconfig index 12153d5801ce..f7bf87097a9f 100644 --- a/drivers/rtc/Kconfig +++ b/drivers/rtc/Kconfig @@ -624,6 +624,7 @@ config RTC_DRV_FM3130
config RTC_DRV_RX8010 tristate "Epson RX8010SJ" + select REGMAP_I2C help If you say yes here you get support for the Epson RX8010SJ RTC chip.
From: Sebastian Andrzej Siewior bigeasy@linutronix.de
[ Upstream commit 9848417926353daa59d2b05eb26e185063dbac6e ]
The intel powerclamp driver will setup a per-CPU worker with RT priority. The worker will then invoke play_idle() in which it remains in the idle poll loop until it is stopped by the timer it started earlier.
That timer needs to expire in hard interrupt context on PREEMPT_RT. Otherwise the timer will expire in ksoftirqd as a SOFT timer but that task won't be scheduled on the CPU because its priority is lower than the priority of the worker which is in the idle loop.
Always expire the idle timer in hard interrupt context.
Reported-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/r/20210906113034.jgfxrjdvxnjqgtmc@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/idle.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c index 912b47aa99d8..d17b0a5ce6ac 100644 --- a/kernel/sched/idle.c +++ b/kernel/sched/idle.c @@ -379,10 +379,10 @@ void play_idle_precise(u64 duration_ns, u64 latency_ns) cpuidle_use_deepest_state(latency_ns);
it.done = 0; - hrtimer_init_on_stack(&it.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); + hrtimer_init_on_stack(&it.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD); it.timer.function = idle_inject_timer_fn; hrtimer_start(&it.timer, ns_to_ktime(duration_ns), - HRTIMER_MODE_REL_PINNED); + HRTIMER_MODE_REL_PINNED_HARD);
while (!READ_ONCE(it.done)) do_idle();
From: Enzo Matsumiya ematsumiya@suse.de
[ Upstream commit 9351590f51cdda49d0265932a37f099950998504 ]
Cached root file was not being completely invalidated sometimes.
Reproducing: - With a DFS share with 2 targets, one disabled and one enabled - start some I/O on the mount # while true; do ls /mnt/dfs; done - at the same time, disable the enabled target and enable the disabled one - wait for DFS cache to expire - on reconnect, the previous cached root handle should be invalid, but open_cached_dir_by_dentry() will still try to use it, but throws a use-after-free warning (kref_get())
Make smb2_close_cached_fid() invalidate all fields every time, but only send an SMB2_close() when the entry is still valid.
Signed-off-by: Enzo Matsumiya ematsumiya@suse.de Reviewed-by: Paulo Alcantara (SUSE) pc@cjr.nz Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/smb2ops.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c index 2dfd0d8297eb..1b9de38a136a 100644 --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -689,13 +689,19 @@ smb2_close_cached_fid(struct kref *ref) cifs_dbg(FYI, "clear cached root file handle\n"); SMB2_close(0, cfid->tcon, cfid->fid->persistent_fid, cfid->fid->volatile_fid); - cfid->is_valid = false; - cfid->file_all_info_is_valid = false; - cfid->has_lease = false; - if (cfid->dentry) { - dput(cfid->dentry); - cfid->dentry = NULL; - } + } + + /* + * We only check validity above to send SMB2_close, + * but we still need to invalidate these entries + * when this function is called + */ + cfid->is_valid = false; + cfid->file_all_info_is_valid = false; + cfid->has_lease = false; + if (cfid->dentry) { + dput(cfid->dentry); + cfid->dentry = NULL; } }
From: Hao Xu haoxu@linux.alibaba.com
[ Upstream commit 32c2d33e0b7c4ea53284d5d9435dd022b582c8cf ]
Build check of __REQ_F_LAST_BIT should be larger than, not equal or larger than. It's perfectly valid to have __REQ_F_LAST_BIT be 32, as that means that the last valid bit is 31 which does fit in the type.
Signed-off-by: Hao Xu haoxu@linux.alibaba.com Link: https://lore.kernel.org/r/20210907032243.114190-1-haoxu@linux.alibaba.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 43aaa3566431..754d59f734d8 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -10335,7 +10335,7 @@ static int __init io_uring_init(void) BUILD_BUG_ON(SQE_VALID_FLAGS >= (1 << 8));
BUILD_BUG_ON(ARRAY_SIZE(io_op_defs) != IORING_OP_LAST); - BUILD_BUG_ON(__REQ_F_LAST_BIT >= 8 * sizeof(int)); + BUILD_BUG_ON(__REQ_F_LAST_BIT > 8 * sizeof(int));
req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT);
From: Paul Moore paul@paul-moore.com
commit a3727a8bac0a9e77c70820655fd8715523ba3db7 upstream.
Jann Horn reported a problem with commit eb1231f73c4d ("selinux: clarify task subjective and objective credentials") where some LSM hooks were attempting to access the subjective credentials of a task other than the current task. Generally speaking, it is not safe to access another task's subjective credentials and doing so can cause a number of problems.
Further, while looking into the problem, I realized that Smack was suffering from a similar problem brought about by a similar commit 1fb057dcde11 ("smack: differentiate between subjective and objective task credentials").
This patch addresses this problem by restoring the use of the task's objective credentials in those cases where the task is other than the current executing task. Not only does this resolve the problem reported by Jann, it is arguably the correct thing to do in these cases.
Cc: stable@vger.kernel.org Fixes: eb1231f73c4d ("selinux: clarify task subjective and objective credentials") Fixes: 1fb057dcde11 ("smack: differentiate between subjective and objective task credentials") Reported-by: Jann Horn jannh@google.com Acked-by: Eric W. Biederman ebiederm@xmission.com Acked-by: Casey Schaufler casey@schaufler-ca.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/selinux/hooks.c | 4 ++-- security/smack/smack_lsm.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-)
--- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -2155,7 +2155,7 @@ static int selinux_ptrace_access_check(s static int selinux_ptrace_traceme(struct task_struct *parent) { return avc_has_perm(&selinux_state, - task_sid_subj(parent), task_sid_obj(current), + task_sid_obj(parent), task_sid_obj(current), SECCLASS_PROCESS, PROCESS__PTRACE, NULL); }
@@ -6218,7 +6218,7 @@ static int selinux_msg_queue_msgrcv(stru struct ipc_security_struct *isec; struct msg_security_struct *msec; struct common_audit_data ad; - u32 sid = task_sid_subj(target); + u32 sid = task_sid_obj(target); int rc;
isec = selinux_ipc(msq); --- a/security/smack/smack_lsm.c +++ b/security/smack/smack_lsm.c @@ -2016,7 +2016,7 @@ static int smk_curacc_on_task(struct tas const char *caller) { struct smk_audit_info ad; - struct smack_known *skp = smk_of_task_struct_subj(p); + struct smack_known *skp = smk_of_task_struct_obj(p); int rc;
smk_ad_init(&ad, caller, LSM_AUDIT_DATA_TASK); @@ -3480,7 +3480,7 @@ static void smack_d_instantiate(struct d */ static int smack_getprocattr(struct task_struct *p, char *name, char **value) { - struct smack_known *skp = smk_of_task_struct_subj(p); + struct smack_known *skp = smk_of_task_struct_obj(p); char *cp; int slen;
From: Guenter Roeck linux@roeck-us.net
commit e8f71f89236ef82d449991bfbc237e3cb6ea584f upstream.
nvkm test builds fail with the following error.
drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c: In function 'nvkm_control_mthd_pstate_info': drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c:60:35: error: overflow in conversion from 'int' to '__s8' {aka 'signed char'} changes value from '-251' to '5'
The code builds on most architectures, but fails on parisc where ENOSYS is defined as 251.
Replace the error code with -ENODEV (-19). The actual error code does not really matter and is not passed to userspace - it just has to be negative.
Fixes: 7238eca4cf18 ("drm/nouveau: expose pstate selection per-power source in sysfs") Signed-off-by: Guenter Roeck linux@roeck-us.net Cc: Ben Skeggs bskeggs@redhat.com Cc: David Airlie airlied@linux.ie Cc: Daniel Vetter daniel@ffwll.ch Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c @@ -57,7 +57,7 @@ nvkm_control_mthd_pstate_info(struct nvk args->v0.count = 0; args->v0.ustate_ac = NVIF_CONTROL_PSTATE_INFO_V0_USTATE_DISABLE; args->v0.ustate_dc = NVIF_CONTROL_PSTATE_INFO_V0_USTATE_DISABLE; - args->v0.pwrsrc = -ENOSYS; + args->v0.pwrsrc = -ENODEV; args->v0.pstate = NVIF_CONTROL_PSTATE_INFO_V0_PSTATE_UNKNOWN; }
Hello!
On 9/24/21 7:43 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.14.8 release. There are 100 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Sep 2021 12:43:20 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.8-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
Regressions detected.
While building Perf for arm, arm64, i386 and x86, all with GCC 11, the following errors were encountered:
util/parse-events-hybrid.c: In function 'add_hw_hybrid': util/parse-events-hybrid.c:84:17: error: implicit declaration of function 'copy_config_terms'; did you mean 'perf_pmu__config_terms'? [-Werror=implicit-function-declaration] 84 | copy_config_terms(&terms, config_terms); | ^~~~~~~~~~~~~~~~~ | perf_pmu__config_terms util/parse-events-hybrid.c:88:17: error: implicit declaration of function 'free_config_terms'; did you mean 'perf_pmu__config_terms'? [-Werror=implicit-function-declaration] 88 | free_config_terms(&terms); | ^~~~~~~~~~~~~~~~~ | perf_pmu__config_terms cc1: all warnings being treated as errors
To reproduce this build locally (for instance): tuxmake \ --target-arch=x86_64 \ --kconfig=defconfig \ --kconfig-add=https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... \ --kconfig-add=https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... \ --kconfig-add=https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... \ --kconfig-add=https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... \ --kconfig-add=https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... \ --kconfig-add=CONFIG_IGB=y \ --kconfig-add=CONFIG_UNWINDER_FRAME_POINTER=y \ --kconfig-add=CONFIG_FTRACE_SYSCALLS=y \ --toolchain=gcc-11 \ --runtime=podman \ perf
The nbd problems persist on some Arm configurations with Clang (footbridge_defconfig, mini2440_defconfig, s3c2410_defconfig): ERROR: modpost: "__mulodi4" [drivers/block/nbd.ko] undefined!
Greetings!
Daniel Díaz daniel.diaz@linaro.org
On Fri, Sep 24, 2021 at 09:21:48AM -0500, Daniel Díaz wrote:
Hello!
On 9/24/21 7:43 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.14.8 release. There are 100 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Sep 2021 12:43:20 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.8-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
Regressions detected.
While building Perf for arm, arm64, i386 and x86, all with GCC 11, the following errors were encountered:
util/parse-events-hybrid.c: In function 'add_hw_hybrid': util/parse-events-hybrid.c:84:17: error: implicit declaration of function 'copy_config_terms'; did you mean 'perf_pmu__config_terms'? [-Werror=implicit-function-declaration] 84 | copy_config_terms(&terms, config_terms); | ^~~~~~~~~~~~~~~~~ | perf_pmu__config_terms util/parse-events-hybrid.c:88:17: error: implicit declaration of function 'free_config_terms'; did you mean 'perf_pmu__config_terms'? [-Werror=implicit-function-declaration] 88 | free_config_terms(&terms); | ^~~~~~~~~~~~~~~~~ | perf_pmu__config_terms cc1: all warnings being treated as errors
To reproduce this build locally (for instance): tuxmake \ --target-arch=x86_64 \ --kconfig=defconfig \ --kconfig-add=https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... \ --kconfig-add=https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... \ --kconfig-add=https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... \ --kconfig-add=https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... \ --kconfig-add=https://raw.githubusercontent.com/Linaro/meta-lkft/sumo/recipes-kernel/linux... \ --kconfig-add=CONFIG_IGB=y \ --kconfig-add=CONFIG_UNWINDER_FRAME_POINTER=y \ --kconfig-add=CONFIG_FTRACE_SYSCALLS=y \ --toolchain=gcc-11 \ --runtime=podman \ perf
Thanks, found this and will drop the perf patch that was causing it.
greg k-h
On Fri, 24 Sep 2021 14:43:09 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.14.8 release. There are 100 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Sep 2021 12:43:20 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.8-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v5.14: 10 builds: 10 pass, 0 fail 28 boots: 28 pass, 0 fail 114 tests: 114 pass, 0 fail
Linux version: 5.14.8-rc1-g565376331ac7 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On Fri, 24 Sep 2021 14:43:09 +0200, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.14.8 release. There are 100 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Sep 2021 12:43:20 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.8-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
5.14.8-rc1 Successfully Compiled and booted on my Raspberry PI 4b (8g) (bcm2711)
Tested-by: Fox Chen foxhlchen@gmail.com
On 9/24/21 6:43 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.14.8 release. There are 100 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Sep 2021 12:43:20 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.14.8-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.14.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
linux-stable-mirror@lists.linaro.org