This is the start of the stable review cycle for the 6.1.158 release. There are 157 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 29 Oct 2025 18:34:15 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.158-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.1.158-rc1
Mark Rutland mark.rutland@arm.com arm64: errata: Apply workarounds for Neoverse-V3AE
Mark Rutland mark.rutland@arm.com arm64: cputype: Add Neoverse-V3AE definitions
Leon Hwang leon.hwang@linux.dev Revert "selftests: mm: fix map_hugetlb failure on 64K page size systems"
Jakub Acs acsjakub@amazon.de mm/ksm: fix flag-dropping behavior in ksm_madvise
Vineeth Vijayan vneethv@linux.ibm.com s390/cio: Update purge function to unregister the unused subchannels
Namjae Jeon linkinjeon@kernel.org ksmbd: browse interfaces list on FSCTL_QUERY_INTERFACE_INFO IOCTL
Babu Moger babu.moger@amd.com x86/resctrl: Fix miscount of bandwidth event when reactivating previously unavailable RMID
Maarten Lankhorst dev@lankhorst.se devcoredump: Fix circular locking dependency with devcd->mutex.
Darrick J. Wong djwong@kernel.org xfs: always warn about deprecated mount options
Kaushlendra Kumar kaushlendra.kumar@intel.com arch_topology: Fix incorrect error check in topology_parse_cpu_capacity()
Devarsh Thakkar devarsht@ti.com phy: cadence: cdns-dphy: Update calibration wait time for startup state machine
Jedrzej Jagielski jedrzej.jagielski@intel.com ixgbevf: fix mailbox API compatibility by negotiating supported features
Jedrzej Jagielski jedrzej.jagielski@intel.com ixgbevf: fix getting link speed data for E610 devices
Piotr Kwapulinski piotr.kwapulinski@intel.com ixgbevf: Add support for Intel(R) E610 device
Piotr Kwapulinski piotr.kwapulinski@intel.com PCI: Add PCI_VDEVICE_SUB helper macro
Jaegeuk Kim jaegeuk@kernel.org f2fs: fix wrong block mapping for multi-devices
Christoph Hellwig hch@lst.de f2fs: factor a f2fs_map_blocks_cached helper
Christoph Hellwig hch@lst.de f2fs: remove the create argument to f2fs_map_blocks
Christoph Hellwig hch@lst.de f2fs: add a f2fs_get_block_locked helper
Niklas Cassel cassel@kernel.org PCI: tegra194: Reset BARs when running in PCIe endpoint mode
Tvrtko Ursulin tvrtko.ursulin@igalia.com drm/sched: Fix potential double free in drm_sched_job_add_resv_dependencies
Theodore Ts'o tytso@mit.edu ext4: avoid potential buffer over-read in parse_apply_sb_mount_options()
Chuck Lever chuck.lever@oracle.com NFSD: Define a proc_layoutcommit for the FlexFiles layout type
Jan Kara jack@suse.cz vfs: Don't leak disconnected dentries on umount
Sergey Bashirov sergeybashirov@gmail.com NFSD: Fix last write offset handling in layoutcommit
Sergey Bashirov sergeybashirov@gmail.com NFSD: Minor cleanup in layoutcommit processing
Sergey Bashirov sergeybashirov@gmail.com NFSD: Rework encoding and decoding of nfsd4_deviceid
Siddharth Vadapalli s-vadapalli@ti.com PCI: j721e: Fix programming sequence of "strap" settings
Siddharth Vadapalli s-vadapalli@ti.com PCI: j721e: Enable ACSPCIE Refclk if "ti,syscon-acspcie-proxy-ctrl" exists
Catalin Marinas catalin.marinas@arm.com arm64: mte: Do not flag the zero page as PG_mte_tagged
Darrick J. Wong djwong@kernel.org fuse: fix livelock in synchronous file put from fuseblk workers
Amir Goldstein amir73il@gmail.com fuse: allocate ff->release_args only if release is needed
Xiao Liang shaw.leon@gmail.com padata: Reset next CPU when reorder sequence wraps around
Sean Nyekjaer sean@geanix.com iio: imu: inv_icm42600: Avoid configuring if already pm_runtime suspended
David Lechner dlechner@baylibre.com iio: imu: inv_icm42600: use = { } instead of memset()
Sean Nyekjaer sean@geanix.com iio: imu: inv_icm42600: Simplify pm_runtime setup
Bence Csókás csokas.bence@prolan.hu PM: runtime: Add new devm functions
Devarsh Thakkar devarsht@ti.com phy: cadence: cdns-dphy: Fix PLL lock and O_CMN_READY polling
Tomi Valkeinen tomi.valkeinen@ideasonboard.com phy: cdns-dphy: Store hs_clk_rate and return it
Christoph Hellwig hch@lst.de xfs: fix log CRC mismatches between i386 and other architectures
Christoph Hellwig hch@lst.de xfs: rename the old_crc variable in xlog_recover_process
Florian Eckert fe@dev.tdt.de serial: 8250_exar: add support for Advantech 2 port card with Device ID 0x0018
Artem Shimko a.shimko.dev@gmail.com serial: 8250_dw: handle reset control deassert error
Victoria Votokina Victoria.Votokina@kaspersky.com most: usb: hdm_probe: Fix calling put_device() before device initialization
Victoria Votokina Victoria.Votokina@kaspersky.com most: usb: Fix use-after-free in hdm_disconnect
Junhao Xie bigfoot@radxa.com misc: fastrpc: Fix dma_buf object leak in fastrpc_map_lookup
Alexander Usyskin alexander.usyskin@intel.com mei: me: add wildcat lake P DID
Deepanshu Kartikey kartikey406@gmail.com comedi: fix divide-by-zero in comedi_buf_munge()
Alice Ryhl aliceryhl@google.com binder: remove "invalid inc weak" check
Mathias Nyman mathias.nyman@linux.intel.com xhci: dbc: enable back DbC in resume if it was enabled before suspend
Andrey Konovalov andreyknvl@gmail.com usb: raw-gadget: do not limit transfer length
Tim Guttzeit t.guttzeit@tuxedocomputers.com usb/core/quirks: Add Huawei ME906S to wakeup quirk
LI Qingwu Qing-wu.Li@leica-geosystems.com.cn USB: serial: option: add Telit FN920C04 ECM compositions
Reinhard Speyerer rspmn@arcor.de USB: serial: option: add Quectel RG255C
Renjun Wang renjunw0@foxmail.com USB: serial: option: add UNISOC UIS7720
Alok Tiwari alok.a.tiwari@oracle.com io_uring: correct __must_hold annotation in io_install_fixed_file
Anup Patel apatel@ventanamicro.com RISC-V: Don't print details of CPUs disabled in DT
Anup Patel apatel@ventanamicro.com RISC-V: Define pgprot_dmacoherent() for non-coherent devices
Matthieu Baerts (NGI0) matttbe@kernel.org selftests: mptcp: join: mark implicit tests as skipped if not supported
Matthieu Baerts (NGI0) matttbe@kernel.org selftests: mptcp: join: mark 'flush re-add' as skipped if not supported
Lad Prabhakar prabhakar.mahadev-lad.rj@bp.renesas.com net: ravb: Ensure memory write completes before ringing TX doorbell
Lad Prabhakar prabhakar.mahadev-lad.rj@bp.renesas.com net: ravb: Enforce descriptor type ordering
Michal Pecio michal.pecio@gmail.com net: usb: rtl8150: Fix frame padding
Sebastian Reichel sebastian.reichel@collabora.com net: stmmac: dwmac-rk: Fix disabling set_clock_selection
Stefano Garzarella sgarzare@redhat.com vsock: fix lock inversion in vsock_assign_transport()
Deepanshu Kartikey kartikey406@gmail.com ocfs2: clear extent cache after moving/defragmenting extents
Maciej W. Rozycki macro@orcam.me.uk MIPS: Malta: Fix keyboard resource preventing i8042 driver from registering
Marc Kleine-Budde mkl@pengutronix.de can: netlink: can_changelink(): allow disabling of automatic restart
Xi Ruoyao xry111@xry111.site ACPICA: Work around bogus -Wstringop-overread warning since GCC 11
Rafael J. Wysocki rafael.j.wysocki@intel.com Revert "cpuidle: menu: Avoid discarding useful information"
Tonghao Zhang tonghao@bamaicloud.com net: bonding: fix possible peer notify event loss or dup issue
Alexey Simakov bigalex934@gmail.com sctp: avoid NULL dereference when chunk data buffer is missing
Huang Ying ying.huang@linux.alibaba.com arm64, mm: avoid always making PTE dirty in pte_mkwrite()
Ioana Ciornei ioana.ciornei@nxp.com dpaa2-eth: fix the pointer passed to PTR_ALIGN on Tx path
Wei Fang wei.fang@nxp.com net: enetc: correct the value of ENETC_RXB_TRUESIZE
Johannes Wiesböck johannes.wiesboeck@aisec.fraunhofer.de rtnetlink: Allow deleting FDB entries in user namespace
Nathan Chancellor nathan@kernel.org net/mlx5e: Return 1 instead of 0 in invalid case in mlx5e_mpwrq_umr_entry_size()
Stefan Metzmacher metze@samba.org smb: server: let smb_direct_flush_send_list() invalidate a remote key first
Christophe Leroy christophe.leroy@csgroup.eu powerpc/32: Remove PAGE_KERNEL_TEXT to fix startup failure
Geert Uytterhoeven geert@linux-m68k.org m68k: bitops: Fix find_*_bit() signatures
Junjie Cao junjie.cao@intel.com lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
Yangtao Li frank.li@vivo.com hfsplus: return EIO when type of hidden directory mismatch in hfsplus_fill_super()
Viacheslav Dubeyko slava@dubeyko.com hfs: fix KMSAN uninit-value issue in hfs_find_set_zero_bits()
Alexander Aring aahringo@redhat.com dlm: check for defined force value in dlm_lockspace_release
Viacheslav Dubeyko slava@dubeyko.com hfsplus: fix KMSAN uninit-value issue in hfsplus_delete_cat()
Yang Chenzhi yang.chenzhi@vivo.com hfs: validate record offset in hfsplus_bmap_alloc
Viacheslav Dubeyko slava@dubeyko.com hfsplus: fix KMSAN uninit-value issue in __hfsplus_ext_cache_extent()
Viacheslav Dubeyko slava@dubeyko.com hfs: make proper initalization of struct hfs_find_data
Viacheslav Dubeyko slava@dubeyko.com hfs: clear offset and space out of valid records in b-tree node
Simon Schuster schuster.simon@siemens-energy.com nios2: ensure that memblock.current_limit is set when setting pfn limits
Xichao Zhao zhao.xichao@vivo.com exec: Fix incorrect type for ret
Brian Norris briannorris@google.com PCI/sysfs: Ensure devices are powered for config reads (part 2)
Viacheslav Dubeyko slava@dubeyko.com hfsplus: fix slab-out-of-bounds read in hfsplus_strcasecmp()
Thadeu Lima de Souza Cascardo cascardo@igalia.com HID: multitouch: fix name of Stylus input devices
Dmitry Torokhov dmitry.torokhov@gmail.com HID: hid-input: only ignore 0 battery events for digitizers
Jiaming Zhang r772577952@gmail.com ALSA: usb-audio: Fix NULL pointer deference in try_to_register_card
Randy Dunlap rdunlap@infradead.org ALSA: firewire: amdtp-stream: fix enum kernel-doc warnings
Vincent Guittot vincent.guittot@linaro.org sched/fair: Fix pelt lost idle time detection
Ingo Molnar mingo@kernel.org sched/balancing: Rename newidle_balance() => sched_balance_newidle()
Alok Tiwari alok.a.tiwari@oracle.com drm/rockchip: vop2: use correct destination rectangle height check
Timur Kristóf timur.kristof@gmail.com drm/amd/powerplay: Fix CIK shutdown temperature
Cristian Ciocaltea cristian.ciocaltea@collabora.com ASoC: nau8821: Add DMI quirk to bypass jack debounce circuit
Cristian Ciocaltea cristian.ciocaltea@collabora.com ASoC: nau8821: Generalize helper to clear IRQ status
Cristian Ciocaltea cristian.ciocaltea@collabora.com ASoC: nau8821: Cancel jdet_work before handling jack ejection
Marek Vasut marek.vasut@mailbox.org drm/bridge: lt9211: Drop check for last nibble of version register
Fabian Vogt fvogt@suse.de riscv: kprobes: Fix probe address validation
I Viswanath viswanathiyyappan@gmail.com net: usb: lan78xx: fix use of improperly initialized dev->chipid in lan78xx_reset
Oleksij Rempel linux@rempel-privat.de net: usb: lan78xx: Add error handling to lan78xx_init_mac_address
Sabrina Dubroca sd@queasysnail.net tls: don't rely on tx_work during send()
Sabrina Dubroca sd@queasysnail.net tls: wait for pending async decryptions if tls_strp_msg_hold fails
Sabrina Dubroca sd@queasysnail.net tls: always set record_type in tls_process_cmsg
Sabrina Dubroca sd@queasysnail.net tls: wait for async encrypt in case of error during latter iterations of sendmsg
Sascha Hauer s.hauer@pengutronix.de net: tls: wait for async completion on last message
Alexey Simakov bigalex934@gmail.com tg3: prevent use of uninitialized remote_adv and local_adv variables
Eric Dumazet edumazet@google.com tcp: fix tcp_tso_should_defer() vs large RTT
Raju Rangoju Raju.Rangoju@amd.com amd-xgbe: Avoid spurious link down messages during interface toggle
Dmitry Safonov 0x7f454c46@gmail.com net/ip6_tunnel: Prevent perpetual tunnel growth
Linmao Li lilinmao@kylinos.cn r8169: fix packet truncation after S4 resume on RTL8168H/RTL8111H
Nicolas Dichtel nicolas.dichtel@6wind.com doc: fix seg6_flowlabel path
Yeounsu Moon yyyynoom@gmail.com net: dlink: handle dma_map_single() failure properly
Marc Kleine-Budde mkl@pengutronix.de can: m_can: m_can_plat_remove(): add missing pm_runtime_disable()
Yuezhang Mo Yuezhang.Mo@sony.com dax: skip read lock assertion for read-only filesystems
Benjamin Tissoires bentiss@kernel.org HID: multitouch: fix sticky fingers
Thomas Gleixner tglx@linutronix.de Bluetooth: hci_qca: Fix the teardown problem for real
Steven Rostedt (Google) rostedt@goodmis.org timers: Update the documentation to reflect on the new timer_shutdown() API
Thomas Gleixner tglx@linutronix.de timers: Provide timer_shutdown[_sync]()
Thomas Gleixner tglx@linutronix.de timers: Add shutdown mechanism to the internal functions
Thomas Gleixner tglx@linutronix.de timers: Split [try_to_]del_timer[_sync]() to prepare for shutdown mode
Thomas Gleixner tglx@linutronix.de timers: Silently ignore timers with a NULL function
Thomas Gleixner tglx@linutronix.de Documentation: Replace del_timer/del_timer_sync()
Thomas Gleixner tglx@linutronix.de timers: Replace BUG_ON()s
Steven Rostedt (Google) rostedt@goodmis.org clocksource/drivers/sp804: Do not use timer namespace for timer_shutdown() function
Steven Rostedt (Google) rostedt@goodmis.org clocksource/drivers/arm_arch_timer: Do not use timer namespace for timer_shutdown() function
Steven Rostedt (Google) rostedt@goodmis.org ARM: spear: Do not use timer namespace for timer_shutdown() function
Thomas Gleixner tglx@linutronix.de Documentation: Remove bogus claim about del_timer_sync()
Kuen-Han Tsai khtsai@google.com usb: gadget: f_ncm: Refactor bind path to use __free()
Kuen-Han Tsai khtsai@google.com usb: gadget: f_acm: Refactor bind path to use __free()
Kuen-Han Tsai khtsai@google.com usb: gadget: f_ecm: Refactor bind path to use __free()
Kuen-Han Tsai khtsai@google.com usb: gadget: f_rndis: Refactor bind path to use __free()
Kuen-Han Tsai khtsai@google.com usb: gadget: Introduce free_usb_request helper
Kuen-Han Tsai khtsai@google.com usb: gadget: Store endpoint pointer in usb_request
Kaustabh Chakraborty kauschluss@disroot.org drm/exynos: exynos7_drm_decon: remove ctx->suspended
Kaustabh Chakraborty kauschluss@disroot.org drm/exynos: exynos7_drm_decon: properly clear channels during bind
Kaustabh Chakraborty kauschluss@disroot.org drm/exynos: exynos7_drm_decon: fix uninitialized crtc reference in functions
Marek Vasut marek.vasut+renesas@mailbox.org drm/rcar-du: dsi: Fix 1/2/3 lane support
Rafael J. Wysocki rafael.j.wysocki@intel.com cpufreq: CPPC: Avoid using CPUFREQ_ETERNAL as transition delay
Thomas Fourier fourier.thomas@gmail.com crypto: rockchip - Fix dma_unmap_sg() nents value
Mario Limonciello mario.limonciello@amd.com drm/amd: Check whether secure display TA loaded successfully
Gui-Dong Han hanguidong02@gmail.com drm/amdgpu: use atomic functions with memory barriers for vm fault info
Eugene Korenevsky ekorenevsky@aliyun.com cifs: parse_dfs_referrals: prevent oob on malformed input
Filipe Manana fdmanana@suse.com btrfs: do not assert we found block group item when creating free space tree
Filipe Manana fdmanana@suse.com btrfs: fix clearing of BTRFS_FS_RELOC_RUNNING if relocation already running
Deepanshu Kartikey kartikey406@gmail.com ext4: detect invalid INLINE_DATA + EXTENTS flag combination
Zhang Yi yi.zhang@huawei.com ext4: wait for ongoing I/O to complete before freeing blocks
Zhang Yi yi.zhang@huawei.com jbd2: ensure that all ongoing I/O complete before freeing blocks
Yi Cong yicong@kylinos.cn r8152: add error handling in rtl8152_driver_init
Shuhao Fu sfual@cse.ust.hk smb: client: Fix refcount leak for cifs_sb_tlink
-------------
Diffstat:
.../RCU/Design/Requirements/Requirements.rst | 2 +- Documentation/arm64/silicon-errata.rst | 2 + Documentation/core-api/local_ops.rst | 2 +- Documentation/kernel-hacking/locking.rst | 17 +- Documentation/networking/seg6-sysctl.rst | 3 + Documentation/timers/hrtimers.rst | 2 +- .../translations/it_IT/kernel-hacking/locking.rst | 14 +- .../translations/zh_CN/core-api/local_ops.rst | 2 +- Makefile | 4 +- arch/arm/mach-spear/time.c | 8 +- arch/arm64/Kconfig | 1 + arch/arm64/include/asm/cputype.h | 2 + arch/arm64/include/asm/pgtable.h | 3 +- arch/arm64/kernel/cpu_errata.c | 1 + arch/arm64/kernel/cpufeature.c | 10 +- arch/arm64/kernel/mte.c | 2 +- arch/m68k/include/asm/bitops.h | 25 +- arch/mips/mti-malta/malta-setup.c | 2 +- arch/nios2/kernel/setup.c | 15 + arch/powerpc/include/asm/pgtable.h | 12 - arch/powerpc/mm/book3s32/mmu.c | 4 +- arch/powerpc/mm/pgtable_32.c | 2 +- arch/riscv/include/asm/pgtable.h | 2 + arch/riscv/kernel/cpu.c | 4 +- arch/riscv/kernel/probes/kprobes.c | 13 +- arch/x86/kernel/cpu/resctrl/monitor.c | 12 +- drivers/acpi/acpica/tbprint.c | 6 + drivers/android/binder.c | 11 +- drivers/base/arch_topology.c | 2 +- drivers/base/devcoredump.c | 138 +++++---- drivers/base/power/runtime.c | 44 +++ drivers/bluetooth/hci_qca.c | 10 +- drivers/clocksource/arm_arch_timer.c | 12 +- drivers/clocksource/timer-sp804.c | 6 +- drivers/comedi/comedi_buf.c | 2 +- drivers/cpufreq/cppc_cpufreq.c | 14 +- drivers/cpuidle/governors/menu.c | 21 +- drivers/crypto/rockchip/rk3288_crypto_ahash.c | 3 +- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 5 +- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 7 +- drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 7 +- .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 3 +- drivers/gpu/drm/bridge/lontium-lt9211.c | 3 +- drivers/gpu/drm/exynos/exynos7_drm_decon.c | 98 +++--- drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c | 5 +- drivers/gpu/drm/rcar-du/rcar_mipi_dsi_regs.h | 8 +- drivers/gpu/drm/rockchip/rockchip_drm_vop2.c | 2 +- drivers/gpu/drm/scheduler/sched_main.c | 13 +- drivers/hid/hid-input.c | 5 +- drivers/hid/hid-multitouch.c | 28 +- drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c | 5 +- drivers/iio/imu/inv_icm42600/inv_icm42600_core.c | 35 +-- drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c | 5 +- drivers/misc/fastrpc.c | 2 + drivers/misc/lkdtm/fortify.c | 6 + drivers/misc/mei/hw-me-regs.h | 2 + drivers/misc/mei/pci-me.c | 2 + drivers/most/most_usb.c | 13 +- drivers/net/bonding/bond_main.c | 40 ++- drivers/net/can/dev/netlink.c | 6 +- drivers/net/can/m_can/m_can_platform.c | 2 +- drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 1 - drivers/net/ethernet/amd/xgbe/xgbe-mdio.c | 1 + drivers/net/ethernet/broadcom/tg3.c | 5 +- drivers/net/ethernet/dlink/dl2k.c | 23 +- drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 3 +- drivers/net/ethernet/freescale/enetc/enetc.h | 2 +- drivers/net/ethernet/intel/ixgbevf/defines.h | 6 +- drivers/net/ethernet/intel/ixgbevf/ipsec.c | 10 + drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 13 +- drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 46 ++- drivers/net/ethernet/intel/ixgbevf/mbx.h | 8 + drivers/net/ethernet/intel/ixgbevf/vf.c | 194 ++++++++++-- drivers/net/ethernet/intel/ixgbevf/vf.h | 5 +- .../net/ethernet/mellanox/mlx5/core/en/params.c | 2 +- drivers/net/ethernet/realtek/r8169_main.c | 5 +- drivers/net/ethernet/renesas/ravb_main.c | 24 +- drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c | 9 +- drivers/net/usb/lan78xx.c | 38 ++- drivers/net/usb/r8152.c | 7 +- drivers/net/usb/rtl8150.c | 11 +- drivers/pci/controller/cadence/pci-j721e.c | 64 +++- drivers/pci/controller/dwc/pcie-tegra194.c | 10 + drivers/pci/pci-sysfs.c | 10 +- drivers/phy/cadence/cdns-dphy.c | 133 ++++++-- drivers/s390/cio/device.c | 37 ++- drivers/tty/serial/8250/8250_dw.c | 4 +- drivers/tty/serial/8250/8250_exar.c | 11 + drivers/usb/core/quirks.c | 2 + drivers/usb/gadget/function/f_acm.c | 42 ++- drivers/usb/gadget/function/f_ecm.c | 48 ++- drivers/usb/gadget/function/f_ncm.c | 78 ++--- drivers/usb/gadget/function/f_rndis.c | 85 +++--- drivers/usb/gadget/legacy/raw_gadget.c | 2 - drivers/usb/gadget/udc/core.c | 3 + drivers/usb/host/xhci-dbgcap.c | 9 +- drivers/usb/serial/option.c | 10 + fs/btrfs/free-space-tree.c | 15 +- fs/btrfs/relocation.c | 13 +- fs/dax.c | 2 +- fs/dcache.c | 2 + fs/dlm/lockspace.c | 2 +- fs/exec.c | 2 +- fs/ext4/ext4_jbd2.c | 11 +- fs/ext4/inode.c | 8 + fs/ext4/super.c | 17 +- fs/f2fs/data.c | 108 ++++--- fs/f2fs/f2fs.h | 6 +- fs/f2fs/file.c | 16 +- fs/fuse/dir.c | 2 +- fs/fuse/file.c | 75 +++-- fs/fuse/fuse_i.h | 2 +- fs/hfs/bfind.c | 8 +- fs/hfs/brec.c | 27 +- fs/hfs/mdb.c | 2 +- fs/hfsplus/bfind.c | 8 +- fs/hfsplus/bnode.c | 41 --- fs/hfsplus/btree.c | 6 + fs/hfsplus/hfsplus_fs.h | 42 +++ fs/hfsplus/super.c | 25 +- fs/hfsplus/unicode.c | 24 ++ fs/jbd2/transaction.c | 13 +- fs/nfsd/blocklayout.c | 5 +- fs/nfsd/blocklayoutxdr.c | 7 +- fs/nfsd/flexfilelayout.c | 8 + fs/nfsd/flexfilelayoutxdr.c | 3 +- fs/nfsd/nfs4layouts.c | 1 - fs/nfsd/nfs4proc.c | 34 +-- fs/nfsd/nfs4xdr.c | 14 +- fs/nfsd/xdr4.h | 36 ++- fs/ocfs2/move_extents.c | 5 + fs/smb/client/inode.c | 6 +- fs/smb/client/misc.c | 17 ++ fs/smb/client/smb2ops.c | 8 +- fs/smb/server/ksmbd_netlink.h | 3 +- fs/smb/server/server.h | 1 + fs/smb/server/smb2pdu.c | 4 + fs/smb/server/transport_ipc.c | 1 + fs/smb/server/transport_rdma.c | 11 +- fs/smb/server/transport_tcp.c | 67 ++--- fs/smb/server/transport_tcp.h | 1 + fs/xfs/libxfs/xfs_log_format.h | 30 +- fs/xfs/xfs_log.c | 8 +- fs/xfs/xfs_log_priv.h | 4 +- fs/xfs/xfs_log_recover.c | 34 ++- fs/xfs/xfs_ondisk.h | 2 + fs/xfs/xfs_super.c | 33 +- include/linux/cpufreq.h | 3 + include/linux/mm.h | 2 +- include/linux/pci.h | 14 + include/linux/pm_runtime.h | 4 + include/linux/timer.h | 2 + include/linux/usb/gadget.h | 25 ++ include/net/ip_tunnels.h | 15 + include/trace/events/f2fs.h | 11 +- io_uring/filetable.c | 2 +- kernel/padata.c | 6 +- kernel/sched/fair.c | 38 +-- kernel/time/timer.c | 335 ++++++++++++++++----- net/core/rtnetlink.c | 3 - net/ipv4/ip_tunnel.c | 14 - net/ipv4/tcp_output.c | 19 +- net/ipv6/ip6_tunnel.c | 3 +- net/sctp/inqueue.c | 13 +- net/tls/tls_main.c | 7 +- net/tls/tls_sw.c | 28 +- net/vmw_vsock/af_vsock.c | 38 +-- rust/bindings/bindings_helper.h | 2 + rust/bindings/lib.rs | 1 + sound/firewire/amdtp-stream.h | 2 +- sound/soc/codecs/nau8821.c | 53 +++- sound/usb/card.c | 10 +- tools/testing/selftests/net/mptcp/mptcp_join.sh | 6 +- tools/testing/selftests/vm/map_hugetlb.c | 7 - 175 files changed, 2099 insertions(+), 1084 deletions(-)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuhao Fu sfual@cse.ust.hk
commit c2b77f42205ef485a647f62082c442c1cd69d3fc upstream.
Fix three refcount inconsistency issues related to `cifs_sb_tlink`.
Comments for `cifs_sb_tlink` state that `cifs_put_tlink()` needs to be called after successful calls to `cifs_sb_tlink()`. Three calls fail to update refcount accordingly, leading to possible resource leaks.
Fixes: 8ceb98437946 ("CIFS: Move rename to ops struct") Fixes: 2f1afe25997f ("cifs: Use smb 2 - 3 and cifsacl mount options getacl functions") Fixes: 366ed846df60 ("cifs: Use smb 2 - 3 and cifsacl mount options setacl function") Cc: stable@vger.kernel.org Signed-off-by: Shuhao Fu sfual@cse.ust.hk Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/inode.c | 6 ++++-- fs/smb/client/smb2ops.c | 8 ++++---- 2 files changed, 8 insertions(+), 6 deletions(-)
--- a/fs/smb/client/inode.c +++ b/fs/smb/client/inode.c @@ -2106,8 +2106,10 @@ cifs_do_rename(const unsigned int xid, s tcon = tlink_tcon(tlink); server = tcon->ses->server;
- if (!server->ops->rename) - return -ENOSYS; + if (!server->ops->rename) { + rc = -ENOSYS; + goto do_rename_exit; + }
/* try path-based rename first */ rc = server->ops->rename(xid, tcon, from_path, to_path, cifs_sb); --- a/fs/smb/client/smb2ops.c +++ b/fs/smb/client/smb2ops.c @@ -3323,8 +3323,7 @@ get_smb2_acl_by_path(struct cifs_sb_info utf16_path = cifs_convert_path_to_utf16(path, cifs_sb); if (!utf16_path) { rc = -ENOMEM; - free_xid(xid); - return ERR_PTR(rc); + goto put_tlink; }
oparms = (struct cifs_open_parms) { @@ -3356,6 +3355,7 @@ get_smb2_acl_by_path(struct cifs_sb_info SMB2_close(xid, tcon, fid.persistent_fid, fid.volatile_fid); }
+put_tlink: cifs_put_tlink(tlink); free_xid(xid);
@@ -3396,8 +3396,7 @@ set_smb2_acl(struct cifs_ntsd *pnntsd, _ utf16_path = cifs_convert_path_to_utf16(path, cifs_sb); if (!utf16_path) { rc = -ENOMEM; - free_xid(xid); - return rc; + goto put_tlink; }
oparms = (struct cifs_open_parms) { @@ -3418,6 +3417,7 @@ set_smb2_acl(struct cifs_ntsd *pnntsd, _ SMB2_close(xid, tcon, fid.persistent_fid, fid.volatile_fid); }
+put_tlink: cifs_put_tlink(tlink); free_xid(xid); return rc;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yi Cong yicong@kylinos.cn
commit 75527d61d60d493d1eb064f335071a20ca581f54 upstream.
rtl8152_driver_init() is missing the error handling. When rtl8152_driver registration fails, rtl8152_cfgselector_driver should be deregistered.
Fixes: ec51fbd1b8a2 ("r8152: add USB device driver for config selection") Cc: stable@vger.kernel.org Signed-off-by: Yi Cong yicong@kylinos.cn Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20251011082415.580740-1-yicongsrfy@163.com [pabeni@redhat.com: clarified the commit message] Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/r8152.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -9952,7 +9952,12 @@ static int __init rtl8152_driver_init(vo ret = usb_register_device_driver(&rtl8152_cfgselector_driver, THIS_MODULE); if (ret) return ret; - return usb_register(&rtl8152_driver); + + ret = usb_register(&rtl8152_driver); + if (ret) + usb_deregister_device_driver(&rtl8152_cfgselector_driver); + + return ret; }
static void __exit rtl8152_driver_exit(void)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Yi yi.zhang@huawei.com
commit 3c652c3a71de1d30d72dc82c3bead8deb48eb749 upstream.
When releasing file system metadata blocks in jbd2_journal_forget(), if this buffer has not yet been checkpointed, it may have already been written back, currently be in the process of being written back, or has not yet written back. jbd2_journal_forget() calls jbd2_journal_try_remove_checkpoint() to check the buffer's status and add it to the current transaction if it has not been written back. This buffer can only be reallocated after the transaction is committed.
jbd2_journal_try_remove_checkpoint() attempts to lock the buffer and check its dirty status while holding the buffer lock. If the buffer has already been written back, everything proceeds normally. However, there are two issues. First, the function returns immediately if the buffer is locked by the write-back process. It does not wait for the write-back to complete. Consequently, until the current transaction is committed and the block is reallocated, there is no guarantee that the I/O will complete. This means that ongoing I/O could write stale metadata to the newly allocated block, potentially corrupting data. Second, the function unlocks the buffer as soon as it detects that the buffer is still dirty. If a concurrent write-back occurs immediately after this unlocking and before clear_buffer_dirty() is called in jbd2_journal_forget(), data corruption can theoretically still occur.
Although these two issues are unlikely to occur in practice since the undergoing metadata writeback I/O does not take this long to complete, it's better to explicitly ensure that all ongoing I/O operations are completed.
Fixes: 597599268e3b ("jbd2: discard dirty data when forgetting an un-journalled buffer") Cc: stable@kernel.org Suggested-by: Jan Kara jack@suse.cz Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Message-ID: 20250916093337.3161016-2-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/jbd2/transaction.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
--- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -1668,6 +1668,7 @@ int jbd2_journal_forget(handle_t *handle int drop_reserve = 0; int err = 0; int was_modified = 0; + int wait_for_writeback = 0;
if (is_handle_aborted(handle)) return -EROFS; @@ -1791,18 +1792,22 @@ int jbd2_journal_forget(handle_t *handle }
/* - * The buffer is still not written to disk, we should - * attach this buffer to current transaction so that the - * buffer can be checkpointed only after the current - * transaction commits. + * The buffer has not yet been written to disk. We should + * either clear the buffer or ensure that the ongoing I/O + * is completed, and attach this buffer to current + * transaction so that the buffer can be checkpointed only + * after the current transaction commits. */ clear_buffer_dirty(bh); + wait_for_writeback = 1; __jbd2_journal_file_buffer(jh, transaction, BJ_Forget); spin_unlock(&journal->j_list_lock); } drop: __brelse(bh); spin_unlock(&jh->b_state_lock); + if (wait_for_writeback) + wait_on_buffer(bh); jbd2_journal_put_journal_head(jh); if (drop_reserve) { /* no need to reserve log space for this block -bzzz */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Yi yi.zhang@huawei.com
commit 328a782cb138029182e521c08f50eb1587db955d upstream.
When freeing metadata blocks in nojournal mode, ext4_forget() calls bforget() to clear the dirty flag on the buffer_head and remvoe associated mappings. This is acceptable if the metadata has not yet begun to be written back. However, if the write-back has already started but is not yet completed, ext4_forget() will have no effect. Subsequently, ext4_mb_clear_bb() will immediately return the block to the mb allocator. This block can then be reallocated immediately, potentially causing an data corruption issue.
Fix this by clearing the buffer's dirty flag and waiting for the ongoing I/O to complete, ensuring that no further writes to stale data will occur.
Fixes: 16e08b14a455 ("ext4: cleanup clean_bdev_aliases() calls") Cc: stable@kernel.org Reported-by: Gao Xiang hsiangkao@linux.alibaba.com Closes: https://lore.kernel.org/linux-ext4/a9417096-9549-4441-9878-b1955b899b4e@huaw... Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Message-ID: 20250916093337.3161016-3-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/ext4_jbd2.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/fs/ext4/ext4_jbd2.c +++ b/fs/ext4/ext4_jbd2.c @@ -271,9 +271,16 @@ int __ext4_forget(const char *where, uns bh, is_metadata, inode->i_mode, test_opt(inode->i_sb, DATA_FLAGS));
- /* In the no journal case, we can just do a bforget and return */ + /* + * In the no journal case, we should wait for the ongoing buffer + * to complete and do a forget. + */ if (!ext4_handle_valid(handle)) { - bforget(bh); + if (bh) { + clear_buffer_dirty(bh); + wait_on_buffer(bh); + __bforget(bh); + } return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Deepanshu Kartikey kartikey406@gmail.com
commit 1d3ad183943b38eec2acf72a0ae98e635dc8456b upstream.
syzbot reported a BUG_ON in ext4_es_cache_extent() when opening a verity file on a corrupted ext4 filesystem mounted without a journal.
The issue is that the filesystem has an inode with both the INLINE_DATA and EXTENTS flags set:
EXT4-fs error (device loop0): ext4_cache_extents:545: inode #15: comm syz.0.17: corrupted extent tree: lblk 0 < prev 66
Investigation revealed that the inode has both flags set: DEBUG: inode 15 - flag=1, i_inline_off=164, has_inline=1, extents_flag=1
This is an invalid combination since an inode should have either: - INLINE_DATA: data stored directly in the inode - EXTENTS: data stored in extent-mapped blocks
Having both flags causes ext4_has_inline_data() to return true, skipping extent tree validation in __ext4_iget(). The unvalidated out-of-order extents then trigger a BUG_ON in ext4_es_cache_extent() due to integer underflow when calculating hole sizes.
Fix this by detecting this invalid flag combination early in ext4_iget() and rejecting the corrupted inode.
Cc: stable@kernel.org Reported-and-tested-by: syzbot+038b7bf43423e132b308@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=038b7bf43423e132b308 Suggested-by: Zhang Yi yi.zhang@huawei.com Signed-off-by: Deepanshu Kartikey kartikey406@gmail.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Message-ID: 20250930112810.315095-1-kartikey406@gmail.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/inode.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4968,6 +4968,14 @@ struct inode *__ext4_iget(struct super_b } ei->i_flags = le32_to_cpu(raw_inode->i_flags); ext4_set_inode_flags(inode, true); + /* Detect invalid flag combination - can't have both inline data and extents */ + if (ext4_test_inode_flag(inode, EXT4_INODE_INLINE_DATA) && + ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) { + ext4_error_inode(inode, function, line, 0, + "inode has both inline data and extents flags"); + ret = -EFSCORRUPTED; + goto bad_inode; + } inode->i_blocks = ext4_inode_blocks(raw_inode, ei); ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl_lo); if (ext4_has_feature_64bit(sb))
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit 7e5a5983edda664e8e4bb20af17b80f5135c655c upstream.
When starting relocation, at reloc_chunk_start(), if we happen to find the flag BTRFS_FS_RELOC_RUNNING is already set we return an error (-EINPROGRESS) to the callers, however the callers call reloc_chunk_end() which will clear the flag BTRFS_FS_RELOC_RUNNING, which is wrong since relocation was started by another task and still running.
Finding the BTRFS_FS_RELOC_RUNNING flag already set is an unexpected scenario, but still our current behaviour is not correct.
Fix this by never calling reloc_chunk_end() if reloc_chunk_start() has returned an error, which is what logically makes sense, since the general widespread pattern is to have end functions called only if the counterpart start functions succeeded. This requires changing reloc_chunk_start() to clear BTRFS_FS_RELOC_RUNNING if there's a pending cancel request.
Fixes: 907d2710d727 ("btrfs: add cancellable chunk relocation support") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Boris Burkov boris@bur.io Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/relocation.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
--- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -3915,6 +3915,7 @@ out: /* * Mark start of chunk relocation that is cancellable. Check if the cancellation * has been requested meanwhile and don't start in that case. + * NOTE: if this returns an error, reloc_chunk_end() must not be called. * * Return: * 0 success @@ -3931,10 +3932,8 @@ static int reloc_chunk_start(struct btrf
if (atomic_read(&fs_info->reloc_cancel_req) > 0) { btrfs_info(fs_info, "chunk relocation canceled on start"); - /* - * On cancel, clear all requests but let the caller mark - * the end after cleanup operations. - */ + /* On cancel, clear all requests. */ + clear_and_wake_up_bit(BTRFS_FS_RELOC_RUNNING, &fs_info->flags); atomic_set(&fs_info->reloc_cancel_req, 0); return -ECANCELED; } @@ -3943,9 +3942,11 @@ static int reloc_chunk_start(struct btrf
/* * Mark end of chunk relocation that is cancellable and wake any waiters. + * NOTE: call only if a previous call to reloc_chunk_start() succeeded. */ static void reloc_chunk_end(struct btrfs_fs_info *fs_info) { + ASSERT(test_bit(BTRFS_FS_RELOC_RUNNING, &fs_info->flags)); /* Requested after start, clear bit first so any waiters can continue */ if (atomic_read(&fs_info->reloc_cancel_req) > 0) btrfs_info(fs_info, "chunk relocation canceled during operation"); @@ -4158,9 +4159,9 @@ out: if (err && rw) btrfs_dec_block_group_ro(rc->block_group); iput(rc->data_inode); + reloc_chunk_end(fs_info); out_put_bg: btrfs_put_block_group(bg); - reloc_chunk_end(fs_info); free_reloc_control(rc); return err; } @@ -4350,8 +4351,8 @@ out_clean: err = ret; out_unset: unset_reloc_control(rc); -out_end: reloc_chunk_end(fs_info); +out_end: free_reloc_control(rc); out: free_reloc_roots(&reloc_roots);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit a5a51bf4e9b7354ce7cd697e610d72c1b33fd949 upstream.
Currently, when building a free space tree at populate_free_space_tree(), if we are not using the block group tree feature, we always expect to find block group items (either extent items or a block group item with key type BTRFS_BLOCK_GROUP_ITEM_KEY) when we search the extent tree with btrfs_search_slot_for_read(), so we assert that we found an item. However this expectation is wrong since we can have a new block group created in the current transaction which is still empty and for which we still have not added the block group's item to the extent tree, in which case we do not have any items in the extent tree associated to the block group.
The insertion of a new block group's block group item in the extent tree happens at btrfs_create_pending_block_groups() when it calls the helper insert_block_group_item(). This typically is done when a transaction handle is released, committed or when running delayed refs (either as part of a transaction commit or when serving tickets for space reservation if we are low on free space).
So remove the assertion at populate_free_space_tree() even when the block group tree feature is not enabled and update the comment to mention this case.
Syzbot reported this with the following stack trace:
BTRFS info (device loop3 state M): rebuilding free space tree assertion failed: ret == 0 :: 0, in fs/btrfs/free-space-tree.c:1115 ------------[ cut here ]------------ kernel BUG at fs/btrfs/free-space-tree.c:1115! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 1 UID: 0 PID: 6352 Comm: syz.3.25 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025 RIP: 0010:populate_free_space_tree+0x700/0x710 fs/btrfs/free-space-tree.c:1115 Code: ff ff e8 d3 (...) RSP: 0018:ffffc9000430f780 EFLAGS: 00010246 RAX: 0000000000000043 RBX: ffff88805b709630 RCX: fea61d0e2e79d000 RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 RBP: ffffc9000430f8b0 R08: ffffc9000430f4a7 R09: 1ffff92000861e94 R10: dffffc0000000000 R11: fffff52000861e95 R12: 0000000000000001 R13: 1ffff92000861f00 R14: dffffc0000000000 R15: 0000000000000000 FS: 00007f424d9fe6c0(0000) GS:ffff888125afc000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd78ad212c0 CR3: 0000000076d68000 CR4: 00000000003526f0 Call Trace: <TASK> btrfs_rebuild_free_space_tree+0x1ba/0x6d0 fs/btrfs/free-space-tree.c:1364 btrfs_start_pre_rw_mount+0x128f/0x1bf0 fs/btrfs/disk-io.c:3062 btrfs_remount_rw fs/btrfs/super.c:1334 [inline] btrfs_reconfigure+0xaed/0x2160 fs/btrfs/super.c:1559 reconfigure_super+0x227/0x890 fs/super.c:1076 do_remount fs/namespace.c:3279 [inline] path_mount+0xd1a/0xfe0 fs/namespace.c:4027 do_mount fs/namespace.c:4048 [inline] __do_sys_mount fs/namespace.c:4236 [inline] __se_sys_mount+0x313/0x410 fs/namespace.c:4213 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f424e39066a Code: d8 64 89 02 (...) RSP: 002b:00007f424d9fde68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 00007f424d9fdef0 RCX: 00007f424e39066a RDX: 0000200000000180 RSI: 0000200000000380 RDI: 0000000000000000 RBP: 0000200000000180 R08: 00007f424d9fdef0 R09: 0000000000000020 R10: 0000000000000020 R11: 0000000000000246 R12: 0000200000000380 R13: 00007f424d9fdeb0 R14: 0000000000000000 R15: 00002000000002c0 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]---
Reported-by: syzbot+884dc4621377ba579a6f@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/68dc3dab.a00a0220.102ee.004e.GAE@google.... Fixes: a5ed91828518 ("Btrfs: implement the free space B-tree") CC: stable@vger.kernel.org # 6.1.x: 1961d20f6fa8: btrfs: fix assertion when building free space tree CC: stable@vger.kernel.org # 6.1.x Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/free-space-tree.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-)
--- a/fs/btrfs/free-space-tree.c +++ b/fs/btrfs/free-space-tree.c @@ -1102,14 +1102,15 @@ static int populate_free_space_tree(stru * If ret is 1 (no key found), it means this is an empty block group, * without any extents allocated from it and there's no block group * item (key BTRFS_BLOCK_GROUP_ITEM_KEY) located in the extent tree - * because we are using the block group tree feature, so block group - * items are stored in the block group tree. It also means there are no - * extents allocated for block groups with a start offset beyond this - * block group's end offset (this is the last, highest, block group). + * because we are using the block group tree feature (so block group + * items are stored in the block group tree) or this is a new block + * group created in the current transaction and its block group item + * was not yet inserted in the extent tree (that happens in + * btrfs_create_pending_block_groups() -> insert_block_group_item()). + * It also means there are no extents allocated for block groups with a + * start offset beyond this block group's end offset (this is the last, + * highest, block group). */ - if (!btrfs_fs_compat_ro(trans->fs_info, BLOCK_GROUP_TREE)) - ASSERT(ret == 0); - start = block_group->start; end = block_group->start + block_group->length; while (ret == 0) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eugene Korenevsky ekorenevsky@aliyun.com
commit 6447b0e355562a1ff748c4a2ffb89aae7e84d2c9 upstream.
Malicious SMB server can send invalid reply to FSCTL_DFS_GET_REFERRALS
- reply smaller than sizeof(struct get_dfs_referral_rsp) - reply with number of referrals smaller than NumberOfReferrals in the header
Processing of such replies will cause oob.
Return -EINVAL error on such replies to prevent oob-s.
Signed-off-by: Eugene Korenevsky ekorenevsky@aliyun.com Cc: stable@vger.kernel.org Suggested-by: Nathan Chancellor nathan@kernel.org Acked-by: Paulo Alcantara (Red Hat) pc@manguebit.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/misc.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
--- a/fs/smb/client/misc.c +++ b/fs/smb/client/misc.c @@ -866,6 +866,14 @@ parse_dfs_referrals(struct get_dfs_refer char *data_end; struct dfs_referral_level_3 *ref;
+ if (rsp_size < sizeof(*rsp)) { + cifs_dbg(VFS | ONCE, + "%s: header is malformed (size is %u, must be %zu)\n", + __func__, rsp_size, sizeof(*rsp)); + rc = -EINVAL; + goto parse_DFS_referrals_exit; + } + *num_of_nodes = le16_to_cpu(rsp->NumberOfReferrals);
if (*num_of_nodes < 1) { @@ -874,6 +882,15 @@ parse_dfs_referrals(struct get_dfs_refer rc = -EINVAL; goto parse_DFS_referrals_exit; } + + if (sizeof(*rsp) + *num_of_nodes * sizeof(REFERRAL3) > rsp_size) { + cifs_dbg(VFS | ONCE, + "%s: malformed buffer (size is %u, must be at least %zu)\n", + __func__, rsp_size, + sizeof(*rsp) + *num_of_nodes * sizeof(REFERRAL3)); + rc = -EINVAL; + goto parse_DFS_referrals_exit; + }
ref = (struct dfs_referral_level_3 *) &(rsp->referrals); if (ref->VersionNumber != cpu_to_le16(3)) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gui-Dong Han hanguidong02@gmail.com
commit 6df8e84aa6b5b1812cc2cacd6b3f5ccbb18cda2b upstream.
The atomic variable vm_fault_info_updated is used to synchronize access to adev->gmc.vm_fault_info between the interrupt handler and get_vm_fault_info().
The default atomic functions like atomic_set() and atomic_read() do not provide memory barriers. This allows for CPU instruction reordering, meaning the memory accesses to vm_fault_info and the vm_fault_info_updated flag are not guaranteed to occur in the intended order. This creates a race condition that can lead to inconsistent or stale data being used.
The previous implementation, which used an explicit mb(), was incomplete and inefficient. It failed to account for all potential CPU reorderings, such as the access of vm_fault_info being reordered before the atomic_read of the flag. This approach is also more verbose and less performant than using the proper atomic functions with acquire/release semantics.
Fix this by switching to atomic_set_release() and atomic_read_acquire(). These functions provide the necessary acquire and release semantics, which act as memory barriers to ensure the correct order of operations. It is also more efficient and idiomatic than using explicit full memory barriers.
Fixes: b97dfa27ef3a ("drm/amdgpu: save vm fault information for amdkfd") Cc: stable@vger.kernel.org Signed-off-by: Gui-Dong Han hanguidong02@gmail.com Signed-off-by: Felix Kuehling felix.kuehling@amd.com Reviewed-by: Felix Kuehling felix.kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 5 ++--- drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 7 +++---- drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 7 +++---- 3 files changed, 8 insertions(+), 11 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -2268,10 +2268,9 @@ void amdgpu_amdkfd_gpuvm_unmap_gtt_bo_fr int amdgpu_amdkfd_gpuvm_get_vm_fault_info(struct amdgpu_device *adev, struct kfd_vm_fault_info *mem) { - if (atomic_read(&adev->gmc.vm_fault_info_updated) == 1) { + if (atomic_read_acquire(&adev->gmc.vm_fault_info_updated) == 1) { *mem = *adev->gmc.vm_fault_info; - mb(); /* make sure read happened */ - atomic_set(&adev->gmc.vm_fault_info_updated, 0); + atomic_set_release(&adev->gmc.vm_fault_info_updated, 0); } return 0; } --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c @@ -1067,7 +1067,7 @@ static int gmc_v7_0_sw_init(void *handle GFP_KERNEL); if (!adev->gmc.vm_fault_info) return -ENOMEM; - atomic_set(&adev->gmc.vm_fault_info_updated, 0); + atomic_set_release(&adev->gmc.vm_fault_info_updated, 0);
return 0; } @@ -1299,7 +1299,7 @@ static int gmc_v7_0_process_interrupt(st vmid = REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS, VMID); if (amdgpu_amdkfd_is_kfd_vmid(adev, vmid) - && !atomic_read(&adev->gmc.vm_fault_info_updated)) { + && !atomic_read_acquire(&adev->gmc.vm_fault_info_updated)) { struct kfd_vm_fault_info *info = adev->gmc.vm_fault_info; u32 protections = REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS, @@ -1315,8 +1315,7 @@ static int gmc_v7_0_process_interrupt(st info->prot_read = protections & 0x8 ? true : false; info->prot_write = protections & 0x10 ? true : false; info->prot_exec = protections & 0x20 ? true : false; - mb(); - atomic_set(&adev->gmc.vm_fault_info_updated, 1); + atomic_set_release(&adev->gmc.vm_fault_info_updated, 1); }
return 0; --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c @@ -1189,7 +1189,7 @@ static int gmc_v8_0_sw_init(void *handle GFP_KERNEL); if (!adev->gmc.vm_fault_info) return -ENOMEM; - atomic_set(&adev->gmc.vm_fault_info_updated, 0); + atomic_set_release(&adev->gmc.vm_fault_info_updated, 0);
return 0; } @@ -1480,7 +1480,7 @@ static int gmc_v8_0_process_interrupt(st vmid = REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS, VMID); if (amdgpu_amdkfd_is_kfd_vmid(adev, vmid) - && !atomic_read(&adev->gmc.vm_fault_info_updated)) { + && !atomic_read_acquire(&adev->gmc.vm_fault_info_updated)) { struct kfd_vm_fault_info *info = adev->gmc.vm_fault_info; u32 protections = REG_GET_FIELD(status, VM_CONTEXT1_PROTECTION_FAULT_STATUS, @@ -1496,8 +1496,7 @@ static int gmc_v8_0_process_interrupt(st info->prot_read = protections & 0x8 ? true : false; info->prot_write = protections & 0x10 ? true : false; info->prot_exec = protections & 0x20 ? true : false; - mb(); - atomic_set(&adev->gmc.vm_fault_info_updated, 1); + atomic_set_release(&adev->gmc.vm_fault_info_updated, 1); }
return 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
commit c760bcda83571e07b72c10d9da175db5051ed971 upstream.
[Why] Not all renoir hardware supports secure display. If the TA is present but the feature isn't supported it will fail to load or send commands. This shows ERR messages to the user that make it seems like there is a problem.
[How] Check the resp_status of the context to see if there was an error before trying to send any secure display commands.
Reviewed-by: Alex Deucher alexander.deucher@amd.com Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1415 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Adrian Yip adrian.ytw@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c @@ -1959,7 +1959,7 @@ static int psp_securedisplay_initialize( }
ret = psp_ta_load(psp, &psp->securedisplay_context.context); - if (!ret) { + if (!ret && !psp->securedisplay_context.context.resp_status) { psp->securedisplay_context.context.initialized = true; mutex_init(&psp->securedisplay_context.mutex); } else
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Fourier fourier.thomas@gmail.com
[ Upstream commit 21140e5caf019e4a24e1ceabcaaa16bd693b393f ]
The dma_unmap_sg() functions should be called with the same nents as the dma_map_sg(), not the value the map function returned.
Fixes: 57d67c6e8219 ("crypto: rockchip - rework by using crypto_engine") Cc: stable@vger.kernel.org Signed-off-by: Thomas Fourier fourier.thomas@gmail.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au [ removed unused rctx variable declaration since device pointer already came from tctx->dev->dev instead of rctx->dev ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/crypto/rockchip/rk3288_crypto_ahash.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/crypto/rockchip/rk3288_crypto_ahash.c +++ b/drivers/crypto/rockchip/rk3288_crypto_ahash.c @@ -236,10 +236,9 @@ static int rk_hash_unprepare(struct cryp { struct ahash_request *areq = container_of(breq, struct ahash_request, base); struct crypto_ahash *tfm = crypto_ahash_reqtfm(areq); - struct rk_ahash_rctx *rctx = ahash_request_ctx(areq); struct rk_ahash_ctx *tctx = crypto_ahash_ctx(tfm);
- dma_unmap_sg(tctx->dev->dev, areq->src, rctx->nrsg, DMA_TO_DEVICE); + dma_unmap_sg(tctx->dev->dev, areq->src, sg_nents(areq->src), DMA_TO_DEVICE); return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Rafael J. Wysocki" rafael.j.wysocki@intel.com
[ Upstream commit f965d111e68f4a993cc44d487d416e3d954eea11 ]
If cppc_get_transition_latency() returns CPUFREQ_ETERNAL to indicate a failure to retrieve the transition latency value from the platform firmware, the CPPC cpufreq driver will use that value (converted to microseconds) as the policy transition delay, but it is way too large for any practical use.
Address this by making the driver use the cpufreq's default transition latency value (in microseconds) as the transition delay if CPUFREQ_ETERNAL is returned by cppc_get_transition_latency().
Fixes: d4f3388afd48 ("cpufreq / CPPC: Set platform specific transition_delay_us") Cc: 5.19+ stable@vger.kernel.org # 5.19 Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Mario Limonciello (AMD) superm1@kernel.org Reviewed-by: Jie Zhan zhanjie9@hisilicon.com Acked-by: Viresh Kumar viresh.kumar@linaro.org Reviewed-by: Qais Yousef qyousef@layalina.io [ added CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS definition to include/linux/cpufreq.h ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cpufreq/cppc_cpufreq.c | 14 ++++++++++++-- include/linux/cpufreq.h | 3 +++ 2 files changed, 15 insertions(+), 2 deletions(-)
--- a/drivers/cpufreq/cppc_cpufreq.c +++ b/drivers/cpufreq/cppc_cpufreq.c @@ -338,6 +338,16 @@ static int cppc_verify_policy(struct cpu return 0; }
+static unsigned int __cppc_cpufreq_get_transition_delay_us(unsigned int cpu) +{ + unsigned int transition_latency_ns = cppc_get_transition_latency(cpu); + + if (transition_latency_ns == CPUFREQ_ETERNAL) + return CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS / NSEC_PER_USEC; + + return transition_latency_ns / NSEC_PER_USEC; +} + /* * The PCC subspace describes the rate at which platform can accept commands * on the shared PCC channel (including READs which do not count towards freq @@ -360,12 +370,12 @@ static unsigned int cppc_cpufreq_get_tra return 10000; } } - return cppc_get_transition_latency(cpu) / NSEC_PER_USEC; + return __cppc_cpufreq_get_transition_delay_us(cpu); } #else static unsigned int cppc_cpufreq_get_transition_delay_us(unsigned int cpu) { - return cppc_get_transition_latency(cpu) / NSEC_PER_USEC; + return __cppc_cpufreq_get_transition_delay_us(cpu); } #endif
--- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -32,6 +32,9 @@ */
#define CPUFREQ_ETERNAL (-1) + +#define CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS NSEC_PER_MSEC + #define CPUFREQ_NAME_LEN 16 /* Print length for names. Extra 1 space for accommodating '\n' in prints */ #define CPUFREQ_NAME_PLEN (CPUFREQ_NAME_LEN + 1)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Vasut marek.vasut+renesas@mailbox.org
[ Upstream commit d83f1d19c898ac1b54ae64d1c950f5beff801982 ]
Remove fixed PPI lane count setup. The R-Car DSI host is capable of operating in 1..4 DSI lane mode. Remove the hard-coded 4-lane configuration from PPI register settings and instead configure the PPI lane count according to lane count information already obtained by this driver instance.
Configure TXSETR register to match PPI lane count. The R-Car V4H Reference Manual R19UH0186EJ0121 Rev.1.21 section 67.2.2.3 Tx Set Register (TXSETR), field LANECNT description indicates that the TXSETR register LANECNT bitfield lane count must be configured such, that it matches lane count configuration in PPISETR register DLEN bitfield. Make sure the LANECNT and DLEN bitfields are configured to match.
Fixes: 155358310f01 ("drm: rcar-du: Add R-Car DSI driver") Cc: stable@vger.kernel.org Signed-off-by: Marek Vasut marek.vasut+renesas@mailbox.org Reviewed-by: Tomi Valkeinen tomi.valkeinen+renesas@ideasonboard.com Link: https://lore.kernel.org/r/20250813210840.97621-1-marek.vasut+renesas@mailbox... Signed-off-by: Tomi Valkeinen tomi.valkeinen@ideasonboard.com [ adjusted file paths to remove renesas/ subdirectory ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c | 5 ++++- drivers/gpu/drm/rcar-du/rcar_mipi_dsi_regs.h | 8 ++++---- 2 files changed, 8 insertions(+), 5 deletions(-)
--- a/drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c +++ b/drivers/gpu/drm/rcar-du/rcar_mipi_dsi.c @@ -385,7 +385,10 @@ static int rcar_mipi_dsi_startup(struct udelay(10); rcar_mipi_dsi_clr(dsi, CLOCKSET1, CLOCKSET1_UPDATEPLL);
- ppisetr = PPISETR_DLEN_3 | PPISETR_CLEN; + rcar_mipi_dsi_clr(dsi, TXSETR, TXSETR_LANECNT_MASK); + rcar_mipi_dsi_set(dsi, TXSETR, dsi->lanes - 1); + + ppisetr = ((BIT(dsi->lanes) - 1) & PPISETR_DLEN_MASK) | PPISETR_CLEN; rcar_mipi_dsi_write(dsi, PPISETR, ppisetr);
rcar_mipi_dsi_set(dsi, PHYSETUP, PHYSETUP_SHUTDOWNZ); --- a/drivers/gpu/drm/rcar-du/rcar_mipi_dsi_regs.h +++ b/drivers/gpu/drm/rcar-du/rcar_mipi_dsi_regs.h @@ -12,6 +12,9 @@ #define LINKSR_LPBUSY (1 << 1) #define LINKSR_HSBUSY (1 << 0)
+#define TXSETR 0x100 +#define TXSETR_LANECNT_MASK (0x3 << 0) + /* * Video Mode Register */ @@ -80,10 +83,7 @@ * PHY-Protocol Interface (PPI) Registers */ #define PPISETR 0x700 -#define PPISETR_DLEN_0 (0x1 << 0) -#define PPISETR_DLEN_1 (0x3 << 0) -#define PPISETR_DLEN_2 (0x7 << 0) -#define PPISETR_DLEN_3 (0xf << 0) +#define PPISETR_DLEN_MASK (0xf << 0) #define PPISETR_CLEN (1 << 8)
#define PPICLCR 0x710
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kaustabh Chakraborty kauschluss@disroot.org
[ Upstream commit d31bbacf783daf1e71fbe5c68df93550c446bf44 ]
Modify the functions to accept a pointer to struct decon_context instead.
Signed-off-by: Kaustabh Chakraborty kauschluss@disroot.org Signed-off-by: Inki Dae inki.dae@samsung.com Stable-dep-of: e1361a4f1be9 ("drm/exynos: exynos7_drm_decon: remove ctx->suspended") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/exynos/exynos7_drm_decon.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-)
--- a/drivers/gpu/drm/exynos/exynos7_drm_decon.c +++ b/drivers/gpu/drm/exynos/exynos7_drm_decon.c @@ -82,10 +82,8 @@ static const enum drm_plane_type decon_w DRM_PLANE_TYPE_CURSOR, };
-static void decon_wait_for_vblank(struct exynos_drm_crtc *crtc) +static void decon_wait_for_vblank(struct decon_context *ctx) { - struct decon_context *ctx = crtc->ctx; - if (ctx->suspended) return;
@@ -101,9 +99,8 @@ static void decon_wait_for_vblank(struct DRM_DEV_DEBUG_KMS(ctx->dev, "vblank wait timed out.\n"); }
-static void decon_clear_channels(struct exynos_drm_crtc *crtc) +static void decon_clear_channels(struct decon_context *ctx) { - struct decon_context *ctx = crtc->ctx; unsigned int win, ch_enabled = 0;
/* Check if any channel is enabled. */ @@ -119,7 +116,7 @@ static void decon_clear_channels(struct
/* Wait for vsync, as disable channel takes effect at next vsync */ if (ch_enabled) - decon_wait_for_vblank(ctx->crtc); + decon_wait_for_vblank(ctx); }
static int decon_ctx_initialize(struct decon_context *ctx, @@ -127,7 +124,7 @@ static int decon_ctx_initialize(struct d { ctx->drm_dev = drm_dev;
- decon_clear_channels(ctx->crtc); + decon_clear_channels(ctx);
return exynos_drm_register_dma(drm_dev, ctx->dev, &ctx->dma_priv); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kaustabh Chakraborty kauschluss@disroot.org
[ Upstream commit 5f1a453974204175f20b3788824a0fe23cc36f79 ]
The DECON channels are not cleared properly as the windows aren't shadow protected. When accompanied with an IOMMU, it pagefaults, and the kernel panics.
Implement shadow protect/unprotect, along with a standalone update, for channel clearing to properly take effect.
Signed-off-by: Kaustabh Chakraborty kauschluss@disroot.org Signed-off-by: Inki Dae inki.dae@samsung.com Stable-dep-of: e1361a4f1be9 ("drm/exynos: exynos7_drm_decon: remove ctx->suspended") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/exynos/exynos7_drm_decon.c | 55 ++++++++++++++++------------- 1 file changed, 32 insertions(+), 23 deletions(-)
--- a/drivers/gpu/drm/exynos/exynos7_drm_decon.c +++ b/drivers/gpu/drm/exynos/exynos7_drm_decon.c @@ -82,6 +82,28 @@ static const enum drm_plane_type decon_w DRM_PLANE_TYPE_CURSOR, };
+/** + * decon_shadow_protect_win() - disable updating values from shadow registers at vsync + * + * @ctx: display and enhancement controller context + * @win: window to protect registers for + * @protect: 1 to protect (disable updates) + */ +static void decon_shadow_protect_win(struct decon_context *ctx, + unsigned int win, bool protect) +{ + u32 bits, val; + + bits = SHADOWCON_WINx_PROTECT(win); + + val = readl(ctx->regs + SHADOWCON); + if (protect) + val |= bits; + else + val &= ~bits; + writel(val, ctx->regs + SHADOWCON); +} + static void decon_wait_for_vblank(struct decon_context *ctx) { if (ctx->suspended) @@ -102,18 +124,27 @@ static void decon_wait_for_vblank(struct static void decon_clear_channels(struct decon_context *ctx) { unsigned int win, ch_enabled = 0; + u32 val;
/* Check if any channel is enabled. */ for (win = 0; win < WINDOWS_NR; win++) { - u32 val = readl(ctx->regs + WINCON(win)); + val = readl(ctx->regs + WINCON(win));
if (val & WINCONx_ENWIN) { + decon_shadow_protect_win(ctx, win, true); + val &= ~WINCONx_ENWIN; writel(val, ctx->regs + WINCON(win)); ch_enabled = 1; + + decon_shadow_protect_win(ctx, win, false); } }
+ val = readl(ctx->regs + DECON_UPDATE); + val |= DECON_UPDATE_STANDALONE_F; + writel(val, ctx->regs + DECON_UPDATE); + /* Wait for vsync, as disable channel takes effect at next vsync */ if (ch_enabled) decon_wait_for_vblank(ctx); @@ -341,28 +372,6 @@ static void decon_win_set_colkey(struct writel(keycon1, ctx->regs + WKEYCON1_BASE(win)); }
-/** - * decon_shadow_protect_win() - disable updating values from shadow registers at vsync - * - * @ctx: display and enhancement controller context - * @win: window to protect registers for - * @protect: 1 to protect (disable updates) - */ -static void decon_shadow_protect_win(struct decon_context *ctx, - unsigned int win, bool protect) -{ - u32 bits, val; - - bits = SHADOWCON_WINx_PROTECT(win); - - val = readl(ctx->regs + SHADOWCON); - if (protect) - val |= bits; - else - val &= ~bits; - writel(val, ctx->regs + SHADOWCON); -} - static void decon_atomic_begin(struct exynos_drm_crtc *crtc) { struct decon_context *ctx = crtc->ctx;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kaustabh Chakraborty kauschluss@disroot.org
[ Upstream commit e1361a4f1be9cb69a662c6d7b5ce218007d6e82b ]
Condition guards are found to be redundant, as the call flow is properly managed now, as also observed in the Exynos5433 DECON driver. Since state checking is no longer necessary, remove it.
This also fixes an issue which prevented decon_commit() from decon_atomic_enable() due to an incorrect state change setting.
Fixes: 96976c3d9aff ("drm/exynos: Add DECON driver") Cc: stable@vger.kernel.org Suggested-by: Inki Dae inki.dae@samsung.com Signed-off-by: Kaustabh Chakraborty kauschluss@disroot.org Signed-off-by: Inki Dae inki.dae@samsung.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/exynos/exynos7_drm_decon.c | 36 ----------------------------- 1 file changed, 36 deletions(-)
--- a/drivers/gpu/drm/exynos/exynos7_drm_decon.c +++ b/drivers/gpu/drm/exynos/exynos7_drm_decon.c @@ -52,7 +52,6 @@ struct decon_context { void __iomem *regs; unsigned long irq_flags; bool i80_if; - bool suspended; wait_queue_head_t wait_vsync_queue; atomic_t wait_vsync_event;
@@ -106,9 +105,6 @@ static void decon_shadow_protect_win(str
static void decon_wait_for_vblank(struct decon_context *ctx) { - if (ctx->suspended) - return; - atomic_set(&ctx->wait_vsync_event, 1);
/* @@ -184,9 +180,6 @@ static void decon_commit(struct exynos_d struct drm_display_mode *mode = &crtc->base.state->adjusted_mode; u32 val, clkdiv;
- if (ctx->suspended) - return; - /* nothing to do if we haven't set the mode yet */ if (mode->htotal == 0 || mode->vtotal == 0) return; @@ -248,9 +241,6 @@ static int decon_enable_vblank(struct ex struct decon_context *ctx = crtc->ctx; u32 val;
- if (ctx->suspended) - return -EPERM; - if (!test_and_set_bit(0, &ctx->irq_flags)) { val = readl(ctx->regs + VIDINTCON0);
@@ -273,9 +263,6 @@ static void decon_disable_vblank(struct struct decon_context *ctx = crtc->ctx; u32 val;
- if (ctx->suspended) - return; - if (test_and_clear_bit(0, &ctx->irq_flags)) { val = readl(ctx->regs + VIDINTCON0);
@@ -377,9 +364,6 @@ static void decon_atomic_begin(struct ex struct decon_context *ctx = crtc->ctx; int i;
- if (ctx->suspended) - return; - for (i = 0; i < WINDOWS_NR; i++) decon_shadow_protect_win(ctx, i, true); } @@ -399,9 +383,6 @@ static void decon_update_plane(struct ex unsigned int cpp = fb->format->cpp[0]; unsigned int pitch = fb->pitches[0];
- if (ctx->suspended) - return; - /* * SHADOWCON/PRTCON register is used for enabling timing. * @@ -489,9 +470,6 @@ static void decon_disable_plane(struct e unsigned int win = plane->index; u32 val;
- if (ctx->suspended) - return; - /* protect windows */ decon_shadow_protect_win(ctx, win, true);
@@ -510,9 +488,6 @@ static void decon_atomic_flush(struct ex struct decon_context *ctx = crtc->ctx; int i;
- if (ctx->suspended) - return; - for (i = 0; i < WINDOWS_NR; i++) decon_shadow_protect_win(ctx, i, false); exynos_crtc_handle_event(crtc); @@ -540,9 +515,6 @@ static void decon_atomic_enable(struct e struct decon_context *ctx = crtc->ctx; int ret;
- if (!ctx->suspended) - return; - ret = pm_runtime_resume_and_get(ctx->dev); if (ret < 0) { DRM_DEV_ERROR(ctx->dev, "failed to enable DECON device.\n"); @@ -556,8 +528,6 @@ static void decon_atomic_enable(struct e decon_enable_vblank(ctx->crtc);
decon_commit(ctx->crtc); - - ctx->suspended = false; }
static void decon_atomic_disable(struct exynos_drm_crtc *crtc) @@ -565,9 +535,6 @@ static void decon_atomic_disable(struct struct decon_context *ctx = crtc->ctx; int i;
- if (ctx->suspended) - return; - /* * We need to make sure that all windows are disabled before we * suspend that connector. Otherwise we might try to scan from @@ -577,8 +544,6 @@ static void decon_atomic_disable(struct decon_disable_plane(crtc, &ctx->planes[i]);
pm_runtime_put_sync(ctx->dev); - - ctx->suspended = true; }
static const struct exynos_drm_crtc_ops decon_crtc_ops = { @@ -699,7 +664,6 @@ static int decon_probe(struct platform_d return -ENOMEM;
ctx->dev = dev; - ctx->suspended = true;
i80_if_timings = of_get_child_by_name(dev->of_node, "i80-if-timings"); if (i80_if_timings)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuen-Han Tsai khtsai@google.com
[ Upstream commit bfb1d99d969fe3b892db30848aeebfa19d21f57f ]
Gadget function drivers often have goto-based error handling in their bind paths, which can be bug-prone. Refactoring these paths to use __free() scope-based cleanup is desirable, but currently blocked.
The blocker is that usb_ep_free_request(ep, req) requires two parameters, while the __free() mechanism can only pass a pointer to the request itself.
Store an endpoint pointer in the struct usb_request. The pointer is populated centrally in usb_ep_alloc_request() on every successful allocation, making the request object self-contained.
Signed-off-by: Kuen-Han Tsai khtsai@google.com Link: https://lore.kernel.org/r/20250916-ready-v1-1-4997bf277548@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Link: https://lore.kernel.org/r/20250916-ready-v1-1-4997bf277548@google.com Stable-dep-of: 75a5b8d4ddd4 ("usb: gadget: f_ncm: Refactor bind path to use __free()") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/udc/core.c | 3 +++ include/linux/usb/gadget.h | 2 ++ 2 files changed, 5 insertions(+)
--- a/drivers/usb/gadget/udc/core.c +++ b/drivers/usb/gadget/udc/core.c @@ -194,6 +194,9 @@ struct usb_request *usb_ep_alloc_request
req = ep->ops->alloc_request(ep, gfp_flags);
+ if (req) + req->ep = ep; + trace_usb_ep_alloc_request(ep, req, req ? 0 : -ENOMEM);
return req; --- a/include/linux/usb/gadget.h +++ b/include/linux/usb/gadget.h @@ -31,6 +31,7 @@ struct usb_ep;
/** * struct usb_request - describes one i/o request + * @ep: The associated endpoint set by usb_ep_alloc_request(). * @buf: Buffer used for data. Always provide this; some controllers * only use PIO, or don't use DMA for some endpoints. * @dma: DMA address corresponding to 'buf'. If you don't set this @@ -96,6 +97,7 @@ struct usb_ep; */
struct usb_request { + struct usb_ep *ep; void *buf; unsigned length; dma_addr_t dma;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuen-Han Tsai khtsai@google.com
[ Upstream commit 201c53c687f2b55a7cc6d9f4000af4797860174b ]
Introduce the free_usb_request() function that frees both the request's buffer and the request itself.
This function serves as the cleanup callback for DEFINE_FREE() to enable automatic, scope-based cleanup for usb_request pointers.
Signed-off-by: Kuen-Han Tsai khtsai@google.com Link: https://lore.kernel.org/r/20250916-ready-v1-2-4997bf277548@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Link: https://lore.kernel.org/r/20250916-ready-v1-2-4997bf277548@google.com Stable-dep-of: 75a5b8d4ddd4 ("usb: gadget: f_ncm: Refactor bind path to use __free()") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/usb/gadget.h | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+)
--- a/include/linux/usb/gadget.h +++ b/include/linux/usb/gadget.h @@ -15,6 +15,7 @@ #ifndef __LINUX_USB_GADGET_H #define __LINUX_USB_GADGET_H
+#include <linux/cleanup.h> #include <linux/device.h> #include <linux/errno.h> #include <linux/init.h> @@ -290,6 +291,28 @@ static inline void usb_ep_fifo_flush(str
/*-------------------------------------------------------------------------*/
+/** + * free_usb_request - frees a usb_request object and its buffer + * @req: the request being freed + * + * This helper function frees both the request's buffer and the request object + * itself by calling usb_ep_free_request(). Its signature is designed to be used + * with DEFINE_FREE() to enable automatic, scope-based cleanup for usb_request + * pointers. + */ +static inline void free_usb_request(struct usb_request *req) +{ + if (!req) + return; + + kfree(req->buf); + usb_ep_free_request(req->ep, req); +} + +DEFINE_FREE(free_usb_request, struct usb_request *, free_usb_request(_T)) + +/*-------------------------------------------------------------------------*/ + struct usb_dcd_config_params { __u8 bU1devExitLat; /* U1 Device exit Latency */ #define USB_DEFAULT_U1_DEV_EXIT_LAT 0x01 /* Less then 1 microsec */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuen-Han Tsai khtsai@google.com
[ Upstream commit 08228941436047bdcd35a612c1aec0912a29d8cd ]
After an bind/unbind cycle, the rndis->notify_req is left stale. If a subsequent bind fails, the unified error label attempts to free this stale request, leading to a NULL pointer dereference when accessing ep->ops->free_request.
Refactor the error handling in the bind path to use the __free() automatic cleanup mechanism.
Fixes: 45fe3b8e5342 ("usb ethernet gadget: split RNDIS function") Cc: stable@kernel.org Signed-off-by: Kuen-Han Tsai khtsai@google.com Link: https://lore.kernel.org/r/20250916-ready-v1-6-4997bf277548@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Link: https://lore.kernel.org/r/20250916-ready-v1-6-4997bf277548@google.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/f_rndis.c | 85 ++++++++++++++-------------------- 1 file changed, 35 insertions(+), 50 deletions(-)
--- a/drivers/usb/gadget/function/f_rndis.c +++ b/drivers/usb/gadget/function/f_rndis.c @@ -19,6 +19,8 @@
#include <linux/atomic.h>
+#include <linux/usb/gadget.h> + #include "u_ether.h" #include "u_ether_configfs.h" #include "u_rndis.h" @@ -675,6 +677,8 @@ rndis_bind(struct usb_configuration *c, struct usb_ep *ep;
struct f_rndis_opts *rndis_opts; + struct usb_os_desc_table *os_desc_table __free(kfree) = NULL; + struct usb_request *request __free(free_usb_request) = NULL;
if (!can_support_rndis(c)) return -EINVAL; @@ -682,12 +686,9 @@ rndis_bind(struct usb_configuration *c, rndis_opts = container_of(f->fi, struct f_rndis_opts, func_inst);
if (cdev->use_os_string) { - f->os_desc_table = kzalloc(sizeof(*f->os_desc_table), - GFP_KERNEL); - if (!f->os_desc_table) + os_desc_table = kzalloc(sizeof(*os_desc_table), GFP_KERNEL); + if (!os_desc_table) return -ENOMEM; - f->os_desc_n = 1; - f->os_desc_table[0].os_desc = &rndis_opts->rndis_os_desc; }
rndis_iad_descriptor.bFunctionClass = rndis_opts->class; @@ -705,16 +706,14 @@ rndis_bind(struct usb_configuration *c, gether_set_gadget(rndis_opts->net, cdev->gadget); status = gether_register_netdev(rndis_opts->net); if (status) - goto fail; + return status; rndis_opts->bound = true; }
us = usb_gstrings_attach(cdev, rndis_strings, ARRAY_SIZE(rndis_string_defs)); - if (IS_ERR(us)) { - status = PTR_ERR(us); - goto fail; - } + if (IS_ERR(us)) + return PTR_ERR(us); rndis_control_intf.iInterface = us[0].id; rndis_data_intf.iInterface = us[1].id; rndis_iad_descriptor.iFunction = us[2].id; @@ -722,36 +721,30 @@ rndis_bind(struct usb_configuration *c, /* allocate instance-specific interface IDs */ status = usb_interface_id(c, f); if (status < 0) - goto fail; + return status; rndis->ctrl_id = status; rndis_iad_descriptor.bFirstInterface = status;
rndis_control_intf.bInterfaceNumber = status; rndis_union_desc.bMasterInterface0 = status;
- if (cdev->use_os_string) - f->os_desc_table[0].if_id = - rndis_iad_descriptor.bFirstInterface; - status = usb_interface_id(c, f); if (status < 0) - goto fail; + return status; rndis->data_id = status;
rndis_data_intf.bInterfaceNumber = status; rndis_union_desc.bSlaveInterface0 = status;
- status = -ENODEV; - /* allocate instance-specific endpoints */ ep = usb_ep_autoconfig(cdev->gadget, &fs_in_desc); if (!ep) - goto fail; + return -ENODEV; rndis->port.in_ep = ep;
ep = usb_ep_autoconfig(cdev->gadget, &fs_out_desc); if (!ep) - goto fail; + return -ENODEV; rndis->port.out_ep = ep;
/* NOTE: a status/notification endpoint is, strictly speaking, @@ -760,21 +753,19 @@ rndis_bind(struct usb_configuration *c, */ ep = usb_ep_autoconfig(cdev->gadget, &fs_notify_desc); if (!ep) - goto fail; + return -ENODEV; rndis->notify = ep;
- status = -ENOMEM; - /* allocate notification request and buffer */ - rndis->notify_req = usb_ep_alloc_request(ep, GFP_KERNEL); - if (!rndis->notify_req) - goto fail; - rndis->notify_req->buf = kmalloc(STATUS_BYTECOUNT, GFP_KERNEL); - if (!rndis->notify_req->buf) - goto fail; - rndis->notify_req->length = STATUS_BYTECOUNT; - rndis->notify_req->context = rndis; - rndis->notify_req->complete = rndis_response_complete; + request = usb_ep_alloc_request(ep, GFP_KERNEL); + if (!request) + return -ENOMEM; + request->buf = kmalloc(STATUS_BYTECOUNT, GFP_KERNEL); + if (!request->buf) + return -ENOMEM; + request->length = STATUS_BYTECOUNT; + request->context = rndis; + request->complete = rndis_response_complete;
/* support all relevant hardware speeds... we expect that when * hardware is dual speed, all bulk-capable endpoints work at @@ -791,7 +782,7 @@ rndis_bind(struct usb_configuration *c, status = usb_assign_descriptors(f, eth_fs_function, eth_hs_function, eth_ss_function, eth_ss_function); if (status) - goto fail; + return status;
rndis->port.open = rndis_open; rndis->port.close = rndis_close; @@ -802,9 +793,18 @@ rndis_bind(struct usb_configuration *c, if (rndis->manufacturer && rndis->vendorID && rndis_set_param_vendor(rndis->params, rndis->vendorID, rndis->manufacturer)) { - status = -EINVAL; - goto fail_free_descs; + usb_free_all_descriptors(f); + return -EINVAL; + } + + if (cdev->use_os_string) { + os_desc_table[0].os_desc = &rndis_opts->rndis_os_desc; + os_desc_table[0].if_id = rndis_iad_descriptor.bFirstInterface; + f->os_desc_table = no_free_ptr(os_desc_table); + f->os_desc_n = 1; + } + rndis->notify_req = no_free_ptr(request);
/* NOTE: all that is done without knowing or caring about * the network link ... which is unavailable to this code @@ -817,21 +817,6 @@ rndis_bind(struct usb_configuration *c, rndis->port.in_ep->name, rndis->port.out_ep->name, rndis->notify->name); return 0; - -fail_free_descs: - usb_free_all_descriptors(f); -fail: - kfree(f->os_desc_table); - f->os_desc_n = 0; - - if (rndis->notify_req) { - kfree(rndis->notify_req->buf); - usb_ep_free_request(rndis->notify, rndis->notify_req); - } - - ERROR(cdev, "%s: can't bind, err %d\n", f->name, status); - - return status; }
void rndis_borrow_net(struct usb_function_instance *f, struct net_device *net)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuen-Han Tsai khtsai@google.com
[ Upstream commit 42988380ac67c76bb9dff8f77d7ef3eefd50b7b5 ]
After an bind/unbind cycle, the ecm->notify_req is left stale. If a subsequent bind fails, the unified error label attempts to free this stale request, leading to a NULL pointer dereference when accessing ep->ops->free_request.
Refactor the error handling in the bind path to use the __free() automatic cleanup mechanism.
Fixes: da741b8c56d6 ("usb ethernet gadget: split CDC Ethernet function") Cc: stable@kernel.org Signed-off-by: Kuen-Han Tsai khtsai@google.com Link: https://lore.kernel.org/r/20250916-ready-v1-5-4997bf277548@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Link: https://lore.kernel.org/r/20250916-ready-v1-5-4997bf277548@google.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/f_ecm.c | 48 +++++++++++++++--------------------- 1 file changed, 20 insertions(+), 28 deletions(-)
--- a/drivers/usb/gadget/function/f_ecm.c +++ b/drivers/usb/gadget/function/f_ecm.c @@ -8,12 +8,15 @@
/* #define VERBOSE_DEBUG */
+#include <linux/cleanup.h> #include <linux/slab.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/device.h> #include <linux/etherdevice.h>
+#include <linux/usb/gadget.h> + #include "u_ether.h" #include "u_ether_configfs.h" #include "u_ecm.h" @@ -689,6 +692,7 @@ ecm_bind(struct usb_configuration *c, st struct usb_ep *ep;
struct f_ecm_opts *ecm_opts; + struct usb_request *request __free(free_usb_request) = NULL;
if (!can_support_ecm(cdev->gadget)) return -EINVAL; @@ -726,7 +730,7 @@ ecm_bind(struct usb_configuration *c, st /* allocate instance-specific interface IDs */ status = usb_interface_id(c, f); if (status < 0) - goto fail; + return status; ecm->ctrl_id = status; ecm_iad_descriptor.bFirstInterface = status;
@@ -735,24 +739,22 @@ ecm_bind(struct usb_configuration *c, st
status = usb_interface_id(c, f); if (status < 0) - goto fail; + return status; ecm->data_id = status;
ecm_data_nop_intf.bInterfaceNumber = status; ecm_data_intf.bInterfaceNumber = status; ecm_union_desc.bSlaveInterface0 = status;
- status = -ENODEV; - /* allocate instance-specific endpoints */ ep = usb_ep_autoconfig(cdev->gadget, &fs_ecm_in_desc); if (!ep) - goto fail; + return -ENODEV; ecm->port.in_ep = ep;
ep = usb_ep_autoconfig(cdev->gadget, &fs_ecm_out_desc); if (!ep) - goto fail; + return -ENODEV; ecm->port.out_ep = ep;
/* NOTE: a status/notification endpoint is *OPTIONAL* but we @@ -761,20 +763,18 @@ ecm_bind(struct usb_configuration *c, st */ ep = usb_ep_autoconfig(cdev->gadget, &fs_ecm_notify_desc); if (!ep) - goto fail; + return -ENODEV; ecm->notify = ep;
- status = -ENOMEM; - /* allocate notification request and buffer */ - ecm->notify_req = usb_ep_alloc_request(ep, GFP_KERNEL); - if (!ecm->notify_req) - goto fail; - ecm->notify_req->buf = kmalloc(ECM_STATUS_BYTECOUNT, GFP_KERNEL); - if (!ecm->notify_req->buf) - goto fail; - ecm->notify_req->context = ecm; - ecm->notify_req->complete = ecm_notify_complete; + request = usb_ep_alloc_request(ep, GFP_KERNEL); + if (!request) + return -ENOMEM; + request->buf = kmalloc(ECM_STATUS_BYTECOUNT, GFP_KERNEL); + if (!request->buf) + return -ENOMEM; + request->context = ecm; + request->complete = ecm_notify_complete;
/* support all relevant hardware speeds... we expect that when * hardware is dual speed, all bulk-capable endpoints work at @@ -793,7 +793,7 @@ ecm_bind(struct usb_configuration *c, st status = usb_assign_descriptors(f, ecm_fs_function, ecm_hs_function, ecm_ss_function, ecm_ss_function); if (status) - goto fail; + return status;
/* NOTE: all that is done without knowing or caring about * the network link ... which is unavailable to this code @@ -803,22 +803,14 @@ ecm_bind(struct usb_configuration *c, st ecm->port.open = ecm_open; ecm->port.close = ecm_close;
+ ecm->notify_req = no_free_ptr(request); + DBG(cdev, "CDC Ethernet: %s speed IN/%s OUT/%s NOTIFY/%s\n", gadget_is_superspeed(c->cdev->gadget) ? "super" : gadget_is_dualspeed(c->cdev->gadget) ? "dual" : "full", ecm->port.in_ep->name, ecm->port.out_ep->name, ecm->notify->name); return 0; - -fail: - if (ecm->notify_req) { - kfree(ecm->notify_req->buf); - usb_ep_free_request(ecm->notify, ecm->notify_req); - } - - ERROR(cdev, "%s: can't bind, err %d\n", f->name, status); - - return status; }
static inline struct f_ecm_opts *to_f_ecm_opts(struct config_item *item)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuen-Han Tsai khtsai@google.com
[ Upstream commit 47b2116e54b4a854600341487e8b55249e926324 ]
After an bind/unbind cycle, the acm->notify_req is left stale. If a subsequent bind fails, the unified error label attempts to free this stale request, leading to a NULL pointer dereference when accessing ep->ops->free_request.
Refactor the error handling in the bind path to use the __free() automatic cleanup mechanism.
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 Call trace: usb_ep_free_request+0x2c/0xec gs_free_req+0x30/0x44 acm_bind+0x1b8/0x1f4 usb_add_function+0xcc/0x1f0 configfs_composite_bind+0x468/0x588 gadget_bind_driver+0x104/0x270 really_probe+0x190/0x374 __driver_probe_device+0xa0/0x12c driver_probe_device+0x3c/0x218 __device_attach_driver+0x14c/0x188 bus_for_each_drv+0x10c/0x168 __device_attach+0xfc/0x198 device_initial_probe+0x14/0x24 bus_probe_device+0x94/0x11c device_add+0x268/0x48c usb_add_gadget+0x198/0x28c dwc3_gadget_init+0x700/0x858 __dwc3_set_mode+0x3cc/0x664 process_scheduled_works+0x1d8/0x488 worker_thread+0x244/0x334 kthread+0x114/0x1bc ret_from_fork+0x10/0x20
Fixes: 1f1ba11b6494 ("usb gadget: issue notifications from ACM function") Cc: stable@kernel.org Signed-off-by: Kuen-Han Tsai khtsai@google.com Link: https://lore.kernel.org/r/20250916-ready-v1-4-4997bf277548@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Link: https://lore.kernel.org/r/20250916-ready-v1-4-4997bf277548@google.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/f_acm.c | 42 ++++++++++++++++-------------------- 1 file changed, 19 insertions(+), 23 deletions(-)
--- a/drivers/usb/gadget/function/f_acm.c +++ b/drivers/usb/gadget/function/f_acm.c @@ -11,12 +11,15 @@
/* #define VERBOSE_DEBUG */
+#include <linux/cleanup.h> #include <linux/slab.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/device.h> #include <linux/err.h>
+#include <linux/usb/gadget.h> + #include "u_serial.h"
@@ -612,6 +615,7 @@ acm_bind(struct usb_configuration *c, st struct usb_string *us; int status; struct usb_ep *ep; + struct usb_request *request __free(free_usb_request) = NULL;
/* REVISIT might want instance-specific strings to help * distinguish instances ... @@ -629,7 +633,7 @@ acm_bind(struct usb_configuration *c, st /* allocate instance-specific interface IDs, and patch descriptors */ status = usb_interface_id(c, f); if (status < 0) - goto fail; + return status; acm->ctrl_id = status; acm_iad_descriptor.bFirstInterface = status;
@@ -638,40 +642,38 @@ acm_bind(struct usb_configuration *c, st
status = usb_interface_id(c, f); if (status < 0) - goto fail; + return status; acm->data_id = status;
acm_data_interface_desc.bInterfaceNumber = status; acm_union_desc.bSlaveInterface0 = status; acm_call_mgmt_descriptor.bDataInterface = status;
- status = -ENODEV; - /* allocate instance-specific endpoints */ ep = usb_ep_autoconfig(cdev->gadget, &acm_fs_in_desc); if (!ep) - goto fail; + return -ENODEV; acm->port.in = ep;
ep = usb_ep_autoconfig(cdev->gadget, &acm_fs_out_desc); if (!ep) - goto fail; + return -ENODEV; acm->port.out = ep;
ep = usb_ep_autoconfig(cdev->gadget, &acm_fs_notify_desc); if (!ep) - goto fail; + return -ENODEV; acm->notify = ep;
/* allocate notification */ - acm->notify_req = gs_alloc_req(ep, - sizeof(struct usb_cdc_notification) + 2, - GFP_KERNEL); - if (!acm->notify_req) - goto fail; + request = gs_alloc_req(ep, + sizeof(struct usb_cdc_notification) + 2, + GFP_KERNEL); + if (!request) + return -ENODEV;
- acm->notify_req->complete = acm_cdc_notify_complete; - acm->notify_req->context = acm; + request->complete = acm_cdc_notify_complete; + request->context = acm;
/* support all relevant hardware speeds... we expect that when * hardware is dual speed, all bulk-capable endpoints work at @@ -688,7 +690,9 @@ acm_bind(struct usb_configuration *c, st status = usb_assign_descriptors(f, acm_fs_function, acm_hs_function, acm_ss_function, acm_ss_function); if (status) - goto fail; + return status; + + acm->notify_req = no_free_ptr(request);
dev_dbg(&cdev->gadget->dev, "acm ttyGS%d: %s speed IN/%s OUT/%s NOTIFY/%s\n", @@ -698,14 +702,6 @@ acm_bind(struct usb_configuration *c, st acm->port.in->name, acm->port.out->name, acm->notify->name); return 0; - -fail: - if (acm->notify_req) - gs_free_req(acm->notify, acm->notify_req); - - ERROR(cdev, "%s/%p: can't bind, err %d\n", f->name, f, status); - - return status; }
static void acm_unbind(struct usb_configuration *c, struct usb_function *f)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuen-Han Tsai khtsai@google.com
[ Upstream commit 75a5b8d4ddd4eb6b16cb0b475d14ff4ae64295ef ]
After an bind/unbind cycle, the ncm->notify_req is left stale. If a subsequent bind fails, the unified error label attempts to free this stale request, leading to a NULL pointer dereference when accessing ep->ops->free_request.
Refactor the error handling in the bind path to use the __free() automatic cleanup mechanism.
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 Call trace: usb_ep_free_request+0x2c/0xec ncm_bind+0x39c/0x3dc usb_add_function+0xcc/0x1f0 configfs_composite_bind+0x468/0x588 gadget_bind_driver+0x104/0x270 really_probe+0x190/0x374 __driver_probe_device+0xa0/0x12c driver_probe_device+0x3c/0x218 __device_attach_driver+0x14c/0x188 bus_for_each_drv+0x10c/0x168 __device_attach+0xfc/0x198 device_initial_probe+0x14/0x24 bus_probe_device+0x94/0x11c device_add+0x268/0x48c usb_add_gadget+0x198/0x28c dwc3_gadget_init+0x700/0x858 __dwc3_set_mode+0x3cc/0x664 process_scheduled_works+0x1d8/0x488 worker_thread+0x244/0x334 kthread+0x114/0x1bc ret_from_fork+0x10/0x20
Fixes: 9f6ce4240a2b ("usb: gadget: f_ncm.c added") Cc: stable@kernel.org Signed-off-by: Kuen-Han Tsai khtsai@google.com Link: https://lore.kernel.org/r/20250916-ready-v1-3-4997bf277548@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Link: https://lore.kernel.org/r/20250916-ready-v1-3-4997bf277548@google.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/f_ncm.c | 78 +++++++++++++++--------------------- 1 file changed, 33 insertions(+), 45 deletions(-)
--- a/drivers/usb/gadget/function/f_ncm.c +++ b/drivers/usb/gadget/function/f_ncm.c @@ -11,6 +11,7 @@ * Copyright (C) 2008 Nokia Corporation */
+#include <linux/cleanup.h> #include <linux/kernel.h> #include <linux/interrupt.h> #include <linux/module.h> @@ -19,6 +20,7 @@ #include <linux/crc32.h>
#include <linux/usb/cdc.h> +#include <linux/usb/gadget.h>
#include "u_ether.h" #include "u_ether_configfs.h" @@ -1437,18 +1439,18 @@ static int ncm_bind(struct usb_configura struct usb_ep *ep; struct f_ncm_opts *ncm_opts;
+ struct usb_os_desc_table *os_desc_table __free(kfree) = NULL; + struct usb_request *request __free(free_usb_request) = NULL; + if (!can_support_ecm(cdev->gadget)) return -EINVAL;
ncm_opts = container_of(f->fi, struct f_ncm_opts, func_inst);
if (cdev->use_os_string) { - f->os_desc_table = kzalloc(sizeof(*f->os_desc_table), - GFP_KERNEL); - if (!f->os_desc_table) + os_desc_table = kzalloc(sizeof(*os_desc_table), GFP_KERNEL); + if (!os_desc_table) return -ENOMEM; - f->os_desc_n = 1; - f->os_desc_table[0].os_desc = &ncm_opts->ncm_os_desc; }
mutex_lock(&ncm_opts->lock); @@ -1458,16 +1460,15 @@ static int ncm_bind(struct usb_configura mutex_unlock(&ncm_opts->lock);
if (status) - goto fail; + return status;
ncm_opts->bound = true;
us = usb_gstrings_attach(cdev, ncm_strings, ARRAY_SIZE(ncm_string_defs)); - if (IS_ERR(us)) { - status = PTR_ERR(us); - goto fail; - } + if (IS_ERR(us)) + return PTR_ERR(us); + ncm_control_intf.iInterface = us[STRING_CTRL_IDX].id; ncm_data_nop_intf.iInterface = us[STRING_DATA_IDX].id; ncm_data_intf.iInterface = us[STRING_DATA_IDX].id; @@ -1477,55 +1478,47 @@ static int ncm_bind(struct usb_configura /* allocate instance-specific interface IDs */ status = usb_interface_id(c, f); if (status < 0) - goto fail; + return status; ncm->ctrl_id = status; ncm_iad_desc.bFirstInterface = status;
ncm_control_intf.bInterfaceNumber = status; ncm_union_desc.bMasterInterface0 = status;
- if (cdev->use_os_string) - f->os_desc_table[0].if_id = - ncm_iad_desc.bFirstInterface; - status = usb_interface_id(c, f); if (status < 0) - goto fail; + return status; ncm->data_id = status;
ncm_data_nop_intf.bInterfaceNumber = status; ncm_data_intf.bInterfaceNumber = status; ncm_union_desc.bSlaveInterface0 = status;
- status = -ENODEV; - /* allocate instance-specific endpoints */ ep = usb_ep_autoconfig(cdev->gadget, &fs_ncm_in_desc); if (!ep) - goto fail; + return -ENODEV; ncm->port.in_ep = ep;
ep = usb_ep_autoconfig(cdev->gadget, &fs_ncm_out_desc); if (!ep) - goto fail; + return -ENODEV; ncm->port.out_ep = ep;
ep = usb_ep_autoconfig(cdev->gadget, &fs_ncm_notify_desc); if (!ep) - goto fail; + return -ENODEV; ncm->notify = ep;
- status = -ENOMEM; - /* allocate notification request and buffer */ - ncm->notify_req = usb_ep_alloc_request(ep, GFP_KERNEL); - if (!ncm->notify_req) - goto fail; - ncm->notify_req->buf = kmalloc(NCM_STATUS_BYTECOUNT, GFP_KERNEL); - if (!ncm->notify_req->buf) - goto fail; - ncm->notify_req->context = ncm; - ncm->notify_req->complete = ncm_notify_complete; + request = usb_ep_alloc_request(ep, GFP_KERNEL); + if (!request) + return -ENOMEM; + request->buf = kmalloc(NCM_STATUS_BYTECOUNT, GFP_KERNEL); + if (!request->buf) + return -ENOMEM; + request->context = ncm; + request->complete = ncm_notify_complete;
/* * support all relevant hardware speeds... we expect that when @@ -1545,7 +1538,7 @@ static int ncm_bind(struct usb_configura status = usb_assign_descriptors(f, ncm_fs_function, ncm_hs_function, ncm_ss_function, ncm_ss_function); if (status) - goto fail; + return status;
/* * NOTE: all that is done without knowing or caring about @@ -1559,25 +1552,20 @@ static int ncm_bind(struct usb_configura hrtimer_init(&ncm->task_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_SOFT); ncm->task_timer.function = ncm_tx_timeout;
+ if (cdev->use_os_string) { + os_desc_table[0].os_desc = &ncm_opts->ncm_os_desc; + os_desc_table[0].if_id = ncm_iad_desc.bFirstInterface; + f->os_desc_table = no_free_ptr(os_desc_table); + f->os_desc_n = 1; + } + ncm->notify_req = no_free_ptr(request); + DBG(cdev, "CDC Network: %s speed IN/%s OUT/%s NOTIFY/%s\n", gadget_is_superspeed(c->cdev->gadget) ? "super" : gadget_is_dualspeed(c->cdev->gadget) ? "dual" : "full", ncm->port.in_ep->name, ncm->port.out_ep->name, ncm->notify->name); return 0; - -fail: - kfree(f->os_desc_table); - f->os_desc_n = 0; - - if (ncm->notify_req) { - kfree(ncm->notify_req->buf); - usb_ep_free_request(ncm->notify, ncm->notify_req); - } - - ERROR(cdev, "%s: can't bind, err %d\n", f->name, status); - - return status; }
static inline struct f_ncm_opts *to_f_ncm_opts(struct config_item *item)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit b0b0aa5d858d4d2fe39a5e4486e0550e858108f6 ]
del_timer_sync() does not return the number of times it tried to delete the timer which rearms itself. It's clearly documented:
The function returns whether it has deactivated a pending timer or not.
This part of the documentation is from 2003 where del_timer_sync() really returned the number of deletion attempts for unknown reasons. The code was rewritten in 2005, but the documentation was not updated.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Link: https://lore.kernel.org/r/20221123201624.452282769@linutronix.de Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/kernel-hacking/locking.rst | 3 +-- Documentation/translations/it_IT/kernel-hacking/locking.rst | 4 +--- 2 files changed, 2 insertions(+), 5 deletions(-)
--- a/Documentation/kernel-hacking/locking.rst +++ b/Documentation/kernel-hacking/locking.rst @@ -1006,8 +1006,7 @@ Another common problem is deleting timer calling add_timer() at the end of their timer function). Because this is a fairly common case which is prone to races, you should use del_timer_sync() (``include/linux/timer.h``) to -handle this case. It returns the number of times the timer had to be -deleted before we finally stopped it from adding itself back in. +handle this case.
Locking Speed ============= --- a/Documentation/translations/it_IT/kernel-hacking/locking.rst +++ b/Documentation/translations/it_IT/kernel-hacking/locking.rst @@ -1027,9 +1027,7 @@ Un altro problema è l'eliminazione dei da soli (chiamando add_timer() alla fine della loro esecuzione). Dato che questo è un problema abbastanza comune con una propensione alle corse critiche, dovreste usare del_timer_sync() -(``include/linux/timer.h``) per gestire questo caso. Questa ritorna il -numero di volte che il temporizzatore è stato interrotto prima che -fosse in grado di fermarlo senza che si riavviasse. +(``include/linux/timer.h``) per gestire questo caso.
Velocità della sincronizzazione ===============================
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Steven Rostedt (Google)" rostedt@goodmis.org
[ Upstream commit 80b55772d41d8afec68dbc4ff0368a9fe5d1f390 ]
A new "shutdown" timer state is being added to the generic timer code. One of the functions to change the timer into the state is called "timer_shutdown()". This means that there can not be other functions called "timer_shutdown()" as the timer code owns the "timer_*" name space.
Rename timer_shutdown() to spear_timer_shutdown() to avoid this conflict.
Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Acked-by: Arnd Bergmann arnd@arndb.de Acked-by: Viresh Kumar viresh.kumar@linaro.org Link: https://lkml.kernel.org/r/20221106212701.822440504@goodmis.org Link: https://lore.kernel.org/all/20221105060155.228348078@goodmis.org/ Link: https://lore.kernel.org/r/20221110064146.810953418@goodmis.org Link: https://lore.kernel.org/r/20221123201624.513863211@linutronix.de Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/mach-spear/time.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/arch/arm/mach-spear/time.c +++ b/arch/arm/mach-spear/time.c @@ -90,7 +90,7 @@ static void __init spear_clocksource_ini 200, 16, clocksource_mmio_readw_up); }
-static inline void timer_shutdown(struct clock_event_device *evt) +static inline void spear_timer_shutdown(struct clock_event_device *evt) { u16 val = readw(gpt_base + CR(CLKEVT));
@@ -101,7 +101,7 @@ static inline void timer_shutdown(struct
static int spear_shutdown(struct clock_event_device *evt) { - timer_shutdown(evt); + spear_timer_shutdown(evt);
return 0; } @@ -111,7 +111,7 @@ static int spear_set_oneshot(struct cloc u16 val;
/* stop the timer */ - timer_shutdown(evt); + spear_timer_shutdown(evt);
val = readw(gpt_base + CR(CLKEVT)); val |= CTRL_ONE_SHOT; @@ -126,7 +126,7 @@ static int spear_set_periodic(struct clo u16 val;
/* stop the timer */ - timer_shutdown(evt); + spear_timer_shutdown(evt);
period = clk_get_rate(gpt_clk) / HZ; period >>= CTRL_PRESCALER16;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Steven Rostedt (Google)" rostedt@goodmis.org
[ Upstream commit 73737a5833ace25a8408b0d3b783637cb6bf29d1 ]
A new "shutdown" timer state is being added to the generic timer code. One of the functions to change the timer into the state is called "timer_shutdown()". This means that there can not be other functions called "timer_shutdown()" as the timer code owns the "timer_*" name space.
Rename timer_shutdown() to arch_timer_shutdown() to avoid this conflict.
Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Acked-by: Marc Zyngier maz@kernel.org Link: https://lkml.kernel.org/r/20221106212702.002251651@goodmis.org Link: https://lore.kernel.org/all/20221105060155.409832154@goodmis.org/ Link: https://lore.kernel.org/r/20221110064146.981725531@goodmis.org Link: https://lore.kernel.org/r/20221123201624.574672568@linutronix.de Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clocksource/arm_arch_timer.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/clocksource/arm_arch_timer.c +++ b/drivers/clocksource/arm_arch_timer.c @@ -687,8 +687,8 @@ static irqreturn_t arch_timer_handler_vi return timer_handler(ARCH_TIMER_MEM_VIRT_ACCESS, evt); }
-static __always_inline int timer_shutdown(const int access, - struct clock_event_device *clk) +static __always_inline int arch_timer_shutdown(const int access, + struct clock_event_device *clk) { unsigned long ctrl;
@@ -701,22 +701,22 @@ static __always_inline int timer_shutdow
static int arch_timer_shutdown_virt(struct clock_event_device *clk) { - return timer_shutdown(ARCH_TIMER_VIRT_ACCESS, clk); + return arch_timer_shutdown(ARCH_TIMER_VIRT_ACCESS, clk); }
static int arch_timer_shutdown_phys(struct clock_event_device *clk) { - return timer_shutdown(ARCH_TIMER_PHYS_ACCESS, clk); + return arch_timer_shutdown(ARCH_TIMER_PHYS_ACCESS, clk); }
static int arch_timer_shutdown_virt_mem(struct clock_event_device *clk) { - return timer_shutdown(ARCH_TIMER_MEM_VIRT_ACCESS, clk); + return arch_timer_shutdown(ARCH_TIMER_MEM_VIRT_ACCESS, clk); }
static int arch_timer_shutdown_phys_mem(struct clock_event_device *clk) { - return timer_shutdown(ARCH_TIMER_MEM_PHYS_ACCESS, clk); + return arch_timer_shutdown(ARCH_TIMER_MEM_PHYS_ACCESS, clk); }
static __always_inline void set_next_event(const int access, unsigned long evt,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Steven Rostedt (Google)" rostedt@goodmis.org
[ Upstream commit 6e1fc2591f116dfb20b65cf27356475461d61bd8 ]
A new "shutdown" timer state is being added to the generic timer code. One of the functions to change the timer into the state is called "timer_shutdown()". This means that there can not be other functions called "timer_shutdown()" as the timer code owns the "timer_*" name space.
Rename timer_shutdown() to evt_timer_shutdown() to avoid this conflict.
Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Link: https://lkml.kernel.org/r/20221106212702.182883323@goodmis.org Link: https://lore.kernel.org/all/20221105060155.592778858@goodmis.org/ Link: https://lore.kernel.org/r/20221110064147.158230501@goodmis.org Link: https://lore.kernel.org/r/20221123201624.634354813@linutronix.de Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clocksource/timer-sp804.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/clocksource/timer-sp804.c +++ b/drivers/clocksource/timer-sp804.c @@ -155,14 +155,14 @@ static irqreturn_t sp804_timer_interrupt return IRQ_HANDLED; }
-static inline void timer_shutdown(struct clock_event_device *evt) +static inline void evt_timer_shutdown(struct clock_event_device *evt) { writel(0, common_clkevt->ctrl); }
static int sp804_shutdown(struct clock_event_device *evt) { - timer_shutdown(evt); + evt_timer_shutdown(evt); return 0; }
@@ -171,7 +171,7 @@ static int sp804_set_periodic(struct clo unsigned long ctrl = TIMER_CTRL_32BIT | TIMER_CTRL_IE | TIMER_CTRL_PERIODIC | TIMER_CTRL_ENABLE;
- timer_shutdown(evt); + evt_timer_shutdown(evt); writel(common_clkevt->reload, common_clkevt->load); writel(ctrl, common_clkevt->ctrl); return 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit 82ed6f7ef58f9634fe4462dd721902c580f01569 ]
The timer code still has a few BUG_ON()s left which are crashing the kernel in situations where it still can recover or simply refuse to take an action.
Remove the one in the hotplug callback which checks for the CPU being offline. If that happens then the whole hotplug machinery will explode in colourful ways.
Replace the rest with WARN_ON_ONCE() and conditional returns where appropriate.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Link: https://lore.kernel.org/r/20221123201624.769128888@linutronix.de Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/time/timer.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
--- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1208,7 +1208,8 @@ EXPORT_SYMBOL(timer_reduce); */ void add_timer(struct timer_list *timer) { - BUG_ON(timer_pending(timer)); + if (WARN_ON_ONCE(timer_pending(timer))) + return; __mod_timer(timer, timer->expires, MOD_TIMER_NOTPENDING); } EXPORT_SYMBOL(add_timer); @@ -1227,7 +1228,8 @@ void add_timer_on(struct timer_list *tim struct timer_base *new_base, *base; unsigned long flags;
- BUG_ON(timer_pending(timer) || !timer->function); + if (WARN_ON_ONCE(timer_pending(timer) || !timer->function)) + return;
new_base = get_timer_cpu_base(timer->flags, cpu);
@@ -2047,8 +2049,6 @@ int timers_dead_cpu(unsigned int cpu) struct timer_base *new_base; int b, i;
- BUG_ON(cpu_online(cpu)); - for (b = 0; b < NR_BASES; b++) { old_base = per_cpu_ptr(&timer_bases[b], cpu); new_base = get_cpu_ptr(&timer_bases[b]); @@ -2065,7 +2065,8 @@ int timers_dead_cpu(unsigned int cpu) */ forward_timer_base(new_base);
- BUG_ON(old_base->running_timer); + WARN_ON_ONCE(old_base->running_timer); + old_base->running_timer = NULL;
for (i = 0; i < WHEEL_SIZE; i++) migrate_timer_list(new_base, old_base->vectors + i);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit 87bdd932e85881895d4720255b40ac28749c4e32 ]
Adjust to the new preferred function names.
Suggested-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Link: https://lore.kernel.org/r/20221123201625.075320635@linutronix.de Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/RCU/Design/Requirements/Requirements.rst | 2 +- Documentation/core-api/local_ops.rst | 2 +- Documentation/kernel-hacking/locking.rst | 11 +++++------ Documentation/timers/hrtimers.rst | 2 +- Documentation/translations/it_IT/kernel-hacking/locking.rst | 10 +++++----- Documentation/translations/zh_CN/core-api/local_ops.rst | 2 +- 6 files changed, 14 insertions(+), 15 deletions(-)
--- a/Documentation/RCU/Design/Requirements/Requirements.rst +++ b/Documentation/RCU/Design/Requirements/Requirements.rst @@ -1858,7 +1858,7 @@ unloaded. After a given module has been one of its functions results in a segmentation fault. The module-unload functions must therefore cancel any delayed calls to loadable-module functions, for example, any outstanding mod_timer() must be dealt -with via del_timer_sync() or similar. +with via timer_delete_sync() or similar.
Unfortunately, there is no way to cancel an RCU callback; once you invoke call_rcu(), the callback function is eventually going to be --- a/Documentation/core-api/local_ops.rst +++ b/Documentation/core-api/local_ops.rst @@ -191,7 +191,7 @@ Here is a sample module which implements
static void __exit test_exit(void) { - del_timer_sync(&test_timer); + timer_delete_sync(&test_timer); }
module_init(test_init); --- a/Documentation/kernel-hacking/locking.rst +++ b/Documentation/kernel-hacking/locking.rst @@ -967,7 +967,7 @@ you might do the following::
while (list) { struct foo *next = list->next; - del_timer(&list->timer); + timer_delete(&list->timer); kfree(list); list = next; } @@ -981,7 +981,7 @@ the lock after we spin_unlock_bh(), and the element (which has already been freed!).
This can be avoided by checking the result of -del_timer(): if it returns 1, the timer has been deleted. +timer_delete(): if it returns 1, the timer has been deleted. If 0, it means (in this case) that it is currently running, so we can do::
@@ -990,7 +990,7 @@ do::
while (list) { struct foo *next = list->next; - if (!del_timer(&list->timer)) { + if (!timer_delete(&list->timer)) { /* Give timer a chance to delete this */ spin_unlock_bh(&list_lock); goto retry; @@ -1005,8 +1005,7 @@ do:: Another common problem is deleting timers which restart themselves (by calling add_timer() at the end of their timer function). Because this is a fairly common case which is prone to races, you should -use del_timer_sync() (``include/linux/timer.h``) to -handle this case. +use timer_delete_sync() (``include/linux/timer.h``) to
Locking Speed ============= @@ -1334,7 +1333,7 @@ lock.
- kfree()
-- add_timer() and del_timer() +- add_timer() and timer_delete()
Mutex API reference =================== --- a/Documentation/timers/hrtimers.rst +++ b/Documentation/timers/hrtimers.rst @@ -118,7 +118,7 @@ existing timer wheel code, as it is matu was not really a win, due to the different data structures. Also, the hrtimer functions now have clearer behavior and clearer names - such as hrtimer_try_to_cancel() and hrtimer_cancel() [which are roughly -equivalent to del_timer() and del_timer_sync()] - so there's no direct +equivalent to timer_delete() and timer_delete_sync()] - so there's no direct 1:1 mapping between them on the algorithmic level, and thus no real potential for code sharing either.
--- a/Documentation/translations/it_IT/kernel-hacking/locking.rst +++ b/Documentation/translations/it_IT/kernel-hacking/locking.rst @@ -990,7 +990,7 @@ potreste fare come segue::
while (list) { struct foo *next = list->next; - del_timer(&list->timer); + timer_delete(&list->timer); kfree(list); list = next; } @@ -1003,7 +1003,7 @@ e prenderà il *lock* solo dopo spin_unl di eliminare il suo oggetto (che però è già stato eliminato).
Questo può essere evitato controllando il valore di ritorno di -del_timer(): se ritorna 1, il temporizzatore è stato già +timer_delete(): se ritorna 1, il temporizzatore è stato già rimosso. Se 0, significa (in questo caso) che il temporizzatore è in esecuzione, quindi possiamo fare come segue::
@@ -1012,7 +1012,7 @@ esecuzione, quindi possiamo fare come se
while (list) { struct foo *next = list->next; - if (!del_timer(&list->timer)) { + if (!timer_delete(&list->timer)) { /* Give timer a chance to delete this */ spin_unlock_bh(&list_lock); goto retry; @@ -1026,7 +1026,7 @@ esecuzione, quindi possiamo fare come se Un altro problema è l'eliminazione dei temporizzatori che si riavviano da soli (chiamando add_timer() alla fine della loro esecuzione). Dato che questo è un problema abbastanza comune con una propensione -alle corse critiche, dovreste usare del_timer_sync() +alle corse critiche, dovreste usare timer_delete_sync() (``include/linux/timer.h``) per gestire questo caso.
Velocità della sincronizzazione @@ -1372,7 +1372,7 @@ contesto, o trattenendo un qualsiasi *lo
- kfree()
-- add_timer() e del_timer() +- add_timer() e timer_delete()
Riferimento per l'API dei Mutex =============================== --- a/Documentation/translations/zh_CN/core-api/local_ops.rst +++ b/Documentation/translations/zh_CN/core-api/local_ops.rst @@ -185,7 +185,7 @@ UP之间没有不同的行为,在你�
static void __exit test_exit(void) { - del_timer_sync(&test_timer); + timer_delete_sync(&test_timer); }
module_init(test_init);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit d02e382cef06cc73561dd32dfdc171c00dcc416d ]
Tearing down timers which have circular dependencies to other functionality, e.g. workqueues, where the timer can schedule work and work can arm timers, is not trivial.
In those cases it is desired to shutdown the timer in a way which prevents rearming of the timer. The mechanism to do so is to set timer->function to NULL and use this as an indicator for the timer arming functions to ignore the (re)arm request.
In preparation for that replace the warnings in the relevant code paths with checks for timer->function == NULL. If the pointer is NULL, then discard the rearm request silently.
Add debug_assert_init() instead of the WARN_ON_ONCE(!timer->function) checks so that debug objects can warn about non-initialized timers.
The warning of debug objects does not warn if timer->function == NULL. It warns when timer was not initialized using timer_setup[_on_stack]() or via DEFINE_TIMER(). If developers fail to enable debug objects and then waste lots of time to figure out why their non-initialized timer is not firing, they deserve it. Same for initializing a timer with a NULL function.
Co-developed-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Link: https://lore.kernel.org/all/20220407161745.7d6754b3@gandalf.local.home Link: https://lore.kernel.org/all/20221110064101.429013735@goodmis.org Link: https://lore.kernel.org/r/87wn7kdann.ffs@tglx Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/time/timer.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 52 insertions(+), 5 deletions(-)
--- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1017,7 +1017,7 @@ __mod_timer(struct timer_list *timer, un unsigned int idx = UINT_MAX; int ret = 0;
- BUG_ON(!timer->function); + debug_assert_init(timer);
/* * This is a common optimization triggered by the networking code - if @@ -1044,6 +1044,14 @@ __mod_timer(struct timer_list *timer, un * dequeue/enqueue dance. */ base = lock_timer_base(timer, &flags); + /* + * Has @timer been shutdown? This needs to be evaluated + * while holding base lock to prevent a race against the + * shutdown code. + */ + if (!timer->function) + goto out_unlock; + forward_timer_base(base);
if (timer_pending(timer) && (options & MOD_TIMER_REDUCE) && @@ -1070,6 +1078,14 @@ __mod_timer(struct timer_list *timer, un } } else { base = lock_timer_base(timer, &flags); + /* + * Has @timer been shutdown? This needs to be evaluated + * while holding base lock to prevent a race against the + * shutdown code. + */ + if (!timer->function) + goto out_unlock; + forward_timer_base(base); }
@@ -1128,8 +1144,12 @@ out_unlock: * mod_timer_pending() is the same for pending timers as mod_timer(), but * will not activate inactive timers. * + * If @timer->function == NULL then the start operation is silently + * discarded. + * * Return: - * * %0 - The timer was inactive and not modified + * * %0 - The timer was inactive and not modified or was in + * shutdown state and the operation was discarded * * %1 - The timer was active and requeued to expire at @expires */ int mod_timer_pending(struct timer_list *timer, unsigned long expires) @@ -1155,8 +1175,12 @@ EXPORT_SYMBOL(mod_timer_pending); * same timer, then mod_timer() is the only safe way to modify the timeout, * since add_timer() cannot modify an already running timer. * + * If @timer->function == NULL then the start operation is silently + * discarded. In this case the return value is 0 and meaningless. + * * Return: - * * %0 - The timer was inactive and started + * * %0 - The timer was inactive and started or was in shutdown + * state and the operation was discarded * * %1 - The timer was active and requeued to expire at @expires or * the timer was active and not modified because @expires did * not change the effective expiry time @@ -1176,8 +1200,12 @@ EXPORT_SYMBOL(mod_timer); * modify an enqueued timer if that would reduce the expiration time. If * @timer is not enqueued it starts the timer. * + * If @timer->function == NULL then the start operation is silently + * discarded. + * * Return: - * * %0 - The timer was inactive and started + * * %0 - The timer was inactive and started or was in shutdown + * state and the operation was discarded * * %1 - The timer was active and requeued to expire at @expires or * the timer was active and not modified because @expires * did not change the effective expiry time such that the @@ -1200,6 +1228,9 @@ EXPORT_SYMBOL(timer_reduce); * The @timer->expires and @timer->function fields must be set prior * to calling this function. * + * If @timer->function == NULL then the start operation is silently + * discarded. + * * If @timer->expires is already in the past @timer will be queued to * expire at the next timer tick. * @@ -1228,7 +1259,9 @@ void add_timer_on(struct timer_list *tim struct timer_base *new_base, *base; unsigned long flags;
- if (WARN_ON_ONCE(timer_pending(timer) || !timer->function)) + debug_assert_init(timer); + + if (WARN_ON_ONCE(timer_pending(timer))) return;
new_base = get_timer_cpu_base(timer->flags, cpu); @@ -1239,6 +1272,13 @@ void add_timer_on(struct timer_list *tim * wrong base locked. See lock_timer_base(). */ base = lock_timer_base(timer, &flags); + /* + * Has @timer been shutdown? This needs to be evaluated while + * holding base lock to prevent a race against the shutdown code. + */ + if (!timer->function) + goto out_unlock; + if (base != new_base) { timer->flags |= TIMER_MIGRATING;
@@ -1252,6 +1292,7 @@ void add_timer_on(struct timer_list *tim
debug_timer_activate(timer); internal_add_timer(base, timer); +out_unlock: raw_spin_unlock_irqrestore(&base->lock, flags); } EXPORT_SYMBOL_GPL(add_timer_on); @@ -1541,6 +1582,12 @@ static void expire_timers(struct timer_b
fn = timer->function;
+ if (WARN_ON_ONCE(!fn)) { + /* Should never happen. Emphasis on should! */ + base->running_timer = NULL; + continue; + } + if (timer->flags & TIMER_IRQSAFE) { raw_spin_unlock(&base->lock); call_timer_fn(timer, fn, baseclk);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit 8553b5f2774a66b1f293b7d783934210afb8f23c ]
Tearing down timers which have circular dependencies to other functionality, e.g. workqueues, where the timer can schedule work and work can arm timers, is not trivial.
In those cases it is desired to shutdown the timer in a way which prevents rearming of the timer. The mechanism to do so is to set timer->function to NULL and use this as an indicator for the timer arming functions to ignore the (re)arm request.
Split the inner workings of try_do_del_timer_sync(), del_timer_sync() and del_timer() into helper functions to prepare for implementing the shutdown functionality.
No functional change.
Co-developed-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Link: https://lore.kernel.org/all/20220407161745.7d6754b3@gandalf.local.home Link: https://lore.kernel.org/all/20221110064101.429013735@goodmis.org Link: https://lore.kernel.org/r/20221123201625.195147423@linutronix.de Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/time/timer.c | 143 +++++++++++++++++++++++++++++++++------------------- 1 file changed, 92 insertions(+), 51 deletions(-)
--- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1298,20 +1298,14 @@ out_unlock: EXPORT_SYMBOL_GPL(add_timer_on);
/** - * timer_delete - Deactivate a timer + * __timer_delete - Internal function: Deactivate a timer * @timer: The timer to be deactivated * - * The function only deactivates a pending timer, but contrary to - * timer_delete_sync() it does not take into account whether the timer's - * callback function is concurrently executed on a different CPU or not. - * It neither prevents rearming of the timer. If @timer can be rearmed - * concurrently then the return value of this function is meaningless. - * * Return: * * %0 - The timer was not pending * * %1 - The timer was pending and deactivated */ -int timer_delete(struct timer_list *timer) +static int __timer_delete(struct timer_list *timer) { struct timer_base *base; unsigned long flags; @@ -1327,25 +1321,37 @@ int timer_delete(struct timer_list *time
return ret; } -EXPORT_SYMBOL(timer_delete);
/** - * try_to_del_timer_sync - Try to deactivate a timer - * @timer: Timer to deactivate + * timer_delete - Deactivate a timer + * @timer: The timer to be deactivated * - * This function tries to deactivate a timer. On success the timer is not - * queued and the timer callback function is not running on any CPU. + * The function only deactivates a pending timer, but contrary to + * timer_delete_sync() it does not take into account whether the timer's + * callback function is concurrently executed on a different CPU or not. + * It neither prevents rearming of the timer. If @timer can be rearmed + * concurrently then the return value of this function is meaningless. * - * This function does not guarantee that the timer cannot be rearmed right - * after dropping the base lock. That needs to be prevented by the calling - * code if necessary. + * Return: + * * %0 - The timer was not pending + * * %1 - The timer was pending and deactivated + */ +int timer_delete(struct timer_list *timer) +{ + return __timer_delete(timer); +} +EXPORT_SYMBOL(timer_delete); + +/** + * __try_to_del_timer_sync - Internal function: Try to deactivate a timer + * @timer: Timer to deactivate * * Return: * * %0 - The timer was not pending * * %1 - The timer was pending and deactivated * * %-1 - The timer callback function is running on a different CPU */ -int try_to_del_timer_sync(struct timer_list *timer) +static int __try_to_del_timer_sync(struct timer_list *timer) { struct timer_base *base; unsigned long flags; @@ -1362,6 +1368,27 @@ int try_to_del_timer_sync(struct timer_l
return ret; } + +/** + * try_to_del_timer_sync - Try to deactivate a timer + * @timer: Timer to deactivate + * + * This function tries to deactivate a timer. On success the timer is not + * queued and the timer callback function is not running on any CPU. + * + * This function does not guarantee that the timer cannot be rearmed right + * after dropping the base lock. That needs to be prevented by the calling + * code if necessary. + * + * Return: + * * %0 - The timer was not pending + * * %1 - The timer was pending and deactivated + * * %-1 - The timer callback function is running on a different CPU + */ +int try_to_del_timer_sync(struct timer_list *timer) +{ + return __try_to_del_timer_sync(timer); +} EXPORT_SYMBOL(try_to_del_timer_sync);
#ifdef CONFIG_PREEMPT_RT @@ -1438,45 +1465,15 @@ static inline void del_timer_wait_runnin #endif
/** - * timer_delete_sync - Deactivate a timer and wait for the handler to finish. + * __timer_delete_sync - Internal function: Deactivate a timer and wait + * for the handler to finish. * @timer: The timer to be deactivated * - * Synchronization rules: Callers must prevent restarting of the timer, - * otherwise this function is meaningless. It must not be called from - * interrupt contexts unless the timer is an irqsafe one. The caller must - * not hold locks which would prevent completion of the timer's callback - * function. The timer's handler must not call add_timer_on(). Upon exit - * the timer is not queued and the handler is not running on any CPU. - * - * For !irqsafe timers, the caller must not hold locks that are held in - * interrupt context. Even if the lock has nothing to do with the timer in - * question. Here's why:: - * - * CPU0 CPU1 - * ---- ---- - * <SOFTIRQ> - * call_timer_fn(); - * base->running_timer = mytimer; - * spin_lock_irq(somelock); - * <IRQ> - * spin_lock(somelock); - * timer_delete_sync(mytimer); - * while (base->running_timer == mytimer); - * - * Now timer_delete_sync() will never return and never release somelock. - * The interrupt on the other CPU is waiting to grab somelock but it has - * interrupted the softirq that CPU0 is waiting to finish. - * - * This function cannot guarantee that the timer is not rearmed again by - * some concurrent or preempting code, right after it dropped the base - * lock. If there is the possibility of a concurrent rearm then the return - * value of the function is meaningless. - * * Return: * * %0 - The timer was not pending * * %1 - The timer was pending and deactivated */ -int timer_delete_sync(struct timer_list *timer) +static int __timer_delete_sync(struct timer_list *timer) { int ret;
@@ -1506,7 +1503,7 @@ int timer_delete_sync(struct timer_list lockdep_assert_preemption_enabled();
do { - ret = try_to_del_timer_sync(timer); + ret = __try_to_del_timer_sync(timer);
if (unlikely(ret < 0)) { del_timer_wait_running(timer); @@ -1516,6 +1513,50 @@ int timer_delete_sync(struct timer_list
return ret; } + +/** + * timer_delete_sync - Deactivate a timer and wait for the handler to finish. + * @timer: The timer to be deactivated + * + * Synchronization rules: Callers must prevent restarting of the timer, + * otherwise this function is meaningless. It must not be called from + * interrupt contexts unless the timer is an irqsafe one. The caller must + * not hold locks which would prevent completion of the timer's callback + * function. The timer's handler must not call add_timer_on(). Upon exit + * the timer is not queued and the handler is not running on any CPU. + * + * For !irqsafe timers, the caller must not hold locks that are held in + * interrupt context. Even if the lock has nothing to do with the timer in + * question. Here's why:: + * + * CPU0 CPU1 + * ---- ---- + * <SOFTIRQ> + * call_timer_fn(); + * base->running_timer = mytimer; + * spin_lock_irq(somelock); + * <IRQ> + * spin_lock(somelock); + * timer_delete_sync(mytimer); + * while (base->running_timer == mytimer); + * + * Now timer_delete_sync() will never return and never release somelock. + * The interrupt on the other CPU is waiting to grab somelock but it has + * interrupted the softirq that CPU0 is waiting to finish. + * + * This function cannot guarantee that the timer is not rearmed again by + * some concurrent or preempting code, right after it dropped the base + * lock. If there is the possibility of a concurrent rearm then the return + * value of the function is meaningless. + * + * Return: + * * %0 - The timer was not pending + * * %1 - The timer was pending and deactivated + */ +int timer_delete_sync(struct timer_list *timer) +{ + return __timer_delete_sync(timer); +} EXPORT_SYMBOL(timer_delete_sync);
static void call_timer_fn(struct timer_list *timer,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit 0cc04e80458a822300b93f82ed861a513edde194 ]
Tearing down timers which have circular dependencies to other functionality, e.g. workqueues, where the timer can schedule work and work can arm timers, is not trivial.
In those cases it is desired to shutdown the timer in a way which prevents rearming of the timer. The mechanism to do so is to set timer->function to NULL and use this as an indicator for the timer arming functions to ignore the (re)arm request.
Add a shutdown argument to the relevant internal functions which makes the actual deactivation code set timer->function to NULL which in turn prevents rearming of the timer.
Co-developed-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Link: https://lore.kernel.org/all/20220407161745.7d6754b3@gandalf.local.home Link: https://lore.kernel.org/all/20221110064101.429013735@goodmis.org Link: https://lore.kernel.org/r/20221123201625.253883224@linutronix.de Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/time/timer.c | 62 +++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 54 insertions(+), 8 deletions(-)
--- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1300,12 +1300,19 @@ EXPORT_SYMBOL_GPL(add_timer_on); /** * __timer_delete - Internal function: Deactivate a timer * @timer: The timer to be deactivated + * @shutdown: If true, this indicates that the timer is about to be + * shutdown permanently. + * + * If @shutdown is true then @timer->function is set to NULL under the + * timer base lock which prevents further rearming of the time. In that + * case any attempt to rearm @timer after this function returns will be + * silently ignored. * * Return: * * %0 - The timer was not pending * * %1 - The timer was pending and deactivated */ -static int __timer_delete(struct timer_list *timer) +static int __timer_delete(struct timer_list *timer, bool shutdown) { struct timer_base *base; unsigned long flags; @@ -1313,9 +1320,22 @@ static int __timer_delete(struct timer_l
debug_assert_init(timer);
- if (timer_pending(timer)) { + /* + * If @shutdown is set then the lock has to be taken whether the + * timer is pending or not to protect against a concurrent rearm + * which might hit between the lockless pending check and the lock + * aquisition. By taking the lock it is ensured that such a newly + * enqueued timer is dequeued and cannot end up with + * timer->function == NULL in the expiry code. + * + * If timer->function is currently executed, then this makes sure + * that the callback cannot requeue the timer. + */ + if (timer_pending(timer) || shutdown) { base = lock_timer_base(timer, &flags); ret = detach_if_pending(timer, base, true); + if (shutdown) + timer->function = NULL; raw_spin_unlock_irqrestore(&base->lock, flags); }
@@ -1338,20 +1358,31 @@ static int __timer_delete(struct timer_l */ int timer_delete(struct timer_list *timer) { - return __timer_delete(timer); + return __timer_delete(timer, false); } EXPORT_SYMBOL(timer_delete);
/** * __try_to_del_timer_sync - Internal function: Try to deactivate a timer * @timer: Timer to deactivate + * @shutdown: If true, this indicates that the timer is about to be + * shutdown permanently. + * + * If @shutdown is true then @timer->function is set to NULL under the + * timer base lock which prevents further rearming of the timer. Any + * attempt to rearm @timer after this function returns will be silently + * ignored. + * + * This function cannot guarantee that the timer cannot be rearmed + * right after dropping the base lock if @shutdown is false. That + * needs to be prevented by the calling code if necessary. * * Return: * * %0 - The timer was not pending * * %1 - The timer was pending and deactivated * * %-1 - The timer callback function is running on a different CPU */ -static int __try_to_del_timer_sync(struct timer_list *timer) +static int __try_to_del_timer_sync(struct timer_list *timer, bool shutdown) { struct timer_base *base; unsigned long flags; @@ -1363,6 +1394,8 @@ static int __try_to_del_timer_sync(struc
if (base->running_timer != timer) ret = detach_if_pending(timer, base, true); + if (shutdown) + timer->function = NULL;
raw_spin_unlock_irqrestore(&base->lock, flags);
@@ -1387,7 +1420,7 @@ static int __try_to_del_timer_sync(struc */ int try_to_del_timer_sync(struct timer_list *timer) { - return __try_to_del_timer_sync(timer); + return __try_to_del_timer_sync(timer, false); } EXPORT_SYMBOL(try_to_del_timer_sync);
@@ -1468,12 +1501,25 @@ static inline void del_timer_wait_runnin * __timer_delete_sync - Internal function: Deactivate a timer and wait * for the handler to finish. * @timer: The timer to be deactivated + * @shutdown: If true, @timer->function will be set to NULL under the + * timer base lock which prevents rearming of @timer + * + * If @shutdown is not set the timer can be rearmed later. If the timer can + * be rearmed concurrently, i.e. after dropping the base lock then the + * return value is meaningless. + * + * If @shutdown is set then @timer->function is set to NULL under timer + * base lock which prevents rearming of the timer. Any attempt to rearm + * a shutdown timer is silently ignored. + * + * If the timer should be reused after shutdown it has to be initialized + * again. * * Return: * * %0 - The timer was not pending * * %1 - The timer was pending and deactivated */ -static int __timer_delete_sync(struct timer_list *timer) +static int __timer_delete_sync(struct timer_list *timer, bool shutdown) { int ret;
@@ -1503,7 +1549,7 @@ static int __timer_delete_sync(struct ti lockdep_assert_preemption_enabled();
do { - ret = __try_to_del_timer_sync(timer); + ret = __try_to_del_timer_sync(timer, shutdown);
if (unlikely(ret < 0)) { del_timer_wait_running(timer); @@ -1555,7 +1601,7 @@ static int __timer_delete_sync(struct ti */ int timer_delete_sync(struct timer_list *timer) { - return __timer_delete_sync(timer); + return __timer_delete_sync(timer, false); } EXPORT_SYMBOL(timer_delete_sync);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit f571faf6e443b6011ccb585d57866177af1f643c ]
Tearing down timers which have circular dependencies to other functionality, e.g. workqueues, where the timer can schedule work and work can arm timers, is not trivial.
In those cases it is desired to shutdown the timer in a way which prevents rearming of the timer. The mechanism to do so is to set timer->function to NULL and use this as an indicator for the timer arming functions to ignore the (re)arm request.
Expose new interfaces for this: timer_shutdown_sync() and timer_shutdown().
timer_shutdown_sync() has the same functionality as timer_delete_sync() plus the NULL-ification of the timer function.
timer_shutdown() has the same functionality as timer_delete() plus the NULL-ification of the timer function.
In both cases the rearming of the timer is prevented by silently discarding rearm attempts due to timer->function being NULL.
Co-developed-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Link: https://lore.kernel.org/all/20220407161745.7d6754b3@gandalf.local.home Link: https://lore.kernel.org/all/20221110064101.429013735@goodmis.org Link: https://lore.kernel.org/r/20221123201625.314230270@linutronix.de Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/timer.h | 2 + kernel/time/timer.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 68 insertions(+)
--- a/include/linux/timer.h +++ b/include/linux/timer.h @@ -184,6 +184,8 @@ extern void add_timer(struct timer_list extern int try_to_del_timer_sync(struct timer_list *timer); extern int timer_delete_sync(struct timer_list *timer); extern int timer_delete(struct timer_list *timer); +extern int timer_shutdown_sync(struct timer_list *timer); +extern int timer_shutdown(struct timer_list *timer);
/** * del_timer_sync - Delete a pending timer and wait for a running callback --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1363,6 +1363,27 @@ int timer_delete(struct timer_list *time EXPORT_SYMBOL(timer_delete);
/** + * timer_shutdown - Deactivate a timer and prevent rearming + * @timer: The timer to be deactivated + * + * The function does not wait for an eventually running timer callback on a + * different CPU but it prevents rearming of the timer. Any attempt to arm + * @timer after this function returns will be silently ignored. + * + * This function is useful for teardown code and should only be used when + * timer_shutdown_sync() cannot be invoked due to locking or context constraints. + * + * Return: + * * %0 - The timer was not pending + * * %1 - The timer was pending + */ +int timer_shutdown(struct timer_list *timer) +{ + return __timer_delete(timer, true); +} +EXPORT_SYMBOL_GPL(timer_shutdown); + +/** * __try_to_del_timer_sync - Internal function: Try to deactivate a timer * @timer: Timer to deactivate * @shutdown: If true, this indicates that the timer is about to be @@ -1595,6 +1616,9 @@ static int __timer_delete_sync(struct ti * lock. If there is the possibility of a concurrent rearm then the return * value of the function is meaningless. * + * If such a guarantee is needed, e.g. for teardown situations then use + * timer_shutdown_sync() instead. + * * Return: * * %0 - The timer was not pending * * %1 - The timer was pending and deactivated @@ -1605,6 +1629,48 @@ int timer_delete_sync(struct timer_list } EXPORT_SYMBOL(timer_delete_sync);
+/** + * timer_shutdown_sync - Shutdown a timer and prevent rearming + * @timer: The timer to be shutdown + * + * When the function returns it is guaranteed that: + * - @timer is not queued + * - The callback function of @timer is not running + * - @timer cannot be enqueued again. Any attempt to rearm + * @timer is silently ignored. + * + * See timer_delete_sync() for synchronization rules. + * + * This function is useful for final teardown of an infrastructure where + * the timer is subject to a circular dependency problem. + * + * A common pattern for this is a timer and a workqueue where the timer can + * schedule work and work can arm the timer. On shutdown the workqueue must + * be destroyed and the timer must be prevented from rearming. Unless the + * code has conditionals like 'if (mything->in_shutdown)' to prevent that + * there is no way to get this correct with timer_delete_sync(). + * + * timer_shutdown_sync() is solving the problem. The correct ordering of + * calls in this case is: + * + * timer_shutdown_sync(&mything->timer); + * workqueue_destroy(&mything->workqueue); + * + * After this 'mything' can be safely freed. + * + * This obviously implies that the timer is not required to be functional + * for the rest of the shutdown operation. + * + * Return: + * * %0 - The timer was not pending + * * %1 - The timer was pending + */ +int timer_shutdown_sync(struct timer_list *timer) +{ + return __timer_delete_sync(timer, true); +} +EXPORT_SYMBOL_GPL(timer_shutdown_sync); + static void call_timer_fn(struct timer_list *timer, void (*fn)(struct timer_list *), unsigned long baseclk)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Steven Rostedt (Google)" rostedt@goodmis.org
[ Upstream commit a31323bef2b66455920d054b160c17d4240f8fd4 ]
In order to make sure that a timer is not re-armed after it is stopped before freeing, a new shutdown state is added to the timer code. The API timer_shutdown_sync() and timer_shutdown() must be called before the object that holds the timer can be freed.
Update the documentation to reflect this new workflow.
[ tglx: Updated to the new semantics and updated the zh_CN version ]
Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Link: https://lore.kernel.org/r/20221110064147.712934793@goodmis.org Link: https://lore.kernel.org/r/20221123201625.375284489@linutronix.de Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/RCU/Design/Requirements/Requirements.rst | 2 +- Documentation/core-api/local_ops.rst | 2 +- Documentation/kernel-hacking/locking.rst | 5 +++++ Documentation/translations/zh_CN/core-api/local_ops.rst | 2 +- 4 files changed, 8 insertions(+), 3 deletions(-)
--- a/Documentation/RCU/Design/Requirements/Requirements.rst +++ b/Documentation/RCU/Design/Requirements/Requirements.rst @@ -1858,7 +1858,7 @@ unloaded. After a given module has been one of its functions results in a segmentation fault. The module-unload functions must therefore cancel any delayed calls to loadable-module functions, for example, any outstanding mod_timer() must be dealt -with via timer_delete_sync() or similar. +with via timer_shutdown_sync() or similar.
Unfortunately, there is no way to cancel an RCU callback; once you invoke call_rcu(), the callback function is eventually going to be --- a/Documentation/core-api/local_ops.rst +++ b/Documentation/core-api/local_ops.rst @@ -191,7 +191,7 @@ Here is a sample module which implements
static void __exit test_exit(void) { - timer_delete_sync(&test_timer); + timer_shutdown_sync(&test_timer); }
module_init(test_init); --- a/Documentation/kernel-hacking/locking.rst +++ b/Documentation/kernel-hacking/locking.rst @@ -1007,6 +1007,11 @@ calling add_timer() at the end of their Because this is a fairly common case which is prone to races, you should use timer_delete_sync() (``include/linux/timer.h``) to
+Before freeing a timer, timer_shutdown() or timer_shutdown_sync() should be +called which will keep it from being rearmed. Any subsequent attempt to +rearm the timer will be silently ignored by the core code. + + Locking Speed =============
--- a/Documentation/translations/zh_CN/core-api/local_ops.rst +++ b/Documentation/translations/zh_CN/core-api/local_ops.rst @@ -185,7 +185,7 @@ UP之间没有不同的行为,在你�
static void __exit test_exit(void) { - timer_delete_sync(&test_timer); + timer_shutdown_sync(&test_timer); }
module_init(test_init);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit e0d3da982c96aeddc1bbf1cf9469dbb9ebdca657 ]
While discussing solutions for the teardown problem which results from circular dependencies between timers and workqueues, where timers schedule work from their timer callback and workqueues arm the timers from work items, it was discovered that the recent fix to the QCA code is incorrect.
That commit fixes the obvious problem of using del_timer() instead of del_timer_sync() and reorders the teardown calls to
destroy_workqueue(wq); del_timer_sync(t);
This makes it less likely to explode, but it's still broken:
destroy_workqueue(wq); /* After this point @wq cannot be touched anymore */
---> timer expires queue_work(wq) <---- Results in a NULL pointer dereference deep in the work queue core code. del_timer_sync(t);
Use the new timer_shutdown_sync() function to ensure that the timers are disarmed, no timer callbacks are running and the timers cannot be armed again. This restores the original teardown sequence:
timer_shutdown_sync(t); destroy_workqueue(wq);
which is now correct because the timer core silently ignores potential rearming attempts which can happen when destroy_workqueue() drains pending work before mopping up the workqueue.
Fixes: 72ef98445aca ("Bluetooth: hci_qca: Use del_timer_sync() before freeing") Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Guenter Roeck linux@roeck-us.net Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Acked-by: Luiz Augusto von Dentz luiz.dentz@gmail.com Link: https://lore.kernel.org/all/87iljhsftt.ffs@tglx Link: https://lore.kernel.org/r/20221123201625.435907114@linutronix.de Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/bluetooth/hci_qca.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/bluetooth/hci_qca.c +++ b/drivers/bluetooth/hci_qca.c @@ -710,9 +710,15 @@ static int qca_close(struct hci_uart *hu skb_queue_purge(&qca->tx_wait_q); skb_queue_purge(&qca->txq); skb_queue_purge(&qca->rx_memdump_q); + /* + * Shut the timers down so they can't be rearmed when + * destroy_workqueue() drains pending work which in turn might try + * to arm a timer. After shutdown rearm attempts are silently + * ignored by the timer core code. + */ + timer_shutdown_sync(&qca->tx_idle_timer); + timer_shutdown_sync(&qca->wake_retrans_timer); destroy_workqueue(qca->workqueue); - del_timer_sync(&qca->tx_idle_timer); - del_timer_sync(&qca->wake_retrans_timer); qca->hu = NULL;
kfree_skb(qca->rx_skb);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Benjamin Tissoires bentiss@kernel.org
commit 46f781e0d151844589dc2125c8cce3300546f92a upstream.
The sticky fingers quirk (MT_QUIRK_STICKY_FINGERS) was only considering the case when slots were not released during the last report. This can be problematic if the firmware forgets to release a finger while others are still present.
This was observed on the Synaptics DLL0945 touchpad found on the Dell XPS 9310 and the Dell Inspiron 5406.
Fixes: 4f4001bc76fd ("HID: multitouch: fix rare Win 8 cases when the touch up event gets missing") Cc: stable@vger.kernel.org Signed-off-by: Benjamin Tissoires bentiss@kernel.org Signed-off-by: Jiri Kosina jkosina@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hid/hid-multitouch.c | 27 ++++++++++++++------------- 1 file changed, 14 insertions(+), 13 deletions(-)
--- a/drivers/hid/hid-multitouch.c +++ b/drivers/hid/hid-multitouch.c @@ -83,9 +83,8 @@ enum latency_mode { HID_LATENCY_HIGH = 1, };
-#define MT_IO_FLAGS_RUNNING 0 -#define MT_IO_FLAGS_ACTIVE_SLOTS 1 -#define MT_IO_FLAGS_PENDING_SLOTS 2 +#define MT_IO_SLOTS_MASK GENMASK(7, 0) /* reserve first 8 bits for slot tracking */ +#define MT_IO_FLAGS_RUNNING 32
static const bool mtrue = true; /* default for true */ static const bool mfalse; /* default for false */ @@ -161,7 +160,11 @@ struct mt_device { struct mt_class mtclass; /* our mt device class */ struct timer_list release_timer; /* to release sticky fingers */ struct hid_device *hdev; /* hid_device we're attached to */ - unsigned long mt_io_flags; /* mt flags (MT_IO_FLAGS_*) */ + unsigned long mt_io_flags; /* mt flags (MT_IO_FLAGS_RUNNING) + * first 8 bits are reserved for keeping the slot + * states, this is fine because we only support up + * to 250 slots (MT_MAX_MAXCONTACT) + */ __u8 inputmode_value; /* InputMode HID feature value */ __u8 maxcontacts; bool is_buttonpad; /* is this device a button pad? */ @@ -936,6 +939,7 @@ static void mt_release_pending_palms(str
for_each_set_bit(slotnum, app->pending_palm_slots, td->maxcontacts) { clear_bit(slotnum, app->pending_palm_slots); + clear_bit(slotnum, &td->mt_io_flags);
input_mt_slot(input, slotnum); input_mt_report_slot_inactive(input); @@ -967,12 +971,6 @@ static void mt_sync_frame(struct mt_devi
app->num_received = 0; app->left_button_state = 0; - - if (test_bit(MT_IO_FLAGS_ACTIVE_SLOTS, &td->mt_io_flags)) - set_bit(MT_IO_FLAGS_PENDING_SLOTS, &td->mt_io_flags); - else - clear_bit(MT_IO_FLAGS_PENDING_SLOTS, &td->mt_io_flags); - clear_bit(MT_IO_FLAGS_ACTIVE_SLOTS, &td->mt_io_flags); }
static int mt_compute_timestamp(struct mt_application *app, __s32 value) @@ -1147,7 +1145,9 @@ static int mt_process_slot(struct mt_dev input_event(input, EV_ABS, ABS_MT_TOUCH_MAJOR, major); input_event(input, EV_ABS, ABS_MT_TOUCH_MINOR, minor);
- set_bit(MT_IO_FLAGS_ACTIVE_SLOTS, &td->mt_io_flags); + set_bit(slotnum, &td->mt_io_flags); + } else { + clear_bit(slotnum, &td->mt_io_flags); }
return 0; @@ -1282,7 +1282,7 @@ static void mt_touch_report(struct hid_d * defect. */ if (app->quirks & MT_QUIRK_STICKY_FINGERS) { - if (test_bit(MT_IO_FLAGS_PENDING_SLOTS, &td->mt_io_flags)) + if (td->mt_io_flags & MT_IO_SLOTS_MASK) mod_timer(&td->release_timer, jiffies + msecs_to_jiffies(100)); else @@ -1729,6 +1729,7 @@ static void mt_release_contacts(struct h for (i = 0; i < mt->num_slots; i++) { input_mt_slot(input_dev, i); input_mt_report_slot_inactive(input_dev); + clear_bit(i, &td->mt_io_flags); } input_mt_sync_frame(input_dev); input_sync(input_dev); @@ -1751,7 +1752,7 @@ static void mt_expired_timeout(struct ti */ if (test_and_set_bit_lock(MT_IO_FLAGS_RUNNING, &td->mt_io_flags)) return; - if (test_bit(MT_IO_FLAGS_PENDING_SLOTS, &td->mt_io_flags)) + if (td->mt_io_flags & MT_IO_SLOTS_MASK) mt_release_contacts(hdev); clear_bit_unlock(MT_IO_FLAGS_RUNNING, &td->mt_io_flags); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yuezhang Mo Yuezhang.Mo@sony.com
[ Upstream commit 154d1e7ad9e5ce4b2aaefd3862b3dba545ad978d ]
The commit 168316db3583("dax: assert that i_rwsem is held exclusive for writes") added lock assertions to ensure proper locking in DAX operations. However, these assertions trigger false-positive lockdep warnings since read lock is unnecessary on read-only filesystems(e.g., erofs).
This patch skips the read lock assertion for read-only filesystems, eliminating the spurious warnings while maintaining the integrity checks for writable filesystems.
Fixes: 168316db3583 ("dax: assert that i_rwsem is held exclusive for writes") Signed-off-by: Yuezhang Mo Yuezhang.Mo@sony.com Reviewed-by: Friendy Su friendy.su@sony.com Reviewed-by: Daniel Palmer daniel.palmer@sony.com Reviewed-by: Gao Xiang hsiangkao@linux.alibaba.com Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/dax.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/dax.c b/fs/dax.c index ca7138bb1d545..2ebe70de35ec3 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -1524,7 +1524,7 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter, if (iov_iter_rw(iter) == WRITE) { lockdep_assert_held_write(&iomi.inode->i_rwsem); iomi.flags |= IOMAP_WRITE; - } else { + } else if (!sb_rdonly(iomi.inode->i_sb)) { lockdep_assert_held(&iomi.inode->i_rwsem); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
[ Upstream commit ba569fb07a7e9e9b71e9282e27e993ba859295c2 ]
Commit 227619c3ff7c ("can: m_can: move runtime PM enable/disable to m_can_platform") moved the PM runtime enable from the m_can core driver into the m_can_platform.
That patch forgot to move the pm_runtime_disable() to m_can_plat_remove(), so that unloading the m_can_platform driver causes an "Unbalanced pm_runtime_enable!" error message.
Add the missing pm_runtime_disable() to m_can_plat_remove() to fix the problem.
Cc: Patrik Flykt patrik.flykt@linux.intel.com Fixes: 227619c3ff7c ("can: m_can: move runtime PM enable/disable to m_can_platform") Reviewed-by: Markus Schneider-Pargmann msp@baylibre.com Link: https://patch.msgid.link/20250929-m_can-fix-state-handling-v4-1-682b49b49d9a... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/m_can/m_can_platform.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/can/m_can/m_can_platform.c b/drivers/net/can/m_can/m_can_platform.c index de6d8e01bf2e8..71cf3662128a1 100644 --- a/drivers/net/can/m_can/m_can_platform.c +++ b/drivers/net/can/m_can/m_can_platform.c @@ -170,7 +170,7 @@ static int m_can_plat_remove(struct platform_device *pdev) struct m_can_classdev *mcan_class = &priv->cdev;
m_can_class_unregister(mcan_class); - + pm_runtime_disable(mcan_class->dev); m_can_class_free_dev(mcan_class->net);
return 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yeounsu Moon yyyynoom@gmail.com
[ Upstream commit 65946eac6d888d50ae527c4e5c237dbe5cc3a2f2 ]
There is no error handling for `dma_map_single()` failures.
Add error handling by checking `dma_mapping_error()` and freeing the `skb` using `dev_kfree_skb()` (process context) when it fails.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Yeounsu Moon yyyynoom@gmail.com Tested-on: D-Link DGE-550T Rev-A3 Suggested-by: Simon Horman horms@kernel.org Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/dlink/dl2k.c | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/dlink/dl2k.c b/drivers/net/ethernet/dlink/dl2k.c index bf58181589bf2..2221d5b7eeba9 100644 --- a/drivers/net/ethernet/dlink/dl2k.c +++ b/drivers/net/ethernet/dlink/dl2k.c @@ -498,25 +498,34 @@ static int alloc_list(struct net_device *dev) for (i = 0; i < RX_RING_SIZE; i++) { /* Allocated fixed size of skbuff */ struct sk_buff *skb; + dma_addr_t addr;
skb = netdev_alloc_skb_ip_align(dev, np->rx_buf_sz); np->rx_skbuff[i] = skb; - if (!skb) { - free_list(dev); - return -ENOMEM; - } + if (!skb) + goto err_free_list; + + addr = dma_map_single(&np->pdev->dev, skb->data, + np->rx_buf_sz, DMA_FROM_DEVICE); + if (dma_mapping_error(&np->pdev->dev, addr)) + goto err_kfree_skb;
np->rx_ring[i].next_desc = cpu_to_le64(np->rx_ring_dma + ((i + 1) % RX_RING_SIZE) * sizeof(struct netdev_desc)); /* Rubicon now supports 40 bits of addressing space. */ - np->rx_ring[i].fraginfo = - cpu_to_le64(dma_map_single(&np->pdev->dev, skb->data, - np->rx_buf_sz, DMA_FROM_DEVICE)); + np->rx_ring[i].fraginfo = cpu_to_le64(addr); np->rx_ring[i].fraginfo |= cpu_to_le64((u64)np->rx_buf_sz << 48); }
return 0; + +err_kfree_skb: + dev_kfree_skb(np->rx_skbuff[i]); + np->rx_skbuff[i] = NULL; +err_free_list: + free_list(dev); + return -ENOMEM; }
static void rio_hw_init(struct net_device *dev)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Dichtel nicolas.dichtel@6wind.com
[ Upstream commit 0b4b77eff5f8cd9be062783a1c1e198d46d0a753 ]
This sysctl is not per interface; it's global per netns.
Fixes: 292ecd9f5a94 ("doc: move seg6_flowlabel to seg6-sysctl.rst") Reported-by: Philippe Guibert philippe.guibert@6wind.com Signed-off-by: Nicolas Dichtel nicolas.dichtel@6wind.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/networking/seg6-sysctl.rst | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/Documentation/networking/seg6-sysctl.rst b/Documentation/networking/seg6-sysctl.rst index 07c20e470bafe..1b6af4779be11 100644 --- a/Documentation/networking/seg6-sysctl.rst +++ b/Documentation/networking/seg6-sysctl.rst @@ -25,6 +25,9 @@ seg6_require_hmac - INTEGER
Default is 0.
+/proc/sys/net/ipv6/seg6_* variables: +==================================== + seg6_flowlabel - INTEGER Controls the behaviour of computing the flowlabel of outer IPv6 header in case of SR T.encaps
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linmao Li lilinmao@kylinos.cn
[ Upstream commit 70f92ab97042f243e1c8da1c457ff56b9b3e49f1 ]
After resume from S4 (hibernate), RTL8168H/RTL8111H truncates incoming packets. Packet captures show messages like "IP truncated-ip - 146 bytes missing!".
The issue is caused by RxConfig not being properly re-initialized after resume. Re-initializing the RxConfig register before the chip re-initialization sequence avoids the truncation and restores correct packet reception.
This follows the same pattern as commit ef9da46ddef0 ("r8169: fix data corruption issue on RTL8402").
Fixes: 6e1d0b898818 ("r8169:add support for RTL8168H and RTL8107E") Signed-off-by: Linmao Li lilinmao@kylinos.cn Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Heiner Kallweit hkallweit1@gmail.com Link: https://patch.msgid.link/20251009122549.3955845-1-lilinmao@kylinos.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/realtek/r8169_main.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 6346821d480bd..6879660e44fad 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -4950,8 +4950,9 @@ static int rtl8169_resume(struct device *device) if (!device_may_wakeup(tp_to_dev(tp))) clk_prepare_enable(tp->clk);
- /* Reportedly at least Asus X453MA truncates packets otherwise */ - if (tp->mac_version == RTL_GIGA_MAC_VER_37) + /* Some chip versions may truncate packets without this initialization */ + if (tp->mac_version == RTL_GIGA_MAC_VER_37 || + tp->mac_version == RTL_GIGA_MAC_VER_46) rtl_init_rxcfg(tp);
return rtl8169_runtime_resume(device);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Safonov dima@arista.com
[ Upstream commit 21f4d45eba0b2dcae5dbc9e5e0ad08735c993f16 ]
Similarly to ipv4 tunnel, ipv6 version updates dev->needed_headroom, too. While ipv4 tunnel headroom adjustment growth was limited in commit 5ae1e9922bbd ("net: ip_tunnel: prevent perpetual headroom growth"), ipv6 tunnel yet increases the headroom without any ceiling.
Reflect ipv4 tunnel headroom adjustment limit on ipv6 version.
Credits to Francesco Ruggeri, who was originally debugging this issue and wrote local Arista-specific patch and a reproducer.
Fixes: 8eb30be0352d ("ipv6: Create ip6_tnl_xmit") Cc: Florian Westphal fw@strlen.de Cc: Francesco Ruggeri fruggeri05@gmail.com Signed-off-by: Dmitry Safonov dima@arista.com Link: https://patch.msgid.link/20251009-ip6_tunnel-headroom-v2-1-8e4dbd8f7e35@aris... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/ip_tunnels.h | 15 +++++++++++++++ net/ipv4/ip_tunnel.c | 14 -------------- net/ipv6/ip6_tunnel.c | 3 +-- 3 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h index 84751313b8265..e93db837412b2 100644 --- a/include/net/ip_tunnels.h +++ b/include/net/ip_tunnels.h @@ -481,6 +481,21 @@ struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md, int skb_tunnel_check_pmtu(struct sk_buff *skb, struct dst_entry *encap_dst, int headroom, bool reply);
+static inline void ip_tunnel_adj_headroom(struct net_device *dev, + unsigned int headroom) +{ + /* we must cap headroom to some upperlimit, else pskb_expand_head + * will overflow header offsets in skb_headers_offset_update(). + */ + const unsigned int max_allowed = 512; + + if (headroom > max_allowed) + headroom = max_allowed; + + if (headroom > READ_ONCE(dev->needed_headroom)) + WRITE_ONCE(dev->needed_headroom, headroom); +} + int iptunnel_handle_offloads(struct sk_buff *skb, int gso_type_mask);
static inline int iptunnel_pull_offloads(struct sk_buff *skb) diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c index 90e55b9979e69..dcf9e9c52a22a 100644 --- a/net/ipv4/ip_tunnel.c +++ b/net/ipv4/ip_tunnel.c @@ -567,20 +567,6 @@ static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb, return 0; }
-static void ip_tunnel_adj_headroom(struct net_device *dev, unsigned int headroom) -{ - /* we must cap headroom to some upperlimit, else pskb_expand_head - * will overflow header offsets in skb_headers_offset_update(). - */ - static const unsigned int max_allowed = 512; - - if (headroom > max_allowed) - headroom = max_allowed; - - if (headroom > READ_ONCE(dev->needed_headroom)) - WRITE_ONCE(dev->needed_headroom, headroom); -} - void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev, u8 proto, int tunnel_hlen) { diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index 2a470c0c38aef..dfca22c6d345d 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -1256,8 +1256,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield, */ max_headroom = LL_RESERVED_SPACE(dst->dev) + sizeof(struct ipv6hdr) + dst->header_len + t->hlen; - if (max_headroom > READ_ONCE(dev->needed_headroom)) - WRITE_ONCE(dev->needed_headroom, max_headroom); + ip_tunnel_adj_headroom(dev, max_headroom);
err = ip6_tnl_encap(skb, t, &proto, fl6); if (err)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Raju Rangoju Raju.Rangoju@amd.com
[ Upstream commit 2616222e423398bb374ffcb5d23dea4ba2c3e524 ]
During interface toggle operations (ifdown/ifup), the driver currently resets the local helper variable 'phy_link' to -1. This causes the link state machine to incorrectly interpret the state as a link change event, resulting in spurious "Link is down" messages being logged when the interface is brought back up.
Preserve the phy_link state across interface toggles to avoid treating the -1 sentinel value as a legitimate link state transition.
Fixes: 88131a812b16 ("amd-xgbe: Perform phy connect/disconnect at dev open/stop") Signed-off-by: Raju Rangoju Raju.Rangoju@amd.com Reviewed-by: Dawid Osuchowski dawid.osuchowski@linux.intel.com Link: https://patch.msgid.link/20251010065142.1189310-1-Raju.Rangoju@amd.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 1 - drivers/net/ethernet/amd/xgbe/xgbe-mdio.c | 1 + 2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c index 34d45cebefb5d..b4d57da71de2a 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c @@ -1172,7 +1172,6 @@ static void xgbe_free_rx_data(struct xgbe_prv_data *pdata)
static int xgbe_phy_reset(struct xgbe_prv_data *pdata) { - pdata->phy_link = -1; pdata->phy_speed = SPEED_UNKNOWN;
return pdata->phy_if.phy_reset(pdata); diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c b/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c index 19fed56b6ee3f..ebb8b3e5b9a88 100644 --- a/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c @@ -1636,6 +1636,7 @@ static int xgbe_phy_init(struct xgbe_prv_data *pdata) pdata->phy.duplex = DUPLEX_FULL; }
+ pdata->phy_link = 0; pdata->phy.link = 0;
pdata->phy.pause_autoneg = pdata->pause_autoneg;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 295ce1eb36ae47dc862d6c8a1012618a25516208 ]
Neal reported that using neper tcp_stream with TCP_TX_DELAY set to 50ms would often lead to flows stuck in a small cwnd mode, regardless of the congestion control.
While tcp_stream sets TCP_TX_DELAY too late after the connect(), it highlighted two kernel bugs.
The following heuristic in tcp_tso_should_defer() seems wrong for large RTT:
delta = tp->tcp_clock_cache - head->tstamp; /* If next ACK is likely to come too late (half srtt), do not defer */ if ((s64)(delta - (u64)NSEC_PER_USEC * (tp->srtt_us >> 4)) < 0) goto send_now;
If next ACK is expected to come in more than 1 ms, we should not defer because we prefer a smooth ACK clocking.
While blamed commit was a step in the good direction, it was not generic enough.
Another patch fixing TCP_TX_DELAY for established flows will be proposed when net-next reopens.
Fixes: 50c8339e9299 ("tcp: tso: restore IW10 after TSO autosizing") Reported-by: Neal Cardwell ncardwell@google.com Signed-off-by: Eric Dumazet edumazet@google.com Reviewed-by: Neal Cardwell ncardwell@google.com Tested-by: Neal Cardwell ncardwell@google.com Link: https://patch.msgid.link/20251011115742.1245771-1-edumazet@google.com [pabeni@redhat.com: fixed whitespace issue] Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_output.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 40568365cdb3b..a8d8e2f294ff2 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -2184,7 +2184,8 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb, u32 max_segs) { const struct inet_connection_sock *icsk = inet_csk(sk); - u32 send_win, cong_win, limit, in_flight; + u32 send_win, cong_win, limit, in_flight, threshold; + u64 srtt_in_ns, expected_ack, how_far_is_the_ack; struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *head; int win_divisor; @@ -2246,9 +2247,19 @@ static bool tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb, head = tcp_rtx_queue_head(sk); if (!head) goto send_now; - delta = tp->tcp_clock_cache - head->tstamp; - /* If next ACK is likely to come too late (half srtt), do not defer */ - if ((s64)(delta - (u64)NSEC_PER_USEC * (tp->srtt_us >> 4)) < 0) + + srtt_in_ns = (u64)(NSEC_PER_USEC >> 3) * tp->srtt_us; + /* When is the ACK expected ? */ + expected_ack = head->tstamp + srtt_in_ns; + /* How far from now is the ACK expected ? */ + how_far_is_the_ack = expected_ack - tp->tcp_clock_cache; + + /* If next ACK is likely to come too late, + * ie in more than min(1ms, half srtt), do not defer. + */ + threshold = min(srtt_in_ns >> 1, NSEC_PER_MSEC); + + if ((s64)(how_far_is_the_ack - threshold) > 0) goto send_now;
/* Ok, it looks like it is advisable to defer.
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Simakov bigalex934@gmail.com
[ Upstream commit 0c3f2e62815a43628e748b1e4ad97a1c46cce703 ]
Some execution paths that jump to the fiber_setup_done label could leave the remote_adv and local_adv variables uninitialized and then use it.
Initialize this variables at the point of definition to avoid this.
Fixes: 85730a631f0c ("tg3: Add SGMII phy support for 5719/5718 serdes") Co-developed-by: Alexandr Sapozhnikov alsp705@gmail.com Signed-off-by: Alexandr Sapozhnikov alsp705@gmail.com Signed-off-by: Alexey Simakov bigalex934@gmail.com Reviewed-by: Pavan Chebbi pavan.chebbi@broadcom.com Link: https://patch.msgid.link/20251014164736.5890-1-bigalex934@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/tg3.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c index 95d460237835d..8e5236142aaca 100644 --- a/drivers/net/ethernet/broadcom/tg3.c +++ b/drivers/net/ethernet/broadcom/tg3.c @@ -5814,7 +5814,7 @@ static int tg3_setup_fiber_mii_phy(struct tg3 *tp, bool force_reset) u32 current_speed = SPEED_UNKNOWN; u8 current_duplex = DUPLEX_UNKNOWN; bool current_link_up = false; - u32 local_adv, remote_adv, sgsr; + u32 local_adv = 0, remote_adv = 0, sgsr;
if ((tg3_asic_rev(tp) == ASIC_REV_5719 || tg3_asic_rev(tp) == ASIC_REV_5720) && @@ -5955,9 +5955,6 @@ static int tg3_setup_fiber_mii_phy(struct tg3 *tp, bool force_reset) else current_duplex = DUPLEX_HALF;
- local_adv = 0; - remote_adv = 0; - if (bmcr & BMCR_ANENABLE) { u32 common;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sascha Hauer s.hauer@pengutronix.de
[ Upstream commit 54001d0f2fdbc7852136a00f3e6fc395a9547ae5 ]
When asynchronous encryption is used KTLS sends out the final data at proto->close time. This becomes problematic when the task calling close() receives a signal. In this case it can happen that tcp_sendmsg_locked() called at close time returns -ERESTARTSYS and the final data is not sent.
The described situation happens when KTLS is used in conjunction with io_uring, as io_uring uses task_work_add() to add work to the current userspace task. A discussion of the problem along with a reproducer can be found in [1] and [2]
Fix this by waiting for the asynchronous encryption to be completed on the final message. With this there is no data left to be sent at close time.
[1] https://lore.kernel.org/all/20231010141932.GD3114228@pengutronix.de/ [2] https://lore.kernel.org/all/20240315100159.3898944-1-s.hauer@pengutronix.de/
Signed-off-by: Sascha Hauer s.hauer@pengutronix.de Link: https://patch.msgid.link/20240904-ktls-wait-async-v1-1-a62892833110@pengutro... Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: b014a4e066c5 ("tls: wait for async encrypt in case of error during latter iterations of sendmsg") Signed-off-by: Sasha Levin sashal@kernel.org --- net/tls/tls_sw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index fe6514e964ba3..c67cf1a06c0e5 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1184,7 +1184,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
if (!num_async) { goto send_end; - } else if (num_zc) { + } else if (num_zc || eor) { int err;
/* Wait for pending encryptions to get completed */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sabrina Dubroca sd@queasysnail.net
[ Upstream commit b014a4e066c555185b7c367efacdc33f16695495 ]
If we hit an error during the main loop of tls_sw_sendmsg_locked (eg failed allocation), we jump to send_end and immediately return. Previous iterations may have queued async encryption requests that are still pending. We should wait for those before returning, as we could otherwise be reading from memory that userspace believes we're not using anymore, which would be a sort of use-after-free.
This is similar to what tls_sw_recvmsg already does: failures during the main loop jump to the "wait for async" code, not straight to the unlock/return.
Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance") Reported-by: Jann Horn jannh@google.com Signed-off-by: Sabrina Dubroca sd@queasysnail.net Link: https://patch.msgid.link/c793efe9673b87f808d84fdefc0f732217030c52.1760432043... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/tls/tls_sw.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index c67cf1a06c0e5..0e378d7cb6903 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1029,7 +1029,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) if (ret == -EINPROGRESS) num_async++; else if (ret != -EAGAIN) - goto send_end; + goto end; } }
@@ -1182,8 +1182,9 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) goto alloc_encrypted; }
+send_end: if (!num_async) { - goto send_end; + goto end; } else if (num_zc || eor) { int err;
@@ -1201,7 +1202,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) tls_tx_records(sk, msg->msg_flags); }
-send_end: +end: ret = sk_stream_error(sk, msg->msg_flags, ret);
release_sock(sk);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sabrina Dubroca sd@queasysnail.net
[ Upstream commit b6fe4c29bb51cf239ecf48eacf72b924565cb619 ]
When userspace wants to send a non-DATA record (via the TLS_SET_RECORD_TYPE cmsg), we need to send any pending data from a previous MSG_MORE send() as a separate DATA record. If that DATA record is encrypted asynchronously, tls_handle_open_record will return -EINPROGRESS. This is currently treated as an error by tls_process_cmsg, and it will skip setting record_type to the correct value, but the caller (tls_sw_sendmsg_locked) handles that return value correctly and proceeds with sending the new message with an incorrect record_type (DATA instead of whatever was requested in the cmsg).
Always set record_type before handling the open record. If tls_handle_open_record returns an error, record_type will be ignored. If it succeeds, whether with synchronous crypto (returning 0) or asynchronous (returning -EINPROGRESS), the caller will proceed correctly.
Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance") Reported-by: Jann Horn jannh@google.com Signed-off-by: Sabrina Dubroca sd@queasysnail.net Link: https://patch.msgid.link/0457252e578a10a94e40c72ba6288b3a64f31662.1760432043... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/tls/tls_main.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c index 14d01558311d2..4797f68b9ec80 100644 --- a/net/tls/tls_main.c +++ b/net/tls/tls_main.c @@ -208,12 +208,9 @@ int tls_process_cmsg(struct sock *sk, struct msghdr *msg, if (msg->msg_flags & MSG_MORE) return -EINVAL;
- rc = tls_handle_open_record(sk, msg->msg_flags); - if (rc) - return rc; - *record_type = *(unsigned char *)CMSG_DATA(cmsg); - rc = 0; + + rc = tls_handle_open_record(sk, msg->msg_flags); break; default: return -EINVAL;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sabrina Dubroca sd@queasysnail.net
[ Upstream commit b8a6ff84abbcbbc445463de58704686011edc8e1 ]
Async decryption calls tls_strp_msg_hold to create a clone of the input skb to hold references to the memory it uses. If we fail to allocate that clone, proceeding with async decryption can lead to various issues (UAF on the skb, writing into userspace memory after the recv() call has returned).
In this case, wait for all pending decryption requests.
Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser") Reported-by: Jann Horn jannh@google.com Signed-off-by: Sabrina Dubroca sd@queasysnail.net Link: https://patch.msgid.link/b9fe61dcc07dab15da9b35cf4c7d86382a98caf2.1760432043... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/tls/tls_sw.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 0e378d7cb6903..baed07edc6395 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1724,8 +1724,10 @@ static int tls_decrypt_sg(struct sock *sk, struct iov_iter *out_iov,
if (unlikely(darg->async)) { err = tls_strp_msg_hold(&ctx->strp, &ctx->async_hold); - if (err) - __skb_queue_tail(&ctx->async_hold, darg->skb); + if (err) { + err = tls_decrypt_async_wait(ctx); + darg->async = false; + } return err; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sabrina Dubroca sd@queasysnail.net
[ Upstream commit 7f846c65ca11e63d2409868ff039081f80e42ae4 ]
With async crypto, we rely on tx_work to actually transmit records once encryption completes. But while send() is running, both the tx_lock and socket lock are held, so tx_work_handler cannot process the queue of encrypted records, and simply reschedules itself. During a large send(), this could last a long time, and use a lot of memory.
Transmit any pending encrypted records before restarting the main loop of tls_sw_sendmsg_locked.
Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance") Reported-by: Jann Horn jannh@google.com Signed-off-by: Sabrina Dubroca sd@queasysnail.net Link: https://patch.msgid.link/8396631478f70454b44afb98352237d33f48d34d.1760432043... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/tls/tls_sw.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index baed07edc6395..e7f151c98eb93 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1109,6 +1109,13 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) } else if (ret != -EAGAIN) goto send_end; } + + /* Transmit if any encryptions have completed */ + if (test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) { + cancel_delayed_work(&ctx->tx_work.work); + tls_tx_records(sk, msg->msg_flags); + } + continue; rollback_iter: copied -= try_to_copy; @@ -1163,6 +1170,12 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) goto send_end; } } + + /* Transmit if any encryptions have completed */ + if (test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask)) { + cancel_delayed_work(&ctx->tx_work.work); + tls_tx_records(sk, msg->msg_flags); + } }
continue;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oleksij Rempel o.rempel@pengutronix.de
[ Upstream commit 6f31135894ec96481e2bda93a1da70712f5e57c1 ]
Convert `lan78xx_init_mac_address` to return error codes and handle failures in register read and write operations. Update `lan78xx_reset` to check for errors during MAC address initialization and propagate them appropriately.
Signed-off-by: Oleksij Rempel o.rempel@pengutronix.de Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://patch.msgid.link/20241209130751.703182-3-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 8d93ff40d49d ("net: usb: lan78xx: fix use of improperly initialized dev->chipid in lan78xx_reset") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/lan78xx.c | 36 ++++++++++++++++++++++++++++-------- 1 file changed, 28 insertions(+), 8 deletions(-)
diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c index 0f1c9009d793e..08fb03bcf4952 100644 --- a/drivers/net/usb/lan78xx.c +++ b/drivers/net/usb/lan78xx.c @@ -1940,13 +1940,19 @@ static const struct ethtool_ops lan78xx_ethtool_ops = { .get_regs = lan78xx_get_regs, };
-static void lan78xx_init_mac_address(struct lan78xx_net *dev) +static int lan78xx_init_mac_address(struct lan78xx_net *dev) { u32 addr_lo, addr_hi; u8 addr[6]; + int ret; + + ret = lan78xx_read_reg(dev, RX_ADDRL, &addr_lo); + if (ret < 0) + return ret;
- lan78xx_read_reg(dev, RX_ADDRL, &addr_lo); - lan78xx_read_reg(dev, RX_ADDRH, &addr_hi); + ret = lan78xx_read_reg(dev, RX_ADDRH, &addr_hi); + if (ret < 0) + return ret;
addr[0] = addr_lo & 0xFF; addr[1] = (addr_lo >> 8) & 0xFF; @@ -1979,14 +1985,26 @@ static void lan78xx_init_mac_address(struct lan78xx_net *dev) (addr[2] << 16) | (addr[3] << 24); addr_hi = addr[4] | (addr[5] << 8);
- lan78xx_write_reg(dev, RX_ADDRL, addr_lo); - lan78xx_write_reg(dev, RX_ADDRH, addr_hi); + ret = lan78xx_write_reg(dev, RX_ADDRL, addr_lo); + if (ret < 0) + return ret; + + ret = lan78xx_write_reg(dev, RX_ADDRH, addr_hi); + if (ret < 0) + return ret; }
- lan78xx_write_reg(dev, MAF_LO(0), addr_lo); - lan78xx_write_reg(dev, MAF_HI(0), addr_hi | MAF_HI_VALID_); + ret = lan78xx_write_reg(dev, MAF_LO(0), addr_lo); + if (ret < 0) + return ret; + + ret = lan78xx_write_reg(dev, MAF_HI(0), addr_hi | MAF_HI_VALID_); + if (ret < 0) + return ret;
eth_hw_addr_set(dev->net, addr); + + return 0; }
/* MDIO read and write wrappers for phylib */ @@ -2910,7 +2928,9 @@ static int lan78xx_reset(struct lan78xx_net *dev) } } while (buf & HW_CFG_LRST_);
- lan78xx_init_mac_address(dev); + ret = lan78xx_init_mac_address(dev); + if (ret < 0) + return ret;
/* save DEVID for later usage */ ret = lan78xx_read_reg(dev, ID_REV, &buf);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: I Viswanath viswanathiyyappan@gmail.com
[ Upstream commit 8d93ff40d49d70e05c82a74beae31f883fe0eaf8 ]
dev->chipid is used in lan78xx_init_mac_address before it's initialized:
lan78xx_reset() { lan78xx_init_mac_address() lan78xx_read_eeprom() lan78xx_read_raw_eeprom() <- dev->chipid is used here
dev->chipid = ... <- dev->chipid is initialized correctly here }
Reorder initialization so that dev->chipid is set before calling lan78xx_init_mac_address().
Fixes: a0db7d10b76e ("lan78xx: Add to handle mux control per chip id") Signed-off-by: I Viswanath viswanathiyyappan@gmail.com Reviewed-by: Vadim Fedorenko vadim.fedorenko@linux.dev Reviewed-by: Khalid Aziz khalid@kernel.org Link: https://patch.msgid.link/20251013181648.35153-1-viswanathiyyappan@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/lan78xx.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c index 08fb03bcf4952..42f8fc71baee8 100644 --- a/drivers/net/usb/lan78xx.c +++ b/drivers/net/usb/lan78xx.c @@ -2928,10 +2928,6 @@ static int lan78xx_reset(struct lan78xx_net *dev) } } while (buf & HW_CFG_LRST_);
- ret = lan78xx_init_mac_address(dev); - if (ret < 0) - return ret; - /* save DEVID for later usage */ ret = lan78xx_read_reg(dev, ID_REV, &buf); if (ret < 0) @@ -2940,6 +2936,10 @@ static int lan78xx_reset(struct lan78xx_net *dev) dev->chipid = (buf & ID_REV_CHIP_ID_MASK_) >> 16; dev->chiprev = buf & ID_REV_CHIP_REV_MASK_;
+ ret = lan78xx_init_mac_address(dev); + if (ret < 0) + return ret; + /* Respond to the IN token with a NAK */ ret = lan78xx_read_reg(dev, USB_CFG0, &buf); if (ret < 0)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fabian Vogt fvogt@suse.de
[ Upstream commit 9e68bd803fac49274fde914466fd3b07c4d602c8 ]
When adding a kprobe such as "p:probe/tcp_sendmsg _text+15392192", arch_check_kprobe would start iterating all instructions starting from _text until the probed address. Not only is this very inefficient, but literal values in there (e.g. left by function patching) are misinterpreted in a way that causes a desync.
Fix this by doing it like x86: start the iteration at the closest preceding symbol instead of the given starting point.
Fixes: 87f48c7ccc73 ("riscv: kprobe: Fixup kernel panic when probing an illegal position") Signed-off-by: Fabian Vogt fvogt@suse.de Signed-off-by: Marvin Friedrich marvin.friedrich@suse.com Acked-by: Guo Ren guoren@kernel.org Link: https://lore.kernel.org/r/6191817.lOV4Wx5bFT@fvogt-thinkpad Signed-off-by: Paul Walmsley pjw@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/probes/kprobes.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c index cca2b3a2135ad..f4836ef13e9bb 100644 --- a/arch/riscv/kernel/probes/kprobes.c +++ b/arch/riscv/kernel/probes/kprobes.c @@ -48,10 +48,15 @@ static void __kprobes arch_simulate_insn(struct kprobe *p, struct pt_regs *regs) post_kprobe_handler(p, kcb, regs); }
-static bool __kprobes arch_check_kprobe(struct kprobe *p) +static bool __kprobes arch_check_kprobe(unsigned long addr) { - unsigned long tmp = (unsigned long)p->addr - p->offset; - unsigned long addr = (unsigned long)p->addr; + unsigned long tmp, offset; + + /* start iterating at the closest preceding symbol */ + if (!kallsyms_lookup_size_offset(addr, NULL, &offset)) + return false; + + tmp = addr - offset;
while (tmp <= addr) { if (tmp == addr) @@ -70,7 +75,7 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p) if ((unsigned long)insn & 0x1) return -EILSEQ;
- if (!arch_check_kprobe(p)) + if (!arch_check_kprobe((unsigned long)p->addr)) return -EILSEQ;
/* copy instruction */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Vasut marek.vasut@mailbox.org
[ Upstream commit db74b04edce1bc86b9a5acc724c7ca06f427ab60 ]
There is now a new LT9211 rev. U5, which reports chip ID 0x18 0x01 0xe4 . The previous LT9211 reported chip ID 0x18 0x01 0xe3 , which is what the driver checks for right now. Since there is a possibility there will be yet another revision of the LT9211 in the future, drop the last version nibble check to allow all future revisions of the chip to work with this driver.
This fix makes LT9211 rev. U5 work with this driver.
Fixes: 8ce4129e3de4 ("drm/bridge: lt9211: Add Lontium LT9211 bridge driver") Signed-off-by: Marek Vasut marek.vasut@mailbox.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@oss.qualcomm.com Link: https://lore.kernel.org/r/20251011110017.12521-1-marek.vasut@mailbox.org Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@oss.qualcomm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/bridge/lontium-lt9211.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/bridge/lontium-lt9211.c b/drivers/gpu/drm/bridge/lontium-lt9211.c index 933ca028d612d..fd581160d95e9 100644 --- a/drivers/gpu/drm/bridge/lontium-lt9211.c +++ b/drivers/gpu/drm/bridge/lontium-lt9211.c @@ -121,8 +121,7 @@ static int lt9211_read_chipid(struct lt9211 *ctx) }
/* Test for known Chip ID. */ - if (chipid[0] != REG_CHIPID0_VALUE || chipid[1] != REG_CHIPID1_VALUE || - chipid[2] != REG_CHIPID2_VALUE) { + if (chipid[0] != REG_CHIPID0_VALUE || chipid[1] != REG_CHIPID1_VALUE) { dev_err(ctx->dev, "Unknown Chip ID: 0x%02x 0x%02x 0x%02x\n", chipid[0], chipid[1], chipid[2]); return -EINVAL;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cristian Ciocaltea cristian.ciocaltea@collabora.com
[ Upstream commit 6e54919cb541fdf1063b16f3254c28d01bc9e5ff ]
The microphone detection work scheduled by a prior jack insertion interrupt may still be in a pending state or under execution when a jack ejection interrupt has been fired.
This might lead to a racing condition or nau8821_jdet_work() completing after nau8821_eject_jack(), which will override the currently disconnected state of the jack and incorrectly report the headphone or the headset as being connected.
Cancel any pending jdet_work or wait for its execution to finish before attempting to handle the ejection interrupt.
Proceed similarly before launching the eject handler as a consequence of detecting an invalid insert interrupt.
Fixes: aab1ad11d69f ("ASoC: nau8821: new driver") Signed-off-by: Cristian Ciocaltea cristian.ciocaltea@collabora.com Link: https://patch.msgid.link/20251003-nau8821-jdet-fixes-v1-1-f7b0e2543f09@colla... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/nau8821.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/sound/soc/codecs/nau8821.c b/sound/soc/codecs/nau8821.c index efd92656a060d..ae2becb30beaa 100644 --- a/sound/soc/codecs/nau8821.c +++ b/sound/soc/codecs/nau8821.c @@ -1063,6 +1063,7 @@ static irqreturn_t nau8821_interrupt(int irq, void *data)
if ((active_irq & NAU8821_JACK_EJECT_IRQ_MASK) == NAU8821_JACK_EJECT_DETECTED) { + cancel_work_sync(&nau8821->jdet_work); regmap_update_bits(regmap, NAU8821_R71_ANALOG_ADC_1, NAU8821_MICDET_MASK, NAU8821_MICDET_DIS); nau8821_eject_jack(nau8821); @@ -1077,11 +1078,11 @@ static irqreturn_t nau8821_interrupt(int irq, void *data) clear_irq = NAU8821_KEY_RELEASE_IRQ; } else if ((active_irq & NAU8821_JACK_INSERT_IRQ_MASK) == NAU8821_JACK_INSERT_DETECTED) { + cancel_work_sync(&nau8821->jdet_work); regmap_update_bits(regmap, NAU8821_R71_ANALOG_ADC_1, NAU8821_MICDET_MASK, NAU8821_MICDET_EN); if (nau8821_is_jack_inserted(regmap)) { /* detect microphone and jack type */ - cancel_work_sync(&nau8821->jdet_work); schedule_work(&nau8821->jdet_work); /* Turn off insertion interruption at manual mode */ regmap_update_bits(regmap,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cristian Ciocaltea cristian.ciocaltea@collabora.com
[ Upstream commit 9273aa85b35cc02d0953a1ba3b7bd694e5a2c10e ]
Instead of adding yet another utility function for dealing with the interrupt clearing register, generalize nau8821_int_status_clear_all() by renaming it to nau8821_irq_status_clear(), whilst introducing a second parameter to allow restricting the operation scope to a single interrupt instead of the whole range of active IRQs.
While at it, also fix a spelling typo in the comment block.
Note this is mainly a prerequisite for subsequent patches aiming to address some deficiencies in the implementation of the interrupt handler. Thus the presence of the Fixes tag below is intentional, to facilitate backporting.
Fixes: aab1ad11d69f ("ASoC: nau8821: new driver") Signed-off-by: Cristian Ciocaltea cristian.ciocaltea@collabora.com Link: https://patch.msgid.link/20251003-nau8821-jdet-fixes-v1-2-f7b0e2543f09@colla... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/nau8821.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/sound/soc/codecs/nau8821.c b/sound/soc/codecs/nau8821.c index ae2becb30beaa..380ceac4d8700 100644 --- a/sound/soc/codecs/nau8821.c +++ b/sound/soc/codecs/nau8821.c @@ -902,12 +902,17 @@ static bool nau8821_is_jack_inserted(struct regmap *regmap) return active_high == is_high; }
-static void nau8821_int_status_clear_all(struct regmap *regmap) +static void nau8821_irq_status_clear(struct regmap *regmap, int active_irq) { - int active_irq, clear_irq, i; + int clear_irq, i;
- /* Reset the intrruption status from rightmost bit if the corres- - * ponding irq event occurs. + if (active_irq) { + regmap_write(regmap, NAU8821_R11_INT_CLR_KEY_STATUS, active_irq); + return; + } + + /* Reset the interruption status from rightmost bit if the + * corresponding irq event occurs. */ regmap_read(regmap, NAU8821_R10_IRQ_STATUS, &active_irq); for (i = 0; i < NAU8821_REG_DATA_LEN; i++) { @@ -934,7 +939,7 @@ static void nau8821_eject_jack(struct nau8821 *nau8821) snd_soc_dapm_sync(dapm);
/* Clear all interruption status */ - nau8821_int_status_clear_all(regmap); + nau8821_irq_status_clear(regmap, 0);
/* Enable the insertion interruption, disable the ejection inter- * ruption, and then bypass de-bounce circuit. @@ -1400,7 +1405,7 @@ static int nau8821_resume_setup(struct nau8821 *nau8821) nau8821_configure_sysclk(nau8821, NAU8821_CLK_DIS, 0); if (nau8821->irq) { /* Clear all interruption status */ - nau8821_int_status_clear_all(regmap); + nau8821_irq_status_clear(regmap, 0);
/* Enable both insertion and ejection interruptions, and then * bypass de-bounce circuit.
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cristian Ciocaltea cristian.ciocaltea@collabora.com
[ Upstream commit 2b4eda7bf7d8a4e2f7575a98f55d8336dec0f302 ]
Stress testing the audio jack hotplug handling on a few Steam Deck units revealed that the debounce circuit is responsible for having a negative impact on the detection reliability, e.g. in some cases the ejection interrupt is not fired, while in other instances it goes into a kind of invalid state and generates a flood of misleading interrupts.
Add new entries to the DMI table introduced via commit 1bc40efdaf4a ("ASoC: nau8821: Add DMI quirk mechanism for active-high jack-detect") and extend the quirk logic to allow bypassing the debounce circuit used for jack detection on Valve Steam Deck LCD and OLED models.
While at it, rename existing NAU8821_JD_ACTIVE_HIGH quirk bitfield to NAU8821_QUIRK_JD_ACTIVE_HIGH. This should help improve code readability by differentiating from similarly named register bits.
Fixes: aab1ad11d69f ("ASoC: nau8821: new driver") Signed-off-by: Cristian Ciocaltea cristian.ciocaltea@collabora.com Link: https://patch.msgid.link/20251003-nau8821-jdet-fixes-v1-4-f7b0e2543f09@colla... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/nau8821.c | 33 +++++++++++++++++++++++++++------ 1 file changed, 27 insertions(+), 6 deletions(-)
diff --git a/sound/soc/codecs/nau8821.c b/sound/soc/codecs/nau8821.c index 380ceac4d8700..66309eede0dbd 100644 --- a/sound/soc/codecs/nau8821.c +++ b/sound/soc/codecs/nau8821.c @@ -26,7 +26,8 @@ #include <sound/tlv.h> #include "nau8821.h"
-#define NAU8821_JD_ACTIVE_HIGH BIT(0) +#define NAU8821_QUIRK_JD_ACTIVE_HIGH BIT(0) +#define NAU8821_QUIRK_JD_DB_BYPASS BIT(1)
static int nau8821_quirk; static int quirk_override = -1; @@ -1043,9 +1044,10 @@ static void nau8821_setup_inserted_irq(struct nau8821 *nau8821) regmap_update_bits(regmap, NAU8821_R1D_I2S_PCM_CTRL2, NAU8821_I2S_MS_MASK, NAU8821_I2S_MS_SLAVE);
- /* Not bypass de-bounce circuit */ - regmap_update_bits(regmap, NAU8821_R0D_JACK_DET_CTRL, - NAU8821_JACK_DET_DB_BYPASS, 0); + /* Do not bypass de-bounce circuit */ + if (!(nau8821_quirk & NAU8821_QUIRK_JD_DB_BYPASS)) + regmap_update_bits(regmap, NAU8821_R0D_JACK_DET_CTRL, + NAU8821_JACK_DET_DB_BYPASS, 0);
regmap_update_bits(regmap, NAU8821_R0F_INTERRUPT_MASK, NAU8821_IRQ_EJECT_EN, 0); @@ -1718,7 +1720,23 @@ static const struct dmi_system_id nau8821_quirk_table[] = { DMI_MATCH(DMI_SYS_VENDOR, "Positivo Tecnologia SA"), DMI_MATCH(DMI_BOARD_NAME, "CW14Q01P-V2"), }, - .driver_data = (void *)(NAU8821_JD_ACTIVE_HIGH), + .driver_data = (void *)(NAU8821_QUIRK_JD_ACTIVE_HIGH), + }, + { + /* Valve Steam Deck LCD */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Valve"), + DMI_MATCH(DMI_PRODUCT_NAME, "Jupiter"), + }, + .driver_data = (void *)(NAU8821_QUIRK_JD_DB_BYPASS), + }, + { + /* Valve Steam Deck OLED */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Valve"), + DMI_MATCH(DMI_PRODUCT_NAME, "Galileo"), + }, + .driver_data = (void *)(NAU8821_QUIRK_JD_DB_BYPASS), }, {} }; @@ -1760,9 +1778,12 @@ static int nau8821_i2c_probe(struct i2c_client *i2c)
nau8821_check_quirks();
- if (nau8821_quirk & NAU8821_JD_ACTIVE_HIGH) + if (nau8821_quirk & NAU8821_QUIRK_JD_ACTIVE_HIGH) nau8821->jkdet_polarity = 0;
+ if (nau8821_quirk & NAU8821_QUIRK_JD_DB_BYPASS) + dev_dbg(dev, "Force bypassing jack detection debounce circuit\n"); + nau8821_print_device_properties(nau8821);
nau8821_reset_chip(nau8821->regmap);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Timur Kristóf timur.kristof@gmail.com
[ Upstream commit 6917112af2ba36c5f19075eb9f2933ffd07e55bf ]
Remove extra multiplication.
CIK GPUs such as Hawaii appear to use PP_TABLE_V0 in which case the shutdown temperature is hardcoded in smu7_init_dpm_defaults and is already multiplied by 1000. The value was mistakenly multiplied another time by smu7_get_thermal_temperature_range.
Fixes: 4ba082572a42 ("drm/amd/powerplay: export the thermal ranges of VI asics (V2)") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1676 Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Timur Kristóf timur.kristof@gmail.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c index 530888c475be1..d13ab986a5c20 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c @@ -5435,8 +5435,7 @@ static int smu7_get_thermal_temperature_range(struct pp_hwmgr *hwmgr, thermal_data->max = table_info->cac_dtp_table->usSoftwareShutdownTemp * PP_TEMPERATURE_UNITS_PER_CENTIGRADES; else if (hwmgr->pp_table_version == PP_TABLE_V0) - thermal_data->max = data->thermal_temp_setting.temperature_shutdown * - PP_TEMPERATURE_UNITS_PER_CENTIGRADES; + thermal_data->max = data->thermal_temp_setting.temperature_shutdown;
thermal_data->sw_ctf_threshold = thermal_data->max;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alok Tiwari alok.a.tiwari@oracle.com
[ Upstream commit 7f38a1487555604bc4e210fa7cc9b1bce981c40e ]
The vop2_plane_atomic_check() function incorrectly checks drm_rect_width(dest) twice instead of verifying both width and height. Fix the second condition to use drm_rect_height(dest) so that invalid destination rectangles with height < 4 are correctly rejected.
Fixes: 604be85547ce ("drm/rockchip: Add VOP2 driver") Signed-off-by: Alok Tiwari alok.a.tiwari@oracle.com Reviewed-by: Andy Yan andy.yan@rock-chips.com Signed-off-by: Heiko Stuebner heiko@sntech.de Link: https://lore.kernel.org/r/20251012142005.660727-1-alok.a.tiwari@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/rockchip/rockchip_drm_vop2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c index 6efa0a51b7d65..e14557d80efc2 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c @@ -983,7 +983,7 @@ static int vop2_plane_atomic_check(struct drm_plane *plane, return format;
if (drm_rect_width(src) >> 16 < 4 || drm_rect_height(src) >> 16 < 4 || - drm_rect_width(dest) < 4 || drm_rect_width(dest) < 4) { + drm_rect_width(dest) < 4 || drm_rect_height(dest) < 4) { drm_err(vop2->drm, "Invalid size: %dx%d->%dx%d, min size is 4x4\n", drm_rect_width(src) >> 16, drm_rect_height(src) >> 16, drm_rect_width(dest), drm_rect_height(dest));
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ingo Molnar mingo@kernel.org
[ Upstream commit 7d058285cd77cc1411c91efd1b1673530bb1bee8 ]
Standardize scheduler load-balancing function names on the sched_balance_() prefix.
Signed-off-by: Ingo Molnar mingo@kernel.org Reviewed-by: Shrikanth Hegde sshegde@linux.ibm.com Link: https://lore.kernel.org/r/20240308111819.1101550-11-mingo@kernel.org Stable-dep-of: 17e3e88ed0b6 ("sched/fair: Fix pelt lost idle time detection") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 2deb896883d38..cf889d1ed13d1 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4261,7 +4261,7 @@ static inline unsigned long cfs_rq_load_avg(struct cfs_rq *cfs_rq) return cfs_rq->avg.load_avg; }
-static int newidle_balance(struct rq *this_rq, struct rq_flags *rf); +static int sched_balance_newidle(struct rq *this_rq, struct rq_flags *rf);
static inline unsigned long task_util(struct task_struct *p) { @@ -4590,7 +4590,7 @@ attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se) {} static inline void detach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se) {}
-static inline int newidle_balance(struct rq *rq, struct rq_flags *rf) +static inline int sched_balance_newidle(struct rq *rq, struct rq_flags *rf) { return 0; } @@ -7575,7 +7575,7 @@ balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf) if (rq->nr_running) return 1;
- return newidle_balance(rq, rf) != 0; + return sched_balance_newidle(rq, rf) != 0; } #endif /* CONFIG_SMP */
@@ -7911,10 +7911,10 @@ done: __maybe_unused; if (!rf) return NULL;
- new_tasks = newidle_balance(rq, rf); + new_tasks = sched_balance_newidle(rq, rf);
/* - * Because newidle_balance() releases (and re-acquires) rq->lock, it is + * Because sched_balance_newidle() releases (and re-acquires) rq->lock, it is * possible for any higher priority task to appear. In that case we * must re-start the pick_next_entity() loop. */ @@ -10786,7 +10786,7 @@ static int load_balance(int this_cpu, struct rq *this_rq, ld_moved = 0;
/* - * newidle_balance() disregards balance intervals, so we could + * sched_balance_newidle() disregards balance intervals, so we could * repeatedly reach this code, which would lead to balance_interval * skyrocketing in a short amount of time. Skip the balance_interval * increase logic to avoid that. @@ -11548,7 +11548,7 @@ static inline void nohz_newidle_balance(struct rq *this_rq) { } #endif /* CONFIG_NO_HZ_COMMON */
/* - * newidle_balance is called by schedule() if this_cpu is about to become + * sched_balance_newidle is called by schedule() if this_cpu is about to become * idle. Attempts to pull tasks from other CPUs. * * Returns: @@ -11556,7 +11556,7 @@ static inline void nohz_newidle_balance(struct rq *this_rq) { } * 0 - failed, no new tasks * > 0 - success, new (fair) tasks present */ -static int newidle_balance(struct rq *this_rq, struct rq_flags *rf) +static int sched_balance_newidle(struct rq *this_rq, struct rq_flags *rf) { unsigned long next_balance = jiffies + HZ; int this_cpu = this_rq->cpu;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vincent Guittot vincent.guittot@linaro.org
[ Upstream commit 17e3e88ed0b6318fde0d1c14df1a804711cab1b5 ]
The check for some lost idle pelt time should be always done when pick_next_task_fair() fails to pick a task and not only when we call it from the fair fast-path.
The case happens when the last running task on rq is a RT or DL task. When the latter goes to sleep and the /Sum of util_sum of the rq is at the max value, we don't account the lost of idle time whereas we should.
Fixes: 67692435c411 ("sched: Rework pick_next_task() slow-path") Signed-off-by: Vincent Guittot vincent.guittot@linaro.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/fair.c | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index cf889d1ed13d1..b6795bf15211c 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -7908,21 +7908,21 @@ done: __maybe_unused; return p;
idle: - if (!rf) - return NULL; - - new_tasks = sched_balance_newidle(rq, rf); + if (rf) { + new_tasks = sched_balance_newidle(rq, rf);
- /* - * Because sched_balance_newidle() releases (and re-acquires) rq->lock, it is - * possible for any higher priority task to appear. In that case we - * must re-start the pick_next_entity() loop. - */ - if (new_tasks < 0) - return RETRY_TASK; + /* + * Because sched_balance_newidle() releases (and re-acquires) + * rq->lock, it is possible for any higher priority task to + * appear. In that case we must re-start the pick_next_entity() + * loop. + */ + if (new_tasks < 0) + return RETRY_TASK;
- if (new_tasks > 0) - goto again; + if (new_tasks > 0) + goto again; + }
/* * rq is about to be idle, check if we need to update the
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit d41f68dff783d181a8fd462e612bda0fbab7f735 ]
Fix spelling of CIP_NO_HEADER to prevent a kernel-doc warning.
Warning: amdtp-stream.h:57 Enum value 'CIP_NO_HEADER' not described in enum 'cip_flags' Warning: amdtp-stream.h:57 Excess enum value '%CIP_NO_HEADERS' description in 'cip_flags'
Fixes: 3b196c394dd9f ("ALSA: firewire-lib: add no-header packet processing") Signed-off-by: Randy Dunlap rdunlap@infradead.org Reviewed-by: Takashi Sakamoto o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/firewire/amdtp-stream.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/firewire/amdtp-stream.h b/sound/firewire/amdtp-stream.h index 011d0f0c39415..dc70256ca2203 100644 --- a/sound/firewire/amdtp-stream.h +++ b/sound/firewire/amdtp-stream.h @@ -32,7 +32,7 @@ * allows 5 times as large as IEC 61883-6 defines. * @CIP_HEADER_WITHOUT_EOH: Only for in-stream. CIP Header doesn't include * valid EOH. - * @CIP_NO_HEADERS: a lack of headers in packets + * @CIP_NO_HEADER: a lack of headers in packets * @CIP_UNALIGHED_DBC: Only for in-stream. The value of dbc is not alighed to * the value of current SYT_INTERVAL; e.g. initial value is not zero. * @CIP_UNAWARE_SYT: For outgoing packet, the value in SYT field of CIP is 0xffff.
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiaming Zhang r772577952@gmail.com
[ Upstream commit 28412b489b088fb88dff488305fd4e56bd47f6e4 ]
In try_to_register_card(), the return value of usb_ifnum_to_if() is passed directly to usb_interface_claimed() without a NULL check, which will lead to a NULL pointer dereference when creating an invalid USB audio device. Fix this by adding a check to ensure the interface pointer is valid before passing it to usb_interface_claimed().
Fixes: 39efc9c8a973 ("ALSA: usb-audio: Fix last interface check for registration") Closes: https://lore.kernel.org/all/CANypQFYtQxHL5ghREs-BujZG413RPJGnO5TH=xjFBKpPts3... Signed-off-by: Jiaming Zhang r772577952@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/usb/card.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/sound/usb/card.c b/sound/usb/card.c index 5f539a1baef3d..d7fe1c22a48bb 100644 --- a/sound/usb/card.c +++ b/sound/usb/card.c @@ -753,10 +753,16 @@ get_alias_quirk(struct usb_device *dev, unsigned int id) */ static int try_to_register_card(struct snd_usb_audio *chip, int ifnum) { + struct usb_interface *iface; + if (check_delayed_register_option(chip) == ifnum || - chip->last_iface == ifnum || - usb_interface_claimed(usb_ifnum_to_if(chip->dev, chip->last_iface))) + chip->last_iface == ifnum) + return snd_card_register(chip->card); + + iface = usb_ifnum_to_if(chip->dev, chip->last_iface); + if (iface && usb_interface_claimed(iface)) return snd_card_register(chip->card); + return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Torokhov dmitry.torokhov@gmail.com
[ Upstream commit 0187c08058da3e7f11b356ac27e0c427d36f33f2 ]
Commit 581c4484769e ("HID: input: map digitizer battery usage") added handling of battery events for digitizers (typically for batteries presented in stylii). Digitizers typically report correct battery levels only when stylus is actively touching the surface, and in other cases they may report battery level of 0. To avoid confusing consumers of the battery information the code was added to filer out reports with 0 battery levels.
However there exist other kinds of devices that may legitimately report 0 battery levels. Fix this by filtering out 0-level reports only for digitizer usages, and continue reporting them for other kinds of devices (Smart Batteries, etc).
Reported-by: 卢国宏 luguohong@xiaomi.com Fixes: 581c4484769e ("HID: input: map digitizer battery usage") Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Jiri Kosina jkosina@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-input.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c index cd9d031858438..59ec205421753 100644 --- a/drivers/hid/hid-input.c +++ b/drivers/hid/hid-input.c @@ -636,7 +636,10 @@ static void hidinput_update_battery(struct hid_device *dev, unsigned int usage, return; }
- if (value == 0 || value < dev->battery_min || value > dev->battery_max) + if ((usage & HID_USAGE_PAGE) == HID_UP_DIGITIZER && value == 0) + return; + + if (value < dev->battery_min || value > dev->battery_max) return;
capacity = hidinput_scale_battery_capacity(dev, value);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thadeu Lima de Souza Cascardo cascardo@igalia.com
[ Upstream commit aa4daea418ee4215dca5c8636090660c545cb233 ]
HID_DG_PEN devices should have a suffix of "Stylus", as pointed out by commit c0ee1d571626 ("HID: hid-input: Add suffix also for HID_DG_PEN"). However, on multitouch devices, these suffixes may be overridden. Before that commit, HID_DG_PEN devices would get the "Stylus" suffix, but after that, multitouch would override them to have an "UNKNOWN" suffix. Just add HID_DG_PEN to the list of non-overriden suffixes in multitouch.
Before this fix:
[ 0.470981] input: ELAN9008:00 04F3:2E14 UNKNOWN as /devices/pci0000:00/0000:00:15.1/i2c_designware.1/i2c-16/i2c-ELAN9008:00/0018:04F3:2E14.0001/input/input8 ELAN9008:00 04F3:2E14 UNKNOWN
After this fix:
[ 0.474332] input: ELAN9008:00 04F3:2E14 Stylus as /devices/pci0000:00/0000:00:15.1/i2c_designware.1/i2c-16/i2c-ELAN9008:00/0018:04F3:2E14.0001/input/input8
ELAN9008:00 04F3:2E14 Stylus
Fixes: c0ee1d571626 ("HID: hid-input: Add suffix also for HID_DG_PEN") Signed-off-by: Thadeu Lima de Souza Cascardo cascardo@igalia.com Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Jiri Kosina jkosina@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-multitouch.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c index 6f1e54ee8f05d..b9e67b408a4b9 100644 --- a/drivers/hid/hid-multitouch.c +++ b/drivers/hid/hid-multitouch.c @@ -1658,6 +1658,7 @@ static int mt_input_configured(struct hid_device *hdev, struct hid_input *hi) case HID_CP_CONSUMER_CONTROL: case HID_GD_WIRELESS_RADIO_CTLS: case HID_GD_SYSTEM_MULTIAXIS: + case HID_DG_PEN: /* already handled by hid core */ break; case HID_DG_TOUCHSCREEN:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Viacheslav Dubeyko slava@dubeyko.com
commit 42520df65bf67189541a425f7d36b0b3e7bd7844 upstream.
The hfsplus_strcasecmp() logic can trigger the issue:
[ 117.317703][ T9855] ================================================================== [ 117.318353][ T9855] BUG: KASAN: slab-out-of-bounds in hfsplus_strcasecmp+0x1bc/0x490 [ 117.318991][ T9855] Read of size 2 at addr ffff88802160f40c by task repro/9855 [ 117.319577][ T9855] [ 117.319773][ T9855] CPU: 0 UID: 0 PID: 9855 Comm: repro Not tainted 6.17.0-rc6 #33 PREEMPT(full) [ 117.319780][ T9855] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 117.319783][ T9855] Call Trace: [ 117.319785][ T9855] <TASK> [ 117.319788][ T9855] dump_stack_lvl+0x1c1/0x2a0 [ 117.319795][ T9855] ? __virt_addr_valid+0x1c8/0x5c0 [ 117.319803][ T9855] ? __pfx_dump_stack_lvl+0x10/0x10 [ 117.319808][ T9855] ? rcu_is_watching+0x15/0xb0 [ 117.319816][ T9855] ? lock_release+0x4b/0x3e0 [ 117.319821][ T9855] ? __kasan_check_byte+0x12/0x40 [ 117.319828][ T9855] ? __virt_addr_valid+0x1c8/0x5c0 [ 117.319835][ T9855] ? __virt_addr_valid+0x4a5/0x5c0 [ 117.319842][ T9855] print_report+0x17e/0x7e0 [ 117.319848][ T9855] ? __virt_addr_valid+0x1c8/0x5c0 [ 117.319855][ T9855] ? __virt_addr_valid+0x4a5/0x5c0 [ 117.319862][ T9855] ? __phys_addr+0xd3/0x180 [ 117.319869][ T9855] ? hfsplus_strcasecmp+0x1bc/0x490 [ 117.319876][ T9855] kasan_report+0x147/0x180 [ 117.319882][ T9855] ? hfsplus_strcasecmp+0x1bc/0x490 [ 117.319891][ T9855] hfsplus_strcasecmp+0x1bc/0x490 [ 117.319900][ T9855] ? __pfx_hfsplus_cat_case_cmp_key+0x10/0x10 [ 117.319906][ T9855] hfs_find_rec_by_key+0xa9/0x1e0 [ 117.319913][ T9855] __hfsplus_brec_find+0x18e/0x470 [ 117.319920][ T9855] ? __pfx_hfsplus_bnode_find+0x10/0x10 [ 117.319926][ T9855] ? __pfx_hfs_find_rec_by_key+0x10/0x10 [ 117.319933][ T9855] ? __pfx___hfsplus_brec_find+0x10/0x10 [ 117.319942][ T9855] hfsplus_brec_find+0x28f/0x510 [ 117.319949][ T9855] ? __pfx_hfs_find_rec_by_key+0x10/0x10 [ 117.319956][ T9855] ? __pfx_hfsplus_brec_find+0x10/0x10 [ 117.319963][ T9855] ? __kmalloc_noprof+0x2a9/0x510 [ 117.319969][ T9855] ? hfsplus_find_init+0x8c/0x1d0 [ 117.319976][ T9855] hfsplus_brec_read+0x2b/0x120 [ 117.319983][ T9855] hfsplus_lookup+0x2aa/0x890 [ 117.319990][ T9855] ? __pfx_hfsplus_lookup+0x10/0x10 [ 117.320003][ T9855] ? d_alloc_parallel+0x2f0/0x15e0 [ 117.320008][ T9855] ? __lock_acquire+0xaec/0xd80 [ 117.320013][ T9855] ? __pfx_d_alloc_parallel+0x10/0x10 [ 117.320019][ T9855] ? __raw_spin_lock_init+0x45/0x100 [ 117.320026][ T9855] ? __init_waitqueue_head+0xa9/0x150 [ 117.320034][ T9855] __lookup_slow+0x297/0x3d0 [ 117.320039][ T9855] ? __pfx___lookup_slow+0x10/0x10 [ 117.320045][ T9855] ? down_read+0x1ad/0x2e0 [ 117.320055][ T9855] lookup_slow+0x53/0x70 [ 117.320065][ T9855] walk_component+0x2f0/0x430 [ 117.320073][ T9855] path_lookupat+0x169/0x440 [ 117.320081][ T9855] filename_lookup+0x212/0x590 [ 117.320089][ T9855] ? __pfx_filename_lookup+0x10/0x10 [ 117.320098][ T9855] ? strncpy_from_user+0x150/0x290 [ 117.320105][ T9855] ? getname_flags+0x1e5/0x540 [ 117.320112][ T9855] user_path_at+0x3a/0x60 [ 117.320117][ T9855] __x64_sys_umount+0xee/0x160 [ 117.320123][ T9855] ? __pfx___x64_sys_umount+0x10/0x10 [ 117.320129][ T9855] ? do_syscall_64+0xb7/0x3a0 [ 117.320135][ T9855] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 117.320141][ T9855] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 117.320145][ T9855] do_syscall_64+0xf3/0x3a0 [ 117.320150][ T9855] ? exc_page_fault+0x9f/0xf0 [ 117.320154][ T9855] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 117.320158][ T9855] RIP: 0033:0x7f7dd7908b07 [ 117.320163][ T9855] Code: 23 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 08 [ 117.320167][ T9855] RSP: 002b:00007ffd5ebd9698 EFLAGS: 00000202 ORIG_RAX: 00000000000000a6 [ 117.320172][ T9855] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7dd7908b07 [ 117.320176][ T9855] RDX: 0000000000000009 RSI: 0000000000000009 RDI: 00007ffd5ebd9740 [ 117.320179][ T9855] RBP: 00007ffd5ebda780 R08: 0000000000000005 R09: 00007ffd5ebd9530 [ 117.320181][ T9855] R10: 00007f7dd799bfc0 R11: 0000000000000202 R12: 000055e2008b32d0 [ 117.320184][ T9855] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 117.320189][ T9855] </TASK> [ 117.320190][ T9855] [ 117.351311][ T9855] Allocated by task 9855: [ 117.351683][ T9855] kasan_save_track+0x3e/0x80 [ 117.352093][ T9855] __kasan_kmalloc+0x8d/0xa0 [ 117.352490][ T9855] __kmalloc_noprof+0x288/0x510 [ 117.352914][ T9855] hfsplus_find_init+0x8c/0x1d0 [ 117.353342][ T9855] hfsplus_lookup+0x19c/0x890 [ 117.353747][ T9855] __lookup_slow+0x297/0x3d0 [ 117.354148][ T9855] lookup_slow+0x53/0x70 [ 117.354514][ T9855] walk_component+0x2f0/0x430 [ 117.354921][ T9855] path_lookupat+0x169/0x440 [ 117.355325][ T9855] filename_lookup+0x212/0x590 [ 117.355740][ T9855] user_path_at+0x3a/0x60 [ 117.356115][ T9855] __x64_sys_umount+0xee/0x160 [ 117.356529][ T9855] do_syscall_64+0xf3/0x3a0 [ 117.356920][ T9855] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 117.357429][ T9855] [ 117.357636][ T9855] The buggy address belongs to the object at ffff88802160f000 [ 117.357636][ T9855] which belongs to the cache kmalloc-2k of size 2048 [ 117.358827][ T9855] The buggy address is located 0 bytes to the right of [ 117.358827][ T9855] allocated 1036-byte region [ffff88802160f000, ffff88802160f40c) [ 117.360061][ T9855] [ 117.360266][ T9855] The buggy address belongs to the physical page: [ 117.360813][ T9855] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x21608 [ 117.361562][ T9855] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 117.362285][ T9855] flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff) [ 117.362929][ T9855] page_type: f5(slab) [ 117.363282][ T9855] raw: 00fff00000000040 ffff88801a842f00 ffffea0000932000 dead000000000002 [ 117.364015][ T9855] raw: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000 [ 117.364750][ T9855] head: 00fff00000000040 ffff88801a842f00 ffffea0000932000 dead000000000002 [ 117.365491][ T9855] head: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000 [ 117.366232][ T9855] head: 00fff00000000003 ffffea0000858201 00000000ffffffff 00000000ffffffff [ 117.366968][ T9855] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008 [ 117.367711][ T9855] page dumped because: kasan: bad access detected [ 117.368259][ T9855] page_owner tracks the page as allocated [ 117.368745][ T9855] page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN1 [ 117.370541][ T9855] post_alloc_hook+0x240/0x2a0 [ 117.370954][ T9855] get_page_from_freelist+0x2101/0x21e0 [ 117.371435][ T9855] __alloc_frozen_pages_noprof+0x274/0x380 [ 117.371935][ T9855] alloc_pages_mpol+0x241/0x4b0 [ 117.372360][ T9855] allocate_slab+0x8d/0x380 [ 117.372752][ T9855] ___slab_alloc+0xbe3/0x1400 [ 117.373159][ T9855] __kmalloc_cache_noprof+0x296/0x3d0 [ 117.373621][ T9855] nexthop_net_init+0x75/0x100 [ 117.374038][ T9855] ops_init+0x35c/0x5c0 [ 117.374400][ T9855] setup_net+0x10c/0x320 [ 117.374768][ T9855] copy_net_ns+0x31b/0x4d0 [ 117.375156][ T9855] create_new_namespaces+0x3f3/0x720 [ 117.375613][ T9855] unshare_nsproxy_namespaces+0x11c/0x170 [ 117.376094][ T9855] ksys_unshare+0x4ca/0x8d0 [ 117.376477][ T9855] __x64_sys_unshare+0x38/0x50 [ 117.376879][ T9855] do_syscall_64+0xf3/0x3a0 [ 117.377265][ T9855] page last free pid 9110 tgid 9110 stack trace: [ 117.377795][ T9855] __free_frozen_pages+0xbeb/0xd50 [ 117.378229][ T9855] __put_partials+0x152/0x1a0 [ 117.378625][ T9855] put_cpu_partial+0x17c/0x250 [ 117.379026][ T9855] __slab_free+0x2d4/0x3c0 [ 117.379404][ T9855] qlist_free_all+0x97/0x140 [ 117.379790][ T9855] kasan_quarantine_reduce+0x148/0x160 [ 117.380250][ T9855] __kasan_slab_alloc+0x22/0x80 [ 117.380662][ T9855] __kmalloc_noprof+0x232/0x510 [ 117.381074][ T9855] tomoyo_supervisor+0xc0a/0x1360 [ 117.381498][ T9855] tomoyo_env_perm+0x149/0x1e0 [ 117.381903][ T9855] tomoyo_find_next_domain+0x15ad/0x1b90 [ 117.382378][ T9855] tomoyo_bprm_check_security+0x11c/0x180 [ 117.382859][ T9855] security_bprm_check+0x89/0x280 [ 117.383289][ T9855] bprm_execve+0x8f1/0x14a0 [ 117.383673][ T9855] do_execveat_common+0x528/0x6b0 [ 117.384103][ T9855] __x64_sys_execve+0x94/0xb0 [ 117.384500][ T9855] [ 117.384706][ T9855] Memory state around the buggy address: [ 117.385179][ T9855] ffff88802160f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 117.385854][ T9855] ffff88802160f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 117.386534][ T9855] >ffff88802160f400: 00 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 117.387204][ T9855] ^ [ 117.387566][ T9855] ffff88802160f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 117.388243][ T9855] ffff88802160f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 117.388918][ T9855] ==================================================================
The issue takes place if the length field of struct hfsplus_unistr is bigger than HFSPLUS_MAX_STRLEN. The patch simply checks the length of comparing strings. And if the strings' length is bigger than HFSPLUS_MAX_STRLEN, then it is corrected to this value.
v2 The string length correction has been added for hfsplus_strcmp().
Reported-by: Jiaming Zhang r772577952@gmail.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com cc: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de cc: Yangtao Li frank.li@vivo.com cc: linux-fsdevel@vger.kernel.org cc: syzkaller@googlegroups.com Link: https://lore.kernel.org/r/20250919191243.1370388-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/hfsplus/unicode.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
--- a/fs/hfsplus/unicode.c +++ b/fs/hfsplus/unicode.c @@ -40,6 +40,18 @@ int hfsplus_strcasecmp(const struct hfsp p1 = s1->unicode; p2 = s2->unicode;
+ if (len1 > HFSPLUS_MAX_STRLEN) { + len1 = HFSPLUS_MAX_STRLEN; + pr_err("invalid length %u has been corrected to %d\n", + be16_to_cpu(s1->length), len1); + } + + if (len2 > HFSPLUS_MAX_STRLEN) { + len2 = HFSPLUS_MAX_STRLEN; + pr_err("invalid length %u has been corrected to %d\n", + be16_to_cpu(s2->length), len2); + } + while (1) { c1 = c2 = 0;
@@ -74,6 +86,18 @@ int hfsplus_strcmp(const struct hfsplus_ p1 = s1->unicode; p2 = s2->unicode;
+ if (len1 > HFSPLUS_MAX_STRLEN) { + len1 = HFSPLUS_MAX_STRLEN; + pr_err("invalid length %u has been corrected to %d\n", + be16_to_cpu(s1->length), len1); + } + + if (len2 > HFSPLUS_MAX_STRLEN) { + len2 = HFSPLUS_MAX_STRLEN; + pr_err("invalid length %u has been corrected to %d\n", + be16_to_cpu(s2->length), len2); + } + for (len = min(len1, len2); len > 0; len--) { c1 = be16_to_cpu(*p1); c2 = be16_to_cpu(*p2);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brian Norris briannorris@google.com
Commit 48991e493507 ("PCI/sysfs: Ensure devices are powered for config reads") was applied to various linux-stable trees. However, prior to 6.12.y, we do not have commit d2bd39c0456b ("PCI: Store all PCIe Supported Link Speeds"). Therefore, we also need to apply the change to max_link_speed_show().
This was pointed out here:
Re: Patch "PCI/sysfs: Ensure devices are powered for config reads" has been added to the 6.6-stable tree https://lore.kernel.org/all/aPEMIreBYZ7yk3cm@google.com/
Original change description follows:
The "max_link_width", "current_link_speed", "current_link_width", "secondary_bus_number", and "subordinate_bus_number" sysfs files all access config registers, but they don't check the runtime PM state. If the device is in D3cold or a parent bridge is suspended, we may see -EINVAL, bogus values, or worse, depending on implementation details.
Wrap these access in pci_config_pm_runtime_{get,put}() like most of the rest of the similar sysfs attributes.
Notably, "max_link_speed" does not access config registers; it returns a cached value since d2bd39c0456b ("PCI: Store all PCIe Supported Link Speeds").
Fixes: 56c1af4606f0 ("PCI: Add sysfs max_link_speed/width, current_link_speed/width, etc") Link: https://lore.kernel.org/all/aPEMIreBYZ7yk3cm@google.com/ Signed-off-by: Brian Norris briannorris@google.com Signed-off-by: Brian Norris briannorris@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pci-sysfs.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -186,9 +186,15 @@ static ssize_t max_link_speed_show(struc struct device_attribute *attr, char *buf) { struct pci_dev *pdev = to_pci_dev(dev); + ssize_t ret;
- return sysfs_emit(buf, "%s\n", - pci_speed_string(pcie_get_speed_cap(pdev))); + /* We read PCI_EXP_LNKCAP, so we need the device to be accessible. */ + pci_config_pm_runtime_get(pdev); + ret = sysfs_emit(buf, "%s\n", + pci_speed_string(pcie_get_speed_cap(pdev))); + pci_config_pm_runtime_put(pdev); + + return ret; } static DEVICE_ATTR_RO(max_link_speed);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xichao Zhao zhao.xichao@vivo.com
[ Upstream commit 5e088248375d171b80c643051e77ade6b97bc386 ]
In the setup_arg_pages(), ret is declared as an unsigned long. The ret might take a negative value. Therefore, its type should be changed to int.
Signed-off-by: Xichao Zhao zhao.xichao@vivo.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20250825073609.219855-1-zhao.xichao@vivo.com Signed-off-by: Kees Cook kees@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/exec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/exec.c b/fs/exec.c index b65af8f9a4f9b..a4d21a67723d7 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -750,7 +750,7 @@ int setup_arg_pages(struct linux_binprm *bprm, unsigned long stack_top, int executable_stack) { - unsigned long ret; + int ret; unsigned long stack_shift; struct mm_struct *mm = current->mm; struct vm_area_struct *vma = bprm->vma;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Simon Schuster schuster.simon@siemens-energy.com
[ Upstream commit a20b83cf45be2057f3d073506779e52c7fa17f94 ]
On nios2, with CONFIG_FLATMEM set, the kernel relies on memblock_get_current_limit() to determine the limits of mem_map, in particular for max_low_pfn. Unfortunately, memblock.current_limit is only default initialized to MEMBLOCK_ALLOC_ANYWHERE at this point of the bootup, potentially leading to situations where max_low_pfn can erroneously exceed the value of max_pfn and, thus, the valid range of available DRAM.
This can in turn cause kernel-level paging failures, e.g.:
[ 76.900000] Unable to handle kernel paging request at virtual address 20303000 [ 76.900000] ea = c0080890, ra = c000462c, cause = 14 [ 76.900000] Kernel panic - not syncing: Oops [ 76.900000] ---[ end Kernel panic - not syncing: Oops ]---
This patch fixes this by pre-calculating memblock.current_limit based on the upper limits of the available memory ranges via adjust_lowmem_bounds, a simplified version of the equivalent implementation within the arm architecture.
Signed-off-by: Simon Schuster schuster.simon@siemens-energy.com Signed-off-by: Andreas Oetken andreas.oetken@siemens-energy.com Signed-off-by: Dinh Nguyen dinguyen@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/nios2/kernel/setup.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/arch/nios2/kernel/setup.c b/arch/nios2/kernel/setup.c index 40bc8fb75e0b5..e2fc4b59d93ea 100644 --- a/arch/nios2/kernel/setup.c +++ b/arch/nios2/kernel/setup.c @@ -147,6 +147,20 @@ static void __init find_limits(unsigned long *min, unsigned long *max_low, *max_high = PFN_DOWN(memblock_end_of_DRAM()); }
+static void __init adjust_lowmem_bounds(void) +{ + phys_addr_t block_start, block_end; + u64 i; + phys_addr_t memblock_limit = 0; + + for_each_mem_range(i, &block_start, &block_end) { + if (block_end > memblock_limit) + memblock_limit = block_end; + } + + memblock_set_current_limit(memblock_limit); +} + void __init setup_arch(char **cmdline_p) { console_verbose(); @@ -160,6 +174,7 @@ void __init setup_arch(char **cmdline_p) /* Keep a copy of command line */ *cmdline_p = boot_command_line;
+ adjust_lowmem_bounds(); find_limits(&min_low_pfn, &max_low_pfn, &max_pfn); max_mapnr = max_low_pfn;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Viacheslav Dubeyko slava@dubeyko.com
[ Upstream commit 18b07c44f245beb03588b00b212b38fce9af7cc9 ]
Currently, hfs_brec_remove() executes moving records towards the location of deleted record and it updates offsets of moved records. However, the hfs_brec_remove() logic ignores the "mess" of b-tree node's free space and it doesn't touch the offsets out of records number. Potentially, it could confuse fsck or driver logic or to be a reason of potential corruption cases.
This patch reworks the logic of hfs_brec_remove() by means of clearing freed space of b-tree node after the records moving. And it clear the last offset that keeping old location of free space because now the offset before this one is keeping the actual offset to the free space after the record deletion.
Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com cc: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de cc: Yangtao Li frank.li@vivo.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250815194918.38165-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfs/brec.c | 27 +++++++++++++++++++++++---- 1 file changed, 23 insertions(+), 4 deletions(-)
diff --git a/fs/hfs/brec.c b/fs/hfs/brec.c index 896396554bcc1..b01db1fae147c 100644 --- a/fs/hfs/brec.c +++ b/fs/hfs/brec.c @@ -179,6 +179,7 @@ int hfs_brec_remove(struct hfs_find_data *fd) struct hfs_btree *tree; struct hfs_bnode *node, *parent; int end_off, rec_off, data_off, size; + int src, dst, len;
tree = fd->tree; node = fd->bnode; @@ -208,10 +209,14 @@ int hfs_brec_remove(struct hfs_find_data *fd) } hfs_bnode_write_u16(node, offsetof(struct hfs_bnode_desc, num_recs), node->num_recs);
- if (rec_off == end_off) - goto skip; size = fd->keylength + fd->entrylength;
+ if (rec_off == end_off) { + src = fd->keyoffset; + hfs_bnode_clear(node, src, size); + goto skip; + } + do { data_off = hfs_bnode_read_u16(node, rec_off); hfs_bnode_write_u16(node, rec_off + 2, data_off - size); @@ -219,9 +224,23 @@ int hfs_brec_remove(struct hfs_find_data *fd) } while (rec_off >= end_off);
/* fill hole */ - hfs_bnode_move(node, fd->keyoffset, fd->keyoffset + size, - data_off - fd->keyoffset - size); + dst = fd->keyoffset; + src = fd->keyoffset + size; + len = data_off - src; + + hfs_bnode_move(node, dst, src, len); + + src = dst + len; + len = data_off - src; + + hfs_bnode_clear(node, src, len); + skip: + /* + * Remove the obsolete offset to free space. + */ + hfs_bnode_write_u16(node, end_off, 0); + hfs_bnode_dump(node); if (!fd->record) hfs_brec_update_parent(fd);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Viacheslav Dubeyko slava@dubeyko.com
[ Upstream commit c62663a986acee7c4485c1fa9de5fc40194b6290 ]
Potenatially, __hfs_ext_read_extent() could operate by not initialized values of fd->key after hfs_brec_find() call:
static inline int __hfs_ext_read_extent(struct hfs_find_data *fd, struct hfs_extent *extent, u32 cnid, u32 block, u8 type) { int res;
hfs_ext_build_key(fd->search_key, cnid, block, type); fd->key->ext.FNum = 0; res = hfs_brec_find(fd); if (res && res != -ENOENT) return res; if (fd->key->ext.FNum != fd->search_key->ext.FNum || fd->key->ext.FkType != fd->search_key->ext.FkType) return -ENOENT; if (fd->entrylength != sizeof(hfs_extent_rec)) return -EIO; hfs_bnode_read(fd->bnode, extent, fd->entryoffset, sizeof(hfs_extent_rec)); return 0; }
This patch changes kmalloc() on kzalloc() in hfs_find_init() and intializes fd->record, fd->keyoffset, fd->keylength, fd->entryoffset, fd->entrylength for the case if hfs_brec_find() has been found nothing in the b-tree node.
Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com cc: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de cc: Yangtao Li frank.li@vivo.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250818225252.126427-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfs/bfind.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/hfs/bfind.c b/fs/hfs/bfind.c index ef9498a6e88ac..6d37b4c759034 100644 --- a/fs/hfs/bfind.c +++ b/fs/hfs/bfind.c @@ -18,7 +18,7 @@ int hfs_find_init(struct hfs_btree *tree, struct hfs_find_data *fd)
fd->tree = tree; fd->bnode = NULL; - ptr = kmalloc(tree->max_key_len * 2 + 4, GFP_KERNEL); + ptr = kzalloc(tree->max_key_len * 2 + 4, GFP_KERNEL); if (!ptr) return -ENOMEM; fd->search_key = ptr; @@ -112,6 +112,12 @@ int hfs_brec_find(struct hfs_find_data *fd) __be32 data; int height, res;
+ fd->record = -1; + fd->keyoffset = -1; + fd->keylength = -1; + fd->entryoffset = -1; + fd->entrylength = -1; + tree = fd->tree; if (fd->bnode) hfs_bnode_put(fd->bnode);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Viacheslav Dubeyko slava@dubeyko.com
[ Upstream commit 4840ceadef4290c56cc422f0fc697655f3cbf070 ]
The syzbot reported issue in __hfsplus_ext_cache_extent():
[ 70.194323][ T9350] BUG: KMSAN: uninit-value in __hfsplus_ext_cache_extent+0x7d0/0x990 [ 70.195022][ T9350] __hfsplus_ext_cache_extent+0x7d0/0x990 [ 70.195530][ T9350] hfsplus_file_extend+0x74f/0x1cf0 [ 70.195998][ T9350] hfsplus_get_block+0xe16/0x17b0 [ 70.196458][ T9350] __block_write_begin_int+0x962/0x2ce0 [ 70.196959][ T9350] cont_write_begin+0x1000/0x1950 [ 70.197416][ T9350] hfsplus_write_begin+0x85/0x130 [ 70.197873][ T9350] generic_perform_write+0x3e8/0x1060 [ 70.198374][ T9350] __generic_file_write_iter+0x215/0x460 [ 70.198892][ T9350] generic_file_write_iter+0x109/0x5e0 [ 70.199393][ T9350] vfs_write+0xb0f/0x14e0 [ 70.199771][ T9350] ksys_write+0x23e/0x490 [ 70.200149][ T9350] __x64_sys_write+0x97/0xf0 [ 70.200570][ T9350] x64_sys_call+0x3015/0x3cf0 [ 70.201065][ T9350] do_syscall_64+0xd9/0x1d0 [ 70.201506][ T9350] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 70.202054][ T9350] [ 70.202279][ T9350] Uninit was created at: [ 70.202693][ T9350] __kmalloc_noprof+0x621/0xf80 [ 70.203149][ T9350] hfsplus_find_init+0x8d/0x1d0 [ 70.203602][ T9350] hfsplus_file_extend+0x6ca/0x1cf0 [ 70.204087][ T9350] hfsplus_get_block+0xe16/0x17b0 [ 70.204561][ T9350] __block_write_begin_int+0x962/0x2ce0 [ 70.205074][ T9350] cont_write_begin+0x1000/0x1950 [ 70.205547][ T9350] hfsplus_write_begin+0x85/0x130 [ 70.206017][ T9350] generic_perform_write+0x3e8/0x1060 [ 70.206519][ T9350] __generic_file_write_iter+0x215/0x460 [ 70.207042][ T9350] generic_file_write_iter+0x109/0x5e0 [ 70.207552][ T9350] vfs_write+0xb0f/0x14e0 [ 70.207961][ T9350] ksys_write+0x23e/0x490 [ 70.208375][ T9350] __x64_sys_write+0x97/0xf0 [ 70.208810][ T9350] x64_sys_call+0x3015/0x3cf0 [ 70.209255][ T9350] do_syscall_64+0xd9/0x1d0 [ 70.209680][ T9350] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 70.210230][ T9350] [ 70.210454][ T9350] CPU: 2 UID: 0 PID: 9350 Comm: repro Not tainted 6.12.0-rc5 #5 [ 70.211174][ T9350] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 70.212115][ T9350] ===================================================== [ 70.212734][ T9350] Disabling lock debugging due to kernel taint [ 70.213284][ T9350] Kernel panic - not syncing: kmsan.panic set ... [ 70.213858][ T9350] CPU: 2 UID: 0 PID: 9350 Comm: repro Tainted: G B 6.12.0-rc5 #5 [ 70.214679][ T9350] Tainted: [B]=BAD_PAGE [ 70.215057][ T9350] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 70.215999][ T9350] Call Trace: [ 70.216309][ T9350] <TASK> [ 70.216585][ T9350] dump_stack_lvl+0x1fd/0x2b0 [ 70.217025][ T9350] dump_stack+0x1e/0x30 [ 70.217421][ T9350] panic+0x502/0xca0 [ 70.217803][ T9350] ? kmsan_get_metadata+0x13e/0x1c0
[ 70.218294][ Message fromT sy9350] kmsan_report+0x296/slogd@syzkaller 0x2aat Aug 18 22:11:058 ... kernel :[ 70.213284][ T9350] Kernel panic - not syncing: kmsan.panic [ 70.220179][ T9350] ? kmsan_get_metadata+0x13e/0x1c0 set ... [ 70.221254][ T9350] ? __msan_warning+0x96/0x120 [ 70.222066][ T9350] ? __hfsplus_ext_cache_extent+0x7d0/0x990 [ 70.223023][ T9350] ? hfsplus_file_extend+0x74f/0x1cf0 [ 70.224120][ T9350] ? hfsplus_get_block+0xe16/0x17b0 [ 70.224946][ T9350] ? __block_write_begin_int+0x962/0x2ce0 [ 70.225756][ T9350] ? cont_write_begin+0x1000/0x1950 [ 70.226337][ T9350] ? hfsplus_write_begin+0x85/0x130 [ 70.226852][ T9350] ? generic_perform_write+0x3e8/0x1060 [ 70.227405][ T9350] ? __generic_file_write_iter+0x215/0x460 [ 70.227979][ T9350] ? generic_file_write_iter+0x109/0x5e0 [ 70.228540][ T9350] ? vfs_write+0xb0f/0x14e0 [ 70.228997][ T9350] ? ksys_write+0x23e/0x490 [ 70.229458][ T9350] ? __x64_sys_write+0x97/0xf0 [ 70.229939][ T9350] ? x64_sys_call+0x3015/0x3cf0 [ 70.230432][ T9350] ? do_syscall_64+0xd9/0x1d0 [ 70.230941][ T9350] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 70.231926][ T9350] ? kmsan_get_metadata+0x13e/0x1c0 [ 70.232738][ T9350] ? kmsan_internal_set_shadow_origin+0x77/0x110 [ 70.233711][ T9350] ? kmsan_get_metadata+0x13e/0x1c0 [ 70.234516][ T9350] ? kmsan_get_shadow_origin_ptr+0x4a/0xb0 [ 70.235398][ T9350] ? __msan_metadata_ptr_for_load_4+0x24/0x40 [ 70.236323][ T9350] ? hfsplus_brec_find+0x218/0x9f0 [ 70.237090][ T9350] ? __pfx_hfs_find_rec_by_key+0x10/0x10 [ 70.237938][ T9350] ? __msan_instrument_asm_store+0xbf/0xf0 [ 70.238827][ T9350] ? __msan_metadata_ptr_for_store_4+0x27/0x40 [ 70.239772][ T9350] ? __hfsplus_ext_write_extent+0x536/0x620 [ 70.240666][ T9350] ? kmsan_get_metadata+0x13e/0x1c0 [ 70.241175][ T9350] __msan_warning+0x96/0x120 [ 70.241645][ T9350] __hfsplus_ext_cache_extent+0x7d0/0x990 [ 70.242223][ T9350] hfsplus_file_extend+0x74f/0x1cf0 [ 70.242748][ T9350] hfsplus_get_block+0xe16/0x17b0 [ 70.243255][ T9350] ? kmsan_internal_set_shadow_origin+0x77/0x110 [ 70.243878][ T9350] ? kmsan_get_metadata+0x13e/0x1c0 [ 70.244400][ T9350] ? kmsan_get_shadow_origin_ptr+0x4a/0xb0 [ 70.244967][ T9350] __block_write_begin_int+0x962/0x2ce0 [ 70.245531][ T9350] ? __pfx_hfsplus_get_block+0x10/0x10 [ 70.246079][ T9350] cont_write_begin+0x1000/0x1950 [ 70.246598][ T9350] hfsplus_write_begin+0x85/0x130 [ 70.247105][ T9350] ? __pfx_hfsplus_get_block+0x10/0x10 [ 70.247650][ T9350] ? __pfx_hfsplus_write_begin+0x10/0x10 [ 70.248211][ T9350] generic_perform_write+0x3e8/0x1060 [ 70.248752][ T9350] __generic_file_write_iter+0x215/0x460 [ 70.249314][ T9350] generic_file_write_iter+0x109/0x5e0 [ 70.249856][ T9350] ? kmsan_internal_set_shadow_origin+0x77/0x110 [ 70.250487][ T9350] vfs_write+0xb0f/0x14e0 [ 70.250930][ T9350] ? __pfx_generic_file_write_iter+0x10/0x10 [ 70.251530][ T9350] ksys_write+0x23e/0x490 [ 70.251974][ T9350] __x64_sys_write+0x97/0xf0 [ 70.252450][ T9350] x64_sys_call+0x3015/0x3cf0 [ 70.252924][ T9350] do_syscall_64+0xd9/0x1d0 [ 70.253384][ T9350] ? irqentry_exit+0x16/0x60 [ 70.253844][ T9350] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 70.254430][ T9350] RIP: 0033:0x7f7a92adffc9 [ 70.254873][ T9350] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 48 [ 70.256674][ T9350] RSP: 002b:00007fff0bca3188 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 [ 70.257485][ T9350] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7a92adffc9 [ 70.258246][ T9350] RDX: 000000000208e24b RSI: 0000000020000100 RDI: 0000000000000004 [ 70.258998][ T9350] RBP: 00007fff0bca31a0 R08: 00007fff0bca31a0 R09: 00007fff0bca31a0 [ 70.259769][ T9350] R10: 0000000000000000 R11: 0000000000000202 R12: 000055e0d75f8250 [ 70.260520][ T9350] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 70.261286][ T9350] </TASK> [ 70.262026][ T9350] Kernel Offset: disabled
(gdb) l *__hfsplus_ext_cache_extent+0x7d0 0xffffffff8318aef0 is in __hfsplus_ext_cache_extent (fs/hfsplus/extents.c:168). 163 fd->key->ext.cnid = 0; 164 res = hfs_brec_find(fd, hfs_find_rec_by_key); 165 if (res && res != -ENOENT) 166 return res; 167 if (fd->key->ext.cnid != fd->search_key->ext.cnid || 168 fd->key->ext.fork_type != fd->search_key->ext.fork_type) 169 return -ENOENT; 170 if (fd->entrylength != sizeof(hfsplus_extent_rec)) 171 return -EIO; 172 hfs_bnode_read(fd->bnode, extent, fd->entryoffset,
The __hfsplus_ext_cache_extent() calls __hfsplus_ext_read_extent():
res = __hfsplus_ext_read_extent(fd, hip->cached_extents, inode->i_ino, block, HFSPLUS_IS_RSRC(inode) ? HFSPLUS_TYPE_RSRC : HFSPLUS_TYPE_DATA);
And if inode->i_ino could be equal to zero or any non-available CNID, then hfs_brec_find() could not find the record in the tree. As a result, fd->key could be compared with fd->search_key. But hfsplus_find_init() uses kmalloc() for fd->key and fd->search_key allocation:
int hfs_find_init(struct hfs_btree *tree, struct hfs_find_data *fd) { <skipped> ptr = kmalloc(tree->max_key_len * 2 + 4, GFP_KERNEL); if (!ptr) return -ENOMEM; fd->search_key = ptr; fd->key = ptr + tree->max_key_len + 2; <skipped> }
Finally, fd->key is still not initialized if hfs_brec_find() has found nothing.
This patch changes kmalloc() on kzalloc() in hfs_find_init() and intializes fd->record, fd->keyoffset, fd->keylength, fd->entryoffset, fd->entrylength for the case if hfs_brec_find() has been found nothing in the b-tree node.
Reported-by: syzbot syzbot+55ad87f38795d6787521@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=55ad87f38795d6787521 Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com cc: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de cc: Yangtao Li frank.li@vivo.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250818225232.126402-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfsplus/bfind.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/hfsplus/bfind.c b/fs/hfsplus/bfind.c index 901e83d65d202..26ebac4c60424 100644 --- a/fs/hfsplus/bfind.c +++ b/fs/hfsplus/bfind.c @@ -18,7 +18,7 @@ int hfs_find_init(struct hfs_btree *tree, struct hfs_find_data *fd)
fd->tree = tree; fd->bnode = NULL; - ptr = kmalloc(tree->max_key_len * 2 + 4, GFP_KERNEL); + ptr = kzalloc(tree->max_key_len * 2 + 4, GFP_KERNEL); if (!ptr) return -ENOMEM; fd->search_key = ptr; @@ -158,6 +158,12 @@ int hfs_brec_find(struct hfs_find_data *fd, search_strategy_t do_key_compare) __be32 data; int height, res;
+ fd->record = -1; + fd->keyoffset = -1; + fd->keylength = -1; + fd->entryoffset = -1; + fd->entrylength = -1; + tree = fd->tree; if (fd->bnode) hfs_bnode_put(fd->bnode);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yang Chenzhi yang.chenzhi@vivo.com
[ Upstream commit 738d5a51864ed8d7a68600b8c0c63fe6fe5c4f20 ]
hfsplus_bmap_alloc can trigger a crash if a record offset or length is larger than node_size
[ 15.264282] BUG: KASAN: slab-out-of-bounds in hfsplus_bmap_alloc+0x887/0x8b0 [ 15.265192] Read of size 8 at addr ffff8881085ca188 by task test/183 [ 15.265949] [ 15.266163] CPU: 0 UID: 0 PID: 183 Comm: test Not tainted 6.17.0-rc2-gc17b750b3ad9 #14 PREEMPT(voluntary) [ 15.266165] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 15.266167] Call Trace: [ 15.266168] <TASK> [ 15.266169] dump_stack_lvl+0x53/0x70 [ 15.266173] print_report+0xd0/0x660 [ 15.266181] kasan_report+0xce/0x100 [ 15.266185] hfsplus_bmap_alloc+0x887/0x8b0 [ 15.266208] hfs_btree_inc_height.isra.0+0xd5/0x7c0 [ 15.266217] hfsplus_brec_insert+0x870/0xb00 [ 15.266222] __hfsplus_ext_write_extent+0x428/0x570 [ 15.266225] __hfsplus_ext_cache_extent+0x5e/0x910 [ 15.266227] hfsplus_ext_read_extent+0x1b2/0x200 [ 15.266233] hfsplus_file_extend+0x5a7/0x1000 [ 15.266237] hfsplus_get_block+0x12b/0x8c0 [ 15.266238] __block_write_begin_int+0x36b/0x12c0 [ 15.266251] block_write_begin+0x77/0x110 [ 15.266252] cont_write_begin+0x428/0x720 [ 15.266259] hfsplus_write_begin+0x51/0x100 [ 15.266262] cont_write_begin+0x272/0x720 [ 15.266270] hfsplus_write_begin+0x51/0x100 [ 15.266274] generic_perform_write+0x321/0x750 [ 15.266285] generic_file_write_iter+0xc3/0x310 [ 15.266289] __kernel_write_iter+0x2fd/0x800 [ 15.266296] dump_user_range+0x2ea/0x910 [ 15.266301] elf_core_dump+0x2a94/0x2ed0 [ 15.266320] vfs_coredump+0x1d85/0x45e0 [ 15.266349] get_signal+0x12e3/0x1990 [ 15.266357] arch_do_signal_or_restart+0x89/0x580 [ 15.266362] irqentry_exit_to_user_mode+0xab/0x110 [ 15.266364] asm_exc_page_fault+0x26/0x30 [ 15.266366] RIP: 0033:0x41bd35 [ 15.266367] Code: bc d1 f3 0f 7f 27 f3 0f 7f 6f 10 f3 0f 7f 77 20 f3 0f 7f 7f 30 49 83 c0 0f 49 29 d0 48 8d 7c 17 31 e9 9f 0b 00 00 66 0f ef c0 <f3> 0f 6f 0e f3 0f 6f 56 10 66 0f 74 c1 66 0f d7 d0 49 83 f8f [ 15.266369] RSP: 002b:00007ffc9e62d078 EFLAGS: 00010283 [ 15.266371] RAX: 00007ffc9e62d100 RBX: 0000000000000000 RCX: 0000000000000000 [ 15.266372] RDX: 00000000000000e0 RSI: 0000000000000000 RDI: 00007ffc9e62d100 [ 15.266373] RBP: 0000400000000040 R08: 00000000000000e0 R09: 0000000000000000 [ 15.266374] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 15.266375] R13: 0000000000000000 R14: 0000000000000000 R15: 0000400000000000 [ 15.266376] </TASK>
When calling hfsplus_bmap_alloc to allocate a free node, this function first retrieves the bitmap from header node and map node using node->page together with the offset and length from hfs_brec_lenoff
``` len = hfs_brec_lenoff(node, 2, &off16); off = off16;
off += node->page_offset; pagep = node->page + (off >> PAGE_SHIFT); data = kmap_local_page(*pagep); ```
However, if the retrieved offset or length is invalid(i.e. exceeds node_size), the code may end up accessing pages outside the allocated range for this node.
This patch adds proper validation of both offset and length before use, preventing out-of-bounds page access. Move is_bnode_offset_valid and check_and_correct_requested_length to hfsplus_fs.h, as they may be required by other functions.
Reported-by: syzbot+356aed408415a56543cd@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/67bcb4a6.050a0220.bbfd1.008f.GAE@google.com/ Signed-off-by: Yang Chenzhi yang.chenzhi@vivo.com Reviewed-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Link: https://lore.kernel.org/r/20250818141734.8559-2-yang.chenzhi@vivo.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfsplus/bnode.c | 41 ---------------------------------------- fs/hfsplus/btree.c | 6 ++++++ fs/hfsplus/hfsplus_fs.h | 42 +++++++++++++++++++++++++++++++++++++++++ 3 files changed, 48 insertions(+), 41 deletions(-)
diff --git a/fs/hfsplus/bnode.c b/fs/hfsplus/bnode.c index 14f4995588ff0..407d5152eb411 100644 --- a/fs/hfsplus/bnode.c +++ b/fs/hfsplus/bnode.c @@ -18,47 +18,6 @@ #include "hfsplus_fs.h" #include "hfsplus_raw.h"
-static inline -bool is_bnode_offset_valid(struct hfs_bnode *node, int off) -{ - bool is_valid = off < node->tree->node_size; - - if (!is_valid) { - pr_err("requested invalid offset: " - "NODE: id %u, type %#x, height %u, " - "node_size %u, offset %d\n", - node->this, node->type, node->height, - node->tree->node_size, off); - } - - return is_valid; -} - -static inline -int check_and_correct_requested_length(struct hfs_bnode *node, int off, int len) -{ - unsigned int node_size; - - if (!is_bnode_offset_valid(node, off)) - return 0; - - node_size = node->tree->node_size; - - if ((off + len) > node_size) { - int new_len = (int)node_size - off; - - pr_err("requested length has been corrected: " - "NODE: id %u, type %#x, height %u, " - "node_size %u, offset %d, " - "requested_len %d, corrected_len %d\n", - node->this, node->type, node->height, - node->tree->node_size, off, len, new_len); - - return new_len; - } - - return len; -}
/* Copy a specified range of bytes from the raw data of a node */ void hfs_bnode_read(struct hfs_bnode *node, void *buf, int off, int len) diff --git a/fs/hfsplus/btree.c b/fs/hfsplus/btree.c index 9e1732a2b92a8..fe6a54c4083c3 100644 --- a/fs/hfsplus/btree.c +++ b/fs/hfsplus/btree.c @@ -393,6 +393,12 @@ struct hfs_bnode *hfs_bmap_alloc(struct hfs_btree *tree) len = hfs_brec_lenoff(node, 2, &off16); off = off16;
+ if (!is_bnode_offset_valid(node, off)) { + hfs_bnode_put(node); + return ERR_PTR(-EIO); + } + len = check_and_correct_requested_length(node, off, len); + off += node->page_offset; pagep = node->page + (off >> PAGE_SHIFT); data = kmap_local_page(*pagep); diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h index 3227436f3a4a6..e13da1fe2c2a2 100644 --- a/fs/hfsplus/hfsplus_fs.h +++ b/fs/hfsplus/hfsplus_fs.h @@ -574,6 +574,48 @@ hfsplus_btree_lock_class(struct hfs_btree *tree) return class; }
+static inline +bool is_bnode_offset_valid(struct hfs_bnode *node, int off) +{ + bool is_valid = off < node->tree->node_size; + + if (!is_valid) { + pr_err("requested invalid offset: " + "NODE: id %u, type %#x, height %u, " + "node_size %u, offset %d\n", + node->this, node->type, node->height, + node->tree->node_size, off); + } + + return is_valid; +} + +static inline +int check_and_correct_requested_length(struct hfs_bnode *node, int off, int len) +{ + unsigned int node_size; + + if (!is_bnode_offset_valid(node, off)) + return 0; + + node_size = node->tree->node_size; + + if ((off + len) > node_size) { + int new_len = (int)node_size - off; + + pr_err("requested length has been corrected: " + "NODE: id %u, type %#x, height %u, " + "node_size %u, offset %d, " + "requested_len %d, corrected_len %d\n", + node->this, node->type, node->height, + node->tree->node_size, off, len, new_len); + + return new_len; + } + + return len; +} + /* compatibility */ #define hfsp_mt2ut(t) (struct timespec64){ .tv_sec = __hfsp_mt2ut(t) } #define hfsp_ut2mt(t) __hfsp_ut2mt((t).tv_sec)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Viacheslav Dubeyko slava@dubeyko.com
[ Upstream commit 9b3d15a758910bb98ba8feb4109d99cc67450ee4 ]
The syzbot reported issue in hfsplus_delete_cat():
[ 70.682285][ T9333] ===================================================== [ 70.682943][ T9333] BUG: KMSAN: uninit-value in hfsplus_subfolders_dec+0x1d7/0x220 [ 70.683640][ T9333] hfsplus_subfolders_dec+0x1d7/0x220 [ 70.684141][ T9333] hfsplus_delete_cat+0x105d/0x12b0 [ 70.684621][ T9333] hfsplus_rmdir+0x13d/0x310 [ 70.685048][ T9333] vfs_rmdir+0x5ba/0x810 [ 70.685447][ T9333] do_rmdir+0x964/0xea0 [ 70.685833][ T9333] __x64_sys_rmdir+0x71/0xb0 [ 70.686260][ T9333] x64_sys_call+0xcd8/0x3cf0 [ 70.686695][ T9333] do_syscall_64+0xd9/0x1d0 [ 70.687119][ T9333] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 70.687646][ T9333] [ 70.687856][ T9333] Uninit was stored to memory at: [ 70.688311][ T9333] hfsplus_subfolders_inc+0x1c2/0x1d0 [ 70.688779][ T9333] hfsplus_create_cat+0x148e/0x1800 [ 70.689231][ T9333] hfsplus_mknod+0x27f/0x600 [ 70.689730][ T9333] hfsplus_mkdir+0x5a/0x70 [ 70.690146][ T9333] vfs_mkdir+0x483/0x7a0 [ 70.690545][ T9333] do_mkdirat+0x3f2/0xd30 [ 70.690944][ T9333] __x64_sys_mkdir+0x9a/0xf0 [ 70.691380][ T9333] x64_sys_call+0x2f89/0x3cf0 [ 70.691816][ T9333] do_syscall_64+0xd9/0x1d0 [ 70.692229][ T9333] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 70.692773][ T9333] [ 70.692990][ T9333] Uninit was stored to memory at: [ 70.693469][ T9333] hfsplus_subfolders_inc+0x1c2/0x1d0 [ 70.693960][ T9333] hfsplus_create_cat+0x148e/0x1800 [ 70.694438][ T9333] hfsplus_fill_super+0x21c1/0x2700 [ 70.694911][ T9333] mount_bdev+0x37b/0x530 [ 70.695320][ T9333] hfsplus_mount+0x4d/0x60 [ 70.695729][ T9333] legacy_get_tree+0x113/0x2c0 [ 70.696167][ T9333] vfs_get_tree+0xb3/0x5c0 [ 70.696588][ T9333] do_new_mount+0x73e/0x1630 [ 70.697013][ T9333] path_mount+0x6e3/0x1eb0 [ 70.697425][ T9333] __se_sys_mount+0x733/0x830 [ 70.697857][ T9333] __x64_sys_mount+0xe4/0x150 [ 70.698269][ T9333] x64_sys_call+0x2691/0x3cf0 [ 70.698704][ T9333] do_syscall_64+0xd9/0x1d0 [ 70.699117][ T9333] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 70.699730][ T9333] [ 70.699946][ T9333] Uninit was created at: [ 70.700378][ T9333] __alloc_pages_noprof+0x714/0xe60 [ 70.700843][ T9333] alloc_pages_mpol_noprof+0x2a2/0x9b0 [ 70.701331][ T9333] alloc_pages_noprof+0xf8/0x1f0 [ 70.701774][ T9333] allocate_slab+0x30e/0x1390 [ 70.702194][ T9333] ___slab_alloc+0x1049/0x33a0 [ 70.702635][ T9333] kmem_cache_alloc_lru_noprof+0x5ce/0xb20 [ 70.703153][ T9333] hfsplus_alloc_inode+0x5a/0xd0 [ 70.703598][ T9333] alloc_inode+0x82/0x490 [ 70.703984][ T9333] iget_locked+0x22e/0x1320 [ 70.704428][ T9333] hfsplus_iget+0x5c/0xba0 [ 70.704827][ T9333] hfsplus_btree_open+0x135/0x1dd0 [ 70.705291][ T9333] hfsplus_fill_super+0x1132/0x2700 [ 70.705776][ T9333] mount_bdev+0x37b/0x530 [ 70.706171][ T9333] hfsplus_mount+0x4d/0x60 [ 70.706579][ T9333] legacy_get_tree+0x113/0x2c0 [ 70.707019][ T9333] vfs_get_tree+0xb3/0x5c0 [ 70.707444][ T9333] do_new_mount+0x73e/0x1630 [ 70.707865][ T9333] path_mount+0x6e3/0x1eb0 [ 70.708270][ T9333] __se_sys_mount+0x733/0x830 [ 70.708711][ T9333] __x64_sys_mount+0xe4/0x150 [ 70.709158][ T9333] x64_sys_call+0x2691/0x3cf0 [ 70.709630][ T9333] do_syscall_64+0xd9/0x1d0 [ 70.710053][ T9333] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 70.710611][ T9333] [ 70.710842][ T9333] CPU: 3 UID: 0 PID: 9333 Comm: repro Not tainted 6.12.0-rc6-dirty #17 [ 70.711568][ T9333] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 70.712490][ T9333] ===================================================== [ 70.713085][ T9333] Disabling lock debugging due to kernel taint [ 70.713618][ T9333] Kernel panic - not syncing: kmsan.panic set ... [ 70.714159][ T9333] CPU: 3 UID: 0 PID: 9333 Comm: repro Tainted: G B 6.12.0-rc6-dirty #17 [ 70.715007][ T9333] Tainted: [B]=BAD_PAGE [ 70.715365][ T9333] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 70.716311][ T9333] Call Trace: [ 70.716621][ T9333] <TASK> [ 70.716899][ T9333] dump_stack_lvl+0x1fd/0x2b0 [ 70.717350][ T9333] dump_stack+0x1e/0x30 [ 70.717743][ T9333] panic+0x502/0xca0 [ 70.718116][ T9333] ? kmsan_get_metadata+0x13e/0x1c0 [ 70.718611][ T9333] kmsan_report+0x296/0x2a0 [ 70.719038][ T9333] ? __msan_metadata_ptr_for_load_4+0x24/0x40 [ 70.719859][ T9333] ? __msan_warning+0x96/0x120 [ 70.720345][ T9333] ? hfsplus_subfolders_dec+0x1d7/0x220 [ 70.720881][ T9333] ? hfsplus_delete_cat+0x105d/0x12b0 [ 70.721412][ T9333] ? hfsplus_rmdir+0x13d/0x310 [ 70.721880][ T9333] ? vfs_rmdir+0x5ba/0x810 [ 70.722458][ T9333] ? do_rmdir+0x964/0xea0 [ 70.722883][ T9333] ? __x64_sys_rmdir+0x71/0xb0 [ 70.723397][ T9333] ? x64_sys_call+0xcd8/0x3cf0 [ 70.723915][ T9333] ? do_syscall_64+0xd9/0x1d0 [ 70.724454][ T9333] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 70.725110][ T9333] ? vprintk_emit+0xd1f/0xe60 [ 70.725616][ T9333] ? vprintk_default+0x3f/0x50 [ 70.726175][ T9333] ? vprintk+0xce/0xd0 [ 70.726628][ T9333] ? _printk+0x17e/0x1b0 [ 70.727129][ T9333] ? __msan_metadata_ptr_for_load_4+0x24/0x40 [ 70.727739][ T9333] ? kmsan_get_metadata+0x13e/0x1c0 [ 70.728324][ T9333] __msan_warning+0x96/0x120 [ 70.728854][ T9333] hfsplus_subfolders_dec+0x1d7/0x220 [ 70.729479][ T9333] hfsplus_delete_cat+0x105d/0x12b0 [ 70.729984][ T9333] ? kmsan_get_shadow_origin_ptr+0x4a/0xb0 [ 70.730646][ T9333] ? __msan_metadata_ptr_for_load_4+0x24/0x40 [ 70.731296][ T9333] ? kmsan_get_metadata+0x13e/0x1c0 [ 70.731863][ T9333] hfsplus_rmdir+0x13d/0x310 [ 70.732390][ T9333] ? __pfx_hfsplus_rmdir+0x10/0x10 [ 70.732919][ T9333] vfs_rmdir+0x5ba/0x810 [ 70.733416][ T9333] ? kmsan_get_shadow_origin_ptr+0x4a/0xb0 [ 70.734044][ T9333] do_rmdir+0x964/0xea0 [ 70.734537][ T9333] __x64_sys_rmdir+0x71/0xb0 [ 70.735032][ T9333] x64_sys_call+0xcd8/0x3cf0 [ 70.735579][ T9333] do_syscall_64+0xd9/0x1d0 [ 70.736092][ T9333] ? irqentry_exit+0x16/0x60 [ 70.736637][ T9333] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 70.737269][ T9333] RIP: 0033:0x7fa9424eafc9 [ 70.737775][ T9333] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 48 [ 70.739844][ T9333] RSP: 002b:00007fff099cd8d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000054 [ 70.740760][ T9333] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa9424eafc9 [ 70.741642][ T9333] RDX: 006c6f72746e6f63 RSI: 000000000000000a RDI: 0000000020000100 [ 70.742543][ T9333] RBP: 00007fff099cd8e0 R08: 00007fff099cd910 R09: 00007fff099cd910 [ 70.743376][ T9333] R10: 0000000000000000 R11: 0000000000000202 R12: 0000565430642260 [ 70.744247][ T9333] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 70.745082][ T9333] </TASK>
The main reason of the issue that struct hfsplus_inode_info has not been properly initialized for the case of root folder. In the case of root folder, hfsplus_fill_super() calls the hfsplus_iget() that implements only partial initialization of struct hfsplus_inode_info and subfolders field is not initialized by hfsplus_iget() logic.
This patch implements complete initialization of struct hfsplus_inode_info in the hfsplus_iget() logic with the goal to prevent likewise issues for the case of root folder.
Reported-by: syzbot syzbot+fdedff847a0e5e84c39f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=fdedff847a0e5e84c39f Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com cc: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de cc: Yangtao Li frank.li@vivo.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250825225103.326401-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfsplus/super.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c index 1986b4f18a901..8c086f16dd589 100644 --- a/fs/hfsplus/super.c +++ b/fs/hfsplus/super.c @@ -67,13 +67,26 @@ struct inode *hfsplus_iget(struct super_block *sb, unsigned long ino) if (!(inode->i_state & I_NEW)) return inode;
- INIT_LIST_HEAD(&HFSPLUS_I(inode)->open_dir_list); - spin_lock_init(&HFSPLUS_I(inode)->open_dir_lock); - mutex_init(&HFSPLUS_I(inode)->extents_lock); - HFSPLUS_I(inode)->flags = 0; + atomic_set(&HFSPLUS_I(inode)->opencnt, 0); + HFSPLUS_I(inode)->first_blocks = 0; + HFSPLUS_I(inode)->clump_blocks = 0; + HFSPLUS_I(inode)->alloc_blocks = 0; + HFSPLUS_I(inode)->cached_start = U32_MAX; + HFSPLUS_I(inode)->cached_blocks = 0; + memset(HFSPLUS_I(inode)->first_extents, 0, sizeof(hfsplus_extent_rec)); + memset(HFSPLUS_I(inode)->cached_extents, 0, sizeof(hfsplus_extent_rec)); HFSPLUS_I(inode)->extent_state = 0; + mutex_init(&HFSPLUS_I(inode)->extents_lock); HFSPLUS_I(inode)->rsrc_inode = NULL; - atomic_set(&HFSPLUS_I(inode)->opencnt, 0); + HFSPLUS_I(inode)->create_date = 0; + HFSPLUS_I(inode)->linkid = 0; + HFSPLUS_I(inode)->flags = 0; + HFSPLUS_I(inode)->fs_blocks = 0; + HFSPLUS_I(inode)->userflags = 0; + HFSPLUS_I(inode)->subfolders = 0; + INIT_LIST_HEAD(&HFSPLUS_I(inode)->open_dir_list); + spin_lock_init(&HFSPLUS_I(inode)->open_dir_lock); + HFSPLUS_I(inode)->phys_size = 0;
if (inode->i_ino >= HFSPLUS_FIRSTUSER_CNID || inode->i_ino == HFSPLUS_ROOT_CNID) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Aring aahringo@redhat.com
[ Upstream commit 6af515c9f3ccec3eb8a262ca86bef2c499d07951 ]
Force values over 3 are undefined, so don't treat them as 3.
Signed-off-by: Alexander Aring aahringo@redhat.com Signed-off-by: David Teigland teigland@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/dlm/lockspace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/dlm/lockspace.c b/fs/dlm/lockspace.c index 23cf9b8f31b74..e7372d56c13f4 100644 --- a/fs/dlm/lockspace.c +++ b/fs/dlm/lockspace.c @@ -825,7 +825,7 @@ static int release_lockspace(struct dlm_ls *ls, int force)
dlm_device_deregister(ls);
- if (force < 3 && dlm_user_daemon_available()) + if (force != 3 && dlm_user_daemon_available()) do_uevent(ls, 0);
dlm_recoverd_stop(ls);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Viacheslav Dubeyko slava@dubeyko.com
[ Upstream commit 2048ec5b98dbdfe0b929d2e42dc7a54c389c53dd ]
The syzbot reported issue in hfs_find_set_zero_bits():
===================================================== BUG: KMSAN: uninit-value in hfs_find_set_zero_bits+0x74d/0xb60 fs/hfs/bitmap.c:45 hfs_find_set_zero_bits+0x74d/0xb60 fs/hfs/bitmap.c:45 hfs_vbm_search_free+0x13c/0x5b0 fs/hfs/bitmap.c:151 hfs_extend_file+0x6a5/0x1b00 fs/hfs/extent.c:408 hfs_get_block+0x435/0x1150 fs/hfs/extent.c:353 __block_write_begin_int+0xa76/0x3030 fs/buffer.c:2151 block_write_begin fs/buffer.c:2262 [inline] cont_write_begin+0x10e1/0x1bc0 fs/buffer.c:2601 hfs_write_begin+0x85/0x130 fs/hfs/inode.c:52 cont_expand_zero fs/buffer.c:2528 [inline] cont_write_begin+0x35a/0x1bc0 fs/buffer.c:2591 hfs_write_begin+0x85/0x130 fs/hfs/inode.c:52 hfs_file_truncate+0x1d6/0xe60 fs/hfs/extent.c:494 hfs_inode_setattr+0x964/0xaa0 fs/hfs/inode.c:654 notify_change+0x1993/0x1aa0 fs/attr.c:552 do_truncate+0x28f/0x310 fs/open.c:68 do_ftruncate+0x698/0x730 fs/open.c:195 do_sys_ftruncate fs/open.c:210 [inline] __do_sys_ftruncate fs/open.c:215 [inline] __se_sys_ftruncate fs/open.c:213 [inline] __x64_sys_ftruncate+0x11b/0x250 fs/open.c:213 x64_sys_call+0xfe3/0x3db0 arch/x86/include/generated/asm/syscalls_64.h:78 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xd9/0x210 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Uninit was created at: slab_post_alloc_hook mm/slub.c:4154 [inline] slab_alloc_node mm/slub.c:4197 [inline] __kmalloc_cache_noprof+0x7f7/0xed0 mm/slub.c:4354 kmalloc_noprof include/linux/slab.h:905 [inline] hfs_mdb_get+0x1cc8/0x2a90 fs/hfs/mdb.c:175 hfs_fill_super+0x3d0/0xb80 fs/hfs/super.c:337 get_tree_bdev_flags+0x6e3/0x920 fs/super.c:1681 get_tree_bdev+0x38/0x50 fs/super.c:1704 hfs_get_tree+0x35/0x40 fs/hfs/super.c:388 vfs_get_tree+0xb0/0x5c0 fs/super.c:1804 do_new_mount+0x738/0x1610 fs/namespace.c:3902 path_mount+0x6db/0x1e90 fs/namespace.c:4226 do_mount fs/namespace.c:4239 [inline] __do_sys_mount fs/namespace.c:4450 [inline] __se_sys_mount+0x6eb/0x7d0 fs/namespace.c:4427 __x64_sys_mount+0xe4/0x150 fs/namespace.c:4427 x64_sys_call+0xfa7/0x3db0 arch/x86/include/generated/asm/syscalls_64.h:166 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xd9/0x210 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f
CPU: 1 UID: 0 PID: 12609 Comm: syz.1.2692 Not tainted 6.16.0-syzkaller #0 PREEMPT(none) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025 =====================================================
The HFS_SB(sb)->bitmap buffer is allocated in hfs_mdb_get():
HFS_SB(sb)->bitmap = kmalloc(8192, GFP_KERNEL);
Finally, it can trigger the reported issue because kmalloc() doesn't clear the allocated memory. If allocated memory contains only zeros, then everything will work pretty fine. But if the allocated memory contains the "garbage", then it can affect the bitmap operations and it triggers the reported issue.
This patch simply exchanges the kmalloc() on kzalloc() with the goal to guarantee the correctness of bitmap operations. Because, newly created allocation bitmap should have all available blocks free. Potentially, initialization bitmap's read operation could not fill the whole allocated memory and "garbage" in the not initialized memory will be the reason of volume coruptions and file system driver bugs.
Reported-by: syzbot syzbot+773fa9d79b29bd8b6831@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=773fa9d79b29bd8b6831 Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com cc: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de cc: Yangtao Li frank.li@vivo.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20250820230636.179085-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfs/mdb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/hfs/mdb.c b/fs/hfs/mdb.c index 8082eb01127cd..bf811347bb07d 100644 --- a/fs/hfs/mdb.c +++ b/fs/hfs/mdb.c @@ -172,7 +172,7 @@ int hfs_mdb_get(struct super_block *sb) pr_warn("continuing without an alternate MDB\n"); }
- HFS_SB(sb)->bitmap = kmalloc(8192, GFP_KERNEL); + HFS_SB(sb)->bitmap = kzalloc(8192, GFP_KERNEL); if (!HFS_SB(sb)->bitmap) goto out;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yangtao Li frank.li@vivo.com
[ Upstream commit 9282bc905f0949fab8cf86c0f620ca988761254c ]
If Catalog File contains corrupted record for the case of hidden directory's type, regard it as I/O error instead of Invalid argument.
Signed-off-by: Yangtao Li frank.li@vivo.com Reviewed-by: Viacheslav Dubeyko slava@dubeyko.com Link: https://lore.kernel.org/r/20250805165905.3390154-1-frank.li@vivo.com Signed-off-by: Viacheslav Dubeyko slava@dubeyko.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/hfsplus/super.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c index 8c086f16dd589..7e889820a63d0 100644 --- a/fs/hfsplus/super.c +++ b/fs/hfsplus/super.c @@ -538,7 +538,7 @@ static int hfsplus_fill_super(struct super_block *sb, void *data, int silent) if (!hfs_brec_read(&fd, &entry, sizeof(entry))) { hfs_find_exit(&fd); if (entry.type != cpu_to_be16(HFSPLUS_FOLDER)) { - err = -EINVAL; + err = -EIO; goto out_put_root; } inode = hfsplus_iget(sb, be32_to_cpu(entry.folder.id));
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Junjie Cao junjie.cao@intel.com
[ Upstream commit 01c7344e21c2140e72282d9d16d79a61f840fc20 ]
Add missing NULL pointer checks after kmalloc() calls in lkdtm_FORTIFY_STR_MEMBER() and lkdtm_FORTIFY_MEM_MEMBER() functions.
Signed-off-by: Junjie Cao junjie.cao@intel.com Link: https://lore.kernel.org/r/20250814060605.5264-1-junjie.cao@intel.com Signed-off-by: Kees Cook kees@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/lkdtm/fortify.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/misc/lkdtm/fortify.c b/drivers/misc/lkdtm/fortify.c index 0159276656780..00ed2147113e6 100644 --- a/drivers/misc/lkdtm/fortify.c +++ b/drivers/misc/lkdtm/fortify.c @@ -44,6 +44,9 @@ static void lkdtm_FORTIFY_STR_MEMBER(void) char *src;
src = kmalloc(size, GFP_KERNEL); + if (!src) + return; + strscpy(src, "over ten bytes", size); size = strlen(src) + 1;
@@ -109,6 +112,9 @@ static void lkdtm_FORTIFY_MEM_MEMBER(void) char *src;
src = kmalloc(size, GFP_KERNEL); + if (!src) + return; + strscpy(src, "over ten bytes", size); size = strlen(src) + 1;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Geert Uytterhoeven geert@linux-m68k.org
[ Upstream commit 6d5674090543b89aac0c177d67e5fb32ddc53804 ]
The function signatures of the m68k-optimized implementations of the find_{first,next}_{,zero_}bit() helpers do not match the generic variants.
Fix this by changing all non-pointer inputs and outputs to "unsigned long", and updating a few local variables.
Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202509092305.ncd9mzaZ-lkp@intel.com/ Signed-off-by: Geert Uytterhoeven geert@linux-m68k.org Acked-by: "Yury Norov (NVIDIA)" yury.norov@gmail.com Link: https://patch.msgid.link/de6919554fbb4cd1427155c6bafbac8a9df822c8.1757517135... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/m68k/include/asm/bitops.h | 25 ++++++++++++++----------- 1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/arch/m68k/include/asm/bitops.h b/arch/m68k/include/asm/bitops.h index e984af71df6be..d86aa744cb8fc 100644 --- a/arch/m68k/include/asm/bitops.h +++ b/arch/m68k/include/asm/bitops.h @@ -329,12 +329,12 @@ arch___test_and_change_bit(unsigned long nr, volatile unsigned long *addr) #include <asm-generic/bitops/ffz.h> #else
-static inline int find_first_zero_bit(const unsigned long *vaddr, - unsigned size) +static inline unsigned long find_first_zero_bit(const unsigned long *vaddr, + unsigned long size) { const unsigned long *p = vaddr; - int res = 32; - unsigned int words; + unsigned long res = 32; + unsigned long words; unsigned long num;
if (!size) @@ -355,8 +355,9 @@ static inline int find_first_zero_bit(const unsigned long *vaddr, } #define find_first_zero_bit find_first_zero_bit
-static inline int find_next_zero_bit(const unsigned long *vaddr, int size, - int offset) +static inline unsigned long find_next_zero_bit(const unsigned long *vaddr, + unsigned long size, + unsigned long offset) { const unsigned long *p = vaddr + (offset >> 5); int bit = offset & 31UL, res; @@ -385,11 +386,12 @@ static inline int find_next_zero_bit(const unsigned long *vaddr, int size, } #define find_next_zero_bit find_next_zero_bit
-static inline int find_first_bit(const unsigned long *vaddr, unsigned size) +static inline unsigned long find_first_bit(const unsigned long *vaddr, + unsigned long size) { const unsigned long *p = vaddr; - int res = 32; - unsigned int words; + unsigned long res = 32; + unsigned long words; unsigned long num;
if (!size) @@ -410,8 +412,9 @@ static inline int find_first_bit(const unsigned long *vaddr, unsigned size) } #define find_first_bit find_first_bit
-static inline int find_next_bit(const unsigned long *vaddr, int size, - int offset) +static inline unsigned long find_next_bit(const unsigned long *vaddr, + unsigned long size, + unsigned long offset) { const unsigned long *p = vaddr + (offset >> 5); int bit = offset & 31UL, res;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Leroy christophe.leroy@csgroup.eu
[ Upstream commit 9316512b717f6f25c4649b3fdb0a905b6a318e9f ]
PAGE_KERNEL_TEXT is an old macro that is used to tell kernel whether kernel text has to be mapped read-only or read-write based on build time options.
But nowadays, with functionnalities like jump_labels, static links, etc ... more only less all kernels need to be read-write at some point, and some combinations of configs failed to work due to innacurate setting of PAGE_KERNEL_TEXT. On the other hand, today we have CONFIG_STRICT_KERNEL_RWX which implements a more controlled access to kernel modifications.
Instead of trying to keep PAGE_KERNEL_TEXT accurate with all possible options that may imply kernel text modification, always set kernel text read-write at startup and rely on CONFIG_STRICT_KERNEL_RWX to provide accurate protection.
Do this by passing PAGE_KERNEL_X to map_kernel_page() in __maping_ram_chunk() instead of passing PAGE_KERNEL_TEXT. Once this is done, the only remaining user of PAGE_KERNEL_TEXT is mmu_mark_initmem_nx() which uses it in a call to setibat(). As setibat() ignores the RW/RO, we can seamlessly replace PAGE_KERNEL_TEXT by PAGE_KERNEL_X here as well and get rid of PAGE_KERNEL_TEXT completely.
Reported-by: Erhard Furtner erhard_f@mailbox.org Closes: https://lore.kernel.org/all/342b4120-911c-4723-82ec-d8c9b03a8aef@mailbox.org... Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Tested-by: Andrew Donnellan ajd@linux.ibm.com Signed-off-by: Madhavan Srinivasan maddy@linux.ibm.com Link: https://patch.msgid.link/8e2d793abf87ae3efb8f6dce10f974ac0eda61b8.1757412205... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/pgtable.h | 12 ------------ arch/powerpc/mm/book3s32/mmu.c | 4 ++-- arch/powerpc/mm/pgtable_32.c | 2 +- 3 files changed, 3 insertions(+), 15 deletions(-)
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index 9972626ddaf68..eda12ceacb55a 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -20,18 +20,6 @@ struct mm_struct; #include <asm/nohash/pgtable.h> #endif /* !CONFIG_PPC_BOOK3S */
-/* - * Protection used for kernel text. We want the debuggers to be able to - * set breakpoints anywhere, so don't write protect the kernel text - * on platforms where such control is possible. - */ -#if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) || \ - defined(CONFIG_KPROBES) || defined(CONFIG_DYNAMIC_FTRACE) -#define PAGE_KERNEL_TEXT PAGE_KERNEL_X -#else -#define PAGE_KERNEL_TEXT PAGE_KERNEL_ROX -#endif - /* Make modules code happy. We don't set RO yet */ #define PAGE_KERNEL_EXEC PAGE_KERNEL_X
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c index 850783cfa9c73..1b1848761a000 100644 --- a/arch/powerpc/mm/book3s32/mmu.c +++ b/arch/powerpc/mm/book3s32/mmu.c @@ -204,7 +204,7 @@ void mmu_mark_initmem_nx(void)
for (i = 0; i < nb - 1 && base < top;) { size = bat_block_size(base, top); - setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_TEXT); + setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_X); base += size; } if (base < top) { @@ -215,7 +215,7 @@ void mmu_mark_initmem_nx(void) pr_warn("Some RW data is getting mapped X. " "Adjust CONFIG_DATA_SHIFT to avoid that.\n"); } - setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_TEXT); + setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_X); base += size; } for (; i < nb; i++) diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index 5c02fd08d61ef..69fac96c2dcd1 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -109,7 +109,7 @@ static void __init __mapin_ram_chunk(unsigned long offset, unsigned long top) p = memstart_addr + s; for (; s < top; s += PAGE_SIZE) { ktext = core_kernel_text(v); - map_kernel_page(v, p, ktext ? PAGE_KERNEL_TEXT : PAGE_KERNEL); + map_kernel_page(v, p, ktext ? PAGE_KERNEL_X : PAGE_KERNEL); v += PAGE_SIZE; p += PAGE_SIZE; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Metzmacher metze@samba.org
[ Upstream commit 1b53426334c3c942db47e0959a2527a4f815af50 ]
If we want to invalidate a remote key we should do that as soon as possible, so do it in the first send work request.
Acked-by: Namjae Jeon linkinjeon@kernel.org Cc: Steve French smfrench@gmail.com Cc: Tom Talpey tom@talpey.com Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher metze@samba.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/server/transport_rdma.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/fs/smb/server/transport_rdma.c b/fs/smb/server/transport_rdma.c index af1c41f922bb3..81da8a5c1e0db 100644 --- a/fs/smb/server/transport_rdma.c +++ b/fs/smb/server/transport_rdma.c @@ -933,12 +933,15 @@ static int smb_direct_flush_send_list(struct smb_direct_transport *t, struct smb_direct_sendmsg, list);
+ if (send_ctx->need_invalidate_rkey) { + first->wr.opcode = IB_WR_SEND_WITH_INV; + first->wr.ex.invalidate_rkey = send_ctx->remote_key; + send_ctx->need_invalidate_rkey = false; + send_ctx->remote_key = 0; + } + last->wr.send_flags = IB_SEND_SIGNALED; last->wr.wr_cqe = &last->cqe; - if (is_last && send_ctx->need_invalidate_rkey) { - last->wr.opcode = IB_WR_SEND_WITH_INV; - last->wr.ex.invalidate_rkey = send_ctx->remote_key; - }
ret = smb_direct_post_send(t, &first->wr); if (!ret) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Chancellor nathan@kernel.org
[ Upstream commit aaf043a5688114703ae2c1482b92e7e0754d684e ]
When building with Clang 20 or newer, there are some objtool warnings from unexpected fallthroughs to other functions:
vmlinux.o: warning: objtool: mlx5e_mpwrq_mtts_per_wqe() falls through to next function mlx5e_mpwrq_max_num_entries() vmlinux.o: warning: objtool: mlx5e_mpwrq_max_log_rq_size() falls through to next function mlx5e_get_linear_rq_headroom()
LLVM 20 contains an (admittedly problematic [1]) optimization [2] to convert divide by zero into the equivalent of __builtin_unreachable(), which invokes undefined behavior and destroys code generation when it is encountered in a control flow graph.
mlx5e_mpwrq_umr_entry_size() returns 0 in the default case of an unrecognized mlx5e_mpwrq_umr_mode value. mlx5e_mpwrq_mtts_per_wqe(), which is inlined into mlx5e_mpwrq_max_log_rq_size(), uses the result of mlx5e_mpwrq_umr_entry_size() in a divide operation without checking for zero, so LLVM is able to infer there will be a divide by zero in this case and invokes undefined behavior. While there is some proposed work to isolate this undefined behavior and avoid the destructive code generation that results in these objtool warnings, code should still be defensive against divide by zero.
As the WARN_ONCE() implies that an invalid value should be handled gracefully, return 1 instead of 0 in the default case so that the results of this division operation is always valid.
Fixes: 168723c1f8d6 ("net/mlx5e: xsk: Use umr_mode to calculate striding RQ parameters") Link: https://lore.kernel.org/CAGG=3QUk8-Ak7YKnRziO4=0z=1C_7+4jF+6ZeDQ9yF+kuTOHOQ@... [1] Link: https://github.com/llvm/llvm-project/commit/37932643abab699e8bb1def08b7eb4ea... [2] Closes: https://github.com/ClangBuiltLinux/linux/issues/2131 Closes: https://github.com/ClangBuiltLinux/linux/issues/2132 Signed-off-by: Nathan Chancellor nathan@kernel.org Reviewed-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/20251014-mlx5e-avoid-zero-div-from-mlx5e_mpwrq_umr_... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c index 33cc53f221e0b..542cc017e64cd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c @@ -94,7 +94,7 @@ u8 mlx5e_mpwrq_umr_entry_size(enum mlx5e_mpwrq_umr_mode mode) return sizeof(struct mlx5_ksm) * 4; } WARN_ONCE(1, "MPWRQ UMR mode %d is not known\n", mode); - return 0; + return 1; }
u8 mlx5e_mpwrq_log_wqe_sz(struct mlx5_core_dev *mdev, u8 page_shift,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Wiesböck johannes.wiesboeck@aisec.fraunhofer.de
[ Upstream commit bf29555f5bdc017bac22ca66fcb6c9f46ec8788f ]
Creating FDB entries is possible from a non-initial user namespace when having CAP_NET_ADMIN, yet, when deleting FDB entries, processes receive an EPERM because the capability is always checked against the initial user namespace. This restricts the FDB management from unprivileged containers.
Drop the netlink_capable check in rtnl_fdb_del as it was originally dropped in c5c351088ae7 and reintroduced in 1690be63a27b without intention.
This patch was tested using a container on GyroidOS, where it was possible to delete FDB entries from an unprivileged user namespace and private network namespace.
Fixes: 1690be63a27b ("bridge: Add vlan support to static neighbors") Reviewed-by: Michael Weiß michael.weiss@aisec.fraunhofer.de Tested-by: Harshal Gohel hg@simonwunderlich.de Signed-off-by: Johannes Wiesböck johannes.wiesboeck@aisec.fraunhofer.de Reviewed-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Nikolay Aleksandrov razor@blackwall.org Link: https://patch.msgid.link/20251015201548.319871-1-johannes.wiesboeck@aisec.fr... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/rtnetlink.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 6bc2f78a5ebbf..6fd6c717d1e39 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -4274,9 +4274,6 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh, int err; u16 vid;
- if (!netlink_capable(skb, CAP_NET_ADMIN)) - return -EPERM; - if (!del_bulk) { err = nlmsg_parse_deprecated(nlh, sizeof(*ndm), tb, NDA_MAX, NULL, extack);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wei Fang wei.fang@nxp.com
[ Upstream commit e59bc32df2e989f034623a580e30a2a72af33b3f ]
The ENETC RX ring uses the page halves flipping mechanism, each page is split into two halves for the RX ring to use. And ENETC_RXB_TRUESIZE is defined to 2048 to indicate the size of half a page. However, the page size is configurable, for ARM64 platform, PAGE_SIZE is default to 4K, but it could be configured to 16K or 64K.
When PAGE_SIZE is set to 16K or 64K, ENETC_RXB_TRUESIZE is not correct, and the RX ring will always use the first half of the page. This is not consistent with the description in the relevant kernel doc and commit messages.
This issue is invisible in most cases, but if users want to increase PAGE_SIZE to receive a Jumbo frame with a single buffer for some use cases, it will not work as expected, because the buffer size of each RX BD is fixed to 2048 bytes.
Based on the above two points, we expect to correct ENETC_RXB_TRUESIZE to (PAGE_SIZE >> 1), as described in the comment.
Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Wei Fang wei.fang@nxp.com Reviewed-by: Claudiu Manoil claudiu.manoil@nxp.com Link: https://patch.msgid.link/20251016080131.3127122-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/enetc/enetc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h index c6d8cc15c2701..aacdfe98b65ab 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.h +++ b/drivers/net/ethernet/freescale/enetc/enetc.h @@ -40,7 +40,7 @@ struct enetc_tx_swbd { };
#define ENETC_RX_MAXFRM_SIZE ENETC_MAC_MAXFRM_SIZE -#define ENETC_RXB_TRUESIZE 2048 /* PAGE_SIZE >> 1 */ +#define ENETC_RXB_TRUESIZE (PAGE_SIZE >> 1) #define ENETC_RXB_PAD NET_SKB_PAD /* add extra space if needed */ #define ENETC_RXB_DMA_SIZE \ (SKB_WITH_OVERHEAD(ENETC_RXB_TRUESIZE) - ENETC_RXB_PAD)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ioana Ciornei ioana.ciornei@nxp.com
[ Upstream commit 902e81e679d86846a2404630d349709ad9372d0d ]
The blamed commit increased the needed headroom to account for alignment. This means that the size required to always align a Tx buffer was added inside the dpaa2_eth_needed_headroom() function. By doing that, a manual adjustment of the pointer passed to PTR_ALIGN() was no longer correct since the 'buffer_start' variable was already pointing to the start of the skb's memory.
The behavior of the dpaa2-eth driver without this patch was to drop frames on Tx even when the headroom was matching the 128 bytes necessary. Fix this by removing the manual adjust of 'buffer_start' from the PTR_MODE call.
Closes: https://lore.kernel.org/netdev/70f0dcd9-1906-4d13-82df-7bbbbe7194c6@app.fast... Fixes: f422abe3f23d ("dpaa2-eth: increase the needed headroom to account for alignment") Signed-off-by: Ioana Ciornei ioana.ciornei@nxp.com Tested-by: Mathew McBride matt@traverse.com.au Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20251016135807.360978-1-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c index dbc40e4514f0a..3c19be56af22e 100644 --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c @@ -1046,8 +1046,7 @@ static int dpaa2_eth_build_single_fd(struct dpaa2_eth_priv *priv, dma_addr_t addr;
buffer_start = skb->data - dpaa2_eth_needed_headroom(skb); - aligned_start = PTR_ALIGN(buffer_start - DPAA2_ETH_TX_BUF_ALIGN, - DPAA2_ETH_TX_BUF_ALIGN); + aligned_start = PTR_ALIGN(buffer_start, DPAA2_ETH_TX_BUF_ALIGN); if (aligned_start >= skb->head) buffer_start = aligned_start; else
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Huang Ying ying.huang@linux.alibaba.com
[ Upstream commit 143937ca51cc6ae2fccc61a1cb916abb24cd34f5 ]
Current pte_mkwrite_novma() makes PTE dirty unconditionally. This may mark some pages that are never written dirty wrongly. For example, do_swap_page() may map the exclusive pages with writable and clean PTEs if the VMA is writable and the page fault is for read access. However, current pte_mkwrite_novma() implementation always dirties the PTE. This may cause unnecessary disk writing if the pages are never written before being reclaimed.
So, change pte_mkwrite_novma() to clear the PTE_RDONLY bit only if the PTE_DIRTY bit is set to make it possible to make the PTE writable and clean.
The current behavior was introduced in commit 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()"). Before that, pte_mkwrite() only sets the PTE_WRITE bit, while set_pte_at() only clears the PTE_RDONLY bit if both the PTE_WRITE and the PTE_DIRTY bits are set.
To test the performance impact of the patch, on an arm64 server machine, run 16 redis-server processes on socket 1 and 16 memtier_benchmark processes on socket 0 with mostly get transactions (that is, redis-server will mostly read memory only). The memory footprint of redis-server is larger than the available memory, so swap out/in will be triggered. Test results show that the patch can avoid most swapping out because the pages are mostly clean. And the benchmark throughput improves ~23.9% in the test.
Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") Signed-off-by: Huang Ying ying.huang@linux.alibaba.com Cc: Will Deacon will@kernel.org Cc: Anshuman Khandual anshuman.khandual@arm.com Cc: Ryan Roberts ryan.roberts@arm.com Cc: Gavin Shan gshan@redhat.com Cc: Ard Biesheuvel ardb@kernel.org Cc: Matthew Wilcox (Oracle) willy@infradead.org Cc: Yicong Yang yangyicong@hisilicon.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/pgtable.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 426c3cb3e3bb1..62326f249aa71 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -183,7 +183,8 @@ static inline pmd_t set_pmd_bit(pmd_t pmd, pgprot_t prot) static inline pte_t pte_mkwrite(pte_t pte) { pte = set_pte_bit(pte, __pgprot(PTE_WRITE)); - pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); + if (pte_sw_dirty(pte)) + pte = clear_pte_bit(pte, __pgprot(PTE_RDONLY)); return pte; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Simakov bigalex934@gmail.com
[ Upstream commit 441f0647f7673e0e64d4910ef61a5fb8f16bfb82 ]
chunk->skb pointer is dereferenced in the if-block where it's supposed to be NULL only.
chunk->skb can only be NULL if chunk->head_skb is not. Check for frag_list instead and do it just before replacing chunk->skb. We're sure that otherwise chunk->skb is non-NULL because of outer if() condition.
Fixes: 90017accff61 ("sctp: Add GSO support") Signed-off-by: Alexey Simakov bigalex934@gmail.com Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Link: https://patch.msgid.link/20251021130034.6333-1-bigalex934@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sctp/inqueue.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c index 5c16521818058..f5a7d5a387555 100644 --- a/net/sctp/inqueue.c +++ b/net/sctp/inqueue.c @@ -169,13 +169,14 @@ struct sctp_chunk *sctp_inq_pop(struct sctp_inq *queue) chunk->head_skb = chunk->skb;
/* skbs with "cover letter" */ - if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) + if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len) { + if (WARN_ON(!skb_shinfo(chunk->skb)->frag_list)) { + __SCTP_INC_STATS(dev_net(chunk->skb->dev), + SCTP_MIB_IN_PKT_DISCARDS); + sctp_chunk_free(chunk); + goto next_chunk; + } chunk->skb = skb_shinfo(chunk->skb)->frag_list; - - if (WARN_ON(!chunk->skb)) { - __SCTP_INC_STATS(dev_net(chunk->skb->dev), SCTP_MIB_IN_PKT_DISCARDS); - sctp_chunk_free(chunk); - goto next_chunk; } }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tonghao Zhang tonghao@bamaicloud.com
commit 10843e1492e474c02b91314963161731fa92af91 upstream.
If the send_peer_notif counter and the peer event notify are not synchronized. It may cause problems such as the loss or dup of peer notify event.
Before this patch: - If should_notify_peers is true and the lock for send_peer_notif-- fails, peer event may be sent again in next mii_monitor loop, because should_notify_peers is still true. - If should_notify_peers is true and the lock for send_peer_notif-- succeeded, but the lock for peer event fails, the peer event will be lost.
This patch locks the RTNL for send_peer_notif, events, and commit simultaneously.
Fixes: 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications") Cc: Jay Vosburgh jv@jvosburgh.net Cc: Andrew Lunn andrew+netdev@lunn.ch Cc: Eric Dumazet edumazet@google.com Cc: Jakub Kicinski kuba@kernel.org Cc: Paolo Abeni pabeni@redhat.com Cc: Hangbin Liu liuhangbin@gmail.com Cc: Nikolay Aleksandrov razor@blackwall.org Cc: Vincent Bernat vincent@bernat.ch Cc: stable@vger.kernel.org Signed-off-by: Tonghao Zhang tonghao@bamaicloud.com Acked-by: Jay Vosburgh jv@jvosburgh.net Link: https://patch.msgid.link/20251021050933.46412-1-tonghao@bamaicloud.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bonding/bond_main.c | 40 ++++++++++++++++++---------------------- 1 file changed, 18 insertions(+), 22 deletions(-)
--- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -2821,7 +2821,7 @@ static void bond_mii_monitor(struct work { struct bonding *bond = container_of(work, struct bonding, mii_work.work); - bool should_notify_peers = false; + bool should_notify_peers; bool commit; unsigned long delay; struct slave *slave; @@ -2833,30 +2833,33 @@ static void bond_mii_monitor(struct work goto re_arm;
rcu_read_lock(); + should_notify_peers = bond_should_notify_peers(bond); commit = !!bond_miimon_inspect(bond); - if (bond->send_peer_notif) { - rcu_read_unlock(); - if (rtnl_trylock()) { - bond->send_peer_notif--; - rtnl_unlock(); - } - } else { - rcu_read_unlock(); - }
- if (commit) { + rcu_read_unlock(); + + if (commit || bond->send_peer_notif) { /* Race avoidance with bond_close cancel of workqueue */ if (!rtnl_trylock()) { delay = 1; - should_notify_peers = false; goto re_arm; }
- bond_for_each_slave(bond, slave, iter) { - bond_commit_link_state(slave, BOND_SLAVE_NOTIFY_LATER); + if (commit) { + bond_for_each_slave(bond, slave, iter) { + bond_commit_link_state(slave, + BOND_SLAVE_NOTIFY_LATER); + } + bond_miimon_commit(bond); + } + + if (bond->send_peer_notif) { + bond->send_peer_notif--; + if (should_notify_peers) + call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, + bond->dev); } - bond_miimon_commit(bond);
rtnl_unlock(); /* might sleep, hold no other locks */ } @@ -2864,13 +2867,6 @@ static void bond_mii_monitor(struct work re_arm: if (bond->params.miimon) queue_delayed_work(bond->wq, &bond->mii_work, delay); - - if (should_notify_peers) { - if (!rtnl_trylock()) - return; - call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, bond->dev); - rtnl_unlock(); - } }
static int bond_upper_dev_walk(struct net_device *upper,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit 10fad4012234a7dea621ae17c0c9486824f645a0 upstream.
It is reported that commit 85975daeaa4d ("cpuidle: menu: Avoid discarding useful information") led to a performance regression on Intel Jasper Lake systems because it reduced the time spent by CPUs in idle state C7 which is correlated to the maximum frequency the CPUs can get to because of an average running power limit [1].
Before that commit, get_typical_interval() would have returned UINT_MAX whenever it had been unable to make a high-confidence prediction which had led to selecting the deepest available idle state too often and both power and performance had been inadequate as a result of that on some systems. However, this had not been a problem on systems with relatively aggressive average running power limits, like the Jasper Lake systems in question, because on those systems it was compensated by the ability to run CPUs faster.
It was addressed by causing get_typical_interval() to return a number based on the recent idle duration information available to it even if it could not make a high-confidence prediction, but that clearly did not take the possible correlation between idle power and available CPU capacity into account.
For this reason, revert most of the changes made by commit 85975daeaa4d, except for one cosmetic cleanup, and add a comment explaining the rationale for returning UINT_MAX from get_typical_interval() when it is unable to make a high-confidence prediction.
Fixes: 85975daeaa4d ("cpuidle: menu: Avoid discarding useful information") Closes: https://lore.kernel.org/linux-pm/36iykr223vmcfsoysexug6s274nq2oimcu55ybn6ww4... [1] Reported-by: Sergey Senozhatsky senozhatsky@chromium.org Cc: All applicable stable@vger.kernel.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Link: https://patch.msgid.link/3663603.iIbC2pHGDl@rafael.j.wysocki Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cpuidle/governors/menu.c | 21 +++++++++------------ 1 file changed, 9 insertions(+), 12 deletions(-)
--- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -256,20 +256,17 @@ again: * * This can deal with workloads that have long pauses interspersed * with sporadic activity with a bunch of short pauses. + * + * However, if the number of remaining samples is too small to exclude + * any more outliers, allow the deepest available idle state to be + * selected because there are systems where the time spent by CPUs in + * deep idle states is correlated to the maximum frequency the CPUs + * can get to. On those systems, shallow idle states should be avoided + * unless there is a clear indication that the given CPU is most likley + * going to be woken up shortly. */ - if (divisor * 4 <= INTERVALS * 3) { - /* - * If there are sufficiently many data points still under - * consideration after the outliers have been eliminated, - * returning without a prediction would be a mistake because it - * is likely that the next interval will not exceed the current - * maximum, so return the latter in that case. - */ - if (divisor >= INTERVALS / 2) - return max; - + if (divisor * 4 <= INTERVALS * 3) return UINT_MAX; - }
thresh = max - 1; goto again;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xi Ruoyao xry111@xry111.site
commit 6e3a4754717a74e931a9f00b5f953be708e07acb upstream.
When ACPI_MISALIGNMENT_NOT_SUPPORTED is set, GCC can produce a bogus -Wstringop-overread warning, see [1].
To me, it's very clear that we have a compiler bug here, thus just disable the warning.
Fixes: a9d13433fe17 ("LoongArch: Align ACPI structures if ARCH_STRICT_ALIGN enabled") Link: https://lore.kernel.org/all/899f2dec-e8b9-44f4-ab8d-001e160a2aed@roeck-us.ne... Link: https://github.com/acpica/acpica/commit/abf5b573 Link: https://gcc.gnu.org/PR122073 [1] Co-developed-by: Saket Dumbre saket.dumbre@intel.com Signed-off-by: Saket Dumbre saket.dumbre@intel.com Signed-off-by: Xi Ruoyao xry111@xry111.site Acked-by: Huacai Chen chenhuacai@loongson.cn Cc: All applicable stable@vger.kernel.org [ rjw: Subject and changelog edits ] Link: https://patch.msgid.link/20251021092825.822007-1-xry111@xry111.site Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/acpica/tbprint.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/acpi/acpica/tbprint.c +++ b/drivers/acpi/acpica/tbprint.c @@ -94,6 +94,11 @@ acpi_tb_print_table_header(acpi_physical { struct acpi_table_header local_header;
+#pragma GCC diagnostic push +#if defined(__GNUC__) && __GNUC__ >= 11 +#pragma GCC diagnostic ignored "-Wstringop-overread" +#endif + if (ACPI_COMPARE_NAMESEG(header->signature, ACPI_SIG_FACS)) {
/* FACS only has signature and length fields */ @@ -134,6 +139,7 @@ acpi_tb_print_table_header(acpi_physical local_header.asl_compiler_id, local_header.asl_compiler_revision)); } +#pragma GCC diagnostic pop }
/*******************************************************************************
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
commit 8e93ac51e4c6dc399fad59ec21f55f2cfb46d27c upstream.
Since the commit c1f3f9797c1f ("can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode"), the automatic restart delay can only be set for devices that implement the restart handler struct can_priv::do_set_mode. As it makes no sense to configure a automatic restart for devices that doesn't support it.
However, since systemd commit 13ce5d4632e3 ("network/can: properly handle CAN.RestartSec=0") [1], systemd-networkd correctly handles a restart delay of "0" (i.e. the restart is disabled). Which means that a disabled restart is always configured in the kernel.
On systems with both changes active this causes that CAN interfaces that don't implement a restart handler cannot be brought up by systemd-networkd.
Solve this problem by allowing a delay of "0" to be configured, even if the device does not implement a restart handler.
[1] https://github.com/systemd/systemd/commit/13ce5d4632e395521e6205c954493c7fc1...
Cc: stable@vger.kernel.org Cc: Andrei Lalaev andrey.lalaev@gmail.com Reported-by: Marc Kleine-Budde mkl@pengutronix.de Closes: https://lore.kernel.org/all/20251020-certain-arrogant-vole-of-sunshine-14184... Fixes: c1f3f9797c1f ("can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode") Link: https://patch.msgid.link/20251020-netlink-fix-restart-v1-1-3f53c7f8520b@peng... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/dev/netlink.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/net/can/dev/netlink.c +++ b/drivers/net/can/dev/netlink.c @@ -252,7 +252,9 @@ static int can_changelink(struct net_dev }
if (data[IFLA_CAN_RESTART_MS]) { - if (!priv->do_set_mode) { + unsigned int restart_ms = nla_get_u32(data[IFLA_CAN_RESTART_MS]); + + if (restart_ms != 0 && !priv->do_set_mode) { NL_SET_ERR_MSG(extack, "Device doesn't support restart from Bus Off"); return -EOPNOTSUPP; @@ -261,7 +263,7 @@ static int can_changelink(struct net_dev /* Do not allow changing restart delay while running */ if (dev->flags & IFF_UP) return -EBUSY; - priv->restart_ms = nla_get_u32(data[IFLA_CAN_RESTART_MS]); + priv->restart_ms = restart_ms; }
if (data[IFLA_CAN_RESTART]) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maciej W. Rozycki macro@orcam.me.uk
commit bf5570590a981d0659d0808d2d4bcda21b27a2a5 upstream.
MIPS Malta platform code registers the PCI southbridge legacy port I/O PS/2 keyboard range as a standard resource marked as busy. It prevents the i8042 driver from registering as it fails to claim the resource in a call to i8042_platform_init(). Consequently PS/2 keyboard and mouse devices cannot be used with this platform.
Fix the issue by removing the busy marker from the standard reservation, making the driver register successfully:
serio: i8042 KBD port at 0x60,0x64 irq 1 serio: i8042 AUX port at 0x60,0x64 irq 12
and the resource show up as expected among the legacy devices:
00000000-00ffffff : MSC PCI I/O 00000000-0000001f : dma1 00000020-00000021 : pic1 00000040-0000005f : timer 00000060-0000006f : keyboard 00000060-0000006f : i8042 00000070-00000077 : rtc0 00000080-0000008f : dma page reg 000000a0-000000a1 : pic2 000000c0-000000df : dma2 [...]
If the i8042 driver has not been configured, then the standard resource will remain there preventing any conflicting dynamic assignment of this PCI port I/O address range.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Maciej W. Rozycki macro@orcam.me.uk Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Acked-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Cc: stable@vger.kernel.org Link: https://patch.msgid.link/alpine.DEB.2.21.2510211919240.8377@angie.orcam.me.u... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/mips/mti-malta/malta-setup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/mips/mti-malta/malta-setup.c +++ b/arch/mips/mti-malta/malta-setup.c @@ -47,7 +47,7 @@ static struct resource standard_io_resou .name = "keyboard", .start = 0x60, .end = 0x6f, - .flags = IORESOURCE_IO | IORESOURCE_BUSY + .flags = IORESOURCE_IO }, { .name = "dma page reg",
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Deepanshu Kartikey kartikey406@gmail.com
commit 78a63493f8e352296dbc7cb7b3f4973105e8679e upstream.
The extent map cache can become stale when extents are moved or defragmented, causing subsequent operations to see outdated extent flags. This triggers a BUG_ON in ocfs2_refcount_cal_cow_clusters().
The problem occurs when: 1. copy_file_range() creates a reflinked extent with OCFS2_EXT_REFCOUNTED 2. ioctl(FITRIM) triggers ocfs2_move_extents() 3. __ocfs2_move_extents_range() reads and caches the extent (flags=0x2) 4. ocfs2_move_extent()/ocfs2_defrag_extent() calls __ocfs2_move_extent() which clears OCFS2_EXT_REFCOUNTED flag on disk (flags=0x0) 5. The extent map cache is not invalidated after the move 6. Later write() operations read stale cached flags (0x2) but disk has updated flags (0x0), causing a mismatch 7. BUG_ON(!(rec->e_flags & OCFS2_EXT_REFCOUNTED)) triggers
Fix by clearing the extent map cache after each extent move/defrag operation in __ocfs2_move_extents_range(). This ensures subsequent operations read fresh extent data from disk.
Link: https://lore.kernel.org/all/20251009142917.517229-1-kartikey406@gmail.com/T/ Link: https://lkml.kernel.org/r/20251009154903.522339-1-kartikey406@gmail.com Fixes: 53069d4e7695 ("Ocfs2/move_extents: move/defrag extents within a certain range.") Signed-off-by: Deepanshu Kartikey kartikey406@gmail.com Reported-by: syzbot+6fdd8fa3380730a4b22c@syzkaller.appspotmail.com Tested-by: syzbot+6fdd8fa3380730a4b22c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?id=2959889e1f6e216585ce522f7e8bc002b46ad9e... Reviewed-by: Mark Fasheh mark@fasheh.com Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Jun Piao piaojun@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/move_extents.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/fs/ocfs2/move_extents.c +++ b/fs/ocfs2/move_extents.c @@ -868,6 +868,11 @@ static int __ocfs2_move_extents_range(st mlog_errno(ret); goto out; } + /* + * Invalidate extent cache after moving/defragging to prevent + * stale cached data with outdated extent flags. + */ + ocfs2_extent_map_trunc(inode, cpos);
context->clusters_moved += alloc_size; next:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefano Garzarella sgarzare@redhat.com
commit f7c877e7535260cc7a21484c994e8ce7e8cb6780 upstream.
Syzbot reported a potential lock inversion deadlock between vsock_register_mutex and sk_lock-AF_VSOCK when vsock_linger() is called.
The issue was introduced by commit 687aa0c5581b ("vsock: Fix transport_* TOCTOU") which added vsock_register_mutex locking in vsock_assign_transport() around the transport->release() call, that can call vsock_linger(). vsock_assign_transport() can be called with sk_lock held. vsock_linger() calls sk_wait_event() that temporarily releases and re-acquires sk_lock. During this window, if another thread hold vsock_register_mutex while trying to acquire sk_lock, a circular dependency is created.
Fix this by releasing vsock_register_mutex before calling transport->release() and vsock_deassign_transport(). This is safe because we don't need to hold vsock_register_mutex while releasing the old transport, and we ensure the new transport won't disappear by obtaining a module reference first via try_module_get().
Reported-by: syzbot+10e35716f8e4929681fa@syzkaller.appspotmail.com Tested-by: syzbot+10e35716f8e4929681fa@syzkaller.appspotmail.com Fixes: 687aa0c5581b ("vsock: Fix transport_* TOCTOU") Cc: mhal@rbox.co Cc: stable@vger.kernel.org Signed-off-by: Stefano Garzarella sgarzare@redhat.com Link: https://patch.msgid.link/20251021121718.137668-1-sgarzare@redhat.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/vmw_vsock/af_vsock.c | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-)
--- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -483,12 +483,26 @@ int vsock_assign_transport(struct vsock_ goto err; }
- if (vsk->transport) { - if (vsk->transport == new_transport) { - ret = 0; - goto err; - } + if (vsk->transport && vsk->transport == new_transport) { + ret = 0; + goto err; + } + + /* We increase the module refcnt to prevent the transport unloading + * while there are open sockets assigned to it. + */ + if (!new_transport || !try_module_get(new_transport->module)) { + ret = -ENODEV; + goto err; + } + + /* It's safe to release the mutex after a successful try_module_get(). + * Whichever transport `new_transport` points at, it won't go away until + * the last module_put() below or in vsock_deassign_transport(). + */ + mutex_unlock(&vsock_register_mutex);
+ if (vsk->transport) { /* transport->release() must be called with sock lock acquired. * This path can only be taken during vsock_connect(), where we * have already held the sock lock. In the other cases, this @@ -508,20 +522,6 @@ int vsock_assign_transport(struct vsock_ vsk->peer_shutdown = 0; }
- /* We increase the module refcnt to prevent the transport unloading - * while there are open sockets assigned to it. - */ - if (!new_transport || !try_module_get(new_transport->module)) { - ret = -ENODEV; - goto err; - } - - /* It's safe to release the mutex after a successful try_module_get(). - * Whichever transport `new_transport` points at, it won't go away until - * the last module_put() below or in vsock_deassign_transport(). - */ - mutex_unlock(&vsock_register_mutex); - if (sk->sk_type == SOCK_SEQPACKET) { if (!new_transport->seqpacket_allow || !new_transport->seqpacket_allow(remote_cid)) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sebastian Reichel sebastian.reichel@collabora.com
commit 7f864458e9a6d2000b726d14b3d3a706ac92a3b0 upstream.
On all platforms set_clock_selection() writes to a GRF register. This requires certain clocks running and thus should happen before the clocks are disabled.
This has been noticed on RK3576 Sige5, which hangs during system suspend when trying to suspend the second network interface. Note, that suspending the first interface works, because the second device ensures that the necessary clocks for the GRF are enabled.
Cc: stable@vger.kernel.org Fixes: 2f2b60a0ec28 ("net: ethernet: stmmac: dwmac-rk: Add gmac support for rk3588") Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20251014-rockchip-network-clock-fix-v1-1-c257b4afdf... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c @@ -1565,14 +1565,15 @@ static int gmac_clk_enable(struct rk_pri } } else { if (bsp_priv->clk_enabled) { + if (bsp_priv->ops && bsp_priv->ops->set_clock_selection) { + bsp_priv->ops->set_clock_selection(bsp_priv, + bsp_priv->clock_input, false); + } + clk_bulk_disable_unprepare(bsp_priv->num_clks, bsp_priv->clks); clk_disable_unprepare(bsp_priv->clk_phy);
- if (bsp_priv->ops && bsp_priv->ops->set_clock_selection) - bsp_priv->ops->set_clock_selection(bsp_priv, - bsp_priv->clock_input, false); - bsp_priv->clk_enabled = false; } }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Pecio michal.pecio@gmail.com
commit 75cea9860aa6b2350d90a8d78fed114d27c7eca2 upstream.
TX frames aren't padded and unknown memory is sent into the ether.
Theoretically, it isn't even guaranteed that the extra memory exists and can be sent out, which could cause further problems. In practice, I found that plenty of tailroom exists in the skb itself (in my test with ping at least) and skb_padto() easily succeeds, so use it here.
In the event of -ENOMEM drop the frame like other drivers do.
The use of one more padding byte instead of a USB zero-length packet is retained to avoid regression. I have a dodgy Etron xHCI controller which doesn't seem to support sending ZLPs at all.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Michal Pecio michal.pecio@gmail.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20251014203528.3f9783c4.michal.pecio@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/rtl8150.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
--- a/drivers/net/usb/rtl8150.c +++ b/drivers/net/usb/rtl8150.c @@ -685,9 +685,16 @@ static netdev_tx_t rtl8150_start_xmit(st rtl8150_t *dev = netdev_priv(netdev); int count, res;
+ /* pad the frame and ensure terminating USB packet, datasheet 9.2.3 */ + count = max(skb->len, ETH_ZLEN); + if (count % 64 == 0) + count++; + if (skb_padto(skb, count)) { + netdev->stats.tx_dropped++; + return NETDEV_TX_OK; + } + netif_stop_queue(netdev); - count = (skb->len < 60) ? 60 : skb->len; - count = (count & 0x3f) ? count : count + 1; dev->tx_skb = skb; usb_fill_bulk_urb(dev->tx_urb, dev->udev, usb_sndbulkpipe(dev->udev, 2), skb->data, count, write_bulk_callback, dev);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lad Prabhakar prabhakar.mahadev-lad.rj@bp.renesas.com
commit 5370c31e84b0e0999c7b5ff949f4e104def35584 upstream.
Ensure the TX descriptor type fields are published in a safe order so the DMA engine never begins processing a descriptor chain before all descriptor fields are fully initialised.
For multi-descriptor transmits the driver writes DT_FEND into the last descriptor and DT_FSTART into the first. The DMA engine begins processing when it observes DT_FSTART. Move the dma_wmb() barrier so it executes immediately after DT_FEND and immediately before writing DT_FSTART (and before DT_FSINGLE in the single-descriptor case). This guarantees that all prior CPU writes to the descriptor memory are visible to the device before DT_FSTART is seen.
This avoids a situation where compiler/CPU reordering could publish DT_FSTART ahead of DT_FEND or other descriptor fields, allowing the DMA to start on a partially initialised chain and causing corrupted transmissions or TX timeouts. Such a failure was observed on RZ/G2L with an RT kernel as transmit queue timeouts and device resets.
Fixes: 2f45d1902acf ("ravb: minimize TX data copying") Cc: stable@vger.kernel.org Co-developed-by: Fabrizio Castro fabrizio.castro.jz@renesas.com Signed-off-by: Fabrizio Castro fabrizio.castro.jz@renesas.com Signed-off-by: Lad Prabhakar prabhakar.mahadev-lad.rj@bp.renesas.com Reviewed-by: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se Link: https://patch.msgid.link/20251017151830.171062-4-prabhakar.mahadev-lad.rj@bp... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/renesas/ravb_main.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/renesas/ravb_main.c +++ b/drivers/net/ethernet/renesas/ravb_main.c @@ -2054,13 +2054,25 @@ static netdev_tx_t ravb_start_xmit(struc
skb_tx_timestamp(skb); } - /* Descriptor type must be set after all the above writes */ - dma_wmb(); + if (num_tx_desc > 1) { desc->die_dt = DT_FEND; desc--; + /* When using multi-descriptors, DT_FEND needs to get written + * before DT_FSTART, but the compiler may reorder the memory + * writes in an attempt to optimize the code. + * Use a dma_wmb() barrier to make sure DT_FEND and DT_FSTART + * are written exactly in the order shown in the code. + * This is particularly important for cases where the DMA engine + * is already running when we are running this code. If the DMA + * sees DT_FSTART without the corresponding DT_FEND it will enter + * an error condition. + */ + dma_wmb(); desc->die_dt = DT_FSTART; } else { + /* Descriptor type must be set after all the above writes */ + dma_wmb(); desc->die_dt = DT_FSINGLE; } ravb_modify(ndev, TCCR, TCCR_TSRQ0 << q, TCCR_TSRQ0 << q);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lad Prabhakar prabhakar.mahadev-lad.rj@bp.renesas.com
commit 706136c5723626fcde8dd8f598a4dcd251e24927 upstream.
Add a final dma_wmb() barrier before triggering the transmit request (TCCR_TSRQ) to ensure all descriptor and buffer writes are visible to the DMA engine.
According to the hardware manual, a read-back operation is required before writing to the doorbell register to guarantee completion of previous writes. Instead of performing a dummy read, a dma_wmb() is used to both enforce the same ordering semantics on the CPU side and also to ensure completion of writes.
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Cc: stable@vger.kernel.org Co-developed-by: Fabrizio Castro fabrizio.castro.jz@renesas.com Signed-off-by: Fabrizio Castro fabrizio.castro.jz@renesas.com Signed-off-by: Lad Prabhakar prabhakar.mahadev-lad.rj@bp.renesas.com Reviewed-by: Niklas Söderlund niklas.soderlund+renesas@ragnatech.se Link: https://patch.msgid.link/20251017151830.171062-5-prabhakar.mahadev-lad.rj@bp... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/renesas/ravb_main.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/net/ethernet/renesas/ravb_main.c +++ b/drivers/net/ethernet/renesas/ravb_main.c @@ -2075,6 +2075,14 @@ static netdev_tx_t ravb_start_xmit(struc dma_wmb(); desc->die_dt = DT_FSINGLE; } + + /* Before ringing the doorbell we need to make sure that the latest + * writes have been committed to memory, otherwise it could delay + * things until the doorbell is rang again. + * This is in replacement of the read operation mentioned in the HW + * manuals. + */ + dma_wmb(); ravb_modify(ndev, TCCR, TCCR_TSRQ0 << q, TCCR_TSRQ0 << q);
priv->cur_tx[q] += num_tx_desc;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) matttbe@kernel.org
commit d68460bc31f9c8c6fc81fbb56ec952bec18409f1 upstream.
The call to 'continue_if' was missing: it properly marks a subtest as 'skipped' if the attached condition is not valid.
Without that, the test is wrongly marked as passed on older kernels.
Fixes: e06959e9eebd ("selftests: mptcp: join: test for flush/re-add endpoints") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang geliang@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20251020-net-mptcp-c-flag-late-add-addr-v1-2-820703... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -3457,7 +3457,7 @@ endpoint_tests()
# flush and re-add if reset_with_tcp_filter "flush re-add" ns2 10.0.3.2 REJECT OUTPUT && - mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then + continue_if mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then pm_nl_set_limits $ns1 0 2 pm_nl_set_limits $ns2 1 2 # broadcast IP: no packet for this address will be received on ns1
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) matttbe@kernel.org
commit 973f80d715bd2504b4db6e049f292e694145cd79 upstream.
The call to 'continue_if' was missing: it properly marks a subtest as 'skipped' if the attached condition is not valid.
Without that, the test is wrongly marked as passed on older kernels.
Fixes: 36c4127ae8dd ("selftests: mptcp: join: skip implicit tests if not supported") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang geliang@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20251020-net-mptcp-c-flag-late-add-addr-v1-3-820703... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -3309,7 +3309,7 @@ endpoint_tests() # subflow_rebuild_header is needed to support the implicit flag # userspace pm type prevents add_addr if reset "implicit EP" && - mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then + continue_if mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then pm_nl_set_limits $ns1 2 2 pm_nl_set_limits $ns2 2 2 pm_nl_add_endpoint $ns1 10.0.2.1 flags signal @@ -3330,7 +3330,7 @@ endpoint_tests() fi
if reset_with_tcp_filter "delete and re-add" ns2 10.0.3.2 REJECT OUTPUT && - mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then + continue_if mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then start_events pm_nl_set_limits $ns1 0 3 pm_nl_set_limits $ns2 0 3
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anup Patel apatel@ventanamicro.com
[ Upstream commit ca525d53f994d45c8140968b571372c45f555ac1 ]
The pgprot_dmacoherent() is used when allocating memory for non-coherent devices and by default pgprot_dmacoherent() is same as pgprot_noncached() unless architecture overrides it.
Currently, there is no pgprot_dmacoherent() definition for RISC-V hence non-coherent device memory is being mapped as IO thereby making CPU access to such memory slow.
Define pgprot_dmacoherent() to be same as pgprot_writecombine() for RISC-V so that CPU access non-coherent device memory as NOCACHE which is better than accessing it as IO.
Fixes: ff689fd21cb1 ("riscv: add RISC-V Svpbmt extension support") Signed-off-by: Anup Patel apatel@ventanamicro.com Tested-by: Han Gao rabenda.cn@gmail.com Tested-by: Guo Ren (Alibaba DAMO Academy) guoren@kernel.org Link: https://lore.kernel.org/r/20250820152316.1012757-1-apatel@ventanamicro.com Signed-off-by: Paul Walmsley pjw@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/pgtable.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index bb19a643c5c2a..1a94e633c1445 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -555,6 +555,8 @@ static inline pgprot_t pgprot_writecombine(pgprot_t _prot) return __pgprot(prot); }
+#define pgprot_dmacoherent pgprot_writecombine + /* * THP functions */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anup Patel apatel@ventanamicro.com
[ Upstream commit d2721bb165b3ee00dd23525885381af07fec852a ]
Early boot stages may disable CPU DT nodes for unavailable CPUs based on SKU, pinstraps, eFuse, etc. Currently, the riscv_early_of_processor_hartid() prints details of a CPU if it is disabled in DT which has no value and gives a false impression to the users that there some issue with the CPU.
Fixes: e3d794d555cd ("riscv: treat cpu devicetree nodes without status as enabled") Signed-off-by: Anup Patel apatel@ventanamicro.com Reviewed-by: Andrew Jones ajones@ventanamicro.com Reviewed-by: Conor Dooley conor.dooley@microchip.com Link: https://lore.kernel.org/r/20251014163009.182381-1-apatel@ventanamicro.com Signed-off-by: Paul Walmsley pjw@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/cpu.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c index 0f76181dc634d..e642b3dc42d27 100644 --- a/arch/riscv/kernel/cpu.c +++ b/arch/riscv/kernel/cpu.c @@ -32,10 +32,8 @@ int riscv_of_processor_hartid(struct device_node *node, unsigned long *hart) return -ENODEV; }
- if (!of_device_is_available(node)) { - pr_info("CPU with hartid=%lu is not available\n", *hart); + if (!of_device_is_available(node)) return -ENODEV; - }
if (of_property_read_string(node, "riscv,isa", &isa)) { pr_warn("CPU with hartid=%lu has no "riscv,isa" property\n", *hart);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alok Tiwari alok.a.tiwari@oracle.com
[ Upstream commit c5efc6a0b3940381d67887302ddb87a5cf623685 ]
The __must_hold annotation references &req->ctx->uring_lock, but req is not in scope in io_install_fixed_file. This change updates the annotation to reference the correct ctx->uring_lock. improving code clarity.
Fixes: f110ed8498af ("io_uring: split out fixed file installation and removal") Signed-off-by: Alok Tiwari alok.a.tiwari@oracle.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- io_uring/filetable.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/io_uring/filetable.c b/io_uring/filetable.c index a64b4df0ac9c2..f9e59c650e893 100644 --- a/io_uring/filetable.c +++ b/io_uring/filetable.c @@ -62,7 +62,7 @@ void io_free_file_tables(struct io_file_table *table)
static int io_install_fixed_file(struct io_ring_ctx *ctx, struct file *file, u32 slot_index) - __must_hold(&req->ctx->uring_lock) + __must_hold(&ctx->uring_lock) { bool needs_switch = false; struct io_fixed_file *file_slot;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Renjun Wang renjunw0@foxmail.com
commit 71c07570b918f000de5d0f7f1bf17a2887e303b5 upstream.
Add support for UNISOC (Spreadtrum) UIS7720 (A7720) module.
T: Bus=05 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 5 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1782 ProdID=4064 Rev=04.04 S: Manufacturer=Unisoc-phone S: Product=Unisoc-phone S: SerialNumber=0123456789ABCDEF C: #Ifs= 9 Cfg#= 1 Atr=c0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=e0(wlcon) Sub=01 Prot=03 Driver=rndis_host E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 6 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 7 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 8 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=08(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
0&1: RNDIS, 2: LOG, 3: DIAG, 4&5: AT Ports, 6&7: AT2 Ports, 8: ADB
Signed-off-by: Renjun Wang renjunw0@foxmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -617,6 +617,7 @@ static void option_instat_callback(struc #define UNISOC_VENDOR_ID 0x1782 /* TOZED LT70-C based on UNISOC SL8563 uses UNISOC's vendor ID */ #define TOZED_PRODUCT_LT70C 0x4055 +#define UNISOC_PRODUCT_UIS7720 0x4064 /* Luat Air72*U series based on UNISOC UIS8910 uses UNISOC's vendor ID */ #define LUAT_PRODUCT_AIR720U 0x4e00
@@ -2466,6 +2467,7 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(SIERRA_VENDOR_ID, SIERRA_PRODUCT_EM9291, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(SIERRA_VENDOR_ID, SIERRA_PRODUCT_EM9291, 0xff, 0xff, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, TOZED_PRODUCT_LT70C, 0xff, 0, 0) }, + { USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, UNISOC_PRODUCT_UIS7720, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, LUAT_PRODUCT_AIR720U, 0xff, 0, 0) }, { USB_DEVICE_INTERFACE_CLASS(0x1bbb, 0x0530, 0xff), /* TCL IK512 MBIM */ .driver_info = NCTRL(1) },
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Reinhard Speyerer rspmn@arcor.de
commit 89205c60c0fc96b73567a2e9fe27ee3f59d01193 upstream.
Add support for Quectel RG255C devices to complement commit 5c964c8a97c1 ("net: usb: qmi_wwan: add Quectel RG255C"). The composition is DM / NMEA / AT / QMI.
T: Bus=01 Lev=02 Prnt=99 Port=01 Cnt=02 Dev#=110 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2c7c ProdID=0316 Rev= 5.15 S: Manufacturer=Quectel S: Product=RG255C-GL S: SerialNumber=xxxxxxxx C:* #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=86(I) Atr=03(Int.) MxPS= 8 Ivl=32ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Signed-off-by: Reinhard Speyerer rspmn@arcor.de Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -273,6 +273,7 @@ static void option_instat_callback(struc #define QUECTEL_PRODUCT_EM05CN 0x0312 #define QUECTEL_PRODUCT_EM05G_GR 0x0313 #define QUECTEL_PRODUCT_EM05G_RS 0x0314 +#define QUECTEL_PRODUCT_RG255C 0x0316 #define QUECTEL_PRODUCT_EM12 0x0512 #define QUECTEL_PRODUCT_RM500Q 0x0800 #define QUECTEL_PRODUCT_RM520N 0x0801 @@ -1271,6 +1272,9 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RM500K, 0xff, 0x00, 0x00) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RG650V, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RG650V, 0xff, 0, 0) }, + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RG255C, 0xff, 0xff, 0x30) }, + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RG255C, 0xff, 0, 0) }, + { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_RG255C, 0xff, 0xff, 0x40) },
{ USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_6001) }, { USB_DEVICE(CMOTECH_VENDOR_ID, CMOTECH_PRODUCT_CMU_300) },
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: LI Qingwu Qing-wu.Li@leica-geosystems.com.cn
commit 622865c73ae30f254abdf182f4b66cccbe3e0f10 upstream.
Add support for the Telit Cinterion FN920C04 module when operating in ECM (Ethernet Control Model) mode. The following USB product IDs are used by the module when AT#USBCFG is set to 3 or 7.
0x10A3: ECM + tty (NMEA) + tty (DUN) [+ tty (DIAG)] T: Bus=01 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 3 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10a3 Rev= 5.15 S: Manufacturer=Telit Cinterion S: Product=FN920 S: SerialNumber=76e7cb38 C:* #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=06 Prot=00 Driver=cdc_ether E: Ad=82(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
0x10A8: ECM + tty (DUN) + tty (AUX) [+ tty (DIAG)] T: Bus=03 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 3 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10a8 Rev= 5.15 S: Manufacturer=Telit Cinterion S: Product=FN920 S: SerialNumber=76e7cb38 C:* #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=06 Prot=00 Driver=cdc_ether E: Ad=82(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether I:* If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Adding these IDs allows the option driver to automatically create the corresponding /dev/ttyUSB* ports under ECM mode.
Tested with FN920C04 under ECM configuration (USBCFG=3 and 7).
Signed-off-by: LI Qingwu Qing-wu.Li@leica-geosystems.com.cn Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -1403,10 +1403,14 @@ static const struct usb_device_id option .driver_info = RSVD(0) | NCTRL(3) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10a2, 0xff), /* Telit FN920C04 (MBIM) */ .driver_info = NCTRL(4) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10a3, 0xff), /* Telit FN920C04 (ECM) */ + .driver_info = NCTRL(4) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10a4, 0xff), /* Telit FN20C04 (rmnet) */ .driver_info = RSVD(0) | NCTRL(3) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10a7, 0xff), /* Telit FN920C04 (MBIM) */ .driver_info = NCTRL(4) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10a8, 0xff), /* Telit FN920C04 (ECM) */ + .driver_info = NCTRL(4) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10a9, 0xff), /* Telit FN20C04 (rmnet) */ .driver_info = RSVD(0) | NCTRL(2) | RSVD(3) | RSVD(4) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x10aa, 0xff), /* Telit FN920C04 (MBIM) */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tim Guttzeit t.guttzeit@tuxedocomputers.com
commit dfc2cf4dcaa03601cd4ca0f7def88b2630fca6ab upstream.
The list of Huawei LTE modules needing the quirk fixing spurious wakeups was missing the IDs of the Huawei ME906S module, therefore suspend did not work.
Cc: stable stable@kernel.org Signed-off-by: Tim Guttzeit t.guttzeit@tuxedocomputers.com Signed-off-by: Werner Sembach wse@tuxedocomputers.com Link: https://patch.msgid.link/20251020134304.35079-1-wse@tuxedocomputers.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/core/quirks.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/usb/core/quirks.c +++ b/drivers/usb/core/quirks.c @@ -464,6 +464,8 @@ static const struct usb_device_id usb_qu /* Huawei 4G LTE module */ { USB_DEVICE(0x12d1, 0x15bb), .driver_info = USB_QUIRK_DISCONNECT_SUSPEND }, + { USB_DEVICE(0x12d1, 0x15c1), .driver_info = + USB_QUIRK_DISCONNECT_SUSPEND }, { USB_DEVICE(0x12d1, 0x15c3), .driver_info = USB_QUIRK_DISCONNECT_SUSPEND },
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrey Konovalov andreyknvl@gmail.com
commit 37b9dd0d114a0e38c502695e30f55a74fb0c37d0 upstream.
Drop the check on the maximum transfer length in Raw Gadget for both control and non-control transfers.
Limiting the transfer length causes a problem with emulating USB devices whose full configuration descriptor exceeds PAGE_SIZE in length.
Overall, there does not appear to be any reason to enforce any kind of transfer length limit on the Raw Gadget side for either control or non-control transfers, so let's just drop the related check.
Cc: stable stable@kernel.org Fixes: f2c2e717642c ("usb: gadget: add raw-gadget interface") Signed-off-by: Andrey Konovalov andreyknvl@gmail.com Link: https://patch.msgid.link/a6024e8eab679043e9b8a5defdb41c4bda62f02b.1761085528... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/legacy/raw_gadget.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/usb/gadget/legacy/raw_gadget.c +++ b/drivers/usb/gadget/legacy/raw_gadget.c @@ -620,8 +620,6 @@ static void *raw_alloc_io_data(struct us return ERR_PTR(-EINVAL); if (!usb_raw_io_flags_valid(io->flags)) return ERR_PTR(-EINVAL); - if (io->length > PAGE_SIZE) - return ERR_PTR(-EINVAL); if (get_from_user) data = memdup_user(ptr + sizeof(*io), io->length); else {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mathias Nyman mathias.nyman@linux.intel.com
commit 2bbd38fcd29670e46c0fdb9cd0e90507a8a1bf6a upstream.
DbC is currently only enabled back if it's in configured state during suspend.
If system is suspended after DbC is enabled, but before the device is properly enumerated by the host, then DbC would not be enabled back in resume.
Always enable DbC back in resume if it's suspended in enabled, connected, or configured state
Cc: stable stable@kernel.org Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver") Tested-by: Łukasz Bartosik ukaszb@chromium.org Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-dbgcap.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/drivers/usb/host/xhci-dbgcap.c +++ b/drivers/usb/host/xhci-dbgcap.c @@ -1136,8 +1136,15 @@ int xhci_dbc_suspend(struct xhci_hcd *xh if (!dbc) return 0;
- if (dbc->state == DS_CONFIGURED) + switch (dbc->state) { + case DS_ENABLED: + case DS_CONNECTED: + case DS_CONFIGURED: dbc->resume_required = 1; + break; + default: + break; + }
xhci_dbc_stop(dbc);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alice Ryhl aliceryhl@google.com
commit d90eeb8ecd227c204ab6c34a17b372bd950b7aa2 upstream.
There are no scenarios where a weak increment is invalid on binder_node. The only possible case where it could be invalid is if the kernel delivers BR_DECREFS to the process that owns the node, and then increments the weak refcount again, effectively "reviving" a dead node.
However, that is not possible: when the BR_DECREFS command is delivered, the kernel removes and frees the binder_node. The fact that you were able to call binder_inc_node_nilocked() implies that the node is not yet destroyed, which implies that BR_DECREFS has not been delivered to userspace, so incrementing the weak refcount is valid.
Note that it's currently possible to trigger this condition if the owner calls BINDER_THREAD_EXIT while node->has_weak_ref is true. This causes BC_INCREFS on binder_ref instances to fail when they should not.
Cc: stable@vger.kernel.org Fixes: 457b9a6f09f0 ("Staging: android: add binder driver") Reported-by: Yu-Ting Tseng yutingtseng@google.com Signed-off-by: Alice Ryhl aliceryhl@google.com Link: https://patch.msgid.link/20251015-binder-weak-inc-v1-1-7914b092c371@google.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/android/binder.c | 11 +---------- 1 file changed, 1 insertion(+), 10 deletions(-)
--- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -845,17 +845,8 @@ static int binder_inc_node_nilocked(stru } else { if (!internal) node->local_weak_refs++; - if (!node->has_weak_ref && list_empty(&node->work.entry)) { - if (target_list == NULL) { - pr_err("invalid inc weak node for %d\n", - node->debug_id); - return -EINVAL; - } - /* - * See comment above - */ + if (!node->has_weak_ref && target_list && list_empty(&node->work.entry)) binder_enqueue_work_ilocked(&node->work, target_list); - } } return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Deepanshu Kartikey kartikey406@gmail.com
commit 87b318ba81dda2ee7b603f4f6c55e78ec3e95974 upstream.
The comedi_buf_munge() function performs a modulo operation `async->munge_chan %= async->cmd.chanlist_len` without first checking if chanlist_len is zero. If a user program submits a command with chanlist_len set to zero, this causes a divide-by-zero error when the device processes data in the interrupt handler path.
Add a check for zero chanlist_len at the beginning of the function, similar to the existing checks for !map and CMDF_RAWDATA flag. When chanlist_len is zero, update munge_count and return early, indicating the data was handled without munging.
This prevents potential kernel panics from malformed user commands.
Reported-by: syzbot+f6c3c066162d2c43a66c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f6c3c066162d2c43a66c Cc: stable@vger.kernel.org Signed-off-by: Deepanshu Kartikey kartikey406@gmail.com Reviewed-by: Ian Abbott abbotti@mev.co.uk Link: https://patch.msgid.link/20250924102639.1256191-1-kartikey406@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/comedi/comedi_buf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/comedi/comedi_buf.c +++ b/drivers/comedi/comedi_buf.c @@ -368,7 +368,7 @@ static unsigned int comedi_buf_munge(str unsigned int count = 0; const unsigned int num_sample_bytes = comedi_bytes_per_sample(s);
- if (!s->munge || (async->cmd.flags & CMDF_RAWDATA)) { + if (!s->munge || (async->cmd.flags & CMDF_RAWDATA) || async->cmd.chanlist_len == 0) { async->munge_count += num_bytes; return num_bytes; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Usyskin alexander.usyskin@intel.com
commit 410d6c2ad4d1a88efa0acbb9966693725b564933 upstream.
Add Wildcat Lake P device id.
Cc: stable@vger.kernel.org Co-developed-by: Tomas Winkler tomasw@gmail.com Signed-off-by: Tomas Winkler tomasw@gmail.com Signed-off-by: Alexander Usyskin alexander.usyskin@intel.com Link: https://patch.msgid.link/20251016125912.2146136-1-alexander.usyskin@intel.co... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/mei/hw-me-regs.h | 2 ++ drivers/misc/mei/pci-me.c | 2 ++ 2 files changed, 4 insertions(+)
--- a/drivers/misc/mei/hw-me-regs.h +++ b/drivers/misc/mei/hw-me-regs.h @@ -120,6 +120,8 @@ #define MEI_DEV_ID_PTL_H 0xE370 /* Panther Lake H */ #define MEI_DEV_ID_PTL_P 0xE470 /* Panther Lake P */
+#define MEI_DEV_ID_WCL_P 0x4D70 /* Wildcat Lake P */ + /* * MEI HW Section */ --- a/drivers/misc/mei/pci-me.c +++ b/drivers/misc/mei/pci-me.c @@ -127,6 +127,8 @@ static const struct pci_device_id mei_me {MEI_PCI_DEVICE(MEI_DEV_ID_PTL_H, MEI_ME_PCH15_CFG)}, {MEI_PCI_DEVICE(MEI_DEV_ID_PTL_P, MEI_ME_PCH15_CFG)},
+ {MEI_PCI_DEVICE(MEI_DEV_ID_WCL_P, MEI_ME_PCH15_CFG)}, + /* required last entry */ {0, } };
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Junhao Xie bigfoot@radxa.com
commit fff111bf45cbeeb659324316d68554e35d350092 upstream.
In fastrpc_map_lookup, dma_buf_get is called to obtain a reference to the dma_buf for comparison purposes. However, this reference is never released when the function returns, leading to a dma_buf memory leak.
Fix this by adding dma_buf_put before returning from the function, ensuring that the temporarily acquired reference is properly released regardless of whether a matching map is found.
Fixes: 9031626ade38 ("misc: fastrpc: Fix fastrpc_map_lookup operation") Cc: stable@kernel.org Signed-off-by: Junhao Xie bigfoot@radxa.com Tested-by: Xilin Wu sophon@radxa.com Link: https://lore.kernel.org/stable/48B368FB4C7007A7%2B20251017083906.3259343-1-b... Link: https://patch.msgid.link/48B368FB4C7007A7+20251017083906.3259343-1-bigfoot@r... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/fastrpc.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/misc/fastrpc.c +++ b/drivers/misc/fastrpc.c @@ -363,6 +363,8 @@ static int fastrpc_map_lookup(struct fas } spin_unlock(&fl->lock);
+ dma_buf_put(buf); + return ret; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Victoria Votokina Victoria.Votokina@kaspersky.com
commit 4b1270902609ef0d935ed2faa2ea6d122bd148f5 upstream.
hdm_disconnect() calls most_deregister_interface(), which eventually unregisters the MOST interface device with device_unregister(iface->dev). If that drops the last reference, the device core may call release_mdev() immediately while hdm_disconnect() is still executing.
The old code also freed several mdev-owned allocations in hdm_disconnect() and then performed additional put_device() calls. Depending on refcount order, this could lead to use-after-free or double-free when release_mdev() ran (or when unregister paths also performed puts).
Fix by moving the frees of mdev-owned allocations into release_mdev(), so they happen exactly once when the device is truly released, and by dropping the extra put_device() calls in hdm_disconnect() that are redundant after device_unregister() and most_deregister_interface().
This addresses the KASAN slab-use-after-free reported by syzbot in hdm_disconnect(). See report and stack traces in the bug link below.
Reported-by: syzbot+916742d5d24f6c254761@syzkaller.appspotmail.com Cc: stable stable@kernel.org Closes: https://syzkaller.appspot.com/bug?extid=916742d5d24f6c254761 Fixes: 97a6f772f36b ("drivers: most: add USB adapter driver") Signed-off-by: Victoria Votokina Victoria.Votokina@kaspersky.com Link: https://patch.msgid.link/20251010105241.4087114-2-Victoria.Votokina@kaspersk... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/most/most_usb.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-)
--- a/drivers/most/most_usb.c +++ b/drivers/most/most_usb.c @@ -929,6 +929,10 @@ static void release_mdev(struct device * { struct most_dev *mdev = to_mdev_from_dev(dev);
+ kfree(mdev->busy_urbs); + kfree(mdev->cap); + kfree(mdev->conf); + kfree(mdev->ep_address); kfree(mdev); } /** @@ -1121,13 +1125,6 @@ static void hdm_disconnect(struct usb_in if (mdev->dci) device_unregister(&mdev->dci->dev); most_deregister_interface(&mdev->iface); - - kfree(mdev->busy_urbs); - kfree(mdev->cap); - kfree(mdev->conf); - kfree(mdev->ep_address); - put_device(&mdev->dci->dev); - put_device(&mdev->dev); }
static int hdm_suspend(struct usb_interface *interface, pm_message_t message)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Victoria Votokina Victoria.Votokina@kaspersky.com
commit a8cc9e5fcb0e2eef21513a4fec888f5712cb8162 upstream.
The early error path in hdm_probe() can jump to err_free_mdev before &mdev->dev has been initialized with device_initialize(). Calling put_device(&mdev->dev) there triggers a device core WARN and ends up invoking kref_put(&kobj->kref, kobject_release) on an uninitialized kobject.
In this path the private struct was only kmalloc'ed and the intended release is effectively kfree(mdev) anyway, so free it directly instead of calling put_device() on an uninitialized device.
This removes the WARNING and fixes the pre-initialization error path.
Fixes: 97a6f772f36b ("drivers: most: add USB adapter driver") Cc: stable stable@kernel.org Signed-off-by: Victoria Votokina Victoria.Votokina@kaspersky.com Link: https://patch.msgid.link/20251010105241.4087114-3-Victoria.Votokina@kaspersk... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/most/most_usb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/most/most_usb.c +++ b/drivers/most/most_usb.c @@ -1097,7 +1097,7 @@ err_free_cap: err_free_conf: kfree(mdev->conf); err_free_mdev: - put_device(&mdev->dev); + kfree(mdev); return ret; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Artem Shimko a.shimko.dev@gmail.com
commit daeb4037adf7d3349b4a1fb792f4bc9824686a4b upstream.
Check the return value of reset_control_deassert() in the probe function to prevent continuing probe when reset deassertion fails.
Previously, reset_control_deassert() was called without checking its return value, which could lead to probe continuing even when the device reset wasn't properly deasserted.
The fix checks the return value and returns an error with dev_err_probe() if reset deassertion fails, providing better error handling and diagnostics.
Fixes: acbdad8dd1ab ("serial: 8250_dw: simplify optional reset handling") Cc: stable stable@kernel.org Signed-off-by: Artem Shimko a.shimko.dev@gmail.com Link: https://patch.msgid.link/20251019095131.252848-1-a.shimko.dev@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/8250/8250_dw.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/tty/serial/8250/8250_dw.c +++ b/drivers/tty/serial/8250/8250_dw.c @@ -638,7 +638,9 @@ static int dw8250_probe(struct platform_ if (IS_ERR(data->rst)) return PTR_ERR(data->rst);
- reset_control_deassert(data->rst); + err = reset_control_deassert(data->rst); + if (err) + return dev_err_probe(dev, err, "failed to deassert resets\n");
err = devm_add_action_or_reset(dev, dw8250_reset_control_assert, data->rst); if (err)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Eckert fe@dev.tdt.de
commit e7cbce761fe3fcbcb49bcf30d4f8ca5e1a9ee2a0 upstream.
The Advantech 2-port serial card with PCI vendor=0x13fe and device=0x0018 has a 'XR17V35X' chip installed on the circuit board. Therefore, this driver can be used instead of theu outdated out-of-tree driver from the manufacturer.
Signed-off-by: Florian Eckert fe@dev.tdt.de Cc: stable stable@kernel.org Link: https://patch.msgid.link/20250924134115.2667650-1-fe@dev.tdt.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/8250/8250_exar.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/drivers/tty/serial/8250/8250_exar.c +++ b/drivers/tty/serial/8250/8250_exar.c @@ -33,6 +33,8 @@ #define PCI_DEVICE_ID_ACCESSIO_COM_4SM 0x10db #define PCI_DEVICE_ID_ACCESSIO_COM_8SM 0x10ea
+#define PCI_DEVICE_ID_ADVANTECH_XR17V352 0x0018 + #define PCI_DEVICE_ID_COMMTECH_4224PCI335 0x0002 #define PCI_DEVICE_ID_COMMTECH_4222PCI335 0x0004 #define PCI_DEVICE_ID_COMMTECH_2324PCI335 0x000a @@ -841,6 +843,12 @@ static const struct exar8250_board pbn_f .exit = pci_xr17v35x_exit, };
+static const struct exar8250_board pbn_adv_XR17V352 = { + .num_ports = 2, + .setup = pci_xr17v35x_setup, + .exit = pci_xr17v35x_exit, +}; + static const struct exar8250_board pbn_exar_XR17V4358 = { .num_ports = 12, .setup = pci_xr17v35x_setup, @@ -910,6 +918,9 @@ static const struct pci_device_id exar_p USR_DEVICE(XR17C152, 2980, pbn_exar_XR17C15x), USR_DEVICE(XR17C152, 2981, pbn_exar_XR17C15x),
+ /* ADVANTECH devices */ + EXAR_DEVICE(ADVANTECH, XR17V352, pbn_adv_XR17V352), + /* Exar Corp. XR17C15[248] Dual/Quad/Octal UART */ EXAR_DEVICE(EXAR, XR17C152, pbn_exar_XR17C15x), EXAR_DEVICE(EXAR, XR17C154, pbn_exar_XR17C15x),
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
[ Upstream commit 0b737f4ac1d3ec093347241df74bbf5f54a7e16c ]
old_crc is a very misleading name. Rename it to expected_crc as that described the usage much better.
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Carlos Maiolino cem@kernel.org Stable-dep-of: e747883c7d73 ("xfs: fix log CRC mismatches between i386 and other architectures") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_log_recover.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-)
--- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2854,20 +2854,19 @@ xlog_recover_process( int pass, struct list_head *buffer_list) { - __le32 old_crc = rhead->h_crc; - __le32 crc; + __le32 expected_crc = rhead->h_crc, crc;
crc = xlog_cksum(log, rhead, dp, be32_to_cpu(rhead->h_len));
/* * Nothing else to do if this is a CRC verification pass. Just return * if this a record with a non-zero crc. Unfortunately, mkfs always - * sets old_crc to 0 so we must consider this valid even on v5 supers. - * Otherwise, return EFSBADCRC on failure so the callers up the stack - * know precisely what failed. + * sets expected_crc to 0 so we must consider this valid even on v5 + * supers. Otherwise, return EFSBADCRC on failure so the callers up the + * stack know precisely what failed. */ if (pass == XLOG_RECOVER_CRCPASS) { - if (old_crc && crc != old_crc) + if (expected_crc && crc != expected_crc) return -EFSBADCRC; return 0; } @@ -2878,11 +2877,11 @@ xlog_recover_process( * zero CRC check prevents warnings from being emitted when upgrading * the kernel from one that does not add CRCs by default. */ - if (crc != old_crc) { - if (old_crc || xfs_has_crc(log->l_mp)) { + if (crc != expected_crc) { + if (expected_crc || xfs_has_crc(log->l_mp)) { xfs_alert(log->l_mp, "log record CRC mismatch: found 0x%x, expected 0x%x.", - le32_to_cpu(old_crc), + le32_to_cpu(expected_crc), le32_to_cpu(crc)); xfs_hex_dump(dp, 32); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
[ Upstream commit e747883c7d7306acb4d683038d881528fbfbe749 ]
When mounting file systems with a log that was dirtied on i386 on other architectures or vice versa, log recovery is unhappy:
[ 11.068052] XFS (vdb): Torn write (CRC failure) detected at log block 0x2. Truncating head block from 0xc.
This is because the CRCs generated by i386 and other architectures always diff. The reason for that is that sizeof(struct xlog_rec_header) returns different values for i386 vs the rest (324 vs 328), because the struct is not sizeof(uint64_t) aligned, and i386 has odd struct size alignment rules.
This issue goes back to commit 13cdc853c519 ("Add log versioning, and new super block field for the log stripe") in the xfs-import tree, which adds log v2 support and the h_size field that causes the unaligned size. At that time it only mattered for the crude debug only log header checksum, but with commit 0e446be44806 ("xfs: add CRC checks to the log") it became a real issue for v5 file system, because now there is a proper CRC, and regular builds actually expect it match.
Fix this by allowing checksums with and without the padding.
Fixes: 0e446be44806 ("xfs: add CRC checks to the log") Cc: stable@vger.kernel.org # v3.8 Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Carlos Maiolino cem@kernel.org [ Adjust context and file names ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/libxfs/xfs_log_format.h | 30 +++++++++++++++++++++++++++++- fs/xfs/xfs_log.c | 8 ++++---- fs/xfs/xfs_log_priv.h | 4 ++-- fs/xfs/xfs_log_recover.c | 19 +++++++++++++++++-- fs/xfs/xfs_ondisk.h | 2 ++ 5 files changed, 54 insertions(+), 9 deletions(-)
--- a/fs/xfs/libxfs/xfs_log_format.h +++ b/fs/xfs/libxfs/xfs_log_format.h @@ -171,12 +171,40 @@ typedef struct xlog_rec_header { __be32 h_prev_block; /* block number to previous LR : 4 */ __be32 h_num_logops; /* number of log operations in this LR : 4 */ __be32 h_cycle_data[XLOG_HEADER_CYCLE_SIZE / BBSIZE]; - /* new fields */ + + /* fields added by the Linux port: */ __be32 h_fmt; /* format of log record : 4 */ uuid_t h_fs_uuid; /* uuid of FS : 16 */ + + /* fields added for log v2: */ __be32 h_size; /* iclog size : 4 */ + + /* + * When h_size added for log v2 support, it caused structure to have + * a different size on i386 vs all other architectures because the + * sum of the size ofthe member is not aligned by that of the largest + * __be64-sized member, and i386 has really odd struct alignment rules. + * + * Due to the way the log headers are placed out on-disk that alone is + * not a problem becaue the xlog_rec_header always sits alone in a + * BBSIZEs area, and the rest of that area is padded with zeroes. + * But xlog_cksum used to calculate the checksum based on the structure + * size, and thus gives different checksums for i386 vs the rest. + * We now do two checksum validation passes for both sizes to allow + * moving v5 file systems with unclean logs between i386 and other + * (little-endian) architectures. + */ + __u32 h_pad0; } xlog_rec_header_t;
+#ifdef __i386__ +#define XLOG_REC_SIZE offsetofend(struct xlog_rec_header, h_size) +#define XLOG_REC_SIZE_OTHER sizeof(struct xlog_rec_header) +#else +#define XLOG_REC_SIZE sizeof(struct xlog_rec_header) +#define XLOG_REC_SIZE_OTHER offsetofend(struct xlog_rec_header, h_size) +#endif /* __i386__ */ + typedef struct xlog_rec_ext_header { __be32 xh_cycle; /* write cycle of log : 4 */ __be32 xh_cycle_data[XLOG_HEADER_CYCLE_SIZE / BBSIZE]; /* : 256 */ --- a/fs/xfs/xfs_log.c +++ b/fs/xfs/xfs_log.c @@ -1804,13 +1804,13 @@ xlog_cksum( struct xlog *log, struct xlog_rec_header *rhead, char *dp, - int size) + unsigned int hdrsize, + unsigned int size) { uint32_t crc;
/* first generate the crc for the record header ... */ - crc = xfs_start_cksum_update((char *)rhead, - sizeof(struct xlog_rec_header), + crc = xfs_start_cksum_update((char *)rhead, hdrsize, offsetof(struct xlog_rec_header, h_crc));
/* ... then for additional cycle data for v2 logs ... */ @@ -2074,7 +2074,7 @@ xlog_sync(
/* calculcate the checksum */ iclog->ic_header.h_crc = xlog_cksum(log, &iclog->ic_header, - iclog->ic_datap, size); + iclog->ic_datap, XLOG_REC_SIZE, size); /* * Intentionally corrupt the log record CRC based on the error injection * frequency, if defined. This facilitates testing log recovery in the --- a/fs/xfs/xfs_log_priv.h +++ b/fs/xfs/xfs_log_priv.h @@ -498,8 +498,8 @@ xlog_recover_finish( extern void xlog_recover_cancel(struct xlog *);
-extern __le32 xlog_cksum(struct xlog *log, struct xlog_rec_header *rhead, - char *dp, int size); +__le32 xlog_cksum(struct xlog *log, struct xlog_rec_header *rhead, + char *dp, unsigned int hdrsize, unsigned int size);
extern struct kmem_cache *xfs_log_ticket_cache; struct xlog_ticket *xlog_ticket_alloc(struct xlog *log, int unit_bytes, --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2854,9 +2854,24 @@ xlog_recover_process( int pass, struct list_head *buffer_list) { - __le32 expected_crc = rhead->h_crc, crc; + __le32 expected_crc = rhead->h_crc, crc, other_crc;
- crc = xlog_cksum(log, rhead, dp, be32_to_cpu(rhead->h_len)); + crc = xlog_cksum(log, rhead, dp, XLOG_REC_SIZE, + be32_to_cpu(rhead->h_len)); + + /* + * Look at the end of the struct xlog_rec_header definition in + * xfs_log_format.h for the glory details. + */ + if (expected_crc && crc != expected_crc) { + other_crc = xlog_cksum(log, rhead, dp, XLOG_REC_SIZE_OTHER, + be32_to_cpu(rhead->h_len)); + if (other_crc == expected_crc) { + xfs_notice_once(log->l_mp, + "Fixing up incorrect CRC due to padding."); + crc = other_crc; + } + }
/* * Nothing else to do if this is a CRC verification pass. Just return --- a/fs/xfs/xfs_ondisk.h +++ b/fs/xfs/xfs_ondisk.h @@ -142,6 +142,8 @@ xfs_check_ondisk_structs(void) XFS_CHECK_STRUCT_SIZE(struct xfs_rud_log_format, 16); XFS_CHECK_STRUCT_SIZE(struct xfs_map_extent, 32); XFS_CHECK_STRUCT_SIZE(struct xfs_phys_extent, 16); + XFS_CHECK_STRUCT_SIZE(struct xlog_rec_header, 328); + XFS_CHECK_STRUCT_SIZE(struct xlog_rec_ext_header, 260);
XFS_CHECK_OFFSET(struct xfs_bui_log_format, bui_extents, 16); XFS_CHECK_OFFSET(struct xfs_cui_log_format, cui_extents, 16);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tomi Valkeinen tomi.valkeinen@ideasonboard.com
[ Upstream commit 689a54acb56858c85de8c7285db82b8ae6dbf683 ]
The DPHY driver does not return the actual hs_clk_rate, so the DSI driver has no idea what clock was actually achieved. Set the realized hs_clk_rate to the opts struct, so that the DSI driver gets it back.
Reviewed-by: Aradhya Bhatia aradhya.bhatia@linux.dev Tested-by: Parth Pancholi parth.pancholi@toradex.com Tested-by: Jayesh Choudhary j-choudhary@ti.com Acked-by: Vinod Koul vkoul@kernel.org Reviewed-by: Devarsh Thakkar devarsht@ti.com Signed-off-by: Tomi Valkeinen tomi.valkeinen@ideasonboard.com Link: https://lore.kernel.org/r/20250723-cdns-dphy-hs-clk-rate-fix-v1-1-d4539d44cb... Signed-off-by: Vinod Koul vkoul@kernel.org Stable-dep-of: 284fb19a3ffb ("phy: cadence: cdns-dphy: Fix PLL lock and O_CMN_READY polling") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/phy/cadence/cdns-dphy.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/phy/cadence/cdns-dphy.c +++ b/drivers/phy/cadence/cdns-dphy.c @@ -80,6 +80,7 @@ struct cdns_dphy_cfg { u8 pll_ipdiv; u8 pll_opdiv; u16 pll_fbdiv; + u32 hs_clk_rate; unsigned int nlanes; };
@@ -155,6 +156,9 @@ static int cdns_dsi_get_dphy_pll_cfg(str cfg->pll_ipdiv, pll_ref_hz);
+ cfg->hs_clk_rate = div_u64((u64)pll_ref_hz * cfg->pll_fbdiv, + 2 * cfg->pll_opdiv * cfg->pll_ipdiv); + return 0; }
@@ -298,6 +302,7 @@ static int cdns_dphy_config_from_opts(st if (ret) return ret;
+ opts->hs_clk_rate = cfg->hs_clk_rate; opts->wakeup = cdns_dphy_get_wakeup_time_ns(dphy) / 1000;
return 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Devarsh Thakkar devarsht@ti.com
[ Upstream commit 284fb19a3ffb1083c3ad9c00d29749d09dddb99c ]
PLL lockup and O_CMN_READY assertion can only happen after common state machine gets enabled by programming DPHY_CMN_SSM register, but driver was polling them before the common state machine was enabled which is incorrect. This is as per the DPHY initialization sequence as mentioned in J721E TRM [1] at section "12.7.2.4.1.2.1 Start-up Sequence Timing Diagram". It shows O_CMN_READY polling at the end after common configuration pin setup where the common configuration pin setup step enables state machine as referenced in "Table 12-1533. Common Configuration-Related Setup mentions state machine"
To fix this : - Add new function callbacks for polling on PLL lock and O_CMN_READY assertion. - As state machine and clocks get enabled in power_on callback only, move the clock related programming part from configure callback to power_on callback and poll for the PLL lockup and O_CMN_READY assertion after state machine gets enabled. - The configure callback only saves the PLL configuration received from the client driver which will be applied later on in power_on callback. - Add checks to ensure configure is called before power_on and state machine is in disabled state before power_on callback is called. - Disable state machine in power_off so that client driver can re-configure the PLL by following up a power_off, configure, power_on sequence.
[1]: https://www.ti.com/lit/zip/spruil1
Cc: stable@vger.kernel.org Fixes: 7a343c8bf4b5 ("phy: Add Cadence D-PHY support") Signed-off-by: Devarsh Thakkar devarsht@ti.com Tested-by: Harikrishna Shenoy h-shenoy@ti.com Reviewed-by: Tomi Valkeinen tomi.valkeinen@ideasonboard.com Link: https://lore.kernel.org/r/20250704125915.1224738-2-devarsht@ti.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/phy/cadence/cdns-dphy.c | 124 +++++++++++++++++++++++++++++----------- 1 file changed, 92 insertions(+), 32 deletions(-)
--- a/drivers/phy/cadence/cdns-dphy.c +++ b/drivers/phy/cadence/cdns-dphy.c @@ -101,6 +101,8 @@ struct cdns_dphy_ops { void (*set_pll_cfg)(struct cdns_dphy *dphy, const struct cdns_dphy_cfg *cfg); unsigned long (*get_wakeup_time_ns)(struct cdns_dphy *dphy); + int (*wait_for_pll_lock)(struct cdns_dphy *dphy); + int (*wait_for_cmn_ready)(struct cdns_dphy *dphy); };
struct cdns_dphy { @@ -110,6 +112,8 @@ struct cdns_dphy { struct clk *pll_ref_clk; const struct cdns_dphy_ops *ops; struct phy *phy; + bool is_configured; + bool is_powered; };
/* Order of bands is important since the index is the band number. */ @@ -196,6 +200,16 @@ static unsigned long cdns_dphy_get_wakeu return dphy->ops->get_wakeup_time_ns(dphy); }
+static int cdns_dphy_wait_for_pll_lock(struct cdns_dphy *dphy) +{ + return dphy->ops->wait_for_pll_lock ? dphy->ops->wait_for_pll_lock(dphy) : 0; +} + +static int cdns_dphy_wait_for_cmn_ready(struct cdns_dphy *dphy) +{ + return dphy->ops->wait_for_cmn_ready ? dphy->ops->wait_for_cmn_ready(dphy) : 0; +} + static unsigned long cdns_dphy_ref_get_wakeup_time_ns(struct cdns_dphy *dphy) { /* Default wakeup time is 800 ns (in a simulated environment). */ @@ -237,7 +251,6 @@ static unsigned long cdns_dphy_j721e_get static void cdns_dphy_j721e_set_pll_cfg(struct cdns_dphy *dphy, const struct cdns_dphy_cfg *cfg) { - u32 status;
/* * set the PWM and PLL Byteclk divider settings to recommended values @@ -254,13 +267,6 @@ static void cdns_dphy_j721e_set_pll_cfg(
writel(DPHY_TX_J721E_WIZ_LANE_RSTB, dphy->regs + DPHY_TX_J721E_WIZ_RST_CTRL); - - readl_poll_timeout(dphy->regs + DPHY_TX_J721E_WIZ_PLL_CTRL, status, - (status & DPHY_TX_WIZ_PLL_LOCK), 0, POLL_TIMEOUT_US); - - readl_poll_timeout(dphy->regs + DPHY_TX_J721E_WIZ_STATUS, status, - (status & DPHY_TX_WIZ_O_CMN_READY), 0, - POLL_TIMEOUT_US); }
static void cdns_dphy_j721e_set_psm_div(struct cdns_dphy *dphy, u8 div) @@ -268,6 +274,23 @@ static void cdns_dphy_j721e_set_psm_div( writel(div, dphy->regs + DPHY_TX_J721E_WIZ_PSM_FREQ); }
+static int cdns_dphy_j721e_wait_for_pll_lock(struct cdns_dphy *dphy) +{ + u32 status; + + return readl_poll_timeout(dphy->regs + DPHY_TX_J721E_WIZ_PLL_CTRL, status, + status & DPHY_TX_WIZ_PLL_LOCK, 0, POLL_TIMEOUT_US); +} + +static int cdns_dphy_j721e_wait_for_cmn_ready(struct cdns_dphy *dphy) +{ + u32 status; + + return readl_poll_timeout(dphy->regs + DPHY_TX_J721E_WIZ_STATUS, status, + status & DPHY_TX_WIZ_O_CMN_READY, 0, + POLL_TIMEOUT_US); +} + /* * This is the reference implementation of DPHY hooks. Specific integration of * this IP may have to re-implement some of them depending on how they decided @@ -283,6 +306,8 @@ static const struct cdns_dphy_ops j721e_ .get_wakeup_time_ns = cdns_dphy_j721e_get_wakeup_time_ns, .set_pll_cfg = cdns_dphy_j721e_set_pll_cfg, .set_psm_div = cdns_dphy_j721e_set_psm_div, + .wait_for_pll_lock = cdns_dphy_j721e_wait_for_pll_lock, + .wait_for_cmn_ready = cdns_dphy_j721e_wait_for_cmn_ready, };
static int cdns_dphy_config_from_opts(struct phy *phy, @@ -340,21 +365,36 @@ static int cdns_dphy_validate(struct phy static int cdns_dphy_configure(struct phy *phy, union phy_configure_opts *opts) { struct cdns_dphy *dphy = phy_get_drvdata(phy); - struct cdns_dphy_cfg cfg = { 0 }; - int ret, band_ctrl; - unsigned int reg; + int ret;
- ret = cdns_dphy_config_from_opts(phy, &opts->mipi_dphy, &cfg); - if (ret) - return ret; + ret = cdns_dphy_config_from_opts(phy, &opts->mipi_dphy, &dphy->cfg); + if (!ret) + dphy->is_configured = true; + + return ret; +} + +static int cdns_dphy_power_on(struct phy *phy) +{ + struct cdns_dphy *dphy = phy_get_drvdata(phy); + int ret; + u32 reg; + + if (!dphy->is_configured || dphy->is_powered) + return -EINVAL; + + clk_prepare_enable(dphy->psm_clk); + clk_prepare_enable(dphy->pll_ref_clk);
/* * Configure the internal PSM clk divider so that the DPHY has a * 1MHz clk (or something close). */ ret = cdns_dphy_setup_psm(dphy); - if (ret) - return ret; + if (ret) { + dev_err(&dphy->phy->dev, "Failed to setup PSM with error %d\n", ret); + goto err_power_on; + }
/* * Configure attach clk lanes to data lanes: the DPHY has 2 clk lanes @@ -369,40 +409,60 @@ static int cdns_dphy_configure(struct ph * Configure the DPHY PLL that will be used to generate the TX byte * clk. */ - cdns_dphy_set_pll_cfg(dphy, &cfg); + cdns_dphy_set_pll_cfg(dphy, &dphy->cfg);
- band_ctrl = cdns_dphy_tx_get_band_ctrl(opts->mipi_dphy.hs_clk_rate); - if (band_ctrl < 0) - return band_ctrl; + ret = cdns_dphy_tx_get_band_ctrl(dphy->cfg.hs_clk_rate); + if (ret < 0) { + dev_err(&dphy->phy->dev, "Failed to get band control value with error %d\n", ret); + goto err_power_on; + }
- reg = FIELD_PREP(DPHY_BAND_CFG_LEFT_BAND, band_ctrl) | - FIELD_PREP(DPHY_BAND_CFG_RIGHT_BAND, band_ctrl); + reg = FIELD_PREP(DPHY_BAND_CFG_LEFT_BAND, ret) | + FIELD_PREP(DPHY_BAND_CFG_RIGHT_BAND, ret); writel(reg, dphy->regs + DPHY_BAND_CFG);
- return 0; -} - -static int cdns_dphy_power_on(struct phy *phy) -{ - struct cdns_dphy *dphy = phy_get_drvdata(phy); - - clk_prepare_enable(dphy->psm_clk); - clk_prepare_enable(dphy->pll_ref_clk); - /* Start TX state machine. */ writel(DPHY_CMN_SSM_EN | DPHY_CMN_TX_MODE_EN, dphy->regs + DPHY_CMN_SSM);
+ ret = cdns_dphy_wait_for_pll_lock(dphy); + if (ret) { + dev_err(&dphy->phy->dev, "Failed to lock PLL with error %d\n", ret); + goto err_power_on; + } + + ret = cdns_dphy_wait_for_cmn_ready(dphy); + if (ret) { + dev_err(&dphy->phy->dev, "O_CMN_READY signal failed to assert with error %d\n", + ret); + goto err_power_on; + } + + dphy->is_powered = true; + return 0; + +err_power_on: + clk_disable_unprepare(dphy->pll_ref_clk); + clk_disable_unprepare(dphy->psm_clk); + + return ret; }
static int cdns_dphy_power_off(struct phy *phy) { struct cdns_dphy *dphy = phy_get_drvdata(phy); + u32 reg;
clk_disable_unprepare(dphy->pll_ref_clk); clk_disable_unprepare(dphy->psm_clk);
+ /* Stop TX state machine. */ + reg = readl(dphy->regs + DPHY_CMN_SSM); + writel(reg & ~DPHY_CMN_SSM_EN, dphy->regs + DPHY_CMN_SSM); + + dphy->is_powered = false; + return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bence Csókás csokas.bence@prolan.hu
[ Upstream commit 73db799bf5efc5a04654bb3ff6c9bf63a0dfa473 ]
Add `devm_pm_runtime_set_active_enabled()` and `devm_pm_runtime_get_noresume()` for simplifying common cases in drivers.
Signed-off-by: Bence Csókás csokas.bence@prolan.hu Link: https://patch.msgid.link/20250327195928.680771-3-csokas.bence@prolan.hu Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Stable-dep-of: 0792c1984a45 ("iio: imu: inv_icm42600: Simplify pm_runtime setup") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/power/runtime.c | 44 +++++++++++++++++++++++++++++++++++++++++++ include/linux/pm_runtime.h | 4 +++ 2 files changed, 48 insertions(+)
--- a/drivers/base/power/runtime.c +++ b/drivers/base/power/runtime.c @@ -1512,6 +1512,32 @@ out: } EXPORT_SYMBOL_GPL(pm_runtime_enable);
+static void pm_runtime_set_suspended_action(void *data) +{ + pm_runtime_set_suspended(data); +} + +/** + * devm_pm_runtime_set_active_enabled - set_active version of devm_pm_runtime_enable. + * + * @dev: Device to handle. + */ +int devm_pm_runtime_set_active_enabled(struct device *dev) +{ + int err; + + err = pm_runtime_set_active(dev); + if (err) + return err; + + err = devm_add_action_or_reset(dev, pm_runtime_set_suspended_action, dev); + if (err) + return err; + + return devm_pm_runtime_enable(dev); +} +EXPORT_SYMBOL_GPL(devm_pm_runtime_set_active_enabled); + static void pm_runtime_disable_action(void *data) { pm_runtime_dont_use_autosuspend(data); @@ -1534,6 +1560,24 @@ int devm_pm_runtime_enable(struct device } EXPORT_SYMBOL_GPL(devm_pm_runtime_enable);
+static void pm_runtime_put_noidle_action(void *data) +{ + pm_runtime_put_noidle(data); +} + +/** + * devm_pm_runtime_get_noresume - devres-enabled version of pm_runtime_get_noresume. + * + * @dev: Device to handle. + */ +int devm_pm_runtime_get_noresume(struct device *dev) +{ + pm_runtime_get_noresume(dev); + + return devm_add_action_or_reset(dev, pm_runtime_put_noidle_action, dev); +} +EXPORT_SYMBOL_GPL(devm_pm_runtime_get_noresume); + /** * pm_runtime_forbid - Block runtime PM of a device. * @dev: Device to handle. --- a/include/linux/pm_runtime.h +++ b/include/linux/pm_runtime.h @@ -95,7 +95,9 @@ extern void pm_runtime_new_link(struct d extern void pm_runtime_drop_link(struct device_link *link); extern void pm_runtime_release_supplier(struct device_link *link);
+int devm_pm_runtime_set_active_enabled(struct device *dev); extern int devm_pm_runtime_enable(struct device *dev); +int devm_pm_runtime_get_noresume(struct device *dev);
/** * pm_runtime_get_if_in_use - Conditionally bump up runtime PM usage counter. @@ -292,7 +294,9 @@ static inline void __pm_runtime_disable( static inline void pm_runtime_allow(struct device *dev) {} static inline void pm_runtime_forbid(struct device *dev) {}
+static inline int devm_pm_runtime_set_active_enabled(struct device *dev) { return 0; } static inline int devm_pm_runtime_enable(struct device *dev) { return 0; } +static inline int devm_pm_runtime_get_noresume(struct device *dev) { return 0; }
static inline void pm_suspend_ignore_children(struct device *dev, bool enable) {} static inline void pm_runtime_get_noresume(struct device *dev) {}
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Nyekjaer sean@geanix.com
[ Upstream commit 0792c1984a45ccd7a296d6b8cb78088bc99a212e ]
Rework the power management in inv_icm42600_core_probe() to use devm_pm_runtime_set_active_enabled(), which simplifies the runtime PM setup by handling activation and enabling in one step. Remove the separate inv_icm42600_disable_pm callback, as it's no longer needed with the devm-managed approach. Using devm_pm_runtime_enable() also fixes the missing disable of autosuspend. Update inv_icm42600_disable_vddio_reg() to only disable the regulator if the device is not suspended i.e. powered-down, preventing unbalanced disables. Also remove redundant error msg on regulator_disable(), the regulator framework already emits an error message when regulator_disable() fails.
This simplifies the PM setup and avoids manipulating the usage counter unnecessarily.
Fixes: 31c24c1e93c3 ("iio: imu: inv_icm42600: add core of new inv_icm42600 driver") Signed-off-by: Sean Nyekjaer sean@geanix.com Link: https://patch.msgid.link/20250901-icm42pmreg-v3-1-ef1336246960@geanix.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/imu/inv_icm42600/inv_icm42600_core.c | 24 ++++++----------------- 1 file changed, 7 insertions(+), 17 deletions(-)
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c +++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c @@ -550,20 +550,12 @@ static void inv_icm42600_disable_vdd_reg static void inv_icm42600_disable_vddio_reg(void *_data) { struct inv_icm42600_state *st = _data; - const struct device *dev = regmap_get_device(st->map); - int ret; - - ret = regulator_disable(st->vddio_supply); - if (ret) - dev_err(dev, "failed to disable vddio error %d\n", ret); -} + struct device *dev = regmap_get_device(st->map);
-static void inv_icm42600_disable_pm(void *_data) -{ - struct device *dev = _data; + if (pm_runtime_status_suspended(dev)) + return;
- pm_runtime_put_sync(dev); - pm_runtime_disable(dev); + regulator_disable(st->vddio_supply); }
int inv_icm42600_core_probe(struct regmap *regmap, int chip, int irq, @@ -660,16 +652,14 @@ int inv_icm42600_core_probe(struct regma return ret;
/* setup runtime power management */ - ret = pm_runtime_set_active(dev); + ret = devm_pm_runtime_set_active_enabled(dev); if (ret) return ret; - pm_runtime_get_noresume(dev); - pm_runtime_enable(dev); + pm_runtime_set_autosuspend_delay(dev, INV_ICM42600_SUSPEND_DELAY_MS); pm_runtime_use_autosuspend(dev); - pm_runtime_put(dev);
- return devm_add_action_or_reset(dev, inv_icm42600_disable_pm, dev); + return ret; } EXPORT_SYMBOL_GPL(inv_icm42600_core_probe);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Lechner dlechner@baylibre.com
[ Upstream commit 352112e2d9aab6a156c2803ae14eb89a9fd93b7d ]
Use { } instead of memset() to zero-initialize stack memory to simplify the code.
Signed-off-by: David Lechner dlechner@baylibre.com Reviewed-by: Nuno Sá nuno.sa@analog.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://patch.msgid.link/20250611-iio-zero-init-stack-with-instead-of-memset... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Stable-dep-of: 466f7a2fef2a ("iio: imu: inv_icm42600: Avoid configuring if already pm_runtime suspended") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c | 5 ++--- drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c | 5 ++--- 2 files changed, 4 insertions(+), 6 deletions(-)
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c +++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c @@ -748,7 +748,8 @@ int inv_icm42600_accel_parse_fifo(struct const int8_t *temp; unsigned int odr; int64_t ts_val; - struct inv_icm42600_accel_buffer buffer; + /* buffer is copied to userspace, zeroing it to avoid any data leak */ + struct inv_icm42600_accel_buffer buffer = { };
/* parse all fifo packets */ for (i = 0, no = 0; i < st->fifo.count; i += size, ++no) { @@ -767,8 +768,6 @@ int inv_icm42600_accel_parse_fifo(struct inv_icm42600_timestamp_apply_odr(ts, st->fifo.period, st->fifo.nb.total, no);
- /* buffer is copied to userspace, zeroing it to avoid any data leak */ - memset(&buffer, 0, sizeof(buffer)); memcpy(&buffer.accel, accel, sizeof(buffer.accel)); /* convert 8 bits FIFO temperature in high resolution format */ buffer.temp = temp ? (*temp * 64) : 0; --- a/drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c +++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c @@ -760,7 +760,8 @@ int inv_icm42600_gyro_parse_fifo(struct const int8_t *temp; unsigned int odr; int64_t ts_val; - struct inv_icm42600_gyro_buffer buffer; + /* buffer is copied to userspace, zeroing it to avoid any data leak */ + struct inv_icm42600_gyro_buffer buffer = { };
/* parse all fifo packets */ for (i = 0, no = 0; i < st->fifo.count; i += size, ++no) { @@ -779,8 +780,6 @@ int inv_icm42600_gyro_parse_fifo(struct inv_icm42600_timestamp_apply_odr(ts, st->fifo.period, st->fifo.nb.total, no);
- /* buffer is copied to userspace, zeroing it to avoid any data leak */ - memset(&buffer, 0, sizeof(buffer)); memcpy(&buffer.gyro, gyro, sizeof(buffer.gyro)); /* convert 8 bits FIFO temperature in high resolution format */ buffer.temp = temp ? (*temp * 64) : 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Nyekjaer sean@geanix.com
[ Upstream commit 466f7a2fef2a4e426f809f79845a1ec1aeb558f4 ]
Do as in suspend, skip resume configuration steps if the device is already pm_runtime suspended. This avoids reconfiguring a device that is already in the correct low-power state and ensures that pm_runtime handles the power state transitions properly.
Fixes: 31c24c1e93c3 ("iio: imu: inv_icm42600: add core of new inv_icm42600 driver") Signed-off-by: Sean Nyekjaer sean@geanix.com Link: https://patch.msgid.link/20250901-icm42pmreg-v3-3-ef1336246960@geanix.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com [ adjusted context for suspend/resume functions lacking APEX/wakeup support ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/imu/inv_icm42600/inv_icm42600_core.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c +++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c @@ -670,17 +670,15 @@ EXPORT_SYMBOL_GPL(inv_icm42600_core_prob static int __maybe_unused inv_icm42600_suspend(struct device *dev) { struct inv_icm42600_state *st = dev_get_drvdata(dev); - int ret; + int ret = 0;
mutex_lock(&st->lock);
st->suspended.gyro = st->conf.gyro.mode; st->suspended.accel = st->conf.accel.mode; st->suspended.temp = st->conf.temp_en; - if (pm_runtime_suspended(dev)) { - ret = 0; + if (pm_runtime_suspended(dev)) goto out_unlock; - }
/* disable FIFO data streaming */ if (st->fifo.on) { @@ -712,10 +710,13 @@ static int __maybe_unused inv_icm42600_r struct inv_icm42600_state *st = dev_get_drvdata(dev); struct inv_icm42600_timestamp *gyro_ts = iio_priv(st->indio_gyro); struct inv_icm42600_timestamp *accel_ts = iio_priv(st->indio_accel); - int ret; + int ret = 0;
mutex_lock(&st->lock);
+ if (pm_runtime_suspended(dev)) + goto out_unlock; + ret = inv_icm42600_enable_regulator_vddio(st); if (ret) goto out_unlock;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiao Liang shaw.leon@gmail.com
[ Upstream commit 501302d5cee0d8e8ec2c4a5919c37e0df9abc99b ]
When seq_nr wraps around, the next reorder job with seq 0 is hashed to the first CPU in padata_do_serial(). Correspondingly, need reset pd->cpu to the first one when pd->processed wraps around. Otherwise, if the number of used CPUs is not a power of 2, padata_find_next() will be checking a wrong list, hence deadlock.
Fixes: 6fc4dbcf0276 ("padata: Replace delayed timer with immediate workqueue in padata_reorder") Cc: stable@vger.kernel.org Signed-off-by: Xiao Liang shaw.leon@gmail.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au [ applied fix in padata_find_next() instead of padata_reorder() ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/padata.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/kernel/padata.c +++ b/kernel/padata.c @@ -282,7 +282,11 @@ static struct padata_priv *padata_find_n if (remove_object) { list_del_init(&padata->list); ++pd->processed; - pd->cpu = cpumask_next_wrap(cpu, pd->cpumask.pcpu, -1, false); + /* When sequence wraps around, reset to the first CPU. */ + if (unlikely(pd->processed == 0)) + pd->cpu = cpumask_first(pd->cpumask.pcpu); + else + pd->cpu = cpumask_next_wrap(cpu, pd->cpumask.pcpu, -1, false); }
spin_unlock(&reorder->lock);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit e26ee4efbc79610b20e7abe9d96c87f33dacc1ff ]
This removed the need to pass isdir argument to fuse_put_file().
Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Stable-dep-of: 26e5c67deb2e ("fuse: fix livelock in synchronous file put from fuseblk workers") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/dir.c | 2 - fs/fuse/file.c | 69 +++++++++++++++++++++++++++++++------------------------ fs/fuse/fuse_i.h | 2 - 3 files changed, 41 insertions(+), 32 deletions(-)
--- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -584,7 +584,7 @@ static int fuse_create_open(struct inode goto out_err;
err = -ENOMEM; - ff = fuse_file_alloc(fm); + ff = fuse_file_alloc(fm, true); if (!ff) goto out_put_forget_req;
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -54,7 +54,7 @@ struct fuse_release_args { struct inode *inode; };
-struct fuse_file *fuse_file_alloc(struct fuse_mount *fm) +struct fuse_file *fuse_file_alloc(struct fuse_mount *fm, bool release) { struct fuse_file *ff;
@@ -63,11 +63,13 @@ struct fuse_file *fuse_file_alloc(struct return NULL;
ff->fm = fm; - ff->release_args = kzalloc(sizeof(*ff->release_args), - GFP_KERNEL_ACCOUNT); - if (!ff->release_args) { - kfree(ff); - return NULL; + if (release) { + ff->release_args = kzalloc(sizeof(*ff->release_args), + GFP_KERNEL_ACCOUNT); + if (!ff->release_args) { + kfree(ff); + return NULL; + } }
INIT_LIST_HEAD(&ff->write_entry); @@ -103,14 +105,14 @@ static void fuse_release_end(struct fuse kfree(ra); }
-static void fuse_file_put(struct fuse_file *ff, bool sync, bool isdir) +static void fuse_file_put(struct fuse_file *ff, bool sync) { if (refcount_dec_and_test(&ff->count)) { - struct fuse_args *args = &ff->release_args->args; + struct fuse_release_args *ra = ff->release_args; + struct fuse_args *args = (ra ? &ra->args : NULL);
- if (isdir ? ff->fm->fc->no_opendir : ff->fm->fc->no_open) { - /* Do nothing when client does not implement 'open' */ - fuse_release_end(ff->fm, args, 0); + if (!args) { + /* Do nothing when server does not implement 'open' */ } else if (sync) { fuse_simple_request(ff->fm, args); fuse_release_end(ff->fm, args, 0); @@ -130,15 +132,16 @@ struct fuse_file *fuse_file_open(struct struct fuse_conn *fc = fm->fc; struct fuse_file *ff; int opcode = isdir ? FUSE_OPENDIR : FUSE_OPEN; + bool open = isdir ? !fc->no_opendir : !fc->no_open;
- ff = fuse_file_alloc(fm); + ff = fuse_file_alloc(fm, open); if (!ff) return ERR_PTR(-ENOMEM);
ff->fh = 0; /* Default for no-open */ ff->open_flags = FOPEN_KEEP_CACHE | (isdir ? FOPEN_CACHE_DIR : 0); - if (isdir ? !fc->no_opendir : !fc->no_open) { + if (open) { struct fuse_open_out outarg; int err;
@@ -146,11 +149,13 @@ struct fuse_file *fuse_file_open(struct if (!err) { ff->fh = outarg.fh; ff->open_flags = outarg.open_flags; - } else if (err != -ENOSYS) { fuse_file_free(ff); return ERR_PTR(err); } else { + /* No release needed */ + kfree(ff->release_args); + ff->release_args = NULL; if (isdir) fc->no_opendir = 1; else @@ -272,7 +277,7 @@ out_inode_unlock: }
static void fuse_prepare_release(struct fuse_inode *fi, struct fuse_file *ff, - unsigned int flags, int opcode) + unsigned int flags, int opcode, bool sync) { struct fuse_conn *fc = ff->fm->fc; struct fuse_release_args *ra = ff->release_args; @@ -290,6 +295,9 @@ static void fuse_prepare_release(struct
wake_up_interruptible_all(&ff->poll_wait);
+ if (!ra) + return; + ra->inarg.fh = ff->fh; ra->inarg.flags = flags; ra->args.in_numargs = 1; @@ -299,6 +307,13 @@ static void fuse_prepare_release(struct ra->args.nodeid = ff->nodeid; ra->args.force = true; ra->args.nocreds = true; + + /* + * Hold inode until release is finished. + * From fuse_sync_release() the refcount is 1 and everything's + * synchronous, so we are fine with not doing igrab() here. + */ + ra->inode = sync ? NULL : igrab(&fi->inode); }
void fuse_file_release(struct inode *inode, struct fuse_file *ff, @@ -308,14 +323,12 @@ void fuse_file_release(struct inode *ino struct fuse_release_args *ra = ff->release_args; int opcode = isdir ? FUSE_RELEASEDIR : FUSE_RELEASE;
- fuse_prepare_release(fi, ff, open_flags, opcode); + fuse_prepare_release(fi, ff, open_flags, opcode, false);
- if (ff->flock) { + if (ra && ff->flock) { ra->inarg.release_flags |= FUSE_RELEASE_FLOCK_UNLOCK; ra->inarg.lock_owner = fuse_lock_owner_id(ff->fm->fc, id); } - /* Hold inode until release is finished */ - ra->inode = igrab(inode);
/* * Normally this will send the RELEASE request, however if @@ -326,7 +339,7 @@ void fuse_file_release(struct inode *ino * synchronous RELEASE is allowed (and desirable) in this case * because the server can be trusted not to screw up. */ - fuse_file_put(ff, ff->fm->fc->destroy, isdir); + fuse_file_put(ff, ff->fm->fc->destroy); }
void fuse_release_common(struct file *file, bool isdir) @@ -361,12 +374,8 @@ void fuse_sync_release(struct fuse_inode unsigned int flags) { WARN_ON(refcount_read(&ff->count) > 1); - fuse_prepare_release(fi, ff, flags, FUSE_RELEASE); - /* - * iput(NULL) is a no-op and since the refcount is 1 and everything's - * synchronous, we are fine with not doing igrab() here" - */ - fuse_file_put(ff, true, false); + fuse_prepare_release(fi, ff, flags, FUSE_RELEASE, true); + fuse_file_put(ff, true); } EXPORT_SYMBOL_GPL(fuse_sync_release);
@@ -923,7 +932,7 @@ static void fuse_readpages_end(struct fu put_page(page); } if (ia->ff) - fuse_file_put(ia->ff, false, false); + fuse_file_put(ia->ff, false);
fuse_io_free(ia); } @@ -1670,7 +1679,7 @@ static void fuse_writepage_free(struct f __free_page(ap->pages[i]);
if (wpa->ia.ff) - fuse_file_put(wpa->ia.ff, false, false); + fuse_file_put(wpa->ia.ff, false);
kfree(ap->pages); kfree(wpa); @@ -1918,7 +1927,7 @@ int fuse_write_inode(struct inode *inode ff = __fuse_write_file_get(fi); err = fuse_flush_times(inode, ff); if (ff) - fuse_file_put(ff, false, false); + fuse_file_put(ff, false);
return err; } @@ -2316,7 +2325,7 @@ static int fuse_writepages(struct addres fuse_writepages_send(&data); } if (data.ff) - fuse_file_put(data.ff, false, false); + fuse_file_put(data.ff, false);
kfree(data.orig_pages); out: --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -1022,7 +1022,7 @@ void fuse_read_args_fill(struct fuse_io_ */ int fuse_open_common(struct inode *inode, struct file *file, bool isdir);
-struct fuse_file *fuse_file_alloc(struct fuse_mount *fm); +struct fuse_file *fuse_file_alloc(struct fuse_mount *fm, bool release); void fuse_file_free(struct fuse_file *ff); void fuse_finish_open(struct inode *inode, struct file *file);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Darrick J. Wong" djwong@kernel.org
[ Upstream commit 26e5c67deb2e1f42a951f022fdf5b9f7eb747b01 ]
I observed a hang when running generic/323 against a fuseblk server. This test opens a file, initiates a lot of AIO writes to that file descriptor, and closes the file descriptor before the writes complete. Unsurprisingly, the AIO exerciser threads are mostly stuck waiting for responses from the fuseblk server:
# cat /proc/372265/task/372313/stack [<0>] request_wait_answer+0x1fe/0x2a0 [fuse] [<0>] __fuse_simple_request+0xd3/0x2b0 [fuse] [<0>] fuse_do_getattr+0xfc/0x1f0 [fuse] [<0>] fuse_file_read_iter+0xbe/0x1c0 [fuse] [<0>] aio_read+0x130/0x1e0 [<0>] io_submit_one+0x542/0x860 [<0>] __x64_sys_io_submit+0x98/0x1a0 [<0>] do_syscall_64+0x37/0xf0 [<0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53
But the /weird/ part is that the fuseblk server threads are waiting for responses from itself:
# cat /proc/372210/task/372232/stack [<0>] request_wait_answer+0x1fe/0x2a0 [fuse] [<0>] __fuse_simple_request+0xd3/0x2b0 [fuse] [<0>] fuse_file_put+0x9a/0xd0 [fuse] [<0>] fuse_release+0x36/0x50 [fuse] [<0>] __fput+0xec/0x2b0 [<0>] task_work_run+0x55/0x90 [<0>] syscall_exit_to_user_mode+0xe9/0x100 [<0>] do_syscall_64+0x43/0xf0 [<0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53
The fuseblk server is fuse2fs so there's nothing all that exciting in the server itself. So why is the fuse server calling fuse_file_put? The commit message for the fstest sheds some light on that:
"By closing the file descriptor before calling io_destroy, you pretty much guarantee that the last put on the ioctx will be done in interrupt context (during I/O completion).
Aha. AIO fgets a new struct file from the fd when it queues the ioctx. The completion of the FUSE_WRITE command from userspace causes the fuse server to call the AIO completion function. The completion puts the struct file, queuing a delayed fput to the fuse server task. When the fuse server task returns to userspace, it has to run the delayed fput, which in the case of a fuseblk server, it does synchronously.
Sending the FUSE_RELEASE command sychronously from fuse server threads is a bad idea because a client program can initiate enough simultaneous AIOs such that all the fuse server threads end up in delayed_fput, and now there aren't any threads left to handle the queued fuse commands.
Fix this by only using asynchronous fputs when closing files, and leave a comment explaining why.
Cc: stable@vger.kernel.org # v2.6.38 Fixes: 5a18ec176c934c ("fuse: fix hang of single threaded fuseblk filesystem") Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/file.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -338,8 +338,14 @@ void fuse_file_release(struct inode *ino * Make the release synchronous if this is a fuseblk mount, * synchronous RELEASE is allowed (and desirable) in this case * because the server can be trusted not to screw up. + * + * Always use the asynchronous file put because the current thread + * might be the fuse server. This can happen if a process starts some + * aio and closes the fd before the aio completes. Since aio takes its + * own ref to the file, the IO completion has to drop the ref, which is + * how the fuse server can end up closing its clients' files. */ - fuse_file_put(ff, ff->fm->fc->destroy); + fuse_file_put(ff, false); }
void fuse_release_common(struct file *file, bool isdir)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Catalin Marinas catalin.marinas@arm.com
[ Upstream commit f620d66af3165838bfa845dcf9f5f9b4089bf508 ]
Commit 68d54ceeec0e ("arm64: mte: Allow PTRACE_PEEKMTETAGS access to the zero page") attempted to fix ptrace() reading of tags from the zero page by marking it as PG_mte_tagged during cpu_enable_mte(). The same commit also changed the ptrace() tag access permission check to the VM_MTE vma flag while turning the page flag test into a WARN_ON_ONCE().
Attempting to set the PG_mte_tagged flag early with CONFIG_DEFERRED_STRUCT_PAGE_INIT enabled may either hang (after commit d77e59a8fccd "arm64: mte: Lock a page for MTE tag initialisation") or have the flags cleared later during page_alloc_init_late(). In addition, pages_identical() -> memcmp_pages() will reject any comparison with the zero page as it is marked as tagged.
Partially revert the above commit to avoid setting PG_mte_tagged on the zero page. Update the __access_remote_tags() warning on untagged pages to ignore the zero page since it is known to have the tags initialised.
Note that all user mapping of the zero page are marked as pte_special(). The arm64 set_pte_at() will not call mte_sync_tags() on such pages, so PG_mte_tagged will remain cleared.
Signed-off-by: Catalin Marinas catalin.marinas@arm.com Fixes: 68d54ceeec0e ("arm64: mte: Allow PTRACE_PEEKMTETAGS access to the zero page") Reported-by: Gergely Kovacs Gergely.Kovacs2@arm.com Cc: stable@vger.kernel.org # 5.10.x Cc: Will Deacon will@kernel.org Cc: David Hildenbrand david@redhat.com Cc: Lance Yang lance.yang@linux.dev Acked-by: Lance Yang lance.yang@linux.dev Reviewed-by: David Hildenbrand david@redhat.com Tested-by: Lance Yang lance.yang@linux.dev Signed-off-by: Will Deacon will@kernel.org [ removed folio-based hugetlb MTE checks ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/cpufeature.c | 10 +++++++--- arch/arm64/kernel/mte.c | 2 +- 2 files changed, 8 insertions(+), 4 deletions(-)
--- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -2068,17 +2068,21 @@ static void bti_enable(const struct arm6 #ifdef CONFIG_ARM64_MTE static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap) { + static bool cleared_zero_page = false; + sysreg_clear_set(sctlr_el1, 0, SCTLR_ELx_ATA | SCTLR_EL1_ATA0);
mte_cpu_setup();
/* * Clear the tags in the zero page. This needs to be done via the - * linear map which has the Tagged attribute. + * linear map which has the Tagged attribute. Since this page is + * always mapped as pte_special(), set_pte_at() will not attempt to + * clear the tags or set PG_mte_tagged. */ - if (!page_mte_tagged(ZERO_PAGE(0))) { + if (!cleared_zero_page) { + cleared_zero_page = true; mte_clear_page_tags(lm_alias(empty_zero_page)); - set_page_mte_tagged(ZERO_PAGE(0)); }
kasan_init_hw_tags_cpu(); --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -456,7 +456,7 @@ static int __access_remote_tags(struct m put_page(page); break; } - WARN_ON_ONCE(!page_mte_tagged(page)); + WARN_ON_ONCE(!page_mte_tagged(page) && !is_zero_page(page));
/* limit access to the end of the page */ offset = offset_in_page(addr);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Siddharth Vadapalli s-vadapalli@ti.com
[ Upstream commit 82c4be4168e26a5593aaa1002b5678128a638824 ]
The ACSPCIE module is capable of driving the reference clock required by the PCIe Endpoint device. It is an alternative to on-board and external reference clock generators. Enabling the output from the ACSPCIE module's PAD IO Buffers requires clearing the "PAD IO disable" bits of the ACSPCIE_PROXY_CTRL register in the CTRL_MMR register space.
Add support to enable the ACSPCIE reference clock output using the optional device-tree property "ti,syscon-acspcie-proxy-ctrl".
Link: https://lore.kernel.org/linux-pci/20240829105316.1483684-3-s-vadapalli@ti.co... Signed-off-by: Siddharth Vadapalli s-vadapalli@ti.com Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Reviewed-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Stable-dep-of: f842d3313ba1 ("PCI: j721e: Fix programming sequence of "strap" settings") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/cadence/pci-j721e.c | 39 ++++++++++++++++++++++++++++- 1 file changed, 38 insertions(+), 1 deletion(-)
--- a/drivers/pci/controller/cadence/pci-j721e.c +++ b/drivers/pci/controller/cadence/pci-j721e.c @@ -46,6 +46,7 @@ enum link_status { #define LANE_COUNT_MASK BIT(8) #define LANE_COUNT(n) ((n) << 8)
+#define ACSPCIE_PAD_DISABLE_MASK GENMASK(1, 0) #define GENERATION_SEL_MASK GENMASK(1, 0)
#define MAX_LANES 2 @@ -218,6 +219,36 @@ static int j721e_pcie_set_lane_count(str return ret; }
+static int j721e_enable_acspcie_refclk(struct j721e_pcie *pcie, + struct regmap *syscon) +{ + struct device *dev = pcie->cdns_pcie->dev; + struct device_node *node = dev->of_node; + u32 mask = ACSPCIE_PAD_DISABLE_MASK; + struct of_phandle_args args; + u32 val; + int ret; + + ret = of_parse_phandle_with_fixed_args(node, + "ti,syscon-acspcie-proxy-ctrl", + 1, 0, &args); + if (ret) { + dev_err(dev, + "ti,syscon-acspcie-proxy-ctrl has invalid arguments\n"); + return ret; + } + + /* Clear PAD IO disable bits to enable refclk output */ + val = ~(args.args[0]); + ret = regmap_update_bits(syscon, 0, mask, val); + if (ret) { + dev_err(dev, "failed to enable ACSPCIE refclk: %d\n", ret); + return ret; + } + + return 0; +} + static int j721e_pcie_ctrl_init(struct j721e_pcie *pcie) { struct device *dev = pcie->cdns_pcie->dev; @@ -257,7 +288,13 @@ static int j721e_pcie_ctrl_init(struct j return ret; }
- return 0; + /* Enable ACSPCIE refclk output if the optional property exists */ + syscon = syscon_regmap_lookup_by_phandle_optional(node, + "ti,syscon-acspcie-proxy-ctrl"); + if (!syscon) + return 0; + + return j721e_enable_acspcie_refclk(pcie, syscon); }
static int cdns_ti_pcie_config_read(struct pci_bus *bus, unsigned int devfn,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Siddharth Vadapalli s-vadapalli@ti.com
[ Upstream commit f842d3313ba179d4005096357289c7ad09cec575 ]
The Cadence PCIe Controller integrated in the TI K3 SoCs supports both Root-Complex and Endpoint modes of operation. The Glue Layer allows "strapping" the Mode of operation of the Controller, the Link Speed and the Link Width. This is enabled by programming the "PCIEn_CTRL" register (n corresponds to the PCIe instance) within the CTRL_MMR memory-mapped register space. The "reset-values" of the registers are also different depending on the mode of operation.
Since the PCIe Controller latches onto the "reset-values" immediately after being powered on, if the Glue Layer configuration is not done while the PCIe Controller is off, it will result in the PCIe Controller latching onto the wrong "reset-values". In practice, this will show up as a wrong representation of the PCIe Controller's capability structures in the PCIe Configuration Space. Some such capabilities which are supported by the PCIe Controller in the Root-Complex mode but are incorrectly latched onto as being unsupported are: - Link Bandwidth Notification - Alternate Routing ID (ARI) Forwarding Support - Next capability offset within Advanced Error Reporting (AER) capability
Fix this by powering off the PCIe Controller before programming the "strap" settings and powering it on after that. The runtime PM APIs namely pm_runtime_put_sync() and pm_runtime_get_sync() will decrement and increment the usage counter respectively, causing GENPD to power off and power on the PCIe Controller.
Fixes: f3e25911a430 ("PCI: j721e: Add TI J721E PCIe driver") Signed-off-by: Siddharth Vadapalli s-vadapalli@ti.com Signed-off-by: Manivannan Sadhasivam mani@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250908120828.1471776-1-s-vadapalli@ti.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/cadence/pci-j721e.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)
--- a/drivers/pci/controller/cadence/pci-j721e.c +++ b/drivers/pci/controller/cadence/pci-j721e.c @@ -270,6 +270,25 @@ static int j721e_pcie_ctrl_init(struct j if (!ret) offset = args.args[0];
+ /* + * The PCIe Controller's registers have different "reset-values" + * depending on the "strap" settings programmed into the PCIEn_CTRL + * register within the CTRL_MMR memory-mapped register space. + * The registers latch onto a "reset-value" based on the "strap" + * settings sampled after the PCIe Controller is powered on. + * To ensure that the "reset-values" are sampled accurately, power + * off the PCIe Controller before programming the "strap" settings + * and power it on after that. The runtime PM APIs namely + * pm_runtime_put_sync() and pm_runtime_get_sync() will decrement and + * increment the usage counter respectively, causing GENPD to power off + * and power on the PCIe Controller. + */ + ret = pm_runtime_put_sync(dev); + if (ret < 0) { + dev_err(dev, "Failed to power off PCIe Controller\n"); + return ret; + } + ret = j721e_pcie_set_mode(pcie, syscon, offset); if (ret < 0) { dev_err(dev, "Failed to set pci mode\n"); @@ -288,6 +307,12 @@ static int j721e_pcie_ctrl_init(struct j return ret; }
+ ret = pm_runtime_get_sync(dev); + if (ret < 0) { + dev_err(dev, "Failed to power on PCIe Controller\n"); + return ret; + } + /* Enable ACSPCIE refclk output if the optional property exists */ syscon = syscon_regmap_lookup_by_phandle_optional(node, "ti,syscon-acspcie-proxy-ctrl");
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sergey Bashirov sergeybashirov@gmail.com
[ Upstream commit 832738e4b325b742940761e10487403f9aad13e8 ]
Compilers may optimize the layout of C structures, so we should not rely on sizeof struct and memcpy to encode and decode XDR structures. The byte order of the fields should also be taken into account.
This patch adds the correct functions to handle the deviceid4 structure and removes the pad field, which is currently not used by NFSD, from the runtime state. The server's byte order is preserved because the deviceid4 blob on the wire is only used as a cookie by the client.
Signed-off-by: Sergey Bashirov sergeybashirov@gmail.com Signed-off-by: Chuck Lever chuck.lever@oracle.com Stable-dep-of: d68886bae76a ("NFSD: Fix last write offset handling in layoutcommit") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/blocklayoutxdr.c | 7 ++----- fs/nfsd/flexfilelayoutxdr.c | 3 +-- fs/nfsd/nfs4layouts.c | 1 - fs/nfsd/nfs4xdr.c | 14 +------------- fs/nfsd/xdr4.h | 36 +++++++++++++++++++++++++++++++++++- 5 files changed, 39 insertions(+), 22 deletions(-)
--- a/fs/nfsd/blocklayoutxdr.c +++ b/fs/nfsd/blocklayoutxdr.c @@ -29,8 +29,7 @@ nfsd4_block_encode_layoutget(struct xdr_ *p++ = cpu_to_be32(len); *p++ = cpu_to_be32(1); /* we always return a single extent */
- p = xdr_encode_opaque_fixed(p, &b->vol_id, - sizeof(struct nfsd4_deviceid)); + p = svcxdr_encode_deviceid4(p, &b->vol_id); p = xdr_encode_hyper(p, b->foff); p = xdr_encode_hyper(p, b->len); p = xdr_encode_hyper(p, b->soff); @@ -145,9 +144,7 @@ nfsd4_block_decode_layoutupdate(__be32 * for (i = 0; i < nr_iomaps; i++) { struct pnfs_block_extent bex;
- memcpy(&bex.vol_id, p, sizeof(struct nfsd4_deviceid)); - p += XDR_QUADLEN(sizeof(struct nfsd4_deviceid)); - + p = svcxdr_decode_deviceid4(p, &bex.vol_id); p = xdr_decode_hyper(p, &bex.foff); if (bex.foff & (block_size - 1)) { dprintk("%s: unaligned offset 0x%llx\n", --- a/fs/nfsd/flexfilelayoutxdr.c +++ b/fs/nfsd/flexfilelayoutxdr.c @@ -54,8 +54,7 @@ nfsd4_ff_encode_layoutget(struct xdr_str *p++ = cpu_to_be32(1); /* single mirror */ *p++ = cpu_to_be32(1); /* single data server */
- p = xdr_encode_opaque_fixed(p, &fl->deviceid, - sizeof(struct nfsd4_deviceid)); + p = svcxdr_encode_deviceid4(p, &fl->deviceid);
*p++ = cpu_to_be32(1); /* efficiency */
--- a/fs/nfsd/nfs4layouts.c +++ b/fs/nfsd/nfs4layouts.c @@ -120,7 +120,6 @@ nfsd4_set_deviceid(struct nfsd4_deviceid
id->fsid_idx = fhp->fh_export->ex_devid_map->idx; id->generation = device_generation; - id->pad = 0; return 0; }
--- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -566,18 +566,6 @@ nfsd4_decode_state_owner4(struct nfsd4_c }
#ifdef CONFIG_NFSD_PNFS -static __be32 -nfsd4_decode_deviceid4(struct nfsd4_compoundargs *argp, - struct nfsd4_deviceid *devid) -{ - __be32 *p; - - p = xdr_inline_decode(argp->xdr, NFS4_DEVICEID4_SIZE); - if (!p) - return nfserr_bad_xdr; - memcpy(devid, p, sizeof(*devid)); - return nfs_ok; -}
static __be32 nfsd4_decode_layoutupdate4(struct nfsd4_compoundargs *argp, @@ -1733,7 +1721,7 @@ nfsd4_decode_getdeviceinfo(struct nfsd4_ __be32 status;
memset(gdev, 0, sizeof(*gdev)); - status = nfsd4_decode_deviceid4(argp, &gdev->gd_devid); + status = nfsd4_decode_deviceid4(argp->xdr, &gdev->gd_devid); if (status) return status; if (xdr_stream_decode_u32(argp->xdr, &gdev->gd_layout_type) < 0) --- a/fs/nfsd/xdr4.h +++ b/fs/nfsd/xdr4.h @@ -459,9 +459,43 @@ struct nfsd4_reclaim_complete { struct nfsd4_deviceid { u64 fsid_idx; u32 generation; - u32 pad; };
+static inline __be32 * +svcxdr_encode_deviceid4(__be32 *p, const struct nfsd4_deviceid *devid) +{ + __be64 *q = (__be64 *)p; + + *q = (__force __be64)devid->fsid_idx; + p += 2; + *p++ = (__force __be32)devid->generation; + *p++ = xdr_zero; + return p; +} + +static inline __be32 * +svcxdr_decode_deviceid4(__be32 *p, struct nfsd4_deviceid *devid) +{ + __be64 *q = (__be64 *)p; + + devid->fsid_idx = (__force u64)(*q); + p += 2; + devid->generation = (__force u32)(*p++); + p++; /* NFSD does not use the remaining octets */ + return p; +} + +static inline __be32 +nfsd4_decode_deviceid4(struct xdr_stream *xdr, struct nfsd4_deviceid *devid) +{ + __be32 *p = xdr_inline_decode(xdr, NFS4_DEVICEID4_SIZE); + + if (unlikely(!p)) + return nfserr_bad_xdr; + svcxdr_decode_deviceid4(p, devid); + return nfs_ok; +} + struct nfsd4_layout_seg { u32 iomode; u64 offset;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sergey Bashirov sergeybashirov@gmail.com
[ Upstream commit 274365a51d88658fb51cca637ba579034e90a799 ]
Remove dprintk in nfsd4_layoutcommit. These are not needed in day to day usage, and the information is also available in Wireshark when capturing NFS traffic.
Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Sergey Bashirov sergeybashirov@gmail.com Signed-off-by: Chuck Lever chuck.lever@oracle.com Stable-dep-of: d68886bae76a ("NFSD: Fix last write offset handling in layoutcommit") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/nfs4proc.c | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-)
--- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -2277,18 +2277,12 @@ nfsd4_layoutcommit(struct svc_rqst *rqst inode = d_inode(current_fh->fh_dentry);
nfserr = nfserr_inval; - if (new_size <= seg->offset) { - dprintk("pnfsd: last write before layout segment\n"); + if (new_size <= seg->offset) goto out; - } - if (new_size > seg->offset + seg->length) { - dprintk("pnfsd: last write beyond layout segment\n"); + if (new_size > seg->offset + seg->length) goto out; - } - if (!lcp->lc_newoffset && new_size > i_size_read(inode)) { - dprintk("pnfsd: layoutcommit beyond EOF\n"); + if (!lcp->lc_newoffset && new_size > i_size_read(inode)) goto out; - }
nfserr = nfsd4_preprocess_layout_stateid(rqstp, cstate, &lcp->lc_sid, false, lcp->lc_layout_type,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sergey Bashirov sergeybashirov@gmail.com
[ Upstream commit d68886bae76a4b9b3484d23e5b7df086f940fa38 ]
The data type of loca_last_write_offset is newoffset4 and is switched on a boolean value, no_newoffset, that indicates if a previous write occurred or not. If no_newoffset is FALSE, an offset is not given. This means that client does not try to update the file size. Thus, server should not try to calculate new file size and check if it fits into the segment range. See RFC 8881, section 12.5.4.2.
Sometimes the current incorrect logic may cause clients to hang when trying to sync an inode. If layoutcommit fails, the client marks the inode as dirty again.
Fixes: 9cf514ccfacb ("nfsd: implement pNFS operations") Cc: stable@vger.kernel.org Co-developed-by: Konstantin Evtushenko koevtushenko@yandex.com Signed-off-by: Konstantin Evtushenko koevtushenko@yandex.com Signed-off-by: Sergey Bashirov sergeybashirov@gmail.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com [ removed rqstp parameter from proc_layoutcommit ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/blocklayout.c | 5 ++--- fs/nfsd/nfs4proc.c | 30 +++++++++++++++--------------- 2 files changed, 17 insertions(+), 18 deletions(-)
--- a/fs/nfsd/blocklayout.c +++ b/fs/nfsd/blocklayout.c @@ -117,7 +117,6 @@ static __be32 nfsd4_block_commit_blocks(struct inode *inode, struct nfsd4_layoutcommit *lcp, struct iomap *iomaps, int nr_iomaps) { - loff_t new_size = lcp->lc_last_wr + 1; struct iattr iattr = { .ia_valid = 0 }; int error;
@@ -127,9 +126,9 @@ nfsd4_block_commit_blocks(struct inode * iattr.ia_valid |= ATTR_ATIME | ATTR_CTIME | ATTR_MTIME; iattr.ia_atime = iattr.ia_ctime = iattr.ia_mtime = lcp->lc_mtime;
- if (new_size > i_size_read(inode)) { + if (lcp->lc_size_chg) { iattr.ia_valid |= ATTR_SIZE; - iattr.ia_size = new_size; + iattr.ia_size = lcp->lc_newsize; }
error = inode->i_sb->s_export_op->commit_blocks(inode, iomaps, --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -2261,7 +2261,6 @@ nfsd4_layoutcommit(struct svc_rqst *rqst const struct nfsd4_layout_seg *seg = &lcp->lc_seg; struct svc_fh *current_fh = &cstate->current_fh; const struct nfsd4_layout_ops *ops; - loff_t new_size = lcp->lc_last_wr + 1; struct inode *inode; struct nfs4_layout_stateid *ls; __be32 nfserr; @@ -2276,13 +2275,21 @@ nfsd4_layoutcommit(struct svc_rqst *rqst goto out; inode = d_inode(current_fh->fh_dentry);
- nfserr = nfserr_inval; - if (new_size <= seg->offset) - goto out; - if (new_size > seg->offset + seg->length) - goto out; - if (!lcp->lc_newoffset && new_size > i_size_read(inode)) - goto out; + lcp->lc_size_chg = false; + if (lcp->lc_newoffset) { + loff_t new_size = lcp->lc_last_wr + 1; + + nfserr = nfserr_inval; + if (new_size <= seg->offset) + goto out; + if (new_size > seg->offset + seg->length) + goto out; + + if (new_size > i_size_read(inode)) { + lcp->lc_size_chg = true; + lcp->lc_newsize = new_size; + } + }
nfserr = nfsd4_preprocess_layout_stateid(rqstp, cstate, &lcp->lc_sid, false, lcp->lc_layout_type, @@ -2298,13 +2305,6 @@ nfsd4_layoutcommit(struct svc_rqst *rqst /* LAYOUTCOMMIT does not require any serialization */ mutex_unlock(&ls->ls_mutex);
- if (new_size > i_size_read(inode)) { - lcp->lc_size_chg = 1; - lcp->lc_newsize = new_size; - } else { - lcp->lc_size_chg = 0; - } - nfserr = ops->proc_layoutcommit(inode, lcp); nfs4_put_stid(&ls->ls_stid); out:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
[ Upstream commit 56094ad3eaa21e6621396cc33811d8f72847a834 ]
When user calls open_by_handle_at() on some inode that is not cached, we will create disconnected dentry for it. If such dentry is a directory, exportfs_decode_fh_raw() will then try to connect this dentry to the dentry tree through reconnect_path(). It may happen for various reasons (such as corrupted fs or race with rename) that the call to lookup_one_unlocked() in reconnect_one() will fail to find the dentry we are trying to reconnect and instead create a new dentry under the parent. Now this dentry will not be marked as disconnected although the parent still may well be disconnected (at least in case this inconsistency happened because the fs is corrupted and .. doesn't point to the real parent directory). This creates inconsistency in disconnected flags but AFAICS it was mostly harmless. At least until commit f1ee616214cb ("VFS: don't keep disconnected dentries on d_anon") which removed adding of most disconnected dentries to sb->s_anon list. Thus after this commit cleanup of disconnected dentries implicitely relies on the fact that dput() will immediately reclaim such dentries. However when some leaf dentry isn't marked as disconnected, as in the scenario described above, the reclaim doesn't happen and the dentries are "leaked". Memory reclaim can eventually reclaim them but otherwise they stay in memory and if umount comes first, we hit infamous "Busy inodes after unmount" bug. Make sure all dentries created under a disconnected parent are marked as disconnected as well.
Reported-by: syzbot+1d79ebe5383fc016cf07@syzkaller.appspotmail.com Fixes: f1ee616214cb ("VFS: don't keep disconnected dentries on d_anon") CC: stable@vger.kernel.org Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Christian Brauner brauner@kernel.org [ relocated DCACHE_DISCONNECTED propagation from d_alloc_parallel() to d_alloc() ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/dcache.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/dcache.c +++ b/fs/dcache.c @@ -1862,6 +1862,8 @@ struct dentry *d_alloc(struct dentry * p __dget_dlock(parent); dentry->d_parent = parent; list_add(&dentry->d_child, &parent->d_subdirs); + if (parent->d_flags & DCACHE_DISCONNECTED) + dentry->d_flags |= DCACHE_DISCONNECTED; spin_unlock(&parent->d_lock);
return dentry;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit 4b47a8601b71ad98833b447d465592d847b4dc77 ]
Avoid a crash if a pNFS client should happen to send a LAYOUTCOMMIT operation on a FlexFiles layout.
Reported-by: Robert Morris rtm@csail.mit.edu Closes: https://lore.kernel.org/linux-nfs/152f99b2-ba35-4dec-93a9-4690e625dccd@oracl... Cc: Thomas Haynes loghyr@hammerspace.com Cc: stable@vger.kernel.org Fixes: 9b9960a0ca47 ("nfsd: Add a super simple flex file server") Signed-off-by: Chuck Lever chuck.lever@oracle.com [ removed struct svc_rqst parameter from nfsd4_ff_proc_layoutcommit ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/flexfilelayout.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/fs/nfsd/flexfilelayout.c +++ b/fs/nfsd/flexfilelayout.c @@ -125,6 +125,13 @@ nfsd4_ff_proc_getdeviceinfo(struct super return 0; }
+static __be32 +nfsd4_ff_proc_layoutcommit(struct inode *inode, + struct nfsd4_layoutcommit *lcp) +{ + return nfs_ok; +} + const struct nfsd4_layout_ops ff_layout_ops = { .notify_types = NOTIFY_DEVICEID4_DELETE | NOTIFY_DEVICEID4_CHANGE, @@ -133,4 +140,5 @@ const struct nfsd4_layout_ops ff_layout_ .encode_getdeviceinfo = nfsd4_ff_encode_getdeviceinfo, .proc_layoutget = nfsd4_ff_proc_layoutget, .encode_layoutget = nfsd4_ff_encode_layoutget, + .proc_layoutcommit = nfsd4_ff_proc_layoutcommit, };
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Theodore Ts'o tytso@mit.edu
[ Upstream commit 8ecb790ea8c3fc69e77bace57f14cf0d7c177bd8 ]
Unlike other strings in the ext4 superblock, we rely on tune2fs to make sure s_mount_opts is NUL terminated. Harden parse_apply_sb_mount_options() by treating s_mount_opts as a potential __nonstring.
Cc: stable@vger.kernel.org Fixes: 8b67f04ab9de ("ext4: Add mount options in superblock") Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Theodore Ts'o tytso@mit.edu Message-ID: 20250916-tune2fs-v2-1-d594dc7486f0@mit.edu Signed-off-by: Theodore Ts'o tytso@mit.edu [ added sizeof() third argument to strscpy_pad() ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/super.c | 17 +++++------------ 1 file changed, 5 insertions(+), 12 deletions(-)
--- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -2415,7 +2415,7 @@ static int parse_apply_sb_mount_options( struct ext4_fs_context *m_ctx) { struct ext4_sb_info *sbi = EXT4_SB(sb); - char *s_mount_opts = NULL; + char s_mount_opts[65]; struct ext4_fs_context *s_ctx = NULL; struct fs_context *fc = NULL; int ret = -ENOMEM; @@ -2423,15 +2423,11 @@ static int parse_apply_sb_mount_options( if (!sbi->s_es->s_mount_opts[0]) return 0;
- s_mount_opts = kstrndup(sbi->s_es->s_mount_opts, - sizeof(sbi->s_es->s_mount_opts), - GFP_KERNEL); - if (!s_mount_opts) - return ret; + strscpy_pad(s_mount_opts, sbi->s_es->s_mount_opts, sizeof(s_mount_opts));
fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL); if (!fc) - goto out_free; + return -ENOMEM;
s_ctx = kzalloc(sizeof(struct ext4_fs_context), GFP_KERNEL); if (!s_ctx) @@ -2463,11 +2459,8 @@ parse_failed: ret = 0;
out_free: - if (fc) { - ext4_fc_free(fc); - kfree(fc); - } - kfree(s_mount_opts); + ext4_fc_free(fc); + kfree(fc); return ret; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tvrtko Ursulin tvrtko.ursulin@igalia.com
[ Upstream commit 5801e65206b065b0b2af032f7f1eef222aa2fd83 ]
When adding dependencies with drm_sched_job_add_dependency(), that function consumes the fence reference both on success and failure, so in the latter case the dma_fence_put() on the error path (xarray failed to expand) is a double free.
Interestingly this bug appears to have been present ever since commit ebd5f74255b9 ("drm/sched: Add dependency tracking"), since the code back then looked like this:
drm_sched_job_add_implicit_dependencies(): ... for (i = 0; i < fence_count; i++) { ret = drm_sched_job_add_dependency(job, fences[i]); if (ret) break; }
for (; i < fence_count; i++) dma_fence_put(fences[i]);
Which means for the failing 'i' the dma_fence_put was already a double free. Possibly there were no users at that time, or the test cases were insufficient to hit it.
The bug was then only noticed and fixed after commit 9c2ba265352a ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2") landed, with its fixup of commit 4eaf02d6076c ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies").
At that point it was a slightly different flavour of a double free, which commit 963d0b356935 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder") noticed and attempted to fix.
But it only moved the double free from happening inside the drm_sched_job_add_dependency(), when releasing the reference not yet obtained, to the caller, when releasing the reference already released by the former in the failure case.
As such it is not easy to identify the right target for the fixes tag so lets keep it simple and just continue the chain.
While fixing we also improve the comment and explain the reason for taking the reference and not dropping it.
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@igalia.com Fixes: 963d0b356935 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder") Reported-by: Dan Carpenter dan.carpenter@linaro.org Closes: https://lore.kernel.org/dri-devel/aNFbXq8OeYl3QSdm@stanley.mountain/ Cc: Christian König christian.koenig@amd.com Cc: Rob Clark robdclark@chromium.org Cc: Daniel Vetter daniel.vetter@ffwll.ch Cc: Matthew Brost matthew.brost@intel.com Cc: Danilo Krummrich dakr@kernel.org Cc: Philipp Stanner phasta@kernel.org Cc: Christian König ckoenig.leichtzumerken@gmail.com Cc: dri-devel@lists.freedesktop.org Cc: stable@vger.kernel.org # v5.16+ Signed-off-by: Philipp Stanner phasta@kernel.org Link: https://lore.kernel.org/r/20251015084015.6273-1-tvrtko.ursulin@igalia.com [ applied to drm_sched_job_add_implicit_dependencies instead of drm_sched_job_add_resv_dependencies ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/scheduler/sched_main.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
--- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -719,13 +719,14 @@ int drm_sched_job_add_implicit_dependenc
dma_resv_for_each_fence(&cursor, obj->resv, dma_resv_usage_rw(write), fence) { - /* Make sure to grab an additional ref on the added fence */ - dma_fence_get(fence); - ret = drm_sched_job_add_dependency(job, fence); - if (ret) { - dma_fence_put(fence); + /* + * As drm_sched_job_add_dependency always consumes the fence + * reference (even when it fails), and dma_resv_for_each_fence + * is not obtaining one, we need to grab one before calling. + */ + ret = drm_sched_job_add_dependency(job, dma_fence_get(fence)); + if (ret) return ret; - } } return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Niklas Cassel cassel@kernel.org
[ Upstream commit 42f9c66a6d0cc45758dab77233c5460e1cf003df ]
Tegra already defines all BARs except BAR0 as BAR_RESERVED. This is sufficient for pci-epf-test to not allocate backing memory and to not call set_bar() for those BARs. However, marking a BAR as BAR_RESERVED does not mean that the BAR gets disabled.
The host side driver, pci_endpoint_test, simply does an ioremap for all enabled BARs and will run tests against all enabled BARs, so it will run tests against the BARs marked as BAR_RESERVED.
After running the BAR tests (which will write to all enabled BARs), the inbound address translation is broken. This is because the tegra controller exposes the ATU Port Logic Structure in BAR4, so when BAR4 is written, the inbound address translation settings get overwritten.
To avoid this, implement the dw_pcie_ep_ops .init() callback and start off by disabling all BARs (pci-epf-test will later enable/configure BARs that are not defined as BAR_RESERVED).
This matches the behavior of other PCIe endpoint drivers: dra7xx, imx6, layerscape-ep, artpec6, dw-rockchip, qcom-ep, rcar-gen4, and uniphier-ep.
With this, the PCI endpoint kselftest test case CONSECUTIVE_BAR_TEST (which was specifically made to detect address translation issues) passes.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Niklas Cassel cassel@kernel.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-7-cassel@kernel.org [ changed .init field to .ep_init in pcie_ep_ops struct ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pcie-tegra194.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/drivers/pci/controller/dwc/pcie-tegra194.c +++ b/drivers/pci/controller/dwc/pcie-tegra194.c @@ -1949,6 +1949,15 @@ static irqreturn_t tegra_pcie_ep_pex_rst return IRQ_HANDLED; }
+static void tegra_pcie_ep_init(struct dw_pcie_ep *ep) +{ + struct dw_pcie *pci = to_dw_pcie_from_ep(ep); + enum pci_barno bar; + + for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) + dw_pcie_ep_reset_bar(pci, bar); +}; + static int tegra_pcie_ep_raise_legacy_irq(struct tegra_pcie_dw *pcie, u16 irq) { /* Tegra194 supports only INTA */ @@ -2022,6 +2031,7 @@ tegra_pcie_ep_get_features(struct dw_pci }
static const struct dw_pcie_ep_ops pcie_ep_ops = { + .ep_init = tegra_pcie_ep_init, .raise_irq = tegra_pcie_ep_raise_irq, .get_features = tegra_pcie_ep_get_features, };
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
[ Upstream commit cf342d3beda000b4c60990755ca7800de5038785 ]
This allows to keep the f2fs_do_map_lock based locking scheme private to data.c.
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Stable-dep-of: 9d5c4f5c7a2c ("f2fs: fix wrong block mapping for multi-devices") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/data.c | 16 ++++++++++++++-- fs/f2fs/f2fs.h | 3 +-- fs/f2fs/file.c | 4 +--- 3 files changed, 16 insertions(+), 7 deletions(-)
--- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1192,7 +1192,7 @@ int f2fs_reserve_block(struct dnode_of_d return err; }
-int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index) +static int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index) { struct extent_info ei = {0, }; struct inode *inode = dn->inode; @@ -1432,7 +1432,7 @@ static int __allocate_data_block(struct return 0; }
-void f2fs_do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock) +static void f2fs_do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock) { if (flag == F2FS_GET_BLOCK_PRE_AIO) { if (lock) @@ -1447,6 +1447,18 @@ void f2fs_do_map_lock(struct f2fs_sb_inf } }
+int f2fs_get_block_locked(struct dnode_of_data *dn, pgoff_t index) +{ + struct f2fs_sb_info *sbi = F2FS_I_SB(dn->inode); + int err; + + f2fs_do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, true); + err = f2fs_get_block(dn, index); + f2fs_do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, false); + + return err; +} + /* * f2fs_map_blocks() tries to find or build mapping relationship which * maps continuous logical blocks to physical blocks, and return such --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -3783,7 +3783,7 @@ void f2fs_set_data_blkaddr(struct dnode_ void f2fs_update_data_blkaddr(struct dnode_of_data *dn, block_t blkaddr); int f2fs_reserve_new_blocks(struct dnode_of_data *dn, blkcnt_t count); int f2fs_reserve_new_block(struct dnode_of_data *dn); -int f2fs_get_block(struct dnode_of_data *dn, pgoff_t index); +int f2fs_get_block_locked(struct dnode_of_data *dn, pgoff_t index); int f2fs_reserve_block(struct dnode_of_data *dn, pgoff_t index); struct page *f2fs_get_read_data_page(struct inode *inode, pgoff_t index, blk_opf_t op_flags, bool for_write, pgoff_t *next_pgofs); @@ -3794,7 +3794,6 @@ struct page *f2fs_get_lock_data_page(str struct page *f2fs_get_new_data_page(struct inode *inode, struct page *ipage, pgoff_t index, bool new_i_size); int f2fs_do_write_data_page(struct f2fs_io_info *fio); -void f2fs_do_map_lock(struct f2fs_sb_info *sbi, int flag, bool lock); int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, int create, int flag); int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -113,10 +113,8 @@ static vm_fault_t f2fs_vm_page_mkwrite(s
if (need_alloc) { /* block allocation */ - f2fs_do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, true); set_new_dnode(&dn, inode, NULL, NULL, 0); - err = f2fs_get_block(&dn, page->index); - f2fs_do_map_lock(sbi, F2FS_GET_BLOCK_PRE_AIO, false); + err = f2fs_get_block_locked(&dn, page->index); }
#ifdef CONFIG_F2FS_FS_COMPRESSION
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
[ Upstream commit cd8fc5226bef3a1fda13a0e61794a039ca46744a ]
The create argument is always identicaly to map->m_may_create, so use that consistently.
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Stable-dep-of: 9d5c4f5c7a2c ("f2fs: fix wrong block mapping for multi-devices") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/data.c | 65 ++++++++++++++++++-------------------------- fs/f2fs/f2fs.h | 3 -- fs/f2fs/file.c | 12 ++++---- include/trace/events/f2fs.h | 11 ++----- 4 files changed, 39 insertions(+), 52 deletions(-)
--- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1464,8 +1464,7 @@ int f2fs_get_block_locked(struct dnode_o * maps continuous logical blocks to physical blocks, and return such * info via f2fs_map_blocks structure. */ -int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, - int create, int flag) +int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, int flag) { unsigned int maxblocks = map->m_len; struct dnode_of_data dn; @@ -1494,38 +1493,31 @@ int f2fs_map_blocks(struct inode *inode, pgofs = (pgoff_t)map->m_lblk; end = pgofs + maxblocks;
- if (!create && f2fs_lookup_read_extent_cache(inode, pgofs, &ei)) { - if (f2fs_lfs_mode(sbi) && flag == F2FS_GET_BLOCK_DIO && - map->m_may_create) - goto next_dnode; - - map->m_pblk = ei.blk + pgofs - ei.fofs; - map->m_len = min((pgoff_t)maxblocks, ei.fofs + ei.len - pgofs); - map->m_flags = F2FS_MAP_MAPPED; - if (map->m_next_extent) - *map->m_next_extent = pgofs + map->m_len; + if (map->m_may_create || + !f2fs_lookup_read_extent_cache(inode, pgofs, &ei)) + goto next_dnode; + + /* Found the map in read extent cache */ + map->m_pblk = ei.blk + pgofs - ei.fofs; + map->m_len = min((pgoff_t)maxblocks, ei.fofs + ei.len - pgofs); + map->m_flags = F2FS_MAP_MAPPED; + if (map->m_next_extent) + *map->m_next_extent = pgofs + map->m_len;
- /* for hardware encryption, but to avoid potential issue in future */ - if (flag == F2FS_GET_BLOCK_DIO) - f2fs_wait_on_block_writeback_range(inode, + /* for hardware encryption, but to avoid potential issue in future */ + if (flag == F2FS_GET_BLOCK_DIO) + f2fs_wait_on_block_writeback_range(inode, map->m_pblk, map->m_len);
- if (map->m_multidev_dio) { - block_t blk_addr = map->m_pblk; - - bidx = f2fs_target_device_index(sbi, map->m_pblk); + if (map->m_multidev_dio) { + bidx = f2fs_target_device_index(sbi, map->m_pblk);
- map->m_bdev = FDEV(bidx).bdev; - map->m_pblk -= FDEV(bidx).start_blk; - map->m_len = min(map->m_len, + map->m_bdev = FDEV(bidx).bdev; + map->m_pblk -= FDEV(bidx).start_blk; + map->m_len = min(map->m_len, FDEV(bidx).end_blk + 1 - map->m_pblk); - - if (map->m_may_create) - f2fs_update_device_state(sbi, inode->i_ino, - blk_addr, map->m_len); - } - goto out; } + goto out;
next_dnode: if (map->m_may_create) @@ -1589,7 +1581,7 @@ next_block: set_inode_flag(inode, FI_APPEND_WRITE); } } else { - if (create) { + if (map->m_may_create) { if (unlikely(f2fs_cp_error(sbi))) { err = -EIO; goto sync_out; @@ -1764,7 +1756,7 @@ unlock_out: f2fs_balance_fs(sbi, dn.node_changed); } out: - trace_f2fs_map_blocks(inode, map, create, flag, err); + trace_f2fs_map_blocks(inode, map, flag, err); return err; }
@@ -1786,7 +1778,7 @@ bool f2fs_overwrite_io(struct inode *ino
while (map.m_lblk < last_lblk) { map.m_len = last_lblk - map.m_lblk; - err = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_DEFAULT); + err = f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_DEFAULT); if (err || map.m_len == 0) return false; map.m_lblk += map.m_len; @@ -1960,7 +1952,7 @@ next: map.m_len = cluster_size - count_in_cluster; }
- ret = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_FIEMAP); + ret = f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_FIEMAP); if (ret) goto out;
@@ -2093,7 +2085,7 @@ static int f2fs_read_single_page(struct map->m_lblk = block_in_file; map->m_len = last_block - block_in_file;
- ret = f2fs_map_blocks(inode, map, 0, F2FS_GET_BLOCK_DEFAULT); + ret = f2fs_map_blocks(inode, map, F2FS_GET_BLOCK_DEFAULT); if (ret) goto out; got_it: @@ -3850,7 +3842,7 @@ static sector_t f2fs_bmap(struct address map.m_next_pgofs = NULL; map.m_seg_type = NO_CHECK_TYPE;
- if (!f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_BMAP)) + if (!f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_BMAP)) blknr = map.m_pblk; } out: @@ -3958,7 +3950,7 @@ retry: map.m_seg_type = NO_CHECK_TYPE; map.m_may_create = false;
- ret = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_FIEMAP); + ret = f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_FIEMAP); if (ret) goto out;
@@ -4187,8 +4179,7 @@ static int f2fs_iomap_begin(struct inode if (flags & IOMAP_WRITE) map.m_may_create = true;
- err = f2fs_map_blocks(inode, &map, flags & IOMAP_WRITE, - F2FS_GET_BLOCK_DIO); + err = f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_DIO); if (err) return err;
--- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -3794,8 +3794,7 @@ struct page *f2fs_get_lock_data_page(str struct page *f2fs_get_new_data_page(struct inode *inode, struct page *ipage, pgoff_t index, bool new_i_size); int f2fs_do_write_data_page(struct f2fs_io_info *fio); -int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, - int create, int flag); +int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map, int flag); int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, u64 start, u64 len); int f2fs_encrypt_one_page(struct f2fs_io_info *fio); --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -1800,7 +1800,7 @@ next_alloc: f2fs_unlock_op(sbi);
map.m_seg_type = CURSEG_COLD_DATA_PINNED; - err = f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_DIO); + err = f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_PRE_DIO); file_dont_truncate(inode);
f2fs_up_write(&sbi->pin_sem); @@ -1813,7 +1813,7 @@ next_alloc:
map.m_len = expanded; } else { - err = f2fs_map_blocks(inode, &map, 1, F2FS_GET_BLOCK_PRE_AIO); + err = f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_PRE_AIO); expanded = map.m_len; } out_err: @@ -2710,7 +2710,7 @@ static int f2fs_defragment_range(struct */ while (map.m_lblk < pg_end) { map.m_len = pg_end - map.m_lblk; - err = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_DEFAULT); + err = f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_DEFAULT); if (err) goto out;
@@ -2757,7 +2757,7 @@ static int f2fs_defragment_range(struct
do_map: map.m_len = pg_end - map.m_lblk; - err = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_DEFAULT); + err = f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_DEFAULT); if (err) goto clear_out;
@@ -3352,7 +3352,7 @@ int f2fs_precache_extents(struct inode * map.m_len = end - map.m_lblk;
f2fs_down_write(&fi->i_gc_rwsem[WRITE]); - err = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_PRECACHE); + err = f2fs_map_blocks(inode, &map, F2FS_GET_BLOCK_PRECACHE); f2fs_up_write(&fi->i_gc_rwsem[WRITE]); if (err) return err; @@ -4635,7 +4635,7 @@ static int f2fs_preallocate_blocks(struc flag = F2FS_GET_BLOCK_PRE_AIO; }
- ret = f2fs_map_blocks(inode, &map, 1, flag); + ret = f2fs_map_blocks(inode, &map, flag); /* -ENOSPC|-EDQUOT are fine to report the number of allocated blocks. */ if (ret < 0 && !((ret == -ENOSPC || ret == -EDQUOT) && map.m_len > 0)) return ret; --- a/include/trace/events/f2fs.h +++ b/include/trace/events/f2fs.h @@ -564,10 +564,10 @@ TRACE_EVENT(f2fs_file_write_iter, );
TRACE_EVENT(f2fs_map_blocks, - TP_PROTO(struct inode *inode, struct f2fs_map_blocks *map, - int create, int flag, int ret), + TP_PROTO(struct inode *inode, struct f2fs_map_blocks *map, int flag, + int ret),
- TP_ARGS(inode, map, create, flag, ret), + TP_ARGS(inode, map, flag, ret),
TP_STRUCT__entry( __field(dev_t, dev) @@ -579,7 +579,6 @@ TRACE_EVENT(f2fs_map_blocks, __field(int, m_seg_type) __field(bool, m_may_create) __field(bool, m_multidev_dio) - __field(int, create) __field(int, flag) __field(int, ret) ), @@ -594,7 +593,6 @@ TRACE_EVENT(f2fs_map_blocks, __entry->m_seg_type = map->m_seg_type; __entry->m_may_create = map->m_may_create; __entry->m_multidev_dio = map->m_multidev_dio; - __entry->create = create; __entry->flag = flag; __entry->ret = ret; ), @@ -602,7 +600,7 @@ TRACE_EVENT(f2fs_map_blocks, TP_printk("dev = (%d,%d), ino = %lu, file offset = %llu, " "start blkaddr = 0x%llx, len = 0x%llx, flags = %u, " "seg_type = %d, may_create = %d, multidevice = %d, " - "create = %d, flag = %d, err = %d", + "flag = %d, err = %d", show_dev_ino(__entry), (unsigned long long)__entry->m_lblk, (unsigned long long)__entry->m_pblk, @@ -611,7 +609,6 @@ TRACE_EVENT(f2fs_map_blocks, __entry->m_seg_type, __entry->m_may_create, __entry->m_multidev_dio, - __entry->create, __entry->flag, __entry->ret) );
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig hch@lst.de
[ Upstream commit 0094e98bd1477a6b7d97c25b47b19a7317c35279 ]
Add a helper to deal with everything needed to return a f2fs_map_blocks structure based on a lookup in the extent cache.
Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Stable-dep-of: 9d5c4f5c7a2c ("f2fs: fix wrong block mapping for multi-devices") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/data.c | 65 +++++++++++++++++++++++++++++++++------------------------ 1 file changed, 38 insertions(+), 27 deletions(-)
--- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1459,6 +1459,41 @@ int f2fs_get_block_locked(struct dnode_o return err; }
+static bool f2fs_map_blocks_cached(struct inode *inode, + struct f2fs_map_blocks *map, int flag) +{ + struct f2fs_sb_info *sbi = F2FS_I_SB(inode); + unsigned int maxblocks = map->m_len; + pgoff_t pgoff = (pgoff_t)map->m_lblk; + struct extent_info ei = {}; + + if (!f2fs_lookup_read_extent_cache(inode, pgoff, &ei)) + return false; + + map->m_pblk = ei.blk + pgoff - ei.fofs; + map->m_len = min((pgoff_t)maxblocks, ei.fofs + ei.len - pgoff); + map->m_flags = F2FS_MAP_MAPPED; + if (map->m_next_extent) + *map->m_next_extent = pgoff + map->m_len; + + /* for hardware encryption, but to avoid potential issue in future */ + if (flag == F2FS_GET_BLOCK_DIO) + f2fs_wait_on_block_writeback_range(inode, + map->m_pblk, map->m_len); + + if (f2fs_allow_multi_device_dio(sbi, flag)) { + int bidx = f2fs_target_device_index(sbi, map->m_pblk); + struct f2fs_dev_info *dev = &sbi->devs[bidx]; + + map->m_bdev = dev->bdev; + map->m_pblk -= dev->start_blk; + map->m_len = min(map->m_len, dev->end_blk + 1 - map->m_pblk); + } else { + map->m_bdev = inode->i_sb->s_bdev; + } + return true; +} + /* * f2fs_map_blocks() tries to find or build mapping relationship which * maps continuous logical blocks to physical blocks, and return such @@ -1474,7 +1509,6 @@ int f2fs_map_blocks(struct inode *inode, int err = 0, ofs = 1; unsigned int ofs_in_node, last_ofs_in_node; blkcnt_t prealloc; - struct extent_info ei = {0, }; block_t blkaddr; unsigned int start_pgofs; int bidx = 0; @@ -1482,6 +1516,9 @@ int f2fs_map_blocks(struct inode *inode, if (!maxblocks) return 0;
+ if (!map->m_may_create && f2fs_map_blocks_cached(inode, map, flag)) + goto out; + map->m_bdev = inode->i_sb->s_bdev; map->m_multidev_dio = f2fs_allow_multi_device_dio(F2FS_I_SB(inode), flag); @@ -1493,32 +1530,6 @@ int f2fs_map_blocks(struct inode *inode, pgofs = (pgoff_t)map->m_lblk; end = pgofs + maxblocks;
- if (map->m_may_create || - !f2fs_lookup_read_extent_cache(inode, pgofs, &ei)) - goto next_dnode; - - /* Found the map in read extent cache */ - map->m_pblk = ei.blk + pgofs - ei.fofs; - map->m_len = min((pgoff_t)maxblocks, ei.fofs + ei.len - pgofs); - map->m_flags = F2FS_MAP_MAPPED; - if (map->m_next_extent) - *map->m_next_extent = pgofs + map->m_len; - - /* for hardware encryption, but to avoid potential issue in future */ - if (flag == F2FS_GET_BLOCK_DIO) - f2fs_wait_on_block_writeback_range(inode, - map->m_pblk, map->m_len); - - if (map->m_multidev_dio) { - bidx = f2fs_target_device_index(sbi, map->m_pblk); - - map->m_bdev = FDEV(bidx).bdev; - map->m_pblk -= FDEV(bidx).start_blk; - map->m_len = min(map->m_len, - FDEV(bidx).end_blk + 1 - map->m_pblk); - } - goto out; - next_dnode: if (map->m_may_create) f2fs_do_map_lock(sbi, flag, true);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jaegeuk Kim jaegeuk@kernel.org
[ Upstream commit 9d5c4f5c7a2c7677e1b3942772122b032c265aae ]
Assuming the disk layout as below,
disk0: 0 --- 0x00035abfff disk1: 0x00035ac000 --- 0x00037abfff disk2: 0x00037ac000 --- 0x00037ebfff
and we want to read data from offset=13568 having len=128 across the block devices, we can illustrate the block addresses like below.
0 .. 0x00037ac000 ------------------- 0x00037ebfff, 0x00037ec000 ------- | ^ ^ ^ | fofs 0 13568 13568+128 | ------------------------------------------------------ | LBA 0x37e8aa9 0x37ebfa9 0x37ec029 --- map 0x3caa9 0x3ffa9
In this example, we should give the relative map of the target block device ranging from 0x3caa9 to 0x3ffa9 where the length should be calculated by 0x37ebfff + 1 - 0x37ebfa9.
In the below equation, however, map->m_pblk was supposed to be the original address instead of the one from the target block address.
- map->m_len = min(map->m_len, dev->end_blk + 1 - map->m_pblk);
Cc: stable@vger.kernel.org Fixes: 71f2c8206202 ("f2fs: multidevice: support direct IO") Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/data.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1486,8 +1486,8 @@ static bool f2fs_map_blocks_cached(struc struct f2fs_dev_info *dev = &sbi->devs[bidx];
map->m_bdev = dev->bdev; - map->m_pblk -= dev->start_blk; map->m_len = min(map->m_len, dev->end_blk + 1 - map->m_pblk); + map->m_pblk -= dev->start_blk; } else { map->m_bdev = inode->i_sb->s_bdev; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Piotr Kwapulinski piotr.kwapulinski@intel.com
[ Upstream commit 208fff3f567e2a3c3e7e4788845e90245c3891b4 ]
PCI_VDEVICE_SUB generates the pci_device_id struct layout for the specific PCI device/subdevice. Private data may follow the output.
Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com Signed-off-by: Piotr Kwapulinski piotr.kwapulinski@intel.com Acked-by: Bjorn Helgaas bhelgaas@google.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Stable-dep-of: a7075f501bd3 ("ixgbevf: fix mailbox API compatibility by negotiating supported features") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/pci.h | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -1027,6 +1027,20 @@ static inline struct pci_driver *to_pci_ .subvendor = PCI_ANY_ID, .subdevice = PCI_ANY_ID, 0, 0
/** + * PCI_VDEVICE_SUB - describe a specific PCI device/subdevice in a short form + * @vend: the vendor name + * @dev: the 16 bit PCI Device ID + * @subvend: the 16 bit PCI Subvendor ID + * @subdev: the 16 bit PCI Subdevice ID + * + * Generate the pci_device_id struct layout for the specific PCI + * device/subdevice. Private data may follow the output. + */ +#define PCI_VDEVICE_SUB(vend, dev, subvend, subdev) \ + .vendor = PCI_VENDOR_ID_##vend, .device = (dev), \ + .subvendor = (subvend), .subdevice = (subdev), 0, 0 + +/** * PCI_DEVICE_DATA - macro used to describe a specific PCI device in very short form * @vend: the vendor name (without PCI_VENDOR_ID_ prefix) * @dev: the device name (without PCI_DEVICE_ID_<vend>_ prefix)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Piotr Kwapulinski piotr.kwapulinski@intel.com
[ Upstream commit 4c44b450c69b676955c2790dcf467c1f969d80f1 ]
Add support for Intel(R) E610 Series of network devices. The E610 is based on X550 but adds firmware managed link, enhanced security capabilities and support for updated server manageability
Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com Signed-off-by: Piotr Kwapulinski piotr.kwapulinski@intel.com Reviewed-by: Simon Horman horms@kernel.org Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Stable-dep-of: a7075f501bd3 ("ixgbevf: fix mailbox API compatibility by negotiating supported features") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ixgbevf/defines.h | 5 ++++- drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 6 +++++- drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 12 ++++++++++-- drivers/net/ethernet/intel/ixgbevf/vf.c | 12 +++++++++++- drivers/net/ethernet/intel/ixgbevf/vf.h | 4 +++- 5 files changed, 33 insertions(+), 6 deletions(-)
--- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -1,5 +1,5 @@ /* SPDX-License-Identifier: GPL-2.0 */ -/* Copyright(c) 1999 - 2018 Intel Corporation. */ +/* Copyright(c) 1999 - 2024 Intel Corporation. */
#ifndef _IXGBEVF_DEFINES_H_ #define _IXGBEVF_DEFINES_H_ @@ -16,6 +16,9 @@ #define IXGBE_DEV_ID_X550_VF_HV 0x1564 #define IXGBE_DEV_ID_X550EM_X_VF_HV 0x15A9
+#define IXGBE_DEV_ID_E610_VF 0x57AD +#define IXGBE_SUBDEV_ID_E610_VF_HV 0x00FF + #define IXGBE_VF_IRQ_CLEAR_MASK 7 #define IXGBE_VF_MAX_TX_QUEUES 8 #define IXGBE_VF_MAX_RX_QUEUES 8 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -1,5 +1,5 @@ /* SPDX-License-Identifier: GPL-2.0 */ -/* Copyright(c) 1999 - 2018 Intel Corporation. */ +/* Copyright(c) 1999 - 2024 Intel Corporation. */
#ifndef _IXGBEVF_H_ #define _IXGBEVF_H_ @@ -418,6 +418,8 @@ enum ixgbevf_boards { board_X550EM_x_vf, board_X550EM_x_vf_hv, board_x550em_a_vf, + board_e610_vf, + board_e610_vf_hv, };
enum ixgbevf_xcast_modes { @@ -434,11 +436,13 @@ extern const struct ixgbevf_info ixgbevf extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops; extern const struct ixgbe_mbx_operations ixgbevf_mbx_ops_legacy; extern const struct ixgbevf_info ixgbevf_x550em_a_vf_info; +extern const struct ixgbevf_info ixgbevf_e610_vf_info;
extern const struct ixgbevf_info ixgbevf_82599_vf_hv_info; extern const struct ixgbevf_info ixgbevf_X540_vf_hv_info; extern const struct ixgbevf_info ixgbevf_X550_vf_hv_info; extern const struct ixgbevf_info ixgbevf_X550EM_x_vf_hv_info; +extern const struct ixgbevf_info ixgbevf_e610_vf_hv_info; extern const struct ixgbe_mbx_operations ixgbevf_hv_mbx_ops;
/* needed by ethtool.c */ --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -1,5 +1,5 @@ // SPDX-License-Identifier: GPL-2.0 -/* Copyright(c) 1999 - 2018 Intel Corporation. */ +/* Copyright(c) 1999 - 2024 Intel Corporation. */
/****************************************************************************** Copyright (c)2006 - 2007 Myricom, Inc. for some LRO specific code @@ -39,7 +39,7 @@ static const char ixgbevf_driver_string[ "Intel(R) 10 Gigabit PCI Express Virtual Function Network Driver";
static char ixgbevf_copyright[] = - "Copyright (c) 2009 - 2018 Intel Corporation."; + "Copyright (c) 2009 - 2024 Intel Corporation.";
static const struct ixgbevf_info *ixgbevf_info_tbl[] = { [board_82599_vf] = &ixgbevf_82599_vf_info, @@ -51,6 +51,8 @@ static const struct ixgbevf_info *ixgbev [board_X550EM_x_vf] = &ixgbevf_X550EM_x_vf_info, [board_X550EM_x_vf_hv] = &ixgbevf_X550EM_x_vf_hv_info, [board_x550em_a_vf] = &ixgbevf_x550em_a_vf_info, + [board_e610_vf] = &ixgbevf_e610_vf_info, + [board_e610_vf_hv] = &ixgbevf_e610_vf_hv_info, };
/* ixgbevf_pci_tbl - PCI Device ID Table @@ -71,6 +73,9 @@ static const struct pci_device_id ixgbev {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF), board_X550EM_x_vf }, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_X_VF_HV), board_X550EM_x_vf_hv}, {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_X550EM_A_VF), board_x550em_a_vf }, + {PCI_VDEVICE_SUB(INTEL, IXGBE_DEV_ID_E610_VF, PCI_ANY_ID, + IXGBE_SUBDEV_ID_E610_VF_HV), board_e610_vf_hv}, + {PCI_VDEVICE(INTEL, IXGBE_DEV_ID_E610_VF), board_e610_vf}, /* required last entry */ {0, } }; @@ -4686,6 +4691,9 @@ static int ixgbevf_probe(struct pci_dev case ixgbe_mac_X540_vf: dev_info(&pdev->dev, "Intel(R) X540 Virtual Function\n"); break; + case ixgbe_mac_e610_vf: + dev_info(&pdev->dev, "Intel(R) E610 Virtual Function\n"); + break; case ixgbe_mac_82599_vf: default: dev_info(&pdev->dev, "Intel(R) 82599 Virtual Function\n"); --- a/drivers/net/ethernet/intel/ixgbevf/vf.c +++ b/drivers/net/ethernet/intel/ixgbevf/vf.c @@ -1,5 +1,5 @@ // SPDX-License-Identifier: GPL-2.0 -/* Copyright(c) 1999 - 2018 Intel Corporation. */ +/* Copyright(c) 1999 - 2024 Intel Corporation. */
#include "vf.h" #include "ixgbevf.h" @@ -1076,3 +1076,13 @@ const struct ixgbevf_info ixgbevf_x550em .mac = ixgbe_mac_x550em_a_vf, .mac_ops = &ixgbevf_mac_ops, }; + +const struct ixgbevf_info ixgbevf_e610_vf_info = { + .mac = ixgbe_mac_e610_vf, + .mac_ops = &ixgbevf_mac_ops, +}; + +const struct ixgbevf_info ixgbevf_e610_vf_hv_info = { + .mac = ixgbe_mac_e610_vf, + .mac_ops = &ixgbevf_hv_mac_ops, +}; --- a/drivers/net/ethernet/intel/ixgbevf/vf.h +++ b/drivers/net/ethernet/intel/ixgbevf/vf.h @@ -1,5 +1,5 @@ /* SPDX-License-Identifier: GPL-2.0 */ -/* Copyright(c) 1999 - 2018 Intel Corporation. */ +/* Copyright(c) 1999 - 2024 Intel Corporation. */
#ifndef __IXGBE_VF_H__ #define __IXGBE_VF_H__ @@ -54,6 +54,8 @@ enum ixgbe_mac_type { ixgbe_mac_X550_vf, ixgbe_mac_X550EM_x_vf, ixgbe_mac_x550em_a_vf, + ixgbe_mac_e610, + ixgbe_mac_e610_vf, ixgbe_num_macs };
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jedrzej Jagielski jedrzej.jagielski@intel.com
[ Upstream commit 53f0eb62b4d23d40686f2dd51776b8220f2887bb ]
E610 adapters no longer use the VFLINKS register to read PF's link speed and linkup state. As a result VF driver cannot get actual link state and it incorrectly reports 10G which is the default option. It leads to a situation where even 1G adapters print 10G as actual link speed. The same happens when PF driver set speed different than 10G.
Add new mailbox operation to let the VF driver request a PF driver to provide actual link data. Update the mailbox api to v1.6.
Incorporate both ways of getting link status within the legacy ixgbe_check_mac_link_vf() function.
Fixes: 4c44b450c69b ("ixgbevf: Add support for Intel(R) E610 device") Co-developed-by: Andrzej Wilczynski andrzejx.wilczynski@intel.com Signed-off-by: Andrzej Wilczynski andrzejx.wilczynski@intel.com Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com Reviewed-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Cc: stable@vger.kernel.org Signed-off-by: Jedrzej Jagielski jedrzej.jagielski@intel.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Jacob Keller jacob.e.keller@intel.com Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-2-ef32a425b92a@in... Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: a7075f501bd3 ("ixgbevf: fix mailbox API compatibility by negotiating supported features") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ixgbevf/defines.h | 1 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 6 drivers/net/ethernet/intel/ixgbevf/mbx.h | 4 drivers/net/ethernet/intel/ixgbevf/vf.c | 137 +++++++++++++++++----- 4 files changed, 116 insertions(+), 32 deletions(-)
--- a/drivers/net/ethernet/intel/ixgbevf/defines.h +++ b/drivers/net/ethernet/intel/ixgbevf/defines.h @@ -28,6 +28,7 @@
/* Link speed */ typedef u32 ixgbe_link_speed; +#define IXGBE_LINK_SPEED_UNKNOWN 0 #define IXGBE_LINK_SPEED_1GB_FULL 0x0020 #define IXGBE_LINK_SPEED_10GB_FULL 0x0080 #define IXGBE_LINK_SPEED_100_FULL 0x0008 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -2272,6 +2272,7 @@ static void ixgbevf_negotiate_api(struct { struct ixgbe_hw *hw = &adapter->hw; static const int api[] = { + ixgbe_mbox_api_16, ixgbe_mbox_api_15, ixgbe_mbox_api_14, ixgbe_mbox_api_13, @@ -2291,7 +2292,8 @@ static void ixgbevf_negotiate_api(struct idx++; }
- if (hw->api_version >= ixgbe_mbox_api_15) { + /* Following is not supported by API 1.6, it is specific for 1.5 */ + if (hw->api_version == ixgbe_mbox_api_15) { hw->mbx.ops.init_params(hw); memcpy(&hw->mbx.ops, &ixgbevf_mbx_ops, sizeof(struct ixgbe_mbx_operations)); @@ -2648,6 +2650,7 @@ static void ixgbevf_set_num_queues(struc case ixgbe_mbox_api_13: case ixgbe_mbox_api_14: case ixgbe_mbox_api_15: + case ixgbe_mbox_api_16: if (adapter->xdp_prog && hw->mac.max_tx_queues == rss) rss = rss > 3 ? 2 : 1; @@ -4641,6 +4644,7 @@ static int ixgbevf_probe(struct pci_dev case ixgbe_mbox_api_13: case ixgbe_mbox_api_14: case ixgbe_mbox_api_15: + case ixgbe_mbox_api_16: netdev->max_mtu = IXGBE_MAX_JUMBO_FRAME_SIZE - (ETH_HLEN + ETH_FCS_LEN); break; --- a/drivers/net/ethernet/intel/ixgbevf/mbx.h +++ b/drivers/net/ethernet/intel/ixgbevf/mbx.h @@ -66,6 +66,7 @@ enum ixgbe_pfvf_api_rev { ixgbe_mbox_api_13, /* API version 1.3, linux/freebsd VF driver */ ixgbe_mbox_api_14, /* API version 1.4, linux/freebsd VF driver */ ixgbe_mbox_api_15, /* API version 1.5, linux/freebsd VF driver */ + ixgbe_mbox_api_16, /* API version 1.6, linux/freebsd VF driver */ /* This value should always be last */ ixgbe_mbox_api_unknown, /* indicates that API version is not known */ }; @@ -102,6 +103,9 @@ enum ixgbe_pfvf_api_rev {
#define IXGBE_VF_GET_LINK_STATE 0x10 /* get vf link state */
+/* mailbox API, version 1.6 VF requests */ +#define IXGBE_VF_GET_PF_LINK_STATE 0x11 /* request PF to send link info */ + /* length of permanent address message returned from PF */ #define IXGBE_VF_PERMADDR_MSG_LEN 4 /* word in permanent address message with the current multicast type */ --- a/drivers/net/ethernet/intel/ixgbevf/vf.c +++ b/drivers/net/ethernet/intel/ixgbevf/vf.c @@ -313,6 +313,7 @@ int ixgbevf_get_reta_locked(struct ixgbe * is not supported for this device type. */ switch (hw->api_version) { + case ixgbe_mbox_api_16: case ixgbe_mbox_api_15: case ixgbe_mbox_api_14: case ixgbe_mbox_api_13: @@ -382,6 +383,7 @@ int ixgbevf_get_rss_key_locked(struct ix * or if the operation is not supported for this device type. */ switch (hw->api_version) { + case ixgbe_mbox_api_16: case ixgbe_mbox_api_15: case ixgbe_mbox_api_14: case ixgbe_mbox_api_13: @@ -552,6 +554,7 @@ static s32 ixgbevf_update_xcast_mode(str case ixgbe_mbox_api_13: case ixgbe_mbox_api_14: case ixgbe_mbox_api_15: + case ixgbe_mbox_api_16: break; default: return -EOPNOTSUPP; @@ -625,6 +628,48 @@ static s32 ixgbevf_hv_get_link_state_vf( }
/** + * ixgbevf_get_pf_link_state - Get PF's link status + * @hw: pointer to the HW structure + * @speed: link speed + * @link_up: indicate if link is up/down + * + * Ask PF to provide link_up state and speed of the link. + * + * Return: IXGBE_ERR_MBX in the case of mailbox error, + * -EOPNOTSUPP if the op is not supported or 0 on success. + */ +static int ixgbevf_get_pf_link_state(struct ixgbe_hw *hw, ixgbe_link_speed *speed, + bool *link_up) +{ + u32 msgbuf[3] = {}; + int err; + + switch (hw->api_version) { + case ixgbe_mbox_api_16: + break; + default: + return -EOPNOTSUPP; + } + + msgbuf[0] = IXGBE_VF_GET_PF_LINK_STATE; + + err = ixgbevf_write_msg_read_ack(hw, msgbuf, msgbuf, + ARRAY_SIZE(msgbuf)); + if (err || (msgbuf[0] & IXGBE_VT_MSGTYPE_FAILURE)) { + err = IXGBE_ERR_MBX; + *speed = IXGBE_LINK_SPEED_UNKNOWN; + /* No need to set @link_up to false as it will be done by + * ixgbe_check_mac_link_vf(). + */ + } else { + *speed = msgbuf[1]; + *link_up = msgbuf[2]; + } + + return err; +} + +/** * ixgbevf_set_vfta_vf - Set/Unset VLAN filter table address * @hw: pointer to the HW structure * @vlan: 12 bit VLAN ID @@ -659,6 +704,58 @@ mbx_err: }
/** + * ixgbe_read_vflinks - Read VFLINKS register + * @hw: pointer to the HW structure + * @speed: link speed + * @link_up: indicate if link is up/down + * + * Get linkup status and link speed from the VFLINKS register. + */ +static void ixgbe_read_vflinks(struct ixgbe_hw *hw, ixgbe_link_speed *speed, + bool *link_up) +{ + u32 vflinks = IXGBE_READ_REG(hw, IXGBE_VFLINKS); + + /* if link status is down no point in checking to see if PF is up */ + if (!(vflinks & IXGBE_LINKS_UP)) { + *link_up = false; + return; + } + + /* for SFP+ modules and DA cables on 82599 it can take up to 500usecs + * before the link status is correct + */ + if (hw->mac.type == ixgbe_mac_82599_vf) { + for (int i = 0; i < 5; i++) { + udelay(100); + vflinks = IXGBE_READ_REG(hw, IXGBE_VFLINKS); + + if (!(vflinks & IXGBE_LINKS_UP)) { + *link_up = false; + return; + } + } + } + + /* We reached this point so there's link */ + *link_up = true; + + switch (vflinks & IXGBE_LINKS_SPEED_82599) { + case IXGBE_LINKS_SPEED_10G_82599: + *speed = IXGBE_LINK_SPEED_10GB_FULL; + break; + case IXGBE_LINKS_SPEED_1G_82599: + *speed = IXGBE_LINK_SPEED_1GB_FULL; + break; + case IXGBE_LINKS_SPEED_100_82599: + *speed = IXGBE_LINK_SPEED_100_FULL; + break; + default: + *speed = IXGBE_LINK_SPEED_UNKNOWN; + } +} + +/** * ixgbevf_hv_set_vfta_vf - * Hyper-V variant - just a stub. * @hw: unused * @vlan: unused @@ -705,7 +802,6 @@ static s32 ixgbevf_check_mac_link_vf(str struct ixgbe_mbx_info *mbx = &hw->mbx; struct ixgbe_mac_info *mac = &hw->mac; s32 ret_val = 0; - u32 links_reg; u32 in_msg = 0;
/* If we were hit with a reset drop the link */ @@ -715,36 +811,14 @@ static s32 ixgbevf_check_mac_link_vf(str if (!mac->get_link_status) goto out;
- /* if link status is down no point in checking to see if pf is up */ - links_reg = IXGBE_READ_REG(hw, IXGBE_VFLINKS); - if (!(links_reg & IXGBE_LINKS_UP)) - goto out; - - /* for SFP+ modules and DA cables on 82599 it can take up to 500usecs - * before the link status is correct - */ - if (mac->type == ixgbe_mac_82599_vf) { - int i; - - for (i = 0; i < 5; i++) { - udelay(100); - links_reg = IXGBE_READ_REG(hw, IXGBE_VFLINKS); - - if (!(links_reg & IXGBE_LINKS_UP)) - goto out; - } - } - - switch (links_reg & IXGBE_LINKS_SPEED_82599) { - case IXGBE_LINKS_SPEED_10G_82599: - *speed = IXGBE_LINK_SPEED_10GB_FULL; - break; - case IXGBE_LINKS_SPEED_1G_82599: - *speed = IXGBE_LINK_SPEED_1GB_FULL; - break; - case IXGBE_LINKS_SPEED_100_82599: - *speed = IXGBE_LINK_SPEED_100_FULL; - break; + if (hw->mac.type == ixgbe_mac_e610_vf) { + ret_val = ixgbevf_get_pf_link_state(hw, speed, link_up); + if (ret_val) + goto out; + } else { + ixgbe_read_vflinks(hw, speed, link_up); + if (*link_up == false) + goto out; }
/* if the read failed it could just be a mailbox collision, best wait @@ -951,6 +1025,7 @@ int ixgbevf_get_queues(struct ixgbe_hw * case ixgbe_mbox_api_13: case ixgbe_mbox_api_14: case ixgbe_mbox_api_15: + case ixgbe_mbox_api_16: break; default: return 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jedrzej Jagielski jedrzej.jagielski@intel.com
[ Upstream commit a7075f501bd33c93570af759b6f4302ef0175168 ]
There was backward compatibility in the terms of mailbox API. Various drivers from various OSes supporting 10G adapters from Intel portfolio could easily negotiate mailbox API.
This convention has been broken since introducing API 1.4. Commit 0062e7cc955e ("ixgbevf: add VF IPsec offload code") added support for IPSec which is specific only for the kernel ixgbe driver. None of the rest of the Intel 10G PF/VF drivers supports it. And actually lack of support was not included in the IPSec implementation - there were no such code paths. No possibility to negotiate support for the feature was introduced along with introduction of the feature itself.
Commit 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") increasing API version to 1.5 did the same - it introduced code supported specifically by the PF ESX driver. It altered API version for the VF driver in the same time not touching the version defined for the PF ixgbe driver. It led to additional discrepancies, as the code provided within API 1.6 cannot be supported for Linux ixgbe driver as it causes crashes.
The issue was noticed some time ago and mitigated by Jake within the commit d0725312adf5 ("ixgbevf: stop attempting IPSEC offload on Mailbox API 1.5"). As a result we have regression for IPsec support and after increasing API to version 1.6 ixgbevf driver stopped to support ESX MBX.
To fix this mess add new mailbox op asking PF driver about supported features. Basing on a response determine whether to set support for IPSec and ESX-specific enhanced mailbox.
New mailbox op, for compatibility purposes, must be added within new API revision, as API version of OOT PF & VF drivers is already increased to 1.6 and doesn't incorporate features negotiate op.
Features negotiation mechanism gives possibility to be extended with new features when needed in the future.
Reported-by: Jacob Keller jacob.e.keller@intel.com Closes: https://lore.kernel.org/intel-wired-lan/20241101-jk-ixgbevf-mailbox-v1-5-fix... Fixes: 0062e7cc955e ("ixgbevf: add VF IPsec offload code") Fixes: 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") Reviewed-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com Reviewed-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Cc: stable@vger.kernel.org Signed-off-by: Jedrzej Jagielski jedrzej.jagielski@intel.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Jacob Keller jacob.e.keller@intel.com Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-4-ef32a425b92a@in... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ixgbevf/ipsec.c | 10 ++++ drivers/net/ethernet/intel/ixgbevf/ixgbevf.h | 7 +++ drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 32 ++++++++++++++- drivers/net/ethernet/intel/ixgbevf/mbx.h | 4 + drivers/net/ethernet/intel/ixgbevf/vf.c | 45 +++++++++++++++++++++- drivers/net/ethernet/intel/ixgbevf/vf.h | 1 6 files changed, 96 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/intel/ixgbevf/ipsec.c +++ b/drivers/net/ethernet/intel/ixgbevf/ipsec.c @@ -269,6 +269,9 @@ static int ixgbevf_ipsec_add_sa(struct x adapter = netdev_priv(dev); ipsec = adapter->ipsec;
+ if (!(adapter->pf_features & IXGBEVF_PF_SUP_IPSEC)) + return -EOPNOTSUPP; + if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) { netdev_err(dev, "Unsupported protocol 0x%04x for IPsec offload\n", xs->id.proto); @@ -394,6 +397,9 @@ static void ixgbevf_ipsec_del_sa(struct adapter = netdev_priv(dev); ipsec = adapter->ipsec;
+ if (!(adapter->pf_features & IXGBEVF_PF_SUP_IPSEC)) + return; + if (xs->xso.dir == XFRM_DEV_OFFLOAD_IN) { sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_RX_INDEX;
@@ -622,6 +628,10 @@ void ixgbevf_init_ipsec_offload(struct i size_t size;
switch (adapter->hw.api_version) { + case ixgbe_mbox_api_17: + if (!(adapter->pf_features & IXGBEVF_PF_SUP_IPSEC)) + return; + break; case ixgbe_mbox_api_14: break; default: --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h @@ -366,6 +366,13 @@ struct ixgbevf_adapter { /* Interrupt Throttle Rate */ u32 eitr_param;
+ u32 pf_features; +#define IXGBEVF_PF_SUP_IPSEC BIT(0) +#define IXGBEVF_PF_SUP_ESX_MBX BIT(1) + +#define IXGBEVF_SUPPORTED_FEATURES (IXGBEVF_PF_SUP_IPSEC | \ + IXGBEVF_PF_SUP_ESX_MBX) + struct ixgbevf_hw_stats stats;
unsigned long state; --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -2268,10 +2268,35 @@ static void ixgbevf_init_last_counter_st adapter->stats.base_vfmprc = adapter->stats.last_vfmprc; }
+/** + * ixgbevf_set_features - Set features supported by PF + * @adapter: pointer to the adapter struct + * + * Negotiate with PF supported features and then set pf_features accordingly. + */ +static void ixgbevf_set_features(struct ixgbevf_adapter *adapter) +{ + u32 *pf_features = &adapter->pf_features; + struct ixgbe_hw *hw = &adapter->hw; + int err; + + err = hw->mac.ops.negotiate_features(hw, pf_features); + if (err && err != -EOPNOTSUPP) + netdev_dbg(adapter->netdev, + "PF feature negotiation failed.\n"); + + /* Address also pre API 1.7 cases */ + if (hw->api_version == ixgbe_mbox_api_14) + *pf_features |= IXGBEVF_PF_SUP_IPSEC; + else if (hw->api_version == ixgbe_mbox_api_15) + *pf_features |= IXGBEVF_PF_SUP_ESX_MBX; +} + static void ixgbevf_negotiate_api(struct ixgbevf_adapter *adapter) { struct ixgbe_hw *hw = &adapter->hw; static const int api[] = { + ixgbe_mbox_api_17, ixgbe_mbox_api_16, ixgbe_mbox_api_15, ixgbe_mbox_api_14, @@ -2292,8 +2317,9 @@ static void ixgbevf_negotiate_api(struct idx++; }
- /* Following is not supported by API 1.6, it is specific for 1.5 */ - if (hw->api_version == ixgbe_mbox_api_15) { + ixgbevf_set_features(adapter); + + if (adapter->pf_features & IXGBEVF_PF_SUP_ESX_MBX) { hw->mbx.ops.init_params(hw); memcpy(&hw->mbx.ops, &ixgbevf_mbx_ops, sizeof(struct ixgbe_mbx_operations)); @@ -2651,6 +2677,7 @@ static void ixgbevf_set_num_queues(struc case ixgbe_mbox_api_14: case ixgbe_mbox_api_15: case ixgbe_mbox_api_16: + case ixgbe_mbox_api_17: if (adapter->xdp_prog && hw->mac.max_tx_queues == rss) rss = rss > 3 ? 2 : 1; @@ -4645,6 +4672,7 @@ static int ixgbevf_probe(struct pci_dev case ixgbe_mbox_api_14: case ixgbe_mbox_api_15: case ixgbe_mbox_api_16: + case ixgbe_mbox_api_17: netdev->max_mtu = IXGBE_MAX_JUMBO_FRAME_SIZE - (ETH_HLEN + ETH_FCS_LEN); break; --- a/drivers/net/ethernet/intel/ixgbevf/mbx.h +++ b/drivers/net/ethernet/intel/ixgbevf/mbx.h @@ -67,6 +67,7 @@ enum ixgbe_pfvf_api_rev { ixgbe_mbox_api_14, /* API version 1.4, linux/freebsd VF driver */ ixgbe_mbox_api_15, /* API version 1.5, linux/freebsd VF driver */ ixgbe_mbox_api_16, /* API version 1.6, linux/freebsd VF driver */ + ixgbe_mbox_api_17, /* API version 1.7, linux/freebsd VF driver */ /* This value should always be last */ ixgbe_mbox_api_unknown, /* indicates that API version is not known */ }; @@ -106,6 +107,9 @@ enum ixgbe_pfvf_api_rev { /* mailbox API, version 1.6 VF requests */ #define IXGBE_VF_GET_PF_LINK_STATE 0x11 /* request PF to send link info */
+/* mailbox API, version 1.7 VF requests */ +#define IXGBE_VF_FEATURES_NEGOTIATE 0x12 /* get features supported by PF*/ + /* length of permanent address message returned from PF */ #define IXGBE_VF_PERMADDR_MSG_LEN 4 /* word in permanent address message with the current multicast type */ --- a/drivers/net/ethernet/intel/ixgbevf/vf.c +++ b/drivers/net/ethernet/intel/ixgbevf/vf.c @@ -313,6 +313,7 @@ int ixgbevf_get_reta_locked(struct ixgbe * is not supported for this device type. */ switch (hw->api_version) { + case ixgbe_mbox_api_17: case ixgbe_mbox_api_16: case ixgbe_mbox_api_15: case ixgbe_mbox_api_14: @@ -383,6 +384,7 @@ int ixgbevf_get_rss_key_locked(struct ix * or if the operation is not supported for this device type. */ switch (hw->api_version) { + case ixgbe_mbox_api_17: case ixgbe_mbox_api_16: case ixgbe_mbox_api_15: case ixgbe_mbox_api_14: @@ -555,6 +557,7 @@ static s32 ixgbevf_update_xcast_mode(str case ixgbe_mbox_api_14: case ixgbe_mbox_api_15: case ixgbe_mbox_api_16: + case ixgbe_mbox_api_17: break; default: return -EOPNOTSUPP; @@ -646,6 +649,7 @@ static int ixgbevf_get_pf_link_state(str
switch (hw->api_version) { case ixgbe_mbox_api_16: + case ixgbe_mbox_api_17: break; default: return -EOPNOTSUPP; @@ -670,6 +674,42 @@ static int ixgbevf_get_pf_link_state(str }
/** + * ixgbevf_negotiate_features_vf - negotiate supported features with PF driver + * @hw: pointer to the HW structure + * @pf_features: bitmask of features supported by PF + * + * Return: IXGBE_ERR_MBX in the case of mailbox error, + * -EOPNOTSUPP if the op is not supported or 0 on success. + */ +static int ixgbevf_negotiate_features_vf(struct ixgbe_hw *hw, u32 *pf_features) +{ + u32 msgbuf[2] = {}; + int err; + + switch (hw->api_version) { + case ixgbe_mbox_api_17: + break; + default: + return -EOPNOTSUPP; + } + + msgbuf[0] = IXGBE_VF_FEATURES_NEGOTIATE; + msgbuf[1] = IXGBEVF_SUPPORTED_FEATURES; + + err = ixgbevf_write_msg_read_ack(hw, msgbuf, msgbuf, + ARRAY_SIZE(msgbuf)); + + if (err || (msgbuf[0] & IXGBE_VT_MSGTYPE_FAILURE)) { + err = IXGBE_ERR_MBX; + *pf_features = 0x0; + } else { + *pf_features = msgbuf[1]; + } + + return err; +} + +/** * ixgbevf_set_vfta_vf - Set/Unset VLAN filter table address * @hw: pointer to the HW structure * @vlan: 12 bit VLAN ID @@ -799,6 +839,7 @@ static s32 ixgbevf_check_mac_link_vf(str bool *link_up, bool autoneg_wait_to_complete) { + struct ixgbevf_adapter *adapter = hw->back; struct ixgbe_mbx_info *mbx = &hw->mbx; struct ixgbe_mac_info *mac = &hw->mac; s32 ret_val = 0; @@ -825,7 +866,7 @@ static s32 ixgbevf_check_mac_link_vf(str * until we are called again and don't report an error */ if (mbx->ops.read(hw, &in_msg, 1)) { - if (hw->api_version >= ixgbe_mbox_api_15) + if (adapter->pf_features & IXGBEVF_PF_SUP_ESX_MBX) mac->get_link_status = false; goto out; } @@ -1026,6 +1067,7 @@ int ixgbevf_get_queues(struct ixgbe_hw * case ixgbe_mbox_api_14: case ixgbe_mbox_api_15: case ixgbe_mbox_api_16: + case ixgbe_mbox_api_17: break; default: return 0; @@ -1080,6 +1122,7 @@ static const struct ixgbe_mac_operations .setup_link = ixgbevf_setup_mac_link_vf, .check_link = ixgbevf_check_mac_link_vf, .negotiate_api_version = ixgbevf_negotiate_api_version_vf, + .negotiate_features = ixgbevf_negotiate_features_vf, .set_rar = ixgbevf_set_rar_vf, .update_mc_addr_list = ixgbevf_update_mc_addr_list_vf, .update_xcast_mode = ixgbevf_update_xcast_mode, --- a/drivers/net/ethernet/intel/ixgbevf/vf.h +++ b/drivers/net/ethernet/intel/ixgbevf/vf.h @@ -26,6 +26,7 @@ struct ixgbe_mac_operations { s32 (*stop_adapter)(struct ixgbe_hw *); s32 (*get_bus_info)(struct ixgbe_hw *); s32 (*negotiate_api_version)(struct ixgbe_hw *hw, int api); + int (*negotiate_features)(struct ixgbe_hw *hw, u32 *pf_features);
/* Link */ s32 (*setup_link)(struct ixgbe_hw *, ixgbe_link_speed, bool, bool);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Devarsh Thakkar devarsht@ti.com
[ Upstream commit 2c27aaee934a1b5229152fe33a14f1fdf50da143 ]
Do read-modify-write so that we re-use the characterized reset value as specified in TRM [1] to program calibration wait time which defines number of cycles to wait for after startup state machine is in bandgap enable state.
This fixes PLL lock timeout error faced while using RPi DSI Panel on TI's AM62L and J721E SoC since earlier calibration wait time was getting overwritten to zero value thus failing the PLL to lockup and causing timeout.
[1] AM62P TRM (Section 14.8.6.3.2.1.1 DPHY_TX_DPHYTX_CMN0_CMN_DIG_TBIT2): Link: https://www.ti.com/lit/pdf/spruj83
Cc: stable@vger.kernel.org Fixes: 7a343c8bf4b5 ("phy: Add Cadence D-PHY support") Signed-off-by: Devarsh Thakkar devarsht@ti.com Tested-by: Harikrishna Shenoy h-shenoy@ti.com Reviewed-by: Tomi Valkeinen tomi.valkeinen@ideasonboard.com Link: https://lore.kernel.org/r/20250704125915.1224738-3-devarsht@ti.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/phy/cadence/cdns-dphy.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/phy/cadence/cdns-dphy.c +++ b/drivers/phy/cadence/cdns-dphy.c @@ -31,6 +31,7 @@
#define DPHY_CMN_SSM DPHY_PMA_CMN(0x20) #define DPHY_CMN_SSM_EN BIT(0) +#define DPHY_CMN_SSM_CAL_WAIT_TIME GENMASK(8, 1) #define DPHY_CMN_TX_MODE_EN BIT(9)
#define DPHY_CMN_PWM DPHY_PMA_CMN(0x40) @@ -422,7 +423,8 @@ static int cdns_dphy_power_on(struct phy writel(reg, dphy->regs + DPHY_BAND_CFG);
/* Start TX state machine. */ - writel(DPHY_CMN_SSM_EN | DPHY_CMN_TX_MODE_EN, + reg = readl(dphy->regs + DPHY_CMN_SSM); + writel((reg & DPHY_CMN_SSM_CAL_WAIT_TIME) | DPHY_CMN_SSM_EN | DPHY_CMN_TX_MODE_EN, dphy->regs + DPHY_CMN_SSM);
ret = cdns_dphy_wait_for_pll_lock(dphy);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kaushlendra Kumar kaushlendra.kumar@intel.com
[ Upstream commit 2eead19334516c8e9927c11b448fbe512b1f18a1 ]
Fix incorrect use of PTR_ERR_OR_ZERO() in topology_parse_cpu_capacity() which causes the code to proceed with NULL clock pointers. The current logic uses !PTR_ERR_OR_ZERO(cpu_clk) which evaluates to true for both valid pointers and NULL, leading to potential NULL pointer dereference in clk_get_rate().
Per include/linux/err.h documentation, PTR_ERR_OR_ZERO(ptr) returns: "The error code within @ptr if it is an error pointer; 0 otherwise."
This means PTR_ERR_OR_ZERO() returns 0 for both valid pointers AND NULL pointers. Therefore !PTR_ERR_OR_ZERO(cpu_clk) evaluates to true (proceed) when cpu_clk is either valid or NULL, causing clk_get_rate(NULL) to be called when of_clk_get() returns NULL.
Replace with !IS_ERR_OR_NULL(cpu_clk) which only proceeds for valid pointers, preventing potential NULL pointer dereference in clk_get_rate().
Cc: stable stable@kernel.org Signed-off-by: Kaushlendra Kumar kaushlendra.kumar@intel.com Reviewed-by: Sudeep Holla sudeep.holla@arm.com Fixes: b8fe128dad8f ("arch_topology: Adjust initial CPU capacities with current freq") Link: https://patch.msgid.link/20250923174308.1771906-1-kaushlendra.kumar@intel.co... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org [ Adjust context ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/arch_topology.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/base/arch_topology.c +++ b/drivers/base/arch_topology.c @@ -327,7 +327,7 @@ bool __init topology_parse_cpu_capacity( * frequency (by keeping the initial freq_factor value). */ cpu_clk = of_clk_get(cpu_node, 0); - if (!PTR_ERR_OR_ZERO(cpu_clk)) { + if (!IS_ERR_OR_NULL(cpu_clk)) { per_cpu(freq_factor, cpu) = clk_get_rate(cpu_clk) / 1000; clk_put(cpu_clk);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Darrick J. Wong" djwong@kernel.org
[ Upstream commit 630785bfbe12c3ee3ebccd8b530a98d632b7e39d ]
The deprecation of the 'attr2' mount option in 6.18 wasn't entirely successful because nobody noticed that the kernel never printed a warning about attr2 being set in fstab if the only xfs filesystem is the root fs; the initramfs mounts the root fs with no mount options; and the init scripts only conveyed the fstab options by remounting the root fs.
Fix this by making it complain all the time.
Cc: stable@vger.kernel.org # v5.13 Fixes: 92cf7d36384b99 ("xfs: Skip repetitive warnings about mount options") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Carlos Maiolino cmaiolino@redhat.com Signed-off-by: Carlos Maiolino cem@kernel.org [ Update existing xfs_fs_warn_deprecated() callers ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/xfs/xfs_super.c | 33 +++++++++++++++++++++------------ 1 file changed, 21 insertions(+), 12 deletions(-)
--- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1201,16 +1201,25 @@ suffix_kstrtoint( static inline void xfs_fs_warn_deprecated( struct fs_context *fc, - struct fs_parameter *param, - uint64_t flag, - bool value) + struct fs_parameter *param) { - /* Don't print the warning if reconfiguring and current mount point - * already had the flag set + /* + * Always warn about someone passing in a deprecated mount option. + * Previously we wouldn't print the warning if we were reconfiguring + * and current mount point already had the flag set, but that was not + * the right thing to do. + * + * Many distributions mount the root filesystem with no options in the + * initramfs and rely on mount -a to remount the root fs with the + * options in fstab. However, the old behavior meant that there would + * never be a warning about deprecated mount options for the root fs in + * /etc/fstab. On a single-fs system, that means no warning at all. + * + * Compounding this problem are distribution scripts that copy + * /proc/mounts to fstab, which means that we can't remove mount + * options unless we're 100% sure they have only ever been advertised + * in /proc/mounts in response to explicitly provided mount options. */ - if ((fc->purpose & FS_CONTEXT_FOR_RECONFIGURE) && - !!(XFS_M(fc->root->d_sb)->m_features & flag) == value) - return; xfs_warn(fc->s_fs_info, "%s mount option is deprecated.", param->key); }
@@ -1349,19 +1358,19 @@ xfs_fs_parse_param( #endif /* Following mount options will be removed in September 2025 */ case Opt_ikeep: - xfs_fs_warn_deprecated(fc, param, XFS_FEAT_IKEEP, true); + xfs_fs_warn_deprecated(fc, param); parsing_mp->m_features |= XFS_FEAT_IKEEP; return 0; case Opt_noikeep: - xfs_fs_warn_deprecated(fc, param, XFS_FEAT_IKEEP, false); + xfs_fs_warn_deprecated(fc, param); parsing_mp->m_features &= ~XFS_FEAT_IKEEP; return 0; case Opt_attr2: - xfs_fs_warn_deprecated(fc, param, XFS_FEAT_ATTR2, true); + xfs_fs_warn_deprecated(fc, param); parsing_mp->m_features |= XFS_FEAT_ATTR2; return 0; case Opt_noattr2: - xfs_fs_warn_deprecated(fc, param, XFS_FEAT_NOATTR2, true); + xfs_fs_warn_deprecated(fc, param); parsing_mp->m_features |= XFS_FEAT_NOATTR2; return 0; default:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maarten Lankhorst dev@lankhorst.se
[ Upstream commit a91c8096590bd7801a26454789f2992094fe36da ]
The original code causes a circular locking dependency found by lockdep.
====================================================== WARNING: possible circular locking dependency detected 6.16.0-rc6-lgci-xe-xe-pw-151626v3+ #1 Tainted: G S U ------------------------------------------------------ xe_fault_inject/5091 is trying to acquire lock: ffff888156815688 ((work_completion)(&(&devcd->del_wk)->work)){+.+.}-{0:0}, at: __flush_work+0x25d/0x660
but task is already holding lock:
ffff888156815620 (&devcd->mutex){+.+.}-{3:3}, at: dev_coredump_put+0x3f/0xa0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&devcd->mutex){+.+.}-{3:3}: mutex_lock_nested+0x4e/0xc0 devcd_data_write+0x27/0x90 sysfs_kf_bin_write+0x80/0xf0 kernfs_fop_write_iter+0x169/0x220 vfs_write+0x293/0x560 ksys_write+0x72/0xf0 __x64_sys_write+0x19/0x30 x64_sys_call+0x2bf/0x2660 do_syscall_64+0x93/0xb60 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #1 (kn->active#236){++++}-{0:0}: kernfs_drain+0x1e2/0x200 __kernfs_remove+0xae/0x400 kernfs_remove_by_name_ns+0x5d/0xc0 remove_files+0x54/0x70 sysfs_remove_group+0x3d/0xa0 sysfs_remove_groups+0x2e/0x60 device_remove_attrs+0xc7/0x100 device_del+0x15d/0x3b0 devcd_del+0x19/0x30 process_one_work+0x22b/0x6f0 worker_thread+0x1e8/0x3d0 kthread+0x11c/0x250 ret_from_fork+0x26c/0x2e0 ret_from_fork_asm+0x1a/0x30 -> #0 ((work_completion)(&(&devcd->del_wk)->work)){+.+.}-{0:0}: __lock_acquire+0x1661/0x2860 lock_acquire+0xc4/0x2f0 __flush_work+0x27a/0x660 flush_delayed_work+0x5d/0xa0 dev_coredump_put+0x63/0xa0 xe_driver_devcoredump_fini+0x12/0x20 [xe] devm_action_release+0x12/0x30 release_nodes+0x3a/0x120 devres_release_all+0x8a/0xd0 device_unbind_cleanup+0x12/0x80 device_release_driver_internal+0x23a/0x280 device_driver_detach+0x14/0x20 unbind_store+0xaf/0xc0 drv_attr_store+0x21/0x50 sysfs_kf_write+0x4a/0x80 kernfs_fop_write_iter+0x169/0x220 vfs_write+0x293/0x560 ksys_write+0x72/0xf0 __x64_sys_write+0x19/0x30 x64_sys_call+0x2bf/0x2660 do_syscall_64+0x93/0xb60 entry_SYSCALL_64_after_hwframe+0x76/0x7e other info that might help us debug this: Chain exists of: (work_completion)(&(&devcd->del_wk)->work) --> kn->active#236 --> &devcd->mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&devcd->mutex); lock(kn->active#236); lock(&devcd->mutex); lock((work_completion)(&(&devcd->del_wk)->work)); *** DEADLOCK *** 5 locks held by xe_fault_inject/5091: #0: ffff8881129f9488 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x72/0xf0 #1: ffff88810c755078 (&of->mutex#2){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x123/0x220 #2: ffff8881054811a0 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x55/0x280 #3: ffff888156815620 (&devcd->mutex){+.+.}-{3:3}, at: dev_coredump_put+0x3f/0xa0 #4: ffffffff8359e020 (rcu_read_lock){....}-{1:2}, at: __flush_work+0x72/0x660 stack backtrace: CPU: 14 UID: 0 PID: 5091 Comm: xe_fault_inject Tainted: G S U 6.16.0-rc6-lgci-xe-xe-pw-151626v3+ #1 PREEMPT_{RT,(lazy)} Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A DDR4(MS-7D25), BIOS 1.10 12/13/2021 Call Trace: <TASK> dump_stack_lvl+0x91/0xf0 dump_stack+0x10/0x20 print_circular_bug+0x285/0x360 check_noncircular+0x135/0x150 ? register_lock_class+0x48/0x4a0 __lock_acquire+0x1661/0x2860 lock_acquire+0xc4/0x2f0 ? __flush_work+0x25d/0x660 ? mark_held_locks+0x46/0x90 ? __flush_work+0x25d/0x660 __flush_work+0x27a/0x660 ? __flush_work+0x25d/0x660 ? trace_hardirqs_on+0x1e/0xd0 ? __pfx_wq_barrier_func+0x10/0x10 flush_delayed_work+0x5d/0xa0 dev_coredump_put+0x63/0xa0 xe_driver_devcoredump_fini+0x12/0x20 [xe] devm_action_release+0x12/0x30 release_nodes+0x3a/0x120 devres_release_all+0x8a/0xd0 device_unbind_cleanup+0x12/0x80 device_release_driver_internal+0x23a/0x280 ? bus_find_device+0xa8/0xe0 device_driver_detach+0x14/0x20 unbind_store+0xaf/0xc0 drv_attr_store+0x21/0x50 sysfs_kf_write+0x4a/0x80 kernfs_fop_write_iter+0x169/0x220 vfs_write+0x293/0x560 ksys_write+0x72/0xf0 __x64_sys_write+0x19/0x30 x64_sys_call+0x2bf/0x2660 do_syscall_64+0x93/0xb60 ? __f_unlock_pos+0x15/0x20 ? __x64_sys_getdents64+0x9b/0x130 ? __pfx_filldir64+0x10/0x10 ? do_syscall_64+0x1a2/0xb60 ? clear_bhb_loop+0x30/0x80 ? clear_bhb_loop+0x30/0x80 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x76e292edd574 Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d d5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89 RSP: 002b:00007fffe247a828 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000076e292edd574 RDX: 000000000000000c RSI: 00006267f6306063 RDI: 000000000000000b RBP: 000000000000000c R08: 000076e292fc4b20 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000202 R12: 00006267f6306063 R13: 000000000000000b R14: 00006267e6859c00 R15: 000076e29322a000 </TASK> xe 0000:03:00.0: [drm] Xe device coredump has been deleted.
Fixes: 01daccf74832 ("devcoredump : Serialize devcd_del work") Cc: Mukesh Ojha quic_mojha@quicinc.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Johannes Berg johannes@sipsolutions.net Cc: Rafael J. Wysocki rafael@kernel.org Cc: Danilo Krummrich dakr@kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # v6.1+ Signed-off-by: Maarten Lankhorst dev@lankhorst.se Cc: Matthew Brost matthew.brost@intel.com Acked-by: Mukesh Ojha mukesh.ojha@oss.qualcomm.com Link: https://lore.kernel.org/r/20250723142416.1020423-1-dev@lankhorst.se Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org [ replaced disable_delayed_work_sync() with cancel_delayed_work_sync() ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/devcoredump.c | 138 +++++++++++++++++++++++++++------------------ 1 file changed, 84 insertions(+), 54 deletions(-)
--- a/drivers/base/devcoredump.c +++ b/drivers/base/devcoredump.c @@ -26,50 +26,46 @@ struct devcd_entry { void *data; size_t datalen; /* - * Here, mutex is required to serialize the calls to del_wk work between - * user/kernel space which happens when devcd is added with device_add() - * and that sends uevent to user space. User space reads the uevents, - * and calls to devcd_data_write() which try to modify the work which is - * not even initialized/queued from devcoredump. + * There are 2 races for which mutex is required. * + * The first race is between device creation and userspace writing to + * schedule immediately destruction. * + * This race is handled by arming the timer before device creation, but + * when device creation fails the timer still exists. * - * cpu0(X) cpu1(Y) + * To solve this, hold the mutex during device_add(), and set + * init_completed on success before releasing the mutex. * - * dev_coredump() uevent sent to user space - * device_add() ======================> user space process Y reads the - * uevents writes to devcd fd - * which results into writes to + * That way the timer will never fire until device_add() is called, + * it will do nothing if init_completed is not set. The timer is also + * cancelled in that case. * - * devcd_data_write() - * mod_delayed_work() - * try_to_grab_pending() - * del_timer() - * debug_assert_init() - * INIT_DELAYED_WORK() - * schedule_delayed_work() - * - * - * Also, mutex alone would not be enough to avoid scheduling of - * del_wk work after it get flush from a call to devcd_free() - * mentioned as below. - * - * disabled_store() - * devcd_free() - * mutex_lock() devcd_data_write() - * flush_delayed_work() - * mutex_unlock() - * mutex_lock() - * mod_delayed_work() - * mutex_unlock() - * So, delete_work flag is required. + * The second race involves multiple parallel invocations of devcd_free(), + * add a deleted flag so only 1 can call the destructor. */ struct mutex mutex; - bool delete_work; + bool init_completed, deleted; struct module *owner; ssize_t (*read)(char *buffer, loff_t offset, size_t count, void *data, size_t datalen); void (*free)(void *data); + /* + * If nothing interferes and device_add() was returns success, + * del_wk will destroy the device after the timer fires. + * + * Multiple userspace processes can interfere in the working of the timer: + * - Writing to the coredump will reschedule the timer to run immediately, + * if still armed. + * + * This is handled by using "if (cancel_delayed_work()) { + * schedule_delayed_work() }", to prevent re-arming after having + * been previously fired. + * - Writing to /sys/class/devcoredump/disabled will destroy the + * coredump synchronously. + * This is handled by using disable_delayed_work_sync(), and then + * checking if deleted flag is set with &devcd->mutex held. + */ struct delayed_work del_wk; struct device *failing_dev; }; @@ -98,14 +94,27 @@ static void devcd_dev_release(struct dev kfree(devcd); }
+static void __devcd_del(struct devcd_entry *devcd) +{ + devcd->deleted = true; + device_del(&devcd->devcd_dev); + put_device(&devcd->devcd_dev); +} + static void devcd_del(struct work_struct *wk) { struct devcd_entry *devcd; + bool init_completed;
devcd = container_of(wk, struct devcd_entry, del_wk.work);
- device_del(&devcd->devcd_dev); - put_device(&devcd->devcd_dev); + /* devcd->mutex serializes against dev_coredumpm_timeout */ + mutex_lock(&devcd->mutex); + init_completed = devcd->init_completed; + mutex_unlock(&devcd->mutex); + + if (init_completed) + __devcd_del(devcd); }
static ssize_t devcd_data_read(struct file *filp, struct kobject *kobj, @@ -125,12 +134,12 @@ static ssize_t devcd_data_write(struct f struct device *dev = kobj_to_dev(kobj); struct devcd_entry *devcd = dev_to_devcd(dev);
- mutex_lock(&devcd->mutex); - if (!devcd->delete_work) { - devcd->delete_work = true; - mod_delayed_work(system_wq, &devcd->del_wk, 0); - } - mutex_unlock(&devcd->mutex); + /* + * Although it's tempting to use mod_delayed work here, + * that will cause a reschedule if the timer already fired. + */ + if (cancel_delayed_work(&devcd->del_wk)) + schedule_delayed_work(&devcd->del_wk, 0);
return count; } @@ -158,11 +167,21 @@ static int devcd_free(struct device *dev { struct devcd_entry *devcd = dev_to_devcd(dev);
+ /* + * To prevent a race with devcd_data_write(), cancel work and + * complete manually instead. + * + * We cannot rely on the return value of + * cancel_delayed_work_sync() here, because it might be in the + * middle of a cancel_delayed_work + schedule_delayed_work pair. + * + * devcd->mutex here guards against multiple parallel invocations + * of devcd_free(). + */ + cancel_delayed_work_sync(&devcd->del_wk); mutex_lock(&devcd->mutex); - if (!devcd->delete_work) - devcd->delete_work = true; - - flush_delayed_work(&devcd->del_wk); + if (!devcd->deleted) + __devcd_del(devcd); mutex_unlock(&devcd->mutex); return 0; } @@ -186,12 +205,10 @@ static ssize_t disabled_show(struct clas * put_device() <- last reference * error = fn(dev, data) devcd_dev_release() * devcd_free(dev, data) kfree(devcd) - * mutex_lock(&devcd->mutex); * * - * In the above diagram, It looks like disabled_store() would be racing with parallely - * running devcd_del() and result in memory abort while acquiring devcd->mutex which - * is called after kfree of devcd memory after dropping its last reference with + * In the above diagram, it looks like disabled_store() would be racing with parallelly + * running devcd_del() and result in memory abort after dropping its last reference with * put_device(). However, this will not happens as fn(dev, data) runs * with its own reference to device via klist_node so it is not its last reference. * so, above situation would not occur. @@ -353,7 +370,7 @@ void dev_coredumpm(struct device *dev, s devcd->read = read; devcd->free = free; devcd->failing_dev = get_device(dev); - devcd->delete_work = false; + devcd->deleted = false;
mutex_init(&devcd->mutex); device_initialize(&devcd->devcd_dev); @@ -362,8 +379,14 @@ void dev_coredumpm(struct device *dev, s atomic_inc_return(&devcd_count)); devcd->devcd_dev.class = &devcd_class;
- mutex_lock(&devcd->mutex); dev_set_uevent_suppress(&devcd->devcd_dev, true); + + /* devcd->mutex prevents devcd_del() completing until init finishes */ + mutex_lock(&devcd->mutex); + devcd->init_completed = false; + INIT_DELAYED_WORK(&devcd->del_wk, devcd_del); + schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT); + if (device_add(&devcd->devcd_dev)) goto put_device;
@@ -380,13 +403,20 @@ void dev_coredumpm(struct device *dev, s
dev_set_uevent_suppress(&devcd->devcd_dev, false); kobject_uevent(&devcd->devcd_dev.kobj, KOBJ_ADD); - INIT_DELAYED_WORK(&devcd->del_wk, devcd_del); - schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT); + + /* + * Safe to run devcd_del() now that we are done with devcd_dev. + * Alternatively we could have taken a ref on devcd_dev before + * dropping the lock. + */ + devcd->init_completed = true; mutex_unlock(&devcd->mutex); return; put_device: - put_device(&devcd->devcd_dev); mutex_unlock(&devcd->mutex); + cancel_delayed_work_sync(&devcd->del_wk); + put_device(&devcd->devcd_dev); + put_module: module_put(owner); free:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Babu Moger babu.moger@amd.com
[ Upstream commit 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92 ]
Users can create as many monitoring groups as the number of RMIDs supported by the hardware. However, on AMD systems, only a limited number of RMIDs are guaranteed to be actively tracked by the hardware. RMIDs that exceed this limit are placed in an "Unavailable" state.
When a bandwidth counter is read for such an RMID, the hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62). When such an RMID starts being tracked again the hardware counter is reset to zero. MSR_IA32_QM_CTR.Unavailable remains set on first read after tracking re-starts and is clear on all subsequent reads as long as the RMID is tracked.
resctrl miscounts the bandwidth events after an RMID transitions from the "Unavailable" state back to being tracked. This happens because when the hardware starts counting again after resetting the counter to zero, resctrl in turn compares the new count against the counter value stored from the previous time the RMID was tracked.
This results in resctrl computing an event value that is either undercounting (when new counter is more than stored counter) or a mistaken overflow (when new counter is less than stored counter).
Reset the stored value (arch_mbm_state::prev_msr) of MSR_IA32_QM_CTR to zero whenever the RMID is in the "Unavailable" state to ensure accurate counting after the RMID resets to zero when it starts to be tracked again.
Example scenario that results in mistaken overflow ================================================== 1. The resctrl filesystem is mounted, and a task is assigned to a monitoring group.
$mount -t resctrl resctrl /sys/fs/resctrl $mkdir /sys/fs/resctrl/mon_groups/test1/ $echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes 21323 <- Total bytes on domain 0 "Unavailable" <- Total bytes on domain 1
Task is running on domain 0. Counter on domain 1 is "Unavailable".
2. The task runs on domain 0 for a while and then moves to domain 1. The counter starts incrementing on domain 1.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes 7345357 <- Total bytes on domain 0 4545 <- Total bytes on domain 1
3. At some point, the RMID in domain 0 transitions to the "Unavailable" state because the task is no longer executing in that domain.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes "Unavailable" <- Total bytes on domain 0 434341 <- Total bytes on domain 1
4. Since the task continues to migrate between domains, it may eventually return to domain 0.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes 17592178699059 <- Overflow on domain 0 3232332 <- Total bytes on domain 1
In this case, the RMID on domain 0 transitions from "Unavailable" state to active state. The hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62) when the counter is read and begins tracking the RMID counting from 0.
Subsequent reads succeed but return a value smaller than the previously saved MSR value (7345357). Consequently, the resctrl's overflow logic is triggered, it compares the previous value (7345357) with the new, smaller value and incorrectly interprets this as a counter overflow, adding a large delta.
In reality, this is a false positive: the counter did not overflow but was simply reset when the RMID transitioned from "Unavailable" back to active state.
Here is the text from APM [1] available from [2].
"In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the first QM_CTR read when it begins tracking an RMID that it was not previously tracking. The U bit will be zero for all subsequent reads from that RMID while it is still tracked by the hardware. Therefore, a QM_CTR read with the U bit set when that RMID is in use by a processor can be considered 0 when calculating the difference with a subsequent read."
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory Bandwidth (MBM).
[ bp: Split commit message into smaller paragraph chunks for better consumption. ]
Fixes: 4d05bf71f157d ("x86/resctrl: Introduce AMD QOS feature") Signed-off-by: Babu Moger babu.moger@amd.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Reviewed-by: Reinette Chatre reinette.chatre@intel.com Tested-by: Reinette Chatre reinette.chatre@intel.com Cc: stable@vger.kernel.org # needs adjustments for <= v6.17 Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2] (cherry picked from commit 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92) [babu.moger@amd.com: Fix conflict for v6.1 stable] Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/resctrl/monitor.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -224,11 +224,15 @@ int resctrl_arch_rmid_read(struct rdt_re if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask)) return -EINVAL;
+ am = get_arch_mbm_state(hw_dom, rmid, eventid); + ret = __rmid_read(rmid, eventid, &msr_val); - if (ret) + if (ret) { + if (am && ret == -EINVAL) + am->prev_msr = 0; return ret; + }
- am = get_arch_mbm_state(hw_dom, rmid, eventid); if (am) { am->chunks += mbm_overflow_count(am->prev_msr, msr_val, hw_res->mbm_width);
On 10/27/25 13:36, Greg Kroah-Hartman wrote:
6.1-stable review patch. If anyone has any objections, please let me know.
From: Babu Moger babu.moger@amd.com
[ Upstream commit 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92 ]
Users can create as many monitoring groups as the number of RMIDs supported by the hardware. However, on AMD systems, only a limited number of RMIDs are guaranteed to be actively tracked by the hardware. RMIDs that exceed this limit are placed in an "Unavailable" state.
When a bandwidth counter is read for such an RMID, the hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62). When such an RMID starts being tracked again the hardware counter is reset to zero. MSR_IA32_QM_CTR.Unavailable remains set on first read after tracking re-starts and is clear on all subsequent reads as long as the RMID is tracked.
resctrl miscounts the bandwidth events after an RMID transitions from the "Unavailable" state back to being tracked. This happens because when the hardware starts counting again after resetting the counter to zero, resctrl in turn compares the new count against the counter value stored from the previous time the RMID was tracked.
This results in resctrl computing an event value that is either undercounting (when new counter is more than stored counter) or a mistaken overflow (when new counter is less than stored counter).
Reset the stored value (arch_mbm_state::prev_msr) of MSR_IA32_QM_CTR to zero whenever the RMID is in the "Unavailable" state to ensure accurate counting after the RMID resets to zero when it starts to be tracked again.
Example scenario that results in mistaken overflow
The resctrl filesystem is mounted, and a task is assigned to a monitoring group.
$mount -t resctrl resctrl /sys/fs/resctrl $mkdir /sys/fs/resctrl/mon_groups/test1/ $echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes 21323 <- Total bytes on domain 0 "Unavailable" <- Total bytes on domain 1
Task is running on domain 0. Counter on domain 1 is "Unavailable".
The task runs on domain 0 for a while and then moves to domain 1. The counter starts incrementing on domain 1.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes 7345357 <- Total bytes on domain 0 4545 <- Total bytes on domain 1
At some point, the RMID in domain 0 transitions to the "Unavailable" state because the task is no longer executing in that domain.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes "Unavailable" <- Total bytes on domain 0 434341 <- Total bytes on domain 1
Since the task continues to migrate between domains, it may eventually return to domain 0.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes 17592178699059 <- Overflow on domain 0 3232332 <- Total bytes on domain 1
In this case, the RMID on domain 0 transitions from "Unavailable" state to active state. The hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62) when the counter is read and begins tracking the RMID counting from 0.
Subsequent reads succeed but return a value smaller than the previously saved MSR value (7345357). Consequently, the resctrl's overflow logic is triggered, it compares the previous value (7345357) with the new, smaller value and incorrectly interprets this as a counter overflow, adding a large delta.
In reality, this is a false positive: the counter did not overflow but was simply reset when the RMID transitioned from "Unavailable" back to active state.
Here is the text from APM [1] available from [2].
"In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the first QM_CTR read when it begins tracking an RMID that it was not previously tracking. The U bit will be zero for all subsequent reads from that RMID while it is still tracked by the hardware. Therefore, a QM_CTR read with the U bit set when that RMID is in use by a processor can be considered 0 when calculating the difference with a subsequent read."
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory Bandwidth (MBM).
[ bp: Split commit message into smaller paragraph chunks for better consumption. ]
Fixes: 4d05bf71f157d ("x86/resctrl: Introduce AMD QOS feature") Signed-off-by: Babu Moger babu.moger@amd.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Reviewed-by: Reinette Chatre reinette.chatre@intel.com Tested-by: Reinette Chatre reinette.chatre@intel.com Cc: stable@vger.kernel.org # needs adjustments for <= v6.17 Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2] (cherry picked from commit 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92) [babu.moger@amd.com: Fix conflict for v6.1 stable] Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Tested-by: Babu Moger babu.moger@amd.com
Thanks
Babu Moger
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
[ Upstream commit b2d99376c5d61eb60ffdb6c503e4b6c8f9712ddd ]
ksmbd.mount will give each interfaces list and bind_interfaces_only flags to ksmbd server. Previously, the interfaces list was sent only when bind_interfaces_only was enabled. ksmbd server browse only interfaces list given from ksmbd.conf on FSCTL_QUERY_INTERFACE_INFO IOCTL.
Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/ksmbd_netlink.h | 3 + fs/smb/server/server.h | 1 fs/smb/server/smb2pdu.c | 4 ++ fs/smb/server/transport_ipc.c | 1 fs/smb/server/transport_tcp.c | 67 +++++++++++++++++++----------------------- fs/smb/server/transport_tcp.h | 1 6 files changed, 40 insertions(+), 37 deletions(-)
--- a/fs/smb/server/ksmbd_netlink.h +++ b/fs/smb/server/ksmbd_netlink.h @@ -107,8 +107,9 @@ struct ksmbd_startup_request { __u32 smb2_max_credits; /* MAX credits */ __u32 smbd_max_io_size; /* smbd read write size */ __u32 max_connections; /* Number of maximum simultaneous connections */ + __s8 bind_interfaces_only; __u32 max_ip_connections; /* Number of maximum connection per ip address */ - __u32 reserved[125]; /* Reserved room */ + __s8 reserved[499]; /* Reserved room */ __u32 ifc_list_sz; /* interfaces list size */ __s8 ____payload[]; } __packed; --- a/fs/smb/server/server.h +++ b/fs/smb/server/server.h @@ -45,6 +45,7 @@ struct ksmbd_server_config { unsigned int max_ip_connections;
char *conf[SERVER_CONF_WORK_GROUP + 1]; + bool bind_interfaces_only; };
extern struct ksmbd_server_config server_conf; --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -37,6 +37,7 @@ #include "mgmt/user_session.h" #include "mgmt/ksmbd_ida.h" #include "ndr.h" +#include "transport_tcp.h"
static void __wbuf(struct ksmbd_work *work, void **req, void **rsp) { @@ -7423,6 +7424,9 @@ static int fsctl_query_iface_info_ioctl( if (netdev->type == ARPHRD_LOOPBACK) continue;
+ if (!ksmbd_find_netdev_name_iface_list(netdev->name)) + continue; + flags = dev_get_flags(netdev); if (!(flags & IFF_RUNNING)) continue; --- a/fs/smb/server/transport_ipc.c +++ b/fs/smb/server/transport_ipc.c @@ -324,6 +324,7 @@ static int ipc_server_config_on_startup( ret = ksmbd_set_netbios_name(req->netbios_name); ret |= ksmbd_set_server_string(req->server_string); ret |= ksmbd_set_work_group(req->work_group); + server_conf.bind_interfaces_only = req->bind_interfaces_only; ret |= ksmbd_tcp_set_interfaces(KSMBD_STARTUP_CONFIG_INTERFACES(req), req->ifc_list_sz); out: --- a/fs/smb/server/transport_tcp.c +++ b/fs/smb/server/transport_tcp.c @@ -544,30 +544,37 @@ out_clear: return ret; }
+struct interface *ksmbd_find_netdev_name_iface_list(char *netdev_name) +{ + struct interface *iface; + + list_for_each_entry(iface, &iface_list, entry) + if (!strcmp(iface->name, netdev_name)) + return iface; + return NULL; +} + static int ksmbd_netdev_event(struct notifier_block *nb, unsigned long event, void *ptr) { struct net_device *netdev = netdev_notifier_info_to_dev(ptr); struct interface *iface; - int ret, found = 0; + int ret;
switch (event) { case NETDEV_UP: if (netif_is_bridge_port(netdev)) return NOTIFY_OK;
- list_for_each_entry(iface, &iface_list, entry) { - if (!strcmp(iface->name, netdev->name)) { - found = 1; - if (iface->state != IFACE_STATE_DOWN) - break; - ret = create_socket(iface); - if (ret) - return NOTIFY_OK; - break; - } + iface = ksmbd_find_netdev_name_iface_list(netdev->name); + if (iface && iface->state == IFACE_STATE_DOWN) { + ksmbd_debug(CONN, "netdev-up event: netdev(%s) is going up\n", + iface->name); + ret = create_socket(iface); + if (ret) + return NOTIFY_OK; } - if (!found && bind_additional_ifaces) { + if (!iface && bind_additional_ifaces) { iface = alloc_iface(kstrdup(netdev->name, GFP_KERNEL)); if (!iface) return NOTIFY_OK; @@ -577,19 +584,19 @@ static int ksmbd_netdev_event(struct not } break; case NETDEV_DOWN: - list_for_each_entry(iface, &iface_list, entry) { - if (!strcmp(iface->name, netdev->name) && - iface->state == IFACE_STATE_CONFIGURED) { - tcp_stop_kthread(iface->ksmbd_kthread); - iface->ksmbd_kthread = NULL; - mutex_lock(&iface->sock_release_lock); - tcp_destroy_socket(iface->ksmbd_socket); - iface->ksmbd_socket = NULL; - mutex_unlock(&iface->sock_release_lock); + iface = ksmbd_find_netdev_name_iface_list(netdev->name); + if (iface && iface->state == IFACE_STATE_CONFIGURED) { + ksmbd_debug(CONN, "netdev-down event: netdev(%s) is going down\n", + iface->name); + tcp_stop_kthread(iface->ksmbd_kthread); + iface->ksmbd_kthread = NULL; + mutex_lock(&iface->sock_release_lock); + tcp_destroy_socket(iface->ksmbd_socket); + iface->ksmbd_socket = NULL; + mutex_unlock(&iface->sock_release_lock);
- iface->state = IFACE_STATE_DOWN; - break; - } + iface->state = IFACE_STATE_DOWN; + break; } break; } @@ -658,18 +665,6 @@ int ksmbd_tcp_set_interfaces(char *ifc_l int sz = 0;
if (!ifc_list_sz) { - struct net_device *netdev; - - rtnl_lock(); - for_each_netdev(&init_net, netdev) { - if (netif_is_bridge_port(netdev)) - continue; - if (!alloc_iface(kstrdup(netdev->name, GFP_KERNEL))) { - rtnl_unlock(); - return -ENOMEM; - } - } - rtnl_unlock(); bind_additional_ifaces = 1; return 0; } --- a/fs/smb/server/transport_tcp.h +++ b/fs/smb/server/transport_tcp.h @@ -7,6 +7,7 @@ #define __KSMBD_TRANSPORT_TCP_H__
int ksmbd_tcp_set_interfaces(char *ifc_list, int ifc_list_sz); +struct interface *ksmbd_find_netdev_name_iface_list(char *netdev_name); int ksmbd_tcp_init(void); void ksmbd_tcp_destroy(void);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vineeth Vijayan vneethv@linux.ibm.com
commit 9daa5a8795865f9a3c93d8d1066785b07ded6073 upstream.
Starting with 'commit 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers")', cio no longer unregisters subchannels when the attached device is invalid or unavailable.
As an unintended side-effect, the cio_ignore purge function no longer removes subchannels for devices on the cio_ignore list if no CCW device is attached. This situation occurs when a CCW device is non-operational or unavailable
To ensure the same outcome of the purge function as when the current cio_ignore list had been active during boot, update the purge function to remove I/O subchannels without working CCW devices if the associated device number is found on the cio_ignore list.
Fixes: 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers") Suggested-by: Peter Oberparleiter oberpar@linux.ibm.com Reviewed-by: Peter Oberparleiter oberpar@linux.ibm.com Signed-off-by: Vineeth Vijayan vneethv@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/cio/device.c | 39 +++++++++++++++++++++++++-------------- 1 file changed, 25 insertions(+), 14 deletions(-)
--- a/drivers/s390/cio/device.c +++ b/drivers/s390/cio/device.c @@ -1309,23 +1309,34 @@ void ccw_device_schedule_recovery(void) spin_unlock_irqrestore(&recovery_lock, flags); }
-static int purge_fn(struct device *dev, void *data) +static int purge_fn(struct subchannel *sch, void *data) { - struct ccw_device *cdev = to_ccwdev(dev); - struct ccw_dev_id *id = &cdev->private->dev_id; - struct subchannel *sch = to_subchannel(cdev->dev.parent); - - spin_lock_irq(cdev->ccwlock); - if (is_blacklisted(id->ssid, id->devno) && - (cdev->private->state == DEV_STATE_OFFLINE) && - (atomic_cmpxchg(&cdev->private->onoff, 0, 1) == 0)) { - CIO_MSG_EVENT(3, "ccw: purging 0.%x.%04x\n", id->ssid, - id->devno); + struct ccw_device *cdev; + + spin_lock_irq(sch->lock); + if (sch->st != SUBCHANNEL_TYPE_IO || !sch->schib.pmcw.dnv) + goto unlock; + + if (!is_blacklisted(sch->schid.ssid, sch->schib.pmcw.dev)) + goto unlock; + + cdev = sch_get_cdev(sch); + if (cdev) { + if (cdev->private->state != DEV_STATE_OFFLINE) + goto unlock; + + if (atomic_cmpxchg(&cdev->private->onoff, 0, 1) != 0) + goto unlock; ccw_device_sched_todo(cdev, CDEV_TODO_UNREG); - css_sched_sch_todo(sch, SCH_TODO_UNREG); atomic_set(&cdev->private->onoff, 0); } - spin_unlock_irq(cdev->ccwlock); + + css_sched_sch_todo(sch, SCH_TODO_UNREG); + CIO_MSG_EVENT(3, "ccw: purging 0.%x.%04x%s\n", sch->schid.ssid, + sch->schib.pmcw.dev, cdev ? "" : " (no cdev)"); + +unlock: + spin_unlock_irq(sch->lock); /* Abort loop in case of pending signal. */ if (signal_pending(current)) return -EINTR; @@ -1341,7 +1352,7 @@ static int purge_fn(struct device *dev, int ccw_purge_blacklisted(void) { CIO_MSG_EVENT(2, "ccw: purging blacklisted devices\n"); - bus_for_each_dev(&ccw_bus_type, NULL, NULL, purge_fn); + for_each_subchannel_staged(purge_fn, NULL, NULL); return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Acs acsjakub@amazon.de
commit f04aad36a07cc17b7a5d5b9a2d386ce6fae63e93 upstream.
syzkaller discovered the following crash: (kernel BUG)
[ 44.607039] ------------[ cut here ]------------ [ 44.607422] kernel BUG at mm/userfaultfd.c:2067! [ 44.608148] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI [ 44.608814] CPU: 1 UID: 0 PID: 2475 Comm: reproducer Not tainted 6.16.0-rc6 #1 PREEMPT(none) [ 44.609635] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [ 44.610695] RIP: 0010:userfaultfd_release_all+0x3a8/0x460
<snip other registers, drop unreliable trace>
[ 44.617726] Call Trace: [ 44.617926] <TASK> [ 44.619284] userfaultfd_release+0xef/0x1b0 [ 44.620976] __fput+0x3f9/0xb60 [ 44.621240] fput_close_sync+0x110/0x210 [ 44.622222] __x64_sys_close+0x8f/0x120 [ 44.622530] do_syscall_64+0x5b/0x2f0 [ 44.622840] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 44.623244] RIP: 0033:0x7f365bb3f227
Kernel panics because it detects UFFD inconsistency during userfaultfd_release_all(). Specifically, a VMA which has a valid pointer to vma->vm_userfaultfd_ctx, but no UFFD flags in vma->vm_flags.
The inconsistency is caused in ksm_madvise(): when user calls madvise() with MADV_UNMEARGEABLE on a VMA that is registered for UFFD in MINOR mode, it accidentally clears all flags stored in the upper 32 bits of vma->vm_flags.
Assuming x86_64 kernel build, unsigned long is 64-bit and unsigned int and int are 32-bit wide. This setup causes the following mishap during the &= ~VM_MERGEABLE assignment.
VM_MERGEABLE is a 32-bit constant of type unsigned int, 0x8000'0000. After ~ is applied, it becomes 0x7fff'ffff unsigned int, which is then promoted to unsigned long before the & operation. This promotion fills upper 32 bits with leading 0s, as we're doing unsigned conversion (and even for a signed conversion, this wouldn't help as the leading bit is 0). & operation thus ends up AND-ing vm_flags with 0x0000'0000'7fff'ffff instead of intended 0xffff'ffff'7fff'ffff and hence accidentally clears the upper 32-bits of its value.
Fix it by changing `VM_MERGEABLE` constant to unsigned long, using the BIT() macro.
Note: other VM_* flags are not affected: This only happens to the VM_MERGEABLE flag, as the other VM_* flags are all constants of type int and after ~ operation, they end up with leading 1 and are thus converted to unsigned long with leading 1s.
Note 2: After commit 31defc3b01d9 ("userfaultfd: remove (VM_)BUG_ON()s"), this is no longer a kernel BUG, but a WARNING at the same place:
[ 45.595973] WARNING: CPU: 1 PID: 2474 at mm/userfaultfd.c:2067
but the root-cause (flag-drop) remains the same.
[akpm@linux-foundation.org: rust bindgen wasn't able to handle BIT(), from Miguel] Link: https://lore.kernel.org/oe-kbuild-all/202510030449.VfSaAjvd-lkp@intel.com/ Link: https://lkml.kernel.org/r/20251001090353.57523-2-acsjakub@amazon.de Fixes: 7677f7fd8be7 ("userfaultfd: add minor fault registration mode") Signed-off-by: Jakub Acs acsjakub@amazon.de Signed-off-by: Miguel Ojeda miguel.ojeda.sandonis@gmail.com Acked-by: David Hildenbrand david@redhat.com Acked-by: SeongJae Park sj@kernel.org Tested-by: Alice Ryhl aliceryhl@google.com Tested-by: Miguel Ojeda miguel.ojeda.sandonis@gmail.com Cc: Xu Xin xu.xin16@zte.com.cn Cc: Chengming Zhou chengming.zhou@linux.dev Cc: Peter Xu peterx@redhat.com Cc: Axel Rasmussen axelrasmussen@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org [acsjakub@amazon.de: adapt rust bindgen to older versions] Signed-off-by: Jakub Acs acsjakub@amazon.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/mm.h | 2 +- rust/bindings/bindings_helper.h | 2 ++ rust/bindings/lib.rs | 1 + 3 files changed, 4 insertions(+), 1 deletion(-)
--- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -316,7 +316,7 @@ extern unsigned int kobjsize(const void #define VM_MIXEDMAP 0x10000000 /* Can contain "struct page" and pure PFN pages */ #define VM_HUGEPAGE 0x20000000 /* MADV_HUGEPAGE marked this vma */ #define VM_NOHUGEPAGE 0x40000000 /* MADV_NOHUGEPAGE marked this vma */ -#define VM_MERGEABLE 0x80000000 /* KSM may merge identical pages */ +#define VM_MERGEABLE BIT(31) /* KSM may merge identical pages */
#ifdef CONFIG_ARCH_USES_HIGH_VMA_FLAGS #define VM_HIGH_ARCH_BIT_0 32 /* bit only usable on 64-bit architectures */ --- a/rust/bindings/bindings_helper.h +++ b/rust/bindings/bindings_helper.h @@ -7,8 +7,10 @@ */
#include <linux/slab.h> +#include <linux/mm.h>
/* `bindgen` gets confused at certain things. */ const size_t BINDINGS_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN; const gfp_t BINDINGS_GFP_KERNEL = GFP_KERNEL; const gfp_t BINDINGS___GFP_ZERO = __GFP_ZERO; +const vm_flags_t BINDINGS_VM_MERGEABLE = VM_MERGEABLE; --- a/rust/bindings/lib.rs +++ b/rust/bindings/lib.rs @@ -51,3 +51,4 @@ pub use bindings_raw::*;
pub const GFP_KERNEL: gfp_t = BINDINGS_GFP_KERNEL; pub const __GFP_ZERO: gfp_t = BINDINGS___GFP_ZERO; +pub const VM_MERGEABLE: vm_flags_t = BINDINGS_VM_MERGEABLE;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leon Hwang leon.hwang@linux.dev
This reverts commit a584c7734a4dd050451fcdd65c66317e15660e81 which is commit 91b80cc5b39f00399e8e2d17527cad2c7fa535e2 upstream.
This fixes the following build error:
map_hugetlb.c: In function 'main': map_hugetlb.c:79:25: warning: implicit declaration of function 'default_huge_page_size' [-Wimplicit-function-declaration] 79 | hugepage_size = default_huge_page_size();
Signed-off-by: Leon Hwang leon.hwang@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/vm/map_hugetlb.c | 7 ------- 1 file changed, 7 deletions(-)
--- a/tools/testing/selftests/vm/map_hugetlb.c +++ b/tools/testing/selftests/vm/map_hugetlb.c @@ -15,7 +15,6 @@ #include <unistd.h> #include <sys/mman.h> #include <fcntl.h> -#include "vm_util.h"
#define LENGTH (256UL*1024*1024) #define PROTECTION (PROT_READ | PROT_WRITE) @@ -71,16 +70,10 @@ int main(int argc, char **argv) { void *addr; int ret; - size_t hugepage_size; size_t length = LENGTH; int flags = FLAGS; int shift = 0;
- hugepage_size = default_huge_page_size(); - /* munmap with fail if the length is not page aligned */ - if (hugepage_size > length) - length = hugepage_size; - if (argc > 1) length = atol(argv[1]) << 20; if (argc > 2) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Rutland mark.rutland@arm.com
commit 3bbf004c4808e2c3241e5c1ad6cc102f38a03c39 upstream.
Add cputype definitions for Neoverse-V3AE. These will be used for errata detection in subsequent patches.
These values can be found in the Neoverse-V3AE TRM:
https://developer.arm.com/documentation/SDEN-2615521/9-0/
... in section A.6.1 ("MIDR_EL1, Main ID Register").
Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: James Morse james.morse@arm.com Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Ryan Roberts ryan.roberts@arm.com Signed-off-by: Will Deacon will@kernel.org [ Ryan: Trivial backport ] Signed-off-by: Ryan Roberts ryan.roberts@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/include/asm/cputype.h | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/arm64/include/asm/cputype.h +++ b/arch/arm64/include/asm/cputype.h @@ -93,6 +93,7 @@ #define ARM_CPU_PART_NEOVERSE_V2 0xD4F #define ARM_CPU_PART_CORTEX_A720 0xD81 #define ARM_CPU_PART_CORTEX_X4 0xD82 +#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83 #define ARM_CPU_PART_NEOVERSE_V3 0xD84 #define ARM_CPU_PART_CORTEX_X925 0xD85 #define ARM_CPU_PART_CORTEX_A725 0xD87 @@ -173,6 +174,7 @@ #define MIDR_NEOVERSE_V2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V2) #define MIDR_CORTEX_A720 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A720) #define MIDR_CORTEX_X4 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X4) +#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE) #define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3) #define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925) #define MIDR_CORTEX_A725 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A725)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Rutland mark.rutland@arm.com
commit 0c33aa1804d101c11ba1992504f17a42233f0e11 upstream.
Neoverse-V3AE is also affected by erratum #3312417, as described in its Software Developer Errata Notice (SDEN) document:
Neoverse V3AE (MP172) SDEN v9.0, erratum 3312417 https://developer.arm.com/documentation/SDEN-2615521/9-0/
Enable the workaround for Neoverse-V3AE, and document this.
Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: James Morse james.morse@arm.com Cc: Will Deacon will@kernel.org Cc: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Ryan Roberts ryan.roberts@arm.com Signed-off-by: Will Deacon will@kernel.org [ Ryan: Trivial backport ] Signed-off-by: Ryan Roberts ryan.roberts@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/arm64/silicon-errata.rst | 2 ++ arch/arm64/Kconfig | 1 + arch/arm64/kernel/cpu_errata.c | 1 + 3 files changed, 4 insertions(+)
--- a/Documentation/arm64/silicon-errata.rst +++ b/Documentation/arm64/silicon-errata.rst @@ -181,6 +181,8 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | ARM | Neoverse-V3 | #3312417 | ARM64_ERRATUM_3194386 | +----------------+-----------------+-----------------+-----------------------------+ +| ARM | Neoverse-V3AE | #3312417 | ARM64_ERRATUM_3194386 | ++----------------+-----------------+-----------------+-----------------------------+ | ARM | MMU-500 | #841119,826419 | N/A | +----------------+-----------------+-----------------+-----------------------------+ | ARM | MMU-600 | #1076982,1209401| N/A | --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1027,6 +1027,7 @@ config ARM64_ERRATUM_3194386 * ARM Neoverse-V1 erratum 3324341 * ARM Neoverse V2 erratum 3324336 * ARM Neoverse-V3 erratum 3312417 + * ARM Neoverse-V3AE erratum 3312417
On affected cores "MSR SSBS, #0" instructions may not affect subsequent speculative instructions, which may permit unexepected --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -457,6 +457,7 @@ static const struct midr_range erratum_s MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V2), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3), + MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3AE), {} }; #endif
On 10/27/25 11:34, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.158 release. There are 157 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 29 Oct 2025 18:34:15 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.158-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
On 10/27/2025 2:34 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.158 release. There are 157 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 29 Oct 2025 18:34:15 +0000. Anything received after that time might be too late.
6.1.158-rc1 built and run on x86_64 test system with no errors or regressions: Tested-by: Slade Watkins sr@sladewatkins.com
Slade
Am 27.10.2025 um 19:34 schrieb Greg Kroah-Hartman:
This is the start of the stable review cycle for the 6.1.158 release. There are 157 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Builds, boots and works on my 2-socket Ivy Bridge Xeon E5-2697 v2 server. No dmesg oddities or regressions found.
Tested-by: Peter Schneider pschneider1968@googlemail.com
Beste Grüße, Peter Schneider
On Tue, 28 Oct 2025 at 00:39, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.1.158 release. There are 157 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 29 Oct 2025 18:34:15 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.158-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 6.1.158-rc1 * git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git * git commit: f6fcaf2c6b7f6ed4e6ee10532555e1e16764c435 * git describe: v6.1.157-158-gf6fcaf2c6b7f * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.1.y/build/v6.1.15...
## Test Regressions (compared to v6.1.156-169-gec44a71e7948)
## Metric Regressions (compared to v6.1.156-169-gec44a71e7948)
## Test Fixes (compared to v6.1.156-169-gec44a71e7948)
## Metric Fixes (compared to v6.1.156-169-gec44a71e7948)
## Test result summary total: 86941, pass: 72310, fail: 2561, skip: 11898, xfail: 172
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 133 total, 132 passed, 1 failed * arm64: 41 total, 38 passed, 3 failed * i386: 21 total, 21 passed, 0 failed * mips: 26 total, 25 passed, 1 failed * parisc: 4 total, 4 passed, 0 failed * powerpc: 32 total, 31 passed, 1 failed * riscv: 11 total, 10 passed, 1 failed * s390: 14 total, 13 passed, 1 failed * sh: 10 total, 10 passed, 0 failed * sparc: 7 total, 7 passed, 0 failed * x86_64: 33 total, 32 passed, 1 failed
## Test suites summary * boot * commands * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-exec * kselftest-fpu * kselftest-futex * kselftest-intel_pstate * kselftest-kcmp * kselftest-kvm * kselftest-livepatch * kselftest-membarrier * kselftest-mincore * kselftest-mqueue * kselftest-openat2 * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-sigaltstack * kselftest-size * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user_events * kselftest-vDSO * kselftest-x86 * kunit * kvm-unit-tests * lava * libgpiod * libhugetlbfs * log-parser-boot * log-parser-build-clang * log-parser-build-gcc * log-parser-test * ltp-capability * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-hugetlb * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-smoke * ltp-syscalls * ltp-tracing * perf * rcutorture
-- Linaro LKFT https://lkft.linaro.org
On Mon, 27 Oct 2025 19:34:21 +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.158 release. There are 157 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 29 Oct 2025 18:34:15 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.158-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v6.1: 10 builds: 10 pass, 0 fail 28 boots: 28 pass, 0 fail 119 tests: 119 pass, 0 fail
Linux version: 6.1.158-rc1-gf6fcaf2c6b7f Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On 10/27/25 11:34, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.158 release. There are 157 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 29 Oct 2025 18:34:15 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.158-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
# Librecast Test Results
020/020 [ OK ] liblcrq 010/010 [ OK ] libmld 120/120 [ OK ] liblibrecast
CPU/kernel: Linux auntie 6.1.158-rc1-00158-gf6fcaf2c6b7f #121 SMP PREEMPT_DYNAMIC Tue Oct 28 13:34:21 -00 2025 x86_64 AMD Ryzen 9 9950X 16-Core Processor AuthenticAMD GNU/Linux
Tested-by: Brett A C Sheffield bacs@librecast.net
On 10/27/25 12:34, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.158 release. There are 157 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 29 Oct 2025 18:34:15 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.158-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On Mon, 27 Oct 2025 19:34:21 +0100 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.1.158 release. There are 157 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 29 Oct 2025 18:34:15 +0000. Anything received after that time might be too late.
Boot-tested under QEMU for Rust x86_64:
Tested-by: Miguel Ojeda ojeda@kernel.org
Thanks!
Cheers, Miguel
linux-stable-mirror@lists.linaro.org