This is the start of the stable review cycle for the 6.6.107 release. There are 101 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 19 Sep 2025 12:32:53 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.107-rc1...
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.6.107-rc1
Jani Nikula jani.nikula@intel.com drm/i915/power: fix size for for_each_set_bit() in abox iteration
Alex Deucher alexander.deucher@amd.com drm/amdgpu: fix a memory leak in fence cleanup when unloading
Buday Csaba buday.csaba@prolan.hu net: mdiobus: release reset_gpio in mdiobus_unregister_device()
Namjae Jeon linkinjeon@kernel.org ksmbd: fix null pointer dereference in alloc_preauth_hash()
Johan Hovold johan@kernel.org phy: ti-pipe3: fix device leak at unbind
Johan Hovold johan@kernel.org phy: tegra: xusb: fix device and OF node leak at probe
Miaoqian Lin linmq006@gmail.com dmaengine: dw: dmamux: Fix device reference leak in rzn1_dmamux_route_allocate
Stephan Gerhold stephan.gerhold@linaro.org dmaengine: qcom: bam_dma: Fix DT error handling for num-channels/ees
Takashi Iwai tiwai@suse.de usb: gadget: midi2: Fix MIDI2 IN EP max packet size
Takashi Iwai tiwai@suse.de usb: gadget: midi2: Fix missing UMP group attributes initialization
Alan Stern stern@rowland.harvard.edu USB: gadget: dummy-hcd: Fix locking bug in RT-enabled kernels
Mathias Nyman mathias.nyman@linux.intel.com xhci: fix memory leak regression when freeing xhci vdev devices depth first
Palmer Dabbelt palmer@rivosinc.com RISC-V: Remove unnecessary include from compat.h
Xiongfeng Wang wangxiongfeng2@huawei.com hrtimers: Unconditionally update target CPU base after offline timer migration
Jiapeng Chong jiapeng.chong@linux.alibaba.com hrtimer: Rename __hrtimer_hres_active() to hrtimer_hres_active()
Jiapeng Chong jiapeng.chong@linux.alibaba.com hrtimer: Remove unused function
Andreas Kemnade akemnade@kernel.org regulator: sy7636a: fix lifecycle of power good gpio
Anders Roxell anders.roxell@linaro.org dmaengine: ti: edma: Fix memory allocation size for queue_priority_map
Dan Carpenter dan.carpenter@linaro.org dmaengine: idxd: Fix double free in idxd_setup_wqs()
Yi Sun yi.sun@intel.com dmaengine: idxd: Fix refcount underflow on module unload
Yi Sun yi.sun@intel.com dmaengine: idxd: Remove improper idxd_free
Hangbin Liu liuhangbin@gmail.com hsr: use hsr_for_each_port_rtnl in hsr_port_get_hsr
Hangbin Liu liuhangbin@gmail.com hsr: use rtnl lock when iterating over ports
Murali Karicheri m-karicheri2@ti.com net: hsr: Add VLAN CTAG filter support
Murali Karicheri m-karicheri2@ti.com net: hsr: Add support for MC filtering at the slave device
Anssi Hannula anssi.hannula@bitwise.fi can: xilinx_can: xcan_write_frame(): fix use-after-free of transmitted SKB
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp can: j1939: j1939_local_ecu_get(): undo increment when j1939_local_ecu_get() fails
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp can: j1939: j1939_sk_bind(): call j1939_priv_put() immediately when j1939_local_ecu_get() failed
Michal Schmidt mschmidt@redhat.com i40e: fix IRQ freeing in i40e_vsi_request_irq_msix error path
Kohei Enju enjuk@amazon.com igb: fix link test skipping when interface is admin down
Alex Tran alex.t.tran@gmail.com docs: networking: can: change bcm_msg_head frames member to support flexible array
Antoine Tenart atenart@kernel.org tunnels: reset the GSO metadata before reusing the skb
Petr Machata petrm@nvidia.com net: bridge: Bounce invalid boolopts
Stefan Wahren wahrenst@gmx.net net: fec: Fix possible NPD in fec_enet_phy_reset_after_clk_enable()
Linus Torvalds torvalds@linux-foundation.org Disable SLUB_TINY for build testing
Fabio Porcedda fabio.porcedda@gmail.com USB: serial: option: add Telit Cinterion LE910C4-WWX new compositions
Fabio Porcedda fabio.porcedda@gmail.com USB: serial: option: add Telit Cinterion FN990A w/audio compositions
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org dt-bindings: serial: brcm,bcm7271-uart: Constrain clocks
Hugo Villeneuve hvilleneuve@dimonoff.com serial: sc16is7xx: fix bug in flow control levels init
Fabian Vogt fvogt@suse.de tty: hvc_console: Call hvc_kick in hvc_write unconditionally
Paolo Abeni pabeni@redhat.com Revert "net: usb: asix: ax88772: drop phylink use in PM to avoid MDIO runtime PM wakeups"
Christoffer Sandberg cs@tuxedo.de Input: i8042 - add TUXEDO InfinityBook Pro Gen10 AMD to i8042 quirk table
Jeff LaBundy jeff@labundy.com Input: iqs7222 - avoid enabling unused interrupts
Chen Ridong chenridong@huawei.com kernfs: Fix UAF in polling when open file is released
Yang Erkun yangerkun@huawei.com cifs: fix pagecache leak when do writepages
Wei Yang richard.weiyang@gmail.com mm/khugepaged: fix the address passed to notifier on testing young
Vishal Moola (Oracle) vishal.moola@gmail.com mm/khugepaged: convert hpage_collapse_scan_pmd() to use folios
Qu Wenruo wqu@suse.com btrfs: fix corruption reading compressed range when block size is smaller than page size
Boris Burkov boris@bur.io btrfs: use readahead_expand() on compressed extents
Quanmin Yan yanquanmin1@huawei.com mm/damon/lru_sort: avoid divide-by-zero in damon_lru_sort_apply_parameters()
Quanmin Yan yanquanmin1@huawei.com mm/damon/reclaim: avoid divide-by-zero in damon_reclaim_apply_parameters()
Stanislav Fort stanislav.fort@aisle.com mm/damon/sysfs: fix use-after-free in state_show()
Ilya Dryomov idryomov@gmail.com libceph: fix invalid accesses to ceph_connection_v1_info
Alexander Sverdlin alexander.sverdlin@siemens.com mtd: nand: raw: atmel: Respect tAR, tCLR in read setup timing
Alexander Dahl ada@thorsis.com mtd: nand: raw: atmel: Fix comment in timings preparation
David Rosca david.rosca@amd.com drm/amdgpu/vcn4: Fix IB parsing with multiple engine info packages
David Rosca david.rosca@amd.com drm/amdgpu/vcn: Allow limiting ctx to instance 0 for AV1 at any time
Johan Hovold johan@kernel.org drm/mediatek: fix potential OF node use-after-free
Sang-Heon Jeon ekffu200098@gmail.com mm/damon/core: set quota->charged_from to jiffies at first charge window
Miaohe Lin linmiaohe@huawei.com mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory
Miklos Szeredi mszeredi@redhat.com fuse: prevent overflow in copy_file_range return value
Miklos Szeredi mszeredi@redhat.com fuse: check if copy_file_range() returns larger than requested size
Christophe Kerello christophe.kerello@foss.st.com mtd: rawnand: stm32_fmc2: fix ECC overwrite
Christophe Kerello christophe.kerello@foss.st.com mtd: rawnand: stm32_fmc2: avoid overlapping mappings on ECC buffer
Oleksij Rempel o.rempel@pengutronix.de net: usb: asix: ax88772: drop phylink use in PM to avoid MDIO runtime PM wakeups
Chiasheng Lee chiasheng.lee@linux.intel.com i2c: i801: Hide Intel Birch Stream SoC TCO WDT
Mark Tinguely mark.tinguely@oracle.com ocfs2: fix recursive semaphore deadlock in fiemap call
Krister Johansen kjlx@templeofstupid.com mptcp: sockopt: make sync_socket_options propagate SOCK_KEEPOPEN
Nathan Chancellor nathan@kernel.org compiler-clang.h: define __SANITIZE_*__ macros only when undefined
Trond Myklebust trond.myklebust@hammerspace.com Revert "SUNRPC: Don't allow waiting for exiting tasks"
Salah Triki salah.triki@gmail.com EDAC/altera: Delete an inappropriate dma_free_coherent() call
Borislav Petkov (AMD) bp@alien8.de KVM: SVM: Set synthesized TSA CPUID flags
Paul E. McKenney paulmck@kernel.org rcu-tasks: Maintain real-time response in rcu_tasks_postscan()
Paul E. McKenney paulmck@kernel.org rcu-tasks: Eliminate deadlocks involving do_exit() and RCU tasks
Paul E. McKenney paulmck@kernel.org rcu-tasks: Maintain lists to eliminate RCU-tasks/do_exit() deadlocks
wangzijie wangzijie1@honor.com proc: fix type confusion in pde_set_flags()
Kuniyuki Iwashima kuniyu@google.com tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork.
Peilin Ye yepeilin@google.com bpf: Tell memcg to use allow_spinning=false path in bpf_timer_init()
Thomas Richter tmricht@linux.ibm.com s390/cpum_cf: Deny all sampling events by counter PMU
Pu Lehui pulehui@huawei.com tracing: Silence warning when chunk allocation fails in trace_pid_write
Jonathan Curley jcurley@purestorage.com NFSv4/flexfiles: Fix layout merge mirror check.
Trond Myklebust trond.myklebust@hammerspace.com NFSv4.2: Serialise O_DIRECT i/o and copy range
Trond Myklebust trond.myklebust@hammerspace.com NFSv4.2: Serialise O_DIRECT i/o and clone range
Trond Myklebust trond.myklebust@hammerspace.com NFSv4.2: Serialise O_DIRECT i/o and fallocate()
Trond Myklebust trond.myklebust@hammerspace.com NFS: Serialise O_DIRECT i/o and truncate()
Max Kellermann max.kellermann@ionos.com fs/nfs/io: make nfs_start_io_*() killable
Vladimir Riabchun ferr.lambarginio@gmail.com ftrace/samples: Fix function size computation
Luo Gengkun luogengkun@huaweicloud.com tracing: Fix tracing_marker may trigger page fault during preempt_disable
Trond Myklebust trond.myklebust@hammerspace.com NFSv4: Clear the NFS_CAP_XATTR flag if not supported by the server
Trond Myklebust trond.myklebust@hammerspace.com NFSv4: Clear the NFS_CAP_FS_LOCATIONS flag if it is not set
Trond Myklebust trond.myklebust@hammerspace.com NFSv4: Don't clear capabilities that won't be reset
Justin Worrell jworrell@gmail.com SUNRPC: call xs_sock_process_cmsg for all cmsg
Tigran Mkrtchyan tigran.mkrtchyan@desy.de flexfiles/pNFS: fix NULL checks on result of ff_layout_choose_ds_for_read
Mimi Zohar zohar@linux.ibm.com ima: limit the number of ToMToU integrity violations
Kuniyuki Iwashima kuniyu@amazon.com net: Fix null-ptr-deref by sock_lock_init_class_and_name() and rmmod.
André Apitzsch git@apitzsch.eu media: i2c: imx214: Fix link frequency validation
Chuck Lever chuck.lever@oracle.com NFSD: nfsd_unlink() clobbers non-zero status returned from fh_fill_pre_attrs()
Trond Myklebust trond.myklebust@hammerspace.com nfsd: Fix a regression in nfsd_setattr()
Ada Couprie Diaz ada.coupriediaz@arm.com kasan: fix GCC mem-intrinsic prefix with sw tags
Harry Yoo harry.yoo@oracle.com mm: introduce and use {pgd,p4d}_populate_kernel()
Yeoreum Yun yeoreum.yun@arm.com kunit: kasan_test: disable fortify string checker on kasan_strings() test
-------------
Diffstat:
 .../bindings/serial/brcm,bcm7271-uart.yaml          |   2 +-
 Documentation/networking/can.rst                    |   2 +-
 Makefile                                            |   4 +-
 arch/riscv/include/asm/compat.h                     |   1 -
 arch/s390/kernel/perf_cpum_cf.c                     |   4 +-
 arch/x86/kvm/cpuid.c                                |   5 +
 drivers/dma/dw/rzn1-dmamux.c                        |  15 +-
 drivers/dma/idxd/init.c                             |  39 ++---
 drivers/dma/qcom/bam_dma.c                          |   8 +-
 drivers/dma/ti/edma.c                               |   4 +-
 drivers/edac/altera_edac.c                          |   1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c            |   3 -
 drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c               |  12 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c               |  60 ++++----
 drivers/gpu/drm/i915/display/intel_display_power.c  |   6 +-
 drivers/gpu/drm/mediatek/mtk_drm_drv.c              |  11 +-
 drivers/i2c/busses/i2c-i801.c                       |   2 +-
 drivers/input/misc/iqs7222.c                        |   3 +
 drivers/input/serio/i8042-acpipnpio.h               |  14 ++
 drivers/media/i2c/imx214.c                          |  27 ++--
 drivers/mtd/nand/raw/atmel/nand-controller.c        |  18 ++-
 drivers/mtd/nand/raw/stm32_fmc2_nand.c              |  46 +++---
 drivers/net/can/xilinx_can.c                        |  16 +--
 drivers/net/ethernet/freescale/fec_main.c           |   3 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c         |   2 +-
 drivers/net/ethernet/intel/igb/igb_ethtool.c        |   5 +-
 drivers/net/phy/mdio_bus.c                          |   4 +-
 drivers/phy/tegra/xusb-tegra210.c                   |   6 +-
 drivers/phy/ti/phy-ti-pipe3.c                       |  13 ++
 drivers/regulator/sy7636a-regulator.c               |   7 +-
 drivers/tty/hvc/hvc_console.c                       |   6 +-
 drivers/tty/serial/sc16is7xx.c                      |  14 +-
 drivers/usb/gadget/function/f_midi2.c               |  11 +-
 drivers/usb/gadget/udc/dummy_hcd.c                  |   8 +-
 drivers/usb/host/xhci-mem.c                         |   2 +-
 drivers/usb/serial/option.c                         |  17 +++
 fs/btrfs/extent_io.c                                |  78 ++++++++--
 fs/fuse/file.c                                      |   5 +-
 fs/kernfs/file.c                                    |  54 ++++---
 fs/nfs/client.c                                     |   2 +
 fs/nfs/direct.c                                     |  21 ++-
 fs/nfs/file.c                                       |  14 +-
 fs/nfs/flexfilelayout/flexfilelayout.c              |  21 +--
 fs/nfs/inode.c                                      |   4 +-
 fs/nfs/internal.h                                   |  17 ++-
 fs/nfs/io.c                                         |  55 ++++---
 fs/nfs/nfs42proc.c                                  |   2 +
 fs/nfs/nfs4file.c                                   |   2 +
 fs/nfs/nfs4proc.c                                   |   6 +-
 fs/nfsd/nfs4proc.c                                  |   4 +
 fs/nfsd/vfs.c                                       |  13 +-
 fs/ocfs2/extent_map.c                               |  10 +-
 fs/proc/generic.c                                   |   3 +-
 fs/smb/client/file.c                                |  16 ++-
 fs/smb/server/connection.h                          |  11 ++
 fs/smb/server/mgmt/user_session.c                   |   4 +-
 fs/smb/server/smb2pdu.c                             |  14 +-
 include/linux/compiler-clang.h                      |  31 +++-
 include/linux/pgalloc.h                             |  29 ++++
 include/linux/pgtable.h                             |  13 +-
 include/net/sock.h                                  |  40 +++++-
 kernel/bpf/helpers.c                                |   7 +-
 kernel/rcu/tasks.h                                  | 107 ++++++++++----
 kernel/time/hrtimer.c                               |  50 ++-----
 kernel/trace/trace.c                                |  10 +-
 mm/Kconfig                                          |   2 +-
 mm/damon/core.c                                     |   4 +
 mm/damon/lru_sort.c                                 |   3 +
 mm/damon/reclaim.c                                  |   3 +
 mm/damon/sysfs.c                                    |  14 +-
 mm/kasan/init.c                                     |  12 +-
 mm/kasan/kasan_test.c                               |   1 +
 mm/khugepaged.c                                     |  22 +--
 mm/memory-failure.c                                 |   7 +-
 mm/percpu.c                                         |   6 +-
 mm/sparse-vmemmap.c                                 |   6 +-
 net/bridge/br.c                                     |   7 +
 net/can/j1939/bus.c                                 |   5 +-
 net/can/j1939/socket.c                              |   3 +
 net/ceph/messenger.c                                |   7 +-
 net/core/sock.c                                     |   5 +
 net/hsr/hsr_device.c                                | 158 ++++++++++++++++++++-
 net/hsr/hsr_main.c                                  |   4 +-
 net/hsr/hsr_main.h                                  |   3 +
 net/ipv4/ip_tunnel_core.c                           |   6 +
 net/ipv4/tcp_bpf.c                                  |   5 +-
 net/mptcp/sockopt.c                                 |  11 +-
 net/sunrpc/sched.c                                  |   2 -
 net/sunrpc/xprtsock.c                               |   6 +-
 samples/ftrace/ftrace-direct-modify.c               |   2 +-
 scripts/Makefile.kasan                              |  12 +-
 security/integrity/ima/ima_main.c                   |  16 ++-
 security/integrity/integrity.h                      |   3 +-
 93 files changed, 976 insertions(+), 403 deletions(-)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yeoreum Yun yeoreum.yun@arm.com
commit 7a19afee6fb39df63ddea7ce78976d8c521178c6 upstream.
Similar to commit 09c6304e38e4 ("kasan: test: fix compatibility with FORTIFY_SOURCE"), the kernel is panicking in kasan_strings().
This is because `src` and `ptr` are not hidden from the optimizer; hiding them would disable the runtime fortify string checker.
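For reference, OPTIMIZER_HIDE_VAR() launders the variable through an empty asm statement so the compiler can no longer track the object size behind it, which is what keeps the fortify string checker out of the way. A simplified sketch of the include/linux/compiler.h definition (not the verbatim kernel source):

	#define OPTIMIZER_HIDE_VAR(var) \
		__asm__ ("" : "=r" (var) : "0" (var))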
Call trace:
 __fortify_panic+0x10/0x20 (P)
 kasan_strings+0x980/0x9b0
 kunit_try_run_case+0x68/0x190
 kunit_generic_run_threadfn_adapter+0x34/0x68
 kthread+0x1c4/0x228
 ret_from_fork+0x10/0x20
Code: d503233f a9bf7bfd 910003fd 9424b243 (d4210000)
---[ end trace 0000000000000000 ]---
note: kunit_try_catch[128] exited with irqs disabled
note: kunit_try_catch[128] exited with preempt_count 1
    # kasan_strings: try faulted: last
** replaying previous printk message **
    # kasan_strings: try faulted: last line seen mm/kasan/kasan_test_c.c:1600
    # kasan_strings: internal error occurred preventing test case from running: -4
Link: https://lkml.kernel.org/r/20250801120236.2962642-1-yeoreum.yun@arm.com
Fixes: 73228c7ecc5e ("KASAN: port KASAN Tests to KUnit")
Signed-off-by: Yeoreum Yun yeoreum.yun@arm.com
Cc: Alexander Potapenko glider@google.com
Cc: Andrey Konovalov andreyknvl@gmail.com
Cc: Andrey Ryabinin ryabinin.a.a@gmail.com
Cc: Dmitriy Vyukov dvyukov@google.com
Cc: Vincenzo Frascino vincenzo.frascino@arm.com
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
Signed-off-by: Yeoreum Yun yeoreum.yun@arm.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 mm/kasan/kasan_test.c | 1 +
 1 file changed, 1 insertion(+)
--- a/mm/kasan/kasan_test.c
+++ b/mm/kasan/kasan_test.c
@@ -1053,6 +1053,7 @@ static void kasan_strings(struct kunit *
 	ptr = kmalloc(size, GFP_KERNEL | __GFP_ZERO);
 	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ptr);
+	OPTIMIZER_HIDE_VAR(ptr);
kfree(ptr);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harry Yoo harry.yoo@oracle.com
commit f2d2f9598ebb0158a3fe17cda0106d7752e654a2 upstream.
Introduce and use {pgd,p4d}_populate_kernel() in core MM code when populating PGD and P4D entries for the kernel address space. These helpers ensure proper synchronization of page tables when updating the kernel portion of top-level page tables.
Until now, the kernel has relied on each architecture to handle synchronization of top-level page tables in an ad-hoc manner. For example, see commit 9b861528a801 ("x86-64, mem: Update all PGDs for direct mapping and vmemmap mapping changes").
However, this approach has proven fragile for the following reasons:
1) It is easy to forget to perform the necessary page table synchronization when introducing new changes. For instance, commit 4917f55b4ef9 ("mm/sparse-vmemmap: improve memory savings for compound devmaps") overlooked the need to synchronize page tables for the vmemmap area.
2) It is also easy to overlook that the vmemmap and direct mapping areas must not be accessed before explicit page table synchronization. For example, commit 8d400913c231 ("x86/vmemmap: handle unpopulated sub-pmd ranges") caused crashes by accessing the vmemmap area before calling sync_global_pgds().
To address this, as suggested by Dave Hansen, introduce _kernel() variants of the page table population helpers, which invoke architecture-specific hooks to properly synchronize page tables. These are introduced in a new header file, include/linux/pgalloc.h, so they can be called from common code.
They reuse existing infrastructure for vmalloc and ioremap. Synchronization requirements are determined by ARCH_PAGE_TABLE_SYNC_MASK, and the actual synchronization is performed by arch_sync_kernel_mappings().
This change currently targets only x86_64, so only PGD and P4D level helpers are introduced. Currently, these helpers are no-ops since no architecture sets PGTBL_{PGD,P4D}_MODIFIED in ARCH_PAGE_TABLE_SYNC_MASK.
In theory, PUD and PMD level helpers can be added later if needed by other architectures. For now, 32-bit architectures (x86-32 and arm) only handle PGTBL_PMD_MODIFIED, so p*d_populate_kernel() will never affect them unless we introduce a PMD level helper.
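For illustration only, with a hypothetical architecture "foo" (not code from this series): an architecture that needs the synchronization would announce it via ARCH_PAGE_TABLE_SYNC_MASK and provide arch_sync_kernel_mappings(), which the new _kernel() helpers then invoke whenever a kernel PGD/P4D entry is populated.

	/* arch/foo/include/asm/pgtable.h -- hypothetical opt-in */
	#define ARCH_PAGE_TABLE_SYNC_MASK	(PGTBL_PGD_MODIFIED | PGTBL_P4D_MODIFIED)

	/* arch/foo/mm/init.c -- hypothetical hook body */
	void arch_sync_kernel_mappings(unsigned long start, unsigned long end)
	{
		/* propagate the top-level update to all other kernel page tables,
		 * e.g. what sync_global_pgds() does on x86-64
		 */
		sync_global_pgds(start, end);
	}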
[harry.yoo@oracle.com: fix KASAN build error due to p*d_populate_kernel()]
Link: https://lkml.kernel.org/r/20250822020727.202749-1-harry.yoo@oracle.com
Link: https://lkml.kernel.org/r/20250818020206.4517-3-harry.yoo@oracle.com
Fixes: 8d400913c231 ("x86/vmemmap: handle unpopulated sub-pmd ranges")
Signed-off-by: Harry Yoo harry.yoo@oracle.com
Suggested-by: Dave Hansen dave.hansen@linux.intel.com
Acked-by: Kiryl Shutsemau kas@kernel.org
Reviewed-by: Mike Rapoport (Microsoft) rppt@kernel.org
Reviewed-by: Lorenzo Stoakes lorenzo.stoakes@oracle.com
Acked-by: David Hildenbrand david@redhat.com
Cc: Alexander Potapenko glider@google.com
Cc: Alistair Popple apopple@nvidia.com
Cc: Andrey Konovalov andreyknvl@gmail.com
Cc: Andrey Ryabinin ryabinin.a.a@gmail.com
Cc: Andy Lutomirski luto@kernel.org
Cc: "Aneesh Kumar K.V" aneesh.kumar@linux.ibm.com
Cc: Anshuman Khandual anshuman.khandual@arm.com
Cc: Ard Biesheuvel ardb@kernel.org
Cc: Arnd Bergmann arnd@arndb.de
Cc: bibo mao maobibo@loongson.cn
Cc: Borislav Betkov bp@alien8.de
Cc: Christoph Lameter (Ampere) cl@gentwo.org
Cc: Dennis Zhou dennis@kernel.org
Cc: Dev Jain dev.jain@arm.com
Cc: Dmitriy Vyukov dvyukov@google.com
Cc: Gwan-gyeong Mun gwan-gyeong.mun@intel.com
Cc: Ingo Molnar mingo@redhat.com
Cc: Jane Chu jane.chu@oracle.com
Cc: Joao Martins joao.m.martins@oracle.com
Cc: Joerg Roedel joro@8bytes.org
Cc: John Hubbard jhubbard@nvidia.com
Cc: Kevin Brodsky kevin.brodsky@arm.com
Cc: Liam Howlett liam.howlett@oracle.com
Cc: Michal Hocko mhocko@suse.com
Cc: Oscar Salvador osalvador@suse.de
Cc: Peter Xu peterx@redhat.com
Cc: Peter Zijlstra peterz@infradead.org
Cc: Qi Zheng zhengqi.arch@bytedance.com
Cc: Ryan Roberts ryan.roberts@arm.com
Cc: Suren Baghdasaryan surenb@google.com
Cc: Tejun Heo tj@kernel.org
Cc: Thomas Gleinxer tglx@linutronix.de
Cc: Thomas Huth thuth@redhat.com
Cc: "Uladzislau Rezki (Sony)" urezki@gmail.com
Cc: Vincenzo Frascino vincenzo.frascino@arm.com
Cc: Vlastimil Babka vbabka@suse.cz
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
[ Adjust context ]
Signed-off-by: Harry Yoo harry.yoo@oracle.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 include/linux/pgalloc.h | 29 +++++++++++++++++++++++++++++
 include/linux/pgtable.h | 13 +++++++------
 mm/kasan/init.c         | 12 ++++++------
 mm/percpu.c             |  6 +++---
 mm/sparse-vmemmap.c     |  6 +++---
 5 files changed, 48 insertions(+), 18 deletions(-)
 create mode 100644 include/linux/pgalloc.h
--- /dev/null +++ b/include/linux/pgalloc.h @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_PGALLOC_H +#define _LINUX_PGALLOC_H + +#include <linux/pgtable.h> +#include <asm/pgalloc.h> + +/* + * {pgd,p4d}_populate_kernel() are defined as macros to allow + * compile-time optimization based on the configured page table levels. + * Without this, linking may fail because callers (e.g., KASAN) may rely + * on calls to these functions being optimized away when passing symbols + * that exist only for certain page table levels. + */ +#define pgd_populate_kernel(addr, pgd, p4d) \ + do { \ + pgd_populate(&init_mm, pgd, p4d); \ + if (ARCH_PAGE_TABLE_SYNC_MASK & PGTBL_PGD_MODIFIED) \ + arch_sync_kernel_mappings(addr, addr); \ + } while (0) + +#define p4d_populate_kernel(addr, p4d, pud) \ + do { \ + p4d_populate(&init_mm, p4d, pud); \ + if (ARCH_PAGE_TABLE_SYNC_MASK & PGTBL_P4D_MODIFIED) \ + arch_sync_kernel_mappings(addr, addr); \ + } while (0) + +#endif /* _LINUX_PGALLOC_H */ --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1467,8 +1467,8 @@ static inline int pmd_protnone(pmd_t pmd
/* * Architectures can set this mask to a combination of PGTBL_P?D_MODIFIED values - * and let generic vmalloc and ioremap code know when arch_sync_kernel_mappings() - * needs to be called. + * and let generic vmalloc, ioremap and page table update code know when + * arch_sync_kernel_mappings() needs to be called. */ #ifndef ARCH_PAGE_TABLE_SYNC_MASK #define ARCH_PAGE_TABLE_SYNC_MASK 0 @@ -1601,10 +1601,11 @@ static inline bool arch_has_pfn_modify_c /* * Page Table Modification bits for pgtbl_mod_mask. * - * These are used by the p?d_alloc_track*() set of functions an in the generic - * vmalloc/ioremap code to track at which page-table levels entries have been - * modified. Based on that the code can better decide when vmalloc and ioremap - * mapping changes need to be synchronized to other page-tables in the system. + * These are used by the p?d_alloc_track*() and p*d_populate_kernel() + * functions in the generic vmalloc, ioremap and page table update code + * to track at which page-table levels entries have been modified. + * Based on that the code can better decide when page table changes need + * to be synchronized to other page-tables in the system. */ #define __PGTBL_PGD_MODIFIED 0 #define __PGTBL_P4D_MODIFIED 1 --- a/mm/kasan/init.c +++ b/mm/kasan/init.c @@ -13,9 +13,9 @@ #include <linux/mm.h> #include <linux/pfn.h> #include <linux/slab.h> +#include <linux/pgalloc.h>
#include <asm/page.h> -#include <asm/pgalloc.h>
#include "kasan.h"
@@ -197,7 +197,7 @@ static int __ref zero_p4d_populate(pgd_t pud_t *pud; pmd_t *pmd;
- p4d_populate(&init_mm, p4d, + p4d_populate_kernel(addr, p4d, lm_alias(kasan_early_shadow_pud)); pud = pud_offset(p4d, addr); pud_populate(&init_mm, pud, @@ -218,7 +218,7 @@ static int __ref zero_p4d_populate(pgd_t } else { p = early_alloc(PAGE_SIZE, NUMA_NO_NODE); pud_init(p); - p4d_populate(&init_mm, p4d, p); + p4d_populate_kernel(addr, p4d, p); } } zero_pud_populate(p4d, addr, next); @@ -257,10 +257,10 @@ int __ref kasan_populate_early_shadow(co * puds,pmds, so pgd_populate(), pud_populate() * is noops. */ - pgd_populate(&init_mm, pgd, + pgd_populate_kernel(addr, pgd, lm_alias(kasan_early_shadow_p4d)); p4d = p4d_offset(pgd, addr); - p4d_populate(&init_mm, p4d, + p4d_populate_kernel(addr, p4d, lm_alias(kasan_early_shadow_pud)); pud = pud_offset(p4d, addr); pud_populate(&init_mm, pud, @@ -279,7 +279,7 @@ int __ref kasan_populate_early_shadow(co if (!p) return -ENOMEM; } else { - pgd_populate(&init_mm, pgd, + pgd_populate_kernel(addr, pgd, early_alloc(PAGE_SIZE, NUMA_NO_NODE)); } } --- a/mm/percpu.c +++ b/mm/percpu.c @@ -3157,7 +3157,7 @@ out_free: #endif /* BUILD_EMBED_FIRST_CHUNK */
#ifdef BUILD_PAGE_FIRST_CHUNK -#include <asm/pgalloc.h> +#include <linux/pgalloc.h>
#ifndef P4D_TABLE_SIZE #define P4D_TABLE_SIZE PAGE_SIZE @@ -3185,7 +3185,7 @@ void __init __weak pcpu_populate_pte(uns p4d = memblock_alloc(P4D_TABLE_SIZE, P4D_TABLE_SIZE); if (!p4d) goto err_alloc; - pgd_populate(&init_mm, pgd, p4d); + pgd_populate_kernel(addr, pgd, p4d); }
p4d = p4d_offset(pgd, addr); @@ -3193,7 +3193,7 @@ void __init __weak pcpu_populate_pte(uns pud = memblock_alloc(PUD_TABLE_SIZE, PUD_TABLE_SIZE); if (!pud) goto err_alloc; - p4d_populate(&init_mm, p4d, pud); + p4d_populate_kernel(addr, p4d, pud); }
pud = pud_offset(p4d, addr); --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -27,9 +27,9 @@ #include <linux/spinlock.h> #include <linux/vmalloc.h> #include <linux/sched.h> +#include <linux/pgalloc.h>
#include <asm/dma.h> -#include <asm/pgalloc.h>
/* * Allocate a block of memory to be used to back the virtual memory map @@ -225,7 +225,7 @@ p4d_t * __meminit vmemmap_p4d_populate(p if (!p) return NULL; pud_init(p); - p4d_populate(&init_mm, p4d, p); + p4d_populate_kernel(addr, p4d, p); } return p4d; } @@ -237,7 +237,7 @@ pgd_t * __meminit vmemmap_pgd_populate(u void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node); if (!p) return NULL; - pgd_populate(&init_mm, pgd, p); + pgd_populate_kernel(addr, pgd, p); } return pgd; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ada Couprie Diaz ada.coupriediaz@arm.com
[ Upstream commit 51337a9a3a404fde0f5337662ffc7699793dfeb5 ]
GCC doesn't support "hwasan-kernel-mem-intrinsic-prefix", only "asan-kernel-mem-intrinsic-prefix"[0], while LLVM supports both. This is already taken into account when checking "CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX", but not in the KASAN Makefile, which adds those parameters when "CONFIG_KASAN_SW_TAGS" is enabled.
Replace the version check with "CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX", which already validates that the mem-intrinsic prefix parameter can be used, and choose the correct parameter name depending on the compiler.
GCC 13 and above trigger "CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX" which prevents `mem{cpy,move,set}()` being redefined in "mm/kasan/shadow.c" since commit 36be5cba99f6 ("kasan: treat meminstrinsic as builtins in uninstrumented files"), as we expect the compiler to prefix those calls with `__(hw)asan_` instead. But as the option passed to GCC has been incorrect, the compiler has not been emitting those prefixes, effectively never calling the instrumented versions of `mem{cpy,move,set}()` with "CONFIG_KASAN_SW_TAGS" enabled.
If "CONFIG_FORTIFY_SOURCES" is enabled, this issue would be mitigated as it redefines `mem{cpy,move,set}()` and properly aliases the `__underlying_mem*()` that will be called to the instrumented versions.
Link: https://lkml.kernel.org/r/20250821120735.156244-1-ada.coupriediaz@arm.com
Link: https://gcc.gnu.org/onlinedocs/gcc-13.4.0/gcc/Optimize-Options.html [0]
Signed-off-by: Ada Couprie Diaz ada.coupriediaz@arm.com
Fixes: 36be5cba99f6 ("kasan: treat meminstrinsic as builtins in uninstrumented files")
Reviewed-by: Yeoreum Yun yeoreum.yun@arm.com
Cc: Alexander Potapenko glider@google.com
Cc: Andrey Konovalov andreyknvl@gmail.com
Cc: Andrey Ryabinin ryabinin.a.a@gmail.com
Cc: Dmitriy Vyukov dvyukov@google.com
Cc: Marco Elver elver@google.com
Cc: Marc Rutland mark.rutland@arm.com
Cc: Michael Ellerman mpe@ellerman.id.au
Cc: Nathan Chancellor nathan@kernel.org
Cc: Vincenzo Frascino vincenzo.frascino@arm.com
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
[ kasan_params => CFLAGS_KASAN ]
Signed-off-by: Sasha Levin sashal@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 scripts/Makefile.kasan | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)
--- a/scripts/Makefile.kasan
+++ b/scripts/Makefile.kasan
@@ -68,10 +68,14 @@ CFLAGS_KASAN := -fsanitize=kernel-hwaddr
 		$(call cc-param,hwasan-inline-all-checks=0) \
 		$(instrumentation_flags)

-# Instrument memcpy/memset/memmove calls by using instrumented __hwasan_mem*().
-ifeq ($(call clang-min-version, 150000)$(call gcc-min-version, 130000),y)
-CFLAGS_KASAN += $(call cc-param,hwasan-kernel-mem-intrinsic-prefix=1)
-endif
+# Instrument memcpy/memset/memmove calls by using instrumented __(hw)asan_mem*().
+ifdef CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX
+  ifdef CONFIG_CC_IS_GCC
+    CFLAGS_KASAN += $(call cc-param,asan-kernel-mem-intrinsic-prefix=1)
+  else
+    CFLAGS_KASAN += $(call cc-param,hwasan-kernel-mem-intrinsic-prefix=1)
+  endif
+endif # CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX
endif # CONFIG_KASAN_SW_TAGS
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit 6412e44c40aaf8f1d7320b2099c5bdd6cb9126ac ]
Commit bb4d53d66e4b ("NFSD: use (un)lock_inode instead of fh_(un)lock for file operations") broke the NFSv3 pre/post op attributes behaviour when doing a SETATTR rpc call by stripping out the calls to fh_fill_pre_attrs() and fh_fill_post_attrs().
Fixes: bb4d53d66e4b ("NFSD: use (un)lock_inode instead of fh_(un)lock for file operations")
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com
Reviewed-by: Jeff Layton jlayton@kernel.org
Reviewed-by: NeilBrown neilb@suse.de
Message-ID: 20240216012451.22725-1-trondmy@kernel.org
Signed-off-by: Chuck Lever chuck.lever@oracle.com
Stable-dep-of: d7d8e3169b56 ("NFSD: nfsd_unlink() clobbers non-zero status returned from fh_fill_pre_attrs()")
Signed-off-by: Sasha Levin sashal@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/nfsd/nfs4proc.c | 4 ++++
 fs/nfsd/vfs.c      | 9 +++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)
--- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -1131,6 +1131,7 @@ nfsd4_setattr(struct svc_rqst *rqstp, st }; struct inode *inode; __be32 status = nfs_ok; + bool save_no_wcc; int err;
if (setattr->sa_iattr.ia_valid & ATTR_SIZE) { @@ -1156,8 +1157,11 @@ nfsd4_setattr(struct svc_rqst *rqstp, st
if (status) goto out; + save_no_wcc = cstate->current_fh.fh_no_wcc; + cstate->current_fh.fh_no_wcc = true; status = nfsd_setattr(rqstp, &cstate->current_fh, &attrs, 0, (time64_t)0); + cstate->current_fh.fh_no_wcc = save_no_wcc; if (!status) status = nfserrno(attrs.na_labelerr); if (!status) --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -480,7 +480,7 @@ nfsd_setattr(struct svc_rqst *rqstp, str int accmode = NFSD_MAY_SATTR; umode_t ftype = 0; __be32 err; - int host_err; + int host_err = 0; bool get_write_count; bool size_change = (iap->ia_valid & ATTR_SIZE); int retries; @@ -538,6 +538,9 @@ nfsd_setattr(struct svc_rqst *rqstp, str }
inode_lock(inode); + err = fh_fill_pre_attrs(fhp); + if (err) + goto out_unlock; for (retries = 1;;) { struct iattr attrs;
@@ -565,13 +568,15 @@ nfsd_setattr(struct svc_rqst *rqstp, str attr->na_aclerr = set_posix_acl(&nop_mnt_idmap, dentry, ACL_TYPE_DEFAULT, attr->na_dpacl); + fh_fill_post_attrs(fhp); +out_unlock: inode_unlock(inode); if (size_change) put_write_access(inode); out: if (!host_err) host_err = commit_metadata(fhp); - return nfserrno(host_err); + return err != 0 ? err : nfserrno(host_err); }
#if defined(CONFIG_NFSD_V4)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit d7d8e3169b56e7696559a2427c922c0d55debcec ]
If fh_fill_pre_attrs() returns a non-zero status, the error flow takes it through out_unlock, which then overwrites the returned status code with
err = nfserrno(host_err);
Fixes: a332018a91c4 ("nfsd: handle failure to collect pre/post-op attrs more sanely")
Reviewed-by: Jeff Layton jlayton@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever chuck.lever@oracle.com
[ Slightly different error mapping ]
Signed-off-by: Sasha Levin sashal@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/nfsd/vfs.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1970,11 +1970,9 @@ out_nfserr:
 			err = nfserr_file_open;
 		else
 			err = nfserr_acces;
-	} else {
-		err = nfserrno(host_err);
 	}
 out:
-	return err;
+	return err != nfs_ok ? err : nfserrno(host_err);
 out_unlock:
 	inode_unlock(dirp);
 	goto out_drop_write;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: André Apitzsch git@apitzsch.eu
[ Upstream commit acc294519f1749041e1b8c74d46bbf6c57d8b061 ]
The driver defines IMX214_DEFAULT_LINK_FREQ 480000000, and then IMX214_DEFAULT_PIXEL_RATE ((IMX214_DEFAULT_LINK_FREQ * 8LL) / 10), which works out as 384MPix/s. (The 8 is 4 lanes and DDR.)
Parsing the PLL registers with the defined 24MHz input shows we're in single PLL mode, so the MIPI frequency is directly linked to the pixel rate. VTCK ends up being 1200MHz, and VTPXCK and OPPXCK are both 120MHz. Section 5.3 "Frame rate calculation formula" says "Pixel rate [pixels/s] = VTPXCK [MHz] * 4", so 120 * 4 = 480MPix/s, which basically agrees with my number above.
Section 3.1.4 "MIPI global timing setting" says "Output bitrate = OPPXCK * reg 0x113[7:0]", so 120MHz * 10, or 1200Mbit/s. That would be a link frequency of 600MHz due to DDR. That also matches 480MPix/s * 10bpp / 4 lanes / 2 for DDR.
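The same arithmetic, written out as a small standalone sketch (illustration only, re-deriving the figures quoted above; this is not driver code):

	#include <stdio.h>

	int main(void)
	{
		unsigned long long old_rate   = 480000000ULL * 8 / 10; /* old define: 384 MPix/s      */
		unsigned long long vtpxck     = 120000000ULL;          /* from PLL parsing, 24 MHz in */
		unsigned long long oppxck     = 120000000ULL;          /* equal to VTPXCK here        */
		unsigned long long pixel_rate = vtpxck * 4;            /* section 5.3: 480 MPix/s     */
		unsigned long long bitrate    = oppxck * 10;           /* OPPXCK * reg 0x113[7:0]     */
		unsigned long long link_freq  = bitrate / 2;           /* DDR: 600 MHz                */
		unsigned long long check      = pixel_rate * 10 / 4 / 2; /* 10bpp, 4 lanes, DDR       */

		printf("%llu %llu %llu %llu\n", old_rate, pixel_rate, link_freq, check);
		return 0;
	}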
Keep the previous link frequency for backward compatibility.
Acked-by: Ricardo Ribalda ribalda@chromium.org
Signed-off-by: André Apitzsch git@apitzsch.eu
Fixes: 436190596241 ("media: imx214: Add imx214 camera sensor driver")
Cc: stable@vger.kernel.org
Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com
Signed-off-by: Hans Verkuil hverkuil@xs4all.nl
[ changed dev_err() to dev_err_probe() for the final error case ]
Signed-off-by: Sasha Levin sashal@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/media/i2c/imx214.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)
--- a/drivers/media/i2c/imx214.c +++ b/drivers/media/i2c/imx214.c @@ -20,7 +20,9 @@ #include <media/v4l2-subdev.h>
#define IMX214_DEFAULT_CLK_FREQ 24000000 -#define IMX214_DEFAULT_LINK_FREQ 480000000 +#define IMX214_DEFAULT_LINK_FREQ 600000000 +/* Keep wrong link frequency for backward compatibility */ +#define IMX214_DEFAULT_LINK_FREQ_LEGACY 480000000 #define IMX214_DEFAULT_PIXEL_RATE ((IMX214_DEFAULT_LINK_FREQ * 8LL) / 10) #define IMX214_FPS 30 #define IMX214_MBUS_CODE MEDIA_BUS_FMT_SRGGB10_1X10 @@ -892,17 +894,26 @@ static int imx214_parse_fwnode(struct de goto done; }
- for (i = 0; i < bus_cfg.nr_of_link_frequencies; i++) + if (bus_cfg.nr_of_link_frequencies != 1) + dev_warn(dev, "Only one link-frequency supported, please review your DT. Continuing anyway\n"); + + for (i = 0; i < bus_cfg.nr_of_link_frequencies; i++) { if (bus_cfg.link_frequencies[i] == IMX214_DEFAULT_LINK_FREQ) break; - - if (i == bus_cfg.nr_of_link_frequencies) { - dev_err(dev, "link-frequencies %d not supported, Please review your DT\n", - IMX214_DEFAULT_LINK_FREQ); - ret = -EINVAL; - goto done; + if (bus_cfg.link_frequencies[i] == + IMX214_DEFAULT_LINK_FREQ_LEGACY) { + dev_warn(dev, + "link-frequencies %d not supported, please review your DT. Continuing anyway\n", + IMX214_DEFAULT_LINK_FREQ); + break; + } }
+ if (i == bus_cfg.nr_of_link_frequencies) + ret = dev_err_probe(dev, -EINVAL, + "link-frequencies %d not supported, please review your DT\n", + IMX214_DEFAULT_LINK_FREQ); + done: v4l2_fwnode_endpoint_free(&bus_cfg); fwnode_handle_put(endpoint);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 0bb2f7a1ad1f11d861f58e5ee5051c8974ff9569 ]
When I ran the repro [0] and waited a few seconds, I observed two LOCKDEP splats: a warning immediately followed by a null-ptr-deref. [1]
Reproduction Steps:
 1) Mount CIFS
 2) Add an iptables rule to drop incoming FIN packets for CIFS
 3) Unmount CIFS
 4) Unload the CIFS module
 5) Remove the iptables rule
At step 3), the CIFS module calls sock_release() for the underlying TCP socket, and it returns quickly. However, the socket remains in FIN_WAIT_1 because incoming FIN packets are dropped.
At this point, the module's refcnt is 0 while the socket is still alive, so the following rmmod command succeeds.
  # ss -tan
  State      Recv-Q Send-Q  Local Address:Port   Peer Address:Port
  FIN-WAIT-1 0      477        10.0.2.15:51062     10.0.0.137:445

  # lsmod | grep cifs
  cifs 1159168 0
This highlights a discrepancy between the lifetime of the CIFS module and the underlying TCP socket. Even after CIFS calls sock_release() and it returns, the TCP socket does not die immediately in order to close the connection gracefully.
While this is generally fine, it causes an issue with LOCKDEP because CIFS assigns a different lock class to the TCP socket's sk->sk_lock using sock_lock_init_class_and_name().
Once an incoming packet is processed for the socket or a timer fires, sk->sk_lock is acquired.
Then, LOCKDEP checks the lock context in check_wait_context(), where hlock_class() is called to retrieve the lock class. However, since the module has already been unloaded, hlock_class() logs a warning and returns NULL, triggering the null-ptr-deref.
If LOCKDEP is enabled, we must ensure that a module calling sock_lock_init_class_and_name() (CIFS, NFS, etc) cannot be unloaded while such a socket is still alive to prevent this issue.
Let's hold the module reference in sock_lock_init_class_and_name() and release it when the socket is freed in sk_prot_free().
Note that sock_lock_init() clears sk->sk_owner for svc_create_socket() that calls sock_lock_init_class_and_name() for a listening socket, which clones a socket by sk_clone_lock() without GFP_ZERO.
[0]:
CIFS_SERVER="10.0.0.137"
CIFS_PATH="//${CIFS_SERVER}/Users/Administrator/Desktop/CIFS_TEST"
DEV="enp0s3"
CRED="/root/WindowsCredential.txt"
MNT=$(mktemp -d /tmp/XXXXXX)
mount -t cifs ${CIFS_PATH} ${MNT} -o vers=3.0,credentials=${CRED},cache=none,echo_interval=1
iptables -A INPUT -s ${CIFS_SERVER} -j DROP
for i in $(seq 10); do
    umount ${MNT}
    rmmod cifs
    sleep 1
done
rm -r ${MNT}
iptables -D INPUT -s ${CIFS_SERVER} -j DROP
[1]:
DEBUG_LOCKS_WARN_ON(1)
WARNING: CPU: 10 PID: 0 at kernel/locking/lockdep.c:234 hlock_class (kernel/locking/lockdep.c:234 kernel/locking/lockdep.c:223)
Modules linked in: cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs]
CPU: 10 UID: 0 PID: 0 Comm: swapper/10 Not tainted 6.14.0 #36
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:hlock_class (kernel/locking/lockdep.c:234 kernel/locking/lockdep.c:223)
...
Call Trace:
 <IRQ>
 __lock_acquire (kernel/locking/lockdep.c:4853 kernel/locking/lockdep.c:5178)
 lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853 kernel/locking/lockdep.c:5816)
 _raw_spin_lock_nested (kernel/locking/spinlock.c:379)
 tcp_v4_rcv (./include/linux/skbuff.h:1678 ./include/net/tcp.h:2547 net/ipv4/tcp_ipv4.c:2350)
...

BUG: kernel NULL pointer dereference, address: 00000000000000c4
PF: supervisor read access in kernel mode
PF: error_code(0x0000) - not-present page
PGD 0
Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 10 UID: 0 PID: 0 Comm: swapper/10 Tainted: G W 6.14.0 #36
Tainted: [W]=WARN
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:__lock_acquire (kernel/locking/lockdep.c:4852 kernel/locking/lockdep.c:5178)
Code: 15 41 09 c7 41 8b 44 24 20 25 ff 1f 00 00 41 09 c7 8b 84 24 a0 00 00 00 45 89 7c 24 20 41 89 44 24 24 e8 e1 bc ff ff 4c 89 e7 <44> 0f b6 b8 c4 00 00 00 e8 d1 bc ff ff 0f b6 80 c5 00 00 00 88 44
RSP: 0018:ffa0000000468a10 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ff1100010091cc38 RCX: 0000000000000027
RDX: ff1100081f09ca48 RSI: 0000000000000001 RDI: ff1100010091cc88
RBP: ff1100010091c200 R08: ff1100083fe6e228 R09: 00000000ffffbfff
R10: ff1100081eca0000 R11: ff1100083fe10dc0 R12: ff1100010091cc88
R13: 0000000000000001 R14: 0000000000000000 R15: 00000000000424b1
FS: 0000000000000000(0000) GS:ff1100081f080000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000c4 CR3: 0000000002c4a003 CR4: 0000000000771ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <IRQ>
 lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853 kernel/locking/lockdep.c:5816)
 _raw_spin_lock_nested (kernel/locking/spinlock.c:379)
 tcp_v4_rcv (./include/linux/skbuff.h:1678 ./include/net/tcp.h:2547 net/ipv4/tcp_ipv4.c:2350)
 ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205 (discriminator 1))
 ip_local_deliver_finish (./include/linux/rcupdate.h:878 net/ipv4/ip_input.c:234)
 ip_sublist_rcv_finish (net/ipv4/ip_input.c:576)
 ip_list_rcv_finish (net/ipv4/ip_input.c:628)
 ip_list_rcv (net/ipv4/ip_input.c:670)
 __netif_receive_skb_list_core (net/core/dev.c:5939 net/core/dev.c:5986)
 netif_receive_skb_list_internal (net/core/dev.c:6040 net/core/dev.c:6129)
 napi_complete_done (./include/linux/list.h:37 ./include/net/gro.h:519 ./include/net/gro.h:514 net/core/dev.c:6496)
 e1000_clean (drivers/net/ethernet/intel/e1000/e1000_main.c:3815)
 __napi_poll.constprop.0 (net/core/dev.c:7191)
 net_rx_action (net/core/dev.c:7262 net/core/dev.c:7382)
 handle_softirqs (kernel/softirq.c:561)
 __irq_exit_rcu (kernel/softirq.c:596 kernel/softirq.c:435 kernel/softirq.c:662)
 irq_exit_rcu (kernel/softirq.c:680)
 common_interrupt (arch/x86/kernel/irq.c:280 (discriminator 14))
 </IRQ>
 <TASK>
 asm_common_interrupt (./arch/x86/include/asm/idtentry.h:693)
RIP: 0010:default_idle (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:92 arch/x86/kernel/process.c:744)
Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d c3 2b 15 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90
RSP: 0018:ffa00000000ffee8 EFLAGS: 00000202
RAX: 000000000000640b RBX: ff1100010091c200 RCX: 0000000000061aa4
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff812f30c5
RBP: 000000000000000a R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000002 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 ? do_idle (kernel/sched/idle.c:186 kernel/sched/idle.c:325)
 default_idle_call (./include/linux/cpuidle.h:143 kernel/sched/idle.c:118)
 do_idle (kernel/sched/idle.c:186 kernel/sched/idle.c:325)
 cpu_startup_entry (kernel/sched/idle.c:422 (discriminator 1))
 start_secondary (arch/x86/kernel/smpboot.c:315)
 common_startup_64 (arch/x86/kernel/head_64.S:421)
 </TASK>
Modules linked in: cifs_arc4 nls_ucs2_utils cifs_md4 [last unloaded: cifs]
CR2: 00000000000000c4
Fixes: ed07536ed673 ("[PATCH] lockdep: annotate nfs/nfsd in-kernel sockets")
Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250407163313.22682-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski kuba@kernel.org
[ Adjust context ]
Signed-off-by: Sasha Levin sashal@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 include/net/sock.h | 40 ++++++++++++++++++++++++++++++++++++++--
 net/core/sock.c    |  5 +++++
 2 files changed, 43 insertions(+), 2 deletions(-)
--- a/include/net/sock.h +++ b/include/net/sock.h @@ -353,6 +353,8 @@ struct sk_filter; * @sk_txtime_unused: unused txtime flags * @ns_tracker: tracker for netns reference * @sk_bind2_node: bind node in the bhash2 table + * @sk_owner: reference to the real owner of the socket that calls + * sock_lock_init_class_and_name(). */ struct sock { /* @@ -545,6 +547,10 @@ struct sock { struct rcu_head sk_rcu; netns_tracker ns_tracker; struct hlist_node sk_bind2_node; + +#if IS_ENABLED(CONFIG_PROVE_LOCKING) && IS_ENABLED(CONFIG_MODULES) + struct module *sk_owner; +#endif };
enum sk_pacing { @@ -1699,6 +1705,35 @@ static inline void sk_mem_uncharge(struc sk_mem_reclaim(sk); }
+#if IS_ENABLED(CONFIG_PROVE_LOCKING) && IS_ENABLED(CONFIG_MODULES) +static inline void sk_owner_set(struct sock *sk, struct module *owner) +{ + __module_get(owner); + sk->sk_owner = owner; +} + +static inline void sk_owner_clear(struct sock *sk) +{ + sk->sk_owner = NULL; +} + +static inline void sk_owner_put(struct sock *sk) +{ + module_put(sk->sk_owner); +} +#else +static inline void sk_owner_set(struct sock *sk, struct module *owner) +{ +} + +static inline void sk_owner_clear(struct sock *sk) +{ +} + +static inline void sk_owner_put(struct sock *sk) +{ +} +#endif /* * Macro so as to not evaluate some arguments when * lockdep is not enabled. @@ -1708,13 +1743,14 @@ static inline void sk_mem_uncharge(struc */ #define sock_lock_init_class_and_name(sk, sname, skey, name, key) \ do { \ + sk_owner_set(sk, THIS_MODULE); \ sk->sk_lock.owned = 0; \ init_waitqueue_head(&sk->sk_lock.wq); \ spin_lock_init(&(sk)->sk_lock.slock); \ debug_check_no_locks_freed((void *)&(sk)->sk_lock, \ - sizeof((sk)->sk_lock)); \ + sizeof((sk)->sk_lock)); \ lockdep_set_class_and_name(&(sk)->sk_lock.slock, \ - (skey), (sname)); \ + (skey), (sname)); \ lockdep_init_map(&(sk)->sk_lock.dep_map, (name), (key), 0); \ } while (0)
--- a/net/core/sock.c +++ b/net/core/sock.c @@ -2029,6 +2029,8 @@ lenout: */ static inline void sock_lock_init(struct sock *sk) { + sk_owner_clear(sk); + if (sk->sk_kern_sock) sock_lock_init_class_and_name( sk, @@ -2124,6 +2126,9 @@ static void sk_prot_free(struct proto *p cgroup_sk_free(&sk->sk_cgrp_data); mem_cgroup_sk_free(sk); security_sk_free(sk); + + sk_owner_put(sk); + if (slab != NULL) kmem_cache_free(slab, sk); else
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mimi Zohar zohar@linux.ibm.com
[ Upstream commit a414016218ca97140171aa3bb926b02e1f68c2cc ]
Each time a file in policy that is already opened for read is opened for write, a Time-of-Measure-Time-of-Use (ToMToU) integrity violation audit message is emitted and a violation record is added to the IMA measurement list. This occurs even if a ToMToU violation has already been recorded.
Limit the number of ToMToU integrity violations per file open for read.
Note: the IMA_MAY_EMIT_TOMTOU atomic flag must be set from the reader side based on policy. This may result in one ToMToU violation per file open for read.
Since IMA_MUST_MEASURE is only used for violations, rename the atomic IMA_MUST_MEASURE flag to IMA_MAY_EMIT_TOMTOU.
Cc: stable@vger.kernel.org # applies cleanly up to linux-6.6
Tested-by: Stefan Berger stefanb@linux.ibm.com
Reviewed-by: Petr Vorel pvorel@suse.cz
Tested-by: Petr Vorel pvorel@suse.cz
Reviewed-by: Roberto Sassu roberto.sassu@huawei.com
Signed-off-by: Mimi Zohar zohar@linux.ibm.com
[ adapted IMA flag definitions location from ima.h to integrity.h ]
Signed-off-by: Sasha Levin sashal@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 security/integrity/ima/ima_main.c | 16 +++++++++++-----
 security/integrity/integrity.h    |  3 ++-
 2 files changed, 13 insertions(+), 6 deletions(-)
--- a/security/integrity/ima/ima_main.c +++ b/security/integrity/ima/ima_main.c @@ -128,16 +128,22 @@ static void ima_rdwr_violation_check(str if (atomic_read(&inode->i_readcount) && IS_IMA(inode)) { if (!iint) iint = integrity_iint_find(inode); + /* IMA_MEASURE is set from reader side */ - if (iint && test_bit(IMA_MUST_MEASURE, - &iint->atomic_flags)) + if (iint && test_and_clear_bit(IMA_MAY_EMIT_TOMTOU, + &iint->atomic_flags)) send_tomtou = true; } } else { if (must_measure) - set_bit(IMA_MUST_MEASURE, &iint->atomic_flags); - if (inode_is_open_for_write(inode) && must_measure) - send_writers = true; + set_bit(IMA_MAY_EMIT_TOMTOU, &iint->atomic_flags); + + /* Limit number of open_writers violations */ + if (inode_is_open_for_write(inode) && must_measure) { + if (!test_and_set_bit(IMA_EMITTED_OPENWRITERS, + &iint->atomic_flags)) + send_writers = true; + } }
if (!send_tomtou && !send_writers) --- a/security/integrity/integrity.h +++ b/security/integrity/integrity.h @@ -74,7 +74,8 @@ #define IMA_UPDATE_XATTR 1 #define IMA_CHANGE_ATTR 2 #define IMA_DIGSIG 3 -#define IMA_MUST_MEASURE 4 +#define IMA_MAY_EMIT_TOMTOU 4 +#define IMA_EMITTED_OPENWRITERS 5
enum evm_ima_xattr_type { IMA_XATTR_DIGEST = 0x01,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tigran Mkrtchyan tigran.mkrtchyan@desy.de
[ Upstream commit 5a46d2339a5ae268ede53a221f20433d8ea4f2f9 ]
Recent commit f06bedfa62d5 ("pNFS/flexfiles: don't attempt pnfs on fatal DS errors") has changed the error return type of ff_layout_choose_ds_for_read() from NULL to an error pointer. However, not all code paths have been updated to match the change. Thus, some non-NULL checks will accept error pointers as a valid return value.
Reported-by: Dan Carpenter dan.carpenter@linaro.org
Suggested-by: Dan Carpenter dan.carpenter@linaro.org
Fixes: f06bedfa62d5 ("pNFS/flexfiles: don't attempt pnfs on fatal DS errors")
Signed-off-by: Tigran Mkrtchyan tigran.mkrtchyan@desy.de
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/nfs/flexfilelayout/flexfilelayout.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c index 7354b6b104783..b05dd4d3ed653 100644 --- a/fs/nfs/flexfilelayout/flexfilelayout.c +++ b/fs/nfs/flexfilelayout/flexfilelayout.c @@ -756,8 +756,11 @@ ff_layout_choose_ds_for_read(struct pnfs_layout_segment *lseg, continue;
if (check_device && - nfs4_test_deviceid_unavailable(&mirror->mirror_ds->id_node)) + nfs4_test_deviceid_unavailable(&mirror->mirror_ds->id_node)) { + // reinitialize the error state in case if this is the last iteration + ds = ERR_PTR(-EINVAL); continue; + }
*best_idx = idx; break; @@ -787,7 +790,7 @@ ff_layout_choose_best_ds_for_read(struct pnfs_layout_segment *lseg, struct nfs4_pnfs_ds *ds;
ds = ff_layout_choose_valid_ds_for_read(lseg, start_idx, best_idx); - if (ds) + if (!IS_ERR(ds)) return ds; return ff_layout_choose_any_ds_for_read(lseg, start_idx, best_idx); } @@ -801,7 +804,7 @@ ff_layout_get_ds_for_read(struct nfs_pageio_descriptor *pgio,
ds = ff_layout_choose_best_ds_for_read(lseg, pgio->pg_mirror_idx, best_idx); - if (ds || !pgio->pg_mirror_idx) + if (!IS_ERR(ds) || !pgio->pg_mirror_idx) return ds; return ff_layout_choose_best_ds_for_read(lseg, 0, best_idx); } @@ -859,7 +862,7 @@ ff_layout_pg_init_read(struct nfs_pageio_descriptor *pgio, req->wb_nio = 0;
ds = ff_layout_get_ds_for_read(pgio, &ds_idx); - if (!ds) { + if (IS_ERR(ds)) { if (!ff_layout_no_fallback_to_mds(pgio->pg_lseg)) goto out_mds; pnfs_generic_pg_cleanup(pgio); @@ -1063,11 +1066,13 @@ static void ff_layout_resend_pnfs_read(struct nfs_pgio_header *hdr) { u32 idx = hdr->pgio_mirror_idx + 1; u32 new_idx = 0; + struct nfs4_pnfs_ds *ds;
- if (ff_layout_choose_any_ds_for_read(hdr->lseg, idx, &new_idx)) - ff_layout_send_layouterror(hdr->lseg); - else + ds = ff_layout_choose_any_ds_for_read(hdr->lseg, idx, &new_idx); + if (IS_ERR(ds)) pnfs_error_mark_layout_for_return(hdr->inode, hdr->lseg); + else + ff_layout_send_layouterror(hdr->lseg); pnfs_read_resend_pnfs(hdr, new_idx); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Justin Worrell jworrell@gmail.com
[ Upstream commit 9559d2fffd4f9b892165eed48198a0e5cb8504e6 ]
xs_sock_recv_cmsg was failing to call xs_sock_process_cmsg for any cmsg type other than TLS_RECORD_TYPE_ALERT (TLS_RECORD_TYPE_DATA and other values were not handled). Based on my reading of the previous commit (cc5d5908: "sunrpc: fix client side handling of tls alerts"), it looks like only iov_iter_revert should be conditional on TLS_RECORD_TYPE_ALERT, while the other cmsg types should still call xs_sock_process_cmsg. On my machine, I was unable to connect (over mtls) to an NFS share hosted on FreeBSD. With this patch applied, I am able to mount the share again.
Fixes: cc5d59081fa2 ("sunrpc: fix client side handling of tls alerts")
Signed-off-by: Justin Worrell jworrell@gmail.com
Reviewed-and-tested-by: Scott Mayhew smayhew@redhat.com
Link: https://lore.kernel.org/r/20250904211038.12874-3-jworrell@gmail.com
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 net/sunrpc/xprtsock.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 8b27a21f3b42d..3660ef2647112 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -407,9 +407,9 @@ xs_sock_recv_cmsg(struct socket *sock, unsigned int *msg_flags, int flags)
 	iov_iter_kvec(&msg.msg_iter, ITER_DEST, &alert_kvec, 1,
 		      alert_kvec.iov_len);
 	ret = sock_recvmsg(sock, &msg, flags);
-	if (ret > 0 &&
-	    tls_get_record_type(sock->sk, &u.cmsg) == TLS_RECORD_TYPE_ALERT) {
-		iov_iter_revert(&msg.msg_iter, ret);
+	if (ret > 0) {
+		if (tls_get_record_type(sock->sk, &u.cmsg) == TLS_RECORD_TYPE_ALERT)
+			iov_iter_revert(&msg.msg_iter, ret);
 		ret = xs_sock_process_cmsg(sock, &msg, msg_flags,
 					   &u.cmsg, -EAGAIN);
 	}
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit 31f1a960ad1a14def94fa0b8c25d62b4c032813f ]
Don't clear the capabilities that are not going to get reset by the call to _nfs4_server_capabilities().
Reported-by: Scott Haiden scott.b.haiden@gmail.com
Fixes: b01f21cacde9 ("NFS: Fix the setting of capabilities when automounting a new filesystem")
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/nfs/nfs4proc.c | 1 -
 1 file changed, 1 deletion(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 6debcfc63222d..bc1eaabaf2c30 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3951,7 +3951,6 @@ int nfs4_server_capabilities(struct nfs_server *server, struct nfs_fh *fhandle)
 	};
 	int err;

-	nfs_server_set_init_caps(server);
 	do {
 		err = nfs4_handle_exception(server,
 				_nfs4_server_capabilities(server, fhandle),
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit dd5a8621b886b02f8341c5d4ea68eb2c552ebd3e ]
_nfs4_server_capabilities() is expected to clear any flags that are not supported by the server.
Fixes: 8a59bb93b7e3 ("NFSv4 store server support for fs_location attribute")
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/nfs/nfs4proc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index bc1eaabaf2c30..124b9cee6fed7 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3882,8 +3882,9 @@ static int _nfs4_server_capabilities(struct nfs_server *server, struct nfs_fh *f
 			res.attr_bitmask[2] &= FATTR4_WORD2_NFS42_MASK;
 	}
 	memcpy(server->attr_bitmask, res.attr_bitmask, sizeof(server->attr_bitmask));
-	server->caps &= ~(NFS_CAP_ACLS | NFS_CAP_HARDLINKS |
-			NFS_CAP_SYMLINKS| NFS_CAP_SECURITY_LABEL);
+	server->caps &=
+		~(NFS_CAP_ACLS | NFS_CAP_HARDLINKS | NFS_CAP_SYMLINKS |
+		  NFS_CAP_SECURITY_LABEL | NFS_CAP_FS_LOCATIONS);
 	server->fattr_valid = NFS_ATTR_FATTR_V4;
 	if (res.attr_bitmask[0] & FATTR4_WORD0_ACL &&
 	    res.acl_bitmask & ACL4_SUPPORT_ALLOW_ACL)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit 4fb2b677fc1f70ee642c0beecc3cabf226ef5707 ]
nfs_server_set_fsinfo() shouldn't assume that NFS_CAP_XATTR is unset on entry to the function.
Fixes: b78ef845c35d ("NFSv4.2: query the server for extended attribute support")
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/nfs/client.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index cc764da581c43..1bcdaee7e856f 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -873,6 +873,8 @@ static void nfs_server_set_fsinfo(struct nfs_server *server,
 	if (fsinfo->xattr_support)
 		server->caps |= NFS_CAP_XATTR;
+	else
+		server->caps &= ~NFS_CAP_XATTR;
 #endif
 }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luo Gengkun luogengkun@huaweicloud.com
[ Upstream commit 3d62ab32df065e4a7797204a918f6489ddb8a237 ]
Both tracing_mark_write and tracing_mark_raw_write call __copy_from_user_inatomic while preemption is disabled. But in some cases, __copy_from_user_inatomic may trigger a page fault and subtly end up calling schedule(). If the task is then migrated to another CPU, the following warning is triggered:

  if (RB_WARN_ON(cpu_buffer, !local_read(&cpu_buffer->committing)))
An example can illustrate this issue:
 process flow                                            CPU
---------------------------------------------------------------------
tracing_mark_raw_write():                                cpu:0
   ...
   ring_buffer_lock_reserve():                           cpu:0
      ...
      cpu = raw_smp_processor_id()                       cpu:0
      cpu_buffer = buffer->buffers[cpu]                  cpu:0
      ...
   ...
   __copy_from_user_inatomic():                          cpu:0
      ...
      # page fault
      do_mem_abort():                                    cpu:0
         ...
         # Call schedule
         schedule()                                      cpu:0
      ...
   # the task schedule to cpu1
   __buffer_unlock_commit():                             cpu:1
      ...
      ring_buffer_unlock_commit():                       cpu:1
         ...
         cpu = raw_smp_processor_id()                    cpu:1
         cpu_buffer = buffer->buffers[cpu]               cpu:1
As shown above, the process will acquire cpuid twice and the return values are not the same.
To fix this problem, use copy_from_user_nofault instead of __copy_from_user_inatomic, as the former performs an 'access_ok' check before copying.
Link: https://lore.kernel.org/20250819105152.2766363-1-luogengkun@huaweicloud.com Fixes: 656c7f0d2d2b ("tracing: Replace kmap with copy_from_user() in trace_marker writing") Signed-off-by: Luo Gengkun luogengkun@huaweicloud.com Reviewed-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index a32c8637503d1..baa87876cee47 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -7226,7 +7226,7 @@ tracing_mark_write(struct file *filp, const char __user *ubuf, entry = ring_buffer_event_data(event); entry->ip = _THIS_IP_;
- len = __copy_from_user_inatomic(&entry->buf, ubuf, cnt); + len = copy_from_user_nofault(&entry->buf, ubuf, cnt); if (len) { memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE); cnt = FAULTED_SIZE; @@ -7301,7 +7301,7 @@ tracing_mark_raw_write(struct file *filp, const char __user *ubuf,
entry = ring_buffer_event_data(event);
- len = __copy_from_user_inatomic(&entry->id, ubuf, cnt); + len = copy_from_user_nofault(&entry->id, ubuf, cnt); if (len) { entry->id = -1; memcpy(&entry->buf, FAULTED_STR, FAULTED_SIZE);
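For context, a minimal sketch of the behaviour the fix relies on; fill_entry is a hypothetical helper, not code from this patch:

    /*
     * copy_from_user_nofault() checks access_ok() and wraps the copy in
     * pagefault_disable()/pagefault_enable(), so a bad or unmapped user
     * pointer makes it return -EFAULT instead of faulting and possibly
     * scheduling - which must not happen between the two
     * raw_smp_processor_id() reads shown in the flow above.
     */
    static int fill_entry(void *dst, const void __user *ubuf, size_t cnt)
    {
            long ret;

            ret = copy_from_user_nofault(dst, ubuf, cnt);
            if (ret)                /* faulted: the caller substitutes FAULTED_STR */
                    return -EFAULT;
            return 0;
    }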
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vladimir Riabchun ferr.lambarginio@gmail.com
[ Upstream commit 80d03a40837a9b26750a25122b906c052cc846c9 ]
In the my_tramp1 function, the .size directive was placed above the ASM_RET instruction, leading to a wrong function size.
Link: https://lore.kernel.org/aK3d7vxNcO52kEmg@vova-pc Fixes: 9d907f1ae80b ("samples/ftrace: Fix asm function ELF annotations") Signed-off-by: Vladimir Riabchun ferr.lambarginio@gmail.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- samples/ftrace/ftrace-direct-modify.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c index e2a6a69352dfb..b40f85e3806fc 100644 --- a/samples/ftrace/ftrace-direct-modify.c +++ b/samples/ftrace/ftrace-direct-modify.c @@ -40,8 +40,8 @@ asm ( CALL_DEPTH_ACCOUNT " call my_direct_func1\n" " leave\n" -" .size my_tramp1, .-my_tramp1\n" ASM_RET +" .size my_tramp1, .-my_tramp1\n"
" .type my_tramp2, @function\n" " .globl my_tramp2\n"
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Max Kellermann max.kellermann@ionos.com
[ Upstream commit 38a125b31504f91bf6fdd3cfc3a3e9a721e6c97a ]
This allows killing processes that wait for a lock when one process is stuck waiting for the NFS server. It aims to complete the coverage of killable NFS operations, which nfs_direct_wait(), for example, already provides.
Signed-off-by: Max Kellermann max.kellermann@ionos.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Stable-dep-of: 9eb90f435415 ("NFS: Serialise O_DIRECT i/o and truncate()") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/direct.c | 21 ++++++++++++++++++--- fs/nfs/file.c | 14 +++++++++++--- fs/nfs/internal.h | 7 ++++--- fs/nfs/io.c | 44 +++++++++++++++++++++++++++++++++----------- 4 files changed, 66 insertions(+), 20 deletions(-)
diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c index a1ff4a4f5380e..4e53708dfcf43 100644 --- a/fs/nfs/direct.c +++ b/fs/nfs/direct.c @@ -469,8 +469,16 @@ ssize_t nfs_file_direct_read(struct kiocb *iocb, struct iov_iter *iter, if (user_backed_iter(iter)) dreq->flags = NFS_ODIRECT_SHOULD_DIRTY;
- if (!swap) - nfs_start_io_direct(inode); + if (!swap) { + result = nfs_start_io_direct(inode); + if (result) { + /* release the reference that would usually be + * consumed by nfs_direct_read_schedule_iovec() + */ + nfs_direct_req_release(dreq); + goto out_release; + } + }
NFS_I(inode)->read_io += count; requested = nfs_direct_read_schedule_iovec(dreq, iter, iocb->ki_pos); @@ -1023,7 +1031,14 @@ ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter, requested = nfs_direct_write_schedule_iovec(dreq, iter, pos, FLUSH_STABLE); } else { - nfs_start_io_direct(inode); + result = nfs_start_io_direct(inode); + if (result) { + /* release the reference that would usually be + * consumed by nfs_direct_write_schedule_iovec() + */ + nfs_direct_req_release(dreq); + goto out_release; + }
requested = nfs_direct_write_schedule_iovec(dreq, iter, pos, FLUSH_COND_STABLE); diff --git a/fs/nfs/file.c b/fs/nfs/file.c index 003dda0018403..2f4db026f8d67 100644 --- a/fs/nfs/file.c +++ b/fs/nfs/file.c @@ -167,7 +167,10 @@ nfs_file_read(struct kiocb *iocb, struct iov_iter *to) iocb->ki_filp, iov_iter_count(to), (unsigned long) iocb->ki_pos);
- nfs_start_io_read(inode); + result = nfs_start_io_read(inode); + if (result) + return result; + result = nfs_revalidate_mapping(inode, iocb->ki_filp->f_mapping); if (!result) { result = generic_file_read_iter(iocb, to); @@ -188,7 +191,10 @@ nfs_file_splice_read(struct file *in, loff_t *ppos, struct pipe_inode_info *pipe
dprintk("NFS: splice_read(%pD2, %zu@%llu)\n", in, len, *ppos);
- nfs_start_io_read(inode); + result = nfs_start_io_read(inode); + if (result) + return result; + result = nfs_revalidate_mapping(inode, in->f_mapping); if (!result) { result = filemap_splice_read(in, ppos, pipe, len, flags); @@ -668,7 +674,9 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from) nfs_clear_invalid_mapping(file->f_mapping);
since = filemap_sample_wb_err(file->f_mapping); - nfs_start_io_write(inode); + error = nfs_start_io_write(inode); + if (error) + return error; result = generic_write_checks(iocb, from); if (result > 0) result = generic_perform_write(iocb, from); diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index 4eea91d054b24..e78f43a137231 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -6,6 +6,7 @@ #include "nfs4_fs.h" #include <linux/fs_context.h> #include <linux/security.h> +#include <linux/compiler_attributes.h> #include <linux/crc32.h> #include <linux/sunrpc/addr.h> #include <linux/nfs_page.h> @@ -461,11 +462,11 @@ extern const struct netfs_request_ops nfs_netfs_ops; #endif
/* io.c */ -extern void nfs_start_io_read(struct inode *inode); +extern __must_check int nfs_start_io_read(struct inode *inode); extern void nfs_end_io_read(struct inode *inode); -extern void nfs_start_io_write(struct inode *inode); +extern __must_check int nfs_start_io_write(struct inode *inode); extern void nfs_end_io_write(struct inode *inode); -extern void nfs_start_io_direct(struct inode *inode); +extern __must_check int nfs_start_io_direct(struct inode *inode); extern void nfs_end_io_direct(struct inode *inode);
static inline bool nfs_file_io_is_buffered(struct nfs_inode *nfsi) diff --git a/fs/nfs/io.c b/fs/nfs/io.c index b5551ed8f648b..3388faf2acb9f 100644 --- a/fs/nfs/io.c +++ b/fs/nfs/io.c @@ -39,19 +39,28 @@ static void nfs_block_o_direct(struct nfs_inode *nfsi, struct inode *inode) * Note that buffered writes and truncates both take a write lock on * inode->i_rwsem, meaning that those are serialised w.r.t. the reads. */ -void +int nfs_start_io_read(struct inode *inode) { struct nfs_inode *nfsi = NFS_I(inode); + int err; + /* Be an optimist! */ - down_read(&inode->i_rwsem); + err = down_read_killable(&inode->i_rwsem); + if (err) + return err; if (test_bit(NFS_INO_ODIRECT, &nfsi->flags) == 0) - return; + return 0; up_read(&inode->i_rwsem); + /* Slow path.... */ - down_write(&inode->i_rwsem); + err = down_write_killable(&inode->i_rwsem); + if (err) + return err; nfs_block_o_direct(nfsi, inode); downgrade_write(&inode->i_rwsem); + + return 0; }
/** @@ -74,11 +83,15 @@ nfs_end_io_read(struct inode *inode) * Declare that a buffered read operation is about to start, and ensure * that we block all direct I/O. */ -void +int nfs_start_io_write(struct inode *inode) { - down_write(&inode->i_rwsem); - nfs_block_o_direct(NFS_I(inode), inode); + int err; + + err = down_write_killable(&inode->i_rwsem); + if (!err) + nfs_block_o_direct(NFS_I(inode), inode); + return err; }
/** @@ -119,19 +132,28 @@ static void nfs_block_buffered(struct nfs_inode *nfsi, struct inode *inode) * Note that buffered writes and truncates both take a write lock on * inode->i_rwsem, meaning that those are serialised w.r.t. O_DIRECT. */ -void +int nfs_start_io_direct(struct inode *inode) { struct nfs_inode *nfsi = NFS_I(inode); + int err; + /* Be an optimist! */ - down_read(&inode->i_rwsem); + err = down_read_killable(&inode->i_rwsem); + if (err) + return err; if (test_bit(NFS_INO_ODIRECT, &nfsi->flags) != 0) - return; + return 0; up_read(&inode->i_rwsem); + /* Slow path.... */ - down_write(&inode->i_rwsem); + err = down_write_killable(&inode->i_rwsem); + if (err) + return err; nfs_block_buffered(nfsi, inode); downgrade_write(&inode->i_rwsem); + + return 0; }
/**
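The core of the change is converting the unconditional rwsem acquisitions into their killable variants and propagating the error to the caller. A distilled sketch of the pattern, leaving out the NFS_INO_ODIRECT fast/slow-path handling; example_start_io is illustrative only:

    /* the killable-lock pattern this series applies to nfs_start_io_*() */
    static int example_start_io(struct inode *inode)
    {
            int err;

            err = down_read_killable(&inode->i_rwsem);
            if (err)        /* -EINTR: a fatal signal ended the wait */
                    return err;
            /* ... do the I/O; the matching nfs_end_io_*() does the up_read() ... */
            return 0;
    }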
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit 9eb90f435415c7da4800974ed943e39b5578ee7f ]
Ensure that all O_DIRECT reads and writes are complete, and prevent the initiation of new i/o until the setattr operation that will truncate the file is complete.
Fixes: a5864c999de6 ("NFS: Do not serialise O_DIRECT reads and writes") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/inode.c | 4 +++- fs/nfs/internal.h | 10 ++++++++++ fs/nfs/io.c | 13 ++----------- 3 files changed, 15 insertions(+), 12 deletions(-)
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index 7e7dd2aab449d..5cd5e4226db36 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -645,8 +645,10 @@ nfs_setattr(struct mnt_idmap *idmap, struct dentry *dentry, trace_nfs_setattr_enter(inode);
/* Write all dirty data */ - if (S_ISREG(inode->i_mode)) + if (S_ISREG(inode->i_mode)) { + nfs_file_block_o_direct(NFS_I(inode)); nfs_sync_inode(inode); + }
fattr = nfs_alloc_fattr_with_label(NFS_SERVER(inode)); if (fattr == NULL) { diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index e78f43a137231..bde81e0abf0ae 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -474,6 +474,16 @@ static inline bool nfs_file_io_is_buffered(struct nfs_inode *nfsi) return test_bit(NFS_INO_ODIRECT, &nfsi->flags) == 0; }
+/* Must be called with exclusively locked inode->i_rwsem */ +static inline void nfs_file_block_o_direct(struct nfs_inode *nfsi) +{ + if (test_bit(NFS_INO_ODIRECT, &nfsi->flags)) { + clear_bit(NFS_INO_ODIRECT, &nfsi->flags); + inode_dio_wait(&nfsi->vfs_inode); + } +} + + /* namespace.c */ #define NFS_PATH_CANONICAL 1 extern char *nfs_path(char **p, struct dentry *dentry, diff --git a/fs/nfs/io.c b/fs/nfs/io.c index 3388faf2acb9f..d275b0a250bf3 100644 --- a/fs/nfs/io.c +++ b/fs/nfs/io.c @@ -14,15 +14,6 @@
#include "internal.h"
-/* Call with exclusively locked inode->i_rwsem */ -static void nfs_block_o_direct(struct nfs_inode *nfsi, struct inode *inode) -{ - if (test_bit(NFS_INO_ODIRECT, &nfsi->flags)) { - clear_bit(NFS_INO_ODIRECT, &nfsi->flags); - inode_dio_wait(inode); - } -} - /** * nfs_start_io_read - declare the file is being used for buffered reads * @inode: file inode @@ -57,7 +48,7 @@ nfs_start_io_read(struct inode *inode) err = down_write_killable(&inode->i_rwsem); if (err) return err; - nfs_block_o_direct(nfsi, inode); + nfs_file_block_o_direct(nfsi); downgrade_write(&inode->i_rwsem);
return 0; @@ -90,7 +81,7 @@ nfs_start_io_write(struct inode *inode)
err = down_write_killable(&inode->i_rwsem); if (!err) - nfs_block_o_direct(NFS_I(inode), inode); + nfs_file_block_o_direct(NFS_I(inode)); return err; }
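The ordering this patch establishes for truncation can be restated in a short sketch. It assumes, as the new helper's comment requires, that the caller already holds inode->i_rwsem exclusively; example_prepare_truncate is a hypothetical wrapper, not a function added by the patch:

    static void example_prepare_truncate(struct inode *inode)
    {
            /* stop new O_DIRECT i/o and wait for in-flight DIO to drain */
            nfs_file_block_o_direct(NFS_I(inode));
            /* then flush any remaining buffered writes before the setattr */
            nfs_sync_inode(inode);
    }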
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit b93128f29733af5d427a335978a19884c2c230e2 ]
Ensure that all O_DIRECT reads and writes complete before calling fallocate so that we don't race w.r.t. attribute updates.
Fixes: 99f237832243 ("NFSv4.2: Always flush out writes in nfs42_proc_fallocate()") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs42proc.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c index 9f0d69e652644..66fe885fc19a1 100644 --- a/fs/nfs/nfs42proc.c +++ b/fs/nfs/nfs42proc.c @@ -112,6 +112,7 @@ static int nfs42_proc_fallocate(struct rpc_message *msg, struct file *filep, exception.inode = inode; exception.state = lock->open_context->state;
+ nfs_file_block_o_direct(NFS_I(inode)); err = nfs_sync_inode(inode); if (err) goto out;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit c80ebeba1198eac8811ab0dba36ecc13d51e4438 ]
Ensure that all O_DIRECT reads and writes complete before cloning a file range, so that both the source and destination are up to date.
Fixes: a5864c999de6 ("NFS: Do not serialise O_DIRECT reads and writes") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs4file.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c index 02788c3c85e5b..befdb0f4e6dc3 100644 --- a/fs/nfs/nfs4file.c +++ b/fs/nfs/nfs4file.c @@ -282,9 +282,11 @@ static loff_t nfs42_remap_file_range(struct file *src_file, loff_t src_off,
/* flush all pending writes on both src and dst so that server * has the latest data */ + nfs_file_block_o_direct(NFS_I(src_inode)); ret = nfs_sync_inode(src_inode); if (ret) goto out_unlock; + nfs_file_block_o_direct(NFS_I(dst_inode)); ret = nfs_sync_inode(dst_inode); if (ret) goto out_unlock;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit ca247c89900ae90207f4d321e260cd93b7c7d104 ]
Ensure that all O_DIRECT reads and writes complete before copying a file range, so that the destination is up to date.
Fixes: a5864c999de6 ("NFS: Do not serialise O_DIRECT reads and writes") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs42proc.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c index 66fe885fc19a1..582cf8a469560 100644 --- a/fs/nfs/nfs42proc.c +++ b/fs/nfs/nfs42proc.c @@ -356,6 +356,7 @@ static ssize_t _nfs42_proc_copy(struct file *src, return status; }
+ nfs_file_block_o_direct(NFS_I(dst_inode)); status = nfs_sync_inode(dst_inode); if (status) return status;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Curley jcurley@purestorage.com
[ Upstream commit dd2fa82473453661d12723c46c9f43d9876a7efd ]
A typo in ff_lseg_match_mirrors makes the comparison ineffective: it compares a layout segment's mirrors against themselves, so the segments always appear to match. This results in merging happening all the time. Merging all the time is problematic because it marks lsegs invalid. Marking lsegs invalid causes all outstanding IO to get restarted with EAGAIN and connections to get closed.
Closing connections constantly triggers race conditions in the RDMA implementation...
Fixes: 660d1eb22301c ("pNFS/flexfile: Don't merge layout segments if the mirrors don't match") Signed-off-by: Jonathan Curley jcurley@purestorage.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/flexfilelayout/flexfilelayout.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c index b05dd4d3ed653..42c73c647a27f 100644 --- a/fs/nfs/flexfilelayout/flexfilelayout.c +++ b/fs/nfs/flexfilelayout/flexfilelayout.c @@ -276,7 +276,7 @@ ff_lseg_match_mirrors(struct pnfs_layout_segment *l1, struct pnfs_layout_segment *l2) { const struct nfs4_ff_layout_segment *fl1 = FF_LAYOUT_LSEG(l1); - const struct nfs4_ff_layout_segment *fl2 = FF_LAYOUT_LSEG(l1); + const struct nfs4_ff_layout_segment *fl2 = FF_LAYOUT_LSEG(l2); u32 i;
if (fl1->mirror_array_cnt != fl2->mirror_array_cnt)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pu Lehui pulehui@huawei.com
[ Upstream commit cd4453c5e983cf1fd5757e9acb915adb1e4602b6 ]
Syzkaller trigger a fault injection warning:
WARNING: CPU: 1 PID: 12326 at tracepoint_add_func+0xbfc/0xeb0
Modules linked in:
CPU: 1 UID: 0 PID: 12326 Comm: syz.6.10325 Tainted: G U 6.14.0-rc5-syzkaller #0
Tainted: [U]=USER
Hardware name: Google Compute Engine/Google Compute Engine
RIP: 0010:tracepoint_add_func+0xbfc/0xeb0 kernel/tracepoint.c:294
Code: 09 fe ff 90 0f 0b 90 0f b6 74 24 43 31 ff 41 bc ea ff ff ff
RSP: 0018:ffffc9000414fb48 EFLAGS: 00010283
RAX: 00000000000012a1 RBX: ffffffff8e240ae0 RCX: ffffc90014b78000
RDX: 0000000000080000 RSI: ffffffff81bbd78b RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffffffffef
R13: 0000000000000000 R14: dffffc0000000000 R15: ffffffff81c264f0
FS:  00007f27217f66c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2e80dff8 CR3: 00000000268f8000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 tracepoint_probe_register_prio+0xc0/0x110 kernel/tracepoint.c:464
 register_trace_prio_sched_switch include/trace/events/sched.h:222 [inline]
 register_pid_events kernel/trace/trace_events.c:2354 [inline]
 event_pid_write.isra.0+0x439/0x7a0 kernel/trace/trace_events.c:2425
 vfs_write+0x24c/0x1150 fs/read_write.c:677
 ksys_write+0x12b/0x250 fs/read_write.c:731
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
We can reproduce the warning by following the steps below:
1. echo 8 >> set_event_notrace_pid. Let tr->filtered_pids own one pid and register the sched_switch tracepoint.
2. echo ' ' >> set_event_pid, and perform fault injection during chunk allocation of trace_pid_list_alloc. Let pid_list contain no pid and assign it to tr->filtered_pids.
3. echo ' ' >> set_event_pid. Let pid_list be NULL and assign it to tr->filtered_pids.
4. echo 9 >> set_event_pid, which triggers the double-register sched_switch tracepoint warning.
The reason is that syzkaller injects a fault into the chunk allocation in trace_pid_list_alloc, causing a failure in trace_pid_list_set, which may trigger double registration of the same tracepoint. This only occurs when the system is about to crash, but to suppress this warning, let's add failure-handling logic for trace_pid_list_set.
Link: https://lore.kernel.org/20250908024658.2390398-1-pulehui@huaweicloud.com Fixes: 8d6e90983ade ("tracing: Create a sparse bitmask for pid filtering") Reported-by: syzbot+161412ccaeff20ce4dde@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/67cb890e.050a0220.d8275.022e.GAE@google.com Signed-off-by: Pu Lehui pulehui@huawei.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index baa87876cee47..a111be83c3693 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -750,7 +750,10 @@ int trace_pid_write(struct trace_pid_list *filtered_pids, /* copy the current bits to the new max */ ret = trace_pid_list_first(filtered_pids, &pid); while (!ret) { - trace_pid_list_set(pid_list, pid); + ret = trace_pid_list_set(pid_list, pid); + if (ret < 0) + goto out; + ret = trace_pid_list_next(filtered_pids, pid + 1, &pid); nr_pids++; } @@ -787,6 +790,7 @@ int trace_pid_write(struct trace_pid_list *filtered_pids, trace_parser_clear(&parser); ret = 0; } + out: trace_parser_put(&parser);
if (ret < 0) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Richter tmricht@linux.ibm.com
[ Upstream commit ce971233242b5391d99442271f3ca096fb49818d ]
Deny all sampling events in the CPUMF counter facility device driver and return -ENOENT. This return value is used to try other PMUs. Up to now, events of type PERF_TYPE_HARDWARE were not tested for sampling and later returned -EOPNOTSUPP, which ends the search for alternative PMUs. Change that behavior and try other PMUs instead.
Fixes: 613a41b0d16e ("s390/cpum_cf: Reject request for sampling in event initialization") Acked-by: Sumanth Korikkar sumanthk@linux.ibm.com Signed-off-by: Thomas Richter tmricht@linux.ibm.com Signed-off-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/perf_cpum_cf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kernel/perf_cpum_cf.c b/arch/s390/kernel/perf_cpum_cf.c index 65a66df5bb865..771e1cb17540d 100644 --- a/arch/s390/kernel/perf_cpum_cf.c +++ b/arch/s390/kernel/perf_cpum_cf.c @@ -757,8 +757,6 @@ static int __hw_perf_event_init(struct perf_event *event, unsigned int type) break;
case PERF_TYPE_HARDWARE: - if (is_sampling_event(event)) /* No sampling support */ - return -ENOENT; ev = attr->config; if (!attr->exclude_user && attr->exclude_kernel) { /* @@ -856,6 +854,8 @@ static int cpumf_pmu_event_init(struct perf_event *event) unsigned int type = event->attr.type; int err;
+ if (is_sampling_event(event)) /* No sampling support */ + return err; if (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_RAW) err = __hw_perf_event_init(event, type); else if (event->pmu->type == type)
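The distinction between the two error codes is what the commit message turns on; a hedged sketch of how a PMU's .event_init callback uses them, where example_event_init and example_config_ok are purely illustrative:

    static int example_event_init(struct perf_event *event)
    {
            if (is_sampling_event(event))
                    return -ENOENT;         /* "not my event": the core tries the next PMU */
            if (!example_config_ok(event))
                    return -EOPNOTSUPP;     /* "my event, unsupported": the search stops */
            return 0;
    }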
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peilin Ye yepeilin@google.com
[ Upstream commit 6d78b4473cdb08b74662355a9e8510bde09c511e ]
Currently, calling bpf_map_kmalloc_node() from __bpf_async_init() can cause various locking issues; see the following stack trace (edited for style) as one example:
...
[10.011566] do_raw_spin_lock.cold
[10.011570] try_to_wake_up                 (5) double-acquiring the same
[10.011575] kick_pool                          rq_lock, causing a hardlockup
[10.011579] __queue_work
[10.011582] queue_work_on
[10.011585] kernfs_notify
[10.011589] cgroup_file_notify
[10.011593] try_charge_memcg               (4) memcg accounting raises an
[10.011597] obj_cgroup_charge_pages            MEMCG_MAX event
[10.011599] obj_cgroup_charge_account
[10.011600] __memcg_slab_post_alloc_hook
[10.011603] __kmalloc_node_noprof
...
[10.011611] bpf_map_kmalloc_node
[10.011612] __bpf_async_init
[10.011615] bpf_timer_init                 (3) BPF calls bpf_timer_init()
[10.011617] bpf_prog_xxxxxxxxxxxxxxxx_fcg_runnable
[10.011619] bpf__sched_ext_ops_runnable
[10.011620] enqueue_task_scx               (2) BPF runs with rq_lock held
[10.011622] enqueue_task
[10.011626] ttwu_do_activate
[10.011629] sched_ttwu_pending             (1) grabs rq_lock
...
The above was reproduced on bpf-next (b338cf849ec8) by modifying ./tools/sched_ext/scx_flatcg.bpf.c to call bpf_timer_init() during ops.runnable(), and hacking the memcg accounting code a bit to make a bpf_timer_init() call more likely to raise an MEMCG_MAX event.
We have also run into other similar variants (both internally and on bpf-next), including double-acquiring cgroup_file_kn_lock, the same worker_pool::lock, etc.
As suggested by Shakeel, fix this by using __GFP_HIGH instead of GFP_ATOMIC in __bpf_async_init(), so that e.g. if try_charge_memcg() raises an MEMCG_MAX event, we call __memcg_memory_event() with @allow_spinning=false and avoid calling cgroup_file_notify() there.
Depends on mm patch "memcg: skip cgroup_file_notify if spinning is not allowed": https://lore.kernel.org/bpf/20250905201606.66198-1-shakeel.butt@linux.dev/
v0 approach (s/bpf_map_kmalloc_node/bpf_mem_alloc/): https://lore.kernel.org/bpf/20250905061919.439648-1-yepeilin@google.com/
v1 approach: https://lore.kernel.org/bpf/20250905234547.862249-1-yepeilin@google.com/
Fixes: b00628b1c7d5 ("bpf: Introduce bpf timers.") Suggested-by: Shakeel Butt shakeel.butt@linux.dev Signed-off-by: Peilin Ye yepeilin@google.com Link: https://lore.kernel.org/r/20250909095222.2121438-1-yepeilin@google.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/helpers.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 4b20a72ab8cff..90c281e1379ee 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -1204,8 +1204,11 @@ static int __bpf_async_init(struct bpf_async_kern *async, struct bpf_map *map, u goto out; }
- /* allocate hrtimer via map_kmalloc to use memcg accounting */ - cb = bpf_map_kmalloc_node(map, size, GFP_ATOMIC, map->numa_node); + /* Allocate via bpf_map_kmalloc_node() for memcg accounting. Until + * kmalloc_nolock() is available, avoid locking issues by using + * __GFP_HIGH (GFP_ATOMIC & ~__GFP_RECLAIM). + */ + cb = bpf_map_kmalloc_node(map, size, __GFP_HIGH, map->numa_node); if (!cb) { ret = -ENOMEM; goto out;
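The comment added in the diff leans on a GFP flag identity: GFP_ATOMIC is __GFP_HIGH plus kswapd reclaim, and __GFP_RECLAIM covers both reclaim bits, so masking reclaim out of GFP_ATOMIC leaves exactly __GFP_HIGH. A purely illustrative compile-time restatement of that identity, to be dropped inside any function body:

    /* GFP_ATOMIC & ~__GFP_RECLAIM is expected to reduce to __GFP_HIGH */
    BUILD_BUG_ON((GFP_ATOMIC & ~__GFP_RECLAIM) != __GFP_HIGH);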
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@google.com
[ Upstream commit a3967baad4d533dc254c31e0d221e51c8d223d58 ]
syzbot reported the splat below. [0]
The repro does the following:
1. Load a sk_msg prog that calls bpf_msg_cork_bytes(msg, cork_bytes) 2. Attach the prog to a SOCKMAP 3. Add a socket to the SOCKMAP 4. Activate fault injection 5. Send data less than cork_bytes
At 5., the data is carried over to the next sendmsg() as it is smaller than the cork_bytes specified by bpf_msg_cork_bytes().
Then, tcp_bpf_send_verdict() tries to allocate psock->cork to hold the data, but this fails silently due to fault injection + __GFP_NOWARN.
If the allocation fails, we need to revert the sk->sk_forward_alloc change done by sk_msg_alloc().
Let's call sk_msg_free() when tcp_bpf_send_verdict fails to allocate psock->cork.
The "*copied" also needs to be updated such that a proper error can be returned to the caller, sendmsg. It fails to allocate psock->cork. Nothing has been corked so far, so this patch simply sets "*copied" to 0.
[0]:
WARNING: net/ipv4/af_inet.c:156 at inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156, CPU#1: syz-executor/5983
Modules linked in:
CPU: 1 UID: 0 PID: 5983 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
RIP: 0010:inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156
Code: 0f 0b 90 e9 62 fe ff ff e8 7a db b5 f7 90 0f 0b 90 e9 95 fe ff ff e8 6c db b5 f7 90 0f 0b 90 e9 bb fe ff ff e8 5e db b5 f7 90 <0f> 0b 90 e9 e1 fe ff ff 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 9f fc
RSP: 0018:ffffc90000a08b48 EFLAGS: 00010246
RAX: ffffffff8a09d0b2 RBX: dffffc0000000000 RCX: ffff888024a23c80
RDX: 0000000000000100 RSI: 0000000000000fff RDI: 0000000000000000
RBP: 0000000000000fff R08: ffff88807e07c627 R09: 1ffff1100fc0f8c4
R10: dffffc0000000000 R11: ffffed100fc0f8c5 R12: ffff88807e07c380
R13: dffffc0000000000 R14: ffff88807e07c60c R15: 1ffff1100fc0f872
FS:  00005555604c4500(0000) GS:ffff888125af1000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005555604df5c8 CR3: 0000000032b06000 CR4: 00000000003526f0
Call Trace:
 <IRQ>
 __sk_destruct+0x86/0x660 net/core/sock.c:2339
 rcu_do_batch kernel/rcu/tree.c:2605 [inline]
 rcu_core+0xca8/0x1770 kernel/rcu/tree.c:2861
 handle_softirqs+0x286/0x870 kernel/softirq.c:579
 __do_softirq kernel/softirq.c:613 [inline]
 invoke_softirq kernel/softirq.c:453 [inline]
 __irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:680
 irq_exit_rcu+0x9/0x30 kernel/softirq.c:696
 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1052 [inline]
 sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1052
 </IRQ>
Fixes: 4f738adba30a ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data") Reported-by: syzbot+4cabd1d2fa917a456db8@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/68c0b6b5.050a0220.3c6139.0013.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima kuniyu@google.com Signed-off-by: Martin KaFai Lau martin.lau@kernel.org Link: https://patch.msgid.link/20250909232623.4151337-1-kuniyu@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_bpf.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index 5312237e80409..7518d2af63088 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -408,8 +408,11 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock, if (!psock->cork) { psock->cork = kzalloc(sizeof(*psock->cork), GFP_ATOMIC | __GFP_NOWARN); - if (!psock->cork) + if (!psock->cork) { + sk_msg_free(sk, msg); + *copied = 0; return -ENOMEM; + } } memcpy(psock->cork, msg, sizeof(*msg)); return 0;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: wangzijie wangzijie1@honor.com
[ Upstream commit 0ce9398aa0830f15f92bbed73853f9861c3e74ff ]
Commit 2ce3d282bd50 ("proc: fix missing pde_set_flags() for net proc files") missed a key part in the definition of proc_dir_entry:
union { const struct proc_ops *proc_ops; const struct file_operations *proc_dir_ops; };
So dereferencing ->proc_ops on the assumption that it is a proc_ops structure results in type confusion and makes the NULL check on 'proc_ops' meaningless for proc directories.
Add !S_ISDIR(dp->mode) test before calling pde_set_flags() to fix it.
Link: https://lkml.kernel.org/r/20250904135715.3972782-1-wangzijie1@honor.com Fixes: 2ce3d282bd50 ("proc: fix missing pde_set_flags() for net proc files") Signed-off-by: wangzijie wangzijie1@honor.com Reported-by: Brad Spengler spender@grsecurity.net Closes: https://lore.kernel.org/all/20250903065758.3678537-1-wangzijie1@honor.com/ Cc: Alexey Dobriyan adobriyan@gmail.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: Christian Brauner brauner@kernel.org Cc: Jiri Slaby jirislaby@kernel.org Cc: Stefano Brivio sbrivio@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/proc/generic.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/proc/generic.c b/fs/proc/generic.c index db3f2c6abc162..4cadd2fd23d8f 100644 --- a/fs/proc/generic.c +++ b/fs/proc/generic.c @@ -388,7 +388,8 @@ struct proc_dir_entry *proc_register(struct proc_dir_entry *dir, if (proc_alloc_inum(&dp->low_ino)) goto out_free_entry;
- pde_set_flags(dp); + if (!S_ISDIR(dp->mode)) + pde_set_flags(dp);
write_lock(&proc_subdir_lock); dp->parent = dir;
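The bug class is easiest to see in a tiny stand-alone program. The structure and field names below are illustrative stand-ins shaped like the proc_dir_entry union quoted above, not the real proc code:

    #include <stdbool.h>
    #include <stdio.h>

    struct file_like_ops { int (*read)(void); };
    struct dir_like_ops  { int (*lookup)(void); };

    struct entry {
            bool is_dir;
            union {
                    const struct file_like_ops *proc_ops;     /* valid when !is_dir */
                    const struct dir_like_ops  *proc_dir_ops; /* valid when is_dir */
            };
    };

    static void set_flags(const struct entry *e)
    {
            /* Testing e->proc_ops without checking is_dir first reinterprets
             * proc_dir_ops through the wrong type: the pointer is non-NULL,
             * so the guard passes even though no proc_ops exists. */
            if (!e->is_dir && e->proc_ops && e->proc_ops->read)
                    puts("file entry with a read op");
    }

    int main(void)
    {
            static const struct dir_like_ops dops = { 0 };
            struct entry dir = { .is_dir = true, .proc_dir_ops = &dops };

            set_flags(&dir);        /* safe only because of the is_dir test */
            return 0;
    }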
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul E. McKenney paulmck@kernel.org
commit 6b70399f9ef3809f6e308fd99dd78b072c1bd05c upstream.
This commit continues the elimination of deadlocks involving do_exit() and RCU tasks by causing exit_tasks_rcu_start() to add the current task to a per-CPU list and causing exit_tasks_rcu_stop() to remove the current task from whatever list it is on. These lists will be used to track tasks that are exiting, while still accounting for any RCU Tasks quiescent states that these tasks pass through.
[ paulmck: Apply Frederic Weisbecker feedback. ]
Link: https://lore.kernel.org/all/20240118021842.290665-1-chenzhongjin@huawei.com/
Reported-by: Chen Zhongjin chenzhongjin@huawei.com Reported-by: Yang Jihong yangjihong1@huawei.com Signed-off-by: Paul E. McKenney paulmck@kernel.org Tested-by: Yang Jihong yangjihong1@huawei.com Tested-by: Chen Zhongjin chenzhongjin@huawei.com Reviewed-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Boqun Feng boqun.feng@gmail.com Cc: Tahera Fahimi taherafahimi@linux.microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/rcu/tasks.h | 43 +++++++++++++++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 10 deletions(-)
--- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -1175,25 +1175,48 @@ struct task_struct *get_rcu_tasks_gp_kth EXPORT_SYMBOL_GPL(get_rcu_tasks_gp_kthread);
/* - * Contribute to protect against tasklist scan blind spot while the - * task is exiting and may be removed from the tasklist. See - * corresponding synchronize_srcu() for further details. + * Protect against tasklist scan blind spot while the task is exiting and + * may be removed from the tasklist. Do this by adding the task to yet + * another list. + * + * Note that the task will remove itself from this list, so there is no + * need for get_task_struct(), except in the case where rcu_tasks_pertask() + * adds it to the holdout list, in which case rcu_tasks_pertask() supplies + * the needed get_task_struct(). */ -void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu) +void exit_tasks_rcu_start(void) { - current->rcu_tasks_idx = __srcu_read_lock(&tasks_rcu_exit_srcu); + unsigned long flags; + struct rcu_tasks_percpu *rtpcp; + struct task_struct *t = current; + + WARN_ON_ONCE(!list_empty(&t->rcu_tasks_exit_list)); + preempt_disable(); + rtpcp = this_cpu_ptr(rcu_tasks.rtpcpu); + t->rcu_tasks_exit_cpu = smp_processor_id(); + raw_spin_lock_irqsave_rcu_node(rtpcp, flags); + if (!rtpcp->rtp_exit_list.next) + INIT_LIST_HEAD(&rtpcp->rtp_exit_list); + list_add(&t->rcu_tasks_exit_list, &rtpcp->rtp_exit_list); + raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags); + preempt_enable(); }
/* - * Contribute to protect against tasklist scan blind spot while the - * task is exiting and may be removed from the tasklist. See - * corresponding synchronize_srcu() for further details. + * Remove the task from the "yet another list" because do_exit() is now + * non-preemptible, allowing synchronize_rcu() to wait beyond this point. */ -void exit_tasks_rcu_stop(void) __releases(&tasks_rcu_exit_srcu) +void exit_tasks_rcu_stop(void) { + unsigned long flags; + struct rcu_tasks_percpu *rtpcp; struct task_struct *t = current;
- __srcu_read_unlock(&tasks_rcu_exit_srcu, t->rcu_tasks_idx); + WARN_ON_ONCE(list_empty(&t->rcu_tasks_exit_list)); + rtpcp = per_cpu_ptr(rcu_tasks.rtpcpu, t->rcu_tasks_exit_cpu); + raw_spin_lock_irqsave_rcu_node(rtpcp, flags); + list_del_init(&t->rcu_tasks_exit_list); + raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags); }
/*
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul E. McKenney paulmck@kernel.org
commit 1612160b91272f5b1596f499584d6064bf5be794 upstream.
Holding a mutex across synchronize_rcu_tasks() and acquiring that same mutex in code called from do_exit() after its call to exit_tasks_rcu_start() but before its call to exit_tasks_rcu_stop() results in deadlock. This is by design, because tasks that are far enough into do_exit() are no longer present on the tasks list, making it a bit difficult for RCU Tasks to find them, let alone wait on them to do a voluntary context switch. However, such deadlocks are becoming more frequent. In addition, lockdep currently does not detect such deadlocks and they can be difficult to reproduce.
In addition, if a task voluntarily context switches during that time (for example, if it blocks acquiring a mutex), then this task is in an RCU Tasks quiescent state. And with some adjustments, RCU Tasks could just as well take advantage of that fact.
This commit therefore eliminates these deadlocks by replacing the SRCU-based wait for do_exit() completion with per-CPU lists of tasks currently exiting. A given task will be on one of these per-CPU lists for the same period of time that it would previously have spent in the SRCU read-side critical section. These lists enable RCU Tasks to find the tasks that have already been removed from the tasks list, but that must nevertheless be waited upon.
The RCU Tasks grace period gathers any of these do_exit() tasks that it must wait on, and adds them to the list of holdouts. Per-CPU locking and get_task_struct() are used to synchronize addition to and removal from these lists.
Link: https://lore.kernel.org/all/20240118021842.290665-1-chenzhongjin@huawei.com/
Reported-by: Chen Zhongjin chenzhongjin@huawei.com Reported-by: Yang Jihong yangjihong1@huawei.com Signed-off-by: Paul E. McKenney paulmck@kernel.org Tested-by: Yang Jihong yangjihong1@huawei.com Tested-by: Chen Zhongjin chenzhongjin@huawei.com Reviewed-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Boqun Feng boqun.feng@gmail.com Cc: Tahera Fahimi taherafahimi@linux.microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/rcu/tasks.h | 44 ++++++++++++++++++++++++++++---------------- 1 file changed, 28 insertions(+), 16 deletions(-)
--- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -150,8 +150,6 @@ static struct rcu_tasks rt_name = }
#ifdef CONFIG_TASKS_RCU -/* Track exiting tasks in order to allow them to be waited for. */ -DEFINE_STATIC_SRCU(tasks_rcu_exit_srcu);
/* Report delay in synchronize_srcu() completion in rcu_tasks_postscan(). */ static void tasks_rcu_exit_srcu_stall(struct timer_list *unused); @@ -879,10 +877,12 @@ static void rcu_tasks_wait_gp(struct rcu // number of voluntary context switches, and add that task to the // holdout list. // rcu_tasks_postscan(): -// Invoke synchronize_srcu() to ensure that all tasks that were -// in the process of exiting (and which thus might not know to -// synchronize with this RCU Tasks grace period) have completed -// exiting. +// Gather per-CPU lists of tasks in do_exit() to ensure that all +// tasks that were in the process of exiting (and which thus might +// not know to synchronize with this RCU Tasks grace period) have +// completed exiting. The synchronize_rcu() in rcu_tasks_postgp() +// will take care of any tasks stuck in the non-preemptible region +// of do_exit() following its call to exit_tasks_rcu_stop(). // check_all_holdout_tasks(), repeatedly until holdout list is empty: // Scans the holdout list, attempting to identify a quiescent state // for each task on the list. If there is a quiescent state, the @@ -895,8 +895,10 @@ static void rcu_tasks_wait_gp(struct rcu // with interrupts disabled. // // For each exiting task, the exit_tasks_rcu_start() and -// exit_tasks_rcu_finish() functions begin and end, respectively, the SRCU -// read-side critical sections waited for by rcu_tasks_postscan(). +// exit_tasks_rcu_finish() functions add and remove, respectively, the +// current task to a per-CPU list of tasks that rcu_tasks_postscan() must +// wait on. This is necessary because rcu_tasks_postscan() must wait on +// tasks that have already been removed from the global list of tasks. // // Pre-grace-period update-side code is ordered before the grace // via the raw_spin_lock.*rcu_node(). Pre-grace-period read-side code @@ -960,9 +962,13 @@ static void rcu_tasks_pertask(struct tas } }
+void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func); +DEFINE_RCU_TASKS(rcu_tasks, rcu_tasks_wait_gp, call_rcu_tasks, "RCU Tasks"); + /* Processing between scanning taskslist and draining the holdout list. */ static void rcu_tasks_postscan(struct list_head *hop) { + int cpu; int rtsi = READ_ONCE(rcu_task_stall_info);
if (!IS_ENABLED(CONFIG_TINY_RCU)) { @@ -976,9 +982,9 @@ static void rcu_tasks_postscan(struct li * this, divide the fragile exit path part in two intersecting * read side critical sections: * - * 1) An _SRCU_ read side starting before calling exit_notify(), - * which may remove the task from the tasklist, and ending after - * the final preempt_disable() call in do_exit(). + * 1) A task_struct list addition before calling exit_notify(), + * which may remove the task from the tasklist, with the + * removal after the final preempt_disable() call in do_exit(). * * 2) An _RCU_ read side starting with the final preempt_disable() * call in do_exit() and ending with the final call to schedule() @@ -987,7 +993,17 @@ static void rcu_tasks_postscan(struct li * This handles the part 1). And postgp will handle part 2) with a * call to synchronize_rcu(). */ - synchronize_srcu(&tasks_rcu_exit_srcu); + + for_each_possible_cpu(cpu) { + struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rcu_tasks.rtpcpu, cpu); + struct task_struct *t; + + raw_spin_lock_irq_rcu_node(rtpcp); + list_for_each_entry(t, &rtpcp->rtp_exit_list, rcu_tasks_exit_list) + if (list_empty(&t->rcu_tasks_holdout_list)) + rcu_tasks_pertask(t, hop); + raw_spin_unlock_irq_rcu_node(rtpcp); + }
if (!IS_ENABLED(CONFIG_TINY_RCU)) del_timer_sync(&tasks_rcu_exit_srcu_stall_timer); @@ -1055,7 +1071,6 @@ static void rcu_tasks_postgp(struct rcu_ * * In addition, this synchronize_rcu() waits for exiting tasks * to complete their final preempt_disable() region of execution, - * cleaning up after synchronize_srcu(&tasks_rcu_exit_srcu), * enforcing the whole region before tasklist removal until * the final schedule() with TASK_DEAD state to be an RCU TASKS * read side critical section. @@ -1063,9 +1078,6 @@ static void rcu_tasks_postgp(struct rcu_ synchronize_rcu(); }
-void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func); -DEFINE_RCU_TASKS(rcu_tasks, rcu_tasks_wait_gp, call_rcu_tasks, "RCU Tasks"); - static void tasks_rcu_exit_srcu_stall(struct timer_list *unused) { #ifndef CONFIG_TINY_RCU
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul E. McKenney paulmck@kernel.org
commit 0bb11a372fc8d7006b4d0f42a2882939747bdbff upstream.
The current code will scan the entirety of each per-CPU list of exiting tasks in ->rtp_exit_list with interrupts disabled. This is normally just fine, because each CPU typically won't have very many tasks in this state. However, if a large number of tasks block late in do_exit(), these lists could be arbitrarily long. Low probability, perhaps, but it really could happen.
This commit therefore occasionally re-enables interrupts while traversing these lists, inserting a dummy element to hold the current place in the list. In kernels built with CONFIG_PREEMPT_RT=y, this re-enabling happens after each list element is processed, otherwise every one-to-two jiffies.
[ paulmck: Apply Frederic Weisbecker feedback. ]
Link: https://lore.kernel.org/all/ZdeI_-RfdLR8jlsm@localhost.localdomain/
Signed-off-by: Paul E. McKenney paulmck@kernel.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Sebastian Siewior bigeasy@linutronix.de Cc: Anna-Maria Behnsen anna-maria@linutronix.de Cc: Steven Rostedt rostedt@goodmis.org Signed-off-by: Boqun Feng boqun.feng@gmail.com Cc: Tahera Fahimi taherafahimi@linux.microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/rcu/tasks.h | 22 +++++++++++++++++++++- 1 file changed, 21 insertions(+), 1 deletion(-)
--- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -995,13 +995,33 @@ static void rcu_tasks_postscan(struct li */
for_each_possible_cpu(cpu) { + unsigned long j = jiffies + 1; struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rcu_tasks.rtpcpu, cpu); struct task_struct *t; + struct task_struct *t1; + struct list_head tmp;
raw_spin_lock_irq_rcu_node(rtpcp); - list_for_each_entry(t, &rtpcp->rtp_exit_list, rcu_tasks_exit_list) + list_for_each_entry_safe(t, t1, &rtpcp->rtp_exit_list, rcu_tasks_exit_list) { if (list_empty(&t->rcu_tasks_holdout_list)) rcu_tasks_pertask(t, hop); + + // RT kernels need frequent pauses, otherwise + // pause at least once per pair of jiffies. + if (!IS_ENABLED(CONFIG_PREEMPT_RT) && time_before(jiffies, j)) + continue; + + // Keep our place in the list while pausing. + // Nothing else traverses this list, so adding a + // bare list_head is OK. + list_add(&tmp, &t->rcu_tasks_exit_list); + raw_spin_unlock_irq_rcu_node(rtpcp); + cond_resched(); // For CONFIG_PREEMPT=n kernels + raw_spin_lock_irq_rcu_node(rtpcp); + t1 = list_entry(tmp.next, struct task_struct, rcu_tasks_exit_list); + list_del(&tmp); + j = jiffies + 1; + } raw_spin_unlock_irq_rcu_node(rtpcp); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov (AMD) bp@alien8.de
commit f3f9deccfc68a6b7c8c1cc51e902edba23d309d4 upstream.
VERW_CLEAR is supposed to be set only by the hypervisor to denote TSA mitigation support to a guest. SQ_NO and L1_NO are both synthesizable, and are going to be set by hw CPUID on future machines.
So keep the kvm_cpu_cap_init_kvm_defined() invocation *and* set them when synthesized.
This fix is stable-only.
Co-developed-by: Jinpu Wang jinpu.wang@ionos.com Signed-off-by: Jinpu Wang jinpu.wang@ionos.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Signed-off-by: Boris Ostrovsky boris.ostrovsky@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- --- arch/x86/kvm/cpuid.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -791,10 +791,15 @@ void kvm_set_cpu_caps(void) F(PERFMON_V2) );
+ kvm_cpu_cap_check_and_set(X86_FEATURE_VERW_CLEAR); + kvm_cpu_cap_init_kvm_defined(CPUID_8000_0021_ECX, F(TSA_SQ_NO) | F(TSA_L1_NO) );
+ kvm_cpu_cap_check_and_set(X86_FEATURE_TSA_SQ_NO); + kvm_cpu_cap_check_and_set(X86_FEATURE_TSA_L1_NO); + /* * Synthesize "LFENCE is serializing" into the AMD-defined entry in * KVM's supported CPUID if the feature is reported as supported by the
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Salah Triki salah.triki@gmail.com
commit ff2a66d21fd2364ed9396d151115eec59612b200 upstream.
dma_free_coherent() must only be called if the corresponding dma_alloc_coherent() call has succeeded. Calling it when the allocation fails leads to undefined behavior.
Delete the wrong call.
[ bp: Massage commit message. ]
Fixes: 71bcada88b0f3 ("edac: altera: Add Altera SDRAM EDAC support") Signed-off-by: Salah Triki salah.triki@gmail.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Acked-by: Dinh Nguyen dinguyen@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/aIrfzzqh4IzYtDVC@pc Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/edac/altera_edac.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/edac/altera_edac.c +++ b/drivers/edac/altera_edac.c @@ -127,7 +127,6 @@ static ssize_t altr_sdr_mc_err_inject_wr
ptemp = dma_alloc_coherent(mci->pdev, 16, &dma_handle, GFP_KERNEL); if (!ptemp) { - dma_free_coherent(mci->pdev, 16, ptemp, dma_handle); edac_printk(KERN_ERR, EDAC_MC, "Inject: Buffer Allocation error\n"); return -ENOMEM;
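The rule the fix enforces is simply that dma_free_coherent() pairs only with a successful dma_alloc_coherent(). A minimal sketch of the correct pairing; example_inject and dev are illustrative, not the driver's actual function:

    static int example_inject(struct device *dev)
    {
            dma_addr_t dma_handle;
            void *buf;

            buf = dma_alloc_coherent(dev, 16, &dma_handle, GFP_KERNEL);
            if (!buf)
                    return -ENOMEM;         /* nothing was allocated, so nothing to free */

            /* ... use the buffer ... */

            dma_free_coherent(dev, 16, buf, dma_handle);
            return 0;
    }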
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 199cd9e8d14bc14bdbd1fa3031ce26dac9781507 upstream.
This reverts commit 14e41b16e8cb677bb440dca2edba8b041646c742.
This patch breaks the LTP acct02 test, so let's revert and look for a better solution.
Reported-by: Mark Brown broonie@kernel.org Reported-by: Harshvardhan Jha harshvardhan.j.jha@oracle.com Link: https://lore.kernel.org/linux-nfs/7d4d57b0-39a3-49f1-8ada-60364743e3b4@siren... Cc: stable@vger.kernel.org # 6.15.x Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sunrpc/sched.c | 2 -- 1 file changed, 2 deletions(-)
--- a/net/sunrpc/sched.c +++ b/net/sunrpc/sched.c @@ -276,8 +276,6 @@ EXPORT_SYMBOL_GPL(rpc_destroy_wait_queue
static int rpc_wait_bit_killable(struct wait_bit_key *key, int mode) { - if (unlikely(current->flags & PF_EXITING)) - return -EINTR; schedule(); if (signal_pending_state(mode, current)) return -ERESTARTSYS;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Chancellor nathan@kernel.org
commit 3fac212fe489aa0dbe8d80a42a7809840ca7b0f9 upstream.
Clang 22 recently added support for defining __SANITIZE__ macros similar to GCC [1], which causes warnings (or errors with CONFIG_WERROR=y or W=e) with the existing defines that the kernel creates to emulate this behavior with existing clang versions.
In file included from <built-in>:3: In file included from include/linux/compiler_types.h:171: include/linux/compiler-clang.h:37:9: error: '__SANITIZE_THREAD__' macro redefined [-Werror,-Wmacro-redefined] 37 | #define __SANITIZE_THREAD__ | ^ <built-in>:352:9: note: previous definition is here 352 | #define __SANITIZE_THREAD__ 1 | ^
Refactor compiler-clang.h to only define the sanitizer macros when they are undefined and adjust the rest of the code to use these macros for checking if the sanitizers are enabled, clearing up the warnings and allowing the kernel to easily drop these defines when the minimum supported version of LLVM for building the kernel becomes 22.0.0 or newer.
Link: https://lkml.kernel.org/r/20250902-clang-update-sanitize-defines-v1-1-cf3702... Link: https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444dae... [1] Signed-off-by: Nathan Chancellor nathan@kernel.org Reviewed-by: Justin Stitt justinstitt@google.com Cc: Alexander Potapenko glider@google.com Cc: Andrey Konovalov andreyknvl@gmail.com Cc: Andrey Ryabinin ryabinin.a.a@gmail.com Cc: Bill Wendling morbo@google.com Cc: Dmitriy Vyukov dvyukov@google.com Cc: Marco Elver elver@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/compiler-clang.h | 29 ++++++++++++++++++++++++----- 1 file changed, 24 insertions(+), 5 deletions(-)
--- a/include/linux/compiler-clang.h +++ b/include/linux/compiler-clang.h @@ -23,23 +23,42 @@ #define KASAN_ABI_VERSION 5
/* + * Clang 22 added preprocessor macros to match GCC, in hopes of eventually + * dropping __has_feature support for sanitizers: + * https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444dae... + * Create these macros for older versions of clang so that it is easy to clean + * up once the minimum supported version of LLVM for building the kernel always + * creates these macros. + * * Note: Checking __has_feature(*_sanitizer) is only true if the feature is * enabled. Therefore it is not required to additionally check defined(CONFIG_*) * to avoid adding redundant attributes in other configurations. */ +#if __has_feature(address_sanitizer) && !defined(__SANITIZE_ADDRESS__) +#define __SANITIZE_ADDRESS__ +#endif +#if __has_feature(hwaddress_sanitizer) && !defined(__SANITIZE_HWADDRESS__) +#define __SANITIZE_HWADDRESS__ +#endif +#if __has_feature(thread_sanitizer) && !defined(__SANITIZE_THREAD__) +#define __SANITIZE_THREAD__ +#endif
-#if __has_feature(address_sanitizer) || __has_feature(hwaddress_sanitizer) -/* Emulate GCC's __SANITIZE_ADDRESS__ flag */ +/* + * Treat __SANITIZE_HWADDRESS__ the same as __SANITIZE_ADDRESS__ in the kernel. + */ +#ifdef __SANITIZE_HWADDRESS__ #define __SANITIZE_ADDRESS__ +#endif + +#ifdef __SANITIZE_ADDRESS__ #define __no_sanitize_address \ __attribute__((no_sanitize("address", "hwaddress"))) #else #define __no_sanitize_address #endif
-#if __has_feature(thread_sanitizer) -/* emulate gcc's __SANITIZE_THREAD__ flag */ -#define __SANITIZE_THREAD__ +#ifdef __SANITIZE_THREAD__ #define __no_sanitize_thread \ __attribute__((no_sanitize("thread"))) #else
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krister Johansen kjlx@templeofstupid.com
commit 648de37416b301f046f62f1b65715c7fa8ebaa67 upstream.
Users reported a scenario where MPTCP connections that were configured with SO_KEEPALIVE prior to connect would fail to enable their keepalives if MPTCP fell back to TCP mode.
After investigating, this affects keepalives for any connection where sync_socket_options is called on a socket that is in the closed or listening state. Joins are handled properly. For connects, sync_socket_options is called when the socket is still in the closed state. The tcp_set_keepalive() function does not act on sockets that are closed or listening, hence keepalive is not immediately enabled. Since the SOCK_KEEPOPEN flag is absent, it is not enabled later in the connect sequence via tcp_finish_connect(). Setting the keepalive via sockopt after connect does work, but would not address any subsequently created flows.
Fortunately, the fix here is straight-forward: set SOCK_KEEPOPEN on the subflow when calling sync_socket_options.
The fix was validated both by using tcpdump to observe keepalive packets not being sent before the fix and being sent after it. It was also possible to observe via ss that the keepalive timer was not enabled on these sockets before the fix, but was enabled afterwards.
Fixes: 1b3e7ede1365 ("mptcp: setsockopt: handle SO_KEEPALIVE and SO_PRIORITY") Cc: stable@vger.kernel.org Signed-off-by: Krister Johansen kjlx@templeofstupid.com Reviewed-by: Geliang Tang geliang@kernel.org Reviewed-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/aL8dYfPZrwedCIh9@templeofstupid.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/sockopt.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
--- a/net/mptcp/sockopt.c +++ b/net/mptcp/sockopt.c @@ -1471,13 +1471,12 @@ static void sync_socket_options(struct m { static const unsigned int tx_rx_locks = SOCK_RCVBUF_LOCK | SOCK_SNDBUF_LOCK; struct sock *sk = (struct sock *)msk; + bool keep_open;
- if (ssk->sk_prot->keepalive) { - if (sock_flag(sk, SOCK_KEEPOPEN)) - ssk->sk_prot->keepalive(ssk, 1); - else - ssk->sk_prot->keepalive(ssk, 0); - } + keep_open = sock_flag(sk, SOCK_KEEPOPEN); + if (ssk->sk_prot->keepalive) + ssk->sk_prot->keepalive(ssk, keep_open); + sock_valbool_flag(ssk, SOCK_KEEPOPEN, keep_open);
ssk->sk_priority = sk->sk_priority; ssk->sk_bound_dev_if = sk->sk_bound_dev_if;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Tinguely mark.tinguely@oracle.com
commit 04100f775c2ea501927f508f17ad824ad1f23c8d upstream.
syzbot detected a OCFS2 hang due to a recursive semaphore on a FS_IOC_FIEMAP of the extent list on a specially crafted mmap file.
context_switch kernel/sched/core.c:5357 [inline]
__schedule+0x1798/0x4cc0 kernel/sched/core.c:6961
__schedule_loop kernel/sched/core.c:7043 [inline]
schedule+0x165/0x360 kernel/sched/core.c:7058
schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:7115
rwsem_down_write_slowpath+0x872/0xfe0 kernel/locking/rwsem.c:1185
__down_write_common kernel/locking/rwsem.c:1317 [inline]
__down_write kernel/locking/rwsem.c:1326 [inline]
down_write+0x1ab/0x1f0 kernel/locking/rwsem.c:1591
ocfs2_page_mkwrite+0x2ff/0xc40 fs/ocfs2/mmap.c:142
do_page_mkwrite+0x14d/0x310 mm/memory.c:3361
wp_page_shared mm/memory.c:3762 [inline]
do_wp_page+0x268d/0x5800 mm/memory.c:3981
handle_pte_fault mm/memory.c:6068 [inline]
__handle_mm_fault+0x1033/0x5440 mm/memory.c:6195
handle_mm_fault+0x40a/0x8e0 mm/memory.c:6364
do_user_addr_fault+0x764/0x1390 arch/x86/mm/fault.c:1387
handle_page_fault arch/x86/mm/fault.c:1476 [inline]
exc_page_fault+0x76/0xf0 arch/x86/mm/fault.c:1532
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
RIP: 0010:copy_user_generic arch/x86/include/asm/uaccess_64.h:126 [inline]
RIP: 0010:raw_copy_to_user arch/x86/include/asm/uaccess_64.h:147 [inline]
RIP: 0010:_inline_copy_to_user include/linux/uaccess.h:197 [inline]
RIP: 0010:_copy_to_user+0x85/0xb0 lib/usercopy.c:26
Code: e8 00 bc f7 fc 4d 39 fc 72 3d 4d 39 ec 77 38 e8 91 b9 f7 fc 4c 89 f7 89 de e8 47 25 5b fd 0f 01 cb 4c 89 ff 48 89 d9 4c 89 f6 <f3> a4 0f 1f 00 48 89 cb 0f 01 ca 48 89 d8 5b 41 5c 41 5d 41 5e 41
RSP: 0018:ffffc9000403f950 EFLAGS: 00050256
RAX: ffffffff84c7f101 RBX: 0000000000000038 RCX: 0000000000000038
RDX: 0000000000000000 RSI: ffffc9000403f9e0 RDI: 0000200000000060
RBP: ffffc9000403fa90 R08: ffffc9000403fa17 R09: 1ffff92000807f42
R10: dffffc0000000000 R11: fffff52000807f43 R12: 0000200000000098
R13: 00007ffffffff000 R14: ffffc9000403f9e0 R15: 0000200000000060
copy_to_user include/linux/uaccess.h:225 [inline]
fiemap_fill_next_extent+0x1c0/0x390 fs/ioctl.c:145
ocfs2_fiemap+0x888/0xc90 fs/ocfs2/extent_map.c:806
ioctl_fiemap fs/ioctl.c:220 [inline]
do_vfs_ioctl+0x1173/0x1430 fs/ioctl.c:532
__do_sys_ioctl fs/ioctl.c:596 [inline]
__se_sys_ioctl+0x82/0x170 fs/ioctl.c:584
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5f13850fd9
RSP: 002b:00007ffe3b3518b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000200000000000 RCX: 00007f5f13850fd9
RDX: 0000200000000040 RSI: 00000000c020660b RDI: 0000000000000004
RBP: 6165627472616568 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe3b3518f0
R13: 00007ffe3b351b18 R14: 431bde82d7b634db R15: 00007f5f1389a03b
ocfs2_fiemap() takes a read lock on the ip_alloc_sem semaphore (since v2.6.22-527-g7307de80510a) and calls fiemap_fill_next_extent() to read the extent list of this running mmap'd executable. The user-supplied buffer that holds the fiemap information page faults, calling ocfs2_page_mkwrite(), which takes a write lock (since v2.6.27-38-g00dc417fa3e7) on the same semaphore. This recursive semaphore acquisition holds filesystem locks and causes a hang of the filesystem.
The ip_alloc_sem protects the inode extent list and size. Release the read semaphore before calling fiemap_fill_next_extent() in ocfs2_fiemap() and ocfs2_fiemap_inline(). This does an unnecessary semaphore lock/unlock on the last extent but simplifies the error path.
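For context, a minimal userspace sketch of the locking pattern the fix applies (drop the read lock around a callback that may fault back into the same lock); it uses POSIX rwlocks and made-up names purely for illustration and is not the ocfs2 code:

#include <pthread.h>
#include <stdio.h>

static pthread_rwlock_t alloc_sem = PTHREAD_RWLOCK_INITIALIZER;

/* Stand-in for fiemap_fill_next_extent(): may "fault" and need the write lock. */
static int fill_next_extent(int idx)
{
	printf("extent %d reported without holding alloc_sem\n", idx);
	return 0;
}

static int report_extents(int nr)
{
	int ret = 0;

	pthread_rwlock_rdlock(&alloc_sem);
	for (int i = 0; i < nr && !ret; i++) {
		/* Drop the read lock so a page fault in the callback
		 * cannot deadlock on a recursive acquisition. */
		pthread_rwlock_unlock(&alloc_sem);
		ret = fill_next_extent(i);
		pthread_rwlock_rdlock(&alloc_sem);
	}
	pthread_rwlock_unlock(&alloc_sem);
	return ret;
}

int main(void)
{
	return report_extents(3);
}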
Link: https://lkml.kernel.org/r/61d1a62b-2631-4f12-81e2-cd689914360b@oracle.com Fixes: 00dc417fa3e7 ("ocfs2: fiemap support") Signed-off-by: Mark Tinguely mark.tinguely@oracle.com Reported-by: syzbot+541dcc6ee768f77103e7@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=541dcc6ee768f77103e7 Reviewed-by: Joseph Qi joseph.qi@linux.alibaba.com Cc: Mark Fasheh mark@fasheh.com Cc: Joel Becker jlbec@evilplan.org Cc: Junxiao Bi junxiao.bi@oracle.com Cc: Changwei Ge gechangwei@live.cn Cc: Jun Piao piaojun@huawei.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ocfs2/extent_map.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/fs/ocfs2/extent_map.c +++ b/fs/ocfs2/extent_map.c @@ -696,6 +696,8 @@ out: * it not only handles the fiemap for inlined files, but also deals * with the fast symlink, cause they have no difference for extent * mapping per se. + * + * Must be called with ip_alloc_sem semaphore held. */ static int ocfs2_fiemap_inline(struct inode *inode, struct buffer_head *di_bh, struct fiemap_extent_info *fieinfo, @@ -707,6 +709,7 @@ static int ocfs2_fiemap_inline(struct in u64 phys; u32 flags = FIEMAP_EXTENT_DATA_INLINE|FIEMAP_EXTENT_LAST; struct ocfs2_inode_info *oi = OCFS2_I(inode); + lockdep_assert_held_read(&oi->ip_alloc_sem);
di = (struct ocfs2_dinode *)di_bh->b_data; if (ocfs2_inode_is_fast_symlink(inode)) @@ -722,8 +725,11 @@ static int ocfs2_fiemap_inline(struct in phys += offsetof(struct ocfs2_dinode, id2.i_data.id_data);
+ /* Release the ip_alloc_sem to prevent deadlock on page fault */ + up_read(&OCFS2_I(inode)->ip_alloc_sem); ret = fiemap_fill_next_extent(fieinfo, 0, phys, id_count, flags); + down_read(&OCFS2_I(inode)->ip_alloc_sem); if (ret < 0) return ret; } @@ -792,9 +798,11 @@ int ocfs2_fiemap(struct inode *inode, st len_bytes = (u64)le16_to_cpu(rec.e_leaf_clusters) << osb->s_clustersize_bits; phys_bytes = le64_to_cpu(rec.e_blkno) << osb->sb->s_blocksize_bits; virt_bytes = (u64)le32_to_cpu(rec.e_cpos) << osb->s_clustersize_bits; - + /* Release the ip_alloc_sem to prevent deadlock on page fault */ + up_read(&OCFS2_I(inode)->ip_alloc_sem); ret = fiemap_fill_next_extent(fieinfo, virt_bytes, phys_bytes, len_bytes, fe_flags); + down_read(&OCFS2_I(inode)->ip_alloc_sem); if (ret) break;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chiasheng Lee chiasheng.lee@linux.intel.com
commit 664596bd98bb251dd417dfd3f9b615b661e1e44a upstream.
Hide the Intel Birch Stream SoC TCO WDT feature since it was removed.
On platforms with a PCH TCO WDT, this redundant device can produce errors like this:
[ 28.144542] sysfs: cannot create duplicate filename '/bus/platform/devices/iTCO_wdt'
Fixes: 8c56f9ef25a3 ("i2c: i801: Add support for Intel Birch Stream SoC") Link: https://bugzilla.kernel.org/show_bug.cgi?id=220320 Signed-off-by: Chiasheng Lee chiasheng.lee@linux.intel.com Cc: stable@vger.kernel.org # v6.7+ Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Reviewed-by: Jarkko Nikula jarkko.nikula@linux.intel.com Signed-off-by: Andi Shyti andi.shyti@kernel.org Link: https://lore.kernel.org/r/20250901125943.916522-1-chiasheng.lee@linux.intel.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-i801.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/i2c/busses/i2c-i801.c +++ b/drivers/i2c/busses/i2c-i801.c @@ -1051,7 +1051,7 @@ static const struct pci_device_id i801_i { PCI_DEVICE_DATA(INTEL, METEOR_LAKE_P_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { PCI_DEVICE_DATA(INTEL, METEOR_LAKE_SOC_S_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { PCI_DEVICE_DATA(INTEL, METEOR_LAKE_PCH_S_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, - { PCI_DEVICE_DATA(INTEL, BIRCH_STREAM_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, + { PCI_DEVICE_DATA(INTEL, BIRCH_STREAM_SMBUS, FEATURES_ICH5) }, { PCI_DEVICE_DATA(INTEL, ARROW_LAKE_H_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { PCI_DEVICE_DATA(INTEL, PANTHER_LAKE_H_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) }, { PCI_DEVICE_DATA(INTEL, PANTHER_LAKE_P_SMBUS, FEATURES_ICH5 | FEATURE_TCO_CNL) },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oleksij Rempel o.rempel@pengutronix.de
commit 5537a4679403423e0b49c95b619983a4583d69c5 upstream.
Drop phylink_{suspend,resume}() from ax88772 PM callbacks.
MDIO bus accesses have their own runtime-PM handling and will try to wake the device if it is suspended. Such wake attempts must not happen from PM callbacks while the device PM lock is held. Since phylink_{suspend,resume}() may trigger MDIO accesses, they must not be called in PM context.
No extra phylink PM handling is required for this driver:
- .ndo_open/.ndo_stop control the phylink start/stop lifecycle.
- ethtool/phylib entry points run in process context, not PM.
- phylink MAC ops program the MAC on link changes after resume.
Fixes: e0bffe3e6894 ("net: asix: ax88772: migrate to phylink") Reported-by: Hubert Wiśniewski hubert.wisniewski.25632@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Oleksij Rempel o.rempel@pengutronix.de Tested-by: Hubert Wiśniewski hubert.wisniewski.25632@gmail.com Tested-by: Xu Yang xu.yang_2@nxp.com Link: https://patch.msgid.link/20250908112619.2900723-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/asix_devices.c | 13 ------------- 1 file changed, 13 deletions(-)
--- a/drivers/net/usb/asix_devices.c +++ b/drivers/net/usb/asix_devices.c @@ -607,15 +607,8 @@ static const struct net_device_ops ax887
static void ax88772_suspend(struct usbnet *dev) { - struct asix_common_private *priv = dev->driver_priv; u16 medium;
- if (netif_running(dev->net)) { - rtnl_lock(); - phylink_suspend(priv->phylink, false); - rtnl_unlock(); - } - /* Stop MAC operation */ medium = asix_read_medium_status(dev, 1); medium &= ~AX_MEDIUM_RE; @@ -644,12 +637,6 @@ static void ax88772_resume(struct usbnet for (i = 0; i < 3; i++) if (!priv->reset(dev, 1)) break; - - if (netif_running(dev->net)) { - rtnl_lock(); - phylink_resume(priv->phylink); - rtnl_unlock(); - } }
static int asix_resume(struct usb_interface *intf)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Kerello christophe.kerello@foss.st.com
commit 513c40e59d5a414ab763a9c84797534b5e8c208d upstream.
Avoid below overlapping mappings by using a contiguous non-cacheable buffer.
[ 4.077708] DMA-API: stm32_fmc2_nfc 48810000.nand-controller: cacheline tracking EEXIST, overlapping mappings aren't supported [ 4.089103] WARNING: CPU: 1 PID: 44 at kernel/dma/debug.c:568 add_dma_entry+0x23c/0x300 [ 4.097071] Modules linked in: [ 4.100101] CPU: 1 PID: 44 Comm: kworker/u4:2 Not tainted 6.1.82 #1 [ 4.106346] Hardware name: STMicroelectronics STM32MP257F VALID1 SNOR / MB1704 (LPDDR4 Power discrete) + MB1703 + MB1708 (SNOR MB1730) (DT) [ 4.118824] Workqueue: events_unbound deferred_probe_work_func [ 4.124674] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 4.131624] pc : add_dma_entry+0x23c/0x300 [ 4.135658] lr : add_dma_entry+0x23c/0x300 [ 4.139792] sp : ffff800009dbb490 [ 4.143016] x29: ffff800009dbb4a0 x28: 0000000004008022 x27: ffff8000098a6000 [ 4.150174] x26: 0000000000000000 x25: ffff8000099e7000 x24: ffff8000099e7de8 [ 4.157231] x23: 00000000ffffffff x22: 0000000000000000 x21: ffff8000098a6a20 [ 4.164388] x20: ffff000080964180 x19: ffff800009819ba0 x18: 0000000000000006 [ 4.171545] x17: 6361727420656e69 x16: 6c6568636163203a x15: 72656c6c6f72746e [ 4.178602] x14: 6f632d646e616e2e x13: ffff800009832f58 x12: 00000000000004ec [ 4.185759] x11: 00000000000001a4 x10: ffff80000988af58 x9 : ffff800009832f58 [ 4.192916] x8 : 00000000ffffefff x7 : ffff80000988af58 x6 : 80000000fffff000 [ 4.199972] x5 : 000000000000bff4 x4 : 0000000000000000 x3 : 0000000000000000 [ 4.207128] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000812d2c40 [ 4.214185] Call trace: [ 4.216605] add_dma_entry+0x23c/0x300 [ 4.220338] debug_dma_map_sg+0x198/0x350 [ 4.224373] __dma_map_sg_attrs+0xa0/0x110 [ 4.228411] dma_map_sg_attrs+0x10/0x2c [ 4.232247] stm32_fmc2_nfc_xfer.isra.0+0x1c8/0x3fc [ 4.237088] stm32_fmc2_nfc_seq_read_page+0xc8/0x174 [ 4.242127] nand_read_oob+0x1d4/0x8e0 [ 4.245861] mtd_read_oob_std+0x58/0x84 [ 4.249596] mtd_read_oob+0x90/0x150 [ 4.253231] mtd_read+0x68/0xac
Signed-off-by: Christophe Kerello christophe.kerello@foss.st.com Cc: stable@vger.kernel.org Fixes: 2cd457f328c1 ("mtd: rawnand: stm32_fmc2: add STM32 FMC2 NAND flash controller driver") Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/nand/raw/stm32_fmc2_nand.c | 28 +++++++++------------------- 1 file changed, 9 insertions(+), 19 deletions(-)
--- a/drivers/mtd/nand/raw/stm32_fmc2_nand.c +++ b/drivers/mtd/nand/raw/stm32_fmc2_nand.c @@ -263,6 +263,7 @@ struct stm32_fmc2_nfc { struct sg_table dma_data_sg; struct sg_table dma_ecc_sg; u8 *ecc_buf; + dma_addr_t dma_ecc_addr; int dma_ecc_len;
struct completion complete; @@ -885,17 +886,10 @@ static int stm32_fmc2_nfc_xfer(struct na
if (!write_data && !raw) { /* Configure DMA ECC status */ - p = nfc->ecc_buf; for_each_sg(nfc->dma_ecc_sg.sgl, sg, eccsteps, s) { - sg_set_buf(sg, p, nfc->dma_ecc_len); - p += nfc->dma_ecc_len; - } - - ret = dma_map_sg(nfc->dev, nfc->dma_ecc_sg.sgl, - eccsteps, dma_data_dir); - if (!ret) { - ret = -EIO; - goto err_unmap_data; + sg_dma_address(sg) = nfc->dma_ecc_addr + + s * nfc->dma_ecc_len; + sg_dma_len(sg) = nfc->dma_ecc_len; }
desc_ecc = dmaengine_prep_slave_sg(nfc->dma_ecc_ch, @@ -904,7 +898,7 @@ static int stm32_fmc2_nfc_xfer(struct na DMA_PREP_INTERRUPT); if (!desc_ecc) { ret = -ENOMEM; - goto err_unmap_ecc; + goto err_unmap_data; }
reinit_completion(&nfc->dma_ecc_complete); @@ -912,7 +906,7 @@ static int stm32_fmc2_nfc_xfer(struct na desc_ecc->callback_param = &nfc->dma_ecc_complete; ret = dma_submit_error(dmaengine_submit(desc_ecc)); if (ret) - goto err_unmap_ecc; + goto err_unmap_data;
dma_async_issue_pending(nfc->dma_ecc_ch); } @@ -932,7 +926,7 @@ static int stm32_fmc2_nfc_xfer(struct na if (!write_data && !raw) dmaengine_terminate_all(nfc->dma_ecc_ch); ret = -ETIMEDOUT; - goto err_unmap_ecc; + goto err_unmap_data; }
/* Wait DMA data transfer completion */ @@ -952,11 +946,6 @@ static int stm32_fmc2_nfc_xfer(struct na } }
-err_unmap_ecc: - if (!write_data && !raw) - dma_unmap_sg(nfc->dev, nfc->dma_ecc_sg.sgl, - eccsteps, dma_data_dir); - err_unmap_data: dma_unmap_sg(nfc->dev, nfc->dma_data_sg.sgl, eccsteps, dma_data_dir);
@@ -1582,7 +1571,8 @@ static int stm32_fmc2_nfc_dma_setup(stru return ret;
/* Allocate a buffer to store ECC status registers */ - nfc->ecc_buf = devm_kzalloc(nfc->dev, FMC2_MAX_ECC_BUF_LEN, GFP_KERNEL); + nfc->ecc_buf = dmam_alloc_coherent(nfc->dev, FMC2_MAX_ECC_BUF_LEN, + &nfc->dma_ecc_addr, GFP_KERNEL); if (!nfc->ecc_buf) return -ENOMEM;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Kerello christophe.kerello@foss.st.com
commit 811c0da4542df3c065f6cb843ced68780e27bb44 upstream.
In case an OOB write is requested during a data write, ECC is currently lost. Avoid this issue by only writing in the free spare area. This issue has been seen with a YAFFS2 file system.
Signed-off-by: Christophe Kerello christophe.kerello@foss.st.com Cc: stable@vger.kernel.org Fixes: 2cd457f328c1 ("mtd: rawnand: stm32_fmc2: add STM32 FMC2 NAND flash controller driver") Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/nand/raw/stm32_fmc2_nand.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
--- a/drivers/mtd/nand/raw/stm32_fmc2_nand.c +++ b/drivers/mtd/nand/raw/stm32_fmc2_nand.c @@ -968,9 +968,21 @@ static int stm32_fmc2_nfc_seq_write(stru
/* Write oob */ if (oob_required) { - ret = nand_change_write_column_op(chip, mtd->writesize, - chip->oob_poi, mtd->oobsize, - false); + unsigned int offset_in_page = mtd->writesize; + const void *buf = chip->oob_poi; + unsigned int len = mtd->oobsize; + + if (!raw) { + struct mtd_oob_region oob_free; + + mtd_ooblayout_free(mtd, 0, &oob_free); + offset_in_page += oob_free.offset; + buf += oob_free.offset; + len = oob_free.length; + } + + ret = nand_change_write_column_op(chip, offset_in_page, + buf, len, false); if (ret) return ret; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
commit e5203209b3935041dac541bc5b37efb44220cc0b upstream.
Just like write(), copy_file_range() should check whether the return value is less than or equal to the requested number of bytes.
Reported-by: Chunsheng Luo luochunsheng@ustc.edu Closes: https://lore.kernel.org/all/20250807062425.694-1-luochunsheng@ustc.edu/ Fixes: 88bc7d5097a1 ("fuse: add support for copy_file_range()") Cc: stable@vger.kernel.org # v4.20 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/file.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -3172,6 +3172,9 @@ static ssize_t __fuse_copy_file_range(st fc->no_copy_file_range = 1; err = -EOPNOTSUPP; } + if (!err && outarg.size > len) + err = -EIO; + if (err) goto out;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
commit 1e08938c3694f707bb165535df352ac97a8c75c9 upstream.
The FUSE protocol uses struct fuse_write_out to convey the return value of copy_file_range, which is restricted to uint32_t. But the COPY_FILE_RANGE interface supports 64-bit sized copies.
Currently the number of bytes copied is silently truncated to 32 bits, which may result in poor performance or even failure to copy in case of truncation to zero.
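For illustration, a standalone sketch of the clamping idea (cap each request to what a 32-bit, page-aligned reply field can express); the page size and names are assumptions for the example, not the FUSE implementation:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size for this sketch */

/* Clamp a 64-bit request so a 32-bit, page-aligned reply field cannot overflow. */
static uint64_t clamp_copy_len(uint64_t len)
{
	uint64_t max = UINT32_MAX & ~(SKETCH_PAGE_SIZE - 1);

	return len < max ? len : max;
}

int main(void)
{
	uint64_t req = 6ULL << 30;	/* a 6 GiB copy request */

	printf("requested %" PRIu64 ", sending %" PRIu64 " per request\n",
	       req, clamp_copy_len(req));
	return 0;
}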
Reported-by: Florian Weimer fweimer@redhat.com Closes: https://lore.kernel.org/all/lhuh5ynl8z5.fsf@oldenburg.str.redhat.com/ Fixes: 88bc7d5097a1 ("fuse: add support for copy_file_range()") Cc: stable@vger.kernel.org # v4.20 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -3106,7 +3106,7 @@ static ssize_t __fuse_copy_file_range(st .nodeid_out = ff_out->nodeid, .fh_out = ff_out->fh, .off_out = pos_out, - .len = len, + .len = min_t(size_t, len, UINT_MAX & PAGE_MASK), .flags = flags }; struct fuse_write_out outarg;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miaohe Lin linmiaohe@huawei.com
commit d613f53c83ec47089c4e25859d5e8e0359f6f8da upstream.
When I did memory failure tests, the below panic occurred:
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page)) kernel BUG at include/linux/page-flags.h:616! Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 #40 RIP: 0010:unpoison_memory+0x2f3/0x590 RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246 RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0 RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000 R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe FS: 00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0 Call Trace: <TASK> unpoison_memory+0x2f3/0x590 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xd5/0x540 ksys_write+0x64/0xe0 do_syscall_64+0xb9/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f08f0314887 RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887 RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001 RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009 R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00 </TASK> Modules linked in: hwpoison_inject ---[ end trace 0000000000000000 ]--- RIP: 0010:unpoison_memory+0x2f3/0x590 RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246 RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0 RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000 R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe FS: 00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0 Kernel panic - not syncing: Fatal exception Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception ]---
The root cause is that unpoison_memory() tries to check the PG_HWPoison flag of an uninitialized page, so VM_BUG_ON_PAGE(PagePoisoned(page)) is triggered. This can be reproduced with the steps below:
1. Offline a memory block:
echo offline > /sys/devices/system/memory/memory12/state
2. Get the pfn of the offlined memory:
page-types -b n -rlN
3. Write the pfn to unpoison-pfn:
echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn
This scenario can be identified by pfn_to_online_page() returning NULL. And ZONE_DEVICE pages are never expected, so we can simply fail if pfn_to_online_page() == NULL to fix the bug.
Link: https://lkml.kernel.org/r/20250828024618.1744895-1-linmiaohe@huawei.com Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") Signed-off-by: Miaohe Lin linmiaohe@huawei.com Suggested-by: David Hildenbrand david@redhat.com Acked-by: David Hildenbrand david@redhat.com Cc: Naoya Horiguchi nao.horiguchi@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memory-failure.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -2535,10 +2535,9 @@ int unpoison_memory(unsigned long pfn) static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST);
- if (!pfn_valid(pfn)) - return -ENXIO; - - p = pfn_to_page(pfn); + p = pfn_to_online_page(pfn); + if (!p) + return -EIO; folio = page_folio(p);
mutex_lock(&mf_mutex);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sang-Heon Jeon ekffu200098@gmail.com
commit ce652aac9c90a96c6536681d17518efb1f660fb8 upstream.
The kernel initializes the "jiffies" timer to 5 minutes below zero, as shown in include/linux/jiffies.h:
/*
 * Have the 32 bit jiffies value wrap 5 minutes after boot
 * so jiffies wrap bugs show up earlier.
 */
#define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))
And the jiffies comparison helper functions cast unsigned values to signed to cover wraparound:
#define time_after_eq(a,b)	\
	(typecheck(unsigned long, a) && \
	 typecheck(unsigned long, b) && \
	 ((long)((a) - (b)) >= 0))
When quota->charged_from is initialized to 0, time_after_eq() can incorrectly return FALSE even after reset_interval has elapsed. This occurs when (jiffies - reset_interval) produces a value with MSB=1, which is interpreted as negative in signed arithmetic.
This issue primarily affects 32-bit systems because:

On 64-bit systems: MSB=1 values occur ~292 million years after boot (assuming HZ=1000), which is practically impossible.

On 32-bit systems: MSB=1 values occur during the first 5 minutes after boot, and during the second half of every jiffies wraparound cycle, starting from day 25 (assuming HZ=1000).
When the above unexpected FALSE return from time_after_eq() occurs, the charging window will not reset. The user impact depends on the esz value at that time.
If esz is 0, scheme ignores configured quotas and runs without any limits.
If esz is not 0, scheme stops working once the quota is exhausted. It remains until the charging window finally resets.
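For illustration, a small userspace sketch (not the DAMON code) of how the signed comparison can go wrong on a 32-bit system when charged_from is left at 0; the 10-second reset interval is an arbitrary example:

#include <stdint.h>
#include <stdio.h>

/* 32-bit variant of the kernel's signed-difference comparison. */
#define time_after_eq32(a, b) ((int32_t)((a) - (b)) >= 0)

int main(void)
{
	uint32_t hz = 1000;
	uint32_t jiffies = (uint32_t)(-300 * (int32_t)hz);	/* INITIAL_JIFFIES: 5 min before wrap */
	uint32_t charged_from = 0;				/* never initialized */
	uint32_t reset_interval = 10 * hz;			/* arbitrary 10 s window */

	/* Far more than 10 s worth of ticks separate the two values, yet the
	 * signed difference is negative, so the window never resets. */
	printf("window reset? %s\n",
	       time_after_eq32(jiffies, charged_from + reset_interval) ? "yes" : "no");
	return 0;
}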
So, set quota->charged_from to jiffies in damos_adjust_quota() when it is considered the first charge window. With this change, we can avoid the unexpected FALSE return from time_after_eq().
Link: https://lkml.kernel.org/r/20250822025057.1740854-1-ekffu200098@gmail.com Fixes: 2b8a248d5873 ("mm/damon/schemes: implement size quota for schemes application speed control") # 5.16 Signed-off-by: Sang-Heon Jeon ekffu200098@gmail.com Reviewed-by: SeongJae Park sj@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/core.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/mm/damon/core.c +++ b/mm/damon/core.c @@ -1043,6 +1043,10 @@ static void damos_adjust_quota(struct da if (!quota->ms && !quota->sz) return;
+ /* First charge window */ + if (!quota->total_charged_sz && !quota->charged_from) + quota->charged_from = jiffies; + /* New charge window starts */ if (time_after_eq(jiffies, quota->charged_from + msecs_to_jiffies(quota->reset_interval))) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 4de37a48b6b58faaded9eb765047cf0d8785ea18 upstream.
The for_each_child_of_node() helper drops the reference it takes to each node as it iterates over children and an explicit of_node_put() is only needed when exiting the loop early.
Drop the recently introduced bogus additional reference count decrement at each iteration that could potentially lead to a use-after-free.
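For reference, a hedged kernel-style sketch of the expected pattern (the function name and the arbitrary limit are made up, this is not the mediatek code): the iterator drops the reference of each visited child itself, so of_node_put() is only the caller's job when breaking out early.

#include <linux/of.h>

/* Count matching children, stopping after the first 4 (arbitrary limit). */
static int count_matching_children(struct device_node *parent,
				   const struct of_device_id *matches)
{
	struct device_node *child;
	int cnt = 0;

	for_each_child_of_node(parent, child) {
		if (!of_match_node(matches, child))
			continue;	/* no of_node_put(): the iterator drops it */

		if (++cnt == 4) {
			/* Early exit: we still hold the reference taken by the
			 * iterator for 'child', so drop it exactly once here. */
			of_node_put(child);
			break;
		}
	}

	return cnt;
}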
Fixes: 1f403699c40f ("drm/mediatek: Fix device/node reference count leaks in mtk_drm_get_all_drm_priv") Cc: Ma Ke make24@iscas.ac.cn Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Reviewed-by: CK Hu ck.hu@mediatek.com Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Link: https://patchwork.kernel.org/project/dri-devel/patch/20250829090345.21075-2-... Signed-off-by: Chun-Kuang Hu chunkuang.hu@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/mediatek/mtk_drm_drv.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c @@ -365,11 +365,11 @@ static bool mtk_drm_get_all_drm_priv(str
of_id = of_match_node(mtk_drm_of_ids, node); if (!of_id) - goto next_put_node; + continue;
pdev = of_find_device_by_node(node); if (!pdev) - goto next_put_node; + continue;
drm_dev = device_find_child(&pdev->dev, NULL, mtk_drm_match); if (!drm_dev) @@ -395,11 +395,10 @@ next_put_device_drm_dev: next_put_device_pdev_dev: put_device(&pdev->dev);
-next_put_node: - of_node_put(node); - - if (cnt == MAX_CRTC) + if (cnt == MAX_CRTC) { + of_node_put(node); break; + } }
if (drm_priv->data->mmsys_dev_num == cnt) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Rosca david.rosca@amd.com
commit 3318f2d20ce48849855df5e190813826d0bc3653 upstream.
There is no reason to require this to happen on the first submitted IB only. We need to wait for the queue to be idle, but that can be done at any time (including when there are multiple video sessions active).
Signed-off-by: David Rosca david.rosca@amd.com Reviewed-by: Leo Liu leo.liu@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 8908fdce0634a623404e9923ed2f536101a39db5) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 12 ++++++++---- drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 12 ++++++++---- 2 files changed, 16 insertions(+), 8 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c @@ -1765,15 +1765,19 @@ static int vcn_v3_0_limit_sched(struct a struct amdgpu_job *job) { struct drm_gpu_scheduler **scheds; - - /* The create msg must be in the first IB submitted */ - if (atomic_read(&job->base.entity->fence_seq)) - return -EINVAL; + struct dma_fence *fence;
/* if VCN0 is harvested, we can't support AV1 */ if (p->adev->vcn.harvest_config & AMDGPU_VCN_HARVEST_VCN0) return -EINVAL;
+ /* wait for all jobs to finish before switching to instance 0 */ + fence = amdgpu_ctx_get_fence(p->ctx, job->base.entity, ~0ull); + if (fence) { + dma_fence_wait(fence, false); + dma_fence_put(fence); + } + scheds = p->adev->gpu_sched[AMDGPU_HW_IP_VCN_DEC] [AMDGPU_RING_PRIO_DEFAULT].sched; drm_sched_entity_modify_sched(job->base.entity, scheds, 1); --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c @@ -1644,15 +1644,19 @@ static int vcn_v4_0_limit_sched(struct a struct amdgpu_job *job) { struct drm_gpu_scheduler **scheds; - - /* The create msg must be in the first IB submitted */ - if (atomic_read(&job->base.entity->fence_seq)) - return -EINVAL; + struct dma_fence *fence;
/* if VCN0 is harvested, we can't support AV1 */ if (p->adev->vcn.harvest_config & AMDGPU_VCN_HARVEST_VCN0) return -EINVAL;
+ /* wait for all jobs to finish before switching to instance 0 */ + fence = amdgpu_ctx_get_fence(p->ctx, job->base.entity, ~0ull); + if (fence) { + dma_fence_wait(fence, false); + dma_fence_put(fence); + } + scheds = p->adev->gpu_sched[AMDGPU_HW_IP_VCN_ENC] [AMDGPU_RING_PRIO_0].sched; drm_sched_entity_modify_sched(job->base.entity, scheds, 1);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Rosca david.rosca@amd.com
commit 2b10cb58d7a3fd621ec9b2ba765a092e562ef998 upstream.
There can be multiple engine info packages in one IB, and the first one may be the common engine, not decode/encode. We need to parse the entire IB instead of stopping after finding the first engine info.
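As a toy, standalone sketch of the parsing pattern (the package layout and values here are made up for illustration and are not the real VCN IB format): walk every package in the buffer instead of stopping at the first one.

#include <stdint.h>
#include <stdio.h>

/* Toy layout: each package starts with ptr[i] = size in bytes, ptr[i+1] = id. */
static int find_package(const uint32_t *ptr, int len_dw, uint32_t id, int start)
{
	for (int i = start; i < len_dw && ptr[i] >= 8; i += ptr[i] / 4)
		if (ptr[i + 1] == id)
			return i;
	return -1;
}

int main(void)
{
	/* Two packages: a "common" one (id 1) followed by a "decode" one (id 3). */
	uint32_t ib[] = { 8, 1, 12, 3, 0 };
	int idx = 0;

	/* Keep scanning instead of stopping at the first package found. */
	while ((idx = find_package(ib, 5, 3, idx)) >= 0) {
		printf("decode package at dword %d\n", idx);
		idx += ib[idx] / 4;
	}
	return 0;
}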
Signed-off-by: David Rosca david.rosca@amd.com Reviewed-by: Leo Liu leo.liu@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit dc8f9f0f45166a6b37864e7a031c726981d6e5fc) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 52 +++++++++++++--------------------- 1 file changed, 21 insertions(+), 31 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c @@ -1747,22 +1747,16 @@ out:
#define RADEON_VCN_ENGINE_TYPE_ENCODE (0x00000002) #define RADEON_VCN_ENGINE_TYPE_DECODE (0x00000003) - #define RADEON_VCN_ENGINE_INFO (0x30000001) -#define RADEON_VCN_ENGINE_INFO_MAX_OFFSET 16 - #define RENCODE_ENCODE_STANDARD_AV1 2 #define RENCODE_IB_PARAM_SESSION_INIT 0x00000003 -#define RENCODE_IB_PARAM_SESSION_INIT_MAX_OFFSET 64
-/* return the offset in ib if id is found, -1 otherwise - * to speed up the searching we only search upto max_offset - */ -static int vcn_v4_0_enc_find_ib_param(struct amdgpu_ib *ib, uint32_t id, int max_offset) +/* return the offset in ib if id is found, -1 otherwise */ +static int vcn_v4_0_enc_find_ib_param(struct amdgpu_ib *ib, uint32_t id, int start) { int i;
- for (i = 0; i < ib->length_dw && i < max_offset && ib->ptr[i] >= 8; i += ib->ptr[i]/4) { + for (i = start; i < ib->length_dw && ib->ptr[i] >= 8; i += ib->ptr[i] / 4) { if (ib->ptr[i + 1] == id) return i; } @@ -1777,33 +1771,29 @@ static int vcn_v4_0_ring_patch_cs_in_pla struct amdgpu_vcn_decode_buffer *decode_buffer; uint64_t addr; uint32_t val; - int idx; + int idx = 0, sidx;
/* The first instance can decode anything */ if (!ring->me) return 0;
- /* RADEON_VCN_ENGINE_INFO is at the top of ib block */ - idx = vcn_v4_0_enc_find_ib_param(ib, RADEON_VCN_ENGINE_INFO, - RADEON_VCN_ENGINE_INFO_MAX_OFFSET); - if (idx < 0) /* engine info is missing */ - return 0; - - val = amdgpu_ib_get_value(ib, idx + 2); /* RADEON_VCN_ENGINE_TYPE */ - if (val == RADEON_VCN_ENGINE_TYPE_DECODE) { - decode_buffer = (struct amdgpu_vcn_decode_buffer *)&ib->ptr[idx + 6]; - - if (!(decode_buffer->valid_buf_flag & 0x1)) - return 0; - - addr = ((u64)decode_buffer->msg_buffer_address_hi) << 32 | - decode_buffer->msg_buffer_address_lo; - return vcn_v4_0_dec_msg(p, job, addr); - } else if (val == RADEON_VCN_ENGINE_TYPE_ENCODE) { - idx = vcn_v4_0_enc_find_ib_param(ib, RENCODE_IB_PARAM_SESSION_INIT, - RENCODE_IB_PARAM_SESSION_INIT_MAX_OFFSET); - if (idx >= 0 && ib->ptr[idx + 2] == RENCODE_ENCODE_STANDARD_AV1) - return vcn_v4_0_limit_sched(p, job); + while ((idx = vcn_v4_0_enc_find_ib_param(ib, RADEON_VCN_ENGINE_INFO, idx)) >= 0) { + val = amdgpu_ib_get_value(ib, idx + 2); /* RADEON_VCN_ENGINE_TYPE */ + if (val == RADEON_VCN_ENGINE_TYPE_DECODE) { + decode_buffer = (struct amdgpu_vcn_decode_buffer *)&ib->ptr[idx + 6]; + + if (!(decode_buffer->valid_buf_flag & 0x1)) + return 0; + + addr = ((u64)decode_buffer->msg_buffer_address_hi) << 32 | + decode_buffer->msg_buffer_address_lo; + return vcn_v4_0_dec_msg(p, job, addr); + } else if (val == RADEON_VCN_ENGINE_TYPE_ENCODE) { + sidx = vcn_v4_0_enc_find_ib_param(ib, RENCODE_IB_PARAM_SESSION_INIT, idx); + if (sidx >= 0 && ib->ptr[sidx + 2] == RENCODE_ENCODE_STANDARD_AV1) + return vcn_v4_0_limit_sched(p, job); + } + idx += ib->ptr[idx] / 4; } return 0; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Dahl ada@thorsis.com
[ Upstream commit 1c60e027ffdebd36f4da766d9c9abbd1ea4dd8f9 ]
Looks like a copy'n'paste mistake introduced when initially adding the dynamic timings feature with commit f9ce2eddf176 ("mtd: nand: atmel: Add ->setup_data_interface() hooks"). The context around this and especially the code itself suggests 'read' is meant instead of write.
Signed-off-by: Alexander Dahl ada@thorsis.com Reviewed-by: Nicolas Ferre nicolas.ferre@microchip.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20240226122537.75097-1-ada@thorsis.com Stable-dep-of: fd779eac2d65 ("mtd: nand: raw: atmel: Respect tAR, tCLR in read setup timing") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/nand/raw/atmel/nand-controller.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mtd/nand/raw/atmel/nand-controller.c +++ b/drivers/mtd/nand/raw/atmel/nand-controller.c @@ -1378,7 +1378,7 @@ static int atmel_smc_nand_prepare_smccon return ret;
/* - * The write cycle timing is directly matching tWC, but is also + * The read cycle timing is directly matching tRC, but is also * dependent on the setup and hold timings we calculated earlier, * which gives: *
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Sverdlin alexander.sverdlin@siemens.com
[ Upstream commit fd779eac2d659668be4d3dbdac0710afd5d6db12 ]
Having a setup time of 0 violates tAR and tCLR of some chips; for instance, the TOSHIBA TC58NVG2S3ETAI0 cannot be detected successfully (the first ID byte is read duplicated, i.e. 98 98 dc 90 15 76 14 03 instead of 98 dc 90 15 76 ...).

Atmel Application Notes postulated a 1-cycle NRD_SETUP without explanation [1], but it looks more appropriate to just calculate the setup time properly.
[1] Link: https://ww1.microchip.com/downloads/aemDocuments/documents/MPU32/Application...
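As a rough illustration of the calculation the fix performs, a standalone sketch with made-up example timings (not the driver code):

#include <stdio.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

int main(void)
{
	/* Example values only: 100 MHz MCK -> 10000 ps period,
	 * tAR/tCLR taken from a hypothetical SDR timing mode. */
	unsigned long mckperiodps = 10000;
	unsigned long tAR_min = 10000, tCLR_min = 20000;

	unsigned long timeps = tAR_min > tCLR_min ? tAR_min : tCLR_min;
	unsigned long nrd_setup = DIV_ROUND_UP(timeps, mckperiodps);

	printf("NRD_SETUP = max(tAR, tCLR) = %lu cycles\n", nrd_setup);
	return 0;
}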
Cc: stable@vger.kernel.org Fixes: f9ce2eddf176 ("mtd: nand: atmel: Add ->setup_data_interface() hooks") Signed-off-by: Alexander Sverdlin alexander.sverdlin@siemens.com Tested-by: Alexander Dahl ada@thorsis.com Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/nand/raw/atmel/nand-controller.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
--- a/drivers/mtd/nand/raw/atmel/nand-controller.c +++ b/drivers/mtd/nand/raw/atmel/nand-controller.c @@ -1378,13 +1378,23 @@ static int atmel_smc_nand_prepare_smccon return ret;
/* + * Read setup timing depends on the operation done on the NAND: + * + * NRD_SETUP = max(tAR, tCLR) + */ + timeps = max(conf->timings.sdr.tAR_min, conf->timings.sdr.tCLR_min); + ncycles = DIV_ROUND_UP(timeps, mckperiodps); + totalcycles += ncycles; + ret = atmel_smc_cs_conf_set_setup(smcconf, ATMEL_SMC_NRD_SHIFT, ncycles); + if (ret) + return ret; + + /* * The read cycle timing is directly matching tRC, but is also * dependent on the setup and hold timings we calculated earlier, * which gives: * - * NRD_CYCLE = max(tRC, NRD_PULSE + NRD_HOLD) - * - * NRD_SETUP is always 0. + * NRD_CYCLE = max(tRC, NRD_SETUP + NRD_PULSE + NRD_HOLD) */ ncycles = DIV_ROUND_UP(conf->timings.sdr.tRC_min, mckperiodps); ncycles = max(totalcycles, ncycles);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Dryomov idryomov@gmail.com
commit cdbc9836c7afadad68f374791738f118263c5371 upstream.
There is a place where generic code in messenger.c reads, and another place where it writes, a con->v1 union member without checking that that union member is active (i.e. that msgr1 is in use).
On 64-bit systems, con->v1.auth_retry overlaps with con->v2.out_iter, so such a read is almost guaranteed to return a bogus value instead of 0 when msgr2 is in use. This ends up being fairly benign because the side effect is just the invalidation of the authorizer and successive fetching of new tickets.
con->v1.connect_seq overlaps with con->v2.conn_bufs and the fact that it's being written to can cause more serious consequences, but luckily it's not something that happens often.
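A tiny userspace sketch of the underlying hazard (field names are hypothetical, not the ceph structures): reading an inactive union member just returns whatever bytes the active member last stored there, so such accesses need to be guarded by a check of which member is in use.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

struct conn {
	int use_v2;	/* which union member (protocol) is active */
	union {
		struct { uint32_t auth_retry; uint32_t connect_seq; } v1;
		struct { void *out_iter; } v2;
	};
};

int main(void)
{
	struct conn con = { .use_v2 = 1 };

	con.v2.out_iter = (void *)0xdeadbeef;	/* legitimate v2 state */

	/* Bogus: v1 is not active, so this reads v2's bytes instead of 0. */
	printf("unguarded read: auth_retry looks like %" PRIu32 "\n",
	       con.v1.auth_retry);

	/* Correct: only touch v1 fields when v1 is actually in use. */
	if (!con.use_v2)
		printf("auth_retry = %" PRIu32 "\n", con.v1.auth_retry);

	return 0;
}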
Cc: stable@vger.kernel.org Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)") Signed-off-by: Ilya Dryomov idryomov@gmail.com Reviewed-by: Viacheslav Dubeyko Slava.Dubeyko@ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ceph/messenger.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -1524,7 +1524,7 @@ static void con_fault_finish(struct ceph * in case we faulted due to authentication, invalidate our * current tickets so that we can get new ones. */ - if (con->v1.auth_retry) { + if (!ceph_msgr2(from_msgr(con->msgr)) && con->v1.auth_retry) { dout("auth_retry %d, invalidating\n", con->v1.auth_retry); if (con->ops->invalidate_authorizer) con->ops->invalidate_authorizer(con); @@ -1714,9 +1714,10 @@ static void clear_standby(struct ceph_co { /* come back from STANDBY? */ if (con->state == CEPH_CON_S_STANDBY) { - dout("clear_standby %p and ++connect_seq\n", con); + dout("clear_standby %p\n", con); con->state = CEPH_CON_S_PREOPEN; - con->v1.connect_seq++; + if (!ceph_msgr2(from_msgr(con->msgr))) + con->v1.connect_seq++; WARN_ON(ceph_con_flag_test(con, CEPH_CON_F_WRITE_PENDING)); WARN_ON(ceph_con_flag_test(con, CEPH_CON_F_KEEPALIVE_PENDING)); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stanislav Fort stanislav.fort@aisle.com
commit 3260a3f0828e06f5f13fac69fb1999a6d60d9cff upstream.
state_show() reads kdamond->damon_ctx without holding damon_sysfs_lock. This allows a use-after-free race:
CPU 0                                     CPU 1
-----                                     -----
state_show()                              damon_sysfs_turn_damon_on()
ctx = kdamond->damon_ctx;                 mutex_lock(&damon_sysfs_lock);
                                          damon_destroy_ctx(kdamond->damon_ctx);
                                          kdamond->damon_ctx = NULL;
                                          mutex_unlock(&damon_sysfs_lock);
damon_is_running(ctx); /* ctx is freed */
mutex_lock(&ctx->kdamond_lock); /* UAF */
(The race can also occur with damon_sysfs_kdamonds_rm_dirs() and damon_sysfs_kdamond_release(), which free or replace the context under damon_sysfs_lock.)
Fix by taking damon_sysfs_lock before dereferencing the context, mirroring the locking used in pid_show().
The bug has existed since state_show() first accessed kdamond->damon_ctx.
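A minimal userspace sketch of the fix's shape (pthread primitives and made-up names, not the DAMON sysfs code): take the lock, or bail out, before dereferencing the shared pointer.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int *shared_ctx;	/* may be freed and set to NULL under 'lock' */

/* Read the context only while holding the lock; bail out if contended. */
static int show_state(void)
{
	int running = 0;

	if (pthread_mutex_trylock(&lock))
		return -1;	/* like returning -EBUSY */
	if (shared_ctx)
		running = *shared_ctx;
	pthread_mutex_unlock(&lock);
	return running;
}

int main(void)
{
	printf("running=%d\n", show_state());
	return 0;
}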
Link: https://lkml.kernel.org/r/20250905101046.2288-1-disclosure@aisle.com Fixes: a61ea561c871 ("mm/damon/sysfs: link DAMON for virtual address spaces monitoring") Signed-off-by: Stanislav Fort disclosure@aisle.com Reported-by: Stanislav Fort disclosure@aisle.com Reviewed-by: SeongJae Park sj@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: SeongJae Park sj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/sysfs.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
--- a/mm/damon/sysfs.c +++ b/mm/damon/sysfs.c @@ -1055,14 +1055,18 @@ static ssize_t state_show(struct kobject { struct damon_sysfs_kdamond *kdamond = container_of(kobj, struct damon_sysfs_kdamond, kobj); - struct damon_ctx *ctx = kdamond->damon_ctx; - bool running; + struct damon_ctx *ctx; + bool running = false;
- if (!ctx) - running = false; - else + if (!mutex_trylock(&damon_sysfs_lock)) + return -EBUSY; + + ctx = kdamond->damon_ctx; + if (ctx) running = damon_sysfs_ctx_running(ctx);
+ mutex_unlock(&damon_sysfs_lock); + return sysfs_emit(buf, "%s\n", running ? damon_sysfs_cmd_strs[DAMON_SYSFS_CMD_ON] : damon_sysfs_cmd_strs[DAMON_SYSFS_CMD_OFF]);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Quanmin Yan yanquanmin1@huawei.com
commit e6b543ca9806d7bced863f43020e016ee996c057 upstream.
When creating a new scheme of DAMON_RECLAIM, the calculation of 'min_age_region' uses 'aggr_interval' as the divisor, which may lead to division-by-zero errors. Fix it by directly returning -EINVAL when such a case occurs.
Link: https://lkml.kernel.org/r/20250827115858.1186261-3-yanquanmin1@huawei.com Fixes: f5a79d7c0c87 ("mm/damon: introduce struct damos_access_pattern") Signed-off-by: Quanmin Yan yanquanmin1@huawei.com Reviewed-by: SeongJae Park sj@kernel.org Cc: Kefeng Wang wangkefeng.wang@huawei.com Cc: ze zuo zuoze1@huawei.com Cc: stable@vger.kernel.org [6.1+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: SeongJae Park sj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/reclaim.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/mm/damon/reclaim.c +++ b/mm/damon/reclaim.c @@ -167,6 +167,9 @@ static int damon_reclaim_apply_parameter struct damos_filter *filter; int err = 0;
+ if (!damon_reclaim_mon_attrs.aggr_interval) + return -EINVAL; + err = damon_set_attrs(ctx, &damon_reclaim_mon_attrs); if (err) return err;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Quanmin Yan yanquanmin1@huawei.com
commit 711f19dfd783ffb37ca4324388b9c4cb87e71363 upstream.
Patch series "mm/damon: avoid divide-by-zero in DAMON module's parameters application".
DAMON's RECLAIM and LRU_SORT modules perform no validation on user-configured parameters during application, which may lead to division-by-zero errors.
Avoid the divide-by-zero by adding validation checks when DAMON modules attempt to apply the parameters.
This patch (of 2):
During the calculation of 'hot_thres' and 'cold_thres', either 'sample_interval' or 'aggr_interval' is used as the divisor, which may lead to division-by-zero errors. Fix it by directly returning -EINVAL when such a case occurs. Additionally, since 'aggr_interval' is already required to be set no smaller than 'sample_interval' in damon_set_attrs(), only the case where 'sample_interval' is zero needs to be checked.
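For illustration only, a standalone sketch of the kind of guard being added (the helper name and the simplified formula are assumptions, not the DAMON code):

#include <stdio.h>

/* The aggregation/sampling ratio is only meaningful when sample_interval
 * is non-zero, so reject that case up front instead of dividing by zero. */
static int compute_max_nr_accesses(unsigned long sample_us,
				   unsigned long aggr_us,
				   unsigned long *out)
{
	if (!sample_us)
		return -1;	/* would divide by zero: reject, like -EINVAL */
	*out = aggr_us / sample_us;
	return 0;
}

int main(void)
{
	unsigned long max_nr;

	if (compute_max_nr_accesses(0, 100000, &max_nr))
		printf("rejected: sample_interval is 0\n");
	return 0;
}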
Link: https://lkml.kernel.org/r/20250827115858.1186261-2-yanquanmin1@huawei.com Fixes: 40e983cca927 ("mm/damon: introduce DAMON-based LRU-lists Sorting") Signed-off-by: Quanmin Yan yanquanmin1@huawei.com Reviewed-by: SeongJae Park sj@kernel.org Cc: Kefeng Wang wangkefeng.wang@huawei.com Cc: ze zuo zuoze1@huawei.com Cc: stable@vger.kernel.org [6.0+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: SeongJae Park sj@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/lru_sort.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/mm/damon/lru_sort.c +++ b/mm/damon/lru_sort.c @@ -203,6 +203,9 @@ static int damon_lru_sort_apply_paramete unsigned int hot_thres, cold_thres; int err = 0;
+ if (!damon_lru_sort_mon_attrs.sample_interval) + return -EINVAL; + err = damon_set_attrs(ctx, &damon_lru_sort_mon_attrs); if (err) return err;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Boris Burkov boris@bur.io
[ Upstream commit 9e9ff875e4174be939371667d2cc81244e31232f ]
We recently received a report of poor performance doing sequential buffered reads of a file with compressed extents. With bs=128k, a naive sequential dd ran as fast on a compressed file as on an uncompressed one (1.2GB/s on my reproducing system), while with bs<32k this performance tanked down to ~300MB/s.
i.e., slow:
dd if=some-compressed-file of=/dev/null bs=4k count=X
vs fast:
dd if=some-compressed-file of=/dev/null bs=128k count=Y
The cause of this slowness is the overhead of looking up extent_maps to enable readahead pre-caching on compressed extents (add_ra_bio_pages()), as well as some overhead in the generic VFS readahead code that we hit more in the slow case. Notably, the main difference between the two read sizes is that in the large request case we call btrfs_readahead() relatively rarely, while in the smaller request case we call it for every compressed extent. So the fast case stays in the btrfs readahead loop:
while ((folio = readahead_folio(rac)) != NULL)
	btrfs_do_readpage(folio, &em_cached, &bio_ctrl, &prev_em_start);
where the slower one breaks out of that loop every time. This results in calling add_ra_bio_pages a lot, doing lots of extent_map lookups, extent_map locking, etc.
This happens because although add_ra_bio_pages() does add the appropriate uncompressed file pages to the cache, it does not communicate back to the ractl in any way. To solve this, we should be using readahead_expand() to signal to readahead to expand the readahead window.
This change passes the readahead_control into the btrfs_bio_ctrl and in the case of compressed reads sets the expansion to the size of the extent_map we already looked up anyway. It skips the subpage case as that one already doesn't do add_ra_bio_pages().
With this change, whether we use bs=4k or bs=128k, btrfs expands the readahead window up to the largest compressed extent we have seen so far (in the trivial example: 128k) and the call stacks of the two modes look identical. Notably, we barely call add_ra_bio_pages at all. And the performance becomes identical as well. So this change certainly "fixes" this performance problem.
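Conceptually, the expansion is just arithmetic on the current window; a standalone sketch (not the btrfs code, names are made up):

#include <stdint.h>
#include <stdio.h>

/* Return the window length needed so [ra_pos, ra_pos + len) covers em_end. */
static uint64_t expand_window(uint64_t ra_pos, uint64_t ra_len, uint64_t em_end)
{
	uint64_t ra_end = ra_pos + ra_len;

	return em_end > ra_end ? em_end - ra_pos : ra_len;
}

int main(void)
{
	/* A 4k request at offset 0, but the compressed extent spans 128k. */
	printf("window: %llu bytes\n",
	       (unsigned long long)expand_window(0, 4096, 128 * 1024));
	return 0;
}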
Of course, it does seem to beg a few questions:
1. Will this waste too much page cache with a too large ra window?
2. Will this somehow cause bugs prevented by the more thoughtful checking in add_ra_bio_pages?
3. Should we delete add_ra_bio_pages?
My stabs at some answers:
1. Hard to say. See attempts at generic performance testing below. Is there a "readahead_shrink" we should be using? Should we expand more slowly, by half the remaining em size each time?
2. I don't think so. Since the new behavior is indistinguishable from reading the file with a larger read size passed in, I don't see why one would be safe but not the other.
3. Probably! I tested that and it was fine in fstests, and it seems like the pages would get re-used just as well in the readahead case. However, it is possible some reads that use page cache but not btrfs_readahead() could suffer. I will investigate this further as a follow up.
I tested the performance implications of this change in 3 ways (using compress-force=zstd:3 for compression):
Directly test the affected workload of small sequential reads on a compressed file (improved from ~250MB/s to ~1.2GB/s)
==========for-next==========
dd /mnt/lol/non-cmpr 4k
1048576+0 records in
1048576+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 6.02983 s, 712 MB/s
dd /mnt/lol/non-cmpr 128k
32768+0 records in
32768+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 5.92403 s, 725 MB/s
dd /mnt/lol/cmpr 4k
1048576+0 records in
1048576+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 17.8832 s, 240 MB/s
dd /mnt/lol/cmpr 128k
32768+0 records in
32768+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.71001 s, 1.2 GB/s
==========ra-expand==========
dd /mnt/lol/non-cmpr 4k
1048576+0 records in
1048576+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 6.09001 s, 705 MB/s
dd /mnt/lol/non-cmpr 128k
32768+0 records in
32768+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 6.07664 s, 707 MB/s
dd /mnt/lol/cmpr 4k
1048576+0 records in
1048576+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.79531 s, 1.1 GB/s
dd /mnt/lol/cmpr 128k
32768+0 records in
32768+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.69533 s, 1.2 GB/s
Built the linux kernel from clean (no change)
Ran fsperf. Mostly neutral results with some improvements and regressions here and there.
Reported-by: Dimitrios Apostolou jimis@gmx.net Link: https://lore.kernel.org/linux-btrfs/34601559-6c16-6ccc-1793-20a97ca0dbba@gmx... Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: Boris Burkov boris@bur.io Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent_io.c | 34 +++++++++++++++++++++++++++++++++- 1 file changed, 33 insertions(+), 1 deletion(-)
--- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -103,6 +103,7 @@ struct btrfs_bio_ctrl { blk_opf_t opf; btrfs_bio_end_io_t end_io_func; struct writeback_control *wbc; + struct readahead_control *ractl; };
static void submit_one_bio(struct btrfs_bio_ctrl *bio_ctrl) @@ -952,6 +953,23 @@ __get_extent_map(struct inode *inode, st } return em; } + +static void btrfs_readahead_expand(struct readahead_control *ractl, + const struct extent_map *em) +{ + const u64 ra_pos = readahead_pos(ractl); + const u64 ra_end = ra_pos + readahead_length(ractl); + const u64 em_end = em->start + em->ram_bytes; + + /* No expansion for holes and inline extents. */ + if (em->block_start > EXTENT_MAP_LAST_BYTE) + return; + + ASSERT(em_end >= ra_pos); + if (em_end > ra_end) + readahead_expand(ractl, ra_pos, em_end - ra_pos); +} + /* * basic readpage implementation. Locked extent state structs are inserted * into the tree that are removed when the IO is done (by the end_io @@ -1023,6 +1041,17 @@ static int btrfs_do_readpage(struct page
iosize = min(extent_map_end(em) - cur, end - cur + 1); iosize = ALIGN(iosize, blocksize); + + /* + * Only expand readahead for extents which are already creating + * the pages anyway in add_ra_bio_pages, which is compressed + * extents in the non subpage case. + */ + if (bio_ctrl->ractl && + !btrfs_is_subpage(fs_info, page) && + compress_type != BTRFS_COMPRESS_NONE) + btrfs_readahead_expand(bio_ctrl->ractl, em); + if (compress_type != BTRFS_COMPRESS_NONE) disk_bytenr = em->block_start; else @@ -2224,7 +2253,10 @@ int extent_writepages(struct address_spa
void extent_readahead(struct readahead_control *rac) { - struct btrfs_bio_ctrl bio_ctrl = { .opf = REQ_OP_READ | REQ_RAHEAD }; + struct btrfs_bio_ctrl bio_ctrl = { + .opf = REQ_OP_READ | REQ_RAHEAD, + .ractl = rac + }; struct page *pagepool[16]; struct extent_map *em_cached = NULL; u64 prev_em_start = (u64)-1;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
[ Upstream commit 9786531399a679fc2f4630d2c0a186205282ab2f ]
[BUG] With 64K page size (aarch64 with 64K page size config) and 4K btrfs block size, the following workload can easily lead to a corrupted read:
mkfs.btrfs -f -s 4k $dev > /dev/null
mount -o compress $dev $mnt
xfs_io -f -c "pwrite -S 0xff 0 64k" $mnt/base > /dev/null
echo "correct result:"
od -Ad -t x1 $mnt/base
xfs_io -f -c "reflink $mnt/base 32k 0 32k" \
       -c "reflink $mnt/base 0 32k 32k" \
       -c "pwrite -S 0xff 60k 4k" $mnt/new > /dev/null
echo "incorrect result:"
od -Ad -t x1 $mnt/new
umount $mnt
This shows the following result:
correct result:
0000000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
*
0065536
incorrect result:
0000000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
*
0032768 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0061440 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
*
0065536
Notice the zero in the range [32K, 60K), which is incorrect.
[CAUSE] With extra trace printk, it shows the following events during od: (some unrelated info removed like CPU and context)
od-3457 btrfs_do_readpage: enter r/i=5/258 folio=0(65536) prev_em_start=0000000000000000
The "r/i" is indicating the root and inode number. In our case the file "new" is using ino 258 from fs tree (root 5).
Here notice the @prev_em_start pointer is NULL. This means the btrfs_do_readpage() is called from btrfs_read_folio(), not from btrfs_readahead().
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=0 got em start=0 len=32768
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=4096 got em start=0 len=32768
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=8192 got em start=0 len=32768
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=12288 got em start=0 len=32768
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=16384 got em start=0 len=32768
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=20480 got em start=0 len=32768
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=24576 got em start=0 len=32768
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=28672 got em start=0 len=32768
These above 32K blocks will be read from the first half of the compressed data extent.
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=32768 got em start=32768 len=32768
Note that there is no btrfs_submit_compressed_read() call here, which is incorrect. Although both extent maps at 0 and 32K point to the same compressed data, their offsets are different and thus cannot be merged into the same read.
So this means the compressed data read merge check is doing something wrong.
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=36864 got em start=32768 len=32768
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=40960 got em start=32768 len=32768
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=45056 got em start=32768 len=32768
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=49152 got em start=32768 len=32768
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=53248 got em start=32768 len=32768
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=57344 got em start=32768 len=32768
od-3457 btrfs_do_readpage: r/i=5/258 folio=0(65536) cur=61440 skip uptodate
od-3457 btrfs_submit_compressed_read: cb orig_bio: file off=0 len=61440
The function btrfs_submit_compressed_read() is only called at the end of the folio read. The compressed bio will only have an extent map of range [0, 32K), but the original bio passed in is for the whole 64K folio.
This will cause the decompression part to only fill the first 32K, leaving the rest untouched (aka, filled with zero).
This incorrect compressed read merge leads to the above data corruption.
Similar problems have happened in the past; commit 808f80b46790 ("Btrfs: update fix for read corruption of compressed and shared extents") did pretty much the same fix for readahead.

But that was back in 2015, when btrfs only supported bs (block size) == ps (page size) cases. This means btrfs_do_readpage() only needed to handle a folio which contains exactly one block.
Only btrfs_readahead() can lead to a read covering multiple blocks. Thus only btrfs_readahead() passes a non-NULL @prev_em_start pointer.
With the v5.15 kernel, btrfs introduced bs < ps support. This breaks the above assumption that a folio can only contain one block.
Now btrfs_read_folio() can also read multiple blocks in one go. But btrfs_read_folio() doesn't pass a @prev_em_start pointer, thus the existing bio force submission check will never be triggered.
In theory, this can also happen for btrfs with large folios, but since large folio support is still experimental we don't need to worry about it; thus only bs < ps support is affected for now.
[FIX] Instead of passing @prev_em_start to do the proper compressed extent check, introduce one new member, btrfs_bio_ctrl::last_em_start, so that the existing bio force submission logic will always be triggered.
CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent_io.c | 46 ++++++++++++++++++++++++++++++++-------------- 1 file changed, 32 insertions(+), 14 deletions(-)
--- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -104,6 +104,24 @@ struct btrfs_bio_ctrl { btrfs_bio_end_io_t end_io_func; struct writeback_control *wbc; struct readahead_control *ractl; + + /* + * The start offset of the last used extent map by a read operation. + * + * This is for proper compressed read merge. + * U64_MAX means we are starting the read and have made no progress yet. + * + * The current btrfs_bio_is_contig() only uses disk_bytenr as + * the condition to check if the read can be merged with previous + * bio, which is not correct. E.g. two file extents pointing to the + * same extent but with different offset. + * + * So here we need to do extra checks to only merge reads that are + * covered by the same extent map. + * Just extent_map::start will be enough, as they are unique + * inside the same inode. + */ + u64 last_em_start; };
static void submit_one_bio(struct btrfs_bio_ctrl *bio_ctrl) @@ -978,7 +996,7 @@ static void btrfs_readahead_expand(struc * return 0 on success, otherwise return error */ static int btrfs_do_readpage(struct page *page, struct extent_map **em_cached, - struct btrfs_bio_ctrl *bio_ctrl, u64 *prev_em_start) + struct btrfs_bio_ctrl *bio_ctrl) { struct inode *inode = page->mapping->host; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); @@ -1095,12 +1113,11 @@ static int btrfs_do_readpage(struct page * non-optimal behavior (submitting 2 bios for the same extent). */ if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags) && - prev_em_start && *prev_em_start != (u64)-1 && - *prev_em_start != em->start) + bio_ctrl->last_em_start != (u64)-1 && + bio_ctrl->last_em_start != em->start) force_bio_submit = true;
- if (prev_em_start) - *prev_em_start = em->start; + bio_ctrl->last_em_start = em->start;
free_extent_map(em); em = NULL; @@ -1146,12 +1163,15 @@ int btrfs_read_folio(struct file *file, struct btrfs_inode *inode = BTRFS_I(page->mapping->host); u64 start = page_offset(page); u64 end = start + PAGE_SIZE - 1; - struct btrfs_bio_ctrl bio_ctrl = { .opf = REQ_OP_READ }; + struct btrfs_bio_ctrl bio_ctrl = { + .opf = REQ_OP_READ, + .last_em_start = (u64)-1, + }; int ret;
btrfs_lock_and_flush_ordered_range(inode, start, end, NULL);
- ret = btrfs_do_readpage(page, NULL, &bio_ctrl, NULL); + ret = btrfs_do_readpage(page, NULL, &bio_ctrl); /* * If btrfs_do_readpage() failed we will want to submit the assembled * bio to do the cleanup. @@ -1163,8 +1183,7 @@ int btrfs_read_folio(struct file *file, static inline void contiguous_readpages(struct page *pages[], int nr_pages, u64 start, u64 end, struct extent_map **em_cached, - struct btrfs_bio_ctrl *bio_ctrl, - u64 *prev_em_start) + struct btrfs_bio_ctrl *bio_ctrl) { struct btrfs_inode *inode = BTRFS_I(pages[0]->mapping->host); int index; @@ -1172,8 +1191,7 @@ static inline void contiguous_readpages( btrfs_lock_and_flush_ordered_range(inode, start, end, NULL);
for (index = 0; index < nr_pages; index++) { - btrfs_do_readpage(pages[index], em_cached, bio_ctrl, - prev_em_start); + btrfs_do_readpage(pages[index], em_cached, bio_ctrl); put_page(pages[index]); } } @@ -2255,11 +2273,11 @@ void extent_readahead(struct readahead_c { struct btrfs_bio_ctrl bio_ctrl = { .opf = REQ_OP_READ | REQ_RAHEAD, - .ractl = rac + .ractl = rac, + .last_em_start = (u64)-1, }; struct page *pagepool[16]; struct extent_map *em_cached = NULL; - u64 prev_em_start = (u64)-1; int nr;
while ((nr = readahead_page_batch(rac, pagepool))) { @@ -2267,7 +2285,7 @@ void extent_readahead(struct readahead_c u64 contig_end = contig_start + readahead_batch_length(rac) - 1;
contiguous_readpages(pagepool, nr, contig_start, contig_end, - &em_cached, &bio_ctrl, &prev_em_start); + &em_cached, &bio_ctrl); }
if (em_cached)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Vishal Moola (Oracle)" vishal.moola@gmail.com
[ Upstream commit 5c07ebb372d66423e508ecfb8e00324f8797f072 ]
Replaces 5 calls to compound_head(), and removes 1385 bytes of kernel text.
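To illustrate the pattern (a minimal sketch, not the patch itself; the helper name is made up), resolving the folio once with page_folio() replaces the repeated, implicit compound_head() lookups hidden inside the PageLRU()/PageLocked()/PageAnon() helpers:

#include <linux/mm.h>

/* Sketch: test the head of a possibly-compound page via its folio. */
static bool scan_page_is_candidate(struct page *page)
{
        struct folio *folio = page_folio(page); /* one head lookup */

        return folio_test_lru(folio) &&
               !folio_test_locked(folio) &&
               folio_test_anon(folio);
}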
Link: https://lkml.kernel.org/r/20231020183331.10770-3-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) vishal.moola@gmail.com Reviewed-by: Rik van Riel riel@surriel.com Reviewed-by: Yang Shi shy828301@gmail.com Cc: Kefeng Wang wangkefeng.wang@huawei.com Cc: Matthew Wilcox (Oracle) willy@infradead.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: 394bfac1c7f7 ("mm/khugepaged: fix the address passed to notifier on testing young") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/khugepaged.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
--- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1240,6 +1240,7 @@ static int hpage_collapse_scan_pmd(struc int result = SCAN_FAIL, referenced = 0; int none_or_zero = 0, shared = 0; struct page *page = NULL; + struct folio *folio = NULL; unsigned long _address; spinlock_t *ptl; int node = NUMA_NO_NODE, unmapped = 0; @@ -1326,29 +1327,28 @@ static int hpage_collapse_scan_pmd(struc } }
- page = compound_head(page); - + folio = page_folio(page); /* * Record which node the original page is from and save this * information to cc->node_load[]. * Khugepaged will allocate hugepage from the node has the max * hit record. */ - node = page_to_nid(page); + node = folio_nid(folio); if (hpage_collapse_scan_abort(node, cc)) { result = SCAN_SCAN_ABORT; goto out_unmap; } cc->node_load[node]++; - if (!PageLRU(page)) { + if (!folio_test_lru(folio)) { result = SCAN_PAGE_LRU; goto out_unmap; } - if (PageLocked(page)) { + if (folio_test_locked(folio)) { result = SCAN_PAGE_LOCK; goto out_unmap; } - if (!PageAnon(page)) { + if (!folio_test_anon(folio)) { result = SCAN_PAGE_ANON; goto out_unmap; } @@ -1363,7 +1363,7 @@ static int hpage_collapse_scan_pmd(struc * has excessive GUP pins (i.e. 512). Anyway the same check * will be done again later the risk seems low. */ - if (!is_refcount_suitable(page)) { + if (!is_refcount_suitable(&folio->page)) { result = SCAN_PAGE_COUNT; goto out_unmap; } @@ -1373,8 +1373,8 @@ static int hpage_collapse_scan_pmd(struc * enough young pte to justify collapsing the page */ if (cc->is_khugepaged && - (pte_young(pteval) || page_is_young(page) || - PageReferenced(page) || mmu_notifier_test_young(vma->vm_mm, + (pte_young(pteval) || folio_test_young(folio) || + folio_test_referenced(folio) || mmu_notifier_test_young(vma->vm_mm, address))) referenced++; } @@ -1396,7 +1396,7 @@ out_unmap: *mmap_locked = false; } out: - trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced, + trace_mm_khugepaged_scan_pmd(mm, &folio->page, writable, referenced, none_or_zero, result, unmapped); return result; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wei Yang richard.weiyang@gmail.com
[ Upstream commit 394bfac1c7f7b701c2c93834c5761b9c9ceeebcf ]
Commit 8ee53820edfd ("thp: mmu_notifier_test_young") introduced mmu_notifier_test_young(), but we are passing the wrong address. In xxx_scan_pmd(), the actual iteration address is "_address", not "address". The variable appears to have been misused from the very beginning.
Change it to the right one.
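A minimal sketch of the loop shape in question (simplified and hypothetical, assuming THP headers; not the real khugepaged code): the notifier has to be asked about the per-PTE iteration address, not the unchanging start of the PMD range.

#include <linux/huge_mm.h>
#include <linux/mmu_notifier.h>

static int count_young_ptes(struct mm_struct *mm, unsigned long address)
{
        unsigned long _address = address;
        int referenced = 0, i;

        for (i = 0; i < HPAGE_PMD_NR; i++, _address += PAGE_SIZE) {
                /* query the current PTE's address (_address), not "address" */
                if (mmu_notifier_test_young(mm, _address))
                        referenced++;
        }
        return referenced;
}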
[akpm@linux-foundation.org fix whitespace, per everyone] Link: https://lkml.kernel.org/r/20250822063318.11644-1-richard.weiyang@gmail.com Fixes: 8ee53820edfd ("thp: mmu_notifier_test_young") Signed-off-by: Wei Yang richard.weiyang@gmail.com Reviewed-by: Dev Jain dev.jain@arm.com Reviewed-by: Zi Yan ziy@nvidia.com Acked-by: David Hildenbrand david@redhat.com Reviewed-by: Lorenzo Stoakes lorenzo.stoakes@oracle.com Cc: Baolin Wang baolin.wang@linux.alibaba.com Cc: Liam R. Howlett Liam.Howlett@oracle.com Cc: Nico Pache npache@redhat.com Cc: Ryan Roberts ryan.roberts@arm.com Cc: Barry Song baohua@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/khugepaged.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1374,8 +1374,8 @@ static int hpage_collapse_scan_pmd(struc */ if (cc->is_khugepaged && (pte_young(pteval) || folio_test_young(folio) || - folio_test_referenced(folio) || mmu_notifier_test_young(vma->vm_mm, - address))) + folio_test_referenced(folio) || + mmu_notifier_test_young(vma->vm_mm, _address))) referenced++; } if (!writable) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yang Erkun yangerkun@huawei.com
After commit f3dc1bdb6b0b ("cifs: Fix writeback data corruption"), the cifs writepages path finds the folios that need writeback in two phases. The first folio is found in cifs_writepages_begin(), and the subsequent folios are found in cifs_extend_writeback().
All of these first take a reference on the folio, and in the normal case, once the page has been marked for writeback and the actual write has been done, that reference should be dropped. Folios found in cifs_extend_writeback() do this via folio_batch_release(), but the folio found in cifs_writepages_begin() never gets that chance, so every writepages call leaks one folio. (This was found while running xfstests over cifs; the numbers below show that roughly 600M+ is leaked every time generic/074 runs.)
echo 3 > /proc/sys/vm/drop_caches ; cat /proc/meminfo | grep file
Active(file):      34092 kB
Inactive(file):   176192 kB

./check generic/074 (smb v1)
...
generic/074 50s ... 53s
Ran: generic/074
Passed all 1 tests
echo 3 > /proc/sys/vm/drop_caches ; cat /proc/meminfo | grep file
Active(file):      35036 kB
Inactive(file):   854708 kB
Besides, the existing exit paths never handle this folio correctly either; fix that too with this patch. None of this occurs in mainline because the writepages path for CIFS was changed to netfs (commit 3ee1a1fc3981, "cifs: Cut over to using netfslib") as part of a major refactor. After discussing with the CIFS maintainer, we believe that this single patch is safer for the stable branch [1].
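The rule the fix restores can be sketched as follows (simplified, hypothetical helper, not the cifs code itself): whoever holds the reference taken when a folio was looked up must drop it on every exit path, including the early bail-outs.

#include <linux/mm.h>
#include <linux/pagemap.h>

/* Sketch: @folio arrives here with a reference taken by the search. */
static int write_one_found_folio(struct address_space *mapping,
                                 struct folio *folio)
{
        if (!folio_trylock(folio)) {
                folio_put(folio);               /* early bail-out still drops it */
                return -EAGAIN;
        }

        if (folio->mapping != mapping || !folio_test_dirty(folio)) {
                folio_unlock(folio);
                folio_put(folio);               /* "nothing to do" path, too */
                return 0;
        }

        folio_start_writeback(folio);
        folio_unlock(folio);
        /* ... submit the write ... */
        folio_put(folio);                       /* and the success path */
        return 0;
}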
Steve said: """ David and I discussed this today and this patch is MUCH safer than backporting the later (6.10) netfs changes which would be much larger and riskier to include (and presumably could affect code outside cifs.ko as well where this patch is narrowly targeted).
I am fine with this patch from Yang for 6.6 stable """
David said: """ Backporting the massive amount of changes to netfslib, fscache, cifs, afs, 9p, ceph and nfs would kind of diminish the notion that this is a stable kernel;-). """
Fixes: f3dc1bdb6b0b ("cifs: Fix writeback data corruption") Cc: stable@kernel.org # v6.6~v6.9 Link: https://lore.kernel.org/all/20250911030120.1076413-1-yangerkun@huawei.com/ [1] Acked-by: Steve French stfrench@microsoft.com Reviewed-by: David Howells dhowells@redhat.com Signed-off-by: Yang Erkun yangerkun@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/file.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
--- a/fs/smb/client/file.c +++ b/fs/smb/client/file.c @@ -2884,17 +2884,21 @@ static ssize_t cifs_write_back_from_lock rc = cifs_get_writable_file(CIFS_I(inode), FIND_WR_ANY, &cfile); if (rc) { cifs_dbg(VFS, "No writable handle in writepages rc=%d\n", rc); + folio_unlock(folio); goto err_xid; }
rc = server->ops->wait_mtu_credits(server, cifs_sb->ctx->wsize, &wsize, credits); - if (rc != 0) + if (rc != 0) { + folio_unlock(folio); goto err_close; + }
wdata = cifs_writedata_alloc(cifs_writev_complete); if (!wdata) { rc = -ENOMEM; + folio_unlock(folio); goto err_uncredit; }
@@ -3041,17 +3045,22 @@ search_again: lock_again: if (wbc->sync_mode != WB_SYNC_NONE) { ret = folio_lock_killable(folio); - if (ret < 0) + if (ret < 0) { + folio_put(folio); return ret; + } } else { - if (!folio_trylock(folio)) + if (!folio_trylock(folio)) { + folio_put(folio); goto search_again; + } }
if (folio->mapping != mapping || !folio_test_dirty(folio)) { start += folio_size(folio); folio_unlock(folio); + folio_put(folio); goto search_again; }
@@ -3081,6 +3090,7 @@ lock_again: out: if (ret > 0) *_start = start + ret; + folio_put(folio); return ret; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen Ridong chenridong@huawei.com
commit 3c9ba2777d6c86025e1ba4186dc5cd930e40ec5f upstream.
A use-after-free (UAF) vulnerability was identified in the PSI (Pressure Stall Information) monitoring mechanism:
BUG: KASAN: slab-use-after-free in psi_trigger_poll+0x3c/0x140
Read of size 8 at addr ffff3de3d50bd308 by task systemd/1
 psi_trigger_poll+0x3c/0x140
 cgroup_pressure_poll+0x70/0xa0
 cgroup_file_poll+0x8c/0x100
 kernfs_fop_poll+0x11c/0x1c0
 ep_item_poll.isra.0+0x188/0x2c0
Allocated by task 1:
 cgroup_file_open+0x88/0x388
 kernfs_fop_open+0x73c/0xaf0
 do_dentry_open+0x5fc/0x1200
 vfs_open+0xa0/0x3f0
 do_open+0x7e8/0xd08
 path_openat+0x2fc/0x6b0
 do_filp_open+0x174/0x368
Freed by task 8462:
 cgroup_file_release+0x130/0x1f8
 kernfs_drain_open_files+0x17c/0x440
 kernfs_drain+0x2dc/0x360
 kernfs_show+0x1b8/0x288
 cgroup_file_show+0x150/0x268
 cgroup_pressure_write+0x1dc/0x340
 cgroup_file_write+0x274/0x548
Reproduction Steps:
1. Open test/cpu.pressure and establish epoll monitoring
2. Disable monitoring: echo 0 > test/cgroup.pressure
3. Re-enable monitoring: echo 1 > test/cgroup.pressure
The race condition occurs because:
1. When cgroup.pressure is disabled (echo 0 > cgroup.pressure), it:
   - Releases PSI triggers via cgroup_file_release()
   - Frees of->priv through kernfs_drain_open_files()
2. While epoll still holds reference to the file and continues polling
3. Re-enabling (echo 1 > cgroup.pressure) accesses freed of->priv
epolling                              disable/enable cgroup.pressure
fd=open(cpu.pressure)
while(1)
...
epoll_wait
kernfs_fop_poll
kernfs_get_active = true
                                      echo 0 > cgroup.pressure
...
                                      cgroup_file_show
                                      kernfs_show
                                      // inactive kn
                                      kernfs_drain_open_files
                                      cft->release(of);
                                      kfree(ctx);
                                      ...
kernfs_get_active = false
                                      echo 1 > cgroup.pressure
                                      kernfs_show
                                      kernfs_activate_one(kn);
kernfs_fop_poll
kernfs_get_active = true
cgroup_file_poll
psi_trigger_poll
// UAF
...
end: close(fd)
To address this issue, introduce kernfs_get_active_of() for kernfs open files to obtain active references. This function will fail if the open file has been released. Replace kernfs_get_active() with kernfs_get_active_of() to prevent further operations on released file descriptors.
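In outline, every file operation becomes the following guarded pattern (sketch based on the hunks below; kernfs_get_active_of() and kernfs_put_active_of() are the helpers introduced by this patch):

#include <linux/kernfs.h>
#include <linux/poll.h>

static __poll_t example_poll(struct kernfs_open_file *of, poll_table *wait)
{
        __poll_t ret;

        /* fails once the open file has been released, closing the UAF window */
        if (!kernfs_get_active_of(of))
                return DEFAULT_POLLMASK | EPOLLERR | EPOLLPRI;

        ret = kernfs_generic_poll(of, wait);

        kernfs_put_active_of(of);
        return ret;
}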
Fixes: 34f26a15611a ("sched/psi: Per-cgroup PSI accounting disable/re-enable interface") Cc: stable stable@kernel.org Reported-by: Zhang Zhaotian zhangzhaotian@huawei.com Signed-off-by: Chen Ridong chenridong@huawei.com Acked-by: Tejun Heo tj@kernel.org Link: https://lore.kernel.org/r/20250822070715.1565236-2-chenridong@huaweicloud.co... [ Drop llseek bits ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/kernfs/file.c | 54 ++++++++++++++++++++++++++++++++++++------------------ 1 file changed, 36 insertions(+), 18 deletions(-)
--- a/fs/kernfs/file.c +++ b/fs/kernfs/file.c @@ -70,6 +70,24 @@ static struct kernfs_open_node *of_on(st !list_empty(&of->list)); }
+/* Get active reference to kernfs node for an open file */ +static struct kernfs_open_file *kernfs_get_active_of(struct kernfs_open_file *of) +{ + /* Skip if file was already released */ + if (unlikely(of->released)) + return NULL; + + if (!kernfs_get_active(of->kn)) + return NULL; + + return of; +} + +static void kernfs_put_active_of(struct kernfs_open_file *of) +{ + return kernfs_put_active(of->kn); +} + /** * kernfs_deref_open_node_locked - Get kernfs_open_node corresponding to @kn * @@ -139,7 +157,7 @@ static void kernfs_seq_stop_active(struc
if (ops->seq_stop) ops->seq_stop(sf, v); - kernfs_put_active(of->kn); + kernfs_put_active_of(of); }
static void *kernfs_seq_start(struct seq_file *sf, loff_t *ppos) @@ -152,7 +170,7 @@ static void *kernfs_seq_start(struct seq * the ops aren't called concurrently for the same open file. */ mutex_lock(&of->mutex); - if (!kernfs_get_active(of->kn)) + if (!kernfs_get_active_of(of)) return ERR_PTR(-ENODEV);
ops = kernfs_ops(of->kn); @@ -238,7 +256,7 @@ static ssize_t kernfs_file_read_iter(str * the ops aren't called concurrently for the same open file. */ mutex_lock(&of->mutex); - if (!kernfs_get_active(of->kn)) { + if (!kernfs_get_active_of(of)) { len = -ENODEV; mutex_unlock(&of->mutex); goto out_free; @@ -252,7 +270,7 @@ static ssize_t kernfs_file_read_iter(str else len = -EINVAL;
- kernfs_put_active(of->kn); + kernfs_put_active_of(of); mutex_unlock(&of->mutex);
if (len < 0) @@ -323,7 +341,7 @@ static ssize_t kernfs_fop_write_iter(str * the ops aren't called concurrently for the same open file. */ mutex_lock(&of->mutex); - if (!kernfs_get_active(of->kn)) { + if (!kernfs_get_active_of(of)) { mutex_unlock(&of->mutex); len = -ENODEV; goto out_free; @@ -335,7 +353,7 @@ static ssize_t kernfs_fop_write_iter(str else len = -EINVAL;
- kernfs_put_active(of->kn); + kernfs_put_active_of(of); mutex_unlock(&of->mutex);
if (len > 0) @@ -357,13 +375,13 @@ static void kernfs_vma_open(struct vm_ar if (!of->vm_ops) return;
- if (!kernfs_get_active(of->kn)) + if (!kernfs_get_active_of(of)) return;
if (of->vm_ops->open) of->vm_ops->open(vma);
- kernfs_put_active(of->kn); + kernfs_put_active_of(of); }
static vm_fault_t kernfs_vma_fault(struct vm_fault *vmf) @@ -375,14 +393,14 @@ static vm_fault_t kernfs_vma_fault(struc if (!of->vm_ops) return VM_FAULT_SIGBUS;
- if (!kernfs_get_active(of->kn)) + if (!kernfs_get_active_of(of)) return VM_FAULT_SIGBUS;
ret = VM_FAULT_SIGBUS; if (of->vm_ops->fault) ret = of->vm_ops->fault(vmf);
- kernfs_put_active(of->kn); + kernfs_put_active_of(of); return ret; }
@@ -395,7 +413,7 @@ static vm_fault_t kernfs_vma_page_mkwrit if (!of->vm_ops) return VM_FAULT_SIGBUS;
- if (!kernfs_get_active(of->kn)) + if (!kernfs_get_active_of(of)) return VM_FAULT_SIGBUS;
ret = 0; @@ -404,7 +422,7 @@ static vm_fault_t kernfs_vma_page_mkwrit else file_update_time(file);
- kernfs_put_active(of->kn); + kernfs_put_active_of(of); return ret; }
@@ -418,14 +436,14 @@ static int kernfs_vma_access(struct vm_a if (!of->vm_ops) return -EINVAL;
- if (!kernfs_get_active(of->kn)) + if (!kernfs_get_active_of(of)) return -EINVAL;
ret = -EINVAL; if (of->vm_ops->access) ret = of->vm_ops->access(vma, addr, buf, len, write);
- kernfs_put_active(of->kn); + kernfs_put_active_of(of); return ret; }
@@ -504,7 +522,7 @@ static int kernfs_fop_mmap(struct file * mutex_lock(&of->mutex);
rc = -ENODEV; - if (!kernfs_get_active(of->kn)) + if (!kernfs_get_active_of(of)) goto out_unlock;
ops = kernfs_ops(of->kn); @@ -539,7 +557,7 @@ static int kernfs_fop_mmap(struct file * } vma->vm_ops = &kernfs_vm_ops; out_put: - kernfs_put_active(of->kn); + kernfs_put_active_of(of); out_unlock: mutex_unlock(&of->mutex);
@@ -894,7 +912,7 @@ static __poll_t kernfs_fop_poll(struct f struct kernfs_node *kn = kernfs_dentry_node(filp->f_path.dentry); __poll_t ret;
- if (!kernfs_get_active(kn)) + if (!kernfs_get_active_of(of)) return DEFAULT_POLLMASK|EPOLLERR|EPOLLPRI;
if (kn->attr.ops->poll) @@ -902,7 +920,7 @@ static __poll_t kernfs_fop_poll(struct f else ret = kernfs_generic_poll(of, wait);
- kernfs_put_active(kn); + kernfs_put_active_of(of); return ret; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeff LaBundy jeff@labundy.com
commit c9ddc41cdd522f2db5d492eda3df8994d928be34 upstream.
If a proximity event node is defined so as to specify the wake-up properties of the touch surface, the proximity event interrupt is enabled unconditionally. This may result in unwanted interrupts.
Solve this problem by enabling the interrupt only if the event is mapped to a key or switch code.
Signed-off-by: Jeff LaBundy jeff@labundy.com Link: https://lore.kernel.org/r/aKJxxgEWpNaNcUaW@nixie71 Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/misc/iqs7222.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/input/misc/iqs7222.c +++ b/drivers/input/misc/iqs7222.c @@ -2430,6 +2430,9 @@ static int iqs7222_parse_chan(struct iqs if (error) return error;
+ if (!iqs7222->kp_type[chan_index][i]) + continue; + if (!dev_desc->event_offset) continue;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoffer Sandberg cs@tuxedo.de
commit 1939a9fcb80353dd8b111aa1e79c691afbde08b4 upstream.
The device occasionally wakes up from suspend with missing input on the internal keyboard. Setting the quirks appears to fix the issue for this device as well.
Signed-off-by: Christoffer Sandberg cs@tuxedo.de Signed-off-by: Werner Sembach wse@tuxedocomputers.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250826142646.13516-1-wse@tuxedocomputers.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/serio/i8042-acpipnpio.h | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/drivers/input/serio/i8042-acpipnpio.h +++ b/drivers/input/serio/i8042-acpipnpio.h @@ -1155,6 +1155,20 @@ static const struct dmi_system_id i8042_ .driver_data = (void *)(SERIO_QUIRK_NOMUX | SERIO_QUIRK_RESET_ALWAYS | SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP) }, + { + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "XxHP4NAx"), + }, + .driver_data = (void *)(SERIO_QUIRK_NOMUX | SERIO_QUIRK_RESET_ALWAYS | + SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP) + }, + { + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "XxKK4NAx_XxSP4NAx"), + }, + .driver_data = (void *)(SERIO_QUIRK_NOMUX | SERIO_QUIRK_RESET_ALWAYS | + SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP) + }, /* * A lot of modern Clevo barebones have touchpad and/or keyboard issues * after suspend fixable with the forcenorestore quirk.
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni pabeni@redhat.com
commit 63a796558bc22ec699e4193d5c75534757ddf2e6 upstream.
This reverts commit 5537a4679403 ("net: usb: asix: ax88772: drop phylink use in PM to avoid MDIO runtime PM wakeups"); it breaks operation of the ASIX ethernet USB dongle after a system suspend-resume cycle.
Link: https://lore.kernel.org/all/b5ea8296-f981-445d-a09a-2f389d7f6fdd@samsung.com... Fixes: 5537a4679403 ("net: usb: asix: ax88772: drop phylink use in PM to avoid MDIO runtime PM wakeups") Reported-by: Marek Szyprowski m.szyprowski@samsung.com Acked-by: Jakub Kicinski kuba@kernel.org Link: https://patch.msgid.link/2945b9dbadb8ee1fee058b19554a5cb14f1763c1.1757601118... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/asix_devices.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/drivers/net/usb/asix_devices.c +++ b/drivers/net/usb/asix_devices.c @@ -607,8 +607,15 @@ static const struct net_device_ops ax887
static void ax88772_suspend(struct usbnet *dev) { + struct asix_common_private *priv = dev->driver_priv; u16 medium;
+ if (netif_running(dev->net)) { + rtnl_lock(); + phylink_suspend(priv->phylink, false); + rtnl_unlock(); + } + /* Stop MAC operation */ medium = asix_read_medium_status(dev, 1); medium &= ~AX_MEDIUM_RE; @@ -637,6 +644,12 @@ static void ax88772_resume(struct usbnet for (i = 0; i < 3; i++) if (!priv->reset(dev, 1)) break; + + if (netif_running(dev->net)) { + rtnl_lock(); + phylink_resume(priv->phylink); + rtnl_unlock(); + } }
static int asix_resume(struct usb_interface *intf)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fabian Vogt fvogt@suse.de
commit cfd956dcb101aa3d25bac321fae923323a47c607 upstream.
After hvc_write completes, call hvc_kick also in the case the output buffer has been drained, to ensure tty_wakeup gets called.
This fixes an issue where functions waiting for a drained buffer occasionally got stuck.
Cc: stable stable@kernel.org Closes: https://bugzilla.opensuse.org/show_bug.cgi?id=1230062 Signed-off-by: Fabian Vogt fvogt@suse.de Link: https://lore.kernel.org/r/2011735.PYKUYFuaPT@fvogt-thinkpad Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/hvc/hvc_console.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -543,10 +543,10 @@ static ssize_t hvc_write(struct tty_stru }
/* - * Racy, but harmless, kick thread if there is still pending data. + * Kick thread to flush if there's still pending data + * or to wakeup the write queue. */ - if (hp->n_outbuf) - hvc_kick(); + hvc_kick();
return written; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hugo Villeneuve hvilleneuve@dimonoff.com
commit 535fd4c98452c87537a40610abba45daf5761ec6 upstream.
When trying to set MCR[2], XON1 is incorrectly accessed instead. And when writing to the TCR register to configure flow control levels, we are incorrectly writing to the MSR register. The default value of $00 is then used for TCR, which means that selectable trigger levels in FCR are used in place of TCR.
TCR/TLR access requires EFR[4] (enable enhanced functions) and MCR[2] to be set. EFR[4] is already set in probe().
MCR access requires LCR[7] to be zero.
Since LCR is set to $BF when trying to set MCR[2], XON1 is incorrectly accessed instead because MCR shares the same address space as XON1.
Since MCR[2] is unmodified and still zero, when writing to TCR we are in fact writing to MSR because TCR/TLR registers share the same address space as MSR/SPR.
Fix by first removing useless reconfiguration of EFR[4] (enable enhanced functions), as it is already enabled in sc16is7xx_probe() since commit 43c51bb573aa ("sc16is7xx: make sure device is in suspend once probed"). Now LCR is $00, which means that MCR access is enabled.
Also remove regcache_cache_bypass() calls since we no longer access the enhanced registers set, and TCR is already declared as volatile (in fact by declaring MSR as volatile, which shares the same address).
Finally disable access to TCR/TLR registers after modifying them by clearing MCR[2].
Note: the comment about "... and internal clock div" is wrong and can be ignored/removed, as access to the internal clock divider registers (DLL/DLH) is permitted only when LCR[7] is logic 1, not when enhanced features are enabled. And DLL/DLH access is not needed in sc16is7xx_startup().
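The resulting access sequence in sc16is7xx_startup() can be sketched as the following fragment (register and helper names as used in the driver; with LCR left at $00, MCR is visible, MCR[2] opens the TCR/TLR window, and the window is closed again afterwards):

/* MCR is accessible because LCR[7] == 0; MCR[2] exposes TCR/TLR. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
                      SC16IS7XX_MCR_TCRTLR_BIT,
                      SC16IS7XX_MCR_TCRTLR_BIT);

/* Program flow-control trigger levels: halt RX at 48, resume at 24. */
sc16is7xx_port_write(port, SC16IS7XX_TCR_REG,
                     SC16IS7XX_TCR_RX_RESUME(24) |
                     SC16IS7XX_TCR_RX_HALT(48));

/* Close the window so MSR/SPR accesses are no longer redirected. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG, SC16IS7XX_MCR_TCRTLR_BIT, 0);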
Fixes: dfeae619d781 ("serial: sc16is7xx") Cc: stable@vger.kernel.org Signed-off-by: Hugo Villeneuve hvilleneuve@dimonoff.com Link: https://lore.kernel.org/r/20250731124451.1108864-1-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/sc16is7xx.c | 14 ++------------ 1 file changed, 2 insertions(+), 12 deletions(-)
--- a/drivers/tty/serial/sc16is7xx.c +++ b/drivers/tty/serial/sc16is7xx.c @@ -1163,17 +1163,6 @@ static int sc16is7xx_startup(struct uart sc16is7xx_port_write(port, SC16IS7XX_FCR_REG, SC16IS7XX_FCR_FIFO_BIT);
- /* Enable EFR */ - sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, - SC16IS7XX_LCR_CONF_MODE_B); - - regcache_cache_bypass(one->regmap, true); - - /* Enable write access to enhanced features and internal clock div */ - sc16is7xx_port_update(port, SC16IS7XX_EFR_REG, - SC16IS7XX_EFR_ENABLE_BIT, - SC16IS7XX_EFR_ENABLE_BIT); - /* Enable TCR/TLR */ sc16is7xx_port_update(port, SC16IS7XX_MCR_REG, SC16IS7XX_MCR_TCRTLR_BIT, @@ -1185,7 +1174,8 @@ static int sc16is7xx_startup(struct uart SC16IS7XX_TCR_RX_RESUME(24) | SC16IS7XX_TCR_RX_HALT(48));
- regcache_cache_bypass(one->regmap, false); + /* Disable TCR/TLR access */ + sc16is7xx_port_update(port, SC16IS7XX_MCR_REG, SC16IS7XX_MCR_TCRTLR_BIT, 0);
/* Now, initialize the UART */ sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, SC16IS7XX_LCR_WORD_LEN_8);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
commit ee047e1d85d73496541c54bd4f432c9464e13e65 upstream.
Lists should have fixed constraints, because a binding must be specific with respect to the hardware; thus add the missing constraint on the number of clocks.
Cc: stable stable@kernel.org Fixes: 88a499cd70d4 ("dt-bindings: Add support for the Broadcom UART driver") Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Link: https://lore.kernel.org/r/20250812121630.67072-2-krzysztof.kozlowski@linaro.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/serial/brcm,bcm7271-uart.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/Documentation/devicetree/bindings/serial/brcm,bcm7271-uart.yaml +++ b/Documentation/devicetree/bindings/serial/brcm,bcm7271-uart.yaml @@ -41,7 +41,7 @@ properties: - const: dma_intr2
clocks: - minItems: 1 + maxItems: 1
clock-names: const: sw_baud
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fabio Porcedda fabio.porcedda@gmail.com
commit cba70aff623b104085ab5613fedd21f6ea19095a upstream.
Add the following Telit Cinterion FN990A w/audio compositions:
0x1077: tty (diag) + adb + rmnet + audio + tty (AT/NMEA) + tty (AT) + tty (AT) + tty (AT) T: Bus=01 Lev=01 Prnt=01 Port=09 Cnt=01 Dev#= 8 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1077 Rev=05.04 S: Manufacturer=Telit Wireless Solutions S: Product=FN990 S: SerialNumber=67e04c35 C: #Ifs=10 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 8 Ivl=32ms E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 0 Cls=01(audio) Sub=01 Prot=20 Driver=snd-usb-audio I: If#= 4 Alt= 1 #EPs= 1 Cls=01(audio) Sub=02 Prot=20 Driver=snd-usb-audio E: Ad=03(O) Atr=0d(Isoc) MxPS= 68 Ivl=1ms I: If#= 5 Alt= 1 #EPs= 1 Cls=01(audio) Sub=02 Prot=20 Driver=snd-usb-audio E: Ad=84(I) Atr=0d(Isoc) MxPS= 68 Ivl=1ms I: If#= 6 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 7 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=88(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 8 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8a(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 9 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8c(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
0x1078: tty (diag) + adb + MBIM + audio + tty (AT/NMEA) + tty (AT) + tty (AT) + tty (AT) T: Bus=01 Lev=01 Prnt=01 Port=09 Cnt=01 Dev#= 21 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1078 Rev=05.04 S: Manufacturer=Telit Wireless Solutions S: Product=FN990 S: SerialNumber=67e04c35 C: #Ifs=11 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#=10 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8c(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 3 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 0 Cls=01(audio) Sub=01 Prot=20 Driver=snd-usb-audio I: If#= 5 Alt= 0 #EPs= 0 Cls=01(audio) Sub=02 Prot=20 Driver=snd-usb-audio I: If#= 6 Alt= 1 #EPs= 1 Cls=01(audio) Sub=02 Prot=20 Driver=snd-usb-audio E: Ad=84(I) Atr=0d(Isoc) MxPS= 68 Ivl=1ms I: If#= 7 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 8 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=88(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 9 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8a(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
0x1079: RNDIS + tty (diag) + adb + audio + tty (AT/NMEA) + tty (AT) + tty (AT) + tty (AT) T: Bus=01 Lev=01 Prnt=01 Port=09 Cnt=01 Dev#= 23 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1079 Rev=05.04 S: Manufacturer=Telit Wireless Solutions S: Product=FN990 S: SerialNumber=67e04c35 C: #Ifs=11 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=ef(misc ) Sub=04 Prot=01 Driver=rndis_host E: Ad=81(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#=10 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8c(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 0 Cls=01(audio) Sub=01 Prot=20 Driver=snd-usb-audio I: If#= 5 Alt= 0 #EPs= 0 Cls=01(audio) Sub=02 Prot=20 Driver=snd-usb-audio I: If#= 6 Alt= 1 #EPs= 1 Cls=01(audio) Sub=02 Prot=20 Driver=snd-usb-audio E: Ad=84(I) Atr=0d(Isoc) MxPS= 68 Ivl=1ms I: If#= 7 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 8 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=88(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 9 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8a(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
Cc: stable@vger.kernel.org Signed-off-by: Fabio Porcedda fabio.porcedda@gmail.com Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -1369,6 +1369,12 @@ static const struct usb_device_id option .driver_info = NCTRL(0) | RSVD(1) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1075, 0xff), /* Telit FN990A (PCIe) */ .driver_info = RSVD(0) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1077, 0xff), /* Telit FN990A (rmnet + audio) */ + .driver_info = NCTRL(0) | RSVD(1) | RSVD(2) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1078, 0xff), /* Telit FN990A (MBIM + audio) */ + .driver_info = NCTRL(0) | RSVD(1) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1079, 0xff), /* Telit FN990A (RNDIS + audio) */ + .driver_info = NCTRL(2) | RSVD(3) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1080, 0xff), /* Telit FE990A (rmnet) */ .driver_info = NCTRL(0) | RSVD(1) | RSVD(2) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1081, 0xff), /* Telit FE990A (MBIM) */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fabio Porcedda fabio.porcedda@gmail.com
commit a5a261bea9bf8444300d1067b4a73bedee5b5227 upstream.
Add the following Telit Cinterion LE910C4-WWX new compositions:
0x1034: tty (AT) + tty (AT) + rmnet T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 8 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1034 Rev=00.00 S: Manufacturer=Telit S: Product=LE910C4-WWX S: SerialNumber=93f617e7 C: #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=fe Prot=ff Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
0x1036: tty (AT) + tty (AT) T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 10 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1036 Rev=00.00 S: Manufacturer=Telit S: Product=LE910C4-WWX S: SerialNumber=93f617e7 C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=fe Prot=ff Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
0x1037: tty (diag) + tty (Telit custom) + tty (AT) + tty (AT) + rmnet T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 15 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1037 Rev=00.00 S: Manufacturer=Telit S: Product=LE910C4-WWX S: SerialNumber=93f617e7 C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=fe Prot=ff Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
0x1038: tty (Telit custom) + tty (AT) + tty (AT) + rmnet T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 9 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1038 Rev=00.00 S: Manufacturer=Telit S: Product=LE910C4-WWX S: SerialNumber=93f617e7 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=fe Prot=ff Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
0x103b: tty (diag) + tty (Telit custom) + tty (AT) + tty (AT) T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 10 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=103b Rev=00.00 S: Manufacturer=Telit S: Product=LE910C4-WWX S: SerialNumber=93f617e7 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=fe Prot=ff Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
0x103c: tty (Telit custom) + tty (AT) + tty (AT) T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 11 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=103c Rev=00.00 S: Manufacturer=Telit S: Product=LE910C4-WWX S: SerialNumber=93f617e7 C: #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=fe Prot=ff Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 64 Ivl=2ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Cc: stable@vger.kernel.org Signed-off-by: Fabio Porcedda fabio.porcedda@gmail.com Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -1322,7 +1322,18 @@ static const struct usb_device_id option .driver_info = NCTRL(0) | RSVD(3) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1033, 0xff), /* Telit LE910C1-EUX (ECM) */ .driver_info = NCTRL(0) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1034, 0xff), /* Telit LE910C4-WWX (rmnet) */ + .driver_info = RSVD(2) }, { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1035, 0xff) }, /* Telit LE910C4-WWX (ECM) */ + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1036, 0xff) }, /* Telit LE910C4-WWX */ + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1037, 0xff), /* Telit LE910C4-WWX (rmnet) */ + .driver_info = NCTRL(0) | NCTRL(1) | RSVD(4) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1038, 0xff), /* Telit LE910C4-WWX (rmnet) */ + .driver_info = NCTRL(0) | RSVD(3) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x103b, 0xff), /* Telit LE910C4-WWX */ + .driver_info = NCTRL(0) | NCTRL(1) }, + { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x103c, 0xff), /* Telit LE910C4-WWX */ + .driver_info = NCTRL(0) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_LE922_USBCFG0), .driver_info = RSVD(0) | RSVD(1) | NCTRL(2) | RSVD(3) }, { USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_LE922_USBCFG1),
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Torvalds torvalds@linux-foundation.org
[ Upstream commit 6f110a5e4f9977c31ce76fefbfef6fd4eab6bfb7 ]
... and don't error out so hard on missing module descriptions.
Before commit 6c6c1fc09de3 ("modpost: require a MODULE_DESCRIPTION()") we used to warn about missing module descriptions, but only when building with extra warnings (i.e. 'W=1').
After that commit the warning became an unconditional hard error.
And it turns out not all modules have been converted despite the claims to the contrary. As reported by Damian Tometzki, the slub KUnit test didn't have a module description, and apparently nobody ever really noticed.
The reason nobody noticed seems to be that the slub KUnit tests get disabled by SLUB_TINY, which also ends up disabling a lot of other code, both in tests and in slub itself. And so anybody doing full build tests didn't actually see this failure.
So let's disable SLUB_TINY for build-only tests, since it clearly ends up limiting build coverage. Also turn the missing module descriptions error back into a warning, but let's keep it around for non-'W=1' builds.
Reported-by: Damian Tometzki damian@riscv-rocks.de Link: https://lore.kernel.org/all/01070196099fd059-e8463438-7b1b-4ec8-816d-173874b... Cc: Masahiro Yamada masahiroy@kernel.org Cc: Jeff Johnson jeff.johnson@oss.qualcomm.com Fixes: 6c6c1fc09de3 ("modpost: require a MODULE_DESCRIPTION()") Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/Kconfig b/mm/Kconfig index c11cd01169e8d..046c32686fc4d 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -280,7 +280,7 @@ config SLAB
config SLUB_TINY bool "Configure SLUB for minimal memory footprint" - depends on SLUB && EXPERT + depends on SLUB && EXPERT && !COMPILE_TEST select SLAB_MERGE_DEFAULT help Configures the SLUB allocator in a way to achieve minimal memory
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Wahren wahrenst@gmx.net
[ Upstream commit 03e79de4608bdd48ad6eec272e196124cefaf798 ]
The function of_phy_find_device may return NULL, so we need to take care before dereferencing phy_dev.
Fixes: 64a632da538a ("net: fec: Fix phy_device lookup for phy_reset_after_clk_enable()") Signed-off-by: Stefan Wahren wahrenst@gmx.net Cc: Christoph Niedermaier cniedermaier@dh-electronics.com Cc: Richard Leitner richard.leitner@skidata.com Reviewed-by: Simon Horman horms@kernel.org Reviewed-by: Wei Fang wei.fang@nxp.com Link: https://patch.msgid.link/20250904091334.53965-1-wahrenst@gmx.net Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/fec_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 2a8b5429df595..8352d9b6469f2 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -2300,7 +2300,8 @@ static void fec_enet_phy_reset_after_clk_enable(struct net_device *ndev) */ phy_dev = of_phy_find_device(fep->phy_node); phy_reset_after_clk_enable(phy_dev); - put_device(&phy_dev->mdio.dev); + if (phy_dev) + put_device(&phy_dev->mdio.dev); } }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Petr Machata petrm@nvidia.com
[ Upstream commit 8625f5748fea960d2af4f3c3e9891ee8f6f80906 ]
The bridge driver currently tolerates options that it does not recognize. Instead, it should bounce them.
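A minimal sketch of the detection idea (mirroring the hunk below; the helper is hypothetical): any bit set at or above BR_BOOLOPT_MAX in the requested bitmap means userspace asked for an option this kernel does not know.

#include <linux/bitops.h>

static bool has_unknown_boolopt(unsigned long bitmap, unsigned int max_known)
{
        /* find_next_bit() returns the bitmap size if no such bit exists */
        return find_next_bit(&bitmap, BITS_PER_LONG, max_known) != BITS_PER_LONG;
}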
Fixes: a428afe82f98 ("net: bridge: add support for user-controlled bool options") Signed-off-by: Petr Machata petrm@nvidia.com Reviewed-by: Ido Schimmel idosch@nvidia.com Acked-by: Nikolay Aleksandrov razor@blackwall.org Link: https://patch.msgid.link/e6fdca3b5a8d54183fbda075daffef38bdd7ddce.1757070067... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bridge/br.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/net/bridge/br.c b/net/bridge/br.c index a6e94ceb7c9a0..a45db67197226 100644 --- a/net/bridge/br.c +++ b/net/bridge/br.c @@ -312,6 +312,13 @@ int br_boolopt_multi_toggle(struct net_bridge *br, int err = 0; int opt_id;
+ opt_id = find_next_bit(&bitmap, BITS_PER_LONG, BR_BOOLOPT_MAX); + if (opt_id != BITS_PER_LONG) { + NL_SET_ERR_MSG_FMT_MOD(extack, "Unknown boolean option %d", + opt_id); + return -EINVAL; + } + for_each_set_bit(opt_id, &bitmap, BR_BOOLOPT_MAX) { bool on = !!(bm->optval & BIT(opt_id));
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Antoine Tenart atenart@kernel.org
[ Upstream commit e3c674db356c4303804b2415e7c2b11776cdd8c3 ]
If a GSO skb is sent through a Geneve tunnel and Geneve options are added, the split GSO skb might not fit in the MTU anymore and an ICMP frag needed packet can be generated. In such a case the ICMP packet might go through the segmentation logic (and be dropped) later if it reaches a path where the GSO status is checked and segmentation is required.
This is especially true when an OvS bridge is used with a Geneve tunnel attached to it. The following set of actions could lead to the ICMP packet being wrongfully segmented:
1. An skb is constructed by the TCP layer (e.g. gso_type SKB_GSO_TCPV4, segs >= 2).
2. The skb hits the OvS bridge where Geneve options are added by an OvS action before being sent through the tunnel.
3. When the skb is xmited in the tunnel, the split skb does not fit anymore in the MTU and iptunnel_pmtud_build_icmp is called to generate an ICMP fragmentation needed packet. This is done by reusing the original (GSO!) skb. The GSO metadata is not cleared.
4. The ICMP packet being sent back hits the OvS bridge again and because skb_is_gso returns true, it goes through queue_gso_packets...
5. ...where __skb_gso_segment is called. The skb is then dropped.
6. Note that in the above example on re-transmission the skb won't be a GSO one as it would be segmented (len > MSS) and the ICMP packet should go through.
Fix this by resetting the GSO information before reusing an skb in iptunnel_pmtud_build_icmp and iptunnel_pmtud_build_icmpv6.
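The fix amounts to the following guard before the original skb is recycled into the ICMP reply (sketch; skb_gso_reset() clears the GSO fields in the skb's shared info so later skb_is_gso() checks see a plain packet):

#include <linux/skbuff.h>

static void drop_stale_gso_state(struct sk_buff *skb)
{
        /* the ICMP error is a single small packet; forget the GSO metadata
         * inherited from the original oversized skb */
        if (skb_is_gso(skb))
                skb_gso_reset(skb);
}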
Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets") Reported-by: Adrian Moreno amorenoz@redhat.com Signed-off-by: Antoine Tenart atenart@kernel.org Reviewed-by: Stefano Brivio sbrivio@redhat.com Link: https://patch.msgid.link/20250904125351.159740-1-atenart@kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/ip_tunnel_core.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c index deb08cab44640..75e3d7501752d 100644 --- a/net/ipv4/ip_tunnel_core.c +++ b/net/ipv4/ip_tunnel_core.c @@ -203,6 +203,9 @@ static int iptunnel_pmtud_build_icmp(struct sk_buff *skb, int mtu) if (!pskb_may_pull(skb, ETH_HLEN + sizeof(struct iphdr))) return -EINVAL;
+ if (skb_is_gso(skb)) + skb_gso_reset(skb); + skb_copy_bits(skb, skb_mac_offset(skb), &eh, ETH_HLEN); pskb_pull(skb, ETH_HLEN); skb_reset_network_header(skb); @@ -297,6 +300,9 @@ static int iptunnel_pmtud_build_icmpv6(struct sk_buff *skb, int mtu) if (!pskb_may_pull(skb, ETH_HLEN + sizeof(struct ipv6hdr))) return -EINVAL;
+ if (skb_is_gso(skb)) + skb_gso_reset(skb); + skb_copy_bits(skb, skb_mac_offset(skb), &eh, ETH_HLEN); pskb_pull(skb, ETH_HLEN); skb_reset_network_header(skb);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Tran alex.t.tran@gmail.com
[ Upstream commit 641427d5bf90af0625081bf27555418b101274cd ]
The documentation of the 'bcm_msg_head' struct does not match how it is defined in 'bcm.h'. Changed the frames member to a flexible array, matching the definition in the header file.
See commit 94dfc73e7cf4 ("treewide: uapi: Replace zero-length arrays with flexible-array members")
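With the flexible-array member, user space sizes and fills a BCM message roughly like this (illustrative sketch assuming a connected CAN_BCM socket; variable and function names are made up):

#include <linux/can.h>
#include <linux/can/bcm.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

static int bcm_send_one_frame(int sock, const struct can_frame *cf)
{
        size_t len = sizeof(struct bcm_msg_head) + sizeof(struct can_frame);
        struct bcm_msg_head *msg = calloc(1, len);
        ssize_t ret;

        if (!msg)
                return -1;

        msg->opcode  = TX_SEND;
        msg->can_id  = cf->can_id;
        msg->nframes = 1;
        memcpy(msg->frames, cf, sizeof(*cf)); /* frames[] is the flexible array */

        ret = send(sock, msg, len, 0);
        free(msg);
        return ret < 0 ? -1 : 0;
}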
Signed-off-by: Alex Tran alex.t.tran@gmail.com Acked-by: Oliver Hartkopp socketcan@hartkopp.net Link: https://patch.msgid.link/20250904031709.1426895-1-alex.t.tran@gmail.com Fixes: 94dfc73e7cf4 ("treewide: uapi: Replace zero-length arrays with flexible-array members") Link: https://bugzilla.kernel.org/show_bug.cgi?id=217783 Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/networking/can.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/networking/can.rst b/Documentation/networking/can.rst index d7e1ada905b2d..3bdd155838105 100644 --- a/Documentation/networking/can.rst +++ b/Documentation/networking/can.rst @@ -740,7 +740,7 @@ The broadcast manager sends responses to user space in the same form: struct timeval ival1, ival2; /* count and subsequent interval */ canid_t can_id; /* unique can_id for task */ __u32 nframes; /* number of can_frames following */ - struct can_frame frames[0]; + struct can_frame frames[]; };
The aligned payload 'frames' uses the same basic CAN frame structure defined
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kohei Enju enjuk@amazon.com
[ Upstream commit d709f178abca22a4d3642513df29afe4323a594b ]
The igb driver incorrectly skips the link test when the network interface is admin down (if_running == false), causing the test to always report PASS regardless of the actual physical link state.
This behavior is inconsistent with other drivers (e.g. i40e, ice, ixgbe, etc.) which correctly test the physical link state regardless of admin state. Remove the if_running check to ensure link test always reflects the physical link state.
Fixes: 8d420a1b3ea6 ("igb: correct link test not being run when link is down") Signed-off-by: Kohei Enju enjuk@amazon.com Reviewed-by: Paul Menzel pmenzel@molgen.mpg.de Tested-by: Rinitha S sx.rinitha@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igb/igb_ethtool.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c b/drivers/net/ethernet/intel/igb/igb_ethtool.c index 92b2be06a6e93..f11cba65e5d85 100644 --- a/drivers/net/ethernet/intel/igb/igb_ethtool.c +++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c @@ -2081,11 +2081,8 @@ static void igb_diag_test(struct net_device *netdev, } else { dev_info(&adapter->pdev->dev, "online testing starting\n");
- /* PHY is powered down when interface is down */ - if (if_running && igb_link_test(adapter, &data[TEST_LINK])) + if (igb_link_test(adapter, &data[TEST_LINK])) eth_test->flags |= ETH_TEST_FL_FAILED; - else - data[TEST_LINK] = 0;
/* Online tests aren't run; pass by default */ data[TEST_REG] = 0;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Schmidt mschmidt@redhat.com
[ Upstream commit 915470e1b44e71d1dd07ee067276f003c3521ee3 ]
If request_irq() in i40e_vsi_request_irq_msix() fails in an iteration later than the first, the error path wants to free the IRQs requested so far. However, it uses the wrong dev_id argument for free_irq(), so it does not free the IRQs correctly and instead triggers the warning:
Trying to free already-free IRQ 173
WARNING: CPU: 25 PID: 1091 at kernel/irq/manage.c:1829 __free_irq+0x192/0x2c0
Modules linked in: i40e(+) [...]
CPU: 25 UID: 0 PID: 1091 Comm: NetworkManager Not tainted 6.17.0-rc1+ #1 PREEMPT(lazy)
Hardware name: [...]
RIP: 0010:__free_irq+0x192/0x2c0
[...]
Call Trace:
 <TASK>
 free_irq+0x32/0x70
 i40e_vsi_request_irq_msix.cold+0x63/0x8b [i40e]
 i40e_vsi_request_irq+0x79/0x80 [i40e]
 i40e_vsi_open+0x21f/0x2f0 [i40e]
 i40e_open+0x63/0x130 [i40e]
 __dev_open+0xfc/0x210
 __dev_change_flags+0x1fc/0x240
 netif_change_flags+0x27/0x70
 do_setlink.isra.0+0x341/0xc70
 rtnl_newlink+0x468/0x860
 rtnetlink_rcv_msg+0x375/0x450
 netlink_rcv_skb+0x5c/0x110
 netlink_unicast+0x288/0x3c0
 netlink_sendmsg+0x20d/0x430
 ____sys_sendmsg+0x3a2/0x3d0
 ___sys_sendmsg+0x99/0xe0
 __sys_sendmsg+0x8a/0xf0
 do_syscall_64+0x82/0x2c0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
 [...]
 </TASK>
---[ end trace 0000000000000000 ]---
Use the same dev_id for free_irq() as for request_irq().
I tested this with inserting code to fail intentionally.
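The rule being restored is generic (sketch, not the i40e code; names are illustrative): free_irq() must receive the exact same dev_id cookie that was passed to request_irq(), otherwise the kernel cannot match the handler and warns about freeing an already-free IRQ.

#include <linux/interrupt.h>

static int setup_queue_vector(unsigned int irq, irq_handler_t handler,
                              void *q_vector)
{
        int err = request_irq(irq, handler, 0, "example-queue", q_vector);

        if (err)
                return err;

        /* ... on teardown or error unwind, use the same cookie ... */
        free_irq(irq, q_vector); /* not a pointer to the slot holding q_vector */
        return 0;
}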
Fixes: 493fb30011b3 ("i40e: Move q_vectors from pointer to array to array of pointers") Signed-off-by: Michal Schmidt mschmidt@redhat.com Reviewed-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Reviewed-by: Subbaraya Sundeep sbhatta@marvell.com Tested-by: Rinitha S sx.rinitha@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index b749aa3e783ff..72869336e3a9a 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -4210,7 +4210,7 @@ static int i40e_vsi_request_irq_msix(struct i40e_vsi *vsi, char *basename) irq_num = pf->msix_entries[base + vector].vector; irq_set_affinity_notifier(irq_num, NULL); irq_update_affinity_hint(irq_num, NULL); - free_irq(irq_num, &vsi->q_vectors[vector]); + free_irq(irq_num, vsi->q_vectors[vector]); } return err; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
[ Upstream commit f214744c8a27c3c1da6b538c232da22cd027530e ]
Commit 25fe97cb7620 ("can: j1939: move j1939_priv_put() into sk_destruct callback") expects that a call to j1939_priv_put() can be unconditionally delayed until j1939_sk_sock_destruct() is called. But a refcount leak will happen when j1939_sk_bind() is called again after j1939_local_ecu_get() from the previous j1939_sk_bind() call returned an error. We need to call j1939_priv_put() before j1939_sk_bind() returns an error.
Fixes: 25fe97cb7620 ("can: j1939: move j1939_priv_put() into sk_destruct callback") Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Tested-by: Oleksij Rempel o.rempel@pengutronix.de Acked-by: Oleksij Rempel o.rempel@pengutronix.de Link: https://patch.msgid.link/4f49a1bc-a528-42ad-86c0-187268ab6535@I-love.SAKURA.... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/can/j1939/socket.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c index cf9a12d8da6f9..7bf4d4fb96735 100644 --- a/net/can/j1939/socket.c +++ b/net/can/j1939/socket.c @@ -520,6 +520,9 @@ static int j1939_sk_bind(struct socket *sock, struct sockaddr *uaddr, int len) ret = j1939_local_ecu_get(priv, jsk->addr.src_name, jsk->addr.sa); if (ret) { j1939_netdev_stop(priv); + jsk->priv = NULL; + synchronize_rcu(); + j1939_priv_put(priv); goto out_release_sock; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
[ Upstream commit 06e02da29f6f1a45fc07bd60c7eaf172dc21e334 ]
j1939_sk_bind() and j1939_sk_release() call j1939_local_ecu_put() only when J1939_SOCK_BOUND has already been set, but the error handling path of j1939_sk_bind() will not set J1939_SOCK_BOUND when j1939_local_ecu_get() fails. Therefore j1939_local_ecu_get() needs to undo priv->ents[sa].nusers++ itself when it returns an error.
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Tested-by: Oleksij Rempel o.rempel@pengutronix.de Acked-by: Oleksij Rempel o.rempel@pengutronix.de Link: https://patch.msgid.link/e7f80046-4ff7-4ce2-8ad8-7c3c678a42c9@I-love.SAKURA.... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/can/j1939/bus.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/can/j1939/bus.c b/net/can/j1939/bus.c index 4866879016021..e0b966c2517cf 100644 --- a/net/can/j1939/bus.c +++ b/net/can/j1939/bus.c @@ -290,8 +290,11 @@ int j1939_local_ecu_get(struct j1939_priv *priv, name_t name, u8 sa) if (!ecu) ecu = j1939_ecu_create_locked(priv, name); err = PTR_ERR_OR_ZERO(ecu); - if (err) + if (err) { + if (j1939_address_is_unicast(sa)) + priv->ents[sa].nusers--; goto done; + }
ecu->nusers++; /* TODO: do we care if ecu->addr != sa? */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anssi Hannula anssi.hannula@bitwise.fi
[ Upstream commit ef79f00be72bd81d2e1e6f060d83cf7e425deee4 ]
can_put_echo_skb() takes ownership of the SKB and it may be freed during or after the call.
However, xilinx_can xcan_write_frame() keeps using SKB after the call.
Fix that by only calling can_put_echo_skb() after the code is done touching the SKB.
The tx_lock is held for the entire xcan_write_frame() execution and also on the can_get_echo_skb() side so the order of operations does not matter.
An earlier fix commit 3d3c817c3a40 ("can: xilinx_can: Fix usage of skb memory") did not move the can_put_echo_skb() call far enough.
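A minimal user-space sketch (assumed toy names, not the xilinx_can code) of the ownership rule this fix enforces: once the SKB has been handed to can_put_echo_skb() it may be freed, so every access to it has to happen before the hand-off.

	#include <stdlib.h>
	#include <string.h>

	struct frame { char data[8]; };

	/* Takes ownership of 'f' and may free it immediately, just as the
	 * SKB may be freed during or after can_put_echo_skb(). */
	static void hand_off(struct frame *f)
	{
		free(f);
	}

	int main(void)
	{
		struct frame *f = calloc(1, sizeof(*f));

		if (!f)
			return 1;
		memcpy(f->data, "payload", 8);   /* finish all writes first ...  */
		hand_off(f);                     /* ... then give ownership away */
		/* Writing to f->data here would be a use-after-free, which is
		 * exactly what moving can_put_echo_skb() to the end avoids. */
		return 0;
	}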
Signed-off-by: Anssi Hannula anssi.hannula@bitwise.fi Fixes: 1598efe57b3e ("can: xilinx_can: refactor code in preparation for CAN FD support") Link: https://patch.msgid.link/20250822095002.168389-1-anssi.hannula@bitwise.fi [mkl: add "commit" in front of sha1 in patch description] [mkl: fix indention] Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/xilinx_can.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c index abe58f1030433..57d1209134f11 100644 --- a/drivers/net/can/xilinx_can.c +++ b/drivers/net/can/xilinx_can.c @@ -628,14 +628,6 @@ static void xcan_write_frame(struct net_device *ndev, struct sk_buff *skb, dlc |= XCAN_DLCR_EDL_MASK; }
- if (!(priv->devtype.flags & XCAN_FLAG_TX_MAILBOXES) && - (priv->devtype.flags & XCAN_FLAG_TXFEMP)) - can_put_echo_skb(skb, ndev, priv->tx_head % priv->tx_max, 0); - else - can_put_echo_skb(skb, ndev, 0, 0); - - priv->tx_head++; - priv->write_reg(priv, XCAN_FRAME_ID_OFFSET(frame_offset), id); /* If the CAN frame is RTR frame this write triggers transmission * (not on CAN FD) @@ -668,6 +660,14 @@ static void xcan_write_frame(struct net_device *ndev, struct sk_buff *skb, data[1]); } } + + if (!(priv->devtype.flags & XCAN_FLAG_TX_MAILBOXES) && + (priv->devtype.flags & XCAN_FLAG_TXFEMP)) + can_put_echo_skb(skb, ndev, priv->tx_head % priv->tx_max, 0); + else + can_put_echo_skb(skb, ndev, 0, 0); + + priv->tx_head++; }
/**
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Murali Karicheri m-karicheri2@ti.com
[ Upstream commit 36b20fcdd9663ced36d3aef96f0eff8eb79de4b8 ]
When the MC (multicast) list is updated by the networking layer due to a user command, as well as when the allmulti flag is set, it needs to be passed to the enslaved Ethernet devices. This patch allows this to happen by implementing the ndo_change_rx_flags() and ndo_set_rx_mode() API calls, which in turn pass it to the slave devices using existing API calls.
Signed-off-by: Murali Karicheri m-karicheri2@ti.com Signed-off-by: Ravi Gunasekaran r-gunasekaran@ti.com Reviewed-by: Wojciech Drewek wojciech.drewek@intel.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: 8884c6939913 ("hsr: use rtnl lock when iterating over ports") Signed-off-by: Sasha Levin sashal@kernel.org --- net/hsr/hsr_device.c | 67 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 66 insertions(+), 1 deletion(-)
diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c index 906c38b9d66ff..cff37637c96d3 100644 --- a/net/hsr/hsr_device.c +++ b/net/hsr/hsr_device.c @@ -170,7 +170,24 @@ static int hsr_dev_open(struct net_device *dev)
static int hsr_dev_close(struct net_device *dev) { - /* Nothing to do here. */ + struct hsr_port *port; + struct hsr_priv *hsr; + + hsr = netdev_priv(dev); + hsr_for_each_port(hsr, port) { + if (port->type == HSR_PT_MASTER) + continue; + switch (port->type) { + case HSR_PT_SLAVE_A: + case HSR_PT_SLAVE_B: + dev_uc_unsync(port->dev, dev); + dev_mc_unsync(port->dev, dev); + break; + default: + break; + } + } + return 0; }
@@ -401,12 +418,60 @@ void hsr_del_ports(struct hsr_priv *hsr) hsr_del_port(port); }
+static void hsr_set_rx_mode(struct net_device *dev) +{ + struct hsr_port *port; + struct hsr_priv *hsr; + + hsr = netdev_priv(dev); + + hsr_for_each_port(hsr, port) { + if (port->type == HSR_PT_MASTER) + continue; + switch (port->type) { + case HSR_PT_SLAVE_A: + case HSR_PT_SLAVE_B: + dev_mc_sync_multiple(port->dev, dev); + dev_uc_sync_multiple(port->dev, dev); + break; + default: + break; + } + } +} + +static void hsr_change_rx_flags(struct net_device *dev, int change) +{ + struct hsr_port *port; + struct hsr_priv *hsr; + + hsr = netdev_priv(dev); + + hsr_for_each_port(hsr, port) { + if (port->type == HSR_PT_MASTER) + continue; + switch (port->type) { + case HSR_PT_SLAVE_A: + case HSR_PT_SLAVE_B: + if (change & IFF_ALLMULTI) + dev_set_allmulti(port->dev, + dev->flags & + IFF_ALLMULTI ? 1 : -1); + break; + default: + break; + } + } +} + static const struct net_device_ops hsr_device_ops = { .ndo_change_mtu = hsr_dev_change_mtu, .ndo_open = hsr_dev_open, .ndo_stop = hsr_dev_close, .ndo_start_xmit = hsr_dev_xmit, + .ndo_change_rx_flags = hsr_change_rx_flags, .ndo_fix_features = hsr_fix_features, + .ndo_set_rx_mode = hsr_set_rx_mode, };
static struct device_type hsr_type = {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Murali Karicheri m-karicheri2@ti.com
[ Upstream commit 1a8a63a5305e95519de6f941922dfcd8179f82e5 ]
This patch adds support for VLAN ctag based filtering at the slave devices. The slave Ethernet device may be capable of filtering Ethernet packets based on VLAN ID. This requires that when the VLAN interface is created over an HSR/PRP interface, it passes the VID information to the associated slave Ethernet devices so that they update their hardware filters to filter Ethernet frames based on VID. This patch adds the required functions to propagate the vid information to the slave devices.
Signed-off-by: Murali Karicheri m-karicheri2@ti.com Signed-off-by: MD Danish Anwar danishanwar@ti.com Reviewed-by: Jiri Pirko jiri@nvidia.com Link: https://patch.msgid.link/20241106091710.3308519-3-danishanwar@ti.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 8884c6939913 ("hsr: use rtnl lock when iterating over ports") Signed-off-by: Sasha Levin sashal@kernel.org --- net/hsr/hsr_device.c | 80 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 79 insertions(+), 1 deletion(-)
diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c index cff37637c96d3..69f6c704352de 100644 --- a/net/hsr/hsr_device.c +++ b/net/hsr/hsr_device.c @@ -464,6 +464,77 @@ static void hsr_change_rx_flags(struct net_device *dev, int change) } }
+static int hsr_ndo_vlan_rx_add_vid(struct net_device *dev, + __be16 proto, u16 vid) +{ + bool is_slave_a_added = false; + bool is_slave_b_added = false; + struct hsr_port *port; + struct hsr_priv *hsr; + int ret = 0; + + hsr = netdev_priv(dev); + + hsr_for_each_port(hsr, port) { + if (port->type == HSR_PT_MASTER || + port->type == HSR_PT_INTERLINK) + continue; + + ret = vlan_vid_add(port->dev, proto, vid); + switch (port->type) { + case HSR_PT_SLAVE_A: + if (ret) { + /* clean up Slave-B */ + netdev_err(dev, "add vid failed for Slave-A\n"); + if (is_slave_b_added) + vlan_vid_del(port->dev, proto, vid); + return ret; + } + + is_slave_a_added = true; + break; + + case HSR_PT_SLAVE_B: + if (ret) { + /* clean up Slave-A */ + netdev_err(dev, "add vid failed for Slave-B\n"); + if (is_slave_a_added) + vlan_vid_del(port->dev, proto, vid); + return ret; + } + + is_slave_b_added = true; + break; + default: + break; + } + } + + return 0; +} + +static int hsr_ndo_vlan_rx_kill_vid(struct net_device *dev, + __be16 proto, u16 vid) +{ + struct hsr_port *port; + struct hsr_priv *hsr; + + hsr = netdev_priv(dev); + + hsr_for_each_port(hsr, port) { + switch (port->type) { + case HSR_PT_SLAVE_A: + case HSR_PT_SLAVE_B: + vlan_vid_del(port->dev, proto, vid); + break; + default: + break; + } + } + + return 0; +} + static const struct net_device_ops hsr_device_ops = { .ndo_change_mtu = hsr_dev_change_mtu, .ndo_open = hsr_dev_open, @@ -472,6 +543,8 @@ static const struct net_device_ops hsr_device_ops = { .ndo_change_rx_flags = hsr_change_rx_flags, .ndo_fix_features = hsr_fix_features, .ndo_set_rx_mode = hsr_set_rx_mode, + .ndo_vlan_rx_add_vid = hsr_ndo_vlan_rx_add_vid, + .ndo_vlan_rx_kill_vid = hsr_ndo_vlan_rx_kill_vid, };
static struct device_type hsr_type = { @@ -512,7 +585,8 @@ void hsr_dev_setup(struct net_device *dev)
dev->hw_features = NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HIGHDMA | NETIF_F_GSO_MASK | NETIF_F_HW_CSUM | - NETIF_F_HW_VLAN_CTAG_TX; + NETIF_F_HW_VLAN_CTAG_TX | + NETIF_F_HW_VLAN_CTAG_FILTER;
dev->features = dev->hw_features;
@@ -598,6 +672,10 @@ int hsr_dev_finalize(struct net_device *hsr_dev, struct net_device *slave[2], (slave[1]->features & NETIF_F_HW_HSR_FWD)) hsr->fwd_offloaded = true;
+ if ((slave[0]->features & NETIF_F_HW_VLAN_CTAG_FILTER) && + (slave[1]->features & NETIF_F_HW_VLAN_CTAG_FILTER)) + hsr_dev->features |= NETIF_F_HW_VLAN_CTAG_FILTER; + res = register_netdevice(hsr_dev); if (res) goto err_unregister;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 8884c693991333ae065830554b9b0c96590b1bb2 ]
hsr_for_each_port is called in many places without holding the RCU read lock, which may trigger warnings on debug kernels. Most of the callers actually hold the rtnl lock. So add a new helper hsr_for_each_port_rtnl to allow callers in suitable contexts to iterate ports safely without explicit RCU locking.
This patch only fixes the callers that hold the rtnl lock. Other caller issues will be fixed in later patches.
Fixes: c5a759117210 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20250905091533.377443-2-liuhangbin@gmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/hsr/hsr_device.c | 18 +++++++++--------- net/hsr/hsr_main.c | 2 +- net/hsr/hsr_main.h | 3 +++ 3 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c index 69f6c704352de..a6fc3d7b02224 100644 --- a/net/hsr/hsr_device.c +++ b/net/hsr/hsr_device.c @@ -59,7 +59,7 @@ static bool hsr_check_carrier(struct hsr_port *master)
ASSERT_RTNL();
- hsr_for_each_port(master->hsr, port) { + hsr_for_each_port_rtnl(master->hsr, port) { if (port->type != HSR_PT_MASTER && is_slave_up(port->dev)) { netif_carrier_on(master->dev); return true; @@ -109,7 +109,7 @@ int hsr_get_max_mtu(struct hsr_priv *hsr) struct hsr_port *port;
mtu_max = ETH_DATA_LEN; - hsr_for_each_port(hsr, port) + hsr_for_each_port_rtnl(hsr, port) if (port->type != HSR_PT_MASTER) mtu_max = min(port->dev->mtu, mtu_max);
@@ -144,7 +144,7 @@ static int hsr_dev_open(struct net_device *dev) hsr = netdev_priv(dev); designation = '\0';
- hsr_for_each_port(hsr, port) { + hsr_for_each_port_rtnl(hsr, port) { if (port->type == HSR_PT_MASTER) continue; switch (port->type) { @@ -174,7 +174,7 @@ static int hsr_dev_close(struct net_device *dev) struct hsr_priv *hsr;
hsr = netdev_priv(dev); - hsr_for_each_port(hsr, port) { + hsr_for_each_port_rtnl(hsr, port) { if (port->type == HSR_PT_MASTER) continue; switch (port->type) { @@ -207,7 +207,7 @@ static netdev_features_t hsr_features_recompute(struct hsr_priv *hsr, * may become enabled. */ features &= ~NETIF_F_ONE_FOR_ALL; - hsr_for_each_port(hsr, port) + hsr_for_each_port_rtnl(hsr, port) features = netdev_increment_features(features, port->dev->features, mask); @@ -425,7 +425,7 @@ static void hsr_set_rx_mode(struct net_device *dev)
hsr = netdev_priv(dev);
- hsr_for_each_port(hsr, port) { + hsr_for_each_port_rtnl(hsr, port) { if (port->type == HSR_PT_MASTER) continue; switch (port->type) { @@ -447,7 +447,7 @@ static void hsr_change_rx_flags(struct net_device *dev, int change)
hsr = netdev_priv(dev);
- hsr_for_each_port(hsr, port) { + hsr_for_each_port_rtnl(hsr, port) { if (port->type == HSR_PT_MASTER) continue; switch (port->type) { @@ -475,7 +475,7 @@ static int hsr_ndo_vlan_rx_add_vid(struct net_device *dev,
hsr = netdev_priv(dev);
- hsr_for_each_port(hsr, port) { + hsr_for_each_port_rtnl(hsr, port) { if (port->type == HSR_PT_MASTER || port->type == HSR_PT_INTERLINK) continue; @@ -521,7 +521,7 @@ static int hsr_ndo_vlan_rx_kill_vid(struct net_device *dev,
hsr = netdev_priv(dev);
- hsr_for_each_port(hsr, port) { + hsr_for_each_port_rtnl(hsr, port) { switch (port->type) { case HSR_PT_SLAVE_A: case HSR_PT_SLAVE_B: diff --git a/net/hsr/hsr_main.c b/net/hsr/hsr_main.c index 257b50124cee5..c325ddad539a7 100644 --- a/net/hsr/hsr_main.c +++ b/net/hsr/hsr_main.c @@ -22,7 +22,7 @@ static bool hsr_slave_empty(struct hsr_priv *hsr) { struct hsr_port *port;
- hsr_for_each_port(hsr, port) + hsr_for_each_port_rtnl(hsr, port) if (port->type != HSR_PT_MASTER) return false; return true; diff --git a/net/hsr/hsr_main.h b/net/hsr/hsr_main.h index 18e01791ad799..2fcabe39e61f4 100644 --- a/net/hsr/hsr_main.h +++ b/net/hsr/hsr_main.h @@ -221,6 +221,9 @@ struct hsr_priv { #define hsr_for_each_port(hsr, port) \ list_for_each_entry_rcu((port), &(hsr)->ports, port_list)
+#define hsr_for_each_port_rtnl(hsr, port) \ + list_for_each_entry_rcu((port), &(hsr)->ports, port_list, lockdep_rtnl_is_held()) + struct hsr_port *hsr_port_get_hsr(struct hsr_priv *hsr, enum hsr_port_type pt);
/* Caller must ensure skb is a valid HSR frame */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 393c841fe4333cdd856d0ca37b066d72746cfaa6 ]
hsr_port_get_hsr() iterates over ports using hsr_for_each_port(), but many of its callers do not hold the required RCU lock.
Switch to hsr_for_each_port_rtnl(), since most callers already hold the rtnl lock. After review, all callers are covered by either the rtnl lock or the RCU lock, except hsr_dev_xmit(). Fix this by adding an RCU read lock there.
Fixes: c5a759117210 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20250905091533.377443-3-liuhangbin@gmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/hsr/hsr_device.c | 3 +++ net/hsr/hsr_main.c | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c index a6fc3d7b02224..5514b5bedc929 100644 --- a/net/hsr/hsr_device.c +++ b/net/hsr/hsr_device.c @@ -228,6 +228,7 @@ static netdev_tx_t hsr_dev_xmit(struct sk_buff *skb, struct net_device *dev) struct hsr_priv *hsr = netdev_priv(dev); struct hsr_port *master;
+ rcu_read_lock(); master = hsr_port_get_hsr(hsr, HSR_PT_MASTER); if (master) { skb->dev = master->dev; @@ -240,6 +241,8 @@ static netdev_tx_t hsr_dev_xmit(struct sk_buff *skb, struct net_device *dev) dev_core_stats_tx_dropped_inc(dev); dev_kfree_skb_any(skb); } + rcu_read_unlock(); + return NETDEV_TX_OK; }
diff --git a/net/hsr/hsr_main.c b/net/hsr/hsr_main.c index c325ddad539a7..76a1958609e29 100644 --- a/net/hsr/hsr_main.c +++ b/net/hsr/hsr_main.c @@ -125,7 +125,7 @@ struct hsr_port *hsr_port_get_hsr(struct hsr_priv *hsr, enum hsr_port_type pt) { struct hsr_port *port;
- hsr_for_each_port(hsr, port) + hsr_for_each_port_rtnl(hsr, port) if (port->type == pt) return port; return NULL;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yi Sun yi.sun@intel.com
[ Upstream commit f41c538881eec4dcf5961a242097d447f848cda6 ]
The call to idxd_free() introduces a duplicate put_device() leading to a reference count underflow: refcount_t: underflow; use-after-free. WARNING: CPU: 15 PID: 4428 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110 ... Call Trace: <TASK> idxd_remove+0xe4/0x120 [idxd] pci_device_remove+0x3f/0xb0 device_release_driver_internal+0x197/0x200 driver_detach+0x48/0x90 bus_remove_driver+0x74/0xf0 pci_unregister_driver+0x2e/0xb0 idxd_exit_module+0x34/0x7a0 [idxd] __do_sys_delete_module.constprop.0+0x183/0x280 do_syscall_64+0x54/0xd70 entry_SYSCALL_64_after_hwframe+0x76/0x7e
idxd_unregister_devices(), which is invoked at the very beginning of idxd_remove(), already takes care of the necessary put_device() through the following call path: idxd_unregister_devices() -> device_unregister() -> put_device()
In addition, when CONFIG_DEBUG_KOBJECT_RELEASE is enabled, put_device() may trigger asynchronous cleanup via schedule_delayed_work(). If idxd_free() is called immediately after, it can result in a use-after-free.
Remove the improper idxd_free() to avoid both the refcount underflow and potential memory corruption during module unload.
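A minimal user-space sketch (toy refcount, not the kernel's refcount_t or the idxd code) of the double-put problem: the registered device already holds the last reference, so a second put underflows the count and touches memory that has already been freed.

	#include <assert.h>
	#include <stdlib.h>

	struct obj { int refs; };

	static void obj_put(struct obj *o)
	{
		assert(o->refs > 0 && "refcount underflow");
		if (--o->refs == 0)
			free(o);
	}

	int main(void)
	{
		struct obj *o = calloc(1, sizeof(*o));

		if (!o)
			return 1;
		o->refs = 1;     /* last reference, held by the registered device */

		obj_put(o);      /* the unregister path drops it and frees 'o'    */
		/* obj_put(o);      a second put here would trip the assertion
		 *                  (underflow) and touch freed memory            */
		return 0;
	}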
Fixes: d5449ff1b04d ("dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove call") Signed-off-by: Yi Sun yi.sun@intel.com Tested-by: Shuai Xue xueshuai@linux.alibaba.com Reviewed-by: Dave Jiang dave.jiang@intel.com Acked-by: Vinicius Costa Gomes vinicius.gomes@intel.com
Link: https://lore.kernel.org/r/20250729150313.1934101-2-yi.sun@intel.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/idxd/init.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c index 92e86ae9db29d..3a78973882211 100644 --- a/drivers/dma/idxd/init.c +++ b/drivers/dma/idxd/init.c @@ -907,7 +907,6 @@ static void idxd_remove(struct pci_dev *pdev) idxd_cleanup(idxd); pci_iounmap(pdev, idxd->reg_base); put_device(idxd_confdev(idxd)); - idxd_free(idxd); pci_disable_device(pdev); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yi Sun yi.sun@intel.com
[ Upstream commit b7cb9a034305d52222433fad10c3de10204f29e7 ]
A recent refactor introduced a misplaced put_device() call, resulting in a reference count underflow during module unload.
There is no need to add additional put_device() calls for idxd groups, engines, or workqueues. Although the commit claims: "Note, this also fixes the missing put_device() for idxd groups, engines, and wqs."
It appears no such omission actually existed. The required cleanup is already handled by the call chain: idxd_unregister_devices() -> device_unregister() -> put_device()
Extend idxd_cleanup() to handle the remaining necessary cleanup and remove idxd_cleanup_internals(), which duplicates deallocation logic for idxd, engines, groups, and workqueues. Memory management is also properly handled through the Linux device model.
Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper") Signed-off-by: Yi Sun yi.sun@intel.com Tested-by: Shuai Xue xueshuai@linux.alibaba.com Reviewed-by: Dave Jiang dave.jiang@intel.com Acked-by: Vinicius Costa Gomes vinicius.gomes@intel.com
Link: https://lore.kernel.org/r/20250729150313.1934101-3-yi.sun@intel.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/idxd/init.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c index 3a78973882211..163a276d2670e 100644 --- a/drivers/dma/idxd/init.c +++ b/drivers/dma/idxd/init.c @@ -904,7 +904,10 @@ static void idxd_remove(struct pci_dev *pdev) device_unregister(idxd_confdev(idxd)); idxd_shutdown(pdev); idxd_device_remove_debugfs(idxd); - idxd_cleanup(idxd); + perfmon_pmu_remove(idxd); + idxd_cleanup_interrupts(idxd); + if (device_pasid_enabled(idxd)) + idxd_disable_system_pasid(idxd); pci_iounmap(pdev, idxd->reg_base); put_device(idxd_confdev(idxd)); pci_disable_device(pdev);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 39aaa337449e71a41d4813be0226a722827ba606 ]
The clean up in idxd_setup_wqs() has had a couple bugs because the error handling is a bit subtle. It's simpler to just re-write it in a cleaner way. The issues here are:
1) If "idxd->max_wqs" is <= 0 then we call put_device(conf_dev) when "conf_dev" hasn't been initialized. 2) If kzalloc_node() fails then again "conf_dev" is invalid. It's either uninitialized or it points to the "conf_dev" from the previous iteration so it leads to a double free.
It's better to free partial loop iterations within the loop and then the unwinding at the end can handle whole loop iterations. I also renamed the labels to describe what the goto does and not where the goto was located.
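A generic user-space sketch (assumed names, not the idxd code) of the unwind pattern described above: a failure inside iteration i first frees that partial iteration, so the err_unwind label only has to undo complete iterations with a while (--i >= 0) loop.

	#include <stdlib.h>

	struct wq { void *cfg; };

	static int setup_wqs(struct wq **wqs, int n)
	{
		int i;

		for (i = 0; i < n; i++) {
			struct wq *wq = calloc(1, sizeof(*wq));

			if (!wq)
				goto err_unwind;
			wq->cfg = calloc(1, 64);
			if (!wq->cfg) {
				free(wq);          /* undo the partial iteration */
				goto err_unwind;
			}
			wqs[i] = wq;
		}
		return 0;

	err_unwind:                                /* only complete iterations left */
		while (--i >= 0) {
			free(wqs[i]->cfg);
			free(wqs[i]);
		}
		return -1;
	}

	int main(void)
	{
		struct wq *wqs[4] = { 0 };

		return setup_wqs(wqs, 4) ? 1 : 0;
	}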
Fixes: 3fd2f4bc010c ("dmaengine: idxd: fix memory leak in error handling path of idxd_setup_wqs") Reported-by: Colin Ian King colin.i.king@gmail.com Closes: https://lore.kernel.org/all/20250811095836.1642093-1-colin.i.king@gmail.com/ Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Dave Jiang dave.jiang@intel.com Link: https://lore.kernel.org/r/aJnJW3iYTDDCj9sk@stanley.mountain Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/idxd/init.c | 33 +++++++++++++++++---------------- 1 file changed, 17 insertions(+), 16 deletions(-)
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c index 163a276d2670e..4b999c5802f4b 100644 --- a/drivers/dma/idxd/init.c +++ b/drivers/dma/idxd/init.c @@ -179,27 +179,30 @@ static int idxd_setup_wqs(struct idxd_device *idxd) idxd->wq_enable_map = bitmap_zalloc_node(idxd->max_wqs, GFP_KERNEL, dev_to_node(dev)); if (!idxd->wq_enable_map) { rc = -ENOMEM; - goto err_bitmap; + goto err_free_wqs; }
for (i = 0; i < idxd->max_wqs; i++) { wq = kzalloc_node(sizeof(*wq), GFP_KERNEL, dev_to_node(dev)); if (!wq) { rc = -ENOMEM; - goto err; + goto err_unwind; }
idxd_dev_set_type(&wq->idxd_dev, IDXD_DEV_WQ); conf_dev = wq_confdev(wq); wq->id = i; wq->idxd = idxd; - device_initialize(wq_confdev(wq)); + device_initialize(conf_dev); conf_dev->parent = idxd_confdev(idxd); conf_dev->bus = &dsa_bus_type; conf_dev->type = &idxd_wq_device_type; rc = dev_set_name(conf_dev, "wq%d.%d", idxd->id, wq->id); - if (rc < 0) - goto err; + if (rc < 0) { + put_device(conf_dev); + kfree(wq); + goto err_unwind; + }
mutex_init(&wq->wq_lock); init_waitqueue_head(&wq->err_queue); @@ -210,15 +213,20 @@ static int idxd_setup_wqs(struct idxd_device *idxd) wq->enqcmds_retries = IDXD_ENQCMDS_RETRIES; wq->wqcfg = kzalloc_node(idxd->wqcfg_size, GFP_KERNEL, dev_to_node(dev)); if (!wq->wqcfg) { + put_device(conf_dev); + kfree(wq); rc = -ENOMEM; - goto err; + goto err_unwind; }
if (idxd->hw.wq_cap.op_config) { wq->opcap_bmap = bitmap_zalloc(IDXD_MAX_OPCAP_BITS, GFP_KERNEL); if (!wq->opcap_bmap) { + kfree(wq->wqcfg); + put_device(conf_dev); + kfree(wq); rc = -ENOMEM; - goto err_opcap_bmap; + goto err_unwind; } bitmap_copy(wq->opcap_bmap, idxd->opcap_bmap, IDXD_MAX_OPCAP_BITS); } @@ -229,13 +237,7 @@ static int idxd_setup_wqs(struct idxd_device *idxd)
return 0;
-err_opcap_bmap: - kfree(wq->wqcfg); - -err: - put_device(conf_dev); - kfree(wq); - +err_unwind: while (--i >= 0) { wq = idxd->wqs[i]; if (idxd->hw.wq_cap.op_config) @@ -244,11 +246,10 @@ static int idxd_setup_wqs(struct idxd_device *idxd) conf_dev = wq_confdev(wq); put_device(conf_dev); kfree(wq); - } bitmap_free(idxd->wq_enable_map);
-err_bitmap: +err_free_wqs: kfree(idxd->wqs);
return rc;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anders Roxell anders.roxell@linaro.org
[ Upstream commit e63419dbf2ceb083c1651852209c7f048089ac0f ]
Fix a critical memory allocation bug in edma_setup_from_hw() where queue_priority_map was allocated with insufficient memory. The code declared queue_priority_map as s8 (*)[2] (pointer to array of 2 s8), but allocated memory using sizeof(s8) instead of the correct size.
This caused out-of-bounds memory writes when accessing: queue_priority_map[i][0] = i; queue_priority_map[i][1] = i;
The bug manifested as kernel crashes with "Oops - undefined instruction" on ARM platforms (BeagleBoard-X15) during EDMA driver probe, as the memory corruption triggered kernel hardening features on Clang.
Change the allocation to use sizeof(*queue_priority_map) which automatically gets the correct size for the 2D array structure.
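A small stand-alone C sketch (not the edma code) showing why sizeof(*queue_priority_map) is the right operand when the pointer has type s8 (*)[2]: sizeof(s8) counts one byte per row, while sizeof(*map) counts the whole two-byte row.

	#include <stdio.h>
	#include <stdlib.h>

	int main(void)
	{
		int num_tc = 4;
		signed char (*map)[2];   /* same shape as s8 (*)[2] in the driver */
		int i;

		printf("per-row (bug): %zu bytes\n", sizeof(signed char)); /* 1 */
		printf("per-row (fix): %zu bytes\n", sizeof(*map));        /* 2 */

		map = calloc(num_tc + 1, sizeof(*map));
		if (!map)
			return 1;
		for (i = 0; i < num_tc; i++) {   /* now safely in bounds */
			map[i][0] = i;
			map[i][1] = i;
		}
		free(map);
		return 0;
	}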
Fixes: 2b6b3b742019 ("ARM/dmaengine: edma: Merge the two drivers under drivers/dma/") Signed-off-by: Anders Roxell anders.roxell@linaro.org Link: https://lore.kernel.org/r/20250830094953.3038012-1-anders.roxell@linaro.org Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma/ti/edma.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/dma/ti/edma.c b/drivers/dma/ti/edma.c index c0fa541324675..f7ddf588b7f9b 100644 --- a/drivers/dma/ti/edma.c +++ b/drivers/dma/ti/edma.c @@ -2063,8 +2063,8 @@ static int edma_setup_from_hw(struct device *dev, struct edma_soc_info *pdata, * priority. So Q0 is the highest priority queue and the last queue has * the lowest priority. */ - queue_priority_map = devm_kcalloc(dev, ecc->num_tc + 1, sizeof(s8), - GFP_KERNEL); + queue_priority_map = devm_kcalloc(dev, ecc->num_tc + 1, + sizeof(*queue_priority_map), GFP_KERNEL); if (!queue_priority_map) return -ENOMEM;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andreas Kemnade akemnade@kernel.org
[ Upstream commit c05d0b32eebadc8be6e53196e99c64cf2bed1d99 ]
Attach the power good GPIO to the regulator device's devres instead of the parent device's to fix problems if probe is run multiple times (rmmod/insmod or some deferral).
Fixes: 8c485bedfb785 ("regulator: sy7636a: Initial commit") Signed-off-by: Andreas Kemnade akemnade@kernel.org Reviewed-by: Alistair Francis alistair@alistair23.me Reviewed-by: Peng Fan peng.fan@nxp.com Message-ID: 20250906-sy7636-rsrc-v1-2-e2886a9763a7@kernel.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/sy7636a-regulator.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/regulator/sy7636a-regulator.c b/drivers/regulator/sy7636a-regulator.c index d1e7ba1fb3e1a..27e3d939b7bb9 100644 --- a/drivers/regulator/sy7636a-regulator.c +++ b/drivers/regulator/sy7636a-regulator.c @@ -83,9 +83,11 @@ static int sy7636a_regulator_probe(struct platform_device *pdev) if (!regmap) return -EPROBE_DEFER;
- gdp = devm_gpiod_get(pdev->dev.parent, "epd-pwr-good", GPIOD_IN); + device_set_of_node_from_dev(&pdev->dev, pdev->dev.parent); + + gdp = devm_gpiod_get(&pdev->dev, "epd-pwr-good", GPIOD_IN); if (IS_ERR(gdp)) { - dev_err(pdev->dev.parent, "Power good GPIO fault %ld\n", PTR_ERR(gdp)); + dev_err(&pdev->dev, "Power good GPIO fault %ld\n", PTR_ERR(gdp)); return PTR_ERR(gdp); }
@@ -105,7 +107,6 @@ static int sy7636a_regulator_probe(struct platform_device *pdev) }
config.dev = &pdev->dev; - config.dev->of_node = pdev->dev.parent->of_node; config.regmap = regmap;
rdev = devm_regulator_register(&pdev->dev, &desc, &config);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiapeng Chong jiapeng.chong@linux.alibaba.com
[ Upstream commit 82ccdf062a64f3c4ac575c16179ce68edbbbe8e4 ]
The function is defined, but not called anywhere:
kernel/time/hrtimer.c:1880:20: warning: unused function '__hrtimer_peek_ahead_timers'.
Remove it.
Reported-by: Abaci Robot abaci@linux.alibaba.com Signed-off-by: Jiapeng Chong jiapeng.chong@linux.alibaba.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/r/20240322070441.29646-1-jiapeng.chong@linux.alibaba... Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=8611 Stable-dep-of: e895f8e29119 ("hrtimers: Unconditionally update target CPU base after offline timer migration") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/time/hrtimer.c | 20 +------------------- 1 file changed, 1 insertion(+), 19 deletions(-)
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index 6d9da768604d6..e9833a30a86a4 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -1938,25 +1938,7 @@ void hrtimer_interrupt(struct clock_event_device *dev) tick_program_event(expires_next, 1); pr_warn_once("hrtimer: interrupt took %llu ns\n", ktime_to_ns(delta)); } - -/* called with interrupts disabled */ -static inline void __hrtimer_peek_ahead_timers(void) -{ - struct tick_device *td; - - if (!hrtimer_hres_active()) - return; - - td = this_cpu_ptr(&tick_cpu_device); - if (td && td->evtdev) - hrtimer_interrupt(td->evtdev); -} - -#else /* CONFIG_HIGH_RES_TIMERS */ - -static inline void __hrtimer_peek_ahead_timers(void) { } - -#endif /* !CONFIG_HIGH_RES_TIMERS */ +#endif /* !CONFIG_HIGH_RES_TIMERS */
/* * Called from run_local_timers in hardirq context every jiffy
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiapeng Chong jiapeng.chong@linux.alibaba.com
[ Upstream commit b7c8e1f8a7b4352c1d0b4310686385e3cf6c104a ]
The function hrtimer_hres_active() is defined in the hrtimer.c file but not called elsewhere, so rename __hrtimer_hres_active() to hrtimer_hres_active() and remove the old hrtimer_hres_active() function.
kernel/time/hrtimer.c:653:19: warning: unused function 'hrtimer_hres_active'.
Fixes: 82ccdf062a64 ("hrtimer: Remove unused function") Reported-by: Abaci Robot abaci@linux.alibaba.com Signed-off-by: Jiapeng Chong jiapeng.chong@linux.alibaba.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Anna-Maria Behnsen anna-maria@linutronix.de Link: https://lore.kernel.org/r/20240418023000.130324-1-jiapeng.chong@linux.alibab... Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=8778 Stable-dep-of: e895f8e29119 ("hrtimers: Unconditionally update target CPU base after offline timer migration") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/time/hrtimer.c | 21 ++++++++------------- 1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index e9833a30a86a4..0c3ad1755cc26 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -671,17 +671,12 @@ static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base) /* * Is the high resolution mode active ? */ -static inline int __hrtimer_hres_active(struct hrtimer_cpu_base *cpu_base) +static inline int hrtimer_hres_active(struct hrtimer_cpu_base *cpu_base) { return IS_ENABLED(CONFIG_HIGH_RES_TIMERS) ? cpu_base->hres_active : 0; }
-static inline int hrtimer_hres_active(void) -{ - return __hrtimer_hres_active(this_cpu_ptr(&hrtimer_bases)); -} - static void __hrtimer_reprogram(struct hrtimer_cpu_base *cpu_base, struct hrtimer *next_timer, ktime_t expires_next) @@ -705,7 +700,7 @@ static void __hrtimer_reprogram(struct hrtimer_cpu_base *cpu_base, * set. So we'd effectively block all timers until the T2 event * fires. */ - if (!__hrtimer_hres_active(cpu_base) || cpu_base->hang_detected) + if (!hrtimer_hres_active(cpu_base) || cpu_base->hang_detected) return;
tick_program_event(expires_next, 1); @@ -814,12 +809,12 @@ static void retrigger_next_event(void *arg) * function call will take care of the reprogramming in case the * CPU was in a NOHZ idle sleep. */ - if (!__hrtimer_hres_active(base) && !tick_nohz_active) + if (!hrtimer_hres_active(base) && !tick_nohz_active) return;
raw_spin_lock(&base->lock); hrtimer_update_base(base); - if (__hrtimer_hres_active(base)) + if (hrtimer_hres_active(base)) hrtimer_force_reprogram(base, 0); else hrtimer_update_next_event(base); @@ -976,7 +971,7 @@ void clock_was_set(unsigned int bases) cpumask_var_t mask; int cpu;
- if (!__hrtimer_hres_active(cpu_base) && !tick_nohz_active) + if (!hrtimer_hres_active(cpu_base) && !tick_nohz_active) goto out_timerfd;
if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) { @@ -1554,7 +1549,7 @@ u64 hrtimer_get_next_event(void)
raw_spin_lock_irqsave(&cpu_base->lock, flags);
- if (!__hrtimer_hres_active(cpu_base)) + if (!hrtimer_hres_active(cpu_base)) expires = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_ALL);
raw_spin_unlock_irqrestore(&cpu_base->lock, flags); @@ -1577,7 +1572,7 @@ u64 hrtimer_next_event_without(const struct hrtimer *exclude)
raw_spin_lock_irqsave(&cpu_base->lock, flags);
- if (__hrtimer_hres_active(cpu_base)) { + if (hrtimer_hres_active(cpu_base)) { unsigned int active;
if (!cpu_base->softirq_activated) { @@ -1949,7 +1944,7 @@ void hrtimer_run_queues(void) unsigned long flags; ktime_t now;
- if (__hrtimer_hres_active(cpu_base)) + if (hrtimer_hres_active(cpu_base)) return;
/*
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiongfeng Wang wangxiongfeng2@huawei.com
[ Upstream commit e895f8e29119c8c966ea794af9e9100b10becb88 ]
When testing softirq based hrtimers on an ARM32 board, with high resolution mode and NOHZ inactive, softirq based hrtimers fail to expire after being moved away from an offline CPU:
CPU0 CPU1 hrtimer_start(..., HRTIMER_MODE_SOFT); cpu_down(CPU1) ... hrtimers_cpu_dying() // Migrate timers to CPU0 smp_call_function_single(CPU0, retrigger_next_event); retrigger_next_event() if (!highres && !nohz) return;
As retrigger_next_event() is a NOOP when both high resolution timers and NOHZ are inactive, CPU0's hrtimer_cpu_base::softirq_expires_next is not updated and the migrated softirq timers never expire unless there is a softirq based hrtimer queued on CPU0 later.
Fix this by removing the hrtimer_hres_active() and tick_nohz_active check in retrigger_next_event(), which enforces a full update of the CPU base. As this is not a fast path the extra cost does not matter.
[ tglx: Massaged change log ]
Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier") Co-developed-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Xiongfeng Wang wangxiongfeng2@huawei.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/all/20250805081025.54235-1-wangxiongfeng2@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/time/hrtimer.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index 0c3ad1755cc26..ccea52adcba67 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -808,10 +808,10 @@ static void retrigger_next_event(void *arg) * of the next expiring timer is enough. The return from the SMP * function call will take care of the reprogramming in case the * CPU was in a NOHZ idle sleep. + * + * In periodic low resolution mode, the next softirq expiration + * must also be updated. */ - if (!hrtimer_hres_active(base) && !tick_nohz_active) - return; - raw_spin_lock(&base->lock); hrtimer_update_base(base); if (hrtimer_hres_active(base)) @@ -2289,11 +2289,6 @@ int hrtimers_cpu_dying(unsigned int dying_cpu) &new_base->clock_base[i]); }
- /* - * The migration might have changed the first expiring softirq - * timer on this CPU. Update it. - */ - __hrtimer_get_next_event(new_base, HRTIMER_ACTIVE_SOFT); /* Tell the other CPU to retrigger the next event */ smp_call_function_single(ncpu, retrigger_next_event, NULL, 0);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Palmer Dabbelt palmer@rivosinc.com
[ Upstream commit 8d4f1e05ff821a5d59116ab8c3a30fcae81d8597 ]
Without this I get a bunch of build errors like
In file included from ./include/linux/sched/task_stack.h:12, from ./arch/riscv/include/asm/compat.h:12, from ./arch/riscv/include/asm/pgtable.h:115, from ./include/linux/pgtable.h:6, from ./include/linux/mm.h:30, from arch/riscv/kernel/asm-offsets.c:8: ./include/linux/kasan.h:50:37: error: ‘MAX_PTRS_PER_PTE’ undeclared here (not in a function); did you mean ‘PTRS_PER_PTE’? 50 | extern pte_t kasan_early_shadow_pte[MAX_PTRS_PER_PTE + PTE_HWTABLE_PTRS]; | ^~~~~~~~~~~~~~~~ | PTRS_PER_PTE ./include/linux/kasan.h:51:8: error: unknown type name ‘pmd_t’; did you mean ‘pgd_t’? 51 | extern pmd_t kasan_early_shadow_pmd[MAX_PTRS_PER_PMD]; | ^~~~~ | pgd_t ./include/linux/kasan.h:51:37: error: ‘MAX_PTRS_PER_PMD’ undeclared here (not in a function); did you mean ‘PTRS_PER_PGD’? 51 | extern pmd_t kasan_early_shadow_pmd[MAX_PTRS_PER_PMD]; | ^~~~~~~~~~~~~~~~ | PTRS_PER_PGD ./include/linux/kasan.h:52:8: error: unknown type name ‘pud_t’; did you mean ‘pgd_t’? 52 | extern pud_t kasan_early_shadow_pud[MAX_PTRS_PER_PUD]; | ^~~~~ | pgd_t ./include/linux/kasan.h:52:37: error: ‘MAX_PTRS_PER_PUD’ undeclared here (not in a function); did you mean ‘PTRS_PER_PGD’? 52 | extern pud_t kasan_early_shadow_pud[MAX_PTRS_PER_PUD]; | ^~~~~~~~~~~~~~~~ | PTRS_PER_PGD ./include/linux/kasan.h:53:8: error: unknown type name ‘p4d_t’; did you mean ‘pgd_t’? 53 | extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D]; | ^~~~~ | pgd_t ./include/linux/kasan.h:53:37: error: ‘MAX_PTRS_PER_P4D’ undeclared here (not in a function); did you mean ‘PTRS_PER_PGD’? 53 | extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D]; | ^~~~~~~~~~~~~~~~ | PTRS_PER_PGD
Link: https://lore.kernel.org/r/20241126143250.29708-1-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/compat.h | 1 - 1 file changed, 1 deletion(-)
diff --git a/arch/riscv/include/asm/compat.h b/arch/riscv/include/asm/compat.h index 2ac955b51148f..6b79287baecc0 100644 --- a/arch/riscv/include/asm/compat.h +++ b/arch/riscv/include/asm/compat.h @@ -9,7 +9,6 @@ */ #include <linux/types.h> #include <linux/sched.h> -#include <linux/sched/task_stack.h> #include <asm-generic/compat.h>
static inline int is_compat_task(void)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mathias Nyman mathias.nyman@linux.intel.com
commit edcbe06453ddfde21f6aa763f7cab655f26133cc upstream.
Suspend-resume cycle test revealed a memory leak in 6.17-rc3
Turns out the slot_id race fix changes accidentally end up calling xhci_free_virt_device() with an incorrect vdev parameter. The vdev variable was reused for temporary purposes right before calling xhci_free_virt_device().
Fix this by passing the correct vdev parameter.
The slot_id race fix that caused this regression was targeted for stable, so this needs to be applied there as well.
Fixes: 2eb03376151b ("usb: xhci: Fix slot_id resource race conflict") Reported-by: David Wang 00107082@163.com Closes: https://lore.kernel.org/linux-usb/20250829181354.4450-1-00107082@163.com Suggested-by: Michal Pecio michal.pecio@gmail.com Suggested-by: David Wang 00107082@163.com Cc: stable@vger.kernel.org Tested-by: David Wang 00107082@163.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20250902105306.877476-4-mathias.nyman@linux.intel.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-mem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/usb/host/xhci-mem.c +++ b/drivers/usb/host/xhci-mem.c @@ -945,7 +945,7 @@ static void xhci_free_virt_devices_depth out: /* we are now at a leaf device */ xhci_debugfs_remove_slot(xhci, slot_id); - xhci_free_virt_device(xhci, vdev, slot_id); + xhci_free_virt_device(xhci, xhci->devs[slot_id], slot_id); }
int xhci_alloc_virt_device(struct xhci_hcd *xhci, int slot_id,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alan Stern stern@rowland.harvard.edu
commit 8d63c83d8eb922f6c316320f50c82fa88d099bea upstream.
Yunseong Kim and the syzbot fuzzer both reported a problem in RT-enabled kernels caused by the way dummy-hcd mixes interrupt management and spin-locking. The pattern was:
local_irq_save(flags); spin_lock(&dum->lock); ... spin_unlock(&dum->lock); ... // calls usb_gadget_giveback_request() local_irq_restore(flags);
The code was written this way because usb_gadget_giveback_request() needs to be called with interrupts disabled and the private lock not held.
While this pattern works fine in non-RT kernels, it's not good when RT is enabled. RT kernels handle spinlocks much like mutexes; in particular, spin_lock() may sleep. But sleeping is not allowed while local interrupts are disabled.
To fix the problem, rewrite the code to conform to the pattern used elsewhere in dummy-hcd and other UDC drivers:
spin_lock_irqsave(&dum->lock, flags); ... spin_unlock(&dum->lock); usb_gadget_giveback_request(...); spin_lock(&dum->lock); ... spin_unlock_irqrestore(&dum->lock, flags);
This approach satisfies the RT requirements.
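A user-space pthread analog (illustrative only; the kernel code uses spin_lock_irqsave(), which has no direct user-space equivalent) of the drop-the-lock-around-the-callback pattern adopted here:

	#include <pthread.h>
	#include <stdio.h>

	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

	/* Completion callback that must run without the private lock held. */
	static void giveback(int req)
	{
		printf("gave back request %d\n", req);
	}

	static void dequeue(int req)
	{
		pthread_mutex_lock(&lock);     /* analog of spin_lock_irqsave() */
		/* ... remove req from the queue under the lock ... */
		pthread_mutex_unlock(&lock);   /* drop only the lock ...        */
		giveback(req);                 /* ... for the callback          */
		pthread_mutex_lock(&lock);     /* retake it for remaining work  */
		/* ... */
		pthread_mutex_unlock(&lock);   /* finally release everything    */
	}

	int main(void)
	{
		dequeue(1);
		return 0;
	}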
Signed-off-by: Alan Stern stern@rowland.harvard.edu Cc: stable stable@kernel.org Fixes: b4dbda1a22d2 ("USB: dummy-hcd: disable interrupts during req->complete") Reported-by: Yunseong Kim ysk@kzalloc.com Closes: https://lore.kernel.org/linux-usb/5b337389-73b9-4ee4-a83e-7e82bf5af87a@kzalloc.com/ Reported-by: syzbot+8baacc4139f12fa77909@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/68ac2411.050a0220.37038e.0087.GAE@google.com/ Tested-by: syzbot+8baacc4139f12fa77909@syzkaller.appspotmail.com CC: Sebastian Andrzej Siewior bigeasy@linutronix.de CC: stable@vger.kernel.org Reviewed-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Link: https://lore.kernel.org/r/bb192ae2-4eee-48ee-981f-3efdbbd0d8f0@rowland.harva... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/udc/dummy_hcd.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/usb/gadget/udc/dummy_hcd.c +++ b/drivers/usb/gadget/udc/dummy_hcd.c @@ -764,8 +764,7 @@ static int dummy_dequeue(struct usb_ep * if (!dum->driver) return -ESHUTDOWN;
- local_irq_save(flags); - spin_lock(&dum->lock); + spin_lock_irqsave(&dum->lock, flags); list_for_each_entry(iter, &ep->queue, queue) { if (&iter->req != _req) continue; @@ -775,15 +774,16 @@ static int dummy_dequeue(struct usb_ep * retval = 0; break; } - spin_unlock(&dum->lock);
if (retval == 0) { dev_dbg(udc_dev(dum), "dequeued req %p from %s, len %d buf %p\n", req, _ep->name, _req->length, _req->buf); + spin_unlock(&dum->lock); usb_gadget_giveback_request(_ep, _req); + spin_lock(&dum->lock); } - local_irq_restore(flags); + spin_unlock_irqrestore(&dum->lock, flags); return retval; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit 21d8525d2e061cde034277d518411b02eac764e2 upstream.
The gadget card driver forgot to call snd_ump_update_group_attrs() after adding FBs, and this leaves the UMP group attributes uninitialized. As a result, a -ENODEV error is returned when opening a legacy rawmidi device on an inactive group.
This patch adds the missing call to address the behavior above.
Fixes: 8b645922b223 ("usb: gadget: Add support for USB MIDI 2.0 function driver") Cc: stable stable@kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Link: https://lore.kernel.org/r/20250904153932.13589-1-tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/f_midi2.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/gadget/function/f_midi2.c +++ b/drivers/usb/gadget/function/f_midi2.c @@ -1601,6 +1601,7 @@ static int f_midi2_create_card(struct f_ strscpy(fb->info.name, ump_fb_name(b), sizeof(fb->info.name)); } + snd_ump_update_group_attrs(ump); }
for (i = 0; i < midi2->num_eps; i++) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit 116e79c679a1530cf833d0ff3007061d7a716bd9 upstream.
The EP-IN of MIDI2 (altset 1) wasn't initialized in f_midi2_create_usb_configs() as it's an INT EP, unlike the other BULK EPs. But this leaves the max packet size unchanged no matter which speed is used, resulting in very slow access. The wMaxPacketSize values set there look legit for INT EPs as well, so initialize the MIDI2 EP-IN there too, achieving the equivalent speed.
Fixes: 8b645922b223 ("usb: gadget: Add support for USB MIDI 2.0 function driver") Cc: stable stable@kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Link: https://lore.kernel.org/r/20250905133240.20966-1-tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/f_midi2.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/usb/gadget/function/f_midi2.c +++ b/drivers/usb/gadget/function/f_midi2.c @@ -1739,9 +1739,12 @@ static int f_midi2_create_usb_configs(st case USB_SPEED_HIGH: midi2_midi1_ep_out_desc.wMaxPacketSize = cpu_to_le16(512); midi2_midi1_ep_in_desc.wMaxPacketSize = cpu_to_le16(512); - for (i = 0; i < midi2->num_eps; i++) + for (i = 0; i < midi2->num_eps; i++) { midi2_midi2_ep_out_desc[i].wMaxPacketSize = cpu_to_le16(512); + midi2_midi2_ep_in_desc[i].wMaxPacketSize = + cpu_to_le16(512); + } fallthrough; case USB_SPEED_FULL: midi1_in_eps = midi2_midi1_ep_in_descs; @@ -1750,9 +1753,12 @@ static int f_midi2_create_usb_configs(st case USB_SPEED_SUPER: midi2_midi1_ep_out_desc.wMaxPacketSize = cpu_to_le16(1024); midi2_midi1_ep_in_desc.wMaxPacketSize = cpu_to_le16(1024); - for (i = 0; i < midi2->num_eps; i++) + for (i = 0; i < midi2->num_eps; i++) { midi2_midi2_ep_out_desc[i].wMaxPacketSize = cpu_to_le16(1024); + midi2_midi2_ep_in_desc[i].wMaxPacketSize = + cpu_to_le16(1024); + } midi1_in_eps = midi2_midi1_ep_in_ss_descs; midi1_out_eps = midi2_midi1_ep_out_ss_descs; break;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephan Gerhold stephan.gerhold@linaro.org
commit 5068b5254812433e841a40886e695633148d362d upstream.
When we don't have a clock specified in the device tree, we have no way to ensure the BAM is on. This is often the case for remotely-controlled or remotely-powered BAM instances. In this case, we need to read num-channels from the DT to have all the necessary information to complete probing.
However, at the moment invalid device trees without clock and without num-channels still continue probing, because the error handling is missing return statements. The driver will then later try to read the number of channels from the registers. This is unsafe, because it relies on boot firmware and lucky timing to succeed. Unfortunately, the lack of proper error handling here has been abused for several Qualcomm SoCs upstream, causing early boot crashes in several situations [1, 2].
Avoid these early crashes by erroring out when any of the required DT properties are missing. Note that this will break some of the existing DTs upstream (mainly BAM instances related to the crypto engine). However, clearly these DTs have never been tested properly, since the error in the kernel log was just ignored. It's safer to disable the crypto engine for these broken DTBs.
[1]: https://lore.kernel.org/r/CY01EKQVWE36.B9X5TDXAREPF@fairphone.com/ [2]: https://lore.kernel.org/r/20230626145959.646747-1-krzysztof.kozlowski@linaro...
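A stand-alone C sketch (assumed toy names, not the bam_dma code) of the bug class fixed here: printing an error without returning lets probe continue with an unset value, whereas returning the error makes the failure immediate and obvious.

	#include <stdio.h>

	/* Stand-in for a DT property read: fails when the property is absent. */
	static int read_prop(int present, unsigned int *out)
	{
		if (!present)
			return -1;
		*out = 8;
		return 0;
	}

	static int probe(int have_num_channels)
	{
		unsigned int num_channels = 0;
		int ret = read_prop(have_num_channels, &num_channels);

		if (ret) {
			fprintf(stderr, "num-channels unspecified in dt\n");
			return ret;   /* the fix: bail out instead of carrying on */
		}
		printf("using %u channels\n", num_channels);
		return 0;
	}

	int main(void)
	{
		probe(1);   /* property present: probe succeeds       */
		probe(0);   /* property missing: probe now errors out */
		return 0;
	}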
Cc: stable@vger.kernel.org Fixes: 48d163b1aa6e ("dmaengine: qcom: bam_dma: get num-channels and num-ees from dt") Signed-off-by: Stephan Gerhold stephan.gerhold@linaro.org Reviewed-by: Konrad Dybcio konrad.dybcio@oss.qualcomm.com Link: https://lore.kernel.org/r/20250212-bam-dma-fixes-v1-8-f560889e65d8@linaro.or... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/qcom/bam_dma.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/dma/qcom/bam_dma.c +++ b/drivers/dma/qcom/bam_dma.c @@ -1283,13 +1283,17 @@ static int bam_dma_probe(struct platform if (!bdev->bamclk) { ret = of_property_read_u32(pdev->dev.of_node, "num-channels", &bdev->num_channels); - if (ret) + if (ret) { dev_err(bdev->dev, "num-channels unspecified in dt\n"); + return ret; + }
ret = of_property_read_u32(pdev->dev.of_node, "qcom,num-ees", &bdev->num_ees); - if (ret) + if (ret) { dev_err(bdev->dev, "num-ees unspecified in dt\n"); + return ret; + } }
ret = clk_prepare_enable(bdev->bamclk);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miaoqian Lin linmq006@gmail.com
commit aa2e1e4563d3ab689ffa86ca1412ecbf9fd3b308 upstream.
The reference taken by of_find_device_by_node() must be released when not needed anymore. Add missing put_device() call to fix device reference leaks.
Fixes: 134d9c52fca2 ("dmaengine: dw: dmamux: Introduce RZN1 DMA router support") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin linmq006@gmail.com Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/r/20250902090358.2423285-1-linmq006@gmail.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/dw/rzn1-dmamux.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
--- a/drivers/dma/dw/rzn1-dmamux.c +++ b/drivers/dma/dw/rzn1-dmamux.c @@ -48,12 +48,16 @@ static void *rzn1_dmamux_route_allocate( u32 mask; int ret;
- if (dma_spec->args_count != RNZ1_DMAMUX_NCELLS) - return ERR_PTR(-EINVAL); + if (dma_spec->args_count != RNZ1_DMAMUX_NCELLS) { + ret = -EINVAL; + goto put_device; + }
map = kzalloc(sizeof(*map), GFP_KERNEL); - if (!map) - return ERR_PTR(-ENOMEM); + if (!map) { + ret = -ENOMEM; + goto put_device; + }
chan = dma_spec->args[0]; map->req_idx = dma_spec->args[4]; @@ -94,12 +98,15 @@ static void *rzn1_dmamux_route_allocate( if (ret) goto clear_bitmap;
+ put_device(&pdev->dev); return map;
clear_bitmap: clear_bit(map->req_idx, dmamux->used_chans); free_map: kfree(map); +put_device: + put_device(&pdev->dev);
return ERR_PTR(ret); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit bca065733afd1e3a89a02f05ffe14e966cd5f78e upstream.
Make sure to drop the references taken to the PMC OF node and device by of_parse_phandle() and of_find_device_by_node() during probe.
Note that holding a reference to the PMC device does not prevent the PMC regmap from going away (e.g. if the PMC driver is unbound), so there is no need to keep the reference.
Fixes: 2d1021487273 ("phy: tegra: xusb: Add wake/sleepwalk for Tegra210") Cc: stable@vger.kernel.org # 5.14 Cc: JC Kuo jckuo@nvidia.com Signed-off-by: Johan Hovold johan@kernel.org Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20250724131206.2211-2-johan@kernel.org Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/phy/tegra/xusb-tegra210.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/phy/tegra/xusb-tegra210.c +++ b/drivers/phy/tegra/xusb-tegra210.c @@ -3164,18 +3164,22 @@ tegra210_xusb_padctl_probe(struct device }
pdev = of_find_device_by_node(np); + of_node_put(np); if (!pdev) { dev_warn(dev, "PMC device is not available\n"); goto out; }
- if (!platform_get_drvdata(pdev)) + if (!platform_get_drvdata(pdev)) { + put_device(&pdev->dev); return ERR_PTR(-EPROBE_DEFER); + }
padctl->regmap = dev_get_regmap(&pdev->dev, "usb_sleepwalk"); if (!padctl->regmap) dev_info(dev, "failed to find PMC regmap\n");
+ put_device(&pdev->dev); out: return &padctl->base; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit e19bcea99749ce8e8f1d359f68ae03210694ad56 upstream.
Make sure to drop the reference to the control device taken by of_find_device_by_node() during probe when the driver is unbound.
Fixes: 918ee0d21ba4 ("usb: phy: omap-usb3: Don't use omap_get_control_dev()") Cc: stable@vger.kernel.org # 3.13 Cc: Roger Quadros rogerq@kernel.org Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20250724131206.2211-4-johan@kernel.org Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/phy/ti/phy-ti-pipe3.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/drivers/phy/ti/phy-ti-pipe3.c +++ b/drivers/phy/ti/phy-ti-pipe3.c @@ -666,12 +666,20 @@ static int ti_pipe3_get_clk(struct ti_pi return 0; }
+static void ti_pipe3_put_device(void *_dev) +{ + struct device *dev = _dev; + + put_device(dev); +} + static int ti_pipe3_get_sysctrl(struct ti_pipe3 *phy) { struct device *dev = phy->dev; struct device_node *node = dev->of_node; struct device_node *control_node; struct platform_device *control_pdev; + int ret;
phy->phy_power_syscon = syscon_regmap_lookup_by_phandle(node, "syscon-phy-power"); @@ -703,6 +711,11 @@ static int ti_pipe3_get_sysctrl(struct t }
phy->control_dev = &control_pdev->dev; + + ret = devm_add_action_or_reset(dev, ti_pipe3_put_device, + phy->control_dev); + if (ret) + return ret; }
if (phy->mode == PIPE3_MODE_PCIE) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
commit c8b5b7c5da7d0c31c9b7190b4a7bba5281fc4780 upstream.
The client sends a malformed smb2 negotiate request and ksmbd returns an error response. Subsequently, the client can send an smb2 session setup request even though conn->preauth_info is not allocated. This patch adds a KSMBD_SESS_NEED_SETUP connection status to ignore session setup requests if the smb2 negotiate phase is not complete.
Cc: stable@vger.kernel.org Tested-by: Steve French stfrench@microsoft.com Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-26505 Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Jan Alexander Preissler akendo@akendo.eu Signed-off-by: Sujana Subramaniam sujana.subramaniam@sap.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/connection.h | 11 +++++++++++ fs/smb/server/mgmt/user_session.c | 4 ++-- fs/smb/server/smb2pdu.c | 14 +++++++++++--- 3 files changed, 24 insertions(+), 5 deletions(-)
--- a/fs/smb/server/connection.h +++ b/fs/smb/server/connection.h @@ -27,6 +27,7 @@ enum { KSMBD_SESS_EXITING, KSMBD_SESS_NEED_RECONNECT, KSMBD_SESS_NEED_NEGOTIATE, + KSMBD_SESS_NEED_SETUP, KSMBD_SESS_RELEASING };
@@ -195,6 +196,11 @@ static inline bool ksmbd_conn_need_negot return READ_ONCE(conn->status) == KSMBD_SESS_NEED_NEGOTIATE; }
+static inline bool ksmbd_conn_need_setup(struct ksmbd_conn *conn) +{ + return READ_ONCE(conn->status) == KSMBD_SESS_NEED_SETUP; +} + static inline bool ksmbd_conn_need_reconnect(struct ksmbd_conn *conn) { return READ_ONCE(conn->status) == KSMBD_SESS_NEED_RECONNECT; @@ -225,6 +231,11 @@ static inline void ksmbd_conn_set_need_n WRITE_ONCE(conn->status, KSMBD_SESS_NEED_NEGOTIATE); }
+static inline void ksmbd_conn_set_need_setup(struct ksmbd_conn *conn) +{ + WRITE_ONCE(conn->status, KSMBD_SESS_NEED_SETUP); +} + static inline void ksmbd_conn_set_need_reconnect(struct ksmbd_conn *conn) { WRITE_ONCE(conn->status, KSMBD_SESS_NEED_RECONNECT); --- a/fs/smb/server/mgmt/user_session.c +++ b/fs/smb/server/mgmt/user_session.c @@ -373,12 +373,12 @@ void destroy_previous_session(struct ksm ksmbd_all_conn_set_status(id, KSMBD_SESS_NEED_RECONNECT); err = ksmbd_conn_wait_idle_sess_id(conn, id); if (err) { - ksmbd_all_conn_set_status(id, KSMBD_SESS_NEED_NEGOTIATE); + ksmbd_all_conn_set_status(id, KSMBD_SESS_NEED_SETUP); goto out; } ksmbd_destroy_file_table(&prev_sess->file_table); prev_sess->state = SMB2_SESSION_EXPIRED; - ksmbd_all_conn_set_status(id, KSMBD_SESS_NEED_NEGOTIATE); + ksmbd_all_conn_set_status(id, KSMBD_SESS_NEED_SETUP); out: up_write(&conn->session_lock); up_write(&sessions_table_lock); --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -1252,7 +1252,7 @@ int smb2_handle_negotiate(struct ksmbd_w }
conn->srv_sec_mode = le16_to_cpu(rsp->SecurityMode); - ksmbd_conn_set_need_negotiate(conn); + ksmbd_conn_set_need_setup(conn);
err_out: if (rc) @@ -1273,6 +1273,9 @@ static int alloc_preauth_hash(struct ksm if (sess->Preauth_HashValue) return 0;
+ if (!conn->preauth_info) + return -ENOMEM; + sess->Preauth_HashValue = kmemdup(conn->preauth_info->Preauth_HashValue, PREAUTH_HASHVALUE_SIZE, GFP_KERNEL); if (!sess->Preauth_HashValue) @@ -1688,6 +1691,11 @@ int smb2_sess_setup(struct ksmbd_work *w
ksmbd_debug(SMB, "Received request for session setup\n");
+ if (!ksmbd_conn_need_setup(conn) && !ksmbd_conn_good(conn)) { + work->send_no_response = 1; + return rc; + } + WORK_BUFFERS(work, req, rsp);
rsp->StructureSize = cpu_to_le16(9); @@ -1919,7 +1927,7 @@ out_err: if (try_delay) { ksmbd_conn_set_need_reconnect(conn); ssleep(5); - ksmbd_conn_set_need_negotiate(conn); + ksmbd_conn_set_need_setup(conn); } } smb2_set_err_rsp(work); @@ -2249,7 +2257,7 @@ int smb2_session_logoff(struct ksmbd_wor ksmbd_free_user(sess->user); sess->user = NULL; } - ksmbd_all_conn_set_status(sess_id, KSMBD_SESS_NEED_NEGOTIATE); + ksmbd_all_conn_set_status(sess_id, KSMBD_SESS_NEED_SETUP);
rsp->StructureSize = cpu_to_le16(4); err = ksmbd_iov_pin_rsp(work, rsp, sizeof(struct smb2_logoff_rsp));
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Buday Csaba buday.csaba@prolan.hu
commit 8ea25274ebaf2f6be8be374633b2ed8348ec0e70 upstream.
reset_gpio is claimed in mdiobus_register_device(), but it is not released in mdiobus_unregister_device(). It is instead only released when the whole MDIO bus is unregistered. When a device uses the reset_gpio property, it becomes impossible to unregister it and register it again, because the GPIO remains claimed. This patch resolves that issue.
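For illustration, a minimal userspace model of the claim/release symmetry being restored: a resource grabbed in the per-device register path has to be dropped in the matching per-device unregister path, otherwise re-registering the same device fails because the resource is still held. The names (gpio_claim, device_register, etc.) are illustrative stand-ins, not the kernel API, which uses gpiod_put() as in the diff below:

#include <stdbool.h>
#include <stdio.h>

struct gpio { bool claimed; };

static struct gpio reset_gpio;		/* stands in for the shared reset GPIO line */

static int gpio_claim(struct gpio *g)
{
	if (g->claimed)
		return -1;		/* already in use, like -EBUSY in the kernel */
	g->claimed = true;
	return 0;
}

static void gpio_release(struct gpio *g)
{
	g->claimed = false;
}

static int device_register(void)
{
	return gpio_claim(&reset_gpio);
}

/* Before the fix the release happened only in the bus-unregister path,
 * so unregistering a single device left the GPIO claimed. */
static void device_unregister(void)
{
	gpio_release(&reset_gpio);
}

int main(void)
{
	printf("first register:  %d\n", device_register());	/* 0 */
	device_unregister();
	printf("second register: %d\n", device_register());	/* 0 with the release, -1 without */
	return 0;
}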
Fixes: bafbdd527d56 ("phylib: Add device reset GPIO support") # see notes Reviewed-by: Andrew Lunn andrew@lunn.ch Cc: Csókás Bence csokas.bence@prolan.hu [ csokas.bence: Resolve rebase conflict and clarify msg ] Signed-off-by: Buday Csaba buday.csaba@prolan.hu Link: https://patch.msgid.link/20250807135449.254254-2-csokas.bence@prolan.hu Signed-off-by: Paolo Abeni pabeni@redhat.com [ csokas.bence: Use the v1 patch on top of 6.6, as specified in notes ] Signed-off-by: Bence Csókás csokas.bence@prolan.hu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/mdio_bus.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/net/phy/mdio_bus.c +++ b/drivers/net/phy/mdio_bus.c @@ -99,6 +99,7 @@ int mdiobus_unregister_device(struct mdi if (mdiodev->bus->mdio_map[mdiodev->addr] != mdiodev) return -EINVAL;
+ gpiod_put(mdiodev->reset_gpio); reset_control_put(mdiodev->reset_ctrl);
mdiodev->bus->mdio_map[mdiodev->addr] = NULL; @@ -775,9 +776,6 @@ void mdiobus_unregister(struct mii_bus * if (!mdiodev) continue;
- if (mdiodev->reset_gpio) - gpiod_put(mdiodev->reset_gpio); - mdiodev->device_remove(mdiodev); mdiodev->device_free(mdiodev); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
[ Upstream commit 7838fb5f119191403560eca2e23613380c0e425e ]
Commit b61badd20b44 ("drm/amdgpu: fix usage slab after free") reordered when amdgpu_fence_driver_sw_fini() was called. After that patch, amdgpu_fence_driver_sw_fini() effectively became a no-op, as the sched entities were never freed because the ring pointers were already set to NULL. Remove the NULL setting.
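For illustration, a minimal userspace model of the ordering problem; the names are simplified stand-ins for adev->rings[], amdgpu_ring_fini() and amdgpu_fence_driver_sw_fini(), not the driver code itself:

#include <stdio.h>
#include <stdlib.h>

#define NUM_RINGS 2

struct ring { void *entity; };

static struct ring *rings[NUM_RINGS];	/* stands in for adev->rings[] */

/* Per-ring teardown; the clear_slot branch models the line the patch removes. */
static void ring_fini(struct ring *r, int idx, int clear_slot)
{
	(void)r;			/* hardware teardown elided */
	if (clear_slot)
		rings[idx] = NULL;
}

/* Later teardown pass that walks the ring array and frees per-ring
 * resources (the "sched entities" in the real driver). */
static int fence_driver_sw_fini(void)
{
	int freed = 0;

	for (int i = 0; i < NUM_RINGS; i++) {
		if (!rings[i])
			continue;	/* skipped for every cleared slot */
		free(rings[i]->entity);
		rings[i]->entity = NULL;
		freed++;
	}
	return freed;
}

int main(void)
{
	struct ring r[NUM_RINGS];

	for (int i = 0; i < NUM_RINGS; i++) {
		r[i].entity = malloc(16);
		rings[i] = &r[i];
	}
	for (int i = 0; i < NUM_RINGS; i++)
		ring_fini(&r[i], i, 1);		/* buggy ordering: slots NULLed first */
	printf("entities freed: %d of %d\n", fence_driver_sw_fini(), NUM_RINGS);
	return 0;
}

With the slots cleared first, the later pass frees nothing, which is the leak this patch fixes.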
Reported-by: Lin.Cao lincao12@amd.com Cc: Vitaly Prosyak vitaly.prosyak@amd.com Cc: Christian König christian.koenig@amd.com Fixes: b61badd20b44 ("drm/amdgpu: fix usage slab after free") Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit a525fa37aac36c4591cc8b07ae8957862415fbd5) Cc: stable@vger.kernel.org [ Adapt to conditional check ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 3 --- 1 file changed, 3 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c @@ -396,9 +396,6 @@ void amdgpu_ring_fini(struct amdgpu_ring dma_fence_put(ring->vmid_wait); ring->vmid_wait = NULL; ring->me = 0; - - if (!ring->is_mes_queue) - ring->adev->rings[ring->idx] = NULL; }
/**
-----Original Message----- From: Greg Kroah-Hartman gregkh@linuxfoundation.org Sent: Wednesday, September 17, 2025 8:35 AM To: stable@vger.kernel.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org; patches@lists.linux.dev; cao, lin lin.cao@amd.com; Prosyak, Vitaly Vitaly.Prosyak@amd.com; Koenig, Christian Christian.Koenig@amd.com; Deucher, Alexander Alexander.Deucher@amd.com; Sasha Levin sashal@kernel.org Subject: [PATCH 6.6 100/101] drm/amdgpu: fix a memory leak in fence cleanup when unloading
Does 6.6 contain b61badd20b44 or a backport of it? If not, then this patch is not applicable.
Alex
On Wed, Sep 17, 2025 at 02:35:55PM +0000, Deucher, Alexander wrote:
Does 6.6 contain b61badd20b44 or a backport of it? If not, then this patch is not applicable.
Yes, it is in 6.6.64.
thanks,
greg k-h
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula jani.nikula@intel.com
[ Upstream commit cfa7b7659757f8d0fc4914429efa90d0d2577dd7 ]
for_each_set_bit() expects size to be in bits, not bytes. The abox mask iteration uses bytes, but it works by coincidence, because the local variable holding the mask is unsigned long, and the mask only ever has bit 2 as the highest bit. Using a smaller type could lead to subtle and very hard to track bugs.
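For illustration, a small standalone C program showing why sizeof() is the wrong limit here (it assumes a 64-bit unsigned long; in the kernel, BITS_PER_TYPE(x) is simply sizeof(x) * 8):

#include <stdio.h>

#define BITS_PER_TYPE(x)	(sizeof(x) * 8)

int main(void)
{
	unsigned long mask = (1UL << 2) | (1UL << 40);

	printf("scanning %zu positions (sizeof): ", sizeof(mask));
	for (size_t i = 0; i < sizeof(mask); i++)
		if (mask & (1UL << i))
			printf("%zu ", i);		/* misses bit 40 */

	printf("\nscanning %zu positions (BITS_PER_TYPE): ", BITS_PER_TYPE(mask));
	for (size_t i = 0; i < BITS_PER_TYPE(mask); i++)
		if (mask & (1UL << i))
			printf("%zu ", i);		/* finds 2 and 40 */
	printf("\n");
	return 0;
}

With the current abox masks no bit above 2 is ever set, which is why the sizeof() limit has worked so far; the fix removes the dependency on that coincidence.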
Fixes: 62afef2811e4 ("drm/i915/rkl: RKL uses ABOX0 for pixel transfers") Cc: Ville Syrjälä ville.syrjala@linux.intel.com Cc: Matt Roper matthew.d.roper@intel.com Cc: stable@vger.kernel.org # v5.9+ Reviewed-by: Matt Roper matthew.d.roper@intel.com Link: https://lore.kernel.org/r/20250905104149.1144751-1-jani.nikula@intel.com Signed-off-by: Jani Nikula jani.nikula@intel.com (cherry picked from commit 7ea3baa6efe4bb93d11e1c0e6528b1468d7debf6) Signed-off-by: Tvrtko Ursulin tursulin@ursulin.net [ adapted struct intel_display *display parameters to struct drm_i915_private *dev_priv ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/display/intel_display_power.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -1170,7 +1170,7 @@ static void icl_mbus_init(struct drm_i91 if (DISPLAY_VER(dev_priv) == 12) abox_regs |= BIT(0);
- for_each_set_bit(i, &abox_regs, sizeof(abox_regs)) + for_each_set_bit(i, &abox_regs, BITS_PER_TYPE(abox_regs)) intel_de_rmw(dev_priv, MBUS_ABOX_CTL(i), mask, val); }
@@ -1623,11 +1623,11 @@ static void tgl_bw_buddy_init(struct drm if (table[config].page_mask == 0) { drm_dbg(&dev_priv->drm, "Unknown memory configuration; disabling address buddy logic.\n"); - for_each_set_bit(i, &abox_mask, sizeof(abox_mask)) + for_each_set_bit(i, &abox_mask, BITS_PER_TYPE(abox_mask)) intel_de_write(dev_priv, BW_BUDDY_CTL(i), BW_BUDDY_DISABLE); } else { - for_each_set_bit(i, &abox_mask, sizeof(abox_mask)) { + for_each_set_bit(i, &abox_mask, BITS_PER_TYPE(abox_mask)) { intel_de_write(dev_priv, BW_BUDDY_PAGE_MASK(i), table[config].page_mask);
The kernel, bpf tool, perf tool, and kselftest build fine for v6.6.107-rc1 on x86 and arm64 Azure VMs.
Tested-by: Hardik Garg hargar@linux.microsoft.com
Thanks, Hardik
On Wed, 17 Sep 2025 14:33:43 +0200, Greg Kroah-Hartman wrote:
All tests passing for Tegra ...
Test results for stable-v6.6: 10 builds: 10 pass, 0 fail 28 boots: 28 pass, 0 fail 120 tests: 120 pass, 0 fail
Linux version: 6.6.107-rc1-g08094cf55442 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon