This is the start of the stable review cycle for the 6.6.72 release. There are 129 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 17 Jan 2025 10:34:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.72-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.6.72-rc1
Daniel Golle daniel@makrotopia.org drm/mediatek: Only touch DISP_REG_OVL_PITCH_MSB if AFBC is supported
Alexandre Ghiti alexghiti@rivosinc.com riscv: Fix text patching when IPI are used
Liu Shixin liushixin2@huawei.com mm: hugetlb: independent PMD page table shared count
David Hildenbrand david@redhat.com mm/hugetlb: enforce that PMD PT sharing has split PMD PT locks
Peter Xu peterx@redhat.com fs/Kconfig: make hugetlbfs a menuconfig
Alexander Gordeev agordeev@linux.ibm.com pgtable: fix s390 ptdesc field comments
Tvrtko Ursulin tvrtko.ursulin@igalia.com workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker
Tejun Heo tj@kernel.org workqueue: Update lock debugging code
Xuewen Yan xuewen.yan@unisoc.com workqueue: Add rcu lock check at the end of work item execution
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp pmdomain: imx: gpcv2: fix an OF node reference leak in imx_gpcv2_probe()
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org pmdomain: imx: gpcv2: Simplify with scoped for each OF child loop
Peter Geis pgwipeout@gmail.com arm64: dts: rockchip: add hevc power domain clock to rk3328
Yu Kuai yukuai3@huawei.com block, bfq: fix waker_bfqq UAF after bfq_split_bfqq()
Daniil Stas daniil.stas@posteo.net hwmon: (drivetemp) Fix driver producing garbage data when SCSI errors occur
Jesse Taube Mr.Bossman075@gmail.com ARM: dts: imxrt1050: Fix clocks for mmc
Jens Axboe axboe@kernel.dk io_uring/eventfd: ensure io_eventfd_signal() defers another RCU period
Nam Cao namcao@linutronix.de riscv: kprobes: Fix incorrect address calculation
Uwe Kleine-König u.kleine-koenig@baylibre.com iio: adc: ad7124: Disable all channels at probe time
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp iio: inkern: call iio_device_put() only on mapped devices
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp iio: adc: at91: call input_free_device() on allocated iio_dev
Fabio Estevam festevam@gmail.com iio: adc: ti-ads124s08: Use gpiod_set_value_cansleep()
Carlos Song carlos.song@nxp.com iio: gyro: fxas21002c: Fix missing data update in trigger handler
Javier Carrasco javier.carrasco.cruz@gmail.com iio: adc: ti-ads8688: fix information leak in triggered buffer
Javier Carrasco javier.carrasco.cruz@gmail.com iio: adc: rockchip_saradc: fix information leak in triggered buffer
Javier Carrasco javier.carrasco.cruz@gmail.com iio: imu: kmx61: fix information leak in triggered buffer
Javier Carrasco javier.carrasco.cruz@gmail.com iio: light: vcnl4035: fix information leak in triggered buffer
Javier Carrasco javier.carrasco.cruz@gmail.com iio: dummy: iio_simply_dummy_buffer: fix information leak in triggered buffer
Javier Carrasco javier.carrasco.cruz@gmail.com iio: pressure: zpa2326: fix information leak in triggered buffer
Ingo Rohloff ingo.rohloff@lauterbach.com usb: gadget: configfs: Ignore trailing LF for user strings to cdev
Akash M akash.m5@samsung.com usb: gadget: f_fs: Remove WARN_ON in functionfs_bind
Dan Carpenter dan.carpenter@linaro.org usb: typec: tcpm/tcpci_maxim: fix error code in max_contaminant_read_resistance_kohm()
Prashanth K quic_prashk@quicinc.com usb: gadget: f_uac2: Fix incorrect setting of bNumEndpoints
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp usb: chipidea: ci_hdrc_imx: decrement device's refcount in .remove() and in the error path of .probe()
Takashi Iwai tiwai@suse.de usb: gadget: midi2: Reverse-select at the right place
Ma Ke make_ruc2021@163.com usb: fix reference leak in usb_new_device()
Kai-Heng Feng kaihengf@nvidia.com USB: core: Disable LPM only for non-suspended ports
Jun Yan jerrysteve1101@gmail.com USB: usblp: return error when setting unsupported protocol
Prashanth K quic_prashk@quicinc.com usb: dwc3-am62: Disable autosuspend during remove
Rick Edgecombe rick.p.edgecombe@intel.com x86/fpu: Ensure shadow stack is active before "getting" registers
Lianqin Hu hulianqin@vivo.com usb: gadget: u_serial: Disable ep before setting port to null to fix the crash caused by port being null
Ilpo Järvinen ilpo.jarvinen@linux.intel.com tty: serial: 8250: Fix another runtime PM usage counter underflow
Rengarajan S rengarajan.s@microchip.com misc: microchip: pci1xxxx: Resolve return code mismatch during GPIO set config
Rengarajan S rengarajan.s@microchip.com misc: microchip: pci1xxxx: Resolve kernel panic during GPIO IRQ handling
Li Huafei lihuafei1@huawei.com topology: Keep the cpumask unchanged when printing cpumap
André Draszik andre.draszik@linaro.org usb: dwc3: gadget: fix writing NYET threshold
Johan Hovold johan@kernel.org USB: serial: cp210x: add Phoenix Contact UPS Device
Lubomir Rintel lrintel@redhat.com usb-storage: Add max sectors quirk for Nokia 208
Zicheng Qu quzicheng@huawei.com staging: iio: ad9832: Correct phase range check
Zicheng Qu quzicheng@huawei.com staging: iio: ad9834: Correct phase range check
Michal Hrusecky michal.hrusecky@turris.com USB: serial: option: add Neoway N723-EA support
Chukun Pan amadeus@jmu.edu.cn USB: serial: option: add MeiG Smart SRM815
Milan Broz gmazyland@gmail.com dm-verity FEC: Fix RS FEC repair for roots unaligned to block size (take 2)
Ye Bin yebin10@huawei.com f2fs: fix null-ptr-deref in f2fs_submit_page_bio()
Pavel Begunkov asml.silence@gmail.com io_uring/timeout: fix multishot updates
Melissa Wen mwen@igalia.com drm/amd/display: increase MAX_SURFACES to the value supported by hw
Jesse.zhang@amd.com Jesse.zhang@amd.com drm/amdkfd: fixed page fault when enable MES shader debugger
Hans de Goede hdegoede@redhat.com ACPI: resource: Add Asus Vivobook X1504VAP to irq1_level_low_skip_override[]
Hans de Goede hdegoede@redhat.com ACPI: resource: Add TongFang GM5HG0A to irq1_edge_low_force_override[]
Nam Cao namcao@linutronix.de riscv: Fix sleeping in invalid context in die()
Meetakshi Setiya msetiya@microsoft.com smb: client: sync the root session and superblock context passwords before automounting
Joe Hattori joe@pf.is.s.u-tokyo.ac.jp thermal: of: fix OF node leak in of_thermal_zone_find()
Roman Li Roman.Li@amd.com drm/amd/display: Add check for granularity in dml ceil/floor helpers
Namjae Jeon linkinjeon@kernel.org ksmbd: Implement new SMB3 POSIX type
Matthieu Baerts (NGI0) matttbe@kernel.org sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy
Matthieu Baerts (NGI0) matttbe@kernel.org sctp: sysctl: udp_port: avoid using current->nsproxy
Matthieu Baerts (NGI0) matttbe@kernel.org sctp: sysctl: auth_enable: avoid using current->nsproxy
Matthieu Baerts (NGI0) matttbe@kernel.org sctp: sysctl: rto_min/max: avoid using current->nsproxy
Matthieu Baerts (NGI0) matttbe@kernel.org sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy
Matthieu Baerts (NGI0) matttbe@kernel.org mptcp: sysctl: sched: avoid using current->nsproxy
Mikulas Patocka mpatocka@redhat.com dm-ebs: don't set the flag DM_TARGET_PASSES_INTEGRITY
Manivannan Sadhasivam mani@kernel.org scsi: ufs: qcom: Power off the PHY if it was already powered on in ufs_qcom_power_up_sequence()
Krister Johansen kjlx@templeofstupid.com dm thin: make get_first_thin use rcu-safe list first function
Xu Lu luxu.kernel@bytedance.com riscv: mm: Fix the out of bound issue of vmemmap address
Javier Carrasco javier.carrasco.cruz@gmail.com cpuidle: riscv-sbi: fix device node release in early exit of for_each_possible_cpu
He Wang xw897002528@gmail.com ksmbd: fix unexpectedly changed path in ksmbd_vfs_kern_path_locked
Maciej S. Szmigiero mail@maciej.szmigiero.name platform/x86/amd/pmc: Only disable IRQ1 wakeup where i8042 actually enabled it
David Howells dhowells@redhat.com afs: Fix the maximum cell name length
Wentao Liang liangwentao@iscas.ac.cn ksmbd: fix a missing return value check bug
Liankun Yang liankun.yang@mediatek.com drm/mediatek: Add return value check when reading DPCD
Liankun Yang liankun.yang@mediatek.com drm/mediatek: Fix mode valid issue for dp
Liankun Yang liankun.yang@mediatek.com drm/mediatek: Fix YCbCr422 color format issue for DP
Arnd Bergmann arnd@arndb.de drm/mediatek: stop selecting foreign drivers
Guoqing Jiang guoqing.jiang@canonical.com drm/mediatek: Set private->all_drm_private[i]->drm to NULL if mtk_drm_bind returns err
Chenguang Zhao zhaochenguang@kylinos.cn net/mlx5: Fix variable not being completed when function returns
Parker Newman pnewman@connecttech.com net: stmmac: dwmac-tegra: Read iommu stream id from device tree
Toke Høiland-Jørgensen toke@redhat.com sched: sch_cake: add bounds checks to host bulk flow fairness counts
Pablo Neira Ayuso pablo@netfilter.org netfilter: conntrack: clamp maximum hashtable size to INT_MAX
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: imbalance in flowtable binding
Jean-Baptiste Maneyrol jean-baptiste.maneyrol@tdk.com iio: imu: inv_icm42600: fix timestamps after suspend if sensor is on
Jan Beulich jbeulich@suse.com x86/mm/numa: Use NUMA_NO_NODE when calling memblock_set_node()
Wei Yang richard.weiyang@gmail.com memblock tests: fix implicit declaration of function 'numa_valid_node'
Alexandre Ghiti alexghiti@rivosinc.com riscv: Fix early ftrace nop patching
Daniel Borkmann daniel@iogearbox.net tcp: Annotate data-race around sk->sk_mark in tcp_v4_send_reset
Neeraj Sanjay Kale neeraj.sanjaykale@nxp.com Bluetooth: btnxpuart: Fix driver sending truncated data
Luiz Augusto von Dentz luiz.von.dentz@intel.com Bluetooth: MGMT: Fix Add Device to responding before completing
Luiz Augusto von Dentz luiz.von.dentz@intel.com Bluetooth: hci_sync: Fix not setting Random Address when required
Jakub Kicinski kuba@kernel.org eth: gve: use appropriate helper to set xdp_features
Kuniyuki Iwashima kuniyu@amazon.com ipvlan: Fix use-after-free in ipvlan_get_iflink().
Benjamin Coddington bcodding@redhat.com tls: Fix tls_sw_sendmsg error handling
En-Wei Wu en-wei.wu@canonical.com igc: return early when failing to read EECD register
Jesse Brandeburg jesse.brandeburg@intel.com igc: field get conversion
Przemyslaw Korba przemyslaw.korba@intel.com ice: fix incorrect PHY settings for 100 GB/s
Anumula Murali Mohan Reddy anumula@chelsio.com cxgb4: Avoid removal of uninserted tid
Kalesh AP kalesh-anakkur.purayil@broadcom.com bnxt_en: Fix possible memory leak when hwrm_req_replace fails
Shannon Nelson shannon.nelson@amd.com pds_core: limit loop over fw name list
Qu Wenruo wqu@suse.com btrfs: avoid NULL pointer dereference if no valid extent tree
Jiawen Wu jiawenwu@trustnetic.com net: libwx: fix firmware mailbox abnormal return
Eric Dumazet edumazet@google.com net_sched: cls_flow: validate TCA_FLOW_RSHIFT attribute
Zhongqiu Duan dzq.aishenghu0@gmail.com tcp/dccp: allow a connection when sk_max_ack_backlog is zero
Jason Xing kernelxing@tencent.com tcp/dccp: complete lockless accesses to sk->sk_max_ack_backlog
Antonio Pastor antonio.pastor@gmail.com net: 802: LLC+SNAP OID:PID lookup on start of skb data
Keisuke Nishimura keisuke.nishimura@inria.fr ieee802154: ca8210: Add missing check for kfifo_alloc() in ca8210_probe()
Li Zhijian lizhijian@fujitsu.com selftests/alsa: Fix circular dependency involving global-timer
Chen-Yu Tsai wenst@chromium.org ASoC: mediatek: disable buffer pre-allocation
Shuming Fan shumingf@realtek.com ASoC: rt722: add delay time to wait for the calibration procedure
Gao Xiang xiang@kernel.org erofs: fix PSI memstall accounting
Gao Xiang xiang@kernel.org erofs: handle overlapped pclusters out of crafted images properly
Amir Goldstein amir73il@gmail.com ovl: support encoding fid from inode with no alias
Amir Goldstein amir73il@gmail.com ovl: pass realinode to ovl_encode_real_fh() instead of realdentry
Amir Goldstein amir73il@gmail.com ovl: do not encode lower fh with upper sb_writers held
Yuezhang Mo Yuezhang.Mo@sony.com exfat: fix the infinite loop in __exfat_free_cluster()
Yuezhang Mo Yuezhang.Mo@sony.com exfat: fix the infinite loop in exfat_readdir()
Ming-Hung Tsai mtsai@redhat.com dm array: fix cursor index when skipping across block boundaries
Ming-Hung Tsai mtsai@redhat.com dm array: fix unreleased btree blocks on closing a faulty array cursor
Ming-Hung Tsai mtsai@redhat.com dm array: fix releasing a faulty array block twice in dm_array_cursor_end
Zhang Yi yi.zhang@huawei.com jbd2: flush filesystem device before updating tail sequence
Zhang Yi yi.zhang@huawei.com jbd2: increase IO priority for writing revoke records
Mike Rapoport (IBM) rppt@kernel.org memblock: use numa_valid_node() helper to check for invalid node ID
Jan Beulich jbeulich@suse.com memblock: make memblock_set_node() also warn about use of MAX_NUMNODES
-------------
Diffstat:
Makefile | 4 +- arch/arm/boot/dts/nxp/imx/imxrt1050.dtsi | 2 +- arch/arm64/boot/dts/rockchip/rk3328.dtsi | 1 + arch/riscv/include/asm/cacheflush.h | 6 + arch/riscv/include/asm/page.h | 1 + arch/riscv/include/asm/patch.h | 1 + arch/riscv/include/asm/pgtable.h | 2 +- arch/riscv/kernel/ftrace.c | 47 ++++++- arch/riscv/kernel/patch.c | 16 ++- arch/riscv/kernel/probes/kprobes.c | 2 +- arch/riscv/kernel/traps.c | 6 +- arch/riscv/mm/init.c | 17 ++- arch/x86/kernel/fpu/regset.c | 3 +- arch/x86/mm/numa.c | 6 +- block/bfq-iosched.c | 12 +- drivers/acpi/resource.c | 18 +++ drivers/base/topology.c | 24 +++- drivers/bluetooth/btnxpuart.c | 1 + drivers/cpuidle/cpuidle-riscv-sbi.c | 4 +- drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 17 +++ drivers/gpu/drm/amd/display/dc/dc.h | 2 +- .../gpu/drm/amd/display/dc/dml/dml_inline_defs.h | 8 ++ drivers/gpu/drm/mediatek/Kconfig | 5 - drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 57 ++++----- drivers/gpu/drm/mediatek/mtk_dp.c | 46 ++++--- drivers/gpu/drm/mediatek/mtk_drm_drv.c | 2 + drivers/hwmon/drivetemp.c | 8 +- drivers/iio/adc/ad7124.c | 3 + drivers/iio/adc/at91_adc.c | 2 +- drivers/iio/adc/rockchip_saradc.c | 2 + drivers/iio/adc/ti-ads124s08.c | 4 +- drivers/iio/adc/ti-ads8688.c | 2 +- drivers/iio/dummy/iio_simple_dummy_buffer.c | 2 +- drivers/iio/gyro/fxas21002c_core.c | 11 +- drivers/iio/imu/inv_icm42600/inv_icm42600_core.c | 8 +- drivers/iio/imu/kmx61.c | 2 +- drivers/iio/inkern.c | 2 +- drivers/iio/light/vcnl4035.c | 2 +- drivers/iio/pressure/zpa2326.c | 2 + drivers/md/dm-ebs-target.c | 2 +- drivers/md/dm-thin.c | 5 +- drivers/md/dm-verity-fec.c | 39 ++++-- drivers/md/persistent-data/dm-array.c | 19 +-- drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gpio.c | 4 +- drivers/net/ethernet/amd/pds_core/devlink.c | 2 +- drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 3 +- drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 5 +- drivers/net/ethernet/google/gve/gve_main.c | 14 ++- drivers/net/ethernet/intel/ice/ice_ptp_consts.h | 4 +- drivers/net/ethernet/intel/igc/igc_base.c | 12 +- drivers/net/ethernet/intel/igc/igc_i225.c | 5 +- drivers/net/ethernet/intel/igc/igc_main.c | 6 +- drivers/net/ethernet/intel/igc/igc_phy.c | 4 +- drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 1 + drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c | 14 ++- drivers/net/ethernet/wangxun/libwx/wx_hw.c | 24 ++-- drivers/net/ieee802154/ca8210.c | 6 +- drivers/platform/x86/amd/pmc/pmc.c | 8 +- drivers/pmdomain/imx/gpcv2.c | 10 +- drivers/staging/iio/frequency/ad9832.c | 2 +- drivers/staging/iio/frequency/ad9834.c | 2 +- drivers/thermal/thermal_of.c | 1 + drivers/tty/serial/8250/8250_core.c | 3 + drivers/ufs/core/ufshcd-priv.h | 6 - drivers/ufs/core/ufshcd.c | 1 - drivers/ufs/host/ufs-qcom.c | 13 +- drivers/usb/chipidea/ci_hdrc_imx.c | 25 ++-- drivers/usb/class/usblp.c | 7 +- drivers/usb/core/hub.c | 6 +- drivers/usb/core/port.c | 7 +- drivers/usb/dwc3/core.h | 1 + drivers/usb/dwc3/dwc3-am62.c | 1 + drivers/usb/dwc3/gadget.c | 4 +- drivers/usb/gadget/Kconfig | 4 +- drivers/usb/gadget/configfs.c | 6 +- drivers/usb/gadget/function/f_fs.c | 2 +- drivers/usb/gadget/function/f_uac2.c | 1 + drivers/usb/gadget/function/u_serial.c | 8 +- drivers/usb/serial/cp210x.c | 1 + drivers/usb/serial/option.c | 4 +- drivers/usb/storage/unusual_devs.h | 7 ++ drivers/usb/typec/tcpm/maxim_contaminant.c | 4 +- fs/Kconfig | 24 ++-- fs/afs/afs.h | 2 +- fs/afs/afs_vl.h | 1 + fs/afs/vl_alias.c | 8 +- fs/afs/vlclient.c | 2 +- fs/btrfs/scrub.c | 4 + fs/erofs/zdata.c | 66 +++++----- fs/exfat/dir.c | 3 +- fs/exfat/fatent.c | 10 ++ fs/f2fs/super.c | 12 +- fs/jbd2/commit.c | 4 +- fs/jbd2/revoke.c | 2 +- fs/overlayfs/copy_up.c | 62 +++++---- fs/overlayfs/export.c | 49 ++++---- fs/overlayfs/namei.c | 41 ++++-- fs/overlayfs/overlayfs.h | 28 +++-- fs/overlayfs/super.c | 20 ++- fs/overlayfs/util.c | 10 ++ fs/smb/client/namespace.c | 19 ++- fs/smb/server/smb2pdu.c | 43 +++++++ fs/smb/server/smb2pdu.h | 10 ++ fs/smb/server/vfs.c | 3 +- include/linux/hugetlb.h | 5 +- include/linux/mm.h | 1 + include/linux/mm_types.h | 34 ++++- include/linux/numa.h | 5 + include/net/inet_connection_sock.h | 2 +- include/ufs/ufshcd.h | 2 - io_uring/io_uring.c | 13 +- io_uring/timeout.c | 4 +- kernel/workqueue.c | 68 ++++++---- mm/hugetlb.c | 24 ++-- mm/memblock.c | 24 ++-- net/802/psnap.c | 4 +- net/bluetooth/hci_sync.c | 11 +- net/bluetooth/mgmt.c | 38 +++++- net/core/link_watch.c | 10 +- net/ipv4/tcp_ipv4.c | 2 +- net/mptcp/ctrl.c | 11 +- net/netfilter/nf_conntrack_core.c | 5 +- net/netfilter/nf_tables_api.c | 15 ++- net/sched/cls_flow.c | 3 +- net/sched/sch_cake.c | 140 +++++++++++---------- net/sctp/sysctl.c | 14 ++- net/tls/tls_sw.c | 2 +- sound/soc/codecs/rt722-sdca.c | 7 +- .../soc/mediatek/common/mtk-afe-platform-driver.c | 4 +- tools/include/linux/numa.h | 5 + tools/testing/selftests/alsa/Makefile | 2 +- 131 files changed, 1032 insertions(+), 509 deletions(-)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Beulich jbeulich@suse.com
[ Upstream commit e0eec24e2e199873f43df99ec39773ad3af2bff7 ]
On an (old) x86 system with SRAT just covering space above 4Gb:
ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0xfffffffff] hotplug
the commit referenced below leads to this NUMA configuration no longer being refused by a CONFIG_NUMA=y kernel (previously
NUMA: nodes only cover 6144MB of your 8185MB e820 RAM. Not used. No NUMA configuration found Faking a node at [mem 0x0000000000000000-0x000000027fffffff]
was seen in the log directly after the message quoted above), because of memblock_validate_numa_coverage() checking for NUMA_NO_NODE (only). This in turn led to memblock_alloc_range_nid()'s warning about MAX_NUMNODES triggering, followed by a NULL deref in memmap_init() when trying to access node 64's (NODE_SHIFT=6) node data.
To compensate said change, make memblock_set_node() warn on and adjust a passed in value of MAX_NUMNODES, just like various other functions already do.
Fixes: ff6c3d81f2e8 ("NUMA: optimize detection of memory with no node id assigned by firmware") Signed-off-by: Jan Beulich jbeulich@suse.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1c8a058c-5365-4f27-a9f1-3aeb7fb3e7b2@suse.com Signed-off-by: Mike Rapoport (IBM) rppt@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memblock.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/mm/memblock.c +++ b/mm/memblock.c @@ -1321,6 +1321,10 @@ int __init_memblock memblock_set_node(ph int start_rgn, end_rgn; int i, ret;
+ if (WARN_ONCE(nid == MAX_NUMNODES, + "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n")) + nid = NUMA_NO_NODE; + ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn); if (ret) return ret;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mike Rapoport (IBM) rppt@kernel.org
commit 8043832e2a123fd9372007a29192f2f3ba328cd6 upstream.
Introduce numa_valid_node(nid) that verifies that nid is a valid node ID and use that instead of comparing nid parameter with either NUMA_NO_NODE or MAX_NUMNODES.
This makes the checks for valid node IDs consistent and more robust and allows to get rid of multiple WARNings.
Suggested-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Mike Rapoport (IBM) rppt@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/numa.h | 5 +++++ mm/memblock.c | 28 +++++++--------------------- 2 files changed, 12 insertions(+), 21 deletions(-)
--- a/include/linux/numa.h +++ b/include/linux/numa.h @@ -15,6 +15,11 @@ #define NUMA_NO_NODE (-1) #define NUMA_NO_MEMBLK (-1)
+static inline bool numa_valid_node(int nid) +{ + return nid >= 0 && nid < MAX_NUMNODES; +} + /* optionally keep NUMA memory info available post init */ #ifdef CONFIG_NUMA_KEEP_MEMINFO #define __initdata_or_meminfo --- a/mm/memblock.c +++ b/mm/memblock.c @@ -754,7 +754,7 @@ bool __init_memblock memblock_validate_n
/* calculate lose page */ for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) { - if (nid == NUMA_NO_NODE) + if (!numa_valid_node(nid)) nr_pages += end_pfn - start_pfn; }
@@ -1043,7 +1043,7 @@ static bool should_skip_region(struct me return false;
/* only memory regions are associated with nodes, check it */ - if (nid != NUMA_NO_NODE && nid != m_nid) + if (numa_valid_node(nid) && nid != m_nid) return true;
/* skip hotpluggable memory regions if needed */ @@ -1100,10 +1100,6 @@ void __next_mem_range(u64 *idx, int nid, int idx_a = *idx & 0xffffffff; int idx_b = *idx >> 32;
- if (WARN_ONCE(nid == MAX_NUMNODES, - "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n")) - nid = NUMA_NO_NODE; - for (; idx_a < type_a->cnt; idx_a++) { struct memblock_region *m = &type_a->regions[idx_a];
@@ -1197,9 +1193,6 @@ void __init_memblock __next_mem_range_re int idx_a = *idx & 0xffffffff; int idx_b = *idx >> 32;
- if (WARN_ONCE(nid == MAX_NUMNODES, "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n")) - nid = NUMA_NO_NODE; - if (*idx == (u64)ULLONG_MAX) { idx_a = type_a->cnt - 1; if (type_b != NULL) @@ -1285,7 +1278,7 @@ void __init_memblock __next_mem_pfn_rang
if (PFN_UP(r->base) >= PFN_DOWN(r->base + r->size)) continue; - if (nid == MAX_NUMNODES || nid == r_nid) + if (!numa_valid_node(nid) || nid == r_nid) break; } if (*idx >= type->cnt) { @@ -1321,10 +1314,6 @@ int __init_memblock memblock_set_node(ph int start_rgn, end_rgn; int i, ret;
- if (WARN_ONCE(nid == MAX_NUMNODES, - "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n")) - nid = NUMA_NO_NODE; - ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn); if (ret) return ret; @@ -1434,9 +1423,6 @@ phys_addr_t __init memblock_alloc_range_ enum memblock_flags flags = choose_memblock_flags(); phys_addr_t found;
- if (WARN_ONCE(nid == MAX_NUMNODES, "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n")) - nid = NUMA_NO_NODE; - if (!align) { /* Can't use WARNs this early in boot on powerpc */ dump_stack(); @@ -1449,7 +1435,7 @@ again: if (found && !memblock_reserve(found, size)) goto done;
- if (nid != NUMA_NO_NODE && !exact_nid) { + if (numa_valid_node(nid) && !exact_nid) { found = memblock_find_in_range_node(size, align, start, end, NUMA_NO_NODE, flags); @@ -1969,7 +1955,7 @@ static void __init_memblock memblock_dum end = base + size - 1; flags = rgn->flags; #ifdef CONFIG_NUMA - if (memblock_get_region_node(rgn) != MAX_NUMNODES) + if (numa_valid_node(memblock_get_region_node(rgn))) snprintf(nid_buf, sizeof(nid_buf), " on node %d", memblock_get_region_node(rgn)); #endif @@ -2158,7 +2144,7 @@ static void __init memmap_init_reserved_ start = region->base; end = start + region->size;
- if (nid == NUMA_NO_NODE || nid >= MAX_NUMNODES) + if (!numa_valid_node(nid)) nid = early_pfn_to_nid(PFN_DOWN(start));
reserve_bootmem_region(start, end, nid); @@ -2247,7 +2233,7 @@ static int memblock_debug_show(struct se
seq_printf(m, "%4d: ", i); seq_printf(m, "%pa..%pa ", ®->base, &end); - if (nid != MAX_NUMNODES) + if (numa_valid_node(nid)) seq_printf(m, "%4d ", nid); else seq_printf(m, "%4c ", 'x');
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit ac1e21bd8c883aeac2f1835fc93b39c1e6838b35 ]
Commit '6a3afb6ac6df ("jbd2: increase the journal IO's priority")' increases the priority of journal I/O by marking I/O with the JBD2_JOURNAL_REQ_FLAGS. However, that commit missed the revoke buffers, so also addresses that kind of I/Os.
Fixes: 6a3afb6ac6df ("jbd2: increase the journal IO's priority") Signed-off-by: Zhang Yi yi.zhang@huawei.com Link: https://lore.kernel.org/r/20241203014407.805916-2-yi.zhang@huaweicloud.com Reviewed-by: Kemeng Shi shikemeng@huaweicloud.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jbd2/revoke.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/jbd2/revoke.c b/fs/jbd2/revoke.c index 4556e4689024..ce63d5fde9c3 100644 --- a/fs/jbd2/revoke.c +++ b/fs/jbd2/revoke.c @@ -654,7 +654,7 @@ static void flush_descriptor(journal_t *journal, set_buffer_jwrite(descriptor); BUFFER_TRACE(descriptor, "write"); set_buffer_dirty(descriptor); - write_dirty_buffer(descriptor, REQ_SYNC); + write_dirty_buffer(descriptor, JBD2_JOURNAL_REQ_FLAGS); } #endif
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit a0851ea9cd555c333795b85ddd908898b937c4e1 ]
When committing transaction in jbd2_journal_commit_transaction(), the disk caches for the filesystem device should be flushed before updating the journal tail sequence. However, this step is missed if the journal is not located on the filesystem device. As a result, the filesystem may become inconsistent following a power failure or system crash. Fix it by ensuring that the filesystem device is flushed appropriately.
Fixes: 3339578f0578 ("jbd2: cleanup journal tail after transaction commit") Signed-off-by: Zhang Yi yi.zhang@huawei.com Link: https://lore.kernel.org/r/20241203014407.805916-3-yi.zhang@huaweicloud.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jbd2/commit.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c index 0cd7439470fc..84663ff7dc50 100644 --- a/fs/jbd2/commit.c +++ b/fs/jbd2/commit.c @@ -777,9 +777,9 @@ void jbd2_journal_commit_transaction(journal_t *journal) /* * If the journal is not located on the file system device, * then we must flush the file system device before we issue - * the commit record + * the commit record and update the journal tail sequence. */ - if (commit_transaction->t_need_data_flush && + if ((commit_transaction->t_need_data_flush || update_tail) && (journal->j_fs_dev != journal->j_dev) && (journal->j_flags & JBD2_BARRIER)) blkdev_issue_flush(journal->j_fs_dev);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ming-Hung Tsai mtsai@redhat.com
[ Upstream commit f2893c0804d86230ffb8f1c8703fdbb18648abc8 ]
When dm_bm_read_lock() fails due to locking or checksum errors, it releases the faulty block implicitly while leaving an invalid output pointer behind. The caller of dm_bm_read_lock() should not operate on this invalid dm_block pointer, or it will lead to undefined result. For example, the dm_array_cursor incorrectly caches the invalid pointer on reading a faulty array block, causing a double release in dm_array_cursor_end(), then hitting the BUG_ON in dm-bufio cache_put().
Reproduce steps:
1. initialize a cache device
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0" dmsetup create cdata --table "0 65536 linear /dev/sdc 8192" dmsetup create corig --table "0 524288 linear /dev/sdc $262144" dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \ /dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
2. wipe the second array block offline
dmsteup remove cache cmeta cdata corig mapping_root=$(dd if=/dev/sdc bs=1c count=8 skip=192 \ 2>/dev/null | hexdump -e '1/8 "%u\n"') ablock=$(dd if=/dev/sdc bs=1c count=8 skip=$((4096*mapping_root+2056)) \ 2>/dev/null | hexdump -e '1/8 "%u\n"') dd if=/dev/zero of=/dev/sdc bs=4k count=1 seek=$ablock
3. try reopen the cache device
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0" dmsetup create cdata --table "0 65536 linear /dev/sdc 8192" dmsetup create corig --table "0 524288 linear /dev/sdc $262144" dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \ /dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
Kernel logs:
(snip) device-mapper: array: array_block_check failed: blocknr 0 != wanted 10 device-mapper: block manager: array validator check failed for block 10 device-mapper: array: get_ablock failed device-mapper: cache metadata: dm_array_cursor_next for mapping failed ------------[ cut here ]------------ kernel BUG at drivers/md/dm-bufio.c:638!
Fix by setting the cached block pointer to NULL on errors.
In addition to the reproducer described above, this fix can be verified using the "array_cursor/damaged" test in dm-unit: dm-unit run /pdata/array_cursor/damaged --kernel-dir <KERNEL_DIR>
Signed-off-by: Ming-Hung Tsai mtsai@redhat.com Fixes: fdd1315aa5f0 ("dm array: introduce cursor api") Reviewed-by: Joe Thornber thornber@redhat.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/persistent-data/dm-array.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/md/persistent-data/dm-array.c b/drivers/md/persistent-data/dm-array.c index 798c9c53a343..de303ba33857 100644 --- a/drivers/md/persistent-data/dm-array.c +++ b/drivers/md/persistent-data/dm-array.c @@ -917,23 +917,27 @@ static int load_ablock(struct dm_array_cursor *c) if (c->block) unlock_ablock(c->info, c->block);
- c->block = NULL; - c->ab = NULL; c->index = 0;
r = dm_btree_cursor_get_value(&c->cursor, &key, &value_le); if (r) { DMERR("dm_btree_cursor_get_value failed"); - dm_btree_cursor_end(&c->cursor); + goto out;
} else { r = get_ablock(c->info, le64_to_cpu(value_le), &c->block, &c->ab); if (r) { DMERR("get_ablock failed"); - dm_btree_cursor_end(&c->cursor); + goto out; } }
+ return 0; + +out: + dm_btree_cursor_end(&c->cursor); + c->block = NULL; + c->ab = NULL; return r; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ming-Hung Tsai mtsai@redhat.com
[ Upstream commit 626f128ee9c4133b1cfce4be2b34a1508949370e ]
The cached block pointer in dm_array_cursor might be NULL if it reaches an unreadable array block, or the array is empty. Therefore, dm_array_cursor_end() should call dm_btree_cursor_end() unconditionally, to prevent leaving unreleased btree blocks.
This fix can be verified using the "array_cursor/iterate/empty" test in dm-unit: dm-unit run /pdata/array_cursor/iterate/empty --kernel-dir <KERNEL_DIR>
Signed-off-by: Ming-Hung Tsai mtsai@redhat.com Fixes: fdd1315aa5f0 ("dm array: introduce cursor api") Reviewed-by: Joe Thornber thornber@redhat.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/persistent-data/dm-array.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/md/persistent-data/dm-array.c b/drivers/md/persistent-data/dm-array.c index de303ba33857..788a31cae187 100644 --- a/drivers/md/persistent-data/dm-array.c +++ b/drivers/md/persistent-data/dm-array.c @@ -960,10 +960,10 @@ EXPORT_SYMBOL_GPL(dm_array_cursor_begin);
void dm_array_cursor_end(struct dm_array_cursor *c) { - if (c->block) { + if (c->block) unlock_ablock(c->info, c->block); - dm_btree_cursor_end(&c->cursor); - } + + dm_btree_cursor_end(&c->cursor); } EXPORT_SYMBOL_GPL(dm_array_cursor_end);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ming-Hung Tsai mtsai@redhat.com
[ Upstream commit 0bb1968da2737ba68fd63857d1af2b301a18d3bf ]
dm_array_cursor_skip() seeks to the target position by loading array blocks iteratively until the specified number of entries to skip is reached. When seeking across block boundaries, it uses dm_array_cursor_next() to step into the next block. dm_array_cursor_skip() must first move the cursor index to the end of the current block; otherwise, the cursor position could incorrectly remain in the same block, causing the actual number of skipped entries to be much smaller than expected.
This bug affects cache resizing in v2 metadata and could lead to data loss if the fast device is shrunk during the first-time resume. For example:
1. create a cache metadata consists of 32768 blocks, with a dirty block assigned to the second bitmap block. cache_restore v1.0 is required.
cat <<EOF >> cmeta.xml <superblock uuid="" block_size="64" nr_cache_blocks="32768" \ policy="smq" hint_width="4"> <mappings> <mapping cache_block="32767" origin_block="0" dirty="true"/> </mappings> </superblock> EOF dmsetup create cmeta --table "0 8192 linear /dev/sdc 0" cache_restore -i cmeta.xml -o /dev/mapper/cmeta --metadata-version=2
2. bring up the cache while attempt to discard all the blocks belonging to the second bitmap block (block# 32576 to 32767). The last command is expected to fail, but it actually succeeds.
dmsetup create cdata --table "0 2084864 linear /dev/sdc 8192" dmsetup create corig --table "0 65536 linear /dev/sdc 2105344" dmsetup create cache --table "0 65536 cache /dev/mapper/cmeta \ /dev/mapper/cdata /dev/mapper/corig 64 2 metadata2 writeback smq \ 2 migration_threshold 0"
In addition to the reproducer described above, this fix can be verified using the "array_cursor/skip" tests in dm-unit: dm-unit run /pdata/array_cursor/skip/ --kernel-dir <KERNEL_DIR>
Signed-off-by: Ming-Hung Tsai mtsai@redhat.com Fixes: 9b696229aa7d ("dm persistent data: add cursor skip functions to the cursor APIs") Reviewed-by: Joe Thornber thornber@redhat.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/persistent-data/dm-array.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/md/persistent-data/dm-array.c b/drivers/md/persistent-data/dm-array.c index 788a31cae187..b1fdfe53e937 100644 --- a/drivers/md/persistent-data/dm-array.c +++ b/drivers/md/persistent-data/dm-array.c @@ -1003,6 +1003,7 @@ int dm_array_cursor_skip(struct dm_array_cursor *c, uint32_t count) }
count -= remaining; + c->index += (remaining - 1); r = dm_array_cursor_next(c);
} while (!r);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yuezhang Mo Yuezhang.Mo@sony.com
[ Upstream commit fee873761bd978d077d8c55334b4966ac4cb7b59 ]
If the file system is corrupted so that a cluster is linked to itself in the cluster chain, and there is an unused directory entry in the cluster, 'dentry' will not be incremented, causing condition 'dentry < max_dentries' unable to prevent an infinite loop.
This infinite loop causes s_lock not to be released, and other tasks will hang, such as exfat_sync_fs().
This commit stops traversing the cluster chain when there is unused directory entry in the cluster to avoid this infinite loop.
Reported-by: syzbot+205c2644abdff9d3f9fc@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=205c2644abdff9d3f9fc Tested-by: syzbot+205c2644abdff9d3f9fc@syzkaller.appspotmail.com Fixes: ca06197382bd ("exfat: add directory operations") Signed-off-by: Yuezhang Mo Yuezhang.Mo@sony.com Reviewed-by: Sungjong Seo sj1557.seo@samsung.com Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/exfat/dir.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/exfat/dir.c b/fs/exfat/dir.c index 7a715016b96f..f4f81e349cef 100644 --- a/fs/exfat/dir.c +++ b/fs/exfat/dir.c @@ -125,7 +125,7 @@ static int exfat_readdir(struct inode *inode, loff_t *cpos, struct exfat_dir_ent type = exfat_get_entry_type(ep); if (type == TYPE_UNUSED) { brelse(bh); - break; + goto out; }
if (type != TYPE_FILE && type != TYPE_DIR) { @@ -189,6 +189,7 @@ static int exfat_readdir(struct inode *inode, loff_t *cpos, struct exfat_dir_ent } }
+out: dir_entry->namebuf.lfn[0] = '\0'; *cpos = EXFAT_DEN_TO_B(dentry); return 0;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yuezhang Mo Yuezhang.Mo@sony.com
[ Upstream commit a5324b3a488d883aa2d42f72260054e87d0940a0 ]
In __exfat_free_cluster(), the cluster chain is traversed until the EOF cluster. If the cluster chain includes a loop due to file system corruption, the EOF cluster cannot be traversed, resulting in an infinite loop.
This commit uses the total number of clusters to prevent this infinite loop.
Reported-by: syzbot+1de5a37cb85a2d536330@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=1de5a37cb85a2d536330 Tested-by: syzbot+1de5a37cb85a2d536330@syzkaller.appspotmail.com Fixes: 31023864e67a ("exfat: add fat entry operations") Signed-off-by: Yuezhang Mo Yuezhang.Mo@sony.com Reviewed-by: Sungjong Seo sj1557.seo@samsung.com Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/exfat/fatent.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/fs/exfat/fatent.c b/fs/exfat/fatent.c index 56b870d9cc0d..428d862a1d2b 100644 --- a/fs/exfat/fatent.c +++ b/fs/exfat/fatent.c @@ -216,6 +216,16 @@ static int __exfat_free_cluster(struct inode *inode, struct exfat_chain *p_chain
if (err) goto dec_used_clus; + + if (num_clusters >= sbi->num_clusters - EXFAT_FIRST_CLUSTER) { + /* + * The cluster chain includes a loop, scan the + * bitmap to get the number of used clusters. + */ + exfat_count_used_clusters(sb, &sbi->used_clusters); + + return 0; + } } while (clu != EXFAT_EOF_CLUSTER); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
When lower fs is a nested overlayfs, calling encode_fh() on a lower directory dentry may trigger copy up and take sb_writers on the upper fs of the lower nested overlayfs.
The lower nested overlayfs may have the same upper fs as this overlayfs, so nested sb_writers lock is illegal.
Move all the callers that encode lower fh to before ovl_want_write().
Signed-off-by: Amir Goldstein amir73il@gmail.com Stable-dep-of: c45beebfde34 ("ovl: support encoding fid from inode with no alias") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/overlayfs/copy_up.c | 53 +++++++++++++++++++++++++--------------- fs/overlayfs/namei.c | 37 +++++++++++++++++++++------- fs/overlayfs/overlayfs.h | 26 ++++++++++++++------ fs/overlayfs/super.c | 20 ++++++++++----- fs/overlayfs/util.c | 10 ++++++++ 5 files changed, 104 insertions(+), 42 deletions(-)
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c index ada3fcc9c6d5..5c9af24bae4a 100644 --- a/fs/overlayfs/copy_up.c +++ b/fs/overlayfs/copy_up.c @@ -426,29 +426,29 @@ struct ovl_fh *ovl_encode_real_fh(struct ovl_fs *ofs, struct dentry *real, return ERR_PTR(err); }
-int ovl_set_origin(struct ovl_fs *ofs, struct dentry *lower, - struct dentry *upper) +struct ovl_fh *ovl_get_origin_fh(struct ovl_fs *ofs, struct dentry *origin) { - const struct ovl_fh *fh = NULL; - int err; - /* * When lower layer doesn't support export operations store a 'null' fh, * so we can use the overlay.origin xattr to distignuish between a copy * up and a pure upper inode. */ - if (ovl_can_decode_fh(lower->d_sb)) { - fh = ovl_encode_real_fh(ofs, lower, false); - if (IS_ERR(fh)) - return PTR_ERR(fh); - } + if (!ovl_can_decode_fh(origin->d_sb)) + return NULL; + + return ovl_encode_real_fh(ofs, origin, false); +} + +int ovl_set_origin_fh(struct ovl_fs *ofs, const struct ovl_fh *fh, + struct dentry *upper) +{ + int err;
/* * Do not fail when upper doesn't support xattrs. */ err = ovl_check_setxattr(ofs, upper, OVL_XATTR_ORIGIN, fh->buf, fh ? fh->fb.len : 0, 0); - kfree(fh);
/* Ignore -EPERM from setting "user.*" on symlink/special */ return err == -EPERM ? 0 : err; @@ -476,7 +476,7 @@ static int ovl_set_upper_fh(struct ovl_fs *ofs, struct dentry *upper, * * Caller must hold i_mutex on indexdir. */ -static int ovl_create_index(struct dentry *dentry, struct dentry *origin, +static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh, struct dentry *upper) { struct ovl_fs *ofs = OVL_FS(dentry->d_sb); @@ -502,7 +502,7 @@ static int ovl_create_index(struct dentry *dentry, struct dentry *origin, if (WARN_ON(ovl_test_flag(OVL_INDEX, d_inode(dentry)))) return -EIO;
- err = ovl_get_index_name(ofs, origin, &name); + err = ovl_get_index_name_fh(fh, &name); if (err) return err;
@@ -541,6 +541,7 @@ struct ovl_copy_up_ctx { struct dentry *destdir; struct qstr destname; struct dentry *workdir; + const struct ovl_fh *origin_fh; bool origin; bool indexed; bool metacopy; @@ -637,7 +638,7 @@ static int ovl_copy_up_metadata(struct ovl_copy_up_ctx *c, struct dentry *temp) * hard link. */ if (c->origin) { - err = ovl_set_origin(ofs, c->lowerpath.dentry, temp); + err = ovl_set_origin_fh(ofs, c->origin_fh, temp); if (err) return err; } @@ -749,7 +750,7 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c) goto cleanup;
if (S_ISDIR(c->stat.mode) && c->indexed) { - err = ovl_create_index(c->dentry, c->lowerpath.dentry, temp); + err = ovl_create_index(c->dentry, c->origin_fh, temp); if (err) goto cleanup; } @@ -861,6 +862,8 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c) { int err; struct ovl_fs *ofs = OVL_FS(c->dentry->d_sb); + struct dentry *origin = c->lowerpath.dentry; + struct ovl_fh *fh = NULL; bool to_index = false;
/* @@ -877,17 +880,25 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c) to_index = true; }
- if (S_ISDIR(c->stat.mode) || c->stat.nlink == 1 || to_index) + if (S_ISDIR(c->stat.mode) || c->stat.nlink == 1 || to_index) { + fh = ovl_get_origin_fh(ofs, origin); + if (IS_ERR(fh)) + return PTR_ERR(fh); + + /* origin_fh may be NULL */ + c->origin_fh = fh; c->origin = true; + }
if (to_index) { c->destdir = ovl_indexdir(c->dentry->d_sb); - err = ovl_get_index_name(ofs, c->lowerpath.dentry, &c->destname); + err = ovl_get_index_name(ofs, origin, &c->destname); if (err) - return err; + goto out_free_fh; } else if (WARN_ON(!c->parent)) { /* Disconnected dentry must be copied up to index dir */ - return -EIO; + err = -EIO; + goto out_free_fh; } else { /* * Mark parent "impure" because it may now contain non-pure @@ -895,7 +906,7 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c) */ err = ovl_set_impure(c->parent, c->destdir); if (err) - return err; + goto out_free_fh; }
/* Should we copyup with O_TMPFILE or with workdir? */ @@ -927,6 +938,8 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c) out: if (to_index) kfree(c->destname.name); +out_free_fh: + kfree(fh); return err; }
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c index 80391c687c2a..f10ac4ae35f0 100644 --- a/fs/overlayfs/namei.c +++ b/fs/overlayfs/namei.c @@ -507,6 +507,19 @@ static int ovl_verify_fh(struct ovl_fs *ofs, struct dentry *dentry, return err; }
+int ovl_verify_set_fh(struct ovl_fs *ofs, struct dentry *dentry, + enum ovl_xattr ox, const struct ovl_fh *fh, + bool is_upper, bool set) +{ + int err; + + err = ovl_verify_fh(ofs, dentry, ox, fh); + if (set && err == -ENODATA) + err = ovl_setxattr(ofs, dentry, ox, fh->buf, fh->fb.len); + + return err; +} + /* * Verify that @real dentry matches the file handle stored in xattr @name. * @@ -515,9 +528,9 @@ static int ovl_verify_fh(struct ovl_fs *ofs, struct dentry *dentry, * * Return 0 on match, -ESTALE on mismatch, -ENODATA on no xattr, < 0 on error. */ -int ovl_verify_set_fh(struct ovl_fs *ofs, struct dentry *dentry, - enum ovl_xattr ox, struct dentry *real, bool is_upper, - bool set) +int ovl_verify_origin_xattr(struct ovl_fs *ofs, struct dentry *dentry, + enum ovl_xattr ox, struct dentry *real, + bool is_upper, bool set) { struct inode *inode; struct ovl_fh *fh; @@ -530,9 +543,7 @@ int ovl_verify_set_fh(struct ovl_fs *ofs, struct dentry *dentry, goto fail; }
- err = ovl_verify_fh(ofs, dentry, ox, fh); - if (set && err == -ENODATA) - err = ovl_setxattr(ofs, dentry, ox, fh->buf, fh->fb.len); + err = ovl_verify_set_fh(ofs, dentry, ox, fh, is_upper, set); if (err) goto fail;
@@ -548,6 +559,7 @@ int ovl_verify_set_fh(struct ovl_fs *ofs, struct dentry *dentry, goto out; }
+ /* Get upper dentry from index */ struct dentry *ovl_index_upper(struct ovl_fs *ofs, struct dentry *index, bool connected) @@ -684,7 +696,7 @@ int ovl_verify_index(struct ovl_fs *ofs, struct dentry *index) goto out; }
-static int ovl_get_index_name_fh(struct ovl_fh *fh, struct qstr *name) +int ovl_get_index_name_fh(const struct ovl_fh *fh, struct qstr *name) { char *n, *s;
@@ -873,20 +885,27 @@ int ovl_path_next(int idx, struct dentry *dentry, struct path *path) static int ovl_fix_origin(struct ovl_fs *ofs, struct dentry *dentry, struct dentry *lower, struct dentry *upper) { + const struct ovl_fh *fh; int err;
if (ovl_check_origin_xattr(ofs, upper)) return 0;
+ fh = ovl_get_origin_fh(ofs, lower); + if (IS_ERR(fh)) + return PTR_ERR(fh); + err = ovl_want_write(dentry); if (err) - return err; + goto out;
- err = ovl_set_origin(ofs, lower, upper); + err = ovl_set_origin_fh(ofs, fh, upper); if (!err) err = ovl_set_impure(dentry->d_parent, upper->d_parent);
ovl_drop_write(dentry); +out: + kfree(fh); return err; }
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h index 09ca82ed0f8c..61e03d664d7d 100644 --- a/fs/overlayfs/overlayfs.h +++ b/fs/overlayfs/overlayfs.h @@ -632,11 +632,15 @@ struct dentry *ovl_decode_real_fh(struct ovl_fs *ofs, struct ovl_fh *fh, int ovl_check_origin_fh(struct ovl_fs *ofs, struct ovl_fh *fh, bool connected, struct dentry *upperdentry, struct ovl_path **stackp); int ovl_verify_set_fh(struct ovl_fs *ofs, struct dentry *dentry, - enum ovl_xattr ox, struct dentry *real, bool is_upper, - bool set); + enum ovl_xattr ox, const struct ovl_fh *fh, + bool is_upper, bool set); +int ovl_verify_origin_xattr(struct ovl_fs *ofs, struct dentry *dentry, + enum ovl_xattr ox, struct dentry *real, + bool is_upper, bool set); struct dentry *ovl_index_upper(struct ovl_fs *ofs, struct dentry *index, bool connected); int ovl_verify_index(struct ovl_fs *ofs, struct dentry *index); +int ovl_get_index_name_fh(const struct ovl_fh *fh, struct qstr *name); int ovl_get_index_name(struct ovl_fs *ofs, struct dentry *origin, struct qstr *name); struct dentry *ovl_get_index_fh(struct ovl_fs *ofs, struct ovl_fh *fh); @@ -648,17 +652,24 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags); bool ovl_lower_positive(struct dentry *dentry);
+static inline int ovl_verify_origin_fh(struct ovl_fs *ofs, struct dentry *upper, + const struct ovl_fh *fh, bool set) +{ + return ovl_verify_set_fh(ofs, upper, OVL_XATTR_ORIGIN, fh, false, set); +} + static inline int ovl_verify_origin(struct ovl_fs *ofs, struct dentry *upper, struct dentry *origin, bool set) { - return ovl_verify_set_fh(ofs, upper, OVL_XATTR_ORIGIN, origin, - false, set); + return ovl_verify_origin_xattr(ofs, upper, OVL_XATTR_ORIGIN, origin, + false, set); }
static inline int ovl_verify_upper(struct ovl_fs *ofs, struct dentry *index, struct dentry *upper, bool set) { - return ovl_verify_set_fh(ofs, index, OVL_XATTR_UPPER, upper, true, set); + return ovl_verify_origin_xattr(ofs, index, OVL_XATTR_UPPER, upper, + true, set); }
/* readdir.c */ @@ -823,8 +834,9 @@ int ovl_copy_xattr(struct super_block *sb, const struct path *path, struct dentr int ovl_set_attr(struct ovl_fs *ofs, struct dentry *upper, struct kstat *stat); struct ovl_fh *ovl_encode_real_fh(struct ovl_fs *ofs, struct dentry *real, bool is_upper); -int ovl_set_origin(struct ovl_fs *ofs, struct dentry *lower, - struct dentry *upper); +struct ovl_fh *ovl_get_origin_fh(struct ovl_fs *ofs, struct dentry *origin); +int ovl_set_origin_fh(struct ovl_fs *ofs, const struct ovl_fh *fh, + struct dentry *upper);
/* export.c */ extern const struct export_operations ovl_export_operations; diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c index 2c056d737c27..e2574034c3fa 100644 --- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -879,15 +879,20 @@ static int ovl_get_indexdir(struct super_block *sb, struct ovl_fs *ofs, { struct vfsmount *mnt = ovl_upper_mnt(ofs); struct dentry *indexdir; + struct dentry *origin = ovl_lowerstack(oe)->dentry; + const struct ovl_fh *fh; int err;
+ fh = ovl_get_origin_fh(ofs, origin); + if (IS_ERR(fh)) + return PTR_ERR(fh); + err = mnt_want_write(mnt); if (err) - return err; + goto out_free_fh;
/* Verify lower root is upper root origin */ - err = ovl_verify_origin(ofs, upperpath->dentry, - ovl_lowerstack(oe)->dentry, true); + err = ovl_verify_origin_fh(ofs, upperpath->dentry, fh, true); if (err) { pr_err("failed to verify upper root origin\n"); goto out; @@ -919,9 +924,10 @@ static int ovl_get_indexdir(struct super_block *sb, struct ovl_fs *ofs, * directory entries. */ if (ovl_check_origin_xattr(ofs, ofs->indexdir)) { - err = ovl_verify_set_fh(ofs, ofs->indexdir, - OVL_XATTR_ORIGIN, - upperpath->dentry, true, false); + err = ovl_verify_origin_xattr(ofs, ofs->indexdir, + OVL_XATTR_ORIGIN, + upperpath->dentry, true, + false); if (err) pr_err("failed to verify index dir 'origin' xattr\n"); } @@ -939,6 +945,8 @@ static int ovl_get_indexdir(struct super_block *sb, struct ovl_fs *ofs,
out: mnt_drop_write(mnt); +out_free_fh: + kfree(fh); return err; }
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c index 0bf3ffcd072f..4e6b747e0f2e 100644 --- a/fs/overlayfs/util.c +++ b/fs/overlayfs/util.c @@ -976,12 +976,18 @@ static void ovl_cleanup_index(struct dentry *dentry) struct dentry *index = NULL; struct inode *inode; struct qstr name = { }; + bool got_write = false; int err;
err = ovl_get_index_name(ofs, lowerdentry, &name); if (err) goto fail;
+ err = ovl_want_write(dentry); + if (err) + goto fail; + + got_write = true; inode = d_inode(upperdentry); if (!S_ISDIR(inode->i_mode) && inode->i_nlink != 1) { pr_warn_ratelimited("cleanup linked index (%pd2, ino=%lu, nlink=%u)\n", @@ -1019,6 +1025,8 @@ static void ovl_cleanup_index(struct dentry *dentry) goto fail;
out: + if (got_write) + ovl_drop_write(dentry); kfree(name.name); dput(index); return; @@ -1089,6 +1097,8 @@ void ovl_nlink_end(struct dentry *dentry) { struct inode *inode = d_inode(dentry);
+ ovl_drop_write(dentry); + if (ovl_test_flag(OVL_INDEX, inode) && inode->i_nlink == 0) { const struct cred *old_cred;
On 15 Jan 2025, at 10:36, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
When lower fs is a nested overlayfs, calling encode_fh() on a lower directory dentry may trigger copy up and take sb_writers on the upper fs of the lower nested overlayfs.
The lower nested overlayfs may have the same upper fs as this overlayfs, so nested sb_writers lock is illegal.
Move all the callers that encode lower fh to before ovl_want_write().
Signed-off-by: Amir Goldstein amir73il@gmail.com Stable-dep-of: c45beebfde34 ("ovl: support encoding fid from inode with no alias") Signed-off-by: Sasha Levin sashal@kernel.org
Hi,
This patch seems to trigger the following warning on 6.6.72, when running simple “$ docker run --rm -it debian” (creating a container):
------------[ cut here ]------------ WARNING: CPU: 12 PID: 668 at fs/namespace.c:1245 cleanup_mnt+0x130/0x150 Modules linked in: xt_conntrack(E) nft_chain_nat(E) xt_MASQUERADE(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bridge(E) stp(E) llc(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) nft_compat(E) nf_tables(E) overlay(E) kvm_amd(E) ccp(E) kvm(E) irqbypass(E) crc32_pclmul(E) sha512_ssse3(E) sha256_ssse3(E) sha1_ssse3(E) aesni_intel(E) crypto_simd(E) cryptd(E) iTCO_wdt(E) virtio_console(E) virtio_balloon(E) iTCO_vendor_support(E) tiny_power_button(E) button(E) sch_fq_codel(E) fuse(E) nfnetlink(E) vsock_loopback(E) vmw_vsock_virtio_transport_common(E) vsock(E) efivarfs(E) ip_tables(E) x_tables(E) virtio_net(E) net_failover(E) virtio_blk(E) virtio_scsi(E) failover(E) crc32c_intel(E) i2c_i801(E) virtio_pci(E) virtio_pci_legacy_dev(E) i2c_smbus(E) lpc_ich(E) virtio_pci_modern_dev(E) mfd_core(E) virtio(E) virtio_ring(E) CPU: 12 PID: 668 Comm: dockerd Tainted: G E 6.6.71+ #18 Hardware name: KubeVirt None/RHEL, BIOS edk2-20230524-3.el9 05/24/2023 RIP: 0010:cleanup_mnt+0x130/0x150 Code: 2c 01 00 00 85 c0 75 16 e8 6d fb ff ff eb 8a c7 87 2c 01 00 00 00 00 00 00 e9 6a ff ff ff c7 87 2c 01 00 00 00 00 00 00 eb de <0f> 0b 48 83 bd 30 01 00 00 00 0f 84 e9 fe ff ff 48 89 ef e8 18 e7 RSP: 0018:ffffc9000095fec8 EFLAGS: 00010282 RAX: 00000000fffffffe RBX: 0000000000000000 RCX: 0000000000000010 RDX: 0000000000000010 RSI: 0000000000000010 RDI: 0000000000000010 RBP: ffff888109ea57c0 R08: ffffffffbc27ab60 R09: 0000000000000000 R10: 0000000000037420 R11: 0000000000000000 R12: ffff88810acba9bc R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f1041ffb6c0(0000) GS:ffff88903fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c000b7f02f CR3: 00000001034ca002 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace: <TASK> ? cleanup_mnt+0x130/0x150 ? __warn+0x81/0x130 ? cleanup_mnt+0x130/0x150 ? report_bug+0x16f/0x1a0 ? handle_bug+0x53/0x90 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? cleanup_mnt+0x130/0x150 ? cleanup_mnt+0x13/0x150 task_work_run+0x5d/0x90 exit_to_user_mode_prepare+0xf8/0x100 syscall_exit_to_user_mode+0x21/0x40 ? srso_alias_return_thunk+0x5/0xfbef5 do_syscall_64+0x45/0x90 entry_SYSCALL_64_after_hwframe+0x60/0xca RIP: 0033:0x55d0e0726dee Code: 48 83 ec 38 e8 13 00 00 00 48 83 c4 38 5d c3 cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48 RSP: 002b:000000c000145a10 EFLAGS: 00000216 ORIG_RAX: 00000000000000a6 RAX: 0000000000000000 RBX: 000000c000b7fce0 RCX: 000055d0e0726dee RDX: 0000000000000000 RSI: 0000000000000002 RDI: 000000c000b7fce0 RBP: 000000c000145a50 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000216 R12: 000000c000b7fce0 R13: 0000000000000000 R14: 000000c000b06e00 R15: 1fffffffffffffff </TASK> ---[ end trace 0000000000000000 ]—
This commit was pointed by my bisecting 6.6.71..6.6.72, but to double-check it I had to revert the following commits to make 6.6.72 compile and not exhibit the issue:
* a3f8a2b13a277d942c810d2ccc654d5bc824a430 (“ovl: pass realinode to ovl_encode_real_fh() instead of realdentry ”) [ Upstream commit 07aeefae7ff44d80524375253980b1bdee2396b0 ] * 26423e18cd6f709ca4fe7194c29c11658cd0cdd0 (“ovl: do not encode lower fh with upper sb_writers held”) [ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ] * a1a541fbfa7e97c1100144db34b57553d7164ce5 ("ovl: support encoding fid from inode with no alias”) [ Upstream commit c45beebfde34aa71afbc48b2c54cdda623515037 ]
I can also confirm we don’t see this warning on the latest 6.12.10 release, so perhaps we have missed some dependencies in 6.6?
Ignat
fs/overlayfs/copy_up.c | 53 +++++++++++++++++++++++++--------------- fs/overlayfs/namei.c | 37 +++++++++++++++++++++------- fs/overlayfs/overlayfs.h | 26 ++++++++++++++------ fs/overlayfs/super.c | 20 ++++++++++----- fs/overlayfs/util.c | 10 ++++++++ 5 files changed, 104 insertions(+), 42 deletions(-)
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c index ada3fcc9c6d5..5c9af24bae4a 100644 --- a/fs/overlayfs/copy_up.c +++ b/fs/overlayfs/copy_up.c @@ -426,29 +426,29 @@ struct ovl_fh *ovl_encode_real_fh(struct ovl_fs *ofs, struct dentry *real, return ERR_PTR(err); }
-int ovl_set_origin(struct ovl_fs *ofs, struct dentry *lower,
- struct dentry *upper)
+struct ovl_fh *ovl_get_origin_fh(struct ovl_fs *ofs, struct dentry *origin) {
- const struct ovl_fh *fh = NULL;
- int err;
/*
- When lower layer doesn't support export operations store a 'null' fh,
- so we can use the overlay.origin xattr to distignuish between a copy
- up and a pure upper inode.
*/
- if (ovl_can_decode_fh(lower->d_sb)) {
- fh = ovl_encode_real_fh(ofs, lower, false);
- if (IS_ERR(fh))
- return PTR_ERR(fh);
- }
- if (!ovl_can_decode_fh(origin->d_sb))
- return NULL;
- return ovl_encode_real_fh(ofs, origin, false);
+}
+int ovl_set_origin_fh(struct ovl_fs *ofs, const struct ovl_fh *fh,
struct dentry *upper)
+{
- int err;
/*
- Do not fail when upper doesn't support xattrs.
*/ err = ovl_check_setxattr(ofs, upper, OVL_XATTR_ORIGIN, fh->buf, fh ? fh->fb.len : 0, 0);
- kfree(fh);
/* Ignore -EPERM from setting "user.*" on symlink/special */ return err == -EPERM ? 0 : err; @@ -476,7 +476,7 @@ static int ovl_set_upper_fh(struct ovl_fs *ofs, struct dentry *upper,
- Caller must hold i_mutex on indexdir.
*/ -static int ovl_create_index(struct dentry *dentry, struct dentry *origin, +static int ovl_create_index(struct dentry *dentry, const struct ovl_fh *fh, struct dentry *upper) { struct ovl_fs *ofs = OVL_FS(dentry->d_sb); @@ -502,7 +502,7 @@ static int ovl_create_index(struct dentry *dentry, struct dentry *origin, if (WARN_ON(ovl_test_flag(OVL_INDEX, d_inode(dentry)))) return -EIO;
- err = ovl_get_index_name(ofs, origin, &name);
- err = ovl_get_index_name_fh(fh, &name);
if (err) return err;
@@ -541,6 +541,7 @@ struct ovl_copy_up_ctx { struct dentry *destdir; struct qstr destname; struct dentry *workdir;
- const struct ovl_fh *origin_fh;
bool origin; bool indexed; bool metacopy; @@ -637,7 +638,7 @@ static int ovl_copy_up_metadata(struct ovl_copy_up_ctx *c, struct dentry *temp)
- hard link.
*/ if (c->origin) {
- err = ovl_set_origin(ofs, c->lowerpath.dentry, temp);
- err = ovl_set_origin_fh(ofs, c->origin_fh, temp);
if (err) return err; } @@ -749,7 +750,7 @@ static int ovl_copy_up_workdir(struct ovl_copy_up_ctx *c) goto cleanup;
if (S_ISDIR(c->stat.mode) && c->indexed) {
- err = ovl_create_index(c->dentry, c->lowerpath.dentry, temp);
- err = ovl_create_index(c->dentry, c->origin_fh, temp);
if (err) goto cleanup; } @@ -861,6 +862,8 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c) { int err; struct ovl_fs *ofs = OVL_FS(c->dentry->d_sb);
- struct dentry *origin = c->lowerpath.dentry;
- struct ovl_fh *fh = NULL;
bool to_index = false;
/* @@ -877,17 +880,25 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c) to_index = true; }
- if (S_ISDIR(c->stat.mode) || c->stat.nlink == 1 || to_index)
- if (S_ISDIR(c->stat.mode) || c->stat.nlink == 1 || to_index) {
- fh = ovl_get_origin_fh(ofs, origin);
- if (IS_ERR(fh))
- return PTR_ERR(fh);
- /* origin_fh may be NULL */
- c->origin_fh = fh;
c->origin = true;
- }
if (to_index) { c->destdir = ovl_indexdir(c->dentry->d_sb);
- err = ovl_get_index_name(ofs, c->lowerpath.dentry, &c->destname);
- err = ovl_get_index_name(ofs, origin, &c->destname);
if (err)
- return err;
- goto out_free_fh;
} else if (WARN_ON(!c->parent)) { /* Disconnected dentry must be copied up to index dir */
- return -EIO;
- err = -EIO;
- goto out_free_fh;
} else { /*
- Mark parent "impure" because it may now contain non-pure
@@ -895,7 +906,7 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c) */ err = ovl_set_impure(c->parent, c->destdir); if (err)
- return err;
- goto out_free_fh;
}
/* Should we copyup with O_TMPFILE or with workdir? */ @@ -927,6 +938,8 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c) out: if (to_index) kfree(c->destname.name); +out_free_fh:
- kfree(fh);
return err; }
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c index 80391c687c2a..f10ac4ae35f0 100644 --- a/fs/overlayfs/namei.c +++ b/fs/overlayfs/namei.c @@ -507,6 +507,19 @@ static int ovl_verify_fh(struct ovl_fs *ofs, struct dentry *dentry, return err; }
+int ovl_verify_set_fh(struct ovl_fs *ofs, struct dentry *dentry,
enum ovl_xattr ox, const struct ovl_fh *fh,
bool is_upper, bool set)
+{
- int err;
- err = ovl_verify_fh(ofs, dentry, ox, fh);
- if (set && err == -ENODATA)
- err = ovl_setxattr(ofs, dentry, ox, fh->buf, fh->fb.len);
- return err;
+}
/*
- Verify that @real dentry matches the file handle stored in xattr @name.
@@ -515,9 +528,9 @@ static int ovl_verify_fh(struct ovl_fs *ofs, struct dentry *dentry,
- Return 0 on match, -ESTALE on mismatch, -ENODATA on no xattr, < 0 on error.
*/ -int ovl_verify_set_fh(struct ovl_fs *ofs, struct dentry *dentry,
enum ovl_xattr ox, struct dentry *real, bool is_upper,
bool set)
+int ovl_verify_origin_xattr(struct ovl_fs *ofs, struct dentry *dentry,
- enum ovl_xattr ox, struct dentry *real,
- bool is_upper, bool set)
{ struct inode *inode; struct ovl_fh *fh; @@ -530,9 +543,7 @@ int ovl_verify_set_fh(struct ovl_fs *ofs, struct dentry *dentry, goto fail; }
- err = ovl_verify_fh(ofs, dentry, ox, fh);
- if (set && err == -ENODATA)
- err = ovl_setxattr(ofs, dentry, ox, fh->buf, fh->fb.len);
- err = ovl_verify_set_fh(ofs, dentry, ox, fh, is_upper, set);
if (err) goto fail;
@@ -548,6 +559,7 @@ int ovl_verify_set_fh(struct ovl_fs *ofs, struct dentry *dentry, goto out; }
/* Get upper dentry from index */ struct dentry *ovl_index_upper(struct ovl_fs *ofs, struct dentry *index, bool connected) @@ -684,7 +696,7 @@ int ovl_verify_index(struct ovl_fs *ofs, struct dentry *index) goto out; }
-static int ovl_get_index_name_fh(struct ovl_fh *fh, struct qstr *name) +int ovl_get_index_name_fh(const struct ovl_fh *fh, struct qstr *name) { char *n, *s;
@@ -873,20 +885,27 @@ int ovl_path_next(int idx, struct dentry *dentry, struct path *path) static int ovl_fix_origin(struct ovl_fs *ofs, struct dentry *dentry, struct dentry *lower, struct dentry *upper) {
- const struct ovl_fh *fh;
int err;
if (ovl_check_origin_xattr(ofs, upper)) return 0;
- fh = ovl_get_origin_fh(ofs, lower);
- if (IS_ERR(fh))
- return PTR_ERR(fh);
err = ovl_want_write(dentry); if (err)
- return err;
- goto out;
- err = ovl_set_origin(ofs, lower, upper);
- err = ovl_set_origin_fh(ofs, fh, upper);
if (!err) err = ovl_set_impure(dentry->d_parent, upper->d_parent);
ovl_drop_write(dentry); +out:
- kfree(fh);
return err; }
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h index 09ca82ed0f8c..61e03d664d7d 100644 --- a/fs/overlayfs/overlayfs.h +++ b/fs/overlayfs/overlayfs.h @@ -632,11 +632,15 @@ struct dentry *ovl_decode_real_fh(struct ovl_fs *ofs, struct ovl_fh *fh, int ovl_check_origin_fh(struct ovl_fs *ofs, struct ovl_fh *fh, bool connected, struct dentry *upperdentry, struct ovl_path **stackp); int ovl_verify_set_fh(struct ovl_fs *ofs, struct dentry *dentry,
enum ovl_xattr ox, struct dentry *real, bool is_upper,
bool set);
enum ovl_xattr ox, const struct ovl_fh *fh,
bool is_upper, bool set);
+int ovl_verify_origin_xattr(struct ovl_fs *ofs, struct dentry *dentry,
- enum ovl_xattr ox, struct dentry *real,
- bool is_upper, bool set);
struct dentry *ovl_index_upper(struct ovl_fs *ofs, struct dentry *index, bool connected); int ovl_verify_index(struct ovl_fs *ofs, struct dentry *index); +int ovl_get_index_name_fh(const struct ovl_fh *fh, struct qstr *name); int ovl_get_index_name(struct ovl_fs *ofs, struct dentry *origin, struct qstr *name); struct dentry *ovl_get_index_fh(struct ovl_fs *ofs, struct ovl_fh *fh); @@ -648,17 +652,24 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags); bool ovl_lower_positive(struct dentry *dentry);
+static inline int ovl_verify_origin_fh(struct ovl_fs *ofs, struct dentry *upper,
const struct ovl_fh *fh, bool set)
+{
- return ovl_verify_set_fh(ofs, upper, OVL_XATTR_ORIGIN, fh, false, set);
+}
static inline int ovl_verify_origin(struct ovl_fs *ofs, struct dentry *upper, struct dentry *origin, bool set) {
- return ovl_verify_set_fh(ofs, upper, OVL_XATTR_ORIGIN, origin,
- false, set);
- return ovl_verify_origin_xattr(ofs, upper, OVL_XATTR_ORIGIN, origin,
false, set);
}
static inline int ovl_verify_upper(struct ovl_fs *ofs, struct dentry *index, struct dentry *upper, bool set) {
- return ovl_verify_set_fh(ofs, index, OVL_XATTR_UPPER, upper, true, set);
- return ovl_verify_origin_xattr(ofs, index, OVL_XATTR_UPPER, upper,
true, set);
}
/* readdir.c */ @@ -823,8 +834,9 @@ int ovl_copy_xattr(struct super_block *sb, const struct path *path, struct dentr int ovl_set_attr(struct ovl_fs *ofs, struct dentry *upper, struct kstat *stat); struct ovl_fh *ovl_encode_real_fh(struct ovl_fs *ofs, struct dentry *real, bool is_upper); -int ovl_set_origin(struct ovl_fs *ofs, struct dentry *lower,
- struct dentry *upper);
+struct ovl_fh *ovl_get_origin_fh(struct ovl_fs *ofs, struct dentry *origin); +int ovl_set_origin_fh(struct ovl_fs *ofs, const struct ovl_fh *fh,
struct dentry *upper);
/* export.c */ extern const struct export_operations ovl_export_operations; diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c index 2c056d737c27..e2574034c3fa 100644 --- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -879,15 +879,20 @@ static int ovl_get_indexdir(struct super_block *sb, struct ovl_fs *ofs, { struct vfsmount *mnt = ovl_upper_mnt(ofs); struct dentry *indexdir;
- struct dentry *origin = ovl_lowerstack(oe)->dentry;
- const struct ovl_fh *fh;
int err;
- fh = ovl_get_origin_fh(ofs, origin);
- if (IS_ERR(fh))
- return PTR_ERR(fh);
err = mnt_want_write(mnt); if (err)
- return err;
- goto out_free_fh;
/* Verify lower root is upper root origin */
- err = ovl_verify_origin(ofs, upperpath->dentry,
- ovl_lowerstack(oe)->dentry, true);
- err = ovl_verify_origin_fh(ofs, upperpath->dentry, fh, true);
if (err) { pr_err("failed to verify upper root origin\n"); goto out; @@ -919,9 +924,10 @@ static int ovl_get_indexdir(struct super_block *sb, struct ovl_fs *ofs,
- directory entries.
*/ if (ovl_check_origin_xattr(ofs, ofs->indexdir)) {
- err = ovl_verify_set_fh(ofs, ofs->indexdir,
- OVL_XATTR_ORIGIN,
- upperpath->dentry, true, false);
- err = ovl_verify_origin_xattr(ofs, ofs->indexdir,
OVL_XATTR_ORIGIN,
upperpath->dentry, true,
false);
if (err) pr_err("failed to verify index dir 'origin' xattr\n"); } @@ -939,6 +945,8 @@ static int ovl_get_indexdir(struct super_block *sb, struct ovl_fs *ofs,
out: mnt_drop_write(mnt); +out_free_fh:
- kfree(fh);
return err; }
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c index 0bf3ffcd072f..4e6b747e0f2e 100644 --- a/fs/overlayfs/util.c +++ b/fs/overlayfs/util.c @@ -976,12 +976,18 @@ static void ovl_cleanup_index(struct dentry *dentry) struct dentry *index = NULL; struct inode *inode; struct qstr name = { };
- bool got_write = false;
int err;
err = ovl_get_index_name(ofs, lowerdentry, &name); if (err) goto fail;
- err = ovl_want_write(dentry);
- if (err)
- goto fail;
- got_write = true;
inode = d_inode(upperdentry); if (!S_ISDIR(inode->i_mode) && inode->i_nlink != 1) { pr_warn_ratelimited("cleanup linked index (%pd2, ino=%lu, nlink=%u)\n", @@ -1019,6 +1025,8 @@ static void ovl_cleanup_index(struct dentry *dentry) goto fail;
out:
- if (got_write)
- ovl_drop_write(dentry);
kfree(name.name); dput(index); return; @@ -1089,6 +1097,8 @@ void ovl_nlink_end(struct dentry *dentry) { struct inode *inode = d_inode(dentry);
- ovl_drop_write(dentry);
if (ovl_test_flag(OVL_INDEX, inode) && inode->i_nlink == 0) { const struct cred *old_cred;
-- 2.39.5
On Mon, Jan 20, 2025 at 6:09 PM Ignat Korchagin ignat@cloudflare.com wrote:
On 15 Jan 2025, at 10:36, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
When lower fs is a nested overlayfs, calling encode_fh() on a lower directory dentry may trigger copy up and take sb_writers on the upper fs of the lower nested overlayfs.
The lower nested overlayfs may have the same upper fs as this overlayfs, so nested sb_writers lock is illegal.
Move all the callers that encode lower fh to before ovl_want_write().
Signed-off-by: Amir Goldstein amir73il@gmail.com Stable-dep-of: c45beebfde34 ("ovl: support encoding fid from inode with no alias") Signed-off-by: Sasha Levin sashal@kernel.org
Hi,
This patch seems to trigger the following warning on 6.6.72, when running simple “$ docker run --rm -it debian” (creating a container):
------------[ cut here ]------------ WARNING: CPU: 12 PID: 668 at fs/namespace.c:1245 cleanup_mnt+0x130/0x150 Modules linked in: xt_conntrack(E) nft_chain_nat(E) xt_MASQUERADE(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bridge(E) stp(E) llc(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) nft_compat(E) nf_tables(E) overlay(E) kvm_amd(E) ccp(E) kvm(E) irqbypass(E) crc32_pclmul(E) sha512_ssse3(E) sha256_ssse3(E) sha1_ssse3(E) aesni_intel(E) crypto_simd(E) cryptd(E) iTCO_wdt(E) virtio_console(E) virtio_balloon(E) iTCO_vendor_support(E) tiny_power_button(E) button(E) sch_fq_codel(E) fuse(E) nfnetlink(E) vsock_loopback(E) vmw_vsock_virtio_transport_common(E) vsock(E) efivarfs(E) ip_tables(E) x_tables(E) virtio_net(E) net_failover(E) virtio_blk(E) virtio_scsi(E) failover(E) crc32c_intel(E) i2c_i801(E) virtio_pci(E) virtio_pci_legacy_dev(E) i2c_smbus(E) lpc_ich(E) virtio_pci_modern_dev(E) mfd_core(E) virtio(E) virtio_ring(E) CPU: 12 PID: 668 Comm: dockerd Tainted: G E 6.6.71+ #18 Hardware name: KubeVirt None/RHEL, BIOS edk2-20230524-3.el9 05/24/2023 RIP: 0010:cleanup_mnt+0x130/0x150 Code: 2c 01 00 00 85 c0 75 16 e8 6d fb ff ff eb 8a c7 87 2c 01 00 00 00 00 00 00 e9 6a ff ff ff c7 87 2c 01 00 00 00 00 00 00 eb de <0f> 0b 48 83 bd 30 01 00 00 00 0f 84 e9 fe ff ff 48 89 ef e8 18 e7 RSP: 0018:ffffc9000095fec8 EFLAGS: 00010282 RAX: 00000000fffffffe RBX: 0000000000000000 RCX: 0000000000000010 RDX: 0000000000000010 RSI: 0000000000000010 RDI: 0000000000000010 RBP: ffff888109ea57c0 R08: ffffffffbc27ab60 R09: 0000000000000000 R10: 0000000000037420 R11: 0000000000000000 R12: ffff88810acba9bc R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f1041ffb6c0(0000) GS:ffff88903fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c000b7f02f CR3: 00000001034ca002 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace:
<TASK> ? cleanup_mnt+0x130/0x150 ? __warn+0x81/0x130 ? cleanup_mnt+0x130/0x150 ? report_bug+0x16f/0x1a0 ? handle_bug+0x53/0x90 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? cleanup_mnt+0x130/0x150 ? cleanup_mnt+0x13/0x150 task_work_run+0x5d/0x90 exit_to_user_mode_prepare+0xf8/0x100 syscall_exit_to_user_mode+0x21/0x40 ? srso_alias_return_thunk+0x5/0xfbef5 do_syscall_64+0x45/0x90 entry_SYSCALL_64_after_hwframe+0x60/0xca RIP: 0033:0x55d0e0726dee Code: 48 83 ec 38 e8 13 00 00 00 48 83 c4 38 5d c3 cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48 RSP: 002b:000000c000145a10 EFLAGS: 00000216 ORIG_RAX: 00000000000000a6 RAX: 0000000000000000 RBX: 000000c000b7fce0 RCX: 000055d0e0726dee RDX: 0000000000000000 RSI: 0000000000000002 RDI: 000000c000b7fce0 RBP: 000000c000145a50 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000216 R12: 000000c000b7fce0 R13: 0000000000000000 R14: 000000c000b06e00 R15: 1fffffffffffffff </TASK> ---[ end trace 0000000000000000 ]—
This commit was pointed by my bisecting 6.6.71..6.6.72, but to double-check it I had to revert the following commits to make 6.6.72 compile and not exhibit the issue:
Can you say what the compile error was? Maybe it is easy to fix without reverting the entire bunch. Just by looking, it is hard for me to guess what caused the scripts to pull in this dependency patch.
- a3f8a2b13a277d942c810d2ccc654d5bc824a430 (“ovl: pass realinode to ovl_encode_real_fh() instead of realdentry
”) [ Upstream commit 07aeefae7ff44d80524375253980b1bdee2396b0 ]
- 26423e18cd6f709ca4fe7194c29c11658cd0cdd0 (“ovl: do not encode lower fh with upper sb_writers held”) [ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
- a1a541fbfa7e97c1100144db34b57553d7164ce5 ("ovl: support encoding fid from inode with no alias”) [ Upstream commit c45beebfde34aa71afbc48b2c54cdda623515037 ]
I can also confirm we don’t see this warning on the latest 6.12.10 release, so perhaps we have missed some dependencies in 6.6?
Nah. pulling in the dependency patch was wrong. This patch was not supposed to be applied detached from the entire series https://lore.kernel.org/linux-unionfs/20230816152334.924960-1-amir73il@gmail... and I think this is a risky series to backport.
Thanks, Amir.
On Mon, Jan 20, 2025 at 6:54 PM Amir Goldstein amir73il@gmail.com wrote:
On Mon, Jan 20, 2025 at 6:09 PM Ignat Korchagin ignat@cloudflare.com wrote:
On 15 Jan 2025, at 10:36, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
When lower fs is a nested overlayfs, calling encode_fh() on a lower directory dentry may trigger copy up and take sb_writers on the upper fs of the lower nested overlayfs.
The lower nested overlayfs may have the same upper fs as this overlayfs, so nested sb_writers lock is illegal.
Move all the callers that encode lower fh to before ovl_want_write().
Signed-off-by: Amir Goldstein amir73il@gmail.com Stable-dep-of: c45beebfde34 ("ovl: support encoding fid from inode with no alias") Signed-off-by: Sasha Levin sashal@kernel.org
Hi,
This patch seems to trigger the following warning on 6.6.72, when running simple “$ docker run --rm -it debian” (creating a container):
------------[ cut here ]------------ WARNING: CPU: 12 PID: 668 at fs/namespace.c:1245 cleanup_mnt+0x130/0x150 Modules linked in: xt_conntrack(E) nft_chain_nat(E) xt_MASQUERADE(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bridge(E) stp(E) llc(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) nft_compat(E) nf_tables(E) overlay(E) kvm_amd(E) ccp(E) kvm(E) irqbypass(E) crc32_pclmul(E) sha512_ssse3(E) sha256_ssse3(E) sha1_ssse3(E) aesni_intel(E) crypto_simd(E) cryptd(E) iTCO_wdt(E) virtio_console(E) virtio_balloon(E) iTCO_vendor_support(E) tiny_power_button(E) button(E) sch_fq_codel(E) fuse(E) nfnetlink(E) vsock_loopback(E) vmw_vsock_virtio_transport_common(E) vsock(E) efivarfs(E) ip_tables(E) x_tables(E) virtio_net(E) net_failover(E) virtio_blk(E) virtio_scsi(E) failover(E) crc32c_intel(E) i2c_i801(E) virtio_pci(E) virtio_pci_legacy_dev(E) i2c_smbus(E) lpc_ich(E) virtio_pci_modern_dev(E) mfd_core(E) virtio(E) virtio_ring(E) CPU: 12 PID: 668 Comm: dockerd Tainted: G E 6.6.71+ #18 Hardware name: KubeVirt None/RHEL, BIOS edk2-20230524-3.el9 05/24/2023 RIP: 0010:cleanup_mnt+0x130/0x150 Code: 2c 01 00 00 85 c0 75 16 e8 6d fb ff ff eb 8a c7 87 2c 01 00 00 00 00 00 00 e9 6a ff ff ff c7 87 2c 01 00 00 00 00 00 00 eb de <0f> 0b 48 83 bd 30 01 00 00 00 0f 84 e9 fe ff ff 48 89 ef e8 18 e7 RSP: 0018:ffffc9000095fec8 EFLAGS: 00010282 RAX: 00000000fffffffe RBX: 0000000000000000 RCX: 0000000000000010 RDX: 0000000000000010 RSI: 0000000000000010 RDI: 0000000000000010 RBP: ffff888109ea57c0 R08: ffffffffbc27ab60 R09: 0000000000000000 R10: 0000000000037420 R11: 0000000000000000 R12: ffff88810acba9bc R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f1041ffb6c0(0000) GS:ffff88903fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c000b7f02f CR3: 00000001034ca002 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace:
<TASK> ? cleanup_mnt+0x130/0x150 ? __warn+0x81/0x130 ? cleanup_mnt+0x130/0x150 ? report_bug+0x16f/0x1a0 ? handle_bug+0x53/0x90 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? cleanup_mnt+0x130/0x150 ? cleanup_mnt+0x13/0x150 task_work_run+0x5d/0x90 exit_to_user_mode_prepare+0xf8/0x100 syscall_exit_to_user_mode+0x21/0x40 ? srso_alias_return_thunk+0x5/0xfbef5 do_syscall_64+0x45/0x90 entry_SYSCALL_64_after_hwframe+0x60/0xca RIP: 0033:0x55d0e0726dee Code: 48 83 ec 38 e8 13 00 00 00 48 83 c4 38 5d c3 cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48 RSP: 002b:000000c000145a10 EFLAGS: 00000216 ORIG_RAX: 00000000000000a6 RAX: 0000000000000000 RBX: 000000c000b7fce0 RCX: 000055d0e0726dee RDX: 0000000000000000 RSI: 0000000000000002 RDI: 000000c000b7fce0 RBP: 000000c000145a50 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000216 R12: 000000c000b7fce0 R13: 0000000000000000 R14: 000000c000b06e00 R15: 1fffffffffffffff </TASK> ---[ end trace 0000000000000000 ]—
This commit was pointed by my bisecting 6.6.71..6.6.72, but to double-check it I had to revert the following commits to make 6.6.72 compile and not exhibit the issue:
Can you say what the compile error was? Maybe it is easy to fix without reverting the entire bunch.
Just to revert 26423e18cd6f I needed to revert a3f8a2b13a27 as well (otherwise there were conflicts). But then the compilation failed with
fs/overlayfs/export.c: In function ‘ovl_dentry_to_fid’: fs/overlayfs/export.c:255:67: error: ‘dentry’ undeclared (first use in this function) 255 | fh = ovl_encode_real_fh(ofs, enc_lower ? ovl_dentry_lower(dentry) : | ^~~~~~
so I had to revert a1a541fbfa7e as well
Thanks
Just by looking, it is hard for me to guess what caused the scripts to pull in this dependency patch.
- a3f8a2b13a277d942c810d2ccc654d5bc824a430 (“ovl: pass realinode to ovl_encode_real_fh() instead of realdentry
”) [ Upstream commit 07aeefae7ff44d80524375253980b1bdee2396b0 ]
- 26423e18cd6f709ca4fe7194c29c11658cd0cdd0 (“ovl: do not encode lower fh with upper sb_writers held”) [ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
- a1a541fbfa7e97c1100144db34b57553d7164ce5 ("ovl: support encoding fid from inode with no alias”) [ Upstream commit c45beebfde34aa71afbc48b2c54cdda623515037 ]
I can also confirm we don’t see this warning on the latest 6.12.10 release, so perhaps we have missed some dependencies in 6.6?
Nah. pulling in the dependency patch was wrong. This patch was not supposed to be applied detached from the entire series https://lore.kernel.org/linux-unionfs/20230816152334.924960-1-amir73il@gmail... and I think this is a risky series to backport.
Thanks, Amir.
On Mon, Jan 20, 2025 at 9:14 PM Ignat Korchagin ignat@cloudflare.com wrote:
On Mon, Jan 20, 2025 at 6:54 PM Amir Goldstein amir73il@gmail.com wrote:
On Mon, Jan 20, 2025 at 6:09 PM Ignat Korchagin ignat@cloudflare.com wrote:
On 15 Jan 2025, at 10:36, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
When lower fs is a nested overlayfs, calling encode_fh() on a lower directory dentry may trigger copy up and take sb_writers on the upper fs of the lower nested overlayfs.
The lower nested overlayfs may have the same upper fs as this overlayfs, so nested sb_writers lock is illegal.
Move all the callers that encode lower fh to before ovl_want_write().
Signed-off-by: Amir Goldstein amir73il@gmail.com Stable-dep-of: c45beebfde34 ("ovl: support encoding fid from inode with no alias") Signed-off-by: Sasha Levin sashal@kernel.org
Hi,
This patch seems to trigger the following warning on 6.6.72, when running simple “$ docker run --rm -it debian” (creating a container):
------------[ cut here ]------------ WARNING: CPU: 12 PID: 668 at fs/namespace.c:1245 cleanup_mnt+0x130/0x150 Modules linked in: xt_conntrack(E) nft_chain_nat(E) xt_MASQUERADE(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bridge(E) stp(E) llc(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) nft_compat(E) nf_tables(E) overlay(E) kvm_amd(E) ccp(E) kvm(E) irqbypass(E) crc32_pclmul(E) sha512_ssse3(E) sha256_ssse3(E) sha1_ssse3(E) aesni_intel(E) crypto_simd(E) cryptd(E) iTCO_wdt(E) virtio_console(E) virtio_balloon(E) iTCO_vendor_support(E) tiny_power_button(E) button(E) sch_fq_codel(E) fuse(E) nfnetlink(E) vsock_loopback(E) vmw_vsock_virtio_transport_common(E) vsock(E) efivarfs(E) ip_tables(E) x_tables(E) virtio_net(E) net_failover(E) virtio_blk(E) virtio_scsi(E) failover(E) crc32c_intel(E) i2c_i801(E) virtio_pci(E) virtio_pci_legacy_dev(E) i2c_smbus(E) lpc_ich(E) virtio_pci_modern_dev(E) mfd_core(E) virtio(E) virtio_ring(E) CPU: 12 PID: 668 Comm: dockerd Tainted: G E 6.6.71+ #18 Hardware name: KubeVirt None/RHEL, BIOS edk2-20230524-3.el9 05/24/2023 RIP: 0010:cleanup_mnt+0x130/0x150 Code: 2c 01 00 00 85 c0 75 16 e8 6d fb ff ff eb 8a c7 87 2c 01 00 00 00 00 00 00 e9 6a ff ff ff c7 87 2c 01 00 00 00 00 00 00 eb de <0f> 0b 48 83 bd 30 01 00 00 00 0f 84 e9 fe ff ff 48 89 ef e8 18 e7 RSP: 0018:ffffc9000095fec8 EFLAGS: 00010282 RAX: 00000000fffffffe RBX: 0000000000000000 RCX: 0000000000000010 RDX: 0000000000000010 RSI: 0000000000000010 RDI: 0000000000000010 RBP: ffff888109ea57c0 R08: ffffffffbc27ab60 R09: 0000000000000000 R10: 0000000000037420 R11: 0000000000000000 R12: ffff88810acba9bc R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f1041ffb6c0(0000) GS:ffff88903fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c000b7f02f CR3: 00000001034ca002 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace:
<TASK> ? cleanup_mnt+0x130/0x150 ? __warn+0x81/0x130 ? cleanup_mnt+0x130/0x150 ? report_bug+0x16f/0x1a0 ? handle_bug+0x53/0x90 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? cleanup_mnt+0x130/0x150 ? cleanup_mnt+0x13/0x150 task_work_run+0x5d/0x90 exit_to_user_mode_prepare+0xf8/0x100 syscall_exit_to_user_mode+0x21/0x40 ? srso_alias_return_thunk+0x5/0xfbef5 do_syscall_64+0x45/0x90 entry_SYSCALL_64_after_hwframe+0x60/0xca RIP: 0033:0x55d0e0726dee Code: 48 83 ec 38 e8 13 00 00 00 48 83 c4 38 5d c3 cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48 RSP: 002b:000000c000145a10 EFLAGS: 00000216 ORIG_RAX: 00000000000000a6 RAX: 0000000000000000 RBX: 000000c000b7fce0 RCX: 000055d0e0726dee RDX: 0000000000000000 RSI: 0000000000000002 RDI: 000000c000b7fce0 RBP: 000000c000145a50 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000216 R12: 000000c000b7fce0 R13: 0000000000000000 R14: 000000c000b06e00 R15: 1fffffffffffffff </TASK> ---[ end trace 0000000000000000 ]—
This commit was pointed by my bisecting 6.6.71..6.6.72, but to double-check it I had to revert the following commits to make 6.6.72 compile and not exhibit the issue:
Can you say what the compile error was? Maybe it is easy to fix without reverting the entire bunch.
Just to revert 26423e18cd6f I needed to revert a3f8a2b13a27 as well (otherwise there were conflicts). But then the compilation failed with
fs/overlayfs/export.c: In function ‘ovl_dentry_to_fid’: fs/overlayfs/export.c:255:67: error: ‘dentry’ undeclared (first use in this function) 255 | fh = ovl_encode_real_fh(ofs, enc_lower ? ovl_dentry_lower(dentry) : | ^~~~~~
so I had to revert a1a541fbfa7e as well
Care to try this backport branch without the buggy dependency based off of v6.6.71:
https://github.com/amir73il/linux/commits/ovl-6.6.y/
Sorry, I had no time to test this, but the conflicts are pretty trivial. I also added a manual backport of another related patch from 6.12.y while at it.
Just by looking, it is hard for me to guess what caused the scripts to pull in this dependency patch.
- a3f8a2b13a277d942c810d2ccc654d5bc824a430 (“ovl: pass realinode to ovl_encode_real_fh() instead of realdentry
”) [ Upstream commit 07aeefae7ff44d80524375253980b1bdee2396b0 ]
- 26423e18cd6f709ca4fe7194c29c11658cd0cdd0 (“ovl: do not encode lower fh with upper sb_writers held”) [ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
- a1a541fbfa7e97c1100144db34b57553d7164ce5 ("ovl: support encoding fid from inode with no alias”) [ Upstream commit c45beebfde34aa71afbc48b2c54cdda623515037 ]
These should be reverted in 6.6.y apply the backports from my branch instead.
Thanks, Amir.
On Mon, Jan 20, 2025 at 8:45 PM Amir Goldstein amir73il@gmail.com wrote:
On Mon, Jan 20, 2025 at 9:14 PM Ignat Korchagin ignat@cloudflare.com wrote:
On Mon, Jan 20, 2025 at 6:54 PM Amir Goldstein amir73il@gmail.com wrote:
On Mon, Jan 20, 2025 at 6:09 PM Ignat Korchagin ignat@cloudflare.com wrote:
On 15 Jan 2025, at 10:36, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
When lower fs is a nested overlayfs, calling encode_fh() on a lower directory dentry may trigger copy up and take sb_writers on the upper fs of the lower nested overlayfs.
The lower nested overlayfs may have the same upper fs as this overlayfs, so nested sb_writers lock is illegal.
Move all the callers that encode lower fh to before ovl_want_write().
Signed-off-by: Amir Goldstein amir73il@gmail.com Stable-dep-of: c45beebfde34 ("ovl: support encoding fid from inode with no alias") Signed-off-by: Sasha Levin sashal@kernel.org
Hi,
This patch seems to trigger the following warning on 6.6.72, when running simple “$ docker run --rm -it debian” (creating a container):
------------[ cut here ]------------ WARNING: CPU: 12 PID: 668 at fs/namespace.c:1245 cleanup_mnt+0x130/0x150 Modules linked in: xt_conntrack(E) nft_chain_nat(E) xt_MASQUERADE(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bridge(E) stp(E) llc(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) nft_compat(E) nf_tables(E) overlay(E) kvm_amd(E) ccp(E) kvm(E) irqbypass(E) crc32_pclmul(E) sha512_ssse3(E) sha256_ssse3(E) sha1_ssse3(E) aesni_intel(E) crypto_simd(E) cryptd(E) iTCO_wdt(E) virtio_console(E) virtio_balloon(E) iTCO_vendor_support(E) tiny_power_button(E) button(E) sch_fq_codel(E) fuse(E) nfnetlink(E) vsock_loopback(E) vmw_vsock_virtio_transport_common(E) vsock(E) efivarfs(E) ip_tables(E) x_tables(E) virtio_net(E) net_failover(E) virtio_blk(E) virtio_scsi(E) failover(E) crc32c_intel(E) i2c_i801(E) virtio_pci(E) virtio_pci_legacy_dev(E) i2c_smbus(E) lpc_ich(E) virtio_pci_modern_dev(E) mfd_core(E) virtio(E) virtio_ring(E) CPU: 12 PID: 668 Comm: dockerd Tainted: G E 6.6.71+ #18 Hardware name: KubeVirt None/RHEL, BIOS edk2-20230524-3.el9 05/24/2023 RIP: 0010:cleanup_mnt+0x130/0x150 Code: 2c 01 00 00 85 c0 75 16 e8 6d fb ff ff eb 8a c7 87 2c 01 00 00 00 00 00 00 e9 6a ff ff ff c7 87 2c 01 00 00 00 00 00 00 eb de <0f> 0b 48 83 bd 30 01 00 00 00 0f 84 e9 fe ff ff 48 89 ef e8 18 e7 RSP: 0018:ffffc9000095fec8 EFLAGS: 00010282 RAX: 00000000fffffffe RBX: 0000000000000000 RCX: 0000000000000010 RDX: 0000000000000010 RSI: 0000000000000010 RDI: 0000000000000010 RBP: ffff888109ea57c0 R08: ffffffffbc27ab60 R09: 0000000000000000 R10: 0000000000037420 R11: 0000000000000000 R12: ffff88810acba9bc R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f1041ffb6c0(0000) GS:ffff88903fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c000b7f02f CR3: 00000001034ca002 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace:
<TASK> ? cleanup_mnt+0x130/0x150 ? __warn+0x81/0x130 ? cleanup_mnt+0x130/0x150 ? report_bug+0x16f/0x1a0 ? handle_bug+0x53/0x90 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? cleanup_mnt+0x130/0x150 ? cleanup_mnt+0x13/0x150 task_work_run+0x5d/0x90 exit_to_user_mode_prepare+0xf8/0x100 syscall_exit_to_user_mode+0x21/0x40 ? srso_alias_return_thunk+0x5/0xfbef5 do_syscall_64+0x45/0x90 entry_SYSCALL_64_after_hwframe+0x60/0xca RIP: 0033:0x55d0e0726dee Code: 48 83 ec 38 e8 13 00 00 00 48 83 c4 38 5d c3 cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48 RSP: 002b:000000c000145a10 EFLAGS: 00000216 ORIG_RAX: 00000000000000a6 RAX: 0000000000000000 RBX: 000000c000b7fce0 RCX: 000055d0e0726dee RDX: 0000000000000000 RSI: 0000000000000002 RDI: 000000c000b7fce0 RBP: 000000c000145a50 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000216 R12: 000000c000b7fce0 R13: 0000000000000000 R14: 000000c000b06e00 R15: 1fffffffffffffff </TASK> ---[ end trace 0000000000000000 ]—
This commit was pointed by my bisecting 6.6.71..6.6.72, but to double-check it I had to revert the following commits to make 6.6.72 compile and not exhibit the issue:
Can you say what the compile error was? Maybe it is easy to fix without reverting the entire bunch.
Just to revert 26423e18cd6f I needed to revert a3f8a2b13a27 as well (otherwise there were conflicts). But then the compilation failed with
fs/overlayfs/export.c: In function ‘ovl_dentry_to_fid’: fs/overlayfs/export.c:255:67: error: ‘dentry’ undeclared (first use in this function) 255 | fh = ovl_encode_real_fh(ofs, enc_lower ? ovl_dentry_lower(dentry) : | ^~~~~~
so I had to revert a1a541fbfa7e as well
Care to try this backport branch without the buggy dependency based off of v6.6.71:
This branch seems OK and does not show the issue
Sorry, I had no time to test this, but the conflicts are pretty trivial. I also added a manual backport of another related patch from 6.12.y while at it.
Just by looking, it is hard for me to guess what caused the scripts to pull in this dependency patch.
- a3f8a2b13a277d942c810d2ccc654d5bc824a430 (“ovl: pass realinode to ovl_encode_real_fh() instead of realdentry
”) [ Upstream commit 07aeefae7ff44d80524375253980b1bdee2396b0 ]
- 26423e18cd6f709ca4fe7194c29c11658cd0cdd0 (“ovl: do not encode lower fh with upper sb_writers held”) [ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
- a1a541fbfa7e97c1100144db34b57553d7164ce5 ("ovl: support encoding fid from inode with no alias”) [ Upstream commit c45beebfde34aa71afbc48b2c54cdda623515037 ]
These should be reverted in 6.6.y apply the backports from my branch instead.
Thanks, Amir.
On Mon, Jan 20, 2025 at 09:45:40PM +0100, Amir Goldstein wrote:
On Mon, Jan 20, 2025 at 9:14 PM Ignat Korchagin ignat@cloudflare.com wrote:
On Mon, Jan 20, 2025 at 6:54 PM Amir Goldstein amir73il@gmail.com wrote:
On Mon, Jan 20, 2025 at 6:09 PM Ignat Korchagin ignat@cloudflare.com wrote:
On 15 Jan 2025, at 10:36, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
When lower fs is a nested overlayfs, calling encode_fh() on a lower directory dentry may trigger copy up and take sb_writers on the upper fs of the lower nested overlayfs.
The lower nested overlayfs may have the same upper fs as this overlayfs, so nested sb_writers lock is illegal.
Move all the callers that encode lower fh to before ovl_want_write().
Signed-off-by: Amir Goldstein amir73il@gmail.com Stable-dep-of: c45beebfde34 ("ovl: support encoding fid from inode with no alias") Signed-off-by: Sasha Levin sashal@kernel.org
Hi,
This patch seems to trigger the following warning on 6.6.72, when running simple “$ docker run --rm -it debian” (creating a container):
------------[ cut here ]------------ WARNING: CPU: 12 PID: 668 at fs/namespace.c:1245 cleanup_mnt+0x130/0x150 Modules linked in: xt_conntrack(E) nft_chain_nat(E) xt_MASQUERADE(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bridge(E) stp(E) llc(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) nft_compat(E) nf_tables(E) overlay(E) kvm_amd(E) ccp(E) kvm(E) irqbypass(E) crc32_pclmul(E) sha512_ssse3(E) sha256_ssse3(E) sha1_ssse3(E) aesni_intel(E) crypto_simd(E) cryptd(E) iTCO_wdt(E) virtio_console(E) virtio_balloon(E) iTCO_vendor_support(E) tiny_power_button(E) button(E) sch_fq_codel(E) fuse(E) nfnetlink(E) vsock_loopback(E) vmw_vsock_virtio_transport_common(E) vsock(E) efivarfs(E) ip_tables(E) x_tables(E) virtio_net(E) net_failover(E) virtio_blk(E) virtio_scsi(E) failover(E) crc32c_intel(E) i2c_i801(E) virtio_pci(E) virtio_pci_legacy_dev(E) i2c_smbus(E) lpc_ich(E) virtio_pci_modern_dev(E) mfd_core(E) virtio(E) virtio_ring(E) CPU: 12 PID: 668 Comm: dockerd Tainted: G E 6.6.71+ #18 Hardware name: KubeVirt None/RHEL, BIOS edk2-20230524-3.el9 05/24/2023 RIP: 0010:cleanup_mnt+0x130/0x150 Code: 2c 01 00 00 85 c0 75 16 e8 6d fb ff ff eb 8a c7 87 2c 01 00 00 00 00 00 00 e9 6a ff ff ff c7 87 2c 01 00 00 00 00 00 00 eb de <0f> 0b 48 83 bd 30 01 00 00 00 0f 84 e9 fe ff ff 48 89 ef e8 18 e7 RSP: 0018:ffffc9000095fec8 EFLAGS: 00010282 RAX: 00000000fffffffe RBX: 0000000000000000 RCX: 0000000000000010 RDX: 0000000000000010 RSI: 0000000000000010 RDI: 0000000000000010 RBP: ffff888109ea57c0 R08: ffffffffbc27ab60 R09: 0000000000000000 R10: 0000000000037420 R11: 0000000000000000 R12: ffff88810acba9bc R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f1041ffb6c0(0000) GS:ffff88903fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c000b7f02f CR3: 00000001034ca002 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace:
<TASK> ? cleanup_mnt+0x130/0x150 ? __warn+0x81/0x130 ? cleanup_mnt+0x130/0x150 ? report_bug+0x16f/0x1a0 ? handle_bug+0x53/0x90 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? cleanup_mnt+0x130/0x150 ? cleanup_mnt+0x13/0x150 task_work_run+0x5d/0x90 exit_to_user_mode_prepare+0xf8/0x100 syscall_exit_to_user_mode+0x21/0x40 ? srso_alias_return_thunk+0x5/0xfbef5 do_syscall_64+0x45/0x90 entry_SYSCALL_64_after_hwframe+0x60/0xca RIP: 0033:0x55d0e0726dee Code: 48 83 ec 38 e8 13 00 00 00 48 83 c4 38 5d c3 cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48 RSP: 002b:000000c000145a10 EFLAGS: 00000216 ORIG_RAX: 00000000000000a6 RAX: 0000000000000000 RBX: 000000c000b7fce0 RCX: 000055d0e0726dee RDX: 0000000000000000 RSI: 0000000000000002 RDI: 000000c000b7fce0 RBP: 000000c000145a50 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000216 R12: 000000c000b7fce0 R13: 0000000000000000 R14: 000000c000b06e00 R15: 1fffffffffffffff </TASK> ---[ end trace 0000000000000000 ]—
This commit was pointed by my bisecting 6.6.71..6.6.72, but to double-check it I had to revert the following commits to make 6.6.72 compile and not exhibit the issue:
Can you say what the compile error was? Maybe it is easy to fix without reverting the entire bunch.
Just to revert 26423e18cd6f I needed to revert a3f8a2b13a27 as well (otherwise there were conflicts). But then the compilation failed with
fs/overlayfs/export.c: In function ‘ovl_dentry_to_fid’: fs/overlayfs/export.c:255:67: error: ‘dentry’ undeclared (first use in this function) 255 | fh = ovl_encode_real_fh(ofs, enc_lower ? ovl_dentry_lower(dentry) : | ^~~~~~
so I had to revert a1a541fbfa7e as well
Care to try this backport branch without the buggy dependency based off of v6.6.71:
https://github.com/amir73il/linux/commits/ovl-6.6.y/
Sorry, I had no time to test this, but the conflicts are pretty trivial. I also added a manual backport of another related patch from 6.12.y while at it.
Just by looking, it is hard for me to guess what caused the scripts to pull in this dependency patch.
- a3f8a2b13a277d942c810d2ccc654d5bc824a430 (“ovl: pass realinode to ovl_encode_real_fh() instead of realdentry
”) [ Upstream commit 07aeefae7ff44d80524375253980b1bdee2396b0 ]
- 26423e18cd6f709ca4fe7194c29c11658cd0cdd0 (“ovl: do not encode lower fh with upper sb_writers held”) [ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
- a1a541fbfa7e97c1100144db34b57553d7164ce5 ("ovl: support encoding fid from inode with no alias”) [ Upstream commit c45beebfde34aa71afbc48b2c54cdda623515037 ]
These should be reverted in 6.6.y apply the backports from my branch instead.
Ok, let me revert them now, do a release, and then can you submit a working set of patches we can apply again?
thanks,
greg k-h
On Tue, Jan 21, 2025 at 08:55:41AM +0100, Greg Kroah-Hartman wrote:
On Mon, Jan 20, 2025 at 09:45:40PM +0100, Amir Goldstein wrote:
On Mon, Jan 20, 2025 at 9:14 PM Ignat Korchagin ignat@cloudflare.com wrote:
On Mon, Jan 20, 2025 at 6:54 PM Amir Goldstein amir73il@gmail.com wrote:
On Mon, Jan 20, 2025 at 6:09 PM Ignat Korchagin ignat@cloudflare.com wrote:
On 15 Jan 2025, at 10:36, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.6-stable review patch. If anyone has any objections, please let me know.
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
When lower fs is a nested overlayfs, calling encode_fh() on a lower directory dentry may trigger copy up and take sb_writers on the upper fs of the lower nested overlayfs.
The lower nested overlayfs may have the same upper fs as this overlayfs, so nested sb_writers lock is illegal.
Move all the callers that encode lower fh to before ovl_want_write().
Signed-off-by: Amir Goldstein amir73il@gmail.com Stable-dep-of: c45beebfde34 ("ovl: support encoding fid from inode with no alias") Signed-off-by: Sasha Levin sashal@kernel.org
Hi,
This patch seems to trigger the following warning on 6.6.72, when running simple “$ docker run --rm -it debian” (creating a container):
------------[ cut here ]------------ WARNING: CPU: 12 PID: 668 at fs/namespace.c:1245 cleanup_mnt+0x130/0x150 Modules linked in: xt_conntrack(E) nft_chain_nat(E) xt_MASQUERADE(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bridge(E) stp(E) llc(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) nft_compat(E) nf_tables(E) overlay(E) kvm_amd(E) ccp(E) kvm(E) irqbypass(E) crc32_pclmul(E) sha512_ssse3(E) sha256_ssse3(E) sha1_ssse3(E) aesni_intel(E) crypto_simd(E) cryptd(E) iTCO_wdt(E) virtio_console(E) virtio_balloon(E) iTCO_vendor_support(E) tiny_power_button(E) button(E) sch_fq_codel(E) fuse(E) nfnetlink(E) vsock_loopback(E) vmw_vsock_virtio_transport_common(E) vsock(E) efivarfs(E) ip_tables(E) x_tables(E) virtio_net(E) net_failover(E) virtio_blk(E) virtio_scsi(E) failover(E) crc32c_intel(E) i2c_i801(E) virtio_pci(E) virtio_pci_legacy_dev(E) i2c_smbus(E) lpc_ich(E) virtio_pci_modern_dev(E) mfd_core(E) virtio(E) virtio_ring(E) CPU: 12 PID: 668 Comm: dockerd Tainted: G E 6.6.71+ #18 Hardware name: KubeVirt None/RHEL, BIOS edk2-20230524-3.el9 05/24/2023 RIP: 0010:cleanup_mnt+0x130/0x150 Code: 2c 01 00 00 85 c0 75 16 e8 6d fb ff ff eb 8a c7 87 2c 01 00 00 00 00 00 00 e9 6a ff ff ff c7 87 2c 01 00 00 00 00 00 00 eb de <0f> 0b 48 83 bd 30 01 00 00 00 0f 84 e9 fe ff ff 48 89 ef e8 18 e7 RSP: 0018:ffffc9000095fec8 EFLAGS: 00010282 RAX: 00000000fffffffe RBX: 0000000000000000 RCX: 0000000000000010 RDX: 0000000000000010 RSI: 0000000000000010 RDI: 0000000000000010 RBP: ffff888109ea57c0 R08: ffffffffbc27ab60 R09: 0000000000000000 R10: 0000000000037420 R11: 0000000000000000 R12: ffff88810acba9bc R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f1041ffb6c0(0000) GS:ffff88903fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c000b7f02f CR3: 00000001034ca002 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace:
<TASK> ? cleanup_mnt+0x130/0x150 ? __warn+0x81/0x130 ? cleanup_mnt+0x130/0x150 ? report_bug+0x16f/0x1a0 ? handle_bug+0x53/0x90 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? cleanup_mnt+0x130/0x150 ? cleanup_mnt+0x13/0x150 task_work_run+0x5d/0x90 exit_to_user_mode_prepare+0xf8/0x100 syscall_exit_to_user_mode+0x21/0x40 ? srso_alias_return_thunk+0x5/0xfbef5 do_syscall_64+0x45/0x90 entry_SYSCALL_64_after_hwframe+0x60/0xca RIP: 0033:0x55d0e0726dee Code: 48 83 ec 38 e8 13 00 00 00 48 83 c4 38 5d c3 cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48 RSP: 002b:000000c000145a10 EFLAGS: 00000216 ORIG_RAX: 00000000000000a6 RAX: 0000000000000000 RBX: 000000c000b7fce0 RCX: 000055d0e0726dee RDX: 0000000000000000 RSI: 0000000000000002 RDI: 000000c000b7fce0 RBP: 000000c000145a50 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000216 R12: 000000c000b7fce0 R13: 0000000000000000 R14: 000000c000b06e00 R15: 1fffffffffffffff </TASK> ---[ end trace 0000000000000000 ]—
This commit was pointed by my bisecting 6.6.71..6.6.72, but to double-check it I had to revert the following commits to make 6.6.72 compile and not exhibit the issue:
Can you say what the compile error was? Maybe it is easy to fix without reverting the entire bunch.
Just to revert 26423e18cd6f I needed to revert a3f8a2b13a27 as well (otherwise there were conflicts). But then the compilation failed with
fs/overlayfs/export.c: In function ‘ovl_dentry_to_fid’: fs/overlayfs/export.c:255:67: error: ‘dentry’ undeclared (first use in this function) 255 | fh = ovl_encode_real_fh(ofs, enc_lower ? ovl_dentry_lower(dentry) : | ^~~~~~
so I had to revert a1a541fbfa7e as well
Care to try this backport branch without the buggy dependency based off of v6.6.71:
https://github.com/amir73il/linux/commits/ovl-6.6.y/
Sorry, I had no time to test this, but the conflicts are pretty trivial. I also added a manual backport of another related patch from 6.12.y while at it.
Just by looking, it is hard for me to guess what caused the scripts to pull in this dependency patch.
- a3f8a2b13a277d942c810d2ccc654d5bc824a430 (“ovl: pass realinode to ovl_encode_real_fh() instead of realdentry
”) [ Upstream commit 07aeefae7ff44d80524375253980b1bdee2396b0 ]
- 26423e18cd6f709ca4fe7194c29c11658cd0cdd0 (“ovl: do not encode lower fh with upper sb_writers held”) [ Upstream commit 5b02bfc1e7e3811c5bf7f0fa626a0694d0dbbd77 ]
- a1a541fbfa7e97c1100144db34b57553d7164ce5 ("ovl: support encoding fid from inode with no alias”) [ Upstream commit c45beebfde34aa71afbc48b2c54cdda623515037 ]
These should be reverted in 6.6.y apply the backports from my branch instead.
Ok, let me revert them now, do a release, and then can you submit a working set of patches we can apply again?
Ok, 6.6.73 is out with the reverts, can you send me the patches to queue up for a future release?
thanks,
greg k-h
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit 07aeefae7ff44d80524375253980b1bdee2396b0 ]
We want to be able to encode an fid from an inode with no alias.
Signed-off-by: Amir Goldstein amir73il@gmail.com Link: https://lore.kernel.org/r/20250105162404.357058-2-amir73il@gmail.com Signed-off-by: Christian Brauner brauner@kernel.org Stable-dep-of: c45beebfde34 ("ovl: support encoding fid from inode with no alias") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/overlayfs/copy_up.c | 11 ++++++----- fs/overlayfs/export.c | 5 +++-- fs/overlayfs/namei.c | 4 ++-- fs/overlayfs/overlayfs.h | 2 +- 4 files changed, 12 insertions(+), 10 deletions(-)
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c index 5c9af24bae4a..f14c412c5609 100644 --- a/fs/overlayfs/copy_up.c +++ b/fs/overlayfs/copy_up.c @@ -371,13 +371,13 @@ int ovl_set_attr(struct ovl_fs *ofs, struct dentry *upperdentry, return err; }
-struct ovl_fh *ovl_encode_real_fh(struct ovl_fs *ofs, struct dentry *real, +struct ovl_fh *ovl_encode_real_fh(struct ovl_fs *ofs, struct inode *realinode, bool is_upper) { struct ovl_fh *fh; int fh_type, dwords; int buflen = MAX_HANDLE_SZ; - uuid_t *uuid = &real->d_sb->s_uuid; + uuid_t *uuid = &realinode->i_sb->s_uuid; int err;
/* Make sure the real fid stays 32bit aligned */ @@ -394,7 +394,8 @@ struct ovl_fh *ovl_encode_real_fh(struct ovl_fs *ofs, struct dentry *real, * the price or reconnecting the dentry. */ dwords = buflen >> 2; - fh_type = exportfs_encode_fh(real, (void *)fh->fb.fid, &dwords, 0); + fh_type = exportfs_encode_inode_fh(realinode, (void *)fh->fb.fid, + &dwords, NULL, 0); buflen = (dwords << 2);
err = -EIO; @@ -436,7 +437,7 @@ struct ovl_fh *ovl_get_origin_fh(struct ovl_fs *ofs, struct dentry *origin) if (!ovl_can_decode_fh(origin->d_sb)) return NULL;
- return ovl_encode_real_fh(ofs, origin, false); + return ovl_encode_real_fh(ofs, d_inode(origin), false); }
int ovl_set_origin_fh(struct ovl_fs *ofs, const struct ovl_fh *fh, @@ -461,7 +462,7 @@ static int ovl_set_upper_fh(struct ovl_fs *ofs, struct dentry *upper, const struct ovl_fh *fh; int err;
- fh = ovl_encode_real_fh(ofs, upper, true); + fh = ovl_encode_real_fh(ofs, d_inode(upper), true); if (IS_ERR(fh)) return PTR_ERR(fh);
diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c index 611ff567a1aa..c56e4e0b8054 100644 --- a/fs/overlayfs/export.c +++ b/fs/overlayfs/export.c @@ -228,6 +228,7 @@ static int ovl_check_encode_origin(struct dentry *dentry) static int ovl_dentry_to_fid(struct ovl_fs *ofs, struct dentry *dentry, u32 *fid, int buflen) { + struct inode *inode = d_inode(dentry); struct ovl_fh *fh = NULL; int err, enc_lower; int len; @@ -241,8 +242,8 @@ static int ovl_dentry_to_fid(struct ovl_fs *ofs, struct dentry *dentry, goto fail;
/* Encode an upper or lower file handle */ - fh = ovl_encode_real_fh(ofs, enc_lower ? ovl_dentry_lower(dentry) : - ovl_dentry_upper(dentry), !enc_lower); + fh = ovl_encode_real_fh(ofs, enc_lower ? ovl_inode_lower(inode) : + ovl_inode_upper(inode), !enc_lower); if (IS_ERR(fh)) return PTR_ERR(fh);
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c index f10ac4ae35f0..2d2ef671b36b 100644 --- a/fs/overlayfs/namei.c +++ b/fs/overlayfs/namei.c @@ -536,7 +536,7 @@ int ovl_verify_origin_xattr(struct ovl_fs *ofs, struct dentry *dentry, struct ovl_fh *fh; int err;
- fh = ovl_encode_real_fh(ofs, real, is_upper); + fh = ovl_encode_real_fh(ofs, d_inode(real), is_upper); err = PTR_ERR(fh); if (IS_ERR(fh)) { fh = NULL; @@ -732,7 +732,7 @@ int ovl_get_index_name(struct ovl_fs *ofs, struct dentry *origin, struct ovl_fh *fh; int err;
- fh = ovl_encode_real_fh(ofs, origin, false); + fh = ovl_encode_real_fh(ofs, d_inode(origin), false); if (IS_ERR(fh)) return PTR_ERR(fh);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h index 61e03d664d7d..ca63a26a6170 100644 --- a/fs/overlayfs/overlayfs.h +++ b/fs/overlayfs/overlayfs.h @@ -832,7 +832,7 @@ int ovl_copy_up_with_data(struct dentry *dentry); int ovl_maybe_copy_up(struct dentry *dentry, int flags); int ovl_copy_xattr(struct super_block *sb, const struct path *path, struct dentry *new); int ovl_set_attr(struct ovl_fs *ofs, struct dentry *upper, struct kstat *stat); -struct ovl_fh *ovl_encode_real_fh(struct ovl_fs *ofs, struct dentry *real, +struct ovl_fh *ovl_encode_real_fh(struct ovl_fs *ofs, struct inode *realinode, bool is_upper); struct ovl_fh *ovl_get_origin_fh(struct ovl_fs *ofs, struct dentry *origin); int ovl_set_origin_fh(struct ovl_fs *ofs, const struct ovl_fh *fh,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amir Goldstein amir73il@gmail.com
[ Upstream commit c45beebfde34aa71afbc48b2c54cdda623515037 ]
Dmitry Safonov reported that a WARN_ON() assertion can be trigered by userspace when calling inotify_show_fdinfo() for an overlayfs watched inode, whose dentry aliases were discarded with drop_caches.
The WARN_ON() assertion in inotify_show_fdinfo() was removed, because it is possible for encoding file handle to fail for other reason, but the impact of failing to encode an overlayfs file handle goes beyond this assertion.
As shown in the LTP test case mentioned in the link below, failure to encode an overlayfs file handle from a non-aliased inode also leads to failure to report an fid with FAN_DELETE_SELF fanotify events.
As Dmitry notes in his analyzis of the problem, ovl_encode_fh() fails if it cannot find an alias for the inode, but this failure can be fixed. ovl_encode_fh() seldom uses the alias and in the case of non-decodable file handles, as is often the case with fanotify fid info, ovl_encode_fh() never needs to use the alias to encode a file handle.
Defer finding an alias until it is actually needed so ovl_encode_fh() will not fail in the common case of FAN_DELETE_SELF fanotify events.
Fixes: 16aac5ad1fa9 ("ovl: support encoding non-decodable file handles") Reported-by: Dmitry Safonov dima@arista.com Closes: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiie81voLZZi2zXS1BziXZCM24nXqPAx... Signed-off-by: Amir Goldstein amir73il@gmail.com Link: https://lore.kernel.org/r/20250105162404.357058-3-amir73il@gmail.com Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/overlayfs/export.c | 46 +++++++++++++++++++++++-------------------- 1 file changed, 25 insertions(+), 21 deletions(-)
diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c index c56e4e0b8054..3a17e4366f28 100644 --- a/fs/overlayfs/export.c +++ b/fs/overlayfs/export.c @@ -181,35 +181,37 @@ static int ovl_connect_layer(struct dentry *dentry) * * Return 0 for upper file handle, > 0 for lower file handle or < 0 on error. */ -static int ovl_check_encode_origin(struct dentry *dentry) +static int ovl_check_encode_origin(struct inode *inode) { - struct ovl_fs *ofs = OVL_FS(dentry->d_sb); + struct ovl_fs *ofs = OVL_FS(inode->i_sb); bool decodable = ofs->config.nfs_export; + struct dentry *dentry; + int err;
/* No upper layer? */ if (!ovl_upper_mnt(ofs)) return 1;
/* Lower file handle for non-upper non-decodable */ - if (!ovl_dentry_upper(dentry) && !decodable) + if (!ovl_inode_upper(inode) && !decodable) return 1;
/* Upper file handle for pure upper */ - if (!ovl_dentry_lower(dentry)) + if (!ovl_inode_lower(inode)) return 0;
/* * Root is never indexed, so if there's an upper layer, encode upper for * root. */ - if (dentry == dentry->d_sb->s_root) + if (inode == d_inode(inode->i_sb->s_root)) return 0;
/* * Upper decodable file handle for non-indexed upper. */ - if (ovl_dentry_upper(dentry) && decodable && - !ovl_test_flag(OVL_INDEX, d_inode(dentry))) + if (ovl_inode_upper(inode) && decodable && + !ovl_test_flag(OVL_INDEX, inode)) return 0;
/* @@ -218,17 +220,25 @@ static int ovl_check_encode_origin(struct dentry *dentry) * ovl_connect_layer() will try to make origin's layer "connected" by * copying up a "connectable" ancestor. */ - if (d_is_dir(dentry) && decodable) - return ovl_connect_layer(dentry); + if (!decodable || !S_ISDIR(inode->i_mode)) + return 1; + + dentry = d_find_any_alias(inode); + if (!dentry) + return -ENOENT; + + err = ovl_connect_layer(dentry); + dput(dentry); + if (err < 0) + return err;
/* Lower file handle for indexed and non-upper dir/non-dir */ return 1; }
-static int ovl_dentry_to_fid(struct ovl_fs *ofs, struct dentry *dentry, +static int ovl_dentry_to_fid(struct ovl_fs *ofs, struct inode *inode, u32 *fid, int buflen) { - struct inode *inode = d_inode(dentry); struct ovl_fh *fh = NULL; int err, enc_lower; int len; @@ -237,7 +247,7 @@ static int ovl_dentry_to_fid(struct ovl_fs *ofs, struct dentry *dentry, * Check if we should encode a lower or upper file handle and maybe * copy up an ancestor to make lower file handle connectable. */ - err = enc_lower = ovl_check_encode_origin(dentry); + err = enc_lower = ovl_check_encode_origin(inode); if (enc_lower < 0) goto fail;
@@ -257,8 +267,8 @@ static int ovl_dentry_to_fid(struct ovl_fs *ofs, struct dentry *dentry, return err;
fail: - pr_warn_ratelimited("failed to encode file handle (%pd2, err=%i)\n", - dentry, err); + pr_warn_ratelimited("failed to encode file handle (ino=%lu, err=%i)\n", + inode->i_ino, err); goto out; }
@@ -266,19 +276,13 @@ static int ovl_encode_fh(struct inode *inode, u32 *fid, int *max_len, struct inode *parent) { struct ovl_fs *ofs = OVL_FS(inode->i_sb); - struct dentry *dentry; int bytes, buflen = *max_len << 2;
/* TODO: encode connectable file handles */ if (parent) return FILEID_INVALID;
- dentry = d_find_any_alias(inode); - if (!dentry) - return FILEID_INVALID; - - bytes = ovl_dentry_to_fid(ofs, dentry, fid, buflen); - dput(dentry); + bytes = ovl_dentry_to_fid(ofs, inode, fid, buflen); if (bytes <= 0) return FILEID_INVALID;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gao Xiang hsiangkao@linux.alibaba.com
commit 9e2f9d34dd12e6e5b244ec488bcebd0c2d566c50 upstream.
syzbot reported a task hang issue due to a deadlock case where it is waiting for the folio lock of a cached folio that will be used for cache I/Os.
After looking into the crafted fuzzed image, I found it's formed with several overlapped big pclusters as below:
Ext: logical offset | length : physical offset | length 0: 0.. 16384 | 16384 : 151552.. 167936 | 16384 1: 16384.. 32768 | 16384 : 155648.. 172032 | 16384 2: 32768.. 49152 | 16384 : 537223168.. 537239552 | 16384 ...
Here, extent 0/1 are physically overlapped although it's entirely _impossible_ for normal filesystem images generated by mkfs.
First, managed folios containing compressed data will be marked as up-to-date and then unlocked immediately (unlike in-place folios) when compressed I/Os are complete. If physical blocks are not submitted in the incremental order, there should be separate BIOs to avoid dependency issues. However, the current code mis-arranges z_erofs_fill_bio_vec() and BIO submission which causes unexpected BIO waits.
Second, managed folios will be connected to their own pclusters for efficient inter-queries. However, this is somewhat hard to implement easily if overlapped big pclusters exist. Again, these only appear in fuzzed images so let's simply fall back to temporary short-lived pages for correctness.
Additionally, it justifies that referenced managed folios cannot be truncated for now and reverts part of commit 2080ca1ed3e4 ("erofs: tidy up `struct z_erofs_bvec`") for simplicity although it shouldn't be any difference.
Reported-by: syzbot+4fc98ed414ae63d1ada2@syzkaller.appspotmail.com Reported-by: syzbot+de04e06b28cfecf2281c@syzkaller.appspotmail.com Reported-by: syzbot+c8c8238b394be4a1087d@syzkaller.appspotmail.com Tested-by: syzbot+4fc98ed414ae63d1ada2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/0000000000002fda01061e334873@google.com Fixes: 8e6c8fa9f2e9 ("erofs: enable big pcluster feature") Link: https://lore.kernel.org/r/20240910070847.3356592-1-hsiangkao@linux.alibaba.c... Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/zdata.c | 59 +++++++++++++++++++++++++----------------------- 1 file changed, 31 insertions(+), 28 deletions(-)
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c index 1c0e6167d8e7..9fa07436a4da 100644 --- a/fs/erofs/zdata.c +++ b/fs/erofs/zdata.c @@ -1483,14 +1483,13 @@ static struct page *pickup_page_for_submission(struct z_erofs_pcluster *pcl, goto out;
lock_page(page); - - /* only true if page reclaim goes wrong, should never happen */ - DBG_BUGON(justfound && PagePrivate(page)); - - /* the page is still in manage cache */ - if (page->mapping == mc) { + if (likely(page->mapping == mc)) { WRITE_ONCE(pcl->compressed_bvecs[nr].page, page);
+ /* + * The cached folio is still in managed cache but without + * a valid `->private` pcluster hint. Let's reconnect them. + */ if (!PagePrivate(page)) { /* * impossible to be !PagePrivate(page) for @@ -1504,22 +1503,24 @@ static struct page *pickup_page_for_submission(struct z_erofs_pcluster *pcl, SetPagePrivate(page); }
- /* no need to submit io if it is already up-to-date */ - if (PageUptodate(page)) { - unlock_page(page); - page = NULL; + if (likely(page->private == (unsigned long)pcl)) { + /* don't submit cache I/Os again if already uptodate */ + if (PageUptodate(page)) { + unlock_page(page); + page = NULL; + + } + goto out; } - goto out; + /* + * Already linked with another pcluster, which only appears in + * crafted images by fuzzers for now. But handle this anyway. + */ + tocache = false; /* use temporary short-lived pages */ + } else { + DBG_BUGON(1); /* referenced managed folios can't be truncated */ + tocache = true; } - - /* - * the managed page has been truncated, it's unsafe to - * reuse this one, let's allocate a new cache-managed page. - */ - DBG_BUGON(page->mapping); - DBG_BUGON(!justfound); - - tocache = true; unlock_page(page); put_page(page); out_allocpage: @@ -1677,16 +1678,11 @@ static void z_erofs_submit_queue(struct z_erofs_decompress_frontend *f, end = cur + pcl->pclusterpages;
do { - struct page *page; - - page = pickup_page_for_submission(pcl, i++, - &f->pagepool, mc); - if (!page) - continue; + struct page *page = NULL;
if (bio && (cur != last_index + 1 || last_bdev != mdev.m_bdev)) { -submit_bio_retry: +drain_io: submit_bio(bio); if (memstall) { psi_memstall_leave(&pflags); @@ -1695,6 +1691,13 @@ static void z_erofs_submit_queue(struct z_erofs_decompress_frontend *f, bio = NULL; }
+ if (!page) { + page = pickup_page_for_submission(pcl, i++, + &f->pagepool, mc); + if (!page) + continue; + } + if (unlikely(PageWorkingset(page)) && !memstall) { psi_memstall_enter(&pflags); memstall = 1; @@ -1715,7 +1718,7 @@ static void z_erofs_submit_queue(struct z_erofs_decompress_frontend *f, }
if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) - goto submit_bio_retry; + goto drain_io;
last_index = cur; bypass = false;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gao Xiang hsiangkao@linux.alibaba.com
commit 1a2180f6859c73c674809f9f82e36c94084682ba upstream.
Max Kellermann recently reported psi_group_cpu.tasks[NR_MEMSTALL] is incorrect in the 6.11.9 kernel.
The root cause appears to be that, since the problematic commit, bio can be NULL, causing psi_memstall_leave() to be skipped in z_erofs_submit_queue().
Reported-by: Max Kellermann max.kellermann@ionos.com Closes: https://lore.kernel.org/r/CAKPOu+8tvSowiJADW2RuKyofL_CSkm_SuyZA7ME5vMLWmL6pq... Fixes: 9e2f9d34dd12 ("erofs: handle overlapped pclusters out of crafted images properly") Reviewed-by: Chao Yu chao@kernel.org Link: https://lore.kernel.org/r/20241127085236.3538334-1-hsiangkao@linux.alibaba.c... Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/erofs/zdata.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c index 9fa07436a4da..496e4c7c52a4 100644 --- a/fs/erofs/zdata.c +++ b/fs/erofs/zdata.c @@ -1730,11 +1730,10 @@ static void z_erofs_submit_queue(struct z_erofs_decompress_frontend *f, move_to_bypass_jobqueue(pcl, qtail, owned_head); } while (owned_head != Z_EROFS_PCLUSTER_TAIL);
- if (bio) { + if (bio) submit_bio(bio); - if (memstall) - psi_memstall_leave(&pflags); - } + if (memstall) + psi_memstall_leave(&pflags);
/* * although background is preferred, no one is pending for submission.
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuming Fan shumingf@realtek.com
[ Upstream commit c9e3ebdc52ebe028f238c9df5162ae92483bedd5 ]
The calibration procedure needs some time to finish. This patch adds the delay time to ensure the calibration procedure is completed correctly.
Signed-off-by: Shuming Fan shumingf@realtek.com Link: https://patch.msgid.link/20241218091307.96656-1-shumingf@realtek.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/rt722-sdca.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/sound/soc/codecs/rt722-sdca.c b/sound/soc/codecs/rt722-sdca.c index b9b330375add..0f9f592744ad 100644 --- a/sound/soc/codecs/rt722-sdca.c +++ b/sound/soc/codecs/rt722-sdca.c @@ -1466,13 +1466,18 @@ static void rt722_sdca_jack_preset(struct rt722_sdca_priv *rt722) 0x008d); /* check HP calibration FSM status */ for (loop_check = 0; loop_check < chk_cnt; loop_check++) { + usleep_range(10000, 11000); ret = rt722_sdca_index_read(rt722, RT722_VENDOR_CALI, RT722_DAC_DC_CALI_CTL3, &calib_status); - if (ret < 0 || loop_check == chk_cnt) + if (ret < 0) dev_dbg(&rt722->slave->dev, "calibration failed!, ret=%d\n", ret); if ((calib_status & 0x0040) == 0x0) break; } + + if (loop_check == chk_cnt) + dev_dbg(&rt722->slave->dev, "%s, calibration time-out!\n", __func__); + /* Set ADC09 power entity floating control */ rt722_sdca_index_write(rt722, RT722_VENDOR_HDA_CTL, RT722_ADC0A_08_PDE_FLOAT_CTL, 0x2a12);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen-Yu Tsai wenst@chromium.org
[ Upstream commit 32c9c06adb5b157ef259233775a063a43746d699 ]
On Chromebooks based on Mediatek MT8195 or MT8188, the audio frontend (AFE) is limited to accessing a very small window (1 MiB) of memory, which is described as a reserved memory region in the device tree.
On these two platforms, the maximum buffer size is given as 512 KiB. The MediaTek common code uses the same value for preallocations. This means that only the first two PCM substreams get preallocations, and then the whole space is exhausted, barring any other substreams from working. Since the substreams used are not always the first two, this means audio won't work correctly.
This is observed on the MT8188 Geralt Chromebooks, on which the "mediatek,dai-link" property was dropped when it was upstreamed. That property causes the driver to only register the PCM substreams listed in the property, and in the order given.
Instead of trying to compute an optimal value and figuring out which streams are used, simply disable preallocation. The PCM buffers are managed by the core and are allocated and released on the fly. There should be no impact to any of the other MediaTek platforms.
Signed-off-by: Chen-Yu Tsai wenst@chromium.org Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Link: https://patch.msgid.link/20241219105303.548437-1-wenst@chromium.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/mediatek/common/mtk-afe-platform-driver.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/mediatek/common/mtk-afe-platform-driver.c b/sound/soc/mediatek/common/mtk-afe-platform-driver.c index 01501d5747a7..52495c930ca3 100644 --- a/sound/soc/mediatek/common/mtk-afe-platform-driver.c +++ b/sound/soc/mediatek/common/mtk-afe-platform-driver.c @@ -120,8 +120,8 @@ int mtk_afe_pcm_new(struct snd_soc_component *component, struct mtk_base_afe *afe = snd_soc_component_get_drvdata(component);
size = afe->mtk_afe_hardware->buffer_bytes_max; - snd_pcm_set_managed_buffer_all(pcm, SNDRV_DMA_TYPE_DEV, - afe->dev, size, size); + snd_pcm_set_managed_buffer_all(pcm, SNDRV_DMA_TYPE_DEV, afe->dev, 0, size); + return 0; } EXPORT_SYMBOL_GPL(mtk_afe_pcm_new);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Li Zhijian lizhijian@fujitsu.com
[ Upstream commit 55853cb829dc707427c3519f6b8686682a204368 ]
The pattern rule `$(OUTPUT)/%: %.c` inadvertently included a circular dependency on the global-timer target due to its inclusion in $(TEST_GEN_PROGS_EXTENDED). This resulted in a circular dependency warning during the build process.
To resolve this, the dependency on $(TEST_GEN_PROGS_EXTENDED) has been replaced with an explicit dependency on $(OUTPUT)/libatest.so. This change ensures that libatest.so is built before any other targets that require it, without creating a circular dependency.
This fix addresses the following warning:
make[4]: Entering directory 'tools/testing/selftests/alsa' make[4]: Circular default_modconfig/kselftest/alsa/global-timer <- default_modconfig/kselftest/alsa/global-timer dependency dropped. make[4]: Nothing to be done for 'all'. make[4]: Leaving directory 'tools/testing/selftests/alsa'
Cc: Mark Brown broonie@kernel.org Cc: Jaroslav Kysela perex@perex.cz Cc: Takashi Iwai tiwai@suse.com Cc: Shuah Khan shuah@kernel.org Signed-off-by: Li Zhijian lizhijian@fujitsu.com Link: https://patch.msgid.link/20241218025931.914164-1-lizhijian@fujitsu.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/alsa/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/alsa/Makefile b/tools/testing/selftests/alsa/Makefile index 5af9ba8a4645..140c7f821727 100644 --- a/tools/testing/selftests/alsa/Makefile +++ b/tools/testing/selftests/alsa/Makefile @@ -23,5 +23,5 @@ include ../lib.mk $(OUTPUT)/libatest.so: conf.c alsa-local.h $(CC) $(CFLAGS) -shared -fPIC $< $(LDLIBS) -o $@
-$(OUTPUT)/%: %.c $(TEST_GEN_PROGS_EXTENDED) alsa-local.h +$(OUTPUT)/%: %.c $(OUTPUT)/libatest.so alsa-local.h $(CC) $(CFLAGS) $< $(LDLIBS) -latest -o $@
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Keisuke Nishimura keisuke.nishimura@inria.fr
[ Upstream commit 2c87309ea741341c6722efdf1fb3f50dd427c823 ]
ca8210_test_interface_init() returns the result of kfifo_alloc(), which can be non-zero in case of an error. The caller, ca8210_probe(), should check the return value and do error-handling if it fails.
Fixes: ded845a781a5 ("ieee802154: Add CA8210 IEEE 802.15.4 device driver") Signed-off-by: Keisuke Nishimura keisuke.nishimura@inria.fr Reviewed-by: Simon Horman horms@kernel.org Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/20241029182712.318271-1-keisuke.nishimura@inria.fr Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ieee802154/ca8210.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ieee802154/ca8210.c b/drivers/net/ieee802154/ca8210.c index 4ec0dab38872..0a0ad3d77557 100644 --- a/drivers/net/ieee802154/ca8210.c +++ b/drivers/net/ieee802154/ca8210.c @@ -3078,7 +3078,11 @@ static int ca8210_probe(struct spi_device *spi_device) spi_set_drvdata(priv->spi, priv); if (IS_ENABLED(CONFIG_IEEE802154_CA8210_DEBUGFS)) { cascoda_api_upstream = ca8210_test_int_driver_write; - ca8210_test_interface_init(priv); + ret = ca8210_test_interface_init(priv); + if (ret) { + dev_crit(&spi_device->dev, "ca8210_test_interface_init failed\n"); + goto error; + } } else { cascoda_api_upstream = NULL; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Antonio Pastor antonio.pastor@gmail.com
[ Upstream commit 1e9b0e1c550c42c13c111d1a31e822057232abc4 ]
802.2+LLC+SNAP frames received by napi_complete_done() with GRO and DSA have skb->transport_header set two bytes short, or pointing 2 bytes before network_header & skb->data. This was an issue as snap_rcv() expected offset to point to SNAP header (OID:PID), causing packet to be dropped.
A fix at llc_fixup_skb() (a024e377efed) resets transport_header for any LLC consumers that may care about it, and stops SNAP packets from being dropped, but doesn't fix the problem which is that LLC and SNAP should not use transport_header offset.
Ths patch eliminates the use of transport_header offset for SNAP lookup of OID:PID so that SNAP does not rely on the offset at all. The offset is reset after pull for any SNAP packet consumers that may (but shouldn't) use it.
Fixes: fda55eca5a33 ("net: introduce skb_transport_header_was_set()") Signed-off-by: Antonio Pastor antonio.pastor@gmail.com Reviewed-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/20250103012303.746521-1-antonio.pastor@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/802/psnap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/802/psnap.c b/net/802/psnap.c index 1406bfdbda13..dbd9647f2ef1 100644 --- a/net/802/psnap.c +++ b/net/802/psnap.c @@ -55,11 +55,11 @@ static int snap_rcv(struct sk_buff *skb, struct net_device *dev, goto drop;
rcu_read_lock(); - proto = find_snap_client(skb_transport_header(skb)); + proto = find_snap_client(skb->data); if (proto) { /* Pass the frame on. */ - skb->transport_header += 5; skb_pull_rcsum(skb, 5); + skb_reset_transport_header(skb); rc = proto->rcvfunc(skb, dev, &snap_packet_type, orig_dev); } rcu_read_unlock();
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Xing kernelxing@tencent.com
[ Upstream commit 9a79c65f00e2b036e17af3a3a607d7d732b7affb ]
Since commit 099ecf59f05b ("net: annotate lockless accesses to sk->sk_max_ack_backlog") decided to handle the sk_max_ack_backlog locklessly, there is one more function mostly called in TCP/DCCP cases. So this patch completes it:)
Signed-off-by: Jason Xing kernelxing@tencent.com Reviewed-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20240331090521.71965-1-kerneljasonxing@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 3479c7549fb1 ("tcp/dccp: allow a connection when sk_max_ack_backlog is zero") Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/inet_connection_sock.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h index fee1e5650551..7b5e783ea24d 100644 --- a/include/net/inet_connection_sock.h +++ b/include/net/inet_connection_sock.h @@ -282,7 +282,7 @@ static inline int inet_csk_reqsk_queue_len(const struct sock *sk)
static inline int inet_csk_reqsk_queue_is_full(const struct sock *sk) { - return inet_csk_reqsk_queue_len(sk) >= sk->sk_max_ack_backlog; + return inet_csk_reqsk_queue_len(sk) >= READ_ONCE(sk->sk_max_ack_backlog); }
bool inet_csk_reqsk_queue_drop(struct sock *sk, struct request_sock *req);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhongqiu Duan dzq.aishenghu0@gmail.com
[ Upstream commit 3479c7549fb1dfa7a1db4efb7347c7b8ef50de4b ]
If the backlog of listen() is set to zero, sk_acceptq_is_full() allows one connection to be made, but inet_csk_reqsk_queue_is_full() does not. When the net.ipv4.tcp_syncookies is zero, inet_csk_reqsk_queue_is_full() will cause an immediate drop before the sk_acceptq_is_full() check in tcp_conn_request(), resulting in no connection can be made.
This patch tries to keep consistent with 64a146513f8f ("[NET]: Revert incorrect accept queue backlog changes.").
Link: https://lore.kernel.org/netdev/20250102080258.53858-1-kuniyu@amazon.com/ Fixes: ef547f2ac16b ("tcp: remove max_qlen_log") Signed-off-by: Zhongqiu Duan dzq.aishenghu0@gmail.com Reviewed-by: Kuniyuki Iwashima kuniyu@amazon.com Reviewed-by: Jason Xing kerneljasonxing@gmail.com Reviewed-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/20250102171426.915276-1-dzq.aishenghu0@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/inet_connection_sock.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h index 7b5e783ea24d..e85834722b8f 100644 --- a/include/net/inet_connection_sock.h +++ b/include/net/inet_connection_sock.h @@ -282,7 +282,7 @@ static inline int inet_csk_reqsk_queue_len(const struct sock *sk)
static inline int inet_csk_reqsk_queue_is_full(const struct sock *sk) { - return inet_csk_reqsk_queue_len(sk) >= READ_ONCE(sk->sk_max_ack_backlog); + return inet_csk_reqsk_queue_len(sk) > READ_ONCE(sk->sk_max_ack_backlog); }
bool inet_csk_reqsk_queue_drop(struct sock *sk, struct request_sock *req);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit a039e54397c6a75b713b9ce7894a62e06956aa92 ]
syzbot found that TCA_FLOW_RSHIFT attribute was not validated. Right shitfing a 32bit integer is undefined for large shift values.
UBSAN: shift-out-of-bounds in net/sched/cls_flow.c:329:23 shift exponent 9445 is too large for 32-bit type 'u32' (aka 'unsigned int') CPU: 1 UID: 0 PID: 54 Comm: kworker/u8:3 Not tainted 6.13.0-rc3-syzkaller-00180-g4f619d518db9 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: ipv6_addrconf addrconf_dad_work Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:231 [inline] __ubsan_handle_shift_out_of_bounds+0x3c8/0x420 lib/ubsan.c:468 flow_classify+0x24d5/0x25b0 net/sched/cls_flow.c:329 tc_classify include/net/tc_wrapper.h:197 [inline] __tcf_classify net/sched/cls_api.c:1771 [inline] tcf_classify+0x420/0x1160 net/sched/cls_api.c:1867 sfb_classify net/sched/sch_sfb.c:260 [inline] sfb_enqueue+0x3ad/0x18b0 net/sched/sch_sfb.c:318 dev_qdisc_enqueue+0x4b/0x290 net/core/dev.c:3793 __dev_xmit_skb net/core/dev.c:3889 [inline] __dev_queue_xmit+0xf0e/0x3f50 net/core/dev.c:4400 dev_queue_xmit include/linux/netdevice.h:3168 [inline] neigh_hh_output include/net/neighbour.h:523 [inline] neigh_output include/net/neighbour.h:537 [inline] ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:236 iptunnel_xmit+0x55d/0x9b0 net/ipv4/ip_tunnel_core.c:82 udp_tunnel_xmit_skb+0x262/0x3b0 net/ipv4/udp_tunnel_core.c:173 geneve_xmit_skb drivers/net/geneve.c:916 [inline] geneve_xmit+0x21dc/0x2d00 drivers/net/geneve.c:1039 __netdev_start_xmit include/linux/netdevice.h:5002 [inline] netdev_start_xmit include/linux/netdevice.h:5011 [inline] xmit_one net/core/dev.c:3590 [inline] dev_hard_start_xmit+0x27a/0x7d0 net/core/dev.c:3606 __dev_queue_xmit+0x1b73/0x3f50 net/core/dev.c:4434
Fixes: e5dfb815181f ("[NET_SCHED]: Add flow classifier") Reported-by: syzbot+1dbb57d994e54aaa04d2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6777bf49.050a0220.178762.0040.GAE@google.com/... Signed-off-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/20250103104546.3714168-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_flow.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/sched/cls_flow.c b/net/sched/cls_flow.c index 6ab317b48d6c..815216b564f3 100644 --- a/net/sched/cls_flow.c +++ b/net/sched/cls_flow.c @@ -356,7 +356,8 @@ static const struct nla_policy flow_policy[TCA_FLOW_MAX + 1] = { [TCA_FLOW_KEYS] = { .type = NLA_U32 }, [TCA_FLOW_MODE] = { .type = NLA_U32 }, [TCA_FLOW_BASECLASS] = { .type = NLA_U32 }, - [TCA_FLOW_RSHIFT] = { .type = NLA_U32 }, + [TCA_FLOW_RSHIFT] = NLA_POLICY_MAX(NLA_U32, + 31 /* BITS_PER_U32 - 1 */), [TCA_FLOW_ADDEND] = { .type = NLA_U32 }, [TCA_FLOW_MASK] = { .type = NLA_U32 }, [TCA_FLOW_XOR] = { .type = NLA_U32 },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiawen Wu jiawenwu@trustnetic.com
[ Upstream commit 8ce4f287524c74a118b0af1eebd4b24a8efca57a ]
The existing SW-FW interaction flow on the driver is wrong. Follow this wrong flow, driver would never return error if there is a unknown command. Since firmware writes back 'firmware ready' and 'unknown command' in the mailbox message if there is an unknown command sent by driver. So reading 'firmware ready' does not timeout. Then driver would mistakenly believe that the interaction has completed successfully.
It tends to happen with the use of custom firmware. Move the check for 'unknown command' out of the poll timeout for 'firmware ready'. And adjust the debug log so that mailbox messages are always printed when commands timeout.
Fixes: 1efa9bfe58c5 ("net: libwx: Implement interaction with firmware") Signed-off-by: Jiawen Wu jiawenwu@trustnetic.com Link: https://patch.msgid.link/20250103081013.1995939-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/wangxun/libwx/wx_hw.c | 24 ++++++++++------------ 1 file changed, 11 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_hw.c b/drivers/net/ethernet/wangxun/libwx/wx_hw.c index 52130df26aee..d6bc2309d2a3 100644 --- a/drivers/net/ethernet/wangxun/libwx/wx_hw.c +++ b/drivers/net/ethernet/wangxun/libwx/wx_hw.c @@ -242,27 +242,25 @@ int wx_host_interface_command(struct wx *wx, u32 *buffer, status = read_poll_timeout(rd32, hicr, hicr & WX_MNG_MBOX_CTL_FWRDY, 1000, timeout * 1000, false, wx, WX_MNG_MBOX_CTL);
+ buf[0] = rd32(wx, WX_MNG_MBOX); + if ((buf[0] & 0xff0000) >> 16 == 0x80) { + wx_err(wx, "Unknown FW command: 0x%x\n", buffer[0] & 0xff); + status = -EINVAL; + goto rel_out; + } + /* Check command completion */ if (status) { - wx_dbg(wx, "Command has failed with no status valid.\n"); - - buf[0] = rd32(wx, WX_MNG_MBOX); - if ((buffer[0] & 0xff) != (~buf[0] >> 24)) { - status = -EINVAL; - goto rel_out; - } - if ((buf[0] & 0xff0000) >> 16 == 0x80) { - wx_dbg(wx, "It's unknown cmd.\n"); - status = -EINVAL; - goto rel_out; - } - + wx_err(wx, "Command has failed with no status valid.\n"); wx_dbg(wx, "write value:\n"); for (i = 0; i < dword_len; i++) wx_dbg(wx, "%x ", buffer[i]); wx_dbg(wx, "read value:\n"); for (i = 0; i < dword_len; i++) wx_dbg(wx, "%x ", buf[i]); + wx_dbg(wx, "\ncheck: %x %x\n", buffer[0] & 0xff, ~buf[0] >> 24); + + goto rel_out; }
if (!return_data)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
[ Upstream commit 6aecd91a5c5b68939cf4169e32bc49f3cd2dd329 ]
[BUG] Syzbot reported a crash with the following call trace:
BTRFS info (device loop0): scrub: started on devid 1 BUG: kernel NULL pointer dereference, address: 0000000000000208 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 106e70067 P4D 106e70067 PUD 107143067 PMD 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 UID: 0 PID: 689 Comm: repro Kdump: loaded Tainted: G O 6.13.0-rc4-custom+ #206 Tainted: [O]=OOT_MODULE Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022 RIP: 0010:find_first_extent_item+0x26/0x1f0 [btrfs] Call Trace: <TASK> scrub_find_fill_first_stripe+0x13d/0x3b0 [btrfs] scrub_simple_mirror+0x175/0x260 [btrfs] scrub_stripe+0x5d4/0x6c0 [btrfs] scrub_chunk+0xbb/0x170 [btrfs] scrub_enumerate_chunks+0x2f4/0x5f0 [btrfs] btrfs_scrub_dev+0x240/0x600 [btrfs] btrfs_ioctl+0x1dc8/0x2fa0 [btrfs] ? do_sys_openat2+0xa5/0xf0 __x64_sys_ioctl+0x97/0xc0 do_syscall_64+0x4f/0x120 entry_SYSCALL_64_after_hwframe+0x76/0x7e </TASK>
[CAUSE] The reproducer is using a corrupted image where extent tree root is corrupted, thus forcing to use "rescue=all,ro" mount option to mount the image.
Then it triggered a scrub, but since scrub relies on extent tree to find where the data/metadata extents are, scrub_find_fill_first_stripe() relies on an non-empty extent root.
But unfortunately scrub_find_fill_first_stripe() doesn't really expect an NULL pointer for extent root, it use extent_root to grab fs_info and triggered a NULL pointer dereference.
[FIX] Add an extra check for a valid extent root at the beginning of scrub_find_fill_first_stripe().
The new error path is introduced by 42437a6386ff ("btrfs: introduce mount option rescue=ignorebadroots"), but that's pretty old, and later commit b979547513ff ("btrfs: scrub: introduce helper to find and fill sector info for a scrub_stripe") changed how we do scrub.
So for kernels older than 6.6, the fix will need manual backport.
Reported-by: syzbot+339e9dbe3a2ca419b85d@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/67756935.050a0220.25abdd.0a12.GAE@google... Fixes: 42437a6386ff ("btrfs: introduce mount option rescue=ignorebadroots") Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/scrub.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index a2d91d9f8a10..6be092bb814f 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -1538,6 +1538,10 @@ static int scrub_find_fill_first_stripe(struct btrfs_block_group *bg, u64 extent_gen; int ret;
+ if (unlikely(!extent_root)) { + btrfs_err(fs_info, "no valid extent root for scrub"); + return -EUCLEAN; + } memset(stripe->sectors, 0, sizeof(struct scrub_sector_verification) * stripe->nr_sectors); scrub_stripe_reset_bitmaps(stripe);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shannon Nelson shannon.nelson@amd.com
[ Upstream commit 8c817eb26230dc0ae553cee16ff43a4a895f6756 ]
Add an array size limit to the for-loop to be sure we don't try to reference a fw_version string off the end of the fw info names array. We know that our firmware only has a limited number of firmware slot names, but we shouldn't leave this unchecked.
Fixes: 45d76f492938 ("pds_core: set up device and adminq") Signed-off-by: Shannon Nelson shannon.nelson@amd.com Reviewed-by: Simon Horman horms@kernel.org Reviewed-by: Brett Creeley brett.creeley@amd.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Link: https://patch.msgid.link/20250103195147.7408-1-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/amd/pds_core/devlink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/amd/pds_core/devlink.c b/drivers/net/ethernet/amd/pds_core/devlink.c index d8218bb153d9..971d4278280d 100644 --- a/drivers/net/ethernet/amd/pds_core/devlink.c +++ b/drivers/net/ethernet/amd/pds_core/devlink.c @@ -117,7 +117,7 @@ int pdsc_dl_info_get(struct devlink *dl, struct devlink_info_req *req, if (err && err != -EIO) return err;
- listlen = fw_list.num_fw_slots; + listlen = min(fw_list.num_fw_slots, ARRAY_SIZE(fw_list.fw_names)); for (i = 0; i < listlen; i++) { if (i < ARRAY_SIZE(fw_slotnames)) strscpy(buf, fw_slotnames[i], sizeof(buf));
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kalesh AP kalesh-anakkur.purayil@broadcom.com
[ Upstream commit c8dafb0e4398dacc362832098a04b97da3b0395b ]
When hwrm_req_replace() fails, the driver is not invoking bnxt_req_drop() which could cause a memory leak.
Fixes: bbf33d1d9805 ("bnxt_en: update all firmware calls to use the new APIs") Reviewed-by: Pavan Chebbi pavan.chebbi@broadcom.com Signed-off-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Link: https://patch.msgid.link/20250104043849.3482067-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c index 7689086371e0..2980963208cb 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c @@ -159,7 +159,7 @@ int bnxt_send_msg(struct bnxt_en_dev *edev,
rc = hwrm_req_replace(bp, req, fw_msg->msg, fw_msg->msg_len); if (rc) - return rc; + goto drop_req;
hwrm_req_timeout(bp, req, fw_msg->timeout); resp = hwrm_req_hold(bp, req); @@ -171,6 +171,7 @@ int bnxt_send_msg(struct bnxt_en_dev *edev,
memcpy(fw_msg->resp, resp, resp_len); } +drop_req: hwrm_req_drop(bp, req); return rc; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anumula Murali Mohan Reddy anumula@chelsio.com
[ Upstream commit 4c1224501e9d6c5fd12d83752f1c1b444e0e3418 ]
During ARP failure, tid is not inserted but _c4iw_free_ep() attempts to remove tid which results in error. This patch fixes the issue by avoiding removal of uninserted tid.
Fixes: 59437d78f088 ("cxgb4/chtls: fix ULD connection failures due to wrong TID base") Signed-off-by: Anumula Murali Mohan Reddy anumula@chelsio.com Signed-off-by: Potnuri Bharat Teja bharat@chelsio.com Link: https://patch.msgid.link/20250103092327.1011925-1-anumula@chelsio.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index b215ff14da1b..3989c9491f0f 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -1799,7 +1799,10 @@ void cxgb4_remove_tid(struct tid_info *t, unsigned int chan, unsigned int tid, struct adapter *adap = container_of(t, struct adapter, tids); struct sk_buff *skb;
- WARN_ON(tid_out_of_range(&adap->tids, tid)); + if (tid_out_of_range(&adap->tids, tid)) { + dev_err(adap->pdev_dev, "tid %d out of range\n", tid); + return; + }
if (t->tid_tab[tid - adap->tids.tid_base]) { t->tid_tab[tid - adap->tids.tid_base] = NULL;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Przemyslaw Korba przemyslaw.korba@intel.com
[ Upstream commit 6c5b989116083a98f45aada548ff54e7a83a9c2d ]
ptp4l application reports too high offset when ran on E823 device with a 100GB/s link. Those values cannot go under 100ns, like in a working case when using 100 GB/s cable.
This is due to incorrect frequency settings on the PHY clocks for 100 GB/s speed. Changes are introduced to align with the internal hardware documentation, and correctly initialize frequency in PHY clocks with the frequency values that are in our HW spec.
To reproduce the issue run ptp4l as a Time Receiver on E823 device, and observe the offset, which will never approach values seen in the PTP working case.
Reproduction output: ptp4l -i enp137s0f3 -m -2 -s -f /etc/ptp4l_8275.conf ptp4l[5278.775]: master offset 12470 s2 freq +41288 path delay -3002 ptp4l[5278.837]: master offset 10525 s2 freq +39202 path delay -3002 ptp4l[5278.900]: master offset -24840 s2 freq -20130 path delay -3002 ptp4l[5278.963]: master offset 10597 s2 freq +37908 path delay -3002 ptp4l[5279.025]: master offset 8883 s2 freq +36031 path delay -3002 ptp4l[5279.088]: master offset 7267 s2 freq +34151 path delay -3002 ptp4l[5279.150]: master offset 5771 s2 freq +32316 path delay -3002 ptp4l[5279.213]: master offset 4388 s2 freq +30526 path delay -3002 ptp4l[5279.275]: master offset -30434 s2 freq -28485 path delay -3002 ptp4l[5279.338]: master offset -28041 s2 freq -27412 path delay -3002 ptp4l[5279.400]: master offset 7870 s2 freq +31118 path delay -3002
Fixes: 3a7496234d17 ("ice: implement basic E822 PTP support") Reviewed-by: Milena Olech milena.olech@intel.com Signed-off-by: Przemyslaw Korba przemyslaw.korba@intel.com Tested-by: Rinitha S sx.rinitha@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_ptp_consts.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_consts.h b/drivers/net/ethernet/intel/ice/ice_ptp_consts.h index 4109aa3b2fcd..87ce20540f57 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_consts.h +++ b/drivers/net/ethernet/intel/ice/ice_ptp_consts.h @@ -359,9 +359,9 @@ const struct ice_vernier_info_e822 e822_vernier[NUM_ICE_PTP_LNK_SPD] = { /* rx_desk_rsgb_par */ 644531250, /* 644.53125 MHz Reed Solomon gearbox */ /* tx_desk_rsgb_pcs */ - 644531250, /* 644.53125 MHz Reed Solomon gearbox */ + 390625000, /* 390.625 MHz Reed Solomon gearbox */ /* rx_desk_rsgb_pcs */ - 644531250, /* 644.53125 MHz Reed Solomon gearbox */ + 390625000, /* 390.625 MHz Reed Solomon gearbox */ /* tx_fixed_delay */ 1620, /* pmd_adj_divisor */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jesse Brandeburg jesse.brandeburg@intel.com
[ Upstream commit a8e0c7a6800dc466ac815264c16971b9adf7ffbd ]
Refactor the igc driver to use FIELD_GET() for mask and shift reads, which reduces lines of code and adds clarity of intent.
This code was generated by the following coccinelle/spatch script and then manually repaired in a later patch.
@get@ constant shift,mask; type T; expression a; @@ -((T)((a) & mask) >> shift) +FIELD_GET(mask, a)
and applied via: spatch --sp-file field_prep.cocci --in-place --dir \ drivers/net/ethernet/intel/
Cc: Julia Lawall Julia.Lawall@inria.fr Reviewed-by: Marcin Szycik marcin.szycik@linux.intel.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: Jesse Brandeburg jesse.brandeburg@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Stable-dep-of: bd2776e39c2a ("igc: return early when failing to read EECD register") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_base.c | 6 ++---- drivers/net/ethernet/intel/igc/igc_i225.c | 5 ++--- drivers/net/ethernet/intel/igc/igc_main.c | 6 ++---- drivers/net/ethernet/intel/igc/igc_phy.c | 4 ++-- 4 files changed, 8 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/intel/igc/igc_base.c b/drivers/net/ethernet/intel/igc/igc_base.c index a1d815af507d..9fae8bdec2a7 100644 --- a/drivers/net/ethernet/intel/igc/igc_base.c +++ b/drivers/net/ethernet/intel/igc/igc_base.c @@ -68,8 +68,7 @@ static s32 igc_init_nvm_params_base(struct igc_hw *hw) u32 eecd = rd32(IGC_EECD); u16 size;
- size = (u16)((eecd & IGC_EECD_SIZE_EX_MASK) >> - IGC_EECD_SIZE_EX_SHIFT); + size = FIELD_GET(IGC_EECD_SIZE_EX_MASK, eecd);
/* Added to a constant, "size" becomes the left-shift value * for setting word_size. @@ -162,8 +161,7 @@ static s32 igc_init_phy_params_base(struct igc_hw *hw) phy->reset_delay_us = 100;
/* set lan id */ - hw->bus.func = (rd32(IGC_STATUS) & IGC_STATUS_FUNC_MASK) >> - IGC_STATUS_FUNC_SHIFT; + hw->bus.func = FIELD_GET(IGC_STATUS_FUNC_MASK, rd32(IGC_STATUS));
/* Make sure the PHY is in a good state. Several people have reported * firmware leaving the PHY's page select register set to something diff --git a/drivers/net/ethernet/intel/igc/igc_i225.c b/drivers/net/ethernet/intel/igc/igc_i225.c index d2562c8e8015..0dd61719f1ed 100644 --- a/drivers/net/ethernet/intel/igc/igc_i225.c +++ b/drivers/net/ethernet/intel/igc/igc_i225.c @@ -579,9 +579,8 @@ s32 igc_set_ltr_i225(struct igc_hw *hw, bool link)
/* Calculate tw_system (nsec). */ if (speed == SPEED_100) { - tw_system = ((rd32(IGC_EEE_SU) & - IGC_TW_SYSTEM_100_MASK) >> - IGC_TW_SYSTEM_100_SHIFT) * 500; + tw_system = FIELD_GET(IGC_TW_SYSTEM_100_MASK, + rd32(IGC_EEE_SU)) * 500; } else { tw_system = (rd32(IGC_EEE_SU) & IGC_TW_SYSTEM_1000_MASK) * 500; diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index da1018d83262..91a4722460f6 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -3708,8 +3708,7 @@ static int igc_enable_nfc_rule(struct igc_adapter *adapter, }
if (rule->filter.match_flags & IGC_FILTER_FLAG_VLAN_TCI) { - int prio = (rule->filter.vlan_tci & VLAN_PRIO_MASK) >> - VLAN_PRIO_SHIFT; + int prio = FIELD_GET(VLAN_PRIO_MASK, rule->filter.vlan_tci);
err = igc_add_vlan_prio_filter(adapter, prio, rule->action); if (err) @@ -3731,8 +3730,7 @@ static void igc_disable_nfc_rule(struct igc_adapter *adapter, igc_del_etype_filter(adapter, rule->filter.etype);
if (rule->filter.match_flags & IGC_FILTER_FLAG_VLAN_TCI) { - int prio = (rule->filter.vlan_tci & VLAN_PRIO_MASK) >> - VLAN_PRIO_SHIFT; + int prio = FIELD_GET(VLAN_PRIO_MASK, rule->filter.vlan_tci);
igc_del_vlan_prio_filter(adapter, prio); } diff --git a/drivers/net/ethernet/intel/igc/igc_phy.c b/drivers/net/ethernet/intel/igc/igc_phy.c index d0d9e7170154..7cd8716d2ffa 100644 --- a/drivers/net/ethernet/intel/igc/igc_phy.c +++ b/drivers/net/ethernet/intel/igc/igc_phy.c @@ -727,7 +727,7 @@ static s32 igc_write_xmdio_reg(struct igc_hw *hw, u16 addr, */ s32 igc_write_phy_reg_gpy(struct igc_hw *hw, u32 offset, u16 data) { - u8 dev_addr = (offset & GPY_MMD_MASK) >> GPY_MMD_SHIFT; + u8 dev_addr = FIELD_GET(GPY_MMD_MASK, offset); s32 ret_val;
offset = offset & GPY_REG_MASK; @@ -758,7 +758,7 @@ s32 igc_write_phy_reg_gpy(struct igc_hw *hw, u32 offset, u16 data) */ s32 igc_read_phy_reg_gpy(struct igc_hw *hw, u32 offset, u16 *data) { - u8 dev_addr = (offset & GPY_MMD_MASK) >> GPY_MMD_SHIFT; + u8 dev_addr = FIELD_GET(GPY_MMD_MASK, offset); s32 ret_val;
offset = offset & GPY_REG_MASK;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: En-Wei Wu en-wei.wu@canonical.com
[ Upstream commit bd2776e39c2a82ef4681d02678bb77b3d41e79be ]
When booting with a dock connected, the igc driver may get stuck for ~40 seconds if PCIe link is lost during initialization.
This happens because the driver access device after EECD register reads return all F's, indicating failed reads. Consequently, hw->hw_addr is set to NULL, which impacts subsequent rd32() reads. This leads to the driver hanging in igc_get_hw_semaphore_i225(), as the invalid hw->hw_addr prevents retrieving the expected value.
To address this, a validation check and a corresponding return value catch is added for the EECD register read result. If all F's are returned, indicating PCIe link loss, the driver will return -ENXIO immediately. This avoids the 40-second hang and significantly improves boot time when using a dock with an igc NIC.
Log before the patch: [ 0.911913] igc 0000:70:00.0: enabling device (0000 -> 0002) [ 0.912386] igc 0000:70:00.0: PTM enabled, 4ns granularity [ 1.571098] igc 0000:70:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached [ 43.449095] igc_get_hw_semaphore_i225: igc 0000:70:00.0 (unnamed net_device) (uninitialized): Driver can't access device - SMBI bit is set. [ 43.449186] igc 0000:70:00.0: probe with driver igc failed with error -13 [ 46.345701] igc 0000:70:00.0: enabling device (0000 -> 0002) [ 46.345777] igc 0000:70:00.0: PTM enabled, 4ns granularity
Log after the patch: [ 1.031000] igc 0000:70:00.0: enabling device (0000 -> 0002) [ 1.032097] igc 0000:70:00.0: PTM enabled, 4ns granularity [ 1.642291] igc 0000:70:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached [ 5.480490] igc 0000:70:00.0: enabling device (0000 -> 0002) [ 5.480516] igc 0000:70:00.0: PTM enabled, 4ns granularity
Fixes: ab4056126813 ("igc: Add NVM support") Cc: Chia-Lin Kao (AceLan) acelan.kao@canonical.com Signed-off-by: En-Wei Wu en-wei.wu@canonical.com Reviewed-by: Vitaly Lifshits vitaly.lifshits@intel.com Tested-by: Mor Bar-Gabay morx.bar.gabay@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_base.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/intel/igc/igc_base.c b/drivers/net/ethernet/intel/igc/igc_base.c index 9fae8bdec2a7..1613b562d17c 100644 --- a/drivers/net/ethernet/intel/igc/igc_base.c +++ b/drivers/net/ethernet/intel/igc/igc_base.c @@ -68,6 +68,10 @@ static s32 igc_init_nvm_params_base(struct igc_hw *hw) u32 eecd = rd32(IGC_EECD); u16 size;
+ /* failed to read reg and got all F's */ + if (!(~eecd)) + return -ENXIO; + size = FIELD_GET(IGC_EECD_SIZE_EX_MASK, eecd);
/* Added to a constant, "size" becomes the left-shift value @@ -221,6 +225,8 @@ static s32 igc_get_invariants_base(struct igc_hw *hw)
/* NVM initialization */ ret_val = igc_init_nvm_params_base(hw); + if (ret_val) + goto out; switch (hw->mac.type) { case igc_i225: ret_val = igc_init_nvm_params_i225(hw);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Benjamin Coddington bcodding@redhat.com
[ Upstream commit b341ca51d2679829d26a3f6a4aa9aee9abd94f92 ]
We've noticed that NFS can hang when using RPC over TLS on an unstable connection, and investigation shows that the RPC layer is stuck in a tight loop attempting to transmit, but forever getting -EBADMSG back from the underlying network. The loop begins when tcp_sendmsg_locked() returns -EPIPE to tls_tx_records(), but that error is converted to -EBADMSG when calling the socket's error reporting handler.
Instead of converting errors from tcp_sendmsg_locked(), let's pass them along in this path. The RPC layer handles -EPIPE by reconnecting the transport, which prevents the endless attempts to transmit on a broken connection.
Signed-off-by: Benjamin Coddington bcodding@redhat.com Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance") Link: https://patch.msgid.link/9594185559881679d81f071b181a10eb07cd079f.1736004079... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/tls/tls_sw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index df166f6afad8..6e30fe879d53 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -458,7 +458,7 @@ int tls_tx_records(struct sock *sk, int flags)
tx_err: if (rc < 0 && rc != -EAGAIN) - tls_err_abort(sk, -EBADMSG); + tls_err_abort(sk, rc);
return rc; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit cb358ff94154774d031159b018adf45e17673941 ]
syzbot presented an use-after-free report [0] regarding ipvlan and linkwatch.
ipvlan does not hold a refcnt of the lower device unlike vlan and macvlan.
If the linkwatch work is triggered for the ipvlan dev, the lower dev might have already been freed, resulting in UAF of ipvlan->phy_dev in ipvlan_get_iflink().
We can delay the lower dev unregistration like vlan and macvlan by holding the lower dev's refcnt in dev->netdev_ops->ndo_init() and releasing it in dev->priv_destructor().
Jakub pointed out calling .ndo_XXX after unregister_netdevice() has returned is error prone and suggested [1] addressing this UAF in the core by taking commit 750e51603395 ("net: avoid potential UAF in default_operstate()") further.
Let's assume unregistering devices DOWN and use RCU protection in default_operstate() not to race with the device unregistration.
[0]: BUG: KASAN: slab-use-after-free in ipvlan_get_iflink+0x84/0x88 drivers/net/ipvlan/ipvlan_main.c:353 Read of size 4 at addr ffff0000d768c0e0 by task kworker/u8:35/6944
CPU: 0 UID: 0 PID: 6944 Comm: kworker/u8:35 Not tainted 6.13.0-rc2-g9bc5c9515b48 #12 4c3cb9e8b4565456f6a355f312ff91f4f29b3c47 Hardware name: linux,dummy-virt (DT) Workqueue: events_unbound linkwatch_event Call trace: show_stack+0x38/0x50 arch/arm64/kernel/stacktrace.c:484 (C) __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0xbc/0x108 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0x16c/0x6f0 mm/kasan/report.c:489 kasan_report+0xc0/0x120 mm/kasan/report.c:602 __asan_report_load4_noabort+0x20/0x30 mm/kasan/report_generic.c:380 ipvlan_get_iflink+0x84/0x88 drivers/net/ipvlan/ipvlan_main.c:353 dev_get_iflink+0x7c/0xd8 net/core/dev.c:674 default_operstate net/core/link_watch.c:45 [inline] rfc2863_policy+0x144/0x360 net/core/link_watch.c:72 linkwatch_do_dev+0x60/0x228 net/core/link_watch.c:175 __linkwatch_run_queue+0x2f4/0x5b8 net/core/link_watch.c:239 linkwatch_event+0x64/0xa8 net/core/link_watch.c:282 process_one_work+0x700/0x1398 kernel/workqueue.c:3229 process_scheduled_works kernel/workqueue.c:3310 [inline] worker_thread+0x8c4/0xe10 kernel/workqueue.c:3391 kthread+0x2b0/0x360 kernel/kthread.c:389 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862
Allocated by task 9303: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x30/0x68 mm/kasan/common.c:68 kasan_save_alloc_info+0x44/0x58 mm/kasan/generic.c:568 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x84/0xa0 mm/kasan/common.c:394 kasan_kmalloc include/linux/kasan.h:260 [inline] __do_kmalloc_node mm/slub.c:4283 [inline] __kmalloc_node_noprof+0x2a0/0x560 mm/slub.c:4289 __kvmalloc_node_noprof+0x9c/0x230 mm/util.c:650 alloc_netdev_mqs+0xb4/0x1118 net/core/dev.c:11209 rtnl_create_link+0x2b8/0xb60 net/core/rtnetlink.c:3595 rtnl_newlink_create+0x19c/0x868 net/core/rtnetlink.c:3771 __rtnl_newlink net/core/rtnetlink.c:3896 [inline] rtnl_newlink+0x122c/0x15c0 net/core/rtnetlink.c:4011 rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6901 netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2542 rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6928 netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline] netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1347 netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1891 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg net/socket.c:726 [inline] __sys_sendto+0x2ec/0x438 net/socket.c:2197 __do_sys_sendto net/socket.c:2204 [inline] __se_sys_sendto net/socket.c:2200 [inline] __arm64_sys_sendto+0xe4/0x110 net/socket.c:2200 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132 do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151 el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744 el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762 el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600
Freed by task 10200: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x30/0x68 mm/kasan/common.c:68 kasan_save_free_info+0x58/0x70 mm/kasan/generic.c:582 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x48/0x68 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2338 [inline] slab_free mm/slub.c:4598 [inline] kfree+0x140/0x420 mm/slub.c:4746 kvfree+0x4c/0x68 mm/util.c:693 netdev_release+0x94/0xc8 net/core/net-sysfs.c:2034 device_release+0x98/0x1c0 kobject_cleanup lib/kobject.c:689 [inline] kobject_release lib/kobject.c:720 [inline] kref_put include/linux/kref.h:65 [inline] kobject_put+0x2b0/0x438 lib/kobject.c:737 netdev_run_todo+0xdd8/0xf48 net/core/dev.c:10924 rtnl_unlock net/core/rtnetlink.c:152 [inline] rtnl_net_unlock net/core/rtnetlink.c:209 [inline] rtnl_dellink+0x484/0x680 net/core/rtnetlink.c:3526 rtnetlink_rcv_msg+0x61c/0x918 net/core/rtnetlink.c:6901 netlink_rcv_skb+0x1dc/0x398 net/netlink/af_netlink.c:2542 rtnetlink_rcv+0x34/0x50 net/core/rtnetlink.c:6928 netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline] netlink_unicast+0x618/0x838 net/netlink/af_netlink.c:1347 netlink_sendmsg+0x5fc/0x8b0 net/netlink/af_netlink.c:1891 sock_sendmsg_nosec net/socket.c:711 [inline] __sock_sendmsg net/socket.c:726 [inline] ____sys_sendmsg+0x410/0x708 net/socket.c:2583 ___sys_sendmsg+0x178/0x1d8 net/socket.c:2637 __sys_sendmsg net/socket.c:2669 [inline] __do_sys_sendmsg net/socket.c:2674 [inline] __se_sys_sendmsg net/socket.c:2672 [inline] __arm64_sys_sendmsg+0x12c/0x1c8 net/socket.c:2672 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x90/0x278 arch/arm64/kernel/syscall.c:49 el0_svc_common+0x13c/0x250 arch/arm64/kernel/syscall.c:132 do_el0_svc+0x54/0x70 arch/arm64/kernel/syscall.c:151 el0_svc+0x4c/0xa8 arch/arm64/kernel/entry-common.c:744 el0t_64_sync_handler+0x78/0x108 arch/arm64/kernel/entry-common.c:762 el0t_64_sync+0x198/0x1a0 arch/arm64/kernel/entry.S:600
The buggy address belongs to the object at ffff0000d768c000 which belongs to the cache kmalloc-cg-4k of size 4096 The buggy address is located 224 bytes inside of freed 4096-byte region [ffff0000d768c000, ffff0000d768d000)
The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x117688 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff0000c77ef981 flags: 0xbfffe0000000040(head|node=0|zone=2|lastcpupid=0x1ffff) page_type: f5(slab) raw: 0bfffe0000000040 ffff0000c000f500 dead000000000100 dead000000000122 raw: 0000000000000000 0000000000040004 00000001f5000000 ffff0000c77ef981 head: 0bfffe0000000040 ffff0000c000f500 dead000000000100 dead000000000122 head: 0000000000000000 0000000000040004 00000001f5000000 ffff0000c77ef981 head: 0bfffe0000000003 fffffdffc35da201 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff0000d768bf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff0000d768c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff0000d768c080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff0000d768c100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff0000d768c180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Fixes: 8c55facecd7a ("net: linkwatch: only report IF_OPER_LOWERLAYERDOWN if iflink is actually down") Reported-by: syzkaller syzkaller@googlegroups.com Suggested-by: Jakub Kicinski kuba@kernel.org Link: https://lore.kernel.org/netdev/20250102174400.085fd8ac@kernel.org/ [1] Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Link: https://patch.msgid.link/20250106071911.64355-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/link_watch.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/net/core/link_watch.c b/net/core/link_watch.c index 66422c95c83c..e2e8a341318b 100644 --- a/net/core/link_watch.c +++ b/net/core/link_watch.c @@ -42,14 +42,18 @@ static unsigned char default_operstate(const struct net_device *dev) * first check whether lower is indeed the source of its down state. */ if (!netif_carrier_ok(dev)) { - int iflink = dev_get_iflink(dev); struct net_device *peer; + int iflink;
/* If called from netdev_run_todo()/linkwatch_sync_dev(), * dev_net(dev) can be already freed, and RTNL is not held. */ - if (dev->reg_state == NETREG_UNREGISTERED || - iflink == dev->ifindex) + if (dev->reg_state <= NETREG_REGISTERED) + iflink = dev_get_iflink(dev); + else + iflink = dev->ifindex; + + if (iflink == dev->ifindex) return IF_OPER_DOWN;
ASSERT_RTNL();
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit db78475ba0d3c66d430f7ded2388cc041078a542 ]
Commit f85949f98206 ("xdp: add xdp_set_features_flag utility routine") added routines to inform the core about XDP flag changes. GVE support was added around the same time and missed using them.
GVE only changes the flags on error recover or resume. Presumably the flags may change during resume if VM migrated. User would not get the notification and upper devices would not get a chance to recalculate their flags.
Fixes: 75eaae158b1b ("gve: Add XDP DROP and TX support for GQI-QPL format") Reviewed-By: Jeroen de Borst jeroendb@google.com Reviewed-by: Willem de Bruijn willemb@google.com Link: https://patch.msgid.link/20250106180210.1861784-1-kuba@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/google/gve/gve_main.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c index d70305654e7d..90d433b36799 100644 --- a/drivers/net/ethernet/google/gve/gve_main.c +++ b/drivers/net/ethernet/google/gve/gve_main.c @@ -2009,14 +2009,18 @@ static void gve_service_task(struct work_struct *work)
static void gve_set_netdev_xdp_features(struct gve_priv *priv) { + xdp_features_t xdp_features; + if (priv->queue_format == GVE_GQI_QPL_FORMAT) { - priv->dev->xdp_features = NETDEV_XDP_ACT_BASIC; - priv->dev->xdp_features |= NETDEV_XDP_ACT_REDIRECT; - priv->dev->xdp_features |= NETDEV_XDP_ACT_NDO_XMIT; - priv->dev->xdp_features |= NETDEV_XDP_ACT_XSK_ZEROCOPY; + xdp_features = NETDEV_XDP_ACT_BASIC; + xdp_features |= NETDEV_XDP_ACT_REDIRECT; + xdp_features |= NETDEV_XDP_ACT_NDO_XMIT; + xdp_features |= NETDEV_XDP_ACT_XSK_ZEROCOPY; } else { - priv->dev->xdp_features = 0; + xdp_features = 0; } + + xdp_set_features_flag(priv->dev, xdp_features); }
static int gve_init_priv(struct gve_priv *priv, bool skip_describe_device)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
[ Upstream commit c2994b008492db033d40bd767be1620229a3035e ]
This fixes errors such as the following when Own address type is set to Random Address but it has not been programmed yet due to either be advertising or connecting:
< HCI Command: LE Set Exte.. (0x08|0x0041) plen 13 Own address type: Random (0x03) Filter policy: Ignore not in accept list (0x01) PHYs: 0x05 Entry 0: LE 1M Type: Passive (0x00) Interval: 60.000 msec (0x0060) Window: 30.000 msec (0x0030) Entry 1: LE Coded Type: Passive (0x00) Interval: 180.000 msec (0x0120) Window: 90.000 msec (0x0090)
HCI Event: Command Complete (0x0e) plen 4
LE Set Extended Scan Parameters (0x08|0x0041) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Exten.. (0x08|0x0042) plen 6 Extended scan: Enabled (0x01) Filter duplicates: Enabled (0x01) Duration: 0 msec (0x0000) Period: 0.00 sec (0x0000)
HCI Event: Command Complete (0x0e) plen 4
LE Set Extended Scan Enable (0x08|0x0042) ncmd 1 Status: Invalid HCI Command Parameters (0x12)
Fixes: c45074d68a9b ("Bluetooth: Fix not generating RPA when required") Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/hci_sync.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c index d95e2b55badb..d6f40806ee51 100644 --- a/net/bluetooth/hci_sync.c +++ b/net/bluetooth/hci_sync.c @@ -1049,9 +1049,9 @@ static bool adv_use_rpa(struct hci_dev *hdev, uint32_t flags)
static int hci_set_random_addr_sync(struct hci_dev *hdev, bdaddr_t *rpa) { - /* If we're advertising or initiating an LE connection we can't - * go ahead and change the random address at this time. This is - * because the eventual initiator address used for the + /* If a random_addr has been set we're advertising or initiating an LE + * connection we can't go ahead and change the random address at this + * time. This is because the eventual initiator address used for the * subsequently created connection will be undefined (some * controllers use the new address and others the one we had * when the operation started). @@ -1059,8 +1059,9 @@ static int hci_set_random_addr_sync(struct hci_dev *hdev, bdaddr_t *rpa) * In this kind of scenario skip the update and let the random * address be updated at the next cycle. */ - if (hci_dev_test_flag(hdev, HCI_LE_ADV) || - hci_lookup_le_connect(hdev)) { + if (bacmp(&hdev->random_addr, BDADDR_ANY) && + (hci_dev_test_flag(hdev, HCI_LE_ADV) || + hci_lookup_le_connect(hdev))) { bt_dev_dbg(hdev, "Deferring random address update"); hci_dev_set_flag(hdev, HCI_RPA_EXPIRED); return 0;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
[ Upstream commit a182d9c84f9c52fb5db895ecceeee8b3a1bf661e ]
Add Device with LE type requires updating resolving/accept list which requires quite a number of commands to complete and each of them may fail, so instead of pretending it would always work this checks the return of hci_update_passive_scan_sync which indicates if everything worked as intended.
Fixes: e8907f76544f ("Bluetooth: hci_sync: Make use of hci_cmd_sync_queue set 3") Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/mgmt.c | 38 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 36 insertions(+), 2 deletions(-)
diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c index 1175248e4bec..e3440f0d7d9d 100644 --- a/net/bluetooth/mgmt.c +++ b/net/bluetooth/mgmt.c @@ -7589,6 +7589,24 @@ static void device_added(struct sock *sk, struct hci_dev *hdev, mgmt_event(MGMT_EV_DEVICE_ADDED, hdev, &ev, sizeof(ev), sk); }
+static void add_device_complete(struct hci_dev *hdev, void *data, int err) +{ + struct mgmt_pending_cmd *cmd = data; + struct mgmt_cp_add_device *cp = cmd->param; + + if (!err) { + device_added(cmd->sk, hdev, &cp->addr.bdaddr, cp->addr.type, + cp->action); + device_flags_changed(NULL, hdev, &cp->addr.bdaddr, + cp->addr.type, hdev->conn_flags, + PTR_UINT(cmd->user_data)); + } + + mgmt_cmd_complete(cmd->sk, hdev->id, MGMT_OP_ADD_DEVICE, + mgmt_status(err), &cp->addr, sizeof(cp->addr)); + mgmt_pending_free(cmd); +} + static int add_device_sync(struct hci_dev *hdev, void *data) { return hci_update_passive_scan_sync(hdev); @@ -7597,6 +7615,7 @@ static int add_device_sync(struct hci_dev *hdev, void *data) static int add_device(struct sock *sk, struct hci_dev *hdev, void *data, u16 len) { + struct mgmt_pending_cmd *cmd; struct mgmt_cp_add_device *cp = data; u8 auto_conn, addr_type; struct hci_conn_params *params; @@ -7677,9 +7696,24 @@ static int add_device(struct sock *sk, struct hci_dev *hdev, current_flags = params->flags; }
- err = hci_cmd_sync_queue(hdev, add_device_sync, NULL, NULL); - if (err < 0) + cmd = mgmt_pending_new(sk, MGMT_OP_ADD_DEVICE, hdev, data, len); + if (!cmd) { + err = -ENOMEM; goto unlock; + } + + cmd->user_data = UINT_PTR(current_flags); + + err = hci_cmd_sync_queue(hdev, add_device_sync, cmd, + add_device_complete); + if (err < 0) { + err = mgmt_cmd_complete(sk, hdev->id, MGMT_OP_ADD_DEVICE, + MGMT_STATUS_FAILED, &cp->addr, + sizeof(cp->addr)); + mgmt_pending_free(cmd); + } + + goto unlock;
added: device_added(sk, hdev, &cp->addr.bdaddr, cp->addr.type, cp->action);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Neeraj Sanjay Kale neeraj.sanjaykale@nxp.com
[ Upstream commit 8023dd2204254a70887f5ee58d914bf70a060b9d ]
This fixes the apparent controller hang issue seen during stress test where the host sends a truncated payload, followed by HCI commands. The controller treats these HCI commands as a part of previously truncated payload, leading to command timeouts.
Adding a serdev_device_wait_until_sent() call after serdev_device_write_buf() fixed the issue.
Fixes: 689ca16e5232 ("Bluetooth: NXP: Add protocol support for NXP Bluetooth chipsets") Signed-off-by: Neeraj Sanjay Kale neeraj.sanjaykale@nxp.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btnxpuart.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/bluetooth/btnxpuart.c b/drivers/bluetooth/btnxpuart.c index 5ee9a8b8dcfd..e809bb2dbe5e 100644 --- a/drivers/bluetooth/btnxpuart.c +++ b/drivers/bluetooth/btnxpuart.c @@ -1280,6 +1280,7 @@ static void btnxpuart_tx_work(struct work_struct *work)
while ((skb = nxp_dequeue(nxpdev))) { len = serdev_device_write_buf(serdev, skb->data, skb->len); + serdev_device_wait_until_sent(serdev, 0); hdev->stat.byte_tx += len;
skb_pull(skb, len);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Borkmann daniel@iogearbox.net
[ Upstream commit 80fb40baba19e25a1b6f3ecff6fc5c0171806bde ]
This is a follow-up to 3c5b4d69c358 ("net: annotate data-races around sk->sk_mark"). sk->sk_mark can be read and written without holding the socket lock. IPv6 equivalent is already covered with READ_ONCE() annotation in tcp_v6_send_response().
Fixes: 3c5b4d69c358 ("net: annotate data-races around sk->sk_mark") Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/f459d1fc44f205e13f6d8bdca2c8bfb9902ffac9.1736244569... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_ipv4.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index df3ddf31f8e6..705320f160ac 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -832,7 +832,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb) sock_net_set(ctl_sk, net); if (sk) { ctl_sk->sk_mark = (sk->sk_state == TCP_TIME_WAIT) ? - inet_twsk(sk)->tw_mark : sk->sk_mark; + inet_twsk(sk)->tw_mark : READ_ONCE(sk->sk_mark); ctl_sk->sk_priority = (sk->sk_state == TCP_TIME_WAIT) ? inet_twsk(sk)->tw_priority : sk->sk_priority; transmit_time = tcp_transmit_time(sk);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexandre Ghiti alexghiti@rivosinc.com
commit 6ca445d8af0ed5950ebf899415fd6bfcd7d9d7a3 upstream.
Commit c97bf629963e ("riscv: Fix text patching when IPI are used") converted ftrace_make_nop() to use patch_insn_write() which does not emit any icache flush relying entirely on __ftrace_modify_code() to do that.
But we missed that ftrace_make_nop() was called very early directly when converting mcount calls into nops (actually on riscv it converts 2B nops emitted by the compiler into 4B nops).
This caused crashes on multiple HW as reported by Conor and Björn since the booting core could have half-patched instructions in its icache which would trigger an illegal instruction trap: fix this by emitting a local flush icache when early patching nops.
Fixes: c97bf629963e ("riscv: Fix text patching when IPI are used") Signed-off-by: Alexandre Ghiti alexghiti@rivosinc.com Reported-by: Conor Dooley conor.dooley@microchip.com Tested-by: Conor Dooley conor.dooley@microchip.com Reviewed-by: Björn Töpel bjorn@rivosinc.com Tested-by: Björn Töpel bjorn@rivosinc.com Link: https://lore.kernel.org/r/20240523115134.70380-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/include/asm/cacheflush.h | 6 ++++++ arch/riscv/kernel/ftrace.c | 3 +++ 2 files changed, 9 insertions(+)
--- a/arch/riscv/include/asm/cacheflush.h +++ b/arch/riscv/include/asm/cacheflush.h @@ -13,6 +13,12 @@ static inline void local_flush_icache_al asm volatile ("fence.i" ::: "memory"); }
+static inline void local_flush_icache_range(unsigned long start, + unsigned long end) +{ + local_flush_icache_all(); +} + #define PG_dcache_clean PG_arch_1
static inline void flush_dcache_folio(struct folio *folio) --- a/arch/riscv/kernel/ftrace.c +++ b/arch/riscv/kernel/ftrace.c @@ -120,6 +120,9 @@ int ftrace_init_nop(struct module *mod, out = ftrace_make_nop(mod, rec, MCOUNT_ADDR); mutex_unlock(&text_mutex);
+ if (!mod) + local_flush_icache_range(rec->ip, rec->ip + MCOUNT_INSN_SIZE); + return out; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wei Yang richard.weiyang@gmail.com
commit 9364a7e40d54e6858479f0a96e1a04aa1204be16 upstream.
commit 8043832e2a12 ("memblock: use numa_valid_node() helper to check for invalid node ID") introduce a new helper numa_valid_node(), which is not defined in memblock tests.
Let's add it in the corresponding header file.
Signed-off-by: Wei Yang richard.weiyang@gmail.com CC: Mike Rapoport (IBM) rppt@kernel.org Link: https://lore.kernel.org/r/20240624015432.31134-1-richard.weiyang@gmail.com Signed-off-by: Mike Rapoport rppt@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/include/linux/numa.h | 5 +++++ 1 file changed, 5 insertions(+)
--- a/tools/include/linux/numa.h +++ b/tools/include/linux/numa.h @@ -13,4 +13,9 @@
#define NUMA_NO_NODE (-1)
+static inline bool numa_valid_node(int nid) +{ + return nid >= 0 && nid < MAX_NUMNODES; +} + #endif /* _LINUX_NUMA_H */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Beulich jbeulich@suse.com
commit 3ac36aa7307363b7247ccb6f6a804e11496b2b36 upstream.
memblock_set_node() warns about using MAX_NUMNODES, see
e0eec24e2e19 ("memblock: make memblock_set_node() also warn about use of MAX_NUMNODES")
for details.
Reported-by: Narasimhan V Narasimhan.V@amd.com Signed-off-by: Jan Beulich jbeulich@suse.com Cc: stable@vger.kernel.org [bp: commit message] Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Reviewed-by: Mike Rapoport (IBM) rppt@kernel.org Tested-by: Paul E. McKenney paulmck@kernel.org Link: https://lore.kernel.org/r/20240603141005.23261-1-bp@kernel.org Link: https://lore.kernel.org/r/abadb736-a239-49e4-ab42-ace7acdd4278@suse.com Signed-off-by: Mike Rapoport (IBM) rppt@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/mm/numa.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/x86/mm/numa.c +++ b/arch/x86/mm/numa.c @@ -492,7 +492,7 @@ static void __init numa_clear_kernel_nod for_each_reserved_mem_region(mb_region) { int nid = memblock_get_region_node(mb_region);
- if (nid != MAX_NUMNODES) + if (nid != NUMA_NO_NODE) node_set(nid, reserved_nodemask); }
@@ -613,9 +613,9 @@ static int __init numa_init(int (*init_f nodes_clear(node_online_map); memset(&numa_meminfo, 0, sizeof(numa_meminfo)); WARN_ON(memblock_set_node(0, ULLONG_MAX, &memblock.memory, - MAX_NUMNODES)); + NUMA_NO_NODE)); WARN_ON(memblock_set_node(0, ULLONG_MAX, &memblock.reserved, - MAX_NUMNODES)); + NUMA_NO_NODE)); /* In case that parsing SRAT failed. */ WARN_ON(memblock_clear_hotplug(0, ULLONG_MAX)); numa_reset_distance();
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jean-Baptiste Maneyrol jean-baptiste.maneyrol@tdk.com
commit 65a60a590142c54a3f3be11ff162db2d5b0e1e06 upstream.
Currently suspending while sensors are one will result in timestamping continuing without gap at resume. It can work with monotonic clock but not with other clocks. Fix that by resetting timestamping.
Fixes: ec74ae9fd37c ("iio: imu: inv_icm42600: add accurate timestamping") Cc: stable@vger.kernel.org Signed-off-by: Jean-Baptiste Maneyrol jean-baptiste.maneyrol@tdk.com Link: https://patch.msgid.link/20241113-inv_icm42600-fix-timestamps-after-suspend-... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/imu/inv_icm42600/inv_icm42600_core.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c +++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c @@ -17,6 +17,7 @@ #include <linux/regmap.h>
#include <linux/iio/iio.h> +#include <linux/iio/common/inv_sensors_timestamp.h>
#include "inv_icm42600.h" #include "inv_icm42600_buffer.h" @@ -725,6 +726,8 @@ out_unlock: static int inv_icm42600_resume(struct device *dev) { struct inv_icm42600_state *st = dev_get_drvdata(dev); + struct inv_sensors_timestamp *gyro_ts = iio_priv(st->indio_gyro); + struct inv_sensors_timestamp *accel_ts = iio_priv(st->indio_accel); int ret;
mutex_lock(&st->lock); @@ -745,9 +748,12 @@ static int inv_icm42600_resume(struct de goto out_unlock;
/* restore FIFO data streaming */ - if (st->fifo.on) + if (st->fifo.on) { + inv_sensors_timestamp_reset(gyro_ts); + inv_sensors_timestamp_reset(accel_ts); ret = regmap_write(st->map, INV_ICM42600_REG_FIFO_CONFIG, INV_ICM42600_FIFO_CONFIG_STREAM); + }
out_unlock: mutex_unlock(&st->lock);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 13210fc63f353fe78584048079343413a3cdf819 ]
All these cases cause imbalance between BIND and UNBIND calls:
- Delete an interface from a flowtable with multiple interfaces
- Add a (device to a) flowtable with --check flag
- Delete a netns containing a flowtable
- In an interactive nft session, create a table with owner flag and flowtable inside, then quit.
Fix it by calling FLOW_BLOCK_UNBIND when unregistering hooks, then remove late FLOW_BLOCK_UNBIND call when destroying flowtable.
Fixes: ff4bf2f42a40 ("netfilter: nf_tables: add nft_unregister_flowtable_hook()") Reported-by: Phil Sutter phil@nwl.cc Tested-by: Phil Sutter phil@nwl.cc Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index a110aad45fe4..1d1e998acd67 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -8375,6 +8375,7 @@ static void nft_unregister_flowtable_hook(struct net *net, }
static void __nft_unregister_flowtable_net_hooks(struct net *net, + struct nft_flowtable *flowtable, struct list_head *hook_list, bool release_netdev) { @@ -8382,6 +8383,8 @@ static void __nft_unregister_flowtable_net_hooks(struct net *net,
list_for_each_entry_safe(hook, next, hook_list, list) { nf_unregister_net_hook(net, &hook->ops); + flowtable->data.type->setup(&flowtable->data, hook->ops.dev, + FLOW_BLOCK_UNBIND); if (release_netdev) { list_del(&hook->list); kfree_rcu(hook, rcu); @@ -8390,9 +8393,10 @@ static void __nft_unregister_flowtable_net_hooks(struct net *net, }
static void nft_unregister_flowtable_net_hooks(struct net *net, + struct nft_flowtable *flowtable, struct list_head *hook_list) { - __nft_unregister_flowtable_net_hooks(net, hook_list, false); + __nft_unregister_flowtable_net_hooks(net, flowtable, hook_list, false); }
static int nft_register_flowtable_net_hooks(struct net *net, @@ -9028,8 +9032,6 @@ static void nf_tables_flowtable_destroy(struct nft_flowtable *flowtable)
flowtable->data.type->free(&flowtable->data); list_for_each_entry_safe(hook, next, &flowtable->hook_list, list) { - flowtable->data.type->setup(&flowtable->data, hook->ops.dev, - FLOW_BLOCK_UNBIND); list_del_rcu(&hook->list); kfree_rcu(hook, rcu); } @@ -10399,6 +10401,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb) &nft_trans_flowtable_hooks(trans), trans->msg_type); nft_unregister_flowtable_net_hooks(net, + nft_trans_flowtable(trans), &nft_trans_flowtable_hooks(trans)); } else { list_del_rcu(&nft_trans_flowtable(trans)->list); @@ -10407,6 +10410,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb) NULL, trans->msg_type); nft_unregister_flowtable_net_hooks(net, + nft_trans_flowtable(trans), &nft_trans_flowtable(trans)->hook_list); } break; @@ -10659,11 +10663,13 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action) case NFT_MSG_NEWFLOWTABLE: if (nft_trans_flowtable_update(trans)) { nft_unregister_flowtable_net_hooks(net, + nft_trans_flowtable(trans), &nft_trans_flowtable_hooks(trans)); } else { nft_use_dec_restore(&trans->ctx.table->use); list_del_rcu(&nft_trans_flowtable(trans)->list); nft_unregister_flowtable_net_hooks(net, + nft_trans_flowtable(trans), &nft_trans_flowtable(trans)->hook_list); } break; @@ -11224,7 +11230,8 @@ static void __nft_release_hook(struct net *net, struct nft_table *table) list_for_each_entry(chain, &table->chains, list) __nf_tables_unregister_hook(net, table, chain, true); list_for_each_entry(flowtable, &table->flowtables, list) - __nft_unregister_flowtable_net_hooks(net, &flowtable->hook_list, + __nft_unregister_flowtable_net_hooks(net, flowtable, + &flowtable->hook_list, true); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit b541ba7d1f5a5b7b3e2e22dc9e40e18a7d6dbc13 ]
Use INT_MAX as maximum size for the conntrack hashtable. Otherwise, it is possible to hit WARN_ON_ONCE in __kvmalloc_node_noprof() when resizing hashtable because __GFP_NOWARN is unset. See:
0708a0afe291 ("mm: Consider __GFP_NOWARN flag for oversized kvmalloc() calls")
Note: hashtable resize is only possible from init_netns.
Fixes: 9cc1c73ad666 ("netfilter: conntrack: avoid integer overflow when resizing") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_conntrack_core.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index e4ae2a08da6a..34ad5975fbf3 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -2568,12 +2568,15 @@ void *nf_ct_alloc_hashtable(unsigned int *sizep, int nulls) struct hlist_nulls_head *hash; unsigned int nr_slots, i;
- if (*sizep > (UINT_MAX / sizeof(struct hlist_nulls_head))) + if (*sizep > (INT_MAX / sizeof(struct hlist_nulls_head))) return NULL;
BUILD_BUG_ON(sizeof(struct hlist_nulls_head) != sizeof(struct hlist_head)); nr_slots = *sizep = roundup(*sizep, PAGE_SIZE / sizeof(struct hlist_nulls_head));
+ if (nr_slots > (INT_MAX / sizeof(struct hlist_nulls_head))) + return NULL; + hash = kvcalloc(nr_slots, sizeof(struct hlist_nulls_head), GFP_KERNEL);
if (hash && nulls)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Toke Høiland-Jørgensen toke@redhat.com
[ Upstream commit 737d4d91d35b5f7fa5bb442651472277318b0bfd ]
Even though we fixed a logic error in the commit cited below, syzbot still managed to trigger an underflow of the per-host bulk flow counters, leading to an out of bounds memory access.
To avoid any such logic errors causing out of bounds memory accesses, this commit factors out all accesses to the per-host bulk flow counters to a series of helpers that perform bounds-checking before any increments and decrements. This also has the benefit of improving readability by moving the conditional checks for the flow mode into these helpers, instead of having them spread out throughout the code (which was the cause of the original logic error).
As part of this change, the flow quantum calculation is consolidated into a helper function, which means that the dithering applied to the ost load scaling is now applied both in the DRR rotation and when a sparse flow's quantum is first initiated. The only user-visible effect of this is that the maximum packet size that can be sent while a flow stays sparse will now vary with +/- one byte in some cases. This should not make a noticeable difference in practice, and thus it's not worth complicating the code to preserve the old behaviour.
Fixes: 546ea84d07e3 ("sched: sch_cake: fix bulk flow accounting logic for host fairness") Reported-by: syzbot+f63600d288bfb7057424@syzkaller.appspotmail.com Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Acked-by: Dave Taht dave.taht@gmail.com Link: https://patch.msgid.link/20250107120105.70685-1-toke@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_cake.c | 140 +++++++++++++++++++++++-------------------- 1 file changed, 75 insertions(+), 65 deletions(-)
diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c index a65fad45d556..09242578dac5 100644 --- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -644,6 +644,63 @@ static bool cake_ddst(int flow_mode) return (flow_mode & CAKE_FLOW_DUAL_DST) == CAKE_FLOW_DUAL_DST; }
+static void cake_dec_srchost_bulk_flow_count(struct cake_tin_data *q, + struct cake_flow *flow, + int flow_mode) +{ + if (likely(cake_dsrc(flow_mode) && + q->hosts[flow->srchost].srchost_bulk_flow_count)) + q->hosts[flow->srchost].srchost_bulk_flow_count--; +} + +static void cake_inc_srchost_bulk_flow_count(struct cake_tin_data *q, + struct cake_flow *flow, + int flow_mode) +{ + if (likely(cake_dsrc(flow_mode) && + q->hosts[flow->srchost].srchost_bulk_flow_count < CAKE_QUEUES)) + q->hosts[flow->srchost].srchost_bulk_flow_count++; +} + +static void cake_dec_dsthost_bulk_flow_count(struct cake_tin_data *q, + struct cake_flow *flow, + int flow_mode) +{ + if (likely(cake_ddst(flow_mode) && + q->hosts[flow->dsthost].dsthost_bulk_flow_count)) + q->hosts[flow->dsthost].dsthost_bulk_flow_count--; +} + +static void cake_inc_dsthost_bulk_flow_count(struct cake_tin_data *q, + struct cake_flow *flow, + int flow_mode) +{ + if (likely(cake_ddst(flow_mode) && + q->hosts[flow->dsthost].dsthost_bulk_flow_count < CAKE_QUEUES)) + q->hosts[flow->dsthost].dsthost_bulk_flow_count++; +} + +static u16 cake_get_flow_quantum(struct cake_tin_data *q, + struct cake_flow *flow, + int flow_mode) +{ + u16 host_load = 1; + + if (cake_dsrc(flow_mode)) + host_load = max(host_load, + q->hosts[flow->srchost].srchost_bulk_flow_count); + + if (cake_ddst(flow_mode)) + host_load = max(host_load, + q->hosts[flow->dsthost].dsthost_bulk_flow_count); + + /* The get_random_u16() is a way to apply dithering to avoid + * accumulating roundoff errors + */ + return (q->flow_quantum * quantum_div[host_load] + + get_random_u16()) >> 16; +} + static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb, int flow_mode, u16 flow_override, u16 host_override) { @@ -790,10 +847,8 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb, allocate_dst = cake_ddst(flow_mode);
if (q->flows[outer_hash + k].set == CAKE_SET_BULK) { - if (allocate_src) - q->hosts[q->flows[reduced_hash].srchost].srchost_bulk_flow_count--; - if (allocate_dst) - q->hosts[q->flows[reduced_hash].dsthost].dsthost_bulk_flow_count--; + cake_dec_srchost_bulk_flow_count(q, &q->flows[outer_hash + k], flow_mode); + cake_dec_dsthost_bulk_flow_count(q, &q->flows[outer_hash + k], flow_mode); } found: /* reserve queue for future packets in same flow */ @@ -818,9 +873,10 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb, q->hosts[outer_hash + k].srchost_tag = srchost_hash; found_src: srchost_idx = outer_hash + k; - if (q->flows[reduced_hash].set == CAKE_SET_BULK) - q->hosts[srchost_idx].srchost_bulk_flow_count++; q->flows[reduced_hash].srchost = srchost_idx; + + if (q->flows[reduced_hash].set == CAKE_SET_BULK) + cake_inc_srchost_bulk_flow_count(q, &q->flows[reduced_hash], flow_mode); }
if (allocate_dst) { @@ -841,9 +897,10 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb, q->hosts[outer_hash + k].dsthost_tag = dsthost_hash; found_dst: dsthost_idx = outer_hash + k; - if (q->flows[reduced_hash].set == CAKE_SET_BULK) - q->hosts[dsthost_idx].dsthost_bulk_flow_count++; q->flows[reduced_hash].dsthost = dsthost_idx; + + if (q->flows[reduced_hash].set == CAKE_SET_BULK) + cake_inc_dsthost_bulk_flow_count(q, &q->flows[reduced_hash], flow_mode); } }
@@ -1856,10 +1913,6 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
/* flowchain */ if (!flow->set || flow->set == CAKE_SET_DECAYING) { - struct cake_host *srchost = &b->hosts[flow->srchost]; - struct cake_host *dsthost = &b->hosts[flow->dsthost]; - u16 host_load = 1; - if (!flow->set) { list_add_tail(&flow->flowchain, &b->new_flows); } else { @@ -1869,18 +1922,8 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch, flow->set = CAKE_SET_SPARSE; b->sparse_flow_count++;
- if (cake_dsrc(q->flow_mode)) - host_load = max(host_load, srchost->srchost_bulk_flow_count); - - if (cake_ddst(q->flow_mode)) - host_load = max(host_load, dsthost->dsthost_bulk_flow_count); - - flow->deficit = (b->flow_quantum * - quantum_div[host_load]) >> 16; + flow->deficit = cake_get_flow_quantum(b, flow, q->flow_mode); } else if (flow->set == CAKE_SET_SPARSE_WAIT) { - struct cake_host *srchost = &b->hosts[flow->srchost]; - struct cake_host *dsthost = &b->hosts[flow->dsthost]; - /* this flow was empty, accounted as a sparse flow, but actually * in the bulk rotation. */ @@ -1888,12 +1931,8 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch, b->sparse_flow_count--; b->bulk_flow_count++;
- if (cake_dsrc(q->flow_mode)) - srchost->srchost_bulk_flow_count++; - - if (cake_ddst(q->flow_mode)) - dsthost->dsthost_bulk_flow_count++; - + cake_inc_srchost_bulk_flow_count(b, flow, q->flow_mode); + cake_inc_dsthost_bulk_flow_count(b, flow, q->flow_mode); }
if (q->buffer_used > q->buffer_max_used) @@ -1950,13 +1989,11 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch) { struct cake_sched_data *q = qdisc_priv(sch); struct cake_tin_data *b = &q->tins[q->cur_tin]; - struct cake_host *srchost, *dsthost; ktime_t now = ktime_get(); struct cake_flow *flow; struct list_head *head; bool first_flow = true; struct sk_buff *skb; - u16 host_load; u64 delay; u32 len;
@@ -2056,11 +2093,6 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch) q->cur_flow = flow - b->flows; first_flow = false;
- /* triple isolation (modified DRR++) */ - srchost = &b->hosts[flow->srchost]; - dsthost = &b->hosts[flow->dsthost]; - host_load = 1; - /* flow isolation (DRR++) */ if (flow->deficit <= 0) { /* Keep all flows with deficits out of the sparse and decaying @@ -2072,11 +2104,8 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch) b->sparse_flow_count--; b->bulk_flow_count++;
- if (cake_dsrc(q->flow_mode)) - srchost->srchost_bulk_flow_count++; - - if (cake_ddst(q->flow_mode)) - dsthost->dsthost_bulk_flow_count++; + cake_inc_srchost_bulk_flow_count(b, flow, q->flow_mode); + cake_inc_dsthost_bulk_flow_count(b, flow, q->flow_mode);
flow->set = CAKE_SET_BULK; } else { @@ -2088,19 +2117,7 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch) } }
- if (cake_dsrc(q->flow_mode)) - host_load = max(host_load, srchost->srchost_bulk_flow_count); - - if (cake_ddst(q->flow_mode)) - host_load = max(host_load, dsthost->dsthost_bulk_flow_count); - - WARN_ON(host_load > CAKE_QUEUES); - - /* The get_random_u16() is a way to apply dithering to avoid - * accumulating roundoff errors - */ - flow->deficit += (b->flow_quantum * quantum_div[host_load] + - get_random_u16()) >> 16; + flow->deficit += cake_get_flow_quantum(b, flow, q->flow_mode); list_move_tail(&flow->flowchain, &b->old_flows);
goto retry; @@ -2124,11 +2141,8 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch) if (flow->set == CAKE_SET_BULK) { b->bulk_flow_count--;
- if (cake_dsrc(q->flow_mode)) - srchost->srchost_bulk_flow_count--; - - if (cake_ddst(q->flow_mode)) - dsthost->dsthost_bulk_flow_count--; + cake_dec_srchost_bulk_flow_count(b, flow, q->flow_mode); + cake_dec_dsthost_bulk_flow_count(b, flow, q->flow_mode);
b->decaying_flow_count++; } else if (flow->set == CAKE_SET_SPARSE || @@ -2146,12 +2160,8 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch) else if (flow->set == CAKE_SET_BULK) { b->bulk_flow_count--;
- if (cake_dsrc(q->flow_mode)) - srchost->srchost_bulk_flow_count--; - - if (cake_ddst(q->flow_mode)) - dsthost->dsthost_bulk_flow_count--; - + cake_dec_srchost_bulk_flow_count(b, flow, q->flow_mode); + cake_dec_dsthost_bulk_flow_count(b, flow, q->flow_mode); } else b->decaying_flow_count--;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Parker Newman pnewman@connecttech.com
[ Upstream commit 426046e2d62dd19533808661e912b8e8a9eaec16 ]
Nvidia's Tegra MGBE controllers require the IOMMU "Stream ID" (SID) to be written to the MGBE_WRAP_AXI_ASID0_CTRL register.
The current driver is hard coded to use MGBE0's SID for all controllers. This causes softirq time outs and kernel panics when using controllers other than MGBE0.
Example dmesg errors when an ethernet cable is connected to MGBE1:
[ 116.133290] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx [ 121.851283] tegra-mgbe 6910000.ethernet eth1: NETDEV WATCHDOG: CPU: 5: transmit queue 0 timed out 5690 ms [ 121.851782] tegra-mgbe 6910000.ethernet eth1: Reset adapter. [ 121.892464] tegra-mgbe 6910000.ethernet eth1: Register MEM_TYPE_PAGE_POOL RxQ-0 [ 121.905920] tegra-mgbe 6910000.ethernet eth1: PHY [stmmac-1:00] driver [Aquantia AQR113] (irq=171) [ 121.907356] tegra-mgbe 6910000.ethernet eth1: Enabling Safety Features [ 121.907578] tegra-mgbe 6910000.ethernet eth1: IEEE 1588-2008 Advanced Timestamp supported [ 121.908399] tegra-mgbe 6910000.ethernet eth1: registered PTP clock [ 121.908582] tegra-mgbe 6910000.ethernet eth1: configuring for phy/10gbase-r link mode [ 125.961292] tegra-mgbe 6910000.ethernet eth1: Link is Up - 1Gbps/Full - flow control rx/tx [ 181.921198] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 181.921404] rcu: 7-....: (1 GPs behind) idle=540c/1/0x4000000000000002 softirq=1748/1749 fqs=2337 [ 181.921684] rcu: (detected by 4, t=6002 jiffies, g=1357, q=1254 ncpus=8) [ 181.921878] Sending NMI from CPU 4 to CPUs 7: [ 181.921886] NMI backtrace for cpu 7 [ 181.922131] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6 [ 181.922390] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024 [ 181.922658] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 181.922847] pc : handle_softirqs+0x98/0x368 [ 181.922978] lr : __do_softirq+0x18/0x20 [ 181.923095] sp : ffff80008003bf50 [ 181.923189] x29: ffff80008003bf50 x28: 0000000000000008 x27: 0000000000000000 [ 181.923379] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0 [ 181.924486] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70 [ 181.925568] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000 [ 181.926655] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000 [ 181.931455] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d [ 181.938628] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160 [ 181.945804] x8 : ffff8000827b3160 x7 : f9157b241586f343 x6 : eeb6502a01c81c74 [ 181.953068] x5 : a4acfcdd2e8096bb x4 : ffffce78ea277340 x3 : 00000000ffffd1e1 [ 181.960329] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000 [ 181.967591] Call trace: [ 181.970043] handle_softirqs+0x98/0x368 (P) [ 181.974240] __do_softirq+0x18/0x20 [ 181.977743] ____do_softirq+0x14/0x28 [ 181.981415] call_on_irq_stack+0x24/0x30 [ 181.985180] do_softirq_own_stack+0x20/0x30 [ 181.989379] __irq_exit_rcu+0x114/0x140 [ 181.993142] irq_exit_rcu+0x14/0x28 [ 181.996816] el1_interrupt+0x44/0xb8 [ 182.000316] el1h_64_irq_handler+0x14/0x20 [ 182.004343] el1h_64_irq+0x80/0x88 [ 182.007755] cpuidle_enter_state+0xc4/0x4a8 (P) [ 182.012305] cpuidle_enter+0x3c/0x58 [ 182.015980] cpuidle_idle_call+0x128/0x1c0 [ 182.020005] do_idle+0xe0/0xf0 [ 182.023155] cpu_startup_entry+0x3c/0x48 [ 182.026917] secondary_start_kernel+0xdc/0x120 [ 182.031379] __secondary_switched+0x74/0x78 [ 212.971162] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 7-.... } 6103 jiffies s: 417 root: 0x80/. [ 212.985935] rcu: blocking rcu_node structures (internal RCU debug): [ 212.992758] Sending NMI from CPU 0 to CPUs 7: [ 212.998539] NMI backtrace for cpu 7 [ 213.004304] CPU: 7 UID: 0 PID: 0 Comm: swapper/7 Kdump: loaded Not tainted 6.13.0-rc3+ #6 [ 213.016116] Hardware name: NVIDIA CTI Forge + Orin AGX/Jetson, BIOS 202402.1-Unknown 10/28/2024 [ 213.030817] pstate: 40400009 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 213.040528] pc : handle_softirqs+0x98/0x368 [ 213.046563] lr : __do_softirq+0x18/0x20 [ 213.051293] sp : ffff80008003bf50 [ 213.055839] x29: ffff80008003bf50 x28: 0000000000000008 x27: 0000000000000000 [ 213.067304] x26: ffffce78ea277000 x25: 0000000000000000 x24: 0000001c61befda0 [ 213.077014] x23: 0000000060400009 x22: ffffce78e99918bc x21: ffff80008018bd70 [ 213.087339] x20: ffffce78e8bb00d8 x19: ffff80008018bc20 x18: 0000000000000000 [ 213.097313] x17: ffff318ebe7d3000 x16: ffff800080038000 x15: 0000000000000000 [ 213.107201] x14: ffff000080816680 x13: ffff318ebe7d3000 x12: 000000003464d91d [ 213.116651] x11: 0000000000000040 x10: ffff000080165a70 x9 : ffffce78e8bb0160 [ 213.127500] x8 : ffff8000827b3160 x7 : 0a37b344852820af x6 : 3f049caedd1ff608 [ 213.138002] x5 : cff7cfdbfaf31291 x4 : ffffce78ea277340 x3 : 00000000ffffde04 [ 213.150428] x2 : 0000000000000101 x1 : ffffce78ea277340 x0 : ffff318ebe7d3000 [ 213.162063] Call trace: [ 213.165494] handle_softirqs+0x98/0x368 (P) [ 213.171256] __do_softirq+0x18/0x20 [ 213.177291] ____do_softirq+0x14/0x28 [ 213.182017] call_on_irq_stack+0x24/0x30 [ 213.186565] do_softirq_own_stack+0x20/0x30 [ 213.191815] __irq_exit_rcu+0x114/0x140 [ 213.196891] irq_exit_rcu+0x14/0x28 [ 213.202401] el1_interrupt+0x44/0xb8 [ 213.207741] el1h_64_irq_handler+0x14/0x20 [ 213.213519] el1h_64_irq+0x80/0x88 [ 213.217541] cpuidle_enter_state+0xc4/0x4a8 (P) [ 213.224364] cpuidle_enter+0x3c/0x58 [ 213.228653] cpuidle_idle_call+0x128/0x1c0 [ 213.233993] do_idle+0xe0/0xf0 [ 213.237928] cpu_startup_entry+0x3c/0x48 [ 213.243791] secondary_start_kernel+0xdc/0x120 [ 213.249830] __secondary_switched+0x74/0x78
This bug has existed since the dwmac-tegra driver was added in Dec 2022 (See Fixes tag below for commit hash).
The Tegra234 SOC has 4 MGBE controllers, however Nvidia's Developer Kit only uses MGBE0 which is why the bug was not found previously. Connect Tech has many products that use 2 (or more) MGBE controllers.
The solution is to read the controller's SID from the existing "iommus" device tree property. The 2nd field of the "iommus" device tree property is the controller's SID.
Device tree snippet from tegra234.dtsi showing MGBE1's "iommus" property:
smmu_niso0: iommu@12000000 { compatible = "nvidia,tegra234-smmu", "nvidia,smmu-500"; ... }
/* MGBE1 */ ethernet@6900000 { compatible = "nvidia,tegra234-mgbe"; ... iommus = <&smmu_niso0 TEGRA234_SID_MGBE_VF1>; ... }
Nvidia's arm-smmu driver reads the "iommus" property and stores the SID in the MGBE device's "fwspec" struct. The dwmac-tegra driver can access the SID using the tegra_dev_iommu_get_stream_id() helper function found in linux/iommu.h.
Calling tegra_dev_iommu_get_stream_id() should not fail unless the "iommus" property is removed from the device tree or the IOMMU is disabled.
While the Tegra234 SOC technically supports bypassing the IOMMU, it is not supported by the current firmware, has not been tested and not recommended. More detailed discussion with Thierry Reding from Nvidia linked below.
Fixes: d8ca113724e7 ("net: stmmac: tegra: Add MGBE support") Link: https://lore.kernel.org/netdev/cover.1731685185.git.pnewman@connecttech.com Signed-off-by: Parker Newman pnewman@connecttech.com Reviewed-by: Andrew Lunn andrew@lunn.ch Acked-by: Thierry Reding treding@nvidia.com Link: https://patch.msgid.link/6fb97f32cf4accb4f7cf92846f6b60064ba0a3bd.1736284360... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c index e2d61a3a7712..760405b805f4 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-tegra.c @@ -1,4 +1,5 @@ // SPDX-License-Identifier: GPL-2.0-only +#include <linux/iommu.h> #include <linux/platform_device.h> #include <linux/of.h> #include <linux/module.h> @@ -19,6 +20,8 @@ struct tegra_mgbe { struct reset_control *rst_mac; struct reset_control *rst_pcs;
+ u32 iommu_sid; + void __iomem *hv; void __iomem *regs; void __iomem *xpcs; @@ -50,7 +53,6 @@ struct tegra_mgbe { #define MGBE_WRAP_COMMON_INTR_ENABLE 0x8704 #define MAC_SBD_INTR BIT(2) #define MGBE_WRAP_AXI_ASID0_CTRL 0x8400 -#define MGBE_SID 0x6
static int __maybe_unused tegra_mgbe_suspend(struct device *dev) { @@ -84,7 +86,7 @@ static int __maybe_unused tegra_mgbe_resume(struct device *dev) writel(MAC_SBD_INTR, mgbe->regs + MGBE_WRAP_COMMON_INTR_ENABLE);
/* Program SID */ - writel(MGBE_SID, mgbe->hv + MGBE_WRAP_AXI_ASID0_CTRL); + writel(mgbe->iommu_sid, mgbe->hv + MGBE_WRAP_AXI_ASID0_CTRL);
value = readl(mgbe->xpcs + XPCS_WRAP_UPHY_STATUS); if ((value & XPCS_WRAP_UPHY_STATUS_TX_P_UP) == 0) { @@ -241,6 +243,12 @@ static int tegra_mgbe_probe(struct platform_device *pdev) if (IS_ERR(mgbe->xpcs)) return PTR_ERR(mgbe->xpcs);
+ /* get controller's stream id from iommu property in device tree */ + if (!tegra_dev_iommu_get_stream_id(mgbe->dev, &mgbe->iommu_sid)) { + dev_err(mgbe->dev, "failed to get iommu stream id\n"); + return -EINVAL; + } + res.addr = mgbe->regs; res.irq = irq;
@@ -346,7 +354,7 @@ static int tegra_mgbe_probe(struct platform_device *pdev) writel(MAC_SBD_INTR, mgbe->regs + MGBE_WRAP_COMMON_INTR_ENABLE);
/* Program SID */ - writel(MGBE_SID, mgbe->hv + MGBE_WRAP_AXI_ASID0_CTRL); + writel(mgbe->iommu_sid, mgbe->hv + MGBE_WRAP_AXI_ASID0_CTRL);
plat->flags |= STMMAC_FLAG_SERDES_UP_AFTER_PHY_LINKUP;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chenguang Zhao zhaochenguang@kylinos.cn
[ Upstream commit 0e2909c6bec9048f49d0c8e16887c63b50b14647 ]
When cmd_alloc_index(), fails cmd_work_handler() needs to complete ent->slotted before returning early. Otherwise the task which issued the command may hang:
mlx5_core 0000:01:00.0: cmd_work_handler:877:(pid 3880418): failed to allocate command entry INFO: task kworker/13:2:4055883 blocked for more than 120 seconds. Not tainted 4.19.90-25.44.v2101.ky10.aarch64 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/13:2 D 0 4055883 2 0x00000228 Workqueue: events mlx5e_tx_dim_work [mlx5_core] Call trace: __switch_to+0xe8/0x150 __schedule+0x2a8/0x9b8 schedule+0x2c/0x88 schedule_timeout+0x204/0x478 wait_for_common+0x154/0x250 wait_for_completion+0x28/0x38 cmd_exec+0x7a0/0xa00 [mlx5_core] mlx5_cmd_exec+0x54/0x80 [mlx5_core] mlx5_core_modify_cq+0x6c/0x80 [mlx5_core] mlx5_core_modify_cq_moderation+0xa0/0xb8 [mlx5_core] mlx5e_tx_dim_work+0x54/0x68 [mlx5_core] process_one_work+0x1b0/0x448 worker_thread+0x54/0x468 kthread+0x134/0x138 ret_from_fork+0x10/0x18
Fixes: 485d65e13571 ("net/mlx5: Add a timeout to acquire the command queue semaphore") Signed-off-by: Chenguang Zhao zhaochenguang@kylinos.cn Reviewed-by: Moshe Shemesh moshe@nvidia.com Acked-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/20250108030009.68520-1-zhaochenguang@kylinos.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c index 80af0fc7101f..3e6bd27f6315 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -1006,6 +1006,7 @@ static void cmd_work_handler(struct work_struct *work) complete(&ent->done); } up(&cmd->vars.sem); + complete(&ent->slotted); return; } } else {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guoqing Jiang guoqing.jiang@canonical.com
[ Upstream commit 36684e9d88a2e2401ae26715a2e217cb4295cea7 ]
The pointer need to be set to NULL, otherwise KASAN complains about use-after-free. Because in mtk_drm_bind, all private's drm are set as follows.
private->all_drm_private[i]->drm = drm;
And drm will be released by drm_dev_put in case mtk_drm_kms_init returns failure. However, the shutdown path still accesses the previous allocated memory in drm_atomic_helper_shutdown.
[ 84.874820] watchdog: watchdog0: watchdog did not stop! [ 86.512054] ================================================================== [ 86.513162] BUG: KASAN: use-after-free in drm_atomic_helper_shutdown+0x33c/0x378 [ 86.514258] Read of size 8 at addr ffff0000d46fc068 by task shutdown/1 [ 86.515213] [ 86.515455] CPU: 1 UID: 0 PID: 1 Comm: shutdown Not tainted 6.13.0-rc1-mtk+gfa1a78e5d24b-dirty #55 [ 86.516752] Hardware name: Unknown Product/Unknown Product, BIOS 2022.10 10/01/2022 [ 86.517960] Call trace: [ 86.518333] show_stack+0x20/0x38 (C) [ 86.518891] dump_stack_lvl+0x90/0xd0 [ 86.519443] print_report+0xf8/0x5b0 [ 86.519985] kasan_report+0xb4/0x100 [ 86.520526] __asan_report_load8_noabort+0x20/0x30 [ 86.521240] drm_atomic_helper_shutdown+0x33c/0x378 [ 86.521966] mtk_drm_shutdown+0x54/0x80 [ 86.522546] platform_shutdown+0x64/0x90 [ 86.523137] device_shutdown+0x260/0x5b8 [ 86.523728] kernel_restart+0x78/0xf0 [ 86.524282] __do_sys_reboot+0x258/0x2f0 [ 86.524871] __arm64_sys_reboot+0x90/0xd8 [ 86.525473] invoke_syscall+0x74/0x268 [ 86.526041] el0_svc_common.constprop.0+0xb0/0x240 [ 86.526751] do_el0_svc+0x4c/0x70 [ 86.527251] el0_svc+0x4c/0xc0 [ 86.527719] el0t_64_sync_handler+0x144/0x168 [ 86.528367] el0t_64_sync+0x198/0x1a0 [ 86.528920] [ 86.529157] The buggy address belongs to the physical page: [ 86.529972] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff0000d46fd4d0 pfn:0x1146fc [ 86.531319] flags: 0xbfffc0000000000(node=0|zone=2|lastcpupid=0xffff) [ 86.532267] raw: 0bfffc0000000000 0000000000000000 dead000000000122 0000000000000000 [ 86.533390] raw: ffff0000d46fd4d0 0000000000000000 00000000ffffffff 0000000000000000 [ 86.534511] page dumped because: kasan: bad access detected [ 86.535323] [ 86.535559] Memory state around the buggy address: [ 86.536265] ffff0000d46fbf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.537314] ffff0000d46fbf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.538363] >ffff0000d46fc000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.544733] ^ [ 86.551057] ffff0000d46fc080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.557510] ffff0000d46fc100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 86.563928] ================================================================== [ 86.571093] Disabling lock debugging due to kernel taint [ 86.577642] Unable to handle kernel paging request at virtual address e0e9c0920000000b [ 86.581834] KASAN: maybe wild-memory-access in range [0x0752049000000058-0x075204900000005f] ...
Fixes: 1ef7ed48356c ("drm/mediatek: Modify mediatek-drm for mt8195 multi mmsys support") Signed-off-by: Guoqing Jiang guoqing.jiang@canonical.com Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Link: https://patchwork.kernel.org/project/dri-devel/patch/20241223023227.1258112-... Signed-off-by: Chun-Kuang Hu chunkuang.hu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/mediatek/mtk_drm_drv.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c index 600f4ccc90d3..8b41a07c3641 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c @@ -633,6 +633,8 @@ static int mtk_drm_bind(struct device *dev) err_free: private->drm = NULL; drm_dev_put(drm); + for (i = 0; i < private->data->mmsys_dev_num; i++) + private->all_drm_private[i]->drm = NULL; return ret; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit 924d66011f2401a4145e2e814842c5c4572e439f ]
The PHY portion of the mediatek hdmi driver was originally part of the driver it self and later split out into drivers/phy, which a 'select' to keep the prior behavior.
However, this leads to build failures when the PHY driver cannot be built:
WARNING: unmet direct dependencies detected for PHY_MTK_HDMI Depends on [n]: (ARCH_MEDIATEK || COMPILE_TEST [=y]) && COMMON_CLK [=y] && OF [=y] && REGULATOR [=n] Selected by [m]: - DRM_MEDIATEK_HDMI [=m] && HAS_IOMEM [=y] && DRM [=m] && DRM_MEDIATEK [=m] ERROR: modpost: "devm_regulator_register" [drivers/phy/mediatek/phy-mtk-hdmi-drv.ko] undefined! ERROR: modpost: "rdev_get_drvdata" [drivers/phy/mediatek/phy-mtk-hdmi-drv.ko] undefined!
The best option here is to just not select the phy driver and leave that up to the defconfig. Do the same for the other PHY and memory drivers selected here as well for consistency.
Fixes: a481bf2f0ca4 ("drm/mediatek: Separate mtk_hdmi_phy to an independent module") Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Reviewed-by: CK Hu ck.hu@mediatek.com Link: https://patchwork.kernel.org/project/dri-devel/patch/20241218085837.2670434-... Signed-off-by: Chun-Kuang Hu chunkuang.hu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/mediatek/Kconfig | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/drivers/gpu/drm/mediatek/Kconfig b/drivers/gpu/drm/mediatek/Kconfig index 76cab28e010c..d2652751d190 100644 --- a/drivers/gpu/drm/mediatek/Kconfig +++ b/drivers/gpu/drm/mediatek/Kconfig @@ -10,9 +10,6 @@ config DRM_MEDIATEK select DRM_KMS_HELPER select DRM_MIPI_DSI select DRM_PANEL - select MEMORY - select MTK_SMI - select PHY_MTK_MIPI_DSI select VIDEOMODE_HELPERS help Choose this option if you have a Mediatek SoCs. @@ -23,7 +20,6 @@ config DRM_MEDIATEK config DRM_MEDIATEK_DP tristate "DRM DPTX Support for MediaTek SoCs" depends on DRM_MEDIATEK - select PHY_MTK_DP select DRM_DISPLAY_HELPER select DRM_DISPLAY_DP_HELPER select DRM_DP_AUX_BUS @@ -34,6 +30,5 @@ config DRM_MEDIATEK_HDMI tristate "DRM HDMI Support for Mediatek SoCs" depends on DRM_MEDIATEK select SND_SOC_HDMI_CODEC if SND_SOC - select PHY_MTK_HDMI help DRM/KMS HDMI driver for Mediatek SoCs
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liankun Yang liankun.yang@mediatek.com
[ Upstream commit ef24fbd8f12015ff827973fffefed3902ffd61cc ]
Setting up misc0 for Pixel Encoding Format.
According to the definition of YCbCr in spec 1.2a Table 2-96, 0x1 << 1 should be written to the register.
Use switch case to distinguish RGB, YCbCr422, and unsupported color formats.
Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver") Signed-off-by: Liankun Yang liankun.yang@mediatek.com Link: https://patchwork.kernel.org/project/dri-devel/patch/20241025083036.8829-2-l... Signed-off-by: Chun-Kuang Hu chunkuang.hu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/mediatek/mtk_dp.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/mediatek/mtk_dp.c b/drivers/gpu/drm/mediatek/mtk_dp.c index 48a4defbc66c..399423cb618c 100644 --- a/drivers/gpu/drm/mediatek/mtk_dp.c +++ b/drivers/gpu/drm/mediatek/mtk_dp.c @@ -458,18 +458,16 @@ static int mtk_dp_set_color_format(struct mtk_dp *mtk_dp, enum dp_pixelformat color_format) { u32 val; - - /* update MISC0 */ - mtk_dp_update_bits(mtk_dp, MTK_DP_ENC0_P0_3034, - color_format << DP_TEST_COLOR_FORMAT_SHIFT, - DP_TEST_COLOR_FORMAT_MASK); + u32 misc0_color;
switch (color_format) { case DP_PIXELFORMAT_YUV422: val = PIXEL_ENCODE_FORMAT_DP_ENC0_P0_YCBCR422; + misc0_color = DP_COLOR_FORMAT_YCbCr422; break; case DP_PIXELFORMAT_RGB: val = PIXEL_ENCODE_FORMAT_DP_ENC0_P0_RGB; + misc0_color = DP_COLOR_FORMAT_RGB; break; default: drm_warn(mtk_dp->drm_dev, "Unsupported color format: %d\n", @@ -477,6 +475,11 @@ static int mtk_dp_set_color_format(struct mtk_dp *mtk_dp, return -EINVAL; }
+ /* update MISC0 */ + mtk_dp_update_bits(mtk_dp, MTK_DP_ENC0_P0_3034, + misc0_color, + DP_TEST_COLOR_FORMAT_MASK); + mtk_dp_update_bits(mtk_dp, MTK_DP_ENC0_P0_303C, val, PIXEL_ENCODE_FORMAT_DP_ENC0_P0_MASK); return 0;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liankun Yang liankun.yang@mediatek.com
[ Upstream commit 0d68b55887cedc7487036ed34cb4c2097c4228f1 ]
Fix dp mode valid issue to avoid abnormal display of limit state.
After DP passes link training, it can express the lane count of the current link status is good. Calculate the maximum bandwidth supported by DP using the current lane count.
The color format will select the best one based on the bandwidth requirements of the current timing mode. If the current timing mode uses RGB and meets the DP link bandwidth requirements, RGB will be used.
If the timing mode uses RGB but does not meet the DP link bandwidthi requirements, it will continue to check whether YUV422 meets the DP link bandwidth.
FEC overhead is approximately 2.4% from DP 1.4a spec 2.2.1.4.2. The down-spread amplitude shall either be disabled (0.0%) or up to 0.5% from 1.4a 3.5.2.6. Add up to approximately 3% total overhead.
Because rate is already divided by 10, mode->clock does not need to be multiplied by 10.
Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver") Signed-off-by: Liankun Yang liankun.yang@mediatek.com Link: https://patchwork.kernel.org/project/dri-devel/patch/20241025083036.8829-3-l... Signed-off-by: Chun-Kuang Hu chunkuang.hu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/mediatek/mtk_dp.c | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/mediatek/mtk_dp.c b/drivers/gpu/drm/mediatek/mtk_dp.c index 399423cb618c..3e1b400febde 100644 --- a/drivers/gpu/drm/mediatek/mtk_dp.c +++ b/drivers/gpu/drm/mediatek/mtk_dp.c @@ -2313,12 +2313,19 @@ mtk_dp_bridge_mode_valid(struct drm_bridge *bridge, { struct mtk_dp *mtk_dp = mtk_dp_from_bridge(bridge); u32 bpp = info->color_formats & DRM_COLOR_FORMAT_YCBCR422 ? 16 : 24; - u32 rate = min_t(u32, drm_dp_max_link_rate(mtk_dp->rx_cap) * - drm_dp_max_lane_count(mtk_dp->rx_cap), - drm_dp_bw_code_to_link_rate(mtk_dp->max_linkrate) * - mtk_dp->max_lanes); + u32 lane_count_min = mtk_dp->train_info.lane_count; + u32 rate = drm_dp_bw_code_to_link_rate(mtk_dp->train_info.link_rate) * + lane_count_min;
- if (rate < mode->clock * bpp / 8) + /* + *FEC overhead is approximately 2.4% from DP 1.4a spec 2.2.1.4.2. + *The down-spread amplitude shall either be disabled (0.0%) or up + *to 0.5% from 1.4a 3.5.2.6. Add up to approximately 3% total overhead. + * + *Because rate is already divided by 10, + *mode->clock does not need to be multiplied by 10 + */ + if ((rate * 97 / 100) < (mode->clock * bpp / 8)) return MODE_CLOCK_HIGH;
return MODE_OK; @@ -2359,10 +2366,9 @@ static u32 *mtk_dp_bridge_atomic_get_input_bus_fmts(struct drm_bridge *bridge, struct drm_display_mode *mode = &crtc_state->adjusted_mode; struct drm_display_info *display_info = &conn_state->connector->display_info; - u32 rate = min_t(u32, drm_dp_max_link_rate(mtk_dp->rx_cap) * - drm_dp_max_lane_count(mtk_dp->rx_cap), - drm_dp_bw_code_to_link_rate(mtk_dp->max_linkrate) * - mtk_dp->max_lanes); + u32 lane_count_min = mtk_dp->train_info.lane_count; + u32 rate = drm_dp_bw_code_to_link_rate(mtk_dp->train_info.link_rate) * + lane_count_min;
*num_input_fmts = 0;
@@ -2371,8 +2377,8 @@ static u32 *mtk_dp_bridge_atomic_get_input_bus_fmts(struct drm_bridge *bridge, * datarate of YUV422 and sink device supports YUV422, we output YUV422 * format. Use this condition, we can support more resolution. */ - if ((rate < (mode->clock * 24 / 8)) && - (rate > (mode->clock * 16 / 8)) && + if (((rate * 97 / 100) < (mode->clock * 24 / 8)) && + ((rate * 97 / 100) > (mode->clock * 16 / 8)) && (display_info->color_formats & DRM_COLOR_FORMAT_YCBCR422)) { input_fmts = kcalloc(1, sizeof(*input_fmts), GFP_KERNEL); if (!input_fmts)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liankun Yang liankun.yang@mediatek.com
[ Upstream commit 522908140645865dc3e2fac70fd3b28834dfa7be ]
Check the return value of drm_dp_dpcd_readb() to confirm that AUX communication is successful. To simplify the code, replace drm_dp_dpcd_readb() and DP_GET_SINK_COUNT() with drm_dp_read_sink_count().
Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver") Signed-off-by: Liankun Yang liankun.yang@mediatek.com Reviewed-by: Guillaume Ranquet granquet@baylibre.com Link: https://patchwork.kernel.org/project/dri-devel/patch/20241218113448.2992-1-l... Signed-off-by: Chun-Kuang Hu chunkuang.hu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/mediatek/mtk_dp.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/mediatek/mtk_dp.c b/drivers/gpu/drm/mediatek/mtk_dp.c index 3e1b400febde..be4de26c77f9 100644 --- a/drivers/gpu/drm/mediatek/mtk_dp.c +++ b/drivers/gpu/drm/mediatek/mtk_dp.c @@ -2005,7 +2005,6 @@ static enum drm_connector_status mtk_dp_bdg_detect(struct drm_bridge *bridge) struct mtk_dp *mtk_dp = mtk_dp_from_bridge(bridge); enum drm_connector_status ret = connector_status_disconnected; bool enabled = mtk_dp->enabled; - u8 sink_count = 0;
if (!mtk_dp->train_info.cable_plugged_in) return ret; @@ -2020,8 +2019,8 @@ static enum drm_connector_status mtk_dp_bdg_detect(struct drm_bridge *bridge) * function, we just need to check the HPD connection to check * whether we connect to a sink device. */ - drm_dp_dpcd_readb(&mtk_dp->aux, DP_SINK_COUNT, &sink_count); - if (DP_GET_SINK_COUNT(sink_count)) + + if (drm_dp_read_sink_count(&mtk_dp->aux) > 0) ret = connector_status_connected;
if (!enabled)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wentao Liang liangwentao@iscas.ac.cn
[ Upstream commit 4c16e1cadcbcaf3c82d5fc310fbd34d0f5d0db7c ]
In the smb2_send_interim_resp(), if ksmbd_alloc_work_struct() fails to allocate a node, it returns a NULL pointer to the in_work pointer. This can lead to an illegal memory write of in_work->response_buf when allocate_interim_rsp_buf() attempts to perform a kzalloc() on it.
To address this issue, incorporating a check for the return value of ksmbd_alloc_work_struct() ensures that the function returns immediately upon allocation failure, thereby preventing the aforementioned illegal memory access.
Fixes: 041bba4414cd ("ksmbd: fix wrong interim response on compound") Signed-off-by: Wentao Liang liangwentao@iscas.ac.cn Acked-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/server/smb2pdu.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index 2884ebdc0eda..94b4a2565b2a 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -695,6 +695,9 @@ void smb2_send_interim_resp(struct ksmbd_work *work, __le32 status) struct smb2_hdr *rsp_hdr; struct ksmbd_work *in_work = ksmbd_alloc_work_struct();
+ if (!in_work) + return; + if (allocate_interim_rsp_buf(in_work)) { pr_err("smb_allocate_rsp_buf failed!\n"); ksmbd_free_work_struct(in_work);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Howells dhowells@redhat.com
[ Upstream commit 8fd56ad6e7c90ac2bddb0741c6b248c8c5d56ac8 ]
The kafs filesystem limits the maximum length of a cell to 256 bytes, but a problem occurs if someone actually does that: kafs tries to create a directory under /proc/net/afs/ with the name of the cell, but that fails with a warning:
WARNING: CPU: 0 PID: 9 at fs/proc/generic.c:405
because procfs limits the maximum filename length to 255.
However, the DNS limits the maximum lookup length and, by extension, the maximum cell name, to 255 less two (length count and trailing NUL).
Fix this by limiting the maximum acceptable cellname length to 253. This also allows us to be sure we can create the "/afs/.<cell>/" mountpoint too.
Further, split the YFS VL record cell name maximum to be the 256 allowed by the protocol and ignore the record retrieved by YFSVL.GetCellName if it exceeds 253.
Fixes: c3e9f888263b ("afs: Implement client support for the YFSVL.GetCellName RPC op") Reported-by: syzbot+7848fee1f1e5c53f912b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/6776d25d.050a0220.3a8527.0048.GAE@google.com/ Signed-off-by: David Howells dhowells@redhat.com Link: https://lore.kernel.org/r/376236.1736180460@warthog.procyon.org.uk Tested-by: syzbot+7848fee1f1e5c53f912b@syzkaller.appspotmail.com cc: Marc Dionne marc.dionne@auristor.com cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/afs/afs.h | 2 +- fs/afs/afs_vl.h | 1 + fs/afs/vl_alias.c | 8 ++++++-- fs/afs/vlclient.c | 2 +- 4 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/fs/afs/afs.h b/fs/afs/afs.h index 81815724db6c..25c17100798b 100644 --- a/fs/afs/afs.h +++ b/fs/afs/afs.h @@ -10,7 +10,7 @@
#include <linux/in.h>
-#define AFS_MAXCELLNAME 256 /* Maximum length of a cell name */ +#define AFS_MAXCELLNAME 253 /* Maximum length of a cell name (DNS limited) */ #define AFS_MAXVOLNAME 64 /* Maximum length of a volume name */ #define AFS_MAXNSERVERS 8 /* Maximum servers in a basic volume record */ #define AFS_NMAXNSERVERS 13 /* Maximum servers in a N/U-class volume record */ diff --git a/fs/afs/afs_vl.h b/fs/afs/afs_vl.h index 9c65ffb8a523..8da0899fbc08 100644 --- a/fs/afs/afs_vl.h +++ b/fs/afs/afs_vl.h @@ -13,6 +13,7 @@ #define AFS_VL_PORT 7003 /* volume location service port */ #define VL_SERVICE 52 /* RxRPC service ID for the Volume Location service */ #define YFS_VL_SERVICE 2503 /* Service ID for AuriStor upgraded VL service */ +#define YFS_VL_MAXCELLNAME 256 /* Maximum length of a cell name in YFS protocol */
enum AFSVL_Operations { VLGETENTRYBYID = 503, /* AFS Get VLDB entry by ID */ diff --git a/fs/afs/vl_alias.c b/fs/afs/vl_alias.c index f04a80e4f5c3..83cf1bfbe343 100644 --- a/fs/afs/vl_alias.c +++ b/fs/afs/vl_alias.c @@ -302,6 +302,7 @@ static char *afs_vl_get_cell_name(struct afs_cell *cell, struct key *key) static int yfs_check_canonical_cell_name(struct afs_cell *cell, struct key *key) { struct afs_cell *master; + size_t name_len; char *cell_name;
cell_name = afs_vl_get_cell_name(cell, key); @@ -313,8 +314,11 @@ static int yfs_check_canonical_cell_name(struct afs_cell *cell, struct key *key) return 0; }
- master = afs_lookup_cell(cell->net, cell_name, strlen(cell_name), - NULL, false); + name_len = strlen(cell_name); + if (!name_len || name_len > AFS_MAXCELLNAME) + master = ERR_PTR(-EOPNOTSUPP); + else + master = afs_lookup_cell(cell->net, cell_name, name_len, NULL, false); kfree(cell_name); if (IS_ERR(master)) return PTR_ERR(master); diff --git a/fs/afs/vlclient.c b/fs/afs/vlclient.c index 00fca3c66ba6..16653f2ffe4f 100644 --- a/fs/afs/vlclient.c +++ b/fs/afs/vlclient.c @@ -671,7 +671,7 @@ static int afs_deliver_yfsvl_get_cell_name(struct afs_call *call) return ret;
namesz = ntohl(call->tmp); - if (namesz > AFS_MAXCELLNAME) + if (namesz > YFS_VL_MAXCELLNAME) return afs_protocol_error(call, afs_eproto_cellname_len); paddedsz = (namesz + 3) & ~3; call->count = namesz;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maciej S. Szmigiero mail@maciej.szmigiero.name
[ Upstream commit dd410d784402c5775f66faf8b624e85e41c38aaf ]
Wakeup for IRQ1 should be disabled only in cases where i8042 had actually enabled it, otherwise "wake_depth" for this IRQ will try to drop below zero and there will be an unpleasant WARN() logged:
kernel: atkbd serio0: Disabling IRQ1 wakeup source to avoid platform firmware bug kernel: ------------[ cut here ]------------ kernel: Unbalanced IRQ 1 wake disable kernel: WARNING: CPU: 10 PID: 6431 at kernel/irq/manage.c:920 irq_set_irq_wake+0x147/0x1a0
The PMC driver uses DEFINE_SIMPLE_DEV_PM_OPS() to define its dev_pm_ops which sets amd_pmc_suspend_handler() to the .suspend, .freeze, and .poweroff handlers. i8042_pm_suspend(), however, is only set as the .suspend handler.
Fix the issue by call PMC suspend handler only from the same set of dev_pm_ops handlers as i8042_pm_suspend(), which currently means just the .suspend handler.
To reproduce this issue try hibernating (S4) the machine after a fresh boot without putting it into s2idle first.
Fixes: 8e60615e8932 ("platform/x86/amd: pmc: Disable IRQ1 wakeup for RN/CZN") Reviewed-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Maciej S. Szmigiero mail@maciej.szmigiero.name Link: https://lore.kernel.org/r/c8f28c002ca3c66fbeeb850904a1f43118e17200.173618460... [ij: edited the commit message.] Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/amd/pmc/pmc.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/x86/amd/pmc/pmc.c b/drivers/platform/x86/amd/pmc/pmc.c index f49b1bb258c7..70907e8f3ea9 100644 --- a/drivers/platform/x86/amd/pmc/pmc.c +++ b/drivers/platform/x86/amd/pmc/pmc.c @@ -878,6 +878,10 @@ static int amd_pmc_suspend_handler(struct device *dev) { struct amd_pmc_dev *pdev = dev_get_drvdata(dev);
+ /* + * Must be called only from the same set of dev_pm_ops handlers + * as i8042_pm_suspend() is called: currently just from .suspend. + */ if (pdev->disable_8042_wakeup && !disable_workarounds) { int rc = amd_pmc_wa_irq1(pdev);
@@ -890,7 +894,9 @@ static int amd_pmc_suspend_handler(struct device *dev) return 0; }
-static DEFINE_SIMPLE_DEV_PM_OPS(amd_pmc_pm, amd_pmc_suspend_handler, NULL); +static const struct dev_pm_ops amd_pmc_pm = { + .suspend = amd_pmc_suspend_handler, +};
static const struct pci_device_id pmc_pci_ids[] = { { PCI_DEVICE(PCI_VENDOR_ID_AMD, AMD_CPU_ID_PS) },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: He Wang xw897002528@gmail.com
[ Upstream commit 2ac538e40278a2c0c051cca81bcaafc547d61372 ]
When `ksmbd_vfs_kern_path_locked` met an error and it is not the last entry, it will exit without restoring changed path buffer. But later this buffer may be used as the filename for creation.
Fixes: c5a709f08d40 ("ksmbd: handle caseless file creation") Signed-off-by: He Wang xw897002528@gmail.com Acked-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/server/vfs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c index 2c548e8efef0..d0c19ad9d014 100644 --- a/fs/smb/server/vfs.c +++ b/fs/smb/server/vfs.c @@ -1259,6 +1259,8 @@ int ksmbd_vfs_kern_path_locked(struct ksmbd_work *work, char *name, filepath, flags, path); + if (!is_last) + next[0] = '/'; if (err) goto out2; else if (is_last) @@ -1266,7 +1268,6 @@ int ksmbd_vfs_kern_path_locked(struct ksmbd_work *work, char *name, path_put(parent_path); *parent_path = *path;
- next[0] = '/'; remain_len -= filename_len + 1; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Javier Carrasco javier.carrasco.cruz@gmail.com
[ Upstream commit 7e25044b804581b9c029d5a28d8800aebde18043 ]
The 'np' device_node is initialized via of_cpu_device_node_get(), which requires explicit calls to of_node_put() when it is no longer required to avoid leaking the resource.
Instead of adding the missing calls to of_node_put() in all execution paths, use the cleanup attribute for 'np' by means of the __free() macro, which automatically calls of_node_put() when the variable goes out of scope. Given that 'np' is only used within the for_each_possible_cpu(), reduce its scope to release the nood after every iteration of the loop.
Fixes: 6abf32f1d9c5 ("cpuidle: Add RISC-V SBI CPU idle driver") Reviewed-by: Andrew Jones ajones@ventanamicro.com Signed-off-by: Javier Carrasco javier.carrasco.cruz@gmail.com Link: https://lore.kernel.org/r/20241116-cpuidle-riscv-sbi-cleanup-v3-1-a3a46372ce... Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpuidle/cpuidle-riscv-sbi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c index c0fe92409175..71d433bb0ce6 100644 --- a/drivers/cpuidle/cpuidle-riscv-sbi.c +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c @@ -534,12 +534,12 @@ static int sbi_cpuidle_probe(struct platform_device *pdev) int cpu, ret; struct cpuidle_driver *drv; struct cpuidle_device *dev; - struct device_node *np, *pds_node; + struct device_node *pds_node;
/* Detect OSI support based on CPU DT nodes */ sbi_cpuidle_use_osi = true; for_each_possible_cpu(cpu) { - np = of_cpu_device_node_get(cpu); + struct device_node *np __free(device_node) = of_cpu_device_node_get(cpu); if (np && of_property_present(np, "power-domains") && of_property_present(np, "power-domain-names")) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xu Lu luxu.kernel@bytedance.com
[ Upstream commit f754f27e98f88428aaf6be6e00f5cbce97f62d4b ]
In sparse vmemmap model, the virtual address of vmemmap is calculated as: ((struct page *)VMEMMAP_START - (phys_ram_base >> PAGE_SHIFT)). And the struct page's va can be calculated with an offset: (vmemmap + (pfn)).
However, when initializing struct pages, kernel actually starts from the first page from the same section that phys_ram_base belongs to. If the first page's physical address is not (phys_ram_base >> PAGE_SHIFT), then we get an va below VMEMMAP_START when calculating va for it's struct page.
For example, if phys_ram_base starts from 0x82000000 with pfn 0x82000, the first page in the same section is actually pfn 0x80000. During init_unavailable_range(), we will initialize struct page for pfn 0x80000 with virtual address ((struct page *)VMEMMAP_START - 0x2000), which is below VMEMMAP_START as well as PCI_IO_END.
This commit fixes this bug by introducing a new variable 'vmemmap_start_pfn' which is aligned with memory section size and using it to calculate vmemmap address instead of phys_ram_base.
Fixes: a11dd49dcb93 ("riscv: Sparse-Memory/vmemmap out-of-bounds fix") Signed-off-by: Xu Lu luxu.kernel@bytedance.com Reviewed-by: Alexandre Ghiti alexghiti@rivosinc.com Tested-by: Björn Töpel bjorn@rivosinc.com Reviewed-by: Björn Töpel bjorn@rivosinc.com Link: https://lore.kernel.org/r/20241209122617.53341-1-luxu.kernel@bytedance.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/page.h | 1 + arch/riscv/include/asm/pgtable.h | 2 +- arch/riscv/mm/init.c | 17 ++++++++++++++++- 3 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h index 94b3d6930fc3..4d1f58848129 100644 --- a/arch/riscv/include/asm/page.h +++ b/arch/riscv/include/asm/page.h @@ -122,6 +122,7 @@ struct kernel_mapping {
extern struct kernel_mapping kernel_map; extern phys_addr_t phys_ram_base; +extern unsigned long vmemmap_start_pfn;
#define is_kernel_mapping(x) \ ((x) >= kernel_map.virt_addr && (x) < (kernel_map.virt_addr + kernel_map.size)) diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 37829dab4a0a..f540b2625714 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -84,7 +84,7 @@ * Define vmemmap for pfn_to_page & page_to_pfn calls. Needed if kernel * is configured with CONFIG_SPARSEMEM_VMEMMAP enabled. */ -#define vmemmap ((struct page *)VMEMMAP_START - (phys_ram_base >> PAGE_SHIFT)) +#define vmemmap ((struct page *)VMEMMAP_START - vmemmap_start_pfn)
#define PCI_IO_SIZE SZ_16M #define PCI_IO_END VMEMMAP_START diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 3245bb525212..bdf8ac6c7e30 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -32,6 +32,7 @@ #include <asm/ptdump.h> #include <asm/sections.h> #include <asm/soc.h> +#include <asm/sparsemem.h> #include <asm/tlbflush.h>
#include "../kernel/head.h" @@ -57,6 +58,13 @@ EXPORT_SYMBOL(pgtable_l5_enabled); phys_addr_t phys_ram_base __ro_after_init; EXPORT_SYMBOL(phys_ram_base);
+#ifdef CONFIG_SPARSEMEM_VMEMMAP +#define VMEMMAP_ADDR_ALIGN (1ULL << SECTION_SIZE_BITS) + +unsigned long vmemmap_start_pfn __ro_after_init; +EXPORT_SYMBOL(vmemmap_start_pfn); +#endif + unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss; EXPORT_SYMBOL(empty_zero_page); @@ -221,8 +229,12 @@ static void __init setup_bootmem(void) * Make sure we align the start of the memory on a PMD boundary so that * at worst, we map the linear mapping with PMD mappings. */ - if (!IS_ENABLED(CONFIG_XIP_KERNEL)) + if (!IS_ENABLED(CONFIG_XIP_KERNEL)) { phys_ram_base = memblock_start_of_DRAM() & PMD_MASK; +#ifdef CONFIG_SPARSEMEM_VMEMMAP + vmemmap_start_pfn = round_down(phys_ram_base, VMEMMAP_ADDR_ALIGN) >> PAGE_SHIFT; +#endif + }
/* * In 64-bit, any use of __va/__pa before this point is wrong as we @@ -1080,6 +1092,9 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa) kernel_map.xiprom_sz = (uintptr_t)(&_exiprom) - (uintptr_t)(&_xiprom);
phys_ram_base = CONFIG_PHYS_RAM_BASE; +#ifdef CONFIG_SPARSEMEM_VMEMMAP + vmemmap_start_pfn = round_down(phys_ram_base, VMEMMAP_ADDR_ALIGN) >> PAGE_SHIFT; +#endif kernel_map.phys_addr = (uintptr_t)CONFIG_PHYS_RAM_BASE; kernel_map.size = (uintptr_t)(&_end) - (uintptr_t)(&_start);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krister Johansen kjlx@templeofstupid.com
commit 80f130bfad1dab93b95683fc39b87235682b8f72 upstream.
The documentation in rculist.h explains the absence of list_empty_rcu() and cautions programmers against relying on a list_empty() -> list_first() sequence in RCU safe code. This is because each of these functions performs its own READ_ONCE() of the list head. This can lead to a situation where the list_empty() sees a valid list entry, but the subsequent list_first() sees a different view of list head state after a modification.
In the case of dm-thin, this author had a production box crash from a GP fault in the process_deferred_bios path. This function saw a valid list head in get_first_thin() but when it subsequently dereferenced that and turned it into a thin_c, it got the inside of the struct pool, since the list was now empty and referring to itself. The kernel on which this occurred printed both a warning about a refcount_t being saturated, and a UBSAN error for an out-of-bounds cpuid access in the queued spinlock, prior to the fault itself. When the resulting kdump was examined, it was possible to see another thread patiently waiting in thin_dtr's synchronize_rcu.
The thin_dtr call managed to pull the thin_c out of the active thins list (and have it be the last entry in the active_thins list) at just the wrong moment which lead to this crash.
Fortunately, the fix here is straight forward. Switch get_first_thin() function to use list_first_or_null_rcu() which performs just a single READ_ONCE() and returns NULL if the list is already empty.
This was run against the devicemapper test suite's thin-provisioning suites for delete and suspend and no regressions were observed.
Signed-off-by: Krister Johansen kjlx@templeofstupid.com Fixes: b10ebd34ccca ("dm thin: fix rcu_read_lock being held in code that can sleep") Cc: stable@vger.kernel.org Acked-by: Ming-Hung Tsai mtsai@redhat.com Signed-off-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-thin.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/md/dm-thin.c +++ b/drivers/md/dm-thin.c @@ -2334,10 +2334,9 @@ static struct thin_c *get_first_thin(str struct thin_c *tc = NULL;
rcu_read_lock(); - if (!list_empty(&pool->active_thins)) { - tc = list_entry_rcu(pool->active_thins.next, struct thin_c, list); + tc = list_first_or_null_rcu(&pool->active_thins, struct thin_c, list); + if (tc) thin_get(tc); - } rcu_read_unlock();
return tc;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org
commit 7bac65687510038390a0a54cbe14fba08d037e46 upstream.
PHY might already be powered on during ufs_qcom_power_up_sequence() in a couple of cases:
1. During UFSHCD_QUIRK_REINIT_AFTER_MAX_GEAR_SWITCH quirk
2. Resuming from spm_lvl = 5 suspend
In those cases, it is necessary to call phy_power_off() and phy_exit() in ufs_qcom_power_up_sequence() function to power off the PHY before calling phy_init() and phy_power_on().
Case (1) is doing it via ufs_qcom_reinit_notify() callback, but case (2) is not handled. So to satisfy both cases, call phy_power_off() and phy_exit() if the phy_count is non-zero. And with this change, the reinit_notify() callback is no longer needed.
This fixes the below UFS resume failure with spm_lvl = 5:
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: Enabling the controller failed ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: -5 ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_resume returns -5 ufs_device_wlun 0:0:0:49488: PM: failed to resume async: error -5
Cc: stable@vger.kernel.org # 6.3 Fixes: baf5ddac90dc ("scsi: ufs: ufs-qcom: Add support for reinitializing the UFS device") Reported-by: Ram Kumar Dwivedi quic_rdwivedi@quicinc.com Tested-by: Amit Pundir amit.pundir@linaro.org # on SM8550-HDK Reviewed-by: Bart Van Assche bvanassche@acm.org Tested-by: Neil Armstrong neil.armstrong@linaro.org # on SM8550-QRD Signed-off-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-1-63c4b95a70b9@li... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ufs/core/ufshcd-priv.h | 6 ------ drivers/ufs/core/ufshcd.c | 1 - drivers/ufs/host/ufs-qcom.c | 13 +++++-------- include/ufs/ufshcd.h | 2 -- 4 files changed, 5 insertions(+), 17 deletions(-)
--- a/drivers/ufs/core/ufshcd-priv.h +++ b/drivers/ufs/core/ufshcd-priv.h @@ -242,12 +242,6 @@ static inline void ufshcd_vops_config_sc hba->vops->config_scaling_param(hba, p, data); }
-static inline void ufshcd_vops_reinit_notify(struct ufs_hba *hba) -{ - if (hba->vops && hba->vops->reinit_notify) - hba->vops->reinit_notify(hba); -} - static inline int ufshcd_vops_mcq_config_resource(struct ufs_hba *hba) { if (hba->vops && hba->vops->mcq_config_resource) --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -8795,7 +8795,6 @@ static int ufshcd_probe_hba(struct ufs_h ufshcd_device_reset(hba); ufs_put_device_desc(hba); ufshcd_hba_stop(hba); - ufshcd_vops_reinit_notify(hba); ret = ufshcd_hba_enable(hba); if (ret) { dev_err(hba->dev, "Host controller enable failed\n"); --- a/drivers/ufs/host/ufs-qcom.c +++ b/drivers/ufs/host/ufs-qcom.c @@ -455,6 +455,11 @@ static int ufs_qcom_power_up_sequence(st dev_warn(hba->dev, "%s: host reset returned %d\n", __func__, ret);
+ if (phy->power_count) { + phy_power_off(phy); + phy_exit(phy); + } + /* phy initialization - calibrate the phy */ ret = phy_init(phy); if (ret) { @@ -1638,13 +1643,6 @@ static void ufs_qcom_config_scaling_para } #endif
-static void ufs_qcom_reinit_notify(struct ufs_hba *hba) -{ - struct ufs_qcom_host *host = ufshcd_get_variant(hba); - - phy_power_off(host->generic_phy); -} - /* Resources */ static const struct ufshcd_res_info ufs_res_info[RES_MAX] = { {.name = "ufs_mem",}, @@ -1887,7 +1885,6 @@ static const struct ufs_hba_variant_ops .device_reset = ufs_qcom_device_reset, .config_scaling_param = ufs_qcom_config_scaling_param, .program_key = ufs_qcom_ice_program_key, - .reinit_notify = ufs_qcom_reinit_notify, .mcq_config_resource = ufs_qcom_mcq_config_resource, .get_hba_mac = ufs_qcom_get_hba_mac, .op_runtime_config = ufs_qcom_op_runtime_config, --- a/include/ufs/ufshcd.h +++ b/include/ufs/ufshcd.h @@ -324,7 +324,6 @@ struct ufs_pwr_mode_info { * @config_scaling_param: called to configure clock scaling parameters * @program_key: program or evict an inline encryption key * @event_notify: called to notify important events - * @reinit_notify: called to notify reinit of UFSHCD during max gear switch * @mcq_config_resource: called to configure MCQ platform resources * @get_hba_mac: called to get vendor specific mac value, mandatory for mcq mode * @op_runtime_config: called to config Operation and runtime regs Pointers @@ -369,7 +368,6 @@ struct ufs_hba_variant_ops { const union ufs_crypto_cfg_entry *cfg, int slot); void (*event_notify)(struct ufs_hba *hba, enum ufs_event_type evt, void *data); - void (*reinit_notify)(struct ufs_hba *); int (*mcq_config_resource)(struct ufs_hba *hba); int (*get_hba_mac)(struct ufs_hba *hba); int (*op_runtime_config)(struct ufs_hba *hba);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka mpatocka@redhat.com
commit 47f33c27fc9565fb0bc7dfb76be08d445cd3d236 upstream.
dm-ebs uses dm-bufio to process requests that are not aligned on logical sector size. dm-bufio doesn't support passing integrity data (and it is unclear how should it do it), so we shouldn't set the DM_TARGET_PASSES_INTEGRITY flag.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Cc: stable@vger.kernel.org Fixes: d3c7b35c20d6 ("dm: add emulated block size target") Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-ebs-target.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/md/dm-ebs-target.c +++ b/drivers/md/dm-ebs-target.c @@ -442,7 +442,7 @@ static int ebs_iterate_devices(struct dm static struct target_type ebs_target = { .name = "ebs", .version = {1, 0, 1}, - .features = DM_TARGET_PASSES_INTEGRITY, + .features = 0, .module = THIS_MODULE, .ctr = ebs_ctr, .dtr = ebs_dtr,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) matttbe@kernel.org
commit d38e26e36206ae3d544d496513212ae931d1da0a upstream.
Using the 'net' structure via 'current' is not recommended for different reasons.
First, if the goal is to use it to read or write per-netns data, this is inconsistent with how the "generic" sysctl entries are doing: directly by only using pointers set to the table entry, e.g. table->data. Linked to that, the per-netns data should always be obtained from the table linked to the netns it had been created for, which may not coincide with the reader's or writer's netns.
Another reason is that access to current->nsproxy->netns can oops if attempted when current->nsproxy had been dropped when the current task is exiting. This is what syzbot found, when using acct(2):
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f] CPU: 1 UID: 0 PID: 5924 Comm: syz-executor Not tainted 6.13.0-rc5-syzkaller-00004-gccb98ccef0e5 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 RIP: 0010:proc_scheduler+0xc6/0x3c0 net/mptcp/ctrl.c:125 Code: 03 42 80 3c 38 00 0f 85 fe 02 00 00 4d 8b a4 24 08 09 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 cc 02 00 00 4d 8b 7c 24 28 48 8d 84 24 c8 00 00 RSP: 0018:ffffc900034774e8 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 1ffff9200068ee9e RCX: ffffc90003477620 RDX: 0000000000000005 RSI: ffffffff8b08f91e RDI: 0000000000000028 RBP: 0000000000000001 R08: ffffc90003477710 R09: 0000000000000040 R10: 0000000000000040 R11: 00000000726f7475 R12: 0000000000000000 R13: ffffc90003477620 R14: ffffc90003477710 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fee3cd452d8 CR3: 000000007d116000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> proc_sys_call_handler+0x403/0x5d0 fs/proc/proc_sysctl.c:601 __kernel_write_iter+0x318/0xa80 fs/read_write.c:612 __kernel_write+0xf6/0x140 fs/read_write.c:632 do_acct_process+0xcb0/0x14a0 kernel/acct.c:539 acct_pin_kill+0x2d/0x100 kernel/acct.c:192 pin_kill+0x194/0x7c0 fs/fs_pin.c:44 mnt_pin_kill+0x61/0x1e0 fs/fs_pin.c:81 cleanup_mnt+0x3ac/0x450 fs/namespace.c:1366 task_work_run+0x14e/0x250 kernel/task_work.c:239 exit_task_work include/linux/task_work.h:43 [inline] do_exit+0xad8/0x2d70 kernel/exit.c:938 do_group_exit+0xd3/0x2a0 kernel/exit.c:1087 get_signal+0x2576/0x2610 kernel/signal.c:3017 arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218 do_syscall_64+0xda/0x250 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fee3cb87a6a Code: Unable to access opcode bytes at 0x7fee3cb87a40. RSP: 002b:00007fffcccac688 EFLAGS: 00000202 ORIG_RAX: 0000000000000037 RAX: 0000000000000000 RBX: 00007fffcccac710 RCX: 00007fee3cb87a6a RDX: 0000000000000041 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 0000000000000003 R08: 00007fffcccac6ac R09: 00007fffcccacac7 R10: 00007fffcccac710 R11: 0000000000000202 R12: 00007fee3cd49500 R13: 00007fffcccac6ac R14: 0000000000000000 R15: 00007fee3cd4b000 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:proc_scheduler+0xc6/0x3c0 net/mptcp/ctrl.c:125 Code: 03 42 80 3c 38 00 0f 85 fe 02 00 00 4d 8b a4 24 08 09 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 28 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 cc 02 00 00 4d 8b 7c 24 28 48 8d 84 24 c8 00 00 RSP: 0018:ffffc900034774e8 EFLAGS: 00010206 RAX: dffffc0000000000 RBX: 1ffff9200068ee9e RCX: ffffc90003477620 RDX: 0000000000000005 RSI: ffffffff8b08f91e RDI: 0000000000000028 RBP: 0000000000000001 R08: ffffc90003477710 R09: 0000000000000040 R10: 0000000000000040 R11: 00000000726f7475 R12: 0000000000000000 R13: ffffc90003477620 R14: ffffc90003477710 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fee3cd452d8 CR3: 000000007d116000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 ---------------- Code disassembly (best guess), 1 bytes skipped: 0: 42 80 3c 38 00 cmpb $0x0,(%rax,%r15,1) 5: 0f 85 fe 02 00 00 jne 0x309 b: 4d 8b a4 24 08 09 00 mov 0x908(%r12),%r12 12: 00 13: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 1a: fc ff df 1d: 49 8d 7c 24 28 lea 0x28(%r12),%rdi 22: 48 89 fa mov %rdi,%rdx 25: 48 c1 ea 03 shr $0x3,%rdx * 29: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction 2d: 0f 85 cc 02 00 00 jne 0x2ff 33: 4d 8b 7c 24 28 mov 0x28(%r12),%r15 38: 48 rex.W 39: 8d .byte 0x8d 3a: 84 24 c8 test %ah,(%rax,%rcx,8)
Here with 'net.mptcp.scheduler', the 'net' structure is not really needed, because the table->data already has a pointer to the current scheduler, the only thing needed from the per-netns data. Simply use 'data', instead of getting (most of the time) the same thing, but from a longer and indirect way.
Fixes: 6963c508fd7a ("mptcp: only allow set existing scheduler for net.mptcp.scheduler") Cc: stable@vger.kernel.org Reported-by: syzbot+e364f774c6f57f2c86d1@syzkaller.appspotmail.com Closes: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com Suggested-by: Al Viro viro@zeniv.linux.org.uk Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-2-5df34b2083... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/ctrl.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
--- a/net/mptcp/ctrl.c +++ b/net/mptcp/ctrl.c @@ -87,16 +87,15 @@ static void mptcp_pernet_set_defaults(st }
#ifdef CONFIG_SYSCTL -static int mptcp_set_scheduler(const struct net *net, const char *name) +static int mptcp_set_scheduler(char *scheduler, const char *name) { - struct mptcp_pernet *pernet = mptcp_get_pernet(net); struct mptcp_sched_ops *sched; int ret = 0;
rcu_read_lock(); sched = mptcp_sched_find(name); if (sched) - strscpy(pernet->scheduler, name, MPTCP_SCHED_NAME_MAX); + strscpy(scheduler, name, MPTCP_SCHED_NAME_MAX); else ret = -ENOENT; rcu_read_unlock(); @@ -107,7 +106,7 @@ static int mptcp_set_scheduler(const str static int proc_scheduler(struct ctl_table *ctl, int write, void *buffer, size_t *lenp, loff_t *ppos) { - const struct net *net = current->nsproxy->net_ns; + char (*scheduler)[MPTCP_SCHED_NAME_MAX] = ctl->data; char val[MPTCP_SCHED_NAME_MAX]; struct ctl_table tbl = { .data = val, @@ -115,11 +114,11 @@ static int proc_scheduler(struct ctl_tab }; int ret;
- strscpy(val, mptcp_get_scheduler(net), MPTCP_SCHED_NAME_MAX); + strscpy(val, *scheduler, MPTCP_SCHED_NAME_MAX);
ret = proc_dostring(&tbl, write, buffer, lenp, ppos); if (write && ret == 0) - ret = mptcp_set_scheduler(net, val); + ret = mptcp_set_scheduler(*scheduler, val);
return ret; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) matttbe@kernel.org
commit ea62dd1383913b5999f3d16ae99d411f41b528d4 upstream.
As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using container_of().
Note that table->data could also be used directly, as this is the only member needed from the 'net' structure, but that would increase the size of this fix, to use '*data' everywhere 'net->sctp.sctp_hmac_alg' is used.
Fixes: 3c68198e7511 ("sctp: Make hmac algorithm selection for cookie generation dynamic") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-4-5df34b2083... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/sysctl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/sctp/sysctl.c +++ b/net/sctp/sysctl.c @@ -391,7 +391,8 @@ static struct ctl_table sctp_net_table[] static int proc_sctp_do_hmac_alg(struct ctl_table *ctl, int write, void *buffer, size_t *lenp, loff_t *ppos) { - struct net *net = current->nsproxy->net_ns; + struct net *net = container_of(ctl->data, struct net, + sctp.sctp_hmac_alg); struct ctl_table tbl; bool changed = false; char *none = "none";
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) matttbe@kernel.org
commit 9fc17b76fc70763780aa78b38fcf4742384044a5 upstream.
As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using container_of().
Note that table->data could also be used directly, as this is the only member needed from the 'net' structure, but that would increase the size of this fix, to use '*data' everywhere 'net->sctp.rto_min/max' is used.
Fixes: 4f3fdf3bc59c ("sctp: add check rto_min and rto_max in sysctl") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-5-5df34b2083... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/sysctl.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/sctp/sysctl.c +++ b/net/sctp/sysctl.c @@ -437,7 +437,7 @@ static int proc_sctp_do_hmac_alg(struct static int proc_sctp_do_rto_min(struct ctl_table *ctl, int write, void *buffer, size_t *lenp, loff_t *ppos) { - struct net *net = current->nsproxy->net_ns; + struct net *net = container_of(ctl->data, struct net, sctp.rto_min); unsigned int min = *(unsigned int *) ctl->extra1; unsigned int max = *(unsigned int *) ctl->extra2; struct ctl_table tbl; @@ -465,7 +465,7 @@ static int proc_sctp_do_rto_min(struct c static int proc_sctp_do_rto_max(struct ctl_table *ctl, int write, void *buffer, size_t *lenp, loff_t *ppos) { - struct net *net = current->nsproxy->net_ns; + struct net *net = container_of(ctl->data, struct net, sctp.rto_max); unsigned int min = *(unsigned int *) ctl->extra1; unsigned int max = *(unsigned int *) ctl->extra2; struct ctl_table tbl;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) matttbe@kernel.org
commit 15649fd5415eda664ef35780c2013adeb5d9c695 upstream.
As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using container_of().
Note that table->data could also be used directly, but that would increase the size of this fix, while 'sctp.ctl_sock' still needs to be retrieved from 'net' structure.
Fixes: b14878ccb7fa ("net: sctp: cache auth_enable per endpoint") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-6-5df34b2083... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/sysctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/sctp/sysctl.c +++ b/net/sctp/sysctl.c @@ -503,7 +503,7 @@ static int proc_sctp_do_alpha_beta(struc static int proc_sctp_do_auth(struct ctl_table *ctl, int write, void *buffer, size_t *lenp, loff_t *ppos) { - struct net *net = current->nsproxy->net_ns; + struct net *net = container_of(ctl->data, struct net, sctp.auth_enable); struct ctl_table tbl; int new_value, ret;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) matttbe@kernel.org
commit c10377bbc1972d858eaf0ab366a311b39f8ef1b6 upstream.
As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using container_of().
Note that table->data could also be used directly, but that would increase the size of this fix, while 'sctp.ctl_sock' still needs to be retrieved from 'net' structure.
Fixes: 046c052b475e ("sctp: enable udp tunneling socks") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-7-5df34b2083... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/sysctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/sctp/sysctl.c +++ b/net/sctp/sysctl.c @@ -532,7 +532,7 @@ static int proc_sctp_do_auth(struct ctl_ static int proc_sctp_do_udp_port(struct ctl_table *ctl, int write, void *buffer, size_t *lenp, loff_t *ppos) { - struct net *net = current->nsproxy->net_ns; + struct net *net = container_of(ctl->data, struct net, sctp.udp_port); unsigned int min = *(unsigned int *)ctl->extra1; unsigned int max = *(unsigned int *)ctl->extra2; struct ctl_table tbl;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) matttbe@kernel.org
commit 6259d2484d0ceff42245d1f09cc8cb6ee72d847a upstream.
As mentioned in a previous commit of this series, using the 'net' structure via 'current' is not recommended for different reasons:
- Inconsistency: getting info from the reader's/writer's netns vs only from the opener's netns.
- current->nsproxy can be NULL in some cases, resulting in an 'Oops' (null-ptr-deref), e.g. when the current task is exiting, as spotted by syzbot [1] using acct(2).
The 'net' structure can be obtained from the table->data using container_of().
Note that table->data could also be used directly, as this is the only member needed from the 'net' structure, but that would increase the size of this fix, to use '*data' everywhere 'net->sctp.probe_interval' is used.
Fixes: d1e462a7a5f3 ("sctp: add probe_interval in sysctl and sock/asoc/transport") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/67769ecb.050a0220.3a8527.003f.GAE@google.com [1] Suggested-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20250108-net-sysctl-current-nsproxy-v1-8-5df34b2083... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/sysctl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/sctp/sysctl.c +++ b/net/sctp/sysctl.c @@ -573,7 +573,8 @@ static int proc_sctp_do_udp_port(struct static int proc_sctp_do_probe_interval(struct ctl_table *ctl, int write, void *buffer, size_t *lenp, loff_t *ppos) { - struct net *net = current->nsproxy->net_ns; + struct net *net = container_of(ctl->data, struct net, + sctp.probe_interval); struct ctl_table tbl; int ret, new_value;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
commit e8580b4c600e085b3c8e6404392de2f822d4c132 upstream.
As SMB3 posix extension specification, Give posix file type to posix mode.
https://www.samba.org/~slow/SMB3_POSIX/fscc_posix_extensions.html#posix-file...
Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/server/smb2pdu.c | 40 ++++++++++++++++++++++++++++++++++++++++ fs/smb/server/smb2pdu.h | 10 ++++++++++ 2 files changed, 50 insertions(+)
--- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -3989,6 +3989,26 @@ static int smb2_populate_readdir_entry(s posix_info->DeviceId = cpu_to_le32(ksmbd_kstat->kstat->rdev); posix_info->HardLinks = cpu_to_le32(ksmbd_kstat->kstat->nlink); posix_info->Mode = cpu_to_le32(ksmbd_kstat->kstat->mode & 0777); + switch (ksmbd_kstat->kstat->mode & S_IFMT) { + case S_IFDIR: + posix_info->Mode |= cpu_to_le32(POSIX_TYPE_DIR << POSIX_FILETYPE_SHIFT); + break; + case S_IFLNK: + posix_info->Mode |= cpu_to_le32(POSIX_TYPE_SYMLINK << POSIX_FILETYPE_SHIFT); + break; + case S_IFCHR: + posix_info->Mode |= cpu_to_le32(POSIX_TYPE_CHARDEV << POSIX_FILETYPE_SHIFT); + break; + case S_IFBLK: + posix_info->Mode |= cpu_to_le32(POSIX_TYPE_BLKDEV << POSIX_FILETYPE_SHIFT); + break; + case S_IFIFO: + posix_info->Mode |= cpu_to_le32(POSIX_TYPE_FIFO << POSIX_FILETYPE_SHIFT); + break; + case S_IFSOCK: + posix_info->Mode |= cpu_to_le32(POSIX_TYPE_SOCKET << POSIX_FILETYPE_SHIFT); + } + posix_info->Inode = cpu_to_le64(ksmbd_kstat->kstat->ino); posix_info->DosAttributes = S_ISDIR(ksmbd_kstat->kstat->mode) ? @@ -5177,6 +5197,26 @@ static int find_file_posix_info(struct s file_info->AllocationSize = cpu_to_le64(stat.blocks << 9); file_info->HardLinks = cpu_to_le32(stat.nlink); file_info->Mode = cpu_to_le32(stat.mode & 0777); + switch (stat.mode & S_IFMT) { + case S_IFDIR: + file_info->Mode |= cpu_to_le32(POSIX_TYPE_DIR << POSIX_FILETYPE_SHIFT); + break; + case S_IFLNK: + file_info->Mode |= cpu_to_le32(POSIX_TYPE_SYMLINK << POSIX_FILETYPE_SHIFT); + break; + case S_IFCHR: + file_info->Mode |= cpu_to_le32(POSIX_TYPE_CHARDEV << POSIX_FILETYPE_SHIFT); + break; + case S_IFBLK: + file_info->Mode |= cpu_to_le32(POSIX_TYPE_BLKDEV << POSIX_FILETYPE_SHIFT); + break; + case S_IFIFO: + file_info->Mode |= cpu_to_le32(POSIX_TYPE_FIFO << POSIX_FILETYPE_SHIFT); + break; + case S_IFSOCK: + file_info->Mode |= cpu_to_le32(POSIX_TYPE_SOCKET << POSIX_FILETYPE_SHIFT); + } + file_info->DeviceId = cpu_to_le32(stat.rdev);
/* --- a/fs/smb/server/smb2pdu.h +++ b/fs/smb/server/smb2pdu.h @@ -500,4 +500,14 @@ static inline void *smb2_get_msg(void *b return buf + 4; }
+#define POSIX_TYPE_FILE 0 +#define POSIX_TYPE_DIR 1 +#define POSIX_TYPE_SYMLINK 2 +#define POSIX_TYPE_CHARDEV 3 +#define POSIX_TYPE_BLKDEV 4 +#define POSIX_TYPE_FIFO 5 +#define POSIX_TYPE_SOCKET 6 + +#define POSIX_FILETYPE_SHIFT 12 + #endif /* _SMB2PDU_H */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roman Li Roman.Li@amd.com
commit 0881fbc4fd62e00a2b8e102725f76d10351b2ea8 upstream.
[Why] Wrapper functions for dcn_bw_ceil2() and dcn_bw_floor2() should check for granularity is non zero to avoid assert and divide-by-zero error in dcn_bw_ functions.
[How] Add check for granularity 0.
Cc: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Alvin Lee alvin.lee2@amd.com Signed-off-by: Roman Li Roman.Li@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit f6e09701c3eb2ccb8cb0518e0b67f1c69742a4ec) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dml/dml_inline_defs.h | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/gpu/drm/amd/display/dc/dml/dml_inline_defs.h +++ b/drivers/gpu/drm/amd/display/dc/dml/dml_inline_defs.h @@ -66,11 +66,15 @@ static inline double dml_max5(double a,
static inline double dml_ceil(double a, double granularity) { + if (granularity == 0) + return 0; return (double) dcn_bw_ceil2(a, granularity); }
static inline double dml_floor(double a, double granularity) { + if (granularity == 0) + return 0; return (double) dcn_bw_floor2(a, granularity); }
@@ -114,11 +118,15 @@ static inline double dml_ceil_2(double f
static inline double dml_ceil_ex(double x, double granularity) { + if (granularity == 0) + return 0; return (double) dcn_bw_ceil2(x, granularity); }
static inline double dml_floor_ex(double x, double granularity) { + if (granularity == 0) + return 0; return (double) dcn_bw_floor2(x, granularity); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
[ Upstream commit 9164e0912af206a72ddac4915f7784e470a04ace ]
of_thermal_zone_find() calls of_parse_phandle_with_args(), but does not release the OF node reference obtained by it.
Add a of_node_put() call when the call is successful.
Fixes: 3fd6d6e2b4e8 ("thermal/of: Rework the thermal device tree initialization") Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Link: https://patch.msgid.link/20241224031809.950461-1-joe@pf.is.s.u-tokyo.ac.jp [ rjw: Changelog edit ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thermal/thermal_of.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c index 4e5f86c21456..0f520cf923a1 100644 --- a/drivers/thermal/thermal_of.c +++ b/drivers/thermal/thermal_of.c @@ -203,6 +203,7 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int goto out; }
+ of_node_put(sensor_specs.np); if ((sensor == sensor_specs.np) && id == (sensor_specs.args_count ? sensor_specs.args[0] : 0)) { pr_debug("sensor %pOFn id=%d belongs to %pOFn\n", sensor, id, child);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Meetakshi Setiya msetiya@microsoft.com
commit 20b1aa912316ffb7fbb5f407f17c330f2a22ddff upstream.
In some cases, when password2 becomes the working password, the client swaps the two password fields in the root session struct, but not in the smb3_fs_context struct in cifs_sb. DFS automounts inherit fs context from their parent mounts. Therefore, they might end up getting the passwords in the stale order. The automount should succeed, because the mount function will end up retrying with the actual password anyway. But to reduce these unnecessary session setup retries for automounts, we can sync the parent context's passwords with the root session's passwords before duplicating it to the child's fs context.
Cc: stable@vger.kernel.org Signed-off-by: Meetakshi Setiya msetiya@microsoft.com Reviewed-by: Shyam Prasad N sprasad@microsoft.com Acked-by: Paulo Alcantara (Red Hat) pc@manguebit.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/smb/client/namespace.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-)
--- a/fs/smb/client/namespace.c +++ b/fs/smb/client/namespace.c @@ -196,11 +196,28 @@ static struct vfsmount *cifs_do_automoun struct smb3_fs_context tmp; char *full_path; struct vfsmount *mnt; + struct cifs_sb_info *mntpt_sb; + struct cifs_ses *ses;
if (IS_ROOT(mntpt)) return ERR_PTR(-ESTALE);
- cur_ctx = CIFS_SB(mntpt->d_sb)->ctx; + mntpt_sb = CIFS_SB(mntpt->d_sb); + ses = cifs_sb_master_tcon(mntpt_sb)->ses; + cur_ctx = mntpt_sb->ctx; + + /* + * At this point, the root session should be in the mntpt sb. We should + * bring the sb context passwords in sync with the root session's + * passwords. This would help prevent unnecessary retries and password + * swaps for automounts. + */ + mutex_lock(&ses->session_mutex); + rc = smb3_sync_session_ctx_passwords(mntpt_sb, ses); + mutex_unlock(&ses->session_mutex); + + if (rc) + return ERR_PTR(rc);
fc = fs_context_for_submount(path->mnt->mnt_sb->s_type, mntpt); if (IS_ERR(fc))
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao namcao@linutronix.de
commit 6a97f4118ac07cfdc316433f385dbdc12af5025e upstream.
die() can be called in exception handler, and therefore cannot sleep. However, die() takes spinlock_t which can sleep with PREEMPT_RT enabled. That causes the following warning:
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 285, name: mutex preempt_count: 110001, expected: 0 RCU nest depth: 0, expected: 0 CPU: 0 UID: 0 PID: 285 Comm: mutex Not tainted 6.12.0-rc7-00022-ge19049cf7d56-dirty #234 Hardware name: riscv-virtio,qemu (DT) Call Trace: dump_backtrace+0x1c/0x24 show_stack+0x2c/0x38 dump_stack_lvl+0x5a/0x72 dump_stack+0x14/0x1c __might_resched+0x130/0x13a rt_spin_lock+0x2a/0x5c die+0x24/0x112 do_trap_insn_illegal+0xa0/0xea _new_vmalloc_restore_context_a0+0xcc/0xd8 Oops - illegal instruction [#1]
Switch to use raw_spinlock_t, which does not sleep even with PREEMPT_RT enabled.
Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code") Signed-off-by: Nam Cao namcao@linutronix.de Cc: stable@vger.kernel.org Reviewed-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Link: https://lore.kernel.org/r/20241118091333.1185288-1-namcao@linutronix.de Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/kernel/traps.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -34,7 +34,7 @@
int show_unhandled_signals = 1;
-static DEFINE_SPINLOCK(die_lock); +static DEFINE_RAW_SPINLOCK(die_lock);
static void dump_kernel_instr(const char *loglvl, struct pt_regs *regs) { @@ -66,7 +66,7 @@ void die(struct pt_regs *regs, const cha
oops_enter();
- spin_lock_irqsave(&die_lock, flags); + raw_spin_lock_irqsave(&die_lock, flags); console_verbose(); bust_spinlocks(1);
@@ -85,7 +85,7 @@ void die(struct pt_regs *regs, const cha
bust_spinlocks(0); add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE); - spin_unlock_irqrestore(&die_lock, flags); + raw_spin_unlock_irqrestore(&die_lock, flags); oops_exit();
if (in_interrupt())
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede hdegoede@redhat.com
commit 7ed4e4a659d99499dc6968c61970d41b64feeac0 upstream.
The TongFang GM5HG0A is a TongFang barebone design which is sold under various brand names.
The ACPI IRQ override for the keyboard IRQ must be used on these AMD Zen laptops in order for the IRQ to work.
At least on the SKIKK Vanaheim variant the DMI product- and board-name strings have been replaced by the OEM with "Vanaheim" so checking that board-name contains "GM5HG0A" as is usually done for TongFang barebones quirks does not work.
The DMI OEM strings do contain "GM5HG0A". I have looked at the dmidecode for a few other TongFang devices and the TongFang code-name string being in the OEM strings seems to be something which is consistently true.
Add a quirk checking one of the DMI_OEM_STRING(s) is "GM5HG0A" in the hope that this will work for other OEM versions of the "GM5HG0A" too.
Link: https://www.skikk.eu/en/laptops/vanaheim-15-rtx-4060 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219614 Cc: All applicable stable@vger.kernel.org Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://patch.msgid.link/20241228164845.42381-1-hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/resource.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/drivers/acpi/resource.c +++ b/drivers/acpi/resource.c @@ -633,6 +633,17 @@ static const struct dmi_system_id lg_lap DMI_MATCH(DMI_BOARD_NAME, "GMxHGxx"), }, }, + { + /* + * TongFang GM5HG0A in case of the SKIKK Vanaheim relabel the + * board-name is changed, so check OEM strings instead. Note + * OEM string matches are always exact matches. + * https://bugzilla.kernel.org/show_bug.cgi?id=219614 + */ + .matches = { + DMI_EXACT_MATCH(DMI_OEM_STRING, "GM5HG0A"), + }, + }, { } };
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede hdegoede@redhat.com
commit 66d337fede44dcbab4107d37684af8fcab3d648e upstream.
Like the Vivobook X1704VAP the X1504VAP has its keyboard IRQ (1) described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh which breaks the keyboard.
Add the X1504VAP to the irq1_level_low_skip_override[] quirk table to fix this.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219224 Cc: All applicable stable@vger.kernel.org Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://patch.msgid.link/20241220181352.25974-1-hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/resource.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/acpi/resource.c +++ b/drivers/acpi/resource.c @@ -440,6 +440,13 @@ static const struct dmi_system_id asus_l }, }, { + /* Asus Vivobook X1504VAP */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."), + DMI_MATCH(DMI_BOARD_NAME, "X1504VAP"), + }, + }, + { /* Asus Vivobook X1704VAP */ .matches = { DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jesse.zhang@amd.com Jesse.zhang@amd.com
commit 9738609449c3e44d1afb73eecab4763362b57930 upstream.
Initialize the process context address before setting the shader debugger.
[ 260.781212] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0) [ 260.781236] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10 [ 260.781255] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040A40 [ 260.781270] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5) [ 260.781284] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0 [ 260.781296] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0 [ 260.781308] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x4 [ 260.781320] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0 [ 260.781332] amdgpu 0000:03:00.0: amdgpu: RW: 0x1 [ 260.782017] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0) [ 260.782039] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10 [ 260.782058] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040A41 [ 260.782073] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5) [ 260.782087] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1 [ 260.782098] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0 [ 260.782110] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x4 [ 260.782122] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0 [ 260.782137] amdgpu 0000:03:00.0: amdgpu: RW: 0x1 [ 260.782155] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0) [ 260.782166] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
Fixes: 438b39ac74e2 ("drm/amdkfd: pause autosuspend when creating pdd") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3849 Signed-off-by: Jesse Zhang jesse.zhang@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 5b231f5bc9ff02ec5737f2ec95cdf15ac95088e9) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdkfd/kfd_debug.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c @@ -349,10 +349,27 @@ int kfd_dbg_set_mes_debug_mode(struct kf { uint32_t spi_dbg_cntl = pdd->spi_dbg_override | pdd->spi_dbg_launch_mode; uint32_t flags = pdd->process->dbg_flags; + struct amdgpu_device *adev = pdd->dev->adev; + int r;
if (!kfd_dbg_is_per_vmid_supported(pdd->dev)) return 0;
+ if (!pdd->proc_ctx_cpu_ptr) { + r = amdgpu_amdkfd_alloc_gtt_mem(adev, + AMDGPU_MES_PROC_CTX_SIZE, + &pdd->proc_ctx_bo, + &pdd->proc_ctx_gpu_addr, + &pdd->proc_ctx_cpu_ptr, + false); + if (r) { + dev_err(adev->dev, + "failed to allocate process context bo\n"); + return r; + } + memset(pdd->proc_ctx_cpu_ptr, 0, AMDGPU_MES_PROC_CTX_SIZE); + } + return amdgpu_mes_set_shader_debugger(pdd->dev->adev, pdd->proc_ctx_gpu_addr, spi_dbg_cntl, pdd->watch_points, flags, sq_trap_en); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Melissa Wen mwen@igalia.com
commit 21541bc6b44241e3f791f9e552352d8440b2b29e upstream.
As the hw supports up to 4 surfaces, increase the maximum number of surfaces to prevent the DC error when trying to use more than three planes.
[drm:dc_state_add_plane [amdgpu]] *ERROR* Surface: can not attach plane_state 000000003e2cb82c! Maximum is: 3
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3693 Signed-off-by: Melissa Wen mwen@igalia.com Reviewed-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Signed-off-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit b8d6daffc871a42026c3c20bff7b8fa0302298c1) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/display/dc/dc.h +++ b/drivers/gpu/drm/amd/display/dc/dc.h @@ -49,7 +49,7 @@ struct dmub_notification;
#define DC_VER "3.2.247"
-#define MAX_SURFACES 3 +#define MAX_SURFACES 4 #define MAX_PLANES 6 #define MAX_STREAMS 6 #define MIN_VIEWPORT_SIZE 12
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Begunkov asml.silence@gmail.com
commit c83c846231db8b153bfcb44d552d373c34f78245 upstream.
After update only the first shot of a multishot timeout request adheres to the new timeout value while all subsequent retries continue to use the old value. Don't forget to update the timeout stored in struct io_timeout_data.
Cc: stable@vger.kernel.org Fixes: ea97f6c8558e8 ("io_uring: add support for multishot timeouts") Reported-by: Christian Mazakas christian.mazakas@gmail.com Signed-off-by: Pavel Begunkov asml.silence@gmail.com Link: https://lore.kernel.org/r/e6516c3304eb654ec234cfa65c88a9579861e597.173601528... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/timeout.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/io_uring/timeout.c +++ b/io_uring/timeout.c @@ -413,10 +413,12 @@ static int io_timeout_update(struct io_r
timeout->off = 0; /* noseq */ data = req->async_data; + data->ts = *ts; + list_add_tail(&timeout->list, &ctx->timeout_list); hrtimer_init(&data->timer, io_timeout_get_clock(data), mode); data->timer.function = io_timeout_fn; - hrtimer_start(&data->timer, timespec64_to_ktime(*ts), mode); + hrtimer_start(&data->timer, timespec64_to_ktime(data->ts), mode); return 0; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ye Bin yebin10@huawei.com
commit b7d0a97b28083084ebdd8e5c6bccd12e6ec18faa upstream.
There's issue as follows when concurrently installing the f2fs.ko module and mounting the f2fs file system: KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027] RIP: 0010:__bio_alloc+0x2fb/0x6c0 [f2fs] Call Trace: <TASK> f2fs_submit_page_bio+0x126/0x8b0 [f2fs] __get_meta_page+0x1d4/0x920 [f2fs] get_checkpoint_version.constprop.0+0x2b/0x3c0 [f2fs] validate_checkpoint+0xac/0x290 [f2fs] f2fs_get_valid_checkpoint+0x207/0x950 [f2fs] f2fs_fill_super+0x1007/0x39b0 [f2fs] mount_bdev+0x183/0x250 legacy_get_tree+0xf4/0x1e0 vfs_get_tree+0x88/0x340 do_new_mount+0x283/0x5e0 path_mount+0x2b2/0x15b0 __x64_sys_mount+0x1fe/0x270 do_syscall_64+0x5f/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e
Above issue happens as the biset of the f2fs file system is not initialized before register "f2fs_fs_type". To address above issue just register "f2fs_fs_type" at the last in init_f2fs_fs(). Ensure that all f2fs file system resources are initialized.
Fixes: f543805fcd60 ("f2fs: introduce private bioset") Signed-off-by: Ye Bin yebin10@huawei.com Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Bin Lan lanbincn@qq.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/super.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/fs/f2fs/super.c +++ b/fs/f2fs/super.c @@ -4931,9 +4931,6 @@ static int __init init_f2fs_fs(void) err = register_shrinker(&f2fs_shrinker_info, "f2fs-shrinker"); if (err) goto free_sysfs; - err = register_filesystem(&f2fs_fs_type); - if (err) - goto free_shrinker; f2fs_create_root_stats(); err = f2fs_init_post_read_processing(); if (err) @@ -4956,7 +4953,12 @@ static int __init init_f2fs_fs(void) err = f2fs_create_casefold_cache(); if (err) goto free_compress_cache; + err = register_filesystem(&f2fs_fs_type); + if (err) + goto free_casefold_cache; return 0; +free_casefold_cache: + f2fs_destroy_casefold_cache(); free_compress_cache: f2fs_destroy_compress_cache(); free_compress_mempool: @@ -4971,8 +4973,6 @@ free_post_read: f2fs_destroy_post_read_processing(); free_root_stats: f2fs_destroy_root_stats(); - unregister_filesystem(&f2fs_fs_type); -free_shrinker: unregister_shrinker(&f2fs_shrinker_info); free_sysfs: f2fs_exit_sysfs(); @@ -4996,6 +4996,7 @@ fail:
static void __exit exit_f2fs_fs(void) { + unregister_filesystem(&f2fs_fs_type); f2fs_destroy_casefold_cache(); f2fs_destroy_compress_cache(); f2fs_destroy_compress_mempool(); @@ -5004,7 +5005,6 @@ static void __exit exit_f2fs_fs(void) f2fs_destroy_iostat_processing(); f2fs_destroy_post_read_processing(); f2fs_destroy_root_stats(); - unregister_filesystem(&f2fs_fs_type); unregister_shrinker(&f2fs_shrinker_info); f2fs_exit_sysfs(); f2fs_destroy_garbage_collection_cache();
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Milan Broz gmazyland@gmail.com
commit 6df90c02bae468a3a6110bafbc659884d0c4966c upstream.
This patch fixes an issue that was fixed in the commit df7b59ba9245 ("dm verity: fix FEC for RS roots unaligned to block size") but later broken again in the commit 8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO")
If the Reed-Solomon roots setting spans multiple blocks, the code does not use proper parity bytes and randomly fails to repair even trivial errors.
This bug cannot happen if the sector size is multiple of RS roots setting (Android case with roots 2).
The previous solution was to find a dm-bufio block size that is multiple of the device sector size and roots size. Unfortunately, the optimization in commit 8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO") is incorrect and uses data block size for some roots (for example, it uses 4096 block size for roots = 20).
This patch uses a different approach:
- It always uses a configured data block size for dm-bufio to avoid possible misaligned IOs.
- and it caches the processed parity bytes, so it can join it if it spans two blocks.
As the RS calculation is called only if an error is detected and the process is computationally intensive, copying a few more bytes should not introduce performance issues.
The issue was reported to cryptsetup with trivial reproducer https://gitlab.com/cryptsetup/cryptsetup/-/issues/923
Reproducer (with roots=20):
# create verity device with RS FEC dd if=/dev/urandom of=data.img bs=4096 count=8 status=none veritysetup format data.img hash.img --fec-device=fec.img --fec-roots=20 | \ awk '/^Root hash/{ print $3 }' >roothash
# create an erasure that should always be repairable with this roots setting dd if=/dev/zero of=data.img conv=notrunc bs=1 count=4 seek=4 status=none
# try to read it through dm-verity veritysetup open data.img test hash.img --fec-device=fec.img --fec-roots=20 $(cat roothash) dd if=/dev/mapper/test of=/dev/null bs=4096 status=noxfer
Even now the log says it cannot repair it: : verity-fec: 7:1: FEC 0: failed to correct: -74 : device-mapper: verity: 7:1: data block 0 is corrupted ...
With this fix, errors are properly repaired. : verity-fec: 7:1: FEC 0: corrected 4 errors
Signed-off-by: Milan Broz gmazyland@gmail.com Fixes: 8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Milan Broz gmazyland@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-verity-fec.c | 39 ++++++++++++++++++++++++++------------- 1 file changed, 26 insertions(+), 13 deletions(-)
--- a/drivers/md/dm-verity-fec.c +++ b/drivers/md/dm-verity-fec.c @@ -60,14 +60,19 @@ static int fec_decode_rs8(struct dm_veri * to the data block. Caller is responsible for releasing buf. */ static u8 *fec_read_parity(struct dm_verity *v, u64 rsb, int index, - unsigned int *offset, struct dm_buffer **buf) + unsigned int *offset, unsigned int par_buf_offset, + struct dm_buffer **buf) { u64 position, block, rem; u8 *res;
+ /* We have already part of parity bytes read, skip to the next block */ + if (par_buf_offset) + index++; + position = (index + rsb) * v->fec->roots; block = div64_u64_rem(position, v->fec->io_size, &rem); - *offset = (unsigned int)rem; + *offset = par_buf_offset ? 0 : (unsigned int)rem;
res = dm_bufio_read(v->fec->bufio, block, buf); if (IS_ERR(res)) { @@ -127,10 +132,11 @@ static int fec_decode_bufs(struct dm_ver { int r, corrected = 0, res; struct dm_buffer *buf; - unsigned int n, i, offset; - u8 *par, *block; + unsigned int n, i, offset, par_buf_offset = 0; + u8 *par, *block, par_buf[DM_VERITY_FEC_RSM - DM_VERITY_FEC_MIN_RSN];
- par = fec_read_parity(v, rsb, block_offset, &offset, &buf); + par = fec_read_parity(v, rsb, block_offset, &offset, + par_buf_offset, &buf); if (IS_ERR(par)) return PTR_ERR(par);
@@ -140,7 +146,8 @@ static int fec_decode_bufs(struct dm_ver */ fec_for_each_buffer_rs_block(fio, n, i) { block = fec_buffer_rs_block(v, fio, n, i); - res = fec_decode_rs8(v, fio, block, &par[offset], neras); + memcpy(&par_buf[par_buf_offset], &par[offset], v->fec->roots - par_buf_offset); + res = fec_decode_rs8(v, fio, block, par_buf, neras); if (res < 0) { r = res; goto error; @@ -153,12 +160,21 @@ static int fec_decode_bufs(struct dm_ver if (block_offset >= 1 << v->data_dev_block_bits) goto done;
- /* read the next block when we run out of parity bytes */ - offset += v->fec->roots; + /* Read the next block when we run out of parity bytes */ + offset += (v->fec->roots - par_buf_offset); + /* Check if parity bytes are split between blocks */ + if (offset < v->fec->io_size && (offset + v->fec->roots) > v->fec->io_size) { + par_buf_offset = v->fec->io_size - offset; + memcpy(par_buf, &par[offset], par_buf_offset); + offset += par_buf_offset; + } else + par_buf_offset = 0; + if (offset >= v->fec->io_size) { dm_bufio_release(buf);
- par = fec_read_parity(v, rsb, block_offset, &offset, &buf); + par = fec_read_parity(v, rsb, block_offset, &offset, + par_buf_offset, &buf); if (IS_ERR(par)) return PTR_ERR(par); } @@ -743,10 +759,7 @@ int verity_fec_ctr(struct dm_verity *v) return -E2BIG; }
- if ((f->roots << SECTOR_SHIFT) & ((1 << v->data_dev_block_bits) - 1)) - f->io_size = 1 << v->data_dev_block_bits; - else - f->io_size = v->fec->roots << SECTOR_SHIFT; + f->io_size = 1 << v->data_dev_block_bits;
f->bufio = dm_bufio_client_create(f->dev->bdev, f->io_size,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chukun Pan amadeus@jmu.edu.cn
commit c1947d244f807b1f95605b75a4059e7b37b5dcc3 upstream.
It looks like SRM815 shares ID with SRM825L.
T: Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2dee ProdID=4d22 Rev= 4.14 S: Manufacturer=MEIG S: Product=LTE-A Module S: SerialNumber=123456 C:* #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Signed-off-by: Chukun Pan amadeus@jmu.edu.cn Link: https://lore.kernel.org/lkml/20241215100027.1970930-1-amadeus@jmu.edu.cn/ Link: https://lore.kernel.org/all/4333b4d0-281f-439d-9944-5570cbc4971d@gmail.com/ Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -621,7 +621,7 @@ static void option_instat_callback(struc
/* MeiG Smart Technology products */ #define MEIGSMART_VENDOR_ID 0x2dee -/* MeiG Smart SRM825L based on Qualcomm 315 */ +/* MeiG Smart SRM815/SRM825L based on Qualcomm 315 */ #define MEIGSMART_PRODUCT_SRM825L 0x4d22 /* MeiG Smart SLM320 based on UNISOC UIS8910 */ #define MEIGSMART_PRODUCT_SLM320 0x4d41 @@ -2405,6 +2405,7 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, LUAT_PRODUCT_AIR720U, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SLM320, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SLM770A, 0xff, 0, 0) }, + { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0, 0) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x40) }, { USB_DEVICE_AND_INTERFACE_INFO(MEIGSMART_VENDOR_ID, MEIGSMART_PRODUCT_SRM825L, 0xff, 0xff, 0x60) },
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Hrusecky michal.hrusecky@turris.com
commit f5b435be70cb126866fa92ffc6f89cda9e112c75 upstream.
Update the USB serial option driver to support Neoway N723-EA.
ID 2949:8700 Marvell Mobile Composite Device Bus
T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2949 ProdID=8700 Rev= 1.00 S: Manufacturer=Marvell S: Product=Mobile Composite Device Bus S: SerialNumber=200806006809080000 C:* #Ifs= 5 Cfg#= 1 Atr=c0 MxPwr=500mA A: FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=03 I:* If#= 0 Alt= 0 #EPs= 1 Cls=e0(wlcon) Sub=01 Prot=03 Driver=rndis_host E: Ad=87(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0c(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=89(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0b(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=86(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0e(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms I:* If#= 6 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=88(I) Atr=03(Int.) MxPS= 64 Ivl=4096ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Tested successfully connecting to the Internet via rndis interface after dialing via AT commands on If#=4 or If#=6.
Not sure of the purpose of the other serial interface.
Signed-off-by: Michal Hrusecky michal.hrusecky@turris.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -2413,6 +2413,7 @@ static const struct usb_device_id option .driver_info = NCTRL(1) }, { USB_DEVICE_INTERFACE_CLASS(0x1bbb, 0x0640, 0xff), /* TCL IK512 ECM */ .driver_info = NCTRL(3) }, + { USB_DEVICE_INTERFACE_CLASS(0x2949, 0x8700, 0xff) }, /* Neoway N723-EA */ { } /* Terminating entry */ }; MODULE_DEVICE_TABLE(usb, option_ids);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zicheng Qu quzicheng@huawei.com
commit c0599762f0c7e260b99c6b7bceb8eae69b804c94 upstream.
User Perspective: When a user sets the phase value, the ad9834_write_phase() is called. The phase register has a 12-bit resolution, so the valid range is 0 to 4095. If the phase offset value of 4096 is input, it effectively exactly equals 0 in the lower 12 bits, meaning no offset.
Reasons for the Change: 1) Original Condition (phase > BIT(AD9834_PHASE_BITS)): This condition allows a phase value equal to 2^12, which is 4096. However, this value exceeds the valid 12-bit range, as the maximum valid phase value should be 4095. 2) Modified Condition (phase >= BIT(AD9834_PHASE_BITS)): Ensures that the phase value is within the valid range, preventing invalid datafrom being written.
Impact on Subsequent Logic: st->data = cpu_to_be16(addr | phase): If the phase value is 2^12, i.e., 4096 (0001 0000 0000 0000), and addr is AD9834_REG_PHASE0 (1100 0000 0000 0000), then addr | phase results in 1101 0000 0000 0000, occupying DB12. According to the section of WRITING TO A PHASE REGISTER in the datasheet, the MSB 12 PHASE0 bits should be DB11. The original condition leads to incorrect DB12 usage, which contradicts the datasheet and could pose potential issues for future updates if DB12 is used in such related cases.
Fixes: 12b9d5bf76bf ("Staging: IIO: DDS: AD9833 / AD9834 driver") Cc: stable@vger.kernel.org Signed-off-by: Zicheng Qu quzicheng@huawei.com Reviewed-by: Dan Carpenter dan.carpenter@linaro.org Link: https://patch.msgid.link/20241107011015.2472600-2-quzicheng@huawei.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/iio/frequency/ad9834.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/iio/frequency/ad9834.c +++ b/drivers/staging/iio/frequency/ad9834.c @@ -131,7 +131,7 @@ static int ad9834_write_frequency(struct static int ad9834_write_phase(struct ad9834_state *st, unsigned long addr, unsigned long phase) { - if (phase > BIT(AD9834_PHASE_BITS)) + if (phase >= BIT(AD9834_PHASE_BITS)) return -EINVAL; st->data = cpu_to_be16(addr | phase);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zicheng Qu quzicheng@huawei.com
commit 4636e859ebe0011f41e35fa79bab585b8004e9a3 upstream.
User Perspective: When a user sets the phase value, the ad9832_write_phase() is called. The phase register has a 12-bit resolution, so the valid range is 0 to 4095. If the phase offset value of 4096 is input, it effectively exactly equals 0 in the lower 12 bits, meaning no offset.
Reasons for the Change: 1) Original Condition (phase > BIT(AD9832_PHASE_BITS)): This condition allows a phase value equal to 2^12, which is 4096. However, this value exceeds the valid 12-bit range, as the maximum valid phase value should be 4095. 2) Modified Condition (phase >= BIT(AD9832_PHASE_BITS)): Ensures that the phase value is within the valid range, preventing invalid datafrom being written.
Impact on Subsequent Logic: st->data = cpu_to_be16(addr | phase): If the phase value is 2^12, i.e., 4096 (0001 0000 0000 0000), and addr is AD9832_REG_PHASE0 (1100 0000 0000 0000), then addr | phase results in 1101 0000 0000 0000, occupying DB12. According to the section of WRITING TO A PHASE REGISTER in the datasheet, the MSB 12 PHASE0 bits should be DB11. The original condition leads to incorrect DB12 usage, which contradicts the datasheet and could pose potential issues for future updates if DB12 is used in such related cases.
Fixes: ea707584bac1 ("Staging: IIO: DDS: AD9832 / AD9835 driver") Cc: stable@vger.kernel.org Signed-off-by: Zicheng Qu quzicheng@huawei.com Link: https://patch.msgid.link/20241107011015.2472600-3-quzicheng@huawei.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/iio/frequency/ad9832.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/iio/frequency/ad9832.c +++ b/drivers/staging/iio/frequency/ad9832.c @@ -158,7 +158,7 @@ static int ad9832_write_frequency(struct static int ad9832_write_phase(struct ad9832_state *st, unsigned long addr, unsigned long phase) { - if (phase > BIT(AD9832_PHASE_BITS)) + if (phase >= BIT(AD9832_PHASE_BITS)) return -EINVAL;
st->phase_data[0] = cpu_to_be16((AD9832_CMD_PHA8BITSW << CMD_SHIFT) |
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lubomir Rintel lrintel@redhat.com
commit cdef30e0774802df2f87024d68a9d86c3b99ca2a upstream.
This fixes data corruption when accessing the internal SD card in mass storage mode.
I am actually not too sure why. I didn't figure a straightforward way to reproduce the issue, but i seem to get garbage when issuing a lot (over 50) of large reads (over 120 sectors) are done in a quick succession. That is, time seems to matter here -- larger reads are fine if they are done with some delay between them.
But I'm not great at understanding this sort of things, so I'll assume the issue other, smarter, folks were seeing with similar phones is the same problem and I'll just put my quirk next to theirs.
The "Software details" screen on the phone is as follows:
V 04.06 07-08-13 RM-849 (c) Nokia
TL;DR version of the device descriptor:
idVendor 0x0421 Nokia Mobile Phones idProduct 0x06c2 bcdDevice 4.06 iManufacturer 1 Nokia iProduct 2 Nokia 208
The patch assumes older firmwares are broken too (I'm unable to test, but no biggie if they aren't I guess), and I have no idea if newer firmware exists.
Signed-off-by: Lubomir Rintel lkundrak@v3.sk Cc: stable stable@kernel.org Acked-by: Alan Stern stern@rowland.harvard.edu Link: https://lore.kernel.org/r/20250101212206.2386207-1-lkundrak@v3.sk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/storage/unusual_devs.h | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/storage/unusual_devs.h +++ b/drivers/usb/storage/unusual_devs.h @@ -255,6 +255,13 @@ UNUSUAL_DEV( 0x0421, 0x06aa, 0x1110, 0x USB_SC_DEVICE, USB_PR_DEVICE, NULL, US_FL_MAX_SECTORS_64 ),
+/* Added by Lubomir Rintel lkundrak@v3.sk, a very fine chap */ +UNUSUAL_DEV( 0x0421, 0x06c2, 0x0000, 0x0406, + "Nokia", + "Nokia 208", + USB_SC_DEVICE, USB_PR_DEVICE, NULL, + US_FL_MAX_SECTORS_64 ), + #ifdef NO_SDDR09 UNUSUAL_DEV( 0x0436, 0x0005, 0x0100, 0x0100, "Microtech",
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 854eee93bd6e3dca619d47087af4d65b2045828e upstream.
Phoenix Contact sells UPS Quint devices [1] with a custom datacable [2] that embeds a Silicon Labs converter:
Bus 001 Device 003: ID 1b93:1013 Silicon Labs Phoenix Contact UPS Device Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass 0 bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 64 idVendor 0x1b93 idProduct 0x1013 bcdDevice 1.00 iManufacturer 1 Silicon Labs iProduct 2 Phoenix Contact UPS Device iSerial 3 <redacted> bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 0x0020 bNumInterfaces 1 bConfigurationValue 1 iConfiguration 0 bmAttributes 0x80 (Bus Powered) MaxPower 100mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 0 bInterfaceProtocol 0 iInterface 2 Phoenix Contact UPS Device Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x01 EP 1 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0
[1] https://www.phoenixcontact.com/en-pc/products/power-supply-unit-quint-ps-1ac... [2] https://www.phoenixcontact.com/en-il/products/data-cable-preassembled-ifs-us...
Reported-by: Giuseppe Corbelli giuseppe.corbelli@antaresvision.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/cp210x.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -223,6 +223,7 @@ static const struct usb_device_id id_tab { USB_DEVICE(0x19CF, 0x3000) }, /* Parrot NMEA GPS Flight Recorder */ { USB_DEVICE(0x1ADB, 0x0001) }, /* Schweitzer Engineering C662 Cable */ { USB_DEVICE(0x1B1C, 0x1C00) }, /* Corsair USB Dongle */ + { USB_DEVICE(0x1B93, 0x1013) }, /* Phoenix Contact UPS Device */ { USB_DEVICE(0x1BA4, 0x0002) }, /* Silicon Labs 358x factory default */ { USB_DEVICE(0x1BE3, 0x07A6) }, /* WAGO 750-923 USB Service Cable */ { USB_DEVICE(0x1D6F, 0x0010) }, /* Seluxit ApS RF Dongle */
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: André Draszik andre.draszik@linaro.org
commit 01ea6bf5cb58b20cc1bd159f0cf74a76cf04bb69 upstream.
Before writing a new value to the register, the old value needs to be masked out for the new value to be programmed as intended, because at least in some cases the reset value of that field is 0xf (max value).
At the moment, the dwc3 core initialises the threshold to the maximum value (0xf), with the option to override it via a DT. No upstream DTs seem to override it, therefore this commit doesn't change behaviour for any upstream platform. Nevertheless, the code should be fixed to have the desired outcome.
Do so.
Fixes: 80caf7d21adc ("usb: dwc3: add lpm erratum support") Cc: stable@vger.kernel.org # 5.10+ (needs adjustment for 5.4) Signed-off-by: André Draszik andre.draszik@linaro.org Acked-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://lore.kernel.org/r/20241209-dwc3-nyet-fix-v2-1-02755683345b@linaro.or... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/core.h | 1 + drivers/usb/dwc3/gadget.c | 4 +++- 2 files changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/usb/dwc3/core.h +++ b/drivers/usb/dwc3/core.h @@ -451,6 +451,7 @@ #define DWC3_DCTL_TRGTULST_SS_INACT (DWC3_DCTL_TRGTULST(6))
/* These apply for core versions 1.94a and later */ +#define DWC3_DCTL_NYET_THRES_MASK (0xf << 20) #define DWC3_DCTL_NYET_THRES(n) (((n) & 0xf) << 20)
#define DWC3_DCTL_KEEP_CONNECT BIT(19) --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -4208,8 +4208,10 @@ static void dwc3_gadget_conndone_interru WARN_ONCE(DWC3_VER_IS_PRIOR(DWC3, 240A) && dwc->has_lpm_erratum, "LPM Erratum not available on dwc3 revisions < 2.40a\n");
- if (dwc->has_lpm_erratum && !DWC3_VER_IS_PRIOR(DWC3, 240A)) + if (dwc->has_lpm_erratum && !DWC3_VER_IS_PRIOR(DWC3, 240A)) { + reg &= ~DWC3_DCTL_NYET_THRES_MASK; reg |= DWC3_DCTL_NYET_THRES(dwc->lpm_nyet_threshold); + }
dwc3_gadget_dctl_write_safe(dwc, reg); } else {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Li Huafei lihuafei1@huawei.com
commit cbd399f78e23ad4492c174fc5e6b3676dba74a52 upstream.
During fuzz testing, the following warning was discovered:
different return values (15 and 11) from vsnprintf("%*pbl ", ...)
test:keyward is WARNING in kvasprintf WARNING: CPU: 55 PID: 1168477 at lib/kasprintf.c:30 kvasprintf+0x121/0x130 Call Trace: kvasprintf+0x121/0x130 kasprintf+0xa6/0xe0 bitmap_print_to_buf+0x89/0x100 core_siblings_list_read+0x7e/0xb0 kernfs_file_read_iter+0x15b/0x270 new_sync_read+0x153/0x260 vfs_read+0x215/0x290 ksys_read+0xb9/0x160 do_syscall_64+0x56/0x100 entry_SYSCALL_64_after_hwframe+0x78/0xe2
The call trace shows that kvasprintf() reported this warning during the printing of core_siblings_list. kvasprintf() has several steps:
(1) First, calculate the length of the resulting formatted string.
(2) Allocate a buffer based on the returned length.
(3) Then, perform the actual string formatting.
(4) Check whether the lengths of the formatted strings returned in steps (1) and (2) are consistent.
If the core_cpumask is modified between steps (1) and (3), the lengths obtained in these two steps may not match. Indeed our test includes cpu hotplugging, which should modify core_cpumask while printing.
To fix this issue, cache the cpumask into a temporary variable before calling cpumap_print_{list, cpumask}_to_buf(), to keep it unchanged during the printing process.
Fixes: bb9ec13d156e ("topology: use bin_attribute to break the size limitation of cpumap ABI") Cc: stable stable@kernel.org Signed-off-by: Li Huafei lihuafei1@huawei.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Link: https://lore.kernel.org/r/20241114110141.94725-1-lihuafei1@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/topology.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-)
--- a/drivers/base/topology.c +++ b/drivers/base/topology.c @@ -27,9 +27,17 @@ static ssize_t name##_read(struct file * loff_t off, size_t count) \ { \ struct device *dev = kobj_to_dev(kobj); \ + cpumask_var_t mask; \ + ssize_t n; \ \ - return cpumap_print_bitmask_to_buf(buf, topology_##mask(dev->id), \ - off, count); \ + if (!alloc_cpumask_var(&mask, GFP_KERNEL)) \ + return -ENOMEM; \ + \ + cpumask_copy(mask, topology_##mask(dev->id)); \ + n = cpumap_print_bitmask_to_buf(buf, mask, off, count); \ + free_cpumask_var(mask); \ + \ + return n; \ } \ \ static ssize_t name##_list_read(struct file *file, struct kobject *kobj, \ @@ -37,9 +45,17 @@ static ssize_t name##_list_read(struct f loff_t off, size_t count) \ { \ struct device *dev = kobj_to_dev(kobj); \ + cpumask_var_t mask; \ + ssize_t n; \ + \ + if (!alloc_cpumask_var(&mask, GFP_KERNEL)) \ + return -ENOMEM; \ + \ + cpumask_copy(mask, topology_##mask(dev->id)); \ + n = cpumap_print_list_to_buf(buf, mask, off, count); \ + free_cpumask_var(mask); \ \ - return cpumap_print_list_to_buf(buf, topology_##mask(dev->id), \ - off, count); \ + return n; \ }
define_id_show_func(physical_package_id, "%d");
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rengarajan S rengarajan.s@microchip.com
commit 194f9f94a5169547d682e9bbcc5ae6d18a564735 upstream.
Resolve kernel panic caused by improper handling of IRQs while accessing GPIO values. This is done by replacing generic_handle_irq with handle_nested_irq.
Fixes: 1f4d8ae231f4 ("misc: microchip: pci1xxxx: Add gpio irq handler and irq helper functions irq_ack, irq_mask, irq_unmask and irq_set_type of irq_chip.") Cc: stable stable@kernel.org Signed-off-by: Rengarajan S rengarajan.s@microchip.com Link: https://lore.kernel.org/r/20241205133626.1483499-2-rengarajan.s@microchip.co... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gpio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gpio.c +++ b/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gpio.c @@ -277,7 +277,7 @@ static irqreturn_t pci1xxxx_gpio_irq_han writel(BIT(bit), priv->reg_base + INTR_STATUS_OFFSET(gpiobank)); spin_unlock_irqrestore(&priv->lock, flags); irq = irq_find_mapping(gc->irq.domain, (bit + (gpiobank * 32))); - generic_handle_irq(irq); + handle_nested_irq(irq); } } spin_lock_irqsave(&priv->lock, flags);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rengarajan S rengarajan.s@microchip.com
commit c7a5378a0f707686de3ddb489f1653c523bb7dcc upstream.
Driver returns -EOPNOTSUPPORTED on unsupported parameters case in set config. Upper level driver checks for -ENOTSUPP. Because of the return code mismatch, the ioctls from userspace fail. Resolve the issue by passing -ENOTSUPP during unsupported case.
Fixes: 7d3e4d807df2 ("misc: microchip: pci1xxxx: load gpio driver for the gpio controller auxiliary device enumerated by the auxiliary bus driver.") Cc: stable stable@kernel.org Signed-off-by: Rengarajan S rengarajan.s@microchip.com Link: https://lore.kernel.org/r/20241205133626.1483499-3-rengarajan.s@microchip.co... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gpio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gpio.c +++ b/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gpio.c @@ -148,7 +148,7 @@ static int pci1xxxx_gpio_set_config(stru pci1xxx_assign_bit(priv->reg_base, OPENDRAIN_OFFSET(offset), (offset % 32), true); break; default: - ret = -EOPNOTSUPP; + ret = -ENOTSUPP; break; } spin_unlock_irqrestore(&priv->lock, flags);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
commit ed2761958ad77e54791802b07095786150eab844 upstream.
The commit f9b11229b79c ("serial: 8250: Fix PM usage_count for console handover") fixed one runtime PM usage counter balance problem that occurs because .dev is not set during univ8250 setup preventing call to pm_runtime_get_sync(). Later, univ8250_console_exit() will trigger the runtime PM usage counter underflow as .dev is already set at that time.
Call pm_runtime_get_sync() to balance the RPM usage counter also in serial8250_register_8250_port() before trying to add the port.
Reported-by: Borislav Petkov (AMD) bp@alien8.de Fixes: bedb404e91bb ("serial: 8250_port: Don't use power management for kernel console") Cc: stable stable@kernel.org Tested-by: Borislav Petkov (AMD) bp@alien8.de Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Link: https://lore.kernel.org/r/20241210170120.2231-1-ilpo.jarvinen@linux.intel.co... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/serial/8250/8250_core.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/tty/serial/8250/8250_core.c +++ b/drivers/tty/serial/8250/8250_core.c @@ -1131,6 +1131,9 @@ int serial8250_register_8250_port(const uart->dl_write = up->dl_write;
if (uart->port.type != PORT_8250_CIR) { + if (uart_console_registered(&uart->port)) + pm_runtime_get_sync(uart->port.dev); + if (serial8250_isa_config != NULL) serial8250_isa_config(0, &uart->port, &uart->capabilities);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lianqin Hu hulianqin@vivo.com
commit 13014969cbf07f18d62ceea40bd8ca8ec9d36cec upstream.
Considering that in some extreme cases, when performing the unbinding operation, gserial_disconnect has cleared gser->ioport, which triggers gadget reconfiguration, and then calls gs_read_complete, resulting in access to a null pointer. Therefore, ep is disabled before gserial_disconnect sets port to null to prevent this from happening.
Call trace: gs_read_complete+0x58/0x240 usb_gadget_giveback_request+0x40/0x160 dwc3_remove_requests+0x170/0x484 dwc3_ep0_out_start+0xb0/0x1d4 __dwc3_gadget_start+0x25c/0x720 kretprobe_trampoline.cfi_jt+0x0/0x8 kretprobe_trampoline.cfi_jt+0x0/0x8 udc_bind_to_driver+0x1d8/0x300 usb_gadget_probe_driver+0xa8/0x1dc gadget_dev_desc_UDC_store+0x13c/0x188 configfs_write_iter+0x160/0x1f4 vfs_write+0x2d0/0x40c ksys_write+0x7c/0xf0 __arm64_sys_write+0x20/0x30 invoke_syscall+0x60/0x150 el0_svc_common+0x8c/0xf8 do_el0_svc+0x28/0xa0 el0_svc+0x24/0x84
Fixes: c1dca562be8a ("usb gadget: split out serial core") Cc: stable stable@kernel.org Suggested-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Lianqin Hu hulianqin@vivo.com Link: https://lore.kernel.org/r/TYUPR06MB621733B5AC690DBDF80A0DCCD2042@TYUPR06MB62... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/u_serial.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/usb/gadget/function/u_serial.c +++ b/drivers/usb/gadget/function/u_serial.c @@ -1398,6 +1398,10 @@ void gserial_disconnect(struct gserial * /* REVISIT as above: how best to track this? */ port->port_line_coding = gser->port_line_coding;
+ /* disable endpoints, aborting down any active I/O */ + usb_ep_disable(gser->out); + usb_ep_disable(gser->in); + port->port_usb = NULL; gser->ioport = NULL; if (port->port.count > 0) { @@ -1409,10 +1413,6 @@ void gserial_disconnect(struct gserial * spin_unlock(&port->port_lock); spin_unlock_irqrestore(&serial_port_lock, flags);
- /* disable endpoints, aborting down any active I/O */ - usb_ep_disable(gser->out); - usb_ep_disable(gser->in); - /* finally, free any unused/unusable I/O buffers */ spin_lock_irqsave(&port->port_lock, flags); if (port->port.count == 0)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rick Edgecombe rick.p.edgecombe@intel.com
commit a9d9c33132d49329ada647e4514d210d15e31d81 upstream.
The x86 shadow stack support has its own set of registers. Those registers are XSAVE-managed, but they are "supervisor state components" which means that userspace can not touch them with XSAVE/XRSTOR. It also means that they are not accessible from the existing ptrace ABI for XSAVE state. Thus, there is a new ptrace get/set interface for it.
The regset code that ptrace uses provides an ->active() handler in addition to the get/set ones. For shadow stack this ->active() handler verifies that shadow stack is enabled via the ARCH_SHSTK_SHSTK bit in the thread struct. The ->active() handler is checked from some call sites of the regset get/set handlers, but not the ptrace ones. This was not understood when shadow stack support was put in place.
As a result, both the set/get handlers can be called with XFEATURE_CET_USER in its init state, which would cause get_xsave_addr() to return NULL and trigger a WARN_ON(). The ssp_set() handler luckily has an ssp_active() check to avoid surprising the kernel with shadow stack behavior when the kernel is not ready for it (ARCH_SHSTK_SHSTK==0). That check just happened to avoid the warning.
But the ->get() side wasn't so lucky. It can be called with shadow stacks disabled, triggering the warning in practice, as reported by Christina Schimpe:
WARNING: CPU: 5 PID: 1773 at arch/x86/kernel/fpu/regset.c:198 ssp_get+0x89/0xa0 [...] Call Trace: <TASK> ? show_regs+0x6e/0x80 ? ssp_get+0x89/0xa0 ? __warn+0x91/0x150 ? ssp_get+0x89/0xa0 ? report_bug+0x19d/0x1b0 ? handle_bug+0x46/0x80 ? exc_invalid_op+0x1d/0x80 ? asm_exc_invalid_op+0x1f/0x30 ? __pfx_ssp_get+0x10/0x10 ? ssp_get+0x89/0xa0 ? ssp_get+0x52/0xa0 __regset_get+0xad/0xf0 copy_regset_to_user+0x52/0xc0 ptrace_regset+0x119/0x140 ptrace_request+0x13c/0x850 ? wait_task_inactive+0x142/0x1d0 ? do_syscall_64+0x6d/0x90 arch_ptrace+0x102/0x300 [...]
Ensure that shadow stacks are active in a thread before looking them up in the XSAVE buffer. Since ARCH_SHSTK_SHSTK and user_ssp[SHSTK_EN] are set at the same time, the active check ensures that there will be something to find in the XSAVE buffer.
[ dhansen: changelog/subject tweaks ]
Fixes: 2fab02b25ae7 ("x86: Add PTRACE interface for shadow stack") Reported-by: Christina Schimpe christina.schimpe@intel.com Signed-off-by: Rick Edgecombe rick.p.edgecombe@intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Tested-by: Christina Schimpe christina.schimpe@intel.com Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20250107233056.235536-1-rick.p.edgecombe%40intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/fpu/regset.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c index 6bc1eb2a21bd..887b0b8e21e3 100644 --- a/arch/x86/kernel/fpu/regset.c +++ b/arch/x86/kernel/fpu/regset.c @@ -190,7 +190,8 @@ int ssp_get(struct task_struct *target, const struct user_regset *regset, struct fpu *fpu = &target->thread.fpu; struct cet_user_state *cetregs;
- if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK)) + if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK) || + !ssp_active(target, regset)) return -ENODEV;
sync_fpstate(fpu);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Prashanth K quic_prashk@quicinc.com
commit 625e70ccb7bbbb2cc912e23c63390946170c085c upstream.
Runtime PM documentation (Section 5) mentions, during remove() callbacks, drivers should undo the runtime PM changes done in probe(). Usually this means calling pm_runtime_disable(), pm_runtime_dont_use_autosuspend() etc. Hence add missing function to disable autosuspend on dwc3-am62 driver unbind.
Fixes: e8784c0aec03 ("drivers: usb: dwc3: Add AM62 USB wrapper driver") Cc: stable stable@kernel.org Signed-off-by: Prashanth K quic_prashk@quicinc.com Acked-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://lore.kernel.org/r/20241209105728.3216872-1-quic_prashk@quicinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/dwc3-am62.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/dwc3/dwc3-am62.c +++ b/drivers/usb/dwc3/dwc3-am62.c @@ -284,6 +284,7 @@ static void dwc3_ti_remove(struct platfo
pm_runtime_put_sync(dev); pm_runtime_disable(dev); + pm_runtime_dont_use_autosuspend(dev); pm_runtime_set_suspended(dev); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jun Yan jerrysteve1101@gmail.com
commit 7a3d76a0b60b3f6fc3375e4de2174bab43f64545 upstream.
Fix the regression introduced by commit d8c6edfa3f4e ("USB: usblp: don't call usb_set_interface if there's a single alt"), which causes that unsupported protocols can also be set via ioctl when the num_altsetting of the device is 1.
Move the check for protocol support to the earlier stage.
Fixes: d8c6edfa3f4e ("USB: usblp: don't call usb_set_interface if there's a single alt") Cc: stable stable@kernel.org Signed-off-by: Jun Yan jerrysteve1101@gmail.com Link: https://lore.kernel.org/r/20241212143852.671889-1-jerrysteve1101@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/class/usblp.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/usb/class/usblp.c +++ b/drivers/usb/class/usblp.c @@ -1337,11 +1337,12 @@ static int usblp_set_protocol(struct usb if (protocol < USBLP_FIRST_PROTOCOL || protocol > USBLP_LAST_PROTOCOL) return -EINVAL;
+ alts = usblp->protocol[protocol].alt_setting; + if (alts < 0) + return -EINVAL; + /* Don't unnecessarily set the interface if there's a single alt. */ if (usblp->intf->num_altsetting > 1) { - alts = usblp->protocol[protocol].alt_setting; - if (alts < 0) - return -EINVAL; r = usb_set_interface(usblp->dev, usblp->ifnum, alts); if (r < 0) { printk(KERN_ERR "usblp: can't set desired altsetting %d on interface %d\n",
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kai-Heng Feng kaihengf@nvidia.com
commit 59bfeaf5454b7e764288d84802577f4a99bf0819 upstream.
There's USB error when tegra board is shutting down: [ 180.919315] usb 2-3: Failed to set U1 timeout to 0x0,error code -113 [ 180.919995] usb 2-3: Failed to set U1 timeout to 0xa,error code -113 [ 180.920512] usb 2-3: Failed to set U2 timeout to 0x4,error code -113 [ 186.157172] tegra-xusb 3610000.usb: xHCI host controller not responding, assume dead [ 186.157858] tegra-xusb 3610000.usb: HC died; cleaning up [ 186.317280] tegra-xusb 3610000.usb: Timeout while waiting for evaluate context command
The issue is caused by disabling LPM on already suspended ports.
For USB2 LPM, the LPM is already disabled during port suspend. For USB3 LPM, port won't transit to U1/U2 when it's already suspended in U3, hence disabling LPM is only needed for ports that are not suspended.
Cc: Wayne Chang waynec@nvidia.com Cc: stable stable@kernel.org Fixes: d920a2ed8620 ("usb: Disable USB3 LPM at shutdown") Signed-off-by: Kai-Heng Feng kaihengf@nvidia.com Acked-by: Alan Stern stern@rowland.harvard.edu Tested-by: Jon Hunter jonathanh@nvidia.com Link: https://lore.kernel.org/r/20241206074817.89189-1-kaihengf@nvidia.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/core/port.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/usb/core/port.c +++ b/drivers/usb/core/port.c @@ -451,10 +451,11 @@ static int usb_port_runtime_suspend(stru static void usb_port_shutdown(struct device *dev) { struct usb_port *port_dev = to_usb_port(dev); + struct usb_device *udev = port_dev->child;
- if (port_dev->child) { - usb_disable_usb2_hardware_lpm(port_dev->child); - usb_unlocked_disable_lpm(port_dev->child); + if (udev && !udev->port_is_suspended) { + usb_disable_usb2_hardware_lpm(udev); + usb_unlocked_disable_lpm(udev); } }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make_ruc2021@163.com
commit 0df11fa8cee5a9cf8753d4e2672bb3667138c652 upstream.
When device_add(&udev->dev) succeeds and a later call fails, usb_new_device() does not properly call device_del(). As comment of device_add() says, 'if device_add() succeeds, you should call device_del() when you want to get rid of it. If device_add() has not succeeded, use only put_device() to drop the reference count'.
Found by code review.
Cc: stable stable@kernel.org Fixes: 9f8b17e643fe ("USB: make usbdevices export their device nodes instead of using a separate class") Signed-off-by: Ma Ke make_ruc2021@163.com Reviewed-by: Alan Stern stern@rowland.harvard.edu Link: https://lore.kernel.org/r/20241218071346.2973980-1-make_ruc2021@163.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/core/hub.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -2633,13 +2633,13 @@ int usb_new_device(struct usb_device *ud err = sysfs_create_link(&udev->dev.kobj, &port_dev->dev.kobj, "port"); if (err) - goto fail; + goto out_del_dev;
err = sysfs_create_link(&port_dev->dev.kobj, &udev->dev.kobj, "device"); if (err) { sysfs_remove_link(&udev->dev.kobj, "port"); - goto fail; + goto out_del_dev; }
if (!test_and_set_bit(port1, hub->child_usage_bits)) @@ -2651,6 +2651,8 @@ int usb_new_device(struct usb_device *ud pm_runtime_put_sync_autosuspend(&udev->dev); return err;
+out_del_dev: + device_del(&udev->dev); fail: usb_set_device_state(udev, USB_STATE_NOTATTACHED); pm_runtime_disable(&udev->dev);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit 6f660ffce7c938f2a5d8473c0e0b45e4fb25ef7f upstream.
We should do reverse selection of other components from CONFIG_USB_F_MIDI2 which is tristate, instead of CONFIG_USB_CONFIGFS_F_MIDI2 which is bool, for satisfying subtle module dependencies.
Fixes: 8b645922b223 ("usb: gadget: Add support for USB MIDI 2.0 function driver") Cc: stable stable@kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de Link: https://lore.kernel.org/r/20250101131124.27599-1-tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/Kconfig | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/gadget/Kconfig +++ b/drivers/usb/gadget/Kconfig @@ -210,6 +210,8 @@ config USB_F_MIDI
config USB_F_MIDI2 tristate + select SND_UMP + select SND_UMP_LEGACY_RAWMIDI
config USB_F_HID tristate @@ -444,8 +446,6 @@ config USB_CONFIGFS_F_MIDI2 depends on USB_CONFIGFS depends on SND select USB_LIBCOMPOSITE - select SND_UMP - select SND_UMP_LEGACY_RAWMIDI select USB_F_MIDI2 help The MIDI 2.0 function driver provides the generic emulated
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
commit 74adad500346fb07d69af2c79acbff4adb061134 upstream.
Current implementation of ci_hdrc_imx_driver does not decrement the refcount of the device obtained in usbmisc_get_init_data(). Add a put_device() call in .remove() and in .probe() before returning an error.
This bug was found by an experimental static analysis tool that I am developing.
Cc: stable stable@kernel.org Fixes: f40017e0f332 ("chipidea: usbmisc_imx: Add USB support for VF610 SoCs") Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Acked-by: Peter Chen peter.chen@kernel.org Link: https://lore.kernel.org/r/20241216015539.352579-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/chipidea/ci_hdrc_imx.c | 25 +++++++++++++++++-------- 1 file changed, 17 insertions(+), 8 deletions(-)
--- a/drivers/usb/chipidea/ci_hdrc_imx.c +++ b/drivers/usb/chipidea/ci_hdrc_imx.c @@ -362,25 +362,29 @@ static int ci_hdrc_imx_probe(struct plat data->pinctrl = devm_pinctrl_get(dev); if (PTR_ERR(data->pinctrl) == -ENODEV) data->pinctrl = NULL; - else if (IS_ERR(data->pinctrl)) - return dev_err_probe(dev, PTR_ERR(data->pinctrl), + else if (IS_ERR(data->pinctrl)) { + ret = dev_err_probe(dev, PTR_ERR(data->pinctrl), "pinctrl get failed\n"); + goto err_put; + }
data->hsic_pad_regulator = devm_regulator_get_optional(dev, "hsic"); if (PTR_ERR(data->hsic_pad_regulator) == -ENODEV) { /* no pad regulator is needed */ data->hsic_pad_regulator = NULL; - } else if (IS_ERR(data->hsic_pad_regulator)) - return dev_err_probe(dev, PTR_ERR(data->hsic_pad_regulator), + } else if (IS_ERR(data->hsic_pad_regulator)) { + ret = dev_err_probe(dev, PTR_ERR(data->hsic_pad_regulator), "Get HSIC pad regulator error\n"); + goto err_put; + }
if (data->hsic_pad_regulator) { ret = regulator_enable(data->hsic_pad_regulator); if (ret) { dev_err(dev, "Failed to enable HSIC pad regulator\n"); - return ret; + goto err_put; } } } @@ -394,13 +398,14 @@ static int ci_hdrc_imx_probe(struct plat dev_err(dev, "pinctrl_hsic_idle lookup failed, err=%ld\n", PTR_ERR(pinctrl_hsic_idle)); - return PTR_ERR(pinctrl_hsic_idle); + ret = PTR_ERR(pinctrl_hsic_idle); + goto err_put; }
ret = pinctrl_select_state(data->pinctrl, pinctrl_hsic_idle); if (ret) { dev_err(dev, "hsic_idle select failed, err=%d\n", ret); - return ret; + goto err_put; }
data->pinctrl_hsic_active = pinctrl_lookup_state(data->pinctrl, @@ -409,7 +414,8 @@ static int ci_hdrc_imx_probe(struct plat dev_err(dev, "pinctrl_hsic_active lookup failed, err=%ld\n", PTR_ERR(data->pinctrl_hsic_active)); - return PTR_ERR(data->pinctrl_hsic_active); + ret = PTR_ERR(data->pinctrl_hsic_active); + goto err_put; } }
@@ -513,6 +519,8 @@ disable_hsic_regulator: if (pdata.flags & CI_HDRC_PMQOS) cpu_latency_qos_remove_request(&data->pm_qos_req); data->ci_pdev = NULL; +err_put: + put_device(data->usbmisc_data->dev); return ret; }
@@ -536,6 +544,7 @@ static void ci_hdrc_imx_remove(struct pl if (data->hsic_pad_regulator) regulator_disable(data->hsic_pad_regulator); } + put_device(data->usbmisc_data->dev); }
static void ci_hdrc_imx_shutdown(struct platform_device *pdev)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Prashanth K quic_prashk@quicinc.com
commit 057bd54dfcf68b1f67e6dfc32a47a72e12198495 upstream.
Currently afunc_bind sets std_ac_if_desc.bNumEndpoints to 1 if controls (mute/volume) are enabled. During next afunc_bind call, bNumEndpoints would be unchanged and incorrectly set to 1 even if the controls aren't enabled.
Fix this by resetting the value of bNumEndpoints to 0 on every afunc_bind call.
Fixes: eaf6cbe09920 ("usb: gadget: f_uac2: add volume and mute support") Cc: stable stable@kernel.org Signed-off-by: Prashanth K quic_prashk@quicinc.com Link: https://lore.kernel.org/r/20241211115915.159864-1-quic_prashk@quicinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/f_uac2.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/gadget/function/f_uac2.c +++ b/drivers/usb/gadget/function/f_uac2.c @@ -1176,6 +1176,7 @@ afunc_bind(struct usb_configuration *cfg uac2->as_in_alt = 0; }
+ std_ac_if_desc.bNumEndpoints = 0; if (FUOUT_EN(uac2_opts) || FUIN_EN(uac2_opts)) { uac2->int_ep = usb_ep_autoconfig(gadget, &fs_ep_int_desc); if (!uac2->int_ep) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
commit b9711ff7cde0cfbcdd44cb1fac55b6eec496e690 upstream.
If max_contaminant_read_adc_mv() fails, then return the error code. Don't return zero.
Fixes: 02b332a06397 ("usb: typec: maxim_contaminant: Implement check_contaminant callback") Cc: stable stable@kernel.org Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: André Draszik andre.draszik@linaro.org Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Link: https://lore.kernel.org/r/f1bf3768-419e-40dd-989c-f7f455d6c824@stanley.mount... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/tcpm/maxim_contaminant.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/typec/tcpm/maxim_contaminant.c +++ b/drivers/usb/typec/tcpm/maxim_contaminant.c @@ -137,7 +137,7 @@ static int max_contaminant_read_resistan
mv = max_contaminant_read_adc_mv(chip, channel, sleep_msec, raw, true); if (mv < 0) - return ret; + return mv;
/* OVP enable */ ret = regmap_update_bits(regmap, TCPC_VENDOR_CC_CTRL2, CCOVPDIS, 0); @@ -159,7 +159,7 @@ static int max_contaminant_read_resistan
mv = max_contaminant_read_adc_mv(chip, channel, sleep_msec, raw, true); if (mv < 0) - return ret; + return mv; /* Disable current source */ ret = regmap_update_bits(regmap, TCPC_VENDOR_CC_CTRL2, SBURPCTRL, 0); if (ret < 0)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Akash M akash.m5@samsung.com
commit dfc51e48bca475bbee984e90f33fdc537ce09699 upstream.
This commit addresses an issue related to below kernel panic where panic_on_warn is enabled. It is caused by the unnecessary use of WARN_ON in functionsfs_bind, which easily leads to the following scenarios.
1.adb_write in adbd 2. UDC write via configfs ================= =====================
->usb_ffs_open_thread() ->UDC write ->open_functionfs() ->configfs_write_iter() ->adb_open() ->gadget_dev_desc_UDC_store() ->adb_write() ->usb_gadget_register_driver_owner ->driver_register() ->StartMonitor() ->bus_add_driver() ->adb_read() ->gadget_bind_driver() <times-out without BIND event> ->configfs_composite_bind() ->usb_add_function() ->open_functionfs() ->ffs_func_bind() ->adb_open() ->functionfs_bind() <ffs->state !=FFS_ACTIVE>
The adb_open, adb_read, and adb_write operations are invoked from the daemon, but trying to bind the function is a process that is invoked by UDC write through configfs, which opens up the possibility of a race condition between the two paths. In this race scenario, the kernel panic occurs due to the WARN_ON from functionfs_bind when panic_on_warn is enabled. This commit fixes the kernel panic by removing the unnecessary WARN_ON.
Kernel panic - not syncing: kernel: panic_on_warn set ... [ 14.542395] Call trace: [ 14.542464] ffs_func_bind+0x1c8/0x14a8 [ 14.542468] usb_add_function+0xcc/0x1f0 [ 14.542473] configfs_composite_bind+0x468/0x588 [ 14.542478] gadget_bind_driver+0x108/0x27c [ 14.542483] really_probe+0x190/0x374 [ 14.542488] __driver_probe_device+0xa0/0x12c [ 14.542492] driver_probe_device+0x3c/0x220 [ 14.542498] __driver_attach+0x11c/0x1fc [ 14.542502] bus_for_each_dev+0x104/0x160 [ 14.542506] driver_attach+0x24/0x34 [ 14.542510] bus_add_driver+0x154/0x270 [ 14.542514] driver_register+0x68/0x104 [ 14.542518] usb_gadget_register_driver_owner+0x48/0xf4 [ 14.542523] gadget_dev_desc_UDC_store+0xf8/0x144 [ 14.542526] configfs_write_iter+0xf0/0x138
Fixes: ddf8abd25994 ("USB: f_fs: the FunctionFS driver") Cc: stable stable@kernel.org Signed-off-by: Akash M akash.m5@samsung.com Link: https://lore.kernel.org/r/20241219125221.1679-1-akash.m5@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/function/f_fs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/usb/gadget/function/f_fs.c +++ b/drivers/usb/gadget/function/f_fs.c @@ -1810,7 +1810,7 @@ static int functionfs_bind(struct ffs_da struct usb_gadget_strings **lang; int first_id;
- if (WARN_ON(ffs->state != FFS_ACTIVE + if ((ffs->state != FFS_ACTIVE || test_and_set_bit(FFS_FL_BOUND, &ffs->flags))) return -EBADFD;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ingo Rohloff ingo.rohloff@lauterbach.com
commit 9466545720e231fc02acd69b5f4e9138e09a26f6 upstream.
Since commit c033563220e0f7a8 ("usb: gadget: configfs: Attach arbitrary strings to cdev") a user can provide extra string descriptors to a USB gadget via configfs.
For "manufacturer", "product", "serialnumber", setting the string via configfs ignores a trailing LF.
For the arbitrary strings the LF was not ignored.
This patch ignores a trailing LF to make this consistent with the existing behavior for "manufacturer", ... string descriptors.
Fixes: c033563220e0 ("usb: gadget: configfs: Attach arbitrary strings to cdev") Cc: stable stable@kernel.org Signed-off-by: Ingo Rohloff ingo.rohloff@lauterbach.com Link: https://lore.kernel.org/r/20241212154114.29295-1-ingo.rohloff@lauterbach.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/configfs.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/usb/gadget/configfs.c +++ b/drivers/usb/gadget/configfs.c @@ -824,11 +824,15 @@ static ssize_t gadget_string_s_store(str { struct gadget_string *string = to_gadget_string(item); int size = min(sizeof(string->string), len + 1); + ssize_t cpy_len;
if (len > USB_MAX_STRING_LEN) return -EINVAL;
- return strscpy(string->string, page, size); + cpy_len = strscpy(string->string, page, size); + if (cpy_len > 0 && string->string[cpy_len - 1] == '\n') + string->string[cpy_len - 1] = 0; + return len; } CONFIGFS_ATTR(gadget_string_, s);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Javier Carrasco javier.carrasco.cruz@gmail.com
commit 6007d10c5262f6f71479627c1216899ea7f09073 upstream.
The 'sample' local struct is used to push data to user space from a triggered buffer, but it has a hole between the temperature and the timestamp (u32 pressure, u16 temperature, GAP, u64 timestamp). This hole is never initialized.
Initialize the struct to zero before using it to avoid pushing uninitialized information to userspace.
Cc: stable@vger.kernel.org Fixes: 03b262f2bbf4 ("iio:pressure: initial zpa2326 barometer support") Signed-off-by: Javier Carrasco javier.carrasco.cruz@gmail.com Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-3-0cb6e98d895c@gm... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/pressure/zpa2326.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/iio/pressure/zpa2326.c +++ b/drivers/iio/pressure/zpa2326.c @@ -586,6 +586,8 @@ static int zpa2326_fill_sample_buffer(st } sample; int err;
+ memset(&sample, 0, sizeof(sample)); + if (test_bit(0, indio_dev->active_scan_mask)) { /* Get current pressure from hardware FIFO. */ err = zpa2326_dequeue_pressure(indio_dev, &sample.pressure);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Javier Carrasco javier.carrasco.cruz@gmail.com
commit 333be433ee908a53f283beb95585dfc14c8ffb46 upstream.
The 'data' array is allocated via kmalloc() and it is used to push data to user space from a triggered buffer, but it does not set values for inactive channels, as it only uses iio_for_each_active_channel() to assign new values.
Use kzalloc for the memory allocation to avoid pushing uninitialized information to userspace.
Cc: stable@vger.kernel.org Fixes: 415f79244757 ("iio: Move IIO Dummy Driver out of staging") Signed-off-by: Javier Carrasco javier.carrasco.cruz@gmail.com Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-9-0cb6e98d895c@gm... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/dummy/iio_simple_dummy_buffer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/dummy/iio_simple_dummy_buffer.c +++ b/drivers/iio/dummy/iio_simple_dummy_buffer.c @@ -48,7 +48,7 @@ static irqreturn_t iio_simple_dummy_trig int i = 0, j; u16 *data;
- data = kmalloc(indio_dev->scan_bytes, GFP_KERNEL); + data = kzalloc(indio_dev->scan_bytes, GFP_KERNEL); if (!data) goto done;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Javier Carrasco javier.carrasco.cruz@gmail.com
commit 47b43e53c0a0edf5578d5d12f5fc71c019649279 upstream.
The 'buffer' local array is used to push data to userspace from a triggered buffer, but it does not set an initial value for the single data element, which is an u16 aligned to 8 bytes. That leaves at least 4 bytes uninitialized even after writing an integer value with regmap_read().
Initialize the array to zero before using it to avoid pushing uninitialized information to userspace.
Cc: stable@vger.kernel.org Fixes: ec90b52c07c0 ("iio: light: vcnl4035: Fix buffer alignment in iio_push_to_buffers_with_timestamp()") Signed-off-by: Javier Carrasco javier.carrasco.cruz@gmail.com Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-6-0cb6e98d895c@gm... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/light/vcnl4035.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/light/vcnl4035.c +++ b/drivers/iio/light/vcnl4035.c @@ -105,7 +105,7 @@ static irqreturn_t vcnl4035_trigger_cons struct iio_dev *indio_dev = pf->indio_dev; struct vcnl4035_data *data = iio_priv(indio_dev); /* Ensure naturally aligned timestamp */ - u8 buffer[ALIGN(sizeof(u16), sizeof(s64)) + sizeof(s64)] __aligned(8); + u8 buffer[ALIGN(sizeof(u16), sizeof(s64)) + sizeof(s64)] __aligned(8) = { }; int ret;
ret = regmap_read(data->regmap, VCNL4035_ALS_DATA, (int *)buffer);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Javier Carrasco javier.carrasco.cruz@gmail.com
commit 6ae053113f6a226a2303caa4936a4c37f3bfff7b upstream.
The 'buffer' local array is used to push data to user space from a triggered buffer, but it does not set values for inactive channels, as it only uses iio_for_each_active_channel() to assign new values.
Initialize the array to zero before using it to avoid pushing uninitialized information to userspace.
Cc: stable@vger.kernel.org Fixes: c3a23ecc0901 ("iio: imu: kmx61: Add support for data ready triggers") Signed-off-by: Javier Carrasco javier.carrasco.cruz@gmail.com Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-5-0cb6e98d895c@gm... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/imu/kmx61.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/imu/kmx61.c +++ b/drivers/iio/imu/kmx61.c @@ -1192,7 +1192,7 @@ static irqreturn_t kmx61_trigger_handler struct kmx61_data *data = kmx61_get_data(indio_dev); int bit, ret, i = 0; u8 base; - s16 buffer[8]; + s16 buffer[8] = { };
if (indio_dev == data->acc_indio_dev) base = KMX61_ACC_XOUT_L;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Javier Carrasco javier.carrasco.cruz@gmail.com
commit 38724591364e1e3b278b4053f102b49ea06ee17c upstream.
The 'data' local struct is used to push data to user space from a triggered buffer, but it does not set values for inactive channels, as it only uses iio_for_each_active_channel() to assign new values.
Initialize the struct to zero before using it to avoid pushing uninitialized information to userspace.
Cc: stable@vger.kernel.org Fixes: 4e130dc7b413 ("iio: adc: rockchip_saradc: Add support iio buffers") Signed-off-by: Javier Carrasco javier.carrasco.cruz@gmail.com Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-4-0cb6e98d895c@gm... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/rockchip_saradc.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/iio/adc/rockchip_saradc.c +++ b/drivers/iio/adc/rockchip_saradc.c @@ -368,6 +368,8 @@ static irqreturn_t rockchip_saradc_trigg int ret; int i, j = 0;
+ memset(&data, 0, sizeof(data)); + mutex_lock(&info->lock);
for_each_set_bit(i, i_dev->active_scan_mask, i_dev->masklength) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Javier Carrasco javier.carrasco.cruz@gmail.com
commit 2a7377ccfd940cd6e9201756aff1e7852c266e69 upstream.
The 'buffer' local array is used to push data to user space from a triggered buffer, but it does not set values for inactive channels, as it only uses iio_for_each_active_channel() to assign new values.
Initialize the array to zero before using it to avoid pushing uninitialized information to userspace.
Cc: stable@vger.kernel.org Fixes: 61fa5dfa5f52 ("iio: adc: ti-ads8688: Fix alignment of buffer in iio_push_to_buffers_with_timestamp()") Signed-off-by: Javier Carrasco javier.carrasco.cruz@gmail.com Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-8-0cb6e98d895c@gm... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/ti-ads8688.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/adc/ti-ads8688.c +++ b/drivers/iio/adc/ti-ads8688.c @@ -382,7 +382,7 @@ static irqreturn_t ads8688_trigger_handl struct iio_poll_func *pf = p; struct iio_dev *indio_dev = pf->indio_dev; /* Ensure naturally aligned timestamp */ - u16 buffer[ADS8688_MAX_CHANNELS + sizeof(s64)/sizeof(u16)] __aligned(8); + u16 buffer[ADS8688_MAX_CHANNELS + sizeof(s64)/sizeof(u16)] __aligned(8) = { }; int i, j = 0;
for (i = 0; i < indio_dev->masklength; i++) {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Carlos Song carlos.song@nxp.com
commit fa13ac6cdf9b6c358e7d77c29fb60145c7a87965 upstream.
The fxas21002c_trigger_handler() may fail to acquire sample data because the runtime PM enters the autosuspend state and sensor can not return sample data in standby mode..
Resume the sensor before reading the sample data into the buffer within the trigger handler. After the data is read, place the sensor back into the autosuspend state.
Fixes: a0701b6263ae ("iio: gyro: add core driver for fxas21002c") Signed-off-by: Carlos Song carlos.song@nxp.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://patch.msgid.link/20241116152945.4006374-1-Frank.Li@nxp.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/gyro/fxas21002c_core.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/drivers/iio/gyro/fxas21002c_core.c +++ b/drivers/iio/gyro/fxas21002c_core.c @@ -730,14 +730,21 @@ static irqreturn_t fxas21002c_trigger_ha int ret;
mutex_lock(&data->lock); + ret = fxas21002c_pm_get(data); + if (ret < 0) + goto out_unlock; + ret = regmap_bulk_read(data->regmap, FXAS21002C_REG_OUT_X_MSB, data->buffer, CHANNEL_SCAN_MAX * sizeof(s16)); if (ret < 0) - goto out_unlock; + goto out_pm_put;
iio_push_to_buffers_with_timestamp(indio_dev, data->buffer, data->timestamp);
+out_pm_put: + fxas21002c_pm_put(data); + out_unlock: mutex_unlock(&data->lock);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fabio Estevam festevam@gmail.com
commit 2a8e34096ec70d73ebb6d9920688ea312700cbd9 upstream.
Using gpiod_set_value() to control the reset GPIO causes some verbose warnings during boot when the reset GPIO is controlled by an I2C IO expander.
As the caller can sleep, use the gpiod_set_value_cansleep() variant to fix the issue.
Tested on a custom i.MX93 board with a ADS124S08 ADC.
Cc: stable@kernel.org Fixes: e717f8c6dfec ("iio: adc: Add the TI ads124s08 ADC code") Signed-off-by: Fabio Estevam festevam@gmail.com Link: https://patch.msgid.link/20241122164308.390340-1-festevam@gmail.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/ti-ads124s08.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/iio/adc/ti-ads124s08.c +++ b/drivers/iio/adc/ti-ads124s08.c @@ -183,9 +183,9 @@ static int ads124s_reset(struct iio_dev struct ads124s_private *priv = iio_priv(indio_dev);
if (priv->reset_gpio) { - gpiod_set_value(priv->reset_gpio, 0); + gpiod_set_value_cansleep(priv->reset_gpio, 0); udelay(200); - gpiod_set_value(priv->reset_gpio, 1); + gpiod_set_value_cansleep(priv->reset_gpio, 1); } else { return ads124s_write_cmd(indio_dev, ADS124S08_CMD_RESET); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
commit de6a73bad1743e9e81ea5a24c178c67429ff510b upstream.
Current implementation of at91_ts_register() calls input_free_deivce() on st->ts_input, however, the err label can be reached before the allocated iio_dev is stored to st->ts_input. Thus call input_free_device() on input instead of st->ts_input.
Fixes: 84882b060301 ("iio: adc: at91_adc: Add support for touchscreens without TSMR") Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Link: https://patch.msgid.link/20241207043045.1255409-1-joe@pf.is.s.u-tokyo.ac.jp Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/at91_adc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/adc/at91_adc.c +++ b/drivers/iio/adc/at91_adc.c @@ -984,7 +984,7 @@ static int at91_ts_register(struct iio_d return ret;
err: - input_free_device(st->ts_input); + input_free_device(input); return ret; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
commit 64f43895b4457532a3cc524ab250b7a30739a1b1 upstream.
In the error path of iio_channel_get_all(), iio_device_put() is called on all IIO devices, which can cause a refcount imbalance. Fix this error by calling iio_device_put() only on IIO devices whose refcounts were previously incremented by iio_device_get().
Fixes: 314be14bb893 ("iio: Rename _st_ functions to loose the bit that meant the staging version.") Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Link: https://patch.msgid.link/20241204111342.1246706-1-joe@pf.is.s.u-tokyo.ac.jp Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/inkern.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/inkern.c +++ b/drivers/iio/inkern.c @@ -514,7 +514,7 @@ struct iio_channel *iio_channel_get_all( return chans;
error_free_chans: - for (i = 0; i < nummaps; i++) + for (i = 0; i < mapind; i++) iio_device_put(chans[i].indio_dev); kfree(chans); error_ret:
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Uwe Kleine-König u.kleine-koenig@baylibre.com
commit 4be339af334c283a1a1af3cb28e7e448a0aa8a7c upstream.
When during a measurement two channels are enabled, two measurements are done that are reported sequencially in the DATA register. As the code triggered by reading one of the sysfs properties expects that only one channel is enabled it only reads the first data set which might or might not belong to the intended channel.
To prevent this situation disable all channels during probe. This fixes a problem in practise because the reset default for channel 0 is enabled. So all measurements before the first measurement on channel 0 (which disables channel 0 at the end) might report wrong values.
Fixes: 7b8d045e497a ("iio: adc: ad7124: allow more than 8 channels") Reviewed-by: Nuno Sa nuno.sa@analog.com Signed-off-by: Uwe Kleine-König u.kleine-koenig@baylibre.com Link: https://patch.msgid.link/20241104101905.845737-2-u.kleine-koenig@baylibre.co... Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/ad7124.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/iio/adc/ad7124.c +++ b/drivers/iio/adc/ad7124.c @@ -923,6 +923,9 @@ static int ad7124_setup(struct ad7124_st * set all channels to this default value. */ ad7124_set_channel_odr(st, i, 10); + + /* Disable all channels to prevent unintended conversions. */ + ad_sd_write_reg(&st->sd, AD7124_CHANNEL(i), 2, 0); }
return ret;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao namcao@linutronix.de
commit 13134cc949148e1dfa540a0fe5dc73569bc62155 upstream.
p->ainsn.api.insn is a pointer to u32, therefore arithmetic operations are multiplied by four. This is clearly undesirable for this case.
Cast it to (void *) first before any calculation.
Below is a sample before/after. The dumped memory is two kprobe slots, the first slot has
- c.addiw a0, 0x1c (0x7125) - ebreak (0x00100073)
and the second slot has:
- c.addiw a0, -4 (0x7135) - ebreak (0x00100073)
Before this patch:
(gdb) x/16xh 0xff20000000135000 0xff20000000135000: 0x7125 0x0000 0x0000 0x0000 0x7135 0x0010 0x0000 0x0000 0xff20000000135010: 0x0073 0x0010 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
After this patch:
(gdb) x/16xh 0xff20000000125000 0xff20000000125000: 0x7125 0x0073 0x0010 0x0000 0x7135 0x0073 0x0010 0x0000 0xff20000000125010: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
Fixes: b1756750a397 ("riscv: kprobes: Use patch_text_nosync() for insn slots") Signed-off-by: Nam Cao namcao@linutronix.de Cc: stable@vger.kernel.org Reviewed-by: Alexandre Ghiti alexghiti@rivosinc.com Link: https://lore.kernel.org/r/20241119111056.2554419-1-namcao@linutronix.de Signed-off-by: Palmer Dabbelt palmer@rivosinc.com [rebase to v6.6] Signed-off-by: Nam Cao namcao@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/kernel/probes/kprobes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/riscv/kernel/probes/kprobes.c +++ b/arch/riscv/kernel/probes/kprobes.c @@ -29,7 +29,7 @@ static void __kprobes arch_prepare_ss_sl p->ainsn.api.restore = (unsigned long)p->addr + offset;
patch_text_nosync(p->ainsn.api.insn, &p->opcode, 1); - patch_text_nosync(p->ainsn.api.insn + offset, &insn, 1); + patch_text_nosync((void *)p->ainsn.api.insn + offset, &insn, 1); }
static void __kprobes arch_prepare_simulate(struct kprobe *p)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
Commit c9a40292a44e78f71258b8522655bffaf5753bdb upstream.
io_eventfd_do_signal() is invoked from an RCU callback, but when dropping the reference to the io_ev_fd, it calls io_eventfd_free() directly if the refcount drops to zero. This isn't correct, as any potential freeing of the io_ev_fd should be deferred another RCU grace period.
Just call io_eventfd_put() rather than open-code the dec-and-test and free, which will correctly defer it another RCU grace period.
Fixes: 21a091b970cd ("io_uring: signal registered eventfd to process deferred task work") Reported-by: Jann Horn jannh@google.com Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
--- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -537,6 +537,13 @@ static __cold void io_queue_deferred(str } }
+static void io_eventfd_free(struct rcu_head *rcu) +{ + struct io_ev_fd *ev_fd = container_of(rcu, struct io_ev_fd, rcu); + + eventfd_ctx_put(ev_fd->cq_ev_fd); + kfree(ev_fd); +}
static void io_eventfd_ops(struct rcu_head *rcu) { @@ -550,10 +557,8 @@ static void io_eventfd_ops(struct rcu_he * ordering in a race but if references are 0 we know we have to free * it regardless. */ - if (atomic_dec_and_test(&ev_fd->refs)) { - eventfd_ctx_put(ev_fd->cq_ev_fd); - kfree(ev_fd); - } + if (atomic_dec_and_test(&ev_fd->refs)) + call_rcu(&ev_fd->rcu, io_eventfd_free); }
static void io_eventfd_signal(struct io_ring_ctx *ctx)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jesse Taube Mr.Bossman075@gmail.com
[ Upstream commit 5f122030061db3e5d2bddd9cf5c583deaa6c54ff ]
One of the usdhc1 controller's clocks should be IMXRT1050_CLK_AHB_PODF not IMXRT1050_CLK_OSC.
Fixes: 1c4f01be3490 ("ARM: dts: imx: Add i.MXRT1050-EVK support") Signed-off-by: Jesse Taube Mr.Bossman075@gmail.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/nxp/imx/imxrt1050.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/nxp/imx/imxrt1050.dtsi b/arch/arm/boot/dts/nxp/imx/imxrt1050.dtsi index dd714d235d5f..b0bad0d1ba36 100644 --- a/arch/arm/boot/dts/nxp/imx/imxrt1050.dtsi +++ b/arch/arm/boot/dts/nxp/imx/imxrt1050.dtsi @@ -87,7 +87,7 @@ reg = <0x402c0000 0x4000>; interrupts = <110>; clocks = <&clks IMXRT1050_CLK_IPG_PDOF>, - <&clks IMXRT1050_CLK_OSC>, + <&clks IMXRT1050_CLK_AHB_PODF>, <&clks IMXRT1050_CLK_USDHC1>; clock-names = "ipg", "ahb", "per"; bus-width = <4>;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniil Stas daniil.stas@posteo.net
[ Upstream commit 82163d63ae7a4c36142cd252388737205bb7e4b9 ]
scsi_execute_cmd() function can return both negative (linux codes) and positive (scsi_cmnd result field) error codes.
Currently the driver just passes error codes of scsi_execute_cmd() to hwmon core, which is incorrect because hwmon only checks for negative error codes. This leads to hwmon reporting uninitialized data to userspace in case of SCSI errors (for example if the disk drive was disconnected).
This patch checks scsi_execute_cmd() output and returns -EIO if it's error code is positive.
Fixes: 5b46903d8bf37 ("hwmon: Driver for disk and solid state drives with temperature sensors") Signed-off-by: Daniil Stas daniil.stas@posteo.net Cc: Guenter Roeck linux@roeck-us.net Cc: Chris Healy cphealy@gmail.com Cc: Linus Walleij linus.walleij@linaro.org Cc: Martin K. Petersen martin.petersen@oracle.com Cc: Bart Van Assche bvanassche@acm.org Cc: linux-kernel@vger.kernel.org Cc: linux-scsi@vger.kernel.org Cc: linux-ide@vger.kernel.org Cc: linux-hwmon@vger.kernel.org Link: https://lore.kernel.org/r/20250105213618.531691-1-daniil.stas@posteo.net [groeck: Avoid inline variable declaration for portability] Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/drivetemp.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/hwmon/drivetemp.c b/drivers/hwmon/drivetemp.c index 6bdd21aa005a..2a4ec55ddb47 100644 --- a/drivers/hwmon/drivetemp.c +++ b/drivers/hwmon/drivetemp.c @@ -165,6 +165,7 @@ static int drivetemp_scsi_command(struct drivetemp_data *st, { u8 scsi_cmd[MAX_COMMAND_SIZE]; enum req_op op; + int err;
memset(scsi_cmd, 0, sizeof(scsi_cmd)); scsi_cmd[0] = ATA_16; @@ -192,8 +193,11 @@ static int drivetemp_scsi_command(struct drivetemp_data *st, scsi_cmd[12] = lba_high; scsi_cmd[14] = ata_command;
- return scsi_execute_cmd(st->sdev, scsi_cmd, op, st->smartdata, - ATA_SECT_SIZE, HZ, 5, NULL); + err = scsi_execute_cmd(st->sdev, scsi_cmd, op, st->smartdata, + ATA_SECT_SIZE, HZ, 5, NULL); + if (err > 0) + err = -EIO; + return err; }
static int drivetemp_ata_command(struct drivetemp_data *st, u8 feature,
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yu Kuai yukuai3@huawei.com
[ Upstream commit fcede1f0a043ccefe9bc6ad57f12718e42f63f1d ]
Our syzkaller report a following UAF for v6.6:
BUG: KASAN: slab-use-after-free in bfq_init_rq+0x175d/0x17a0 block/bfq-iosched.c:6958 Read of size 8 at addr ffff8881b57147d8 by task fsstress/232726
CPU: 2 PID: 232726 Comm: fsstress Not tainted 6.6.0-g3629d1885222 #39 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x91/0xf0 lib/dump_stack.c:106 print_address_description.constprop.0+0x66/0x300 mm/kasan/report.c:364 print_report+0x3e/0x70 mm/kasan/report.c:475 kasan_report+0xb8/0xf0 mm/kasan/report.c:588 hlist_add_head include/linux/list.h:1023 [inline] bfq_init_rq+0x175d/0x17a0 block/bfq-iosched.c:6958 bfq_insert_request.isra.0+0xe8/0xa20 block/bfq-iosched.c:6271 bfq_insert_requests+0x27f/0x390 block/bfq-iosched.c:6323 blk_mq_insert_request+0x290/0x8f0 block/blk-mq.c:2660 blk_mq_submit_bio+0x1021/0x15e0 block/blk-mq.c:3143 __submit_bio+0xa0/0x6b0 block/blk-core.c:639 __submit_bio_noacct_mq block/blk-core.c:718 [inline] submit_bio_noacct_nocheck+0x5b7/0x810 block/blk-core.c:747 submit_bio_noacct+0xca0/0x1990 block/blk-core.c:847 __ext4_read_bh fs/ext4/super.c:205 [inline] ext4_read_bh+0x15e/0x2e0 fs/ext4/super.c:230 __read_extent_tree_block+0x304/0x6f0 fs/ext4/extents.c:567 ext4_find_extent+0x479/0xd20 fs/ext4/extents.c:947 ext4_ext_map_blocks+0x1a3/0x2680 fs/ext4/extents.c:4182 ext4_map_blocks+0x929/0x15a0 fs/ext4/inode.c:660 ext4_iomap_begin_report+0x298/0x480 fs/ext4/inode.c:3569 iomap_iter+0x3dd/0x1010 fs/iomap/iter.c:91 iomap_fiemap+0x1f4/0x360 fs/iomap/fiemap.c:80 ext4_fiemap+0x181/0x210 fs/ext4/extents.c:5051 ioctl_fiemap.isra.0+0x1b4/0x290 fs/ioctl.c:220 do_vfs_ioctl+0x31c/0x11a0 fs/ioctl.c:811 __do_sys_ioctl fs/ioctl.c:869 [inline] __se_sys_ioctl+0xae/0x190 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x70/0x120 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x78/0xe2
Allocated by task 232719: kasan_save_stack+0x22/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 __kasan_slab_alloc+0x87/0x90 mm/kasan/common.c:328 kasan_slab_alloc include/linux/kasan.h:188 [inline] slab_post_alloc_hook mm/slab.h:768 [inline] slab_alloc_node mm/slub.c:3492 [inline] kmem_cache_alloc_node+0x1b8/0x6f0 mm/slub.c:3537 bfq_get_queue+0x215/0x1f00 block/bfq-iosched.c:5869 bfq_get_bfqq_handle_split+0x167/0x5f0 block/bfq-iosched.c:6776 bfq_init_rq+0x13a4/0x17a0 block/bfq-iosched.c:6938 bfq_insert_request.isra.0+0xe8/0xa20 block/bfq-iosched.c:6271 bfq_insert_requests+0x27f/0x390 block/bfq-iosched.c:6323 blk_mq_insert_request+0x290/0x8f0 block/blk-mq.c:2660 blk_mq_submit_bio+0x1021/0x15e0 block/blk-mq.c:3143 __submit_bio+0xa0/0x6b0 block/blk-core.c:639 __submit_bio_noacct_mq block/blk-core.c:718 [inline] submit_bio_noacct_nocheck+0x5b7/0x810 block/blk-core.c:747 submit_bio_noacct+0xca0/0x1990 block/blk-core.c:847 __ext4_read_bh fs/ext4/super.c:205 [inline] ext4_read_bh_nowait+0x15a/0x240 fs/ext4/super.c:217 ext4_read_bh_lock+0xac/0xd0 fs/ext4/super.c:242 ext4_bread_batch+0x268/0x500 fs/ext4/inode.c:958 __ext4_find_entry+0x448/0x10f0 fs/ext4/namei.c:1671 ext4_lookup_entry fs/ext4/namei.c:1774 [inline] ext4_lookup.part.0+0x359/0x6f0 fs/ext4/namei.c:1842 ext4_lookup+0x72/0x90 fs/ext4/namei.c:1839 __lookup_slow+0x257/0x480 fs/namei.c:1696 lookup_slow fs/namei.c:1713 [inline] walk_component+0x454/0x5c0 fs/namei.c:2004 link_path_walk.part.0+0x773/0xda0 fs/namei.c:2331 link_path_walk fs/namei.c:3826 [inline] path_openat+0x1b9/0x520 fs/namei.c:3826 do_filp_open+0x1b7/0x400 fs/namei.c:3857 do_sys_openat2+0x5dc/0x6e0 fs/open.c:1428 do_sys_open fs/open.c:1443 [inline] __do_sys_openat fs/open.c:1459 [inline] __se_sys_openat fs/open.c:1454 [inline] __x64_sys_openat+0x148/0x200 fs/open.c:1454 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x70/0x120 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x78/0xe2
Freed by task 232726: kasan_save_stack+0x22/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 kasan_save_free_info+0x2b/0x50 mm/kasan/generic.c:522 ____kasan_slab_free mm/kasan/common.c:236 [inline] __kasan_slab_free+0x12a/0x1b0 mm/kasan/common.c:244 kasan_slab_free include/linux/kasan.h:164 [inline] slab_free_hook mm/slub.c:1827 [inline] slab_free_freelist_hook mm/slub.c:1853 [inline] slab_free mm/slub.c:3820 [inline] kmem_cache_free+0x110/0x760 mm/slub.c:3842 bfq_put_queue+0x6a7/0xfb0 block/bfq-iosched.c:5428 bfq_forget_entity block/bfq-wf2q.c:634 [inline] bfq_put_idle_entity+0x142/0x240 block/bfq-wf2q.c:645 bfq_forget_idle+0x189/0x1e0 block/bfq-wf2q.c:671 bfq_update_vtime block/bfq-wf2q.c:1280 [inline] __bfq_lookup_next_entity block/bfq-wf2q.c:1374 [inline] bfq_lookup_next_entity+0x350/0x480 block/bfq-wf2q.c:1433 bfq_update_next_in_service+0x1c0/0x4f0 block/bfq-wf2q.c:128 bfq_deactivate_entity+0x10a/0x240 block/bfq-wf2q.c:1188 bfq_deactivate_bfqq block/bfq-wf2q.c:1592 [inline] bfq_del_bfqq_busy+0x2e8/0xad0 block/bfq-wf2q.c:1659 bfq_release_process_ref+0x1cc/0x220 block/bfq-iosched.c:3139 bfq_split_bfqq+0x481/0xdf0 block/bfq-iosched.c:6754 bfq_init_rq+0xf29/0x17a0 block/bfq-iosched.c:6934 bfq_insert_request.isra.0+0xe8/0xa20 block/bfq-iosched.c:6271 bfq_insert_requests+0x27f/0x390 block/bfq-iosched.c:6323 blk_mq_insert_request+0x290/0x8f0 block/blk-mq.c:2660 blk_mq_submit_bio+0x1021/0x15e0 block/blk-mq.c:3143 __submit_bio+0xa0/0x6b0 block/blk-core.c:639 __submit_bio_noacct_mq block/blk-core.c:718 [inline] submit_bio_noacct_nocheck+0x5b7/0x810 block/blk-core.c:747 submit_bio_noacct+0xca0/0x1990 block/blk-core.c:847 __ext4_read_bh fs/ext4/super.c:205 [inline] ext4_read_bh+0x15e/0x2e0 fs/ext4/super.c:230 __read_extent_tree_block+0x304/0x6f0 fs/ext4/extents.c:567 ext4_find_extent+0x479/0xd20 fs/ext4/extents.c:947 ext4_ext_map_blocks+0x1a3/0x2680 fs/ext4/extents.c:4182 ext4_map_blocks+0x929/0x15a0 fs/ext4/inode.c:660 ext4_iomap_begin_report+0x298/0x480 fs/ext4/inode.c:3569 iomap_iter+0x3dd/0x1010 fs/iomap/iter.c:91 iomap_fiemap+0x1f4/0x360 fs/iomap/fiemap.c:80 ext4_fiemap+0x181/0x210 fs/ext4/extents.c:5051 ioctl_fiemap.isra.0+0x1b4/0x290 fs/ioctl.c:220 do_vfs_ioctl+0x31c/0x11a0 fs/ioctl.c:811 __do_sys_ioctl fs/ioctl.c:869 [inline] __se_sys_ioctl+0xae/0x190 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x70/0x120 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x78/0xe2
commit 1ba0403ac644 ("block, bfq: fix uaf for accessing waker_bfqq after splitting") fix the problem that if waker_bfqq is in the merge chain, and current is the only procress, waker_bfqq can be freed from bfq_split_bfqq(). However, the case that waker_bfqq is not in the merge chain is missed, and if the procress reference of waker_bfqq is 0, waker_bfqq can be freed as well.
Fix the problem by checking procress reference if waker_bfqq is not in the merge_chain.
Fixes: 1ba0403ac644 ("block, bfq: fix uaf for accessing waker_bfqq after splitting") Signed-off-by: Hou Tao houtao1@huawei.com Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20250108084148.1549973-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/bfq-iosched.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index dd8ca3f7ba60..617d6802b8a0 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -6843,16 +6843,24 @@ static struct bfq_queue *bfq_waker_bfqq(struct bfq_queue *bfqq) if (new_bfqq == waker_bfqq) { /* * If waker_bfqq is in the merge chain, and current - * is the only procress. + * is the only process, waker_bfqq can be freed. */ if (bfqq_process_refs(waker_bfqq) == 1) return NULL; - break; + + return waker_bfqq; }
new_bfqq = new_bfqq->new_bfqq; }
+ /* + * If waker_bfqq is not in the merge chain, and it's procress reference + * is 0, waker_bfqq can be freed. + */ + if (bfqq_process_refs(waker_bfqq) == 0) + return NULL; + return waker_bfqq; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Geis pgwipeout@gmail.com
[ Upstream commit 3699f2c43ea9984e00d70463f8c29baaf260ea97 ]
There is a race condition at startup between disabling power domains not used and disabling clocks not used on the rk3328. When the clocks are disabled first, the hevc power domain fails to shut off leading to a splat of failures. Add the hevc core clock to the rk3328 power domain node to prevent this condition.
rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 3-.... } 1087 jiffies s: 89 root: 0x8/. rcu: blocking rcu_node structures (internal RCU debug): Sending NMI from CPU 0 to CPUs 3: NMI backtrace for cpu 3 CPU: 3 UID: 0 PID: 86 Comm: kworker/3:3 Not tainted 6.12.0-rc5+ #53 Hardware name: Firefly ROC-RK3328-CC (DT) Workqueue: pm genpd_power_off_work_fn pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : regmap_unlock_spinlock+0x18/0x30 lr : regmap_read+0x60/0x88 sp : ffff800081123c00 x29: ffff800081123c00 x28: ffff2fa4c62cad80 x27: 0000000000000000 x26: ffffd74e6e660eb8 x25: ffff2fa4c62cae00 x24: 0000000000000040 x23: ffffd74e6d2f3ab8 x22: 0000000000000001 x21: ffff800081123c74 x20: 0000000000000000 x19: ffff2fa4c0412000 x18: 0000000000000000 x17: 77202c31203d2065 x16: 6c6469203a72656c x15: 6c6f72746e6f632d x14: 7265776f703a6e6f x13: 2063766568206e69 x12: 616d6f64202c3431 x11: 347830206f742030 x10: 3430303034783020 x9 : ffffd74e6c7369e0 x8 : 3030316666206e69 x7 : 205d383738353733 x6 : 332e31202020205b x5 : ffffd74e6c73fc88 x4 : ffffd74e6c73fcd4 x3 : ffffd74e6c740b40 x2 : ffff800080015484 x1 : 0000000000000000 x0 : ffff2fa4c0412000 Call trace: regmap_unlock_spinlock+0x18/0x30 rockchip_pmu_set_idle_request+0xac/0x2c0 rockchip_pd_power+0x144/0x5f8 rockchip_pd_power_off+0x1c/0x30 _genpd_power_off+0x9c/0x180 genpd_power_off.part.0.isra.0+0x130/0x2a8 genpd_power_off_work_fn+0x6c/0x98 process_one_work+0x170/0x3f0 worker_thread+0x290/0x4a8 kthread+0xec/0xf8 ret_from_fork+0x10/0x20 rockchip-pm-domain ff100000.syscon:power-controller: failed to get ack on domain 'hevc', val=0x88220
Fixes: 52e02d377a72 ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs") Signed-off-by: Peter Geis pgwipeout@gmail.com Reviewed-by: Dragan Simic dsimic@manjaro.org Link: https://lore.kernel.org/r/20241214224339.24674-1-pgwipeout@gmail.com Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/rockchip/rk3328.dtsi | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/arm64/boot/dts/rockchip/rk3328.dtsi b/arch/arm64/boot/dts/rockchip/rk3328.dtsi index 5d47acbf4a24..82eb7c49e825 100644 --- a/arch/arm64/boot/dts/rockchip/rk3328.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3328.dtsi @@ -304,6 +304,7 @@
power-domain@RK3328_PD_HEVC { reg = <RK3328_PD_HEVC>; + clocks = <&cru SCLK_VENC_CORE>; #power-domain-cells = <0>; }; power-domain@RK3328_PD_VIDEO {
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
[ Upstream commit 13bd778c900537f3fff7cfb671ff2eb0e92feee6 ]
Use scoped for_each_child_of_node_scoped() when iterating over device nodes to make code a bit simpler.
Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Link: https://lore.kernel.org/r/20240823-cleanup-h-guard-pm-domain-v1-4-8320722eaf... Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Stable-dep-of: 469c0682e03d ("pmdomain: imx: gpcv2: fix an OF node reference leak in imx_gpcv2_probe()") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pmdomain/imx/gpcv2.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/pmdomain/imx/gpcv2.c b/drivers/pmdomain/imx/gpcv2.c index fbd3d92f8cd8..ec789bf92274 100644 --- a/drivers/pmdomain/imx/gpcv2.c +++ b/drivers/pmdomain/imx/gpcv2.c @@ -1449,7 +1449,7 @@ static int imx_gpcv2_probe(struct platform_device *pdev) .max_register = SZ_4K, }; struct device *dev = &pdev->dev; - struct device_node *pgc_np, *np; + struct device_node *pgc_np; struct regmap *regmap; void __iomem *base; int ret; @@ -1471,7 +1471,7 @@ static int imx_gpcv2_probe(struct platform_device *pdev) return ret; }
- for_each_child_of_node(pgc_np, np) { + for_each_child_of_node_scoped(pgc_np, np) { struct platform_device *pd_pdev; struct imx_pgc_domain *domain; u32 domain_index; @@ -1482,7 +1482,6 @@ static int imx_gpcv2_probe(struct platform_device *pdev) ret = of_property_read_u32(np, "reg", &domain_index); if (ret) { dev_err(dev, "Failed to read 'reg' property\n"); - of_node_put(np); return ret; }
@@ -1497,7 +1496,6 @@ static int imx_gpcv2_probe(struct platform_device *pdev) domain_index); if (!pd_pdev) { dev_err(dev, "Failed to allocate platform device\n"); - of_node_put(np); return -ENOMEM; }
@@ -1506,7 +1504,6 @@ static int imx_gpcv2_probe(struct platform_device *pdev) sizeof(domain_data->domains[domain_index])); if (ret) { platform_device_put(pd_pdev); - of_node_put(np); return ret; }
@@ -1523,7 +1520,6 @@ static int imx_gpcv2_probe(struct platform_device *pdev) ret = platform_device_add(pd_pdev); if (ret) { platform_device_put(pd_pdev); - of_node_put(np); return ret; } }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp
[ Upstream commit 469c0682e03d67d8dc970ecaa70c2d753057c7c0 ]
imx_gpcv2_probe() leaks an OF node reference obtained by of_get_child_by_name(). Fix it by declaring the device node with the __free(device_node) cleanup construct.
This bug was found by an experimental static analysis tool that I am developing.
Fixes: 03aa12629fc4 ("soc: imx: Add GPCv2 power gating driver") Signed-off-by: Joe Hattori joe@pf.is.s.u-tokyo.ac.jp Cc: stable@vger.kernel.org Message-ID: 20241215030159.1526624-1-joe@pf.is.s.u-tokyo.ac.jp Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pmdomain/imx/gpcv2.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/pmdomain/imx/gpcv2.c b/drivers/pmdomain/imx/gpcv2.c index ec789bf92274..13fce2b134f6 100644 --- a/drivers/pmdomain/imx/gpcv2.c +++ b/drivers/pmdomain/imx/gpcv2.c @@ -1449,12 +1449,12 @@ static int imx_gpcv2_probe(struct platform_device *pdev) .max_register = SZ_4K, }; struct device *dev = &pdev->dev; - struct device_node *pgc_np; + struct device_node *pgc_np __free(device_node) = + of_get_child_by_name(dev->of_node, "pgc"); struct regmap *regmap; void __iomem *base; int ret;
- pgc_np = of_get_child_by_name(dev->of_node, "pgc"); if (!pgc_np) { dev_err(dev, "No power domains specified in DT\n"); return -EINVAL;
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xuewen Yan xuewen.yan@unisoc.com
[ Upstream commit 1a65a6d17cbc58e1aeffb2be962acce49efbef9c ]
Currently the workqueue just checks the atomic and locking states after work execution ends. However, sometimes, a work item may not unlock rcu after acquiring rcu_read_lock(). And as a result, it would cause rcu stall, but the rcu stall warning can not dump the work func, because the work has finished.
In order to quickly discover those works that do not call rcu_read_unlock() after rcu_read_lock(), add the rcu lock check.
Use rcu_preempt_depth() to check the work's rcu status. Normally, this value is 0. If this value is bigger than 0, it means the work are still holding rcu lock. If so, print err info and the work func.
tj: Reworded the description for clarity. Minor formatting tweak.
Signed-off-by: Xuewen Yan xuewen.yan@unisoc.com Reviewed-by: Lai Jiangshan jiangshanlai@gmail.com Reviewed-by: Waiman Long longman@redhat.com Signed-off-by: Tejun Heo tj@kernel.org Stable-dep-of: de35994ecd2d ("workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/workqueue.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 7fa1c7c9151a..2d85f232c675 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -2638,11 +2638,12 @@ __acquires(&pool->lock) lock_map_release(&lockdep_map); lock_map_release(&pwq->wq->lockdep_map);
- if (unlikely(in_atomic() || lockdep_depth(current) > 0)) { - pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d\n" + if (unlikely(in_atomic() || lockdep_depth(current) > 0 || + rcu_preempt_depth() > 0)) { + pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d/%d\n" " last function: %ps\n", - current->comm, preempt_count(), task_pid_nr(current), - worker->current_func); + current->comm, preempt_count(), rcu_preempt_depth(), + task_pid_nr(current), worker->current_func); debug_show_held_locks(current); dump_stack(); }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tejun Heo tj@kernel.org
[ Upstream commit c35aea39d1e106f61fd2130f0d32a3bac8bd4570 ]
These changes are in preparation of BH workqueue which will execute work items from BH context.
- Update lock and RCU depth checks in process_one_work() so that it remembers and checks against the starting depths and prints out the depth changes.
- Factor out lockdep annotations in the flush paths into touch_{wq|work}_lockdep_map(). The work->lockdep_map touching is moved from __flush_work() to its callee - start_flush_work(). This brings it closer to the wq counterpart and will allow testing the associated wq's flags which will be needed to support BH workqueues. This is not expected to cause any functional changes.
Signed-off-by: Tejun Heo tj@kernel.org Tested-by: Allen Pais allen.lkml@gmail.com Stable-dep-of: de35994ecd2d ("workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker") Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/workqueue.c | 51 ++++++++++++++++++++++++++++++---------------- 1 file changed, 34 insertions(+), 17 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 2d85f232c675..da5750246a92 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -2541,6 +2541,7 @@ __acquires(&pool->lock) struct pool_workqueue *pwq = get_work_pwq(work); struct worker_pool *pool = worker->pool; unsigned long work_data; + int lockdep_start_depth, rcu_start_depth; #ifdef CONFIG_LOCKDEP /* * It is permissible to free the struct work_struct from @@ -2603,6 +2604,8 @@ __acquires(&pool->lock) pwq->stats[PWQ_STAT_STARTED]++; raw_spin_unlock_irq(&pool->lock);
+ rcu_start_depth = rcu_preempt_depth(); + lockdep_start_depth = lockdep_depth(current); lock_map_acquire(&pwq->wq->lockdep_map); lock_map_acquire(&lockdep_map); /* @@ -2638,12 +2641,15 @@ __acquires(&pool->lock) lock_map_release(&lockdep_map); lock_map_release(&pwq->wq->lockdep_map);
- if (unlikely(in_atomic() || lockdep_depth(current) > 0 || - rcu_preempt_depth() > 0)) { - pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d/%d\n" - " last function: %ps\n", - current->comm, preempt_count(), rcu_preempt_depth(), - task_pid_nr(current), worker->current_func); + if (unlikely((worker->task && in_atomic()) || + lockdep_depth(current) != lockdep_start_depth || + rcu_preempt_depth() != rcu_start_depth)) { + pr_err("BUG: workqueue leaked atomic, lock or RCU: %s[%d]\n" + " preempt=0x%08x lock=%d->%d RCU=%d->%d workfn=%ps\n", + current->comm, task_pid_nr(current), preempt_count(), + lockdep_start_depth, lockdep_depth(current), + rcu_start_depth, rcu_preempt_depth(), + worker->current_func); debug_show_held_locks(current); dump_stack(); } @@ -3123,6 +3129,19 @@ static bool flush_workqueue_prep_pwqs(struct workqueue_struct *wq, return wait; }
+static void touch_wq_lockdep_map(struct workqueue_struct *wq) +{ + lock_map_acquire(&wq->lockdep_map); + lock_map_release(&wq->lockdep_map); +} + +static void touch_work_lockdep_map(struct work_struct *work, + struct workqueue_struct *wq) +{ + lock_map_acquire(&work->lockdep_map); + lock_map_release(&work->lockdep_map); +} + /** * __flush_workqueue - ensure that any scheduled work has run to completion. * @wq: workqueue to flush @@ -3142,8 +3161,7 @@ void __flush_workqueue(struct workqueue_struct *wq) if (WARN_ON(!wq_online)) return;
- lock_map_acquire(&wq->lockdep_map); - lock_map_release(&wq->lockdep_map); + touch_wq_lockdep_map(wq);
mutex_lock(&wq->mutex);
@@ -3342,6 +3360,7 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr, struct worker *worker = NULL; struct worker_pool *pool; struct pool_workqueue *pwq; + struct workqueue_struct *wq;
might_sleep();
@@ -3365,11 +3384,14 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr, pwq = worker->current_pwq; }
- check_flush_dependency(pwq->wq, work); + wq = pwq->wq; + check_flush_dependency(wq, work);
insert_wq_barrier(pwq, barr, work, worker); raw_spin_unlock_irq(&pool->lock);
+ touch_work_lockdep_map(work, wq); + /* * Force a lock recursion deadlock when using flush_work() inside a * single-threaded or rescuer equipped workqueue. @@ -3379,11 +3401,9 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr, * workqueues the deadlock happens when the rescuer stalls, blocking * forward progress. */ - if (!from_cancel && - (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)) { - lock_map_acquire(&pwq->wq->lockdep_map); - lock_map_release(&pwq->wq->lockdep_map); - } + if (!from_cancel && (wq->saved_max_active == 1 || wq->rescuer)) + touch_wq_lockdep_map(wq); + rcu_read_unlock(); return true; already_gone: @@ -3402,9 +3422,6 @@ static bool __flush_work(struct work_struct *work, bool from_cancel) if (WARN_ON(!work->func)) return false;
- lock_map_acquire(&work->lockdep_map); - lock_map_release(&work->lockdep_map); - if (start_flush_work(work, &barr, from_cancel)) { wait_for_completion(&barr.done); destroy_work_on_stack(&barr.work);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tvrtko Ursulin tvrtko.ursulin@igalia.com
[ Upstream commit de35994ecd2dd6148ab5a6c5050a1670a04dec77 ]
After commit 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM") amdgpu started seeing the following warning:
[ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work [gpu_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_device_delay_enable_gfx_off [amdgpu] ... [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched] ... [ ] Call Trace: [ ] <TASK> ... [ ] ? check_flush_dependency+0xf5/0x110 ... [ ] cancel_delayed_work_sync+0x6e/0x80 [ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu] [ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu] [ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu] [ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched] [ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu] [ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched] [ ] process_one_work+0x217/0x720 ... [ ] </TASK>
The intent of the verifcation done in check_flush_depedency is to ensure forward progress during memory reclaim, by flagging cases when either a memory reclaim process, or a memory reclaim work item is flushed from a context not marked as memory reclaim safe.
This is correct when flushing, but when called from the cancel(_delayed)_work_sync() paths it is a false positive because work is either already running, or will not be running at all. Therefore cancelling it is safe and we can relax the warning criteria by letting the helper know of the calling context.
Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@igalia.com Fixes: fca839c00a12 ("workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue") References: 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM") Cc: Tejun Heo tj@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Lai Jiangshan jiangshanlai@gmail.com Cc: Alex Deucher alexander.deucher@amd.com Cc: Christian König <christian.koenig@amd.com Cc: Matthew Brost matthew.brost@intel.com Cc: stable@vger.kernel.org # v4.5+ Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/workqueue.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c index da5750246a92..59b6efb2a11c 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -2947,23 +2947,27 @@ static int rescuer_thread(void *__rescuer) * check_flush_dependency - check for flush dependency sanity * @target_wq: workqueue being flushed * @target_work: work item being flushed (NULL for workqueue flushes) + * @from_cancel: are we called from the work cancel path * * %current is trying to flush the whole @target_wq or @target_work on it. - * If @target_wq doesn't have %WQ_MEM_RECLAIM, verify that %current is not - * reclaiming memory or running on a workqueue which doesn't have - * %WQ_MEM_RECLAIM as that can break forward-progress guarantee leading to - * a deadlock. + * If this is not the cancel path (which implies work being flushed is either + * already running, or will not be at all), check if @target_wq doesn't have + * %WQ_MEM_RECLAIM and verify that %current is not reclaiming memory or running + * on a workqueue which doesn't have %WQ_MEM_RECLAIM as that can break forward- + * progress guarantee leading to a deadlock. */ static void check_flush_dependency(struct workqueue_struct *target_wq, - struct work_struct *target_work) + struct work_struct *target_work, + bool from_cancel) { - work_func_t target_func = target_work ? target_work->func : NULL; + work_func_t target_func; struct worker *worker;
- if (target_wq->flags & WQ_MEM_RECLAIM) + if (from_cancel || target_wq->flags & WQ_MEM_RECLAIM) return;
worker = current_wq_worker(); + target_func = target_work ? target_work->func : NULL;
WARN_ONCE(current->flags & PF_MEMALLOC, "workqueue: PF_MEMALLOC task %d(%s) is flushing !WQ_MEM_RECLAIM %s:%ps", @@ -3208,7 +3212,7 @@ void __flush_workqueue(struct workqueue_struct *wq) list_add_tail(&this_flusher.list, &wq->flusher_overflow); }
- check_flush_dependency(wq, NULL); + check_flush_dependency(wq, NULL, false);
mutex_unlock(&wq->mutex);
@@ -3385,7 +3389,7 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr, }
wq = pwq->wq; - check_flush_dependency(wq, work); + check_flush_dependency(wq, work, from_cancel);
insert_wq_barrier(pwq, barr, work, worker); raw_spin_unlock_irq(&pool->lock);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Gordeev agordeev@linux.ibm.com
[ Upstream commit 38ca8a185389716e9f7566bce4bb0085f71da61d ]
Patch series "minor ptdesc updates", v3.
This patch (of 2):
Since commit d08d4e7cd6bf ("s390/mm: use full 4KB page for 2KB PTE") there is no fragmented page tracking on s390. Fix the corresponding comments.
Link: https://lkml.kernel.org/r/cover.1700594815.git.agordeev@linux.ibm.com Link: https://lkml.kernel.org/r/2eead241f3a45bed26c7911cf66bded1e35670b8.170059481... Signed-off-by: Alexander Gordeev agordeev@linux.ibm.com Suggested-by: Heiko Carstens hca@linux.ibm.com Cc: Gerald Schaefer gerald.schaefer@linux.ibm.com Cc: Vishal Moola (Oracle) vishal.moola@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/mm_types.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 20c96ce98751..1f224c55fb58 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -398,11 +398,11 @@ FOLIO_MATCH(compound_head, _head_2a); * @pmd_huge_pte: Protected by ptdesc->ptl, used for THPs. * @__page_mapping: Aliases with page->mapping. Unused for page tables. * @pt_mm: Used for x86 pgds. - * @pt_frag_refcount: For fragmented page table tracking. Powerpc and s390 only. + * @pt_frag_refcount: For fragmented page table tracking. Powerpc only. * @_pt_pad_2: Padding to ensure proper alignment. * @ptl: Lock for the page table. * @__page_type: Same as page->page_type. Unused for page tables. - * @_refcount: Same as page refcount. Used for s390 page tables. + * @_refcount: Same as page refcount. * @pt_memcg_data: Memcg data. Tracked for page tables here. * * This struct overlays struct page for now. Do not modify without a good
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Xu peterx@redhat.com
[ Upstream commit cddba0af0b7919e93134469f6fdf29a7d362768a ]
Hugetlb vmemmap default option (HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON) is a sub-option to hugetlbfs, but it shows in the same level as hugetlbfs itself, under "Pesudo filesystems".
Make the vmemmap option a sub-option to hugetlbfs, by changing hugetlbfs into a menuconfig. When moving it, fix a typo 'v' spot by Randy.
Link: https://lkml.kernel.org/r/20231124151902.1075697-1-peterx@redhat.com Signed-off-by: Peter Xu peterx@redhat.com Reviewed-by: Muchun Song songmuchun@bytedance.com Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Randy Dunlap rdunlap@infradead.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/Kconfig | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/fs/Kconfig b/fs/Kconfig index aa7e03cc1941..0ad3c7c7e984 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -253,7 +253,7 @@ config TMPFS_QUOTA config ARCH_SUPPORTS_HUGETLBFS def_bool n
-config HUGETLBFS +menuconfig HUGETLBFS bool "HugeTLB file system support" depends on X86 || IA64 || SPARC64 || ARCH_SUPPORTS_HUGETLBFS || BROKEN depends on (SYSFS || SYSCTL) @@ -265,22 +265,24 @@ config HUGETLBFS
If unsure, say N.
-config HUGETLB_PAGE - def_bool HUGETLBFS - -config HUGETLB_PAGE_OPTIMIZE_VMEMMAP - def_bool HUGETLB_PAGE - depends on ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP - depends on SPARSEMEM_VMEMMAP - +if HUGETLBFS config HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON bool "HugeTLB Vmemmap Optimization (HVO) defaults to on" default n depends on HUGETLB_PAGE_OPTIMIZE_VMEMMAP help - The HugeTLB VmemmapvOptimization (HVO) defaults to off. Say Y here to + The HugeTLB Vmemmap Optimization (HVO) defaults to off. Say Y here to enable HVO by default. It can be disabled via hugetlb_free_vmemmap=off (boot command line) or hugetlb_optimize_vmemmap (sysctl). +endif # HUGETLBFS + +config HUGETLB_PAGE + def_bool HUGETLBFS + +config HUGETLB_PAGE_OPTIMIZE_VMEMMAP + def_bool HUGETLB_PAGE + depends on ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP + depends on SPARSEMEM_VMEMMAP
config ARCH_HAS_GIGANTIC_PAGE bool
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Hildenbrand david@redhat.com
[ Upstream commit 188cac58a8bcdf82c7f63275b68f7a46871e45d6 ]
Sharing page tables between processes but falling back to per-MM page table locks cannot possibly work.
So, let's make sure that we do have split PMD locks by adding a new Kconfig option and letting that depend on CONFIG_SPLIT_PMD_PTLOCKS.
Link: https://lkml.kernel.org/r/20240726150728.3159964-3-david@redhat.com Signed-off-by: David Hildenbrand david@redhat.com Acked-by: Mike Rapoport (Microsoft) rppt@kernel.org Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: Borislav Petkov bp@alien8.de Cc: Boris Ostrovsky boris.ostrovsky@oracle.com Cc: Christian Brauner brauner@kernel.org Cc: Christophe Leroy christophe.leroy@csgroup.eu Cc: Dave Hansen dave.hansen@linux.intel.com Cc: "H. Peter Anvin" hpa@zytor.com Cc: Ingo Molnar mingo@redhat.com Cc: Juergen Gross jgross@suse.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: Muchun Song muchun.song@linux.dev Cc: "Naveen N. Rao" naveen.n.rao@linux.ibm.com Cc: Nicholas Piggin npiggin@gmail.com Cc: Oscar Salvador osalvador@suse.de Cc: Peter Xu peterx@redhat.com Cc: Russell King linux@armlinux.org.uk Cc: Thomas Gleixner tglx@linutronix.de Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/Kconfig | 4 ++++ include/linux/hugetlb.h | 5 ++--- mm/hugetlb.c | 8 ++++---- 3 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/fs/Kconfig b/fs/Kconfig index 0ad3c7c7e984..02a9237807a7 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -284,6 +284,10 @@ config HUGETLB_PAGE_OPTIMIZE_VMEMMAP depends on ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP depends on SPARSEMEM_VMEMMAP
+config HUGETLB_PMD_PAGE_TABLE_SHARING + def_bool HUGETLB_PAGE + depends on ARCH_WANT_HUGE_PMD_SHARE && SPLIT_PMD_PTLOCKS + config ARCH_HAS_GIGANTIC_PAGE bool
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 0c50c4fceb95..0ca93c7574ad 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -1239,7 +1239,7 @@ static inline __init void hugetlb_cma_reserve(int order) } #endif
-#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE +#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING static inline bool hugetlb_pmd_shared(pte_t *pte) { return page_count(virt_to_page(pte)) > 1; @@ -1275,8 +1275,7 @@ bool __vma_private_lock(struct vm_area_struct *vma); static inline pte_t * hugetlb_walk(struct vm_area_struct *vma, unsigned long addr, unsigned long sz) { -#if defined(CONFIG_HUGETLB_PAGE) && \ - defined(CONFIG_ARCH_WANT_HUGE_PMD_SHARE) && defined(CONFIG_LOCKDEP) +#if defined(CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING) && defined(CONFIG_LOCKDEP) struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
/* diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 92b955cc5a41..5b8cc558ab6e 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6907,7 +6907,7 @@ long hugetlb_unreserve_pages(struct inode *inode, long start, long end, return 0; }
-#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE +#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING static unsigned long page_table_shareable(struct vm_area_struct *svma, struct vm_area_struct *vma, unsigned long addr, pgoff_t idx) @@ -7069,7 +7069,7 @@ int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma, return 1; }
-#else /* !CONFIG_ARCH_WANT_HUGE_PMD_SHARE */ +#else /* !CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING */
pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, pud_t *pud) @@ -7092,7 +7092,7 @@ bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr) { return false; } -#endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */ +#endif /* CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING */
#ifdef CONFIG_ARCH_WANT_GENERAL_HUGETLB pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, @@ -7190,7 +7190,7 @@ unsigned long hugetlb_mask_last_page(struct hstate *h) /* See description above. Architectures can provide their own version. */ __weak unsigned long hugetlb_mask_last_page(struct hstate *h) { -#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE +#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING if (huge_page_size(h) == PMD_SIZE) return PUD_SIZE - PMD_SIZE; #endif
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liu Shixin liushixin2@huawei.com
[ Upstream commit 59d9094df3d79443937add8700b2ef1a866b1081 ]
The folio refcount may be increased unexpectly through try_get_folio() by caller such as split_huge_pages. In huge_pmd_unshare(), we use refcount to check whether a pmd page table is shared. The check is incorrect if the refcount is increased by the above caller, and this can cause the page table leaked:
BUG: Bad page state in process sh pfn:109324 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x66 pfn:0x109324 flags: 0x17ffff800000000(node=0|zone=2|lastcpupid=0xfffff) page_type: f2(table) raw: 017ffff800000000 0000000000000000 0000000000000000 0000000000000000 raw: 0000000000000066 0000000000000000 00000000f2000000 0000000000000000 page dumped because: nonzero mapcount ... CPU: 31 UID: 0 PID: 7515 Comm: sh Kdump: loaded Tainted: G B 6.13.0-rc2master+ #7 Tainted: [B]=BAD_PAGE Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 Call trace: show_stack+0x20/0x38 (C) dump_stack_lvl+0x80/0xf8 dump_stack+0x18/0x28 bad_page+0x8c/0x130 free_page_is_bad_report+0xa4/0xb0 free_unref_page+0x3cc/0x620 __folio_put+0xf4/0x158 split_huge_pages_all+0x1e0/0x3e8 split_huge_pages_write+0x25c/0x2d8 full_proxy_write+0x64/0xd8 vfs_write+0xcc/0x280 ksys_write+0x70/0x110 __arm64_sys_write+0x24/0x38 invoke_syscall+0x50/0x120 el0_svc_common.constprop.0+0xc8/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x34/0x128 el0t_64_sync_handler+0xc8/0xd0 el0t_64_sync+0x190/0x198
The issue may be triggered by damon, offline_page, page_idle, etc, which will increase the refcount of page table.
1. The page table itself will be discarded after reporting the "nonzero mapcount".
2. The HugeTLB page mapped by the page table miss freeing since we treat the page table as shared and a shared page table will not be unmapped.
Fix it by introducing independent PMD page table shared count. As described by comment, pt_index/pt_mm/pt_frag_refcount are used for s390 gmap, x86 pgds and powerpc, pt_share_count is used for x86/arm64/riscv pmds, so we can reuse the field as pt_share_count.
Link: https://lkml.kernel.org/r/20241216071147.3984217-1-liushixin2@huawei.com Fixes: 39dde65c9940 ("[PATCH] shared page table for hugetlb page") Signed-off-by: Liu Shixin liushixin2@huawei.com Cc: Kefeng Wang wangkefeng.wang@huawei.com Cc: Ken Chen kenneth.w.chen@intel.com Cc: Muchun Song muchun.song@linux.dev Cc: Nanyong Sun sunnanyong@huawei.com Cc: Jane Chu jane.chu@oracle.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/mm.h | 1 + include/linux/mm_types.h | 30 ++++++++++++++++++++++++++++++ mm/hugetlb.c | 16 +++++++--------- 3 files changed, 38 insertions(+), 9 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h index b6a4d6471b4a..209370f64436 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3031,6 +3031,7 @@ static inline bool pagetable_pmd_ctor(struct ptdesc *ptdesc) if (!pmd_ptlock_init(ptdesc)) return false; __folio_set_pgtable(folio); + ptdesc_pmd_pts_init(ptdesc); lruvec_stat_add_folio(folio, NR_PAGETABLE); return true; } diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 1f224c55fb58..e77d4a5c0bac 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -399,6 +399,7 @@ FOLIO_MATCH(compound_head, _head_2a); * @__page_mapping: Aliases with page->mapping. Unused for page tables. * @pt_mm: Used for x86 pgds. * @pt_frag_refcount: For fragmented page table tracking. Powerpc only. + * @pt_share_count: Used for HugeTLB PMD page table share count. * @_pt_pad_2: Padding to ensure proper alignment. * @ptl: Lock for the page table. * @__page_type: Same as page->page_type. Unused for page tables. @@ -424,6 +425,9 @@ struct ptdesc { union { struct mm_struct *pt_mm; atomic_t pt_frag_refcount; +#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING + atomic_t pt_share_count; +#endif };
union { @@ -468,6 +472,32 @@ static_assert(sizeof(struct ptdesc) <= sizeof(struct page)); const struct page *: (const struct ptdesc *)(p), \ struct page *: (struct ptdesc *)(p)))
+#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING +static inline void ptdesc_pmd_pts_init(struct ptdesc *ptdesc) +{ + atomic_set(&ptdesc->pt_share_count, 0); +} + +static inline void ptdesc_pmd_pts_inc(struct ptdesc *ptdesc) +{ + atomic_inc(&ptdesc->pt_share_count); +} + +static inline void ptdesc_pmd_pts_dec(struct ptdesc *ptdesc) +{ + atomic_dec(&ptdesc->pt_share_count); +} + +static inline int ptdesc_pmd_pts_count(struct ptdesc *ptdesc) +{ + return atomic_read(&ptdesc->pt_share_count); +} +#else +static inline void ptdesc_pmd_pts_init(struct ptdesc *ptdesc) +{ +} +#endif + /* * Used for sizing the vmemmap region on some architectures */ diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 5b8cc558ab6e..21c12519a7ef 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -7014,7 +7014,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma, spte = hugetlb_walk(svma, saddr, vma_mmu_pagesize(svma)); if (spte) { - get_page(virt_to_page(spte)); + ptdesc_pmd_pts_inc(virt_to_ptdesc(spte)); break; } } @@ -7029,7 +7029,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma, (pmd_t *)((unsigned long)spte & PAGE_MASK)); mm_inc_nr_pmds(mm); } else { - put_page(virt_to_page(spte)); + ptdesc_pmd_pts_dec(virt_to_ptdesc(spte)); } spin_unlock(&mm->page_table_lock); out: @@ -7041,10 +7041,6 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma, /* * unmap huge page backed by shared pte. * - * Hugetlb pte page is ref counted at the time of mapping. If pte is shared - * indicated by page_count > 1, unmap is achieved by clearing pud and - * decrementing the ref count. If count == 1, the pte page is not shared. - * * Called with page table lock held. * * returns: 1 successfully unmapped a shared pte page @@ -7053,18 +7049,20 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma, int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, pte_t *ptep) { + unsigned long sz = huge_page_size(hstate_vma(vma)); pgd_t *pgd = pgd_offset(mm, addr); p4d_t *p4d = p4d_offset(pgd, addr); pud_t *pud = pud_offset(p4d, addr);
i_mmap_assert_write_locked(vma->vm_file->f_mapping); hugetlb_vma_assert_locked(vma); - BUG_ON(page_count(virt_to_page(ptep)) == 0); - if (page_count(virt_to_page(ptep)) == 1) + if (sz != PMD_SIZE) + return 0; + if (!ptdesc_pmd_pts_count(virt_to_ptdesc(ptep))) return 0;
pud_clear(pud); - put_page(virt_to_page(ptep)); + ptdesc_pmd_pts_dec(virt_to_ptdesc(ptep)); mm_dec_nr_pmds(mm); return 1; }
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexandre Ghiti alexghiti@rivosinc.com
[ Upstream commit c97bf629963e52b205ed5fbaf151e5bd342f9c63 ]
For now, we use stop_machine() to patch the text and when we use IPIs for remote icache flushes (which is emitted in patch_text_nosync()), the system hangs.
So instead, make sure every CPU executes the stop_machine() patching function and emit a local icache flush there.
Co-developed-by: Björn Töpel bjorn@rivosinc.com Signed-off-by: Björn Töpel bjorn@rivosinc.com Signed-off-by: Alexandre Ghiti alexghiti@rivosinc.com Reviewed-by: Andrea Parri parri.andrea@gmail.com Link: https://lore.kernel.org/r/20240229121056.203419-3-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Stable-dep-of: 13134cc94914 ("riscv: kprobes: Fix incorrect address calculation") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/patch.h | 1 + arch/riscv/kernel/ftrace.c | 44 ++++++++++++++++++++++++++++++---- arch/riscv/kernel/patch.c | 16 +++++++++---- 3 files changed, 53 insertions(+), 8 deletions(-)
diff --git a/arch/riscv/include/asm/patch.h b/arch/riscv/include/asm/patch.h index e88b52d39eac..9f5d6e14c405 100644 --- a/arch/riscv/include/asm/patch.h +++ b/arch/riscv/include/asm/patch.h @@ -6,6 +6,7 @@ #ifndef _ASM_RISCV_PATCH_H #define _ASM_RISCV_PATCH_H
+int patch_insn_write(void *addr, const void *insn, size_t len); int patch_text_nosync(void *addr, const void *insns, size_t len); int patch_text_set_nosync(void *addr, u8 c, size_t len); int patch_text(void *addr, u32 *insns, int ninsns); diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c index 03a6434a8cdd..6ede2bcce238 100644 --- a/arch/riscv/kernel/ftrace.c +++ b/arch/riscv/kernel/ftrace.c @@ -8,6 +8,7 @@ #include <linux/ftrace.h> #include <linux/uaccess.h> #include <linux/memory.h> +#include <linux/stop_machine.h> #include <asm/cacheflush.h> #include <asm/patch.h>
@@ -75,8 +76,7 @@ static int __ftrace_modify_call(unsigned long hook_pos, unsigned long target, make_call_t0(hook_pos, target, call);
/* Replace the auipc-jalr pair at once. Return -EPERM on write error. */ - if (patch_text_nosync - ((void *)hook_pos, enable ? call : nops, MCOUNT_INSN_SIZE)) + if (patch_insn_write((void *)hook_pos, enable ? call : nops, MCOUNT_INSN_SIZE)) return -EPERM;
return 0; @@ -88,7 +88,7 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
make_call_t0(rec->ip, addr, call);
- if (patch_text_nosync((void *)rec->ip, call, MCOUNT_INSN_SIZE)) + if (patch_insn_write((void *)rec->ip, call, MCOUNT_INSN_SIZE)) return -EPERM;
return 0; @@ -99,7 +99,7 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec, { unsigned int nops[2] = {NOP4, NOP4};
- if (patch_text_nosync((void *)rec->ip, nops, MCOUNT_INSN_SIZE)) + if (patch_insn_write((void *)rec->ip, nops, MCOUNT_INSN_SIZE)) return -EPERM;
return 0; @@ -134,6 +134,42 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
return ret; } + +struct ftrace_modify_param { + int command; + atomic_t cpu_count; +}; + +static int __ftrace_modify_code(void *data) +{ + struct ftrace_modify_param *param = data; + + if (atomic_inc_return(¶m->cpu_count) == num_online_cpus()) { + ftrace_modify_all_code(param->command); + /* + * Make sure the patching store is effective *before* we + * increment the counter which releases all waiting CPUs + * by using the release variant of atomic increment. The + * release pairs with the call to local_flush_icache_all() + * on the waiting CPU. + */ + atomic_inc_return_release(¶m->cpu_count); + } else { + while (atomic_read(¶m->cpu_count) <= num_online_cpus()) + cpu_relax(); + } + + local_flush_icache_all(); + + return 0; +} + +void arch_ftrace_update_code(int command) +{ + struct ftrace_modify_param param = { command, ATOMIC_INIT(0) }; + + stop_machine(__ftrace_modify_code, ¶m, cpu_online_mask); +} #endif
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c index 30e12b310cab..78387d843aa5 100644 --- a/arch/riscv/kernel/patch.c +++ b/arch/riscv/kernel/patch.c @@ -196,7 +196,7 @@ int patch_text_set_nosync(void *addr, u8 c, size_t len) } NOKPROBE_SYMBOL(patch_text_set_nosync);
-static int patch_insn_write(void *addr, const void *insn, size_t len) +int patch_insn_write(void *addr, const void *insn, size_t len) { size_t patched = 0; size_t size; @@ -240,16 +240,24 @@ static int patch_text_cb(void *data) if (atomic_inc_return(&patch->cpu_count) == num_online_cpus()) { for (i = 0; ret == 0 && i < patch->ninsns; i++) { len = GET_INSN_LENGTH(patch->insns[i]); - ret = patch_text_nosync(patch->addr + i * len, - &patch->insns[i], len); + ret = patch_insn_write(patch->addr + i * len, &patch->insns[i], len); } - atomic_inc(&patch->cpu_count); + /* + * Make sure the patching store is effective *before* we + * increment the counter which releases all waiting CPUs + * by using the release variant of atomic increment. The + * release pairs with the call to local_flush_icache_all() + * on the waiting CPU. + */ + atomic_inc_return_release(&patch->cpu_count); } else { while (atomic_read(&patch->cpu_count) <= num_online_cpus()) cpu_relax(); smp_mb(); }
+ local_flush_icache_all(); + return ret; } NOKPROBE_SYMBOL(patch_text_cb);
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Golle daniel@makrotopia.org
[ Upstream commit f8d9b91739e1fb436447c437a346a36deb676a36 ]
Touching DISP_REG_OVL_PITCH_MSB leads to video overlay on MT2701, MT7623N and probably other older SoCs being broken.
Move setting up AFBC layer configuration into a separate function only being called on hardware which actually supports AFBC which restores the behavior as it was before commit c410fa9b07c3 ("drm/mediatek: Add AFBC support to Mediatek DRM driver") on non-AFBC hardware.
Fixes: c410fa9b07c3 ("drm/mediatek: Add AFBC support to Mediatek DRM driver") Cc: stable@vger.kernel.org Signed-off-by: Daniel Golle daniel@makrotopia.org Reviewed-by: CK Hu ck.hu@mediatek.com Link: https://patchwork.kernel.org/project/dri-devel/patch/c7fbd3c3e633c0b7dd6d1cd... Signed-off-by: Chun-Kuang Hu chunkuang.hu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 57 +++++++++++++------------ 1 file changed, 29 insertions(+), 28 deletions(-)
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c index 6f15069da8b0..ce0f441e3f13 100644 --- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c +++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c @@ -403,6 +403,29 @@ static unsigned int ovl_fmt_convert(struct mtk_disp_ovl *ovl, unsigned int fmt) } }
+static void mtk_ovl_afbc_layer_config(struct mtk_disp_ovl *ovl, + unsigned int idx, + struct mtk_plane_pending_state *pending, + struct cmdq_pkt *cmdq_pkt) +{ + unsigned int pitch_msb = pending->pitch >> 16; + unsigned int hdr_pitch = pending->hdr_pitch; + unsigned int hdr_addr = pending->hdr_addr; + + if (pending->modifier != DRM_FORMAT_MOD_LINEAR) { + mtk_ddp_write_relaxed(cmdq_pkt, hdr_addr, &ovl->cmdq_reg, ovl->regs, + DISP_REG_OVL_HDR_ADDR(ovl, idx)); + mtk_ddp_write_relaxed(cmdq_pkt, + OVL_PITCH_MSB_2ND_SUBBUF | pitch_msb, + &ovl->cmdq_reg, ovl->regs, DISP_REG_OVL_PITCH_MSB(idx)); + mtk_ddp_write_relaxed(cmdq_pkt, hdr_pitch, &ovl->cmdq_reg, ovl->regs, + DISP_REG_OVL_HDR_PITCH(ovl, idx)); + } else { + mtk_ddp_write_relaxed(cmdq_pkt, pitch_msb, + &ovl->cmdq_reg, ovl->regs, DISP_REG_OVL_PITCH_MSB(idx)); + } +} + void mtk_ovl_layer_config(struct device *dev, unsigned int idx, struct mtk_plane_state *state, struct cmdq_pkt *cmdq_pkt) @@ -410,24 +433,12 @@ void mtk_ovl_layer_config(struct device *dev, unsigned int idx, struct mtk_disp_ovl *ovl = dev_get_drvdata(dev); struct mtk_plane_pending_state *pending = &state->pending; unsigned int addr = pending->addr; - unsigned int hdr_addr = pending->hdr_addr; - unsigned int pitch = pending->pitch; - unsigned int hdr_pitch = pending->hdr_pitch; + unsigned int pitch_lsb = pending->pitch & GENMASK(15, 0); unsigned int fmt = pending->format; unsigned int offset = (pending->y << 16) | pending->x; unsigned int src_size = (pending->height << 16) | pending->width; unsigned int ignore_pixel_alpha = 0; unsigned int con; - bool is_afbc = pending->modifier != DRM_FORMAT_MOD_LINEAR; - union overlay_pitch { - struct split_pitch { - u16 lsb; - u16 msb; - } split_pitch; - u32 pitch; - } overlay_pitch; - - overlay_pitch.pitch = pitch;
if (!pending->enable) { mtk_ovl_layer_off(dev, idx, cmdq_pkt); @@ -457,11 +468,12 @@ void mtk_ovl_layer_config(struct device *dev, unsigned int idx, }
if (ovl->data->supports_afbc) - mtk_ovl_set_afbc(ovl, cmdq_pkt, idx, is_afbc); + mtk_ovl_set_afbc(ovl, cmdq_pkt, idx, + pending->modifier != DRM_FORMAT_MOD_LINEAR);
mtk_ddp_write_relaxed(cmdq_pkt, con, &ovl->cmdq_reg, ovl->regs, DISP_REG_OVL_CON(idx)); - mtk_ddp_write_relaxed(cmdq_pkt, overlay_pitch.split_pitch.lsb | ignore_pixel_alpha, + mtk_ddp_write_relaxed(cmdq_pkt, pitch_lsb | ignore_pixel_alpha, &ovl->cmdq_reg, ovl->regs, DISP_REG_OVL_PITCH(idx)); mtk_ddp_write_relaxed(cmdq_pkt, src_size, &ovl->cmdq_reg, ovl->regs, DISP_REG_OVL_SRC_SIZE(idx)); @@ -470,19 +482,8 @@ void mtk_ovl_layer_config(struct device *dev, unsigned int idx, mtk_ddp_write_relaxed(cmdq_pkt, addr, &ovl->cmdq_reg, ovl->regs, DISP_REG_OVL_ADDR(ovl, idx));
- if (is_afbc) { - mtk_ddp_write_relaxed(cmdq_pkt, hdr_addr, &ovl->cmdq_reg, ovl->regs, - DISP_REG_OVL_HDR_ADDR(ovl, idx)); - mtk_ddp_write_relaxed(cmdq_pkt, - OVL_PITCH_MSB_2ND_SUBBUF | overlay_pitch.split_pitch.msb, - &ovl->cmdq_reg, ovl->regs, DISP_REG_OVL_PITCH_MSB(idx)); - mtk_ddp_write_relaxed(cmdq_pkt, hdr_pitch, &ovl->cmdq_reg, ovl->regs, - DISP_REG_OVL_HDR_PITCH(ovl, idx)); - } else { - mtk_ddp_write_relaxed(cmdq_pkt, - overlay_pitch.split_pitch.msb, - &ovl->cmdq_reg, ovl->regs, DISP_REG_OVL_PITCH_MSB(idx)); - } + if (ovl->data->supports_afbc) + mtk_ovl_afbc_layer_config(ovl, idx, pending, cmdq_pkt);
mtk_ovl_set_bit_depth(dev, idx, fmt, cmdq_pkt); mtk_ovl_layer_on(dev, idx, cmdq_pkt);
Hi!
This is the start of the stable review cycle for the 6.6.72 release. There are 129 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
CIP testing did not find any problems here:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-6...
6.12 passes our testing, too:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-6...
Tested-by: Pavel Machek (CIP) pavel@denx.de
Best regards, Pavel
On Wed, 15 Jan 2025 11:36:15 +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.72 release. There are 129 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 17 Jan 2025 10:34:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.72-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v6.6: 10 builds: 10 pass, 0 fail 26 boots: 26 pass, 0 fail 116 tests: 116 pass, 0 fail
Linux version: 6.6.72-rc1-g6a7137c98fe3 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On Wed, Jan 15, 2025 at 11:36:15AM +0100, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.72 release. There are 129 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Tested-by: Mark Brown broonie@kernel.org
On 1/15/25 02:36, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.72 release. There are 129 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 17 Jan 2025 10:34:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.72-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
On 1/15/25 03:36, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.72 release. There are 129 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 17 Jan 2025 10:34:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.72-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On 1/15/25 02:36, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.6.72 release. There are 129 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 17 Jan 2025 10:34:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.72-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Wed, 15 Jan 2025 at 16:25, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.6.72 release. There are 129 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 17 Jan 2025 10:34:58 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.72-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 6.6.72-rc1 * git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git * git commit: 6a7137c98fe395a5691fdf06ed53c9b1df1fb3a3 * git describe: v6.6.71-130-g6a7137c98fe3 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.6.y/build/v6.6.71...
## Test Regressions (compared to v6.6.69-223-g5652330123c6)
## Metric Regressions (compared to v6.6.69-223-g5652330123c6)
## Test Fixes (compared to v6.6.69-223-g5652330123c6)
## Metric Fixes (compared to v6.6.69-223-g5652330123c6)
## Test result summary total: 96734, pass: 78107, fail: 2971, skip: 15248, xfail: 408
## Build Summary * arc: 6 total, 5 passed, 1 failed * arm: 133 total, 133 passed, 0 failed * arm64: 46 total, 44 passed, 2 failed * i386: 31 total, 28 passed, 3 failed * mips: 30 total, 25 passed, 5 failed * parisc: 5 total, 5 passed, 0 failed * powerpc: 36 total, 32 passed, 4 failed * riscv: 23 total, 22 passed, 1 failed * s390: 18 total, 14 passed, 4 failed * sh: 12 total, 10 passed, 2 failed * sparc: 9 total, 8 passed, 1 failed * x86_64: 38 total, 37 passed, 1 failed
## Test suites summary * boot * commands * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-efivarfs * kselftest-exec * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-filesystems-epoll * kselftest-firmware * kselftest-fpu * kselftest-ftrace * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-kcmp * kselftest-kvm * kselftest-membarrier * kselftest-memfd * kselftest-mincore * kselftest-mqueue * kselftest-net * kselftest-net-mptcp * kselftest-openat2 * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-tc-testing * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user_events * kselftest-vDSO * kselftest-x86 * kunit * kvm-unit-tests * libgpiod * libhugetlbfs * log-parser-boot * log-parser-build-clang * log-parser-build-gcc * log-parser-test * ltp-capability * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-hugetlb * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-smoke * ltp-syscalls * ltp-tracing * perf * rcutorture
-- Linaro LKFT https://lkft.linaro.org
Am 15.01.2025 um 11:36 schrieb Greg Kroah-Hartman:
This is the start of the stable review cycle for the 6.6.72 release. There are 129 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Builds, boots and works on my 2-socket Ivy Bridge Xeon E5-2697 v2 server. No dmesg oddities or regressions found.
Tested-by: Peter Schneider pschneider1968@googlemail.com
Beste Grüße, Peter Schneider
The kernel, bpf tool, amd kselftest tool builds fine for v6.6.72-rc1 on x86 and arm64 Azure VM.
Tested-by: Hardik Garg hargar@linux.microsoft.com
Thanks, Hardik
linux-stable-mirror@lists.linaro.org