This is the start of the stable review cycle for the 6.12.57 release.
There are 40 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 02 Nov 2025 14:00:34 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.57-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.12.57-rc1
Edward Cree <ecree.xilinx(a)gmail.com>
sfc: fix NULL dereferences in ef100_process_design_param()
Xiaogang Chen <xiaogang.chen(a)amd.com>
udmabuf: fix a buf size overflow issue during udmabuf creation
Aditya Kumar Singh <quic_adisi(a)quicinc.com>
wifi: ath12k: fix read pointer after free in ath12k_mac_assign_vif_to_vdev()
Kees Bakker <kees(a)ijzerbout.nl>
iommu/vt-d: Avoid use of NULL after WARN_ON_ONCE
William Breathitt Gray <wbg(a)kernel.org>
gpio: idio-16: Define fixed direction of the GPIO lines
Ioana Ciornei <ioana.ciornei(a)nxp.com>
gpio: regmap: add the .fixed_direction_output configuration parameter
Mathieu Dubois-Briand <mathieu.dubois-briand(a)bootlin.com>
gpio: regmap: Allow to allocate regmap-irq device
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
bits: introduce fixed-type GENMASK_U*()
Vincent Mailhol <mailhol.vincent(a)wanadoo.fr>
bits: add comments and newlines to #if, #else and #endif directives
Wang Liang <wangliang74(a)huawei.com>
bonding: check xdp prog when set bond mode
Hangbin Liu <liuhangbin(a)gmail.com>
bonding: return detailed error when loading native XDP fails
Alexander Wetzel <Alexander(a)wetzel-home.de>
wifi: cfg80211: Add missing lock in cfg80211_check_and_end_cac()
Chao Yu <chao(a)kernel.org>
f2fs: fix to avoid panic once fallocation fails for pinfile
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
mptcp: pm: in-kernel: C-flag: handle late ADD_ADDR
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
selftests: mptcp: join: mark 'delete re-add signal' as skipped if not supported
Geliang Tang <geliang(a)kernel.org>
selftests: mptcp: disable add_addr retrans in endpoint_tests
Jonathan Corbet <corbet(a)lwn.net>
docs: kdoc: handle the obsolescensce of docutils.ErrorString()
Menglong Dong <menglong8.dong(a)gmail.com>
arch: Add the macro COMPILE_OFFSETS to all the asm-offsets.c
Tejun Heo <tj(a)kernel.org>
sched_ext: Make qmap dump operation non-destructive
Filipe Manana <fdmanana(a)suse.com>
btrfs: use smp_mb__after_atomic() when forcing COW in create_pending_snapshot()
Qu Wenruo <wqu(a)suse.com>
btrfs: tree-checker: add inode extref checks
Filipe Manana <fdmanana(a)suse.com>
btrfs: abort transaction if we fail to update inode in log replay dir fixup
Filipe Manana <fdmanana(a)suse.com>
btrfs: use level argument in log tree walk callback replay_one_buffer()
Filipe Manana <fdmanana(a)suse.com>
btrfs: always drop log root tree reference in btrfs_replay_log()
Thorsten Blum <thorsten.blum(a)linux.dev>
btrfs: scrub: replace max_t()/min_t() with clamp() in scrub_throttle_dev_io()
Naohiro Aota <naohiro.aota(a)wdc.com>
btrfs: zoned: refine extent allocator hint selection
Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
btrfs: zoned: return error from btrfs_zone_finish_endio()
Filipe Manana <fdmanana(a)suse.com>
btrfs: abort transaction in the process_one_buffer() log tree walk callback
Filipe Manana <fdmanana(a)suse.com>
btrfs: abort transaction on specific error places when walking log tree
Chen Ridong <chenridong(a)huawei.com>
cpuset: Use new excpus for nocpu error check when enabling root partition
Avadhut Naik <avadhut.naik(a)amd.com>
EDAC/mc_sysfs: Increase legacy channel support to 16
David Kaplan <david.kaplan(a)amd.com>
x86/bugs: Fix reporting of LFENCE retpoline
David Kaplan <david.kaplan(a)amd.com>
x86/bugs: Report correct retbleed mitigation status
Jiri Olsa <jolsa(a)kernel.org>
seccomp: passthrough uprobe systemcall without filtering
Josh Poimboeuf <jpoimboe(a)kernel.org>
perf: Skip user unwind if the task is a kernel thread
Josh Poimboeuf <jpoimboe(a)kernel.org>
perf: Have get_perf_callchain() return NULL if crosstask and user are set
Steven Rostedt <rostedt(a)goodmis.org>
perf: Use current->flags & PF_KTHREAD|PF_USER_WORKER instead of current->mm == NULL
Dapeng Mi <dapeng1.mi(a)linux.intel.com>
perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK
Richard Guy Briggs <rgb(a)redhat.com>
audit: record fanotify event regardless of presence of rules
Xiang Mei <xmei5(a)asu.edu>
net/sched: sch_qfq: Fix null-deref in agg_dequeue
-------------
Diffstat:
Documentation/sphinx/kernel_abi.py | 4 +-
Documentation/sphinx/kernel_feat.py | 4 +-
Documentation/sphinx/kernel_include.py | 6 ++-
Documentation/sphinx/maintainers_include.py | 4 +-
Makefile | 4 +-
arch/alpha/kernel/asm-offsets.c | 1 +
arch/arc/kernel/asm-offsets.c | 1 +
arch/arm/kernel/asm-offsets.c | 2 +
arch/arm64/kernel/asm-offsets.c | 1 +
arch/csky/kernel/asm-offsets.c | 1 +
arch/hexagon/kernel/asm-offsets.c | 1 +
arch/loongarch/kernel/asm-offsets.c | 2 +
arch/m68k/kernel/asm-offsets.c | 1 +
arch/microblaze/kernel/asm-offsets.c | 1 +
arch/mips/kernel/asm-offsets.c | 2 +
arch/nios2/kernel/asm-offsets.c | 1 +
arch/openrisc/kernel/asm-offsets.c | 1 +
arch/parisc/kernel/asm-offsets.c | 1 +
arch/powerpc/kernel/asm-offsets.c | 1 +
arch/riscv/kernel/asm-offsets.c | 1 +
arch/s390/kernel/asm-offsets.c | 1 +
arch/sh/kernel/asm-offsets.c | 1 +
arch/sparc/kernel/asm-offsets.c | 1 +
arch/um/kernel/asm-offsets.c | 2 +
arch/x86/events/intel/core.c | 10 ++--
arch/x86/include/asm/perf_event.h | 6 ++-
arch/x86/kernel/cpu/bugs.c | 9 ++--
arch/x86/kvm/pmu.h | 2 +-
arch/xtensa/kernel/asm-offsets.c | 1 +
drivers/dma-buf/udmabuf.c | 2 +-
drivers/edac/edac_mc_sysfs.c | 24 ++++++++++
drivers/gpio/gpio-idio-16.c | 5 ++
drivers/gpio/gpio-regmap.c | 53 ++++++++++++++++++--
drivers/iommu/intel/iommu.c | 7 +--
drivers/net/bonding/bond_main.c | 11 +++--
drivers/net/bonding/bond_options.c | 3 ++
drivers/net/ethernet/sfc/ef100_netdev.c | 6 +--
drivers/net/ethernet/sfc/ef100_nic.c | 47 ++++++++----------
drivers/net/wireless/ath/ath12k/mac.c | 6 +--
fs/btrfs/disk-io.c | 2 +-
fs/btrfs/extent-tree.c | 6 ++-
fs/btrfs/inode.c | 7 +--
fs/btrfs/scrub.c | 3 +-
fs/btrfs/transaction.c | 2 +-
fs/btrfs/tree-checker.c | 37 ++++++++++++++
fs/btrfs/tree-log.c | 64 +++++++++++++++++++------
fs/btrfs/zoned.c | 8 ++--
fs/btrfs/zoned.h | 9 ++--
fs/f2fs/file.c | 8 ++--
fs/f2fs/segment.c | 20 ++++----
include/linux/audit.h | 2 +-
include/linux/bitops.h | 1 -
include/linux/bits.h | 38 ++++++++++++++-
include/linux/gpio/regmap.h | 16 +++++++
include/net/bonding.h | 1 +
include/net/pkt_sched.h | 25 +++++++++-
kernel/cgroup/cpuset.c | 6 +--
kernel/events/callchain.c | 16 +++----
kernel/events/core.c | 7 +--
kernel/seccomp.c | 32 ++++++++++---
net/mptcp/pm_netlink.c | 6 +++
net/sched/sch_api.c | 10 ----
net/sched/sch_hfsc.c | 16 -------
net/sched/sch_qfq.c | 2 +-
net/wireless/reg.c | 4 ++
tools/sched_ext/scx_qmap.bpf.c | 18 ++++++-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 3 +-
67 files changed, 442 insertions(+), 164 deletions(-)
Hi,
This Linux kernel patch series introduces support for error recovery for
passthrough PCI devices on System Z (s390x).
Background
----------
For PCI devices on s390x, an operating system receives platform-specific
error events from firmware rather than through AER. Today, for
passthrough/userspace devices, we don't attempt any error recovery and
ignore any error events for the devices. The passthrough/userspace devices
are managed by the vfio-pci driver. The driver does register error handling
callbacks (error_detected), and on an error triggers an eventfd to
userspace. But we need a mechanism to notify userspace
(QEMU/guest/userspace drivers) about the error event.
Proposal
--------
We can expose this error information (currently only the PCI Error Code)
via a device feature. Userspace can then obtain the error information
via VFIO_DEVICE_FEATURE ioctl and take appropriate actions such as driving
a device reset.
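As a rough illustration of the userspace side, the sketch below fills in a
"get" request buffer of the kind passed to the VFIO_DEVICE_FEATURE ioctl. It
is only a sketch under stated assumptions: the DEMO_ names are local
stand-ins, the flag values mirror the uapi header as I recall it, and the
feature index and payload layout are hypothetical (the real ones are defined
by the series, not by this example).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of the uapi flag values, so the sketch builds without kernel headers. */
#define DEMO_FEATURE_MASK 0xffffu      /* low 16 bits carry the feature index */
#define DEMO_FEATURE_GET  (1u << 16)   /* ask the kernel to fill in the payload */

/* Hypothetical feature index, chosen only for this example. */
#define DEMO_FEATURE_ZPCI_ERROR 0x7fu

/* Mirrors the fixed header of struct vfio_device_feature; the payload follows it. */
struct demo_feature_hdr {
	uint32_t argsz;
	uint32_t flags;
};

/* Hypothetical payload: the text only says the PCI Error Code is exposed. */
struct demo_zpci_err {
	uint16_t pec;
	uint16_t reserved[3];
};

/*
 * Build a GET request in buf. Returns the number of bytes used, or 0 if the
 * buffer is too small or the feature index does not fit in the index mask.
 */
size_t demo_build_get_request(void *buf, size_t buf_len, uint32_t feature)
{
	struct demo_feature_hdr hdr;
	size_t need = sizeof(hdr) + sizeof(struct demo_zpci_err);

	assert(buf != NULL);
	if (buf_len < need || (feature & ~DEMO_FEATURE_MASK))
		return 0;

	memset(buf, 0, need);
	hdr.argsz = (uint32_t)need;
	hdr.flags = DEMO_FEATURE_GET | feature;
	memcpy(buf, &hdr, sizeof(hdr));
	return need;
}
```

Userspace would then hand such a buffer to ioctl() on the VFIO device file
descriptor and read the error information back from the payload area.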
This is how a typical flow for passthrough devices to a VM would work:
For passthrough devices to a VM, the driver bound to the device on the host
is vfio-pci. The vfio-pci driver supports the error_detected() callback
(vfio_pci_core_aer_err_detected()), and on a PCI error the s390x recovery
code on the host will call the vfio-pci error_detected() callback. The
vfio-pci error_detected() callback will notify userspace/QEMU via an
eventfd, and return PCI_ERS_RESULT_CAN_RECOVER. At this point the s390x
error recovery on the host will skip any further action (see patch 6) and
let userspace drive the error recovery.
Once userspace/QEMU is notified, it then injects this error into the VM
so device drivers in the VM can take recovery actions. For example for a
passthrough NVMe device, the VM's OS NVMe driver will access the device.
At this point the VM's NVMe driver's error_detected() will drive the
recovery by returning PCI_ERS_RESULT_NEED_RESET, and the s390x error
recovery in the VM's OS will try to do a reset. Resets are privileged
operations and so the VM will need intervention from QEMU to perform the
reset. QEMU will invoke the VFIO_DEVICE_RESET ioctl to notify the
host that the VM is requesting a reset of the device. The vfio-pci driver
on the host will then perform the reset on the device to recover it.
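The host-side decision in the flow above can be sketched as a small pure
function. This is a simplified model, not the s390x recovery code: the enum
names are local stand-ins for the kernel's pci_ers_result_t values, and the
two "actions" are hypothetical labels for "hand over to userspace" versus
"keep going with host recovery".

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the two pci_ers_result_t values named in the text. */
enum demo_ers_result {
	DEMO_ERS_CAN_RECOVER,
	DEMO_ERS_NEED_RESET,
};

/* Hypothetical labels for what the host recovery code does next. */
enum demo_host_action {
	DEMO_HOST_CONTINUE_RECOVERY,
	DEMO_HOST_DEFER_TO_USERSPACE,
};

/*
 * Model of the step after error_detected() returns: for a passthrough
 * device whose driver reports CAN_RECOVER, the host skips further action
 * and lets userspace drive recovery; otherwise host recovery proceeds.
 */
enum demo_host_action demo_after_error_detected(bool is_passthrough,
						enum demo_ers_result res)
{
	assert(res == DEMO_ERS_CAN_RECOVER || res == DEMO_ERS_NEED_RESET);

	if (is_passthrough && res == DEMO_ERS_CAN_RECOVER)
		return DEMO_HOST_DEFER_TO_USERSPACE;
	return DEMO_HOST_CONTINUE_RECOVERY;
}
```

In the VM the same callback protocol runs again, but there the guest driver
returns PCI_ERS_RESULT_NEED_RESET, which is why the second branch exists.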
Thanks
Farhan
ChangeLog
---------
v4 series https://lore.kernel.org/all/20250924171628.826-1-alifm@linux.ibm.com/
v4 -> v5
- Rebase on 6.18-rc5
- Move bug fixes to the beginning of the series (patch 1 and 2). These patches
were posted as a separate fixes series
https://lore.kernel.org/all/a14936ac-47d6-461b-816f-0fd66f869b0f@linux.ibm.…
- Add matching pci_dev_put() for pci_get_slot() (patch 6).
v3 series https://lore.kernel.org/all/20250911183307.1910-1-alifm@linux.ibm.com/
v3 -> v4
- Remove warn messages for each PCI capability not restored (patch 1)
- Check PCI_COMMAND and PCI_STATUS register for error value instead of device id
(patch 1)
- Fix kernel crash in patch 3
- Added reviewed by tags
- Address comments from Niklas (patches 4, 5, 7)
- Fix compilation error on non-s390x systems (patch 8)
- Explicitly align struct vfio_device_feature_zpci_err (patch 8)
v2 series https://lore.kernel.org/all/20250825171226.1602-1-alifm@linux.ibm.com/
v2 -> v3
- Patch 1 avoids saving any config space state if the device is in error
(suggested by Alex)
- Patch 2 adds additional check only for FLR reset to try other function
reset method (suggested by Alex).
- Patch 3 fixes a bug in s390 for resetting PCI devices with multiple
functions. Creates a new flag pci_slot to allow per function slot.
- Patch 4 fixes a bug in s390 for resource to bus address translation.
- Rebase on 6.17-rc5
v1 series https://lore.kernel.org/all/20250813170821.1115-1-alifm@linux.ibm.com/
v1 -> v2
- Patches 1 and 2 add some additional checks for FLR/PM reset to
try other function reset method (suggested by Alex).
- Patch 3 fixes a bug in s390 for resetting PCI devices with multiple
functions.
- Patch 7 adds a new device feature for zPCI devices for the VFIO_DEVICE_FEATURE
ioctl. The ioctl is used by userspace to retrieve any PCI error
information for the device (suggested by Alex).
- Patch 8 adds a reset_done() callback for the vfio-pci driver, to
restore the state of the device after a reset.
- Patch 9 removes the pcie check for triggering VFIO_PCI_ERR_IRQ_INDEX.
Farhan Ali (9):
PCI: Allow per function PCI slots
s390/pci: Add architecture specific resource/bus address translation
PCI: Avoid saving error values for config space
PCI: Add additional checks for flr reset
s390/pci: Update the logic for detecting passthrough device
s390/pci: Store PCI error information for passthrough devices
vfio-pci/zdev: Add a device feature for error information
vfio: Add a reset_done callback for vfio-pci driver
vfio: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX
arch/s390/include/asm/pci.h | 29 ++++++++
arch/s390/pci/pci.c | 75 +++++++++++++++++++++
arch/s390/pci/pci_event.c | 107 +++++++++++++++++-------------
drivers/pci/host-bridge.c | 4 +-
drivers/pci/pci.c | 37 +++++++++--
drivers/pci/pcie/aer.c | 3 +
drivers/pci/pcie/dpc.c | 3 +
drivers/pci/pcie/ptm.c | 3 +
drivers/pci/slot.c | 25 ++++++-
drivers/pci/tph.c | 3 +
drivers/pci/vc.c | 3 +
drivers/vfio/pci/vfio_pci_core.c | 20 ++++--
drivers/vfio/pci/vfio_pci_intrs.c | 3 +-
drivers/vfio/pci/vfio_pci_priv.h | 9 +++
drivers/vfio/pci/vfio_pci_zdev.c | 45 ++++++++++++-
include/linux/pci.h | 1 +
include/uapi/linux/vfio.h | 15 +++++
17 files changed, 321 insertions(+), 64 deletions(-)
--
2.43.0
The function samsung_dsim_parse_dt() calls of_graph_get_endpoint_by_regs()
to get the endpoint device node, but fails to call of_node_put() to release
the reference when the function returns. This results in a device node
reference leak.
Fix this by adding the missing of_node_put() call before returning from
the function.
Found via static analysis and code review.
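The get/put balance the fix restores can be illustrated with a small
userspace model. This is only a sketch: the demo_ types and helpers are
hypothetical stand-ins for the device node, of_graph_get_endpoint_by_regs()
and of_node_put(), and they model nothing beyond the reference count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device node: only the refcount is modelled. */
struct demo_node {
	int refcount;
};

/* Stand-in for the lookup: returns the node with an extra reference, or NULL. */
struct demo_node *demo_get_endpoint(struct demo_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Stand-in for of_node_put(): drops one reference; NULL is a no-op. */
void demo_node_put(struct demo_node *n)
{
	if (n) {
		assert(n->refcount > 0);
		n->refcount--;
	}
}

/* Parse routine that balances the get with a put before returning. */
int demo_parse_dt(struct demo_node *graph, int *endpoint_found)
{
	struct demo_node *endpoint = demo_get_endpoint(graph);

	*endpoint_found = (endpoint != NULL);

	demo_node_put(endpoint); /* without this, every call leaks one reference */
	return 0;
}
```

After any number of calls to demo_parse_dt() the reference count is back at
its starting value, which is the property the patch restores.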
Fixes: 77169a11d4e9 ("drm/bridge: samsung-dsim: add driver support for exynos7870 DSIM bridge")
Cc: stable(a)vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com>
---
drivers/gpu/drm/bridge/samsung-dsim.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c b/drivers/gpu/drm/bridge/samsung-dsim.c
index eabc4c32f6ab..1a5acd5077ad 100644
--- a/drivers/gpu/drm/bridge/samsung-dsim.c
+++ b/drivers/gpu/drm/bridge/samsung-dsim.c
@@ -2086,6 +2086,7 @@ static int samsung_dsim_parse_dt(struct samsung_dsim *dsi)
if (lane_polarities[1])
dsi->swap_dn_dp_data = true;
}
+ of_node_put(endpoint);
return 0;
}
--
2.39.5 (Apple Git-154)
The current LoongArch BPF trampoline implementation is incompatible
with tracing functions in kernel modules. This causes several severe
and user-visible problems:
* Kernel lockups when a BPF program is attached to a module function [0].
* The `bpf_selftests/module_attach` test fails consistently.
* Critical kernel modules like WireGuard experience traffic disruption
when their functions are traced with fentry [1].
Given the severity and the potential for other unknown side-effects,
it is safest to disable the feature entirely for now. This patch
prevents the BPF subsystem from allowing trampoline attachments to
module functions on LoongArch.
This is a temporary mitigation until the core issues in the trampoline
code for module handling can be identified and fixed.
[root@fedora bpf]# ./test_progs -a module_attach -v
bpf_testmod.ko is already unloaded.
Loading bpf_testmod.ko...
Successfully loaded bpf_testmod.ko.
test_module_attach:PASS:skel_open 0 nsec
test_module_attach:PASS:set_attach_target 0 nsec
test_module_attach:PASS:set_attach_target_explicit 0 nsec
test_module_attach:PASS:skel_load 0 nsec
libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
test_module_attach:FAIL:skel_attach skeleton attach failed: -524
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
Successfully unloaded bpf_testmod.ko.
[0]: https://lore.kernel.org/loongarch/CAK3+h2wDmpC-hP4u4pJY8T-yfKyk4yRzpu2LMO+C…
[1]: https://lore.kernel.org/loongarch/CAK3+h2wYcpc+OwdLDUBvg2rF9rvvyc5amfHT-KcF…
Cc: stable(a)vger.kernel.org
Fixes: f9b6b41f0cf3 ("LoongArch: BPF: Add basic bpf trampoline support")
Closes: https://lore.kernel.org/loongarch/CAK3+h2wDmpC-hP4u4pJY8T-yfKyk4yRzpu2LMO+C…
Acked-by: Hengqi Chen <hengqi.chen(a)gmail.com>
Signed-off-by: Vincent Li <vincent.mc.li(a)gmail.com>
---
arch/loongarch/net/bpf_jit.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index cbe53d0b7fb0..49c1d4b95404 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -1624,6 +1624,9 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
/* Direct jump skips 5 NOP instructions */
else if (is_bpf_text_address((unsigned long)orig_call))
orig_call += LOONGARCH_BPF_FENTRY_NBYTES;
+ /* Module tracing not supported - causes kernel lockups */
+ else if (is_module_text_address((unsigned long)orig_call))
+ return -ENOTSUPP;
if (flags & BPF_TRAMP_F_CALL_ORIG) {
move_addr(ctx, LOONGARCH_GPR_A0, (const u64)im);
--
2.38.1
Hi Greg, Sasha,
On 03/11/2025 02:38, gregkh(a)linuxfoundation.org wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> mptcp: drop bogus optimization in __mptcp_check_push()
>
> to the 5.15-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> mptcp-drop-bogus-optimization-in-__mptcp_check_push.patch
> and it can be found in the queue-5.15 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Can you please drop this patch from v5.15? It looks like it is causing
some issues with MP_PRIO tests. I think that's because back then, the
path was selected differently, with the use of 'msk->last_snd' which will
bypass some decisions about where to send the next data.
I will try to check if another version of this patch is needed for v5.15.
Cheers,
Matt
(I kept the patch below just in case some people from the MPTCP ML want
to react.)
> From stable+bounces-192087-greg=kroah.com(a)vger.kernel.org Mon Nov 3 05:15:58 2025
> From: Sasha Levin <sashal(a)kernel.org>
> Date: Sun, 2 Nov 2025 15:15:50 -0500
> Subject: mptcp: drop bogus optimization in __mptcp_check_push()
> To: stable(a)vger.kernel.org
> Cc: Paolo Abeni <pabeni(a)redhat.com>, Geliang Tang <geliang(a)kernel.org>, Mat Martineau <martineau(a)kernel.org>, "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>, Jakub Kicinski <kuba(a)kernel.org>, Sasha Levin <sashal(a)kernel.org>
> Message-ID: <20251102201550.3588174-1-sashal(a)kernel.org>
>
> From: Paolo Abeni <pabeni(a)redhat.com>
>
> [ Upstream commit 27b0e701d3872ba59c5b579a9e8a02ea49ad3d3b ]
>
> Accessing the transmit queue without owning the msk socket lock is
> inherently racy, hence __mptcp_check_push() could actually quit early
> even when there is pending data.
>
> That in turn could cause unexpected tx lock and timeout.
>
> Dropping the early check avoids the race, implicitly relying on later
> tests under the relevant lock. With such change, all the other
> mptcp_send_head() call sites are now under the msk socket lock and we
> can additionally drop the now unneeded annotation on the transmit head
> pointer accesses.
>
> Fixes: 6e628cd3a8f7 ("mptcp: use mptcp release_cb for delayed tasks")
> Cc: stable(a)vger.kernel.org
> Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
> Reviewed-by: Geliang Tang <geliang(a)kernel.org>
> Tested-by: Geliang Tang <geliang(a)kernel.org>
> Reviewed-by: Mat Martineau <martineau(a)kernel.org>
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
> Link: https://patch.msgid.link/20251028-net-mptcp-send-timeout-v1-1-38ffff5a9ec8@…
> Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
> [ split upstream __subflow_push_pending modification across __mptcp_push_pending and __mptcp_subflow_push_pending ]
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
> ---
> net/mptcp/protocol.c | 13 +++++--------
> net/mptcp/protocol.h | 2 +-
> 2 files changed, 6 insertions(+), 9 deletions(-)
>
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1137,7 +1137,7 @@ static void __mptcp_clean_una(struct soc
> if (WARN_ON_ONCE(!msk->recovery))
> break;
>
> - WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
> + msk->first_pending = mptcp_send_next(sk);
> }
>
> dfrag_clear(sk, dfrag);
> @@ -1674,7 +1674,7 @@ void __mptcp_push_pending(struct sock *s
>
> mptcp_update_post_push(msk, dfrag, ret);
> }
> - WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
> + msk->first_pending = mptcp_send_next(sk);
> }
>
> /* at this point we held the socket lock for the last subflow we used */
> @@ -1732,7 +1732,7 @@ static void __mptcp_subflow_push_pending
>
> mptcp_update_post_push(msk, dfrag, ret);
> }
> - WRITE_ONCE(msk->first_pending, mptcp_send_next(sk));
> + msk->first_pending = mptcp_send_next(sk);
> }
>
> out:
> @@ -1850,7 +1850,7 @@ static int mptcp_sendmsg(struct sock *sk
> get_page(dfrag->page);
> list_add_tail(&dfrag->list, &msk->rtx_queue);
> if (!msk->first_pending)
> - WRITE_ONCE(msk->first_pending, dfrag);
> + msk->first_pending = dfrag;
> }
> pr_debug("msk=%p dfrag at seq=%llu len=%u sent=%u new=%d\n", msk,
> dfrag->data_seq, dfrag->data_len, dfrag->already_sent,
> @@ -2645,7 +2645,7 @@ static void __mptcp_clear_xmit(struct so
> struct mptcp_sock *msk = mptcp_sk(sk);
> struct mptcp_data_frag *dtmp, *dfrag;
>
> - WRITE_ONCE(msk->first_pending, NULL);
> + msk->first_pending = NULL;
> list_for_each_entry_safe(dfrag, dtmp, &msk->rtx_queue, list)
> dfrag_clear(sk, dfrag);
> }
> @@ -3114,9 +3114,6 @@ void __mptcp_data_acked(struct sock *sk)
>
> void __mptcp_check_push(struct sock *sk, struct sock *ssk)
> {
> - if (!mptcp_send_head(sk))
> - return;
> -
> if (!sock_owned_by_user(sk)) {
> struct sock *xmit_ssk = mptcp_subflow_get_send(mptcp_sk(sk));
>
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -325,7 +325,7 @@ static inline struct mptcp_data_frag *mp
> {
> const struct mptcp_sock *msk = mptcp_sk(sk);
>
> - return READ_ONCE(msk->first_pending);
> + return msk->first_pending;
> }
>
> static inline struct mptcp_data_frag *mptcp_send_next(struct sock *sk)
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
The receive error handling code is shared between RSCI and all other
SCIF port types, but the RSCI overrun_reg is specified as a memory
offset, while for other SCIF types it is an enum value used to index
into the sci_port_params->regs array, as mentioned above the
sci_serial_in() function.
For RSCI, the overrun_reg is CSR (0x48), causing the sci_getreg() call
inside the sci_handle_fifo_overrun() function to index outside the
bounds of the regs array, which currently has a size of 20, as specified
by SCI_NR_REGS.
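The mismatch can be made concrete with a few lines of C. This is only an
illustration, not what the patch does: the patch avoids the lookup for RSCI
rather than adding a bounds check, and demo_reg and demo_getreg_checked() are
hypothetical names. The two constants are the values quoted above.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_REGS  20   /* size of the regs array, per SCI_NR_REGS */
#define DEMO_RSCI_CSR 0x48 /* RSCI overrun_reg: a memory offset, not an index */

/* Hypothetical stand-in for struct plat_sci_reg. */
struct demo_reg {
	unsigned char offset;
	unsigned char size;
};

/*
 * Look up a register by enum-style index. Returns NULL when the value
 * cannot be a valid index, which is the case for a raw memory offset
 * such as 0x48 (72 decimal) against a 20-entry table.
 */
const struct demo_reg *demo_getreg_checked(const struct demo_reg *regs,
					   unsigned int idx)
{
	assert(regs != NULL);

	if (idx >= DEMO_NR_REGS)
		return NULL;
	return &regs[idx];
}
```

An unchecked lookup with 0x48 would read 52 entries past the end of the
table, which is how unrelated memory ends up interpreted as a register
descriptor.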
Because of this, we end up accessing memory outside of RSCI's
rsci_port_params structure, which, when interpreted as a plat_sci_reg,
happens to have a non-zero size, causing the WARN shown below when
sci_serial_in() is called, as the accidental size does not match the
supported register sizes.
The existence of the overrun_reg needs to be checked because
SCIx_SH3_SCIF_REGTYPE has overrun_reg set to SCLSR, but SCLSR is not
present in the regs array.
Avoid calling sci_getreg() for port types which don't use standard
register handling.
Use the ops->read_reg() and ops->write_reg() functions to properly read
and write registers for RSCI, and change the type of the status variable
to accommodate the 32-bit CSR register.
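Why the width of the status variable matters can be shown with a short
sketch. It assumes the overrun flag sits above bit 15 of the 32-bit
register; the bit position used here is a hypothetical choice for
illustration, and the demo_ names are not from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical overrun flag position in the upper half of a 32-bit register. */
#define DEMO_OVERRUN_MASK (UINT32_C(1) << 24)

/* Old behaviour: a 16-bit status drops the upper half before the mask test. */
bool demo_overrun_seen_u16(uint32_t csr)
{
	uint16_t status = (uint16_t)csr; /* bits 16..31 are lost here */

	return (status & DEMO_OVERRUN_MASK) != 0;
}

/* New behaviour: a 32-bit status keeps the flag visible to the mask test. */
bool demo_overrun_seen_u32(uint32_t csr)
{
	uint32_t status = csr;

	return (status & DEMO_OVERRUN_MASK) != 0;
}
```

With the narrower type the flag can never be observed, and writing the
truncated value back would also clear the upper half of the register.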
sci_getreg() and sci_serial_in() are also called with overrun_reg in the
sci_mpxed_interrupt() interrupt handler, but that code path is not used
for RSCI, as it does not have a muxed interrupt.
------------[ cut here ]------------
Invalid register access
WARNING: CPU: 0 PID: 0 at drivers/tty/serial/sh-sci.c:522 sci_serial_in+0x38/0xac
Modules linked in: renesas_usbhs at24 rzt2h_adc industrialio_adc sha256 cfg80211 bluetooth ecdh_generic ecc rfkill fuse drm backlight ipv6
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.17.0-rc1+ #30 PREEMPT
Hardware name: Renesas RZ/T2H EVK Board based on r9a09g077m44 (DT)
pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : sci_serial_in+0x38/0xac
lr : sci_serial_in+0x38/0xac
sp : ffff800080003e80
x29: ffff800080003e80 x28: ffff800082195b80 x27: 000000000000000d
x26: ffff8000821956d0 x25: 0000000000000000 x24: ffff800082195b80
x23: ffff000180e0d800 x22: 0000000000000010 x21: 0000000000000000
x20: 0000000000000010 x19: ffff000180e72000 x18: 000000000000000a
x17: ffff8002bcee7000 x16: ffff800080000000 x15: 0720072007200720
x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720
x11: 0000000000000058 x10: 0000000000000018 x9 : ffff8000821a6a48
x8 : 0000000000057fa8 x7 : 0000000000000406 x6 : ffff8000821fea48
x5 : ffff00033ef88408 x4 : ffff8002bcee7000 x3 : ffff800082195b80
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff800082195b80
Call trace:
sci_serial_in+0x38/0xac (P)
sci_handle_fifo_overrun.isra.0+0x70/0x134
sci_er_interrupt+0x50/0x39c
__handle_irq_event_percpu+0x48/0x140
handle_irq_event+0x44/0xb0
handle_fasteoi_irq+0xf4/0x1a0
handle_irq_desc+0x34/0x58
generic_handle_domain_irq+0x1c/0x28
gic_handle_irq+0x4c/0x140
call_on_irq_stack+0x30/0x48
do_interrupt_handler+0x80/0x84
el1_interrupt+0x34/0x68
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x6c/0x70
default_idle_call+0x28/0x58 (P)
do_idle+0x1f8/0x250
cpu_startup_entry+0x34/0x3c
rest_init+0xd8/0xe0
console_on_rootfs+0x0/0x6c
__primary_switched+0x88/0x90
---[ end trace 0000000000000000 ]---
Cc: stable(a)vger.kernel.org
Fixes: 0666e3fe95ab ("serial: sh-sci: Add support for RZ/T2H SCI")
Signed-off-by: Cosmin Tanislav <cosmin-gabriel.tanislav.xa(a)renesas.com>
---
drivers/tty/serial/sh-sci.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/drivers/tty/serial/sh-sci.c b/drivers/tty/serial/sh-sci.c
index 538b2f991609..62bb62b82cbe 100644
--- a/drivers/tty/serial/sh-sci.c
+++ b/drivers/tty/serial/sh-sci.c
@@ -1014,16 +1014,18 @@ static int sci_handle_fifo_overrun(struct uart_port *port)
struct sci_port *s = to_sci_port(port);
const struct plat_sci_reg *reg;
int copied = 0;
- u16 status;
+ u32 status;
- reg = sci_getreg(port, s->params->overrun_reg);
- if (!reg->size)
- return 0;
+ if (s->type != SCI_PORT_RSCI) {
+ reg = sci_getreg(port, s->params->overrun_reg);
+ if (!reg->size)
+ return 0;
+ }
- status = sci_serial_in(port, s->params->overrun_reg);
+ status = s->ops->read_reg(port, s->params->overrun_reg);
if (status & s->params->overrun_mask) {
status &= ~s->params->overrun_mask;
- sci_serial_out(port, s->params->overrun_reg, status);
+ s->ops->write_reg(port, s->params->overrun_reg, status);
port->icount.overrun++;
--
2.51.0