September 2025 - Linux-stable-mirror

[PATCH] perf lzma: Fix return value in lzma_is_compressed()

by Miaoqian Lin

lzma_is_compressed() returns -1 on error but is declared as bool. And -1 gets converted to true, which could be misleading. Return false instead to match the declared type. Fixes: 4b57fd44b61b ("perf tools: Add lzma_is_compressed function") Cc: <stable(a)vger.kernel.org> Signed-off-by: Miaoqian Lin <linmq006(a)gmail.com> --- tools/perf/util/lzma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/util/lzma.c b/tools/perf/util/lzma.c index bbcd2ffcf4bd..c355757ed391 100644 --- a/tools/perf/util/lzma.c +++ b/tools/perf/util/lzma.c @@ -120,7 +120,7 @@ bool lzma_is_compressed(const char *input) ssize_t rc; if (fd < 0) - return -1; + return false; rc = read(fd, buf, sizeof(buf)); close(fd); -- 2.39.5 (Apple Git-154)

2 weeks, 5 days

2
1
0 0

FAILED: patch "[PATCH] selftests: mptcp: connect: catch IO errors on listen side" failed to apply to 5.15-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y git checkout FETCH_HEAD git cherry-pick -x 14e22b43df25dbd4301351b882486ea38892ae4f # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025092158-molehill-radiation-11c3@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 14e22b43df25dbd4301351b882486ea38892ae4f Mon Sep 17 00:00:00 2001 From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org> Date: Fri, 12 Sep 2025 14:25:51 +0200 Subject: [PATCH] selftests: mptcp: connect: catch IO errors on listen side IO errors were correctly printed to stderr, and propagated up to the main loop for the server side, but the returned value was ignored. As a consequence, the program for the listener side was no longer exiting with an error code in case of IO issues. Because of that, some issues might not have been seen. But very likely, most issues either had an effect on the client side, or the file transfer was not the expected one, e.g. the connection got reset before the end. Still, it is better to fix this. The main consequence of this issue is the error that was reported by the selftests: the received and sent files were different, and the MIB counters were not printed. Also, when such errors happened during the 'disconnect' tests, the program tried to continue until the timeout. Now when an IO error is detected, the program exits directly with an error. Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests") Cc: stable(a)vger.kernel.org Reviewed-by: Mat Martineau <martineau(a)kernel.org> Reviewed-by: Geliang Tang <geliang(a)kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org> Link: https://patch.msgid.link/20250912-net-mptcp-fix-sft-connect-v1-2-d40e77cbbf… Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c index 4f07ac9fa207..1408698df099 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_connect.c +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c @@ -1093,6 +1093,7 @@ int main_loop_s(int listensock) struct pollfd polls; socklen_t salen; int remotesock; + int err = 0; int fd = 0; again: @@ -1125,7 +1126,7 @@ int main_loop_s(int listensock) SOCK_TEST_TCPULP(remotesock, 0); memset(&winfo, 0, sizeof(winfo)); - copyfd_io(fd, remotesock, 1, true, &winfo); + err = copyfd_io(fd, remotesock, 1, true, &winfo); } else { perror("accept"); return 1; @@ -1134,10 +1135,10 @@ int main_loop_s(int listensock) if (cfg_input) close(fd); - if (--cfg_repeat > 0) + if (!err && --cfg_repeat > 0) goto again; - return 0; + return err; } static void init_rng(void)

2 weeks, 6 days

2
1
0 0

Unplayable framerates in game but specific kernel versions work, maybe amdgpu problem

by machion＠disroot.org

Hello kernel/driver developers, I hope, with my information it's possible to find a bug/problem in the kernel. Otherwise I am sorry, that I disturbed you. I only use LTS kernels, but I can narrow it down to a hand full of them, where it works. The PC: Manjaro Stable/Cinnamon/X11/AMD Ryzen 5 2600/Radeon HD 7790/8GB RAM I already asked the Manjaro community, but with no luck. The game: Hellpoint (GOG Linux latest version, Unity3D-Engine v2021), uses vulkan --- I came a long road of kernels. I had many versions of 5.4, 5.10, 5.15, 6.1 and 6.6 and and the game was always unplayable, because the frames where around 1fps (performance of PC is not the problem). I asked the mesa and cinnamon team for help in the past, but also with no luck. It never worked, till on 2025-03-29 when I installed 6.12.19 for the first time and it worked! But it only worked with 6.12.19, 6.12.20 and 6.12.21 When I updated to 6.12.25, it was back to unplayable. For testing I installed 6.14.4 with the same result. It doesn't work. I also compared file /proc/config.gz of both kernels (6.12.21 <> 6.14.4), but can't seem to see drastic changes to the graphical part. I presume it has something to do with amdgpu. If you need more information, I would be happy to help. Kind regards, Marion

2 weeks, 6 days

2
5
0 0

[PATCH] nvmet: pci-epf: Move DMA initialization to EPC init callback

by Niklas Cassel

From: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com> For DMA initialization to work across all EPC drivers, the DMA initialization has to be done in the .init() callback. This is because not all EPC drivers will have a refclock (which is often needed to access registers of a DMA controller embedded in a PCIe controller) at the time the .bind() callback is called. However, all EPC drivers are guaranteed to have a refclock by the time the .init() callback is called. Thus, move the DMA initialization to the .init() callback. This change was already done for other EPF drivers in commit 60bd3e039aa2 ("PCI: endpoint: pci-epf-{mhi/test}: Move DMA initialization to EPC init callback"). Cc: stable(a)vger.kernel.org Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com> Signed-off-by: Niklas Cassel <cassel(a)kernel.org> --- drivers/nvme/target/pci-epf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/nvme/target/pci-epf.c b/drivers/nvme/target/pci-epf.c index 2e78397a7373a..9c5b0f78ce8df 100644 --- a/drivers/nvme/target/pci-epf.c +++ b/drivers/nvme/target/pci-epf.c @@ -2325,6 +2325,8 @@ static int nvmet_pci_epf_epc_init(struct pci_epf *epf) return ret; } + nvmet_pci_epf_init_dma(nvme_epf); + /* Set device ID, class, etc. */ epf->header->vendorid = ctrl->tctrl->subsys->vendor_id; epf->header->subsys_vendor_id = ctrl->tctrl->subsys->subsys_vendor_id; @@ -2422,8 +2424,6 @@ static int nvmet_pci_epf_bind(struct pci_epf *epf) if (ret) return ret; - nvmet_pci_epf_init_dma(nvme_epf); - return 0; } -- 2.51.0

2 weeks, 6 days

3
3
0 0

[PATCH] drm/xe: Increase global invalidation timeout to 1000us

by Kenneth Graunke

The previous timeout of 500us seems to be too small; panning the map in the Roll20 VTT in Firefox on a KDE/Wayland desktop reliably triggered timeouts within a few seconds of usage, causing the monitor to freeze and the following to be printed to dmesg: [Jul30 13:44] xe 0000:03:00.0: [drm] *ERROR* GT0: Global invalidation timeout [Jul30 13:48] xe 0000:03:00.0: [drm] *ERROR* [CRTC:82:pipe A] flip_done timed out I haven't hit a single timeout since increasing it to 1000us even after several multi-hour testing sessions. Fixes: c0114fdf6d4a ("drm/xe: Move DSB l2 flush to a more sensible place") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5710 Signed-off-by: Kenneth Graunke <kenneth(a)whitecape.org> Cc: stable(a)vger.kernel.org Cc: Maarten Lankhorst <dev(a)lankhorst.se> --- drivers/gpu/drm/xe/xe_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) This fixes my desktop which has been broken since 6.15. Given that https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6097 was recently filed and they seem to need a timeout of 2000 (and are having somewhat different issues), maybe more work's needed here...but I figured I'd send out the fix for my system and let xe folks figure out what they'd like to do. Thanks :) diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c index a4d12ee7d575..6339b8800914 100644 --- a/drivers/gpu/drm/xe/xe_device.c +++ b/drivers/gpu/drm/xe/xe_device.c @@ -1064,7 +1064,7 @@ void xe_device_l2_flush(struct xe_device *xe) spin_lock(&gt->global_invl_lock); xe_mmio_write32(&gt->mmio, XE2_GLOBAL_INVAL, 0x1); - if (xe_mmio_wait32(&gt->mmio, XE2_GLOBAL_INVAL, 0x1, 0x0, 500, NULL, true)) + if (xe_mmio_wait32(&gt->mmio, XE2_GLOBAL_INVAL, 0x1, 0x0, 1000, NULL, true)) xe_gt_err_once(gt, "Global invalidation timeout\n"); spin_unlock(&gt->global_invl_lock); -- 2.51.0

2 weeks, 6 days

3
2
0 0

[PATCH 6.1.y] spi: microchip-core: ensure TX and RX FIFOs are empty at start of a transfer

by jianqi.ren.cn＠windriver.com

From: Steve Wilkins <steve.wilkins(a)raymarine.com> [ Upstream commit 9cf71eb0faef4bff01df4264841b8465382d7927 ] While transmitting with rx_len == 0, the RX FIFO is not going to be emptied in the interrupt handler. A subsequent transfer could then read crap from the previous transfer out of the RX FIFO into the start RX buffer. The core provides a register that will empty the RX and TX FIFOs, so do that before each transfer. Fixes: 9ac8d17694b6 ("spi: add support for microchip fpga spi controllers") Signed-off-by: Steve Wilkins <steve.wilkins(a)raymarine.com> Signed-off-by: Conor Dooley <conor.dooley(a)microchip.com> Link: https://patch.msgid.link/20240715-flammable-provoke-459226d08e70@wendy Signed-off-by: Mark Brown <broonie(a)kernel.org> [Minor conflict resolved due to code context change.] Signed-off-by: Jianqi Ren <jianqi.ren.cn(a)windriver.com> Signed-off-by: He Zhe <zhe.he(a)windriver.com> --- Verified the build test --- drivers/spi/spi-microchip-core.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/spi/spi-microchip-core.c b/drivers/spi/spi-microchip-core.c index bfad0fe743ad..acc05f5a929e 100644 --- a/drivers/spi/spi-microchip-core.c +++ b/drivers/spi/spi-microchip-core.c @@ -91,6 +91,8 @@ #define REG_CONTROL2 (0x28) #define REG_COMMAND (0x2c) #define COMMAND_CLRFRAMECNT BIT(4) +#define COMMAND_TXFIFORST BIT(3) +#define COMMAND_RXFIFORST BIT(2) #define REG_PKTSIZE (0x30) #define REG_CMD_SIZE (0x34) #define REG_HWSTATUS (0x38) @@ -489,6 +491,8 @@ static int mchp_corespi_transfer_one(struct spi_controller *host, mchp_corespi_set_xfer_size(spi, (spi->tx_len > FIFO_DEPTH) ? FIFO_DEPTH : spi->tx_len); + mchp_corespi_write(spi, REG_COMMAND, COMMAND_RXFIFORST | COMMAND_TXFIFORST); + while (spi->tx_len) mchp_corespi_write_fifo(spi); -- 2.34.1

3 weeks

4
3
0 0

[PATCH] mfd: altera-sysmgr: fix device leak on sysmgr regmap lookup

by Johan Hovold

Make sure to drop the reference taken to the sysmgr platform device when retrieving its driver data. Note that holding a reference to a device does not prevent its driver data from going away. Fixes: f36e789a1f8d ("mfd: altera-sysmgr: Add SOCFPGA System Manager") Cc: stable(a)vger.kernel.org # 5.2 Signed-off-by: Johan Hovold <johan(a)kernel.org> --- drivers/mfd/altera-sysmgr.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/mfd/altera-sysmgr.c b/drivers/mfd/altera-sysmgr.c index fb5f988e61f3..90c6902d537d 100644 --- a/drivers/mfd/altera-sysmgr.c +++ b/drivers/mfd/altera-sysmgr.c @@ -117,6 +117,8 @@ struct regmap *altr_sysmgr_regmap_lookup_by_phandle(struct device_node *np, sysmgr = dev_get_drvdata(dev); + put_device(dev); + return sysmgr->regmap; } EXPORT_SYMBOL_GPL(altr_sysmgr_regmap_lookup_by_phandle); -- 2.49.1

3 weeks

3
2
0 0

[bp@imp.ch: Bug#1111455: linux-image-6.12.41+deb13-amd64: kernel BUG at fs/netfs/read_collect.c:316 netfs: Can't donate prior to front]

by Salvatore Bonaccorso

Hi, A user in Debian reported the following kernel oops when running on 6.12.41 (but apparently as well on older versions, though there were several netfs related similar issues, so including Max Kellermann as well in the recipients) The report from Benoit Panizzon is as follows: > From: Benoit Panizzon <bp(a)imp.ch> > Resent-From: Benoit Panizzon <bp(a)imp.ch> > Reply-To: Benoit Panizzon <bp(a)imp.ch>, 1111455(a)bugs.debian.org > X-Mailer: reportbug 13.2.0 > Date: Mon, 18 Aug 2025 10:24:32 +0200 > To: Debian Bug Tracking System <submit(a)bugs.debian.org> > Subject: Bug#1111455: linux-image-6.12.41+deb13-amd64: kernel BUG at fs/netfs/read_collect.c:316 netfs: Can't donate > prior to front > Delivered-To: lists-debian-kernel(a)bendel.debian.org > Delivered-To: submit(a)bugs.debian.org > Message-ID: <175550547264.3745.5845128440223069497.reportbug(a)go.imp.ch> > > Package: src:linux > Version: 6.12.41-1 > Severity: grave > Justification: renders package unusable > X-Debbugs-Cc: debian-amd64(a)lists.debian.org > User: debian-amd64(a)lists.debian.org > Usertags: amd64 > > Dear Maintainer, > > Updated my workstation from Bookworm to Trixie. /home on NFS > > Applications accessing data on NFS shares become unresponsive one after the other after a couple of minutes. > > Especially affected: > * Claws-Mail > * Chromium > > Suspected cachefsd being the culpit and disabled - issue persists. > > It looks like an invalid opcode is being used. Fairly recent CPU in use: > > vendor_id : AuthenticAMD > cpu family : 25 > model : 80 > model name : AMD Ryzen 7 PRO 5750GE with Radeon Graphics > stepping : 0 > microcode : 0xa500011 > > Google found reports of others affected with similar kernel versions mostly when accessing SMB shares. > > [Mo Aug 18 10:11:19 2025] netfs: Can't donate prior to front > [Mo Aug 18 10:11:19 2025] R=00001e07[4] s=6000-7fff 0/2000/2000 > [Mo Aug 18 10:11:19 2025] folio: 4000-7fff > [Mo Aug 18 10:11:19 2025] donated: prev=0 next=0 > [Mo Aug 18 10:11:19 2025] s=6000 av=2000 part=2000 > [Mo Aug 18 10:11:19 2025] ------------[ cut here ]------------ > [Mo Aug 18 10:11:19 2025] kernel BUG at fs/netfs/read_collect.c:316! > [Mo Aug 18 10:11:19 2025] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI > [Mo Aug 18 10:11:19 2025] CPU: 5 UID: 0 PID: 115 Comm: kworker/u64:1 Not tainted 6.12.41+deb13-amd64 #1 Debian 6.12.41-1 > [Mo Aug 18 10:11:19 2025] Hardware name: LENOVO 11JN000JGE/32E4, BIOS M47KT26A 11/23/2022 > [Mo Aug 18 10:11:19 2025] Workqueue: nfsiod rpc_async_release [sunrpc] > [Mo Aug 18 10:11:19 2025] RIP: 0010:netfs_consume_read_data.isra.0+0xb79/0xb80 [netfs] > [Mo Aug 18 10:11:19 2025] Code: 48 89 ea 31 f6 48 c7 c7 96 95 6f c2 e8 d0 8d a2 e1 48 8b 4c 24 10 4c 89 fe 48 8b 54 24 20 48 c7 c7 b2 95 6f c2 e8 b7 8d a2 e1 <0f> 0b 90 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 > [Mo Aug 18 10:11:19 2025] RSP: 0018:ffffb4234057bd58 EFLAGS: 00010246 > [Mo Aug 18 10:11:19 2025] RAX: 0000000000000018 RBX: ffff9aed62651ec0 RCX: 0000000000000027 > [Mo Aug 18 10:11:19 2025] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9af04e4a1740 > [Mo Aug 18 10:11:19 2025] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffb4234057bbe8 > [Mo Aug 18 10:11:19 2025] R10: ffffffffa50b43a8 R11: 0000000000000003 R12: ffff9aed6c9081e8 > [Mo Aug 18 10:11:19 2025] R13: 0000000000004000 R14: ffff9aed6c9081e8 R15: 0000000000006000 > [Mo Aug 18 10:11:19 2025] FS: 0000000000000000(0000) GS:ffff9af04e480000(0000) knlGS:0000000000000000 > [Mo Aug 18 10:11:19 2025] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [Mo Aug 18 10:11:19 2025] CR2: 00003d3407849020 CR3: 00000002f1822000 CR4: 0000000000f50ef0 > [Mo Aug 18 10:11:19 2025] PKRU: 55555554 > [Mo Aug 18 10:11:19 2025] Call Trace: > [Mo Aug 18 10:11:19 2025] <TASK> > [Mo Aug 18 10:11:19 2025] netfs_read_subreq_terminated+0x2ab/0x3e0 [netfs] > [Mo Aug 18 10:11:19 2025] nfs_netfs_read_completion+0x9c/0xc0 [nfs] > [Mo Aug 18 10:11:19 2025] nfs_read_completion+0xf6/0x130 [nfs] > [Mo Aug 18 10:11:19 2025] rpc_free_task+0x39/0x60 [sunrpc] > [Mo Aug 18 10:11:19 2025] rpc_async_release+0x2f/0x40 [sunrpc] > [Mo Aug 18 10:11:19 2025] process_one_work+0x177/0x330 > [Mo Aug 18 10:11:19 2025] worker_thread+0x251/0x390 > [Mo Aug 18 10:11:19 2025] ? __pfx_worker_thread+0x10/0x10 > [Mo Aug 18 10:11:19 2025] kthread+0xd2/0x100 > [Mo Aug 18 10:11:19 2025] ? __pfx_kthread+0x10/0x10 > [Mo Aug 18 10:11:19 2025] ret_from_fork+0x34/0x50 > [Mo Aug 18 10:11:19 2025] ? __pfx_kthread+0x10/0x10 > [Mo Aug 18 10:11:19 2025] ret_from_fork_asm+0x1a/0x30 > [Mo Aug 18 10:11:19 2025] </TASK> > [Mo Aug 18 10:11:19 2025] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs netfs nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables libcrc32c wireguard libchacha20poly1305 chacha_x86_64 poly1305_x86_64 curve25519_x86_64 libcurve25519_generic libchacha vxlan ip6_udp_tunnel udp_tunnel bridge stp llc btusb btrtl btintel btbcm btmtk bluetooth joydev qrtr nfsd auth_rpcgss binfmt_misc nfs_acl lockd grace nls_ascii nls_cp437 sunrpc vfat fat amd_atl intel_rapl_msr intel_rapl_common rtw88_8822ce rtw88_8822c edac_mce_amd rtw88_pci snd_sof_amd_rembrandt snd_sof_amd_acp rtw88_core kvm_amd snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_hda_codec_realtek mac80211 kvm snd_hda_codec_generic snd_sof_utils snd_hda_scodec_component snd_soc_core snd_hda_codec_hdmi snd_hda_intel snd_compress snd_intel_dspcfg libarc4 snd_pcm_dmaengine irqbypass snd_intel_sdw_acpi snd_pci_ps crct10dif_pclmul snd_hda_codec ghash_clmulni_intel snd_rpl_pci_acp6x cfg80211 snd_hda_core sha512_ssse3 snd_acp_pci > [Mo Aug 18 10:11:19 2025] sha256_ssse3 snd_acp_legacy_common sha1_ssse3 snd_hwdep snd_pci_acp6x aesni_intel snd_pcm gf128mul snd_pci_acp5x crypto_simd think_lmi snd_timer snd_rn_pci_acp3x cryptd firmware_attributes_class snd_acp_config wmi_bmof snd snd_soc_acpi ee1004 rapl snd_pci_acp3x pcspkr ccp k10temp rfkill soundcore evdev parport_pc ppdev lp parport configfs efi_pstore nfnetlink efivarfs ip_tables x_tables autofs4 ext4 mbcache jbd2 crc32c_generic hid_plantronics hid_generic usbhid hid amdgpu dm_mod amdxcp drm_exec gpu_sched drm_buddy i2c_algo_bit drm_suballoc_helper drm_display_helper cec rc_core drm_ttm_helper xhci_pci ttm xhci_hcd drm_kms_helper drm r8169 nvme usbcore realtek sp5100_tco nvme_core mdio_devres watchdog libphy crc32_pclmul i2c_piix4 video crc32c_intel usb_common nvme_auth i2c_smbus crc16 wmi button > [Mo Aug 18 10:11:19 2025] ---[ end trace 0000000000000000 ]--- Any ideas here? Benoit can you please es well test the current 6.16.1 ideally to verify if the problem persists there as well? Regards, Salvatore

3 weeks, 2 days

2
5
0 0

FAILED: patch "[PATCH] bus: mhi: host: Detect events pointing to unexpected TREs" failed to apply to 5.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y git checkout FETCH_HEAD git cherry-pick -x 5bd398e20f0833ae8a1267d4f343591a2dd20185 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025082100-snowiness-profanity-df3a@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 5bd398e20f0833ae8a1267d4f343591a2dd20185 Mon Sep 17 00:00:00 2001 From: Youssef Samir <quic_yabdulra(a)quicinc.com> Date: Mon, 14 Jul 2025 18:30:39 +0200 Subject: [PATCH] bus: mhi: host: Detect events pointing to unexpected TREs When a remote device sends a completion event to the host, it contains a pointer to the consumed TRE. The host uses this pointer to process all of the TREs between it and the host's local copy of the ring's read pointer. This works when processing completion for chained transactions, but can lead to nasty results if the device sends an event for a single-element transaction with a read pointer that is multiple elements ahead of the host's read pointer. For instance, if the host accesses an event ring while the device is updating it, the pointer inside of the event might still point to an old TRE. If the host uses the channel's xfer_cb() to directly free the buffer pointed to by the TRE, the buffer will be double-freed. This behavior was observed on an ep that used upstream EP stack without 'commit 6f18d174b73d ("bus: mhi: ep: Update read pointer only after buffer is written")'. Where the device updated the events ring pointer before updating the event contents, so it left a window where the host was able to access the stale data the event pointed to, before the device had the chance to update them. The usual pattern was that the host received an event pointing to a TRE that is not immediately after the last processed one, so it got treated as if it was a chained transaction, processing all of the TREs in between the two read pointers. This commit aims to harden the host by ensuring transactions where the event points to a TRE that isn't local_rp + 1 are chained. Fixes: 1d3173a3bae7 ("bus: mhi: core: Add support for processing events from client device") Signed-off-by: Youssef Samir <quic_yabdulra(a)quicinc.com> [mani: added stable tag and reworded commit message] Signed-off-by: Manivannan Sadhasivam <mani(a)kernel.org> Reviewed-by: Jeff Hugo <jeff.hugo(a)oss.qualcomm.com> Cc: stable(a)vger.kernel.org Link: https://patch.msgid.link/20250714163039.3438985-1-quic_yabdulra@quicinc.com diff --git a/drivers/bus/mhi/host/main.c b/drivers/bus/mhi/host/main.c index 3041ee6747e3..52bef663e182 100644 --- a/drivers/bus/mhi/host/main.c +++ b/drivers/bus/mhi/host/main.c @@ -602,7 +602,7 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl, { dma_addr_t ptr = MHI_TRE_GET_EV_PTR(event); struct mhi_ring_element *local_rp, *ev_tre; - void *dev_rp; + void *dev_rp, *next_rp; struct mhi_buf_info *buf_info; u16 xfer_len; @@ -621,6 +621,16 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl, result.dir = mhi_chan->dir; local_rp = tre_ring->rp; + + next_rp = local_rp + 1; + if (next_rp >= tre_ring->base + tre_ring->len) + next_rp = tre_ring->base; + if (dev_rp != next_rp && !MHI_TRE_DATA_GET_CHAIN(local_rp)) { + dev_err(&mhi_cntrl->mhi_dev->dev, + "Event element points to an unexpected TRE\n"); + break; + } + while (local_rp != dev_rp) { buf_info = buf_ring->rp; /* If it's the last TRE, get length from the event */

3 weeks, 2 days

2
1
0 0

[PATCH net v2] page_pool: Fix PP_MAGIC_MASK to avoid crashing on some 32-bit arches

by Toke Høiland-Jørgensen

Helge reported that the introduction of PP_MAGIC_MASK let to crashes on boot on his 32-bit parisc machine. The cause of this is the mask is set too wide, so the page_pool_page_is_pp() incurs false positives which crashes the machine. Just disabling the check in page_pool_is_pp() will lead to the page_pool code itself malfunctioning; so instead of doing this, this patch changes the define for PP_DMA_INDEX_BITS to avoid mistaking arbitrary kernel pointers for page_pool-tagged pages. The fix relies on the kernel pointers that alias with the pp_magic field always being above PAGE_OFFSET. With this assumption, we can use the lowest bit of the value of PAGE_OFFSET as the upper bound of the PP_DMA_INDEX_MASK, which should avoid the false positives. Because we cannot rely on PAGE_OFFSET always being a compile-time constant, nor on it always being >0, we fall back to disabling the dma_index storage when there are not enough bits available. This leaves us in the situation we were in before the patch in the Fixes tag, but only on a subset of architecture configurations. This seems to be the best we can do until the transition to page types in complete for page_pool pages. v2: - Make sure there's at least 8 bits available and that the PAGE_OFFSET bit calculation doesn't wrap Link: https://lore.kernel.org/all/aMNJMFa5fDalFmtn@p100/ Fixes: ee62ce7a1d90 ("page_pool: Track DMA-mapped pages and unmap them when destroying the pool") Cc: stable(a)vger.kernel.org # 6.15+ Tested-by: Helge Deller <deller(a)gmx.de> Signed-off-by: Toke Høiland-Jørgensen <toke(a)redhat.com> --- include/linux/mm.h | 22 +++++++------ net/core/page_pool.c | 76 ++++++++++++++++++++++++++++++-------------- 2 files changed, 66 insertions(+), 32 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 1ae97a0b8ec7..0905eb6b55ec 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -4159,14 +4159,13 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status); * since this value becomes part of PP_SIGNATURE; meaning we can just use the * space between the PP_SIGNATURE value (without POISON_POINTER_DELTA), and the * lowest bits of POISON_POINTER_DELTA. On arches where POISON_POINTER_DELTA is - * 0, we make sure that we leave the two topmost bits empty, as that guarantees - * we won't mistake a valid kernel pointer for a value we set, regardless of the - * VMSPLIT setting. + * 0, we use the lowest bit of PAGE_OFFSET as the boundary if that value is + * known at compile-time. * - * Altogether, this means that the number of bits available is constrained by - * the size of an unsigned long (at the upper end, subtracting two bits per the - * above), and the definition of PP_SIGNATURE (with or without - * POISON_POINTER_DELTA). + * If the value of PAGE_OFFSET is not known at compile time, or if it is too + * small to leave at least 8 bits available above PP_SIGNATURE, we define the + * number of bits to be 0, which turns off the DMA index tracking altogether + * (see page_pool_register_dma_index()). */ #define PP_DMA_INDEX_SHIFT (1 + __fls(PP_SIGNATURE - POISON_POINTER_DELTA)) #if POISON_POINTER_DELTA > 0 @@ -4175,8 +4174,13 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status); */ #define PP_DMA_INDEX_BITS MIN(32, __ffs(POISON_POINTER_DELTA) - PP_DMA_INDEX_SHIFT) #else -/* Always leave out the topmost two; see above. */ -#define PP_DMA_INDEX_BITS MIN(32, BITS_PER_LONG - PP_DMA_INDEX_SHIFT - 2) +/* Use the lowest bit of PAGE_OFFSET if there's at least 8 bits available; see above */ +#define PP_DMA_INDEX_MIN_OFFSET (1 << (PP_DMA_INDEX_SHIFT + 8)) +#define PP_DMA_INDEX_BITS ((__builtin_constant_p(PAGE_OFFSET) && \ + PAGE_OFFSET >= PP_DMA_INDEX_MIN_OFFSET && \ + !(PAGE_OFFSET & (PP_DMA_INDEX_MIN_OFFSET - 1))) ? \ + MIN(32, __ffs(PAGE_OFFSET) - PP_DMA_INDEX_SHIFT) : 0) + #endif #define PP_DMA_INDEX_MASK GENMASK(PP_DMA_INDEX_BITS + PP_DMA_INDEX_SHIFT - 1, \ diff --git a/net/core/page_pool.c b/net/core/page_pool.c index 492728f9e021..1a5edec485f1 100644 --- a/net/core/page_pool.c +++ b/net/core/page_pool.c @@ -468,11 +468,60 @@ page_pool_dma_sync_for_device(const struct page_pool *pool, } } +static int page_pool_register_dma_index(struct page_pool *pool, + netmem_ref netmem, gfp_t gfp) +{ + int err = 0; + u32 id; + + if (unlikely(!PP_DMA_INDEX_BITS)) + goto out; + + if (in_softirq()) + err = xa_alloc(&pool->dma_mapped, &id, netmem_to_page(netmem), + PP_DMA_INDEX_LIMIT, gfp); + else + err = xa_alloc_bh(&pool->dma_mapped, &id, netmem_to_page(netmem), + PP_DMA_INDEX_LIMIT, gfp); + if (err) { + WARN_ONCE(err != -ENOMEM, "couldn't track DMA mapping, please report to netdev@"); + goto out; + } + + netmem_set_dma_index(netmem, id); +out: + return err; +} + +static int page_pool_release_dma_index(struct page_pool *pool, + netmem_ref netmem) +{ + struct page *old, *page = netmem_to_page(netmem); + unsigned long id; + + if (unlikely(!PP_DMA_INDEX_BITS)) + return 0; + + id = netmem_get_dma_index(netmem); + if (!id) + return -1; + + if (in_softirq()) + old = xa_cmpxchg(&pool->dma_mapped, id, page, NULL, 0); + else + old = xa_cmpxchg_bh(&pool->dma_mapped, id, page, NULL, 0); + if (old != page) + return -1; + + netmem_set_dma_index(netmem, 0); + + return 0; +} + static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t gfp) { dma_addr_t dma; int err; - u32 id; /* Setup DMA mapping: use 'struct page' area for storing DMA-addr * since dma_addr_t can be either 32 or 64 bits and does not always fit @@ -491,18 +540,10 @@ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t g goto unmap_failed; } - if (in_softirq()) - err = xa_alloc(&pool->dma_mapped, &id, netmem_to_page(netmem), - PP_DMA_INDEX_LIMIT, gfp); - else - err = xa_alloc_bh(&pool->dma_mapped, &id, netmem_to_page(netmem), - PP_DMA_INDEX_LIMIT, gfp); - if (err) { - WARN_ONCE(err != -ENOMEM, "couldn't track DMA mapping, please report to netdev@"); + err = page_pool_register_dma_index(pool, netmem, gfp); + if (err) goto unset_failed; - } - netmem_set_dma_index(netmem, id); page_pool_dma_sync_for_device(pool, netmem, pool->p.max_len); return true; @@ -680,8 +721,6 @@ void page_pool_clear_pp_info(netmem_ref netmem) static __always_inline void __page_pool_release_netmem_dma(struct page_pool *pool, netmem_ref netmem) { - struct page *old, *page = netmem_to_page(netmem); - unsigned long id; dma_addr_t dma; if (!pool->dma_map) @@ -690,15 +729,7 @@ static __always_inline void __page_pool_release_netmem_dma(struct page_pool *poo */ return; - id = netmem_get_dma_index(netmem); - if (!id) - return; - - if (in_softirq()) - old = xa_cmpxchg(&pool->dma_mapped, id, page, NULL, 0); - else - old = xa_cmpxchg_bh(&pool->dma_mapped, id, page, NULL, 0); - if (old != page) + if (page_pool_release_dma_index(pool, netmem)) return; dma = page_pool_get_dma_addr_netmem(netmem); @@ -708,7 +739,6 @@ static __always_inline void __page_pool_release_netmem_dma(struct page_pool *poo PAGE_SIZE << pool->p.order, pool->p.dma_dir, DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING); page_pool_set_dma_addr_netmem(netmem, 0); - netmem_set_dma_index(netmem, 0); } /* Disconnects a page (from a page_pool). API users can have a need -- 2.51.0

3 weeks, 2 days

4
5
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror September 2025