A Partial Region Controller can be connected to one or more
Freeze Bridges. Each Freeze Bridge has an illegal_request
bit represented in the freeze_illegal_request register.
Thus, instead of always writing 1 to clear the illegal_request bit
of the first Freeze Bridge, we need to ensure the clear
action is applied to whichever Freeze Bridge has raised an
illegal request.
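As an illustration of the intended behaviour (a minimal sketch, not the
driver code; it assumes the write-1-to-clear semantics of the
freeze_illegal_request register described above):

	/*
	 * Sketch only: each set bit corresponds to one Freeze Bridge that
	 * raised an illegal request.  Writing back the read value clears
	 * exactly those bits, whereas writing a constant 1 only ever
	 * clears the bit of the first bridge.
	 */
	u32 illegal = readl(csr_illegal_req_addr);

	if (illegal)
		writel(illegal, csr_illegal_req_addr);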
Fixes: ca24a648f535 ("fpga: add altera freeze bridge support")
Signed-off-by: Chiau Ee Chew <chiau.ee.chew(a)intel.com>
Signed-off-by: Tanmay Kathpalia <tanmay.kathpalia(a)altera.com>
---
drivers/fpga/altera-freeze-bridge.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/fpga/altera-freeze-bridge.c b/drivers/fpga/altera-freeze-bridge.c
index 594693ff786e..23e8b2b54355 100644
--- a/drivers/fpga/altera-freeze-bridge.c
+++ b/drivers/fpga/altera-freeze-bridge.c
@@ -52,7 +52,7 @@ static int altera_freeze_br_req_ack(struct altera_freeze_br_data *priv,
if (illegal) {
dev_err(dev, "illegal request detected 0x%x", illegal);
- writel(1, csr_illegal_req_addr);
+ writel(illegal, csr_illegal_req_addr);
illegal = readl(csr_illegal_req_addr);
if (illegal)
--
2.19.0
Add a call to qlcnic_sriov_free_vlans() in qlcnic_sriov_alloc_vlans() if
any of the vf->sriov_vlans allocations fails.
Also add a call to qlcnic_sriov_free_vlans() to free the memory allocated by
qlcnic_sriov_alloc_vlans() if the "sriov->allowed_vlans" allocation fails.
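A minimal sketch of the pattern being applied (simplified, not the driver
code verbatim): on a partial allocation failure, release whatever has been
allocated so far before returning.

	/* Sketch only: undo partial per-VF allocations on failure. */
	for (i = 0; i < sriov->num_vfs; i++) {
		vf = &sriov->vf_info[i];
		vf->sriov_vlans = kcalloc(sriov->num_allowed_vlans,
					  sizeof(*vf->sriov_vlans), GFP_KERNEL);
		if (!vf->sriov_vlans) {
			/* kfree(NULL) is safe for the VFs not yet touched */
			qlcnic_sriov_free_vlans(adapter);
			return -ENOMEM;
		}
	}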
Fixes: 91b7282b613d ("qlcnic: Support VLAN id config.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Haoxiang Li <haoxiang_li2024(a)163.com>
---
Changes in v3:
- Handle allocation errors in qlcnic_sriov_alloc_vlans()
- Modify the patch title and description.
There's one more thing I'm confused about: I'm not sure if the Fixes tag
is correct, because I noticed that the two modifications correspond to
different commits. Should I split them into two separate patch submissions? Thanks, Paolo!
Changes in v2:
- Add qlcnic_sriov_free_vlans() if qlcnic_sriov_alloc_vlans() fails.
- Modify the patch description.
vf_info was allocated with kcalloc(), so there is no need for extra checks
because kfree(NULL) is safe. Thanks, Paolo!
---
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c
index f9dd50152b1e..28d24d59efb8 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c
@@ -454,8 +454,10 @@ static int qlcnic_sriov_set_guest_vlan_mode(struct qlcnic_adapter *adapter,
num_vlans = sriov->num_allowed_vlans;
sriov->allowed_vlans = kcalloc(num_vlans, sizeof(u16), GFP_KERNEL);
- if (!sriov->allowed_vlans)
+ if (!sriov->allowed_vlans) {
+ qlcnic_sriov_free_vlans(adapter);
return -ENOMEM;
+ }
vlans = (u16 *)&cmd->rsp.arg[3];
for (i = 0; i < num_vlans; i++)
@@ -2167,8 +2169,10 @@ int qlcnic_sriov_alloc_vlans(struct qlcnic_adapter *adapter)
vf = &sriov->vf_info[i];
vf->sriov_vlans = kcalloc(sriov->num_allowed_vlans,
sizeof(*vf->sriov_vlans), GFP_KERNEL);
- if (!vf->sriov_vlans)
+ if (!vf->sriov_vlans) {
+ qlcnic_sriov_free_vlans(adapter);
return -ENOMEM;
+ }
}
return 0;
--
2.25.1
Hello,
New build issue found on stable-rc/linux-5.10.y:
---
in vmlinux (Makefile:1212) [logspec:kbuild,kbuild.other]
---
- dashboard: https://d.kernelci.org/issue/maestro:d5c2be698989c7de46471109aae8df0339b713…
- giturl: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
- commit HEAD: a0e8dfa03993fda7b4d4b696c50f69726522abba
Log excerpt:
=====================================================
.lds
In file included from ./include/linux/kernel.h:15,
net/ipv6/udp.c: In function ‘udp_v6_send_skb’:
./include/linux/minmax.h:20:35: warning: comparison of distinct
pointer types lacks a cast
./include/linux/minmax.h:26:18: note: in expansion of macro ‘__typecheck’
./include/linux/minmax.h:36:31: note: in expansion of macro ‘__safe_cmp’
./include/linux/minmax.h:45:25: note: in expansion of macro ‘__careful_cmp’
net/ipv6/udp.c:1213:28: note: in expansion of macro ‘min’
In file included from ./include/linux/uaccess.h:7,
net/ipv4/udp.c: In function ‘udp_send_skb’:
./include/linux/minmax.h:20:35: warning: comparison of distinct
pointer types lacks a cast
./include/linux/minmax.h:26:18: note: in expansion of macro ‘__typecheck’
./include/linux/minmax.h:36:31: note: in expansion of macro ‘__safe_cmp’
./include/linux/minmax.h:45:25: note: in expansion of macro ‘__careful_cmp’
net/ipv4/udp.c:926:28: note: in expansion of macro ‘min’
FAILED unresolved symbol filp_close
=====================================================
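For reference, a minimal reproduction of the min() warning class seen above
(illustrative only, not the code at net/ipv6/udp.c or net/ipv4/udp.c):

	/*
	 * min() in include/linux/minmax.h type-checks its arguments by
	 * comparing pointers to their types; operands of different types
	 * (e.g. size_t vs int) trigger "comparison of distinct pointer
	 * types lacks a cast".  min_t() forces a common type instead.
	 */
	size_t a = 16;
	int b = 32;
	size_t c = min_t(size_t, a, b);	/* no warning */

Note that the actual build failure appears to be the "FAILED unresolved
symbol filp_close" line at the end; the min() messages above it are
warnings only.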
# Builds where the incident occurred:
## cros://chromeos-5.10/x86_64/chromeos-amd-stoneyridge.flavour.config+lab-setup+x86-board+CONFIG_MODULE_COMPRESS=n+CONFIG_MODULE_COMPRESS_NONE=y
on (x86_64):
- compiler: gcc-12
- dashboard: https://d.kernelci.org/build/maestro:67ceffea18018371957ebdc0
#kernelci issue maestro:d5c2be698989c7de46471109aae8df0339b713c1
Reported-by: kernelci.org bot <bot(a)kernelci.org>
--
This is an experimental report format. Please send feedback in!
Talk to us at kernelci(a)lists.linux.dev
Made with love by the KernelCI team - https://kernelci.org
---8<---
Changes in v2:
- Added explicit comment about the quirk, as requested by Mani.
- Made commit message more clear, as requested by Bjorn.
---8<---
On our Marvell OCTEON CN96XX board, we observed the following panic on
the latest kernel:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000080
CPU: 22 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.14.0-rc6 #20
Hardware name: Marvell OcteonTX CN96XX board (DT)
pc : of_pci_add_properties+0x278/0x4c8
Call trace:
of_pci_add_properties+0x278/0x4c8 (P)
of_pci_make_dev_node+0xe0/0x158
pci_bus_add_device+0x158/0x228
pci_bus_add_devices+0x40/0x98
pci_host_probe+0x94/0x118
pci_host_common_probe+0x130/0x1b0
platform_probe+0x70/0xf0
The dmesg logs indicated that the PCI bridge was scanning with an invalid bus range:
pci-host-generic 878020000000.pci: PCI host bridge to bus 0002:00
pci_bus 0002:00: root bus resource [bus 00-ff]
pci 0002:00:00.0: scanning [bus f9-f9] behind bridge, pass 0
pci 0002:00:01.0: scanning [bus fa-fa] behind bridge, pass 0
pci 0002:00:02.0: scanning [bus fb-fb] behind bridge, pass 0
pci 0002:00:03.0: scanning [bus fc-fc] behind bridge, pass 0
pci 0002:00:04.0: scanning [bus fd-fd] behind bridge, pass 0
pci 0002:00:05.0: scanning [bus fe-fe] behind bridge, pass 0
pci 0002:00:06.0: scanning [bus ff-ff] behind bridge, pass 0
pci 0002:00:07.0: scanning [bus 00-00] behind bridge, pass 0
pci 0002:00:07.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0002:00:08.0: scanning [bus 01-01] behind bridge, pass 0
pci 0002:00:09.0: scanning [bus 02-02] behind bridge, pass 0
pci 0002:00:0a.0: scanning [bus 03-03] behind bridge, pass 0
pci 0002:00:0b.0: scanning [bus 04-04] behind bridge, pass 0
pci 0002:00:0c.0: scanning [bus 05-05] behind bridge, pass 0
pci 0002:00:0d.0: scanning [bus 06-06] behind bridge, pass 0
pci 0002:00:0e.0: scanning [bus 07-07] behind bridge, pass 0
pci 0002:00:0f.0: scanning [bus 08-08] behind bridge, pass 0
This regression was introduced by commit 7246a4520b4b ("PCI: Use
preserve_config in place of pci_flags"). On our board, the 0002:00:07.0
bridge is misconfigured by the bootloader. Both its secondary and
subordinate bus numbers are initialized to 0, while its fixed secondary
bus number is set to 8. However, bus number 8 is also assigned to another
bridge (0002:00:0f.0). Although this is a bootloader issue, before the
change in commit 7246a4520b4b, the PCI_REASSIGN_ALL_BUS flag was set
by default when PCI_PROBE_ONLY was not enabled, ensuring that all the
bus numbers for these bridges were reassigned, avoiding any conflicts.
After the change introduced in commit 7246a4520b4b, the bus numbers
assigned by the bootloader are reused by all other bridges, except
the misconfigured 0002:00:07.0 bridge. The kernel attempts to reconfigure
0002:00:07.0 by reusing the fixed secondary bus number 8 assigned by the
bootloader. However, since a pci_bus has already been allocated for
bus 8 due to the probe of 0002:00:0f.0, no new pci_bus is allocated for
0002:00:07.0. This results in a PCI bridge device without a pci_bus
attached (pdev->subordinate == NULL). Consequently, accessing
pdev->subordinate in of_pci_prop_bus_range() leads to a NULL pointer
dereference.
To summarize, we need to set the PCI_REASSIGN_ALL_BUS flag when
PCI_PROBE_ONLY is not enabled in order to work around issues like the
one described above.
Cc: stable(a)vger.kernel.org
Fixes: 7246a4520b4b ("PCI: Use preserve_config in place of pci_flags")
Signed-off-by: Bo Sun <Bo.Sun.CN(a)windriver.com>
---
drivers/pci/quirks.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 82b21e34c545..cec58c7479e1 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -6181,6 +6181,23 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1536, rom_bar_overlap_defect);
DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1537, rom_bar_overlap_defect);
DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1538, rom_bar_overlap_defect);
+/*
+ * Quirk for Marvell CN96XX/CN10XXX boards:
+ *
+ * Adds PCI_REASSIGN_ALL_BUS unless PCI_PROBE_ONLY is set, forcing bus number
+ * reassignment to avoid conflicts caused by bootloader misconfigured PCI bridges.
+ *
+ * This resolves a regression introduced by commit 7246a4520b4b ("PCI: Use
+ * preserve_config in place of pci_flags"), which removed this behavior.
+ */
+static void quirk_marvell_cn96xx_cn10xxx_reassign_all_busnr(struct pci_dev *dev)
+{
+ if (!pci_has_flag(PCI_PROBE_ONLY))
+ pci_add_flags(PCI_REASSIGN_ALL_BUS);
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_CAVIUM, 0xa002,
+ quirk_marvell_cn96xx_cn10xxx_reassign_all_busnr);
+
#ifdef CONFIG_PCIEASPM
/*
* Several Intel DG2 graphics devices advertise that they can only tolerate
--
2.48.1
On the arm64 platform with the 4K base page config, SECTION_SIZE_BITS is set
to 27, making one section 128M. The corresponding page structs that vmemmap
points to then occupy 2M per section.
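For reference, the arithmetic behind these numbers (assuming 4K pages and
the usual 64-byte struct page): one section is 2^27 bytes = 128M, i.e.
2^27 / 2^12 = 32768 pages, and 32768 * 64 bytes = 2M of vmemmap per
section, exactly one PMD-sized mapping.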
Commit c1cc1552616d ("arm64: MMU initialisation") optimizes the
vmemmap to populate at the PMD section level, which was suitable
initially since the hotplug granularity was always one section (128M).
However, commit ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
introduced a 2M (SUBSECTION_SIZE) hotplug granularity, which disrupted the
existing arm64 assumptions.
The first problem is that if start or end is not aligned to a section
boundary, such as when a subsection is hot added, populating the entire
section is wasteful.
The next problem is that if we hotplug something that spans part of a 128 MiB
section (subsections, let's call it memblock1), then hotplug something
that spans another part of the same 128 MiB section (subsections, let's call
it memblock2), and subsequently unplug memblock1, vmemmap_free() will clear
the entire PMD entry which also backs memblock2, even though memblock2
is still active.
Assuming hotplug/unplug sizes are guaranteed to be symmetric, do the
fix similarly to x86-64: populate at the base-page level if start/end is not
aligned to a section boundary.
Cc: <stable(a)vger.kernel.org> # v5.4+
Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
Acked-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Zhenhua Huang <quic_zhenhuah(a)quicinc.com>
---
arch/arm64/mm/mmu.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index b4df5bc5b1b8..1dfe1a8efdbe 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1177,8 +1177,11 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
struct vmem_altmap *altmap)
{
WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
+ /* [start, end] should be within one section */
+ WARN_ON_ONCE(end - start > PAGES_PER_SECTION * sizeof(struct page));
- if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES))
+ if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES) ||
+ (end - start < PAGES_PER_SECTION * sizeof(struct page)))
return vmemmap_populate_basepages(start, end, node, altmap);
else
return vmemmap_populate_hugepages(start, end, node, altmap);
--
2.25.1
Some users are reporting that ov08x40_identify_module() fails
to identify the chip, reading 0x00 as the value of OV08X40_REG_CHIP_ID.
Intel's out-of-tree IPU6 drivers include some ov08x40 changes for older
kernels, including adding support for the reset GPIO, and Intel's patch
for this uses a 5 ms sleep. Extend the sleep to 5 ms following Intel's
example; this fixes the ov08x40_identify_module() problem.
Link: https://github.com/intel/ipu6-drivers/blob/c09e2198d801e1eb701984d294837312…
Fixes: df1ae2251a50 ("media: ov08x40: Add OF probe support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
---
drivers/media/i2c/ov08x40.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/i2c/ov08x40.c b/drivers/media/i2c/ov08x40.c
index cf0e41fc3071..54575eea3c49 100644
--- a/drivers/media/i2c/ov08x40.c
+++ b/drivers/media/i2c/ov08x40.c
@@ -1341,7 +1341,7 @@ static int ov08x40_power_on(struct device *dev)
}
gpiod_set_value_cansleep(ov08x->reset_gpio, 0);
- usleep_range(1500, 1800);
+ usleep_range(5000, 5500);
return 0;
--
2.48.1
On Tue, Mar 11, 2025 at 06:54:00AM +0000, Cameron Williams wrote:
> Cc'ing stable
>
> Cc: stable(a)vger.kernel.org
>
<formletter>
This is not the correct way to submit patches for inclusion in the
stable kernel tree. Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.
</formletter>
The patch titled
Subject: memcg: drain obj stock on cpu hotplug teardown
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
memcg-drain-obj-stock-on-cpu-hotplug-teardown.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Shakeel Butt <shakeel.butt(a)linux.dev>
Subject: memcg: drain obj stock on cpu hotplug teardown
Date: Mon, 10 Mar 2025 16:09:34 -0700
Currently on cpu hotplug teardown, only the memcg stock is drained, but we
need to drain the obj stock as well, otherwise we will miss the stats
accumulated on the target cpu as well as the cached nr_bytes. The stats
include MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B. In
addition, we are leaking a reference to the struct obj_cgroup object.
Link: https://lkml.kernel.org/r/20250310230934.2913113-1-shakeel.butt@linux.dev
Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
Signed-off-by: Shakeel Butt <shakeel.butt(a)linux.dev>
Cc:
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/mm/memcontrol.c~memcg-drain-obj-stock-on-cpu-hotplug-teardown
+++ a/mm/memcontrol.c
@@ -1921,9 +1921,18 @@ void drain_all_stock(struct mem_cgroup *
static int memcg_hotplug_cpu_dead(unsigned int cpu)
{
struct memcg_stock_pcp *stock;
+ struct obj_cgroup *old;
+ unsigned long flags;
stock = &per_cpu(memcg_stock, cpu);
+
+ /* drain_obj_stock requires stock_lock */
+ local_lock_irqsave(&memcg_stock.stock_lock, flags);
+ old = drain_obj_stock(stock);
+ local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+
drain_stock(stock);
+ obj_cgroup_put(old);
return 0;
}
_
Patches currently in -mm which might be from shakeel.butt(a)linux.dev are
memcg-drain-obj-stock-on-cpu-hotplug-teardown.patch
memcg-add-hierarchical-effective-limits-for-v2.patch
memcg-dont-call-propagate_protected_usage-for-v1.patch
page_counter-track-failcnt-only-for-legacy-cgroups.patch
page_counter-reduce-struct-page_counter-size.patch
memcg-bypass-root-memcg-check-for-skmem-charging.patch
The patch titled
Subject: mm/huge_memory: drop beyond-EOF folios with the right number of refs.
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-huge_memory-drop-beyond-eof-folios-with-the-right-number-of-refs.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Zi Yan <ziy(a)nvidia.com>
Subject: mm/huge_memory: drop beyond-EOF folios with the right number of refs.
Date: Mon, 10 Mar 2025 11:57:27 -0400
When an after-split folio is large and needs to be dropped due to EOF,
folio_put_refs(folio, folio_nr_pages(folio)) should be used to drop all
page cache refs. Otherwise, the folio will not be freed, causing a memory
leak.
This leak would happen on a filesystem with blocksize > page_size when a
truncate is performed, where the blocksize makes folios split to >0 order
ones, causing truncated folios not to be freed.
Link: https://lkml.kernel.org/r/20250310155727.472846-1-ziy@nvidia.com
Fixes: c010d47f107f ("mm: thp: split huge page to any lower order pages")
Signed-off-by: Zi Yan <ziy(a)nvidia.com>
Reported-by: Hugh Dickins <hughd(a)google.com>
Closes: https://lore.kernel.org/all/fcbadb7f-dd3e-21df-f9a7-2853b53183c4@google.com/
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Pankaj Raghav <p.raghav(a)samsung.com>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Yang Shi <yang(a)os.amperecomputing.com>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/huge_memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/huge_memory.c~mm-huge_memory-drop-beyond-eof-folios-with-the-right-number-of-refs
+++ a/mm/huge_memory.c
@@ -3304,7 +3304,7 @@ static void __split_huge_page(struct pag
folio_account_cleaned(tail,
inode_to_wb(folio->mapping->host));
__filemap_remove_folio(tail, NULL);
- folio_put(tail);
+ folio_put_refs(tail, folio_nr_pages(tail));
} else if (!folio_test_anon(folio)) {
__xa_store(&folio->mapping->i_pages, tail->index,
tail, 0);
_
Patches currently in -mm which might be from ziy(a)nvidia.com are
mm-migrate-fix-shmem-xarray-update-during-migration.patch
mm-huge_memory-drop-beyond-eof-folios-with-the-right-number-of-refs.patch
selftests-mm-make-file-backed-thp-split-work-by-writing-pmd-size-data.patch
mm-huge_memory-allow-split-shmem-large-folio-to-any-lower-order.patch
selftests-mm-test-splitting-file-backed-thp-to-any-lower-order.patch
xarray-add-xas_try_split-to-split-a-multi-index-entry.patch
mm-huge_memory-add-two-new-not-yet-used-functions-for-folio_split.patch
mm-huge_memory-add-two-new-not-yet-used-functions-for-folio_split-fix.patch
mm-huge_memory-move-folio-split-common-code-to-__folio_split.patch
mm-huge_memory-add-buddy-allocator-like-non-uniform-folio_split.patch
mm-huge_memory-remove-the-old-unused-__split_huge_page.patch
mm-huge_memory-add-folio_split-to-debugfs-testing-interface.patch
mm-truncate-use-folio_split-in-truncate-operation.patch
selftests-mm-add-tests-for-folio_split-buddy-allocator-like-split.patch
mm-filemap-use-xas_try_split-in-__filemap_add_folio.patch
mm-shmem-use-xas_try_split-in-shmem_split_large_entry.patch
The patch titled
Subject: mm/mremap: correctly handle partial mremap() of VMA starting at 0
has been added to the -mm mm-unstable branch. Its filename is
mm-mremap-correctly-handle-partial-mremap-of-vma-starting-at-0.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Subject: mm/mremap: correctly handle partial mremap() of VMA starting at 0
Date: Mon, 10 Mar 2025 20:50:34 +0000
Patch series "refactor mremap and fix bug", v3.
The existing mremap() logic has grown organically over a very long period
of time, resulting in code that is in many parts, very difficult to follow
and full of subtleties and sources of confusion.
In addition, it is difficult to thread state through the operation
correctly, as function arguments have expanded, some parameters are
expected to be temporarily altered during the operation, others are
intended to remain static and some can be overridden.
This series completely refactors the mremap implementation, sensibly
separating functions, adding comments to explain the more subtle aspects
of the implementation and making use of small structs to thread state
through everything.
The reason for doing so is to lay the groundwork for planned future
changes to the mremap logic, changes which require the ability to easily
pass around state.
Additionally, it would be unhelpful to add yet more logic to code that is
already difficult to follow without first refactoring it like this.
The first patch in this series additionally fixes a bug when a VMA with
start address zero is partially remapped.
Tested on real hardware under heavy workload and all self tests are
passing.
This patch (of 3):
Consider the case of a partial mremap() (that results in a VMA split) of
an accountable VMA (i.e. which has the VM_ACCOUNT flag set) whose start
address is zero, with the MREMAP_MAYMOVE flag specified and a scenario
where a move does in fact occur:
addr end
| |
v v
|-------------|
| vma |
|-------------|
0
This move is effected by unmapping the range [addr, end). In order to
prevent an incorrect decrement of accounted memory which has already been
determined, the mremap() code in move_vma() clears VM_ACCOUNT from the VMA
prior to doing so, before reestablishing it in each of the VMAs
post-split:
addr end
| |
v v
|---| |---|
| A | | B |
|---| |---|
Commit 6b73cff239e5 ("mm: change munmap splitting order and move_vma()")
changed this logic such as to determine whether there is a need to do so
by establishing account_start and account_end and, in the instance where
such an operation is required, assigning them to vma->vm_start and
vma->vm_end.
Later the code checks if the operation is required for 'A' referenced
above thusly:
if (account_start) {
...
}
However, if the VMA described above has vma->vm_start == 0, which is now
assigned to account_start, this branch will not be executed.
As a result, the VMA 'A' above will remain stripped of its VM_ACCOUNT
flag, incorrectly.
The fix is to simply convert these variables to booleans and set them as
required.
Link: https://lkml.kernel.org/r/cover.1741639347.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/dc55cb6db25d97c3d9e460de4986a323fa959676.17416393…
Fixes: 6b73cff239e5 ("mm: change munmap splitting order and move_vma()")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reviewed-by: Harry Yoo <harry.yoo(a)oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mremap.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
--- a/mm/mremap.c~mm-mremap-correctly-handle-partial-mremap-of-vma-starting-at-0
+++ a/mm/mremap.c
@@ -705,8 +705,8 @@ static unsigned long move_vma(struct vm_
unsigned long vm_flags = vma->vm_flags;
unsigned long new_pgoff;
unsigned long moved_len;
- unsigned long account_start = 0;
- unsigned long account_end = 0;
+ bool account_start = false;
+ bool account_end = false;
unsigned long hiwater_vm;
int err = 0;
bool need_rmap_locks;
@@ -790,9 +790,9 @@ static unsigned long move_vma(struct vm_
if (vm_flags & VM_ACCOUNT && !(flags & MREMAP_DONTUNMAP)) {
vm_flags_clear(vma, VM_ACCOUNT);
if (vma->vm_start < old_addr)
- account_start = vma->vm_start;
+ account_start = true;
if (vma->vm_end > old_addr + old_len)
- account_end = vma->vm_end;
+ account_end = true;
}
/*
@@ -832,7 +832,7 @@ static unsigned long move_vma(struct vm_
/* OOM: unable to split vma, just get accounts right */
if (vm_flags & VM_ACCOUNT && !(flags & MREMAP_DONTUNMAP))
vm_acct_memory(old_len >> PAGE_SHIFT);
- account_start = account_end = 0;
+ account_start = account_end = false;
}
if (vm_flags & VM_LOCKED) {
_
Patches currently in -mm which might be from lorenzo.stoakes(a)oracle.com are
mm-simplify-vma-merge-structure-and-expand-comments.patch
mm-further-refactor-commit_merge.patch
mm-eliminate-adj_start-parameter-from-commit_merge.patch
mm-make-vmg-target-consistent-and-further-simplify-commit_merge.patch
mm-completely-abstract-unnecessary-adj_start-calculation.patch
mm-madvise-split-out-mmap-locking-operations-for-madvise-fix.patch
mm-use-read-write_once-for-vma-vm_flags-on-migrate-mprotect.patch
mm-refactor-rmap_walk_file-to-separate-out-traversal-logic.patch
mm-provide-mapping_wrprotect_range-function.patch
fb_defio-do-not-use-deprecated-page-mapping-index-fields.patch
fb_defio-do-not-use-deprecated-page-mapping-index-fields-fix.patch
mm-allow-guard-regions-in-file-backed-and-read-only-mappings.patch
selftests-mm-rename-guard-pages-to-guard-regions.patch
selftests-mm-rename-guard-pages-to-guard-regions-fix.patch
tools-selftests-expand-all-guard-region-tests-to-file-backed.patch
tools-selftests-add-file-shmem-backed-mapping-guard-region-tests.patch
fs-proc-task_mmu-add-guard-region-bit-to-pagemap.patch
tools-selftests-add-guard-region-test-for-proc-pid-pagemap.patch
tools-selftests-add-guard-region-test-for-proc-pid-pagemap-fix.patch
mm-mremap-correctly-handle-partial-mremap-of-vma-starting-at-0.patch
mm-mremap-refactor-mremap-system-call-implementation.patch
mm-mremap-introduce-and-use-vma_remap_struct-threaded-state.patch
mm-mremap-initial-refactor-of-move_vma.patch
mm-mremap-complete-refactor-of-move_vma.patch
mm-mremap-refactor-move_page_tables-abstracting-state.patch
mm-mremap-thread-state-through-move-page-table-operation.patch
The handling of the MST Connection Status Notify message is skipped if
the probing of the topology is still pending. Acquiring the
drm_dp_mst_topology_mgr::probe_lock for this in
drm_dp_mst_handle_up_req() is problematic: the task/work this function
is called from is also responsible for handling MST down-request replies
(in drm_dp_mst_handle_down_rep()). Thus drm_dp_mst_link_probe_work() -
holding already probe_lock - could be blocked waiting for an MST
down-request reply while drm_dp_mst_handle_up_req() is waiting for
probe_lock while processing a CSN message. This leads to the probe
work's down-request message timing out.
A scenario similar to the above leading to a down-request timeout is
handling a CSN message in drm_dp_mst_handle_conn_stat(), holding the
probe_lock and sending down-request messages while a second CSN message
sent by the sink subsequently is handled by drm_dp_mst_handle_up_req().
Fix the above by moving the logic to skip the CSN handling to
drm_dp_mst_process_up_req(). This function is called from a work
(separate from the task/work handling new up/down messages), already
holding probe_lock. This solves the above timeout issue, since handling
of down-request replies won't be blocked by probe_lock.
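Put as a simplified sequence (derived from the description above):
1. drm_dp_mst_link_probe_work() takes probe_lock and sends a down-request,
   then waits for the reply.
2. The task/work that would process that reply via
   drm_dp_mst_handle_down_rep() instead enters drm_dp_mst_handle_up_req()
   for a CSN and blocks on probe_lock.
3. The down-request reply is never processed, so the probe work's message
   times out.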
Fixes: ddf983488c3e ("drm/dp_mst: Skip CSN if topology probing is not done yet")
Cc: Wayne Lin <Wayne.Lin(a)amd.com>
Cc: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org # v6.6+
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
---
drivers/gpu/drm/display/drm_dp_mst_topology.c | 40 +++++++++++--------
1 file changed, 24 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
index 8b68bb3fbffb0..3a1f1ffc7b552 100644
--- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
@@ -4036,6 +4036,22 @@ static int drm_dp_mst_handle_down_rep(struct drm_dp_mst_topology_mgr *mgr)
return 0;
}
+static bool primary_mstb_probing_is_done(struct drm_dp_mst_topology_mgr *mgr)
+{
+ bool probing_done = false;
+
+ mutex_lock(&mgr->lock);
+
+ if (mgr->mst_primary && drm_dp_mst_topology_try_get_mstb(mgr->mst_primary)) {
+ probing_done = mgr->mst_primary->link_address_sent;
+ drm_dp_mst_topology_put_mstb(mgr->mst_primary);
+ }
+
+ mutex_unlock(&mgr->lock);
+
+ return probing_done;
+}
+
static inline bool
drm_dp_mst_process_up_req(struct drm_dp_mst_topology_mgr *mgr,
struct drm_dp_pending_up_req *up_req)
@@ -4066,8 +4082,12 @@ drm_dp_mst_process_up_req(struct drm_dp_mst_topology_mgr *mgr,
/* TODO: Add missing handler for DP_RESOURCE_STATUS_NOTIFY events */
if (msg->req_type == DP_CONNECTION_STATUS_NOTIFY) {
- dowork = drm_dp_mst_handle_conn_stat(mstb, &msg->u.conn_stat);
- hotplug = true;
+ if (!primary_mstb_probing_is_done(mgr)) {
+ drm_dbg_kms(mgr->dev, "Got CSN before finish topology probing. Skip it.\n");
+ } else {
+ dowork = drm_dp_mst_handle_conn_stat(mstb, &msg->u.conn_stat);
+ hotplug = true;
+ }
}
drm_dp_mst_topology_put_mstb(mstb);
@@ -4149,10 +4169,11 @@ static int drm_dp_mst_handle_up_req(struct drm_dp_mst_topology_mgr *mgr)
drm_dp_send_up_ack_reply(mgr, mst_primary, up_req->msg.req_type,
false);
+ drm_dp_mst_topology_put_mstb(mst_primary);
+
if (up_req->msg.req_type == DP_CONNECTION_STATUS_NOTIFY) {
const struct drm_dp_connection_status_notify *conn_stat =
&up_req->msg.u.conn_stat;
- bool handle_csn;
drm_dbg_kms(mgr->dev, "Got CSN: pn: %d ldps:%d ddps: %d mcs: %d ip: %d pdt: %d\n",
conn_stat->port_number,
@@ -4161,16 +4182,6 @@ static int drm_dp_mst_handle_up_req(struct drm_dp_mst_topology_mgr *mgr)
conn_stat->message_capability_status,
conn_stat->input_port,
conn_stat->peer_device_type);
-
- mutex_lock(&mgr->probe_lock);
- handle_csn = mst_primary->link_address_sent;
- mutex_unlock(&mgr->probe_lock);
-
- if (!handle_csn) {
- drm_dbg_kms(mgr->dev, "Got CSN before finish topology probing. Skip it.");
- kfree(up_req);
- goto out_put_primary;
- }
} else if (up_req->msg.req_type == DP_RESOURCE_STATUS_NOTIFY) {
const struct drm_dp_resource_status_notify *res_stat =
&up_req->msg.u.resource_stat;
@@ -4185,9 +4196,6 @@ static int drm_dp_mst_handle_up_req(struct drm_dp_mst_topology_mgr *mgr)
list_add_tail(&up_req->next, &mgr->up_req_list);
mutex_unlock(&mgr->up_req_lock);
queue_work(system_long_wq, &mgr->up_req_work);
-
-out_put_primary:
- drm_dp_mst_topology_put_mstb(mst_primary);
out_clear_reply:
reset_msg_rx_state(&mgr->up_req_recv);
return ret;
--
2.44.2
Dear stable team,
I noticed that ceeeb99cd821 ("dmaengine: mxs: rename custom flag") got backported, but the additional fix 269e31aecdd0 ("spi-mxs: Fix chipselect glitch") hasn't.
I think this was caused by the lack of a Cc to stable. Without the latter patch, SPI causes glitches on the MXS platform.
Please backport it to the stable trees from 5.4 to 6.6.
Thanks
Stefan
Sometimes I get a NULL pointer dereference at boot time in kobject_get()
with the following call stack:
anatop_regulator_probe()
devm_regulator_register()
regulator_register()
regulator_resolve_supply()
kobject_get()
By placing some extra BUG_ON() statements I could verify that this is
raised because probing of the 'dummy' regulator driver is not completed
('dummy_regulator_rdev' is still NULL).
In the JTAG debugger I can see that dummy_regulator_probe() and
anatop_regulator_probe() can be run by different kernel threads
(kworker/u4:*). I haven't further investigated whether this can be
changed or if there are other possibilities to force synchronization
between these two probe routines. On the other hand I don't expect much
boot time penalty by probing the 'dummy' regulator synchronously.
Cc: stable(a)vger.kernel.org
Fixes: 259b93b21a9f ("regulator: Set PROBE_PREFER_ASYNCHRONOUS for drivers that existed in 4.14")
Signed-off-by: Christian Eggers <ceggers(a)arri.de>
---
drivers/regulator/dummy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/regulator/dummy.c b/drivers/regulator/dummy.c
index 5b9b9e4e762d..9f59889129ab 100644
--- a/drivers/regulator/dummy.c
+++ b/drivers/regulator/dummy.c
@@ -60,7 +60,7 @@ static struct platform_driver dummy_regulator_driver = {
.probe = dummy_regulator_probe,
.driver = {
.name = "reg-dummy",
- .probe_type = PROBE_PREFER_ASYNCHRONOUS,
+ .probe_type = PROBE_FORCE_SYNCHRONOUS,
},
};
--
2.43.0
Upon encountering errors in the HSIC pinctrl handling section, the
regulator should be disabled.
After the above-stated changes, it is possible to jump to the
"disable_hsic_regulator" label without having added the CPU latency QoS
request beforehand. This would result in cpu_latency_qos_remove_request()
yielding a WARNING.
So rearrange the error handling path to follow the reverse order of the
probing phases.
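A minimal sketch of the resulting structure (hypothetical setup_*/teardown_*
helpers, not the driver code): error labels unwind strictly in reverse order
of the probe steps, so a failure before the QoS request is added can never
reach cpu_latency_qos_remove_request().

	static int probe_sketch(void)
	{
		int ret;

		ret = setup_regulator();	/* e.g. HSIC pad regulator */
		if (ret)
			return ret;
		ret = setup_pinctrl();		/* e.g. hsic_idle/hsic_active */
		if (ret)
			goto undo_regulator;
		ret = setup_qos();		/* e.g. cpu_latency_qos_add_request() */
		if (ret)
			goto undo_pinctrl;
		return 0;

	undo_pinctrl:
		teardown_pinctrl();
	undo_regulator:
		teardown_regulator();
		return ret;
	}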
Found by Linux Verification Center (linuxtesting.org).
Fixes: 4d6141288c33 ("usb: chipidea: imx: pinctrl for HSIC is optional")
Cc: stable(a)vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin(a)ispras.ru>
---
drivers/usb/chipidea/ci_hdrc_imx.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/usb/chipidea/ci_hdrc_imx.c b/drivers/usb/chipidea/ci_hdrc_imx.c
index 619779eef333..3f11ae071c7f 100644
--- a/drivers/usb/chipidea/ci_hdrc_imx.c
+++ b/drivers/usb/chipidea/ci_hdrc_imx.c
@@ -407,13 +407,13 @@ static int ci_hdrc_imx_probe(struct platform_device *pdev)
"pinctrl_hsic_idle lookup failed, err=%ld\n",
PTR_ERR(pinctrl_hsic_idle));
ret = PTR_ERR(pinctrl_hsic_idle);
- goto err_put;
+ goto disable_hsic_regulator;
}
ret = pinctrl_select_state(data->pinctrl, pinctrl_hsic_idle);
if (ret) {
dev_err(dev, "hsic_idle select failed, err=%d\n", ret);
- goto err_put;
+ goto disable_hsic_regulator;
}
data->pinctrl_hsic_active = pinctrl_lookup_state(data->pinctrl,
@@ -423,7 +423,7 @@ static int ci_hdrc_imx_probe(struct platform_device *pdev)
"pinctrl_hsic_active lookup failed, err=%ld\n",
PTR_ERR(data->pinctrl_hsic_active));
ret = PTR_ERR(data->pinctrl_hsic_active);
- goto err_put;
+ goto disable_hsic_regulator;
}
}
@@ -432,11 +432,11 @@ static int ci_hdrc_imx_probe(struct platform_device *pdev)
ret = imx_get_clks(dev);
if (ret)
- goto disable_hsic_regulator;
+ goto qos_remove_request;
ret = imx_prepare_enable_clks(dev);
if (ret)
- goto disable_hsic_regulator;
+ goto qos_remove_request;
ret = clk_prepare_enable(data->clk_wakeup);
if (ret)
@@ -526,12 +526,13 @@ static int ci_hdrc_imx_probe(struct platform_device *pdev)
clk_disable_unprepare(data->clk_wakeup);
err_wakeup_clk:
imx_disable_unprepare_clks(dev);
+qos_remove_request:
+ if (pdata.flags & CI_HDRC_PMQOS)
+ cpu_latency_qos_remove_request(&data->pm_qos_req);
disable_hsic_regulator:
if (data->hsic_pad_regulator)
/* don't overwrite original ret (cf. EPROBE_DEFER) */
regulator_disable(data->hsic_pad_regulator);
- if (pdata.flags & CI_HDRC_PMQOS)
- cpu_latency_qos_remove_request(&data->pm_qos_req);
data->ci_pdev = NULL;
err_put:
if (data->usbmisc_data)
--
2.48.1
On Sun, Mar 09, 2025 at 03:45:57PM -0400, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> drm/i915: Plumb 'dsb' all way to the plane hooks
>
> to the 6.12-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> drm-i915-plumb-dsb-all-way-to-the-plane-hooks.patch
> and it can be found in the queue-6.12 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit f03e7cca22f4bb50cae98840f91fcf1e6d780a54
> Author: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
> Date: Mon Sep 30 20:04:13 2024 +0300
>
> drm/i915: Plumb 'dsb' all way to the plane hooks
>
> [ Upstream commit 01389846f7d61d262cc92d42ad4d1a25730e3eff ]
It would help if you actually mentioned *why* you need to backport this?
--
Ville Syrjälä
Intel
Hello,
New build issue found on stable-rc/linux-6.6.y:
---
‘RISCV_ISA_EXT_XLINUXENVCFG’ undeclared (first use in this function);
did you mean ‘RISCV_ISA_EXT_ZIFENCEI’? in arch/riscv/kernel/suspend.o
(arch/riscv/kernel/suspend.c) [logspec:kbuild,kbuild.compiler.error]
---
- dashboard: https://d.kernelci.org/issue/maestro:f277022d07efdd2a5858eb44b3c3dab79cca84…
- giturl: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
- commit HEAD: b49d45c66a5e8cc1c82591049bfc0d04daa1e77c
Log excerpt:
=====================================================
arch/riscv/kernel/suspend.c:14:66: error: ‘RISCV_ISA_EXT_XLINUXENVCFG’
undeclared (first use in this function); did you mean
‘RISCV_ISA_EXT_ZIFENCEI’?
14 | if
(riscv_cpu_has_extension_unlikely(smp_processor_id(),
RISCV_ISA_EXT_XLINUXENVCFG))
|
^~~~~~~~~~~~~~~~~~~~~~~~~~
|
RISCV_ISA_EXT_ZIFENCEI
arch/riscv/kernel/suspend.c:14:66: note: each undeclared identifier is
reported only once for each function it appears in
CC fs/proc/cpuinfo.o
arch/riscv/kernel/suspend.c: In function ‘suspend_restore_csrs’:
arch/riscv/kernel/suspend.c:37:66: error: ‘RISCV_ISA_EXT_XLINUXENVCFG’
undeclared (first use in this function); did you mean
‘RISCV_ISA_EXT_ZIFENCEI’?
37 | if
(riscv_cpu_has_extension_unlikely(smp_processor_id(),
RISCV_ISA_EXT_XLINUXENVCFG))
|
^~~~~~~~~~~~~~~~~~~~~~~~~~
|
RISCV_ISA_EXT_ZIFENCEI
=====================================================
# Builds where the incident occurred:
## defconfig on (riscv):
- compiler: gcc-12
- dashboard: https://d.kernelci.org/build/maestro:67cf00ee18018371957ec83e
#kernelci issue maestro:f277022d07efdd2a5858eb44b3c3dab79cca847e
Reported-by: kernelci.org bot <bot(a)kernelci.org>
--
This is an experimental report format. Please send feedback in!
Talk to us at kernelci(a)lists.linux.dev
Made with love by the KernelCI team - https://kernelci.org
When an after-split folio is large and needs to be dropped due to EOF,
folio_put_refs(folio, folio_nr_pages(folio)) should be used to drop
all page cache refs. Otherwise, the folio will not be freed, causing a
memory leak.
This leak would happen on a filesystem with blocksize > page_size when a
truncate is performed, where the blocksize makes folios split to
>0 order ones, causing truncated folios not to be freed.
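As a concrete illustration (an order-2 example, not taken from the report):
such an after-split tail folio holds folio_nr_pages() = 4 page cache
references; folio_put() drops only one of them, leaving three, so the folio
can never be freed, while folio_put_refs(tail, folio_nr_pages(tail)) drops
all four.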
Fixes: c010d47f107f ("mm: thp: split huge page to any lower order pages")
Reported-by: Hugh Dickins <hughd(a)google.com>
Closes: https://lore.kernel.org/all/fcbadb7f-dd3e-21df-f9a7-2853b53183c4@google.com/
Cc: stable(a)vger.kernel.org
Signed-off-by: Zi Yan <ziy(a)nvidia.com>
---
mm/huge_memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3d3ebdc002d5..373781b21e5c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3304,7 +3304,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
folio_account_cleaned(tail,
inode_to_wb(folio->mapping->host));
__filemap_remove_folio(tail, NULL);
- folio_put(tail);
+ folio_put_refs(tail, folio_nr_pages(tail));
} else if (!folio_test_anon(folio)) {
__xa_store(&folio->mapping->i_pages, tail->index,
tail, 0);
--
2.47.2
The macb ethernet driver (Raspberry Pi 5) delivers interrupts only to
the first core, quickly saturating it at higher packet rates.
Introducing software interrupt coalescing dramatically alleviates this
limitation; the oneliner fix is upstream at
d57f7b45945ac0517ff8ea50655f00db6e8d637c.
Please backport this fix to 6.6 -stable to bring this benefit to more
Raspberry Pis; it applies cleanly on this branch.
Many thanks,
Daniel
--
Daniel J Blueman
usbmisc is an optional device property, so it is totally valid for the
corresponding data->usbmisc_data to have a NULL value.
Check that before dereferencing the pointer.
Found by Linux Verification Center (linuxtesting.org) with Svace static
analysis tool.
Fixes: 74adad500346 ("usb: chipidea: ci_hdrc_imx: decrement device's refcount in .remove() and in the error path of .probe()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin(a)ispras.ru>
---
drivers/usb/chipidea/ci_hdrc_imx.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/chipidea/ci_hdrc_imx.c b/drivers/usb/chipidea/ci_hdrc_imx.c
index 1a7fc638213e..619779eef333 100644
--- a/drivers/usb/chipidea/ci_hdrc_imx.c
+++ b/drivers/usb/chipidea/ci_hdrc_imx.c
@@ -534,7 +534,8 @@ static int ci_hdrc_imx_probe(struct platform_device *pdev)
cpu_latency_qos_remove_request(&data->pm_qos_req);
data->ci_pdev = NULL;
err_put:
- put_device(data->usbmisc_data->dev);
+ if (data->usbmisc_data)
+ put_device(data->usbmisc_data->dev);
return ret;
}
@@ -559,7 +560,8 @@ static void ci_hdrc_imx_remove(struct platform_device *pdev)
if (data->hsic_pad_regulator)
regulator_disable(data->hsic_pad_regulator);
}
- put_device(data->usbmisc_data->dev);
+ if (data->usbmisc_data)
+ put_device(data->usbmisc_data->dev);
}
static void ci_hdrc_imx_shutdown(struct platform_device *pdev)
--
2.48.1
Hello,
New build issue found on stable-rc/linux-5.15.y:
---
implicit declaration of function ‘acpi_get_cache_info’; did you mean
‘acpi_get_system_info’? [-Werror=implicit-function-declaration] in
arch/riscv/kernel/cacheinfo.o (arch/riscv/kernel/cacheinfo.c)
[logspec:kbuild,kbuild.compiler.error]
---
- dashboard: https://d.kernelci.org/issue/maestro:c4d70565f303a7d7450fbf5add7ca4cc80a961…
- giturl: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
- commit HEAD: 2ae395ef666caf57984ff9d2ad7bca6be851f719
Log excerpt:
=====================================================
arch/riscv/kernel/cacheinfo.c:127:23: error: implicit declaration of
function ‘acpi_get_cache_info’; did you mean ‘acpi_get_system_info’?
[-Werror=implicit-function-declaration]
127 | ret = acpi_get_cache_info(cpu, &fw_levels,
&split_levels);
| ^~~~~~~~~~~~~~~~~~~
| acpi_get_system_info
cc1: some warnings being treated as errors
CC arch/riscv/kernel/patch.o
CC fs/proc/generic.o
=====================================================
# Builds where the incident occurred:
## defconfig on (riscv):
- compiler: gcc-12
- dashboard: https://d.kernelci.org/build/maestro:67ced73618018371957dfa8e
## nommu_k210_defconfig on (riscv):
- compiler: gcc-12
- dashboard: https://d.kernelci.org/build/maestro:67ced73a18018371957dfa91
#kernelci issue maestro:c4d70565f303a7d7450fbf5add7ca4cc80a96112
Reported-by: kernelci.org bot <bot(a)kernelci.org>
--
This is an experimental report format. Please send feedback in!
Talk to us at kernelci(a)lists.linux.dev
Made with love by the KernelCI team - https://kernelci.org
Hello,
New build issue found on stable-rc/linux-5.4.y:
---
implicit declaration of function ‘acpi_get_cache_info’; did you mean
‘acpi_get_system_info’? [-Werror=implicit-function-declaration] in
arch/riscv/kernel/cacheinfo.o (arch/riscv/kernel/cacheinfo.c)
[logspec:kbuild,kbuild.compiler.error]
---
- dashboard: https://d.kernelci.org/issue/maestro:0f2670909ac3275cc312c3c604f3ed03443fee…
- giturl: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
- commit HEAD: 2f9225fb6ea4ba2ad94f50f0e24bad9c353b8649
Log excerpt:
=====================================================
arch/riscv/kernel/cacheinfo.c:118:23: error: implicit declaration of
function ‘acpi_get_cache_info’; did you mean ‘acpi_get_system_info’?
[-Werror=implicit-function-declaration]
118 | ret = acpi_get_cache_info(cpu, &fw_levels,
&split_levels);
| ^~~~~~~~~~~~~~~~~~~
| acpi_get_system_info
arch/riscv/kernel/cacheinfo.c:140:13: error: implicit declaration of
function ‘of_property_present’; did you mean
‘fwnode_property_present’? [-Werror=implicit-function-declaration]
140 | if (of_property_present(np, "cache-size"))
| ^~~~~~~~~~~~~~~~~~~
| fwnode_property_present
CC arch/riscv/kernel/module-sections.o
CC arch/riscv/kernel/perf_regs.o
cc1: some warnings being treated as errors
=====================================================
# Builds where the incident occurred:
## defconfig on (riscv):
- compiler: gcc-12
- dashboard: https://d.kernelci.org/build/maestro:67ced63718018371957df9ae
#kernelci issue maestro:0f2670909ac3275cc312c3c604f3ed03443feecc
Reported-by: kernelci.org bot <bot(a)kernelci.org>
--
This is an experimental report format. Please send feedback in!
Talk to us at kernelci(a)lists.linux.dev
Made with love by the KernelCI team - https://kernelci.org
From: Stefan Eichenberger <stefan.eichenberger(a)toradex.com>
Ensure the PHY reset and perst are asserted during power-off to
guarantee it is in a reset state upon repeated power-on calls. This
resolves an issue where the PHY may not properly initialize during
subsequent power-on cycles. Power-on will deassert the reset at the
appropriate time after tuning the PHY parameters.
During suspend/resume cycles, we observed that the PHY PLL failed to
lock during resume when the CPU temperature increased from 65C to 75C.
The observed errors were:
phy phy-32f00000.pcie-phy.3: phy poweron failed --> -110
imx6q-pcie 33800000.pcie: waiting for PHY ready timeout!
imx6q-pcie 33800000.pcie: PM: dpm_run_callback(): genpd_resume_noirq+0x0/0x80 returns -110
imx6q-pcie 33800000.pcie: PM: failed to resume noirq: error -110
This resulted in a complete CPU freeze, which is resolved by ensuring
the PHY is in reset during power-on, thus preventing PHY PLL failures.
Cc: stable(a)vger.kernel.org
Fixes: 1aa97b002258 ("phy: freescale: pcie: Initialize the imx8 pcie standalone phy driver")
Reviewed-by: Frank Li <Frank.Li(a)nxp.com>
Acked-by: Richard Zhu <hongxing.zhu(a)nxp.com>
Signed-off-by: Stefan Eichenberger <stefan.eichenberger(a)toradex.com>
---
drivers/phy/freescale/phy-fsl-imx8m-pcie.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/phy/freescale/phy-fsl-imx8m-pcie.c b/drivers/phy/freescale/phy-fsl-imx8m-pcie.c
index 5b505e34ca364..7355d9921b646 100644
--- a/drivers/phy/freescale/phy-fsl-imx8m-pcie.c
+++ b/drivers/phy/freescale/phy-fsl-imx8m-pcie.c
@@ -156,6 +156,16 @@ static int imx8_pcie_phy_power_on(struct phy *phy)
return ret;
}
+static int imx8_pcie_phy_power_off(struct phy *phy)
+{
+ struct imx8_pcie_phy *imx8_phy = phy_get_drvdata(phy);
+
+ reset_control_assert(imx8_phy->reset);
+ reset_control_assert(imx8_phy->perst);
+
+ return 0;
+}
+
static int imx8_pcie_phy_init(struct phy *phy)
{
struct imx8_pcie_phy *imx8_phy = phy_get_drvdata(phy);
@@ -176,6 +186,7 @@ static const struct phy_ops imx8_pcie_phy_ops = {
.init = imx8_pcie_phy_init,
.exit = imx8_pcie_phy_exit,
.power_on = imx8_pcie_phy_power_on,
+ .power_off = imx8_pcie_phy_power_off,
.owner = THIS_MODULE,
};
--
2.45.2
This is a note to let you know that I've just added the patch titled
iio: dac: ad3552r: clear reset status flag
to the 6.1-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
iio-dac-ad3552r-clear-reset-status-flag.patch
and it can be found in the queue-6.1 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From e17b9f20da7d2bc1f48878ab2230523b2512d965 Mon Sep 17 00:00:00 2001
From: Angelo Dureghello <adureghello(a)baylibre.com>
Date: Sat, 25 Jan 2025 17:24:32 +0100
Subject: iio: dac: ad3552r: clear reset status flag
From: Angelo Dureghello <adureghello(a)baylibre.com>
commit e17b9f20da7d2bc1f48878ab2230523b2512d965 upstream.
Clear the reset status flag to keep the error status register clean after
reset (ad3552r manual, rev B, table 38).
The reset error flag was left at 1, so when debugging registers, the "Error
Status Register" was dirty (0x01). It is important to clear this bit so
that, if a reset event occurs during normal operation, it can be detected.
Fixes: 8f2b54824b28 ("drivers:iio:dac: Add AD3552R driver support")
Signed-off-by: Angelo Dureghello <adureghello(a)baylibre.com>
Link: https://patch.msgid.link/20250125-wip-bl-ad3552r-clear-reset-v2-1-aa3a27f3f…
Cc: <Stable@vger..kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/iio/dac/ad3552r.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/drivers/iio/dac/ad3552r.c
+++ b/drivers/iio/dac/ad3552r.c
@@ -703,6 +703,12 @@ static int ad3552r_reset(struct ad3552r_
return ret;
}
+ /* Clear reset error flag, see ad3552r manual, rev B table 38. */
+ ret = ad3552r_write_reg(dac, AD3552R_REG_ADDR_ERR_STATUS,
+ AD3552R_MASK_RESET_STATUS);
+ if (ret)
+ return ret;
+
return ad3552r_update_reg_field(dac,
addr_mask_map[AD3552R_ADDR_ASCENSION][0],
addr_mask_map[AD3552R_ADDR_ASCENSION][1],
Patches currently in stable-queue which might be from adureghello(a)baylibre.com are
queue-6.1/iio-dac-ad3552r-clear-reset-status-flag.patch
This is a note to let you know that I've just added the patch titled
iio: dac: ad3552r: clear reset status flag
to the 6.12-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
iio-dac-ad3552r-clear-reset-status-flag.patch
and it can be found in the queue-6.12 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From e17b9f20da7d2bc1f48878ab2230523b2512d965 Mon Sep 17 00:00:00 2001
From: Angelo Dureghello <adureghello(a)baylibre.com>
Date: Sat, 25 Jan 2025 17:24:32 +0100
Subject: iio: dac: ad3552r: clear reset status flag
From: Angelo Dureghello <adureghello(a)baylibre.com>
commit e17b9f20da7d2bc1f48878ab2230523b2512d965 upstream.
Clear the reset status flag to keep the error status register clean after
reset (ad3552r manual, rev B, table 38).
The reset error flag was left at 1, so when debugging registers, the "Error
Status Register" was dirty (0x01). It is important to clear this bit so
that, if a reset event occurs during normal operation, it can be detected.
Fixes: 8f2b54824b28 ("drivers:iio:dac: Add AD3552R driver support")
Signed-off-by: Angelo Dureghello <adureghello(a)baylibre.com>
Link: https://patch.msgid.link/20250125-wip-bl-ad3552r-clear-reset-v2-1-aa3a27f3f…
Cc: <Stable@vger..kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/iio/dac/ad3552r.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/drivers/iio/dac/ad3552r.c
+++ b/drivers/iio/dac/ad3552r.c
@@ -714,6 +714,12 @@ static int ad3552r_reset(struct ad3552r_
return ret;
}
+ /* Clear reset error flag, see ad3552r manual, rev B table 38. */
+ ret = ad3552r_write_reg(dac, AD3552R_REG_ADDR_ERR_STATUS,
+ AD3552R_MASK_RESET_STATUS);
+ if (ret)
+ return ret;
+
return ad3552r_update_reg_field(dac,
addr_mask_map[AD3552R_ADDR_ASCENSION][0],
addr_mask_map[AD3552R_ADDR_ASCENSION][1],
Patches currently in stable-queue which might be from adureghello(a)baylibre.com are
queue-6.12/iio-dac-ad3552r-clear-reset-status-flag.patch
This is a note to let you know that I've just added the patch titled
iio: dac: ad3552r: clear reset status flag
to the 6.13-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
iio-dac-ad3552r-clear-reset-status-flag.patch
and it can be found in the queue-6.13 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From e17b9f20da7d2bc1f48878ab2230523b2512d965 Mon Sep 17 00:00:00 2001
From: Angelo Dureghello <adureghello(a)baylibre.com>
Date: Sat, 25 Jan 2025 17:24:32 +0100
Subject: iio: dac: ad3552r: clear reset status flag
From: Angelo Dureghello <adureghello(a)baylibre.com>
commit e17b9f20da7d2bc1f48878ab2230523b2512d965 upstream.
Clear the reset status flag to keep the error status register clean after
reset (ad3552r manual, rev B, table 38).
The reset error flag was left at 1, so when debugging registers, the "Error
Status Register" was dirty (0x01). It is important to clear this bit so
that, if a reset event occurs during normal operation, it can be detected.
Fixes: 8f2b54824b28 ("drivers:iio:dac: Add AD3552R driver support")
Signed-off-by: Angelo Dureghello <adureghello(a)baylibre.com>
Link: https://patch.msgid.link/20250125-wip-bl-ad3552r-clear-reset-v2-1-aa3a27f3f…
Cc: <Stable@vger..kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/iio/dac/ad3552r.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/drivers/iio/dac/ad3552r.c
+++ b/drivers/iio/dac/ad3552r.c
@@ -410,6 +410,12 @@ static int ad3552r_reset(struct ad3552r_
return ret;
}
+ /* Clear reset error flag, see ad3552r manual, rev B table 38. */
+ ret = ad3552r_write_reg(dac, AD3552R_REG_ADDR_ERR_STATUS,
+ AD3552R_MASK_RESET_STATUS);
+ if (ret)
+ return ret;
+
return ad3552r_update_reg_field(dac,
AD3552R_REG_ADDR_INTERFACE_CONFIG_A,
AD3552R_MASK_ADDR_ASCENSION,
Patches currently in stable-queue which might be from adureghello(a)baylibre.com are
queue-6.13/iio-dac-ad3552r-clear-reset-status-flag.patch
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 91d44c1afc61a2fec37a9c7a3485368309391e0b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025031035-dangle-briskness-0e29@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 91d44c1afc61a2fec37a9c7a3485368309391e0b Mon Sep 17 00:00:00 2001
From: Qiu-ji Chen <chenqiuji666(a)gmail.com>
Date: Sat, 18 Jan 2025 15:08:33 +0800
Subject: [PATCH] cdx: Fix possible UAF error in driver_override_show()
Fixed a possible UAF problem in driver_override_show() in drivers/cdx/cdx.c
This function driver_override_show() is part of DEVICE_ATTR_RW, which
includes both driver_override_show() and driver_override_store().
These functions can be executed concurrently in sysfs.
The driver_override_store() function uses driver_set_override() to
update the driver_override value, and driver_set_override() internally
locks the device (device_lock(dev)). If driver_override_show() reads
cdx_dev->driver_override without locking, it could potentially access
a freed pointer if driver_override_store() frees the string
concurrently. This could lead to printing a kernel address, which is a
security risk since DEVICE_ATTR can be read by all users.
Additionally, a similar pattern is used in drivers/amba/bus.c, as well
as many other bus drivers, where device_lock() is taken in the show
function, and it has been working without issues.
This potential bug was detected by our experimental static analysis
tool, which analyzes locking APIs and paired functions to identify
data races and atomicity violations.
Fixes: 1f86a00c1159 ("bus/fsl-mc: add support for 'driver_override' in the mc-bus")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Qiu-ji Chen <chenqiuji666(a)gmail.com>
Link: https://lore.kernel.org/r/20250118070833.27201-1-chenqiuji666@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/cdx/cdx.c b/drivers/cdx/cdx.c
index c573ed2ee71a..7811aa734053 100644
--- a/drivers/cdx/cdx.c
+++ b/drivers/cdx/cdx.c
@@ -473,8 +473,12 @@ static ssize_t driver_override_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct cdx_device *cdx_dev = to_cdx_device(dev);
+ ssize_t len;
- return sysfs_emit(buf, "%s\n", cdx_dev->driver_override);
+ device_lock(dev);
+ len = sysfs_emit(buf, "%s\n", cdx_dev->driver_override);
+ device_unlock(dev);
+ return len;
}
static DEVICE_ATTR_RW(driver_override);
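For context, the store side of the same attribute is sketched below; this is
illustrative only, modelled on the commit message rather than copied from
drivers/cdx/cdx.c, and shows the race partner: driver_set_override() frees and
replaces the override string under device_lock(), so an unlocked reader in
driver_override_show() can dereference freed memory.
static ssize_t driver_override_store(struct device *dev,
				     struct device_attribute *attr,
				     const char *buf, size_t count)
{
	struct cdx_device *cdx_dev = to_cdx_device(dev);
	int ret;
	/*
	 * driver_set_override() takes device_lock(dev) internally and may
	 * kfree() the old cdx_dev->driver_override string, which is why the
	 * show path above must read the pointer under the same lock.
	 */
	ret = driver_set_override(dev, &cdx_dev->driver_override, buf, count);
	if (ret)
		return ret;
	return count;
}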
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 91d44c1afc61a2fec37a9c7a3485368309391e0b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025031035-unmoving-oak-e2a9@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 91d44c1afc61a2fec37a9c7a3485368309391e0b Mon Sep 17 00:00:00 2001
From: Qiu-ji Chen <chenqiuji666(a)gmail.com>
Date: Sat, 18 Jan 2025 15:08:33 +0800
Subject: [PATCH] cdx: Fix possible UAF error in driver_override_show()
Fixed a possible UAF problem in driver_override_show() in drivers/cdx/cdx.c
This function driver_override_show() is part of DEVICE_ATTR_RW, which
includes both driver_override_show() and driver_override_store().
These functions can be executed concurrently in sysfs.
The driver_override_store() function uses driver_set_override() to
update the driver_override value, and driver_set_override() internally
locks the device (device_lock(dev)). If driver_override_show() reads
cdx_dev->driver_override without locking, it could potentially access
a freed pointer if driver_override_store() frees the string
concurrently. This could lead to printing a kernel address, which is a
security risk since DEVICE_ATTR can be read by all users.
Additionally, a similar pattern is used in drivers/amba/bus.c, as well
as many other bus drivers, where device_lock() is taken in the show
function, and it has been working without issues.
This potential bug was detected by our experimental static analysis
tool, which analyzes locking APIs and paired functions to identify
data races and atomicity violations.
Fixes: 1f86a00c1159 ("bus/fsl-mc: add support for 'driver_override' in the mc-bus")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Qiu-ji Chen <chenqiuji666(a)gmail.com>
Link: https://lore.kernel.org/r/20250118070833.27201-1-chenqiuji666@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/cdx/cdx.c b/drivers/cdx/cdx.c
index c573ed2ee71a..7811aa734053 100644
--- a/drivers/cdx/cdx.c
+++ b/drivers/cdx/cdx.c
@@ -473,8 +473,12 @@ static ssize_t driver_override_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct cdx_device *cdx_dev = to_cdx_device(dev);
+ ssize_t len;
- return sysfs_emit(buf, "%s\n", cdx_dev->driver_override);
+ device_lock(dev);
+ len = sysfs_emit(buf, "%s\n", cdx_dev->driver_override);
+ device_unlock(dev);
+ return len;
}
static DEVICE_ATTR_RW(driver_override);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 91d44c1afc61a2fec37a9c7a3485368309391e0b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025031034-faction-uphold-6310@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 91d44c1afc61a2fec37a9c7a3485368309391e0b Mon Sep 17 00:00:00 2001
From: Qiu-ji Chen <chenqiuji666(a)gmail.com>
Date: Sat, 18 Jan 2025 15:08:33 +0800
Subject: [PATCH] cdx: Fix possible UAF error in driver_override_show()
Fixed a possible UAF problem in driver_override_show() in drivers/cdx/cdx.c
This function driver_override_show() is part of DEVICE_ATTR_RW, which
includes both driver_override_show() and driver_override_store().
These functions can be executed concurrently in sysfs.
The driver_override_store() function uses driver_set_override() to
update the driver_override value, and driver_set_override() internally
locks the device (device_lock(dev)). If driver_override_show() reads
cdx_dev->driver_override without locking, it could potentially access
a freed pointer if driver_override_store() frees the string
concurrently. This could lead to printing a kernel address, which is a
security risk since DEVICE_ATTR can be read by all users.
Additionally, a similar pattern is used in drivers/amba/bus.c, as well
as many other bus drivers, where device_lock() is taken in the show
function, and it has been working without issues.
This potential bug was detected by our experimental static analysis
tool, which analyzes locking APIs and paired functions to identify
data races and atomicity violations.
Fixes: 1f86a00c1159 ("bus/fsl-mc: add support for 'driver_override' in the mc-bus")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Qiu-ji Chen <chenqiuji666(a)gmail.com>
Link: https://lore.kernel.org/r/20250118070833.27201-1-chenqiuji666@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/cdx/cdx.c b/drivers/cdx/cdx.c
index c573ed2ee71a..7811aa734053 100644
--- a/drivers/cdx/cdx.c
+++ b/drivers/cdx/cdx.c
@@ -473,8 +473,12 @@ static ssize_t driver_override_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct cdx_device *cdx_dev = to_cdx_device(dev);
+ ssize_t len;
- return sysfs_emit(buf, "%s\n", cdx_dev->driver_override);
+ device_lock(dev);
+ len = sysfs_emit(buf, "%s\n", cdx_dev->driver_override);
+ device_unlock(dev);
+ return len;
}
static DEVICE_ATTR_RW(driver_override);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 189ecdb3e112da703ac0699f4ec76aa78122f911
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025031003-unstitch-arbitrate-baa1@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 189ecdb3e112da703ac0699f4ec76aa78122f911 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Thu, 27 Feb 2025 14:24:10 -0800
Subject: [PATCH] KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs
Snapshot the host's DEBUGCTL after disabling IRQs, as perf can toggle
debugctl bits from IRQ context, e.g. when enabling/disabling events via
smp_call_function_single(). Taking the snapshot (long) before IRQs are
disabled could result in KVM effectively clobbering DEBUGCTL due to using
a stale snapshot.
Cc: stable(a)vger.kernel.org
Reviewed-and-tested-by: Ravi Bangoria <ravi.bangoria(a)amd.com>
Link: https://lore.kernel.org/r/20250227222411.3490595-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5c6fd0edc41f..12d5f47c1bbe 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4968,7 +4968,6 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
/* Save host pkru register if supported */
vcpu->arch.host_pkru = read_pkru();
- vcpu->arch.host_debugctl = get_debugctlmsr();
/* Apply any externally detected TSC adjustments (due to suspend) */
if (unlikely(vcpu->arch.tsc_offset_adjustment)) {
@@ -10969,6 +10968,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
set_debugreg(0, 7);
}
+ vcpu->arch.host_debugctl = get_debugctlmsr();
+
guest_timing_enter_irqoff();
for (;;) {
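A condensed sketch of the ordering the change above establishes (illustrative
only, not the actual vcpu_enter_guest() body):
static void vcpu_enter_guest_sketch(struct kvm_vcpu *vcpu)
{
	local_irq_disable();
	/*
	 * Snapshot DEBUGCTL with IRQs already masked: a perf IPI can no
	 * longer rewrite the MSR between the snapshot and VM-entry.
	 */
	vcpu->arch.host_debugctl = get_debugctlmsr();
	/* ... VM-entry; DEBUGCTL is zeroed on VM-exit ... */
	/* Restore the host value captured above. */
	update_debugctlmsr(vcpu->arch.host_debugctl);
	local_irq_enable();
}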
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 189ecdb3e112da703ac0699f4ec76aa78122f911
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025031002-campsite-railroad-4d13@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 189ecdb3e112da703ac0699f4ec76aa78122f911 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Thu, 27 Feb 2025 14:24:10 -0800
Subject: [PATCH] KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs
Snapshot the host's DEBUGCTL after disabling IRQs, as perf can toggle
debugctl bits from IRQ context, e.g. when enabling/disabling events via
smp_call_function_single(). Taking the snapshot (long) before IRQs are
disabled could result in KVM effectively clobbering DEBUGCTL due to using
a stale snapshot.
Cc: stable(a)vger.kernel.org
Reviewed-and-tested-by: Ravi Bangoria <ravi.bangoria(a)amd.com>
Link: https://lore.kernel.org/r/20250227222411.3490595-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5c6fd0edc41f..12d5f47c1bbe 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4968,7 +4968,6 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
/* Save host pkru register if supported */
vcpu->arch.host_pkru = read_pkru();
- vcpu->arch.host_debugctl = get_debugctlmsr();
/* Apply any externally detected TSC adjustments (due to suspend) */
if (unlikely(vcpu->arch.tsc_offset_adjustment)) {
@@ -10969,6 +10968,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
set_debugreg(0, 7);
}
+ vcpu->arch.host_debugctl = get_debugctlmsr();
+
guest_timing_enter_irqoff();
for (;;) {
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x fb71c795935652fa20eaf9517ca9547f5af99a76
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025031034-twister-stash-ba87@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fb71c795935652fa20eaf9517ca9547f5af99a76 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Thu, 27 Feb 2025 14:24:08 -0800
Subject: [PATCH] KVM: x86: Snapshot the host's DEBUGCTL in common x86
Move KVM's snapshot of DEBUGCTL to kvm_vcpu_arch and take the snapshot in
common x86, so that SVM can also use the snapshot.
Opportunistically change the field to a u64. While bits 63:32 are reserved
on AMD, not mentioned at all in Intel's SDM, and managed as an "unsigned
long" by the kernel, DEBUGCTL is an MSR and therefore a 64-bit value.
Reviewed-by: Xiaoyao Li <xiaoyao.li(a)intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-and-tested-by: Ravi Bangoria <ravi.bangoria(a)amd.com>
Link: https://lore.kernel.org/r/20250227222411.3490595-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0b7af5902ff7..32ae3aa50c7e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -780,6 +780,7 @@ struct kvm_vcpu_arch {
u32 pkru;
u32 hflags;
u64 efer;
+ u64 host_debugctl;
u64 apic_base;
struct kvm_lapic *apic; /* kernel irqchip context */
bool load_eoi_exitmap_pending;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6c56d5235f0f..3b92f893b239 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1514,16 +1514,12 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu,
*/
void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
if (vcpu->scheduled_out && !kvm_pause_in_guest(vcpu->kvm))
shrink_ple_window(vcpu);
vmx_vcpu_load_vmcs(vcpu, cpu, NULL);
vmx_vcpu_pi_load(vcpu, cpu);
-
- vmx->host_debugctlmsr = get_debugctlmsr();
}
void vmx_vcpu_put(struct kvm_vcpu *vcpu)
@@ -7458,8 +7454,8 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
}
/* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
- if (vmx->host_debugctlmsr)
- update_debugctlmsr(vmx->host_debugctlmsr);
+ if (vcpu->arch.host_debugctl)
+ update_debugctlmsr(vcpu->arch.host_debugctl);
#ifndef CONFIG_X86_64
/*
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 8b111ce1087c..951e44dc9d0e 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -340,8 +340,6 @@ struct vcpu_vmx {
/* apic deadline value in host tsc */
u64 hv_deadline_tsc;
- unsigned long host_debugctlmsr;
-
/*
* Only bits masked by msr_ia32_feature_control_valid_bits can be set in
* msr_ia32_feature_control. FEAT_CTL_LOCKED is always included
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 02159c967d29..5c6fd0edc41f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4968,6 +4968,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
/* Save host pkru register if supported */
vcpu->arch.host_pkru = read_pkru();
+ vcpu->arch.host_debugctl = get_debugctlmsr();
/* Apply any externally detected TSC adjustments (due to suspend) */
if (unlikely(vcpu->arch.tsc_offset_adjustment)) {
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x fb71c795935652fa20eaf9517ca9547f5af99a76
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025031034-latitude-stinking-09c1@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fb71c795935652fa20eaf9517ca9547f5af99a76 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Thu, 27 Feb 2025 14:24:08 -0800
Subject: [PATCH] KVM: x86: Snapshot the host's DEBUGCTL in common x86
Move KVM's snapshot of DEBUGCTL to kvm_vcpu_arch and take the snapshot in
common x86, so that SVM can also use the snapshot.
Opportunistically change the field to a u64. While bits 63:32 are reserved
on AMD, not mentioned at all in Intel's SDM, and managed as an "unsigned
long" by the kernel, DEBUGCTL is an MSR and therefore a 64-bit value.
Reviewed-by: Xiaoyao Li <xiaoyao.li(a)intel.com>
Cc: stable(a)vger.kernel.org
Reviewed-and-tested-by: Ravi Bangoria <ravi.bangoria(a)amd.com>
Link: https://lore.kernel.org/r/20250227222411.3490595-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0b7af5902ff7..32ae3aa50c7e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -780,6 +780,7 @@ struct kvm_vcpu_arch {
u32 pkru;
u32 hflags;
u64 efer;
+ u64 host_debugctl;
u64 apic_base;
struct kvm_lapic *apic; /* kernel irqchip context */
bool load_eoi_exitmap_pending;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6c56d5235f0f..3b92f893b239 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1514,16 +1514,12 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu,
*/
void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
if (vcpu->scheduled_out && !kvm_pause_in_guest(vcpu->kvm))
shrink_ple_window(vcpu);
vmx_vcpu_load_vmcs(vcpu, cpu, NULL);
vmx_vcpu_pi_load(vcpu, cpu);
-
- vmx->host_debugctlmsr = get_debugctlmsr();
}
void vmx_vcpu_put(struct kvm_vcpu *vcpu)
@@ -7458,8 +7454,8 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, bool force_immediate_exit)
}
/* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
- if (vmx->host_debugctlmsr)
- update_debugctlmsr(vmx->host_debugctlmsr);
+ if (vcpu->arch.host_debugctl)
+ update_debugctlmsr(vcpu->arch.host_debugctl);
#ifndef CONFIG_X86_64
/*
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 8b111ce1087c..951e44dc9d0e 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -340,8 +340,6 @@ struct vcpu_vmx {
/* apic deadline value in host tsc */
u64 hv_deadline_tsc;
- unsigned long host_debugctlmsr;
-
/*
* Only bits masked by msr_ia32_feature_control_valid_bits can be set in
* msr_ia32_feature_control. FEAT_CTL_LOCKED is always included
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 02159c967d29..5c6fd0edc41f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4968,6 +4968,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
/* Save host pkru register if supported */
vcpu->arch.host_pkru = read_pkru();
+ vcpu->arch.host_debugctl = get_debugctlmsr();
/* Apply any externally detected TSC adjustments (due to suspend) */
if (unlikely(vcpu->arch.tsc_offset_adjustment)) {
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x be45bc4eff33d9a7dae84a2150f242a91a617402
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025031025-hurry-muster-0e93@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From be45bc4eff33d9a7dae84a2150f242a91a617402 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Mon, 24 Feb 2025 08:54:41 -0800
Subject: [PATCH] KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the
STI shadow
Enable/disable local IRQs, i.e. set/clear RFLAGS.IF, in the common
svm_vcpu_enter_exit() just after/before guest_state_{enter,exit}_irqoff()
so that VMRUN is not executed in an STI shadow. AMD CPUs have a quirk
(some would say "bug"), where the STI shadow bleeds into the guest's
intr_state field if a #VMEXIT occurs during injection of an event, i.e. if
the VMRUN doesn't complete before the subsequent #VMEXIT.
The spurious "interrupts masked" state is relatively benign, as it only
occurs during event injection and is transient. Because KVM is already
injecting an event, the guest can't be in HLT, and if KVM is querying IRQ
blocking for injection, then KVM would need to force an immediate exit
anyways since injecting multiple events is impossible.
However, because KVM copies int_state verbatim from vmcb02 to vmcb12, the
spurious STI shadow is visible to L1 when running a nested VM, which can
trip sanity checks, e.g. in VMware's VMM.
Hoist the STI+CLI all the way to C code, as the aforementioned calls to
guest_state_{enter,exit}_irqoff() already inform lockdep that IRQs are
enabled/disabled, and taking a fault on VMRUN with RFLAGS.IF=1 is already
possible. I.e. if there's kernel code that is confused by running with
RFLAGS.IF=1, then it's already a problem. In practice, since GIF=0 also
blocks NMIs, the only change in exposure to non-KVM code (relative to
surrounding VMRUN with STI+CLI) is exception handling code, and except for
the kvm_rebooting=1 case, all exceptions in the core VM-Enter/VM-Exit path
are fatal.
Use the "raw" variants to enable/disable IRQs to avoid tracing in the
"no instrumentation" code; the guest state helpers also take care of
tracing IRQ state.
Opportunistically document why KVM needs to do STI in the first place.
Reported-by: Doug Covelli <doug.covelli(a)broadcom.com>
Closes: https://lore.kernel.org/all/CADH9ctBs1YPmE4aCfGPNBwA10cA8RuAk2gO7542DjMZgs4…
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Cc: stable(a)vger.kernel.org
Reviewed-by: Jim Mattson <jmattson(a)google.com>
Link: https://lore.kernel.org/r/20250224165442.2338294-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a713c803a3a3..0d299f3f921e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4189,6 +4189,18 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
guest_state_enter_irqoff();
+ /*
+ * Set RFLAGS.IF prior to VMRUN, as the host's RFLAGS.IF at the time of
+ * VMRUN controls whether or not physical IRQs are masked (KVM always
+ * runs with V_INTR_MASKING_MASK). Toggle RFLAGS.IF here to avoid the
+ * temptation to do STI+VMRUN+CLI, as AMD CPUs bleed the STI shadow
+ * into guest state if delivery of an event during VMRUN triggers a
+ * #VMEXIT, and the guest_state transitions already tell lockdep that
+ * IRQs are being enabled/disabled. Note! GIF=0 for the entirety of
+ * this path, so IRQs aren't actually unmasked while running host code.
+ */
+ raw_local_irq_enable();
+
amd_clear_divider();
if (sev_es_guest(vcpu->kvm))
@@ -4197,6 +4209,8 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
else
__svm_vcpu_run(svm, spec_ctrl_intercepted);
+ raw_local_irq_disable();
+
guest_state_exit_irqoff();
}
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 2ed80aea3bb1..0c61153b275f 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -170,12 +170,8 @@ SYM_FUNC_START(__svm_vcpu_run)
mov VCPU_RDI(%_ASM_DI), %_ASM_DI
/* Enter guest mode */
- sti
-
3: vmrun %_ASM_AX
4:
- cli
-
/* Pop @svm to RAX while it's the only available register. */
pop %_ASM_AX
@@ -340,12 +336,8 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
mov KVM_VMCB_pa(%rax), %rax
/* Enter guest mode */
- sti
-
1: vmrun %rax
-
-2: cli
-
+2:
/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x be45bc4eff33d9a7dae84a2150f242a91a617402
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025031024-bootleg-parkway-393c@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From be45bc4eff33d9a7dae84a2150f242a91a617402 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Mon, 24 Feb 2025 08:54:41 -0800
Subject: [PATCH] KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the
STI shadow
Enable/disable local IRQs, i.e. set/clear RFLAGS.IF, in the common
svm_vcpu_enter_exit() just after/before guest_state_{enter,exit}_irqoff()
so that VMRUN is not executed in an STI shadow. AMD CPUs have a quirk
(some would say "bug"), where the STI shadow bleeds into the guest's
intr_state field if a #VMEXIT occurs during injection of an event, i.e. if
the VMRUN doesn't complete before the subsequent #VMEXIT.
The spurious "interrupts masked" state is relatively benign, as it only
occurs during event injection and is transient. Because KVM is already
injecting an event, the guest can't be in HLT, and if KVM is querying IRQ
blocking for injection, then KVM would need to force an immediate exit
anyways since injecting multiple events is impossible.
However, because KVM copies int_state verbatim from vmcb02 to vmcb12, the
spurious STI shadow is visible to L1 when running a nested VM, which can
trip sanity checks, e.g. in VMware's VMM.
Hoist the STI+CLI all the way to C code, as the aforementioned calls to
guest_state_{enter,exit}_irqoff() already inform lockdep that IRQs are
enabled/disabled, and taking a fault on VMRUN with RFLAGS.IF=1 is already
possible. I.e. if there's kernel code that is confused by running with
RFLAGS.IF=1, then it's already a problem. In practice, since GIF=0 also
blocks NMIs, the only change in exposure to non-KVM code (relative to
surrounding VMRUN with STI+CLI) is exception handling code, and except for
the kvm_rebooting=1 case, all exceptions in the core VM-Enter/VM-Exit path
are fatal.
Use the "raw" variants to enable/disable IRQs to avoid tracing in the
"no instrumentation" code; the guest state helpers also take care of
tracing IRQ state.
Opportunistically document why KVM needs to do STI in the first place.
Reported-by: Doug Covelli <doug.covelli(a)broadcom.com>
Closes: https://lore.kernel.org/all/CADH9ctBs1YPmE4aCfGPNBwA10cA8RuAk2gO7542DjMZgs4…
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Cc: stable(a)vger.kernel.org
Reviewed-by: Jim Mattson <jmattson(a)google.com>
Link: https://lore.kernel.org/r/20250224165442.2338294-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a713c803a3a3..0d299f3f921e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4189,6 +4189,18 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
guest_state_enter_irqoff();
+ /*
+ * Set RFLAGS.IF prior to VMRUN, as the host's RFLAGS.IF at the time of
+ * VMRUN controls whether or not physical IRQs are masked (KVM always
+ * runs with V_INTR_MASKING_MASK). Toggle RFLAGS.IF here to avoid the
+ * temptation to do STI+VMRUN+CLI, as AMD CPUs bleed the STI shadow
+ * into guest state if delivery of an event during VMRUN triggers a
+ * #VMEXIT, and the guest_state transitions already tell lockdep that
+ * IRQs are being enabled/disabled. Note! GIF=0 for the entirety of
+ * this path, so IRQs aren't actually unmasked while running host code.
+ */
+ raw_local_irq_enable();
+
amd_clear_divider();
if (sev_es_guest(vcpu->kvm))
@@ -4197,6 +4209,8 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
else
__svm_vcpu_run(svm, spec_ctrl_intercepted);
+ raw_local_irq_disable();
+
guest_state_exit_irqoff();
}
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 2ed80aea3bb1..0c61153b275f 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -170,12 +170,8 @@ SYM_FUNC_START(__svm_vcpu_run)
mov VCPU_RDI(%_ASM_DI), %_ASM_DI
/* Enter guest mode */
- sti
-
3: vmrun %_ASM_AX
4:
- cli
-
/* Pop @svm to RAX while it's the only available register. */
pop %_ASM_AX
@@ -340,12 +336,8 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
mov KVM_VMCB_pa(%rax), %rax
/* Enter guest mode */
- sti
-
1: vmrun %rax
-
-2: cli
-
+2:
/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x be45bc4eff33d9a7dae84a2150f242a91a617402
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025031023-dodge-ungodly-172a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From be45bc4eff33d9a7dae84a2150f242a91a617402 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Mon, 24 Feb 2025 08:54:41 -0800
Subject: [PATCH] KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the
STI shadow
Enable/disable local IRQs, i.e. set/clear RFLAGS.IF, in the common
svm_vcpu_enter_exit() just after/before guest_state_{enter,exit}_irqoff()
so that VMRUN is not executed in an STI shadow. AMD CPUs have a quirk
(some would say "bug"), where the STI shadow bleeds into the guest's
intr_state field if a #VMEXIT occurs during injection of an event, i.e. if
the VMRUN doesn't complete before the subsequent #VMEXIT.
The spurious "interrupts masked" state is relatively benign, as it only
occurs during event injection and is transient. Because KVM is already
injecting an event, the guest can't be in HLT, and if KVM is querying IRQ
blocking for injection, then KVM would need to force an immediate exit
anyways since injecting multiple events is impossible.
However, because KVM copies int_state verbatim from vmcb02 to vmcb12, the
spurious STI shadow is visible to L1 when running a nested VM, which can
trip sanity checks, e.g. in VMware's VMM.
Hoist the STI+CLI all the way to C code, as the aforementioned calls to
guest_state_{enter,exit}_irqoff() already inform lockdep that IRQs are
enabled/disabled, and taking a fault on VMRUN with RFLAGS.IF=1 is already
possible. I.e. if there's kernel code that is confused by running with
RFLAGS.IF=1, then it's already a problem. In practice, since GIF=0 also
blocks NMIs, the only change in exposure to non-KVM code (relative to
surrounding VMRUN with STI+CLI) is exception handling code, and except for
the kvm_rebooting=1 case, all exceptions in the core VM-Enter/VM-Exit path
are fatal.
Use the "raw" variants to enable/disable IRQs to avoid tracing in the
"no instrumentation" code; the guest state helpers also take care of
tracing IRQ state.
Opportunistically document why KVM needs to do STI in the first place.
Reported-by: Doug Covelli <doug.covelli(a)broadcom.com>
Closes: https://lore.kernel.org/all/CADH9ctBs1YPmE4aCfGPNBwA10cA8RuAk2gO7542DjMZgs4…
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Cc: stable(a)vger.kernel.org
Reviewed-by: Jim Mattson <jmattson(a)google.com>
Link: https://lore.kernel.org/r/20250224165442.2338294-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a713c803a3a3..0d299f3f921e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4189,6 +4189,18 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
guest_state_enter_irqoff();
+ /*
+ * Set RFLAGS.IF prior to VMRUN, as the host's RFLAGS.IF at the time of
+ * VMRUN controls whether or not physical IRQs are masked (KVM always
+ * runs with V_INTR_MASKING_MASK). Toggle RFLAGS.IF here to avoid the
+ * temptation to do STI+VMRUN+CLI, as AMD CPUs bleed the STI shadow
+ * into guest state if delivery of an event during VMRUN triggers a
+ * #VMEXIT, and the guest_state transitions already tell lockdep that
+ * IRQs are being enabled/disabled. Note! GIF=0 for the entirety of
+ * this path, so IRQs aren't actually unmasked while running host code.
+ */
+ raw_local_irq_enable();
+
amd_clear_divider();
if (sev_es_guest(vcpu->kvm))
@@ -4197,6 +4209,8 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
else
__svm_vcpu_run(svm, spec_ctrl_intercepted);
+ raw_local_irq_disable();
+
guest_state_exit_irqoff();
}
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 2ed80aea3bb1..0c61153b275f 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -170,12 +170,8 @@ SYM_FUNC_START(__svm_vcpu_run)
mov VCPU_RDI(%_ASM_DI), %_ASM_DI
/* Enter guest mode */
- sti
-
3: vmrun %_ASM_AX
4:
- cli
-
/* Pop @svm to RAX while it's the only available register. */
pop %_ASM_AX
@@ -340,12 +336,8 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
mov KVM_VMCB_pa(%rax), %rax
/* Enter guest mode */
- sti
-
1: vmrun %rax
-
-2: cli
-
+2:
/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x be45bc4eff33d9a7dae84a2150f242a91a617402
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025031022-debunk-winner-e8fe@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From be45bc4eff33d9a7dae84a2150f242a91a617402 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Mon, 24 Feb 2025 08:54:41 -0800
Subject: [PATCH] KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the
STI shadow
Enable/disable local IRQs, i.e. set/clear RFLAGS.IF, in the common
svm_vcpu_enter_exit() just after/before guest_state_{enter,exit}_irqoff()
so that VMRUN is not executed in an STI shadow. AMD CPUs have a quirk
(some would say "bug"), where the STI shadow bleeds into the guest's
intr_state field if a #VMEXIT occurs during injection of an event, i.e. if
the VMRUN doesn't complete before the subsequent #VMEXIT.
The spurious "interrupts masked" state is relatively benign, as it only
occurs during event injection and is transient. Because KVM is already
injecting an event, the guest can't be in HLT, and if KVM is querying IRQ
blocking for injection, then KVM would need to force an immediate exit
anyways since injecting multiple events is impossible.
However, because KVM copies int_state verbatim from vmcb02 to vmcb12, the
spurious STI shadow is visible to L1 when running a nested VM, which can
trip sanity checks, e.g. in VMware's VMM.
Hoist the STI+CLI all the way to C code, as the aforementioned calls to
guest_state_{enter,exit}_irqoff() already inform lockdep that IRQs are
enabled/disabled, and taking a fault on VMRUN with RFLAGS.IF=1 is already
possible. I.e. if there's kernel code that is confused by running with
RFLAGS.IF=1, then it's already a problem. In practice, since GIF=0 also
blocks NMIs, the only change in exposure to non-KVM code (relative to
surrounding VMRUN with STI+CLI) is exception handling code, and except for
the kvm_rebooting=1 case, all exceptions in the core VM-Enter/VM-Exit path
are fatal.
Use the "raw" variants to enable/disable IRQs to avoid tracing in the
"no instrumentation" code; the guest state helpers also take care of
tracing IRQ state.
Opportunistically document why KVM needs to do STI in the first place.
Reported-by: Doug Covelli <doug.covelli(a)broadcom.com>
Closes: https://lore.kernel.org/all/CADH9ctBs1YPmE4aCfGPNBwA10cA8RuAk2gO7542DjMZgs4…
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Cc: stable(a)vger.kernel.org
Reviewed-by: Jim Mattson <jmattson(a)google.com>
Link: https://lore.kernel.org/r/20250224165442.2338294-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a713c803a3a3..0d299f3f921e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4189,6 +4189,18 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
guest_state_enter_irqoff();
+ /*
+ * Set RFLAGS.IF prior to VMRUN, as the host's RFLAGS.IF at the time of
+ * VMRUN controls whether or not physical IRQs are masked (KVM always
+ * runs with V_INTR_MASKING_MASK). Toggle RFLAGS.IF here to avoid the
+ * temptation to do STI+VMRUN+CLI, as AMD CPUs bleed the STI shadow
+ * into guest state if delivery of an event during VMRUN triggers a
+ * #VMEXIT, and the guest_state transitions already tell lockdep that
+ * IRQs are being enabled/disabled. Note! GIF=0 for the entirety of
+ * this path, so IRQs aren't actually unmasked while running host code.
+ */
+ raw_local_irq_enable();
+
amd_clear_divider();
if (sev_es_guest(vcpu->kvm))
@@ -4197,6 +4209,8 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
else
__svm_vcpu_run(svm, spec_ctrl_intercepted);
+ raw_local_irq_disable();
+
guest_state_exit_irqoff();
}
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 2ed80aea3bb1..0c61153b275f 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -170,12 +170,8 @@ SYM_FUNC_START(__svm_vcpu_run)
mov VCPU_RDI(%_ASM_DI), %_ASM_DI
/* Enter guest mode */
- sti
-
3: vmrun %_ASM_AX
4:
- cli
-
/* Pop @svm to RAX while it's the only available register. */
pop %_ASM_AX
@@ -340,12 +336,8 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
mov KVM_VMCB_pa(%rax), %rax
/* Enter guest mode */
- sti
-
1: vmrun %rax
-
-2: cli
-
+2:
/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_VMEXIT
This patch series addresses 2 issues
1) Fix typo in pattern properties for R-Car V4M.
2) Fix page entries in the AFL list.
v1->v2:
* Split the fixes patches into a separate series.
* Added Rb tag from Geert for binding patch.
* Added the tag Cc:stable@vger.kernel.org
Biju Das (2):
dt-bindings: can: renesas,rcar-canfd: Fix typo in pattern properties
for R-Car V4M
can: rcar_canfd: Fix page entries in the AFL list
.../bindings/net/can/renesas,rcar-canfd.yaml | 2 +-
drivers/net/can/rcar/rcar_canfd.c | 17 ++++++++++-------
2 files changed, 11 insertions(+), 8 deletions(-)
--
2.43.0
This small series adds support for non-coherent video capture buffers
on Rockchip ISP V1. Patch 1 fixes cache management for dmabufs
allocated by the dma-contig allocator. Patch 2 allows non-coherent
allocations on the rkisp1 capture queue. Some timing measurements are
provided in the commit message of patch 2.
Signed-off-by: Mikhail Rudenko <mike.rudenko(a)gmail.com>
---
Changes in v4:
- rebase to media/next
- use `direction` instead of `buf->dma_dir` in dma_sync_sgtable_*
- Link to v3: https://lore.kernel.org/r/20250128-b4-rkisp-noncoherent-v3-0-baf39c997d2a@g…
Changes in v3:
- ignore skip_cache_sync_* flags in vb2_dc_dmabuf_ops_{begin,end}_cpu_access
- invalidate/flush kernel mappings as appropriate if they exist
- use dma_sync_sgtable_* instead of dma_sync_sg_*
- Link to v2: https://lore.kernel.org/r/20250115-b4-rkisp-noncoherent-v2-0-0853e1a24012@g…
Changes in v2:
- Fix vb2_dc_dmabuf_ops_{begin,end}_cpu_access() for non-coherent buffers.
- Add cache management timing information to patch 2 commit message.
- Link to v1: https://lore.kernel.org/r/20250102-b4-rkisp-noncoherent-v1-1-bba164f7132c@g…
---
Mikhail Rudenko (2):
media: videobuf2: Fix dmabuf cache sync/flush in dma-contig
media: rkisp1: Allow non-coherent video capture buffers
.../media/common/videobuf2/videobuf2-dma-contig.c | 22 ++++++++++++++++++++++
.../platform/rockchip/rkisp1/rkisp1-capture.c | 1 +
2 files changed, 23 insertions(+)
---
base-commit: b2c4bf0c102084e77ed1b12090d77a76469a6814
change-id: 20241231-b4-rkisp-noncoherent-ad6e7c7a68ba
Best regards,
--
Mikhail Rudenko <mike.rudenko(a)gmail.com>
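As a rough illustration of the dma_sync_sgtable_*() calls mentioned in the v4
changelog (the function and parameter names below are placeholders, not the
actual videobuf2 or rkisp1 fields):
static void capture_buffer_sync_sketch(struct device *dev, struct sg_table *sgt)
{
	/* Invalidate CPU caches before the CPU reads a freshly captured frame. */
	dma_sync_sgtable_for_cpu(dev, sgt, DMA_FROM_DEVICE);
	/* ... CPU access to the non-coherent buffer ... */
	/* Hand the buffer back to the device for the next capture. */
	dma_sync_sgtable_for_device(dev, sgt, DMA_FROM_DEVICE);
}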
Note that this was a real fix, but the fix only matters if commit
aaec5a95d596 ("pipe_read: don't wake up the writer if the pipe is
still full") is in the tree.
Now, the bug was pre-existing, and *maybe* it could be hit without
that commit aaec5a95d596, but nobody has ever reported it, so it's
very very unlikely.
Also, this fix then had some fall-out, and while I think you've queued
all the fallout fixes too, I think it might be a good idea to wait for
more reports from the development tree before considering these for
stable.
Put another way: this fix caused some pain. It might not be worth
back-porting to stable at all, and if it is, it might be worth waiting
to see that there's no other fallout.
Linus
On Sun, 9 Mar 2025 at 09:52, Sasha Levin <sashal(a)kernel.org> wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> fs/pipe: Read pipe->{head,tail} atomically outside pipe->mutex
From: Saurabh Sengar <ssengar(a)linux.microsoft.com>
On an x86 system under test with 1780 CPUs, topology_span_sane() takes
around 8 seconds cumulatively across all the iterations. It is an expensive
operation which sanity-checks the non-NUMA topology masks.
CPU topology is not something which changes very frequently, so make
this check optional for systems where the topology is trusted and a
faster bootup is needed.
Restrict this to the sched_verbose kernel cmdline option so that this
penalty can be avoided on systems that want to skip it.
Cc: stable(a)vger.kernel.org
Fixes: ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap")
Signed-off-by: Saurabh Sengar <ssengar(a)linux.microsoft.com>
Co-developed-by: Naman Jain <namjain(a)linux.microsoft.com>
Signed-off-by: Naman Jain <namjain(a)linux.microsoft.com>
Tested-by: K Prateek Nayak <kprateek.nayak(a)amd.com>
---
Changes since v3:
https://lore.kernel.org/all/20250203114738.3109-1-namjain@linux.microsoft.c…
- Minor typo correction in comment
- Added Tested-by tag from Prateek for x86
Changes since v2:
https://lore.kernel.org/all/1731922777-7121-1-git-send-email-ssengar@linux.…
- Use sched_debug() instead of using sched_debug_verbose
variable directly (addressing Prateek's comment)
Changes since v1:
https://lore.kernel.org/all/1729619853-2597-1-git-send-email-ssengar@linux.…
- Use kernel cmdline param instead of compile time flag.
Adding a link to the other patch which is under review.
https://lore.kernel.org/lkml/20241031200431.182443-1-steve.wahl@hpe.com/
The above patch tries to optimize the topology sanity check, whereas this
patch makes it optional. We believe both patches can coexist, as even
with optimization, there will still be some performance overhead for
this check.
---
kernel/sched/topology.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index c49aea8c1025..666f0a18cc6c 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2359,6 +2359,13 @@ static bool topology_span_sane(struct sched_domain_topology_level *tl,
{
int i = cpu + 1;
+ /* Skip the topology sanity check for non-debug, as it is a time-consuming operation */
+ if (!sched_debug()) {
+ pr_info_once("%s: Skipping topology span sanity check. Use `sched_verbose` boot parameter to enable it.\n",
+ __func__);
+ return true;
+ }
+
/* NUMA levels are allowed to overlap */
if (tl->flags & SDTL_OVERLAP)
return true;
--
2.34.1
Backport of a similar change from commit 5ac9b4e935df ("lib/buildid:
Handle memfd_secret() files in build_id_parse()") to address an issue
where accessing secret memfd contents through build_id_parse() would
trigger faults.
Original report and repro can be found in [0].
[0] https://lore.kernel.org/bpf/ZwyG8Uro%2FSyTXAni@ly-workstation/
This repro will cause BUG: unable to handle kernel paging request in
build_id_parse in 5.15/6.1/6.6.
Some other discussions can be found in [1].
[1] https://lore.kernel.org/bpf/20241104175256.2327164-1-jolsa@kernel.org/T/#u
Cc: stable(a)vger.kernel.org
Fixes: 88a16a130933 ("perf: Add build id data in mmap2 event")
Signed-off-by: Chen Linxuan <chenlinxuan(a)deepin.org>
---
lib/buildid.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/lib/buildid.c b/lib/buildid.c
index 9fc46366597e..b78d119ed1f7 100644
--- a/lib/buildid.c
+++ b/lib/buildid.c
@@ -157,6 +157,12 @@ int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id,
if (!vma->vm_file)
return -EINVAL;
+#ifdef CONFIG_SECRETMEM
+ /* reject secretmem folios created with memfd_secret() */
+ if (vma->vm_file->f_mapping->a_ops == &secretmem_aops)
+ return -EFAULT;
+#endif
+
page = find_get_page(vma->vm_file->f_mapping, 0);
if (!page)
return -EFAULT; /* page not mapped */
--
2.48.1
From: David Hildenbrand <david(a)redhat.com>
commit 091c1dd2d4df6edd1beebe0e5863d4034ade9572 upstream.
We currently assume that there is at least one VMA in a MM, which isn't
true.
So we might end up having find_vma() return NULL, to then de-reference
NULL. So properly handle find_vma() returning NULL.
This fixes the report:
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 UID: 0 PID: 6021 Comm: syz-executor284 Not tainted 6.12.0-rc7-syzkaller-00187-gf868cd251776 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024
RIP: 0010:migrate_to_node mm/mempolicy.c:1090 [inline]
RIP: 0010:do_migrate_pages+0x403/0x6f0 mm/mempolicy.c:1194
Code: ...
RSP: 0018:ffffc9000375fd08 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffc9000375fd78 RCX: 0000000000000000
RDX: ffff88807e171300 RSI: dffffc0000000000 RDI: ffff88803390c044
RBP: ffff88807e171428 R08: 0000000000000014 R09: fffffbfff2039ef1
R10: ffffffff901cf78f R11: 0000000000000000 R12: 0000000000000003
R13: ffffc9000375fe90 R14: ffffc9000375fe98 R15: ffffc9000375fdf8
FS: 00005555919e1380(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005555919e1ca8 CR3: 000000007f12a000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
kernel_migrate_pages+0x5b2/0x750 mm/mempolicy.c:1709
__do_sys_migrate_pages mm/mempolicy.c:1727 [inline]
__se_sys_migrate_pages mm/mempolicy.c:1723 [inline]
__x64_sys_migrate_pages+0x96/0x100 mm/mempolicy.c:1723
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
[akpm(a)linux-foundation.org: add unlikely()]
Link: https://lkml.kernel.org/r/20241120201151.9518-1-david@redhat.com
Fixes: 39743889aaf7 ("[PATCH] Swap Migration V5: sys_migrate_pages interface")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: syzbot+3511625422f7aa637f0d(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/673d2696.050a0220.3c9d61.012f.GAE@google.com/T/
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Reviewed-by: Christoph Lameter <cl(a)linux.com>
Cc: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
[ Alexey: mmap_read_lock is not used in this context, so mmap_read_unlock
is removed. Synchronization is provided by an external context in
do_migrate_pages(). ]
Signed-off-by: Alexey Panov <apanov(a)astralinux.ru>
---
v2: Clarify mmap_lock context in changes summary. Fix braces for a single
statement block. Rearrange the changes with a comment and VM_BUG_ON to
look more consistent with upstream.
mm/mempolicy.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 399d8cb48813..f60ff4727f46 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1062,13 +1062,17 @@ static int migrate_to_node(struct mm_struct *mm, int source, int dest,
nodes_clear(nmask);
node_set(source, nmask);
+ VM_BUG_ON(!(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)));
+
+ vma = find_vma(mm, 0);
+ if (unlikely(!vma))
+ return 0;
+
/*
* This does not "check" the range but isolates all pages that
* need migration. Between passing in the full user address
* space range and MPOL_MF_DISCONTIG_OK, this call can not fail.
*/
- vma = find_vma(mm, 0);
- VM_BUG_ON(!(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)));
queue_pages_range(mm, vma->vm_start, mm->task_size, &nmask,
flags | MPOL_MF_DISCONTIG_OK, &pagelist);
--
2.30.2
From: David Hildenbrand <david(a)redhat.com>
commit 091c1dd2d4df6edd1beebe0e5863d4034ade9572 upstream.
We currently assume that there is at least one VMA in a MM, which isn't
true.
So we might end up having find_vma() return NULL, to then de-reference
NULL. So properly handle find_vma() returning NULL.
This fixes the report:
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 UID: 0 PID: 6021 Comm: syz-executor284 Not tainted 6.12.0-rc7-syzkaller-00187-gf868cd251776 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024
RIP: 0010:migrate_to_node mm/mempolicy.c:1090 [inline]
RIP: 0010:do_migrate_pages+0x403/0x6f0 mm/mempolicy.c:1194
Code: ...
RSP: 0018:ffffc9000375fd08 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffc9000375fd78 RCX: 0000000000000000
RDX: ffff88807e171300 RSI: dffffc0000000000 RDI: ffff88803390c044
RBP: ffff88807e171428 R08: 0000000000000014 R09: fffffbfff2039ef1
R10: ffffffff901cf78f R11: 0000000000000000 R12: 0000000000000003
R13: ffffc9000375fe90 R14: ffffc9000375fe98 R15: ffffc9000375fdf8
FS: 00005555919e1380(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005555919e1ca8 CR3: 000000007f12a000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
kernel_migrate_pages+0x5b2/0x750 mm/mempolicy.c:1709
__do_sys_migrate_pages mm/mempolicy.c:1727 [inline]
__se_sys_migrate_pages mm/mempolicy.c:1723 [inline]
__x64_sys_migrate_pages+0x96/0x100 mm/mempolicy.c:1723
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
[akpm(a)linux-foundation.org: add unlikely()]
Link: https://lkml.kernel.org/r/20241120201151.9518-1-david@redhat.com
Fixes: 39743889aaf7 ("[PATCH] Swap Migration V5: sys_migrate_pages interface")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: syzbot+3511625422f7aa637f0d(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/673d2696.050a0220.3c9d61.012f.GAE@google.com/T/
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Reviewed-by: Christoph Lameter <cl(a)linux.com>
Cc: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
[ Alexey: mmap_read_lock is not used in this context, so mmap_read_unlock
is removed. Synchronization is provided by an external context in
do_migrate_pages(). find_vma(mm, 0) is the same as mm->mmap. ]
Signed-off-by: Alexey Panov <apanov(a)astralinux.ru>
---
v2: Clarify mmap_lock context in changes summary. Fix braces for a single
statement block. Rearrange the changes with a comment and VM_BUG_ON to
look more consistent with upstream.
mm/mempolicy.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 6c98585f20df..db94aec0ea17 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1067,6 +1067,7 @@ static int migrate_to_node(struct mm_struct *mm, int source, int dest,
int flags)
{
nodemask_t nmask;
+ struct vm_area_struct *vma;
LIST_HEAD(pagelist);
int err = 0;
struct migration_target_control mtc = {
@@ -1077,13 +1078,18 @@ static int migrate_to_node(struct mm_struct *mm, int source, int dest,
nodes_clear(nmask);
node_set(source, nmask);
+ VM_BUG_ON(!(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)));
+
+ vma = find_vma(mm, 0);
+ if (unlikely(!vma))
+ return 0;
+
/*
* This does not "check" the range but isolates all pages that
* need migration. Between passing in the full user address
* space range and MPOL_MF_DISCONTIG_OK, this call can not fail.
*/
- VM_BUG_ON(!(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)));
- queue_pages_range(mm, mm->mmap->vm_start, mm->task_size, &nmask,
+ queue_pages_range(mm, vma->vm_start, mm->task_size, &nmask,
flags | MPOL_MF_DISCONTIG_OK, &pagelist);
if (!list_empty(&pagelist)) {
--
2.30.2
This patch series backports the minimum set of changes in order to fix
this warning that popped up with >= 5.4.284 stable kernels:
In file included from ./include/linux/mm.h:29,
from ./include/linux/pagemap.h:8,
from ./include/linux/buffer_head.h:14,
from fs/udf/udfdecl.h:12,
from fs/udf/super.c:41:
fs/udf/super.c: In function 'udf_fill_partdesc_info':
./include/linux/overflow.h:70:15: warning: comparison of distinct pointer types lacks a cast
(void) (&__a == &__b); \
^~
fs/udf/super.c:1162:7: note: in expansion of macro 'check_add_overflow'
if (check_add_overflow(map->s_partition_len,
^~~~~~~~~~~~~~~~~~
Changes in v2:
- added missing upstream commit ID to the last patch in the series
Kees Cook (2):
overflow: Add __must_check attribute to check_*() helpers
overflow: Allow mixed type arguments
Keith Busch (1):
overflow: Correct check_shl_overflow() comment
include/linux/overflow.h | 101 +++++++++++++++++++++++----------------
1 file changed, 60 insertions(+), 41 deletions(-)
--
2.34.1
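A minimal illustration of the mixed-type case the series makes warning-free;
the types mirror the udf example above, and the helper name is hypothetical:
#include <linux/overflow.h>
/* With the backported check_add_overflow(), mixing a u32 addend with a
 * sector_t sum compiles without the "distinct pointer types" warning.
 * Returns true if the addition overflows the destination type. */
static bool example_extent_end(u32 partition_len, sector_t start, sector_t *end)
{
	return check_add_overflow(start, partition_len, end);
}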
No upstream commit exists for this patch.
The issue was introduced when backporting upstream commit 091c1dd2d4df
("mm/mempolicy: fix migrate_to_node() assuming there is at least one VMA
in a MM").
The backport incorrectly added unlock logic to a path where the
mmap_lock is provided by the external context in do_migrate_pages(),
creating a lock imbalance when no VMAs are found.
This fixes the report:
WARNING: bad unlock balance detected!
6.6.79 #1 Not tainted
-------------------------------------
repro/9655 is trying to release lock (&mm->mmap_lock) at:
[<ffffffff81daa36f>] mmap_read_unlock include/linux/mmap_lock.h:173 [inline]
[<ffffffff81daa36f>] do_migrate_pages+0x59f/0x700 mm/mempolicy.c:1196
but there are no more locks to release!
other info that might help us debug this:
no locks held by repro/9655.
stack backtrace:
CPU: 1 PID: 9655 Comm: a Not tainted 6.6.79 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd5/0x1b0 lib/dump_stack.c:106
__lock_release kernel/locking/lockdep.c:5431 [inline]
lock_release+0x4b1/0x680 kernel/locking/lockdep.c:5774
up_read+0x12/0x20 kernel/locking/rwsem.c:1615
mmap_read_unlock include/linux/mmap_lock.h:173 [inline]
do_migrate_pages+0x59f/0x700 mm/mempolicy.c:1196
kernel_migrate_pages+0x59b/0x780 mm/mempolicy.c:1665
__do_sys_migrate_pages mm/mempolicy.c:1684 [inline]
__se_sys_migrate_pages mm/mempolicy.c:1680 [inline]
__x64_sys_migrate_pages+0x92/0xf0 mm/mempolicy.c:1680
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x68/0xd2
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: a13b2b9b0b0b ("mm/mempolicy: fix migrate_to_node() assuming there is at least one VMA in a MM")
Signed-off-by: Alexey Panov <apanov(a)astralinux.ru>
---
v2: Clarify mmap_lock context in commit description. Fix braces for a
single statement block. Add empty line after VM_BUG_ON to look more
consistent with upstream.
mm/mempolicy.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 94c74c594d10..d2855507d2e9 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1070,11 +1070,10 @@ static long migrate_to_node(struct mm_struct *mm, int source, int dest,
node_set(source, nmask);
VM_BUG_ON(!(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)));
+
vma = find_vma(mm, 0);
- if (unlikely(!vma)) {
- mmap_read_unlock(mm);
+ if (unlikely(!vma))
return 0;
- }
/*
* This does not migrate the range, but isolates all pages that
--
2.30.2
The internal microphone on the Lenovo ThinkPad E16 model requires a
quirk entry to work properly. This was fixed in a previous patch (linked
below), but depending on the specific variant of the model, the product
name may be "21M5" or "21M6".
The following patch fixed this issue for the 21M5 variant:
https://lore.kernel.org/all/20240725065442.9293-1-tiwai@suse.de/
This patch adds support for the microphone on the 21M6 variant.
Link: https://github.com/ramaureirac/thinkpad-e14-linux/issues/31
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Thomas Mizrahi <thomasmizra(a)gmail.com>
---
I recently acquired a ThinkPad E16 Gen 2 AMD and could not get the internal
microphone working. After some research, I discovered this issue. Since my
machine is a 21M6 variant, the required quirk was not applied by the
existing patch. After applying this patch and testing on my machine, the
microphone was immediately recognized and worked without further issues.
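As background, a hedged sketch of how a DMI quirk table like this is
typically consumed at probe time (the helper name below is made up; this is
not a quote of acp6x-mach.c): the first entry whose matches[] all hit
supplies its driver_data.
/* Illustrative consumer of yc_acp_quirk_table; error handling trimmed. */
#include <linux/dmi.h>

static struct snd_soc_card *yc_pick_card(void)
{
	const struct dmi_system_id *dmi;

	dmi = dmi_first_match(yc_acp_quirk_table);
	if (!dmi)
		return NULL;		/* no quirk entry for this machine */

	return dmi->driver_data;	/* e.g. &acp6x_card once "21M6" matches */
}
Without the "21M6" entry added here, the table lookup finds nothing on this
variant and no card data is selected, which is consistent with the
missing-microphone symptom.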
sound/soc/amd/yc/acp6x-mach.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/sound/soc/amd/yc/acp6x-mach.c b/sound/soc/amd/yc/acp6x-mach.c
index b16587d8f97a..a7637056972a 100644
--- a/sound/soc/amd/yc/acp6x-mach.c
+++ b/sound/soc/amd/yc/acp6x-mach.c
@@ -248,6 +248,13 @@ static const struct dmi_system_id yc_acp_quirk_table[] = {
DMI_MATCH(DMI_PRODUCT_NAME, "21M5"),
}
},
+ {
+ .driver_data = &acp6x_card,
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "21M6"),
+ }
+ },
{
.driver_data = &acp6x_card,
.matches = {
--
2.48.1
On Thu, Mar 06, 2025 at 09:06:23PM +0000, Colin Evans wrote:
> > Please try collecting a usbmon trace for bus 2 showing the problem.
> > Ideally the trace should show what happens from system boot-up, but
> > there's no way to do that. Instead, you can do this (the first command
> > below disables the bus, the second starts the usbmon trace, and the
> > third re-enables the bus):
> >
> > echo 0 >/sys/bus/usb/devices/usb2/bConfigurationValue
> > cat /sys/kernel/debug/usb/usbmon/2u >usbmon.txt &
> > echo 1 >/sys/bus/usb/devices/usb2/bConfigurationValue
> >
> > Then after enough time has passed for the errors to show up, kill the
> > "cat" process and post the resulting trace file. (Note: If your
> > keyboard is attached to bus 2, you won't be able to use it to issue the
> > second and third commands. You could use a network login, or put the
> > commands into a shell file and run them that way.)
> >
> > In fact, you should do this twice: The second time, run it on machine 2
> > with the powered hub plugged in to suppress the errors.
> >
> > Alan Stern
>
> Happy to try this, but as it stands there is no such file, or file-like
> thing, on my machine-
>
> # ls /sys/kernel/debug/usb/usbmon/2u
> ls: cannot access '/sys/kernel/debug/usb/usbmon/2u': No such file or
> directory
>
> # find /sys/kernel/debug/usb -name "2u"
> #
>
> # ls /sys/kernel/debug/usb
> devices ehci ohci uhci uvcvideo xhci
>
>
> It seems something is missing?
Ah -- you have to load the usbmon module first:
modprobe usbmon
Some distributions do this for you automatically.
Alan Stern
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x c133ec0e5717868c9967fa3df92a55e537b1aead
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030900-slaw-onstage-6b47@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c133ec0e5717868c9967fa3df92a55e537b1aead Mon Sep 17 00:00:00 2001
From: Michal Pecio <michal.pecio(a)gmail.com>
Date: Tue, 25 Feb 2025 11:59:27 +0200
Subject: [PATCH] usb: xhci: Enable the TRB overfetch quirk on VIA VL805
Raspberry Pi is a major user of those chips and they discovered a bug -
when the end of a transfer ring segment is reached, up to four TRBs can
be prefetched from the next page even if the segment ends with link TRB
and on page boundary (the chip claims to support standard 4KB pages).
It also appears that if the prefetched TRBs belong to a different ring
whose doorbell is later rung, they may be used without refreshing from
system RAM and the endpoint will stay idle if their cycle bit is stale.
Other users complain about IOMMU faults on x86 systems, unsurprisingly.
Deal with it by using existing quirk which allocates a dummy page after
each transfer ring segment. This was seen to resolve both problems. RPi
came up with a more efficient solution, shortening each segment by four
TRBs, but it complicated the driver and they ditched it for this quirk.
Also rename the quirk and add VL805 device ID macro.
Signed-off-by: Michal Pecio <michal.pecio(a)gmail.com>
Link: https://github.com/raspberrypi/linux/issues/4685
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215906
CC: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250225095927.2512358-2-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 92703efda1f7..fdf0c1008225 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2437,7 +2437,8 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
* and our use of dma addresses in the trb_address_map radix tree needs
* TRB_SEGMENT_SIZE alignment, so we pick the greater alignment need.
*/
- if (xhci->quirks & XHCI_ZHAOXIN_TRB_FETCH)
+ if (xhci->quirks & XHCI_TRB_OVERFETCH)
+ /* Buggy HC prefetches beyond segment bounds - allocate dummy space at the end */
xhci->segment_pool = dma_pool_create("xHCI ring segments", dev,
TRB_SEGMENT_SIZE * 2, TRB_SEGMENT_SIZE * 2, xhci->page_size * 2);
else
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index ad0ff356f6fa..54460d11f7ee 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -38,6 +38,8 @@
#define PCI_DEVICE_ID_ETRON_EJ168 0x7023
#define PCI_DEVICE_ID_ETRON_EJ188 0x7052
+#define PCI_DEVICE_ID_VIA_VL805 0x3483
+
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI 0x8c31
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_LP_XHCI 0x9c31
#define PCI_DEVICE_ID_INTEL_WILDCATPOINT_LP_XHCI 0x9cb1
@@ -418,8 +420,10 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
pdev->device == 0x3432)
xhci->quirks |= XHCI_BROKEN_STREAMS;
- if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == 0x3483)
+ if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == PCI_DEVICE_ID_VIA_VL805) {
xhci->quirks |= XHCI_LPM_SUPPORT;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
+ }
if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
pdev->device == PCI_DEVICE_ID_ASMEDIA_1042_XHCI) {
@@ -467,11 +471,11 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->device == 0x9202) {
xhci->quirks |= XHCI_RESET_ON_RESUME;
- xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
}
if (pdev->device == 0x9203)
- xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
}
if (pdev->vendor == PCI_VENDOR_ID_CDNS &&
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 8c164340a2c3..779b01dee068 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1632,7 +1632,7 @@ struct xhci_hcd {
#define XHCI_EP_CTX_BROKEN_DCS BIT_ULL(42)
#define XHCI_SUSPEND_RESUME_CLKS BIT_ULL(43)
#define XHCI_RESET_TO_DEFAULT BIT_ULL(44)
-#define XHCI_ZHAOXIN_TRB_FETCH BIT_ULL(45)
+#define XHCI_TRB_OVERFETCH BIT_ULL(45)
#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
#define XHCI_WRITE_64_HI_LO BIT_ULL(47)
#define XHCI_CDNS_SCTX_QUIRK BIT_ULL(48)
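As a side note on the commit quoted above, a rough sketch of the numbers
(assumed values: the standard 16-byte TRB and one 4 KiB page per ring
segment) shows why doubling the dma_pool element size gives enough headroom:
/* Illustrative arithmetic only -- not driver code. */
#define TRB_BYTES		16
#define SEGMENT_BYTES		4096				/* one ring segment per page */
#define TRBS_PER_SEGMENT	(SEGMENT_BYTES / TRB_BYTES)	/* 256, the last one a link TRB */
#define OVERFETCH_TRBS		4
#define OVERFETCH_BYTES		(OVERFETCH_TRBS * TRB_BYTES)	/* up to 64 bytes past the page */
Allocating 2 * TRB_SEGMENT_SIZE per segment, as the quirk does, keeps those
stray 64 bytes inside memory the driver owns rather than in the next ring or
an unmapped IOMMU page.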
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x cc5bfc4e16fc1d1c520cd7bb28646e82b6e69217
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030953-washboard-overcrowd-fed5@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cc5bfc4e16fc1d1c520cd7bb28646e82b6e69217 Mon Sep 17 00:00:00 2001
From: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Date: Thu, 30 Jan 2025 23:49:31 +0000
Subject: [PATCH] usb: dwc3: Set SUSPENDENABLE soon after phy init
After phy initialization, some phy operations can only be executed while
in lower P states. Ensure GUSB3PIPECTL.SUSPENDENABLE and
GUSB2PHYCFG.SUSPHY are set soon after initialization to avoid blocking
phy ops.
Previously the SUSPENDENABLE bits are only set after the controller
initialization, which may not happen right away if there's no gadget
driver or xhci driver bound. Revise this to clear SUSPENDENABLE bits
only when there's mode switching (change in GCTL.PRTCAPDIR).
Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init")
Cc: stable <stable(a)kernel.org>
Signed-off-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
Link: https://lore.kernel.org/r/633aef0afee7d56d2316f7cc3e1b2a6d518a8cc9.17382809…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 2c472cb97f6c..66a08b527165 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -131,11 +131,24 @@ void dwc3_enable_susphy(struct dwc3 *dwc, bool enable)
}
}
-void dwc3_set_prtcap(struct dwc3 *dwc, u32 mode)
+void dwc3_set_prtcap(struct dwc3 *dwc, u32 mode, bool ignore_susphy)
{
+ unsigned int hw_mode;
u32 reg;
reg = dwc3_readl(dwc->regs, DWC3_GCTL);
+
+ /*
+ * For DRD controllers, GUSB3PIPECTL.SUSPENDENABLE and
+ * GUSB2PHYCFG.SUSPHY should be cleared during mode switching,
+ * and they can be set after core initialization.
+ */
+ hw_mode = DWC3_GHWPARAMS0_MODE(dwc->hwparams.hwparams0);
+ if (hw_mode == DWC3_GHWPARAMS0_MODE_DRD && !ignore_susphy) {
+ if (DWC3_GCTL_PRTCAP(reg) != mode)
+ dwc3_enable_susphy(dwc, false);
+ }
+
reg &= ~(DWC3_GCTL_PRTCAPDIR(DWC3_GCTL_PRTCAP_OTG));
reg |= DWC3_GCTL_PRTCAPDIR(mode);
dwc3_writel(dwc->regs, DWC3_GCTL, reg);
@@ -216,7 +229,7 @@ static void __dwc3_set_mode(struct work_struct *work)
spin_lock_irqsave(&dwc->lock, flags);
- dwc3_set_prtcap(dwc, desired_dr_role);
+ dwc3_set_prtcap(dwc, desired_dr_role, false);
spin_unlock_irqrestore(&dwc->lock, flags);
@@ -658,16 +671,7 @@ static int dwc3_ss_phy_setup(struct dwc3 *dwc, int index)
*/
reg &= ~DWC3_GUSB3PIPECTL_UX_EXIT_PX;
- /*
- * Above DWC_usb3.0 1.94a, it is recommended to set
- * DWC3_GUSB3PIPECTL_SUSPHY to '0' during coreConsultant configuration.
- * So default value will be '0' when the core is reset. Application
- * needs to set it to '1' after the core initialization is completed.
- *
- * Similarly for DRD controllers, GUSB3PIPECTL.SUSPENDENABLE must be
- * cleared after power-on reset, and it can be set after core
- * initialization.
- */
+ /* Ensure the GUSB3PIPECTL.SUSPENDENABLE is cleared prior to phy init. */
reg &= ~DWC3_GUSB3PIPECTL_SUSPHY;
if (dwc->u2ss_inp3_quirk)
@@ -747,15 +751,7 @@ static int dwc3_hs_phy_setup(struct dwc3 *dwc, int index)
break;
}
- /*
- * Above DWC_usb3.0 1.94a, it is recommended to set
- * DWC3_GUSB2PHYCFG_SUSPHY to '0' during coreConsultant configuration.
- * So default value will be '0' when the core is reset. Application
- * needs to set it to '1' after the core initialization is completed.
- *
- * Similarly for DRD controllers, GUSB2PHYCFG.SUSPHY must be cleared
- * after power-on reset, and it can be set after core initialization.
- */
+ /* Ensure the GUSB2PHYCFG.SUSPHY is cleared prior to phy init. */
reg &= ~DWC3_GUSB2PHYCFG_SUSPHY;
if (dwc->dis_enblslpm_quirk)
@@ -830,6 +826,25 @@ static int dwc3_phy_init(struct dwc3 *dwc)
goto err_exit_usb3_phy;
}
+ /*
+ * Above DWC_usb3.0 1.94a, it is recommended to set
+ * DWC3_GUSB3PIPECTL_SUSPHY and DWC3_GUSB2PHYCFG_SUSPHY to '0' during
+ * coreConsultant configuration. So default value will be '0' when the
+ * core is reset. Application needs to set it to '1' after the core
+ * initialization is completed.
+ *
+ * Certain phy requires to be in P0 power state during initialization.
+ * Make sure GUSB3PIPECTL.SUSPENDENABLE and GUSB2PHYCFG.SUSPHY are clear
+ * prior to phy init to maintain in the P0 state.
+ *
+ * After phy initialization, some phy operations can only be executed
+ * while in lower P states. Ensure GUSB3PIPECTL.SUSPENDENABLE and
+ * GUSB2PHYCFG.SUSPHY are set soon after initialization to avoid
+ * blocking phy ops.
+ */
+ if (!DWC3_VER_IS_WITHIN(DWC3, ANY, 194A))
+ dwc3_enable_susphy(dwc, true);
+
return 0;
err_exit_usb3_phy:
@@ -1588,7 +1603,7 @@ static int dwc3_core_init_mode(struct dwc3 *dwc)
switch (dwc->dr_mode) {
case USB_DR_MODE_PERIPHERAL:
- dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_DEVICE);
+ dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_DEVICE, false);
if (dwc->usb2_phy)
otg_set_vbus(dwc->usb2_phy->otg, false);
@@ -1600,7 +1615,7 @@ static int dwc3_core_init_mode(struct dwc3 *dwc)
return dev_err_probe(dev, ret, "failed to initialize gadget\n");
break;
case USB_DR_MODE_HOST:
- dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_HOST);
+ dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_HOST, false);
if (dwc->usb2_phy)
otg_set_vbus(dwc->usb2_phy->otg, true);
@@ -1645,7 +1660,7 @@ static void dwc3_core_exit_mode(struct dwc3 *dwc)
}
/* de-assert DRVVBUS for HOST and OTG mode */
- dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_DEVICE);
+ dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_DEVICE, true);
}
static void dwc3_get_software_properties(struct dwc3 *dwc)
@@ -2453,7 +2468,7 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
if (ret)
return ret;
- dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_DEVICE);
+ dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_DEVICE, true);
dwc3_gadget_resume(dwc);
break;
case DWC3_GCTL_PRTCAP_HOST:
@@ -2461,7 +2476,7 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
ret = dwc3_core_init_for_resume(dwc);
if (ret)
return ret;
- dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_HOST);
+ dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_HOST, true);
break;
}
/* Restore GUSB2PHYCFG bits that were modified in suspend */
@@ -2490,7 +2505,7 @@ static int dwc3_resume_common(struct dwc3 *dwc, pm_message_t msg)
if (ret)
return ret;
- dwc3_set_prtcap(dwc, dwc->current_dr_role);
+ dwc3_set_prtcap(dwc, dwc->current_dr_role, true);
dwc3_otg_init(dwc);
if (dwc->current_otg_role == DWC3_OTG_ROLE_HOST) {
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index c955039bb4f6..aaa39e663f60 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1558,7 +1558,7 @@ struct dwc3_gadget_ep_cmd_params {
#define DWC3_HAS_OTG BIT(3)
/* prototypes */
-void dwc3_set_prtcap(struct dwc3 *dwc, u32 mode);
+void dwc3_set_prtcap(struct dwc3 *dwc, u32 mode, bool ignore_susphy);
void dwc3_set_mode(struct dwc3 *dwc, u32 mode);
u32 dwc3_core_fifo_space(struct dwc3_ep *dep, u8 type);
diff --git a/drivers/usb/dwc3/drd.c b/drivers/usb/dwc3/drd.c
index d76ae676783c..7977860932b1 100644
--- a/drivers/usb/dwc3/drd.c
+++ b/drivers/usb/dwc3/drd.c
@@ -173,7 +173,7 @@ void dwc3_otg_init(struct dwc3 *dwc)
* block "Initialize GCTL for OTG operation".
*/
/* GCTL.PrtCapDir=2'b11 */
- dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_OTG);
+ dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_OTG, true);
/* GUSB2PHYCFG0.SusPHY=0 */
reg = dwc3_readl(dwc->regs, DWC3_GUSB2PHYCFG(0));
reg &= ~DWC3_GUSB2PHYCFG_SUSPHY;
@@ -556,7 +556,7 @@ int dwc3_drd_init(struct dwc3 *dwc)
dwc3_drd_update(dwc);
} else {
- dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_OTG);
+ dwc3_set_prtcap(dwc, DWC3_GCTL_PRTCAP_OTG, true);
/* use OTG block to get ID event */
irq = dwc3_otg_get_irq(dwc);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x c133ec0e5717868c9967fa3df92a55e537b1aead
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030902-fernlike-flashback-65c0@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c133ec0e5717868c9967fa3df92a55e537b1aead Mon Sep 17 00:00:00 2001
From: Michal Pecio <michal.pecio(a)gmail.com>
Date: Tue, 25 Feb 2025 11:59:27 +0200
Subject: [PATCH] usb: xhci: Enable the TRB overfetch quirk on VIA VL805
Raspberry Pi is a major user of those chips and they discovered a bug -
when the end of a transfer ring segment is reached, up to four TRBs can
be prefetched from the next page even if the segment ends with link TRB
and on page boundary (the chip claims to support standard 4KB pages).
It also appears that if the prefetched TRBs belong to a different ring
whose doorbell is later rung, they may be used without refreshing from
system RAM and the endpoint will stay idle if their cycle bit is stale.
Other users complain about IOMMU faults on x86 systems, unsurprisingly.
Deal with it by using existing quirk which allocates a dummy page after
each transfer ring segment. This was seen to resolve both problems. RPi
came up with a more efficient solution, shortening each segment by four
TRBs, but it complicated the driver and they ditched it for this quirk.
Also rename the quirk and add VL805 device ID macro.
Signed-off-by: Michal Pecio <michal.pecio(a)gmail.com>
Link: https://github.com/raspberrypi/linux/issues/4685
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215906
CC: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250225095927.2512358-2-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 92703efda1f7..fdf0c1008225 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2437,7 +2437,8 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
* and our use of dma addresses in the trb_address_map radix tree needs
* TRB_SEGMENT_SIZE alignment, so we pick the greater alignment need.
*/
- if (xhci->quirks & XHCI_ZHAOXIN_TRB_FETCH)
+ if (xhci->quirks & XHCI_TRB_OVERFETCH)
+ /* Buggy HC prefetches beyond segment bounds - allocate dummy space at the end */
xhci->segment_pool = dma_pool_create("xHCI ring segments", dev,
TRB_SEGMENT_SIZE * 2, TRB_SEGMENT_SIZE * 2, xhci->page_size * 2);
else
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index ad0ff356f6fa..54460d11f7ee 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -38,6 +38,8 @@
#define PCI_DEVICE_ID_ETRON_EJ168 0x7023
#define PCI_DEVICE_ID_ETRON_EJ188 0x7052
+#define PCI_DEVICE_ID_VIA_VL805 0x3483
+
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI 0x8c31
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_LP_XHCI 0x9c31
#define PCI_DEVICE_ID_INTEL_WILDCATPOINT_LP_XHCI 0x9cb1
@@ -418,8 +420,10 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
pdev->device == 0x3432)
xhci->quirks |= XHCI_BROKEN_STREAMS;
- if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == 0x3483)
+ if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == PCI_DEVICE_ID_VIA_VL805) {
xhci->quirks |= XHCI_LPM_SUPPORT;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
+ }
if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
pdev->device == PCI_DEVICE_ID_ASMEDIA_1042_XHCI) {
@@ -467,11 +471,11 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->device == 0x9202) {
xhci->quirks |= XHCI_RESET_ON_RESUME;
- xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
}
if (pdev->device == 0x9203)
- xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
}
if (pdev->vendor == PCI_VENDOR_ID_CDNS &&
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 8c164340a2c3..779b01dee068 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1632,7 +1632,7 @@ struct xhci_hcd {
#define XHCI_EP_CTX_BROKEN_DCS BIT_ULL(42)
#define XHCI_SUSPEND_RESUME_CLKS BIT_ULL(43)
#define XHCI_RESET_TO_DEFAULT BIT_ULL(44)
-#define XHCI_ZHAOXIN_TRB_FETCH BIT_ULL(45)
+#define XHCI_TRB_OVERFETCH BIT_ULL(45)
#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
#define XHCI_WRITE_64_HI_LO BIT_ULL(47)
#define XHCI_CDNS_SCTX_QUIRK BIT_ULL(48)
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x c133ec0e5717868c9967fa3df92a55e537b1aead
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030959-thee-uniformed-b4eb@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c133ec0e5717868c9967fa3df92a55e537b1aead Mon Sep 17 00:00:00 2001
From: Michal Pecio <michal.pecio(a)gmail.com>
Date: Tue, 25 Feb 2025 11:59:27 +0200
Subject: [PATCH] usb: xhci: Enable the TRB overfetch quirk on VIA VL805
Raspberry Pi is a major user of those chips and they discovered a bug -
when the end of a transfer ring segment is reached, up to four TRBs can
be prefetched from the next page even if the segment ends with link TRB
and on page boundary (the chip claims to support standard 4KB pages).
It also appears that if the prefetched TRBs belong to a different ring
whose doorbell is later rung, they may be used without refreshing from
system RAM and the endpoint will stay idle if their cycle bit is stale.
Other users complain about IOMMU faults on x86 systems, unsurprisingly.
Deal with it by using existing quirk which allocates a dummy page after
each transfer ring segment. This was seen to resolve both problems. RPi
came up with a more efficient solution, shortening each segment by four
TRBs, but it complicated the driver and they ditched it for this quirk.
Also rename the quirk and add VL805 device ID macro.
Signed-off-by: Michal Pecio <michal.pecio(a)gmail.com>
Link: https://github.com/raspberrypi/linux/issues/4685
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215906
CC: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250225095927.2512358-2-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 92703efda1f7..fdf0c1008225 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2437,7 +2437,8 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
* and our use of dma addresses in the trb_address_map radix tree needs
* TRB_SEGMENT_SIZE alignment, so we pick the greater alignment need.
*/
- if (xhci->quirks & XHCI_ZHAOXIN_TRB_FETCH)
+ if (xhci->quirks & XHCI_TRB_OVERFETCH)
+ /* Buggy HC prefetches beyond segment bounds - allocate dummy space at the end */
xhci->segment_pool = dma_pool_create("xHCI ring segments", dev,
TRB_SEGMENT_SIZE * 2, TRB_SEGMENT_SIZE * 2, xhci->page_size * 2);
else
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index ad0ff356f6fa..54460d11f7ee 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -38,6 +38,8 @@
#define PCI_DEVICE_ID_ETRON_EJ168 0x7023
#define PCI_DEVICE_ID_ETRON_EJ188 0x7052
+#define PCI_DEVICE_ID_VIA_VL805 0x3483
+
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI 0x8c31
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_LP_XHCI 0x9c31
#define PCI_DEVICE_ID_INTEL_WILDCATPOINT_LP_XHCI 0x9cb1
@@ -418,8 +420,10 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
pdev->device == 0x3432)
xhci->quirks |= XHCI_BROKEN_STREAMS;
- if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == 0x3483)
+ if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == PCI_DEVICE_ID_VIA_VL805) {
xhci->quirks |= XHCI_LPM_SUPPORT;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
+ }
if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
pdev->device == PCI_DEVICE_ID_ASMEDIA_1042_XHCI) {
@@ -467,11 +471,11 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->device == 0x9202) {
xhci->quirks |= XHCI_RESET_ON_RESUME;
- xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
}
if (pdev->device == 0x9203)
- xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
}
if (pdev->vendor == PCI_VENDOR_ID_CDNS &&
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 8c164340a2c3..779b01dee068 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1632,7 +1632,7 @@ struct xhci_hcd {
#define XHCI_EP_CTX_BROKEN_DCS BIT_ULL(42)
#define XHCI_SUSPEND_RESUME_CLKS BIT_ULL(43)
#define XHCI_RESET_TO_DEFAULT BIT_ULL(44)
-#define XHCI_ZHAOXIN_TRB_FETCH BIT_ULL(45)
+#define XHCI_TRB_OVERFETCH BIT_ULL(45)
#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
#define XHCI_WRITE_64_HI_LO BIT_ULL(47)
#define XHCI_CDNS_SCTX_QUIRK BIT_ULL(48)
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x c133ec0e5717868c9967fa3df92a55e537b1aead
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030959-character-delouse-db17@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c133ec0e5717868c9967fa3df92a55e537b1aead Mon Sep 17 00:00:00 2001
From: Michal Pecio <michal.pecio(a)gmail.com>
Date: Tue, 25 Feb 2025 11:59:27 +0200
Subject: [PATCH] usb: xhci: Enable the TRB overfetch quirk on VIA VL805
Raspberry Pi is a major user of those chips and they discovered a bug -
when the end of a transfer ring segment is reached, up to four TRBs can
be prefetched from the next page even if the segment ends with link TRB
and on page boundary (the chip claims to support standard 4KB pages).
It also appears that if the prefetched TRBs belong to a different ring
whose doorbell is later rung, they may be used without refreshing from
system RAM and the endpoint will stay idle if their cycle bit is stale.
Other users complain about IOMMU faults on x86 systems, unsurprisingly.
Deal with it by using existing quirk which allocates a dummy page after
each transfer ring segment. This was seen to resolve both problems. RPi
came up with a more efficient solution, shortening each segment by four
TRBs, but it complicated the driver and they ditched it for this quirk.
Also rename the quirk and add VL805 device ID macro.
Signed-off-by: Michal Pecio <michal.pecio(a)gmail.com>
Link: https://github.com/raspberrypi/linux/issues/4685
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215906
CC: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250225095927.2512358-2-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 92703efda1f7..fdf0c1008225 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2437,7 +2437,8 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
* and our use of dma addresses in the trb_address_map radix tree needs
* TRB_SEGMENT_SIZE alignment, so we pick the greater alignment need.
*/
- if (xhci->quirks & XHCI_ZHAOXIN_TRB_FETCH)
+ if (xhci->quirks & XHCI_TRB_OVERFETCH)
+ /* Buggy HC prefetches beyond segment bounds - allocate dummy space at the end */
xhci->segment_pool = dma_pool_create("xHCI ring segments", dev,
TRB_SEGMENT_SIZE * 2, TRB_SEGMENT_SIZE * 2, xhci->page_size * 2);
else
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index ad0ff356f6fa..54460d11f7ee 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -38,6 +38,8 @@
#define PCI_DEVICE_ID_ETRON_EJ168 0x7023
#define PCI_DEVICE_ID_ETRON_EJ188 0x7052
+#define PCI_DEVICE_ID_VIA_VL805 0x3483
+
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI 0x8c31
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_LP_XHCI 0x9c31
#define PCI_DEVICE_ID_INTEL_WILDCATPOINT_LP_XHCI 0x9cb1
@@ -418,8 +420,10 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
pdev->device == 0x3432)
xhci->quirks |= XHCI_BROKEN_STREAMS;
- if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == 0x3483)
+ if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == PCI_DEVICE_ID_VIA_VL805) {
xhci->quirks |= XHCI_LPM_SUPPORT;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
+ }
if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
pdev->device == PCI_DEVICE_ID_ASMEDIA_1042_XHCI) {
@@ -467,11 +471,11 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->device == 0x9202) {
xhci->quirks |= XHCI_RESET_ON_RESUME;
- xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
}
if (pdev->device == 0x9203)
- xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
}
if (pdev->vendor == PCI_VENDOR_ID_CDNS &&
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 8c164340a2c3..779b01dee068 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1632,7 +1632,7 @@ struct xhci_hcd {
#define XHCI_EP_CTX_BROKEN_DCS BIT_ULL(42)
#define XHCI_SUSPEND_RESUME_CLKS BIT_ULL(43)
#define XHCI_RESET_TO_DEFAULT BIT_ULL(44)
-#define XHCI_ZHAOXIN_TRB_FETCH BIT_ULL(45)
+#define XHCI_TRB_OVERFETCH BIT_ULL(45)
#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
#define XHCI_WRITE_64_HI_LO BIT_ULL(47)
#define XHCI_CDNS_SCTX_QUIRK BIT_ULL(48)
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x c133ec0e5717868c9967fa3df92a55e537b1aead
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030958-june-lard-2d9f@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c133ec0e5717868c9967fa3df92a55e537b1aead Mon Sep 17 00:00:00 2001
From: Michal Pecio <michal.pecio(a)gmail.com>
Date: Tue, 25 Feb 2025 11:59:27 +0200
Subject: [PATCH] usb: xhci: Enable the TRB overfetch quirk on VIA VL805
Raspberry Pi is a major user of those chips and they discovered a bug -
when the end of a transfer ring segment is reached, up to four TRBs can
be prefetched from the next page even if the segment ends with link TRB
and on page boundary (the chip claims to support standard 4KB pages).
It also appears that if the prefetched TRBs belong to a different ring
whose doorbell is later rung, they may be used without refreshing from
system RAM and the endpoint will stay idle if their cycle bit is stale.
Other users complain about IOMMU faults on x86 systems, unsurprisingly.
Deal with it by using existing quirk which allocates a dummy page after
each transfer ring segment. This was seen to resolve both problems. RPi
came up with a more efficient solution, shortening each segment by four
TRBs, but it complicated the driver and they ditched it for this quirk.
Also rename the quirk and add VL805 device ID macro.
Signed-off-by: Michal Pecio <michal.pecio(a)gmail.com>
Link: https://github.com/raspberrypi/linux/issues/4685
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215906
CC: stable(a)vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Link: https://lore.kernel.org/r/20250225095927.2512358-2-mathias.nyman@linux.inte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 92703efda1f7..fdf0c1008225 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2437,7 +2437,8 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
* and our use of dma addresses in the trb_address_map radix tree needs
* TRB_SEGMENT_SIZE alignment, so we pick the greater alignment need.
*/
- if (xhci->quirks & XHCI_ZHAOXIN_TRB_FETCH)
+ if (xhci->quirks & XHCI_TRB_OVERFETCH)
+ /* Buggy HC prefetches beyond segment bounds - allocate dummy space at the end */
xhci->segment_pool = dma_pool_create("xHCI ring segments", dev,
TRB_SEGMENT_SIZE * 2, TRB_SEGMENT_SIZE * 2, xhci->page_size * 2);
else
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index ad0ff356f6fa..54460d11f7ee 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -38,6 +38,8 @@
#define PCI_DEVICE_ID_ETRON_EJ168 0x7023
#define PCI_DEVICE_ID_ETRON_EJ188 0x7052
+#define PCI_DEVICE_ID_VIA_VL805 0x3483
+
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI 0x8c31
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_LP_XHCI 0x9c31
#define PCI_DEVICE_ID_INTEL_WILDCATPOINT_LP_XHCI 0x9cb1
@@ -418,8 +420,10 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
pdev->device == 0x3432)
xhci->quirks |= XHCI_BROKEN_STREAMS;
- if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == 0x3483)
+ if (pdev->vendor == PCI_VENDOR_ID_VIA && pdev->device == PCI_DEVICE_ID_VIA_VL805) {
xhci->quirks |= XHCI_LPM_SUPPORT;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
+ }
if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
pdev->device == PCI_DEVICE_ID_ASMEDIA_1042_XHCI) {
@@ -467,11 +471,11 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->device == 0x9202) {
xhci->quirks |= XHCI_RESET_ON_RESUME;
- xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
}
if (pdev->device == 0x9203)
- xhci->quirks |= XHCI_ZHAOXIN_TRB_FETCH;
+ xhci->quirks |= XHCI_TRB_OVERFETCH;
}
if (pdev->vendor == PCI_VENDOR_ID_CDNS &&
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 8c164340a2c3..779b01dee068 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1632,7 +1632,7 @@ struct xhci_hcd {
#define XHCI_EP_CTX_BROKEN_DCS BIT_ULL(42)
#define XHCI_SUSPEND_RESUME_CLKS BIT_ULL(43)
#define XHCI_RESET_TO_DEFAULT BIT_ULL(44)
-#define XHCI_ZHAOXIN_TRB_FETCH BIT_ULL(45)
+#define XHCI_TRB_OVERFETCH BIT_ULL(45)
#define XHCI_ZHAOXIN_HOST BIT_ULL(46)
#define XHCI_WRITE_64_HI_LO BIT_ULL(47)
#define XHCI_CDNS_SCTX_QUIRK BIT_ULL(48)
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 67bab13307c83fb742c2556b06cdc39dbad27f07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030908-defacing-rumor-448c@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 67bab13307c83fb742c2556b06cdc39dbad27f07 Mon Sep 17 00:00:00 2001
From: Ge Yang <yangge1116(a)126.com>
Date: Wed, 19 Feb 2025 11:46:44 +0800
Subject: [PATCH] mm/hugetlb: wait for hugetlb folios to be freed
Since the introduction of commit c77c0a8ac4c52 ("mm/hugetlb: defer freeing
of huge pages if in non-task context"), which supports deferring the
freeing of hugetlb pages, the allocation of contiguous memory through
cma_alloc() may fail probabilistically.
In the CMA allocation process, if it is found that the CMA area is
occupied by in-use hugetlb folios, these in-use hugetlb folios need to be
migrated to another location. When there are no available hugetlb folios
in the free hugetlb pool during the migration of in-use hugetlb folios,
new folios are allocated from the buddy system. A temporary state is set
on the newly allocated folio. Upon completion of the hugetlb folio
migration, the temporary state is transferred from the new folios to the
old folios. Normally, when the old folios with the temporary state are
freed, it is directly released back to the buddy system. However, due to
the deferred freeing of hugetlb pages, the PageBuddy() check fails,
ultimately leading to the failure of cma_alloc().
Here is a simplified call trace illustrating the process:
cma_alloc()
->__alloc_contig_migrate_range() // Migrate in-use hugetlb folios
->unmap_and_move_huge_page()
->folio_putback_hugetlb() // Free old folios
->test_pages_isolated()
->__test_page_isolated_in_pageblock()
->PageBuddy(page) // Check if the page is in buddy
To resolve this issue, we have implemented a function named
wait_for_freed_hugetlb_folios(). This function ensures that the hugetlb
folios are properly released back to the buddy system after their
migration is completed. By invoking wait_for_freed_hugetlb_folios()
before calling PageBuddy(), we ensure that PageBuddy() will succeed.
Link: https://lkml.kernel.org/r/1739936804-18199-1-git-send-email-yangge1116@126.…
Fixes: c77c0a8ac4c5 ("mm/hugetlb: defer freeing of huge pages if in non-task context")
Signed-off-by: Ge Yang <yangge1116(a)126.com>
Reviewed-by: Muchun Song <muchun.song(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Barry Song <21cnbao(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ec8c0ccc8f95..dbe76d4f1bfc 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -682,6 +682,7 @@ struct huge_bootmem_page {
int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
+void wait_for_freed_hugetlb_folios(void);
struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr, bool cow_from_owner);
struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
@@ -1066,6 +1067,10 @@ static inline int replace_free_hugepage_folios(unsigned long start_pfn,
return 0;
}
+static inline void wait_for_freed_hugetlb_folios(void)
+{
+}
+
static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr,
bool cow_from_owner)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 163190e89ea1..811b29f77abf 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2943,6 +2943,14 @@ int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
return ret;
}
+void wait_for_freed_hugetlb_folios(void)
+{
+ if (llist_empty(&hpage_freelist))
+ return;
+
+ flush_work(&free_hpage_work);
+}
+
typedef enum {
/*
* For either 0/1: we checked the per-vma resv map, and one resv
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index c608e9d72865..a051a29e95ad 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -607,6 +607,16 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
struct zone *zone;
int ret;
+ /*
+ * Due to the deferred freeing of hugetlb folios, the hugepage folios may
+ * not immediately release to the buddy system. This can cause PageBuddy()
+ * to fail in __test_page_isolated_in_pageblock(). To ensure that the
+ * hugetlb folios are properly released back to the buddy system, we
+ * invoke the wait_for_freed_hugetlb_folios() function to wait for the
+ * release to complete.
+ */
+ wait_for_freed_hugetlb_folios();
+
/*
* Note: pageblock_nr_pages != MAX_PAGE_ORDER. Then, chunks of free
* pages are not aligned to pageblock_nr_pages.
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 67bab13307c83fb742c2556b06cdc39dbad27f07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030907-blush-surname-f05c@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 67bab13307c83fb742c2556b06cdc39dbad27f07 Mon Sep 17 00:00:00 2001
From: Ge Yang <yangge1116(a)126.com>
Date: Wed, 19 Feb 2025 11:46:44 +0800
Subject: [PATCH] mm/hugetlb: wait for hugetlb folios to be freed
Since the introduction of commit c77c0a8ac4c52 ("mm/hugetlb: defer freeing
of huge pages if in non-task context"), which supports deferring the
freeing of hugetlb pages, the allocation of contiguous memory through
cma_alloc() may fail probabilistically.
In the CMA allocation process, if it is found that the CMA area is
occupied by in-use hugetlb folios, these in-use hugetlb folios need to be
migrated to another location. When there are no available hugetlb folios
in the free hugetlb pool during the migration of in-use hugetlb folios,
new folios are allocated from the buddy system. A temporary state is set
on the newly allocated folio. Upon completion of the hugetlb folio
migration, the temporary state is transferred from the new folios to the
old folios. Normally, when the old folios with the temporary state are
freed, it is directly released back to the buddy system. However, due to
the deferred freeing of hugetlb pages, the PageBuddy() check fails,
ultimately leading to the failure of cma_alloc().
Here is a simplified call trace illustrating the process:
cma_alloc()
->__alloc_contig_migrate_range() // Migrate in-use hugetlb folios
->unmap_and_move_huge_page()
->folio_putback_hugetlb() // Free old folios
->test_pages_isolated()
->__test_page_isolated_in_pageblock()
->PageBuddy(page) // Check if the page is in buddy
To resolve this issue, we have implemented a function named
wait_for_freed_hugetlb_folios(). This function ensures that the hugetlb
folios are properly released back to the buddy system after their
migration is completed. By invoking wait_for_freed_hugetlb_folios()
before calling PageBuddy(), we ensure that PageBuddy() will succeed.
Link: https://lkml.kernel.org/r/1739936804-18199-1-git-send-email-yangge1116@126.…
Fixes: c77c0a8ac4c5 ("mm/hugetlb: defer freeing of huge pages if in non-task context")
Signed-off-by: Ge Yang <yangge1116(a)126.com>
Reviewed-by: Muchun Song <muchun.song(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Barry Song <21cnbao(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ec8c0ccc8f95..dbe76d4f1bfc 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -682,6 +682,7 @@ struct huge_bootmem_page {
int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
+void wait_for_freed_hugetlb_folios(void);
struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr, bool cow_from_owner);
struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
@@ -1066,6 +1067,10 @@ static inline int replace_free_hugepage_folios(unsigned long start_pfn,
return 0;
}
+static inline void wait_for_freed_hugetlb_folios(void)
+{
+}
+
static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr,
bool cow_from_owner)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 163190e89ea1..811b29f77abf 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2943,6 +2943,14 @@ int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
return ret;
}
+void wait_for_freed_hugetlb_folios(void)
+{
+ if (llist_empty(&hpage_freelist))
+ return;
+
+ flush_work(&free_hpage_work);
+}
+
typedef enum {
/*
* For either 0/1: we checked the per-vma resv map, and one resv
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index c608e9d72865..a051a29e95ad 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -607,6 +607,16 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
struct zone *zone;
int ret;
+ /*
+ * Due to the deferred freeing of hugetlb folios, the hugepage folios may
+ * not immediately release to the buddy system. This can cause PageBuddy()
+ * to fail in __test_page_isolated_in_pageblock(). To ensure that the
+ * hugetlb folios are properly released back to the buddy system, we
+ * invoke the wait_for_freed_hugetlb_folios() function to wait for the
+ * release to complete.
+ */
+ wait_for_freed_hugetlb_folios();
+
/*
* Note: pageblock_nr_pages != MAX_PAGE_ORDER. Then, chunks of free
* pages are not aligned to pageblock_nr_pages.
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 67bab13307c83fb742c2556b06cdc39dbad27f07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030906-iodize-baboon-b1af@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 67bab13307c83fb742c2556b06cdc39dbad27f07 Mon Sep 17 00:00:00 2001
From: Ge Yang <yangge1116(a)126.com>
Date: Wed, 19 Feb 2025 11:46:44 +0800
Subject: [PATCH] mm/hugetlb: wait for hugetlb folios to be freed
Since the introduction of commit c77c0a8ac4c52 ("mm/hugetlb: defer freeing
of huge pages if in non-task context"), which supports deferring the
freeing of hugetlb pages, the allocation of contiguous memory through
cma_alloc() may fail probabilistically.
In the CMA allocation process, if it is found that the CMA area is
occupied by in-use hugetlb folios, these in-use hugetlb folios need to be
migrated to another location. When there are no available hugetlb folios
in the free hugetlb pool during the migration of in-use hugetlb folios,
new folios are allocated from the buddy system. A temporary state is set
on the newly allocated folio. Upon completion of the hugetlb folio
migration, the temporary state is transferred from the new folios to the
old folios. Normally, when the old folios with the temporary state are
freed, it is directly released back to the buddy system. However, due to
the deferred freeing of hugetlb pages, the PageBuddy() check fails,
ultimately leading to the failure of cma_alloc().
Here is a simplified call trace illustrating the process:
cma_alloc()
->__alloc_contig_migrate_range() // Migrate in-use hugetlb folios
->unmap_and_move_huge_page()
->folio_putback_hugetlb() // Free old folios
->test_pages_isolated()
->__test_page_isolated_in_pageblock()
->PageBuddy(page) // Check if the page is in buddy
To resolve this issue, we have implemented a function named
wait_for_freed_hugetlb_folios(). This function ensures that the hugetlb
folios are properly released back to the buddy system after their
migration is completed. By invoking wait_for_freed_hugetlb_folios()
before calling PageBuddy(), we ensure that PageBuddy() will succeed.
Link: https://lkml.kernel.org/r/1739936804-18199-1-git-send-email-yangge1116@126.…
Fixes: c77c0a8ac4c5 ("mm/hugetlb: defer freeing of huge pages if in non-task context")
Signed-off-by: Ge Yang <yangge1116(a)126.com>
Reviewed-by: Muchun Song <muchun.song(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Barry Song <21cnbao(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ec8c0ccc8f95..dbe76d4f1bfc 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -682,6 +682,7 @@ struct huge_bootmem_page {
int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
+void wait_for_freed_hugetlb_folios(void);
struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr, bool cow_from_owner);
struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
@@ -1066,6 +1067,10 @@ static inline int replace_free_hugepage_folios(unsigned long start_pfn,
return 0;
}
+static inline void wait_for_freed_hugetlb_folios(void)
+{
+}
+
static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr,
bool cow_from_owner)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 163190e89ea1..811b29f77abf 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2943,6 +2943,14 @@ int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
return ret;
}
+void wait_for_freed_hugetlb_folios(void)
+{
+ if (llist_empty(&hpage_freelist))
+ return;
+
+ flush_work(&free_hpage_work);
+}
+
typedef enum {
/*
* For either 0/1: we checked the per-vma resv map, and one resv
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index c608e9d72865..a051a29e95ad 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -607,6 +607,16 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
struct zone *zone;
int ret;
+ /*
+ * Due to the deferred freeing of hugetlb folios, the hugepage folios may
+ * not immediately release to the buddy system. This can cause PageBuddy()
+ * to fail in __test_page_isolated_in_pageblock(). To ensure that the
+ * hugetlb folios are properly released back to the buddy system, we
+ * invoke the wait_for_freed_hugetlb_folios() function to wait for the
+ * release to complete.
+ */
+ wait_for_freed_hugetlb_folios();
+
/*
* Note: pageblock_nr_pages != MAX_PAGE_ORDER. Then, chunks of free
* pages are not aligned to pageblock_nr_pages.
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 67bab13307c83fb742c2556b06cdc39dbad27f07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030905-parchment-riddance-0a09@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 67bab13307c83fb742c2556b06cdc39dbad27f07 Mon Sep 17 00:00:00 2001
From: Ge Yang <yangge1116(a)126.com>
Date: Wed, 19 Feb 2025 11:46:44 +0800
Subject: [PATCH] mm/hugetlb: wait for hugetlb folios to be freed
Since the introduction of commit c77c0a8ac4c52 ("mm/hugetlb: defer freeing
of huge pages if in non-task context"), which supports deferring the
freeing of hugetlb pages, the allocation of contiguous memory through
cma_alloc() may fail probabilistically.
In the CMA allocation process, if it is found that the CMA area is
occupied by in-use hugetlb folios, these in-use hugetlb folios need to be
migrated to another location. When there are no available hugetlb folios
in the free hugetlb pool during the migration of in-use hugetlb folios,
new folios are allocated from the buddy system. A temporary state is set
on the newly allocated folio. Upon completion of the hugetlb folio
migration, the temporary state is transferred from the new folios to the
old folios. Normally, when the old folios with the temporary state are
freed, it is directly released back to the buddy system. However, due to
the deferred freeing of hugetlb pages, the PageBuddy() check fails,
ultimately leading to the failure of cma_alloc().
Here is a simplified call trace illustrating the process:
cma_alloc()
->__alloc_contig_migrate_range() // Migrate in-use hugetlb folios
->unmap_and_move_huge_page()
->folio_putback_hugetlb() // Free old folios
->test_pages_isolated()
->__test_page_isolated_in_pageblock()
->PageBuddy(page) // Check if the page is in buddy
To resolve this issue, we have implemented a function named
wait_for_freed_hugetlb_folios(). This function ensures that the hugetlb
folios are properly released back to the buddy system after their
migration is completed. By invoking wait_for_freed_hugetlb_folios()
before calling PageBuddy(), we ensure that PageBuddy() will succeed.
Link: https://lkml.kernel.org/r/1739936804-18199-1-git-send-email-yangge1116@126.…
Fixes: c77c0a8ac4c5 ("mm/hugetlb: defer freeing of huge pages if in non-task context")
Signed-off-by: Ge Yang <yangge1116(a)126.com>
Reviewed-by: Muchun Song <muchun.song(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Barry Song <21cnbao(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ec8c0ccc8f95..dbe76d4f1bfc 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -682,6 +682,7 @@ struct huge_bootmem_page {
int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
+void wait_for_freed_hugetlb_folios(void);
struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr, bool cow_from_owner);
struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
@@ -1066,6 +1067,10 @@ static inline int replace_free_hugepage_folios(unsigned long start_pfn,
return 0;
}
+static inline void wait_for_freed_hugetlb_folios(void)
+{
+}
+
static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr,
bool cow_from_owner)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 163190e89ea1..811b29f77abf 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2943,6 +2943,14 @@ int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
return ret;
}
+void wait_for_freed_hugetlb_folios(void)
+{
+ if (llist_empty(&hpage_freelist))
+ return;
+
+ flush_work(&free_hpage_work);
+}
+
typedef enum {
/*
* For either 0/1: we checked the per-vma resv map, and one resv
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index c608e9d72865..a051a29e95ad 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -607,6 +607,16 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
struct zone *zone;
int ret;
+ /*
+ * Due to the deferred freeing of hugetlb folios, the hugepage folios may
+ * not immediately release to the buddy system. This can cause PageBuddy()
+ * to fail in __test_page_isolated_in_pageblock(). To ensure that the
+ * hugetlb folios are properly released back to the buddy system, we
+ * invoke the wait_for_freed_hugetlb_folios() function to wait for the
+ * release to complete.
+ */
+ wait_for_freed_hugetlb_folios();
+
/*
* Note: pageblock_nr_pages != MAX_PAGE_ORDER. Then, chunks of free
* pages are not aligned to pageblock_nr_pages.
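The fix above relies on the deferred-freeing behaviour introduced by commit
c77c0a8ac4c52: hugetlb folios freed from non-task context are queued on
hpage_freelist and released later by a work item, so a PageBuddy() check can
run before that work has executed. Below is a minimal userspace C sketch of
that pattern (not kernel code; all names and values are illustrative), showing
why the deferred work has to be drained, as wait_for_freed_hugetlb_folios()
does with flush_work(), before the check:

#include <stdbool.h>
#include <stdio.h>

#define NPAGES 4

static bool in_buddy[NPAGES];              /* analogue of PageBuddy()     */
static int deferred[NPAGES], ndeferred;    /* analogue of hpage_freelist  */

static void free_page_deferred(int pfn)    /* queue the free, do it later */
{
	deferred[ndeferred++] = pfn;
}

static void flush_free_work(void)          /* analogue of flush_work()    */
{
	while (ndeferred)
		in_buddy[deferred[--ndeferred]] = true;
}

int main(void)
{
	free_page_deferred(2);   /* old hugetlb folio queued for freeing */

	/* without the flush the check fails, just as cma_alloc() did */
	printf("before flush: PageBuddy(2) = %d\n", in_buddy[2]);

	flush_free_work();       /* wait_for_freed_hugetlb_folios() analogue */
	printf("after flush:  PageBuddy(2) = %d\n", in_buddy[2]);
	return 0;
}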
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 67bab13307c83fb742c2556b06cdc39dbad27f07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030904-splendor-sly-a852@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 67bab13307c83fb742c2556b06cdc39dbad27f07 Mon Sep 17 00:00:00 2001
From: Ge Yang <yangge1116(a)126.com>
Date: Wed, 19 Feb 2025 11:46:44 +0800
Subject: [PATCH] mm/hugetlb: wait for hugetlb folios to be freed
Since the introduction of commit c77c0a8ac4c52 ("mm/hugetlb: defer freeing
of huge pages if in non-task context"), which supports deferring the
freeing of hugetlb pages, the allocation of contiguous memory through
cma_alloc() may fail probabilistically.
In the CMA allocation process, if it is found that the CMA area is
occupied by in-use hugetlb folios, these in-use hugetlb folios need to be
migrated to another location. When there are no available hugetlb folios
in the free hugetlb pool during the migration of in-use hugetlb folios,
new folios are allocated from the buddy system. A temporary state is set
on the newly allocated folio. Upon completion of the hugetlb folio
migration, the temporary state is transferred from the new folios to the
old folios. Normally, when the old folios with the temporary state are
freed, they are directly released back to the buddy system. However, due to
the deferred freeing of hugetlb pages, the PageBuddy() check fails,
ultimately leading to the failure of cma_alloc().
Here is a simplified call trace illustrating the process:
cma_alloc()
->__alloc_contig_migrate_range() // Migrate in-use hugetlb folios
->unmap_and_move_huge_page()
->folio_putback_hugetlb() // Free old folios
->test_pages_isolated()
->__test_page_isolated_in_pageblock()
->PageBuddy(page) // Check if the page is in buddy
To resolve this issue, we have implemented a function named
wait_for_freed_hugetlb_folios(). This function ensures that the hugetlb
folios are properly released back to the buddy system after their
migration is completed. By invoking wait_for_freed_hugetlb_folios()
before calling PageBuddy(), we ensure that PageBuddy() will succeed.
Link: https://lkml.kernel.org/r/1739936804-18199-1-git-send-email-yangge1116@126.…
Fixes: c77c0a8ac4c5 ("mm/hugetlb: defer freeing of huge pages if in non-task context")
Signed-off-by: Ge Yang <yangge1116(a)126.com>
Reviewed-by: Muchun Song <muchun.song(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Barry Song <21cnbao(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ec8c0ccc8f95..dbe76d4f1bfc 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -682,6 +682,7 @@ struct huge_bootmem_page {
int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
+void wait_for_freed_hugetlb_folios(void);
struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr, bool cow_from_owner);
struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
@@ -1066,6 +1067,10 @@ static inline int replace_free_hugepage_folios(unsigned long start_pfn,
return 0;
}
+static inline void wait_for_freed_hugetlb_folios(void)
+{
+}
+
static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr,
bool cow_from_owner)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 163190e89ea1..811b29f77abf 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2943,6 +2943,14 @@ int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
return ret;
}
+void wait_for_freed_hugetlb_folios(void)
+{
+ if (llist_empty(&hpage_freelist))
+ return;
+
+ flush_work(&free_hpage_work);
+}
+
typedef enum {
/*
* For either 0/1: we checked the per-vma resv map, and one resv
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index c608e9d72865..a051a29e95ad 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -607,6 +607,16 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
struct zone *zone;
int ret;
+ /*
+ * Due to the deferred freeing of hugetlb folios, the hugepage folios may
+ * not immediately release to the buddy system. This can cause PageBuddy()
+ * to fail in __test_page_isolated_in_pageblock(). To ensure that the
+ * hugetlb folios are properly released back to the buddy system, we
+ * invoke the wait_for_freed_hugetlb_folios() function to wait for the
+ * release to complete.
+ */
+ wait_for_freed_hugetlb_folios();
+
/*
* Note: pageblock_nr_pages != MAX_PAGE_ORDER. Then, chunks of free
* pages are not aligned to pageblock_nr_pages.
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 67bab13307c83fb742c2556b06cdc39dbad27f07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030903-simplify-blooming-c758@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 67bab13307c83fb742c2556b06cdc39dbad27f07 Mon Sep 17 00:00:00 2001
From: Ge Yang <yangge1116(a)126.com>
Date: Wed, 19 Feb 2025 11:46:44 +0800
Subject: [PATCH] mm/hugetlb: wait for hugetlb folios to be freed
Since the introduction of commit c77c0a8ac4c52 ("mm/hugetlb: defer freeing
of huge pages if in non-task context"), which supports deferring the
freeing of hugetlb pages, the allocation of contiguous memory through
cma_alloc() may fail probabilistically.
In the CMA allocation process, if it is found that the CMA area is
occupied by in-use hugetlb folios, these in-use hugetlb folios need to be
migrated to another location. When there are no available hugetlb folios
in the free hugetlb pool during the migration of in-use hugetlb folios,
new folios are allocated from the buddy system. A temporary state is set
on the newly allocated folio. Upon completion of the hugetlb folio
migration, the temporary state is transferred from the new folios to the
old folios. Normally, when the old folios with the temporary state are
freed, they are directly released back to the buddy system. However, due to
the deferred freeing of hugetlb pages, the PageBuddy() check fails,
ultimately leading to the failure of cma_alloc().
Here is a simplified call trace illustrating the process:
cma_alloc()
->__alloc_contig_migrate_range() // Migrate in-use hugetlb folios
->unmap_and_move_huge_page()
->folio_putback_hugetlb() // Free old folios
->test_pages_isolated()
->__test_page_isolated_in_pageblock()
->PageBuddy(page) // Check if the page is in buddy
To resolve this issue, we have implemented a function named
wait_for_freed_hugetlb_folios(). This function ensures that the hugetlb
folios are properly released back to the buddy system after their
migration is completed. By invoking wait_for_freed_hugetlb_folios()
before calling PageBuddy(), we ensure that PageBuddy() will succeed.
Link: https://lkml.kernel.org/r/1739936804-18199-1-git-send-email-yangge1116@126.…
Fixes: c77c0a8ac4c5 ("mm/hugetlb: defer freeing of huge pages if in non-task context")
Signed-off-by: Ge Yang <yangge1116(a)126.com>
Reviewed-by: Muchun Song <muchun.song(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Barry Song <21cnbao(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ec8c0ccc8f95..dbe76d4f1bfc 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -682,6 +682,7 @@ struct huge_bootmem_page {
int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
+void wait_for_freed_hugetlb_folios(void);
struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr, bool cow_from_owner);
struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
@@ -1066,6 +1067,10 @@ static inline int replace_free_hugepage_folios(unsigned long start_pfn,
return 0;
}
+static inline void wait_for_freed_hugetlb_folios(void)
+{
+}
+
static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr,
bool cow_from_owner)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 163190e89ea1..811b29f77abf 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2943,6 +2943,14 @@ int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
return ret;
}
+void wait_for_freed_hugetlb_folios(void)
+{
+ if (llist_empty(&hpage_freelist))
+ return;
+
+ flush_work(&free_hpage_work);
+}
+
typedef enum {
/*
* For either 0/1: we checked the per-vma resv map, and one resv
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index c608e9d72865..a051a29e95ad 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -607,6 +607,16 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
struct zone *zone;
int ret;
+ /*
+ * Due to the deferred freeing of hugetlb folios, the hugepage folios may
+ * not immediately release to the buddy system. This can cause PageBuddy()
+ * to fail in __test_page_isolated_in_pageblock(). To ensure that the
+ * hugetlb folios are properly released back to the buddy system, we
+ * invoke the wait_for_freed_hugetlb_folios() function to wait for the
+ * release to complete.
+ */
+ wait_for_freed_hugetlb_folios();
+
/*
* Note: pageblock_nr_pages != MAX_PAGE_ORDER. Then, chunks of free
* pages are not aligned to pageblock_nr_pages.
The patch below does not apply to the 6.13-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.13.y
git checkout FETCH_HEAD
git cherry-pick -x 67bab13307c83fb742c2556b06cdc39dbad27f07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030902-guidance-kung-0573@gregkh' --subject-prefix 'PATCH 6.13.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 67bab13307c83fb742c2556b06cdc39dbad27f07 Mon Sep 17 00:00:00 2001
From: Ge Yang <yangge1116(a)126.com>
Date: Wed, 19 Feb 2025 11:46:44 +0800
Subject: [PATCH] mm/hugetlb: wait for hugetlb folios to be freed
Since the introduction of commit c77c0a8ac4c52 ("mm/hugetlb: defer freeing
of huge pages if in non-task context"), which supports deferring the
freeing of hugetlb pages, the allocation of contiguous memory through
cma_alloc() may fail probabilistically.
In the CMA allocation process, if it is found that the CMA area is
occupied by in-use hugetlb folios, these in-use hugetlb folios need to be
migrated to another location. When there are no available hugetlb folios
in the free hugetlb pool during the migration of in-use hugetlb folios,
new folios are allocated from the buddy system. A temporary state is set
on the newly allocated folio. Upon completion of the hugetlb folio
migration, the temporary state is transferred from the new folios to the
old folios. Normally, when the old folios with the temporary state are
freed, they are directly released back to the buddy system. However, due to
the deferred freeing of hugetlb pages, the PageBuddy() check fails,
ultimately leading to the failure of cma_alloc().
Here is a simplified call trace illustrating the process:
cma_alloc()
->__alloc_contig_migrate_range() // Migrate in-use hugetlb folios
->unmap_and_move_huge_page()
->folio_putback_hugetlb() // Free old folios
->test_pages_isolated()
->__test_page_isolated_in_pageblock()
->PageBuddy(page) // Check if the page is in buddy
To resolve this issue, we have implemented a function named
wait_for_freed_hugetlb_folios(). This function ensures that the hugetlb
folios are properly released back to the buddy system after their
migration is completed. By invoking wait_for_freed_hugetlb_folios()
before calling PageBuddy(), we ensure that PageBuddy() will succeed.
Link: https://lkml.kernel.org/r/1739936804-18199-1-git-send-email-yangge1116@126.…
Fixes: c77c0a8ac4c5 ("mm/hugetlb: defer freeing of huge pages if in non-task context")
Signed-off-by: Ge Yang <yangge1116(a)126.com>
Reviewed-by: Muchun Song <muchun.song(a)linux.dev>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Barry Song <21cnbao(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ec8c0ccc8f95..dbe76d4f1bfc 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -682,6 +682,7 @@ struct huge_bootmem_page {
int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
+void wait_for_freed_hugetlb_folios(void);
struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr, bool cow_from_owner);
struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
@@ -1066,6 +1067,10 @@ static inline int replace_free_hugepage_folios(unsigned long start_pfn,
return 0;
}
+static inline void wait_for_freed_hugetlb_folios(void)
+{
+}
+
static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr,
bool cow_from_owner)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 163190e89ea1..811b29f77abf 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2943,6 +2943,14 @@ int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
return ret;
}
+void wait_for_freed_hugetlb_folios(void)
+{
+ if (llist_empty(&hpage_freelist))
+ return;
+
+ flush_work(&free_hpage_work);
+}
+
typedef enum {
/*
* For either 0/1: we checked the per-vma resv map, and one resv
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index c608e9d72865..a051a29e95ad 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -607,6 +607,16 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
struct zone *zone;
int ret;
+ /*
+ * Due to the deferred freeing of hugetlb folios, the hugepage folios may
+ * not immediately release to the buddy system. This can cause PageBuddy()
+ * to fail in __test_page_isolated_in_pageblock(). To ensure that the
+ * hugetlb folios are properly released back to the buddy system, we
+ * invoke the wait_for_freed_hugetlb_folios() function to wait for the
+ * release to complete.
+ */
+ wait_for_freed_hugetlb_folios();
+
/*
* Note: pageblock_nr_pages != MAX_PAGE_ORDER. Then, chunks of free
* pages are not aligned to pageblock_nr_pages.
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 058313515d5aab10d0a01dd634f92ed4a4e71d4c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030954-polish-overeater-d2be@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 058313515d5aab10d0a01dd634f92ed4a4e71d4c Mon Sep 17 00:00:00 2001
From: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Date: Tue, 25 Feb 2025 17:52:55 +0800
Subject: [PATCH] mm: shmem: fix potential data corruption during shmem swapin
Alex and Kairui reported some issues (system hang or data corruption) when
swapping out or swapping in large shmem folios. This is especially easy
to reproduce when the tmpfs is mounted with the 'huge=within_size'
parameter. Thanks to Kairui's reproducer, the issue can be easily
replicated.
The root cause of the problem is that swap readahead may asynchronously
swap in order 0 folios into the swap cache, while the shmem mapping can
still store large swap entries. Then an order 0 folio is inserted into
the shmem mapping without splitting the large swap entry, which overwrites
the original large swap entry, leading to data corruption.
When getting a folio from the swap cache, we should split the large swap
entry stored in the shmem mapping if the orders do not match, to fix this
issue.
Link: https://lkml.kernel.org/r/2fe47c557e74e9df5fe2437ccdc6c9115fa1bf70.17404769…
Fixes: 809bc86517cc ("mm: shmem: support large folio swap out")
Signed-off-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Reported-by: Alex Xu (Hello71) <alex_y_xu(a)yahoo.ca>
Reported-by: Kairui Song <ryncsn(a)gmail.com>
Closes: https://lore.kernel.org/all/1738717785.im3r5g2vxc.none@localhost/
Tested-by: Kairui Song <kasong(a)tencent.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Lance Yang <ioworker0(a)gmail.com>
Cc: Matthew Wilcow <willy(a)infradead.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/shmem.c b/mm/shmem.c
index 4ea6109a8043..cebbac97a221 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2253,7 +2253,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
struct folio *folio = NULL;
bool skip_swapcache = false;
swp_entry_t swap;
- int error, nr_pages;
+ int error, nr_pages, order, split_order;
VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
swap = radix_to_swp_entry(*foliop);
@@ -2272,10 +2272,9 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
/* Look it up and read it in.. */
folio = swap_cache_get_folio(swap, NULL, 0);
+ order = xa_get_order(&mapping->i_pages, index);
if (!folio) {
- int order = xa_get_order(&mapping->i_pages, index);
bool fallback_order0 = false;
- int split_order;
/* Or update major stats only when swapin succeeds?? */
if (fault_type) {
@@ -2339,6 +2338,29 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
error = -ENOMEM;
goto failed;
}
+ } else if (order != folio_order(folio)) {
+ /*
+ * Swap readahead may swap in order 0 folios into swapcache
+ * asynchronously, while the shmem mapping can still stores
+ * large swap entries. In such cases, we should split the
+ * large swap entry to prevent possible data corruption.
+ */
+ split_order = shmem_split_large_entry(inode, index, swap, gfp);
+ if (split_order < 0) {
+ error = split_order;
+ goto failed;
+ }
+
+ /*
+ * If the large swap entry has already been split, it is
+ * necessary to recalculate the new swap entry based on
+ * the old order alignment.
+ */
+ if (split_order > 0) {
+ pgoff_t offset = index - round_down(index, 1 << split_order);
+
+ swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+ }
}
alloced:
@@ -2346,7 +2368,8 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
folio_lock(folio);
if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
folio->swap.val != swap.val ||
- !shmem_confirm_swap(mapping, index, swap)) {
+ !shmem_confirm_swap(mapping, index, swap) ||
+ xa_get_order(&mapping->i_pages, index) != folio_order(folio)) {
error = -EEXIST;
goto unlock;
}
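The swap-offset recalculation added above is plain arithmetic: after a large
swap entry of order split_order has been split, the order-0 entry for a given
page index sits at the old swap offset plus the page's distance from the
start of the former large entry. A standalone C sketch of the same
calculation (illustrative values only, not kernel code):

#include <stdio.h>

static unsigned long round_down_ul(unsigned long x, unsigned long a)
{
	return x & ~(a - 1);            /* a must be a power of two */
}

int main(void)
{
	unsigned long index = 0x153;    /* faulting page index (example)           */
	int split_order = 4;            /* former large entry covered 1 << 4 pages */
	unsigned long swp_off = 0x900;  /* swap offset stored in the large entry   */

	/* index of the first page the former large entry covered */
	unsigned long base = round_down_ul(index, 1UL << split_order);

	/* same recalculation as shmem_swapin_folio() does after the split */
	unsigned long offset = index - base;
	unsigned long new_off = swp_off + offset;

	printf("base=0x%lx offset=%lu new swap offset=0x%lx\n",
	       base, offset, new_off);
	return 0;
}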
The patch below does not apply to the 6.13-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.13.y
git checkout FETCH_HEAD
git cherry-pick -x 058313515d5aab10d0a01dd634f92ed4a4e71d4c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030953-alkalize-eardrum-de40@gregkh' --subject-prefix 'PATCH 6.13.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 058313515d5aab10d0a01dd634f92ed4a4e71d4c Mon Sep 17 00:00:00 2001
From: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Date: Tue, 25 Feb 2025 17:52:55 +0800
Subject: [PATCH] mm: shmem: fix potential data corruption during shmem swapin
Alex and Kairui reported some issues (system hang or data corruption) when
swapping out or swapping in large shmem folios. This is especially easy
to reproduce when the tmpfs is mounted with the 'huge=within_size'
parameter. Thanks to Kairui's reproducer, the issue can be easily
replicated.
The root cause of the problem is that swap readahead may asynchronously
swap in order 0 folios into the swap cache, while the shmem mapping can
still store large swap entries. Then an order 0 folio is inserted into
the shmem mapping without splitting the large swap entry, which overwrites
the original large swap entry, leading to data corruption.
When getting a folio from the swap cache, we should split the large swap
entry stored in the shmem mapping if the orders do not match, to fix this
issue.
Link: https://lkml.kernel.org/r/2fe47c557e74e9df5fe2437ccdc6c9115fa1bf70.17404769…
Fixes: 809bc86517cc ("mm: shmem: support large folio swap out")
Signed-off-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Reported-by: Alex Xu (Hello71) <alex_y_xu(a)yahoo.ca>
Reported-by: Kairui Song <ryncsn(a)gmail.com>
Closes: https://lore.kernel.org/all/1738717785.im3r5g2vxc.none@localhost/
Tested-by: Kairui Song <kasong(a)tencent.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Lance Yang <ioworker0(a)gmail.com>
Cc: Matthew Wilcow <willy(a)infradead.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/shmem.c b/mm/shmem.c
index 4ea6109a8043..cebbac97a221 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2253,7 +2253,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
struct folio *folio = NULL;
bool skip_swapcache = false;
swp_entry_t swap;
- int error, nr_pages;
+ int error, nr_pages, order, split_order;
VM_BUG_ON(!*foliop || !xa_is_value(*foliop));
swap = radix_to_swp_entry(*foliop);
@@ -2272,10 +2272,9 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
/* Look it up and read it in.. */
folio = swap_cache_get_folio(swap, NULL, 0);
+ order = xa_get_order(&mapping->i_pages, index);
if (!folio) {
- int order = xa_get_order(&mapping->i_pages, index);
bool fallback_order0 = false;
- int split_order;
/* Or update major stats only when swapin succeeds?? */
if (fault_type) {
@@ -2339,6 +2338,29 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
error = -ENOMEM;
goto failed;
}
+ } else if (order != folio_order(folio)) {
+ /*
+ * Swap readahead may swap in order 0 folios into swapcache
+ * asynchronously, while the shmem mapping can still stores
+ * large swap entries. In such cases, we should split the
+ * large swap entry to prevent possible data corruption.
+ */
+ split_order = shmem_split_large_entry(inode, index, swap, gfp);
+ if (split_order < 0) {
+ error = split_order;
+ goto failed;
+ }
+
+ /*
+ * If the large swap entry has already been split, it is
+ * necessary to recalculate the new swap entry based on
+ * the old order alignment.
+ */
+ if (split_order > 0) {
+ pgoff_t offset = index - round_down(index, 1 << split_order);
+
+ swap = swp_entry(swp_type(swap), swp_offset(swap) + offset);
+ }
}
alloced:
@@ -2346,7 +2368,8 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
folio_lock(folio);
if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
folio->swap.val != swap.val ||
- !shmem_confirm_swap(mapping, index, swap)) {
+ !shmem_confirm_swap(mapping, index, swap) ||
+ xa_get_order(&mapping->i_pages, index) != folio_order(folio)) {
error = -EEXIST;
goto unlock;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x af288a426c3e3552b62595c6138ec6371a17dbba
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030937-relax-dubbed-d185@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From af288a426c3e3552b62595c6138ec6371a17dbba Mon Sep 17 00:00:00 2001
From: Ma Wupeng <mawupeng1(a)huawei.com>
Date: Mon, 17 Feb 2025 09:43:29 +0800
Subject: [PATCH] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned
folio
Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to
be offlined") added page poison checks in do_migrate_range in order to make
offlining hwpoisoned pages possible by introducing isolate_lru_page and
try_to_unmap for hwpoisoned pages. However, the folio lock must be held
before calling try_to_unmap. Add the missing lock to fix this problem.
A warning will be produced if the folio is not locked during unmap:
------------[ cut here ]------------
kernel BUG at ./include/linux/swapops.h:400!
Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
Modules linked in:
CPU: 4 UID: 0 PID: 411 Comm: bash Tainted: G W 6.13.0-rc1-00016-g3c434c7ee82a-dirty #41
Tainted: [W]=WARN
Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : try_to_unmap_one+0xb08/0xd3c
lr : try_to_unmap_one+0x3dc/0xd3c
Call trace:
try_to_unmap_one+0xb08/0xd3c (P)
try_to_unmap_one+0x3dc/0xd3c (L)
rmap_walk_anon+0xdc/0x1f8
rmap_walk+0x3c/0x58
try_to_unmap+0x88/0x90
unmap_poisoned_folio+0x30/0xa8
do_migrate_range+0x4a0/0x568
offline_pages+0x5a4/0x670
memory_block_action+0x17c/0x374
memory_subsys_offline+0x3c/0x78
device_offline+0xa4/0xd0
state_store+0x8c/0xf0
dev_attr_store+0x18/0x2c
sysfs_kf_write+0x44/0x54
kernfs_fop_write_iter+0x118/0x1a8
vfs_write+0x3a8/0x4bc
ksys_write+0x6c/0xf8
__arm64_sys_write+0x1c/0x28
invoke_syscall+0x44/0x100
el0_svc_common.constprop.0+0x40/0xe0
do_el0_svc+0x1c/0x28
el0_svc+0x30/0xd0
el0t_64_sync_handler+0xc8/0xcc
el0t_64_sync+0x198/0x19c
Code: f9407be0 b5fff320 d4210000 17ffff97 (d4210000)
---[ end trace 0000000000000000 ]---
Link: https://lkml.kernel.org/r/20250217014329.3610326-4-mawupeng1@huawei.com
Fixes: b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined")
Signed-off-by: Ma Wupeng <mawupeng1(a)huawei.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a6abd8d4a09c..16cf9e17077e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1832,8 +1832,11 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
(folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
if (WARN_ON(folio_test_lru(folio)))
folio_isolate_lru(folio);
- if (folio_mapped(folio))
+ if (folio_mapped(folio)) {
+ folio_lock(folio);
unmap_poisoned_folio(folio, pfn, false);
+ folio_unlock(folio);
+ }
goto put_folio;
}
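The hunk above enforces the usual rule that try_to_unmap()-style code must
run with the folio lock held; the BUG in swapops.h shown in the trace fires
exactly because unmap was reached on an unlocked folio. A minimal userspace
sketch of the lock-then-unmap ordering (not kernel code; the assert stands in
for the kernel BUG_ON and all names are illustrative):

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct folio {
	bool locked;
	bool mapped;
};

static void folio_lock(struct folio *f)   { f->locked = true; }
static void folio_unlock(struct folio *f) { f->locked = false; }

/* try_to_unmap()-style helper: callers must hold the folio lock */
static void unmap_poisoned(struct folio *f)
{
	assert(f->locked);      /* stands in for the BUG hit in swapops.h */
	f->mapped = false;
}

int main(void)
{
	struct folio f = { .locked = false, .mapped = true };

	/* the fixed do_migrate_range() sequence: lock, unmap, unlock */
	folio_lock(&f);
	unmap_poisoned(&f);
	folio_unlock(&f);

	printf("mapped = %d\n", f.mapped);
	return 0;
}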
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x af288a426c3e3552b62595c6138ec6371a17dbba
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030936-oink-rocklike-abc3@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From af288a426c3e3552b62595c6138ec6371a17dbba Mon Sep 17 00:00:00 2001
From: Ma Wupeng <mawupeng1(a)huawei.com>
Date: Mon, 17 Feb 2025 09:43:29 +0800
Subject: [PATCH] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned
folio
Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to
be offlined") added page poison checks in do_migrate_range in order to make
offlining hwpoisoned pages possible by introducing isolate_lru_page and
try_to_unmap for hwpoisoned pages. However, the folio lock must be held
before calling try_to_unmap. Add the missing lock to fix this problem.
A warning will be produced if the folio is not locked during unmap:
------------[ cut here ]------------
kernel BUG at ./include/linux/swapops.h:400!
Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
Modules linked in:
CPU: 4 UID: 0 PID: 411 Comm: bash Tainted: G W 6.13.0-rc1-00016-g3c434c7ee82a-dirty #41
Tainted: [W]=WARN
Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : try_to_unmap_one+0xb08/0xd3c
lr : try_to_unmap_one+0x3dc/0xd3c
Call trace:
try_to_unmap_one+0xb08/0xd3c (P)
try_to_unmap_one+0x3dc/0xd3c (L)
rmap_walk_anon+0xdc/0x1f8
rmap_walk+0x3c/0x58
try_to_unmap+0x88/0x90
unmap_poisoned_folio+0x30/0xa8
do_migrate_range+0x4a0/0x568
offline_pages+0x5a4/0x670
memory_block_action+0x17c/0x374
memory_subsys_offline+0x3c/0x78
device_offline+0xa4/0xd0
state_store+0x8c/0xf0
dev_attr_store+0x18/0x2c
sysfs_kf_write+0x44/0x54
kernfs_fop_write_iter+0x118/0x1a8
vfs_write+0x3a8/0x4bc
ksys_write+0x6c/0xf8
__arm64_sys_write+0x1c/0x28
invoke_syscall+0x44/0x100
el0_svc_common.constprop.0+0x40/0xe0
do_el0_svc+0x1c/0x28
el0_svc+0x30/0xd0
el0t_64_sync_handler+0xc8/0xcc
el0t_64_sync+0x198/0x19c
Code: f9407be0 b5fff320 d4210000 17ffff97 (d4210000)
---[ end trace 0000000000000000 ]---
Link: https://lkml.kernel.org/r/20250217014329.3610326-4-mawupeng1@huawei.com
Fixes: b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined")
Signed-off-by: Ma Wupeng <mawupeng1(a)huawei.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a6abd8d4a09c..16cf9e17077e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1832,8 +1832,11 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
(folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
if (WARN_ON(folio_test_lru(folio)))
folio_isolate_lru(folio);
- if (folio_mapped(folio))
+ if (folio_mapped(folio)) {
+ folio_lock(folio);
unmap_poisoned_folio(folio, pfn, false);
+ folio_unlock(folio);
+ }
goto put_folio;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x af288a426c3e3552b62595c6138ec6371a17dbba
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030935-pasted-diner-95df@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From af288a426c3e3552b62595c6138ec6371a17dbba Mon Sep 17 00:00:00 2001
From: Ma Wupeng <mawupeng1(a)huawei.com>
Date: Mon, 17 Feb 2025 09:43:29 +0800
Subject: [PATCH] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned
folio
Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to
be offlined") added page poison checks in do_migrate_range in order to make
offlining hwpoisoned pages possible by introducing isolate_lru_page and
try_to_unmap for hwpoisoned pages. However, the folio lock must be held
before calling try_to_unmap. Add the missing lock to fix this problem.
A warning will be produced if the folio is not locked during unmap:
------------[ cut here ]------------
kernel BUG at ./include/linux/swapops.h:400!
Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
Modules linked in:
CPU: 4 UID: 0 PID: 411 Comm: bash Tainted: G W 6.13.0-rc1-00016-g3c434c7ee82a-dirty #41
Tainted: [W]=WARN
Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : try_to_unmap_one+0xb08/0xd3c
lr : try_to_unmap_one+0x3dc/0xd3c
Call trace:
try_to_unmap_one+0xb08/0xd3c (P)
try_to_unmap_one+0x3dc/0xd3c (L)
rmap_walk_anon+0xdc/0x1f8
rmap_walk+0x3c/0x58
try_to_unmap+0x88/0x90
unmap_poisoned_folio+0x30/0xa8
do_migrate_range+0x4a0/0x568
offline_pages+0x5a4/0x670
memory_block_action+0x17c/0x374
memory_subsys_offline+0x3c/0x78
device_offline+0xa4/0xd0
state_store+0x8c/0xf0
dev_attr_store+0x18/0x2c
sysfs_kf_write+0x44/0x54
kernfs_fop_write_iter+0x118/0x1a8
vfs_write+0x3a8/0x4bc
ksys_write+0x6c/0xf8
__arm64_sys_write+0x1c/0x28
invoke_syscall+0x44/0x100
el0_svc_common.constprop.0+0x40/0xe0
do_el0_svc+0x1c/0x28
el0_svc+0x30/0xd0
el0t_64_sync_handler+0xc8/0xcc
el0t_64_sync+0x198/0x19c
Code: f9407be0 b5fff320 d4210000 17ffff97 (d4210000)
---[ end trace 0000000000000000 ]---
Link: https://lkml.kernel.org/r/20250217014329.3610326-4-mawupeng1@huawei.com
Fixes: b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined")
Signed-off-by: Ma Wupeng <mawupeng1(a)huawei.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a6abd8d4a09c..16cf9e17077e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1832,8 +1832,11 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
(folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
if (WARN_ON(folio_test_lru(folio)))
folio_isolate_lru(folio);
- if (folio_mapped(folio))
+ if (folio_mapped(folio)) {
+ folio_lock(folio);
unmap_poisoned_folio(folio, pfn, false);
+ folio_unlock(folio);
+ }
goto put_folio;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x af288a426c3e3552b62595c6138ec6371a17dbba
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025030934-clock-preview-4a7a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From af288a426c3e3552b62595c6138ec6371a17dbba Mon Sep 17 00:00:00 2001
From: Ma Wupeng <mawupeng1(a)huawei.com>
Date: Mon, 17 Feb 2025 09:43:29 +0800
Subject: [PATCH] hwpoison, memory_hotplug: lock folio before unmap hwpoisoned
folio
Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to
be offlined") added page poison checks in do_migrate_range in order to make
offlining hwpoisoned pages possible by introducing isolate_lru_page and
try_to_unmap for hwpoisoned pages. However, the folio lock must be held
before calling try_to_unmap. Add the missing lock to fix this problem.
A warning will be produced if the folio is not locked during unmap:
------------[ cut here ]------------
kernel BUG at ./include/linux/swapops.h:400!
Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
Modules linked in:
CPU: 4 UID: 0 PID: 411 Comm: bash Tainted: G W 6.13.0-rc1-00016-g3c434c7ee82a-dirty #41
Tainted: [W]=WARN
Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : try_to_unmap_one+0xb08/0xd3c
lr : try_to_unmap_one+0x3dc/0xd3c
Call trace:
try_to_unmap_one+0xb08/0xd3c (P)
try_to_unmap_one+0x3dc/0xd3c (L)
rmap_walk_anon+0xdc/0x1f8
rmap_walk+0x3c/0x58
try_to_unmap+0x88/0x90
unmap_poisoned_folio+0x30/0xa8
do_migrate_range+0x4a0/0x568
offline_pages+0x5a4/0x670
memory_block_action+0x17c/0x374
memory_subsys_offline+0x3c/0x78
device_offline+0xa4/0xd0
state_store+0x8c/0xf0
dev_attr_store+0x18/0x2c
sysfs_kf_write+0x44/0x54
kernfs_fop_write_iter+0x118/0x1a8
vfs_write+0x3a8/0x4bc
ksys_write+0x6c/0xf8
__arm64_sys_write+0x1c/0x28
invoke_syscall+0x44/0x100
el0_svc_common.constprop.0+0x40/0xe0
do_el0_svc+0x1c/0x28
el0_svc+0x30/0xd0
el0t_64_sync_handler+0xc8/0xcc
el0t_64_sync+0x198/0x19c
Code: f9407be0 b5fff320 d4210000 17ffff97 (d4210000)
---[ end trace 0000000000000000 ]---
Link: https://lkml.kernel.org/r/20250217014329.3610326-4-mawupeng1@huawei.com
Fixes: b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined")
Signed-off-by: Ma Wupeng <mawupeng1(a)huawei.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a6abd8d4a09c..16cf9e17077e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1832,8 +1832,11 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
(folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
if (WARN_ON(folio_test_lru(folio)))
folio_isolate_lru(folio);
- if (folio_mapped(folio))
+ if (folio_mapped(folio)) {
+ folio_lock(folio);
unmap_poisoned_folio(folio, pfn, false);
+ folio_unlock(folio);
+ }
goto put_folio;
}