Linux-stable-mirror January 2025

linux-stable-mirror@lists.linaro.org

[PATCH v4] usb: xhci: quirk for data loss in ISOC transfers

by Raju Rangoju

During the High-Speed Isochronous Audio transfers, xHCI controller on certain AMD platforms experiences momentary data loss. This results in Missed Service Errors (MSE) being generated by the xHCI.

The root cause of the MSE is attributed to the ISOC OUT endpoint being omitted from scheduling. This can happen either when an IN endpoint with a 64ms service interval is pre-scheduled prior to the ISOC OUT endpoint or when the interval of the ISOC OUT endpoint is shorter than that of the IN endpoint. Consequently, the OUT service is neglected when an IN endpoint with a service interval exceeding 32ms is scheduled concurrently (every 64ms in this scenario).

This issue is particularly seen on certain older AMD platforms. To mitigate this problem, it is recommended to adjust the service interval of the IN endpoint to not exceed 32ms (interval 8). This adjustment ensures that the OUT endpoint will not be bypassed, even if a smaller interval value is utilized.

Cc: stable(a)vger.kernel.org
Signed-off-by: Raju Rangoju <Raju.Rangoju(a)amd.com>
---
Changes since v3:
 - Bump up the enum number XHCI_LIMIT_ENDPOINT_INTERVAL_9

Changes since v2:
 - added stable tag to backport to all stable kernels

Changes since v1:
 - replaced hex values with pci device names
 - corrected the commit message

 drivers/usb/host/xhci-mem.c |  5 +++++
 drivers/usb/host/xhci-pci.c | 25 +++++++++++++++++++++++++
 drivers/usb/host/xhci.h     |  1 +
 3 files changed, 31 insertions(+)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 92703efda1f7..d3182ba98788 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -1420,6 +1420,11 @@ int xhci_endpoint_init(struct xhci_hcd *xhci,
 	/* Periodic endpoint bInterval limit quirk */
 	if (usb_endpoint_xfer_int(&ep->desc) ||
 	    usb_endpoint_xfer_isoc(&ep->desc)) {
+		if ((xhci->quirks & XHCI_LIMIT_ENDPOINT_INTERVAL_9) &&
+		    usb_endpoint_xfer_int(&ep->desc) &&
+		    interval >= 9) {
+			interval = 8;
+		}
 		if ((xhci->quirks & XHCI_LIMIT_ENDPOINT_INTERVAL_7) &&
 		    udev->speed >= USB_SPEED_HIGH &&
 		    interval >= 7) {
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 2d1e205c14c6..d23884afdf3f 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -69,12 +69,22 @@
 #define PCI_DEVICE_ID_INTEL_TITAN_RIDGE_4C_XHCI		0x15ec
 #define PCI_DEVICE_ID_INTEL_TITAN_RIDGE_DD_XHCI		0x15f0
 
+#define PCI_DEVICE_ID_AMD_ARIEL_TYPEC_XHCI		0x13ed
+#define PCI_DEVICE_ID_AMD_ARIEL_TYPEA_XHCI		0x13ee
+#define PCI_DEVICE_ID_AMD_STARSHIP_XHCI			0x148c
+#define PCI_DEVICE_ID_AMD_FIREFLIGHT_15D4_XHCI		0x15d4
+#define PCI_DEVICE_ID_AMD_FIREFLIGHT_15D5_XHCI		0x15d5
+#define PCI_DEVICE_ID_AMD_RAVEN_15E0_XHCI		0x15e0
+#define PCI_DEVICE_ID_AMD_RAVEN_15E1_XHCI		0x15e1
+#define PCI_DEVICE_ID_AMD_RAVEN2_XHCI			0x15e5
 #define PCI_DEVICE_ID_AMD_RENOIR_XHCI			0x1639
 #define PCI_DEVICE_ID_AMD_PROMONTORYA_4			0x43b9
 #define PCI_DEVICE_ID_AMD_PROMONTORYA_3			0x43ba
 #define PCI_DEVICE_ID_AMD_PROMONTORYA_2			0x43bb
 #define PCI_DEVICE_ID_AMD_PROMONTORYA_1			0x43bc
 
+#define PCI_DEVICE_ID_ATI_NAVI10_7316_XHCI		0x7316
+
 #define PCI_DEVICE_ID_ASMEDIA_1042_XHCI			0x1042
 #define PCI_DEVICE_ID_ASMEDIA_1042A_XHCI		0x1142
 #define PCI_DEVICE_ID_ASMEDIA_1142_XHCI			0x1242
@@ -278,6 +288,21 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
 	if (pdev->vendor == PCI_VENDOR_ID_NEC)
 		xhci->quirks |= XHCI_NEC_HOST;
 
+	if (pdev->vendor == PCI_VENDOR_ID_AMD &&
+	    (pdev->device == PCI_DEVICE_ID_AMD_ARIEL_TYPEC_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_ARIEL_TYPEA_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_STARSHIP_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_FIREFLIGHT_15D4_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_FIREFLIGHT_15D5_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_RAVEN_15E0_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_RAVEN_15E1_XHCI ||
+	     pdev->device == PCI_DEVICE_ID_AMD_RAVEN2_XHCI))
+		xhci->quirks |= XHCI_LIMIT_ENDPOINT_INTERVAL_9;
+
+	if (pdev->vendor == PCI_VENDOR_ID_ATI &&
+	    pdev->device == PCI_DEVICE_ID_ATI_NAVI10_7316_XHCI)
+		xhci->quirks |= XHCI_LIMIT_ENDPOINT_INTERVAL_9;
+
 	if (pdev->vendor == PCI_VENDOR_ID_AMD && xhci->hci_version == 0x96)
 		xhci->quirks |= XHCI_AMD_0x96_HOST;
 
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 4914f0a10cff..36b77d3c0e7b 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1633,6 +1633,7 @@ struct xhci_hcd {
 #define XHCI_WRITE_64_HI_LO	BIT_ULL(47)
 #define XHCI_CDNS_SCTX_QUIRK	BIT_ULL(48)
 #define XHCI_ETRON_HOST	BIT_ULL(49)
+#define XHCI_LIMIT_ENDPOINT_INTERVAL_9 BIT_ULL(50)
 
 	unsigned int		num_active_eps;
 	unsigned int		limit_active_eps;
-- 
2.34.1
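
For readers following the arithmetic in the commit message, here is a minimal standalone sketch in ordinary userspace C (not driver code). It assumes the xHCI convention that an Interval value n stands for a service period of 2^n * 125 us, which is what makes interval 8 correspond to 32ms and interval 9 to 64ms. The helper names interval_to_us and limit_interval are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/*
 * Service period in microseconds for an xHCI Interval value.
 * Assumption: period = 2^interval * 125 us.
 */
static uint64_t interval_to_us(unsigned int interval)
{
	return (UINT64_C(1) << interval) * 125;
}

/* Same shape as the clamp in the patch: any value of 9 or more becomes 8. */
static unsigned int limit_interval(unsigned int interval)
{
	return interval >= 9 ? 8 : interval;
}
```

With these helpers, limit_interval(9) yields 8, and interval_to_us() of that result is 32000 us, i.e. the 32ms ceiling named in the commit message. Note that the sketch models only the clamp itself; the patch additionally restricts it to interrupt endpoints on the listed controllers.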

9 months, 2 weeks

[PATCH v2] sched/fair: Fix integer underflow

by Pierre Gondois

(struct sg_lb_stats).idle_cpus is of type 'unsigned int'.
(local->idle_cpus - busiest->idle_cpus) can underflow to UINT_MAX for instance, and max_t(long, 0, UINT_MAX) will output UINT_MAX.

Use lsub_positive() instead of max_t().

Fixes: 16b0a7a1a0af ("sched/fair: Ensure tasks spreading in LLC during LB")
cc: stable(a)vger.kernel.org
Signed-off-by: Pierre Gondois <pierre.gondois(a)arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot(a)linaro.org>
---
 kernel/sched/fair.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9057584ec06d..6d9124499f52 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10775,8 +10775,8 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
 		 * idle CPUs.
 		 */
 		env->migration_type = migrate_task;
-		env->imbalance = max_t(long, 0,
-				       (local->idle_cpus - busiest->idle_cpus));
+		env->imbalance = local->idle_cpus;
+		lsub_positive(&env->imbalance, busiest->idle_cpus);
 	}
 
 #ifdef CONFIG_NUMA
-- 
2.25.1
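
The wraparound described above can be reproduced outside the kernel. The following is a minimal userspace sketch, assuming a platform where long is wider than unsigned int (as on 64-bit Linux). The function names are hypothetical, and imbalance_fixed only imitates the saturating behaviour of lsub_positive(); it is not the kernel macro.

```c
#include <limits.h>

/*
 * Buggy form: the subtraction is done in unsigned int, so it wraps
 * around before the value is widened to long.
 */
static long imbalance_buggy(unsigned int local_idle, unsigned int busiest_idle)
{
	long diff = (long)(local_idle - busiest_idle);

	return diff > 0 ? diff : 0;
}

/*
 * Fixed form, in the spirit of lsub_positive(): subtract at most the
 * current value, so the result saturates at zero.
 */
static long imbalance_fixed(unsigned int local_idle, unsigned int busiest_idle)
{
	long imbalance = local_idle;
	long sub = busiest_idle;

	imbalance -= (sub < imbalance) ? sub : imbalance;
	return imbalance;
}
```

With local_idle = 1 and busiest_idle = 3, the buggy form returns a huge positive number on such a platform, while the fixed form returns 0.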

9 months, 2 weeks

[PATCH 6.1.y 6.6.y 0/3] mm/filemap: fix page cache corruption with large folios

by Kairui Song

From: Kairui Song <kasong(a)tencent.com>

This series fixes the page cache corruption issue reported by Christian Theune [1]. The issue was reported to affect kernels back to 5.19. The currently maintained affected branches are 6.1 and 6.6; the fix is already included in 6.10.

This series can be applied to both 6.1 and 6.6.

Patch 3/3 is the actual fix. It was initially submitted and merged as an optimization, but was found to fix the corruption by handling the race correctly.

Patches 1/3 and 2/3 are prerequisites for 3/3.

Patch 3/3 includes some unit test code, which makes the backport a bit larger, but it should be fine to keep since it is only test code.

Note that there still seem to be some unresolved problems in the thread at [1], but those should be a different issue. The commits being backported have been well tested and fix the corruption issue just fine.

Link: https://lore.kernel.org/linux-mm/A5A976CB-DB57-4513-A700-656580488AB6@flyin… [1]

Kairui Song (3):
  mm/filemap: return early if failed to allocate memory for split
  lib/xarray: introduce a new helper xas_get_order
  mm/filemap: optimize filemap folio adding

 include/linux/xarray.h |  6 +++
 lib/test_xarray.c      | 93 ++++++++++++++++++++++++++++++++++++++++++
 lib/xarray.c           | 49 ++++++++++++++--------
 mm/filemap.c           | 50 ++++++++++++++++++-----
 4 files changed, 169 insertions(+), 29 deletions(-)

-- 
2.46.1

9 months, 3 weeks

[PATCH 5.15] net: defer final 'struct net' free in netns dismantle

by Vasiliy Kovalev

From: Eric Dumazet <edumazet(a)google.com> commit 0f6ede9fbc747e2553612271bce108f7517e7a45 upstream. Ilya reported a slab-use-after-free in dst_destroy [1] Issue is in xfrm6_net_init() and xfrm4_net_init() : They copy xfrm[46]_dst_ops_template into net->xfrm.xfrm[46]_dst_ops. But net structure might be freed before all the dst callbacks are called. So when dst_destroy() calls later : if (dst->ops->destroy) dst->ops->destroy(dst); dst->ops points to the old net->xfrm.xfrm[46]_dst_ops, which has been freed. See a relevant issue fixed in : ac888d58869b ("net: do not delay dst_entries_add() in dst_release()") A fix is to queue the 'struct net' to be freed after one another cleanup_net() round (and existing rcu_barrier()) [1] BUG: KASAN: slab-use-after-free in dst_destroy (net/core/dst.c:112) Read of size 8 at addr ffff8882137ccab0 by task swapper/37/0 Dec 03 05:46:18 kernel: CPU: 37 UID: 0 PID: 0 Comm: swapper/37 Kdump: loaded Not tainted 6.12.0 #67 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.1-1.el9 04/01/2014 Call Trace: <IRQ> dump_stack_lvl (lib/dump_stack.c:124) print_address_description.constprop.0 (mm/kasan/report.c:378) ? dst_destroy (net/core/dst.c:112) print_report (mm/kasan/report.c:489) ? dst_destroy (net/core/dst.c:112) ? kasan_addr_to_slab (mm/kasan/common.c:37) kasan_report (mm/kasan/report.c:603) ? dst_destroy (net/core/dst.c:112) ? rcu_do_batch (kernel/rcu/tree.c:2567) dst_destroy (net/core/dst.c:112) rcu_do_batch (kernel/rcu/tree.c:2567) ? __pfx_rcu_do_batch (kernel/rcu/tree.c:2491) ? 
lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4339 kernel/locking/lockdep.c:4406) rcu_core (kernel/rcu/tree.c:2825) handle_softirqs (kernel/softirq.c:554) __irq_exit_rcu (kernel/softirq.c:589 kernel/softirq.c:428 kernel/softirq.c:637) irq_exit_rcu (kernel/softirq.c:651) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1049 arch/x86/kernel/apic/apic.c:1049) </IRQ> <TASK> asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702) RIP: 0010:default_idle (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:92 arch/x86/kernel/process.c:743) Code: 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 6e ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 90 0f 00 2d c7 c9 27 00 fb f4 <fa> c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90 RSP: 0018:ffff888100d2fe00 EFLAGS: 00000246 RAX: 00000000001870ed RBX: 1ffff110201a5fc2 RCX: ffffffffb61a3e46 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffb3d4d123 RBP: 0000000000000000 R08: 0000000000000001 R09: ffffed11c7e1835d R10: ffff888e3f0c1aeb R11: 0000000000000000 R12: 0000000000000000 R13: ffff888100d20000 R14: dffffc0000000000 R15: 0000000000000000 ? ct_kernel_exit.constprop.0 (kernel/context_tracking.c:148) ? cpuidle_idle_call (kernel/sched/idle.c:186) default_idle_call (./include/linux/cpuidle.h:143 kernel/sched/idle.c:118) cpuidle_idle_call (kernel/sched/idle.c:186) ? __pfx_cpuidle_idle_call (kernel/sched/idle.c:168) ? lock_release (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5848) ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4347 kernel/locking/lockdep.c:4406) ? tsc_verify_tsc_adjust (arch/x86/kernel/tsc_sync.c:59) do_idle (kernel/sched/idle.c:326) cpu_startup_entry (kernel/sched/idle.c:423 (discriminator 1)) start_secondary (arch/x86/kernel/smpboot.c:202 arch/x86/kernel/smpboot.c:282) ? __pfx_start_secondary (arch/x86/kernel/smpboot.c:232) ? 
soft_restart_cpu (arch/x86/kernel/head_64.S:452) common_startup_64 (arch/x86/kernel/head_64.S:414) </TASK> Dec 03 05:46:18 kernel: Allocated by task 12184: kasan_save_stack (mm/kasan/common.c:48) kasan_save_track (./arch/x86/include/asm/current.h:49 mm/kasan/common.c:60 mm/kasan/common.c:69) __kasan_slab_alloc (mm/kasan/common.c:319 mm/kasan/common.c:345) kmem_cache_alloc_noprof (mm/slub.c:4085 mm/slub.c:4134 mm/slub.c:4141) copy_net_ns (net/core/net_namespace.c:421 net/core/net_namespace.c:480) create_new_namespaces (kernel/nsproxy.c:110) unshare_nsproxy_namespaces (kernel/nsproxy.c:228 (discriminator 4)) ksys_unshare (kernel/fork.c:3313) __x64_sys_unshare (kernel/fork.c:3382) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Dec 03 05:46:18 kernel: Freed by task 11: kasan_save_stack (mm/kasan/common.c:48) kasan_save_track (./arch/x86/include/asm/current.h:49 mm/kasan/common.c:60 mm/kasan/common.c:69) kasan_save_free_info (mm/kasan/generic.c:582) __kasan_slab_free (mm/kasan/common.c:271) kmem_cache_free (mm/slub.c:4579 mm/slub.c:4681) cleanup_net (net/core/net_namespace.c:456 net/core/net_namespace.c:446 net/core/net_namespace.c:647) process_one_work (kernel/workqueue.c:3229) worker_thread (kernel/workqueue.c:3304 kernel/workqueue.c:3391) kthread (kernel/kthread.c:389) ret_from_fork (arch/x86/kernel/process.c:147) ret_from_fork_asm (arch/x86/entry/entry_64.S:257) Dec 03 05:46:18 kernel: Last potentially related work creation: kasan_save_stack (mm/kasan/common.c:48) __kasan_record_aux_stack (mm/kasan/generic.c:541) insert_work (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 kernel/workqueue.c:788 kernel/workqueue.c:795 kernel/workqueue.c:2186) __queue_work (kernel/workqueue.c:2340) queue_work_on (kernel/workqueue.c:2391) xfrm_policy_insert (net/xfrm/xfrm_policy.c:1610) xfrm_add_policy (net/xfrm/xfrm_user.c:2116) xfrm_user_rcv_msg 
(net/xfrm/xfrm_user.c:3321) netlink_rcv_skb (net/netlink/af_netlink.c:2536) xfrm_netlink_rcv (net/xfrm/xfrm_user.c:3344) netlink_unicast (net/netlink/af_netlink.c:1316 net/netlink/af_netlink.c:1342) netlink_sendmsg (net/netlink/af_netlink.c:1886) sock_write_iter (net/socket.c:729 net/socket.c:744 net/socket.c:1165) vfs_write (fs/read_write.c:590 fs/read_write.c:683) ksys_write (fs/read_write.c:736) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Dec 03 05:46:18 kernel: Second to last potentially related work creation: kasan_save_stack (mm/kasan/common.c:48) __kasan_record_aux_stack (mm/kasan/generic.c:541) insert_work (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 kernel/workqueue.c:788 kernel/workqueue.c:795 kernel/workqueue.c:2186) __queue_work (kernel/workqueue.c:2340) queue_work_on (kernel/workqueue.c:2391) __xfrm_state_insert (./include/linux/workqueue.h:723 net/xfrm/xfrm_state.c:1150 net/xfrm/xfrm_state.c:1145 net/xfrm/xfrm_state.c:1513) xfrm_state_update (./include/linux/spinlock.h:396 net/xfrm/xfrm_state.c:1940) xfrm_add_sa (net/xfrm/xfrm_user.c:912) xfrm_user_rcv_msg (net/xfrm/xfrm_user.c:3321) netlink_rcv_skb (net/netlink/af_netlink.c:2536) xfrm_netlink_rcv (net/xfrm/xfrm_user.c:3344) netlink_unicast (net/netlink/af_netlink.c:1316 net/netlink/af_netlink.c:1342) netlink_sendmsg (net/netlink/af_netlink.c:1886) sock_write_iter (net/socket.c:729 net/socket.c:744 net/socket.c:1165) vfs_write (fs/read_write.c:590 fs/read_write.c:683) ksys_write (fs/read_write.c:736) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Fixes: a8a572a6b5f2 ("xfrm: dst_entries_init() per-net dst_ops") Reported-by: Ilya Maximets <i.maximets(a)ovn.org> Closes: https://lore.kernel.org/netdev/CANn89iKKYDVpB=MtmfH7nyv2p=rJWSLedO5k7wSZgtY… Signed-off-by: Eric Dumazet 
<edumazet(a)google.com>
Acked-by: Paolo Abeni <pabeni(a)redhat.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu(a)amazon.com>
Link: https://patch.msgid.link/20241204125455.3871859-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Vasiliy Kovalev <kovalev(a)altlinux.org>
---
Backport to fix CVE-2024-56658
Link: https://www.cve.org/CVERecord/?id=CVE-2024-56658
---
 include/net/net_namespace.h |  1 +
 net/core/net_namespace.c    | 21 ++++++++++++++++++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index c47baa623ba586..a5d6e04c8e8b55 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -80,6 +80,7 @@ struct net {
 						 * or to unregister pernet ops
 						 * (pernet_ops_rwsem write locked).
 						 */
+	struct llist_node	defer_free_list;
 	struct llist_node	cleanup_list;	/* namespaces on death row */
 
 #ifdef CONFIG_KEYS
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 3addbce20f8ed0..0217dd2635cdb4 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -430,11 +430,28 @@ static struct net *net_alloc(void)
 	goto out;
 }
 
+static LLIST_HEAD(defer_free_list);
+
+static void net_complete_free(void)
+{
+	struct llist_node *kill_list;
+	struct net *net, *next;
+
+	/* Get the list of namespaces to free from last round. */
+	kill_list = llist_del_all(&defer_free_list);
+
+	llist_for_each_entry_safe(net, next, kill_list, defer_free_list)
+		kmem_cache_free(net_cachep, net);
+
+}
+
 static void net_free(struct net *net)
 {
 	if (refcount_dec_and_test(&net->passive)) {
 		kfree(rcu_access_pointer(net->gen));
-		kmem_cache_free(net_cachep, net);
+
+		/* Wait for an extra rcu_barrier() before final free. */
+		llist_add(&net->defer_free_list, &defer_free_list);
 	}
 }
 
@@ -609,6 +626,8 @@ static void cleanup_net(struct work_struct *work)
 	 */
 	rcu_barrier();
 
+	net_complete_free();
+
 	/* Finally it is safe to free my network namespace structure */
 	list_for_each_entry_safe(net, tmp, &net_exit_list, exit_list) {
 		list_del_init(&net->exit_list);
-- 
2.33.8
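
The "free one round later" idea in this patch can be illustrated in isolation. Below is a minimal single-threaded userspace sketch. It assumes a plain pointer-linked list in place of the kernel's lock-free llist, malloc/free in place of the slab cache, and it has no RCU at all. The names struct ns, ns_free and ns_complete_free are hypothetical stand-ins for struct net, net_free() and net_complete_free().

```c
#include <stdlib.h>

struct ns {
	struct ns *defer_next;	/* link used only while awaiting the final free */
};

static struct ns *defer_free_list;	/* objects released since the last round */
static int freed_count;			/* bookkeeping for illustration only */

/* Stand-in for net_free(): do not free yet, just queue the object. */
static void ns_free(struct ns *ns)
{
	ns->defer_next = defer_free_list;
	defer_free_list = ns;
}

/* Stand-in for net_complete_free(): detach the whole list, then free it. */
static void ns_complete_free(void)
{
	struct ns *kill_list = defer_free_list;

	defer_free_list = NULL;
	while (kill_list) {
		struct ns *next = kill_list->defer_next;

		free(kill_list);
		freed_count++;
		kill_list = next;
	}
}
```

The point of the split is that ns_free() never releases memory itself; an object stays valid until the next call to ns_complete_free(), which in the patch happens after the rcu_barrier() of a later cleanup_net() round.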

9 months, 3 weeks

[PATCH v2] Revert "mmc: sdhci_am654: Add sdhci_am654_start_signal_voltage_switch"

by Josua Mayer

This reverts commit 941a7abd4666912b84ab209396fdb54b0dae685d.

This commit uses presence of device-tree properties vmmc-supply and vqmmc-supply for deciding whether to enable a quirk affecting timing of clock and data. The intention was to address issues observed with eMMC and SD on AM62 platforms.

This new quirk is however also enabled for AM64, breaking microSD access on the SolidRun HummingBoard-T which is supported in-tree since v6.11, causing a regression. During boot microSD initialization now fails with the error below:

[    2.008520] mmc1: SDHCI controller on fa00000.mmc [fa00000.mmc] using ADMA 64-bit
[    2.115348] mmc1: error -110 whilst initialising SD card

The heuristics for enabling the quirk are clearly not correct as they break at least one but potentially many existing boards.

Revert the change and restore original behaviour until a more appropriate method of selecting the quirk is derived.

Fixes: 941a7abd4666 ("mmc: sdhci_am654: Add sdhci_am654_start_signal_voltage_switch")
Closes: https://lore.kernel.org/linux-mmc/a70fc9fc-186f-4165-a652-3de50733763a@soli…
Cc: stable(a)vger.kernel.org
Signed-off-by: Josua Mayer <josua(a)solid-run.com>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
---
Changes in v2:
- Fixed "Fixes:" tag invalid commit description copied from history
  (Reported-by: Adrian Hunter <adrian.hunter(a)intel.com>)
  (Reported-by: Greg KH <gregkh(a)linuxfoundation.org>)
- Link to v1: https://lore.kernel.org/r/20250127-am654-mmc-regression-v1-1-d831f9a13ae9@s…
---
 drivers/mmc/host/sdhci_am654.c | 30 ------------------------------
 1 file changed, 30 deletions(-)

diff --git a/drivers/mmc/host/sdhci_am654.c b/drivers/mmc/host/sdhci_am654.c
index b73f673db92bbc042392995e715815e15ace6005..f75c31815ab00d17b5757063521f56ba5643babe 100644
--- a/drivers/mmc/host/sdhci_am654.c
+++ b/drivers/mmc/host/sdhci_am654.c
@@ -155,7 +155,6 @@ struct sdhci_am654_data {
 	u32 tuning_loop;
 
 #define SDHCI_AM654_QUIRK_FORCE_CDTEST BIT(0)
-#define SDHCI_AM654_QUIRK_SUPPRESS_V1P8_ENA BIT(1)
 };
 
 struct window {
@@ -357,29 +356,6 @@ static void sdhci_j721e_4bit_set_clock(struct sdhci_host *host,
 	sdhci_set_clock(host, clock);
 }
 
-static int sdhci_am654_start_signal_voltage_switch(struct mmc_host *mmc, struct mmc_ios *ios)
-{
-	struct sdhci_host *host = mmc_priv(mmc);
-	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
-	struct sdhci_am654_data *sdhci_am654 = sdhci_pltfm_priv(pltfm_host);
-	int ret;
-
-	if ((sdhci_am654->quirks & SDHCI_AM654_QUIRK_SUPPRESS_V1P8_ENA) &&
-	    ios->signal_voltage == MMC_SIGNAL_VOLTAGE_180) {
-		if (!IS_ERR(mmc->supply.vqmmc)) {
-			ret = mmc_regulator_set_vqmmc(mmc, ios);
-			if (ret < 0) {
-				pr_err("%s: Switching to 1.8V signalling voltage failed,\n",
-				       mmc_hostname(mmc));
-				return -EIO;
-			}
-		}
-		return 0;
-	}
-
-	return sdhci_start_signal_voltage_switch(mmc, ios);
-}
-
 static u8 sdhci_am654_write_power_on(struct sdhci_host *host, u8 val, int reg)
 {
 	writeb(val, host->ioaddr + reg);
@@ -868,11 +844,6 @@ static int sdhci_am654_get_of_property(struct platform_device *pdev,
 	if (device_property_read_bool(dev, "ti,fails-without-test-cd"))
 		sdhci_am654->quirks |= SDHCI_AM654_QUIRK_FORCE_CDTEST;
 
-	/* Suppress v1p8 ena for eMMC and SD with vqmmc supply */
-	if (!!of_parse_phandle(dev->of_node, "vmmc-supply", 0) ==
-	    !!of_parse_phandle(dev->of_node, "vqmmc-supply", 0))
-		sdhci_am654->quirks |= SDHCI_AM654_QUIRK_SUPPRESS_V1P8_ENA;
-
 	sdhci_get_of_property(pdev);
 
 	return 0;
@@ -969,7 +940,6 @@ static int sdhci_am654_probe(struct platform_device *pdev)
 		goto err_pltfm_free;
 	}
 
-	host->mmc_host_ops.start_signal_voltage_switch = sdhci_am654_start_signal_voltage_switch;
 	host->mmc_host_ops.execute_tuning = sdhci_am654_execute_tuning;
 
 	pm_runtime_get_noresume(dev);

---
base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04
change-id: 20250127-am654-mmc-regression-ed289f8967c2

Best regards,
-- 
Josua Mayer <josua(a)solid-run.com>
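
One property of the removed check is easy to overlook when reading the diff: `!!a == !!b` compares presence only, so it is true both when the two supplies are present and when neither is. The sketch below is a standalone illustration of that truth table, with plain pointers standing in for the results of of_parse_phandle(). The function name quirk_enabled is hypothetical, and nothing here claims to identify which case affected the board in the report.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same shape as the removed condition: equal "presence" enables the quirk. */
static bool quirk_enabled(const void *vmmc_supply, const void *vqmmc_supply)
{
	return !!vmmc_supply == !!vqmmc_supply;
}
```

So quirk_enabled(NULL, NULL) is true just like the both-present case, while a node with only one of the two supplies leaves the quirk off.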

9 months, 3 weeks

[BUG REPORT] cifs: Deadlock due to network reconnection during file writing

by Wang Zhaolong

In the maintained LTS branches (linux-5.4 through linux-6.6), a deadlock occurs in the network reconnection scenario when multiple processes or threads write to the same file concurrently.

Take the code of linux-5.10 as an example. The simplified deadlock process is as follows:

```
Process 1                                Process 2
lock_page() - [1]
wait_on_page_writeback() - [2]
 Waiting for writeback, blocked by [4]
                                         lock_page() - [3]
                                          Blocked by [1]
                                         end_page_writeback() - [4]
                                          Won't execute
```

Based on my research, I'm going to use two detailed scenarios to illustrate the issue.

Scenario 1:

```
P1 (dd)                                  P2 (cifsd)                                  P3 (cifsiod)
cifs_writepages
wdata_prepare_pages
lock_page - [1]
wait_on_page_writeback - [2]
 Waiting for writeback, blocked by [4]
wait_on_page_bit
                                         cifs_demultiplex_thread
                                         cifs_read_from_socket
                                         cifs_readv_from_socket
                                          - If another process triggers reconnect at this point
                                         cifs_reconnect
                                          - mid->mid_state updated to MID_RETRY_NEEDED
                                         smb2_writev_callback
                                         mid_entry->callback()
                                          - mid_state leads to wdata->result = -EAGAIN
                                         wdata->result = -EAGAIN
                                         queue_work(cifsiod_wq, &wdata->work);
                                                                                     cifs_writev_complete - worker function
                                                                                      - wdata->result == -EAGAIN Condition satisfied
                                                                                     cifs_writev_requeue
                                                                                     lock_page - [3] Blocked by [1]
                                                                                     end_page_writeback - [4] Won't execute
                                                                                     unlock_page
```

Mainline refactoring commit d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list") unlocks the folio while waiting for the writeback to complete. That commit was introduced in v6.3-rc1. Therefore, scenario 1 only affects LTS versions from linux-5.4 to linux-6.1.
Call stack trace: ``` cat /proc/34/stack [<0>] __lock_page+0x147/0x3a0 [<0>] cifs_writev_requeue.cold+0x185/0x28e [<0>] process_one_work+0x1df/0x3b0 [<0>] worker_thread+0x4a/0x3c0 [<0>] kthread+0x125/0x160 [<0>] ret_from_fork+0x22/0x30 # cat /proc/465/stack [<0>] wait_on_page_bit+0x106/0x2e0 [<0>] wait_on_page_writeback+0x25/0xd0 [<0>] cifs_writepages+0x5ee/0xf60 [<0>] do_writepages+0x43/0xe0 [<0>] __filemap_fdatawrite_range+0xcd/0x110 [<0>] file_write_and_wait_range+0x40/0x90 [<0>] cifs_strict_fsync+0x35/0x470 [<0>] do_fsync+0x38/0x70 [<0>] __x64_sys_fsync+0x10/0x20 [<0>] do_syscall_64+0x33/0x40 [<0>] entry_SYSCALL_64_after_hwframe+0x67/0xd1 [ 369.826215] INFO: task kworker/1:1:34 blocked for more than 122 seconds. [ 369.828964] Not tainted 5.10.0+ #164 [ 369.830623] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 369.835104] task:kworker/1:1 state:D stack:13472 pid: 34 ppid: 2 flags:0x00004000 [ 369.838448] Workqueue: cifsiod cifs_writev_complete [ 369.840242] Call Trace: [ 369.841219] __schedule+0x401/0x8e0 [ 369.842568] schedule+0x49/0x130 [ 369.843785] io_schedule+0x12/0x40 [ 369.845079] __lock_page+0x147/0x3a0 [ 369.846444] ? add_to_page_cache_lru+0x180/0x180 [ 369.847963] cifs_writev_requeue.cold+0x185/0x28e [ 369.849193] process_one_work+0x1df/0x3b0 [ 369.850248] worker_thread+0x4a/0x3c0 [ 369.851216] ? process_one_work+0x3b0/0x3b0 [ 369.852308] kthread+0x125/0x160 [ 369.853167] ? kthread_park+0x90/0x90 [ 369.854142] ret_from_fork+0x22/0x30 [ 369.855054] INFO: task kworker/u8:3:96 blocked for more than 122 seconds. [ 369.856781] Not tainted 5.10.0+ #164 [ 369.857851] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 369.859419] task:kworker/u8:3 state:D stack:12744 pid: 96 ppid: 2 flags:0x00004000 [ 369.861041] Workqueue: writeback wb_workfn (flush-cifs-2) [ 369.862095] Call Trace: [ 369.862583] __schedule+0x401/0x8e0 [ 369.863280] schedule+0x49/0x130 [ 369.863912] io_schedule+0x12/0x40 [ 369.864604] __lock_page+0x147/0x3a0 [ 369.865322] ? add_to_page_cache_lru+0x180/0x180 [ 369.866246] cifs_writepages+0x620/0xf60 [ 369.867005] do_writepages+0x43/0xe0 [ 369.867737] ? __blk_mq_try_issue_directly+0x121/0x1c0 [ 369.868750] __writeback_single_inode+0x3d/0x320 [ 369.869589] writeback_sb_inodes+0x20d/0x480 [ 369.870367] __writeback_inodes_wb+0x4c/0xe0 [ 369.871148] wb_writeback+0x201/0x2f0 [ 369.871797] wb_workfn+0x38a/0x4e0 [ 369.872427] ? check_preempt_curr+0x47/0x70 [ 369.873191] ? ttwu_do_wakeup.isra.0+0x17/0x170 [ 369.873999] process_one_work+0x1df/0x3b0 [ 369.874741] worker_thread+0x4a/0x3c0 [ 369.875421] ? process_one_work+0x3b0/0x3b0 [ 369.876180] kthread+0x125/0x160 [ 369.876761] ? kthread_park+0x90/0x90 [ 369.877431] ret_from_fork+0x22/0x30 [ 369.878106] INFO: task a.out:465 blocked for more than 122 seconds. [ 369.879225] Not tainted 5.10.0+ #164 [ 369.879945] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 369.881316] task:a.out state:D stack:12752 pid: 465 ppid: 386 flags:0x00000002 [ 369.882791] Call Trace: [ 369.883263] __schedule+0x401/0x8e0 [ 369.883884] schedule+0x49/0x130 [ 369.884447] io_schedule+0x12/0x40 [ 369.885054] wait_on_page_bit+0x106/0x2e0 [ 369.885795] ? add_to_page_cache_lru+0x180/0x180 [ 369.886631] wait_on_page_writeback+0x25/0xd0 [ 369.887427] cifs_writepages+0x5ee/0xf60 [ 369.888151] do_writepages+0x43/0xe0 [ 369.888789] ? 
__generic_file_write_iter+0xfd/0x1d0 [ 369.889663] __filemap_fdatawrite_range+0xcd/0x110 [ 369.890523] file_write_and_wait_range+0x40/0x90 [ 369.891360] cifs_strict_fsync+0x35/0x470 [ 369.892094] do_fsync+0x38/0x70 [ 369.892657] __x64_sys_fsync+0x10/0x20 [ 369.893336] do_syscall_64+0x33/0x40 [ 369.893978] entry_SYSCALL_64_after_hwframe+0x67/0xd1 [ 369.894883] RIP: 0033:0x7f660e208950 [ 369.895538] RSP: 002b:00007fff52b27b78 EFLAGS: 00000202 ORIG_RAX: 000000000000004a [ 369.896882] RAX: ffffffffffffffda RBX: 00007fff52b28cb8 RCX: 00007f660e208950 [ 369.898139] RDX: 0000000000001000 RSI: 00007fff52b27b80 RDI: 0000000000000003 [ 369.899395] RBP: 00007fff52b28ba0 R08: 0000000000000410 R09: 0000000000000001 [ 369.900661] R10: 00007f660e11c400 R11: 0000000000000202 R12: 0000000000000000 [ 369.901925] R13: 00007fff52b28cc8 R14: 00007f660e328000 R15: 000055b5aeb6fdd8 [ 369.903202] INFO: task sync:468 blocked for more than 122 seconds. [ 369.904311] Not tainted 5.10.0+ #164 [ 369.905034] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 369.906457] task:sync state:D stack:13632 pid: 468 ppid: 386 flags:0x00004002 [ 369.907930] Call Trace: [ 369.908369] __schedule+0x401/0x8e0 [ 369.908984] schedule+0x49/0x130 [ 369.909582] io_schedule+0x12/0x40 [ 369.910208] wait_on_page_bit+0x106/0x2e0 [ 369.910918] ? add_to_page_cache_lru+0x180/0x180 [ 369.911758] wait_on_page_writeback+0x25/0xd0 [ 369.912560] __filemap_fdatawait_range+0x83/0x110 [ 369.913408] ? __add_pages+0x6f/0x1b0 [ 369.914089] filemap_fdatawait_keep_errors+0x1a/0x50 [ 369.914957] sync_inodes_sb+0x208/0x2a0 [ 369.915666] ? 
__x64_sys_tee+0xd0/0xd0 [ 369.916344] iterate_supers+0x90/0xe0 [ 369.916983] ksys_sync+0x40/0xb0 [ 369.917590] __do_sys_sync+0xa/0x20 [ 369.918240] do_syscall_64+0x33/0x40 [ 369.918884] entry_SYSCALL_64_after_hwframe+0x67/0xd1 [ 369.919800] RIP: 0033:0x7f746d820987 [ 369.920451] RSP: 002b:00007ffce853fd78 EFLAGS: 00000206 ORIG_RAX: 00000000000000a2 [ 369.921798] RAX: ffffffffffffffda RBX: 00007ffce853fed8 RCX: 00007f746d820987 [ 369.923063] RDX: 00007f746d8f4801 RSI: 00007ffce8541f71 RDI: 00007f746d8b05ad [ 369.924339] RBP: 0000000000000001 R08: 000000000000ffff R09: 0000000000000000 [ 369.925605] R10: 00007f746d7308a0 R11: 0000000000000206 R12: 000055b8487470fb [ 369.926866] R13: 0000000000000000 R14: 0000000000000000 R15: 000055b848749ce0 [ 369.928138] Kernel panic - not syncing: hung_task: blocked tasks [ 369.929191] CPU: 3 PID: 35 Comm: khungtaskd Not tainted 5.10.0+ #164 [ 369.952450] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 [ 369.956984] Call Trace: [ 369.957973] dump_stack+0x57/0x6e [ 369.959273] panic+0x115/0x2f1 [ 369.960476] watchdog.cold+0xb5/0xb5 [ 369.961884] ? hungtask_pm_notify+0x40/0x40 [ 369.963310] kthread+0x125/0x160 [ 369.964354] ? 
kthread_park+0x90/0x90
[ 369.965551] ret_from_fork+0x22/0x30
[ 369.967673] Kernel Offset: 0xd600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 369.971025] ---[ end Kernel panic - not syncing: hung_task: blocked tasks ]---
```

Scenario 2:

Scenario 2 occurs in strict cache mode.

```
P1 (dd)                                  P2 (cifsd)                                  P3 (cifsiod)
cifs_strict_writev
cifs_zap_mapping
 - If something breaks the oplock
cifs_revalidate_mapping
cifs_invalidate_mapping
invalidate_inode_pages2
invalidate_inode_pages2_range
lock_page - [1]
wait_on_page_writeback - [2]
 Waiting for writeback, blocked by [4]
wait_on_page_bit
                                         cifs_demultiplex_thread
                                         cifs_read_from_socket
                                         cifs_readv_from_socket
                                          - If another process triggers reconnect at this point
                                         cifs_reconnect
                                          - mid->mid_state updated to MID_RETRY_NEEDED
                                         smb2_writev_callback
                                         mid_entry->callback()
                                          - mid_state leads to wdata->result = -EAGAIN
                                         wdata->result = -EAGAIN
                                         queue_work(cifsiod_wq, &wdata->work);
                                                                                     cifs_writev_complete - worker function
                                                                                      - wdata->result == -EAGAIN Condition satisfied
                                                                                     cifs_writev_requeue
                                                                                     lock_page - [3] Blocked by [1]
                                                                                     end_page_writeback - [4] Won't execute
                                                                                     unlock_page
```

Mainline refactoring commit 3ee1a1fc3981 ("cifs: Cut over to using netfslib") directly terminates the file write instead of resending data when smb2_writev_callback() detects a write failure, thus avoiding this problem. That commit was introduced in v6.10-rc1. Therefore, scenario 2 affects LTS versions from linux-5.4 to linux-6.6.
```
cat /proc/522/stack
[<0>] wait_on_page_bit+0x106/0x150
[<0>] invalidate_inode_pages2_range+0x2cc/0x580
[<0>] cifs_invalidate_mapping+0x2c/0x50 [cifs]
[<0>] cifs_revalidate_mapping+0x4c/0x90 [cifs]
[<0>] cifs_strict_writev+0x17a/0x250 [cifs]
[<0>] __vfs_write+0x14f/0x1b0
[<0>] vfs_write+0xb6/0x1a0
[<0>] ksys_write+0x57/0xd0
[<0>] do_syscall_64+0x63/0x250
[<0>] entry_SYSCALL_64_after_hwframe+0x5c/0xc1
[<0>] 0xffffffffffffffff

cat /proc/33/stack
[<0>] __lock_page+0x10c/0x160
[<0>] cifs_writev_requeue.cold+0x17e/0x239 [cifs]
[<0>] process_one_work+0x1a9/0x3f0
[<0>] worker_thread+0x50/0x3c0
[<0>] kthread+0x117/0x130
[<0>] ret_from_fork+0x35/0x40
[<0>] 0xffffffffffffffff
```

The root cause of the deadlock problem is that the page/folio is locked again in cifs_writev_requeue(). In order to safely fix it on the LTS branches, I would like to clarify the following questions:

1. Whether resending is necessary. If retransmission is not required, simply terminating the write would avoid this problem. Is this an acceptable solution?

2. Is it necessary to lock the page/folio in cifs_writev_requeue()? Based on my reading of the code (I may have missed something), there seems to be no code path that modifies a page while it is marked as PG_writeback. Therefore, the page does not need to be locked during wait_on_page_writeback().
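
The circular wait in both scenarios can be written down as a tiny wait-for graph. The sketch below is a standalone userspace model, not CIFS code: P1 holds the page lock [1] and waits for writeback to end [2], which only P3 can do [4]; P3 waits for the page lock [3], which P1 holds. The enum values, the waits_for array and has_deadlock() are all hypothetical names introduced for this illustration.

```c
#include <stdbool.h>

enum { P1, P3, NTASKS };

/*
 * waits_for[t] is the task that must make progress before task t can
 * continue, or -1 if t is not blocked.
 */
static bool has_deadlock(const int waits_for[NTASKS])
{
	for (int start = 0; start < NTASKS; start++) {
		int cur = start;

		/* Follow the chain; coming back to the start means a cycle. */
		for (int hops = 0; hops <= NTASKS; hops++) {
			cur = waits_for[cur];
			if (cur < 0)
				break;
			if (cur == start)
				return true;
		}
	}
	return false;
}
```

With waits_for = { [P1] = P3, [P3] = P1 } the function reports a deadlock. If P3 no longer waits on P1 (for example because the requeue path did not take the page lock, as question 2 asks), the array becomes { [P1] = P3, [P3] = -1 } and the cycle disappears.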

9 months, 4 weeks

[PATCH v5 01/16] x86/stackprotector: Work around strict Clang TLS symbol requirements

by Brian Gerst

From: Ard Biesheuvel <ardb(a)kernel.org>

GCC and Clang both implement stack protector support based on Thread
Local Storage (TLS) variables, and this is used in the kernel to
implement per-task stack cookies, by copying a task's stack cookie into
a per-CPU variable every time it is scheduled in.

Both now also implement -mstack-protector-guard-symbol=, which permits
the TLS variable to be specified directly. This is useful because it
will allow us to move away from using a fixed offset of 40 bytes into
the per-CPU area on x86_64, which requires a lot of special handling in
the per-CPU code and the runtime relocation code.

However, while GCC is rather lax in its implementation of this command
line option, Clang actually requires that the provided symbol name
refers to a TLS variable (i.e., one declared with __thread), although it
also permits the variable to be undeclared entirely, in which case it
will use an implicit declaration of the right type.

The upshot of this is that Clang will emit the correct references to the
stack cookie variable in most cases, e.g.,

   10d: 64 a1 00 00 00 00    mov %fs:0x0,%eax
        10f: R_386_32 __stack_chk_guard

However, if a non-TLS definition of the symbol in question is visible in
the same compilation unit (which amounts to the whole of vmlinux if LTO
is enabled), it will drop the per-CPU prefix and emit a load from a
bogus address.

Work around this by using a symbol name that never occurs in C code, and
emit it as an alias in the linker script.

Fixes: 3fb0fdb3bbe7 ("x86/stackprotector/32: Make the canary into a regular percpu variable")
Cc: <stable(a)vger.kernel.org>
Cc: Fangrui Song <i(a)maskray.me>
Cc: Uros Bizjak <ubizjak(a)gmail.com>
Cc: Nathan Chancellor <nathan(a)kernel.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Link: https://github.com/ClangBuiltLinux/linux/issues/1854
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
Signed-off-by: Brian Gerst <brgerst(a)gmail.com>
---
 arch/x86/Makefile                     |  5 +++--
 arch/x86/entry/entry.S                | 16 ++++++++++++++++
 arch/x86/include/asm/asm-prototypes.h |  3 +++
 arch/x86/kernel/cpu/common.c          |  2 ++
 arch/x86/kernel/vmlinux.lds.S         |  3 +++
 5 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index cd75e78a06c1..5b773b34768d 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -142,9 +142,10 @@ ifeq ($(CONFIG_X86_32),y)

     ifeq ($(CONFIG_STACKPROTECTOR),y)
         ifeq ($(CONFIG_SMP),y)
-            KBUILD_CFLAGS += -mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard
+            KBUILD_CFLAGS += -mstack-protector-guard-reg=fs \
+                             -mstack-protector-guard-symbol=__ref_stack_chk_guard
         else
-            KBUILD_CFLAGS += -mstack-protector-guard=global
+            KBUILD_CFLAGS += -mstack-protector-guard=global
         endif
     endif
 else
diff --git a/arch/x86/entry/entry.S b/arch/x86/entry/entry.S
index 324686bca368..b7ea3e8e9ecc 100644
--- a/arch/x86/entry/entry.S
+++ b/arch/x86/entry/entry.S
@@ -51,3 +51,19 @@ EXPORT_SYMBOL_GPL(mds_verw_sel);
 .popsection

 THUNK warn_thunk_thunk, __warn_thunk
+
+#ifndef CONFIG_X86_64
+/*
+ * Clang's implementation of TLS stack cookies requires the variable in
+ * question to be a TLS variable. If the variable happens to be defined as an
+ * ordinary variable with external linkage in the same compilation unit (which
+ * amounts to the whole of vmlinux with LTO enabled), Clang will drop the
+ * segment register prefix from the references, resulting in broken code. Work
+ * around this by avoiding the symbol used in -mstack-protector-guard-symbol=
+ * entirely in the C code, and use an alias emitted by the linker script
+ * instead.
+ */
+#ifdef CONFIG_STACKPROTECTOR
+EXPORT_SYMBOL(__ref_stack_chk_guard);
+#endif
+#endif
diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
index 25466c4d2134..3674006e3974 100644
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -20,3 +20,6 @@
 extern void cmpxchg8b_emu(void);
 #endif

+#if defined(__GENKSYMS__) && defined(CONFIG_STACKPROTECTOR)
+extern unsigned long __ref_stack_chk_guard;
+#endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8f41ab219cf1..9d42bd15e06c 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2091,8 +2091,10 @@ void syscall_init(void)

 #ifdef CONFIG_STACKPROTECTOR
 DEFINE_PER_CPU(unsigned long, __stack_chk_guard);
+#ifndef CONFIG_SMP
 EXPORT_PER_CPU_SYMBOL(__stack_chk_guard);
 #endif
+#endif

 #endif /* CONFIG_X86_64 */

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 410546bacc0f..d61c3584f3e6 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -468,6 +468,9 @@ SECTIONS
 . = ASSERT((_end - LOAD_OFFSET <= KERNEL_IMAGE_SIZE),
            "kernel image bigger than KERNEL_IMAGE_SIZE");

+/* needed for Clang - see arch/x86/entry/entry.S */
+PROVIDE(__ref_stack_chk_guard = __stack_chk_guard);
+
 #ifdef CONFIG_X86_64
 /*
  * Per-cpu symbols which need to be offset from __per_cpu_load
--
2.47.0

10 months

Re: [PATCH 1/2] arm64: efi: Execute runtime services from a dedicated stack

by Lee Jones

On Mon, 05 Dec 2022, Ard Biesheuvel wrote:

> With the introduction of PRMT in the ACPI subsystem, the EFI rts
> workqueue is no longer the only caller of efi_call_virt_pointer() in the
> kernel. This means the EFI runtime services lock is no longer sufficient
> to manage concurrent calls into firmware, but also that firmware calls
> may occur that are not marshalled via the workqueue mechanism, but
> originate directly from the caller context.
>
> For added robustness, and to ensure that the runtime services have 8 KiB
> of stack space available as per the EFI spec, introduce a spinlock
> protected EFI runtime stack of 8 KiB, where the spinlock also ensures
> serialization between the EFI rts workqueue (which itself serializes EFI
> runtime calls) and other callers of efi_call_virt_pointer().
>
> While at it, use the stack pivot to avoid reloading the shadow call
> stack pointer from the ordinary stack, as doing so could produce a
> gadget to defeat it.
>
> Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
> ---
>  arch/arm64/include/asm/efi.h       |  3 +++
>  arch/arm64/kernel/efi-rt-wrapper.S | 13 +++++++++-
>  arch/arm64/kernel/efi.c            | 25 ++++++++++++++++++++
>  3 files changed, 40 insertions(+), 1 deletion(-)

Could we have this in Stable please?

Upstream commit: ff7a167961d1b ("arm64: efi: Execute runtime services from a dedicated stack")

Ard, do we need Patch 2 as well, or can this be applied on its own?

> diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
> index 7c12e01c2b312e7b..1c408ec3c8b3a883 100644
> --- a/arch/arm64/include/asm/efi.h
> +++ b/arch/arm64/include/asm/efi.h
> @@ -25,6 +25,7 @@ int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
>  ({ \
>      efi_virtmap_load(); \
>      __efi_fpsimd_begin(); \
> +    spin_lock(&efi_rt_lock); \
>  })
>
>  #undef arch_efi_call_virt
> @@ -33,10 +34,12 @@ int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
>
>  #define arch_efi_call_virt_teardown() \
>  ({ \
> +    spin_unlock(&efi_rt_lock); \
>      __efi_fpsimd_end(); \
>      efi_virtmap_unload(); \
>  })
>
> +extern spinlock_t efi_rt_lock;
>  efi_status_t __efi_rt_asm_wrapper(void *, const char *, ...);
>
>  #define ARCH_EFI_IRQ_FLAGS_MASK (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)
> diff --git a/arch/arm64/kernel/efi-rt-wrapper.S b/arch/arm64/kernel/efi-rt-wrapper.S
> index 75691a2641c1c0f8..b2786b968fee68dd 100644
> --- a/arch/arm64/kernel/efi-rt-wrapper.S
> +++ b/arch/arm64/kernel/efi-rt-wrapper.S
> @@ -16,6 +16,12 @@ SYM_FUNC_START(__efi_rt_asm_wrapper)
>       */
>      stp x1, x18, [sp, #16]
>
> +    ldr_l x16, efi_rt_stack_top
> +    mov sp, x16
> +#ifdef CONFIG_SHADOW_CALL_STACK
> +    str x18, [sp, #-16]!
> +#endif
> +
>      /*
>       * We are lucky enough that no EFI runtime services take more than
>       * 5 arguments, so all are passed in registers rather than via the
> @@ -29,6 +35,7 @@ SYM_FUNC_START(__efi_rt_asm_wrapper)
>      mov x4, x6
>      blr x8
>
> +    mov sp, x29
>      ldp x1, x2, [sp, #16]
>      cmp x2, x18
>      ldp x29, x30, [sp], #32
> @@ -42,6 +49,10 @@ SYM_FUNC_START(__efi_rt_asm_wrapper)
>       * called with preemption disabled and a separate shadow stack is used
>       * for interrupts.
>       */
> -    mov x18, x2
> +#ifdef CONFIG_SHADOW_CALL_STACK
> +    ldr_l x18, efi_rt_stack_top
> +    ldr x18, [x18, #-16]
> +#endif
> +
>      b efi_handle_corrupted_x18 // tail call
>  SYM_FUNC_END(__efi_rt_asm_wrapper)
> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
> index a908a37f03678b6b..8cb2e005f8aca589 100644
> --- a/arch/arm64/kernel/efi.c
> +++ b/arch/arm64/kernel/efi.c
> @@ -144,3 +144,28 @@ asmlinkage efi_status_t efi_handle_corrupted_x18(efi_status_t s, const char *f)
>      pr_err_ratelimited(FW_BUG "register x18 corrupted by EFI %s\n", f);
>      return s;
>  }
> +
> +DEFINE_SPINLOCK(efi_rt_lock);
> +
> +asmlinkage u64 *efi_rt_stack_top __ro_after_init;
> +
> +/* required by the EFI spec */
> +static_assert(THREAD_SIZE >= SZ_8K);
> +
> +int __init arm64_efi_rt_init(void)
> +{
> +    void *p = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN,
> +                                   VMALLOC_START, VMALLOC_END, GFP_KERNEL,
> +                                   PAGE_KERNEL, 0, NUMA_NO_NODE,
> +                                   __builtin_return_address(0));
> +
> +    if (!p) {
> +        pr_warn("Failed to allocate EFI runtime stack\n");
> +        clear_bit(EFI_RUNTIME_SERVICES, &efi.flags);
> +        return -ENOMEM;
> +    }
> +
> +    efi_rt_stack_top = p + THREAD_SIZE;
> +    return 0;
> +}
> +core_initcall(arm64_efi_rt_init);
> --
> 2.35.1
>

--
Lee Jones [李琼斯]

10 months, 1 week

[PATCH v6 4/8] crypto: ccp: Fix uapi definitions of PSP errors

by Dionna Glaze

From: Alexey Kardashevskiy <aik(a)amd.com>

Additions to the error enum after the explicit 0x27 setting for
SEV_RET_INVALID_KEY lead to incorrect value assignments.

Use explicit values to match the manufacturer specifications more
clearly.

Fixes: 3a45dc2b419e ("crypto: ccp: Define the SEV-SNP commands")
CC: Sean Christopherson <seanjc(a)google.com>
CC: Paolo Bonzini <pbonzini(a)redhat.com>
CC: Thomas Gleixner <tglx(a)linutronix.de>
CC: Ingo Molnar <mingo(a)redhat.com>
CC: Borislav Petkov <bp(a)alien8.de>
CC: Dave Hansen <dave.hansen(a)linux.intel.com>
CC: Ashish Kalra <ashish.kalra(a)amd.com>
CC: Tom Lendacky <thomas.lendacky(a)amd.com>
CC: John Allen <john.allen(a)amd.com>
CC: Herbert Xu <herbert(a)gondor.apana.org.au>
CC: "David S. Miller" <davem(a)davemloft.net>
CC: Michael Roth <michael.roth(a)amd.com>
CC: Luis Chamberlain <mcgrof(a)kernel.org>
CC: Russ Weight <russ.weight(a)linux.dev>
CC: Danilo Krummrich <dakr(a)redhat.com>
CC: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
CC: "Rafael J. Wysocki" <rafael(a)kernel.org>
CC: Tianfei zhang <tianfei.zhang(a)intel.com>
CC: Alexey Kardashevskiy <aik(a)amd.com>
CC: stable(a)vger.kernel.org
Signed-off-by: Alexey Kardashevskiy <aik(a)amd.com>
Signed-off-by: Dionna Glaze <dionnaglaze(a)google.com>
---
 include/uapi/linux/psp-sev.h | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/uapi/linux/psp-sev.h b/include/uapi/linux/psp-sev.h
index 832c15d9155bd..eeb20dfb1fdaa 100644
--- a/include/uapi/linux/psp-sev.h
+++ b/include/uapi/linux/psp-sev.h
@@ -73,13 +73,20 @@ typedef enum {
     SEV_RET_INVALID_PARAM,
     SEV_RET_RESOURCE_LIMIT,
     SEV_RET_SECURE_DATA_INVALID,
-    SEV_RET_INVALID_KEY = 0x27,
-    SEV_RET_INVALID_PAGE_SIZE,
-    SEV_RET_INVALID_PAGE_STATE,
-    SEV_RET_INVALID_MDATA_ENTRY,
-    SEV_RET_INVALID_PAGE_OWNER,
-    SEV_RET_INVALID_PAGE_AEAD_OFLOW,
-    SEV_RET_RMP_INIT_REQUIRED,
+    SEV_RET_INVALID_PAGE_SIZE = 0x0019,
+    SEV_RET_INVALID_PAGE_STATE = 0x001A,
+    SEV_RET_INVALID_MDATA_ENTRY = 0x001B,
+    SEV_RET_INVALID_PAGE_OWNER = 0x001C,
+    SEV_RET_AEAD_OFLOW = 0x001D,
+    SEV_RET_EXIT_RING_BUFFER = 0x001F,
+    SEV_RET_RMP_INIT_REQUIRED = 0x0020,
+    SEV_RET_BAD_SVN = 0x0021,
+    SEV_RET_BAD_VERSION = 0x0022,
+    SEV_RET_SHUTDOWN_REQUIRED = 0x0023,
+    SEV_RET_UPDATE_FAILED = 0x0024,
+    SEV_RET_RESTORE_REQUIRED = 0x0025,
+    SEV_RET_RMP_INITIALIZATION_FAILED = 0x0026,
+    SEV_RET_INVALID_KEY = 0x0027,
     SEV_RET_MAX,
 } sev_ret_code;

--
2.47.0.277.g8800431eea-goog
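
For readers unfamiliar with why the old layout went wrong, here is a minimal standalone sketch of C's implicit enum numbering. The EX_* names are illustrative stand-ins, not the kernel's identifiers; only the values 0x27, 0x0019 and 0x001A are taken from the patch above. An enumerator without an initializer takes the previous value plus one, so anything listed after `= 0x27` starts at 0x28.

```c
#include <assert.h>

/* Illustrative stand-in for the old layout: entries after the explicit
 * 0x27 are numbered implicitly, continuing from it. */
enum ex_old_layout {
	EX_OLD_INVALID_KEY = 0x27,
	EX_OLD_INVALID_PAGE_SIZE,	/* implicitly 0x28 */
	EX_OLD_INVALID_PAGE_STATE,	/* implicitly 0x29 */
};

/* Illustrative stand-in for the fixed layout: every value is explicit,
 * so the order of the entries no longer affects their values. */
enum ex_new_layout {
	EX_NEW_INVALID_PAGE_SIZE  = 0x0019,
	EX_NEW_INVALID_PAGE_STATE = 0x001A,
	EX_NEW_INVALID_KEY        = 0x0027,
};

/* Compile-time checks of the numbering described above. */
static_assert(EX_OLD_INVALID_PAGE_SIZE == 0x28, "implicit value follows 0x27");
static_assert(EX_NEW_INVALID_PAGE_SIZE == 0x19, "explicit value is kept");

/* How far the implicitly numbered entry drifted from the explicit one. */
int ex_page_size_drift(void)
{
	return EX_OLD_INVALID_PAGE_SIZE - EX_NEW_INVALID_PAGE_SIZE;
}
```

With explicit values on every entry, inserting or reordering enumerators cannot silently shift the codes that userspace sees.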

10 months, 1 week

[PATCH] PCI: controller: Restore PCI_REASSIGN_ALL_BUS when PCI_PROBE_ONLY is enabled

by Bo Sun

On our Marvell OCTEON CN96XX board, we observed the following panic on
the latest kernel:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000080
Mem abort info:
  ESR = 0x0000000096000005
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x05: level 1 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
  CM = 0, WnR = 0, TnD = 0, TagAccess = 0
  GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[0000000000000080] user address but active_mm is swapper
Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
Modules linked in:
CPU: 9 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.13.0-rc7-00149-g9bffa1ad25b8 #1
Hardware name: Marvell OcteonTX CN96XX board (DT)
pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : of_pci_add_properties+0x278/0x4c8
lr : of_pci_add_properties+0x258/0x4c8
sp : ffff8000822ef9b0
x29: ffff8000822ef9b0 x28: ffff000106dd8000 x27: ffff800081bc3b30
x26: ffff800081540118 x25: ffff8000813d2be0 x24: 0000000000000000
x23: ffff00010528a800 x22: ffff000107c50000 x21: ffff0001039c2630
x20: ffff0001039c2630 x19: 0000000000000000 x18: ffffffffffffffff
x17: 00000000a49c1b85 x16: 0000000084c07b58 x15: ffff000103a10f98
x14: ffffffffffffffff x13: ffff000103a10f96 x12: 0000000000000003
x11: 0101010101010101 x10: 000000000000002c x9 : ffff800080ca7acc
x8 : ffff0001038fd900 x7 : 0000000000000000 x6 : 0000000000696370
x5 : 0000000000000000 x4 : 0000000000000002 x3 : ffff8000822efa40
x2 : ffff800081341000 x1 : ffff000107c50000 x0 : 0000000000000000
Call trace:
 of_pci_add_properties+0x278/0x4c8 (P)
 of_pci_make_dev_node+0xe0/0x158
 pci_bus_add_device+0x158/0x210
 pci_bus_add_devices+0x40/0x98
 pci_host_probe+0x94/0x118
 pci_host_common_probe+0x120/0x1a0
 platform_probe+0x70/0xf0
 really_probe+0xb4/0x2a8
 __driver_probe_device+0x80/0x140
 driver_probe_device+0x48/0x170
 __driver_attach+0x9c/0x1b0
 bus_for_each_dev+0x7c/0xe8
 driver_attach+0x2c/0x40
 bus_add_driver+0xec/0x218
 driver_register+0x68/0x138
 __platform_driver_register+0x2c/0x40
 gen_pci_driver_init+0x24/0x38
 do_one_initcall+0x4c/0x278
 kernel_init_freeable+0x1f4/0x3d0
 kernel_init+0x28/0x1f0
 ret_from_fork+0x10/0x20
Code: aa1603e1 f0005522 d2800044 91000042 (f94040a0)

This regression was introduced by commit 7246a4520b4b ("PCI: Use
preserve_config in place of pci_flags"). On our board, the 002:00:07.0
bridge is misconfigured by the bootloader. Both its secondary and
subordinate bus numbers are initialized to 0, while its fixed secondary
bus number is set to 8. However, bus number 8 is also assigned to another
bridge (0002:00:0f.0). Although this is a bootloader issue, before the
change in commit 7246a4520b4b, the PCI_REASSIGN_ALL_BUS flag was set
by default when PCI_PROBE_ONLY was enabled, ensuring that all the bus
numbers for these bridges were reassigned, avoiding any conflicts.

After the change introduced in commit 7246a4520b4b, the bus numbers
assigned by the bootloader are reused by all other bridges, except
the misconfigured 002:00:07.0 bridge. The kernel attempts to reconfigure
002:00:07.0 by reusing the fixed secondary bus number 8 assigned by
the bootloader. However, since a pci_bus has already been allocated for
bus 8 due to the probe of 0002:00:0f.0, no new pci_bus is allocated for
002:00:07.0. This results in a PCI bridge device without a pci_bus
attached (pdev->subordinate == NULL). Consequently, accessing
pdev->subordinate in of_pci_prop_bus_range() leads to a NULL pointer
dereference.

To summarize, we need to restore the PCI_REASSIGN_ALL_BUS flag when
PCI_PROBE_ONLY is enabled in order to work around issues like the one
described above.

Cc: stable(a)vger.kernel.org
Fixes: 7246a4520b4b ("PCI: Use preserve_config in place of pci_flags")
Signed-off-by: Bo Sun <Bo.Sun.CN(a)windriver.com>
---
 drivers/pci/controller/pci-host-common.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/controller/pci-host-common.c b/drivers/pci/controller/pci-host-common.c
index cf5f59a745b3..615923acbc3e 100644
--- a/drivers/pci/controller/pci-host-common.c
+++ b/drivers/pci/controller/pci-host-common.c
@@ -73,6 +73,10 @@ int pci_host_common_probe(struct platform_device *pdev)
     if (IS_ERR(cfg))
         return PTR_ERR(cfg);

+    /* Do not reassign resources if probe only */
+    if (!pci_has_flag(PCI_PROBE_ONLY))
+        pci_add_flags(PCI_REASSIGN_ALL_BUS);
+
     bridge->sysdata = cfg;
     bridge->ops = (struct pci_ops *)&ops->pci_ops;
     bridge->msi_domain = true;
--
2.48.1
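
The failure mode described in the commit message can be modelled in a few lines of standalone C. This is only a toy sketch under stated assumptions: the ex_* types and functions are hypothetical stand-ins for struct pci_bus, struct pci_dev and the bridge scan, and the NULL guard in ex_bus_range_start() is an illustration of what the crashing access lacks, not the fix proposed by this patch (which restores bus renumbering instead).

```c
#include <stddef.h>

#define EX_MAX_BUS 256

/* Hypothetical, simplified stand-ins for struct pci_bus / struct pci_dev. */
struct ex_bus {
	int number;
};

struct ex_bridge {
	int fixed_secondary;		/* bus number inherited from the bootloader */
	struct ex_bus *subordinate;	/* stays NULL if no bus was allocated */
};

static struct ex_bus ex_buses[EX_MAX_BUS];
static int ex_bus_used[EX_MAX_BUS];

/* Allocate a child bus for the bridge unless its number is already taken,
 * mirroring the "reuse the bootloader's number" behaviour described above. */
void ex_scan_bridge(struct ex_bridge *br)
{
	int n = br->fixed_secondary;

	if (n < 0 || n >= EX_MAX_BUS || ex_bus_used[n]) {
		br->subordinate = NULL;	/* conflict: no bus gets attached */
		return;
	}
	ex_bus_used[n] = 1;
	ex_buses[n].number = n;
	br->subordinate = &ex_buses[n];
}

/* Return the secondary bus number, or -1 when no bus is attached, instead
 * of dereferencing a NULL subordinate pointer. */
int ex_bus_range_start(const struct ex_bridge *br)
{
	if (!br->subordinate)
		return -1;
	return br->subordinate->number;
}
```

In this model, the second bridge that claims bus 8 ends up with no subordinate bus, which corresponds to the pdev->subordinate == NULL state behind the oops.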

10 months, 1 week
