Hi, all
We are encountering a perf related soft lockup as shown below:
[25023823.265138] watchdog: BUG: soft lockup - CPU#29 stuck for 45s! [YD:3284696] [25023823.275772] net_failover virtio_scsi failover [25023823.276750] CPU: 29 PID: 3284696 Comm: YD Kdump: loaded Not tainted 4.19.90-23.18.v2101.ky10.aarch64 #1 [25023823.278257] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 [25023823.279475] pstate: 80400005 (Nzcv daif +PAN -UAO) [25023823.280516] pc : perf_iterate_sb+0x1b8/0x1f0 [25023823.281530] lr : perf_iterate_sb+0x18c/0x1f0 [25023823.282529] sp : ffff801f282efbf0 [25023823.283446] x29: ffff801f282efbf0 x28: ffff801f207a8b80 [25023823.284551] x27: 0000000000000000 x26: ffff801f99b355e8 [25023823.285674] x25: 0000000000000000 x24: ffff8019e2fbd800 [25023823.286770] x23: ffff0000093f0018 x22: ffff801f282efc40 [25023823.287864] x21: ffff000008255f60 x20: ffff801ffdf58e80 [25023823.288964] x19: ffff8019f1c27800 x18: 0000000000000000 [25023823.290060] x17: 0000000000000000 x16: 0000000000000000 [25023823.291164] x15: 0400000000000000 x14: 0000000000000000 [25023823.292266] x13: ffff000008c6e340 x12: 0000000000000002 [25023823.293381] x11: ffff000008c6e318 x10: 00000019e5feff20 [25023823.294486] x9 : ffff8019fb49c000 x8 : 0058e6fd335b260e [25023823.295597] x7 : 0000000100321ed8 x6 : ffff00003d083780 [25023823.296715] x5 : 00ffffffffffffff x4 : 0000801ff4ae0000 [25023823.297860] x3 : ffff801ffdf64cc0 x2 : ffff000009858758 [25023823.298977] x1 : 0000000000000000 x0 : ffff8019e2fbd800 [25023823.300090] Call trace: [25023823.300962] perf_iterate_sb+0x1b8/0x1f0 [25023823.301961] perf_event_task+0x78/0x80 [25023823.302946] perf_event_exit_task+0xa4/0xb0 [25023823.303978] do_exit+0x38c/0x5d0 [25023823.304932] do_group_exit+0x3c/0xd8 [25023823.305904] get_signal+0x12c/0x740 [25023823.306859] do_signal+0x158/0x260 [25023823.307795] do_notify_resume+0xd8/0x358 [25023823.308781] work_pending+0x8/0x10
We got a vmcore by enable panic_on_soft_lockup, from the vmcore we found the perf_event accessed through perf_iterate_sb -> perf_iterate_sb_cpu -> event_filter_match -> pmu_filter_match -> for_each_sibling_event had been removed:
#define for_each_sibling_event(sibling, event) \ if ((event)->group_leader == (event)) \ list_for_each_entry((sibling), &(event)->sibling_list, sibling_list)
#define list_for_each_entry(pos, head, member) \ for (pos = __container_of((head)->next, pos, member); \ &pos->member != (head); \ pos = __container_of(pos->member.next, pos, member))
crash> struct perf_event ffff8019e2fbd800 struct perf_event { event_entry = { next = 0xffff8019f1c27800, prev = 0xdead000000000200 }, ... state = PERF_EVENT_STATE_DEAD, ... }
By the way, we also found another process which is deleting sibling_list:
crash> bt 3284533 PID: 3284533 TASK: ffff801f901ae880 CPU: 16 COMMAND: "YD" #0 [ffff801f8cd977f0] __switch_to at ffff000008088ba4 #1 [ffff801f8cd97810] __schedule at ffff000008bf10c4 #2 [ffff801f8cd97890] schedule at ffff000008bf17b0 #3 [ffff801f8cd978a0] schedule_timeout at ffff000008bf5b10 #4 [ffff801f8cd97960] wait_for_common at ffff000008bf2530 #5 [ffff801f8cd979f0] wait_for_completion at ffff000008bf2644 #6 [ffff801f8cd97a10] __wait_rcu_gp at ffff000008171c00 #7 [ffff801f8cd97a80] synchronize_sched at ffff000008179da8 #8 [ffff801f8cd97ad0] perf_trace_event_unreg at ffff000008216d50 #9 [ffff801f8cd97b00] perf_trace_destroy at ffff000008217148 #10 [ffff801f8cd97b20] tp_perf_event_destroy at ffff000008256ae0 #11 [ffff801f8cd97b30] _free_event at ffff00000825f21c #12 [ffff801f8cd97b70] put_event at ffff00000825faf0 #13 [ffff801f8cd97b80] perf_event_release_kernel at ffff00000825fcb8 #14 [ffff801f8cd97be0] perf_release at ffff00000825fdbc #15 [ffff801f8cd97bf0] __fput at ffff00000832f0b8 #16 [ffff801f8cd97c30] ____fput at ffff00000832f28c #17 [ffff801f8cd97c50] task_work_run at ffff00000810f8c8 #18 [ffff801f8cd97c90] do_exit at ffff0000080ef458 #19 [ffff801f8cd97cf0] do_group_exit at ffff0000080ef738 #20 [ffff801f8cd97d20] get_signal at ffff0000080fdde0 #21 [ffff801f8cd97d90] do_signal at ffff00000808e488 #22 [ffff801f8cd97e80] do_notify_resume at ffff00000808e7f4 #23 [ffff801f8cd97ff0] work_pending at ffff000008083f60
So it's reasonable to suspect that perf_iterate_sb is traversing sibling_list while another process is deleting it which eventually caused for_each_sibling_event to endless loop and thus soft lockup.
The race scenario thus could be this:
CPU 29: CPU 16: perf_event_release_kernel --> mutex_lock(&ctx->mutex) --> perf_remove_from_context --> perf_group_detach(event); for_each_sibling_event() --> list_del_init(&event->sibling_list)
As commit f3c0eba287049(“perf: Add a few assertions”)said: “Notable for_each_sibling_event() relies on exclusion from modification. This would normally be holding either ctx->lock or ctx->mutex, however due to how things are constructed disabling IRQs is a valid and sufficient substitute for ctx->lock.”, we think it's necessary to hold ctx ->mutex, but currently LTS such as 4.19,5.4,5.10, and 6.1 all does not do so:
perf_event_task --> perf_iterate_sb --> perf_iterate_sb_cpu --> event_filter_match --> pmu_filter_match --> for_each_sibling_event
commit bd27568117664(“perf: Rewrite core context handling”)had removed the pmu_filter_match operation, so it may be a temporary workaround for this issue.
But it's necessary to confirm if there is a race problem between sibling_list, and if it is, how to fix currently LTS branches.
Thanks in advance.
On Wed, Nov 29, 2023 at 05:33:33PM +0800, Zhengyuan Liu wrote:
Hi, all
We are encountering a perf related soft lockup as shown below:
[25023823.265138] watchdog: BUG: soft lockup - CPU#29 stuck for 45s! [YD:3284696] [25023823.275772] net_failover virtio_scsi failover [25023823.276750] CPU: 29 PID: 3284696 Comm: YD Kdump: loaded Not tainted 4.19.90-23.18.v2101.ky10.aarch64 #1
^^^^^^^^^^^^^^^^^^^
That is some unholy ancient kernel. Please see if you can reproduce on something recent.
On Wed, Nov 29, 2023 at 6:32 PM Peter Zijlstra peterz@infradead.org wrote:
On Wed, Nov 29, 2023 at 05:33:33PM +0800, Zhengyuan Liu wrote:
Hi, all
We are encountering a perf related soft lockup as shown below:
[25023823.265138] watchdog: BUG: soft lockup - CPU#29 stuck for 45s! [YD:3284696] [25023823.275772] net_failover virtio_scsi failover [25023823.276750] CPU: 29 PID: 3284696 Comm: YD Kdump: loaded Not tainted 4.19.90-23.18.v2101.ky10.aarch64 #1
^^^^^^^^^^^^^^^^^^^
That is some unholy ancient kernel. Please see if you can reproduce on something recent.
Sorry for the late reply since my company mail server has some trouble.
I don't have a reproducer, It's an online server and happens once every few months. From our analysis, the recent kernel shouldn't have this problem after commit bd27568117664(“perf: Rewrite core context handling”). But LTS branches such as v4.19 and v5.4 will be used for a long time, so I think it's worth fixing this problem.
Thanks,
linux-stable-mirror@lists.linaro.org