Hi Jihong and perf experts,
Greetings!
There was "BUG: soft lockup in asm_sysvec_apic_timer_interrupt" in v6.11-rc4 mainline kernel by local syzkaller Intel internal kernel testing.
Bisected and found first bad commit: " 15def34e2635 perf/core: Fix hardlockup failure caused by perf throttle " After reverted above commit on top of v6.11-rc4 mainline kernel, this issue was gone.
And this issue could be reproduced in 1200s.
All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/240823_212601_asm_sysve... Syzkaller repro code: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysve... Syzkaller repro syscall steps: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysve... Syzkaller report: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysve... Kconfig(make olddefconfig): https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysve... Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysve... bzImage: https://github.com/xupengfe/syzkaller_logs/raw/main/240823_212601_asm_sysvec... Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysve...
" [ 22.518698] hrtimer: interrupt took 13103 ns [ 30.382700] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 2079936720 wd_nsec: 2079936828 [ 378.038693] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 7719948786 wd_nsec: 7719948793 [ 736.508865] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [repro:193160] [ 736.509218] Modules linked in: [ 736.509369] irq event stamp: 15405229 [ 736.509539] hardirqs last enabled at (15405228): [<ffffffff8579c9de>] irqentry_exit+0x3e/0xa0 [ 736.509947] hardirqs last disabled at (15405229): [<ffffffff8579ae14>] sysvec_apic_timer_interrupt+0x14/0xc0 [ 736.510383] softirqs last enabled at (9218742): [<ffffffff81289fb9>] __irq_exit_rcu+0xa9/0x120 [ 736.510776] softirqs last disabled at (9218745): [<ffffffff81289fb9>] __irq_exit_rcu+0xa9/0x120 [ 736.511167] CPU: 0 UID: 0 PID: 193160 Comm: repro Not tainted 6.11.0-rc4-47ac09b91bef+ #1 [ 736.511529] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 [ 736.512039] RIP: 0010:__rcu_read_unlock+0x284/0x560 [ 736.512272] Code: 8f 00 00 00 84 d2 0f 84 87 00 00 00 bf 09 00 00 00 e8 d0 0b dc ff 4d 85 f6 0f 84 68 fe ff ff e8 f2 83 26 00 fb 0f 1f 44 00 00 <e9> 58 fe ff ff 0f 0b 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 [ 736.513091] RSP: 0018:ffff88806c609938 EFLAGS: 00000212 [ 736.513330] RAX: 0000000000e956e0 RBX: ffff88806c649600 RCX: 1ffffffff14ae71b [ 736.513648] RDX: 0000000000000000 RSI: 0000000000000101 RDI: 0000000000000000 [ 736.513974] RBP: ffff88806c609968 R08: 0000000000000001 R09: fffffbfff14a92b5 [ 736.514289] R10: ffffffff8a5495af R11: 0000000000000001 R12: ffff88801379ca00 [ 736.514606] R13: ffff88801379ca00 R14: 0000000000000200 R15: ffffffff86e619c0 [ 736.514935] FS: 00007f095e148740(0000) GS:ffff88806c600000(0000) knlGS:0000000000000000 [ 736.515293] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 736.515555] CR2: 0000000020000000 CR3: 0000000021944006 CR4: 0000000000770ef0 [ 736.515888] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 736.516215] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400 [ 736.516539] PKRU: 55555554 [ 736.516672] Call Trace: [ 736.516798] <IRQ> [ 736.516907] ? show_regs+0xa8/0xc0 [ 736.517080] ? watchdog_timer_fn+0x52f/0x6b0 [ 736.517284] ? __pfx_watchdog_timer_fn+0x10/0x10 [ 736.517503] ? __hrtimer_run_queues+0x5d6/0xb10 [ 736.517733] ? __pfx___hrtimer_run_queues+0x10/0x10 [ 736.517975] ? hrtimer_interrupt+0x324/0x7a0 [ 736.518193] ? __sysvec_apic_timer_interrupt+0x10b/0x410 [ 736.518443] ? debug_smp_processor_id+0x20/0x30 [ 736.518663] ? sysvec_apic_timer_interrupt+0x4b/0xc0 [ 736.518901] ? asm_sysvec_apic_timer_interrupt+0x1f/0x30 [ 736.519162] ? __rcu_read_unlock+0x284/0x560 [ 736.519370] ? __rcu_read_unlock+0x27e/0x560 [ 736.519579] __is_insn_slot_addr+0x14c/0x2a0 [ 736.519791] kernel_text_address+0x64/0xe0 [ 736.519991] __kernel_text_address+0x16/0x50 [ 736.520199] unwind_get_return_address+0x8c/0x100 [ 736.520424] ? __pfx_stack_trace_consume_entry+0x10/0x10 [ 736.520671] arch_stack_walk+0xa7/0x170 [ 736.520878] stack_trace_save+0x97/0xd0 [ 736.521064] ? __pfx_stack_trace_save+0x10/0x10 [ 736.521276] ? __pfx_mark_lock.part.0+0x10/0x10 [ 736.521499] kasan_save_stack+0x2c/0x60 [ 736.521681] ? kasan_save_stack+0x2c/0x60 [ 736.521870] ? kasan_save_track+0x18/0x40 [ 736.522057] ? kasan_save_free_info+0x3f/0x60 [ 736.522267] ? __kasan_slab_free+0x115/0x1a0 [ 736.522467] ? kfree+0xfe/0x330 [ 736.522622] ? free_ctx+0x22/0x30 [ 736.522788] ? rcu_core+0x877/0x18f0 [ 736.522967] ? rcu_core_si+0x12/0x20 [ 736.523142] ? handle_softirqs+0x1c7/0x870 [ 736.523334] ? __irq_exit_rcu+0xa9/0x120 [ 736.523519] ? irq_exit_rcu+0x12/0x30 [ 736.523693] ? sysvec_apic_timer_interrupt+0xa5/0xc0 [ 736.523922] ? asm_sysvec_apic_timer_interrupt+0x1f/0x30 [ 736.524165] ? lock_acquire+0x1ff/0x580 [ 736.524352] ? _raw_spin_lock+0x38/0x50 [ 736.524534] ? do_fcntl+0xa95/0x1400 [ 736.524709] ? __x64_sys_fcntl+0x179/0x210 [ 736.524906] ? x64_sys_call+0x5b9/0x20d0 [ 736.525094] ? do_syscall_64+0x6d/0x140 [ 736.525276] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 736.525529] ? __pfx___lock_acquire+0x10/0x10 [ 736.525736] ? do_raw_spin_trylock+0xbf/0x190 [ 736.525949] ? free_unref_page_commit+0x3c0/0xfb0 [ 736.526176] ? __this_cpu_preempt_check+0x21/0x30 [ 736.526399] ? lock_acquire+0x1de/0x580 [ 736.526587] ? free_ctx+0x22/0x30 [ 736.526753] kasan_save_track+0x18/0x40 [ 736.526944] kasan_save_free_info+0x3f/0x60 [ 736.527146] __kasan_slab_free+0x115/0x1a0 [ 736.527341] ? free_ctx+0x22/0x30 [ 736.527503] kfree+0xfe/0x330 [ 736.527655] ? rcu_core+0x875/0x18f0 [ 736.527833] free_ctx+0x22/0x30 [ 736.527991] rcu_core+0x877/0x18f0 [ 736.528165] ? __pfx_rcu_core+0x10/0x10 [ 736.528365] rcu_core_si+0x12/0x20 [ 736.528535] handle_softirqs+0x1c7/0x870 [ 736.528734] __irq_exit_rcu+0xa9/0x120 [ 736.528915] irq_exit_rcu+0x12/0x30 [ 736.529085] sysvec_apic_timer_interrupt+0xa5/0xc0 [ 736.529309] </IRQ> [ 736.529414] <TASK> [ 736.529522] asm_sysvec_apic_timer_interrupt+0x1f/0x30 [ 736.529762] RIP: 0010:lock_acquire+0x1ff/0x580 [ 736.529974] Code: 48 83 c4 28 e8 72 73 37 04 b8 ff ff ff ff 65 0f c1 05 fd b6 c0 7e 83 f8 01 0f 85 d9 02 00 00 4d 85 ff 74 06 fb 0f 1f 44 00 00 <48> b8 00 00 00 00 00 fc ff df 48 01 c3 48 c7 03 00 00 00 00 48 c7 [ 736.530786] RSP: 0018:ffff88801aa0fcd0 EFLAGS: 00000206 [ 736.531031] RAX: 0000000000000001 RBX: 1ffff11003541f9e RCX: 1ffff11003541f82 [ 736.531348] RDX: 1ffff110026f3b07 RSI: 0000000000000001 RDI: 0000000000000000 [ 736.531668] RBP: ffff88801aa0fdb8 R08: 0000000000000000 R09: fffffbfff14a92b5 [ 736.531994] R10: ffffffff8a5495af R11: 0000000000000001 R12: 0000000000000001 [ 736.532318] R13: 0000000000000000 R14: ffff88800f65a028 R15: 0000000000000200 [ 736.532660] ? __pfx_lock_acquire+0x10/0x10 [ 736.532860] ? _raw_spin_unlock+0x31/0x60 [ 736.533063] ? up_write+0x1c0/0x550 [ 736.533234] ? fasync_helper+0x77/0xc0 [ 736.533420] _raw_spin_lock+0x38/0x50 [ 736.533595] ? do_fcntl+0xa95/0x1400 [ 736.533775] do_fcntl+0xa95/0x1400 [ 736.533948] ? __pfx_do_fcntl+0x10/0x10 [ 736.534137] ? trace_hardirqs_on+0x51/0x60 [ 736.534336] ? seqcount_lockdep_reader_access.constprop.0+0xc0/0xd0 [ 736.534619] ? __sanitizer_cov_trace_cmp4+0x1a/0x20 [ 736.534853] ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20 [ 736.535104] ? security_file_fcntl+0x9d/0xd0 [ 736.535315] __x64_sys_fcntl+0x179/0x210 [ 736.535508] x64_sys_call+0x5b9/0x20d0 [ 736.535687] do_syscall_64+0x6d/0x140 [ 736.535866] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 736.536099] RIP: 0033:0x7f095de3ee5d [ 736.536273] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 93 af 1b 00 f7 d8 64 89 01 48 [ 736.537075] RSP: 002b:00007ffea8175048 EFLAGS: 00000217 ORIG_RAX: 0000000000000048 [ 736.537411] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f095de3ee5d [ 736.537736] RDX: 0000000000002400 RSI: 0000000000000004 RDI: 0000000000000003 [ 736.538053] RBP: 00007ffea8175060 R08: 00007ffea8175060 R09: 00007ffea8175060 [ 736.538371] R10: 00007ffea8175060 R11: 0000000000000217 R12: 00007ffea81751d8 [ 736.538689] R13: 000000000040547a R14: 0000000000407df8 R15: 00007f095e193000 [ 736.539023] </TASK> [ 736.539132] Kernel panic - not syncing: softlockup: hung tasks "
Hope it's helpful.
Thanks!
---
If you don't need the following environment to reproduce the problem or if you already have one reproduced environment, please ignore the following information.
How to reproduce: git clone https://gitlab.com/xupengfe/repro_vm_env.git cd repro_vm_env tar -xvf repro_vm_env.tar.gz cd repro_vm_env; ./start3.sh // it needs qemu-system-x86_64 and I used v7.1.0 // start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel // You could change the bzImage_xxx as you want // Maybe you need to remove line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd " for different qemu version You could use below command to log in, there is no password for root. ssh -p 10023 root@localhost
After login vm(virtual machine) successfully, you could transfer reproduced binary to the vm by below way, and reproduce the problem in vm: gcc -pthread -o repro repro.c scp -P 10023 repro root@localhost:/root/
Get the bzImage for target kernel: Please use target kconfig and copy it to kernel_src/.config make olddefconfig make -jx bzImage //x should equal or less than cpu num your pc has
Fill the bzImage file into above start3.sh to load the target kernel in vm.
Tips: If you already have qemu-system-x86_64, please ignore below info. If you want to install qemu v7.1.0 version: git clone https://github.com/qemu/qemu.git cd qemu git checkout -f v7.1.0 mkdir build cd build yum install -y ninja-build.x86_64 yum -y install libslirp-devel.x86_64 ../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp make make install
Best Regards, Thanks!
On 2023-04-17 at 10:46:22 -0000, tip-bot2 for Yang Jihong wrote:
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 15def34e2635ab7e0e96f1bc32e1b69609f14942 Gitweb: https://git.kernel.org/tip/15def34e2635ab7e0e96f1bc32e1b69609f14942 Author: Yang Jihong yangjihong1@huawei.com AuthorDate: Mon, 27 Feb 2023 10:35:08 +08:00 Committer: Peter Zijlstra peterz@infradead.org CommitterDate: Fri, 14 Apr 2023 16:08:22 +02:00
perf/core: Fix hardlockup failure caused by perf throttle
commit e050e3f0a71bf ("perf: Fix broken interrupt rate throttling") introduces a change in throttling threshold judgment. Before this, compare hwc->interrupts and max_samples_per_tick, then increase hwc->interrupts by 1, but this commit reverses order of these two behaviors, causing the semantics of max_samples_per_tick to change. In literal sense of "max_samples_per_tick", if hwc->interrupts == max_samples_per_tick, it should not be throttled, therefore, the judgment condition should be changed to "hwc->interrupts > max_samples_per_tick".
In fact, this may cause the hardlockup to fail, The minimum value of max_samples_per_tick may be 1, in this case, the return value of __perf_event_account_interrupt function is 1. As a result, nmi_watchdog gets throttled, which would stop PMU (Use x86 architecture as an example, see x86_pmu_handle_irq).
Fixes: e050e3f0a71b ("perf: Fix broken interrupt rate throttling") Signed-off-by: Yang Jihong yangjihong1@huawei.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20230227023508.102230-1-yangjihong1@huawei.com
kernel/events/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c index fb3e436..82b95b8 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -9433,8 +9433,8 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle) hwc->interrupts = 1; } else { hwc->interrupts++;
if (unlikely(throttle
&& hwc->interrupts >= max_samples_per_tick)) {
if (unlikely(throttle &&
hwc->interrupts > max_samples_per_tick)) { __this_cpu_inc(perf_throttled_count); tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS); hwc->interrupts = MAX_INTERRUPTS;