kernel warning noticed on x86_64 while running LTP tracing ftrace-stress-test case. started noticing on the stable-rc linux-5.8.y branch.
This device booted with KASAN config and DYNAMIC tracing configs and more. This reported issue is not easily reproducible.
metadata: git branch: linux-5.8.y git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git commit: ad8c735b1497520df959f675718f39dca8cb8019 git describe: v5.8.2 make_kernelversion: 5.8.2 kernel-config: https://builds.tuxbuild.com/bOz0eAwkcraRiWALTW9D3Q/kernel.config
[ 88.139387] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1 [ 88.139387] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1 [ 107.507991] ------------[ cut here ]------------ [ 107.513103] NETDEV WATCHDOG: eth0 (igb): transmit queue 2 timed out [ 107.519973] WARNING: CPU: 1 PID: 331 at net/sched/sch_generic.c:442 dev_watchdog+0x4c7/0x4d0 [ 107.528907] Modules linked in: x86_pkg_temp_thermal [ 107.534541] CPU: 1 PID: 331 Comm: systemd-journal Not tainted 5.8.2 #1 [ 107.541480] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.2 05/23/2018 [ 107.549314] RIP: 0010:dev_watchdog+0x4c7/0x4d0 [ 107.554226] Code: ff ff 48 8b 5d c8 c6 05 6d f7 94 01 01 48 89 df e8 9e b4 f8 ff 44 89 e9 48 89 de 48 c7 c7 20 49 51 9c 48 89 c2 e8 91 7e e9 fe <0f> 0b e9 03 ff ff ff 66 90 e8 9b 23 db fe 55 48 89 e5 41 57 41 56 [ 107.573476] RSP: 0018:ffff888230889d88 EFLAGS: 00010286 [ 107.579264] RAX: 0000000000000000 RBX: ffff88822bbb0000 RCX: dffffc0000000000 [ 107.586928] RDX: 1ffff11046114c99 RSI: ffffffff9a7e4dbe RDI: ffffffff9b7a6da7 [ 107.594473] RBP: ffff888230889de0 R08: ffffffff9a7e4dd3 R09: ffffed1044de2529 [ 107.602101] R10: ffff888226f12943 R11: ffffed1044de2528 R12: ffff88822bbb0440 [ 107.609648] R13: 0000000000000002 R14: ffff88822bbb0388 R15: ffff88822bbb0380 [ 107.617197] FS: 00007f8b471bb480(0000) GS:ffff888230880000(0000) knlGS:0000000000000000 [ 107.625698] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 107.631944] CR2: 0000000000000008 CR3: 0000000226a64001 CR4: 00000000003606e0 [ 107.639496] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 107.647092] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 107.654661] Call Trace: [ 107.657735] <IRQ> [ 107.663155] ? ftrace_graph_caller+0xc0/0xc0 [ 107.667929] call_timer_fn+0x3b/0x1b0 [ 107.672238] ? netif_carrier_off+0x70/0x70 [ 107.677771] ? netif_carrier_off+0x70/0x70 [ 107.682656] ? ftrace_graph_caller+0xc0/0xc0 [ 107.687379] run_timer_softirq+0x3e8/0xa10 [ 107.694653] ? call_timer_fn+0x1b0/0x1b0 [ 107.699382] ? trace_event_raw_event_softirq+0xdd/0x150 [ 107.706768] ? ring_buffer_unlock_commit+0xf5/0x210 [ 107.712213] ? call_timer_fn+0x1b0/0x1b0 [ 107.716625] ? __do_softirq+0x155/0x467 Aug 22 04:21:44 intel-corei7-64 [ 107.721972] ? run_timer_softirq+0x5/0xa10 user.warn kernel[ 107.727997] ? asm_call_on_stack+0x12/0x20 : [ 107.507991] ------------[ c[ 107.735546] ? ftrace_graph_caller+0xc0/0xc0 ut here ]-------[ 107.740453] __do_softirq+0x160/0x467 ----- [ 107.745737] ? hrtimer_interrupt+0x5/0x340 [ 107.753961] asm_call_on_stack+0x12/0x20 [ 107.758672] </IRQ> [ 107.761555] do_softirq_own_stack+0x3f/0x50 [ 107.766521] ? ftrace_graph_caller+0xc0/0xc0 [ 107.771246] irq_exit_rcu+0xff/0x110 [ 107.776116] ? ftrace_graph_caller+0xc0/0xc0 [ 107.780808] sysvec_apic_timer_interrupt+0x38/0x90 [ 107.786971] asm_sysvec_apic_timer_interrupt+0x12/0x20 [ 107.792598] RIP: 0010:profile_graph_return+0x111/0x1d0 [ 107.798204] Code: 75 e1 48 8b 45 d0 f6 c4 02 75 16 50 9d e8 f7 ff 02 00 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 c3 fb 02 00 ff 75 d0 9d <48> 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 48 8d 7b 20 e8 77 78 [ 107.817416] RSP: 0018:ffff8882269b73a0 EFLAGS: 00000286 [ 107.823201] RAX: ffff8882269b73d8 RBX: ffff8882269b7428 RCX: dffffc0000000000 [ 107.830785] RDX: dffffc0000000000 RSI: ffffffff9a7e4dbe RDI: ffffffff9a7a955d [ 107.838411] RBP: ffff8882269b73d8 R08: ffffffff9a7e4dd3 R09: ffffed1044de2529 [ 107.846072] R10: ffff888226f12943 R11: ffffed1044de2528 R12: ffff8882308a67c0 [ 107.853621] R13: ffff888226f12930 R14: ffff8882308a67c8 R15: ffff88822c7e4000 [ 107.863449] ? ftrace_return_to_handler+0x1a3/0x230 Aug 22 04:21:44 [ 107.869545] ? ftrace_return_to_handler+0x18e/0x230 intel-corei7-64 [ 107.875178] ? profile_graph_return+0x10d/0x1d0 user.info kernel: [ 107.513103][ 107.882521] ? unwind_dump+0x100/0x100 NETDEV WATCHDOG: eth0 (igb): tr[ 107.889054] ? unwind_next_frame.part.0+0xe0/0x360 ansmit queue 2 t[ 107.895638] ftrace_return_to_handler+0x18e/0x230 imed out [ 107.902594] ? function_graph_enter+0x2d0/0x2d0 [ 107.907616] ? unwind_next_frame+0x23/0x30 [ 107.912633] ? unwind_dump+0x100/0x100 [ 107.919304] ? update_stack_state+0x1d4/0x290 [ 107.926042] return_to_handler+0x15/0x30 [ 107.931071] ? ftrace_graph_caller+0xc0/0xc0 [ 107.935763] ? unwind_next_frame.part.0+0xe0/0x360 [ 107.941175] ? unwind_next_frame.part.0+0x5/0x360 [ 107.947344] ? profile_setup.cold+0x9b/0x9b [ 107.953253] ? ftrace_graph_caller+0xc0/0xc0 [ 107.958020] unwind_next_frame+0x23/0x30 [ 107.963015] ? ftrace_graph_caller+0xc0/0xc0 [ 107.967701] arch_stack_walk+0x8c/0xf0 [ 107.974194] ? vfs_open+0x58/0x60 [ 107.979965] ? ftrace_graph_caller+0xc0/0xc0 [ 107.984652] stack_trace_save+0x94/0xc0 [ 107.989565] ? stack_trace_consume_entry+0x90/0x90 [ 107.995792] ? stack_trace_save+0x5/0xc0 [ 108.000536] ? trace_hardirqs_on+0x3a/0x120 [ 108.005959] ? stack_trace_save+0x5/0xc0 [ 108.007220] ? ftrace_graph_caller+0xc0/0xc0 [ 108.007464] save_stack+0x23/0x50 [ 108.007909] ? save_stack+0x23/0x50 [ 108.008225] ? __kasan_kmalloc.constprop.0+0xcf/0xe0 [ 108.008517] ? kasan_kmalloc+0x9/0x10 Aug 22 04:21:44 [ 108.008809] ? kmem_cache_alloc_trace+0xf6/0x270 intel-corei7-64 [ 108.009207] ? single_open+0x3b/0xf0 user.warn kernel[ 108.044913] ? proc_single_open+0x1b/0x20 : [ 107.519973][ 108.050304] ? do_dentry_open+0x2a6/0x6f0 WARNING: CPU: 1 PID: 331 at net[ 108.057406] ? __kasan_check_read+0x11/0x20 /sched/sch_gener[ 108.063044] ? rb_commit+0xef/0x630 ic.c:442 dev_watchdog+0x4c7/0x4d0 [ 108.070497] ? __kasan_check_read+0x11/0x20 [ 108.075391] ? ring_buffer_unlock_commit+0x102/0x210 [ 108.082301] ? trace_buffer_unlock_commit_regs+0x171/0x1d0 [ 108.090197] ? trace_event_buffer_commit+0xfb/0x3d0 [ 108.097668] ? ftrace_graph_caller+0xc0/0xc0 [ 108.102681] ? trace_event_raw_event_kmem_alloc+0x7c/0xe0 [ 108.109083] ? kasan_unpoison_shadow+0x38/0x50 [ 108.114780] __kasan_kmalloc.constprop.0+0xcf/0xe0 [ 108.121927] kasan_kmalloc+0x9/0x10 [ 108.126280] kmem_cache_alloc_trace+0xf6/0x270 [ 108.131499] ? ftrace_graph_caller+0xc0/0xc0 [ 108.136690] ? proc_cwd_link+0x140/0x140 [ 108.141797] ? ftrace_graph_caller+0xc0/0xc0 [ 108.146568] single_open+0x3b/0xf0 [ 108.151695] ? ftrace_graph_caller+0xc0/0xc0 [ 108.156386] proc_single_open+0x1b/0x20 [ 108.161083] ? ftrace_graph_caller+0xc0/0xc0 [ 108.165772] do_dentry_open+0x2a6/0x6f0 [ 108.170282] ? vfs_open+0x4a/0x60 [ 108.175328] ? sched_open+0x20/0x20 [ 108.181406] ? ftrace_graph_caller+0xc0/0xc0 [ 108.186131] vfs_open+0x58/0x60 [ 108.190568] ? ftrace_graph_caller+0xc0/0xc0 [ 108.195258] path_openat+0x153b/0x1ab0 [ 108.202219] ? __srcu_read_lock+0x50/0x50 [ 108.209143] ? vfs_mkobj+0x270/0x270 [ 108.210188] ? ftrace_graph_caller+0x81/0xc0 [ 108.210721] ? ftrace_return_to_handler+0x1a3/0x230 [ 108.212065] ? function_graph_enter+0x2d0/0x2d0 [ 108.212812] ? do_filp_open+0x91/0x1b0 [ 108.213426] ? do_filp_open+0x107/0x1b0 [ 108.214662] ? path_openat+0x5/0x1ab0 [ 108.217377] ? ftrace_graph_caller+0xc0/0xc0 [ 108.217621] do_filp_open+0x124/0x1b0 [ 108.219009] ? may_open_dev+0x50/0x50 [ 108.222371] ? prepare_ftrace_return+0x7b/0xa0 [ 108.224144] ? ftrace_graph_caller+0x81/0xc0 [ 108.224461] ? function_graph_enter+0x2d0/0x2d0 [ 108.226641] ? ftrace_return_to_handler+0x1a3/0x230 [ 108.228414] ? do_filp_open+0x5/0x1b0 [ 108.231132] ? ftrace_graph_caller+0xc0/0xc0 [ 108.231376] do_sys_openat2+0x31d/0x410 [ 108.233412] ? file_open_root+0x210/0x210 [ 108.235616] ? do_sys_openat2+0x5/0x410 [ 108.236230] ? ftrace_graph_caller+0x81/0xc0 [ 108.238435] ? ftrace_graph_caller+0xc0/0xc0 [ 108.238678] do_sys_open+0x99/0xf0 [ 108.239555] ? filp_open+0x60/0x60 [ 108.240816] ? do_sys_open+0x5/0xf0 [ 108.242372] ? do_sys_open+0x5/0xf0 [ 108.243632] ? ftrace_graph_caller+0xc0/0xc0 [ 108.243959] __x64_sys_openat+0x59/0x70 [ 108.245508] ? ftrace_graph_caller+0xc0/0xc0 [ 108.245752] do_syscall_64+0x51/0x90 [ 108.246951] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 108.247244] RIP: 0033:0x7f8b46680d20 [ 108.247512] Code: 25 00 00 41 00 3d 00 00 41 00 74 36 48 8d 05 ef eb 2c 00 8b 00 85 c0 75 5a 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 84 00 00 00 48 83 c4 68 5b 5d c3 0f 1f 44 [ 108.247777] RSP: 002b:00007ffd2d0f21d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 [ 108.248248] RAX: ffffffffffffffda RBX: 00005635a1dcc260 RCX: 00007f8b46680d20 [ 108.248466] RDX: 0000000000080000 RSI: 00007ffd2d0f2380 RDI: 00000000ffffff9c [ 108.248684] RBP: 0000000000000008 R08: 0000000000000008 R09: 0000000000000001 [ 108.248984] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f8b46cb2977 [ 108.249226] R13: 00007f8b46cb2977 R14: 0000000000000001 R15: 0000000000000000 [ 108.254504] ---[ end trace 743b6da37b9f0ee8 ]---
full test log link, https://qa-reports.linaro.org/lkft/linux-stable-rc-5.8-oe/build/v5.8.2/testr...
Reported-by: Naresh Kamboju naresh.kamboju@linaro.org
On Wed, 19 Aug 2020 17:01:06 +0530 Naresh Kamboju naresh.kamboju@linaro.org wrote:
kernel warning noticed on x86_64 while running LTP tracing ftrace-stress-test case. started noticing on the stable-rc linux-5.8.y branch.
This device booted with KASAN config and DYNAMIC tracing configs and more. This reported issue is not easily reproducible.
metadata: git branch: linux-5.8.y git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git commit: ad8c735b1497520df959f675718f39dca8cb8019 git describe: v5.8.2 make_kernelversion: 5.8.2 kernel-config: https://builds.tuxbuild.com/bOz0eAwkcraRiWALTW9D3Q/kernel.config
[ 88.139387] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1 [ 88.139387] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1 [ 107.507991] ------------[ cut here ]------------ [ 107.513103] NETDEV WATCHDOG: eth0 (igb): transmit queue 2 timed out [ 107.519973] WARNING: CPU: 1 PID: 331 at net/sched/sch_generic.c:442 dev_watchdog+0x4c7/0x4d0 [ 107.528907] Modules linked in: x86_pkg_temp_thermal [ 107.534541] CPU: 1 PID: 331 Comm: systemd-journal Not tainted 5.8.2 #1 [ 107.541480] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.2 05/23/2018 [ 107.549314] RIP: 0010:dev_watchdog+0x4c7/0x4d0 [ 107.554226] Code: ff ff 48 8b 5d c8 c6 05 6d f7 94 01 01 48 89 df e8 9e b4 f8 ff 44 89 e9 48 89 de 48 c7 c7 20 49 51 9c 48 89 c2 e8 91 7e e9 fe <0f> 0b e9 03 ff ff ff 66 90 e8 9b 23 db fe 55 48 89 e5 41 57
I've triggered this myself in my testing, and I assumed that adding the overhead of tracing and here KASAN too, made some watchdog a bit unhappy. By commenting out the warning, I've seen no ill effects.
Perhaps this is something we need to dig a bit deeper into.
-- Steve
41 56 [ 107.573476] RSP: 0018:ffff888230889d88 EFLAGS: 00010286 [ 107.579264] RAX: 0000000000000000 RBX: ffff88822bbb0000 RCX: dffffc0000000000 [ 107.586928] RDX: 1ffff11046114c99 RSI: ffffffff9a7e4dbe RDI: ffffffff9b7a6da7 [ 107.594473] RBP: ffff888230889de0 R08: ffffffff9a7e4dd3 R09: ffffed1044de2529 [ 107.602101] R10: ffff888226f12943 R11: ffffed1044de2528 R12: ffff88822bbb0440 [ 107.609648] R13: 0000000000000002 R14: ffff88822bbb0388 R15: ffff88822bbb0380 [ 107.617197] FS: 00007f8b471bb480(0000) GS:ffff888230880000(0000) knlGS:0000000000000000 [ 107.625698] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 107.631944] CR2: 0000000000000008 CR3: 0000000226a64001 CR4: 00000000003606e0 [ 107.639496] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 107.647092] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 107.654661] Call Trace: [ 107.657735] <IRQ> [ 107.663155] ? ftrace_graph_caller+0xc0/0xc0 [ 107.667929] call_timer_fn+0x3b/0x1b0 [ 107.672238] ? netif_carrier_off+0x70/0x70 [ 107.677771] ? netif_carrier_off+0x70/0x70 [ 107.682656] ? ftrace_graph_caller+0xc0/0xc0 [ 107.687379] run_timer_softirq+0x3e8/0xa10 [ 107.694653] ? call_timer_fn+0x1b0/0x1b0 [ 107.699382] ? trace_event_raw_event_softirq+0xdd/0x150 [ 107.706768] ? ring_buffer_unlock_commit+0xf5/0x210 [ 107.712213] ? call_timer_fn+0x1b0/0x1b0 [ 107.716625] ? __do_softirq+0x155/0x467 Aug 22 04:21:44 intel-corei7-64 [ 107.721972] ? run_timer_softirq+0x5/0xa10 user.warn kernel[ 107.727997] ? asm_call_on_stack+0x12/0x20 : [ 107.507991] ------------[ c[ 107.735546] ? ftrace_graph_caller+0xc0/0xc0 ut here ]-------[ 107.740453] __do_softirq+0x160/0x467
[ 107.745737] ? hrtimer_interrupt+0x5/0x340 [ 107.753961] asm_call_on_stack+0x12/0x20 [ 107.758672] </IRQ> [ 107.761555] do_softirq_own_stack+0x3f/0x50 [ 107.766521] ? ftrace_graph_caller+0xc0/0xc0 [ 107.771246] irq_exit_rcu+0xff/0x110 [ 107.776116] ? ftrace_graph_caller+0xc0/0xc0 [ 107.780808] sysvec_apic_timer_interrupt+0x38/0x90 [ 107.786971] asm_sysvec_apic_timer_interrupt+0x12/0x20 [ 107.792598] RIP: 0010:profile_graph_return+0x111/0x1d0 [ 107.798204] Code: 75 e1 48 8b 45 d0 f6 c4 02 75 16 50 9d e8 f7 ff 02 00 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 c3 fb 02 00 ff 75 d0 9d <48> 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 48 8d 7b 20 e8 77 78 [ 107.817416] RSP: 0018:ffff8882269b73a0 EFLAGS: 00000286 [ 107.823201] RAX: ffff8882269b73d8 RBX: ffff8882269b7428 RCX: dffffc0000000000 [ 107.830785] RDX: dffffc0000000000 RSI: ffffffff9a7e4dbe RDI: ffffffff9a7a955d [ 107.838411] RBP: ffff8882269b73d8 R08: ffffffff9a7e4dd3 R09: ffffed1044de2529 [ 107.846072] R10: ffff888226f12943 R11: ffffed1044de2528 R12: ffff8882308a67c0 [ 107.853621] R13: ffff888226f12930 R14: ffff8882308a67c8 R15: ffff88822c7e4000 [ 107.863449] ? ftrace_return_to_handler+0x1a3/0x230 Aug 22 04:21:44 [ 107.869545] ? ftrace_return_to_handler+0x18e/0x230 intel-corei7-64 [ 107.875178] ? profile_graph_return+0x10d/0x1d0 user.info kernel: [ 107.513103][ 107.882521] ? unwind_dump+0x100/0x100 NETDEV WATCHDOG: eth0 (igb): tr[ 107.889054] ? unwind_next_frame.part.0+0xe0/0x360 ansmit queue 2 t[ 107.895638] ftrace_return_to_handler+0x18e/0x230 imed out [ 107.902594] ? function_graph_enter+0x2d0/0x2d0 [ 107.907616] ? unwind_next_frame+0x23/0x30 [ 107.912633] ? unwind_dump+0x100/0x100 [ 107.919304] ? update_stack_state+0x1d4/0x290 [ 107.926042] return_to_handler+0x15/0x30 [ 107.931071] ? ftrace_graph_caller+0xc0/0xc0 [ 107.935763] ? unwind_next_frame.part.0+0xe0/0x360 [ 107.941175] ? unwind_next_frame.part.0+0x5/0x360 [ 107.947344] ? profile_setup.cold+0x9b/0x9b [ 107.953253] ? ftrace_graph_caller+0xc0/0xc0 [ 107.958020] unwind_next_frame+0x23/0x30 [ 107.963015] ? ftrace_graph_caller+0xc0/0xc0 [ 107.967701] arch_stack_walk+0x8c/0xf0 [ 107.974194] ? vfs_open+0x58/0x60 [ 107.979965] ? ftrace_graph_caller+0xc0/0xc0 [ 107.984652] stack_trace_save+0x94/0xc0 [ 107.989565] ? stack_trace_consume_entry+0x90/0x90 [ 107.995792] ? stack_trace_save+0x5/0xc0 [ 108.000536] ? trace_hardirqs_on+0x3a/0x120 [ 108.005959] ? stack_trace_save+0x5/0xc0 [ 108.007220] ? ftrace_graph_caller+0xc0/0xc0 [ 108.007464] save_stack+0x23/0x50 [ 108.007909] ? save_stack+0x23/0x50 [ 108.008225] ? __kasan_kmalloc.constprop.0+0xcf/0xe0 [ 108.008517] ? kasan_kmalloc+0x9/0x10 Aug 22 04:21:44 [ 108.008809] ? kmem_cache_alloc_trace+0xf6/0x270 intel-corei7-64 [ 108.009207] ? single_open+0x3b/0xf0 user.warn kernel[ 108.044913] ? proc_single_open+0x1b/0x20 : [ 107.519973][ 108.050304] ? do_dentry_open+0x2a6/0x6f0 WARNING: CPU: 1 PID: 331 at net[ 108.057406] ? __kasan_check_read+0x11/0x20 /sched/sch_gener[ 108.063044] ? rb_commit+0xef/0x630 ic.c:442 dev_watchdog+0x4c7/0x4d0 [ 108.070497] ? __kasan_check_read+0x11/0x20 [ 108.075391] ? ring_buffer_unlock_commit+0x102/0x210 [ 108.082301] ? trace_buffer_unlock_commit_regs+0x171/0x1d0 [ 108.090197] ? trace_event_buffer_commit+0xfb/0x3d0 [ 108.097668] ? ftrace_graph_caller+0xc0/0xc0 [ 108.102681] ? trace_event_raw_event_kmem_alloc+0x7c/0xe0 [ 108.109083] ? kasan_unpoison_shadow+0x38/0x50 [ 108.114780] __kasan_kmalloc.constprop.0+0xcf/0xe0 [ 108.121927] kasan_kmalloc+0x9/0x10 [ 108.126280] kmem_cache_alloc_trace+0xf6/0x270 [ 108.131499] ? ftrace_graph_caller+0xc0/0xc0 [ 108.136690] ? proc_cwd_link+0x140/0x140 [ 108.141797] ? ftrace_graph_caller+0xc0/0xc0 [ 108.146568] single_open+0x3b/0xf0 [ 108.151695] ? ftrace_graph_caller+0xc0/0xc0 [ 108.156386] proc_single_open+0x1b/0x20 [ 108.161083] ? ftrace_graph_caller+0xc0/0xc0 [ 108.165772] do_dentry_open+0x2a6/0x6f0 [ 108.170282] ? vfs_open+0x4a/0x60 [ 108.175328] ? sched_open+0x20/0x20 [ 108.181406] ? ftrace_graph_caller+0xc0/0xc0 [ 108.186131] vfs_open+0x58/0x60 [ 108.190568] ? ftrace_graph_caller+0xc0/0xc0 [ 108.195258] path_openat+0x153b/0x1ab0 [ 108.202219] ? __srcu_read_lock+0x50/0x50 [ 108.209143] ? vfs_mkobj+0x270/0x270 [ 108.210188] ? ftrace_graph_caller+0x81/0xc0 [ 108.210721] ? ftrace_return_to_handler+0x1a3/0x230 [ 108.212065] ? function_graph_enter+0x2d0/0x2d0 [ 108.212812] ? do_filp_open+0x91/0x1b0 [ 108.213426] ? do_filp_open+0x107/0x1b0 [ 108.214662] ? path_openat+0x5/0x1ab0 [ 108.217377] ? ftrace_graph_caller+0xc0/0xc0 [ 108.217621] do_filp_open+0x124/0x1b0 [ 108.219009] ? may_open_dev+0x50/0x50 [ 108.222371] ? prepare_ftrace_return+0x7b/0xa0 [ 108.224144] ? ftrace_graph_caller+0x81/0xc0 [ 108.224461] ? function_graph_enter+0x2d0/0x2d0 [ 108.226641] ? ftrace_return_to_handler+0x1a3/0x230 [ 108.228414] ? do_filp_open+0x5/0x1b0 [ 108.231132] ? ftrace_graph_caller+0xc0/0xc0 [ 108.231376] do_sys_openat2+0x31d/0x410 [ 108.233412] ? file_open_root+0x210/0x210 [ 108.235616] ? do_sys_openat2+0x5/0x410 [ 108.236230] ? ftrace_graph_caller+0x81/0xc0 [ 108.238435] ? ftrace_graph_caller+0xc0/0xc0 [ 108.238678] do_sys_open+0x99/0xf0 [ 108.239555] ? filp_open+0x60/0x60 [ 108.240816] ? do_sys_open+0x5/0xf0 [ 108.242372] ? do_sys_open+0x5/0xf0 [ 108.243632] ? ftrace_graph_caller+0xc0/0xc0 [ 108.243959] __x64_sys_openat+0x59/0x70 [ 108.245508] ? ftrace_graph_caller+0xc0/0xc0 [ 108.245752] do_syscall_64+0x51/0x90 [ 108.246951] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 108.247244] RIP: 0033:0x7f8b46680d20 [ 108.247512] Code: 25 00 00 41 00 3d 00 00 41 00 74 36 48 8d 05 ef eb 2c 00 8b 00 85 c0 75 5a 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 84 00 00 00 48 83 c4 68 5b 5d c3 0f 1f 44 [ 108.247777] RSP: 002b:00007ffd2d0f21d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 [ 108.248248] RAX: ffffffffffffffda RBX: 00005635a1dcc260 RCX: 00007f8b46680d20 [ 108.248466] RDX: 0000000000080000 RSI: 00007ffd2d0f2380 RDI: 00000000ffffff9c [ 108.248684] RBP: 0000000000000008 R08: 0000000000000008 R09: 0000000000000001 [ 108.248984] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f8b46cb2977 [ 108.249226] R13: 00007f8b46cb2977 R14: 0000000000000001 R15: 0000000000000000 [ 108.254504] ---[ end trace 743b6da37b9f0ee8 ]---
full test log link, https://qa-reports.linaro.org/lkft/linux-stable-rc-5.8-oe/build/v5.8.2/testr...
Reported-by: Naresh Kamboju naresh.kamboju@linaro.org
Steven Rostedt wrote:
On Wed, 19 Aug 2020 17:01:06 +0530 Naresh Kamboju naresh.kamboju@linaro.org wrote:
kernel warning noticed on x86_64 while running LTP tracing ftrace-stress-test case. started noticing on the stable-rc linux-5.8.y branch.
This device booted with KASAN config and DYNAMIC tracing configs and more. This reported issue is not easily reproducible.
metadata: git branch: linux-5.8.y git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git commit: ad8c735b1497520df959f675718f39dca8cb8019 git describe: v5.8.2 make_kernelversion: 5.8.2 kernel-config: https://builds.tuxbuild.com/bOz0eAwkcraRiWALTW9D3Q/kernel.config
[ 88.139387] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1 [ 88.139387] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1 [ 107.507991] ------------[ cut here ]------------ [ 107.513103] NETDEV WATCHDOG: eth0 (igb): transmit queue 2 timed out [ 107.519973] WARNING: CPU: 1 PID: 331 at net/sched/sch_generic.c:442 dev_watchdog+0x4c7/0x4d0 [ 107.528907] Modules linked in: x86_pkg_temp_thermal [ 107.534541] CPU: 1 PID: 331 Comm: systemd-journal Not tainted 5.8.2 #1 [ 107.541480] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.2 05/23/2018 [ 107.549314] RIP: 0010:dev_watchdog+0x4c7/0x4d0 [ 107.554226] Code: ff ff 48 8b 5d c8 c6 05 6d f7 94 01 01 48 89 df e8 9e b4 f8 ff 44 89 e9 48 89 de 48 c7 c7 20 49 51 9c 48 89 c2 e8 91 7e e9 fe <0f> 0b e9 03 ff ff ff 66 90 e8 9b 23 db fe 55 48 89 e5 41 57
I've triggered this myself in my testing, and I assumed that adding the overhead of tracing and here KASAN too, made some watchdog a bit unhappy. By commenting out the warning, I've seen no ill effects.
Perhaps this is something we need to dig a bit deeper into.
Looked into it a little, igb uses a timeout of 5 seconds, and the stack prints the warning if we haven't completed the transmit in that time.
What I don't understand in the stack trace is this:
[ 107.654661] Call Trace: [ 107.657735] <IRQ> [ 107.663155] ? ftrace_graph_caller+0xc0/0xc0 [ 107.667929] call_timer_fn+0x3b/0x1b0 [ 107.672238] ? netif_carrier_off+0x70/0x70 [ 107.677771] ? netif_carrier_off+0x70/0x70 [ 107.682656] ? ftrace_graph_caller+0xc0/0xc0 [ 107.687379] run_timer_softirq+0x3e8/0xa10 [ 107.694653] ? call_timer_fn+0x1b0/0x1b0 [ 107.699382] ? trace_event_raw_event_softirq+0xdd/0x150 [ 107.706768] ? ring_buffer_unlock_commit+0xf5/0x210 [ 107.712213] ? call_timer_fn+0x1b0/0x1b0 [ 107.716625] ? __do_softirq+0x155/0x467
If the carrier was turned off by something, that could cause the stack to timeout since it appears the driver didn't call this itself after finishing all transmits like it normally would have.
Is the trace above correct? Usually the ? indicate unsure backtrace due to missing symbols, right?
On Wed, 19 Aug 2020 10:29:09 -0700 Jesse Brandeburg jesse.brandeburg@intel.com wrote:
What I don't understand in the stack trace is this:
[ 107.654661] Call Trace: [ 107.657735] <IRQ> [ 107.663155] ? ftrace_graph_caller+0xc0/0xc0 [ 107.667929] call_timer_fn+0x3b/0x1b0 [ 107.672238] ? netif_carrier_off+0x70/0x70 [ 107.677771] ? netif_carrier_off+0x70/0x70 [ 107.682656] ? ftrace_graph_caller+0xc0/0xc0 [ 107.687379] run_timer_softirq+0x3e8/0xa10 [ 107.694653] ? call_timer_fn+0x1b0/0x1b0 [ 107.699382] ? trace_event_raw_event_softirq+0xdd/0x150 [ 107.706768] ? ring_buffer_unlock_commit+0xf5/0x210 [ 107.712213] ? call_timer_fn+0x1b0/0x1b0 [ 107.716625] ? __do_softirq+0x155/0x467
If the carrier was turned off by something, that could cause the stack to timeout since it appears the driver didn't call this itself after finishing all transmits like it normally would have.
Is the trace above correct? Usually the ? indicate unsure backtrace due to missing symbols, right?
The "?" means that there wasn't a stack frame to confirm that this was the true call stack. What happens is that the scan of the stack will look for any address in the stack that is for a function. If it finds one, it will print it and add a "?" to that address. Basically, those functions with the "?" are just addresses found in the stack but was not part of a stack frame link.
-- Steve
linux-stable-mirror@lists.linaro.org