Dear stable maintainers,

I would like to report an oops I encountered and request that the patch below be backported to v5.15. The fix is important for avoiding recurring oopses in the context of RCU-detected stalls.
Subject: rcu: Avoid tracing a few functions executed in stop machine
Commit: 48f8070f5dd8
Target kernel version: v5.15
Reason for application: To avoid an oops due to rcu_preempt detected stalls on CPUs/tasks
Environment and oops context: The issue was observed in my environment on a 5.15.193 kernel (ARM platform). The patch helps avoid the oops shown below, at the points marked [1] and [2].
Log:
root@ls1021atwr:~# uname -r
5.15.93-rt58+ge0f69a158d5b
Oops stack dump:
** ID_531 main/smp_fsm.c:1884 <inrcu: INFO: rcu_preempt detected stalls on CPUs/tasks: <<< [1]
rcu: Tasks blocked on level-0 rcu_node (CPUs 0-1): P116/2:b..l
(detected by 1, t=2102 jiffies, g=12741, q=1154)
task:irq/31-arm-irq1 state:D stack: 0 pid: 116 ppid: 2 flags:0x00000000
[<8064b97f>] (__schedule) from [<8064bb01>] (schedule+0x8d/0xc2)
[<8064bb01>] (schedule) from [<8064fa65>] (schedule_timeout+0x6d/0xa0)
[<8064fa65>] (schedule_timeout) from [<804ba353>] (fsl_ifc_run_command+0x6f/0x178)
[<804ba353>] (fsl_ifc_run_command) from [<804ba72f>] (fsl_ifc_cmdfunc+0x203/0x2b8)
[<804ba72f>] (fsl_ifc_cmdfunc) from [<804b135f>] (nand_status_op+0xaf/0xe0)
[<804b135f>] (nand_status_op) from [<804b13b3>] (nand_check_wp+0x23/0x48)
.... < snipped >
Exception stack(0x822bbfb0 to 0x822bbff8)
bfa0: 00000000 00000000 00000000 00000000
bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
rcu: rcu_preempt kthread timer wakeup didn't happen for 764 jiffies! g12741 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x1000
rcu: Possible timer handling issue on cpu=0 timer-softirq=1095
rcu: rcu_preempt kthread starved for 765 jiffies! g12741 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x1000 ->cpu=0 <<< [2]
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt state:D stack: 0 pid: 13 ppid: 2 flags:0x00000000
[<8064b97f>] (__schedule) from [<8064ba03>] (schedule_rtlock+0x1b/0x2e)
[<8064ba03>] (schedule_rtlock) from [<8064ea6f>] (rtlock_slowlock_locked+0x93/0x108)
[<8064ea6f>] (rtlock_slowlock_locked) from [<8064eb1b>] (rt_spin_lock+0x37/0x4a)
[<8064eb1b>] (rt_spin_lock) from [<8021b723>] (__local_bh_disable_ip+0x6b/0x110)
[<8021b723>] (__local_bh_disable_ip) from [<8025a90f>] (del_timer_sync+0x7f/0xe0)
[<8025a90f>] (del_timer_sync) from [<8064fa6b>] (schedule_timeout+0x73/0xa0)
[<8064fa6b>] (schedule_timeout) from [<80254677>] (rcu_gp_fqs_loop+0x8b/0x1bc)
[<80254677>] (rcu_gp_fqs_loop) from [<8025483f>] (rcu_gp_kthread+0x97/0xbc)
[<8025483f>] (rcu_gp_kthread) from [<8022ca67>] (kthread+0xcf/0xe4)
[<8022ca67>] (kthread) from [<80200149>] (ret_from_fork+0x11/0x28)
Exception stack(0x820fffb0 to 0x820ffff8)
ffa0: 00000000 00000000 00000000 00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
rcu: Stack dump where RCU GP kthread last ran: <<
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
< .. >
Thank you for your time and consideration. Please let me know if you require any additional information.
Best regards,
Ronald Monthero