On Thu, Nov 18, 2021 at 09:06:27AM +0100, Peter Zijlstra wrote:
> On Wed, Nov 17, 2021 at 03:50:17PM -0800, Linus Torvalds wrote:
> > I really don't think the WCHAN code should use unwinders at all. It's too damn fragile, and it's too easily triggered from user space.
>
> On x86, esp. with ORC, it pretty much has to. The thing is, the ORC unwinder has been very stable so far. I'm guessing there's some really stupid thing going on, like for example trying to unwind a freed stack.
>
> I *just* managed to reproduce, so let me go have a poke.
Confirmed, with the below it no longer reproduces. Now, let me go undo that and fix the unwinder to not explode while trying to unwind nothing.
---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 862af1db22ab..f810c5192cb9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1978,7 +1978,7 @@ unsigned long get_wchan(struct task_struct *p)
 	raw_spin_lock_irq(&p->pi_lock);
 	state = READ_ONCE(p->__state);
 	smp_rmb(); /* see try_to_wake_up() */
-	if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq)
+	if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq && !(p->flags & PF_EXITING))
 		ip = __get_wchan(p);
 	raw_spin_unlock_irq(&p->pi_lock);
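
For anyone reading along, the condition in the hunk above can be modelled in plain userspace C. This is only a sketch of the predicate, not kernel code: `struct mock_task`, `may_unwind_for_wchan()` and `check_examples()` are hypothetical names, and the constant values are illustrative stand-ins for the kernel's definitions. It shows that a sleeping task still gets unwound, while a task with PF_EXITING set is skipped, which is presumably what keeps the unwinder away from a stack that may already be gone.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins; the real definitions live in the kernel headers. */
#define TASK_RUNNING 0x0000u
#define TASK_WAKING  0x0200u
#define PF_EXITING   0x0004u

/* Hypothetical, minimal stand-in for the task_struct fields the check reads. */
struct mock_task {
	unsigned int state;	/* models p->__state */
	unsigned int flags;	/* models p->flags */
	int on_rq;		/* models p->on_rq */
};

/*
 * Mirrors the patched condition: only unwind a task that is not running,
 * not being woken, not on a runqueue, and not in the middle of exiting.
 */
static bool may_unwind_for_wchan(const struct mock_task *p)
{
	return p->state != TASK_RUNNING &&
	       p->state != TASK_WAKING &&
	       !p->on_rq &&
	       !(p->flags & PF_EXITING);
}

/* Small self-check covering the three cases discussed in the thread. */
static void check_examples(void)
{
	struct mock_task sleeping = { .state = 0x0001u, .flags = 0, .on_rq = 0 };
	struct mock_task exiting = { .state = 0x0001u, .flags = PF_EXITING, .on_rq = 0 };
	struct mock_task running = { .state = TASK_RUNNING, .flags = 0, .on_rq = 1 };

	assert(may_unwind_for_wchan(&sleeping));	/* blocked task: unwind */
	assert(!may_unwind_for_wchan(&exiting));	/* exiting task: skip */
	assert(!may_unwind_for_wchan(&running));	/* running task: skip */
}
```

The sketch deliberately leaves out the pi_lock and the smp_rmb() ordering from get_wchan(); it only captures which tasks pass the check.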