From: Jann Horn <jannh@google.com>
[ Upstream commit 586b58cac8b4683eb58a1446fbc399de18974e40 ]
With CONFIG_DEBUG_ATOMIC_SLEEP=y and CONFIG_CGROUPS=y, kernel oopses in non-preemptible context look untidy; after the main oops, the kernel prints a "sleeping function called from invalid context" report because exit_signals() -> cgroup_threadgroup_change_begin() -> percpu_down_read() can sleep, and that happens before the preempt_count_set(PREEMPT_ENABLED) fixup.
It looks like the same thing applies to profile_task_exit() and kcov_task_exit().
Fix it by moving the preemption fixup up and the calls to profile_task_exit() and kcov_task_exit() down.
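
The effect of the reordering can be sketched with a small userspace model. This is only an illustration, not kernel code: preempt_count_model, might_sleep_model(), fixup_preempt_model(), exit_old_order() and exit_new_order() are hypothetical stand-ins for the preempt count, the debug check done by functions that can sleep, and the fixup block in do_exit().

```c
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of this is kernel code. */
#define PREEMPT_ENABLED 0

static int preempt_count_model;	/* models the exiting task's preempt count */
static int sleep_warnings;	/* number of "invalid context" reports so far */

/* Models the check a sleeping function performs under DEBUG_ATOMIC_SLEEP. */
static void might_sleep_model(const char *who)
{
	if (preempt_count_model != PREEMPT_ENABLED) {
		sleep_warnings++;
		printf("BUG: sleeping function called from invalid context (%s)\n",
		       who);
	}
}

/* Models the fixup block that this patch moves earlier in do_exit(). */
static void fixup_preempt_model(void)
{
	if (preempt_count_model != PREEMPT_ENABLED) {
		printf("note: exited with preempt_count %d\n",
		       preempt_count_model);
		preempt_count_model = PREEMPT_ENABLED;
	}
}

/* Old order: calls that can sleep run before the fixup. */
static int exit_old_order(int count)
{
	preempt_count_model = count;
	sleep_warnings = 0;
	might_sleep_model("profile_task_exit");
	might_sleep_model("exit_signals");
	fixup_preempt_model();
	return sleep_warnings;
}

/* New order: fixup first, then the calls that can sleep. */
static int exit_new_order(int count)
{
	preempt_count_model = count;
	sleep_warnings = 0;
	fixup_preempt_model();
	might_sleep_model("profile_task_exit");
	might_sleep_model("exit_signals");
	return sleep_warnings;
}
```

In this model, entering with a non-zero count makes the old order emit one report per sleeping call, while the new order emits only the "note:" line.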
Fixes: 1dc0fffc48af ("sched/core: Robustify preemption leak checks")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200305220657.46800-1-jannh@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/exit.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/kernel/exit.c b/kernel/exit.c
index 54c3269b8dda..9c76bacb043d 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -772,8 +772,12 @@ void __noreturn do_exit(long code)
 	struct task_struct *tsk = current;
 	int group_dead;
 
-	profile_task_exit(tsk);
-	kcov_task_exit(tsk);
+	/*
+	 * We can get here from a kernel oops, sometimes with preemption off.
+	 * Start by checking for critical errors.
+	 * Then fix up important state like USER_DS and preemption.
+	 * Then do everything else.
+	 */
 
 	WARN_ON(blk_needs_flush_plug(tsk));
 
@@ -791,6 +795,16 @@ void __noreturn do_exit(long code)
 	 */
 	set_fs(USER_DS);
 
+	if (unlikely(in_atomic())) {
+		pr_info("note: %s[%d] exited with preempt_count %d\n",
+			current->comm, task_pid_nr(current),
+			preempt_count());
+		preempt_count_set(PREEMPT_ENABLED);
+	}
+
+	profile_task_exit(tsk);
+	kcov_task_exit(tsk);
+
 	ptrace_event(PTRACE_EVENT_EXIT, code);
 
 	validate_creds_for_do_exit(tsk);
@@ -828,13 +842,6 @@ void __noreturn do_exit(long code)
 	raw_spin_lock_irq(&tsk->pi_lock);
 	raw_spin_unlock_irq(&tsk->pi_lock);
 
-	if (unlikely(in_atomic())) {
-		pr_info("note: %s[%d] exited with preempt_count %d\n",
-			current->comm, task_pid_nr(current),
-			preempt_count());
-		preempt_count_set(PREEMPT_ENABLED);
-	}
-
 	/* sync mm's RSS info before statistics gathering */
 	if (tsk->mm)
 		sync_mm_rss(tsk->mm);