On Fri, May 06 2022 at 09:15, Eric W. Biederman wrote:
 * the init task will end up wanting to create kthreads, which, if
 * we schedule it before we create kthreadd, will OOPS.
 */
- pid = kernel_thread(kernel_init, NULL, CLONE_FS);
+ pid = user_mode_thread(kernel_init, NULL, CLONE_FS);
So init does not have PF_KTHREAD set anymore, which causes this to go sideways with a NULL pointer dereference in get_mm_counter() on next:
 get_mm_counter include/linux/mm.h:1996 [inline]
 get_mm_rss include/linux/mm.h:2049 [inline]
 task_nr_scan_windows.isra.0+0x23/0x120 kernel/sched/fair.c:1123
 task_scan_min kernel/sched/fair.c:1144 [inline]
 task_scan_start+0x6c/0x400 kernel/sched/fair.c:1150
 task_tick_numa kernel/sched/fair.c:2944 [inline]
 task_tick_fair+0xaeb/0xef0 kernel/sched/fair.c:11186
 scheduler_tick+0x20a/0x5e0 kernel/sched/core.c:5380
https://lore.kernel.org/lkml/0000000000008a9fbb05dea76400@google.com
because the fence in task_tick_numa():
	if ((curr->flags & (PF_EXITING | PF_KTHREAD)) || work->next != work)
		return;
is no longer sufficient. It also needs to bail out if !curr->mm.
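
A minimal userspace sketch of the extended check, assuming mock stand-ins for the task, the callback head and the PF_* flags. All *_mock names and the flag values are hypothetical and exist only for illustration; this is not the actual kernel patch and it is untested against a real kernel:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock flag values for illustration only; not taken from kernel headers. */
#define PF_EXITING_MOCK 0x1u
#define PF_KTHREAD_MOCK 0x2u

/* Hypothetical stand-in for an address space descriptor. */
struct mm_mock {
	unsigned long rss;
};

/* Hypothetical stand-in for the callback head; next == self means idle. */
struct work_mock {
	struct work_mock *next;
};

/* Hypothetical stand-in for the fields of the task used by the fence. */
struct task_mock {
	unsigned int flags;
	struct mm_mock *mm;
};

/* Returns true when the NUMA tick work should be skipped for this task. */
static bool numa_tick_should_skip(const struct task_mock *curr,
				  const struct work_mock *work)
{
	assert(curr && work);

	/* Added condition: a task without an mm has nothing to scan. */
	if (!curr->mm)
		return true;

	/* Existing fence: exiting tasks, kthreads, or work already queued. */
	return (curr->flags & (PF_EXITING_MOCK | PF_KTHREAD_MOCK)) ||
	       work->next != work;
}
```

The point of the sketch is only that the mm check is independent of the flag check: a task can lack PF_KTHREAD and still have no mm, which is the state init is in after the change above.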
I'm worried that there are more of these issues lurking. Haven't looked yet.
Thanks,
tglx