On 7/31/23 09:14, Peter Zijlstra wrote:
On Mon, Jul 31, 2023 at 05:08:29PM +0100, Roy Hopkins wrote:
On Mon, 2023-07-31 at 16:52 +0200, Peter Zijlstra wrote:
On Mon, Jul 31, 2023 at 07:48:19AM -0700, Guenter Roeck wrote:
I've taken your config above, and the rootfs.ext2 and run-sh from x86/. I've then modified run-sh to use:
qemu-system-x86_64 -enable-kvm -cpu host
What I'm seeing is that some boots get stuck at:
[ 0.608230] Running RCU-tasks wait API self tests
Is this the right 'problem' ?
Yes, exactly.
Excellent! Let me prod that with something sharp, see what comes creeping out.
In an effort to get up to speed with this area of the kernel, I've been playing around with this too today and managed to reproduce the problem using the same configuration. I'm completely new to this code but I think I may have found the root of the problem.
What I've found is that there is a race condition between starting the RCU tasks grace-period thread in rcu_spawn_tasks_kthread_generic() and a subsequent call to synchronize_rcu_tasks_generic(). This results in rtp->tasks_gp_mutex being locked in the initial thread which subsequently blocks the newly started grace-period thread.
The problem is that although synchronize_rcu_tasks_generic() checks to see if the grace-period kthread is running, it uses rtp->kthread_ptr to achieve this. This is only set in the thread entry point and not when the thread is created, meaning that it is set only after the creating thread yields or is preempted. If this has not happened before the next call to synchronize_rcu_tasks_generic() then a deadlock occurs.
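For illustration only, the window described above can be modelled in userspace with pthreads. This is a sketch under stated assumptions, not kernel code: struct model_rtp, model_worker() and model_creator_sees_null() are hypothetical names, and the "gate" mutex merely stands in for "the new kthread has not been scheduled yet". The point is that the pointer is published by the worker itself, so the creator can still read NULL after the spawn call has returned.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct rcu_tasks; not kernel code. */
struct model_rtp {
	_Atomic(void *) kthread_ptr;	/* published by the worker, not the creator */
	pthread_mutex_t gate;		/* models "worker has not been scheduled yet" */
};

static void *model_worker(void *arg)
{
	struct model_rtp *rtp = arg;

	/* Block until the creator releases the gate, i.e. until we "get the CPU". */
	pthread_mutex_lock(&rtp->gate);
	pthread_mutex_unlock(&rtp->gate);

	/* Only now, at the thread entry point, is the pointer published. */
	atomic_store_explicit(&rtp->kthread_ptr, (void *)rtp, memory_order_release);
	return NULL;
}

/* Returns true if the creator read NULL after the spawn call had returned. */
static bool model_creator_sees_null(void)
{
	struct model_rtp rtp;
	pthread_t t;
	bool saw_null;
	int rc;

	atomic_init(&rtp.kthread_ptr, NULL);
	pthread_mutex_init(&rtp.gate, NULL);
	pthread_mutex_lock(&rtp.gate);		/* hold the worker back */

	rc = pthread_create(&t, NULL, model_worker, &rtp);
	assert(rc == 0);
	(void)rc;

	/* The thread exists, but it has not published itself yet. */
	saw_null = atomic_load_explicit(&rtp.kthread_ptr,
					memory_order_acquire) == NULL;

	pthread_mutex_unlock(&rtp.gate);	/* let the worker run */
	pthread_join(t, NULL);

	/* After the worker has run, the pointer is visible. */
	assert(atomic_load(&rtp.kthread_ptr) == (void *)&rtp);
	pthread_mutex_destroy(&rtp.gate);
	return saw_null;
}
```

In this model the gate makes the window deterministic. In the kernel the same window is only open until the creating thread yields or is preempted, which would explain why only some boots get stuck.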
I've created a debug patch that introduces a new flag in rcu_tasks that is set when the kthread is created and used this in synchronize_rcu_tasks_generic() in place of READ_ONCE(rtp->kthread_ptr). This fixes the issue in my test environment.
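The idea behind such a flag can be sketched as a small single-threaded model. Everything here is an assumption for illustration: the field name kthread_created, the MODEL_* path names and the helper functions are hypothetical and are not taken from the debug patch or from kernel/rcu/tasks.h. The model only compares which path a caller would pick in the state right after the spawn call returns, before the new thread has run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; plain fields stand in for READ_ONCE() accesses. */
struct model_rtp {
	void *kthread_ptr;	/* set by the worker at its entry point */
	bool kthread_created;	/* hypothetical flag, set by the creator */
};

enum model_path {
	MODEL_RUN_GP_INLINE,	/* caller drives the grace period itself (takes the mutex) */
	MODEL_WAIT_FOR_KTHREAD	/* caller hands the work to the grace-period thread */
};

/* Decision keyed on the pointer that the worker publishes. */
static enum model_path model_path_by_ptr(const struct model_rtp *rtp)
{
	return rtp->kthread_ptr ? MODEL_WAIT_FOR_KTHREAD : MODEL_RUN_GP_INLINE;
}

/* Decision keyed on the flag that the creator sets. */
static enum model_path model_path_by_flag(const struct model_rtp *rtp)
{
	return rtp->kthread_created ? MODEL_WAIT_FOR_KTHREAD : MODEL_RUN_GP_INLINE;
}

/* State right after the spawn call returns, before the worker has run. */
static struct model_rtp model_state_after_spawn(void)
{
	struct model_rtp rtp = { .kthread_ptr = NULL, .kthread_created = true };

	return rtp;
}

/* Small self-check: the two decisions disagree only inside the window. */
static void model_check(void)
{
	struct model_rtp rtp = model_state_after_spawn();

	assert(model_path_by_ptr(&rtp) == MODEL_RUN_GP_INLINE);
	assert(model_path_by_flag(&rtp) == MODEL_WAIT_FOR_KTHREAD);
}
```

Keyed on the pointer, the caller in this model still takes the inline path even though a grace-period thread already exists. Keyed on the flag, it defers to that thread.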
I'm happy to have a go at submitting a patch for this if it helps.
Ha!, I was poking around the same thing. My hack below seems to (so far, <20 boots) help things.
So, dumb question: How comes this bisects to "sched/fair: Remove sched_feat(START_DEBIT)" ?
Thanks, Guenter
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 56c470a489c8..b083b5a30025 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -652,7 +658,11 @@ static void __init rcu_spawn_tasks_kthread_generic(struct rcu_tasks *rtp)
 	t = kthread_run(rcu_tasks_kthread, rtp, "%s_kthread", rtp->kname);
 	if (WARN_ONCE(IS_ERR(t), "%s: Could not start %s grace-period kthread, OOM is now expected behavior\n", __func__, rtp->name))
 		return;
-	smp_mb(); /* Ensure others see full kthread. */
+	for (;;) {
+		cond_resched();
+		if (smp_load_acquire(&rtp->kthread_ptr))
+			break;
+	}
 }
 
 #ifndef CONFIG_TINY_RCU
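
For readers who want to try the wait-until-published pattern outside the kernel, here is a userspace sketch of the same idea. It is a model under stated assumptions, not the patch itself: sched_yield() stands in for cond_resched(), C11 acquire/release atomics stand in for smp_load_acquire() and the matching store, and all model_* names are hypothetical. The creator does not return until the worker has published its pointer, so later callers cannot observe a created but unpublished thread.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace model of the published pointer. */
static _Atomic(void *) model_kthread_ptr;
static int model_payload;	/* state written before the pointer is published */

static void *model_gp_worker(void *arg)
{
	model_payload = 42;
	/* Release store: the payload write is ordered before the publication. */
	atomic_store_explicit(&model_kthread_ptr, arg, memory_order_release);
	return NULL;
}

/* Spawns the worker and returns the payload observed after the wait loop. */
static int model_spawn_and_wait(void)
{
	pthread_t t;
	int seen;

	model_payload = 0;
	atomic_store(&model_kthread_ptr, NULL);

	if (pthread_create(&t, NULL, model_gp_worker, &model_payload) != 0)
		return -1;

	/* Yield until the worker has published itself. */
	for (;;) {
		sched_yield();
		if (atomic_load_explicit(&model_kthread_ptr, memory_order_acquire))
			break;
	}

	/* The acquire load pairs with the release store, so this read is safe. */
	seen = model_payload;

	pthread_join(t, NULL);
	assert(atomic_load(&model_kthread_ptr) == (void *)&model_payload);
	return seen;
}
```

Calling model_spawn_and_wait() returns the value the worker wrote before publishing, because the loop cannot exit before the release store has been observed.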