On Tue, Jan 05, 2021 at 03:25:10PM -0800, Paul E. McKenney wrote:
On Tue, Jan 05, 2021 at 10:55:03AM +0100, Peter Zijlstra wrote:
On Mon, Jan 04, 2021 at 04:20:55PM +0100, Frederic Weisbecker wrote:
Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP kthread (rcuog) to be serviced.
Usually a wake up happening while running the idle task is spotted in one of the need_resched() checks carefully placed within the idle loop that can break to the scheduler.
Urgh, this is horrific and fragile :/ You having had to audit and fix a number of rcu_idle_enter() callers should've made you realize that making rcu_idle_enter() return something would've been saner.
Also, I might hope that when RCU does do that wakeup, it will not have put RCU in idle mode? So it is a natural 'fail' state for rcu_idle_enter(), *sigh* it continues to put RCU to sleep, so that needs fixing too.
It depends on what is being awakened. For example, the nocb rcuog and rcuoc kthreads might be well on some other CPU, so RCU might need the wakeup to happen, but might also need to go completely to sleep on this CPU.
But yes, if the wakeup needs to be on the current CPU, then idle must be exited and RCU needs to again be watching. However, RCU has no idea what CPU the to-be-awakened kthread will be running on. And even if it were to know at the time it does the wakeup, that kthread's location might well have changed by the time the current CPU enters idle.
A simple check for need_resched() would do the trick. Sure that could also catch other sources of wake up that would have been otherwise handled once IRQs get re-enabled but that's not a problem.
I'm thinking that rcu_user_enter() will have the exact same problem? Did you audit that?
Something like the below, combined with a fixup for all callers (which the compiler will help us find thanks to __must_check).
Looks at least somewhat plausible at first glance.
Though given the above, it is possible (likely, even) that rcu_user_enter() returns true, but that this CPU still needs to enter idle. So isn't a subsequent check of need_resched() or friends still required? Or is your point that this will happen automatically upon exit from the idle loop?
Exactly, upon "wake_up_process(rdp_gp->nocb_gp_kthread)", we just need to make sure that need_resched() is set before returning false, namely:
@@ -644,7 +644,14 @@ static noinstr void rcu_eqs_enter(bool user) trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&rdp->dynticks)); WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); rdp = this_cpu_ptr(&rcu_data);
- do_nocb_deferred_wakeup(rdp);
- if (do_nocb_deferred_wakeup(rdp)) {
/*
* We did the wakeup, don't enter EQS, we'll need to abort idle
* and schedule.
*/
return false;
Right here.
But still I think we should decouple the wake up from rcu_eqs_enter().
And have:
rcu_eqs_enter_prepare(): does the deferred wakeup and forbid from calling call_rcu() from here.
rcu_eqs_enter(): enter RCU extended quiescent state
This way we can more easily fix the rcu_user_enter() case as it happens past the last scheduler entrypoint before returning to userspace.
Thanks.