Two situations can cause a missed nocb timer rearm:
1) rdp(CPU A) queues its nocb timer. The grace period elapses before the timer get a chance to fire. The nocb_gp kthread is awaken by rdp(CPU B). The nocb_cb kthread for rdp(CPU A) is awaken and process the callbacks, again before the nocb_timer for CPU A get a chance to fire. rdp(CPU A) queues a callback and wakes up nocb_gp kthread, cancelling the pending nocb_timer without resetting the corresponding nocb_defer_wakeup.
2) The "nocb_bypass_timer" ends up calling wake_nocb_gp() which deletes the pending "nocb_timer" (note they are not the same timers) for the given rdp without resetting the matching state stored in nocb_defer wakeup.
On both situations, a future call_rcu() on that rdp may be fooled and think the timer is armed when it's not, missing a deferred nocb_gp wakeup.
Case 1) is very unlikely due to timing constraint (the timer fires after 1 jiffy) but still possible in theory. Case 2) is more likely to happen. But in any case such scenario require the CPU to spend a long time within a kernel thread without exiting to idle or user space, which is a pretty exotic behaviour.
Fix this with resetting rdp->nocb_defer_wakeup everytime we disarm the timer.
Fixes: d1b222c6be1f (rcu/nocb: Add bypass callback queueing) Cc: Stable stable@vger.kernel.org Cc: Josh Triplett josh@joshtriplett.org Cc: Lai Jiangshan jiangshanlai@gmail.com Cc: Joel Fernandes joel@joelfernandes.org Cc: Neeraj Upadhyay neeraju@codeaurora.org Cc: Boqun Feng boqun.feng@gmail.com Signed-off-by: Frederic Weisbecker frederic@kernel.org --- kernel/rcu/tree_plugin.h | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 2ec9d7f55f99..dd0dc66c282d 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1720,7 +1720,11 @@ static bool wake_nocb_gp(struct rcu_data *rdp, bool force, rcu_nocb_unlock_irqrestore(rdp, flags); return false; } - del_timer(&rdp->nocb_timer); + + if (READ_ONCE(rdp->nocb_defer_wakeup) > RCU_NOCB_WAKE_NOT) { + WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT); + del_timer(&rdp->nocb_timer); + } rcu_nocb_unlock_irqrestore(rdp, flags); raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags); if (force || READ_ONCE(rdp_gp->nocb_gp_sleep)) { @@ -2349,7 +2353,6 @@ static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp) return false; } ndw = READ_ONCE(rdp->nocb_defer_wakeup); - WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT); ret = wake_nocb_gp(rdp, ndw == RCU_NOCB_WAKE_FORCE, flags); trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DeferredWake"));