With Paul, we've been thinking that the idle loop wasn't twisted enough yet to deserve 2020.
rcutorture, after some recent parameter changes, has been complaining about a hung task.
It appears that rcu_idle_enter() may wake up a NOCB kthread but this happens after the last generic need_resched() check. Some cpuidle drivers fix it by chance but many others don't.
Here is a proposed bunch of fixes. I will need to also fix the rcu_user_enter() case, likely using irq_work, since nohz_full requires irq work to support self IPI.
Also more generally, this raise the question of local task wake_up() under disabled interrupts. When a wake up occurs in a preempt disabled section, it gets handled by the outer preempt_enable() call. There is no similar mechanism when a wake up occurs with interrupts disabled. I guess it is assumed to be handled, at worst, in the next tick. But a local irq work would provide instant preemption once IRQs are re-enabled. Of course this would only make sense in CONFIG_PREEMPTION, and when the tick is disabled...
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git sched/idle
HEAD: f2fa6e4a070c1535b9edc9ee097167fd2b15d235
Thanks, Frederic ---
Frederic Weisbecker (4): sched/idle: Fix missing need_resched() check after rcu_idle_enter() cpuidle: Fix missing need_resched() check after rcu_idle_enter() ARM: imx6q: Fix missing need_resched() check after rcu_idle_enter() ACPI: processor: Fix missing need_resched() check after rcu_idle_enter()
arch/arm/mach-imx/cpuidle-imx6q.c | 7 ++++++- drivers/acpi/processor_idle.c | 10 ++++++++-- drivers/cpuidle/cpuidle.c | 33 +++++++++++++++++++++++++-------- kernel/sched/idle.c | 18 ++++++++++++------ 4 files changed, 51 insertions(+), 17 deletions(-)
Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP kthread (rcuog) to be serviced.
Usually a wake up happening while running the idle task is spotted in one of the need_resched() checks carefully placed within the idle loop that can break to the scheduler.
Unfortunately in default_idle_call(), the call to rcu_idle_enter() is already beyond the last need_resched() check and we may halt the CPU with a resched request unhandled, leaving the task hanging.
Fix this with performing a last minute need_resched() check after calling rcu_idle_enter().
Reported-by: Paul E. McKenney paulmck@kernel.org Fixes: 96d3fd0d315a (rcu: Break call_rcu() deadlock involving scheduler and perf) Cc: stable@vger.kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnarmingo@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org --- kernel/sched/idle.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c index 305727ea0677..1af60dc50beb 100644 --- a/kernel/sched/idle.c +++ b/kernel/sched/idle.c @@ -109,15 +109,21 @@ void __cpuidle default_idle_call(void) rcu_idle_enter(); lockdep_hardirqs_on(_THIS_IP_);
- arch_cpu_idle(); + /* + * Last need_resched() check must come after rcu_idle_enter() + * which may wake up RCU internal tasks. + */ + if (!need_resched()) { + arch_cpu_idle(); + raw_local_irq_disable(); + }
/* - * OK, so IRQs are enabled here, but RCU needs them disabled to - * turn itself back on.. funny thing is that disabling IRQs - * will cause tracing, which needs RCU. Jump through hoops to - * make it 'work'. + * OK, so IRQs are enabled after arch_cpu_idle(), but RCU needs + * them disabled to turn itself back on.. funny thing is that + * disabling IRQs will cause tracing, which needs RCU. Jump through + * hoops to make it 'work'. */ - raw_local_irq_disable(); lockdep_hardirqs_off(_THIS_IP_); rcu_idle_exit(); lockdep_hardirqs_on(_THIS_IP_);
On Tue, Dec 22, 2020 at 02:37:09AM +0100, Frederic Weisbecker wrote:
Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP kthread (rcuog) to be serviced.
Usually a wake up happening while running the idle task is spotted in one of the need_resched() checks carefully placed within the idle loop that can break to the scheduler.
Unfortunately in default_idle_call(), the call to rcu_idle_enter() is already beyond the last need_resched() check and we may halt the CPU with a resched request unhandled, leaving the task hanging.
Fix this with performing a last minute need_resched() check after calling rcu_idle_enter().
Reported-by: Paul E. McKenney paulmck@kernel.org Fixes: 96d3fd0d315a (rcu: Break call_rcu() deadlock involving scheduler and perf) Cc: stable@vger.kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnarmingo@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org
Tested-by: Paul E. McKenney paulmck@kernel.org
kernel/sched/idle.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c index 305727ea0677..1af60dc50beb 100644 --- a/kernel/sched/idle.c +++ b/kernel/sched/idle.c @@ -109,15 +109,21 @@ void __cpuidle default_idle_call(void) rcu_idle_enter(); lockdep_hardirqs_on(_THIS_IP_);
arch_cpu_idle();
/*
* Last need_resched() check must come after rcu_idle_enter()
* which may wake up RCU internal tasks.
*/
if (!need_resched()) {
arch_cpu_idle();
raw_local_irq_disable();
}
/*
* OK, so IRQs are enabled here, but RCU needs them disabled to
* turn itself back on.. funny thing is that disabling IRQs
* will cause tracing, which needs RCU. Jump through hoops to
* make it 'work'.
* OK, so IRQs are enabled after arch_cpu_idle(), but RCU needs
* them disabled to turn itself back on.. funny thing is that
* disabling IRQs will cause tracing, which needs RCU. Jump through
*/* hoops to make it 'work'.
lockdep_hardirqs_off(_THIS_IP_); rcu_idle_exit(); lockdep_hardirqs_on(_THIS_IP_);raw_local_irq_disable();
-- 2.25.1
Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP kthread (rcuog) to be serviced.
Usually a wake up happening while running the idle task is spotted in one of the need_resched() checks carefully placed within the idle loop that can break to the scheduler.
Unfortunately within cpuidle the call to rcu_idle_enter() is already beyond the last generic need_resched() check. Some drivers may perform their own checks like with mwait_idle_with_hints() but many others don't and we may halt the CPU with a resched request unhandled, leaving the task hanging.
Fix this with performing a last minute need_resched() check after calling rcu_idle_enter().
Reported-by: Paul E. McKenney paulmck@kernel.org Fixes: 1098582a0f6c (sched,idle,rcu: Push rcu_idle deeper into the idle path) Cc: stable@vger.kernel.org Cc: Daniel Lezcano daniel.lezcano@linaro.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org --- drivers/cpuidle/cpuidle.c | 33 +++++++++++++++++++++++++-------- 1 file changed, 25 insertions(+), 8 deletions(-)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index ef2ea1b12cd8..4cc1ba49ce05 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -134,8 +134,8 @@ int cpuidle_find_deepest_state(struct cpuidle_driver *drv, }
#ifdef CONFIG_SUSPEND -static void enter_s2idle_proper(struct cpuidle_driver *drv, - struct cpuidle_device *dev, int index) +static int enter_s2idle_proper(struct cpuidle_driver *drv, + struct cpuidle_device *dev, int index) { ktime_t time_start, time_end; struct cpuidle_state *target_state = &drv->states[index]; @@ -151,7 +151,14 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv, stop_critical_timings(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) rcu_idle_enter(); - target_state->enter_s2idle(dev, drv, index); + /* + * Last need_resched() check must come after rcu_idle_enter() + * which may wake up RCU internal tasks. + */ + if (!need_resched()) + target_state->enter_s2idle(dev, drv, index); + else + index = -EBUSY; if (WARN_ON_ONCE(!irqs_disabled())) local_irq_disable(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) @@ -159,10 +166,13 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv, tick_unfreeze(); start_critical_timings();
- time_end = ns_to_ktime(local_clock()); + if (index > 0) { + time_end = ns_to_ktime(local_clock()); + dev->states_usage[index].s2idle_time += ktime_us_delta(time_end, time_start); + dev->states_usage[index].s2idle_usage++; + }
- dev->states_usage[index].s2idle_time += ktime_us_delta(time_end, time_start); - dev->states_usage[index].s2idle_usage++; + return index; }
/** @@ -184,7 +194,7 @@ int cpuidle_enter_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev) */ index = find_deepest_state(drv, dev, U64_MAX, 0, true); if (index > 0) { - enter_s2idle_proper(drv, dev, index); + index = enter_s2idle_proper(drv, dev, index); local_irq_enable(); } return index; @@ -234,7 +244,14 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, stop_critical_timings(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) rcu_idle_enter(); - entered_state = target_state->enter(dev, drv, index); + /* + * Last need_resched() check must come after rcu_idle_enter() + * which may wake up RCU internal tasks. + */ + if (!need_resched()) + entered_state = target_state->enter(dev, drv, index); + else + entered_state = -EBUSY; if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) rcu_idle_exit(); start_critical_timings();
22.12.2020 04:37, Frederic Weisbecker пишет:
Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP kthread (rcuog) to be serviced.
Usually a wake up happening while running the idle task is spotted in one of the need_resched() checks carefully placed within the idle loop that can break to the scheduler.
Unfortunately within cpuidle the call to rcu_idle_enter() is already beyond the last generic need_resched() check. Some drivers may perform their own checks like with mwait_idle_with_hints() but many others don't and we may halt the CPU with a resched request unhandled, leaving the task hanging.
Fix this with performing a last minute need_resched() check after calling rcu_idle_enter().
Reported-by: Paul E. McKenney paulmck@kernel.org Fixes: 1098582a0f6c (sched,idle,rcu: Push rcu_idle deeper into the idle path) Cc: stable@vger.kernel.org Cc: Daniel Lezcano daniel.lezcano@linaro.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org
drivers/cpuidle/cpuidle.c | 33 +++++++++++++++++++++++++-------- 1 file changed, 25 insertions(+), 8 deletions(-)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index ef2ea1b12cd8..4cc1ba49ce05 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -134,8 +134,8 @@ int cpuidle_find_deepest_state(struct cpuidle_driver *drv, } #ifdef CONFIG_SUSPEND -static void enter_s2idle_proper(struct cpuidle_driver *drv,
struct cpuidle_device *dev, int index)
+static int enter_s2idle_proper(struct cpuidle_driver *drv,
struct cpuidle_device *dev, int index)
{ ktime_t time_start, time_end; struct cpuidle_state *target_state = &drv->states[index]; @@ -151,7 +151,14 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv, stop_critical_timings(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) rcu_idle_enter();
- target_state->enter_s2idle(dev, drv, index);
- /*
* Last need_resched() check must come after rcu_idle_enter()
* which may wake up RCU internal tasks.
*/
- if (!need_resched())
target_state->enter_s2idle(dev, drv, index);
- else
if (WARN_ON_ONCE(!irqs_disabled())) local_irq_disable(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))index = -EBUSY;
@@ -159,10 +166,13 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv, tick_unfreeze(); start_critical_timings();
- time_end = ns_to_ktime(local_clock());
- if (index > 0) {
index=0 is valid too
time_end = ns_to_ktime(local_clock());
dev->states_usage[index].s2idle_time += ktime_us_delta(time_end, time_start);
dev->states_usage[index].s2idle_usage++;
- }
- dev->states_usage[index].s2idle_time += ktime_us_delta(time_end, time_start);
- dev->states_usage[index].s2idle_usage++;
- return index;
} /** @@ -184,7 +194,7 @@ int cpuidle_enter_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev) */ index = find_deepest_state(drv, dev, U64_MAX, 0, true); if (index > 0) {
enter_s2idle_proper(drv, dev, index);
local_irq_enable(); } return index;index = enter_s2idle_proper(drv, dev, index);
@@ -234,7 +244,14 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, stop_critical_timings(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) rcu_idle_enter();
- entered_state = target_state->enter(dev, drv, index);
- /*
* Last need_resched() check must come after rcu_idle_enter()
* which may wake up RCU internal tasks.
*/
- if (!need_resched())
entered_state = target_state->enter(dev, drv, index);
- else
if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) rcu_idle_exit(); start_critical_timings();entered_state = -EBUSY;
This patch causes a hardlock on NVIDIA Tegra using today's linux-next. Disabling coupled idling state helps. Please fix thanks in advance.
05.01.2021 20:25, Dmitry Osipenko пишет:
22.12.2020 04:37, Frederic Weisbecker пишет:
Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP kthread (rcuog) to be serviced.
Usually a wake up happening while running the idle task is spotted in one of the need_resched() checks carefully placed within the idle loop that can break to the scheduler.
Unfortunately within cpuidle the call to rcu_idle_enter() is already beyond the last generic need_resched() check. Some drivers may perform their own checks like with mwait_idle_with_hints() but many others don't and we may halt the CPU with a resched request unhandled, leaving the task hanging.
Fix this with performing a last minute need_resched() check after calling rcu_idle_enter().
Reported-by: Paul E. McKenney paulmck@kernel.org Fixes: 1098582a0f6c (sched,idle,rcu: Push rcu_idle deeper into the idle path) Cc: stable@vger.kernel.org Cc: Daniel Lezcano daniel.lezcano@linaro.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org
drivers/cpuidle/cpuidle.c | 33 +++++++++++++++++++++++++-------- 1 file changed, 25 insertions(+), 8 deletions(-)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index ef2ea1b12cd8..4cc1ba49ce05 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -134,8 +134,8 @@ int cpuidle_find_deepest_state(struct cpuidle_driver *drv, } #ifdef CONFIG_SUSPEND -static void enter_s2idle_proper(struct cpuidle_driver *drv,
struct cpuidle_device *dev, int index)
+static int enter_s2idle_proper(struct cpuidle_driver *drv,
struct cpuidle_device *dev, int index)
{ ktime_t time_start, time_end; struct cpuidle_state *target_state = &drv->states[index]; @@ -151,7 +151,14 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv, stop_critical_timings(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) rcu_idle_enter();
- target_state->enter_s2idle(dev, drv, index);
- /*
* Last need_resched() check must come after rcu_idle_enter()
* which may wake up RCU internal tasks.
*/
- if (!need_resched())
target_state->enter_s2idle(dev, drv, index);
- else
if (WARN_ON_ONCE(!irqs_disabled())) local_irq_disable(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))index = -EBUSY;
@@ -159,10 +166,13 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv, tick_unfreeze(); start_critical_timings();
- time_end = ns_to_ktime(local_clock());
- if (index > 0) {
index=0 is valid too
time_end = ns_to_ktime(local_clock());
dev->states_usage[index].s2idle_time += ktime_us_delta(time_end, time_start);
dev->states_usage[index].s2idle_usage++;
- }
- dev->states_usage[index].s2idle_time += ktime_us_delta(time_end, time_start);
- dev->states_usage[index].s2idle_usage++;
- return index;
} /** @@ -184,7 +194,7 @@ int cpuidle_enter_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev) */ index = find_deepest_state(drv, dev, U64_MAX, 0, true); if (index > 0) {
enter_s2idle_proper(drv, dev, index);
local_irq_enable(); } return index;index = enter_s2idle_proper(drv, dev, index);
@@ -234,7 +244,14 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, stop_critical_timings(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) rcu_idle_enter();
- entered_state = target_state->enter(dev, drv, index);
- /*
* Last need_resched() check must come after rcu_idle_enter()
* which may wake up RCU internal tasks.
*/
- if (!need_resched())
entered_state = target_state->enter(dev, drv, index);
- else
if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) rcu_idle_exit(); start_critical_timings();entered_state = -EBUSY;
This patch causes a hardlock on NVIDIA Tegra using today's linux-next. Disabling coupled idling state helps. Please fix thanks in advance.
This isn't a proper fix, but it works:
diff --git a/drivers/cpuidle/cpuidle-tegra.c b/drivers/cpuidle/cpuidle-tegra.c index 191966dc8d02..ecc5d9b31553 100644 --- a/drivers/cpuidle/cpuidle-tegra.c +++ b/drivers/cpuidle/cpuidle-tegra.c @@ -148,7 +148,7 @@ static int tegra_cpuidle_c7_enter(void)
static int tegra_cpuidle_coupled_barrier(struct cpuidle_device *dev) { - if (tegra_pending_sgi()) { + if (tegra_pending_sgi() || need_resched()) { /* * CPU got local interrupt that will be lost after GIC's * shutdown because GIC driver doesn't save/restore the diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 4cc1ba49ce05..2bc52ccc339b 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -248,7 +248,7 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, * Last need_resched() check must come after rcu_idle_enter() * which may wake up RCU internal tasks. */ - if (!need_resched()) + if ((target_state->flags & CPUIDLE_FLAG_COUPLED) || !need_resched()) entered_state = target_state->enter(dev, drv, index); else entered_state = -EBUSY;
On Tue, Jan 05, 2021 at 09:10:30PM +0300, Dmitry Osipenko wrote:
05.01.2021 20:25, Dmitry Osipenko пишет:
22.12.2020 04:37, Frederic Weisbecker пишет:
Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP kthread (rcuog) to be serviced.
Usually a wake up happening while running the idle task is spotted in one of the need_resched() checks carefully placed within the idle loop that can break to the scheduler.
Unfortunately within cpuidle the call to rcu_idle_enter() is already beyond the last generic need_resched() check. Some drivers may perform their own checks like with mwait_idle_with_hints() but many others don't and we may halt the CPU with a resched request unhandled, leaving the task hanging.
Fix this with performing a last minute need_resched() check after calling rcu_idle_enter().
Reported-by: Paul E. McKenney paulmck@kernel.org Fixes: 1098582a0f6c (sched,idle,rcu: Push rcu_idle deeper into the idle path) Cc: stable@vger.kernel.org Cc: Daniel Lezcano daniel.lezcano@linaro.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org
drivers/cpuidle/cpuidle.c | 33 +++++++++++++++++++++++++-------- 1 file changed, 25 insertions(+), 8 deletions(-)
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index ef2ea1b12cd8..4cc1ba49ce05 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -134,8 +134,8 @@ int cpuidle_find_deepest_state(struct cpuidle_driver *drv, } #ifdef CONFIG_SUSPEND -static void enter_s2idle_proper(struct cpuidle_driver *drv,
struct cpuidle_device *dev, int index)
+static int enter_s2idle_proper(struct cpuidle_driver *drv,
struct cpuidle_device *dev, int index)
{ ktime_t time_start, time_end; struct cpuidle_state *target_state = &drv->states[index]; @@ -151,7 +151,14 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv, stop_critical_timings(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) rcu_idle_enter();
- target_state->enter_s2idle(dev, drv, index);
- /*
* Last need_resched() check must come after rcu_idle_enter()
* which may wake up RCU internal tasks.
*/
- if (!need_resched())
target_state->enter_s2idle(dev, drv, index);
- else
if (WARN_ON_ONCE(!irqs_disabled())) local_irq_disable(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))index = -EBUSY;
@@ -159,10 +166,13 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv, tick_unfreeze(); start_critical_timings();
- time_end = ns_to_ktime(local_clock());
- if (index > 0) {
index=0 is valid too
time_end = ns_to_ktime(local_clock());
dev->states_usage[index].s2idle_time += ktime_us_delta(time_end, time_start);
dev->states_usage[index].s2idle_usage++;
- }
- dev->states_usage[index].s2idle_time += ktime_us_delta(time_end, time_start);
- dev->states_usage[index].s2idle_usage++;
- return index;
} /** @@ -184,7 +194,7 @@ int cpuidle_enter_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev) */ index = find_deepest_state(drv, dev, U64_MAX, 0, true); if (index > 0) {
enter_s2idle_proper(drv, dev, index);
local_irq_enable(); } return index;index = enter_s2idle_proper(drv, dev, index);
@@ -234,7 +244,14 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, stop_critical_timings(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) rcu_idle_enter();
- entered_state = target_state->enter(dev, drv, index);
- /*
* Last need_resched() check must come after rcu_idle_enter()
* which may wake up RCU internal tasks.
*/
- if (!need_resched())
entered_state = target_state->enter(dev, drv, index);
- else
if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) rcu_idle_exit(); start_critical_timings();entered_state = -EBUSY;
This patch causes a hardlock on NVIDIA Tegra using today's linux-next. Disabling coupled idling state helps. Please fix thanks in advance.
This isn't a proper fix, but it works:
There is some ongoing discussion about what an overall proper fix might look like, so in the meantime I am folding you changes below into Frederic's original. ;-)
Thanx, Paul
diff --git a/drivers/cpuidle/cpuidle-tegra.c b/drivers/cpuidle/cpuidle-tegra.c index 191966dc8d02..ecc5d9b31553 100644 --- a/drivers/cpuidle/cpuidle-tegra.c +++ b/drivers/cpuidle/cpuidle-tegra.c @@ -148,7 +148,7 @@ static int tegra_cpuidle_c7_enter(void)
static int tegra_cpuidle_coupled_barrier(struct cpuidle_device *dev) {
- if (tegra_pending_sgi()) {
- if (tegra_pending_sgi() || need_resched()) { /*
- CPU got local interrupt that will be lost after GIC's
- shutdown because GIC driver doesn't save/restore the
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 4cc1ba49ce05..2bc52ccc339b 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -248,7 +248,7 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv, * Last need_resched() check must come after rcu_idle_enter() * which may wake up RCU internal tasks. */
- if (!need_resched())
- if ((target_state->flags & CPUIDLE_FLAG_COUPLED) || !need_resched()) entered_state = target_state->enter(dev, drv, index); else entered_state = -EBUSY;
Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP kthread (rcuog) to be serviced.
Usually a wake up happening while running the idle task is spotted in one of the need_resched() checks carefully placed within the idle loop that can break to the scheduler.
Unfortunately imx6q_enter_wait() is beyond the last generic need_resched() check and it performs a wfi right away after the call to rcu_idle_enter(). We may halt the CPU with a resched request unhandled, leaving the task hanging.
Fix this with performing a last minute need_resched() check after calling rcu_idle_enter().
Reported-by: Paul E. McKenney paulmck@kernel.org Fixes: 1a67b9263e06 (ARM: imx6q: Fixup RCU usage for cpuidle) Cc: stable@vger.kernel.org Cc: Shawn Guo shawnguo@kernel.org Cc: Sascha Hauer s.hauer@pengutronix.de Cc: Pengutronix Kernel Team kernel@pengutronix.de Cc: Fabio Estevam festevam@gmail.com Cc: NXP Linux Team linux-imx@nxp.com Cc: Peter Zijlstra peterz@infradead.org Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org --- arch/arm/mach-imx/cpuidle-imx6q.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mach-imx/cpuidle-imx6q.c b/arch/arm/mach-imx/cpuidle-imx6q.c index 094337dc1bc7..31a60d257d3d 100644 --- a/arch/arm/mach-imx/cpuidle-imx6q.c +++ b/arch/arm/mach-imx/cpuidle-imx6q.c @@ -25,7 +25,12 @@ static int imx6q_enter_wait(struct cpuidle_device *dev, raw_spin_unlock(&cpuidle_lock);
rcu_idle_enter(); - cpu_do_idle(); + /* + * Last need_resched() check must come after rcu_idle_enter() + * which may wake up RCU internal tasks. + */ + if (!need_resched()) + cpu_do_idle(); rcu_idle_exit();
raw_spin_lock(&cpuidle_lock);
Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP kthread (rcuog) to be serviced.
Usually a wake up happening while running the idle task is spotted in one of the need_resched() checks carefully placed within the idle loop that can break to the scheduler.
Unfortunately within acpi_idle_enter_bm() the call to rcu_idle_enter() is already beyond the last generic need_resched() check. The cpu idle implementation happens to be ok because it ends up calling mwait_idle_with_hints() or acpi_safe_halt() which both perform their own need_resched() checks. But the suspend to idle implementation doesn't so it may suspend the CPU with a resched request unhandled, leaving the task hanging.
Fix this with performing a last minute need_resched() check after calling rcu_idle_enter().
Reported-by: Paul E. McKenney paulmck@kernel.org Fixes: 1fecfdbb7acc (ACPI: processor: Take over RCU-idle for C3-BM idle) Cc: stable@vger.kernel.org Cc: Len Brown lenb@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnarmingo@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org --- drivers/acpi/processor_idle.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index d93e400940a3..c4939c49d972 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -604,8 +604,14 @@ static int acpi_idle_enter_bm(struct cpuidle_driver *drv, }
rcu_idle_enter(); - - acpi_idle_do_entry(cx); + /* + * Last need_resched() check must come after rcu_idle_enter() + * which may wake up RCU internal tasks. mwait_idle_with_hints() + * and acpi_safe_halt() have their own checks but s2idle + * implementation doesn't. + */ + if (!need_resched()) + acpi_idle_do_entry(cx);
rcu_idle_exit();
On 12/22/2020 2:37 AM, Frederic Weisbecker wrote:
With Paul, we've been thinking that the idle loop wasn't twisted enough yet to deserve 2020.
rcutorture, after some recent parameter changes, has been complaining about a hung task.
It appears that rcu_idle_enter() may wake up a NOCB kthread but this happens after the last generic need_resched() check. Some cpuidle drivers fix it by chance but many others don't.
Here is a proposed bunch of fixes. I will need to also fix the rcu_user_enter() case, likely using irq_work, since nohz_full requires irq work to support self IPI.
Also more generally, this raise the question of local task wake_up() under disabled interrupts. When a wake up occurs in a preempt disabled section, it gets handled by the outer preempt_enable() call. There is no similar mechanism when a wake up occurs with interrupts disabled. I guess it is assumed to be handled, at worst, in the next tick. But a local irq work would provide instant preemption once IRQs are re-enabled. Of course this would only make sense in CONFIG_PREEMPTION, and when the tick is disabled...
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git sched/idle
HEAD: f2fa6e4a070c1535b9edc9ee097167fd2b15d235
Thanks, Frederic
Frederic Weisbecker (4): sched/idle: Fix missing need_resched() check after rcu_idle_enter() cpuidle: Fix missing need_resched() check after rcu_idle_enter() ARM: imx6q: Fix missing need_resched() check after rcu_idle_enter() ACPI: processor: Fix missing need_resched() check after rcu_idle_enter()
arch/arm/mach-imx/cpuidle-imx6q.c | 7 ++++++- drivers/acpi/processor_idle.c | 10 ++++++++-- drivers/cpuidle/cpuidle.c | 33 +++++++++++++++++++++++++-------- kernel/sched/idle.c | 18 ++++++++++++------ 4 files changed, 51 insertions(+), 17 deletions(-)
Please feel free to add
Reviewed-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
to all patches in the series.
Thanks!
On Tue, Dec 22, 2020 at 05:19:51PM +0100, Rafael J. Wysocki wrote:
On 12/22/2020 2:37 AM, Frederic Weisbecker wrote:
With Paul, we've been thinking that the idle loop wasn't twisted enough yet to deserve 2020.
rcutorture, after some recent parameter changes, has been complaining about a hung task.
It appears that rcu_idle_enter() may wake up a NOCB kthread but this happens after the last generic need_resched() check. Some cpuidle drivers fix it by chance but many others don't.
Here is a proposed bunch of fixes. I will need to also fix the rcu_user_enter() case, likely using irq_work, since nohz_full requires irq work to support self IPI.
Also more generally, this raise the question of local task wake_up() under disabled interrupts. When a wake up occurs in a preempt disabled section, it gets handled by the outer preempt_enable() call. There is no similar mechanism when a wake up occurs with interrupts disabled. I guess it is assumed to be handled, at worst, in the next tick. But a local irq work would provide instant preemption once IRQs are re-enabled. Of course this would only make sense in CONFIG_PREEMPTION, and when the tick is disabled...
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git sched/idle
HEAD: f2fa6e4a070c1535b9edc9ee097167fd2b15d235
Thanks, Frederic
Frederic Weisbecker (4): sched/idle: Fix missing need_resched() check after rcu_idle_enter() cpuidle: Fix missing need_resched() check after rcu_idle_enter() ARM: imx6q: Fix missing need_resched() check after rcu_idle_enter() ACPI: processor: Fix missing need_resched() check after rcu_idle_enter()
arch/arm/mach-imx/cpuidle-imx6q.c | 7 ++++++- drivers/acpi/processor_idle.c | 10 ++++++++-- drivers/cpuidle/cpuidle.c | 33 +++++++++++++++++++++++++-------- kernel/sched/idle.c | 18 ++++++++++++------ 4 files changed, 51 insertions(+), 17 deletions(-)
Please feel free to add
Reviewed-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
to all patches in the series.
I would guess that they will take some other path to mainline, but I have queued these to cut down on rcutorture's whining. ;-)
Thanx, Paul
linux-stable-mirror@lists.linaro.org