From: Ville Syrjälä ville.syrjala@linux.intel.com
Since 4.20-rc1 my PIC machines no longer reboot/shutdown. I bisected this down to commit 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds").
I traced the hang into -> cpufreq_suspend() -> cpufreq_stop_governor() -> cpufreq_dbs_governor_stop() -> gov_clear_update_util() -> synchronize_sched() -> synchronize_rcu()
Only PREEMPT=y is affected for obvious reasons. The problem is limited to PIC machines since they mask off interrupts in i8259A_shutdown() (syscore_ops.shutdown() registered from device_initcall()).
I reported this long ago but no better fix has surfaced, hence sending out my initial workaround which I've been carrying around ever since. I just move cpufreq_core_init() to late_initcall() so the syscore_ops get registered in the oppsite order and thus the .shutdown() hooks get executed in the opposite order as well. Not 100% convinced this is safe (especially moving the cpufreq_global_kobject creation to late_initcall()) but I've not had any problems with it at least.
Here's the resulting change in initcall_debug: + PM: Calling cpufreq_suspend+0x0/0x100 PM: Calling mce_syscore_shutdown+0x0/0x10 PM: Calling i8259A_shutdown+0x0/0x10 - PM: Calling cpufreq_suspend+0x0/0x100 + reboot: Restarting system + reboot: machine restart
Cc: stable@vger.kernel.org Cc: Paul E. McKenney paulmck@linux.vnet.ibm.com Cc: Andi Kleen ak@linux.intel.com Cc: "Rafael J. Wysocki" rjw@rjwysocki.net Cc: Viresh Kumar viresh.kumar@linaro.org Cc: linux-pm@vger.kernel.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: "H. Peter Anvin" hpa@zytor.com Fixes: 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds") Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/cpufreq/cpufreq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index c52d6fa32aac..6a8fb9b08e33 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -2761,4 +2761,4 @@ static int __init cpufreq_core_init(void) return 0; } module_param(off, int, 0444); -core_initcall(cpufreq_core_init); +late_initcall(cpufreq_core_init);
On Thursday, October 3, 2019 4:08:28 PM CEST Ville Syrjala wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
Since 4.20-rc1 my PIC machines no longer reboot/shutdown. I bisected this down to commit 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds").
I traced the hang into -> cpufreq_suspend() -> cpufreq_stop_governor() -> cpufreq_dbs_governor_stop() -> gov_clear_update_util() -> synchronize_sched() -> synchronize_rcu()
Only PREEMPT=y is affected for obvious reasons. The problem is limited to PIC machines since they mask off interrupts in i8259A_shutdown() (syscore_ops.shutdown() registered from device_initcall()).
Let me treat this as a fresh bug report. :-)
I reported this long ago but no better fix has surfaced,
So I don't recall seeing the original report or if I did, I had not understood the problem then.
hence sending out my initial workaround which I've been carrying around ever since. I just move cpufreq_core_init() to late_initcall() so the syscore_ops get registered in the oppsite order and thus the .shutdown() hooks get executed in the opposite order as well. Not 100% convinced this is safe (especially moving the cpufreq_global_kobject creation to late_initcall()) but I've not had any problems with it at least.
The problem is a bug in cpufreq that shouldn't point its syscore shutdown callback pointer to cpufreq_suspend(), because the syscore stage is generally too lat to call that function and I'm not sure why this has not been causing any other issues to trigger (or maybe it did, but they were not reported).
Does the patch below work for you?
--- drivers/base/core.c | 3 +++ drivers/cpufreq/cpufreq.c | 10 ---------- 2 files changed, 3 insertions(+), 10 deletions(-)
Index: linux-pm/drivers/cpufreq/cpufreq.c =================================================================== --- linux-pm.orig/drivers/cpufreq/cpufreq.c +++ linux-pm/drivers/cpufreq/cpufreq.c @@ -2737,14 +2737,6 @@ int cpufreq_unregister_driver(struct cpu } EXPORT_SYMBOL_GPL(cpufreq_unregister_driver);
-/* - * Stop cpufreq at shutdown to make sure it isn't holding any locks - * or mutexes when secondary CPUs are halted. - */ -static struct syscore_ops cpufreq_syscore_ops = { - .shutdown = cpufreq_suspend, -}; - struct kobject *cpufreq_global_kobject; EXPORT_SYMBOL(cpufreq_global_kobject);
@@ -2756,8 +2748,6 @@ static int __init cpufreq_core_init(void cpufreq_global_kobject = kobject_create_and_add("cpufreq", &cpu_subsys.dev_root->kobj); BUG_ON(!cpufreq_global_kobject);
- register_syscore_ops(&cpufreq_syscore_ops); - return 0; } module_param(off, int, 0444); Index: linux-pm/drivers/base/core.c =================================================================== --- linux-pm.orig/drivers/base/core.c +++ linux-pm/drivers/base/core.c @@ -9,6 +9,7 @@ */
#include <linux/acpi.h> +#include <linux/cpufreq.h> #include <linux/device.h> #include <linux/err.h> #include <linux/fwnode.h> @@ -3179,6 +3180,8 @@ void device_shutdown(void) wait_for_device_probe(); device_block_probing();
+ cpufreq_suspend(); + spin_lock(&devices_kset->list_lock); /* * Walk the devices list backward, shutting down each in turn.
On Thu, Oct 03, 2019 at 05:05:28PM +0200, Rafael J. Wysocki wrote:
On Thursday, October 3, 2019 4:08:28 PM CEST Ville Syrjala wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
Since 4.20-rc1 my PIC machines no longer reboot/shutdown. I bisected this down to commit 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds").
I traced the hang into -> cpufreq_suspend() -> cpufreq_stop_governor() -> cpufreq_dbs_governor_stop() -> gov_clear_update_util() -> synchronize_sched() -> synchronize_rcu()
Only PREEMPT=y is affected for obvious reasons. The problem is limited to PIC machines since they mask off interrupts in i8259A_shutdown() (syscore_ops.shutdown() registered from device_initcall()).
Let me treat this as a fresh bug report. :-)
I reported this long ago but no better fix has surfaced,
So I don't recall seeing the original report or if I did, I had not understood the problem then.
hence sending out my initial workaround which I've been carrying around ever since. I just move cpufreq_core_init() to late_initcall() so the syscore_ops get registered in the oppsite order and thus the .shutdown() hooks get executed in the opposite order as well. Not 100% convinced this is safe (especially moving the cpufreq_global_kobject creation to late_initcall()) but I've not had any problems with it at least.
The problem is a bug in cpufreq that shouldn't point its syscore shutdown callback pointer to cpufreq_suspend(), because the syscore stage is generally too lat to call that function and I'm not sure why this has not been causing any other issues to trigger (or maybe it did, but they were not reported).
Does the patch below work for you?
It does. Thanks.
Feel free to slap on Tested-by: Ville Syrjälä ville.syrjala@linux.intel.com
if you want to go with that.
drivers/base/core.c | 3 +++ drivers/cpufreq/cpufreq.c | 10 ---------- 2 files changed, 3 insertions(+), 10 deletions(-)
Index: linux-pm/drivers/cpufreq/cpufreq.c
--- linux-pm.orig/drivers/cpufreq/cpufreq.c +++ linux-pm/drivers/cpufreq/cpufreq.c @@ -2737,14 +2737,6 @@ int cpufreq_unregister_driver(struct cpu } EXPORT_SYMBOL_GPL(cpufreq_unregister_driver); -/*
- Stop cpufreq at shutdown to make sure it isn't holding any locks
- or mutexes when secondary CPUs are halted.
- */
-static struct syscore_ops cpufreq_syscore_ops = {
- .shutdown = cpufreq_suspend,
-};
struct kobject *cpufreq_global_kobject; EXPORT_SYMBOL(cpufreq_global_kobject); @@ -2756,8 +2748,6 @@ static int __init cpufreq_core_init(void cpufreq_global_kobject = kobject_create_and_add("cpufreq", &cpu_subsys.dev_root->kobj); BUG_ON(!cpufreq_global_kobject);
- register_syscore_ops(&cpufreq_syscore_ops);
- return 0;
} module_param(off, int, 0444); Index: linux-pm/drivers/base/core.c =================================================================== --- linux-pm.orig/drivers/base/core.c +++ linux-pm/drivers/base/core.c @@ -9,6 +9,7 @@ */ #include <linux/acpi.h> +#include <linux/cpufreq.h> #include <linux/device.h> #include <linux/err.h> #include <linux/fwnode.h> @@ -3179,6 +3180,8 @@ void device_shutdown(void) wait_for_device_probe(); device_block_probing();
- cpufreq_suspend();
- spin_lock(&devices_kset->list_lock); /*
- Walk the devices list backward, shutting down each in turn.
On Fri, Oct 4, 2019 at 2:30 PM Ville Syrjälä ville.syrjala@linux.intel.com wrote:
On Thu, Oct 03, 2019 at 05:05:28PM +0200, Rafael J. Wysocki wrote:
On Thursday, October 3, 2019 4:08:28 PM CEST Ville Syrjala wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
Since 4.20-rc1 my PIC machines no longer reboot/shutdown. I bisected this down to commit 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds").
I traced the hang into -> cpufreq_suspend() -> cpufreq_stop_governor() -> cpufreq_dbs_governor_stop() -> gov_clear_update_util() -> synchronize_sched() -> synchronize_rcu()
Only PREEMPT=y is affected for obvious reasons. The problem is limited to PIC machines since they mask off interrupts in i8259A_shutdown() (syscore_ops.shutdown() registered from device_initcall()).
Let me treat this as a fresh bug report. :-)
I reported this long ago but no better fix has surfaced,
So I don't recall seeing the original report or if I did, I had not understood the problem then.
hence sending out my initial workaround which I've been carrying around ever since. I just move cpufreq_core_init() to late_initcall() so the syscore_ops get registered in the oppsite order and thus the .shutdown() hooks get executed in the opposite order as well. Not 100% convinced this is safe (especially moving the cpufreq_global_kobject creation to late_initcall()) but I've not had any problems with it at least.
The problem is a bug in cpufreq that shouldn't point its syscore shutdown callback pointer to cpufreq_suspend(), because the syscore stage is generally too lat to call that function and I'm not sure why this has not been causing any other issues to trigger (or maybe it did, but they were not reported).
Does the patch below work for you?
It does. Thanks.
Feel free to slap on Tested-by: Ville Syrjälä ville.syrjala@linux.intel.com
if you want to go with that.
I will, thank you!
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
It is incorrect to set the cpufreq syscore shutdown callback pointer to cpufreq_suspend(), because that function cannot be run in the syscore stage of system shutdown for two reasons: (a) it may attempt to carry out actions depending on devices that have already been shut down at that point and (b) the RCU synchronization carried out by it may not be able to make progress then.
The latter issue has been present since commit 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds"), but the former one has always been there regardless.
Fix that by dropping cpufreq_syscore_ops altogether and making device_shutdown() call cpufreq_suspend() directly before shutting down devices, which is along the lines of what system-wide power management does.
Fixes: 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds") Reported-by: Ville Syrjälä ville.syrjala@linux.intel.com Tested-by: Ville Syrjälä ville.syrjala@linux.intel.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com --- drivers/base/core.c | 3 +++ drivers/cpufreq/cpufreq.c | 10 ---------- 2 files changed, 3 insertions(+), 10 deletions(-)
Index: linux-pm/drivers/cpufreq/cpufreq.c =================================================================== --- linux-pm.orig/drivers/cpufreq/cpufreq.c +++ linux-pm/drivers/cpufreq/cpufreq.c @@ -2737,14 +2737,6 @@ int cpufreq_unregister_driver(struct cpu } EXPORT_SYMBOL_GPL(cpufreq_unregister_driver);
-/* - * Stop cpufreq at shutdown to make sure it isn't holding any locks - * or mutexes when secondary CPUs are halted. - */ -static struct syscore_ops cpufreq_syscore_ops = { - .shutdown = cpufreq_suspend, -}; - struct kobject *cpufreq_global_kobject; EXPORT_SYMBOL(cpufreq_global_kobject);
@@ -2756,8 +2748,6 @@ static int __init cpufreq_core_init(void cpufreq_global_kobject = kobject_create_and_add("cpufreq", &cpu_subsys.dev_root->kobj); BUG_ON(!cpufreq_global_kobject);
- register_syscore_ops(&cpufreq_syscore_ops); - return 0; } module_param(off, int, 0444); Index: linux-pm/drivers/base/core.c =================================================================== --- linux-pm.orig/drivers/base/core.c +++ linux-pm/drivers/base/core.c @@ -9,6 +9,7 @@ */
#include <linux/acpi.h> +#include <linux/cpufreq.h> #include <linux/device.h> #include <linux/err.h> #include <linux/fwnode.h> @@ -3179,6 +3180,8 @@ void device_shutdown(void) wait_for_device_probe(); device_block_probing();
+ cpufreq_suspend(); + spin_lock(&devices_kset->list_lock); /* * Walk the devices list backward, shutting down each in turn.
On 09-10-19, 01:29, Rafael J. Wysocki wrote:
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
It is incorrect to set the cpufreq syscore shutdown callback pointer to cpufreq_suspend(), because that function cannot be run in the syscore stage of system shutdown for two reasons: (a) it may attempt to carry out actions depending on devices that have already been shut down at that point and (b) the RCU synchronization carried out by it may not be able to make progress then.
The latter issue has been present since commit 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds"), but the former one has always been there regardless.
Fix that by dropping cpufreq_syscore_ops altogether and making device_shutdown() call cpufreq_suspend() directly before shutting down devices, which is along the lines of what system-wide power management does.
Fixes: 45975c7d21a1 ("rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds") Reported-by: Ville Syrjälä ville.syrjala@linux.intel.com Tested-by: Ville Syrjälä ville.syrjala@linux.intel.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
drivers/base/core.c | 3 +++ drivers/cpufreq/cpufreq.c | 10 ---------- 2 files changed, 3 insertions(+), 10 deletions(-)
Acked-by: Viresh Kumar viresh.kumar@linaro.org
linux-stable-mirror@lists.linaro.org