These patches fix the lockdep splat reported in this thread: https://lore.kernel.org/lkml/20220822164404.57952727@gandalf.local.home/
From: Peter Zijlstra <peterz@infradead.org>
commit 5e26aa93391195a64871db5d96d7163f0062ca4f upstream.
Make cpuidle_state::enter() methods IRQ state invariant on exit.
Additionally make sure to use raw_local_irq_*() methods since this cpuidle callback will be called with RCU already disabled.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195539.515253662@infradead.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
---
 drivers/cpuidle/poll_state.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
index f7e83613ae94..1f578ed09c73 100644
--- a/drivers/cpuidle/poll_state.c
+++ b/drivers/cpuidle/poll_state.c
@@ -17,7 +17,7 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
 
 	dev->poll_time_limit = false;
 
-	local_irq_enable();
+	raw_local_irq_enable();
 	if (!current_set_polling_and_test()) {
 		unsigned int loop_count = 0;
 		u64 limit;
@@ -36,6 +36,8 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
 			}
 		}
 	}
+	raw_local_irq_disable();
+
 	current_clr_polling();
 
 	return index;
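The net effect is easiest to see in the resulting function. A simplified sketch of poll_idle() after this patch (illustrative, condensed from drivers/cpuidle/poll_state.c, not the verbatim source):

/*
 * Sketch of poll_idle() after this patch.  The enter() contract is:
 * called with IRQs disabled and RCU not watching, and it must return
 * with that same state -- using only raw_*, untraced IRQ accessors.
 */
static int __cpuidle poll_idle(struct cpuidle_device *dev,
			       struct cpuidle_driver *drv, int index)
{
	dev->poll_time_limit = false;

	raw_local_irq_enable();		/* raw_*: must not trace, RCU is off */
	if (!current_set_polling_and_test()) {
		/* ... spin on TIF_NEED_RESCHED until the time limit ... */
	}
	raw_local_irq_disable();	/* exit with the same IRQ state as entry */

	current_clr_polling();
	return index;
}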
From: Peter Zijlstra <peterz@infradead.org>
commit 8e9ab9e8da1eae61fdff35690d998eaf8cd527dc upstream.
Doing RCU-idle outside the driver, only to then temporarily enable it again, at least twice, before going idle is suboptimal.
That is, once implicitly through the cpu_pm_*() calls and once explicitly doing ct_irq_*_irqon().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20230112195539.637185846@infradead.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
---
 drivers/cpuidle/cpuidle-riscv-sbi.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
index 05fe2902df9a..cbdbb11b972b 100644
--- a/drivers/cpuidle/cpuidle-riscv-sbi.c
+++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
@@ -121,12 +121,12 @@ static int __sbi_enter_domain_idle_state(struct cpuidle_device *dev,
 		return -1;
 
 	/* Do runtime PM to manage a hierarchical CPU toplogy. */
-	ct_irq_enter_irqson();
 	if (s2idle)
 		dev_pm_genpd_suspend(pd_dev);
 	else
 		pm_runtime_put_sync_suspend(pd_dev);
-	ct_irq_exit_irqson();
+
+	ct_idle_enter();
 
 	if (sbi_is_domain_state_available())
 		state = sbi_get_domain_state();
@@ -135,12 +135,12 @@ static int __sbi_enter_domain_idle_state(struct cpuidle_device *dev,
 
 	ret = sbi_suspend(state) ? -1 : idx;
 
-	ct_irq_enter_irqson();
+	ct_idle_exit();
+
 	if (s2idle)
 		dev_pm_genpd_resume(pd_dev);
 	else
 		pm_runtime_get_sync(pd_dev);
-	ct_irq_exit_irqson();
 
 	cpu_pm_exit();
 
@@ -251,6 +251,7 @@ static int sbi_dt_cpu_init_topology(struct cpuidle_driver *drv,
 	 * of a shared state for the domain, assumes the domain states are all
 	 * deeper states.
 	 */
+	drv->states[state_count - 1].flags |= CPUIDLE_FLAG_RCU_IDLE;
 	drv->states[state_count - 1].enter = sbi_enter_domain_idle_state;
 	drv->states[state_count - 1].enter_s2idle =
 		sbi_enter_s2idle_domain_idle_state;
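This and the following driver conversions all share one shape: mark the state CPUIDLE_FLAG_RCU_IDLE so the cpuidle core skips its own RCU-idle transition, then enter RCU-idle inside the driver, around the low-level suspend call only, after anything that may still use RCU (runtime PM, cpu_pm notifiers). A generic sketch of that shape -- the foo_*() names, pd_dev and state are placeholders, not any real driver:

static int foo_enter_domain_idle_state(struct cpuidle_device *dev,
				       struct cpuidle_driver *drv, int idx)
{
	int ret;

	if (cpu_pm_enter())			/* may use RCU: stays outside */
		return -1;

	pm_runtime_put_sync_suspend(pd_dev);	/* runtime PM traces and locks, too */

	ct_idle_enter();			/* RCU stops watching here... */
	ret = foo_suspend(state) ? -1 : idx;	/* ...only the low-level idle inside */
	ct_idle_exit();				/* RCU watching again */

	pm_runtime_get_sync(pd_dev);
	cpu_pm_exit();
	return ret;
}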
Hello:
This series was applied to riscv/linux.git (fixes) by Ingo Molnar <mingo@kernel.org>:
On Fri, 3 Mar 2023 17:23:24 +0800 you wrote:
> From: Peter Zijlstra <peterz@infradead.org>
>
> commit 8e9ab9e8da1eae61fdff35690d998eaf8cd527dc upstream.
>
> Doing RCU-idle outside the driver, only to then temporarily enable it again, at least twice, before going idle is suboptimal.
>
> [...]
Here is the summary with links:
  - [02/10] cpuidle, riscv: Push RCU-idle into driver
    https://git.kernel.org/riscv/c/8e9ab9e8da1e
  - [10/10] cpuidle: Fix ct_idle_*() usage
    (no matching commit)
You are awesome, thank you!
From: Peter Zijlstra <peterz@infradead.org>
commit 5fca0d9f5d76664786ca6c09076341def165a677 upstream.
Doing RCU-idle outside the driver, only to then temporarily enable it again, at least twice, before going idle is suboptimal.
Notably once implicitly through the cpu_pm_*() calls and once explicitly doing RCU_NONIDLE().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20230112195539.699546331@infradead.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
---
 drivers/cpuidle/cpuidle-tegra.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/cpuidle/cpuidle-tegra.c b/drivers/cpuidle/cpuidle-tegra.c
index 9845629aeb6d..3ca5cfb9d322 100644
--- a/drivers/cpuidle/cpuidle-tegra.c
+++ b/drivers/cpuidle/cpuidle-tegra.c
@@ -180,9 +180,11 @@ static int tegra_cpuidle_state_enter(struct cpuidle_device *dev,
 	}
 
 	local_fiq_disable();
-	RCU_NONIDLE(tegra_pm_set_cpu_in_lp2());
+	tegra_pm_set_cpu_in_lp2();
 	cpu_pm_enter();
 
+	ct_idle_enter();
+
 	switch (index) {
 	case TEGRA_C7:
 		err = tegra_cpuidle_c7_enter();
@@ -197,8 +199,10 @@ static int tegra_cpuidle_state_enter(struct cpuidle_device *dev,
 		break;
 	}
 
+	ct_idle_exit();
+
 	cpu_pm_exit();
-	RCU_NONIDLE(tegra_pm_clear_cpu_in_lp2());
+	tegra_pm_clear_cpu_in_lp2();
 	local_fiq_enable();
 
 	return err ?: index;
@@ -226,6 +230,7 @@ static int tegra_cpuidle_enter(struct cpuidle_device *dev,
 				struct cpuidle_driver *drv,
 				int index)
 {
+	bool do_rcu = drv->states[index].flags & CPUIDLE_FLAG_RCU_IDLE;
 	unsigned int cpu = cpu_logical_map(dev->cpu);
 	int ret;
 
@@ -233,9 +238,13 @@ static int tegra_cpuidle_enter(struct cpuidle_device *dev,
 	if (dev->states_usage[index].disable)
 		return -1;
 
-	if (index == TEGRA_C1)
+	if (index == TEGRA_C1) {
+		if (do_rcu)
+			ct_idle_enter();
 		ret = arm_cpuidle_simple_enter(dev, drv, index);
-	else
+		if (do_rcu)
+			ct_idle_exit();
+	} else
 		ret = tegra_cpuidle_state_enter(dev, index, cpu);
 
 	if (ret < 0) {
@@ -285,7 +294,8 @@ static struct cpuidle_driver tegra_idle_driver = {
 			.exit_latency		= 2000,
 			.target_residency	= 2200,
 			.power_usage		= 100,
-			.flags			= CPUIDLE_FLAG_TIMER_STOP,
+			.flags			= CPUIDLE_FLAG_TIMER_STOP |
+						  CPUIDLE_FLAG_RCU_IDLE,
 			.name			= "C7",
 			.desc			= "CPU core powered off",
 		},
@@ -295,6 +305,7 @@ static struct cpuidle_driver tegra_idle_driver = {
 			.target_residency	= 10000,
 			.power_usage		= 0,
 			.flags			= CPUIDLE_FLAG_TIMER_STOP |
+						  CPUIDLE_FLAG_RCU_IDLE |
 						  CPUIDLE_FLAG_COUPLED,
 			.name			= "CC6",
 			.desc			= "CPU cluster powered off",
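One subtlety in this conversion: tegra_cpuidle_enter() is a single callback serving states both with and without CPUIDLE_FLAG_RCU_IDLE, so it must test the flag of the state actually being entered. The pattern in isolation (a sketch; low_level_enter() is a placeholder for the state's real entry helper):

	bool do_rcu = drv->states[index].flags & CPUIDLE_FLAG_RCU_IDLE;

	if (do_rcu)			/* flagged state: the core skipped RCU-idle */
		ct_idle_enter();
	ret = low_level_enter(index);	/* placeholder for the low-level entry */
	if (do_rcu)
		ct_idle_exit();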
From: Peter Zijlstra <peterz@infradead.org>
commit e038f7b8028a1d1bc8ac82351c71ea538f19a879 upstream.
Doing RCU-idle outside the driver, only to then temporarily enable it again, at least twice, before going idle is suboptimal.
Notably once implicitly through the cpu_pm_*() calls and once explicitly doing ct_irq_*_irqon().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Guo Ren <guoren@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20230112195539.760296658@infradead.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
---
 drivers/cpuidle/cpuidle-psci.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
index 57bc3e3ae391..6a259c94b8b2 100644
--- a/drivers/cpuidle/cpuidle-psci.c
+++ b/drivers/cpuidle/cpuidle-psci.c
@@ -69,12 +69,12 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev,
 		return -1;
 
 	/* Do runtime PM to manage a hierarchical CPU toplogy. */
-	ct_irq_enter_irqson();
 	if (s2idle)
 		dev_pm_genpd_suspend(pd_dev);
 	else
 		pm_runtime_put_sync_suspend(pd_dev);
-	ct_irq_exit_irqson();
+
+	ct_idle_enter();
 
 	state = psci_get_domain_state();
 	if (!state)
@@ -82,13 +82,12 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev,
 
 	ret = psci_cpu_suspend_enter(state) ? -1 : idx;
 
-	ct_irq_enter_irqson();
+	ct_idle_exit();
+
 	if (s2idle)
 		dev_pm_genpd_resume(pd_dev);
 	else
 		pm_runtime_get_sync(pd_dev);
-	ct_irq_exit_irqson();
-
 	cpu_pm_exit();
 
 	/* Clear the domain state to start fresh when back from idle. */
@@ -240,6 +239,7 @@ static int psci_dt_cpu_init_topology(struct cpuidle_driver *drv,
 	 * of a shared state for the domain, assumes the domain states are all
 	 * deeper states.
 	 */
+	drv->states[state_count - 1].flags |= CPUIDLE_FLAG_RCU_IDLE;
 	drv->states[state_count - 1].enter = psci_enter_domain_idle_state;
 	drv->states[state_count - 1].enter_s2idle = psci_enter_s2idle_domain_idle_state;
 	psci_cpuidle_use_cpuhp = true;
Hi Cheng,
On Fri, Mar 3, 2023 at 10:35 AM Cheng-Jui Wang <cheng-jui.wang@mediatek.com> wrote:
> From: Peter Zijlstra <peterz@infradead.org>
>
> commit e038f7b8028a1d1bc8ac82351c71ea538f19a879 upstream.
>
> Doing RCU-idle outside the driver, only to then temporarily enable it again, at least twice, before going idle is suboptimal.
>
> Notably once implicitly through the cpu_pm_*() calls and once explicitly doing ct_irq_*_irqon().
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Tested-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
> Tested-by: Tony Lindgren <tony@atomide.com>
> Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
> Reviewed-by: Guo Ren <guoren@kernel.org>
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Link: https://lore.kernel.org/r/20230112195539.760296658@infradead.org
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
Given this patch introduced a so far unresolved regression upstream, I think it's premature to backport this to stable.
https://lore.kernel.org/all/ff338b9f-4ab0-741b-26ea-7b7351da156@linux-m68k.o...
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
From: Peter Zijlstra <peterz@infradead.org>
commit b3f46658ce40a3467cda82f920dd9d5325ab0eaf upstream.
Doing RCU-idle outside the driver, only to then temporarily enable it again, at least twice, before going idle is suboptimal.
Notably both cpu_pm_enter() and cpu_cluster_pm_enter() implicitly re-enable RCU.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20230112195539.821714572@infradead.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
---
 arch/arm/mach-imx/cpuidle-imx6sx.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mach-imx/cpuidle-imx6sx.c b/arch/arm/mach-imx/cpuidle-imx6sx.c
index 74ea1720e3d8..1dc01f6b0f36 100644
--- a/arch/arm/mach-imx/cpuidle-imx6sx.c
+++ b/arch/arm/mach-imx/cpuidle-imx6sx.c
@@ -47,7 +47,9 @@ static int imx6sx_enter_wait(struct cpuidle_device *dev,
 		cpu_pm_enter();
 		cpu_cluster_pm_enter();
 
+		ct_idle_enter();
 		cpu_suspend(0, imx6sx_idle_finish);
+		ct_idle_exit();
 
 		cpu_cluster_pm_exit();
 		cpu_pm_exit();
@@ -87,7 +89,8 @@ static struct cpuidle_driver imx6sx_cpuidle_driver = {
 			 */
 			.exit_latency = 300,
 			.target_residency = 500,
-			.flags = CPUIDLE_FLAG_TIMER_STOP,
+			.flags = CPUIDLE_FLAG_TIMER_STOP |
+				 CPUIDLE_FLAG_RCU_IDLE,
 			.enter = imx6sx_enter_wait,
 			.name = "LOW-POWER-IDLE",
 			.desc = "ARM power off",
From: Peter Zijlstra <peterz@infradead.org>
commit 4d1be9e745382b41492b0cb9000829863db7133a upstream.
Doing RCU-idle outside the driver, only to then temporarily enable it again before going idle is suboptimal.
Notably the cpu_pm_*() calls implicitly re-enable RCU for a bit.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20230112195539.883561913@infradead.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
---
 arch/arm/mach-omap2/cpuidle34xx.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
diff --git a/arch/arm/mach-omap2/cpuidle34xx.c b/arch/arm/mach-omap2/cpuidle34xx.c
index 090a8aafb25e..cedf5cbe451f 100644
--- a/arch/arm/mach-omap2/cpuidle34xx.c
+++ b/arch/arm/mach-omap2/cpuidle34xx.c
@@ -133,7 +133,9 @@ static int omap3_enter_idle(struct cpuidle_device *dev,
 	}
 
 	/* Execute ARM wfi */
+	ct_idle_enter();
 	omap_sram_idle();
+	ct_idle_exit();
 
 	/*
 	 * Call idle CPU PM enter notifier chain to restore
@@ -265,6 +267,7 @@ static struct cpuidle_driver omap3_idle_driver = {
 	.owner            = THIS_MODULE,
 	.states = {
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 2 + 2,
 			.target_residency = 5,
@@ -272,6 +275,7 @@ static struct cpuidle_driver omap3_idle_driver = {
 			.desc		  = "MPU ON + CORE ON",
 		},
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 10 + 10,
 			.target_residency = 30,
@@ -279,6 +283,7 @@ static struct cpuidle_driver omap3_idle_driver = {
 			.desc		  = "MPU ON + CORE ON",
 		},
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 50 + 50,
 			.target_residency = 300,
@@ -286,6 +291,7 @@ static struct cpuidle_driver omap3_idle_driver = {
 			.desc		  = "MPU RET + CORE ON",
 		},
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 1500 + 1800,
 			.target_residency = 4000,
@@ -293,6 +299,7 @@ static struct cpuidle_driver omap3_idle_driver = {
 			.desc		  = "MPU OFF + CORE ON",
 		},
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 2500 + 7500,
 			.target_residency = 12000,
@@ -300,6 +307,7 @@ static struct cpuidle_driver omap3_idle_driver = {
 			.desc		  = "MPU RET + CORE RET",
 		},
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 3000 + 8500,
 			.target_residency = 15000,
@@ -307,6 +315,7 @@ static struct cpuidle_driver omap3_idle_driver = {
 			.desc		  = "MPU OFF + CORE RET",
 		},
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 10000 + 30000,
 			.target_residency = 30000,
@@ -328,6 +337,7 @@ static struct cpuidle_driver omap3430_idle_driver = {
 	.owner            = THIS_MODULE,
 	.states = {
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 110 + 162,
 			.target_residency = 5,
@@ -335,6 +345,7 @@ static struct cpuidle_driver omap3430_idle_driver = {
 			.desc		  = "MPU ON + CORE ON",
 		},
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 106 + 180,
 			.target_residency = 309,
@@ -342,6 +353,7 @@ static struct cpuidle_driver omap3430_idle_driver = {
 			.desc		  = "MPU ON + CORE ON",
 		},
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 107 + 410,
 			.target_residency = 46057,
@@ -349,6 +361,7 @@ static struct cpuidle_driver omap3430_idle_driver = {
 			.desc		  = "MPU RET + CORE ON",
 		},
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 121 + 3374,
 			.target_residency = 46057,
@@ -356,6 +369,7 @@ static struct cpuidle_driver omap3430_idle_driver = {
 			.desc		  = "MPU OFF + CORE ON",
 		},
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 855 + 1146,
 			.target_residency = 46057,
@@ -363,6 +377,7 @@ static struct cpuidle_driver omap3430_idle_driver = {
 			.desc		  = "MPU RET + CORE RET",
 		},
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 7580 + 4134,
 			.target_residency = 484329,
@@ -370,6 +385,7 @@ static struct cpuidle_driver omap3430_idle_driver = {
 			.desc		  = "MPU OFF + CORE RET",
 		},
 		{
+			.flags		  = CPUIDLE_FLAG_RCU_IDLE,
 			.enter		  = omap3_enter_idle_bm,
 			.exit_latency	  = 7505 + 15274,
 			.target_residency = 484329,
From: Peter Zijlstra <peterz@infradead.org>
commit 4ce40e9dbe83153f60d7e4ccd24a1eb4f8264f6a upstream.
Doing RCU-idle outside the driver, only to then temporarily enable it again before going idle is suboptimal.
Notably the cpu_pm_*() calls implicitly re-enable RCU for a bit.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20230112195539.946630819@infradead.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
---
 drivers/cpuidle/cpuidle-mvebu-v7.c | 7 +++++++
 1 file changed, 7 insertions(+)
diff --git a/drivers/cpuidle/cpuidle-mvebu-v7.c b/drivers/cpuidle/cpuidle-mvebu-v7.c
index 01a856971f05..c9568aa9410c 100644
--- a/drivers/cpuidle/cpuidle-mvebu-v7.c
+++ b/drivers/cpuidle/cpuidle-mvebu-v7.c
@@ -36,7 +36,10 @@ static int mvebu_v7_enter_idle(struct cpuidle_device *dev,
 	if (drv->states[index].flags & MVEBU_V7_FLAG_DEEP_IDLE)
 		deepidle = true;
 
+	ct_idle_enter();
 	ret = mvebu_v7_cpu_suspend(deepidle);
+	ct_idle_exit();
+
 	cpu_pm_exit();
 
 	if (ret)
@@ -49,6 +52,7 @@ static struct cpuidle_driver armadaxp_idle_driver = {
 	.name			= "armada_xp_idle",
 	.states[0]		= ARM_CPUIDLE_WFI_STATE,
 	.states[1]		= {
+		.flags			= CPUIDLE_FLAG_RCU_IDLE,
 		.enter			= mvebu_v7_enter_idle,
 		.exit_latency		= 100,
 		.power_usage		= 50,
@@ -57,6 +61,7 @@ static struct cpuidle_driver armadaxp_idle_driver = {
 		.desc			= "CPU power down",
 	},
 	.states[2]		= {
+		.flags			= CPUIDLE_FLAG_RCU_IDLE,
 		.enter			= mvebu_v7_enter_idle,
 		.exit_latency		= 1000,
 		.power_usage		= 5,
@@ -72,6 +77,7 @@ static struct cpuidle_driver armada370_idle_driver = {
 	.name			= "armada_370_idle",
 	.states[0]		= ARM_CPUIDLE_WFI_STATE,
 	.states[1]		= {
+		.flags			= CPUIDLE_FLAG_RCU_IDLE,
 		.enter			= mvebu_v7_enter_idle,
 		.exit_latency		= 100,
 		.power_usage		= 5,
@@ -87,6 +93,7 @@ static struct cpuidle_driver armada38x_idle_driver = {
 	.name			= "armada_38x_idle",
 	.states[0]		= ARM_CPUIDLE_WFI_STATE,
 	.states[1]		= {
+		.flags			= CPUIDLE_FLAG_RCU_IDLE,
 		.enter			= mvebu_v7_enter_idle,
 		.exit_latency		= 10,
 		.power_usage		= 5,
From: Peter Zijlstra <peterz@infradead.org>
commit c3d42418dca53d6c498a48c408f7a45289593650 upstream.
Doing RCU-idle outside the driver, only to then temporarily enable it again, some *four* times, before going idle is suboptimal.
Notably three times explicitly using RCU_NONIDLE() and once implicitly through cpu_pm_*().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20230112195540.007918454@infradead.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
---
 arch/arm/mach-omap2/cpuidle44xx.c | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)
diff --git a/arch/arm/mach-omap2/cpuidle44xx.c b/arch/arm/mach-omap2/cpuidle44xx.c
index de37027ad758..953ad888737c 100644
--- a/arch/arm/mach-omap2/cpuidle44xx.c
+++ b/arch/arm/mach-omap2/cpuidle44xx.c
@@ -105,7 +105,9 @@ static int omap_enter_idle_smp(struct cpuidle_device *dev,
 	}
 	raw_spin_unlock_irqrestore(&mpu_lock, flag);
 
+	ct_idle_enter();
 	omap4_enter_lowpower(dev->cpu, cx->cpu_state);
+	ct_idle_exit();
 
 	raw_spin_lock_irqsave(&mpu_lock, flag);
 	if (cx->mpu_state_vote == num_online_cpus())
@@ -151,10 +153,10 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
 		 (cx->mpu_logic_state == PWRDM_POWER_OFF);
 
 	/* Enter broadcast mode for periodic timers */
-	RCU_NONIDLE(tick_broadcast_enable());
+	tick_broadcast_enable();
 
 	/* Enter broadcast mode for one-shot timers */
-	RCU_NONIDLE(tick_broadcast_enter());
+	tick_broadcast_enter();
 
 	/*
 	 * Call idle CPU PM enter notifier chain so that
@@ -166,7 +168,7 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
 
 	if (dev->cpu == 0) {
 		pwrdm_set_logic_retst(mpu_pd, cx->mpu_logic_state);
-		RCU_NONIDLE(omap_set_pwrdm_state(mpu_pd, cx->mpu_state));
+		omap_set_pwrdm_state(mpu_pd, cx->mpu_state);
 
 		/*
 		 * Call idle CPU cluster PM enter notifier chain
@@ -178,14 +180,16 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
 				index = 0;
 				cx = state_ptr + index;
 				pwrdm_set_logic_retst(mpu_pd, cx->mpu_logic_state);
-				RCU_NONIDLE(omap_set_pwrdm_state(mpu_pd, cx->mpu_state));
+				omap_set_pwrdm_state(mpu_pd, cx->mpu_state);
 				mpuss_can_lose_context = 0;
 			}
 		}
 	}
 
+	ct_idle_enter();
 	omap4_enter_lowpower(dev->cpu, cx->cpu_state);
 	cpu_done[dev->cpu] = true;
+	ct_idle_exit();
 
 	/* Wakeup CPU1 only if it is not offlined */
 	if (dev->cpu == 0 && cpumask_test_cpu(1, cpu_online_mask)) {
@@ -194,9 +198,9 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
 		    mpuss_can_lose_context)
 			gic_dist_disable();
 
-		RCU_NONIDLE(clkdm_deny_idle(cpu_clkdm[1]));
-		RCU_NONIDLE(omap_set_pwrdm_state(cpu_pd[1], PWRDM_POWER_ON));
-		RCU_NONIDLE(clkdm_allow_idle(cpu_clkdm[1]));
+		clkdm_deny_idle(cpu_clkdm[1]);
+		omap_set_pwrdm_state(cpu_pd[1], PWRDM_POWER_ON);
+		clkdm_allow_idle(cpu_clkdm[1]);
 
 		if (IS_PM44XX_ERRATUM(PM_OMAP4_ROM_SMP_BOOT_ERRATUM_GICD) &&
 		    mpuss_can_lose_context) {
@@ -222,7 +226,7 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
 	cpu_pm_exit();
 
 cpu_pm_out:
-	RCU_NONIDLE(tick_broadcast_exit());
+	tick_broadcast_exit();
 
 fail:
 	cpuidle_coupled_parallel_barrier(dev, &abort_barrier);
@@ -247,7 +251,8 @@ static struct cpuidle_driver omap4_idle_driver = {
 			/* C2 - CPU0 OFF + CPU1 OFF + MPU CSWR */
 			.exit_latency = 328 + 440,
 			.target_residency = 960,
-			.flags = CPUIDLE_FLAG_COUPLED,
+			.flags = CPUIDLE_FLAG_COUPLED |
+				 CPUIDLE_FLAG_RCU_IDLE,
 			.enter = omap_enter_idle_coupled,
 			.name = "C2",
 			.desc = "CPUx OFF, MPUSS CSWR",
@@ -256,7 +261,8 @@ static struct cpuidle_driver omap4_idle_driver = {
 			/* C3 - CPU0 OFF + CPU1 OFF + MPU OSWR */
 			.exit_latency = 460 + 518,
 			.target_residency = 1100,
-			.flags = CPUIDLE_FLAG_COUPLED,
+			.flags = CPUIDLE_FLAG_COUPLED |
+				 CPUIDLE_FLAG_RCU_IDLE,
 			.enter = omap_enter_idle_coupled,
 			.name = "C3",
 			.desc = "CPUx OFF, MPUSS OSWR",
@@ -282,7 +288,8 @@ static struct cpuidle_driver omap5_idle_driver = {
 			/* C2 - CPU0 RET + CPU1 RET + MPU CSWR */
 			.exit_latency = 48 + 60,
 			.target_residency = 100,
-			.flags = CPUIDLE_FLAG_TIMER_STOP,
+			.flags = CPUIDLE_FLAG_TIMER_STOP |
+				 CPUIDLE_FLAG_RCU_IDLE,
 			.enter = omap_enter_idle_smp,
 			.name = "C2",
 			.desc = "CPUx CSWR, MPUSS CSWR",
From: Peter Zijlstra <peterz@infradead.org>
commit 0c5ffc3d7b15978c6b184938cd6b8af06e436424 upstream.
Doing RCU-idle outside the driver, only to then temporarily enable it again before going idle is suboptimal.
Notably: this converts all dt_init_idle_driver() and __CPU_PM_CPU_IDLE_ENTER() users for they are inextricably intertwined.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195540.068981667@infradead.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
---
 drivers/acpi/processor_idle.c        | 2 ++
 drivers/cpuidle/cpuidle-big_little.c | 8 ++++++--
 drivers/cpuidle/dt_idle_states.c     | 2 +-
 include/linux/cpuidle.h              | 2 ++
 4 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index fc5b5b2c9e81..ae42d530d080 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -1221,6 +1221,8 @@ static int acpi_processor_setup_lpi_states(struct acpi_processor *pr)
 		state->target_residency = lpi->min_residency;
 		if (lpi->arch_flags)
 			state->flags |= CPUIDLE_FLAG_TIMER_STOP;
+		if (i != 0 && lpi->entry_method == ACPI_CSTATE_FFH)
+			state->flags |= CPUIDLE_FLAG_RCU_IDLE;
 		state->enter = acpi_idle_lpi_enter;
 		drv->safe_state_index = i;
 	}
diff --git a/drivers/cpuidle/cpuidle-big_little.c b/drivers/cpuidle/cpuidle-big_little.c
index abe51185f243..fffd4ed0c4d1 100644
--- a/drivers/cpuidle/cpuidle-big_little.c
+++ b/drivers/cpuidle/cpuidle-big_little.c
@@ -64,7 +64,8 @@ static struct cpuidle_driver bl_idle_little_driver = {
 		.enter			= bl_enter_powerdown,
 		.exit_latency		= 700,
 		.target_residency	= 2500,
-		.flags			= CPUIDLE_FLAG_TIMER_STOP,
+		.flags			= CPUIDLE_FLAG_TIMER_STOP |
+					  CPUIDLE_FLAG_RCU_IDLE,
 		.name			= "C1",
 		.desc			= "ARM little-cluster power down",
 	},
@@ -85,7 +86,8 @@ static struct cpuidle_driver bl_idle_big_driver = {
 		.enter			= bl_enter_powerdown,
 		.exit_latency		= 500,
 		.target_residency	= 2000,
-		.flags			= CPUIDLE_FLAG_TIMER_STOP,
+		.flags			= CPUIDLE_FLAG_TIMER_STOP |
+					  CPUIDLE_FLAG_RCU_IDLE,
 		.name			= "C1",
 		.desc			= "ARM big-cluster power down",
 	},
@@ -124,11 +126,13 @@ static int bl_enter_powerdown(struct cpuidle_device *dev,
 			      struct cpuidle_driver *drv, int idx)
 {
 	cpu_pm_enter();
+	ct_idle_enter();
 
 	cpu_suspend(0, bl_powerdown_finisher);
 
 	/* signals the MCPM core that CPU is out of low power state */
 	mcpm_cpu_powered_up();
+	ct_idle_exit();
 
 	cpu_pm_exit();
 
diff --git a/drivers/cpuidle/dt_idle_states.c b/drivers/cpuidle/dt_idle_states.c
index 448bc796b0b4..fcfa7490eecf 100644
--- a/drivers/cpuidle/dt_idle_states.c
+++ b/drivers/cpuidle/dt_idle_states.c
@@ -77,7 +77,7 @@ static int init_state_node(struct cpuidle_state *idle_state,
 	if (err)
 		desc = state_node->name;
 
-	idle_state->flags = 0;
+	idle_state->flags = CPUIDLE_FLAG_RCU_IDLE;
 	if (of_property_read_bool(state_node, "local-timer-stop"))
 		idle_state->flags |= CPUIDLE_FLAG_TIMER_STOP;
 	/*
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index fce476275e16..0ddc11e44302 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -289,7 +289,9 @@ extern s64 cpuidle_governor_latency_req(unsigned int cpu);
 	if (!is_retention) \
 		__ret = cpu_pm_enter(); \
 	if (!__ret) { \
+		ct_idle_enter(); \
 		__ret = low_level_idle_enter(state); \
+		ct_idle_exit(); \
 		if (!is_retention) \
 			cpu_pm_exit(); \
 	} \
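For drivers built on dt_idle_states plus the __CPU_PM_CPU_IDLE_ENTER() helpers, no per-driver edits are needed: the macro itself now brackets the low-level call with ct_idle_enter()/ct_idle_exit(). A typical call site then stays this small (a sketch modeled on the generic ARM cpuidle driver, not quoted from this series):

/*
 * Sketch of a dt_idle_states-based enter() callback.
 * CPU_PM_CPU_IDLE_ENTER() expands to __CPU_PM_CPU_IDLE_ENTER(),
 * which after this patch performs the RCU-idle transition around
 * the low_level_idle_enter() call on the driver's behalf.
 */
static int arm_enter_idle_state(struct cpuidle_device *dev,
				struct cpuidle_driver *drv, int idx)
{
	/* idx 0 is plain WFI; deeper states go through the cpu_pm notifiers. */
	return CPU_PM_CPU_IDLE_ENTER(arm_cpuidle_suspend, idx);
}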
From: Peter Zijlstra <peterz@infradead.org>
commit a01353cf1896ea5b8a7bbc5e2b2d38feed8b7aaa upstream.
The whole disable-RCU, enable-IRQS dance is very intricate since changing IRQ state is traced, which depends on RCU.
Add two helpers for the cpuidle case that mirror the entry code:
  ct_cpuidle_enter()
  ct_cpuidle_exit()
And fix all the cases where the enter/exit dance was buggy.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195540.130014793@infradead.org
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
---
 arch/arm/mach-imx/cpuidle-imx6q.c    |  4 +--
 arch/arm/mach-imx/cpuidle-imx6sx.c   |  4 +--
 arch/arm/mach-omap2/cpuidle34xx.c    |  4 +--
 arch/arm/mach-omap2/cpuidle44xx.c    |  8 ++---
 drivers/acpi/processor_idle.c        |  8 +++--
 drivers/cpuidle/cpuidle-big_little.c |  4 +--
 drivers/cpuidle/cpuidle-mvebu-v7.c   |  4 +--
 drivers/cpuidle/cpuidle-psci.c       |  4 +--
 drivers/cpuidle/cpuidle-riscv-sbi.c  |  4 +--
 drivers/cpuidle/cpuidle-tegra.c      |  8 ++---
 drivers/cpuidle/cpuidle.c            | 11 +++----
 include/linux/clockchips.h           |  4 +--
 include/linux/cpuidle.h              | 34 +++++++++++++++++++--
 kernel/sched/idle.c                  | 45 ++++++++--------------------
 kernel/time/tick-broadcast.c         |  6 +++-
 15 files changed, 86 insertions(+), 66 deletions(-)
diff --git a/arch/arm/mach-imx/cpuidle-imx6q.c b/arch/arm/mach-imx/cpuidle-imx6q.c
index d086cbae09c3..c24c78a67cc1 100644
--- a/arch/arm/mach-imx/cpuidle-imx6q.c
+++ b/arch/arm/mach-imx/cpuidle-imx6q.c
@@ -25,9 +25,9 @@ static int imx6q_enter_wait(struct cpuidle_device *dev,
 		imx6_set_lpm(WAIT_UNCLOCKED);
 	raw_spin_unlock(&cpuidle_lock);
 
-	ct_idle_enter();
+	ct_cpuidle_enter();
 	cpu_do_idle();
-	ct_idle_exit();
+	ct_cpuidle_exit();
 
 	raw_spin_lock(&cpuidle_lock);
 	if (num_idle_cpus-- == num_online_cpus())
diff --git a/arch/arm/mach-imx/cpuidle-imx6sx.c b/arch/arm/mach-imx/cpuidle-imx6sx.c
index 1dc01f6b0f36..479f06286b50 100644
--- a/arch/arm/mach-imx/cpuidle-imx6sx.c
+++ b/arch/arm/mach-imx/cpuidle-imx6sx.c
@@ -47,9 +47,9 @@ static int imx6sx_enter_wait(struct cpuidle_device *dev,
 		cpu_pm_enter();
 		cpu_cluster_pm_enter();
 
-		ct_idle_enter();
+		ct_cpuidle_enter();
 		cpu_suspend(0, imx6sx_idle_finish);
-		ct_idle_exit();
+		ct_cpuidle_exit();
 
 		cpu_cluster_pm_exit();
 		cpu_pm_exit();
diff --git a/arch/arm/mach-omap2/cpuidle34xx.c b/arch/arm/mach-omap2/cpuidle34xx.c
index cedf5cbe451f..6d63769cef0f 100644
--- a/arch/arm/mach-omap2/cpuidle34xx.c
+++ b/arch/arm/mach-omap2/cpuidle34xx.c
@@ -133,9 +133,9 @@ static int omap3_enter_idle(struct cpuidle_device *dev,
 	}
 
 	/* Execute ARM wfi */
-	ct_idle_enter();
+	ct_cpuidle_enter();
 	omap_sram_idle();
-	ct_idle_exit();
+	ct_cpuidle_exit();
 
 	/*
 	 * Call idle CPU PM enter notifier chain to restore
diff --git a/arch/arm/mach-omap2/cpuidle44xx.c b/arch/arm/mach-omap2/cpuidle44xx.c
index 953ad888737c..3c97d5676e81 100644
--- a/arch/arm/mach-omap2/cpuidle44xx.c
+++ b/arch/arm/mach-omap2/cpuidle44xx.c
@@ -105,9 +105,9 @@ static int omap_enter_idle_smp(struct cpuidle_device *dev,
 	}
 	raw_spin_unlock_irqrestore(&mpu_lock, flag);
 
-	ct_idle_enter();
+	ct_cpuidle_enter();
 	omap4_enter_lowpower(dev->cpu, cx->cpu_state);
-	ct_idle_exit();
+	ct_cpuidle_exit();
 
 	raw_spin_lock_irqsave(&mpu_lock, flag);
 	if (cx->mpu_state_vote == num_online_cpus())
@@ -186,10 +186,10 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
 		}
 	}
 
-	ct_idle_enter();
+	ct_cpuidle_enter();
 	omap4_enter_lowpower(dev->cpu, cx->cpu_state);
 	cpu_done[dev->cpu] = true;
-	ct_idle_exit();
+	ct_cpuidle_exit();
 
 	/* Wakeup CPU1 only if it is not offlined */
 	if (dev->cpu == 0 && cpumask_test_cpu(1, cpu_online_mask)) {
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index ae42d530d080..ff40b001e0fa 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -644,6 +644,8 @@ static int __cpuidle acpi_idle_enter_bm(struct cpuidle_driver *drv,
 	 */
 	bool dis_bm = pr->flags.bm_control;
 
+	instrumentation_begin();
+
 	/* If we can skip BM, demote to a safe state. */
 	if (!cx->bm_sts_skip && acpi_idle_bm_check()) {
 		dis_bm = false;
@@ -665,11 +667,11 @@ static int __cpuidle acpi_idle_enter_bm(struct cpuidle_driver *drv,
 		raw_spin_unlock(&c3_lock);
 	}
 
-	ct_idle_enter();
+	ct_cpuidle_enter();
 
 	acpi_idle_do_entry(cx);
 
-	ct_idle_exit();
+	ct_cpuidle_exit();
 
 	/* Re-enable bus master arbitration */
 	if (dis_bm) {
@@ -679,6 +681,8 @@ static int __cpuidle acpi_idle_enter_bm(struct cpuidle_driver *drv,
 		raw_spin_unlock(&c3_lock);
 	}
 
+	instrumentation_end();
+
 	return index;
 }
 
diff --git a/drivers/cpuidle/cpuidle-big_little.c b/drivers/cpuidle/cpuidle-big_little.c
index fffd4ed0c4d1..5858db21e08c 100644
--- a/drivers/cpuidle/cpuidle-big_little.c
+++ b/drivers/cpuidle/cpuidle-big_little.c
@@ -126,13 +126,13 @@ static int bl_enter_powerdown(struct cpuidle_device *dev,
 				struct cpuidle_driver *drv, int idx)
 {
 	cpu_pm_enter();
-	ct_idle_enter();
+	ct_cpuidle_enter();
 
 	cpu_suspend(0, bl_powerdown_finisher);
 
 	/* signals the MCPM core that CPU is out of low power state */
 	mcpm_cpu_powered_up();
-	ct_idle_exit();
+	ct_cpuidle_exit();
 
 	cpu_pm_exit();
 
diff --git a/drivers/cpuidle/cpuidle-mvebu-v7.c b/drivers/cpuidle/cpuidle-mvebu-v7.c
index c9568aa9410c..20bfb26d5a88 100644
--- a/drivers/cpuidle/cpuidle-mvebu-v7.c
+++ b/drivers/cpuidle/cpuidle-mvebu-v7.c
@@ -36,9 +36,9 @@ static int mvebu_v7_enter_idle(struct cpuidle_device *dev,
 	if (drv->states[index].flags & MVEBU_V7_FLAG_DEEP_IDLE)
 		deepidle = true;
 
-	ct_idle_enter();
+	ct_cpuidle_enter();
 	ret = mvebu_v7_cpu_suspend(deepidle);
-	ct_idle_exit();
+	ct_cpuidle_exit();
 
 	cpu_pm_exit();
 
diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
index 6a259c94b8b2..32ad942d3e5a 100644
--- a/drivers/cpuidle/cpuidle-psci.c
+++ b/drivers/cpuidle/cpuidle-psci.c
@@ -74,7 +74,7 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev,
 	else
 		pm_runtime_put_sync_suspend(pd_dev);
 
-	ct_idle_enter();
+	ct_cpuidle_enter();
 
 	state = psci_get_domain_state();
 	if (!state)
@@ -82,7 +82,7 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev,
 
 	ret = psci_cpu_suspend_enter(state) ? -1 : idx;
 
-	ct_idle_exit();
+	ct_cpuidle_exit();
 
 	if (s2idle)
 		dev_pm_genpd_resume(pd_dev);
diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
index cbdbb11b972b..0a480f5799a7 100644
--- a/drivers/cpuidle/cpuidle-riscv-sbi.c
+++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
@@ -126,7 +126,7 @@ static int __sbi_enter_domain_idle_state(struct cpuidle_device *dev,
 	else
 		pm_runtime_put_sync_suspend(pd_dev);
 
-	ct_idle_enter();
+	ct_cpuidle_enter();
 
 	if (sbi_is_domain_state_available())
 		state = sbi_get_domain_state();
@@ -135,7 +135,7 @@ static int __sbi_enter_domain_idle_state(struct cpuidle_device *dev,
 
 	ret = sbi_suspend(state) ? -1 : idx;
 
-	ct_idle_exit();
+	ct_cpuidle_exit();
 
 	if (s2idle)
 		dev_pm_genpd_resume(pd_dev);
diff --git a/drivers/cpuidle/cpuidle-tegra.c b/drivers/cpuidle/cpuidle-tegra.c
index 3ca5cfb9d322..9c2903c1b1c0 100644
--- a/drivers/cpuidle/cpuidle-tegra.c
+++ b/drivers/cpuidle/cpuidle-tegra.c
@@ -183,7 +183,7 @@ static int tegra_cpuidle_state_enter(struct cpuidle_device *dev,
 	tegra_pm_set_cpu_in_lp2();
 	cpu_pm_enter();
 
-	ct_idle_enter();
+	ct_cpuidle_enter();
 
 	switch (index) {
 	case TEGRA_C7:
@@ -199,7 +199,7 @@ static int tegra_cpuidle_state_enter(struct cpuidle_device *dev,
 		break;
 	}
 
-	ct_idle_exit();
+	ct_cpuidle_exit();
 
 	cpu_pm_exit();
 	tegra_pm_clear_cpu_in_lp2();
@@ -240,10 +240,10 @@ static int tegra_cpuidle_enter(struct cpuidle_device *dev,
 
 	if (index == TEGRA_C1) {
 		if (do_rcu)
-			ct_idle_enter();
+			ct_cpuidle_enter();
 		ret = arm_cpuidle_simple_enter(dev, drv, index);
 		if (do_rcu)
-			ct_idle_exit();
+			ct_cpuidle_exit();
 	} else
 		ret = tegra_cpuidle_state_enter(dev, index, cpu);
 
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 6eceb1988243..caeb543c7ea0 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -14,6 +14,7 @@
 #include <linux/mutex.h>
 #include <linux/sched.h>
 #include <linux/sched/clock.h>
+#include <linux/sched/idle.h>
 #include <linux/notifier.h>
 #include <linux/pm_qos.h>
 #include <linux/cpu.h>
@@ -152,12 +153,12 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv,
 	 */
 	stop_critical_timings();
 	if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))
-		ct_idle_enter();
+		ct_cpuidle_enter();
 	target_state->enter_s2idle(dev, drv, index);
 	if (WARN_ON_ONCE(!irqs_disabled()))
-		local_irq_disable();
+		raw_local_irq_disable();
 	if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))
-		ct_idle_exit();
+		ct_cpuidle_exit();
 	tick_unfreeze();
 	start_critical_timings();
 
@@ -235,10 +236,10 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
 
 	stop_critical_timings();
 	if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))
-		ct_idle_enter();
+		ct_cpuidle_enter();
 	entered_state = target_state->enter(dev, drv, index);
 	if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))
-		ct_idle_exit();
+		ct_cpuidle_exit();
 	start_critical_timings();
 
 	sched_clock_idle_wakeup_event();
diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
index 8ae9a95ebf5b..9aac31d856f3 100644
--- a/include/linux/clockchips.h
+++ b/include/linux/clockchips.h
@@ -211,7 +211,7 @@ extern int tick_receive_broadcast(void);
 extern void tick_setup_hrtimer_broadcast(void);
 extern int tick_check_broadcast_expired(void);
 # else
-static inline int tick_check_broadcast_expired(void) { return 0; }
+static __always_inline int tick_check_broadcast_expired(void) { return 0; }
 static inline void tick_setup_hrtimer_broadcast(void) { }
 # endif
 
@@ -219,7 +219,7 @@ static inline void tick_setup_hrtimer_broadcast(void) { }
 
 static inline void clockevents_suspend(void) { }
 static inline void clockevents_resume(void) { }
-static inline int tick_check_broadcast_expired(void) { return 0; }
+static __always_inline int tick_check_broadcast_expired(void) { return 0; }
 static inline void tick_setup_hrtimer_broadcast(void) { }
 
 #endif /* !CONFIG_GENERIC_CLOCKEVENTS */
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 0ddc11e44302..630c879143c7 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -14,6 +14,7 @@
 #include <linux/percpu.h>
 #include <linux/list.h>
 #include <linux/hrtimer.h>
+#include <linux/context_tracking.h>
 
 #define CPUIDLE_STATE_MAX	10
 #define CPUIDLE_NAME_LEN	16
@@ -115,6 +116,35 @@ struct cpuidle_device {
 DECLARE_PER_CPU(struct cpuidle_device *, cpuidle_devices);
 DECLARE_PER_CPU(struct cpuidle_device, cpuidle_dev);
 
+static __always_inline void ct_cpuidle_enter(void)
+{
+	lockdep_assert_irqs_disabled();
+	/*
+	 * Idle is allowed to (temporary) enable IRQs. It
+	 * will return with IRQs disabled.
+	 *
+	 * Trace IRQs enable here, then switch off RCU, and have
+	 * arch_cpu_idle() use raw_local_irq_enable(). Note that
+	 * ct_idle_enter() relies on lockdep IRQ state, so switch that
+	 * last -- this is very similar to the entry code.
+	 */
+	trace_hardirqs_on_prepare();
+	lockdep_hardirqs_on_prepare();
+	instrumentation_end();
+	ct_idle_enter();
+	lockdep_hardirqs_on(_RET_IP_);
+}
+
+static __always_inline void ct_cpuidle_exit(void)
+{
+	/*
+	 * Carefully undo the above.
+	 */
+	lockdep_hardirqs_off(_RET_IP_);
+	ct_idle_exit();
+	instrumentation_begin();
+}
+
 /****************************
  * CPUIDLE DRIVER INTERFACE *
  ****************************/
@@ -289,9 +319,9 @@ extern s64 cpuidle_governor_latency_req(unsigned int cpu);
 	if (!is_retention) \
 		__ret = cpu_pm_enter(); \
 	if (!__ret) { \
-		ct_idle_enter(); \
+		ct_cpuidle_enter(); \
 		__ret = low_level_idle_enter(state); \
-		ct_idle_exit(); \
+		ct_cpuidle_exit(); \
 		if (!is_retention) \
 			cpu_pm_exit(); \
 	} \
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index f26ab2675f7d..e924602ec43b 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -51,18 +51,22 @@ __setup("hlt", cpu_idle_nopoll_setup);
 
 static noinline int __cpuidle cpu_idle_poll(void)
 {
+	instrumentation_begin();
 	trace_cpu_idle(0, smp_processor_id());
 	stop_critical_timings();
-	ct_idle_enter();
-	local_irq_enable();
+	ct_cpuidle_enter();
 
+	raw_local_irq_enable();
 	while (!tif_need_resched() &&
 	       (cpu_idle_force_poll || tick_check_broadcast_expired()))
 		cpu_relax();
+	raw_local_irq_disable();
 
-	ct_idle_exit();
+	ct_cpuidle_exit();
 	start_critical_timings();
 	trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
+	local_irq_enable();
+	instrumentation_end();
 
 	return 1;
 }
@@ -85,44 +89,21 @@ void __weak arch_cpu_idle(void)
  */
 void __cpuidle default_idle_call(void)
 {
-	if (current_clr_polling_and_test()) {
-		local_irq_enable();
-	} else {
-
+	instrumentation_begin();
+	if (!current_clr_polling_and_test()) {
 		trace_cpu_idle(1, smp_processor_id());
 		stop_critical_timings();
 
-		/*
-		 * arch_cpu_idle() is supposed to enable IRQs, however
-		 * we can't do that because of RCU and tracing.
-		 *
-		 * Trace IRQs enable here, then switch off RCU, and have
-		 * arch_cpu_idle() use raw_local_irq_enable(). Note that
-		 * ct_idle_enter() relies on lockdep IRQ state, so switch that
-		 * last -- this is very similar to the entry code.
-		 */
-		trace_hardirqs_on_prepare();
-		lockdep_hardirqs_on_prepare();
-		ct_idle_enter();
-		lockdep_hardirqs_on(_THIS_IP_);
-
+		ct_cpuidle_enter();
 		arch_cpu_idle();
-
-		/*
-		 * OK, so IRQs are enabled here, but RCU needs them disabled to
-		 * turn itself back on.. funny thing is that disabling IRQs
-		 * will cause tracing, which needs RCU. Jump through hoops to
-		 * make it 'work'.
-		 */
 		raw_local_irq_disable();
-		lockdep_hardirqs_off(_THIS_IP_);
-		ct_idle_exit();
-		lockdep_hardirqs_on(_THIS_IP_);
-		raw_local_irq_enable();
+		ct_cpuidle_exit();
 
 		start_critical_timings();
 		trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
 	}
+	local_irq_enable();
+	instrumentation_end();
 }
 
 static int call_cpuidle_s2idle(struct cpuidle_driver *drv,
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index f7fe6fe36173..93bf2b4e47e5 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -622,9 +622,13 @@ struct cpumask *tick_get_broadcast_oneshot_mask(void)
  * to avoid a deep idle transition as we are about to get the
  * broadcast IPI right away.
  */
-int tick_check_broadcast_expired(void)
+noinstr int tick_check_broadcast_expired(void)
 {
+#ifdef _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H
+	return arch_test_bit(smp_processor_id(), cpumask_bits(tick_broadcast_force_mask));
+#else
 	return cpumask_test_cpu(smp_processor_id(), tick_broadcast_force_mask);
+#endif
 }
 
 /*
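Taken together, the series leaves the core dispatch in cpuidle_enter_state() looking roughly like this (a simplified recap of the diff above, not additional code): the core performs the RCU-idle transition only for states that do not carry CPUIDLE_FLAG_RCU_IDLE, while flagged states are trusted to call ct_cpuidle_enter()/ct_cpuidle_exit() themselves at the last possible moment.

	/* Simplified recap of the core dispatch after this series. */
	stop_critical_timings();
	if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))
		ct_cpuidle_enter();	/* core handles RCU-idle for legacy states */
	entered_state = target_state->enter(dev, drv, index);
	if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE))
		ct_cpuidle_exit();
	start_critical_timings();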
On Fri, Mar 03, 2023 at 05:23:22PM +0800, Cheng-Jui Wang wrote:
> These patches fix the lockdep splat reported in this thread: https://lore.kernel.org/lkml/20220822164404.57952727@gandalf.local.home/
Why not cc: all of the developers involved in this 00/10 email?
I would need approval from them to be able to take this.
Also, is the lockdep splat correct? Or is it a false positive? And if correct, is it something that you can actually trigger somehow?
thanks,
greg k-h
On Fri, Mar 10, 2023 at 11:12:39AM +0100, Greg KH wrote:
> On Fri, Mar 03, 2023 at 05:23:22PM +0800, Cheng-Jui Wang wrote:
> > These patches fix the lockdep splat reported in this thread: https://lore.kernel.org/lkml/20220822164404.57952727@gandalf.local.home/
>
> Why not cc: all of the developers involved in this 00/10 email?
>
> I would need approval from them to be able to take this.
>
> Also, is the lockdep splat correct? Or is it a false positive? And if correct, is it something that you can actually trigger somehow?
Also, we can not take these for only 6.1.y for obvious reasons that you would hit a regression when you move to 6.2.y. So we need backports for all newer kernels before we can take them for older ones. Please provide all of the backports.
thanks,
greg k-h