Since commit 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature") coupled CPU idle freezes from time to time on Exynos4210. Later commit 313c8c16ee62 ("PM / CPU: replace raw_notifier with atomic_notifier") changed the context in which the CPU idle code is executed, what results in fully reproducible freeze all the time. However, almost the same coupled CPU idle code works fine on Exynos3250 regarless of the changes made in the mentioned commits.
It turned out that the IPI call used on Exynos4210 is conflicting with the change done in the first mentioned commit in GIC. Fix this by using the same code path as for Exynos3250, instead of the IPI call for synchronization with second CPU core, call dsb_sev() directly.
Tested on Exynos4210-based Trats and Origen boards.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com CC: stable@vger.kernel.org # v4.13+ --- arch/arm/mach-exynos/pm.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/arch/arm/mach-exynos/pm.c b/arch/arm/mach-exynos/pm.c index dc4346ecf16d..a1055a2b8d54 100644 --- a/arch/arm/mach-exynos/pm.c +++ b/arch/arm/mach-exynos/pm.c @@ -271,11 +271,7 @@ static int exynos_cpu0_enter_aftr(void) goto fail;
call_firmware_op(cpu_boot, 1); - - if (soc_is_exynos3250()) - dsb_sev(); - else - arch_send_wakeup_ipi_mask(cpumask_of(1)); + dsb_sev(); } } fail:
On 21/03/18 09:45, Marek Szyprowski wrote:
Since commit 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature") coupled CPU idle freezes from time to time on Exynos4210. Later commit 313c8c16ee62 ("PM / CPU: replace raw_notifier with atomic_notifier") changed the context in which the CPU idle code is executed, what results in fully reproducible freeze all the time. However, almost the same coupled CPU idle code works fine on Exynos3250 regarless of the changes made in the mentioned commits.
It turned out that the IPI call used on Exynos4210 is conflicting with the change done in the first mentioned commit in GIC. Fix this by using the same code path as for Exynos3250, instead of the IPI call for synchronization with second CPU core, call dsb_sev() directly.
Tested on Exynos4210-based Trats and Origen boards.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com CC: stable@vger.kernel.org # v4.13+
arch/arm/mach-exynos/pm.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/arch/arm/mach-exynos/pm.c b/arch/arm/mach-exynos/pm.c index dc4346ecf16d..a1055a2b8d54 100644 --- a/arch/arm/mach-exynos/pm.c +++ b/arch/arm/mach-exynos/pm.c @@ -271,11 +271,7 @@ static int exynos_cpu0_enter_aftr(void) goto fail; call_firmware_op(cpu_boot, 1);
if (soc_is_exynos3250())
dsb_sev();
else
arch_send_wakeup_ipi_mask(cpumask_of(1));
} }dsb_sev();
fail:
I'm a bit puzzled here.
If the GIC change broke something on your platform, it probably also broke something for all other users of arch_send_wakeup_ipi_mask. But nobody reported anything until now. Also, it doesn't look like 4210 is a big-little platform, so it is unlikely to use the big-little switcher.
What is specific to this system that makes misbehave? Were you implicitly relying on the BL lock to perform some serialization?
Thanks,
M.
Hi Marc,
On 2018-03-21 11:05, Marc Zyngier wrote:
On 21/03/18 09:45, Marek Szyprowski wrote:
Since commit 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature") coupled CPU idle freezes from time to time on Exynos4210. Later commit 313c8c16ee62 ("PM / CPU: replace raw_notifier with atomic_notifier") changed the context in which the CPU idle code is executed, what results in fully reproducible freeze all the time. However, almost the same coupled CPU idle code works fine on Exynos3250 regarless of the changes made in the mentioned commits.
It turned out that the IPI call used on Exynos4210 is conflicting with the change done in the first mentioned commit in GIC. Fix this by using the same code path as for Exynos3250, instead of the IPI call for synchronization with second CPU core, call dsb_sev() directly.
Tested on Exynos4210-based Trats and Origen boards.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com CC: stable@vger.kernel.org # v4.13+
arch/arm/mach-exynos/pm.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/arch/arm/mach-exynos/pm.c b/arch/arm/mach-exynos/pm.c index dc4346ecf16d..a1055a2b8d54 100644 --- a/arch/arm/mach-exynos/pm.c +++ b/arch/arm/mach-exynos/pm.c @@ -271,11 +271,7 @@ static int exynos_cpu0_enter_aftr(void) goto fail; call_firmware_op(cpu_boot, 1);
if (soc_is_exynos3250())
dsb_sev();
else
arch_send_wakeup_ipi_mask(cpumask_of(1));
} } fail:dsb_sev();
I'm a bit puzzled here.
If the GIC change broke something on your platform, it probably also broke something for all other users of arch_send_wakeup_ipi_mask. But nobody reported anything until now. Also, it doesn't look like 4210 is a big-little platform, so it is unlikely to use the big-little switcher.
What is specific to this system that makes misbehave? Were you implicitly relying on the BL lock to perform some serialization?
My hypothesis, after some internal discussion, is that dsb_sev() should be there from the beginning, but instead there was an arch_send_wakeup_ipi_mask() call, which indirectly called dsb_sev() as a part of GIC spinlock internals. When spinlock has been removed from GIC, there is no dsb_sev() call anymore, what causes synchronization issue.
Best regards
On 21/03/18 10:18, Marek Szyprowski wrote:
Hi Marc,
On 2018-03-21 11:05, Marc Zyngier wrote:
On 21/03/18 09:45, Marek Szyprowski wrote:
Since commit 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature") coupled CPU idle freezes from time to time on Exynos4210. Later commit 313c8c16ee62 ("PM / CPU: replace raw_notifier with atomic_notifier") changed the context in which the CPU idle code is executed, what results in fully reproducible freeze all the time. However, almost the same coupled CPU idle code works fine on Exynos3250 regarless of the changes made in the mentioned commits.
It turned out that the IPI call used on Exynos4210 is conflicting with the change done in the first mentioned commit in GIC. Fix this by using the same code path as for Exynos3250, instead of the IPI call for synchronization with second CPU core, call dsb_sev() directly.
Tested on Exynos4210-based Trats and Origen boards.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com CC: stable@vger.kernel.org # v4.13+
arch/arm/mach-exynos/pm.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/arch/arm/mach-exynos/pm.c b/arch/arm/mach-exynos/pm.c index dc4346ecf16d..a1055a2b8d54 100644 --- a/arch/arm/mach-exynos/pm.c +++ b/arch/arm/mach-exynos/pm.c @@ -271,11 +271,7 @@ static int exynos_cpu0_enter_aftr(void) goto fail; call_firmware_op(cpu_boot, 1);
if (soc_is_exynos3250())
dsb_sev();
else
arch_send_wakeup_ipi_mask(cpumask_of(1));
} } fail:dsb_sev();
I'm a bit puzzled here.
If the GIC change broke something on your platform, it probably also broke something for all other users of arch_send_wakeup_ipi_mask. But nobody reported anything until now. Also, it doesn't look like 4210 is a big-little platform, so it is unlikely to use the big-little switcher.
What is specific to this system that makes misbehave? Were you implicitly relying on the BL lock to perform some serialization?
My hypothesis, after some internal discussion, is that dsb_sev() should be there from the beginning, but instead there was an arch_send_wakeup_ipi_mask() call, which indirectly called dsb_sev() as a part of GIC spinlock internals. When spinlock has been removed from GIC, there is no dsb_sev() call anymore, what causes synchronization issue.
That's pretty scary. Glad you pinpointed it in the end.
FWIW:
Acked-by: Marc Zyngier marc.zyngier@arm.com
M.
On Wednesday, March 21, 2018 10:45:05 AM Marek Szyprowski wrote:
Since commit 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature") coupled CPU idle freezes from time to time on Exynos4210. Later commit 313c8c16ee62 ("PM / CPU: replace raw_notifier with atomic_notifier") changed the context in which the CPU idle code is executed, what results in fully reproducible freeze all the time. However, almost the same coupled CPU idle code works fine on Exynos3250 regarless of the changes made in the mentioned commits.
It turned out that the IPI call used on Exynos4210 is conflicting with the change done in the first mentioned commit in GIC. Fix this by using the same code path as for Exynos3250, instead of the IPI call for synchronization with second CPU core, call dsb_sev() directly.
Tested on Exynos4210-based Trats and Origen boards.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com CC: stable@vger.kernel.org # v4.13+
Thanks for spotting and fixing this.
Acked-by: Bartlomiej Zolnierkiewicz b.zolnierkie@samsung.com
Best regards, -- Bartlomiej Zolnierkiewicz Samsung R&D Institute Poland Samsung Electronics
On Wednesday, March 21, 2018 10:45:05 AM CET Marek Szyprowski wrote:
Since commit 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature") coupled CPU idle freezes from time to time on Exynos4210. Later commit 313c8c16ee62 ("PM / CPU: replace raw_notifier with atomic_notifier") changed the context in which the CPU idle code is executed, what results in fully reproducible freeze all the time. However, almost the same coupled CPU idle code works fine on Exynos3250 regarless of the changes made in the mentioned commits.
It turned out that the IPI call used on Exynos4210 is conflicting with the change done in the first mentioned commit in GIC. Fix this by using the same code path as for Exynos3250, instead of the IPI call for synchronization with second CPU core, call dsb_sev() directly.
Tested on Exynos4210-based Trats and Origen boards.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com CC: stable@vger.kernel.org # v4.13+
arch/arm/mach-exynos/pm.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/arch/arm/mach-exynos/pm.c b/arch/arm/mach-exynos/pm.c index dc4346ecf16d..a1055a2b8d54 100644 --- a/arch/arm/mach-exynos/pm.c +++ b/arch/arm/mach-exynos/pm.c @@ -271,11 +271,7 @@ static int exynos_cpu0_enter_aftr(void) goto fail; call_firmware_op(cpu_boot, 1);
if (soc_is_exynos3250())
dsb_sev();
else
arch_send_wakeup_ipi_mask(cpumask_of(1));
} }dsb_sev();
fail:
That will be -stable material I believe?
On Wed, Mar 21, 2018 at 02:49:52PM +0100, Rafael J. Wysocki wrote:
On Wednesday, March 21, 2018 10:45:05 AM CET Marek Szyprowski wrote:
Since commit 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature") coupled CPU idle freezes from time to time on Exynos4210. Later commit 313c8c16ee62 ("PM / CPU: replace raw_notifier with atomic_notifier") changed the context in which the CPU idle code is executed, what results in fully reproducible freeze all the time. However, almost the same coupled CPU idle code works fine on Exynos3250 regarless of the changes made in the mentioned commits.
It turned out that the IPI call used on Exynos4210 is conflicting with the change done in the first mentioned commit in GIC. Fix this by using the same code path as for Exynos3250, instead of the IPI call for synchronization with second CPU core, call dsb_sev() directly.
Tested on Exynos4210-based Trats and Origen boards.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com CC: stable@vger.kernel.org # v4.13+
arch/arm/mach-exynos/pm.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/arch/arm/mach-exynos/pm.c b/arch/arm/mach-exynos/pm.c index dc4346ecf16d..a1055a2b8d54 100644 --- a/arch/arm/mach-exynos/pm.c +++ b/arch/arm/mach-exynos/pm.c @@ -271,11 +271,7 @@ static int exynos_cpu0_enter_aftr(void) goto fail; call_firmware_op(cpu_boot, 1);
if (soc_is_exynos3250())
dsb_sev();
else
arch_send_wakeup_ipi_mask(cpumask_of(1));
} }dsb_sev();
fail:
That will be -stable material I believe?
Yes, it is CC-stable.
Marek, Thanks, applied (with adjustment of stable address - I believe if some text follows it, it makes more readable to put <>).
Best regards, Krzysztof
linux-stable-mirror@lists.linaro.org