Currently the kernel threads are not frozen in software_resume(), so between dpm_suspend_start(PMSG_QUIESCE) and resume_target_kernel(), system_freezable_power_efficient_wq can still try to submit SCSI commands and this can cause a panic since the low level SCSI driver (e.g. hv_storvsc) has quiesced the SCSI adapter and can not accept any SCSI commands: https://lkml.org/lkml/2020/4/10/47
At first I posted a fix (https://lkml.org/lkml/2020/4/21/1318) trying to resolve the issue from hv_storvsc, but with the help of Bart Van Assche, I realized it's better to fix software_resume(), since this looks like a generic issue, not only pertaining to SCSI.
Cc: Bart Van Assche bvanassche@acm.org
Cc: stable@vger.kernel.org
Signed-off-by: Dexuan Cui decui@microsoft.com
---
 kernel/power/hibernate.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 86aba8706b16..30bd28d1d418 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -898,6 +898,13 @@ static int software_resume(void)
 	error = freeze_processes();
 	if (error)
 		goto Close_Finish;
+
+	error = freeze_kernel_threads();
+	if (error) {
+		thaw_processes();
+		goto Close_Finish;
+	}
+
 	error = load_image_and_restore();
 	thaw_processes();
  Finish:
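To illustrate the window the commit message describes, here is a minimal userspace sketch, not kernel code. It assumes a toy model in which a freezable work item fires after the adapter has been quiesced; every name prefixed with toy_ is hypothetical and exists only for this illustration.

```c
#include <stdbool.h>

/* Toy state; none of these fields mirror real kernel data structures. */
struct toy_system {
	bool user_frozen;       /* set by the freeze_processes() step */
	bool kthreads_frozen;   /* set by the freeze_kernel_threads() step */
	bool adapter_quiesced;  /* set by the dpm_suspend_start(PMSG_QUIESCE) step */
	int rejected_cmds;      /* commands that reached a quiesced adapter */
};

/* A freezable work item: it only runs while kernel threads are not frozen. */
static void toy_worker_submit(struct toy_system *s)
{
	if (s->kthreads_frozen)
		return;                 /* frozen: nothing is submitted */
	if (s->adapter_quiesced)
		s->rejected_cmds++;     /* stands in for the panic in the report */
}

/*
 * Walk the resume-path ordering and return how many commands hit the
 * quiesced adapter. freeze_kthreads selects the patched behaviour.
 */
static int toy_rejected_commands(bool freeze_kthreads)
{
	struct toy_system s = { 0 };

	s.user_frozen = true;               /* freeze_processes() */
	if (freeze_kthreads)
		s.kthreads_frozen = true;   /* freeze_kernel_threads(), added by the patch */
	s.adapter_quiesced = true;          /* dpm_suspend_start(PMSG_QUIESCE) */
	toy_worker_submit(&s);              /* work fires before resume_target_kernel() */
	return s.rejected_cmds;
}
```

In this model, toy_rejected_commands(false) counts one command reaching the quiesced adapter, while toy_rejected_commands(true) counts none, which is the effect the patch aims for.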
Hi
[This is an automated email]
This commit has been processed because it contains a -stable tag. The stable tag indicates that it's relevant for the following trees: all
The bot has tested the following trees: v5.6.7, v5.4.35, v4.19.118, v4.14.177, v4.9.220, v4.4.220.
v5.6.7: Build OK!
v5.4.35: Build OK!
v4.19.118: Build OK!
v4.14.177: Build OK!
v4.9.220: Build OK!
v4.4.220: Failed to apply! Possible dependencies:
    ea00f4f4f00c ("PM / sleep: make PM notifiers called symmetrically")
    fe12c00d21bb ("PM / hibernate: Introduce test_resume mode for hibernation")
NOTE: The patch will not be queued to stable trees until it is upstream.
How should we proceed with this patch?
On Friday, April 24, 2020 5:40:16 AM CEST Dexuan Cui wrote:
Currently the kernel threads are not frozen in software_resume(), so between dpm_suspend_start(PMSG_QUIESCE) and resume_target_kernel(), system_freezable_power_efficient_wq can still try to submit SCSI commands and this can cause a panic since the low level SCSI driver (e.g. hv_storvsc) has quiesced the SCSI adapter and can not accept any SCSI commands: https://lkml.org/lkml/2020/4/10/47
At first I posted a fix (https://lkml.org/lkml/2020/4/21/1318) trying to resolve the issue from hv_storvsc, but with the help of Bart Van Assche, I realized it's better to fix software_resume(), since this looks like a generic issue, not only pertaining to SCSI.
Cc: Bart Van Assche bvanassche@acm.org
Cc: stable@vger.kernel.org
Signed-off-by: Dexuan Cui decui@microsoft.com
---
 kernel/power/hibernate.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 86aba8706b16..30bd28d1d418 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -898,6 +898,13 @@ static int software_resume(void)
 	error = freeze_processes();
 	if (error)
 		goto Close_Finish;
+
+	error = freeze_kernel_threads();
+	if (error) {
+		thaw_processes();
+		goto Close_Finish;
+	}
+
 	error = load_image_and_restore();
 	thaw_processes();
  Finish:
Applied as a fix for 5.7-rc4, thanks!
On 2020-04-26 09:24, Rafael J. Wysocki wrote:
Applied as a fix for 5.7-rc4, thanks!
Hi Rafael,
What is not clear to me is how kernel threads are thawed after load_image_and_restore() has finished? Should a comment perhaps be added above the freeze_kernel_threads() call that explains how thaw_kernel_threads() is invoked after load_image_and_restore() has finished?
Thanks,
Bart.
From: Bart Van Assche bvanassche@acm.org
Sent: Sunday, April 26, 2020 11:34 AM
To: Rafael J. Wysocki rjw@rjwysocki.net; Dexuan Cui decui@microsoft.com
Applied as a fix for 5.7-rc4, thanks!
Hi Rafael,
What is not clear to me is how kernel threads are thawed after load_image_and_restore() has finished? Should a comment perhaps be added above the freeze_kernel_threads() call that explains how thaw_kernel_threads() is invoked after load_image_and_restore() has finished?
Bart.
Hi Bart, Rafael,
I would suggest the comment below:
If load_image_and_restore() succeeds, it won't return; execution resumes in the 'old' kernel's hibernate() -> hibernation_snapshot() -> create_image() -> swsusp_arch_suspend(), and later hibernate() -> thaw_processes() thaws every frozen kernel thread and userspace process of the 'old' kernel.
Thanks,
-- Dexuan
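The control flow described in that suggested comment can be sketched in userspace with setjmp()/longjmp() as a loose analogy. This is only an illustration under stated assumptions: the saved jmp_buf stands in for the hibernation image, the second return from setjmp() stands in for execution resuming after swsusp_arch_suspend(), and every toy_ name is hypothetical rather than a kernel function.

```c
#include <setjmp.h>
#include <stdbool.h>

static jmp_buf toy_image;         /* stands in for the saved hibernation image */
static bool toy_user_frozen;
static bool toy_kthreads_frozen;
static int toy_restore_returned;  /* stays 0 when the restore never returns */

/* Resume side: freeze everything, then jump into the saved context. */
static void toy_software_resume(void)
{
	toy_user_frozen = true;       /* freeze_processes() */
	toy_kthreads_frozen = true;   /* freeze_kernel_threads() */
	longjmp(toy_image, 1);        /* successful restore: does not return */
}

/* In this model a single call thaws both kinds of task. */
static void toy_thaw_processes(void)
{
	toy_user_frozen = false;
	toy_kthreads_frozen = false;
}

/* Returns 1 when the thaw was reached through the restored context. */
static int toy_hibernate(void)
{
	if (setjmp(toy_image) == 0) {
		/* First return: the "image" is saved; simulate the later resume. */
		toy_software_resume();
		toy_restore_returned = 1;   /* never reached */
		return 0;
	}
	/* Second return: running again as the restored 'old' context. */
	toy_thaw_processes();
	return 1;
}
```

Calling toy_hibernate() once walks both halves: the code after toy_software_resume() is skipped, and the thaw runs on the path that originally took the snapshot.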
On Sun, Apr 26, 2020 at 8:34 PM Bart Van Assche bvanassche@acm.org wrote:
Hi Rafael,
What is not clear to me is how kernel threads are thawed after load_image_and_restore() has finished? Should a comment perhaps be added above the freeze_kernel_threads() call that explains how thaw_kernel_threads() is invoked after load_image_and_restore() has finished?
It isn't, because that is not necessary.
thaw_processes() will thaw them along with the user space.
Cheers!
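As a rough illustration of the point that thaw_processes() covers kernel threads as well as user space, here is a minimal userspace model of the failure path in the patched software_resume(). It is a sketch only: the toy_ functions and the task table are hypothetical, and the model ignores that only freezable kernel threads are frozen in the real kernel.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_task {
	const char *name;
	bool is_kthread;
	bool frozen;
};

/* Model of freeze_processes(): freezes user space tasks only. */
static void toy_freeze_processes(struct toy_task *t, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!t[i].is_kthread)
			t[i].frozen = true;
}

/* Model of freeze_kernel_threads(): freezes kernel threads only. */
static void toy_freeze_kernel_threads(struct toy_task *t, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (t[i].is_kthread)
			t[i].frozen = true;
}

/* Model of thaw_processes(): thaws every task, kernel threads included. */
static void toy_thaw_processes(struct toy_task *t, size_t n)
{
	for (size_t i = 0; i < n; i++)
		t[i].frozen = false;
}

static size_t toy_count_frozen(const struct toy_task *t, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (t[i].frozen)
			count++;
	return count;
}

/* Failure path: the restore returns, so the single thaw call runs. */
static size_t toy_frozen_after_failed_resume(void)
{
	struct toy_task tasks[] = {
		{ "init", false, false },
		{ "bash", false, false },
		{ "kworker", true, false },
	};
	size_t n = sizeof(tasks) / sizeof(tasks[0]);

	toy_freeze_processes(tasks, n);
	toy_freeze_kernel_threads(tasks, n);
	/* load_image_and_restore() is assumed to fail and return here. */
	toy_thaw_processes(tasks, n);
	return toy_count_frozen(tasks, n);
}
```

In this model no separate thaw call for kernel threads is needed after the failed restore, because the one thaw pass clears the frozen flag on every task.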
linux-stable-mirror@lists.linaro.org