Hi,
A change was added to both versions 5.17.5 and 5.15.36 which causes my computer to freeze when resuming after a suspend. This happens every time I suspend and then resume.
I've bisected the change to commit: cbe6c3a8f8f4315b96e46e1a1c70393c06d95a4c (net: atlantic: invert deep par in pm functions, preventing null derefs) https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
My computer details that might be relevant:
OS: Arch Linux
CPU: AMD 5950X
GPU: AMD 6800XT
As expected I have an Aquantia ethernet controller listed in lspci:
05:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
Please let me know if there is any more info I can give that will help.
Regards, Jordan
Some info I missed out, and some I've discovered:
1. This causes my system to freeze completely, so I have to reboot to recover.
2. There are no system logs from the crash; in fact, there are no logs from the resume at all. The last logs are from the computer going into suspend.
3. I've found that I can prevent this crash by unloading the atlantic module before suspending (modprobe -r atlantic).
4. Also, if I take the v5.17.5 tag of the kernel and revert the commit mentioned in my first email, this also prevents the crash.
Regards, Jordan
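The workaround in point 3 (unloading the atlantic module before suspending) could be automated on a systemd-based system such as Arch Linux. What follows is only a minimal sketch, not something taken from this thread: it assumes the hook mechanism described in systemd-sleep(8), where executables in /usr/lib/systemd/system-sleep/ are called with "pre" or "post" as the first argument. The function name atlantic_sleep_hook and the MODPROBE override are hypothetical and exist only to make the sketch easy to dry-run.

```shell
#!/bin/sh
# Hypothetical sleep hook, e.g. installed as an executable file under
# /usr/lib/systemd/system-sleep/ (assumption: systemd-sleep calls it with
# $1 = pre|post and $2 = suspend|hibernate|...).

# Allow the modprobe command to be overridden (e.g. MODPROBE=echo) for a dry run.
MODPROBE="${MODPROBE:-modprobe}"

atlantic_sleep_hook() {
    case "$1" in
        pre)
            # Before sleeping: unload the driver, as in the manual workaround.
            "$MODPROBE" -r atlantic
            ;;
        post)
            # After resuming: load the driver again.
            "$MODPROBE" atlantic
            ;;
    esac
}

# Only act when systemd-sleep (or a user) passed arguments.
if [ "$#" -gt 0 ]; then
    atlantic_sleep_hook "$@"
fi
```

Setting MODPROBE=echo and calling the function with "pre" or "post" prints the command that would be run instead of touching the module.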
On 2022-05-04 17:07, Jordan Leppert wrote:
> Hi,
> A change was added to both versions 5.17.5 and 5.15.36 which causes my computer to freeze when resuming after a suspend. This happens every time I suspend and then resume.
> I've bisected the change to commit: cbe6c3a8f8f4315b96e46e1a1c70393c06d95a4c (net: atlantic: invert deep par in pm functions, preventing null derefs) https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
> My computer details that might be relevant:
> OS: Arch Linux
> CPU: AMD 5950X
> GPU: AMD 6800XT
> As expected I have an Aquantia ethernet controller listed in lspci:
> 05:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
> Please let me know if there is any more info I can give that will help.
> Regards, Jordan
Just a quick note that I have the same issue (same card model); since recently (5.15.36) the hang after resume is 100% reliable. IIRC it used to be hit-and-miss before that. I'm currently building .38 with the mentioned commit reverted and will report back. Thanks for bringing this up.
-h
On 2022-05-04 19:50, Holger Hoffstätte wrote:
> On 2022-05-04 17:07, Jordan Leppert wrote:
>> Hi,
>> A change was added to both versions 5.17.5 and 5.15.36 which causes my computer to freeze when resuming after a suspend. This happens every time I suspend and then resume.
>> I've bisected the change to commit: cbe6c3a8f8f4315b96e46e1a1c70393c06d95a4c (net: atlantic: invert deep par in pm functions, preventing null derefs) https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
>> My computer details that might be relevant:
>> OS: Arch Linux
>> CPU: AMD 5950X
>> GPU: AMD 6800XT
>> As expected I have an Aquantia ethernet controller listed in lspci:
>> 05:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
>> Please let me know if there is any more info I can give that will help.
>> Regards, Jordan
> Just a quick note that I have the same issue (same card model); since recently (5.15.36) the hang after resume is 100% reliable. IIRC it used to be hit-and-miss before that. I'm currently building .38 with the mentioned commit reverted and will report back. Thanks for bringing this up.
With said commit reverted and 5.15.38 I got 1 successful resume and 1 lockup. The difference is that with the patch reverted, the locked-up system can be pinged (unlike with the patch applied), though resume still does not finish properly and now probably runs into the problem that the patch was trying to fix. Any network services like ssh are still dead though. This used to work every time, all the time. Looks like I'll try removing the module before suspend.
-h
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
Yup, that’s my fault and I reproduced this myself yesterday. I actually expected this to happen and attempted to test suspend with the patch, but must have screwed up by kexec-rebooting into an unpatched kernel version or something like that. I’ll disable the kexec service in the future, if I ever need to prepare a patch again.
> 05:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
Yes, I have the same one.
> Please let me know if there is any more info I can give that will help.
Can you confirm that hibernation works with the patch, but not without it? The patch was an attempt to fix it, because I had the same behaviour with hibernation. I tried to make sense of the deep parameter in atl_resume_common pm function calls, but apparently it’s always required to be true and thus obsolete.
I’ll leave the cleanup of that parameter to the maintainers for mainline and prepare a patch. Last time I sent it against mainline. If this fixup of a stable patch regression should be posted differently, it would be nice if someone could give me a pointer.
5.10.113 is also affected.
On 2022-05-04 21:25, Manuel Ullmann wrote:
>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
> Yup, that’s my fault and I reproduced this myself yesterday. I actually expected this to happen and attempted to test suspend with the patch, but must have screwed up by kexec-rebooting into an unpatched kernel version or something like that. I’ll disable the kexec service in the future, if I ever need to prepare a patch again.
>> 05:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
> Yes, I have the same one.
>> Please let me know if there is any more info I can give that will help.
> Can you confirm that hibernation works with the patch, but not without it? The patch was an attempt to fix it, because I had the same behaviour
Cannot test hibernation, but..
> with hibernation. I tried to make sense of the deep parameter in atl_resume_common pm function calls, but apparently it’s always required to be true and thus obsolete.
..I patched 5.15.38 to pass true as deep arg everywhere, and now resume seems to work again reliably, 5 out of 5. \o/
> I’ll leave the cleanup of that parameter to the maintainers for mainline and prepare a patch. Last time I sent it against mainline. If this fixup of a stable patch regression should be posted differently, it would be nice if someone could give me a pointer.
Send fix to mainline first, with Fixes: <mainline commit id> tag and Cc: stable mentioning the affected versions.
cheers Holger
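The two trailers mentioned in the hint above can be written out mechanically. The following is only a sketch under stated assumptions: it follows the conventions in the kernel's patch submission and stable rules documentation (a Fixes: tag with the first 12 characters of the commit hash plus the quoted subject, and a Cc: line to stable@vger.kernel.org with a version comment). The function names fixes_tag and stable_cc_tag are hypothetical, and the hash, subject and version shown are placeholders, not the real mainline commit for this regression.

```shell
#!/bin/sh
# Print a Fixes: trailer. $1 = full commit hash, $2 = commit subject line.
# The hash is abbreviated to its first 12 characters.
fixes_tag() {
    printf 'Fixes: %.12s ("%s")\n' "$1" "$2"
}

# Print a Cc: stable trailer. $1 = affected stable series (free-form comment).
stable_cc_tag() {
    printf 'Cc: <stable@vger.kernel.org> # %s\n' "$1"
}

# Placeholder values for illustration only.
fixes_tag "0123456789abcdef0123456789abcdef01234567" "example: placeholder subject"
stable_cc_tag "5.15.x"
```

Both lines would go at the end of the commit message, next to the Signed-off-by: line.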
Holger Hoffstätte holger@applied-asynchrony.com writes:
> On 2022-05-04 21:25, Manuel Ullmann wrote:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
>> Yup, that’s my fault and I reproduced this myself yesterday. I actually expected this to happen and attempted to test suspend with the patch, but must have screwed up by kexec-rebooting into an unpatched kernel version or something like that. I’ll disable the kexec service in the future, if I ever need to prepare a patch again.
>>> 05:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
>> Yes, I have the same one.
>>> Please let me know if there is any more info I can give that will help.
>> Can you confirm that hibernation works with the patch, but not without it? The patch was an attempt to fix it, because I had the same behaviour
> Cannot test hibernation, but..
That’s unfortunate.
>> with hibernation. I tried to make sense of the deep parameter in atl_resume_common pm function calls, but apparently it’s always required to be true and thus obsolete.
> ..I patched 5.15.38 to pass true as deep arg everywhere, and now resume seems to work again reliably, 5 out of 5. \o/
Thanks for confirming that my patch should work. For some reason I had the same idea. ;)
>> I’ll leave the cleanup of that parameter to the maintainers for mainline and prepare a patch. Last time I sent it against mainline. If this fixup of a stable patch regression should be posted differently, it would be nice if someone could give me a pointer.
> Send fix to mainline first, with Fixes: <mainline commit id> tag and Cc: stable mentioning the affected versions.
Thanks for the hint. I did that.
> cheers Holger
[TLDR: I'm adding the regression report below to regzbot, the Linux kernel regression tracking bot; all text you find below is compiled from a few template paragraphs you might have already encountered in similar mails.]
Hi, this is your Linux kernel regression tracker. Top-posting for once, to make this easily accessible to everyone.
This is being dealt with already (great, thx a lot!), nevertheless I'd like to add it to the regression tracking:
#regzbot ^introduced cbe6c3a8f8f4315b96e46e1a1c70393c06d95a4c
#regzbot title net: atlantic: computer to freeze when resuming after a suspend
#regzbot ignore-activity
#regzbot monitor https://lore.kernel.org/lkml/87czgt2bsb.fsf@posteo.de/
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
P.S.: As the Linux kernel's regression tracker I deal with a lot of reports and sometimes miss something important when writing mails like this. If that's the case here, don't hesitate to tell me in a public reply, it's in everyone's interest to set the public record straight.
On 04.05.22 17:07, Jordan Leppert wrote:
> A change was added to both versions 5.17.5 and 5.15.36 which causes my computer to freeze when resuming after a suspend. This happens every time I suspend and then resume.
> I've bisected the change to commit: cbe6c3a8f8f4315b96e46e1a1c70393c06d95a4c (net: atlantic: invert deep par in pm functions, preventing null derefs) https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=l...
> My computer details that might be relevant:
> OS: Arch Linux
> CPU: AMD 5950X
> GPU: AMD 6800XT
> As expected I have an Aquantia ethernet controller listed in lspci:
> 05:00.0 Ethernet controller: Aquantia Corp. AQC107 NBase-T/IEEE 802.3bz Ethernet Controller [AQtion] (rev 02)
> Please let me know if there is any more info I can give that will help.
> Regards, Jordan
linux-stable-mirror@lists.linaro.org