There is a nasty regression wrt mt7921e in the latest LTS series (6.12). If your computer crashes or fails to resume from hibernation, then at the next boot the mt7921e wifi does not work, with dmesg reporting that it is unable to change power state from D3cold to D0.
The issue is nasty, because rebooting won't help.
The only solution that I have found is booting a 6.6 kernel. With that, the wifi comes back to life. If, at this point, you boot into 6.12, everything seems to be fine again, until a boot fails, and from that moment on you are without wifi.
The workaround is not very discoverable. On my machine not even a hardware reset (holding the power button for 40 s) helped on its own, without going through the 6.6 kernel boot.
I think the regression was introduced with 6.12 and I remember having no issues with previous kernels, but cannot be 100% sure.
Similarly, I do not know if the bug is with the wifi card driver itself or with something related to PM or PCIe.
The machine on which I am experiencing the issue is an Asus ROG 14 laptop (2022 edition), with AMD CPU and GPU.
Thanks for the attention, Sergio
On Wed, Mar 19, 2025 at 08:38:52PM +0100, Sergio Callegari wrote:
> There is a nasty regression wrt mt7921e in the latest LTS series (6.12). If your computer crashes or fails to resume from hibernation, then at the next boot the mt7921e wifi does not work, with dmesg reporting that it is unable to change power state from D3cold to D0.
> The issue is nasty, because rebooting won't help.
Can you do a 'git bisect' to track down the issue? Also, maybe letting the network driver authors know about this would be good.
thanks,
greg k-h
I might be able to test with the distro-built kernels, which basically track the releases and stable point releases. This should help bracket the problem a bit better as a start. But it is going to take a lot of time, since the issue happens only when the machine fails to resume from hibernation, which is not always, and obviously I need to avoid this situation as much as possible.
Incidentally, the machine seems to hibernate-resume just fine. It is when I suspend-then-hibernate that I get the failures.
Before contacting the network driver authors, I just wanted to query whether the issue is likely in it or in the power-management or pcie subsystems.
Thanks, Sergio
Hey Sergio,
On 25/03/20 08:49AM, Sergio Callegari wrote:
> I might be able to test with the distro-built kernels, which basically track the releases and stable point releases. This should help bracket the problem a bit better as a start. But it is going to take a lot of time, since the issue happens only when the machine fails to resume from hibernation, which is not always, and obviously I need to avoid this situation as much as possible.
Which Linux distro are you using? If you're on Arch Linux I can provide you with prebuilt images for the bisection :)
> Incidentally, the machine seems to hibernate-resume just fine. It is when I suspend-then-hibernate that I get the failures.
> Before contacting the network driver authors, I just wanted to query whether the issue is likely in it or in the power-management or pcie subsystems.
> Thanks, Sergio
Cheers, Chris
Hi Christian,
Thanks for your nice offer, details below:
On 20/03/2025 11:05, Christian Heusel wrote:
> Hey Sergio,
> On 25/03/20 08:49AM, Sergio Callegari wrote:
> > I might be able to test with the distro-built kernels, which basically track the releases and stable point releases. This should help bracket the problem a bit better as a start. But it is going to take a lot of time, since the issue happens only when the machine fails to resume from hibernation, which is not always, and obviously I need to avoid this situation as much as possible.
> Which linux distro are you using? If you're on Arch Linux I can provide you with prebuilt images for the bisection :)
I am on Manjaro, where the kernel follows slightly different naming conventions, but the Arch kernels should be OK. So thank you very much for the nice offer. It is possibly a bit premature, though, in that I would first like to identify the kernel RC or point release where the issue started to appear, because I have these kernels available for my distro, which makes things easier. Unfortunately, I am still in the dark even about this.
The issue is nasty, because it only happens when you crash on restore from hibernation, which is something that I am desperately trying to avoid because this is my work machine and I really don't want to risk data loss.
The big problem with this bug is that you are left with the impression that your hardware is bricked. On the web I read that booting Windows immediately gives you back the wifi device on PCIe, but I really cannot say, as I have no Windows installation to try. What I can say is that the 6.6 LTS kernel also lets you recover the wifi, while 6.12 LTS does not.
As a stopgap, it would be great to know if there is anything that can be done while on 6.12 to fully reset PCIe (or the PCIe device, I still don't know which is the culprit), so that you don't need to boot an older kernel.
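For reference, the generic way to make the kernel re-probe a single PCI device is a remove-and-rescan cycle through sysfs. Below is a minimal Python sketch of that. It is untested against this bug and may well not bring the card back if the device stays stuck in D3cold. The address 0000:03:00.0 is only a placeholder (the real one can be looked up with `lspci -D`), the helper names are made up for illustration, and the writes need root.

```python
import time
from pathlib import Path

# Standard sysfs location of the PCI bus; overridable for dry runs.
SYSFS_PCI = Path("/sys/bus/pci")


def reset_steps(bdf, sysfs_pci=SYSFS_PCI):
    """Return the (path, value) sysfs writes for a remove-and-rescan cycle.

    bdf is the full PCI address, e.g. "0000:03:00.0" (placeholder value).
    """
    return [
        # Detach the device from the kernel's PCI device tree.
        (sysfs_pci / "devices" / bdf / "remove", "1"),
        # Ask the kernel to re-enumerate the bus and re-probe drivers.
        (sysfs_pci / "rescan", "1"),
    ]


def remove_and_rescan(bdf, sysfs_pci=SYSFS_PCI, settle_seconds=1.0):
    """Perform the writes in order, pausing briefly after each one."""
    for path, value in reset_steps(bdf, sysfs_pci):
        path.write_text(value)
        time.sleep(settle_seconds)


# Hypothetical usage, as root:
#   remove_and_rescan("0000:03:00.0")
```

If the device does not reappear in `lspci` after the rescan, this route is probably a dead end for this failure mode.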
Thanks again, Sergio
Hi,
On 20/03/2025 00:54, Greg KH wrote:
> On Wed, Mar 19, 2025 at 08:38:52PM +0100, Sergio Callegari wrote:
> > There is a nasty regression wrt mt7921e in the latest LTS series (6.12). If your computer crashes or fails to resume from hibernation, then at the next boot the mt7921e wifi does not work, with dmesg reporting that it is unable to change power state from D3cold to D0.
> > The issue is nasty, because rebooting won't help.
> Can you do a 'git bisect' to track down the issue? Also, maybe letting the network driver authors know about this would be good.
Bisecting is extra painful, because the issue seems to happen systematically only when the machine freezes on restore from hibernation, which in turn seems to happen only when I enter hibernation through the suspend-then-hibernate path. Because this is my main machine, I desperately need to avoid these freezes/crashes, since I am afraid of data loss (I have already broken a filesystem once this way, and I don't want to repeat the experience).
However, in the meantime I have found confirmation for the issue. See:
- https://bbs.archlinux.org/viewtopic.php?id=301985
- https://forum.manjaro.org/t/wi-fi-mt7921e-stopped-working/175867
I would not totally trust the second link where it says that 6.12.4 was OK.
The general wisdom seems to be that with recent kernels you need to disable ASPM for mt7921e. Now I wonder whether 6.6 was perhaps not enabling ASPM for that device...
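One way to check that would be to compare what `lspci -vv` reports in the card's link control line under each kernel. A rough Python sketch follows, assuming the usual `LnkCtl: ASPM ...;` line format printed by lspci (it has to run as root, otherwise the capability details are hidden). The function names, the placeholder address and the sample text are made up for illustration.

```python
import re
import subprocess

# Matches the ASPM field of the "LnkCtl:" line in `lspci -vv` output.
# "LnkCtl2:" and "LnkCap:" lines are deliberately not matched.
LNKCTL_RE = re.compile(r"LnkCtl:\s*ASPM\s+([^;]+);")


def aspm_state(lspci_vv_text):
    """Return the ASPM field of the first LnkCtl line, or None if absent."""
    match = LNKCTL_RE.search(lspci_vv_text)
    return match.group(1).strip() if match else None


def aspm_state_of(bdf):
    """Query one device (e.g. "0000:03:00.0", a placeholder) via lspci."""
    out = subprocess.run(
        ["lspci", "-vv", "-s", bdf],
        capture_output=True, text=True, check=True,
    ).stdout
    return aspm_state(out)


# Hand-written sample in the assumed format, not a real log from this machine.
SAMPLE = (
    "\t\tLnkCap:\tPort #0, Speed 5GT/s, Width x1, ASPM L0s L1\n"
    "\t\tLnkCtl:\tASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+\n"
)
print(aspm_state(SAMPLE))  # prints "L1 Enabled"
```

Running `aspm_state_of()` for the wifi card once under 6.6 and once under 6.12 would show whether the two kernels really differ on this point.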
There is also:
- https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2059744
From this I gather that the problem was already present in 6.8, and that removing the mt7921e kernel module before shutdown and suspend may work around the issue.
Sergio
linux-stable-mirror@lists.linaro.org