Hi Mario et al,
Eric Degenetais reported in Debian (cf. https://bugs.debian.org/1091696) that after 7627a0edef54 ("ata: ahci: Drop low power policy board type") rebooting the system fails (but the system boots fine if cold booted).
His report mentions that the SSD is not seen on warm reboots anymore.
Does this ring a bell? Might it be caused by the above bisected[1] commit?
#regzbot introduced: 7627a0edef54
#regzbot link: https://bugs.debian.org/1091696
What information could be helpful to you to identify the problem?
Regards, Salvatore
On 25/03/02 05:03PM, Salvatore Bonaccorso wrote:
Hi Mario et al,
Hey Salvatore,
Eric Degenetais reported in Debian (cf. https://bugs.debian.org/1091696) that after 7627a0edef54 ("ata: ahci: Drop low power policy board type") rebooting the system fails (but the system boots fine if cold booted).
His report mentions that the SSD is not seen on warm reboots anymore.
Does this ring a bell? Might it be caused by the above bisected[1] commit?
Just FYI, we have recently bisected an issue to the same commit: https://lore.kernel.org/all/e2be6f70-dff6-4b79-bd49-70ec7e27fc1c@heusel.eu/
What information could be helpful to you to identify the problem?
The other thread also has some debugging steps that could be interesting for this problem as well!
Cheers, Chris
On Sun, Mar 02, 2025 at 05:03:48PM +0100, Salvatore Bonaccorso wrote:
Hi Mario et al,
Eric Degenetais reported in Debian (cf. https://bugs.debian.org/1091696) that after 7627a0edef54 ("ata: ahci: Drop low power policy board type") rebooting the system fails (but the system boots fine if cold booted).
His report mentions that the SSD is not seen on warm reboots anymore.
Does this ring a bell? Might it be caused by the above bisected[1] commit?
#regzbot introduced: 7627a0edef54
#regzbot link: https://bugs.debian.org/1091696
What information could be helpful to you to identify the problem?
Additional information from the reporter: The SSD is:
$ sudo smartctl -i /dev/disk/by-id/ata-Samsung_SSD_870_QVO_2TB_S5RPNF0T419459E
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.12-amd64] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 870 QVO 2TB
Serial Number:    S5RPNF0T419459E
LU WWN Device Id: 5 002538 f4243493c
Firmware Version: SVQ02B6Q
User Capacity:    2 000 398 934 016 bytes [2,00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available, deterministic, zeroed
Device is:        In smartctl database 7.3/5528
ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Mar 2 18:46:44 2025 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
So this might be the same issue that cc77e2ce187d ("ata: libata-core: Add ATA_QUIRK_NOLPM for Samsung SSD 870 QVO drives") aimed to address, but which was reverted with a2f925a2f622 ("Revert "ata: libata-core: Add ATA_QUIRK_NOLPM for Samsung SSD 870 QVO drives"") as it introduced other problems.
So I'm also adding Daniel Baumann to the loop, as this seems related.
FTR, thanks to Christian Heusel for the other comments and input!
Regards, Salvatore
On Sun, Mar 02, 2025 at 05:03:48PM +0100, Salvatore Bonaccorso wrote:
Hi Mario et al,
Eric Degenetais reported in Debian (cf. https://bugs.debian.org/1091696) that after 7627a0edef54 ("ata: ahci: Drop low power policy board type") rebooting the system fails (but the system boots fine if cold booted).
His report mentions that the SSD is not seen on warm reboots anymore.
Does this ring a bell? Might it be caused by the above bisected[1] commit?
#regzbot introduced: 7627a0edef54
#regzbot link: https://bugs.debian.org/1091696
What information could be helpful to you to identify the problem?
The model and fw version of the SSD.
Anyway, I found it in the bug report:
Device Model: Samsung SSD 870 QVO 2TB
Firmware Version: SVQ02B6Q
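For anyone reporting a similar problem, these two fields can be pulled out of the smartctl -i output with a small filter. This is only a sketch: the function name extract_model_fw is made up for illustration, and it assumes the field labels look like the ones in the bug report.

```shell
#!/bin/sh
# extract_model_fw (hypothetical helper name): read `smartctl -i` output on
# stdin and print only the device model and firmware version lines.
extract_model_fw() {
    # Fields are separated by ": " plus padding; print label and value.
    awk -F': +' '/^(Device Model|Firmware Version):/ { print $1 ": " $2 }'
}
```

Usage would be along the lines of "sudo smartctl -i /dev/sda | extract_model_fw", where /dev/sda is a placeholder for the affected drive.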
The firmware for this SSD is not great, and has caused us a lot of pain recently:
https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi...
https://lore.kernel.org/linux-ide/Z7xk1LbiYFAAsb9p@ryzen/T/#m831645f6cf2e6b5...
https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi...
Basically, older firmware versions for this SSD have broken LPM, but from user reports, the latest firmware version (which Eric is using) is apparently working:
https://bugzilla.kernel.org/show_bug.cgi?id=219747
https://lore.kernel.org/stable/93c10d38-718c-459d-84a5-4d87680b4da7@debian.o...
Eric is using the latest SSD firmware version. So from other people's reports, I would expect things to work for him as well.
However, no one has reported that their UEFI does not detect their SSD. This seems to be either an SSD firmware bug or a UEFI bug.
I would expect your UEFI to send a COMRESET even during a reboot, and according to the AHCI spec a COMRESET shall take the device out of sleep states.
Considering that no one else seems to have any problem when using the latest firmware version for this SSD, this seems to be a problem specific to Eric. So... UEFI bug?
Have you tried updating your BIOS?
Kind regards, Niklas
Hi Niklas,
Le 02/03/2025 à 20:32, Niklas Cassel a écrit :
On Sun, Mar 02, 2025 at 05:03:48PM +0100, Salvatore Bonaccorso wrote:
Hi Mario et al,
Eric Degenetais reported in Debian (cf. https://bugs.debian.org/1091696) that after 7627a0edef54 ("ata: ahci: Drop low power policy board type") rebooting the system fails (but the system boots fine if cold booted).
For what it's worth, before getting these replies I tested the ahci.mobile_lpm_policy=1 kernel parameter, which did work around the problem.
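To confirm that such a workaround is actually in effect, one can look at the kernel command line and at the per-host policy files in sysfs. A minimal sketch follows. The function names and the optional path arguments are invented here for illustration (the arguments exist only so the functions can be pointed at a test directory), and /sys/class/scsi_host/host*/link_power_management_policy is assumed to be where the running kernel exposes the policy. With ahci.mobile_lpm_policy=1 the hosts should report max_performance, i.e. no link power management.

```shell
#!/bin/sh
# show_lpm_policy (hypothetical helper): print the link power management
# policy of every SCSI/ATA host found under the given sysfs directory.
show_lpm_policy() {
    root="${1:-/sys/class/scsi_host}"
    for f in "$root"/host*/link_power_management_policy; do
        # Skip hosts without the attribute (and an unmatched glob).
        [ -r "$f" ] || continue
        host=$(basename "$(dirname "$f")")
        printf '%s: %s\n' "$host" "$(cat "$f")"
    done
}

# lpm_param_from_cmdline (hypothetical helper): print the
# ahci.mobile_lpm_policy=N token from a kernel command line file, if present.
lpm_param_from_cmdline() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep '^ahci\.mobile_lpm_policy='
}
```

Calling both functions without arguments inspects the running system. On Debian, the parameter is usually made persistent by adding it to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and running update-grub.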
The model and fw version of the SSD.
Anyway, I found it in the bug report:
Device Model: Samsung SSD 870 QVO 2TB
Firmware Version: SVQ02B6Q
The firmware for this SSD is not great, and has caused us a lot of pain recently:
https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi...
https://lore.kernel.org/linux-ide/Z7xk1LbiYFAAsb9p@ryzen/T/#m831645f6cf2e6b5...
https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commi...
Basically, older firmware versions for this SSD have broken LPM, but from user reports, the latest firmware version (which Eric is using) is apparently working:
https://bugzilla.kernel.org/show_bug.cgi?id=219747
https://lore.kernel.org/stable/93c10d38-718c-459d-84a5-4d87680b4da7@debian.o...
Eric is using the latest SSD firmware version. So from other people's reports, I would expect things to work for him as well.
However, no one has reported that their UEFI does not detect their SSD. This seems to be either an SSD firmware bug or a UEFI bug.
I would expect your UEFI to send a COMRESET even during a reboot, and according to the AHCI spec a COMRESET shall take the device out of sleep states.
Considering that no one else seems to have any problem when using the latest firmware version for this SSD, this seems to be a problem specific to Eric. So... UEFI bug?
Have you tried updating your BIOS?
I had not tried to update my BIOS (I'm a bit shy about this due to a problem long ago, when a power failure during a BIOS update left me with an unbootable machine).
However, as far as I can see, there is no newer version of it.
My motherboard model is:
$ sudo dmidecode -t 2
# dmidecode 3.4
Getting SMBIOS data from sysfs.
SMBIOS 2.7 present.

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
	Manufacturer: ASUSTeK COMPUTER INC.
	Product Name: M5A99X EVO R2.0
	Version: Rev 1.xx

According to ASUS's website, the latest BIOS version for this model is:
M5A99X EVO R2.0 BIOS 2501
Version 2501, 3.06 MB, 2014/05/14
And I appear to already be using it:
$ sudo dmidecode -s bios-version
2501
Kind regards, Niklas
kind regards,
Eric
On Sun, Mar 02, 2025 at 09:32:07PM +0100, Eric wrote:
Hi Niklas,
Le 02/03/2025 à 20:32, Niklas Cassel a écrit :
On Sun, Mar 02, 2025 at 05:03:48PM +0100, Salvatore Bonaccorso wrote:
Hi Mario et al,
Eric Degenetais reported in Debian (cf. https://bugs.debian.org/1091696) that after 7627a0edef54 ("ata: ahci: Drop low power policy board type") rebooting the system fails (but the system boots fine if cold booted).
For what it's worth, before getting these replies I tested the ahci.mobile_lpm_policy=1 kernel parameter, which did work around the problem.
I'm glad that you have a workaround to make your system usable.
Eric is using the latest SSD firmware version. So from other people's reports, I would expect things to work for him as well.
However, no one has reported that their UEFI does not detect their SSD. This seems to be either an SSD firmware bug or a UEFI bug.
I would expect your UEFI to send a COMRESET even during a reboot, and according to the AHCI spec a COMRESET shall take the device out of sleep states.
Considering that no one else seems to have any problem when using the latest firmware version for this SSD, this seems to be a problem specific to Eric. So... UEFI bug?
Have you tried updating your BIOS?
I had not tried to update my BIOS (I'm a bit shy about this due to a problem long ago, when a power failure during a BIOS update left me with an unbootable machine).
However, as far as I see, there is no newer version of it.
Ok.
So far, this just sounds like a bug where UEFI cannot detect your SSD. UEFI problems should be reported to your BIOS vendor.
It would be interesting to see if _Linux_ can detect your SSD, after a reboot, without UEFI involvement.
If you kexec into the same kernel as you are currently running (https://manpages.debian.org/testing/kexec-tools/kexec.8.en.html), do you see your SSD in the kexec'd kernel?
Kind regards, Niklas
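The kexec test suggested above could look roughly like the following. This is a sketch under assumptions, not a verified recipe for the reporter's machine. The helper name kexec_load_cmd is invented for illustration, and the /boot/vmlinuz-<release> and /boot/initrd.img-<release> paths are assumed from Debian's usual layout. The function only prints the command, so it can be inspected before anything is loaded.

```shell
#!/bin/sh
# kexec_load_cmd (hypothetical helper): print, without running it, the command
# that loads the given kernel release for kexec. The /boot/vmlinuz-* and
# /boot/initrd.img-* paths are an assumption based on Debian's usual layout.
kexec_load_cmd() {
    release="${1:-$(uname -r)}"
    printf 'kexec -l /boot/vmlinuz-%s --initrd=/boot/initrd.img-%s --reuse-cmdline\n' \
        "$release" "$release"
}
```

After running the printed command as root, "systemctl kexec" should shut down services cleanly and jump into the loaded kernel ("kexec -e" jumps immediately, without a clean shutdown). Once the new kernel is up, "ls /dev/disk/by-id/" or "lsblk" shows whether the SSD was detected. Note that --reuse-cmdline copies the running kernel's command line, so for a meaningful test the system presumably has to be booted without ahci.mobile_lpm_policy=1 first.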