Hi, Keith. Sorry for the late reply; I've been swamped with work. You are correct: because I was patching the kernel on different days with too much going on at once, I applied this two-line patch to the same source tree where I had already applied the other patch, the one that multiplies the timeout by 2 and takes effect earlier, during activation. I thought I had an unpatched tree at the time and ended up compiling it that way. Sorry for the mistake. I also see that there is now a better patch for the issue.
On Tue, Sep 5, 2023 at 4:35 PM Keith Busch <kbusch@kernel.org> wrote:
On Tue, Sep 05, 2023 at 01:37:36PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
On 04.09.23 13:07, Bagas Sanjaya wrote:
I notice a regression report on Bugzilla [1]. Quoting from it:
I bought a new 4 TB Lexar NM790 and I was using kernel 6.3.13 at the time. It wasn't recognized, with these messages in dmesg:
[ 358.950147] nvme nvme0: pci function 0000:06:00.0
[ 358.958327] nvme nvme0: Device not ready; aborting initialisation, CSTS=0x0
My other NVMe appears correctly in the nvme list though.
So I tried the other kernels I had installed at the time: 6.3.7, 6.4.10, 6.5.0rc6, 6.5.0 and 6.5.1. None of them recognized the disk. I installed the 6.1.50 LTS kernel from the Arch repositories (I can compile my own too if this would be an issue) and then the device was correctly recognized:
[ 4.654613] nvme 0000:06:00.0: platform quirk: setting simple suspend
[ 4.654632] nvme nvme0: pci function 0000:06:00.0
[ 4.667290] nvme nvme0: allocated 40 MiB host memory buffer.
[ 4.709473] nvme nvme0: 16/0/0 default/read/poll queues
FWIW, the quoted mail missed one crucial detail:
"""
Claudio Sampaio 2023-09-02 19:04:29 UTC
Adding the two lines
	{ PCI_DEVICE(0x1d97, 0x1602), /* Lexar NM790 */
		.driver_data = NVME_QUIRK_BOGUS_NID, },
in file drivers/nvme/host/pci.c made my NVMe work correctly. Compiled a new 6.5.1 kernel and everything works.
"""
@NVME maintainers: is there anything more you need from Claudio at this point?
Yes: it doesn't really make any sense. The report says the device stopped showing up with message:
nvme nvme0: Device not ready; aborting initialisation, CSTS=0x0
That (a) happens long before the mentioned quirk is considered by the driver, and (b) the "quirk" behavior is now the default in 6.5 and several of the listed stable kernels anyway.
It sounds more likely that the device is flaky and either never becomes ready due to some unspecified internal firmware condition, or inaccurately reports how long it actually needs to become ready in the worst case.
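
To make the "inaccurately reports how long it needs" case concrete, here is a minimal sketch of the ready-wait idea. It is not the kernel's code. It assumes only what the NVMe base specification defines: CAP.TO sits in bits 31:24 of the Controller Capabilities register in units of 500 ms, and CSTS.RDY is bit 0 of Controller Status. The names cap_timeout_ms, wait_ready, read_csts_fn and slow_device are hypothetical, and the MMIO read is replaced by a callback so the example is self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* CAP.TO: bits 31:24 of the 64-bit CAP register, in units of 500 ms. */
static uint32_t cap_timeout_ms(uint64_t cap)
{
    return (uint32_t)((cap >> 24) & 0xff) * 500u;
}

/* CSTS.RDY is bit 0 of the Controller Status register. */
#define CSTS_RDY 0x1u

/* Hypothetical stand-in for an MMIO read of CSTS at a given elapsed time. */
typedef uint32_t (*read_csts_fn)(uint32_t elapsed_ms);

/*
 * Poll CSTS every step_ms (must be non-zero) until RDY is set or the
 * advertised timeout has elapsed. Returns true if the controller became
 * ready in time, false otherwise.
 */
static bool wait_ready(read_csts_fn read_csts, uint32_t timeout_ms,
                       uint32_t step_ms)
{
    for (uint32_t t = 0; t <= timeout_ms; t += step_ms) {
        if (read_csts(t) & CSTS_RDY)
            return true;
    }
    return false;
}

/* Simulated device that needs 6000 ms before it sets CSTS.RDY. */
static uint32_t slow_device(uint32_t elapsed_ms)
{
    return elapsed_ms >= 6000 ? CSTS_RDY : 0;
}
```

With a simulated CAP.TO of 10 (5000 ms), wait_ready(slow_device, cap_timeout_ms((uint64_t)10 << 24), 100) gives up with CSTS still 0x0, which is the shape of the failure in the report. Passing twice that timeout lets the same simulated device come up, which would be consistent with a timeout-doubling patch masking the problem.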