On Tue, Sep 12, 2023 at 02:47:33PM -0700, Keith Busch wrote:
From: Keith Busch kbusch@kernel.org
Some devices are reporting controller ready mode support, but return 0 for CRTO. These devices require a much higher time to ready than that, so they are failing to initialize after the driver starter preferring that value over CAP.TO.
The spec requires that CAP.TO match the appropritate CRTO value, or be set to 0xff if CRTO is larger than that. This means that CAP.TO can be used to validate if CRTO is reliable, and provides an appropriate fallback for setting the timeout value if not. Use whichever is larger.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217863 Reported-by: Cláudio Sampaio patola@gmail.com Reported-by: Felix Yan felixonmars@archlinux.org Based-on-a-patch-by: Felix Yan felixonmars@archlinux.org Cc: stable@vger.kernel.org Signed-off-by: Keith Busch kbusch@kernel.org
drivers/nvme/host/core.c | 48 ++++++++++++++++++++++++---------------- 1 file changed, 29 insertions(+), 19 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 37b6fa7466620..4adc0b2f12f1e 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -2245,25 +2245,8 @@ int nvme_enable_ctrl(struct nvme_ctrl *ctrl) else ctrl->ctrl_config = NVME_CC_CSS_NVM;
- if (ctrl->cap & NVME_CAP_CRMS_CRWMS) {
u32 crto;
ret = ctrl->ops->reg_read32(ctrl, NVME_REG_CRTO, &crto);
if (ret) {
dev_err(ctrl->device, "Reading CRTO failed (%d)\n",
ret);
return ret;
}
if (ctrl->cap & NVME_CAP_CRMS_CRIMS) {
ctrl->ctrl_config |= NVME_CC_CRIME;
timeout = NVME_CRTO_CRIMT(crto);
} else {
timeout = NVME_CRTO_CRWMT(crto);
}
- } else {
timeout = NVME_CAP_TIMEOUT(ctrl->cap);
- }
- if (ctrl->cap & NVME_CAP_CRMS_CRWMS && ctrl->cap & NVME_CAP_CRMS_CRIMS)
ctrl->ctrl_config |= NVME_CC_CRIME;
ctrl->ctrl_config |= (NVME_CTRL_PAGE_SHIFT - 12) << NVME_CC_MPS_SHIFT; ctrl->ctrl_config |= NVME_CC_AMS_RR | NVME_CC_SHN_NONE; @@ -2277,6 +2260,33 @@ int nvme_enable_ctrl(struct nvme_ctrl *ctrl) if (ret) return ret;
- /* CAP value may change after initial CC write */
- ret = ctrl->ops->reg_read64(ctrl, NVME_REG_CAP, &ctrl->cap);
- if (ret)
return ret;
- timeout = NVME_CAP_TIMEOUT(ctrl->cap);
- if (ctrl->cap & NVME_CAP_CRMS_CRWMS) {
u32 crto;
ret = ctrl->ops->reg_read32(ctrl, NVME_REG_CRTO, &crto);
if (ret) {
dev_err(ctrl->device, "Reading CRTO failed (%d)\n",
ret);
return ret;
}
/*
* CRTO should always be greater or equal to CAP.TO, but some
* devices are known to get this wrong. Use the larger of the
* two values.
*/
if (ctrl->ctrl_config & NVME_CC_CRIME)
timeout = max(timeout, NVME_CRTO_CRIMT(crto));
else
timeout = max(timeout, NVME_CRTO_CRWMT(crto));
I saw the original bug report. But wasn't the problem that these were compared before NVME_CC_CRIME had been written?
i.e. is this max() check still needed for the bug reporter's NVMe drive, after NVME_CC_CRIME was been written and CAP has been re-read? (If so, would a quirk be better?)
Kind regards, Niklas