Hi Greg,
This series includes two patches: - A backport of "scsi: ufs: Remove dead code". - A backport of "scsi: ufs: Fix a deadlock in the error handler"
Although the first patch is a code cleanup patch, I have included it in this series because the second patch depends on it.
An Android developer requested a backport of the second patch to the android13-5.15 stable tree.
Please consider these patches for inclusion in the 5.15 stable tree.
Thanks,
Bart.
Bart Van Assche (2): scsi: ufs: Remove dead code scsi: ufs: Fix a deadlock in the error handler
drivers/scsi/ufs/ufshcd.c | 58 ++++++++++----------------------------- drivers/scsi/ufs/ufshcd.h | 2 ++ 2 files changed, 16 insertions(+), 44 deletions(-)
commit d77ea8226b3be23b0b45aa42851243b62a27bda1 upstream.
Commit 7252a3603015 ("scsi: ufs: Avoid busy-waiting by eliminating tag conflicts") guarantees that 'tag' is not in use by any SCSI command. Remove the check that returns early if a conflict occurs.
Link: https://lore.kernel.org/r/20211203231950.193369-6-bvanassche@acm.org Tested-by: Bean Huo beanhuo@micron.com Reviewed-by: Bean Huo beanhuo@micron.com Acked-by: Avri Altman avri.altman@wdc.com Signed-off-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Bart Van Assche bvanassche@acm.org --- drivers/scsi/ufs/ufshcd.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index f489954e4632..92e4f0dbb791 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -6658,11 +6658,6 @@ static int ufshcd_issue_devman_upiu_cmd(struct ufs_hba *hba, tag = req->tag; WARN_ONCE(tag < 0, "Invalid tag %d\n", tag);
- if (unlikely(test_bit(tag, &hba->outstanding_reqs))) { - err = -EBUSY; - goto out; - } - lrbp = &hba->lrb[tag]; WARN_ON(lrbp->cmd); lrbp->cmd = NULL; @@ -6730,8 +6725,8 @@ static int ufshcd_issue_devman_upiu_cmd(struct ufs_hba *hba, ufshcd_add_query_upiu_trace(hba, err ? UFS_QUERY_ERR : UFS_QUERY_COMP, (struct utp_upiu_req *)lrbp->ucd_rsp_ptr);
-out: blk_put_request(req); + out_unlock: up_read(&hba->clk_scaling_lock); return err;
commit 945c3cca05d78351bba29fa65d93834cb7934c7b upstream.
The following deadlock has been observed on a test setup:
- All tags allocated
- The SCSI error handler calls ufshcd_eh_host_reset_handler()
- ufshcd_eh_host_reset_handler() queues work that calls ufshcd_err_handler()
- ufshcd_err_handler() locks up as follows:
Workqueue: ufs_eh_wq_0 ufshcd_err_handler.cfi_jt Call trace: __switch_to+0x298/0x5d8 __schedule+0x6cc/0xa94 schedule+0x12c/0x298 blk_mq_get_tag+0x210/0x480 __blk_mq_alloc_request+0x1c8/0x284 blk_get_request+0x74/0x134 ufshcd_exec_dev_cmd+0x68/0x640 ufshcd_verify_dev_init+0x68/0x35c ufshcd_probe_hba+0x12c/0x1cb8 ufshcd_host_reset_and_restore+0x88/0x254 ufshcd_reset_and_restore+0xd0/0x354 ufshcd_err_handler+0x408/0xc58 process_one_work+0x24c/0x66c worker_thread+0x3e8/0xa4c kthread+0x150/0x1b4 ret_from_fork+0x10/0x30
Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved request.
Link: https://lore.kernel.org/r/20211203231950.193369-10-bvanassche@acm.org Tested-by: Bean Huo beanhuo@micron.com Reviewed-by: Adrian Hunter adrian.hunter@intel.com Reviewed-by: Bean Huo beanhuo@micron.com Signed-off-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Bart Van Assche bvanassche@acm.org --- drivers/scsi/ufs/ufshcd.c | 53 +++++++++++---------------------------- drivers/scsi/ufs/ufshcd.h | 2 ++ 2 files changed, 16 insertions(+), 39 deletions(-)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 92e4f0dbb791..cdec85bcc4cc 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -125,8 +125,9 @@ EXPORT_SYMBOL_GPL(ufshcd_dump_regs); enum { UFSHCD_MAX_CHANNEL = 0, UFSHCD_MAX_ID = 1, - UFSHCD_CMD_PER_LUN = 32, - UFSHCD_CAN_QUEUE = 32, + UFSHCD_NUM_RESERVED = 1, + UFSHCD_CMD_PER_LUN = 32 - UFSHCD_NUM_RESERVED, + UFSHCD_CAN_QUEUE = 32 - UFSHCD_NUM_RESERVED, };
/* UFSHCD error handling flags */ @@ -2185,6 +2186,7 @@ static inline int ufshcd_hba_capabilities(struct ufs_hba *hba) hba->nutrs = (hba->capabilities & MASK_TRANSFER_REQUESTS_SLOTS) + 1; hba->nutmrs = ((hba->capabilities & MASK_TASK_MANAGEMENT_REQUEST_SLOTS) >> 16) + 1; + hba->reserved_slot = hba->nutrs - 1;
/* Read crypto capabilities */ err = ufshcd_hba_init_crypto_capabilities(hba); @@ -2910,30 +2912,15 @@ static int ufshcd_wait_for_dev_cmd(struct ufs_hba *hba, static int ufshcd_exec_dev_cmd(struct ufs_hba *hba, enum dev_cmd_type cmd_type, int timeout) { - struct request_queue *q = hba->cmd_queue; DECLARE_COMPLETION_ONSTACK(wait); - struct request *req; + const u32 tag = hba->reserved_slot; struct ufshcd_lrb *lrbp; int err; - int tag;
- down_read(&hba->clk_scaling_lock); + /* Protects use of hba->reserved_slot. */ + lockdep_assert_held(&hba->dev_cmd.lock);
- /* - * Get free slot, sleep if slots are unavailable. - * Even though we use wait_event() which sleeps indefinitely, - * the maximum wait time is bounded by SCSI request timeout. - */ - req = blk_get_request(q, REQ_OP_DRV_OUT, 0); - if (IS_ERR(req)) { - err = PTR_ERR(req); - goto out_unlock; - } - tag = req->tag; - WARN_ONCE(tag < 0, "Invalid tag %d\n", tag); - /* Set the timeout such that the SCSI error handler is not activated. */ - req->timeout = msecs_to_jiffies(2 * timeout); - blk_mq_start_request(req); + down_read(&hba->clk_scaling_lock);
lrbp = &hba->lrb[tag]; WARN_ON(lrbp->cmd); @@ -2951,8 +2938,6 @@ static int ufshcd_exec_dev_cmd(struct ufs_hba *hba, (struct utp_upiu_req *)lrbp->ucd_rsp_ptr);
out: - blk_put_request(req); -out_unlock: up_read(&hba->clk_scaling_lock); return err; } @@ -6640,23 +6625,16 @@ static int ufshcd_issue_devman_upiu_cmd(struct ufs_hba *hba, enum dev_cmd_type cmd_type, enum query_opcode desc_op) { - struct request_queue *q = hba->cmd_queue; DECLARE_COMPLETION_ONSTACK(wait); - struct request *req; + const u32 tag = hba->reserved_slot; struct ufshcd_lrb *lrbp; int err = 0; - int tag; u8 upiu_flags;
- down_read(&hba->clk_scaling_lock); + /* Protects use of hba->reserved_slot. */ + lockdep_assert_held(&hba->dev_cmd.lock);
- req = blk_get_request(q, REQ_OP_DRV_OUT, 0); - if (IS_ERR(req)) { - err = PTR_ERR(req); - goto out_unlock; - } - tag = req->tag; - WARN_ONCE(tag < 0, "Invalid tag %d\n", tag); + down_read(&hba->clk_scaling_lock);
lrbp = &hba->lrb[tag]; WARN_ON(lrbp->cmd); @@ -6725,9 +6703,6 @@ static int ufshcd_issue_devman_upiu_cmd(struct ufs_hba *hba, ufshcd_add_query_upiu_trace(hba, err ? UFS_QUERY_ERR : UFS_QUERY_COMP, (struct utp_upiu_req *)lrbp->ucd_rsp_ptr);
- blk_put_request(req); - -out_unlock: up_read(&hba->clk_scaling_lock); return err; } @@ -9418,8 +9393,8 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq) /* Configure LRB */ ufshcd_host_memory_configure(hba);
- host->can_queue = hba->nutrs; - host->cmd_per_lun = hba->nutrs; + host->can_queue = hba->nutrs - UFSHCD_NUM_RESERVED; + host->cmd_per_lun = hba->nutrs - UFSHCD_NUM_RESERVED; host->max_id = UFSHCD_MAX_ID; host->max_lun = UFS_MAX_LUNS; host->max_channel = UFSHCD_MAX_CHANNEL; diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h index 07ada6676c3b..d470a52ff24c 100644 --- a/drivers/scsi/ufs/ufshcd.h +++ b/drivers/scsi/ufs/ufshcd.h @@ -725,6 +725,7 @@ struct ufs_hba_monitor { * @capabilities: UFS Controller Capabilities * @nutrs: Transfer Request Queue depth supported by controller * @nutmrs: Task Management Queue depth supported by controller + * @reserved_slot: Used to submit device commands. Protected by @dev_cmd.lock. * @ufs_version: UFS Version to which controller complies * @vops: pointer to variant specific operations * @priv: pointer to variant specific private data @@ -813,6 +814,7 @@ struct ufs_hba { u32 capabilities; int nutrs; int nutmrs; + u32 reserved_slot; u32 ufs_version; const struct ufs_hba_variant_ops *vops; struct ufs_hba_variant_params *vps;
On Fri, Feb 18, 2022 at 10:04:37AM -0800, Bart Van Assche wrote:
Hi Greg,
This series includes two patches:
- A backport of "scsi: ufs: Remove dead code".
- A backport of "scsi: ufs: Fix a deadlock in the error handler"
Although the first patch is a code cleanup patch, I have included it in this series because the second patch depends on it.
An Android developer requested a backport of the second patch to the android13-5.15 stable tree.
Please consider these patches for inclusion in the 5.15 stable tree.
They also are required in the 5.16 stable tree, so that no one upgrades kernel versions and hits a regression. I have added them there as well.
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org