Hi
Here are 2 patches backported for v5.10. Upstream there is a third patch associated with these i.e. commit 5cb37a26355 ("scsi: ufs: core: Fix another task management completion race"), but it is not needed because v5.10 has different lock usage.
Adrian Hunter (2): scsi: ufs: core: Fix task management completion scsi: ufs: core: Fix task management completion timeout race
drivers/scsi/ufs/ufshcd.c | 57 +++++++++++++++++++---------------------------- drivers/scsi/ufs/ufshcd.h | 1 + 2 files changed, 24 insertions(+), 34 deletions(-)
Regards Adrian
commit f5ef336fd2e4c36dedae4e7ca66cf5349d6fda62 upstream.
The UFS driver uses blk_mq_tagset_busy_iter() when identifying task management requests to complete, however blk_mq_tagset_busy_iter() doesn't work.
blk_mq_tagset_busy_iter() only iterates requests dispatched by the block layer. That appears as if it might have started since commit 37f4a24c2469 ("blk-mq: centralise related handling into blk_mq_get_driver_tag") which removed 'data->hctx->tags->rqs[rq->tag] = rq' from blk_mq_rq_ctx_init() which gets called:
blk_get_request blk_mq_alloc_request __blk_mq_alloc_request blk_mq_rq_ctx_init
Since UFS task management requests are not dispatched by the block layer, hctx->tags->rqs[rq->tag] remains NULL, and since blk_mq_tagset_busy_iter() relies on finding requests using hctx->tags->rqs[rq->tag], UFS task management requests are never found by blk_mq_tagset_busy_iter().
By using blk_mq_tagset_busy_iter(), the UFS driver was relying on internal details of the block layer, which was fragile and subsequently got broken. Fix by removing the use of blk_mq_tagset_busy_iter() and having the driver keep track of task management requests.
Link: https://lore.kernel.org/r/20210922091059.4040-1-adrian.hunter@intel.com Fixes: 1235fc569e0b ("scsi: ufs: core: Fix task management request completion timeout") Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs") Cc: stable@vger.kernel.org Tested-by: Bart Van Assche bvanassche@acm.org Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Adrian Hunter adrian.hunter@intel.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com [Adrian: Backport to v5.10] Signed-off-by: Adrian Hunter adrian.hunter@intel.com --- drivers/scsi/ufs/ufshcd.c | 52 +++++++++++++++++---------------------- drivers/scsi/ufs/ufshcd.h | 1 + 2 files changed, 24 insertions(+), 29 deletions(-)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 930f35863cbb..5fd0a6ed181c 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -6099,27 +6099,6 @@ static irqreturn_t ufshcd_check_errors(struct ufs_hba *hba) return retval; }
-struct ctm_info { - struct ufs_hba *hba; - unsigned long pending; - unsigned int ncpl; -}; - -static bool ufshcd_compl_tm(struct request *req, void *priv, bool reserved) -{ - struct ctm_info *const ci = priv; - struct completion *c; - - WARN_ON_ONCE(reserved); - if (test_bit(req->tag, &ci->pending)) - return true; - ci->ncpl++; - c = req->end_io_data; - if (c) - complete(c); - return true; -} - /** * ufshcd_tmc_handler - handle task management function completion * @hba: per adapter instance @@ -6130,14 +6109,22 @@ static bool ufshcd_compl_tm(struct request *req, void *priv, bool reserved) */ static irqreturn_t ufshcd_tmc_handler(struct ufs_hba *hba) { - struct request_queue *q = hba->tmf_queue; - struct ctm_info ci = { - .hba = hba, - .pending = ufshcd_readl(hba, REG_UTP_TASK_REQ_DOOR_BELL), - }; + unsigned long pending, issued; + irqreturn_t ret = IRQ_NONE; + int tag; + + pending = ufshcd_readl(hba, REG_UTP_TASK_REQ_DOOR_BELL); + + issued = hba->outstanding_tasks & ~pending; + for_each_set_bit(tag, &issued, hba->nutmrs) { + struct request *req = hba->tmf_rqs[tag]; + struct completion *c = req->end_io_data; + + complete(c); + ret = IRQ_HANDLED; + }
- blk_mq_tagset_busy_iter(q->tag_set, ufshcd_compl_tm, &ci); - return ci.ncpl ? IRQ_HANDLED : IRQ_NONE; + return ret; }
/** @@ -6267,9 +6254,9 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba, ufshcd_hold(hba, false);
spin_lock_irqsave(host->host_lock, flags); - blk_mq_start_request(req);
task_tag = req->tag; + hba->tmf_rqs[req->tag] = req; treq->req_header.dword_0 |= cpu_to_be32(task_tag);
memcpy(hba->utmrdl_base_addr + task_tag, treq, sizeof(*treq)); @@ -6313,6 +6300,7 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba, }
spin_lock_irqsave(hba->host->host_lock, flags); + hba->tmf_rqs[req->tag] = NULL; __clear_bit(task_tag, &hba->outstanding_tasks); spin_unlock_irqrestore(hba->host->host_lock, flags);
@@ -9235,6 +9223,12 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq) err = PTR_ERR(hba->tmf_queue); goto free_tmf_tag_set; } + hba->tmf_rqs = devm_kcalloc(hba->dev, hba->nutmrs, + sizeof(*hba->tmf_rqs), GFP_KERNEL); + if (!hba->tmf_rqs) { + err = -ENOMEM; + goto free_tmf_queue; + }
/* Reset the attached device */ ufshcd_vops_device_reset(hba); diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h index 1ba9c786feb6..35dd5197ccb9 100644 --- a/drivers/scsi/ufs/ufshcd.h +++ b/drivers/scsi/ufs/ufshcd.h @@ -734,6 +734,7 @@ struct ufs_hba {
struct blk_mq_tag_set tmf_tag_set; struct request_queue *tmf_queue; + struct request **tmf_rqs;
struct uic_command *active_uic_cmd; struct mutex uic_cmd_mutex;
commit 886fe2915cce6658b0fc19e64b82879325de61ea upstream.
__ufshcd_issue_tm_cmd() clears req->end_io_data after timing out, which races with the completion function ufshcd_tmc_handler() which expects req->end_io_data to have a value.
Note __ufshcd_issue_tm_cmd() and ufshcd_tmc_handler() are already synchronized using hba->tmf_rqs and hba->outstanding_tasks under the host_lock spinlock.
It is also not necessary (nor typical) to clear req->end_io_data because the block layer does it before allocating out requests e.g. via blk_get_request().
So fix by not clearing it.
Link: https://lore.kernel.org/r/20211108064815.569494-2-adrian.hunter@intel.com Fixes: f5ef336fd2e4 ("scsi: ufs: core: Fix task management completion") Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Adrian Hunter adrian.hunter@intel.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com [Adrian: Backport to v5.10] Signed-off-by: Adrian Hunter adrian.hunter@intel.com --- drivers/scsi/ufs/ufshcd.c | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 5fd0a6ed181c..e3a9a02cadf5 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -6280,11 +6280,6 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba *hba, err = wait_for_completion_io_timeout(&wait, msecs_to_jiffies(TM_CMD_TIMEOUT)); if (!err) { - /* - * Make sure that ufshcd_compl_tm() does not trigger a - * use-after-free. - */ - req->end_io_data = NULL; ufshcd_add_tm_upiu_trace(hba, task_tag, "tm_complete_err"); dev_err(hba->dev, "%s: task management cmd 0x%.2x timed-out\n", __func__, tm_function);
On Tue, Nov 23, 2021 at 02:19:28PM +0200, Adrian Hunter wrote:
Hi
Here are 2 patches backported for v5.10. Upstream there is a third patch associated with these i.e. commit 5cb37a26355 ("scsi: ufs: core: Fix another task management completion race"), but it is not needed because v5.10 has different lock usage.
Adrian Hunter (2): scsi: ufs: core: Fix task management completion scsi: ufs: core: Fix task management completion timeout race
drivers/scsi/ufs/ufshcd.c | 57 +++++++++++++++++++---------------------------- drivers/scsi/ufs/ufshcd.h | 1 + 2 files changed, 24 insertions(+), 34 deletions(-)
Both now queued up, thanks.
greg k-h
linux-stable-mirror@lists.linaro.org