The patch below does not apply to the 5.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id, to <stable@vger.kernel.org>.
Possible dependencies:
68ad83188d78 ("scsi: qla2xxx: Fix crash when I/O abort times out")
31e6cdbe0eae ("scsi: qla2xxx: Implement ref count for SRB")
d4523bd6fd5d ("scsi: qla2xxx: Refactor asynchronous command initialization")
2cabf10dbbe3 ("scsi: qla2xxx: Fix hang on NVMe command timeouts")
e3d2612f583b ("scsi: qla2xxx: Fix use after free in debug code")
9efea843a906 ("scsi: qla2xxx: edif: Add detection of secure device")
dd30706e73b7 ("scsi: qla2xxx: edif: Add key update")
fac2807946c1 ("scsi: qla2xxx: edif: Add extraction of auth_els from the wire")
84318a9f01ce ("scsi: qla2xxx: edif: Add send, receive, and accept for auth_els")
7878f22a2e03 ("scsi: qla2xxx: edif: Add getfcinfo and statistic bsgs")
7ebb336e45ef ("scsi: qla2xxx: edif: Add start + stop bsgs")
d94d8158e184 ("scsi: qla2xxx: Add heartbeat check")
f7a0ed479e66 ("scsi: qla2xxx: Fix crash in PCIe error handling")
2ce35c0821af ("scsi: qla2xxx: Fix use after free in bsg")
5777fef788a5 ("scsi: qla2xxx: Consolidate zio threshold setting for both FCP & NVMe")
960204ecca5e ("scsi: qla2xxx: Simplify if statement")
a04658594399 ("scsi: qla2xxx: Wait for ABTS response on I/O timeouts for NVMe")
dbf1f53cfd23 ("scsi: qla2xxx: Implementation to get and manage host, target stats and initiator port")
707531bc2626 ("scsi: qla2xxx: If fcport is undergoing deletion complete I/O with retry")
605e74025f95 ("scsi: qla2xxx: Move sess cmd list/lock to driver")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 68ad83188d782b2ecef2e41ac245d27e0710fe8e Mon Sep 17 00:00:00 2001
From: Arun Easi <aeasi@marvell.com>
Date: Tue, 29 Nov 2022 01:26:34 -0800
Subject: [PATCH] scsi: qla2xxx: Fix crash when I/O abort times out
While performing CPU hotplug, a crash with the following stack was seen:
Call Trace:
 qla24xx_process_response_queue+0x42a/0x970 [qla2xxx]
 qla2x00_start_nvme_mq+0x3a2/0x4b0 [qla2xxx]
 qla_nvme_post_cmd+0x166/0x240 [qla2xxx]
 nvme_fc_start_fcp_op.part.0+0x119/0x2e0 [nvme_fc]
 blk_mq_dispatch_rq_list+0x17b/0x610
 __blk_mq_sched_dispatch_requests+0xb0/0x140
 blk_mq_sched_dispatch_requests+0x30/0x60
 __blk_mq_run_hw_queue+0x35/0x90
 __blk_mq_delay_run_hw_queue+0x161/0x180
 blk_execute_rq+0xbe/0x160
 __nvme_submit_sync_cmd+0x16f/0x220 [nvme_core]
 nvmf_connect_admin_queue+0x11a/0x170 [nvme_fabrics]
 nvme_fc_create_association.cold+0x50/0x3dc [nvme_fc]
 nvme_fc_connect_ctrl_work+0x19/0x30 [nvme_fc]
 process_one_work+0x1e8/0x3c0
On abort timeout, completion was called without checking if the I/O was already completed.
Verify that I/O and abort request are indeed outstanding before attempting completion.
Fixes: 71c80b75ce8f ("scsi: qla2xxx: Do command completion on abort timeout")
Reported-by: Marco Patalano <mpatalan@redhat.com>
Tested-by: Marco Patalano <mpatalan@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20221129092634.15347-1-njavali@marvell.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index ce4c5d728407..8d9ecabb1aac 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -110,6 +110,7 @@ static void qla24xx_abort_iocb_timeout(void *data)
 	struct qla_qpair *qpair = sp->qpair;
 	u32 handle;
 	unsigned long flags;
+	int sp_found = 0, cmdsp_found = 0;

 	if (sp->cmd_sp)
 		ql_dbg(ql_dbg_async, sp->vha, 0x507c,
@@ -124,18 +125,21 @@ static void qla24xx_abort_iocb_timeout(void *data)
 	spin_lock_irqsave(qpair->qp_lock_ptr, flags);
 	for (handle = 1; handle < qpair->req->num_outstanding_cmds; handle++) {
 		if (sp->cmd_sp && (qpair->req->outstanding_cmds[handle] ==
-		    sp->cmd_sp))
+		    sp->cmd_sp)) {
 			qpair->req->outstanding_cmds[handle] = NULL;
+			cmdsp_found = 1;
+		}

 		/* removing the abort */
 		if (qpair->req->outstanding_cmds[handle] == sp) {
 			qpair->req->outstanding_cmds[handle] = NULL;
+			sp_found = 1;
 			break;
 		}
 	}
 	spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);

-	if (sp->cmd_sp) {
+	if (cmdsp_found && sp->cmd_sp) {
 		/*
 		 * This done function should take care of
 		 * original command ref: INIT
@@ -143,8 +147,10 @@ static void qla24xx_abort_iocb_timeout(void *data)
 		sp->cmd_sp->done(sp->cmd_sp, QLA_OS_TIMER_EXPIRED);
 	}

-	abt->u.abt.comp_status = cpu_to_le16(CS_TIMEOUT);
-	sp->done(sp, QLA_OS_TIMER_EXPIRED);
+	if (sp_found) {
+		abt->u.abt.comp_status = cpu_to_le16(CS_TIMEOUT);
+		sp->done(sp, QLA_OS_TIMER_EXPIRED);
+	}
 }

 static void qla24xx_abort_sp_done(srb_t *sp, int res)
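
For anyone preparing a backport, the idea behind the fix (complete a request only if it was still found in the outstanding table) can be modelled in user space. The following is a rough sketch, not driver code: struct req, struct req_table, req_done() and abort_timeout() are hypothetical stand-ins for the SRB, req->outstanding_cmds[], the ->done() callback and qla24xx_abort_iocb_timeout(), and the qpair lock is left out entirely.

```c
#include <stddef.h>

#define NUM_SLOTS 8	/* hypothetical table size */

/* Hypothetical stand-in for an SRB: counts how often it was completed. */
struct req {
	int done_calls;
};

/* Hypothetical stand-in for req->outstanding_cmds[]; slot 0 is unused. */
struct req_table {
	struct req *slot[NUM_SLOTS];
};

static void req_done(struct req *r)
{
	r->done_calls++;
}

/*
 * Model of the abort-timeout path: clear the slots that still hold the
 * original command and the abort, then complete only what was found.
 * The real handler does the table walk under the qpair lock; locking is
 * omitted in this sketch. Returns the number of completions issued.
 */
static int abort_timeout(struct req_table *t, struct req *abort_req,
			 struct req *cmd_req)
{
	int abort_found = 0, cmd_found = 0;
	size_t h;

	for (h = 1; h < NUM_SLOTS; h++) {
		if (cmd_req && t->slot[h] == cmd_req) {
			t->slot[h] = NULL;
			cmd_found = 1;
		}
		/* removing the abort */
		if (t->slot[h] == abort_req) {
			t->slot[h] = NULL;
			abort_found = 1;
			break;
		}
	}

	/* A request already removed elsewhere must not be completed again. */
	if (cmd_found)
		req_done(cmd_req);
	if (abort_found)
		req_done(abort_req);

	return cmd_found + abort_found;
}
```

In this model, calling abort_timeout() a second time on the same table finds neither request and issues no completion, which is the situation the unconditional ->done() calls in the old code did not guard against.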