Commit a6dcfe08487e ("scsi: qla2xxx: Limit interrupt vectors to number of CPUs") lowers the number of allocated MSI-X vectors to the number of CPUs.
That breaks vector allocation assumptions in qla83xx_iospace_config(), qla24xx_enable_msix() and qla2x00_iospace_config(). Either of the functions computes maximum number of qpairs as:
ha->max_qpairs = ha->msix_count - 1 (MB interrupt) - 1 (default response queue) - 1 (ATIO, in dual or pure target mode)
max_qpairs is set to zero in case of two CPUs and initiator mode. The number is then used to allocate ha->queue_pair_map inside qla2x00_alloc_queues(). No allocation happens and ha->queue_pair_map is left NULL but the driver thinks there are queue pairs available.
qla2xxx_queuecommand() tries to find a qpair in the map and crashes:
if (ha->mqenable) { uint32_t tag; uint16_t hwq; struct qla_qpair *qpair = NULL;
tag = blk_mq_unique_tag(cmd->request); hwq = blk_mq_unique_tag_to_hwq(tag); qpair = ha->queue_pair_map[hwq]; # <- HERE
if (qpair) return qla2xxx_mqueuecommand(host, cmd, qpair); }
BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 72 Comm: kworker/u4:3 Tainted: G W 5.10.0-rc1+ #25 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Workqueue: scsi_wq_7 fc_scsi_scan_rport [scsi_transport_fc] RIP: 0010:qla2xxx_queuecommand+0x16b/0x3f0 [qla2xxx] Call Trace: scsi_queue_rq+0x58c/0xa60 blk_mq_dispatch_rq_list+0x2b7/0x6f0 ? __sbitmap_get_word+0x2a/0x80 __blk_mq_sched_dispatch_requests+0xb8/0x170 blk_mq_sched_dispatch_requests+0x2b/0x50 __blk_mq_run_hw_queue+0x49/0xb0 __blk_mq_delay_run_hw_queue+0xfb/0x150 blk_mq_sched_insert_request+0xbe/0x110 blk_execute_rq+0x45/0x70 __scsi_execute+0x10e/0x250 scsi_probe_and_add_lun+0x228/0xda0 __scsi_scan_target+0xf4/0x620 ? __pm_runtime_resume+0x4f/0x70 scsi_scan_target+0x100/0x110 fc_scsi_scan_rport+0xa1/0xb0 [scsi_transport_fc] process_one_work+0x1ea/0x3b0 worker_thread+0x28/0x3b0 ? process_one_work+0x3b0/0x3b0 kthread+0x112/0x130 ? kthread_park+0x80/0x80 ret_from_fork+0x22/0x30
Unfortunately it's not enough to force single queue mode for systems with small CPU count, the warning would be still produced:
WARNING: CPU: 0 PID: 1039 at drivers/pci/msi.c:1323 pci_irq_get_affinity+0x36/0x80 CPU: 0 PID: 1039 Comm: modprobe Not tainted 5.12.0-rc1+ #26 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:pci_irq_get_affinity+0x36/0x80 Call Trace: blk_mq_pci_map_queues+0x32/0xc0 blk_mq_alloc_tag_set+0x131/0x310 scsi_add_host_with_dma+0x79/0x2f0 qla2x00_probe_one+0x1870/0x2410 [qla2xxx] local_pci_probe+0x3d/0x90 ? pci_assign_irq+0x22/0xd0 pci_device_probe+0xd5/0x170 really_probe+0x1c5/0x3a0 driver_probe_device+0x49/0xa0 device_driver_attach+0x4a/0x50 __driver_attach+0x67/0xb0 ? device_driver_attach+0x50/0x50 bus_for_each_dev+0x71/0xb0 bus_add_driver+0x177/0x1f0 ? 0xffffffffc02fa000 driver_register+0x56/0xf0 ? 0xffffffffc02fa000 qla2x00_module_init+0x1a2/0x20a [qla2xxx] do_one_initcall+0x3f/0x1c0 ? __cond_resched+0x10/0x40 ? kmem_cache_alloc_trace+0x36/0x1a0 do_init_module+0x56/0x1ed load_module+0x219e/0x2880 ? __do_sys_finit_module+0x9c/0xe0 __do_sys_finit_module+0x9c/0xe0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f2d3adcf759
Also, former MSI-X allocation might be more performant - two more nr_hw_queues, especially on the small and medium initiators. Therefore, it's best to revert the offending commit.
Cc: Daniel Wagner daniel.wagner@suse.com Cc: Himanshu Madhani himanshu.madhani@oracle.com Cc: Quinn Tran qutran@marvell.com Cc: Nilesh Javali njavali@marvell.com Cc: Martin K. Petersen martin.petersen@oracle.com Cc: stable@vger.kernel.org # 5.11+ Fixes: a6dcfe08487e ("scsi: qla2xxx: Limit interrupt vectors to number of CPUs") Reported-by: Aleksandr Volkov a.y.volkov@yadro.com Reported-by: Aleksandr Miloserdov a.miloserdov@yadro.com Signed-off-by: Roman Bolshakov r.bolshakov@yadro.com --- drivers/scsi/qla2xxx/qla_isr.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index 11d6e0db07fe..6641978dfecf 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -3998,12 +3998,10 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp) if (USER_CTRL_IRQ(ha) || !ha->mqiobase) { /* user wants to control IRQ setting for target mode */ ret = pci_alloc_irq_vectors(ha->pdev, min_vecs, - min((u16)ha->msix_count, (u16)num_online_cpus()), - PCI_IRQ_MSIX); + ha->msix_count, PCI_IRQ_MSIX); } else ret = pci_alloc_irq_vectors_affinity(ha->pdev, min_vecs, - min((u16)ha->msix_count, (u16)num_online_cpus()), - PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, + ha->msix_count, PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, &desc);
if (ret < 0) {
On Fri, Apr 09, 2021 at 09:17:59AM +0300, Roman Bolshakov wrote:
drivers/scsi/qla2xxx/qla_isr.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index 11d6e0db07fe..6641978dfecf 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -3998,12 +3998,10 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp) if (USER_CTRL_IRQ(ha) || !ha->mqiobase) { /* user wants to control IRQ setting for target mode */ ret = pci_alloc_irq_vectors(ha->pdev, min_vecs,
min((u16)ha->msix_count, (u16)num_online_cpus()),
PCI_IRQ_MSIX);
} else ret = pci_alloc_irq_vectors_affinity(ha->pdev, min_vecs,ha->msix_count, PCI_IRQ_MSIX);
min((u16)ha->msix_count, (u16)num_online_cpus()),
PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
&desc);ha->msix_count, PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
Commit message a6dcfe08487e ("scsi: qla2xxx: Limit interrupt vectors to number of CPUs") says
Driver created too many QPairs(126) with 28xx adapter. Limit to the number of CPUs to minimize wasted resources.
I think a simple revert is not taking the resource wasting into account. Maybe the min vector calculation just needs a higher lower bound. So something stupid like
min(msix_cound, num_online_cpus() > 2? num_online_cpus() : 3)
?
On Fri, Apr 09, 2021 at 09:36:45AM +0200, Daniel Wagner wrote:
On Fri, Apr 09, 2021 at 09:17:59AM +0300, Roman Bolshakov wrote:
drivers/scsi/qla2xxx/qla_isr.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index 11d6e0db07fe..6641978dfecf 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -3998,12 +3998,10 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp) if (USER_CTRL_IRQ(ha) || !ha->mqiobase) { /* user wants to control IRQ setting for target mode */ ret = pci_alloc_irq_vectors(ha->pdev, min_vecs,
min((u16)ha->msix_count, (u16)num_online_cpus()),
PCI_IRQ_MSIX);
} else ret = pci_alloc_irq_vectors_affinity(ha->pdev, min_vecs,ha->msix_count, PCI_IRQ_MSIX);
min((u16)ha->msix_count, (u16)num_online_cpus()),
PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
&desc);ha->msix_count, PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
Commit message a6dcfe08487e ("scsi: qla2xxx: Limit interrupt vectors to number of CPUs") says
Driver created too many QPairs(126) with 28xx adapter. Limit to the number of CPUs to minimize wasted resources.
I think a simple revert is not taking the resource wasting into account. Maybe the min vector calculation just needs a higher lower bound. So something stupid like
min(msix_cound, num_online_cpus() > 2? num_online_cpus() : 3)
?
Hi Daniel,
That would reduce number of Queue Pairs on small-core systems, e.g. 14 HW queues are going to be available on 16-CPU system, even if the HBA has enough MSI-X vectors for 16 HW queues.
But... min((u16)ha->msix_count, (u16)num_online_cpus() + 3) might actually do the trick... I'll try that.
Thanks, Roman
linux-stable-mirror@lists.linaro.org