On Wed, Oct 19, 2022 at 12:27:46AM +0000, Angus Chen wrote:
Hi sasha
-----Original Message----- From: Sasha Levin sashal@kernel.org Sent: Tuesday, October 18, 2022 8:12 AM To: linux-kernel@vger.kernel.org; stable@vger.kernel.org Cc: Angus Chen angus.chen@jaguarmicro.com; Michael S . Tsirkin mst@redhat.com; Sasha Levin sashal@kernel.org; jasowang@redhat.com; virtualization@lists.linux-foundation.org Subject: [PATCH AUTOSEL 4.9 8/8] virtio_pci: don't try to use intxif pin is zero
From: Angus Chen angus.chen@jaguarmicro.com
[ Upstream commit 71491c54eafa318fdd24a1f26a1c82b28e1ac21d ]
The background is that we use dpu in cloud computing,the arch is x86,80 cores. We will have a lots of virtio devices,like 512 or more. When we probe about 200 virtio_blk devices,it will fail and the stack is printed as follows:
[25338.485128] virtio-pci 0000:b3:00.0: virtio_pci: leaving for legacy driver [25338.496174] genirq: Flags mismatch irq 0. 00000080 (virtio418) vs. 00015a00 (timer) [25338.503822] CPU: 20 PID: 5431 Comm: kworker/20:0 Kdump: loaded Tainted: G OE --------- - - 4.18.0-305.30.1.el8.x86_64 [25338.516403] Hardware name: Inspur NF5280M5/YZMB-00882-10E, BIOS 4.1.21 08/25/2021 [25338.523881] Workqueue: events work_for_cpu_fn [25338.528235] Call Trace: [25338.530687] dump_stack+0x5c/0x80 [25338.534000] __setup_irq.cold.53+0x7c/0xd3 [25338.538098] request_threaded_irq+0xf5/0x160 [25338.542371] vp_find_vqs+0xc7/0x190 [25338.545866] init_vq+0x17c/0x2e0 [virtio_blk] [25338.550223] ? ncpus_cmp_func+0x10/0x10 [25338.554061] virtblk_probe+0xe6/0x8a0 [virtio_blk] [25338.558846] virtio_dev_probe+0x158/0x1f0 [25338.562861] really_probe+0x255/0x4a0 [25338.566524] ? __driver_attach_async_helper+0x90/0x90 [25338.571567] driver_probe_device+0x49/0xc0 [25338.575660] bus_for_each_drv+0x79/0xc0 [25338.579499] __device_attach+0xdc/0x160 [25338.583337] bus_probe_device+0x9d/0xb0 [25338.587167] device_add+0x418/0x780 [25338.590654] register_virtio_device+0x9e/0xe0 [25338.595011] virtio_pci_probe+0xb3/0x140 [25338.598941] local_pci_probe+0x41/0x90 [25338.602689] work_for_cpu_fn+0x16/0x20 [25338.606443] process_one_work+0x1a7/0x360 [25338.610456] ? create_worker+0x1a0/0x1a0 [25338.614381] worker_thread+0x1cf/0x390 [25338.618132] ? create_worker+0x1a0/0x1a0 [25338.622051] kthread+0x116/0x130 [25338.625283] ? kthread_flush_work_fn+0x10/0x10 [25338.629731] ret_from_fork+0x1f/0x40 [25338.633395] virtio_blk: probe of virtio418 failed with error -16
The log : "genirq: Flags mismatch irq 0. 00000080 (virtio418) vs. 00015a00 (timer)" was printed because of the irq 0 is used by timer exclusive,and when vp_find_vqs call vp_find_vqs_msix and returns false twice (for whatever reason), then it will call vp_find_vqs_intx as a fallback. Because vp_dev->pci_dev->irq is zero, we request irq 0 with flag IRQF_SHARED, and get a backtrace like above.
According to PCI spec about "Interrupt Pin" Register (Offset 3Dh): "The Interrupt Pin register is a read-only register that identifies the legacy interrupt Message(s) the Function uses. Valid values are 01h, 02h, 03h, and 04h that map to legacy interrupt Messages for INTA, INTB, INTC, and INTD respectively. A value of 00h indicates that the Function uses no legacy interrupt Message(s)."
So if vp_dev->pci_dev->pin is zero, we should not request legacy interrupt.
Signed-off-by: Angus Chen angus.chen@jaguarmicro.com Suggested-by: Michael S. Tsirkin mst@redhat.com Message-Id: 20220930000915.548-1-angus.chen@jaguarmicro.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org
drivers/virtio/virtio_pci_common.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index 37e3ba5dadf6..d634eb926a2f 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -389,6 +389,9 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs, true, false); if (!err) return 0;
- /* Is there an interrupt pin? If not give up. */
- if (!(to_vp_device(vdev)->pci_dev->pin))
/* Finally fall back to regular interrupts. */ return vp_try_to_find_vqs(vdev, nvqs, vqs, callbacks, names, false, false);return err;
-- 2.35.1
the patch 71491c54eafa31 has been fixed by 2145ab513e3b3, It is report by Michael Ellerman mpe@ellerman.id.au and suggested by linus. If it is merged in the stable git repo, I worry about powerpc arch. Thans.
Yes, please either pick up both this and the fixup or none, and same for all other stable trees where this was autoselected.
It looks like autoselection basically picks up everything that has a Fixes tag in it yes?