4.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jianchao Wang jianchao.w.wang@oracle.com
[ Upstream commit 9f9cafc14016f23f982d3ce18f9057923bd3037a ]
There is race between nvme_remove and nvme_reset_work that can lead to io hang.
nvme_remove nvme_reset_work -> nvme_remove_dead_ctrl -> nvme_dev_disable -> quiesce request_queue -> queue remove_work -> cancel_work_sync reset_work -> nvme_remove_namespaces -> splice ctrl->namespaces nvme_remove_dead_ctrl_work -> nvme_kill_queues -> nvme_ns_remove do nothing -> blk_cleanup_queue -> blk_freeze_queue
Finally, the request_queue is quiesced state when wait freeze, we will get io hang here. To fix it, move the nvme_kill_queues from nvme_remove_dead_ctrl_work to nvme_remove_dead_ctrl.
Suggested-by: Keith Busch keith.busch@linux.intel.com Signed-off-by: Jianchao Wang jianchao.w.wang@oracle.com Reviewed-by: Keith Busch keith.busch@intel.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin alexander.levin@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/nvme/host/pci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2291,6 +2291,7 @@ static void nvme_remove_dead_ctrl(struct
nvme_get_ctrl(&dev->ctrl); nvme_dev_disable(dev, false); + nvme_kill_queues(&dev->ctrl); if (!queue_work(nvme_wq, &dev->remove_work)) nvme_put_ctrl(&dev->ctrl); } @@ -2407,7 +2408,6 @@ static void nvme_remove_dead_ctrl_work(s struct nvme_dev *dev = container_of(work, struct nvme_dev, remove_work); struct pci_dev *pdev = to_pci_dev(dev->dev);
- nvme_kill_queues(&dev->ctrl); if (pci_get_drvdata(pdev)) device_release_driver(&pdev->dev); nvme_put_ctrl(&dev->ctrl);