This fixes a regression added with:

commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
Author: Mike Christie <mchristi@redhat.com>
Date:   Sun Aug 4 14:10:06 2019 -0500

    nbd: fix max number of supported devs

where we can deadlock during device shutdown. The problem occurs if userspace has done an NBD_CLEAR_SOCK call and then does close() before the recv_work work has done its nbd_config_put() call. If recv_work then does the last nbd_config_put() call, it will call destroy_workqueue(), which will be stuck waiting for the work we are running from.

This fixes the issue by having nbd_start_device_ioctl() flush the work queue in both the failure and success cases, and by holding a refcount on the nbd_device while it is flushing the work queue.

Cc: stable@vger.kernel.org
Signed-off-by: Mike Christie <mchristi@redhat.com>
---
 drivers/block/nbd.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 57532465fb83..f8597d2fb365 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1293,13 +1293,15 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
 
 	if (max_part)
 		bdev->bd_invalidated = 1;
+
+	refcount_inc(&nbd->config_refs);
 	mutex_unlock(&nbd->config_lock);
 	ret = wait_event_interruptible(config->recv_wq,
 					 atomic_read(&config->recv_threads) == 0);
-	if (ret) {
+	if (ret)
 		sock_shutdown(nbd);
-		flush_workqueue(nbd->recv_workq);
-	}
+	flush_workqueue(nbd->recv_workq);
+
 	mutex_lock(&nbd->config_lock);
 	nbd_bdev_reset(bdev);
 	/* user requested, ignore socket errors */
@@ -1307,6 +1309,7 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
 		ret = 0;
 	if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
 		ret = -ETIMEDOUT;
+	nbd_config_put(nbd);
 	return ret;
 }
 
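
The reference ordering described in the patch can be sketched in userspace. This is only an analogy under stated assumptions, not the driver code: a pthread stands in for the recv_work work item, pthread_join() stands in for flush_workqueue(), and the names struct dev, dev_put() and teardown_context() are hypothetical. The sketch records which context drops the last reference, since that context is the one that would end up calling destroy_workqueue().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for nbd_device and its config refcount. */
struct dev {
	atomic_int refs;            /* plays the role of nbd->config_refs */
	atomic_bool torn_down;      /* set once the last reference is dropped */
	atomic_bool torn_down_by_worker; /* true if the worker did the last put */
};

/* Drop one reference; whoever drops the last one performs the teardown. */
static void dev_put(struct dev *d, bool from_worker)
{
	int prev = atomic_fetch_sub(&d->refs, 1);

	assert(prev > 0);
	if (prev == 1) {
		/* In the driver this is where destroy_workqueue() would run.
		 * If from_worker is true, it would wait on the work item
		 * that is currently executing it. */
		atomic_store(&d->torn_down_by_worker, from_worker);
		atomic_store(&d->torn_down, true);
	}
}

/* Stand-in for recv_work: its final act is to drop its reference. */
static void *recv_work(void *arg)
{
	dev_put(arg, true);
	return NULL;
}

/*
 * Returns 1 if the teardown ran in the worker (the hazardous case),
 * 0 if it ran in the caller, and -1 if the thread could not be created.
 */
static int teardown_context(bool hold_ref_while_flushing)
{
	struct dev d;
	pthread_t worker;

	atomic_init(&d.refs, 2); /* one for the opener, one for the worker */
	atomic_init(&d.torn_down, false);
	atomic_init(&d.torn_down_by_worker, false);

	if (hold_ref_while_flushing)
		atomic_fetch_add(&d.refs, 1); /* like refcount_inc() before the wait */

	dev_put(&d, false); /* close() drops the opener's reference first */

	if (pthread_create(&worker, NULL, recv_work, &d) != 0)
		return -1;
	pthread_join(worker, NULL); /* stands in for flush_workqueue() */

	if (hold_ref_while_flushing)
		dev_put(&d, false); /* like nbd_config_put() at the end of the ioctl */

	return atomic_load(&d.torn_down_by_worker) ? 1 : 0;
}
```

In this model, teardown_context(false) reports that the worker dropped the last reference, while teardown_context(true) moves the final put into the caller after the flush. The model only covers the ordering described in the patch description and says nothing about other races in the driver.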
Josef and Jens,
Ignore this patch. It could also deadlock but in a different way, and it looks like there are other possible issues with races and refcounts. I will send some new patches.
On 12/02/2019 03:51 PM, Mike Christie wrote:
> This fixes a regression added with:
>
> commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
> Author: Mike Christie <mchristi@redhat.com>
> Date:   Sun Aug 4 14:10:06 2019 -0500
>
>     nbd: fix max number of supported devs
>
> where we can deadlock during device shutdown. The problem occurs if userspace has done an NBD_CLEAR_SOCK call and then does close() before the recv_work work has done its nbd_config_put() call. If recv_work then does the last nbd_config_put() call, it will call destroy_workqueue(), which will be stuck waiting for the work we are running from.
>
> This fixes the issue by having nbd_start_device_ioctl() flush the work queue in both the failure and success cases, and by holding a refcount on the nbd_device while it is flushing the work queue.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Mike Christie <mchristi@redhat.com>
> ---
>  drivers/block/nbd.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 57532465fb83..f8597d2fb365 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -1293,13 +1293,15 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
>  
>  	if (max_part)
>  		bdev->bd_invalidated = 1;
> +
> +	refcount_inc(&nbd->config_refs);
>  	mutex_unlock(&nbd->config_lock);
>  	ret = wait_event_interruptible(config->recv_wq,
>  					 atomic_read(&config->recv_threads) == 0);
> -	if (ret) {
> +	if (ret)
>  		sock_shutdown(nbd);
> -		flush_workqueue(nbd->recv_workq);
> -	}
> +	flush_workqueue(nbd->recv_workq);
> +
>  	mutex_lock(&nbd->config_lock);
>  	nbd_bdev_reset(bdev);
>  	/* user requested, ignore socket errors */
> @@ -1307,6 +1309,7 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
>  		ret = 0;
>  	if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
>  		ret = -ETIMEDOUT;
> +	nbd_config_put(nbd);
>  	return ret;
>  }
>  

linux-stable-mirror@lists.linaro.org