The patch below does not apply to the 5.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0e4e1de5b63fa423b13593337a27fd2d2b0bcf77 Mon Sep 17 00:00:00 2001
From: Ilya Dryomov idryomov@gmail.com Date: Fri, 13 Mar 2020 11:20:51 +0100 Subject: [PATCH] rbd: avoid a deadlock on header_rwsem when flushing notifies
rbd_unregister_watch() flushes notifies and therefore cannot be called under header_rwsem because a header update notify takes header_rwsem to synchronize with "rbd map". If mapping an image fails after the watch is established and a header update notify sneaks in, we deadlock when erroring out from rbd_dev_image_probe().
Move watch registration and unregistration out of the critical section. The only reason they were put there was to make header_rwsem management slightly more obvious.
Fixes: 811c66887746 ("rbd: fix rbd map vs notify races") Signed-off-by: Ilya Dryomov idryomov@gmail.com Reviewed-by: Jason Dillaman dillaman@redhat.com
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 1e0a6b19ae0d..ff2377e6d12c 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -4527,6 +4527,10 @@ static void cancel_tasks_sync(struct rbd_device *rbd_dev) cancel_work_sync(&rbd_dev->unlock_work); }
+/* + * header_rwsem must not be held to avoid a deadlock with + * rbd_dev_refresh() when flushing notifies. + */ static void rbd_unregister_watch(struct rbd_device *rbd_dev) { cancel_tasks_sync(rbd_dev); @@ -6907,6 +6911,9 @@ static void rbd_dev_image_release(struct rbd_device *rbd_dev) * device. If this image is the one being mapped (i.e., not a * parent), initiate a watch on its header object before using that * object to get detailed information about the rbd image. + * + * On success, returns with header_rwsem held for write if called + * with @depth == 0. */ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth) { @@ -6936,6 +6943,9 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth) } }
+ if (!depth) + down_write(&rbd_dev->header_rwsem); + ret = rbd_dev_header_info(rbd_dev); if (ret) { if (ret == -ENOENT && !need_watch) @@ -6987,6 +6997,8 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth) err_out_probe: rbd_dev_unprobe(rbd_dev); err_out_watch: + if (!depth) + up_write(&rbd_dev->header_rwsem); if (need_watch) rbd_unregister_watch(rbd_dev); err_out_format: @@ -7050,12 +7062,9 @@ static ssize_t do_rbd_add(struct bus_type *bus, goto err_out_rbd_dev; }
- down_write(&rbd_dev->header_rwsem); rc = rbd_dev_image_probe(rbd_dev, 0); - if (rc < 0) { - up_write(&rbd_dev->header_rwsem); + if (rc < 0) goto err_out_rbd_dev; - }
if (rbd_dev->opts->alloc_size > rbd_dev->layout.object_size) { rbd_warn(rbd_dev, "alloc_size adjusted to %u",
On Tue, Apr 21, 2020 at 07:06:58PM +0200, gregkh@linuxfoundation.org wrote:
The patch below does not apply to the 5.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0e4e1de5b63fa423b13593337a27fd2d2b0bcf77 Mon Sep 17 00:00:00 2001 From: Ilya Dryomov idryomov@gmail.com Date: Fri, 13 Mar 2020 11:20:51 +0100 Subject: [PATCH] rbd: avoid a deadlock on header_rwsem when flushing notifies
rbd_unregister_watch() flushes notifies and therefore cannot be called under header_rwsem because a header update notify takes header_rwsem to synchronize with "rbd map". If mapping an image fails after the watch is established and a header update notify sneaks in, we deadlock when erroring out from rbd_dev_image_probe().
Move watch registration and unregistration out of the critical section. The only reason they were put there was to make header_rwsem management slightly more obvious.
Fixes: 811c66887746 ("rbd: fix rbd map vs notify races") Signed-off-by: Ilya Dryomov idryomov@gmail.com Reviewed-by: Jason Dillaman dillaman@redhat.com
There was a conflict with:
b9ef2b8858a0 ("rbd: don't establish watch for read-only mappings")
And I ended up with a funny resolution, but given the conflict at hand I think it makes sense:
@@ -6135,6 +6145,8 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth) err_out_probe: rbd_dev_unprobe(rbd_dev); err_out_watch: + if (!depth) + up_write(&rbd_dev->header_rwsem); if (!depth) rbd_unregister_watch(rbd_dev); err_out_format:
Queued for 5.4-4.14.
linux-stable-mirror@lists.linaro.org