The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
Possible dependencies:
2820e5d0820a ("block: mq-deadline: Fix dd_finish_request() for zoned devices")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2820e5d0820ac4daedff1272616a53d9c7682fd2 Mon Sep 17 00:00:00 2001 From: Damien Le Moal damien.lemoal@opensource.wdc.com Date: Thu, 24 Nov 2022 11:12:07 +0900 Subject: [PATCH] block: mq-deadline: Fix dd_finish_request() for zoned devices
dd_finish_request() tests if the per prio fifo_list is not empty to determine if request dispatching must be restarted for handling blocked write requests to zoned devices with a call to blk_mq_sched_mark_restart_hctx(). While simple, this implementation has 2 problems:
1) Only the priority level of the completed request is considered. However, writes to a zone may be blocked due to other writes to the same zone using a different priority level. While this is unlikely to happen in practice, as writing a zone with different IO priorirites does not make sense, nothing in the code prevents this from happening. 2) The use of list_empty() is dangerous as dd_finish_request() does not take dd->lock and may run concurrently with the insert and dispatch code.
Fix these 2 problems by testing the write fifo list of all priority levels using the new helper dd_has_write_work(), and by testing each fifo list using list_empty_careful().
Fixes: c807ab520fc3 ("block/mq-deadline: Add I/O priority support") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Link: https://lore.kernel.org/r/20221124021208.242541-2-damien.lemoal@opensource.w... Signed-off-by: Jens Axboe axboe@kernel.dk
diff --git a/block/mq-deadline.c b/block/mq-deadline.c index 5639921dfa92..36374481cb87 100644 --- a/block/mq-deadline.c +++ b/block/mq-deadline.c @@ -789,6 +789,18 @@ static void dd_prepare_request(struct request *rq) rq->elv.priv[0] = NULL; }
+static bool dd_has_write_work(struct blk_mq_hw_ctx *hctx) +{ + struct deadline_data *dd = hctx->queue->elevator->elevator_data; + enum dd_prio p; + + for (p = 0; p <= DD_PRIO_MAX; p++) + if (!list_empty_careful(&dd->per_prio[p].fifo_list[DD_WRITE])) + return true; + + return false; +} + /* * Callback from inside blk_mq_free_request(). * @@ -828,9 +840,10 @@ static void dd_finish_request(struct request *rq)
spin_lock_irqsave(&dd->zone_lock, flags); blk_req_zone_write_unlock(rq); - if (!list_empty(&per_prio->fifo_list[DD_WRITE])) - blk_mq_sched_mark_restart_hctx(rq->mq_hctx); spin_unlock_irqrestore(&dd->zone_lock, flags); + + if (dd_has_write_work(rq->mq_hctx)) + blk_mq_sched_mark_restart_hctx(rq->mq_hctx); } }
commit 2820e5d0820ac4daedff1272616a53d9c7682fd2 upstream.
dd_finish_request() tests if the per prio fifo_list is not empty to determine if request dispatching must be restarted for handling blocked write requests to zoned devices with a call to blk_mq_sched_mark_restart_hctx(). While simple, this implementation has 2 problems:
1) Only the priority level of the completed request is considered. However, writes to a zone may be blocked due to other writes to the same zone using a different priority level. While this is unlikely to happen in practice, as writing a zone with different IO priorirites does not make sense, nothing in the code prevents this from happening. 2) The use of list_empty() is dangerous as dd_finish_request() does not take dd->lock and may run concurrently with the insert and dispatch code.
Fix these 2 problems by testing the write fifo list of all priority levels using the new helper dd_has_write_work(), and by testing each fifo list using list_empty_careful().
Fixes: c807ab520fc3 ("block/mq-deadline: Add I/O priority support") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Link: https://lore.kernel.org/r/20221124021208.242541-2-damien.lemoal@opensource.w... Signed-off-by: Jens Axboe axboe@kernel.dk --- block/mq-deadline.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/block/mq-deadline.c b/block/mq-deadline.c index cd2342d29704..a3f966ece0b7 100644 --- a/block/mq-deadline.c +++ b/block/mq-deadline.c @@ -733,6 +733,18 @@ static void dd_prepare_request(struct request *rq) rq->elv.priv[0] = NULL; }
+static bool dd_has_write_work(struct blk_mq_hw_ctx *hctx) +{ + struct deadline_data *dd = hctx->queue->elevator->elevator_data; + enum dd_prio p; + + for (p = 0; p <= DD_PRIO_MAX; p++) + if (!list_empty_careful(&dd->per_prio[p].fifo_list[DD_WRITE])) + return true; + + return false; +} + /* * Callback from inside blk_mq_free_request(). * @@ -755,7 +767,6 @@ static void dd_finish_request(struct request *rq) struct deadline_data *dd = q->elevator->elevator_data; const u8 ioprio_class = dd_rq_ioclass(rq); const enum dd_prio prio = ioprio_class_to_prio[ioprio_class]; - struct dd_per_prio *per_prio = &dd->per_prio[prio];
/* * The block layer core may call dd_finish_request() without having @@ -771,9 +782,10 @@ static void dd_finish_request(struct request *rq)
spin_lock_irqsave(&dd->zone_lock, flags); blk_req_zone_write_unlock(rq); - if (!list_empty(&per_prio->fifo_list[DD_WRITE])) - blk_mq_sched_mark_restart_hctx(rq->mq_hctx); spin_unlock_irqrestore(&dd->zone_lock, flags); + + if (dd_has_write_work(rq->mq_hctx)) + blk_mq_sched_mark_restart_hctx(rq->mq_hctx); } }
On Thu, Jan 05, 2023 at 01:07:56PM +0900, Damien Le Moal wrote:
commit 2820e5d0820ac4daedff1272616a53d9c7682fd2 upstream.
dd_finish_request() tests if the per prio fifo_list is not empty to determine if request dispatching must be restarted for handling blocked write requests to zoned devices with a call to blk_mq_sched_mark_restart_hctx(). While simple, this implementation has 2 problems:
- Only the priority level of the completed request is considered. However, writes to a zone may be blocked due to other writes to the same zone using a different priority level. While this is unlikely to happen in practice, as writing a zone with different IO priorirites does not make sense, nothing in the code prevents this from happening.
- The use of list_empty() is dangerous as dd_finish_request() does not take dd->lock and may run concurrently with the insert and dispatch code.
Fix these 2 problems by testing the write fifo list of all priority levels using the new helper dd_has_write_work(), and by testing each fifo list using list_empty_careful().
Fixes: c807ab520fc3 ("block/mq-deadline: Add I/O priority support") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Link: https://lore.kernel.org/r/20221124021208.242541-2-damien.lemoal@opensource.w... Signed-off-by: Jens Axboe axboe@kernel.dk
block/mq-deadline.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
Now queued up, thanks.
greg k-h
On 1/4/23 23:21, gregkh@linuxfoundation.org wrote:
The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
Greg,
I sent you a backport in reply to your email.
Thanks !
Possible dependencies:
2820e5d0820a ("block: mq-deadline: Fix dd_finish_request() for zoned devices")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2820e5d0820ac4daedff1272616a53d9c7682fd2 Mon Sep 17 00:00:00 2001 From: Damien Le Moal damien.lemoal@opensource.wdc.com Date: Thu, 24 Nov 2022 11:12:07 +0900 Subject: [PATCH] block: mq-deadline: Fix dd_finish_request() for zoned devices
dd_finish_request() tests if the per prio fifo_list is not empty to determine if request dispatching must be restarted for handling blocked write requests to zoned devices with a call to blk_mq_sched_mark_restart_hctx(). While simple, this implementation has 2 problems:
- Only the priority level of the completed request is considered. However, writes to a zone may be blocked due to other writes to the same zone using a different priority level. While this is unlikely to happen in practice, as writing a zone with different IO priorirites does not make sense, nothing in the code prevents this from happening.
- The use of list_empty() is dangerous as dd_finish_request() does not take dd->lock and may run concurrently with the insert and dispatch code.
Fix these 2 problems by testing the write fifo list of all priority levels using the new helper dd_has_write_work(), and by testing each fifo list using list_empty_careful().
Fixes: c807ab520fc3 ("block/mq-deadline: Add I/O priority support") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Link: https://lore.kernel.org/r/20221124021208.242541-2-damien.lemoal@opensource.w... Signed-off-by: Jens Axboe axboe@kernel.dk
diff --git a/block/mq-deadline.c b/block/mq-deadline.c index 5639921dfa92..36374481cb87 100644 --- a/block/mq-deadline.c +++ b/block/mq-deadline.c @@ -789,6 +789,18 @@ static void dd_prepare_request(struct request *rq) rq->elv.priv[0] = NULL; } +static bool dd_has_write_work(struct blk_mq_hw_ctx *hctx) +{
- struct deadline_data *dd = hctx->queue->elevator->elevator_data;
- enum dd_prio p;
- for (p = 0; p <= DD_PRIO_MAX; p++)
if (!list_empty_careful(&dd->per_prio[p].fifo_list[DD_WRITE]))
return true;
- return false;
+}
/*
- Callback from inside blk_mq_free_request().
@@ -828,9 +840,10 @@ static void dd_finish_request(struct request *rq) spin_lock_irqsave(&dd->zone_lock, flags); blk_req_zone_write_unlock(rq);
if (!list_empty(&per_prio->fifo_list[DD_WRITE]))
spin_unlock_irqrestore(&dd->zone_lock, flags);blk_mq_sched_mark_restart_hctx(rq->mq_hctx);
if (dd_has_write_work(rq->mq_hctx))
}blk_mq_sched_mark_restart_hctx(rq->mq_hctx);
}
linux-stable-mirror@lists.linaro.org