4.20-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 85bd6e61f34dffa8ec2dc75ff3c02ee7b2f1cbce ]
Florian reported a io hung issue when fsync(). It should be triggered by following race condition.
data + post flush a flush
blk_flush_complete_seq case REQ_FSEQ_DATA blk_flush_queue_rq issued to driver blk_mq_dispatch_rq_list try to issue a flush req failed due to NON-NCQ command .queue_rq return BLK_STS_DEV_RESOURCE
request completion req->end_io // doesn't check RESTART mq_flush_data_end_io case REQ_FSEQ_POSTFLUSH blk_kick_flush do nothing because previous flush has not been completed blk_mq_run_hw_queue insert rq to hctx->dispatch due to RESTART is still set, do nothing
To fix this, replace the blk_mq_run_hw_queue in mq_flush_data_end_io with blk_mq_sched_restart to check and clear the RESTART flag.
Fixes: bd166ef1 (blk-mq-sched: add framework for MQ capable IO schedulers) Reported-by: Florian Stecker m19@florianstecker.de Tested-by: Florian Stecker m19@florianstecker.de Signed-off-by: Jianchao Wang jianchao.w.wang@oracle.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-flush.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-flush.c b/block/blk-flush.c index 8b44b86779da..87fc49daa2b4 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -424,7 +424,7 @@ static void mq_flush_data_end_io(struct request *rq, blk_status_t error) blk_flush_complete_seq(rq, fq, REQ_FSEQ_DATA, error); spin_unlock_irqrestore(&fq->mq_flush_lock, flags);
- blk_mq_run_hw_queue(hctx, true); + blk_mq_sched_restart(hctx); }
/**