[PATCH AUTOSEL 5.1 113/141] block: init flush rq ref count to 1

19 Jul 2019

From: Josef Bacik <josef@toxicpanda.com>

[ Upstream commit b554db147feea39617b533ab6bca247c91c6198a ]

We discovered a problem in newer kernels where disconnecting an NBD
device while a flush request was pending would result in a hang.  This
is because the blk-mq timeout handler does

	if (!refcount_inc_not_zero(&rq->ref))
		return true;

to determine if it's ok to run the timeout handler for the request.
A flush_rq doesn't have its ref count set, so we'd skip running the
timeout handler for this request and it would just sit there in limbo
forever.

Fix this by always setting the refcount of any request going through
blk_rq_init() to 1.  I tested this with an nbd-server that dropped flush
requests to verify that it hung, and then tested with this patch to
verify I got the timeout as expected and the error handling kicked in.
Thanks,

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 block/blk-core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 2dd94b3e9ece..aac658392d60 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -116,6 +116,7 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
 	rq->internal_tag = -1;
 	rq->start_time_ns = ktime_get_ns();
 	rq->part = NULL;
+	refcount_set(&rq->ref, 1);
 }
 EXPORT_SYMBOL(blk_rq_init);
-- 
2.20.1
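
For readers who want to see why a zero ref count makes the timeout
handler skip a request, here is a minimal userspace sketch of the check
quoted in the commit message.  It is a model under stated assumptions,
not kernel code: every model_* name is hypothetical, the counter is a
plain C11 atomic rather than the kernel's refcount_t, and only the
"increment unless zero" behaviour is reproduced.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for refcount_t (assumption, not kernel code). */
struct model_refcount {
	atomic_uint refs;
};

/* Models refcount_set(): unconditionally store a new count. */
static void model_refcount_set(struct model_refcount *r, unsigned int n)
{
	atomic_store(&r->refs, n);
}

/* Models refcount_inc_not_zero(): take a reference only if the count is non-zero. */
static bool model_refcount_inc_not_zero(struct model_refcount *r)
{
	unsigned int old = atomic_load(&r->refs);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value and we retry. */
		if (atomic_compare_exchange_weak(&r->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Hypothetical request carrying only the field relevant to this patch. */
struct model_request {
	struct model_refcount ref;
};

/*
 * Returns true if the timeout logic would go on to examine the request,
 * false if it would be skipped because no reference could be taken.
 */
static bool model_timeout_would_run(struct model_request *rq)
{
	if (!model_refcount_inc_not_zero(&rq->ref))
		return false;	/* count was 0: request is skipped */
	return true;
}
```

With the count left at 0 (the flush_rq case before this patch),
model_timeout_would_run() returns false; after model_refcount_set(&rq.ref, 1),
which mirrors the one-line change in blk_rq_init(), it returns true and
the count becomes 2 for the duration of the check.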


    
