From: Anton Eidelman anton@lightbitslabs.com
[ Upstream commit 2d570a7c0251c594489a2c16b82b14ae30345c03 ]
When nvme_tcp_io_work() fails to send to socket due to connection close/reset, error_recovery work is triggered from nvme_tcp_state_change() socket callback. This cancels all the active requests in the tagset, which requeues them.
The failed request, however, was ended and thus requeued individually as well unless send returned -EPIPE. Another return code to be treated the same way is -ECONNRESET.
Double requeue caused BUG_ON(blk_queued_rq(rq)) in blk_mq_requeue_request() from either the individual requeue of the failed request or the bulk requeue from blk_mq_tagset_busy_iter(, nvme_cancel_request, );
Signed-off-by: Anton Eidelman anton@lightbitslabs.com Reviewed-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/tcp.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 6d43b23a0fc8b..f8fa5c5b79f17 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -1054,7 +1054,12 @@ static void nvme_tcp_io_work(struct work_struct *w) } else if (unlikely(result < 0)) { dev_err(queue->ctrl->ctrl.device, "failed to send request %d\n", result); - if (result != -EPIPE) + + /* + * Fail the request unless peer closed the connection, + * in which case error recovery flow will complete all. + */ + if ((result != -EPIPE) && (result != -ECONNRESET)) nvme_tcp_fail_request(queue->request); nvme_tcp_done_send_req(queue); return;