Re: [PATCH v4] blk-mq: Fix race conditions in request timeout handling

10 Apr 2018

On Tue, 2018-04-10 at 22:30 +0800, Ming Lei wrote:
> ...
> On Tue, Apr 10, 2018 at 02:09:33PM +0000, Bart Van Assche wrote:
> > ...
> > Please keep in mind that all synchronize_rcu() does is to wait for pre-
> > existing RCU readers to finish. synchronize_rcu() does not prevent that new
> > rcu_read_lock() calls happen. It is e.g. possible that after
>
> That is right, and I also mentioned normal completion can be done between
>
> and reset aborted_gstate in 3).
>
> ...
> > blk_mq_rq_update_aborted_gstate(req, 0) has been executed that a regular
> > completion occurs. If that request is not reused before the timer that was
> > restarted by the timeout code expires, that request will be completed twice.
>
> In this patch, blk_mq_add_timer(req, MQ_RQ_COMPLETE, MQ_RQ_IN_FLIGHT) is
> called for handling BLK_EH_RESET_TIMER. And after rq's state is changed
> to MQ_RQ_IN_FLIGHT, normal completion still can come and complete this rq,
> just like the above you described, right?

I should have added the following in my previous e-mail: "if the completion
occurs after blk_mq_check_expired() examined rq->gstate and before it updated
rq->aborted_gstate". That race can occur with the current upstream blk-mq
timeout handling code but not after my patch has been applied.

Bart.
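
The window described above (a completion arriving after the timeout path
has examined the request state but before it has recorded its decision) is
a check-then-act race. The following is a minimal userspace sketch of that
pattern, not kernel code: struct request, enum rq_state and every function
name below are hypothetical and do not model rq->gstate, rq->aborted_gstate
or the patch itself. It replays the problematic interleaving in a single
thread so the outcome is deterministic, and contrasts it with one common way
of closing such a window, a single compare-and-exchange on the state word.
Whether the patch under discussion works this way is not stated in this
message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request states; illustrative names, not blk-mq's. */
enum rq_state { RQ_IN_FLIGHT, RQ_COMPLETE };

struct request {
	_Atomic int state;	/* current state of the request */
	int completions;	/* how many times it has been completed */
};

/* Racy pattern, step 1: examine the state. */
static bool racy_examine(struct request *rq)
{
	return atomic_load(&rq->state) == RQ_IN_FLIGHT;
}

/* Racy pattern, step 2: act on a result that may already be stale. */
static void racy_complete(struct request *rq)
{
	atomic_store(&rq->state, RQ_COMPLETE);
	rq->completions++;
}

/*
 * Atomic pattern: examine and update in one step. Only the caller that
 * wins the IN_FLIGHT -> COMPLETE transition completes the request.
 */
static bool claim_and_complete(struct request *rq)
{
	int expected = RQ_IN_FLIGHT;

	if (!atomic_compare_exchange_strong(&rq->state, &expected, RQ_COMPLETE))
		return false;
	assert(rq->completions == 0);	/* the winner is the first completer */
	rq->completions++;
	return true;
}

/* Replay: timeout path examines, regular completion runs, timeout path acts. */
static int replay_racy(void)
{
	struct request rq;
	bool timeout_saw_in_flight;

	atomic_init(&rq.state, RQ_IN_FLIGHT);
	rq.completions = 0;

	timeout_saw_in_flight = racy_examine(&rq);	/* timeout: check */
	if (racy_examine(&rq))				/* regular completion */
		racy_complete(&rq);
	if (timeout_saw_in_flight)			/* timeout: stale act */
		racy_complete(&rq);
	return rq.completions;
}

/* Same ordering of the two paths, but each uses the atomic transition. */
static int replay_atomic(void)
{
	struct request rq;

	atomic_init(&rq.state, RQ_IN_FLIGHT);
	rq.completions = 0;

	claim_and_complete(&rq);	/* regular completion wins */
	claim_and_complete(&rq);	/* timeout path loses and does nothing */
	return rq.completions;
}
```

In this sketch replay_racy() returns 2, i.e. the request is completed
twice, while replay_atomic() returns 1 because the second caller fails the
compare-and-exchange and backs off.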
