On Tue, 2018-04-10 at 22:30 +0800, Ming Lei wrote:
> On Tue, Apr 10, 2018 at 02:09:33PM +0000, Bart Van Assche wrote:
> > Please keep in mind that all synchronize_rcu() does is to wait for pre-existing RCU readers to finish. synchronize_rcu() does not prevent that new rcu_read_lock() calls happen. It is e.g. possible that after
>
> That is right, and I also mentioned normal completion can be done between
> - and reset aborted_gstate in 3).
>
> > blk_mq_rq_update_aborted_gstate(req, 0) has been executed that a regular completion occurs. If that request is not reused before the timer that was restarted by the timeout code expires, that request will be completed twice.
>
> In this patch, blk_mq_add_timer(req, MQ_RQ_COMPLETE, MQ_RQ_IN_FLIGHT) is called for handling BLK_EH_RESET_TIMER. And after rq's state is changed to MQ_RQ_IN_FLIGHT, normal completion still can come and complete this rq, just like the above you described, right?
I should have added the following in my previous e-mail: "if the completion occurs after blk_mq_check_expired() examined rq->gstate and before it updated rq->aborted_gstate". That race can occur with the current upstream blk-mq timeout handling code but not after my patch has been applied.
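
To make the window concrete, here is a minimal user-space C11 sketch of one way a "check, then update" gap can be closed: both the completion path and the timeout path claim the request with a single atomic compare-and-swap, so only one of them can win. This is an illustration under assumptions, not the kernel code and not the patch itself; struct request_sketch, rq_change_state(), rq_complete() and rq_timeout() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request states, for illustration only. */
enum rq_state { RQ_IDLE, RQ_IN_FLIGHT, RQ_COMPLETE };

/* Hypothetical stand-in for a request; not the kernel's struct request. */
struct request_sketch {
	_Atomic int state;
	int completions;	/* how often the request was completed */
};

/*
 * Move the request from old_state to new_state in one atomic step.
 * If two paths race, exactly one of them sees the expected old state
 * and succeeds; the other one gets false and must back off.
 */
static bool rq_change_state(struct request_sketch *rq,
			    int old_state, int new_state)
{
	return atomic_compare_exchange_strong(&rq->state, &old_state,
					      new_state);
}

/* Regular completion path: complete only if we won the transition. */
static bool rq_complete(struct request_sketch *rq)
{
	if (!rq_change_state(rq, RQ_IN_FLIGHT, RQ_COMPLETE))
		return false;
	rq->completions++;
	return true;
}

/* Timeout path: uses the same transition, so it cannot complete twice. */
static bool rq_timeout(struct request_sketch *rq)
{
	if (!rq_change_state(rq, RQ_IN_FLIGHT, RQ_COMPLETE))
		return false;
	rq->completions++;
	return true;
}
```

With a request that starts in RQ_IN_FLIGHT, whichever of rq_complete() and rq_timeout() runs first returns true and the other returns false, so the completion count stays at one. There is no point between examining the state and updating it at which the other path can slip in.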
Bart.