Hey, Bart.
On Sun, Apr 08, 2018 at 10:20:38PM -0700, Bart Van Assche wrote:
> If a completion occurs after blk_mq_rq_timed_out() has reset rq->aborted_gstate and the request is again in flight when the timeout expires then a request will be completed twice: a first time by the timeout handler and a second time when the regular completion occurs.
Are we still talking about the same BLK_EH_RESET_TIMER case? This can be solved by the two patches which rcu-synchronize the hand-over to the normal completion path, right?
> Additionally, the blk-mq timeout handling code ignores completions that occur after blk_mq_check_expired() has been called and before blk_mq_rq_timed_out() has reset rq->aborted_gstate. If a block driver timeout handler always returns BLK_EH_RESET_TIMER then the result will be that the request never terminates.
And this is the same race window which was always there, right? I really don't think reducing or closing this window requires full synchronization.
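To make that window concrete, here is a rough single-threaded userspace sketch, not kernel code. Everything named toy_* and lost_completion_demo() is hypothetical and only loosely mirrors blk_mq_check_expired(), blk_mq_rq_timed_out() and the aborted_gstate check on the completion path. It assumes a completion is dropped whenever aborted_gstate matches the request's current gstate, and that BLK_EH_RESET_TIMER clears aborted_gstate.

```c
#include <stdbool.h>

/* Toy model only: these names are hypothetical, not the kernel's. */
enum toy_eh_ret { TOY_EH_HANDLED, TOY_EH_RESET_TIMER };

struct toy_rq {
	unsigned int gstate;		/* generation/state of the current issue (non-zero while in flight) */
	unsigned int aborted_gstate;	/* gstate claimed by the timeout path, 0 = none */
	int completions;		/* how many times the request was really completed */
};

/* Normal completion path: ignored if the timeout path has claimed this gstate. */
static bool toy_complete(struct toy_rq *rq)
{
	if (rq->aborted_gstate == rq->gstate)
		return false;
	rq->completions++;
	return true;
}

/* Stands in for blk_mq_check_expired(): the timeout path claims the request. */
static void toy_check_expired(struct toy_rq *rq)
{
	rq->aborted_gstate = rq->gstate;
}

/* Stands in for blk_mq_rq_timed_out(): act on the driver's timeout verdict. */
static void toy_timed_out(struct toy_rq *rq, enum toy_eh_ret ret)
{
	if (ret == TOY_EH_HANDLED)
		rq->completions++;		/* timeout path completes the request */
	else
		rq->aborted_gstate = 0;		/* hand back to normal completion, timer re-armed */
}

/*
 * Replays the window in order. Returns the number of real completions,
 * or -1 if the completion was unexpectedly accepted.
 */
int lost_completion_demo(void)
{
	struct toy_rq rq = { .gstate = 1, .aborted_gstate = 0, .completions = 0 };
	bool accepted;

	toy_check_expired(&rq);			/* timeout fires and claims the request */
	accepted = toy_complete(&rq);		/* driver completes inside the window */
	toy_timed_out(&rq, TOY_EH_RESET_TIMER);	/* handler asks for more time */

	return accepted ? -1 : rq.completions;
}
```

In this sketch lost_completion_demo() returns 0: the completion that arrived inside the window was dropped and nothing else completes the request, so a handler that keeps returning the reset-timer verdict would leave it pending.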
Thanks.