On 4/4/18 10:35 AM, Ming Lei wrote:
This patch orders the acquisition of budget and driver tag by making sure the driver tag is acquired only after the budget has been obtained. This helps avoid the following race:
- Before dispatching a request from the scheduler queue, one budget is
acquired first, then a request is dequeued; call it request A.
- In another IO path, dispatching request B from hctx->dispatch, the
driver tag is acquired first, then blk_mq_dispatch_rq_list() tries to get budget; unfortunately the budget is held by request A.
- Meanwhile blk_mq_dispatch_rq_list() is called to dispatch request
A and tries to get a driver tag first; unfortunately no driver tag is available because it is held by request B.
- Neither IO path can make progress, and an IO stall results.
This issue can be observed when running dbench on USB storage.
Good catch. This can potentially trigger on anything, but it is of course more likely with limited budget and/or tag space. Classic ABBA deadlock.
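
To illustrate the ordering problem, here is a minimal user-space sketch of the race, not kernel code. It assumes a toy model with exactly one budget token and one driver tag, and two dispatch paths that hold whatever they have acquired while retrying for the other resource. All names (struct pool, path_step(), run_two_paths()) are hypothetical and exist only for this example; the real blk-mq paths are considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy resource pool: counts free units (budget tokens or driver tags). */
struct pool { int free; };

static bool pool_get(struct pool *p)
{
	assert(p->free >= 0);
	if (p->free == 0)
		return false;
	p->free--;
	return true;
}

static void pool_put(struct pool *p) { p->free++; }

/* Which resource a dispatch path tries to acquire first. */
enum order { BUDGET_FIRST, TAG_FIRST };

struct path {
	enum order order;
	bool has_budget, has_tag, done;
};

/*
 * Advance one path by a single step. Returns true if it made progress
 * (acquired a resource, or dispatched and released both).
 */
static bool path_step(struct path *pt, struct pool *budget, struct pool *tag)
{
	if (pt->done)
		return false;

	if (pt->has_budget && pt->has_tag) {
		/* "Dispatch" the request, then release both resources. */
		pool_put(budget);
		pool_put(tag);
		pt->has_budget = pt->has_tag = false;
		pt->done = true;
		return true;
	}

	/* Budget is wanted now if it comes first, or if the tag is already held. */
	if (!pt->has_budget && (pt->order == BUDGET_FIRST || pt->has_tag)) {
		if (!pool_get(budget))
			return false;
		pt->has_budget = true;
		return true;
	}

	if (!pool_get(tag))
		return false;
	pt->has_tag = true;
	return true;
}

/*
 * Interleave two paths round-robin with one budget and one tag.
 * Returns true if both complete, false if neither can make progress.
 */
bool run_two_paths(enum order a, enum order b)
{
	struct pool budget = { 1 }, tag = { 1 };
	struct path pa = { a, false, false, false };
	struct path pb = { b, false, false, false };

	for (;;) {
		bool progress_a = path_step(&pa, &budget, &tag);
		bool progress_b = path_step(&pb, &budget, &tag);

		if (pa.done && pb.done)
			return true;
		if (!progress_a && !progress_b)
			return false; /* each holds what the other needs */
	}
}
```

In this model, run_two_paths(BUDGET_FIRST, TAG_FIRST) corresponds to request A taking budget first while request B takes the driver tag first, and it stalls. With run_two_paths(BUDGET_FIRST, BUDGET_FIRST), the second path simply waits for the budget and both complete, which is the ordering the patch enforces.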