On 4/9/19 8:28 PM, Laurence Oberman wrote:
On Tue, 2019-04-09 at 09:31 +0800, jianchao.wang wrote:
On 4/8/19 10:36 AM, jianchao.wang wrote:
On 4/8/19 10:07 AM, jianchao.wang wrote:
Hi Bart
On 4/4/19 4:11 AM, Bart Van Assche wrote:
If blk_mq_try_issue_directly() returns BLK_STS*_RESOURCE that means that the request has not been queued and that the caller should retry to submit the request. Both blk_mq_request_bypass_insert() and blk_mq_sched_insert_request() guarantee that a request will be processed. Hence return BLK_STS_OK if one of these functions is called. This patch avoids that blk_mq_dispatch_rq_list() crashes when using dm- mpath.
Sorry, I seem to miss the original mail list that reported this issue. As your comment, it looks like that the request is handled again when the blk_mq_try_issue_directly return BLK_STS*_RESOURCE, right ?
The usage of this helper interface is, if care about the return value and want to handle the request yourself when return BLK_STS*_RESOURCE, pass 'byass' as true. otherwise, just pass 'bypass' as false, then blk_mq_try_issue_directly would take over all of the work including requeue or complete the request.
if dm-mpath case, the driver should only invoke dm_dispatch_clone_request, the 'bypass' parameter should only be true. as the blk_mq_try_issue_directly, it would return BLK_STS_OK when have to insert the request, otherwise, it would do nothing but return BLK_STS*_RESOURCE.
Would you please show the cause that the dm-mpath driver invoke blk_mq_try_issue_direclty with 'bypass == false' ?
The issue seems to be here,
blk_mq_try_issue_directly
if (unlikely(blk_mq_hctx_stopped(hctx) ||
blk_queue_quiesced(q))) { run_queue = false; bypass = false; //------> HERE !!! goto out_unlock; }
case BLK_STS_RESOURCE: if (force) { blk_mq_request_bypass_insert(rq, run_queue); ret = bypass ? BLK_STS_OK : ret; } else if (!bypass) { blk_mq_sched_insert_request(rq, false, run_queue, false); } break;
Then the request will be inserted and blk_mq_try_issue_dreictly returns BLK_STS_RESOURCE.
Could following patch fix the issue ?
Hi Laurence
Would you please test this patch to see whether the issue could be fixed ?
Thanks Jianchao
diff --git a/block/blk-mq.c b/block/blk-mq.c index a9c1816..a3394f2 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1813,7 +1813,7 @@ blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, */ if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) { run_queue = false;
bypass = false;
force = true; goto out_unlock; }
Thanks Jianchao
...
Hello Sir I think Jens already took the revert patch though. I will try this when I gat a chance. Need to wait until I can reboot the targetserver again.
Thanks so much for your help. Please share the test result here. I will get the reverted patches back after that.
Thanks Jianchao