On Tue, 2024-06-25 at 09:42 -0700, Bart Van Assche wrote:
Please include a full root cause analysis when reposting fixes for the reported crashes. It is not clear to me how it is possible that an invalid pointer is passed to blk_mq_unique_tag() (0x194). As I mentioned in my previous email, freeing a request does not modify the request pointer and does not modify the SCSI command pointer either. As one can derive from the blk_mq_alloc_rqs() call stack, memory for struct request and struct scsi_cmnd is allocated at request queue allocation time and is not freed until the request queue is freed. Hence, for a given tag, neither the request pointer nor the SCSI command pointer changes as long as a request queue exists. Hence my request for an explanation how it is possible that an invalid pointer was passed to blk_mq_unique_tag().
Thanks,
Bart.
Hi Bart,
Sorry I have not explain root-cause clearly. I will add more clear root-cause analyze next version.
And it is not an invalid pointer is passed to blk_mq_unique_tag(), I means blk_mq_unique_tag function try access null pointer. It is differnt and cause misunderstanding.
The null pinter blk_mq_unique_tag try access is: rq->mq_hctx(NULL)->queue_num.
The racing flow is:
Thread A ufshcd_err_handler step 1 ufshcd_cmd_inflight(true) step 3 ufshcd_mcq_req_to_hwq blk_mq_unique_tag rq->mq_hctx->queue_num step 5
Thread B ufs_mtk_mcq_intr(cq complete ISR) step 2 scsi_done ... __blk_mq_free_request rq->mq_hctx = NULL; step 4
Thanks. Peter