The present code in dwc3_gadget_ep_reclaim_completed_trb() will check for IOC/LST bit in the event->status and returns if IOC/LST bit is set. This logic doesn't work if multiple TRBs are queued per request and the IOC/LST bit is set on the last TRB of that request. Consider an example where a queued request has multiple queued TRBs and IOC/LST bit is set only for the last TRB. In this case, the Core generates XferComplete/XferInProgress events only for the last TRB (since IOC/LST are set only for the last TRB). As per the logic in dwc3_gadget_ep_reclaim_completed_trb() event->status is checked for IOC/LST bit and returns on the first TRB. This makes the remaining TRBs left unhandled. To aviod this, changed the code to check for IOC/LST bits in both
avoid
event->status & TRB->ctrl. This patch does the same.
We don't need to check both. It's very likely that checking the TRB is enough.
At a practical level, this patch resolves USB transfer stalls seen with adb on dwc3 based HiKey960 after functionfs gadget added scatter-gather support around v4.20.
Right, I remember asking for tracepoint data showing this problem happening. It's the best way to figure out what's really going on.
Before we accept these two patches, could you collect dwc3 tracepoint data and share here?
I should have replied to this one. Sorry for the confusion on the other thread. I have sent tracepoints long time ago for this problem, but the tracepoints did help much on debugging the issue, so I'm not going to send again.
But the problem is really obvious though. In function dwc3_gadget_ep_reclaim_trb_sg(), the for_each_sg loop doesn't get a chance to iterate through all TRBs in the sg list, because this function only gets called when the last TRB in the list is completed (because of setting IOC), so at this moment event->status has IOC bit set. The consequence is that, when the for_each_sg loop trying to reclaim the first TRB in the sg list, the call dwc3_gadget_ep_reclaim_completed_trb() returns 1 (if (event-status & DEPEVT_STATUS_IOC) return 1;), thus the for loop is terminated before the rest of the TRBs are reclaimed.
static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep, struct dwc3_request *req, const struct dwc3_event_depevt *event, int status) { struct dwc3_trb *trb = &dep->trb_pool[dep->trb_dequeue]; struct scatterlist *sg = req->sg; struct scatterlist *s; unsigned int pending = req->num_pending_sgs; unsigned int i; int ret = 0;
for_each_sg(sg, s, pending, i) { trb = &dep->trb_pool[dep->trb_dequeue];
if (trb->ctrl & DWC3_TRB_CTRL_HWO) break;
req->sg = sg_next(s); req->num_pending_sgs--;
ret = dwc3_gadget_ep_reclaim_completed_trb(dep, req, trb, event, status, true); if (ret) break; }
return ret; }