From: Fei Yang <fei.yang@intel.com>
If scatter-gather operation is allowed, a large USB request would be split into multiple TRBs. These TRBs are chained up by setting DWC3_TRB_CTRL_CHN bit except the last one which has DWC3_TRB_CTRL_IOC bit set instead. Since only the last TRB has IOC set, dwc3_gadget_ep_reclaim_completed_trb() would be called only once for the whole USB request, thus all the TRBs need to be reclaimed within this single call. However that is not what the current code does.
This patch addresses the issue by checking each TRB in function dwc3_gadget_ep_reclaim_trb_sg() and reclaiming the chained ones right there. Only the last TRB gets passed to dwc3_gadget_ep_reclaim_completed_trb(). This would guarantee all TRBs are reclaimed and trb_dequeue/num_trbs are updated properly.
Signed-off-by: Fei Yang <fei.yang@intel.com>
Cc: stable <stable@vger.kernel.org>
---
V2: Better solution is to reclaim chained TRBs in dwc3_gadget_ep_reclaim_trb_sg() and leave the last TRB to the dwc3_gadget_ep_reclaim_completed_trb().
---
 drivers/usb/dwc3/gadget.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 173f532..c0662c2 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2404,7 +2404,7 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep,
 		struct dwc3_request *req, const struct dwc3_event_depevt *event,
 		int status)
 {
-	struct dwc3_trb *trb = &dep->trb_pool[dep->trb_dequeue];
+	struct dwc3_trb *trb;
 	struct scatterlist *sg = req->sg;
 	struct scatterlist *s;
 	unsigned int pending = req->num_pending_sgs;
@@ -2419,7 +2419,15 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep,
 
 		req->sg = sg_next(s);
 		req->num_pending_sgs--;
+		if (!(trb->ctrl & DWC3_TRB_CTRL_IOC)) {
+			/* reclaim the TRB without calling
+			 * dwc3_gadget_ep_reclaim_completed_trb */
+			dwc3_ep_inc_deq(dep);
+			req->num_trbs--;
+			continue;
+		}
 
+		/* Only the last TRB in the sg list would reach here */
 		ret = dwc3_gadget_ep_reclaim_completed_trb(dep, req,
 				trb, event, status, true);
 		if (ret)
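The loop proposed in this patch can be sketched in plain userspace C. This is only an illustration under stated assumptions: the `sim_` types, helpers and bit values below are hypothetical stand-ins rather than the driver's definitions, `sim_reclaim_completed_trb()` is reduced to "advance dequeue, stop if the event reports IOC", and the `num_trbs` bookkeeping is left out.

```c
/* Stand-ins for the driver's bits; the values are for this sketch only. */
#define SIM_TRB_HWO	(1u << 0)
#define SIM_TRB_CHN	(1u << 2)
#define SIM_TRB_IOC	(1u << 11)
#define SIM_EVT_IOC	(1u << 2)

struct sim_trb {
	unsigned int ctrl;
};

struct sim_ep {
	struct sim_trb pool[8];
	unsigned int dequeue;
};

/* Hypothetical reduced model of dwc3_gadget_ep_reclaim_completed_trb(). */
static int sim_reclaim_completed_trb(struct sim_ep *ep,
				     unsigned int event_status)
{
	ep->dequeue++;
	return (event_status & SIM_EVT_IOC) ? 1 : 0;
}

/* Loop shaped like the patched dwc3_gadget_ep_reclaim_trb_sg(). */
static unsigned int sim_reclaim_sg_v2(struct sim_ep *ep, unsigned int pending,
				      unsigned int event_status)
{
	unsigned int i;

	for (i = 0; i < pending; i++) {
		struct sim_trb *trb = &ep->pool[ep->dequeue];

		if (trb->ctrl & SIM_TRB_HWO)
			break;

		/* Chained TRB without IOC: reclaim it right here. */
		if (!(trb->ctrl & SIM_TRB_IOC)) {
			ep->dequeue++;
			continue;
		}

		/* Only the TRB carrying IOC reaches the common helper. */
		if (sim_reclaim_completed_trb(ep, event_status))
			break;
	}
	return ep->dequeue;
}

/* Three chained TRBs followed by one IOC TRB, event status 0x6. */
static unsigned int sim_demo_v2(void)
{
	struct sim_ep ep = { .pool = {
		{ SIM_TRB_CHN }, { SIM_TRB_CHN }, { SIM_TRB_CHN },
		{ SIM_TRB_IOC },
	} };

	return sim_reclaim_sg_v2(&ep, 4, 0x6);
}
```

In this model `sim_demo_v2()` returns 4: the dequeue index has moved past all four TRBs of the request.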
Hi,
Let's look at the relevant code:
fei.yang@intel.com writes:
From: Fei Yang fei.yang@intel.com
If scatter-gather operation is allowed, a large USB request would be split into multiple TRBs. These TRBs are chained up by setting DWC3_TRB_CTRL_CHN bit except the last one which has DWC3_TRB_CTRL_IOC bit set instead. Since only the last TRB has IOC set, dwc3_gadget_ep_reclaim_completed_trb() would be called only once for the whole USB request, thus all the TRBs need to be reclaimed within this single call. However that is not what the current code does.
This patch addresses the issue by checking each TRB in function dwc3_gadget_ep_reclaim_trb_sg() and reclaiming the chained ones right there. Only the last TRB gets passed to dwc3_gadget_ep_reclaim_completed_trb(). This would guarantee all TRBs are reclaimed and trb_dequeue/num_trbs are updated properly.
Signed-off-by: Fei Yang <fei.yang@intel.com>
Cc: stable <stable@vger.kernel.org>
V2: Better solution is to reclaim chained TRBs in dwc3_gadget_ep_reclaim_trb_sg() and leave the last TRB to the dwc3_gadget_ep_reclaim_completed_trb().
 drivers/usb/dwc3/gadget.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 173f532..c0662c2 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2404,7 +2404,7 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep,
 		struct dwc3_request *req, const struct dwc3_event_depevt *event,
 		int status)
Here's the full function:
| static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep,
| 		struct dwc3_request *req, const struct dwc3_event_depevt *event,
| 		int status)
| {
| 	struct dwc3_trb *trb = &dep->trb_pool[dep->trb_dequeue];
| 	struct scatterlist *sg = req->sg;
| 	struct scatterlist *s;
| 	unsigned int pending = req->num_pending_sgs;
| 	unsigned int i;
| 	int ret = 0;
|
| 	for_each_sg(sg, s, pending, i) {
iterate over each scatterlist member for the current request...
| 		trb = &dep->trb_pool[dep->trb_dequeue];
|
| 		if (trb->ctrl & DWC3_TRB_CTRL_HWO)
| 			break;
|
| 		req->sg = sg_next(s);
| 		req->num_pending_sgs--;
|
| 		ret = dwc3_gadget_ep_reclaim_completed_trb(dep, req,
| 				trb, event, status, true);
... and reclaim its TRB.
Now, looking at dwc3_gadget_ep_reclaim_completed_trb() we have:
| static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
| 		struct dwc3_request *req, struct dwc3_trb *trb,
| 		const struct dwc3_event_depevt *event, int status, int chain)
| {
| 	unsigned int count;
|
| 	dwc3_ep_inc_deq(dep);
unconditionally increment the dequeue pointer. What are we missing here?
[...]
| 	return 0;
| }
Now, looking at what your patch does we will have:
{
- struct dwc3_trb *trb = &dep->trb_pool[dep->trb_dequeue];
+ struct dwc3_trb *trb;
small cleanup, should be part of its own patch.
 	struct scatterlist *sg = req->sg;
 	struct scatterlist *s;
 	unsigned int pending = req->num_pending_sgs;
@@ -2419,7 +2419,15 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep,
 		req->sg = sg_next(s);
 		req->num_pending_sgs--;
+		if (!(trb->ctrl & DWC3_TRB_CTRL_IOC)) {
+			/* reclaim the TRB without calling
+			 * dwc3_gadget_ep_reclaim_completed_trb */
why do you have to skip dwc3_gadget_ep_reclaim_completed_trb()? Also, your patch description claims that we're NOT incrementing the TRBs, which is wrong. I fail to see what problem you're trying to solve here, really.
Could it be that we're simply returning 1 when we should return 0 for the previous SG list members? If that's the case, then that's the bug that should be fixed. Still, you shouldn't avoid calling dwc3_gadget_ep_reclaim_completed_trb() and should, instead, fix the bug it contains.
Looking at the cases where dwc3_gadget_ep_reclaim_completed_trb() returns 1, I can't see how that would be the case either:
| 	if (chain && (trb->ctrl & DWC3_TRB_CTRL_HWO))
| 		trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
if the CHN bit is set and the HWO bit is set, clear HWO
| 	if (req->needs_extra_trb && !(trb->ctrl & DWC3_TRB_CTRL_CHN)) {
| 		trb->ctrl &= ~DWC3_TRB_CTRL_HWO;
| 		return 1;
| 	}
if *not* CHN and needs_extra_trb, return 1. This can only be true for the last TRB in the SG list.
| 	if ((trb->ctrl & DWC3_TRB_CTRL_HWO) && status != -ESHUTDOWN)
| 		return 1;
This can't be true because we cleared HWO up above
| 	if (event->status & DEPEVT_STATUS_SHORT && !chain)
| 		return 1;
can only be true for last TRB
| 	if (event->status & DEPEVT_STATUS_IOC)
| 		return 1;
If we have a short packet, then we may fall here. Is that the case?
Please share dwc3 tracepoints of the problem happening so I can verify what's going on.
Can only be true for last TRB
| 	if (event->status & DEPEVT_STATUS_IOC)
| 		return 1;
This is the problem. The whole USB request gets only one interrupt when the last TRB completes, so dwc3_gadget_ep_reclaim_trb_sg() gets called with event->status = 0x6 which has DEPEVT_STATUS_IOC bit set. Thus dwc3_gadget_ep_reclaim_completed_trb() returns 1 for the first TRB and the for-loop ends without having a chance to iterate through the sg list.
If we have a short packet, then we may fall here. Is that the case?
No need for a short packet to make it fail. In my case below, a 16384-byte request got split into 4 TRBs of 4096 bytes. All TRBs were completed normally, but the for-loop in dwc3_gadget_ep_reclaim_trb_sg() was terminated right after handling the first TRB. After that the trb_dequeue is messed up.
buffer_addr,size,type,ioc,isp_imi,csp,chn,lst,hwo
0000000077849000, 4096,normal,0,0,1,1,0,0
000000007784a000, 4096,normal,0,0,1,1,0,0
000000007784b000, 4096,normal,0,0,1,1,0,0
000000007784c000, 4096,normal,1,0,1,0,0,0
000000007784d000, 512,normal,1,0,1,0,0,0
My first version of the patch was trying to address the issue in dwc3_gadget_ep_reclaim_completed_trb(), but then I thought it's a bad idea to touch this function because that is also called from non scatter_gather list case, and I was not sure if returning 1 for the linear case is correct or not.
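The early exit described above can be reproduced with a small userspace sketch. It is an illustration under stated assumptions: the `sim_` names and bit values are hypothetical stand-ins for the driver's definitions, and `sim_reclaim_completed_trb()` models only the return conditions quoted in this thread (the `needs_extra_trb` and `-ESHUTDOWN` checks are omitted).

```c
/* Stand-ins for the driver's bits; the values are for this sketch only. */
#define SIM_TRB_HWO	(1u << 0)
#define SIM_TRB_CHN	(1u << 2)
#define SIM_TRB_IOC	(1u << 11)
#define SIM_EVT_SHORT	(1u << 1)
#define SIM_EVT_IOC	(1u << 2)

struct sim_trb {
	unsigned int ctrl;
};

struct sim_ep {
	struct sim_trb pool[8];
	unsigned int dequeue;
};

/* Reduced model of the return conditions quoted in this thread. */
static int sim_reclaim_completed_trb(struct sim_ep *ep, struct sim_trb *trb,
				     unsigned int event_status, int chain)
{
	ep->dequeue++;	/* unconditional, like dwc3_ep_inc_deq() */

	if (chain && (trb->ctrl & SIM_TRB_HWO))
		trb->ctrl &= ~SIM_TRB_HWO;

	if (trb->ctrl & SIM_TRB_HWO)
		return 1;

	if ((event_status & SIM_EVT_SHORT) && !chain)
		return 1;

	/* Depends on the event status, not on this TRB's own IOC bit. */
	if (event_status & SIM_EVT_IOC)
		return 1;

	return 0;
}

/* Loop shaped like dwc3_gadget_ep_reclaim_trb_sg() without the patch. */
static unsigned int sim_reclaim_sg(struct sim_ep *ep, unsigned int pending,
				   unsigned int event_status)
{
	unsigned int i;

	for (i = 0; i < pending; i++) {
		struct sim_trb *trb = &ep->pool[ep->dequeue];

		if (trb->ctrl & SIM_TRB_HWO)
			break;

		if (sim_reclaim_completed_trb(ep, trb, event_status, 1))
			break;
	}
	return ep->dequeue;
}

/* Three chained TRBs followed by one IOC TRB, event status 0x6. */
static unsigned int sim_demo_unpatched(void)
{
	struct sim_ep ep = { .pool = {
		{ SIM_TRB_CHN }, { SIM_TRB_CHN }, { SIM_TRB_CHN },
		{ SIM_TRB_IOC },
	} };

	return sim_reclaim_sg(&ep, 4, 0x6);
}
```

In this model `sim_demo_unpatched()` returns 1: the helper reports completion on the first chained TRB because the event status has the IOC bit, so the dequeue index advances by one TRB instead of four.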
-Fei
Can only be true for last TRB
| 	if (event->status & DEPEVT_STATUS_IOC)
| 		return 1;
This is the problem. The whole USB request gets only one interrupt when the last TRB completes, so dwc3_gadget_ep_reclaim_trb_sg() gets called with event->status = 0x6 which has DEPEVT_STATUS_IOC bit set. Thus dwc3_gadget_ep_reclaim_completed_trb() returns 1 for the first TRB and the for-loop ends without having a chance to iterate through the sg list.
If we have a short packet, then we may fall here. Is that the case?
No need for a short packet to make it fail. In my case below, a 16384-byte request got split into 4 TRBs of 4096 bytes. All TRBs were completed normally, but the for-loop in dwc3_gadget_ep_reclaim_trb_sg() was terminated right after handling the first TRB. After that the trb_dequeue is messed up.
buffer_addr,size,type,ioc,isp_imi,csp,chn,lst,hwo
0000000077849000, 4096,normal,0,0,1,1,0,0
000000007784a000, 4096,normal,0,0,1,1,0,0
000000007784b000, 4096,normal,0,0,1,1,0,0
000000007784c000, 4096,normal,1,0,1,0,0,0
000000007784d000, 512,normal,1,0,1,0,0,0
My first version of the patch was trying to address the issue in dwc3_gadget_ep_reclaim_completed_trb(), but then I thought it's a bad idea to touch this function because that is also called from non scatter_gather list case, and I was not sure if returning 1 for the linear case is correct or not.
I just sent v3 of the patch. Let me know your thoughts.
-Fei
"Yang, Fei" <fei.yang@intel.com> writes:
Hi,
Can only be true for last TRB
| 	if (event->status & DEPEVT_STATUS_IOC)
| 		return 1;
This is the problem. The whole USB request gets only one interrupt when the last TRB completes, so dwc3_gadget_ep_reclaim_trb_sg() gets called with event->status = 0x6 which has DEPEVT_STATUS_IOC bit set. Thus dwc3_gadget_ep_reclaim_completed_trb() returns 1 for the first TRB and the for-loop ends without having a chance to iterate through the sg list.
IOC is only set for the last TRB, so this will iterate over and over again until it reaches the last TRB. Please collect tracepoints of the failure case.
If we have a short packet, then we may fall here. Is that the case?
No need for a short packet to make it fail. In my case below, a 16384-byte request got split into 4 TRBs of 4096 bytes. All TRBs were completed normally, but the for-loop in dwc3_gadget_ep_reclaim_trb_sg() was terminated right after handling the first TRB. After that the trb_dequeue is messed up.
I need tracepoints to see what's going on, please collect tracepoints.
buffer_addr,size,type,ioc,isp_imi,csp,chn,lst,hwo
0000000077849000, 4096,normal,0,0,1,1,0,0
000000007784a000, 4096,normal,0,0,1,1,0,0
000000007784b000, 4096,normal,0,0,1,1,0,0
000000007784c000, 4096,normal,1,0,1,0,0,0
000000007784d000, 512,normal,1,0,1,0,0,0
My first version of the patch was trying to address the issue in dwc3_gadget_ep_reclaim_completed_trb(), but then I thought it's a bad idea to touch this function because that is also called from non scatter_gather list case, and I was not sure if returning 1 for the linear case is correct or not.
That function *must* be called for all cases. We want to reduce the number of special cases so the code is more straightforward and easier to maintain. Again, please collect tracepoints of the failure case with the latest tag from Linus, otherwise you won't be able to convince me we need your patch.
I also think your version is the wrong way to sort it out.