Hi,
John Stultz john.stultz@linaro.org writes:
On Wed, Jan 22, 2020 at 11:23 PM Felipe Balbi balbi@kernel.org wrote:
From: Anurag Kumar Vulisha anurag.kumar.vulisha@xilinx.com
The present code in dwc3_gadget_ep_reclaim_completed_trb() will check for IOC/LST bit in the event->status and returns if IOC/LST bit is set. This logic doesn't work if multiple TRBs are queued per request and the IOC/LST bit is set on the last TRB of that request. Consider an example where a queued request has multiple queued TRBs and IOC/LST bit is set only for the last TRB. In this case, the Core generates XferComplete/XferInProgress events only for the last TRB (since IOC/LST are set only for the last TRB). As per the logic in dwc3_gadget_ep_reclaim_completed_trb() event->status is checked for IOC/LST bit and returns on the first TRB. This makes the remaining TRBs left unhandled. To aviod this, changed the code to check for IOC/LST bits in both
avoid
event->status & TRB->ctrl. This patch does the same.
We don't need to check both. It's very likely that checking the TRB is enough.
Sorry, just to clarify, are you suggesting instead of:
if (event->status & DEPEVT_STATUS_IOC)
if ((event->status & DEPEVT_STATUS_IOC) &&
(trb->ctrl & DWC3_TRB_CTRL_IOC))
We do something like:
if (event->status & DEPEVT_STATUS_IOC)
if (trb->ctrl & DWC3_TRB_CTRL_IOC)
return 1;
if (trb->ctrl & DWC3_TRB_CTRL_LST) return 1;
?
that's correct. In hindsight, I have no idea why I used the event->status here since all other checks are done against the TRB only.
At a practical level, this patch resolves USB transfer stalls seen with adb on dwc3 based HiKey960 after functionfs gadget added scatter-gather support around v4.20.
Right, I remember asking for tracepoint data showing this problem happening. It's the best way to figure out what's really going on.
Before we accept these two patches, could you collect dwc3 tracepoint data and share here?
Sure. Attached is trace logs and regdumps for hikey960.
Thanks
The one gotcha with the logs is that in the working case (with this patch applied), I booted with the usb-c cable disconnected (as suggested in the dwc3.rst doc), enabled tracing and plugged in the device, then ran adb logcat a few times to validate no stalls.
In the failure case (without this patch), I booted with the usb-c cable disconnected, enabled tracing and then when I plugged in the device, it never was detected by adb (it seems perhaps the problem had already struck?).
You never got a Reset Interrupt, so something else is going on. I suggest putting a sniffer and first making sure the host *does* drive reset signalling. Second step would be to look at your phy configuration. Is it going in suspend for any reason? Might want to try our snps,dis_u3_susphy_quirk and snps,dis_u2_susphy_quirk flags.
So I generated the failure2 log by booting with USB-C plugged in, enabling tracing, and running adb logcat on the host to observe the stall.
Thank you. Here's a quick summary of what's in failure2:
There is a series of 24-byte transfers on ep1out and that's the one which shows a problem. We can clearly see that adb is issuing one transfer at a time, only enqueueing transfer n+1 when transfer n is completed and given back, so we see a series of similar blocks:
- dwc3_alloc_request - dwc3_ep_queue - dwc3_prepare_trb - dwc3_prepare_trb (for the chained bit) - dwc3_gadget_ep_cmd (update transfer) - dwc3_event (transfer in progress) - dwc3_complete_trb - dwc3_complete_trb (for the chained bit) - dwc3_gadget_giveback - dwc3_free_request
So this works for several iterations. Note, however, that the TRB addresses don't really make sense. DWC3 allocates a contiguous block of memory to server as TRB pool, but we see non-consecutive addresses on these TRBs. I'm assuming there's an IOMMU in your system.
Anyway, the failing point is here:
adbd-461 [002] d..1 49.855992: dwc3_alloc_request: ep1out: req 000000004e6eaaba length 0/0 zsI ==> 0 adbd-461 [002] d..2 49.855994: dwc3_ep_queue: ep1out: req 000000004e6eaaba length 0/24 zsI ==> -115 adbd-461 [002] d..2 49.855996: dwc3_prepare_trb: ep1out: trb 00000000bae39b48 buf 000000009eb0b100 size 24 ctrl 0000001d (HlCS:sc:normal) adbd-461 [002] d..2 49.855997: dwc3_prepare_trb: ep1out: trb 000000009093a074 buf 0000000217da8000 size 488 ctrl 00000819 (HlcS:sC:normal) adbd-461 [002] d..2 49.856003: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful
irq/65-dwc3-498 [000] d..1 53.902752: dwc3_event: event (00006084): ep1out: Transfer In Progress [0] (SIm) irq/65-dwc3-498 [000] d..1 53.902763: dwc3_complete_trb: ep1out: trb 00000000bae39b48 buf 000000009eb0b100 size 0 ctrl 0000001c (hlCS:sc:normal) irq/65-dwc3-498 [000] d..1 53.902769: dwc3_complete_trb: ep1out: trb 000000009093a074 buf 0000000217da8000 size 488 ctrl 00000819 (HlcS:sC:normal) irq/65-dwc3-498 [000] d..1 53.902781: dwc3_gadget_giveback: ep1out: req 000000004e6eaaba length 24/24 zsI ==> 0 kworker/u16:0-7 [000] .... 53.903020: dwc3_free_request: ep1out: req 000000004e6eaaba length 24/24 zsI ==> 0 adbd-461 [002] d..1 53.903273: dwc3_alloc_request: ep1out: req 00000000c769beab length 0/0 zsI ==> 0 adbd-461 [002] d..2 53.903285: dwc3_ep_queue: ep1out: req 00000000c769beab length 0/24 zsI ==> -115 adbd-461 [002] d..2 53.903292: dwc3_prepare_trb: ep1out: trb 00000000f0ffa827 buf 000000009eb11e80 size 24 ctrl 0000001d (HlCS:sc:normal) adbd-461 [002] d..2 53.903296: dwc3_prepare_trb: ep1out: trb 00000000d6a9892a buf 0000000217da8000 size 488 ctrl 00000819 (HlcS:sC:normal) adbd-461 [002] d..2 53.903315: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [20007] params 00000000 00000000 00000000 --> status: Successful
Note that this transfer, after started, took 4 seconds to complete, while all others completed within a few ms. There's no real reason for this visible from dwc3 driver itself. What follows, is a transfer that never completed.
The only thing I can come up with, is that we starve the TRB ring, by continuously reclaiming a single TRB. We have 255 usable TRBs, so after a few iterations, we would see a stall due to starved TRB ring.
There is a way to verify this by tracking trb_enqueue and trb_dequeue, if you're willing to do that, that'll help us prove that this is really the problem and, since current tracepoints doen't really show that information, it may be a good idea to add this information to dwc3_log_trb tracepoint class. Something like below should be enough, could you re-run the test of failure2 with this patch applied?
drivers/usb/dwc3/trace.h | 9 +++++++--
modified drivers/usb/dwc3/trace.h @@ -227,6 +227,8 @@ DECLARE_EVENT_CLASS(dwc3_log_trb, __field(u32, size) __field(u32, ctrl) __field(u32, type) + __field(u32, enqueue) + __field(u32, dequeue) ), TP_fast_assign( __assign_str(name, dep->name); @@ -236,9 +238,12 @@ DECLARE_EVENT_CLASS(dwc3_log_trb, __entry->size = trb->size; __entry->ctrl = trb->ctrl; __entry->type = usb_endpoint_type(dep->endpoint.desc); + __entry->enqueue = dep->trb_enqueue + __entry->dequeue = dep->trb_dequeue ), - TP_printk("%s: trb %p buf %08x%08x size %s%d ctrl %08x (%c%c%c%c:%c%c:%s)", - __get_str(name), __entry->trb, __entry->bph, __entry->bpl, + TP_printk("%s: trb %p (E%d:D%d) buf %08x%08x size %s%d ctrl %08x (%c%c%c%c:%c%c:%s)", + __get_str(name), __entry->trb, __entry->enqueue, + __entry->dequeue, __entry->bph, __entry->bpl, ({char *s; int pcm = ((__entry->size >> 24) & 3) + 1; switch (__entry->type) {
Anyway, all three sets of logs are included. Let me know if you need me to try anything else.
Thanks for doing this