Since ~4.20, when the functionfs gadget enabled scatter-gather support, we have seen problems with adb connections stalling and stopping to function on hardware with dwc3 usb controllers. Specifically, HiKey960, Dragonboard 845c, and Pixel3 devices.
Any chance this:
https://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git/commi t/ ?h=testing/next&id=f63333e8e4fd63d8d8ae83b89d2c38cf21d64801
This is a different issue. I have tried initializing num_sgs when debugging this adb stall problem, but it didn't help.
So multiple folks have run through this problem, but not *one* has tracepoints collected from the issue? C'mon guys. Can someone, please, collect tracepoints so we can figure out what's actually going on?
I'm pretty sure this should be solved at the DMA API level, just want to confirm.
I have sent you the tracepoints long time ago. Also my analysis of the problem (BTW, I don't think the tracepoints helped much). It's basically a logic problem in function dwc3_gadget_ep_reclaim_trb_sg().
AFAICT, this is caused by DMA API merging pages together when map an sglist for DMA. While doing that, it does *not* move the SG_END flag which sg_is_last() checks.
I consider that an overlook on the DMA API, wouldn't you? Why should DMA API users care if pages were merged or not while mapping the sglist? We have for_each_sg() and sg_is_last() for a reason.
Oops, my bad. Actually, I was talking about the other patch, not the one setting num_sgs = 0; I don't know if this patch is really needed, but from what I remember the DMA API is setting up the num_sgs properly. I agree even if there is a problem initializing num_sgs, it should be fixed in DMA API.
I can try dig into my old emails and resend, but that is a bit hard to find.
Don't bother, I'm still not convinced we should fix at the driver level when sg_is_last() should be working here, unless we should iterate over num_sgs instead of num_mapped_sgs, though I don't think that's the case since in that case we would have to chain buffers of size zero.
-- balbi