Hi,
John Stultz john.stultz@linaro.org writes:
On Thu, Jan 23, 2020 at 9:46 AM Felipe Balbi balbi@kernel.org wrote:
> > Since ~4.20, when the functionfs gadget enabled scatter-gather > support, we have seen problems with adb connections stalling and > stopping to function on hardware with dwc3 usb controllers. > Specifically, HiKey960, Dragonboard 845c, and Pixel3 devices.
...
I'm pretty sure this should be solved at the DMA API level, just want to confirm.
I have sent you the tracepoints long time ago. Also my analysis of the problem (BTW, I don't think the tracepoints helped much). It's basically a logic problem in function dwc3_gadget_ep_reclaim_trb_sg().
AFAICT, this is caused by DMA API merging pages together when map an sglist for DMA. While doing that, it does *not* move the SG_END flag which sg_is_last() checks.
I consider that an overlook on the DMA API, wouldn't you? Why should DMA API users care if pages were merged or not while mapping the sglist? We have for_each_sg() and sg_is_last() for a reason.
From an initial look, I agree this is pretty confusing. dma_map_sg() can coalesce entries in the sg list, modifying the sg entires themselves, however, in doing so it doesn't modify the number of entries in the sglist (nor the end state bit). That's pretty subtle!
My initial naive attempt to fix the dma-iommu path to set the end bit at the tail of __finalize_sg() which does the sg entry modifications, only caused trouble elsewhere, as there's plenty of logic that expects the number of entries to not change, so having sg_next() return NULL before that point results in lots of null pointer traversals.
Further, looking at the history, while apparently quirky, this has been the documented behavior in DMA-API.txt for over almost 14 years (at least). It clearly states that that dma_map_api can return fewer mapped entries then sg entries, and one should loop only that many times (for_each_sg() having a max number of entries to iterate over seems specifically for this purpose). Additionally, it says one must preserve the original number of entries (not # mapped entries) for dma_unmap_sg().
So I'm not sure that sg_is_last() is really valid for iterating on mapped sg lists.
Should it be? Probably (at least with my unfamiliar eyes), but sg_is_last() has been around for almost as long coexisting with this behavioral quirk, so I'm also not sure this is the best hill for the dwc3 driver to die on. :)
The fix here: https://lore.kernel.org/lkml/20200122222645.38805-3-john.stultz@linaro.org/ Or maybe the slightly cleaner varient here: https://git.linaro.org/people/john.stultz/android-dev.git/commit/?h=dev/db84...
in that case, we don't need to use sg_is_last() at all, since i will always encode the last entry in the list.
seems like it would correctly address things following the documentation and avoid the failures we're seeing.
As to if dma_map_sg() should fixup the state bits or properly shrink the sg list when it coalesces entries, that seems like it would be a much more intrusive change across quite a bit of the kernel that was written to follow the documented method. So my confidence that such a change would make it upstream in a reasonable amount of time isn't very high, and it seems like a bad idea to block the driver from working properly in the meantime.
Pulling in Christoph and Jens as I suspect they have more context on this, and maybe can explain thats its not so quirky with the right perspective?
Thoughts? Maybe there is an easier way to make it less quirky then what I imagine?
it just seems very counter-intuitive to me that DMA api can coalesce entries but they're actually still there and drivers have to cope with this behavior.