On Wed, Feb 05, 2020 at 01:03:51PM -0800, John Stultz wrote:
On Thu, Jan 23, 2020 at 9:46 AM Felipe Balbi balbi@kernel.org wrote:
I'm pretty sure this should be solved at the DMA API level, just want to confirm.
I have sent you the tracepoints long time ago. Also my analysis of the problem (BTW, I don't think the tracepoints helped much). It's basically a logic problem in function dwc3_gadget_ep_reclaim_trb_sg().
AFAICT, this is caused by DMA API merging pages together when map an sglist for DMA. While doing that, it does *not* move the SG_END flag which sg_is_last() checks.
I consider that an overlook on the DMA API, wouldn't you? Why should DMA API users care if pages were merged or not while mapping the sglist? We have for_each_sg() and sg_is_last() for a reason.
From an initial look, I agree this is pretty confusing. dma_map_sg()
can coalesce entries in the sg list, modifying the sg entires themselves, however, in doing so it doesn't modify the number of entries in the sglist (nor the end state bit). That's pretty subtle!
dma_map_sg only coalesces the dma address. The page, offset and len members are immutable.
The problem is really the design of the scatterlist structure - it combines immutable input parameters (page, offset, len) and output parameters (dma_addr, dma_len) in one data structure, and then needs different accessors depending on which information you care about. The end marker only works for the "CPU" view.
The right fix is top stop using struct scatterlist, but that is going to be larger and painful change. At least for block layer stuff I plan to incrementally do that, though.
So I'm not sure that sg_is_last() is really valid for iterating on mapped sg lists.
Should it be? Probably (at least with my unfamiliar eyes), but sg_is_last() has been around for almost as long coexisting with this behavioral quirk, so I'm also not sure this is the best hill for the dwc3 driver to die on. :)
No, it shoudn't. dma_map_sg returns the number of mapped segments, and the callers need to remember that.
The fix here: https://lore.kernel.org/lkml/20200122222645.38805-3-john.stultz@linaro.org/ Or maybe the slightly cleaner varient here: https://git.linaro.org/people/john.stultz/android-dev.git/commit/?h=dev/db84... seems like it would correctly address things following the documentation and avoid the failures we're seeing.
The first version is the corret one. sg_is_last has no meaning for the "DMA" view of the scatterlist.