On Thu, Mar 24, 2022 at 10:07 AM Toke Høiland-Jørgensen toke@toke.dk wrote:
Right, but is that sync_for_device call really needed?
Well, imagine that you have a non-cache-coherent DMA (not bounce buffers - just bad hardware)...
So the driver first does that dma_sync_single_for_cpu() for the CPU see the current state (for the non-cache-coherent case it would just invalidate caches).
The driver then examines the command buffer state, sees that it's still in progress, and does that return -EINPROGRESS.
It's actually very natural in that situation to flush the caches from the CPU side again. And so dma_sync_single_for_device() is a fairly reasonable thing to do in that situation.
But it doesn't seem *required*, no. The CPU caches only have a copy of the data in them, no writeback needed (and writeback wouldn't work since DMA from the device may be in progress).
So I don't think the dma_sync_single_for_device() is *wrong* per se, because the CPU didn't actually do any modifications.
But yes, I think it's unnecessary - because any later CPU accesses would need that dma_sync_single_for_cpu() anyway, which should invalidate any stale caches.
And it clearly doesn't work in a bounce-buffer situation, but honestly I don't think a "CPU modified buffers concurrently with DMA" can *ever* work in that situation, so I think it's wrong for a bounce buffer model to ever do anything in the dma_sync_single_for_device() situation.
Does removing that dma_sync_single_for_device() actually fix the problem for the ath driver?
There's a fair number of those dma_sync_single_for_device() things all over. Could we find mis-uses and warn about them some way? It seems to be a very natural thing to do in this context, but bounce buffering does make them very fragile.
Linus