On Fri, Mar 25, 2022 at 1:38 PM Johannes Berg <johannes@sipsolutions.net> wrote:
(2) The CPU now wants to see any state written by the device since the last sync
This is "dma_sync_single_for_cpu(DMA_FROM_DEVICE)". A bounce-buffer implementation needs to copy *from* the bounce buffer. A cache-coherent implementation needs to do nothing. A non-coherent implementation maybe needs to do nothing (i.e. it assumes that previous ops have flushed the cache, and just accessing the data will bring the right thing back into it). Or it could just flush the cache.
Doesn't that just need to *invalidate* the cache, rather than *flush* it?
Yes. I should have been more careful.
That said, I think "invalidate without writeback" is a really dangerous operation (it can generate some *really* hard to debug memory state), so on the whole I think you should always strive to just do "flush-and-invalidate".
If the core has support for "invalidate clean cache lines only", then that's possibly a good alternative.
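To make the difference between the three operations concrete, here is a minimal userspace sketch. It is not kernel code: it models a single cache line shadowing a single line of DRAM, and every toy_* name is made up for illustration. It assumes a write-back cache and a device that writes DRAM directly without snooping.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define LINE_SIZE 8

/* Toy model (hypothetical): one cache line shadowing one line of DRAM. */
struct toy_line {
	unsigned char dram[LINE_SIZE];  /* what the device sees */
	unsigned char cache[LINE_SIZE]; /* the CPU's cached copy */
	bool valid;
	bool dirty;
};

/* CPU store: allocate the line on a miss, then mark it dirty. */
static void toy_cpu_write(struct toy_line *l, size_t off, unsigned char v)
{
	assert(off < LINE_SIZE);
	if (!l->valid) {
		memcpy(l->cache, l->dram, LINE_SIZE);
		l->valid = true;
	}
	l->cache[off] = v;
	l->dirty = true;
}

/* CPU load: fill from DRAM on a miss, otherwise return the cached byte. */
static unsigned char toy_cpu_read(struct toy_line *l, size_t off)
{
	assert(off < LINE_SIZE);
	if (!l->valid) {
		memcpy(l->cache, l->dram, LINE_SIZE);
		l->valid = true;
		l->dirty = false;
	}
	return l->cache[off];
}

/* Device DMA write: goes straight to DRAM and bypasses the cache. */
static void toy_dev_write(struct toy_line *l, size_t off, unsigned char v)
{
	assert(off < LINE_SIZE);
	l->dram[off] = v;
}

/* Pure invalidate: drop the line; any dirty data is silently lost. */
static void toy_invalidate(struct toy_line *l)
{
	l->valid = false;
	l->dirty = false;
}

/* Flush-and-invalidate: write back only if dirty, then drop the line. */
static void toy_flush_invalidate(struct toy_line *l)
{
	if (l->valid && l->dirty)
		memcpy(l->dram, l->cache, LINE_SIZE);
	l->valid = false;
	l->dirty = false;
}

/* Invalidate-clean-only: drop clean lines, leave dirty lines untouched. */
static void toy_invalidate_clean(struct toy_line *l)
{
	if (l->valid && !l->dirty)
		l->valid = false;
}
```

In this model a CPU store followed by toy_invalidate() vanishes without a trace, which is the hard-to-debug state described above. With toy_flush_invalidate() a clean line causes no DRAM write at all, so data the device has written survives, while a dirty line is written back rather than lost.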
A non-coherent implementation needs to flush the cache again, but not necessarily do a writeback-flush if there is some cheaper form (assuming it does nothing in the "CPU now wants to see any state" case because it depends on the data not having been in the caches).
And similarly here, it would seem that the implementation can't _flush_ the cache as the device might be writing concurrently (which it does in fact do in the ath9k case), but it must invalidate the cache?
Right, again, when I said "flush" I really should have said "invalidate".
I'm not sure about the (2) case, but here it seems fairly clear cut that if you have a cache and don't expect the CPU to write to the buffer (as evidenced by DMA_FROM_DEVICE), you wouldn't want to write out the cache to DRAM?
See above: I'd *really* want to avoid a pure "invalidate cacheline" model. The amount of debug issues that can cause is not worth it.
So please flush-and-invalidate, or invalidate-non-dirty, but not just "invalidate".
Then, however, we need to define what happens if you pass DMA_BIDIRECTIONAL to the sync_for_cpu() and sync_for_device() functions, which adds two more cases? Or maybe we eventually just think that's not valid at all, since you have to specify how you're (currently?) using the buffer, which can't be DMA_BIDIRECTIONAL?
Ugh. Do we actually have cases that do it? That sounds really odd for a "sync" operation. It sounds very reasonable for _allocating_ DMA, but for syncing I'm left scratching my head what the semantics would be.
But yes, if we do and people come up with semantics for it, those semantics should be clearly documented.
And if we don't - or people can't come up with semantics for it - we should actively warn about it and not have some code that does odd things that we don't know what they mean.
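As a rough illustration of "warn rather than do odd things", here is a userspace sketch, not kernel code. The toy_* names, the return convention and the warning text are all assumptions made for the example. The enum only mirrors the kernel's direction names locally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Local stand-in for the kernel's DMA direction names (sketch only). */
enum toy_dma_dir {
	TOY_BIDIRECTIONAL,
	TOY_TO_DEVICE,
	TOY_FROM_DEVICE,
	TOY_NONE,
};

/* Counts how often a sync was attempted with an undefined direction. */
static int toy_sync_warnings;

/* Return 0 if the direction has defined sync semantics, else warn and return -1. */
static int toy_check_sync_dir(enum toy_dma_dir dir)
{
	switch (dir) {
	case TOY_TO_DEVICE:
	case TOY_FROM_DEVICE:
		return 0;
	default:
		toy_sync_warnings++;
		fprintf(stderr,
			"toy: sync with direction %d has no defined semantics\n",
			(int)dir);
		return -1;
	}
}

/* Hypothetical sync entry point: refuse directions nobody has defined. */
static int toy_sync_for_cpu(const void *buf, enum toy_dma_dir dir)
{
	assert(buf != NULL);
	if (toy_check_sync_dir(dir) != 0)
		return -1;
	/* A real implementation would do its copy or cache maintenance here. */
	return 0;
}
```

The point of the sketch is only that the undefined case is loud and does nothing, instead of quietly picking some behaviour.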
But it sounds like you agree with my analysis, just not with some of my bad/incorrect word choices.
Linus