Am 21.06.24 um 00:02 schrieb Xu Yilun:
On Thu, Jan 16, 2025 at 04:13:13PM +0100, Christian König wrote:
Am 15.01.25 um 18:09 schrieb Jason Gunthorpe:
On Wed, Jan 15, 2025 at 05:34:23PM +0100, Christian König wrote:
Granted, let me try to improve this. Here is a real world example of one of the issues we ran into and why CPU mappings of importers are redirected to the exporter. We have a good bunch of different exporters who track the CPU mappings of their backing store using address_space objects in one way or another and then uses unmap_mapping_range() to invalidate those CPU mappings. But when importers get the PFNs of the backing store they can look behind the curtain and directly insert this PFN into the CPU page tables. We had literally tons of cases like this where drivers developers cause access after free issues because the importer created a CPU mappings on their own without the exporter knowing about it. This is just one example of what we ran into. Additional to that basically the whole synchronization between drivers was overhauled as well because we found that we can't trust importers to always do the right thing.
But this, fundamentally, is importers creating attachments and then *ignoring the lifetime rules of DMABUF*. If you created an attachment, got a move and *ignored the move* because you put the PFN in your own VMA, then you are not following the attachment lifetime rules!
Move notify is solely for informing the importer that they need to re-fresh their DMA mappings and eventually block for ongoing DMA to end. This semantics doesn't work well for CPU mappings because you need to hold the reservation lock to make sure that the information stay valid and you can't hold a lock while returning from a page fault.
Dealing with CPU mapping and resource invalidation is a little hard, but is resolvable, by using other types of locks. And I guess for now dma-buf exporters should always handle this CPU mapping VS. invalidation contention if they support mmap().
It is resolvable so with some invalidation notify, a decent importers could also handle the contention well.
That doesn't work like this.
See page tables updates under DMA-buf works by using the same locking approach for both the validation and invalidation side. In other words we hold the same lock while inserting and removing entries into/from the page tables.
That this here should be an unlocked API means that can only use it with pre-allocated and hard pinned memory without any chance to invalidate it while running. Otherwise you can never be sure of the validity of the address information you got from the exporter.
IIUC now the only concern is importer device drivers are easier to do something wrong, so move CPU mapping things to exporter. But most of the exporters are also device drivers, why they are smarter?
Exporters always use their invalidation code path no matter if they are exporting their buffers for other to use or if they are stand alone.
If you do the invalidation on the importer side you always need both exporter and importer around to test it.
Additional to that we have much more importers than exporters. E.g. a lot of simple drivers only import DMA-heap buffers and never exports anything.
And there are increasing mapping needs, today exporters help handle CPU primary mapping, tomorrow should they also help on all other mappings? Clearly it is not feasible. So maybe conditionally give trust to some importers.
Why should that be necessary? Exporters *must* know what somebody does with their buffers.
If you have an use case the exporter doesn't support in their mapping operation then that use case most likely doesn't work in the first place.
For example direct I/O is enabled/disabled by exporters on their CPU mappings based on if that works correctly for them. And importer simply doesn't know if they should use vm_insert_pfn() or vm_insert_page().
We could of course implement that logic into each importer to chose between the different approaches, but than each importer gains logic it only exercises with a specific exporter. And that doesn't seem to be a good idea at all.
Regards, Christian.
Thanks, Yilun
linaro-mm-sig@lists.linaro.org