On Fri, Feb 27, 2026 at 07:42:08PM +0000, Matt Evans wrote:
Hi Jason + Christian,
On 27/02/2026 12:51, Jason Gunthorpe wrote:
On Fri, Feb 27, 2026 at 11:09:31AM +0100, Christian König wrote:
When a DMA-buf just represents a linear piece of BAR which is map-able through the VFIO FD anyway then the right approach is to just re-direct the mapping to this VFIO FD.
We think limiting this to one range per DMABUF isn't enough; supporting multiple ranges will be a benefit.
Bumping vm_pgoff to then reuse vfio_pci_mmap_ops is a really nice suggestion for the simplest case, but it can't support multiple ranges; the .fault() handler needs to be aware of the non-linear DMABUF layout.
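To illustrate the kind of lookup such a .fault() would need, here is a rough userspace mock (the struct and function names are invented for illustration, not actual VFIO code): resolving a faulting page offset against a table of discontiguous BAR chunks, which a simple vm_pgoff bump cannot express.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one contiguous BAR chunk in the DMABUF. */
struct dmabuf_range {
	uint64_t dma_pgoff;   /* page offset within the DMABUF */
	uint64_t nr_pages;    /* length of this range in pages */
	uint64_t bar_pfn;     /* first PFN backing this range */
};

/*
 * Resolve a faulting page offset to a PFN by walking the range
 * table -- the non-linear layout means each range has its own
 * independent PFN base.  Returns 0 on success, -1 if the offset
 * falls outside every range.
 */
static int dmabuf_pgoff_to_pfn(const struct dmabuf_range *ranges,
			       size_t nr_ranges, uint64_t pgoff,
			       uint64_t *pfn)
{
	for (size_t i = 0; i < nr_ranges; i++) {
		const struct dmabuf_range *r = &ranges[i];

		if (pgoff >= r->dma_pgoff &&
		    pgoff < r->dma_pgoff + r->nr_pages) {
			*pfn = r->bar_pfn + (pgoff - r->dma_pgoff);
			return 0;
		}
	}
	return -1;
}
```

In the real driver this walk would of course sit under the appropriate lock and feed vmf_insert_pfn(); the sketch only shows the offset arithmetic.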
Sigh, yes, that's right, we have the non-linear thing, and if you need that to work it can't use the existing code.
I actually would like to go the other way and have VFIO always have a DMABUF under the VMAs it mmaps, because that will make it easy to finish the type1 emulation, which requires finding dmabufs for the VMAs.
This is still a better idea since it avoids duplicating the VMA flow into two parts..
Putting aside for a moment the point above about needing a new .fault() able to find a PFN for more than one range, how would the test of the revoked flag work w.r.t. synchronisation and protecting against a racing revoke? It's not safe to take memory_lock, test revoked, unlock, then hand over to the existing vfio_pci_mmap_*fault() -- which re-takes the lock. I'm not quite seeing how we could reuse the existing vfio_pci_mmap_*fault(), TBH. I did briefly consider refactoring that existing .fault() code, but that makes both paths uglier.
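To make the race concrete: the revoked test and the PFN installation have to sit in one critical section, otherwise a revoke can slip into the lock/unlock window. A userspace mock of the shape I mean (a pthread mutex standing in for memory_lock; every name here is invented):

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Mock of the per-dmabuf state the fault handler would consult. */
struct mock_dmabuf {
	pthread_mutex_t lock;   /* stands in for memory_lock */
	bool revoked;
	uint64_t base_pfn;
};

/*
 * Fault path sketch: the revoked check and the PFN "install" share
 * one critical section.  Testing revoked, dropping the lock, then
 * handing over to a reused vfio_pci_mmap_*fault() that re-takes it
 * would leave a window for a concurrent revoke.  Returns 0 and
 * fills *pfn on success, -1 (i.e. SIGBUS in real code) if revoked.
 */
static int mock_fault(struct mock_dmabuf *buf, uint64_t pgoff,
		      uint64_t *pfn)
{
	int ret = -1;

	pthread_mutex_lock(&buf->lock);
	if (!buf->revoked) {
		*pfn = buf->base_pfn + pgoff;  /* vmf_insert_pfn() here */
		ret = 0;
	}
	pthread_mutex_unlock(&buf->lock);
	return ret;
}
```

The revoke side would flip the flag and zap the mappings while holding the same lock, so a fault either completes before the zap or observes revoked afterwards.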
More reasons to do the above..
Possibly for this use case you can keep that and do a global unmap and rely on fault to restore the mmaps that were not revoked.
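The "zap everything, let faults repopulate" scheme could be modelled like this (pure userspace mock, one present bit per page; nothing here is real VFIO code):

```c
#include <stdbool.h>
#include <string.h>

#define NR_PAGES 8

/* Mock of the shared address space: one present bit per page. */
struct mock_mapping {
	bool present[NR_PAGES];
	bool revoked[NR_PAGES];   /* per-page revoke state */
};

/* Global unmap: drop every PTE regardless of which BAR backs it. */
static void mock_zap_all(struct mock_mapping *m)
{
	memset(m->present, 0, sizeof(m->present));
}

/*
 * Fault-driven restore: a later access re-establishes the mapping
 * only if that page was not revoked.  Returns true if the access
 * succeeds, false for what would be SIGBUS in the real code.
 */
static bool mock_access(struct mock_mapping *m, int page)
{
	if (m->present[page])
		return true;
	if (m->revoked[page])
		return false;
	m->present[page] = true;   /* fault repopulates the PTE */
	return true;
}
```

The cost concern below is visible here too: mock_zap_all() touches every page, not just the revoked ones.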
Hm, that'd be functional, but we should consider huge BARs with a lot of PTEs (even huge ones); zapping all BARs might noticeably disturb other clients. But see my query below, please: if we could zap just the resource being reclaimed, that would be preferable.
Hurm. Otherwise you have to create a bunch of address spaces and juggle them.
Otherwise functions like vfio_pci_zap_bars() don't work correctly any more, and that usually creates a whole bunch of problems.
I'd reasoned it was OK for the DMABUF to have its own unique address space -- even though IIUC that means an unmap_mapping_range() by vfio_pci_core_device won't affect a DMABUF's mappings -- because anything that needs to zap a BAR _also_ must already plan to notify DMABUF importers via vfio_pci_dma_buf_move(). And then, vfio_pci_dma_buf_move() will zap the mappings.
That might be correct, but if so it is yet another reason to do the first point and remove the shared address_space fully.
Basically one mmap flow that always uses dma-buf and always uses a per-dma-buf address space with a per-FD revoke and so on and so forth.
This way there is still one of everything, we just pay a bit of cost to automatically create a dmabuf file * in the existing path.
Are there paths that _don't_ always pair vfio_pci_zap_bars() with a vfio_pci_dma_buf_move()?
There should not be.
Jason