On Fri, Feb 27, 2026 at 07:42:08PM +0000, Matt Evans wrote:
> Hi Jason + Christian,
>
> On 27/02/2026 12:51, Jason Gunthorpe wrote:
> > On Fri, Feb 27, 2026 at 11:09:31AM +0100, Christian König wrote:
> >
> >> When a DMA-buf just represents a linear piece of BAR which is
> >> map-able through the VFIO FD anyway then the right approach is to
> >> just re-direct the mapping to this VFIO FD.
>
> We think limiting this to one range per DMABUF isn't enough;
> supporting multiple ranges will be a real benefit.
>
> Bumping vm_pgoff to then reuse vfio_pci_mmap_ops is a really nice
> suggestion for the simplest case, but can't support multiple ranges;
> the .fault() needs to be aware of the non-linear DMABUF layout.
Sigh, yes, that's right, we have the non-linear case, and if you need
that to work it can't use the existing code.
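The core of such a new .fault() is just a range lookup. Stripped of the kernel specifics, it is roughly the following (illustrative structure and field names, written as plain userspace C so it can actually be compiled; the real code would live in a vm_operations_struct fault handler):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one contiguous BAR chunk in the dma-buf. */
struct dmabuf_range {
	uint64_t pgoff;		/* first page offset within the dma-buf */
	uint64_t nr_pages;	/* length of this chunk in pages */
	uint64_t pfn;		/* PFN of the chunk's first page */
};

/*
 * What a multi-range .fault() must do at its core: translate a faulting
 * page offset into a PFN across a non-linear layout.  Returns 0 on
 * success, -1 if the offset falls outside every range.
 */
static int lookup_pfn(const struct dmabuf_range *r, size_t n,
		      uint64_t pgoff, uint64_t *pfn)
{
	for (size_t i = 0; i < n; i++) {
		if (pgoff >= r[i].pgoff &&
		    pgoff < r[i].pgoff + r[i].nr_pages) {
			*pfn = r[i].pfn + (pgoff - r[i].pgoff);
			return 0;
		}
	}
	return -1;
}
```

In the kernel this lookup would run under the same lock that the revoked test takes, so a racing revoke cannot slip in between the test and the PTE install.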
> > I actually would like to go the other way and have VFIO always have a
> > DMABUF under the VMA's it mmaps because that will make it easy to
> > finish the type1 emulation which requires finding dmabufs for the
> > VMAs.
This is still the better idea, since it avoids duplicating the VMA flow
across two paths..
> Putting aside the above point of needing a new .fault() able to find a
> PFN for >1 range for a mo, how would the test of the revoked flag work
> w.r.t. synchronisation and protecting against a racing revoke? It's not
> safe to take memory_lock, test revoked, unlock, then hand over to the
> existing vfio_pci_mmap_*fault() -- which re-takes the lock. I'm not
> quite seeing how we could reuse the existing vfio_pci_mmap_*fault(),
> TBH. I did briefly consider refactoring that existing .fault() code,
> but that makes both paths uglier.
More reasons to do the above.
> > Possibly for this use case you can keep that and do a global unmap and
> > rely on fault to restore the mmaps that were not revoked.
>
> Hm, that'd be functional, but we should consider huge BARs with a lot of
> PTEs (even huge ones); zapping all BARs might noticeably disturb other
> clients. But see my query below please, if we could zap just the
> resource being reclaimed that would be preferable.
Hurm. Otherwise you have to create a bunch of address spaces and
juggle them.
> >> Otherwise functions like vfio_pci_zap_bars() don't work correctly
> >> any more, and that usually creates a whole bunch of problems.
>
> I'd reasoned it was OK for the DMABUF to have its own unique address
> space -- even though IIUC that means an unmap_mapping_range() by
> vfio_pci_core_device won't affect a DMABUF's mappings -- because
> anything that needs to zap a BAR _also_ must already plan to notify
> DMABUF importers via vfio_pci_dma_buf_move(). And then,
> vfio_pci_dma_buf_move() will zap the mappings.
That might be correct, but then it is yet another reason to do the
first point and remove the shared address_space entirely.
Basically one mmap flow that always uses dma-buf and always uses a
per-dma-buf address space with a per-FD revoke, and so on.
This way there is still one of everything; we just pay a bit of cost
to automatically create a dma-buf file in the existing path.
> Are there paths that _don't_ always pair vfio_pci_zap_bars() with a
> vfio_pci_dma_buf_move()?
There should not be.
Jason
Hi Matt,
On 2/27/26 14:02, Matt Evans wrote:
> Hi Christian,
>
> On 27/02/2026 10:05, Christian König wrote:
>> On 2/26/26 21:22, Matt Evans wrote:
>>> Add a new dma-buf ioctl() op, DMA_BUF_IOCTL_REVOKE, connected to a new
>>> (optional) dma_buf_ops callback, revoke(). An exporter receiving this
>>> will _permanently_ revoke the DMABUF, meaning it can no longer be
>>> mapped/attached/mmap()ed. It also guarantees that existing
>>> importers have been detached (e.g. via move_notify) and all mappings
>>> made inaccessible.
>>>
>>> This is useful for lifecycle management in scenarios where a process
>>> has created a DMABUF representing a resource, then delegated it to
>>> a client process; access to the resource is revoked when the client is
>>> deemed "done", and the resource can be safely re-used elsewhere.
>>
>> Well that means revoking from the importer side. That absolutely doesn't make sense to me.
>>
>> Why would you do that?
>
> Well, it's for cleanup, but directed to a specific buffer.
>
> Elaborating on the original example, a userspace driver creates a DMABUF
> for parts of a BAR and then sends its fd to some other client process
> via SCM_RIGHTS. The client might then do all of:
>
> - Process mappings of the buffer
> - iommufd IO-mappings of it
> - other unrelated drivers import it
> - share the fd with more processes!
>
> i.e. poking a programming interface and orchestrating P2P DMA to it.
> Eventually the client completes and messages the driver to say goodbye,
> except the client is buggy: it hangs before it munmaps or requests other
> drivers to shut down/detach their imports.
>
> Now the original driver can't reuse any BAR ranges it shared out, as
> there might still be active mappings or even ongoing P2P DMA to them.
>
> The goal is to guarantee a point in time where resources corresponding
> to a previously-shared DMABUF fd _cannot_ be accessed anymore: CPUs,
> or other drivers/importers, or any other kind of P2P DMA. So yes, a
> revoke must detach importers, using the synchronous revocation flow
> Leon added in [0] ("dma-buf: Use revoke mechanism to invalidate shared
> buffers").
>
> (Apologies, I should really have just built this on top of a tree
> containing that series to make this need clearer.)
>
> But, it ultimately seems to have the same downstream effects as if one
> were to, say, shut down VFIO device fds and therefore trigger
> vfio_pci_dma_buf_cleanup(). It's just the reason to trigger revocation
> is different: a selective userspace-triggered revocation of a given
> buffer, instead of an exporter cleanup-triggered revocation of all
> buffers. In both cases the goals are identical too, of a synchronised
> point after which no more DMA/CPU access can happen.
>
> (If I've misunderstood your question please clarify, but I hope that
> answers it!)
Yeah, that makes it clear; Jason's answer also helped quite a bit in understanding what you want to do here.
First of all, your requirements sound reasonable, but a clear NAK to the way these patches implement them. You have mixed up the different DMA-buf roles and which side is used for what.
See, the IOCTLs on the DMA-buf file descriptor are for the importer side to communicate with the exporter side, e.g. things like "I'm done writing with the CPU, please make that visible to yourself and other importers".
But what you want to do here is just the other way around: the exporter side wants to signal to all importers that the buffer can't be used any more, correct?
If I understood that correctly, then my suggestion is to have a new IOCTL on the VFIO fd you originally used to export the DMA-buf fd. This IOCTL takes the DMA-buf fd and, after double-checking that it is indeed the exporter of that fd, revokes all importer access to it.
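In other words, the ownership check would look something like this. This is a userspace model with made-up structure and function names, not the real VFIO/dma-buf types; in the kernel the identity test would compare dmabuf->ops against the exporter's own dma_buf_ops:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for the exporter's dma_buf_ops identity. */
static const char vfio_pci_dmabuf_ops;

/* Simplified model of a dma-buf as seen by the would-be revoker. */
struct fake_dmabuf {
	const void *ops;	/* identifies which driver exported it */
	int revoked;
};

/*
 * Model of the suggested VFIO ioctl: only the device that exported the
 * dma-buf may revoke it; anything else is rejected up front.
 */
static int vfio_revoke_dmabuf(struct fake_dmabuf *buf)
{
	if (buf->ops != &vfio_pci_dmabuf_ops)
		return -EINVAL;	/* we are not the exporter of this fd */
	if (buf->revoked)
		return -EBADFD;	/* already revoked */
	buf->revoked = 1;	/* then: move_notify importers, zap mappings */
	return 0;
}
```

The revoke path itself (move_notify plus unmapping) is unchanged from the cleanup flow; only the trigger moves from the dma-buf fd to the exporting VFIO fd.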
I'm certainly open to suggestions on how to improve the DMA-buf documentation to make this clearer in the future.
Regards,
Christian.
>
> Cheers,
>
>
> Matt
>
> [0] https://lore.kernel.org/linux-iommu/20260205-nocturnal-poetic-chamois-f566a…
>
>>
>> Regards,
>> Christian.
>>
>>>
>>> Signed-off-by: Matt Evans <mattev(a)meta.com>
>>> ---
>>> drivers/dma-buf/dma-buf.c | 5 +++++
>>> include/linux/dma-buf.h | 22 ++++++++++++++++++++++
>>> include/uapi/linux/dma-buf.h | 1 +
>>> 3 files changed, 28 insertions(+)
>>>
>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>> index edaa9e4ee4ae..b9b315317f2d 100644
>>> --- a/drivers/dma-buf/dma-buf.c
>>> +++ b/drivers/dma-buf/dma-buf.c
>>> @@ -561,6 +561,11 @@ static long dma_buf_ioctl(struct file *file,
>>> case DMA_BUF_IOCTL_IMPORT_SYNC_FILE:
>>> return dma_buf_import_sync_file(dmabuf, (const void __user *)arg);
>>> #endif
>>> + case DMA_BUF_IOCTL_REVOKE:
>>> + if (dmabuf->ops->revoke)
>>> + return dmabuf->ops->revoke(dmabuf);
>>> + else
>>> + return -EINVAL;
>>>
>>> default:
>>> return -ENOTTY;
>>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
>>> index 0bc492090237..a68c9ad7aebd 100644
>>> --- a/include/linux/dma-buf.h
>>> +++ b/include/linux/dma-buf.h
>>> @@ -277,6 +277,28 @@ struct dma_buf_ops {
>>>
>>> int (*vmap)(struct dma_buf *dmabuf, struct iosys_map *map);
>>> void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map *map);
>>> +
>>> + /**
>>> + * @revoke:
>>> + *
>>> + * This callback is invoked from a userspace
>>> + * DMA_BUF_IOCTL_REVOKE operation, and requests that access to
>>> + * the buffer is immediately and permanently revoked. On
>>> + * successful return, the buffer is not accessible through any
>>> + * mmap() or dma-buf import. The request fails if the buffer
>>> + * is pinned; otherwise, the exporter marks the buffer as
>>> + * inaccessible and uses the move_notify callback to inform
>>> + * importers of the change. The buffer is permanently
>>> + * disabled, and the exporter must refuse all map, mmap,
>>> + * attach, etc. requests.
>>> + *
>>> + * Returns:
>>> + *
>>> + * 0 on success, or a negative error code on failure:
>>> + * -ENODEV if the associated device no longer exists/is closed.
>>> + * -EBADFD if the buffer has already been revoked.
>>> + */
>>> + int (*revoke)(struct dma_buf *dmabuf);
>>> };
>>>
>>> /**
>>> diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
>>> index 5a6fda66d9ad..84bf2dd2d0f3 100644
>>> --- a/include/uapi/linux/dma-buf.h
>>> +++ b/include/uapi/linux/dma-buf.h
>>> @@ -178,5 +178,6 @@ struct dma_buf_import_sync_file {
>>> #define DMA_BUF_SET_NAME_B _IOW(DMA_BUF_BASE, 1, __u64)
>>> #define DMA_BUF_IOCTL_EXPORT_SYNC_FILE _IOWR(DMA_BUF_BASE, 2, struct dma_buf_export_sync_file)
>>> #define DMA_BUF_IOCTL_IMPORT_SYNC_FILE _IOW(DMA_BUF_BASE, 3, struct dma_buf_import_sync_file)
>>> +#define DMA_BUF_IOCTL_REVOKE _IO(DMA_BUF_BASE, 4)
>>>
>>> #endif
>>> --
>>> 2.47.3
>>>
>>
>
Hi,
The recent introduction of heaps in the optee driver [1] made possible
the creation of heaps as modules.
It's generally a good idea to do so where possible, including for the
already existing system and CMA heaps.
The system one is pretty trivial, the CMA one is a bit more involved,
especially since we have a call from kernel/dma/contiguous.c to the CMA
heap code. This was solved by turning the logic around and making the
CMA heap call into the contiguous DMA code.
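As a rough model of that inversion, with illustrative names rather than the real in-kernel APIs: the core keeps a registration hook, and the heap module calls into it at init time, instead of the core calling into the heap:

```c
#include <assert.h>
#include <stddef.h>

/* --- Core side (modeling kernel/dma/contiguous.c) ------------------ */

typedef int (*heap_add_fn)(const char *name);

static heap_add_fn registered_cb;
static const char *default_area_name = "reserved";

/*
 * Exported to modules in the real series; the core no longer needs any
 * compile-time knowledge of the CMA heap.  When a heap registers its
 * callback, the core drives registration of the default area itself.
 */
static int dma_contiguous_register_heap_cb(heap_add_fn cb)
{
	if (registered_cb)
		return -1;	/* only one heap consumer in this model */
	registered_cb = cb;
	return cb(default_area_name);
}

/* --- Heap module side (modeling cma_heap.c built as a module) ------ */

static int heaps_added;

static int cma_heap_add(const char *name)
{
	(void)name;		/* would create a dma-heap for this area */
	heaps_added++;
	return 0;
}

static int cma_heap_module_init(void)
{
	return dma_contiguous_register_heap_cb(cma_heap_add);
}
```

The direction of the dependency is what matters: the module depends on the (exported) core symbol, never the reverse, so the core builds fine with the heap compiled out or not yet loaded.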
Let me know what you think,
Maxime
1: https://lore.kernel.org/dri-devel/20250911135007.1275833-4-jens.wiklander@l…
Signed-off-by: Maxime Ripard <mripard(a)kernel.org>
---
Maxime Ripard (7):
dma: contiguous: Turn heap registration logic around
mm: cma: Export cma_alloc and cma_release
mm: cma: Export cma_get_name
mm: cma: Export dma_contiguous_default_area
dma-buf: heaps: Export mem_accounting parameter
dma-buf: heaps: cma: Turn the heap into a module
dma-buf: heaps: system: Turn the heap into a module
drivers/dma-buf/dma-heap.c | 1 +
drivers/dma-buf/heaps/Kconfig | 4 ++--
drivers/dma-buf/heaps/cma_heap.c | 21 +++++----------------
drivers/dma-buf/heaps/system_heap.c | 5 +++++
include/linux/dma-map-ops.h | 5 +++++
kernel/dma/contiguous.c | 27 +++++++++++++++++++++++++--
mm/cma.c | 3 +++
7 files changed, 46 insertions(+), 20 deletions(-)
---
base-commit: 499a718536dc0e1c1d1b6211847207d58acd9916
change-id: 20260225-dma-buf-heaps-as-modules-1034b3ec9f2a
Best regards,
--
Maxime Ripard <mripard(a)kernel.org>
Hi David,
On Thu, Feb 26, 2026 at 11:25:24AM +0100, David Hildenbrand (Arm) wrote:
> On 2/25/26 17:41, Maxime Ripard wrote:
> > The CMA dma-buf heap uses cma_alloc() and cma_release() to allocate and
> > free, respectively, its CMA buffers.
> >
> > However, these functions are not exported. Since we want to turn the CMA
> > heap into a module, let's export them both.
> >
> > Signed-off-by: Maxime Ripard <mripard(a)kernel.org>
> > ---
> > mm/cma.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/cma.c b/mm/cma.c
> > index 94b5da468a7d719e5144d33b06bcc7619c0fbcc9..be142b473f3bd41b9c7d8ba4397f018f6993d962 100644
> > --- a/mm/cma.c
> > +++ b/mm/cma.c
> > @@ -949,10 +949,11 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
> > if (page)
> > set_pages_refcounted(page, count);
> >
> > return page;
> > }
> > +EXPORT_SYMBOL_GPL(cma_alloc);
> >
> > static struct cma_memrange *find_cma_memrange(struct cma *cma,
> > const struct page *pages, unsigned long count)
> > {
> > struct cma_memrange *cmr = NULL;
> > @@ -1025,10 +1026,11 @@ bool cma_release(struct cma *cma, const struct page *pages,
> >
> > __cma_release_frozen(cma, cmr, pages, count);
> >
> > return true;
> > }
> > +EXPORT_SYMBOL_GPL(cma_release);
> >
> > bool cma_release_frozen(struct cma *cma, const struct page *pages,
> > unsigned long count)
> > {
> > struct cma_memrange *cmr;
> >
>
> I'm wondering whether we want to restrict all these exports to the
> dma-buf module only using EXPORT_SYMBOL_FOR_MODULES().
TIL about EXPORT_SYMBOL_FOR_MODULES, thanks.
> Especially dma_contiguous_default_area() (patch #4), I am not sure
> whether we want arbitrary modules to mess with that.
Yeah, I wasn't too fond of that one either. Alternatively, I guess we
could turn dev_get_cma_area into a non-inlined function and export that
instead?
Or we could do both.
Maxime
On 2/26/26 21:22, Matt Evans wrote:
> Add a new dma-buf ioctl() op, DMA_BUF_IOCTL_REVOKE, connected to a new
> (optional) dma_buf_ops callback, revoke(). An exporter receiving this
> will _permanently_ revoke the DMABUF, meaning it can no longer be
> mapped/attached/mmap()ed. It also guarantees that existing
> importers have been detached (e.g. via move_notify) and all mappings
> made inaccessible.
>
> This is useful for lifecycle management in scenarios where a process
> has created a DMABUF representing a resource, then delegated it to
> a client process; access to the resource is revoked when the client is
> deemed "done", and the resource can be safely re-used elsewhere.
Well that means revoking from the importer side. That absolutely doesn't make sense to me.
Why would you do that?
Regards,
Christian.
>
> Signed-off-by: Matt Evans <mattev(a)meta.com>
> ---
> drivers/dma-buf/dma-buf.c | 5 +++++
> include/linux/dma-buf.h | 22 ++++++++++++++++++++++
> include/uapi/linux/dma-buf.h | 1 +
> 3 files changed, 28 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index edaa9e4ee4ae..b9b315317f2d 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -561,6 +561,11 @@ static long dma_buf_ioctl(struct file *file,
> case DMA_BUF_IOCTL_IMPORT_SYNC_FILE:
> return dma_buf_import_sync_file(dmabuf, (const void __user *)arg);
> #endif
> + case DMA_BUF_IOCTL_REVOKE:
> + if (dmabuf->ops->revoke)
> + return dmabuf->ops->revoke(dmabuf);
> + else
> + return -EINVAL;
>
> default:
> return -ENOTTY;
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 0bc492090237..a68c9ad7aebd 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -277,6 +277,28 @@ struct dma_buf_ops {
>
> int (*vmap)(struct dma_buf *dmabuf, struct iosys_map *map);
> void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map *map);
> +
> + /**
> + * @revoke:
> + *
> + * This callback is invoked from a userspace
> + * DMA_BUF_IOCTL_REVOKE operation, and requests that access to
> + * the buffer is immediately and permanently revoked. On
> + * successful return, the buffer is not accessible through any
> + * mmap() or dma-buf import. The request fails if the buffer
> + * is pinned; otherwise, the exporter marks the buffer as
> + * inaccessible and uses the move_notify callback to inform
> + * importers of the change. The buffer is permanently
> + * disabled, and the exporter must refuse all map, mmap,
> + * attach, etc. requests.
> + *
> + * Returns:
> + *
> + * 0 on success, or a negative error code on failure:
> + * -ENODEV if the associated device no longer exists/is closed.
> + * -EBADFD if the buffer has already been revoked.
> + */
> + int (*revoke)(struct dma_buf *dmabuf);
> };
>
> /**
> diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
> index 5a6fda66d9ad..84bf2dd2d0f3 100644
> --- a/include/uapi/linux/dma-buf.h
> +++ b/include/uapi/linux/dma-buf.h
> @@ -178,5 +178,6 @@ struct dma_buf_import_sync_file {
> #define DMA_BUF_SET_NAME_B _IOW(DMA_BUF_BASE, 1, __u64)
> #define DMA_BUF_IOCTL_EXPORT_SYNC_FILE _IOWR(DMA_BUF_BASE, 2, struct dma_buf_export_sync_file)
> #define DMA_BUF_IOCTL_IMPORT_SYNC_FILE _IOW(DMA_BUF_BASE, 3, struct dma_buf_import_sync_file)
> +#define DMA_BUF_IOCTL_REVOKE _IO(DMA_BUF_BASE, 4)
>
> #endif
> --
> 2.47.3
>