Changelog:
v3:
* Used Jason's wordings for commits and cover letter.
* Removed IOMMUFD patch.
* Renamed dma_buf_attachment_is_revoke() to be dma_buf_attach_revocable().
* Added patch to remove CONFIG_DMABUF_MOVE_NOTIFY.
* Added Reviewed-by tags.
* Added a dma_resv_wait_timeout() call after dma_buf_move_notify() in VFIO.
* Added dma_buf_attach_revocable() check to VFIO DMABUF attach function.
* Slightly changed commit messages.
v2: https://patch.msgid.link/20260118-dmabuf-revoke-v2-0-a03bb27c0875@nvidia.com
* Changed series to document the revoke semantics instead of
implementing it.
v1: https://patch.msgid.link/20260111-dmabuf-revoke-v1-0-fb4bcc8c259b@nvidia.com
-------------------------------------------------------------------------
This series documents a dma-buf “revoke” mechanism that allows a dma-buf
exporter to explicitly invalidate (“kill”) a shared buffer after it has
been distributed to importers, so that further CPU and device access is
prevented and importers reliably observe failure.
The change in this series is to properly document and use the existing core
“revoked” state on the dma-buf object and a corresponding exporter-triggered
revoke operation.
dma-buf has quietly allowed calling move_notify on pinned dma-bufs, even
though legacy importers using dma_buf_attach() would simply ignore
these calls.
RDMA saw this and needed to use allow_peer2peer=true, so implemented a
new-style pinned importer with an explicitly non-working move_notify()
callback.
This has been tolerable because the existing exporters are thought to
only call move_notify() on a pinned DMABUF under RAS events and we
have been willing to tolerate the UAF that results by allowing the
importer to continue to use the mapping in this rare case.
VFIO wants to implement a pin-supporting exporter that will issue a
revoking move_notify() around FLRs and a few other user-triggerable
operations. Since this is much more common, we are not willing to
tolerate the security UAF caused by interworking with drivers that do
not support move_notify(). Thus, until now, VFIO has required dynamic
importers, even though it never actually moves the buffer location.
To allow VFIO to work with pinned importers, according to how dma-buf
was intended, we need to allow VFIO to detect if an importer is legacy
or RDMA and does not actually implement move_notify().
Introduce a new function that exporters can call to detect these less
capable importers. VFIO can then refuse to accept them during attach.
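As a rough illustration of the check described above, here is a minimal
user-space sketch. It is an assumption about the shape of the helper, not
the kernel implementation: the model_* types and functions are hypothetical
stand-ins, and the rule "revocable only if the importer supplied an
invalidation callback" is inferred from this cover letter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types for illustration only; NOT the kernel definitions. */
struct model_attachment;

struct model_attach_ops {
	/* Optional: NULL means the importer cannot drop its mapping on request. */
	void (*invalidate_mappings)(struct model_attachment *attach);
};

struct model_attachment {
	/* NULL models a legacy importer that attached without importer ops. */
	const struct model_attach_ops *importer_ops;
};

/* Hypothetical callback used to model an importer that handles invalidation. */
static void model_noop_invalidate(struct model_attachment *attach)
{
	(void)attach;
}

/* Assumed check: revocable only if a real invalidation callback exists. */
static bool model_attach_revocable(const struct model_attachment *attach)
{
	assert(attach != NULL);
	return attach->importer_ops != NULL &&
	       attach->importer_ops->invalidate_mappings != NULL;
}

/* Exporter attach path: refuse importers that cannot honour a revoke. */
static int model_exporter_attach(const struct model_attachment *attach)
{
	if (!model_attach_revocable(attach))
		return -EOPNOTSUPP;
	return 0;
}
```

In this model a legacy importer (no ops) and a pinned importer with ops but
no callback are both refused at attach time; only an importer that provides
the callback is accepted.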
In theory, all exporters that call move_notify() on pinned dma-bufs
should call this function; however, that would break a number of widely
used NIC/GPU flows. Thus, for now, do not spread this further than VFIO
until we understand how much of RDMA can implement the full
semantics.
In the process clarify how move_notify is intended to be used with
pinned dma-bufs.
Thanks
Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com>
---
Leon Romanovsky (7):
dma-buf: Rename .move_notify() callback to a clearer identifier
dma-buf: Always build with DMABUF_MOVE_NOTIFY
dma-buf: Document RDMA non-ODP invalidate_mapping() special case
dma-buf: Add check function for revoke semantics
iommufd: Pin dma-buf importer for revoke semantics
vfio: Wait for dma-buf invalidation to complete
vfio: Validate dma-buf revocation semantics
drivers/dma-buf/Kconfig | 12 -----
drivers/dma-buf/dma-buf.c | 69 +++++++++++++++++++++++------
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 14 +++---
drivers/gpu/drm/amd/amdkfd/Kconfig | 2 +-
drivers/gpu/drm/virtio/virtgpu_prime.c | 2 +-
drivers/gpu/drm/xe/tests/xe_dma_buf.c | 7 ++-
drivers/gpu/drm/xe/xe_dma_buf.c | 14 +++---
drivers/infiniband/core/umem_dmabuf.c | 13 +-----
drivers/infiniband/hw/mlx5/mr.c | 2 +-
drivers/iommu/iommufd/pages.c | 11 ++++-
drivers/vfio/pci/vfio_pci_dmabuf.c | 8 ++++
include/linux/dma-buf.h | 9 ++--
12 files changed, 96 insertions(+), 67 deletions(-)
---
base-commit: 9ace4753a5202b02191d54e9fdf7f9e3d02b85eb
change-id: 20251221-dmabuf-revoke-b90ef16e4236
Best regards,
--
Leon Romanovsky <leonro(a)nvidia.com>
On Wed, Jan 21, 2026 at 02:47:29PM +0000, Pranjal Shrivastava wrote:
> But at the same time, I'd like to discuss if we should think about
> changing the dmabuf core, NULL op == success feels like relying on a bug
Agreed. IMHO it is surprising and counterintuitive for the kernel that
a NULL op means the feature is supported and defaults to success.
Jason
Changelog:
v4:
* Changed DMA_RESV_USAGE_KERNEL to DMA_RESV_USAGE_BOOKKEEP.
* Made .invalidate_mapping() truly optional.
* Added patch which renames dma_buf_move_notify() to be
dma_buf_invalidate_mappings().
* Restored dma_buf_attachment_is_dynamic() function.
v3: https://lore.kernel.org/all/20260120-dmabuf-revoke-v3-0-b7e0b07b8214@nvidia…
* Used Jason's wordings for commits and cover letter.
* Removed IOMMUFD patch.
* Renamed dma_buf_attachment_is_revoke() to be dma_buf_attach_revocable().
* Added patch to remove CONFIG_DMABUF_MOVE_NOTIFY.
* Added Reviewed-by tags.
* Added a dma_resv_wait_timeout() call after dma_buf_move_notify() in VFIO.
* Added dma_buf_attach_revocable() check to VFIO DMABUF attach function.
* Slightly changed commit messages.
v2: https://patch.msgid.link/20260118-dmabuf-revoke-v2-0-a03bb27c0875@nvidia.com
* Changed series to document the revoke semantics instead of
implementing it.
v1: https://patch.msgid.link/20260111-dmabuf-revoke-v1-0-fb4bcc8c259b@nvidia.com
-------------------------------------------------------------------------
This series documents a dma-buf “revoke” mechanism that allows a dma-buf
exporter to explicitly invalidate (“kill”) a shared buffer after it has
been distributed to importers, so that further CPU and device access is
prevented and importers reliably observe failure.
The change in this series is to properly document and use the existing core
“revoked” state on the dma-buf object and a corresponding exporter-triggered
revoke operation.
dma-buf has quietly allowed calling move_notify on pinned dma-bufs, even
though legacy importers using dma_buf_attach() would simply ignore
these calls.
RDMA saw this and needed to use allow_peer2peer=true, so implemented a
new-style pinned importer with an explicitly non-working move_notify()
callback.
This has been tolerable because the existing exporters are thought to
only call move_notify() on a pinned DMABUF under RAS events and we
have been willing to tolerate the UAF that results by allowing the
importer to continue to use the mapping in this rare case.
VFIO wants to implement a pin-supporting exporter that will issue a
revoking move_notify() around FLRs and a few other user-triggerable
operations. Since this is much more common, we are not willing to
tolerate the security UAF caused by interworking with drivers that do
not support move_notify(). Thus, until now, VFIO has required dynamic
importers, even though it never actually moves the buffer location.
To allow VFIO to work with pinned importers, according to how dma-buf
was intended, we need to allow VFIO to detect if an importer is legacy
or RDMA and does not actually implement move_notify().
In theory, all exporters that call move_notify() on pinned dma-bufs
should call dma_buf_attach_revocable(); however, that would break a number
of widely used NIC/GPU flows. Thus, for now, do not spread this further
than VFIO until we understand how much of RDMA can implement the full
semantics.
In the process clarify how move_notify is intended to be used with
pinned dma-bufs.
Thanks
Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com>
---
Leon Romanovsky (8):
dma-buf: Rename .move_notify() callback to a clearer identifier
dma-buf: Rename dma_buf_move_notify() to dma_buf_invalidate_mappings()
dma-buf: Always build with DMABUF_MOVE_NOTIFY
dma-buf: Make .invalidate_mapping() truly optional
dma-buf: Add check function for revoke semantics
iommufd: Pin dma-buf importer for revoke semantics
vfio: Wait for dma-buf invalidation to complete
vfio: Validate dma-buf revocation semantics
drivers/dma-buf/Kconfig | 12 -------
drivers/dma-buf/dma-buf.c | 53 ++++++++++++++++++++++-------
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 14 +++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
drivers/gpu/drm/amd/amdkfd/Kconfig | 2 +-
drivers/gpu/drm/virtio/virtgpu_prime.c | 2 +-
drivers/gpu/drm/xe/tests/xe_dma_buf.c | 7 ++--
drivers/gpu/drm/xe/xe_bo.c | 2 +-
drivers/gpu/drm/xe/xe_dma_buf.c | 14 +++-----
drivers/infiniband/core/umem_dmabuf.c | 13 -------
drivers/infiniband/hw/mlx5/mr.c | 2 +-
drivers/iommu/iommufd/pages.c | 11 ++++--
drivers/iommu/iommufd/selftest.c | 2 +-
drivers/vfio/pci/vfio_pci_dmabuf.c | 13 +++++--
include/linux/dma-buf.h | 9 ++---
15 files changed, 84 insertions(+), 74 deletions(-)
---
base-commit: 9ace4753a5202b02191d54e9fdf7f9e3d02b85eb
change-id: 20251221-dmabuf-revoke-b90ef16e4236
Best regards,
--
Leon Romanovsky <leonro(a)nvidia.com>
On Wed, Jan 21, 2026 at 02:22:31PM +0000, Pranjal Shrivastava wrote:
> On Wed, Jan 21, 2026 at 09:47:12AM -0400, Jason Gunthorpe wrote:
> > On Wed, Jan 21, 2026 at 02:59:16PM +0200, Leon Romanovsky wrote:
> > > From: Leon Romanovsky <leonro(a)nvidia.com>
> > >
> > > Use the new dma_buf_attach_revocable() helper to restrict attachments to
> > > importers that support mapping invalidation.
> > >
> > > Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com>
> > > ---
> > > drivers/vfio/pci/vfio_pci_dmabuf.c | 3 +++
> > > 1 file changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > > index 5fceefc40e27..85056a5a3faf 100644
> > > --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> > > +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > > @@ -31,6 +31,9 @@ static int vfio_pci_dma_buf_attach(struct dma_buf *dmabuf,
> > > if (priv->revoked)
> > > return -ENODEV;
> > >
> > > + if (!dma_buf_attach_revocable(attachment))
> > > + return -EOPNOTSUPP;
> > > +
> > > return 0;
> > > }
> >
> > We need to push an urgent -rc fix to implement a pin function here
> > that always fails. That was missed and it means things like rdma can
> > import vfio when the intention was to block that. It would be bad for
> > that uAPI mistake to reach a released kernel.
> >
> > It's tricky that NULL pin ops means "I support pin" :|
> >
>
> I've been wondering about this for a while now, I've been sitting on the
> following:
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index a4d8f2ff94e4..962bce959366 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -1133,6 +1133,8 @@ int dma_buf_pin(struct dma_buf_attachment *attach)
>
> if (dmabuf->ops->pin)
> ret = dmabuf->ops->pin(attach);
> + else
> + ret = -EOPNOTSUPP;
>
> return ret;
> }
>
> But didn't get a chance to dive in the history yet. I thought there's a
> good reason we didn't have it? Would it break existing dmabuf users?
Probably every importer that calls dma_buf_pin() when attaching to
existing exporters, as many in-tree exporters do not implement ->pin().
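To make the trade-off concrete, here is a minimal user-space sketch of the
two policies. It is only a model under stated assumptions: the model_*
names are hypothetical, the callback takes no arguments for brevity, and
the error code returned by the refusing exporter is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for the exporter ops; NOT the kernel's struct dma_buf_ops. */
struct model_exporter_ops {
	int (*pin)(void);	/* optional */
};

/* Current behaviour: a missing ->pin is treated as success. */
static int model_pin_current(const struct model_exporter_ops *ops)
{
	int ret = 0;

	assert(ops != NULL);
	if (ops->pin)
		ret = ops->pin();
	return ret;
}

/* Behaviour with the diff quoted above: a missing ->pin is an error. */
static int model_pin_proposed(const struct model_exporter_ops *ops)
{
	assert(ops != NULL);
	if (!ops->pin)
		return -EOPNOTSUPP;
	return ops->pin();
}

/* An exporter that wants to refuse pinning under the current rules. */
static int model_pin_refuse(void)
{
	return -EOPNOTSUPP;
}
```

With the current rule, an exporter without ->pin is pinnable by default, so
an exporter that must refuse pinning has to supply a callback such as
model_pin_refuse(). With the proposed rule, every exporter without ->pin
would start failing for importers that pin.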
Thanks
>
> Praan
On 1/20/26 21:44, Matthew Brost wrote:
> On Tue, Jan 20, 2026 at 04:07:06PM +0200, Leon Romanovsky wrote:
>> From: Leon Romanovsky <leonro(a)nvidia.com>
>>
>> dma-buf invalidation is performed asynchronously by hardware, so VFIO must
>> wait until all affected objects have been fully invalidated.
>>
>> Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
>> Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com>
>> ---
>> drivers/vfio/pci/vfio_pci_dmabuf.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
>> index d4d0f7d08c53..33bc6a1909dd 100644
>> --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
>> +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
>> @@ -321,6 +321,9 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
>> dma_resv_lock(priv->dmabuf->resv, NULL);
>> priv->revoked = revoked;
>> dma_buf_move_notify(priv->dmabuf);
>> + dma_resv_wait_timeout(priv->dmabuf->resv,
>> + DMA_RESV_USAGE_KERNEL, false,
>> + MAX_SCHEDULE_TIMEOUT);
>
> Should we explicitly call out in the dma_buf_move_notify() /
> invalidate_mappings kernel-doc that KERNEL slots are the mechanism
> for communicating asynchronous dma_buf_move_notify /
> invalidate_mappings events via fences?
Oh, I missed that! And no that is not correct.
This should be DMA_RESV_USAGE_BOOKKEEP so that we wait for everything.
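For reference, a minimal user-space sketch of why the usage argument
matters. The dma_resv usage classes are ordered KERNEL < WRITE < READ <
BOOKKEEP, and a query for one class also covers every lower class. The
model_* names below are hypothetical stand-ins and the code only counts
fences; it does not model real waiting.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the ordering of the kernel's usage classes; stand-in names. */
enum model_resv_usage {
	MODEL_USAGE_KERNEL,
	MODEL_USAGE_WRITE,
	MODEL_USAGE_READ,
	MODEL_USAGE_BOOKKEEP,
};

struct model_fence {
	enum model_resv_usage usage;
};

/* Count the fences a wait with the given usage class would cover. */
static size_t model_fences_covered(const struct model_fence *fences,
				   size_t count,
				   enum model_resv_usage usage)
{
	size_t covered = 0;

	assert(fences != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		/* A query for one class includes all lower classes. */
		if (fences[i].usage <= usage)
			covered++;
	}
	return covered;
}
```

In this model a wait with the KERNEL class covers only KERNEL fences, while
a wait with the BOOKKEEP class covers all of them.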
Regards,
Christian.
>
> Yes, this is probably implied, but it wouldn’t hurt to state this
> explicitly as part of the cross-driver contract.
>
> Here is what we have now:
>
> * - Dynamic importers should set fences for any access that they can't
> * disable immediately from their &dma_buf_attach_ops.invalidate_mappings
> * callback.
>
> Matt
>
>> dma_resv_unlock(priv->dmabuf->resv);
>> }
>> fput(priv->dmabuf->file);
>> @@ -342,6 +345,8 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
>> priv->vdev = NULL;
>> priv->revoked = true;
>> dma_buf_move_notify(priv->dmabuf);
>> + dma_resv_wait_timeout(priv->dmabuf->resv, DMA_RESV_USAGE_KERNEL,
>> + false, MAX_SCHEDULE_TIMEOUT);
>> dma_resv_unlock(priv->dmabuf->resv);
>> vfio_device_put_registration(&vdev->vdev);
>> fput(priv->dmabuf->file);
>>
>> --
>> 2.52.0
>>
On 1/20/26 12:41, Tvrtko Ursulin wrote:
>
> On 20/01/2026 10:54, Christian König wrote:
>> Implement per-fence spinlocks, allowing implementations to not give an
>> external spinlock to protect the fence internal state. Instead a spinlock
>> embedded into the fence structure itself is used in this case.
>>
>> Shared spinlocks have the problem that implementations need to guarantee
>> that the lock lives at least as long as all fences referencing it.
>>
>> Using a per-fence spinlock allows completely decoupling spinlock producer
>> and consumer life times, simplifying the handling in most use cases.
>>
>> v2: improve naming, coverage and function documentation
>> v3: fix one additional locking in the selftests
>> v4: separate out some changes to make the patch smaller,
>> fix one amdgpu crash found by CI systems
>>
>> Signed-off-by: Christian König <christian.koenig(a)amd.com>
>> ---
>> drivers/dma-buf/dma-fence.c | 25 +++++++++++++++++-------
>> drivers/dma-buf/sync_debug.h | 2 +-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +-
>> drivers/gpu/drm/drm_crtc.c | 2 +-
>> drivers/gpu/drm/drm_writeback.c | 2 +-
>> drivers/gpu/drm/nouveau/nouveau_fence.c | 3 ++-
>> drivers/gpu/drm/qxl/qxl_release.c | 3 ++-
>> drivers/gpu/drm/vmwgfx/vmwgfx_fence.c | 3 ++-
>> drivers/gpu/drm/xe/xe_hw_fence.c | 3 ++-
>
> i915 needed changes too, based on the kbuild report.
Going to take a look now.
> Have you seen my note about the RCU sparse warning as well?
Nope, I must have missed that mail.
...
>> +/**
>> + * dma_fence_spinlock - return pointer to the spinlock protecting the fence
>> + * @fence: the fence to get the lock from
>> + *
>> + * Return either the pointer to the embedded or the external spin lock.
>> + */
>> +static inline spinlock_t *dma_fence_spinlock(struct dma_fence *fence)
>> +{
>> + return test_bit(DMA_FENCE_FLAG_INLINE_LOCK_BIT, &fence->flags) ?
>> + &fence->inline_lock : fence->extern_lock;
>> +}
>
> You did not want to move this helper into "dma-buf: abstract fence locking" ?
I was avoiding that to keep the prerequisite patch smaller, because this change seemed independent of it.
But thinking about it I could make a third patch which introduces dma_fence_spinlock() and changes all the container_of uses.
> I think that would have been better to keep everything mechanical in one patch, and then this patch which changes behaviour does not touch any drivers but only dma-fence core.
>
> Also, what about adding something like dma_fence_container_of() in that patch as well?
I would rather like to avoid that. Using the spinlock pointer with container_of seemed to be a bit of a hack to me in the first place and I don't want to encourage people to do that in new code as well.
Regards,
Christian.
>
> Regards,
>
> Tvrtko
>
>> +
>> /**
>> * dma_fence_lock_irqsave - irqsave lock the fence
>> * @fence: the fence to lock
>> @@ -385,7 +403,7 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
>> * Lock the fence, preventing it from changing to the signaled state.
>> */
>> #define dma_fence_lock_irqsave(fence, flags) \
>> - spin_lock_irqsave(fence->lock, flags)
>> + spin_lock_irqsave(dma_fence_spinlock(fence), flags)
>> /**
>> * dma_fence_unlock_irqrestore - unlock the fence and irqrestore
>> @@ -395,7 +413,7 @@ dma_fence_get_rcu_safe(struct dma_fence __rcu **fencep)
>> * Unlock the fence, allowing it to change it's state to signaled again.
>> */
>> #define dma_fence_unlock_irqrestore(fence, flags) \
>> - spin_unlock_irqrestore(fence->lock, flags)
>> + spin_unlock_irqrestore(dma_fence_spinlock(fence), flags)
>> #ifdef CONFIG_LOCKDEP
>> bool dma_fence_begin_signalling(void);
>
On Tue, Jan 20, 2026 at 12:44:50PM -0800, Matthew Brost wrote:
> On Tue, Jan 20, 2026 at 04:07:06PM +0200, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro(a)nvidia.com>
> >
> > dma-buf invalidation is performed asynchronously by hardware, so VFIO must
> > wait until all affected objects have been fully invalidated.
> >
> > Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
> > Signed-off-by: Leon Romanovsky <leonro(a)nvidia.com>
> > ---
> > drivers/vfio/pci/vfio_pci_dmabuf.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > index d4d0f7d08c53..33bc6a1909dd 100644
> > --- a/drivers/vfio/pci/vfio_pci_dmabuf.c
> > +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
> > @@ -321,6 +321,9 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
> > dma_resv_lock(priv->dmabuf->resv, NULL);
> > priv->revoked = revoked;
> > dma_buf_move_notify(priv->dmabuf);
> > + dma_resv_wait_timeout(priv->dmabuf->resv,
> > + DMA_RESV_USAGE_KERNEL, false,
> > + MAX_SCHEDULE_TIMEOUT);
>
> Should we explicitly call out in the dma_buf_move_notify() /
> invalidate_mappings kernel-doc that KERNEL slots are the mechanism
> for communicating asynchronous dma_buf_move_notify /
> invalidate_mappings events via fences?
>
> Yes, this is probably implied, but it wouldn’t hurt to state this
> explicitly as part of the cross-driver contract.
>
> Here is what we have now:
>
> * - Dynamic importers should set fences for any access that they can't
> * disable immediately from their &dma_buf_attach_ops.invalidate_mappings
> * callback.
I believe I documented this in patch 4:
https://lore.kernel.org/all/20260120-dmabuf-revoke-v3-4-b7e0b07b8214@nvidia…
Is there anything else that should be added?
/**
 * dma_buf_move_notify - notify attachments that DMA-buf is moving
 *
 * @dmabuf: [in] buffer which is moving
 *
 * Informs all attachments that they need to destroy and recreate all their
 * mappings. If the attachment is dynamic then the dynamic importer is expected
 * to invalidate any caches it has of the mapping result and perform a new
 * mapping request before allowing HW to do any further DMA.
 *
 * If the attachment is pinned then this informs the pinned importer that
 * the underlying mapping is no longer available. Pinned importers may take
 * this is as a permanent revocation so exporters should not trigger it
 * lightly.
 *
 * For legacy pinned importers that cannot support invalidation this is a NOP.
 * Drivers can call dma_buf_attach_revocable() to determine if the importer
 * supports this.
 *
 * NOTE: The invalidation triggers asynchronous HW operation and the callers
 * need to wait for this operation to complete by calling
 * to dma_resv_wait_timeout().
 */
Thanks
>
> Matt
>
> > dma_resv_unlock(priv->dmabuf->resv);
> > }
> > fput(priv->dmabuf->file);
> > @@ -342,6 +345,8 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
> > priv->vdev = NULL;
> > priv->revoked = true;
> > dma_buf_move_notify(priv->dmabuf);
> > + dma_resv_wait_timeout(priv->dmabuf->resv, DMA_RESV_USAGE_KERNEL,
> > + false, MAX_SCHEDULE_TIMEOUT);
> > dma_resv_unlock(priv->dmabuf->resv);
> > vfio_device_put_registration(&vdev->vdev);
> > fput(priv->dmabuf->file);
> >
> > --
> > 2.52.0
> >