This series implements a dma-buf “revoke” mechanism that allows a dma-buf exporter to explicitly invalidate (“kill”) a shared buffer after it has been distributed to importers, so that further CPU and device access is prevented and importers reliably observe failure.
Today, dma-buf effectively provides “if you have the fd, you can keep using the memory indefinitely.” That assumption breaks down when an exporter must reclaim, reset, evict, or otherwise retire backing memory after it has been shared. Concrete cases include GPU reset and recovery where old allocations become unsafe to access, memory eviction/overcommit where backing storage must be withdrawn, and security or isolation situations where continued access must be prevented. While drivers can sometimes approximate this with exporter-specific fencing and policy, there is no core dma-buf state transition that communicates “this buffer is no longer valid; fail access” across all access paths.
The change in this series is to introduce a core “revoked” state on the dma-buf object and a corresponding exporter-triggered revoke operation. Once a dma-buf is revoked, new access paths are blocked so that attempts to DMA-map, vmap, or mmap the buffer fail in a consistent way.
In addition, the series aims to invalidate existing access as much as the kernel allows: device mappings are torn down where possible so devices and IOMMUs cannot continue DMA.
The semantics are intentionally simple: revoke is a one-way, permanent transition for the lifetime of that dma-buf instance.
From a compatibility perspective, users that never invoke revoke are unaffected, and exporters that adopt it gain a core-supported enforcement mechanism rather than relying on ad hoc driver behavior. The intent is to keep the interface minimal and avoid imposing policy; the series provides the mechanism to terminate access, with policy remaining in the exporter and higher-level components.
BTW, see this megathread [1] for additional context. Ironically, it was posted exactly one year ago.
[1] https://lore.kernel.org/all/20250107142719.179636-2-yilun.xu@linux.intel.com...
Thanks
Cc: linux-rdma@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: kvm@vger.kernel.org
Cc: iommu@lists.linux.dev
To: Jason Gunthorpe jgg@ziepe.ca
To: Leon Romanovsky leon@kernel.org
To: Sumit Semwal sumit.semwal@linaro.org
To: Christian König christian.koenig@amd.com
To: Alex Williamson alex@shazbot.org
To: Kevin Tian kevin.tian@intel.com
To: Joerg Roedel joro@8bytes.org
To: Will Deacon will@kernel.org
To: Robin Murphy robin.murphy@arm.com
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
Leon Romanovsky (4):
      dma-buf: Introduce revoke semantics
      vfio: Use dma-buf revoke semantics
      iommufd: Require DMABUF revoke semantics
      iommufd/selftest: Reuse dma-buf revoke semantics

 drivers/dma-buf/dma-buf.c          | 36 ++++++++++++++++++++++++++++++++----
 drivers/iommu/iommufd/pages.c      |  2 +-
 drivers/iommu/iommufd/selftest.c   | 12 ++++--------
 drivers/vfio/pci/vfio_pci_dmabuf.c | 27 ++++++---------------------
 include/linux/dma-buf.h            | 31 +++++++++++++++++++++++++++++++
 5 files changed, 74 insertions(+), 34 deletions(-)
---
base-commit: 9ace4753a5202b02191d54e9fdf7f9e3d02b85eb
change-id: 20251221-dmabuf-revoke-b90ef16e4236

Best regards,
--
Leon Romanovsky leonro@nvidia.com
From: Leon Romanovsky leonro@nvidia.com
Add a dma-buf revoke mechanism that allows an exporter to explicitly invalidate ("kill") a shared buffer after it has been handed out to importers. Once revoked, all further CPU and device access is blocked, and importers consistently observe failure.
This requires both importers and exporters to honor the revoke contract. For importers, this means no page faults are delivered after the buffer is invalidated. For exporters, the dma-buf core prevents attaching new importers and remapping existing ones once revocation has occurred.
The proposed mechanism allows binding importers that do not require revoke support, and they shall continue using the existing .move_notify() API. However, importers that cannot handle page faults to remap buffers will fail to bind to exporters that do not support revoke.
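The binding rules above can be modelled outside the kernel. What follows is a user-space sketch, not kernel code: `model_dmabuf`, `model_importer_ops`, `model_noop()` and `model_attach_check()` are hypothetical names that mirror only the checks this patch adds to dma_buf_dynamic_attach(). WARN_ON() and the real structures are left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two flags added to struct dma_buf. */
struct model_dmabuf {
	bool revoke_semantics;	/* exporter opted in at export time */
	bool invalidate;	/* set once the buffer has been revoked */
};

/* Hypothetical stand-in for struct dma_buf_attach_ops. */
struct model_importer_ops {
	void (*move_notify)(void);
	void (*revoke_notify)(void);
};

/* Placeholder callback, only used to make an ops member non-NULL. */
void model_noop(void)
{
}

/* Returns 0 if the attach would be accepted, a negative errno otherwise. */
int model_attach_check(const struct model_dmabuf *buf,
		       const struct model_importer_ops *ops)
{
	/* A revoked buffer accepts no new importers. */
	if (buf->invalidate)
		return -ENODEV;

	/* Importers without ops are not subject to the callback checks. */
	if (!ops)
		return 0;

	/* Exactly one of the two callbacks must be provided. */
	if (!ops->move_notify && !ops->revoke_notify)
		return -EINVAL;
	if (ops->move_notify && ops->revoke_notify)
		return -EINVAL;

	/* A revoke-only importer needs a revoke-aware exporter. */
	if (!buf->revoke_semantics && ops->revoke_notify)
		return -EINVAL;

	return 0;
}
```

In this model an importer that only provides move_notify binds to either kind of exporter, while an importer that only provides revoke_notify is rejected with -EINVAL unless the exporter set revoke_semantics.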
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
 drivers/dma-buf/dma-buf.c | 36 ++++++++++++++++++++++++++++++++----
 include/linux/dma-buf.h   | 31 +++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index edaa9e4ee4ae..4d31fba792ee 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -697,6 +697,9 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
 	if (WARN_ON(!exp_info->ops->pin != !exp_info->ops->unpin))
 		return ERR_PTR(-EINVAL);
 
+	if (WARN_ON(exp_info->revoke_semantics && exp_info->ops->pin))
+		return ERR_PTR(-EINVAL);
+
 	if (!try_module_get(exp_info->owner))
 		return ERR_PTR(-ENOENT);
 
@@ -727,6 +730,7 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
 	dmabuf->cb_in.poll = dmabuf->cb_out.poll = &dmabuf->poll;
 	dmabuf->cb_in.active = dmabuf->cb_out.active = 0;
 	INIT_LIST_HEAD(&dmabuf->attachments);
+	dmabuf->revoke_semantics = exp_info->revoke_semantics;
 
 	if (!resv) {
 		dmabuf->resv = (struct dma_resv *)&dmabuf[1];
@@ -948,8 +952,21 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
 	if (WARN_ON(!dmabuf || !dev))
 		return ERR_PTR(-EINVAL);
 
-	if (WARN_ON(importer_ops && !importer_ops->move_notify))
-		return ERR_PTR(-EINVAL);
+	if (dmabuf->invalidate)
+		return ERR_PTR(-ENODEV);
+
+	if (importer_ops) {
+		if (WARN_ON(!importer_ops->move_notify &&
+			    !importer_ops->revoke_notify))
+			return ERR_PTR(-EINVAL);
+
+		if (WARN_ON(importer_ops->move_notify &&
+			    importer_ops->revoke_notify))
+			return ERR_PTR(-EINVAL);
+
+		if (!dmabuf->revoke_semantics && importer_ops->revoke_notify)
+			return ERR_PTR(-EINVAL);
+	}
 
 	attach = kzalloc(sizeof(*attach), GFP_KERNEL);
 	if (!attach)
@@ -1102,6 +1119,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
 	if (WARN_ON(!attach || !attach->dmabuf))
 		return ERR_PTR(-EINVAL);
 
+	if (attach->dmabuf->invalidate)
+		return ERR_PTR(-ENODEV);
+
 	dma_resv_assert_held(attach->dmabuf->resv);
 
 	if (dma_buf_pin_on_map(attach)) {
@@ -1261,8 +1281,16 @@ void dma_buf_move_notify(struct dma_buf *dmabuf)
 	dma_resv_assert_held(dmabuf->resv);
 
 	list_for_each_entry(attach, &dmabuf->attachments, node)
-		if (attach->importer_ops)
-			attach->importer_ops->move_notify(attach);
+		if (attach->importer_ops) {
+			if (attach->importer_ops->move_notify)
+				attach->importer_ops->move_notify(attach);
+
+			if (attach->importer_ops->revoke_notify)
+				attach->importer_ops->revoke_notify(attach);
+		}
+
+	if (dmabuf->revoke_semantics)
+		dmabuf->invalidate = true;
 }
 EXPORT_SYMBOL_NS_GPL(dma_buf_move_notify, "DMA_BUF");
 
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 0bc492090237..e198ee490151 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -23,6 +23,7 @@
 #include <linux/dma-fence.h>
 #include <linux/wait.h>
 #include <linux/pci-p2pdma.h>
+#include <linux/dma-resv.h>
 
 struct device;
 struct dma_buf;
@@ -441,6 +442,15 @@ struct dma_buf {
 		struct dma_buf *dmabuf;
 	} *sysfs_entry;
 #endif
+	/**
+	 * @revoke_semantics:
+	 *
+	 * This exporter implements revoke semantics.
+	 */
+	bool revoke_semantics;
+
+	/** @invalidate: this buffer was revoked and invalidated */
+	bool invalidate;
 };
 
 /**
@@ -476,6 +486,18 @@ struct dma_buf_attach_ops {
 	 * point to the new location of the DMA-buf.
 	 */
 	void (*move_notify)(struct dma_buf_attachment *attach);
+
+	/**
+	 * @revoke_notify: [optional] notification that the DMA-buf is revoking
+	 *
+	 * If this callback is provided the importer will invalidate the mappings.
+	 *
+	 * This callback is called with the lock of the reservation object
+	 * associated with the dma_buf held.
+	 *
+	 * New mappings shouldn't be created after this callback returns.
+	 */
+	void (*revoke_notify)(struct dma_buf_attachment *attach);
 };
 
 /**
@@ -516,6 +538,7 @@ struct dma_buf_attachment {
  * @size: Size of the buffer - invariant over the lifetime of the buffer
  * @flags: mode flags for the file
  * @resv: reservation-object, NULL to allocate default one
+ * @revoke_semantics: support revoke semantics
  * @priv: Attach private data of allocator to this buffer
  *
  * This structure holds the information required to export the buffer. Used
@@ -528,6 +551,7 @@ struct dma_buf_export_info {
 	size_t size;
 	int flags;
 	struct dma_resv *resv;
+	bool revoke_semantics;
 	void *priv;
 };
 
@@ -620,4 +644,11 @@ int dma_buf_vmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map);
 void dma_buf_vunmap_unlocked(struct dma_buf *dmabuf, struct iosys_map *map);
 struct dma_buf *dma_buf_iter_begin(void);
 struct dma_buf *dma_buf_iter_next(struct dma_buf *dmbuf);
+
+static inline void dma_buf_mark_valid(struct dma_buf *dma_buf)
+{
+	dma_resv_assert_held(dma_buf->resv);
+
+	dma_buf->invalidate = false;
+}
 #endif /* __DMA_BUF_H__ */

From: Leon Romanovsky leonro@nvidia.com
Remove the open-coded variant of revoke semantics and reuse the existing dma_buf_move_notify() and the newly introduced dma_buf_mark_valid() primitives.
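To illustrate the resulting state transitions, here is a user-space sketch, not kernel code. `lifecycle_buf` and the `model_*()` helpers are hypothetical names; the `resv_locked` flag merely stands in for holding dmabuf->resv, and the importer callbacks are omitted. It follows the flow of vfio_pci_dma_buf_move() after this patch for a single buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the dma-buf state touched by this series. */
struct lifecycle_buf {
	bool revoke_semantics;	/* exporter opted in at export time */
	bool invalidate;	/* buffer is currently revoked */
	bool resv_locked;	/* models holding the reservation lock */
};

/* Models the tail of dma_buf_move_notify() for a revoke-aware exporter. */
void model_move_notify(struct lifecycle_buf *buf)
{
	assert(buf->resv_locked);
	if (buf->revoke_semantics)
		buf->invalidate = true;
}

/* Models dma_buf_mark_valid(): clears the revoked state under the lock. */
void model_mark_valid(struct lifecycle_buf *buf)
{
	assert(buf->resv_locked);
	buf->invalidate = false;
}

/* Models the early check added to dma_buf_map_attachment(). */
int model_map(const struct lifecycle_buf *buf)
{
	return buf->invalidate ? -ENODEV : 0;
}

/* Models the per-buffer body of vfio_pci_dma_buf_move(vdev, revoked). */
void model_vfio_move(struct lifecycle_buf *buf, bool revoked)
{
	buf->resv_locked = true;
	if (revoked)
		model_move_notify(buf);
	else
		model_mark_valid(buf);
	buf->resv_locked = false;
}
```

In this model, mapping fails with -ENODEV between a `model_vfio_move(buf, true)` call and the next `model_vfio_move(buf, false)` call, which is the behaviour the removed priv->revoked flag used to provide inside the driver.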
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
 drivers/vfio/pci/vfio_pci_dmabuf.c | 27 ++++++---------------------
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index d4d0f7d08c53..d953bd4cd118 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -17,20 +17,14 @@ struct vfio_pci_dma_buf {
 	struct dma_buf_phys_vec *phys_vec;
 	struct p2pdma_provider *provider;
 	u32 nr_ranges;
-	u8 revoked : 1;
 };
 
 static int vfio_pci_dma_buf_attach(struct dma_buf *dmabuf,
 				   struct dma_buf_attachment *attachment)
 {
-	struct vfio_pci_dma_buf *priv = dmabuf->priv;
-
 	if (!attachment->peer2peer)
 		return -EOPNOTSUPP;
 
-	if (priv->revoked)
-		return -ENODEV;
-
 	return 0;
 }
 
@@ -42,9 +36,6 @@ vfio_pci_dma_buf_map(struct dma_buf_attachment *attachment,
 
 	dma_resv_assert_held(priv->dmabuf->resv);
 
-	if (priv->revoked)
-		return ERR_PTR(-ENODEV);
-
 	return dma_buf_phys_vec_to_sgt(attachment, priv->provider,
 				       priv->phys_vec, priv->nr_ranges,
 				       priv->size, dir);
@@ -90,8 +81,6 @@ static const struct dma_buf_ops vfio_pci_dmabuf_ops = {
  *
  * If this function succeeds the following are true:
  *  - There is one physical range and it is pointing to MMIO
- *  - When move_notify is called it means revoke, not move, vfio_dma_buf_map
- *    will fail if it is currently revoked
  */
 int vfio_pci_dma_buf_iommufd_map(struct dma_buf_attachment *attachment,
 				 struct dma_buf_phys_vec *phys)
@@ -104,9 +93,6 @@ int vfio_pci_dma_buf_iommufd_map(struct dma_buf_attachment *attachment,
 		return -EOPNOTSUPP;
 
 	priv = attachment->dmabuf->priv;
-	if (priv->revoked)
-		return -ENODEV;
-
 	/* More than one range to iommufd will require proper DMABUF support */
 	if (priv->nr_ranges != 1)
 		return -EOPNOTSUPP;
@@ -268,6 +254,7 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
 	exp_info.size = priv->size;
 	exp_info.flags = get_dma_buf.open_flags;
 	exp_info.priv = priv;
+	exp_info.revoke_semantics = true;
 
 	priv->dmabuf = dma_buf_export(&exp_info);
 	if (IS_ERR(priv->dmabuf)) {
@@ -279,7 +266,6 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
 	INIT_LIST_HEAD(&priv->dmabufs_elm);
 	down_write(&vdev->memory_lock);
 	dma_resv_lock(priv->dmabuf->resv, NULL);
-	priv->revoked = !__vfio_pci_memory_enabled(vdev);
 	list_add_tail(&priv->dmabufs_elm, &vdev->dmabufs);
 	dma_resv_unlock(priv->dmabuf->resv);
 	up_write(&vdev->memory_lock);
@@ -317,12 +303,12 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
 		if (!get_file_active(&priv->dmabuf->file))
 			continue;
 
-		if (priv->revoked != revoked) {
-			dma_resv_lock(priv->dmabuf->resv, NULL);
-			priv->revoked = revoked;
+		dma_resv_lock(priv->dmabuf->resv, NULL);
+		if (revoked)
 			dma_buf_move_notify(priv->dmabuf);
-			dma_resv_unlock(priv->dmabuf->resv);
-		}
+		else
+			dma_buf_mark_valid(priv->dmabuf);
+		dma_resv_unlock(priv->dmabuf->resv);
 		fput(priv->dmabuf->file);
 	}
 }
@@ -340,7 +326,6 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
 		dma_resv_lock(priv->dmabuf->resv, NULL);
 		list_del_init(&priv->dmabufs_elm);
 		priv->vdev = NULL;
-		priv->revoked = true;
 		dma_buf_move_notify(priv->dmabuf);
 		dma_resv_unlock(priv->dmabuf->resv);
 		vfio_device_put_registration(&vdev->vdev);

From: Leon Romanovsky leonro@nvidia.com
IOMMUFD does not support page fault handling, and after a call to .move_notify() all mappings become invalid. Ensure that the IOMMUFD DMABUF importer is bound to a revoke‑aware DMABUF exporter (for example, VFIO).
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
 drivers/iommu/iommufd/pages.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index dbe51ecb9a20..a233def71be0 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1451,7 +1451,7 @@ static void iopt_revoke_notify(struct dma_buf_attachment *attach)
 
 static struct dma_buf_attach_ops iopt_dmabuf_attach_revoke_ops = {
 	.allow_peer2peer = true,
-	.move_notify = iopt_revoke_notify,
+	.revoke_notify = iopt_revoke_notify,
 };
 
 /*

From: Leon Romanovsky leonro@nvidia.com
Test iommufd_test_dmabuf_revoke() with dma-buf revoke primitives.
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
 drivers/iommu/iommufd/selftest.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 550ff36dec3a..523dfac44ff8 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -1958,7 +1958,6 @@ void iommufd_selftest_destroy(struct iommufd_object *obj)
 struct iommufd_test_dma_buf {
 	void *memory;
 	size_t length;
-	bool revoked;
 };
 
 static int iommufd_test_dma_buf_attach(struct dma_buf *dmabuf,
@@ -2011,9 +2010,6 @@ int iommufd_test_dma_buf_iommufd_map(struct dma_buf_attachment *attachment,
 	if (attachment->dmabuf->ops != &iommufd_test_dmabuf_ops)
 		return -EOPNOTSUPP;
 
-	if (priv->revoked)
-		return -ENODEV;
-
 	phys->paddr = virt_to_phys(priv->memory);
 	phys->len = priv->length;
 	return 0;
@@ -2065,7 +2061,6 @@ static int iommufd_test_dmabuf_get(struct iommufd_ucmd *ucmd,
 static int iommufd_test_dmabuf_revoke(struct iommufd_ucmd *ucmd, int fd,
 				      bool revoked)
 {
-	struct iommufd_test_dma_buf *priv;
 	struct dma_buf *dmabuf;
 	int rc = 0;
 
@@ -2078,10 +2073,11 @@ static int iommufd_test_dmabuf_revoke(struct iommufd_ucmd *ucmd, int fd,
 		goto err_put;
 	}
 
-	priv = dmabuf->priv;
 	dma_resv_lock(dmabuf->resv, NULL);
-	priv->revoked = revoked;
-	dma_buf_move_notify(dmabuf);
+	if (revoked)
+		dma_buf_move_notify(dmabuf);
+	else
+		dma_buf_mark_valid(dmabuf);
 	dma_resv_unlock(dmabuf->resv);
 
 err_put: