Changelog:
v4:
 * Changed DMA_RESV_USAGE_KERNEL to DMA_RESV_USAGE_BOOKKEEP.
 * Made .invalidate_mapping() truly optional.
 * Added patch which renames dma_buf_move_notify() to be dma_buf_invalidate_mappings().
 * Restored dma_buf_attachment_is_dynamic() function.
v3: https://lore.kernel.org/all/20260120-dmabuf-revoke-v3-0-b7e0b07b8214@nvidia....
 * Used Jason's wording for commits and cover letter.
 * Removed IOMMUFD patch.
 * Renamed dma_buf_attachment_is_revoke() to be dma_buf_attach_revocable().
 * Added patch to remove CONFIG_DMABUF_MOVE_NOTIFY.
 * Added Reviewed-by tags.
 * Called dma_resv_wait_timeout() after dma_buf_move_notify() in VFIO.
 * Added dma_buf_attach_revocable() check to VFIO DMABUF attach function.
 * Slightly changed commit messages.
v2: https://patch.msgid.link/20260118-dmabuf-revoke-v2-0-a03bb27c0875@nvidia.com
 * Changed series to document the revoke semantics instead of implementing it.
v1: https://patch.msgid.link/20260111-dmabuf-revoke-v1-0-fb4bcc8c259b@nvidia.com
-------------------------------------------------------------------------
This series documents a dma-buf "revoke" mechanism: it allows a dma-buf exporter to explicitly invalidate ("kill") a shared buffer after it has been distributed to importers, so that further CPU and device access is prevented and importers reliably observe failure.
The change in this series is to properly document and use the existing core "revoked" state on the dma-buf object, together with a corresponding exporter-triggered revoke operation.
dma-buf has quietly allowed calling move_notify() on pinned dma-bufs, even though legacy importers using dma_buf_attach() simply ignore these calls.
RDMA saw this and needed to use allow_peer2peer=true, so it implemented a new-style pinned importer with an explicitly non-working move_notify() callback.
This has been tolerable because the existing exporters are thought to only call move_notify() on a pinned DMABUF under RAS events, and we have been willing to tolerate the UAF that results from allowing the importer to continue using the mapping in this rare case.
VFIO wants to implement a pin-supporting exporter that will issue a revoking move_notify() around FLRs and a few other user-triggerable operations. Since this is much more common, we are not willing to tolerate the security UAF caused by interworking with drivers that do not support move_notify(). Thus, until now, VFIO has required dynamic importers, even though it never actually moves the buffer location.
To allow VFIO to work with pinned importers, according to how dma-buf was intended, we need to allow VFIO to detect whether an importer is a legacy one, or an RDMA-style importer that does not actually implement move_notify().
In theory all exporters that call move_notify() on pinned dma-bufs should call this function; however, that would break a number of widely used NIC/GPU flows. Thus, for now, do not spread this further than VFIO until we understand how much of RDMA can implement the full semantics.
In the process clarify how move_notify is intended to be used with pinned dma-bufs.
Thanks
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
Leon Romanovsky (8):
  dma-buf: Rename .move_notify() callback to a clearer identifier
  dma-buf: Rename dma_buf_move_notify() to dma_buf_invalidate_mappings()
  dma-buf: Always build with DMABUF_MOVE_NOTIFY
  dma-buf: Make .invalidate_mapping() truly optional
  dma-buf: Add check function for revoke semantics
  iommufd: Pin dma-buf importer for revoke semantics
  vfio: Wait for dma-buf invalidation to complete
  vfio: Validate dma-buf revocation semantics
 drivers/dma-buf/Kconfig                     | 12 -------
 drivers/dma-buf/dma-buf.c                   | 53 ++++++++++++++++++++++-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 14 +++-----
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/Kconfig          |  2 +-
 drivers/gpu/drm/virtio/virtgpu_prime.c      |  2 +-
 drivers/gpu/drm/xe/tests/xe_dma_buf.c       |  7 ++--
 drivers/gpu/drm/xe/xe_bo.c                  |  2 +-
 drivers/gpu/drm/xe/xe_dma_buf.c             | 14 +++-----
 drivers/infiniband/core/umem_dmabuf.c       | 13 -------
 drivers/infiniband/hw/mlx5/mr.c             |  2 +-
 drivers/iommu/iommufd/pages.c               | 11 ++++--
 drivers/iommu/iommufd/selftest.c            |  2 +-
 drivers/vfio/pci/vfio_pci_dmabuf.c          | 13 +++++--
 include/linux/dma-buf.h                     |  9 ++---
 15 files changed, 84 insertions(+), 74 deletions(-)
---
base-commit: 9ace4753a5202b02191d54e9fdf7f9e3d02b85eb
change-id: 20251221-dmabuf-revoke-b90ef16e4236
Best regards,
--
Leon Romanovsky leonro@nvidia.com
From: Leon Romanovsky leonro@nvidia.com
Rename the .move_notify() callback to .invalidate_mappings() to make its purpose explicit and highlight that it is responsible for invalidating existing mappings.
Suggested-by: Christian König christian.koenig@amd.com
Reviewed-by: Christian König christian.koenig@amd.com
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
 drivers/dma-buf/dma-buf.c                   | 6 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 4 ++--
 drivers/gpu/drm/virtio/virtgpu_prime.c      | 2 +-
 drivers/gpu/drm/xe/tests/xe_dma_buf.c       | 6 +++---
 drivers/gpu/drm/xe/xe_dma_buf.c             | 2 +-
 drivers/infiniband/core/umem_dmabuf.c       | 4 ++--
 drivers/infiniband/hw/mlx5/mr.c             | 2 +-
 drivers/iommu/iommufd/pages.c               | 2 +-
 include/linux/dma-buf.h                     | 6 +++---
 9 files changed, 17 insertions(+), 17 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index edaa9e4ee4ae..59cc647bf40e 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -948,7 +948,7 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
 	if (WARN_ON(!dmabuf || !dev))
 		return ERR_PTR(-EINVAL);
 
-	if (WARN_ON(importer_ops && !importer_ops->move_notify))
+	if (WARN_ON(importer_ops && !importer_ops->invalidate_mappings))
 		return ERR_PTR(-EINVAL);
 
 	attach = kzalloc(sizeof(*attach), GFP_KERNEL);
@@ -1055,7 +1055,7 @@ EXPORT_SYMBOL_NS_GPL(dma_buf_pin, "DMA_BUF");
  *
  * This unpins a buffer pinned by dma_buf_pin() and allows the exporter to move
  * any mapping of @attach again and inform the importer through
- * &dma_buf_attach_ops.move_notify.
+ * &dma_buf_attach_ops.invalidate_mappings.
  */
 void dma_buf_unpin(struct dma_buf_attachment *attach)
 {
@@ -1262,7 +1262,7 @@ void dma_buf_move_notify(struct dma_buf *dmabuf)
 
 	list_for_each_entry(attach, &dmabuf->attachments, node)
 		if (attach->importer_ops)
-			attach->importer_ops->move_notify(attach);
+			attach->importer_ops->invalidate_mappings(attach);
 }
 EXPORT_SYMBOL_NS_GPL(dma_buf_move_notify, "DMA_BUF");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index e22cfa7c6d32..863454148b28 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -450,7 +450,7 @@ amdgpu_dma_buf_create_obj(struct drm_device *dev, struct dma_buf *dma_buf)
 }
 
 /**
- * amdgpu_dma_buf_move_notify - &attach.move_notify implementation
+ * amdgpu_dma_buf_move_notify - &attach.invalidate_mappings implementation
  *
  * @attach: the DMA-buf attachment
  *
@@ -521,7 +521,7 @@ amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
 
 static const struct dma_buf_attach_ops amdgpu_dma_buf_attach_ops = {
 	.allow_peer2peer = true,
-	.move_notify = amdgpu_dma_buf_move_notify
+	.invalidate_mappings = amdgpu_dma_buf_move_notify
 };
 /**
diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
index ce49282198cb..19c78dd2ca77 100644
--- a/drivers/gpu/drm/virtio/virtgpu_prime.c
+++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
@@ -288,7 +288,7 @@ static void virtgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
 
 static const struct dma_buf_attach_ops virtgpu_dma_buf_attach_ops = {
 	.allow_peer2peer = true,
-	.move_notify = virtgpu_dma_buf_move_notify
+	.invalidate_mappings = virtgpu_dma_buf_move_notify
 };
 struct drm_gem_object *virtgpu_gem_prime_import(struct drm_device *dev,
diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
index 5df98de5ba3c..1f2cca5c2f81 100644
--- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
@@ -23,7 +23,7 @@ static bool p2p_enabled(struct dma_buf_test_params *params)
 static bool is_dynamic(struct dma_buf_test_params *params)
 {
 	return IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY) && params->attach_ops &&
-	       params->attach_ops->move_notify;
+	       params->attach_ops->invalidate_mappings;
 }
 
 static void check_residency(struct kunit *test, struct xe_bo *exported,
@@ -60,7 +60,7 @@ static void check_residency(struct kunit *test, struct xe_bo *exported,
 
 	/*
 	 * Evict exporter. Evicting the exported bo will
-	 * evict also the imported bo through the move_notify() functionality if
+	 * evict also the imported bo through the invalidate_mappings() functionality if
 	 * importer is on a different device. If they're on the same device,
 	 * the exporter and the importer should be the same bo.
 	 */
@@ -198,7 +198,7 @@ static void xe_test_dmabuf_import_same_driver(struct xe_device *xe)
 
 static const struct dma_buf_attach_ops nop2p_attach_ops = {
 	.allow_peer2peer = false,
-	.move_notify = xe_dma_buf_move_notify
+	.invalidate_mappings = xe_dma_buf_move_notify
 };
 /*
diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index 7c74a31d4486..1b9cd043e517 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -287,7 +287,7 @@ static void xe_dma_buf_move_notify(struct dma_buf_attachment *attach)
 
 static const struct dma_buf_attach_ops xe_dma_buf_attach_ops = {
 	.allow_peer2peer = true,
-	.move_notify = xe_dma_buf_move_notify
+	.invalidate_mappings = xe_dma_buf_move_notify
 };
 #if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST)
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index 0ec2e4120cc9..d77a739cfe7a 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -129,7 +129,7 @@ ib_umem_dmabuf_get_with_dma_device(struct ib_device *device,
 	if (check_add_overflow(offset, (unsigned long)size, &end))
 		return ret;
 
-	if (unlikely(!ops || !ops->move_notify))
+	if (unlikely(!ops || !ops->invalidate_mappings))
 		return ret;
 
 	dmabuf = dma_buf_get(fd);
@@ -195,7 +195,7 @@ ib_umem_dmabuf_unsupported_move_notify(struct dma_buf_attachment *attach)
 
 static struct dma_buf_attach_ops ib_umem_dmabuf_attach_pinned_ops = {
 	.allow_peer2peer = true,
-	.move_notify = ib_umem_dmabuf_unsupported_move_notify,
+	.invalidate_mappings = ib_umem_dmabuf_unsupported_move_notify,
 };
 struct ib_umem_dmabuf *
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 325fa04cbe8a..97099d3b1688 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1620,7 +1620,7 @@ static void mlx5_ib_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
 
 static struct dma_buf_attach_ops mlx5_ib_dmabuf_attach_ops = {
 	.allow_peer2peer = 1,
-	.move_notify = mlx5_ib_dmabuf_invalidate_cb,
+	.invalidate_mappings = mlx5_ib_dmabuf_invalidate_cb,
 };
 static struct ib_mr *
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index dbe51ecb9a20..76f900fa1687 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1451,7 +1451,7 @@ static void iopt_revoke_notify(struct dma_buf_attachment *attach)
 
 static struct dma_buf_attach_ops iopt_dmabuf_attach_revoke_ops = {
 	.allow_peer2peer = true,
-	.move_notify = iopt_revoke_notify,
+	.invalidate_mappings = iopt_revoke_notify,
 };
 /*
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 0bc492090237..1b397635c793 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -407,7 +407,7 @@ struct dma_buf {
 	 * through the device.
 	 *
 	 * - Dynamic importers should set fences for any access that they can't
-	 *   disable immediately from their &dma_buf_attach_ops.move_notify
+	 *   disable immediately from their &dma_buf_attach_ops.invalidate_mappings
 	 *   callback.
 	 *
 	 * IMPORTANT:
@@ -458,7 +458,7 @@ struct dma_buf_attach_ops {
 	bool allow_peer2peer;
 
 	/**
-	 * @move_notify: [optional] notification that the DMA-buf is moving
+	 * @invalidate_mappings: [optional] notification that the DMA-buf is moving
 	 *
 	 * If this callback is provided the framework can avoid pinning the
 	 * backing store while mappings exists.
@@ -475,7 +475,7 @@ struct dma_buf_attach_ops {
 	 * New mappings can be created after this callback returns, and will
 	 * point to the new location of the DMA-buf.
 	 */
-	void (*move_notify)(struct dma_buf_attachment *attach);
+	void (*invalidate_mappings)(struct dma_buf_attachment *attach);
 };
/**
From: Leon Romanovsky leonro@nvidia.com
Along with renaming the .move_notify() callback, rename the corresponding dma-buf core function. This makes the expected behavior clear to exporters calling this function.
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
 drivers/dma-buf/dma-buf.c                  | 8 ++++----
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
 drivers/gpu/drm/xe/xe_bo.c                 | 2 +-
 drivers/iommu/iommufd/selftest.c           | 2 +-
 drivers/vfio/pci/vfio_pci_dmabuf.c         | 4 ++--
 include/linux/dma-buf.h                    | 2 +-
 6 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 59cc647bf40e..e12db540c413 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -912,7 +912,7 @@ dma_buf_pin_on_map(struct dma_buf_attachment *attach)
  * 3. Exporters must hold the dma-buf reservation lock when calling these
  *    functions:
  *
- *    - dma_buf_move_notify()
+ *    - dma_buf_invalidate_mappings()
  */
 
 /**
@@ -1247,14 +1247,14 @@ void dma_buf_unmap_attachment_unlocked(struct dma_buf_attachment *attach,
 EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_attachment_unlocked, "DMA_BUF");
 
 /**
- * dma_buf_move_notify - notify attachments that DMA-buf is moving
+ * dma_buf_invalidate_mappings - notify attachments that DMA-buf is moving
  *
  * @dmabuf: [in] buffer which is moving
  *
  * Informs all attachments that they need to destroy and recreate all their
  * mappings.
  */
-void dma_buf_move_notify(struct dma_buf *dmabuf)
+void dma_buf_invalidate_mappings(struct dma_buf *dmabuf)
 {
 	struct dma_buf_attachment *attach;
 
@@ -1264,7 +1264,7 @@ void dma_buf_move_notify(struct dma_buf *dmabuf)
 		if (attach->importer_ops)
 			attach->importer_ops->invalidate_mappings(attach);
 }
-EXPORT_SYMBOL_NS_GPL(dma_buf_move_notify, "DMA_BUF");
+EXPORT_SYMBOL_NS_GPL(dma_buf_invalidate_mappings, "DMA_BUF");
 /**
  * DOC: cpu access
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index e08f58de4b17..f73dc99d1887 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1270,7 +1270,7 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
 
 	if (abo->tbo.base.dma_buf && !drm_gem_is_imported(&abo->tbo.base) &&
 	    old_mem && old_mem->mem_type != TTM_PL_SYSTEM)
-		dma_buf_move_notify(abo->tbo.base.dma_buf);
+		dma_buf_invalidate_mappings(abo->tbo.base.dma_buf);
 
 	/* move_notify is called before move happens */
 	trace_amdgpu_bo_move(abo, new_mem ? new_mem->mem_type : -1,
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index bf4ee976b680..7d02cd9a8501 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -819,7 +819,7 @@ static int xe_bo_move_notify(struct xe_bo *bo,
 
 	/* Don't call move_notify() for imported dma-bufs. */
 	if (ttm_bo->base.dma_buf && !ttm_bo->base.import_attach)
-		dma_buf_move_notify(ttm_bo->base.dma_buf);
+		dma_buf_invalidate_mappings(ttm_bo->base.dma_buf);
 
 	/*
 	 * TTM has already nuked the mmap for us (see ttm_bo_unmap_virtual),
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 550ff36dec3a..f60cbd5328cc 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -2081,7 +2081,7 @@ static int iommufd_test_dmabuf_revoke(struct iommufd_ucmd *ucmd, int fd,
 	priv = dmabuf->priv;
 	dma_resv_lock(dmabuf->resv, NULL);
 	priv->revoked = revoked;
-	dma_buf_move_notify(dmabuf);
+	dma_buf_invalidate_mappings(dmabuf);
 	dma_resv_unlock(dmabuf->resv);
 
 err_put:
diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index d4d0f7d08c53..362e3d149817 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -320,7 +320,7 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
 		if (priv->revoked != revoked) {
 			dma_resv_lock(priv->dmabuf->resv, NULL);
 			priv->revoked = revoked;
-			dma_buf_move_notify(priv->dmabuf);
+			dma_buf_invalidate_mappings(priv->dmabuf);
 			dma_resv_unlock(priv->dmabuf->resv);
 		}
 		fput(priv->dmabuf->file);
@@ -341,7 +341,7 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
 		list_del_init(&priv->dmabufs_elm);
 		priv->vdev = NULL;
 		priv->revoked = true;
-		dma_buf_move_notify(priv->dmabuf);
+		dma_buf_invalidate_mappings(priv->dmabuf);
 		dma_resv_unlock(priv->dmabuf->resv);
 		vfio_device_put_registration(&vdev->vdev);
 		fput(priv->dmabuf->file);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 1b397635c793..d5c3ce2b3aa4 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -600,7 +600,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
 					enum dma_data_direction);
 void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
 			      enum dma_data_direction);
-void dma_buf_move_notify(struct dma_buf *dma_buf);
+void dma_buf_invalidate_mappings(struct dma_buf *dma_buf);
 int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
 			     enum dma_data_direction dir);
 int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
From: Leon Romanovsky leonro@nvidia.com
DMABUF_MOVE_NOTIFY was introduced in 2018 and has been marked as experimental and disabled by default ever since. Six years later, all new importers implement this callback.
It is therefore reasonable to drop CONFIG_DMABUF_MOVE_NOTIFY and always build DMABUF with support for it enabled.
Suggested-by: Christian König christian.koenig@amd.com
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
 drivers/dma-buf/Kconfig                     | 12 ------------
 drivers/dma-buf/dma-buf.c                   |  3 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 10 +++-------
 drivers/gpu/drm/amd/amdkfd/Kconfig          |  2 +-
 drivers/gpu/drm/xe/tests/xe_dma_buf.c       |  3 +--
 drivers/gpu/drm/xe/xe_dma_buf.c             | 12 ++++--------
 6 files changed, 10 insertions(+), 32 deletions(-)
diff --git a/drivers/dma-buf/Kconfig b/drivers/dma-buf/Kconfig
index b46eb8a552d7..84d5e9b24e20 100644
--- a/drivers/dma-buf/Kconfig
+++ b/drivers/dma-buf/Kconfig
@@ -40,18 +40,6 @@ config UDMABUF
 	  A driver to let userspace turn memfd regions into dma-bufs.
 	  Qemu can use this to create host dmabufs for guest framebuffers.
 
-config DMABUF_MOVE_NOTIFY
-	bool "Move notify between drivers (EXPERIMENTAL)"
-	default n
-	depends on DMA_SHARED_BUFFER
-	help
-	  Don't pin buffers if the dynamic DMA-buf interface is available on
-	  both the exporter as well as the importer. This fixes a security
-	  problem where userspace is able to pin unrestricted amounts of memory
-	  through DMA-buf.
-	  This is marked experimental because we don't yet have a consistent
-	  execution context and memory management between drivers.
-
 config DMABUF_DEBUG
 	bool "DMA-BUF debug checks"
 	depends on DMA_SHARED_BUFFER
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index e12db540c413..cd68c1c0bfd7 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -847,8 +847,7 @@ static bool
 dma_buf_pin_on_map(struct dma_buf_attachment *attach)
 {
 	return attach->dmabuf->ops->pin &&
-	       (!dma_buf_attachment_is_dynamic(attach) ||
-		!IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY));
+	       !dma_buf_attachment_is_dynamic(attach);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 863454148b28..349215549e8f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -145,13 +145,9 @@ static int amdgpu_dma_buf_pin(struct dma_buf_attachment *attach)
 	 * notifiers are disabled, only allow pinning in VRAM when move
 	 * notiers are enabled.
 	 */
-	if (!IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY)) {
-		domains &= ~AMDGPU_GEM_DOMAIN_VRAM;
-	} else {
-		list_for_each_entry(attach, &dmabuf->attachments, node)
-			if (!attach->peer2peer)
-				domains &= ~AMDGPU_GEM_DOMAIN_VRAM;
-	}
+	list_for_each_entry(attach, &dmabuf->attachments, node)
+		if (!attach->peer2peer)
+			domains &= ~AMDGPU_GEM_DOMAIN_VRAM;
 
 	if (domains & AMDGPU_GEM_DOMAIN_VRAM)
 		bo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig
index 16e12c9913f9..a5d7467c2f34 100644
--- a/drivers/gpu/drm/amd/amdkfd/Kconfig
+++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
@@ -27,7 +27,7 @@ config HSA_AMD_SVM
 
 config HSA_AMD_P2P
 	bool "HSA kernel driver support for peer-to-peer for AMD GPU devices"
-	depends on HSA_AMD && PCI_P2PDMA && DMABUF_MOVE_NOTIFY
+	depends on HSA_AMD && PCI_P2PDMA
 	help
 	  Enable peer-to-peer (P2P) communication between AMD GPUs over
 	  the PCIe bus. This can improve performance of multi-GPU compute
diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
index 1f2cca5c2f81..c107687ef3c0 100644
--- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c
@@ -22,8 +22,7 @@ static bool p2p_enabled(struct dma_buf_test_params *params)
 
 static bool is_dynamic(struct dma_buf_test_params *params)
 {
-	return IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY) && params->attach_ops &&
-	       params->attach_ops->invalidate_mappings;
+	return params->attach_ops && params->attach_ops->invalidate_mappings;
 }
 
 static void check_residency(struct kunit *test, struct xe_bo *exported,
diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c
index 1b9cd043e517..ea370cd373e9 100644
--- a/drivers/gpu/drm/xe/xe_dma_buf.c
+++ b/drivers/gpu/drm/xe/xe_dma_buf.c
@@ -56,14 +56,10 @@ static int xe_dma_buf_pin(struct dma_buf_attachment *attach)
 	bool allow_vram = true;
 	int ret;
 
-	if (!IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY)) {
-		allow_vram = false;
-	} else {
-		list_for_each_entry(attach, &dmabuf->attachments, node) {
-			if (!attach->peer2peer) {
-				allow_vram = false;
-				break;
-			}
+	list_for_each_entry(attach, &dmabuf->attachments, node) {
+		if (!attach->peer2peer) {
+			allow_vram = false;
+			break;
 		}
 	}
From: Leon Romanovsky leonro@nvidia.com
The .invalidate_mapping() callback is documented as optional, yet it effectively became mandatory whenever importer_ops were provided. This led to cases where RDMA non-ODP code had to supply an empty stub.
Relax the checks in the dma-buf core so the callback can be omitted, allowing RDMA code to drop the unnecessary function.
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
 drivers/dma-buf/dma-buf.c             |  6 ++----
 drivers/infiniband/core/umem_dmabuf.c | 13 -------------
 2 files changed, 2 insertions(+), 17 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index cd68c1c0bfd7..1629312d364a 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -947,9 +947,6 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
 	if (WARN_ON(!dmabuf || !dev))
 		return ERR_PTR(-EINVAL);
 
-	if (WARN_ON(importer_ops && !importer_ops->invalidate_mappings))
-		return ERR_PTR(-EINVAL);
-
 	attach = kzalloc(sizeof(*attach), GFP_KERNEL);
 	if (!attach)
 		return ERR_PTR(-ENOMEM);
@@ -1260,7 +1257,8 @@ void dma_buf_invalidate_mappings(struct dma_buf *dmabuf)
 	dma_resv_assert_held(dmabuf->resv);
 
 	list_for_each_entry(attach, &dmabuf->attachments, node)
-		if (attach->importer_ops)
+		if (attach->importer_ops &&
+		    attach->importer_ops->invalidate_mappings)
 			attach->importer_ops->invalidate_mappings(attach);
 }
 EXPORT_SYMBOL_NS_GPL(dma_buf_invalidate_mappings, "DMA_BUF");
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index d77a739cfe7a..256e34c15e6b 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -129,9 +129,6 @@ ib_umem_dmabuf_get_with_dma_device(struct ib_device *device,
 	if (check_add_overflow(offset, (unsigned long)size, &end))
 		return ret;
 
-	if (unlikely(!ops || !ops->invalidate_mappings))
-		return ret;
-
 	dmabuf = dma_buf_get(fd);
 	if (IS_ERR(dmabuf))
 		return ERR_CAST(dmabuf);
@@ -184,18 +181,8 @@ struct ib_umem_dmabuf *ib_umem_dmabuf_get(struct ib_device *device,
 }
 EXPORT_SYMBOL(ib_umem_dmabuf_get);
 
-static void
-ib_umem_dmabuf_unsupported_move_notify(struct dma_buf_attachment *attach)
-{
-	struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
-
-	ibdev_warn_ratelimited(umem_dmabuf->umem.ibdev,
-			       "Invalidate callback should not be called when memory is pinned\n");
-}
-
 static struct dma_buf_attach_ops ib_umem_dmabuf_attach_pinned_ops = {
 	.allow_peer2peer = true,
-	.invalidate_mappings = ib_umem_dmabuf_unsupported_move_notify,
 };
struct ib_umem_dmabuf *
From: Leon Romanovsky leonro@nvidia.com
A DMA-buf revoke mechanism allows an exporter to explicitly invalidate ("kill") a shared buffer after it has been handed out to importers. Once revoked, all further CPU and device access is blocked, and importers consistently observe failure.
This requires both importers and exporters to honor the revoke contract.
For importers, this means implementing .invalidate_mappings(). For exporters, this means implementing the .pin() and/or .attach() callback, which checks the dma-buf attachment for a valid revoke implementation.
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
 drivers/dma-buf/dma-buf.c | 32 +++++++++++++++++++++++++++++++-
 include/linux/dma-buf.h   |  1 +
 2 files changed, 32 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 1629312d364a..20fef3fb3bdf 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -1242,13 +1242,43 @@ void dma_buf_unmap_attachment_unlocked(struct dma_buf_attachment *attach,
 }
 EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_attachment_unlocked, "DMA_BUF");
 
+/**
+ * dma_buf_attach_revocable - check if a DMA-buf importer implements
+ * revoke semantics.
+ * @attach: the DMA-buf attachment to check
+ *
+ * Returns true if the DMA-buf importer can handle invalidating its mappings
+ * at any time, even after pinning a buffer.
+ */
+bool dma_buf_attach_revocable(struct dma_buf_attachment *attach)
+{
+	return attach->importer_ops &&
+	       attach->importer_ops->invalidate_mappings;
+}
+EXPORT_SYMBOL_NS_GPL(dma_buf_attach_revocable, "DMA_BUF");
+
 /**
  * dma_buf_invalidate_mappings - notify attachments that DMA-buf is moving
  *
  * @dmabuf: [in] buffer which is moving
  *
  * Informs all attachments that they need to destroy and recreate all their
- * mappings.
+ * mappings. If the attachment is dynamic then the dynamic importer is expected
+ * to invalidate any caches it has of the mapping result and perform a new
+ * mapping request before allowing HW to do any further DMA.
+ *
+ * If the attachment is pinned then this informs the pinned importer that
+ * the underlying mapping is no longer available. Pinned importers may take
+ * this as a permanent revocation, so exporters should not trigger it
+ * lightly.
+ *
+ * For legacy pinned importers that cannot support invalidation this is a NOP.
+ * Drivers can call dma_buf_attach_revocable() to determine if the importer
+ * supports this.
+ *
+ * NOTE: The invalidation triggers asynchronous HW operations, and callers
+ * need to wait for them to complete by calling dma_resv_wait_timeout().
  */
 void dma_buf_invalidate_mappings(struct dma_buf *dmabuf)
 {
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index d5c3ce2b3aa4..2aa9c7d08abb 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -601,6 +601,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
 void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
 			      enum dma_data_direction);
 void dma_buf_invalidate_mappings(struct dma_buf *dma_buf);
+bool dma_buf_attach_revocable(struct dma_buf_attachment *attach);
 int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
 			     enum dma_data_direction dir);
 int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
From: Leon Romanovsky leonro@nvidia.com
IOMMUFD does not support page fault handling, and after a call to .invalidate_mappings() all mappings become invalid. Ensure that the IOMMUFD dma-buf importer is bound to a revoke-aware dma-buf exporter (for example, VFIO).
Acked-by: Christian König christian.koenig@amd.com
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
 drivers/iommu/iommufd/pages.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 76f900fa1687..a5eb2bc4ef48 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1501,16 +1501,22 @@ static int iopt_map_dmabuf(struct iommufd_ctx *ictx, struct iopt_pages *pages,
 		mutex_unlock(&pages->mutex);
 	}
 
-	rc = sym_vfio_pci_dma_buf_iommufd_map(attach, &pages->dmabuf.phys);
+	rc = dma_buf_pin(attach);
 	if (rc)
 		goto err_detach;
 
+	rc = sym_vfio_pci_dma_buf_iommufd_map(attach, &pages->dmabuf.phys);
+	if (rc)
+		goto err_unpin;
+
 	dma_resv_unlock(dmabuf->resv);
 
 	/* On success iopt_release_pages() will detach and put the dmabuf. */
 	pages->dmabuf.attach = attach;
 	return 0;
 
+err_unpin:
+	dma_buf_unpin(attach);
 err_detach:
 	dma_resv_unlock(dmabuf->resv);
 	dma_buf_detach(dmabuf, attach);
@@ -1656,6 +1662,7 @@ void iopt_release_pages(struct kref *kref)
 	if (iopt_is_dmabuf(pages) && pages->dmabuf.attach) {
 		struct dma_buf *dmabuf = pages->dmabuf.attach->dmabuf;
 
+		dma_buf_unpin(pages->dmabuf.attach);
 		dma_buf_detach(dmabuf, pages->dmabuf.attach);
 		dma_buf_put(dmabuf);
 		WARN_ON(!list_empty(&pages->dmabuf.tracker));
From: Leon Romanovsky leonro@nvidia.com
dma-buf invalidation is performed asynchronously by hardware, so VFIO must wait until all affected objects have been fully invalidated.
Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
 drivers/vfio/pci/vfio_pci_dmabuf.c | 6 ++++++
 1 file changed, 6 insertions(+)
diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index 362e3d149817..5fceefc40e27 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -321,6 +321,9 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
 			dma_resv_lock(priv->dmabuf->resv, NULL);
 			priv->revoked = revoked;
 			dma_buf_invalidate_mappings(priv->dmabuf);
+			dma_resv_wait_timeout(priv->dmabuf->resv,
+					      DMA_RESV_USAGE_BOOKKEEP, false,
+					      MAX_SCHEDULE_TIMEOUT);
 			dma_resv_unlock(priv->dmabuf->resv);
 		}
 		fput(priv->dmabuf->file);
@@ -342,6 +345,9 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
 		priv->vdev = NULL;
 		priv->revoked = true;
 		dma_buf_invalidate_mappings(priv->dmabuf);
+		dma_resv_wait_timeout(priv->dmabuf->resv,
+				      DMA_RESV_USAGE_BOOKKEEP, false,
+				      MAX_SCHEDULE_TIMEOUT);
 		dma_resv_unlock(priv->dmabuf->resv);
 		vfio_device_put_registration(&vdev->vdev);
 		fput(priv->dmabuf->file);
From: Leon Romanovsky leonro@nvidia.com
Use the new dma_buf_attach_revocable() helper to restrict attachments to importers that support mapping invalidation.
Signed-off-by: Leon Romanovsky leonro@nvidia.com
---
 drivers/vfio/pci/vfio_pci_dmabuf.c | 3 +++
 1 file changed, 3 insertions(+)
diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index 5fceefc40e27..85056a5a3faf 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -31,6 +31,9 @@ static int vfio_pci_dma_buf_attach(struct dma_buf *dmabuf,
 	if (priv->revoked)
 		return -ENODEV;
 
+	if (!dma_buf_attach_revocable(attachment))
+		return -EOPNOTSUPP;
+
 	return 0;
 }
On Wed, Jan 21, 2026 at 02:59:16PM +0200, Leon Romanovsky wrote:
From: Leon Romanovsky leonro@nvidia.com
Use the new dma_buf_attach_revocable() helper to restrict attachments to importers that support mapping invalidation.
Signed-off-by: Leon Romanovsky leonro@nvidia.com
 drivers/vfio/pci/vfio_pci_dmabuf.c | 3 +++
 1 file changed, 3 insertions(+)
diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c index 5fceefc40e27..85056a5a3faf 100644 --- a/drivers/vfio/pci/vfio_pci_dmabuf.c +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c @@ -31,6 +31,9 @@ static int vfio_pci_dma_buf_attach(struct dma_buf *dmabuf, if (priv->revoked) return -ENODEV;
- if (!dma_buf_attach_revocable(attachment))
return -EOPNOTSUPP;- return 0;
}
We need to push an urgent -rc fix to implement a pin function here that always fails. That was missed and it means things like rdma can import vfio when the intention was to block that. It would be bad for that uAPI mistake to reach a released kernel.
It's tricky that NULL pin ops means "I support pin" :|
Jason
On 1/21/26 14:47, Jason Gunthorpe wrote:
On Wed, Jan 21, 2026 at 02:59:16PM +0200, Leon Romanovsky wrote:
From: Leon Romanovsky <leonro@nvidia.com>
Use the new dma_buf_attach_revocable() helper to restrict attachments to importers that support mapping invalidation.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/vfio/pci/vfio_pci_dmabuf.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index 5fceefc40e27..85056a5a3faf 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -31,6 +31,9 @@ static int vfio_pci_dma_buf_attach(struct dma_buf *dmabuf,
 	if (priv->revoked)
 		return -ENODEV;
 
+	if (!dma_buf_attach_revocable(attachment))
+		return -EOPNOTSUPP;
+
 	return 0;
 }
We need to push an urgent -rc fix to implement a pin function here that always fails. That was missed and it means things like rdma can import vfio when the intention was to block that. It would be bad for that uAPI mistake to reach a released kernel.
It's tricky that NULL pin ops means "I support pin" :|
Well it means: "I have no memory management and my buffers are always pinned.".
Christian.
Jason
On Wed, Jan 21, 2026 at 09:47:12AM -0400, Jason Gunthorpe wrote:
On Wed, Jan 21, 2026 at 02:59:16PM +0200, Leon Romanovsky wrote:
From: Leon Romanovsky <leonro@nvidia.com>
Use the new dma_buf_attach_revocable() helper to restrict attachments to importers that support mapping invalidation.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/vfio/pci/vfio_pci_dmabuf.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index 5fceefc40e27..85056a5a3faf 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -31,6 +31,9 @@ static int vfio_pci_dma_buf_attach(struct dma_buf *dmabuf,
 	if (priv->revoked)
 		return -ENODEV;
 
+	if (!dma_buf_attach_revocable(attachment))
+		return -EOPNOTSUPP;
+
 	return 0;
 }
We need to push an urgent -rc fix to implement a pin function here that always fails. That was missed and it means things like rdma can import vfio when the intention was to block that. It would be bad for that uAPI mistake to reach a released kernel.
I don't see any urgency here. In the current kernel, the RDMA importer prints a warning to indicate it was attached to the wrong exporter. VFIO also invokes dma_buf_move_notify().
With this series, we finally remove that warning.
Let's focus on getting this series merged.
Thanks
It's tricky that NULL pin ops means "I support pin" :|
Jason
On Wed, Jan 21, 2026 at 04:47:01PM +0200, Leon Romanovsky wrote:
We need to push an urgent -rc fix to implement a pin function here that always fails. That was missed and it means things like rdma can import vfio when the intention was to block that. It would be bad for that uAPI mistake to reach a released kernel.
I don't see any urgency here. In the current kernel, the RDMA importer prints a warning to indicate it was attached to the wrong exporter. VFIO also invokes dma_buf_move_notify().
The design of vfio was always that it must not work with RDMA because we cannot tolerate the errors that happen due to ignoring the move_notify.
The entire purpose of this series could be stated as continuing to block RDMA while opening up other pinning users.
So it must be addressed urgently before someone builds an application relying on this connection.
Jason
On Wed, Jan 21, 2026 at 11:41:37AM -0400, Jason Gunthorpe wrote:
On Wed, Jan 21, 2026 at 04:47:01PM +0200, Leon Romanovsky wrote:
We need to push an urgent -rc fix to implement a pin function here that always fails. That was missed and it means things like rdma can import vfio when the intention was to block that. It would be bad for that uAPI mistake to reach a released kernel.
I don't see any urgency here. In the current kernel, the RDMA importer prints a warning to indicate it was attached to the wrong exporter. VFIO also invokes dma_buf_move_notify().
The design of vfio was always that it must not work with RDMA because we cannot tolerate the errors that happen due to ignoring the move_notify.
The entire purpose of this series could be stated as continuing to block RDMA while opening up other pinning users.
So it must be addressed urgently before someone builds an application relying on this connection.
Done, https://lore.kernel.org/all/20260121-vfio-add-pin-v1-1-4e04916b17f1@nvidia.c...
Thanks
Jason