On Thu, Jul 24, 2025 at 05:13:49AM +0000, Kasireddy, Vivek wrote:
Hi Leon,
Subject: [PATCH 10/10] vfio/pci: Add dma-buf export support for MMIO regions
From: Leon Romanovsky leonro@nvidia.com
Add support for exporting PCI device MMIO regions through dma-buf, enabling safe sharing of non-struct page memory with controlled lifetime management. This allows RDMA and other subsystems to import dma-buf FDs and build them into memory regions for PCI P2P operations.
The implementation provides a revocable attachment mechanism using dma-buf move operations. MMIO regions are normally pinned as BARs don't change physical addresses, but access is revoked when the VFIO device is closed or a PCI reset is issued. This ensures kernel self-defense against potentially hostile userspace.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com Signed-off-by: Leon Romanovsky leonro@nvidia.com
drivers/vfio/pci/Kconfig | 20 ++ drivers/vfio/pci/Makefile | 2 + drivers/vfio/pci/vfio_pci_config.c | 22 +- drivers/vfio/pci/vfio_pci_core.c | 25 ++- drivers/vfio/pci/vfio_pci_dmabuf.c | 321 +++++++++++++++++++++++++++++ drivers/vfio/pci/vfio_pci_priv.h | 23 +++ include/linux/dma-buf.h | 1 + include/linux/vfio_pci_core.h | 3 + include/uapi/linux/vfio.h | 19 ++ 9 files changed, 431 insertions(+), 5 deletions(-) create mode 100644 drivers/vfio/pci/vfio_pci_dmabuf.c
<...>
+static int validate_dmabuf_input(struct vfio_pci_core_device *vdev,
struct vfio_device_feature_dma_buf *dma_buf)
+{
- struct pci_dev *pdev = vdev->pdev;
- u32 bar = dma_buf->region_index;
- u64 offset = dma_buf->offset;
- u64 len = dma_buf->length;
- resource_size_t bar_size;
- u64 sum;
- /*
* For PCI the region_index is the BAR number like everything else.
*/
- if (bar >= VFIO_PCI_ROM_REGION_INDEX)
return -ENODEV;
<...>
+/**
- Upon VFIO_DEVICE_FEATURE_GET create a dma_buf fd for the
- regions selected.
- open_flags are the typical flags passed to open(2), eg O_RDWR,
O_CLOEXEC,
- etc. offset/length specify a slice of the region to create the dmabuf from.
- nr_ranges is the total number of (P2P DMA) ranges that comprise the
dmabuf.
Any particular reason why you dropped the option (nr_ranges) of creating a single dmabuf from multiple ranges of an MMIO region?
I did it for two reasons. First, I wanted to simplify the code in order to speed-up discussion over the patchset itself. Second, I failed to find justification for need of multiple ranges, as the number of BARs are limited by VFIO_PCI_ROM_REGION_INDEX (6) and same functionality can be achieved by multiple calls to DMABUF import.
Restricting the dmabuf to a single range (or having to create multiple dmabufs to represent multiple regions/ranges associated with a single scattered buffer) would be very limiting and may not work in all cases. For instance, in my use-case, I am trying to share a large (4k mode) framebuffer (FB) located in GPU's VRAM between two (p2p compatible) GPU devices. And, this would probably not work given that allocating a large contiguous FB (nr_ranges = 1) in VRAM may not be feasible when there is memory pressure.
Can you please help me and point to the place in the code where this can fail? I'm probably missing something basic as there are no large allocations in the current patchset.
Furthermore, since you are adding a new UAPI with this patch/feature, as you know, we cannot go back and tweak it (to add support for nr_ranges > 1) should there be a need in the future, but you can always use nr_ranges = 1 anytime. Therefore, I think it makes sense to be flexible in terms of the number of ranges to include while creating a dmabuf instead of restricting ourselves to one range.
I'm not a big fan of over-engineering. Let's first understand if this case is needed.
Thanks
Thanks, Vivek
- Return: The fd number on success, -1 and errno is set on failure.
- */
+#define VFIO_DEVICE_FEATURE_DMA_BUF 11
+struct vfio_device_feature_dma_buf {
- __u32 region_index;
- __u32 open_flags;
- __u64 offset;
- __u64 length;
+};
/* -------- API for Type1 VFIO IOMMU -------- */
/**
2.50.1
linaro-mm-sig@lists.linaro.org