dma-buf has become a way to safely acquire a handle to non-struct page
memory that can still have lifetime controlled by the exporter. Notably
RDMA can now import dma-buf FDs and build them into MRs which allows for
PCI P2P operations. Extend this to allow vfio-pci to export MMIO memory
from PCI device BARs.
This series supports a use case for SPDK where a NVMe device will be owned
by SPDK through VFIO but interacting with a RDMA device. The RDMA device
may directly access the NVMe CMB or directly manipulate the NVMe device's
doorbell using PCI P2P.
However, as a general mechanism, it can support many other scenarios with
VFIO. I imagine this dmabuf approach to be usable by iommufd as well for
generic and safe P2P mappings.
This series goes after the "Break up ioctl dispatch functions to one
function per ioctl" series.
This is on github: https://github.com/jgunthorpe/linux/commits/vfio_dma_buf
Jason Gunthorpe (4):
dma-buf: Add dma_buf_try_get()
vfio: Add vfio_device_get()
vfio_pci: Do not open code pci_try_reset_function()
vfio/pci: Allow MMIO regions to be exported through dma-buf
drivers/vfio/pci/Makefile | 1 +
drivers/vfio/pci/vfio_pci_config.c | 22 ++-
drivers/vfio/pci/vfio_pci_core.c | 33 +++-
drivers/vfio/pci/vfio_pci_dma_buf.c | 265 ++++++++++++++++++++++++++++
drivers/vfio/pci/vfio_pci_priv.h | 24 +++
drivers/vfio/vfio_main.c | 3 +-
include/linux/dma-buf.h | 13 ++
include/linux/vfio.h | 6 +
include/linux/vfio_pci_core.h | 1 +
include/uapi/linux/vfio.h | 18 ++
10 files changed, 364 insertions(+), 22 deletions(-)
create mode 100644 drivers/vfio/pci/vfio_pci_dma_buf.c
base-commit: 385f0411fcd2780b5273992832cdc8edcd5b8ea9
--
2.37.2
Hi oushixiong,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on drm/drm-next drm-intel/for-linux-next drm-tip/drm-tip linus/master v6.0-rc3 next-20220826]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/oushixiong/drm-ast-add-dmabu…
base: git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
config: i386-randconfig-a013 (https://download.01.org/0day-ci/archive/20220829/202208291132.RQqpjzO1-lkp@…)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/da31884b8b7e33af5cd8aa750dea3…
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review oushixiong/drm-ast-add-dmabuf-prime-buffer-sharing-support/20220829-084713
git checkout da31884b8b7e33af5cd8aa750dea30fde59d52aa
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash drivers/gpu/drm/ast/
If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp(a)intel.com>
All warnings (new ones prefixed by >>):
>> drivers/gpu/drm/ast/ast_drv.c:54:24: warning: no previous prototype for function 'ast_gem_prime_import' [-Wmissing-prototypes]
struct drm_gem_object *ast_gem_prime_import(struct drm_device *dev,
^
drivers/gpu/drm/ast/ast_drv.c:54:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
struct drm_gem_object *ast_gem_prime_import(struct drm_device *dev,
^
static
1 warning generated.
vim +/ast_gem_prime_import +54 drivers/gpu/drm/ast/ast_drv.c
53
> 54 struct drm_gem_object *ast_gem_prime_import(struct drm_device *dev,
55 struct dma_buf *dma_buf)
56 {
57 struct drm_gem_vram_object *gbo;
58
59 gbo = drm_gem_vram_of_gem(dma_buf->priv);
60 if (gbo->bo.base.dev == dev) {
61 /*
62 * Importing dmabuf exported from out own gem increases
63 * refcount on gem itself instead of f_count of dmabuf.
64 */
65 drm_gem_object_get(&gbo->bo.base);
66 return &gbo->bo.base;
67 }
68
69 gbo = drm_gem_vram_create(dev, dma_buf->size, 0);
70 if (IS_ERR(gbo))
71 return NULL;
72
73 get_dma_buf(dma_buf);
74 return &gbo->bo.base;
75 }
76
--
0-DAY CI Kernel Test Service
https://01.org/lkp
Previously when we added a fence to a dma_resv object we always
assumed the the newer than all the existing fences.
With Jason's work to add an UAPI to explicit export/import that's not
necessary the case any more. So without this check we would allow
userspace to force the kernel into an use after free error.
Since the change is very small and defensive it's probably a good
idea to backport this to stable kernels as well just in case others
are using the dma_resv object in the same way.
Signed-off-by: Christian König <christian.koenig(a)amd.com>
---
drivers/dma-buf/dma-resv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 205acb2c744d..e3885c90a3ac 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -295,7 +295,8 @@ void dma_resv_add_fence(struct dma_resv *obj, struct dma_fence *fence,
enum dma_resv_usage old_usage;
dma_resv_list_entry(fobj, i, obj, &old, &old_usage);
- if ((old->context == fence->context && old_usage >= usage) ||
+ if ((old->context == fence->context && old_usage >= usage &&
+ dma_fence_is_later(fence, old)) ||
dma_fence_is_signaled(old)) {
dma_resv_list_set(fobj, i, fence, usage);
dma_fence_put(old);
--
2.25.1
Although there is usually not such a limitation (and when there is it is
often only because the driver forgot to change the super small default),
it is still correct here to break scatterlist element into chunks of
dma_max_mapping_size().
This might cause some issues for users with misbehaving drivers. If
bisecting has landed you on this commit, make sure your drivers both set
dma_set_max_seg_size() and are checking for contiguousness correctly.
Signed-off-by: Andrew Davis <afd(a)ti.com>
---
Changes from v1:
- Fixed mixed declarations and code warning
drivers/dma-buf/heaps/cma_heap.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index 28fb04eccdd0..a927b248c045 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -52,16 +52,18 @@ static int cma_heap_attach(struct dma_buf *dmabuf,
{
struct cma_heap_buffer *buffer = dmabuf->priv;
struct dma_heap_attachment *a;
+ size_t max_segment;
int ret;
a = kzalloc(sizeof(*a), GFP_KERNEL);
if (!a)
return -ENOMEM;
- ret = sg_alloc_table_from_pages(&a->table, buffer->pages,
- buffer->pagecount, 0,
- buffer->pagecount << PAGE_SHIFT,
- GFP_KERNEL);
+ max_segment = dma_get_max_seg_size(attachment->dev);
+ ret = sg_alloc_table_from_pages_segment(&a->table, buffer->pages,
+ buffer->pagecount, 0,
+ buffer->pagecount << PAGE_SHIFT,
+ max_segment, GFP_KERNEL);
if (ret) {
kfree(a);
return ret;
--
2.36.1
Although there is usually not such a limitation (and when there is it is
often only because the driver forgot to change the super small default),
it is still correct here to break scatterlist element into chunks of
dma_max_mapping_size().
This might cause some issues for users with misbehaving drivers. If
bisecting has landed you on this commit, make sure your drivers both set
dma_set_max_seg_size() and are checking for contiguousness correctly.
Signed-off-by: Andrew Davis <afd(a)ti.com>
---
drivers/dma-buf/heaps/cma_heap.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index 28fb04eccdd0..cacc84cb5ece 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -58,10 +58,11 @@ static int cma_heap_attach(struct dma_buf *dmabuf,
if (!a)
return -ENOMEM;
- ret = sg_alloc_table_from_pages(&a->table, buffer->pages,
- buffer->pagecount, 0,
- buffer->pagecount << PAGE_SHIFT,
- GFP_KERNEL);
+ size_t max_segment = dma_get_max_seg_size(attachment->dev);
+ ret = sg_alloc_table_from_pages_segment(&a->table, buffer->pages,
+ buffer->pagecount, 0,
+ buffer->pagecount << PAGE_SHIFT,
+ max_segment, GFP_KERNEL);
if (ret) {
kfree(a);
return ret;
--
2.36.1