Linaro-mm-sig April 2025

linaro-mm-sig@lists.linaro.org

24 participants
109 discussions

Re: [PATCH 1/7] drm/shmem-helper: Add lockdep asserts to vmap/vunmap

by Christian König

Am 18.03.25 um 20:22 schrieb Daniel Almeida: > From: Asahi Lina <lina(a)asahilina.net> > > Since commit 21aa27ddc582 ("drm/shmem-helper: Switch to reservation > lock"), the drm_gem_shmem_vmap and drm_gem_shmem_vunmap functions > require that the caller holds the DMA reservation lock for the object. > Add lockdep assertions to help validate this. > > Signed-off-by: Asahi Lina <lina(a)asahilina.net> > Signed-off-by: Daniel Almeida <daniel.almeida(a)collabora.com> Oh, yeah that is certainly a good idea. Reviewed-by: Christian König <christian.koenig(a)amd.com> > --- > drivers/gpu/drm/drm_gem_shmem_helper.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c > index 5ab351409312b5a0de542df2b636278d6186cb7b..ec89e9499f5f02a2a35713669bf649dd2abb9938 100644 > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c > @@ -338,6 +338,8 @@ int drm_gem_shmem_vmap(struct drm_gem_shmem_object *shmem, > struct drm_gem_object *obj = &shmem->base; > int ret = 0; > > + dma_resv_assert_held(obj->resv); > + > if (obj->import_attach) { > ret = dma_buf_vmap(obj->import_attach->dmabuf, map); > if (!ret) { > @@ -404,6 +406,8 @@ void drm_gem_shmem_vunmap(struct drm_gem_shmem_object *shmem, > { > struct drm_gem_object *obj = &shmem->base; > > + dma_resv_assert_held(obj->resv); > + > if (obj->import_attach) { > dma_buf_vunmap(obj->import_attach->dmabuf, map); > } else { >

2 months

Re: [PATCH v2 2/2] dma-buf: heaps: Give default CMA heap a fixed name

by Maxime Ripard

Hi Jared, Thanks for working on this On Tue, Apr 22, 2025 at 12:19:39PM -0700, Jared Kangas wrote: > The CMA heap's name in devtmpfs can vary depending on how the heap is > defined. Its name defaults to "reserved", but if a CMA area is defined > in the devicetree, the heap takes on the devicetree node's name, such as > "default-pool" or "linux,cma". To simplify naming, just name it > "default_cma", and keep a legacy node in place backed by the same > underlying structure for backwards compatibility. > > Signed-off-by: Jared Kangas <jkangas(a)redhat.com> > --- > Documentation/userspace-api/dma-buf-heaps.rst | 11 +++++++---- > drivers/dma-buf/heaps/Kconfig | 10 ++++++++++ > drivers/dma-buf/heaps/cma_heap.c | 14 +++++++++++++- > 3 files changed, 30 insertions(+), 5 deletions(-) > > diff --git a/Documentation/userspace-api/dma-buf-heaps.rst b/Documentation/userspace-api/dma-buf-heaps.rst > index 535f49047ce64..577de813ba461 100644 > --- a/Documentation/userspace-api/dma-buf-heaps.rst > +++ b/Documentation/userspace-api/dma-buf-heaps.rst > @@ -19,7 +19,10 @@ following heaps: > - The ``cma`` heap allocates physically contiguous, cacheable, > buffers. Only present if a CMA region is present. Such a region is > usually created either through the kernel commandline through the > - `cma` parameter, a memory region Device-Tree node with the > - `linux,cma-default` property set, or through the `CMA_SIZE_MBYTES` or > - `CMA_SIZE_PERCENTAGE` Kconfig options. Depending on the platform, it > - might be called ``reserved``, ``linux,cma``, or ``default-pool``. > + ``cma`` parameter, a memory region Device-Tree node with the > + ``linux,cma-default`` property set, or through the ``CMA_SIZE_MBYTES`` or > + ``CMA_SIZE_PERCENTAGE`` Kconfig options. The heap's name in devtmpfs is > + ``default_cma``. For backwards compatibility, when the > + ``DMABUF_HEAPS_CMA_LEGACY`` Kconfig option is set, a duplicate node is > + created following legacy naming conventions; the legacy name might be > + ``reserved``, ``linux,cma``, or ``default-pool``. It looks like, in addition to documenting the new naming, you also changed all the backticks to double backticks. Why did you do so? It seems mostly unrelated to that patch, so it would be better in a separate patch. > diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig > index a5eef06c42264..83f3770fa820a 100644 > --- a/drivers/dma-buf/heaps/Kconfig > +++ b/drivers/dma-buf/heaps/Kconfig > @@ -12,3 +12,13 @@ config DMABUF_HEAPS_CMA > Choose this option to enable dma-buf CMA heap. This heap is backed > by the Contiguous Memory Allocator (CMA). If your system has these > regions, you should say Y here. > + > +config DMABUF_HEAPS_CMA_LEGACY > + bool "DMA-BUF CMA Heap" > + default y > + depends on DMABUF_HEAPS_CMA > + help > + Add a duplicate CMA-backed dma-buf heap with legacy naming derived > + from the CMA area's devicetree node, or "reserved" if the area is not > + defined in the devicetree. This uses the same underlying allocator as > + CONFIG_DMABUF_HEAPS_CMA. > diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c > index e998d8ccd1dc6..cd742c961190d 100644 > --- a/drivers/dma-buf/heaps/cma_heap.c > +++ b/drivers/dma-buf/heaps/cma_heap.c > @@ -22,6 +22,7 @@ > #include <linux/slab.h> > #include <linux/vmalloc.h> > > +#define DEFAULT_CMA_NAME "default_cma" I appreciate this is kind of bikeshed-color territory, but I think "cma" would be a better option here. There's nothing "default" about it. > struct cma_heap { > struct dma_heap *heap; > @@ -394,15 +395,26 @@ static int __init __add_cma_heap(struct cma *cma, const char *name) > static int __init add_default_cma_heap(void) > { > struct cma *default_cma = dev_get_cma_area(NULL); > + const char *legacy_cma_name; > int ret; > > if (!default_cma) > return 0; > > - ret = __add_cma_heap(default_cma, cma_get_name(default_cma)); > + ret = __add_cma_heap(default_cma, DEFAULT_CMA_NAME); > if (ret) > return ret; > > + legacy_cma_name = cma_get_name(default_cma); > + > + if (IS_ENABLED(CONFIG_DMABUF_HEAPS_CMA_LEGACY) && > + strcmp(legacy_cma_name, DEFAULT_CMA_NAME)) { > + ret = __add_cma_heap(default_cma, legacy_cma_name); > + if (ret) > + pr_warn("cma_heap: failed to add legacy heap: %pe\n", > + ERR_PTR(-ret)); > + } > + It would also simplify this part, since you would always create the legacy heap. Maxime

2 months, 1 week

Re: [PATCH 2/3] drm/prime: Support importing DMA-BUF without sg_table

by Christian König

On 4/30/25 16:13, oushixiong wrote: > > 在 2025/4/30 19:03, Christian König 写道: >> On 4/30/25 10:56,oushixiong1025@163.com wrote: >>> From: Shixiong Ou<oushixiong(a)kylinos.cn> >>> >>> [WHY] >>> On some boards, the dma_mask of Aspeed devices is 0xffff_ffff, this >>> quite possibly causes the SWIOTLB to be triggered when importing dmabuf. >>> However IO_TLB_SEGSIZE limits the maximum amount of available memory >>> for DMA Streaming Mapping, as dmesg following: >>> >>> [ 24.885303][ T1947] ast 0000:07:00.0: swiotlb buffer is full (sz: 3145728 bytes), total 32768 (slots), used 0 (slots) >>> >>> [HOW] Provide an interface so that attachment is not mapped when >>> importing dma-buf. >> This is unecessary. The extra abstraction in DRM is only useful when you want to implement the obj->funcs->get_sg_table() callback. >> >> When a driver doesn't want to expose an sg_table for a buffer or want some other special handling it can simply do so by implementing the DMA-buf interface directly. >> >> See drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c for an example on how to do this. >> >> Regards, >> Christian. > > > Thanks for the reminder, > > most drivers that use DRM_GEM_SHADOW_PLANE_HELPER_FUNCSand DRM_GEM_SHMEM_DRIVER_OPS > > don't need to import the sg_table, such as the udl and the ast and so on at the moment. > > They just need to call dma_buf_vmap() to get the kernel virtual address of the shared buffer. > > So I wondered if there was a simple generic PRIME implementation for these drivers. > > If you don't recommend this, Maybe try to implement it in DRM_GEM_SHMEM_DRIVER_OPS ? Well if you only want to implement vmap/vunmap the necessary code in the driver would look something like this: const struct dma_buf_ops amdgpu_dmabuf_ops = { .map_dma_buf = dummy_map_function, .release = drm_gem_dmabuf_release, .mmap = drm_gem_dmabuf_mmap, .vmap = drm_gem_dmabuf_vmap, .vunmap = drm_gem_dmabuf_vunmap, }; struct dma_buf *drv_gem_prime_export(struct drm_gem_object *gobj, int flags) { struct dma_buf *buf; buf = drm_gem_prime_export(gobj, flags); if (!IS_ERR(buf)) buf->ops = &amdgpu_dmabuf_ops; return buf; } The only thing which could be improved is the dummy_map_function. As far as I can see we could make the map function optional in DMA-buf now. Apart from that you could make a DRM helper from that few lines, but to be honest I don't think it's worth it. It reduces the loc a bit, but there is no real complexity here which drivers could share. Regards, Christian. > > Regards, > > Shixiong Ou. > >>> Signed-off-by: Shixiong Ou<oushixiong(a)kylinos.cn> >>> --- >>> drivers/gpu/drm/ast/ast_drv.c | 2 +- >>> drivers/gpu/drm/drm_gem_shmem_helper.c | 17 +++++++ >>> drivers/gpu/drm/drm_prime.c | 67 ++++++++++++++++++++++++-- >>> drivers/gpu/drm/udl/udl_drv.c | 2 +- >>> include/drm/drm_drv.h | 3 ++ >>> include/drm/drm_gem_shmem_helper.h | 6 +++ >>> 6 files changed, 91 insertions(+), 6 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c >>> index 6fbf62a99c48..2dac6acf79e7 100644 >>> --- a/drivers/gpu/drm/ast/ast_drv.c >>> +++ b/drivers/gpu/drm/ast/ast_drv.c >>> @@ -64,7 +64,7 @@ static const struct drm_driver ast_driver = { >>> .minor = DRIVER_MINOR, >>> .patchlevel = DRIVER_PATCHLEVEL, >>> - DRM_GEM_SHMEM_DRIVER_OPS, >>> + DRM_GEM_SHMEM_SIMPLE_DRIVER_OPS, >>> DRM_FBDEV_SHMEM_DRIVER_OPS, >>> }; >>> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c >>> index d99dee67353a..655d841df933 100644 >>> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c >>> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c >>> @@ -799,6 +799,23 @@ drm_gem_shmem_prime_import_sg_table(struct drm_device *dev, >>> } >>> EXPORT_SYMBOL_GPL(drm_gem_shmem_prime_import_sg_table); >>> +struct drm_gem_object * >>> +drm_gem_shmem_prime_import_attachment(struct drm_device *dev, >>> + struct dma_buf_attachment *attach) >>> +{ >>> + size_t size = PAGE_ALIGN(attach->dmabuf->size); >>> + struct drm_gem_shmem_object *shmem; >>> + >>> + shmem = __drm_gem_shmem_create(dev, size, true, NULL); >>> + if (IS_ERR(shmem)) >>> + return ERR_CAST(shmem); >>> + >>> + drm_dbg_prime(dev, "size = %zu\n", size); >>> + >>> + return &shmem->base; >>> +} >>> +EXPORT_SYMBOL_GPL(drm_gem_shmem_prime_import_attachment); >>> + >>> MODULE_DESCRIPTION("DRM SHMEM memory-management helpers"); >>> MODULE_IMPORT_NS("DMA_BUF"); >>> MODULE_LICENSE("GPL v2"); >>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c >>> index 8e70abca33b9..522cf974e202 100644 >>> --- a/drivers/gpu/drm/drm_prime.c >>> +++ b/drivers/gpu/drm/drm_prime.c >>> @@ -911,6 +911,62 @@ struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj, >>> } >>> EXPORT_SYMBOL(drm_gem_prime_export); >>> +/** >>> + * drm_gem_prime_import_dev_skip_map - core implementation of the import callback >>> + * @dev: drm_device to import into >>> + * @dma_buf: dma-buf object to import >>> + * @attach_dev: struct device to dma_buf attach >>> + * >>> + * This function exports a dma-buf without get it's scatter/gather table. >>> + * >>> + * Drivers who need to get an scatter/gather table for objects need to call >>> + * drm_gem_prime_import_dev() instead. >>> + */ >>> +struct drm_gem_object *drm_gem_prime_import_dev_skip_map(struct drm_device *dev, >>> + struct dma_buf *dma_buf, >>> + struct device *attach_dev) >>> +{ >>> + struct dma_buf_attachment *attach; >>> + struct drm_gem_object *obj; >>> + int ret; >>> + >>> + if (dma_buf->ops == &drm_gem_prime_dmabuf_ops) { >>> + obj = dma_buf->priv; >>> + if (obj->dev == dev) { >>> + /* >>> + * Importing dmabuf exported from our own gem increases >>> + * refcount on gem itself instead of f_count of dmabuf. >>> + */ >>> + drm_gem_object_get(obj); >>> + return obj; >>> + } >>> + } >>> + >>> + attach = dma_buf_attach(dma_buf, attach_dev, true); >>> + if (IS_ERR(attach)) >>> + return ERR_CAST(attach); >>> + >>> + get_dma_buf(dma_buf); >>> + >>> + obj = dev->driver->gem_prime_import_attachment(dev, attach); >>> + if (IS_ERR(obj)) { >>> + ret = PTR_ERR(obj); >>> + goto fail_detach; >>> + } >>> + >>> + obj->import_attach = attach; >>> + obj->resv = dma_buf->resv; >>> + >>> + return obj; >>> + >>> +fail_detach: >>> + dma_buf_detach(dma_buf, attach); >>> + dma_buf_put(dma_buf); >>> + >>> + return ERR_PTR(ret); >>> +} >>> +EXPORT_SYMBOL(drm_gem_prime_import_dev_skip_map); >>> + >>> /** >>> * drm_gem_prime_import_dev - core implementation of the import callback >>> * @dev: drm_device to import into >>> @@ -946,9 +1002,6 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, >>> } >>> } >>> - if (!dev->driver->gem_prime_import_sg_table) >>> - return ERR_PTR(-EINVAL); >>> - >>> attach = dma_buf_attach(dma_buf, attach_dev, false); >>> if (IS_ERR(attach)) >>> return ERR_CAST(attach); >>> @@ -998,7 +1051,13 @@ EXPORT_SYMBOL(drm_gem_prime_import_dev); >>> struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, >>> struct dma_buf *dma_buf) >>> { >>> - return drm_gem_prime_import_dev(dev, dma_buf, dev->dev); >>> + if (dev->driver->gem_prime_import_sg_table) >>> + return drm_gem_prime_import_dev(dev, dma_buf, dev->dev); >>> + else if (dev->driver->gem_prime_import_attachment) >>> + return drm_gem_prime_import_dev_skip_map(dev, dma_buf, dev->dev); >>> + else >>> + return ERR_PTR(-EINVAL); >>> + >>> } >>> EXPORT_SYMBOL(drm_gem_prime_import); >>> diff --git a/drivers/gpu/drm/udl/udl_drv.c b/drivers/gpu/drm/udl/udl_drv.c >>> index 05b3a152cc33..c00d8b8834f2 100644 >>> --- a/drivers/gpu/drm/udl/udl_drv.c >>> +++ b/drivers/gpu/drm/udl/udl_drv.c >>> @@ -72,7 +72,7 @@ static const struct drm_driver driver = { >>> /* GEM hooks */ >>> .fops = &udl_driver_fops, >>> - DRM_GEM_SHMEM_DRIVER_OPS, >>> + DRM_GEM_SHMEM_SIMPLE_DRIVER_OPS, >>> .gem_prime_import = udl_driver_gem_prime_import, >>> DRM_FBDEV_SHMEM_DRIVER_OPS, >>> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h >>> index a43d707b5f36..aef8d9051fcd 100644 >>> --- a/include/drm/drm_drv.h >>> +++ b/include/drm/drm_drv.h >>> @@ -326,6 +326,9 @@ struct drm_driver { >>> struct dma_buf_attachment *attach, >>> struct sg_table *sgt); >>> + struct drm_gem_object *(*gem_prime_import_attachment)( >>> + struct drm_device *dev, >>> + struct dma_buf_attachment *attach); >>> /** >>> * @dumb_create: >>> * >>> diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h >>> index cef5a6b5a4d6..39a93c222aaa 100644 >>> --- a/include/drm/drm_gem_shmem_helper.h >>> +++ b/include/drm/drm_gem_shmem_helper.h >>> @@ -274,6 +274,9 @@ struct drm_gem_object * >>> drm_gem_shmem_prime_import_sg_table(struct drm_device *dev, >>> struct dma_buf_attachment *attach, >>> struct sg_table *sgt); >>> +struct drm_gem_object * >>> +drm_gem_shmem_prime_import_attachment(struct drm_device *dev, >>> + struct dma_buf_attachment *attach); >>> int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev, >>> struct drm_mode_create_dumb *args); >>> @@ -287,4 +290,7 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev, >>> .gem_prime_import_sg_table = drm_gem_shmem_prime_import_sg_table, \ >>> .dumb_create = drm_gem_shmem_dumb_create >>> +#define DRM_GEM_SHMEM_SIMPLE_DRIVER_OPS \ >>> + .gem_prime_import_attachment = drm_gem_shmem_prime_import_attachment, \ >>> + .dumb_create = drm_gem_shmem_dumb_create >>> #endif /* __DRM_GEM_SHMEM_HELPER_H__ */

2 months, 1 week

Re: [PATCH 2/3] drm/prime: Support importing DMA-BUF without sg_table

by Christian König

On 4/30/25 10:56, oushixiong1025(a)163.com wrote: > From: Shixiong Ou <oushixiong(a)kylinos.cn> > > [WHY] > On some boards, the dma_mask of Aspeed devices is 0xffff_ffff, this > quite possibly causes the SWIOTLB to be triggered when importing dmabuf. > However IO_TLB_SEGSIZE limits the maximum amount of available memory > for DMA Streaming Mapping, as dmesg following: > > [ 24.885303][ T1947] ast 0000:07:00.0: swiotlb buffer is full (sz: 3145728 bytes), total 32768 (slots), used 0 (slots) > > [HOW] Provide an interface so that attachment is not mapped when > importing dma-buf. This is unecessary. The extra abstraction in DRM is only useful when you want to implement the obj->funcs->get_sg_table() callback. When a driver doesn't want to expose an sg_table for a buffer or want some other special handling it can simply do so by implementing the DMA-buf interface directly. See drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c for an example on how to do this. Regards, Christian. > > Signed-off-by: Shixiong Ou <oushixiong(a)kylinos.cn> > --- > drivers/gpu/drm/ast/ast_drv.c | 2 +- > drivers/gpu/drm/drm_gem_shmem_helper.c | 17 +++++++ > drivers/gpu/drm/drm_prime.c | 67 ++++++++++++++++++++++++-- > drivers/gpu/drm/udl/udl_drv.c | 2 +- > include/drm/drm_drv.h | 3 ++ > include/drm/drm_gem_shmem_helper.h | 6 +++ > 6 files changed, 91 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c > index 6fbf62a99c48..2dac6acf79e7 100644 > --- a/drivers/gpu/drm/ast/ast_drv.c > +++ b/drivers/gpu/drm/ast/ast_drv.c > @@ -64,7 +64,7 @@ static const struct drm_driver ast_driver = { > .minor = DRIVER_MINOR, > .patchlevel = DRIVER_PATCHLEVEL, > > - DRM_GEM_SHMEM_DRIVER_OPS, > + DRM_GEM_SHMEM_SIMPLE_DRIVER_OPS, > DRM_FBDEV_SHMEM_DRIVER_OPS, > }; > > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c > index d99dee67353a..655d841df933 100644 > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c > @@ -799,6 +799,23 @@ drm_gem_shmem_prime_import_sg_table(struct drm_device *dev, > } > EXPORT_SYMBOL_GPL(drm_gem_shmem_prime_import_sg_table); > > +struct drm_gem_object * > +drm_gem_shmem_prime_import_attachment(struct drm_device *dev, > + struct dma_buf_attachment *attach) > +{ > + size_t size = PAGE_ALIGN(attach->dmabuf->size); > + struct drm_gem_shmem_object *shmem; > + > + shmem = __drm_gem_shmem_create(dev, size, true, NULL); > + if (IS_ERR(shmem)) > + return ERR_CAST(shmem); > + > + drm_dbg_prime(dev, "size = %zu\n", size); > + > + return &shmem->base; > +} > +EXPORT_SYMBOL_GPL(drm_gem_shmem_prime_import_attachment); > + > MODULE_DESCRIPTION("DRM SHMEM memory-management helpers"); > MODULE_IMPORT_NS("DMA_BUF"); > MODULE_LICENSE("GPL v2"); > diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c > index 8e70abca33b9..522cf974e202 100644 > --- a/drivers/gpu/drm/drm_prime.c > +++ b/drivers/gpu/drm/drm_prime.c > @@ -911,6 +911,62 @@ struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj, > } > EXPORT_SYMBOL(drm_gem_prime_export); > > +/** > + * drm_gem_prime_import_dev_skip_map - core implementation of the import callback > + * @dev: drm_device to import into > + * @dma_buf: dma-buf object to import > + * @attach_dev: struct device to dma_buf attach > + * > + * This function exports a dma-buf without get it's scatter/gather table. > + * > + * Drivers who need to get an scatter/gather table for objects need to call > + * drm_gem_prime_import_dev() instead. > + */ > +struct drm_gem_object *drm_gem_prime_import_dev_skip_map(struct drm_device *dev, > + struct dma_buf *dma_buf, > + struct device *attach_dev) > +{ > + struct dma_buf_attachment *attach; > + struct drm_gem_object *obj; > + int ret; > + > + if (dma_buf->ops == &drm_gem_prime_dmabuf_ops) { > + obj = dma_buf->priv; > + if (obj->dev == dev) { > + /* > + * Importing dmabuf exported from our own gem increases > + * refcount on gem itself instead of f_count of dmabuf. > + */ > + drm_gem_object_get(obj); > + return obj; > + } > + } > + > + attach = dma_buf_attach(dma_buf, attach_dev, true); > + if (IS_ERR(attach)) > + return ERR_CAST(attach); > + > + get_dma_buf(dma_buf); > + > + obj = dev->driver->gem_prime_import_attachment(dev, attach); > + if (IS_ERR(obj)) { > + ret = PTR_ERR(obj); > + goto fail_detach; > + } > + > + obj->import_attach = attach; > + obj->resv = dma_buf->resv; > + > + return obj; > + > +fail_detach: > + dma_buf_detach(dma_buf, attach); > + dma_buf_put(dma_buf); > + > + return ERR_PTR(ret); > +} > +EXPORT_SYMBOL(drm_gem_prime_import_dev_skip_map); > + > /** > * drm_gem_prime_import_dev - core implementation of the import callback > * @dev: drm_device to import into > @@ -946,9 +1002,6 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, > } > } > > - if (!dev->driver->gem_prime_import_sg_table) > - return ERR_PTR(-EINVAL); > - > attach = dma_buf_attach(dma_buf, attach_dev, false); > if (IS_ERR(attach)) > return ERR_CAST(attach); > @@ -998,7 +1051,13 @@ EXPORT_SYMBOL(drm_gem_prime_import_dev); > struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, > struct dma_buf *dma_buf) > { > - return drm_gem_prime_import_dev(dev, dma_buf, dev->dev); > + if (dev->driver->gem_prime_import_sg_table) > + return drm_gem_prime_import_dev(dev, dma_buf, dev->dev); > + else if (dev->driver->gem_prime_import_attachment) > + return drm_gem_prime_import_dev_skip_map(dev, dma_buf, dev->dev); > + else > + return ERR_PTR(-EINVAL); > + > } > EXPORT_SYMBOL(drm_gem_prime_import); > > diff --git a/drivers/gpu/drm/udl/udl_drv.c b/drivers/gpu/drm/udl/udl_drv.c > index 05b3a152cc33..c00d8b8834f2 100644 > --- a/drivers/gpu/drm/udl/udl_drv.c > +++ b/drivers/gpu/drm/udl/udl_drv.c > @@ -72,7 +72,7 @@ static const struct drm_driver driver = { > > /* GEM hooks */ > .fops = &udl_driver_fops, > - DRM_GEM_SHMEM_DRIVER_OPS, > + DRM_GEM_SHMEM_SIMPLE_DRIVER_OPS, > .gem_prime_import = udl_driver_gem_prime_import, > DRM_FBDEV_SHMEM_DRIVER_OPS, > > diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h > index a43d707b5f36..aef8d9051fcd 100644 > --- a/include/drm/drm_drv.h > +++ b/include/drm/drm_drv.h > @@ -326,6 +326,9 @@ struct drm_driver { > struct dma_buf_attachment *attach, > struct sg_table *sgt); > > + struct drm_gem_object *(*gem_prime_import_attachment)( > + struct drm_device *dev, > + struct dma_buf_attachment *attach); > /** > * @dumb_create: > * > diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h > index cef5a6b5a4d6..39a93c222aaa 100644 > --- a/include/drm/drm_gem_shmem_helper.h > +++ b/include/drm/drm_gem_shmem_helper.h > @@ -274,6 +274,9 @@ struct drm_gem_object * > drm_gem_shmem_prime_import_sg_table(struct drm_device *dev, > struct dma_buf_attachment *attach, > struct sg_table *sgt); > +struct drm_gem_object * > +drm_gem_shmem_prime_import_attachment(struct drm_device *dev, > + struct dma_buf_attachment *attach); > int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev, > struct drm_mode_create_dumb *args); > > @@ -287,4 +290,7 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev, > .gem_prime_import_sg_table = drm_gem_shmem_prime_import_sg_table, \ > .dumb_create = drm_gem_shmem_dumb_create > > +#define DRM_GEM_SHMEM_SIMPLE_DRIVER_OPS \ > + .gem_prime_import_attachment = drm_gem_shmem_prime_import_attachment, \ > + .dumb_create = drm_gem_shmem_dumb_create > #endif /* __DRM_GEM_SHMEM_HELPER_H__ */

2 months, 1 week

Re: [PATCH 1/3] dma-buf: add flags to skip map_dma_buf() for some drivers

by Christian König

On 4/30/25 10:56, oushixiong1025(a)163.com wrote: > From: Shixiong Ou <oushixiong(a)kylinos.cn> > > [WHY] Some Importer does not need to call dma_buf_map_attachment() to > get the scatterlist info, especially those drivers of hardware that do > not support DMA, such as the udl, the virtgpu and the ast. > > [HOW] skip map_dma_buf() when dma_buf_dynamic_attach() for some drivers. This patch is based on outdated code. Please see drm-misc-next where the mapping during attach was already dropped. commit b72f66f22c0e39ae6684c43fead774c13db24e73 Author: Christian König <christian.koenig(a)amd.com> Date: Tue Feb 11 17:20:53 2025 +0100 dma-buf: drop caching of sg_tables That was purely for the transition from static to dynamic dma-buf handling and can be removed again now. Regards, Christian. > Signed-off-by: Shixiong Ou <oushixiong(a)kylinos.cn> > --- > drivers/accel/ivpu/ivpu_gem.c | 2 +- > drivers/accel/qaic/qaic_data.c | 2 +- > drivers/dma-buf/dma-buf.c | 29 ++++++++++--------- > drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 2 +- > drivers/gpu/drm/armada/armada_gem.c | 2 +- > drivers/gpu/drm/drm_prime.c | 2 +- > drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 2 +- > .../drm/i915/gem/selftests/i915_gem_dmabuf.c | 2 +- > drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c | 2 +- > drivers/gpu/drm/tegra/gem.c | 4 +-- > drivers/gpu/drm/virtio/virtgpu_prime.c | 2 +- > drivers/gpu/drm/xe/xe_dma_buf.c | 2 +- > drivers/iio/industrialio-buffer.c | 2 +- > drivers/infiniband/core/umem_dmabuf.c | 3 +- > .../common/videobuf2/videobuf2-dma-contig.c | 2 +- > .../media/common/videobuf2/videobuf2-dma-sg.c | 2 +- > .../platform/nvidia/tegra-vde/dmabuf-cache.c | 2 +- > drivers/misc/fastrpc.c | 2 +- > drivers/usb/gadget/function/f_fs.c | 2 +- > drivers/xen/gntdev-dmabuf.c | 2 +- > include/linux/dma-buf.h | 5 ++-- > net/core/devmem.c | 2 +- > 22 files changed, 41 insertions(+), 36 deletions(-) > > diff --git a/drivers/accel/ivpu/ivpu_gem.c b/drivers/accel/ivpu/ivpu_gem.c > index 8741c73b92ce..5258a66ed945 100644 > --- a/drivers/accel/ivpu/ivpu_gem.c > +++ b/drivers/accel/ivpu/ivpu_gem.c > @@ -183,7 +183,7 @@ struct drm_gem_object *ivpu_gem_prime_import(struct drm_device *dev, > struct drm_gem_object *obj; > int ret; > > - attach = dma_buf_attach(dma_buf, attach_dev); > + attach = dma_buf_attach(dma_buf, attach_dev, false); > if (IS_ERR(attach)) > return ERR_CAST(attach); > > diff --git a/drivers/accel/qaic/qaic_data.c b/drivers/accel/qaic/qaic_data.c > index 43aba57b48f0..c13c64d59143 100644 > --- a/drivers/accel/qaic/qaic_data.c > +++ b/drivers/accel/qaic/qaic_data.c > @@ -803,7 +803,7 @@ struct drm_gem_object *qaic_gem_prime_import(struct drm_device *dev, struct dma_ > obj = &bo->base; > get_dma_buf(dma_buf); > > - attach = dma_buf_attach(dma_buf, dev->dev); > + attach = dma_buf_attach(dma_buf, dev->dev, false); > if (IS_ERR(attach)) { > ret = PTR_ERR(attach); > goto attach_fail; > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > index 5baa83b85515..dd7fe5fbf197 100644 > --- a/drivers/dma-buf/dma-buf.c > +++ b/drivers/dma-buf/dma-buf.c > @@ -904,7 +904,7 @@ static struct sg_table *__map_dma_buf(struct dma_buf_attachment *attach, > struct dma_buf_attachment * > dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, > const struct dma_buf_attach_ops *importer_ops, > - void *importer_priv) > + void *importer_priv, bool skip_map) > { > struct dma_buf_attachment *attach; > int ret; > @@ -941,8 +941,6 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, > */ > if (dma_buf_attachment_is_dynamic(attach) != > dma_buf_is_dynamic(dmabuf)) { > - struct sg_table *sgt; > - > dma_resv_lock(attach->dmabuf->resv, NULL); > if (dma_buf_is_dynamic(attach->dmabuf)) { > ret = dmabuf->ops->pin(attach); > @@ -950,16 +948,20 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, > goto err_unlock; > } > > - sgt = __map_dma_buf(attach, DMA_BIDIRECTIONAL); > - if (!sgt) > - sgt = ERR_PTR(-ENOMEM); > - if (IS_ERR(sgt)) { > - ret = PTR_ERR(sgt); > - goto err_unpin; > + if (!skip_map) { > + struct sg_table *sgt; > + > + sgt = __map_dma_buf(attach, DMA_BIDIRECTIONAL); > + if (!sgt) > + sgt = ERR_PTR(-ENOMEM); > + if (IS_ERR(sgt)) { > + ret = PTR_ERR(sgt); > + goto err_unpin; > + } > + attach->sgt = sgt; > + attach->dir = DMA_BIDIRECTIONAL; > } > dma_resv_unlock(attach->dmabuf->resv); > - attach->sgt = sgt; > - attach->dir = DMA_BIDIRECTIONAL; > } > > return attach; > @@ -989,9 +991,10 @@ EXPORT_SYMBOL_NS_GPL(dma_buf_dynamic_attach, "DMA_BUF"); > * mapping. > */ > struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, > - struct device *dev) > + struct device *dev, > + bool skip_map) > { > - return dma_buf_dynamic_attach(dmabuf, dev, NULL, NULL); > + return dma_buf_dynamic_attach(dmabuf, dev, NULL, NULL, skip_map); > } > EXPORT_SYMBOL_NS_GPL(dma_buf_attach, "DMA_BUF"); > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c > index e6913fcf2c7b..26c94834e6d2 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c > @@ -479,7 +479,7 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev, > return obj; > > attach = dma_buf_dynamic_attach(dma_buf, dev->dev, > - &amdgpu_dma_buf_attach_ops, obj); > + &amdgpu_dma_buf_attach_ops, obj, false); > if (IS_ERR(attach)) { > drm_gem_object_put(obj); > return ERR_CAST(attach); > diff --git a/drivers/gpu/drm/armada/armada_gem.c b/drivers/gpu/drm/armada/armada_gem.c > index 1a1680d71486..7e1a82828b87 100644 > --- a/drivers/gpu/drm/armada/armada_gem.c > +++ b/drivers/gpu/drm/armada/armada_gem.c > @@ -514,7 +514,7 @@ armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf) > } > } > > - attach = dma_buf_attach(buf, dev->dev); > + attach = dma_buf_attach(buf, dev->dev, false); > if (IS_ERR(attach)) > return ERR_CAST(attach); > > diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c > index bdb51c8f262e..8e70abca33b9 100644 > --- a/drivers/gpu/drm/drm_prime.c > +++ b/drivers/gpu/drm/drm_prime.c > @@ -949,7 +949,7 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, > if (!dev->driver->gem_prime_import_sg_table) > return ERR_PTR(-EINVAL); > > - attach = dma_buf_attach(dma_buf, attach_dev); > + attach = dma_buf_attach(dma_buf, attach_dev, false); > if (IS_ERR(attach)) > return ERR_CAST(attach); > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c > index 9473050ac842..6015f6beb8e6 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c > @@ -305,7 +305,7 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > return ERR_PTR(-E2BIG); > > /* need to attach */ > - attach = dma_buf_attach(dma_buf, dev->dev); > + attach = dma_buf_attach(dma_buf, dev->dev, false); > if (IS_ERR(attach)) > return ERR_CAST(attach); > > diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c > index 2fda549dd82d..1992241fdf54 100644 > --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c > +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c > @@ -287,7 +287,7 @@ static int igt_dmabuf_import_same_driver(struct drm_i915_private *i915, > goto out_import; > > /* Now try a fake an importer */ > - import_attach = dma_buf_attach(dmabuf, obj->base.dev->dev); > + import_attach = dma_buf_attach(dmabuf, obj->base.dev->dev, false); > if (IS_ERR(import_attach)) { > err = PTR_ERR(import_attach); > goto out_import; > diff --git a/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c b/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c > index 30cf1cdc1aa3..41fb4149409e 100644 > --- a/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c > +++ b/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c > @@ -114,7 +114,7 @@ struct drm_gem_object *omap_gem_prime_import(struct drm_device *dev, > } > } > > - attach = dma_buf_attach(dma_buf, dev->dev); > + attach = dma_buf_attach(dma_buf, dev->dev, false); > if (IS_ERR(attach)) > return ERR_CAST(attach); > > diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c > index ace3e5a805cf..e5527c9d10bb 100644 > --- a/drivers/gpu/drm/tegra/gem.c > +++ b/drivers/gpu/drm/tegra/gem.c > @@ -79,7 +79,7 @@ static struct host1x_bo_mapping *tegra_bo_pin(struct device *dev, struct host1x_ > if (obj->dma_buf) { > struct dma_buf *buf = obj->dma_buf; > > - map->attach = dma_buf_attach(buf, dev); > + map->attach = dma_buf_attach(buf, dev, false); > if (IS_ERR(map->attach)) { > err = PTR_ERR(map->attach); > goto free; > @@ -470,7 +470,7 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm, > * domain, map it first to the DRM device to get an sgt. > */ > if (tegra->domain) { > - attach = dma_buf_attach(buf, drm->dev); > + attach = dma_buf_attach(buf, drm->dev, false); > if (IS_ERR(attach)) { > err = PTR_ERR(attach); > goto free; > diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c > index 4de2a63ccd18..6d9d1fe342b6 100644 > --- a/drivers/gpu/drm/virtio/virtgpu_prime.c > +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c > @@ -326,7 +326,7 @@ struct drm_gem_object *virtgpu_gem_prime_import(struct drm_device *dev, > drm_gem_private_object_init(dev, obj, buf->size); > > attach = dma_buf_dynamic_attach(buf, dev->dev, > - &virtgpu_dma_buf_attach_ops, obj); > + &virtgpu_dma_buf_attach_ops, obj, true); > if (IS_ERR(attach)) { > kfree(bo); > return ERR_CAST(attach); > diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c > index f7a20264ea33..9f524b9ed425 100644 > --- a/drivers/gpu/drm/xe/xe_dma_buf.c > +++ b/drivers/gpu/drm/xe/xe_dma_buf.c > @@ -293,7 +293,7 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev, > attach_ops = test->attach_ops; > #endif > > - attach = dma_buf_dynamic_attach(dma_buf, dev->dev, attach_ops, &bo->ttm.base); > + attach = dma_buf_dynamic_attach(dma_buf, dev->dev, attach_ops, &bo->ttm.base, false); > if (IS_ERR(attach)) { > obj = ERR_CAST(attach); > goto out_err; > diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c > index a80f7cc25a27..1296af4c2f7a 100644 > --- a/drivers/iio/industrialio-buffer.c > +++ b/drivers/iio/industrialio-buffer.c > @@ -1679,7 +1679,7 @@ static int iio_buffer_attach_dmabuf(struct iio_dev_buffer_pair *ib, > goto err_free_priv; > } > > - attach = dma_buf_attach(dmabuf, indio_dev->dev.parent); > + attach = dma_buf_attach(dmabuf, indio_dev->dev.parent, false); > if (IS_ERR(attach)) { > err = PTR_ERR(attach); > goto err_dmabuf_put; > diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c > index 0ec2e4120cc9..ed635c407cbd 100644 > --- a/drivers/infiniband/core/umem_dmabuf.c > +++ b/drivers/infiniband/core/umem_dmabuf.c > @@ -159,7 +159,8 @@ ib_umem_dmabuf_get_with_dma_device(struct ib_device *device, > dmabuf, > dma_device, > ops, > - umem_dmabuf); > + umem_dmabuf, > + false); > if (IS_ERR(umem_dmabuf->attach)) { > ret = ERR_CAST(umem_dmabuf->attach); > goto out_free_umem; > diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > index a13ec569c82f..362f5b555ce2 100644 > --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c > +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > @@ -786,7 +786,7 @@ static void *vb2_dc_attach_dmabuf(struct vb2_buffer *vb, struct device *dev, > buf->vb = vb; > > /* create attachment for the dmabuf with the user device */ > - dba = dma_buf_attach(dbuf, buf->dev); > + dba = dma_buf_attach(dbuf, buf->dev, false); > if (IS_ERR(dba)) { > pr_err("failed to attach dmabuf\n"); > kfree(buf); > diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c > index c6ddf2357c58..4f9a4e9783a1 100644 > --- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c > +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c > @@ -632,7 +632,7 @@ static void *vb2_dma_sg_attach_dmabuf(struct vb2_buffer *vb, struct device *dev, > > buf->dev = dev; > /* create attachment for the dmabuf with the user device */ > - dba = dma_buf_attach(dbuf, buf->dev); > + dba = dma_buf_attach(dbuf, buf->dev, false); > if (IS_ERR(dba)) { > pr_err("failed to attach dmabuf\n"); > kfree(buf); > diff --git a/drivers/media/platform/nvidia/tegra-vde/dmabuf-cache.c b/drivers/media/platform/nvidia/tegra-vde/dmabuf-cache.c > index b34244ea14dd..d04da2d3e4da 100644 > --- a/drivers/media/platform/nvidia/tegra-vde/dmabuf-cache.c > +++ b/drivers/media/platform/nvidia/tegra-vde/dmabuf-cache.c > @@ -95,7 +95,7 @@ int tegra_vde_dmabuf_cache_map(struct tegra_vde *vde, > goto ref; > } > > - attachment = dma_buf_attach(dmabuf, dev); > + attachment = dma_buf_attach(dmabuf, dev, false); > if (IS_ERR(attachment)) { > dev_err(dev, "Failed to attach dmabuf\n"); > err = PTR_ERR(attachment); > diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c > index 7b7a22c91fe4..aee6f4cbd6c6 100644 > --- a/drivers/misc/fastrpc.c > +++ b/drivers/misc/fastrpc.c > @@ -778,7 +778,7 @@ static int fastrpc_map_create(struct fastrpc_user *fl, int fd, > goto get_err; > } > > - map->attach = dma_buf_attach(map->buf, sess->dev); > + map->attach = dma_buf_attach(map->buf, sess->dev, false); > if (IS_ERR(map->attach)) { > dev_err(sess->dev, "Failed to attach dmabuf\n"); > err = PTR_ERR(map->attach); > diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c > index 2dea9e42a0f8..51926ffdb843 100644 > --- a/drivers/usb/gadget/function/f_fs.c > +++ b/drivers/usb/gadget/function/f_fs.c > @@ -1487,7 +1487,7 @@ static int ffs_dmabuf_attach(struct file *file, int fd) > if (IS_ERR(dmabuf)) > return PTR_ERR(dmabuf); > > - attach = dma_buf_attach(dmabuf, gadget->dev.parent); > + attach = dma_buf_attach(dmabuf, gadget->dev.parent, false); > if (IS_ERR(attach)) { > err = PTR_ERR(attach); > goto err_dmabuf_put; > diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c > index 5453d86324f6..9de191b6d1f7 100644 > --- a/drivers/xen/gntdev-dmabuf.c > +++ b/drivers/xen/gntdev-dmabuf.c > @@ -587,7 +587,7 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct device *dev, > gntdev_dmabuf->priv = priv; > gntdev_dmabuf->fd = fd; > > - attach = dma_buf_attach(dma_buf, dev); > + attach = dma_buf_attach(dma_buf, dev, false); > if (IS_ERR(attach)) { > ret = ERR_CAST(attach); > goto fail_free_obj; > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > index 36216d28d8bd..1ea25089b3ba 100644 > --- a/include/linux/dma-buf.h > +++ b/include/linux/dma-buf.h > @@ -598,11 +598,12 @@ dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach) > } > > struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, > - struct device *dev); > + struct device *dev, > + bool skip_map); > struct dma_buf_attachment * > dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, > const struct dma_buf_attach_ops *importer_ops, > - void *importer_priv); > + void *importer_priv, bool skip_map); > void dma_buf_detach(struct dma_buf *dmabuf, > struct dma_buf_attachment *attach); > int dma_buf_pin(struct dma_buf_attachment *attach); > diff --git a/net/core/devmem.c b/net/core/devmem.c > index 6e27a47d0493..8137ecff9e39 100644 > --- a/net/core/devmem.c > +++ b/net/core/devmem.c > @@ -202,7 +202,7 @@ net_devmem_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd, > > binding->dmabuf = dmabuf; > > - binding->attachment = dma_buf_attach(binding->dmabuf, dev->dev.parent); > + binding->attachment = dma_buf_attach(binding->dmabuf, dev->dev.parent, false); > if (IS_ERR(binding->attachment)) { > err = PTR_ERR(binding->attachment); > NL_SET_ERR_MSG(extack, "Failed to bind dmabuf to device");

2 months, 1 week

[PATCH] dma-buf: system_heap: No separate allocation for attachment sg_tables

by T.J. Mercier

struct dma_heap_attachment is a separate allocation from the struct sg_table it contains, but there is no reason for this. Let's use the slab allocator just once instead of twice for dma_heap_attachment. Signed-off-by: T.J. Mercier <tjmercier(a)google.com> --- drivers/dma-buf/heaps/system_heap.c | 43 ++++++++++++----------------- 1 file changed, 17 insertions(+), 26 deletions(-) diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c index 26d5dc89ea16..bee10c400cf0 100644 --- a/drivers/dma-buf/heaps/system_heap.c +++ b/drivers/dma-buf/heaps/system_heap.c @@ -35,7 +35,7 @@ struct system_heap_buffer { struct dma_heap_attachment { struct device *dev; - struct sg_table *table; + struct sg_table table; struct list_head list; bool mapped; }; @@ -54,29 +54,22 @@ static gfp_t order_flags[] = {HIGH_ORDER_GFP, HIGH_ORDER_GFP, LOW_ORDER_GFP}; static const unsigned int orders[] = {8, 4, 0}; #define NUM_ORDERS ARRAY_SIZE(orders) -static struct sg_table *dup_sg_table(struct sg_table *table) +static int dup_sg_table(struct sg_table *from, struct sg_table *to) { - struct sg_table *new_table; - int ret, i; struct scatterlist *sg, *new_sg; + int ret, i; - new_table = kzalloc(sizeof(*new_table), GFP_KERNEL); - if (!new_table) - return ERR_PTR(-ENOMEM); - - ret = sg_alloc_table(new_table, table->orig_nents, GFP_KERNEL); - if (ret) { - kfree(new_table); - return ERR_PTR(-ENOMEM); - } + ret = sg_alloc_table(to, from->orig_nents, GFP_KERNEL); + if (ret) + return ret; - new_sg = new_table->sgl; - for_each_sgtable_sg(table, sg, i) { + new_sg = to->sgl; + for_each_sgtable_sg(from, sg, i) { sg_set_page(new_sg, sg_page(sg), sg->length, sg->offset); new_sg = sg_next(new_sg); } - return new_table; + return 0; } static int system_heap_attach(struct dma_buf *dmabuf, @@ -84,19 +77,18 @@ static int system_heap_attach(struct dma_buf *dmabuf, { struct system_heap_buffer *buffer = dmabuf->priv; struct dma_heap_attachment *a; - struct sg_table *table; + int ret; a = kzalloc(sizeof(*a), GFP_KERNEL); if (!a) return -ENOMEM; - table = dup_sg_table(&buffer->sg_table); - if (IS_ERR(table)) { + ret = dup_sg_table(&buffer->sg_table, &a->table); + if (ret) { kfree(a); - return -ENOMEM; + return ret; } - a->table = table; a->dev = attachment->dev; INIT_LIST_HEAD(&a->list); a->mapped = false; @@ -120,8 +112,7 @@ static void system_heap_detach(struct dma_buf *dmabuf, list_del(&a->list); mutex_unlock(&buffer->lock); - sg_free_table(a->table); - kfree(a->table); + sg_free_table(&a->table); kfree(a); } @@ -129,7 +120,7 @@ static struct sg_table *system_heap_map_dma_buf(struct dma_buf_attachment *attac enum dma_data_direction direction) { struct dma_heap_attachment *a = attachment->priv; - struct sg_table *table = a->table; + struct sg_table *table = &a->table; int ret; ret = dma_map_sgtable(attachment->dev, table, direction, 0); @@ -164,7 +155,7 @@ static int system_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf, list_for_each_entry(a, &buffer->attachments, list) { if (!a->mapped) continue; - dma_sync_sgtable_for_cpu(a->dev, a->table, direction); + dma_sync_sgtable_for_cpu(a->dev, &a->table, direction); } mutex_unlock(&buffer->lock); @@ -185,7 +176,7 @@ static int system_heap_dma_buf_end_cpu_access(struct dma_buf *dmabuf, list_for_each_entry(a, &buffer->attachments, list) { if (!a->mapped) continue; - dma_sync_sgtable_for_device(a->dev, a->table, direction); + dma_sync_sgtable_for_device(a->dev, &a->table, direction); } mutex_unlock(&buffer->lock); base-commit: 8ffd015db85fea3e15a77027fda6c02ced4d2444 -- 2.49.0.805.g082f7c87e0-goog

2 months, 2 weeks

Re: [PATCH 2/3] dma-buf: Add DMA_BUF_IOCTL_GET_DMA_ADDR

by Christian König

On 4/29/25 08:39, Simona Vetter wrote: > Catching up after spring break, hence the late reply ... > > On Fri, Apr 11, 2025 at 02:34:37PM -0400, Nicolas Dufresne wrote: >> Le jeudi 10 avril 2025 à 16:53 +0200, Bastien Curutchet a écrit : >>> There is no way to transmit the DMA address of a buffer to userspace. >>> Some UIO users need this to handle DMA from userspace. >> >> To me this API is against all safe practice we've been pushing forward >> and has no place in DMA_BUF API. >> >> If this is fine for the UIO subsystem to pass around physicial >> addresses, then make this part of the UIO device ioctl. > > Yeah, this has no business in dma-buf since the entire point of dma-buf > was to stop all the nasty "just pass raw dma addr in userspace" hacks that > preceeded it. > > And over the years since dma-buf landed, we've removed a lot of these, > like dri1 drivers. Or where that's not possible like with fbdev, hid the > raw dma addr uapi behind a Kconfig. > > I concur with the overall sentiment that this should be done in > vfio/iommufd interfaces, maybe with some support added to map dma-buf. I > think patches for that have been floating around for a while, but I lost a > bit the status of where exactly they are. My take away is that we need to have a documented way for special driver specific interfaces in DMA-buf. In other words DMA-buf has some standardized rules of doing things which every implementation should follow. The implementations might of course still have bugs (e.g. allocate memory for a dma_fence operation), but at least we have documented what should be done and what's forbidden. What is still missing in the documentation is the use case when you have for example vfio which wants to talk to iommufd through a specialized interface. This doesn't necessarily needs to be part of DMA-buf, but we should still document "do it this way" because that has already worked in the last ten use cases and we don't want people to re-invent the wheel in a new funky way which then later turns out to not work. Regards, Christian. > > Cheers, Sima > >> >> regards, >> Nicolas >> >>> >>> Add a new dma_buf_ops operation that returns the DMA address. >>> Add a new ioctl to transmit this DMA address to userspace. >>> >>> Signed-off-by: Bastien Curutchet <bastien.curutchet(a)bootlin.com> >>> --- >>> drivers/dma-buf/dma-buf.c | 21 +++++++++++++++++++++ >>> include/linux/dma-buf.h | 1 + >>> include/uapi/linux/dma-buf.h | 1 + >>> 3 files changed, 23 insertions(+) >>> >>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >>> index >>> 398418bd9731ad7a3a1f12eaea6a155fa77a22fe..cbbb518981e54e50f479c3d1fcf >>> 6da6971f639c1 100644 >>> --- a/drivers/dma-buf/dma-buf.c >>> +++ b/drivers/dma-buf/dma-buf.c >>> @@ -454,6 +454,24 @@ static long dma_buf_import_sync_file(struct >>> dma_buf *dmabuf, >>> } >>> #endif >>> >>> +static int dma_buf_get_dma_addr(struct dma_buf *dmabuf, u64 __user >>> *arg) >>> +{ >>> + u64 addr; >>> + int ret; >>> + >>> + if (!dmabuf->ops->get_dma_addr) >>> + return -EINVAL; >>> + >>> + ret = dmabuf->ops->get_dma_addr(dmabuf, &addr); >>> + if (ret) >>> + return ret; >>> + >>> + if (copy_to_user(arg, &addr, sizeof(u64))) >>> + return -EFAULT; >>> + >>> + return 0; >>> +} >>> + >>> static long dma_buf_ioctl(struct file *file, >>> unsigned int cmd, unsigned long arg) >>> { >>> @@ -504,6 +522,9 @@ static long dma_buf_ioctl(struct file *file, >>> return dma_buf_import_sync_file(dmabuf, (const void >>> __user *)arg); >>> #endif >>> >>> + case DMA_BUF_IOCTL_GET_DMA_ADDR: >>> + return dma_buf_get_dma_addr(dmabuf, (u64 __user >>> *)arg); >>> + >>> default: >>> return -ENOTTY; >>> } >>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h >>> index >>> 36216d28d8bdc01a9c9c47e27c392413f7f6c5fb..ed4bf15d3ce82e7a86323fff459 >>> 699a9bc8baa3b 100644 >>> --- a/include/linux/dma-buf.h >>> +++ b/include/linux/dma-buf.h >>> @@ -285,6 +285,7 @@ struct dma_buf_ops { >>> >>> int (*vmap)(struct dma_buf *dmabuf, struct iosys_map *map); >>> void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map >>> *map); >>> + int (*get_dma_addr)(struct dma_buf *dmabuf, u64 *addr); >>> }; >>> >>> /** >>> diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma- >>> buf.h >>> index >>> 5a6fda66d9adf01438619e7e67fa69f0fec2d88d..f3aba46942042de6a2e3a4cca3e >>> b3f87175e29c9 100644 >>> --- a/include/uapi/linux/dma-buf.h >>> +++ b/include/uapi/linux/dma-buf.h >>> @@ -178,5 +178,6 @@ struct dma_buf_import_sync_file { >>> #define DMA_BUF_SET_NAME_B _IOW(DMA_BUF_BASE, 1, __u64) >>> #define DMA_BUF_IOCTL_EXPORT_SYNC_FILE _IOWR(DMA_BUF_BASE, 2, >>> struct dma_buf_export_sync_file) >>> #define DMA_BUF_IOCTL_IMPORT_SYNC_FILE _IOW(DMA_BUF_BASE, 3, struct >>> dma_buf_import_sync_file) >>> +#define DMA_BUF_IOCTL_GET_DMA_ADDR _IOR(DMA_BUF_BASE, 4, __u64 >>> *) >>> >>> #endif >

2 months, 2 weeks

[PATCH v3 00/33] drm/msm: sparse / "VM_BIND" support

by Rob Clark

From: Rob Clark <robdclark(a)chromium.org> Conversion to DRM GPU VA Manager[1], and adding support for Vulkan Sparse Memory[2] in the form of: 1. A new VM_BIND submitqueue type for executing VM MSM_SUBMIT_BO_OP_MAP/ MAP_NULL/UNMAP commands 2. A new VM_BIND ioctl to allow submitting batches of one or more MAP/MAP_NULL/UNMAP commands to a VM_BIND submitqueue I did not implement support for synchronous VM_BIND commands. Since userspace could just immediately wait for the `SUBMIT` to complete, I don't think we need this extra complexity in the kernel. Synchronous/immediate VM_BIND operations could be implemented with a 2nd VM_BIND submitqueue. The corresponding mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533 Changes in v3: - Switched to seperate VM_BIND ioctl. This makes the UABI a bit cleaner, but OTOH the userspace code was cleaner when the end result of either type of VkQueue lead to the same ioctl. So I'm a bit on the fence. - Switched to doing the gpuvm bookkeeping synchronously, and only deferring the pgtable updates. This avoids needing to hold any resv locks in the fence signaling path, resolving the last shrinker related lockdep complaints. OTOH it means userspace can trigger invalid pgtable updates with multiple VM_BIND queues. In this case, we ensure that unmaps happen completely (to prevent userspace from using this to access free'd pages), mark the context as unusable, and move on with life. - Link to v2: https://lore.kernel.org/all/20250319145425.51935-1-robdclark@gmail.com/ Changes in v2: - Dropped Bibek Kumar Patro's arm-smmu patches[3], which have since been merged. - Pre-allocate all the things, and drop HACK patch which disabled shrinker. This includes ensuring that vm_bo objects are allocated up front, pre- allocating VMA objects, and pre-allocating pages used for pgtable updates. The latter utilizes io_pgtable_cfg callbacks for pgtable alloc/free, that were initially added for panthor. - Add back support for BO dumping for devcoredump. - Link to v1 (RFC): https://lore.kernel.org/dri-devel/20241207161651.410556-1-robdclark@gmail.c… [1] https://www.kernel.org/doc/html/next/gpu/drm-mm.html#drm-gpuvm [2] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html [3] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=909700 Rob Clark (33): drm/gpuvm: Don't require obj lock in destructor path drm/gpuvm: Allow VAs to hold soft reference to BOs iommu/io-pgtable-arm: Add quirk to quiet WARN_ON() drm/msm: Rename msm_file_private -> msm_context drm/msm: Improve msm_context comments drm/msm: Rename msm_gem_address_space -> msm_gem_vm drm/msm: Remove vram carveout support drm/msm: Collapse vma allocation and initialization drm/msm: Collapse vma close and delete drm/msm: Don't close VMAs on purge drm/msm: drm_gpuvm conversion drm/msm: Convert vm locking drm/msm: Use drm_gpuvm types more drm/msm: Split out helper to get iommu prot flags drm/msm: Add mmu support for non-zero offset drm/msm: Add PRR support drm/msm: Rename msm_gem_vma_purge() -> _unmap() drm/msm: Lazily create context VM drm/msm: Add opt-in for VM_BIND drm/msm: Mark VM as unusable on GPU hangs drm/msm: Add _NO_SHARE flag drm/msm: Crashdump prep for sparse mappings drm/msm: rd dumping prep for sparse mappings drm/msm: Crashdec support for sparse drm/msm: rd dumping support for sparse drm/msm: Extract out syncobj helpers drm/msm: Use DMA_RESV_USAGE_BOOKKEEP/KERNEL drm/msm: Add VM_BIND submitqueue drm/msm: Support IO_PGTABLE_QUIRK_NO_WARN_ON drm/msm: Support pgtable preallocation drm/msm: Split out map/unmap ops drm/msm: Add VM_BIND ioctl drm/msm: Bump UAPI version drivers/gpu/drm/drm_gpuvm.c | 15 +- drivers/gpu/drm/msm/Kconfig | 1 + drivers/gpu/drm/msm/Makefile | 1 + drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 25 +- drivers/gpu/drm/msm/adreno/a2xx_gpummu.c | 5 +- drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 17 +- drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 17 +- drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 22 +- drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +- drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 10 +- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 32 +- drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 49 +- drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 6 +- drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 10 +- drivers/gpu/drm/msm/adreno/adreno_device.c | 4 - drivers/gpu/drm/msm/adreno/adreno_gpu.c | 88 +- drivers/gpu/drm/msm/adreno/adreno_gpu.h | 23 +- .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 14 +- drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 18 +- drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 2 +- drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 18 +- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 14 +- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 4 +- drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 6 +- drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 28 +- drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 12 +- drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 4 +- drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 19 +- drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 12 +- drivers/gpu/drm/msm/dsi/dsi_host.c | 14 +- drivers/gpu/drm/msm/msm_drv.c | 183 +-- drivers/gpu/drm/msm/msm_drv.h | 35 +- drivers/gpu/drm/msm/msm_fb.c | 18 +- drivers/gpu/drm/msm/msm_fbdev.c | 2 +- drivers/gpu/drm/msm/msm_gem.c | 489 +++---- drivers/gpu/drm/msm/msm_gem.h | 217 ++- drivers/gpu/drm/msm/msm_gem_prime.c | 15 + drivers/gpu/drm/msm/msm_gem_shrinker.c | 4 +- drivers/gpu/drm/msm/msm_gem_submit.c | 295 ++-- drivers/gpu/drm/msm/msm_gem_vma.c | 1265 +++++++++++++++-- drivers/gpu/drm/msm/msm_gpu.c | 171 ++- drivers/gpu/drm/msm/msm_gpu.h | 132 +- drivers/gpu/drm/msm/msm_iommu.c | 298 +++- drivers/gpu/drm/msm/msm_kms.c | 18 +- drivers/gpu/drm/msm/msm_kms.h | 2 +- drivers/gpu/drm/msm/msm_mmu.h | 38 +- drivers/gpu/drm/msm/msm_rd.c | 62 +- drivers/gpu/drm/msm/msm_ringbuffer.c | 4 +- drivers/gpu/drm/msm/msm_submitqueue.c | 86 +- drivers/gpu/drm/msm/msm_syncobj.c | 172 +++ drivers/gpu/drm/msm/msm_syncobj.h | 37 + drivers/iommu/io-pgtable-arm.c | 18 +- include/drm/drm_gpuvm.h | 12 +- include/linux/io-pgtable.h | 8 + include/uapi/drm/msm_drm.h | 149 +- 57 files changed, 3012 insertions(+), 1216 deletions(-) create mode 100644 drivers/gpu/drm/msm/msm_syncobj.c create mode 100644 drivers/gpu/drm/msm/msm_syncobj.h -- 2.49.0

2 months, 2 weeks

Re: [PATCH 4/4] drm/nouveau: Check dma_fence in canonical way

by Christian König

On 4/24/25 15:02, Philipp Stanner wrote: > In nouveau_fence_done(), a fence is checked for being signaled by > manually evaluating the base fence's bits. This can be done in a > canonical manner through dma_fence_is_signaled(). > > Replace the bit-check with dma_fence_is_signaled(). > > Signed-off-by: Philipp Stanner <phasta(a)kernel.org> I think the bit check was used here as fast path optimization because we later call dma_fence_is_signaled() anyway. Feel free to add my acked-by, but honestly what nouveau does here looks rather suspicious to me. Regards, Christian. > --- > drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c > index fb9811938c82..d5654e26d5bc 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_fence.c > +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c > @@ -253,7 +253,7 @@ nouveau_fence_done(struct nouveau_fence *fence) > struct nouveau_channel *chan; > unsigned long flags; > > - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->base.flags)) > + if (dma_fence_is_signaled(&fence->base)) > return true; > > spin_lock_irqsave(&fctx->lock, flags);

2 months, 2 weeks

Re: [PATCH 3/4] drm/nouveau: Simplify nouveau_fence_done()

by Christian König

On 4/24/25 15:02, Philipp Stanner wrote: > nouveau_fence_done() contains an if branch that checks whether a > nouveau_fence has either of the two existing nouveau_fence backend ops, > which will always evaluate to true. > > Remove the surplus check. > > Signed-off-by: Philipp Stanner <phasta(a)kernel.org> Reviewed-by: Christian König <christian.koenig(a)amd.com> > --- > drivers/gpu/drm/nouveau/nouveau_fence.c | 24 +++++++++++------------- > 1 file changed, 11 insertions(+), 13 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c > index 2b79bcb7da16..fb9811938c82 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_fence.c > +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c > @@ -249,21 +249,19 @@ nouveau_fence_emit(struct nouveau_fence *fence) > bool > nouveau_fence_done(struct nouveau_fence *fence) > { > - if (fence->base.ops == &nouveau_fence_ops_legacy || > - fence->base.ops == &nouveau_fence_ops_uevent) { > - struct nouveau_fence_chan *fctx = nouveau_fctx(fence); > - struct nouveau_channel *chan; > - unsigned long flags; > + struct nouveau_fence_chan *fctx = nouveau_fctx(fence); > + struct nouveau_channel *chan; > + unsigned long flags; > > - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->base.flags)) > - return true; > + if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->base.flags)) > + return true; > + > + spin_lock_irqsave(&fctx->lock, flags); > + chan = rcu_dereference_protected(fence->channel, lockdep_is_held(&fctx->lock)); > + if (chan) > + nouveau_fence_update(chan, fctx); > + spin_unlock_irqrestore(&fctx->lock, flags); > > - spin_lock_irqsave(&fctx->lock, flags); > - chan = rcu_dereference_protected(fence->channel, lockdep_is_held(&fctx->lock)); > - if (chan) > - nouveau_fence_update(chan, fctx); > - spin_unlock_irqrestore(&fctx->lock, flags); > - } > return dma_fence_is_signaled(&fence->base); > } >

2 months, 2 weeks

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Linaro-mm-sig April 2025