> On Aug 1, 2022, at 5:46 PM, Tomasz Figa <tfiga(a)chromium.org> wrote:
>
>
>
>> On Mon, Aug 1, 2022 at 3:44 PM Hsia-Jun Li <Randy.Li(a)synaptics.com> wrote:
>>> On 8/1/22 14:19, Tomasz Figa wrote:
>> Hello Tomasz
>>> Hi Randy,
>>> On Mon, Aug 1, 2022 at 5:21 AM <ayaka(a)soulik.info> wrote:
>>>> From: Randy Li <ayaka(a)soulik.info>
>>>> This module is still at an early stage; I wrote it to show what
>>>> APIs we need here.
>>>> Let me explain why we need such a module here.
>>>> If you don't allocate buffers from a V4L2 M2M device, this module
>>>> may not be very useful. I am sure most users won't know that a
>>>> device would require them to allocate buffers from a DMA-Heap and
>>>> then import those buffers into a V4L2 queue.
>>>> Then the question goes back to why DMA-Heap. From Android's
>>>> description, we know it is about copyright DRM.
>>>> When we allocate a buffer from a DMA-Heap, it may register that buffer
>>>> with the trusted execution environment, so the firmware which runs
>>>> there, or which can only be accessed from there, can use that buffer later.
>>>> The answer above leads to another thing which is not done in this
>>>> version: the DMA mapping. Although on some platforms a DMA-Heap
>>>> also acts as an IOMMU device, for the general case we had better
>>>> assume the device mapping should be done by each device itself.
>>>> The problem is that we only know alloc_devs in those DMA-buf
>>>> methods, which are DMA-Heaps in my design; the device from the queue
>>>> is not enough, as a plane may require another IOMMU device or table
>>>> for mapping.
>>>> Signed-off-by: Randy Li <ayaka(a)soulik.info>
>>>> ---
>>>> drivers/media/common/videobuf2/Kconfig | 6 +
>>>> drivers/media/common/videobuf2/Makefile | 1 +
>>>> .../common/videobuf2/videobuf2-dma-heap.c | 350 ++++++++++++++++++
>>>> include/media/videobuf2-dma-heap.h | 30 ++
>>>> 4 files changed, 387 insertions(+)
>>>> create mode 100644 drivers/media/common/videobuf2/videobuf2-dma-heap.c
>>>> create mode 100644 include/media/videobuf2-dma-heap.h
>>> First of all, thanks for the series.
>>> Possibly a stupid question, but why not just allocate the DMA-bufs
>>> directly from the DMA-buf heap device in the userspace and just import
>>> the buffers to the V4L2 device using V4L2_MEMORY_DMABUF?
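For reference, the flow Tomasz suggests — allocate from a DMA-buf heap in userspace and import the fd with V4L2_MEMORY_DMABUF — looks roughly like the sketch below. The heap name, buffer type, and error handling are illustrative only, not taken from the thread.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/dma-heap.h>
#include <linux/videodev2.h>

/* Allocate one buffer from a DMA-buf heap and queue it to a V4L2
 * device as an imported dma-buf. */
int alloc_and_import(int v4l2_fd, size_t size)
{
	struct dma_heap_allocation_data alloc = {
		.len = size,
		.fd_flags = O_RDWR | O_CLOEXEC,
	};
	struct v4l2_buffer buf;
	int heap_fd;

	heap_fd = open("/dev/dma_heap/system", O_RDONLY | O_CLOEXEC);
	if (heap_fd < 0)
		return -1;
	if (ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &alloc) < 0)
		return -1;

	memset(&buf, 0, sizeof(buf));
	buf.type = V4L2_BUF_TYPE_VIDEO_OUTPUT;
	buf.memory = V4L2_MEMORY_DMABUF;
	buf.index = 0;
	buf.m.fd = alloc.fd;	/* the dma-buf fd returned by the heap */
	return ioctl(v4l2_fd, VIDIOC_QBUF, &buf);
}
```

Randy's point below is that picking the right heap in this flow requires platform knowledge the user may not have.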
>> Sometimes the allocation policy could be very complex. Suppose a
>> multi-planar pixel format with frame buffer compression enabled:
>> its luma and chroma data could be allocated from a pool dedicated
>> to large buffers, while its metadata would come from a pool that
>> many users take a few small slices from (like a system pool).
>> Then, when we have a new user who knows nothing about this platform,
>> if we just configure the alloc_devs in each queue properly, the user
>> won't need to know those complex rules.
>> The real situation could be even more complex. Samsung MFC's left and
>> right banks could be regarded as two pools; many devices would benefit
>> from this, either in allocation time or in secure-buffer policy.
>> In our design, when we need to do secure decoding (DRM video), codec2
>> would allocate buffers from the pool dedicated to that, while for
>> non-DRM video users need not care about this.
>
> I'm a little bit surprised about this, because on Android all the
> graphics buffers are allocated from the system IAllocator and imported
> to the specific devices.
In the non-tunnel mode, yes it is. The tunnel mode, however, is completely vendor-defined: neither HWC nor codec2 cares where the buffers come from, so you can do whatever you want.
Besides, there is DRM video on GNU/Linux platforms too; I heard WebKit has made a huge effort here, and PlayReady is one that can work on non-Android Linux.
> Would it make sense to instead extend the UAPI to expose enough
> information about the allocation requirements to the userspace, so it
> can allocate correctly?
Yes, it could. But as I said, it would require users to do more work.
> My reasoning here is that it's not a driver's decision to allocate
> from a DMA-buf heap (and which one) or not. It's the userspace which
> knows that, based on the specific use case that it wants to fulfill.
Although I would like to let users decide that, on some platforms they simply can't, as it would violate the security rules.
For example, the video codec and display device may only be able to access a certain region of memory that no other device or trusted app can access. Users have to allocate the buffer from the pool the vendor decided on.
So why not offer a quick way, so users don't need to resort to trial and error?
> Also, FWIW, dma_heap_ioctl_allocate() is a static function not exposed
> to other kernel modules:
> https://urldefense.proofpoint.com/v2/url?u=https-3A__elixir.bootlin.com_lin…
I may have forgotten to mention that you need two extra patches from Linaro that export those APIs (the original version is out of date). Besides, the Android kernel does have the two kernel APIs I need here.
Actually I need more APIs from DMA-Heap to achieve the things on the TODO list.
> By the way, the MFC left/right port requirement was gone long ago, it
> was only one of the earliest Exynos SoCs which required that.
Yes, MFC v5 or v6, right. I just wanted to point out that anything is possible; a vendor always has its own reasons.
> Best regards,
> Tomasz
>
>>> Best regards,
>>> Tomasz
>>>> diff --git a/drivers/media/common/videobuf2/Kconfig b/drivers/media/common/videobuf2/Kconfig
>>>> index d2223a12c95f..02235077f07e 100644
>>>> --- a/drivers/media/common/videobuf2/Kconfig
>>>> +++ b/drivers/media/common/videobuf2/Kconfig
>>>> @@ -30,3 +30,9 @@ config VIDEOBUF2_DMA_SG
>>>> config VIDEOBUF2_DVB
>>>> tristate
>>>> select VIDEOBUF2_CORE
>>>> +
>>>> +config VIDEOBUF2_DMA_HEAP
>>>> + tristate
>>>> + select VIDEOBUF2_CORE
>>>> + select VIDEOBUF2_MEMOPS
>>>> + select DMABUF_HEAPS
>>>> diff --git a/drivers/media/common/videobuf2/Makefile b/drivers/media/common/videobuf2/Makefile
>>>> index a6fe3f304685..7fe65f93117f 100644
>>>> --- a/drivers/media/common/videobuf2/Makefile
>>>> +++ b/drivers/media/common/videobuf2/Makefile
>>>> @@ -10,6 +10,7 @@ endif
>>>> # (e. g. LC_ALL=C sort Makefile)
>>>> obj-$(CONFIG_VIDEOBUF2_CORE) += videobuf2-common.o
>>>> obj-$(CONFIG_VIDEOBUF2_DMA_CONTIG) += videobuf2-dma-contig.o
>>>> +obj-$(CONFIG_VIDEOBUF2_DMA_HEAP) += videobuf2-dma-heap.o
>>>> obj-$(CONFIG_VIDEOBUF2_DMA_SG) += videobuf2-dma-sg.o
>>>> obj-$(CONFIG_VIDEOBUF2_DVB) += videobuf2-dvb.o
>>>> obj-$(CONFIG_VIDEOBUF2_MEMOPS) += videobuf2-memops.o
>>>> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-heap.c b/drivers/media/common/videobuf2/videobuf2-dma-heap.c
>>>> new file mode 100644
>>>> index 000000000000..377b82ab8f5a
>>>> --- /dev/null
>>>> +++ b/drivers/media/common/videobuf2/videobuf2-dma-heap.c
>>>> @@ -0,0 +1,350 @@
>>>> +/*
>>>> + * Copyright (C) 2022 Randy Li <ayaka(a)soulik.info>
>>>> + *
>>>> + * This software is licensed under the terms of the GNU General Public
>>>> + * License version 2, as published by the Free Software Foundation, and
>>>> + * may be copied, distributed, and modified under those terms.
>>>> + *
>>>> + * This program is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>>>> + * GNU General Public License for more details.
>>>> + *
>>>> + */
>>>> +
>>>> +#include <linux/dma-buf.h>
>>>> +#include <linux/dma-heap.h>
>>>> +#include <linux/refcount.h>
>>>> +#include <linux/scatterlist.h>
>>>> +#include <linux/sched.h>
>>>> +#include <linux/slab.h>
>>>> +#include <linux/dma-mapping.h>
>>>> +
>>>> +#include <media/videobuf2-v4l2.h>
>>>> +#include <media/videobuf2-memops.h>
>>>> +#include <media/videobuf2-dma-heap.h>
>>>> +
>>>> +struct vb2_dmaheap_buf {
>>>> + struct device *dev;
>>>> + void *vaddr;
>>>> + unsigned long size;
>>>> + struct dma_buf *dmabuf;
>>>> + dma_addr_t dma_addr;
>>>> + unsigned long attrs;
>>>> + enum dma_data_direction dma_dir;
>>>> + struct sg_table *dma_sgt;
>>>> +
>>>> + /* MMAP related */
>>>> + struct vb2_vmarea_handler handler;
>>>> + refcount_t refcount;
>>>> +
>>>> + /* DMABUF related */
>>>> + struct dma_buf_attachment *db_attach;
>>>> +};
>>>> +
>>>> +/*********************************************/
>>>> +/* callbacks for all buffers */
>>>> +/*********************************************/
>>>> +
>>>> +void *vb2_dmaheap_cookie(struct vb2_buffer *vb, void *buf_priv)
>>>> +{
>>>> + struct vb2_dmaheap_buf *buf = buf_priv;
>>>> +
>>>> + return &buf->dma_addr;
>>>> +}
>>>> +
>>>> +static void *vb2_dmaheap_vaddr(struct vb2_buffer *vb, void *buf_priv)
>>>> +{
>>>> + struct vb2_dmaheap_buf *buf = buf_priv;
>>>> + struct iosys_map map;
>>>> +
>>>> + if (buf->vaddr)
>>>> + return buf->vaddr;
>>>> +
>>>> + if (buf->db_attach) {
>>>> + if (!dma_buf_vmap(buf->db_attach->dmabuf, &map))
>>>> + buf->vaddr = map.vaddr;
>>>> + }
>>>> +
>>>> + return buf->vaddr;
>>>> +}
>>>> +
>>>> +static unsigned int vb2_dmaheap_num_users(void *buf_priv)
>>>> +{
>>>> + struct vb2_dmaheap_buf *buf = buf_priv;
>>>> +
>>>> + return refcount_read(&buf->refcount);
>>>> +}
>>>> +
>>>> +static void vb2_dmaheap_prepare(void *buf_priv)
>>>> +{
>>>> + struct vb2_dmaheap_buf *buf = buf_priv;
>>>> +
>>>> + /* TODO: DMABUF exporter will flush the cache for us */
>>>> + if (buf->db_attach)
>>>> + return;
>>>> +
>>>> + dma_buf_end_cpu_access(buf->dmabuf, buf->dma_dir);
>>>> +}
>>>> +
>>>> +static void vb2_dmaheap_finish(void *buf_priv)
>>>> +{
>>>> + struct vb2_dmaheap_buf *buf = buf_priv;
>>>> +
>>>> + /* TODO: DMABUF exporter will flush the cache for us */
>>>> + if (buf->db_attach)
>>>> + return;
>>>> +
>>>> + dma_buf_begin_cpu_access(buf->dmabuf, buf->dma_dir);
>>>> +}
>>>> +
>>>> +/*********************************************/
>>>> +/* callbacks for MMAP buffers */
>>>> +/*********************************************/
>>>> +
>>>> +void vb2_dmaheap_put(void *buf_priv)
>>>> +{
>>>> + struct vb2_dmaheap_buf *buf = buf_priv;
>>>> +
>>>> + if (!refcount_dec_and_test(&buf->refcount))
>>>> + return;
>>>> +
>>>> + dma_buf_put(buf->dmabuf);
>>>> +
>>>> + put_device(buf->dev);
>>>> + kfree(buf);
>>>> +}
>>>> +
>>>> +static void *vb2_dmaheap_alloc(struct vb2_buffer *vb,
>>>> + struct device *dev,
>>>> + unsigned long size)
>>>> +{
>>>> + struct vb2_queue *q = vb->vb2_queue;
>>>> + struct dma_heap *heap;
>>>> + struct vb2_dmaheap_buf *buf;
>>>> + const char *heap_name;
>>>> + int ret;
>>>> +
>>>> + if (WARN_ON(!dev))
>>>> + return ERR_PTR(-EINVAL);
>>>> +
>>>> + heap_name = dev_name(dev);
>>>> + if (!heap_name)
>>>> + return ERR_PTR(-EINVAL);
>>>> +
>>>> + heap = dma_heap_find(heap_name);
>>>> + if (!heap) {
>>>> + dev_err(dev, "is not a DMA-heap device\n");
>>>> + return ERR_PTR(-EINVAL);
>>>> + }
>>>> +
>>>> + buf = kzalloc(sizeof *buf, GFP_KERNEL);
>>>> + if (!buf)
>>>> + return ERR_PTR(-ENOMEM);
>>>> +
>>>> + /* Prevent the device from being released while the buffer is used */
>>>> + buf->dev = get_device(dev);
>>>> + buf->attrs = vb->vb2_queue->dma_attrs;
>>>> + buf->dma_dir = vb->vb2_queue->dma_dir;
>>>> +
>>>> + /* TODO: heap flags */
>>>> + ret = dma_heap_buffer_alloc(heap, size, 0, 0);
>>>> + if (ret < 0) {
>>>> + dev_err(dev, "failed to allocate from DMA-heap\n");
>>>> + put_device(buf->dev);
>>>> + kfree(buf);
>>>> + return ERR_PTR(ret);
>>>> + }
>>>> + buf->dmabuf = dma_buf_get(ret);
>>>> +
>>>> + /* FIXME */
>>>> + buf->dma_addr = 0;
>>>> +
>>>> + if ((q->dma_attrs & DMA_ATTR_NO_KERNEL_MAPPING) == 0)
>>>> + buf->vaddr = buf->dmabuf;
>>>> +
>>>> + buf->handler.refcount = &buf->refcount;
>>>> + buf->handler.put = vb2_dmaheap_put;
>>>> + buf->handler.arg = buf;
>>>> +
>>>> + refcount_set(&buf->refcount, 1);
>>>> +
>>>> + return buf;
>>>> +}
>>>> +
>>>> +static int vb2_dmaheap_mmap(void *buf_priv, struct vm_area_struct *vma)
>>>> +{
>>>> + struct vb2_dmaheap_buf *buf = buf_priv;
>>>> + int ret;
>>>> +
>>>> + if (!buf) {
>>>> + printk(KERN_ERR "No buffer to map\n");
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> + vma->vm_flags &= ~VM_PFNMAP;
>>>> +
>>>> + ret = dma_buf_mmap(buf->dmabuf, vma, 0);
>>>> + if (ret) {
>>>> + pr_err("Remapping memory failed, error: %d\n", ret);
>>>> + return ret;
>>>> + }
>>>> + vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
>>>> + vma->vm_private_data = &buf->handler;
>>>> + vma->vm_ops = &vb2_common_vm_ops;
>>>> +
>>>> + vma->vm_ops->open(vma);
>>>> +
>>>> + pr_debug("%s: mapped memid 0x%08lx at 0x%08lx, size %ld\n",
>>>> + __func__, (unsigned long)buf->dma_addr, vma->vm_start,
>>>> + buf->size);
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +/*********************************************/
>>>> +/* DMABUF ops for exporters */
>>>> +/*********************************************/
>>>> +
>>>> +static struct dma_buf *vb2_dmaheap_get_dmabuf(struct vb2_buffer *vb,
>>>> + void *buf_priv,
>>>> + unsigned long flags)
>>>> +{
>>>> + struct vb2_dmaheap_buf *buf = buf_priv;
>>>> + struct dma_buf *dbuf;
>>>> +
>>>> + dbuf = buf->dmabuf;
>>>> +
>>>> + return dbuf;
>>>> +}
>>>> +
>>>> +/*********************************************/
>>>> +/* callbacks for DMABUF buffers */
>>>> +/*********************************************/
>>>> +
>>>> +static int vb2_dmaheap_map_dmabuf(void *mem_priv)
>>>> +{
>>>> + struct vb2_dmaheap_buf *buf = mem_priv;
>>>> + struct sg_table *sgt;
>>>> +
>>>> + if (WARN_ON(!buf->db_attach)) {
>>>> + pr_err("trying to pin a non attached buffer\n");
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> + if (WARN_ON(buf->dma_sgt)) {
>>>> + pr_err("dmabuf buffer is already pinned\n");
>>>> + return 0;
>>>> + }
>>>> +
>>>> + /* get the associated scatterlist for this buffer */
>>>> + sgt = dma_buf_map_attachment(buf->db_attach, buf->dma_dir);
>>>> + if (IS_ERR(sgt)) {
>>>> + pr_err("Error getting dmabuf scatterlist\n");
>>>> + return -EINVAL;
>>>> + }
>>>> +
>>>> + buf->dma_addr = sg_dma_address(sgt->sgl);
>>>> + buf->dma_sgt = sgt;
>>>> + buf->vaddr = NULL;
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static void vb2_dmaheap_unmap_dmabuf(void *mem_priv)
>>>> +{
>>>> + struct vb2_dmaheap_buf *buf = mem_priv;
>>>> + struct sg_table *sgt = buf->dma_sgt;
>>>> + struct iosys_map map = IOSYS_MAP_INIT_VADDR(buf->vaddr);
>>>> +
>>>> + if (WARN_ON(!buf->db_attach)) {
>>>> + pr_err("trying to unpin a not attached buffer\n");
>>>> + return;
>>>> + }
>>>> +
>>>> + if (WARN_ON(!sgt)) {
>>>> + pr_err("dmabuf buffer is already unpinned\n");
>>>> + return;
>>>> + }
>>>> +
>>>> + if (buf->vaddr) {
>>>> + dma_buf_vunmap(buf->db_attach->dmabuf, &map);
>>>> + buf->vaddr = NULL;
>>>> + }
>>>> + dma_buf_unmap_attachment(buf->db_attach, sgt, buf->dma_dir);
>>>> +
>>>> + buf->dma_addr = 0;
>>>> + buf->dma_sgt = NULL;
>>>> +}
>>>> +
>>>> +static void vb2_dmaheap_detach_dmabuf(void *mem_priv)
>>>> +{
>>>> + struct vb2_dmaheap_buf *buf = mem_priv;
>>>> +
>>>> + /* if vb2 works correctly you should never detach mapped buffer */
>>>> + if (WARN_ON(buf->dma_addr))
>>>> + vb2_dmaheap_unmap_dmabuf(buf);
>>>> +
>>>> + /* detach this attachment */
>>>> + dma_buf_detach(buf->db_attach->dmabuf, buf->db_attach);
>>>> + kfree(buf);
>>>> +}
>>>> +
>>>> +static void *vb2_dmaheap_attach_dmabuf(struct vb2_buffer *vb, struct device *dev,
>>>> + struct dma_buf *dbuf, unsigned long size)
>>>> +{
>>>> + struct vb2_dmaheap_buf *buf;
>>>> + struct dma_buf_attachment *dba;
>>>> +
>>>> + if (dbuf->size < size)
>>>> + return ERR_PTR(-EFAULT);
>>>> +
>>>> + if (WARN_ON(!dev))
>>>> + return ERR_PTR(-EINVAL);
>>>> + /*
>>>> + * TODO: A better way to check whether the buffer is coming
>>>> + * from this heap or this heap could accept this buffer
>>>> + */
>>>> + if (strcmp(dbuf->exp_name, dev_name(dev)))
>>>> + return ERR_PTR(-EINVAL);
>>>> +
>>>> + buf = kzalloc(sizeof(*buf), GFP_KERNEL);
>>>> + if (!buf)
>>>> + return ERR_PTR(-ENOMEM);
>>>> +
>>>> + buf->dev = dev;
>>>> + /* create attachment for the dmabuf with the user device */
>>>> + dba = dma_buf_attach(dbuf, buf->dev);
>>>> + if (IS_ERR(dba)) {
>>>> + pr_err("failed to attach dmabuf\n");
>>>> + kfree(buf);
>>>> + return dba;
>>>> + }
>>>> +
>>>> + buf->dma_dir = vb->vb2_queue->dma_dir;
>>>> + buf->size = size;
>>>> + buf->db_attach = dba;
>>>> +
>>>> + return buf;
>>>> +}
>>>> +
>>>> +const struct vb2_mem_ops vb2_dmaheap_memops = {
>>>> + .alloc = vb2_dmaheap_alloc,
>>>> + .put = vb2_dmaheap_put,
>>>> + .get_dmabuf = vb2_dmaheap_get_dmabuf,
>>>> + .cookie = vb2_dmaheap_cookie,
>>>> + .vaddr = vb2_dmaheap_vaddr,
>>>> + .prepare = vb2_dmaheap_prepare,
>>>> + .finish = vb2_dmaheap_finish,
>>>> + .map_dmabuf = vb2_dmaheap_map_dmabuf,
>>>> + .unmap_dmabuf = vb2_dmaheap_unmap_dmabuf,
>>>> + .attach_dmabuf = vb2_dmaheap_attach_dmabuf,
>>>> + .detach_dmabuf = vb2_dmaheap_detach_dmabuf,
>>>> + .num_users = vb2_dmaheap_num_users,
>>>> + .mmap = vb2_dmaheap_mmap,
>>>> +};
>>>> +
>>>> +MODULE_DESCRIPTION("DMA-Heap memory handling routines for videobuf2");
>>>> +MODULE_AUTHOR("Randy Li <ayaka(a)soulik.info>");
>>>> +MODULE_LICENSE("GPL");
>>>> +MODULE_IMPORT_NS(DMA_BUF);
>>>> diff --git a/include/media/videobuf2-dma-heap.h b/include/media/videobuf2-dma-heap.h
>>>> new file mode 100644
>>>> index 000000000000..fa057f67d6e9
>>>> --- /dev/null
>>>> +++ b/include/media/videobuf2-dma-heap.h
>>>> @@ -0,0 +1,30 @@
>>>> +/*
>>>> + * Copyright (C) 2022 Randy Li <ayaka(a)soulik.info>
>>>> + *
>>>> + * This software is licensed under the terms of the GNU General Public
>>>> + * License version 2, as published by the Free Software Foundation, and
>>>> + * may be copied, distributed, and modified under those terms.
>>>> + *
>>>> + * This program is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>>>> + * GNU General Public License for more details.
>>>> + *
>>>> + */
>>>> +
>>>> +#ifndef _MEDIA_VIDEOBUF2_DMA_HEAP_H
>>>> +#define _MEDIA_VIDEOBUF2_DMA_HEAP_H
>>>> +
>>>> +#include <media/videobuf2-v4l2.h>
>>>> +#include <linux/dma-mapping.h>
>>>> +
>>>> +static inline dma_addr_t
>>>> +vb2_dmaheap_plane_dma_addr(struct vb2_buffer *vb, unsigned int plane_no)
>>>> +{
>>>> + dma_addr_t *addr = vb2_plane_cookie(vb, plane_no);
>>>> +
>>>> + return *addr;
>>>> +}
>>>> +
>>>> +extern const struct vb2_mem_ops vb2_dmaheap_memops;
>>>> +#endif
>>>> --
>>>> 2.17.1
>> --
>> Hsia-Jun(Randy) Li
ib_umem_dmabuf_map_pages() returns 0 on success and -ERRNO on failure.
dma_resv_wait_timeout() uses a different scheme:
* Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or
* greater than zero on success.
This results in ib_umem_dmabuf_map_pages() being non-functional as a
positive return will be understood to be an error by drivers.
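The diff below applies the translation in place; isolated from the RDMA code, the mapping between the two return-value schemes looks like this (the helper name is mine, not from the patch, and -EINTR stands in for the kernel-only -ERESTARTSYS):

```c
#include <errno.h>

/* Map a dma_resv_wait_timeout()-style result (<0 on interruption,
 * 0 on timeout, >0 on success) onto the 0/-ERRNO convention that
 * callers of ib_umem_dmabuf_map_pages() expect. */
int wait_result_to_errno(long ret)
{
	if (ret < 0)
		return (int)ret;	/* pass the error through */
	if (ret == 0)
		return -ETIMEDOUT;	/* the wait timed out */
	return 0;			/* fences signaled: success */
}
```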
Fixes: f30bceab16d1 ("RDMA: use dma_resv_wait() instead of extracting the fence")
Cc: stable(a)kernel.org
Tested-by: Maor Gottlieb <maorg(a)nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
---
drivers/infiniband/core/umem_dmabuf.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
Oded, I assume the Habana driver will hit this as well - does this mean you
are not testing upstream kernels?
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
index fce80a4a5147cd..04c04e6d24c358 100644
--- a/drivers/infiniband/core/umem_dmabuf.c
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -18,6 +18,7 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf)
struct scatterlist *sg;
unsigned long start, end, cur = 0;
unsigned int nmap = 0;
+ long ret;
int i;
dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
@@ -67,9 +68,14 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf)
* may be not up-to-date. Wait for the exporter to finish
* the migration.
*/
- return dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv,
+ ret = dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv,
DMA_RESV_USAGE_KERNEL,
false, MAX_SCHEDULE_TIMEOUT);
+ if (ret < 0)
+ return ret;
+ if (ret == 0)
+ return -ETIMEDOUT;
+ return 0;
}
EXPORT_SYMBOL(ib_umem_dmabuf_map_pages);
base-commit: 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
--
2.37.2
kmap() is being deprecated in favor of kmap_local_page().
There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.
With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.
Since its use in amdgpu/amdgpu_ttm.c is safe, it should be preferred.
Therefore, replace kmap() with kmap_local_page() in amdgpu/amdgpu_ttm.c.
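The general shape of the conversion is shown below (a kernel-context sketch, not runnable standalone; the function is illustrative, not from the patch). The key constraints are that the mapping is CPU-local and must not outlive the current context or be handed to another thread.

```c
#include <linux/highmem.h>
#include <linux/uaccess.h>

/* Sketch of the kmap() -> kmap_local_page() conversion pattern:
 * map, use the pointer only in this context, unmap. */
static int copy_page_to_user(char __user *buf, struct page *p,
			     unsigned long off, size_t bytes)
{
	void *ptr;
	unsigned long r;

	ptr = kmap_local_page(p);	/* was: kmap(p) */
	r = copy_to_user(buf, ptr + off, bytes);
	kunmap_local(ptr);		/* was: kunmap(p) */

	return r ? -EFAULT : 0;
}
```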
Suggested-by: Ira Weiny <ira.weiny(a)intel.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco(a)gmail.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 3b4c19412625..c11657b5915f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -2301,9 +2301,9 @@ static ssize_t amdgpu_iomem_read(struct file *f, char __user *buf,
if (p->mapping != adev->mman.bdev.dev_mapping)
return -EPERM;
- ptr = kmap(p);
+ ptr = kmap_local_page(p);
r = copy_to_user(buf, ptr + off, bytes);
- kunmap(p);
+ kunmap_local(ptr);
if (r)
return -EFAULT;
@@ -2352,9 +2352,9 @@ static ssize_t amdgpu_iomem_write(struct file *f, const char __user *buf,
if (p->mapping != adev->mman.bdev.dev_mapping)
return -EPERM;
- ptr = kmap(p);
+ ptr = kmap_local_page(p);
r = copy_from_user(ptr + off, buf, bytes);
- kunmap(p);
+ kunmap_local(ptr);
if (r)
return -EFAULT;
--
2.37.1
Add a new ioctl called TEE_IOC_SHM_REGISTER_FD to register a
shared memory from a dmabuf file descriptor.
Etienne Carriere (1):
tee: new ioctl to register a tee_shm from a dmabuf file descriptor
drivers/tee/tee_core.c | 38 +++++++++++++++
drivers/tee/tee_shm.c | 99 +++++++++++++++++++++++++++++++++++++++-
include/linux/tee_drv.h | 11 +++++
include/uapi/linux/tee.h | 29 ++++++++++++
4 files changed, 175 insertions(+), 2 deletions(-)
--
2.25.0
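From userspace, the proposed ioctl would be used roughly as sketched below. The struct and ioctl names come from this (unmerged) series' proposed UAPI and may change; nothing here exists in mainline linux/tee.h yet.

```c
#include <sys/ioctl.h>
#include <linux/tee.h>

/* Register an existing dma-buf fd as a tee_shm object on an open
 * TEE device fd. Proposed UAPI sketch only. */
int register_dmabuf_shm(int tee_fd, int dmabuf_fd)
{
	struct tee_ioctl_shm_register_fd_data data = {
		.fd = dmabuf_fd,
	};
	int shm_fd;

	shm_fd = ioctl(tee_fd, TEE_IOC_SHM_REGISTER_FD, &data);
	if (shm_fd < 0)
		return -1;
	/* on success, data.id identifies the shared-memory object */
	return shm_fd;
}
```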
Hello,
This series moves all drivers to a dynamic dma-buf locking specification.
From now on all dma-buf importers are made responsible for holding
dma-buf's reservation lock around all operations performed over dma-bufs
in accordance with the locking specification. This allows us to utilize
the reservation lock more broadly around the kernel without fear of
potential deadlocks.
This patchset passes all i915 selftests. It was also tested using VirtIO,
Panfrost, Lima and Tegra drivers. I tested cases of display+GPU,
display+V4L and GPU+V4L dma-buf sharing, which covers the majority of
kernel drivers, since the rest share the same or similar code paths.
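Under the proposed convention, the importer side of an operation such as vmap would look roughly like the kernel-context sketch below (based on the cover-letter description; the exact locked/unlocked helper split is introduced in patch 1 and may differ in detail):

```c
#include <linux/dma-buf.h>
#include <linux/dma-resv.h>

/* Importer-side sketch: the importer takes the dma-buf's reservation
 * lock itself around the operation, instead of relying on an internal
 * dma-buf lock. */
static int importer_vmap(struct dma_buf *dmabuf, struct iosys_map *map)
{
	int ret;

	dma_resv_lock(dmabuf->resv, NULL);
	ret = dma_buf_vmap(dmabuf, map); /* callee may assert resv held */
	dma_resv_unlock(dmabuf->resv);

	return ret;
}
```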
Changelog:
v2: - Changed locking specification to avoid problems with a cross-driver
ww locking, like was suggested by Christian König. Now the attach/detach
callbacks are invoked without the held lock and exporter should take the
lock.
- Added "locking convention" documentation that explains which dma-buf
functions and callbacks are locked/unlocked for importers and exporters,
which was requested by Christian König.
- Added ack from Tomasz Figa to the V4L patches that he gave to v1.
Dmitry Osipenko (5):
dma-buf: Add _unlocked postfix to function names
drm/gem: Take reservation lock for vmap/vunmap operations
dma-buf: Move all dma-bufs to dynamic locking specification
media: videobuf2: Stop using internal dma-buf lock
dma-buf: Remove internal lock
Documentation/driver-api/dma-buf.rst | 6 +
drivers/dma-buf/dma-buf.c | 253 +++++++++++++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 4 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 +-
drivers/gpu/drm/armada/armada_gem.c | 14 +-
drivers/gpu/drm/drm_client.c | 4 +-
drivers/gpu/drm/drm_gem.c | 24 ++
drivers/gpu/drm/drm_gem_cma_helper.c | 6 +-
drivers/gpu/drm/drm_gem_framebuffer_helper.c | 6 +-
drivers/gpu/drm/drm_gem_shmem_helper.c | 6 +-
drivers/gpu/drm/drm_prime.c | 12 +-
drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 6 +-
drivers/gpu/drm/exynos/exynos_drm_gem.c | 2 +-
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 14 +-
.../drm/i915/gem/selftests/i915_gem_dmabuf.c | 20 +-
drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c | 8 +-
drivers/gpu/drm/qxl/qxl_object.c | 17 +-
drivers/gpu/drm/qxl/qxl_prime.c | 4 +-
drivers/gpu/drm/tegra/gem.c | 27 +-
drivers/infiniband/core/umem_dmabuf.c | 11 +-
.../common/videobuf2/videobuf2-dma-contig.c | 26 +-
.../media/common/videobuf2/videobuf2-dma-sg.c | 23 +-
.../common/videobuf2/videobuf2-vmalloc.c | 17 +-
.../platform/nvidia/tegra-vde/dmabuf-cache.c | 12 +-
drivers/misc/fastrpc.c | 12 +-
drivers/xen/gntdev-dmabuf.c | 14 +-
include/drm/drm_gem.h | 3 +
include/linux/dma-buf.h | 71 ++---
28 files changed, 372 insertions(+), 254 deletions(-)
--
2.36.1
Fix i2c transfers using GPI DMA mode for all message types that do not set
the I2C_M_DMA_SAFE flag (e.g. SMBus "read byte").
In this case a bounce buffer is returned by i2c_get_dma_safe_msg_buf(),
and it has to be synced back to the message after the transfer is done.
Add missing assignment of dma buffer in geni_i2c_gpi().
Set xferred in i2c_put_dma_safe_msg_buf() to true in case of no error to
ensure the sync-back of this dma buffer to the message.
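The bounce-buffer pattern the fix restores can be sketched as follows (kernel-context sketch; the surrounding transfer logic is elided and the function is illustrative, not from the driver):

```c
#include <linux/i2c.h>
#include <linux/errno.h>

/* For messages without I2C_M_DMA_SAFE, i2c_get_dma_safe_msg_buf()
 * returns a bounce buffer; i2c_put_dma_safe_msg_buf() must then be
 * called with xferred == true on success so that, for I2C_M_RD
 * messages, the data is copied back into msg->buf. */
static int xfer_with_bounce(struct i2c_msg *msg, bool xfer_ok)
{
	u8 *dma_buf;

	dma_buf = i2c_get_dma_safe_msg_buf(msg, 1);
	if (!dma_buf)
		return -ENOMEM;

	/* ... map dma_buf, run the DMA transfer, unmap ... */

	/* passing false here would silently drop read data */
	i2c_put_dma_safe_msg_buf(dma_buf, msg, xfer_ok);
	return 0;
}
```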
Signed-off-by: Robin Reckmann <robin.reckmann(a)gmail.com>
---
drivers/i2c/busses/i2c-qcom-geni.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
index 6ac402ea58fb..d3541e94794e 100644
--- a/drivers/i2c/busses/i2c-qcom-geni.c
+++ b/drivers/i2c/busses/i2c-qcom-geni.c
@@ -484,12 +484,12 @@ static void geni_i2c_gpi_unmap(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
{
if (tx_buf) {
dma_unmap_single(gi2c->se.dev->parent, tx_addr, msg->len, DMA_TO_DEVICE);
- i2c_put_dma_safe_msg_buf(tx_buf, msg, false);
+ i2c_put_dma_safe_msg_buf(tx_buf, msg, !gi2c->err);
}
if (rx_buf) {
dma_unmap_single(gi2c->se.dev->parent, rx_addr, msg->len, DMA_FROM_DEVICE);
- i2c_put_dma_safe_msg_buf(rx_buf, msg, false);
+ i2c_put_dma_safe_msg_buf(rx_buf, msg, !gi2c->err);
}
}
@@ -553,6 +553,7 @@ static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
desc->callback_param = gi2c;
dmaengine_submit(desc);
+ *buf = dma_buf;
*dma_addr_p = addr;
return 0;
--
2.25.1
This reverts commit 8f61973718485f3e89bc4f408f929048b7b47c83.
It turned out that this is not correct. Especially the sync_file info
IOCTL needs to see even signaled fences to correctly report back their
status to userspace.
Instead add the filter in the merge function again where it makes sense.
Signed-off-by: Christian König <christian.koenig(a)amd.com>
---
drivers/dma-buf/dma-fence-unwrap.c | 3 ++-
include/linux/dma-fence-unwrap.h | 6 +-----
2 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/dma-buf/dma-fence-unwrap.c b/drivers/dma-buf/dma-fence-unwrap.c
index 502a65ea6d44..7002bca792ff 100644
--- a/drivers/dma-buf/dma-fence-unwrap.c
+++ b/drivers/dma-buf/dma-fence-unwrap.c
@@ -72,7 +72,8 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
count = 0;
for (i = 0; i < num_fences; ++i) {
dma_fence_unwrap_for_each(tmp, &iter[i], fences[i])
- ++count;
+ if (!dma_fence_is_signaled(tmp))
+ ++count;
}
if (count == 0)
diff --git a/include/linux/dma-fence-unwrap.h b/include/linux/dma-fence-unwrap.h
index 390de1ee9d35..66b1e56fbb81 100644
--- a/include/linux/dma-fence-unwrap.h
+++ b/include/linux/dma-fence-unwrap.h
@@ -43,14 +43,10 @@ struct dma_fence *dma_fence_unwrap_next(struct dma_fence_unwrap *cursor);
* Unwrap dma_fence_chain and dma_fence_array containers and deep dive into all
* potential fences in them. If @head is just a normal fence only that one is
* returned.
- *
- * Note that signalled fences are opportunistically filtered out, which
- * means the iteration is potentially over no fence at all.
*/
#define dma_fence_unwrap_for_each(fence, cursor, head) \
for (fence = dma_fence_unwrap_first(head, cursor); fence; \
- fence = dma_fence_unwrap_next(cursor)) \
- if (!dma_fence_is_signaled(fence))
+ fence = dma_fence_unwrap_next(cursor))
struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
struct dma_fence **fences,
--
2.25.1
Make it clear that DMA_RESV_USAGE_BOOKKEEP can be used for explicitly synced
user space submissions as well and document the rules around adding the
same fence with different usages.
Signed-off-by: Christian König <christian.koenig(a)amd.com>
---
include/linux/dma-resv.h | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index c8ccbc94d5d2..264e27e56dff 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -62,6 +62,11 @@ struct dma_resv_list;
* For example when asking for WRITE fences then the KERNEL fences are returned
* as well. Similar when asked for READ fences then both WRITE and KERNEL
* fences are returned as well.
+ *
+ * Already used fences can be promoted in the sense that a fence with
+ * DMA_RESV_USAGE_BOOKKEEP could become DMA_RESV_USAGE_READ by adding it again
+ * with this usage. But fences can never be degraded in the sense that a fence
+ * with DMA_RESV_USAGE_WRITE could become DMA_RESV_USAGE_READ.
*/
enum dma_resv_usage {
/**
@@ -98,10 +103,15 @@ enum dma_resv_usage {
* @DMA_RESV_USAGE_BOOKKEEP: No implicit sync.
*
* This should be used by submissions which don't want to participate in
- * implicit synchronization.
+ * any implicit synchronization.
+ *
+ * The most common case are preemption fences, page table updates, TLB
+ * flushes as well as explicit synced user submissions.
*
- * The most common case are preemption fences as well as page table
- * updates and their TLB flushes.
+ * Explicitly synced user submissions can be promoted to
+ * DMA_RESV_USAGE_READ or DMA_RESV_USAGE_WRITE as needed using
+ * dma_buf_import_sync_file() when implicit synchronization should
+ * become necessary after initial adding of the fence.
*/
DMA_RESV_USAGE_BOOKKEEP
};
--
2.25.1
Hi guys,
we are currently working on Freesync and direct scan-out from system
memory on AMD APUs in A+A laptops.
One problem we stumbled over is that our display hardware needs to scan
out from uncached system memory and we currently don't have a way to
communicate that through DMA-buf.
For our specific use case at hand we are going to implement something
driver specific, but the question is should we have something more
generic for this?
After all, the system memory access pattern is a PCIe extension and as
such something generic.
Regards,
Christian.
This patch is an early RFC to discuss the viable options and
alternatives for inclusion of unsigned integer formats for the DRM API.
This patch adds new single-component 16-bit and two-component 32-bit
DRM fourccs that represent unsigned integer formats. The use case for
needing UINT formats, in our case, would be to support using raw buffers
for camera ISPs.
For images imported with DRM fourcc + modifier combination, the GPU
driver needs a way to determine the datatype of the format which
currently the DRM API does not provide explicitly with a notable
exception of the floating-point fourccs such as DRM_FORMAT_XRGB16161616F
as an example. As the DRM fourccs do not currently define the
interpretation of the data, should the information be made explicit in
the DRM API similarly to how it is already done in Vulkan?
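DRM fourccs are packed four-character codes built with the fourcc_code() macro from drm_fourcc.h; a UINT variant would be encoded the same way. The 'R16U' code below is purely illustrative and not the RFC's actual proposed encoding:

```c
#include <stdint.h>

/* fourcc_code() as defined in include/uapi/drm/drm_fourcc.h:
 * four ASCII bytes packed little-endian into a u32. */
#define fourcc_code(a, b, c, d) ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
				 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Existing 16-bit single-channel format */
#define DRM_FORMAT_R16		fourcc_code('R', '1', '6', ' ')

/* Hypothetical UINT variant, for illustration only */
#define DRM_FORMAT_R16_UINT_X	fourcc_code('R', '1', '6', 'U')
```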
The reason for introducing datatypes to the DRM fourccs is the
alternative: any API (e.g., EGL) that lacks format datatype information
for a fourcc/modifier combination in dma_buf interop would have to
introduce additional explicit metadata/attributes encoding this
information, which would then be passed to the GPU driver. The drawback
is that this would require extending multiple graphics APIs to support
every single platform.
Having the DRM API expose datatype information for formats saves a lot
of integration/verification work across the different graphics APIs and
platforms, as this information could be determined from the DRM triple
alone for dma_buf interop.
It would be good to gather some opinions on what others think about
introducing datatypes to the DRM API.
Any feedback and suggestions are highly appreciated.
Dennis Tsiang (1):
[RFC] drm/fourcc: Add new unsigned R16_UINT/RG1616_UINT formats
include/uapi/drm/drm_fourcc.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--
2.36.1