Add support for the dma-buf exporter role to the frame buffer API. The
importer role isn't meaningful for frame buffer devices, as the frame
buffer device model doesn't allow using externally allocated memory.
Signed-off-by: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
---
Documentation/fb/api.txt | 36 ++++++++++++++++++++++++++++++++++++
drivers/video/fbmem.c | 36 ++++++++++++++++++++++++++++++++++++
include/linux/fb.h | 12 ++++++++++++
3 files changed, 84 insertions(+), 0 deletions(-)
diff --git a/Documentation/fb/api.txt b/Documentation/fb/api.txt
index d4ff7de..f0b2173 100644
--- a/Documentation/fb/api.txt
+++ b/Documentation/fb/api.txt
@@ -304,3 +304,39 @@ extensions.
Upon successful format configuration, drivers update the fb_fix_screeninfo
type, visual and line_length fields depending on the selected format. The type
and visual fields are set to FB_TYPE_FOURCC and FB_VISUAL_FOURCC respectively.
+
+
+5. DMA buffer sharing
+---------------------
+
+The dma-buf kernel framework allows DMA buffers to be shared across devices
+and applications. Sharing buffers across display devices and video capture or
+video decoding devices allows zero-copy operation when displaying video content
+produced by a hardware device such as a camera or a hardware codec. This is
+crucial to achieving optimal system performance during video display.
+
+While dma-buf supports both exporting internally allocated memory as a dma-buf
+object (known as the exporter role) and importing a dma-buf object to be used
+as device memory (known as the importer role), the frame buffer API only
+supports the exporter role, as the frame buffer device model doesn't support
+using externally-allocated memory.
+
+To export a frame buffer as a dma-buf file descriptor, applications call the
+FBIOGET_DMABUF ioctl. The ioctl takes a pointer to a fb_dmabuf_export
+structure.
+
+struct fb_dmabuf_export {
+ __u32 fd;
+ __u32 flags;
+};
+
+The flags field specifies the flags to be used when creating the dma-buf file
+descriptor. The only supported flag is O_CLOEXEC. If the call is successful,
+the driver will set the fd field to a file descriptor corresponding to the
+dma-buf object.
+
+Applications can then pass the file descriptors to another application or
+another device driver. The dma-buf object is automatically reference-counted;
+applications can and should close the file descriptor as soon as they don't
+need it anymore. The underlying dma-buf object will not be freed before the
+last device that uses the dma-buf object releases it.
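As an illustrative aside (not part of the patch): a minimal userspace sketch of
the ioctl usage documented above. It assumes the patched <linux/fb.h> from this
series and a frame buffer device node at /dev/fb0; the path and error handling
are placeholders.

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/fb.h>   /* patched header providing FBIOGET_DMABUF */

int main(void)
{
        struct fb_dmabuf_export exp = { .fd = 0, .flags = O_CLOEXEC };
        int fb = open("/dev/fb0", O_RDWR);

        if (fb < 0) {
                perror("open");
                return 1;
        }

        /* Ask the driver to export the frame buffer as a dma-buf. */
        if (ioctl(fb, FBIOGET_DMABUF, &exp) < 0) {
                perror("FBIOGET_DMABUF");
                return 1;
        }

        /*
         * exp.fd can now be passed to another process or device driver,
         * and closed locally as soon as it is no longer needed.
         */
        printf("exported dma-buf fd %u\n", exp.fd);
        return 0;
}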
diff --git a/drivers/video/fbmem.c b/drivers/video/fbmem.c
index 0dff12a..400e449 100644
--- a/drivers/video/fbmem.c
+++ b/drivers/video/fbmem.c
@@ -15,6 +15,7 @@
#include <linux/compat.h>
#include <linux/types.h>
+#include <linux/dma-buf.h>
#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/major.h>
@@ -1074,6 +1075,23 @@ fb_blank(struct fb_info *info, int blank)
return ret;
}
+#ifdef CONFIG_DMA_SHARED_BUFFER
+int
+fb_get_dmabuf(struct fb_info *info, int flags)
+{
+ struct dma_buf *dmabuf;
+
+ if (info->fbops->fb_dmabuf_export == NULL)
+ return -ENOTTY;
+
+ dmabuf = info->fbops->fb_dmabuf_export(info);
+ if (IS_ERR(dmabuf))
+ return PTR_ERR(dmabuf);
+
+ return dma_buf_fd(dmabuf, flags);
+}
+#endif
+
static long do_fb_ioctl(struct fb_info *info, unsigned int cmd,
unsigned long arg)
{
@@ -1084,6 +1102,7 @@ static long do_fb_ioctl(struct fb_info *info, unsigned int cmd,
struct fb_cmap cmap_from;
struct fb_cmap_user cmap;
struct fb_event event;
+ struct fb_dmabuf_export dmaexp;
void __user *argp = (void __user *)arg;
long ret = 0;
@@ -1191,6 +1210,23 @@ static long do_fb_ioctl(struct fb_info *info, unsigned int cmd,
console_unlock();
unlock_fb_info(info);
break;
+#ifdef CONFIG_DMA_SHARED_BUFFER
+ case FBIOGET_DMABUF:
+ if (copy_from_user(&dmaexp, argp, sizeof(dmaexp)))
+ return -EFAULT;
+
+ if (!lock_fb_info(info))
+ return -ENODEV;
+ dmaexp.fd = fb_get_dmabuf(info, dmaexp.flags);
+ unlock_fb_info(info);
+
+ if (dmaexp.fd < 0)
+ return dmaexp.fd;
+
+ ret = copy_to_user(argp, &dmaexp, sizeof(dmaexp))
+ ? -EFAULT : 0;
+ break;
+#endif
default:
if (!lock_fb_info(info))
return -ENODEV;
diff --git a/include/linux/fb.h b/include/linux/fb.h
index ac3f1c6..c9fee75 100644
--- a/include/linux/fb.h
+++ b/include/linux/fb.h
@@ -39,6 +39,7 @@
#define FBIOPUT_MODEINFO 0x4617
#define FBIOGET_DISPINFO 0x4618
#define FBIO_WAITFORVSYNC _IOW('F', 0x20, __u32)
+#define FBIOGET_DMABUF _IOR('F', 0x21, struct fb_dmabuf_export)
#define FB_TYPE_PACKED_PIXELS 0 /* Packed Pixels */
#define FB_TYPE_PLANES 1 /* Non interleaved planes */
@@ -403,6 +404,11 @@ struct fb_cursor {
#define FB_BACKLIGHT_MAX 0xFF
#endif
+struct fb_dmabuf_export {
+ __u32 fd;
+ __u32 flags;
+};
+
#ifdef __KERNEL__
#include <linux/fs.h>
@@ -418,6 +424,7 @@ struct vm_area_struct;
struct fb_info;
struct device;
struct file;
+struct dma_buf;
/* Definitions below are used in the parsed monitor specs */
#define FB_DPMS_ACTIVE_OFF 1
@@ -701,6 +708,11 @@ struct fb_ops {
/* called at KDB enter and leave time to prepare the console */
int (*fb_debug_enter)(struct fb_info *info);
int (*fb_debug_leave)(struct fb_info *info);
+
+#ifdef CONFIG_DMA_SHARED_BUFFER
+ /* Export the frame buffer as a dmabuf object */
+ struct dma_buf *(*fb_dmabuf_export)(struct fb_info *info);
+#endif
};
#ifdef CONFIG_FB_TILEBLITTING
--
Regards,
Laurent Pinchart
Hello,
This is a continuation of the dma-mapping extensions posted in the
following thread:
http://thread.gmane.org/gmane.linux.kernel.mm/78644
We noticed that some advanced buffer sharing use cases usually require
creating a DMA mapping for the same memory buffer for more than one
device. Usually such a buffer is also never touched by the CPU; the data
is processed entirely by the devices.
From the DMA-mapping perspective this requires calling one of the
dma_map_{page,single,sg} functions for the given memory buffer several
times, once for each of the devices. Each dma_map_* call performs CPU cache
synchronization, which can be a time-consuming operation, especially when
the buffers are large. We would like to avoid any useless and time-consuming
operations, so that was the main reason for introducing another attribute
for the DMA-mapping subsystem: DMA_ATTR_SKIP_CPU_SYNC, which lets the
dma-mapping core skip CPU cache synchronization in certain cases.
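As an illustrative aside, a minimal kernel-side sketch of the intended usage,
written against the struct dma_attrs API used by this series; dev_a, dev_b,
buf and size are placeholder names, and the buffer is assumed never to be
touched by the CPU.

#include <linux/dma-mapping.h>
#include <linux/dma-attrs.h>

/* Map the same buffer for two devices; only the first mapping performs
 * CPU cache maintenance. */
static int map_for_two_devices(struct device *dev_a, struct device *dev_b,
                               void *buf, size_t size,
                               dma_addr_t *addr_a, dma_addr_t *addr_b)
{
        DEFINE_DMA_ATTRS(attrs);

        /* First mapping: normal behaviour, caches are synchronized. */
        *addr_a = dma_map_single(dev_a, buf, size, DMA_BIDIRECTIONAL);
        if (dma_mapping_error(dev_a, *addr_a))
                return -ENOMEM;

        /* Second mapping: the CPU never touches the data, so the costly
         * cache synchronization can be skipped. */
        dma_set_attr(DMA_ATTR_SKIP_CPU_SYNC, &attrs);
        *addr_b = dma_map_single_attrs(dev_b, buf, size, DMA_BIDIRECTIONAL,
                                       &attrs);
        if (dma_mapping_error(dev_b, *addr_b)) {
                dma_unmap_single(dev_a, *addr_a, size, DMA_BIDIRECTIONAL);
                return -ENOMEM;
        }

        return 0;
}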
The proposed patches have been generated on top of the ARM DMA-mapping
redesign patch series on Linux v3.4-rc7. They are also available on the
following GIT branch:
git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git 3.4-rc7-arm-dma-v10-ext
with all required patches on top of the vanilla v3.4-rc7 kernel. I will
resend them rebased onto v3.5-rc1 soon.
Best regards
Marek Szyprowski
Samsung Poland R&D Center
Patch summary:
Marek Szyprowski (2):
common: DMA-mapping: add DMA_ATTR_SKIP_CPU_SYNC attribute
ARM: dma-mapping: add support for DMA_ATTR_SKIP_CPU_SYNC attribute
Documentation/DMA-attributes.txt | 24 ++++++++++++++++++++++++
arch/arm/mm/dma-mapping.c | 20 +++++++++++---------
include/linux/dma-attrs.h | 1 +
3 files changed, 36 insertions(+), 9 deletions(-)
--
1.7.1.569.g6f426
From: Benjamin Gaignard <benjamin.gaignard(a)linaro.org>
The goal of these patches is to allow ION clients (drivers or userland applications)
to use the Contiguous Memory Allocator (CMA).
To get more info about CMA:
http://lists.linaro.org/pipermail/linaro-mm-sig/2012-February/001328.html
patches version 5:
- port patches onto the Android 3.4 kernel, where ION uses dmabuf
- add ion_cma_heap_map_dma and ion_cma_heap_unmap_dma functions
patches version 4:
- add ION_HEAP_TYPE_DMA heap type in ion_heap_type enum.
- CMA heap is now a "native" ION heap.
- add ion_heap_create_full function to keep backward compatibility.
- clean up included files in CMA heap
- ux500-ion is using ion_heap_create_full instead of ion_heap_create
patches version 3:
- add a private field in the ion_heap structure instead of exposing the
ion_device structure to all heaps
- ion_cma_heap is no longer a platform driver
- ion_cma_heap uses the ion_heap private field to store the device pointer and
make the link with the reserved CMA regions
- provide a ux500-ion driver and a configuration file for the snowball board to
give an example of how to use CMA heaps
patches version 2:
- address review comments from Andy Green
Benjamin Gaignard (4):
fix ion_platform_data definition
add private field in ion_heap structure
add CMA heap
add test/example driver for ux500 platform
arch/arm/mach-ux500/board-mop500.c | 77 ++++++++++++++++
drivers/gpu/ion/Kconfig | 5 ++
drivers/gpu/ion/Makefile | 5 +-
drivers/gpu/ion/ion_cma_heap.c | 175 ++++++++++++++++++++++++++++++++++++
drivers/gpu/ion/ion_heap.c | 18 +++-
drivers/gpu/ion/ion_priv.h | 13 +++
drivers/gpu/ion/ux500/Makefile | 1 +
drivers/gpu/ion/ux500/ux500_ion.c | 142 +++++++++++++++++++++++++++++
include/linux/ion.h | 5 +-
9 files changed, 438 insertions(+), 3 deletions(-)
create mode 100644 drivers/gpu/ion/ion_cma_heap.c
create mode 100644 drivers/gpu/ion/ux500/Makefile
create mode 100644 drivers/gpu/ion/ux500/ux500_ion.c
--
1.7.10
Hi Linus,
I would like to ask you to pull a set of minor fixes for the dma-mapping
code (ARM and x86), required for the Contiguous Memory Allocator (CMA)
patches merged in v3.5-rc1.
The following changes since commit cfaf025112d3856637ff34a767ef785ef5cf2ca9:
Linux 3.5-rc2 (2012-06-08 18:40:09 -0700)
with the top-most commit c080e26edc3a2a3cdfa4c430c663ee1c3bbd8fae
x86: dma-mapping: fix broken allocation when dma_mask has been provided
are available in the git repository at:
git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git fixes-for-linus
Marek Szyprowski (3):
ARM: mm: fix type of the arm_dma_limit global variable
ARM: dma-mapping: fix debug messages in dmabounce code
x86: dma-mapping: fix broken allocation when dma_mask has been provided
Sachin Kamat (1):
ARM: dma-mapping: Add missing static storage class specifier
arch/arm/common/dmabounce.c | 16 ++++++++--------
arch/arm/mm/dma-mapping.c | 4 ++--
arch/arm/mm/init.c | 2 +-
arch/arm/mm/mm.h | 2 +-
arch/x86/kernel/pci-dma.c | 3 ++-
5 files changed, 14 insertions(+), 13 deletions(-)
Thanks!
Best regards
Marek Szyprowski
Samsung Poland R&D Center
Hello,
This is an updated version of the patch series introducing new features
to the DMA-mapping subsystem to let drivers share their allocated buffers
(preferably using the recently introduced dma_buf framework) easily and
efficiently.
The first extension is the DMA_ATTR_NO_KERNEL_MAPPING attribute. It is
intended for use with the dma_{alloc, mmap, free}_attrs functions. It can be
used to notify the dma-mapping core that the driver will not use a kernel
mapping for the allocated buffer at all, so the core can skip creating
one. This saves precious kernel virtual address space. Such a buffer can be
accessed from userspace after calling dma_mmap_attrs() for it (a
typical use case for multimedia buffers). The value returned by
dma_alloc_attrs() with this attribute should be considered a DMA
cookie, which needs to be passed to the dma_mmap_attrs() and
dma_free_attrs() functions.
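A rough sketch of how a driver might use the attribute with the
dma_{alloc,mmap,free}_attrs calls named above; struct my_buffer and the
my_buffer_* helpers are made-up names, not something defined by the series.

#include <linux/dma-attrs.h>
#include <linux/dma-mapping.h>
#include <linux/mm.h>

struct my_buffer {                      /* hypothetical driver-private state */
        void *cookie;                   /* opaque value from dma_alloc_attrs() */
        dma_addr_t dma_handle;
        size_t size;
        struct dma_attrs attrs;
};

static int my_buffer_alloc(struct device *dev, struct my_buffer *buf,
                           size_t size)
{
        init_dma_attrs(&buf->attrs);
        dma_set_attr(DMA_ATTR_NO_KERNEL_MAPPING, &buf->attrs);

        buf->size = size;
        /* No kernel mapping is created; the cookie must not be dereferenced. */
        buf->cookie = dma_alloc_attrs(dev, size, &buf->dma_handle,
                                      GFP_KERNEL, &buf->attrs);
        return buf->cookie ? 0 : -ENOMEM;
}

/* Called from the driver's mmap file operation. */
static int my_buffer_mmap(struct device *dev, struct my_buffer *buf,
                          struct vm_area_struct *vma)
{
        return dma_mmap_attrs(dev, vma, buf->cookie, buf->dma_handle,
                              buf->size, &buf->attrs);
}

static void my_buffer_free(struct device *dev, struct my_buffer *buf)
{
        dma_free_attrs(dev, buf->size, buf->cookie, buf->dma_handle,
                       &buf->attrs);
}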
The second extension is required to let drivers share the buffers
allocated by the DMA-mapping subsystem. Right now the driver gets a DMA
address of the allocated buffer and the kernel virtual mapping for it.
If it wants to share the buffer with another device (i.e. map it into that
device's DMA address space) it usually hacks around kernel virtual
addresses to get pointers to pages, or assumes that both devices share the
DMA address space. Both solutions are just hacks for special cases and
should be avoided in the final version of buffer sharing. To solve this
issue in a generic way, a new DMA-mapping call has been introduced:
dma_get_sgtable(). It allocates a scatter-list which describes the
allocated buffer and lets the driver(s) use it with other device(s) by
calling dma_map_sg() on it.
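A sketch of the generic flow described above, with placeholder device names
and simplified error handling:

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* Describe a buffer that was allocated for 'dev' and map it for 'other'. */
static int share_with_other_device(struct device *dev, struct device *other,
                                   void *cpu_addr, dma_addr_t dma_addr,
                                   size_t size, struct sg_table *sgt)
{
        int ret;

        /* Build a scatter-list describing the already-allocated buffer. */
        ret = dma_get_sgtable(dev, sgt, cpu_addr, dma_addr, size);
        if (ret < 0)
                return ret;

        /* Map the same pages into the other device's DMA address space. */
        if (!dma_map_sg(other, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL)) {
                sg_free_table(sgt);
                return -EIO;
        }

        return 0;
}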
The third extension solves the performance issues which we observed with
some advanced buffer sharing use cases, which require creating a DMA
mapping for the same memory buffer for more than one device. From the
DMA-mapping perspective this requires calling one of the
dma_map_{page,single,sg} functions for the given memory buffer several
times, once for each of the devices. Each dma_map_* call performs CPU cache
synchronization, which can be a time-consuming operation, especially when
the buffers are large. We would like to avoid any useless and time-consuming
operations, so that was the main reason for introducing another attribute
for the DMA-mapping subsystem: DMA_ATTR_SKIP_CPU_SYNC, which lets the
dma-mapping core skip CPU cache synchronization in certain cases.
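Combined with the dma_get_sgtable() sketch above, the mapping step for the
second device could then skip the cache maintenance; again only a sketch
with placeholder names.

#include <linux/dma-attrs.h>
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* Same mapping step as in the previous sketch, but with the CPU cache
 * synchronization skipped for the second device. */
static int map_sgt_nosync(struct device *other, struct sg_table *sgt)
{
        DEFINE_DMA_ATTRS(attrs);

        dma_set_attr(DMA_ATTR_SKIP_CPU_SYNC, &attrs);
        if (!dma_map_sg_attrs(other, sgt->sgl, sgt->nents,
                              DMA_BIDIRECTIONAL, &attrs))
                return -EIO;

        return 0;
}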
The proposed patches have been rebased on the latest Linux kernel
v3.5-rc2 with 'ARM: replace custom consistent dma region with vmalloc'
patches applied (for more information, please refer to the
http://www.spinics.net/lists/arm-kernel/msg179202.html thread).
The patches together with all dependences are also available on the
following GIT branch:
git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git 3.5-rc2-dma-ext-v2
Best regards
Marek Szyprowski
Samsung Poland R&D Center
Changelog:
v2:
- rebased onto v3.5-rc2 and adapted for CMA and dma-mapping changes
- renamed dma_get_sgtable() to dma_get_sgtable_attrs() to match the convention
of the other dma-mapping calls with attributes
- added generic fallback function for dma_get_sgtable() for architectures with
simple dma-mapping implementations
v1: http://thread.gmane.org/gmane.linux.kernel.mm/78644
    http://thread.gmane.org/gmane.linux.kernel.cross-arch/14435 (part 2)
- initial version
Patch summary:
Marek Szyprowski (6):
common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute
ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING
attribute
common: dma-mapping: introduce dma_get_sgtable() function
ARM: dma-mapping: add support for dma_get_sgtable()
common: DMA-mapping: add DMA_ATTR_SKIP_CPU_SYNC attribute
ARM: dma-mapping: add support for DMA_ATTR_SKIP_CPU_SYNC attribute
Documentation/DMA-attributes.txt | 42 ++++++++++++++++++
arch/arm/common/dmabounce.c | 1 +
arch/arm/include/asm/dma-mapping.h | 3 +
arch/arm/mm/dma-mapping.c | 69 ++++++++++++++++++++++++------
drivers/base/dma-mapping.c | 18 ++++++++
include/asm-generic/dma-mapping-common.h | 18 ++++++++
include/linux/dma-attrs.h | 2 +
include/linux/dma-mapping.h | 3 +
8 files changed, 142 insertions(+), 14 deletions(-)
--
1.7.1.569.g6f426
Currently, when freeing order-0 pages, CMA pages are treated
the same as regular movable pages, which means they end up
on the per-cpu page list. This means that the CMA pages are
likely to be allocated for something other than contiguous
memory. This increases the chance that the next alloc_contig_range
will fail because pages can't be migrated.
Given the size of the CMA region is typically limited, it is best to
optimize for success of alloc_contig_range as much as possible.
Do this by freeing CMA pages directly instead of putting them
on the per-cpu page lists.
Signed-off-by: Laura Abbott <lauraa(a)codeaurora.org>
---
mm/page_alloc.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0e1c6f5..c9a6483 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1310,7 +1310,8 @@ void free_hot_cold_page(struct page *page, int cold)
* excessively into the page allocator
*/
if (migratetype >= MIGRATE_PCPTYPES) {
- if (unlikely(migratetype == MIGRATE_ISOLATE)) {
+ if (unlikely(migratetype == MIGRATE_ISOLATE)
+ || is_migrate_cma(migratetype)) {
free_one_page(zone, page, 0, migratetype);
goto out;
}
--
1.7.8.3
Hello everyone,
These patches add support for DMABUF exporting to the V4L2 stack. The latest
support for DMABUF importing was posted in [1]. The exporter part depends on
the DMA mapping redesign [2], which is not merged into the mainline, and is
therefore posted as a separate patchset. Moreover, some patches depend on the
vmap extension for DMABUF by Dave Airlie [3] and the sg_alloc_table_from_pages
function [4].
Changelog:
v0: (RFC)
- updated setup of VIDIOC_EXPBUF ioctl
- doc updates
- introduced a workaround to avoid using dma_get_pages
- removed caching of the exported dmabuf to avoid a circular reference
between the dmabuf and vb2_dc_buf, or a resource leak
- removed all 'change behaviour' patches
- initial support for exporting in the s5p-mfc driver
- removal of vb2_mmap_pfn_range that is no longer used
- use sg_alloc_table_from_pages instead of creating sglist in vb2_dc code
- move attachment allocation to exporter's attach callback
[1] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/48730
[2] http://thread.gmane.org/gmane.linux.kernel.cross-arch/14098
[3] http://permalink.gmane.org/gmane.comp.video.dri.devel/69302
[4] This patchset is rebased on 3.4-rc1 plus the following patchsets:
Marek Szyprowski (1):
v4l: vb2-dma-contig: let mmap method to use dma_mmap_coherent call
Tomasz Stanislawski (11):
v4l: add buffer exporting via dmabuf
v4l: vb2: add buffer exporting via dmabuf
v4l: vb2-dma-contig: add setup of sglist for MMAP buffers
v4l: vb2-dma-contig: add support for DMABUF exporting
v4l: vb2-dma-contig: add vmap/kmap for dmabuf exporting
v4l: s5p-fimc: support for dmabuf exporting
v4l: s5p-tv: mixer: support for dmabuf exporting
v4l: s5p-mfc: support for dmabuf exporting
v4l: vb2: remove vb2_mmap_pfn_range function
v4l: vb2-dma-contig: use sg_alloc_table_from_pages function
v4l: vb2-dma-contig: Move allocation of dbuf attachment to attach cb
drivers/media/video/s5p-fimc/fimc-capture.c | 9 +
drivers/media/video/s5p-mfc/s5p_mfc_dec.c | 13 ++
drivers/media/video/s5p-mfc/s5p_mfc_enc.c | 13 ++
drivers/media/video/s5p-tv/mixer_video.c | 10 +
drivers/media/video/v4l2-compat-ioctl32.c | 1 +
drivers/media/video/v4l2-dev.c | 1 +
drivers/media/video/v4l2-ioctl.c | 6 +
drivers/media/video/videobuf2-core.c | 67 ++++++
drivers/media/video/videobuf2-dma-contig.c | 323 ++++++++++++++++++++++-----
drivers/media/video/videobuf2-memops.c | 40 ----
include/linux/videodev2.h | 26 +++
include/media/v4l2-ioctl.h | 2 +
include/media/videobuf2-core.h | 2 +
include/media/videobuf2-memops.h | 5 -
14 files changed, 411 insertions(+), 107 deletions(-)
--
1.7.9.5
On Thu, Jun 7, 2012 at 4:35 AM, Tom Cooksey <tom.cooksey(a)arm.com> wrote:
> The alternate is to not associate sync objects with buffers and
> have them be distinct entities, exposed to userspace. This gives
> userpsace more power and flexibility and might allow for use-cases
> which an implicit synchronization mechanism can't satisfy - I'd
> be curious to know any specifics here.
Time and time again we've had problems with implicit synchronization
resulting in bugs where different drivers play by slightly different
implicit rules. We're convinced the best way to attack this problem
is to move as much of the command and control of synchronization as
possible into a single piece of code (the compositor in our case.) To
facilitate this we're going to be mandating this explicit approach in
the K release of Android.
> However, every driver which
> needs to participate in the synchronization mechanism will need
> to have its interface with userspace modified to allow the sync
> objects to be passed to the drivers. This seemed like a lot of
> work to me, which is why I prefer the implicit approach. However
> I don't actually know what work is needed and think it should be
> explored. I.e. How much work is it to add explicit sync object
> support to the DRM & v4l2 interfaces?
>
> E.g. I believe DRM/GEM's job dispatch API is "in-order"
> in which case it might be easy to just add "wait for this fence"
> and "signal this fence" ioctls. Seems like vmwgfx already has
> something similar to this already? Could this work over having
> to specify a list of sync objects to wait on and another list
> of sync objects to signal for every operation (exec buf/page
> flip)? What about for v4l2?
If I understand you right a job submission with explicit sync would
become 3 submissions:
1) submit wait for pre-req fence job
2) submit render job
3) submit signal ready fence job
Does DRM provide a way to ensure these 3 jobs are submitted
atomically? I also expect GPU vendors would like to get clever about
GPU-to-GPU fence dependencies. That could probably be handled
entirely in the userspace GL driver.
> I guess my other thought is that implicit vs explicit is not
> mutually exclusive, though I'd guess there'd be interesting
> deadlocks to have to debug if both were in use _at the same
> time_. :-)
I think this is an approach worth investigating. I'd like a way to
either opt out of implicit sync or have a way to check if a dma-buf
has an attached fence and detach it. Actually, that could work really
well. Consider:
* Each dma_buf has a single fence "slot"
* on submission
* the driver will extract the fence from the dma_buf and queue a wait on it.
* the driver will replace that fence with its own completion
fence before the job submission ioctl returns.
* dma_buf will have two userspace ioctls:
* DETACH: will return the fence as an FD to userspace and clear the
fence slot in the dma_buf
* ATTACH: takes a fence FD from userspace and attaches it to the
dma_buf fence slot. Returns an error if the fence slot is non-empty.
In the Android case, we can do a detach after every submission and an
attach right before.
-Erik
Hey Erik,
Op 07-06-12 19:35, Erik Gilling schreef:
> On Thu, Jun 7, 2012 at 1:55 AM, Maarten Lankhorst
> <m.b.lankhorst(a)gmail.com> wrote:
>> I haven't looked at intel and amd, but from a quick glance
>> it seems like they already implement fencing too, so just
>> some way to synch up the fences on shared buffers seems
>> like it could benefit all graphics drivers and the whole
>> userspace synching could be done away with entirely.
> It's important to have some level of userspace API so that GPU
> generated graphics can participate in the graphics pipeline. Think of
> the case where you have a software video codec streaming textures into
> the GPU. It needs to know when the GPU is done with those textures so
> it can reuse the buffer.
>
In the graphics case this problem already has to be handled without
dma-buf, so adding any extra synchronization API for userspace
that is only used when the bo is shared is a waste.
I do agree you need some way to sync userspace though, but I
think adding a new API for userspace is not the way to go.
Cheers,
Maarten
PS: re-added cc's that seem to have fallen off from your mail.