Buffer sharing proof-of-concept

List overview All Threads
Download

newer

older

[RFC] ARM: dma_map|unmap_sg plus...

summit dial-in update

Marek Szyprowski

2 Aug 2011 2 Aug '11

9:48 a.m.

Hello Everyone,

This patchset introduces the proof-of-concept infrastructure for buffer sharing between multiple devices using file descriptors. The infrastructure has been integrated with V4L2 framework, more specifically videobuf2 and two S5P drivers FIMC (capture interface) and TV drivers, but it can be easily used by other kernel subsystems, like DRI.

In this patch the buffer object has been simplified to absolute minimum - it contains only the buffer physical address (only physically contiguous buffers are supported), but this can be easily extended to complete scatter list in the future.

Best regards

-- Marek Szyprowski Samsung Poland R&D Center

Show replies by date

Marek Szyprowski

2 Aug 2 Aug

9:49 a.m.

New subject: [PATCH 1/6] drivers: base: add shared buffer framework

From: Tomasz Stanislawski t.stanislaws@samsung.com

This patch adds the framework for buffer sharing via a file descriptor. A driver that use shared buffer (shrbuf) can export a memory description by transforming it into a file descriptor. The reverse operation (import) is done by obtaining a memory description from a file descriptor. The driver is responsible to get and put callbacks to avoid resource leakage. Current implementation is dedicated for dma-contiguous buffers but scatter-gather lists will be use in the future.

The framework depends on anonfd framework which is used to create files that are not associated with any inode.

Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com Signed-off-by: Kyungmin Park kyungmin.park@samsung.com Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com --- drivers/base/Kconfig | 11 +++++ drivers/base/Makefile | 1 + drivers/base/shared-buffer.c | 96 +++++++++++++++++++++++++++++++++++++++++ include/linux/shared-buffer.h | 55 +++++++++++++++++++++++ 4 files changed, 163 insertions(+), 0 deletions(-) create mode 100644 drivers/base/shared-buffer.c create mode 100644 include/linux/shared-buffer.h

diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig index d57e8d0..d75a038 100644 --- a/drivers/base/Kconfig +++ b/drivers/base/Kconfig @@ -168,4 +168,15 @@ config SYS_HYPERVISOR bool default n

+config SHARED_BUFFER + bool "Framework for buffer sharing between drivers" + depends on ANON_INODES + help + This option enables the framework for buffer sharing between + multiple drivers. A buffer is associated with a file descriptor + using driver API's extensions. The descriptor is passed to other + driver. + + If you are unsure about this, Say N here. + endmenu diff --git a/drivers/base/Makefile b/drivers/base/Makefile index 4c5701c..eeeb813 100644 --- a/drivers/base/Makefile +++ b/drivers/base/Makefile @@ -8,6 +8,7 @@ obj-$(CONFIG_DEVTMPFS) += devtmpfs.o obj-y += power/ obj-$(CONFIG_HAS_DMA) += dma-mapping.o obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o +obj-$(CONFIG_SHARED_BUFFER) += shared-buffer.o obj-$(CONFIG_ISA) += isa.o obj-$(CONFIG_FW_LOADER) += firmware_class.o obj-$(CONFIG_NUMA) += node.o diff --git a/drivers/base/shared-buffer.c b/drivers/base/shared-buffer.c new file mode 100644 index 0000000..105b696 --- /dev/null +++ b/drivers/base/shared-buffer.c @@ -0,0 +1,96 @@ +/* + * Framework for shared buffer + * + * Copyright (C) 2011 Samsung Electronics Co., Ltd. + * Author: Tomasz Stanislawski, t.stanislaws@samsung.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include <linux/anon_inodes.h> +#include <linux/dma-mapping.h> +#include <linux/fs.h> +#include <linux/file.h> +#include <linux/shared-buffer.h> + +/** + * shrbuf_release() - release resources of shrbuf file + * @inode: a file's inode, not used + * @file: file pointer + * + * The function unbinds shrbuf structure from a file + */ +static int shrbuf_release(struct inode *inode, struct file *file) +{ + struct shrbuf *sb = file->private_data; + + /* decrease reference counter increased in shrbuf_export */ + sb->put(sb); + return 0; +} + +static const struct file_operations shrbuf_fops = { + .release = shrbuf_release, +}; + +/** + * shrbuf_export() - transforms shrbuf into a file descriptor + * @sb: shared buffer instance to be exported + * + * The function creates a file descriptor associated with a shared buffer + * + * Returns file descriptor or appropriate error on failure + */ +int shrbuf_export(struct shrbuf *sb) +{ + int fd; + + BUG_ON(!sb || !sb->get || !sb->put); + /* binding shrbuf to a file so reference count in increased */ + sb->get(sb); + /* obtaing file descriptor without inode */ + fd = anon_inode_getfd("shrbuf", &shrbuf_fops, sb, 0); + /* releasing shrbuf on failure */ + if (fd < 0) + sb->put(sb); + return fd; +} + +EXPORT_SYMBOL(shrbuf_export); + +/** + * shrbuf_import() - obtain shrbuf structure from a file descriptor + * @fd: file descriptor + * + * The function obtains an instance of a shared buffer from a file descriptor + * Call sb->put when imported buffer is not longer needed + * + * Returns pointer to a shared buffer or error pointer on failure + */ +struct shrbuf *shrbuf_import(int fd) +{ + struct file *file; + struct shrbuf *sb; + + /* obtain a file, assure that it will not be released */ + file = fget(fd); + /* check if descriptor is incorrect */ + if (!file) + return ERR_PTR(-EBADF); + /* check if dealing with shrbuf-file */ + if (file->f_op != &shrbuf_fops) { + fput(file); + return ERR_PTR(-EINVAL); + } + /* add user of shared buffer */ + sb = file->private_data; + sb->get(sb); + /* release the file */ + fput(file); + + return sb; +} + +EXPORT_SYMBOL(shrbuf_import); + diff --git a/include/linux/shared-buffer.h b/include/linux/shared-buffer.h new file mode 100644 index 0000000..ac0822f --- /dev/null +++ b/include/linux/shared-buffer.h @@ -0,0 +1,55 @@ +/* + * Framework for shared buffer + * + * Copyright (C) 2011 Samsung Electronics Co., Ltd. + * Author: Tomasz Stanislawski, t.stanislaws@samsung.com + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#ifndef _LINUX_SHARED_BUFFER_H +#define _LINUX_SHARED_BUFFER_H + +#include <linux/err.h> + +/** + * struct shrbuf - shared buffer instance + * @get: increase number of a buffer's users + * @put: decrease number of a buffer's user, release resources if needed + * @dma_addr: start address of a contiguous buffer + * @size: size of a contiguous buffer + * + * Both get/put methods are required. The structure is dedicated for + * embedding. The fields dma_addr and size are used for proof-of-concept + * purpose. They will be substituted by scatter-gatter lists. + */ +struct shrbuf { + void (*get)(struct shrbuf *); + void (*put)(struct shrbuf *); + unsigned long dma_addr; + unsigned long size; +}; + +#ifdef CONFIG_SHARED_BUFFER + +int shrbuf_export(struct shrbuf *sb); + +struct shrbuf *shrbuf_import(int fd); + +#else + +static inline int shrbuf_export(struct shrbuf *sb) +{ + return -ENODEV; +} + +static inline struct shrbuf *shrbuf_import(int fd) +{ + return ERR_PTR(-ENODEV); +} + +#endif /* CONFIG_SHARED_BUFFER */ + +#endif /* _LINUX_SHARED_BUFFER_H */

-- 1.7.6

Clark, Rob

6:09 p.m.

New subject: [PATCH 1/6] drivers: base: add shared buffer framework

On Tue, Aug 2, 2011 at 4:49 AM, Marek Szyprowski m.szyprowski@samsung.com wrote:

...

From: Tomasz Stanislawski t.stanislaws@samsung.com

...

+/**

shrbuf_import() - obtain shrbuf structure from a file descriptor

@fd: file descriptor

The function obtains an instance of a shared buffer from a file

descriptor

Call sb->put when imported buffer is not longer needed

Returns pointer to a shared buffer or error pointer on failure

*/

+struct shrbuf *shrbuf_import(int fd) +{

struct file *file;

struct shrbuf *sb;

/* obtain a file, assure that it will not be released */

file = fget(fd);

/* check if descriptor is incorrect */

if (!file)

return ERR_PTR(-EBADF);

/* check if dealing with shrbuf-file */

if (file->f_op != &shrbuf_fops) {

Hmm.. I was liking the idea of letting the buffer allocator provide the fops, so it could deal w/ mmap'ing and that sort of thing. Although this reminds me that we would need a sane way to detect if someone tries to pass in a non-<umm/dmabuf/shrbuf/whatever> fd.

...

fput(file);

return ERR_PTR(-EINVAL);

}

/* add user of shared buffer */

sb = file->private_data;

sb->get(sb);

/* release the file */

fput(file);

return sb;

+}

...

+/**

struct shrbuf - shared buffer instance

@get: increase number of a buffer's users

@put: decrease number of a buffer's user, release resources if needed

@dma_addr: start address of a contiguous buffer

@size: size of a contiguous buffer

Both get/put methods are required. The structure is dedicated for

embedding. The fields dma_addr and size are used for proof-of-concept

purpose. They will be substituted by scatter-gatter lists.

*/

+struct shrbuf {

void (*get)(struct shrbuf *);

void (*put)(struct shrbuf *);

Hmm, is fput()/fget() and fops->release() not enough?

Ie. original buffer allocator provides fops, incl the fops->release(), which may in turn be decrementing an internal ref cnt used by the allocating driver.. so if your allocating driver was the GPU, it's release fxn might be calling drm_gem_object_unreference_unlocked().. and I guess there must be something similar for videobuf2.

(Previous comment about letting the allocating driver implement fops notwithstanding.. but I guess there must be some good way to deal with that.)

BR, -R

Marek Szyprowski

9:50 a.m.

New subject: [PATCH 2/6] v4l: add buffer exporting via shrbuf

From: Tomasz Stanislawski t.stanislaws@samsung.com

This patch adds extension to V4L2 api. It allow to export a mmap buffer as file descriptor. New ioctl VIDIOC_EXPBUF is added. It takes a buffer offset used by mmap and return a file descriptor on success.

Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com Signed-off-by: Kyungmin Park kyungmin.park@samsung.com Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com --- drivers/media/video/v4l2-ioctl.c | 10 ++++++++++ include/linux/videodev2.h | 8 ++++++++ include/media/v4l2-ioctl.h | 1 + 3 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/drivers/media/video/v4l2-ioctl.c b/drivers/media/video/v4l2-ioctl.c index 660b486..0a19fd4 100644 --- a/drivers/media/video/v4l2-ioctl.c +++ b/drivers/media/video/v4l2-ioctl.c @@ -1145,6 +1145,16 @@ static long __video_do_ioctl(struct file *file, dbgbuf(cmd, vfd, p); break; } + case VIDIOC_EXPBUF: + { + unsigned int *p = arg; + + if (!ops->vidioc_expbuf) + break; + + ret = ops->vidioc_expbuf(file, fh, *p); + break; + } case VIDIOC_DQBUF: { struct v4l2_buffer *p = arg; diff --git a/include/linux/videodev2.h b/include/linux/videodev2.h index 7c77c4e..cae7908 100644 --- a/include/linux/videodev2.h +++ b/include/linux/videodev2.h @@ -185,6 +185,7 @@ enum v4l2_memory { V4L2_MEMORY_MMAP = 1, V4L2_MEMORY_USERPTR = 2, V4L2_MEMORY_OVERLAY = 3, + V4L2_MEMORY_SHRBUF = 4, };

/* see also http://vektor.theorem.ca/graphics/ycbcr/ */ @@ -564,6 +565,8 @@ struct v4l2_requestbuffers { * should be passed to mmap() called on the video node) * @userptr: when memory is V4L2_MEMORY_USERPTR, a userspace pointer * pointing to this plane + * @fd: when memory is V4L2_MEMORY_SHRBUF, a userspace file + * descriptor associated with this plane * @data_offset: offset in the plane to the start of data; usually 0, * unless there is a header in front of the data * @@ -578,6 +581,7 @@ struct v4l2_plane { union { __u32 mem_offset; unsigned long userptr; + int fd; } m; __u32 data_offset; __u32 reserved[11]; @@ -600,6 +604,8 @@ struct v4l2_plane { * (or a "cookie" that should be passed to mmap() as offset) * @userptr: for non-multiplanar buffers with memory == V4L2_MEMORY_USERPTR; * a userspace pointer pointing to this buffer + * @fd: for non-multiplanar buffers with memory == V4L2_MEMORY_SHRBUF; + * a userspace file descriptor associated with this buffer * @planes: for multiplanar buffers; userspace pointer to the array of plane * info structs for this buffer * @length: size in bytes of the buffer (NOT its payload) for single-plane @@ -626,6 +632,7 @@ struct v4l2_buffer { __u32 offset; unsigned long userptr; struct v4l2_plane *planes; + int fd; } m; __u32 length; __u32 input; @@ -1868,6 +1875,7 @@ struct v4l2_dbg_chip_ident { #define VIDIOC_S_FBUF _IOW('V', 11, struct v4l2_framebuffer) #define VIDIOC_OVERLAY _IOW('V', 14, int) #define VIDIOC_QBUF _IOWR('V', 15, struct v4l2_buffer) +#define VIDIOC_EXPBUF _IOWR('V', 16, unsigned int) #define VIDIOC_DQBUF _IOWR('V', 17, struct v4l2_buffer) #define VIDIOC_STREAMON _IOW('V', 18, int) #define VIDIOC_STREAMOFF _IOW('V', 19, int) diff --git a/include/media/v4l2-ioctl.h b/include/media/v4l2-ioctl.h index dd9f1e7..20932e3 100644 --- a/include/media/v4l2-ioctl.h +++ b/include/media/v4l2-ioctl.h @@ -121,6 +121,7 @@ struct v4l2_ioctl_ops { int (*vidioc_querybuf)(struct file *file, void *fh, struct v4l2_buffer *b); int (*vidioc_qbuf) (struct file *file, void *fh, struct v4l2_buffer *b); int (*vidioc_dqbuf) (struct file *file, void *fh, struct v4l2_buffer *b); + int (*vidioc_expbuf) (struct file *file, void *fh, unsigned int offset);

int (*vidioc_overlay) (struct file *file, void *fh, unsigned int i);

-- 1.7.6

Marek Szyprowski

9:52 a.m.

New subject: [PATCH 3/6] v4l: vb2: add support for shared buffer (shrbuf)

From: Tomasz Stanislawski t.stanislaws@samsung.com

This patch adds support for SHRBUF memory type in videbuf2. It also provides implementation of VIDIOC_EXPOBUF ioctl within videobuf2.

Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com Signed-off-by: Kyungmin Park kyungmin.park@samsung.com Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com --- drivers/media/video/videobuf2-core.c | 176 ++++++++++++++++++++++++++++++++++ include/media/videobuf2-core.h | 7 ++ 2 files changed, 183 insertions(+), 0 deletions(-)

diff --git a/drivers/media/video/videobuf2-core.c b/drivers/media/video/videobuf2-core.c index 6ba1461..a1363a6 100644 --- a/drivers/media/video/videobuf2-core.c +++ b/drivers/media/video/videobuf2-core.c @@ -107,6 +107,25 @@ static void __vb2_buf_userptr_put(struct vb2_buffer *vb) }

/** + * __vb2_buf_shrbuf_put() - release memory associated with + * a SHRBUF buffer + */ +static void __vb2_buf_shrbuf_put(struct vb2_buffer *vb) +{ + struct vb2_queue *q = vb->vb2_queue; + unsigned int plane; + + for (plane = 0; plane < vb->num_planes; ++plane) { + void *mem_priv = vb->planes[plane].mem_priv; + + if (mem_priv) { + call_memop(q, plane, put_shrbuf, mem_priv); + vb->planes[plane].mem_priv = NULL; + } + } +} + +/** * __setup_offsets() - setup unique offsets ("cookies") for every plane in * every buffer on the queue */ @@ -220,6 +239,8 @@ static void __vb2_free_mem(struct vb2_queue *q) /* Free MMAP buffers or release USERPTR buffers */ if (q->memory == V4L2_MEMORY_MMAP) __vb2_buf_mem_free(vb); + if (q->memory == V4L2_MEMORY_SHRBUF) + __vb2_buf_shrbuf_put(vb); else __vb2_buf_userptr_put(vb); } @@ -314,6 +335,8 @@ static int __fill_v4l2_buffer(struct vb2_buffer *vb, struct v4l2_buffer *b) b->m.offset = vb->v4l2_planes[0].m.mem_offset; else if (q->memory == V4L2_MEMORY_USERPTR) b->m.userptr = vb->v4l2_planes[0].m.userptr; + else if (q->memory == V4L2_MEMORY_SHRBUF) + b->m.fd = vb->v4l2_planes[0].m.fd; }

/* @@ -402,6 +425,19 @@ static int __verify_mmap_ops(struct vb2_queue *q) }

/** + * __verify_shrbuf_ops() - verify that all memory operations required for + * SHRBUF queue type have been provided + */ +static int __verify_shrbuf_ops(struct vb2_queue *q) +{ + if (!(q->io_modes & VB2_SHRBUF) || !q->mem_ops->put_shrbuf || + !q->mem_ops->import_shrbuf || !q->mem_ops->export_shrbuf) + return -EINVAL; + + return 0; +} + +/** * __buffers_in_use() - return true if any buffers on the queue are in use and * the queue cannot be freed (by the means of REQBUFS(0)) call */ @@ -463,6 +499,7 @@ int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req) }

if (req->memory != V4L2_MEMORY_MMAP + && req->memory != V4L2_MEMORY_SHRBUF && req->memory != V4L2_MEMORY_USERPTR) { dprintk(1, "reqbufs: unsupported memory type\n"); return -EINVAL; @@ -492,6 +529,11 @@ int vb2_reqbufs(struct vb2_queue *q, struct v4l2_requestbuffers *req) return -EINVAL; }

+ if (req->memory == V4L2_MEMORY_SHRBUF && __verify_shrbuf_ops(q)) { + dprintk(1, "reqbufs: SHRBUF for current setup unsupported\n"); + return -EINVAL; + } + /* * If the same number of buffers and memory access method is requested * then return immediately. @@ -702,6 +744,10 @@ static int __fill_vb2_buffer(struct vb2_buffer *vb, struct v4l2_buffer *b, b->m.planes[plane].length; } } + if (b->memory == V4L2_MEMORY_SHRBUF) + for (plane = 0; plane < vb->num_planes; ++plane) + v4l2_planes[plane].m.fd = + b->m.planes[plane].m.fd; } else { /* * Single-planar buffers do not use planes array, @@ -716,6 +762,8 @@ static int __fill_vb2_buffer(struct vb2_buffer *vb, struct v4l2_buffer *b, v4l2_planes[0].m.userptr = b->m.userptr; v4l2_planes[0].length = b->length; } + if (b->memory == V4L2_MEMORY_SHRBUF) + v4l2_planes[0].m.fd = b->m.fd; }

vb->v4l2_buf.field = b->field; @@ -813,6 +861,79 @@ static int __qbuf_mmap(struct vb2_buffer *vb, struct v4l2_buffer *b) }

/** + * __qbuf_shrbuf() - handle qbuf of a SHRBUF buffer + */ +static int __qbuf_shrbuf(struct vb2_buffer *vb, struct v4l2_buffer *b) +{ + struct v4l2_plane planes[VIDEO_MAX_PLANES]; + struct vb2_queue *q = vb->vb2_queue; + void *mem_priv; + unsigned int plane; + int ret; + + /* Verify and copy relevant information provided by the userspace */ + ret = __fill_vb2_buffer(vb, b, planes); + if (ret) + return ret; + + for (plane = 0; plane < vb->num_planes; ++plane) { + /* Skip the plane if already verified */ + if (vb->v4l2_planes[plane].m.fd == planes[plane].m.fd) + continue; + + dprintk(3, "qbuf: buffer descriptor for plane %d changed, " + "reacquiring memory\n", plane); + + /* Release previously acquired memory if present */ + if (vb->planes[plane].mem_priv) + call_memop(q, plane, put_shrbuf, + vb->planes[plane].mem_priv); + + vb->planes[plane].mem_priv = NULL; + + /* Acquire each plane's memory */ + if (q->mem_ops->import_shrbuf) { + mem_priv = q->mem_ops->import_shrbuf( + q->alloc_ctx[plane], planes[plane].m.fd); + if (IS_ERR(mem_priv)) { + dprintk(1, "qbuf: failed acquiring userspace " + "memory for plane %d\n", plane); + ret = PTR_ERR(mem_priv); + goto err; + } + vb->planes[plane].mem_priv = mem_priv; + } + } + + /* + * Call driver-specific initialization on the newly acquired buffer, + * if provided. + */ + ret = call_qop(q, buf_init, vb); + if (ret) { + dprintk(1, "qbuf: buffer initialization failed\n"); + goto err; + } + + /* + * Now that everything is in order, copy relevant information + * provided by userspace. + */ + for (plane = 0; plane < vb->num_planes; ++plane) + vb->v4l2_planes[plane] = planes[plane]; + + return 0; +err: + /* In case of errors, release planes that were already acquired */ + for (; plane > 0; --plane) { + call_memop(q, plane, put_shrbuf, + vb->planes[plane - 1].mem_priv); + vb->planes[plane - 1].mem_priv = NULL; + } + + return ret; +} +/** * __enqueue_in_driver() - enqueue a vb2_buffer in driver for processing */ static void __enqueue_in_driver(struct vb2_buffer *vb) @@ -882,6 +1003,8 @@ int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b) ret = __qbuf_mmap(vb, b); else if (q->memory == V4L2_MEMORY_USERPTR) ret = __qbuf_userptr(vb, b); + else if (q->memory == V4L2_MEMORY_SHRBUF) + ret = __qbuf_shrbuf(vb, b); else { WARN(1, "Invalid queue type\n"); return -EINVAL; @@ -1102,6 +1225,59 @@ int vb2_dqbuf(struct vb2_queue *q, struct v4l2_buffer *b, bool nonblocking) } EXPORT_SYMBOL_GPL(vb2_dqbuf);

+static int __find_plane_by_offset(struct vb2_queue *q, unsigned long off, + unsigned int *_buffer, unsigned int *_plane); + +/** + * vb2_expbuf() - Export a buffer as a file descriptor + * @q: videobuf2 queue + * @b: buffer structure passed from userspace to vidioc_expbuf handler + * in driver + * + * The return values from this function are intended to be directly returned + * from vidioc_expbuf handler in driver. + */ +int vb2_expbuf(struct vb2_queue *q, unsigned int offset) +{ + struct vb2_buffer *vb = NULL; + struct vb2_plane *vb_plane; + unsigned int buffer, plane; + int ret; + + if (q->memory != V4L2_MEMORY_MMAP) { + dprintk(1, "Queue is not currently set up for mmap\n"); + return -EINVAL; + } + + if (~q->io_modes & VB2_SHRBUF) { + dprintk(1, "Queue does not support shared buffer\n"); + return -EINVAL; + } + + /* + * Find the plane corresponding to the offset passed by userspace. + */ + ret = __find_plane_by_offset(q, offset, &buffer, &plane); + if (ret) { + dprintk(1, "invalid offset %d\n", offset); + return ret; + } + + vb = q->bufs[buffer]; + vb_plane = &vb->planes[plane]; + + ret = q->mem_ops->export_shrbuf(vb_plane->mem_priv); + if (ret < 0) { + dprintk(1, "Failed to export buffer %d, plane %d\n", + buffer, plane); + return -EINVAL; + } + + dprintk(3, "buffer %d, plane %d exported as %d descriptor\n", + buffer, plane, ret); + return ret; +} + /** * vb2_streamon - start streaming * @q: videobuf2 queue diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h index f87472a..d2610e5 100644 --- a/include/media/videobuf2-core.h +++ b/include/media/videobuf2-core.h @@ -65,6 +65,10 @@ struct vb2_mem_ops { unsigned long size, int write); void (*put_userptr)(void *buf_priv);

+ void *(*import_shrbuf)(void *alloc_ctx, int fd); + int (*export_shrbuf)(void *buf_priv); + void (*put_shrbuf)(void *buf_priv); + void *(*vaddr)(void *buf_priv); void *(*cookie)(void *buf_priv);

@@ -82,6 +86,7 @@ struct vb2_plane { * enum vb2_io_modes - queue access methods * @VB2_MMAP: driver supports MMAP with streaming API * @VB2_USERPTR: driver supports USERPTR with streaming API + * @VB2_SHRBUF: driver supports SHRBUF with streaming API * @VB2_READ: driver supports read() style access * @VB2_WRITE: driver supports write() style access */ @@ -90,6 +95,7 @@ enum vb2_io_modes { VB2_USERPTR = (1 << 1), VB2_READ = (1 << 2), VB2_WRITE = (1 << 3), + VB2_SHRBUF = (1 << 4), };

/** @@ -296,6 +302,7 @@ int vb2_queue_init(struct vb2_queue *q); void vb2_queue_release(struct vb2_queue *q);

int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b); +int vb2_expbuf(struct vb2_queue *q, unsigned int offset); int vb2_dqbuf(struct vb2_queue *q, struct v4l2_buffer *b, bool nonblocking);

int vb2_streamon(struct vb2_queue *q, enum v4l2_buf_type type);

-- 1.7.6

Marek Szyprowski

9:53 a.m.

New subject: [PATCH 4/6] v4l: vb2: integrate dma-contig allocator with shrbuf

From: Tomasz Stanislawski t.stanislaws@samsung.com

Signed-off-by: Tomasz Stanislawski t.stanislaws@samsung.com Signed-off-by: Kyungmin Park kyungmin.park@samsung.com Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com --- drivers/media/video/videobuf2-dma-contig.c | 90 ++++++++++++++++++++++++++++ 1 files changed, 90 insertions(+), 0 deletions(-)

diff --git a/drivers/media/video/videobuf2-dma-contig.c b/drivers/media/video/videobuf2-dma-contig.c index a790a5f..7baac66 100644 --- a/drivers/media/video/videobuf2-dma-contig.c +++ b/drivers/media/video/videobuf2-dma-contig.c @@ -13,6 +13,7 @@ #include <linux/module.h> #include <linux/slab.h> #include <linux/dma-mapping.h> +#include <linux/shared-buffer.h>

#include <media/videobuf2-core.h> #include <media/videobuf2-memops.h> @@ -27,26 +28,57 @@ struct vb2_dc_buf { dma_addr_t paddr; unsigned long size; struct vm_area_struct *vma; + struct shrbuf *sb; atomic_t refcount; struct vb2_vmarea_handler handler; };

+struct vb2_dc_shrbuf { + struct vb2_dc_buf *buf; + struct shrbuf sb; +}; + static void vb2_dma_contig_put(void *buf_priv);

+static void __dc_shrbuf_get(struct shrbuf *__sb) +{ + struct vb2_dc_shrbuf *sb = container_of(__sb, struct vb2_dc_shrbuf, sb); + struct vb2_dc_buf *buf = sb->buf; + + atomic_inc(&buf->refcount); +} + +static void __dc_shrbuf_put(struct shrbuf *__sb) +{ + struct vb2_dc_shrbuf *sb = container_of(__sb, struct vb2_dc_shrbuf, sb); + struct vb2_dc_buf *buf = sb->buf; + + vb2_dma_contig_put(buf); +} + static void *vb2_dma_contig_alloc(void *alloc_ctx, unsigned long size) { struct vb2_dc_conf *conf = alloc_ctx; struct vb2_dc_buf *buf; + struct vb2_dc_shrbuf *sb;

buf = kzalloc(sizeof *buf, GFP_KERNEL); if (!buf) return ERR_PTR(-ENOMEM);

+ sb = kzalloc(sizeof *sb, GFP_KERNEL); + if (!sb) { + kfree(buf); + return ERR_PTR(-ENOMEM); + } + buf->sb = &sb->sb; + buf->vaddr = dma_alloc_coherent(conf->dev, size, &buf->paddr, GFP_KERNEL); if (!buf->vaddr) { dev_err(conf->dev, "dma_alloc_coherent of size %ld failed\n", size); + kfree(sb); kfree(buf); return ERR_PTR(-ENOMEM); } @@ -54,6 +86,12 @@ static void *vb2_dma_contig_alloc(void *alloc_ctx, unsigned long size) buf->conf = conf; buf->size = size;

+ sb->buf = buf; + sb->sb.get = __dc_shrbuf_get; + sb->sb.put = __dc_shrbuf_put; + sb->sb.dma_addr = buf->paddr; + sb->sb.size = buf->size; + buf->handler.refcount = &buf->refcount; buf->handler.put = vb2_dma_contig_put; buf->handler.arg = buf; @@ -70,6 +108,7 @@ static void vb2_dma_contig_put(void *buf_priv) if (atomic_dec_and_test(&buf->refcount)) { dma_free_coherent(buf->conf->dev, buf->size, buf->vaddr, buf->paddr); + kfree(container_of(buf->sb, struct vb2_dc_shrbuf, sb)); kfree(buf); } } @@ -148,6 +187,54 @@ static void vb2_dma_contig_put_userptr(void *mem_priv) kfree(buf); }

+static void *vb2_dma_contig_import_shrbuf(void *alloc_ctx, int fd) +{ + struct vb2_dc_buf *buf; + struct shrbuf *sb; + + buf = kzalloc(sizeof *buf, GFP_KERNEL); + if (!buf) + return ERR_PTR(-ENOMEM); + + sb = shrbuf_import(fd); + if (IS_ERR(sb)) { + printk(KERN_ERR "Failed acquiring shared buffer from fd %d\n", + fd); + kfree(buf); + return sb; + } + + buf->size = sb->size; + buf->paddr = sb->dma_addr; + buf->sb = sb; + + return buf; +} + +static void vb2_dma_contig_put_shrbuf(void *mem_priv) +{ + struct vb2_dc_buf *buf = mem_priv; + + if (!buf) + return; + + buf->sb->put(buf->sb); + kfree(buf); +} + +static int vb2_dma_contig_export_shrbuf(void *mem_priv) +{ + struct vb2_dc_buf *buf = mem_priv; + + if (!buf) + return -EINVAL; + + if (!buf->sb) + return -EINVAL; + + return shrbuf_export(buf->sb); +} + const struct vb2_mem_ops vb2_dma_contig_memops = { .alloc = vb2_dma_contig_alloc, .put = vb2_dma_contig_put, @@ -156,6 +243,9 @@ const struct vb2_mem_ops vb2_dma_contig_memops = { .mmap = vb2_dma_contig_mmap, .get_userptr = vb2_dma_contig_get_userptr, .put_userptr = vb2_dma_contig_put_userptr, + .import_shrbuf = vb2_dma_contig_import_shrbuf, + .export_shrbuf = vb2_dma_contig_export_shrbuf, + .put_shrbuf = vb2_dma_contig_put_shrbuf, .num_users = vb2_dma_contig_num_users, }; EXPORT_SYMBOL_GPL(vb2_dma_contig_memops);

-- 1.7.6

Marek Szyprowski

9:53 a.m.

New subject: [PATCH 5/6] v4l: fimc: integrate capture i-face with shrbuf

From: Tomasz Stanislawski t.stanislaws@samsung.com

diff --git a/drivers/media/video/s5p-fimc/fimc-capture.c b/drivers/media/video/s5p-fimc/fimc-capture.c index a16f359..0904263 100644 --- a/drivers/media/video/s5p-fimc/fimc-capture.c +++ b/drivers/media/video/s5p-fimc/fimc-capture.c @@ -981,6 +981,14 @@ static int fimc_cap_qbuf(struct file *file, void *priv, return vb2_qbuf(&fimc->vid_cap.vbq, buf); }

+static int fimc_cap_expbuf(struct file *file, void *priv, + unsigned int offset) +{ + struct fimc_dev *fimc = video_drvdata(file); + + return vb2_expbuf(&fimc->vid_cap.vbq, offset); +} + static int fimc_cap_dqbuf(struct file *file, void *priv, struct v4l2_buffer *buf) { @@ -1051,6 +1059,7 @@ static const struct v4l2_ioctl_ops fimc_capture_ioctl_ops = {

.vidioc_qbuf = fimc_cap_qbuf, .vidioc_dqbuf = fimc_cap_dqbuf, + .vidioc_expbuf = fimc_cap_expbuf,

.vidioc_streamon = fimc_cap_streamon, .vidioc_streamoff = fimc_cap_streamoff, @@ -1422,7 +1431,7 @@ int fimc_register_capture_device(struct fimc_dev *fimc, q = &fimc->vid_cap.vbq; memset(q, 0, sizeof(*q)); q->type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE; - q->io_modes = VB2_MMAP | VB2_USERPTR; + q->io_modes = VB2_MMAP | VB2_USERPTR | VB2_SHRBUF; q->drv_priv = fimc->vid_cap.ctx; q->ops = &fimc_capture_qops; q->mem_ops = &vb2_dma_contig_memops;

-- 1.7.6

Marek Szyprowski

9:54 a.m.

New subject: [PATCH 6/6] v4l: s5p-tv: mixer: integrate with shrbuf

From: Tomasz Stanislawski t.stanislaws@samsung.com

diff --git a/drivers/media/video/s5p-tv/mixer_video.c b/drivers/media/video/s5p-tv/mixer_video.c index 43ac22f..52cb51a 100644 --- a/drivers/media/video/s5p-tv/mixer_video.c +++ b/drivers/media/video/s5p-tv/mixer_video.c @@ -591,6 +591,14 @@ static int mxr_dqbuf(struct file *file, void *priv, struct v4l2_buffer *p) return vb2_dqbuf(&layer->vb_queue, p, file->f_flags & O_NONBLOCK); }

+static int mxr_expbuf(struct file *file, void *priv, unsigned int offset) +{ + struct mxr_layer *layer = video_drvdata(file); + + mxr_dbg(layer->mdev, "%s:%d\n", __func__, __LINE__); + return vb2_expbuf(&layer->vb_queue, offset); +} + static int mxr_streamon(struct file *file, void *priv, enum v4l2_buf_type i) { struct mxr_layer *layer = video_drvdata(file); @@ -618,6 +626,7 @@ static const struct v4l2_ioctl_ops mxr_ioctl_ops = { .vidioc_querybuf = mxr_querybuf, .vidioc_qbuf = mxr_qbuf, .vidioc_dqbuf = mxr_dqbuf, + .vidioc_expbuf = mxr_expbuf, /* Streaming control */ .vidioc_streamon = mxr_streamon, .vidioc_streamoff = mxr_streamoff, @@ -972,7 +981,7 @@ struct mxr_layer *mxr_base_layer_create(struct mxr_device *mdev,

layer->vb_queue = (struct vb2_queue) { .type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE, - .io_modes = VB2_MMAP | VB2_USERPTR, + .io_modes = VB2_MMAP | VB2_USERPTR | VB2_SHRBUF, .drv_priv = layer, .buf_struct_size = sizeof(struct mxr_buffer), .ops = &mxr_video_qops,

-- 1.7.6

KyongHo Cho

11:59 a.m.

Hi.

On Tue, Aug 2, 2011 at 6:48 PM, Marek Szyprowski m.szyprowski@samsung.com wrote:

...

Hello Everyone,

This patchset introduces the proof-of-concept infrastructure for buffer sharing between multiple devices using file descriptors. The infrastructure has been integrated with V4L2 framework, more specifically videobuf2 and two S5P drivers FIMC (capture interface) and TV drivers, but it can be easily used by other kernel subsystems, like DRI.

In this patch the buffer object has been simplified to absolute minimum - it contains only the buffer physical address (only physically contiguous buffers are supported), but this can be easily extended to complete scatter list in the future.

Is this patch set an attempt to share a buffer between different processes via open file descriptors? Your patches seems to include several constructs to pack information about a buffer in an open file descriptor and to unpack it.

I don't have any idea what is the purpose of your attempts. Is it the first step to the unified memory model that is being discussed in Linaro?

Regards, Cho KyongHo.

Marek Szyprowski

2:48 p.m.

Hello,

On 2011-08-02 13:59, KyongHo Cho wrote:

...

On Tue, Aug 2, 2011 at 6:48 PM, Marek Szyprowski m.szyprowski@samsung.com wrote:

...
Hello Everyone,

This patchset introduces the proof-of-concept infrastructure for buffer sharing between multiple devices using file descriptors. The infrastructure has been integrated with V4L2 framework, more specifically videobuf2 and two S5P drivers FIMC (capture interface) and TV drivers, but it can be easily used by other kernel subsystems, like DRI.

In this patch the buffer object has been simplified to absolute minimum - it contains only the buffer physical address (only physically contiguous buffers are supported), but this can be easily extended to complete scatter list in the future.

Is this patch set an attempt to share a buffer between different processes via open file descriptors? Your patches seems to include several constructs to pack information about a buffer in an open file descriptor and to unpack it.

I don't have any idea what is the purpose of your attempts. Is it the first step to the unified memory model that is being discussed in Linaro?

Yes, these patches were posted to demonstrate how sharing the buffers between different devices (currently only v4l2 based) can be implemented. We are discussing the idea of sharing the buffers on Memory Management summit on Linaro Sprint in Cambourne.

Best regards

-- Marek Szyprowski Samsung Poland R&D Center

Jordan Crouse

3:44 p.m.

On 08/02/2011 03:48 AM, Marek Szyprowski wrote:

...

Hello Everyone,

This patchset introduces the proof-of-concept infrastructure for buffer sharing between multiple devices using file descriptors. The infrastructure has been integrated with V4L2 framework, more specifically videobuf2 and two S5P drivers FIMC (capture interface) and TV drivers, but it can be easily used by other kernel subsystems, like DRI.

In this patch the buffer object has been simplified to absolute minimum - it contains only the buffer physical address (only physically contiguous buffers are supported), but this can be easily extended to complete scatter list in the future.

Best regards

Looks like a good start. I'm not sure what has already been discussed at the meetings, so please forgive me if any of these comments have already been added to the to-do list and/or discounted.

I would definitely consider adding lock and unlock functions. It would be great to have sane fencing built right into the sharing mechanism. Deferred unlock would be nice too, but that is probably hard to do in a generic way.

The owner of the buffer should be able to attach a private information structure to the object and the consumer should be able to get it. This is key for sharing buffer information and out of band data, especially for video buffers (width, height, fourcc, alignment, pitch, start of U buffer, start of V buffer, UV pitch, etc)

Thinking back to anything that could be salvaged from PMEM, about the only thing of value that wouldn't otherwise be implemented here is the idea of revoking a buffer. The thought is that when the master is was done with the buffer, it could revoke it so that the client couldn't hang on to it forever and possibly use it for nefarious purposes. The client still has it mapped, but the range is remapped to garbage. I've never been very clear on how useful this was from a security standpoint because the master has to implicitly share the fd in the first place but it seems to be a feature that has survived several years of pmem hacking.

I look forward to seeing the session notes from the meetings and seeing what the other ideas are. Thanks for your hard work.

Jordan

Tom Cooksey

3 Aug 3 Aug

9:33 a.m.

...

-----Original Message----- From: linaro-mm-sig-bounces@lists.linaro.org [mailto:linaro-mm-sig- bounces@lists.linaro.org] On Behalf Of Jordan Crouse Sent: 02 August 2011 16:45 To: Marek Szyprowski Cc: linaro-mm-sig@lists.linaro.org; Tomasz Stanislawski; Kyungmin Park; linux-media@vger.kernel.org Subject: Re: [Linaro-mm-sig] Buffer sharing proof-of-concept

On 08/02/2011 03:48 AM, Marek Szyprowski wrote:

...
Hello Everyone,

This patchset introduces the proof-of-concept infrastructure for

buffer sharing between multiple devices using file descriptors. The infrastructure has been integrated with V4L2 framework, more specifically videobuf2 and two S5P drivers FIMC (capture interface) and TV drivers, but it can be easily used by other kernel subsystems, like DRI.

...
In this patch the buffer object has been simplified to absolute

minimum - it contains only the buffer physical address (only physically contiguous buffers are supported), but this can be easily extended to complete scatter list in the future.

...
Best regards

Looks like a good start. I'm not sure what has already been discussed at the meetings, so please forgive me if any of these comments have already been added to the to-do list and/or discounted.

I would definitely consider adding lock and unlock functions. It would be great to have sane fencing built right into the sharing mechanism. Deferred unlock would be nice too, but that is probably hard to do in a generic way.

We've discussed synchronization and I think the general consensus is to keep it separate from the buffer sharing mechanism. Initially at least, user-space should be able to implement whatever mechanisms it needs on top of the buffer sharing system.

However, it was mentioned that having to bounce up into userspace and then go back down into kernel space when an event occurs is sub-optimal. To improve things, we discussed adding a kernel-side synchronization object/event mechanism. Extra API could be added to V4L2/KMS/Whatever which tells that driver to signal the sync object when something happens and another bit of API to tell the driver to do something when a sync object is signalled. A simple example might be to tell a camera to write to a buffer and signal a sync object when it has written a complete frame. At the same time, you could tell a KMS overlay plane to switch to the new video frame once the synchronization object is signalled. So userspace sets up what needs to happen, but it all actually occurs asynchronously (from the application's point of view) in kernel space. This obviously needs some more thought and investigation, but from the discussions we had yesterday, I don't think anything would stop this kind of thing being added in the future.

...

The owner of the buffer should be able to attach a private information structure to the object and the consumer should be able to get it. This is key for sharing buffer information and out of band data, especially for video buffers (width, height, fourcc, alignment, pitch, start of U buffer, start of V buffer, UV pitch, etc)

Passing buffer meta-data around was also discussed yesterday. Again, the general consensus seemed to be that this data should be kept out of the kernel. The userspace application should know what the buffer format etc. is and can provide that information to the relevant device APIs when is passes in the buffer.

This ties into another discussion we had yesterday about which device allocates buffers and how format negotiation works. I think this was a little more contentious. It seemed like a slight majority favoured a system where there wasn't a single buffer allocation device. Instead, each device API could allocate buffers and provide a way to get a file descriptor which could be passed to different devices. However, without a centralized buffer allocator, userspace needs to know which device it should use to allocate a buffer which it wants to share with another device. There's two aspects of this. The first is the actual memory allocation parameters - where in physical memory the buffer is allocated from, if it is physically contiguous or not, etc. This is information userspace shouldn't have to know. I seem to recall the discussion concluding that at least for the first iteration, userspace must "know" which device it has to use. I.e. There must be some vendor specific code in userspace which knows if a buffer is to be shared between device B and device D, device D must be used to allocate it. The second aspect is format negotiation. This seemed less contentious. V4L2 already provides API to query what formats a device supports so userspace can figure out which formats, etc. are common. Not sure about KMS or DRM, but it at least seems feasible to be able to add ioctls to query supported formats even if that doesn't exist today. I guess for GPU drivers, the userspace part of the driver will know what formats the GPU it's driving supports, so no need to query.

Cheers,

Tom

...

Thinking back to anything that could be salvaged from PMEM, about the only thing of value that wouldn't otherwise be implemented here is the idea of revoking a buffer. The thought is that when the master is was done with the buffer, it could revoke it so that the client couldn't hang on to it forever and possibly use it for nefarious purposes. The client still has it mapped, but the range is remapped to garbage. I've never been very clear on how useful this was from a security standpoint because the master has to implicitly share the fd in the first place but it seems to be a feature that has survived several years of pmem hacking.

I look forward to seeing the session notes from the meetings and seeing what the other ideas are. Thanks for your hard work.

Jordan

Linaro-mm-sig mailing list Linaro-mm-sig@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-mm-sig

-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

Jordan Crouse

3:12 p.m.

On 08/03/2011 03:33 AM, Tom Cooksey wrote:

...

...
-----Original Message----- From: linaro-mm-sig-bounces@lists.linaro.org [mailto:linaro-mm-sig- bounces@lists.linaro.org] On Behalf Of Jordan Crouse Sent: 02 August 2011 16:45 To: Marek Szyprowski Cc: linaro-mm-sig@lists.linaro.org; Tomasz Stanislawski; Kyungmin Park; linux-media@vger.kernel.org Subject: Re: [Linaro-mm-sig] Buffer sharing proof-of-concept

On 08/02/2011 03:48 AM, Marek Szyprowski wrote:

...
Hello Everyone,

This patchset introduces the proof-of-concept infrastructure for

buffer sharing between multiple devices using file descriptors. The infrastructure has been integrated with V4L2 framework, more specifically videobuf2 and two S5P drivers FIMC (capture interface) and TV drivers, but it can be easily used by other kernel subsystems, like DRI.

...
In this patch the buffer object has been simplified to absolute

minimum - it contains only the buffer physical address (only physically contiguous buffers are supported), but this can be easily extended to complete scatter list in the future.

...
Best regards

Looks like a good start. I'm not sure what has already been discussed at the meetings, so please forgive me if any of these comments have already been added to the to-do list and/or discounted.

I would definitely consider adding lock and unlock functions. It would be great to have sane fencing built right into the sharing mechanism. Deferred unlock would be nice too, but that is probably hard to do in a generic way.

We've discussed synchronization and I think the general consensus is to keep it separate from the buffer sharing mechanism. Initially at least, user-space should be able to implement whatever mechanisms it needs on top of the buffer sharing system.

However, it was mentioned that having to bounce up into userspace and then go back down into kernel space when an event occurs is sub-optimal. To improve things, we discussed adding a kernel-side synchronization object/event mechanism. Extra API could be added to V4L2/KMS/Whatever which tells that driver to signal the sync object when something happens and another bit of API to tell the driver to do something when a sync object is signalled. A simple example might be to tell a camera to write to a buffer and signal a sync object when it has written a complete frame. At the same time, you could tell a KMS overlay plane to switch to the new video frame once the synchronization object is signalled. So userspace sets up what needs to happen, but it all actually occurs asynchronously (from the application's point of view) in kernel space. This obviously needs some more thought and investigation, but from the discussions we had yesterday, I don't think anything would stop this kind of thing being added in the future.

I think thats is exactly what I was getting at. You solved the hard case first. The use case I had in mind was composition where you might have a 2D surface (or video surface) being composited into a 3D scene, and you wanted to serialize access to the surface, but it would be optimal to asynchronously release the lock when the surface is available without bothering user-space. I think what you describe above would handle that perfectly.

...

...
The owner of the buffer should be able to attach a private information structure to the object and the consumer should be able to get it. This is key for sharing buffer information and out of band data, especially for video buffers (width, height, fourcc, alignment, pitch, start of U buffer, start of V buffer, UV pitch, etc)

Passing buffer meta-data around was also discussed yesterday. Again, the general consensus seemed to be that this data should be kept out of the kernel. The userspace application should know what the buffer format etc. is and can provide that information to the relevant device APIs when is passes in the buffer.

True, but APIs change slowly. Some APIs *cough* OpenMAX *cough* are damn near immutable over the life time of a average software release. A blob of data attached to a buffer can evolve far more rapidly and be far more extensible and much more vendor specific. This isn't an new idea, I think the DRM/GEM guys have tossed it around too.

I bring up OpenMAX because this has been a thorn in my side before. Based on quirks in the video decoder and the GPU, there are various alignment requirements for the YUV frames which may differ slightly between different generations of GPUs and decoders. There just isn't enough information in the Port Definition for OMX to give us 100% certainty that we have the right magic combination, so we have to make educated guesses based on gentleman's agreements with the encoder drivers that the blocks will be aligned in certain ways. It works well enough to be functional, but I hate leaving things to agreement and chance. If we could have space to store a blob in the buffer structure it would make all the difference in the world.

...

This ties into another discussion we had yesterday about which device allocates buffers and how format negotiation works. I think this was a little more contentious. It seemed like a slight majority favoured a system where there wasn't a single buffer allocation device. Instead, each device API could allocate buffers and provide a way to get a file descriptor which could be passed to different devices. However, without a centralized buffer allocator, userspace needs to know which device it should use to allocate a buffer which it wants to share with another device. There's two aspects of this. The first is the actual memory allocation parameters - where in physical memory the buffer is allocated from, if it is physically contiguous or not, etc. This is information userspace shouldn't have to know. I seem to recall the discussion concluding that at least for the first iteration, userspace must "know" which device it has to use. I.e. There must be some vendor specific code in userspace which knows if a buffer is to be shared between device B and device D, device D must be used to allocate it.

I agree. It is easy to get bogged down into details. Like say that D has an IOMMU and that B only can access contiguous memory so then D needs to know what B's capabilities are. These are policy decisions that can only be rightly be done in userspace. I think in practice this will be less complex then it sounds since most of our modern SoCs have hardware that is flexible enough to handle the odd device with strange requirements (Samsung encoder) but designing for the worse case is always a good policy.

...

The second aspect is format negotiation. This seemed less contentious. V4L2 already provides API to query what formats a device supports so userspace can figure out which formats, etc. are common. Not sure about KMS or DRM, but it at least seems feasible to be able to add ioctls to query supported formats even if that doesn't exist today. I guess for GPU drivers, the userspace part of the driver will know what formats the GPU it's driving supports, so no need to query.

I agree for the most part. I think it gets a little bit less clear for things like streaming textures and EGL images where the specifications are far less clear about what is supported and the behavior tends to be more vendor specific. Of course, the vendors could help themselves with a private blob on the buffer to fill in the vague details.. :)

Jordan

...

Cheers,

Tom

...
Thinking back to anything that could be salvaged from PMEM, about the only thing of value that wouldn't otherwise be implemented here is the idea of revoking a buffer. The thought is that when the master is was done with the buffer, it could revoke it so that the client couldn't hang on to it forever and possibly use it for nefarious purposes. The client still has it mapped, but the range is remapped to garbage. I've never been very clear on how useful this was from a security standpoint because the master has to implicitly share the fd in the first place but it seems to be a feature that has survived several years of pmem hacking.

I look forward to seeing the session notes from the meetings and seeing what the other ideas are. Thanks for your hard work.

Jordan

Linaro-mm-sig mailing list Linaro-mm-sig@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-mm-sig

-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

Daniel Vetter

4 Aug 4 Aug

8:58 a.m.

On Wed, Aug 3, 2011 at 17:12, Jordan Crouse jcrouse@codeaurora.org wrote:

...

On 08/03/2011 03:33 AM, Tom Cooksey wrote:

...
Passing buffer meta-data around was also discussed yesterday. Again, the general consensus seemed to be that this data should be kept out of the kernel. The userspace application should know what the buffer format etc. is and can provide that information to the relevant device APIs when is passes in the buffer.

True, but APIs change slowly. Some APIs *cough* OpenMAX *cough* are damn near immutable over the life time of a average software release. A blob of data attached to a buffer can evolve far more rapidly and be far more extensible and much more vendor specific. This isn't an new idea, I think the DRM/GEM guys have tossed it around too.

Erh, no. For sharing gem buffers between process (i.e. between direct rendering clients and the compositor, whatever that is) we just hand around the gem id in the kernel. Some more stuff gets passed around in userspace in a generic way (e.g. DRI2 passes the buffer type (depth, stencil, color, ...) and the stride), but that's it.

Everything else is driver specific and mostly not even passed around explicitly and just agreed upon implicitly. E.g. running the wrong XvMC decoder lib for your Xorg Intel driver will result in garbage on the screen. There's a bit more leeway between Mesa and the Xorg driver because they're released independantly, but it's very ad-hoc (i.e. oops, that buffer doesn't fit the requirements of the new code, must be an old Xorg driver, so switch to the compat paths in Mesa).

But my main fear with the "blob attached to the buffer" idea is that sooner or later it'll be part of the kernel/userspace interface of the buffer sharing api ("hey, it's there, why not use it?"). And the timeframe for deprecating the kernel abi is 5-10 years and yes I've tried to dodge that and got shot at. Imo a better approach is to spec (_after_ the kernel buffer sharing works) a low-level userspace api that drivers need to implement (like the EGL Mesa extensions used to make Wayland work on gem drivers). -Daniel

-- Daniel Vetter daniel.vetter@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch

Clark, Rob

11:14 a.m.

On Thu, Aug 4, 2011 at 3:58 AM, Daniel Vetter daniel.vetter@ffwll.ch wrote:

...

On Wed, Aug 3, 2011 at 17:12, Jordan Crouse jcrouse@codeaurora.org wrote:

...
On 08/03/2011 03:33 AM, Tom Cooksey wrote:

...
Passing buffer meta-data around was also discussed yesterday. Again, the general consensus seemed to be that this data should be kept out of the kernel. The userspace application should know what the buffer format etc. is and can provide that information to the relevant device APIs when is passes in the buffer.

True, but APIs change slowly. Some APIs *cough* OpenMAX *cough* are damn near immutable over the life time of a average software release. A blob of data attached to a buffer can evolve far more rapidly and be far more extensible and much more vendor specific. This isn't an new idea, I think the DRM/GEM guys have tossed it around too.

Erh, no. For sharing gem buffers between process (i.e. between direct rendering clients and the compositor, whatever that is) we just hand around the gem id in the kernel. Some more stuff gets passed around in userspace in a generic way (e.g. DRI2 passes the buffer type (depth, stencil, color, ...) and the stride), but that's it.

Everything else is driver specific and mostly not even passed around explicitly and just agreed upon implicitly. E.g. running the wrong XvMC decoder lib for your Xorg Intel driver will result in garbage on the screen. There's a bit more leeway between Mesa and the Xorg driver because they're released independantly, but it's very ad-hoc (i.e. oops, that buffer doesn't fit the requirements of the new code, must be an old Xorg driver, so switch to the compat paths in Mesa).

But my main fear with the "blob attached to the buffer" idea is that sooner or later it'll be part of the kernel/userspace interface of the buffer sharing api ("hey, it's there, why not use it?"). And the timeframe for deprecating the kernel abi is 5-10 years and yes I've tried to dodge that and got shot at.

hmm, there would be a dmabuf->private ptr in struct dmabuf. Normally that should be for private data of the buffer allocator, but I guess it could be (ab)used for under the hood communication between drivers a platform specific way. It does seem a bit hacky, but at least it does not need to be exposed to userspace.

(Or maybe a better option is just 'rm -rf omx' ;-))

BR, -R

...

Imo a better approach is to spec (_after_ the kernel buffer sharing works) a low-level userspace api that drivers need to implement (like the EGL Mesa extensions used to make Wayland work on gem drivers).

-Daniel

Daniel Vetter daniel.vetter@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch

Linaro-mm-sig mailing list Linaro-mm-sig@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-mm-sig

Daniel Vetter

12:34 p.m.

On Thu, Aug 4, 2011 at 13:14, Clark, Rob rob@ti.com wrote:

...

hmm, there would be a dmabuf->private ptr in struct dmabuf. Normally that should be for private data of the buffer allocator, but I guess it could be (ab)used for under the hood communication between drivers a platform specific way. It does seem a bit hacky, but at least it does not need to be exposed to userspace.

An idea that just crossed my mind: I think we should seperate two kinds of meta-data about a shared piece of data (dmabuf): - logical metadata about it's contents, like strides, number of dimensions, pixel format/vbo layout, ... Imo that stuff doesn't belong into the buffer sharing simply because it's an a) awful mess and b) gem doesn't know it. To recap: only userspace knows this stuff and has to make sense of the data in the buffer by either setting up correct gpu command streams or telling kms what format this thing it needs to scan out has. - metadata about the physical layout: tiling layout, memory bank interleaving, page size for the iommu/contiguous buffer. As far as I can tell (i.e. please correct) for embedded systems this just depends on the (in)saneness of to iommu/bus/memory controller sitting between the ic block and it's data. So it would be great if we could completely hide this from drivers (and userspace) an shovel it into the dma subsystem (as private data). Unfortunately at least on Intel tiling needs to be known by the iommu code, the core gem kernel driver code and the userspace drivers. Otoh using tiled buffers for sharing is maybe a bit ambitious for the first cut. So maybe we can just ignore tiling which largely just leaves handling iommus restrictions (or their complete lack) which looks doable.

...

(Or maybe a better option is just 'rm -rf omx' ;-))

Yeah ;-) -Daniel

-- Daniel Vetter daniel.vetter@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch

Clark, Rob

4:19 p.m.

On Thu, Aug 4, 2011 at 7:34 AM, Daniel Vetter daniel.vetter@ffwll.ch wrote:

...

On Thu, Aug 4, 2011 at 13:14, Clark, Rob rob@ti.com wrote:

...
hmm, there would be a dmabuf->private ptr in struct dmabuf. Normally that should be for private data of the buffer allocator, but I guess it could be (ab)used for under the hood communication between drivers a platform specific way. It does seem a bit hacky, but at least it does not need to be exposed to userspace.

An idea that just crossed my mind: I think we should seperate two kinds of meta-data about a shared piece of data (dmabuf):

logical metadata about it's contents, like strides, number of

dimensions, pixel format/vbo layout, ... Imo that stuff doesn't belong into the buffer sharing simply because it's an a) awful mess and b) gem doesn't know it. To recap: only userspace knows this stuff and has to make sense of the data in the buffer by either setting up correct gpu command streams or telling kms what format this thing it needs to scan out has.

for sure, I think we've ruled out putting this sort of stuff in 'struct dmabuf'.. (notwithstanding any data stuffed away in a 'void * priv' on some platform or another)

...

metadata about the physical layout: tiling layout, memory bank

interleaving, page size for the iommu/contiguous buffer. As far as I can tell (i.e. please correct) for embedded systems this just depends on the (in)saneness of to iommu/bus/memory controller sitting between the ic block and it's data. So it would be great if we could completely hide this from drivers (and userspace) an shovel it into the dma subsystem (as private data). Unfortunately at least on Intel tiling needs to be known by the iommu code, the core gem kernel driver code and the userspace drivers. Otoh using tiled buffers for sharing is maybe a bit ambitious for the first cut. So maybe we can just ignore tiling which largely just leaves handling iommus restrictions (or their complete lack) which looks doable.

btw, on intel (or desktop platforms in general), could another device (say a USB webcam) DMA directly to a tiled buffer via the GART... ie. assuming you had some way to pre-fault some pages into the GART before the DMA happened.

I was sort of expecting 'struct dmabuf' to basically just be a scatterlist and some fxn ptrs, nothing about TILING.. not sure if we need an fxn ptr to ask the buffer allocator to generate some pages/addresses that some other DMA engine could write to (so you could do something like pre-faulting the buffer into some sort of GART) and again release the pages/addresses when DMA completes.

BR, -R

...

...
(Or maybe a better option is just 'rm -rf omx' ;-))

Yeah ;-)

-Daniel

Daniel Vetter daniel.vetter@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch

4922

days inactive

4924

days old

linaro-mm-sig@lists.linaro.org

16 comments

participants

tags (0)

participants (6)

Clark, Rob
Daniel Vetter
Jordan Crouse
KyongHo Cho
Marek Szyprowski
Tom Cooksey