- Linaro-mm-sig - lists.linaro.org

[PATCHv9 00/25] Integration of videobuf2 with DMABUF

by Tomasz Stanislawski

Hello everyone,

This patchset adds support for DMABUF [2] importing and exporting to the V4L2 stack.

v9:
- rebase on 3.6
- change type for fd to __s32
- add support for vb2_ioctl_expbuf
- remove patch 'v4l: vb2: add support for DMA_ATTR_NO_KERNEL_MAPPING', it will be posted as a separate patch
- fix typos and style in Documentation (from Hans Verkuil)
- only vb2-core and vb2-dma-contig select DMA_SHARED_BUFFER in Kconfig
- use data_offset and length while queueing DMABUF
- return the most recently used fd at VIDIOC_DQBUF
- use (buffer-type, index, plane) tuple instead of mem_offset to identify a buffer for exporting (from Hans Verkuil)
- support for DMABUF mmap in vb2-dma-contig (from Laurent Pinchart)
- add testing alignment of vaddr and size while verifying userptr against DMA capabilities (from Laurent Pinchart)
- substitute VM_DONTDUMP with VM_RESERVED
- simplify error handling in vb2_dc_get_dmabuf (from Laurent Pinchart)

v8:
- rebased on 3.6-rc1
- merged importer and exporter patchsets
- fixed missing fields in v4l2_plane32 and v4l2_buffer32 structs
- fixed typos/style in documentation
- significant reduction of warnings from checkpatch.pl
- fixed STREAMOFF issues reported by Dima Zavin [4] by adding __vb2_dqbuf helper to vb2-core
- DC fails if userptr is not correctly aligned
- add support for DMA attributes in DC
- add support for buffers with no kernel mapping
- add reference counting on device from allocator context
- dummy support for mmap
- use dma_get_sgtable, drop vb2_dc_kaddr_to_pages hack and vb2_dc_get_base_sgt helper

v7:
- support for V4L2_MEMORY_DMABUF in v4l2-compat-ioctl32.c
- cosmetic fixes to the documentation
- added importing for vmalloc because vmap support in dmabuf for 3.5 was pull-requested
- support for dmabuf importing for VIVI
- resurrect allocation of dma-contig context
- remove reference of alloc_ctx in dma-contig buffer
- use sg_alloc_table_from_pages
- fix DMA scatterlist calls to use orig_nents instead of nents
- fix memleak in vb2_dc_sgt_foreach_page (use orig_nents instead of nents)

v6:
- fixed missing entry in v4l2_memory_names
- fixed a bug occurring after get_user_pages failure
- fixed a bug caused by using invalid vma for get_user_pages
- prepare/finish no longer call dma_sync for dmabuf buffers

v5:
- removed change of importer/exporter behaviour
- fixes vb2_dc_pages_to_sgt basing on Laurent's hints
- changed pin/unpin words to lock/unlock in Doc

v4:
- rebased on mainline 3.4-rc2
- included missing importing support for s5p-fimc and s5p-tv
- added patch for changing map/unmap for importers
- fixes to Documentation part
- coding style fixes
- pairing {map/unmap}_dmabuf in vb2-core
- fixing variable types and semantic of arguments in videobuf2-dma-contig.c

v3:
- rebased on mainline 3.4-rc1
- split 'code refactor' patch to multiple smaller patches
- squashed fixes to Sumit's patches
- patchset is no longer dependent on 'DMA mapping redesign'
- separated path for handling IO and non-IO mappings
- add documentation for DMABUF importing to V4L
- removed all DMABUF exporter related code
- removed usage of dma_get_pages extension

v2:
- extended VIDIOC_EXPBUF argument from integer memoffset to struct v4l2_exportbuffer
- added patch that breaks DMABUF spec on (un)map_attachment callbacks but allows to work with existing implementation of DMABUF prime in DRM
- all dma-contig code refactoring patches were squashed
- bugfixes

v1: List of changes since [1].
- support for DMA api extension dma_get_pages, the function is used to retrieve pages used to create DMA mapping.
- small fixes/code cleanup to videobuf2
- added prepare and finish callbacks to vb2 allocators, they are used to keep consistency between dma-cpu access to the memory (by Marek Szyprowski)
- support for exporting of DMABUF buffer in V4L2 and Videobuf2, originated from [3].
- support for dma-buf exporting in vb2-dma-contig allocator
- support for DMABUF for s5p-tv and s5p-fimc (capture interface) drivers, originated from [3]
- changed handling for userptr buffers (by Marek Szyprowski, Andrzej Pietrasiewicz)
- let mmap method to use dma_mmap_writecombine call (by Marek Szyprowski)

[1] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/4296…
[2] https://lkml.org/lkml/2011/12/26/29
[3] http://thread.gmane.org/gmane.linux.kernel.cross-arch/12819
[4] http://article.gmane.org/gmane.linux.drivers.video-input-infrastructure/497…

Laurent Pinchart (2):
  v4l: vb2-dma-contig: shorten vb2_dma_contig prefix to vb2_dc
  v4l: vb2-dma-contig: reorder functions

Marek Szyprowski (4):
  v4l: vb2: add prepare/finish callbacks to allocators
  v4l: vb2-dma-contig: add prepare/finish to dma-contig allocator
  v4l: vb2-dma-contig: let mmap method to use dma_mmap_coherent call
  v4l: vb2-dma-contig: fail if user ptr buffer is not correctly aligned

Sumit Semwal (4):
  v4l: Add DMABUF as a memory type
  v4l: vb2: add support for shared buffer (dma_buf)
  v4l: vb: remove warnings about MEMORY_DMABUF
  v4l: vb2-dma-contig: add support for dma_buf importing

Tomasz Stanislawski (15):
  Documentation: media: description of DMABUF importing in V4L2
  v4l: vb2-dma-contig: remove reference of alloc_ctx from a buffer
  v4l: vb2-dma-contig: add support for scatterlist in userptr mode
  v4l: vb2-vmalloc: add support for dmabuf importing
  v4l: vivi: support for dmabuf importing
  v4l: s5p-tv: mixer: support for dmabuf importing
  v4l: s5p-fimc: support for dmabuf importing
  Documentation: media: description of DMABUF exporting in V4L2
  v4l: add buffer exporting via dmabuf
  v4l: vb2: add buffer exporting via dmabuf
  v4l: vb2-dma-contig: add support for DMABUF exporting
  v4l: vb2-dma-contig: add reference counting for a device from allocator context
  v4l: s5p-fimc: support for dmabuf exporting
  v4l: s5p-tv: mixer: support for dmabuf exporting
  v4l: s5p-mfc: support for dmabuf exporting
 Documentation/DocBook/media/v4l/compat.xml         |    7 +
 Documentation/DocBook/media/v4l/io.xml             |  183 +++++-
 Documentation/DocBook/media/v4l/v4l2.xml           |    1 +
 .../DocBook/media/v4l/vidioc-create-bufs.xml       |   16 +-
 Documentation/DocBook/media/v4l/vidioc-expbuf.xml  |  212 ++++++
 Documentation/DocBook/media/v4l/vidioc-qbuf.xml    |   17 +
 Documentation/DocBook/media/v4l/vidioc-reqbufs.xml |   47 +-
 drivers/media/video/Kconfig                        |    2 +
 drivers/media/video/s5p-fimc/fimc-capture.c        |   11 +-
 drivers/media/video/s5p-mfc/s5p_mfc_dec.c          |   14 +
 drivers/media/video/s5p-mfc/s5p_mfc_enc.c          |   14 +
 drivers/media/video/s5p-tv/mixer_video.c           |   12 +-
 drivers/media/video/v4l2-compat-ioctl32.c          |   19 +
 drivers/media/video/v4l2-dev.c                     |    1 +
 drivers/media/video/v4l2-ioctl.c                   |   11 +
 drivers/media/video/videobuf-core.c                |    4 +
 drivers/media/video/videobuf2-core.c               |  300 ++++++++-
 drivers/media/video/videobuf2-dma-contig.c         |  695 ++++++++++++++++++--
 drivers/media/video/videobuf2-memops.c             |   40 --
 drivers/media/video/videobuf2-vmalloc.c            |   56 ++
 drivers/media/video/vivi.c                         |    2 +-
 include/linux/videodev2.h                          |   35 +
 include/media/v4l2-ioctl.h                         |    2 +
 include/media/videobuf2-core.h                     |   38 ++
 include/media/videobuf2-memops.h                   |    5 -
 25 files changed, 1608 insertions(+), 136 deletions(-)
 create mode 100644 Documentation/DocBook/media/v4l/vidioc-expbuf.xml

--
1.7.9.5
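For readers new to the interface, here is a minimal userspace sketch of the flow this series enables: one device exports a buffer plane as a DMABUF file descriptor with VIDIOC_EXPBUF, identified by the (buffer-type, index, plane) tuple, and a second device queues that descriptor with V4L2_MEMORY_DMABUF. It assumes the UAPI as it appears in current <linux/videodev2.h>, which may differ in detail from this v9 posting. The helper names are made up for illustration, and single-planar queues are an assumption.

```c
#define _GNU_SOURCE		/* for O_CLOEXEC under strict -std= modes */
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

/* Hypothetical helper: describe the plane to export using the
 * (buffer-type, index, plane) tuple. */
static struct v4l2_exportbuffer make_expbuf_request(unsigned int type,
						    unsigned int index,
						    unsigned int plane)
{
	struct v4l2_exportbuffer req;

	memset(&req, 0, sizeof(req));
	req.type = type;
	req.index = index;
	req.plane = plane;
	req.flags = O_CLOEXEC;	/* do not leak the fd across exec() */
	req.fd = -1;		/* overwritten by the driver on success */
	return req;
}

/* Hypothetical helper: export one plane of a buffer that was allocated
 * with V4L2_MEMORY_MMAP. Returns the DMABUF fd, or -1 on error. */
static int export_plane(int video_fd, struct v4l2_exportbuffer *req)
{
	if (ioctl(video_fd, VIDIOC_EXPBUF, req) < 0)
		return -1;
	return req->fd;
}

/* Hypothetical helper: queue an imported DMABUF fd on a single-planar
 * queue that was set up via VIDIOC_REQBUFS with V4L2_MEMORY_DMABUF. */
static int queue_dmabuf(int video_fd, unsigned int type,
			unsigned int index, int dmabuf_fd)
{
	struct v4l2_buffer buf;

	memset(&buf, 0, sizeof(buf));
	buf.type = type;
	buf.memory = V4L2_MEMORY_DMABUF;
	buf.index = index;
	buf.m.fd = dmabuf_fd;
	return ioctl(video_fd, VIDIOC_QBUF, &buf);
}
```

A capture application would typically call export_plane() once per buffer after VIDIOC_REQBUFS and hand the resulting descriptors to the importing device; error handling beyond the -1 return is left out of this sketch.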

13 years, 2 months


[PATCH 1/5] dma-buf: remove fallback for !CONFIG_DMA_SHARED_BUFFER

by Maarten Lankhorst

Documentation says that code requiring dma-buf should add it to select, so inline fallbacks are not going to be used. A link error will make it obvious what went wrong, instead of silently doing nothing at runtime.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
---
 include/linux/dma-buf.h | 99 -----------------------------------------------
 1 file changed, 99 deletions(-)

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index eb48f38..bd2e52c 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -156,7 +156,6 @@ static inline void get_dma_buf(struct dma_buf *dmabuf)
 	get_file(dmabuf->file);
 }
 
-#ifdef CONFIG_DMA_SHARED_BUFFER
 struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
 							struct device *dev);
 void dma_buf_detach(struct dma_buf *dmabuf,
@@ -184,103 +183,5 @@ int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *,
 		 unsigned long);
 void *dma_buf_vmap(struct dma_buf *);
 void dma_buf_vunmap(struct dma_buf *, void *vaddr);
-#else
-
-static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
-							struct device *dev)
-{
-	return ERR_PTR(-ENODEV);
-}
-
-static inline void dma_buf_detach(struct dma_buf *dmabuf,
-				  struct dma_buf_attachment *dmabuf_attach)
-{
-	return;
-}
-
-static inline struct dma_buf *dma_buf_export(void *priv,
-					     const struct dma_buf_ops *ops,
-					     size_t size, int flags)
-{
-	return ERR_PTR(-ENODEV);
-}
-
-static inline int dma_buf_fd(struct dma_buf *dmabuf, int flags)
-{
-	return -ENODEV;
-}
-
-static inline struct dma_buf *dma_buf_get(int fd)
-{
-	return ERR_PTR(-ENODEV);
-}
-
-static inline void dma_buf_put(struct dma_buf *dmabuf)
-{
-	return;
-}
-
-static inline struct sg_table *dma_buf_map_attachment(
-	struct dma_buf_attachment *attach, enum dma_data_direction write)
-{
-	return ERR_PTR(-ENODEV);
-}
-
-static inline void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
-			struct sg_table *sg, enum dma_data_direction dir)
-{
-	return;
-}
-
-static inline int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
-					   size_t start, size_t len,
-					   enum dma_data_direction dir)
-{
-	return -ENODEV;
-}
-
-static inline void dma_buf_end_cpu_access(struct dma_buf *dmabuf,
-					  size_t start, size_t len,
-					  enum dma_data_direction dir)
-{
-}
-
-static inline void *dma_buf_kmap_atomic(struct dma_buf *dmabuf,
-					unsigned long pnum)
-{
-	return NULL;
-}
-
-static inline void dma_buf_kunmap_atomic(struct dma_buf *dmabuf,
-					 unsigned long pnum, void *vaddr)
-{
-}
-
-static inline void *dma_buf_kmap(struct dma_buf *dmabuf, unsigned long pnum)
-{
-	return NULL;
-}
-
-static inline void dma_buf_kunmap(struct dma_buf *dmabuf,
-				  unsigned long pnum, void *vaddr)
-{
-}
-
-static inline int dma_buf_mmap(struct dma_buf *dmabuf,
-			       struct vm_area_struct *vma,
-			       unsigned long pgoff)
-{
-	return -ENODEV;
-}
-
-static inline void *dma_buf_vmap(struct dma_buf *dmabuf)
-{
-	return NULL;
-}
-
-static inline void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr)
-{
-}
-#endif /* CONFIG_DMA_SHARED_BUFFER */
 
 #endif /* __DMA_BUF_H__ */

13 years, 2 months


Re: [Linaro-mm-sig] [RFC] New dma_buf -> EGLImage EGL extension

by Maarten Lankhorst

Hey,

Bit late reply, hopefully not too late.

On 30-08-12 16:00, Tom Cooksey wrote:
> Hi All,
>
> Over the last few months I've been working on & off with a few people from
> Linaro on a new EGL extension. The extension allows constructing an EGLImage
> from a (set of) dma_buf file descriptors, including support for multi-plane
> YUV. I envisage the primary use-case of this extension to be importing video
> frames from v4l2 into the EGL/GLES graphics driver to texture from.
> Originally the intent was to develop this as a Khronos-ratified extension.
> However, this is a little too platform-specific to be an officially
> sanctioned Khronos extension. It also goes against the general "EGLStream"
> direction the EGL working group is going in. As such, the general feeling
> was to make this an EXT "multi-vendor" extension with no official stamp of
> approval from Khronos. As this is no-longer intended to be a Khronos
> extension, I've re-written it to be a lot more Linux & dma_buf specific. It
> also allows me to circulate the extension more widely (I.e. To those outside
> Khronos membership).
>
> ARM are implementing this extension for at least our Mali-T6xx driver and
> likely earlier drivers too. I am sending this e-mail to solicit feedback,
> both from other vendors who might implement this extension (Mesa3D?) and
> from potential users of the extension. However, any feedback is welcome.
> Please find the extension text as it currently stands below. There are several
> open issues which I've proposed solutions for, but I'm not really happy with
> those proposals and hoped others could chip-in with better ideas. There are
> likely other issues I've not thought about which also need to be added and
> addressed.
>
> Once there's a general consensus or if no-one's interested, I'll update the
> spec, move it out of Draft status and get it added to the Khronos registry,
> which includes assigning values for the new symbols.
>
>
> Cheers,
>
> Tom
>
>
> ---------8<---------
>
>
> Name
>
>     EXT_image_dma_buf_import
>
> Name Strings
>
>     EGL_EXT_image_dma_buf_import
>
> Contributors
>
>     Jesse Barker
>     Rob Clark
>     Tom Cooksey
>
> Contacts
>
>     Jesse Barker (jesse 'dot' barker 'at' linaro 'dot' org)
>     Tom Cooksey (tom 'dot' cooksey 'at' arm 'dot' com)
>
> Status
>
>     DRAFT
>
> Version
>
>     Version 3, August 16, 2012
>
> Number
>
>     EGL Extension ???
>
> Dependencies
>
>     EGL 1.2 is required.
>
>     EGL_KHR_image_base is required.
>
>     The EGL implementation must be running on a Linux kernel supporting the
>     dma_buf buffer sharing mechanism.
>
>     This extension is written against the wording of the EGL 1.2
>     Specification.
>
> Overview
>
>     This extension allows creating an EGLImage from a Linux dma_buf file
>     descriptor or multiple file descriptors in the case of multi-plane YUV
>     images.
>
> New Types
>
>     None
>
> New Procedures and Functions
>
>     None
>
> New Tokens
>
>     Accepted by the <target> parameter of eglCreateImageKHR:
>
>         EGL_LINUX_DMA_BUF_EXT
>
>     Accepted as an attribute in the <attrib_list> parameter of
>     eglCreateImageKHR:
>
>         EGL_LINUX_DRM_FOURCC_EXT
>         EGL_DMA_BUF_PLANE0_FD_EXT
>         EGL_DMA_BUF_PLANE0_OFFSET_EXT
>         EGL_DMA_BUF_PLANE0_PITCH_EXT
>         EGL_DMA_BUF_PLANE1_FD_EXT
>         EGL_DMA_BUF_PLANE1_OFFSET_EXT
>         EGL_DMA_BUF_PLANE1_PITCH_EXT
>         EGL_DMA_BUF_PLANE2_FD_EXT
>         EGL_DMA_BUF_PLANE2_OFFSET_EXT
>         EGL_DMA_BUF_PLANE2_PITCH_EXT

You might want to add PLANE3 just in case someone wants to import an AYUV image.

> Additions to Chapter 2 of the EGL 1.2 Specification (EGL Operation)
>
>     Add to section 2.5.1 "EGLImage Specification" (as defined by the
>     EGL_KHR_image_base specification), in the description of
>     eglCreateImageKHR:
>
>     "Values accepted for <target> are listed in Table aaa, below.
>
>     +-------------------------+--------------------------------------------+
>     | <target>                | Notes                                      |
>     +-------------------------+--------------------------------------------+
>     | EGL_LINUX_DMA_BUF_EXT   | Used for EGLImages imported from Linux     |
>     |                         | dma_buf file descriptors                   |
>     +-------------------------+--------------------------------------------+
>     Table aaa. Legal values for eglCreateImageKHR <target> parameter
>
>     ...
>
>     If <target> is EGL_LINUX_DMA_BUF_EXT, <dpy> must be a valid display,
>     <ctx> must be EGL_NO_CONTEXT, and <buffer> must be NULL, cast into the
>     type EGLClientBuffer. The details of the image is specified by the
>     attributes passed into eglCreateImageKHR. Required attributes and their
>     values are as follows:
>
>     * EGL_WIDTH & EGL_HEIGHT: The logical dimensions of the buffer in
>       pixels
>
>     * EGL_LINUX_DRM_FOURCC_EXT: The pixel format of the buffer, as
>       specified by drm_fourcc.h and used as the pixel_format parameter of
>       the drm_mode_fb_cmd2 ioctl.
>
>     * EGL_DMA_BUF_PLANE0_FD_EXT: The dma_buf file descriptor of plane 0 of
>       the image.
>
>     * EGL_DMA_BUF_PLANE0_OFFSET_EXT: The offset from the start of the
>       dma_buf of the first sample in plane 0, in bytes.
>
>     * EGL_DMA_BUF_PLANE0_PITCH_EXT: The number of bytes between the start
>       of subsequent rows of samples in plane 0. May have special meaning
>       for non-linear formats.
>
>     For images in an RGB color-space or those using a single-plane YUV
>     format, only the first plane's file descriptor, offset & pitch should
>     be specified. For semi-planar YUV formats, the chroma samples are
>     stored in plane 1 and for fully planar formats, U-samples are stored in
>     plane 1 and V-samples are stored in plane 2. Planes 1 & 2 are specified
>     by the following attributes, which have the same meanings as defined
>     above for plane 0:
>

Nitpick, Y'CbCr not YUV.

How do you want to deal with the case where Y' and CbCr are different hardware buffers? Could some support for 2d arrays be added in case Y' and CbCr are separated into top/bottom fields?

How are semi-planar/planar formats handled that have a different width/height for Y' and CbCr? (YUV420)

~Maarten

13 years, 2 months


[RFC] New dma_buf -> EGLImage EGL extension - New draft!

by Tom Cooksey

Hi All,

After receiving a fair bit of feedback (thanks!), I've updated the EGL_EXT_image_dma_buf_import spec and expanded it to resolve a number of the issues. Please find the latest draft below and let me know any additional feedback you might have, either on the lists or by private e-mail - I don't mind which.

I think the only remaining issue now is if we need a mechanism whereby an application can query which drm_fourcc.h formats EGL supports or if just failing with EGL_BAD_MATCH when the application has used one EGL doesn't support is sufficient. Any thoughts?

Cheers,

Tom

--------------------8<--------------------

Name

    EXT_image_dma_buf_import

Name Strings

    EGL_EXT_image_dma_buf_import

Contributors

    Jesse Barker
    Rob Clark
    Tom Cooksey

Contacts

    Jesse Barker (jesse 'dot' barker 'at' linaro 'dot' org)
    Tom Cooksey (tom 'dot' cooksey 'at' arm 'dot' com)

Status

    DRAFT

Version

    Version 4, October 04, 2012

Number

    EGL Extension ???

Dependencies

    EGL 1.2 is required.

    EGL_KHR_image_base is required.

    The EGL implementation must be running on a Linux kernel supporting the
    dma_buf buffer sharing mechanism.

    This extension is written against the wording of the EGL 1.2
    Specification.

Overview

    This extension allows creating an EGLImage from a Linux dma_buf file
    descriptor or multiple file descriptors in the case of multi-plane YUV
    images.
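As a rough illustration of what the overview means in practice, the sketch below builds the kind of attribute list an application would hand to eglCreateImageKHR for a two-plane (semi-planar) image, using the attributes defined later in this draft. It is illustrative only: the token values are placeholders, because the draft has not yet been assigned registry values, and `build_nv12_attribs`, `struct plane_desc` and the `SK_*` names are made up. The `FOURCC` macro follows the little-endian packing used by drm_fourcc.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder tokens, NOT real EGL values. Real code would take every
 * token from the EGL headers once the extension has assigned numbers. */
enum sketch_token {
	SK_NONE = 0,		/* stands in for EGL_NONE */
	SK_WIDTH,		/* EGL_WIDTH */
	SK_HEIGHT,		/* EGL_HEIGHT */
	SK_FOURCC,		/* EGL_LINUX_DRM_FOURCC_EXT */
	SK_PLANE0_FD,		/* EGL_DMA_BUF_PLANE0_FD_EXT */
	SK_PLANE0_OFFSET,	/* EGL_DMA_BUF_PLANE0_OFFSET_EXT */
	SK_PLANE0_PITCH,	/* EGL_DMA_BUF_PLANE0_PITCH_EXT */
	SK_PLANE1_FD,		/* EGL_DMA_BUF_PLANE1_FD_EXT */
	SK_PLANE1_OFFSET,	/* EGL_DMA_BUF_PLANE1_OFFSET_EXT */
	SK_PLANE1_PITCH		/* EGL_DMA_BUF_PLANE1_PITCH_EXT */
};

/* Four-character code packed the same way as in drm_fourcc.h. */
#define FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical description of one plane inside a dma_buf. */
struct plane_desc {
	int fd;		/* dma_buf file descriptor */
	int32_t offset;	/* bytes from start of the dma_buf */
	int32_t pitch;	/* bytes between the starts of consecutive rows */
};

#define NV12_ATTRIB_COUNT 19	/* 9 name/value pairs + terminator */

/* Fill `out` with name/value pairs followed by the terminator and
 * return the number of entries written. */
static size_t build_nv12_attribs(int32_t out[NV12_ATTRIB_COUNT],
				 int32_t width, int32_t height,
				 int32_t fourcc,
				 const struct plane_desc *luma,
				 const struct plane_desc *chroma)
{
	size_t n = 0;

	out[n++] = SK_WIDTH;		out[n++] = width;
	out[n++] = SK_HEIGHT;		out[n++] = height;
	out[n++] = SK_FOURCC;		out[n++] = fourcc;
	out[n++] = SK_PLANE0_FD;	out[n++] = luma->fd;
	out[n++] = SK_PLANE0_OFFSET;	out[n++] = luma->offset;
	out[n++] = SK_PLANE0_PITCH;	out[n++] = luma->pitch;
	out[n++] = SK_PLANE1_FD;	out[n++] = chroma->fd;
	out[n++] = SK_PLANE1_OFFSET;	out[n++] = chroma->offset;
	out[n++] = SK_PLANE1_PITCH;	out[n++] = chroma->pitch;
	out[n++] = SK_NONE;
	return n;
}
```

In this sketch both planes may refer to the same dma_buf, in which case the chroma plane is distinguished only by its offset; whether an implementation accepts that layout is not something the draft text settles.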
New Types

    None

New Procedures and Functions

    None

New Tokens

    Accepted by the <target> parameter of eglCreateImageKHR:

        EGL_LINUX_DMA_BUF_EXT

    Accepted as an attribute in the <attrib_list> parameter of
    eglCreateImageKHR:

        EGL_LINUX_DRM_FOURCC_EXT
        EGL_DMA_BUF_PLANE0_FD_EXT
        EGL_DMA_BUF_PLANE0_OFFSET_EXT
        EGL_DMA_BUF_PLANE0_PITCH_EXT
        EGL_DMA_BUF_PLANE1_FD_EXT
        EGL_DMA_BUF_PLANE1_OFFSET_EXT
        EGL_DMA_BUF_PLANE1_PITCH_EXT
        EGL_DMA_BUF_PLANE2_FD_EXT
        EGL_DMA_BUF_PLANE2_OFFSET_EXT
        EGL_DMA_BUF_PLANE2_PITCH_EXT
        EGL_YUV_COLOR_SPACE_HINT_EXT
        EGL_SAMPLE_RANGE_HINT_EXT
        EGL_YUV_CHROMA_HORIZONTAL_SITING_HINT_EXT
        EGL_YUV_CHROMA_VERTICAL_SITING_HINT_EXT

    Accepted as the value for the EGL_YUV_COLOR_SPACE_HINT_EXT attribute:

        EGL_ITU_REC601_EXT
        EGL_ITU_REC709_EXT
        EGL_ITU_REC2020_EXT

    Accepted as the value for the EGL_SAMPLE_RANGE_HINT_EXT attribute:

        EGL_YUV_FULL_RANGE_EXT
        EGL_YUV_NARROW_RANGE_EXT

    Accepted as the value for the EGL_YUV_CHROMA_HORIZONTAL_SITING_HINT_EXT &
    EGL_YUV_CHROMA_VERTICAL_SITING_HINT_EXT attributes:

        EGL_YUV_CHROMA_SITING_0_EXT
        EGL_YUV_CHROMA_SITING_0_5_EXT

Additions to Chapter 2 of the EGL 1.2 Specification (EGL Operation)

    Add to section 2.5.1 "EGLImage Specification" (as defined by the
    EGL_KHR_image_base specification), in the description of
    eglCreateImageKHR:

    "Values accepted for <target> are listed in Table aaa, below.

    +-------------------------+--------------------------------------------+
    | <target>                | Notes                                      |
    +-------------------------+--------------------------------------------+
    | EGL_LINUX_DMA_BUF_EXT   | Used for EGLImages imported from Linux     |
    |                         | dma_buf file descriptors                   |
    +-------------------------+--------------------------------------------+
    Table aaa. Legal values for eglCreateImageKHR <target> parameter

    ...

    If <target> is EGL_LINUX_DMA_BUF_EXT, <dpy> must be a valid display,
    <ctx> must be EGL_NO_CONTEXT, and <buffer> must be NULL, cast into the
    type EGLClientBuffer. The details of the image are specified by the
    attributes passed into eglCreateImageKHR.
    Required attributes and their values are as follows:

    * EGL_WIDTH & EGL_HEIGHT: The logical dimensions of the buffer in pixels

    * EGL_LINUX_DRM_FOURCC_EXT: The pixel format of the buffer, as specified
      by drm_fourcc.h and used as the pixel_format parameter of the
      drm_mode_fb_cmd2 ioctl.

    * EGL_DMA_BUF_PLANE0_FD_EXT: The dma_buf file descriptor of plane 0 of
      the image.

    * EGL_DMA_BUF_PLANE0_OFFSET_EXT: The offset from the start of the
      dma_buf of the first sample in plane 0, in bytes.

    * EGL_DMA_BUF_PLANE0_PITCH_EXT: The number of bytes between the start of
      subsequent rows of samples in plane 0. May have special meaning for
      non-linear formats.

    For images in an RGB color-space or those using a single-plane YUV
    format, only the first plane's file descriptor, offset & pitch should be
    specified. For semi-planar YUV formats, the chroma samples are stored in
    plane 1 and for fully planar formats, U-samples are stored in plane 1 and
    V-samples are stored in plane 2. Planes 1 & 2 are specified by the
    following attributes, which have the same meanings as defined above for
    plane 0:

    * EGL_DMA_BUF_PLANE1_FD_EXT
    * EGL_DMA_BUF_PLANE1_OFFSET_EXT
    * EGL_DMA_BUF_PLANE1_PITCH_EXT
    * EGL_DMA_BUF_PLANE2_FD_EXT
    * EGL_DMA_BUF_PLANE2_OFFSET_EXT
    * EGL_DMA_BUF_PLANE2_PITCH_EXT

    In addition to the above required attributes, the application may also
    provide hints as to how the data should be interpreted by the GL. If any
    of these hints are not specified, the GL will guess based on the pixel
    format passed as the EGL_LINUX_DRM_FOURCC_EXT attribute or may fall-back
    to some default value. Not all GLs will be able to support all
    combinations of these hints and are free to use whatever settings they
    choose to achieve the closest possible match.

    * EGL_YUV_COLOR_SPACE_HINT_EXT: The color-space the data is in. Only
      relevant for images in a YUV format, ignored when specified for an
      image in an RGB format. Accepted values are: EGL_ITU_REC601_EXT,
      EGL_ITU_REC709_EXT & EGL_ITU_REC2020_EXT.
    * EGL_YUV_CHROMA_HORIZONTAL_SITING_HINT_EXT &
      EGL_YUV_CHROMA_VERTICAL_SITING_HINT_EXT: Where chroma samples are
      sited relative to luma samples when the image is in a sub-sampled
      format. When the image is not using chroma sub-sampling, the luma and
      chroma samples are assumed to be co-sited. Siting is split into the
      vertical and horizontal and is in a fixed range. A siting of zero
      means the first luma sample is taken from the same position in that
      dimension as the chroma sample. This is best illustrated in the
      diagram below:

        (0.5, 0.5)        (0.0, 0.5)        (0.0, 0.0)
        +   +   +   +     +   +   +   +     *   +   *   +
          x       x       x       x
        +   +   +   +     +   +   +   +     +   +   +   +

        +   +   +   +     +   +   +   +     *   +   *   +
          x       x       x       x
        +   +   +   +     +   +   +   +     +   +   +   +

        Luma samples (+), Chroma samples (x), Chroma & Luma samples (*)

      Note this attribute is ignored for RGB images and non sub-sampled YUV
      images. Accepted values are: EGL_YUV_CHROMA_SITING_0_EXT (0.0) &
      EGL_YUV_CHROMA_SITING_0_5_EXT (0.5)

    * EGL_SAMPLE_RANGE_HINT_EXT: The numerical range of samples. Only
      relevant for images in a YUV format, ignored when specified for images
      in an RGB format. Accepted values are: EGL_YUV_FULL_RANGE_EXT (0-255) &
      EGL_YUV_NARROW_RANGE_EXT (16-235).

    If eglCreateImageKHR is successful for a EGL_LINUX_DMA_BUF_EXT target,
    the EGL takes ownership of the file descriptor and is responsible for
    closing it, which it may do at any time while the EGLDisplay is
    initialized."

    Add to the list of error conditions for eglCreateImageKHR:

    "* If <target> is EGL_LINUX_DMA_BUF_EXT and <buffer> is not NULL, the
      error EGL_BAD_PARAMETER is generated.

    * If <target> is EGL_LINUX_DMA_BUF_EXT, and the list of attributes is
      incomplete, EGL_BAD_PARAMETER is generated.

    * If <target> is EGL_LINUX_DMA_BUF_EXT, and the EGL_LINUX_DRM_FOURCC_EXT
      attribute is set to a format not supported by the EGL, EGL_BAD_MATCH
      is generated.
    * If <target> is EGL_LINUX_DMA_BUF_EXT, and the EGL_LINUX_DRM_FOURCC_EXT
      attribute indicates a single-plane format, EGL_BAD_ATTRIBUTE is
      generated if any of the EGL_DMA_BUF_PLANE1_* or EGL_DMA_BUF_PLANE2_*
      attributes are specified.

    * If <target> is EGL_LINUX_DMA_BUF_EXT and the value specified for
      EGL_YUV_COLOR_SPACE_HINT_EXT is not EGL_ITU_REC601_EXT,
      EGL_ITU_REC709_EXT or EGL_ITU_REC2020_EXT, EGL_BAD_ATTRIBUTE is
      generated.

    * If <target> is EGL_LINUX_DMA_BUF_EXT and the value specified for
      EGL_SAMPLE_RANGE_HINT_EXT is not EGL_YUV_FULL_RANGE_EXT or
      EGL_YUV_NARROW_RANGE_EXT, EGL_BAD_ATTRIBUTE is generated.

    * If <target> is EGL_LINUX_DMA_BUF_EXT and the value specified for
      EGL_YUV_CHROMA_HORIZONTAL_SITING_HINT_EXT or
      EGL_YUV_CHROMA_VERTICAL_SITING_HINT_EXT is not
      EGL_YUV_CHROMA_SITING_0_EXT or EGL_YUV_CHROMA_SITING_0_5_EXT,
      EGL_BAD_ATTRIBUTE is generated.

    * If <target> is EGL_LINUX_DMA_BUF_EXT and one or more of the values
      specified for a plane's pitch or offset isn't supported by EGL,
      EGL_BAD_ACCESS is generated.

    * If <target> is EGL_LINUX_DMA_BUF_EXT and eglCreateImageKHR fails, EGL
      does not retain ownership of the file descriptor and it is the
      responsibility of the application to close it."

Issues

    1. Should this be a KHR or EXT extension?

       ANSWER: EXT. Khronos EGL working group not keen on this extension as
       it is seen as contradicting the EGLStream direction the specification
       is going in. The working group recommends creating additional specs
       to allow an EGLStream producer/consumer connected to v4l2/DRM or any
       other Linux interface.

    2. Should this be a generic any platform extension, or a Linux-only
       extension which explicitly states the handles are dma_buf fds?

       ANSWER: There's currently no intention to port this extension to any
       OS not based on the Linux kernel. Consequently, this spec can be
       explicitly written against Linux and the dma_buf API.

    3. Does ownership of the file descriptor pass to the EGL library?
       ANSWER: If eglCreateImageKHR is successful, EGL assumes ownership of
       the file descriptors and is responsible for closing them.

    4. How are the different YUV color spaces handled (BT.709/BT.601)?

       ANSWER: The pixel formats defined in drm_fourcc.h only specify how
       the data is laid out in memory. It does not define how that data
       should be interpreted. Added a new EGL_YUV_COLOR_SPACE_HINT_EXT
       attribute to allow the application to specify which color space the
       data is in to allow the GL to choose an appropriate set of
       co-efficients if it needs to convert that data to RGB for example.

    5. What chroma-siting is used for sub-sampled YUV formats?

       ANSWER: The chroma siting is not specified by either the v4l2 or DRM
       APIs. This is similar to the color-space issue (4) in that the chroma
       siting doesn't affect how the data is stored in memory. However, the
       GL will need to know the siting in order to filter the image
       correctly. While the visual impact of getting the siting wrong is
       minor, provision should be made to allow an application to specify
       the siting if desired. Added additional
       EGL_YUV_CHROMA_HORIZONTAL_SITING_HINT_EXT &
       EGL_YUV_CHROMA_VERTICAL_SITING_HINT_EXT attributes to allow the
       siting to be specified using a set of pre-defined values (0 or 0.5).

    6. How can an application query which formats the EGL implementation
       supports?

       PROPOSAL: Don't provide a query mechanism but instead add an error
       condition that EGL_BAD_MATCH is raised if the EGL implementation
       doesn't support that particular format.

    7. Which image formats should be supported and how is format specified?

       Seem to be two options 1) specify a new enum in this specification
       and enumerate all possible formats. 2) Use an existing enum already
       in Linux, either v4l2_mbus_pixelcode and/or those formats listed in
       drm_fourcc.h?

       ANSWER: Go for option 2) and just use values defined in drm_fourcc.h.

    8. How can AYUV images be handled?
       ANSWER: At least on fourcc.org and in drm_fourcc.h, there only seems
       to be a single AYUV format and that is a packed format, so
       everything, including the alpha component would be in the first
       plane.

    9. How can you import interlaced images?

       ANSWER: Interlaced frames are usually stored with the top & bottom
       fields interleaved in a single buffer. As the fields would need to be
       displayed at different times, the application would create two
       EGLImages from the same buffer, one for the top field and another for
       the bottom. Both EGLImages would set the pitch to 2x the buffer width
       and the second EGLImage would use a suitable offset to indicate it
       started on the second line of the buffer. This should work regardless
       of whether the data is packed in a single plane, semi-planar or
       multi-planar.

       If each interlaced field is stored in a separate buffer then it
       should be trivial to create two EGLImages, one for each field's
       buffer.

    10. How are semi-planar/planar formats handled that have a different
        width/height for Y' and CbCr such as YUV420?

        ANSWER: The spec says EGL_WIDTH & EGL_HEIGHT specify the *logical*
        width and height of the buffer in pixels. For pixel formats with
        sub-sampled Chroma values, it should be trivial for the EGL
        implementation to calculate the width/height of the Chroma sample
        buffers using the logical width & height and by inspecting the pixel
        format passed as the EGL_LINUX_DRM_FOURCC_EXT attribute. I.e. If the
        pixel format says it's YUV420, the Chroma buffer's width =
        EGL_WIDTH/2 & height = EGL_HEIGHT/2.

    11. How are Bayer formats handled?

        ANSWER: As of Linux 2.6.34, drm_fourcc.h does not include any Bayer
        formats. However, future kernel versions may add such formats in
        which case they would be handled in the same way as any other
        format.

    12. Should the spec support buffers which have samples in a "narrow
        range"?
        Content sampled from older analogue sources typically doesn't use
        the full (0-255) range of the data type storing the sample and
        instead uses a narrow (16-235) range to allow some headroom &
        toeroom in the signals to avoid clipping signals which overshoot
        slightly during processing. This is sometimes known as signals using
        "studio swing".

        ANSWER: Add a new attribute to define if the samples use a narrow
        16-235 range or the full 0-255 range.

    13. Specifying the color space and range seems cumbersome, why not just
        allow the application to specify the full YUV->RGB color conversion
        matrix?

        ANSWER: Some hardware may not be able to use an arbitrary conversion
        matrix and needs to select an appropriate pre-defined matrix based
        on the color space and the sample range.

    14. How do you handle EGL implementations which have restrictions on
        pitch and/or offset?

        ANSWER: Buffers being imported using dma_buf pretty much have to be
        allocated by a kernel-space driver. As such, it is expected that a
        system integrator would make sure all devices which allocate buffers
        suitable for exporting make sure they use a pitch supported by all
        possible importers. However, it is still possible eglCreateImageKHR
        can fail due to an unsupported pitch. Added a new error to the list
        indicating this.

    15. Should this specification also describe how to export an existing
        EGLImage as a dma_buf file descriptor?

        ANSWER: No. Importing and exporting buffers are two separate
        operations and importing an existing dma_buf fd into an EGLImage is
        useful functionality in itself. Agree that exporting an EGLImage as
        a dma_buf fd is useful, E.g. it could be used by an OpenMAX IL
        implementation's OMX_UseEGLImage function to give access to the
        buffer backing an EGLImage to video hardware. However, exporting can
        be split into a separate extension specification.

Revision History

    #4 (Tom Cooksey, October 04, 2012)
       - Fixed issue numbering!
       - Added issues 8 - 15.
       - Promoted proposal for Issue 3 to be the answer.
       - Added an additional attribute to allow an application to specify
         the color space as a hint which should address issue 4.
       - Added an additional attribute to allow an application to specify
         the chroma siting as a hint which should address issue 5.
       - Added an additional attribute to allow an application to specify
         the sample range as a hint which should address the new issue 12.
       - Added language to end of error section clarifying who owns the fd
         passed to eglCreateImageKHR if an error is generated.

    #3 (Tom Cooksey, August 16, 2012)
       - Changed name from EGL_EXT_image_external and re-written language to
         explicitly state this for use with Linux & dma_buf.
       - Added a list of issues, including some still open ones.

    #2 (Jesse Barker, May 30, 2012)
       - Revision to split eglCreateImageKHR functionality from export
         Functionality.
       - Update definition of EGLNativeBufferType to be a struct containing
         a list of handles to support multi-buffer/multi-planar formats.

    #1 (Jesse Barker, March 20, 2012)
       - Initial draft.
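
The arithmetic behind issues 9, 10 and 12 can be sketched in a few lines of C. This is only an illustration under stated assumptions, not part of the draft: samples are 8 bits wide with full range taken as 0 to 255, only luma is expanded (narrow-range chroma uses a different span and is not handled), the "2x" in issue 9 is applied to the row pitch in bytes, odd sizes round up for the top field and for chroma planes, and all names are hypothetical.

```c
#include <stdint.h>

/* Offset, pitch and height to use when importing one field of an
 * interleaved interlaced frame as its own image (issue 9). */
struct field_view {
	uint32_t offset;	/* bytes from the start of the dma_buf */
	uint32_t pitch;		/* bytes between rows of this field */
	uint32_t height;	/* rows in this field */
};

static struct field_view field_of_frame(uint32_t frame_offset,
					uint32_t frame_pitch,
					uint32_t frame_height,
					int bottom)
{
	struct field_view v;

	/* The bottom field starts on the second line of the buffer. */
	v.offset = frame_offset + (bottom ? frame_pitch : 0);
	/* Skipping every other line doubles the distance between rows. */
	v.pitch = frame_pitch * 2;
	/* Assumption: with an odd line count the top field gets the extra row. */
	v.height = (frame_height + (bottom ? 0 : 1)) / 2;
	return v;
}

/* Width or height of a 4:2:0 chroma plane derived from the logical
 * luma dimension (issue 10). Rounding up is an assumption. */
static uint32_t chroma_dim_420(uint32_t luma_dim)
{
	return (luma_dim + 1) / 2;
}

/* Expand a narrow-range luma sample (16..235) to full range (0..255),
 * clamping values that overshoot (issue 12). */
static uint8_t luma_narrow_to_full(uint8_t y)
{
	if (y <= 16)
		return 0;
	if (y >= 235)
		return 255;
	/* Scale by 255/219; adding 109 (about 219/2) rounds to nearest. */
	return (uint8_t)((((int)y - 16) * 255 + 109) / 219);
}
```

For a 1080-line frame with a 1920-byte pitch, the bottom field would be imported with an offset of 1920 bytes, a pitch of 3840 bytes and a height of 540 rows.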

13 years, 2 months

[GIT PULL] CMA and DMA-mapping updates for v3.7

by Marek Szyprowski

Hi Linus,

I would like to ask for pulling another set of CMA and DMA-mapping framework updates for v3.7.

The following changes since commit a0d271cbfed1dd50278c6b06bead3d00ba0a88f9:

  Linux 3.6 (2012-09-30 16:47:46 -0700)

are available in the git repository at:

  git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git for-v3.7

for you to fetch changes up to 461b6f0d3d7d4e556035463b531136b034b7433e:

  Merge branch 'next-cleanup' into for-v3.7 (2012-10-02 09:24:24 +0200)

----------------------------------------------------------------

This time the pull request is rather small, because the further redesign patches were not ready on time.

This pull request consists of the patches which extend ARM DMA-mapping subsystem with support for CPU coherent (ACP) DMA busses. The first client of the new version is HighBank SATA driver. The second part of the pull request includes various cleanup for both CMA common code and ARM DMA-mapping subsystem.

Thanks!

Best regards
Marek Szyprowski
Samsung Poland R&D Center

Patch summary:

Hiroshi Doyu (3):
      ARM: dma-mapping: Small logical clean up
      ARM: dma-mapping: Refrain noisy console message
      ARM: dma-mapping: Remove unsed var at arm_coherent_iommu_unmap_page

Marek Szyprowski (1):
      Merge branch 'next-cleanup' into for-v3.7

Michal Nazarewicz (1):
      drivers: dma-contiguous: refactor dma_alloc_from_contiguous()

Rob Herring (4):
      ARM: add coherent dma ops
      ARM: add coherent iommu dma ops
      ARM: kill off arch_is_coherent
      ARM: highbank: add coherent DMA setup

 .../devicetree/bindings/ata/ahci-platform.txt | 3 +
 .../devicetree/bindings/dma/arm-pl330.txt | 3 +
 .../devicetree/bindings/net/calxeda-xgmac.txt | 3 +
 arch/arm/boot/dts/highbank.dts | 1 +
 arch/arm/include/asm/barrier.h | 7 +-
 arch/arm/include/asm/dma-mapping.h | 1 +
 arch/arm/include/asm/memory.h | 8 -
 arch/arm/mach-highbank/highbank.c | 52 ++++
 arch/arm/mm/dma-mapping.c | 264 +++++++++++++++-----
 arch/arm/mm/mmu.c | 17 +-
 drivers/base/dma-contiguous.c | 18 +-
 11 files changed, 283 insertions(+), 94 deletions(-)

13 years, 2 months

[PATCH] dma-buf: might_sleep() in dma_buf_unmap_attachment()

by Rob Clark

From: Rob Clark <rob(a)ti.com>

We never really clarified if unmap could be done in atomic context. But since mapping might require sleeping, this implies mutex in use to synchronize mapping/unmapping, so unmap could sleep as well. Add a might_sleep() to clarify this.

Signed-off-by: Rob Clark <rob(a)ti.com>
Acked-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
---
 drivers/base/dma-buf.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index c30f3e1..877eacb 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -298,6 +298,8 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
 				struct sg_table *sg_table,
 				enum dma_data_direction direction)
 {
+	might_sleep();
+
 	if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
 		return;

-- 
1.7.9.5

13 years, 2 months

[PATCHv8 00/26] Integration of videobuf2 with DMABUF

by Tomasz Stanislawski

Hello everyone,

This patchset adds support for DMABUF [2] importing and exporting to V4L2 stack. The importer and exporter part were merged because DMA mapping redesign [3] was scheduled for merge to mainline.

v8:
- rebased on 3.6-rc1
- merged importer and exporter patchsets
- fixed missing fields in v4l2_plane32 and v4l2_buffer32 structs
- fixed typos/style in documentation
- significant reduction of warnings from checkpatch.pl
- fixed STREAMOFF issues reported by Dima Zavin [4] by adding __vb2_dqbuf helper to vb2-core
- DC fails if userptr is not correctly aligned
- add support for DMA attributes in DC
- add support for buffers with no kernel mapping
- add reference counting on device from allocator context
- dummy support for mmap
- use dma_get_sgtable, drop vb2_dc_kaddr_to_pages hack and vb2_dc_get_base_sgt helper

v7:
- support for V4L2_MEMORY_DMABUF in v4l2-compat-ioctl32.c
- cosmetic fixes to the documentation
- added importing for vmalloc because vmap support in dmabuf for 3.5 was pull-requested
- support for dmabuf importing for VIVI
- resurrect allocation of dma-contig context
- remove reference of alloc_ctx in dma-contig buffer
- use sg_alloc_table_from_pages
- fix DMA scatterlist calls to use orig_nents instead of nents
- fix memleak in vb2_dc_sgt_foreach_page (use orig_nents instead of nents)

v6:
- fixed missing entry in v4l2_memory_names
- fixed a bug occurring after get_user_pages failure
- fixed a bug caused by using invalid vma for get_user_pages
- prepare/finish no longer call dma_sync for dmabuf buffers

v5:
- removed change of importer/exporter behaviour
- fixes vb2_dc_pages_to_sgt based on Laurent's hints
- changed pin/unpin words to lock/unlock in Doc

v4:
- rebased on mainline 3.4-rc2
- included missing importing support for s5p-fimc and s5p-tv
- added patch for changing map/unmap for importers
- fixes to Documentation part
- coding style fixes
- pairing {map/unmap}_dmabuf in vb2-core
- fixing variable types and semantic of arguments in videobuf2-dma-contig.c

v3:
- rebased on mainline 3.4-rc1
- split 'code refactor' patch to multiple smaller patches
- squashed fixes to Sumit's patches
- patchset is no longer dependent on 'DMA mapping redesign'
- separated path for handling IO and non-IO mappings
- add documentation for DMABUF importing to V4L
- removed all DMABUF exporter related code
- removed usage of dma_get_pages extension

v2:
- extended VIDIOC_EXPBUF argument from integer memoffset to struct v4l2_exportbuffer
- added patch that breaks DMABUF spec on (un)map_attachment callbacks but allows to work with existing implementation of DMABUF prime in DRM
- all dma-contig code refactoring patches were squashed
- bugfixes

v1: List of changes since [1].
- support for DMA api extension dma_get_pages, the function is used to retrieve pages used to create DMA mapping.
- small fixes/code cleanup to videobuf2
- added prepare and finish callbacks to vb2 allocators, they are used to keep consistency between dma-cpu access to the memory (by Marek Szyprowski)
- support for exporting of DMABUF buffer in V4L2 and Videobuf2, originated from [3].
- support for dma-buf exporting in vb2-dma-contig allocator
- support for DMABUF for s5p-tv and s5p-fimc (capture interface) drivers, originated from [3]
- changed handling for userptr buffers (by Marek Szyprowski, Andrzej Pietrasiewicz)
- let mmap method to use dma_mmap_writecombine call (by Marek Szyprowski)

[1] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/4296…
[2] https://lkml.org/lkml/2011/12/26/29
[3] http://thread.gmane.org/gmane.linux.kernel.cross-arch/12819
[4] http://article.gmane.org/gmane.linux.drivers.video-input-infrastructure/497…

Laurent Pinchart (2):
  v4l: vb2-dma-contig: Shorten vb2_dma_contig prefix to vb2_dc
  v4l: vb2-dma-contig: Reorder functions

Marek Szyprowski (5):
  v4l: vb2: add prepare/finish callbacks to allocators
  v4l: vb2-dma-contig: add prepare/finish to dma-contig allocator
  v4l: vb2-dma-contig: let mmap method to use dma_mmap_coherent call
  media: vb2: fail if user ptr buffer is not correctly aligned
  v4l: vb2: add support for DMA_ATTR_NO_KERNEL_MAPPING

Sumit Semwal (4):
  v4l: Add DMABUF as a memory type
  v4l: vb2: add support for shared buffer (dma_buf)
  v4l: vb: remove warnings about MEMORY_DMABUF
  v4l: vb2-dma-contig: add support for dma_buf importing

Tomasz Stanislawski (15):
  Documentation: media: description of DMABUF importing in V4L2
  v4l: vb2-dma-contig: remove reference of alloc_ctx from a buffer
  v4l: vb2-dma-contig: add support for scatterlist in userptr mode
  v4l: vb2-vmalloc: add support for dmabuf importing
  v4l: vivi: support for dmabuf importing
  v4l: s5p-tv: mixer: support for dmabuf importing
  v4l: s5p-fimc: support for dmabuf importing
  Documentation: media: description of DMABUF exporting in V4L2
  v4l: add buffer exporting via dmabuf
  v4l: vb2: add buffer exporting via dmabuf
  v4l: vb2-dma-contig: add support for DMABUF exporting
  v4l: vb2-dma-contig: add reference counting for a device from allocator context
  v4l: s5p-fimc: support for dmabuf exporting
  v4l: s5p-tv: mixer: support for dmabuf exporting
  v4l: s5p-mfc: support for dmabuf exporting

 Documentation/DocBook/media/v4l/compat.xml | 7 +
 Documentation/DocBook/media/v4l/io.xml | 183 +++++
 Documentation/DocBook/media/v4l/v4l2.xml | 1 +
 .../DocBook/media/v4l/vidioc-create-bufs.xml | 3 +-
 Documentation/DocBook/media/v4l/vidioc-expbuf.xml | 223 ++++++
 Documentation/DocBook/media/v4l/vidioc-qbuf.xml | 15 +
 Documentation/DocBook/media/v4l/vidioc-reqbufs.xml | 47 +-
 drivers/media/video/Kconfig | 1 +
 drivers/media/video/atmel-isi.c | 2 +-
 drivers/media/video/blackfin/bfin_capture.c | 2 +-
 drivers/media/video/marvell-ccic/mcam-core.c | 3 +-
 drivers/media/video/mx2_camera.c | 2 +-
 drivers/media/video/mx2_emmaprp.c | 2 +-
 drivers/media/video/mx3_camera.c | 2 +-
 drivers/media/video/s5p-fimc/Kconfig | 1 +
 drivers/media/video/s5p-fimc/fimc-capture.c | 11 +-
 drivers/media/video/s5p-fimc/fimc-core.c | 2 +-
 drivers/media/video/s5p-fimc/fimc-lite.c | 2 +-
 drivers/media/video/s5p-g2d/g2d.c | 2 +-
 drivers/media/video/s5p-jpeg/jpeg-core.c | 2 +-
 drivers/media/video/s5p-mfc/s5p_mfc.c | 5 +-
 drivers/media/video/s5p-mfc/s5p_mfc_dec.c | 18 +
 drivers/media/video/s5p-mfc/s5p_mfc_enc.c | 18 +
 drivers/media/video/s5p-tv/Kconfig | 1 +
 drivers/media/video/s5p-tv/mixer_video.c | 14 +-
 drivers/media/video/sh_mobile_ceu_camera.c | 2 +-
 drivers/media/video/v4l2-compat-ioctl32.c | 19 +
 drivers/media/video/v4l2-dev.c | 1 +
 drivers/media/video/v4l2-ioctl.c | 16 +
 drivers/media/video/videobuf-core.c | 4 +
 drivers/media/video/videobuf2-core.c | 275 +++++++-
 drivers/media/video/videobuf2-dma-contig.c | 719 ++++++++++++++++++--
 drivers/media/video/videobuf2-memops.c | 40 --
 drivers/media/video/videobuf2-vmalloc.c | 56 ++
 drivers/media/video/vivi.c | 2 +-
 drivers/staging/media/dt3155v4l/dt3155v4l.c | 2 +-
 include/linux/videodev2.h | 33 +
 include/media/v4l2-ioctl.h | 2 +
 include/media/videobuf2-core.h | 36 +
 include/media/videobuf2-dma-contig.h | 4 +-
 include/media/videobuf2-memops.h | 5 -
 41 files changed, 1639 insertions(+), 146 deletions(-)
 create mode 100644 Documentation/DocBook/media/v4l/vidioc-expbuf.xml

-- 
1.7.9.5

13 years, 2 months

[GIT PULL] DMA-mapping fix for v3.6

by Marek Szyprowski

Hi Linus,

I would like to ask for pulling yet another patch for ARM dma-mapping subsystem to Linux v3.6 kernel tree. This patch fixes a potential memory leak in the ARM dma-mapping code.

The following changes since commit 979570e02981d4a8fc20b3cc8fd651856c98ee9d:

  Linux 3.6-rc7 (2012-09-23 18:10:57 -0700)

are available in the git repository at:

  git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git fixes-for-3.6

for you to fetch changes up to ec10665cbf271fb1f60daeb194ad4f2cdcdc59d9:

  ARM: dma-mapping: Fix potential memory leak in atomic_pool_init() (2012-09-24 08:35:03 +0200)

----------------------------------------------------------------

Sachin Kamat (1):
      ARM: dma-mapping: Fix potential memory leak in atomic_pool_init()

 arch/arm/mm/dma-mapping.c | 2 ++
 1 file changed, 2 insertions(+)

13 years, 2 months

Re: [Linaro-mm-sig] Memory eviction in ttm

by Thomas Hellström

Hi Maarten!

Broadening the audience a bit..

On 9/14/12 9:12 AM, Maarten Lankhorst wrote:
> On 13-09-12 23:00, Thomas Hellstrom wrote:
>> On 09/13/2012 07:13 PM, Maarten Lankhorst wrote:
>>> Hey
>>>
>>> On 13-09-12 18:41, Thomas Hellstrom wrote:
>>>> On 09/13/2012 05:19 PM, Maarten Lankhorst wrote:
>>>>> Hey,
>>>>>
>>>>> On 12-09-12 15:28, Thomas Hellstrom wrote:
>>>>>> On 09/12/2012 02:48 PM, Maarten Lankhorst wrote:
>>>>>>> Hey Thomas,
>>>>>>>
>>>>>>> I'm playing around with moving reservations from ttm to global, but how
>>>>>>> ttm is handling reservations is getting in the way. The code wants to move
>>>>>>> the bo from the lru lock at the same time a reservation is made, but that
>>>>>>> seems to be slightly too strict. It would really help me if that guarantee
>>>>>>> is removed.
>>>>>> Hi, Maarten.
>>>>>>
>>>>>> Removing that restriction is not really possible at the moment.
>>>>>> Also the memory accounting code depends on this, and may cause reservations
>>>>>> in the most awkward places. Since these reservations don't have a ticket
>>>>>> they may and will cause deadlocks. So in short the restriction is there
>>>>>> to avoid deadlocks caused by ticketless reservations.
>>>>> I have finished the lockdep annotations now which seems to catch almost
>>>>> all abuse I threw at it, so I'm feeling slightly more confident about moving
>>>>> the locking order and reservations around.
>>>> Maarten, moving reservations in TTM out of the lru lock is incorrect as the code is
>>>> written now. If we want to move it out we need something for ticketless reservations
>>>>
>>>> I've been thinking of having a global hash table of tickets with the task struct pointer as the key,
>>>> but even then, we'd need to be able to handle EBUSY errors on every operation that might try to
>>>> reserve a buffer.
>>>>
>>>> The fact that lockdep doesn't complain isn't enough. There *will* be deadlock use-cases when TTM is handed
>>>> the right data-set.
>>>>
>>>> Isn't there a way that a subsystem can register a callback to be performed to remove stuff from LRU and
>>>> to take a pre-reservation lock?
>>> What if multiple subsystems need those? You will end up with a deadlock again.
>>>
>>> I think it would be easier to change the code in ttm_bo.c to not assume the first
>>> item on the lru list is really the least recently used, and assume the first item
>>> that can be reserved without blocking IS the least recently used instead.
>> So what would happen then is that we'd spin on the first item on the LRU list, since
>> when reserving we must release the LRU lock, and if reserving fails, we thus
>> need to restart LRU traversal. Typically after a schedule(). That's bad.
>>
>> So let's take a step back and analyze why the LRU lock has become a problem.
>> From what I can tell, it's because you want to use per-object lock when reserving instead of a
>> global reservation lock (that TTM could use as the LRU lock). Is that correct?
>> and in that case, in what situation do you envision such a global lock being contended
>> to the extent that it hurts performance?
>>
>>>>> Lockdep WILL complain about trying to use multiple tickets, doing ticketed
>>>>> and unticketed blocking reservations mixed, etc.
>>>>>
>>>>> I want to remove the global fence_lock and make it a per buffer lock, with some
>>>>> lockdep annotations it's perfectly legal to grab obj->fence_lock and obj2->fence_lock
>>>>> if you have a reservation, but it should complain loudly about trying to take 2 fence_locks
>>>>> at the same time without a reservation.
>>>> Yes, TTM was previously using per buffer fence locks, and that works fine from a deadlock perspective, but
>>>> it hurts performance. Fencing 200 buffers in a command submission (google-earth for example) will mean
>>>> 198 unnecessary locks, each discarding the processor pipelines. Locking is a *slow* operation, particularly
>>>> on systems with many processors, and I don't think it's a good idea to change that back, without analyzing
>>>> the performance impact. There are reasons people are writing stuff like RCU to avoid locking...
>>> So why don't we simply use RCU for fence pointers and get rid of the fence locking? :D
>>> danvet originally suggested it as a joke but if you think about it, it would make a lot of sense for this usecase.
>> I thought of that before, but the problem is you'd still need a spinlock to change the buffer's fence pointer,
>> even if reading it becomes quick.
> Actually, I changed lockdep annotations a bit to distinguish between the
> cases where ttm_bo_wait is called without reservation, and ttm_bo_wait
> is called with, as far as I can see there are only 2 places that do it without,
> at least if I converted my git tree properly..
>
> http://cgit.freedesktop.org/~mlankhorst/linux/log/?h=v10-wip
>
> First one is nouveau_bo_vma_del, this can be fixed easily.
> Second one is ttm_bo_cleanup_refs and ttm_bo_cleanup_refs_or_queue,
> if reservation is done first before ttm_bo_wait, the fence_lock could be
> dropped entirely by adding smp_mb() in reserve and unreserve, functionally
> there would be no difference. So if you can verify my lockdep annotations are
> correct in the most recent commit wrt what's using ttm_bo_wait without reservation
> we could remove the fence_lock entirely.
>
> ~Maarten

Being able to wait for buffer idle or get the fence pointer without reserving is a fundamental property of TTM. Reservation is a long-term lock. The fence lock is a very short term lock. If I were to choose, I'd rather accept per-object fence locks than removing this property, but see below.

Likewise, to be able to guarantee that a reserved object is not on any LRU list is also an important property. Removing that property will, in addition to the spin wait we've already discussed, make understanding TTM locking even more difficult, and I'd really like to avoid it.

If this were a real performance problem we were trying to solve it would be easier to motivate changes in this area, but if it's just trying to avoid a global reservation lock and a global fence lock that will rarely if ever see any contention, I can't see the point. On the contrary, having per-object locks will be very costly when reserving / fencing many objects. As mentioned before, in the fence lock case it's been tried and removed, so I'd like to know the reasoning behind introducing it again, and in what situations you think the global locks will be contended.

/Thomas
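
For readers following the thread, the proposal quoted above (treat the first LRU entry that can be reserved without blocking as the eviction victim) can be sketched in plain userspace C. This is only a toy model under stated assumptions: `toy_bo`, `toy_bo_tryreserve` and `toy_lru_pick_victim` are hypothetical names, not TTM code, and the sketch ignores the LRU lock and the restart-after-unlock problem that the reply objects to.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a buffer object; not the real struct ttm_buffer_object. */
struct toy_bo {
	bool reserved;	/* true while some context holds the reservation */
};

/* Try to reserve without blocking; returns true on success. */
bool toy_bo_tryreserve(struct toy_bo *bo)
{
	if (bo->reserved)
		return false;
	bo->reserved = true;
	return true;
}

/*
 * Walk the array from least to most recently used and return the index of
 * the first object that could be reserved without blocking, or -1 if every
 * object is busy. The returned object is left reserved for the caller.
 */
int toy_lru_pick_victim(struct toy_bo *lru, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (toy_bo_tryreserve(&lru[i]))
			return (int)i;
	}
	return -1;
}
```

With a list whose head is already reserved, the walk skips it and picks the next entry, which is exactly the relaxation of strict LRU order being debated.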

13 years, 3 months

[RFC][PATCH] fs/buffer.c: Revoke LRU when trying to drop buffers

by Laura Abbott

When a buffer is added to the LRU list, a reference is taken which is not dropped until the buffer is evicted from the LRU list. This is the correct behavior, however this LRU reference will prevent the buffer from being dropped. This means that the buffer can't actually be dropped until it is selected for eviction. There's no bound on the time spent on the LRU list, which means that the buffer may be undroppable for very long periods of time. Given that migration involves dropping buffers, the associated page is now unmigratable for long periods of time as well. CMA relies on being able to migrate a specific range of pages, so these types of failures make CMA significantly less reliable, especially under high filesystem usage.

Rather than waiting for the LRU algorithm to eventually kick out the buffer, explicitly remove the buffer from the LRU list when trying to drop it. There is still the possibility that the buffer could be added back on the list, but that indicates the buffer is still in use and would probably have other 'in use' indicators to prevent dropping.

Signed-off-by: Laura Abbott <lauraa(a)codeaurora.org>
---
 fs/buffer.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/fs/buffer.c b/fs/buffer.c
index ad5938c..daa0c3d 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1399,12 +1399,49 @@ static bool has_bh_in_lru(int cpu, void *dummy)
 	return 0;
 }

+static void __evict_bh_lru(void *arg)
+{
+	struct bh_lru *b = &get_cpu_var(bh_lrus);
+	struct buffer_head *bh = arg;
+	int i;
+
+	for (i = 0; i < BH_LRU_SIZE; i++) {
+		if (b->bhs[i] == bh) {
+			brelse(b->bhs[i]);
+			b->bhs[i] = NULL;
+			goto out;
+		}
+	}
+out:
+	put_cpu_var(bh_lrus);
+}
+
+static bool bh_exists_in_lru(int cpu, void *arg)
+{
+	struct bh_lru *b = per_cpu_ptr(&bh_lrus, cpu);
+	struct buffer_head *bh = arg;
+	int i;
+
+	for (i = 0; i < BH_LRU_SIZE; i++) {
+		if (b->bhs[i] == bh)
+			return 1;
+	}
+
+	return 0;
+
+}
 void invalidate_bh_lrus(void)
 {
 	on_each_cpu_cond(has_bh_in_lru, invalidate_bh_lru, NULL, 1, GFP_KERNEL);
 }
 EXPORT_SYMBOL_GPL(invalidate_bh_lrus);

+void evict_bh_lrus(struct buffer_head *bh)
+{
+	on_each_cpu_cond(bh_exists_in_lru, __evict_bh_lru, bh, 1, GFP_ATOMIC);
+}
+EXPORT_SYMBOL_GPL(evict_bh_lrus);
+
 void set_bh_page(struct buffer_head *bh,
 		struct page *page, unsigned long offset)
 {
@@ -3052,6 +3089,7 @@ drop_buffers(struct page *page, struct buffer_head **buffers_to_free)
 	bh = head;
 	do {
+		evict_bh_lrus(bh);
 		if (buffer_write_io_error(bh) && page->mapping)
 			set_bit(AS_EIO, &page->mapping->flags);
 		if (buffer_busy(bh))
-- 
1.7.11.3
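
The effect described in the commit message, an LRU reference keeping a buffer busy until it is explicitly evicted, can be modelled in a few lines of userspace C. This is a sketch under assumptions rather than kernel code: all `toy_*` names and the slot count are hypothetical, and a single array stands in for the per-CPU LRUs that the patch visits with on_each_cpu_cond().

```c
#include <stddef.h>

#define TOY_LRU_SIZE 8	/* hypothetical; stands in for BH_LRU_SIZE */

/* Toy stand-in for struct buffer_head: only the reference count matters. */
struct toy_bh {
	int refcount;
};

struct toy_lru {
	struct toy_bh *slots[TOY_LRU_SIZE];
};

/* Adding to the LRU takes a reference. Returns 0, or -1 if no slot is free. */
int toy_lru_add(struct toy_lru *lru, struct toy_bh *bh)
{
	size_t i;

	for (i = 0; i < TOY_LRU_SIZE; i++) {
		if (!lru->slots[i]) {
			bh->refcount++;
			lru->slots[i] = bh;
			return 0;
		}
	}
	return -1;
}

/* Explicit eviction drops the LRU reference. Returns 1 if bh was found. */
int toy_lru_evict(struct toy_lru *lru, struct toy_bh *bh)
{
	size_t i;

	for (i = 0; i < TOY_LRU_SIZE; i++) {
		if (lru->slots[i] == bh) {
			bh->refcount--;
			lru->slots[i] = NULL;
			return 1;
		}
	}
	return 0;
}

/* A buffer can only be dropped once nobody holds a reference to it. */
int toy_bh_busy(const struct toy_bh *bh)
{
	return bh->refcount > 0;
}
```

In this model a buffer that sits in the LRU stays busy indefinitely, and evicting it first is what lets the drop (and therefore the page migration) proceed.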

13 years, 3 months

[GIT PULL] One more DMA-mapping fix for v3.6

by Marek Szyprowski

Hi Linus,

I would like to ask for pulling one more patch for ARM dma-mapping subsystem to Linux v3.6 kernel tree. This patch fixes a very subtle bug (a typical off-by-one error) which might appear in very rare circumstances.

The following changes since commit 55d512e245bc7699a8800e23df1a24195dd08217:

  Linux 3.6-rc5 (2012-09-08 16:43:45 -0700)

are available in the git repository at:

  git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git fixes-for-3.6

for you to fetch changes up to f3d87524975f01b885fc3d009c6ab6afd0d00746:

  arm: mm: fix DMA pool affiliation check (2012-09-10 16:15:48 +0200)

Thanks!

Best regards
Marek Szyprowski
Samsung Poland R&D Center

Patch summary:

----------------------------------------------------------------

Thomas Petazzoni (1):
      arm: mm: fix DMA pool affiliation check

 arch/arm/mm/dma-mapping.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
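
The patch itself is not included in this message, so the following is not the actual fix. It is only a generic sketch of the kind of range check where such off-by-one errors typically hide: deciding whether a block lies inside a pool described by a base and a size. All names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool descriptor; not the kernel's atomic pool structure. */
struct toy_pool {
	uintptr_t base;	/* address of the first byte */
	size_t size;	/* number of bytes in the pool */
};

/* True if the block [start, start + len) lies entirely inside the pool. */
bool toy_in_pool(const struct toy_pool *pool, uintptr_t start, size_t len)
{
	uintptr_t pool_end = pool->base + pool->size;	/* one past the last byte */

	/* pool_end itself is outside the pool, hence >= rather than >. */
	if (start < pool->base || start >= pool_end)
		return false;
	/* Compare lengths so a block ending exactly at pool_end is accepted. */
	return len <= pool_end - start;
}
```

Treating the end address as exclusive throughout keeps the two boundary cases (a block starting at the end, and a block ending at the end) from being confused.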

13 years, 3 months

Re: [Linaro-mm-sig] Converting OMAP's custom vram allocator

by Rob Clark

On Wed, Sep 5, 2012 at 5:08 AM, Tomi Valkeinen <tomi.valkeinen(a)ti.com> wrote:
> Hi,
>
> OMAP has a custom video ram allocator, which I'd like to remove and use
> the standard dma allocation functions.
>
> There are two problems for which I'd like to hear suggestions or
> comments:
>
> First one is that the dma_alloc_* functions map the allocated memory for
> cpu use. In many cases with OMAP DSS (display subsystem) this is not
> needed: the memory may be written only by the SGX or the DSP, and it's
> only read by the DSS, so it's never touched by the CPU.

see dma_alloc_attrs() and DMA_ATTR_NO_KERNEL_MAPPING

> This is even more true when using VRFB on omap3 (and probably TILER on
> omap4) for rotation, as VRFB hides the actual memory and offers rotated
> views. In this case the backend memory is never accessed by anyone else
> than VRFB.

just fwiw, we don't actually need contiguous memory on o4/tiler :-)

(well, at least if you ignore things like secure playback)

> Is there a way to allocate the memory without creating a mapping? While
> it won't break anything as such, the allocated areas can be quite large
> thus causing large areas of the kernel's memory space to be needlessly
> reserved.
>
> The second case is passing a framebuffer address from the bootloader to
> the kernel. Often with mobile devices the bootloader will initialize the
> display hardware, showing a company logo or such. To keep the image on
> the screen when kernel starts we need to reserve the same physical
> memory area early at boot, and use that for the framebuffer.

with a bit of handwaving, this is possible. You can pass a base address to dma_declare_contiguous() when you set up your device's CMA pool. Although that doesn't really guarantee your allocation from that pool is at offset zero, I suppose.

> I'm not sure if there's any actual problem with this one, presuming
> there is a solution for the first case. Somehow the memory is reserved
> at early boot time, and this is passed to the fb driver. But can the
> memory be managed the same way as in normal case (for example freeing
> it), or does it need to be handled as a special case?

special-casing it might be better.. although possibly a dma attr could be added for this to tell dma_alloc_from_contiguous() that we need a particular address within the CMA pool. It seems a bit like a hack, but OTOH I guess pretty much every consumer device would need a hack like this.

BR,
-R

> Tomi
>

13 years, 3 months

[PATCH v3 0/2]gpu: ion: oom killer

by Zhangfei Gao

v2->v3
Split oom killer patch only.
Based on Nishanth's patch, which changes ion_debug_heap_total to use the id.
1. add heap_found
2. Solve the issue of several ids sharing one type.
   Use ion_debug_heap_total(client, heap->id) instead of ion_debug_heap_total(client, heap->type), since id is unique while type can be shared.
   Fortunately Nishanth has updated one patch, so rebase on that patch.

v1->v2
Sync to Aug 30 common.git

v0->v1:
1. move ion_shrink out of mutex, suggested by Nishanth
2. check error flag of ERR_PTR(-ENOMEM)
3. add msleep to allow schedule out.

Based on common.git, android-3.4 branch

Add oom killer.
Once the heap is used up, SIGKILL is sent to all tasks referring to the buffer, in descending oom_score_adj order.

Nishanth Peethambaran (1):
  gpu: ion: Update debugfs to show for each id

Zhangfei Gao (1):
  gpu: ion: oom killer

 drivers/gpu/ion/ion.c | 131 +++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 121 insertions(+), 10 deletions(-)
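
The cover letter does not show the selection code, so the following is only a userspace sketch of the ordering it describes: candidate tasks sorted by descending oom_score_adj, so that the task most willing to be killed comes first. The `toy_task` structure and function names are hypothetical and are not taken from the ion patch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record for a task holding a reference to a buffer. */
struct toy_task {
	int pid;
	int oom_score_adj;	/* higher means "kill me first" */
};

/* qsort comparator: descending oom_score_adj, without overflow-prone subtraction. */
static int toy_cmp_desc(const void *a, const void *b)
{
	const struct toy_task *ta = a;
	const struct toy_task *tb = b;

	return (tb->oom_score_adj > ta->oom_score_adj) -
	       (tb->oom_score_adj < ta->oom_score_adj);
}

/* Order candidates so that index 0 is the first task to receive SIGKILL. */
void toy_sort_victims(struct toy_task *tasks, size_t n)
{
	qsort(tasks, n, sizeof(*tasks), toy_cmp_desc);
}
```

A real implementation would walk the sorted list and stop killing as soon as enough heap memory had been released.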

13 years, 3 months

[PATCH v2 0/3]gpu: ion: oom killer

by Zhangfei Gao

v1->v2
Sync to Aug 30 common.git

v0->v1:
1. Change gen_pool_create(12, -1) to gen_pool_create(PAGE_SHIFT, -1), suggested by Haojian
2. move ion_shrink out of mutex, suggested by Nishanth
3. check error flag of ERR_PTR(-ENOMEM)
4. add msleep to allow schedule out.

Based on common.git, android-3.4 branch

Patch 2: Add support for page-wise cache flush for carveout_heap
There is only one nents for carveout heap, as well as dirty bit. As a result, cache flush only takes effect for the whole carveout heap.

Patch 3: Add oom killer.
Once the heap is used up, SIGKILL is sent to all tasks referring to the buffer, in descending oom_score_adj order.

Zhangfei Gao (3):
  gpu: ion: update carveout_heap_ops
  gpu: ion: carveout_heap page wised cache flush
  gpu: ion: oom killer

 drivers/gpu/ion/ion.c | 118 +++++++++++++++++++++++++++++++++-
 drivers/gpu/ion/ion_carveout_heap.c | 25 ++++++--
 2 files changed, 133 insertions(+), 10 deletions(-)

13 years, 3 months

[PATCH 1/2] gpu: ion: Fix kmalloc leak in carveout heap

by Nishanth Peethambaran

From 78c843052f9438a848dd91334c6392b3b861bfa1 Mon Sep 17 00:00:00 2001
From: Nishanth Peethambaran <nishanth(a)broadcom.com>
Date: Wed, 5 Sep 2012 02:21:54 +0530
Subject: [PATCH 1/2] gpu: ion: Fix kmalloc leak in carveout heap

In carveout_unmap_dma(), the sg_table is freed.

Change-Id: I8c2372c01a3f4c9061a1756ac9b31e09cf3ab9ef
Signed-off-by: Nishanth Peethambaran <nishanth(a)broadcom.com>
---
 drivers/gpu/ion/ion_carveout_heap.c | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/ion/ion_carveout_heap.c b/drivers/gpu/ion/ion_carveout_heap.c
index 5b6255b..f7608bc 100644
--- a/drivers/gpu/ion/ion_carveout_heap.c
+++ b/drivers/gpu/ion/ion_carveout_heap.c
@@ -107,6 +107,7 @@ void ion_carveout_heap_unmap_dma(struct ion_heap *heap,
 				 struct ion_buffer *buffer)
 {
 	sg_free_table(buffer->sg_table);
+	kfree(buffer->sg_table);
 }

 void *ion_carveout_heap_map_kernel(struct ion_heap *heap,
-- 
1.7.0.4
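
The leak fixed here follows a common pattern: a table header allocated on the heap owns a separately allocated entry array, and freeing only the entries leaves the header behind. A minimal userspace model, with hypothetical `toy_*` names standing in for the sg_table handling (malloc/free instead of kmalloc/kfree), might look like this:

```c
#include <stdlib.h>

/* Toy model of a table header that owns a separate entry array. */
struct toy_table {
	int *entries;
	unsigned int nents;
};

/* Allocate both the header and its entries; returns NULL on failure. */
struct toy_table *toy_table_create(unsigned int nents)
{
	struct toy_table *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->entries = calloc(nents, sizeof(*t->entries));
	if (!t->entries) {
		free(t);
		return NULL;
	}
	t->nents = nents;
	return t;
}

/* Plays the role of sg_free_table(): releases the entries, not the header. */
void toy_table_free_entries(struct toy_table *t)
{
	free(t->entries);
	t->entries = NULL;
	t->nents = 0;
}

/* Full teardown: entries first, then the header (the step the patch adds). */
void toy_table_destroy(struct toy_table *t)
{
	toy_table_free_entries(t);
	free(t);
}
```

Calling only `toy_table_free_entries()` on a table from `toy_table_create()` leaks the header, which mirrors the missing kfree() in the original unmap function.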

13 years, 3 months

[PATCH 2/2] gpu: ion: Fix a memory leak

by Nishanth Peethambaran

From 03184abb0efd05a422c4525f2f4398f127a9a0c9 Mon Sep 17 00:00:00 2001
From: Nishanth Peethambaran <nishanth(a)broadcom.com>
Date: Wed, 5 Sep 2012 02:49:30 +0530
Subject: [PATCH 2/2] gpu: ion: Fix a memory leak

The dirty bitmap buffer for cached buffers is freed when buffer gets destroyed.

Change-Id: If2b6a6e3ffc5ed57623cfcbc01cb6b720bee532f
Signed-off-by: Nishanth Peethambaran <nishanth(a)broadcom.com>
---
 drivers/gpu/ion/ion.c | 16 ++++++++++------
 1 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/ion/ion.c b/drivers/gpu/ion/ion.c
index 207d00f..3ad766a 100644
--- a/drivers/gpu/ion/ion.c
+++ b/drivers/gpu/ion/ion.c
@@ -176,12 +176,14 @@ static struct ion_buffer *ion_buffer_create(struct ion_heap *heap,
 		return ERR_PTR(-EINVAL);
 	}

-	ret = ion_buffer_alloc_dirty(buffer);
-	if (ret) {
-		heap->ops->unmap_dma(heap, buffer);
-		heap->ops->free(buffer);
-		kfree(buffer);
-		return ERR_PTR(ret);
+	if (buffer->flags & ION_FLAG_CACHED) {
+		ret = ion_buffer_alloc_dirty(buffer);
+		if (ret) {
+			heap->ops->unmap_dma(heap, buffer);
+			heap->ops->free(buffer);
+			kfree(buffer);
+			return ERR_PTR(ret);
+		}
 	}

 	buffer->dev = dev;
@@ -210,6 +212,8 @@ static void ion_buffer_destroy(struct kref *kref)

 	if (WARN_ON(buffer->kmap_cnt > 0))
 		buffer->heap->ops->unmap_kernel(buffer->heap, buffer);
+	if (buffer->flags & ION_FLAG_CACHED)
+		kfree(buffer->dirty);
 	buffer->heap->ops->unmap_dma(buffer->heap, buffer);
 	buffer->heap->ops->free(buffer);
 	mutex_lock(&dev->lock);
-- 
1.7.0.4

13 years, 3 months

A few questions about the best way to implement RandR 1.4 / PRIME buffer sharing

by Aaron Plattner

So I've been experimenting with support for Dave Airlie's new RandR 1.4 provider object interface, so that Optimus-based laptops can use our driver to drive the discrete GPU and display on the integrated GPU. The good news is that I've got a proof of concept working.

During a review of the current code, we came up with a few concerns:

1. The output source is responsible for allocating the shared memory

Right now, the X server calls CreatePixmap on the output source screen and then expects the output sink screen to be able to display from whatever memory the source allocates. Right now, the source has no mechanism for asking the sink what its requirements are for the surface. I'm using our own internal pitch alignment requirements and that seems to be good enough for the Intel device to scan out, but that could be pure luck.

Does it make sense to add a mechanism for drivers to negotiate this with each other, or is it sufficient to just define a lowest common denominator format and if your hardware can't deal with that format, you just don't get to share buffers?

One of my coworkers brought to my attention the fact that Tegra requires a specific pitch alignment, and cannot accommodate larger pitches. If other SoC designs have similar restrictions, we might need to add a handshake mechanism.

2. There's no fallback mechanism if sharing can't be negotiated

If RandR fails to share a pixmap with the output sink screen, the whole modeset fails. This means you'll end up not seeing anything on the screen and you'll probably think your computer locked up. Should there be some sort of software copy fallback to ensure that something at least shows up on the display?

3. How should the memory be allocated?

In the prototype I threw together, I'm allocating the shared memory using shm_open and then exporting that as a dma-buf file descriptor using an ioctl I added to the kernel, and then importing that memory back into our driver through dma_buf_attach & dma_buf_map_attachment. Does it make sense for user-space programs to be able to export shmfs files like that? Should that interface go in DRM / GEM / PRIME instead? Something else? I'm pretty unfamiliar with this kernel code so any suggestions would be appreciated.

-- Aaron

P.S. for those unfamiliar with PRIME:

Dave Airlie added new support to the X Resize and Rotate extension version 1.4 to support offloading display and rendering to different drivers. PRIME is the DRM implementation in the kernel, layered on top of DMA-BUF, that implements the actual sharing of buffers between drivers.

http://cgit.freedesktop.org/xorg/proto/randrproto/tree/randrproto.txt?id=ra…
http://airlied.livejournal.com/75555.html - update on hotplug server
http://airlied.livejournal.com/76078.html - randr 1.5 demo videos

-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------

13 years, 3 months

3
3
0 0

[PATCH 0/3] gpu: ion: oom killer

by Zhangfei Gao

Based on common.git, android-3.4 branch.

Patch 2: Add support for page-wise cache flush for carveout_heap. There is only one nents entry for the carveout heap, and only one dirty bit. As a result, a cache flush only takes effect on the whole carveout heap.

Patch 3: Add oom killer. Once the heap is used up, SIGKILL is sent to all tasks that refer to the buffer, in descending oom_score_adj order.

Zhangfei Gao (3):
  gpu: ion: update carveout_heap_ops
  gpu: ion: carveout_heap page wised cache flush
  gpu: ion: oom killer

 drivers/gpu/ion/ion.c               | 112 ++++++++++++++++++++++++++++++++++-
 drivers/gpu/ion/ion_carveout_heap.c |  23 ++++++--
 2 files changed, 127 insertions(+), 8 deletions(-)

13 years, 3 months

4
17
0 0

[PATCH 0/3] gpu: ion: oom killer

by Zhangfei Gao

v0->v1:
1. Change gen_pool_create(12, -1) to gen_pool_create(PAGE_SHIFT, -1), suggested by Haojian
2. Move ion_shrink out of mutex, suggested by Nishanth
3. Check error flag of ERR_PTR(-ENOMEM)
4. Add msleep to allow scheduling out

Based on common.git, android-3.4 branch.

Patch 2: Add support for page-wise cache flush for carveout_heap. There is only one nents entry for the carveout heap, and only one dirty bit. As a result, a cache flush only takes effect on the whole carveout heap.

Patch 3: Add oom killer. Once the heap is used up, SIGKILL is sent to all tasks that refer to the buffer, in descending oom_score_adj order.

Zhangfei Gao (3):
  gpu: ion: update carveout_heap_ops
  gpu: ion: carveout_heap page wised cache flush
  gpu: ion: oom killer

 drivers/gpu/ion/ion.c               | 118 +++++++++++++++++++++++++++++++++-
 drivers/gpu/ion/ion_carveout_heap.c |  25 ++++++--
 2 files changed, 133 insertions(+), 10 deletions(-)
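The ordering described for Patch 3 can be sketched in plain C as follows. This is not code from the patch: `struct victim` and `order_victims` are hypothetical names, and the sketch only shows sorting candidates so that the task with the highest oom_score_adj comes first.

```c
#include <stdlib.h>

/* Hypothetical record for a task that holds a reference to a buffer. */
struct victim {
    int pid;
    int oom_score_adj;
};

/* qsort comparator: higher oom_score_adj sorts first. */
static int by_score_desc(const void *a, const void *b)
{
    const struct victim *va = a;
    const struct victim *vb = b;

    return (vb->oom_score_adj > va->oom_score_adj) -
           (vb->oom_score_adj < va->oom_score_adj);
}

/* Reorder the candidates in place, most killable task first. */
static void order_victims(struct victim *v, size_t n)
{
    qsort(v, n, sizeof(*v), by_score_desc);
}
```

After `order_victims`, walking the array from index 0 visits tasks in the order in which the cover letter says SIGKILL would be delivered.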

13 years, 3 months

3
9
0 0

[GIT PULL] DMA-mapping fixes for v3.6

by Marek Szyprowski

Hi Linus,

I would like to ask for pulling another set of fixes for ARM dma-mapping subsystem.

Commit e9da6e9905e6 replaced custom consistent buffer remapping code with generic vmalloc areas. It however introduced some regressions caused by limited support for allocations in atomic context. This series contains fixes for those regressions.

For some subplatforms the default, pre-allocated pool for atomic allocations turned out to be too small, so a function for setting its size has been added.

Another set of patches adds support for atomic allocations to IOMMU-aware DMA-mapping implementation.

The last part of this pull request contains two fixes for Contiguous Memory Allocator, which relax too strict requirements.

The following changes since commit fea7a08acb13524b47711625eebea40a0ede69a0:

  Linux 3.6-rc3 (2012-08-22 13:29:06 -0700)

are available in the git repository at:

  fixes-for-3.6

for you to fetch changes up to 479ed93a4b98eef03fd8260f7ddc00019221c450:

  ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC (2012-08-28 21:01:07 +0200)

Thanks!

Best regards
Marek Szyprowski
Samsung Poland R&D Center

----------------------------------------------------------------

Patch summary:

Hiroshi Doyu (4):
  ARM: dma-mapping: atomic_pool with struct page **pages
  ARM: dma-mapping: Refactor out to introduce __in_atomic_pool
  ARM: dma-mapping: Introduce __atomic_get_pages() for __iommu_get_pages()
  ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC

Marek Szyprowski (5):
  mm: cma: fix alignment requirements for contiguous regions
  ARM: relax conditions required for enabling Contiguous Memory Allocator
  ARM: DMA-Mapping: add function for setting coherent pool size from platform code
  ARM: DMA-Mapping: print warning when atomic coherent allocation fails
  ARM: Kirkwood: increase atomic coherent pool size

 arch/arm/Kconfig                   |   2 +-
 arch/arm/include/asm/dma-mapping.h |   7 ++
 arch/arm/mach-kirkwood/common.c    |   7 ++
 arch/arm/mm/dma-mapping.c          | 114 ++++++++++++++++++++++++++++++++---
 drivers/base/dma-contiguous.c      |   2 +-
 5 files changed, 120 insertions(+), 12 deletions(-)

13 years, 3 months

1
0
0 0

Exporting ion buffer to multiple process from same driver

by Nishanth Peethambaran

Hi,

I am trying to export an ion_buffer allocated from kernel space to multiple user-space clients, e.g. to allow multiple processes to mmap a framebuffer allocated using ion by the fb driver. The following is the pseudo-code for that. Is this fine? Is there a cleaner way to do it? Or is it expected that buffers are shared across processes only by user space passing fds over sockets/binder, and not directly in the kernel?

fb driver init/probe: (init process context)
-------------------------------------------------
/* Create an ion client and allocate framebuffer */
init_client = ion_client_create(idev,...);
init_hdl = ion_alloc(init_client,...);

/* Create a global dma_buf instance for the buffer */
fd = ion_share_dma_buf(init_client, init_hdl);
    // - Inc refcount of ion_buffer
    // - Create a dma_buf and anon-file for the ion buffer
    // - Get a free fd and install to anon file
g_dma_buf = dma_buf_get(fd);
    // - Get the dma_buf pointer and inc refcount of anon_file
dma_buf_put(g_dma_buf);
    // - Dec extra refcount of anon_file which happened in prev command
put_unused_fd(fd);
    // - Free up the fd as fd is not exported to user-space here.

fb driver exit: (init process context)
------------------------------------------
/* Free the dma_buf reference */
dma_buf_put(g_dma_buf);
    // - Dec refcount of anon_file. Free the dma_buf and dec refcount of
    //   ion_buffer if anon_file refcount = 0

/* Free the framebuffer and destroy the ion client created for init process */
ion_free(init_client, init_hdl);
ion_client_destroy(init_client);

fb device open: (user process context)
-----------------------------------------------
/* Create an ion client for the user process */
p_client = ion_client_create(idev,...);

fb device ioctl to import ion handle for the fb: (user process context)
-----------------------------------------------------------------------------------
/* Import an ion_handle from the global dma_buf */
fd = dma_buf_fd(g_dma_buf, O_CLOEXEC);
    // - Get ref to anon file
    // - Get a free fd and install to anon file
p_hdl = ion_import_dma_buf(p_client, fd);
    // - Inc refcount of ion_buffer
    // - Create an ion_handle for the buffer for this process/client
dma_buf_put(g_dma_buf);
    // - Free the anon file reference taken in first step
put_unused_fd(fd);
    // - Free up the fd as fd is not exported to user-space here.

fb device release: (user process context)
---------------------------------------------------
/* Destroy the client created */
ion_client_destroy(p_client);

- Nishanth Peethambaran

13 years, 3 months

1
0
0 0

CMA page migration failure due to buffers on bh_lru

by Laura Abbott

Hi,

I've been observing a high rate of failures with CMA allocations on my ARM system. I've set up a test case with a 56MB CMA region that essentially does the following:

total_failures = 0;
loop forever:
    loop_failure = 0;
    for (i = 0; i < 56; i++)
        chunk[i] = dma_allocate(&cma_dev, 1MB)
        if (!chunk[i])
            loop_failure = 1
    if (loop_failure)
        total_failures++
    loop_failure = 0
    for (i = 0; i < 56; i++)
        dma_free(&cma_dev, chunk[i], 1MB)

In the background, I also have a process doing some amount of filesystem activity (adb push/pull since this is an android system). During the course of my investigations I generally get ~8500 loops total and ~450 total failures (i.e. one or more buffers could not be allocated). This is unacceptably high for our use cases.

In every case the allocation failure was ultimately due to a migration failure; the pages contained buffers which could not be dropped because the buffers were busy (move_to_new_page -> fallback_migrate_page -> try_to_release_page -> try_to_free_buffers -> drop_buffers -> buffer_busy). In every case, the b_count on the buffer head was always 1.

The problem arises because of the LRU lists for buffer heads:

__getblk
    __getblk_slow
        grow_buffers
            grow_dev_page
                find_or_create_page -- create a possibly movable page
        __find_get_block
            __find_get_block_slow
                find_get_page -- return the movable page
            bh_lru_install
                get_bh -- buffer head now has a reference

The reference taken in bh_lru_install won't be dropped until the bh is evicted from the lru. This means the page cannot be migrated as long as the buffer exists on an LRU list. The real issue is that unless the buffer gets evicted quickly the page can remain non-migratable for long periods of time. This makes CMA regions unusable for long periods of time: we generally don't want to size CMA regions any larger than necessary, so any failure will cause a problem.

My quick and dirty workaround for testing is to remove the GFP_MOVABLE flag from find_or_create_page but this seems significantly less than optimal. Ideally, it seems like the buffers should be evicted from the LRU when trying to drop (expand on invalid_bh_lru?) but I'm not familiar enough with the code path to know if this is a good approach.

Any suggestions/feedback is appreciated. Thanks.

Laura

--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
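For readers who want to reproduce the counting logic of the test case, the loop can be written out as the following userspace sketch. It is an approximation under stated assumptions: the allocator is passed in as a pair of hypothetical callbacks (`alloc_fn`, `free_fn`) standing in for the dma_allocate/dma_free calls in the pseudo-code, the loop count is bounded instead of running forever, and `flaky_alloc` is a stub that fails every 100th call so the counter has something to count.

```c
#include <stddef.h>

#define NUM_CHUNKS 56

/* Hypothetical allocator hooks standing in for dma_allocate/dma_free. */
typedef void *(*alloc_fn)(size_t size);
typedef void (*free_fn)(void *p, size_t size);

/*
 * Run `loops` iterations of the allocate-all/free-all cycle and return
 * how many iterations had at least one failed allocation.
 */
static unsigned long run_cma_stress(unsigned long loops, alloc_fn alloc,
                                    free_fn release)
{
    const size_t chunk_size = 1024 * 1024; /* 1MB per chunk */
    void *chunk[NUM_CHUNKS];
    unsigned long total_failures = 0;

    for (unsigned long n = 0; n < loops; n++) {
        int loop_failure = 0;

        for (int i = 0; i < NUM_CHUNKS; i++) {
            chunk[i] = alloc(chunk_size);
            if (!chunk[i])
                loop_failure = 1;
        }
        if (loop_failure)
            total_failures++;
        for (int i = 0; i < NUM_CHUNKS; i++) {
            if (chunk[i])
                release(chunk[i], chunk_size);
        }
    }
    return total_failures;
}

/* Stub allocator for demonstration: every 100th call fails. */
static unsigned long stub_calls;
static char stub_token;

static void *flaky_alloc(size_t size)
{
    (void)size;
    stub_calls++;
    return (stub_calls % 100 == 0) ? NULL : (void *)&stub_token;
}

static void noop_free(void *p, size_t size)
{
    (void)p;
    (void)size;
}
```

Calling `run_cma_stress(10, flaky_alloc, noop_free)` exercises the counter without touching real memory. On a target system the two callbacks would wrap the actual DMA allocation calls for the CMA-backed device.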

13 years, 3 months

1
1
0 0

[linaro-mm-sig] Backward compatibility of 3.4 android kernel to ICS

by Nishanth Peethambaran

Hi, I see that the lowmemkiller.c is changed to use oom_score_adj instead of oom_adj. Does this mean I cannot use an ICS system image with 3.4 kernel? Or is there a workaround? - Nishanth Peethambaran

13 years, 3 months

1
0
0 0

Re: [Linaro-mm-sig] CMA allocation issue : some pages can't be migrated

by Aubertin, Guillaume

Looping in the linaro-mm-sig ML.

On Thu, Aug 30, 2012 at 4:47 PM, Aubertin, Guillaume <g-aubertin(a)ti.com> wrote:

> hi guys,
>
> I've been working for a few days on getting a proper rmmod with the remoteproc/rpmsg modules, and I stumbled upon an interesting issue.
>
> When doing successive memory allocations and releases in the CMA reservation (by loading/unloading the firmware several times), the following message shows up:
>
> [ 119.908477] cma: dma_alloc_from_contiguous(cma ed10ad00, count 256, align 8)
> [ 119.908843] cma: dma_alloc_from_contiguous(): memory range at c0dfb000 is busy, retrying
> [ 119.909698] cma: dma_alloc_from_contiguous(): returned c0dfd000
>
> dma_alloc_from_contiguous() tries to allocate the following range, 0xc0dfd000, successfully this time.
>
> In some cases, the allocation fails after trying several ranges:
>
> [ 119.912231] cma: dma_alloc_from_contiguous(cma ed10ad00, count 768, align 8)
> [ 119.912719] cma: dma_alloc_from_contiguous(): memory range at c0dff000 is busy, retrying
> [ 119.913055] cma: dma_alloc_from_contiguous(): memory range at c0e01000 is busy, retrying
> [ 119.913055] rproc remoteproc0: dma_alloc_coherent failed: 3145728
>
> Here is my understanding so far:
>
> First, even if we made a CMA reservation, the kernel can still allocate pages in this area, but these pages must be movable (user process pages, for example).
>
> When dma_alloc_from_contiguous() is called to allocate X pages, it looks for the next X contiguous free pages in its CMA bitmap (with respect to the memory alignment). Then, alloc_contig_range() is called to allocate the given range of pages. alloc_contig_range() analyses the pages we want to allocate, and if a page is already used, it is migrated to a new page outside the page array we want to reserve. This is done using isolate_migratepages_range() to list the pages to migrate, and migrate_pages() to try to migrate the pages, and that's where it fails.
>
> Below is a list of the next function calls:
>
> fallback_migrate_page() --> migrate_page() --> try_to_release_page() --> try_to_free_buffers() --> drop_buffers() --> buffer_busy()
>
> I understand here that the page contains used buffers that can't be dropped, and so the page can't be migrated. Well, I must admit that once here, I'm feeling a little lost in this ocean of memory management code ;).
>
> After some research, I found the following thread on the linux-arm-kernel ML talking about the same issue:
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2012-June/102844.html
>
> with the following patch:
>
>  mm/page_alloc.c | 3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0e1c6f5..c9a6483 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1310,7 +1310,8 @@ void free_hot_cold_page(struct page *page, int cold)
>  	 * excessively into the page allocator
>  	 */
>  	if (migratetype >= MIGRATE_PCPTYPES) {
> -		if (unlikely(migratetype == MIGRATE_ISOLATE)) {
> +		if (unlikely(migratetype == MIGRATE_ISOLATE)
> +		    || is_migrate_cma(migratetype)) {
>  			free_one_page(zone, page, 0, migratetype);
>  			goto out;
>  		}
>
> I tried the patch, and it seems to work (I didn't have any "memory range busy" in 5000+ tests), but I'm afraid that this could have some nasty side effects.
>
> Any idea?
>
> Thanks in advance,
> Guillaume
>
> --
> Texas Instruments France SA, 821 Avenue Jack Kilby, 06270 Villeneuve Loubet. 036 420 040 R.C.S Antibes. Capital de EUR 753.920

--
Texas Instruments France SA, 821 Avenue Jack Kilby, 06270 Villeneuve Loubet. 036 420 040 R.C.S Antibes. Capital de EUR 753.920
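The bitmap step described in the quoted message (look for the next X contiguous free pages, respecting alignment) can be illustrated with a small standalone sketch. It is a toy model, not the kernel implementation: `find_free_range` is a hypothetical helper, the bitmap is a plain `bool` array with one entry per page, and migration of busy pages is not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return the index of the first run of `count` free pages that starts at
 * or after `start` on a multiple of `align_pages`, or -1 if there is none.
 * `align_pages` must be a power of two. used[i] is true for a taken page.
 */
static long find_free_range(const bool *used, size_t nbits, size_t start,
                            size_t count, size_t align_pages)
{
    assert(align_pages != 0 && (align_pages & (align_pages - 1)) == 0);

    /* Round the starting position up to the requested alignment. */
    size_t pos = (start + align_pages - 1) & ~(align_pages - 1);

    while (count != 0 && pos + count <= nbits) {
        size_t i;

        for (i = 0; i < count; i++) {
            if (used[pos + i])
                break;
        }
        if (i == count)
            return (long)pos;

        /* Skip past the busy page, then realign and try again. */
        pos = (pos + i + 1 + align_pages - 1) & ~(align_pages - 1);
    }
    return -1;
}
```

In this model, the "memory range is busy, retrying" lines in the log correspond to a candidate range that looked free in the bitmap but could not actually be claimed; a caller would then call `find_free_range` again with `start` moved past the failed range.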

13 years, 3 months

2
2
0 0

[RFC] New dma_buf -> EGLImage EGL extension

by Tom Cooksey

Hi All,

Over the last few months I've been working on & off with a few people from Linaro on a new EGL extension. The extension allows constructing an EGLImage from a (set of) dma_buf file descriptors, including support for multi-plane YUV. I envisage the primary use-case of this extension to be importing video frames from v4l2 into the EGL/GLES graphics driver to texture from.

Originally the intent was to develop this as a Khronos-ratified extension. However, this is a little too platform-specific to be an officially sanctioned Khronos extension. It also goes against the general "EGLStream" direction the EGL working group is going in. As such, the general feeling was to make this an EXT "multi-vendor" extension with no official stamp of approval from Khronos. As this is no longer intended to be a Khronos extension, I've re-written it to be a lot more Linux & dma_buf specific. It also allows me to circulate the extension more widely (i.e. to those outside Khronos membership).

ARM are implementing this extension for at least our Mali-T6xx driver and likely earlier drivers too. I am sending this e-mail to solicit feedback, both from other vendors who might implement this extension (Mesa3D?) and from potential users of the extension. However, any feedback is welcome.

Please find the extension text as it currently stands below. There are several open issues which I've proposed solutions for, but I'm not really happy with those proposals and hoped others could chip in with better ideas. There are likely other issues I've not thought about which also need to be added and addressed.

Once there's a general consensus or if no-one's interested, I'll update the spec, move it out of Draft status and get it added to the Khronos registry, which includes assigning values for the new symbols.

Cheers,

Tom

---------8<---------

Name

    EXT_image_dma_buf_import

Name Strings

    EGL_EXT_image_dma_buf_import

Contributors

    Jesse Barker
    Rob Clark
    Tom Cooksey

Contacts

    Jesse Barker (jesse 'dot' barker 'at' linaro 'dot' org)
    Tom Cooksey (tom 'dot' cooksey 'at' arm 'dot' com)

Status

    DRAFT

Version

    Version 3, August 16, 2012

Number

    EGL Extension ???

Dependencies

    EGL 1.2 is required.

    EGL_KHR_image_base is required.

    The EGL implementation must be running on a Linux kernel supporting the dma_buf buffer sharing mechanism.

    This extension is written against the wording of the EGL 1.2 Specification.

Overview

    This extension allows creating an EGLImage from a Linux dma_buf file descriptor or multiple file descriptors in the case of multi-plane YUV images.

New Types

    None

New Procedures and Functions

    None

New Tokens

    Accepted by the <target> parameter of eglCreateImageKHR:

        EGL_LINUX_DMA_BUF_EXT

    Accepted as an attribute in the <attrib_list> parameter of eglCreateImageKHR:

        EGL_LINUX_DRM_FOURCC_EXT
        EGL_DMA_BUF_PLANE0_FD_EXT
        EGL_DMA_BUF_PLANE0_OFFSET_EXT
        EGL_DMA_BUF_PLANE0_PITCH_EXT
        EGL_DMA_BUF_PLANE1_FD_EXT
        EGL_DMA_BUF_PLANE1_OFFSET_EXT
        EGL_DMA_BUF_PLANE1_PITCH_EXT
        EGL_DMA_BUF_PLANE2_FD_EXT
        EGL_DMA_BUF_PLANE2_OFFSET_EXT
        EGL_DMA_BUF_PLANE2_PITCH_EXT

Additions to Chapter 2 of the EGL 1.2 Specification (EGL Operation)

    Add to section 2.5.1 "EGLImage Specification" (as defined by the EGL_KHR_image_base specification), in the description of eglCreateImageKHR:

   "Values accepted for <target> are listed in Table aaa, below.

      +-------------------------+--------------------------------------------+
      | <target>                | Notes                                      |
      +-------------------------+--------------------------------------------+
      | EGL_LINUX_DMA_BUF_EXT   | Used for EGLImages imported from Linux     |
      |                         | dma_buf file descriptors                   |
      +-------------------------+--------------------------------------------+
       Table aaa. Legal values for eglCreateImageKHR <target> parameter

    ...

    If <target> is EGL_LINUX_DMA_BUF_EXT, <dpy> must be a valid display, <ctx> must be EGL_NO_CONTEXT, and <buffer> must be NULL, cast into the type EGLClientBuffer. The details of the image are specified by the attributes passed into eglCreateImageKHR. Required attributes and their values are as follows:

        * EGL_WIDTH & EGL_HEIGHT: The logical dimensions of the buffer in pixels

        * EGL_LINUX_DRM_FOURCC_EXT: The pixel format of the buffer, as specified by drm_fourcc.h and used as the pixel_format parameter of the drm_mode_fb_cmd2 ioctl.

        * EGL_DMA_BUF_PLANE0_FD_EXT: The dma_buf file descriptor of plane 0 of the image.

        * EGL_DMA_BUF_PLANE0_OFFSET_EXT: The offset from the start of the dma_buf of the first sample in plane 0, in bytes.

        * EGL_DMA_BUF_PLANE0_PITCH_EXT: The number of bytes between the start of subsequent rows of samples in plane 0. May have special meaning for non-linear formats.

    For images in an RGB color-space or those using a single-plane YUV format, only the first plane's file descriptor, offset & pitch should be specified. For semi-planar YUV formats, the chroma samples are stored in plane 1 and for fully planar formats, U-samples are stored in plane 1 and V-samples are stored in plane 2. Planes 1 & 2 are specified by the following attributes, which have the same meanings as defined above for plane 0:

        * EGL_DMA_BUF_PLANE1_FD_EXT
        * EGL_DMA_BUF_PLANE1_OFFSET_EXT
        * EGL_DMA_BUF_PLANE1_PITCH_EXT
        * EGL_DMA_BUF_PLANE2_FD_EXT
        * EGL_DMA_BUF_PLANE2_OFFSET_EXT
        * EGL_DMA_BUF_PLANE2_PITCH_EXT

    If eglCreateImageKHR is successful for an EGL_LINUX_DMA_BUF_EXT target, the EGL takes ownership of the file descriptor and is responsible for closing it, which it may do at any time while the EGLDisplay is initialized."

    Add to the list of error conditions for eglCreateImageKHR:

      "* If <target> is EGL_LINUX_DMA_BUF_EXT and <buffer> is not NULL, the error EGL_BAD_PARAMETER is generated.

       * If <target> is EGL_LINUX_DMA_BUF_EXT, and the list of attributes is incomplete, EGL_BAD_PARAMETER is generated.

       * If <target> is EGL_LINUX_DMA_BUF_EXT, and the EGL_LINUX_DRM_FOURCC_EXT attribute is set to a format not supported by the EGL, EGL_BAD_MATCH is generated.

       * If <target> is EGL_LINUX_DMA_BUF_EXT, and the EGL_LINUX_DRM_FOURCC_EXT attribute indicates a single-plane format, EGL_BAD_ATTRIBUTE is generated if any of the EGL_DMA_BUF_PLANE1_* or EGL_DMA_BUF_PLANE2_* attributes are specified."

Issues

    1. Should this be a KHR or EXT extension?

    ANSWER: EXT. Khronos EGL working group not keen on this extension as it is seen as contradicting the EGLStream direction the specification is going in. The working group recommends creating additional specs to allow an EGLStream producer/consumer connected to v4l2/DRM or any other Linux interface.

    2. Should this be a generic any platform extension, or a Linux-only extension which explicitly states the handles are dma_buf fds?

    ANSWER: There's currently no intention to port this extension to any OS not based on the Linux kernel. Consequently, this spec can be explicitly written against Linux and the dma_buf API.

    3. Does ownership of the file descriptor pass to the EGL library?

    PROPOSAL: If eglCreateImageKHR is successful, EGL assumes ownership of the file descriptors and is responsible for closing them.

    4. How are the different YUV color spaces handled (BT.709/BT.601)?

    Open issue, still TBD. Doesn't seem to be specified by either the v4l2 or DRM APIs.

    PROPOSAL: Undefined and implementation/format dependent.

    5. What chroma-siting is used for sub-sampled YUV formats?

    Open issue, still TBD. Doesn't seem to be specified by either the v4l2 or DRM APIs.

    PROPOSAL: Undefined and implementation/format dependent.

    6. How can an application query which formats the EGL implementation supports?

    PROPOSAL: Don't provide a query mechanism but instead add an error condition that EGL_BAD_MATCH is raised if the EGL implementation doesn't support that particular format.

    7. Which image formats should be supported and how is format specified?

    Open issue, still TBD. There seem to be two options: 1) specify a new enum in this specification and enumerate all possible formats. 2) Use an existing enum already in Linux, either v4l2_mbus_pixelcode and/or those formats listed in drm_fourcc.h?

    PROPOSAL: Go for option 2) and just use values defined in drm_fourcc.h.

Revision History

    #3 (Tom Cooksey, August 16, 2012)
       - Changed name from EGL_EXT_image_external and re-written language to explicitly state this is for use with Linux & dma_buf.
       - Added a list of issues, including some still open ones.

    #2 (Jesse Barker, May 30, 2012)
       - Revision to split eglCreateImageKHR functionality from export functionality.
       - Update definition of EGLNativeBufferType to be a struct containing a list of handles to support multi-buffer/multi-planar formats.

    #1 (Jesse Barker, March 20, 2012)
       - Initial draft.
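As an illustration of the attribute list the draft describes, here is a minimal sketch that builds one for a semi-planar NV12 frame (luma in plane 0, interleaved chroma in plane 1). Several things are assumptions: the draft states that token values have not been assigned yet, so the `SK_*` constants are placeholders invented for this sketch and are not the real EGL values; `struct sketch_plane` and `nv12_attribs` are hypothetical helpers; and `SKETCH_FOURCC` mirrors the byte packing of the fourcc_code() macro in drm_fourcc.h. A real program would take every token from the EGL headers and pass the list to eglCreateImageKHR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same byte packing as fourcc_code() in drm_fourcc.h. */
#define SKETCH_FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* PLACEHOLDER token values, invented for this sketch only. */
enum sketch_token {
    SK_NONE = 0, /* list terminator, stands in for EGL_NONE */
    SK_WIDTH,
    SK_HEIGHT,
    SK_FOURCC,        /* stands in for EGL_LINUX_DRM_FOURCC_EXT */
    SK_PLANE0_FD,
    SK_PLANE0_OFFSET,
    SK_PLANE0_PITCH,
    SK_PLANE1_FD,
    SK_PLANE1_OFFSET,
    SK_PLANE1_PITCH
};

/* Hypothetical description of one plane inside a dma_buf. */
struct sketch_plane {
    int32_t fd;     /* dma_buf file descriptor */
    int32_t offset; /* bytes from start of the dma_buf to the first sample */
    int32_t pitch;  /* bytes between the starts of consecutive rows */
};

/* 9 name/value pairs plus the terminator. */
#define NV12_ATTRIB_COUNT 19

/*
 * Write the attribute list for an NV12 image into `out`.
 * Returns the number of entries written, or 0 if `cap` is too small.
 */
static size_t nv12_attribs(int32_t *out, size_t cap,
                           int32_t width, int32_t height,
                           struct sketch_plane luma,
                           struct sketch_plane chroma)
{
    size_t n = 0;

    if (cap < NV12_ATTRIB_COUNT)
        return 0;

    out[n++] = SK_WIDTH;         out[n++] = width;
    out[n++] = SK_HEIGHT;        out[n++] = height;
    out[n++] = SK_FOURCC;
    out[n++] = (int32_t)SKETCH_FOURCC('N', 'V', '1', '2');
    out[n++] = SK_PLANE0_FD;     out[n++] = luma.fd;
    out[n++] = SK_PLANE0_OFFSET; out[n++] = luma.offset;
    out[n++] = SK_PLANE0_PITCH;  out[n++] = luma.pitch;
    out[n++] = SK_PLANE1_FD;     out[n++] = chroma.fd;
    out[n++] = SK_PLANE1_OFFSET; out[n++] = chroma.offset;
    out[n++] = SK_PLANE1_PITCH;  out[n++] = chroma.pitch;
    out[n++] = SK_NONE;

    assert(n == NV12_ATTRIB_COUNT);
    return n;
}
```

The sketch takes one file descriptor per plane. Under the ownership proposal in issue 3, EGL would close each descriptor it is given, so a caller whose planes live in the same dma_buf would presumably need to pass distinct descriptors for them; the draft does not spell this out.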

13 years, 3 months

1
0
0 0
