- Linaro-mm-sig - lists.linaro.org

[linaro-mm-sig] Backward compatibility of 3.4 android kernel to ICS

by Nishanth Peethambaran

Hi,

I see that lowmemorykiller.c has been changed to use oom_score_adj instead of oom_adj. Does this mean I cannot use an ICS system image with the 3.4 kernel, or is there a workaround?

- Nishanth Peethambaran
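For context, the two interfaces use different scales: oom_adj runs from -17 (disable) to 15, while oom_score_adj runs from -1000 to 1000. The sketch below shows the linear scaling that, as far as I recall, 3.x kernels apply when a legacy oom_adj value is written. The helper name is made up for illustration, and this says nothing about whether an ICS image needs further changes.

```c
#include <assert.h>

/* Constants as found in include/linux/oom.h on 3.x kernels, to my knowledge. */
#define OOM_DISABLE        (-17)
#define OOM_ADJUST_MAX     15
#define OOM_SCORE_ADJ_MAX  1000

/* Hypothetical helper: scale a legacy oom_adj value onto the oom_score_adj range. */
int oom_adj_to_score_adj(int oom_adj)
{
    /* The top of the old range maps exactly onto the top of the new one. */
    if (oom_adj == OOM_ADJUST_MAX)
        return OOM_SCORE_ADJ_MAX;
    /* Everything else is scaled linearly; integer division truncates. */
    return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
}
```

Under this mapping the old "foreground" value 0 stays 0, and a threshold written in oom_adj units (for example 8) lands at a much larger oom_score_adj value, so thresholds expressed in the old units cannot be compared directly against the new field.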

13 years, 5 months


Re: [Linaro-mm-sig] CMA allocation issue : some pages can't be migrated

by Aubertin, Guillaume

looping-in the linaro-mm-sig ML.

On Thu, Aug 30, 2012 at 4:47 PM, Aubertin, Guillaume <g-aubertin(a)ti.com> wrote:
> hi guys,
>
> I've been working for a few days on getting a proper rmmod with the
> remoteproc/rpmsg modules, and I stumbled upon an interesting issue.
>
> When doing successive memory allocations and releases in the CMA
> reservation (by loading/unloading the firmware several times), the
> following message shows up:
>
> [ 119.908477] cma: dma_alloc_from_contiguous(cma ed10ad00, count 256, align 8)
> [ 119.908843] cma: dma_alloc_from_contiguous(): memory range at c0dfb000 is busy, retrying
> [ 119.909698] cma: dma_alloc_from_contiguous(): returned c0dfd000
>
> dma_alloc_from_contiguous() then tries to allocate the following range,
> 0xc0dfd000, successfully this time.
>
> In some cases, the allocation fails after trying several ranges:
>
> [ 119.912231] cma: dma_alloc_from_contiguous(cma ed10ad00, count 768, align 8)
> [ 119.912719] cma: dma_alloc_from_contiguous(): memory range at c0dff000 is busy, retrying
> [ 119.913055] cma: dma_alloc_from_contiguous(): memory range at c0e01000 is busy, retrying
> [ 119.913055] rproc remoteproc0: dma_alloc_coherent failed: 3145728
>
> Here is my understanding so far:
>
> First, even if we made a CMA reservation, the kernel can still allocate
> pages in this area, but these pages must be movable (user process pages,
> for example).
>
> When dma_alloc_from_contiguous() is called to allocate X pages, it looks
> for the next X contiguous free pages in its CMA bitmap (with respect to
> the memory alignment). Then, alloc_contig_range() is called to allocate the
> given range of pages. alloc_contig_range() analyses the pages we want to
> allocate, and if a page is already used, it is migrated to a new page
> outside the page range we want to reserve. This is done using
> isolate_migratepages_range() to list the pages to migrate, and
> migrate_pages() to try to migrate the pages, and that's where it fails.
> Below is the list of the next function calls:
>
> fallback_migrate_page() --> migrate_page() --> try_to_release_page()
> --> try_to_free_buffers() --> drop_buffers() --> buffer_busy()
>
> I understand here that the page contains used buffers that can't be
> dropped, and so the page can't be migrated. Well, I must admit that once
> here, I'm feeling a little lost in this ocean of memory management code ;).
> After some research, I found the following thread on the
> linux-arm-kernel ML talking about the same issue:
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2012-June/102844.html
>
> with the following patch:
>
>  mm/page_alloc.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0e1c6f5..c9a6483 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1310,7 +1310,8 @@ void free_hot_cold_page(struct page *page, int cold)
>  	 * excessively into the page allocator
>  	 */
>  	if (migratetype >= MIGRATE_PCPTYPES) {
> -		if (unlikely(migratetype == MIGRATE_ISOLATE)) {
> +		if (unlikely(migratetype == MIGRATE_ISOLATE)
> +		    || is_migrate_cma(migratetype)) {
>  			free_one_page(zone, page, 0, migratetype);
>  			goto out;
>  		}
>
> I tried the patch, and it seems to work (I didn't have any "memory range
> busy" in 5000+ tests), but I'm afraid that this could have some nasty side
> effects.
>
> Any idea?
>
> Thanks in advance,
> Guillaume

--
Texas Instruments France SA, 821 Avenue Jack Kilby, 06270 Villeneuve Loubet. 036 420 040 R.C.S Antibes. Capital de EUR 753.920
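The search-and-retry behaviour described in the quoted mail can be modelled in user space. This is only a toy sketch under stated assumptions: the `cma_sim` structure, the fixed 64-page region and the `pinned` flag (standing in for "a page whose migration fails") are all invented for illustration and are not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CMA_SIM_PAGES 64

struct cma_sim {
    bool used[CMA_SIM_PAGES];    /* stands in for the CMA bitmap */
    bool pinned[CMA_SIM_PAGES];  /* pages whose migration would fail */
};

/* First run of `count` clear bits at or after `start`, aligned to `align`
 * pages (a power of two). Returns the first page index, or -1. */
long cma_sim_find_free(const struct cma_sim *cma, size_t start,
                       size_t count, size_t align)
{
    size_t base = (start + align - 1) & ~(align - 1);

    for (; base + count <= CMA_SIM_PAGES; base += align) {
        size_t i = 0;
        while (i < count && !cma->used[base + i])
            i++;
        if (i == count)
            return (long)base;
    }
    return -1;
}

/* Allocate `count` pages; when a candidate range contains a page that
 * cannot be migrated, skip ahead and retry, as in the log above. */
long cma_sim_alloc(struct cma_sim *cma, size_t count, size_t align)
{
    size_t start = 0;

    for (;;) {
        long base = cma_sim_find_free(cma, start, count, align);
        bool busy = false;

        if (base < 0)
            return -1;                      /* no candidate range left */
        for (size_t i = 0; i < count; i++)
            busy = busy || cma->pinned[(size_t)base + i];
        if (!busy) {
            for (size_t i = 0; i < count; i++)
                cma->used[(size_t)base + i] = true;
            return base;
        }
        start = (size_t)base + align;       /* "range is busy, retrying" */
    }
}
```

With page 1 pinned, a 4-page request aligned to 2 pages is refused at page 0 and succeeds at page 2, which mirrors the "busy at c0dfb000, returned c0dfd000" pattern; if every candidate range holds a pinned page, the request fails outright.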

13 years, 5 months


[RFC] New dma_buf -> EGLImage EGL extension

by Tom Cooksey

Hi All,

Over the last few months I've been working on & off with a few people from
Linaro on a new EGL extension. The extension allows constructing an EGLImage
from a (set of) dma_buf file descriptors, including support for multi-plane
YUV. I envisage the primary use-case of this extension to be importing video
frames from v4l2 into the EGL/GLES graphics driver to texture from.

Originally the intent was to develop this as a Khronos-ratified extension.
However, this is a little too platform-specific to be an officially
sanctioned Khronos extension. It also goes against the general "EGLStream"
direction the EGL working group is going in. As such, the general feeling
was to make this an EXT "multi-vendor" extension with no official stamp of
approval from Khronos. As this is no longer intended to be a Khronos
extension, I've re-written it to be a lot more Linux & dma_buf specific. It
also allows me to circulate the extension more widely (i.e. to those outside
Khronos membership).

ARM are implementing this extension for at least our Mali-T6xx driver and
likely earlier drivers too. I am sending this e-mail to solicit feedback,
both from other vendors who might implement this extension (Mesa3D?) and
from potential users of the extension. However, any feedback is welcome.

Please find the extension text as it currently stands below. There are
several open issues which I've proposed solutions for, but I'm not really
happy with those proposals and hoped others could chip in with better
ideas. There are likely other issues I've not thought about which also need
to be added and addressed.

Once there's a general consensus or if no-one's interested, I'll update the
spec, move it out of Draft status and get it added to the Khronos registry,
which includes assigning values for the new symbols.

Cheers,

Tom

---------8<---------

Name

    EXT_image_dma_buf_import

Name Strings

    EGL_EXT_image_dma_buf_import

Contributors

    Jesse Barker
    Rob Clark
    Tom Cooksey

Contacts

    Jesse Barker (jesse 'dot' barker 'at' linaro 'dot' org)
    Tom Cooksey (tom 'dot' cooksey 'at' arm 'dot' com)

Status

    DRAFT

Version

    Version 3, August 16, 2012

Number

    EGL Extension ???

Dependencies

    EGL 1.2 is required.

    EGL_KHR_image_base is required.

    The EGL implementation must be running on a Linux kernel supporting the
    dma_buf buffer sharing mechanism.

    This extension is written against the wording of the EGL 1.2
    Specification.

Overview

    This extension allows creating an EGLImage from a Linux dma_buf file
    descriptor or multiple file descriptors in the case of multi-plane YUV
    images.

New Types

    None

New Procedures and Functions

    None

New Tokens

    Accepted by the <target> parameter of eglCreateImageKHR:

        EGL_LINUX_DMA_BUF_EXT

    Accepted as an attribute in the <attrib_list> parameter of
    eglCreateImageKHR:

        EGL_LINUX_DRM_FOURCC_EXT
        EGL_DMA_BUF_PLANE0_FD_EXT
        EGL_DMA_BUF_PLANE0_OFFSET_EXT
        EGL_DMA_BUF_PLANE0_PITCH_EXT
        EGL_DMA_BUF_PLANE1_FD_EXT
        EGL_DMA_BUF_PLANE1_OFFSET_EXT
        EGL_DMA_BUF_PLANE1_PITCH_EXT
        EGL_DMA_BUF_PLANE2_FD_EXT
        EGL_DMA_BUF_PLANE2_OFFSET_EXT
        EGL_DMA_BUF_PLANE2_PITCH_EXT

Additions to Chapter 2 of the EGL 1.2 Specification (EGL Operation)

    Add to section 2.5.1 "EGLImage Specification" (as defined by the
    EGL_KHR_image_base specification), in the description of
    eglCreateImageKHR:

   "Values accepted for <target> are listed in Table aaa, below.

      +-------------------------+--------------------------------------------+
      |  <target>               |  Notes                                     |
      +-------------------------+--------------------------------------------+
      |  EGL_LINUX_DMA_BUF_EXT  |  Used for EGLImages imported from Linux    |
      |                         |  dma_buf file descriptors                  |
      +-------------------------+--------------------------------------------+
       Table aaa.  Legal values for eglCreateImageKHR <target> parameter

    ...

    If <target> is EGL_LINUX_DMA_BUF_EXT, <dpy> must be a valid display,
    <ctx> must be EGL_NO_CONTEXT, and <buffer> must be NULL, cast into the
    type EGLClientBuffer. The details of the image are specified by the
    attributes passed into eglCreateImageKHR. Required attributes and their
    values are as follows:

        * EGL_WIDTH & EGL_HEIGHT: The logical dimensions of the buffer in
          pixels

        * EGL_LINUX_DRM_FOURCC_EXT: The pixel format of the buffer, as
          specified by drm_fourcc.h and used as the pixel_format parameter
          of the drm_mode_fb_cmd2 ioctl.

        * EGL_DMA_BUF_PLANE0_FD_EXT: The dma_buf file descriptor of plane 0
          of the image.

        * EGL_DMA_BUF_PLANE0_OFFSET_EXT: The offset from the start of the
          dma_buf of the first sample in plane 0, in bytes.

        * EGL_DMA_BUF_PLANE0_PITCH_EXT: The number of bytes between the
          start of subsequent rows of samples in plane 0. May have special
          meaning for non-linear formats.

    For images in an RGB color-space or those using a single-plane YUV
    format, only the first plane's file descriptor, offset & pitch should be
    specified. For semi-planar YUV formats, the chroma samples are stored in
    plane 1 and for fully planar formats, U-samples are stored in plane 1
    and V-samples are stored in plane 2. Planes 1 & 2 are specified by the
    following attributes, which have the same meanings as defined above for
    plane 0:

        * EGL_DMA_BUF_PLANE1_FD_EXT
        * EGL_DMA_BUF_PLANE1_OFFSET_EXT
        * EGL_DMA_BUF_PLANE1_PITCH_EXT
        * EGL_DMA_BUF_PLANE2_FD_EXT
        * EGL_DMA_BUF_PLANE2_OFFSET_EXT
        * EGL_DMA_BUF_PLANE2_PITCH_EXT

    If eglCreateImageKHR is successful for an EGL_LINUX_DMA_BUF_EXT target,
    the EGL takes ownership of the file descriptor and is responsible for
    closing it, which it may do at any time while the EGLDisplay is
    initialized."

    Add to the list of error conditions for eglCreateImageKHR:

      "* If <target> is EGL_LINUX_DMA_BUF_EXT and <buffer> is not NULL, the
         error EGL_BAD_PARAMETER is generated.

       * If <target> is EGL_LINUX_DMA_BUF_EXT, and the list of attributes is
         incomplete, EGL_BAD_PARAMETER is generated.

       * If <target> is EGL_LINUX_DMA_BUF_EXT, and the
         EGL_LINUX_DRM_FOURCC_EXT attribute is set to a format not supported
         by the EGL, EGL_BAD_MATCH is generated.

       * If <target> is EGL_LINUX_DMA_BUF_EXT, and the
         EGL_LINUX_DRM_FOURCC_EXT attribute indicates a single-plane format,
         EGL_BAD_ATTRIBUTE is generated if any of the EGL_DMA_BUF_PLANE1_*
         or EGL_DMA_BUF_PLANE2_* attributes are specified."

Issues

    1. Should this be a KHR or EXT extension?

    ANSWER: EXT. Khronos EGL working group not keen on this extension as it
    is seen as contradicting the EGLStream direction the specification is
    going in. The working group recommends creating additional specs to
    allow an EGLStream producer/consumer connected to v4l2/DRM or any other
    Linux interface.

    2. Should this be a generic any platform extension, or a Linux-only
    extension which explicitly states the handles are dma_buf fds?

    ANSWER: There's currently no intention to port this extension to any OS
    not based on the Linux kernel. Consequently, this spec can be explicitly
    written against Linux and the dma_buf API.

    3. Does ownership of the file descriptor pass to the EGL library?

    PROPOSAL: If eglCreateImageKHR is successful, EGL assumes ownership of
    the file descriptors and is responsible for closing them.

    4. How are the different YUV color spaces handled (BT.709/BT.601)?

    Open issue, still TBD. Doesn't seem to be specified by either the v4l2
    or DRM APIs.

    PROPOSAL: Undefined and implementation/format dependent.

    5. What chroma-siting is used for sub-sampled YUV formats?

    Open issue, still TBD. Doesn't seem to be specified by either the v4l2
    or DRM APIs.

    PROPOSAL: Undefined and implementation/format dependent.

    6. How can an application query which formats the EGL implementation
    supports?

    PROPOSAL: Don't provide a query mechanism but instead add an error
    condition that EGL_BAD_MATCH is raised if the EGL implementation doesn't
    support that particular format.

    7. Which image formats should be supported and how is format specified?

    Open issue, still TBD. There seem to be two options: 1) specify a new
    enum in this specification and enumerate all possible formats. 2) Use an
    existing enum already in Linux, either v4l2_mbus_pixelcode and/or those
    formats listed in drm_fourcc.h?

    PROPOSAL: Go for option 2) and just use values defined in drm_fourcc.h.

Revision History

    #3 (Tom Cooksey, August 16, 2012)
       - Changed name from EGL_EXT_image_external and re-written language to
         explicitly state this is for use with Linux & dma_buf.
       - Added a list of issues, including some still open ones.

    #2 (Jesse Barker, May 30, 2012)
       - Revision to split eglCreateImageKHR functionality from export
         functionality.
       - Update definition of EGLNativeBufferType to be a struct containing
         a list of handles to support multi-buffer/multi-planar formats.

    #1 (Jesse Barker, March 20, 2012)
       - Initial draft.
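To make the plane rules and error conditions in the draft concrete, here is a small stand-alone model of the checks an implementation might run. Everything in it is hypothetical: the `import_request` structure, the result enum and the three-format table are invented for this sketch (the format codes themselves follow the fourcc packing used by drm_fourcc.h). The draft only defines EGL_BAD_ATTRIBUTE for extra planes on single-plane formats; treating an unexpected plane 2 on a two-plane format the same way is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Three formats from drm_fourcc.h, picked as examples. */
#define FMT_XRGB8888 FOURCC('X', 'R', '2', '4')  /* single plane */
#define FMT_NV12     FOURCC('N', 'V', '1', '2')  /* Y plane + interleaved chroma */
#define FMT_YUV420   FOURCC('Y', 'U', '1', '2')  /* Y, U and V planes */

enum import_result {
    IMPORT_OK, IMPORT_BAD_PARAMETER, IMPORT_BAD_MATCH, IMPORT_BAD_ATTRIBUTE
};

/* Hypothetical, pre-parsed view of the attribute list. */
struct import_request {
    int      width, height;
    uint32_t fourcc;
    bool     plane_given[3];  /* FD/OFFSET/PITCH supplied for plane n? */
};

/* Number of planes for the formats this toy "implementation" supports. */
int format_plane_count(uint32_t fourcc)
{
    switch (fourcc) {
    case FMT_XRGB8888: return 1;
    case FMT_NV12:     return 2;
    case FMT_YUV420:   return 3;
    default:           return 0;  /* unsupported */
    }
}

enum import_result validate_import(const struct import_request *req)
{
    int planes = format_plane_count(req->fourcc);

    if (req->width <= 0 || req->height <= 0 || !req->plane_given[0])
        return IMPORT_BAD_PARAMETER;      /* incomplete attribute list */
    if (planes == 0)
        return IMPORT_BAD_MATCH;          /* format not supported */
    for (int i = 1; i < 3; i++) {
        if (i < planes && !req->plane_given[i])
            return IMPORT_BAD_PARAMETER;  /* a required plane is missing */
        if (i >= planes && req->plane_given[i])
            return IMPORT_BAD_ATTRIBUTE;  /* plane not used by this format */
    }
    return IMPORT_OK;
}
```

An application following issue 6 would probe format support by attempting the import and checking for the BAD_MATCH case rather than by querying a list.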

13 years, 5 months


[PATCH 1/4] dma-buf: remove fallback for !CONFIG_DMA_SHARED_BUFFER

by Maarten Lankhorst

Documentation says that code requiring dma-buf should add it to select,
so inline fallbacks are not going to be used. A link error will make it
obvious what went wrong, instead of silently doing nothing at runtime.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
---
 include/linux/dma-buf.h |   99 -----------------------------------------------
 1 file changed, 99 deletions(-)

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index eb48f38..bd2e52c 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -156,7 +156,6 @@ static inline void get_dma_buf(struct dma_buf *dmabuf)
 	get_file(dmabuf->file);
 }
 
-#ifdef CONFIG_DMA_SHARED_BUFFER
 struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
 							struct device *dev);
 void dma_buf_detach(struct dma_buf *dmabuf,
@@ -184,103 +183,5 @@ int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *,
 		 unsigned long);
 void *dma_buf_vmap(struct dma_buf *);
 void dma_buf_vunmap(struct dma_buf *, void *vaddr);
-#else
-
-static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
-							struct device *dev)
-{
-	return ERR_PTR(-ENODEV);
-}
-
-static inline void dma_buf_detach(struct dma_buf *dmabuf,
-				  struct dma_buf_attachment *dmabuf_attach)
-{
-	return;
-}
-
-static inline struct dma_buf *dma_buf_export(void *priv,
-						const struct dma_buf_ops *ops,
-						size_t size, int flags)
-{
-	return ERR_PTR(-ENODEV);
-}
-
-static inline int dma_buf_fd(struct dma_buf *dmabuf, int flags)
-{
-	return -ENODEV;
-}
-
-static inline struct dma_buf *dma_buf_get(int fd)
-{
-	return ERR_PTR(-ENODEV);
-}
-
-static inline void dma_buf_put(struct dma_buf *dmabuf)
-{
-	return;
-}
-
-static inline struct sg_table *dma_buf_map_attachment(
-	struct dma_buf_attachment *attach, enum dma_data_direction write)
-{
-	return ERR_PTR(-ENODEV);
-}
-
-static inline void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
-			struct sg_table *sg, enum dma_data_direction dir)
-{
-	return;
-}
-
-static inline int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
-					   size_t start, size_t len,
-					   enum dma_data_direction dir)
-{
-	return -ENODEV;
-}
-
-static inline void dma_buf_end_cpu_access(struct dma_buf *dmabuf,
-					  size_t start, size_t len,
-					  enum dma_data_direction dir)
-{
-}
-
-static inline void *dma_buf_kmap_atomic(struct dma_buf *dmabuf,
-					unsigned long pnum)
-{
-	return NULL;
-}
-
-static inline void dma_buf_kunmap_atomic(struct dma_buf *dmabuf,
-					 unsigned long pnum, void *vaddr)
-{
-}
-
-static inline void *dma_buf_kmap(struct dma_buf *dmabuf, unsigned long pnum)
-{
-	return NULL;
-}
-
-static inline void dma_buf_kunmap(struct dma_buf *dmabuf,
-				  unsigned long pnum, void *vaddr)
-{
-}
-
-static inline int dma_buf_mmap(struct dma_buf *dmabuf,
-			       struct vm_area_struct *vma,
-			       unsigned long pgoff)
-{
-	return -ENODEV;
-}
-
-static inline void *dma_buf_vmap(struct dma_buf *dmabuf)
-{
-	return NULL;
-}
-
-static inline void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr)
-{
-}
-#endif /* CONFIG_DMA_SHARED_BUFFER */
 
 #endif /* __DMA_BUF_H__ */
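One way to see the "silently doing nothing at runtime" concern is to model how callers observe the removed fallbacks. The sketch below is a user-space re-creation of the kernel's ERR_PTR/IS_ERR/PTR_ERR convention (the real helpers live in include/linux/err.h); the two `stub_*` functions are stand-ins written for this example and mirror the return values of the deleted inline stubs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space model of the kernel's error-pointer helpers. */
void *ERR_PTR(long error)       { return (void *)(intptr_t)error; }
long  PTR_ERR(const void *ptr)  { return (long)(intptr_t)ptr; }
int   IS_ERR(const void *ptr)
{
    /* Error pointers occupy the top MAX_ERRNO addresses. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct dma_buf;  /* opaque, as in the header */

/* Stand-ins that return what the removed fallbacks returned. */
struct dma_buf *stub_dma_buf_get(int fd)
{
    (void)fd;
    return ERR_PTR(-ENODEV);
}

void *stub_dma_buf_vmap(struct dma_buf *dmabuf)
{
    (void)dmabuf;
    return NULL;
}
```

A caller that checks IS_ERR() notices the first stub, but the NULL-returning ones pass an IS_ERR() check and the void stubs give no signal at all, so a misconfigured build could fail in different ways depending on which entry point it hits first. A link error, as the patch argues, is uniform.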

13 years, 5 months


[PATCH 1/4] ARM: dma-mapping: Small logical clean up

by Hiroshi Doyu

Skip unnecessary operations if order == 0. A little bit easier to
read.

Signed-off-by: Hiroshi Doyu <hdoyu(a)nvidia.com>
---
 arch/arm/mm/dma-mapping.c |    9 +++++----
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 70a6275..aaea5e4 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1032,11 +1032,12 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t
 		if (!pages[i])
 			goto error;
 
-		if (order)
+		if (order) {
 			split_page(pages[i], order);
-		j = 1 << order;
-		while (--j)
-			pages[i + j] = pages[i] + j;
+			j = 1 << order;
+			while (--j)
+				pages[i + j] = pages[i] + j;
+		}
 
 		__dma_clear_buffer(pages[i], PAGE_SIZE << order);
 		i += 1 << order;
-- 
1.7.5.4
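The loop being moved can be exercised in user space to see why the change is behaviour-preserving: for order == 0, j starts at 1 and `--j` is immediately 0, so the loop body never ran anyway. The `struct page` below is a dummy and `fill_split_pages` is a name invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

struct page { int dummy; };  /* placeholder for the kernel's struct page */

/* Record the 2^order pages of one allocation in pages[i .. i + 2^order - 1],
 * using the post-patch structure of the loop. */
void fill_split_pages(struct page **pages, size_t i,
                      struct page *head, unsigned int order)
{
    pages[i] = head;
    if (order) {
        size_t j = (size_t)1 << order;

        /* Walks j = 2^order - 1 down to 1; entry i is already set. */
        while (--j)
            pages[i + j] = head + j;
    }
}
```

For order 2 this fills four consecutive entries; for order 0 only the head entry is written, with or without the `if`.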

13 years, 5 months


[PATCH 0/3] Add code for setting coherent pool size

by Marek Szyprowski

Hi!

Aaro Koskinen and Josh Coombs reported that commit e9da6e9905e639
("ARM: dma-mapping: remove custom consistent dma region") introduced a
regression. It turned out that the default 256KiB for the atomic coherent
pool might not be enough. After that patch, some Kirkwood systems run
out of atomic coherent memory and fail without any meaningful message.

This patch series is an attempt to fix those issues by adding a function
for setting the coherent pool size from platform initialization code and
increasing the size of the pool for Kirkwood systems.

Best regards
Marek Szyprowski
Samsung Poland R&D Center

Patch summary:

Marek Szyprowski (3):
  ARM: DMA-Mapping: add function for setting coherent pool size from
    platform code
  ARM: DMA-Mapping: print warning when atomic coherent allocation fails
  ARM: Kirkwood: increase atomic coherent pool size

 arch/arm/include/asm/dma-mapping.h |    7 +++++++
 arch/arm/mach-kirkwood/common.c    |    7 +++++++
 arch/arm/mm/dma-mapping.c          |   22 +++++++++++++++++++++-
 3 files changed, 35 insertions(+), 1 deletions(-)

-- 
1.7.1.569.g6f426
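The ordering constraint behind "setting the pool size from platform initialization code" can be sketched as follows: the size may only be changed before the pool memory is actually reserved. This is a user-space illustration only; the structure, the function names and the error return are all hypothetical and are not taken from the patches themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEFAULT_POOL_SIZE (256u * 1024u)  /* the 256KiB default mentioned above */

struct coherent_pool {
    size_t size;       /* bytes to reserve for atomic allocations */
    bool   allocated;  /* set once the pool memory has been reserved */
};

/* Hypothetical setter for platform init code.
 * Returns 0 on success, -1 if called after the pool was allocated. */
int pool_set_size(struct coherent_pool *pool, size_t size)
{
    if (pool->allocated)
        return -1;     /* too late: the size is already baked in */
    pool->size = size;
    return 0;
}

/* Hypothetical init step that freezes the configured size. */
void pool_init(struct coherent_pool *pool)
{
    pool->allocated = true;
}
```

A platform such as Kirkwood would call the setter early in its machine init, before the generic code reserves the pool.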

13 years, 5 months


[v4 0/4] ARM: dma-mapping: IOMMU atomic allocation

by Hiroshi Doyu

Hi,

The commit e9da6e9 "ARM: dma-mapping: remove custom consistent dma
region" breaks compatibility with existing drivers. This causes the
following kernel oops(*1). That driver has called dma_pool_alloc() to
allocate memory from interrupt context, and it hits
BUG_ON(in_interrupt()) in "get_vm_area_caller()". This patch series
fixes the problem by making use of the pre-allocated atomic memory pool
which DMA is using, in the same way as DMA does now.

Any comment would be really appreciated.

v4:
  Fix plain memory allocation. (Konrad, Marek)
  Print nicer error message at __in_atomic_pool(). (Konrad)
v3:
  Provide a different path for IOMMU for more clean code. (Marek)
  atomic_pool is backed with struct page *pages[].
  http://lists.linaro.org/pipermail/linaro-mm-sig/2012-August/002446.html
v2:
  Don't modify attrs(DMA_ATTR_NO_KERNEL_MAPPING) for atomic allocation. (Marek)
  Modify vzalloc. (KyongHo, Minchan)
  http://lists.linaro.org/pipermail/linaro-mm-sig/2012-August/002430.html
v1:
  http://lists.linaro.org/pipermail/linaro-mm-sig/2012-August/002398.html

*1:
[ 8.321343] ------------[ cut here ]------------
[ 8.325971] kernel BUG at kernel/mm/vmalloc.c:1322!
[ 8.333615] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[ 8.339436] Modules linked in:
[ 8.342496] CPU: 0 Tainted: G W (3.4.6-00067-g5d485f7 #67)
[ 8.349192] PC is at __get_vm_area_node.isra.29+0x164/0x16c
[ 8.354758] LR is at get_vm_area_caller+0x4c/0x54
[ 8.359454] pc : [<c011297c>] lr : [<c011318c>] psr: 20000193
[ 8.359458] sp : c09edca0 ip : c09ec000 fp : ae278000
[ 8.370922] r10: f0000000 r9 : c011aa54 r8 : c0a26cb8
[ 8.376136] r7 : 00000001 r6 : 000000d0 r5 : 20000008 r4 : c09edca0
[ 8.382651] r3 : 00010000 r2 : 20000008 r1 : 00000001 r0 : 00001000
[ 8.389166] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
[ 8.396549] Control: 10c5387d Table: ad98c04a DAC: 00000015
....
[ 9.169162] dfa0: 412fc099 c09ec000 00000000 c000fdd8 c06df1e4 c0a1b080 00000000 00000000 [ 9.177329] dfc0: c0a235cc 8000406a 00000000 c0986818 ffffffff ffffffff c0986404 00000000 [ 9.185497] dfe0: 00000000 c09bb070 10c5387d c0a19c58 c09bb064 80008044 00000000 00000000 [ 9.193673] [<c011297c>] (__get_vm_area_node.isra.29+0x164/0x16c) from [<c011318c>] (get_vm_area_caller+0x4c/0x54) [ 9.204022] [<c011318c>] (get_vm_area_caller+0x4c/0x54) from [<c001aed8>] (__iommu_alloc_remap.isra.14+0x2c/0xfc) [ 9.214276] [<c001aed8>] (__iommu_alloc_remap.isra.14+0x2c/0xfc) from [<c001b06c>] (arm_iommu_alloc_attrs+0xc4/0xf8) [ 9.224795] [<c001b06c>] (arm_iommu_alloc_attrs+0xc4/0xf8) from [<c011aa54>] (pool_alloc_page.constprop.5+0x6c/0xf8) [ 9.235309] [<c011aa54>] (pool_alloc_page.constprop.5+0x6c/0xf8) from [<c011ab60>] (dma_pool_alloc+0x80/0x170) [ 9.245304] [<c011ab60>] (dma_pool_alloc+0x80/0x170) from [<c03cbbcc>] (tegra_build_dtd+0x48/0x14c) [ 9.254344] [<c03cbbcc>] (tegra_build_dtd+0x48/0x14c) from [<c03cbd4c>] (tegra_req_to_dtd+0x7c/0xa8) [ 9.263467] [<c03cbd4c>] (tegra_req_to_dtd+0x7c/0xa8) from [<c03cc140>] (tegra_ep_queue+0x154/0x33c) [ 9.272592] [<c03cc140>] (tegra_ep_queue+0x154/0x33c) from [<c03dd5b4>] (composite_setup+0x364/0x6d4) [ 9.281804] [<c03dd5b4>] (composite_setup+0x364/0x6d4) from [<c03dd9dc>] (android_setup+0xb8/0x14c) [ 9.290843] [<c03dd9dc>] (android_setup+0xb8/0x14c) from [<c03cd144>] (setup_received_irq+0xbc/0x270) [ 9.300053] [<c03cd144>] (setup_received_irq+0xbc/0x270) from [<c03cda64>] (tegra_udc_irq+0x2ac/0x2c4) [ 9.309353] [<c03cda64>] (tegra_udc_irq+0x2ac/0x2c4) from [<c00b5708>] (handle_irq_event_percpu+0x78/0x2e0) [ 9.319087] [<c00b5708>] (handle_irq_event_percpu+0x78/0x2e0) from [<c00b59b4>] (handle_irq_event+0x44/0x64) [ 9.328907] [<c00b59b4>] (handle_irq_event+0x44/0x64) from [<c00b8688>] (handle_fasteoi_irq+0xc4/0x16c) [ 9.338294] [<c00b8688>] (handle_fasteoi_irq+0xc4/0x16c) from [<c00b4f14>] (generic_handle_irq+0x34/0x48) [ 9.347858] 
[<c00b4f14>] (generic_handle_irq+0x34/0x48) from [<c000f6f4>] (handle_IRQ+0x54/0xb4) [ 9.356637] [<c000f6f4>] (handle_IRQ+0x54/0xb4) from [<c00084b0>] (gic_handle_irq+0x2c/0x60) [ 9.365068] [<c00084b0>] (gic_handle_irq+0x2c/0x60) from [<c000e900>] (__irq_svc+0x40/0x70) [ 9.373405] Exception stack(0xc09edf10 to 0xc09edf58) [ 9.378447] df00: 00000000 000f4240 00000003 00000000 [ 9.386615] df20: 00000000 e55bbc00 ef66f3ca 00000001 00000000 412fc099 c0abb9c8 00000000 [ 9.394781] df40: 3b9ac9ff c09edf58 c027a9bc c0042880 20000113 ffffffff [ 9.401396] [<c000e900>] (__irq_svc+0x40/0x70) from [<c0042880>] (tegra_idle_enter_lp3+0x68/0x78) [ 9.410272] [<c0042880>] (tegra_idle_enter_lp3+0x68/0x78) from [<c04701d4>] (cpuidle_idle_call+0xdc/0x3a4) [ 9.419922] [<c04701d4>] (cpuidle_idle_call+0xdc/0x3a4) from [<c000fdd8>] (cpu_idle+0xd8/0x134) [ 9.428612] [<c000fdd8>] (cpu_idle+0xd8/0x134) from [<c0986818>] (start_kernel+0x27c/0x2cc) [ 9.436952] Code: e1a00004 e3a04000 eb002265 eaffffe0 (e7f001f2) [ 9.443038] ---[ end trace 1b75b31a2719ed24 ]--- [ 9.447645] Kernel panic - not syncing: Fatal exception in interrupt Hiroshi Doyu (4): ARM: dma-mapping: atomic_pool with struct page **pages ARM: dma-mapping: Refactor out to introduce __in_atomic_pool ARM: dma-mapping: Introduce __atomic_get_pages() for __iommu_get_pages() ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC arch/arm/mm/dma-mapping.c | 91 ++++++++++++++++++++++++++++++++++++++++---- 1 files changed, 82 insertions(+), 9 deletions(-) -- 1.7.5.4
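The changelog mentions a range check named __in_atomic_pool() with a nicer error message. A user-space sketch of what such a check has to distinguish is given below; the three-way result, the structure and the function name are assumptions made for illustration, not the code from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum pool_hit { POOL_OUTSIDE, POOL_INSIDE, POOL_STRADDLES };

/* Hypothetical description of the pre-allocated atomic pool. */
struct atomic_pool_model {
    uintptr_t base;  /* first address of the pool */
    size_t    size;  /* pool length in bytes */
};

/* Classify the buffer [start, start + size) against the pool. */
enum pool_hit in_atomic_pool(const struct atomic_pool_model *pool,
                             uintptr_t start, size_t size)
{
    uintptr_t end = start + size;
    uintptr_t pool_end = pool->base + pool->size;

    if (start < pool->base || start >= pool_end)
        return POOL_OUTSIDE;    /* not an atomic-pool buffer at all */
    if (end <= pool_end)
        return POOL_INSIDE;     /* safe to free back into the pool */
    return POOL_STRADDLES;      /* starts inside, runs past the end:
                                   the case that deserves a warning */
}
```

The free path needs this distinction because a buffer handed out from the pool in interrupt context must go back to the pool rather than through the vmalloc-based path that triggered the BUG above.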

13 years, 5 months


Question about ION carveout heap support partial cache flush

by zhangfei gao

Hi, All

We have a question about calling dmac_map_area & dmac_flush_range with a user-space address: the mcr instruction does not return on an ARMv7 processor.

The existing ION carveout heap does not support partial cache flush; the whole buffer is always flushed. There is only one dirty bit for a carveout heap buffer, because sg_table->nents is 1:

drivers/gpu/ion/ion_carveout_heap.c
  ion_carveout_heap_map_dma -> sg_alloc_table(table, 1, GFP_KERNEL);
  ion_buffer_alloc_dirty -> pages = buffer->sg_table->nents;

We want to support partial cache flush, aligned to the cache line instead of PAGE_SIZE, for efficiency. We have considered extending the dirty bits, but that looks like it can only align to PAGE_SIZE.

As an experiment we modified the ION_IOC_SYNC ioctl on ARMv7 to call dmac_map_area & dmac_flush_range directly with the address from user space. However, we find that dmac_map_area does not work with this user-space address. In fact, it is the mcr instruction that does not work with a user-space address: it hangs.

Also, ion_vm_fault happens twice. The first time is from __dabt_usr, when we access the mmapped buffer, which is fine. The second is from __dabt_svc and is caused by the mcr, which is strange.

The sequence is:
  ION allocates from the carveout heap
  addr = user mmap
  user accesses addr -> ion_vm_fault (__dabt_usr), builds the page table, vm_insert_page
  dmac_map_area & dmac_flush_range with addr -> ion_vm_fault (__dabt_svc), mcr hangs

We do not understand why ion_vm_fault happens twice when the page table has already been built, or why mcr hangs with an address from user space. Besides, there is no problem with ION on 3.0, which does not use ion_vm_fault.

Any suggestion?

Thanks
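The efficiency argument for cache-line rather than PAGE_SIZE granularity comes down to how far a small dirty range gets widened. The sketch below computes the widened range for a given granule; the structure and function names are invented here, and the 64-byte line size used in the usage note is an assumption (the real value depends on the core).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct flush_range {
    uintptr_t start;  /* inclusive, aligned down to the granule */
    uintptr_t end;    /* exclusive, aligned up to the granule */
};

/* Widen [addr, addr + len) to granule boundaries.
 * `granule` must be a power of two (cache line size or page size). */
struct flush_range align_flush_range(uintptr_t addr, size_t len, size_t granule)
{
    struct flush_range r;
    uintptr_t mask = (uintptr_t)granule - 1;

    r.start = addr & ~mask;
    r.end = (addr + len + mask) & ~mask;
    return r;
}
```

For a 32-byte dirty region at 0x1010, a 64-byte granule yields a 64-byte flush (0x1000 to 0x1040), whereas a 4096-byte granule flushes the whole page (0x1000 to 0x2000).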

13 years, 5 months


Patch to show debugfs output for every heap->id instead of heap->type

by Nishanth Peethambaran

Hi,

ION debugfs currently shows/groups output based on type. But it is
possible to have multiple heaps of the same type, for CMA and carveout
types. It is more useful to get usage information for individual heaps.

- Nishanth Peethambaran

From fa819b42fb69321a8e5db260ba9fd8ce7a2f16d2 Mon Sep 17 00:00:00 2001
From: Nishanth Peethambaran <nishanth(a)broadcom.com>
Date: Tue, 28 Aug 2012 07:57:37 +0530
Subject: [PATCH] gpu: ion: Update debugfs to show for each id

Update the debugfs read of client and heap to show based on 'id' instead
of 'type'. Multiple heaps of the same type can be present, but id is
unique.
---
 drivers/gpu/ion/ion.c |   14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/ion/ion.c b/drivers/gpu/ion/ion.c
index 34c12df..65cedee 100644
--- a/drivers/gpu/ion/ion.c
+++ b/drivers/gpu/ion/ion.c
@@ -547,11 +547,11 @@ static int ion_debug_client_show(struct seq_file *s, void *unused)
 	for (n = rb_first(&client->handles); n; n = rb_next(n)) {
 		struct ion_handle *handle = rb_entry(n, struct ion_handle,
 						     node);
-		enum ion_heap_type type = handle->buffer->heap->type;
+		int id = handle->buffer->heap->id;
 
-		if (!names[type])
-			names[type] = handle->buffer->heap->name;
-		sizes[type] += handle->buffer->size;
+		if (!names[id])
+			names[id] = handle->buffer->heap->name;
+		sizes[id] += handle->buffer->size;
 	}
 	mutex_unlock(&client->lock);
 
@@ -1121,7 +1121,7 @@ static const struct file_operations ion_fops = {
 };
 
 static size_t ion_debug_heap_total(struct ion_client *client,
-				   enum ion_heap_type type)
+				   int id)
 {
 	size_t size = 0;
 	struct rb_node *n;
@@ -1131,7 +1131,7 @@ static size_t ion_debug_heap_total(struct ion_client *client,
 		struct ion_handle *handle = rb_entry(n,
 						     struct ion_handle,
 						     node);
-		if (handle->buffer->heap->type == type)
+		if (handle->buffer->heap->id == id)
 			size += handle->buffer->size;
 	}
 	mutex_unlock(&client->lock);
@@ -1149,7 +1149,7 @@ static int ion_debug_heap_show(struct seq_file *s, void *unused)
 	for (n = rb_first(&dev->clients); n; n = rb_next(n)) {
 		struct ion_client *client = rb_entry(n, struct ion_client,
 						     node);
-		size_t size = ion_debug_heap_total(client, heap->type);
+		size_t size = ion_debug_heap_total(client, heap->id);
 		if (!size)
 			continue;
 		if (client->task) {
-- 
1.7.0.4
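The effect of keying on id rather than type can be shown with a toy accumulation over a flat array of buffers. All types and names below are simplified stand-ins for the ION structures, and `NUM_HEAP_IDS` is an assumed bound added for the sketch; the patch itself indexes the existing `names[]`/`sizes[]` arrays directly, so whether those arrays are large enough for every id is worth checking.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_HEAP_IDS 8  /* assumed upper bound on heap ids, for illustration */

enum heap_type { HEAP_SYSTEM, HEAP_CARVEOUT };

struct heap   { enum heap_type type; int id; const char *name; };
struct buffer { const struct heap *heap; size_t size; };

/* Add each buffer's size to the slot of its heap id.
 * Ids outside the table are skipped instead of written out of bounds. */
void total_by_id(const struct buffer *bufs, size_t n,
                 size_t sizes[NUM_HEAP_IDS])
{
    for (size_t i = 0; i < n; i++) {
        int id = bufs[i].heap->id;

        if (id >= 0 && id < NUM_HEAP_IDS)
            sizes[id] += bufs[i].size;
    }
}
```

With two carveout heaps (ids 2 and 3), a per-type total would merge their usage into one line, while the per-id totals keep them apart.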

13 years, 5 months


sharing across process of ion/dma_buf created after fork

by Nishanth Peethambaran

How do we share ion buffers from user-space with other processes if they are exported/shared after fork?

The ION_IOC_SHARE ioctl creates an fd for process-1. In the 3.0 kernel, the ION_IOC_IMPORT ioctl from process-2 calls ion_import_fd, which calls fget(fd), which fails to find the file for the fd shared by process-1. In the 3.4 kernel, dma_buf_get does the fget(fd) to get the struct file, which also fails for the same reason: fget searches in current->files.

- Nishanth Peethambaran
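Since a file descriptor number is only meaningful inside the process that owns it, the usual way to hand one to an unrelated process on Linux is to pass it over a Unix domain socket with SCM_RIGHTS (on Android, Binder serves the same purpose). The sketch below shows that generic mechanism; it is not ION-specific, and the helper names are made up for this example.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send `fd` over the connected AF_UNIX socket `sock`. Returns 0 on success. */
int send_fd(int sock, int fd)
{
    char byte = 0;  /* at least one byte of real data must accompany the fd */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor sent by send_fd(). Returns the new fd, or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;  /* a new descriptor in the receiver's own fd table */
}
```

The receiver gets a descriptor number valid in its own `files` table that refers to the same underlying struct file, which is what an fget()-based import in process-2 needs.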

13 years, 5 months


ioremap issue in ION carveout heap

by Haojian Zhuang

Hi all,

I think that we have a memory mapping issue in the ION carveout heap for the v3.4+ kernel from Android.

The scenario is user app + kernel driver (CPU) + kernel driver (DMA), where all three clients access the memory, and the memory is cacheable.

The .map_kernel() of the carveout heap remaps the allocated memory buffer with ioremap(). arm_ioremap() does not allow system memory to be mapped. So, to make .map_kernel() work, we need to use memblock_alloc() & memblock_remove() to move the heap memory from system memory to a reserved area. The linear address of the memory buffer is then removed from the page table, and the kernel driver gets a new virtual address from .map_kernel() when it wants to access the buffer.

But ION uses dma_sync_sg_for_device() to flush the cache, which means it uses the linear address derived from the page. So it is using a virtual address that no longer exists, because it was removed by memblock_remove().

Solution #1: .map_kernel() only returns the linear address. The limitation of this solution is that the heap must always lie in low memory. We then no longer need ioremap() or memblock_remove().

Solution #2: Use vmap() in .map_kernel().

What do you think about these two solutions?

Regards
Haojian
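The low-memory limitation of Solution #1 follows from how the linear mapping works: only physical addresses inside the lowmem window have a fixed kernel virtual address. The sketch below models that with an illustrative 32-bit ARM layout; the three constants and both function names are assumptions chosen for the example, not values from any particular board.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative 32-bit layout; all three values are assumptions. */
#define PAGE_OFFSET   0xC0000000u  /* start of the kernel linear mapping */
#define PHYS_OFFSET   0x80000000u  /* physical address of the first RAM byte */
#define LOWMEM_BYTES  0x20000000u  /* 512 MiB of RAM covered by the linear map */

/* True if `phys` lies in the part of RAM that is linearly mapped. */
bool phys_in_lowmem(uint32_t phys)
{
    return phys >= PHYS_OFFSET && phys - PHYS_OFFSET < LOWMEM_BYTES;
}

/* Linear-map address for `phys`, or 0 if it has none (highmem). */
uint32_t linear_virt(uint32_t phys)
{
    if (!phys_in_lowmem(phys))
        return 0;
    return phys - PHYS_OFFSET + PAGE_OFFSET;
}
```

A carveout placed above the lowmem window has no linear address to return, which is the case where something like Solution #2 would be needed.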

13 years, 5 months


[PATCH] mm: cma: fix alignment requirements for contiguous regions

by Marek Szyprowski

Contiguous Memory Allocator requires each of its regions to be aligned
in such a way that it is possible to change migration type for all
pageblocks holding it and then isolate page of largest possible order from
the buddy allocator (which is MAX_ORDER-1). This patch relaxes alignment
requirements by one order, because MAX_ORDER alignment is not really
needed.

Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
CC: Michal Nazarewicz <mina86(a)mina86.com>
---
 drivers/base/dma-contiguous.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/base/dma-contiguous.c b/drivers/base/dma-contiguous.c
index 78efb03..34d94c7 100644
--- a/drivers/base/dma-contiguous.c
+++ b/drivers/base/dma-contiguous.c
@@ -250,7 +250,7 @@ int __init dma_declare_contiguous(struct device *dev, unsigned long size,
 		return -EINVAL;
 
 	/* Sanitise input arguments */
-	alignment = PAGE_SIZE << max(MAX_ORDER, pageblock_order);
+	alignment = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
 	base = ALIGN(base, alignment);
 	size = ALIGN(size, alignment);
 	limit &= ~(alignment - 1);
-- 
1.7.1.569.g6f426
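To put numbers on "relaxes alignment requirements by one order", the sketch below evaluates the alignment expression before and after the patch. The page size, MAX_ORDER and pageblock_order values are assumed typical ARM defaults, and the macro and function names are local to this example.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES  4096ul  /* assumed 4 KiB pages */
#define MAX_ORDER        11      /* common default; configurable */
#define PAGEBLOCK_ORDER  10      /* assumed: MAX_ORDER - 1 */

/* Round x up to a multiple of a (a must be a power of two). */
#define ALIGN_UP(x, a)  (((x) + (a) - 1) & ~((a) - 1))

/* PAGE_SIZE << max(top_order, pageblock_order), as in the patched line. */
unsigned long cma_alignment(int top_order, int pageblock_order)
{
    int order = top_order > pageblock_order ? top_order : pageblock_order;

    return PAGE_SIZE_BYTES << order;
}
```

Under these assumptions the old expression gives 8 MiB and the new one 4 MiB, so a 12 MiB region that was previously rounded up to 16 MiB is now reserved as exactly 12 MiB.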

13 years, 5 months


[v3 0/4] ARM: dma-mapping: IOMMU atomic allocation

by Hiroshi Doyu

Hi,

The commit e9da6e9 "ARM: dma-mapping: remove custom consistent dma
region" breaks compatibility with existing drivers. This causes the
following kernel oops(*1). That driver has called dma_pool_alloc() to
allocate memory from interrupt context, and it hits
BUG_ON(in_interrupt()) in "get_vm_area_caller()". This patch series
fixes the problem by making use of the pre-allocated atomic memory pool
which DMA is using, in the same way as DMA does now.

Any comment would be really appreciated.

v3:
  Provide a different path for IOMMU for more clean code. (Marek)
  atomic_pool is backed with struct page *pages[].
v2:
  Don't modify attrs(DMA_ATTR_NO_KERNEL_MAPPING) for atomic allocation. (Marek)
  Modify vzalloc. (KyongHo, Minchan)
v1:
  http://lists.linaro.org/pipermail/linaro-mm-sig/2012-August/002398.html

*1:
[ 8.321343] ------------[ cut here ]------------
[ 8.325971] kernel BUG at kernel/mm/vmalloc.c:1322!
[ 8.333615] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[ 8.339436] Modules linked in:
[ 8.342496] CPU: 0 Tainted: G W (3.4.6-00067-g5d485f7 #67)
[ 8.349192] PC is at __get_vm_area_node.isra.29+0x164/0x16c
[ 8.354758] LR is at get_vm_area_caller+0x4c/0x54
[ 8.359454] pc : [<c011297c>] lr : [<c011318c>] psr: 20000193
[ 8.359458] sp : c09edca0 ip : c09ec000 fp : ae278000
[ 8.370922] r10: f0000000 r9 : c011aa54 r8 : c0a26cb8
[ 8.376136] r7 : 00000001 r6 : 000000d0 r5 : 20000008 r4 : c09edca0
[ 8.382651] r3 : 00010000 r2 : 20000008 r1 : 00000001 r0 : 00001000
[ 8.389166] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
[ 8.396549] Control: 10c5387d Table: ad98c04a DAC: 00000015
....
[ 9.169162] dfa0: 412fc099 c09ec000 00000000 c000fdd8 c06df1e4 c0a1b080 00000000 00000000 [ 9.177329] dfc0: c0a235cc 8000406a 00000000 c0986818 ffffffff ffffffff c0986404 00000000 [ 9.185497] dfe0: 00000000 c09bb070 10c5387d c0a19c58 c09bb064 80008044 00000000 00000000 [ 9.193673] [<c011297c>] (__get_vm_area_node.isra.29+0x164/0x16c) from [<c011318c>] (get_vm_area_caller+0x4c/0x54) [ 9.204022] [<c011318c>] (get_vm_area_caller+0x4c/0x54) from [<c001aed8>] (__iommu_alloc_remap.isra.14+0x2c/0xfc) [ 9.214276] [<c001aed8>] (__iommu_alloc_remap.isra.14+0x2c/0xfc) from [<c001b06c>] (arm_iommu_alloc_attrs+0xc4/0xf8) [ 9.224795] [<c001b06c>] (arm_iommu_alloc_attrs+0xc4/0xf8) from [<c011aa54>] (pool_alloc_page.constprop.5+0x6c/0xf8) [ 9.235309] [<c011aa54>] (pool_alloc_page.constprop.5+0x6c/0xf8) from [<c011ab60>] (dma_pool_alloc+0x80/0x170) [ 9.245304] [<c011ab60>] (dma_pool_alloc+0x80/0x170) from [<c03cbbcc>] (tegra_build_dtd+0x48/0x14c) [ 9.254344] [<c03cbbcc>] (tegra_build_dtd+0x48/0x14c) from [<c03cbd4c>] (tegra_req_to_dtd+0x7c/0xa8) [ 9.263467] [<c03cbd4c>] (tegra_req_to_dtd+0x7c/0xa8) from [<c03cc140>] (tegra_ep_queue+0x154/0x33c) [ 9.272592] [<c03cc140>] (tegra_ep_queue+0x154/0x33c) from [<c03dd5b4>] (composite_setup+0x364/0x6d4) [ 9.281804] [<c03dd5b4>] (composite_setup+0x364/0x6d4) from [<c03dd9dc>] (android_setup+0xb8/0x14c) [ 9.290843] [<c03dd9dc>] (android_setup+0xb8/0x14c) from [<c03cd144>] (setup_received_irq+0xbc/0x270) [ 9.300053] [<c03cd144>] (setup_received_irq+0xbc/0x270) from [<c03cda64>] (tegra_udc_irq+0x2ac/0x2c4) [ 9.309353] [<c03cda64>] (tegra_udc_irq+0x2ac/0x2c4) from [<c00b5708>] (handle_irq_event_percpu+0x78/0x2e0) [ 9.319087] [<c00b5708>] (handle_irq_event_percpu+0x78/0x2e0) from [<c00b59b4>] (handle_irq_event+0x44/0x64) [ 9.328907] [<c00b59b4>] (handle_irq_event+0x44/0x64) from [<c00b8688>] (handle_fasteoi_irq+0xc4/0x16c) [ 9.338294] [<c00b8688>] (handle_fasteoi_irq+0xc4/0x16c) from [<c00b4f14>] (generic_handle_irq+0x34/0x48) [ 9.347858] 
[<c00b4f14>] (generic_handle_irq+0x34/0x48) from [<c000f6f4>] (handle_IRQ+0x54/0xb4) [ 9.356637] [<c000f6f4>] (handle_IRQ+0x54/0xb4) from [<c00084b0>] (gic_handle_irq+0x2c/0x60) [ 9.365068] [<c00084b0>] (gic_handle_irq+0x2c/0x60) from [<c000e900>] (__irq_svc+0x40/0x70) [ 9.373405] Exception stack(0xc09edf10 to 0xc09edf58) [ 9.378447] df00: 00000000 000f4240 00000003 00000000 [ 9.386615] df20: 00000000 e55bbc00 ef66f3ca 00000001 00000000 412fc099 c0abb9c8 00000000 [ 9.394781] df40: 3b9ac9ff c09edf58 c027a9bc c0042880 20000113 ffffffff [ 9.401396] [<c000e900>] (__irq_svc+0x40/0x70) from [<c0042880>] (tegra_idle_enter_lp3+0x68/0x78) [ 9.410272] [<c0042880>] (tegra_idle_enter_lp3+0x68/0x78) from [<c04701d4>] (cpuidle_idle_call+0xdc/0x3a4) [ 9.419922] [<c04701d4>] (cpuidle_idle_call+0xdc/0x3a4) from [<c000fdd8>] (cpu_idle+0xd8/0x134) [ 9.428612] [<c000fdd8>] (cpu_idle+0xd8/0x134) from [<c0986818>] (start_kernel+0x27c/0x2cc) [ 9.436952] Code: e1a00004 e3a04000 eb002265 eaffffe0 (e7f001f2) [ 9.443038] ---[ end trace 1b75b31a2719ed24 ]--- [ 9.447645] Kernel panic - not syncing: Fatal exception in interrupt Hiroshi Doyu (4): ARM: dma-mapping: atomic_pool with struct page **pages ARM: dma-mapping: Refactor out to introduce __in_atomic_pool ARM: dma-mapping: Introduce __atomic_get_pages() for __iommu_get_pages() ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC arch/arm/mm/dma-mapping.c | 90 ++++++++++++++++++++++++++++++++++++++++----- 1 files changed, 80 insertions(+), 10 deletions(-) -- 1.7.5.4
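
The idea behind the series, serving atomic requests from memory that was reserved ahead of time instead of creating a new mapping in interrupt context, can be sketched in userspace as follows. This is only a minimal model under stated assumptions and not the code from the patches: the pool size, the 64-bit bitmap and the names struct atomic_pool, pool_init(), pool_alloc() and pool_free() are all invented for the example. The point it shows is that the allocation path only flips bits in a bitmap, so it never has to sleep or set up new mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_PAGE_SIZE 4096u
#define POOL_NR_PAGES  64u   /* one bit per page in a 64-bit bitmap */

/* Hypothetical pool: all backing memory is reserved up front. */
struct atomic_pool {
	unsigned char mem[POOL_NR_PAGES][POOL_PAGE_SIZE];
	uint64_t used;           /* bit i set => page i is allocated */
};

static void pool_init(struct atomic_pool *p)
{
	p->used = 0;
}

static uint64_t pool_mask(unsigned int count)
{
	return count >= 64 ? ~0ULL : ((1ULL << count) - 1);
}

/* First-fit search for `count` contiguous free pages. Only the bitmap is
 * touched, so nothing here can block. Returns NULL when the pool is full. */
static void *pool_alloc(struct atomic_pool *p, unsigned int count)
{
	unsigned int start;

	if (count == 0 || count > POOL_NR_PAGES)
		return NULL;

	for (start = 0; start + count <= POOL_NR_PAGES; start++) {
		uint64_t mask = pool_mask(count) << start;

		if ((p->used & mask) == 0) {
			p->used |= mask;
			return p->mem[start];
		}
	}
	return NULL;
}

/* Returns 1 if addr belonged to the pool and was released, 0 if the address
 * is not a pool allocation (the caller would then use its normal free path). */
static int pool_free(struct atomic_pool *p, void *addr, unsigned int count)
{
	uintptr_t base = (uintptr_t)&p->mem[0][0];
	uintptr_t a = (uintptr_t)addr;
	size_t start;
	uint64_t mask;

	if (a < base || a >= base + sizeof(p->mem))
		return 0;
	if ((a - base) % POOL_PAGE_SIZE != 0)
		return 0;

	start = (a - base) / POOL_PAGE_SIZE;
	if (count == 0 || start + count > POOL_NR_PAGES)
		return 0;

	mask = pool_mask(count) << start;
	assert((p->used & mask) == mask);   /* catch double frees */
	p->used &= ~mask;
	return 1;
}
```

A real implementation would additionally need locking around the bitmap and a policy for what to do when the pool is exhausted; both are left out here to keep the sketch short.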

13 years, 5 months


[RFC 0/4] ARM: dma-mapping: IOMMU atomic allocation

by Hiroshi Doyu

Hi,

The commit e9da6e9 "ARM: dma-mapping: remove custom consistent dma region" breaks compatibility with existing drivers. This causes the following kernel oops (*1). The driver calls dma_pool_alloc() to allocate memory from interrupt context, and it hits BUG_ON(in_interrupt()) in "get_vm_area_caller()". This patch series fixes the problem by making use of the pre-allocated atomic memory pool which DMA is already using, in the same way as DMA does now.

Any comment would be really appreciated.

*1: [ 8.321343] ------------[ cut here ]------------ [ 8.325971] kernel BUG at /home/hdoyu/mydroid-k340-cardhu/kernel/mm/vmalloc.c:1322! [ 8.333615] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM [ 8.339436] Modules linked in: [ 8.342496] CPU: 0 Tainted: G W (3.4.6-00067-g5d485f7 #67) [ 8.349192] PC is at __get_vm_area_node.isra.29+0x164/0x16c [ 8.354758] LR is at get_vm_area_caller+0x4c/0x54 [ 8.359454] pc : [<c011297c>] lr : [<c011318c>] psr: 20000193 [ 8.359458] sp : c09edca0 ip : c09ec000 fp : ae278000 [ 8.370922] r10: f0000000 r9 : c011aa54 r8 : c0a26cb8 [ 8.376136] r7 : 00000001 r6 : 000000d0 r5 : 20000008 r4 : c09edca0 [ 8.382651] r3 : 00010000 r2 : 20000008 r1 : 00000001 r0 : 00001000 [ 8.389166] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel [ 8.396549] Control: 10c5387d Table: ad98c04a DAC: 00000015 ....
[ 9.169162] dfa0: 412fc099 c09ec000 00000000 c000fdd8 c06df1e4 c0a1b080 00000000 00000000 [ 9.177329] dfc0: c0a235cc 8000406a 00000000 c0986818 ffffffff ffffffff c0986404 00000000 [ 9.185497] dfe0: 00000000 c09bb070 10c5387d c0a19c58 c09bb064 80008044 00000000 00000000 [ 9.193673] [<c011297c>] (__get_vm_area_node.isra.29+0x164/0x16c) from [<c011318c>] (get_vm_area_caller+0x4c/0x54) [ 9.204022] [<c011318c>] (get_vm_area_caller+0x4c/0x54) from [<c001aed8>] (__iommu_alloc_remap.isra.14+0x2c/0xfc) [ 9.214276] [<c001aed8>] (__iommu_alloc_remap.isra.14+0x2c/0xfc) from [<c001b06c>] (arm_iommu_alloc_attrs+0xc4/0xf8) [ 9.224795] [<c001b06c>] (arm_iommu_alloc_attrs+0xc4/0xf8) from [<c011aa54>] (pool_alloc_page.constprop.5+0x6c/0xf8) [ 9.235309] [<c011aa54>] (pool_alloc_page.constprop.5+0x6c/0xf8) from [<c011ab60>] (dma_pool_alloc+0x80/0x170) [ 9.245304] [<c011ab60>] (dma_pool_alloc+0x80/0x170) from [<c03cbbcc>] (tegra_build_dtd+0x48/0x14c) [ 9.254344] [<c03cbbcc>] (tegra_build_dtd+0x48/0x14c) from [<c03cbd4c>] (tegra_req_to_dtd+0x7c/0xa8) [ 9.263467] [<c03cbd4c>] (tegra_req_to_dtd+0x7c/0xa8) from [<c03cc140>] (tegra_ep_queue+0x154/0x33c) [ 9.272592] [<c03cc140>] (tegra_ep_queue+0x154/0x33c) from [<c03dd5b4>] (composite_setup+0x364/0x6d4) [ 9.281804] [<c03dd5b4>] (composite_setup+0x364/0x6d4) from [<c03dd9dc>] (android_setup+0xb8/0x14c) [ 9.290843] [<c03dd9dc>] (android_setup+0xb8/0x14c) from [<c03cd144>] (setup_received_irq+0xbc/0x270) [ 9.300053] [<c03cd144>] (setup_received_irq+0xbc/0x270) from [<c03cda64>] (tegra_udc_irq+0x2ac/0x2c4) [ 9.309353] [<c03cda64>] (tegra_udc_irq+0x2ac/0x2c4) from [<c00b5708>] (handle_irq_event_percpu+0x78/0x2e0) [ 9.319087] [<c00b5708>] (handle_irq_event_percpu+0x78/0x2e0) from [<c00b59b4>] (handle_irq_event+0x44/0x64) [ 9.328907] [<c00b59b4>] (handle_irq_event+0x44/0x64) from [<c00b8688>] (handle_fasteoi_irq+0xc4/0x16c) [ 9.338294] [<c00b8688>] (handle_fasteoi_irq+0xc4/0x16c) from [<c00b4f14>] (generic_handle_irq+0x34/0x48) [ 9.347858] 
[<c00b4f14>] (generic_handle_irq+0x34/0x48) from [<c000f6f4>] (handle_IRQ+0x54/0xb4) [ 9.356637] [<c000f6f4>] (handle_IRQ+0x54/0xb4) from [<c00084b0>] (gic_handle_irq+0x2c/0x60) [ 9.365068] [<c00084b0>] (gic_handle_irq+0x2c/0x60) from [<c000e900>] (__irq_svc+0x40/0x70) [ 9.373405] Exception stack(0xc09edf10 to 0xc09edf58) [ 9.378447] df00: 00000000 000f4240 00000003 00000000 [ 9.386615] df20: 00000000 e55bbc00 ef66f3ca 00000001 00000000 412fc099 c0abb9c8 00000000 [ 9.394781] df40: 3b9ac9ff c09edf58 c027a9bc c0042880 20000113 ffffffff [ 9.401396] [<c000e900>] (__irq_svc+0x40/0x70) from [<c0042880>] (tegra_idle_enter_lp3+0x68/0x78) [ 9.410272] [<c0042880>] (tegra_idle_enter_lp3+0x68/0x78) from [<c04701d4>] (cpuidle_idle_call+0xdc/0x3a4) [ 9.419922] [<c04701d4>] (cpuidle_idle_call+0xdc/0x3a4) from [<c000fdd8>] (cpu_idle+0xd8/0x134) [ 9.428612] [<c000fdd8>] (cpu_idle+0xd8/0x134) from [<c0986818>] (start_kernel+0x27c/0x2cc) [ 9.436952] Code: e1a00004 e3a04000 eb002265 eaffffe0 (e7f001f2) [ 9.443038] ---[ end trace 1b75b31a2719ed24 ]--- [ 9.447645] Kernel panic - not syncing: Fatal exception in interrupt Hiroshi Doyu (4): ARM: dma-mapping: Refactor out to introduce __alloc_fill_pages ARM: dma-mapping: IOMMU allocates pages from pool with GFP_ATOMIC ARM: dma-mapping: Return cpu addr when dma_alloc(GFP_ATOMIC) ARM: dma-mapping: dma_{alloc,free}_coherent with empty attrs arch/arm/include/asm/dma-mapping.h | 21 ++++++++++-- arch/arm/mm/dma-mapping.c | 59 ++++++++++++++++++++++++++++-------- 2 files changed, 63 insertions(+), 17 deletions(-) -- 1.7.5.4

13 years, 5 months


[v2 0/4] ARM: dma-mapping: IOMMU atomic allocation

by Hiroshi Doyu

Hi,

The commit e9da6e9 "ARM: dma-mapping: remove custom consistent dma region" breaks compatibility with existing drivers. This causes the following kernel oops (*1). The driver calls dma_pool_alloc() to allocate memory from interrupt context, and it hits BUG_ON(in_interrupt()) in "get_vm_area_caller()". This patch series fixes the problem by making use of the pre-allocated atomic memory pool which DMA is already using, in the same way as DMA does now.

Any comment would be really appreciated.

v2: Don't modify attrs(DMA_ATTR_NO_KERNEL_MAPPING) for atomic allocation. (Marek) Skip vzalloc (KyongHo, Minchan)
v1: http://lists.linaro.org/pipermail/linaro-mm-sig/2012-August/002398.html

*1: [ 8.321343] ------------[ cut here ]------------ [ 8.325971] kernel BUG at kernel/mm/vmalloc.c:1322! [ 8.333615] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM [ 8.339436] Modules linked in: [ 8.342496] CPU: 0 Tainted: G W (3.4.6-00067-g5d485f7 #67) [ 8.349192] PC is at __get_vm_area_node.isra.29+0x164/0x16c [ 8.354758] LR is at get_vm_area_caller+0x4c/0x54 [ 8.359454] pc : [<c011297c>] lr : [<c011318c>] psr: 20000193 [ 8.359458] sp : c09edca0 ip : c09ec000 fp : ae278000 [ 8.370922] r10: f0000000 r9 : c011aa54 r8 : c0a26cb8 [ 8.376136] r7 : 00000001 r6 : 000000d0 r5 : 20000008 r4 : c09edca0 [ 8.382651] r3 : 00010000 r2 : 20000008 r1 : 00000001 r0 : 00001000 [ 8.389166] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel [ 8.396549] Control: 10c5387d Table: ad98c04a DAC: 00000015 ....
[ 9.169162] dfa0: 412fc099 c09ec000 00000000 c000fdd8 c06df1e4 c0a1b080 00000000 00000000 [ 9.177329] dfc0: c0a235cc 8000406a 00000000 c0986818 ffffffff ffffffff c0986404 00000000 [ 9.185497] dfe0: 00000000 c09bb070 10c5387d c0a19c58 c09bb064 80008044 00000000 00000000 [ 9.193673] [<c011297c>] (__get_vm_area_node.isra.29+0x164/0x16c) from [<c011318c>] (get_vm_area_caller+0x4c/0x54) [ 9.204022] [<c011318c>] (get_vm_area_caller+0x4c/0x54) from [<c001aed8>] (__iommu_alloc_remap.isra.14+0x2c/0xfc) [ 9.214276] [<c001aed8>] (__iommu_alloc_remap.isra.14+0x2c/0xfc) from [<c001b06c>] (arm_iommu_alloc_attrs+0xc4/0xf8) [ 9.224795] [<c001b06c>] (arm_iommu_alloc_attrs+0xc4/0xf8) from [<c011aa54>] (pool_alloc_page.constprop.5+0x6c/0xf8) [ 9.235309] [<c011aa54>] (pool_alloc_page.constprop.5+0x6c/0xf8) from [<c011ab60>] (dma_pool_alloc+0x80/0x170) [ 9.245304] [<c011ab60>] (dma_pool_alloc+0x80/0x170) from [<c03cbbcc>] (tegra_build_dtd+0x48/0x14c) [ 9.254344] [<c03cbbcc>] (tegra_build_dtd+0x48/0x14c) from [<c03cbd4c>] (tegra_req_to_dtd+0x7c/0xa8) [ 9.263467] [<c03cbd4c>] (tegra_req_to_dtd+0x7c/0xa8) from [<c03cc140>] (tegra_ep_queue+0x154/0x33c) [ 9.272592] [<c03cc140>] (tegra_ep_queue+0x154/0x33c) from [<c03dd5b4>] (composite_setup+0x364/0x6d4) [ 9.281804] [<c03dd5b4>] (composite_setup+0x364/0x6d4) from [<c03dd9dc>] (android_setup+0xb8/0x14c) [ 9.290843] [<c03dd9dc>] (android_setup+0xb8/0x14c) from [<c03cd144>] (setup_received_irq+0xbc/0x270) [ 9.300053] [<c03cd144>] (setup_received_irq+0xbc/0x270) from [<c03cda64>] (tegra_udc_irq+0x2ac/0x2c4) [ 9.309353] [<c03cda64>] (tegra_udc_irq+0x2ac/0x2c4) from [<c00b5708>] (handle_irq_event_percpu+0x78/0x2e0) [ 9.319087] [<c00b5708>] (handle_irq_event_percpu+0x78/0x2e0) from [<c00b59b4>] (handle_irq_event+0x44/0x64) [ 9.328907] [<c00b59b4>] (handle_irq_event+0x44/0x64) from [<c00b8688>] (handle_fasteoi_irq+0xc4/0x16c) [ 9.338294] [<c00b8688>] (handle_fasteoi_irq+0xc4/0x16c) from [<c00b4f14>] (generic_handle_irq+0x34/0x48) [ 9.347858] 
[<c00b4f14>] (generic_handle_irq+0x34/0x48) from [<c000f6f4>] (handle_IRQ+0x54/0xb4) [ 9.356637] [<c000f6f4>] (handle_IRQ+0x54/0xb4) from [<c00084b0>] (gic_handle_irq+0x2c/0x60) [ 9.365068] [<c00084b0>] (gic_handle_irq+0x2c/0x60) from [<c000e900>] (__irq_svc+0x40/0x70) [ 9.373405] Exception stack(0xc09edf10 to 0xc09edf58) [ 9.378447] df00: 00000000 000f4240 00000003 00000000 [ 9.386615] df20: 00000000 e55bbc00 ef66f3ca 00000001 00000000 412fc099 c0abb9c8 00000000 [ 9.394781] df40: 3b9ac9ff c09edf58 c027a9bc c0042880 20000113 ffffffff [ 9.401396] [<c000e900>] (__irq_svc+0x40/0x70) from [<c0042880>] (tegra_idle_enter_lp3+0x68/0x78) [ 9.410272] [<c0042880>] (tegra_idle_enter_lp3+0x68/0x78) from [<c04701d4>] (cpuidle_idle_call+0xdc/0x3a4) [ 9.419922] [<c04701d4>] (cpuidle_idle_call+0xdc/0x3a4) from [<c000fdd8>] (cpu_idle+0xd8/0x134) [ 9.428612] [<c000fdd8>] (cpu_idle+0xd8/0x134) from [<c0986818>] (start_kernel+0x27c/0x2cc) [ 9.436952] Code: e1a00004 e3a04000 eb002265 eaffffe0 (e7f001f2) [ 9.443038] ---[ end trace 1b75b31a2719ed24 ]--- [ 9.447645] Kernel panic - not syncing: Fatal exception in interrupt Hiroshi Doyu (4): ARM: dma-mapping: Refactor out to introduce __in_atomic_pool ARM: dma-mapping: Use kzalloc() with GFP_ATOMIC ARM: dma-mapping: Refactor out to introduce __alloc_fill_pages ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC arch/arm/mm/dma-mapping.c | 102 ++++++++++++++++++++++++++++++++++----------- 1 files changed, 77 insertions(+), 25 deletions(-) -- 1.7.5.4

13 years, 5 months


[PATCHv6 0/2] ARM: replace custom consistent dma region with vmalloc

by Marek Szyprowski

Hello!

This is yet another quick update on the patchset which replaces custom consistent dma regions usage in dma-mapping framework in favour of generic vmalloc areas created on demand for each allocation. The main purpose of this patchset is to remove the 2MiB limit of dma coherent/writecombine allocations.

This version addresses a few more cleanups pointed out by Minchan Kim.

This patch is based on vanilla v3.5 release.

Best regards
Marek Szyprowski
Samsung Poland R&D Center

Changelog:

v6:
- more cleanups of minor issues pointed out by Minchan Kim, moved arm_dma_mmap() changes into separate patch

v5: http://thread.gmane.org/gmane.linux.kernel.mm/83096
- fixed further minor issues pointed out by Minchan Kim: added more comments here and there, changed pr_err() + stack_dump() to WARN(), added a fix for no-MMU systems

v4: http://thread.gmane.org/gmane.linux.kernel.mm/80906
- replaced arch-independent VM_DMA flag with ARM-specific VM_ARM_DMA_CONSISTENT flag

v3: http://thread.gmane.org/gmane.linux.kernel.mm/80028
- rebased onto v3.4-rc2: added support for IOMMU-aware implementation of dma-mapping calls, unified with CMA coherent dma pool
- implemented changes requested by Minchan Kim: added more checks for vmarea->flags & VM_DMA, renamed some variables, removed obsolete locks, squashed find_vm_area() exporting patch into the main redesign patch

v2: http://thread.gmane.org/gmane.linux.kernel.mm/78563
- added support for atomic allocations (served from preallocated pool)
- minor cleanup here and there
- rebased onto v3.4-rc7

v1: http://thread.gmane.org/gmane.linux.kernel.mm/76703
- initial version

Patch summary:

Marek Szyprowski (2):
  mm: vmalloc: use const void * for caller argument
  ARM: dma-mapping: remove custom consistent dma region

 Documentation/kernel-parameters.txt |    2 +-
 arch/arm/include/asm/dma-mapping.h  |    2 +-
 arch/arm/mm/dma-mapping.c           |  486 ++++++++++++-----------------------
 arch/arm/mm/mm.h                    |    3 +
 include/linux/vmalloc.h             |    9 +-
 mm/vmalloc.c                        |   28 ++-
 6 files changed, 194 insertions(+), 336 deletions(-)

--
1.7.1.569.g6f426

13 years, 5 months


[PATCH] ARM: relax conditions required for enabling Contiguous Memory Allocator

by Marek Szyprowski

Contiguous Memory Allocator requires only paging and an enabled MMU, not a particular CPU architecture, so there is no need for a strict dependency on CPU type. This makes it possible to use CMA on some older ARM v5 systems, which might also need large contiguous blocks for the multimedia processing hw modules.

Reported-by: Prabhakar Lad <prabhakar.csengg(a)gmail.com>
Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com>
---
 arch/arm/Kconfig |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e91c7cd..6ef75e2 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -6,7 +6,7 @@ config ARM
 	select HAVE_DMA_API_DEBUG
 	select HAVE_IDE if PCI || ISA || PCMCIA
 	select HAVE_DMA_ATTRS
-	select HAVE_DMA_CONTIGUOUS if (CPU_V6 || CPU_V6K || CPU_V7)
+	select HAVE_DMA_CONTIGUOUS if MMU
 	select HAVE_MEMBLOCK
 	select RTC_LIB
 	select SYS_SUPPORTS_APM_EMULATION
--
1.7.1.569.g6f426

13 years, 5 months


[PATCHv7 00/15] Integration of videobuf2 with dmabuf

by Tomasz Stanislawski

Hello everyone,

This patchset adds support for DMABUF [2] importing to the V4L2 stack. The support for DMABUF exporting was moved to a separate patchset due to its dependency on the patches for DMA mapping redesign by Marek Szyprowski [4]. This patchset depends on the new scatterlist constructor [5].

v7:
- support for V4L2_MEMORY_DMABUF in v4l2-compat-ioctl32.c
- cosmetic fixes to the documentation
- added importing for vmalloc because vmap support in dmabuf for 3.5 was pull-requested
- support for dmabuf importing for VIVI
- resurrect allocation of dma-contig context
- remove reference of alloc_ctx in dma-contig buffer
- use sg_alloc_table_from_pages
- fix DMA scatterlist calls to use orig_nents instead of nents
- fix memleak in vb2_dc_sgt_foreach_page (use orig_nents instead of nents)

v6:
- fixed missing entry in v4l2_memory_names
- fixed a bug occurring after get_user_pages failure
- fixed a bug caused by using invalid vma for get_user_pages
- prepare/finish no longer call dma_sync for dmabuf buffers

v5:
- removed change of importer/exporter behaviour
- fixes vb2_dc_pages_to_sgt based on Laurent's hints
- changed pin/unpin words to lock/unlock in Doc

v4:
- rebased on mainline 3.4-rc2
- included missing importing support for s5p-fimc and s5p-tv
- added patch for changing map/unmap for importers
- fixes to Documentation part
- coding style fixes
- pairing {map/unmap}_dmabuf in vb2-core
- fixing variable types and semantic of arguments in videobuf2-dma-contig.c

v3:
- rebased on mainline 3.4-rc1
- split 'code refactor' patch to multiple smaller patches
- squashed fixes to Sumit's patches
- patchset is no longer dependent on 'DMA mapping redesign'
- separated path for handling IO and non-IO mappings
- add documentation for DMABUF importing to V4L
- removed all DMABUF exporter related code
- removed usage of dma_get_pages extension

v2:
- extended VIDIOC_EXPBUF argument from integer memoffset to struct v4l2_exportbuffer
- added patch that breaks DMABUF spec on (un)map_attachment callbacks but allows it to work with existing implementation of DMABUF prime in DRM
- all dma-contig code refactoring patches were squashed
- bugfixes

v1: List of changes since [1].
- support for DMA api extension dma_get_pages, the function is used to retrieve pages used to create DMA mapping.
- small fixes/code cleanup to videobuf2
- added prepare and finish callbacks to vb2 allocators, used to keep consistency between dma-cpu access to the memory (by Marek Szyprowski)
- support for exporting of DMABUF buffer in V4L2 and Videobuf2, originated from [3].
- support for dma-buf exporting in vb2-dma-contig allocator
- support for DMABUF for s5p-tv and s5p-fimc (capture interface) drivers, originated from [3]
- changed handling for userptr buffers (by Marek Szyprowski, Andrzej Pietrasiewicz)
- let mmap method to use dma_mmap_writecombine call (by Marek Szyprowski)

[1] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/4296…
[2] https://lkml.org/lkml/2011/12/26/29
[3] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/3635…
[4] http://thread.gmane.org/gmane.linux.kernel.cross-arch/12819
[5] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/47983

Laurent Pinchart (2):
  v4l: vb2-dma-contig: Shorten vb2_dma_contig prefix to vb2_dc
  v4l: vb2-dma-contig: Reorder functions

Marek Szyprowski (2):
  v4l: vb2: add prepare/finish callbacks to allocators
  v4l: vb2-dma-contig: add prepare/finish to dma-contig allocator

Sumit Semwal (4):
  v4l: Add DMABUF as a memory type
  v4l: vb2: add support for shared buffer (dma_buf)
  v4l: vb: remove warnings about MEMORY_DMABUF
  v4l: vb2-dma-contig: add support for dma_buf importing

Tomasz Stanislawski (7):
  Documentation: media: description of DMABUF importing in V4L2
  v4l: vb2-dma-contig: remove reference of alloc_ctx from a buffer
  v4l: vb2-dma-contig: add support for scatterlist in userptr mode
  v4l: vb2-vmalloc: add support for dmabuf importing
  v4l: vivi: support for dmabuf importing
  v4l: s5p-tv: mixer: support for dmabuf importing
  v4l: s5p-fimc: support for dmabuf importing

 Documentation/DocBook/media/v4l/compat.xml         |    4 +
 Documentation/DocBook/media/v4l/io.xml             |  179 ++++++++
 .../DocBook/media/v4l/vidioc-create-bufs.xml       |    3 +-
 Documentation/DocBook/media/v4l/vidioc-qbuf.xml    |   15 +
 Documentation/DocBook/media/v4l/vidioc-reqbufs.xml |   47 +-
 drivers/media/video/Kconfig                        |    1 +
 drivers/media/video/s5p-fimc/Kconfig               |    1 +
 drivers/media/video/s5p-fimc/fimc-capture.c        |    2 +-
 drivers/media/video/s5p-tv/Kconfig                 |    1 +
 drivers/media/video/s5p-tv/mixer_video.c           |    2 +-
 drivers/media/video/v4l2-compat-ioctl32.c          |   16 +
 drivers/media/video/v4l2-ioctl.c                   |    1 +
 drivers/media/video/videobuf-core.c                |    4 +
 drivers/media/video/videobuf2-core.c               |  207 ++++++++-
 drivers/media/video/videobuf2-dma-contig.c         |  470 +++++++++++++++++---
 drivers/media/video/videobuf2-vmalloc.c            |   56 +++
 drivers/media/video/vivi.c                         |    2 +-
 include/linux/videodev2.h                          |    7 +
 include/media/videobuf2-core.h                     |   34 ++
 19 files changed, 963 insertions(+), 89 deletions(-)

--
1.7.9.5

13 years, 6 months


[PATCH v2 0/2] Enhance DMABUF with reference counting for exporter module

by Tomasz Stanislawski

Hello,

This patchset adds reference counting for an exporter module to DMABUF framework. Moreover, it adds setup of an owner field for exporters in DRM subsystem.

v1: Original
v2:
- split patch into DMABUF and DRM part
- allow owner to be NULL

Regards,
Tomasz Stanislawski

Tomasz Stanislawski (2):
  dma-buf: add reference counting for exporter module
  drm: set owner field to for all DMABUF exporters

 Documentation/dma-buf-sharing.txt          |    3 ++-
 drivers/base/dma-buf.c                     |    9 ++++++++-
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |    1 +
 drivers/gpu/drm/i915/i915_gem_dmabuf.c     |    1 +
 drivers/gpu/drm/nouveau/nouveau_prime.c    |    1 +
 drivers/gpu/drm/radeon/radeon_prime.c      |    1 +
 drivers/staging/omapdrm/omap_gem_dmabuf.c  |    1 +
 include/linux/dma-buf.h                    |    2 ++
 8 files changed, 17 insertions(+), 2 deletions(-)

--
1.7.9.5

13 years, 6 months


[PATCH 1/3] dma-fence: dma-buf synchronization (v7)

by Maarten Lankhorst

A dma-fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer to another device without waiting. For example, userspace can call page_flip ioctl to display the next frame of graphics after kicking the GPU but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace.

A dma-fence is a transient, one-shot deal. It is allocated and attached to one or more dma-buf's. When the one that attached it is done with the pending operation, it can signal the fence.
  + dma_fence_signal()

The dma-buf-mgr handles tracking, and waiting on, the fences associated with a dma-buf.

TODO maybe need some helper fxn for simple devices, like a display-only drm/kms device which simply wants to wait for exclusive fence to be signaled, and then attach a non-exclusive fence while scanout is in progress.

The one pending on the fence can add an async callback:
  + dma_fence_add_callback()
The callback can optionally be cancelled with remove_wait_queue()

Or wait synchronously (optionally with timeout or interruptible):
  + dma_fence_wait()

A default software-only implementation is provided, which can be used by drivers attaching a fence to a buffer when they have no other means for hw sync. But a memory backed fence is also envisioned, because it is common that GPU's can write to, or poll on some memory location for synchronization. For example:

  fence = dma_buf_get_fence(dmabuf);
  if (fence->ops == &bikeshed_fence_ops) {
    dma_buf *fence_buf;
    dma_bikeshed_fence_get_buf(fence, &fence_buf, &offset);
    ... tell the hw the memory location to wait on ...
  } else {
    /* fall-back to sw sync */
    dma_fence_add_callback(fence, my_cb);
  }

On SoC platforms, if some other hw mechanism is provided for synchronizing between IP blocks, it could be supported as an alternate implementation with its own fence ops in a similar way.

To facilitate other non-sw implementations, the enable_signaling callback can be used to keep track if a device not supporting hw sync is waiting on the fence, and in this case should arrange to call dma_fence_signal() at some point after the condition has changed, to notify other devices waiting on the fence. If there are no sw waiters, this can be skipped to avoid waking the CPU unnecessarily. The handler of the enable_signaling op should take a refcount until the fence is signaled, then release its ref.

The intention is to provide a userspace interface (presumably via eventfd) later, to be used in conjunction with dma-buf's mmap support for sw access to buffers (or for userspace apps that would prefer to do their own synchronization).

v1: Original

v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided that dma-fence didn't need to care about the sw->hw signaling path (it can be handled same as sw->sw case), and therefore the fence->ops can be simplified and more handled in the core. So remove the signal, add_callback, cancel_callback, and wait ops, and replace with a simple enable_signaling() op which can be used to inform a fence supporting hw->hw signaling that one or more devices which do not support hw signaling are waiting (and therefore it should enable an irq or do whatever is necessary in order that the CPU is notified when the fence is passed).

v3: Fix locking fail in attach_fence() and get_fence()

v4: Remove tie-in w/ dma-buf.. after discussion w/ danvet and mlankhorst we decided that we need to be able to attach one fence to N dma-buf's, so using the list_head in dma-fence struct would be problematic.
v5: [ Maarten Lankhorst ] Updated for dma-bikeshed-fence and dma-buf-manager. v6: [ Maarten Lankhorst ] I removed dma_fence_cancel_callback and some comments about checking if fence fired or not. This is broken by design. waitqueue_active during destruction is now fatal, since the signaller should be holding a reference in enable_signalling until it signalled the fence. Pass the original dma_fence_cb along, and call __remove_wait in the dma_fence_callback handler, so that no cleanup needs to be performed. v7: [ Maarten Lankhorst ] Set cb->func and only enable sw signaling if fence wasn't signaled yet, for example for hardware fences that may choose to signal blindly. Signed-off-by: Maarten Lankhorst <maarten.lankhorst(a)canonical.com> --- drivers/base/Makefile | 2 drivers/base/dma-fence.c | 287 +++++++++++++++++++++++++++++++++++++++++++++ include/linux/dma-fence.h | 96 +++++++++++++++ 3 files changed, 384 insertions(+), 1 deletion(-) create mode 100644 drivers/base/dma-fence.c create mode 100644 include/linux/dma-fence.h diff --git a/drivers/base/Makefile b/drivers/base/Makefile index 5aa2d70..6e9f217 100644 --- a/drivers/base/Makefile +++ b/drivers/base/Makefile @@ -10,7 +10,7 @@ obj-$(CONFIG_CMA) += dma-contiguous.o obj-y += power/ obj-$(CONFIG_HAS_DMA) += dma-mapping.o obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o -obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o +obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-fence.o obj-$(CONFIG_ISA) += isa.o obj-$(CONFIG_FW_LOADER) += firmware_class.o obj-$(CONFIG_NUMA) += node.o diff --git a/drivers/base/dma-fence.c b/drivers/base/dma-fence.c new file mode 100644 index 0000000..c280ee7 --- /dev/null +++ b/drivers/base/dma-fence.c @@ -0,0 +1,287 @@ +/* + * Fence mechanism for dma-buf to allow for asynchronous dma access + * + * Copyright (C) 2012 Texas Instruments + * Author: Rob Clark <rob.clark(a)linaro.org> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the 
GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see <http://www.gnu.org/licenses/>. + */ + +#include <linux/slab.h> +#include <linux/sched.h> +#include <linux/export.h> +#include <linux/dma-fence.h> + +/** + * dma_fence_signal - Signal a fence. + * + * @fence: The fence to signal + * + * All registered callbacks will be called directly (synchronously) and + * all blocked waters will be awoken. + * + * TODO: any value in adding a dma_fence_cancel(), for example to recov + * from hung gpu? It would behave like dma_fence_signal() but return + * an error to waiters and cb's to let them know that the condition they + * are waiting for will never happen. + */ +int dma_fence_signal(struct dma_fence *fence) +{ + unsigned long flags; + int ret = -EINVAL; + + if (WARN_ON(!fence)) + return -EINVAL; + + spin_lock_irqsave(&fence->event_queue.lock, flags); + if (!fence->signaled) { + fence->signaled = true; + __wake_up_locked_key(&fence->event_queue, TASK_NORMAL, + &fence->event_queue); + ret = 0; + } else WARN(1, "Already signaled"); + spin_unlock_irqrestore(&fence->event_queue.lock, flags); + + return ret; +} +EXPORT_SYMBOL_GPL(dma_fence_signal); + +static void release_fence(struct kref *kref) +{ + struct dma_fence *fence = + container_of(kref, struct dma_fence, refcount); + + BUG_ON(waitqueue_active(&fence->event_queue)); + + if (fence->ops->release) + fence->ops->release(fence); + + kfree(fence); +} + +/** + * dma_fence_put - Release a reference to the fence. 
+ */ +void dma_fence_put(struct dma_fence *fence) +{ + WARN_ON(!fence); + kref_put(&fence->refcount, release_fence); +} +EXPORT_SYMBOL_GPL(dma_fence_put); + +/** + * dma_fence_get - Take a reference to the fence. + * + * In most cases this is used only internally by dma-fence. + */ +void dma_fence_get(struct dma_fence *fence) +{ + WARN_ON(!fence); + kref_get(&fence->refcount); +} +EXPORT_SYMBOL_GPL(dma_fence_get); + +static int check_signaling(struct dma_fence *fence) +{ + bool enable_signaling = false, signaled; + unsigned long flags; + + spin_lock_irqsave(&fence->event_queue.lock, flags); + signaled = fence->signaled; + if (!signaled && !fence->needs_sw_signal) + enable_signaling = fence->needs_sw_signal = true; + spin_unlock_irqrestore(&fence->event_queue.lock, flags); + + if (enable_signaling) { + int ret; + + /* At this point, if enable_signaling returns any error + * a wakeup has to be performanced regardless. + * -ENOENT signals fence was already signaled. Any other error + * inidicates a catastrophic hardware error. + * + * If any hardware error occurs, nothing can be done against + * it, so it's treated like the fence was already signaled. + * No synchronization can be performed, so we have to assume + * the fence was already signaled. + */ + ret = fence->ops->enable_signaling(fence); + if (ret) { + signaled = true; + dma_fence_signal(fence); + } + } + + if (!signaled) + return 0; + else + return -ENOENT; +} + +int __dma_fence_wake_func(wait_queue_t *wait, unsigned mode, + int flags, void *key) +{ + struct dma_fence_cb *cb = + container_of(wait, struct dma_fence_cb, base); + + __remove_wait_queue(key, wait); + return cb->func(cb, wait->private); +} + +/** + * dma_fence_add_callback - Add a callback to be called when the fence + * is signaled. + * + * @fence: The fence to wait on + * @cb: The callback to register + * + * Any number of callbacks can be registered to a fence, but a callback + * can only be registered to once fence at a time. 
+ * + * Note that the callback can be called from an atomic context. If + * fence is already signaled, this function will return -ENOENT (and + * *not* call the callback) + */ +int dma_fence_add_callback(struct dma_fence *fence, struct dma_fence_cb *cb, + dma_fence_func_t func, void *priv) +{ + unsigned long flags; + int ret; + + if (WARN_ON(!fence || !func)) + return -EINVAL; + + ret = check_signaling(fence); + + spin_lock_irqsave(&fence->event_queue.lock, flags); + if (!ret && fence->signaled) + ret = -ENOENT; + + if (!ret) { + cb->base.flags = 0; + cb->base.func = __dma_fence_wake_func; + cb->base.private = priv; + cb->fence = fence; + cb->func = func; + __add_wait_queue(&fence->event_queue, &cb->base); + } + spin_unlock_irqrestore(&fence->event_queue.lock, flags); + + return ret; +} +EXPORT_SYMBOL_GPL(dma_fence_add_callback); + +/** + * dma_fence_wait - Wait for a fence to be signaled. + * + * @fence: The fence to wait on + * @interruptible: if true, do an interruptible wait + * @timeout: absolute time for timeout, in jiffies. + * + * Returns 0 on success, -EBUSY if a timeout occured, + * -ERESTARTSYS if the wait was interrupted by a signal. 
+ */ +int dma_fence_wait(struct dma_fence *fence, bool interruptible, unsigned long timeout) +{ + unsigned long cur; + int ret; + + if (WARN_ON(!fence)) + return -EINVAL; + + cur = jiffies; + if (time_after_eq(cur, timeout)) + return -EBUSY; + + timeout -= cur; + + ret = check_signaling(fence); + if (ret == -ENOENT) + return 0; + else if (ret) + return ret; + + if (interruptible) + ret = wait_event_interruptible_timeout(fence->event_queue, + fence->signaled, + timeout); + else + ret = wait_event_timeout(fence->event_queue, + fence->signaled, timeout); + + if (ret > 0) + return 0; + else if (!ret) + return -EBUSY; + else + return ret; +} +EXPORT_SYMBOL_GPL(dma_fence_wait); + +/* + * Helpers intended to be used by the ops of the dma_fence implementation: + * + * NOTE: helpers and fxns intended to be used by other dma-fence + * implementations are not exported.. I'm not really sure if it makes + * sense to have a dma-fence implementation that is itself a module. + */ + +void __dma_fence_init(struct dma_fence *fence, struct dma_fence_ops *ops, void *priv) +{ + WARN_ON(!ops || !ops->enable_signaling); + + kref_init(&fence->refcount); + fence->ops = ops; + fence->priv = priv; + init_waitqueue_head(&fence->event_queue); +} +EXPORT_SYMBOL_GPL(__dma_fence_init); + +/* + * Pure sw implementation for dma-fence. The CPU always gets involved. + */ + +static int sw_enable_signaling(struct dma_fence *fence) +{ + /* + * pure sw, no irq's to enable, because the fence creator will + * always call dma_fence_signal() + */ + return 0; +} + +static struct dma_fence_ops sw_fence_ops = { + .enable_signaling = sw_enable_signaling, +}; + +/** + * dma_fence_create - Create a simple sw-only fence. + * + * This fence only supports signaling from/to CPU. Other implementations + * of dma-fence can be used to support hardware to hardware signaling, if + * supported by the hardware, and use the dma_fence_helper_* functions for + * compatibility with other devices that only support sw signaling. 
+ */ +struct dma_fence *dma_fence_create(void) +{ + struct dma_fence *fence; + + fence = kzalloc(sizeof(struct dma_fence), GFP_KERNEL); + if (!fence) + return ERR_PTR(-ENOMEM); + + __dma_fence_init(fence, &sw_fence_ops, 0); + + return fence; +} +EXPORT_SYMBOL_GPL(dma_fence_create); diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h new file mode 100644 index 0000000..70d12c0 --- /dev/null +++ b/include/linux/dma-fence.h @@ -0,0 +1,96 @@ +/* + * Fence mechanism for dma-buf to allow for asynchronous dma access + * + * Copyright (C) 2012 Texas Instruments + * Author: Rob Clark <rob.clark(a)linaro.org> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see <http://www.gnu.org/licenses/>. + */ + +#ifndef __DMA_FENCE_H__ +#define __DMA_FENCE_H__ + +#include <linux/err.h> +#include <linux/list.h> +#include <linux/wait.h> +#include <linux/list.h> +#include <linux/dma-buf.h> + +struct dma_fence; +struct dma_fence_ops; +struct dma_fence_cb; + +struct dma_fence { + struct kref refcount; + struct dma_fence_ops *ops; + wait_queue_head_t event_queue; + void *priv; + + /* has this fence been signaled yet? */ + bool signaled : 1; + + /* do we have one or more waiters or callbacks? 
*/ + bool needs_sw_signal : 1; +}; + +typedef int (*dma_fence_func_t)(struct dma_fence_cb *cb, void *priv); + +struct dma_fence_cb { + wait_queue_t base; + dma_fence_func_t func; + struct dma_fence *fence; +}; + +struct dma_fence_ops { + /** + * For fence implementations that have the capability for hw->hw + * signaling, they can implement this op to enable the necessary + * irqs, or insert commands into cmdstream, etc. This is called + * in the first wait() or add_callback() path to let the fence + * implementation know that there is another driver waiting on + * the signal (ie. hw->sw case). + * + * A return value of -ENOENT will indicate that the fence has + * already passed. Any other errors will be treated as -ENOENT, + * and can happen because of hardware failure. + */ + int (*enable_signaling)(struct dma_fence *fence); + void (*release)(struct dma_fence *fence); +}; + +/* + * TODO does it make sense to be able to enable dma-fence without dma-buf, + * or visa versa? + */ +#ifdef CONFIG_DMA_SHARED_BUFFER + +/* create a basic (pure sw) fence: */ +struct dma_fence *dma_fence_create(void); + +/* intended to be used by other dma_fence implementations: */ +void __dma_fence_init(struct dma_fence *fence, + struct dma_fence_ops *ops, void *priv); + +void dma_fence_get(struct dma_fence *fence); +void dma_fence_put(struct dma_fence *fence); + +int dma_fence_signal(struct dma_fence *fence); +int dma_fence_wait(struct dma_fence *fence, bool interruptible, unsigned long timeout); +int dma_fence_add_callback(struct dma_fence *fence, struct dma_fence_cb *cb, + dma_fence_func_t func, void *priv); + +#else +// TODO +#endif /* CONFIG_DMA_SHARED_BUFFER */ + +#endif /* __DMA_FENCE_H__ */

13 years, 6 months


[GIT PULL] DMA-mapping fixups for v3.6-rc2

by Marek Szyprowski

Hi Linus,

I would like to ask for pulling a set of fixup patches for ARM dma-mapping extensions merged in v3.6-rc1.

The following changes since commit 0d7614f09c1ebdbaa1599a5aba7593f147bf96ee:

  Linux 3.6-rc1 (2012-08-02 16:38:10 -0700)

with the top-most commit d9e0d149b5dcc2ef4688afc572b9906bcda941ef

  ARM: dma-mapping: fix incorrect freeing of atomic allocations

are available in the git repository at:

  git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git fixes-for-linus-for-3.6-rc2

Thanks!

Best regards
Marek Szyprowski
Samsung Poland R&D Center

Patch summary:

Aaro Koskinen (2):
      ARM: dma-mapping: fix atomic allocation alignment
      ARM: dma-mapping: fix incorrect freeing of atomic allocations

Chris Brand (1):
      ARM: mm: fix MMU mapping of CMA regions

 arch/arm/mm/dma-mapping.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

13 years, 6 months


[PATCH] dma-buf: add reference counting for exporter module

by Tomasz Stanislawski

This patch adds reference counting on a module that exports dma-buf and
implements its operations. This prevents the module from being unloaded while
DMABUF file is in use.

Signed-off-by: Tomasz Stanislawski <t.stanislaws(a)samsung.com>
---
 Documentation/dma-buf-sharing.txt          |    3 ++-
 drivers/base/dma-buf.c                     |   10 +++++++++-
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |    1 +
 drivers/gpu/drm/i915/i915_gem_dmabuf.c     |    1 +
 drivers/gpu/drm/nouveau/nouveau_prime.c    |    1 +
 drivers/gpu/drm/radeon/radeon_prime.c      |    1 +
 drivers/staging/omapdrm/omap_gem_dmabuf.c  |    1 +
 include/linux/dma-buf.h                    |    2 ++
 8 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
index ad86fb8..2613057 100644
--- a/Documentation/dma-buf-sharing.txt
+++ b/Documentation/dma-buf-sharing.txt
@@ -49,7 +49,8 @@ The dma_buf buffer sharing API usage contains the following steps:
    The buffer exporter announces its wish to export a buffer. In this, it
    connects its own private buffer data, provides implementation for operations
    that can be performed on the exported dma_buf, and flags for the file
-   associated with this buffer.
+   associated with this buffer. The operations structure has owner field.
+   You should initialize this to THIS_MODULE in most cases.
 
    Interface:
       struct dma_buf *dma_buf_export(void *priv, struct dma_buf_ops *ops,
diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index c30f3e1..d14b2f5 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -27,6 +27,7 @@
 #include <linux/dma-buf.h>
 #include <linux/anon_inodes.h>
 #include <linux/export.h>
+#include <linux/module.h>
 
 static inline int is_dma_buf_file(struct file *);
 
@@ -40,6 +41,7 @@ static int dma_buf_release(struct inode *inode, struct file *file)
 	dmabuf = file->private_data;
 
 	dmabuf->ops->release(dmabuf);
+	module_put(dmabuf->ops->owner);
 	kfree(dmabuf);
 	return 0;
 }
@@ -96,6 +98,7 @@ struct dma_buf *dma_buf_export(void *priv, const struct dma_buf_ops *ops,
 	struct file *file;
 
 	if (WARN_ON(!priv || !ops
+			  || !ops->owner
 			  || !ops->map_dma_buf
 			  || !ops->unmap_dma_buf
 			  || !ops->release
@@ -105,9 +108,14 @@ struct dma_buf *dma_buf_export(void *priv, const struct dma_buf_ops *ops,
 		return ERR_PTR(-EINVAL);
 	}
 
+	if (!try_module_get(ops->owner))
+		return ERR_PTR(-ENOENT);
+
 	dmabuf = kzalloc(sizeof(struct dma_buf), GFP_KERNEL);
-	if (dmabuf == NULL)
+	if (dmabuf == NULL) {
+		module_put(ops->owner);
 		return ERR_PTR(-ENOMEM);
+	}
 
 	dmabuf->priv = priv;
 	dmabuf->ops = ops;
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
index 613bf8a..cf3bc6d 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
@@ -164,6 +164,7 @@ static void exynos_gem_dmabuf_kunmap(struct dma_buf *dma_buf,
 }
 
 static struct dma_buf_ops exynos_dmabuf_ops = {
+	.owner = THIS_MODULE,
 	.map_dma_buf = exynos_gem_map_dma_buf,
 	.unmap_dma_buf = exynos_gem_unmap_dma_buf,
 	.kmap = exynos_gem_dmabuf_kmap,
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index aa308e1..07ff03b 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -152,6 +152,7 @@ static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *
 }
 
 static const struct dma_buf_ops i915_dmabuf_ops = {
+	.owner = THIS_MODULE,
 	.map_dma_buf = i915_gem_map_dma_buf,
 	.unmap_dma_buf = i915_gem_unmap_dma_buf,
 	.release = i915_gem_dmabuf_release,
diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c b/drivers/gpu/drm/nouveau/nouveau_prime.c
index a25cf2c..8605033 100644
--- a/drivers/gpu/drm/nouveau/nouveau_prime.c
+++ b/drivers/gpu/drm/nouveau/nouveau_prime.c
@@ -127,6 +127,7 @@ static void nouveau_gem_prime_vunmap(struct dma_buf *dma_buf, void *vaddr)
 }
 
 static const struct dma_buf_ops nouveau_dmabuf_ops = {
+	.owner = THIS_MODULE,
 	.map_dma_buf = nouveau_gem_map_dma_buf,
 	.unmap_dma_buf = nouveau_gem_unmap_dma_buf,
 	.release = nouveau_gem_dmabuf_release,
diff --git a/drivers/gpu/drm/radeon/radeon_prime.c b/drivers/gpu/drm/radeon/radeon_prime.c
index 6bef46a..4061fd3 100644
--- a/drivers/gpu/drm/radeon/radeon_prime.c
+++ b/drivers/gpu/drm/radeon/radeon_prime.c
@@ -127,6 +127,7 @@ static void radeon_gem_prime_vunmap(struct dma_buf *dma_buf, void *vaddr)
 	mutex_unlock(&dev->struct_mutex);
 }
 const static struct dma_buf_ops radeon_dmabuf_ops = {
+	.owner = THIS_MODULE,
 	.map_dma_buf = radeon_gem_map_dma_buf,
 	.unmap_dma_buf = radeon_gem_unmap_dma_buf,
 	.release = radeon_gem_dmabuf_release,
diff --git a/drivers/staging/omapdrm/omap_gem_dmabuf.c b/drivers/staging/omapdrm/omap_gem_dmabuf.c
index 42728e0..6a4dd67 100644
--- a/drivers/staging/omapdrm/omap_gem_dmabuf.c
+++ b/drivers/staging/omapdrm/omap_gem_dmabuf.c
@@ -179,6 +179,7 @@ out_unlock:
 }
 
 struct dma_buf_ops omap_dmabuf_ops = {
+	.owner = THIS_MODULE,
 	.map_dma_buf = omap_gem_map_dma_buf,
 	.unmap_dma_buf = omap_gem_unmap_dma_buf,
 	.release = omap_gem_dmabuf_release,
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index eb48f38..22953de 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -37,6 +37,7 @@ struct dma_buf_attachment;
 
 /**
  * struct dma_buf_ops - operations possible on struct dma_buf
+ * @owner: the module that implements dma_buf operations
  * @attach: [optional] allows different devices to 'attach' themselves to the
  *	    given buffer. It might return -EBUSY to signal that backing storage
  *	    is already allocated and incompatible with the requirements
@@ -70,6 +71,7 @@ struct dma_buf_attachment;
  * @vunmap: [optional] unmaps a vmap from the buffer
  */
 struct dma_buf_ops {
+	struct module *owner;
 	int (*attach)(struct dma_buf *, struct device *,
 			struct dma_buf_attachment *);
-- 
1.7.9.5
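The pin-then-unwind pattern in the patch above (take a reference on the exporter before allocating, drop it again if allocation fails, and drop it on final release) can be sketched in userspace C. Everything here is a mock for illustration: struct mock_module, mock_try_get(), mock_put(), mock_export() and mock_release() are hypothetical stand-ins, not the kernel's struct module, try_module_get(), module_put() or dma_buf_export().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a module: a use count plus an "unloading" flag. */
struct mock_module {
	int refcount;
	bool going_away;
};

struct mock_ops {
	struct mock_module *owner;
};

struct mock_buf {
	const struct mock_ops *ops;
};

/* Take a reference unless the owner is already being unloaded. */
static bool mock_try_get(struct mock_module *m)
{
	if (m->going_away)
		return false;
	m->refcount++;
	return true;
}

static void mock_put(struct mock_module *m)
{
	assert(m->refcount > 0);
	m->refcount--;
}

/*
 * Export a buffer: pin the owner first, and undo the pin on every later
 * failure path. Returns 0 on success or a negative errno value.
 */
static int mock_export(const struct mock_ops *ops, struct mock_buf **out)
{
	struct mock_buf *buf;

	if (!ops || !ops->owner || !out)
		return -EINVAL;

	if (!mock_try_get(ops->owner))
		return -ENOENT;

	buf = calloc(1, sizeof(*buf));
	if (!buf) {
		mock_put(ops->owner);	/* unwind the reference taken above */
		return -ENOMEM;
	}

	buf->ops = ops;
	*out = buf;
	return 0;
}

/* Final release: free the buffer, then let go of the owner. */
static void mock_release(struct mock_buf *buf)
{
	struct mock_module *owner = buf->ops->owner;

	free(buf);
	mock_put(owner);
}
```

The point of taking the reference before the allocation is that no buffer can ever exist whose ops point into code that may be unloaded underneath it.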

13 years, 6 months


[PATCHv2 0/9] Support for dmabuf exporting for videobuf2

by Tomasz Stanislawski

Hello everyone,

This patchset adds support for DMABUF exporting to the V4L2 stack. The latest support for DMABUF importing was posted in [1]. The exporter part depends on the DMA mapping redesign [2], which is expected to be merged into the mainline. Therefore it is posted as a separate patchset. Moreover, some patches depend on the vmap extension for DMABUF by Dave Airlie [3] and the sg_alloc_table_from_pages function [4]. The last patch 'v4l: vb2-dma-contig: use dma_get_sgtable' depends on the dma_get_sgtable extension to the DMA API [5].

The tree with all the patches and extensions is available at:

repo: git://git.infradead.org/users/kmpark/linux-2.6-samsung
branch: media-for3.5-vb2-dmabuf-v7

Changelog:

v2:
- add documentation for DMABUF exporting
- squashed 'let mmap method to use dma_mmap_coherent call' with 'remove vb2_mmap_pfn_range function'
- move setup of scatterlist for MMAP buffers from alloc to DMABUF export code
- use locking to serialize map/unmap of DMABUF attachments
- squash vmap/kmap, setup of sg lists, allocation in attachments into dma-contig exporter patch
- fix occasional failure of follow_pfn trick by using init_mm in artificial VMA
- add support for exporting in s5p-mfc driver
- drop all code that duplicates sg_alloc_table_from_pages
- introduce usage of dma_get_sgtable as generic solution to follow_pfn trick

v1:
- updated setup of VIDIOC_EXPBUF ioctl
- doc updates
- introduced workaround to avoid using dma_get_pages
- removed caching of exported dmabuf to avoid existence of circular reference between dmabuf and vb2_dc_buf or resource leakage
- removed all 'change behaviour' patches
- initial support for exporting in s5p-mfc driver
- removal of vb2_mmap_pfn_range that is no longer used
- use sg_alloc_table_from_pages instead of creating sglist in vb2_dc code
- move attachment allocation to exporter's attach callback

v0: RFC
- initial version

[1] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/49438
[2] http://thread.gmane.org/gmane.linux.kernel.cross-arch/14098
[3] http://permalink.gmane.org/gmane.comp.video.dri.devel/69302
[4] This patchset is rebased on 3.4-rc1 plus the following patchsets:
[5] http://www.spinics.net/lists/linux-arch/msg18282.html

Marek Szyprowski (1):
  v4l: vb2-dma-contig: let mmap method to use dma_mmap_coherent call

Tomasz Stanislawski (8):
  Documentation: media: description of DMABUF exporting in V4L2
  v4l: add buffer exporting via dmabuf
  v4l: vb2: add buffer exporting via dmabuf
  v4l: vb2-dma-contig: add support for DMABUF exporting
  v4l: s5p-fimc: support for dmabuf exporting
  v4l: s5p-tv: mixer: support for dmabuf exporting
  v4l: s5p-mfc: support for dmabuf exporting
  v4l: vb2-dma-contig: use dma_get_sgtable

 Documentation/DocBook/media/v4l/compat.xml        |    3 +
 Documentation/DocBook/media/v4l/io.xml            |    3 +
 Documentation/DocBook/media/v4l/v4l2.xml          |    1 +
 Documentation/DocBook/media/v4l/vidioc-expbuf.xml |  223 ++++++++++++++++++++
 drivers/media/video/s5p-fimc/fimc-capture.c       |    9 +
 drivers/media/video/s5p-mfc/s5p_mfc_dec.c         |   18 ++
 drivers/media/video/s5p-mfc/s5p_mfc_enc.c         |   18 ++
 drivers/media/video/s5p-tv/mixer_video.c          |   10 +
 drivers/media/video/v4l2-compat-ioctl32.c         |    1 +
 drivers/media/video/v4l2-dev.c                    |    1 +
 drivers/media/video/v4l2-ioctl.c                  |    6 +
 drivers/media/video/videobuf2-core.c              |   67 ++++++
 drivers/media/video/videobuf2-dma-contig.c        |  224 ++++++++++++++++++++-
 drivers/media/video/videobuf2-memops.c            |   40 ----
 include/linux/videodev2.h                         |   26 +++
 include/media/v4l2-ioctl.h                        |    2 +
 include/media/videobuf2-core.h                    |    2 +
 include/media/videobuf2-memops.h                  |    5 -
 18 files changed, 612 insertions(+), 47 deletions(-)
 create mode 100644 Documentation/DocBook/media/v4l/vidioc-expbuf.xml

-- 
1.7.9.5

13 years, 6 months


[PATCH 0/2] dma-parms and helpers for dma-buf

by Rob Clark

From: Rob Clark <rob(a)ti.com>

Re-sending first patch, with a wider audience. Apparently I didn't spam enough inboxes the first time.

And, at Daniel Vetter's suggestion, adding some helper functions in dma-buf to get the most restrictive parameters of all the attached devices.

Rob Clark (2):
  device: add dma_params->max_segment_count
  dma-buf: add helpers for attacher dma-parms

 drivers/base/dma-buf.c      |   63 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/device.h      |    1 +
 include/linux/dma-buf.h     |   19 +++++++++++++
 include/linux/dma-mapping.h |   16 +++++++++++
 4 files changed, 99 insertions(+)

-- 
1.7.9.5
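The idea of "the most restrictive parameters of all the attached devices" can be sketched as a simple minimum over each limit. The cover letter does not show the helper signatures, so the following is only a guess at the shape of the computation: struct mock_dma_parms and most_restrictive() are hypothetical names for illustration, and the two fields are assumed to model a per-device segment size limit and the segment count limit named in the first patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical per-device DMA limits (not the kernel's structure). */
struct mock_dma_parms {
	unsigned int max_segment_size;	/* largest single segment, in bytes */
	unsigned int max_segment_count;	/* most segments per mapping */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Fold the limits of all attached devices into the tightest common limit.
 * Starts from "no restriction" (UINT_MAX) so that an empty attachment list
 * leaves both limits unconstrained.
 */
static struct mock_dma_parms most_restrictive(const struct mock_dma_parms *devs,
					      size_t n)
{
	struct mock_dma_parms out = { UINT_MAX, UINT_MAX };
	size_t i;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		out.max_segment_size =
			min_uint(out.max_segment_size, devs[i].max_segment_size);
		out.max_segment_count =
			min_uint(out.max_segment_count, devs[i].max_segment_count);
	}
	return out;
}
```

An exporter that allocates backing storage within these combined limits produces a buffer that every currently attached device can map.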

13 years, 6 months


[RFC PATCH 1/3] dma-fence: dma-buf synchronization (v5)

by Maarten Lankhorst

A dma-fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer without waiting to another device. For example, userspace can call page_flip ioctl to display the next frame of graphics after kicking the GPU but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace.

A dma-fence is a transient, one-shot deal. It is allocated and attached to one or more dma-buf's. When the one that attached it is done with the pending operation, it can signal the fence.

  + dma_fence_signal()

The dma-buf-mgr handles tracking, and waiting on, the fences associated with a dma-buf.

TODO maybe need some helper fxn for simple devices, like a display-only drm/kms device which simply wants to wait for exclusive fence to be signaled, and then attach a non-exclusive fence while scanout is in progress.

The one pending on the fence can add an async callback (and optionally cancel it.. for example, to recover from GPU hangs):

  + dma_fence_add_callback()
  + dma_fence_cancel_callback()

Or wait synchronously (optionally with timeout or from atomic context):

  + dma_fence_wait()

A default software-only implementation is provided, which can be used by drivers attaching a fence to a buffer when they have no other means for hw sync. But a memory backed fence is also envisioned, because it is common that GPU's can write to, or poll on some memory location for synchronization. For example:

  fence = dma_buf_get_fence(dmabuf);
  if (fence->ops == &bikeshed_fence_ops) {
    dma_buf *fence_buf;
    dma_bikeshed_fence_get_buf(fence, &fence_buf, &offset);
    ... tell the hw the memory location to wait on ...
  } else {
    /* fall-back to sw sync */
    dma_fence_add_callback(fence, my_cb);
  }

On SoC platforms, if some other hw mechanism is provided for synchronizing between IP blocks, it could be supported as an alternate implementation with its own fence ops in a similar way.

To facilitate other non-sw implementations, the enable_signaling callback can be used to keep track if a device not supporting hw sync is waiting on the fence, and in this case should arrange to call dma_fence_signal() at some point after the condition has changed, to notify other devices waiting on the fence. If there are no sw waiters, this can be skipped to avoid waking the CPU unnecessarily.

The intention is to provide a userspace interface (presumably via eventfd) later, to be used in conjunction with dma-buf's mmap support for sw access to buffers (or for userspace apps that would prefer to do their own synchronization).

v1: Original
v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided that dma-fence didn't need to care about the sw->hw signaling path (it can be handled same as sw->sw case), and therefore the fence->ops can be simplified and more handled in the core. So remove the signal, add_callback, cancel_callback, and wait ops, and replace with a simple enable_signaling() op which can be used to inform a fence supporting hw->hw signaling that one or more devices which do not support hw signaling are waiting (and therefore it should enable an irq or do whatever is necessary in order that the CPU is notified when the fence is passed).
v3: Fix locking fail in attach_fence() and get_fence()
v4: Remove tie-in w/ dma-buf.. after discussion w/ danvet and mlankhorst we decided that we need to be able to attach one fence to N dma-buf's, so using the list_head in dma-fence struct would be problematic.
v5: [ Maarten Lankhorst ] Updated for dma-bikeshed-fence and dma-buf-manager.
--- drivers/base/Makefile | 2 drivers/base/dma-fence.c | 317 +++++++++++++++++++++++++++++++++++++++++++++ include/linux/dma-fence.h | 123 +++++++++++++++++ 3 files changed, 441 insertions(+), 1 deletion(-) create mode 100644 drivers/base/dma-fence.c create mode 100644 include/linux/dma-fence.h diff --git a/drivers/base/Makefile b/drivers/base/Makefile index 5aa2d70..6e9f217 100644 --- a/drivers/base/Makefile +++ b/drivers/base/Makefile @@ -10,7 +10,7 @@ obj-$(CONFIG_CMA) += dma-contiguous.o obj-y += power/ obj-$(CONFIG_HAS_DMA) += dma-mapping.o obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o -obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o +obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-fence.o obj-$(CONFIG_ISA) += isa.o obj-$(CONFIG_FW_LOADER) += firmware_class.o obj-$(CONFIG_NUMA) += node.o diff --git a/drivers/base/dma-fence.c b/drivers/base/dma-fence.c new file mode 100644 index 0000000..6798dc4 --- /dev/null +++ b/drivers/base/dma-fence.c @@ -0,0 +1,317 @@ +/* + * Fence mechanism for dma-buf to allow for asynchronous dma access + * + * Copyright (C) 2012 Texas Instruments + * Author: Rob Clark <rob.clark(a)linaro.org> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see <http://www.gnu.org/licenses/>. + */ + +#include <linux/slab.h> +#include <linux/sched.h> +#include <linux/export.h> +#include <linux/dma-fence.h> + +/** + * dma_fence_signal - Signal a fence. 
+ * + * @fence: The fence to signal + * + * All registered callbacks will be called directly (synchronously) and + * all blocked waters will be awoken. + * + * TODO: any value in adding a dma_fence_cancel(), for example to recov + * from hung gpu? It would behave like dma_fence_signal() but return + * an error to waiters and cb's to let them know that the condition they + * are waiting for will never happen. + */ +int dma_fence_signal(struct dma_fence *fence) +{ + unsigned long flags; + int ret = -EINVAL; + + if (WARN_ON(!fence)) + return -EINVAL; + + spin_lock_irqsave(&fence->event_queue.lock, flags); + if (!fence->signaled) { + fence->signaled = true; + wake_up_all_locked(&fence->event_queue); + ret = 0; + } else WARN(1, "Already signaled"); + spin_unlock_irqrestore(&fence->event_queue.lock, flags); + + return ret; +} +EXPORT_SYMBOL_GPL(dma_fence_signal); + +static void release_fence(struct kref *kref) +{ + struct dma_fence *fence = + container_of(kref, struct dma_fence, refcount); + + WARN_ON(waitqueue_active(&fence->event_queue)); + if (fence->ops->release) + fence->ops->release(fence); + + kfree(fence); +} + +/** + * dma_fence_put - Release a reference to the fence. + */ +void dma_fence_put(struct dma_fence *fence) +{ + WARN_ON(!fence); + kref_put(&fence->refcount, release_fence); +} +EXPORT_SYMBOL_GPL(dma_fence_put); + +/** + * dma_fence_get - Take a reference to the fence. + * + * In most cases this is used only internally by dma-fence. 
+ */ +void dma_fence_get(struct dma_fence *fence) +{ + WARN_ON(!fence); + kref_get(&fence->refcount); +} +EXPORT_SYMBOL_GPL(dma_fence_get); + +static int check_signaling(struct dma_fence *fence) +{ + bool enable_signaling = false, signaled; + unsigned long flags; + + spin_lock_irqsave(&fence->event_queue.lock, flags); + if (!fence->needs_sw_signal) + enable_signaling = fence->needs_sw_signal = true; + signaled = fence->signaled; + spin_unlock_irqrestore(&fence->event_queue.lock, flags); + + if (enable_signaling) { + int ret; + + /* At this point, if enable_signaling returns any error + * a wakeup has to be performanced regardless. + * -ENOENT signals fence was already signaled. Any other error + * inidicates a catastrophic hardware error. + * + * If any hardware error occurs, nothing can be done against + * it, so it's treated like the fence was already signaled. + * No synchronization can be performed, so we have to assume + * the fence was already signaled. + */ + ret = fence->ops->enable_signaling(fence); + if (ret) { + signaled = true; + dma_fence_signal(fence); + } + } + + if (!signaled) + return 0; + else + return -ENOENT; +} + +/** + * dma_fence_add_callback - Add a callback to be called when the fence + * is signaled. + * + * @fence: The fence to wait on + * @cb: The callback to register + * + * Any number of callbacks can be registered to a fence, but a callback + * can only be registered to once fence at a time. + * + * Note that the callback can be called from an atomic context. 
If + * fence is already signaled, this function will return -ENOENT (and + * *not* call the callback) + */ +int dma_fence_add_callback(struct dma_fence *fence, + struct dma_fence_cb *cb) +{ + unsigned long flags; + int ret; + + if (WARN_ON(!fence || !cb)) + return -EINVAL; + + ret = check_signaling(fence); + + spin_lock_irqsave(&fence->event_queue.lock, flags); + if (!ret && !fence->signaled) { + cb->fence = fence; + __add_wait_queue(&fence->event_queue, &cb->base); + ret = 0; + } else if (!ret) + ret = -ENOENT; + spin_unlock_irqrestore(&fence->event_queue.lock, flags); + + return ret; +} +EXPORT_SYMBOL_GPL(dma_fence_add_callback); + +/** + * dma_fence_cancel_callback - Remove a previously registered callback. + * + * @cb: The callback to unregister + * + * The callback will not be called after this function returns, but could + * be called before this function returns. + */ +int dma_fence_cancel_callback(struct dma_fence_cb *cb) +{ + struct dma_fence *fence; + unsigned long flags; + int ret = -EINVAL; + + if (WARN_ON(!cb)) + return -EINVAL; + + fence = cb->fence; + + spin_lock_irqsave(&fence->event_queue.lock, flags); + if (fence) { + __remove_wait_queue(&fence->event_queue, &cb->base); + cb->fence = NULL; + ret = 0; + } + spin_unlock_irqrestore(&fence->event_queue.lock, flags); + + return ret; +} +EXPORT_SYMBOL_GPL(dma_fence_cancel_callback); + +/** + * dma_fence_wait - Wait for a fence to be signaled. + * + * @fence: The fence to wait on + * @interruptible: if true, do an interruptible wait + * @timeout: absolute time for timeout, in jiffies. + * + * Returns 0 on success, -EBUSY if a timeout occured, + * -ERESTARTSYS if the wait was interrupted by a signal. 
+ */ +int dma_fence_wait(struct dma_fence *fence, bool interruptible, unsigned long timeout) +{ + unsigned long cur; + int ret; + + if (WARN_ON(!fence)) + return -EINVAL; + + cur = jiffies; + if (time_after_eq(cur, timeout)) + return -EBUSY; + + timeout -= cur; + + ret = check_signaling(fence); + if (ret == -ENOENT) + return 0; + else if (ret) + return ret; + + if (interruptible) + ret = wait_event_interruptible_timeout(fence->event_queue, + fence->signaled, + timeout); + else + ret = wait_event_timeout(fence->event_queue, + fence->signaled, timeout); + + WARN(1, "wait_event_timeout(%u) returns %i", interruptible, ret); + if (ret > 0) + return 0; + else if (!ret) + return -EBUSY; + else + return ret; +} +EXPORT_SYMBOL_GPL(dma_fence_wait); + +int __dma_fence_wake_func(wait_queue_t *wait, unsigned mode, + int flags, void *key) +{ + struct dma_fence_cb *cb = + container_of(wait, struct dma_fence_cb, base); + struct dma_fence *fence = cb->fence; + int ret; + + ret = cb->func(cb, fence); + cb->fence = NULL; + + return ret; +} +EXPORT_SYMBOL_GPL(__dma_fence_wake_func); + +/* + * Helpers intended to be used by the ops of the dma_fence implementation: + * + * NOTE: helpers and fxns intended to be used by other dma-fence + * implementations are not exported.. I'm not really sure if it makes + * sense to have a dma-fence implementation that is itself a module. + */ + +void __dma_fence_init(struct dma_fence *fence, struct dma_fence_ops *ops, void *priv) +{ + WARN_ON(!ops || !ops->enable_signaling); + + kref_init(&fence->refcount); + fence->ops = ops; + fence->priv = priv; + init_waitqueue_head(&fence->event_queue); +} +EXPORT_SYMBOL_GPL(__dma_fence_init); + +/* + * Pure sw implementation for dma-fence. The CPU always gets involved. 
+ */ + +static int sw_enable_signaling(struct dma_fence *fence) +{ + /* + * pure sw, no irq's to enable, because the fence creator will + * always call dma_fence_signal() + */ + return 0; +} + +static struct dma_fence_ops sw_fence_ops = { + .enable_signaling = sw_enable_signaling, +}; + +/** + * dma_fence_create - Create a simple sw-only fence. + * + * This fence only supports signaling from/to CPU. Other implementations + * of dma-fence can be used to support hardware to hardware signaling, if + * supported by the hardware, and use the dma_fence_helper_* functions for + * compatibility with other devices that only support sw signaling. + */ +struct dma_fence *dma_fence_create(void) +{ + struct dma_fence *fence; + + fence = kzalloc(sizeof(struct dma_fence), GFP_KERNEL); + if (!fence) + return ERR_PTR(-ENOMEM); + + __dma_fence_init(fence, &sw_fence_ops, 0); + + return fence; +} +EXPORT_SYMBOL_GPL(dma_fence_create); diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h new file mode 100644 index 0000000..648f136 --- /dev/null +++ b/include/linux/dma-fence.h @@ -0,0 +1,123 @@ +/* + * Fence mechanism for dma-buf to allow for asynchronous dma access + * + * Copyright (C) 2012 Texas Instruments + * Author: Rob Clark <rob.clark(a)linaro.org> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see <http://www.gnu.org/licenses/>. 
+ */ + +#ifndef __DMA_FENCE_H__ +#define __DMA_FENCE_H__ + +#include <linux/err.h> +#include <linux/list.h> +#include <linux/wait.h> +#include <linux/list.h> +#include <linux/dma-buf.h> + +struct dma_fence; +struct dma_fence_ops; +struct dma_fence_cb; + +struct dma_fence { + struct kref refcount; + struct dma_fence_ops *ops; + wait_queue_head_t event_queue; + void *priv; + + /* has this fence been signaled yet? */ + bool signaled : 1; + + /* do we have one or more waiters or callbacks? */ + bool needs_sw_signal : 1; +}; + +typedef int (*dma_fence_func_t)(struct dma_fence_cb *cb, + struct dma_fence *fence); + +struct dma_fence_cb { + wait_queue_t base; + dma_fence_func_t func; + + /* + * This is initialized when the cb is added, and NULL'd when it + * is canceled or expired, so can be used to for error checking + * if the cb is already pending. A dma_fence_cb can be pending + * on at most one fence at a time. + */ + struct dma_fence *fence; +}; + +struct dma_fence_ops { + /** + * For fence implementations that have the capability for hw->hw + * signaling, they can implement this op to enable the necessary + * irqs, or insert commands into cmdstream, etc. This is called + * in the first wait() or add_callback() path to let the fence + * implementation know that there is another driver waiting on + * the signal (ie. hw->sw case). + * + * A return value of -ENOENT will indicate that the fence has + * already passed. + */ + int (*enable_signaling)(struct dma_fence *fence); + void (*release)(struct dma_fence *fence); +}; + +int __dma_fence_wake_func(wait_queue_t *wait, unsigned mode, + int flags, void *key); + +#define DMA_FENCE_CB_INITIALIZER(cb_func) { \ + .base = { .func = __dma_fence_wake_func }, \ + .func = (cb_func), \ + } + +#define DECLARE_DMA_FENCE_CB(name, cb_func) \ + struct dma_fence_cb name = DMA_FENCE_CB_INITIALIZER(cb_func) + + +/* + * TODO does it make sense to be able to enable dma-fence without dma-buf, + * or visa versa? 
+ */ +#ifdef CONFIG_DMA_SHARED_BUFFER + +/* create a basic (pure sw) fence: */ +struct dma_fence *dma_fence_create(void); + +/* intended to be used by other dma_fence implementations: */ +void __dma_fence_init(struct dma_fence *fence, struct dma_fence_ops *ops, void *priv); + +void dma_fence_get(struct dma_fence *fence); +void dma_fence_put(struct dma_fence *fence); +int dma_fence_signal(struct dma_fence *fence); + +int dma_fence_add_callback(struct dma_fence *fence, + struct dma_fence_cb *cb); +int dma_fence_cancel_callback(struct dma_fence_cb *cb); +int dma_fence_wait(struct dma_fence *fence, bool interruptible, unsigned long timeout); + +/* helpers intended to be used by the ops of the dma_fence implementation: */ +int dma_fence_helper_signal(struct dma_fence *fence); +int dma_fence_helper_add_callback(struct dma_fence *fence, + struct dma_fence_cb *cb); +int dma_fence_helper_cancel_callback(struct dma_fence_cb *cb); +int dma_fence_helper_wait(struct dma_fence *fence, bool interruptible, + long timeout); + +#else +// TODO +#endif /* CONFIG_DMA_SHARED_BUFFER */ + +#endif /* __DMA_FENCE_H__ */
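The one-shot semantics described in this patch (the first signal runs every registered callback synchronously, a second signal is rejected with -EINVAL, and registering a callback on an already signaled fence returns -ENOENT without calling it) can be modeled in a few lines of userspace C. This is a deliberately simplified, single-threaded sketch: the mock_fence types and functions are hypothetical, there is no locking, refcounting or enable_signaling step, and the callback signature differs from the dma_fence_func_t in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_fence;
typedef void (*mock_fence_func_t)(struct mock_fence *fence, void *priv);

/* One pending callback; linked into at most one fence at a time. */
struct mock_fence_cb {
	mock_fence_func_t func;
	void *priv;
	struct mock_fence_cb *next;
};

struct mock_fence {
	bool signaled;
	struct mock_fence_cb *callbacks;	/* singly linked list of waiters */
};

/* Register a callback; refuses with -ENOENT once the fence has signaled. */
static int mock_fence_add_callback(struct mock_fence *fence,
				   struct mock_fence_cb *cb,
				   mock_fence_func_t func, void *priv)
{
	assert(fence && cb && func);

	if (fence->signaled)
		return -ENOENT;

	cb->func = func;
	cb->priv = priv;
	cb->next = fence->callbacks;
	fence->callbacks = cb;
	return 0;
}

/* Signal once: detach the whole list, then run each callback synchronously. */
static int mock_fence_signal(struct mock_fence *fence)
{
	struct mock_fence_cb *cb;

	assert(fence);

	if (fence->signaled)
		return -EINVAL;

	fence->signaled = true;
	cb = fence->callbacks;
	fence->callbacks = NULL;

	while (cb) {
		struct mock_fence_cb *next = cb->next;

		cb->next = NULL;
		cb->func(fence, cb->priv);
		cb = next;
	}
	return 0;
}

/* Example callback: counts how many times it has been invoked. */
static void count_cb(struct mock_fence *fence, void *priv)
{
	(void)fence;
	++*(int *)priv;
}
```

Because a late add_callback() only reports -ENOENT and does not invoke the callback, a caller in this model has to treat that return value as "the condition already holds" and proceed directly.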

13 years, 6 months

