This is a small fallout from work to allow batching WW mutex locks and
unlocks.
Our Wound-Wait mutexes actually don't use the Wound-Wait algorithm but
the Wait-Die algorithm. One could perhaps rename those mutexes tree-wide to
"Wait-Die mutexes" or "Deadlock Avoidance mutexes". Another approach suggested
here is to implement also the "Wound-Wait" algorithm as a per-WW-class
choice, as it has advantages in some cases. See for example
http://www.mathcs.emory.edu/~cheung/Courses/554/Syllabus/8-recv+serial/dead…
Now Wound-Wait is a preemptive algorithm, and the preemption is implemented
using a lazy scheme: if a wounded transaction is about to go to sleep on
a contended WW mutex, we return -EDEADLK. That is sufficient for deadlock
prevention. Since with WW mutexes we also require the aborted transaction to
sleep waiting to lock the WW mutex it was aborted on, this choice also provides
a suitable WW mutex to sleep on. If we were instead to return -EDEADLK on the
first WW mutex lock after the transaction was wounded, whether that WW mutex
was contended or not, the transaction might frequently be restarted without
a wait, which is far from optimal. Note also that with the lazy preemption
scheme, contrary to Wait-Die there will be no rollbacks on contention for
locks held by a transaction that has already completed its locking sequence.
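For context, the caller-side backoff pattern that both algorithms rely on looks
roughly like this (a minimal sketch against the existing ww_mutex API;
example_ww_class, struct example_obj and update_pair() are made-up names).
Under Wait-Die the -EDEADLK comes back to the younger waiter immediately; under
Wound-Wait it shows up in the wounded transaction the next time it blocks on a
contended ww_mutex:

#include <linux/ww_mutex.h>

static DEFINE_WW_CLASS(example_ww_class);	/* hypothetical WW class */

struct example_obj {
	struct ww_mutex lock;	/* ww_mutex_init(&obj->lock, &example_ww_class) */
};

/*
 * Lock two hypothetical objects, modify them, unlock. A complete
 * implementation would loop until no lock attempt returns -EDEADLK;
 * this sketch handles a single backoff only.
 */
static void update_pair(struct example_obj *obj_a, struct example_obj *obj_b)
{
	struct ww_acquire_ctx ctx;
	int ret;

	ww_acquire_init(&ctx, &example_ww_class);

	ret = ww_mutex_lock(&obj_a->lock, &ctx);	/* fresh ctx: no backoff possible yet */
	ret = ww_mutex_lock(&obj_b->lock, &ctx);
	if (ret == -EDEADLK) {
		/*
		 * We were told to back off: killed under Wait-Die, or lazily
		 * preempted on a contended lock after being wounded under
		 * Wound-Wait. Drop everything we hold, sleep on the contended
		 * lock, then take the remaining locks again.
		 */
		ww_mutex_unlock(&obj_a->lock);
		ww_mutex_lock_slow(&obj_b->lock, &ctx);
		ret = ww_mutex_lock(&obj_a->lock, &ctx);
	}
	ww_acquire_done(&ctx);

	/* ... modify both objects ... */

	ww_mutex_unlock(&obj_b->lock);
	ww_mutex_unlock(&obj_a->lock);
	ww_acquire_fini(&ctx);
}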
The modeset locks are then changed from Wait-Die to Wound-Wait since the
typical locking pattern of those locks very well matches the criterion for
a substantial reduction in the number of rollbacks. For reservation objects,
the benefit is less clear at this point, so they keep using Wait-Die.
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Gustavo Padovan <gustavo(a)padovan.org>
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Sean Paul <seanpaul(a)chromium.org>
Cc: David Airlie <airlied(a)linux.ie>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: "Paul E. McKenney" <paulmck(a)linux.vnet.ibm.com>
Cc: Josh Triplett <josh(a)joshtriplett.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Kate Stewart <kstewart(a)linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne(a)nexb.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: linux-doc(a)vger.kernel.org
Cc: linux-media(a)vger.kernel.org
Cc: linaro-mm-sig(a)lists.linaro.org
v2:
Updated the DEFINE_WW_CLASS() API; minor code and comment fixes as
detailed in each patch.
v3:
Included a cleanup patch from Peter Zijlstra. Documentation fixes and
a small correctness fix.
v4:
Reworked the correctness fix.
The issue:
Currently in ION, if you allocate uncached memory, it is possible that there
are still dirty lines in the cache, and often these dirty lines are the
zeros which were meant to clear out any sensitive kernel data.
What this means is that if you allocate uncached memory from ION, and then
subsequently write to that buffer (using the uncached mapping you are
provided by ION), then the data you have written could be corrupted at some
point in the future when a dirty line is evicted from the cache.
This also means there is a potential security issue. If an unprivileged
userspace user allocates uncached memory (for example from the system heap)
and then reads from that buffer (through the uncached mapping provided by
ION), and some of the zeros which were written to that memory are still
sitting in the cache, then this unprivileged user could read potentially
sensitive kernel data.
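To illustrate the exposure, a rough userspace sketch (assuming the post-4.12
staging ION UAPI, that the header is reachable as <linux/ion.h>, and that heap
id 0 is a system heap; all of that is illustrative only):

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/ion.h>		/* staging ION UAPI header; path is an assumption */

int main(void)
{
	struct ion_allocation_data alloc = {
		.len = 4096,
		.heap_id_mask = 1 << 0,	/* assumed: heap id 0 is the system heap */
		.flags = 0,		/* no ION_FLAG_CACHED -> uncached mapping */
	};
	int ion = open("/dev/ion", O_RDWR);
	unsigned char *p;

	if (ion < 0 || ioctl(ion, ION_IOC_ALLOC, &alloc) < 0)
		return 1;

	/* The returned dma-buf fd can be mapped by any unprivileged caller. */
	p = mmap(NULL, 4096, PROT_READ, MAP_SHARED, alloc.fd, 0);
	if (p == MAP_FAILED)
		return 1;

	/*
	 * This read goes around the CPU cache. If the zeroing done at
	 * allocation time is still sitting in dirty cache lines, the value
	 * read here is the old memory content, i.e. potentially sensitive
	 * kernel data.
	 */
	printf("first byte: %02x\n", p[0]);
	return 0;
}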
An unacceptable fix:
I have included some sample code which should resolve this issue for the
system heap and the cma heap on some architectures; however, this code
would not be acceptable for upstreaming since it uses hacks to clean
the cache.
Similar changes would need to be made to the carveout heap and the chunk heap.
I would appreciate some feedback on the proper way for ION to clean the
caches for memory it has allocated that is intended for uncached access.
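Purely as a strawman for that discussion, and assuming a heap could get hold
of a real, DMA-capable struct device (which it cannot today), cleaning through
the regular streaming DMA API might look roughly like this:

/*
 * Strawman only: clean the CPU cache for a newly allocated uncached buffer
 * through the streaming DMA API. "dev" is assumed to be a real device whose
 * DMA ops perform cache maintenance; no such device exists in ION today.
 */
static int heap_clean_for_uncached(struct device *dev, struct page *page,
				   size_t size)
{
	struct scatterlist sg;

	sg_init_table(&sg, 1);
	sg_set_page(&sg, page, size, 0);

	/* dma_map_sg() performs the clean/invalidate for the device ... */
	if (dma_map_sg(dev, &sg, 1, DMA_BIDIRECTIONAL) != 1)
		return -ENOMEM;

	/* ... and unmapping afterwards keeps the API usage balanced. */
	dma_unmap_sg(dev, &sg, 1, DMA_BIDIRECTIONAL);
	return 0;
}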
I realize that it may be tempting, as a solution to this problem, to
simply strip uncached support from ION. I hope that we can try to find a
solution which preserves uncached memory support, as ION uncached memory
is often used (though perhaps not in upstream code) in multimedia use
cases where no CPU access is required, in secure heap allocations, and in
cases where CPU access is minimal and uncached memory therefore performs
better.
Signed-off-by: Liam Mark <lmark(a)codeaurora.org>
---
drivers/staging/android/ion/ion.c | 16 ++++++++++++++++
drivers/staging/android/ion/ion.h | 5 ++++-
drivers/staging/android/ion/ion_cma_heap.c | 3 +++
drivers/staging/android/ion/ion_page_pool.c | 8 ++++++--
drivers/staging/android/ion/ion_system_heap.c | 7 ++++++-
5 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index 57e0d8035b2e..10e967b0a0f4 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -38,6 +38,22 @@ bool ion_buffer_cached(struct ion_buffer *buffer)
return !!(buffer->flags & ION_FLAG_CACHED);
}
+void ion_pages_sync_for_device(struct page *page, size_t size,
+ enum dma_data_direction dir)
+{
+ struct scatterlist sg;
+ struct device dev = {0};
+
+ /* hack, use dummy device */
+ arch_setup_dma_ops(&dev, 0, 0, NULL, false);
+
+ sg_init_table(&sg, 1);
+ sg_set_page(&sg, page, size, 0);
+ /* hack, use phys address for dma address */
+ sg_dma_address(&sg) = page_to_phys(page);
+ dma_sync_sg_for_device(&dev, &sg, 1, dir);
+}
+
/* this function should only be called while dev->lock is held */
static void ion_buffer_add(struct ion_device *dev,
struct ion_buffer *buffer)
diff --git a/drivers/staging/android/ion/ion.h b/drivers/staging/android/ion/ion.h
index a238f23c9116..227b9928d185 100644
--- a/drivers/staging/android/ion/ion.h
+++ b/drivers/staging/android/ion/ion.h
@@ -192,6 +192,9 @@ struct ion_heap {
*/
bool ion_buffer_cached(struct ion_buffer *buffer);
+void ion_pages_sync_for_device(struct page *page, size_t size,
+ enum dma_data_direction dir);
+
/**
* ion_buffer_fault_user_mappings - fault in user mappings of this buffer
* @buffer: buffer
@@ -333,7 +336,7 @@ struct ion_page_pool {
struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order,
bool cached);
void ion_page_pool_destroy(struct ion_page_pool *pool);
-struct page *ion_page_pool_alloc(struct ion_page_pool *pool);
+struct page *ion_page_pool_alloc(struct ion_page_pool *pool, bool *from_pool);
void ion_page_pool_free(struct ion_page_pool *pool, struct page *page);
/** ion_page_pool_shrink - shrinks the size of the memory cached in the pool
diff --git a/drivers/staging/android/ion/ion_cma_heap.c b/drivers/staging/android/ion/ion_cma_heap.c
index 49718c96bf9e..82e80621d114 100644
--- a/drivers/staging/android/ion/ion_cma_heap.c
+++ b/drivers/staging/android/ion/ion_cma_heap.c
@@ -59,6 +59,9 @@ static int ion_cma_allocate(struct ion_heap *heap, struct ion_buffer *buffer,
memset(page_address(pages), 0, size);
}
+ if (!ion_buffer_cached(buffer))
+ ion_pages_sync_for_device(pages, size, DMA_BIDIRECTIONAL);
+
table = kmalloc(sizeof(*table), GFP_KERNEL);
if (!table)
goto err;
diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index b3017f12835f..169a321778ed 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -63,7 +63,7 @@ static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high)
return page;
}
-struct page *ion_page_pool_alloc(struct ion_page_pool *pool)
+struct page *ion_page_pool_alloc(struct ion_page_pool *pool, bool *from_pool)
{
struct page *page = NULL;
@@ -76,8 +76,12 @@ struct page *ion_page_pool_alloc(struct ion_page_pool *pool)
page = ion_page_pool_remove(pool, false);
mutex_unlock(&pool->mutex);
- if (!page)
+ if (!page) {
page = ion_page_pool_alloc_pages(pool);
+ *from_pool = false;
+ } else {
+ *from_pool = true;
+ }
return page;
}
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index bc19cdd30637..3bb4604e032b 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -57,13 +57,18 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
bool cached = ion_buffer_cached(buffer);
struct ion_page_pool *pool;
struct page *page;
+ bool from_pool;
if (!cached)
pool = heap->uncached_pools[order_to_index(order)];
else
pool = heap->cached_pools[order_to_index(order)];
- page = ion_page_pool_alloc(pool);
+ page = ion_page_pool_alloc(pool, &from_pool);
+
+ if (!from_pool && !ion_buffer_cached(buffer))
+ ion_pages_sync_for_device(page, PAGE_SIZE << order,
+ DMA_BIDIRECTIONAL);
return page;
}
--
1.8.5.2
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
Dear All,
The CMA-related functions cma_alloc() and dma_alloc_from_contiguous()
have a gfp_mask parameter, but sadly they only support the __GFP_NOWARN flag.
This gave their users the misleading impression that any standard memory
allocation flags are supported, which resulted in a security issue when a
caller set the __GFP_ZERO flag and expected the buffer to be cleared.
This patchset changes the gfp_mask parameter to a simple boolean no_warn
argument, which covers everything the underlying code actually supports.
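As a hedged illustration (the exact prototypes are in the patches below;
cma, page, nr_pages and align belong to a hypothetical caller), a caller that
previously relied on __GFP_ZERO would be converted roughly like this:

	/* Before: looks like a normal allocator, but only __GFP_NOWARN was
	 * honoured, so the __GFP_ZERO below was silently ignored and the
	 * buffer stayed dirty. */
	page = cma_alloc(cma, nr_pages, align,
			 GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN);

	/* After: the argument says exactly what the code supports ... */
	page = cma_alloc(cma, nr_pages, align, true /* no_warn */);
	/* ... and clearing, if needed, is done explicitly by the caller
	 * (sketch; assumes the pages live in the kernel's linear map). */
	if (page)
		memset(page_address(page), 0, nr_pages << PAGE_SHIFT);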
This patchset is a result of the following discussion:
https://patchwork.kernel.org/patch/10461919/
Best regards
Marek Szyprowski
Samsung R&D Institute Poland
Patch summary:
Marek Szyprowski (2):
mm/cma: remove unsupported gfp_mask parameter from cma_alloc()
dma: remove unsupported gfp_mask parameter from
dma_alloc_from_contiguous()
arch/arm/mm/dma-mapping.c | 5 +++--
arch/arm64/mm/dma-mapping.c | 4 ++--
arch/powerpc/kvm/book3s_hv_builtin.c | 2 +-
arch/xtensa/kernel/pci-dma.c | 2 +-
drivers/iommu/amd_iommu.c | 2 +-
drivers/iommu/intel-iommu.c | 3 ++-
drivers/s390/char/vmcp.c | 2 +-
drivers/staging/android/ion/ion_cma_heap.c | 2 +-
include/linux/cma.h | 2 +-
include/linux/dma-contiguous.h | 4 ++--
kernel/dma/contiguous.c | 6 +++---
kernel/dma/direct.c | 3 ++-
mm/cma.c | 8 ++++----
mm/cma_debug.c | 2 +-
14 files changed, 25 insertions(+), 22 deletions(-)
--
2.17.1
On Wed, Jul 04, 2018 at 09:26:39AM +0200, Tomeu Vizoso wrote:
> On 07/04/2018 07:53 AM, Gerd Hoffmann wrote:
> > On Tue, Jul 03, 2018 at 10:37:57AM +0200, Daniel Vetter wrote:
> > > On Tue, Jul 03, 2018 at 09:53:58AM +0200, Gerd Hoffmann wrote:
> > > > A driver to let userspace turn memfd regions into dma-bufs.
> > > >
> > > > Use case: Allows qemu to create dmabufs for the vga framebuffer or
> > > > virtio-gpu resources. Then they can be passed around to display
> > > > those guest things on the host. To spice client for classic full
> > > > framebuffer display, and hopefully some day to wayland server for
> > > > seamless guest window display.
> > > >
> > > > qemu test branch:
> > > > https://git.kraxel.org/cgit/qemu/log/?h=sirius/udmabuf
> > > >
> > > > Cc: David Airlie <airlied(a)linux.ie>
> > > > Cc: Tomeu Vizoso <tomeu.vizoso(a)collabora.com>
> > > > Cc: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
> > > > Cc: Daniel Vetter <daniel(a)ffwll.ch>
> > > > Signed-off-by: Gerd Hoffmann <kraxel(a)redhat.com>
> > >
> > > I think some ack for a 2nd use-case, like virtio-wl or whatever would be
> > > really cool. To give us some assurance that this is generically useful.
> >
> > Tomeu? Laurent?
>
> Sorry, but I think I will need some help to understand how this could help
> in the virtio-wl case [adding Zach Reizner to CC].
>
> Any graphics buffers that are allocated with memfd will be shared with the
> compositor via wl_shm, without need for dmabufs.
Within one machine, yes. Once virtualization is added to the mix things
become more complicated ...
When using virtio-gpu the guest will allocate graphics buffers from
normal (guest) ram, then register these buffers (which are allowed to be
scattered) with the host as a resource.
qemu can use memfd to allocate guest ram. Now, with the help of
udmabuf, qemu can create a *host* dma-buf for the *guest* graphics
buffer.
That dma-buf can be used by qemu internally (mmap it to get a linear
mapping of the resource, to avoid copying). It can be passed on to
spice-client, to display the guest framebuffer.
And I think it could also be quite useful to pass guest wayland windows
to the host compositor, without mapping host-allocated buffers into the
guest, so we don't have to deal with the "find some address space for
the mapping" issue in the first place. There are more things needed to
complete this of course, but it's a building block ...
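For reference, turning a memfd region into a dma-buf with the proposed driver
could look roughly like this from userspace (a sketch against the proposed
UAPI; the <linux/udmabuf.h> header name and the F_SEAL_SHRINK requirement are
assumptions on my side):

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/udmabuf.h>	/* proposed UAPI header */

int main(void)
{
	const __u64 size = 4 << 20;			/* 4 MiB of "guest ram" */
	int memfd = memfd_create("guest-ram", MFD_ALLOW_SEALING);
	struct udmabuf_create create = { 0 };
	int devfd, buffd;

	if (memfd < 0)
		return 1;
	ftruncate(memfd, size);
	fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);	/* assumed: size must not shrink */

	devfd = open("/dev/udmabuf", O_RDWR);
	if (devfd < 0)
		return 1;

	create.memfd  = memfd;
	create.offset = 0;				/* page aligned offset into the memfd */
	create.size   = size;				/* page aligned size */

	buffd = ioctl(devfd, UDMABUF_CREATE, &create);
	if (buffd < 0)
		return 1;

	/* buffd is now a dma-buf fd backed by the memfd pages: qemu can mmap
	 * it for a linear view, or pass it on to spice / a compositor. */
	return 0;
}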
cheers,
Gerd
- Intro section that links to how this is exposed to userspace.
- Lots more hyperlinks.
- Minor clarifications and style polish
v2: Add misplaced hunk of kerneldoc from a different patch.
Signed-off-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: Gustavo Padovan <gustavo(a)padovan.org>
Cc: linux-media(a)vger.kernel.org
Cc: linaro-mm-sig(a)lists.linaro.org
---
Documentation/driver-api/dma-buf.rst | 6 ++
drivers/dma-buf/dma-fence.c | 147 +++++++++++++++++++--------
2 files changed, 109 insertions(+), 44 deletions(-)
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
index dc384f2f7f34..b541e97c7ab1 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -130,6 +130,12 @@ Reservation Objects
DMA Fences
----------
+.. kernel-doc:: drivers/dma-buf/dma-fence.c
+ :doc: DMA fences overview
+
+DMA Fences Functions Reference
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
.. kernel-doc:: drivers/dma-buf/dma-fence.c
:export:
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 7a92f85a4cec..1551ca7df394 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -38,12 +38,43 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
*/
static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(0);
+/**
+ * DOC: DMA fences overview
+ *
+ * DMA fences, represented by &struct dma_fence, are the kernel internal
+ * synchronization primitive for DMA operations like GPU rendering, video
+ * encoding/decoding, or displaying buffers on a screen.
+ *
+ * A fence is initialized using dma_fence_init() and completed using
+ * dma_fence_signal(). Fences are associated with a context, allocated through
+ * dma_fence_context_alloc(), and all fences on the same context are
+ * fully ordered.
+ *
+ * Since the purposes of fences is to facilitate cross-device and
+ * cross-application synchronization, there's multiple ways to use one:
+ *
+ * - Individual fences can be exposed as a &sync_file, accessed as a file
+ * descriptor from userspace, created by calling sync_file_create(). This is
+ * called explicit fencing, since userspace passes around explicit
+ * synchronization points.
+ *
+ * - Some subsystems also have their own explicit fencing primitives, like
+ * &drm_syncobj. Compared to &sync_file, a &drm_syncobj allows the underlying
+ * fence to be updated.
+ *
+ * - Then there's also implicit fencing, where the synchronization points are
+ * implicitly passed around as part of shared &dma_buf instances. Such
+ * implicit fences are stored in &struct reservation_object through the
+ * &dma_buf.resv pointer.
+ */
+
/**
* dma_fence_context_alloc - allocate an array of fence contexts
- * @num: [in] amount of contexts to allocate
+ * @num: amount of contexts to allocate
*
- * This function will return the first index of the number of fences allocated.
- * The fence context is used for setting fence->context to a unique number.
+ * This function will return the first index of the number of fence contexts
+ * allocated. The fence context is used for setting &dma_fence.context to a
+ * unique number by passing the context to dma_fence_init().
*/
u64 dma_fence_context_alloc(unsigned num)
{
@@ -59,10 +90,14 @@ EXPORT_SYMBOL(dma_fence_context_alloc);
* Signal completion for software callbacks on a fence, this will unblock
* dma_fence_wait() calls and run all the callbacks added with
* dma_fence_add_callback(). Can be called multiple times, but since a fence
- * can only go from unsignaled to signaled state, it will only be effective
- * the first time.
+ * can only go from the unsignaled to the signaled state and not back, it will
+ * only be effective the first time.
+ *
+ * Unlike dma_fence_signal(), this function must be called with &dma_fence.lock
+ * held.
*
- * Unlike dma_fence_signal, this function must be called with fence->lock held.
+ * Returns 0 on success and a negative error value when @fence has been
+ * signalled already.
*/
int dma_fence_signal_locked(struct dma_fence *fence)
{
@@ -102,8 +137,11 @@ EXPORT_SYMBOL(dma_fence_signal_locked);
* Signal completion for software callbacks on a fence, this will unblock
* dma_fence_wait() calls and run all the callbacks added with
* dma_fence_add_callback(). Can be called multiple times, but since a fence
- * can only go from unsignaled to signaled state, it will only be effective
- * the first time.
+ * can only go from the unsignaled to the signaled state and not back, it will
+ * only be effective the first time.
+ *
+ * Returns 0 on success and a negative error value when @fence has been
+ * signalled already.
*/
int dma_fence_signal(struct dma_fence *fence)
{
@@ -136,9 +174,9 @@ EXPORT_SYMBOL(dma_fence_signal);
/**
* dma_fence_wait_timeout - sleep until the fence gets signaled
* or until timeout elapses
- * @fence: [in] the fence to wait on
- * @intr: [in] if true, do an interruptible wait
- * @timeout: [in] timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
+ * @fence: the fence to wait on
+ * @intr: if true, do an interruptible wait
+ * @timeout: timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
*
* Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or the
* remaining timeout in jiffies on success. Other error values may be
@@ -148,6 +186,8 @@ EXPORT_SYMBOL(dma_fence_signal);
* directly or indirectly (buf-mgr between reservation and committing)
* holds a reference to the fence, otherwise the fence might be
* freed before return, resulting in undefined behavior.
+ *
+ * See also dma_fence_wait() and dma_fence_wait_any_timeout().
*/
signed long
dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
@@ -167,6 +207,13 @@ dma_fence_wait_timeout(struct dma_fence *fence, bool intr, signed long timeout)
}
EXPORT_SYMBOL(dma_fence_wait_timeout);
+/**
+ * dma_fence_release - default release function for fences
+ * @kref: &dma_fence.refcount
+ *
+ * This is the default release function for &dma_fence. Drivers shouldn't call
+ * this directly, but instead call dma_fence_put().
+ */
void dma_fence_release(struct kref *kref)
{
struct dma_fence *fence =
@@ -184,6 +231,13 @@ void dma_fence_release(struct kref *kref)
}
EXPORT_SYMBOL(dma_fence_release);
+/**
+ * dma_fence_free - default release function for &dma_fence.
+ * @fence: fence to release
+ *
+ * This is the default implementation for &dma_fence_ops.release. It calls
+ * kfree_rcu() on @fence.
+ */
void dma_fence_free(struct dma_fence *fence)
{
kfree_rcu(fence, rcu);
@@ -192,10 +246,11 @@ EXPORT_SYMBOL(dma_fence_free);
/**
* dma_fence_enable_sw_signaling - enable signaling on fence
- * @fence: [in] the fence to enable
+ * @fence: the fence to enable
*
- * this will request for sw signaling to be enabled, to make the fence
- * complete as soon as possible
+ * This will request for sw signaling to be enabled, to make the fence
+ * complete as soon as possible. This calls &dma_fence_ops.enable_signaling
+ * internally.
*/
void dma_fence_enable_sw_signaling(struct dma_fence *fence)
{
@@ -220,24 +275,24 @@ EXPORT_SYMBOL(dma_fence_enable_sw_signaling);
/**
* dma_fence_add_callback - add a callback to be called when the fence
* is signaled
- * @fence: [in] the fence to wait on
- * @cb: [in] the callback to register
- * @func: [in] the function to call
+ * @fence: the fence to wait on
+ * @cb: the callback to register
+ * @func: the function to call
*
- * cb will be initialized by dma_fence_add_callback, no initialization
+ * @cb will be initialized by dma_fence_add_callback(), no initialization
* by the caller is required. Any number of callbacks can be registered
* to a fence, but a callback can only be registered to one fence at a time.
*
* Note that the callback can be called from an atomic context. If
* fence is already signaled, this function will return -ENOENT (and
- * *not* call the callback)
+ * *not* call the callback).
*
* Add a software callback to the fence. Same restrictions apply to
- * refcount as it does to dma_fence_wait, however the caller doesn't need to
- * keep a refcount to fence afterwards: when software access is enabled,
- * the creator of the fence is required to keep the fence alive until
- * after it signals with dma_fence_signal. The callback itself can be called
- * from irq context.
+ * refcount as it does to dma_fence_wait(), however the caller doesn't need to
+ * keep a refcount to fence after dma_fence_add_callback() has returned:
+ * when software access is enabled, the creator of the fence is required to keep
+ * the fence alive until after it signals with dma_fence_signal(). The callback
+ * itself can be called from irq context.
*
* Returns 0 in case of success, -ENOENT if the fence is already signaled
* and -EINVAL in case of error.
@@ -286,7 +341,7 @@ EXPORT_SYMBOL(dma_fence_add_callback);
/**
* dma_fence_get_status - returns the status upon completion
- * @fence: [in] the dma_fence to query
+ * @fence: the dma_fence to query
*
* This wraps dma_fence_get_status_locked() to return the error status
* condition on a signaled fence. See dma_fence_get_status_locked() for more
@@ -311,8 +366,8 @@ EXPORT_SYMBOL(dma_fence_get_status);
/**
* dma_fence_remove_callback - remove a callback from the signaling list
- * @fence: [in] the fence to wait on
- * @cb: [in] the callback to remove
+ * @fence: the fence to wait on
+ * @cb: the callback to remove
*
* Remove a previously queued callback from the fence. This function returns
* true if the callback is successfully removed, or false if the fence has
@@ -323,6 +378,9 @@ EXPORT_SYMBOL(dma_fence_get_status);
* doing, since deadlocks and race conditions could occur all too easily. For
* this reason, it should only ever be done on hardware lockup recovery,
* with a reference held to the fence.
+ *
+ * Behaviour is undefined if @cb has not been added to @fence using
+ * dma_fence_add_callback() beforehand.
*/
bool
dma_fence_remove_callback(struct dma_fence *fence, struct dma_fence_cb *cb)
@@ -359,9 +417,9 @@ dma_fence_default_wait_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
/**
* dma_fence_default_wait - default sleep until the fence gets signaled
* or until timeout elapses
- * @fence: [in] the fence to wait on
- * @intr: [in] if true, do an interruptible wait
- * @timeout: [in] timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
+ * @fence: the fence to wait on
+ * @intr: if true, do an interruptible wait
+ * @timeout: timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
*
* Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or the
* remaining timeout in jiffies on success. If timeout is zero the value one is
@@ -454,12 +512,12 @@ dma_fence_test_signaled_any(struct dma_fence **fences, uint32_t count,
/**
* dma_fence_wait_any_timeout - sleep until any fence gets signaled
* or until timeout elapses
- * @fences: [in] array of fences to wait on
- * @count: [in] number of fences to wait on
- * @intr: [in] if true, do an interruptible wait
- * @timeout: [in] timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
- * @idx: [out] the first signaled fence index, meaningful only on
- * positive return
+ * @fences: array of fences to wait on
+ * @count: number of fences to wait on
+ * @intr: if true, do an interruptible wait
+ * @timeout: timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
+ * @idx: used to store the first signaled fence index, meaningful only on
+ * positive return
*
* Returns -EINVAL on custom fence wait implementation, -ERESTARTSYS if
* interrupted, 0 if the wait timed out, or the remaining timeout in jiffies
@@ -468,6 +526,8 @@ dma_fence_test_signaled_any(struct dma_fence **fences, uint32_t count,
* Synchronous waits for the first fence in the array to be signaled. The
* caller needs to hold a reference to all fences in the array, otherwise a
* fence might be freed before return, resulting in undefined behavior.
+ *
+ * See also dma_fence_wait() and dma_fence_wait_timeout().
*/
signed long
dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
@@ -540,19 +600,18 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
/**
* dma_fence_init - Initialize a custom fence.
- * @fence: [in] the fence to initialize
- * @ops: [in] the dma_fence_ops for operations on this fence
- * @lock: [in] the irqsafe spinlock to use for locking this fence
- * @context: [in] the execution context this fence is run on
- * @seqno: [in] a linear increasing sequence number for this context
+ * @fence: the fence to initialize
+ * @ops: the dma_fence_ops for operations on this fence
+ * @lock: the irqsafe spinlock to use for locking this fence
+ * @context: the execution context this fence is run on
+ * @seqno: a linear increasing sequence number for this context
*
* Initializes an allocated fence, the caller doesn't have to keep its
* refcount after committing with this fence, but it will need to hold a
- * refcount again if dma_fence_ops.enable_signaling gets called. This can
- * be used for other implementing other types of fence.
+ * refcount again if &dma_fence_ops.enable_signaling gets called.
*
* context and seqno are used for easy comparison between fences, allowing
- * to check which fence is later by simply using dma_fence_later.
+ * to check which fence is later by simply using dma_fence_later().
*/
void
dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
--
2.18.0
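To make the "DMA fences overview" section added above a bit more concrete, a
minimal, hypothetical fence provider could look roughly like the sketch below;
none of this is part of the patch:

#include <linux/dma-fence.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(example_fence_lock);	/* lock shared by all example fences */

static const char *example_get_driver_name(struct dma_fence *f)
{
	return "example";
}

static const char *example_get_timeline_name(struct dma_fence *f)
{
	return "example-timeline";
}

static bool example_enable_signaling(struct dma_fence *f)
{
	/* Kick whatever eventually calls dma_fence_signal() on this fence. */
	return true;
}

static const struct dma_fence_ops example_fence_ops = {
	.get_driver_name = example_get_driver_name,
	.get_timeline_name = example_get_timeline_name,
	.enable_signaling = example_enable_signaling,
	.wait = dma_fence_default_wait,
	/* no .release: the default dma_fence_free() (kfree_rcu) is used */
};

static struct dma_fence *example_fence_create(u64 context, unsigned int seqno)
{
	struct dma_fence *fence = kzalloc(sizeof(*fence), GFP_KERNEL);

	if (!fence)
		return NULL;
	dma_fence_init(fence, &example_fence_ops, &example_fence_lock,
		       context, seqno);
	return fence;
}

/*
 * Typical usage: allocate one context per timeline, hand out fences on it,
 * signal them on completion, drop the reference when done:
 *
 *	u64 ctx = dma_fence_context_alloc(1);
 *	struct dma_fence *f = example_fence_create(ctx, 1);
 *	...
 *	dma_fence_signal(f);
 *	dma_fence_put(f);
 */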
Am 04.07.2018 um 11:09 schrieb Michel Dänzer:
> On 2018-07-04 10:31 AM, Christian König wrote:
>> Am 26.06.2018 um 16:31 schrieb Michel Dänzer:
>>> From: Michel Dänzer <michel.daenzer(a)amd.com>
>>>
>>> Fixes the BUG_ON spuriously triggering under the following
>>> circumstances:
>>>
>>> * ttm_eu_reserve_buffers processes a list containing multiple BOs using
>>> the same reservation object, so it calls
>>> reservation_object_reserve_shared with that reservation object once
>>> for each such BO.
>>> * In reservation_object_reserve_shared, old->shared_count ==
>>> old->shared_max - 1, so obj->staged is freed in preparation of an
>>> in-place update.
>>> * ttm_eu_fence_buffer_objects calls reservation_object_add_shared_fence
>>> once for each of the BOs above, always with the same fence.
>>> * The first call adds the fence in the remaining free slot, after which
>>> old->shared_count == old->shared_max.
>> Well, the explanation here is not correct. For multiple BOs using the
>> same reservation object we won't call
>> reservation_object_add_shared_fence() multiple times because we move
>> those to the duplicates list in ttm_eu_reserve_buffers().
>>
>> But this bug can still happen because we call
>> reservation_object_add_shared_fence() manually with fences for the same
>> context in a couple of places.
>>
>> One prominent case which comes to my mind are for the VM BOs during
>> updates. Another possibility are VRAM BOs which need to be cleared.
> Thanks. How about the following:
>
> * ttm_eu_reserve_buffers calls reservation_object_reserve_shared.
> * In reservation_object_reserve_shared, shared_count == shared_max - 1,
> so obj->staged is freed in preparation of an in-place update.
> * ttm_eu_fence_buffer_objects calls reservation_object_add_shared_fence,
> after which shared_count == shared_max.
> * The amdgpu driver also calls reservation_object_add_shared_fence for
> the same reservation object, and the BUG_ON triggers.
I would rather completely drop the reference to the ttm_eu_* functions,
because those wrappers are completely unrelated to the problem.
Instead let's just note something like the following:
* When reservation_object_reserve_shared is called with shared_count ==
shared_max - 1, obj->staged is freed in preparation for an in-place
update.
* Now reservation_object_add_shared_fence is called with the first fence,
and after that shared_count == shared_max.
* After that, reservation_object_add_shared_fence can be called with
follow-up fences from the same context, but since shared_count ==
shared_max we would run into this BUG_ON (roughly the sequence sketched
below).
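A rough sketch of that sequence (hypothetical caller; resv is assumed to be
locked, to currently have shared_count == shared_max - 1, and fence_a/fence_b
are assumed to come from the same fence context):

	/* Sketch of the problematic sequence, not real driver code. */
	reservation_object_reserve_shared(resv);	/* frees obj->staged, plans an
							 * in-place add */
	reservation_object_add_shared_fence(resv, fence_a);	/* fills the last slot:
								 * count == max */
	reservation_object_add_shared_fence(resv, fence_b);	/* same context: would just
								 * reuse fence_a's slot, but
								 * the old BUG_ON placement
								 * fired here anyway */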
> However, nothing bad would happen in
> reservation_object_add_shared_inplace, since all fences use the same
> context, so they can only occupy a single slot.
>
> Prevent this by moving the BUG_ON to where an overflow would actually
> happen (e.g. if a buggy caller didn't call
> reservation_object_reserve_shared before).
>
>
> Also, I'll add a reference to https://bugs.freedesktop.org/106418 in v2,
> as I suspect this fix is necessary under the circumstances described
> there as well.
The rest sounds good to me.
Regards,
Christian.
Am 26.06.2018 um 16:31 schrieb Michel Dänzer:
> From: Michel Dänzer <michel.daenzer(a)amd.com>
>
> Fixes the BUG_ON spuriously triggering under the following
> circumstances:
>
> * ttm_eu_reserve_buffers processes a list containing multiple BOs using
> the same reservation object, so it calls
> reservation_object_reserve_shared with that reservation object once
> for each such BO.
> * In reservation_object_reserve_shared, old->shared_count ==
> old->shared_max - 1, so obj->staged is freed in preparation of an
> in-place update.
> * ttm_eu_fence_buffer_objects calls reservation_object_add_shared_fence
> once for each of the BOs above, always with the same fence.
> * The first call adds the fence in the remaining free slot, after which
> old->shared_count == old->shared_max.
Well, the explanation here is not correct. For multiple BOs using the
same reservation object we won't call
reservation_object_add_shared_fence() multiple times because we move
those to the duplicates list in ttm_eu_reserve_buffers().
But this bug can still happen because we call
reservation_object_add_shared_fence() manually with fences for the same
context in a couple of places.
One prominent case which comes to my mind is the VM BOs during
updates. Another possibility is VRAM BOs which need to be cleared.
>
> In the next call to reservation_object_add_shared_fence, the BUG_ON
> triggers. However, nothing bad would happen in
> reservation_object_add_shared_inplace, since the fence is already in the
> reservation object.
>
> Prevent this by moving the BUG_ON to where an overflow would actually
> happen (e.g. if a buggy caller didn't call
> reservation_object_reserve_shared before).
>
> Cc: stable(a)vger.kernel.org
> Signed-off-by: Michel Dänzer <michel.daenzer(a)amd.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Regards,
Christian.
> ---
> drivers/dma-buf/reservation.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> index 314eb1071cce..532545b9488e 100644
> --- a/drivers/dma-buf/reservation.c
> +++ b/drivers/dma-buf/reservation.c
> @@ -141,6 +141,7 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
> if (signaled) {
> RCU_INIT_POINTER(fobj->shared[signaled_idx], fence);
> } else {
> + BUG_ON(fobj->shared_count >= fobj->shared_max);
> RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
> fobj->shared_count++;
> }
> @@ -230,10 +231,9 @@ void reservation_object_add_shared_fence(struct reservation_object *obj,
> old = reservation_object_get_list(obj);
> obj->staged = NULL;
>
> - if (!fobj) {
> - BUG_ON(old->shared_count >= old->shared_max);
> + if (!fobj)
> reservation_object_add_shared_inplace(obj, old, fence);
> - } else
> + else
> reservation_object_add_shared_replace(obj, old, fobj, fence);
> }
> EXPORT_SYMBOL(reservation_object_add_shared_fence);