On 5/16/25 11:49, wangtao wrote:
>>>> Please try using udmabuf with sendfile() as confirmed to be working by T.J.
>>> [wangtao] Using buffer IO with dmabuf file read/write requires one memory copy.
>>> Direct IO removes this copy to enable zero-copy. The sendfile system
>>> call reduces memory copies from two (read/write) to one. However, with
>>> udmabuf, sendfile still keeps at least one copy, failing zero-copy.
>>
>>
>> Then please work on fixing this.
> [wangtao] What needs fixing? Does sendfile achieve zero-copy?
> sendfile reduces memory copies (from 2 to 1) for network sockets,
> but still requires one copy and cannot achieve zero copies.
Well, why not? sendfile() is the designated Linux uAPI for moving data between two files; splice() may also be appropriate.
The memory file descriptor and your destination file are both files, so those uAPIs apply.
Now what you suggest is to add a new IOCTL to do this in a very specific manner, just for the system DMA-buf heap. As far as I can see that is in general a complete no-go.
I mean I understand why you do this. Instead of improving the existing functionality you're just hacking something together because it is simple for you.
It might be possible to implement that generically for DMA-buf heaps if the udmabuf allocation overhead can't be reduced, but that is then just the second step.
Regards,
Christian.
Hi!
I previously discussed this with Simona on IRC but would like to get
some feedback also from a wider audience:
We're planning to share dma-bufs using a fast interconnect in a way
similar to pcie-p2p:
The rough plan is to identify dma-bufs capable of sharing this way by
looking at the address of the dma-buf ops and/or the importer_ops, to
conclude it's a device using the same driver (or possibly a child
driver), and then take special action when the DMA addresses are
obtained. Nothing is visible outside of the xe driver or its child
driver.
Are there any absolute "DON'T"s or recommendations to keep in mind with
respect to this approach?
Thanks,
Thomas
On Fri, May 16, 2025 at 02:02:29AM +0800, Xu Yilun wrote:
> > IMHO, it might be helpful if you could picture out the minimum
> > requirements (function/life cycle) for the current IOMMUFD TSM
> > bind architecture:
> >
> > 1. Host tsm_bind (preparation) is in IOMMUFD, triggered by QEMU handling
> > the TVM-HOST call.
> > 2. TDI acceptance is handled in guest_request() to accept the TDI after
> > the validation in the TVM.
>
> I'll try my best to brainstorm and make a flow in ASCII.
>
> (*) means new feature
>
>
> Guest Guest TSM QEMU VFIO IOMMUFD host TSM KVM
> ----- --------- ---- ---- ------- -------- ---
> 1. *Connect(IDE)
> 2. Init vdev
Open /dev/vfio/XX as a VFIO action. Then VFIO attaches to IOMMUFD as an
iommufd action, creating the idev.
> 3. *create dmabuf
> 4. *export dmabuf
> 5. create memslot
> 6. *import dmabuf
> 7. setup shared DMA
> 8. create hwpt
> 9. attach hwpt
> 10. kvm run
> 11.enum shared dev
> 12.*Connect(Bind)
> 13. *GHCI Bind
> 14. *Bind
> 15 CC viommu alloc
> 16. vdevice alloc
viommu and vdevice creation happen before KVM run. The vPCI function
is visible to the guest from the very start, even though it is in T=0
mode. If a platform does not require any special CC steps prior to KVM
run then it just has a NOP for these functions.
What you have here is some new BIND operation against the already
existing vdevice as we discussed earlier.
> 16. *attach vdev
> 17. *setup CC viommu
> 18 *tsm_bind
> 19. *bind
> 20.*Attest
> 21. *GHCI get CC info
> 22. *get CC info
> 23. *vdev guest req
> 24. *guest req
> 25.*Accept
> 26. *GHCI accept MMIO/DMA
> 27. *accept MMIO/DMA
> 28. *vdev guest req
> 29. *guest req
> 30. *map private MMIO
> 31. *GHCI start tdi
> 32. *start tdi
> 33. *vdev guest req
> 34. *guest req
This seems reasonable; you want some generic RPC scheme to carry
messages from the VM to the TSM, tunneled through the iommufd vdevice
(because the vdevice has the vPCI ID, the KVM ID, the vIOMMU ID, and
so on).
> 35.Workload...
> 36.*disconnect(Unbind)
> 37. *GHCI unbind
> 38. *Unbind
> 39. *detach vdev
Unbind the vdev; the vdev remains until KVM is stopped.
> 40. *tsm_unbind
> 41. *TDX stop tdi
> 42. *TDX disable mmio cb
> 43. *cb dmabuf revoke
> 44. *unmap private MMIO
> 45. *TDX disable dma cb
> 46. *cb disable CC viommu
I don't know why you'd disable a viommu while the VM is running; that
doesn't make sense.
> 47. *TDX tdi free
> 48. *enable mmio
> 49. *cb dmabuf recover
> 50.workable shared dev
This is a nice chart; it would be good to see a comparable chart for
AMD and ARM.
Jason
On Fri, May 16, 2025 at 12:04:04AM +0800, Xu Yilun wrote:
> > arches this was mostly invisible to the hypervisor?
>
> Attest & Accept can be invisible to the hypervisor, or the host can just
> help pass data blobs between guest, firmware & device.
>
> Bind cannot be host-agnostic; the host should be aware not to touch the
> device after Bind.
I'm not sure this is fully true; this could be an Intel thing. When the
vPCI is created the host can already know it shouldn't touch the PCI
device anymore and the secure world would enforce that when it gets a
bind command.
The fact it hasn't been locked out immediately at vPCI creation time
is sort of a detail that doesn't matter, IMHO.
> > It might be reasonable to have VFIO reach into iommufd to do that on
> > an already existing iommufd VDEVICE object. A little weird, but we
> > could probably make that work.
>
> Mm, are you proposing a uAPI in VFIO, and a kAPI from VFIO -> IOMMUFD like:
>
> ioctl(vfio_fd, VFIO_DEVICE_ATTACH_VDEV, vdev_id)
> -> iommufd_device_attach_vdev()
> -> tsm_tdi_bind()
Not ATTACH, you wanted BIND. You could have a VFIO_DEVICE_BIND(iommufd
vdevice id)
> > sees VFIO remove the S-EPT and release the KVM, then have iommufd
> > destroy the VDEVICE object.
>
> Regarding VM destroy, TDX Connect has more enforcement: the VM can only
> be destroyed after all assigned CC vPCI devices are destroyed.
And KVM destroys the VM?
> Nowadays, VFIO already holds a KVM reference, so we need
>
> close(vfio_fd)
> -> iommufd_device_detach_vdev()
This doesn't happen, though; it destroys the normal device (idev) which
the vdevice is stacked on top of. You'd have to make normal device
destruction trigger vdevice destruction.
> -> tsm_tdi_unbind()
> -> tdi stop
> -> callback to VFIO, dmabuf_move_notify(revoke)
> -> KVM unmap MMIO
> -> tdi metadata remove
This omits the viommu. It won't get destroyed until the iommufd
closes, so iommufd will be holding the kvm and it will do the final
put.
Jason
On 5/15/25 16:03, wangtao wrote:
> [wangtao] My Test Configuration (CPU 1GHz, 5-test average):
> Allocation: 32x32MB buffer creation
> - dmabuf 53ms vs. udmabuf 694ms (10X slower)
> - Note: shmem shows excessive allocation time
Yeah, that is something already noted by others as well. But that is orthogonal.
>
> Read 1024MB File:
> - dmabuf direct 326ms vs. udmabuf direct 461ms (40% slower)
> - Note: pin_user_pages_fast consumes majority CPU cycles
>
> Key function call timing: See details below.
Those aren't valid; you are comparing different functionalities here.
Please try using udmabuf with sendfile() as confirmed to be working by T.J.
Regards,
Christian.
On Wed, May 14, 2025 at 2:00 PM Song Liu <song(a)kernel.org> wrote:
>
> On Tue, May 13, 2025 at 9:36 AM T.J. Mercier <tjmercier(a)google.com> wrote:
> >
> > Use the same test buffers as the traditional iterator and a new BPF map
> > to verify the test buffers can be found with the open coded dmabuf
> > iterator.
> >
> > Signed-off-by: T.J. Mercier <tjmercier(a)google.com>
> > Acked-by: Christian König <christian.koenig(a)amd.com>
> > Acked-by: Song Liu <song(a)kernel.org>
> > ---
> > .../testing/selftests/bpf/bpf_experimental.h | 5 +++
> > .../selftests/bpf/prog_tests/dmabuf_iter.c | 41 +++++++++++++++++++
> > .../testing/selftests/bpf/progs/dmabuf_iter.c | 38 +++++++++++++++++
> > 3 files changed, 84 insertions(+)
> >
> > diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
> > index 6535c8ae3c46..5e512a1d09d1 100644
> > --- a/tools/testing/selftests/bpf/bpf_experimental.h
> > +++ b/tools/testing/selftests/bpf/bpf_experimental.h
> > @@ -591,4 +591,9 @@ extern int bpf_iter_kmem_cache_new(struct bpf_iter_kmem_cache *it) __weak __ksym
> > extern struct kmem_cache *bpf_iter_kmem_cache_next(struct bpf_iter_kmem_cache *it) __weak __ksym;
> > extern void bpf_iter_kmem_cache_destroy(struct bpf_iter_kmem_cache *it) __weak __ksym;
> >
> > +struct bpf_iter_dmabuf;
> > +extern int bpf_iter_dmabuf_new(struct bpf_iter_dmabuf *it) __weak __ksym;
> > +extern struct dma_buf *bpf_iter_dmabuf_next(struct bpf_iter_dmabuf *it) __weak __ksym;
> > +extern void bpf_iter_dmabuf_destroy(struct bpf_iter_dmabuf *it) __weak __ksym;
> > +
> > #endif
> > diff --git a/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c b/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c
> > index dc740bd0e2bd..6c2b0c3dbcd8 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c
> > @@ -219,14 +219,52 @@ static void subtest_dmabuf_iter_check_default_iter(struct dmabuf_iter *skel)
> > close(iter_fd);
> > }
> >
> > +static void subtest_dmabuf_iter_check_open_coded(struct dmabuf_iter *skel, int map_fd)
> > +{
> > + LIBBPF_OPTS(bpf_test_run_opts, topts);
> > + char key[DMA_BUF_NAME_LEN];
> > + int err, fd;
> > + bool found;
> > +
> > + /* No need to attach it, just run it directly */
> > + fd = bpf_program__fd(skel->progs.iter_dmabuf_for_each);
> > +
> > + err = bpf_prog_test_run_opts(fd, &topts);
> > + if (!ASSERT_OK(err, "test_run_opts err"))
> > + return;
> > + if (!ASSERT_OK(topts.retval, "test_run_opts retval"))
> > + return;
> > +
> > + if (!ASSERT_OK(bpf_map_get_next_key(map_fd, NULL, key), "get next key"))
> > + return;
> > +
> > + do {
> > + ASSERT_OK(bpf_map_lookup_elem(map_fd, key, &found), "lookup");
> > + ASSERT_TRUE(found, "found test buffer");
>
> This check failed once in the CI, on s390:
>
> Error: #89/3 dmabuf_iter/open_coded
> 9309 subtest_dmabuf_iter_check_open_coded:PASS:test_run_opts err 0 nsec
> 9310 subtest_dmabuf_iter_check_open_coded:PASS:test_run_opts retval 0 nsec
> 9311 subtest_dmabuf_iter_check_open_coded:PASS:get next key 0 nsec
> 9312 subtest_dmabuf_iter_check_open_coded:PASS:lookup 0 nsec
> 9313 subtest_dmabuf_iter_check_open_coded:FAIL:found test buffer
> unexpected found test buffer: got FALSE
>
> But it passed in the rerun. It is probably a bit flakey. Maybe we need some
> barrier somewhere.
>
> Here is the failure:
>
> https://github.com/kernel-patches/bpf/actions/runs/15002058808/job/42234864…
>
> To see the log, you need to log in GitHub.
>
> Thanks,
> Song
Thanks, yeah I have been trying to run this locally today but still
working on setting up an environment for it. Daniel Xu thoughtfully
suggested I use a github PR to trigger CI, but I tried that last week
without success: https://github.com/kernel-patches/bpf/pull/8910
I'm not sure if this is the cause (doesn't show up on the runs that
pass) but I have no idea why that would be intermittently failing:
libbpf: Error in bpf_create_map_xattr(testbuf_hash): -EINVAL. Retrying
without BTF.
> > + } while (bpf_map_get_next_key(map_fd, key, key));
> > +}
>
> [...]
On Wed, May 14, 2025 at 1:53 PM Song Liu <song(a)kernel.org> wrote:
>
> On Tue, May 13, 2025 at 9:36 AM T.J. Mercier <tjmercier(a)google.com> wrote:
> >
> > This test creates a udmabuf, and a dmabuf from the system dmabuf heap,
> > and uses a BPF program that prints dmabuf metadata with the new
> > dmabuf_iter to verify they can be found.
> >
> > Signed-off-by: T.J. Mercier <tjmercier(a)google.com>
> > Acked-by: Christian König <christian.koenig(a)amd.com>
>
> Acked-by: Song Liu <song(a)kernel.org>
Thanks.
>
> With one more comment below.
>
> [...]
>
> > diff --git a/tools/testing/selftests/bpf/progs/dmabuf_iter.c b/tools/testing/selftests/bpf/progs/dmabuf_iter.c
> > new file mode 100644
> > index 000000000000..2a1b5397196d
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/dmabuf_iter.c
> > @@ -0,0 +1,53 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2025 Google LLC */
> > +#include <vmlinux.h>
> > +#include <bpf/bpf_core_read.h>
> > +#include <bpf/bpf_helpers.h>
> > +
> > +/* From uapi/linux/dma-buf.h */
> > +#define DMA_BUF_NAME_LEN 32
> > +
> > +char _license[] SEC("license") = "GPL";
> > +
> > +/*
> > + * Fields output by this iterator are delimited by newlines. Convert any
> > + * newlines in user-provided printed strings to spaces.
> > + */
> > +static void sanitize_string(char *src, size_t size)
> > +{
> > + for (char *c = src; *c && (size_t)(c - src) < size; ++c)
>
> We should do the size check first, right? IOW:
>
> for (char *c = src; (size_t)(c - src) < size && *c; ++c)
Yeah, only if you call the function with size = 0, which is kinda
questionable and not possible with the non-zero array size that is
tied to immutable UAPI. Let's change it like you suggest.
>
> > + if (*c == '\n')
> > + *c = ' ';
> > +}
> > +
> [...]
From: Rob Clark <robdclark(a)chromium.org>
Conversion to DRM GPU VA Manager[1], and adding support for Vulkan Sparse
Memory[2] in the form of:
1. A new VM_BIND submitqueue type for executing VM MSM_SUBMIT_BO_OP_MAP/
MAP_NULL/UNMAP commands
2. A new VM_BIND ioctl to allow submitting batches of one or more
MAP/MAP_NULL/UNMAP commands to a VM_BIND submitqueue
I did not implement support for synchronous VM_BIND commands. Since
userspace could just immediately wait for the `SUBMIT` to complete, I don't
think we need this extra complexity in the kernel. Synchronous/immediate
VM_BIND operations could be implemented with a 2nd VM_BIND submitqueue.
The corresponding mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533
Changes in v4:
- Various locking/etc fixes
- Optimize the pgtable preallocation. If userspace sorts the VM_BIND ops
then the kernel detects ops that fall into the same 2MB last level PTD
to avoid duplicate page preallocation.
- Add way to throttle pushing jobs to the scheduler, to cap the amount of
potentially temporary prealloc'd pgtable pages.
- Add vm_log to devcoredump for debugging. If the vm_log_shift module
param is set, keep a log of the last 1<<vm_log_shift VM updates for
easier debugging of faults/crashes.
- Link to v3: https://lore.kernel.org/all/20250428205619.227835-1-robdclark@gmail.com/
Changes in v3:
- Switched to separate VM_BIND ioctl. This makes the UABI a bit
  cleaner, but OTOH the userspace code was cleaner when the end result
  of either type of VkQueue led to the same ioctl. So I'm a bit on
  the fence.
- Switched to doing the gpuvm bookkeeping synchronously, and only
deferring the pgtable updates. This avoids needing to hold any resv
locks in the fence signaling path, resolving the last shrinker related
lockdep complaints. OTOH it means userspace can trigger invalid
pgtable updates with multiple VM_BIND queues. In this case, we ensure
that unmaps happen completely (to prevent userspace from using this to
access free'd pages), mark the context as unusable, and move on with
life.
- Link to v2: https://lore.kernel.org/all/20250319145425.51935-1-robdclark@gmail.com/
Changes in v2:
- Dropped Bibek Kumar Patro's arm-smmu patches[3], which have since been
merged.
- Pre-allocate all the things, and drop HACK patch which disabled shrinker.
This includes ensuring that vm_bo objects are allocated up front, pre-
allocating VMA objects, and pre-allocating pages used for pgtable updates.
The latter utilizes io_pgtable_cfg callbacks for pgtable alloc/free, that
were initially added for panthor.
- Add back support for BO dumping for devcoredump.
- Link to v1 (RFC): https://lore.kernel.org/dri-devel/20241207161651.410556-1-robdclark@gmail.c…
[1] https://www.kernel.org/doc/html/next/gpu/drm-mm.html#drm-gpuvm
[2] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html
[3] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=909700
Rob Clark (40):
drm/gpuvm: Don't require obj lock in destructor path
drm/gpuvm: Allow VAs to hold soft reference to BOs
drm/gem: Add ww_acquire_ctx support to drm_gem_lru_scan()
drm/sched: Add enqueue credit limit
iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()
drm/msm: Rename msm_file_private -> msm_context
drm/msm: Improve msm_context comments
drm/msm: Rename msm_gem_address_space -> msm_gem_vm
drm/msm: Remove vram carveout support
drm/msm: Collapse vma allocation and initialization
drm/msm: Collapse vma close and delete
drm/msm: Don't close VMAs on purge
drm/msm: drm_gpuvm conversion
drm/msm: Convert vm locking
drm/msm: Use drm_gpuvm types more
drm/msm: Split out helper to get iommu prot flags
drm/msm: Add mmu support for non-zero offset
drm/msm: Add PRR support
drm/msm: Rename msm_gem_vma_purge() -> _unmap()
drm/msm: Drop queued submits on lastclose()
drm/msm: Lazily create context VM
drm/msm: Add opt-in for VM_BIND
drm/msm: Mark VM as unusable on GPU hangs
drm/msm: Add _NO_SHARE flag
drm/msm: Crashdump prep for sparse mappings
drm/msm: rd dumping prep for sparse mappings
drm/msm: Crashdec support for sparse
drm/msm: rd dumping support for sparse
drm/msm: Extract out syncobj helpers
drm/msm: Use DMA_RESV_USAGE_BOOKKEEP/KERNEL
drm/msm: Add VM_BIND submitqueue
drm/msm: Support IO_PGTABLE_QUIRK_NO_WARN_ON
drm/msm: Support pgtable preallocation
drm/msm: Split out map/unmap ops
drm/msm: Add VM_BIND ioctl
drm/msm: Add VM logging for VM_BIND updates
drm/msm: Add VMA unmap reason
drm/msm: Add mmu prealloc tracepoint
drm/msm: use trylock for debugfs
drm/msm: Bump UAPI version
drivers/gpu/drm/drm_gem.c | 14 +-
drivers/gpu/drm/drm_gpuvm.c | 15 +-
drivers/gpu/drm/msm/Kconfig | 1 +
drivers/gpu/drm/msm/Makefile | 1 +
drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 25 +-
drivers/gpu/drm/msm/adreno/a2xx_gpummu.c | 5 +-
drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 17 +-
drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 17 +-
drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +-
drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 22 +-
drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +-
drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 10 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 32 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 49 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 6 +-
drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 10 +-
drivers/gpu/drm/msm/adreno/adreno_device.c | 4 -
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 99 +-
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 23 +-
.../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 14 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 18 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 2 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 18 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 14 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 4 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 6 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 28 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 12 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 4 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 19 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 12 +-
drivers/gpu/drm/msm/dsi/dsi_host.c | 14 +-
drivers/gpu/drm/msm/msm_drv.c | 184 +--
drivers/gpu/drm/msm/msm_drv.h | 35 +-
drivers/gpu/drm/msm/msm_fb.c | 18 +-
drivers/gpu/drm/msm/msm_fbdev.c | 2 +-
drivers/gpu/drm/msm/msm_gem.c | 494 +++---
drivers/gpu/drm/msm/msm_gem.h | 247 ++-
drivers/gpu/drm/msm/msm_gem_prime.c | 15 +
drivers/gpu/drm/msm/msm_gem_shrinker.c | 104 +-
drivers/gpu/drm/msm/msm_gem_submit.c | 295 ++--
drivers/gpu/drm/msm/msm_gem_vma.c | 1471 ++++++++++++++++-
drivers/gpu/drm/msm/msm_gpu.c | 214 ++-
drivers/gpu/drm/msm/msm_gpu.h | 144 +-
drivers/gpu/drm/msm/msm_gpu_trace.h | 14 +
drivers/gpu/drm/msm/msm_iommu.c | 302 +++-
drivers/gpu/drm/msm/msm_kms.c | 18 +-
drivers/gpu/drm/msm/msm_kms.h | 2 +-
drivers/gpu/drm/msm/msm_mmu.h | 38 +-
drivers/gpu/drm/msm/msm_rd.c | 62 +-
drivers/gpu/drm/msm/msm_ringbuffer.c | 10 +-
drivers/gpu/drm/msm/msm_submitqueue.c | 96 +-
drivers/gpu/drm/msm/msm_syncobj.c | 172 ++
drivers/gpu/drm/msm/msm_syncobj.h | 37 +
drivers/gpu/drm/scheduler/sched_entity.c | 16 +-
drivers/gpu/drm/scheduler/sched_main.c | 3 +
drivers/iommu/io-pgtable-arm.c | 27 +-
include/drm/drm_gem.h | 10 +-
include/drm/drm_gpuvm.h | 12 +-
include/drm/gpu_scheduler.h | 13 +-
include/linux/io-pgtable.h | 8 +
include/uapi/drm/msm_drm.h | 149 +-
63 files changed, 3484 insertions(+), 1251 deletions(-)
create mode 100644 drivers/gpu/drm/msm/msm_syncobj.c
create mode 100644 drivers/gpu/drm/msm/msm_syncobj.h
--
2.49.0