On 6/6/25 11:52, wangtao wrote:
-----Original Message-----
From: Christoph Hellwig <hch@infradead.org>
Sent: Tuesday, June 3, 2025 9:20 PM
To: Christian König <christian.koenig@amd.com>
Subject: Re: [PATCH v4 0/4] Implement dmabuf direct I/O via copy_file_range
On Tue, Jun 03, 2025 at 03:14:20PM +0200, Christian König wrote:
On 6/3/25 15:00, Christoph Hellwig wrote:
This is a really weird interface. No one has yet explained why dmabuf is so special that we can't support direct I/O to it, when we can support it for otherwise exotic mappings like PCI P2P ones.
With udmabuf you can do direct I/O; it's just inefficient to walk the page tables for it when you already have an array of all the folios.
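[Editorial note: for reference, a minimal, untested userspace sketch (error handling omitted) of the path being discussed here: a udmabuf built from a sealed memfd is mmap()ed and then filled with an O_DIRECT read, which forces the kernel to pin the backing folios by walking the page tables. The file name "data.bin" and the buffer size are arbitrary.]

	#define _GNU_SOURCE
	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/ioctl.h>
	#include <sys/mman.h>
	#include <unistd.h>
	#include <linux/udmabuf.h>

	int main(void)
	{
		const size_t size = 1 << 20;	/* 1 MiB, page aligned for O_DIRECT */

		/* udmabuf takes a sealed memfd as its page provider. */
		int memfd = memfd_create("udmabuf-src", MFD_ALLOW_SEALING);
		ftruncate(memfd, size);
		fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);

		struct udmabuf_create create = {
			.memfd = memfd, .offset = 0, .size = size,
		};
		int devfd = open("/dev/udmabuf", O_RDWR);
		int dmabuf = ioctl(devfd, UDMABUF_CREATE, &create); /* returns dma-buf fd */

		void *buf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
				 dmabuf, 0);

		/* Direct I/O straight into the dma-buf backed mapping: the kernel
		 * must pin the folios by walking the page tables (GUP), which is
		 * the overhead the series tries to avoid. */
		int datafd = open("data.bin", O_RDONLY | O_DIRECT);
		ssize_t n = read(datafd, buf, size);
		printf("read %zd bytes\n", n);

		return 0;
	}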
Does it matter compared to the I/O in this case?
Either way, there has been talk (in the case of networking implementations) of using a dmabuf as a first-class container for lower-level I/O. I'd much rather do that than add odd side interfaces, i.e. have a version of splice that doesn't bother with the pipe, but instead just uses in-kernel direct I/O on one side and dmabuf-provided folios on the other.
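[Editorial note: one way to read that proposal is an interface along the following lines. This is purely a hypothetical sketch; neither function below exists in the kernel today, and both names are invented for illustration only.]

	#include <linux/dma-buf.h>
	#include <linux/fs.h>

	/* Hypothetical exporter op: expose the folios backing the buffer. */
	struct folio **dma_buf_get_folios(struct dma_buf *dmabuf, unsigned int *nr);

	/* Hypothetical helper: read @len bytes at @pos from @file using
	 * in-kernel direct I/O, landing the data in the dma-buf's folios,
	 * with no pipe in the middle. */
	ssize_t dma_buf_direct_splice_read(struct file *file, loff_t pos,
					   struct dma_buf *dmabuf, size_t len);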
If the VFS layer recognized the dmabuf type and acquired its sg_table and folios, zero-copy could also be achieved. I initially thought of dmabuf as a driver object that shouldn't be handled by the VFS, so I made dmabuf implement the copy_file_range callbacks to support direct I/O zero-copy. I'm open to both approaches. What do the VFS experts prefer?
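[Editorial note: as a rough illustration of the approach described here (a sketch under assumptions, not the actual patches): the dma-buf file implements ->copy_file_range, so a copy_file_range() from a regular file into the dma-buf fd can be routed to the exporter, which performs direct I/O against its own backing memory instead of having the VFS walk user page tables. dma_buf_direct_read() below is a placeholder name, not a real kernel API.]

	#include <linux/dma-buf.h>
	#include <linux/fs.h>

	/* Placeholder for an exporter-specific zero-copy hook; not a real API. */
	extern ssize_t dma_buf_direct_read(struct dma_buf *dmabuf,
					   struct file *file_in, loff_t pos_in,
					   loff_t pos_out, size_t len);

	static ssize_t dma_buf_copy_file_range(struct file *file_in, loff_t pos_in,
					       struct file *file_out, loff_t pos_out,
					       size_t len, unsigned int flags)
	{
		struct dma_buf *dmabuf = file_out->private_data;

		/* Let the exporter issue direct I/O from file_in into the
		 * buffer's backing memory at pos_out. */
		return dma_buf_direct_read(dmabuf, file_in, pos_in, pos_out, len);
	}

	static const struct file_operations dma_buf_fops_sketch = {
		.copy_file_range = dma_buf_copy_file_range,
		/* ... existing dma-buf file operations ... */
	};

[Userspace would then simply call copy_file_range(file_fd, NULL, dmabuf_fd, NULL, len, 0) to pull data from storage into the buffer.]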
That would probably be illegal. Using the sg_table in the DMA-buf implementation turned out to be a mistake.
The question Christoph raised was rather: why is your CPU so slow that walking the page tables has a significant overhead compared to the actual I/O?
Regards, Christian.
Regards, Wangtao.