On Sat, Aug 30, 2025 at 4:58 PM Barry Song <21cnbao@gmail.com> wrote:
> From: Barry Song <v-songbaohua@oppo.com>
>
> We can allocate high-order pages, but mapping them one by one is
> inefficient. This patch changes the code to map as large a chunk as
> possible. The code looks somewhat complicated mainly because supporting
> mmap with a non-zero offset is a bit tricky.
>
> Using the micro-benchmark below, we see that mmap becomes 3.5X faster:
> ...
It's been a while since I've done mm things, so take it with a pinch of salt, but this seems reasonable to me.
Though, one thought below...
> diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
> index bbe7881f1360..4c782fe33fd4 100644
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -186,20 +186,35 @@ static int system_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
>  	struct system_heap_buffer *buffer = dmabuf->priv;
>  	struct sg_table *table = &buffer->sg_table;
>  	unsigned long addr = vma->vm_start;
> -	struct sg_page_iter piter;
> -	int ret;
> +	unsigned long pgoff = vma->vm_pgoff;
> +	struct scatterlist *sg;
> +	int i, ret;
>
> -	for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
> -		struct page *page = sg_page_iter_page(&piter);
> +	for_each_sgtable_sg(table, sg, i) {
> +		unsigned long n = sg->length >> PAGE_SHIFT;
>
> -		ret = remap_pfn_range(vma, addr, page_to_pfn(page), PAGE_SIZE,
> -				      vma->vm_page_prot);
> +		if (pgoff < n)
> +			break;
> +		pgoff -= n;
> +	}
> +
> +	for (; sg && addr < vma->vm_end; sg = sg_next(sg)) {
> +		unsigned long n = (sg->length >> PAGE_SHIFT) - pgoff;
> +		struct page *page = sg_page(sg) + pgoff;
> +		unsigned long size = n << PAGE_SHIFT;
> +
> +		if (addr + size > vma->vm_end)
> +			size = vma->vm_end - addr;
> +
> +		ret = remap_pfn_range(vma, addr, page_to_pfn(page),
> +				      size, vma->vm_page_prot);
It feels like this sort of mapping loop for higher-order pages isn't a pattern unique to this code. Would it be better worked into a helper so it could be reused more generally?
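For what it's worth, here is a rough sketch of the kind of helper I mean, lifted straight from the loop in this patch. The name sgtable_remap_vma() and where it would live are purely illustrative (it is not an existing API), and I haven't thought hard about corner cases such as a pgoff past the end of the table:

#include <linux/mm.h>
#include <linux/scatterlist.h>

/*
 * Illustrative helper (hypothetical name): remap an sg_table into a VMA,
 * respecting vma->vm_pgoff and mapping each scatterlist entry as one
 * large chunk instead of page by page.
 */
static int sgtable_remap_vma(struct sg_table *table, struct vm_area_struct *vma)
{
	unsigned long addr = vma->vm_start;
	unsigned long pgoff = vma->vm_pgoff;
	struct scatterlist *sg;
	int i, ret;

	/* Skip whole entries that sit entirely before the mmap offset. */
	for_each_sgtable_sg(table, sg, i) {
		unsigned long n = sg->length >> PAGE_SHIFT;

		if (pgoff < n)
			break;
		pgoff -= n;
	}

	/* Map each remaining entry in one remap_pfn_range() call, clamped to the VMA. */
	for (; sg && addr < vma->vm_end; sg = sg_next(sg)) {
		unsigned long n = (sg->length >> PAGE_SHIFT) - pgoff;
		struct page *page = sg_page(sg) + pgoff;
		unsigned long size = n << PAGE_SHIFT;

		if (addr + size > vma->vm_end)
			size = vma->vm_end - addr;

		ret = remap_pfn_range(vma, addr, page_to_pfn(page),
				      size, vma->vm_page_prot);
		if (ret)
			return ret;

		addr += size;
		pgoff = 0;
	}

	return 0;
}

Then system_heap_mmap() would shrink to a one-line call, and other exporters that back buffers with an sg_table could reuse it.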
Otherwise, Acked-by: John Stultz <jstultz@google.com>
thanks -john