From: Xueyuan Chen <Xueyuan.chen21@gmail.com>
Replace the heavy for_each_sgtable_page() iterator in system_heap_do_vmap() with a more efficient nested loop approach.
Instead of iterating page by page, we now iterate through the scatterlist entries via for_each_sgtable_sg(). Because pages within a single sg entry are physically contiguous, we can populate the page array in an inner loop using simple pointer math. This saves a lot of time.
The WARN_ON check is also pulled out of the loop to save branch instructions.
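For illustration only, the nested loop can be modelled in userspace as in the sketch below. This is not the kernel code: struct model_page, struct model_seg, MODEL_PAGE_SHIFT and fill_pages() are hypothetical stand-ins for struct page, struct scatterlist, PAGE_SHIFT and the loop in system_heap_do_vmap(), and the sketch assumes every segment length is a multiple of the page size.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size of 4 KiB for this model; the kernel uses PAGE_SHIFT. */
#define MODEL_PAGE_SHIFT 12

/* Hypothetical stand-in for struct page. */
struct model_page {
	int unused;
};

/*
 * Hypothetical stand-in for one scatterlist entry: a run of physically
 * contiguous pages, described by its first page and its length in bytes.
 */
struct model_seg {
	struct model_page *first_page;
	unsigned int length;
};

/*
 * Fill pages[] by walking the segments, not the individual pages.
 * Within a segment the pages are contiguous, so page j of the segment
 * is simply first_page + j. Returns the number of entries written so
 * that the caller can do a single sanity check after the loop.
 */
static size_t fill_pages(const struct model_seg *segs, size_t nsegs,
			 struct model_page **pages)
{
	struct model_page **tmp = pages;
	unsigned int j, count;
	size_t i;

	for (i = 0; i < nsegs; i++) {
		struct model_page *base_page = segs[i].first_page;

		count = segs[i].length >> MODEL_PAGE_SHIFT;
		for (j = 0; j < count; j++)
			*tmp++ = base_page + j;
	}

	return (size_t)(tmp - pages);
}
```

A caller would compare the return value against the expected page count once, after the loop, which mirrors the single WARN_ON(tmp - pages != npages) in the patch.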
Performance results mapping a 2GB buffer on Radxa O6:
- Before: ~1440000 ns
- After: ~232000 ns
(~84% reduction in iteration time, or ~6.2x faster)
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: John Stultz <jstultz@google.com>
Cc: T.J. Mercier <tjmercier@google.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Xueyuan Chen <Xueyuan.chen21@gmail.com>
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
---
 drivers/dma-buf/heaps/system_heap.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index b3650d8fd651..769f01f0cc96 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -224,16 +224,21 @@ static void *system_heap_do_vmap(struct system_heap_buffer *buffer)
 	int npages = PAGE_ALIGN(buffer->len) / PAGE_SIZE;
 	struct page **pages = vmalloc(sizeof(struct page *) * npages);
 	struct page **tmp = pages;
-	struct sg_page_iter piter;
 	void *vaddr;
+	u32 i, j, count;
+	struct page *base_page;
+	struct scatterlist *sg;
 
 	if (!pages)
 		return ERR_PTR(-ENOMEM);
 
-	for_each_sgtable_page(table, &piter, 0) {
-		WARN_ON(tmp - pages >= npages);
-		*tmp++ = sg_page_iter_page(&piter);
+	for_each_sgtable_sg(table, sg, i) {
+		base_page = sg_page(sg);
+		count = sg->length >> PAGE_SHIFT;
+		for (j = 0; j < count; j++)
+			*tmp++ = base_page + j;
 	}
+	WARN_ON(tmp - pages != npages);
 
 	vaddr = vmap(pages, npages, VM_MAP, PAGE_KERNEL);
 	vfree(pages);