Re: [PATCH v2 2/2] mm: cma: try next MAX_ORDER_NR_PAGES during retry

25 Jan 2022

On 12.01.22 14:15, Dong Aisheng wrote:
...
On an ARMv7 platform with 32M pageblock(MAX_ORDER 14), we observed a
Did you actually intend to talk about pageblocks here (and below)?
I assume you have to be clearer here that you talk about the maximum
allocation granularity, which is usually bigger than actual pageblock size.
...
huge number of repeated retries of CMA allocation (1k+) during boot
when allocating one page for each of the 3 mmc instance probes.
This is because CMA has supported concurrent allocation since commit
a4efc174b382 ("mm/cma.c: remove redundant cma_mutex lock").
The pageblock or (MAX_ORDER - 1) range from which we are trying to allocate
memory may have already been acquired and isolated by others.
The current cma_alloc() will then retry the next area in steps of
bitmap_no + mask + 1, which is very likely within the same isolated range
and fails again. So when the pageblock or MAX_ORDER is big (e.g. 8192),
retrying in small steps becomes meaningless because it is bound
to fail a huge number of times while the pageblock is isolated
by others, especially when allocating only one or two pages.
Instead of looping in the same pageblock and wasting CPU cycles,
especially on systems with big pageblocks (e.g. 16M or 32M),
we try the next MAX_ORDER_NR_PAGES directly.
Doing so can greatly mitigate the situation.
Below is the original error log during booting:
[    2.004804] cma: cma_alloc(cma (ptrval), count 1, align 0)
[    2.010318] cma: cma_alloc(cma (ptrval), count 1, align 0)
[    2.010776] cma: cma_alloc(): memory range at (ptrval) is busy, retrying
[    2.010785] cma: cma_alloc(): memory range at (ptrval) is busy, retrying
[    2.010793] cma: cma_alloc(): memory range at (ptrval) is busy, retrying
[    2.010800] cma: cma_alloc(): memory range at (ptrval) is busy, retrying
[    2.010807] cma: cma_alloc(): memory range at (ptrval) is busy, retrying
[    2.010814] cma: cma_alloc(): memory range at (ptrval) is busy, retrying
.... (+1K retries)
After the fix, the 1200+ retries are reduced to 0.
Another test running 8 VPU decoders in parallel shows that 1500+ retries
dropped to ~145.
IOW this patch can improve CMA allocation speed a lot when there is
enough CMA memory by reducing retries significantly.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org> # 5.11+
Fixes: a4efc174b382 ("mm/cma.c: remove redundant cma_mutex lock")
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>

v1->v2:

change to align with MAX_ORDER_NR_PAGES instead of pageblock_nr_pages


mm/cma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/cma.c b/mm/cma.c
index 1c13a729d274..1251f65e2364 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -500,7 +500,9 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
 		trace_cma_alloc_busy_retry(cma->name, pfn, pfn_to_page(pfn),
 					   count, align);
 		/* try again with a bit different memory target */
-		start = bitmap_no + mask + 1;
+		start = ALIGN(bitmap_no + mask + 1,
+			      MAX_ORDER_NR_PAGES >> cma->order_per_bit);

Mind giving the reader a hint in the code why we went for
MAX_ORDER_NR_PAGES?
What would happen if the CMA granularity is bigger than
MAX_ORDER_NR_PAGES? I'd assume no harm done, as we'd try aligning to 0.
-- 
Thanks,

David / dhildenb


    
