looping-in the linaro-mm-sig ML.
On Thu, Aug 30, 2012 at 4:47 PM, Aubertin, Guillaume <g-aubertin@ti.com> wrote:
Hi guys,
I've been working for a few days on getting a proper rmmod with the remoteproc/rpmsg modules, and I stumbled upon an interesting issue.
When doing successive memory allocations and releases in the CMA reservation (by loading/unloading the firmware several times), the following message shows up:
[ 119.908477] cma: dma_alloc_from_contiguous(cma ed10ad00, count 256, align 8)
[ 119.908843] cma: dma_alloc_from_contiguous(): memory range at c0dfb000 is busy, retrying
[ 119.909698] cma: dma_alloc_from_contiguous(): returned c0dfd000
dma_alloc_from_contiguous() then tries to allocate the next range, 0xc0dfd000, successfully this time.
In some cases, the allocation fails after trying several ranges :
[ 119.912231] cma: dma_alloc_from_contiguous(cma ed10ad00, count 768, align 8)
[ 119.912719] cma: dma_alloc_from_contiguous(): memory range at c0dff000 is busy, retrying
[ 119.913055] cma: dma_alloc_from_contiguous(): memory range at c0e01000 is busy, retrying
[ 119.913055] rproc remoteproc0: dma_alloc_coherent failed: 3145728
Here is my understanding so far :
First, even if we made a CMA reservation, the kernel can still allocate pages in this area, but these pages must be movable (user process pages, for example).
When dma_alloc_from_contiguous() is called to allocate X pages, it looks for the next X contiguous free pages in its CMA bitmap (respecting the requested alignment). Then alloc_contig_range() is called to allocate the given range of pages. alloc_contig_range() examines the pages we want to allocate, and if a page is already in use, it is migrated to a new page outside the range we want to reserve. This is done using isolate_migratepages_range() to list the pages to migrate and migrate_pages() to try to migrate them, and that's where it fails. Below is the subsequent chain of function calls:
fallback_migrate_page() --> migrate_page() --> try_to_release_page() --> try_to_free_buffers() --> drop_buffers() --> buffer_busy()
I understand here that the page contains buffers that are in use and can't be dropped, so the page can't be migrated. Well, I must admit that once here, I'm feeling a little lost in this ocean of memory management code ;). After some research, I found the following thread on the linux-arm-kernel ML discussing the same issue:
http://lists.infradead.org/pipermail/linux-arm-kernel/2012-June/102844.html with the following patch :
 mm/page_alloc.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0e1c6f5..c9a6483 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1310,7 +1310,8 @@ void free_hot_cold_page(struct page *page, int cold)
 	 * excessively into the page allocator
 	 */
 	if (migratetype >= MIGRATE_PCPTYPES) {
-		if (unlikely(migratetype == MIGRATE_ISOLATE)) {
+		if (unlikely(migratetype == MIGRATE_ISOLATE)
+		    || is_migrate_cma(migratetype)) {
 			free_one_page(zone, page, 0, migratetype);
 			goto out;
 		}
I tried the patch, and it seems to work (I didn't see any "memory range busy" message in 5000+ tests), but I'm afraid that it could have some nasty side effects.
Any ideas?
Thanks in advance,
Guillaume