The __vmap_pages_range_noflush() assumes its argument pages** contains pages with the same page shift. However, since commit e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations"), if gfp_flags includes __GFP_NOFAIL with high order in vm_area_alloc_pages() and page allocation failed for high order, the pages** may contain two different page shifts (high order and order-0). This could lead __vmap_pages_range_noflush() to perform incorrect mappings, potentially resulting in memory corruption.
Users might encounter this as follows (vmap_allow_huge = true, 2M is for PMD_SIZE): kvmalloc(2M, __GFP_NOFAIL|GFP_X) __vmalloc_node_range_noprof(vm_flags=VM_ALLOW_HUGE_VMAP) vm_area_alloc_pages(order=9) ---> order-9 allocation failed and fallback to order-0 vmap_pages_range() vmap_pages_range_noflush() __vmap_pages_range_noflush(page_shift = 21) ----> wrong mapping happens
We can remove the fallback code because if a high-order allocation fails, __vmalloc_node_range_noprof() will retry with order-0. Therefore, it is unnecessary to fallback to order-0 here. Therefore, fix this by removing the fallback code.
Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations") Signed-off-by: Hailong Liu hailong.liu@oppo.com Reported-by: Tangquan Zheng zhengtangquan@oppo.com Cc: stable@vger.kernel.org CC: Barry Song 21cnbao@gmail.com CC: Baoquan He bhe@redhat.com CC: Matthew Wilcox willy@infradead.org --- mm/vmalloc.c | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 6b783baf12a1..af2de36549d6 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -3584,15 +3584,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid, page = alloc_pages_noprof(alloc_gfp, order); else page = alloc_pages_node_noprof(nid, alloc_gfp, order); - if (unlikely(!page)) { - if (!nofail) - break; - - /* fall back to the zero order allocations */ - alloc_gfp |= __GFP_NOFAIL; - order = 0; - continue; - } + if (unlikely(!page)) + break;
/* * Higher order allocations must be able to be treated as --- Sorry for fat fingers. with .rej file. resend this.
Baoquan suggests set page_shift to 0 if fallback in (2 and concern about performance of retry with order-0. But IMO with retry, - Save memory usage if high order allocation failed. - Keep consistancy with align and page-shift. - make use of bulk allocator with order-0
[2] https://lore.kernel.org/lkml/20240725035318.471-1-hailong.liu@oppo.com/ -- 2.30.0
On 08/08/24 at 08:19pm, Hailong Liu wrote:
The __vmap_pages_range_noflush() assumes its argument pages** contains pages with the same page shift. However, since commit e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations"), if gfp_flags includes __GFP_NOFAIL with high order in vm_area_alloc_pages() and page allocation failed for high order, the pages** may contain two different page shifts (high order and order-0). This could lead __vmap_pages_range_noflush() to perform incorrect mappings, potentially resulting in memory corruption.
Users might encounter this as follows (vmap_allow_huge = true, 2M is for PMD_SIZE): kvmalloc(2M, __GFP_NOFAIL|GFP_X) __vmalloc_node_range_noprof(vm_flags=VM_ALLOW_HUGE_VMAP) vm_area_alloc_pages(order=9) ---> order-9 allocation failed and fallback to order-0 vmap_pages_range() vmap_pages_range_noflush() __vmap_pages_range_noflush(page_shift = 21) ----> wrong mapping happens
We can remove the fallback code because if a high-order allocation fails, __vmalloc_node_range_noprof() will retry with order-0. Therefore, it is unnecessary to fallback to order-0 here. Therefore, fix this by removing the fallback code.
Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations") Signed-off-by: Hailong Liu hailong.liu@oppo.com Reported-by: Tangquan Zheng zhengtangquan@oppo.com Cc: stable@vger.kernel.org CC: Barry Song 21cnbao@gmail.com CC: Baoquan He bhe@redhat.com CC: Matthew Wilcox willy@infradead.org
mm/vmalloc.c | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 6b783baf12a1..af2de36549d6 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -3584,15 +3584,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid, page = alloc_pages_noprof(alloc_gfp, order); else page = alloc_pages_node_noprof(nid, alloc_gfp, order);
if (unlikely(!page)) {
if (!nofail)
break;
/* fall back to the zero order allocations */
alloc_gfp |= __GFP_NOFAIL;
order = 0;
continue;
}
if (unlikely(!page))
break;
This looks great, thanks.
Reviewed-by: Baoquan He bhe@redhat.com
On Thu, Aug 08, 2024 at 08:19:56PM +0800, Hailong Liu wrote:
The __vmap_pages_range_noflush() assumes its argument pages** contains pages with the same page shift. However, since commit e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations"), if gfp_flags includes __GFP_NOFAIL with high order in vm_area_alloc_pages() and page allocation failed for high order, the pages** may contain two different page shifts (high order and order-0). This could lead __vmap_pages_range_noflush() to perform incorrect mappings, potentially resulting in memory corruption.
Users might encounter this as follows (vmap_allow_huge = true, 2M is for PMD_SIZE): kvmalloc(2M, __GFP_NOFAIL|GFP_X) __vmalloc_node_range_noprof(vm_flags=VM_ALLOW_HUGE_VMAP) vm_area_alloc_pages(order=9) ---> order-9 allocation failed and fallback to order-0 vmap_pages_range() vmap_pages_range_noflush() __vmap_pages_range_noflush(page_shift = 21) ----> wrong mapping happens
We can remove the fallback code because if a high-order allocation fails, __vmalloc_node_range_noprof() will retry with order-0. Therefore, it is unnecessary to fallback to order-0 here. Therefore, fix this by removing the fallback code.
Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations") Signed-off-by: Hailong Liu hailong.liu@oppo.com Reported-by: Tangquan Zheng zhengtangquan@oppo.com Cc: stable@vger.kernel.org CC: Barry Song 21cnbao@gmail.com CC: Baoquan He bhe@redhat.com CC: Matthew Wilcox willy@infradead.org
mm/vmalloc.c | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 6b783baf12a1..af2de36549d6 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -3584,15 +3584,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid, page = alloc_pages_noprof(alloc_gfp, order); else page = alloc_pages_node_noprof(nid, alloc_gfp, order);
if (unlikely(!page)) {
if (!nofail)
break;
/* fall back to the zero order allocations */
alloc_gfp |= __GFP_NOFAIL;
order = 0;
continue;
}
if (unlikely(!page))
break;
/*
- Higher order allocations must be able to be treated as
Sorry for fat fingers. with .rej file. resend this.
Baoquan suggests set page_shift to 0 if fallback in (2 and concern about performance of retry with order-0. But IMO with retry,
- Save memory usage if high order allocation failed.
- Keep consistancy with align and page-shift.
- make use of bulk allocator with order-0
[2] https://lore.kernel.org/lkml/20240725035318.471-1-hailong.liu@oppo.com/
2.30.0
Makes sense to me.
Reviewed-by: Uladzislau Rezki (Sony) urezki@gmail.com
Thank you!
-- Uladzislau Rezki
On Fri, Aug 9, 2024 at 12:20 AM Hailong Liu hailong.liu@oppo.com wrote:
The __vmap_pages_range_noflush() assumes its argument pages** contains pages with the same page shift. However, since commit e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations"), if gfp_flags includes __GFP_NOFAIL with high order in vm_area_alloc_pages() and page allocation failed for high order, the pages** may contain two different page shifts (high order and order-0). This could lead __vmap_pages_range_noflush() to perform incorrect mappings, potentially resulting in memory corruption.
Users might encounter this as follows (vmap_allow_huge = true, 2M is for PMD_SIZE): kvmalloc(2M, __GFP_NOFAIL|GFP_X) __vmalloc_node_range_noprof(vm_flags=VM_ALLOW_HUGE_VMAP) vm_area_alloc_pages(order=9) ---> order-9 allocation failed and fallback to order-0 vmap_pages_range() vmap_pages_range_noflush() __vmap_pages_range_noflush(page_shift = 21) ----> wrong mapping happens
We can remove the fallback code because if a high-order allocation fails, __vmalloc_node_range_noprof() will retry with order-0. Therefore, it is unnecessary to fallback to order-0 here. Therefore, fix this by removing the fallback code.
Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations") Signed-off-by: Hailong Liu hailong.liu@oppo.com Reported-by: Tangquan Zheng zhengtangquan@oppo.com Cc: stable@vger.kernel.org CC: Barry Song 21cnbao@gmail.com CC: Baoquan He bhe@redhat.com CC: Matthew Wilcox willy@infradead.org
Acked-by: Barry Song baohua@kernel.org
because we already have a fallback here:
void *__vmalloc_node_range_noprof :
fail: if (shift > PAGE_SHIFT) { shift = PAGE_SHIFT; align = real_align; size = real_size; goto again; }
mm/vmalloc.c | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 6b783baf12a1..af2de36549d6 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -3584,15 +3584,8 @@ vm_area_alloc_pages(gfp_t gfp, int nid, page = alloc_pages_noprof(alloc_gfp, order); else page = alloc_pages_node_noprof(nid, alloc_gfp, order);
if (unlikely(!page)) {
if (!nofail)
break;
/* fall back to the zero order allocations */
alloc_gfp |= __GFP_NOFAIL;
order = 0;
continue;
}
if (unlikely(!page))
break; /* * Higher order allocations must be able to be treated as
Sorry for fat fingers. with .rej file. resend this.
Baoquan suggests set page_shift to 0 if fallback in (2 and concern about performance of retry with order-0. But IMO with retry,
- Save memory usage if high order allocation failed.
- Keep consistancy with align and page-shift.
- make use of bulk allocator with order-0
[2] https://lore.kernel.org/lkml/20240725035318.471-1-hailong.liu@oppo.com/
2.30.0
On Fri 09-08-24 09:05:05, Barry Song wrote:
On Fri, Aug 9, 2024 at 12:20 AM Hailong Liu hailong.liu@oppo.com wrote:
The __vmap_pages_range_noflush() assumes its argument pages** contains pages with the same page shift. However, since commit e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations"), if gfp_flags includes __GFP_NOFAIL with high order in vm_area_alloc_pages() and page allocation failed for high order, the pages** may contain two different page shifts (high order and order-0). This could lead __vmap_pages_range_noflush() to perform incorrect mappings, potentially resulting in memory corruption.
Users might encounter this as follows (vmap_allow_huge = true, 2M is for PMD_SIZE): kvmalloc(2M, __GFP_NOFAIL|GFP_X) __vmalloc_node_range_noprof(vm_flags=VM_ALLOW_HUGE_VMAP) vm_area_alloc_pages(order=9) ---> order-9 allocation failed and fallback to order-0 vmap_pages_range() vmap_pages_range_noflush() __vmap_pages_range_noflush(page_shift = 21) ----> wrong mapping happens
We can remove the fallback code because if a high-order allocation fails, __vmalloc_node_range_noprof() will retry with order-0. Therefore, it is unnecessary to fallback to order-0 here. Therefore, fix this by removing the fallback code.
Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations") Signed-off-by: Hailong Liu hailong.liu@oppo.com Reported-by: Tangquan Zheng zhengtangquan@oppo.com Cc: stable@vger.kernel.org CC: Barry Song 21cnbao@gmail.com CC: Baoquan He bhe@redhat.com CC: Matthew Wilcox willy@infradead.org
Acked-by: Barry Song baohua@kernel.org
because we already have a fallback here:
void *__vmalloc_node_range_noprof :
fail: if (shift > PAGE_SHIFT) { shift = PAGE_SHIFT; align = real_align; size = real_size; goto again; }
This really deserves a comment because this is not really clear at all. The code is also fragile and it would benefit from some re-org.
Thanks for the fix.
Acked-by: Michal Hocko mhocko@suse.com
On Fri, Aug 09, 2024 at 11:33:06AM +0200, Michal Hocko wrote:
On Fri 09-08-24 09:05:05, Barry Song wrote:
On Fri, Aug 9, 2024 at 12:20 AM Hailong Liu hailong.liu@oppo.com wrote:
The __vmap_pages_range_noflush() assumes its argument pages** contains pages with the same page shift. However, since commit e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations"), if gfp_flags includes __GFP_NOFAIL with high order in vm_area_alloc_pages() and page allocation failed for high order, the pages** may contain two different page shifts (high order and order-0). This could lead __vmap_pages_range_noflush() to perform incorrect mappings, potentially resulting in memory corruption.
Users might encounter this as follows (vmap_allow_huge = true, 2M is for PMD_SIZE): kvmalloc(2M, __GFP_NOFAIL|GFP_X) __vmalloc_node_range_noprof(vm_flags=VM_ALLOW_HUGE_VMAP) vm_area_alloc_pages(order=9) ---> order-9 allocation failed and fallback to order-0 vmap_pages_range() vmap_pages_range_noflush() __vmap_pages_range_noflush(page_shift = 21) ----> wrong mapping happens
We can remove the fallback code because if a high-order allocation fails, __vmalloc_node_range_noprof() will retry with order-0. Therefore, it is unnecessary to fallback to order-0 here. Therefore, fix this by removing the fallback code.
Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations") Signed-off-by: Hailong Liu hailong.liu@oppo.com Reported-by: Tangquan Zheng zhengtangquan@oppo.com Cc: stable@vger.kernel.org CC: Barry Song 21cnbao@gmail.com CC: Baoquan He bhe@redhat.com CC: Matthew Wilcox willy@infradead.org
Acked-by: Barry Song baohua@kernel.org
because we already have a fallback here:
void *__vmalloc_node_range_noprof :
fail: if (shift > PAGE_SHIFT) { shift = PAGE_SHIFT; align = real_align; size = real_size; goto again; }
This really deserves a comment because this is not really clear at all. The code is also fragile and it would benefit from some re-org.
Thanks for the fix.
Acked-by: Michal Hocko mhocko@suse.com
I agree. This is only clear for people who know the code. A "fallback" to order-0 should be commented.
-- Uladzislau Rezki
On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote:
Acked-by: Barry Song baohua@kernel.org
because we already have a fallback here:
void *__vmalloc_node_range_noprof :
fail: if (shift > PAGE_SHIFT) { shift = PAGE_SHIFT; align = real_align; size = real_size; goto again; }
This really deserves a comment because this is not really clear at all. The code is also fragile and it would benefit from some re-org.
Thanks for the fix.
Acked-by: Michal Hocko mhocko@suse.com
I agree. This is only clear for people who know the code. A "fallback" to order-0 should be commented.
It's been a week. Could someone please propose a fixup patch to add this comment?
On Thu, Aug 15, 2024 at 10:07:09PM -0700, Andrew Morton wrote:
On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote:
Acked-by: Barry Song baohua@kernel.org
because we already have a fallback here:
void *__vmalloc_node_range_noprof :
fail: if (shift > PAGE_SHIFT) { shift = PAGE_SHIFT; align = real_align; size = real_size; goto again; }
This really deserves a comment because this is not really clear at all. The code is also fragile and it would benefit from some re-org.
Thanks for the fix.
Acked-by: Michal Hocko mhocko@suse.com
I agree. This is only clear for people who know the code. A "fallback" to order-0 should be commented.
It's been a week. Could someone please propose a fixup patch to add this comment?
I will send the patch. This is week i have a vacation, thus i am a bit slow.
-- Uladzislau Rezki
On Thu, 15. Aug 22:07, Andrew Morton wrote:
On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote:
Acked-by: Barry Song baohua@kernel.org
because we already have a fallback here:
void *__vmalloc_node_range_noprof :
fail: if (shift > PAGE_SHIFT) { shift = PAGE_SHIFT; align = real_align; size = real_size; goto again; }
This really deserves a comment because this is not really clear at all. The code is also fragile and it would benefit from some re-org.
Thanks for the fix.
Acked-by: Michal Hocko mhocko@suse.com
I agree. This is only clear for people who know the code. A "fallback" to order-0 should be commented.
It's been a week. Could someone please propose a fixup patch to add this comment?
Hi Andrew:
Do you mean that I need to send a v2 patch with the the comments included?
-- Brs, hailong.
On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
On Thu, 15. Aug 22:07, Andrew Morton wrote:
On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote:
Acked-by: Barry Song baohua@kernel.org
because we already have a fallback here:
void *__vmalloc_node_range_noprof :
fail: if (shift > PAGE_SHIFT) { shift = PAGE_SHIFT; align = real_align; size = real_size; goto again; }
This really deserves a comment because this is not really clear at all. The code is also fragile and it would benefit from some re-org.
Thanks for the fix.
Acked-by: Michal Hocko mhocko@suse.com
I agree. This is only clear for people who know the code. A "fallback" to order-0 should be commented.
It's been a week. Could someone please propose a fixup patch to add this comment?
Hi Andrew:
Do you mean that I need to send a v2 patch with the the comments included?
It is better to post v2.
But before, could you please comment on:
in case of order-0, bulk path may easily fail and fallback to the single page allocator. If an request is marked as NO_FAIL, i am talking about order-0 request, your change breaks GFP_NOFAIL for !order.
Am i missing something obvious?
Thanks!
-- Uladzsislau Rezki
On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
On Thu, 15. Aug 22:07, Andrew Morton wrote:
On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote:
Acked-by: Barry Song baohua@kernel.org
because we already have a fallback here:
void *__vmalloc_node_range_noprof :
fail: if (shift > PAGE_SHIFT) { shift = PAGE_SHIFT; align = real_align; size = real_size; goto again; }
This really deserves a comment because this is not really clear at all. The code is also fragile and it would benefit from some re-org.
Thanks for the fix.
Acked-by: Michal Hocko mhocko@suse.com
I agree. This is only clear for people who know the code. A "fallback" to order-0 should be commented.
It's been a week. Could someone please propose a fixup patch to add this comment?
Hi Andrew:
Do you mean that I need to send a v2 patch with the the comments included?
It is better to post v2.
Got it.
But before, could you please comment on:
in case of order-0, bulk path may easily fail and fallback to the single page allocator. If an request is marked as NO_FAIL, i am talking about order-0 request, your change breaks GFP_NOFAIL for !order.
Am i missing something obvious?
For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle the flag correctly. IMO we don't need to handle the flag here.
Thanks!
-- Uladzsislau Rezki
-- help you, help me, Hailong.
On Fri 16-08-24 19:46:26, Hailong Liu wrote:
On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
On Thu, 15. Aug 22:07, Andrew Morton wrote:
On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote:
> Acked-by: Barry Song baohua@kernel.org > > because we already have a fallback here: > > void *__vmalloc_node_range_noprof : > > fail: > if (shift > PAGE_SHIFT) { > shift = PAGE_SHIFT; > align = real_align; > size = real_size; > goto again; > }
This really deserves a comment because this is not really clear at all. The code is also fragile and it would benefit from some re-org.
Thanks for the fix.
Acked-by: Michal Hocko mhocko@suse.com
I agree. This is only clear for people who know the code. A "fallback" to order-0 should be commented.
It's been a week. Could someone please propose a fixup patch to add this comment?
Hi Andrew:
Do you mean that I need to send a v2 patch with the the comments included?
It is better to post v2.
Got it.
But before, could you please comment on:
in case of order-0, bulk path may easily fail and fallback to the single page allocator. If an request is marked as NO_FAIL, i am talking about order-0 request, your change breaks GFP_NOFAIL for !order.
Am i missing something obvious?
For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle the flag correctly. IMO we don't need to handle the flag here.
Let me clarify what I would like to have clarified:
diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 6b783baf12a1..fea90a39f5c5 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -3510,13 +3510,13 @@ void *vmap_pfn(unsigned long *pfns, unsigned int count, pgprot_t prot) EXPORT_SYMBOL_GPL(vmap_pfn); #endif /* CONFIG_VMAP_PFN */
+/* GFP_NOFAIL semantic is implemented by __vmalloc_node_range_noprof */ static inline unsigned int vm_area_alloc_pages(gfp_t gfp, int nid, unsigned int order, unsigned int nr_pages, struct page **pages) { unsigned int nr_allocated = 0; - gfp_t alloc_gfp = gfp; - bool nofail = gfp & __GFP_NOFAIL; + gfp_t alloc_gfp = gfp & ~ __GFP_NOFAIL; struct page *page; int i;
@@ -3527,9 +3527,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid, * more permissive. */ if (!order) { - /* bulk allocator doesn't support nofail req. officially */ - gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL; - while (nr_allocated < nr_pages) { unsigned int nr, nr_pages_request;
@@ -3547,12 +3544,12 @@ vm_area_alloc_pages(gfp_t gfp, int nid, * but mempolicy wants to alloc memory by interleaving. */ if (IS_ENABLED(CONFIG_NUMA) && nid == NUMA_NO_NODE) - nr = alloc_pages_bulk_array_mempolicy_noprof(bulk_gfp, + nr = alloc_pages_bulk_array_mempolicy_noprof(alloc_gfp, nr_pages_request, pages + nr_allocated);
else - nr = alloc_pages_bulk_array_node_noprof(bulk_gfp, nid, + nr = alloc_pages_bulk_array_node_noprof(alloc_gfp, nid, nr_pages_request, pages + nr_allocated);
@@ -3566,13 +3563,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid, if (nr != nr_pages_request) break; } - } else if (gfp & __GFP_NOFAIL) { - /* - * Higher order nofail allocations are really expensive and - * potentially dangerous (pre-mature OOM, disruptive reclaim - * and compaction etc. - */ - alloc_gfp &= ~__GFP_NOFAIL; }
/* High-order pages or fallback path if "bulk" fails. */
Hello, Michal.
Let me clarify what I would like to have clarified:
diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 6b783baf12a1..fea90a39f5c5 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -3510,13 +3510,13 @@ void *vmap_pfn(unsigned long *pfns, unsigned int count, pgprot_t prot) EXPORT_SYMBOL_GPL(vmap_pfn); #endif /* CONFIG_VMAP_PFN */ +/* GFP_NOFAIL semantic is implemented by __vmalloc_node_range_noprof */ static inline unsigned int vm_area_alloc_pages(gfp_t gfp, int nid, unsigned int order, unsigned int nr_pages, struct page **pages) { unsigned int nr_allocated = 0;
- gfp_t alloc_gfp = gfp;
- bool nofail = gfp & __GFP_NOFAIL;
- gfp_t alloc_gfp = gfp & ~ __GFP_NOFAIL; struct page *page; int i;
@@ -3527,9 +3527,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid, * more permissive. */ if (!order) {
/* bulk allocator doesn't support nofail req. officially */
gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL;
- while (nr_allocated < nr_pages) { unsigned int nr, nr_pages_request;
@@ -3547,12 +3544,12 @@ vm_area_alloc_pages(gfp_t gfp, int nid, * but mempolicy wants to alloc memory by interleaving. */ if (IS_ENABLED(CONFIG_NUMA) && nid == NUMA_NO_NODE)
nr = alloc_pages_bulk_array_mempolicy_noprof(bulk_gfp,
nr = alloc_pages_bulk_array_mempolicy_noprof(alloc_gfp, nr_pages_request, pages + nr_allocated);
else
nr = alloc_pages_bulk_array_node_noprof(bulk_gfp, nid,
nr = alloc_pages_bulk_array_node_noprof(alloc_gfp, nid, nr_pages_request, pages + nr_allocated);
@@ -3566,13 +3563,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid, if (nr != nr_pages_request) break; }
- } else if (gfp & __GFP_NOFAIL) {
/*
* Higher order nofail allocations are really expensive and
* potentially dangerous (pre-mature OOM, disruptive reclaim
* and compaction etc.
*/
}alloc_gfp &= ~__GFP_NOFAIL;
/* High-order pages or fallback path if "bulk" fails. */ --
See below the change. It does not do any functional change and it is rather a small refactoring, which includes the comment i wanted to add and what you wanted to be clarified(if i got you correctly):
<snip> diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 3f9b6bd707d2..24fad2e48799 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -3531,8 +3531,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid, unsigned int order, unsigned int nr_pages, struct page **pages) { unsigned int nr_allocated = 0; - gfp_t alloc_gfp = gfp; - bool nofail = gfp & __GFP_NOFAIL; struct page *page; int i;
@@ -3543,9 +3541,6 @@ vm_area_alloc_pages(gfp_t gfp, int nid, * more permissive. */ if (!order) { - /* bulk allocator doesn't support nofail req. officially */ - gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL; - while (nr_allocated < nr_pages) { unsigned int nr, nr_pages_request;
@@ -3563,12 +3558,12 @@ vm_area_alloc_pages(gfp_t gfp, int nid, * but mempolicy wants to alloc memory by interleaving. */ if (IS_ENABLED(CONFIG_NUMA) && nid == NUMA_NO_NODE) - nr = alloc_pages_bulk_array_mempolicy_noprof(bulk_gfp, + nr = alloc_pages_bulk_array_mempolicy_noprof(gfp & ~__GFP_NOFAIL, nr_pages_request, pages + nr_allocated); - else - nr = alloc_pages_bulk_array_node_noprof(bulk_gfp, nid, + /* bulk allocator doesn't support nofail req. officially */ + nr = alloc_pages_bulk_array_node_noprof(gfp & ~__GFP_NOFAIL, nid, nr_pages_request, pages + nr_allocated);
@@ -3582,24 +3577,18 @@ vm_area_alloc_pages(gfp_t gfp, int nid, if (nr != nr_pages_request) break; } - } else if (gfp & __GFP_NOFAIL) { - /* - * Higher order nofail allocations are really expensive and - * potentially dangerous (pre-mature OOM, disruptive reclaim - * and compaction etc. - */ - alloc_gfp &= ~__GFP_NOFAIL; }
/* High-order pages or fallback path if "bulk" fails. */ while (nr_allocated < nr_pages) { - if (!nofail && fatal_signal_pending(current)) + if (!(gfp & __GFP_NOFAIL) && fatal_signal_pending(current)) break;
if (nid == NUMA_NO_NODE) - page = alloc_pages_noprof(alloc_gfp, order); + page = alloc_pages_noprof(gfp, order); else - page = alloc_pages_node_noprof(nid, alloc_gfp, order); + page = alloc_pages_node_noprof(nid, gfp, order); + if (unlikely(!page)) break;
@@ -3666,7 +3655,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, set_vm_area_page_order(area, page_shift - PAGE_SHIFT); page_order = vm_area_page_order(area);
- area->nr_pages = vm_area_alloc_pages(gfp_mask | __GFP_NOWARN, + /* + * Higher order nofail allocations are really expensive and + * potentially dangerous (pre-mature OOM, disruptive reclaim + * and compaction etc. + * + * Please note, the __vmalloc_node_range_noprof() falls-back + * to order-0 pages if high-order attempt has been unsuccessful. + */ + area->nr_pages = vm_area_alloc_pages(page_order ? + gfp_mask &= ~__GFP_NOFAIL : gfp_mask | __GFP_NOWARN, node, page_order, nr_small_pages, area->pages);
atomic_long_add(area->nr_pages, &nr_vmalloc_pages); <snip>
Is that aligned with your wish?
Thanks!
-- Uladzislau Rezki
On Fri 23-08-24 18:42:47, Uladzislau Rezki wrote: [...]
@@ -3666,7 +3655,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, set_vm_area_page_order(area, page_shift - PAGE_SHIFT); page_order = vm_area_page_order(area);
- area->nr_pages = vm_area_alloc_pages(gfp_mask | __GFP_NOWARN,
- /*
* Higher order nofail allocations are really expensive and
* potentially dangerous (pre-mature OOM, disruptive reclaim
* and compaction etc.
*
* Please note, the __vmalloc_node_range_noprof() falls-back
* to order-0 pages if high-order attempt has been unsuccessful.
*/
- area->nr_pages = vm_area_alloc_pages(page_order ?
node, page_order, nr_small_pages, area->pages);gfp_mask &= ~__GFP_NOFAIL : gfp_mask | __GFP_NOWARN,
atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
<snip>
Is that aligned with your wish?
I am not a great fan of modifying gfp_mask inside the ternary operator like that. It makes the code harder to read. Is there any actual reason to simply drop GFP_NOFAIL unconditionally and rely do the NOFAIL handling for all orders at the same place?
Not that I care about this much TBH. It is an improvement to drop all the NOFAIL specifics from vm_area_alloc_pages.
On Mon, Aug 26, 2024 at 09:52:42AM +0200, Michal Hocko wrote:
On Fri 23-08-24 18:42:47, Uladzislau Rezki wrote: [...]
@@ -3666,7 +3655,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, set_vm_area_page_order(area, page_shift - PAGE_SHIFT); page_order = vm_area_page_order(area);
- area->nr_pages = vm_area_alloc_pages(gfp_mask | __GFP_NOWARN,
- /*
* Higher order nofail allocations are really expensive and
* potentially dangerous (pre-mature OOM, disruptive reclaim
* and compaction etc.
*
* Please note, the __vmalloc_node_range_noprof() falls-back
* to order-0 pages if high-order attempt has been unsuccessful.
*/
- area->nr_pages = vm_area_alloc_pages(page_order ?
node, page_order, nr_small_pages, area->pages);gfp_mask &= ~__GFP_NOFAIL : gfp_mask | __GFP_NOWARN,
atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
<snip>
Is that aligned with your wish?
I am not a great fan of modifying gfp_mask inside the ternary operator like that. It makes the code harder to read. Is there any actual reason to simply drop GFP_NOFAIL unconditionally and rely do the NOFAIL handling for all orders at the same place?
1. So, for bulk we have below:
/* gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL; */
I am not sure if we need it but it says it does not support it which is not clear for me why we have to drop __GFP_NOFAIL for bulk(). There is a fallback to a single page allocator. If passing __GFP_NOFAIL does not trigger any warning or panic a system, then i do not follow why we drop that flag.
Is that odd?
2. High-order allocations. Do you think we should not care much about it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0 if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus keeping NOFAIL for high-order sounds like not a good approach to me.
3. "... at the same place?" Do you mean in the __vmalloc_node_range_noprof()?
__vmalloc_node_range_noprof() -> __vmalloc_area_node(gfp_mask) -> vm_area_alloc_pages()
if, so it is not straight forward, i.e. there is one more allocation:
<snip> static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, pgprot_t prot, unsigned int page_shift, int node) { ... /* Please note that the recursion is strictly bounded. */ if (array_size > PAGE_SIZE) { area->pages = __vmalloc_node_noprof(array_size, 1, nested_gfp, node, area->caller); } else { area->pages = kmalloc_node_noprof(array_size, nested_gfp, node); } ... } <snip>
whereas it is easier to do it inside of the __vmalloc_area_node().
Not that I care about this much TBH. It is an improvement to drop all the NOFAIL specifics from vm_area_alloc_pages.
I agree. I also do not like modifying gfp flags on different levels and different cases. To me there is only one case. It is high-order requests with NOFAIL. For this i think we should keep our approach, i mean dropping NOFAIL and repeat because we have a fallback.
-- Uladzislau Rezki
On Mon 26-08-24 14:38:40, Uladzislau Rezki wrote:
On Mon, Aug 26, 2024 at 09:52:42AM +0200, Michal Hocko wrote:
On Fri 23-08-24 18:42:47, Uladzislau Rezki wrote: [...]
@@ -3666,7 +3655,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, set_vm_area_page_order(area, page_shift - PAGE_SHIFT); page_order = vm_area_page_order(area);
- area->nr_pages = vm_area_alloc_pages(gfp_mask | __GFP_NOWARN,
- /*
* Higher order nofail allocations are really expensive and
* potentially dangerous (pre-mature OOM, disruptive reclaim
* and compaction etc.
*
* Please note, the __vmalloc_node_range_noprof() falls-back
* to order-0 pages if high-order attempt has been unsuccessful.
*/
- area->nr_pages = vm_area_alloc_pages(page_order ?
node, page_order, nr_small_pages, area->pages);gfp_mask &= ~__GFP_NOFAIL : gfp_mask | __GFP_NOWARN,
atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
<snip>
Is that aligned with your wish?
I am not a great fan of modifying gfp_mask inside the ternary operator like that. It makes the code harder to read. Is there any actual reason to simply drop GFP_NOFAIL unconditionally and rely do the NOFAIL handling for all orders at the same place?
- So, for bulk we have below:
/* gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL; */
I am not sure if we need it but it says it does not support it which is not clear for me why we have to drop __GFP_NOFAIL for bulk(). There is a fallback to a single page allocator. If passing __GFP_NOFAIL does not trigger any warning or panic a system, then i do not follow why we drop that flag.
Is that odd?
I suspect this was a pre-caution more than anything.
- High-order allocations. Do you think we should not care much about
it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0 if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus keeping NOFAIL for high-order sounds like not a good approach to me.
We should avoid high order allocations with GFP_NOFAIL at all cost.
- "... at the same place?"
Do you mean in the __vmalloc_node_range_noprof()?
__vmalloc_node_range_noprof() -> __vmalloc_area_node(gfp_mask) -> vm_area_alloc_pages()
if, so it is not straight forward, i.e. there is one more allocation:
<snip> static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, pgprot_t prot, unsigned int page_shift, int node) { ... /* Please note that the recursion is strictly bounded. */ if (array_size > PAGE_SIZE) { area->pages = __vmalloc_node_noprof(array_size, 1, nested_gfp, node, area->caller); } else { area->pages = kmalloc_node_noprof(array_size, nested_gfp, node); } ... } <snip>
whereas it is easier to do it inside of the __vmalloc_area_node().
Right. The allocation path is quite convoluted here. If it is just too much of a hassle to implement NOFAIL at a single place then we should aim at reducing that. Having that at 3 different layers is just begging for inconsistences.
On Tue, Aug 27, 2024 at 08:49:35AM +0200, Michal Hocko wrote:
On Mon 26-08-24 14:38:40, Uladzislau Rezki wrote:
On Mon, Aug 26, 2024 at 09:52:42AM +0200, Michal Hocko wrote:
On Fri 23-08-24 18:42:47, Uladzislau Rezki wrote: [...]
@@ -3666,7 +3655,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, set_vm_area_page_order(area, page_shift - PAGE_SHIFT); page_order = vm_area_page_order(area);
- area->nr_pages = vm_area_alloc_pages(gfp_mask | __GFP_NOWARN,
- /*
* Higher order nofail allocations are really expensive and
* potentially dangerous (pre-mature OOM, disruptive reclaim
* and compaction etc.
*
* Please note, the __vmalloc_node_range_noprof() falls-back
* to order-0 pages if high-order attempt has been unsuccessful.
*/
- area->nr_pages = vm_area_alloc_pages(page_order ?
node, page_order, nr_small_pages, area->pages);gfp_mask &= ~__GFP_NOFAIL : gfp_mask | __GFP_NOWARN,
atomic_long_add(area->nr_pages, &nr_vmalloc_pages);
<snip>
Is that aligned with your wish?
I am not a great fan of modifying gfp_mask inside the ternary operator like that. It makes the code harder to read. Is there any actual reason to simply drop GFP_NOFAIL unconditionally and rely do the NOFAIL handling for all orders at the same place?
- So, for bulk we have below:
/* gfp_t bulk_gfp = gfp & ~__GFP_NOFAIL; */
I am not sure if we need it but it says it does not support it which is not clear for me why we have to drop __GFP_NOFAIL for bulk(). There is a fallback to a single page allocator. If passing __GFP_NOFAIL does not trigger any warning or panic a system, then i do not follow why we drop that flag.
Is that odd?
I suspect this was a pre-caution more than anything.
OK, then i drop it.
- High-order allocations. Do you think we should not care much about
it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0 if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus keeping NOFAIL for high-order sounds like not a good approach to me.
We should avoid high order allocations with GFP_NOFAIL at all cost.
What do you propose here? Fail such request?
- "... at the same place?"
Do you mean in the __vmalloc_node_range_noprof()?
__vmalloc_node_range_noprof() -> __vmalloc_area_node(gfp_mask) -> vm_area_alloc_pages()
if, so it is not straight forward, i.e. there is one more allocation:
<snip> static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask, pgprot_t prot, unsigned int page_shift, int node) { ... /* Please note that the recursion is strictly bounded. */ if (array_size > PAGE_SIZE) { area->pages = __vmalloc_node_noprof(array_size, 1, nested_gfp, node, area->caller); } else { area->pages = kmalloc_node_noprof(array_size, nested_gfp, node); } ... } <snip>
whereas it is easier to do it inside of the __vmalloc_area_node().
Right. The allocation path is quite convoluted here. If it is just too much of a hassle to implement NOFAIL at a single place then we should aim at reducing that. Having that at 3 different layers is just begging for inconsistences.
Hard to not agree :)
-- Uladzislau Rezki
On Tue 27-08-24 14:47:30, Uladzislau Rezki wrote:
On Tue, Aug 27, 2024 at 08:49:35AM +0200, Michal Hocko wrote:
[...]
- High-order allocations. Do you think we should not care much about
it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0 if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus keeping NOFAIL for high-order sounds like not a good approach to me.
We should avoid high order allocations with GFP_NOFAIL at all cost.
What do you propose here? Fail such request?
We shouldn't have any hard requirements for higher order allocations in the vmalloc right? In other words we can always fallback to base pages.
On Tue, Aug 27, 2024 at 03:37:38PM +0200, Michal Hocko wrote:
On Tue 27-08-24 14:47:30, Uladzislau Rezki wrote:
On Tue, Aug 26, 2024 at 08:49:35AM +0200, Michal Hocko wrote:
[...]
- High-order allocations. Do you think we should not care much about
it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0 if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus keeping NOFAIL for high-order sounds like not a good approach to me.
We should avoid high order allocations with GFP_NOFAIL at all cost.
What do you propose here? Fail such request?
We shouldn't have any hard requirements for higher order allocations in the vmalloc right? In other words we can always fallback to base pages.
We always drop NOFAIL for high-order, if it fails we fall-back to order-0. I got the feeling that you wanted just bail-out fully if high-order and NOFAIL.
-- Uladzislau Rezki
On Tue 27-08-24 17:29:34, Uladzislau Rezki wrote:
On Tue, Aug 27, 2024 at 03:37:38PM +0200, Michal Hocko wrote:
On Tue 27-08-24 14:47:30, Uladzislau Rezki wrote:
On Tue, Aug 26, 2024 at 08:49:35AM +0200, Michal Hocko wrote:
[...]
- High-order allocations. Do you think we should not care much about
it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0 if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus keeping NOFAIL for high-order sounds like not a good approach to me.
We should avoid high order allocations with GFP_NOFAIL at all cost.
What do you propose here? Fail such request?
We shouldn't have any hard requirements for higher order allocations in the vmalloc right? In other words we can always fallback to base pages.
We always drop NOFAIL for high-order, if it fails we fall-back to order-0. I got the feeling that you wanted just bail-out fully if high-order and NOFAIL.
Nope. We should always fall back to order 0 for both NOFAIL and regular vmalloc allocations.
On Wed, Aug 28, 2024 at 09:14:53AM +0200, Michal Hocko wrote:
On Tue 27-08-24 17:29:34, Uladzislau Rezki wrote:
On Tue, Aug 27, 2024 at 03:37:38PM +0200, Michal Hocko wrote:
On Tue 27-08-24 14:47:30, Uladzislau Rezki wrote:
On Tue, Aug 26, 2024 at 08:49:35AM +0200, Michal Hocko wrote:
[...]
- High-order allocations. Do you think we should not care much about
it when __GFP_NOFAIL is set? Same here, there is a fallback for order-0 if "high" fails, it is more likely NO_FAIL succeed for order-0. Thus keeping NOFAIL for high-order sounds like not a good approach to me.
We should avoid high order allocations with GFP_NOFAIL at all cost.
What do you propose here? Fail such request?
We shouldn't have any hard requirements for higher order allocations in the vmalloc right? In other words we can always fallback to base pages.
We always drop NOFAIL for high-order, if it fails we fall-back to order-0. I got the feeling that you wanted just bail-out fully if high-order and NOFAIL.
Nope. We should always fall back to order 0 for both NOFAIL and regular vmalloc allocations.
Good.
Thanks for the ACK!
-- Uladzislau Rezki
On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
On Thu, 15. Aug 22:07, Andrew Morton wrote:
On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote:
> Acked-by: Barry Song baohua@kernel.org > > because we already have a fallback here: > > void *__vmalloc_node_range_noprof : > > fail: > if (shift > PAGE_SHIFT) { > shift = PAGE_SHIFT; > align = real_align; > size = real_size; > goto again; > }
This really deserves a comment because this is not really clear at all. The code is also fragile and it would benefit from some re-org.
Thanks for the fix.
Acked-by: Michal Hocko mhocko@suse.com
I agree. This is only clear for people who know the code. A "fallback" to order-0 should be commented.
It's been a week. Could someone please propose a fixup patch to add this comment?
Hi Andrew:
Do you mean that I need to send a v2 patch with the the comments included?
It is better to post v2.
Got it.
But before, could you please comment on:
in case of order-0, bulk path may easily fail and fallback to the single page allocator. If an request is marked as NO_FAIL, i am talking about order-0 request, your change breaks GFP_NOFAIL for !order.
Am i missing something obvious?
For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle the flag correctly. IMO we don't need to handle the flag here.
Agree. As for comment, i meant to comment the below fallback:
<snip> fail: if (shift > PAGE_SHIFT) { shift = PAGE_SHIFT; align = real_align; size = real_size; goto again; } <snip>
-- Uladzislau Rezki
On Mon, 19. Aug 13:59, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
On Thu, 15. Aug 22:07, Andrew Morton wrote:
On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote:
> > Acked-by: Barry Song baohua@kernel.org > > > > because we already have a fallback here: > > > > void *__vmalloc_node_range_noprof : > > > > fail: > > if (shift > PAGE_SHIFT) { > > shift = PAGE_SHIFT; > > align = real_align; > > size = real_size; > > goto again; > > } > > This really deserves a comment because this is not really clear at all. > The code is also fragile and it would benefit from some re-org. > > Thanks for the fix. > > Acked-by: Michal Hocko mhocko@suse.com > I agree. This is only clear for people who know the code. A "fallback" to order-0 should be commented.
It's been a week. Could someone please propose a fixup patch to add this comment?
Hi Andrew:
Do you mean that I need to send a v2 patch with the the comments included?
It is better to post v2.
Got it.
But before, could you please comment on:
in case of order-0, bulk path may easily fail and fallback to the single page allocator. If an request is marked as NO_FAIL, i am talking about order-0 request, your change breaks GFP_NOFAIL for !order.
Am i missing something obvious?
For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle the flag correctly. IMO we don't need to handle the flag here.
Agree. As for comment, i meant to comment the below fallback:
Michal send a craft that make nofail logic more clearer and I check the branch found Andrew already merged in -stable branch. So we can include these with a new patch.
<snip> fail: if (shift > PAGE_SHIFT) { shift = PAGE_SHIFT; align = real_align; size = real_size; goto again; } <snip>
-- Uladzislau Rezki
--
Help you, Help me, Hailong.
On Mon, Aug 19, 2024 at 08:57:38PM +0800, Hailong Liu wrote:
On Mon, 19. Aug 13:59, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
On Thu, 15. Aug 22:07, Andrew Morton wrote:
On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote:
> > > Acked-by: Barry Song baohua@kernel.org > > > > > > because we already have a fallback here: > > > > > > void *__vmalloc_node_range_noprof : > > > > > > fail: > > > if (shift > PAGE_SHIFT) { > > > shift = PAGE_SHIFT; > > > align = real_align; > > > size = real_size; > > > goto again; > > > } > > > > This really deserves a comment because this is not really clear at all. > > The code is also fragile and it would benefit from some re-org. > > > > Thanks for the fix. > > > > Acked-by: Michal Hocko mhocko@suse.com > > > I agree. This is only clear for people who know the code. A "fallback" > to order-0 should be commented.
It's been a week. Could someone please propose a fixup patch to add this comment?
Hi Andrew:
Do you mean that I need to send a v2 patch with the the comments included?
It is better to post v2.
Got it.
But before, could you please comment on:
in case of order-0, bulk path may easily fail and fallback to the single page allocator. If an request is marked as NO_FAIL, i am talking about order-0 request, your change breaks GFP_NOFAIL for !order.
Am i missing something obvious?
For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle the flag correctly. IMO we don't need to handle the flag here.
Agree. As for comment, i meant to comment the below fallback:
Michal send a craft that make nofail logic more clearer and I check the branch found Andrew already merged in -stable branch. So we can include these with a new patch.
Just to confirm. Will you send an extra patch with the comment?
-- Uladzislau Rezki
On Mon, Aug 19, 2024 at 03:38:51PM +0200, Uladzislau Rezki wrote:
On Mon, Aug 19, 2024 at 08:57:38PM +0800, Hailong Liu wrote:
On Mon, 19. Aug 13:59, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
On Thu, 15. Aug 22:07, Andrew Morton wrote: > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote: > > > > > Acked-by: Barry Song baohua@kernel.org > > > > > > > > because we already have a fallback here: > > > > > > > > void *__vmalloc_node_range_noprof : > > > > > > > > fail: > > > > if (shift > PAGE_SHIFT) { > > > > shift = PAGE_SHIFT; > > > > align = real_align; > > > > size = real_size; > > > > goto again; > > > > } > > > > > > This really deserves a comment because this is not really clear at all. > > > The code is also fragile and it would benefit from some re-org. > > > > > > Thanks for the fix. > > > > > > Acked-by: Michal Hocko mhocko@suse.com > > > > > I agree. This is only clear for people who know the code. A "fallback" > > to order-0 should be commented. > > It's been a week. Could someone please propose a fixup patch to add > this comment?
Hi Andrew:
Do you mean that I need to send a v2 patch with the the comments included?
It is better to post v2.
Got it.
But before, could you please comment on:
in case of order-0, bulk path may easily fail and fallback to the single page allocator. If an request is marked as NO_FAIL, i am talking about order-0 request, your change breaks GFP_NOFAIL for !order.
Am i missing something obvious?
For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle the flag correctly. IMO we don't need to handle the flag here.
Agree. As for comment, i meant to comment the below fallback:
Michal send a craft that make nofail logic more clearer and I check the branch found Andrew already merged in -stable branch. So we can include these with a new patch.
Just to confirm. Will you send an extra patch with the comment?
Also, an idea to handle NOFAIL outside of vm_area_alloc_pages() looks sounds good to me.
-- Uladzislau Rezki
On Mon, 19. Aug 15:38, Uladzislau Rezki wrote:
On Mon, Aug 19, 2024 at 08:57:38PM +0800, Hailong Liu wrote:
On Mon, 19. Aug 13:59, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
On Thu, 15. Aug 22:07, Andrew Morton wrote: > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote: > > > > > Acked-by: Barry Song baohua@kernel.org > > > > > > > > because we already have a fallback here: > > > > > > > > void *__vmalloc_node_range_noprof : > > > > > > > > fail: > > > > if (shift > PAGE_SHIFT) { > > > > shift = PAGE_SHIFT; > > > > align = real_align; > > > > size = real_size; > > > > goto again; > > > > } > > > > > > This really deserves a comment because this is not really clear at all. > > > The code is also fragile and it would benefit from some re-org. > > > > > > Thanks for the fix. > > > > > > Acked-by: Michal Hocko mhocko@suse.com > > > > > I agree. This is only clear for people who know the code. A "fallback" > > to order-0 should be commented. > > It's been a week. Could someone please propose a fixup patch to add > this comment?
Hi Andrew:
Do you mean that I need to send a v2 patch with the the comments included?
It is better to post v2.
Got it.
But before, could you please comment on:
in case of order-0, bulk path may easily fail and fallback to the single page allocator. If an request is marked as NO_FAIL, i am talking about order-0 request, your change breaks GFP_NOFAIL for !order.
Am i missing something obvious?
For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle the flag correctly. IMO we don't need to handle the flag here.
Agree. As for comment, i meant to comment the below fallback:
Michal send a craft that make nofail logic more clearer and I check the branch found Andrew already merged in -stable branch. So we can include these with a new patch.
Just to confirm. Will you send an extra patch with the comment?
If this is not urgent, I can send this patch later this week. :)
-- Uladzislau Rezki
--
Help you, Help me, Hailong.
On Tue, Aug 20, 2024 at 09:59:50AM +0800, Hailong Liu wrote:
On Mon, 19. Aug 15:38, Uladzislau Rezki wrote:
On Mon, Aug 19, 2024 at 08:57:38PM +0800, Hailong Liu wrote:
On Mon, 19. Aug 13:59, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
On Fri, 16. Aug 12:13, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote: > On Thu, 15. Aug 22:07, Andrew Morton wrote: > > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote: > > > > > > > Acked-by: Barry Song baohua@kernel.org > > > > > > > > > > because we already have a fallback here: > > > > > > > > > > void *__vmalloc_node_range_noprof : > > > > > > > > > > fail: > > > > > if (shift > PAGE_SHIFT) { > > > > > shift = PAGE_SHIFT; > > > > > align = real_align; > > > > > size = real_size; > > > > > goto again; > > > > > } > > > > > > > > This really deserves a comment because this is not really clear at all. > > > > The code is also fragile and it would benefit from some re-org. > > > > > > > > Thanks for the fix. > > > > > > > > Acked-by: Michal Hocko mhocko@suse.com > > > > > > > I agree. This is only clear for people who know the code. A "fallback" > > > to order-0 should be commented. > > > > It's been a week. Could someone please propose a fixup patch to add > > this comment? > > Hi Andrew: > > Do you mean that I need to send a v2 patch with the the comments included? > It is better to post v2.
Got it.
But before, could you please comment on:
in case of order-0, bulk path may easily fail and fallback to the single page allocator. If an request is marked as NO_FAIL, i am talking about order-0 request, your change breaks GFP_NOFAIL for !order.
Am i missing something obvious?
For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle the flag correctly. IMO we don't need to handle the flag here.
Agree. As for comment, i meant to comment the below fallback:
Michal send a craft that make nofail logic more clearer and I check the branch found Andrew already merged in -stable branch. So we can include these with a new patch.
Just to confirm. Will you send an extra patch with the comment?
If this is not urgent, I can send this patch later this week. :)
This is for synchronization, so we both do not do a double work :)
-- Uladzislau Rezki
On Tue, 20. Aug 08:44, Uladzislau Rezki wrote:
On Tue, Aug 20, 2024 at 09:59:50AM +0800, Hailong Liu wrote:
On Mon, 19. Aug 15:38, Uladzislau Rezki wrote:
On Mon, Aug 19, 2024 at 08:57:38PM +0800, Hailong Liu wrote:
On Mon, 19. Aug 13:59, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 07:46:26PM +0800, Hailong Liu wrote:
On Fri, 16. Aug 12:13, Uladzislau Rezki wrote: > On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote: > > On Thu, 15. Aug 22:07, Andrew Morton wrote: > > > On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote: > > > > > > > > > Acked-by: Barry Song baohua@kernel.org > > > > > > > > > > > > because we already have a fallback here: > > > > > > > > > > > > void *__vmalloc_node_range_noprof : > > > > > > > > > > > > fail: > > > > > > if (shift > PAGE_SHIFT) { > > > > > > shift = PAGE_SHIFT; > > > > > > align = real_align; > > > > > > size = real_size; > > > > > > goto again; > > > > > > } > > > > > > > > > > This really deserves a comment because this is not really clear at all. > > > > > The code is also fragile and it would benefit from some re-org. > > > > > > > > > > Thanks for the fix. > > > > > > > > > > Acked-by: Michal Hocko mhocko@suse.com > > > > > > > > > I agree. This is only clear for people who know the code. A "fallback" > > > > to order-0 should be commented. > > > > > > It's been a week. Could someone please propose a fixup patch to add > > > this comment? > > > > Hi Andrew: > > > > Do you mean that I need to send a v2 patch with the the comments included? > > > It is better to post v2. Got it.
> > But before, could you please comment on: > > in case of order-0, bulk path may easily fail and fallback to the single > page allocator. If an request is marked as NO_FAIL, i am talking about > order-0 request, your change breaks GFP_NOFAIL for !order. > > Am i missing something obvious? For order-0, alloc_pages(GFP_X | __GFP_NOFAIL, 0), buddy allocator will handle the flag correctly. IMO we don't need to handle the flag here.
Agree. As for comment, i meant to comment the below fallback:
Michal send a craft that make nofail logic more clearer and I check the branch found Andrew already merged in -stable branch. So we can include these with a new patch.
Just to confirm. Will you send an extra patch with the comment?
If this is not urgent, I can send this patch later this week. :)
This is for synchronization, so we both do not do a double work :)
Ahh I guess you already have a plan, so I don't need to do this. Wating for your patch :)
-- Uladzislau Rezki
--
Help you, Help me, Hailong.
On 08/16/24 at 12:13pm, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
On Thu, 15. Aug 22:07, Andrew Morton wrote:
On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote:
Acked-by: Barry Song baohua@kernel.org
because we already have a fallback here:
void *__vmalloc_node_range_noprof :
fail: if (shift > PAGE_SHIFT) { shift = PAGE_SHIFT; align = real_align; size = real_size; goto again; }
This really deserves a comment because this is not really clear at all. The code is also fragile and it would benefit from some re-org.
Thanks for the fix.
Acked-by: Michal Hocko mhocko@suse.com
I agree. This is only clear for people who know the code. A "fallback" to order-0 should be commented.
It's been a week. Could someone please propose a fixup patch to add this comment?
Hi Andrew:
Do you mean that I need to send a v2 patch with the the comments included?
It is better to post v2.
But before, could you please comment on:
in case of order-0, bulk path may easily fail and fallback to the single page allocator. If an request is marked as NO_FAIL, i am talking about order-0 request, your change breaks GFP_NOFAIL for !order.
In case order-0, bulk_gfp masks off __GFP_NOFAIL, but alloc_gfp doesn't. So alloc_gfp has __GFP_NOFAIL in fallback, it won't be failed by alloc_pages().
On 08/17/24 at 12:11am, Baoquan He wrote:
On 08/16/24 at 12:13pm, Uladzislau Rezki wrote:
On Fri, Aug 16, 2024 at 05:12:32PM +0800, Hailong Liu wrote:
On Thu, 15. Aug 22:07, Andrew Morton wrote:
On Fri, 9 Aug 2024 11:41:42 +0200 Uladzislau Rezki urezki@gmail.com wrote:
> Acked-by: Barry Song baohua@kernel.org > > because we already have a fallback here: > > void *__vmalloc_node_range_noprof : > > fail: > if (shift > PAGE_SHIFT) { > shift = PAGE_SHIFT; > align = real_align; > size = real_size; > goto again; > }
This really deserves a comment because this is not really clear at all. The code is also fragile and it would benefit from some re-org.
Thanks for the fix.
Acked-by: Michal Hocko mhocko@suse.com
I agree. This is only clear for people who know the code. A "fallback" to order-0 should be commented.
It's been a week. Could someone please propose a fixup patch to add this comment?
Hi Andrew:
Do you mean that I need to send a v2 patch with the the comments included?
It is better to post v2.
But before, could you please comment on:
in case of order-0, bulk path may easily fail and fallback to the single page allocator. If an request is marked as NO_FAIL, i am talking about order-0 request, your change breaks GFP_NOFAIL for !order.
In case order-0, bulk_gfp masks off __GFP_NOFAIL, but alloc_gfp doesn't. So alloc_gfp has __GFP_NOFAIL in fallback, it won't be failed by alloc_pages().
Please ignore this, I didn't update my local mail box, didn't see Hailong's reply.
linux-stable-mirror@lists.linaro.org