On Fri 09-08-24 09:05:05, Barry Song wrote:
On Fri, Aug 9, 2024 at 12:20 AM Hailong Liu hailong.liu@oppo.com wrote:
__vmap_pages_range_noflush() assumes that its argument pages** contains pages with the same page shift. However, since commit e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations"), if gfp_flags includes __GFP_NOFAIL with a high order in vm_area_alloc_pages() and the high-order page allocation fails, pages** may contain two different page shifts (high order and order-0). This could lead __vmap_pages_range_noflush() to perform incorrect mappings, potentially resulting in memory corruption.
Users might encounter this as follows (vmap_allow_huge = true, 2M is for PMD_SIZE):

kvmalloc(2M, __GFP_NOFAIL|GFP_X)
    __vmalloc_node_range_noprof(vm_flags=VM_ALLOW_HUGE_VMAP)
        vm_area_alloc_pages(order=9) ---> order-9 allocation failed and fallback to order-0
        vmap_pages_range()
            vmap_pages_range_noflush()
                __vmap_pages_range_noflush(page_shift = 21) ----> wrong mapping happens
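To make the failure mode above concrete, here is a rough userspace sketch, not kernel code. It assumes 4K base pages and a 2M PMD, and it assumes that a huge mapping takes only the first page of each 2M step as the physical base for the whole step. `struct fake_page`, `fill_contiguous()`, `fill_scattered()` and `count_wrong_mappings()` are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12                                 /* 4K base pages (assumed) */
#define PMD_SHIFT  21                                 /* 2M huge mappings (assumed) */
#define NR_PAGES   (1u << (PMD_SHIFT - PAGE_SHIFT))   /* 512 entries cover 2M */

/* Hypothetical stand-in for struct page: only the frame number matters here. */
struct fake_page {
	unsigned long pfn;
};

/* Model a successful order-9 allocation: physically contiguous frames. */
void fill_contiguous(struct fake_page *pages, size_t nr, unsigned long base_pfn)
{
	for (size_t i = 0; i < nr; i++)
		pages[i].pfn = base_pfn + i;
}

/* Model the order-0 fallback: every entry is an unrelated frame. */
void fill_scattered(struct fake_page *pages, size_t nr, unsigned long base_pfn)
{
	for (size_t i = 0; i < nr; i++)
		pages[i].pfn = base_pfn + 2 * i;
}

/*
 * Walk the array in steps of page_shift and treat pages[i] as the base of
 * one physically contiguous block per step. Count the entries whose own
 * frame differs from the frame that such a mapping would expose.
 */
size_t count_wrong_mappings(const struct fake_page *pages, size_t nr,
			    unsigned int page_shift)
{
	size_t step, wrong = 0;

	assert(page_shift >= PAGE_SHIFT);
	step = (size_t)1 << (page_shift - PAGE_SHIFT);

	for (size_t i = 0; i < nr; i += step) {
		unsigned long base = pages[i].pfn;

		for (size_t j = 0; j < step && i + j < nr; j++)
			if (pages[i + j].pfn != base + j)
				wrong++;
	}
	return wrong;
}
```

In this model a contiguous array mapped with PMD_SHIFT yields no wrong entries, while an order-0 array mapped with PMD_SHIFT leaves every entry except the first pointing at the wrong frame. Mapping the same order-0 array with PAGE_SHIFT is correct again, which is why the shift passed to the mapping code has to match how the pages were actually allocated.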
We can remove the fallback code because if a high-order allocation fails, __vmalloc_node_range_noprof() will retry with order-0. Therefore, it is unnecessary to fall back to order-0 here. Fix this by removing the fallback code.
Fixes: e9c3cda4d86e ("mm, vmalloc: fix high order __GFP_NOFAIL allocations")
Signed-off-by: Hailong Liu hailong.liu@oppo.com
Reported-by: Tangquan Zheng zhengtangquan@oppo.com
Cc: stable@vger.kernel.org
CC: Barry Song 21cnbao@gmail.com
CC: Baoquan He bhe@redhat.com
CC: Matthew Wilcox willy@infradead.org
Acked-by: Barry Song baohua@kernel.org
because we already have a fallback here:
void *__vmalloc_node_range_noprof :

fail:
        if (shift > PAGE_SHIFT) {
                shift = PAGE_SHIFT;
                align = real_align;
                size = real_size;
                goto again;
        }
This really deserves a comment because this is not really clear at all. The code is also fragile and it would benefit from some re-org.
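For readers following along, the control flow of the fallback quoted above can be modelled in userspace roughly as below. This is a simplified sketch, not the kernel implementation: `try_alloc_fn`, `alloc_with_fallback()` and the three stub allocators are hypothetical, and the align/size bookkeeping of the real code is left out. The point it tries to show is that the retry restarts the whole allocation with one shift, so the resulting pages all share that shift.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT 12   /* 4K base pages (assumed) */
#define PMD_SHIFT  21   /* 2M huge mappings (assumed) */

/* Hypothetical hook: does an allocation of this order succeed? */
typedef bool (*try_alloc_fn)(unsigned int order);

/*
 * Try the whole allocation at the requested shift. On failure, drop to
 * PAGE_SHIFT once and start over, mirroring the "goto again" retry.
 * Returns the shift that was actually used, or 0 if even order-0 failed.
 */
unsigned int alloc_with_fallback(unsigned int shift, try_alloc_fn try_alloc)
{
	assert(shift >= PAGE_SHIFT);
again:
	if (try_alloc(shift - PAGE_SHIFT))
		return shift;

	/* Failure path: retry the entire request with base pages. */
	if (shift > PAGE_SHIFT) {
		shift = PAGE_SHIFT;
		goto again;
	}
	return 0;
}

/* Stub allocators used to exercise the three possible outcomes. */
bool only_order0_succeeds(unsigned int order)
{
	return order == 0;
}

bool always_succeeds(unsigned int order)
{
	(void)order;
	return true;
}

bool never_succeeds(unsigned int order)
{
	(void)order;
	return false;
}
```

In this model, a request at PMD_SHIFT against `only_order0_succeeds` ends up using PAGE_SHIFT for everything, so a separate order-0 fallback inside the page allocation loop is redundant.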
Thanks for the fix.
Acked-by: Michal Hocko mhocko@suse.com