The routine cma_init_reserved_areas is designed to activate all reserved CMA areas. It quits as soon as it encounters an error, which can leave later areas reserved but never activated, with no feedback to the code that performed the reservation. Attempting to allocate memory from an area in that state results in a BUG.
Modify cma_init_reserved_areas to always attempt to activate all areas. The called routine, cma_activate_area, is responsible for leaving its area in a valid state on failure. No caller makes use of the returned error codes, so change the routine's return type to void.
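For context on why a zeroed count leaves the area in a "valid state": cma_alloc() already refuses to allocate from an empty area, so a failed activation now degrades into clean allocation failures rather than a crash. An abridged sketch of that entry check (only the relevant lines of mm/cma.c are shown):

	/*
	 * Abridged sketch of cma_alloc()'s entry check. An area whose
	 * activation failed now has cma->count == 0, so allocation
	 * attempts return NULL here and never touch cma->bitmap.
	 */
	struct page *cma_alloc(struct cma *cma, size_t count,
			       unsigned int align, bool no_warn)
	{
		if (!cma || !cma->count)
			return NULL;
		/* ... bitmap search and page allocation elided ... */
	}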
How to reproduce: This example uses kernelcore, hugetlb and CMA as an easy way to reproduce, but the underlying problem is a general CMA issue.
Two node x86 VM, 16GB total, 8GB per node
Kernel command line parameters:
  kernelcore=4G hugetlb_cma=8G
Related boot time messages:
  hugetlb_cma: reserve 8192 MiB, up to 4096 MiB per node
  cma: Reserved 4096 MiB at 0x0000000100000000
  hugetlb_cma: reserved 4096 MiB on node 0
  cma: Reserved 4096 MiB at 0x0000000300000000
  hugetlb_cma: reserved 4096 MiB on node 1
  cma: CMA area hugetlb could not be activated
# echo 8 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
...
Call Trace:
 bitmap_find_next_zero_area_off+0x51/0x90
 cma_alloc+0x1a5/0x310
 alloc_fresh_huge_page+0x78/0x1a0
 alloc_pool_huge_page+0x6f/0xf0
 set_max_huge_pages+0x10c/0x250
 nr_hugepages_store_common+0x92/0x120
 ? __kmalloc+0x171/0x270
 kernfs_fop_write+0xc1/0x1a0
 vfs_write+0xc7/0x1f0
 ksys_write+0x5f/0xe0
 do_syscall_64+0x4d/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
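The NULL pointer is cma->bitmap. Because cma_init_reserved_areas() quit at the first activation failure, every area after the failing one was left reserved but never activated: its bitmap was never allocated while its count stayed nonzero, so cma_alloc() gets past its empty-area check and hands the NULL bitmap to the scanner. A simplified sketch of the failing call inside cma_alloc() (argument names abridged):

	/*
	 * Simplified sketch: cma->bitmap == NULL but cma->count != 0,
	 * because the activation loop returned before reaching this area.
	 */
	bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap, /* NULL */
						   bitmap_maxno, start,
						   bitmap_count, mask, offset);
	/* the NULL map is dereferenced during the scan -> oops above */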
Fixes: c64be2bb1c6e ("drivers: add Contiguous Memory Allocator")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
---
 mm/cma.c | 23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)
diff --git a/mm/cma.c b/mm/cma.c
index 26ecff818881..0963c0f9c502 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -93,17 +93,15 @@ static void cma_clear_bitmap(struct cma *cma, unsigned long pfn,
 	mutex_unlock(&cma->lock);
 }
 
-static int __init cma_activate_area(struct cma *cma)
+static void __init cma_activate_area(struct cma *cma)
 {
 	unsigned long base_pfn = cma->base_pfn, pfn = base_pfn;
 	unsigned i = cma->count >> pageblock_order;
 	struct zone *zone;
 
 	cma->bitmap = bitmap_zalloc(cma_bitmap_maxno(cma), GFP_KERNEL);
-	if (!cma->bitmap) {
-		cma->count = 0;
-		return -ENOMEM;
-	}
+	if (!cma->bitmap)
+		goto out_error;
 
 	WARN_ON_ONCE(!pfn_valid(pfn));
 	zone = page_zone(pfn_to_page(pfn));
@@ -133,25 +131,22 @@ static int __init cma_activate_area(struct cma *cma)
 	spin_lock_init(&cma->mem_head_lock);
 #endif
 
-	return 0;
+	return;
 
 not_in_zone:
-	pr_err("CMA area %s could not be activated\n", cma->name);
 	bitmap_free(cma->bitmap);
+out_error:
 	cma->count = 0;
-	return -EINVAL;
+	pr_err("CMA area %s could not be activated\n", cma->name);
+	return;
 }
 
 static int __init cma_init_reserved_areas(void)
 {
 	int i;
 
-	for (i = 0; i < cma_area_count; i++) {
-		int ret = cma_activate_area(&cma_areas[i]);
-
-		if (ret)
-			return ret;
-	}
+	for (i = 0; i < cma_area_count; i++)
+		cma_activate_area(&cma_areas[i]);
 
 	return 0;
 }
On Thu, Jul 30, 2020 at 09:31:23AM -0700, Mike Kravetz wrote:
Makes total sense to me!
Reviewed-by: Roman Gushchin <guro@fb.com>
Thanks!
-----Original Message-----
From: Mike Kravetz [mailto:mike.kravetz@oracle.com]
Sent: Friday, July 31, 2020 4:31 AM
To: linux-mm@kvack.org; linux-kernel@vger.kernel.org
Cc: Marek Szyprowski <m.szyprowski@samsung.com>; Michal Nazarewicz <mina86@mina86.com>; Kyungmin Park <kyungmin.park@samsung.com>; Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>; Roman Gushchin <guro@fb.com>; Andrew Morton <akpm@linux-foundation.org>; Mike Kravetz <mike.kravetz@oracle.com>; stable@vger.kernel.org
Subject: [PATCH] cma: don't quit at first error when activating reserved areas
Acked-by: Barry Song <song.bao.hua@hisilicon.com>