From: Liu Shixin liushixin2@huawei.com
commit f1897f2f08b28ae59476d8b73374b08f856973af upstream.
syzkaller reported such a BUG_ON():
  ------------[ cut here ]------------
  kernel BUG at mm/khugepaged.c:1835!
  Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
  ...
  CPU: 6 UID: 0 PID: 8009 Comm: syz.15.106 Kdump: loaded Tainted: G W 6.13.0-rc6 #22
  Tainted: [W]=WARN
  Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
  pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : collapse_file+0xa44/0x1400
  lr : collapse_file+0x88/0x1400
  sp : ffff80008afe3a60
  ...
  Call trace:
   collapse_file+0xa44/0x1400 (P)
   hpage_collapse_scan_file+0x278/0x400
   madvise_collapse+0x1bc/0x678
   madvise_vma_behavior+0x32c/0x448
   madvise_walk_vmas.constprop.0+0xbc/0x140
   do_madvise.part.0+0xdc/0x2c8
   __arm64_sys_madvise+0x68/0x88
   invoke_syscall+0x50/0x120
   el0_svc_common.constprop.0+0xc8/0xf0
   do_el0_svc+0x24/0x38
   el0_svc+0x34/0x128
   el0t_64_sync_handler+0xc8/0xd0
   el0t_64_sync+0x190/0x198
This indicates that the pgoff is unaligned. After analysis, I confirmed that the vma is a mapping of /dev/zero. Such a vma does have a vm_file, but mmap_zero() sets it up as anonymous. So even if its pgoff is not 2M-aligned, it passes the check in thp_vma_allowable_order() as an anonymous mapping, but is then collapsed as a file mapping.
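To make the alignment problem concrete, here is a minimal sketch of the page-offset arithmetic, assuming 4 KiB base pages and 2 MiB PMD-sized huge pages (so 512 base pages per huge page). The struct and the model_* helpers are hypothetical stand-ins written for illustration, not kernel code; only the arithmetic is meant to mirror what linear_page_index() does for an ordinary mapping.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry: 4 KiB base pages, 2 MiB PMD-sized huge pages. */
#define MODEL_PAGE_SHIFT   12
#define MODEL_HPAGE_PMD_NR 512UL /* base pages per PMD-sized huge page */

/* Hypothetical stand-in for the two vma fields this sketch needs. */
struct vma_model {
	uint64_t vm_start; /* first virtual address of the mapping */
	uint64_t vm_pgoff; /* offset into the backing file, in pages */
};

/* Page offset of addr within the file: pages into the vma plus vm_pgoff. */
static inline uint64_t model_linear_page_index(const struct vma_model *vma,
					       uint64_t addr)
{
	return ((addr - vma->vm_start) >> MODEL_PAGE_SHIFT) + vma->vm_pgoff;
}

/* True when pgoff starts on a PMD-sized boundary. */
static inline bool model_pgoff_is_pmd_aligned(uint64_t pgoff)
{
	return (pgoff & (MODEL_HPAGE_PMD_NR - 1)) == 0;
}
```

With vm_start = 0x200000 and vm_pgoff = 1, the address 0x200000 is 2M-aligned, yet its page offset is 1, which is not a multiple of 512. An anonymous mapping never cares about this offset, which is why the anonymous-path check lets such a vma through.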
The problem appears to have existed for a long time, but it was previously masked: the khugepaged_max_ptes_none check caused the collapse to be skipped, because a /dev/zero mapping has no present pages. Commit d8ea7cc8547c limited that check to khugepaged only, so the BUG_ON() can now be triggered through madvise_collapse().
Add a vma_is_anonymous() check so that such a vma is processed by hpage_collapse_scan_pmd().
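The effect of the changed condition can be sketched in plain C as follows. This is a simplified userspace model, assuming CONFIG_SHMEM is enabled; the struct, the enum and the model_* functions are hypothetical names introduced here. The one detail taken from the kernel is that vma_is_anonymous() reports a vma as anonymous when it has no vm_ops.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two vma fields the dispatch looks at. */
struct vma_kind_model {
	const void *vm_file; /* non-NULL: the mapping was created from a file */
	const void *vm_ops;  /* NULL: the mapping is treated as anonymous */
};

/* Which scan routine the caller would pick. */
enum model_scan_path {
	MODEL_SCAN_PMD,  /* anonymous path: hpage_collapse_scan_pmd() */
	MODEL_SCAN_FILE, /* file path: hpage_collapse_scan_file() */
};

/* Models vma_is_anonymous(): anonymous means no vm_ops. */
static inline bool model_vma_is_anonymous(const struct vma_kind_model *vma)
{
	return vma->vm_ops == NULL;
}

/* Old condition: any vma with a vm_file takes the file path. */
static inline enum model_scan_path
model_dispatch_old(const struct vma_kind_model *vma)
{
	return vma->vm_file ? MODEL_SCAN_FILE : MODEL_SCAN_PMD;
}

/* New condition: only non-anonymous vmas take the file path. */
static inline enum model_scan_path
model_dispatch_new(const struct vma_kind_model *vma)
{
	return !model_vma_is_anonymous(vma) ? MODEL_SCAN_FILE : MODEL_SCAN_PMD;
}
```

A /dev/zero mapping as described above has vm_file set and vm_ops cleared. In this model the old condition sends it down the file path, while the new one sends it down the anonymous path. Ordinary file mappings and plain anonymous mappings are dispatched the same way by both.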
Link: https://lkml.kernel.org/r/20250111034511.2223353-1-liushixin2@huawei.com
Fixes: d8ea7cc8547c ("mm/khugepaged: add flag to predicate khugepaged-only behavior")
Signed-off-by: Liu Shixin liushixin2@huawei.com
Reviewed-by: Yang Shi yang@os.amperecomputing.com
Acked-by: David Hildenbrand david@redhat.com
Cc: Chengming Zhou chengming.zhou@linux.dev
Cc: Johannes Weiner hannes@cmpxchg.org
Cc: Kefeng Wang wangkefeng.wang@huawei.com
Cc: Mattew Wilcox willy@infradead.org
Cc: Muchun Song muchun.song@linux.dev
Cc: Nanyong Sun sunnanyong@huawei.com
Cc: Qi Zheng zhengqi.arch@bytedance.com
Signed-off-by: Andrew Morton akpm@linux-foundation.org
[acsjakub: backport, clean apply]
Cc: Jakub Acs acsjakub@amazon.de
Cc: linux-mm@kvack.org
---
Ran into the crash with syzkaller, backporting this patch works - the reproducer no longer crashes.
Please let me know if there was a reason not to backport.
 mm/khugepaged.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b538c3d48386..abd5764e4864 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2404,7 +2404,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 			VM_BUG_ON(khugepaged_scan.address < hstart ||
 				  khugepaged_scan.address + HPAGE_PMD_SIZE >
 				  hend);
-			if (IS_ENABLED(CONFIG_SHMEM) && vma->vm_file) {
+			if (IS_ENABLED(CONFIG_SHMEM) && !vma_is_anonymous(vma)) {
 				struct file *file = get_file(vma->vm_file);
 				pgoff_t pgoff = linear_page_index(vma,
 						khugepaged_scan.address);
@@ -2750,7 +2750,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev,
 		mmap_assert_locked(mm);
 		memset(cc->node_load, 0, sizeof(cc->node_load));
 		nodes_clear(cc->alloc_nmask);
-		if (IS_ENABLED(CONFIG_SHMEM) && vma->vm_file) {
+		if (IS_ENABLED(CONFIG_SHMEM) && !vma_is_anonymous(vma)) {
 			struct file *file = get_file(vma->vm_file);
 			pgoff_t pgoff = linear_page_index(vma, addr);
 
On Tue, Jul 29, 2025 at 09:03:47AM +0000, Jakub Acs wrote:
Link: https://lkml.kernel.org/r/20250111034511.2223353-1-liushixin2@huawei.com
Fixes: d8ea7cc8547c ("mm/khugepaged: add flag to predicate khugepaged-only behavior")
Signed-off-by: Liu Shixin liushixin2@huawei.com
Reviewed-by: Yang Shi yang@os.amperecomputing.com
Acked-by: David Hildenbrand david@redhat.com
Cc: Chengming Zhou chengming.zhou@linux.dev
Cc: Johannes Weiner hannes@cmpxchg.org
Cc: Kefeng Wang wangkefeng.wang@huawei.com
Cc: Mattew Wilcox willy@infradead.org
Cc: Muchun Song muchun.song@linux.dev
Cc: Nanyong Sun sunnanyong@huawei.com
Cc: Qi Zheng zhengqi.arch@bytedance.com
Signed-off-by: Andrew Morton akpm@linux-foundation.org
[acsjakub: backport, clean apply]
Cc: Jakub Acs acsjakub@amazon.de
You need to sign off on patches you forward on. Please fix that up and resend all of these.
thanks,
greg k-h
On Tue, Jul 29, 2025 at 04:49:51PM +0200, Greg KH wrote:
Oh, that's embarrassing, my apologies for the miss, sent v2s:
https://lore.kernel.org/all/20250730073927.27312-1-acsjakub@amazon.de/
https://lore.kernel.org/all/20250730073956.28488-1-acsjakub@amazon.de/
https://lore.kernel.org/all/20250730073945.27790-1-acsjakub@amazon.de/
Jakub
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: f1897f2f08b28ae59476d8b73374b08f856973af
WARNING: Author mismatch between patch and upstream commit:
Backport author: Jakub Acs acsjakub@amazon.de
Commit author: Liu Shixin liushixin2@huawei.com

Status in newer kernel trees:
6.15.y | Present (exact SHA1)
Note: The patch differs from the upstream commit:
---
1:  f1897f2f08b2 ! 1:  9575e41a87ec mm: khugepaged: fix call hpage_collapse_scan_file() for anonymous vma
    @@ Metadata
      ## Commit message ##
         mm: khugepaged: fix call hpage_collapse_scan_file() for anonymous vma

    +    commit f1897f2f08b28ae59476d8b73374b08f856973af upstream.
    +
         syzkaller reported such a BUG_ON():

          ------------[ cut here ]------------
    @@ Commit message
         Cc: Nanyong Sun sunnanyong@huawei.com
         Cc: Qi Zheng zhengqi.arch@bytedance.com
         Signed-off-by: Andrew Morton akpm@linux-foundation.org
    +    [acsjakub: backport, clean apply]
    +    Cc: Jakub Acs acsjakub@amazon.de
    +    Cc: linux-mm@kvack.org

      ## mm/khugepaged.c ##
     @@ mm/khugepaged.c: static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| origin/linux-6.12.y       | Success     | Success    |