The pmd_trans_huge() code in mfill_atomic() is wrong in two different ways depending on kernel version:

1. The pmd_trans_huge() check is racy and can lead to a BUG_ON() (if you hit
   the right two race windows) - I've tested this in a kernel build with some
   extra mdelay() calls. See the commit message for a description of the race
   scenario.
   On older kernels (before 6.5), I think the same bug can even theoretically
   lead to accessing transhuge page contents as a page table if you hit the
   right 5 narrow race windows (I haven't tested this case).
2. On newer kernels (>=6.5), for shmem mappings, khugepaged is allowed to yank
   page tables out from under us (though I haven't tested that), so I think
   the BUG_ON() checks in mfill_atomic() are just wrong.

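For readers who want a feel for the general shape of a recheck race like the one in item 1, here is a minimal userspace sketch. It is not the code from mm/userfaultfd.c and not the fix in this series: the entry encoding, `is_usable_table()`, `racy_lookup()`, `snapshot_lookup()` and the `writer_fn` hook that stands in for a concurrent thread are all hypothetical names made up for illustration. It only shows why checking one read of a shared entry and then acting on a second read can disagree, while checking and using a single snapshot cannot.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical entry encoding for illustration; not the kernel's pmd_t. */
#define ENTRY_PRESENT 0x1u
#define ENTRY_HUGE    0x2u

/* Hook that stands in for a concurrent writer hitting the race window. */
typedef void (*writer_fn)(_Atomic uint64_t *entry);

static inline bool is_usable_table(uint64_t val)
{
	return (val & ENTRY_PRESENT) && !(val & ENTRY_HUGE);
}

/* Simulated concurrent writer: replaces the entry with a huge mapping. */
static inline void install_huge(_Atomic uint64_t *entry)
{
	atomic_store(entry, ENTRY_PRESENT | ENTRY_HUGE);
}

/* Racy pattern: check one read, then hand back a second, unchecked read. */
static inline uint64_t racy_lookup(_Atomic uint64_t *entry, writer_fn writer,
				   bool *ok)
{
	*ok = is_usable_table(atomic_load(entry));	/* read #1: checked */
	writer(entry);					/* race window */
	return atomic_load(entry);			/* read #2: used */
}

/* Snapshot pattern: the value that was checked is the value that is used. */
static inline uint64_t snapshot_lookup(_Atomic uint64_t *entry,
				       writer_fn writer, bool *ok)
{
	uint64_t val = atomic_load(entry);		/* single read */

	*ok = is_usable_table(val);
	writer(entry);					/* same race window */
	return val;
}

/* Small self-check that exercises both patterns. */
static inline void demo(void)
{
	_Atomic uint64_t entry = ENTRY_PRESENT;
	bool ok;
	uint64_t val;

	/* The check passes, yet the value handed back is a huge entry. */
	val = racy_lookup(&entry, install_huge, &ok);
	assert(ok && !is_usable_table(val));

	/* With a snapshot, the check and the returned value always agree. */
	atomic_store(&entry, ENTRY_PRESENT);
	val = snapshot_lookup(&entry, install_huge, &ok);
	assert(ok && is_usable_table(val));
}
```

A snapshot can of course still go stale after it is taken; the sketch only shows that all decisions are made against one consistent value, so anything that later depends on the entry still needs its own validation.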
I decided to write two separate fixes for these, so that the first fix can be backported to kernels affected by the first bug.

Signed-off-by: Jann Horn <jannh@google.com>
---
Jann Horn (2):
      userfaultfd: Fix pmd_trans_huge() recheck race
      userfaultfd: Don't BUG_ON() if khugepaged yanks our page table

 mm/userfaultfd.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)
---
base-commit: d4560686726f7a357922f300fc81f5964be8df04
change-id: 20240812-uffd-thp-flip-fix-20f91f1151b9