On Sat, 21 Jan 2023, gregkh@linuxfoundation.org wrote:
> The patch below does not apply to the 5.15-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to stable@vger.kernel.org.
Thanks Greg: the backport below is suitable for 5.15-stable and 5.10-stable and 5.4-stable.
Hugh
From ab0c3f1251b4670978fde0bd54161795a139b060 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd@google.com>
Date: Thu, 22 Dec 2022 12:41:50 -0800
Subject: [PATCH] mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma
commit ab0c3f1251b4670978fde0bd54161795a139b060 upstream.
uprobe_write_opcode() uses collapse_pte_mapped_thp() to restore huge pmd, when removing a breakpoint from hugepage text: vma->anon_vma is always set in that case, so undo the prohibition. And MADV_COLLAPSE ought to be able to collapse some page tables in a vma which happens to have anon_vma set from CoWing elsewhere.
Is anon_vma lock required? Almost not: if any page other than expected subpage of the non-anon huge page is found in the page table, collapse is aborted without making any change. However, it is possible that an anon page was CoWed from this extent in another mm or vma, in which case a concurrent lookup might look here: so keep it away while clearing pmd (but perhaps we shall go back to using pmd_lock() there in future).
Note that collapse_pte_mapped_thp() is exceptional in freeing a page table without having cleared its ptes: I'm uneasy about that, and had thought pte_clear()ing appropriate; but exclusive i_mmap lock does fix the problem, and we would have to move the mmu_notification if clearing those ptes.
What this fixes is not a dangerous instability. But I suggest Cc stable because uprobes "healing" has regressed in that way, so this should follow 8d3c106e19e8 into those stable releases where it was backported (and may want adjustment there - I'll supply backports as needed).
Link: https://lkml.kernel.org/r/b740c9fb-edba-92ba-59fb-7a5592e5dfc@google.com
Fixes: 8d3c106e19e8 ("mm/khugepaged: take the right locks for page table retraction")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: <stable@vger.kernel.org>    [5.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 mm/khugepaged.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index fd25d12e85b3..3afcb1466ec5 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1458,14 +1458,6 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
 	if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE))
 		return;
 
-	/*
-	 * Symmetry with retract_page_tables(): Exclude MAP_PRIVATE mappings
-	 * that got written to. Without this, we'd have to also lock the
-	 * anon_vma if one exists.
-	 */
-	if (vma->anon_vma)
-		return;
-
 	hpage = find_lock_page(vma->vm_file->f_mapping,
 			       linear_page_index(vma, haddr));
 	if (!hpage)
@@ -1537,6 +1529,10 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
 	}
 
 	/* step 4: collapse pmd */
+	/* we make no change to anon, but protect concurrent anon page lookup */
+	if (vma->anon_vma)
+		anon_vma_lock_write(vma->anon_vma);
+
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, NULL, mm, haddr,
 				haddr + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
@@ -1546,6 +1542,8 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
 	mmu_notifier_invalidate_range_end(&range);
 	pte_free(mm, pmd_pgtable(_pmd));
 
+	if (vma->anon_vma)
+		anon_vma_unlock_write(vma->anon_vma);
 	i_mmap_unlock_write(vma->vm_file->f_mapping);
 
 drop_hpage: