The patch titled
Subject: mm/hugetlb: fix uffd wr-protection for CoW optimization path
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-hugetlb-fix-uffd-wr-protection-for-cow-optimization-path.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Peter Xu <peterx(a)redhat.com>
Subject: mm/hugetlb: fix uffd wr-protection for CoW optimization path
Date: Tue, 21 Mar 2023 15:18:40 -0400
This patch fixes an issue where a hugetlb uffd-wr-protected mapping can be
writable even with the uffd-wp bit set.  It only happens when all of these
conditions are met: (1) hugetlb memory, (2) private mapping, (3) the
original mapping was missing, and (4) it was then wr-protected (IOW, a pte
marker was installed).  A write to the page then triggers the bug.
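For illustration, a minimal userspace reproducer sketch following the four
conditions above could look like the below.  This is not part of the patch;
the 2MB default hugepage size and the use of UFFD_FEATURE_WP_HUGETLBFS_SHMEM
are assumptions for the example, and error handling is omitted:

#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
	size_t len = 2UL << 20;		/* assumes 2MB default hugepage size */
	long uffd = syscall(__NR_userfaultfd, O_CLOEXEC);

	struct uffdio_api api = {
		.api = UFFD_API,
		.features = UFFD_FEATURE_WP_HUGETLBFS_SHMEM,
	};
	ioctl(uffd, UFFDIO_API, &api);

	/* (1)+(2): private hugetlb mapping; never touched, so the pte is
	 * still "missing" (condition (3)) */
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

	struct uffdio_register reg = {
		.range = { .start = (unsigned long)p, .len = len },
		.mode = UFFDIO_REGISTER_MODE_WP,
	};
	ioctl(uffd, UFFDIO_REGISTER, &reg);

	/* (4): wr-protect while still missing -> uffd-wp pte marker */
	struct uffdio_writeprotect wp = {
		.range = { .start = (unsigned long)p, .len = len },
		.mode = UFFDIO_WRITEPROTECT_MODE_WP,
	};
	ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);

	/* Trigger: this write should be trapped by uffd-wp, but on affected
	 * kernels the CoW optimization path maps the page writable and the
	 * write goes through without any notification. */
	p[0] = 1;

	return 0;
}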
The userfaultfd-wp trap for hugetlb was implemented in hugetlb_fault(),
before hugetlb_wp() is even reached, to avoid taking locks that userfault
won't need.  However, there is one CoW optimization path for a missing
hugetlb page that can trigger hugetlb_wp() from inside hugetlb_no_page(),
and that path bypasses the userfaultfd-wp traps.
A few ways to resolve this:
(1) Skip the CoW optimization for hugetlb private mappings, on the grounds
that private hugetlb mappings should be very rare, so the optimization may
not really help major workloads.  At worst, we would only skip the
optimization when userfaultfd_wp(vma)==true, because uffd-wp needs another
fault anyway.
(2) Move the userfaultfd-wp handling for hugetlb from hugetlb_fault()
into hugetlb_wp().  The major con is that a bunch of locks are already
held by the time hugetlb_wp() is called, which would make the changeset
unnecessarily complicated due to the lock operations.
(3) Carry the uffd-wp bit over in hugetlb_wp(), so uffd-wp privately
mapped pages will need to fault again.
This patch takes option (3), which has the minimal changeset (simplest to
backport) and also makes sure hugetlb_wp() itself will always be safe with
uffd-wp ptes even if called from elsewhere in the future.
This fix is needed for v5.19 and later, hence stable is copied.
Link: https://lkml.kernel.org/r/20230321191840.1897940-1-peterx@redhat.com
Fixes: 166f3ecc0daf ("mm/hugetlb: hook page faults for uffd write protection")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Reported-by: Muhammad Usama Anjum <usama.anjum(a)collabora.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Mike Rapoport <rppt(a)linux.vnet.ibm.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-uffd-wr-protection-for-cow-optimization-path
+++ a/mm/hugetlb.c
@@ -5478,7 +5478,7 @@ static vm_fault_t hugetlb_wp(struct mm_s
struct folio *pagecache_folio, spinlock_t *ptl)
{
const bool unshare = flags & FAULT_FLAG_UNSHARE;
- pte_t pte;
+ pte_t pte, newpte;
struct hstate *h = hstate_vma(vma);
struct page *old_page;
struct folio *new_folio;
@@ -5622,8 +5622,10 @@ retry_avoidcopy:
mmu_notifier_invalidate_range(mm, range.start, range.end);
page_remove_rmap(old_page, vma, true);
hugepage_add_new_anon_rmap(new_folio, vma, haddr);
- set_huge_pte_at(mm, haddr, ptep,
- make_huge_pte(vma, &new_folio->page, !unshare));
+ newpte = make_huge_pte(vma, &new_folio->page, !unshare);
+ if (huge_pte_uffd_wp(pte))
+ newpte = huge_pte_mkuffd_wp(newpte);
+ set_huge_pte_at(mm, haddr, ptep, newpte);
folio_set_hugetlb_migratable(new_folio);
/* Make the old page be freed below */
new_folio = page_folio(old_page);
_
Patches currently in -mm which might be from peterx(a)redhat.com are
kselftest-vm-fix-unused-variable-warning.patch
tools-headers-uapi-sync-linux-prctlh-with-the-kernel-sources.patch
mm-hugetlb-fix-uffd-wr-protection-for-cow-optimization-path.patch
mm-khugepaged-alloc_charge_hpage-take-care-of-mem-charge-errors.patch
mm-khugepaged-cleanup-memcg-uncharge-for-failure-path.patch
mm-uffd-uffd_feature_wp_unpopulated.patch
selftests-mm-smoke-test-uffd_feature_wp_unpopulated.patch
mm-thp-rename-transparent_hugepage_never_dax-to-_unsupported.patch
mm-thp-rename-transparent_hugepage_never_dax-to-_unsupported-fix.patch
------------------------------------------------------
The conditional MOVS instruction that appears to have been added to test
for the TIF_USING_IWMMXT thread_info flag only sets the N and Z condition
flags and register R7, none of which are referenced in the subsequent
code.  This means that the instruction has no effect, and so we might
misidentify faulting FPE instructions as iWMMXt instructions on kernels
that were built to support both.
This seems to have been part of the original submission of the code, so it
has never worked as intended and nobody ever noticed; we might therefore
decide to just leave it as-is.  However, with the ongoing move towards
multiplatform kernels, the issue becomes more likely to manifest, so it is
better to fix it.
So check whether we are dealing with an undef exception involving
coprocessor index #0 or #1, and if so, load the thread_info flags and only
dispatch it as an iWMMXt trap if the TIF_USING_IWMMXT flag is set.
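Expressed in C, the dispatch logic the new sequence implements is roughly
the following sketch (illustrative only, not kernel code; the helper name
is made up and the TIF_USING_IWMMXT value is stubbed here, the real one
lives in asm/thread_info.h):

#include <stdbool.h>
#include <stdint.h>

#define TIF_USING_IWMMXT	17	/* stub for illustration; see asm/thread_info.h */

static bool dispatch_as_iwmmxt(uint32_t instr, unsigned long ti_flags)
{
	/* coprocessor number field of a coprocessor instruction, bits [11:8] */
	unsigned int cp_num = (instr >> 8) & 0xf;

	if (cp_num > 1)		/* iWMMXt owns CP 0 and CP 1 only */
		return false;

	/* dispatch to the iWMMXt handler only when the thread uses iWMMXt */
	return ti_flags & (1UL << TIF_USING_IWMMXT);
}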
Cc: <stable(a)vger.kernel.org> # v2.6.9+
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
---
arch/arm/kernel/entry-armv.S | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index c39303e5c23470e6..c5d2f07994fb0d87 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -606,10 +606,11 @@ call_fpe:
strb r7, [r6, #TI_USED_CP] @ set appropriate used_cp[]
#ifdef CONFIG_IWMMXT
@ Test if we need to give access to iWMMXt coprocessors
- ldr r5, [r10, #TI_FLAGS]
- rsbs r7, r8, #(1 << 8) @ CP 0 or 1 only
- movscs r7, r5, lsr #(TIF_USING_IWMMXT + 1)
- bcs iwmmxt_task_enable
+ tst r8, #0xe << 8 @ CP 0 or 1?
+ ldreq r5, [r10, #TI_FLAGS] @ if so, load thread_info flags
+ andeq r5, r5, #1 << TIF_USING_IWMMXT @ isolate TIF_USING_IWMMXT flag
+ teqeq r5, #1 << TIF_USING_IWMMXT @ check whether it is set
+ beq iwmmxt_task_enable @ branch if set
#endif
ARM( add pc, pc, r8, lsr #6 )
THUMB( lsr r8, r8, #6 )
--
2.39.2