The patch below does not apply to the 4.19-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
Possible dependencies:
73bdf65ea748 ("migrate: hugetlb: check for hugetlb shared PMD in node migration")
7ce82f4c3f3e ("mm/migration: return errno when isolate_huge_page failed")
1b7f7e58decc ("mm/gup: Convert check_and_migrate_movable_pages() to use a folio")
f9f38f78c5d5 ("mm: refactor check_and_migrate_movable_pages")
5ac95884a784 ("mm/migrate: enable returning precise migrate_pages() success count")
c5b5a3dd2c1f ("mm: thp: refactor NUMA fault handling")
5db4f15c4fd7 ("mm: memory: add orig_pmd to struct vm_fault")
8f34f1eac382 ("mm/userfaultfd: fix uffd-wp special cases for fork()")
25182f05ffed ("mm,hwpoison: fix race with hugetlb page allocation")
f68749ec342b ("mm/gup: longterm pin migration cleanup")
d1e153fea2a8 ("mm/gup: migrate pinned pages out of movable zone")
1a08ae36cf8b ("mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN")
6e7f34ebb8d2 ("mm/gup: check for isolation errors")
f0f4463837da ("mm/gup: return an error on migration failure")
83c02c23d074 ("mm/gup: check every subpage of a compound page during isolation")
c991ffef7bce ("mm/gup: don't pin migrated cma pages in movable zone")
7ee820ee7238 ("Revert "mm: migrate: skip shared exec THP for NUMA balancing"")
ae37c7ff79f1 ("mm: make alloc_contig_range handle in-use hugetlb pages")
369fa227c219 ("mm: make alloc_contig_range handle free hugetlb pages")
c2ad7a1ffeaf ("mm,compaction: let isolate_migratepages_{range,block} return error codes")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 73bdf65ea74857d7fb2ec3067a3cec0e261b1462 Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.kravetz@oracle.com>
Date: Thu, 26 Jan 2023 14:27:21 -0800
Subject: [PATCH] migrate: hugetlb: check for hugetlb shared PMD in node migration

migrate_pages/mempolicy semantics state that CAP_SYS_NICE is required to
move pages shared with another process to a different node.  page_mapcount
== 1 is being used to determine if a hugetlb page is shared.  However, a
hugetlb page will have a mapcount of 1 if mapped by multiple processes via
a shared PMD.  As a result, hugetlb pages shared by multiple processes and
mapped with a shared PMD can be moved by a process without CAP_SYS_NICE.
To fix, check for a shared PMD if mapcount is 1. If a shared PMD is found consider the page shared.
Link: https://lkml.kernel.org/r/20230126222721.222195-3-mike.kravetz@oracle.com
Fixes: e2d8cf405525 ("migrate: add hugepage migration code to migrate_pages()")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 02c8a712282f..f940395667c8 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -600,7 +600,8 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
 
 	/* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
 	if (flags & (MPOL_MF_MOVE_ALL) ||
-	    (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
+	    (flags & MPOL_MF_MOVE && page_mapcount(page) == 1 &&
+	     !hugetlb_pmd_shared(pte))) {
 		if (isolate_hugetlb(page, qp->pagelist) &&
 			(flags & MPOL_MF_STRICT))
 			/*
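
For anyone preparing a backport, the effect of the changed condition can be modelled in a few lines of user-space C. This is only a sketch of the boolean logic in the hunk above, not kernel code: the SKETCH_MF_* constants and the may_move_old()/may_move_new() helpers are hypothetical names invented here, and the mapcount and shared-PMD state are passed in as plain arguments instead of being read via page_mapcount() and hugetlb_pmd_shared().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the MPOL_MF_* flags; values chosen for this sketch. */
enum {
	SKETCH_MF_MOVE     = 1 << 1,
	SKETCH_MF_MOVE_ALL = 1 << 2,
};

/* Condition before the patch: a mapcount of 1 alone means "unshared". */
bool may_move_old(unsigned long flags, int mapcount)
{
	return (flags & SKETCH_MF_MOVE_ALL) ||
	       ((flags & SKETCH_MF_MOVE) && mapcount == 1);
}

/* Condition after the patch: a shared PMD also makes the page count as shared. */
bool may_move_new(unsigned long flags, int mapcount, bool pmd_shared)
{
	return (flags & SKETCH_MF_MOVE_ALL) ||
	       ((flags & SKETCH_MF_MOVE) && mapcount == 1 && !pmd_shared);
}
```

With a mapcount of 1 and a shared PMD, may_move_old(SKETCH_MF_MOVE, 1) is true while may_move_new(SKETCH_MF_MOVE, 1, true) is false, which is the case the commit message describes. The SKETCH_MF_MOVE_ALL path is unchanged in both versions.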