A hugetlb page will have a mapcount of 1 if mapped by multiple processes
via a shared PMD.  This is because only the first process increases the
map count, and subsequent processes just add the shared PMD page to
their page table.

page_mapcount is being used to decide if a hugetlb page is shared or
private in /proc/PID/smaps.  Pages referenced via a shared PMD were
incorrectly being counted as private.

To fix, check for a shared PMD if mapcount is 1.  If a shared PMD is
found, count the hugetlb page as shared.  A new helper to check for a
shared PMD is added.
Fixes: 25ee01a2fca0 ("mm: hugetlb: proc: add hugetlb-related fields to /proc/PID/smaps")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
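For reference, a minimal userspace sketch of the scenario described in the
commit message: two processes map the same hugetlbfs file MAP_SHARED with
PUD_SIZE (1 GiB on x86_64) size and alignment so the kernel can share the
PMD page, and the parent then looks at the Hugetlb counters in its smaps.
This sketch is not part of the patch; the file path, address hint, and
hugepage pool size (at least 512 free 2 MiB pages) are illustrative
assumptions, and PMD sharing only happens on kernels built with
CONFIG_ARCH_WANT_HUGE_PMD_SHARE.

/*
 * Illustrative reproducer sketch (not part of this patch).
 */
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAP_LEN  (1UL << 30)            /* PUD_SIZE on x86_64 */
#define MAP_HINT ((void *)(1UL << 40))  /* PUD-aligned address hint */

static void *map_shared_hugetlb(int fd)
{
        void *addr = mmap(MAP_HINT, MAP_LEN, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);

        if (addr == MAP_FAILED) {
                perror("mmap");
                exit(1);
        }
        memset(addr, 0, MAP_LEN);       /* fault in the huge pages */
        return addr;
}

int main(void)
{
        char cmd[64];
        pid_t child;
        int fd = open("/dev/hugepages/pmd-share-test", O_CREAT | O_RDWR, 0600);

        if (fd < 0 || ftruncate(fd, MAP_LEN)) {
                perror("hugetlbfs file");
                return 1;
        }
        unlink("/dev/hugepages/pmd-share-test");

        map_shared_hugetlb(fd);
        child = fork();
        if (child < 0) {
                perror("fork");
                return 1;
        }
        if (child == 0) {               /* second sharer of the PMD page */
                map_shared_hugetlb(fd);
                pause();
                _exit(0);
        }
        sleep(1);                       /* give the child time to map */

        /* Dump Shared_Hugetlb/Private_Hugetlb for all of our mappings. */
        snprintf(cmd, sizeof(cmd), "grep Hugetlb /proc/%d/smaps", getpid());
        system(cmd);
        kill(child, SIGKILL);
        return 0;
}

If the PMD page is shared, the hugetlb mapping above was accounted under
Private_Hugetlb before this patch even though the pages are shared by two
processes; with the patch it is reported as Shared_Hugetlb.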
fs/proc/task_mmu.c | 10 ++++++++--
include/linux/hugetlb.h | 12 ++++++++++++
2 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index e35a0398db63..cb9539879402 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -749,8 +749,14 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
 
                 if (mapcount >= 2)
                         mss->shared_hugetlb += huge_page_size(hstate_vma(vma));
-                else
-                        mss->private_hugetlb += huge_page_size(hstate_vma(vma));
+                else {
+                        if (hugetlb_pmd_shared(pte))
+                                mss->shared_hugetlb +=
+                                                huge_page_size(hstate_vma(vma));
+                        else
+                                mss->private_hugetlb +=
+                                                huge_page_size(hstate_vma(vma));
+                }
         }
         return 0;
 }
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index e3aa336df900..8e65920e4363 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1225,6 +1225,18 @@ static inline __init void hugetlb_cma_reserve(int order)
 }
 #endif
 
+#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
+static inline bool hugetlb_pmd_shared(pte_t *pte)
+{
+        return page_count(virt_to_page(pte)) > 1;
+}
+#else
+static inline bool hugetlb_pmd_shared(pte_t *pte)
+{
+        return false;
+}
+#endif
+
 bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr);
 
 #ifndef __HAVE_ARCH_FLUSH_HUGETLB_TLB_RANGE
--
2.39.1
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
Possible dependencies:
73bdf65ea748 ("migrate: hugetlb: check for hugetlb shared PMD in node migration")
7ce82f4c3f3e ("mm/migration: return errno when isolate_huge_page failed")
1b7f7e58decc ("mm/gup: Convert check_and_migrate_movable_pages() to use a folio")
f9f38f78c5d5 ("mm: refactor check_and_migrate_movable_pages")
5ac95884a784 ("mm/migrate: enable returning precise migrate_pages() success count")
c5b5a3dd2c1f ("mm: thp: refactor NUMA fault handling")
5db4f15c4fd7 ("mm: memory: add orig_pmd to struct vm_fault")
8f34f1eac382 ("mm/userfaultfd: fix uffd-wp special cases for fork()")
25182f05ffed ("mm,hwpoison: fix race with hugetlb page allocation")
f68749ec342b ("mm/gup: longterm pin migration cleanup")
d1e153fea2a8 ("mm/gup: migrate pinned pages out of movable zone")
1a08ae36cf8b ("mm cma: rename PF_MEMALLOC_NOCMA to PF_MEMALLOC_PIN")
6e7f34ebb8d2 ("mm/gup: check for isolation errors")
f0f4463837da ("mm/gup: return an error on migration failure")
83c02c23d074 ("mm/gup: check every subpage of a compound page during isolation")
c991ffef7bce ("mm/gup: don't pin migrated cma pages in movable zone")
7ee820ee7238 ("Revert "mm: migrate: skip shared exec THP for NUMA balancing"")
ae37c7ff79f1 ("mm: make alloc_contig_range handle in-use hugetlb pages")
369fa227c219 ("mm: make alloc_contig_range handle free hugetlb pages")
c2ad7a1ffeaf ("mm,compaction: let isolate_migratepages_{range,block} return error codes")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 73bdf65ea74857d7fb2ec3067a3cec0e261b1462 Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.kravetz@oracle.com>
Date: Thu, 26 Jan 2023 14:27:21 -0800
Subject: [PATCH] migrate: hugetlb: check for hugetlb shared PMD in node
migration
migrate_pages/mempolicy semantics state that CAP_SYS_NICE is required to
move pages shared with another process to a different node.  page_mapcount > 1
is being used to determine if a hugetlb page is shared.  However, a hugetlb
page will have a mapcount of 1 if mapped by multiple processes via a shared
PMD.  As a result, hugetlb pages shared by multiple processes and mapped
with a shared PMD can be moved by a process without CAP_SYS_NICE.

To fix, check for a shared PMD if mapcount is 1.  If a shared PMD is found,
consider the page shared.
Link: https://lkml.kernel.org/r/20230126222721.222195-3-mike.kravetz@oracle.com
Fixes: e2d8cf405525 ("migrate: add hugepage migration code to migrate_pages()")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 02c8a712282f..f940395667c8 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -600,7 +600,8 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
 
         /* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
         if (flags & (MPOL_MF_MOVE_ALL) ||
-            (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
+            (flags & MPOL_MF_MOVE && page_mapcount(page) == 1 &&
+             !hugetlb_pmd_shared(pte))) {
                 if (isolate_hugetlb(page, qp->pagelist) &&
                     (flags & MPOL_MF_STRICT))
                         /*
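To make the semantics being policed above concrete, the sketch below shows
the kind of unprivileged request that queue_pages_hugetlb() gates:
MPOL_MF_MOVE is only supposed to migrate pages the caller does not share
with other processes.  This is an illustrative sketch, not from the commit;
the helper name and error handling are assumptions, it expects libnuma's
<numaif.h> (link with -lnuma), and the address and length would come from a
shared hugetlb mapping such as the one sketched earlier in this thread.

/*
 * Hypothetical caller sketch: bind a range to one node and ask the kernel
 * to move only unshared pages there (MPOL_MF_MOVE, not MPOL_MF_MOVE_ALL).
 */
#include <numaif.h>
#include <stdio.h>

static int bind_and_move_to_node(void *addr, unsigned long len, int node)
{
        unsigned long nodemask = 1UL << node;

        /* MPOL_MF_MOVE should leave pages shared with other tasks alone. */
        if (mbind(addr, len, MPOL_BIND, &nodemask,
                  sizeof(nodemask) * 8, MPOL_MF_MOVE | MPOL_MF_STRICT)) {
                perror("mbind");
                return -1;
        }
        return 0;
}

Before the fix, such a call from a task without CAP_SYS_NICE could still
migrate hugetlb pages that other processes map through a shared PMD,
because their mapcount stayed at 1.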
The patch does not apply to the 4.19-stable tree either.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
thanks,
greg k-h