The patch below does not apply to the 6.1-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id, to stable@vger.kernel.org.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 0d6c356dd6547adac2b06b461528e3573f52d953
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to 'stable@vger.kernel.org' --in-reply-to '2025112032-parted-progeny-cd9e@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0d6c356dd6547adac2b06b461528e3573f52d953 Mon Sep 17 00:00:00 2001
From: "Isaac J. Manjarres" <isaacmanjarres@google.com>
Date: Tue, 28 Oct 2025 12:10:12 -0700
Subject: [PATCH] mm/mm_init: fix hash table order logging in alloc_large_system_hash()
When emitting the order of the allocation for a hash table, alloc_large_system_hash() unconditionally subtracts PAGE_SHIFT from the base-2 logarithm of the allocation size. This is incorrect when the allocation size is smaller than a page, and yields a negative value for the order, as seen below:
TCP established hash table entries: 32 (order: -4, 256 bytes, linear)
TCP bind hash table entries: 32 (order: -2, 1024 bytes, linear)
Use get_order() to compute the order when emitting the hash table information to correctly handle cases where the allocation size is smaller than a page:
TCP established hash table entries: 32 (order: 0, 256 bytes, linear)
TCP bind hash table entries: 32 (order: 0, 1024 bytes, linear)
Link: https://lkml.kernel.org/r/20251028191020.413002-1-isaacmanjarres@google.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 3db2dea7db4c..7712d887b696 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -2469,7 +2469,7 @@ void *__init alloc_large_system_hash(const char *tablename,
 		panic("Failed to allocate %s hash table\n", tablename);
 
 	pr_info("%s hash table entries: %ld (order: %d, %lu bytes, %s)\n",
-		tablename, 1UL << log2qty, ilog2(size) - PAGE_SHIFT, size,
+		tablename, 1UL << log2qty, get_order(size), size,
 		virt ? (huge ? "vmalloc hugepage" : "vmalloc") : "linear");
 
 	if (_hash_shift)
From: Lance Yang <lance.yang@linux.dev>
When a page fault occurs in a secret memory file created with `memfd_secret(2)`, the kernel will allocate a new page for it, mark the underlying page as not-present in the direct map, and add it to the file mapping.
If two tasks cause a fault in the same page concurrently, both could end up allocating a page and removing the page from the direct map, but only one would succeed in adding the page to the file mapping. The task that failed undoes the effects of its attempt by (a) freeing the page again and (b) putting the page back into the direct map. However, because these two operations happen in this order, the page becomes available to the allocator again before it is placed back in the direct mapping.
If another task attempts to allocate the page between (a) and (b), and the kernel tries to access it via the direct map, it would result in a supervisor not-present page fault.
Fix the ordering to restore the direct map before the page is freed.
Link: https://lkml.kernel.org/r/20251031120955.92116-1-lance.yang@linux.dev
Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Reported-by: Google Big Sleep <big-sleep-vuln-reports@google.com>
Closes: https://lore.kernel.org/linux-mm/CAEXGt5QeDpiHTu3K9tvjUTPqo+d-=wuCNYPa+6sWKr...
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 6f86d0534fddfbd08687fa0f01479d4226bc3c3d)
[rppt: replaced folio with page in the patch and in the changelog]
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
 mm/secretmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/secretmem.c b/mm/secretmem.c
index 624663a94808..0c86133ad33f 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -82,13 +82,13 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf)
 		__SetPageUptodate(page);
 		err = add_to_page_cache_lru(page, mapping, offset, gfp);
 		if (unlikely(err)) {
-			put_page(page);
 			/*
 			 * If a split of large page was required, it
 			 * already happened when we marked the page invalid
 			 * which guarantees that this call won't fail
 			 */
 			set_direct_map_default_noflush(page);
+			put_page(page);
 			if (err == -EEXIST)
 				goto retry;
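A condensed sketch of the fixed error path (paraphrased, not the verbatim kernel source); the two-task schedule in the comment is hypothetical and only illustrates why the old ordering was unsafe:

	/*
	 * Old order (put_page() before restoring the direct map):
	 *
	 *   Task A (faulting)            Task B (any allocation)
	 *   put_page(page) -> page free
	 *                                allocates the same page and
	 *                                touches it via the direct map
	 *                                -> supervisor not-present fault
	 *   set_direct_map_default_noflush(page)  -> too late
	 */
	err = add_to_page_cache_lru(page, mapping, offset, gfp);
	if (unlikely(err)) {
		/* Make the page present in the direct map again first... */
		set_direct_map_default_noflush(page);
		/* ...and only then drop the reference that frees the page. */
		put_page(page);
		if (err == -EEXIST)
			goto retry;
	}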
Oops, copied the wrong git send-email command, sorry for the noise
On Thu, Nov 20, 2025 at 09:15:47PM +0200, Mike Rapoport wrote:
> From: Lance Yang <lance.yang@linux.dev>
>
> [...]
From: "Isaac J. Manjarres" isaacmanjarres@google.com
When emitting the order of the allocation for a hash table, alloc_large_system_hash() unconditionally subtracts PAGE_SHIFT from the base-2 logarithm of the allocation size. This is incorrect when the allocation size is smaller than a page, and yields a negative value for the order, as seen below:
TCP established hash table entries: 32 (order: -4, 256 bytes, linear)
TCP bind hash table entries: 32 (order: -2, 1024 bytes, linear)
Use get_order() to compute the order when emitting the hash table information to correctly handle cases where the allocation size is smaller than a page:
TCP established hash table entries: 32 (order: 0, 256 bytes, linear)
TCP bind hash table entries: 32 (order: 0, 1024 bytes, linear)
Link: https://lkml.kernel.org/r/20251028191020.413002-1-isaacmanjarres@google.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 0d6c356dd6547adac2b06b461528e3573f52d953)
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 86066a2cf258..d760b96604ec 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -9225,7 +9225,7 @@ void *__init alloc_large_system_hash(const char *tablename,
 		panic("Failed to allocate %s hash table\n", tablename);
 
 	pr_info("%s hash table entries: %ld (order: %d, %lu bytes, %s)\n",
-		tablename, 1UL << log2qty, ilog2(size) - PAGE_SHIFT, size,
+		tablename, 1UL << log2qty, get_order(size), size,
 		virt ? (huge ? "vmalloc hugepage" : "vmalloc") : "linear");
 
 	if (_hash_shift)
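The arithmetic is easy to check outside the kernel. Below is a minimal userspace sketch, assuming 4 KiB pages (PAGE_SHIFT == 12); ilog2_ul() and order_of() are stand-ins for the kernel's ilog2() and get_order():

	#include <stdio.h>

	#define PAGE_SHIFT 12
	#define PAGE_SIZE  (1UL << PAGE_SHIFT)

	/* floor(log2(n)), valid for n > 0 */
	static int ilog2_ul(unsigned long n)
	{
		return (int)(8 * sizeof(n)) - 1 - __builtin_clzl(n);
	}

	/* Smallest order with (PAGE_SIZE << order) >= size; never negative */
	static int order_of(unsigned long size)
	{
		int order = 0;

		while ((PAGE_SIZE << order) < size)
			order++;
		return order;
	}

	int main(void)
	{
		unsigned long sizes[] = { 256, 1024, 4096, 8192 };

		for (unsigned int i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
			printf("size %5lu: ilog2 - PAGE_SHIFT = %3d, order = %d\n",
			       sizes[i], ilog2_ul(sizes[i]) - PAGE_SHIFT,
			       order_of(sizes[i]));
		return 0;
	}

For 256 and 1024 bytes this prints -4 and -2 with the old expression, and 0 for both with the get_order()-style computation, matching the boot-log lines quoted above.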
On Thu, Nov 20, 2025 at 09:42:22PM +0200, Mike Rapoport wrote:
From: "Isaac J. Manjarres" isaacmanjarres@google.com
mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 86066a2cf258..d760b96604ec 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -9225,7 +9225,7 @@ void *__init alloc_large_system_hash(const char *tablename, panic("Failed to allocate %s hash table\n", tablename); pr_info("%s hash table entries: %ld (order: %d, %lu bytes, %s)\n",
tablename, 1UL << log2qty, ilog2(size) - PAGE_SHIFT, size,
virt ? (huge ? "vmalloc hugepage" : "vmalloc") : "linear");tablename, 1UL << log2qty, get_order(size), size,if (_hash_shift) -- 2.50.1
Thanks for backporting these patches to the older kernel branches, Mike! I really appreciate it :)
--Isaac