Only select ARCH_WANT_HUGE_PMD_SHARE if hugetlb page table sharing is actually possible; page table sharing requires at least three levels, because it involves shared references to PMD tables.
Having ARCH_WANT_HUGE_PMD_SHARE enabled on non-PAE 32-bit X86 (which has 2-level paging) became particularly problematic after commit 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count"), since that changes `struct ptdesc` such that the `pt_mm` (for PGDs) and the `pt_share_count` (for PMDs) share the same union storage - and with 2-level paging, PMDs are PGDs.
(For comparison, arm64 also gates ARCH_WANT_HUGE_PMD_SHARE on the configuration of page tables such that it is never enabled with 2-level paging.)
Reported-by: Vitaly Chikunov <vt@altlinux.org>
Closes: https://lore.kernel.org/r/srhpjxlqfna67blvma5frmy3aa@altlinux.org
Fixes: cfe28c5d63d8 ("x86: mm: Remove x86 version of huge_pmd_share.")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
---
 arch/x86/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 71019b3b54ea..917f523b994b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,7 +147,7 @@ config X86
 	select ARCH_WANTS_DYNAMIC_TASK_STRUCT
 	select ARCH_WANTS_NO_INSTR
 	select ARCH_WANT_GENERAL_HUGETLB
-	select ARCH_WANT_HUGE_PMD_SHARE
+	select ARCH_WANT_HUGE_PMD_SHARE if PGTABLE_LEVELS > 2
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
 	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64

---
base-commit: d0b3b7b22dfa1f4b515fd3a295b3fd958f9e81af
change-id: 20250630-x86-2level-hugetlb-b1d8feb255ce
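The aliasing problem described in the commit message can be illustrated with a small user-space sketch. This is not kernel code: `struct ptdesc_sketch`, `struct mm_stub` and `share_count_clobbers_mm()` are hypothetical names, the real `struct ptdesc` has many more fields, and `pt_share_count` is modelled here as a plain `unsigned long` purely for illustration. The sketch only shows why two users of one union slot conflict once the PMD and the PGD are the same table.

```c
#include <string.h>

/* Hypothetical stand-in for an mm_struct; not the kernel type. */
struct mm_stub {
	int id;
};

/*
 * Hypothetical, heavily simplified model of struct ptdesc: only the two
 * union members named in the commit message are kept, and pt_share_count
 * is modelled as a plain unsigned long (an assumption for this sketch).
 */
struct ptdesc_sketch {
	union {
		struct mm_stub *pt_mm;        /* meaningful for a PGD */
		unsigned long pt_share_count; /* meaningful for a shared PMD */
	};
};

/*
 * With 2-level paging the same table is both the PGD and the PMD, so both
 * members would be in use at once. Returns 1 if touching the share count
 * changed the bytes that hold pt_mm, 0 otherwise.
 */
int share_count_clobbers_mm(void)
{
	struct mm_stub mm = { .id = 1 };
	struct ptdesc_sketch pt, saved;

	memset(&pt, 0, sizeof(pt));
	pt.pt_mm = &mm;                  /* PGD-side user stores its mm */
	memcpy(&saved, &pt, sizeof(pt)); /* snapshot of the shared storage */

	pt.pt_share_count = 0;           /* PMD-side user initialises its count */
	pt.pt_share_count++;             /* ... and takes one reference */

	/* Compare raw bytes rather than reading pt_mm back as a pointer. */
	return memcmp(&saved, &pt, sizeof(pt)) != 0;
}
```

Calling `share_count_clobbers_mm()` reports that the stored `pt_mm` bytes no longer match, which is the kind of corruption the `if PGTABLE_LEVELS > 2` condition is meant to rule out by never enabling PMD sharing where a PMD is also a PGD.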
Jann,
On Mon, Jun 30, 2025 at 09:07:34PM +0200, Jann Horn wrote:
> Only select ARCH_WANT_HUGE_PMD_SHARE if hugetlb page table sharing is actually possible; page table sharing requires at least three levels, because it involves shared references to PMD tables.
>
> Having ARCH_WANT_HUGE_PMD_SHARE enabled on non-PAE 32-bit X86 (which has 2-level paging) became particularly problematic after commit 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count"), since that changes `struct ptdesc` such that the `pt_mm` (for PGDs) and the `pt_share_count` (for PMDs) share the same union storage - and with 2-level paging, PMDs are PGDs.
>
> (For comparison, arm64 also gates ARCH_WANT_HUGE_PMD_SHARE on the configuration of page tables such that it is never enabled with 2-level paging.)
>
> Reported-by: Vitaly Chikunov <vt@altlinux.org>
> Closes: https://lore.kernel.org/r/srhpjxlqfna67blvma5frmy3aa@altlinux.org
> Fixes: cfe28c5d63d8 ("x86: mm: Remove x86 version of huge_pmd_share.")
> Cc: stable@vger.kernel.org
> Signed-off-by: Jann Horn <jannh@google.com>
Tested on i586 on top of v6.1.142 (where the problem surfaced).

Tested-by: Vitaly Chikunov <vt@altlinux.org>
Thanks,
>  arch/x86/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 71019b3b54ea..917f523b994b 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -147,7 +147,7 @@ config X86
>  	select ARCH_WANTS_DYNAMIC_TASK_STRUCT
>  	select ARCH_WANTS_NO_INSTR
>  	select ARCH_WANT_GENERAL_HUGETLB
> -	select ARCH_WANT_HUGE_PMD_SHARE
> +	select ARCH_WANT_HUGE_PMD_SHARE if PGTABLE_LEVELS > 2
>  	select ARCH_WANT_LD_ORPHAN_WARN
>  	select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
>  	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
>
> base-commit: d0b3b7b22dfa1f4b515fd3a295b3fd958f9e81af
> change-id: 20250630-x86-2level-hugetlb-b1d8feb255ce
> --
> Jann Horn <jannh@google.com>
On 6/30/25 12:07, Jann Horn wrote:
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -147,7 +147,7 @@ config X86
>  	select ARCH_WANTS_DYNAMIC_TASK_STRUCT
>  	select ARCH_WANTS_NO_INSTR
>  	select ARCH_WANT_GENERAL_HUGETLB
> -	select ARCH_WANT_HUGE_PMD_SHARE
> +	select ARCH_WANT_HUGE_PMD_SHARE if PGTABLE_LEVELS > 2
>  	select ARCH_WANT_LD_ORPHAN_WARN
>  	select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
>  	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
Does pmd sharing really even work on 32-bit? Just practically, you only ever have 3GB of address space and thus 3 possible PGDs that can be used for sharing (with the 3:1 split configured). You presumably need *some* address space for the binary to even execve(). The vdso and friends go somewhere and we normally don't let anything get mapped at 0x0.
I think that leaves _maybe_ one slot.
Barring some specific and compelling actual use case, this should probably just be:
select ARCH_WANT_HUGE_PMD_SHARE if X86_64
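The address-space argument above can be checked with a little arithmetic. The sketch below assumes the x86 PAE layout (512 entries per PMD table, 2 MiB mapped per PMD entry) and a 3 GiB user address space as in the 3:1 split; the macro and function names are made up for this illustration and do not come from the kernel.

```c
#include <stdint.h>

/* Assumed x86 PAE layout: 512 PMD entries per table, 2 MiB per entry. */
#define PAE_PTRS_PER_PMD 512u
#define PAE_PMD_SHIFT    21u

/* Bytes of virtual address space covered by one full PMD table. */
uint64_t pmd_table_span(void)
{
	return (uint64_t)PAE_PTRS_PER_PMD << PAE_PMD_SHIFT;
}

/* Upper bound on whole PMD tables that fit in user_bytes of address space. */
unsigned int max_shareable_pmd_tables(uint64_t user_bytes)
{
	return (unsigned int)(user_bytes / pmd_table_span());
}
```

Under these assumptions one PMD table spans 1 GiB, so `max_shareable_pmd_tables(3ULL << 30)` gives an upper bound of three tables. That is before subtracting the ranges already occupied by the binary, the vdso and other mappings, which is why so few shareable slots remain in practice.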
On Mon, Jun 30, 2025 at 10:39 PM Dave Hansen <dave.hansen@intel.com> wrote:
> On 6/30/25 12:07, Jann Horn wrote:
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -147,7 +147,7 @@ config X86
> >  	select ARCH_WANTS_DYNAMIC_TASK_STRUCT
> >  	select ARCH_WANTS_NO_INSTR
> >  	select ARCH_WANT_GENERAL_HUGETLB
> > -	select ARCH_WANT_HUGE_PMD_SHARE
> > +	select ARCH_WANT_HUGE_PMD_SHARE if PGTABLE_LEVELS > 2
> >  	select ARCH_WANT_LD_ORPHAN_WARN
> >  	select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
> >  	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
>
> Does pmd sharing really even work on 32-bit? Just practically, you only ever have 3GB of address space and thus 3 possible PGDs that can be used for sharing (with the 3:1 split configured). You presumably need *some* address space for the binary to even execve(). The vdso and friends go somewhere and we normally don't let anything get mapped at 0x0.
>
> I think that leaves _maybe_ one slot.
>
> Barring some specific and compelling actual use case, this should probably just be:
>
> 	select ARCH_WANT_HUGE_PMD_SHARE if X86_64
Yeah, makes sense. I was also thinking that it would be more reasonable to restrict this to 64-bit only, but figured it would be less risky to make this more specific change.
But now that I think about it, it's not like stuff is actually going to break from this change; worst case, the kernel memory usage goes up a bunch in a very unlikely configuration... so yeah, I guess I'll resend this later with "if X86_64".