Re: [PATCH 1/2] hugetlb: fix update_and_free_page contig page struct assumption

18 Feb 2021


      On 18 Feb 2021, at 12:51, Mike Kravetz wrote:
...
On 2/18/21 9:40 AM, Zi Yan wrote:
...
On 18 Feb 2021, at 12:32, Jason Gunthorpe wrote:
...
On Thu, Feb 18, 2021 at 12:27:58PM -0500, Zi Yan wrote:
...
On 18 Feb 2021, at 12:25, Jason Gunthorpe wrote:
...
On Thu, Feb 18, 2021 at 02:45:54PM +0000, Matthew Wilcox wrote:
...
On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote:
> On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz mike.kravetz@oracle.com wrote:
>> page structs are not guaranteed to be contiguous for gigantic pages.  The
>
> June 2014.  That's a long lurk time for a bug.  I wonder if some later
> commit revealed it.
I would suggest that gigantic pages have not seen much use.  Certainly
performance with Intel CPUs on benchmarks that I've been involved with
showed lower performance with 1GB pages than with 2MB pages until quite
recently.
I suggested in another thread that maybe it is time to consider
dropping this "feature"
You mean dropping gigantic page support in hugetlb?
No, I mean dropping support for arches that want to do:
tail_page != head_page + tail_page_nr
because they can't allocate the required page array either virtually
or physically contiguously.
It seems like quite a burden on the core mm for a very niche, and
maybe even non-existant, case.
It was originally done for PPC, can these PPC systems use VMEMMAP now?
...
...
The cost to fix GUP to be compatible with this will hurt normal
GUP performance - and again, that nobody has hit this bug in GUP
further suggests the feature isn't used..
A easy fix might be to make gigantic hugetlb page depends on
CONFIG_SPARSEMEM_VMEMMAP, which guarantee all struct pages are contiguous.
Yes, exactly.
I actually have a question on CONFIG_SPARSEMEM_VMEMMAP. Can we assume
PFN_A - PFN_B == struct_page_A - struct_page_B, meaning all struct pages
are ordered based on physical addresses? I just wonder for two PFN ranges,
e.g., [0 - 128MB], [128MB - 256MB], if it is possible to first online
[128MB - 256MB] then [0 - 128MB] and the struct pages of [128MB - 256MB]
are in front of [0 - 128MB] in the vmemmap due to online ordering.
I have not looked at the code which does the onlining and vmemmap setup.
But, these definitions make me believe it is true:
#elif defined(CONFIG_SPARSEMEM_VMEMMAP)
/* memmap is virtually contiguous.  */
#define __pfn_to_page(pfn)      (vmemmap + (pfn))
#define __page_to_pfn(page)     (unsigned long)((page) - vmemmap)
Makes sense. Thank you for checking.
I guess making gigantic page depends on CONFIG_SPARSEMEM_VMEMMAP might
be a good way of simplifying code and avoiding future bugs unless
there is an arch really needs gigantic page and cannot have VMEMMAP.
—
Best Regards,
Yan Zi

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH 1/2] hugetlb: fix update_and_free_page contig page struct assumption