On 18 Feb 2021, at 12:51, Mike Kravetz wrote:
On 2/18/21 9:40 AM, Zi Yan wrote:
On 18 Feb 2021, at 12:32, Jason Gunthorpe wrote:
On Thu, Feb 18, 2021 at 12:27:58PM -0500, Zi Yan wrote:
On 18 Feb 2021, at 12:25, Jason Gunthorpe wrote:
On Thu, Feb 18, 2021 at 02:45:54PM +0000, Matthew Wilcox wrote:
On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote: > On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz mike.kravetz@oracle.com wrote: >> page structs are not guaranteed to be contiguous for gigantic pages. The > > June 2014. That's a long lurk time for a bug. I wonder if some later > commit revealed it.
I would suggest that gigantic pages have not seen much use. Certainly performance with Intel CPUs on benchmarks that I've been involved with showed lower performance with 1GB pages than with 2MB pages until quite recently.
I suggested in another thread that maybe it is time to consider dropping this "feature"
You mean dropping gigantic page support in hugetlb?
No, I mean dropping support for arches that want to do:
tail_page != head_page + tail_page_nr
because they can't allocate the required page array either virtually or physically contiguously.
It seems like quite a burden on the core mm for a very niche, and maybe even non-existant, case.
It was originally done for PPC, can these PPC systems use VMEMMAP now?
The cost to fix GUP to be compatible with this will hurt normal GUP performance - and again, that nobody has hit this bug in GUP further suggests the feature isn't used..
A easy fix might be to make gigantic hugetlb page depends on CONFIG_SPARSEMEM_VMEMMAP, which guarantee all struct pages are contiguous.
Yes, exactly.
I actually have a question on CONFIG_SPARSEMEM_VMEMMAP. Can we assume PFN_A - PFN_B == struct_page_A - struct_page_B, meaning all struct pages are ordered based on physical addresses? I just wonder for two PFN ranges, e.g., [0 - 128MB], [128MB - 256MB], if it is possible to first online [128MB - 256MB] then [0 - 128MB] and the struct pages of [128MB - 256MB] are in front of [0 - 128MB] in the vmemmap due to online ordering.
I have not looked at the code which does the onlining and vmemmap setup. But, these definitions make me believe it is true:
#elif defined(CONFIG_SPARSEMEM_VMEMMAP)
/* memmap is virtually contiguous. */ #define __pfn_to_page(pfn) (vmemmap + (pfn)) #define __page_to_pfn(page) (unsigned long)((page) - vmemmap)
Makes sense. Thank you for checking.
I guess making gigantic page depends on CONFIG_SPARSEMEM_VMEMMAP might be a good way of simplifying code and avoiding future bugs unless there is an arch really needs gigantic page and cannot have VMEMMAP.
— Best Regards, Yan Zi