On 12/3/25 18:22, Prakash Sangappa wrote:
On Nov 20, 2025, at 7:47 AM, David Hildenbrand (Red Hat) <david@kernel.org> wrote:
On 11/19/25 17:31, David Hildenbrand (Red Hat) wrote:
On 19.11.25 17:29, David Hildenbrand (Red Hat) wrote:
So what I am currently looking into is simply reducing (batching) the number of IPIs.
As in the IPIs we are now generating in tlb_remove_table_sync_one()?
Or something else?
Yes, for now. I'm essentially reducing the number of tlb_remove_table_sync_one() calls.
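To illustrate what I mean by batching (a hypothetical sketch only, not the actual patch; hugetlb_unshare_one_pmd() is a made-up helper standing in for the existing per-PMD unshare logic), the idea is one sync IPI per unshared range rather than one per PMD table:

/*
 * Hypothetical sketch of the batching idea, kernel-style pseudocode,
 * not the real patch.
 */
static void hugetlb_unshare_range(struct vm_area_struct *vma,
				  unsigned long start, unsigned long end)
{
	unsigned long addr;
	bool unshared = false;

	for (addr = start; addr < end; addr += PUD_SIZE)
		unshared |= hugetlb_unshare_one_pmd(vma, addr);

	/* One IPI broadcast for the whole batch instead of one per entry. */
	if (unshared)
		tlb_remove_table_sync_one();
}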
This bug is only an issue when we don't use IPIs for pgtable freeing (i.e., when CONFIG_MMU_GATHER_RCU_TABLE_FREE is set), right? Otherwise tlb_remove_table_sync_one() is a no-op?
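For reference, this is roughly how it is wired up (paraphrasing include/asm-generic/tlb.h and mm/mmu_gather.c from memory, not verbatim):

#ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE
static void tlb_remove_table_smp_sync(void *arg)
{
	/* Simply deliver the interrupt. */
}

void tlb_remove_table_sync_one(void)
{
	/*
	 * Broadcast an IPI and wait for it: GUP-fast runs with IRQs
	 * disabled, so once this returns, any concurrent GUP-fast
	 * walker that could have seen the old page table is done.
	 */
	smp_call_function(tlb_remove_table_smp_sync, NULL, 1);
}
#else
/* TLB flushing already uses IPIs here, no extra synchronization needed. */
static inline void tlb_remove_table_sync_one(void) { }
#endif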
Right. But it's still confusing: I think for page table unsharing we always need an IPI one way or the other, to make sure any concurrent GUP-fast walker has finished.
At least to prevent anybody from reusing the page table in the meantime.
That is, either:
(a) the TLB shootdown already implied an IPI, or
(b) we send one manually.
But that's where it gets confusing: nowadays x86 also selects MMU_GATHER_RCU_TABLE_FREE, meaning we would get a double IPI?
This is so complicated that I might be missing something.
But it's the same behavior we have in collapse_huge_page(), where we first flush and then call tlb_remove_table_sync_one().
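The sequence there looks roughly like this (paraphrased from collapse_huge_page() in mm/khugepaged.c, surrounding details trimmed):

	/* First the flush (which may or may not imply an IPI) ... */
	_pmd = pmdp_collapse_flush(vma, address, pmd);
	spin_unlock(pmd_ptl);
	mmu_notifier_invalidate_range_end(&range);
	/* ... then the explicit IPI broadcast to sync against GUP-fast. */
	tlb_remove_table_sync_one();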
Okay, I pushed something to
https://github.com/davidhildenbrand/linux.git hugetlb_unshare
For testing, I had to backport the fix to v5.15, using the top 8 commits from the above tree. The v5.15 kernel does not have ptdesc or hugetlb VMA locking.
With that change, our DB team has verified that it fixes the regression.
Great, thanks for testing!
Will you push this fix to LTS trees after it is reviewed and merged?
I can further clean this up and send it out. There is something about the mmu_gather integration that I don't enjoy, but I haven't found a better solution so far.
I can try backporting it; I would likely have to try to minimize the prereq cleanups. Let me see to what degree this can be done in a sensible way!