Re: [PATCH] mm: fix copy_vma() error handling for hugetlb mappings

23 May 2025

Hi Lorenzo,
Thanks for the in-depth review! Answers below:
On Fri, May 23 2025 at 11:00:40, Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
...
OK so really it is _only_ when vma_link() fails?
AFAICT yes, since copy_vma() only calls vma_close() if vma_link()
fails. A failure in any of the helpers called before it in copy_vma()
is handled by simply freeing the allocated resources.
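To illustrate the shape of that error handling, here is a minimal
userspace sketch. It is not the kernel code: toy_vma, toy_fail and
toy_copy_vma() are hypothetical stand-ins, and the real copy_vma() has
more steps. It only models the point above, that the close step runs on
the link-failure path and nowhere else.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a VMA; not the kernel's struct vm_area_struct. */
struct toy_vma {
	int close_calls;	/* times the close step ran on this VMA */
};

/* Hypothetical failure-injection points, for this sketch only. */
enum toy_fail { TOY_FAIL_NONE, TOY_FAIL_EARLY, TOY_FAIL_LINK };

struct toy_result {
	bool ok;		/* copy succeeded */
	int close_calls;	/* close calls made on the error path */
};

struct toy_result toy_copy_vma(enum toy_fail fail)
{
	struct toy_result res = { false, 0 };
	struct toy_vma *new_vma = calloc(1, sizeof(*new_vma));

	if (!new_vma)
		return res;

	/* An early helper fails: nothing was opened, so just free. */
	if (fail == TOY_FAIL_EARLY)
		goto out_free;

	/* The link step fails: only this path runs the close step. */
	if (fail == TOY_FAIL_LINK)
		goto out_close;

	res.ok = true;
	goto out_free;	/* toy model: release the copy on success too */

out_close:
	new_vma->close_calls++;
	res.close_calls = new_vma->close_calls;
out_free:
	free(new_vma);
	return res;
}
```

Calling toy_copy_vma(TOY_FAIL_LINK) is the only case in this model that
reports a close call; the early-failure and success cases report none.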
...
Ordinarily 'private syzbot instance' makes me nervous, but you've made your case
here logically.
I understand your qualms about that. Although that instance is mostly
concerned with downstream code, there's nothing unusual in this case:
it found the issue in mainline with a common reproducer. The
closest public report I found was the one I linked in [3], although I
couldn't reproduce the issue with the repro provided there.
...
Hm, do we have a Fixes?
I couldn't find a single commit to point to as a "Fixes". The actual commit
that introduces that vma_close() call there is
4080ef1579b2 ("mm: unconditionally close VMAs on error")
although I wouldn't say that's the culprit. As you said, the problem
with vma_close() seems to be more involved. If you want me to add that
one in the "Fixes" tag so we can keep track of the context, let me know,
that's fine by me.
...
Why 6.12+? It seems this bug has been around for... a while.
Because in stable versions lower than that (6.6) the code to patch is in
mm/mmap.c instead, so I'd rather have this one merged first and then
submit the appropriate backport for 6.6.
...
Thanks for links, though it's better to please provide this information here
even if in succinct form. This is because commit messages are a permanent
record, and these links (other than lore) are ephemeral.
True but, as you said, it's a bit of a pain to try to fit all the info
in the commit message, and the repro will still be living somewhere else.
...
So, can we please copy/paste the splat from [1] and drop this link, maybe just
keep link [2] as it's not so important (I'm guessing this takes a while to repro
so the failure injection hits the right point?) and of course keep [3].
Sure, I'll make the changes for v2. FWIW, in my tests the repro could
trigger this in a matter of seconds.
...
So,
Could you implement this slightly differently please? We're duplicating
this code now, so I think this should be in its own function with a copious
comment.
Something like:
static void fixup_hugetlb_reservations(struct vm_area_struct *vma)
{
	if (is_vm_hugetlb_page(vma))
		clear_vma_resv_huge_pages(vma);
}
And call this from here and also in copy_vma_and_data().
Could you also please update the comment in clear_vma_resv_huge_pages():
/*
 * Reset and decrement one ref on hugepage private reservation.
 * Called with mm->mmap_lock writer semaphore held.
 * This function should be only used by move_vma() and operate on
 * same sized vma. It should never come here with last ref on the
 * reservation.
 */
Drop the mention of the specific function (which is now wrong, but
mentioning _any_ function is asking for bit rot anyway) and replace with
something like 'This function should only be used by mremap and...'
Ack, thanks for the suggestions!
Cheers,
Ricardo
