On Fri, Dec 17, 2021 at 5:53 PM Linus Torvalds <torvalds@linux-foundation.org> wrote:
And then there's THP and HUGETLB, which I do think need fixing and aren't about those two kinds of cases.
I think we never got around to just doing the same thing we did for regular pages. I think the hugepage code simply doesn't follow that "COW on GUP, mark to not COW later" pattern.
In particular, do_huge_pmd_wp_page() has this pattern:
	/*
	 * We can only reuse the page if nobody else maps the huge page or it's
	 * part.
	 */
	if (reuse_swap_page(page, NULL)) {
		... mark it writable ...
and that never got converted to "only mark it writable if we actually have exclusive access to this huge page".
So the problem is literally that reuse_swap_page() uses that "page_mapcount()" logic, and doesn't take into account that the page is actually used by a GUP reference.
Which is exactly why David then sees that "oh, we got a GUP reference to it, and now we're seeing the writes come through". Because that code looks at mapcount, and it shouldn't.
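To make the difference concrete, here is a toy userspace sketch of the two checks. It is not kernel code: struct toy_page and both helpers are hypothetical names invented for illustration, and the model assumes each mapping and each GUP reference contributes exactly one to a single reference count.

```c
#include <stdbool.h>

/* Toy model of a page. NOT the kernel's struct page; all names are hypothetical. */
struct toy_page {
	int mapcount;	/* number of page-table mappings of the page */
	int refcount;	/* all references: mappings plus any GUP references */
};

/* Mapcount-only check: models the reuse decision criticized above. */
static bool can_reuse_by_mapcount(const struct toy_page *p)
{
	return p->mapcount == 1;
}

/* Stricter check: reuse only when the one mapping is also the only reference. */
static bool can_reuse_if_exclusive(const struct toy_page *p)
{
	return p->mapcount == 1 && p->refcount == 1;
}
```

With one mapping plus one GUP reference (mapcount 1, refcount 2), the mapcount-only check says "reuse and mark writable", so writes land in the page the GUP user still holds, while the stricter check says "copy instead".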
I think the hugepage code should use the exact same logic that the regular wp fault code does.
Linus