[ Going back in the thread to this one ]
On Fri, Dec 17, 2021 at 1:15 PM Nadav Amit namit@vmware.com wrote:
I think that there is an assumption that once a page is COW-broken, it would never have another write-fault that might lead to COW breaking later.
Right. I do think there are problems in the current code, I just think that the patches are a step back.
The problems with the current code are of two kinds:
- I think the largepage code (either THP or explicit hugetlb) doesn't do as good a job of this whole COW handling as the regular pages do
- some of the "you can make pages read-only again explicitly" kinds of loads.
But honestly, at least for the second case, if somebody does a GUP, and then starts playing mprotect games on the same virtual memory area that they did a GUP on, and are surprised when they get another COW fault that breaks their own connection with a page they did a GUP on earlier, that's their own fault.
So I think there's some of "If you broke it, you get to keep both pieces". Literally, in this case. You have your GUP page that you looked up, and you have your virtual address page that you caused COW on with mprotect() by making it read-only and then read-write again, then you have two different pages, and at some point it really is just "Well, don't do that then".
But yes, there's also some of "some code probably didn't get fully converted to the new world order". So if VFIO only uses FOLL_LONGTERM, and didn't ask for the COW breaking, then yes, VFIO will see page incoherencies. But that should be an issue of "VFIO should do the right thing".
So part of it is a combination of "if you do crazy things, you'll get crazy results". And some of it is some kernel pinning code that doesn't do the right thing to actually make sure it gets a shared page to be pinned.
And then there's THP and HUGETLB, that I do think needs fixing and aren't about those two kinds of cases.
I think we never got around to just doing the same thing we did for regular pages. I think the hugepage code simply doesn't follow that "COW on GUP, mark to not COW later" pattern.
Linus