On Sun, Dec 19, 2021 at 06:59:51PM +0100, David Hildenbrand wrote:
On 19.12.21 18:44, Linus Torvalds wrote:
David, you said that you were working on some alternative model. Is it perhaps along these same lines below?
I was thinking that a bit in the page tables to say "this page is exclusive to this VM" would be a really simple thing to deal with for fork() and swapout and friends.
But we don't have such a bit in general, since many architectures have very limited sets of SW bits, and even when they exist we've spent them on things like UFFD_WP.
But the more I think about the "bit doesn't even have to be in the page tables", the more I think maybe that's the solution.
A bit in the 'struct page' itself.
Exactly what I am prototyping right now.
For hugepages, you'd have to distribute said bit when you split the hugepage.
Yes, that's one tricky part ...
That part shouldn't be that tricky ...
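To make the "distribute said bit on split" idea concrete, here is a rough userspace sketch. Everything in it is hypothetical: `struct toy_page`, `TOY_PG_EXCLUSIVE` and `toy_split_huge_page()` are made-up names for illustration, not the kernel's `struct page`, page flags or split path, and it assumes the exclusive bit lives only on the head page until the split.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit: "this page is exclusive to one VM". */
#define TOY_PG_EXCLUSIVE (1UL << 0)

/* Toy stand-in for struct page; only the flags word matters here. */
struct toy_page {
	unsigned long flags;
};

static bool toy_page_exclusive(const struct toy_page *p)
{
	return (p->flags & TOY_PG_EXCLUSIVE) != 0;
}

/*
 * Split a toy hugepage of nr subpages, where pages[0] is the head.
 * Assumption: before the split only the head carries the exclusive
 * bit, so every tail page inherits the head's value.
 */
static void toy_split_huge_page(struct toy_page *pages, size_t nr)
{
	unsigned long excl;
	size_t i;

	assert(pages != NULL || nr == 0);
	if (nr == 0)
		return;

	excl = pages[0].flags & TOY_PG_EXCLUSIVE;
	for (i = 1; i < nr; i++)
		pages[i].flags = (pages[i].flags & ~TOY_PG_EXCLUSIVE) | excl;
}
```

The real split path has to worry about locking and concurrent GUP, which this sketch ignores entirely.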
Can we get rid of ->mapcount altogether? Three states:
 - Not mapped
 - Mapped exactly once
 - Possibly mapped more than once
I appreciate "Not mapped" is not a state that anon pages can meaningfully have (maybe when they go into the swap cache?)
And this information would only be present on the head page (ie stored per folio). If one VMA has multiple PTEs that map the same folio, then hopefully that only counts as mapped once.
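A minimal sketch of what the three-state idea might look like, kept per folio. All names here (`enum toy_map_state`, `struct toy_folio`, the helpers) are hypothetical and nothing below is existing kernel API. One assumption worth flagging loudly: without a count, "possibly mapped more than once" is sticky in this sketch, because an unmap cannot tell how many mappings remain.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical replacement for ->mapcount: three states, no counter. */
enum toy_map_state {
	TOY_MAP_NONE,		/* not mapped */
	TOY_MAP_ONCE,		/* mapped exactly once */
	TOY_MAP_MAYBE_MANY	/* possibly mapped more than once */
};

/* Toy folio: the state is stored once, on the head page. */
struct toy_folio {
	enum toy_map_state state;
};

/* A VMA starts mapping the folio (counted once per VMA, not per PTE). */
static void toy_folio_add_map(struct toy_folio *f)
{
	if (f->state == TOY_MAP_NONE)
		f->state = TOY_MAP_ONCE;
	else
		f->state = TOY_MAP_MAYBE_MANY;
}

/* A VMA stops mapping the folio. */
static void toy_folio_remove_map(struct toy_folio *f)
{
	assert(f->state != TOY_MAP_NONE);
	if (f->state == TOY_MAP_ONCE)
		f->state = TOY_MAP_NONE;
	/*
	 * Assumption: TOY_MAP_MAYBE_MANY stays as it is, since there is
	 * no count to tell us whether one or several mappings remain.
	 */
}

/* True only when we know for certain there is a single mapping. */
static bool toy_folio_mapped_exactly_once(const struct toy_folio *f)
{
	return f->state == TOY_MAP_ONCE;
}
```

Whether the sticky state is acceptable, or whether something would need to recompute it (e.g. via an rmap walk), is exactly the sort of constraint that would need working out.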
I must admit about half this conversation is going over my head. I need more time to understand all the constraints than exists between emails :-)