On Mon, Dec 20, 2021 at 10:53 AM Matthew Wilcox willy@infradead.org wrote:
It makes me wonder if reuse_swap_page() can also be based on refcount instead of mapcount?
I suspect it doesn't even need refcount.
For regular pages, after we've copied the page, all we do right now is
if (page_copied) free_swap_cache(old_page);
which is basically just an optimistic trylock_page() followed by try_to_free_swap().
And that then pretty much simply checks "are there any swap users left" and deletes it from the swap cache if not.
The "free_swap_cache()" thing is actually just an optimization to avoid having memory pressure do it later. So it doesn't have to be exact.
In fact, I thought that swap is so unusual that it's not even needed at all, but I was wrong. See how this was re-introduced in commit f4c4a3f48480 ("mm: free idle swap cache page after COW") because yes, some loads still have swap space allocated.
In theory, it would probably be a good idea at COW time to see if the page ref is 2, and if it's a swap cache page, and try to do that swap cache removal even earlier, so that the page actually gets re-used (instead of copied and then the swap entry removed).
But swap is such a non-issue these days that I doubt it matters, and it's probably better to keep the swap handling in the unusual path.
So mapcount and refcount aren't what matters for the swap cache.
The swap count obviously *does* matter - because it means that some mapping has a reference to this swap entry (not as a page, but as an actual swap pointer).
But the mapcount is irrelevant - any users that have the swap page actually mapped, don't actually need to be a swapcache page.
Even the refcount doesn't really matter, afaik. The only "refcount" we care about is that swapcount - that's what actually reference counts the swap cases.
try_to_free_swap() does check for one particular kind of reference: it does a check for PageWriteback(). We don't want to remove the thing from the swap cache if it's under active IO.
(This codepath does need the page lock, though, thus all those "page_trylock()" things).
Linus