On 27.02.24 15:52, Zi Yan wrote:
On 27 Feb 2024, at 9:11, David Hildenbrand wrote:
On 14.02.24 15:34, David Hildenbrand wrote:
On 14.02.24 15:18, Matthew Wilcox wrote:
On Wed, Feb 14, 2024 at 12:04:10PM +0530, Charan Teja Kalla wrote:
- Is it broken in 5.15? Did you actually try to reproduce or is this just a guess?
We didn't run the tests with THP enabled on 5.15, __so we didn't encounter this issue__ on older to 6.1 kernels.
I mentioned that issue exists is based on my understanding after code walk through. To be specific, I just looked to the migrate_pages()->..->migrate_page_move_mapping() & __split_huge_page_tail() where the ->private field of thp sub-pages is not filled with swap entry. If it could have set, I think these are the only places where it would have done, per my understanding. CMIW.
I think you have a misunderstanding. David's patch cfeed8ffe55b (part of 6.6) _stopped_ us using the tail ->private entries. So in 6.1, these tail pages should already have page->private set, and I don't understand what you're fixing.
I think the issue is, that migrate_page_move_mapping() / folio_migrate_mapping() would update ->private for a folio in the swapcache (head page)
newfolio->private = folio_get_private(folio);
but not the ->private of the tail pages.
So once you migrate a THP that is in the swapcache, ->private of the tail pages would not be migrated and, therefore, be stale/wrong.
Even before your patch that was the case.
Looking at migrate_page_move_mapping(), we had:
if (PageSwapBacked(page)) { __SetPageSwapBacked(newpage); if (PageSwapCache(page)) { SetPageSwapCache(newpage); set_page_private(newpage, page_private(page)); } } else { VM_BUG_ON_PAGE(PageSwapCache(page), page); }
I don't immediately see where the tail pages would similarly get updated (via set_page_private).
With my patch the problem is gone, because the tail page entries don't have to be migrated, because they are unused.
Maybe this was an oversight from THP_SWAP -- 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out").
It did update __add_to_swap_cache():
for (i = 0; i < nr; i++) { set_page_private(page + i, entry.val + i); error = radix_tree_insert(&address_space->page_tree, idx + i, page + i); if (unlikely(error)) break; }
and similarly __delete_from_swap_cache().
But I don't see any updates to migration code.
Now, it could be that THP migration was added later (post 2017), in that case the introducing commit would not have been 38d8b4e6bdc8.
Let's continue:
The introducing commit is likely either
(1) 38d8b4e6bdc87 ("mm, THP, swap: delay splitting THP during swap out")
That one added THP_SWAP, but THP migration wasn't supported yet AFAIKS.
-> v4.13
(2) 616b8371539a6 ("mm: thp: enable thp migration in generic path")
I think this is the one, since it makes THP entering migrate_page_move_mapping() possible.
Or likely any of the following that actually allocate THP for migration:
8135d8926c08e mm: memory_hotplug: memory hotremove supports thp migration e8db67eb0ded3 mm: migrate: move_pages() supports thp migration c8633798497ce mm: mempolicy: mbind and migrate_pages support thp migration
That actually enable THP migration.
-> v4.14
So likely we'd have to fix the stable kernels:
4.19 5.4 5.10 5.15 6.1
That's a lot of pre-folio code. A backport of my series likely won't really make any sense.
Staring at 4.19.307 code base, we likely have to perform a stable-only fix that properly handles the swapcache of compoud pages in migrate_page_move_mapping().
Something like (applies to v4.19.307):
diff --git a/mm/migrate.c b/mm/migrate.c index 171573613c39..59878459c28c 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -514,8 +514,13 @@ int migrate_page_move_mapping(struct address_space *mapping, if (PageSwapBacked(page)) { __SetPageSwapBacked(newpage); if (PageSwapCache(page)) {
int i;
SetPageSwapCache(newpage);
set_page_private(newpage, page_private(page));
for (i = 0; i < (1 << compound_order(page)); i++) {
set_page_private(newpage + i,
page_private(page + i));
} } } else { VM_BUG_ON_PAGE(PageSwapCache(page), page);
I'm wondering if there is a swapcache update missing as well.