On Wed 22-11-17 07:29:38, Zi Yan wrote:
On 22 Nov 2017, at 7:13, Zi Yan wrote:
On 22 Nov 2017, at 5:14, Michal Hocko wrote:
On Wed 22-11-17 10:35:10, Michal Hocko wrote: [...]
Moreover, I am not sure this is working properly. Just look at split_huge_page. It moves all the tail pages to the LRU list, while migrate_pages has a list of pages to migrate. So we will migrate the head page and all the rest will go back to the LRU list. What guarantees that they will get migrated as well?
OK, so this is as I expected. It doesn't work! Some pfn-walker-based migration will just skip tail pages (see madvise_inject_error). __alloc_contig_migrate_range will simply fail on a THP (see isolate_migratepages_block), so we do not even try to migrate it. do_move_page_to_node_array will simply migrate the head and does not care about tail pages. do_mbind splits the page and then falls back to a pte walk when THP migration is not supported, but it doesn't handle tail pages if the THP migration path is not able to allocate a fresh THP, AFAICS. Memory hotplug should be safe because it doesn't skip the whole THP when doing the pfn walk.
Unless I am missing something here this looks like a huge mess to me.
+Kirill
First, I agree with you that splitting a THP and only migrating its head page is a mess. But what you describe is also the behavior of migrate_page() _before_ THP migration support was added. I thought that was intended.
Look at http://elixir.free-electrons.com/linux/v4.13.15/source/mm/migrate.c#L1091: unmap_and_move() splits THPs and only migrates the head page in v4.13, before THP migration was added. I think the behavior was introduced in v4.5 (I just skimmed the v4.0 to v4.13 code and did not have time to use git blame); before that, THPs were not migrated but were reported as successfully migrated (at least judging from v4.4’s code).
Sorry, I misread v4.4’s code; it also does ‘splitting a THP and migrating its head page’. This behavior has been there for a long time, at least since v3.0.
The code in unmap_and_move() is:
	if (unlikely(PageTransHuge(page)))
		if (unlikely(split_huge_page(page)))
			goto out;
I _think_ that this should all be handled at the migrate_pages layer: try to migrate the THP and fall back to split_huge_page into the list when that fails. I haven't checked whether there is something that would prevent that, though. The THP tricks in specific paths should then be removed.
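
To make the proposal concrete, here is a rough userspace toy model of "try the THP, and on failure split it into the migration list". It is only a sketch, not kernel code: every toy_* name, the fixed-size array standing in for the page list, and the thp_alloc_ok flag are made up for illustration. The only assumption about the real kernel is that the split can place the tail pages on a caller-supplied list instead of the LRU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_HPAGE_NR  4   /* toy value; a real THP spans many more base pages */
#define TOY_MAX_PAGES 16

struct toy_page {
	int nr_pages;     /* 1 for a base page, TOY_HPAGE_NR for a huge page */
	bool migrated;
};

/* Stand-in for the list of pages handed to migrate_pages. */
struct toy_list {
	struct toy_page pages[TOY_MAX_PAGES];
	size_t len;
};

/* Hypothetical: a huge page migrates only if a fresh huge page can be allocated. */
static bool toy_migrate_one(struct toy_page *p, bool thp_alloc_ok)
{
	if (p->nr_pages > 1 && !thp_alloc_ok)
		return false;
	p->migrated = true;
	return true;
}

/* Split entry idx in place and append its tails to the SAME list, not the LRU. */
static bool toy_split_to_list(struct toy_list *l, size_t idx)
{
	int tails = l->pages[idx].nr_pages - 1;

	if (l->len + (size_t)tails > TOY_MAX_PAGES)
		return false;
	l->pages[idx].nr_pages = 1;
	for (int i = 0; i < tails; i++) {
		l->pages[l->len].nr_pages = 1;
		l->pages[l->len].migrated = false;
		l->len++;
	}
	return true;
}

/* Returns the number of base pages migrated. */
static int toy_migrate_pages(struct toy_list *l, bool thp_alloc_ok)
{
	int migrated = 0;

	/* l->len may grow during the walk, so appended tails are visited too. */
	for (size_t i = 0; i < l->len; i++) {
		struct toy_page *p = &l->pages[i];

		if (!toy_migrate_one(p, thp_alloc_ok)) {
			if (!toy_split_to_list(l, i))
				continue;               /* split failed, page stays put */
			toy_migrate_one(p, thp_alloc_ok); /* now a base page */
		}
		migrated += p->nr_pages;
	}
	return migrated;
}
```

With a single huge page on the list and thp_alloc_ok == false, the walk splits it, picks up the tails it has just appended, and migrates all TOY_HPAGE_NR base pages rather than just the head. Whether the real list handling permits appending during the walk is exactly the part I have not checked.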