The patch titled Subject: mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty() has been added to the -mm tree. Its filename is mm-huge_memoryc-__split_huge_page-use-atomic-clearpagedirty.patch
This patch should soon appear at http://ozlabs.org/~akpm/mmots/broken-out/mm-huge_memoryc-__split_huge_page-u... and later at http://ozlabs.org/~akpm/mmotm/broken-out/mm-huge_memoryc-__split_huge_page-u...
Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated there every 3-4 working days
------------------------------------------------------ From: Hugh Dickins hughd@google.com Subject: mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()
Swapping load on huge=always tmpfs (with khugepaged tuned up to be very eager, but I'm not sure that is relevant) soon hung uninterruptibly, waiting for page lock in shmem_getpage_gfp()'s find_lock_entry(), most often when "cp -a" was trying to write to a smallish file. Debug showed that the page in question was not locked, and page->mapping NULL by now, but page->index consistent with having been in a huge page before.
Reproduced in minutes on a 4.15 kernel, even with 4.17's 605ca5ede764 ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()") added in; but took hours to reproduce on a 4.17 kernel (no idea why).
The culprit proved to be the __ClearPageDirty() on tails beyond i_size in __split_huge_page(): the non-atomic __bitoperation may have been safe when 4.8's baa355fd3314 ("thp: file pages support for split_huge_page()") introduced it, but liable to erase PageWaiters after 4.10's 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit").
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805291841070.3197@eggly.anvils Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit") Signed-off-by: Hugh Dickins hughd@google.com Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: Konstantin Khlebnikov khlebnikov@yandex-team.ru Cc: Nicholas Piggin npiggin@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org ---
mm/huge_memory.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN mm/huge_memory.c~mm-huge_memoryc-__split_huge_page-use-atomic-clearpagedirty mm/huge_memory.c --- a/mm/huge_memory.c~mm-huge_memoryc-__split_huge_page-use-atomic-clearpagedirty +++ a/mm/huge_memory.c @@ -2431,7 +2431,7 @@ static void __split_huge_page(struct pag __split_huge_page_tail(head, i, lruvec, list); /* Some pages can be beyond i_size: drop them from page cache */ if (head[i].index >= end) { - __ClearPageDirty(head + i); + ClearPageDirty(head + i); __delete_from_page_cache(head + i, NULL); if (IS_ENABLED(CONFIG_SHMEM) && PageSwapBacked(head)) shmem_uncharge(head->mapping->host, 1); _
Patches currently in -mm which might be from hughd@google.com are
mm-huge_memoryc-__split_huge_page-use-atomic-clearpagedirty.patch