The patch titled
     Subject: Revert "mm, memory_hotplug: initialize struct pages for the full memory section"
has been added to the -mm tree.  Its filename is
     revert-mm-memory_hotplug-initialize-struct-pages-for-the-full-memory-section.patch
This patch should soon appear at http://ozlabs.org/~akpm/mmots/broken-out/revert-mm-memory_hotplug-initialize... and later at http://ozlabs.org/~akpm/mmotm/broken-out/revert-mm-memory_hotplug-initialize...
Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated there every 3-4 working days
------------------------------------------------------
From: Michal Hocko <mhocko@suse.com>
Subject: Revert "mm, memory_hotplug: initialize struct pages for the full memory section"
This reverts 2830bf6f05fb3e05b ("mm, memory_hotplug: initialize struct pages for the full memory section").
The underlying assumption that one sparse section belongs to a single NUMA node does not really hold. Robert Shteynfeld has reported a boot failure. The boot log was not captured, but his memory layout is as follows:
[ 0.286954] Early memory node ranges
[ 0.286955] node 1: [mem 0x0000000000001000-0x0000000000090fff]
[ 0.286955] node 1: [mem 0x0000000000100000-0x00000000dbdf8fff]
[ 0.286956] node 1: [mem 0x0000000100000000-0x0000001423ffffff]
[ 0.286956] node 0: [mem 0x0000001424000000-0x0000002023ffffff]
This means that node0 starts in the middle of a memory section which is also in node1. memmap_init_zone tries to initialize the padding of a section even when it is outside of the given pfn range, because there are code paths (e.g. memory hotplug) which assume that a full section's worth of struct pages is always initialized. In this particular case, though, such a range is already initialized and most likely already managed by the page allocator. Scribbling over those pages corrupts the internal state and likely blows up when any of those pages gets used.
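The overlap can be checked with a small userspace model of the removed loop. This is only a sketch: it assumes 4 KiB pages and 128 MiB sections (PAGE_SHIFT 12, SECTION_SIZE_BITS 27, the usual x86_64 values), which the report does not state, and the helper names section_nr() and padding_pfns() are made up for illustration, not kernel functions.

```c
#include <stdint.h>

/* Assumed values (typical x86_64); not taken from the report. */
#define PAGE_SHIFT		12
#define SECTION_SIZE_BITS	27
#define PAGES_PER_SECTION	(1ULL << (SECTION_SIZE_BITS - PAGE_SHIFT))

/* Hypothetical helper: index of the sparse section that holds pfn. */
static uint64_t section_nr(uint64_t pfn)
{
	return pfn / PAGES_PER_SECTION;
}

/*
 * Hypothetical helper: mirrors the reverted loop's termination
 * condition and counts how many struct pages past end_pfn it would
 * have (re)initialized.
 */
static uint64_t padding_pfns(uint64_t end_pfn)
{
	uint64_t touched = 0;

	while (end_pfn % PAGES_PER_SECTION) {
		end_pfn++;
		touched++;
	}
	return touched;
}
```

Under these assumptions, node 1 ends (exclusive) at pfn 0x1424000000 >> PAGE_SHIFT = 0x1424000, which is also the first pfn of node 0. section_nr() gives the same section for 0x1423fff and 0x1424000, and padding_pfns(0x1424000) returns 0x4000, i.e. 16384 struct pages (64 MiB of memory) belonging to node 0 would be overwritten.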
Link: http://lkml.kernel.org/r/20190125181549.GE20411@dhcp22.suse.cz
Fixes: 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full memory section")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Robert Shteynfeld <robert.shteynfeld@gmail.com>
Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 mm/page_alloc.c |   12 ------------
 1 file changed, 12 deletions(-)
--- a/mm/page_alloc.c~revert-mm-memory_hotplug-initialize-struct-pages-for-the-full-memory-section
+++ a/mm/page_alloc.c
@@ -5701,18 +5701,6 @@ void __meminit memmap_init_zone(unsigned
 			cond_resched();
 		}
 	}
-#ifdef CONFIG_SPARSEMEM
-	/*
-	 * If the zone does not span the rest of the section then
-	 * we should at least initialize those pages. Otherwise we
-	 * could blow up on a poisoned page in some paths which depend
-	 * on full sections being initialized (e.g. memory hotplug).
-	 */
-	while (end_pfn % PAGES_PER_SECTION) {
-		__init_single_page(pfn_to_page(end_pfn), end_pfn, zone, nid);
-		end_pfn++;
-	}
-#endif
 }
 
 #ifdef CONFIG_ZONE_DEVICE
_
Patches currently in -mm which might be from mhocko@suse.com are
mm-memory_hotplug-is_mem_section_removable-do-not-pass-the-end-of-a-zone.patch
revert-mm-memory_hotplug-initialize-struct-pages-for-the-full-memory-section.patch
mm-oom-marks-all-killed-tasks-as-oom-victims.patch
memcg-do-not-report-racy-no-eligible-oom-tasks.patch