On Sat, Mar 3, 2018 at 12:05 AM, Andrew Morton akpm@linux-foundation.org wrote:
On Fri, 2 Mar 2018 23:55:29 +0100 Daniel Vacek neelx@redhat.com wrote:
On Fri, Mar 2, 2018 at 9:59 PM, akpm@linux-foundation.org wrote:
The patch titled Subject: mm/page_alloc: fix memmap_init_zone pageblock alignment has been added to the -mm tree. Its filename is mm-page_alloc-fix-memmap_init_zone-pageblock-alignment.patch
This patch should soon appear at http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-fix-memmap_init_zone-... and later at http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-fix-memmap_init_zone-...
Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/SubmitChecklist when testing your code ***
Documentation/process/submit-checklist.rst nowadays, btw.
thanks.
The -mm tree is included into linux-next and is updated there every 3-4 working days
Actually this should go directly to v4.16-rc4. Shall I cc Linus for v3 I'm about to send? Or do you think it's fine for -next and -stable and we keep it like this?
I normally will sit on these things for a week or so before sending to Linus, to make sure they've had a cycle in -next and that reviewers have had time to consider the patch.
I see.
Take a look in http://ozlabs.org/~akpm/mmots/series, at the "mainline-urgent" and "mainline-later" sections. Those are the patches which I'm planning on sending to Linus for this -rc cycle.
Bookmarked. Thanks.
On this particular patch I'd like to see some reviews and acks - I'm not terribly familiar with the zone initialization code.
That's why I prepared v3 series. It's the same diff refactored to two patches with nice descriptions. There are no code changes compared to v2 and second patch is identical to original v1. So do I still need to send it out or we keep v2?
Please be aware that this code later gets altered by mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma.patch, which is below. I did some reject-fixing on this one.
From: Eugeniu Rosca erosca@de.adit-jv.com Subject: mm: page_alloc: skip over regions of invalid pfns on UMA
As a result of bisecting the v4.10..v4.11 commit range, it was determined that commits [1] and [2] are both responsible of a ~140ms early startup improvement on Rcar-H3-ES20 arm64 platform.
Since Rcar Gen3 family is not NUMA, we don't define CONFIG_NUMA in the rcar3 defconfig (which also reduces KNL binary image by ~64KB), but this is how the boot time improvement is lost.
This patch makes optimization [2] available on UMA systems which provide support for CONFIG_HAVE_MEMBLOCK.
Testing this change on Rcar H3-ES20-ULCB using v4.15-rc9 KNL and vanilla arm64 defconfig + NUMA=n, a speed-up of ~139ms (from ~174ms [3] to ~35ms [4]) is observed in the execution of memmap_init_zone().
No boot time improvement is sensed on Apollo Lake SoC.
[1] commit 0f84832fb8f9 ("arm64: defconfig: Enable NUMA and NUMA_BALANCING") [2] commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
[3] 174ms spent in memmap_init_zone() on H3ULCB w/o this patch (NUMA=n) [ 2.643685] On node 0 totalpages: 1015808 [ 2.643688] DMA zone: 3584 pages used for memmap [ 2.643691] DMA zone: 0 pages reserved [ 2.643693] DMA zone: 229376 pages, LIFO batch:31 [ 2.643696] > memmap_init_zone [ 2.663628] < memmap_init_zone (19.932 ms) [ 2.663632] Normal zone: 12288 pages used for memmap [ 2.663635] Normal zone: 786432 pages, LIFO batch:31 [ 2.663637] > memmap_init_zone [ 2.818012] < memmap_init_zone (154.375 ms) [ 2.818041] psci: probing for conduit method from DT.
[4] 35ms spent in memmap_init_zone() on H3ULCB with this patch (NUMA=n) [ 2.677202] On node 0 totalpages: 1015808 [ 2.677205] DMA zone: 3584 pages used for memmap [ 2.677208] DMA zone: 0 pages reserved [ 2.677211] DMA zone: 229376 pages, LIFO batch:31 [ 2.677213] > memmap_init_zone [ 2.684378] < memmap_init_zone (7.165 ms) [ 2.684382] Normal zone: 12288 pages used for memmap [ 2.684385] Normal zone: 786432 pages, LIFO batch:31 [ 2.684387] > memmap_init_zone [ 2.712556] < memmap_init_zone (28.169 ms) [ 2.712584] psci: probing for conduit method from DT.
[mhocko@kernel.org: fix build] Link: http://lkml.kernel.org/r/20180222072037.GC30681@dhcp22.suse.cz Link: http://lkml.kernel.org/r/20180217222846.29589-1-rosca.eugeniu@gmail.com Signed-off-by: Eugeniu Rosca erosca@de.adit-jv.com Signed-off-by: Michal Hocko mhocko@suse.com Reported-by: Eugeniu Rosca erosca@de.adit-jv.com Tested-by: Eugeniu Rosca erosca@de.adit-jv.com Acked-by: Michal Hocko mhocko@suse.com Signed-off-by: Andrew Morton akpm@linux-foundation.org
include/linux/memblock.h | 3 +- mm/memblock.c | 54 ++++++++++++++++++------------------- mm/page_alloc.c | 2 - 3 files changed, 30 insertions(+), 29 deletions(-)
diff -puN include/linux/memblock.h~mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma include/linux/memblock.h --- a/include/linux/memblock.h~mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma +++ a/include/linux/memblock.h @@ -187,7 +187,6 @@ int memblock_search_pfn_nid(unsigned lon unsigned long *end_pfn); void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn, unsigned long *out_end_pfn, int *out_nid); -unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
/**
- for_each_mem_pfn_range - early memory pfn range iterator
@@ -204,6 +203,8 @@ unsigned long memblock_next_valid_pfn(un i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid)) #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
+unsigned long memblock_next_valid_pfn(unsigned long pfn, unsigned long max_pfn);
/**
- for_each_free_mem_range - iterate through free memblock areas
- @i: u64 used as loop variable
diff -puN mm/memblock.c~mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma mm/memblock.c --- a/mm/memblock.c~mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma +++ a/mm/memblock.c @@ -1101,33 +1101,6 @@ void __init_memblock __next_mem_pfn_rang *out_nid = r->nid; }
-unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn) -{
struct memblock_type *type = &memblock.memory;
unsigned int right = type->cnt;
unsigned int mid, left = 0;
phys_addr_t addr = PFN_PHYS(++pfn);
do {
mid = (right + left) / 2;
if (addr < type->regions[mid].base)
right = mid;
else if (addr >= (type->regions[mid].base +
type->regions[mid].size))
left = mid + 1;
else {
/* addr is within the region, so pfn is valid */
return pfn;
}
} while (left < right);
if (right == type->cnt)
return -1UL;
else
return PHYS_PFN(type->regions[right].base);
-}
/**
- memblock_set_node - set node ID on memblock regions
- @base: base of area to set node ID for
@@ -1159,6 +1132,33 @@ int __init_memblock memblock_set_node(ph } #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn) +{
struct memblock_type *type = &memblock.memory;
unsigned int right = type->cnt;
unsigned int mid, left = 0;
phys_addr_t addr = PFN_PHYS(++pfn);
do {
mid = (right + left) / 2;
if (addr < type->regions[mid].base)
right = mid;
else if (addr >= (type->regions[mid].base +
type->regions[mid].size))
left = mid + 1;
else {
/* addr is within the region, so pfn is valid */
return pfn;
}
} while (left < right);
if (right == type->cnt)
return -1UL;
else
return PHYS_PFN(type->regions[right].base);
+}
static phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size, phys_addr_t align, phys_addr_t start, phys_addr_t end, int nid, ulong flags) diff -puN mm/page_alloc.c~mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma mm/page_alloc.c --- a/mm/page_alloc.c~mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma +++ a/mm/page_alloc.c @@ -5440,7 +5440,6 @@ void __meminit memmap_init_zone(unsigned goto not_early;
if (!early_pfn_valid(pfn)) {
-#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP /* * Skip to the pfn preceding the next valid one (or * end_pfn), such that we hit a valid pfn (or end_pfn) @@ -5450,6 +5449,7 @@ void __meminit memmap_init_zone(unsigned * the valid region but still depends on correct page * metadata. */ +#ifdef CONFIG_HAVE_MEMBLOCK
Hmm, the #ifdef should definitely contain the comment. Other than that it looks OK.
pfn = (memblock_next_valid_pfn(pfn) & ~(pageblock_nr_pages-1)) - 1;
#endif _