On 29 Oct 2020, at 17:14, Yang Shi wrote:
On Thu, Oct 29, 2020 at 1:04 PM Zi Yan zi.yan@sent.com wrote:
From: Zi Yan ziy@nvidia.com
In isolate_migratepages_block, when cc->alloc_contig is true, we are able to isolate compound pages, nr_migratepages and nr_isolated did not count compound pages correctly, causing us to isolate more pages than we thought. Use thp_nr_pages to count pages. Otherwise, we might be trapped in too_many_isolated while loop, since the actual isolated pages can go up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32,
Is it that easy to run into? 16384 doesn't seem like too many pages, just 64MB.
I hit this when I was running oom01 from ltp to test my PUD THP patchset, which allocates PUD THPs from CMA regions and splits them into PMD THPs due to memory pressure. I am not sure if it is common that in the upstream kernel PMD THPs will be allocated in CMA regions due to allocation fallback.
since we stop isolation after cc->nr_migratepages reaches to COMPACT_CLUSTER_MAX.
In addition, after we fix the issue above, cc->nr_migratepages could never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated, thus page isolation could not stop as we intended. Change the isolation stop condition to >=.
The fix looks sane to me. Reviewed-by: Yang Shi shy828301@gmail.com
Thanks.
Shall you add Fixes tag to commit 1da2f328fa643bd72197dfed0c655148af31e4eb? And may cc stable.
Sure.
Fixes: 1da2f328fa64 (“mm,thp,compaction,cma: allow THP migration for CMA allocations”)
stable cc’ed.
Signed-off-by: Zi Yan ziy@nvidia.com
mm/compaction.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c index ee1f8439369e..0683a4999581 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -1012,8 +1012,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
isolate_success: list_add(&page->lru, &cc->migratepages);
cc->nr_migratepages++;
nr_isolated++;
cc->nr_migratepages += thp_nr_pages(page);
nr_isolated += thp_nr_pages(page); /* * Avoid isolating too much unless this block is being
@@ -1021,7 +1021,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, * or a lock is contended. For contention, isolate quickly to * potentially remove one source of contention. */
if (cc->nr_migratepages == COMPACT_CLUSTER_MAX &&
if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX && !cc->rescan && !cc->contended) { ++low_pfn; break;
@@ -1132,7 +1132,7 @@ isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn, if (!pfn) break;
if (cc->nr_migratepages == COMPACT_CLUSTER_MAX)
if (cc->nr_migratepages >= COMPACT_CLUSTER_MAX) break; }
-- 2.28.0
— Best Regards, Yan Zi
On Thu, 29 Oct 2020 17:31:28 -0400 Zi Yan ziy@nvidia.com wrote:
Shall you add Fixes tag to commit 1da2f328fa643bd72197dfed0c655148af31e4eb? And may cc stable.
Sure.
Fixes: 1da2f328fa64 (“mm,thp,compaction,cma: allow THP migration for CMA allocations”)
stable cc'ed.
A think a cc:stable really requires a description of the end-user visible effects of the bug. Could you please provide that?
On 29 Oct 2020, at 20:28, Andrew Morton wrote:
On Thu, 29 Oct 2020 17:31:28 -0400 Zi Yan ziy@nvidia.com wrote:
Shall you add Fixes tag to commit 1da2f328fa643bd72197dfed0c655148af31e4eb? And may cc stable.
Sure.
Fixes: 1da2f328fa64 (“mm,thp,compaction,cma: allow THP migration for CMA allocations”)
stable cc'ed.
A think a cc:stable really requires a description of the end-user visible effects of the bug. Could you please provide that?
Sure.
For example, in a system with 16GB memory and an 8GB CMA region reserved by hugetlb_cma, if we first allocate 10GB THPs and mlock them (so some THPs are allocated in the CMA region and mlocked), reserving 6 1GB hugetlb pages via /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages will get stuck (looping in too_many_isolated function) until we kill either task. With the patch applied, oom will kill the application with 10GB THPs and let hugetlb page reservation finish.
— Best Regards, Yan Zi
linux-stable-mirror@lists.linaro.org