From: Nicholas Piggin npiggin@gmail.com Subject: mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit
Previously the negated unsigned long would be cast back to signed long which would have the correct negative value. After commit 730ec8c01a2b ("mm/vmscan.c: change prototype for shrink_page_list"), the large unsigned int converts to a large positive signed long.
Symptoms include CMA allocations hanging forever holding the cma_mutex due to alloc_contig_range->...->isolate_migratepages_block waiting forever in "while (unlikely(too_many_isolated(pgdat)))".
[akpm@linux-foundation.org: fix -stat.nr_lazyfree_fail as well, per Michal] Link: https://lkml.kernel.org/r/20201029032320.1448441-1-npiggin@gmail.com Fixes: 730ec8c01a2b ("mm/vmscan.c: change prototype for shrink_page_list") Signed-off-by: Nicholas Piggin npiggin@gmail.com Acked-by: Michal Hocko mhocko@suse.com Cc: Vaneet Narang v.narang@samsung.com Cc: Maninder Singh maninder1.s@samsung.com Cc: Amit Sahrawat a.sahrawat@samsung.com Cc: Mel Gorman mgorman@suse.de Cc: Vlastimil Babka vbabka@suse.cz Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org ---
mm/vmscan.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/vmscan.c~mm-vmscan-fix-nr_isolated_file-corruption-on-64-bit +++ a/mm/vmscan.c @@ -1516,7 +1516,8 @@ unsigned int reclaim_clean_pages_from_li nr_reclaimed = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc, TTU_IGNORE_ACCESS, &stat, true); list_splice(&clean_pages, page_list); - mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -nr_reclaimed); + mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, + -(long)nr_reclaimed); /* * Since lazyfree pages are isolated from file LRU from the beginning, * they will rotate back to anonymous LRU in the end if it failed to @@ -1526,7 +1527,7 @@ unsigned int reclaim_clean_pages_from_li mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_ANON, stat.nr_lazyfree_fail); mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, - -stat.nr_lazyfree_fail); + -(long)stat.nr_lazyfree_fail); return nr_reclaimed; }
_
I've applied this patch, but I have to say that I absolutely detest it.
This is very fragile, and I think the problem is the nasty interface.
Why don't we have simple wrappers that internally do that "mod", but actually expose "inc" and "dec" instead, and make it much harder to have these kinds of problems?
The function name is horrible too: "mod_node_page_state()"? It's not like that really even describes what it does. It doesn't modify any page state what-so-ever, what it does is to update some node counters.
Did somebody mis-understand what "stat" stands for? It's not some odd shortening of "state" (like "creat"). It's shorthand for "statistics". Because that completely nonsensical "state" naming shows up in all of those helper functions too.
And the "page" part of the name is misleading too, since it's not even about page statistics, and the people who have a page need to literally do "page_pgdat()" on that page in order to pass in the "struct pglist_data" that the function wants.
To make matters worse, we have the double underscore versions too, which have that magical "we know interrupts are disabled" optimization, so we duplicate all of this horror.
So I really think this whole area would be ripe for some major interface cleanups, both in naming and in interfaces.
A _small_ step would have been to avoid the "mod" thing.
Linus
On Fri, Nov 13, 2020 at 10:51 PM Andrew Morton akpm@linux-foundation.org wrote:
From: Nicholas Piggin npiggin@gmail.com Subject: mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit
mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -nr_reclaimed);
mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE,
-(long)nr_reclaimed);
...
mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE,
-stat.nr_lazyfree_fail);
-(long)stat.nr_lazyfree_fail);
On Sat, Nov 14, 2020 at 01:39:47PM -0800, Linus Torvalds wrote:
I've applied this patch, but I have to say that I absolutely detest it.
This is very fragile, and I think the problem is the nasty interface.
I agree, it's nast.
Why don't we have simple wrappers that internally do that "mod", but actually expose "inc" and "dec" instead, and make it much harder to have these kinds of problems?
We actually need 'add' and 'sub', not 'inc' and 'dec'. Sometimes we inc, but often we need to add. But, yes, 'mod' is a bad name.
linux-stable-mirror@lists.linaro.org