On 07.04.22 14:04, Michal Hocko wrote:
On Thu 07-04-22 13:58:44, David Hildenbrand wrote:
[...]
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3589febc6d31..130a2feceddc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6112,10 +6112,8 @@ static int build_zonerefs_node(pg_data_t *pgdat, struct zoneref *zonerefs)
 	do {
 		zone_type--;
 		zone = pgdat->node_zones + zone_type;
-		if (managed_zone(zone)) {
-			zoneref_set_zone(zone, &zonerefs[nr_zones++]);
-			check_highest_zone(zone_type);
-		}
+		zoneref_set_zone(zone, &zonerefs[nr_zones++]);
+		check_highest_zone(zone_type);
 	} while (zone_type);
 
 	return nr_zones;
I don't think having !populated zones in the zonelist is a particularly good idea. Populated vs !populated changes only during page onlining/offlining.
If I'm not wrong, with your patch we'd even include ZONE_DEVICE here ...
What kind of problem would that cause? The allocator wouldn't see any pages at all, so it would fall back to the next one. Maybe kswapd would need some tweak to have a bail-out condition, but as mentioned in the thread already, !populated or !managed zones, for that matter, are not all that much different from completely depleted zones. The fact that we are making that distinction has led to some bugs, and I suspect it makes the code more complex without a very good reason.
I assume performance problems. Assume you have an ordinary system with multiple NUMA nodes and no MOVABLE memory. Most nodes will only have ZONE_NORMAL. Yet, you'd include ZONE_DMA* and ZONE_MOVABLE that will always remain empty to be traversed on each and every allocation fallback. Of course, we could measure, but IMHO at least *that* part of memory onlining/offlining is not the complicated part :D
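To put a rough number on that concern, here is a minimal userspace sketch, not kernel code. All toy_* names and the fixed array sizes are made up for illustration; only the loop shape is borrowed from the quoted build_zonerefs_node() hunk. It counts how many entries a fallback list would hold with and without filtering out empty zones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only, NOT the kernel's definitions. */
enum toy_zone_type {
	TOY_ZONE_DMA,
	TOY_ZONE_DMA32,
	TOY_ZONE_NORMAL,
	TOY_ZONE_MOVABLE,
	TOY_MAX_NR_ZONES
};

#define TOY_MAX_NODES 8

struct toy_zone {
	unsigned long present_pages;	/* 0 means the zone is empty */
};

struct toy_node {
	struct toy_zone zones[TOY_MAX_NR_ZONES];
};

/*
 * Append one node's zones to refs[], highest zone type first, using the
 * same do/while shape as the quoted hunk. With skip_empty set, zones that
 * have no present pages are left out; otherwise every zone is added.
 */
static size_t toy_build_zonerefs_node(const struct toy_node *node,
				      const struct toy_zone **refs,
				      bool skip_empty)
{
	size_t nr_zones = 0;
	int zone_type = TOY_MAX_NR_ZONES;

	do {
		const struct toy_zone *zone;

		zone_type--;
		zone = &node->zones[zone_type];
		if (skip_empty && zone->present_pages == 0)
			continue;	/* jumps to the while () check */
		refs[nr_zones++] = zone;
	} while (zone_type);

	return nr_zones;
}

/* Number of entries a full fallback walk over all nodes has to visit. */
static size_t toy_fallback_len(const struct toy_node *nodes, size_t nr_nodes,
			       bool skip_empty)
{
	const struct toy_zone *refs[TOY_MAX_NODES * TOY_MAX_NR_ZONES];
	size_t nr = 0;
	size_t i;

	assert(nr_nodes <= TOY_MAX_NODES);
	for (i = 0; i < nr_nodes; i++)
		nr += toy_build_zonerefs_node(&nodes[i], refs + nr, skip_empty);

	return nr;
}
```

For four toy nodes that each have only TOY_ZONE_NORMAL populated, the filtered list has 4 entries and the unfiltered one has 16. Whether those extra entries matter in practice would still have to be measured.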
Populated vs. !populated is under pretty good control via page onlining/offlining. We have to be careful with "managed pages", because that's a moving target, especially with memory ballooning. And I assume that's the bigger source of bugs.
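The difference between the two conditions can be sketched with a toy model. This is a simplified illustration under assumptions, not the kernel's accounting: the toy_* names are hypothetical, and the model only assumes that "populated" tracks present pages, which change on onlining/offlining, while "managed" tracks pages currently handed to the allocator, which a balloon driver can shrink at any time.

```c
#include <stdbool.h>

/* Toy zone with the two counters discussed above (illustration only). */
struct toy_bzone {
	unsigned long present_pages;	/* changes only on online/offline */
	unsigned long managed_pages;	/* also changes when ballooning */
};

static bool toy_populated_zone(const struct toy_bzone *zone)
{
	return zone->present_pages != 0;
}

static bool toy_managed_zone(const struct toy_bzone *zone)
{
	return zone->managed_pages != 0;
}

/* Onlining adds pages to both counters in this model. */
static void toy_online_pages(struct toy_bzone *zone, unsigned long nr)
{
	zone->present_pages += nr;
	zone->managed_pages += nr;
}

/*
 * Inflating a balloon takes pages away from the allocator but leaves them
 * present. Returns the number of pages actually taken.
 */
static unsigned long toy_balloon_inflate(struct toy_bzone *zone,
					 unsigned long nr)
{
	if (nr > zone->managed_pages)
		nr = zone->managed_pages;
	zone->managed_pages -= nr;
	return nr;
}
```

In this model, a zone that is onlined and then fully ballooned out stays populated but stops being managed, without any onlining/offlining event that could trigger a zonelist rebuild.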
I'd vote for going with the simple fix first, which should be good enough AFAICT.
yes, see the other reply
I think we were composing almost simultaneously :)