On Thu, Apr 07, 2022 at 01:17:19PM +0200, Juergen Gross wrote:
On 07.04.22 13:07, Michal Hocko wrote:
On Thu 07-04-22 12:45:41, Juergen Gross wrote:
On 07.04.22 12:34, Michal Hocko wrote:
Ccing Mel
On Thu 07-04-22 11:32:21, Juergen Gross wrote:
Since commit 9d3be21bf9c0 ("mm, page_alloc: simplify zonelist initialization") only zones with free memory are included in a built zonelist. This is problematic when e.g. all memory of a zone has been ballooned out.
What is the actual problem there?
When running as a Xen guest, newly hotplugged memory is not onlined automatically, but only on special request. This makes it possible to e.g. allow for the use of another GB of memory, while adding only a part of that memory initially.
If adding that memory populates a new zone, the page allocator won't be able to use this memory when it is onlined, as the zone wasn't added to the zonelist, due to managed_zone() returning 0.
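To illustrate the failure mode described above, here is a minimal user-space sketch, not kernel code. The names struct zone_model and model_*() are hypothetical stand-ins for struct zone, populated_zone(), managed_zone() and the zonelist build; zone ordering and NUMA fallback are ignored. It assumes only what is stated above: zones without managed pages are skipped when the zonelist is built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of a zone; NOT the kernel's struct zone. */
struct zone_model {
    const char *name;
    unsigned long present_pages;  /* pages physically present in the zone */
    unsigned long managed_pages;  /* pages handed to the page allocator */
};

/* Models populated_zone(): the zone has memory at all. */
static bool model_populated_zone(const struct zone_model *z)
{
    return z->present_pages != 0;
}

/* Models managed_zone(): the allocator owns at least one page. */
static bool model_managed_zone(const struct zone_model *z)
{
    return z->managed_pages != 0;
}

/*
 * Models the zonelist build: only managed zones are included.
 * Returns the number of entries written to zonelist[].
 */
static size_t model_build_zonelist(struct zone_model *zones, size_t nr,
                                   struct zone_model **zonelist)
{
    size_t n = 0;

    for (size_t i = 0; i < nr; i++)
        if (model_managed_zone(&zones[i]))
            zonelist[n++] = &zones[i];
    return n;
}
```

In this model, a zone with present_pages > 0 and managed_pages == 0 is left out of the list. Raising managed_pages afterwards does not change an already built list; the zone only shows up once model_build_zonelist() is called again.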
How is that memory onlined? Because "regular" onlining (online_pages()) does rebuild zonelists if their zone hasn't been populated before.
The Xen balloon driver has its own callback for onlining pages. The pages are just added to the ballooned-out page list without handing them to the allocator. Handing them over is done only when the guest is ballooned up.
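The difference between the two onlining paths discussed above can be sketched as follows. This is a simplified user-space model under stated assumptions, not the kernel implementation: struct zone_state, online_hook_t, the two hooks and the model_*() functions are all hypothetical names. It assumes that onlining rebuilds the zonelist only when the zone was unpopulated beforehand, and that a driver callback may keep the onlined pages away from the allocator.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified per-zone state; NOT the kernel's struct zone. */
struct zone_state {
    unsigned long present_pages;
    unsigned long managed_pages;
    bool in_zonelist;
};

/* Hypothetical hook: returns how many onlined pages go to the allocator. */
typedef unsigned long (*online_hook_t)(unsigned long nr_pages);

/* Models the default path: every onlined page reaches the allocator. */
static unsigned long hook_release_all(unsigned long nr_pages)
{
    return nr_pages;
}

/* Models a balloon-style callback: pages stay on the ballooned-out list. */
static unsigned long hook_keep_ballooned(unsigned long nr_pages)
{
    (void)nr_pages;
    return 0;
}

/* Models the zonelist build: only a managed zone is included. */
static void model_rebuild_zonelist(struct zone_state *z)
{
    z->in_zonelist = z->managed_pages != 0;
}

/* Models onlining: rebuild only if the zone was unpopulated before. */
static void model_online_pages(struct zone_state *z, unsigned long nr_pages,
                               online_hook_t hook)
{
    bool need_rebuild = (z->present_pages == 0);

    z->present_pages += nr_pages;
    z->managed_pages += hook(nr_pages);
    if (need_rebuild)
        model_rebuild_zonelist(z);
}

/* Models ballooning up later: pages reach the allocator, no rebuild. */
static void model_balloon_up(struct zone_state *z, unsigned long nr_pages)
{
    z->managed_pages += nr_pages;
}
```

With hook_release_all the zone ends up in the zonelist right after onlining. With hook_keep_ballooned the rebuild runs while managed_pages is still 0, so the zone is skipped, and a later model_balloon_up() leaves in_zonelist false.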
Is this new behaviour? I ask because keeping !managed_zones out of the zonelist and reclaim paths makes sense. Elsewhere you state "zone can always happen to have no free memory left" and this is true, but it's usually a transient event. The difference between a populated and a managed zone is usually a permanent one, where no memory will ever be placed on the buddy lists because the memory was reserved early in boot or for a similar reason. The patch is probably harmless, but it has the potential to waste CPU time allocating or reclaiming from zones where that will never succeed.