6.11-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wei Xu weixugc@google.com
commit b130ba4a6259f6b64d8af15e9e7ab1e912bcb7ad upstream.
lru_gen_shrink_node() unconditionally clears kswapd_failures, which can prevent kswapd from sleeping and cause 100% kswapd cpu usage even when kswapd repeatedly fails to make progress in reclaim.
Only clear kswap_failures in lru_gen_shrink_node() if reclaim makes some progress, similar to shrink_node().
I happened to run into this problem in one of my tests recently. It requires a combination of several conditions: The allocator needs to allocate a right amount of pages such that it can wake up kswapd without itself being OOM killed; there is no memory for kswapd to reclaim (My test disables swap and cleans page cache first); no other process frees enough memory at the same time.
Link: https://lkml.kernel.org/r/20241014221211.832591-1-weixugc@google.com Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists") Signed-off-by: Wei Xu weixugc@google.com Cc: Axel Rasmussen axelrasmussen@google.com Cc: Brian Geffon bgeffon@google.com Cc: Jan Alexander Steffens heftig@archlinux.org Cc: Suleiman Souhlal suleiman@google.com Cc: Yu Zhao yuzhao@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/vmscan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4940,8 +4940,8 @@ static void lru_gen_shrink_node(struct p
blk_finish_plug(&plug); done: - /* kswapd should never fail */ - pgdat->kswapd_failures = 0; + if (sc->nr_reclaimed > reclaimed) + pgdat->kswapd_failures = 0; }
/******************************************************************************