[to-be-updated] mm-vmscan-ensure-kswapd-is-woken-up-if-the-wait-queue-is-active.patch removed from -mm tree - Linux-stable-mirror

1 Dec 2024

The quilt patch titled
     Subject: mm: vmscan: ensure kswapd is woken up if the wait queue is active
has been removed from the -mm tree.  Its filename was
     mm-vmscan-ensure-kswapd-is-woken-up-if-the-wait-queue-is-active.patch
This patch was dropped because an updated version will be issued
------------------------------------------------------
From: Seiji Nishikawa snishika@redhat.com
Subject: mm: vmscan: ensure kswapd is woken up if the wait queue is active
Date: Wed, 27 Nov 2024 00:06:12 +0900
Even after commit 501b26510ae3 ("vmstat: allow_direct_reclaim should use
zone_page_state_snapshot"), a task may remain indefinitely stuck in
throttle_direct_reclaim() while holding mm->rwsem.
__alloc_pages_nodemask
 try_to_free_pages
  throttle_direct_reclaim
This can cause numerous other tasks to wait on the same rwsem, leading
to severe system hangups:
[1088963.358712] INFO: task python3:1670971 blocked for more than 120 seconds.
[1088963.365653]       Tainted: G           OE     -------- -  - 4.18.0-553.el8_10.aarch64 #1
[1088963.373887] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1088963.381862] task:python3         state:D stack:0     pid:1670971 ppid:1667117 flags:0x00800080
[1088963.381869] Call trace:
[1088963.381872]  __switch_to+0xd0/0x120
[1088963.381877]  __schedule+0x340/0xac8
[1088963.381881]  schedule+0x68/0x118
[1088963.381886]  rwsem_down_read_slowpath+0x2d4/0x4b8
The issue arises when allow_direct_reclaim(pgdat) returns false,
preventing progress even when the pgdat->pfmemalloc_wait wait queue is
empty. Despite the wait queue being empty, the condition,
allow_direct_reclaim(pgdat), may still be returning false, causing it to
continue looping.
In some cases, reclaimable pages exist (zone_reclaimable_pages() returns
...
0), but calculations of pfmemalloc_reserve and free_pages result in
wmark_ok being false.
And then, despite the pgdat->kswapd_wait queue being non-empty, kswapd
is not woken up, further exacerbating the problem:
crash> px ((struct pglist_data *) 0xffff00817fffe540)->kswapd_highest_zoneidx
$775 = __MAX_NR_ZONES
The issue likely occurs under specific conditions: high memory pressure
with frequent direct reclaim, contention on mmap_sem from concurrent
memory allocations, reclaimable pages exist, but zone states cause
wmark_ok to return false.
Modern workloads (e.g., Python multiprocessing) and changes in kernel
reclaim logic may have surfaced such edge cases more prominently than
before.
The workload involves concurrent Python processes under high memory
pressure, leading to contention on mmap_sem.  While not unusual, this
workload may trigger a rare combination of conditions that expose the
issue.
This patch modifies allow_direct_reclaim() to wake kswapd if the
pgdat->kswapd_wait queue is active, regardless of whether wmark_ok is true
or false.  This change ensures kswapd does not miss wake-ups under high
memory pressure, reducing the risk of task stalls in the throttled reclaim
path.
Link: https://lkml.kernel.org/r/20241126150612.114561-1-snishika@redhat.com
Signed-off-by: Seiji Nishikawa snishika@redhat.com
Cc: Mel Gorman mgorman@techsingularity.net
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
---
mm/vmscan.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/mm/vmscan.c~mm-vmscan-ensure-kswapd-is-woken-up-if-the-wait-queue-is-active
+++ a/mm/vmscan.c
@@ -6389,8 +6389,8 @@ static bool allow_direct_reclaim(pg_data
wmark_ok = free_pages > pfmemalloc_reserve / 2;
-	/* kswapd must be awake if processes are being throttled */
-	if (!wmark_ok && waitqueue_active(&pgdat->kswapd_wait)) {
+	/* Always wake up kswapd if the wait queue is not empty */
+	if (waitqueue_active(&pgdat->kswapd_wait)) {
    	if (READ_ONCE(pgdat->kswapd_highest_zoneidx) > ZONE_NORMAL)
    		WRITE_ONCE(pgdat->kswapd_highest_zoneidx, ZONE_NORMAL);
_
Patches currently in -mm which might be from snishika@redhat.com are
mm-vmscan-account-for-free-pages-to-prevent-infinite-loop-in-throttle_direct_reclaim.patch