On Sat, Sep 06, 2025 at 03:21:08AM +0000, Andrew Guerrero wrote:
A filesystem writeback performance issue was discovered by repeatedly running CPU hotplug operations while a process in a cgroup with memory and io controllers enabled wrote to an ext4 file in a loop.
When a CPU is offlined, the memcg_hotplug_cpu_dead() callback function flushes per-cpu vmstats counters. However, instead of applying a per-cpu counter once to each cgroup in the hierarchy, the per-cpu counter is applied repeatedly just to the nested cgroup. Under certain conditions, the per-cpu NR_FILE_DIRTY counter is routinely positive during hotplug events and the dirty file count artificially inflates. Once the dirty file count grows past the dirty_freerun_ceiling(), balance_dirty_pages() starts a background writeback each time a file page is marked dirty within the nested cgroup.
This change fixes memcg_hotplug_cpu_dead() so that the per-cpu vmstats and vmevents counters are applied once to each cgroup in the hierarchy, similar to __mod_memcg_state() and __count_memcg_events().
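To illustrate the miscounting described above, here is a minimal user-space sketch in C. It is not the kernel code: struct cg, flush_buggy() and flush_fixed() are hypothetical names, and a single long field stands in for the per-cgroup vmstats counter. It only models the difference between adding the flushed value to the nested cgroup on every step of the ancestor walk and adding it once to each cgroup visited.

```c
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup: a parent link and one counter. */
struct cg {
	struct cg *parent;
	long nr_dirty;
};

/*
 * Models the reported bug: the loop walks from the nested cgroup up to
 * the root, but every iteration adds x to the nested cgroup itself.
 */
void flush_buggy(struct cg *nested, long x)
{
	struct cg *mi;

	for (mi = nested; mi; mi = mi->parent)
		nested->nr_dirty += x;
}

/*
 * Models the intended behaviour: x is added exactly once to each cgroup
 * on the path from the nested cgroup to the root.
 */
void flush_fixed(struct cg *nested, long x)
{
	struct cg *mi;

	for (mi = nested; mi; mi = mi->parent)
		mi->nr_dirty += x;
}
```

With a three-level hierarchy (root, child, nested) and a flushed value of 5, flush_buggy() leaves the nested cgroup at 15 and its ancestors at 0, while flush_fixed() leaves all three at 5. Repeating the buggy flush on each hotplug event is how the nested cgroup's dirty count keeps growing in this model.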
Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
Signed-off-by: Andrew Guerrero <ajgja@amazon.com>
Reviewed-by: Gunnar Kudrjavets <gunnarku@amazon.com>
Hey all,
This patch is intended for the 5.10 longterm release branch. It will not apply cleanly to mainline and is inadvertently fixed by a larger series of changes in later release branches: a3d4c05a4474 ("mm: memcontrol: fix cpuhotplug statistics flushing").
Why can't we take those instead?
In 5.15, the counter flushing code is completely removed. This may be another viable option here too, though it's a larger change.
If it's not needed anymore, why not just remove it with the upstream commits as well?
thanks,
greg k-h