On Tue, Nov 28, 2023 at 8:53 AM Nhat Pham <nphamcs@gmail.com> wrote:
On Tue, Nov 28, 2023 at 1:38 AM Michal Hocko <mhocko@suse.com> wrote:
On Mon 27-11-23 11:36:59, Nhat Pham wrote:
The new zswap writeback scheme requires an online-only memcg hierarchy traversal. Add a new parameter to mem_cgroup_iter() to check for onlineness before returning.
Why is this needed?
For context, in patch 3 of this series, Domenico and I are adding a cgroup-aware LRU to zswap, so that we can perform workload-specific zswap writeback. When reclaim is triggered by hitting the global zswap limit, a cgroup is selected via mem_cgroup_iter(), and the last one selected is saved in the zswap pool (so that the iteration can resume from there the next time the limit is hit).
However, one problem with this scheme is that we pin a reference to the saved memcg until the next global reclaim attempt, which could prevent it from being freed for quite some time after it has been offlined. Johannes, Yosry, and I discussed a couple of approaches for a while, and decided to add a callback that runs when the memcg is offlined. The callback releases the reference held by the zswap pool, and the pool then takes a reference to the next online memcg in the traversal (or at least one that has not yet had the zswap-memcg-release callback run on it).
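To make the hand-off concrete, here is a minimal userspace sketch of the idea, not the code from the series. All names (toy_memcg, toy_pool, toy_memcg_offline(), and so on) are made up for illustration, the hierarchy is flattened into an array, and the locking that the real callback would need is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct mem_cgroup: only what this sketch needs. */
struct toy_memcg {
	bool online;
	int refcnt;
};

/* Toy stand-in for the zswap pool's saved iteration cursor. */
struct toy_pool {
	struct toy_memcg *next_shrink;
};

#define NR_MEMCGS 4
static struct toy_memcg memcgs[NR_MEMCGS];

static void toy_get(struct toy_memcg *m)
{
	m->refcnt++;
}

static void toy_put(struct toy_memcg *m)
{
	assert(m->refcnt > 0);
	m->refcnt--;
}

/* Next online memcg after prev in array order, or NULL when the walk ends. */
static struct toy_memcg *toy_next_online(struct toy_memcg *prev)
{
	size_t i = prev ? (size_t)(prev - memcgs) + 1 : 0;

	for (; i < NR_MEMCGS; i++)
		if (memcgs[i].online)
			return &memcgs[i];
	return NULL;
}

/*
 * Hypothetical offline hook: if the pool's cursor points at the dying
 * memcg, move the cursor to the next online one and drop the old reference.
 */
static void toy_memcg_offline(struct toy_pool *pool, struct toy_memcg *dying)
{
	struct toy_memcg *next;

	dying->online = false;
	if (pool->next_shrink != dying)
		return;

	next = toy_next_online(dying);
	if (next)
		toy_get(next);
	pool->next_shrink = next;
	toy_put(dying);	/* the dying memcg is no longer pinned by the pool */
}
```

When the walk runs off the end, the cursor becomes NULL, which in this toy model means the next reclaim attempt would start a fresh traversal.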
I forgot to add: as Andrew pointed out, this is quite a niche use case (zswap is the only user). So I have decided to keep the original behavior of mem_cgroup_iter() and add a separate mem_cgroup_iter_online() that performs the online check. None of the current mem_cgroup_iter() users should see any change. This is already in v7 of this patch series:
https://lore.kernel.org/linux-mm/20231127234600.2971029-3-nphamcs@gmail.com/
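For readers who want the shape of the split without opening the patch, a rough userspace sketch follows. It is not the v7 code: toy_cg, toy_iter(), and toy_iter_online() are invented names, and the hierarchy is reduced to a flat array. The point is only that the existing iterator stays untouched and the online-only variant is layered on top of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy cgroup: an id plus an online flag. */
struct toy_cg {
	int id;
	bool online;
};

#define NR_CGS 5
static struct toy_cg cgs[NR_CGS] = {
	{ 0, true }, { 1, false }, { 2, true }, { 3, false }, { 4, true },
};

/* Original behavior: return every group, online or not. */
static struct toy_cg *toy_iter(struct toy_cg *prev)
{
	size_t i = prev ? (size_t)(prev - cgs) + 1 : 0;

	return i < NR_CGS ? &cgs[i] : NULL;
}

/* Online-only variant: reuse toy_iter() and skip offline groups. */
static struct toy_cg *toy_iter_online(struct toy_cg *prev)
{
	struct toy_cg *cg = toy_iter(prev);

	while (cg && !cg->online)
		cg = toy_iter(cg);
	return cg;
}
```

Existing callers keep calling toy_iter() and see all five groups; only a caller that opts into toy_iter_online() gets the filtered walk.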
-- Michal Hocko SUSE Labs