A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events") changed the behavior of memcg events so that memory.events now accounts for subtrees. But the oom_kill event is a special one, as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed in memory.oom_control. The file memory.oom_control exists in both the root memcg and non-root memcgs, unlike memory.events, which exists only in non-root memcgs. That commit is okay for cgroup2, but it is not okay for cgroup1, as it causes inconsistent behavior between the root memcg and non-root memcgs.
Here's an example of why this behavior is inconsistent in cgroup1.

     root memcg
     /
  memcg foo
   /
memcg bar

Suppose there's an oom_kill in memcg bar, then the oom_kill counts will be

     root memcg : memory.oom_control(oom_kill)  0
     /
  memcg foo : memory.oom_control(oom_kill)  1
   /
memcg bar : memory.oom_control(oom_kill)  1
For the non-root memcg, its memory.oom_control(oom_kill) includes its descendants' oom_kill, but for root memcg, it doesn't include its descendants' oom_kill. That means, memory.oom_control(oom_kill) has different meanings in different memcgs. That is inconsistent. Then the user has to know whether the memcg is root or not.
If we can't fully support it in cgroup1, for example by adding memory.events.local into cgroup1 as well, then let's not touch its original behavior. So let's restore the original behavior for cgroup1.
Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events")
Cc: Chris Down <chris@chrisdown.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: stable@vger.kernel.org
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/memcontrol.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 8c340e6b347f..a0ae080a67d1 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -798,7 +798,8 @@ static inline void memcg_memory_event(struct mem_cgroup *memcg,
 		atomic_long_inc(&memcg->memory_events[event]);
 		cgroup_file_notify(&memcg->events_file);
 
-		if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
+		if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS ||
+		    !cgroup_subsys_on_dfl(memory_cgrp_subsys))
 			break;
 	} while ((memcg = parent_mem_cgroup(memcg)) &&
 		 !mem_cgroup_is_root(memcg));
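For readers who want to see the counting effect outside the kernel, here is a minimal userspace sketch of the loop above. It is not kernel code: struct memcg, record_oom_kill() and the local_only parameter are hypothetical stand-ins, and local_only simply plays the role of the combined condition in the patched if statement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct mem_cgroup; not kernel code. */
struct memcg {
	const char *name;
	struct memcg *parent;	/* NULL only for the root memcg */
	long oom_kill;		/* what memory.oom_control would report */
};

/*
 * Same loop shape as memcg_memory_event(): count the event in the memcg
 * where it happened, then walk towards the root unless local_only is set.
 * The walk stops before the root, so the root never counts events that
 * happened in its descendants.
 */
static void record_oom_kill(struct memcg *memcg, bool local_only)
{
	assert(memcg != NULL);
	do {
		memcg->oom_kill++;
		if (local_only)
			break;
	} while ((memcg = memcg->parent) && memcg->parent != NULL);
}
```

On a root/foo/bar chain, record_oom_kill(&bar, false) leaves root at 0 while foo and bar both become 1, which is the inconsistent cgroup1 result described in the changelog. With local_only set to true, only bar is incremented, which matches the behavior the patch restores for cgroup1.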
On Mon 13-04-20 21:59:52, Yafang Shao wrote:
If we can't fully support it in cgroup1, for example by adding memory.events.local into cgroup1 as well, then let's not touch its original behavior. So let's restore the original behavior for cgroup1.
The localevents option was mostly a cgroup v2 feature. I do not think there was an intention to have side effects on the legacy hierarchy. I thought this would be the case, but apparently it is not. Would it make more sense to set CGRP_ROOT_MEMORY_LOCAL_EVENTS for the legacy hierarchy by default rather than special-casing it somewhere quite deep in the code?
On Tue, Apr 14, 2020 at 11:23 PM Michal Hocko mhocko@kernel.org wrote:
The localevents option was mostly a cgroup v2 feature. I do not think there was an intention to have side effects on the legacy hierarchy. I thought this would be the case, but apparently it is not. Would it make more sense to set CGRP_ROOT_MEMORY_LOCAL_EVENTS for the legacy hierarchy by default rather than special-casing it somewhere quite deep in the code?
I had thought about setting CGRP_ROOT_MEMORY_LOCAL_EVENTS by default for cgroup1, but I was not sure whether we should also expose memory_localevents in cgroup1_show_options().
-- Michal Hocko SUSE Labs
Thanks
Yafang
linux-stable-mirror@lists.linaro.org