Re: [REGRESSION] Re: [PATCH 6.1 033/219] memcg: drop kmem.limit_in_bytes

20 Sep 2023

      On 9/20/2023 10:43 AM, Michal Hocko wrote:
...
On Wed 20-09-23 01:11:01, Jeremi Piotrowski wrote:
...
On Sun, Sep 17, 2023 at 09:12:40PM +0200, Greg Kroah-Hartman wrote:
...
6.1-stable review patch.  If anyone has any objections, please let me know.

Hi Greg/Michal,
This commit breaks userspace, which makes it a bad commit for mainline and an
even worse commit for stable.
We ingested 6.1.54 into our nightly testing and found that runc fails to gather
cgroup statistics (when reading kmem.limit_in_bytes). The same code is vendored
into kubelet and kubelet fails to start if this operation fails. 6.1.53 is
fine.
Could you expand some more on why the file is read? It hasn't supported
writing for some time, so how does reading it help in any sense?
Anyway, I do agree that the stable backport should be reverted.
This file is read together with all the other memcg files. Each prefix:
memory
memory.memsw
memory.kmem
memory.kmem.tcp
is combined with these suffixes:
.usage_in_bytes
.max_usage_in_bytes
.failcnt
.limit_in_bytes
and read. The values are then forwarded on to other components for scheduling decisions.
You want to know the limit when checking the usage (is the usage close to the limit or not).
Userspace tolerates MEMCG/MEMCG_KMEM being disabled, but having a single file out of the
set missing is an anomaly. So maybe we could keep the dummy file just for the
sake of consistency? Cgroupv1 is legacy after all.
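
To make the combination above concrete, here is a minimal sketch of how the
full set of file names could be enumerated. It is only an illustration in
Python, not the runc code (which is written in Go); the names PREFIXES,
SUFFIXES and memcg_stat_files are made up for this example.

```python
from itertools import product

# The four cgroup v1 memory prefixes and four suffixes listed above.
PREFIXES = ["memory", "memory.memsw", "memory.kmem", "memory.kmem.tcp"]
SUFFIXES = [".usage_in_bytes", ".max_usage_in_bytes", ".failcnt", ".limit_in_bytes"]


def memcg_stat_files():
    """Return every prefix + suffix file name (hypothetical helper)."""
    return [prefix + suffix for prefix, suffix in product(PREFIXES, SUFFIXES)]


if __name__ == "__main__":
    for name in memcg_stat_files():
        print(name)
```

With 6.1.54, memory.kmem.limit_in_bytes is the only one of these 16 names
that is missing while its sibling memory.kmem.* files are still present.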
...
...
...
Address this by wiping out the file completely and effectively get back to
pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
On reads, the runc code checks for MEMCG_KMEM=n by checking
kmem.usage_in_bytes. If it is present then runc expects the other cgroup files
to be there (including kmem.limit_in_bytes). So this change is not effectively
the same.
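
The read logic described above can be sketched roughly as follows. This is a
hedged Python approximation, not the actual runc implementation: the function
names and the tolerate_missing_limit flag are hypothetical, and the flag only
illustrates the kind of handling userspace would need to cope with the
missing file.

```python
import os

KMEM_SUFFIXES = (".usage_in_bytes", ".max_usage_in_bytes", ".failcnt", ".limit_in_bytes")


def read_value(path):
    """Read a single integer from a cgroup file."""
    with open(path) as f:
        return int(f.read().strip())


def read_kmem_stats(cgroup_dir, tolerate_missing_limit=False):
    """Read memory.kmem.* from cgroup_dir (hypothetical helper).

    kmem.usage_in_bytes acts as the probe: if it is absent, kmem accounting
    is treated as disabled and None is returned. If it is present, every
    other file is expected to exist as well.
    """
    prefix = os.path.join(cgroup_dir, "memory.kmem")
    if not os.path.exists(prefix + ".usage_in_bytes"):
        return None  # treated like MEMCG_KMEM=n

    stats = {}
    for suffix in KMEM_SUFFIXES:
        key = suffix.lstrip(".")
        try:
            stats[key] = read_value(prefix + suffix)
        except FileNotFoundError:
            # Assumption: only a missing limit file is tolerated, and only
            # when the caller explicitly opts in.
            if suffix == ".limit_in_bytes" and tolerate_missing_limit:
                stats[key] = None
                continue
            raise
    return stats
```

Called with the default tolerate_missing_limit=False against a directory
that has kmem.usage_in_bytes but no kmem.limit_in_bytes, the sketch raises
FileNotFoundError, which mirrors the failure mode reported here.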
Here's a link to the PR that would be needed to handle this change in userspace
(not merged yet and would need to be propagated through the ecosystem):
https://github.com/opencontainers/runc/pull/4018.
Thanks. Does that mean the revert is still necessary for the Linus tree
or do you expect that the fix can be merged and propagated in a
reasonable time?
We can probably get runc and currently supported kubernetes versions patched in time
before 6.6 (or the next LTS kernel) hits LTS distros.
But there's still a bunch of users running cgroupv1 with unsupported kubernetes
versions that are still taking kernel updates as they come, so this might get reported
again next year if it stays in mainline.
