Hello Michal,
On 4/2/24 18:16, Michal Koutný wrote:
Hello.
On Wed, Mar 27, 2024 at 11:53:22PM +0100, Djalal Harouni <tixxdz@gmail.com> wrote:
... For some cases we want to freeze the cgroup of a task based on some signals; doing so from BPF is better than from user space, where it could be too late.
Notice that freezer itself is not immediate -- tasks are frozen as if a signal (kill(2)) was delivered to them (i.e. returning to userspace).
Thanks, yes. I would expect freeze to behave like a signal, and if one wants to block immediately there is the LSM override return. The attached selftest tries to do exactly that.
What kind of signals (also kill?) are you talking about for illustration?
They could be security signals (e.g. reading sensitive files) or anything related to operations management; for X reasons this user session should be frozen or killed.
Kill, for example, is an effective defense against fork bombs.
Planned users of this feature are Tetragon and systemd, when freezing a cgroup hierarchy that could be a K8s pod, a container, a system service or a user session.
It sounds like the signals are related to a particular process. If so, what good is it to freeze unrelated processes in the same cgroup?
Today some container/pod operations are performed at the BPF level; having freeze and kill available there makes this straightforward.
I think those answers better clarify why this is needed.
Alright, will add those in v2.
As for the generalization to any cgroup attribute (or kernfs): can this be compared with sysctls? I see there are helpers to intercept user writes but no helpers to affect sysctl values without an outer writer. What would justify different approaches between kernfs attributes and sysctls (direct writes vs modified writes)?
As for generalizing this, I haven't thought about it that much. The first use case is to get freeze and possibly kill support, and to use a common interface as requested.
Thank you!
Thanks, Michal