On 8/20/25 02:14, Tejun Heo wrote:
Hello,
On Wed, Aug 20, 2025 at 12:31:01AM +0100, Djalal Harouni wrote: ...
Approach 1: First RFC months ago was something like that "bpf_task_freeze_cgroup" [1], can make it bpf_cgroup_freeze() as a proper kfunc, so resurrect approach 1?
Thanks for reminding me. I often feel like my memory is a ring buffer which lasts a few weeks at most.
yeh been a while...
Internally it used an ugly path to workaround kernfs active reference since we don't hold a kernfs_open_file coming from userspace kernfs->write path.
I can improve it, but let's discuss please approach (2) since you suggested it ;-)
Approach 2: Per the old suggestions from you and Alexei [2] [3] you wanted something like:
s32 bpf_kernfs_knob_write(struct kernfs_node *dir, const char *knob, char *buf);
I didn't make it generic for kernfs, since don't know yet about sysfs use cases and named it "bpf_cgroup_write_interface" to focus on cgroup base interfaces. Doing something that generic now including sysfs without a proper valid use cases seems a bit too much. Also we have some cgroup kfunc to acquire and release that integrate well, so I kept it focused.
Alexei suggested to refactor the cgroup_base_file[] [4][5] to take "kernfs_node" as argument instead of "kernfs_open_file", which will open other possibilities for BPF.
However, instead of going full change on cgroup_base_files[], I added a minimalist: cgroup_kn_cftype kn_cfts[] that for now hold only "cgroup.freeze".
I think there's some misunderstanding here. IIUC, Alexei didn't want to expose direct file interface because none of the necessary abstractions are available and the effort becomes too big and wide. However, I don't think the suggestion to use @kn as the path designator (or @cgroup for that matter) means that we want to pipe that all the way down to each op that's to be written to. That'd be rather pointless - why add generic interface if each needs a dedicated backend anyway?
I took time to check...
I thought to avoid precisely creating a kernfs_open_file, since we don't have the corresponding context. Beside the user information who opened the file, kernfs_open_file assumes a corresponding opened file and a corresponding kernfs_open_node for each kernfs_node with one or more open files.
Also it seems each kernfs_open_file is added back into kernfs_open_node->files...
Can't you make kernfs to create open_file from kn and follow the usual write path?
Seems tricky, needs to handle also kernfs_global_locks->open_file_mutex compared with writing directly to the dedicated cgroup backend. The only protection there beside cgroup_lock() is break kernfs active protection so kernfs nodes can be drained later...
The cgroup_kn_lock_live() grabs a ref on the cgroup, breaks the active protection, takes the cgroup_lock() and ensures the cgroup is not dead, from there we don't dereference kn-> until we restore the protection. Plus we have the kernfs_get(cgrp->kn) on cgroup_mkdir which ensures it is always accessible.
I do realize taking the same usual path with write is the obvious thing, but we don't have the corresponding open context, and faking it seems more trouble than calling directly cgroup backends...
Allow me please to do it again directly on cgroup_base_file[] assuming it was Alexei suggestion and see how it looks.
Also Tejun, could you please point me to extra cgroup or kernfs tests you run? much appreciated!
Thanks!
Thanks.