[PATCH AUTOSEL 6.17-6.6] bpf: Do not limit bpf_cgroup_from_id to current's namespace

9 Oct 2025

From: Kumar Kartikeya Dwivedi <memxor@gmail.com>

[ Upstream commit 2c895133950646f45e5cf3900b168c952c8dbee8 ]

The bpf_cgroup_from_id kfunc relies on cgroup_get_from_id to obtain the
cgroup corresponding to a given cgroup ID. This helper can be called in
a lot of contexts where the current thread can be random. A recent
example was its use in sched_ext's ops.tick(), to obtain the root cgroup
pointer. Since the current task can be whatever random user space task
preempted by the timer tick, this makes the behavior of the helper
unreliable.

Refactor out __cgroup_get_from_id as the non-namespace aware version of
cgroup_get_from_id, and change bpf_cgroup_from_id to make use of it.

There is no compatibility breakage here, since changing the namespace
against which the lookup is being done to the root cgroup namespace only
permits a wider set of lookups to succeed now. The cgroup IDs across
namespaces are globally unique, and thus don't need to be retranslated.

Reported-by: Dan Schatzberg <dschatzberg@meta.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20250915032618.1551762-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

YES - This patch fixes a real functional bug in `bpf_cgroup_from_id()`
that breaks sched_ext and other BPF users when the helper runs from
contexts where `current` lives in an unrelated cgroup namespace.

**Bug Details**
- Today `bpf_cgroup_from_id()` simply calls `cgroup_get_from_id()`
  (`kernel/bpf/helpers.c:2539`), and that helper rejects IDs not visible
  from `current`’s namespace via the `cgroup_is_descendant()` check
  (`kernel/cgroup/cgroup.c:6407`). When the kfunc is invoked from timer
  and irq contexts (e.g. sched_ext `ops.tick()`), `current` is just
  whatever user task was interrupted, so the lookup spuriously returns
  `NULL` and the BPF scheduler treats valid cgroups as gone.
- Documentation already describes the ID lookup as global
  (`Documentation/bpf/kfuncs.rst:653`), so current behaviour contradicts
  the documented contract and leads to unpredictable failures for BPF
  programs that cache cgroup IDs.
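
The failure mode described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct toy_cgroup`, the fixed four-node hierarchy, and every `toy_*` function are hypothetical stand-ins and not kernel code. The point is that the same ID lookup succeeds or fails depending solely on which task happens to be `current`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cgroup: an ID plus a parent link. */
struct toy_cgroup {
	uint64_t id;
	struct toy_cgroup *parent;
};

/* Fixed hierarchy: root(1) -> { a(2) -> a_child(4), b(3) }. */
static struct toy_cgroup toy_root    = { 1, NULL };
static struct toy_cgroup toy_a       = { 2, &toy_root };
static struct toy_cgroup toy_b       = { 3, &toy_root };
static struct toy_cgroup toy_a_child = { 4, &toy_a };

static struct toy_cgroup *toy_table[] = {
	&toy_root, &toy_a, &toy_b, &toy_a_child,
};

/* Returns 1 if cgrp is the ancestor itself or lies somewhere below it. */
static int toy_is_descendant(const struct toy_cgroup *cgrp,
			     const struct toy_cgroup *ancestor)
{
	for (; cgrp; cgrp = cgrp->parent)
		if (cgrp == ancestor)
			return 1;
	return 0;
}

/* Global lookup: IDs are unique, so no namespace is consulted. */
static struct toy_cgroup *toy_lookup_global(uint64_t id)
{
	for (size_t i = 0; i < sizeof(toy_table) / sizeof(toy_table[0]); i++)
		if (toy_table[i]->id == id)
			return toy_table[i];
	return NULL;
}

/* Namespace-filtered lookup: the result depends on the caller's ns root. */
static struct toy_cgroup *toy_lookup_in_ns(uint64_t id,
					   const struct toy_cgroup *current_ns_root)
{
	struct toy_cgroup *cgrp = toy_lookup_global(id);

	if (cgrp && !toy_is_descendant(cgrp, current_ns_root))
		return NULL;
	return cgrp;
}

/*
 * Scenario from the commit message: a tick handler asks for the root
 * cgroup (ID 1) while the interrupted task may live in any namespace.
 */
static void toy_check_tick_scenario(void)
{
	/* current sits in the root namespace: lookup works */
	assert(toy_lookup_in_ns(1, &toy_root) == &toy_root);
	/* current sits in a namespace rooted at toy_a: spurious miss */
	assert(toy_lookup_in_ns(1, &toy_a) == NULL);
	/* unfiltered lookup: independent of current */
	assert(toy_lookup_global(1) == &toy_root);
}
```

In this model the filtered lookup only ever removes results, which matches the commit message's argument that dropping the filter can only widen the set of successful lookups.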

**Fix Mechanics**
- The patch factors the namespace-agnostic portion of the lookup into a
  new `__cgroup_get_from_id()` placed directly above the existing helper
  in `kernel/cgroup/cgroup.c` (~6376 after applying the change). That
  routine mirrors the old code path but returns as soon as the
  refcounted `struct cgroup` is acquired, skipping the namespace filter.
- `bpf_cgroup_from_id()` is switched to call the new helper
  (`kernel/bpf/helpers.c:2539` post-patch), so BPF programs always see
  the globally unique ID mapping they rely on. The public declaration in
  `include/linux/cgroup.h:653` is added so other in-kernel users can opt
  into the unrestricted lookup if they intentionally need it.
- The original `cgroup_get_from_id()` continues to enforce namespace
  visibility for existing callers (block layer, memcg, BPF iterators),
  so their semantics are unchanged.
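
The shape of the refactor can be sketched in userspace as follows. All names here (`struct model_cgroup`, `model_get_from_id_global()` playing the role of `__cgroup_get_from_id()`, `model_get_from_id()` playing the role of `cgroup_get_from_id()`) are hypothetical, failures are reported as `NULL` rather than `ERR_PTR()`, and the reference drop on the rejection path is an assumption, since the diff context ends at the opening brace of that branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; not the kernel's struct cgroup. */
struct model_cgroup {
	uint64_t id;
	struct model_cgroup *parent;
	int refs;	/* references handed out by the lookup helpers */
};

/* Hierarchy: root(1) -> { a(2), b(3) }. */
static struct model_cgroup model_root = { 1, NULL, 0 };
static struct model_cgroup model_a    = { 2, &model_root, 0 };
static struct model_cgroup model_b    = { 3, &model_root, 0 };

static struct model_cgroup *model_all[] = { &model_root, &model_a, &model_b };

/* Namespace root of the calling task; a plain variable in this model. */
static struct model_cgroup *model_current_ns_root = &model_root;

static void model_put(struct model_cgroup *cgrp)
{
	assert(cgrp->refs > 0);	/* catch unbalanced puts */
	cgrp->refs--;
}

static int model_is_descendant(const struct model_cgroup *cgrp,
			       const struct model_cgroup *ancestor)
{
	for (; cgrp; cgrp = cgrp->parent)
		if (cgrp == ancestor)
			return 1;
	return 0;
}

/* Unrestricted lookup: take a reference and return, no namespace filter. */
static struct model_cgroup *model_get_from_id_global(uint64_t id)
{
	for (size_t i = 0; i < sizeof(model_all) / sizeof(model_all[0]); i++) {
		if (model_all[i]->id == id) {
			model_all[i]->refs++;
			return model_all[i];
		}
	}
	return NULL;
}

/* Namespace-aware wrapper: same lookup, then filter against current's root. */
static struct model_cgroup *model_get_from_id(uint64_t id)
{
	struct model_cgroup *cgrp = model_get_from_id_global(id);

	if (!cgrp)
		return NULL;
	if (!model_is_descendant(cgrp, model_current_ns_root)) {
		model_put(cgrp);	/* assumed: rejection must not leak the ref */
		return NULL;
	}
	return cgrp;
}
```

With this split, a caller that wants namespace filtering keeps using the wrapper unchanged, while a caller such as the BPF kfunc can use the unrestricted variant; both return a counted reference that the caller must release.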

**Risk Assessment**
- Behaviour only widens the set of IDs that succeed for this BPF kfunc;
  no kernel data structures or locking rules change. The lookup and
  refcount handling remain identical, so regression risk is low.
- The broader visibility is acceptable because accessing kfuncs of this
  class already requires privileged BPF programs; the cgroup maintainers
  (Acked-by: Tejun Heo) agreed the helper should operate on the global
  namespace.
- No new exports or user-visible ABI are introduced—the change is
  confined to in-kernel helpers and a single BPF kfunc.

**Stable Backport Notes**
- The patch is self-contained and applies cleanly as long as commit
  332ea1f697be (“bpf: Add bpf_cgroup_from_id() kfunc”) is present, which
  is true for current stable lines. No follow-up fixes are required.
- Without it, sched_ext BPF schedulers and other consumers that cache
  cgroup IDs will continue to misbehave whenever executed from
  asynchronous contexts, so backporting is warranted.

 include/linux/cgroup.h |  1 +
 kernel/bpf/helpers.c   |  2 +-
 kernel/cgroup/cgroup.c | 24 ++++++++++++++++++++----
 3 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index b18fb5fcb38e2..b08c8e62881cd 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -650,6 +650,7 @@ static inline void cgroup_kthread_ready(void)
 }
 
 void cgroup_path_from_kernfs_id(u64 id, char *buf, size_t buflen);
+struct cgroup *__cgroup_get_from_id(u64 id);
 struct cgroup *cgroup_get_from_id(u64 id);
 #else /* !CONFIG_CGROUPS */
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 8af62cb243d9e..0bde01edf5e6e 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2540,7 +2540,7 @@ __bpf_kfunc struct cgroup *bpf_cgroup_from_id(u64 cgid)
 {
 	struct cgroup *cgrp;
 
-	cgrp = cgroup_get_from_id(cgid);
+	cgrp = __cgroup_get_from_id(cgid);
 	if (IS_ERR(cgrp))
 		return NULL;
 	return cgrp;
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 77d02f87f3f12..c62b98f027f99 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6373,15 +6373,15 @@ void cgroup_path_from_kernfs_id(u64 id, char *buf, size_t buflen)
 }
 
 /*
- * cgroup_get_from_id : get the cgroup associated with cgroup id
+ * __cgroup_get_from_id : get the cgroup associated with cgroup id
  * @id: cgroup id
  * On success return the cgrp or ERR_PTR on failure
- * Only cgroups within current task's cgroup NS are valid.
+ * There are no cgroup NS restrictions.
  */
-struct cgroup *cgroup_get_from_id(u64 id)
+struct cgroup *__cgroup_get_from_id(u64 id)
 {
 	struct kernfs_node *kn;
-	struct cgroup *cgrp, *root_cgrp;
+	struct cgroup *cgrp;
 
 	kn = kernfs_find_and_get_node_by_id(cgrp_dfl_root.kf_root, id);
 	if (!kn)
@@ -6403,6 +6403,22 @@ struct cgroup *cgroup_get_from_id(u64 id)
 
 	if (!cgrp)
 		return ERR_PTR(-ENOENT);
+	return cgrp;
+}
+
+/*
+ * cgroup_get_from_id : get the cgroup associated with cgroup id
+ * @id: cgroup id
+ * On success return the cgrp or ERR_PTR on failure
+ * Only cgroups within current task's cgroup NS are valid.
+ */
+struct cgroup *cgroup_get_from_id(u64 id)
+{
+	struct cgroup *cgrp, *root_cgrp;
+
+	cgrp = __cgroup_get_from_id(id);
+	if (IS_ERR(cgrp))
+		return cgrp;
 
 	root_cgrp = current_cgns_cgroup_dfl();
 	if (!cgroup_is_descendant(cgrp, root_cgrp)) {
-- 
2.51.0



    
