In cgroup v2, a mutual overlap check is required when at least one of two
cpusets is exclusive. However, this check should be relaxed and limited to
cases where both cpusets are exclusive.
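As a rough illustration of the rule described above, the two conditions can be modelled with plain sets. This is a toy sketch, not the kernel implementation: the function names and the set-based representation are made up for illustration, and it deliberately ignores partition validity and constraints such as cpuset.cpus.exclusive.

```python
def overlaps(a_cpus, b_cpus):
    """Return True if the two CPU sets share at least one CPU."""
    return bool(set(a_cpus) & set(b_cpus))


def conflict_old(a_cpus, a_excl, b_cpus, b_excl):
    """Toy model of the current rule: overlap conflicts if EITHER is exclusive."""
    return (a_excl or b_excl) and overlaps(a_cpus, b_cpus)


def conflict_new(a_cpus, a_excl, b_cpus, b_excl):
    """Toy model of the relaxed rule: overlap conflicts only if BOTH are exclusive."""
    return (a_excl and b_excl) and overlaps(a_cpus, b_cpus)


# A1 is exclusive with CPUs 0-1, B1 is non-exclusive and requests CPU 0.
a1 = ({0, 1}, True)
b1 = ({0}, False)
print(conflict_old(*a1, *b1))  # the either-exclusive rule flags this pair
print(conflict_new(*a1, *b1))  # the both-exclusive rule does not
```

In this model, a non-exclusive sibling can no longer trigger a conflict on its own, while two exclusive siblings with overlapping CPUs still conflict under both rules.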
This patch ensures that for sibling cpusets A1 (exclusive) and B1
(non-exclusive), changes to B1 cannot affect A1's exclusivity.
For example, assume a machine has 4 CPUs (0-3).
   root cgroup
     /    \
    A1    B1
Case 1:
Table 1.1: Before applying the patch
 Step                                       | A1's prstate | B1's prstate |
 #1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
 #2> echo "root" > A1/cpuset.cpus.partition | root         | member       |
 #3> echo "0" > B1/cpuset.cpus              | root invalid | member       |
After step #3, A1 changes from "root" to "root invalid" because its CPUs
(0-1) overlap with those requested by B1 (0-3). However, B1 can actually
use CPUs 2-3 (from B1's parent), so it would be more reasonable for A1 to
remain "root".
Table 1.2: After applying the patch
 Step                                       | A1's prstate | B1's prstate |
 #1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
 #2> echo "root" > A1/cpuset.cpus.partition | root         | member       |
 #3> echo "0" > B1/cpuset.cpus              | root         | member       |
Case 2: (This situation remains unchanged from before)
Table 2.1: Before applying the patch
 Step                                       | A1's prstate | B1's prstate |
 #1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
 #3> echo "1-2" > B1/cpuset.cpus            | member       | member       |
 #2> echo "root" > A1/cpuset.cpus.partition | root invalid | member       |
Table 2.2: After applying the patch
 Step                                       | A1's prstate | B1's prstate |
 #1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
 #3> echo "1-2" > B1/cpuset.cpus            | member       | member       |
 #2> echo "root" > A1/cpuset.cpus.partition | root invalid | member       |
All other cases remain unaffected: for example, cgroup v1, or cases where
A1 and B1 are both exclusive or both non-exclusive.
---
v2 -> v3:
- Ensure compliance with constraints such as cpuset.cpus.exclusive.
- Link: https://lore.kernel.org/cgroups/20251113131434.606961-1-sunshaojie@kylinos.…
v1 -> v2:
- Keep the current cgroup v1 behavior unchanged.
- Link: https://lore.kernel.org/cgroups/c8e234f4-2c27-4753-8f39-8ae83197efd3@redhat…
kernel/cgroup/cpuset-internal.h | 3 ++
kernel/cgroup/cpuset-v1.c | 20 +++++++++
kernel/cgroup/cpuset.c | 44 ++++++++++++++-----
.../selftests/cgroup/test_cpuset_prs.sh | 10 ++---
4 files changed, 60 insertions(+), 17 deletions(-)
--
2.25.1
On Fri, Nov 14, 2025 at 11:55:48AM +0800, Guopeng Zhang <zhangguopeng(a)kylinos.cn> wrote:
> Actually, selftests are no longer just something for developers to view locally; they are now extensively
> run in CI and stable branch regression testing. Using a standardized layout means that general test runners
> and CI systems can parse the cgroup test results without any special handling.
Nice. I appreciate that you took this up.
> This patch is not part of a formal, tree-wide conversion series I am running; it is an incremental step to align the
> cgroup C tests with the existing TAP usage. I started here because these tests already use ksft_test_result_*() and
> only require minor changes to generate proper TAP output.
The tests are in various states of usage, correctness and usefulness,
hence...
>
> > I'm asking to better assess whether the scripts listed in
> > Makefile:TEST_PROGS should also be converted.
>
> I agree that having them produce TAP output would benefit tooling and CI. I did not want to mix
> that into this change, but if you and other maintainers think this direction is reasonable,
> I would be happy to follow up and convert the cgroup shell tests to TAP as well.
...I'd suggest focusing next on test_cpuset_prs.sh (as discussed, it may
need more changes to adapt its output too).
Michal
Remove "trigger_count" from trigger_bench.c and reuse trigger_driver()
for trigger_kernel_count_setup() instead.
With the call to bpf_get_numa_node_id(), the result for "kernel_count"
becomes a little more accurate.
It also makes it easier to test the performance of livepatch: just
hook bpf_get_numa_node_id() and run the "kernel_count" bench trigger.
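The merged control flow can be sketched in Python as a plain loop. This is only an illustrative model of what the single driver program does after the change; `run_driver` and `fake_numa_node_id` are hypothetical stand-ins, not BPF or libbpf APIs.

```python
def fake_numa_node_id():
    """Stand-in for bpf_get_numa_node_id(); the value is irrelevant here."""
    return 0


def run_driver(batch_iters, kernel_count):
    """Model of the merged trigger_driver() loop.

    Every iteration calls the attach point; the counter is only bumped
    when the kernel_count flag is set (the "kernel_count" benchmark).
    """
    hits = 0
    for _ in range(batch_iters):
        fake_numa_node_id()  # attach point for benchmarking
        if kernel_count:
            hits += 1  # models inc_counter()
    return hits


print(run_driver(batch_iters=8, kernel_count=1))
print(run_driver(batch_iters=8, kernel_count=0))
```

With the flag set, each iteration both hits the attach point and increments the counter, which is why the "kernel_count" numbers now include the cost of the helper call.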
Signed-off-by: Menglong Dong <dongml2(a)chinatelecom.cn>
---
.../selftests/bpf/benchs/bench_trigger.c | 5 +----
.../testing/selftests/bpf/progs/trigger_bench.c | 17 +++++------------
2 files changed, 6 insertions(+), 16 deletions(-)
diff --git a/tools/testing/selftests/bpf/benchs/bench_trigger.c b/tools/testing/selftests/bpf/benchs/bench_trigger.c
index 1e2aff007c2a..34fd8fa3b803 100644
--- a/tools/testing/selftests/bpf/benchs/bench_trigger.c
+++ b/tools/testing/selftests/bpf/benchs/bench_trigger.c
@@ -179,11 +179,8 @@ static void trigger_syscall_count_setup(void)
static void trigger_kernel_count_setup(void)
{
setup_ctx();
- bpf_program__set_autoload(ctx.skel->progs.trigger_driver, false);
- bpf_program__set_autoload(ctx.skel->progs.trigger_count, true);
+ ctx.skel->rodata->kernel_count = 1;
load_ctx();
- /* override driver program */
- ctx.driver_prog_fd = bpf_program__fd(ctx.skel->progs.trigger_count);
}
static void trigger_kprobe_setup(void)
diff --git a/tools/testing/selftests/bpf/progs/trigger_bench.c b/tools/testing/selftests/bpf/progs/trigger_bench.c
index 3d5f30c29ae3..6564d1909c7b 100644
--- a/tools/testing/selftests/bpf/progs/trigger_bench.c
+++ b/tools/testing/selftests/bpf/progs/trigger_bench.c
@@ -39,26 +39,19 @@ int bench_trigger_uprobe_multi(void *ctx)
return 0;
}
+const volatile int kernel_count = 0;
const volatile int batch_iters = 0;
-SEC("?raw_tp")
-int trigger_count(void *ctx)
-{
- int i;
-
- for (i = 0; i < batch_iters; i++)
- inc_counter();
-
- return 0;
-}
-
SEC("?raw_tp")
int trigger_driver(void *ctx)
{
int i;
- for (i = 0; i < batch_iters; i++)
+ for (i = 0; i < batch_iters; i++) {
(void)bpf_get_numa_node_id(); /* attach point for benchmarking */
+ if (kernel_count)
+ inc_counter();
+ }
return 0;
}
--
2.51.2
The XDP qstats tests send 2k packets over a single socket.
It looks like the test occasionally flakes when netdev CI is busy
running those tests in QEMU: the target doesn't get to run at all
before all 2000 packets are sent.
Lower the number of packets to 1000 and reopen the socket
every 50 packets, to give RSS a chance to spread the packets
to multiple queues.
For the netdev CI testing either lowering the count or using
multiple sockets is enough, but let's do both for extra resiliency.
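For reference, the new sender command can be sketched as a small helper that makes the arithmetic explicit (20 socket reopenings of 50 packets each gives 1000 packets). The helper name `build_tx_udp` and its parameters are hypothetical and not part of xdp.py; the address and port below are placeholders.

```python
def build_tx_udp(addr, port, rounds=20, per_socket=50):
    """Build the bash one-liner and return it with the total packet count.

    Each round reopens fd 5 on /dev/udp, which should yield a new local
    port and therefore a chance of landing on a different RSS queue.
    """
    cmd = (
        f"for _ in `seq {rounds}`; do "
        f"exec 5<>/dev/udp/{addr}/{port}; "
        f"for i in `seq {per_socket}`; do echo a >&5; done; "
        "exec 5>&-; done"
    )
    return cmd, rounds * per_socket


cmd, total = build_tx_udp("192.0.2.1", 1234)
print(total)
print(cmd)
```

Keeping the round count and per-socket count as separate parameters makes it easy to check that the expected packet count used in the assertions matches what the command actually sends.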
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
CC: shuah(a)kernel.org
CC: ast(a)kernel.org
CC: hawk(a)kernel.org
CC: john.fastabend(a)gmail.com
CC: sdf(a)fomichev.me
CC: linux-kselftest(a)vger.kernel.org
---
tools/testing/selftests/drivers/net/xdp.py | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/xdp.py b/tools/testing/selftests/drivers/net/xdp.py
index a148004e1c36..834a37ae7d0d 100755
--- a/tools/testing/selftests/drivers/net/xdp.py
+++ b/tools/testing/selftests/drivers/net/xdp.py
@@ -687,9 +687,12 @@ from lib.py import ip, bpftool, defer
"/dev/null"
# Listener runs on "remote" in case of XDP_TX
rx_host = cfg.remote if act == XDPAction.TX else None
- # We want to spew 2000 packets quickly, bash seems to do a good enough job
- tx_udp = f"exec 5<>/dev/udp/{cfg.addr}/{port}; " \
- "for i in `seq 2000`; do echo a >&5; done; exec 5>&-"
+ # We want to spew 1000 packets quickly, bash seems to do a good enough job
+ # Each reopening of the socket gives us a different local port (for RSS)
+ tx_udp = "for _ in `seq 20`; do " \
+ f"exec 5<>/dev/udp/{cfg.addr}/{port}; " \
+ "for i in `seq 50`; do echo a >&5; done; " \
+ "exec 5>&-; done"
cfg.wait_hw_stats_settle()
# Qstats have more clearly defined semantics than rtnetlink.
@@ -704,11 +707,11 @@ from lib.py import ip, bpftool, defer
cfg.wait_hw_stats_settle()
after = cfg.netnl.qstats_get({"ifindex": cfg.ifindex}, dump=True)[0]
- ksft_ge(after['rx-packets'] - before['rx-packets'], 2000)
+ expected_pkts = 1000
+ ksft_ge(after['rx-packets'] - before['rx-packets'], expected_pkts)
if act == XDPAction.TX:
- ksft_ge(after['tx-packets'] - before['tx-packets'], 2000)
+ ksft_ge(after['tx-packets'] - before['tx-packets'], expected_pkts)
- expected_pkts = 2000
stats = _get_stats(prog_info["maps"]["map_xdp_stats"])
ksft_eq(stats[XDPStats.RX.value], expected_pkts, "XDP RX stats mismatch")
if act == XDPAction.TX:
--
2.51.1