* Namhyung Kim <namhyung@kernel.org> wrote:
AFAIK we don't have a tool to measure the context-switch overhead directly. (I think I should add one to perf ftrace latency.) But I can see it with a simple perf bench command like this:
$ perf bench sched pipe -l 100000
# Running 'sched/pipe' benchmark:
# Executed 100000 pipe operations between two processes

     Total time: 0.650 [sec]

       6.505740 usecs/op
         153710 ops/sec
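(Regarding the perf ftrace latency idea above: on a recent enough perf, something along these lines might already get close. This is an untested sketch, and note that the function-graph latency of __schedule includes the time a task spends switched out, so it is only a rough proxy for the pure switch cost:)

$ sudo perf ftrace latency -T __schedule -- perf bench sched pipe -l 10000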
perf bench sched pipe runs two tasks that communicate with each other over a pipe, so it should stress the context-switch code. These are the normal numbers on my system. But after I run these two perf stat commands in the background, the numbers vary a lot:
$ sudo perf stat -a -e cycles -G user.slice -- sleep 100000 &
$ sudo perf stat -a -e uncore_imc/cas_count_read/ -- sleep 10000 &
Here are the last two lines of the perf bench sched pipe output for three runs:
       58.597060 usecs/op      # run 1
           17065 ops/sec
       11.329240 usecs/op      # run 2
           88267 ops/sec
       88.481920 usecs/op      # run 3
           11301 ops/sec
I think the deviation comes from the fact that uncore events are managed by a certain set of CPUs only. If the target process runs on a CPU that manages an uncore PMU, the operations take longer; otherwise performance is not affected much.
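That is easy to check: each uncore PMU exports the CPU that services its events via a cpumask file in sysfs (the exact device name varies by machine), which can be compared against where the benchmark tasks end up running:

$ cat /sys/bus/event_source/devices/uncore_imc_0/cpumask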
The pipe-messaging context-switch numbers will also vary a lot depending on CPU migration patterns.
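One way to see that directly is to count context switches and migrations while the benchmark runs (both are standard perf software events):

$ perf stat -e context-switches,cpu-migrations perf bench sched pipe -l 10000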
The best way to measure context-switch overhead is to pin the benchmark to a single CPU, with something like:
$ taskset 1 perf stat --null --repeat 10 perf bench sched pipe -l 10000 >/dev/null
Performance counter stats for 'perf bench sched pipe -l 10000' (10 runs):
0.049798 +- 0.000102 seconds time elapsed ( +- 0.21% )
As you can see, the 0.21% stddev is pretty low.
If we allow 2 CPUs, both the runtime and the stddev are much higher:
$ taskset 3 perf stat --null --repeat 10 perf bench sched pipe -l 10000 >/dev/null
Performance counter stats for 'perf bench sched pipe -l 10000' (10 runs):
1.4835 +- 0.0383 seconds time elapsed ( +- 2.58% )
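(Note that plain taskset arguments are CPU bitmasks: 1 selects CPU 0, and 3 selects CPUs 0 and 1. The -c form takes CPU numbers directly and is harder to get wrong:)

$ taskset -c 0,1 perf stat --null --repeat 10 perf bench sched pipe -l 10000 >/dev/null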
Thanks,
Ingo