Re: [PATCH bpf-next v3] selftests/bpf: Add LPM trie microbenchmarks

29 Jul 2025

      On Mon, Jul 28, 2025 at 3:35 PM Alexei Starovoitov
alexei.starovoitov@gmail.com wrote:
...
Please make a full description of what the test does,
since it's not trivial to decipher from the code.
If I'm reading it correctly, first, the user space
makes the LPM completely full and then lookup/update
use random key-s within range.
Yep, that's correct. I'll provide a more comprehensive description in v4.
...
But delete looks different. It seems the kernel delete
operation can interleave with user space refilling the LPM,
so ...
...
lookup: throughput    7.423 ± 0.023 M ops/s (  7.423M ops/prod), latency  134.710 ns/op
  update: throughput    2.643 ± 0.015 M ops/s (  2.643M ops/prod), latency  378.310 ns/op
  delete: throughput    0.712 ± 0.008 M ops/s (  0.712M ops/prod), latency 1405.152 ns/op
this comparison doesn't look like apples to apples,
since delete will include user space refill time.
Yeah, you're right. Though we only measure the delete time from in the
BPF prog, delete throughput is essentially blocked while the refill
happens and because things are measured with a 1-second timer
(bench.c:sigalarm_handler) the refill time gets accounted for anyway.
I could try extrapolating the delete time like I've done for the free
op, i.e. we calculate how many ops were completed in what fraction of
a second.
...
...
free: throughput    0.574 ± 0.003 K ops/s (  0.574K ops/prod), latency    1.743 ms/op

Does this measure the free-ing of full LPM ?
Yes, this measures the total time to free every element in the trie.
...
...
+static void __lpm_validate(void)
why double underscore ?
Just lpm_validate() ?
The double underscore is used for the common validation parts, but
I'll rename this to include "_common()" (along with all other uses of
__).
...
...

  /*

   * Ideally we'd have access to the map ID but that's already

   * freed before we enter trie_free().

   */

  BPF_CORE_READ_STR_INTO(&name, map, name);

  if (bpf_strncmp(name, BPF_OBJ_NAME_LEN, "trie_free_map"))

          return 0;

  val = bpf_ktime_get_ns();

  bpf_map_update_elem(&latency_free_start, &map, &val, BPF_ANY);

Looks like there is only one lpm map.
What's the point of that extra map ?
Just have a global var for start time ?
Sure, I can make this a global variable for the start time instead.
...
bpf_get_prandom_u32() is not free
and modulo operation isn't free either.
The benchmark includes their time.
It's ok to have it, but add a mode where the bench
tests linear lookup/update too with simple key.data++
Good idea.
...
Since the map suppose to full before we start all keys
should be there, right?
Yes.
...
Let's add a sanity check that update() succeeds.
Will do.
...
...
+static int delete (__u32 index, bool *need_refill)
+{

  struct trie_key key = {

          .data = deleted_entries,

          .prefixlen = prefixlen,

  };

  bpf_map_delete_elem(&trie_map, &key);

  /* Do we need to refill the map? */

  if (++deleted_entries == nr_entries) {

          /*

           * Atomicity isn't required because DELETE only supports

           * one producer running concurrently. What we need is a

           * way to track how many entries have been deleted from

           * the trie between consecutive invocations of the BPF

           * prog because a single bpf_loop() call might not

           * delete all entries, e.g. when NR_LOOPS < nr_entries.

           */

          deleted_entries = 0;

          *need_refill = true;

          return 1;

This early break is another reason that makes
'delete' op different from 'lookup/update'.
Pls make all 3 consistent, so they can be compared to each other.
Hmm.. I'm not quite sure how to do that. lookup/update don't modify
the number of entries in the map whereas delete does (update only
updates existing entries, it doesn't create new ones). So when the map
is empty it needs to be refilled. You're right that somehow the time
to refill needs to be removed from the delete throughput/latency
numbers but fundamentally these 3 ops are not the same and I don't see
how to treat them as such.

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH bpf-next v3] selftests/bpf: Add LPM trie microbenchmarks