Mina Almasry <almasrymina@google.com> writes:
From: Jesper Dangaard Brouer <hawk@kernel.org>
We frequently consult Jesper's out-of-tree page_pool benchmark to evaluate page_pool changes.
Import the benchmark into the upstream Linux kernel tree so that (a) we're all running the same version, (b) we pave the way for shared improvements, and (c) we can maybe one day integrate it with nipa, if possible.
Import bench_page_pool_simple from commit 35b1716d0c30 ("Add page_bench06_walk_all"), from this repository: https://github.com/netoptimizer/prototype-kernel.git
Changes done during upstreaming:
- Fix checkpatch issues.
- Remove the unneeded tasklet logic.
- Move under tools/testing.
- Create ksft for the benchmark.
- Change slightly how the benchmark gets built: out of tree, time_bench is built as an independent .ko; here it is included in bench_page_pool.ko.
Steps to run:
mkdir -p /tmp/run-pp-bench
make -C ./tools/testing/selftests/net/bench
make -C ./tools/testing/selftests/net/bench install INSTALL_PATH=/tmp/run-pp-bench
rsync --delete -avz --progress /tmp/run-pp-bench mina@$SERVER:~/
ssh mina@$SERVER << EOF
  cd ~/run-pp-bench && sudo ./test_bench_page_pool.sh
EOF
Output:
(benchmark dmesg logs)

Fast path results:
no-softirq-page_pool01 Per elem: 11 cycles(tsc) 4.368 ns

ptr_ring results:
no-softirq-page_pool02 Per elem: 527 cycles(tsc) 195.187 ns

slow path results:
no-softirq-page_pool03 Per elem: 549 cycles(tsc) 203.466 ns
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Mina Almasry <almasrymina@google.com>
Back when you posted the first RFC, Jesper and I chatted about ways to avoid the ugly "load module and read the output from dmesg" interface to the test.
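To illustrate what that interface means for anyone consuming the results: a runner has to scrape the numbers back out of the log text. A rough sketch in Python, assuming the "Per elem" line format shown in the commit message is what ends up in dmesg; the regex and the parse_bench_log() helper are made up for illustration and are not part of the patch:

```python
import re

# Hypothetical helper, not part of the patch. The pattern assumes result
# lines look like the ones quoted in the commit message:
#   <test name> Per elem: <N> cycles(tsc) <X> ns
PER_ELEM = re.compile(
    r"(?P<name>\S+)\s+Per elem:\s+(?P<cycles>\d+)\s+cycles\(tsc\)"
    r"\s+(?P<ns>[\d.]+)\s+ns"
)


def parse_bench_log(text):
    """Return {test name: (cycles, nanoseconds)} for each 'Per elem' line."""
    results = {}
    for m in PER_ELEM.finditer(text):
        results[m.group("name")] = (int(m.group("cycles")), float(m.group("ns")))
    return results


# Sample input copied from the output section of the commit message.
sample = """\
no-softirq-page_pool01 Per elem: 11 cycles(tsc) 4.368 ns
no-softirq-page_pool02 Per elem: 527 cycles(tsc) 195.187 ns
no-softirq-page_pool03 Per elem: 549 cycles(tsc) 203.466 ns
"""

if __name__ == "__main__":
    for name, (cycles, ns) in parse_bench_log(sample).items():
        print(f"{name}: {cycles} cycles, {ns} ns")
```

Any change to the printk format silently breaks a parser like this, which is the fragility we wanted to get away from.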
One idea we came up with was to make the module include only the "inner" functions for the benchmark, and expose those to BPF as kfuncs. Then the test runner can be a BPF program that runs the tests, collects the data and passes it to userspace via maps or a ringbuffer or something. That's a nicer and more customisable interface than the printk output. And if the functions are small enough, maybe we could even include them in the page_pool code itself, instead of in a separate benchmark module?
WDYT of that idea? :)
-Toke