Going to update with v4, I see some netdev failures on non-appeared trace events. I need to review the synchronization on the tracer thread.
pw-bot: changes-requested
On Thu, 15 Aug 2024 at 22:32, Dmitry Safonov via B4 Relay devnull+0x7f454c46.gmail.com@kernel.org wrote:
First 3 patches are more-or-less cleanups/preparations.
Patches 4/5 are fixes for netns file descriptors leaks/open.
Patch 6 was sent to me/contributed off-list by Mohammad, who wants 32-bit kernels to run TCP-AO.
Patch 7 is a workaround/fix for slow VMs. Albeit, I can't reproduce the issue, but I hope it will fix netdev flakes for connect-deny-* tests.
And the biggest change is adding TCP-AO tracepoints to selftests. I think it's a good addition by the following reasons:
- The related tracepoints are now tested;
- It allows tcp-ao selftests to raise expectations on the kernel behavior - up from the syscalls exit statuses + net counters.
- Provides tracepoints usage samples.
As tracepoints are not a stable ABI, any kernel changes done to them will be reflected to the selftests, which also will allow users to see how to change their code. It's quite better than parsing dmesg (what BGP was doing pre-tracepoints, ugh).
Somewhat arguably, the code parses trace_pipe, rather than uses libtraceevent (which any sane user should do). The reason behind that is the same as for rt-netlink macros instead of libmnl: I'm trying to minimize the library dependencies of the selftests. And the performance of formatting text in kernel and parsing it again in a test is not critical.
Current output sample:
ok 73 Trace events matched expectations: 13 tcp_hash_md5_required[2] tcp_hash_md5_unexpected[4] tcp_hash_ao_required[3] tcp_ao_key_not_found[4]
Previously, tracepoints selftests were part of kernel tcp tracepoints submission [1], but since then the code was quite changed:
- Now generic tracing setup is in lib/ftrace.c, separate from lib/ftrace-tcp.c which utilizes TCP trace points. This separation allows future selftests to trace non-TCP events, i.e. to find out an skb's drop reason, which was useful in the creation of TCP-CLOSE stress-test (not in this patch set, but used in attempt to reproduce the issue from [2]).
- Another change is that in the previous submission the trace events where used only to detect unexpected TCP-AO/TCP-MD5 events. In this version the selftests will fail if an expected trace event didn't appear. Let's see how reliable this is on the netdev bot - it obviously passes on my testing, but potentially may require a temporary XFAIL patch if it misbehaves on a slow VM.
[1] https://lore.kernel.org/lkml/20240224-tcp-ao-tracepoints-v1-0-15f31b7f30a7@a... [2] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=33...
Signed-off-by: Dmitry Safonov 0x7f454c46@gmail.com
[..]
Sorry for the noise, Dmitry