Hi Martin, Thanks for the review.
Il giorno ven 8 mar 2024 alle ore 02:03 Martin KaFai Lau martin.lau@linux.dev ha scritto:
On 2/29/24 6:00 AM, Alessandro Carminati (Red Hat) wrote:
In some systems, the netcat server can incur in delay to start listening. When this happens, the test can randomly fail in various points. This is an example error message: # ip gre none gso # encap 192.168.1.1 to 192.168.1.2, type gre, mac none len 2000 # test basic connectivity # Ncat: Connection refused.
This explained what is the issue. Please also explain how the patch solves it.
The issue, as stated, depends on a race condition between the netcat client and server. The test author addressed this problem using a sleep, which I removed in this patch. To easily solve the issue, one could simply increase the sleep duration. However, this patch opts to tackle the problem by querying the /proc directory and verifying TCP binds at the specified port before letting the client connect.
Signed-off-by: Alessandro Carminati (Red Hat) alessandro.carminati@gmail.com
tools/testing/selftests/bpf/test_tc_tunnel.sh | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh index 910044f08908..01c0f4b1a8c2 100755 --- a/tools/testing/selftests/bpf/test_tc_tunnel.sh +++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh @@ -72,7 +72,6 @@ cleanup() { server_listen() { ip netns exec "${ns2}" nc "${netcat_opt}" -l "${port}" > "${outfile}" & server_pid=$!
sleep 0.2
}
client_connect() {
@@ -93,6 +92,22 @@ verify_data() { fi }
+wait_for_port() {
local digits=8
local port2check=$(printf ":%04X" $1)
local prot=$([ "$2" == "-6" ] && echo 6 && digits=32)
for i in $(seq 20); do
if ip netns exec "${ns2}" cat /proc/net/tcp${prot} | \
sed -r 's/^[ \t]+[0-9]+: ([0-9A-F]{'${digits}'}:[0-9A-F]{4}) .*$/\1/' | \
grep -q "${port2check}"; then
The idea is to check if there is socket listening on port 8888?
May be something simpler like "ss -OHtl src :$1" instead?
Indeed, the aim is to ensure that the server is bound before the client attempts to connect by checking if socket is listening, and yes using 'ss' would be shorter. However, I chose not to use 'ss' or 'netstat' to avoid adding new dependencies, considering they are already many. Nonetheless, 'ss' is preferred, I have no objections.
-- pw-bot: cr
The check-and-wait fix in this patch is fine to get your test environment going.
Eventually, it will be good to see the test_tc_tunnel.sh test moved to test_progs. The test_tc_tunnel.sh is not run by bpf CI and issue like this got unnoticed. Some other "*.sh" tests have already been moved to test_progs.
Regards Alessandro