On Fri, Aug 02, 2024 at 10:23:30AM +0100, Dmitry Safonov via B4 Relay wrote:
From: Dmitry Safonov 0x7f454c46@gmail.com
In tests that expect failure, the timeout value is TEST_RETRANSMIT_SEC == 1 second, which is big enough for most devices under test. But on a particularly slow machine/VM, 1 second might not be enough for another thread to be scheduled and attempt to connect(). It is not a problem for tests that expect connect() to succeed, as the timeout value for them (TEST_TIMEOUT_SEC) is intentionally bigger.
One obvious way to solve this would be to increase TEST_RETRANSMIT_SEC. But since that would lengthen the timeout in every test, the extra time would add up.
But here is a less obvious way that keeps timeouts for expected connect() failures low: just synchronize the two threads, which ensures that, before the counter checks, the other thread has had a chance to run and time out on connect(). The expected increase of the related counter for the listen() socket will still test the expected failure.
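The idea of synchronizing the two threads before the counter check can be sketched roughly as below. This is only an illustration under stated assumptions, not the helper the selftests actually use: it relies on a plain POSIX pthread_barrier_t, and the names failed_connects, sync_point, client_thread() and check_counter_after_sync() are hypothetical. The rejected connect() and the kernel counter are simulated with a sleep and an atomic integer.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical stand-in for a kernel counter such as TCPAOKeyNotFound. */
static atomic_int failed_connects;
/* Both threads meet here before the counter is read. */
static pthread_barrier_t sync_point;

/* Client side: makes the attempt that is expected to fail. */
static void *client_thread(void *arg)
{
	/* Simulate a slow machine: the client gets scheduled late. */
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };

	(void)arg;
	nanosleep(&delay, NULL);
	atomic_fetch_add(&failed_connects, 1);	/* simulated rejected connect() */
	pthread_barrier_wait(&sync_point);	/* signal: the attempt is done */
	return NULL;
}

/*
 * Checker side: reads the counter only after the client had its chance
 * to run. Returns the observed counter value, or -1 on setup failure.
 */
int check_counter_after_sync(void)
{
	pthread_t client;
	int seen;

	atomic_store(&failed_connects, 0);
	if (pthread_barrier_init(&sync_point, NULL, 2))
		return -1;
	if (pthread_create(&client, NULL, client_thread, NULL)) {
		pthread_barrier_destroy(&sync_point);
		return -1;
	}

	/* Blocks until the client arrives, however late it was scheduled. */
	pthread_barrier_wait(&sync_point);
	seen = atomic_load(&failed_connects);

	pthread_join(client, NULL);
	pthread_barrier_destroy(&sync_point);
	return seen;
}
```

In this sketch the checker does not depend on a fixed 1 second window: it waits at the barrier until the client has made its attempt, so the counter check no longer races with the scheduler.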
Never happens on my machine, but I suppose the majority of netdev's connect-deny-* flakes [1] are caused by this.
Fixes:
Hi Dmitry,
I realise it probably wasn't intended to be a fixes tag, but it turns out to be an invalid one. Could you express this in a different way?
# selftests: net/tcp_ao: connect-deny_ipv6
# 1..21
# # 462[lib/setup.c:243] rand seed 1720905426
# TAP version 13
# ok 1 Non-AO server + AO client
# not ok 2 Non-AO server + AO client: TCPAOKeyNotFound counter did not increase: 0 <= 0
# ok 3 AO server + Non-AO client
# ok 4 AO server + Non-AO client: counter TCPAORequired increased 0 => 1
...
Signed-off-by: Dmitry Safonov 0x7f454c46@gmail.com
...