On Mon, 2024-01-29 at 08:39 -0800, Jakub Kicinski wrote:
On Mon, 29 Jan 2024 17:31:33 +0100 Paolo Abeni wrote:
Uhm... while the self-test no longer emits the message about the missing modules, it still fails in the CI env, and I can't reproduce the failures in my local env (same for the gro.sh script).
If I understand correctly, the tests run under double virtualization (a VM on top of AWS?), is that correct? I guess the extra slowdown/overhead will need more care.
Yes, it's a VM inside a VM without nested virtualization support. A weird setup, granted, but when we move to bare metal I'd like to enable KASAN, which will probably cause a similar slowdown.
You could possibly get a similar slowdown by disabling HW virt / KVM?
Thanks, the above helped - that is, I can reproduce the failure running the self-tests in a VM with KVM disabled in the host. Funnily enough I can't use plain virtme for that - the virtme VM crashes on boot, possibly due to the wrong 'machine' argument passed to qemu.
In any case I can't see a sane way to cope with such slow environments except skipping the sensitive cases.
FWIW so far the 4 types of issues we've seen were:
- config missing
- OS doesn't ifup by default
- OS tools are old / buggy
- VM-in-VM is just too slow.
There's a bunch of failures in forwarding which look like perf issues. I wonder if we should introduce something in the settings file to let tests know that they are running in a very slow env?
I was wondering about passing such info to the test e.g. via an env variable:
vng --run . --user root -- HOST_IS_DAMN_SLOW=true ./tools/testing/selftests/kselftest_install/run_kselftest.sh -t <whatever>
In any case some tests should be updated to skip the relevant cases accordingly, right?
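Something along these lines, as a rough sketch only: HOST_IS_DAMN_SLOW is just the placeholder name from the command above, and host_is_slow() / run_perf_sensitive() are hypothetical helpers, none of which exist in the selftests today. The only existing convention assumed here is the kselftest skip exit code (4).

```shell
#!/bin/bash
# Sketch: skip timing-sensitive cases when the environment is flagged as slow.
# HOST_IS_DAMN_SLOW is a placeholder name, not an existing kselftest knob.

ksft_skip=4	# kselftest exit code meaning "test skipped"

# True only when the caller explicitly flagged the host as slow.
host_is_slow() {
	[ "${HOST_IS_DAMN_SLOW:-false}" = "true" ]
}

# Usage: run_perf_sensitive <case name> <command> [args...]
# Runs the command unless the host is flagged as slow; in that case
# reports a skip and returns the kselftest skip code instead.
run_perf_sensitive() {
	local name="$1"
	shift

	if host_is_slow; then
		echo "SKIP: $name (slow host)"
		return $ksft_skip
	fi

	"$@"
}
```

A test would then wrap only the sensitive cases, e.g. `run_perf_sensitive "gro large" ./some_case`, and keep running everything else unconditionally, so slow environments still get the functional coverage.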
Cheers,
Paolo