CCing Ruoqing He
On Wed, 22 Jan 2025 at 04:48, Simon Kaegi simon.kaegi@gmail.com wrote:
Thanks Stefano,
The feedback about vsock expectations was exactly what I was hoping you could provide.
You're welcome ;-)
In the Kata agent we're not directly setting SO_REUSEPORT as a socket option, so I think what you suggest, where SO_REUSEPORT is being set indiscriminately, is happening a layer down, perhaps in the tokio or nix crates we use. Unfortunately I do not have an easy way to reproduce the problem without setting up Kata Containers, and what's more, you then need to rebuild a recent Kata-flavoured minimal kernel to see the issue.
I talked with Ruoqing He yesterday about this issue since he knows Kata better than me :-)
He pointed out that Kata is using ttrpc-rust and he shared with me this code: https://github.com/containerd/ttrpc-rust/blob/0610015a92c340c6d88f81c0d6f9f4...
The change (setting SO_REUSEPORT) was introduced more than 4 years ago, but I honestly don't think it solved the problem mentioned in the commit: https://github.com/containerd/ttrpc-rust/commit/9ac87828ee870ecf5fb5feaa45cc... So far it hasn't caused any problems because the option was allowed on every socket, but effectively it was a NOP for AF_VSOCK.
IIUC that code, it supports 2 address families: AF_VSOCK and AF_UNIX. For AF_VSOCK we've made it clear that SO_REUSEPORT is useless, but for AF_UNIX it's even more useless since there's no concept of a port, so in my opinion `setsockopt(fd, sockopt::ReusePort, &true)?;` can be removed completely. Or at least not fail the entire function if it's unsupported, whereas now it fails and the next bind is not done.
I don't know where this code is called, but removing that line is likely to make everything work correctly.
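A minimal sketch of the "do not fail the entire function" option, written in Python like the other snippets in this thread rather than in the Rust of ttrpc-rust. The helper name `bind_tolerating_reuseport` and the set of errno values treated as "unsupported" are assumptions made for illustration, not ttrpc-rust's API:

```python
import errno
import socket

# Assumption: these errno values mean "option not available on this socket".
_UNSUPPORTED = {errno.EOPNOTSUPP, errno.ENOPROTOOPT}


def bind_tolerating_reuseport(sock, address):
    """Try to set SO_REUSEPORT, then bind; return True if the option was set.

    Hypothetical helper: an unsupported option is ignored so that bind() is
    still attempted. Any other setsockopt() error is re-raised.
    """
    reuseport_set = True
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    except OSError as exc:
        if exc.errno not in _UNSUPPORTED:
            raise
        reuseport_set = False
    # The bind is attempted whether or not the option could be set.
    sock.bind(address)
    return reuseport_set
```

For an AF_VSOCK socket this could be called as `bind_tolerating_reuseport(s, (socket.VMADDR_CID_ANY, 4242))`. On a kernel that rejects the option, the return value should be False while the bind still goes ahead.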
Cheers, Stefano
I spent the day updating our build to use the latest Kata Containers release and dependencies to see if that would correct the issue. Unfortunately it did not, so I will work tomorrow to get stack traces etc. to figure things out more directly. For the others on the thread: based on what Stefano said, although throwing an error for vsock sockets is a change in behaviour, I suspect this is a problem we can fix in a crate, corrected to be more aware of vsock capabilities. I'll know better what's possible and will update tomorrow.
Thanks -Simon
On Tue, Jan 21, 2025 at 4:54 AM Stefano Garzarella sgarzare@redhat.com wrote:
On Tue, 21 Jan 2025 at 10:26, Stefano Garzarella sgarzare@redhat.com wrote:
Hi Simon,
On Tue, 21 Jan 2025 at 05:53, Simon Kaegi simon.kaegi@gmail.com wrote:
#regzbot introduced v6.6.69..v6.6.70
#regzbot introduced: ad91a2dacbf8c26a446658cdd55e8324dfeff1e7
We hit this regression when updating our guest vm kernel from 6.6.69 to 6.6.70 -- bisecting, this problem was introduced in ad91a2dacbf8c26a446658cdd55e8324dfeff1e7 -- net: restrict SO_REUSEPORT to inet sockets
We're getting a timeout when trying to connect to the vsocket in the guest VM when launching a kata containers 3.10.1 agent which unsurprisingly ... uses a vsocket to communicate back to the host.
We modified this commit to add an sk_is_vsock check, recompiled, and this works correctly for us.
- if (valbool && !sk_is_inet(sk))
+ if (valbool && !(sk_is_inet(sk) || sk_is_vsock(sk)))
My understanding is limited here so I've added Stefano as he is likely to better understand what makes sense here.
Thanks for adding me, do you have a reproducer here?
AFAIK in AF_VSOCK we never supported SO_REUSEPORT, so it seems strange to me.
I understand that the patch you refer to changes the behavior of setsockopt(..., SO_REUSEPORT, ...) on an AF_VSOCK socket: it used to return successfully before that change, and now it returns an error. However, subsequent binds should still have failed even without this patch.
Do you actually use the SO_REUSEPORT feature on AF_VSOCK?
If so, I need to better understand if the core socket does anything, but as I recall AF_VSOCK allocates ports internally, so I don't think multiple binds on the same port have ever been supported.
I just tried on an old kernel without the patch applied, and I confirm that SO_REUSEPORT was not supported even though the setsockopt() was successful.
I ran the following snippet in 2 shells: in the first one everything is fine, but in the second the bind() fails in this way:
$ uname -r
6.10.11-200.fc40.x86_64
$ python3
import socket
import os
s = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
s.bind((socket.VMADDR_CID_ANY, 4242))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 98] Address already in use
With the patch applied, the setsockopt() fails immediately, but the bind() behavior is the same (it fails only in the second shell):
$ uname -r
6.12.9-200.fc41.x86_64
$ python3
import socket
import os
s = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
Traceback (most recent call last):
  File "<python-input-3>", line 1, in <module>
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 95] Operation not supported
So, IMHO the patch is correct since AF_VSOCK never really supported SO_REUSEPORT, so better to fail early.
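The two-shell experiment above can also be approximated in a single process. This is only a sketch: the names `probe_reuseport` and `_attempt` are hypothetical, and results are reported as plain strings ("ok" or "errno N") instead of raising:

```python
import socket


def _attempt(call):
    """Run call(); return "ok", or "errno N" if it raises OSError."""
    try:
        call()
    except OSError as exc:
        return f"errno {exc.errno}"
    return "ok"


def probe_reuseport(family, address):
    """Open two stream sockets of `family`; on each, set SO_REUSEPORT and bind.

    Returns one (setsockopt_result, bind_result) tuple per socket, in order.
    """
    results = []
    socks = []
    try:
        for _ in range(2):
            s = socket.socket(family, socket.SOCK_STREAM)
            socks.append(s)
            opt = _attempt(
                lambda: s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
            )
            bound = _attempt(lambda: s.bind(address))
            results.append((opt, bound))
    finally:
        # Close both sockets only after the second bind has been attempted.
        for s in socks:
            s.close()
    return results
```

Called as `probe_reuseport(socket.AF_VSOCK, (socket.VMADDR_CID_ANY, 4242))`, and going by the sessions above, a kernel without the patch should report "ok" for both setsockopt() calls and "errno 98" for the second bind(), while a kernel with the patch should report "errno 95" for both setsockopt() calls and still "errno 98" for the second bind().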
BTW I'm not sure what is happening on your side. Could it be a problem in your code that uses SO_REUSEPORT indiscriminately on AF_VSOCK, even though you then never bind on the same port again?
Thanks, Stefano