On Tue, 21 Jan 2025 at 10:26, Stefano Garzarella sgarzare@redhat.com wrote:
Hi Simon,
On Tue, 21 Jan 2025 at 05:53, Simon Kaegi simon.kaegi@gmail.com wrote:
#regzbot introduced v6.6.69..v6.6.70
#regzbot introduced: ad91a2dacbf8c26a446658cdd55e8324dfeff1e7
We hit this regression when updating our guest vm kernel from 6.6.69 to 6.6.70 -- bisecting, this problem was introduced in ad91a2dacbf8c26a446658cdd55e8324dfeff1e7 -- net: restrict SO_REUSEPORT to inet sockets
We're getting a timeout when trying to connect to the vsocket in the guest VM when launching a kata containers 3.10.1 agent which unsurprisingly ... uses a vsocket to communicate back to the host.
We modified this commit to add an sk_is_vsock() check, recompiled, and with that change everything works correctly for us.
- if (valbool && !sk_is_inet(sk))
+ if (valbool && !(sk_is_inet(sk) || sk_is_vsock(sk)))
My understanding is limited here so I've added Stefano as he is likely to better understand what makes sense here.
Thanks for adding me, do you have a reproducer here?
AFAIK in AF_VSOCK we never supported SO_REUSEPORT, so it seems strange to me.
I understand that the patch you refer to changes the behavior of setsockopt(..., SO_REUSEPORT, ...) on an AF_VSOCK socket: it used to return successfully and now returns an error. However, a subsequent bind() on an already-bound port should have failed even without this patch.
Do you actually use the SO_REUSEPORT feature on AF_VSOCK?
If so, I need to better understand whether the core socket code does anything with it, but as I recall AF_VSOCK allocates ports internally, so I don't think multiple binds on the same port have ever been supported.
I just tried on an older kernel without the patch applied, and I can confirm that SO_REUSEPORT was not supported even though the setsockopt() succeeded.
I ran the following snippet in 2 shells. In the first one everything is fine, but in the second the bind() fails like this:
    $ uname -r
    6.10.11-200.fc40.x86_64
    $ python3
    import socket
    import os
    s = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind((socket.VMADDR_CID_ANY, 4242))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: [Errno 98] Address already in use
With the patch applied, the setsockopt() fails immediately, but the bind() behavior is the same (it fails only in the second shell):
    $ uname -r
    6.12.9-200.fc41.x86_64
    $ python3
    import socket
    import os
    s = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    Traceback (most recent call last):
      File "<python-input-3>", line 1, in <module>
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    OSError: [Errno 95] Operation not supported
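For anyone who wants to check both behaviors from a single process instead of two shells, here is a rough sketch of the same experiment. The function name probe_reuseport() is made up for illustration; it only uses the standard Python socket module, and it assumes a Linux system where socket.SO_REUSEPORT is defined.

```python
import errno
import socket


def probe_reuseport(family, addr):
    """Set SO_REUSEPORT on two sockets, then bind both to the same address.

    Returns (setsockopt_errno, second_bind_errno); None means the call
    succeeded. probe_reuseport is a hypothetical helper, not an existing API.
    """
    setsockopt_errno = None
    second_bind_errno = None
    first = socket.socket(family, socket.SOCK_STREAM)
    second = socket.socket(family, socket.SOCK_STREAM)
    try:
        for s in (first, second):
            try:
                s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
            except OSError as exc:
                # Remember the failure but keep going, as in the second session.
                setsockopt_errno = exc.errno
        # The first bind is expected to succeed if the port is free.
        first.bind(addr)
        try:
            second.bind(addr)
        except OSError as exc:
            second_bind_errno = exc.errno
    finally:
        first.close()
        second.close()
    return setsockopt_errno, second_bind_errno
```

It can be called as probe_reuseport(socket.AF_VSOCK, (socket.VMADDR_CID_ANY, 4242)) on a machine with AF_VSOCK support. If the two sessions above are representative, I would expect (None, 98) without the patch and (95, 98) with it, but I have not run this exact script.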
So IMHO the patch is correct: AF_VSOCK never really supported SO_REUSEPORT, so it is better to fail early.
BTW I'm not sure what is happening on your side. Could it be a problem in your code that uses SO_REUSEPORT indiscriminately on AF_VSOCK, even though you then never bind on the same port again?
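If that is the case, one option on the application side is to treat SO_REUSEPORT as best effort. The following is only a sketch in Python to show the idea; try_enable_reuseport() is a hypothetical name, it is not taken from the kata agent, and I am assuming that EOPNOTSUPP or ENOPROTOOPT are the errors worth ignoring.

```python
import errno
import socket

# errno values taken to mean "option not available on this socket"
# (an assumption for this sketch).
_UNSUPPORTED = {errno.EOPNOTSUPP, errno.ENOPROTOOPT}


def try_enable_reuseport(sock):
    """Best-effort SO_REUSEPORT: True if set, False if unsupported."""
    opt = getattr(socket, "SO_REUSEPORT", None)
    if opt is None:
        # The platform does not define the option at all.
        return False
    try:
        sock.setsockopt(socket.SOL_SOCKET, opt, 1)
    except OSError as exc:
        if exc.errno in _UNSUPPORTED:
            return False
        # Anything else is a real error and should not be hidden.
        raise
    return True
```

With something like this, a caller can continue to bind() and listen() on an AF_VSOCK socket when the option is rejected, since the port was never shared anyway.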
Thanks, Stefano