On 1/14/25 11:16, Stefano Garzarella wrote:
On Tue, Jan 14, 2025 at 01:09:24AM +0100, Michal Luczaj wrote:
On 1/13/25 16:01, Stefano Garzarella wrote:
On Mon, Jan 13, 2025 at 02:51:58PM +0100, Michal Luczaj wrote:
On 1/13/25 12:05, Stefano Garzarella wrote:
... An alternative approach, which would perhaps allow us to avoid all this, is to re-insert the socket in the unbound list after calling release() when we deassign the transport.
WDYT?
If we can't keep the old state (sk_state, transport, etc) on failed re-connect() then reverting back to initial state sounds, uhh, like an option :) I'm not sure how well this aligns with (user's expectations of) good ol' socket API, but maybe that train has already left.
We really want to behave as similar as possible with the other sockets, like AF_INET, so I would try to continue toward that train.
I was worried that such connect()/transport error handling may have some user visible side effects, but I guess I was wrong. I mean you can still reach a sk_state=TCP_LISTEN with a transport assigned[1], but perhaps that's a different issue.
I've tried your suggestion on top of this series. Passes the tests.
Great, thanks!
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index fa9d1b49599b..4718fe86689d 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -492,6 +492,10 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk) vsk->transport->release(vsk); vsock_deassign_transport(vsk);
vsock_addr_unbind(&vsk->local_addr);
vsock_addr_unbind(&vsk->remote_addr);
My only doubt is that if a user did a specific bind() before the connect, this way we're resetting everything, is that right?
That is right.
But we aren't changing much. Transport release already removes vsk from vsock_bound_sockets. So even though vsk->local_addr is untouched (i.e. vsock_addr_bound() returns `true`), vsk can't be picked by vsock_find_bound_socket(). User can't bind() it again, either.
And when patched as above: bind() works as "expected", but socket is pretty much useless, anyway. If I'm correct, the first failing connect() trips virtio_transport_recv_connecting(), which sets `sk->sk_err`. I don't see it being reset. Does the vsock suppose to keep sk_err state once set?
Currently only AF_VSOCK throws ConnectionResetError: ``` from socket import *
def test(family, addr): s = socket(family, SOCK_STREAM) assert s.connect_ex(addr) != 0
lis = socket(family, SOCK_STREAM) lis.bind(addr) lis.listen() s.connect(addr)
p, _ = lis.accept() p.send(b'x') assert s.recv(1) == b'x'
test(AF_INET, ('127.0.0.1', 2000)) test(AF_UNIX, '\0/tmp/foo') test(AF_VSOCK, (1, 2000)) # VMADDR_CID_LOCAL ```
Maybe we need to look better at the release, and prevent it from removing the socket from the lists as you suggested, maybe adding a function in af_vsock.c that all transports can call.
I'd be happy to submit a proper patch, but it would be helpful to decide how close to AF_INET/AF_UNIX's behaviour is close enough. Or would you rather have that UAF plugged first?