Hi all,
This series backports three upstream commits: - 135ffc7 "bpf, vsock: Invoke proto::close on close()" - fcdd224 "vsock: Keep the binding until socket destruction" - 78dafe1 "vsock: Orphan socket after transport release"
Although this version of the kernel does not support sockmap, I think backporting this patch can be useful to reduce conflicts in future backports [1]. It does not harm the system. The comment it introduces in the code can be misleading. I added some words in the commit to explain the situation.
The other two commits are untouched, fixing a use-after free[2] and a null-ptr-deref[3] respectively.
[1]https://lore.kernel.org/stable/f7lr3ftzo66sl6phlcygh4xx4spga4b6je37fhawjrsjt... [2]https://lore.kernel.org/all/20250128-vsock-transport-vs-autobind-v3-0-1cf570... [3]https://lore.kernel.org/all/20250210-vsock-linger-nullderef-v3-0-ef6244d02b5...
Cheers, Luigi
To: Stefano Garzarella sgarzare@redhat.com To: Michal Luczaj mhal@rbox.co To: stable@vger.kernel.org
Signed-off-by: Luigi Leonardi leonardi@redhat.com --- Michal Luczaj (3): bpf, vsock: Invoke proto::close on close() vsock: Keep the binding until socket destruction vsock: Orphan socket after transport release
net/vmw_vsock/af_vsock.c | 77 +++++++++++++++++++++++++++++++----------------- 1 file changed, 50 insertions(+), 27 deletions(-) --- base-commit: f0a53361993a94f602df6f35e78149ad2ac12c89 change-id: 20250220-backport_fix_5_10-0ae85f834bc4
Best regards,
From: Michal Luczaj mhal@rbox.co
commit 135ffc7becc82cfb84936ae133da7969220b43b2 upstream.
vsock defines a BPF callback to be invoked when close() is called. However, this callback is never actually executed. As a result, a closed vsock socket is not automatically removed from the sockmap/sockhash.
Introduce a dummy vsock_close() and make vsock_release() call proto::close.
Note: changes in __vsock_release() look messy, but it's only due to indent level reduction and variables xmas tree reorder.
Fixes: 634f1a7110b4 ("vsock: support sockmap") Signed-off-by: Michal Luczaj mhal@rbox.co Reviewed-by: Stefano Garzarella sgarzare@redhat.com Reviewed-by: Luigi Leonardi leonardi@redhat.com Link: https://lore.kernel.org/r/20241118-vsock-bpf-poll-close-v1-3-f1b9669cacdc@rb... Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com [LL: There is no sockmap support for this kernel version. This patch has been backported because it helps reduce conflicts on future backports] Signed-off-by: Luigi Leonardi leonardi@redhat.com --- net/vmw_vsock/af_vsock.c | 67 +++++++++++++++++++++++++++++------------------- 1 file changed, 40 insertions(+), 27 deletions(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 53a9c0a73489bad5d4d9de1d0299b7b850462204..f80b28934c4b5af11765074da8d3f6f3d92ce6ff 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -113,12 +113,14 @@ static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr); static void vsock_sk_destruct(struct sock *sk); static int vsock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb); +static void vsock_close(struct sock *sk, long timeout);
/* Protocol family. */ static struct proto vsock_proto = { .name = "AF_VSOCK", .owner = THIS_MODULE, .obj_size = sizeof(struct vsock_sock), + .close = vsock_close, };
/* The default peer timeout indicates how long we will wait for a peer response @@ -767,39 +769,37 @@ static struct sock *__vsock_create(struct net *net,
static void __vsock_release(struct sock *sk, int level) { - if (sk) { - struct sock *pending; - struct vsock_sock *vsk; - - vsk = vsock_sk(sk); - pending = NULL; /* Compiler warning. */ + struct vsock_sock *vsk; + struct sock *pending;
- /* When "level" is SINGLE_DEPTH_NESTING, use the nested - * version to avoid the warning "possible recursive locking - * detected". When "level" is 0, lock_sock_nested(sk, level) - * is the same as lock_sock(sk). - */ - lock_sock_nested(sk, level); + vsk = vsock_sk(sk); + pending = NULL; /* Compiler warning. */
- if (vsk->transport) - vsk->transport->release(vsk); - else if (sk->sk_type == SOCK_STREAM) - vsock_remove_sock(vsk); + /* When "level" is SINGLE_DEPTH_NESTING, use the nested + * version to avoid the warning "possible recursive locking + * detected". When "level" is 0, lock_sock_nested(sk, level) + * is the same as lock_sock(sk). + */ + lock_sock_nested(sk, level);
- sock_orphan(sk); - sk->sk_shutdown = SHUTDOWN_MASK; + if (vsk->transport) + vsk->transport->release(vsk); + else if (sk->sk_type == SOCK_STREAM) + vsock_remove_sock(vsk);
- skb_queue_purge(&sk->sk_receive_queue); + sock_orphan(sk); + sk->sk_shutdown = SHUTDOWN_MASK;
- /* Clean up any sockets that never were accepted. */ - while ((pending = vsock_dequeue_accept(sk)) != NULL) { - __vsock_release(pending, SINGLE_DEPTH_NESTING); - sock_put(pending); - } + skb_queue_purge(&sk->sk_receive_queue);
- release_sock(sk); - sock_put(sk); + /* Clean up any sockets that never were accepted. */ + while ((pending = vsock_dequeue_accept(sk)) != NULL) { + __vsock_release(pending, SINGLE_DEPTH_NESTING); + sock_put(pending); } + + release_sock(sk); + sock_put(sk); }
static void vsock_sk_destruct(struct sock *sk) @@ -853,9 +853,22 @@ s64 vsock_stream_has_space(struct vsock_sock *vsk) } EXPORT_SYMBOL_GPL(vsock_stream_has_space);
+/* Dummy callback required by sockmap. + * See unconditional call of saved_close() in sock_map_close(). + */ +static void vsock_close(struct sock *sk, long timeout) +{ +} + static int vsock_release(struct socket *sock) { - __vsock_release(sock->sk, 0); + struct sock *sk = sock->sk; + + if (!sk) + return 0; + + sk->sk_prot->close(sk, 0); + __vsock_release(sk, 0); sock->sk = NULL; sock->state = SS_FREE;
From: Michal Luczaj mhal@rbox.co
commit fcdd2242c0231032fc84e1404315c245ae56322a upstream.
Preserve sockets bindings; this includes both resulting from an explicit bind() and those implicitly bound through autobind during connect().
Prevents socket unbinding during a transport reassignment, which fixes a use-after-free:
1. vsock_create() (refcnt=1) calls vsock_insert_unbound() (refcnt=2) 2. transport->release() calls vsock_remove_bound() without checking if sk was bound and moved to bound list (refcnt=1) 3. vsock_bind() assumes sk is in unbound list and before __vsock_insert_bound(vsock_bound_sockets()) calls __vsock_remove_bound() which does: list_del_init(&vsk->bound_table); // nop sock_put(&vsk->sk); // refcnt=0
BUG: KASAN: slab-use-after-free in __vsock_bind+0x62e/0x730 Read of size 4 at addr ffff88816b46a74c by task a.out/2057 dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 __vsock_bind+0x62e/0x730 vsock_bind+0x97/0xe0 __sys_bind+0x154/0x1f0 __x64_sys_bind+0x6e/0xb0 do_syscall_64+0x93/0x1b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e
Allocated by task 2057: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 __vsock_create.constprop.0+0x2e/0xb60 vsock_create+0xe4/0x420 __sock_create+0x241/0x650 __sys_socket+0xf2/0x1a0 __x64_sys_socket+0x6e/0xb0 do_syscall_64+0x93/0x1b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e
Freed by task 2057: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 __vsock_bind+0x5e1/0x730 vsock_bind+0x97/0xe0 __sys_bind+0x154/0x1f0 __x64_sys_bind+0x6e/0xb0 do_syscall_64+0x93/0x1b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e
refcount_t: addition on 0; use-after-free. WARNING: CPU: 7 PID: 2057 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 RIP: 0010:refcount_warn_saturate+0xce/0x150 __vsock_bind+0x66d/0x730 vsock_bind+0x97/0xe0 __sys_bind+0x154/0x1f0 __x64_sys_bind+0x6e/0xb0 do_syscall_64+0x93/0x1b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e
refcount_t: underflow; use-after-free. WARNING: CPU: 7 PID: 2057 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 RIP: 0010:refcount_warn_saturate+0xee/0x150 vsock_remove_bound+0x187/0x1e0 __vsock_release+0x383/0x4a0 vsock_release+0x90/0x120 __sock_release+0xa3/0x250 sock_close+0x14/0x20 __fput+0x359/0xa80 task_work_run+0x107/0x1d0 do_exit+0x847/0x2560 do_group_exit+0xb8/0x250 __x64_sys_exit_group+0x3a/0x50 x64_sys_call+0xfec/0x14f0 do_syscall_64+0x93/0x1b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") Reviewed-by: Stefano Garzarella sgarzare@redhat.com Signed-off-by: Michal Luczaj mhal@rbox.co Link: https://patch.msgid.link/20250128-vsock-transport-vs-autobind-v3-1-1cf57065b... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Luigi Leonardi leonardi@redhat.com --- net/vmw_vsock/af_vsock.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index f80b28934c4b5af11765074da8d3f6f3d92ce6ff..f3e520e127bc271810ce80152d1e05a9ed1bea42 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -330,7 +330,10 @@ EXPORT_SYMBOL_GPL(vsock_find_connected_socket);
void vsock_remove_sock(struct vsock_sock *vsk) { - vsock_remove_bound(vsk); + /* Transport reassignment must not remove the binding. */ + if (sock_flag(sk_vsock(vsk), SOCK_DEAD)) + vsock_remove_bound(vsk); + vsock_remove_connected(vsk); } EXPORT_SYMBOL_GPL(vsock_remove_sock); @@ -782,12 +785,13 @@ static void __vsock_release(struct sock *sk, int level) */ lock_sock_nested(sk, level);
+ sock_orphan(sk); + if (vsk->transport) vsk->transport->release(vsk); else if (sk->sk_type == SOCK_STREAM) vsock_remove_sock(vsk);
- sock_orphan(sk); sk->sk_shutdown = SHUTDOWN_MASK;
skb_queue_purge(&sk->sk_receive_queue);
From: Michal Luczaj mhal@rbox.co
commit 78dafe1cf3afa02ed71084b350713b07e72a18fb upstream.
During socket release, sock_orphan() is called without considering that it sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a null pointer dereferenced in virtio_transport_wait_close().
Orphan the socket only after transport release.
Partially reverts the 'Fixes:' commit.
KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] lock_acquire+0x19e/0x500 _raw_spin_lock_irqsave+0x47/0x70 add_wait_queue+0x46/0x230 virtio_transport_release+0x4e7/0x7f0 __vsock_release+0xfd/0x490 vsock_release+0x90/0x120 __sock_release+0xa3/0x250 sock_close+0x14/0x20 __fput+0x35e/0xa90 __x64_sys_close+0x78/0xd0 do_syscall_64+0x93/0x1b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e
Reported-by: syzbot+9d55b199192a4be7d02c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9d55b199192a4be7d02c Fixes: fcdd2242c023 ("vsock: Keep the binding until socket destruction") Tested-by: Luigi Leonardi leonardi@redhat.com Reviewed-by: Luigi Leonardi leonardi@redhat.com Signed-off-by: Michal Luczaj mhal@rbox.co Link: https://patch.msgid.link/20250210-vsock-linger-nullderef-v3-1-ef6244d02b54@r... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Luigi Leonardi leonardi@redhat.com --- net/vmw_vsock/af_vsock.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index f3e520e127bc271810ce80152d1e05a9ed1bea42..8955a574719f2df6431cc9240f1bbb7f1b637b31 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -785,13 +785,19 @@ static void __vsock_release(struct sock *sk, int level) */ lock_sock_nested(sk, level);
- sock_orphan(sk); + /* Indicate to vsock_remove_sock() that the socket is being released and + * can be removed from the bound_table. Unlike transport reassignment + * case, where the socket must remain bound despite vsock_remove_sock() + * being called from the transport release() callback. + */ + sock_set_flag(sk, SOCK_DEAD);
if (vsk->transport) vsk->transport->release(vsk); else if (sk->sk_type == SOCK_STREAM) vsock_remove_sock(vsk);
+ sock_orphan(sk); sk->sk_shutdown = SHUTDOWN_MASK;
skb_queue_purge(&sk->sk_receive_queue);
linux-stable-mirror@lists.linaro.org