Series deals with one more case of vsock surprising BPF/sockmap by being inconsistency about (having an) assigned transport.
KASAN: null-ptr-deref in range [0x0000000000000120-0x0000000000000127] CPU: 7 UID: 0 PID: 56 Comm: kworker/7:0 Not tainted 6.14.0-rc1+ Workqueue: vsock-loopback vsock_loopback_work RIP: 0010:vsock_read_skb+0x4b/0x90 Call Trace: sk_psock_verdict_data_ready+0xa4/0x2e0 virtio_transport_recv_pkt+0x1ca8/0x2acc vsock_loopback_work+0x27d/0x3f0 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x35a/0x700 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30
This bug, similarly to commit f6abafcd32f9 ("vsock/bpf: return early if transport is not assigned"), could be fixed with a single NULL check. But instead, let's explore another approach: take a hint from vsock_bpf_update_proto() and teach sockmap to accept only vsocks that are already connected (no risk of transport being dropped or reassigned). At the same time straight reject the listeners (vsock listening sockets do not carry any transport anyway). This way BPF does not have to worry about vsk->transport becoming NULL.
Signed-off-by: Michal Luczaj mhal@rbox.co --- Michal Luczaj (4): sockmap, vsock: For connectible sockets allow only connected vsock/bpf: Warn on socket without transport selftest/bpf: Adapt vsock_delete_on_close to sockmap rejecting unconnected selftest/bpf: Add vsock test for sockmap rejecting unconnected
net/core/sock_map.c | 3 + net/vmw_vsock/af_vsock.c | 3 + net/vmw_vsock/vsock_bpf.c | 2 +- .../selftests/bpf/prog_tests/sockmap_basic.c | 70 ++++++++++++++++------ 4 files changed, 59 insertions(+), 19 deletions(-) --- base-commit: 9c01a177c2e4b55d2bcce8a1f6bdd1d46a8320e3 change-id: 20250210-vsock-listen-sockmap-nullptr-e6e82ca79611
Best regards,
sockmap expects all vsocks to have a transport assigned, which is expressed in vsock_proto::psock_update_sk_prot(). However, there is an edge case where an unconnected (connectible) socket may lose its previously assigned transport. This is handled with a NULL check in the vsock/BPF recv path.
Another design detail is that listening vsocks are not supposed to have any transport assigned at all. Which implies they are not supported by the sockmap. But this is complicated by the fact that a socket, before switching to TCP_LISTEN, may have had some transport assigned during a failed connect() attempt. Hence, we may end up with a listening vsock in a sockmap, which blows up quickly:
KASAN: null-ptr-deref in range [0x0000000000000120-0x0000000000000127] CPU: 7 UID: 0 PID: 56 Comm: kworker/7:0 Not tainted 6.14.0-rc1+ Workqueue: vsock-loopback vsock_loopback_work RIP: 0010:vsock_read_skb+0x4b/0x90 Call Trace: sk_psock_verdict_data_ready+0xa4/0x2e0 virtio_transport_recv_pkt+0x1ca8/0x2acc vsock_loopback_work+0x27d/0x3f0 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x35a/0x700 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30
For connectible sockets, instead of relying solely on the state of vsk->transport, tell sockmap to only allow those representing established connections. This aligns with the behaviour for AF_INET and AF_UNIX.
Fixes: 634f1a7110b4 ("vsock: support sockmap") Signed-off-by: Michal Luczaj mhal@rbox.co --- net/core/sock_map.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/core/sock_map.c b/net/core/sock_map.c index f1b9b3958792cd599efcb591742874e9b3f4a76b..2f1be9baad0578e2202b5cf79616b6e814c1ed54 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -541,6 +541,9 @@ static bool sock_map_sk_state_allowed(const struct sock *sk) return (1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_LISTEN); if (sk_is_stream_unix(sk)) return (1 << sk->sk_state) & TCPF_ESTABLISHED; + if (sk_is_vsock(sk) && + (sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET)) + return (1 << sk->sk_state) & TCPF_ESTABLISHED; return true; }
... Another design detail is that listening vsocks are not supposed to have any transport assigned at all. Which implies they are not supported by the sockmap. But this is complicated by the fact that a socket, before switching to TCP_LISTEN, may have had some transport assigned during a failed connect() attempt. Hence, we may end up with a listening vsock in a sockmap, which blows up quickly:
KASAN: null-ptr-deref in range [0x0000000000000120-0x0000000000000127] CPU: 7 UID: 0 PID: 56 Comm: kworker/7:0 Not tainted 6.14.0-rc1+ Workqueue: vsock-loopback vsock_loopback_work RIP: 0010:vsock_read_skb+0x4b/0x90 Call Trace: sk_psock_verdict_data_ready+0xa4/0x2e0 virtio_transport_recv_pkt+0x1ca8/0x2acc vsock_loopback_work+0x27d/0x3f0 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x35a/0x700 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30
Perhaps I should have expanded more on the null-ptr-deref itself.
The idea is: force a vsock into assigning a transport and add it to the sockmap (with a verdict program), but keep it unconnected. Then, drop the transport and set the vsock to TCP_LISTEN. The moment a new connection is established:
virtio_transport_recv_pkt() virtio_transport_recv_listen() sk->sk_data_ready(sk) i.e. sk_psock_verdict_data_ready() ops->read_skb() i.e. vsock_read_skb() vsk->transport->read_skb() vsk->transport is NULL, boom
Here's a stand-alone repro:
/* * # modprobe -a vsock_loopback vhost_vsock * # gcc test.c && ./a.out */ #include <stdio.h> #include <stdint.h> #include <stdlib.h> #include <unistd.h> #include <errno.h> #include <sys/socket.h> #include <sys/syscall.h> #include <linux/bpf.h> #include <linux/vm_sockets.h>
static void die(const char *msg) { perror(msg); exit(-1); }
static int sockmap_create(void) { union bpf_attr attr = { .map_type = BPF_MAP_TYPE_SOCKMAP, .key_size = sizeof(int), .value_size = sizeof(int), .max_entries = 1 }; int map;
map = syscall(SYS_bpf, BPF_MAP_CREATE, &attr, sizeof(attr)); if (map < 0) die("map_create");
return map; }
static void map_update_elem(int fd, int key, int value) { union bpf_attr attr = { .map_fd = fd, .key = (uint64_t)&key, .value = (uint64_t)&value, .flags = BPF_ANY };
if (syscall(SYS_bpf, BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr))) die("map_update_elem"); }
static int prog_load(void) { /* mov %r0, 1; exit */ struct bpf_insn insns[] = { { .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = 0, .imm = 1 }, { .code = BPF_JMP | BPF_EXIT } }; union bpf_attr attr = { .prog_type = BPF_PROG_TYPE_SK_SKB, .insn_cnt = sizeof(insns)/sizeof(insns[0]), .insns = (long)insns, .license = (long)"", }; int prog = syscall(SYS_bpf, BPF_PROG_LOAD, &attr, sizeof(attr)); if (prog < 0) die("prog_load");
return prog; }
static void link_create(int prog_fd, int target_fd) { union bpf_attr attr = { .link_create = { .prog_fd = prog_fd, .target_fd = target_fd, .attach_type = BPF_SK_SKB_VERDICT } };
if (syscall(SYS_bpf, BPF_LINK_CREATE, &attr, sizeof(attr)) < 0) die("link_create"); }
int main(void) { struct sockaddr_vm addr = { .svm_family = AF_VSOCK, .svm_cid = VMADDR_CID_LOCAL, .svm_port = VMADDR_PORT_ANY }; socklen_t alen = sizeof(addr); int s, map, prog, c;
s = socket(AF_VSOCK, SOCK_SEQPACKET, 0); if (s < 0) die("socket");
if (bind(s, (struct sockaddr *)&addr, alen)) die("bind");
if (!connect(s, (struct sockaddr *)&addr, alen) || errno != ECONNRESET) die("connect #1");
map = sockmap_create(); prog = prog_load(); link_create(prog, map); map_update_elem(map, 0, s);
addr.svm_cid = 0x42424242; /* non-existing */ if (!connect(s, (struct sockaddr *)&addr, alen) || errno != ESOCKTNOSUPPORT) die("connect #2");
if (listen(s, 1)) die("listen");
if (getsockname(s, (struct sockaddr *)&addr, &alen)) die("getsockname");
c = socket(AF_VSOCK, SOCK_SEQPACKET, 0); if (c < 0) die("socket c");
if (connect(c, (struct sockaddr *)&addr, alen)) die("connect #3");
return 0; }
On Fri, Feb 14, 2025 at 02:11:48PM +0100, Michal Luczaj wrote:
... Another design detail is that listening vsocks are not supposed to have any transport assigned at all. Which implies they are not supported by the sockmap. But this is complicated by the fact that a socket, before switching to TCP_LISTEN, may have had some transport assigned during a failed connect() attempt. Hence, we may end up with a listening vsock in a sockmap, which blows up quickly:
KASAN: null-ptr-deref in range [0x0000000000000120-0x0000000000000127] CPU: 7 UID: 0 PID: 56 Comm: kworker/7:0 Not tainted 6.14.0-rc1+ Workqueue: vsock-loopback vsock_loopback_work RIP: 0010:vsock_read_skb+0x4b/0x90 Call Trace: sk_psock_verdict_data_ready+0xa4/0x2e0 virtio_transport_recv_pkt+0x1ca8/0x2acc vsock_loopback_work+0x27d/0x3f0 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x35a/0x700 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30
Perhaps I should have expanded more on the null-ptr-deref itself.
The idea is: force a vsock into assigning a transport and add it to the sockmap (with a verdict program), but keep it unconnected. Then, drop the transport and set the vsock to TCP_LISTEN. The moment a new connection is established:
virtio_transport_recv_pkt() virtio_transport_recv_listen() sk->sk_data_ready(sk) i.e. sk_psock_verdict_data_ready() ops->read_skb() i.e. vsock_read_skb() vsk->transport->read_skb() vsk->transport is NULL, boom
Yes I agree, it's a little clearer with this, but I think it was also clear the concept before. So with or without:
Acked-by: Stefano Garzarella sgarzare@redhat.com
Here's a stand-alone repro:
/*
- # modprobe -a vsock_loopback vhost_vsock
- # gcc test.c && ./a.out
*/ #include <stdio.h> #include <stdint.h> #include <stdlib.h> #include <unistd.h> #include <errno.h> #include <sys/socket.h> #include <sys/syscall.h> #include <linux/bpf.h> #include <linux/vm_sockets.h>
static void die(const char *msg) { perror(msg); exit(-1); }
static int sockmap_create(void) { union bpf_attr attr = { .map_type = BPF_MAP_TYPE_SOCKMAP, .key_size = sizeof(int), .value_size = sizeof(int), .max_entries = 1 }; int map;
map = syscall(SYS_bpf, BPF_MAP_CREATE, &attr, sizeof(attr)); if (map < 0) die("map_create");
return map; }
static void map_update_elem(int fd, int key, int value) { union bpf_attr attr = { .map_fd = fd, .key = (uint64_t)&key, .value = (uint64_t)&value, .flags = BPF_ANY };
if (syscall(SYS_bpf, BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr))) die("map_update_elem"); }
static int prog_load(void) { /* mov %r0, 1; exit */ struct bpf_insn insns[] = { { .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = 0, .imm = 1 }, { .code = BPF_JMP | BPF_EXIT } }; union bpf_attr attr = { .prog_type = BPF_PROG_TYPE_SK_SKB, .insn_cnt = sizeof(insns)/sizeof(insns[0]), .insns = (long)insns, .license = (long)"", }; int prog = syscall(SYS_bpf, BPF_PROG_LOAD, &attr, sizeof(attr)); if (prog < 0) die("prog_load");
return prog; }
static void link_create(int prog_fd, int target_fd) { union bpf_attr attr = { .link_create = { .prog_fd = prog_fd, .target_fd = target_fd, .attach_type = BPF_SK_SKB_VERDICT } };
if (syscall(SYS_bpf, BPF_LINK_CREATE, &attr, sizeof(attr)) < 0) die("link_create"); }
int main(void) { struct sockaddr_vm addr = { .svm_family = AF_VSOCK, .svm_cid = VMADDR_CID_LOCAL, .svm_port = VMADDR_PORT_ANY }; socklen_t alen = sizeof(addr); int s, map, prog, c;
s = socket(AF_VSOCK, SOCK_SEQPACKET, 0); if (s < 0) die("socket");
if (bind(s, (struct sockaddr *)&addr, alen)) die("bind");
if (!connect(s, (struct sockaddr *)&addr, alen) || errno != ECONNRESET) die("connect #1");
map = sockmap_create(); prog = prog_load(); link_create(prog, map); map_update_elem(map, 0, s);
addr.svm_cid = 0x42424242; /* non-existing */ if (!connect(s, (struct sockaddr *)&addr, alen) || errno != ESOCKTNOSUPPORT) die("connect #2");
if (listen(s, 1)) die("listen");
if (getsockname(s, (struct sockaddr *)&addr, &alen)) die("getsockname");
c = socket(AF_VSOCK, SOCK_SEQPACKET, 0); if (c < 0) die("socket c");
if (connect(c, (struct sockaddr *)&addr, alen)) die("connect #3");
return 0; }
In the spirit of commit 91751e248256 ("vsock: prevent null-ptr-deref in vsock_*[has_data|has_space]"), armorize the "impossible" cases with a warning.
Fixes: 634f1a7110b4 ("vsock: support sockmap") Signed-off-by: Michal Luczaj mhal@rbox.co --- net/vmw_vsock/af_vsock.c | 3 +++ net/vmw_vsock/vsock_bpf.c | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 53a081d49d28ac1c04e7f8057c8a55e7b73cc131..7e3db87ae4333cf63327ec105ca99253569bb9fe 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1189,6 +1189,9 @@ static int vsock_read_skb(struct sock *sk, skb_read_actor_t read_actor) { struct vsock_sock *vsk = vsock_sk(sk);
+ if (WARN_ON_ONCE(!vsk->transport)) + return -ENODEV; + return vsk->transport->read_skb(vsk, read_actor); }
diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c index f201d9eca1df2f8143638cf7a4d08671e8368c11..07b96d56f3a577af71021b1b8132743554996c4f 100644 --- a/net/vmw_vsock/vsock_bpf.c +++ b/net/vmw_vsock/vsock_bpf.c @@ -87,7 +87,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, lock_sock(sk); vsk = vsock_sk(sk);
- if (!vsk->transport) { + if (WARN_ON_ONCE(!vsk->transport)) { copied = -ENODEV; goto out; }
On Thu, Feb 13, 2025 at 12:58:50PM +0100, Michal Luczaj wrote:
In the spirit of commit 91751e248256 ("vsock: prevent null-ptr-deref in vsock_*[has_data|has_space]"), armorize the "impossible" cases with a warning.
Fixes: 634f1a7110b4 ("vsock: support sockmap") Signed-off-by: Michal Luczaj mhal@rbox.co
net/vmw_vsock/af_vsock.c | 3 +++ net/vmw_vsock/vsock_bpf.c | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 53a081d49d28ac1c04e7f8057c8a55e7b73cc131..7e3db87ae4333cf63327ec105ca99253569bb9fe 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1189,6 +1189,9 @@ static int vsock_read_skb(struct sock *sk, skb_read_actor_t read_actor) { struct vsock_sock *vsk = vsock_sk(sk);
- if (WARN_ON_ONCE(!vsk->transport))
return -ENODEV;
- return vsk->transport->read_skb(vsk, read_actor);
}
diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c index f201d9eca1df2f8143638cf7a4d08671e8368c11..07b96d56f3a577af71021b1b8132743554996c4f 100644 --- a/net/vmw_vsock/vsock_bpf.c +++ b/net/vmw_vsock/vsock_bpf.c @@ -87,7 +87,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, lock_sock(sk); vsk = vsock_sk(sk);
- if (!vsk->transport) {
- if (WARN_ON_ONCE(!vsk->transport)) { copied = -ENODEV; goto out; }
I'm not a sockmap expert, so I don't understand why here print an error.
Since there was already a check, I expected it to be a case that can happen, but instead calling `rcvmsg()` on a socket not yet connected is impossible?
Thanks, Stefano
On 2/17/25 11:59, Stefano Garzarella wrote:
On Thu, Feb 13, 2025 at 12:58:50PM +0100, Michal Luczaj wrote:
In the spirit of commit 91751e248256 ("vsock: prevent null-ptr-deref in vsock_*[has_data|has_space]"), armorize the "impossible" cases with a warning.
Fixes: 634f1a7110b4 ("vsock: support sockmap") Signed-off-by: Michal Luczaj mhal@rbox.co
net/vmw_vsock/af_vsock.c | 3 +++ net/vmw_vsock/vsock_bpf.c | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 53a081d49d28ac1c04e7f8057c8a55e7b73cc131..7e3db87ae4333cf63327ec105ca99253569bb9fe 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1189,6 +1189,9 @@ static int vsock_read_skb(struct sock *sk, skb_read_actor_t read_actor) { struct vsock_sock *vsk = vsock_sk(sk);
- if (WARN_ON_ONCE(!vsk->transport))
return -ENODEV;
- return vsk->transport->read_skb(vsk, read_actor);
}
diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c index f201d9eca1df2f8143638cf7a4d08671e8368c11..07b96d56f3a577af71021b1b8132743554996c4f 100644 --- a/net/vmw_vsock/vsock_bpf.c +++ b/net/vmw_vsock/vsock_bpf.c @@ -87,7 +87,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, lock_sock(sk); vsk = vsock_sk(sk);
- if (!vsk->transport) {
- if (WARN_ON_ONCE(!vsk->transport)) { copied = -ENODEV; goto out; }
I'm not a sockmap expert, so I don't understand why here print an error.
Since there was already a check, I expected it to be a case that can happen, but instead calling `rcvmsg()` on a socket not yet connected is impossible?
That's right, calling vsock_bpf_recvmsg() on a not-yet-connected connectible socket is impossible since PATCH 1/4 of this series.
That is because to reach vsock_bpf_recvmsg(), you must have sock's proto replaced in vsock_bpf_update_proto(). For that you need to run sock_map_init_proto(), which you can't because the patched sock_map_sk_state_allowed() will stop you.
On Mon, Feb 17, 2025 at 08:45:06PM +0100, Michal Luczaj wrote:
On 2/17/25 11:59, Stefano Garzarella wrote:
On Thu, Feb 13, 2025 at 12:58:50PM +0100, Michal Luczaj wrote:
In the spirit of commit 91751e248256 ("vsock: prevent null-ptr-deref in vsock_*[has_data|has_space]"), armorize the "impossible" cases with a warning.
Fixes: 634f1a7110b4 ("vsock: support sockmap") Signed-off-by: Michal Luczaj mhal@rbox.co
net/vmw_vsock/af_vsock.c | 3 +++ net/vmw_vsock/vsock_bpf.c | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 53a081d49d28ac1c04e7f8057c8a55e7b73cc131..7e3db87ae4333cf63327ec105ca99253569bb9fe 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1189,6 +1189,9 @@ static int vsock_read_skb(struct sock *sk, skb_read_actor_t read_actor) { struct vsock_sock *vsk = vsock_sk(sk);
- if (WARN_ON_ONCE(!vsk->transport))
return -ENODEV;
- return vsk->transport->read_skb(vsk, read_actor);
}
diff --git a/net/vmw_vsock/vsock_bpf.c b/net/vmw_vsock/vsock_bpf.c index f201d9eca1df2f8143638cf7a4d08671e8368c11..07b96d56f3a577af71021b1b8132743554996c4f 100644 --- a/net/vmw_vsock/vsock_bpf.c +++ b/net/vmw_vsock/vsock_bpf.c @@ -87,7 +87,7 @@ static int vsock_bpf_recvmsg(struct sock *sk, struct msghdr *msg, lock_sock(sk); vsk = vsock_sk(sk);
- if (!vsk->transport) {
- if (WARN_ON_ONCE(!vsk->transport)) { copied = -ENODEV; goto out; }
I'm not a sockmap expert, so I don't understand why here print an error.
Since there was already a check, I expected it to be a case that can happen, but instead calling `rcvmsg()` on a socket not yet connected is impossible?
That's right, calling vsock_bpf_recvmsg() on a not-yet-connected connectible socket is impossible since PATCH 1/4 of this series.
Okay, I get it now.
If you need to send a v2, can you write it in the commit description that the behavior changed in the previous patch and so in vsock_bpf_recvmsg() now we no longer expect to be called with a null transport.
In any case, LGTM:
Reviewed-by: Stefano Garzarella sgarzare@redhat.com
That is because to reach vsock_bpf_recvmsg(), you must have sock's proto replaced in vsock_bpf_update_proto(). For that you need to run sock_map_init_proto(), which you can't because the patched sock_map_sk_state_allowed() will stop you.
Commit 515745445e92 ("selftest/bpf: Add test for vsock removal from sockmap on close()") added test that checked if proto::close() callback was invoked on AF_VSOCK socket release. I.e. it verified that a close()d vsock does indeed get removed from the sockmap.
It was done simply by creating a socket pair and attempting to replace a close()d one with its peer. Since, due to a recent change, sockmap does not allow updating index with a non-established connectible vsock, redo it with a freshly established one.
Signed-off-by: Michal Luczaj mhal@rbox.co --- .../selftests/bpf/prog_tests/sockmap_basic.c | 40 ++++++++++++---------- 1 file changed, 22 insertions(+), 18 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c index 884ad87783d59ef3d1ca84c3a542f3f8670cd463..21793d8c79e12b6e607f59ecebb26448c310044b 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c @@ -111,31 +111,35 @@ static void test_sockmap_create_update_free(enum bpf_map_type map_type)
static void test_sockmap_vsock_delete_on_close(void) { - int err, c, p, map; - const int zero = 0; - - err = create_pair(AF_VSOCK, SOCK_STREAM, &c, &p); - if (!ASSERT_OK(err, "create_pair(AF_VSOCK)")) - return; + int map, c, p, err, zero = 0;
map = bpf_map_create(BPF_MAP_TYPE_SOCKMAP, NULL, sizeof(int), sizeof(int), 1, NULL); - if (!ASSERT_GE(map, 0, "bpf_map_create")) { - close(c); - goto out; - } + if (!ASSERT_OK_FD(map, "bpf_map_create")) + return;
- err = bpf_map_update_elem(map, &zero, &c, BPF_NOEXIST); - close(c); - if (!ASSERT_OK(err, "bpf_map_update")) - goto out; + err = create_pair(AF_VSOCK, SOCK_STREAM, &c, &p); + if (!ASSERT_OK(err, "create_pair")) + goto close_map;
- err = bpf_map_update_elem(map, &zero, &p, BPF_NOEXIST); + if (xbpf_map_update_elem(map, &zero, &c, BPF_NOEXIST)) + goto close_socks; + + xclose(c); + xclose(p); + + err = create_pair(AF_VSOCK, SOCK_STREAM, &c, &p); + if (!ASSERT_OK(err, "create_pair")) + goto close_map; + + err = bpf_map_update_elem(map, &zero, &c, BPF_NOEXIST); ASSERT_OK(err, "after close(), bpf_map_update");
-out: - close(p); - close(map); +close_socks: + xclose(c); + xclose(p); +close_map: + xclose(map); }
static void test_skmsg_helpers(enum bpf_map_type map_type)
On Thu, Feb 13, 2025 at 12:58:51PM +0100, Michal Luczaj wrote:
Commit 515745445e92 ("selftest/bpf: Add test for vsock removal from sockmap on close()") added test that checked if proto::close() callback was invoked on AF_VSOCK socket release. I.e. it verified that a close()d vsock does indeed get removed from the sockmap.
It was done simply by creating a socket pair and attempting to replace a close()d one with its peer. Since, due to a recent change, sockmap does not allow updating index with a non-established connectible vsock, redo it with a freshly established one.
Signed-off-by: Michal Luczaj mhal@rbox.co
.../selftests/bpf/prog_tests/sockmap_basic.c | 40 ++++++++++++---------- 1 file changed, 22 insertions(+), 18 deletions(-)
Acked-by: Stefano Garzarella sgarzare@redhat.com
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c index 884ad87783d59ef3d1ca84c3a542f3f8670cd463..21793d8c79e12b6e607f59ecebb26448c310044b 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c @@ -111,31 +111,35 @@ static void test_sockmap_create_update_free(enum bpf_map_type map_type)
static void test_sockmap_vsock_delete_on_close(void) {
- int err, c, p, map;
- const int zero = 0;
- err = create_pair(AF_VSOCK, SOCK_STREAM, &c, &p);
- if (!ASSERT_OK(err, "create_pair(AF_VSOCK)"))
return;
int map, c, p, err, zero = 0;
map = bpf_map_create(BPF_MAP_TYPE_SOCKMAP, NULL, sizeof(int), sizeof(int), 1, NULL);
- if (!ASSERT_GE(map, 0, "bpf_map_create")) {
close(c);
goto out;
- }
- if (!ASSERT_OK_FD(map, "bpf_map_create"))
return;
- err = bpf_map_update_elem(map, &zero, &c, BPF_NOEXIST);
- close(c);
- if (!ASSERT_OK(err, "bpf_map_update"))
goto out;
- err = create_pair(AF_VSOCK, SOCK_STREAM, &c, &p);
- if (!ASSERT_OK(err, "create_pair"))
goto close_map;
- err = bpf_map_update_elem(map, &zero, &p, BPF_NOEXIST);
- if (xbpf_map_update_elem(map, &zero, &c, BPF_NOEXIST))
goto close_socks;
- xclose(c);
- xclose(p);
- err = create_pair(AF_VSOCK, SOCK_STREAM, &c, &p);
- if (!ASSERT_OK(err, "create_pair"))
goto close_map;
- err = bpf_map_update_elem(map, &zero, &c, BPF_NOEXIST); ASSERT_OK(err, "after close(), bpf_map_update");
-out:
- close(p);
- close(map);
+close_socks:
- xclose(c);
- xclose(p);
+close_map:
- xclose(map);
}
static void test_skmsg_helpers(enum bpf_map_type map_type)
-- 2.48.1
Verify that for a connectible AF_VSOCK socket, merely having a transport assigned is insufficient; socket must be connected for the sockmap to accept.
This does not test datagram vsocks. Even though it hardly matters. VMCI is the only transport that features VSOCK_TRANSPORT_F_DGRAM, but it has an unimplemented vsock_transport::readskb() callback, making it unsupported by BPF/sockmap.
Signed-off-by: Michal Luczaj mhal@rbox.co --- .../selftests/bpf/prog_tests/sockmap_basic.c | 30 ++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c index 21793d8c79e12b6e607f59ecebb26448c310044b..05eb37935c3e290ee52b8d8c7c3e3a8db026cba2 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c @@ -1065,6 +1065,34 @@ static void test_sockmap_skb_verdict_vsock_poll(void) test_sockmap_pass_prog__destroy(skel); }
+static void test_sockmap_vsock_unconnected(void) +{ + struct sockaddr_storage addr; + int map, s, zero = 0; + socklen_t alen; + + map = bpf_map_create(BPF_MAP_TYPE_SOCKMAP, NULL, sizeof(int), + sizeof(int), 1, NULL); + if (!ASSERT_OK_FD(map, "bpf_map_create")) + return; + + s = xsocket(AF_VSOCK, SOCK_STREAM, 0); + if (s < 0) + goto close_map; + + /* Fail connect(), but trigger transport assignment. */ + init_addr_loopback(AF_VSOCK, &addr, &alen); + if (!ASSERT_ERR(connect(s, sockaddr(&addr), alen), "connect")) + goto close_sock; + + ASSERT_ERR(bpf_map_update_elem(map, &zero, &s, BPF_ANY), "map_update"); + +close_sock: + xclose(s); +close_map: + xclose(map); +} + void test_sockmap_basic(void) { if (test__start_subtest("sockmap create_update_free")) @@ -1131,4 +1159,6 @@ void test_sockmap_basic(void) test_skmsg_helpers_with_link(BPF_MAP_TYPE_SOCKHASH); if (test__start_subtest("sockmap skb_verdict vsock poll")) test_sockmap_skb_verdict_vsock_poll(); + if (test__start_subtest("sockmap vsock unconnected")) + test_sockmap_vsock_unconnected(); }
On 2/13/25 12:58, Michal Luczaj wrote:
... This does not test datagram vsocks. Even though it hardly matters. VMCI is the only transport that features VSOCK_TRANSPORT_F_DGRAM, but it has an unimplemented vsock_transport::readskb() callback, making it unsupported by
^^^^^^^ Ugh, read_skb().
Sorry, Michal
On Thu, Feb 13, 2025 at 12:58:52PM +0100, Michal Luczaj wrote:
Verify that for a connectible AF_VSOCK socket, merely having a transport assigned is insufficient; socket must be connected for the sockmap to accept.
This does not test datagram vsocks. Even though it hardly matters. VMCI is the only transport that features VSOCK_TRANSPORT_F_DGRAM, but it has an unimplemented vsock_transport::readskb() callback, making it unsupported by BPF/sockmap.
Signed-off-by: Michal Luczaj mhal@rbox.co
.../selftests/bpf/prog_tests/sockmap_basic.c | 30 ++++++++++++++++++++++ 1 file changed, 30 insertions(+)
Acked-by: Stefano Garzarella sgarzare@redhat.com
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c index 21793d8c79e12b6e607f59ecebb26448c310044b..05eb37935c3e290ee52b8d8c7c3e3a8db026cba2 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c @@ -1065,6 +1065,34 @@ static void test_sockmap_skb_verdict_vsock_poll(void) test_sockmap_pass_prog__destroy(skel); }
+static void test_sockmap_vsock_unconnected(void) +{
- struct sockaddr_storage addr;
- int map, s, zero = 0;
- socklen_t alen;
- map = bpf_map_create(BPF_MAP_TYPE_SOCKMAP, NULL, sizeof(int),
sizeof(int), 1, NULL);
- if (!ASSERT_OK_FD(map, "bpf_map_create"))
return;
- s = xsocket(AF_VSOCK, SOCK_STREAM, 0);
- if (s < 0)
goto close_map;
- /* Fail connect(), but trigger transport assignment. */
- init_addr_loopback(AF_VSOCK, &addr, &alen);
- if (!ASSERT_ERR(connect(s, sockaddr(&addr), alen), "connect"))
goto close_sock;
- ASSERT_ERR(bpf_map_update_elem(map, &zero, &s, BPF_ANY), "map_update");
+close_sock:
- xclose(s);
+close_map:
- xclose(map);
+}
void test_sockmap_basic(void) { if (test__start_subtest("sockmap create_update_free")) @@ -1131,4 +1159,6 @@ void test_sockmap_basic(void) test_skmsg_helpers_with_link(BPF_MAP_TYPE_SOCKHASH); if (test__start_subtest("sockmap skb_verdict vsock poll")) test_sockmap_skb_verdict_vsock_poll();
- if (test__start_subtest("sockmap vsock unconnected"))
test_sockmap_vsock_unconnected();
}
-- 2.48.1
Hello:
This series was applied to netdev/net.git (main) by Paolo Abeni pabeni@redhat.com:
On Thu, 13 Feb 2025 12:58:48 +0100 you wrote:
Series deals with one more case of vsock surprising BPF/sockmap by being inconsistency about (having an) assigned transport.
KASAN: null-ptr-deref in range [0x0000000000000120-0x0000000000000127] CPU: 7 UID: 0 PID: 56 Comm: kworker/7:0 Not tainted 6.14.0-rc1+ Workqueue: vsock-loopback vsock_loopback_work RIP: 0010:vsock_read_skb+0x4b/0x90 Call Trace: sk_psock_verdict_data_ready+0xa4/0x2e0 virtio_transport_recv_pkt+0x1ca8/0x2acc vsock_loopback_work+0x27d/0x3f0 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x35a/0x700 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30
[...]
Here is the summary with links: - [net,1/4] sockmap, vsock: For connectible sockets allow only connected https://git.kernel.org/netdev/net/c/8fb5bb169d17 - [net,2/4] vsock/bpf: Warn on socket without transport https://git.kernel.org/netdev/net/c/857ae05549ee - [net,3/4] selftest/bpf: Adapt vsock_delete_on_close to sockmap rejecting unconnected https://git.kernel.org/netdev/net/c/8350695bfb16 - [net,4/4] selftest/bpf: Add vsock test for sockmap rejecting unconnected https://git.kernel.org/netdev/net/c/85928e9c4363
You are awesome, thank you!
linux-kselftest-mirror@lists.linaro.org