[ Upstream commit 45a4521dcbd92e71c9e53031b40e34211d3b4feb ]
__sock_map_delete() may be called from a tcp event such as unhash or close from the following trace,
tcp_bpf_close() tcp_bpf_remove() sk_psock_unlink() sock_map_delete_from_link() __sock_map_delete()
In this case the sock lock is held but this only protects against duplicate removals on the TCP side. If the map is free'd then we have this trace,
sock_map_free xchg() <- replaces map entry sock_map_unref() sk_psock_put() sock_map_del_link()
The __sock_map_delete() call however uses a read, test, null over the map entry which can result in both paths trying to free the map entry.
To fix use xchg in TCP paths as well so we avoid having two references to the same map entry.
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: John Fastabend john.fastabend@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/sock_map.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/net/core/sock_map.c b/net/core/sock_map.c index be6092ac69f8a..1d40e040320d2 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -281,16 +281,20 @@ static int __sock_map_delete(struct bpf_stab *stab, struct sock *sk_test, struct sock **psk) { struct sock *sk; + int err = 0;
raw_spin_lock_bh(&stab->lock); sk = *psk; if (!sk_test || sk_test == sk) - *psk = NULL; + sk = xchg(psk, NULL); + + if (likely(sk)) + sock_map_unref(sk, psk); + else + err = -EINVAL; + raw_spin_unlock_bh(&stab->lock); - if (unlikely(!sk)) - return -EINVAL; - sock_map_unref(sk, psk); - return 0; + return err; }
static void sock_map_delete_from_link(struct bpf_map *map, struct sock *sk,