From: Ignat Korchagin <ignat@cloudflare.com>
Date: Mon, 17 Jun 2024 20:59:34 +0100
A KASAN enabled kernel will log something like below (decoded and stripped):

[ 78.328507][ T299] ==================================================================
[ 78.329018][ T299] BUG: KASAN: slab-use-after-free in __sock_gen_cookie (./arch/x86/include/asm/atomic64_64.h:15 ./include/linux/atomic/atomic-arch-fallback.h:2583 ./include/linux/atomic/atomic-instrumented.h:1611 net/core/sock_diag.c:29)
[ 78.329366][ T299] Read of size 8 at addr ffff888007110dd8 by task traceroute/299
[ 78.329366][ T299]
[ 78.329366][ T299] CPU: 2 PID: 299 Comm: traceroute Tainted: G E 6.10.0-rc2+ #2
[ 78.329366][ T299] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 78.329366][ T299] Call Trace:
[ 78.329366][ T299] <TASK>
[ 78.329366][ T299] dump_stack_lvl (lib/dump_stack.c:117 (discriminator 1))
[ 78.329366][ T299] print_report (mm/kasan/report.c:378 mm/kasan/report.c:488)
[ 78.329366][ T299] ? __sock_gen_cookie (./arch/x86/include/asm/atomic64_64.h:15 ./include/linux/atomic/atomic-arch-fallback.h:2583 ./include/linux/atomic/atomic-instrumented.h:1611 net/core/sock_diag.c:29)
[ 78.329366][ T299] kasan_report (mm/kasan/report.c:603)
[ 78.329366][ T299] ? __sock_gen_cookie (./arch/x86/include/asm/atomic64_64.h:15 ./include/linux/atomic/atomic-arch-fallback.h:2583 ./include/linux/atomic/atomic-instrumented.h:1611 net/core/sock_diag.c:29)
[ 78.329366][ T299] kasan_check_range (mm/kasan/generic.c:183 mm/kasan/generic.c:189)
[ 78.329366][ T299] __sock_gen_cookie (./arch/x86/include/asm/atomic64_64.h:15 ./include/linux/atomic/atomic-arch-fallback.h:2583 ./include/linux/atomic/atomic-instrumented.h:1611 net/core/sock_diag.c:29)
[ 78.329366][ T299] bpf_get_socket_ptr_cookie (./arch/x86/include/asm/preempt.h:94 ./include/linux/sock_diag.h:42 net/core/filter.c:5094 net/core/filter.c:5092)
[ 78.329366][ T299] bpf_prog_875642cf11f1d139___sock_release+0x6e/0x8e
[ 78.329366][ T299] bpf_trampoline_6442506592+0x47/0xaf
[ 78.329366][ T299] __sock_release (net/socket.c:652)
[ 78.329366][ T299] __sock_create (net/socket.c:1601)
...
[ 78.329366][ T299] Allocated by task 299 on cpu 2 at 78.328492s:
[ 78.329366][ T299] kasan_save_stack (mm/kasan/common.c:48)
[ 78.329366][ T299] kasan_save_track (mm/kasan/common.c:68)
[ 78.329366][ T299] __kasan_slab_alloc (mm/kasan/common.c:312 mm/kasan/common.c:338)
[ 78.329366][ T299] kmem_cache_alloc_noprof (mm/slub.c:3941 mm/slub.c:4000 mm/slub.c:4007)
[ 78.329366][ T299] sk_prot_alloc (net/core/sock.c:2075)
[ 78.329366][ T299] sk_alloc (net/core/sock.c:2134)
[ 78.329366][ T299] inet_create (net/ipv4/af_inet.c:327 net/ipv4/af_inet.c:252)
[ 78.329366][ T299] __sock_create (net/socket.c:1572)
[ 78.329366][ T299] __sys_socket (net/socket.c:1660 net/socket.c:1644 net/socket.c:1706)
[ 78.329366][ T299] __x64_sys_socket (net/socket.c:1718)
[ 78.329366][ T299] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
[ 78.329366][ T299] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[ 78.329366][ T299]
[ 78.329366][ T299] Freed by task 299 on cpu 2 at 78.328502s:
[ 78.329366][ T299] kasan_save_stack (mm/kasan/common.c:48)
[ 78.329366][ T299] kasan_save_track (mm/kasan/common.c:68)
[ 78.329366][ T299] kasan_save_free_info (mm/kasan/generic.c:582)
[ 78.329366][ T299] poison_slab_object (mm/kasan/common.c:242)
[ 78.329366][ T299] __kasan_slab_free (mm/kasan/common.c:256)
[ 78.329366][ T299] kmem_cache_free (mm/slub.c:4437 mm/slub.c:4511)
[ 78.329366][ T299] __sk_destruct (net/core/sock.c:2117 net/core/sock.c:2208)
[ 78.329366][ T299] inet_create (net/ipv4/af_inet.c:397 net/ipv4/af_inet.c:252)
[ 78.329366][ T299] __sock_create (net/socket.c:1572)
[ 78.329366][ T299] __sys_socket (net/socket.c:1660 net/socket.c:1644 net/socket.c:1706)
[ 78.329366][ T299] __x64_sys_socket (net/socket.c:1718)
[ 78.329366][ T299] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
[ 78.329366][ T299] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

Fix this by clearing the struct socket reference in sk_common_release(), which covers the create functions of all protocol families.
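
To make the sequence concrete, here is a minimal userspace sketch of the dangling pointer and of the fix. It is only a model under stated assumptions: the `model_*` helpers, the `freed` flag and `sock_sk_dangles_after_failed_create()` are hypothetical names invented for illustration, the two structs keep nothing but the back-pointers, and a flag stands in for the real free so the model never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures. Only the two back-pointers and
 * a "freed" flag are modelled; nothing here is real kernel code. */
struct sock;
struct socket { struct sock *sk; };
struct sock { struct socket *sk_socket; bool freed; };

/* Models sock_init_data(): link the socket and the sk to each other. */
static void model_sock_init_data(struct socket *sock, struct sock *sk)
{
	sock->sk = sk;
	sk->sk_socket = sock;
}

/* Models sk_common_release() plus the final free of the sk. The flag
 * stands in for kmem_cache_free(). With apply_fix set, the
 * back-reference is cleared first, as the patch does. */
static void model_sk_common_release(struct sock *sk, bool apply_fix)
{
	if (apply_fix && sk->sk_socket)
		sk->sk_socket->sk = NULL;
	sk->freed = true;
}

/* Models a pf->create() that fails after sock_init_data() and releases
 * the sk on its error path. Always returns an error. */
static int model_failing_create(struct socket *sock, struct sock *sk,
				bool apply_fix)
{
	model_sock_init_data(sock, sk);
	model_sk_common_release(sk, apply_fix);
	return -1;
}

/* Returns true if, after the failed create, sock->sk still points at the
 * released sk. That is the state a tracing program attached to
 * __sock_release() would observe. */
bool sock_sk_dangles_after_failed_create(bool apply_fix)
{
	struct sock sk = { .sk_socket = NULL, .freed = false };
	struct socket sock = { .sk = NULL };
	int err = model_failing_create(&sock, &sk, apply_fix);

	assert(err != 0);
	return sock.sk != NULL && sock.sk->freed;
}
```

In this model, sock_sk_dangles_after_failed_create(false) reports a dangling reference, while sock_sk_dangles_after_failed_create(true) does not, because the release path has already reset sock->sk to NULL.
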

Fixes: c5dbb89fc2ac ("bpf: Expose bpf_get_socket_cookie to tracing programs")
Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/netdev/20240613194047.36478-1-kuniyu@amazon.com/T/

Changes in v2:
- moved the NULL-ing of the socket reference to sk_common_release() (as suggested by Kuniyuki Iwashima)
- trimmed down the KASAN report in the commit message to show only relevant info
It seems the most important repro was lost. I'd like to keep that in the commit message so that we can easily understand the Fixes: tag and how the issue happens.
While at it, could you remove the timestamps and thread IDs from the KASAN splat?

 net/core/sock.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/net/core/sock.c b/net/core/sock.c
index 8629f9aecf91..575af557c46b 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3742,6 +3742,17 @@ void sk_common_release(struct sock *sk)

 	sk->sk_prot->unhash(sk);

+	/*
+	 * struct net_proto_family create functions like inet_create() or

nit: This should be netdev style:

	/* struct net_proto_family ...
	 * ...
	 */

See: Documentation/process/maintainer-netdev.rst

But I think the comment is not needed here. If the commit message has

  * KASAN splat
  * How the problem happens
    * run bpf_get_socket_cookie() in __sock_release()
  * What the problem is
    * UAF happens if pf->create() fails after calling sock_init_data()

then we can just git-blame the change below.

Thanks!

+	 * inet6_create() have an error path, which call this function. This sk
+	 * may have already been associated with a struct socket, so ensure to
+	 * clear this reference not to leave a dangling pointer in the
+	 * struct socket instance.
+	 */
+
+	if (sk->sk_socket)
+		sk->sk_socket->sk = NULL;
+
 	/*
 	 * In this point socket cannot receive new packets, but it is possible
 	 * that some packets are in flight because some CPU runs receiver and
-- 
2.39.2