From: Eric Dumazet edumazet@google.com
commit ac888d58869bb99753e7652be19a151df9ecb35d upstream.
dst_entries_add() uses per-cpu data that might be freed at netns dismantle from ip6_route_net_exit() calling dst_entries_destroy()
Before ip6_route_net_exit() can be called, we release all the dsts associated with this netns, via calls to dst_release(), which waits an rcu grace period before calling dst_destroy()
dst_entries_add() use in dst_destroy() is racy, because dst_entries_destroy() could have been called already.
Decrementing the number of dsts must happen sooner.
Notes:
1) in CONFIG_XFRM case, dst_destroy() can call dst_release_immediate(child), this might also cause UAF if the child does not have DST_NOCOUNT set. IPSEC maintainers might take a look and see how to address this.
2) There is also discussion about removing this count of dst, which might happen in future kernels.
Fixes: f88649721268 ("ipv4: fix dst race in sk_dst_get()") Closes: https://lore.kernel.org/lkml/CANn89iLCCGsP7SFn9HKpvnKu96Td4KD08xf7aGtiYgZnkj... Reported-by: Naresh Kamboju naresh.kamboju@linaro.org Tested-by: Linux Kernel Functional Testing lkft@linaro.org Tested-by: Naresh Kamboju naresh.kamboju@linaro.org Signed-off-by: Eric Dumazet edumazet@google.com Cc: Xin Long lucien.xin@gmail.com Cc: Steffen Klassert steffen.klassert@secunet.com Reviewed-by: Xin Long lucien.xin@gmail.com Link: https://patch.msgid.link/20241008143110.1064899-1-edumazet@google.com Signed-off-by: Paolo Abeni pabeni@redhat.com
[Conflict due to bc9d3a9f2afc ("net: dst: Switch to rcuref_t reference counting") is not in the tree] Signed-off-by: Abdelkareem Abdelsaamad kareemem@amazon.com --- net/core/dst.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/net/core/dst.c b/net/core/dst.c index 453ec8aafc4a..5bb143857336 100644 --- a/net/core/dst.c +++ b/net/core/dst.c @@ -109,9 +109,6 @@ struct dst_entry *dst_destroy(struct dst_entry * dst) child = xdst->child; } #endif - if (!(dst->flags & DST_NOCOUNT)) - dst_entries_add(dst->ops, -1); - if (dst->ops->destroy) dst->ops->destroy(dst); if (dst->dev) @@ -162,6 +159,12 @@ void dst_dev_put(struct dst_entry *dst) } EXPORT_SYMBOL(dst_dev_put);
+static void dst_count_dec(struct dst_entry *dst) +{ + if (!(dst->flags & DST_NOCOUNT)) + dst_entries_add(dst->ops, -1); +} + void dst_release(struct dst_entry *dst) { if (dst) { @@ -171,8 +174,10 @@ void dst_release(struct dst_entry *dst) if (WARN_ONCE(newrefcnt < 0, "dst_release underflow")) net_warn_ratelimited("%s: dst:%p refcnt:%d\n", __func__, dst, newrefcnt); - if (!newrefcnt) + if (!newrefcnt){ + dst_count_dec(dst); call_rcu(&dst->rcu_head, dst_destroy_rcu); + } } } EXPORT_SYMBOL(dst_release); @@ -186,8 +191,10 @@ void dst_release_immediate(struct dst_entry *dst) if (WARN_ONCE(newrefcnt < 0, "dst_release_immediate underflow")) net_warn_ratelimited("%s: dst:%p refcnt:%d\n", __func__, dst, newrefcnt); - if (!newrefcnt) + if (!newrefcnt){ + dst_count_dec(dst); dst_destroy(dst); + } } } EXPORT_SYMBOL(dst_release_immediate);