From: Eric Dumazet <edumazet@google.com>
[ Upstream commit 9fba1eb39e2f74d2002c5cbcf1d4435d37a4f752 ]
Add READ_ONCE() annotations because np->rxpmtu can be changed while udpv6_recvmsg() and rawv6_recvmsg() read it.
Since this is a very rarely used feature, and udpv6_recvmsg() and rawv6_recvmsg() read np->rxopt anyway, change the test order so that np->rxpmtu does not need to sit in a hot cache line.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250916160951.541279-4-edumazet@google.com
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
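As background for reviewers, here is a minimal userspace sketch of the racy pattern the commit message describes. This is an illustration under stated assumptions, not code from this patch: the names rxpmtu and mtu_info are stand-ins for the kernel structures, __atomic_exchange_n is a GCC/Clang builtin standing in for the kernel's xchg(), and READ_ONCE() is modeled with the kernel's volatile-cast idiom for scalar types.

	/* sketch.c - model of the np->rxpmtu publish/read pattern
	 * (hypothetical names; build with: gcc -pthread sketch.c) */
	#include <pthread.h>
	#include <stdio.h>

	/* kernel-style READ_ONCE() for scalars: one volatile load */
	#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

	static void *rxpmtu;            /* stand-in for np->rxpmtu */
	static int mtu_info = 1280;     /* stand-in for the queued skb */

	static void *writer(void *arg)
	{
		(void)arg;
		/* ipv6_local_rxpmtu() publishes with xchg(&np->rxpmtu, skb);
		 * __atomic_exchange_n is the closest userspace equivalent */
		void *old = __atomic_exchange_n(&rxpmtu, &mtu_info,
						__ATOMIC_ACQ_REL);
		(void)old;              /* the kernel consumes the old skb */
		return NULL;
	}

	static void *reader(void *arg)
	{
		(void)arg;
		/* a plain "if (rxpmtu)" here races with the exchange above;
		 * READ_ONCE() forces a single, untorn load and stops the
		 * compiler from reloading the value across the check */
		void *skb = READ_ONCE(rxpmtu);
		if (skb)
			printf("PMTU event queued: mtu=%d\n", *(int *)skb);
		return NULL;
	}

	int main(void)
	{
		pthread_t w, r;
		pthread_create(&w, NULL, writer, NULL);
		pthread_create(&r, NULL, reader, NULL);
		pthread_join(w, NULL);
		pthread_join(r, NULL);
		return 0;
	}

Without the READ_ONCE(), the compiler is free to tear, repeat, or hoist the lockless load; the annotation is what makes the unsynchronized check well-defined.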
LLM Generated explanations, may be completely bogus:
YES

- `udpv6_recvmsg()` and `rawv6_recvmsg()` both dereference `np->rxpmtu` without synchronization, even though writers update it via `xchg(&np->rxpmtu, skb)` in `ipv6_local_rxpmtu()` (`net/ipv6/datagram.c:415`) and clear it in other contexts; that plain, unannotated read is a data race under the kernel memory model and is caught by KCSAN. Annotating the load with `READ_ONCE()` at `net/ipv6/udp.c:483` and `net/ipv6/raw.c:448` guarantees a single, untorn, non-reordered fetch, eliminating the data race.
- The branch order swap (testing `np->rxopt.bits.rxpmtu` first) keeps the hot-path behaviour identical, since both functions already consult `np->rxopt`, while avoiding an unnecessary cache-line touch of `np->rxpmtu` unless the option is enabled, so the risk of regression is negligible (see the sketch after this list).
- Older stable kernels share this lockless pattern and therefore the same latent race, while the fix is self-contained (no new APIs, no dependency churn). Delivering accurate IPV6_PATHMTU notifications to user space is observable behaviour, so backporting this minimal annotation is justified for correctness on stable branches.
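To make the branch-order point concrete, here is a hedged sketch; the struct layout below is invented for illustration and is not the real `struct ipv6_pinfo`:

	/* layout.c - hypothetical layout showing the branch-order rationale */
	#include <stdio.h>

	#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

	struct fake_np {
		struct {
			struct {
				unsigned int rxpmtu : 1; /* option bit on a line
							  * recvmsg reads anyway */
			} bits;
		} rxopt;
		char pad[128];		/* the real fields sit far apart */
		void *rxpmtu;		/* cold: set only on PMTU events */
	};

	static int want_rxpmtu(struct fake_np *np)
	{
		/* && short-circuits left to right: when the option bit is
		 * clear (the overwhelmingly common case), the second operand
		 * is never evaluated, so the cold cache line holding
		 * np->rxpmtu is never touched */
		return np->rxopt.bits.rxpmtu && READ_ONCE(np->rxpmtu) != NULL;
	}

	int main(void)
	{
		struct fake_np np = { 0 };	/* option off: cold line skipped */
		printf("%d\n", want_rxpmtu(&np));
		np.rxopt.bits.rxpmtu = 1;	/* option on, nothing queued yet */
		printf("%d\n", want_rxpmtu(&np));
		return 0;
	}

The old order (`np->rxpmtu && np->rxopt.bits.rxpmtu`) loaded the cold pointer first on every recvmsg call; the swap costs nothing because short-circuit evaluation order is part of C semantics, not a compiler optimization.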
Natural next step: once this is merged into stable, consider running an IPv6 UDP/RAW recvmsg regression test or a KCSAN sanity check to confirm the race no longer fires.
 net/ipv6/raw.c | 2 +-
 net/ipv6/udp.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 4c3f8245c40f1..eceef8af1355f 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -445,7 +445,7 @@ static int rawv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 	if (flags & MSG_ERRQUEUE)
 		return ipv6_recv_error(sk, msg, len, addr_len);
 
-	if (np->rxpmtu && np->rxopt.bits.rxpmtu)
+	if (np->rxopt.bits.rxpmtu && READ_ONCE(np->rxpmtu))
 		return ipv6_recv_rxpmtu(sk, msg, len, addr_len);
 
 	skb = skb_recv_datagram(sk, flags, &err);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 6a68f77da44b5..7f53fcc82a9ec 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -479,7 +479,7 @@ int udpv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 	if (flags & MSG_ERRQUEUE)
 		return ipv6_recv_error(sk, msg, len, addr_len);
 
-	if (np->rxpmtu && np->rxopt.bits.rxpmtu)
+	if (np->rxopt.bits.rxpmtu && READ_ONCE(np->rxpmtu))
 		return ipv6_recv_rxpmtu(sk, msg, len, addr_len);
 
 try_again: