Willem de Bruijn wrote:
Richard Gobert wrote:
Commits a602456 ("udp: Add GRO functions to UDP socket") and 57c67ff ("udp: additional GRO support") introduce incorrect usage of {ip,ipv6}_hdr in the complete phase of gro. The functions always return skb->network_header, which in the case of encapsulated packets at the gro complete phase, is always set to the innermost L3 of the packet. That means that calling {ip,ipv6}_hdr for skbs which completed the GRO receive phase (both in gro_list and *_gro_complete) when parsing an encapsulated packet's _outer_ L3/L4 may return an unexpected value.
This incorrect usage leads to a bug in GRO's UDP socket lookup. udp{4,6}_lib_lookup_skb functions use ip_hdr/ipv6_hdr respectively. These *_hdr functions return network_header which will point to the innermost L3, resulting in the wrong offset being used in __udp{4,6}_lib_lookup with encapsulated packets.
To fix this issue p_off param is used in *_gro_complete to pass off the offset of the previous layer.
What exactly does this mean?
This patch changes the definition of gro_complete to add a thoff alongside the existing "nhoff"..
> - int (*gro_complete)(struct sk_buff *skb, int nhoff); > + int (*gro_complete)(struct sk_buff *skb, int nhoff, > + int thoff);
.. but also fixes up implementations to interpret the existing argument as a thoff
> -INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff) > +INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int nhoff, > + int thoff) > { > - const struct iphdr *iph = ip_hdr(skb); > - struct tcphdr *th = tcp_hdr(skb); > + const struct iphdr *iph = (const struct iphdr *)(skb->data + nhoff); > + struct tcphdr *th = (struct tcphdr *)(skb->data + thoff);
But in some cases the new argument is not nhoff but p_off, e.g.,
> static int geneve_gro_complete(struct sock *sk, struct sk_buff *skb, > - int nhoff) > + int p_off, int nhoff)
Really, the argument is the start of the next header, each callback just casts to its expected header (ethhdr, tcphdr, etc.)
The only place where we need to pass an extra argument is in udp, because that needs a pointer to the network header right before the transport header pointed to by nhoff.
And only due to possible IPv4 options or IPv6 extension headers, we cannot just do
struct udphdr *iph = (struct iphdr *)(skb->data + nhoff - sizeof(*iph)); struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
I also do not immediately see an a way to avoid all the boilerplate of a new argument in every callback. Aside from a per_cpu var -- but that is excessive.
But it can just be left zero in all callsites, except for inet_gro_complete/ipv6_gro_complete, which pass in nhoff.
Actually, we can avoid the boilerplate changes that add an extra arg.
By changing the contract between network layer callbacks (inet_gro_complete/ipv6_gro_complete) and transport layer callbacks (tcp4_gro_complete et al).
If the first pass their own unmodified offset, nhoff:
err = INDIRECT_CALL_2(ops->callbacks.gro_complete, tcp4_gro_complete, udp4_gro_complete, - skb, nhoff + sizeof(*iph)); + skb, nhoff);
And the latter parse the network header for total_len/payload_len, to find their original offset.
It's also a bit of a hack. But a lot smaller patch, probably.