Hi Richard,
On Thu, May 9, 2024 at 9:09 PM Richard Gobert richardbgobert@gmail.com wrote:
{inet,ipv6}_gro_receive functions perform flush checks (ttl, flags, iph->id, ...) against all packets in a loop. These flush checks are used in all merging UDP and TCP flows.
These checks need to be done only once and only against the found p skb, since they only affect flush and not same_flow.
This patch leverages correct network header offsets from the cb for both outer and inner network headers - allowing these checks to be done only once, in tcp_gro_receive and udp_gro_receive_segment. As a result, NAPI_GRO_CB(p)->flush is not used at all. In addition, flush_id checks are more declarative and contained in inet_gro_flush, thus removing the need for flush_id in napi_gro_cb.
This results in less parsing code for non-loop flush tests for TCP and UDP flows.
To make sure results are not within noise range - I've made netfilter drop all TCP packets, and measured CPU performance in GRO (in this case GRO is responsible for about 50% of the CPU utilization).
perf top while replaying 64 parallel IP/TCP streams merging in GRO: (gro_receive_network_flush is compiled inline to tcp_gro_receive) net-next: 6.94% [kernel] [k] inet_gro_receive 3.02% [kernel] [k] tcp_gro_receive
patch applied: 4.27% [kernel] [k] tcp_gro_receive 4.22% [kernel] [k] inet_gro_receive
perf top while replaying 64 parallel IP/IP/TCP streams merging in GRO (same results for any encapsulation, in this case inet_gro_receive is top offender in net-next) net-next: 10.09% [kernel] [k] inet_gro_receive 2.08% [kernel] [k] tcp_gro_receive
patch applied: 6.97% [kernel] [k] inet_gro_receive 3.68% [kernel] [k] tcp_gro_receive
Signed-off-by: Richard Gobert richardbgobert@gmail.com
Thanks for your patch, which is now commit 4b0ebbca3e167976 ("net: gro: move L3 flush checks to tcp_gro_receive and udp_gro_receive_segment") in net-next/main (next-20240514).
noreply@ellerman.id.au reports build failures on m68k, e.g. http://kisskb.ellerman.id.au/kisskb/buildresult/15168903/
net/core/gro.c: In function ‘dev_gro_receive’: ././include/linux/compiler_types.h:460:38: error: call to ‘__compiletime_assert_654’ declared with attribute error: BUILD_BUG_ON failed: !IS_ALIGNED(offsetof(struct napi_gro_cb, zeroed), sizeof(u32))
--- a/include/net/gro.h +++ b/include/net/gro.h @@ -36,15 +36,15 @@ struct napi_gro_cb { /* This is non-zero if the packet cannot be merged with the new skb. */ u16 flush;
/* Save the IP ID here and check when we get to the transport layer */
u16 flush_id;
/* Number of segments aggregated. */ u16 count; /* Used in ipv6_gro_receive() and foo-over-udp and esp-in-udp */ u16 proto;
On most architectures, there is now a hole of 2 bytes here. However, on m68k the minimum alignment of __wsum (__u32) below is 2, hence there is no hole, breaking the assertion.
Probably you just want to make this explicit, by adding
u16 pad;
here.
/* used to support CHECKSUM_COMPLETE for tunneling protocols */
__wsum csum;
Gr{oetje,eeting}s,
Geert
-- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds