November 2024 - Linux-kselftest-mirror

[PATCH v2 0/2] kselftest/arm64: fp-stress signal delivery interval improvements

by Mark Brown

One of the things that fp-stress does to stress the floating point context switching is send signals to the test threads it spawns. Currently we do this once per second but as suggested by Mark Rutland if we increase this we can improve the chances of triggering any issues with context switching the signal handling code. Do a quick change to increase the rate of signal delivery, trying to avoid excessive impact on emulated platforms, and a further change to mitigate the impact that this creates during startup. Signed-off-by: Mark Brown <broonie(a)kernel.org> --- Changes in v2: - Minor clarifications in commit message and log output. - Link to v1: https://lore.kernel.org/r/20241029-arm64-fp-stress-interval-v1-0-db540abf6d… --- Mark Brown (2): kselftest/arm64: Increase frequency of signal delivery in fp-stress kselftest/arm64: Poll less often while waiting for fp-stress children tools/testing/selftests/arm64/fp/fp-stress.c | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-) --- base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354 change-id: 20241028-arm64-fp-stress-interval-8f5e62c06e12 Best regards, -- Mark Brown <broonie(a)kernel.org>

1 year, 1 month

3
4
0 0

[PATCH bpf-next 1/2] libbpf: Add missing per-arch include path

by Björn Töpel

From: Björn Töpel <bjorn(a)rivosinc.com> libbpf does not include the per-arch tools include path, e.g. tools/arch/riscv/include. Some architectures depend those files to build properly. Include tools/arch/$(SUBARCH)/include in the libbpf build. Fixes: 6d74d178fe6e ("tools: Add riscv barrier implementation") Signed-off-by: Björn Töpel <bjorn(a)rivosinc.com> --- tools/lib/bpf/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile index 1b22f0f37288..857a5f7b413d 100644 --- a/tools/lib/bpf/Makefile +++ b/tools/lib/bpf/Makefile @@ -61,7 +61,8 @@ ifndef VERBOSE endif INCLUDES = -I$(or $(OUTPUT),.) \ - -I$(srctree)/tools/include -I$(srctree)/tools/include/uapi + -I$(srctree)/tools/include -I$(srctree)/tools/include/uapi \ + -I$(srctree)/tools/arch/$(SRCARCH)/include export prefix libdir src obj base-commit: db5ca265e3334b48c4e3fa07eef79e8bc578c430 -- 2.43.0

1 year, 1 month

5
11
0 0

[PATCH net-next v9] net: ipv4: Cache pmtu for all packet paths if multipath enabled

by Vladimir Vdovin

Check number of paths by fib_info_num_path(), and update_or_create_fnhe() for every path. Problem is that pmtu is cached only for the oif that has received icmp message "need to frag", other oifs will still try to use "default" iface mtu. An example topology showing the problem: | host1 +---------+ | dummy0 | 10.179.20.18/32 mtu9000 +---------+ +-----------+----------------+ +---------+ +---------+ | ens17f0 | 10.179.2.141/31 | ens17f1 | 10.179.2.13/31 +---------+ +---------+ | (all here have mtu 9000) | +------+ +------+ | ro1 | 10.179.2.140/31 | ro2 | 10.179.2.12/31 +------+ +------+ | | ---------+------------+-------------------+------ | +-----+ | ro3 | 10.10.10.10 mtu1500 +-----+ | ======================================== some networks ======================================== | +-----+ | eth0| 10.10.30.30 mtu9000 +-----+ | host2 host1 have enabled multipath and sysctl net.ipv4.fib_multipath_hash_policy = 1: default proto static src 10.179.20.18 nexthop via 10.179.2.12 dev ens17f1 weight 1 nexthop via 10.179.2.140 dev ens17f0 weight 1 When host1 tries to do pmtud from 10.179.20.18/32 to host2, host1 receives at ens17f1 iface an icmp packet from ro3 that ro3 mtu=1500. And host1 caches it in nexthop exceptions cache. Problem is that it is cached only for the iface that has received icmp, and there is no way that ro3 will send icmp msg to host1 via another path. Host1 now have this routes to host2: ip r g 10.10.30.30 sport 30000 dport 443 10.10.30.30 via 10.179.2.12 dev ens17f1 src 10.179.20.18 uid 0 cache expires 521sec mtu 1500 ip r g 10.10.30.30 sport 30033 dport 443 10.10.30.30 via 10.179.2.140 dev ens17f0 src 10.179.20.18 uid 0 cache So when host1 tries again to reach host2 with mtu>1500, if packet flow is lucky enough to be hashed with oif=ens17f1 its ok, if oif=ens17f0 it blackholes and still gets icmp msgs from ro3 to ens17f1, until lucky day when ro3 will send it through another flow to ens17f0. Signed-off-by: Vladimir Vdovin <deliran(a)verdict.gg> --- V9: selftests in pmtu.sh: - remove useless return - fix mtu var override V8: selftests in pmtu.sh: - Change var names from "dummy" to "host" - Fix errors caused by incorrect iface arguments pass - Add src addr to setup_multipath_new - Change multipath* func order - Change route_get_dst_exception() && route_get_dst_pmtu_from_exception() and arguments pass where they are used as Ido suggested in https://lore.kernel.org/all/ZykH_fdcMBdFgXix@shredder/ V7: selftest in pmtu.sh: - add setup_multipath() with old and new nh tests - add global "dummy_v4" addr variables - add documentation - remove dummy netdev usage in mp nh test - remove useless sysctl opts in mp nh test V6: - make commit message cleaner V5: - make self test cleaner V4: - fix selftest, do route lookup before checking cached exceptions V3: - add selftest - fix compile error V2: - fix fib_info_num_path parameter pass --- net/ipv4/route.c | 13 ++++ tools/testing/selftests/net/pmtu.sh | 117 ++++++++++++++++++++++++---- 2 files changed, 113 insertions(+), 17 deletions(-) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 723ac9181558..652f603d29fe 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1027,6 +1027,19 @@ static void __ip_rt_update_pmtu(struct rtable *rt, struct flowi4 *fl4, u32 mtu) struct fib_nh_common *nhc; fib_select_path(net, &res, fl4, NULL); +#ifdef CONFIG_IP_ROUTE_MULTIPATH + if (fib_info_num_path(res.fi) > 1) { + int nhsel; + + for (nhsel = 0; nhsel < fib_info_num_path(res.fi); nhsel++) { + nhc = fib_info_nhc(res.fi, nhsel); + update_or_create_fnhe(nhc, fl4->daddr, 0, mtu, lock, + jiffies + net->ipv4.ip_rt_mtu_expires); + } + rcu_read_unlock(); + return; + } +#endif /* CONFIG_IP_ROUTE_MULTIPATH */ nhc = FIB_RES_NHC(res); update_or_create_fnhe(nhc, fl4->daddr, 0, mtu, lock, jiffies + net->ipv4.ip_rt_mtu_expires); diff --git a/tools/testing/selftests/net/pmtu.sh b/tools/testing/selftests/net/pmtu.sh index 569bce8b6383..611ae7862f46 100755 --- a/tools/testing/selftests/net/pmtu.sh +++ b/tools/testing/selftests/net/pmtu.sh @@ -197,6 +197,12 @@ # # - pmtu_ipv6_route_change # Same as above but with IPv6 +# +# - pmtu_ipv4_mp_exceptions +# Use the same topology as in pmtu_ipv4, but add routeable addresses +# on host A and B on lo reachable via both routers. Host A and B +# addresses have multipath routes to each other, b_r1 mtu = 1500. +# Check that PMTU exceptions are created for both paths. source lib.sh source net_helper.sh @@ -266,7 +272,8 @@ tests=" list_flush_ipv4_exception ipv4: list and flush cached exceptions 1 list_flush_ipv6_exception ipv6: list and flush cached exceptions 1 pmtu_ipv4_route_change ipv4: PMTU exception w/route replace 1 - pmtu_ipv6_route_change ipv6: PMTU exception w/route replace 1" + pmtu_ipv6_route_change ipv6: PMTU exception w/route replace 1 + pmtu_ipv4_mp_exceptions ipv4: PMTU multipath nh exceptions 1" # Addressing and routing for tests with routers: four network segments, with # index SEGMENT between 1 and 4, a common prefix (PREFIX4 or PREFIX6) and an @@ -343,6 +350,9 @@ tunnel6_a_addr="fd00:2::a" tunnel6_b_addr="fd00:2::b" tunnel6_mask="64" +host4_a_addr="192.168.99.99" +host4_b_addr="192.168.88.88" + dummy6_0_prefix="fc00:1000::" dummy6_1_prefix="fc00:1001::" dummy6_mask="64" @@ -984,6 +994,52 @@ setup_ovs_bridge() { run_cmd ip route add ${prefix6}:${b_r1}::1 via ${prefix6}:${a_r1}::2 } +setup_multipath_new() { + # Set up host A with multipath routes to host B host4_b_addr + run_cmd ${ns_a} ip addr add ${host4_a_addr} dev lo + run_cmd ${ns_a} ip nexthop add id 401 via ${prefix4}.${a_r1}.2 dev veth_A-R1 + run_cmd ${ns_a} ip nexthop add id 402 via ${prefix4}.${a_r2}.2 dev veth_A-R2 + run_cmd ${ns_a} ip nexthop add id 403 group 401/402 + run_cmd ${ns_a} ip route add ${host4_b_addr} src ${host4_a_addr} nhid 403 + + # Set up host B with multipath routes to host A host4_a_addr + run_cmd ${ns_b} ip addr add ${host4_b_addr} dev lo + run_cmd ${ns_b} ip nexthop add id 401 via ${prefix4}.${b_r1}.2 dev veth_B-R1 + run_cmd ${ns_b} ip nexthop add id 402 via ${prefix4}.${b_r2}.2 dev veth_B-R2 + run_cmd ${ns_b} ip nexthop add id 403 group 401/402 + run_cmd ${ns_b} ip route add ${host4_a_addr} src ${host4_b_addr} nhid 403 +} + +setup_multipath_old() { + # Set up host A with multipath routes to host B host4_b_addr + run_cmd ${ns_a} ip addr add ${host4_a_addr} dev lo + run_cmd ${ns_a} ip route add ${host4_b_addr} \ + src ${host4_a_addr} \ + nexthop via ${prefix4}.${a_r1}.2 weight 1 \ + nexthop via ${prefix4}.${a_r2}.2 weight 1 + + # Set up host B with multipath routes to host A host4_a_addr + run_cmd ${ns_b} ip addr add ${host4_b_addr} dev lo + run_cmd ${ns_b} ip route add ${host4_a_addr} \ + src ${host4_b_addr} \ + nexthop via ${prefix4}.${b_r1}.2 weight 1 \ + nexthop via ${prefix4}.${b_r2}.2 weight 1 +} + +setup_multipath() { + if [ "$USE_NH" = "yes" ]; then + setup_multipath_new + else + setup_multipath_old + fi + + # Set up routers with routes to dummies + run_cmd ${ns_r1} ip route add ${host4_a_addr} via ${prefix4}.${a_r1}.1 + run_cmd ${ns_r2} ip route add ${host4_a_addr} via ${prefix4}.${a_r2}.1 + run_cmd ${ns_r1} ip route add ${host4_b_addr} via ${prefix4}.${b_r1}.1 + run_cmd ${ns_r2} ip route add ${host4_b_addr} via ${prefix4}.${b_r2}.1 +} + setup() { [ "$(id -u)" -ne 0 ] && echo " need to run as root" && return $ksft_skip @@ -1076,23 +1132,15 @@ link_get_mtu() { } route_get_dst_exception() { - ns_cmd="${1}" - dst="${2}" - dsfield="${3}" - - if [ -z "${dsfield}" ]; then - dsfield=0 - fi + ns_cmd="${1}"; shift - ${ns_cmd} ip route get "${dst}" dsfield "${dsfield}" + ${ns_cmd} ip route get "$@" } route_get_dst_pmtu_from_exception() { - ns_cmd="${1}" - dst="${2}" - dsfield="${3}" + ns_cmd="${1}"; shift - mtu_parse "$(route_get_dst_exception "${ns_cmd}" "${dst}" "${dsfield}")" + mtu_parse "$(route_get_dst_exception "${ns_cmd}" "$@")" } check_pmtu_value() { @@ -1235,10 +1283,10 @@ test_pmtu_ipv4_dscp_icmp_exception() { run_cmd "${ns_a}" ping -q -M want -Q "${dsfield}" -c 1 -w 1 -s "${len}" "${dst2}" # Check that exceptions have been created with the correct PMTU - pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" "${policy_mark}")" + pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" dsfield "${policy_mark}")" check_pmtu_value "1400" "${pmtu_1}" "exceeding MTU" || return 1 - pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" "${policy_mark}")" + pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" dsfield "${policy_mark}")" check_pmtu_value "1500" "${pmtu_2}" "exceeding MTU" || return 1 } @@ -1285,9 +1333,9 @@ test_pmtu_ipv4_dscp_udp_exception() { UDP:"${dst2}":50000,tos="${dsfield}" # Check that exceptions have been created with the correct PMTU - pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" "${policy_mark}")" + pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" dsfield "${policy_mark}")" check_pmtu_value "1400" "${pmtu_1}" "exceeding MTU" || return 1 - pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" "${policy_mark}")" + pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" dsfield "${policy_mark}")" check_pmtu_value "1500" "${pmtu_2}" "exceeding MTU" || return 1 } @@ -2329,6 +2377,41 @@ test_pmtu_ipv6_route_change() { test_pmtu_ipvX_route_change 6 } +test_pmtu_ipv4_mp_exceptions() { + setup namespaces routing multipath || return $ksft_skip + + trace "${ns_a}" veth_A-R1 "${ns_r1}" veth_R1-A \ + "${ns_r1}" veth_R1-B "${ns_b}" veth_B-R1 \ + "${ns_a}" veth_A-R2 "${ns_r2}" veth_R2-A \ + "${ns_r2}" veth_R2-B "${ns_b}" veth_B-R2 + + # Set up initial MTU values + mtu "${ns_a}" veth_A-R1 2000 + mtu "${ns_r1}" veth_R1-A 2000 + mtu "${ns_r1}" veth_R1-B 1500 + mtu "${ns_b}" veth_B-R1 1500 + + mtu "${ns_a}" veth_A-R2 2000 + mtu "${ns_r2}" veth_R2-A 2000 + mtu "${ns_r2}" veth_R2-B 1500 + mtu "${ns_b}" veth_B-R2 1500 + + fail=0 + + # Ping and expect two nexthop exceptions for two routes in nh group + run_cmd ${ns_a} ping -q -M want -i 0.1 -c 1 -s 1800 "${host4_b_addr}" + + # Do route lookup before checking cached exceptions. + # The following commands are needed for dst entries to be cached + # in both paths exceptions and therefore dumped to user space + # Check that exceptions have been created with the correct PMTU + pmtu_a_R1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${host4_b_addr}" oif veth_A-R1)" + pmtu_a_R2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${host4_b_addr}" oif veth_A-R2)" + + check_pmtu_value "1500" "${pmtu_a_R1}" "exceeding MTU (veth_A-R2)" || return 1 + check_pmtu_value "1500" "${pmtu_a_R2}" "exceeding MTU (veth_A-R1)" || return 1 +} + usage() { echo echo "$0 [OPTIONS] [TEST]..." base-commit: 66600fac7a984dea4ae095411f644770b2561ede -- 2.43.5

1 year, 1 month

2
1
0 0

[PATCH net-next v8] net: ipv4: Cache pmtu for all packet paths if multipath enabled

by Vladimir Vdovin

Check number of paths by fib_info_num_path(), and update_or_create_fnhe() for every path. Problem is that pmtu is cached only for the oif that has received icmp message "need to frag", other oifs will still try to use "default" iface mtu. An example topology showing the problem: | host1 +---------+ | dummy0 | 10.179.20.18/32 mtu9000 +---------+ +-----------+----------------+ +---------+ +---------+ | ens17f0 | 10.179.2.141/31 | ens17f1 | 10.179.2.13/31 +---------+ +---------+ | (all here have mtu 9000) | +------+ +------+ | ro1 | 10.179.2.140/31 | ro2 | 10.179.2.12/31 +------+ +------+ | | ---------+------------+-------------------+------ | +-----+ | ro3 | 10.10.10.10 mtu1500 +-----+ | ======================================== some networks ======================================== | +-----+ | eth0| 10.10.30.30 mtu9000 +-----+ | host2 host1 have enabled multipath and sysctl net.ipv4.fib_multipath_hash_policy = 1: default proto static src 10.179.20.18 nexthop via 10.179.2.12 dev ens17f1 weight 1 nexthop via 10.179.2.140 dev ens17f0 weight 1 When host1 tries to do pmtud from 10.179.20.18/32 to host2, host1 receives at ens17f1 iface an icmp packet from ro3 that ro3 mtu=1500. And host1 caches it in nexthop exceptions cache. Problem is that it is cached only for the iface that has received icmp, and there is no way that ro3 will send icmp msg to host1 via another path. Host1 now have this routes to host2: ip r g 10.10.30.30 sport 30000 dport 443 10.10.30.30 via 10.179.2.12 dev ens17f1 src 10.179.20.18 uid 0 cache expires 521sec mtu 1500 ip r g 10.10.30.30 sport 30033 dport 443 10.10.30.30 via 10.179.2.140 dev ens17f0 src 10.179.20.18 uid 0 cache So when host1 tries again to reach host2 with mtu>1500, if packet flow is lucky enough to be hashed with oif=ens17f1 its ok, if oif=ens17f0 it blackholes and still gets icmp msgs from ro3 to ens17f1, until lucky day when ro3 will send it through another flow to ens17f0. Signed-off-by: Vladimir Vdovin <deliran(a)verdict.gg> --- V8: selftests in pmtu.sh: - Change var names from "dummy" to "host" - Fix errors caused by incorrect iface arguments pass - Add src addr to setup_multipath_new - Change multipath* func order - Change route_get_dst_exception() && route_get_dst_pmtu_from_exception() and arguments pass where they are used as Ido suggested in https://lore.kernel.org/all/ZykH_fdcMBdFgXix@shredder/ V7: selftest in pmtu.sh: - add setup_multipath() with old and new nh tests - add global "dummy_v4" addr variables - add documentation - remove dummy netdev usage in mp nh test - remove useless sysctl opts in mp nh test V6: - make commit message cleaner V5: - make self test cleaner V4: - fix selftest, do route lookup before checking cached exceptions V3: - add selftest - fix compile error V2: - fix fib_info_num_path parameter pass --- net/ipv4/route.c | 13 +++ tools/testing/selftests/net/pmtu.sh | 119 ++++++++++++++++++++++++---- 2 files changed, 115 insertions(+), 17 deletions(-) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 723ac9181558..652f603d29fe 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1027,6 +1027,19 @@ static void __ip_rt_update_pmtu(struct rtable *rt, struct flowi4 *fl4, u32 mtu) struct fib_nh_common *nhc; fib_select_path(net, &res, fl4, NULL); +#ifdef CONFIG_IP_ROUTE_MULTIPATH + if (fib_info_num_path(res.fi) > 1) { + int nhsel; + + for (nhsel = 0; nhsel < fib_info_num_path(res.fi); nhsel++) { + nhc = fib_info_nhc(res.fi, nhsel); + update_or_create_fnhe(nhc, fl4->daddr, 0, mtu, lock, + jiffies + net->ipv4.ip_rt_mtu_expires); + } + rcu_read_unlock(); + return; + } +#endif /* CONFIG_IP_ROUTE_MULTIPATH */ nhc = FIB_RES_NHC(res); update_or_create_fnhe(nhc, fl4->daddr, 0, mtu, lock, jiffies + net->ipv4.ip_rt_mtu_expires); diff --git a/tools/testing/selftests/net/pmtu.sh b/tools/testing/selftests/net/pmtu.sh index 569bce8b6383..2b79a16d3369 100755 --- a/tools/testing/selftests/net/pmtu.sh +++ b/tools/testing/selftests/net/pmtu.sh @@ -197,6 +197,12 @@ # # - pmtu_ipv6_route_change # Same as above but with IPv6 +# +# - pmtu_ipv4_mp_exceptions +# Use the same topology as in pmtu_ipv4, but add routeable addresses +# on host A and B on lo reachable via both routers. Host A and B +# addresses have multipath routes to each other, b_r1 mtu = 1500. +# Check that PMTU exceptions are created for both paths. source lib.sh source net_helper.sh @@ -266,7 +272,8 @@ tests=" list_flush_ipv4_exception ipv4: list and flush cached exceptions 1 list_flush_ipv6_exception ipv6: list and flush cached exceptions 1 pmtu_ipv4_route_change ipv4: PMTU exception w/route replace 1 - pmtu_ipv6_route_change ipv6: PMTU exception w/route replace 1" + pmtu_ipv6_route_change ipv6: PMTU exception w/route replace 1 + pmtu_ipv4_mp_exceptions ipv4: PMTU multipath nh exceptions 1" # Addressing and routing for tests with routers: four network segments, with # index SEGMENT between 1 and 4, a common prefix (PREFIX4 or PREFIX6) and an @@ -343,6 +350,9 @@ tunnel6_a_addr="fd00:2::a" tunnel6_b_addr="fd00:2::b" tunnel6_mask="64" +host4_a_addr="192.168.99.99" +host4_b_addr="192.168.88.88" + dummy6_0_prefix="fc00:1000::" dummy6_1_prefix="fc00:1001::" dummy6_mask="64" @@ -984,6 +994,52 @@ setup_ovs_bridge() { run_cmd ip route add ${prefix6}:${b_r1}::1 via ${prefix6}:${a_r1}::2 } +setup_multipath_new() { + # Set up host A with multipath routes to host B host4_b_addr + run_cmd ${ns_a} ip addr add ${host4_a_addr} dev lo + run_cmd ${ns_a} ip nexthop add id 401 via ${prefix4}.${a_r1}.2 dev veth_A-R1 + run_cmd ${ns_a} ip nexthop add id 402 via ${prefix4}.${a_r2}.2 dev veth_A-R2 + run_cmd ${ns_a} ip nexthop add id 403 group 401/402 + run_cmd ${ns_a} ip route add ${host4_b_addr} src ${host4_a_addr} nhid 403 + + # Set up host B with multipath routes to host A host4_a_addr + run_cmd ${ns_b} ip addr add ${host4_b_addr} dev lo + run_cmd ${ns_b} ip nexthop add id 401 via ${prefix4}.${b_r1}.2 dev veth_B-R1 + run_cmd ${ns_b} ip nexthop add id 402 via ${prefix4}.${b_r2}.2 dev veth_B-R2 + run_cmd ${ns_b} ip nexthop add id 403 group 401/402 + run_cmd ${ns_b} ip route add ${host4_a_addr} src ${host4_b_addr} nhid 403 +} + +setup_multipath_old() { + # Set up host A with multipath routes to host B host4_b_addr + run_cmd ${ns_a} ip addr add ${host4_a_addr} dev lo + run_cmd ${ns_a} ip route add ${host4_b_addr} \ + src ${host4_a_addr} \ + nexthop via ${prefix4}.${a_r1}.2 weight 1 \ + nexthop via ${prefix4}.${a_r2}.2 weight 1 + + # Set up host B with multipath routes to host A host4_a_addr + run_cmd ${ns_b} ip addr add ${host4_b_addr} dev lo + run_cmd ${ns_b} ip route add ${host4_a_addr} \ + src ${host4_b_addr} \ + nexthop via ${prefix4}.${b_r1}.2 weight 1 \ + nexthop via ${prefix4}.${b_r2}.2 weight 1 +} + +setup_multipath() { + if [ "$USE_NH" = "yes" ]; then + setup_multipath_new + else + setup_multipath_old + fi + + # Set up routers with routes to dummies + run_cmd ${ns_r1} ip route add ${host4_a_addr} via ${prefix4}.${a_r1}.1 + run_cmd ${ns_r2} ip route add ${host4_a_addr} via ${prefix4}.${a_r2}.1 + run_cmd ${ns_r1} ip route add ${host4_b_addr} via ${prefix4}.${b_r1}.1 + run_cmd ${ns_r2} ip route add ${host4_b_addr} via ${prefix4}.${b_r2}.1 +} + setup() { [ "$(id -u)" -ne 0 ] && echo " need to run as root" && return $ksft_skip @@ -1076,23 +1132,15 @@ link_get_mtu() { } route_get_dst_exception() { - ns_cmd="${1}" - dst="${2}" - dsfield="${3}" - - if [ -z "${dsfield}" ]; then - dsfield=0 - fi + ns_cmd="${1}"; shift - ${ns_cmd} ip route get "${dst}" dsfield "${dsfield}" + ${ns_cmd} ip route get "$@" } route_get_dst_pmtu_from_exception() { - ns_cmd="${1}" - dst="${2}" - dsfield="${3}" + ns_cmd="${1}"; shift - mtu_parse "$(route_get_dst_exception "${ns_cmd}" "${dst}" "${dsfield}")" + mtu_parse "$(route_get_dst_exception "${ns_cmd}" "$@")" } check_pmtu_value() { @@ -1235,10 +1283,10 @@ test_pmtu_ipv4_dscp_icmp_exception() { run_cmd "${ns_a}" ping -q -M want -Q "${dsfield}" -c 1 -w 1 -s "${len}" "${dst2}" # Check that exceptions have been created with the correct PMTU - pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" "${policy_mark}")" + pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" dsfield "${policy_mark}")" check_pmtu_value "1400" "${pmtu_1}" "exceeding MTU" || return 1 - pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" "${policy_mark}")" + pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" dsfield "${policy_mark}")" check_pmtu_value "1500" "${pmtu_2}" "exceeding MTU" || return 1 } @@ -1285,9 +1333,9 @@ test_pmtu_ipv4_dscp_udp_exception() { UDP:"${dst2}":50000,tos="${dsfield}" # Check that exceptions have been created with the correct PMTU - pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" "${policy_mark}")" + pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" dsfield "${policy_mark}")" check_pmtu_value "1400" "${pmtu_1}" "exceeding MTU" || return 1 - pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" "${policy_mark}")" + pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" dsfield "${policy_mark}")" check_pmtu_value "1500" "${pmtu_2}" "exceeding MTU" || return 1 } @@ -2329,6 +2377,43 @@ test_pmtu_ipv6_route_change() { test_pmtu_ipvX_route_change 6 } +test_pmtu_ipv4_mp_exceptions() { + setup namespaces routing multipath || return $ksft_skip + + trace "${ns_a}" veth_A-R1 "${ns_r1}" veth_R1-A \ + "${ns_r1}" veth_R1-B "${ns_b}" veth_B-R1 \ + "${ns_a}" veth_A-R2 "${ns_r2}" veth_R2-A \ + "${ns_r2}" veth_R2-B "${ns_b}" veth_B-R2 + + # Set up initial MTU values + mtu "${ns_a}" veth_A-R1 2000 + mtu "${ns_r1}" veth_R1-A 2000 + mtu "${ns_r1}" veth_R1-B 1500 + mtu "${ns_b}" veth_B-R1 1500 + + mtu "${ns_a}" veth_A-R2 2000 + mtu "${ns_r2}" veth_R2-A 2000 + mtu "${ns_r2}" veth_R2-B 1500 + mtu "${ns_b}" veth_B-R2 1500 + + fail=0 + + # Ping and expect two nexthop exceptions for two routes in nh group + run_cmd ${ns_a} ping -q -M want -i 0.1 -c 1 -s 1800 "${host4_b_addr}" + + # Do route lookup before checking cached exceptions. + # The following commands are needed for dst entries to be cached + # in both paths exceptions and therefore dumped to user space + # Check that exceptions have been created with the correct PMTU + pmtu="$(route_get_dst_pmtu_from_exception "${ns_a}" "${host4_b_addr}" oif veth_A-R1)" + pmtu="$(route_get_dst_pmtu_from_exception "${ns_a}" "${host4_b_addr}" oif veth_A-R2)" + + check_pmtu_value "1500" "${pmtu}" "exceeding MTU (veth_A-R2)" || return 1 + check_pmtu_value "1500" "${pmtu}" "exceeding MTU (veth_A-R1)" || return 1 + + return ${fail} +} + usage() { echo echo "$0 [OPTIONS] [TEST]..." base-commit: 66600fac7a984dea4ae095411f644770b2561ede -- 2.43.5

1 year, 1 month

3
3
0 0

[PATCH] selftests: vDSO: Do not rely on $ARCH for vdso_test_getrandom && vdso_test_chacha

by Christophe Leroy

$ARCH is not always enough to know whether getrandom vDSO is supported or not. For instance on x86 we want it for x86_64 but not i386. On the other hand, we already have detailed architecture selection in vdso_config.h, the only difference is that it cannot be used for Makefile. But most selftests are built regardless of whether a functionality is supported or not. The return value KSFT_SKIP is there for than: it tells the test is skipped because it is not supported. Make the implementation more flexible by setting a VDSO_GETRANDOM macro in vdso_config.h. That macro contains the path to the file that defines __arch_chacha20_blocks_nostack(). It avoids the symbolic link to vdso directory and will allow architectures to have several implementations of __arch_chacha20_blocks_nostack() if needed. Then restore the original behaviour which was dedicated to vdso_standalone_test_x86 and build getrandom and chacha tests all the time just like other vDSO selftests and return SKIP when the functionality to be tested is not implemented. This has the advantage of doing architecture specific selection at only one place. Also change vdso_test_getrandom to return SKIP instead of FAIL when vDSO function is not found, just like vdso_test_getcpu or vdso_test_gettimeofday. Signed-off-by: Christophe Leroy <christophe.leroy(a)csgroup.eu> --- Based on latest random tree (0dfed8092247) tools/arch/x86/vdso | 1 - tools/testing/selftests/vDSO/Makefile | 10 ++++------ tools/testing/selftests/vDSO/vdso_config.h | 3 +++ tools/testing/selftests/vDSO/vdso_test_chacha-asm.S | 7 +++++++ tools/testing/selftests/vDSO/vdso_test_chacha.c | 11 +++++++++++ tools/testing/selftests/vDSO/vdso_test_getrandom.c | 2 +- 6 files changed, 26 insertions(+), 8 deletions(-) delete mode 120000 tools/arch/x86/vdso create mode 100644 tools/testing/selftests/vDSO/vdso_test_chacha-asm.S diff --git a/tools/arch/x86/vdso b/tools/arch/x86/vdso deleted file mode 120000 index 7eb962fd3454..000000000000 --- a/tools/arch/x86/vdso +++ /dev/null @@ -1 +0,0 @@ -../../../arch/x86/entry/vdso/ \ No newline at end of file diff --git a/tools/testing/selftests/vDSO/Makefile b/tools/testing/selftests/vDSO/Makefile index 5ead6b1f0478..cfb7c281b22c 100644 --- a/tools/testing/selftests/vDSO/Makefile +++ b/tools/testing/selftests/vDSO/Makefile @@ -1,6 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 -ARCH ?= $(shell uname -m | sed -e s/i.86/x86/) -SRCARCH := $(subst x86_64,x86,$(ARCH)) +uname_M := $(shell uname -m 2>/dev/null || echo not) +ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/x86/ -e s/x86_64/x86/) TEST_GEN_PROGS := vdso_test_gettimeofday TEST_GEN_PROGS += vdso_test_getcpu @@ -10,10 +10,8 @@ ifeq ($(ARCH),$(filter $(ARCH),x86 x86_64)) TEST_GEN_PROGS += vdso_standalone_test_x86 endif TEST_GEN_PROGS += vdso_test_correctness -ifeq ($(ARCH),$(filter $(ARCH),x86_64)) TEST_GEN_PROGS += vdso_test_getrandom TEST_GEN_PROGS += vdso_test_chacha -endif CFLAGS := -std=gnu99 @@ -38,8 +36,8 @@ $(OUTPUT)/vdso_test_getrandom: CFLAGS += -isystem $(top_srcdir)/tools/include \ $(KHDR_INCLUDES) \ -isystem $(top_srcdir)/include/uapi -$(OUTPUT)/vdso_test_chacha: $(top_srcdir)/tools/arch/$(SRCARCH)/vdso/vgetrandom-chacha.S +$(OUTPUT)/vdso_test_chacha: vdso_test_chacha-asm.S $(OUTPUT)/vdso_test_chacha: CFLAGS += -idirafter $(top_srcdir)/tools/include \ - -idirafter $(top_srcdir)/arch/$(SRCARCH)/include \ + -idirafter $(top_srcdir)/arch/$(ARCH)/include \ -idirafter $(top_srcdir)/include \ -D__ASSEMBLY__ -Wa,--noexecstack diff --git a/tools/testing/selftests/vDSO/vdso_config.h b/tools/testing/selftests/vDSO/vdso_config.h index 740ce8c98d2e..693920471160 100644 --- a/tools/testing/selftests/vDSO/vdso_config.h +++ b/tools/testing/selftests/vDSO/vdso_config.h @@ -47,6 +47,7 @@ #elif defined(__x86_64__) #define VDSO_VERSION 0 #define VDSO_NAMES 1 +#define VDSO_GETRANDOM "../../../../arch/x86/entry/vdso/vgetrandom-chacha.S" #elif defined(__riscv__) || defined(__riscv) #define VDSO_VERSION 5 #define VDSO_NAMES 1 @@ -58,6 +59,7 @@ #define VDSO_NAMES 1 #endif +#ifndef __ASSEMBLY__ static const char *versions[7] = { "LINUX_2.6", "LINUX_2.6.15", @@ -88,5 +90,6 @@ static const char *names[2][7] = { "__vdso_getrandom", }, }; +#endif #endif /* __VDSO_CONFIG_H__ */ diff --git a/tools/testing/selftests/vDSO/vdso_test_chacha-asm.S b/tools/testing/selftests/vDSO/vdso_test_chacha-asm.S new file mode 100644 index 000000000000..8e704165f6f2 --- /dev/null +++ b/tools/testing/selftests/vDSO/vdso_test_chacha-asm.S @@ -0,0 +1,7 @@ +#include "vdso_config.h" + +#ifdef VDSO_GETRANDOM + +#include VDSO_GETRANDOM + +#endif diff --git a/tools/testing/selftests/vDSO/vdso_test_chacha.c b/tools/testing/selftests/vDSO/vdso_test_chacha.c index 3a5a08d857cf..9d18d49a82f8 100644 --- a/tools/testing/selftests/vDSO/vdso_test_chacha.c +++ b/tools/testing/selftests/vDSO/vdso_test_chacha.c @@ -8,6 +8,8 @@ #include <string.h> #include <stdint.h> #include <stdbool.h> +#include <linux/kconfig.h> +#include "vdso_config.h" #include "../kselftest.h" static uint32_t rol32(uint32_t word, unsigned int shift) @@ -57,6 +59,10 @@ typedef uint32_t u32; typedef uint64_t u64; #include <vdso/getrandom.h> +#ifdef VDSO_GETRANDOM +#define HAVE_VDSO_GETRANDOM 1 +#endif + int main(int argc, char *argv[]) { enum { TRIALS = 1000, BLOCKS = 128, BLOCK_SIZE = 64 }; @@ -68,6 +74,11 @@ int main(int argc, char *argv[]) ksft_print_header(); ksft_set_plan(1); + if (!__is_defined(HAVE_VDSO_GETRANDOM)) { + printf("__arch_chacha20_blocks_nostack() not implemented\n"); + return KSFT_SKIP; + } + for (unsigned int trial = 0; trial < TRIALS; ++trial) { if (getrandom(key, sizeof(key), 0) != sizeof(key)) { printf("getrandom() failed!\n"); diff --git a/tools/testing/selftests/vDSO/vdso_test_getrandom.c b/tools/testing/selftests/vDSO/vdso_test_getrandom.c index 8866b65a4605..47ee94b32617 100644 --- a/tools/testing/selftests/vDSO/vdso_test_getrandom.c +++ b/tools/testing/selftests/vDSO/vdso_test_getrandom.c @@ -115,7 +115,7 @@ static void vgetrandom_init(void) vgrnd.fn = (__typeof__(vgrnd.fn))vdso_sym(version, name); if (!vgrnd.fn) { printf("%s is missing!\n", name); - exit(KSFT_FAIL); + exit(KSFT_SKIP); } ret = VDSO_CALL(vgrnd.fn, 5, NULL, 0, 0, &vgrnd.params, ~0UL); if (ret == -ENOSYS) { -- 2.44.0

1 year, 1 month

5
15
0 0

[PATCH net-next] selftests: wireguards: use nft by default

by Hangbin Liu

Use nft by default if it's supported, as nft is the replacement for iptables, which is used by default in some releases. Additionally, iptables is dropped in some releases. Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com> --- CC nft developers to see if there are any easier configurations, as I'm not very familiar with nft commands. --- tools/testing/selftests/wireguard/netns.sh | 63 ++++++++++++++++++---- 1 file changed, 53 insertions(+), 10 deletions(-) diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh index 405ff262ca93..4e29c1a7003c 100755 --- a/tools/testing/selftests/wireguard/netns.sh +++ b/tools/testing/selftests/wireguard/netns.sh @@ -44,6 +44,7 @@ sleep() { read -t "$1" -N 1 || true; } waitiperf() { pretty "${1//*-}" "wait for iperf:${3:-5201} pid $2"; while [[ $(ss -N "$1" -tlpH "sport = ${3:-5201}") != *\"iperf3\",pid=$2,fd=* ]]; do sleep 0.1; done; } waitncatudp() { pretty "${1//*-}" "wait for udp:1111 pid $2"; while [[ $(ss -N "$1" -ulpH 'sport = 1111') != *\"ncat\",pid=$2,fd=* ]]; do sleep 0.1; done; } waitiface() { pretty "${1//*-}" "wait for $2 to come up"; ip netns exec "$1" bash -c "while [[ \$(< \"/sys/class/net/$2/operstate\") != up ]]; do read -t .1 -N 0 || true; done;"; } +use_nft() { nft --version &> /dev/null; } cleanup() { set +e @@ -196,13 +197,23 @@ ip1 link set wg0 mtu 1300 ip2 link set wg0 mtu 1300 n1 wg set wg0 peer "$pub2" endpoint 127.0.0.1:2 n2 wg set wg0 peer "$pub1" endpoint 127.0.0.1:1 -n0 iptables -A INPUT -m length --length 1360 -j DROP +if use_nft; then + n0 nft add table inet filter + n0 nft add chain inet filter INPUT { type filter hook input priority filter \; policy accept \; } + n0 nft add rule inet filter INPUT meta length 1360 counter drop +else + n0 iptables -A INPUT -m length --length 1360 -j DROP +fi n1 ip route add 192.168.241.2/32 dev wg0 mtu 1299 n2 ip route add 192.168.241.1/32 dev wg0 mtu 1299 n2 ping -c 1 -W 1 -s 1269 192.168.241.1 n2 ip route delete 192.168.241.1/32 dev wg0 mtu 1299 n1 ip route delete 192.168.241.2/32 dev wg0 mtu 1299 -n0 iptables -F INPUT +if use_nft; then + n0 nft delete table inet filter +else + n0 iptables -F INPUT +fi ip1 link set wg0 mtu $orig_mtu ip2 link set wg0 mtu $orig_mtu @@ -334,7 +345,13 @@ waitiface $netns2 veths n0 bash -c 'printf 1 > /proc/sys/net/ipv4/ip_forward' n0 bash -c 'printf 2 > /proc/sys/net/netfilter/nf_conntrack_udp_timeout' n0 bash -c 'printf 2 > /proc/sys/net/netfilter/nf_conntrack_udp_timeout_stream' -n0 iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -d 10.0.0.0/24 -j SNAT --to 10.0.0.1 +if use_nft; then + n0 nft add table inet nat + n0 nft add chain inet nat POSTROUTING { type nat hook postrouting priority srcnat\; policy accept \; } + n0 nft add rule inet nat POSTROUTING ip saddr 192.168.1.0/24 ip daddr 10.0.0.0/24 counter snat to 10.0.0.1 +else + n0 iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -d 10.0.0.0/24 -j SNAT --to 10.0.0.1 +fi n1 wg set wg0 peer "$pub2" endpoint 10.0.0.100:2 persistent-keepalive 1 n1 ping -W 1 -c 1 192.168.241.2 @@ -348,10 +365,20 @@ n1 wg set wg0 peer "$pub2" persistent-keepalive 0 # Test that sk_bound_dev_if works n1 ping -I wg0 -c 1 -W 1 192.168.241.2 # What about when the mark changes and the packet must be rerouted? -n1 iptables -t mangle -I OUTPUT -j MARK --set-xmark 1 +if use_nft; then + n1 nft add table inet mangle + n1 nft add chain inet mangle OUTPUT { type route hook output priority mangle\; policy accept \; } + n1 nft add rule inet mangle OUTPUT counter meta mark set 0x1 +else + n1 iptables -t mangle -I OUTPUT -j MARK --set-xmark 1 +fi n1 ping -c 1 -W 1 192.168.241.2 # First the boring case n1 ping -I wg0 -c 1 -W 1 192.168.241.2 # Then the sk_bound_dev_if case -n1 iptables -t mangle -D OUTPUT -j MARK --set-xmark 1 +if use_nft; then + n1 nft delete table inet mangle +else + n1 iptables -t mangle -D OUTPUT -j MARK --set-xmark 1 +fi # Test that onion routing works, even when it loops n1 wg set wg0 peer "$pub3" allowed-ips 192.168.242.2/32 endpoint 192.168.241.2:5 @@ -385,16 +412,32 @@ n1 ping -W 1 -c 100 -f 192.168.99.7 n1 ping -W 1 -c 100 -f abab::1111 # Have ns2 NAT into wg0 packets from ns0, but return an icmp error along the right route. -n2 iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -d 192.168.241.0/24 -j SNAT --to 192.168.241.2 -n0 iptables -t filter -A INPUT \! -s 10.0.0.0/24 -i vethrs -j DROP # Manual rpfilter just to be explicit. +if use_nft; then + n2 nft add table inet nat + n2 nft add chain inet nat POSTROUTING { type nat hook postrouting priority srcnat\; policy accept \; } + n2 nft add rule inet nat POSTROUTING ip saddr 10.0.0.0/24 ip daddr 192.168.241.0/24 counter snat to 192.168.241.2 + + n0 nft add table inet filter + n0 nft add chain inet filter INPUT { type filter hook input priority filter \; policy accept \; } + n0 nft add rule inet filter INPUT iifname "vethrs" ip saddr != 10.0.0.0/24 counter drop +else + n2 iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -d 192.168.241.0/24 -j SNAT --to 192.168.241.2 + n0 iptables -t filter -A INPUT \! -s 10.0.0.0/24 -i vethrs -j DROP # Manual rpfilter just to be explicit. +fi n2 bash -c 'printf 1 > /proc/sys/net/ipv4/ip_forward' ip0 -4 route add 192.168.241.1 via 10.0.0.100 n2 wg set wg0 peer "$pub1" remove [[ $(! n0 ping -W 1 -c 1 192.168.241.1 || false) == *"From 10.0.0.100 icmp_seq=1 Destination Host Unreachable"* ]] -n0 iptables -t nat -F -n0 iptables -t filter -F -n2 iptables -t nat -F +if use_nft; then + n0 nft delete table inet nat + n0 nft delete table inet filter + n2 nft delete table inet nat +else + n0 iptables -t nat -F + n0 iptables -t filter -F + n2 iptables -t nat -F +fi ip0 link del vethrc ip0 link del vethrs ip1 link del wg0 -- 2.46.0

1 year, 1 month

2
1
0 0

[PATCH v2 net-next 0/2] ipv6: fix hangup on device removal

by Paolo Abeni

This addresses the infamous unregister_netdevice splat in net selftests; the actual fix is carried by the first patch, while the 2nd one addresses a related problem in the relevant test that was patially hiding the problem. Targeting net-next as the issue is quite old and I feel a little lost in the fib info/nh jungle. --- v1 -> v2: - drop unintended whitespace change in patch 1/2 Paolo Abeni (2): ipv6: release nexthop on device removal selftests: net: really check for bg process completion net/ipv6/route.c | 6 +++--- tools/testing/selftests/net/pmtu.sh | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) -- 2.45.2

1 year, 1 month

4
7
0 0

[PATCH v4 0/2] kselftest: tmpfs: Add ksft macros and skip if no root

by Shivam Chaudhary

This version 4 patch series replace direct error handling methods with ksft macros, which provide better reporting.Currently, when the tmpfs test runs, it does not display any output if it passes,and if it fails (particularly when not run as root),it simply exits without any warning or message. This series of patch adds: 1. Add 'ksft_print_header()' and 'ksft_set_plan()' to structure test outputs more effectively. 2. Skip if not run as root. 3. Replace direct error handling with 'ksft_test_result_*', 'ksft_print_msg' and KSFT_SKIP macros for better reporting. v3->v4: - Start a patchset - Split patch into smaller pathes to make it easy to review. Patch1 Replace 'ksft_test_result_skip' with 'KSFT_SKIP' during root run check. Patch2 Replace 'ksft_test_result_fail' with 'KSFT_SKIP' where fail does not make sense, or failure could be due to not unsupported APIs with appropriate warnings. v3: https://lore.kernel.org/all/20241028185756.111832-1-cvam0000@gmail.com/ v2->v3: - Remove extra ksft_set_plan() - Remove function for unshare() - Fix the comment style v2: https://lore.kernel.org/all/20241026191621.2860376-1-cvam0000@gmail.com/ v1->v2: - Make the commit message more clear. v1: https://lore.kernel.org/all/20241024200228.1075840-1-cvam0000@gmail.com/T/#u thanks Shivam Shivam Chaudhary (2): selftests:tmpfs: Add Skip test if not run as root selftests:tmpfs: Add kselftest support to tmpfs .../selftests/tmpfs/bug-link-o-tmpfile.c | 79 +++++++++++++++---- 1 file changed, 62 insertions(+), 17 deletions(-) -- 2.45.2

1 year, 1 month

2
4
0 0

[PATCH net-next v7] net: ipv4: Cache pmtu for all packet paths if multipath enabled

by Vladimir Vdovin

Check number of paths by fib_info_num_path(), and update_or_create_fnhe() for every path. Problem is that pmtu is cached only for the oif that has received icmp message "need to frag", other oifs will still try to use "default" iface mtu. An example topology showing the problem: | host1 +---------+ | dummy0 | 10.179.20.18/32 mtu9000 +---------+ +-----------+----------------+ +---------+ +---------+ | ens17f0 | 10.179.2.141/31 | ens17f1 | 10.179.2.13/31 +---------+ +---------+ | (all here have mtu 9000) | +------+ +------+ | ro1 | 10.179.2.140/31 | ro2 | 10.179.2.12/31 +------+ +------+ | | ---------+------------+-------------------+------ | +-----+ | ro3 | 10.10.10.10 mtu1500 +-----+ | ======================================== some networks ======================================== | +-----+ | eth0| 10.10.30.30 mtu9000 +-----+ | host2 host1 have enabled multipath and sysctl net.ipv4.fib_multipath_hash_policy = 1: default proto static src 10.179.20.18 nexthop via 10.179.2.12 dev ens17f1 weight 1 nexthop via 10.179.2.140 dev ens17f0 weight 1 When host1 tries to do pmtud from 10.179.20.18/32 to host2, host1 receives at ens17f1 iface an icmp packet from ro3 that ro3 mtu=1500. And host1 caches it in nexthop exceptions cache. Problem is that it is cached only for the iface that has received icmp, and there is no way that ro3 will send icmp msg to host1 via another path. Host1 now have this routes to host2: ip r g 10.10.30.30 sport 30000 dport 443 10.10.30.30 via 10.179.2.12 dev ens17f1 src 10.179.20.18 uid 0 cache expires 521sec mtu 1500 ip r g 10.10.30.30 sport 30033 dport 443 10.10.30.30 via 10.179.2.140 dev ens17f0 src 10.179.20.18 uid 0 cache So when host1 tries again to reach host2 with mtu>1500, if packet flow is lucky enough to be hashed with oif=ens17f1 its ok, if oif=ens17f0 it blackholes and still gets icmp msgs from ro3 to ens17f1, until lucky day when ro3 will send it through another flow to ens17f0. Signed-off-by: Vladimir Vdovin <deliran(a)verdict.gg> --- V7: selftest in pmtu.sh: - add setup_multipath() with old and new nh tests - add global "dummy_v4" addr variables - add documentation - remove dummy netdev usage in mp nh test - remove useless sysctl opts in mp nh test V6: - make commit message cleaner V5: - make self test cleaner V4: - fix selftest, do route lookup before checking cached exceptions V3: - add selftest - fix compile error V2: - fix fib_info_num_path parameter pass --- net/ipv4/route.c | 13 ++++ tools/testing/selftests/net/pmtu.sh | 95 ++++++++++++++++++++++++++++- 2 files changed, 107 insertions(+), 1 deletion(-) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 723ac9181558..652f603d29fe 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1027,6 +1027,19 @@ static void __ip_rt_update_pmtu(struct rtable *rt, struct flowi4 *fl4, u32 mtu) struct fib_nh_common *nhc; fib_select_path(net, &res, fl4, NULL); +#ifdef CONFIG_IP_ROUTE_MULTIPATH + if (fib_info_num_path(res.fi) > 1) { + int nhsel; + + for (nhsel = 0; nhsel < fib_info_num_path(res.fi); nhsel++) { + nhc = fib_info_nhc(res.fi, nhsel); + update_or_create_fnhe(nhc, fl4->daddr, 0, mtu, lock, + jiffies + net->ipv4.ip_rt_mtu_expires); + } + rcu_read_unlock(); + return; + } +#endif /* CONFIG_IP_ROUTE_MULTIPATH */ nhc = FIB_RES_NHC(res); update_or_create_fnhe(nhc, fl4->daddr, 0, mtu, lock, jiffies + net->ipv4.ip_rt_mtu_expires); diff --git a/tools/testing/selftests/net/pmtu.sh b/tools/testing/selftests/net/pmtu.sh index 569bce8b6383..f24c84184c61 100755 --- a/tools/testing/selftests/net/pmtu.sh +++ b/tools/testing/selftests/net/pmtu.sh @@ -197,6 +197,12 @@ # # - pmtu_ipv6_route_change # Same as above but with IPv6 +# +# - pmtu_ipv4_mp_exceptions +# Use the same topology as in pmtu_ipv4, but add routeable "dummy" +# addresses on host A and B on lo0 reachable via both routers. +# Host A and B "dummy" addresses have multipath routes to each other. +# Check that PMTU exceptions are created for both paths. source lib.sh source net_helper.sh @@ -266,7 +272,8 @@ tests=" list_flush_ipv4_exception ipv4: list and flush cached exceptions 1 list_flush_ipv6_exception ipv6: list and flush cached exceptions 1 pmtu_ipv4_route_change ipv4: PMTU exception w/route replace 1 - pmtu_ipv6_route_change ipv6: PMTU exception w/route replace 1" + pmtu_ipv6_route_change ipv6: PMTU exception w/route replace 1 + pmtu_ipv4_mp_exceptions ipv4: PMTU multipath nh exceptions 1" # Addressing and routing for tests with routers: four network segments, with # index SEGMENT between 1 and 4, a common prefix (PREFIX4 or PREFIX6) and an @@ -343,6 +350,9 @@ tunnel6_a_addr="fd00:2::a" tunnel6_b_addr="fd00:2::b" tunnel6_mask="64" +dummy4_a_addr="192.168.99.99" +dummy4_b_addr="192.168.88.88" + dummy6_0_prefix="fc00:1000::" dummy6_1_prefix="fc00:1001::" dummy6_mask="64" @@ -984,6 +994,50 @@ setup_ovs_bridge() { run_cmd ip route add ${prefix6}:${b_r1}::1 via ${prefix6}:${a_r1}::2 } +setup_multipath() { + if [ "$USE_NH" = "yes" ]; then + setup_multipath_new + else + setup_multipath_old + fi + + # Set up routers with routes to dummies + run_cmd ${ns_r1} ip route add ${dummy4_a_addr} via ${prefix4}.${a_r1}.1 + run_cmd ${ns_r2} ip route add ${dummy4_a_addr} via ${prefix4}.${a_r2}.1 + run_cmd ${ns_r1} ip route add ${dummy4_b_addr} via ${prefix4}.${b_r1}.1 + run_cmd ${ns_r2} ip route add ${dummy4_b_addr} via ${prefix4}.${b_r2}.1 +} + +setup_multipath_new() { + # Set up host A with multipath routes to host B dummy4_b_addr + run_cmd ${ns_a} ip addr add ${dummy4_a_addr} dev lo0 + run_cmd ${ns_a} ip nexthop add id 201 via ${prefix4}.${a_r1}.2 dev veth_A-R1 + run_cmd ${ns_a} ip nexthop add id 202 via ${prefix4}.${a_r2}.2 dev veth_A-R2 + run_cmd ${ns_a} ip nexthop add id 203 group 201/202 + run_cmd ${ns_a} ip route add ${dummy4_b_addr} nhid 203 + + # Set up host B with multipath routes to host A dummy4_a_addr + run_cmd ${ns_b} ip addr add ${dummy4_b_addr} dev lo0 + run_cmd ${ns_b} ip nexthop add id 201 via ${prefix4}.${b_r1}.2 dev veth_A-R1 + run_cmd ${ns_b} ip nexthop add id 202 via ${prefix4}.${b_r2}.2 dev veth_A-R2 + run_cmd ${ns_b} ip nexthop add id 203 group 201/202 + run_cmd ${ns_b} ip route add ${dummy4_a_addr} nhid 203 +} + +setup_multipath_old() { + # Set up host A with multipath routes to host B dummy4_b_addr + run_cmd ${ns_a} ip addr add ${dummy4_a_addr} dev lo0 + run_cmd ${ns_a} ip route add ${dummy4_b_addr} \ + nexthop via ${prefix4}.${a_r1}.2 weight 1 \ + nexthop via ${prefix4}.${a_r2}.2 weight 1 + + # Set up host B with multipath routes to host A dummy4_a_addr + run_cmd ${ns_b} ip addr add ${dummy4_b_addr} dev lo0 + run_cmd ${ns_a} ip route add ${dummy4_a_addr} \ + nexthop via ${prefix4}.${a_b1}.2 weight 1 \ + nexthop via ${prefix4}.${a_b2}.2 weight 1 +} + setup() { [ "$(id -u)" -ne 0 ] && echo " need to run as root" && return $ksft_skip @@ -2329,6 +2383,45 @@ test_pmtu_ipv6_route_change() { test_pmtu_ipvX_route_change 6 } +test_pmtu_ipv4_mp_exceptions() { + setup namespaces routing multipath || return $ksft_skip + + trace "${ns_a}" veth_A-R1 "${ns_r1}" veth_R1-A \ + "${ns_r1}" veth_R1-B "${ns_b}" veth_B-R1 \ + "${ns_a}" veth_A-R2 "${ns_r2}" veth_R2-A \ + "${ns_r2}" veth_R2-B "${ns_b}" veth_B-R2 + + # Set up initial MTU values + mtu "${ns_a}" veth_A-R1 2000 + mtu "${ns_r1}" veth_R1-A 2000 + mtu "${ns_r1}" veth_R1-B 1400 + mtu "${ns_b}" veth_B-R1 1400 + + mtu "${ns_a}" veth_A-R2 2000 + mtu "${ns_r2}" veth_R2-A 2000 + mtu "${ns_r2}" veth_R2-B 1500 + mtu "${ns_b}" veth_B-R2 1500 + + fail=0 + + # Ping and expect two nexthop exceptions for two routes in nh group + run_cmd ${ns_a} ping -q -M want -i 0.1 -c 1 -s 1800 "${dummy4_b_addr}" + + # Do route lookup before checking cached exceptions. + # The following commands are needed for dst entries to be cached + # in both paths exceptions and therefore dumped to user space + run_cmd ${ns_a} ip route get ${dummy4_b_addr} oif veth_A-R1 + run_cmd ${ns_a} ip route get ${dummy4_b_addr} oif veth_A-R2 + + # Check cached exceptions + if [ "$(${ns_a} ip -oneline route list cache | grep mtu | wc -l)" -ne 2 ]; then + err " there are not enough cached exceptions" + fail=1 + fi + + return ${fail} +} + usage() { echo echo "$0 [OPTIONS] [TEST]..." base-commit: 66600fac7a984dea4ae095411f644770b2561ede -- 2.43.5

1 year, 1 month

3
3
0 0

[PATCH 0/6] kselftest/arm64: Test floating point signal context restore in fp-stress

by Mark Brown

Currently we test signal delivery to the programs run by fp-stress but our signal handlers simply count the number of signals seen and don't do anything with the floating point state. The original fpsimd-test and sve-test programs had signal handlers called irritators which modify the live register state, verifying that we restore the signal context on return, but a combination of misleading comments and code resulted in them never being used and the equivalent handlers in the other tests being stubbed or omitted. Clarify the code, implement effective irritator handlers for the test programs that can have them and then switch the signals generated by the fp-stress program over to use the irritators, ensuring that we validate that we restore the saved signal context properly. Signed-off-by: Mark Brown <broonie(a)kernel.org> --- Mark Brown (6): kselftest/arm64: Correct misleading comments on fp-stress irritators kselftest/arm64: Remove unused ADRs from irritator handlers kselftest/arm64: Corrupt P15 in the irritator when testing SSVE kselftest/arm64: Implement irritators for ZA and ZT kselftest/arm64: Provide a SIGUSR1 handler in the kernel mode FP stress test kselftest/arm64: Test signal handler state modification in fp-stress tools/testing/selftests/arm64/fp/fp-stress.c | 2 +- tools/testing/selftests/arm64/fp/fpsimd-test.S | 4 +--- tools/testing/selftests/arm64/fp/kernel-test.c | 4 ++++ tools/testing/selftests/arm64/fp/sve-test.S | 6 ++---- tools/testing/selftests/arm64/fp/za-test.S | 13 ++++--------- tools/testing/selftests/arm64/fp/zt-test.S | 13 ++++--------- 6 files changed, 16 insertions(+), 26 deletions(-) --- base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354 change-id: 20241023-arm64-fp-stress-irritator-5415fe92adef Best regards, -- Mark Brown <broonie(a)kernel.org>

1 year, 1 month

2
15
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-kselftest-mirror November 2024