The active-backup bonding mode supports XFRM ESP offload. However, when a bond is added using a command like `ip link add bond0 type bond mode 1 miimon 100`, the `ethtool -k` command shows that the XFRM ESP offload is disabled. This occurs because, in bond_newlink(), we change the bond link first and register the bond device later. So the XFRM feature update in bond_option_mode_set() is not called as the bond device is not yet registered, leading to the offload feature not being set successfully.
To resolve this issue, reorder the code in bond_newlink() so that the bond device is registered before the bond link parameters are changed. This allows the XFRM ESP offload feature to be enabled correctly.
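One way to observe the symptom described above is to look at the `esp-hw-offload` line in the `ethtool -k` output for the bond. The helper below is only a sketch and not part of the patch: `esp_offload_state` is a hypothetical name, and it parses text on stdin so it can also be tried on captured output.

```shell
#!/bin/bash
# Hypothetical helper (not part of the patch): print the state ("on" or
# "off") of the esp-hw-offload feature found in `ethtool -k` output read
# from stdin. Prints nothing and returns 1 if the feature line is absent.
esp_offload_state()
{
	awk '
		$1 == "esp-hw-offload:" { print $2; found = 1; exit }
		END { exit !found }
	'
}

# Possible usage on a live system (needs the bond to exist):
#   ethtool -k bond0 | esp_offload_state
```

With the ordering problem described above, this would be expected to report `off` for a freshly created active-backup bond.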
Fixes: 007ab5345545 ("bonding: fix feature flag setting at init time")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
v3: rebase to latest net, no code update
v2: rebase to latest net, no code update
---
 drivers/net/bonding/bond_main.c    |  2 +-
 drivers/net/bonding/bond_netlink.c | 16 +++++++++-------
 include/net/bonding.h              |  1 +
 3 files changed, 11 insertions(+), 8 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 57be04f6cb11..f4f0feddd9fa 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4411,7 +4411,7 @@ void bond_work_init_all(struct bonding *bond)
 	INIT_DELAYED_WORK(&bond->slave_arr_work, bond_slave_arr_handler);
 }
 
-static void bond_work_cancel_all(struct bonding *bond)
+void bond_work_cancel_all(struct bonding *bond)
 {
 	cancel_delayed_work_sync(&bond->mii_work);
 	cancel_delayed_work_sync(&bond->arp_work);
diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
index 57fff2421f1b..7a9d73ec8e91 100644
--- a/drivers/net/bonding/bond_netlink.c
+++ b/drivers/net/bonding/bond_netlink.c
@@ -579,20 +579,22 @@ static int bond_newlink(struct net_device *bond_dev,
 			struct rtnl_newlink_params *params,
 			struct netlink_ext_ack *extack)
 {
+	struct bonding *bond = netdev_priv(bond_dev);
 	struct nlattr **data = params->data;
 	struct nlattr **tb = params->tb;
 	int err;
 
-	err = bond_changelink(bond_dev, tb, data, extack);
-	if (err < 0)
+	err = register_netdevice(bond_dev);
+	if (err)
 		return err;
 
-	err = register_netdevice(bond_dev);
-	if (!err) {
-		struct bonding *bond = netdev_priv(bond_dev);
+	netif_carrier_off(bond_dev);
+	bond_work_init_all(bond);
 
-		netif_carrier_off(bond_dev);
-		bond_work_init_all(bond);
+	err = bond_changelink(bond_dev, tb, data, extack);
+	if (err) {
+		bond_work_cancel_all(bond);
+		unregister_netdevice(bond_dev);
 	}
 
 	return err;
diff --git a/include/net/bonding.h b/include/net/bonding.h
index e06f0d63b2c1..bd56ad976cfb 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -711,6 +711,7 @@ struct bond_vlan_tag *bond_verify_device_path(struct net_device *start_dev,
 int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave);
 void bond_slave_arr_work_rearm(struct bonding *bond, unsigned long delay);
 void bond_work_init_all(struct bonding *bond);
+void bond_work_cancel_all(struct bonding *bond);
 
 #ifdef CONFIG_PROC_FS
 void bond_create_proc_entry(struct bonding *bond);
This introduces a test for IPSec offload over bonding, utilizing netdevsim for the testing process, as veth interfaces do not support IPSec offload. The test will ensure that the IPSec offload functionality remains operational even after a failover event occurs in the bonding configuration.
Here is the test result:
TEST: bond_ipsec_offload (active_slave eth0)                        [ OK ]
TEST: bond_ipsec_offload (active_slave eth1)                        [ OK ]
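The pass/fail decision behind each result line comes from grepping the netdevsim ipsec debugfs file of the active port. As a rough sketch, the same two greps can be factored into a function and run against a saved copy of that file. `check_sa_dump` and its argument order are hypothetical names introduced here for illustration; the grep patterns are the ones the test script uses.

```shell
#!/bin/bash
# Sketch of the counter check done in test_offload(), factored into a
# function so it can be run against a saved copy of the debugfs file.
# check_sa_dump and its arguments are hypothetical, not part of the patch.
#   $1: file holding the contents of .../ports/N/ipsec
#   $2: expected SA count
#   $3: expected tx count
#   $4: expected tx destination address
check_sa_dump()
{
	local dump=$1 sa_count=$2 tx_count=$3 dst=$4

	# Both patterns must be present, as in the test script.
	# -F keeps the dots in the address from acting as wildcards.
	grep -qF "SA count=${sa_count} tx=${tx_count}" "$dump" &&
		grep -qF "tx ipaddr=${dst}" "$dump"
}

# Possible usage, mirroring the values used by the test:
#   check_sa_dump "$ipsec0" 2 3 "$dstip"
```

Like the original greps, this is a substring match, so a tx count of 30 would also satisfy an expected value of 3.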
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
v3: fix shellcheck errors
v2: rebase to latest net, no code update
---
 .../selftests/drivers/net/bonding/Makefile    |   3 +-
 .../drivers/net/bonding/bond_ipsec_offload.sh | 156 ++++++++++++++++++
 .../selftests/drivers/net/bonding/config      |   4 +
 3 files changed, 162 insertions(+), 1 deletion(-)
 create mode 100755 tools/testing/selftests/drivers/net/bonding/bond_ipsec_offload.sh
diff --git a/tools/testing/selftests/drivers/net/bonding/Makefile b/tools/testing/selftests/drivers/net/bonding/Makefile
index 44b98f17f8ff..c13ef40e7db1 100644
--- a/tools/testing/selftests/drivers/net/bonding/Makefile
+++ b/tools/testing/selftests/drivers/net/bonding/Makefile
@@ -11,7 +11,8 @@ TEST_PROGS := \
 	bond_options.sh \
 	bond-eth-type-change.sh \
 	bond_macvlan_ipvlan.sh \
-	bond_passive_lacp.sh
+	bond_passive_lacp.sh \
+	bond_ipsec_offload.sh
 
 TEST_FILES := \
 	lag_lib.sh \
diff --git a/tools/testing/selftests/drivers/net/bonding/bond_ipsec_offload.sh b/tools/testing/selftests/drivers/net/bonding/bond_ipsec_offload.sh
new file mode 100755
index 000000000000..f09e100232c7
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/bonding/bond_ipsec_offload.sh
@@ -0,0 +1,156 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+# IPsec over bonding offload test:
+#
+#  +----------------+
+#  |     bond0      |
+#  |       |        |
+#  |  eth0    eth1  |
+#  +---+-------+----+
+#
+# We use netdevsim instead of physical interfaces
+#-------------------------------------------------------------------
+# Example commands
+#   ip x s add proto esp src 192.0.2.1 dst 192.0.2.2 \
+#            spi 0x07 mode transport reqid 0x07 replay-window 32 \
+#            aead 'rfc4106(gcm(aes))' 1234567890123456dcba 128 \
+#            sel src 192.0.2.1/24 dst 192.0.2.2/24
+#            offload dev bond0 dir out
+#   ip x p add dir out src 192.0.2.1/24 dst 192.0.2.2/24 \
+#            tmpl proto esp src 192.0.2.1 dst 192.0.2.2 \
+#            spi 0x07 mode transport reqid 0x07
+#
+#-------------------------------------------------------------------
+
+lib_dir=$(dirname "$0")
+# shellcheck disable=SC1091
+source "$lib_dir"/../../../net/lib.sh
+srcip=192.0.2.1
+dstip=192.0.2.2
+ipsec0=/sys/kernel/debug/netdevsim/netdevsim0/ports/0/ipsec
+ipsec1=/sys/kernel/debug/netdevsim/netdevsim0/ports/1/ipsec
+active_slave=""
+
+# shellcheck disable=SC2317
+active_slave_changed()
+{
+	local old_active_slave=$1
+	local new_active_slave
+
+	# shellcheck disable=SC2154
+	new_active_slave=$(ip -n "${ns}" -d -j link show bond0 | \
+		jq -r ".[].linkinfo.info_data.active_slave")
+	[ "$new_active_slave" != "$old_active_slave" ] && [ "$new_active_slave" != "null" ]
+}
+
+test_offload()
+{
+	# use ping to exercise the Tx path
+	ip netns exec "$ns" ping -I bond0 -c 3 -W 1 -i 0 "$dstip" >/dev/null
+
+	active_slave=$(ip -n "${ns}" -d -j link show bond0 | \
+		jq -r ".[].linkinfo.info_data.active_slave")
+
+	if [ "$active_slave" = "$nic0" ]; then
+		sysfs=$ipsec0
+	elif [ "$active_slave" = "$nic1" ]; then
+		sysfs=$ipsec1
+	else
+		check_err 1 "bond_ipsec_offload invalid active_slave $active_slave"
+	fi
+
+	# The tx/rx order in sysfs may changed after failover
+	grep -q "SA count=2 tx=3" "$sysfs" && grep -q "tx ipaddr=$dstip" "$sysfs"
+	check_err $? "incorrect tx count with link ${active_slave}"
+
+	log_test bond_ipsec_offload "active_slave ${active_slave}"
+}
+
+setup_env()
+{
+	if ! mount | grep -q debugfs; then
+		mount -t debugfs none /sys/kernel/debug/ &> /dev/null
+		defer umount /sys/kernel/debug/
+
+	fi
+
+	# setup netdevsim since dummy/veth dev doesn't have offload support
+	if [ ! -w /sys/bus/netdevsim/new_device ] ; then
+		if ! modprobe -q netdevsim; then
+			echo "SKIP: can't load netdevsim for ipsec offload"
+			# shellcheck disable=SC2154
+			exit "$ksft_skip"
+		fi
+		defer modprobe -r netdevsim
+	fi
+
+	setup_ns ns
+	defer cleanup_ns "$ns"
+}
+
+setup_bond()
+{
+	ip -n "$ns" link add bond0 type bond mode active-backup miimon 100
+	ip -n "$ns" addr add "$srcip/24" dev bond0
+	ip -n "$ns" link set bond0 up
+
+	echo "0 2" | ip netns exec "$ns" tee /sys/bus/netdevsim/new_device >/dev/null
+	nic0=$(ip netns exec "$ns" ls /sys/bus/netdevsim/devices/netdevsim0/net | head -n 1)
+	nic1=$(ip netns exec "$ns" ls /sys/bus/netdevsim/devices/netdevsim0/net | tail -n 1)
+	ip -n "$ns" link set "$nic0" master bond0
+	ip -n "$ns" link set "$nic1" master bond0
+
+	# we didn't create a peer, make sure we can Tx by adding a permanent
+	# neighbour this need to be added after enslave
+	ip -n "$ns" neigh add "$dstip" dev bond0 lladdr 00:11:22:33:44:55
+
+	# create offloaded SAs, both in and out
+	ip -n "$ns" x p add dir out src "$srcip/24" dst "$dstip/24" \
+		tmpl proto esp src "$srcip" dst "$dstip" spi 9 \
+		mode transport reqid 42
+
+	ip -n "$ns" x p add dir in src "$dstip/24" dst "$srcip/24" \
+		tmpl proto esp src "$dstip" dst "$srcip" spi 9 \
+		mode transport reqid 42
+
+	ip -n "$ns" x s add proto esp src "$srcip" dst "$dstip" spi 9 \
+		mode transport reqid 42 aead "rfc4106(gcm(aes))" \
+		0x3132333435363738393031323334353664636261 128 \
+		sel src "$srcip/24" dst "$dstip/24" \
+		offload dev bond0 dir out
+
+	ip -n "$ns" x s add proto esp src "$dstip" dst "$srcip" spi 9 \
+		mode transport reqid 42 aead "rfc4106(gcm(aes))" \
+		0x3132333435363738393031323334353664636261 128 \
+		sel src "$dstip/24" dst "$srcip/24" \
+		offload dev bond0 dir in
+
+	# does offload show up in ip output
+	lines=$(ip -n "$ns" x s list | grep -c "crypto offload parameters: dev bond0 dir")
+	if [ "$lines" -ne 2 ] ; then
+		check_err 1 "bond_ipsec_offload SA offload missing from list output"
+	fi
+}
+
+trap defer_scopes_cleanup EXIT
+setup_env
+setup_bond
+
+# start Offload testing
+test_offload
+
+# do failover and re-test
+ip -n "$ns" link set "$active_slave" down
+slowwait 5 active_slave_changed "$active_slave"
+test_offload
+
+# make sure offload get removed from driver
+ip -n "$ns" x s flush
+ip -n "$ns" x p flush
+line0=$(grep -c "SA count=0" "$ipsec0")
+line1=$(grep -c "SA count=0" "$ipsec1")
+[ "$line0" -ne 1 ] || [ "$line1" -ne 1 ]
+check_fail $? "bond_ipsec_offload SA not removed from driver"
+
+exit "$EXIT_STATUS"
diff --git a/tools/testing/selftests/drivers/net/bonding/config b/tools/testing/selftests/drivers/net/bonding/config
index 832fa1caeb66..e5b7a8db4dfa 100644
--- a/tools/testing/selftests/drivers/net/bonding/config
+++ b/tools/testing/selftests/drivers/net/bonding/config
@@ -11,3 +11,7 @@ CONFIG_NET_SCH_INGRESS=y
 CONFIG_NLMON=y
 CONFIG_VETH=y
 CONFIG_VLAN_8021Q=m
+CONFIG_INET_ESP=y
+CONFIG_INET_ESP_OFFLOAD=y
+CONFIG_XFRM_USER=m
+CONFIG_NETDEVSIM=m
Hello,
On 25/09/2025 04:33, Hangbin Liu wrote:
> This introduces a test for IPSec offload over bonding, utilizing netdevsim for the testing process, as veth interfaces do not support IPSec offload. The test will ensure that the IPSec offload functionality remains operational even after a failover event occurs in the bonding configuration.
>
> Here is the test result:
>
> TEST: bond_ipsec_offload (active_slave eth0)                        [ OK ]
> TEST: bond_ipsec_offload (active_slave eth1)                        [ OK ]
>
> Reviewed-by: Petr Machata <petrm@nvidia.com>
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> ---
> v3: fix shellcheck errors
> v2: rebase to latest net, no code update
> ---
>  .../selftests/drivers/net/bonding/Makefile    |   3 +-
>  .../drivers/net/bonding/bond_ipsec_offload.sh | 156 ++++++++++++++++++
>  .../selftests/drivers/net/bonding/config      |   4 +
>  3 files changed, 162 insertions(+), 1 deletion(-)
>  create mode 100755 tools/testing/selftests/drivers/net/bonding/bond_ipsec_offload.sh
>
> diff --git a/tools/testing/selftests/drivers/net/bonding/Makefile b/tools/testing/selftests/drivers/net/bonding/Makefile
> index 44b98f17f8ff..c13ef40e7db1 100644
> --- a/tools/testing/selftests/drivers/net/bonding/Makefile
> +++ b/tools/testing/selftests/drivers/net/bonding/Makefile
> @@ -11,7 +11,8 @@ TEST_PROGS := \
>  	bond_options.sh \
>  	bond-eth-type-change.sh \
>  	bond_macvlan_ipvlan.sh \
> -	bond_passive_lacp.sh
> +	bond_passive_lacp.sh \
> +	bond_ipsec_offload.sh
FYI, we got a small conflict when merging 'net' in 'net-next' in the MPTCP tree due to this patch applied in 'net':
99e4c35eada9 ("selftests: bonding: add ipsec offload test")
and this one from 'net-next':
c2377f1763e9 ("selftests: bonding: add test for LACP actor port priority")
----- Generic Message -----
The best is to avoid conflicts between 'net' and 'net-next' trees but if they cannot be avoided when preparing patches, a note about how to fix them is much appreciated.

The conflict has been resolved on our side [1] and the resolution we suggest is attached to this email. Please report any issues linked to this conflict resolution as it might be used by others. If you worked on the mentioned patches, don't hesitate to ACK this conflict resolution.
---------------------------
Regarding this conflict, I simply added the new files from both trees:
	bond_passive_lacp.sh \
	bond_ipsec_offload.sh \
	bond_lacp_prio.sh
Note: A way to reduce such conflicts in the future is to sort each entry by alphabetical order instead of adding new ones at the end. Same in the 'config' file that is also modified in this patch.
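A check along these lines could catch unsorted lists before a patch is sent. This is only a sketch: `entries_sorted` is a hypothetical helper, it expects one entry per line on stdin, and it uses byte-wise (C locale) ordering, which may differ from the ordering a given Makefile already follows.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the entries on stdin (one per line) are
# already in byte-wise alphabetical order, fail otherwise.
entries_sorted()
{
	# sort -c only checks the order; it does not print the sorted list.
	LC_ALL=C sort -c 2>/dev/null
}

# The merged list from the resolution above is not in alphabetical order:
if printf '%s\n' bond_passive_lacp.sh bond_ipsec_offload.sh bond_lacp_prio.sh |
	entries_sorted; then
	echo "entries are sorted"
else
	echo "entries are not sorted"
fi
```

Extracting the entries from the Makefile or the 'config' file is left out here, since that depends on how each file is laid out.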
Rerere cache is available in [2].
[1] https://github.com/multipath-tcp/mptcp_net-next/commit/f6c62892b853
[2] https://github.com/multipath-tcp/mptcp-upstream-rr-cache/commit/65a75
Cheers,
Matt
On Wed, Oct 01, 2025 at 10:37:51AM +0200, Matthieu Baerts wrote:
> Note: A way to reduce such conflicts in the future is to sort each entry by alphabetical order instead of adding new ones at the end. Same in the 'config' file that is also modified in this patch.
Thanks for the suggestion. I will update the Makefile and config to use alphabetical order next time.

Regards
Hangbin
Hello:
This series was applied to netdev/net.git (main) by Paolo Abeni <pabeni@redhat.com>:
On Thu, 25 Sep 2025 02:33:03 +0000 you wrote:
> The active-backup bonding mode supports XFRM ESP offload. However, when a bond is added using a command like `ip link add bond0 type bond mode 1 miimon 100`, the `ethtool -k` command shows that the XFRM ESP offload is disabled. This occurs because, in bond_newlink(), we change the bond link first and register the bond device later. So the XFRM feature update in bond_option_mode_set() is not called as the bond device is not yet registered, leading to the offload feature not being set successfully.
>
> [...]
Here is the summary with links:
  - [PATCHv3,net,1/2] bonding: fix xfrm offload feature setup on active-backup mode
    https://git.kernel.org/netdev/net/c/5b66169f6be4
  - [PATCHv3,net,2/2] selftests: bonding: add ipsec offload test
    https://git.kernel.org/netdev/net/c/99e4c35eada9
You are awesome, thank you!