When a nexthop is added, without a gw address, the default scope was set to 'host'. Thus, when a source address is selected, 127.0.0.1 may be chosen but rejected when the route is used.
When using a route without a nexthop id, the scope can be configured in the route, thus the problem doesn't exist.
To explain more deeply: when a user creates a nexthop, it cannot specify the scope. To create it, the function nh_create_ipv4() calls fib_check_nh() with scope set to 0. fib_check_nh() calls fib_check_nh_nongw() wich was setting scope to 'host'. Then, nh_create_ipv4() calls fib_info_update_nhc_saddr() with scope set to 'host'. The src addr is chosen before the route is inserted.
When a 'standard' route (ie without a reference to a nexthop) is added, fib_create_info() calls fib_info_update_nhc_saddr() with the scope set by the user. iproute2 set the scope to 'link' by default.
Here is a way to reproduce the problem: ip netns add foo ip -n foo link set lo up ip netns add bar ip -n bar link set lo up sleep 1
ip -n foo link add name eth0 type dummy ip -n foo link set eth0 up ip -n foo address add 192.168.0.1/24 dev eth0
ip -n foo link add name veth0 type veth peer name veth1 netns bar ip -n foo link set veth0 address 00:09:c0:26:05:82 ip -n foo link set veth0 arp off ip -n foo link set veth0 up ip -n bar link set veth1 address 00:09:c0:26:05:82 ip -n bar link set veth1 arp off ip -n bar link set veth1 up
ip -n bar address add 192.168.1.1/32 dev veth1 ip -n bar route add default dev veth1
ip -n foo nexthop add id 1 dev veth0 ip -n foo route add 192.168.1.1 nhid 1
Try to get/use the route:
$ ip -n foo route get 192.168.1.1 RTNETLINK answers: Invalid argument $ ip netns exec foo ping -c1 192.168.1.1 ping: connect: Invalid argument
Try without nexthop group (iproute2 sets scope to 'link' by dflt): ip -n foo route del 192.168.1.1 ip -n foo route add 192.168.1.1 dev veth0
Try to get/use the route:
$ ip -n foo route get 192.168.1.1 192.168.1.1 dev veth0 src 192.168.0.1 uid 0 cache $ ip netns exec foo ping -c1 192.168.1.1 PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data. 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.039 ms
--- 192.168.1.1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.039/0.039/0.039/0.000 ms
CC: stable@vger.kernel.org Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops") Reported-by: Edwin Brossette edwin.brossette@6wind.com Signed-off-by: Nicolas Dichtel nicolas.dichtel@6wind.com --- net/ipv4/fib_semantics.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index a57ba23571c9..20177ecf5bdd 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -1230,7 +1230,7 @@ static int fib_check_nh_nongw(struct net *net, struct fib_nh *nh,
nh->fib_nh_dev = in_dev->dev; dev_hold_track(nh->fib_nh_dev, &nh->fib_nh_dev_tracker, GFP_ATOMIC); - nh->fib_nh_scope = RT_SCOPE_HOST; + nh->fib_nh_scope = RT_SCOPE_LINK; if (!netif_carrier_ok(nh->fib_nh_dev)) nh->fib_nh_flags |= RTNH_F_LINKDOWN; err = 0;
On Wed, 2022-07-06 at 18:05 +0200, Nicolas Dichtel wrote:
When a nexthop is added, without a gw address, the default scope was set to 'host'. Thus, when a source address is selected, 127.0.0.1 may be chosen but rejected when the route is used.
When using a route without a nexthop id, the scope can be configured in the route, thus the problem doesn't exist.
To explain more deeply: when a user creates a nexthop, it cannot specify the scope. To create it, the function nh_create_ipv4() calls fib_check_nh() with scope set to 0. fib_check_nh() calls fib_check_nh_nongw() wich was setting scope to 'host'. Then, nh_create_ipv4() calls fib_info_update_nhc_saddr() with scope set to 'host'. The src addr is chosen before the route is inserted.
When a 'standard' route (ie without a reference to a nexthop) is added, fib_create_info() calls fib_info_update_nhc_saddr() with the scope set by the user. iproute2 set the scope to 'link' by default.
Here is a way to reproduce the problem: ip netns add foo ip -n foo link set lo up ip netns add bar ip -n bar link set lo up sleep 1
ip -n foo link add name eth0 type dummy ip -n foo link set eth0 up ip -n foo address add 192.168.0.1/24 dev eth0
ip -n foo link add name veth0 type veth peer name veth1 netns bar ip -n foo link set veth0 address 00:09:c0:26:05:82 ip -n foo link set veth0 arp off
It looks like the 'arp off'/fixed mac address is not relevant for the test case, could you please drop it, so that the example and the self- test are more clean?
ip -n foo link set veth0 up ip -n bar link set veth1 address 00:09:c0:26:05:82 ip -n bar link set veth1 arp off ip -n bar link set veth1 up
ip -n bar address add 192.168.1.1/32 dev veth1 ip -n bar route add default dev veth1
ip -n foo nexthop add id 1 dev veth0 ip -n foo route add 192.168.1.1 nhid 1
Try to get/use the route:
$ ip -n foo route get 192.168.1.1 RTNETLINK answers: Invalid argument $ ip netns exec foo ping -c1 192.168.1.1 ping: connect: Invalid argument
Try without nexthop group (iproute2 sets scope to 'link' by dflt): ip -n foo route del 192.168.1.1 ip -n foo route add 192.168.1.1 dev veth0
Try to get/use the route:
$ ip -n foo route get 192.168.1.1 192.168.1.1 dev veth0 src 192.168.0.1 uid 0 cache $ ip netns exec foo ping -c1 192.168.1.1 PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data. 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.039 ms
--- 192.168.1.1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.039/0.039/0.039/0.000 ms
CC: stable@vger.kernel.org Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops")
Why that commit? It looks like fib_check_nh() used SCOPE_HOST for nongw next hop since well before ?!?
Otherwise LGTM.
Paolo
Le 12/07/2022 à 10:11, Paolo Abeni a écrit : [snip]
ip -n foo link add name veth0 type veth peer name veth1 netns bar ip -n foo link set veth0 address 00:09:c0:26:05:82 ip -n foo link set veth0 arp off
It looks like the 'arp off'/fixed mac address is not relevant for the test case, could you please drop it, so that the example and the self- test are more clean?Good point, I will remove these lines.
[snip]
Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops")
Why that commit? It looks like fib_check_nh() used SCOPE_HOST for nongw next hop since well before ?!?
Yes, but with "standard" route, ie when the nexthop objects are not used, the scope used, at the end, is the one specified in the netlink message. With nexthop object, the user cannot specify the scope and we end up in this problem. Thus, the problem exists only with nexthop object only.
I tried to explain this difference in the commit log, but maybe it's not clear enough.
Otherwise LGTM.
Thank you, Nicolas
linux-stable-mirror@lists.linaro.org