This patch set is aim to update the old IP_TOS_MASK to new IP_DSCP_MASK as tos value has been obsoleted for a long time. But to make sure we don't break any existing behaviour, we can't just replease all IP_TOS_MASK to new IP_DSCP_MASK.
So let's update it case by case. The first issue we will fix is that vxlan is unable to take the first 3 bits from DSCP field before xmit. Use the new RT_DSCP() would resolve this.
v2: Remove IP_DSCP() definition as it's duplicated with RT_DSCP(). Post the patch to net instead of net-next as we need fix the vxlan issue
Hangbin Liu (2): net: add IP_DSCP_MASK vxlan: fix getting tos value from DSCP field
drivers/net/vxlan.c | 4 ++-- include/uapi/linux/in_route.h | 1 + include/uapi/linux/ip.h | 1 + 3 files changed, 4 insertions(+), 2 deletions(-)
In RFC1349 it defined TOS field like
0 1 2 3 4 5 6 7 +-----+-----+-----+-----+-----+-----+-----+-----+ | PRECEDENCE | TOS | MBZ | +-----+-----+-----+-----+-----+-----+-----+-----+
But this has been obsoleted by RFC2474, and updated by RFC3168 later. Now the DS Field should be like
0 1 2 3 4 5 6 7 +-----+-----+-----+-----+-----+-----+-----+-----+ | DS FIELD, DSCP | ECN FIELD | +-----+-----+-----+-----+-----+-----+-----+-----+
DSCP: differentiated services codepoint ECN: Explicit Congestion Notification
So the old IPTOS_TOS_MASK 0x1E should be updated. But since changed the value will break UAPI, let's add a new value IP_DSCP_MASK 0xFC as a replacement.
v2: remove IP_DSCP() definition as it's duplicated with RT_DSCP().
Signed-off-by: Hangbin Liu liuhangbin@gmail.com --- include/uapi/linux/in_route.h | 1 + include/uapi/linux/ip.h | 1 + 2 files changed, 2 insertions(+)
diff --git a/include/uapi/linux/in_route.h b/include/uapi/linux/in_route.h index 0cc2c23b47f8..26ba4efb054d 100644 --- a/include/uapi/linux/in_route.h +++ b/include/uapi/linux/in_route.h @@ -29,5 +29,6 @@ #define RTCF_NAT (RTCF_DNAT|RTCF_SNAT)
#define RT_TOS(tos) ((tos)&IPTOS_TOS_MASK) +#define RT_DSCP(tos) ((tos)&IP_DSCP_MASK)
#endif /* _LINUX_IN_ROUTE_H */ diff --git a/include/uapi/linux/ip.h b/include/uapi/linux/ip.h index e42d13b55cf3..699a86038b9f 100644 --- a/include/uapi/linux/ip.h +++ b/include/uapi/linux/ip.h @@ -38,6 +38,7 @@ #define IPTOS_PREC_PRIORITY 0x20 #define IPTOS_PREC_ROUTINE 0x00
+#define IP_DSCP_MASK 0xFC
/* IP options */ #define IPOPT_COPY 0x80
In commit 71130f29979c ("vxlan: fix tos value before xmit") we strict the vxlan tos value before xmit. But as IP tos field has been obsoleted by RFC2474, and updated by RFC3168 later. We should use new DSCP field, or we will lost the first 3 bits value when xmit.
Fixes: 71130f29979c ("vxlan: fix tos value before xmit") Signed-off-by: Hangbin Liu liuhangbin@gmail.com --- drivers/net/vxlan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index a7c3939264b0..79488df4bc70 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -2722,7 +2722,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev, ndst = &rt->dst; skb_tunnel_check_pmtu(skb, ndst, VXLAN_HEADROOM);
- tos = ip_tunnel_ecn_encap(RT_TOS(tos), old_iph, skb); + tos = ip_tunnel_ecn_encap(RT_DSCP(tos), old_iph, skb); ttl = ttl ? : ip4_dst_hoplimit(&rt->dst); err = vxlan_build_skb(skb, ndst, sizeof(struct iphdr), vni, md, flags, udp_sum); @@ -2762,7 +2762,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
skb_tunnel_check_pmtu(skb, ndst, VXLAN6_HEADROOM);
- tos = ip_tunnel_ecn_encap(RT_TOS(tos), old_iph, skb); + tos = ip_tunnel_ecn_encap(RT_DSCP(tos), old_iph, skb); ttl = ttl ? : ip6_dst_hoplimit(ndst); skb_scrub_packet(skb, xnet); err = vxlan_build_skb(skb, ndst, sizeof(struct ipv6hdr),
From: Hangbin Liu liuhangbin@gmail.com Date: Tue, 4 Aug 2020 09:43:12 +0800
In commit 71130f29979c ("vxlan: fix tos value before xmit") we strict the vxlan tos value before xmit. But as IP tos field has been obsoleted by RFC2474, and updated by RFC3168 later. We should use new DSCP field, or we will lost the first 3 bits value when xmit.
Fixes: 71130f29979c ("vxlan: fix tos value before xmit") Signed-off-by: Hangbin Liu liuhangbin@gmail.com
Looking at the Fixes: tag commit more closely, it doesn't make much sense at all to me and I think the fix is that the Fixes: commit should be reverted.
If you pass the raw TOS into ip_tunnel_ecn_encap(), then that has the same exact effect as your patch series here. The ECN encap routines will clear the ECN bits before potentially incorporating the ECN value from the inner header etc. The clearing of the ECN bits done by your RT_DSCP() helper is completely unnecessary, the ECN helpers do the right thing. So effectively the RT_DSCP() isn't changing the tos value at all.
I also think that your commit messages are lacking, as you fail (especially in the Fixes: commit) to show exactly where things go wrong. It's always good to give example code paths and show what happens to the TOS and/or ECN values in these places, what part of that transformation you feel is incorrect, and what exactly you believe the correct transformation to be.
I'm not applying this series, sorry.
On Tue, Aug 04, 2020 at 12:47:56PM -0700, David Miller wrote:
From: Hangbin Liu liuhangbin@gmail.com Date: Tue, 4 Aug 2020 09:43:12 +0800
In commit 71130f29979c ("vxlan: fix tos value before xmit") we strict the vxlan tos value before xmit. But as IP tos field has been obsoleted by RFC2474, and updated by RFC3168 later. We should use new DSCP field, or we will lost the first 3 bits value when xmit.
Fixes: 71130f29979c ("vxlan: fix tos value before xmit") Signed-off-by: Hangbin Liu liuhangbin@gmail.com
Looking at the Fixes: tag commit more closely, it doesn't make much sense at all to me and I think the fix is that the Fixes: commit should be reverted.
Hi David,
Both this patch and the Fixes: commit are not aim to fix the ECN bits. ECN bits are handled by ip_tunnel_ecn_encap() correctly.
The Fixes: commit and this patch are aim to fix the TOS/DSCP field, just as the commit subject said.
In my first patch "net: add IP_DSCP_MASK", as I said, the current RT_TOS()/IPTOS_TOS_MASK will ignore the first 3 bits in IP header based on RFC1349.
0 1 2 3 4 5 6 7 +-----+-----+-----+-----+-----+-----+-----+-----+ | PRECEDENCE | TOS | MBZ | +-----+-----+-----+-----+-----+-----+-----+-----+
While in RFC3168 we defined DSCP field like
0 1 2 3 4 5 6 7 +-----+-----+-----+-----+-----+-----+-----+-----+ | DS FIELD, DSCP | ECN FIELD | +-----+-----+-----+-----+-----+-----+-----+-----+
So if a user defined the IP DSCP/TOS field like 1111 1100, with RT_TOS(tos) we will got tos 0001 1100, but based on RFC3168, we should send the header with DSCP 1111 1100. That's why I add RT_DSCP() in my first patch.
If you pass the raw TOS into ip_tunnel_ecn_encap(), then that has the same exact effect as your patch series here. The ECN encap routines will clear the ECN bits before potentially incorporating the ECN value from the inner header etc. The clearing of the ECN bits done by your RT_DSCP() helper is completely unnecessary, the ECN helpers do the right thing. So effectively the RT_DSCP() isn't changing the tos value at all.
Yes, you are right. RT_DSCP() doesn't change the tos value in this case. I will revert the Fixes: commit.
While RT_DSCP() is still needed in other place, I will re-post a new patch set for that issue.
I also think that your commit messages are lacking, as you fail (especially in the Fixes: commit) to show exactly where things go wrong. It's always good to give example code paths and show what happens to the TOS and/or ECN values in these places, what part of that transformation you feel is incorrect, and what exactly you believe the correct transformation to be.
Thanks, I will try add more info in later patches.
Hangbin
On Tue, Aug 04, 2020 at 09:43:10AM +0800, Hangbin Liu wrote:
This patch set is aim to update the old IP_TOS_MASK to new IP_DSCP_MASK as tos value has been obsoleted for a long time. But to make sure we don't break any existing behaviour, we can't just replease all IP_TOS_MASK to new IP_DSCP_MASK.
So let's update it case by case. The first issue we will fix is that vxlan is unable to take the first 3 bits from DSCP field before xmit. Use the new RT_DSCP() would resolve this.
v2: Remove IP_DSCP() definition as it's duplicated with RT_DSCP(). Post the patch to net instead of net-next as we need fix the vxlan issue
Acked-by: Guillaume Nault gnault@redhat.com
linux-stable-mirror@lists.linaro.org