On Fri, Aug 15, 2025 at 1:39 AM chia-yu.chang@nokia-bell-labs.com wrote:
From: Ilpo Järvinen ij@kernel.org
Accurate ECN negotiation parts based on the specification: https://tools.ietf.org/id/draft-ietf-tcpm-accurate-ecn-28.txt
Accurate ECN is negotiated using the ECE, CWR and AE flags in the TCP header. TCP falls back to RFC3168 ECN if one of the ends supports only RFC3168-style ECN.
The AccECN negotiation includes reflecting the IP ECN field value seen in the SYN and SYNACK back using the same bits as the negotiation, which allows responding to SYN CE marks and detecting ECN field mangling. CE marks should not occur currently because SYN=1 segments are sent with Non-ECT in the IP ECN field (but a proposal exists to remove this restriction).
Reflecting the SYN IP ECN field in the SYNACK is relatively simple. Reflecting the SYNACK IP ECN field in the final/third ACK of the handshake is more challenging. Linux TCP code is not well prepared for using the final/third ACK as a signalling channel, which makes things somewhat complicated here.
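The SYN-to-SYNACK reflection described above can be illustrated with a small standalone sketch. It follows the handshake encoding in the AccECN draft as I understand it (Not-ECT, ECT(1), ECT(0) and CE on the SYN map to the (AE, CWR, ECE) values 0b010, 0b011, 0b100 and 0b110 on the SYNACK). All names below (IP_ECN_*, ACE_*, accecn_reflect_syn_ecn() and the two syn_requests_*() helpers) are hypothetical and chosen for this example; this is not the kernel's tcp_accecn_reflector_flags().

```c
#include <stdint.h>

/* IP ECN codepoints (RFC 3168); the field is the low two bits of TOS. */
enum ip_ecn {
	IP_ECN_NOT_ECT = 0x0,
	IP_ECN_ECT_1   = 0x1,
	IP_ECN_ECT_0   = 0x2,
	IP_ECN_CE      = 0x3,
};

#define IP_ECN_MASK 0x3u	/* hypothetical name for this sketch */

/* Bit positions inside a 3-bit (AE, CWR, ECE) value; hypothetical names. */
#define ACE_ECE 0x1u
#define ACE_CWR 0x2u
#define ACE_AE  0x4u
#define ACE_ALL (ACE_AE | ACE_CWR | ACE_ECE)

/* A SYN requesting AccECN carries AE=1, CWR=1, ECE=1. */
static inline int syn_requests_accecn(uint8_t ace)
{
	return (ace & ACE_ALL) == ACE_ALL;
}

/* A SYN requesting RFC 3168 ECN carries CWR=1, ECE=1 with AE clear. */
static inline int syn_requests_rfc3168_ecn(uint8_t ace)
{
	return (ace & ACE_ALL) == (ACE_CWR | ACE_ECE);
}

/*
 * Map the IP ECN field seen on the SYN to the (AE, CWR, ECE) value
 * placed on the SYNACK. Only the low two bits of tos are used.
 */
static inline uint8_t accecn_reflect_syn_ecn(uint8_t tos)
{
	static const uint8_t ace_for_ecn[4] = {
		[IP_ECN_NOT_ECT] = ACE_CWR,		/* 0b010 */
		[IP_ECN_ECT_1]   = ACE_CWR | ACE_ECE,	/* 0b011 */
		[IP_ECN_ECT_0]   = ACE_AE,		/* 0b100 */
		[IP_ECN_CE]      = ACE_AE | ACE_CWR,	/* 0b110 */
	};

	return ace_for_ecn[tos & IP_ECN_MASK];
}
```

Because each codepoint maps to a distinct flag combination, the client can tell from the SYNACK which ECN field value the server actually received, which is what makes mangling detectable.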
The tcp_ecn sysctl selects the highest ECN variant (Accurate ECN, ECN, No ECN) that is accepted on incoming connections and requested on outgoing connections: TCP_ECN_IN_NOECN_OUT_NOECN, TCP_ECN_IN_ECN_OUT_ECN, TCP_ECN_IN_ECN_OUT_NOECN, TCP_ECN_IN_ACCECN_OUT_ACCECN, TCP_ECN_IN_ACCECN_OUT_ECN, and TCP_ECN_IN_ACCECN_OUT_NOECN.
After this patch, the size of tcp_request_sock remains unchanged and no new holes are added. Below are the pahole outcomes before and after this patch:
Signed-off-by: Ilpo Järvinen ij@kernel.org
Co-developed-by: Olivier Tilmans olivier.tilmans@nokia.com
Signed-off-by: Olivier Tilmans olivier.tilmans@nokia.com
Co-developed-by: Chia-Yu Chang chia-yu.chang@nokia-bell-labs.com
Signed-off-by: Chia-Yu Chang chia-yu.chang@nokia-bell-labs.com
Acked-by: Paolo Abeni pabeni@redhat.com
if (tp->ecn_flags & TCP_ECN_MODE_ACCECN) {
TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_ACE;
TCP_SKB_CB(skb)->tcp_flags |=
tcp_accecn_reflector_flags(tp->syn_ect_rcv);
tp->syn_ect_snt = inet_sk(sk)->tos & INET_ECN_MASK;
}
}
/* Packet ECN state for a SYN. */
@@ -125,8 +377,20 @@ static inline void tcp_ecn_send_syn(struct sock *sk, struct sk_buff *skb)
{
struct tcp_sock *tp = tcp_sk(sk);
bool bpf_needs_ecn = tcp_bpf_ca_needs_ecn(sk);
bool use_ecn = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_ecn) == 1 ||
tcp_ca_needs_ecn(sk) || bpf_needs_ecn;
bool use_ecn, use_accecn;
u8 tcp_ecn = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_ecn);
/* +================+===========================+
* | tcp_ecn values | Outgoing connections |
* +================+===========================+
* | 0,2,5 | Do not request ECN |
* | 1,4 | Request ECN connection |
* | 3 | Request AccECN connection |
* +================+===========================+
*/
You have nice macros, maybe use them?
TCP_ECN_IN_NOECN_OUT_NOECN = 0,
TCP_ECN_IN_ECN_OUT_ECN = 1,
TCP_ECN_IN_ECN_OUT_NOECN = 2,
TCP_ECN_IN_ACCECN_OUT_ACCECN = 3,
TCP_ECN_IN_ACCECN_OUT_ECN = 4,
TCP_ECN_IN_ACCECN_OUT_NOECN = 5,
This can be done later, no need to respin.
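For illustration only, a standalone sketch of what using the named values instead of the numeric table might look like. The enumerator names and values are the ones quoted above; enum ecn_request and the two helper functions are hypothetical and exist only for this example. The incoming-side mapping is inferred from the enumerator names, not from the patch. The sketch also leaves out the tcp_ca_needs_ecn()/BPF override that the real tcp_ecn_send_syn() applies, and treats unknown sysctl values as "no ECN", which is an assumption.

```c
/* Values as quoted in this thread, repeated so the sketch compiles alone. */
enum tcp_ecn_mode {
	TCP_ECN_IN_NOECN_OUT_NOECN   = 0,
	TCP_ECN_IN_ECN_OUT_ECN       = 1,
	TCP_ECN_IN_ECN_OUT_NOECN     = 2,
	TCP_ECN_IN_ACCECN_OUT_ACCECN = 3,
	TCP_ECN_IN_ACCECN_OUT_ECN    = 4,
	TCP_ECN_IN_ACCECN_OUT_NOECN  = 5,
};

/* Hypothetical result type: highest ECN variant to request or accept. */
enum ecn_request {
	ECN_REQ_NONE,
	ECN_REQ_ECN,
	ECN_REQ_ACCECN,
};

/* What to request on an outgoing SYN for a given tcp_ecn sysctl value. */
static enum ecn_request ecn_outgoing_request(int tcp_ecn)
{
	switch (tcp_ecn) {
	case TCP_ECN_IN_ACCECN_OUT_ACCECN:
		return ECN_REQ_ACCECN;
	case TCP_ECN_IN_ECN_OUT_ECN:
	case TCP_ECN_IN_ACCECN_OUT_ECN:
		return ECN_REQ_ECN;
	default:
		/* 0, 2, 5 and (by assumption) any unknown value */
		return ECN_REQ_NONE;
	}
}

/* Highest variant accepted on an incoming SYN, inferred from the names. */
static enum ecn_request ecn_incoming_accept(int tcp_ecn)
{
	switch (tcp_ecn) {
	case TCP_ECN_IN_ACCECN_OUT_ACCECN:
	case TCP_ECN_IN_ACCECN_OUT_ECN:
	case TCP_ECN_IN_ACCECN_OUT_NOECN:
		return ECN_REQ_ACCECN;
	case TCP_ECN_IN_ECN_OUT_ECN:
	case TCP_ECN_IN_ECN_OUT_NOECN:
		return ECN_REQ_ECN;
	default:
		return ECN_REQ_NONE;
	}
}
```

With the names in the case labels, the "0,2,5 / 1,4 / 3" comment table becomes redundant, since each label already states the incoming and outgoing behaviour.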
Reviewed-by: Eric Dumazet edumazet@google.com