-----Original Message----- From: Paolo Abeni pabeni@redhat.com Sent: Tuesday, April 29, 2025 12:37 PM To: Chia-Yu Chang (Nokia) chia-yu.chang@nokia-bell-labs.com; horms@kernel.org; dsahern@kernel.org; kuniyu@amazon.com; bpf@vger.kernel.org; netdev@vger.kernel.org; dave.taht@gmail.com; jhs@mojatatu.com; kuba@kernel.org; stephen@networkplumber.org; xiyou.wangcong@gmail.com; jiri@resnulli.us; davem@davemloft.net; edumazet@google.com; andrew+netdev@lunn.ch; donald.hunter@gmail.com; ast@fiberby.net; liuhangbin@gmail.com; shuah@kernel.org; linux-kselftest@vger.kernel.org; ij@kernel.org; ncardwell@google.com; Koen De Schepper (Nokia) koen.de_schepper@nokia-bell-labs.com; g.white g.white@cablelabs.com; ingemar.s.johansson@ericsson.com; mirja.kuehlewind@ericsson.com; cheshire@apple.com; rs.ietf@gmx.at; Jason_Livingood@comcast.com; vidhi_goel vidhi_goel@apple.com Cc: Olivier Tilmans (Nokia) olivier.tilmans@nokia.com Subject: Re: [PATCH v5 net-next 04/15] tcp: accecn: AccECN negotiation
CAUTION: This is an external email. Please be very careful when clicking links or opening attachments. See the URL nok.it/ext for additional information.
On 4/22/25 5:35 PM, chia-yu.chang@nokia-bell-labs.com wrote:
diff --git a/include/linux/tcp.h b/include/linux/tcp.h index e36018203bd0..af38fff24aa4 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -156,6 +156,10 @@ struct tcp_request_sock { #if IS_ENABLED(CONFIG_MPTCP) bool drop_req; #endif
u8 accecn_ok : 1,
syn_ect_snt: 2,
syn_ect_rcv: 2;
u8 accecn_fail_mode:4;
AFAICS this will create a 3 bytes hole. That could be bad if it will also increase the number of cachelines used by struct tcp_request_sock. Please include the pahole info and struct size in the commit message.
If there is no size problem I guess you are better off using a 'bool' for 'accecn_ok'
Hi Paolo,
Thanks for the feedback I will include the pahole in the message and see whether I can move to reduce the size of holes.
u32 txhash; u32 rcv_isn; u32 snt_isn;
@@ -376,7 +380,10 @@ struct tcp_sock { u8 compressed_ack; u8 dup_ack_counter:2, tlp_retrans:1, /* TLP is a retransmission */
unused:5;
syn_ect_snt:2, /* AccECN ECT memory, only */
syn_ect_rcv:2, /* ... needed durign 3WHS + first seqno */
wait_third_ack:1; /* Wait 3rd ACK in simultaneous open
- */
A good bunch of conditionals will be added to the fast path checking this flag. Is simult open really a thing for AccECN? Can we simple disable AccECN in such scenarios and simplify the code a bit? In my limited experience only syzkaller reliably use it.
There are few simulateneous open testcase for AccECN in packtetdrill: https://github.com/minuscat/packetdrill_accecn/tree/main/gtests/net/tcp/acce...
u8 accecn_fail_mode:4; /* AccECN failure handling */
This is outside the fastpath area, so possibly the struct size increase is less critical, but AFAICS this will create a 6bits hole (as the next u8 has only 6bit used). I think it's better to read the 'unused' field to mark such hole.
Sure, will take action in the next version, either provide pahole in commit message or read the unsued field to make such hole.
u8 thin_lto : 1,/* Use linear timeouts for thin streams */ fastopen_connect:1, /* FASTOPEN_CONNECT sockopt */ fastopen_no_cookie:1, /* Allow send/recv SYN+data
without a cookie */
[...]
+/* See Table 2 of the AccECN draft */ static void +tcp_ecn_rcv_synack(struct sock *sk, const struct tcphdr *th,
u8 ip_dsfield) {
struct tcp_sock *tp = tcp_sk(sk);
u8 ace = tcp_accecn_ace(th);
switch (ace) {
case 0x0:
case 0x7: tcp_ecn_mode_set(tp, TCP_ECN_DISABLED);
break;
case 0x1:
case 0x5:
Possibly some human readable defines could help instead of magic numbers here.
Sure, I will add comments here.
[...]
@@ -6171,16 +6252,27 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb, * RFC 5961 4.2 : Send a challenge ack */ if (th->syn) {
if (tcp_ecn_mode_accecn(tp))
send_accecn_reflector = true; if (sk->sk_state == TCP_SYN_RECV && sk->sk_socket && th->ack && TCP_SKB_CB(skb)->seq + 1 == TCP_SKB_CB(skb)->end_seq && TCP_SKB_CB(skb)->seq + 1 == tp->rcv_nxt &&
TCP_SKB_CB(skb)->ack_seq == tp->snd_nxt)
TCP_SKB_CB(skb)->ack_seq == tp->snd_nxt) {
if (!tcp_ecn_disabled(tp)) {
u8 ect = tp->syn_ect_rcv;
tp->wait_third_ack = true;
__tcp_send_ack(sk, tp->rcv_nxt,
!send_accecn_reflector ? 0 :
- tcp_accecn_reflector_flags(ect));
The same expression is used above possibly you can create a new helper for this statement.
OK, will do that.
...
This patch is quite huge. Any hope to break id down to a more palatable size? i.e. moving the 3rd ack/self connect handling to a separate patch (if that thing is really needed).
I am ok to make a practce on reducing the number of patch in this series, but this series shall be the key for AccECN: ACE bitfield, TCP option, fallback, error handling, etc. Or if you have any suggestions, I am find to take actions. And after this series, we still have 15 patches including several corner case handling in RFC, doucmentation, etc.
Chia-Yu
/P