From: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
SACK blocks tend to consume all of the TCP option space when there are many holes. It is therefore useful to compromise on sending many SACK blocks in every ACK and instead try to fit the AccECN option by reducing the number of SACK blocks. However, the number of SACK blocks is never reduced below two to make room for the AccECN option.
As the AccECN option is often not included in every ACK, the space taken from SACK is usually only temporary. Depending on the number of required AccECN fields (3, 2, 1, or 0, cf. Table 5 in the AccECN spec) and the NOPs used to align other TCP options, up to two SACK blocks are removed. The table below gives the details:

+===========+==========+===========+=============+=============+
| Number of | Required | Remaining |  Number of  |    Final    |
|   SACK    |  AccECN  |  option   |   reduced   |  number of  |
|  blocks   |  fields  |  spaces   | SACK blocks | SACK blocks |
+===========+==========+===========+=============+=============+
|  x (<=2)  |  0 to 3  |    any    |      0      |      x      |
+-----------+----------+-----------+-------------+-------------+
|     3     |    0     |    any    |      0      |      3      |
|     3     |    1     |    <4     |      1      |      2      |
|     3     |    1     |    >=4    |      0      |      3      |
|     3     |    2     |    <8     |      1      |      2      |
|     3     |    2     |    >=8    |      0      |      3      |
|     3     |    3     |    <12    |      1      |      2      |
|     3     |    3     |    >=12   |      0      |      3      |
+-----------+----------+-----------+-------------+-------------+
|  y (>=4)  |    0     |    any    |      0      |      y      |
|  y (>=4)  |    1     |    <4     |      1      |     y-1     |
|  y (>=4)  |    1     |    >=4    |      0      |      y      |
|  y (>=4)  |    2     |    <8     |      1      |     y-1     |
|  y (>=4)  |    2     |    >=8    |      0      |      y      |
|  y (>=4)  |    3     |    <4     |      2      |     y-2     |
|  y (>=4)  |    3     |    <12    |      1      |     y-1     |
|  y (>=4)  |    3     |    >=12   |      0      |      y      |
+===========+==========+===========+=============+=============+

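For illustration only, the table can be transcribed into a small lookup function. This is a sketch of the table as written, not of the kernel code: the name sack_blocks_to_drop() and its parameters are hypothetical, and it assumes the "<12" row for y (>=4) applies only when the "<4" row does not.

```c
#include <assert.h>

/*
 * Hypothetical helper: returns the "Number of reduced SACK blocks" column
 * of the table above for a given row. It only mirrors the table; it does
 * not model how tcp_options_fit_accecn() sizes or aligns the option.
 */
int sack_blocks_to_drop(int sack_blocks, int required, int remaining)
{
	/* The table only covers 0 to 3 required AccECN fields. */
	assert(required >= 0 && required <= 3);

	/* Never go below two SACK blocks; nothing to do without fields. */
	if (sack_blocks <= 2 || required == 0)
		return 0;

	/* Thresholds in the table are 4, 8 and 12 for 1, 2 and 3 fields. */
	if (remaining >= 4 * required)
		return 0;

	/* Only the y (>=4), three-field, <4 row gives up two blocks. */
	if (sack_blocks >= 4 && required == 3 && remaining < 4)
		return 2;

	return 1;
}
```

For example, sack_blocks_to_drop(4, 3, 2) corresponds to the row with y (>=4), three required fields and remaining space <4, and sack_blocks_to_drop(3, 2, 8) corresponds to the row where nothing is removed.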
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Co-developed-by: Ilpo Järvinen <ij@kernel.org>
Signed-off-by: Ilpo Järvinen <ij@kernel.org>
---
v8:
- Update tcp_options_fit_accecn() to avoid using recursion
---
 net/ipv4/tcp_output.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 560b0ca54bb8..cf1d40e9c0ed 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -876,7 +876,9 @@ static int tcp_options_fit_accecn(struct tcp_out_options *opts, int required,
 				  int remaining)
 {
 	int size = TCP_ACCECN_MAXSIZE;
+	int sack_blocks_reduce = 0;
 	int max_combine_saving;
+	int rem = remaining;
 
 	if (opts->use_synack_ecn_bytes)
 		max_combine_saving = tcp_synack_options_combine_saving(opts);
@@ -889,14 +891,31 @@ static int tcp_options_fit_accecn(struct tcp_out_options *opts, int required,
 		if (leftover_size > max_combine_saving)
 			leftover_size = -((4 - leftover_size) & 0x3);
 
-		if (remaining >= size - leftover_size) {
+		if (rem >= size - leftover_size) {
 			size -= leftover_size;
 			break;
+		} else if (opts->num_accecn_fields == required &&
+			   opts->num_sack_blocks > 2 &&
+			   required > 0) {
+			/* Try to fit the option by removing one SACK block */
+			opts->num_sack_blocks--;
+			sack_blocks_reduce++;
+			rem = rem + TCPOLEN_SACK_PERBLOCK;
+
+			opts->num_accecn_fields = TCP_ACCECN_NUMFIELDS;
+			size = TCP_ACCECN_MAXSIZE;
+			continue;
 		}
 
 		opts->num_accecn_fields--;
 		size -= TCPOLEN_ACCECN_PERFIELD;
 	}
+	if (sack_blocks_reduce > 0) {
+		if (opts->num_accecn_fields >= required)
+			size -= sack_blocks_reduce * TCPOLEN_SACK_PERBLOCK;
+		else
+			opts->num_sack_blocks += sack_blocks_reduce;
+	}
 	if (opts->num_accecn_fields < required)
 		return 0;
 