The patch below does not apply to the 6.6-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y git checkout FETCH_HEAD git cherry-pick -x cd7c957f936f8cb80d03e5152f4013aae65bd986 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to 'stable@vger.kernel.org' --in-reply-to '2024081244-uncertain-snarl-e4f6@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
cd7c957f936f ("mptcp: pm: don't try to create sf if alloc failed") c95eb32ced82 ("mptcp: pm: reduce indentation blocks")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cd7c957f936f8cb80d03e5152f4013aae65bd986 Mon Sep 17 00:00:00 2001 From: "Matthieu Baerts (NGI0)" matttbe@kernel.org Date: Wed, 31 Jul 2024 13:05:56 +0200 Subject: [PATCH] mptcp: pm: don't try to create sf if alloc failed
It sounds better to avoid wasting cycles and / or put extreme memory pressure on the system by trying to create new subflows if it was not possible to add a new item in the announce list.
While at it, a warning is now printed if the entry was already in the list as it should not happen with the in-kernel path-manager. With this PM, mptcp_pm_alloc_anno_list() should only fail in case of memory pressure.
Fixes: b6c08380860b ("mptcp: remove addr and subflow in PM netlink") Cc: stable@vger.kernel.org Suggested-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-s... Signed-off-by: Jakub Kicinski kuba@kernel.org
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 780f4cca165c..2be7af377cda 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -348,7 +348,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, add_entry = mptcp_lookup_anno_list_by_saddr(msk, addr);
if (add_entry) { - if (mptcp_pm_is_kernel(msk)) + if (WARN_ON_ONCE(mptcp_pm_is_kernel(msk))) return false;
sk_reset_timer(sk, &add_entry->add_timer, @@ -555,8 +555,6 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
/* check first for announce */ if (msk->pm.add_addr_signaled < add_addr_signal_max) { - local = select_signal_address(pernet, msk); - /* due to racing events on both ends we can reach here while * previous add address is still running: if we invoke now * mptcp_pm_announce_addr(), that will fail and the @@ -567,11 +565,15 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) if (msk->pm.addr_signal & BIT(MPTCP_ADD_ADDR_SIGNAL)) return;
+ local = select_signal_address(pernet, msk); if (!local) goto subflow;
+ /* If the alloc fails, we are on memory pressure, not worth + * continuing, and trying to create subflows. + */ if (!mptcp_pm_alloc_anno_list(msk, &local->addr)) - goto subflow; + return;
__clear_bit(local->addr.id, msk->pm.id_avail_bitmap); msk->pm.add_addr_signaled++;
Patches "mptcp: pm: don't try to create sf if alloc failed" and "mptcp: pm: do not ignore 'subflow' if 'signal' flag is also set" depend on "mptcp: pm: reduce indentation blocks", a simple refactoring that can be picked to ease the backports. Including this patch avoids conflicts with the two other patches.
While at it, also picked the modifications of the selftests to validate the other modifications.
If you prefer, feel free to backport these 5 commits to v6.6:
c95eb32ced82 cd7c957f936f 85df533a787b bec1f3b119eb 4d2868b5d191
In this order, and thanks to c95eb32ced82, there are no conflicts.
Details:
- c95eb32ced82 ("mptcp: pm: reduce indentation blocks") - cd7c957f936f ("mptcp: pm: don't try to create sf if alloc failed") - 85df533a787b ("mptcp: pm: do not ignore 'subflow' if 'signal' flag is also set") - bec1f3b119eb ("selftests: mptcp: join: ability to invert ADD_ADDR check") - 4d2868b5d191 ("selftests: mptcp: join: test both signal & subflow")
Matthieu Baerts (NGI0) (5): mptcp: pm: reduce indentation blocks mptcp: pm: don't try to create sf if alloc failed mptcp: pm: do not ignore 'subflow' if 'signal' flag is also set selftests: mptcp: join: ability to invert ADD_ADDR check selftests: mptcp: join: test both signal & subflow
net/mptcp/pm_netlink.c | 43 ++++++++++----- .../testing/selftests/net/mptcp/mptcp_join.sh | 55 ++++++++++++++----- 2 files changed, 69 insertions(+), 29 deletions(-)
On Mon, Aug 12, 2024 at 05:30:51PM +0200, Matthieu Baerts (NGI0) wrote:
Patches "mptcp: pm: don't try to create sf if alloc failed" and "mptcp: pm: do not ignore 'subflow' if 'signal' flag is also set" depend on "mptcp: pm: reduce indentation blocks", a simple refactoring that can be picked to ease the backports. Including this patch avoids conflicts with the two other patches.
While at it, also picked the modifications of the selftests to validate the other modifications.
If you prefer, feel free to backport these 5 commits to v6.6:
c95eb32ced82 cd7c957f936f 85df533a787b bec1f3b119eb 4d2868b5d191
In this order, and thanks to c95eb32ced82, there are no conflicts.
All now queued up, thanks!
gre gk-h
commit c95eb32ced823a00be62202b43966b07b2f20b7f upstream.
That will simplify the following commits.
No functional changes intended.
Suggested-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-s... Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: cd7c957f936f ("mptcp: pm: don't try to create sf if alloc failed") Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- net/mptcp/pm_netlink.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index be4be3bcefc7..6d5c75a40bc6 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -575,16 +575,19 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) if (msk->pm.addr_signal & BIT(MPTCP_ADD_ADDR_SIGNAL)) return;
- if (local) { - if (mptcp_pm_alloc_anno_list(msk, &local->addr)) { - __clear_bit(local->addr.id, msk->pm.id_avail_bitmap); - msk->pm.add_addr_signaled++; - mptcp_pm_announce_addr(msk, &local->addr, false); - mptcp_pm_nl_addr_send_ack(msk); - } - } + if (!local) + goto subflow; + + if (!mptcp_pm_alloc_anno_list(msk, &local->addr)) + goto subflow; + + __clear_bit(local->addr.id, msk->pm.id_avail_bitmap); + msk->pm.add_addr_signaled++; + mptcp_pm_announce_addr(msk, &local->addr, false); + mptcp_pm_nl_addr_send_ack(msk); }
+subflow: /* check if should create a new subflow */ while (msk->pm.local_addr_used < local_addr_max && msk->pm.subflows < subflows_max) {
commit cd7c957f936f8cb80d03e5152f4013aae65bd986 upstream.
It sounds better to avoid wasting cycles and / or put extreme memory pressure on the system by trying to create new subflows if it was not possible to add a new item in the announce list.
While at it, a warning is now printed if the entry was already in the list as it should not happen with the in-kernel path-manager. With this PM, mptcp_pm_alloc_anno_list() should only fail in case of memory pressure.
Fixes: b6c08380860b ("mptcp: remove addr and subflow in PM netlink") Cc: stable@vger.kernel.org Suggested-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-s... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- net/mptcp/pm_netlink.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 6d5c75a40bc6..83210485cde8 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -353,7 +353,7 @@ bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, add_entry = mptcp_lookup_anno_list_by_saddr(msk, addr);
if (add_entry) { - if (mptcp_pm_is_kernel(msk)) + if (WARN_ON_ONCE(mptcp_pm_is_kernel(msk))) return false;
sk_reset_timer(sk, &add_entry->add_timer, @@ -563,8 +563,6 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk)
/* check first for announce */ if (msk->pm.add_addr_signaled < add_addr_signal_max) { - local = select_signal_address(pernet, msk); - /* due to racing events on both ends we can reach here while * previous add address is still running: if we invoke now * mptcp_pm_announce_addr(), that will fail and the @@ -575,11 +573,15 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) if (msk->pm.addr_signal & BIT(MPTCP_ADD_ADDR_SIGNAL)) return;
+ local = select_signal_address(pernet, msk); if (!local) goto subflow;
+ /* If the alloc fails, we are on memory pressure, not worth + * continuing, and trying to create subflows. + */ if (!mptcp_pm_alloc_anno_list(msk, &local->addr)) - goto subflow; + return;
__clear_bit(local->addr.id, msk->pm.id_avail_bitmap); msk->pm.add_addr_signaled++;
commit 85df533a787bf07bf4367ce2a02b822ff1fba1a3 upstream.
Up to the 'Fixes' commit, having an endpoint with both the 'signal' and 'subflow' flags, resulted in the creation of a subflow and an address announcement using the address linked to this endpoint. After this commit, only the address announcement was done, ignoring the 'subflow' flag.
That's because the same bitmap is used for the two flags. It is OK to keep this single bitmap, the already selected local endpoint simply have to be re-used, but not via select_local_address() not to look at the just modified bitmap.
Note that it is unusual to set the two flags together: creating a new subflow using a new local address will implicitly advertise it to the other peer. So in theory, no need to advertise it explicitly as well. Maybe there are use-cases -- the subflow might not reach the other peer that way, we can ask the other peer to try initiating the new subflow without delay -- or very likely the user is confused, and put both flags "just to be sure at least the right one is set". Still, if it is allowed, the kernel should do what has been asked: using this endpoint to announce the address and to create a new subflow from it.
An alternative is to forbid the use of the two flags together, but that's probably too late, there are maybe use-cases, and it was working before. This patch will avoid people complaining subflows are not created using the endpoint they added with the 'subflow' and 'signal' flag.
Note that with the current patch, the subflow might not be created in some corner cases, e.g. if the 'subflows' limit was reached when sending the ADD_ADDR, but changed later on. It is probably not worth splitting id_avail_bitmap per target ('signal', 'subflow'), which will add another large field to the msk "just" to track (again) endpoints. Anyway, currently when the limits are changed, the kernel doesn't check if new subflows can be created or removed, because we would need to keep track of the received ADD_ADDR, and more. It sounds OK to assume that the limits should be properly configured before establishing new connections.
Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk") Cc: stable@vger.kernel.org Suggested-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-s... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- net/mptcp/pm_netlink.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 83210485cde8..2c49182c674f 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -520,8 +520,8 @@ __lookup_addr(struct pm_nl_pernet *pernet, const struct mptcp_addr_info *info,
static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) { + struct mptcp_pm_addr_entry *local, *signal_and_subflow = NULL; struct sock *sk = (struct sock *)msk; - struct mptcp_pm_addr_entry *local; unsigned int add_addr_signal_max; unsigned int local_addr_max; struct pm_nl_pernet *pernet; @@ -587,6 +587,9 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) msk->pm.add_addr_signaled++; mptcp_pm_announce_addr(msk, &local->addr, false); mptcp_pm_nl_addr_send_ack(msk); + + if (local->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) + signal_and_subflow = local; }
subflow: @@ -597,9 +600,14 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) bool fullmesh; int i, nr;
- local = select_local_address(pernet, msk); - if (!local) - break; + if (signal_and_subflow) { + local = signal_and_subflow; + signal_and_subflow = NULL; + } else { + local = select_local_address(pernet, msk); + if (!local) + break; + }
fullmesh = !!(local->flags & MPTCP_PM_ADDR_FLAG_FULLMESH);
commit bec1f3b119ebc613d08dfbcdbaef01a79aa7de92 upstream.
In the following commit, the client will initiate the ADD_ADDR, instead of the server. We need to way to verify the ADD_ADDR have been correctly sent.
Note: the default expected counters for when the port number is given are never changed by the caller, no need to accept them as parameter then.
The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID.
Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk") Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-s... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- .../testing/selftests/net/mptcp/mptcp_join.sh | 40 ++++++++++++------- 1 file changed, 26 insertions(+), 14 deletions(-)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index d7973b1202d9..ede3661607ef 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -1559,18 +1559,28 @@ chk_add_nr() local add_nr=$1 local echo_nr=$2 local port_nr=${3:-0} - local syn_nr=${4:-$port_nr} - local syn_ack_nr=${5:-$port_nr} - local ack_nr=${6:-$port_nr} - local mis_syn_nr=${7:-0} - local mis_ack_nr=${8:-0} + local ns_invert=${4:-""} + local syn_nr=$port_nr + local syn_ack_nr=$port_nr + local ack_nr=$port_nr + local mis_syn_nr=0 + local mis_ack_nr=0 + local ns_tx=$ns1 + local ns_rx=$ns2 + local extra_msg="" local count local timeout
- timeout=$(ip netns exec $ns1 sysctl -n net.mptcp.add_addr_timeout) + if [[ $ns_invert = "invert" ]]; then + ns_tx=$ns2 + ns_rx=$ns1 + extra_msg="invert" + fi + + timeout=$(ip netns exec ${ns_tx} sysctl -n net.mptcp.add_addr_timeout)
print_check "add" - count=$(mptcp_lib_get_counter ${ns2} "MPTcpExtAddAddr") + count=$(mptcp_lib_get_counter ${ns_rx} "MPTcpExtAddAddr") if [ -z "$count" ]; then print_skip # if the test configured a short timeout tolerate greater then expected @@ -1582,7 +1592,7 @@ chk_add_nr() fi
print_check "echo" - count=$(mptcp_lib_get_counter ${ns1} "MPTcpExtEchoAdd") + count=$(mptcp_lib_get_counter ${ns_tx} "MPTcpExtEchoAdd") if [ -z "$count" ]; then print_skip elif [ "$count" != "$echo_nr" ]; then @@ -1593,7 +1603,7 @@ chk_add_nr()
if [ $port_nr -gt 0 ]; then print_check "pt" - count=$(mptcp_lib_get_counter ${ns2} "MPTcpExtPortAdd") + count=$(mptcp_lib_get_counter ${ns_rx} "MPTcpExtPortAdd") if [ -z "$count" ]; then print_skip elif [ "$count" != "$port_nr" ]; then @@ -1603,7 +1613,7 @@ chk_add_nr() fi
print_check "syn" - count=$(mptcp_lib_get_counter ${ns1} "MPTcpExtMPJoinPortSynRx") + count=$(mptcp_lib_get_counter ${ns_tx} "MPTcpExtMPJoinPortSynRx") if [ -z "$count" ]; then print_skip elif [ "$count" != "$syn_nr" ]; then @@ -1614,7 +1624,7 @@ chk_add_nr() fi
print_check "synack" - count=$(mptcp_lib_get_counter ${ns2} "MPTcpExtMPJoinPortSynAckRx") + count=$(mptcp_lib_get_counter ${ns_rx} "MPTcpExtMPJoinPortSynAckRx") if [ -z "$count" ]; then print_skip elif [ "$count" != "$syn_ack_nr" ]; then @@ -1625,7 +1635,7 @@ chk_add_nr() fi
print_check "ack" - count=$(mptcp_lib_get_counter ${ns1} "MPTcpExtMPJoinPortAckRx") + count=$(mptcp_lib_get_counter ${ns_tx} "MPTcpExtMPJoinPortAckRx") if [ -z "$count" ]; then print_skip elif [ "$count" != "$ack_nr" ]; then @@ -1636,7 +1646,7 @@ chk_add_nr() fi
print_check "syn" - count=$(mptcp_lib_get_counter ${ns1} "MPTcpExtMismatchPortSynRx") + count=$(mptcp_lib_get_counter ${ns_tx} "MPTcpExtMismatchPortSynRx") if [ -z "$count" ]; then print_skip elif [ "$count" != "$mis_syn_nr" ]; then @@ -1647,7 +1657,7 @@ chk_add_nr() fi
print_check "ack" - count=$(mptcp_lib_get_counter ${ns1} "MPTcpExtMismatchPortAckRx") + count=$(mptcp_lib_get_counter ${ns_tx} "MPTcpExtMismatchPortAckRx") if [ -z "$count" ]; then print_skip elif [ "$count" != "$mis_ack_nr" ]; then @@ -1657,6 +1667,8 @@ chk_add_nr() print_ok fi fi + + print_info "$extra_msg" }
chk_add_tx_nr()
commit 4d2868b5d191c74262f7407972d68d1bf3245d6a upstream.
It should be quite uncommon to set both the subflow and the signal flags: the initiator of the connection is typically the one creating new subflows, not the other peer, then no need to announce additional local addresses, and use it to create subflows.
But some people might be confused about the flags, and set both "just to be sure at least the right one is set". To verify the previous fix, and avoid future regressions, this specific case is now validated: the client announces a new address, and initiates a new subflow from the same address.
While working on this, another bug has been noticed, where the client reset the new subflow because an ADD_ADDR echo got received as the 3rd ACK: this new test also explicitly checks that no RST have been sent by the client and server.
The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID.
Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk") Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-s... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index ede3661607ef..b16b8278c4ce 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -2133,6 +2133,21 @@ signal_address_tests() chk_add_nr 1 1 fi
+ # uncommon: subflow and signal flags on the same endpoint + # or because the user wrongly picked both, but still expects the client + # to create additional subflows + if reset "subflow and signal together"; then + pm_nl_set_limits $ns1 0 2 + pm_nl_set_limits $ns2 0 2 + pm_nl_add_endpoint $ns2 10.0.3.2 flags signal,subflow + run_tests $ns1 $ns2 10.0.1.1 + chk_join_nr 1 1 1 + chk_add_nr 1 1 0 invert # only initiated by ns2 + chk_add_nr 0 0 0 # none initiated by ns1 + chk_rst_nr 0 0 invert # no RST sent by the client + chk_rst_nr 0 0 # no RST sent by the server + fi + # accept and use add_addr with additional subflows if reset "multiple subflows and signal"; then pm_nl_set_limits $ns1 0 3
linux-stable-mirror@lists.linaro.org