[PATCH net 0/6] mptcp: fixes for 6.4

List overview All Threads
Download

newer

older

[mst@redhat.com: [PATCH v2] Revert...

[PATCH] drm/amd/display: perform a...

Matthieu Baerts

20 Jun 2023 20 Jun '23

4:24 p.m.

Patch 1 correctly handles disconnect() failures that can happen in some specific cases: now the socket state is set as unconnected as expected. That fixes an issue introduced in v6.2.

Patch 2 fixes a divide by zero bug in mptcp_recvmsg() with a fix similar to a recent one from Eric Dumazet for TCP introducing sk_wait_pending flag. It should address an issue present in MPTCP from almost the beginning, from v5.9.

Patch 3 fixes a possible list corruption on passive MPJ even if the race seems very unlikely, better be safe than sorry. The possible issue is present from v5.17.

Patch 4 consolidates fallback and non fallback state machines to avoid leaking some MPTCP sockets. The fix is likely needed for versions from v5.11.

Patch 5 drops code that is no longer used after the introduction of patch 4/6. This is not really a fix but this patch can probably land in the -net tree as well not to leave unused code.

Patch 6 ensures listeners are unhashed before updating their sk status to avoid possible deadlocks when diag info are going to be retrieved with a lock. Even if it should not be visible with the way we are currently getting diag info, the issue is present from v5.17.

Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net --- Paolo Abeni (6): mptcp: handle correctly disconnect() failures mptcp: fix possible divide by zero in recvmsg() mptcp: fix possible list corruption on passive MPJ mptcp: consolidate fallback and non fallback state machine mptcp: drop legacy code around RX EOF mptcp: ensure listener is unhashed before updating the sk status

net/mptcp/pm_netlink.c | 1 + net/mptcp/protocol.c | 160 ++++++++++++++++++++----------------------------- net/mptcp/protocol.h | 5 +- net/mptcp/subflow.c | 17 +++--- 4 files changed, 76 insertions(+), 107 deletions(-) --- base-commit: 9a43827e876c9a071826cc81783aa2222b020f1d change-id: 20230620-upstream-net-20230620-misc-fixes-for-v6-4-55ef43802324

Best regards,

-- Matthieu Baerts matthieu.baerts@tessares.net

Show replies by date

Matthieu Baerts

20 Jun 20 Jun

4:24 p.m.

New subject: [PATCH net 1/6] mptcp: handle correctly disconnect() failures

From: Paolo Abeni pabeni@redhat.com

Currently the mptcp code has assumes that disconnect() can fail only at mptcp_sendmsg_fastopen() time - to avoid a deadlock scenario - and don't even bother returning an error code.

Soon mptcp_disconnect() will handle more error conditions: let's track them explicitly.

As a bonus, explicitly annotate TCP-level disconnect as not failing: the mptcp code never blocks for event on the subflows.

Fixes: 7d803344fdc3 ("mptcp: fix deadlock in fastopen error path") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Tested-by: Christoph Paasch cpaasch@apple.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net --- net/mptcp/protocol.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 67311e7d5b21..86f8a7621aff 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1727,7 +1727,13 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, if (ret && ret != -EINPROGRESS && ret != -ERESTARTSYS && ret != -EINTR) *copied_syn = 0; } else if (ret && ret != -EINPROGRESS) { - mptcp_disconnect(sk, 0); + /* The disconnect() op called by tcp_sendmsg_fastopen()/ + * __inet_stream_connect() can fail, due to looking check, + * see mptcp_disconnect(). + * Attempt it again outside the problematic scope. + */ + if (!mptcp_disconnect(sk, 0)) + sk->sk_socket->state = SS_UNCONNECTED; } inet_sk(sk)->defer_connect = 0;

@@ -2389,7 +2395,10 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,

need_push = (flags & MPTCP_CF_PUSH) && __mptcp_retransmit_pending_data(sk); if (!dispose_it) { - tcp_disconnect(ssk, 0); + /* The MPTCP code never wait on the subflow sockets, TCP-level + * disconnect should never fail + */ + WARN_ON_ONCE(tcp_disconnect(ssk, 0)); msk->subflow->state = SS_UNCONNECTED; mptcp_subflow_ctx_reset(subflow); release_sock(ssk); @@ -2812,7 +2821,7 @@ void mptcp_subflow_shutdown(struct sock *sk, struct sock *ssk, int how) break; fallthrough; case TCP_SYN_SENT: - tcp_disconnect(ssk, O_NONBLOCK); + WARN_ON_ONCE(tcp_disconnect(ssk, O_NONBLOCK)); break; default: if (__mptcp_check_fallback(mptcp_sk(sk))) { @@ -3075,11 +3084,10 @@ static int mptcp_disconnect(struct sock *sk, int flags)

/* We are on the fastopen error path. We can't call straight into the * subflows cleanup code due to lock nesting (we are already under - * msk->firstsocket lock). Do nothing and leave the cleanup to the - * caller. + * msk->firstsocket lock). */ if (msk->fastopening) - return 0; + return -EBUSY;

mptcp_listen_inuse_dec(sk); inet_sk_state_store(sk, TCP_CLOSE);

-- 2.40.1

Matthieu Baerts

4:24 p.m.

New subject: [PATCH net 2/6] mptcp: fix possible divide by zero in recvmsg()

From: Paolo Abeni pabeni@redhat.com

Christoph reported a divide by zero bug in mptcp_recvmsg():

divide error: 0000 [#1] PREEMPT SMP CPU: 1 PID: 19978 Comm: syz-executor.6 Not tainted 6.4.0-rc2-gffcc7899081b #20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:__tcp_select_window+0x30e/0x420 net/ipv4/tcp_output.c:3018 Code: 11 ff 0f b7 cd c1 e9 0c b8 ff ff ff ff d3 e0 89 c1 f7 d1 01 cb 21 c3 eb 17 e8 2e 83 11 ff 31 db eb 0e e8 25 83 11 ff 89 d8 99 <f7> 7c 24 04 29 d3 65 48 8b 04 25 28 00 00 00 48 3b 44 24 10 75 60 RSP: 0018:ffffc90000a07a18 EFLAGS: 00010246 RAX: 000000000000ffd7 RBX: 000000000000ffd7 RCX: 0000000000040000 RDX: 0000000000000000 RSI: 000000000003ffff RDI: 0000000000040000 RBP: 000000000000ffd7 R08: ffffffff820cf297 R09: 0000000000000001 R10: 0000000000000000 R11: ffffffff8103d1a0 R12: 0000000000003f00 R13: 0000000000300000 R14: ffff888101cf3540 R15: 0000000000180000 FS: 00007f9af4c09640(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b33824000 CR3: 000000012f241001 CR4: 0000000000170ee0 Call Trace: <TASK> __tcp_cleanup_rbuf+0x138/0x1d0 net/ipv4/tcp.c:1611 mptcp_recvmsg+0xcb8/0xdd0 net/mptcp/protocol.c:2034 inet_recvmsg+0x127/0x1f0 net/ipv4/af_inet.c:861 ____sys_recvmsg+0x269/0x2b0 net/socket.c:1019 ___sys_recvmsg+0xe6/0x260 net/socket.c:2764 do_recvmmsg+0x1a5/0x470 net/socket.c:2858 __do_sys_recvmmsg net/socket.c:2937 [inline] __se_sys_recvmmsg net/socket.c:2953 [inline] __x64_sys_recvmmsg+0xa6/0x130 net/socket.c:2953 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x47/0xa0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7f9af58fc6a9 Code: 5c c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4f 37 0d 00 f7 d8 64 89 01 48 RSP: 002b:00007f9af4c08cd8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 00000000006bc050 RCX: 00007f9af58fc6a9 RDX: 0000000000000001 RSI: 0000000020000140 RDI: 0000000000000004 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000f00 R11: 0000000000000246 R12: 00000000006bc05c R13: fffffffffffffea8 R14: 00000000006bc050 R15: 000000000001fe40 </TASK>

mptcp_recvmsg is allowed to release the msk socket lock when blocking, and before re-acquiring it another thread could have switched the sock to TCP_LISTEN status - with a prior connect(AF_UNSPEC) - also clearing icsk_ack.rcv_mss.

Address the issue preventing the disconnect if some other process is concurrently performing a blocking syscall on the same socket, alike commit 4faeee0cf8a5 ("tcp: deny tcp_disconnect() when threads are waiting").

Fixes: a6b118febbab ("mptcp: add receive buffer auto-tuning") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch cpaasch@apple.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/404 Signed-off-by: Paolo Abeni pabeni@redhat.com Tested-by: Christoph Paasch cpaasch@apple.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net --- net/mptcp/protocol.c | 7 +++++++ 1 file changed, 7 insertions(+)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 86f8a7621aff..ee357700b27b 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -3082,6 +3082,12 @@ static int mptcp_disconnect(struct sock *sk, int flags) { struct mptcp_sock *msk = mptcp_sk(sk);

+ /* Deny disconnect if other threads are blocked in sk_wait_event() + * or inet_wait_for_connect(). + */ + if (sk->sk_wait_pending) + return -EBUSY; + /* We are on the fastopen error path. We can't call straight into the * subflows cleanup code due to lock nesting (we are already under * msk->firstsocket lock). @@ -3148,6 +3154,7 @@ struct sock *mptcp_sk_clone_init(const struct sock *sk, inet_sk(nsk)->pinet6 = mptcp_inet6_sk(nsk); #endif

+ nsk->sk_wait_pending = 0; __mptcp_init_sock(nsk);

msk = mptcp_sk(nsk);

-- 2.40.1

Matthieu Baerts

4:24 p.m.

New subject: [PATCH net 3/6] mptcp: fix possible list corruption on passive MPJ

From: Paolo Abeni pabeni@redhat.com

At passive MPJ time, if the msk socket lock is held by the user, the new subflow is appended to the msk->join_list under the msk data lock.

In mptcp_release_cb()/__mptcp_flush_join_list(), the subflows in that list are moved from the join_list into the conn_list under the msk socket lock.

Append and removal could race, possibly corrupting such list. Address the issue splicing the join list into a temporary one while still under the msk data lock.

Found by code inspection, the race itself should be almost impossible to trigger in practice.

Fixes: 3e5014909b56 ("mptcp: cleanup MPJ subflow list handling") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net --- net/mptcp/protocol.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index ee357700b27b..9a40dae31cec 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -850,12 +850,12 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk) return true; }

-static void __mptcp_flush_join_list(struct sock *sk) +static void __mptcp_flush_join_list(struct sock *sk, struct list_head *join_list) { struct mptcp_subflow_context *tmp, *subflow; struct mptcp_sock *msk = mptcp_sk(sk);

- list_for_each_entry_safe(subflow, tmp, &msk->join_list, node) { + list_for_each_entry_safe(subflow, tmp, join_list, node) { struct sock *ssk = mptcp_subflow_tcp_sock(subflow); bool slow = lock_sock_fast(ssk);

@@ -3342,9 +3342,14 @@ static void mptcp_release_cb(struct sock *sk) for (;;) { unsigned long flags = (msk->cb_flags & MPTCP_FLAGS_PROCESS_CTX_NEED) | msk->push_pending; + struct list_head join_list; + if (!flags) break;

+ INIT_LIST_HEAD(&join_list); + list_splice_init(&msk->join_list, &join_list); + /* the following actions acquire the subflow socket lock * * 1) can't be invoked in atomic scope @@ -3355,8 +3360,9 @@ static void mptcp_release_cb(struct sock *sk) msk->push_pending = 0; msk->cb_flags &= ~flags; spin_unlock_bh(&sk->sk_lock.slock); + if (flags & BIT(MPTCP_FLUSH_JOIN_LIST)) - __mptcp_flush_join_list(sk); + __mptcp_flush_join_list(sk, &join_list); if (flags & BIT(MPTCP_PUSH_PENDING)) __mptcp_push_pending(sk, 0); if (flags & BIT(MPTCP_RETRANSMIT))

-- 2.40.1

Matthieu Baerts

4:24 p.m.

New subject: [PATCH net 4/6] mptcp: consolidate fallback and non fallback state machine

From: Paolo Abeni pabeni@redhat.com

An orphaned msk releases the used resources via the worker, when the latter first see the msk in CLOSED status.

If the msk status transitions to TCP_CLOSE in the release callback invoked by the worker's final release_sock(), such instance of the workqueue will not take any action.

Additionally the MPTCP code prevents scheduling the worker once the socket reaches the CLOSE status: such msk resources will be leaked.

The only code path that can trigger the above scenario is the __mptcp_check_send_data_fin() in fallback mode.

Address the issue removing the special handling of fallback socket in __mptcp_check_send_data_fin(), consolidating the state machine for fallback and non fallback socket.

Since non-fallback sockets do not send and do not receive data_fin, the mptcp code can update the msk internal status to match the next step in the SM every time data fin (ack) should be generated or received.

As a consequence we can remove a bunch of checks for fallback from the fastpath.

Fixes: 6e628cd3a8f7 ("mptcp: use mptcp release_cb for delayed tasks") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net --- net/mptcp/protocol.c | 41 +++++++++++++++-------------------------- net/mptcp/subflow.c | 17 ++++++++++------- 2 files changed, 25 insertions(+), 33 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 9a40dae31cec..27d206f7af62 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -44,7 +44,7 @@ enum { static struct percpu_counter mptcp_sockets_allocated ____cacheline_aligned_in_smp;

static void __mptcp_destroy_sock(struct sock *sk); -static void __mptcp_check_send_data_fin(struct sock *sk); +static void mptcp_check_send_data_fin(struct sock *sk);

DEFINE_PER_CPU(struct mptcp_delegated_action, mptcp_delegated_actions); static struct net_device mptcp_napi_dev; @@ -424,8 +424,7 @@ static bool mptcp_pending_data_fin_ack(struct sock *sk) { struct mptcp_sock *msk = mptcp_sk(sk);

- return !__mptcp_check_fallback(msk) && - ((1 << sk->sk_state) & + return ((1 << sk->sk_state) & (TCPF_FIN_WAIT1 | TCPF_CLOSING | TCPF_LAST_ACK)) && msk->write_seq == READ_ONCE(msk->snd_una); } @@ -583,9 +582,6 @@ static bool mptcp_check_data_fin(struct sock *sk) u64 rcv_data_fin_seq; bool ret = false;

- if (__mptcp_check_fallback(msk)) - return ret; - /* Need to ack a DATA_FIN received from a peer while this side * of the connection is in ESTABLISHED, FIN_WAIT1, or FIN_WAIT2. * msk->rcv_data_fin was set when parsing the incoming options @@ -623,7 +619,8 @@ static bool mptcp_check_data_fin(struct sock *sk) }

ret = true; - mptcp_send_ack(msk); + if (!__mptcp_check_fallback(msk)) + mptcp_send_ack(msk); mptcp_close_wake_up(sk); } return ret; @@ -1609,7 +1606,7 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags) if (!mptcp_timer_pending(sk)) mptcp_reset_timer(sk); if (do_check_data_fin) - __mptcp_check_send_data_fin(sk); + mptcp_check_send_data_fin(sk); }

static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk, bool first) @@ -2680,8 +2677,6 @@ static void mptcp_worker(struct work_struct *work) if (unlikely((1 << state) & (TCPF_CLOSE | TCPF_LISTEN))) goto unlock;

- mptcp_check_data_fin_ack(sk); - mptcp_check_fastclose(msk);

mptcp_pm_nl_work(msk); @@ -2689,7 +2684,8 @@ static void mptcp_worker(struct work_struct *work) if (test_and_clear_bit(MPTCP_WORK_EOF, &msk->flags)) mptcp_check_for_eof(msk);

- __mptcp_check_send_data_fin(sk); + mptcp_check_send_data_fin(sk); + mptcp_check_data_fin_ack(sk); mptcp_check_data_fin(sk);

if (test_and_clear_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags)) @@ -2828,6 +2824,12 @@ void mptcp_subflow_shutdown(struct sock *sk, struct sock *ssk, int how) pr_debug("Fallback"); ssk->sk_shutdown |= how; tcp_shutdown(ssk, how); + + /* simulate the data_fin ack reception to let the state + * machine move forward + */ + WRITE_ONCE(mptcp_sk(sk)->snd_una, mptcp_sk(sk)->snd_nxt); + mptcp_schedule_work(sk); } else { pr_debug("Sending DATA_FIN on subflow %p", ssk); tcp_send_ack(ssk); @@ -2867,7 +2869,7 @@ static int mptcp_close_state(struct sock *sk) return next & TCP_ACTION_FIN; }

-static void __mptcp_check_send_data_fin(struct sock *sk) +static void mptcp_check_send_data_fin(struct sock *sk) { struct mptcp_subflow_context *subflow; struct mptcp_sock *msk = mptcp_sk(sk); @@ -2885,19 +2887,6 @@ static void __mptcp_check_send_data_fin(struct sock *sk)

WRITE_ONCE(msk->snd_nxt, msk->write_seq);

- /* fallback socket will not get data_fin/ack, can move to the next - * state now - */ - if (__mptcp_check_fallback(msk)) { - WRITE_ONCE(msk->snd_una, msk->write_seq); - if ((1 << sk->sk_state) & (TCPF_CLOSING | TCPF_LAST_ACK)) { - inet_sk_state_store(sk, TCP_CLOSE); - mptcp_close_wake_up(sk); - } else if (sk->sk_state == TCP_FIN_WAIT1) { - inet_sk_state_store(sk, TCP_FIN_WAIT2); - } - } - mptcp_for_each_subflow(msk, subflow) { struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow);

@@ -2917,7 +2906,7 @@ static void __mptcp_wr_shutdown(struct sock *sk) WRITE_ONCE(msk->write_seq, msk->write_seq + 1); WRITE_ONCE(msk->snd_data_fin_enable, 1);

- __mptcp_check_send_data_fin(sk); + mptcp_check_send_data_fin(sk); }

static void __mptcp_destroy_sock(struct sock *sk) diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 4688daa6b38b..d9c8b21c6076 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1749,14 +1749,16 @@ static void subflow_state_change(struct sock *sk) { struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); struct sock *parent = subflow->conn; + struct mptcp_sock *msk;

__subflow_state_change(sk);

+ msk = mptcp_sk(parent); if (subflow_simultaneous_connect(sk)) { mptcp_propagate_sndbuf(parent, sk); mptcp_do_fallback(sk); - mptcp_rcv_space_init(mptcp_sk(parent), sk); - pr_fallback(mptcp_sk(parent)); + mptcp_rcv_space_init(msk, sk); + pr_fallback(msk); subflow->conn_finished = 1; mptcp_set_connected(parent); } @@ -1772,11 +1774,12 @@ static void subflow_state_change(struct sock *sk)

subflow_sched_work_if_closed(mptcp_sk(parent), sk);

- if (__mptcp_check_fallback(mptcp_sk(parent)) && - !subflow->rx_eof && subflow_is_done(sk)) { - subflow->rx_eof = 1; - mptcp_subflow_eof(parent); - } + /* when the fallback subflow closes the rx side, trigger a 'dummy' + * ingress data fin, so that the msk state will follow along + */ + if (__mptcp_check_fallback(msk) && subflow_is_done(sk) && msk->first == sk && + mptcp_update_rcv_data_fin(msk, READ_ONCE(msk->ack_seq), true)) + mptcp_schedule_work(parent); }

void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_ssk)

-- 2.40.1

Matthieu Baerts

4:24 p.m.

New subject: [PATCH net 6/6] mptcp: ensure listener is unhashed before updating the sk status

From: Paolo Abeni pabeni@redhat.com

The MPTCP protocol access the listener subflow in a lockless manner in a couple of places (poll, diag). That works only if the msk itself leaves the listener status only after that the subflow itself has been closed/disconnected. Otherwise we risk deadlock in diag, as reported by Christoph.

Address the issue ensuring that the first subflow (the listener one) is always disconnected before updating the msk socket status.

Reported-by: Christoph Paasch cpaasch@apple.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/407 Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net --- net/mptcp/pm_netlink.c | 1 + net/mptcp/protocol.c | 31 +++++++++++++++++++------------ 2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 59f8f3124855..1224dfca5bf3 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -1047,6 +1047,7 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk, if (err) return err;

+ inet_sk_state_store(newsk, TCP_LISTEN); err = kernel_listen(ssock, backlog); if (err) return err; diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index a66ec341485e..a6c7f2d24909 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2368,13 +2368,6 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, kfree_rcu(subflow, rcu); } else { /* otherwise tcp will dispose of the ssk and subflow ctx */ - if (ssk->sk_state == TCP_LISTEN) { - tcp_set_state(ssk, TCP_CLOSE); - mptcp_subflow_queue_clean(sk, ssk); - inet_csk_listen_stop(ssk); - mptcp_event_pm_listener(ssk, MPTCP_EVENT_LISTENER_CLOSED); - } - __tcp_close(ssk, 0);

/* close acquired an extra ref */ @@ -2902,10 +2895,24 @@ static __poll_t mptcp_check_readable(struct mptcp_sock *msk) return EPOLLIN | EPOLLRDNORM; }

-static void mptcp_listen_inuse_dec(struct sock *sk) +static void mptcp_check_listen_stop(struct sock *sk) { - if (inet_sk_state_load(sk) == TCP_LISTEN) - sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1); + struct sock *ssk; + + if (inet_sk_state_load(sk) != TCP_LISTEN) + return; + + sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1); + ssk = mptcp_sk(sk)->first; + if (WARN_ON_ONCE(!ssk || inet_sk_state_load(ssk) != TCP_LISTEN)) + return; + + lock_sock_nested(ssk, SINGLE_DEPTH_NESTING); + mptcp_subflow_queue_clean(sk, ssk); + inet_csk_listen_stop(ssk); + mptcp_event_pm_listener(ssk, MPTCP_EVENT_LISTENER_CLOSED); + tcp_set_state(ssk, TCP_CLOSE); + release_sock(ssk); }

bool __mptcp_close(struct sock *sk, long timeout) @@ -2918,7 +2925,7 @@ bool __mptcp_close(struct sock *sk, long timeout) WRITE_ONCE(sk->sk_shutdown, SHUTDOWN_MASK);

if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)) { - mptcp_listen_inuse_dec(sk); + mptcp_check_listen_stop(sk); inet_sk_state_store(sk, TCP_CLOSE); goto cleanup; } @@ -3035,7 +3042,7 @@ static int mptcp_disconnect(struct sock *sk, int flags) if (msk->fastopening) return -EBUSY;

- mptcp_listen_inuse_dec(sk); + mptcp_check_listen_stop(sk); inet_sk_state_store(sk, TCP_CLOSE);

mptcp_stop_timer(sk);

-- 2.40.1

patchwork-bot+netdevbpf＠kernel.org

22 Jun 22 Jun

5:50 a.m.

Hello:

This series was applied to netdev/net.git (main) by Jakub Kicinski kuba@kernel.org:

On Tue, 20 Jun 2023 18:24:17 +0200 you wrote:

...

Patch 1 correctly handles disconnect() failures that can happen in some specific cases: now the socket state is set as unconnected as expected. That fixes an issue introduced in v6.2.

Patch 2 fixes a divide by zero bug in mptcp_recvmsg() with a fix similar to a recent one from Eric Dumazet for TCP introducing sk_wait_pending flag. It should address an issue present in MPTCP from almost the beginning, from v5.9.

[...]

Here is the summary with links: - [net,1/6] mptcp: handle correctly disconnect() failures https://git.kernel.org/netdev/net/c/c2b2ae3925b6 - [net,2/6] mptcp: fix possible divide by zero in recvmsg() https://git.kernel.org/netdev/net/c/0ad529d9fd2b - [net,3/6] mptcp: fix possible list corruption on passive MPJ https://git.kernel.org/netdev/net/c/56a666c48b03 - [net,4/6] mptcp: consolidate fallback and non fallback state machine https://git.kernel.org/netdev/net/c/81c1d0290160 - [net,5/6] mptcp: drop legacy code around RX EOF https://git.kernel.org/netdev/net/c/b7535cfed223 - [net,6/6] mptcp: ensure listener is unhashed before updating the sk status https://git.kernel.org/netdev/net/c/57fc0f1ceaa4

You are awesome, thank you!

-- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html

913

days inactive

915

days old

linux-stable-mirror@lists.linaro.org

6 comments

participants

tags (0)

participants (2)

Matthieu Baerts
patchwork-bot+netdevbpf＠kernel.org