August 2022 - Linux-stable-mirror

FAILED: patch "[PATCH] can: mcp251x: Fix race condition on receive interrupt" failed to apply to 4.19-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.19-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From d80d60b0db6ff3dd2e29247cc2a5166d7e9ae37e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Sebastian=20W=C3=BCrl?= <sebastian.wuerl(a)ororatech.com> Date: Thu, 4 Aug 2022 10:14:11 +0200 Subject: [PATCH] can: mcp251x: Fix race condition on receive interrupt MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The mcp251x driver uses both receiving mailboxes of the CAN controller chips. For retrieving the CAN frames from the controller via SPI, it checks once per interrupt which mailboxes have been filled and will retrieve the messages accordingly. This introduces a race condition, as another CAN frame can enter mailbox 1 while mailbox 0 is emptied. If now another CAN frame enters mailbox 0 until the interrupt handler is called next, mailbox 0 is emptied before mailbox 1, leading to out-of-order CAN frames in the network device. This is fixed by checking the interrupt flags once again after freeing mailbox 0, to correctly also empty mailbox 1 before leaving the handler. For reproducing the bug I created the following setup: - Two CAN devices, one Raspberry Pi with MCP2515, the other can be any. - Setup CAN to 1 MHz - Spam bursts of 5 CAN-messages with increasing CAN-ids - Continue sending the bursts while sleeping a second between the bursts - Check on the RPi whether the received messages have increasing CAN-ids - Without this patch, every burst of messages will contain a flipped pair v3: https://lore.kernel.org/all/20220804075914.67569-1-sebastian.wuerl@ororatec… v2: https://lore.kernel.org/all/20220804064803.63157-1-sebastian.wuerl@ororatec… v1: https://lore.kernel.org/all/20220803153300.58732-1-sebastian.wuerl@ororatec… Fixes: bf66f3736a94 ("can: mcp251x: Move to threaded interrupts instead of workqueues.") Signed-off-by: Sebastian Würl <sebastian.wuerl(a)ororatech.com> Link: https://lore.kernel.org/all/20220804081411.68567-1-sebastian.wuerl@ororatec… [mkl: reduce scope of intf1, eflag1] Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de> diff --git a/drivers/net/can/spi/mcp251x.c b/drivers/net/can/spi/mcp251x.c index e750d13c8841..c320de474f40 100644 --- a/drivers/net/can/spi/mcp251x.c +++ b/drivers/net/can/spi/mcp251x.c @@ -1070,9 +1070,6 @@ static irqreturn_t mcp251x_can_ist(int irq, void *dev_id) mcp251x_read_2regs(spi, CANINTF, &intf, &eflag); - /* mask out flags we don't care about */ - intf &= CANINTF_RX | CANINTF_TX | CANINTF_ERR; - /* receive buffer 0 */ if (intf & CANINTF_RX0IF) { mcp251x_hw_rx(spi, 0); @@ -1082,6 +1079,18 @@ static irqreturn_t mcp251x_can_ist(int irq, void *dev_id) if (mcp251x_is_2510(spi)) mcp251x_write_bits(spi, CANINTF, CANINTF_RX0IF, 0x00); + + /* check if buffer 1 is already known to be full, no need to re-read */ + if (!(intf & CANINTF_RX1IF)) { + u8 intf1, eflag1; + + /* intf needs to be read again to avoid a race condition */ + mcp251x_read_2regs(spi, CANINTF, &intf1, &eflag1); + + /* combine flags from both operations for error handling */ + intf |= intf1; + eflag |= eflag1; + } } /* receive buffer 1 */ @@ -1092,6 +1101,9 @@ static irqreturn_t mcp251x_can_ist(int irq, void *dev_id) clear_intf |= CANINTF_RX1IF; } + /* mask out flags we don't care about */ + intf &= CANINTF_RX | CANINTF_TX | CANINTF_ERR; + /* any error or tx interrupt we need to clear? */ if (intf & (CANINTF_ERR | CANINTF_TX)) clear_intf |= intf & (CANINTF_ERR | CANINTF_TX);

2 years, 11 months

1
0
0 0

FAILED: patch "[PATCH] can: j1939: j1939_session_destroy(): fix memory leak of skbs" failed to apply to 5.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 8c21c54a53ab21842f5050fa090f26b03c0313d6 Mon Sep 17 00:00:00 2001 From: Fedor Pchelkin <pchelkin(a)ispras.ru> Date: Fri, 5 Aug 2022 18:02:16 +0300 Subject: [PATCH] can: j1939: j1939_session_destroy(): fix memory leak of skbs We need to drop skb references taken in j1939_session_skb_queue() when destroying a session in j1939_session_destroy(). Otherwise those skbs would be lost. Link to Syzkaller info and repro: https://forge.ispras.ru/issues/11743. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. V1: https://lore.kernel.org/all/20220708175949.539064-1-pchelkin@ispras.ru Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Suggested-by: Oleksij Rempel <o.rempel(a)pengutronix.de> Signed-off-by: Fedor Pchelkin <pchelkin(a)ispras.ru> Signed-off-by: Alexey Khoroshilov <khoroshilov(a)ispras.ru> Acked-by: Oleksij Rempel <o.rempel(a)pengutronix.de> Link: https://lore.kernel.org/all/20220805150216.66313-1-pchelkin@ispras.ru Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de> diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c index 307ee1174a6e..d7d86c944d76 100644 --- a/net/can/j1939/transport.c +++ b/net/can/j1939/transport.c @@ -260,6 +260,8 @@ static void __j1939_session_drop(struct j1939_session *session) static void j1939_session_destroy(struct j1939_session *session) { + struct sk_buff *skb; + if (session->transmission) { if (session->err) j1939_sk_errqueue(session, J1939_ERRQUEUE_TX_ABORT); @@ -274,7 +276,11 @@ static void j1939_session_destroy(struct j1939_session *session) WARN_ON_ONCE(!list_empty(&session->sk_session_queue_entry)); WARN_ON_ONCE(!list_empty(&session->active_session_list_entry)); - skb_queue_purge(&session->skb_queue); + while ((skb = skb_dequeue(&session->skb_queue)) != NULL) { + /* drop ref taken in j1939_session_skb_queue() */ + skb_unref(skb); + kfree_skb(skb); + } __j1939_session_drop(session); j1939_priv_put(session->priv); kfree(session);

2 years, 11 months

1
0
0 0

FAILED: patch "[PATCH] can: j1939: j1939_session_destroy(): fix memory leak of skbs" failed to apply to 5.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 8c21c54a53ab21842f5050fa090f26b03c0313d6 Mon Sep 17 00:00:00 2001 From: Fedor Pchelkin <pchelkin(a)ispras.ru> Date: Fri, 5 Aug 2022 18:02:16 +0300 Subject: [PATCH] can: j1939: j1939_session_destroy(): fix memory leak of skbs We need to drop skb references taken in j1939_session_skb_queue() when destroying a session in j1939_session_destroy(). Otherwise those skbs would be lost. Link to Syzkaller info and repro: https://forge.ispras.ru/issues/11743. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. V1: https://lore.kernel.org/all/20220708175949.539064-1-pchelkin@ispras.ru Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Suggested-by: Oleksij Rempel <o.rempel(a)pengutronix.de> Signed-off-by: Fedor Pchelkin <pchelkin(a)ispras.ru> Signed-off-by: Alexey Khoroshilov <khoroshilov(a)ispras.ru> Acked-by: Oleksij Rempel <o.rempel(a)pengutronix.de> Link: https://lore.kernel.org/all/20220805150216.66313-1-pchelkin@ispras.ru Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de> diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c index 307ee1174a6e..d7d86c944d76 100644 --- a/net/can/j1939/transport.c +++ b/net/can/j1939/transport.c @@ -260,6 +260,8 @@ static void __j1939_session_drop(struct j1939_session *session) static void j1939_session_destroy(struct j1939_session *session) { + struct sk_buff *skb; + if (session->transmission) { if (session->err) j1939_sk_errqueue(session, J1939_ERRQUEUE_TX_ABORT); @@ -274,7 +276,11 @@ static void j1939_session_destroy(struct j1939_session *session) WARN_ON_ONCE(!list_empty(&session->sk_session_queue_entry)); WARN_ON_ONCE(!list_empty(&session->active_session_list_entry)); - skb_queue_purge(&session->skb_queue); + while ((skb = skb_dequeue(&session->skb_queue)) != NULL) { + /* drop ref taken in j1939_session_skb_queue() */ + skb_unref(skb); + kfree_skb(skb); + } __j1939_session_drop(session); j1939_priv_put(session->priv); kfree(session);

2 years, 11 months

1
0
0 0

FAILED: patch "[PATCH] selftests: mptcp: make sendfile selftest work" failed to apply to 5.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From df9e03aec3b14970df05b72d54f8ac9da3ab29e1 Mon Sep 17 00:00:00 2001 From: Florian Westphal <fw(a)strlen.de> Date: Thu, 4 Aug 2022 17:21:27 -0700 Subject: [PATCH] selftests: mptcp: make sendfile selftest work When the selftest got added, sendfile() on mptcp sockets returned -EOPNOTSUPP, so running 'mptcp_connect.sh -m sendfile' failed immediately. This is no longer the case, but the script fails anyway due to timeout. Let the receiver know once the sender has sent all data, just like with '-m mmap' mode. v2: need to respect cfg_wait too, as pm_userspace.sh relied on -m sendfile to keep the connection open (Mat Martineau) Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp") Reported-by: Xiumei Mu <xmu(a)redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com> Signed-off-by: Florian Westphal <fw(a)strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com> Signed-off-by: David S. Miller <davem(a)davemloft.net> diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c index e2ea6c126c99..24d4e9cb617e 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_connect.c +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c @@ -553,6 +553,18 @@ static void set_nonblock(int fd, bool nonblock) fcntl(fd, F_SETFL, flags & ~O_NONBLOCK); } +static void shut_wr(int fd) +{ + /* Close our write side, ev. give some time + * for address notification and/or checking + * the current status + */ + if (cfg_wait) + usleep(cfg_wait); + + shutdown(fd, SHUT_WR); +} + static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after_out) { struct pollfd fds = { @@ -630,14 +642,7 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after /* ... and peer also closed already */ break; - /* ... but we still receive. - * Close our write side, ev. give some time - * for address notification and/or checking - * the current status - */ - if (cfg_wait) - usleep(cfg_wait); - shutdown(peerfd, SHUT_WR); + shut_wr(peerfd); } else { if (errno == EINTR) continue; @@ -767,7 +772,7 @@ static int copyfd_io_mmap(int infd, int peerfd, int outfd, if (err) return err; - shutdown(peerfd, SHUT_WR); + shut_wr(peerfd); err = do_recvfile(peerfd, outfd); *in_closed_after_out = true; @@ -791,6 +796,9 @@ static int copyfd_io_sendfile(int infd, int peerfd, int outfd, err = do_sendfile(infd, peerfd, size); if (err) return err; + + shut_wr(peerfd); + err = do_recvfile(peerfd, outfd); *in_closed_after_out = true; }

2 years, 11 months

1
0
0 0

FAILED: patch "[PATCH] selftests: mptcp: make sendfile selftest work" failed to apply to 5.15-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From df9e03aec3b14970df05b72d54f8ac9da3ab29e1 Mon Sep 17 00:00:00 2001 From: Florian Westphal <fw(a)strlen.de> Date: Thu, 4 Aug 2022 17:21:27 -0700 Subject: [PATCH] selftests: mptcp: make sendfile selftest work When the selftest got added, sendfile() on mptcp sockets returned -EOPNOTSUPP, so running 'mptcp_connect.sh -m sendfile' failed immediately. This is no longer the case, but the script fails anyway due to timeout. Let the receiver know once the sender has sent all data, just like with '-m mmap' mode. v2: need to respect cfg_wait too, as pm_userspace.sh relied on -m sendfile to keep the connection open (Mat Martineau) Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp") Reported-by: Xiumei Mu <xmu(a)redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com> Signed-off-by: Florian Westphal <fw(a)strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com> Signed-off-by: David S. Miller <davem(a)davemloft.net> diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c index e2ea6c126c99..24d4e9cb617e 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_connect.c +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c @@ -553,6 +553,18 @@ static void set_nonblock(int fd, bool nonblock) fcntl(fd, F_SETFL, flags & ~O_NONBLOCK); } +static void shut_wr(int fd) +{ + /* Close our write side, ev. give some time + * for address notification and/or checking + * the current status + */ + if (cfg_wait) + usleep(cfg_wait); + + shutdown(fd, SHUT_WR); +} + static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after_out) { struct pollfd fds = { @@ -630,14 +642,7 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after /* ... and peer also closed already */ break; - /* ... but we still receive. - * Close our write side, ev. give some time - * for address notification and/or checking - * the current status - */ - if (cfg_wait) - usleep(cfg_wait); - shutdown(peerfd, SHUT_WR); + shut_wr(peerfd); } else { if (errno == EINTR) continue; @@ -767,7 +772,7 @@ static int copyfd_io_mmap(int infd, int peerfd, int outfd, if (err) return err; - shutdown(peerfd, SHUT_WR); + shut_wr(peerfd); err = do_recvfile(peerfd, outfd); *in_closed_after_out = true; @@ -791,6 +796,9 @@ static int copyfd_io_sendfile(int infd, int peerfd, int outfd, err = do_sendfile(infd, peerfd, size); if (err) return err; + + shut_wr(peerfd); + err = do_recvfile(peerfd, outfd); *in_closed_after_out = true; }

2 years, 11 months

1
0
0 0

FAILED: patch "[PATCH] mptcp: do not queue data on closed subflows" failed to apply to 5.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From c886d70286bf3ad411eb3d689328a67f7102c6ae Mon Sep 17 00:00:00 2001 From: Paolo Abeni <pabeni(a)redhat.com> Date: Thu, 4 Aug 2022 17:21:26 -0700 Subject: [PATCH] mptcp: do not queue data on closed subflows Dipanjan reported a syzbot splat at close time: WARNING: CPU: 1 PID: 10818 at net/ipv4/af_inet.c:153 inet_sock_destruct+0x6d0/0x8e0 net/ipv4/af_inet.c:153 Modules linked in: uio_ivshmem(OE) uio(E) CPU: 1 PID: 10818 Comm: kworker/1:16 Tainted: G OE 5.19.0-rc6-g2eae0556bb9d #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Workqueue: events mptcp_worker RIP: 0010:inet_sock_destruct+0x6d0/0x8e0 net/ipv4/af_inet.c:153 Code: 21 02 00 00 41 8b 9c 24 28 02 00 00 e9 07 ff ff ff e8 34 4d 91 f9 89 ee 4c 89 e7 e8 4a 47 60 ff e9 a6 fc ff ff e8 20 4d 91 f9 <0f> 0b e9 84 fe ff ff e8 14 4d 91 f9 0f 0b e9 d4 fd ff ff e8 08 4d RSP: 0018:ffffc9001b35fa78 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000002879d0 RCX: ffff8881326f3b00 RDX: 0000000000000000 RSI: ffff8881326f3b00 RDI: 0000000000000002 RBP: ffff888179662674 R08: ffffffff87e983a0 R09: 0000000000000000 R10: 0000000000000005 R11: 00000000000004ea R12: ffff888179662400 R13: ffff888179662428 R14: 0000000000000001 R15: ffff88817e38e258 FS: 0000000000000000(0000) GS:ffff8881f5f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020007bc0 CR3: 0000000179592000 CR4: 0000000000150ee0 Call Trace: <TASK> __sk_destruct+0x4f/0x8e0 net/core/sock.c:2067 sk_destruct+0xbd/0xe0 net/core/sock.c:2112 __sk_free+0xef/0x3d0 net/core/sock.c:2123 sk_free+0x78/0xa0 net/core/sock.c:2134 sock_put include/net/sock.h:1927 [inline] __mptcp_close_ssk+0x50f/0x780 net/mptcp/protocol.c:2351 __mptcp_destroy_sock+0x332/0x760 net/mptcp/protocol.c:2828 mptcp_worker+0x5d2/0xc90 net/mptcp/protocol.c:2586 process_one_work+0x9cc/0x1650 kernel/workqueue.c:2289 worker_thread+0x623/0x1070 kernel/workqueue.c:2436 kthread+0x2e9/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302 </TASK> The root cause of the problem is that an mptcp-level (re)transmit can race with mptcp_close() and the packet scheduler checks the subflow state before acquiring the socket lock: we can try to (re)transmit on an already closed ssk. Fix the issue checking again the subflow socket status under the subflow socket lock protection. Additionally add the missing check for the fallback-to-tcp case. Fixes: d5f49190def6 ("mptcp: allow picking different xmit subflows") Reported-by: Dipanjan Das <mail.dipanjan.das(a)gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com> Signed-off-by: Paolo Abeni <pabeni(a)redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com> Signed-off-by: David S. Miller <davem(a)davemloft.net> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 07fcc86e1fc9..da4257504fad 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1240,6 +1240,9 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk, info->limit > dfrag->data_len)) return 0; + if (unlikely(!__tcp_can_send(ssk))) + return -EAGAIN; + /* compute send limit */ info->mss_now = tcp_send_mss(ssk, &info->size_goal, info->flags); copy = info->size_goal; @@ -1413,7 +1416,8 @@ static struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk) if (__mptcp_check_fallback(msk)) { if (!msk->first) return NULL; - return sk_stream_memory_free(msk->first) ? msk->first : NULL; + return __tcp_can_send(msk->first) && + sk_stream_memory_free(msk->first) ? msk->first : NULL; } /* re-use last subflow, if the burst allow that */ @@ -1564,6 +1568,8 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags) ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info); if (ret <= 0) { + if (ret == -EAGAIN) + continue; mptcp_push_release(ssk, &info); goto out; } diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 40881a7df5d5..132d50833df1 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -624,16 +624,19 @@ void mptcp_info2sockaddr(const struct mptcp_addr_info *info, struct sockaddr_storage *addr, unsigned short family); -static inline bool __mptcp_subflow_active(struct mptcp_subflow_context *subflow) +static inline bool __tcp_can_send(const struct sock *ssk) { - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); + /* only send if our side has not closed yet */ + return ((1 << inet_sk_state_load(ssk)) & (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)); +} +static inline bool __mptcp_subflow_active(struct mptcp_subflow_context *subflow) +{ /* can't send if JOIN hasn't completed yet (i.e. is usable for mptcp) */ if (subflow->request_join && !subflow->fully_established) return false; - /* only send if our side has not closed yet */ - return ((1 << ssk->sk_state) & (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)); + return __tcp_can_send(mptcp_subflow_tcp_sock(subflow)); } void mptcp_subflow_set_active(struct mptcp_subflow_context *subflow);

2 years, 11 months

1
0
0 0

FAILED: patch "[PATCH] mptcp: do not queue data on closed subflows" failed to apply to 5.15-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From c886d70286bf3ad411eb3d689328a67f7102c6ae Mon Sep 17 00:00:00 2001 From: Paolo Abeni <pabeni(a)redhat.com> Date: Thu, 4 Aug 2022 17:21:26 -0700 Subject: [PATCH] mptcp: do not queue data on closed subflows Dipanjan reported a syzbot splat at close time: WARNING: CPU: 1 PID: 10818 at net/ipv4/af_inet.c:153 inet_sock_destruct+0x6d0/0x8e0 net/ipv4/af_inet.c:153 Modules linked in: uio_ivshmem(OE) uio(E) CPU: 1 PID: 10818 Comm: kworker/1:16 Tainted: G OE 5.19.0-rc6-g2eae0556bb9d #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Workqueue: events mptcp_worker RIP: 0010:inet_sock_destruct+0x6d0/0x8e0 net/ipv4/af_inet.c:153 Code: 21 02 00 00 41 8b 9c 24 28 02 00 00 e9 07 ff ff ff e8 34 4d 91 f9 89 ee 4c 89 e7 e8 4a 47 60 ff e9 a6 fc ff ff e8 20 4d 91 f9 <0f> 0b e9 84 fe ff ff e8 14 4d 91 f9 0f 0b e9 d4 fd ff ff e8 08 4d RSP: 0018:ffffc9001b35fa78 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000002879d0 RCX: ffff8881326f3b00 RDX: 0000000000000000 RSI: ffff8881326f3b00 RDI: 0000000000000002 RBP: ffff888179662674 R08: ffffffff87e983a0 R09: 0000000000000000 R10: 0000000000000005 R11: 00000000000004ea R12: ffff888179662400 R13: ffff888179662428 R14: 0000000000000001 R15: ffff88817e38e258 FS: 0000000000000000(0000) GS:ffff8881f5f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020007bc0 CR3: 0000000179592000 CR4: 0000000000150ee0 Call Trace: <TASK> __sk_destruct+0x4f/0x8e0 net/core/sock.c:2067 sk_destruct+0xbd/0xe0 net/core/sock.c:2112 __sk_free+0xef/0x3d0 net/core/sock.c:2123 sk_free+0x78/0xa0 net/core/sock.c:2134 sock_put include/net/sock.h:1927 [inline] __mptcp_close_ssk+0x50f/0x780 net/mptcp/protocol.c:2351 __mptcp_destroy_sock+0x332/0x760 net/mptcp/protocol.c:2828 mptcp_worker+0x5d2/0xc90 net/mptcp/protocol.c:2586 process_one_work+0x9cc/0x1650 kernel/workqueue.c:2289 worker_thread+0x623/0x1070 kernel/workqueue.c:2436 kthread+0x2e9/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302 </TASK> The root cause of the problem is that an mptcp-level (re)transmit can race with mptcp_close() and the packet scheduler checks the subflow state before acquiring the socket lock: we can try to (re)transmit on an already closed ssk. Fix the issue checking again the subflow socket status under the subflow socket lock protection. Additionally add the missing check for the fallback-to-tcp case. Fixes: d5f49190def6 ("mptcp: allow picking different xmit subflows") Reported-by: Dipanjan Das <mail.dipanjan.das(a)gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com> Signed-off-by: Paolo Abeni <pabeni(a)redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com> Signed-off-by: David S. Miller <davem(a)davemloft.net> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 07fcc86e1fc9..da4257504fad 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1240,6 +1240,9 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk, info->limit > dfrag->data_len)) return 0; + if (unlikely(!__tcp_can_send(ssk))) + return -EAGAIN; + /* compute send limit */ info->mss_now = tcp_send_mss(ssk, &info->size_goal, info->flags); copy = info->size_goal; @@ -1413,7 +1416,8 @@ static struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk) if (__mptcp_check_fallback(msk)) { if (!msk->first) return NULL; - return sk_stream_memory_free(msk->first) ? msk->first : NULL; + return __tcp_can_send(msk->first) && + sk_stream_memory_free(msk->first) ? msk->first : NULL; } /* re-use last subflow, if the burst allow that */ @@ -1564,6 +1568,8 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags) ret = mptcp_sendmsg_frag(sk, ssk, dfrag, &info); if (ret <= 0) { + if (ret == -EAGAIN) + continue; mptcp_push_release(ssk, &info); goto out; } diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 40881a7df5d5..132d50833df1 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -624,16 +624,19 @@ void mptcp_info2sockaddr(const struct mptcp_addr_info *info, struct sockaddr_storage *addr, unsigned short family); -static inline bool __mptcp_subflow_active(struct mptcp_subflow_context *subflow) +static inline bool __tcp_can_send(const struct sock *ssk) { - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); + /* only send if our side has not closed yet */ + return ((1 << inet_sk_state_load(ssk)) & (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)); +} +static inline bool __mptcp_subflow_active(struct mptcp_subflow_context *subflow) +{ /* can't send if JOIN hasn't completed yet (i.e. is usable for mptcp) */ if (subflow->request_join && !subflow->fully_established) return false; - /* only send if our side has not closed yet */ - return ((1 << ssk->sk_state) & (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)); + return __tcp_can_send(mptcp_subflow_tcp_sock(subflow)); } void mptcp_subflow_set_active(struct mptcp_subflow_context *subflow);

2 years, 11 months

1
0
0 0

FAILED: patch "[PATCH] mptcp: move subflow cleanup in mptcp_destroy_common()" failed to apply to 5.15-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From c0bf3c6aa444a5ef44acc57ef6cfa53fd4fc1c9b Mon Sep 17 00:00:00 2001 From: Paolo Abeni <pabeni(a)redhat.com> Date: Thu, 4 Aug 2022 17:21:25 -0700 Subject: [PATCH] mptcp: move subflow cleanup in mptcp_destroy_common() If the mptcp socket creation fails due to a CGROUP_INET_SOCK_CREATE eBPF program, the MPTCP protocol ends-up leaking all the subflows: the related cleanup happens in __mptcp_destroy_sock() that is not invoked in such code path. Address the issue moving the subflow sockets cleanup in the mptcp_destroy_common() helper, which is invoked in every msk cleanup path. Additionally get rid of the intermediate list_splice_init step, which is an unneeded relic from the past. The issue is present since before the reported root cause commit, but any attempt to backport the fix before that hash will require a complete rewrite. Fixes: e16163b6e2 ("mptcp: refactor shutdown and close") Reported-by: Nguyen Dinh Phi <phind.uet(a)gmail.com> Reviewed-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com> Co-developed-by: Nguyen Dinh Phi <phind.uet(a)gmail.com> Signed-off-by: Nguyen Dinh Phi <phind.uet(a)gmail.com> Signed-off-by: Paolo Abeni <pabeni(a)redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com> Signed-off-by: David S. Miller <davem(a)davemloft.net> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index a3f1c1461874..07fcc86e1fc9 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2769,30 +2769,16 @@ static void __mptcp_wr_shutdown(struct sock *sk) static void __mptcp_destroy_sock(struct sock *sk) { - struct mptcp_subflow_context *subflow, *tmp; struct mptcp_sock *msk = mptcp_sk(sk); - LIST_HEAD(conn_list); pr_debug("msk=%p", msk); might_sleep(); - /* join list will be eventually flushed (with rst) at sock lock release time*/ - list_splice_init(&msk->conn_list, &conn_list); - mptcp_stop_timer(sk); sk_stop_timer(sk, &sk->sk_timer); msk->pm.status = 0; - /* clears msk->subflow, allowing the following loop to close - * even the initial subflow - */ - mptcp_dispose_initial_subflow(msk); - list_for_each_entry_safe(subflow, tmp, &conn_list, node) { - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); - __mptcp_close_ssk(sk, ssk, subflow, 0); - } - sk->sk_prot->destroy(sk); WARN_ON_ONCE(msk->rmem_fwd_alloc); @@ -2884,24 +2870,20 @@ static void mptcp_copy_inaddrs(struct sock *msk, const struct sock *ssk) static int mptcp_disconnect(struct sock *sk, int flags) { - struct mptcp_subflow_context *subflow, *tmp; struct mptcp_sock *msk = mptcp_sk(sk); inet_sk_state_store(sk, TCP_CLOSE); - list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) { - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); - - __mptcp_close_ssk(sk, ssk, subflow, MPTCP_CF_FASTCLOSE); - } - mptcp_stop_timer(sk); sk_stop_timer(sk, &sk->sk_timer); if (mptcp_sk(sk)->token) mptcp_event(MPTCP_EVENT_CLOSED, mptcp_sk(sk), NULL, GFP_KERNEL); - mptcp_destroy_common(msk); + /* msk->subflow is still intact, the following will not free the first + * subflow + */ + mptcp_destroy_common(msk, MPTCP_CF_FASTCLOSE); msk->last_snd = NULL; WRITE_ONCE(msk->flags, 0); msk->cb_flags = 0; @@ -3051,12 +3033,17 @@ static struct sock *mptcp_accept(struct sock *sk, int flags, int *err, return newsk; } -void mptcp_destroy_common(struct mptcp_sock *msk) +void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags) { + struct mptcp_subflow_context *subflow, *tmp; struct sock *sk = (struct sock *)msk; __mptcp_clear_xmit(sk); + /* join list will be eventually flushed (with rst) at sock lock release time */ + list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) + __mptcp_close_ssk(sk, mptcp_subflow_tcp_sock(subflow), subflow, flags); + /* move to sk_receive_queue, sk_stream_kill_queues will purge it */ mptcp_data_lock(sk); skb_queue_splice_tail_init(&msk->receive_queue, &sk->sk_receive_queue); @@ -3078,7 +3065,11 @@ static void mptcp_destroy(struct sock *sk) { struct mptcp_sock *msk = mptcp_sk(sk); - mptcp_destroy_common(msk); + /* clears msk->subflow, allowing the following to close + * even the initial subflow + */ + mptcp_dispose_initial_subflow(msk); + mptcp_destroy_common(msk, 0); sk_sockets_allocated_dec(sk); } diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 5d6043c16b09..40881a7df5d5 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -717,7 +717,7 @@ static inline void mptcp_write_space(struct sock *sk) } } -void mptcp_destroy_common(struct mptcp_sock *msk); +void mptcp_destroy_common(struct mptcp_sock *msk, unsigned int flags); #define MPTCP_TOKEN_MAX_RETRIES 4 diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 901c763dcdbb..c7d49fb6e7bd 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -621,7 +621,8 @@ static void mptcp_sock_destruct(struct sock *sk) sock_orphan(sk); } - mptcp_destroy_common(mptcp_sk(sk)); + /* We don't need to clear msk->subflow, as it's still NULL at this point */ + mptcp_destroy_common(mptcp_sk(sk), 0); inet_sock_destruct(sk); }

2 years, 11 months

1
0
0 0

FAILED: patch "[PATCH] tcp: fix possible freeze in tx path under memory pressure" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From f54755f6a11accb2db5ef17f8f75aad0875aefdc Mon Sep 17 00:00:00 2001 From: Eric Dumazet <edumazet(a)google.com> Date: Tue, 14 Jun 2022 10:17:34 -0700 Subject: [PATCH] tcp: fix possible freeze in tx path under memory pressure Blamed commit only dealt with applications issuing small writes. Issue here is that we allow to force memory schedule for the sk_buff allocation, but we have no guarantee that sendmsg() is able to copy some payload in it. In this patch, I make sure the socket can use up to tcp_wmem[0] bytes. For example, if we consider tcp_wmem[0] = 4096 (default on x86), and initial skb->truesize being 1280, tcp_sendmsg() is able to copy up to 2816 bytes under memory pressure. Before this patch a sendmsg() sending more than 2816 bytes would either block forever (if persistent memory pressure), or return -EAGAIN. For bigger MTU networks, it is advised to increase tcp_wmem[0] to avoid sending too small packets. v2: deal with zero copy paths. Fixes: 8e4d980ac215 ("tcp: fix behavior for epoll edge trigger") Signed-off-by: Eric Dumazet <edumazet(a)google.com> Acked-by: Soheil Hassas Yeganeh <soheil(a)google.com> Reviewed-by: Wei Wang <weiwan(a)google.com> Reviewed-by: Shakeel Butt <shakeelb(a)google.com> Signed-off-by: David S. Miller <davem(a)davemloft.net> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index b278063a1724..6c3eab485249 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -968,6 +968,23 @@ static int tcp_wmem_schedule(struct sock *sk, int copy) return min(copy, sk->sk_forward_alloc); } +static int tcp_wmem_schedule(struct sock *sk, int copy) +{ + int left; + + if (likely(sk_wmem_schedule(sk, copy))) + return copy; + + /* We could be in trouble if we have nothing queued. + * Use whatever is left in sk->sk_forward_alloc and tcp_wmem[0] + * to guarantee some progress. + */ + left = sock_net(sk)->ipv4.sysctl_tcp_wmem[0] - sk->sk_wmem_queued; + if (left > 0) + sk_forced_mem_schedule(sk, min(left, copy)); + return min(copy, sk->sk_forward_alloc); +} + static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags, struct page *page, int offset, size_t *size) {

2 years, 11 months

1
0
0 0

FAILED: patch "[PATCH] tcp: fix possible freeze in tx path under memory pressure" failed to apply to 4.14-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.14-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From f54755f6a11accb2db5ef17f8f75aad0875aefdc Mon Sep 17 00:00:00 2001 From: Eric Dumazet <edumazet(a)google.com> Date: Tue, 14 Jun 2022 10:17:34 -0700 Subject: [PATCH] tcp: fix possible freeze in tx path under memory pressure Blamed commit only dealt with applications issuing small writes. Issue here is that we allow to force memory schedule for the sk_buff allocation, but we have no guarantee that sendmsg() is able to copy some payload in it. In this patch, I make sure the socket can use up to tcp_wmem[0] bytes. For example, if we consider tcp_wmem[0] = 4096 (default on x86), and initial skb->truesize being 1280, tcp_sendmsg() is able to copy up to 2816 bytes under memory pressure. Before this patch a sendmsg() sending more than 2816 bytes would either block forever (if persistent memory pressure), or return -EAGAIN. For bigger MTU networks, it is advised to increase tcp_wmem[0] to avoid sending too small packets. v2: deal with zero copy paths. Fixes: 8e4d980ac215 ("tcp: fix behavior for epoll edge trigger") Signed-off-by: Eric Dumazet <edumazet(a)google.com> Acked-by: Soheil Hassas Yeganeh <soheil(a)google.com> Reviewed-by: Wei Wang <weiwan(a)google.com> Reviewed-by: Shakeel Butt <shakeelb(a)google.com> Signed-off-by: David S. Miller <davem(a)davemloft.net> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index b278063a1724..6c3eab485249 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -968,6 +968,23 @@ static int tcp_wmem_schedule(struct sock *sk, int copy) return min(copy, sk->sk_forward_alloc); } +static int tcp_wmem_schedule(struct sock *sk, int copy) +{ + int left; + + if (likely(sk_wmem_schedule(sk, copy))) + return copy; + + /* We could be in trouble if we have nothing queued. + * Use whatever is left in sk->sk_forward_alloc and tcp_wmem[0] + * to guarantee some progress. + */ + left = sock_net(sk)->ipv4.sysctl_tcp_wmem[0] - sk->sk_wmem_queued; + if (left > 0) + sk_forced_mem_schedule(sk, min(left, copy)); + return min(copy, sk->sk_forward_alloc); +} + static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags, struct page *page, int offset, size_t *size) {

2 years, 11 months

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror August 2022