From: Oleksij Rempel <o.rempel@pengutronix.de>
This patch addresses an issue within the j1939_sk_send_loop_abort() function in the j1939/socket.c file, specifically in the context of Transport Protocol (TP) sessions.
Without this patch, when a TP session is initiated and a Clear To Send (CTS) frame is received from the remote side requesting one data packet, the kernel dispatches the first Data Transfer (DT) frame and then waits for the next CTS. If the remote side doesn't respond with another CTS, the kernel aborts the session due to a timeout. As a result, user space receives an EPOLLERR on the socket, and the socket becomes active.
However, when trying to read the error queue from the socket with sock.recvmsg(<bufsize>, <ancbufsize>, socket.MSG_ERRQUEUE), it returns -EAGAIN, given that the socket is non-blocking. This situation results in an infinite loop: user space repeatedly calls epoll(), epoll() returns the socket file descriptor with EPOLLERR, but the recv() of ERRQUEUE finds nothing to read and fails with -EAGAIN again.
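The user-space side of this loop can be sketched in Python as follows. This is a minimal illustration, not code from the patch: the helper names `drain_errqueue` and `handle_epollerr` are hypothetical, and the buffer sizes are arbitrary. The fallback to `SO_ERROR` is an assumed application-side workaround for unpatched kernels, relying on the documented behaviour that reading `SO_ERROR` gets and clears the pending socket error.

```python
import socket

# MSG_ERRQUEUE is Linux-specific; fall back to its Linux value if the
# constant is not exposed by this Python build (assumption: 0x2000).
MSG_ERRQUEUE = getattr(socket, "MSG_ERRQUEUE", 0x2000)


def drain_errqueue(sock, bufsize=128, ancbufsize=512):
    """Read all pending error-queue messages from a non-blocking socket.

    Returns a list of (data, ancdata) tuples; the list is empty if the
    very first recvmsg() already fails with EAGAIN.
    """
    messages = []
    while True:
        try:
            data, ancdata, _flags, _addr = sock.recvmsg(
                bufsize, ancbufsize, MSG_ERRQUEUE)
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: the error queue is empty.
            break
        messages.append((data, ancdata))
    return messages


def handle_epollerr(sock):
    """React to EPOLLERR: drain the error queue, then check SO_ERROR.

    If the error queue was empty, EPOLLERR was presumably caused by a
    pending socket error; reading SO_ERROR returns and clears it, so the
    next epoll() call does not report the same condition again.
    """
    messages = drain_errqueue(sock)
    pending_errno = 0
    if not messages:
        pending_errno = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    return messages, pending_errno
```

An event loop would call `handle_epollerr(sock)` whenever epoll() reports EPOLLERR for the socket. An application that only ever calls `drain_errqueue()` is the one that spins in the situation described above.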
This patch introduces an additional check for the J1939_SOCK_ERRQUEUE flag within the j1939_sk_send_loop_abort() function. If the flag is set, it indicates that the application has subscribed to receive error queue messages. In such cases, the kernel can communicate the current transfer state via the error queue. This allows for the function to return early, preventing the unnecessary setting of the socket into an error state, and breaking the infinite loop. It is crucial to note that a socket error is only needed if the application isn't using the error queue, as, without it, the application wouldn't be aware of transfer issues.
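For illustration, an application would typically subscribe to the error queue with the `SO_J1939_ERRQUEUE` socket option, which is what sets the J1939_SOCK_ERRQUEUE flag checked above. The sketch below is an assumption-laden example rather than part of the patch: `enable_j1939_errqueue` is a hypothetical helper name, and the numeric fallbacks (107 and 4) are the values I understand the Linux UAPI headers to define, used only if this Python build does not expose the constants.

```python
import socket

# Assumed fallbacks: SOL_CAN_J1939 = SOL_CAN_BASE (100) + CAN_J1939 (7),
# SO_J1939_ERRQUEUE = 4 in linux/can/j1939.h.
SOL_CAN_J1939 = getattr(socket, "SOL_CAN_J1939", 107)
SO_J1939_ERRQUEUE = getattr(socket, "SO_J1939_ERRQUEUE", 4)


def enable_j1939_errqueue(sock):
    """Ask the kernel to report J1939 transfer state via the error queue."""
    sock.setsockopt(SOL_CAN_J1939, SO_J1939_ERRQUEUE, 1)
```

On a system with J1939 support, the socket itself would be created with something like `socket.socket(socket.AF_CAN, socket.SOCK_DGRAM, socket.CAN_J1939)` (available in Python 3.9 and later) before calling the helper.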
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Reported-by: David Jander <david@protonic.nl>
Tested-by: David Jander <david@protonic.nl>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20230526081946.715190-1-o.rempel@pengutronix.de
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 net/can/j1939/socket.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c
index 1790469b2580..35970c25496a 100644
--- a/net/can/j1939/socket.c
+++ b/net/can/j1939/socket.c
@@ -1088,6 +1088,11 @@ void j1939_sk_errqueue(struct j1939_session *session,

 void j1939_sk_send_loop_abort(struct sock *sk, int err)
 {
+	struct j1939_sock *jsk = j1939_sk(sk);
+
+	if (jsk->state & J1939_SOCK_ERRQUEUE)
+		return;
+
 	sk->sk_err = err;

 	sk_error_report(sk);

base-commit: 8cde87b007dad2e461015ff70352af56ceb02c75
Hello:
This series was applied to netdev/net.git (main) by Marc Kleine-Budde <mkl@pengutronix.de>:
On Mon, 5 Jun 2023 08:59:50 +0200 you wrote:
[...]
Here is the summary with links:
  - [net,1/3] can: j1939: j1939_sk_send_loop_abort(): improved error queue handling in J1939 Socket
    https://git.kernel.org/netdev/net/c/2a84aea80e92
  - [net,2/3] can: j1939: change j1939_netdev_lock type to mutex
    https://git.kernel.org/netdev/net/c/cd9c790de208
  - [net,3/3] can: j1939: avoid possible use-after-free when j1939_can_rx_register fails
    https://git.kernel.org/netdev/net/c/9f16eb106aa5

You are awesome, thank you!