Thanks for the suggestion!
On Thu, Apr 03 2025 at 14:44:18, Xin Long <lucien.xin@gmail.com> wrote:
@@ -9234,7 +9236,7 @@ static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
 					  TASK_INTERRUPTIBLE);
 		if (asoc->base.dead)
 			goto do_dead;
-		if (!*timeo_p)
+		if (!*timeo_p || (t && t->dead))
 			goto do_nonblock;
 		if (sk->sk_err || asoc->state >= SCTP_STATE_SHUTDOWN_PENDING)
 			goto do_error;
I suppose checking t->dead should be done after locking the socket again, where sctp_assoc_rm_peer() may have had a chance to run, rather than here?
Something like this:
@@ -9225,7 +9227,9 @@ static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
 	pr_debug("%s: asoc:%p, timeo:%ld, msg_len:%zu\n", __func__, asoc,
 		 *timeo_p, msg_len);

-	/* Increment the association's refcnt. */
+	/* Increment the transport and association's refcnt. */
+	if (transport)
+		sctp_transport_hold(transport);
 	sctp_association_hold(asoc);

 	/* Wait on the association specific sndbuf space. */
@@ -9252,6 +9256,8 @@ static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
 		lock_sock(sk);
 		if (sk != asoc->base.sk)
 			goto do_error;
+		if (transport && transport->dead)
+			goto do_nonblock;

 		*timeo_p = current_timeo;
 	}
@@ -9259,7 +9265,9 @@ static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
 out:
 	finish_wait(&asoc->wait, &wait);

-	/* Release the association's refcnt. */
+	/* Release the transport and association's refcnt. */
+	if (transport)
+		sctp_transport_put(transport);
 	sctp_association_put(asoc);

 	return err;
So by the time the sending thread reclaims the socket lock, it can tell whether someone else removed the transport by checking transport->dead (set in sctp_transport_free()). The transport is guaranteed not to have been freed yet, because we hold a reference to it.
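To make the ordering explicit, here is a minimal userspace sketch of the hold / check-dead / put pattern. It is not the kernel code: struct transport, transport_hold(), transport_put(), transport_free(), wait_and_check() and freed_count are hypothetical stand-ins, and the plain int refcount assumes every access happens under a single lock (the socket lock in the real code).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sctp_transport: a refcount and a dead flag. */
struct transport {
	int refcnt;	/* plain int: the sketch assumes one lock guards all access */
	bool dead;
};

static int freed_count;	/* number of transports whose memory was released */

static struct transport *transport_new(void)
{
	struct transport *t = calloc(1, sizeof(*t));

	if (t)
		t->refcnt = 1;	/* the owner's reference */
	return t;
}

static void transport_hold(struct transport *t)
{
	t->refcnt++;
}

static void transport_put(struct transport *t)
{
	if (--t->refcnt == 0) {
		free(t);
		freed_count++;
	}
}

/* Models removal by the owner: mark dead first, then drop the owner's ref. */
static void transport_free(struct transport *t)
{
	t->dead = true;
	transport_put(t);
}

/*
 * Models the waiter. Returns true if the transport was removed while the
 * waiter was "asleep". The flag is read only while our reference is held,
 * so the memory is still valid even if the owner has already let go.
 */
static bool wait_and_check(struct transport *t, bool remove_while_waiting)
{
	bool removed;

	transport_hold(t);		/* pin before giving up the lock */
	if (remove_while_waiting)
		transport_free(t);	/* what another thread could do meanwhile */
	removed = t->dead;		/* safe: our reference keeps t allocated */
	transport_put(t);		/* may be the final put */
	return removed;
}
```

Calling wait_and_check(t, true) on a fresh transport reports the removal and releases the memory only on the waiter's final put, which is the guarantee the extra reference is meant to provide.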
If the whole receive path through sctp_assoc_rm_peer() is protected by the same socket lock, as you said, this should be safe. The tests I ran seem to work fine. If you're ok with it I'll send another patch to supersede this one.
You will need to reintroduce the dead bit in struct sctp_transport and set it in sctp_transport_free(). Note this field was previously removed in:
commit 47faa1e4c50ec26e6e75dcd1ce53f064bd45f729
Author: Xin Long <lucien.xin@gmail.com>
Date:   Fri Jan 22 01:49:09 2016 +0800

    sctp: remove the dead field of sctp_transport
I understand that none of the transport->dead checks from that commit are necessary anymore, since they were replaced by refcnt checks, and that we'll only bring the bit back for this particular check we're doing now, correct?
Cheers,
Ricardo