Jason Xing wrote:
On Mon, Sep 30, 2024 at 7:49 PM Willem de Bruijn willemdebruijn.kernel@gmail.com wrote:
Jason Xing wrote:
On Mon, Sep 30, 2024 at 6:39 PM Willem de Bruijn willemdebruijn.kernel@gmail.com wrote:
Jason Xing wrote:
From: Jason Xing kernelxing@tencent.com
Even though this case is unlikely to happen, we have to avoid such a case occurring at an earlier point: the sk_rmem_alloc could get increased because of inserting more and more skbs into the errqueue when calling __skb_complete_tx_timestamp(). This bad case would stop the socket transmitting soon.
It is up to the application to read from the error queue frequently enough and/or increase SO_RCVBUF.
Sure thing. If we test it without setting SOF_TIMESTAMPING_SOFTWARE on the loopback, it will soon stop. That's the reason why I tried to add the restriction just in case.
I don't follow at all.
That bit does not affect the core issue: that the application is not clearing its error queue quickly enough.
Signed-off-by: Jason Xing kernelxing@tencent.com
net/core/sock.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/core/sock.c b/net/core/sock.c index fe87f9bd8f16..4bddd6f62e4f 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -905,6 +905,10 @@ int sock_set_timestamping(struct sock *sk, int optname, if (val & ~SOF_TIMESTAMPING_MASK) return -EINVAL;
if (val & SOF_TIMESTAMPING_TX_RECORD_MASK &&
!(val & SOF_TIMESTAMPING_SOFTWARE))
return -EINVAL;
This breaks hardware timestamping
Yes, and sorry about that. I'll fix this.
As is I don't understand the purpose of this patch. Please do not just resubmit with a change, but explain the problem and suggested solution first.
I will drop this patch because I just tested with my program in the local machine and found there is one mistake I made about calculating the diff between those two . Sorry for the noise.
Well, I only need to send a V2 patch of patch [3/3] in the next few days.
BTW, please allow me to ask one question unrelated to this patch again. I do wonder: if we batch the recv skbs from the errqueue when calling tcp_recvmsg() -> inet_recv_error(), it could break users, right?
Analogous to __msg_zerocopy_callback with __msg_zerocopy_callback.
Only here we cannot return range-based results and thus cannot just expand the range of the one outstanding notification.
This would mean in ip(v6)_recv_error calling sock_dequeue_err_skb, sock_recv_timestamp and put_cmsg IP_RECVERR multiple times. And ip_cmsg_recv if needed.
Existing applications do not have to expect multiple results per single recvmsg call. So enabling that unconditionally could break them.
Adding this will require a new flag. An sk_tsflag is the obvious approach.
Interpreting a MSG_* flag passed to recvmsg would be another option. If there is a bit that can be set with MSG_ERRQUEUE and cannot already be set currently. But I don't think that's the case. We allow all bits and ignore any undefined ones.