From: Jason Xing kernelxing@tencent.com
introduce a new flag SOF_TIMESTAMPING_OPT_RX_FILTER in the receive path. User can set it with SOF_TIMESTAMPING_SOFTWARE to filter out rx software timestamp report, especially after a process turns on netstamp_needed_key which can time stamp every incoming skb.
Previously, we found out if an application starts first which turns on netstamp_needed_key, then another one only passing SOF_TIMESTAMPING_SOFTWARE could also get rx timestamp. Now we handle this case by introducing this new flag without breaking users.
Quoting Willem to explain why we need the flag: "why a process would want to request software timestamp reporting, but not receive software timestamp generation. The only use I see is when the application does request SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_TX_SOFTWARE."
Similarly, this new flag could also be used for hardware case where we can set it with SOF_TIMESTAMPING_RAW_HARDWARE, then we won't receive hardware receive timestamp.
Another thing about errqueue in this patch I have a few words to say: In this case, we need to handle the egress path carefully, or else reporting the tx timestamp will fail. Egress path and ingress path will finally call sock_recv_timestamp(). We have to distinguish them. Errqueue is a good indicator to reflect the flow direction.
Suggested-by: Willem de Bruijn willemb@google.com Signed-off-by: Jason Xing kernelxing@tencent.com --- v5 Link: https://lore.kernel.org/all/20240905071738.3725-1-kerneljasonxing@gmail.com/ 1. squash the hardware case patch into this one (Willem) 2. update corresponding commit message and doc (Willem) 3. remove the limitation in sock_set_timestamping() and restore the simplification branches. (Willem)
v4 Link: https://lore.kernel.org/all/20240830153751.86895-2-kerneljasonxing@gmail.com... 1. revise the commit message and doc (Willem) 2. simplify the test statement (Jakub) 3. add Willem's reviewed-by tag (Willem)
v3 1. Willem suggested this alternative way to solve the issue, so I added his Suggested-by tag here. Thanks! --- Documentation/networking/timestamping.rst | 27 +++++++++++++++++++++++ include/uapi/linux/net_tstamp.h | 3 ++- net/ethtool/common.c | 1 + net/ipv4/tcp.c | 9 ++++++-- net/socket.c | 10 +++++++-- 5 files changed, 45 insertions(+), 5 deletions(-)
diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst index 9c7773271393..ae122ae3df72 100644 --- a/Documentation/networking/timestamping.rst +++ b/Documentation/networking/timestamping.rst @@ -267,6 +267,33 @@ SOF_TIMESTAMPING_OPT_TX_SWHW: two separate messages will be looped to the socket's error queue, each containing just one timestamp.
+SOF_TIMESTAMPING_OPT_RX_FILTER: + Used to filter out software/hardware receive timestamps. Existing + applications can 1) set SOF_TIMESTAMPING_SOFTWARE only, or 2) set + SOF_TIMESTAMPING_RAW_HARDWARE only because they implicitly came to + depend on another (usually daemon) process to enable receive + timestamps systemwide. Enabling the flag along with + SOF_TIMESTAMPING_SOFTWARE or SOF_TIMESTAMPING_RAW_HARDWARE will not + report the software or hardware receive timestamp to the userspace + so that it can filter out the cases where 1) one process starts + first which turns on netstamp_needed_key through setting generation + flags like SOF_TIMESTAMPING_RX_SOFTWARE, or 2) similarly one process + enables generating the hardware timestamp first like setting + SOF_TIMESTAMPING_RX_HARDWARE, then another one only passing report + flag could also get the receive timestamp. + + If the new applications 1) enable the flag along with + SOF_TIMESTAMPING_SOFTWARE and SOF_TIMESTAMPING_TX_SOFTWARE, or + 2) enable it with SOF_TIMESTAMPING_RAW_HARDWARE and + SOF_TIMESTAMPING_TX_HARDWARE, they will only receive software or + hardware transmit timestamp respectively. + + If the new applications 1) enable the flag along with + SOF_TIMESTAMPING_SOFTWARE and SOF_TIMESTAMPING_RX_SOFTWARE, or + 2) enable it with SOF_TIMESTAMPING_RAW_HARDWARE and + SOF_TIMESTAMPING_RX_HARDWARE, they will still receive software or + hardware receive timestamp respectively. + New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate regardless of the setting of sysctl net.core.tstamp_allow_data. diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h index a2c66b3d7f0f..858339d1c1c4 100644 --- a/include/uapi/linux/net_tstamp.h +++ b/include/uapi/linux/net_tstamp.h @@ -32,8 +32,9 @@ enum { SOF_TIMESTAMPING_OPT_TX_SWHW = (1<<14), SOF_TIMESTAMPING_BIND_PHC = (1 << 15), SOF_TIMESTAMPING_OPT_ID_TCP = (1 << 16), + SOF_TIMESTAMPING_OPT_RX_FILTER = (1 << 17),
- SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_ID_TCP, + SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_RX_FILTER, SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) | SOF_TIMESTAMPING_LAST }; diff --git a/net/ethtool/common.c b/net/ethtool/common.c index 781834ef57c3..6c245e59bbc1 100644 --- a/net/ethtool/common.c +++ b/net/ethtool/common.c @@ -427,6 +427,7 @@ const char sof_timestamping_names[][ETH_GSTRING_LEN] = { [const_ilog2(SOF_TIMESTAMPING_OPT_TX_SWHW)] = "option-tx-swhw", [const_ilog2(SOF_TIMESTAMPING_BIND_PHC)] = "bind-phc", [const_ilog2(SOF_TIMESTAMPING_OPT_ID_TCP)] = "option-id-tcp", + [const_ilog2(SOF_TIMESTAMPING_OPT_RX_FILTER)] = "option-rx-filter", }; static_assert(ARRAY_SIZE(sof_timestamping_names) == __SOF_TIMESTAMPING_CNT);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 8a5680b4e786..e359a9161445 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2235,6 +2235,7 @@ void tcp_recv_timestamp(struct msghdr *msg, const struct sock *sk, struct scm_timestamping_internal *tss) { int new_tstamp = sock_flag(sk, SOCK_TSTAMP_NEW); + u32 tsflags = READ_ONCE(sk->sk_tsflags); bool has_timestamping = false;
if (tss->ts[0].tv_sec || tss->ts[0].tv_nsec) { @@ -2274,14 +2275,18 @@ void tcp_recv_timestamp(struct msghdr *msg, const struct sock *sk, } }
- if (READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_SOFTWARE) + if (tsflags & SOF_TIMESTAMPING_SOFTWARE && + (tsflags & SOF_TIMESTAMPING_RX_SOFTWARE || + !(tsflags & SOF_TIMESTAMPING_OPT_RX_FILTER))) has_timestamping = true; else tss->ts[0] = (struct timespec64) {0}; }
if (tss->ts[2].tv_sec || tss->ts[2].tv_nsec) { - if (READ_ONCE(sk->sk_tsflags) & SOF_TIMESTAMPING_RAW_HARDWARE) + if (tsflags & SOF_TIMESTAMPING_RAW_HARDWARE && + (tsflags & SOF_TIMESTAMPING_RX_HARDWARE || + !(tsflags & SOF_TIMESTAMPING_OPT_RX_FILTER))) has_timestamping = true; else tss->ts[2] = (struct timespec64) {0}; diff --git a/net/socket.c b/net/socket.c index fcbdd5bc47ac..1c2fd1a317af 100644 --- a/net/socket.c +++ b/net/socket.c @@ -946,11 +946,17 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
memset(&tss, 0, sizeof(tss)); tsflags = READ_ONCE(sk->sk_tsflags); - if ((tsflags & SOF_TIMESTAMPING_SOFTWARE) && + if ((tsflags & SOF_TIMESTAMPING_SOFTWARE && + (tsflags & SOF_TIMESTAMPING_RX_SOFTWARE || + skb_is_err_queue(skb) || + !(tsflags & SOF_TIMESTAMPING_OPT_RX_FILTER))) && ktime_to_timespec64_cond(skb->tstamp, tss.ts + 0)) empty = 0; if (shhwtstamps && - (tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) && + (tsflags & SOF_TIMESTAMPING_RAW_HARDWARE && + (tsflags & SOF_TIMESTAMPING_RX_HARDWARE || + skb_is_err_queue(skb) || + !(tsflags & SOF_TIMESTAMPING_OPT_RX_FILTER))) && !skb_is_swtx_tstamp(skb, false_tstamp)) { if_index = 0; if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP_NETDEV)