4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Soheil Hassas Yeganeh <soheil@google.com>
[ Upstream commit ad02c4f547826167a709dab8a89a1caefd2c1f50 ]
For TCP sockets, TX timestamps are only captured when the user data is successfully and fully written to the socket. In many cases, however, TCP writes can be partial, and no timestamp is collected for a partial write.
Collect timestamps whenever any user data is (fully or partially) copied into the socket. Pass tcp_write_queue_tail to tcp_tx_timestamp instead of the local skb pointer, since the local pointer can be set to NULL on the error path.
Note that tcp_write_queue_tail can be NULL, even if bytes have been copied to the socket. This is because acknowledgements are processed during tcp_sendmsg(), so the write queue can already be empty by the time tcp_tx_timestamp is called. For such cases, this patch does not collect any timestamps (i.e., it is best-effort).
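The best-effort behaviour described above can be illustrated with a small user-space model. This is only a sketch, not kernel code: struct model_skb, struct model_sock, model_tx_timestamp() and model_sendmsg_out() are hypothetical stand-ins invented for illustration, and they reproduce only the control flow of the patched "out:" path (stamp the tail of the write queue whenever bytes were copied, and skip silently when the tail is NULL).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an sk_buff: records only whether it was stamped. */
struct model_skb {
	bool tx_stamped;
};

/*
 * Hypothetical stand-in for a socket. write_queue_tail is NULL when every
 * queued buffer has already been acknowledged and freed.
 */
struct model_sock {
	struct model_skb *write_queue_tail;
	unsigned int tsflags;
};

/* Models the patched guard: do nothing without flags or without a buffer. */
static void model_tx_timestamp(unsigned int tsflags, struct model_skb *skb)
{
	if (tsflags && skb)
		skb->tx_stamped = true;
}

/*
 * Models the patched "out:" label: request a timestamp whenever any bytes
 * were copied, whether the write was full or partial. Returns the number of
 * bytes copied, unchanged.
 */
static int model_sendmsg_out(struct model_sock *sk, int copied)
{
	assert(sk != NULL);
	if (copied)
		model_tx_timestamp(sk->tsflags, sk->write_queue_tail);
	return copied;
}
```

In this model a partial write (copied > 0) with a non-empty queue stamps the tail buffer, a write that copied nothing stamps nothing, and a NULL tail is tolerated rather than dereferenced.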
This patch is written with suggestions from Willem de Bruijn and Eric Dumazet.
Change-log V1 -> V2:
	- Use sockc.tsflags instead of sk->sk_tsflags.
	- Use the same code path for normal writes and errors.
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/ipv4/tcp.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -431,7 +431,7 @@ EXPORT_SYMBOL(tcp_init_sock);
 
 static void tcp_tx_timestamp(struct sock *sk, u16 tsflags, struct sk_buff *skb)
 {
-	if (tsflags) {
+	if (tsflags && skb) {
 		struct skb_shared_info *shinfo = skb_shinfo(skb);
 		struct tcp_skb_cb *tcb = TCP_SKB_CB(skb);
 
@@ -966,10 +966,8 @@ new_segment:
 		copied += copy;
 		offset += copy;
 		size -= copy;
-		if (!size) {
-			tcp_tx_timestamp(sk, sk->sk_tsflags, skb);
+		if (!size)
 			goto out;
-		}
 
 		if (skb->len < size_goal || (flags & MSG_OOB))
 			continue;
@@ -995,8 +993,11 @@ wait_for_memory:
 	}
 
 out:
-	if (copied && !(flags & MSG_SENDPAGE_NOTLAST))
-		tcp_push(sk, flags, mss_now, tp->nonagle, size_goal);
+	if (copied) {
+		tcp_tx_timestamp(sk, sk->sk_tsflags, tcp_write_queue_tail(sk));
+		if (!(flags & MSG_SENDPAGE_NOTLAST))
+			tcp_push(sk, flags, mss_now, tp->nonagle, size_goal);
+	}
 	return copied;
 
 do_error:
@@ -1289,7 +1290,6 @@ new_segment:
 
 		copied += copy;
 		if (!msg_data_left(msg)) {
-			tcp_tx_timestamp(sk, sockc.tsflags, skb);
 			if (unlikely(flags & MSG_EOR))
 				TCP_SKB_CB(skb)->eor = 1;
 			goto out;
@@ -1320,8 +1320,10 @@ wait_for_memory:
 	}
 
 out:
-	if (copied)
+	if (copied) {
+		tcp_tx_timestamp(sk, sockc.tsflags, tcp_write_queue_tail(sk));
 		tcp_push(sk, flags, mss_now, tp->nonagle, size_goal);
+	}
 out_nopush:
 	release_sock(sk);
 	return copied + copied_syn;