On Thu, Dec 5, 2024 at 2:13 AM Jakub Kicinski kuba@kernel.org wrote:
On Wed, 13 Nov 2024 18:46:39 +0000 Dmitry Safonov via B4 Relay wrote:
- Inet-diag allocates netlink message for sockets in inet_diag_dump_one_icsk(), which uses a TCP-diag callback .idiag_get_aux_size(), that pre-calculates the needed space for TCP-diag related information. But as neither socket lock nor rcu_readlock() are held between allocation and the actual TCP info filling, the TCP-related space requirement may change before reaching tcp_diag_put_md5sig(). I.e., the number of TCP-MD5 keys on a socket. Thankfully, TCP-MD5-diag won't overwrite the skb, but will return EMSGSIZE, triggering WARN_ON() in inet_diag_dump_one_icsk().
Hi Eric!
This was posted while you were away -- any thoughts or recommendation on how to address the required nl message size changing? Or other problems pointed out by Dmitry? My suggestion in the subthread is to re-dump with a fixed, large buffer on EMSGSIZE, but that's not super clean..
Hi Jakub
inet_diag_dump_one_icsk() could retry, doubling the size until the ~32768 byte limit is reached ?
Also, we could make sure inet_sk_attr_size() returns at least NLMSG_DEFAULT_SIZE, there is no point trying to save memory for a single skb in inet_diag_dump_one_icsk().
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c index 321acc8abf17e8c7d6a4e3326615123fff19deab..cd2e7fe9b090ea9127aebbba0faf2ef12c0f86a4 100644 --- a/net/ipv4/inet_diag.c +++ b/net/ipv4/inet_diag.c @@ -102,7 +102,7 @@ static size_t inet_sk_attr_size(struct sock *sk, bool net_admin) { const struct inet_diag_handler *handler; - size_t aux = 0; + size_t aux = 0, res;
rcu_read_lock(); handler = rcu_dereference(inet_diag_table[req->sdiag_protocol]); @@ -111,7 +111,7 @@ static size_t inet_sk_attr_size(struct sock *sk, aux = handler->idiag_get_aux_size(sk, net_admin); rcu_read_unlock();
- return nla_total_size(sizeof(struct tcp_info)) + res = nla_total_size(sizeof(struct tcp_info)) + nla_total_size(sizeof(struct inet_diag_msg)) + inet_diag_msg_attrs_size() + nla_total_size(sizeof(struct inet_diag_meminfo)) @@ -120,6 +120,7 @@ static size_t inet_sk_attr_size(struct sock *sk, + nla_total_size(sizeof(struct tcpvegas_info)) + aux + 64; + return max(res, NLMSG_DEFAULT_SIZE); }
int inet_diag_msg_attrs_fill(struct sock *sk, struct sk_buff *skb, @@ -570,6 +571,7 @@ int inet_diag_dump_one_icsk(struct inet_hashinfo *hashinfo, bool net_admin = netlink_net_capable(in_skb, CAP_NET_ADMIN); struct net *net = sock_net(in_skb->sk); struct sk_buff *rep; + size_t attr_size; struct sock *sk; int err;
@@ -577,7 +579,9 @@ int inet_diag_dump_one_icsk(struct inet_hashinfo *hashinfo, if (IS_ERR(sk)) return PTR_ERR(sk);
- rep = nlmsg_new(inet_sk_attr_size(sk, req, net_admin), GFP_KERNEL); + attr_size = inet_sk_attr_size(sk, req, net_admin); +retry: + rep = nlmsg_new(attr_size, GFP_KERNEL); if (!rep) { err = -ENOMEM; goto out; @@ -585,8 +589,14 @@ int inet_diag_dump_one_icsk(struct inet_hashinfo *hashinfo,
err = sk_diag_fill(sk, rep, cb, req, 0, net_admin); if (err < 0) { - WARN_ON(err == -EMSGSIZE); nlmsg_free(rep); + if (err == -EMSGSIZE) { + attr_size <<= 1; + if (attr_size + NLMSG_HDRLEN <= SKB_WITH_OVERHEAD(32768)) { + cond_resched(); + goto retry; + } + } goto out; } err = nlmsg_unicast(net->diag_nlsk, rep, NETLINK_CB(in_skb).portid);