6.7-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin KaFai Lau martin.lau@kernel.org
[ Upstream commit 2242fd537fab52d5f4d2fbb1845f047c01fad0cf ]
There is a bug in the bpf_iter_udp_batch() function that stops the userspace from making forward progress.
The case that triggers the bug is the userspace passed in a very small read buffer. When the bpf prog does bpf_seq_printf, the userspace read buffer is not enough to capture the whole bucket.
When the read buffer is not large enough, the kernel will remember the offset of the bucket in iter->offset such that the next userspace read() can continue from where it left off.
The kernel will skip the number (== "iter->offset") of sockets in the next read(). However, the code directly decrements the "--iter->offset". This is incorrect because the next read() may not consume the whole bucket either and then the next-next read() will start from offset 0. The net effect is the userspace will keep reading from the beginning of a bucket and the process will never finish. "iter->offset" must always go forward until the whole bucket is consumed.
This patch fixes it by using a local variable "resume_offset" and "resume_bucket". "iter->offset" is always reset to 0 before it may be used. "iter->offset" will be advanced to the "resume_offset" when it continues from the "resume_bucket" (i.e. "state->bucket == resume_bucket"). This brings it closer to the bpf_iter_tcp's offset handling which does not suffer the same bug.
Cc: Aditi Ghag aditi.ghag@isovalent.com Fixes: c96dac8d369f ("bpf: udp: Implement batching for sockets iterator") Acked-by: Yonghong Song yonghong.song@linux.dev Signed-off-by: Martin KaFai Lau martin.lau@kernel.org Reviewed-by: Aditi Ghag aditi.ghag@isovalent.com Link: https://lore.kernel.org/r/20240112190530.3751661-3-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/udp.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 79050d83e736..148ffb007969 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -3137,16 +3137,18 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq) struct bpf_udp_iter_state *iter = seq->private; struct udp_iter_state *state = &iter->state; struct net *net = seq_file_net(seq); + int resume_bucket, resume_offset; struct udp_table *udptable; unsigned int batch_sks = 0; bool resized = false; struct sock *sk;
+ resume_bucket = state->bucket; + resume_offset = iter->offset; + /* The current batch is done, so advance the bucket. */ - if (iter->st_bucket_done) { + if (iter->st_bucket_done) state->bucket++; - iter->offset = 0; - }
udptable = udp_get_table_seq(seq, net);
@@ -3166,19 +3168,19 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq) for (; state->bucket <= udptable->mask; state->bucket++) { struct udp_hslot *hslot2 = &udptable->hash2[state->bucket];
- if (hlist_empty(&hslot2->head)) { - iter->offset = 0; + if (hlist_empty(&hslot2->head)) continue; - }
+ iter->offset = 0; spin_lock_bh(&hslot2->lock); udp_portaddr_for_each_entry(sk, &hslot2->head) { if (seq_sk_match(seq, sk)) { /* Resume from the last iterated socket at the * offset in the bucket before iterator was stopped. */ - if (iter->offset) { - --iter->offset; + if (state->bucket == resume_bucket && + iter->offset < resume_offset) { + ++iter->offset; continue; } if (iter->end_sk < iter->max_sk) { @@ -3192,9 +3194,6 @@ static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
if (iter->end_sk) break; - - /* Reset the current bucket's offset before moving to the next bucket. */ - iter->offset = 0; }
/* All done: no batch made. */