On Fri, Aug 12, 2022 at 10:22:11AM +0100, Quentin Monnet wrote:
From: Daniel Borkmann daniel@iogearbox.net
[ Upstream commit 5ccb071e97fbd9ffe623a0d3977cc6d013bee93c ]
Commit aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs") made a wrong assumption of charging against prog->pages. Unlike map->pages, prog->pages are still subject to change when we need to expand the program through bpf_prog_realloc().
This can for example happen during verification stage when we need to expand and rewrite parts of the program. Should the required space cross a page boundary, then prog->pages is not the same anymore as its original value that we used to bpf_prog_charge_memlock() on. Thus, we'll hit a wrap-around during bpf_prog_uncharge_memlock() when prog is freed eventually. I noticed this that despite having unlimited memlock, programs suddenly refused to load with EPERM error due to insufficient memlock.
There are two ways to fix this issue. One would be to add a cached variable to struct bpf_prog that takes a snapshot of prog->pages at the time of charging. The other approach is to also account for resizes. I chose to go with the latter for a couple of reasons: i) We want accounting rather to be more accurate instead of further fooling limits, ii) adding yet another page counter on struct bpf_prog would also be a waste just for this purpose. We also do want to charge as early as possible to avoid going into the verifier just to find out later on that we crossed limits. The only place that needs to be fixed is bpf_prog_realloc(), since only here we expand the program, so we try to account for the needed delta and should we fail, call-sites check for outcome anyway. On cBPF to eBPF migrations, we don't grab a reference to the user as they are charged differently. With that in place, my test case worked fine.
Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs") Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Alexei Starovoitov ast@kernel.org Signed-off-by: David S. Miller davem@davemloft.net [Quentin: backport to 4.9: Adjust context in bpf.h ] Signed-off-by: Quentin Monnet quentin@isovalent.com
This fix was merged in Linux 4.10 but never backported to 4.9. The overflow has been occurring regularly when running Cilium's CI tests on kernel 4.9, so I would like to submit this patch for consideration to the 4.9 stable branch.
The initial patch applied with a minor conflict on include/linux/bpf.h, due to unprivileged_ebpf_enabled() backported in 6481835a9a5b ("x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting")
Now queued up, thanks.
greg k-h