On Tue, 2024-08-20 at 18:33 -0700, Eduard Zingerman wrote:
On Tue, 2024-08-20 at 17:21 +0800, Liu RuiTong wrote:
[...]
bpf_core_calc_relo_insn+311 <bpf_core_calc_relo_insn+311>
────────────────────[ SOURCE (CODE) ]────────────────────
In file: /home/ubuntu/fuzz/linux-6.11-rc4/tools/lib/bpf/relo_core.c:1300
   1295         char spec_buf[256];
   1296         int i, j, err;
   1297
   1298         local_id = relo->type_id;
   1299         local_type = btf_type_by_id(local_btf, local_id);
 ► 1300         local_name = btf__name_by_offset(local_btf, local_type->name_off);
Hi Liu,
Thank you for the report, I can reproduce the issue. Will comment later today.
Hi Liu,
Your analysis is correct: the issue is caused by a missing NULL pointer check for 'local_type'.
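To illustrate the kind of check that is missing, here is a minimal user-space sketch. It is not the actual fix and it does not use the kernel or libbpf structures: 'struct mock_btf', 'struct mock_btf_type', 'mock_type_by_id()' and 'mock_name_off()' are hypothetical stand-ins made up for this example. It only assumes that the lookup returns NULL for an id that does not resolve, and shows the caller rejecting that case before dereferencing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the BTF structures. */
struct mock_btf_type {
	uint32_t name_off;
};

struct mock_btf {
	const struct mock_btf_type **types;
	uint32_t nr_types;
};

/* Return NULL for an out-of-range id instead of indexing past the table. */
static const struct mock_btf_type *
mock_type_by_id(const struct mock_btf *btf, uint32_t id)
{
	assert(btf != NULL);
	if (id >= btf->nr_types)
		return NULL;
	return btf->types[id];
}

/* Fetch the name offset of a type; reject ids that do not resolve. */
static int mock_name_off(const struct mock_btf *btf, uint32_t id, uint32_t *out)
{
	const struct mock_btf_type *t = mock_type_by_id(btf, id);

	if (!t)			/* the check that the crashing path lacks */
		return -EINVAL;
	*out = t->name_off;
	return 0;
}
```

With this shape, a relocation that carries a bogus type id produces an error code instead of a NULL dereference at the 'name_off' access.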
I was curious why the attached test case does not trigger the null pointer dereference every time, but then I realized that this is because of the sequence of BPF commands it issues (each in a separate thread):
1. Create BTF, wait for completion;
2. Load BPF program, do not wait for completion;
3. Rewrite the memory region passed to the load BPF command as bpf_attr, to reuse it for another system call (the actual call is a map update, but that does not matter).
From time to time, steps (2) and (3) run concurrently, and the user space memory chunk passed to the kernel in (2) is updated in a way that makes the relocation data invalid.
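The effect of that overlap can be shown with a deterministic, single-threaded sketch. This is not the attached test case and it makes no real bpf() calls: 'struct mock_attr', 'MOCK_NR_TYPES', 'mock_relo_id_valid()' and 'mock_reuse_for_other_call()' are hypothetical names, and the single 'relo_type_id' field is a simplification of the relocation data. It only shows that if the buffer is rewritten before the consumer reads it, the consumer sees a type id that no longer resolves.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the user memory handed to the load command. */
struct mock_attr {
	uint32_t relo_type_id;
};

/* Assumed number of types in the previously created BTF. */
#define MOCK_NR_TYPES 4u

/* Stand-in for the consumer side: is the relocation's type id in range? */
static int mock_relo_id_valid(const struct mock_attr *attr)
{
	return attr->relo_type_id < MOCK_NR_TYPES;
}

/* Step (3): the same memory is reused for an unrelated command. */
static void mock_reuse_for_other_call(struct mock_attr *attr)
{
	memset(attr, 0xff, sizeof(*attr));
}
```

If the read happens before the rewrite, the id is valid; if the rewrite wins the race, the id is out of range, which matches the intermittent nature of the crash.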
I attach a simplified test case and will post a fix to the bpf mailing list shortly.