Hello,
Wei reported that loading his BPF program on a 5.10.200 kernel panics the host; this did not happen on 5.10.135. A test on the latest v5.10.238 still panics:
[ 26.531718] BUG: kernel NULL pointer dereference, address: 0000000000000168
[ 26.538093] #PF: supervisor read access in kernel mode
[ 26.542727] #PF: error_code(0x0000) - not-present page
[ 26.548093] PGD 10f3e9067 P4D 10f332067 PUD 10f0c5067 PMD 0
[ 26.553211] Oops: 0000 [#1] SMP NOPTI
[ 26.556531] CPU: 2 PID: 541 Comm: main Not tainted 5.10.238-00267-g01e7e36b8606 #63
[ 26.563816] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 26.572357] RIP: 0010:__mark_chain_precision+0x24b/0x4d0
[ 26.576572] Code: 51 01 be 20 00 00 00 4c 89 ef 48 63 d2 e8 bd df 31 00 89 c1 83 f8 1f 7f 29 48 63 d1 48 89 d0 48 c1 e0 04 48 29 d0 48 8d 04 c3 <83> 38 01 75 c3 0f b6 74 24 06 80 78 74 00 c6 40 74 01 44 0f 44 f6
[ 26.589100] RSP: 0018:ffa0000000ff7b60 EFLAGS: 00010216
[ 26.592612] RAX: 0000000000000168 RBX: 0000000000000000 RCX: 0000000000000003
[ 26.597416] RDX: 0000000000000003 RSI: 0000000000000020 RDI: ffa0000000ff7b78
[ 26.601362] RBP: 0000000000000003 R08: ffa0000000ff7b70 R09: 0000000000000004
[ 26.604261] R10: 0000000000000007 R11: ffa0000000425000 R12: ff11000102ee2000
[ 26.607202] R13: ffa0000000ff7b78 R14: 0000000000000000 R15: ff1100010ee37140
[ 26.610327] FS: 00000000007a0630(0000) GS:ff1100081c400000(0000) knlGS:0000000000000000
[ 26.613678] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 26.616105] CR2: 0000000000000168 CR3: 0000000115e72002 CR4: 0000000000371ee0
[ 26.619059] Call Trace:
[ 26.620118] adjust_reg_min_max_vals+0x133/0x340
[ 26.622048] ? krealloc+0x63/0xe0
[ 26.623435] do_check+0x38c/0xa80
[ 26.624859] do_check_common+0x15b/0x280
[ 26.626496] bpf_check+0xbe1/0xd30
[ 26.627939] ? srso_alias_return_thunk+0x5/0x7f
[ 26.629796] ? trace_hardirqs_on+0x1a/0xd0
[ 26.631503] ? srso_alias_return_thunk+0x5/0x7f
[ 26.633402] bpf_prog_load+0x422/0x8a0
[ 26.634987] ? srso_alias_return_thunk+0x5/0x7f
[ 26.636864] ? __handle_mm_fault+0x3cb/0x6d0
[ 26.638658] ? srso_alias_return_thunk+0x5/0x7f
[ 26.640543] ? lock_release+0xe3/0x110
[ 26.642114] __do_sys_bpf+0x485/0xdf0
[ 26.643624] do_syscall_64+0x33/0x40
[ 26.645110] entry_SYSCALL_64_after_hwframe+0x67/0xd1
[ 26.647190] RIP: 0033:0x409a6e
[ 26.648470] Code: 24 28 44 8b 44 24 2c e9 70 ff ff ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48
[ 26.656154] RSP: 002b:000000c00199edc0 EFLAGS: 00000212 ORIG_RAX: 0000000000000141
[ 26.659451] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 0000000000409a6e
[ 26.662375] RDX: 0000000000000098 RSI: 000000c00199f290 RDI: 0000000000000005
[ 26.665267] RBP: 000000c00199ee00 R08: 0000000000000000 R09: 0000000000000000
[ 26.668204] R10: 0000000000000000 R11: 0000000000000212 R12: 0000000000000000
[ 26.671125] R13: 0000000000000080 R14: 000000c000002380 R15: 8080808080808080
[ 26.674085] Modules linked in:
[ 26.675363] CR2: 0000000000000168
[ 26.676772] ---[ end trace 3fc192ee4dabbf12 ]---
[ 26.678667] RIP: 0010:__mark_chain_precision+0x24b/0x4d0
[ 26.680926] Code: 51 01 be 20 00 00 00 4c 89 ef 48 63 d2 e8 bd df 31 00 89 c1 83 f8 1f 7f 29 48 63 d1 48 89 d0 48 c1 e0 04 48 29 d0 48 8d 04 c3 <83> 38 01 75 c3 0f b6 74 24 06 80 78 74 00 c6 40 74 01 44 0f 44 f6
[ 26.688665] RSP: 0018:ffa0000000ff7b60 EFLAGS: 00010216
[ 26.690828] RAX: 0000000000000168 RBX: 0000000000000000 RCX: 0000000000000003
[ 26.693777] RDX: 0000000000000003 RSI: 0000000000000020 RDI: ffa0000000ff7b78
[ 26.696680] RBP: 0000000000000003 R08: ffa0000000ff7b70 R09: 0000000000000004
[ 26.699651] R10: 0000000000000007 R11: ffa0000000425000 R12: ff11000102ee2000
[ 26.702561] R13: ffa0000000ff7b78 R14: 0000000000000000 R15: ff1100010ee37140
[ 26.705522] FS: 00000000007a0630(0000) GS:ff1100081c400000(0000) knlGS:0000000000000000
[ 26.708806] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 26.711179] CR2: 0000000000000168 CR3: 0000000115e72002 CR4: 0000000000371ee0
[ 26.714143] Kernel panic - not syncing: Fatal exception
[ 26.716893] Kernel Offset: disabled
[ 26.718911] Rebooting in 5 seconds..
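A note on the faulting address, based only on my reading of the Code: bytes (so please treat it as an assumption rather than a confirmed analysis): the instructions just before the faulting one, 48 c1 e0 04 / 48 29 d0 / 48 8d 04 c3, appear to compute RBX + RDX * 120, and the faulting 83 38 01 compares the 32-bit value at that address with 1. With RBX = 0 and RDX = 3 from the register dump this gives 0x168, which matches CR2. That would be consistent with indexing element 3 of an array of 120-byte objects through a NULL base pointer. I assume the 120-byte object is struct bpf_reg_state, but I have not checked its size on this build. The names REG_STRIDE and faulting_addr below are made up for this sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Stride implied by the code bytes: (i << 4) - i, then scaled by 8. */
enum { REG_STRIDE = ((1 << 4) - 1) * 8 };

static_assert(REG_STRIDE == 120, "stride implied by the code bytes");

/*
 * Address formed by the shl/sub/lea sequence for a given base (RBX)
 * and index (RDX). Hypothetical helper, not kernel code.
 */
static uint64_t faulting_addr(uint64_t base, uint64_t i)
{
	return base + ((i << 4) - i) * 8;
}
```

Calling faulting_addr(0, 3) reproduces the address reported in CR2.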
I bisected on the linux-5.10.y branch and found that the first bad commit is 2474ec58b96d ("bpf: allow precision tracking for programs with subprogs").
I noticed a commit in Linus' master branch that carries a Fixes: tag for the bisected commit: 81335f90e8a8 ("bpf: unconditionally reset backtrack_state masks on global func exit"). I tried to apply it to 5.10.y, but the code bases differ too much for it to apply cleanly, and I ended up with the following diff:
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 40ac67a04ab75..71da33fb96552 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2118,11 +2118,9 @@ static int __mark_chain_precision(struct bpf_verifier_env *env, int frame, int r
 				bitmap_from_u64(mask, reg_mask);
 				for_each_set_bit(i, mask, 32) {
 					reg = &st->frame[0]->regs[i];
-					if (reg->type != SCALAR_VALUE) {
-						reg_mask &= ~(1u << i);
-						continue;
-					}
-					reg->precise = true;
+					reg_mask &= ~(1u << i);
+					if (reg->type == SCALAR_VALUE)
+						reg->precise = true;
 				}
 				return 0;
 			}
But it didn't make any difference.
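To make the intent of the diff concrete, here is a minimal user-space model of the loop before and after the change. It is only a sketch: struct model_reg, MODEL_SCALAR, MODEL_OTHER, old_loop and new_loop are names invented for illustration, and nothing here is the verifier's real data structure. It shows that the old loop leaves the mask bits of scalar registers set, while the new loop clears every requested bit and still marks scalars precise:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SCALAR 1	/* stand-in for SCALAR_VALUE */
#define MODEL_OTHER  2	/* stand-in for any non-scalar register type */

struct model_reg {
	int type;
	bool precise;
};

/* Loop before the diff: only bits of non-scalar registers are cleared. */
static uint32_t old_loop(struct model_reg *regs, uint32_t reg_mask)
{
	const uint32_t mask = reg_mask;	/* snapshot, like bitmap_from_u64() */

	assert(regs);
	for (int i = 0; i < 32; i++) {
		if (!(mask & (1u << i)))
			continue;
		if (regs[i].type != MODEL_SCALAR) {
			reg_mask &= ~(1u << i);
			continue;
		}
		regs[i].precise = true;
	}
	return reg_mask;
}

/* Loop after the diff: every requested bit is cleared unconditionally. */
static uint32_t new_loop(struct model_reg *regs, uint32_t reg_mask)
{
	const uint32_t mask = reg_mask;

	assert(regs);
	for (int i = 0; i < 32; i++) {
		if (!(mask & (1u << i)))
			continue;
		reg_mask &= ~(1u << i);
		if (regs[i].type == MODEL_SCALAR)
			regs[i].precise = true;
	}
	return reg_mask;
}
```

For a mask of 0x6 with register 1 scalar and register 2 non-scalar, old_loop returns 0x2 and new_loop returns 0. Both loops return straight after, so this difference alone may not matter on this code path in 5.10.y.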
Here are the steps to reproduce:
1. Clone https://github.com/bytedance/vArmor-ebpf and switch to the panic-analysis branch;
2. Run "make build".
A binary named main should be built. I used the Go compiler downloaded from https://go.dev/dl/go1.24.3.linux-amd64.tar.gz, but other Go compilers may also work.
Run main as root and it will panic the host (the kernel needs CONFIG_BPF_LSM).
Full dmesg and config are attached. Feel free to let me know if you need any additional info. Thanks.
P.S. linux-5.15.y has the same situation.