On 3/17/23 11:24 PM, Daniel Borkmann wrote:
On 3/14/23 9:34 PM, Xu Kuohai wrote:
From: Xu Kuohai xukuohai@huawei.com
After commit 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking"), the following bpf prog is rejected:
0: (61) r2 = *(u32 *)(r1 +0) ; R2_w=pkt(off=0,r=0,imm=0) 1: (61) r3 = *(u32 *)(r1 +4) ; R3_w=pkt_end(off=0,imm=0) 2: (bf) r1 = r2 3: (07) r1 += 1 4: (2d) if r1 > r3 goto pc+8 5: (71) r1 = *(u8 *)(r2 +0) ; R1_w=scalar(umax=255,var_off=(0x0; 0xff)) 6: (18) r0 = 0x7fffffffffffff10 8: (0f) r1 += r0 ; R1_w=scalar(umin=0x7fffffffffffff10,umax=0x800000000000000f) 9: (18) r0 = 0x8000000000000000 11: (07) r0 += 1 12: (ad) if r0 < r1 goto pc-2 13: (b7) r0 = 0 14: (95) exit
And the verifier log says:
[...]
from 12 to 11: R0_w=-9223372036854775794 R1=scalar(umin=9223372036854775823,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff)) 11: (07) r0 += 1 ; R0_w=-9223372036854775793 12: (ad) if r0 < r1 goto pc-2 ; R0_w=-9223372036854775793 R1=scalar(umin=9223372036854775823,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff)) 13: safe
from 12 to 11: R0_w=-9223372036854775793 R1=scalar(umin=9223372036854775824,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff)) 11: (07) r0 += 1 ; R0_w=-9223372036854775792 12: (ad) if r0 < r1 goto pc-2 ; R0_w=-9223372036854775792 R1=scalar(umin=9223372036854775824,umax=9223372036854775823,var_off=(0x8000000000000000; 0xffffffff)) 13: safe
[...]
What can be seen here is that r1->umin grows blindly and becomes bigger than r1->umax. The reason is because the loop does not terminate, when r0 increases to r1->umax_value, the following code in reg_set_min_max() sets r1->umin_value to r1->umax_value + 1 blindly:
case BPF_JGT: { if (is_jmp32) { [...] } else { u64 false_umax = opcode == BPF_JGT ? val : val - 1; u64 true_umin = opcode == BPF_JGT ? val + 1 : val;
false_reg->umax_value = min(false_reg->umax_value, false_umax); true_reg->umin_value = max(true_reg->umin_value, true_umin); } break; }
Why the loop does not terminate is because tnum_is_const(src_reg->var_off) always returns false, causing is_branch_taken() to be skipped:
if (src_reg->type == SCALAR_VALUE && !is_jmp32 && tnum_is_const(src_reg->var_off)) { pred = is_branch_taken(dst_reg, // could not reach here src_reg->var_off.value, opcode, is_jmp32); }
Why tnum_is_const(src_reg->var_off) always returns false is because r1->umin_value starts increasing from 0x7fffffffffffff10, always bigger than U32_MAX, causing the __reg_combine_64_into_32() to mark the lower 32 bits unbounded, i.e. not a constant.
To fix it:
- avoid increasing reg lower bound to a value bigger than the upper bound,
or decreasing reg upper bound to a value smaller than the lower bound. 2. set 32-bit min/max values to the lower 32 bits of the 64-bit min/max values when the 64-bit min/max values are equal.
Should both these be separate patches, meaning are both of them strictly required as one logical entity or not? From your description it's not really clear wrt reg_{inc,dec}_{u32,u64}_{min,max} and if this is mainly defensive or required.
Fyi, I'm working on the below draft patch which passes all of test_verifier and your test cases as well from patch 2. Will cook a proper patch once I'm through with further analysis:
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index d517d13878cf..8bef2ed89e87 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1823,7 +1823,7 @@ static void __reg_bound_offset(struct bpf_reg_state *reg) struct tnum var64_off = tnum_intersect(reg->var_off, tnum_range(reg->umin_value, reg->umax_value)); - struct tnum var32_off = tnum_intersect(tnum_subreg(reg->var_off), + struct tnum var32_off = tnum_intersect(tnum_subreg(var64_off), tnum_range(reg->u32_min_value, reg->u32_max_value));