This improves the expressiveness of unprivileged BPF by inserting speculation barriers instead of rejecting the programs.
The approach was presented at LPC'24: https://lpc.events/event/18/contributions/1954/ ("Mitigating Spectre-PHT using Speculation Barriers in Linux eBPF") and RAID'24: https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
The goal of this RFC is to get feedback on the approach and on the structuring into commits.
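To illustrate the idea with a minimal userspace sketch (assuming x86 and using lfence as the barrier; purely illustrative, not code from the series): instead of rejecting the classic bounds-check-bypass gadget below for unprivileged users, the verifier conceptually accepts it and places a barrier before the load that could otherwise execute speculatively with an out-of-bounds index.

	#include <stddef.h>
	#include <stdint.h>

	static uint64_t array[256];

	uint64_t lookup(size_t idx, size_t len) /* len <= 256 */
	{
		if (idx < len) {
			/* Speculation barrier: even if the bounds check above
			 * is mispredicted, the load below cannot execute with
			 * an out-of-bounds idx.
			 */
			__asm__ volatile("lfence" ::: "memory");
			return array[idx];
		}
		return 0;
	}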
TODOs to be fixed for final version:
* actually emit arm64 barrier
* fix unexpected_load_success from test_progs for "bpf: Fall back to nospec for sanitization-failures"
* use bpf-next as base commit
Luis Gerhorst (9):
  bpf/arm64: Unset bypass_spec_v4() instead of ignoring BPF_NOSPEC
  bpf: Refactor do_check() if/else into do_check_insn()
  bpf: Return EFAULT on misconfigurations
  bpf: Return EFAULT on internal errors
  bpf: Fall back to nospec if v1 verification fails
  bpf: Allow nospec-protected var-offset stack access
  bpf: Refactor push_stack to return error code
  bpf: Fall back to nospec for sanitization-failures
  bpf: Cut speculative path verification short
 arch/arm64/net/bpf_jit_comp.c                 |  10 +-
 include/linux/bpf.h                           |  14 +-
 include/linux/bpf_verifier.h                  |   3 +-
 kernel/bpf/core.c                             |  17 +-
 kernel/bpf/verifier.c                         | 832 ++++++++++--------
 .../selftests/bpf/progs/verifier_and.c        |   3 +-
 .../selftests/bpf/progs/verifier_bounds.c     |  30 +-
 .../selftests/bpf/progs/verifier_movsx.c      |   6 +-
 .../selftests/bpf/progs/verifier_unpriv.c     |   3 +-
 .../bpf/progs/verifier_value_ptr_arith.c      |  11 +-
 10 files changed, 520 insertions(+), 409 deletions(-)
base-commit: d082ecbc71e9e0bf49883ee4afd435a77a5101b6
This changes the semantics of BPF_NOSPEC to always insert a speculation barrier. If this is not needed on some architecture, bypass_spec_v4() should instead return true.
Consequently, sanitize_stack_spill is renamed to nospec_result.
This later allows us to rely on the BPF_NOSPEC barriers emitted for Spectre v4 to reduce the complexity of Spectre v1 verification.
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 arch/arm64/net/bpf_jit_comp.c | 10 +---------
 include/linux/bpf.h           | 14 +++++++++++++-
 include/linux/bpf_verifier.h  |  2 +-
 kernel/bpf/verifier.c         |  4 ++--
 4 files changed, 17 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index 8446848edddb..18370a45e8f2 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -1508,15 +1508,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
/* speculation barrier */ case BPF_ST | BPF_NOSPEC: - /* - * Nothing required here. - * - * In case of arm64, we rely on the firmware mitigation of - * Speculative Store Bypass as controlled via the ssbd kernel - * parameter. Whenever the mitigation is enabled, it works - * for all of the kernel code with no need to provide any - * additional instructions. - */ + /* TODO: emit(A64_SB) */ break;
/* ST: *(size *)(dst + off) = imm */ diff --git a/include/linux/bpf.h b/include/linux/bpf.h index f3f50e29d639..bd2a2c5f519e 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2423,7 +2423,19 @@ static inline bool bpf_bypass_spec_v1(const struct bpf_token *token)
static inline bool bpf_bypass_spec_v4(const struct bpf_token *token) { - return cpu_mitigations_off() || bpf_token_capable(token, CAP_PERFMON); +#ifdef ARM64 + /* In case of arm64, we rely on the firmware mitigation of Speculative + * Store Bypass as controlled via the ssbd kernel parameter. Whenever + * the mitigation is enabled, it works for all of the kernel code with + * no need to provide any additional instructions. Therefore, skip + * inserting nospec insns against Spectre v4 if arm64 + * spectre_v4_mitigations_on/dynamic() is true. + */ + bool spec_v4 = arm64_get_spectre_v4_state() == SPECTRE_VULNERABLE; +#else + bool spec_v4 = true; +#endif + return !spec_v4 || cpu_mitigations_off() || bpf_token_capable(token, CAP_PERFMON); }
int bpf_map_new_fd(struct bpf_map *map, int flags); diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 32c23f2a3086..2af09d75c7cd 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -561,7 +561,7 @@ struct bpf_insn_aux_data { u64 map_key_state; /* constant (32 bit) key tracking for maps */ int ctx_field_size; /* the ctx field size for load insn, maybe 0 */ u32 seen; /* this insn was processed by the verifier at env->pass_cnt */ - bool sanitize_stack_spill; /* subject to Spectre v4 sanitation */ + bool nospec_result; /* ensure following insns from executing speculatively */ bool zext_dst; /* this insn zero extends dst reg */ bool needs_zext; /* alu op needs to clear upper bits */ bool storage_get_func_atomic; /* bpf_*_storage_get() with atomic memory alloc */ diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 60611df77957..5be3bd38f540 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4904,7 +4904,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env, }
if (sanitize) - env->insn_aux_data[insn_idx].sanitize_stack_spill = true; + env->insn_aux_data[insn_idx].nospec_result = true; }
err = destroy_if_dynptr_stack_slot(env, state, spi); @@ -20445,7 +20445,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) }
if (type == BPF_WRITE && - env->insn_aux_data[i + delta].sanitize_stack_spill) { + env->insn_aux_data[i + delta].nospec_result) { struct bpf_insn patch[] = { *insn, BPF_ST_NOSPEC(),
This is required so that we can later catch the errors and fall back to a nospec if we are on a speculative path.
Move code into do_check_insn(), replace "continue" with "return CHECK_NEXT_INSN", "break" with "return ALL_PATHS_CHECKED", "do_print_state = " with "*do_print_state = ", and "goto process_bpf_exit" / fallthrough with "return process_bpf_exit()".
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 kernel/bpf/verifier.c | 528 +++++++++++++++++++++++-------------------
 1 file changed, 288 insertions(+), 240 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 5be3bd38f540..42ff90bc81e6 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -18932,6 +18932,275 @@ static int save_aux_ptr_type(struct bpf_verifier_env *env, enum bpf_reg_type typ return 0; }
+enum { + ALL_PATHS_CHECKED = 1, + CHECK_NEXT_INSN +}; + +static int process_bpf_exit(struct bpf_verifier_env *env, int *prev_insn_idx, + bool pop_log, bool *do_print_state) +{ + int err; + + mark_verifier_state_scratched(env); + update_branch_counts(env, env->cur_state); + err = pop_stack(env, prev_insn_idx, + &env->insn_idx, pop_log); + if (err < 0) { + if (err != -ENOENT) + return err; + return ALL_PATHS_CHECKED; + } + + *do_print_state = true; + return CHECK_NEXT_INSN; +} + +static int do_check_insn(struct bpf_verifier_env *env, struct bpf_insn *insn, + bool pop_log, bool *do_print_state, + struct bpf_reg_state *regs, + struct bpf_verifier_state *state, int *prev_insn_idx) +{ + int err; + u8 class = BPF_CLASS(insn->code); + bool exception_exit = false; + + if (class == BPF_ALU || class == BPF_ALU64) { + err = check_alu_op(env, insn); + if (err) + return err; + + } else if (class == BPF_LDX) { + enum bpf_reg_type src_reg_type; + + /* check for reserved fields is already done */ + + /* check src operand */ + err = check_reg_arg(env, insn->src_reg, SRC_OP); + if (err) + return err; + + err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK); + if (err) + return err; + + src_reg_type = regs[insn->src_reg].type; + + /* check that memory (src_reg + off) is readable, + * the state of dst_reg will be updated by this func + */ + err = check_mem_access(env, env->insn_idx, insn->src_reg, + insn->off, BPF_SIZE(insn->code), + BPF_READ, insn->dst_reg, false, + BPF_MODE(insn->code) == BPF_MEMSX); + err = err ?: save_aux_ptr_type(env, src_reg_type, true); + err = err ?: + reg_bounds_sanity_check(env, ®s[insn->dst_reg], + "ldx"); + if (err) + return err; + } else if (class == BPF_STX) { + enum bpf_reg_type dst_reg_type; + + if (BPF_MODE(insn->code) == BPF_ATOMIC) { + err = check_atomic(env, env->insn_idx, insn); + if (err) + return err; + env->insn_idx++; + return CHECK_NEXT_INSN; + } + + if (BPF_MODE(insn->code) != BPF_MEM || insn->imm != 0) { + verbose(env, "BPF_STX uses reserved fields\n"); + return -EINVAL; + } + + /* check src1 operand */ + err = check_reg_arg(env, insn->src_reg, SRC_OP); + if (err) + return err; + /* check src2 operand */ + err = check_reg_arg(env, insn->dst_reg, SRC_OP); + if (err) + return err; + + dst_reg_type = regs[insn->dst_reg].type; + + /* check that memory (dst_reg + off) is writeable */ + err = check_mem_access(env, env->insn_idx, insn->dst_reg, + insn->off, BPF_SIZE(insn->code), + BPF_WRITE, insn->src_reg, false, false); + if (err) + return err; + + err = save_aux_ptr_type(env, dst_reg_type, false); + if (err) + return err; + } else if (class == BPF_ST) { + enum bpf_reg_type dst_reg_type; + + if (BPF_MODE(insn->code) != BPF_MEM || + insn->src_reg != BPF_REG_0) { + verbose(env, "BPF_ST uses reserved fields\n"); + return -EINVAL; + } + /* check src operand */ + err = check_reg_arg(env, insn->dst_reg, SRC_OP); + if (err) + return err; + + dst_reg_type = regs[insn->dst_reg].type; + + /* check that memory (dst_reg + off) is writeable */ + err = check_mem_access(env, env->insn_idx, insn->dst_reg, + insn->off, BPF_SIZE(insn->code), + BPF_WRITE, -1, false, false); + if (err) + return err; + + err = save_aux_ptr_type(env, dst_reg_type, false); + if (err) + return err; + } else if (class == BPF_JMP || class == BPF_JMP32) { + u8 opcode = BPF_OP(insn->code); + + env->jmps_processed++; + if (opcode == BPF_CALL) { + if (BPF_SRC(insn->code) != BPF_K || + (insn->src_reg != BPF_PSEUDO_KFUNC_CALL && + insn->off != 0) || + (insn->src_reg != BPF_REG_0 && + insn->src_reg != BPF_PSEUDO_CALL 
&& + insn->src_reg != BPF_PSEUDO_KFUNC_CALL) || + insn->dst_reg != BPF_REG_0 || class == BPF_JMP32) { + verbose(env, "BPF_CALL uses reserved fields\n"); + return -EINVAL; + } + + if (env->cur_state->active_locks) { + if ((insn->src_reg == BPF_REG_0 && + insn->imm != BPF_FUNC_spin_unlock) || + (insn->src_reg == BPF_PSEUDO_KFUNC_CALL && + (insn->off != 0 || + !kfunc_spin_allowed(insn->imm)))) { + verbose(env, + "function calls are not allowed while holding a lock\n"); + return -EINVAL; + } + } + if (insn->src_reg == BPF_PSEUDO_CALL) { + err = check_func_call(env, insn, + &env->insn_idx); + } else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) { + err = check_kfunc_call(env, insn, + &env->insn_idx); + if (!err && is_bpf_throw_kfunc(insn)) { + exception_exit = true; + goto process_bpf_exit_full; + } + } else { + err = check_helper_call(env, insn, + &env->insn_idx); + } + if (err) + return err; + + mark_reg_scratched(env, BPF_REG_0); + } else if (opcode == BPF_JA) { + if (BPF_SRC(insn->code) != BPF_K || + insn->src_reg != BPF_REG_0 || + insn->dst_reg != BPF_REG_0 || + (class == BPF_JMP && insn->imm != 0) || + (class == BPF_JMP32 && insn->off != 0)) { + verbose(env, "BPF_JA uses reserved fields\n"); + return -EINVAL; + } + + if (class == BPF_JMP) + env->insn_idx += insn->off + 1; + else + env->insn_idx += insn->imm + 1; + return CHECK_NEXT_INSN; + + } else if (opcode == BPF_EXIT) { + if (BPF_SRC(insn->code) != BPF_K || insn->imm != 0 || + insn->src_reg != BPF_REG_0 || + insn->dst_reg != BPF_REG_0 || class == BPF_JMP32) { + verbose(env, "BPF_EXIT uses reserved fields\n"); + return -EINVAL; + } +process_bpf_exit_full: + /* We must do check_reference_leak here before + * prepare_func_exit to handle the case when + * state->curframe > 0, it may be a callback function, + * for which reference_state must match caller reference + * state when it exits. + */ + err = check_resource_leak(env, exception_exit, + !env->cur_state->curframe, + "BPF_EXIT instruction in main prog"); + if (err) + return err; + + /* The side effect of the prepare_func_exit which is + * being skipped is that it frees bpf_func_state. + * Typically, process_bpf_exit will only be hit with + * outermost exit. copy_verifier_state in pop_stack will + * handle freeing of any extra bpf_func_state left over + * from not processing all nested function exits. We + * also skip return code checks as they are not needed + * for exceptional exits. 
+ */ + if (exception_exit) + goto process_bpf_exit; + + if (state->curframe) { + /* exit from nested function */ + err = prepare_func_exit(env, &env->insn_idx); + if (err) + return err; + *do_print_state = true; + return CHECK_NEXT_INSN; + } + + err = check_return_code(env, BPF_REG_0, "R0"); + if (err) + return err; +process_bpf_exit: + return process_bpf_exit(env, prev_insn_idx, pop_log, + do_print_state); + } else { + err = check_cond_jmp_op(env, insn, &env->insn_idx); + if (err) + return err; + } + } else if (class == BPF_LD) { + u8 mode = BPF_MODE(insn->code); + + if (mode == BPF_ABS || mode == BPF_IND) { + err = check_ld_abs(env, insn); + if (err) + return err; + + } else if (mode == BPF_IMM) { + err = check_ld_imm(env, insn); + if (err) + return err; + + env->insn_idx++; + sanitize_mark_insn_seen(env); + } else { + verbose(env, "invalid BPF_LD mode\n"); + return -EINVAL; + } + } else { + verbose(env, "unknown insn class %d\n", class); + return -EINVAL; + } + + return 0; +} + static int do_check(struct bpf_verifier_env *env) { bool pop_log = !(env->log.level & BPF_LOG_LEVEL2); @@ -18943,9 +19212,7 @@ static int do_check(struct bpf_verifier_env *env) int prev_insn_idx = -1;
for (;;) { - bool exception_exit = false; struct bpf_insn *insn; - u8 class; int err;
/* reset current history entry on each new instruction */ @@ -18959,7 +19226,6 @@ static int do_check(struct bpf_verifier_env *env) }
insn = &insns[env->insn_idx]; - class = BPF_CLASS(insn->code);
if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) { verbose(env, @@ -18985,7 +19251,16 @@ static int do_check(struct bpf_verifier_env *env) else verbose(env, "%d: safe\n", env->insn_idx); } - goto process_bpf_exit; + err = process_bpf_exit(env, &prev_insn_idx, pop_log, + &do_print_state); + if (err == CHECK_NEXT_INSN) { + continue; + } else if (err == ALL_PATHS_CHECKED) { + break; + } else if (err) { + WARN_ON_ONCE(err > 0); + return err; + } } }
@@ -19039,242 +19314,15 @@ static int do_check(struct bpf_verifier_env *env) sanitize_mark_insn_seen(env); prev_insn_idx = env->insn_idx;
- if (class == BPF_ALU || class == BPF_ALU64) { - err = check_alu_op(env, insn); - if (err) - return err; - - } else if (class == BPF_LDX) { - enum bpf_reg_type src_reg_type; - - /* check for reserved fields is already done */ - - /* check src operand */ - err = check_reg_arg(env, insn->src_reg, SRC_OP); - if (err) - return err; - - err = check_reg_arg(env, insn->dst_reg, DST_OP_NO_MARK); - if (err) - return err; - - src_reg_type = regs[insn->src_reg].type; - - /* check that memory (src_reg + off) is readable, - * the state of dst_reg will be updated by this func - */ - err = check_mem_access(env, env->insn_idx, insn->src_reg, - insn->off, BPF_SIZE(insn->code), - BPF_READ, insn->dst_reg, false, - BPF_MODE(insn->code) == BPF_MEMSX); - err = err ?: save_aux_ptr_type(env, src_reg_type, true); - err = err ?: reg_bounds_sanity_check(env, ®s[insn->dst_reg], "ldx"); - if (err) - return err; - } else if (class == BPF_STX) { - enum bpf_reg_type dst_reg_type; - - if (BPF_MODE(insn->code) == BPF_ATOMIC) { - err = check_atomic(env, env->insn_idx, insn); - if (err) - return err; - env->insn_idx++; - continue; - } - - if (BPF_MODE(insn->code) != BPF_MEM || insn->imm != 0) { - verbose(env, "BPF_STX uses reserved fields\n"); - return -EINVAL; - } - - /* check src1 operand */ - err = check_reg_arg(env, insn->src_reg, SRC_OP); - if (err) - return err; - /* check src2 operand */ - err = check_reg_arg(env, insn->dst_reg, SRC_OP); - if (err) - return err; - - dst_reg_type = regs[insn->dst_reg].type; - - /* check that memory (dst_reg + off) is writeable */ - err = check_mem_access(env, env->insn_idx, insn->dst_reg, - insn->off, BPF_SIZE(insn->code), - BPF_WRITE, insn->src_reg, false, false); - if (err) - return err; - - err = save_aux_ptr_type(env, dst_reg_type, false); - if (err) - return err; - } else if (class == BPF_ST) { - enum bpf_reg_type dst_reg_type; - - if (BPF_MODE(insn->code) != BPF_MEM || - insn->src_reg != BPF_REG_0) { - verbose(env, "BPF_ST uses reserved fields\n"); - return -EINVAL; - } - /* check src operand */ - err = check_reg_arg(env, insn->dst_reg, SRC_OP); - if (err) - return err; - - dst_reg_type = regs[insn->dst_reg].type; - - /* check that memory (dst_reg + off) is writeable */ - err = check_mem_access(env, env->insn_idx, insn->dst_reg, - insn->off, BPF_SIZE(insn->code), - BPF_WRITE, -1, false, false); - if (err) - return err; - - err = save_aux_ptr_type(env, dst_reg_type, false); - if (err) - return err; - } else if (class == BPF_JMP || class == BPF_JMP32) { - u8 opcode = BPF_OP(insn->code); - - env->jmps_processed++; - if (opcode == BPF_CALL) { - if (BPF_SRC(insn->code) != BPF_K || - (insn->src_reg != BPF_PSEUDO_KFUNC_CALL - && insn->off != 0) || - (insn->src_reg != BPF_REG_0 && - insn->src_reg != BPF_PSEUDO_CALL && - insn->src_reg != BPF_PSEUDO_KFUNC_CALL) || - insn->dst_reg != BPF_REG_0 || - class == BPF_JMP32) { - verbose(env, "BPF_CALL uses reserved fields\n"); - return -EINVAL; - } - - if (env->cur_state->active_locks) { - if ((insn->src_reg == BPF_REG_0 && insn->imm != BPF_FUNC_spin_unlock) || - (insn->src_reg == BPF_PSEUDO_KFUNC_CALL && - (insn->off != 0 || !kfunc_spin_allowed(insn->imm)))) { - verbose(env, "function calls are not allowed while holding a lock\n"); - return -EINVAL; - } - } - if (insn->src_reg == BPF_PSEUDO_CALL) { - err = check_func_call(env, insn, &env->insn_idx); - } else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) { - err = check_kfunc_call(env, insn, &env->insn_idx); - if (!err && is_bpf_throw_kfunc(insn)) { - exception_exit = true; - goto 
process_bpf_exit_full; - } - } else { - err = check_helper_call(env, insn, &env->insn_idx); - } - if (err) - return err; - - mark_reg_scratched(env, BPF_REG_0); - } else if (opcode == BPF_JA) { - if (BPF_SRC(insn->code) != BPF_K || - insn->src_reg != BPF_REG_0 || - insn->dst_reg != BPF_REG_0 || - (class == BPF_JMP && insn->imm != 0) || - (class == BPF_JMP32 && insn->off != 0)) { - verbose(env, "BPF_JA uses reserved fields\n"); - return -EINVAL; - } - - if (class == BPF_JMP) - env->insn_idx += insn->off + 1; - else - env->insn_idx += insn->imm + 1; - continue; - - } else if (opcode == BPF_EXIT) { - if (BPF_SRC(insn->code) != BPF_K || - insn->imm != 0 || - insn->src_reg != BPF_REG_0 || - insn->dst_reg != BPF_REG_0 || - class == BPF_JMP32) { - verbose(env, "BPF_EXIT uses reserved fields\n"); - return -EINVAL; - } -process_bpf_exit_full: - /* We must do check_reference_leak here before - * prepare_func_exit to handle the case when - * state->curframe > 0, it may be a callback - * function, for which reference_state must - * match caller reference state when it exits. - */ - err = check_resource_leak(env, exception_exit, !env->cur_state->curframe, - "BPF_EXIT instruction in main prog"); - if (err) - return err; - - /* The side effect of the prepare_func_exit - * which is being skipped is that it frees - * bpf_func_state. Typically, process_bpf_exit - * will only be hit with outermost exit. - * copy_verifier_state in pop_stack will handle - * freeing of any extra bpf_func_state left over - * from not processing all nested function - * exits. We also skip return code checks as - * they are not needed for exceptional exits. - */ - if (exception_exit) - goto process_bpf_exit; - - if (state->curframe) { - /* exit from nested function */ - err = prepare_func_exit(env, &env->insn_idx); - if (err) - return err; - do_print_state = true; - continue; - } - - err = check_return_code(env, BPF_REG_0, "R0"); - if (err) - return err; -process_bpf_exit: - mark_verifier_state_scratched(env); - update_branch_counts(env, env->cur_state); - err = pop_stack(env, &prev_insn_idx, - &env->insn_idx, pop_log); - if (err < 0) { - if (err != -ENOENT) - return err; - break; - } else { - do_print_state = true; - continue; - } - } else { - err = check_cond_jmp_op(env, insn, &env->insn_idx); - if (err) - return err; - } - } else if (class == BPF_LD) { - u8 mode = BPF_MODE(insn->code); - - if (mode == BPF_ABS || mode == BPF_IND) { - err = check_ld_abs(env, insn); - if (err) - return err; - - } else if (mode == BPF_IMM) { - err = check_ld_imm(env, insn); - if (err) - return err; - - env->insn_idx++; - sanitize_mark_insn_seen(env); - } else { - verbose(env, "invalid BPF_LD mode\n"); - return -EINVAL; - } - } else { - verbose(env, "unknown insn class %d\n", class); - return -EINVAL; + err = do_check_insn(env, insn, pop_log, &do_print_state, regs, state, + &prev_insn_idx); + if (err == CHECK_NEXT_INSN) { + continue; + } else if (err == ALL_PATHS_CHECKED) { + break; + } else if (err) { + WARN_ON_ONCE(err > 0); + return err; }
env->insn_idx++;
Mark these cases as non-recoverable, even when they only occur during speculative path verification.
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 kernel/bpf/verifier.c | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 42ff90bc81e6..d8a95b84c566 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -8668,7 +8668,7 @@ static int resolve_map_arg_type(struct bpf_verifier_env *env, if (!meta->map_ptr) { /* kernel subsystem misconfigured verifier */ verbose(env, "invalid map_ptr to access map->type\n"); - return -EACCES; + return -EFAULT; }
switch (meta->map_ptr->map_type) { @@ -9356,7 +9356,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, * that kernel subsystem misconfigured verifier */ verbose(env, "invalid map_ptr to access map->key\n"); - return -EACCES; + return -EFAULT; } key_size = meta->map_ptr->key_size; err = check_helper_mem_access(env, regno, key_size, BPF_READ, false, NULL); @@ -9383,7 +9383,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, if (!meta->map_ptr) { /* kernel subsystem misconfigured verifier */ verbose(env, "invalid map_ptr to access map->value\n"); - return -EACCES; + return -EFAULT; } meta->raw_mode = arg_type & MEM_UNINIT; err = check_helper_mem_access(env, regno, meta->map_ptr->value_size, @@ -10687,7 +10687,7 @@ record_func_map(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
if (map == NULL) { verbose(env, "kernel subsystem misconfigured verifier\n"); - return -EINVAL; + return -EFAULT; }
/* In case of read-only, some additional restrictions @@ -10726,7 +10726,7 @@ record_func_key(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta, return 0; if (!map || map->map_type != BPF_MAP_TYPE_PROG_ARRAY) { verbose(env, "kernel subsystem misconfigured verifier\n"); - return -EINVAL; + return -EFAULT; }
reg = ®s[BPF_REG_3]; @@ -10972,7 +10972,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn if (changes_data && fn->arg1_type != ARG_PTR_TO_CTX) { verbose(env, "kernel subsystem misconfigured func %s#%d: r1 != ctx\n", func_id_name(func_id), func_id); - return -EINVAL; + return -EFAULT; }
memset(&meta, 0, sizeof(meta)); @@ -10982,6 +10982,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn if (err) { verbose(env, "kernel subsystem misconfigured func %s#%d\n", func_id_name(func_id), func_id); + WARN_ON_ONCE(error_recoverable_with_nospec(err)); return err; }
@@ -11274,7 +11275,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn if (meta.map_ptr == NULL) { verbose(env, "kernel subsystem misconfigured verifier\n"); - return -EINVAL; + return -EFAULT; }
if (func_id == BPF_FUNC_map_lookup_elem && @@ -16291,7 +16292,7 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn) dst_reg->type = CONST_PTR_TO_MAP; } else { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; }
return 0; @@ -16338,7 +16339,7 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
if (!env->ops->gen_ld_abs) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; }
if (insn->dst_reg != BPF_REG_0 || insn->off != 0 || @@ -20398,7 +20399,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) -(subprogs[0].stack_depth + 8)); if (epilogue_cnt >= INSN_BUF_SIZE) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; } else if (epilogue_cnt) { /* Save the ARG_PTR_TO_CTX for the epilogue to use */ cnt = 0; @@ -20417,13 +20418,13 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) if (ops->gen_prologue || env->seen_direct_write) { if (!ops->gen_prologue) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; } cnt = ops->gen_prologue(insn_buf, env->seen_direct_write, env->prog); if (cnt >= INSN_BUF_SIZE) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; } else if (cnt) { new_prog = bpf_patch_insn_data(env, 0, insn_buf, cnt); if (!new_prog) @@ -20574,7 +20575,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
if (type == BPF_WRITE) { verbose(env, "bpf verifier narrow ctx access misconfigured\n"); - return -EINVAL; + return -EFAULT; }
size_code = BPF_H; @@ -20593,7 +20594,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) if (cnt == 0 || cnt >= INSN_BUF_SIZE || (ctx_field_size && !target_size)) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; }
if (is_narrower_load && size < target_size) { @@ -20601,7 +20602,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) off, size, size_default) * 8; if (shift && cnt + 1 >= INSN_BUF_SIZE) { verbose(env, "bpf verifier narrow ctx load misconfigured\n"); - return -EINVAL; + return -EFAULT; } if (ctx_field_size <= 4) { if (shift) @@ -21355,7 +21356,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env) cnt = env->ops->gen_ld_abs(insn, insn_buf); if (cnt == 0 || cnt >= INSN_BUF_SIZE) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; }
new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt); @@ -21648,7 +21649,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env) goto patch_map_ops_generic; if (cnt <= 0 || cnt >= INSN_BUF_SIZE) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; }
new_prog = bpf_patch_insn_data(env, i + delta, @@ -21991,7 +21992,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env) !map_ptr->ops->map_poke_untrack || !map_ptr->ops->map_poke_run) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; }
ret = map_ptr->ops->map_poke_track(map_ptr, prog->aux);
This prevents us from trying to recover from these on speculative paths in the future.
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 kernel/bpf/verifier.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index d8a95b84c566..03e27b012af3 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11368,7 +11368,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn verbose(env, "verifier internal error:"); verbose(env, "func %s has non-overwritten BPF_PTR_POISON return type\n", func_id_name(func_id)); - return -EINVAL; + return -EFAULT; } ret_btf = btf_vmlinux; ret_btf_id = *fn->ret_btf_id; @@ -14856,12 +14856,12 @@ static int adjust_reg_min_max_vals(struct bpf_verifier_env *env, if (WARN_ON_ONCE(ptr_reg)) { print_verifier_state(env, vstate, vstate->curframe, true); verbose(env, "verifier internal error: unexpected ptr_reg\n"); - return -EINVAL; + return -EFAULT; } if (WARN_ON(!src_reg)) { print_verifier_state(env, vstate, vstate->curframe, true); verbose(env, "verifier internal error: no src_reg\n"); - return -EINVAL; + return -EFAULT; } err = adjust_scalar_min_max_vals(env, insn, dst_reg, *src_reg); if (err)
This implements the core of the series and causes the verifier to fall back to mitigating Spectre v1 using speculation barriers. The approach was presented at LPC'24: https://lpc.events/event/18/contributions/1954/ ("Mitigating Spectre-PHT using Speculation Barriers in Linux eBPF") and RAID'24: https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
In the tests, some cases are now successful where we previously had a false positive (i.e., a rejection). Change them to reflect where the nospec should be inserted (as a comment) and modify the expected error message where the nospec mitigates a problem that previously shadowed another problem.
I briefly went through all the occurrences of EPERM, EINVAL, and EACCES in the verifier in order to validate that catching them like this makes sense.
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 include/linux/bpf_verifier.h                  |  1 +
 kernel/bpf/core.c                             | 17 ++--
 kernel/bpf/verifier.c                         | 77 ++++++++++++++++---
 .../selftests/bpf/progs/verifier_and.c        |  3 +-
 .../selftests/bpf/progs/verifier_bounds.c     | 30 ++++----
 .../selftests/bpf/progs/verifier_movsx.c      |  6 +-
 .../selftests/bpf/progs/verifier_unpriv.c     |  3 +-
 .../bpf/progs/verifier_value_ptr_arith.c      | 11 ++-
 8 files changed, 108 insertions(+), 40 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 2af09d75c7cd..7c289e3bf18d 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -561,6 +561,7 @@ struct bpf_insn_aux_data { u64 map_key_state; /* constant (32 bit) key tracking for maps */ int ctx_field_size; /* the ctx field size for load insn, maybe 0 */ u32 seen; /* this insn was processed by the verifier at env->pass_cnt */ + bool nospec; /* do not execute this instruction speculatively */ bool nospec_result; /* ensure following insns from executing speculatively */ bool zext_dst; /* this insn zero extends dst reg */ bool needs_zext; /* alu op needs to clear upper bits */ diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index da729cbbaeb9..d2971bc8e5c7 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2099,14 +2099,15 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn) #undef COND_JMP /* ST, STX and LDX*/ ST_NOSPEC: - /* Speculation barrier for mitigating Speculative Store Bypass. - * In case of arm64, we rely on the firmware mitigation as - * controlled via the ssbd kernel parameter. Whenever the - * mitigation is enabled, it works for all of the kernel code - * with no need to provide any additional instructions here. - * In case of x86, we use 'lfence' insn for mitigation. We - * reuse preexisting logic from Spectre v1 mitigation that - * happens to produce the required code on x86 for v4 as well. + /* Speculation barrier for mitigating Speculative Store Bypass, + * Bounds-Check Bypass, and Type Confusion. In case of arm64, we + * rely on the firmware mitigation as controlled via the ssbd + * kernel parameter. Whenever the mitigation is enabled, it + * works for all of the kernel code with no need to provide any + * additional instructions here. In case of x86, we use 'lfence' + * insn for mitigation. We reuse preexisting logic from Spectre + * v1 mitigation that happens to produce the required code on + * x86 for v4 as well. */ barrier_nospec(); CONT; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 03e27b012af3..aee49f8da0c1 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1914,6 +1914,17 @@ static int pop_stack(struct bpf_verifier_env *env, int *prev_insn_idx, return 0; }
+static bool error_recoverable_with_nospec(int err) +{ + /* Should only return true for non-fatal errors that are allowed to + * occur during speculative verification. For these we can insert a + * nospec and the program might still be accepted. Do not include + * something like ENOMEM because it is likely to re-occur for the next + * architectural path once it has been recovered-from in all speculative + * paths. */ + return err == -EPERM || err == -EACCES || err == -EINVAL; +} + static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env, int insn_idx, int prev_insn_idx, bool speculative) @@ -19252,16 +19263,7 @@ static int do_check(struct bpf_verifier_env *env) else verbose(env, "%d: safe\n", env->insn_idx); } - err = process_bpf_exit(env, &prev_insn_idx, pop_log, - &do_print_state); - if (err == CHECK_NEXT_INSN) { - continue; - } else if (err == ALL_PATHS_CHECKED) { - break; - } else if (err) { - WARN_ON_ONCE(err > 0); - return err; - } + goto nospec_or_safe_state_found; } }
@@ -19315,17 +19317,47 @@ static int do_check(struct bpf_verifier_env *env) sanitize_mark_insn_seen(env); prev_insn_idx = env->insn_idx;
+ if (state->speculative && cur_aux(env)->nospec) { + /* Reduce verification complexity by only simulating + * speculative paths until we reach a nospec. + */ + goto nospec_or_safe_state_found; + } + err = do_check_insn(env, insn, pop_log, &do_print_state, regs, state, &prev_insn_idx); if (err == CHECK_NEXT_INSN) { continue; } else if (err == ALL_PATHS_CHECKED) { break; + } else if (error_recoverable_with_nospec(err) && state->speculative) { + WARN_ON_ONCE(env->bypass_spec_v1); + WARN_ON_ONCE(env->cur_state != state); + + /* Prevent this speculative path from ever reaching the + * insn that would have been unsafe to execute. + */ + cur_aux(env)->nospec = true; + + goto nospec_or_safe_state_found; } else if (err) { WARN_ON_ONCE(err > 0); return err; }
+ if (state->speculative && cur_aux(env)->nospec_result) { + /* Reduce verification complexity by stopping spec. + * verification when nospec is encountered. + */ +nospec_or_safe_state_found: + err = process_bpf_exit(env, &prev_insn_idx, pop_log, &do_print_state); + if (err == CHECK_NEXT_INSN) + continue; + else if (err == ALL_PATHS_CHECKED) + break; + return err; + } + env->insn_idx++; }
@@ -20447,6 +20479,28 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) bpf_convert_ctx_access_t convert_ctx_access; u8 mode;
+ if (env->insn_aux_data[i + delta].nospec) { + struct bpf_insn patch[] = { + BPF_ST_NOSPEC(), + *insn, + }; + + cnt = ARRAY_SIZE(patch); + new_prog = bpf_patch_insn_data(env, i + delta, patch, cnt); + if (!new_prog) + return -ENOMEM; + + delta += cnt - 1; + env->prog = new_prog; + insn = new_prog->insnsi + i + delta; + /* This can not be easily merged with the + * nospec_result-case, because an insn may require a + * nospec before and after itself. Therefore also do not + * 'continue' here but potentially apply further + * patching to insn. *insn should equal patch[1] now. + */ + } + if (insn->code == (BPF_LDX | BPF_MEM | BPF_B) || insn->code == (BPF_LDX | BPF_MEM | BPF_H) || insn->code == (BPF_LDX | BPF_MEM | BPF_W) || @@ -20495,6 +20549,9 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
if (type == BPF_WRITE && env->insn_aux_data[i + delta].nospec_result) { + /* nospec_result is only used to mitigate Spectre v4 and + * to limit verification-time for Spectre v1. + */ struct bpf_insn patch[] = { *insn, BPF_ST_NOSPEC(), diff --git a/tools/testing/selftests/bpf/progs/verifier_and.c b/tools/testing/selftests/bpf/progs/verifier_and.c index e97e518516b6..98bafce6d8f3 100644 --- a/tools/testing/selftests/bpf/progs/verifier_and.c +++ b/tools/testing/selftests/bpf/progs/verifier_and.c @@ -85,7 +85,7 @@ l0_%=: r0 = r0; \
SEC("socket") __description("check known subreg with unknown reg") -__success __failure_unpriv __msg_unpriv("R1 !read_ok") +__success __success_unpriv __retval(0) __naked void known_subreg_with_unknown_reg(void) { @@ -96,6 +96,7 @@ __naked void known_subreg_with_unknown_reg(void) r0 &= 0xFFFF1234; \ /* Upper bits are unknown but AND above masks out 1 zero'ing lower bits */\ if w0 < 1 goto l0_%=; \ + /* unpriv: nospec (inserted to prevent `R1 !read_ok'`) */\ r1 = *(u32*)(r1 + 512); \ l0_%=: r0 = 0; \ exit; \ diff --git a/tools/testing/selftests/bpf/progs/verifier_bounds.c b/tools/testing/selftests/bpf/progs/verifier_bounds.c index 0eb33bb801b5..0488ef05a7a4 100644 --- a/tools/testing/selftests/bpf/progs/verifier_bounds.c +++ b/tools/testing/selftests/bpf/progs/verifier_bounds.c @@ -620,7 +620,7 @@ l1_%=: exit; \
SEC("socket") __description("bounds check mixed 32bit and 64bit arithmetic. test1") -__success __failure_unpriv __msg_unpriv("R0 invalid mem access 'scalar'") +__success __success_unpriv __retval(0) __naked void _32bit_and_64bit_arithmetic_test1(void) { @@ -636,6 +636,7 @@ __naked void _32bit_and_64bit_arithmetic_test1(void) if w1 > 2 goto l0_%=; \ goto l1_%=; \ l0_%=: /* invalid ldx if bounds are lost above */ \ + /* unpriv: nospec (inserted to prevent `R0 invalid mem access 'scalar'`) */\ r0 = *(u64*)(r0 - 1); \ l1_%=: exit; \ " ::: __clobber_all); @@ -643,7 +644,7 @@ l1_%=: exit; \
SEC("socket") __description("bounds check mixed 32bit and 64bit arithmetic. test2") -__success __failure_unpriv __msg_unpriv("R0 invalid mem access 'scalar'") +__success __success_unpriv __retval(0) __naked void _32bit_and_64bit_arithmetic_test2(void) { @@ -660,6 +661,7 @@ __naked void _32bit_and_64bit_arithmetic_test2(void) if r1 > r2 goto l0_%=; \ goto l1_%=; \ l0_%=: /* invalid ldx if bounds are lost above */ \ + /* unpriv: nospec (inserted to prevent `R0 invalid mem access 'scalar'`) */\ r0 = *(u64*)(r0 - 1); \ l1_%=: exit; \ " ::: __clobber_all); @@ -691,8 +693,7 @@ l0_%=: r0 = 0; \
SEC("socket") __description("bounds check for reg = 0, reg xor 1") -__success __failure_unpriv -__msg_unpriv("R0 min value is outside of the allowed memory range") +__success __success_unpriv __retval(0) __naked void reg_0_reg_xor_1(void) { @@ -708,6 +709,7 @@ __naked void reg_0_reg_xor_1(void) l0_%=: r1 = 0; \ r1 ^= 1; \ if r1 != 0 goto l1_%=; \ + /* unpriv: nospec (inserted to prevent `R0 min value is outside of the allowed memory range`) */\ r0 = *(u64*)(r0 + 8); \ l1_%=: r0 = 0; \ exit; \ @@ -719,8 +721,7 @@ l1_%=: r0 = 0; \
SEC("socket") __description("bounds check for reg32 = 0, reg32 xor 1") -__success __failure_unpriv -__msg_unpriv("R0 min value is outside of the allowed memory range") +__success __success_unpriv __retval(0) __naked void reg32_0_reg32_xor_1(void) { @@ -736,6 +737,7 @@ __naked void reg32_0_reg32_xor_1(void) l0_%=: w1 = 0; \ w1 ^= 1; \ if w1 != 0 goto l1_%=; \ + /* unpriv: nospec (inserted to prevent `R0 min value is outside of the allowed memory range`) */\ r0 = *(u64*)(r0 + 8); \ l1_%=: r0 = 0; \ exit; \ @@ -747,8 +749,7 @@ l1_%=: r0 = 0; \
SEC("socket") __description("bounds check for reg = 2, reg xor 3") -__success __failure_unpriv -__msg_unpriv("R0 min value is outside of the allowed memory range") +__success __success_unpriv __retval(0) __naked void reg_2_reg_xor_3(void) { @@ -764,6 +765,7 @@ __naked void reg_2_reg_xor_3(void) l0_%=: r1 = 2; \ r1 ^= 3; \ if r1 > 0 goto l1_%=; \ + /* unpriv: nospec (inserted to prevent `R0 min value is outside of the allowed memory range`) */\ r0 = *(u64*)(r0 + 8); \ l1_%=: r0 = 0; \ exit; \ @@ -829,8 +831,7 @@ l1_%=: r0 = 0; \
SEC("socket") __description("bounds check for reg > 0, reg xor 3") -__success __failure_unpriv -__msg_unpriv("R0 min value is outside of the allowed memory range") +__success __success_unpriv __retval(0) __naked void reg_0_reg_xor_3(void) { @@ -843,7 +844,8 @@ __naked void reg_0_reg_xor_3(void) call %[bpf_map_lookup_elem]; \ if r0 != 0 goto l0_%=; \ exit; \ -l0_%=: r1 = *(u64*)(r0 + 0); \ +l0_%=: /* unpriv: nospec (inserted to prevent `R0 min value is outside of the allowed memory range`) */\ + r1 = *(u64*)(r0 + 0); \ if r1 <= 0 goto l1_%=; \ r1 ^= 3; \ if r1 >= 0 goto l1_%=; \ @@ -858,8 +860,7 @@ l1_%=: r0 = 0; \
SEC("socket") __description("bounds check for reg32 > 0, reg32 xor 3") -__success __failure_unpriv -__msg_unpriv("R0 min value is outside of the allowed memory range") +__success __success_unpriv __retval(0) __naked void reg32_0_reg32_xor_3(void) { @@ -872,7 +873,8 @@ __naked void reg32_0_reg32_xor_3(void) call %[bpf_map_lookup_elem]; \ if r0 != 0 goto l0_%=; \ exit; \ -l0_%=: r1 = *(u64*)(r0 + 0); \ +l0_%=: /* unpriv: nospec (inserted to prevent `R0 min value is outside of the allowed memory range`) */\ + r1 = *(u64*)(r0 + 0); \ if w1 <= 0 goto l1_%=; \ w1 ^= 3; \ if w1 >= 0 goto l1_%=; \ diff --git a/tools/testing/selftests/bpf/progs/verifier_movsx.c b/tools/testing/selftests/bpf/progs/verifier_movsx.c index 994bbc346d25..5de9bcbe7425 100644 --- a/tools/testing/selftests/bpf/progs/verifier_movsx.c +++ b/tools/testing/selftests/bpf/progs/verifier_movsx.c @@ -245,7 +245,7 @@ l0_%=: \ SEC("socket") __description("MOV32SX, S8, var_off not u32_max, positive after s8 extension") __success __retval(0) -__failure_unpriv __msg_unpriv("frame pointer is read only") +__success_unpriv __naked void mov64sx_s32_varoff_2(void) { asm volatile (" \ @@ -257,6 +257,7 @@ __naked void mov64sx_s32_varoff_2(void) w0 = 0; \ exit; \ l0_%=: \ + /* nospec (inserted to prevent `frame pointer is read only`) */\ r10 = 1; \ exit; \ " : @@ -267,7 +268,7 @@ l0_%=: \ SEC("socket") __description("MOV32SX, S8, var_off not u32_max, negative after s8 extension") __success __retval(0) -__failure_unpriv __msg_unpriv("frame pointer is read only") +__success_unpriv __naked void mov64sx_s32_varoff_3(void) { asm volatile (" \ @@ -280,6 +281,7 @@ __naked void mov64sx_s32_varoff_3(void) w0 = 0; \ exit; \ l0_%=: \ + /* nospec (inserted to prevent `frame pointer is read only`) */\ r10 = 1; \ exit; \ " : diff --git a/tools/testing/selftests/bpf/progs/verifier_unpriv.c b/tools/testing/selftests/bpf/progs/verifier_unpriv.c index a4a5e2071604..af35365558ce 100644 --- a/tools/testing/selftests/bpf/progs/verifier_unpriv.c +++ b/tools/testing/selftests/bpf/progs/verifier_unpriv.c @@ -572,7 +572,7 @@ l0_%=: exit; \
SEC("socket") __description("alu32: mov u32 const") -__success __failure_unpriv __msg_unpriv("R7 invalid mem access 'scalar'") +__success __success_unpriv __retval(0) __naked void alu32_mov_u32_const(void) { @@ -581,6 +581,7 @@ __naked void alu32_mov_u32_const(void) w7 &= 1; \ w0 = w7; \ if r0 == 0 goto l0_%=; \ + /* unpriv: nospec (inserted to prevent `R7 invalid mem access 'scalar'`) */\ r0 = *(u64*)(r7 + 0); \ l0_%=: exit; \ " ::: __clobber_all); diff --git a/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c b/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c index 5ba6e53571c8..01a238e3047e 100644 --- a/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c +++ b/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c @@ -398,7 +398,7 @@ l2_%=: r0 = 1; \
SEC("socket") __description("map access: mixing value pointer and scalar, 1") -__success __failure_unpriv __msg_unpriv("R2 pointer comparison prohibited") +__success __failure_unpriv __msg_unpriv("R2 tried to add from different maps, paths or scalars, pointer arithmetic with it prohibited for !root") __retval(0) __naked void value_pointer_and_scalar_1(void) { @@ -433,6 +433,7 @@ l2_%=: /* common instruction */ \ l3_%=: /* branch B */ \ r0 = 0x13371337; \ /* verifier follows fall-through */ \ + /* unpriv: nospec (inserted to prevent `R2 pointer comparison prohibited`) */\ if r2 != 0x100000 goto l4_%=; \ r0 = 0; \ exit; \ @@ -450,7 +451,7 @@ l4_%=: /* fake-dead code; targeted from branch A to \
SEC("socket") __description("map access: mixing value pointer and scalar, 2") -__success __failure_unpriv __msg_unpriv("R0 invalid mem access 'scalar'") +__success __failure_unpriv __msg_unpriv("R2 tried to add from different maps, paths or scalars, pointer arithmetic with it prohibited for !root") __retval(0) __naked void value_pointer_and_scalar_2(void) { @@ -466,6 +467,7 @@ __naked void value_pointer_and_scalar_2(void) if r0 != 0 goto l0_%=; \ exit; \ l0_%=: /* load some number from the map into r1 */ \ + /* unpriv: nospec (inserted to prevent `R0 invalid mem access 'scalar'`) */\ r1 = *(u8*)(r0 + 0); \ /* depending on r1, branch: */ \ if r1 == 0 goto l1_%=; \ @@ -1296,11 +1298,12 @@ l0_%=: r0 = 1; \
SEC("socket") __description("map access: value_ptr -= unknown scalar, 2") -__success __failure_unpriv -__msg_unpriv("R0 pointer arithmetic of map value goes out of range") +__success __success_unpriv __retval(1) __naked void value_ptr_unknown_scalar_2_2(void) { + /* unpriv: nospec inserted by verifier to mitigate 'R0 pointer + * arithmetic of map value goes out of range'. */ asm volatile (" \ r1 = 0; \ *(u64*)(r10 - 8) = r1; \
On 24/02/2025 21:47, Luis Gerhorst wrote:
> } else if (error_recoverable_with_nospec(err) && state->speculative) {
> 	WARN_ON_ONCE(env->bypass_spec_v1);
> 	WARN_ON_ONCE(env->cur_state != state);
> 	/* Prevent this speculative path from ever reaching the
> 	 * insn that would have been unsafe to execute.
> 	 */
> 	cur_aux(env)->nospec = true;
This allows us to accept more programs, but it has the downside that Spectre v1 mitigation now requires BPF_NOSPEC to be emitted by every JIT for archs vulnerable to Spectre v1. This is currently not the case for all JITs, and this patch may therefore regress BPF's security.
The regression is limited to systems that are vulnerable to Spectre v1, have unprivileged BPF enabled, and do NOT emit insns for BPF_NOSPEC. The latter is not the case for x86 64- and 32-bit, arm64, and powerpc 64-bit, which are therefore not affected by the regression. According to [1], LoongArch and mips are not vulnerable to Spectre v1 and are therefore also not affected by the regression.
Also, if any of those regressed systems is also vulnerable to Spectre v4, the system was already vulnerable to Spectre v4 attacks based on unpriv BPF before this patch and the impact is therefore further limited.
As far as I am aware, it is unclear which other architectures (besides x86 64- and 32-bit, arm64, powerpc 64-bit, LoongArch, and mips) supported by the kernel are vulnerable to Spectre v1 but not to Spectre v4. Also, I am not sure whether barriers are available on these architectures. Implementing BPF_NOSPEC on them therefore appears non-trivial (probably impossible) to me. Searching gcc / the kernel for speculation-barrier implementations for these architectures yielded no results. Any input is very welcome.
As an alternative, one could still reject programs if the architecture does not emit BPF_NOSPEC (e.g., by removing the empty BPF_NOSPEC-case from all JITs except for LoongArch and mips where they appear justified). However, this will cause rejections on these archs and some may have to re-add the empty case. Even if this happens, some may not do it and only rejecting the programs on some archs might complicate BPF selftests.
Do you think the potential regression is acceptable or should we err on the side of caution?
[1] a6f6a95f25803500079513780d11a911ce551d76 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode")
Insert a nospec before the access to prevent it from ever using an index that is subject to speculative scalar-confusion.
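For intuition, a minimal userspace sketch of the variable-offset case (x86 lfence stands in for the BPF_NOSPEC barrier; illustrative only, not code from the series): the barrier placed before the variable-offset access ensures that the offset actually used is the architecturally computed one, not a speculatively confused scalar.

	#include <stdint.h>
	#include <string.h>

	uint64_t read_stack_slot(uint64_t off) /* architecturally verified: off <= 56 */
	{
		uint64_t buf[8];

		memset(buf, 0, sizeof(buf));
		/* Barrier before the variable-offset access: a speculatively
		 * wrong 'off' (e.g. after a mispredicted branch) can no
		 * longer reach the load below.
		 */
		__asm__ volatile("lfence" ::: "memory");
		return *(uint64_t *)((uint8_t *)buf + off);
	}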
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 kernel/bpf/verifier.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index aee49f8da0c1..06c2f929d602 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -7638,6 +7638,8 @@ static int check_atomic(struct bpf_verifier_env *env, int insn_idx, struct bpf_i return 0; }
+static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env); + /* When register 'regno' is used to read the stack (either directly or through * a helper function) make sure that it's within stack boundary and, depending * on the access type and privileges, that all elements of the stack are @@ -7677,18 +7679,17 @@ static int check_stack_range_initialized( if (tnum_is_const(reg->var_off)) { min_off = max_off = reg->var_off.value + off; } else { - /* Variable offset is prohibited for unprivileged mode for + /* Variable offset requires a nospec for unprivileged mode for * simplicity since it requires corresponding support in * Spectre masking for stack ALU. * See also retrieve_ptr_limit(). */ if (!env->bypass_spec_v1) { - char tn_buf[48]; - - tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off); - verbose(env, "R%d variable offset stack access prohibited for !root, var_off=%s\n", - regno, tn_buf); - return -EACCES; + /* Allow the access, but prevent it from using a + * speculative offset using a nospec before the + * dereference op. + */ + cur_aux(env)->nospec = true; } /* Only initialized buffer on stack is allowed to be accessed * with variable offset. With uninitialized buffer it's hard to
The main reason is that it will later allow us to fall back to a nospec for certain errors in push_stack().
This has the side effect of changing the sanitization case to return ENOMEM. However, I believe this is more fitting, as I understand EFAULT to indicate a verifier-internal bug.
The downside is that it requires us to introduce an output parameter for the state.
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 kernel/bpf/verifier.c | 71 +++++++++++++++++++++++++------------------
 1 file changed, 42 insertions(+), 29 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 06c2f929d602..406294bcd5ce 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1934,8 +1934,10 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env, int err;
elem = kzalloc(sizeof(struct bpf_verifier_stack_elem), GFP_KERNEL); - if (!elem) - goto err; + if (!elem) { + err = -ENOMEM; + goto unrecoverable_err; + }
elem->insn_idx = insn_idx; elem->prev_insn_idx = prev_insn_idx; @@ -1945,12 +1947,18 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env, env->stack_size++; err = copy_verifier_state(&elem->st, cur); if (err) - goto err; + goto unrecoverable_err; elem->st.speculative |= speculative; if (env->stack_size > BPF_COMPLEXITY_LIMIT_JMP_SEQ) { verbose(env, "The sequence of %d jumps is too complex.\n", env->stack_size); - goto err; + /* Do not return -EINVAL to signal to the main loop that this + * can likely not be recovered-from by inserting a nospec if we + * are on a speculative path. If it was tried anyway, we would + * encounter it again shortly anyway. + */ + err = -ENOMEM; + goto unrecoverable_err; } if (elem->st.parent) { ++elem->st.parent->branches; @@ -1965,12 +1973,14 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env, */ } return &elem->st; -err: +unrecoverable_err: free_verifier_state(env->cur_state, true); env->cur_state = NULL; /* pop all elements and return */ while (!pop_stack(env, NULL, NULL, false)); - return NULL; + WARN_ON_ONCE(err >= 0); + WARN_ON_ONCE(error_recoverable_with_nospec(err)); + return ERR_PTR(err); }
#define CALLER_SAVED_REGS 6 @@ -8630,8 +8640,8 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx, prev_st = find_prev_entry(env, cur_st->parent, insn_idx); /* branch out active iter state */ queued_st = push_stack(env, insn_idx + 1, insn_idx, false); - if (!queued_st) - return -ENOMEM; + if (IS_ERR(queued_st)) + return PTR_ERR(queued_st);
queued_iter = get_iter_from_state(queued_st, meta); queued_iter->iter.state = BPF_ITER_STATE_ACTIVE; @@ -10214,8 +10224,8 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins * proceed with next instruction within current frame. */ callback_state = push_stack(env, env->subprog_info[subprog].start, insn_idx, false); - if (!callback_state) - return -ENOMEM; + if (IS_ERR(callback_state)) + return PTR_ERR(callback_state);
err = setup_func_entry(env, subprog, insn_idx, set_callee_state_cb, callback_state); @@ -13654,7 +13664,7 @@ sanitize_speculative_path(struct bpf_verifier_env *env, struct bpf_reg_state *regs;
branch = push_stack(env, next_idx, curr_idx, true); - if (branch && insn) { + if (!IS_ERR(branch) && insn) { regs = branch->frame[branch->curframe]->regs; if (BPF_SRC(insn->code) == BPF_K) { mark_reg_unknown(env, regs, insn->dst_reg); @@ -13682,7 +13692,7 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env, u8 opcode = BPF_OP(insn->code); u32 alu_state, alu_limit; struct bpf_reg_state tmp; - bool ret; + struct bpf_verifier_state *branch; int err;
if (can_skip_alu_sanitation(env, insn)) @@ -13755,11 +13765,11 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env, tmp = *dst_reg; copy_register_state(dst_reg, ptr_reg); } - ret = sanitize_speculative_path(env, NULL, env->insn_idx + 1, - env->insn_idx); - if (!ptr_is_dst_reg && ret) + branch = sanitize_speculative_path(env, NULL, env->insn_idx + 1, + env->insn_idx); + if (!ptr_is_dst_reg && !IS_ERR(branch)) *dst_reg = tmp; - return !ret ? REASON_STACK : 0; + return IS_ERR(branch) ? REASON_STACK : 0; }
static void sanitize_mark_insn_seen(struct bpf_verifier_env *env) @@ -16008,8 +16018,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
/* branch out 'fallthrough' insn as a new state to explore */ queued_st = push_stack(env, idx + 1, idx, false); - if (!queued_st) - return -ENOMEM; + if (IS_ERR(queued_st)) + return PTR_ERR(queued_st);
queued_st->may_goto_depth++; if (prev_st) @@ -16073,10 +16083,12 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, * the fall-through branch for simulation under speculative * execution. */ - if (!env->bypass_spec_v1 && - !sanitize_speculative_path(env, insn, *insn_idx + 1, - *insn_idx)) - return -EFAULT; + if (!env->bypass_spec_v1) { + struct bpf_verifier_state *branch = sanitize_speculative_path( + env, insn, *insn_idx + 1, *insn_idx); + if (IS_ERR(branch)) + return PTR_ERR(branch); + } if (env->log.level & BPF_LOG_LEVEL) print_insn_state(env, this_branch, this_branch->curframe); *insn_idx += insn->off; @@ -16086,11 +16098,12 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env, * program will go. If needed, push the goto branch for * simulation under speculative execution. */ - if (!env->bypass_spec_v1 && - !sanitize_speculative_path(env, insn, - *insn_idx + insn->off + 1, - *insn_idx)) - return -EFAULT; + if (!env->bypass_spec_v1) { + struct bpf_verifier_state *branch = sanitize_speculative_path( + env, insn, *insn_idx + insn->off + 1, *insn_idx); + if (IS_ERR(branch)) + return PTR_ERR(branch); + } if (env->log.level & BPF_LOG_LEVEL) print_insn_state(env, this_branch, this_branch->curframe); return 0; @@ -16113,8 +16126,8 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
other_branch = push_stack(env, *insn_idx + insn->off + 1, *insn_idx, false); - if (!other_branch) - return -EFAULT; + if (IS_ERR(other_branch)) + return PTR_ERR(other_branch); other_branch_regs = other_branch->frame[other_branch->curframe]->regs;
if (BPF_SRC(insn->code) == BPF_X) {
For the now-raised REASON_STACK, this allows us to later fall back to a nospec for certain errors from push_stack() if we are on a speculative path.
Fall back to nospec_result directly for the remaining sanitization errors, even if we are not on a speculative path. We must prevent a following mem-access from speculatively using the result of the alu op. Therefore, insert a nospec after the alu insn.
The latter requires us to modify the nospec_result patching code to work not only for write-type insns.
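For intuition on the placement, a minimal userspace sketch (again with x86 lfence standing in for BPF_NOSPEC; illustrative only, not code from the series): with the barrier directly after the pointer ALU op, a following memory access can no longer consume a speculatively confused scalar, because the access only issues once the architectural value of the pointer is resolved.

	#include <stdint.h>

	static uint8_t map_value[64];

	uint64_t read_after_alu(uint64_t off) /* architecturally verified: off < 64 */
	{
		uint8_t *p = map_value + off;	/* the pointer ALU op */
		/* Barrier after the ALU op: the load below waits until the
		 * architectural values of 'off' and 'p' are known.
		 */
		__asm__ volatile("lfence" ::: "memory");
		return *p;
	}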
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 kernel/bpf/verifier.c | 122 +++++++++++++++--------------------------
 1 file changed, 42 insertions(+), 80 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 406294bcd5ce..033780578966 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -13572,14 +13572,6 @@ static bool check_reg_sane_offset(struct bpf_verifier_env *env, return true; }
-enum { - REASON_BOUNDS = -1, - REASON_TYPE = -2, - REASON_PATHS = -3, - REASON_LIMIT = -4, - REASON_STACK = -5, -}; - static int retrieve_ptr_limit(const struct bpf_reg_state *ptr_reg, u32 *alu_limit, bool mask_to_left) { @@ -13602,11 +13594,13 @@ static int retrieve_ptr_limit(const struct bpf_reg_state *ptr_reg, ptr_reg->umax_value) + ptr_reg->off; break; default: - return REASON_TYPE; + /* Register has pointer with unsupported alu operation. */ + return -ENOTSUPP; }
+ /* Register tried to access beyond pointer bounds. */ if (ptr_limit >= max) - return REASON_LIMIT; + return -ENOTSUPP; *alu_limit = ptr_limit; return 0; } @@ -13625,8 +13619,12 @@ static int update_alu_sanitation_state(struct bpf_insn_aux_data *aux, */ if (aux->alu_state && (aux->alu_state != alu_state || - aux->alu_limit != alu_limit)) - return REASON_PATHS; + aux->alu_limit != alu_limit)) { + /* Tried to perform alu op from different maps, paths or scalars */ + aux->nospec_result = true; + aux->alu_state = 0; + return 0; + }
/* Corresponding fixup done in do_misc_fixups(). */ aux->alu_state = alu_state; @@ -13707,16 +13705,24 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env,
if (!commit_window) { if (!tnum_is_const(off_reg->var_off) && - (off_reg->smin_value < 0) != (off_reg->smax_value < 0)) - return REASON_BOUNDS; + (off_reg->smin_value < 0) != (off_reg->smax_value < 0)) { + /* Register has unknown scalar with mixed signed bounds. */ + aux->nospec_result = true; + aux->alu_state = 0; + return 0; + }
info->mask_to_left = (opcode == BPF_ADD && off_is_neg) || (opcode == BPF_SUB && !off_is_neg); }
err = retrieve_ptr_limit(ptr_reg, &alu_limit, info->mask_to_left); - if (err < 0) - return err; + if (err) { + WARN_ON_ONCE(err != -ENOTSUPP); + aux->nospec_result = true; + aux->alu_state = 0; + return 0; + }
if (commit_window) { /* In commit phase we narrow the masking window based on @@ -13769,7 +13775,7 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env, env->insn_idx); if (!ptr_is_dst_reg && !IS_ERR(branch)) *dst_reg = tmp; - return IS_ERR(branch) ? REASON_STACK : 0; + return PTR_ERR_OR_ZERO(branch); }
static void sanitize_mark_insn_seen(struct bpf_verifier_env *env) @@ -13785,45 +13791,6 @@ static void sanitize_mark_insn_seen(struct bpf_verifier_env *env) env->insn_aux_data[env->insn_idx].seen = env->pass_cnt; }
-static int sanitize_err(struct bpf_verifier_env *env, - const struct bpf_insn *insn, int reason, - const struct bpf_reg_state *off_reg, - const struct bpf_reg_state *dst_reg) -{ - static const char *err = "pointer arithmetic with it prohibited for !root"; - const char *op = BPF_OP(insn->code) == BPF_ADD ? "add" : "sub"; - u32 dst = insn->dst_reg, src = insn->src_reg; - - switch (reason) { - case REASON_BOUNDS: - verbose(env, "R%d has unknown scalar with mixed signed bounds, %s\n", - off_reg == dst_reg ? dst : src, err); - break; - case REASON_TYPE: - verbose(env, "R%d has pointer with unsupported alu operation, %s\n", - off_reg == dst_reg ? src : dst, err); - break; - case REASON_PATHS: - verbose(env, "R%d tried to %s from different maps, paths or scalars, %s\n", - dst, op, err); - break; - case REASON_LIMIT: - verbose(env, "R%d tried to %s beyond pointer bounds, %s\n", - dst, op, err); - break; - case REASON_STACK: - verbose(env, "R%d could not be pushed for speculative verification, %s\n", - dst, err); - break; - default: - verbose(env, "verifier internal error: unknown reason (%d)\n", - reason); - break; - } - - return -EACCES; -} - /* check that stack access falls within stack limits and that 'reg' doesn't * have a variable offset. * @@ -13989,7 +13956,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, ret = sanitize_ptr_alu(env, insn, ptr_reg, off_reg, dst_reg, &info, false); if (ret < 0) - return sanitize_err(env, insn, ret, off_reg, dst_reg); + return ret; }
switch (opcode) { @@ -14117,7 +14084,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, ret = sanitize_ptr_alu(env, insn, dst_reg, off_reg, dst_reg, &info, true); if (ret < 0) - return sanitize_err(env, insn, ret, off_reg, dst_reg); + return ret; }
return 0; @@ -14711,7 +14678,7 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, if (sanitize_needed(opcode)) { ret = sanitize_val_alu(env, insn); if (ret < 0) - return sanitize_err(env, insn, ret, NULL, NULL); + return ret; }
/* Calculate sign/unsigned bounds and tnum for alu32 and alu64 bit ops. @@ -20515,6 +20482,22 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) */ }
+ if (env->insn_aux_data[i + delta].nospec_result) { + struct bpf_insn patch[] = { + *insn, + BPF_ST_NOSPEC(), + }; + + cnt = ARRAY_SIZE(patch); + new_prog = bpf_patch_insn_data(env, i + delta, patch, cnt); + if (!new_prog) + return -ENOMEM; + + delta += cnt - 1; + env->prog = new_prog; + insn = new_prog->insnsi + i + delta; + } + if (insn->code == (BPF_LDX | BPF_MEM | BPF_B) || insn->code == (BPF_LDX | BPF_MEM | BPF_H) || insn->code == (BPF_LDX | BPF_MEM | BPF_W) || @@ -20561,27 +20544,6 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) continue; }
- if (type == BPF_WRITE && - env->insn_aux_data[i + delta].nospec_result) { - /* nospec_result is only used to mitigate Spectre v4 and - * to limit verification-time for Spectre v1. - */ - struct bpf_insn patch[] = { - *insn, - BPF_ST_NOSPEC(), - }; - - cnt = ARRAY_SIZE(patch); - new_prog = bpf_patch_insn_data(env, i + delta, patch, cnt); - if (!new_prog) - return -ENOMEM; - - delta += cnt - 1; - env->prog = new_prog; - insn = new_prog->insnsi + i + delta; - continue; - } - switch ((int)env->insn_aux_data[i + delta].ptr_type) { case PTR_TO_CTX: if (!ops->convert_ctx_access)
This trades verification complexity for runtime overhead due to the nospec inserted in response to the -EINVAL.
With increased limits, this allows applying the mitigations to large BPF programs such as the Parca Continuous Profiler's program. However, that requires a jump-sequence limit of 256k. In any case, the same principle applies to smaller programs, so include it even if the limit stays at 8k for now. Most programs evaluated in "VeriFence: Lightweight and Precise Spectre Defenses for Untrusted Linux Kernel Extensions" (https://arxiv.org/pdf/2405.00078) only require a limit of 32k.
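As a hedged sketch of what the fallback could look like at the do_check() level (illustrative only; the actual handling is introduced by the earlier "bpf: Fall back to nospec if v1 verification fails" patch, and reusing nospec_result here is an assumption made for the example):

	err = do_check_insn(env, insn, &insn_idx);
	if (err == -EINVAL && state->speculative) {
		/* push_stack() refused to queue another (nested) speculative
		 * path; mark the insn so that fixup patches a BPF_ST_NOSPEC
		 * behind it instead of rejecting the program.
		 */
		env->insn_aux_data[insn_idx].nospec_result = true;
		err = 0;
	}
	if (err)
		return err;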
Signed-off-by: Luis Gerhorst luis.gerhorst@fau.de Acked-by: Henriette Herzog henriette.herzog@rub.de Cc: Maximilian Ott ott@cs.fau.de Cc: Milan Stephan milan.stephan@fau.de --- kernel/bpf/verifier.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 033780578966..bde4ae1ea637 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -187,6 +187,7 @@ struct bpf_verifier_stack_elem { };
#define BPF_COMPLEXITY_LIMIT_JMP_SEQ 8192 +#define BPF_COMPLEXITY_LIMIT_SPEC_V1_VERIFICATION (BPF_COMPLEXITY_LIMIT_JMP_SEQ / 2) #define BPF_COMPLEXITY_LIMIT_STATES 64
#define BPF_MAP_KEY_POISON (1ULL << 63) @@ -1933,6 +1934,19 @@ static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env, struct bpf_verifier_stack_elem *elem; int err;
+ if (!env->bypass_spec_v1 && + cur->speculative && + env->stack_size > BPF_COMPLEXITY_LIMIT_SPEC_V1_VERIFICATION) { + /* Avoid nested speculative path verification because we are * close to exceeding the jump sequence complexity limit. Will * instead insert a speculation barrier which will impact * performance. To improve performance, authors should reduce the * program's complexity. Barrier will be inserted in * do_check(). */ + return ERR_PTR(-EINVAL); + } + elem = kzalloc(sizeof(struct bpf_verifier_stack_elem), GFP_KERNEL); if (!elem) { err = -ENOMEM;