This series improves the expressiveness of unprivileged BPF by inserting speculation barriers instead of rejecting programs.
The approach was previously presented at LPC'24 [1] and RAID'24 [2].
To mitigate the Spectre v1 (PHT) vulnerability, the kernel rejects potentially-dangerous unprivileged BPF programs as of commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted branches"). In [2], we have analyzed 364 object files from open source projects (Linux Samples and Selftests, BCC, Loxilb, Cilium, libbpf Examples, Parca, and Prevail) and found that this affects 31% to 54% of programs.
To resolve this in the majority of cases, this patchset adds a fall-back for mitigating Spectre v1 using speculation barriers. The kernel still optimistically attempts to verify all speculative paths but uses speculation barriers against v1 when unsafe behavior is detected. This allows more programs to be accepted without disabling the BPF Spectre mitigations (e.g., by setting cpu_mitigations_off()).
For this, it relies on the fact that speculation barriers prevent all later instructions from executing if the speculation was not correct (see the sketch after the list below):
* On x86_64, lfence acts as a full speculation barrier, not only as a load fence [3]:
An LFENCE instruction or a serializing instruction will ensure that no later instructions execute, even speculatively, until all prior instructions complete locally. [...] Inserting an LFENCE instruction after a bounds check prevents later operations from executing before the bound check completes.
This was experimentally confirmed in [4].
* ARM's SB speculation barrier instruction also affects "any instruction that appears later in the program order than the barrier" [5].
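To illustrate where such a barrier takes effect, here is a minimal sketch of the classic bounds-check-bypass gadget in kernel-style C (hypothetical function and array names, assuming <linux/types.h> and <asm/barrier.h>; barrier_nospec() is the same macro the BPF interpreter uses for ST_NOSPEC):

	u8 array1[16];
	u8 array2[256 * 512];

	u8 gadget(u64 idx, u64 array1_len)
	{
		u8 v = 0;

		if (idx < array1_len) {
			/* Without a barrier, the loads below may already execute
			 * speculatively while the bounds check is still pending,
			 * even if idx was mispredicted to be in bounds.
			 */
			barrier_nospec(); /* e.g., lfence on x86_64 */
			/* No later instruction executes, even speculatively,
			 * until the bounds check has completed.
			 */
			v = array1[idx];
			v = array2[v * 512]; /* would otherwise leak array1[idx] via the cache */
		}
		return v;
	}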
In [1], we measured the overhead of this approach (together with the upstream Spectre v4 mitigations) relative to having mitigations off. For event tracing and stack-sampling profilers, we found that the mitigations increase BPF program execution time by 0% to 62%. For the Loxilb network load balancer, we measured a 14% slowdown in SCTP performance but no significant slowdown for TCP. This overhead only applies to programs that were previously rejected.
I reran the expressiveness-evaluation with v6.14 and made sure the main results still match those from [1] and [2] (which used v6.5).
Main design decisions are:
* Do not use separate bytecode insns for v1 and v4 barriers. This simplifies the verifier significantly; the only downside is that performance on PowerPC is not as high as it could be.
* Allow archs to still disable v1/v4 mitigations separately by setting bpf_jit_bypass_spec_v1/v4(). This allows archs to benefit from improved BPF expressiveness / performance if they are not vulnerable (e.g., ARM64 for v4 in the kernel).
* Do not remove the empty BPF_NOSPEC implementation for backends for which it is unknown whether they are vulnerable to Spectre v1.
[1] https://lpc.events/event/18/contributions/1954/ ("Mitigating Spectre-PHT using Speculation Barriers in Linux eBPF")
[2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
[3] https://www.intel.com/content/www/us/en/developer/articles/technical/softwar... ("Managed Runtime Speculative Execution Side Channel Mitigations")
[4] https://dl.acm.org/doi/pdf/10.1145/3359789.3359837 ("Speculator: a tool to analyze speculative execution attacks and mitigations" - Section 4.6 "Stopping Speculative Execution")
[5] https://developer.arm.com/documentation/ddi0597/2020-12/Base-Instructions/SB... ("SB - Speculation Barrier - Arm Armv8-A A32/T32 Instruction Set Architecture (2020-12)")
Changes:
* v1 -> v2:
  - Drop former commits 9 ("bpf: Return PTR_ERR from push_stack()") and 11 ("bpf: Fall back to nospec for spec path verification") as suggested by Alexei. This series therefore no longer changes push_stack() to return PTR_ERR.
  - Add detailed explanation of how lfence works internally and how it affects the algorithm.
  - Add tests checking that nospec instructions are inserted in expected locations using __xlated_unpriv as suggested by Eduard (also, include a fix for __xlated_unpriv).
  - Add a test for the mitigations from the description of commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted branches").
  - Remove unused variables from do_check[_insn]() as suggested by Eduard.
  - Remove INSN_IDX_MODIFIED to improve readability as suggested by Eduard. This also causes the nospec_result-check to run (and fail) for jumping-ops. Add a warning to assert that this check must never succeed in that case.
  - Add details on the safety of patch 10 ("bpf: Allow nospec-protected var-offset stack access") based on the feedback on v1.
  - Rebase to bpf-next-250420
  - Link to v1: https://lore.kernel.org/all/20250313172127.1098195-1-luis.gerhorst@fau.de/
* RFC -> v1:
  - Rebase to bpf-next-250313
  - Tests: mark expected successes/new errors
  - Add bpf_jit_bypass_spec_v1/v4() to avoid #ifdef in bpf_bypass_spec_v1/v4()
  - Ensure that nospec with v1-support is implemented for archs for which GCC supports speculation barriers, except for MIPS
  - arm64: emit speculation barrier
  - powerpc: change nospec to include v1 barrier
  - Discuss potential security (archs that do not impl. BPF nospec) and performance (only PowerPC) regressions
  - Link to RFC: https://lore.kernel.org/bpf/20250224203619.594724-1-luis.gerhorst@fau.de/
Luis Gerhorst (11):
  selftests/bpf: Fix caps for __xlated/jited_unpriv
  bpf: Move insn if/else into do_check_insn()
  bpf: Return -EFAULT on misconfigurations
  bpf: Return -EFAULT on internal errors
  bpf, arm64, powerpc: Add bpf_jit_bypass_spec_v1/v4()
  bpf, arm64, powerpc: Change nospec to include v1 barrier
  bpf: Rename sanitize_stack_spill to nospec_result
  bpf: Fall back to nospec for Spectre v1
  selftests/bpf: Add test for Spectre v1 mitigation
  bpf: Allow nospec-protected var-offset stack access
  bpf: Fall back to nospec for sanitization-failures
 arch/arm64/net/bpf_jit.h                      |   5 +
 arch/arm64/net/bpf_jit_comp.c                 |  28 +-
 arch/powerpc/net/bpf_jit_comp64.c             |  79 ++-
 include/linux/bpf.h                           |  11 +-
 include/linux/bpf_verifier.h                  |   3 +-
 include/linux/filter.h                        |   2 +-
 kernel/bpf/core.c                             |  32 +-
 kernel/bpf/verifier.c                         | 648 ++++++++++--------
 tools/testing/selftests/bpf/progs/bpf_misc.h  |   4 +
 .../selftests/bpf/progs/verifier_and.c        |   8 +-
 .../selftests/bpf/progs/verifier_bounds.c     |  66 +-
 .../bpf/progs/verifier_bounds_deduction.c     |  45 +-
 .../selftests/bpf/progs/verifier_map_ptr.c    |  20 +-
 .../selftests/bpf/progs/verifier_movsx.c      |  16 +-
 .../selftests/bpf/progs/verifier_unpriv.c     |  65 +-
 .../bpf/progs/verifier_value_ptr_arith.c      | 101 ++-
 tools/testing/selftests/bpf/test_loader.c     |  14 +-
 .../selftests/bpf/verifier/dead_code.c        |   3 +-
 tools/testing/selftests/bpf/verifier/jmp32.c  |  33 +-
 tools/testing/selftests/bpf/verifier/jset.c   |  10 +-
 20 files changed, 765 insertions(+), 428 deletions(-)
base-commit: 8582d9ab3efdebb88e0cd8beed8e0b9de76443e7
Currently, __xlated_unpriv and __jited_unpriv do not work because the BPF syscall will overwrite info.jited_prog_len and info.xlated_prog_len with 0 if the process is not bpf_capable(). This bug was not noticed before, because there is no test that actually uses __xlated_unpriv/__jited_unpriv.
To resolve this, simply restore the capabilities earlier (but still after loading the program). Adding this here unconditionally is fine because the function first checks that the capabilities were initialized before attempting to restore them.
This will be important later when we add tests that check whether a speculation barrier was inserted in the correct location.
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Fixes: 9c9f73391310 ("selftests/bpf: allow checking xlated programs in verifier_* tests")
Fixes: 7d743e4c759c ("selftests/bpf: __jited test tag to check disassembly after jit")
---
 tools/testing/selftests/bpf/test_loader.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_loader.c b/tools/testing/selftests/bpf/test_loader.c index 49f2fc61061f..9551d8d5f8f9 100644 --- a/tools/testing/selftests/bpf/test_loader.c +++ b/tools/testing/selftests/bpf/test_loader.c @@ -1042,6 +1042,14 @@ void run_subtest(struct test_loader *tester, emit_verifier_log(tester->log_buf, false /*force*/); validate_msgs(tester->log_buf, &subspec->expect_msgs, emit_verifier_log);
+ /* Restore capabilities because the kernel will silently ignore requests + * for program info (such as xlated program text) if we are not + * bpf-capable. Also, for some reason test_verifier executes programs + * with all capabilities restored. Do the same here. + */ + if (restore_capabilities(&caps)) + goto tobj_cleanup; + if (subspec->expect_xlated.cnt) { err = get_xlated_program_text(bpf_program__fd(tprog), tester->log_buf, tester->log_buf_sz); @@ -1067,12 +1075,6 @@ void run_subtest(struct test_loader *tester, }
if (should_do_test_run(spec, subspec)) { - /* For some reason test_verifier executes programs - * with all capabilities restored. Do the same here. - */ - if (restore_capabilities(&caps)) - goto tobj_cleanup; - /* Do bpf_map__attach_struct_ops() for each struct_ops map. * This should trigger bpf_struct_ops->reg callback on kernel side. */
This is required so that the errors can later be caught and the verifier can fall back to a nospec if it is on a speculative path.
Eliminate the regs variable as it is only used once and insn_idx is not modified in-between the definition and usage.
Still pass insn simply to match the other check_*() functions. As Eduard points out [1], insn is assumed to correspond to env->insn_idx in many places (e.g., __check_reg_arg()).
Move code into do_check_insn(), replace
* "continue" with "return 0" after modifying insn_idx
* "goto process_bpf_exit" with "return PROCESS_BPF_EXIT"
* "do_print_state = " with "*do_print_state = "
[1] https://lore.kernel.org/all/293dbe3950a782b8eb3b87b71d7a967e120191fd.camel@g...
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 kernel/bpf/verifier.c | 425 ++++++++++++++++++++++--------------------
 1 file changed, 219 insertions(+), 206 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 54c6953a8b84..c4f197ca6c45 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -19389,20 +19389,218 @@ static int save_aux_ptr_type(struct bpf_verifier_env *env, enum bpf_reg_type typ return 0; }
+enum { + PROCESS_BPF_EXIT = 1 +}; + +static int do_check_insn(struct bpf_verifier_env *env, struct bpf_insn *insn, + bool *do_print_state) +{ + int err; + u8 class = BPF_CLASS(insn->code); + bool exception_exit = false; + + if (class == BPF_ALU || class == BPF_ALU64) { + err = check_alu_op(env, insn); + if (err) + return err; + + } else if (class == BPF_LDX) { + bool is_ldsx = BPF_MODE(insn->code) == BPF_MEMSX; + + /* Check for reserved fields is already done in + * resolve_pseudo_ldimm64(). + */ + err = check_load_mem(env, insn, false, is_ldsx, true, "ldx"); + if (err) + return err; + } else if (class == BPF_STX) { + if (BPF_MODE(insn->code) == BPF_ATOMIC) { + err = check_atomic(env, insn); + if (err) + return err; + env->insn_idx++; + return 0; + } + + if (BPF_MODE(insn->code) != BPF_MEM || insn->imm != 0) { + verbose(env, "BPF_STX uses reserved fields\n"); + return -EINVAL; + } + + err = check_store_reg(env, insn, false); + if (err) + return err; + } else if (class == BPF_ST) { + enum bpf_reg_type dst_reg_type; + + if (BPF_MODE(insn->code) != BPF_MEM || + insn->src_reg != BPF_REG_0) { + verbose(env, "BPF_ST uses reserved fields\n"); + return -EINVAL; + } + /* check src operand */ + err = check_reg_arg(env, insn->dst_reg, SRC_OP); + if (err) + return err; + + dst_reg_type = cur_regs(env)[insn->dst_reg].type; + + /* check that memory (dst_reg + off) is writeable */ + err = check_mem_access(env, env->insn_idx, insn->dst_reg, + insn->off, BPF_SIZE(insn->code), + BPF_WRITE, -1, false, false); + if (err) + return err; + + err = save_aux_ptr_type(env, dst_reg_type, false); + if (err) + return err; + } else if (class == BPF_JMP || class == BPF_JMP32) { + u8 opcode = BPF_OP(insn->code); + + env->jmps_processed++; + if (opcode == BPF_CALL) { + if (BPF_SRC(insn->code) != BPF_K || + (insn->src_reg != BPF_PSEUDO_KFUNC_CALL && + insn->off != 0) || + (insn->src_reg != BPF_REG_0 && + insn->src_reg != BPF_PSEUDO_CALL && + insn->src_reg != BPF_PSEUDO_KFUNC_CALL) || + insn->dst_reg != BPF_REG_0 || class == BPF_JMP32) { + verbose(env, "BPF_CALL uses reserved fields\n"); + return -EINVAL; + } + + if (env->cur_state->active_locks) { + if ((insn->src_reg == BPF_REG_0 && insn->imm != BPF_FUNC_spin_unlock) || + (insn->src_reg == BPF_PSEUDO_KFUNC_CALL && + (insn->off != 0 || !kfunc_spin_allowed(insn->imm)))) { + verbose(env, + "function calls are not allowed while holding a lock\n"); + return -EINVAL; + } + } + if (insn->src_reg == BPF_PSEUDO_CALL) { + err = check_func_call(env, insn, &env->insn_idx); + } else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) { + err = check_kfunc_call(env, insn, &env->insn_idx); + if (!err && is_bpf_throw_kfunc(insn)) { + exception_exit = true; + goto process_bpf_exit_full; + } + } else { + err = check_helper_call(env, insn, &env->insn_idx); + } + if (err) + return err; + + mark_reg_scratched(env, BPF_REG_0); + } else if (opcode == BPF_JA) { + if (BPF_SRC(insn->code) != BPF_K || + insn->src_reg != BPF_REG_0 || + insn->dst_reg != BPF_REG_0 || + (class == BPF_JMP && insn->imm != 0) || + (class == BPF_JMP32 && insn->off != 0)) { + verbose(env, "BPF_JA uses reserved fields\n"); + return -EINVAL; + } + + if (class == BPF_JMP) + env->insn_idx += insn->off + 1; + else + env->insn_idx += insn->imm + 1; + return 0; + } else if (opcode == BPF_EXIT) { + if (BPF_SRC(insn->code) != BPF_K || + insn->imm != 0 || + insn->src_reg != BPF_REG_0 || + insn->dst_reg != BPF_REG_0 || + class == BPF_JMP32) { + verbose(env, "BPF_EXIT uses reserved fields\n"); + return -EINVAL; + } +process_bpf_exit_full: + 
/* We must do check_reference_leak here before + * prepare_func_exit to handle the case when + * state->curframe > 0, it may be a callback function, + * for which reference_state must match caller reference + * state when it exits. + */ + err = check_resource_leak(env, exception_exit, !env->cur_state->curframe, + "BPF_EXIT instruction in main prog"); + if (err) + return err; + + /* The side effect of the prepare_func_exit which is + * being skipped is that it frees bpf_func_state. + * Typically, process_bpf_exit will only be hit with + * outermost exit. copy_verifier_state in pop_stack will + * handle freeing of any extra bpf_func_state left over + * from not processing all nested function exits. We + * also skip return code checks as they are not needed + * for exceptional exits. + */ + if (exception_exit) + return PROCESS_BPF_EXIT; + + if (env->cur_state->curframe) { + /* exit from nested function */ + err = prepare_func_exit(env, &env->insn_idx); + if (err) + return err; + *do_print_state = true; + return 0; + } + + err = check_return_code(env, BPF_REG_0, "R0"); + if (err) + return err; + return PROCESS_BPF_EXIT; + } else { + err = check_cond_jmp_op(env, insn, &env->insn_idx); + if (err) + return err; + } + } else if (class == BPF_LD) { + u8 mode = BPF_MODE(insn->code); + + if (mode == BPF_ABS || mode == BPF_IND) { + err = check_ld_abs(env, insn); + if (err) + return err; + + } else if (mode == BPF_IMM) { + err = check_ld_imm(env, insn); + if (err) + return err; + + env->insn_idx++; + sanitize_mark_insn_seen(env); + } else { + verbose(env, "invalid BPF_LD mode\n"); + return -EINVAL; + } + } else { + verbose(env, "unknown insn class %d\n", class); + return -EINVAL; + } + + env->insn_idx++; + return 0; +} + static int do_check(struct bpf_verifier_env *env) { bool pop_log = !(env->log.level & BPF_LOG_LEVEL2); struct bpf_verifier_state *state = env->cur_state; struct bpf_insn *insns = env->prog->insnsi; - struct bpf_reg_state *regs; int insn_cnt = env->prog->len; bool do_print_state = false; int prev_insn_idx = -1;
for (;;) { - bool exception_exit = false; struct bpf_insn *insn; - u8 class; int err;
/* reset current history entry on each new instruction */ @@ -19416,7 +19614,6 @@ static int do_check(struct bpf_verifier_env *env) }
insn = &insns[env->insn_idx]; - class = BPF_CLASS(insn->code);
if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) { verbose(env, @@ -19486,216 +19683,32 @@ static int do_check(struct bpf_verifier_env *env) return err; }
- regs = cur_regs(env); sanitize_mark_insn_seen(env); prev_insn_idx = env->insn_idx;
- if (class == BPF_ALU || class == BPF_ALU64) { - err = check_alu_op(env, insn); - if (err) - return err; - - } else if (class == BPF_LDX) { - bool is_ldsx = BPF_MODE(insn->code) == BPF_MEMSX; - - /* Check for reserved fields is already done in - * resolve_pseudo_ldimm64(). - */ - err = check_load_mem(env, insn, false, is_ldsx, true, - "ldx"); - if (err) - return err; - } else if (class == BPF_STX) { - if (BPF_MODE(insn->code) == BPF_ATOMIC) { - err = check_atomic(env, insn); - if (err) - return err; - env->insn_idx++; - continue; - } - - if (BPF_MODE(insn->code) != BPF_MEM || insn->imm != 0) { - verbose(env, "BPF_STX uses reserved fields\n"); - return -EINVAL; - } - - err = check_store_reg(env, insn, false); - if (err) - return err; - } else if (class == BPF_ST) { - enum bpf_reg_type dst_reg_type; - - if (BPF_MODE(insn->code) != BPF_MEM || - insn->src_reg != BPF_REG_0) { - verbose(env, "BPF_ST uses reserved fields\n"); - return -EINVAL; - } - /* check src operand */ - err = check_reg_arg(env, insn->dst_reg, SRC_OP); - if (err) - return err; - - dst_reg_type = regs[insn->dst_reg].type; - - /* check that memory (dst_reg + off) is writeable */ - err = check_mem_access(env, env->insn_idx, insn->dst_reg, - insn->off, BPF_SIZE(insn->code), - BPF_WRITE, -1, false, false); - if (err) - return err; - - err = save_aux_ptr_type(env, dst_reg_type, false); - if (err) - return err; - } else if (class == BPF_JMP || class == BPF_JMP32) { - u8 opcode = BPF_OP(insn->code); - - env->jmps_processed++; - if (opcode == BPF_CALL) { - if (BPF_SRC(insn->code) != BPF_K || - (insn->src_reg != BPF_PSEUDO_KFUNC_CALL - && insn->off != 0) || - (insn->src_reg != BPF_REG_0 && - insn->src_reg != BPF_PSEUDO_CALL && - insn->src_reg != BPF_PSEUDO_KFUNC_CALL) || - insn->dst_reg != BPF_REG_0 || - class == BPF_JMP32) { - verbose(env, "BPF_CALL uses reserved fields\n"); - return -EINVAL; - } - - if (env->cur_state->active_locks) { - if ((insn->src_reg == BPF_REG_0 && insn->imm != BPF_FUNC_spin_unlock) || - (insn->src_reg == BPF_PSEUDO_KFUNC_CALL && - (insn->off != 0 || !kfunc_spin_allowed(insn->imm)))) { - verbose(env, "function calls are not allowed while holding a lock\n"); - return -EINVAL; - } - } - if (insn->src_reg == BPF_PSEUDO_CALL) { - err = check_func_call(env, insn, &env->insn_idx); - } else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) { - err = check_kfunc_call(env, insn, &env->insn_idx); - if (!err && is_bpf_throw_kfunc(insn)) { - exception_exit = true; - goto process_bpf_exit_full; - } - } else { - err = check_helper_call(env, insn, &env->insn_idx); - } - if (err) - return err; - - mark_reg_scratched(env, BPF_REG_0); - } else if (opcode == BPF_JA) { - if (BPF_SRC(insn->code) != BPF_K || - insn->src_reg != BPF_REG_0 || - insn->dst_reg != BPF_REG_0 || - (class == BPF_JMP && insn->imm != 0) || - (class == BPF_JMP32 && insn->off != 0)) { - verbose(env, "BPF_JA uses reserved fields\n"); - return -EINVAL; - } - - if (class == BPF_JMP) - env->insn_idx += insn->off + 1; - else - env->insn_idx += insn->imm + 1; - continue; - - } else if (opcode == BPF_EXIT) { - if (BPF_SRC(insn->code) != BPF_K || - insn->imm != 0 || - insn->src_reg != BPF_REG_0 || - insn->dst_reg != BPF_REG_0 || - class == BPF_JMP32) { - verbose(env, "BPF_EXIT uses reserved fields\n"); - return -EINVAL; - } -process_bpf_exit_full: - /* We must do check_reference_leak here before - * prepare_func_exit to handle the case when - * state->curframe > 0, it may be a callback - * function, for which reference_state must - * match caller reference state when it 
exits. - */ - err = check_resource_leak(env, exception_exit, !env->cur_state->curframe, - "BPF_EXIT instruction in main prog"); - if (err) - return err; - - /* The side effect of the prepare_func_exit - * which is being skipped is that it frees - * bpf_func_state. Typically, process_bpf_exit - * will only be hit with outermost exit. - * copy_verifier_state in pop_stack will handle - * freeing of any extra bpf_func_state left over - * from not processing all nested function - * exits. We also skip return code checks as - * they are not needed for exceptional exits. - */ - if (exception_exit) - goto process_bpf_exit; - - if (state->curframe) { - /* exit from nested function */ - err = prepare_func_exit(env, &env->insn_idx); - if (err) - return err; - do_print_state = true; - continue; - } - - err = check_return_code(env, BPF_REG_0, "R0"); - if (err) - return err; + err = do_check_insn(env, insn, &do_print_state); + if (err < 0) { + return err; + } else if (err == PROCESS_BPF_EXIT) { process_bpf_exit: - mark_verifier_state_scratched(env); - update_branch_counts(env, env->cur_state); - err = pop_stack(env, &prev_insn_idx, - &env->insn_idx, pop_log); - if (err < 0) { - if (err != -ENOENT) - return err; - break; - } else { - if (WARN_ON_ONCE(env->cur_state->loop_entry)) { - verbose(env, "verifier bug: env->cur_state->loop_entry != NULL\n"); - return -EFAULT; - } - do_print_state = true; - continue; - } - } else { - err = check_cond_jmp_op(env, insn, &env->insn_idx); - if (err) - return err; - } - } else if (class == BPF_LD) { - u8 mode = BPF_MODE(insn->code); - - if (mode == BPF_ABS || mode == BPF_IND) { - err = check_ld_abs(env, insn); - if (err) - return err; - - } else if (mode == BPF_IMM) { - err = check_ld_imm(env, insn); - if (err) + mark_verifier_state_scratched(env); + update_branch_counts(env, env->cur_state); + err = pop_stack(env, &prev_insn_idx, &env->insn_idx, + pop_log); + if (err < 0) { + if (err != -ENOENT) return err; - - env->insn_idx++; - sanitize_mark_insn_seen(env); + break; } else { - verbose(env, "invalid BPF_LD mode\n"); - return -EINVAL; + if (WARN_ON_ONCE(env->cur_state->loop_entry)) { + verbose(env, "verifier bug: env->cur_state->loop_entry != NULL\n"); + return -EFAULT; + } + do_print_state = true; + continue; } - } else { - verbose(env, "unknown insn class %d\n", class); - return -EINVAL; } - - env->insn_idx++; + WARN_ON_ONCE(err); }
return 0;
Mark these cases as non-recoverable to prevent them from later being caught when they occur during speculative path verification.
Eduard writes [1]:
The only place I'm aware of that might act upon specific error code from verifier syscall is libbpf. Looking through libbpf code, it seems that this change does not interfere with libbpf.
[1] https://lore.kernel.org/all/785b4531ce3b44a84059a4feb4ba458c68fce719.camel@g...
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 kernel/bpf/verifier.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index c4f197ca6c45..55c1d7ada098 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -8965,7 +8965,7 @@ static int resolve_map_arg_type(struct bpf_verifier_env *env, if (!meta->map_ptr) { /* kernel subsystem misconfigured verifier */ verbose(env, "invalid map_ptr to access map->type\n"); - return -EACCES; + return -EFAULT; }
switch (meta->map_ptr->map_type) { @@ -9653,7 +9653,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, * that kernel subsystem misconfigured verifier */ verbose(env, "invalid map_ptr to access map->key\n"); - return -EACCES; + return -EFAULT; } key_size = meta->map_ptr->key_size; err = check_helper_mem_access(env, regno, key_size, BPF_READ, false, NULL); @@ -9680,7 +9680,7 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, if (!meta->map_ptr) { /* kernel subsystem misconfigured verifier */ verbose(env, "invalid map_ptr to access map->value\n"); - return -EACCES; + return -EFAULT; } meta->raw_mode = arg_type & MEM_UNINIT; err = check_helper_mem_access(env, regno, meta->map_ptr->value_size, @@ -10979,7 +10979,7 @@ record_func_map(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
if (map == NULL) { verbose(env, "kernel subsystem misconfigured verifier\n"); - return -EINVAL; + return -EFAULT; }
/* In case of read-only, some additional restrictions @@ -11018,7 +11018,7 @@ record_func_key(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta, return 0; if (!map || map->map_type != BPF_MAP_TYPE_PROG_ARRAY) { verbose(env, "kernel subsystem misconfigured verifier\n"); - return -EINVAL; + return -EFAULT; }
reg = ®s[BPF_REG_3]; @@ -11272,7 +11272,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn if (changes_data && fn->arg1_type != ARG_PTR_TO_CTX) { verbose(env, "kernel subsystem misconfigured func %s#%d: r1 != ctx\n", func_id_name(func_id), func_id); - return -EINVAL; + return -EFAULT; }
memset(&meta, 0, sizeof(meta)); @@ -11574,7 +11574,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn if (meta.map_ptr == NULL) { verbose(env, "kernel subsystem misconfigured verifier\n"); - return -EINVAL; + return -EFAULT; }
if (func_id == BPF_FUNC_map_lookup_elem && @@ -16697,7 +16697,7 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn) dst_reg->type = CONST_PTR_TO_MAP; } else { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; }
return 0; @@ -16744,7 +16744,7 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
if (!env->ops->gen_ld_abs) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; }
if (insn->dst_reg != BPF_REG_0 || insn->off != 0 || @@ -20781,7 +20781,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) -(subprogs[0].stack_depth + 8)); if (epilogue_cnt >= INSN_BUF_SIZE) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; } else if (epilogue_cnt) { /* Save the ARG_PTR_TO_CTX for the epilogue to use */ cnt = 0; @@ -20804,13 +20804,13 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) if (ops->gen_prologue || env->seen_direct_write) { if (!ops->gen_prologue) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; } cnt = ops->gen_prologue(insn_buf, env->seen_direct_write, env->prog); if (cnt >= INSN_BUF_SIZE) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; } else if (cnt) { new_prog = bpf_patch_insn_data(env, 0, insn_buf, cnt); if (!new_prog) @@ -20967,7 +20967,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
if (type == BPF_WRITE) { verbose(env, "bpf verifier narrow ctx access misconfigured\n"); - return -EINVAL; + return -EFAULT; }
size_code = BPF_H; @@ -20986,7 +20986,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) if (cnt == 0 || cnt >= INSN_BUF_SIZE || (ctx_field_size && !target_size)) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; }
if (is_narrower_load && size < target_size) { @@ -20994,7 +20994,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) off, size, size_default) * 8; if (shift && cnt + 1 >= INSN_BUF_SIZE) { verbose(env, "bpf verifier narrow ctx load misconfigured\n"); - return -EINVAL; + return -EFAULT; } if (ctx_field_size <= 4) { if (shift) @@ -21757,7 +21757,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env) cnt = env->ops->gen_ld_abs(insn, insn_buf); if (cnt == 0 || cnt >= INSN_BUF_SIZE) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; }
new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt); @@ -22093,7 +22093,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env) goto patch_map_ops_generic; if (cnt <= 0 || cnt >= INSN_BUF_SIZE) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; }
new_prog = bpf_patch_insn_data(env, i + delta, @@ -22453,7 +22453,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env) !map_ptr->ops->map_poke_untrack || !map_ptr->ops->map_poke_run) { verbose(env, "bpf verifier is misconfigured\n"); - return -EINVAL; + return -EFAULT; }
ret = map_ptr->ops->map_poke_track(map_ptr, prog->aux);
This prevents us from trying to recover from these on speculative paths in the future.
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 kernel/bpf/verifier.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 55c1d7ada098..27d3bc97a9e0 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11667,7 +11667,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn verbose(env, "verifier internal error:"); verbose(env, "func %s has non-overwritten BPF_PTR_POISON return type\n", func_id_name(func_id)); - return -EINVAL; + return -EFAULT; } ret_btf = btf_vmlinux; ret_btf_id = *fn->ret_btf_id; @@ -15261,12 +15261,12 @@ static int adjust_reg_min_max_vals(struct bpf_verifier_env *env, if (WARN_ON_ONCE(ptr_reg)) { print_verifier_state(env, vstate, vstate->curframe, true); verbose(env, "verifier internal error: unexpected ptr_reg\n"); - return -EINVAL; + return -EFAULT; } if (WARN_ON(!src_reg)) { print_verifier_state(env, vstate, vstate->curframe, true); verbose(env, "verifier internal error: no src_reg\n"); - return -EINVAL; + return -EFAULT; } err = adjust_scalar_min_max_vals(env, insn, dst_reg, *src_reg); if (err)
JITs can set bpf_jit_bypass_spec_v1/v4() if they want the verifier to skip analysis/patching for the respective vulnerability. For v4, this will reduce the number of barriers the verifier inserts. For v1, it allows more programs to be accepted.
The primary motivation for this is to not regress unpriv BPF's performance on ARM64 in a future commit where BPF_NOSPEC is also used against Spectre v1.
This has the user-visible change that v1-induced rejections on non-vulnerable PowerPC CPUs are avoided.
For now, this does not change the semantics of BPF_NOSPEC. It is still a v4-only barrier and must not be implemented if bypass_spec_v4 is always true for the arch. Changing it to a v1 AND v4-barrier is done in a future commit.
As an alternative to bypass_spec_v1/v4, one could introduce NOSPEC_V1 and NOSPEC_V4 instructions and allow backends to skip their lowering, as suggested by commit f5e81d111750 ("bpf: Introduce BPF nospec instruction for mitigating Spectre v4"). Adding bpf_jit_bypass_spec_v1/v4() was found to be preferable for the following reasons:
* bypass_spec_v1/v4 benefits non-vulnerable CPUs: Always performing the same analysis (not taking into account whether the current CPU is vulnerable), needlessly restricts users of CPUs that are not vulnerable. The only use case for this would be portability-testing, but this can later be added easily when needed by allowing users to force bypass_spec_v1/v4 to false.
* Portability is still acceptable: Directly disabling the analysis instead of skipping the lowering of BPF_NOSPEC(_V1/V4) might allow programs on non-vulnerable CPUs to be accepted while the program will be rejected on vulnerable CPUs. With the fallback to speculation barriers for Spectre v1 implemented in a future commit, this will only affect programs that do variable stack-accesses or are very complex.
For PowerPC, the SEC_FTR checking in bpf_jit_bypass_spec_v4() is based on the check that was previously located in the BPF_NOSPEC case.
For LoongArch, it would likely be safe to set both bpf_jit_bypass_spec_v1() and _v4() according to commit a6f6a95f2580 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode"). This is omitted here as I am unable to do any testing for LoongArch.
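As a usage sketch, a backend that knows its CPUs are not vulnerable could opt out of the verifier's v1 analysis and patching by overriding the weak default (hypothetical arch, illustration only):

	/* arch/foo/net/bpf_jit_comp.c (hypothetical) */
	bool bpf_jit_bypass_spec_v1(void)
	{
		/* CPUs of this arch do not speculate past bounds checks, so
		 * the verifier may skip Spectre v1 analysis and patching.
		 */
		return true;
	}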
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Cc: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 arch/arm64/net/bpf_jit_comp.c     | 21 ++++++++++++---------
 arch/powerpc/net/bpf_jit_comp64.c | 21 +++++++++++++++++----
 include/linux/bpf.h               | 11 +++++++++--
 kernel/bpf/core.c                 | 15 +++++++++++++++
 4 files changed, 53 insertions(+), 15 deletions(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index 70d7c89d3ac9..0f617b55866e 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -1583,15 +1583,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
/* speculation barrier */ case BPF_ST | BPF_NOSPEC: - /* - * Nothing required here. - * - * In case of arm64, we rely on the firmware mitigation of - * Speculative Store Bypass as controlled via the ssbd kernel - * parameter. Whenever the mitigation is enabled, it works - * for all of the kernel code with no need to provide any - * additional instructions. - */ + /* See bpf_jit_bypass_spec_v4() */ break;
/* ST: *(size *)(dst + off) = imm */ @@ -2762,6 +2754,17 @@ bool bpf_jit_supports_percpu_insn(void) return true; }
+bool bpf_jit_bypass_spec_v4(void) +{ + /* In case of arm64, we rely on the firmware mitigation of Speculative + * Store Bypass as controlled via the ssbd kernel parameter. Whenever + * the mitigation is enabled, it works for all of the kernel code with + * no need to provide any additional instructions. Therefore, skip + * inserting nospec insns against Spectre v4. + */ + return true; +} + bool bpf_jit_inlines_helper_call(s32 imm) { switch (imm) { diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c index 233703b06d7c..b5339c541283 100644 --- a/arch/powerpc/net/bpf_jit_comp64.c +++ b/arch/powerpc/net/bpf_jit_comp64.c @@ -363,6 +363,23 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o return 0; }
+bool bpf_jit_bypass_spec_v1(void) +{ +#if defined(CONFIG_PPC_E500) || defined(CONFIG_PPC_BOOK3S_64) + return !(security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && + security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR)); +#else + return true; +#endif +} + +bool bpf_jit_bypass_spec_v4(void) +{ + return !(security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && + security_ftr_enabled(SEC_FTR_STF_BARRIER) && + stf_barrier_type_get() != STF_BARRIER_NONE); +} + /* * We spill into the redzone always, even if the bpf program has its own stackframe. * Offsets hardcoded based on BPF_PPC_STACK_SAVE -- see bpf_jit_stack_local() @@ -785,10 +802,6 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code * BPF_ST NOSPEC (speculation barrier) */ case BPF_ST | BPF_NOSPEC: - if (!security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) || - !security_ftr_enabled(SEC_FTR_STF_BARRIER)) - break; - switch (stf_barrier) { case STF_BARRIER_EIEIO: EMIT(PPC_RAW_EIEIO() | 0x02000000); diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 3f0cc89c0622..2632fbf24654 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2453,14 +2453,21 @@ static inline bool bpf_allow_uninit_stack(const struct bpf_token *token) return bpf_token_capable(token, CAP_PERFMON); }
+bool bpf_jit_bypass_spec_v1(void); +bool bpf_jit_bypass_spec_v4(void); + static inline bool bpf_bypass_spec_v1(const struct bpf_token *token) { - return cpu_mitigations_off() || bpf_token_capable(token, CAP_PERFMON); + return bpf_jit_bypass_spec_v1() || + cpu_mitigations_off() || + bpf_token_capable(token, CAP_PERFMON); }
static inline bool bpf_bypass_spec_v4(const struct bpf_token *token) { - return cpu_mitigations_off() || bpf_token_capable(token, CAP_PERFMON); + return bpf_jit_bypass_spec_v4() || + cpu_mitigations_off() || + bpf_token_capable(token, CAP_PERFMON); }
int bpf_map_new_fd(struct bpf_map *map, int flags); diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index ba6b6118cf50..804f1e52bfa3 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -3029,6 +3029,21 @@ bool __weak bpf_jit_needs_zext(void) return false; }
+/* By default, enable the verifier's mitigations against Spectre v1 and v4 for + * all archs. The value returned must not change at runtime as there is + * currently no support for reloading programs that were loaded without + * mitigations. + */ +bool __weak bpf_jit_bypass_spec_v1(void) +{ + return false; +} + +bool __weak bpf_jit_bypass_spec_v4(void) +{ + return false; +} + /* Return true if the JIT inlines the call to the helper corresponding to * the imm. *
This changes the semantics of BPF_NOSPEC (previously a v4-only barrier) to always emit a speculation barrier that works against both Spectre v1 AND v4. If mitigation is not needed on an architecture, the backend should set bpf_jit_bypass_spec_v4/v1().
As of now, this commit only has the user-visible implication that unpriv BPF's performance on PowerPC is reduced. This is the case because we have to emit additional v1 barrier instructions for BPF_NOSPEC now.
This commit is required for a future commit to allow us to rely on BPF_NOSPEC for Spectre v1 mitigation. As of this commit, the feature that nospec acts as a v1 barrier is unused.
Commit f5e81d111750 ("bpf: Introduce BPF nospec instruction for mitigating Spectre v4") noted that mitigation instructions for v1 and v4 might be different on some archs. While this would potentially offer improved performance on PowerPC, it was dismissed after the following considerations:
* Only having one barrier simplifies the verifier and allows us to easily rely on v4-induced barriers for reducing the complexity of v1-induced speculative path verification.
* For the architectures that implemented BPF_NOSPEC, only PowerPC has distinct instructions for v1 and v4. Even there, some insns may be shared between the barriers for v1 and v4 (e.g., 'ori 31,31,0' and 'sync'). If this is still found to impact performance in an unacceptable way, BPF_NOSPEC can be split into BPF_NOSPEC_V1 and BPF_NOSPEC_V4 later. As an optimization, we can already skip v1/v4 insns from being emitted for PowerPC with this setup if bypass_spec_v1/v4 is set.
Vulnerability-status for BPF_NOSPEC-based Spectre mitigations (v4 as of this commit, v1 in the future) is therefore:
* x86 (32-bit and 64-bit), ARM64, and PowerPC (64-bit): Mitigated - This patch implements BPF_NOSPEC for these architectures. The previous v4-only version was supported since commit f5e81d111750 ("bpf: Introduce BPF nospec instruction for mitigating Spectre v4") and commit b7540d625094 ("powerpc/bpf: Emit stf barrier instruction sequences for BPF_NOSPEC").
* LoongArch: Not Vulnerable - Commit a6f6a95f2580 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode") is the only other past commit related to BPF_NOSPEC and indicates that the insn is not required there.
* MIPS: Vulnerable (if unprivileged BPF is enabled) - Commit a6f6a95f2580 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode") indicates that it is not vulnerable but this contradicts the kernel and Debian documentation. Therefore I assume that there exist vulnerable MIPS CPUs (but maybe not from Loongson?). In the future, BPF_NOSPEC could be implemented for MIPS based on the GCC speculation_barrier [1]. For now, we rely on unprivileged BPF being disabled by default.
* Other: Unknown - To the best of my knowledge there is no definitive information available that indicates that any other arch is vulnerable. They are therefore left untouched (BPF_NOSPEC is not implemented, but bypass_spec_v1/v4 is also not set).
I did the following testing to ensure the insn encoding is correct:
* ARM64:
  * 'dsb nsh; isb' was successfully tested with the BPF CI in [2]
  * 'sb' locally using QEMU v7.2.15 -cpu max (emitted sb insn is executed for example with './test_progs -t verifier_array_access')
* PowerPC: The following configs were tested locally with ppc64le QEMU v8.2 '-machine pseries -cpu POWER9':
  * STF_BARRIER_EIEIO + CONFIG_PPC_BOOK3S_64
  * STF_BARRIER_SYNC_ORI (forced on) + CONFIG_PPC_BOOK3S_64
  * STF_BARRIER_FALLBACK (forced on) + CONFIG_PPC_BOOK3S_64
  * CONFIG_PPC_E500 (forced on) + STF_BARRIER_EIEIO
  * CONFIG_PPC_E500 (forced on) + STF_BARRIER_SYNC_ORI (forced on)
  * CONFIG_PPC_E500 (forced on) + STF_BARRIER_FALLBACK (forced on)
  * CONFIG_PPC_E500 (forced on) + STF_BARRIER_NONE (forced on)
  Most of these combinations should not occur in practice, but I was not able to get a PPC e6500 rootfs (for testing PPC_E500 without forcing it on). In any case, this should ensure that there are no unexpected conflicts between the insns when combined like this. Individual v1/v4 barriers were already emitted elsewhere.
[1] https://gcc.gnu.org/git/?p=gcc.git%3Ba=commit%3Bh=29b74545531f6afbee9fc38c26... ("MIPS: Add speculation_barrier support")
[2] https://github.com/kernel-patches/bpf/pull/8576
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Cc: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 arch/arm64/net/bpf_jit.h          |  5 +++
 arch/arm64/net/bpf_jit_comp.c     |  9 +++--
 arch/powerpc/net/bpf_jit_comp64.c | 58 ++++++++++++++++++++++---------
 include/linux/filter.h            |  2 +-
 kernel/bpf/core.c                 | 17 ++++-----
 5 files changed, 64 insertions(+), 27 deletions(-)
diff --git a/arch/arm64/net/bpf_jit.h b/arch/arm64/net/bpf_jit.h index a3b0e693a125..bbea4f36f9f2 100644 --- a/arch/arm64/net/bpf_jit.h +++ b/arch/arm64/net/bpf_jit.h @@ -325,4 +325,9 @@ #define A64_MRS_SP_EL0(Rt) \ aarch64_insn_gen_mrs(Rt, AARCH64_INSN_SYSREG_SP_EL0)
+/* Barriers */ +#define A64_SB aarch64_insn_get_sb_value() +#define A64_DSB_NSH (aarch64_insn_get_dsb_base_value() | 0x7 << 8) +#define A64_ISB aarch64_insn_get_isb_value() + #endif /* _BPF_JIT_H */ diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index 0f617b55866e..ccd6a2f31e35 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -1581,9 +1581,14 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, return ret; break;
- /* speculation barrier */ + /* speculation barrier against v1 and v4 */ case BPF_ST | BPF_NOSPEC: - /* See bpf_jit_bypass_spec_v4() */ + if (alternative_has_cap_likely(ARM64_HAS_SB)) { + emit(A64_SB, ctx); + } else { + emit(A64_DSB_NSH, ctx); + emit(A64_ISB, ctx); + } break;
/* ST: *(size *)(dst + off) = imm */ diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c index b5339c541283..c00951e2a50e 100644 --- a/arch/powerpc/net/bpf_jit_comp64.c +++ b/arch/powerpc/net/bpf_jit_comp64.c @@ -800,26 +800,52 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
/* * BPF_ST NOSPEC (speculation barrier) + * + * The following must act as a barrier against both Spectre v1 + * and v4 if we requested both mitigations. Therefore, also emit + * 'isync; sync' on E500 or 'ori31' on BOOK3S_64 in addition to + * the insns needed for a Spectre v4 barrier. + * + * If we requested only !bypass_spec_v1 OR only !bypass_spec_v4, + * we can skip the respective other barrier type as an + * optimization. */ case BPF_ST | BPF_NOSPEC: - switch (stf_barrier) { - case STF_BARRIER_EIEIO: - EMIT(PPC_RAW_EIEIO() | 0x02000000); - break; - case STF_BARRIER_SYNC_ORI: + bool sync_emitted = false; + bool ori31_emitted = false; +#ifdef CONFIG_PPC_E500 + if (!bpf_jit_bypass_spec_v1()) { + EMIT(PPC_RAW_ISYNC()); EMIT(PPC_RAW_SYNC()); - EMIT(PPC_RAW_LD(tmp1_reg, _R13, 0)); - EMIT(PPC_RAW_ORI(_R31, _R31, 0)); - break; - case STF_BARRIER_FALLBACK: - ctx->seen |= SEEN_FUNC; - PPC_LI64(_R12, dereference_kernel_function_descriptor(bpf_stf_barrier)); - EMIT(PPC_RAW_MTCTR(_R12)); - EMIT(PPC_RAW_BCTRL()); - break; - case STF_BARRIER_NONE: - break; + sync_emitted = true; + } +#endif + if (!bpf_jit_bypass_spec_v4()) { + switch (stf_barrier) { + case STF_BARRIER_EIEIO: + EMIT(PPC_RAW_EIEIO() | 0x02000000); + break; + case STF_BARRIER_SYNC_ORI: + if (!sync_emitted) + EMIT(PPC_RAW_SYNC()); + EMIT(PPC_RAW_LD(tmp1_reg, _R13, 0)); + EMIT(PPC_RAW_ORI(_R31, _R31, 0)); + ori31_emitted = true; + break; + case STF_BARRIER_FALLBACK: + ctx->seen |= SEEN_FUNC; + PPC_LI64(_R12, dereference_kernel_function_descriptor(bpf_stf_barrier)); + EMIT(PPC_RAW_MTCTR(_R12)); + EMIT(PPC_RAW_BCTRL()); + break; + case STF_BARRIER_NONE: + break; + } } +#ifdef CONFIG_PPC_BOOK3S_64 + if (!bpf_jit_bypass_spec_v1() && !ori31_emitted) + EMIT(PPC_RAW_ORI(_R31, _R31, 0)); +#endif break;
/* diff --git a/include/linux/filter.h b/include/linux/filter.h index f5cf4d35d83e..eca229752cbe 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -82,7 +82,7 @@ struct ctl_table_header; #define BPF_CALL_ARGS 0xe0
/* unused opcode to mark speculation barrier for mitigating - * Speculative Store Bypass + * Spectre v1 and v4 */ #define BPF_NOSPEC 0xc0
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 804f1e52bfa3..fe16be379bf4 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2102,14 +2102,15 @@ static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn) #undef COND_JMP /* ST, STX and LDX*/ ST_NOSPEC: - /* Speculation barrier for mitigating Speculative Store Bypass. - * In case of arm64, we rely on the firmware mitigation as - * controlled via the ssbd kernel parameter. Whenever the - * mitigation is enabled, it works for all of the kernel code - * with no need to provide any additional instructions here. - * In case of x86, we use 'lfence' insn for mitigation. We - * reuse preexisting logic from Spectre v1 mitigation that - * happens to produce the required code on x86 for v4 as well. + /* Speculation barrier for mitigating Speculative Store Bypass, + * Bounds-Check Bypass and Type Confusion. In case of arm64, we + * rely on the firmware mitigation as controlled via the ssbd + * kernel parameter. Whenever the mitigation is enabled, it + * works for all of the kernel code with no need to provide any + * additional instructions here. In case of x86, we use 'lfence' + * insn for mitigation. We reuse preexisting logic from Spectre + * v1 mitigation that happens to produce the required code on + * x86 for v4 as well. */ barrier_nospec(); CONT;
This rename clarifies that the flag will cause a nospec to be added after this insn and can therefore be relied upon to reduce speculative path analysis.
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Cc: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 include/linux/bpf_verifier.h | 2 +-
 kernel/bpf/verifier.c        | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 9734544b6957..cebb67becdad 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -576,7 +576,7 @@ struct bpf_insn_aux_data { u64 map_key_state; /* constant (32 bit) key tracking for maps */ int ctx_field_size; /* the ctx field size for load insn, maybe 0 */ u32 seen; /* this insn was processed by the verifier at env->pass_cnt */ - bool sanitize_stack_spill; /* subject to Spectre v4 sanitation */ + bool nospec_result; /* result is unsafe under speculation, nospec must follow */ bool zext_dst; /* this insn zero extends dst reg */ bool needs_zext; /* alu op needs to clear upper bits */ bool storage_get_func_atomic; /* bpf_*_storage_get() with atomic memory alloc */ diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 27d3bc97a9e0..3d446c19bdf6 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5031,7 +5031,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env, }
if (sanitize) - env->insn_aux_data[insn_idx].sanitize_stack_spill = true; + env->insn_aux_data[insn_idx].nospec_result = true; }
err = destroy_if_dynptr_stack_slot(env, state, spi); @@ -20886,7 +20886,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) }
if (type == BPF_WRITE && - env->insn_aux_data[i + delta].sanitize_stack_spill) { + env->insn_aux_data[i + delta].nospec_result) { struct bpf_insn patch[] = { *insn, BPF_ST_NOSPEC(),
This implements the core of the series and causes the verifier to fall back to mitigating Spectre v1 using speculation barriers. The approach was presented at LPC'24 [1] and RAID'24 [2].
If we find any forbidden behavior on a speculative path, we insert a nospec (e.g., lfence speculation barrier on x86) before the instruction and stop verifying the path. While verifying a speculative path, we can furthermore stop verification of that path whenever we encounter a nospec instruction.
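In simplified pseudocode, the resulting do_check() loop behaves as follows (a condensed sketch of the actual diff further below):

	if (state->speculative && cur_aux(env)->nospec)
		/* a barrier is already present, stop this speculative path */
		goto process_bpf_exit;

	err = do_check_insn(env, insn, &do_print_state);
	if (state->speculative && error_recoverable_with_nospec(err)) {
		/* Insert a nospec before the insn that would have been unsafe
		 * to execute and stop verifying this speculative path.
		 */
		cur_aux(env)->nospec = true;
		goto process_bpf_exit;
	} else if (err < 0) {
		return err;
	}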
A minimal example program would look as follows:
A = true
B = true
if A goto e
f()
if B goto e
unsafe()
e: exit
There are the following speculative and non-speculative paths (`cur->speculative` and `speculative` referring to the value of the push_stack() parameters):
- A = true
- B = true
- if A goto e
  - A && !cur->speculative && !speculative
    - exit
  - !A && !cur->speculative && speculative
    - f()
    - if B goto e
      - B && cur->speculative && !speculative
        - exit
      - !B && cur->speculative && speculative
        - unsafe()
If f() contains any unsafe behavior under Spectre v1 and the unsafe behavior matches `state->speculative && error_recoverable_with_nospec(err)`, do_check() will now add a nospec before f() instead of rejecting the program:
A = true
B = true
if A goto e
nospec
f()
if B goto e
unsafe()
e: exit
Alternatively, the algorithm also takes advantage of nospec instructions inserted for other reasons (e.g., Spectre v4). Taking the program above as an example, speculative path exploration can stop before f() if a nospec was inserted there because of Spectre v4 sanitization.
In this example, all instructions after the nospec are dead code (and with the nospec they are also dead code speculatively).
On x86_64, this depends on the following property of lfence [3]:
An LFENCE instruction or a serializing instruction will ensure that no later instructions execute, even speculatively, until all prior instructions complete locally. [...] Inserting an LFENCE instruction after a bounds check prevents later operations from executing before the bound check completes.
Regarding the example, this implies that `if B goto e` will not execute before `if A goto e` completes. Once `if A goto e` completes, the CPU should find that the speculation was wrong and continue with `exit`.
If there is any other path that leads to `if B goto e` (and therefore `unsafe()`) without going through `if A goto e`, then a nospec will still be needed there. However, this patch assumes this other path will be explored separately and therefore be discovered by the verifier even if the exploration discussed here stops at the nospec.
This patch furthermore has the unfortunate consequence that Spectre v1 mitigations now only support architectures which implement BPF_NOSPEC. Before this commit, Spectre v1 mitigations prevented exploits by rejecting the programs on all architectures. Because some JITs do not implement BPF_NOSPEC, this patch therefore may regress unpriv BPF's security to a limited extent:
* The regression is limited to systems that are vulnerable to Spectre v1, have unprivileged BPF enabled, and do NOT emit insns for BPF_NOSPEC. The latter is not the case for x86 64- and 32-bit, arm64, and powerpc 64-bit, and these are therefore not affected by the regression. According to commit a6f6a95f2580 ("LoongArch, bpf: Fix jit to skip speculation barrier opcode"), LoongArch is not vulnerable to Spectre v1 and therefore also not affected by the regression.
* To the best of my knowledge this regression may therefore only affect MIPS. This is deemed acceptable because unpriv BPF is still disabled there by default. As stated in a previous commit, BPF_NOSPEC could be implemented for MIPS based on GCC's speculation_barrier implementation.
* It is unclear which other architectures (besides x86 64- and 32-bit, ARM64, PowerPC 64-bit, LoongArch, and MIPS) supported by the kernel are vulnerable to Spectre v1. Also, it is not clear if barriers are available on these architectures. Implementing BPF_NOSPEC on these architectures therefore is non-trivial. Searching GCC and the kernel for speculation barrier implementations for these architectures yielded no result.
* If any of those regressed systems is also vulnerable to Spectre v4, the system was already vulnerable to Spectre v4 attacks based on unpriv BPF before this patch and the impact is therefore further limited.
As an alternative to regressing security, one could still reject programs if the architecture does not emit BPF_NOSPEC (e.g., by removing the empty BPF_NOSPEC-case from all JITs except for LoongArch where it appears justified). However, this will cause rejections on these archs that are likely unfounded in the vast majority of cases.
In the tests, some are now successful where we previously had a false-positive (i.e., rejection). Change them to reflect where the nospec should be inserted (using __xlated_unpriv) and modify the error message if the nospec is able to mitigate a problem that previously shadowed another problem (in that case __xlated_unpriv does not work, therefore just add a comment).
To avoid duplicating the ifdef whenever we check for nospec insns using __xlated_unpriv, define SPEC_V1 once here. This also improves readability. PowerPC can probably also be added here. However, omit it for now because the BPF CI currently does not include a test.
Briefly went through all the occurrences of EPERM, EINVAL, and EACCES in the verifier in order to validate that catching them like this makes sense.
[1] https://lpc.events/event/18/contributions/1954/ ("Mitigating Spectre-PHT using Speculation Barriers in Linux eBPF")
[2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
[3] https://www.intel.com/content/www/us/en/developer/articles/technical/softwar... ("Managed Runtime Speculative Execution Side Channel Mitigations")
Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
---
 include/linux/bpf_verifier.h                  |  1 +
 kernel/bpf/verifier.c                         | 78 ++++++++++++++++++-
 tools/testing/selftests/bpf/progs/bpf_misc.h  |  4 +
 .../selftests/bpf/progs/verifier_and.c        |  8 +-
 .../selftests/bpf/progs/verifier_bounds.c     | 61 ++++++++++++---
 .../selftests/bpf/progs/verifier_movsx.c      | 16 +++-
 .../selftests/bpf/progs/verifier_unpriv.c     |  8 +-
 .../bpf/progs/verifier_value_ptr_arith.c      | 16 +++-
 .../selftests/bpf/verifier/dead_code.c        |  3 +-
 tools/testing/selftests/bpf/verifier/jmp32.c  | 33 +++-----
 tools/testing/selftests/bpf/verifier/jset.c   | 10 +--
 11 files changed, 184 insertions(+), 54 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index cebb67becdad..f1573e093120 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -576,6 +576,7 @@ struct bpf_insn_aux_data { u64 map_key_state; /* constant (32 bit) key tracking for maps */ int ctx_field_size; /* the ctx field size for load insn, maybe 0 */ u32 seen; /* this insn was processed by the verifier at env->pass_cnt */ + bool nospec; /* do not execute this instruction speculatively */ bool nospec_result; /* result is unsafe under speculation, nospec must follow */ bool zext_dst; /* this insn zero extends dst reg */ bool needs_zext; /* alu op needs to clear upper bits */ diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 3d446c19bdf6..92490964eb3b 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -2014,6 +2014,18 @@ static int pop_stack(struct bpf_verifier_env *env, int *prev_insn_idx, return 0; }
+static bool error_recoverable_with_nospec(int err) +{ + /* Should only return true for non-fatal errors that are allowed to + * occur during speculative verification. For these we can insert a + * nospec and the program might still be accepted. Do not include + * something like ENOMEM because it is likely to re-occur for the next + * architectural path once it has been recovered-from in all speculative + * paths. + */ + return err == -EPERM || err == -EACCES || err == -EINVAL; +} + static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env, int insn_idx, int prev_insn_idx, bool speculative) @@ -11160,7 +11172,7 @@ static int check_get_func_ip(struct bpf_verifier_env *env) return -ENOTSUPP; }
-static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env) +static struct bpf_insn_aux_data *cur_aux(const struct bpf_verifier_env *env) { return &env->insn_aux_data[env->insn_idx]; } @@ -13997,7 +14009,9 @@ static int retrieve_ptr_limit(const struct bpf_reg_state *ptr_reg, static bool can_skip_alu_sanitation(const struct bpf_verifier_env *env, const struct bpf_insn *insn) { - return env->bypass_spec_v1 || BPF_SRC(insn->code) == BPF_K; + return env->bypass_spec_v1 || + BPF_SRC(insn->code) == BPF_K || + cur_aux(env)->nospec; }
static int update_alu_sanitation_state(struct bpf_insn_aux_data *aux, @@ -19686,10 +19700,41 @@ static int do_check(struct bpf_verifier_env *env) sanitize_mark_insn_seen(env); prev_insn_idx = env->insn_idx;
+ /* Reduce verification complexity by stopping speculative path + * verification when a nospec is encountered. + */ + if (state->speculative && cur_aux(env)->nospec) + goto process_bpf_exit; + err = do_check_insn(env, insn, &do_print_state); - if (err < 0) { + if (state->speculative && error_recoverable_with_nospec(err)) { + /* Prevent this speculative path from ever reaching the + * insn that would have been unsafe to execute. + */ + cur_aux(env)->nospec = true; + /* If it was an ADD/SUB insn, potentially remove any + * markings for alu sanitization. + */ + cur_aux(env)->alu_state = 0; + goto process_bpf_exit; + } else if (err < 0) { return err; } else if (err == PROCESS_BPF_EXIT) { + goto process_bpf_exit; + } + WARN_ON_ONCE(err); + + if (state->speculative && cur_aux(env)->nospec_result) { + /* If we are on a path that performed a jump-op, this + * may skip a nospec patched-in after the jump. This can + * currently never happen because nospec_result is only + * used for the write-ops + * `*(size*)(dst_reg+off)=src_reg|imm32` which must + * never skip the following insn. Still, add a warning + * to document this in case nospec_result is used + * elsewhere in the future. + */ + WARN_ON_ONCE(env->insn_idx != prev_insn_idx + 1); process_bpf_exit: mark_verifier_state_scratched(env); update_branch_counts(env, env->cur_state); @@ -19708,7 +19753,6 @@ static int do_check(struct bpf_verifier_env *env) continue; } } - WARN_ON_ONCE(err); }
return 0; @@ -20837,6 +20881,29 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) bpf_convert_ctx_access_t convert_ctx_access; u8 mode;
+ if (env->insn_aux_data[i + delta].nospec) { + WARN_ON_ONCE(env->insn_aux_data[i + delta].alu_state); + struct bpf_insn patch[] = { + BPF_ST_NOSPEC(), + *insn, + }; + + cnt = ARRAY_SIZE(patch); + new_prog = bpf_patch_insn_data(env, i + delta, patch, cnt); + if (!new_prog) + return -ENOMEM; + + delta += cnt - 1; + env->prog = new_prog; + insn = new_prog->insnsi + i + delta; + /* This can not be easily merged with the + * nospec_result-case, because an insn may require a + * nospec before and after itself. Therefore also do not + * 'continue' here but potentially apply further + * patching to insn. *insn should equal patch[1] now. + */ + } + if (insn->code == (BPF_LDX | BPF_MEM | BPF_B) || insn->code == (BPF_LDX | BPF_MEM | BPF_H) || insn->code == (BPF_LDX | BPF_MEM | BPF_W) || @@ -20887,6 +20954,9 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
if (type == BPF_WRITE && env->insn_aux_data[i + delta].nospec_result) { + /* nospec_result is only used to mitigate Spectre v4 and + * to limit verification-time for Spectre v1. + */ struct bpf_insn patch[] = { *insn, BPF_ST_NOSPEC(), diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h index 13a2e22f5465..83658b3ba408 100644 --- a/tools/testing/selftests/bpf/progs/bpf_misc.h +++ b/tools/testing/selftests/bpf/progs/bpf_misc.h @@ -230,4 +230,8 @@ #define CAN_USE_LOAD_ACQ_STORE_REL #endif
+#if defined(__TARGET_ARCH_arm64) || defined(__TARGET_ARCH_x86) +#define SPEC_V1 +#endif + #endif diff --git a/tools/testing/selftests/bpf/progs/verifier_and.c b/tools/testing/selftests/bpf/progs/verifier_and.c index e97e518516b6..2b4fdca162be 100644 --- a/tools/testing/selftests/bpf/progs/verifier_and.c +++ b/tools/testing/selftests/bpf/progs/verifier_and.c @@ -85,8 +85,14 @@ l0_%=: r0 = r0; \
SEC("socket") __description("check known subreg with unknown reg") -__success __failure_unpriv __msg_unpriv("R1 !read_ok") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("if w0 < 0x1 goto pc+2") +__xlated_unpriv("nospec") /* inserted to prevent `R1 !read_ok'` */ +__xlated_unpriv("goto pc-1") /* `r1 = *(u32*)(r1 + 512)`, sanitized dead code */ +__xlated_unpriv("r0 = 0") +#endif __naked void known_subreg_with_unknown_reg(void) { asm volatile (" \ diff --git a/tools/testing/selftests/bpf/progs/verifier_bounds.c b/tools/testing/selftests/bpf/progs/verifier_bounds.c index 0eb33bb801b5..30e16153fdf1 100644 --- a/tools/testing/selftests/bpf/progs/verifier_bounds.c +++ b/tools/testing/selftests/bpf/progs/verifier_bounds.c @@ -620,8 +620,14 @@ l1_%=: exit; \
SEC("socket") __description("bounds check mixed 32bit and 64bit arithmetic. test1") -__success __failure_unpriv __msg_unpriv("R0 invalid mem access 'scalar'") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("goto pc+2") +__xlated_unpriv("nospec") /* inserted to prevent `R0 invalid mem access 'scalar'` */ +__xlated_unpriv("goto pc-1") /* sanitized dead code */ +__xlated_unpriv("exit") +#endif __naked void _32bit_and_64bit_arithmetic_test1(void) { asm volatile (" \ @@ -643,8 +649,14 @@ l1_%=: exit; \
SEC("socket") __description("bounds check mixed 32bit and 64bit arithmetic. test2") -__success __failure_unpriv __msg_unpriv("R0 invalid mem access 'scalar'") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("goto pc+2") +__xlated_unpriv("nospec") /* inserted to prevent `R0 invalid mem access 'scalar'` */ +__xlated_unpriv("goto pc-1") /* sanitized dead code */ +__xlated_unpriv("exit") +#endif __naked void _32bit_and_64bit_arithmetic_test2(void) { asm volatile (" \ @@ -691,9 +703,14 @@ l0_%=: r0 = 0; \
SEC("socket") __description("bounds check for reg = 0, reg xor 1") -__success __failure_unpriv -__msg_unpriv("R0 min value is outside of the allowed memory range") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("if r1 != 0x0 goto pc+2") +__xlated_unpriv("nospec") /* inserted to prevent `R0 min value is outside of the allowed memory range` */ +__xlated_unpriv("goto pc-1") /* sanitized dead code */ +__xlated_unpriv("r0 = 0") +#endif __naked void reg_0_reg_xor_1(void) { asm volatile (" \ @@ -719,9 +736,14 @@ l1_%=: r0 = 0; \
SEC("socket") __description("bounds check for reg32 = 0, reg32 xor 1") -__success __failure_unpriv -__msg_unpriv("R0 min value is outside of the allowed memory range") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("if w1 != 0x0 goto pc+2") +__xlated_unpriv("nospec") /* inserted to prevent `R0 min value is outside of the allowed memory range` */ +__xlated_unpriv("goto pc-1") /* sanitized dead code */ +__xlated_unpriv("r0 = 0") +#endif __naked void reg32_0_reg32_xor_1(void) { asm volatile (" \ @@ -747,9 +769,14 @@ l1_%=: r0 = 0; \
SEC("socket") __description("bounds check for reg = 2, reg xor 3") -__success __failure_unpriv -__msg_unpriv("R0 min value is outside of the allowed memory range") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("if r1 > 0x0 goto pc+2") +__xlated_unpriv("nospec") /* inserted to prevent `R0 min value is outside of the allowed memory range` */ +__xlated_unpriv("goto pc-1") /* sanitized dead code */ +__xlated_unpriv("r0 = 0") +#endif __naked void reg_2_reg_xor_3(void) { asm volatile (" \ @@ -829,9 +856,14 @@ l1_%=: r0 = 0; \
SEC("socket") __description("bounds check for reg > 0, reg xor 3") -__success __failure_unpriv -__msg_unpriv("R0 min value is outside of the allowed memory range") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("if r1 >= 0x0 goto pc+2") +__xlated_unpriv("nospec") /* inserted to prevent `R0 min value is outside of the allowed memory range` */ +__xlated_unpriv("goto pc-1") /* sanitized dead code */ +__xlated_unpriv("r0 = 0") +#endif __naked void reg_0_reg_xor_3(void) { asm volatile (" \ @@ -858,9 +890,14 @@ l1_%=: r0 = 0; \
SEC("socket") __description("bounds check for reg32 > 0, reg32 xor 3") -__success __failure_unpriv -__msg_unpriv("R0 min value is outside of the allowed memory range") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("if w1 >= 0x0 goto pc+2") +__xlated_unpriv("nospec") /* inserted to prevent `R0 min value is outside of the allowed memory range` */ +__xlated_unpriv("goto pc-1") /* sanitized dead code */ +__xlated_unpriv("r0 = 0") +#endif __naked void reg32_0_reg32_xor_3(void) { asm volatile (" \ diff --git a/tools/testing/selftests/bpf/progs/verifier_movsx.c b/tools/testing/selftests/bpf/progs/verifier_movsx.c index 994bbc346d25..a4d8814eb5ed 100644 --- a/tools/testing/selftests/bpf/progs/verifier_movsx.c +++ b/tools/testing/selftests/bpf/progs/verifier_movsx.c @@ -245,7 +245,13 @@ l0_%=: \ SEC("socket") __description("MOV32SX, S8, var_off not u32_max, positive after s8 extension") __success __retval(0) -__failure_unpriv __msg_unpriv("frame pointer is read only") +__success_unpriv +#ifdef SPEC_V1 +__xlated_unpriv("w0 = 0") +__xlated_unpriv("exit") +__xlated_unpriv("nospec") /* inserted to prevent `frame pointer is read only` */ +__xlated_unpriv("goto pc-1") +#endif __naked void mov64sx_s32_varoff_2(void) { asm volatile (" \ @@ -267,7 +273,13 @@ l0_%=: \ SEC("socket") __description("MOV32SX, S8, var_off not u32_max, negative after s8 extension") __success __retval(0) -__failure_unpriv __msg_unpriv("frame pointer is read only") +__success_unpriv +#ifdef SPEC_V1 +__xlated_unpriv("w0 = 0") +__xlated_unpriv("exit") +__xlated_unpriv("nospec") /* inserted to prevent `frame pointer is read only` */ +__xlated_unpriv("goto pc-1") +#endif __naked void mov64sx_s32_varoff_3(void) { asm volatile (" \ diff --git a/tools/testing/selftests/bpf/progs/verifier_unpriv.c b/tools/testing/selftests/bpf/progs/verifier_unpriv.c index a4a5e2071604..c42c3839b30c 100644 --- a/tools/testing/selftests/bpf/progs/verifier_unpriv.c +++ b/tools/testing/selftests/bpf/progs/verifier_unpriv.c @@ -572,8 +572,14 @@ l0_%=: exit; \
SEC("socket") __description("alu32: mov u32 const") -__success __failure_unpriv __msg_unpriv("R7 invalid mem access 'scalar'") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("if r0 == 0x0 goto pc+2") +__xlated_unpriv("nospec") /* inserted to prevent `R7 invalid mem access 'scalar'` */ +__xlated_unpriv("goto pc-1") /* sanitized dead code */ +__xlated_unpriv("exit") +#endif __naked void alu32_mov_u32_const(void) { asm volatile (" \ diff --git a/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c b/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c index 5ba6e53571c8..fcea9819e359 100644 --- a/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c +++ b/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c @@ -398,7 +398,8 @@ l2_%=: r0 = 1; \
SEC("socket") __description("map access: mixing value pointer and scalar, 1") -__success __failure_unpriv __msg_unpriv("R2 pointer comparison prohibited") +__success __failure_unpriv +__msg_unpriv("R2 tried to add from different maps, paths or scalars, pointer arithmetic with it prohibited for !root") __retval(0) __naked void value_pointer_and_scalar_1(void) { @@ -433,6 +434,7 @@ l2_%=: /* common instruction */ \ l3_%=: /* branch B */ \ r0 = 0x13371337; \ /* verifier follows fall-through */ \ + /* unpriv: nospec (inserted to prevent `R2 pointer comparison prohibited`) */\ if r2 != 0x100000 goto l4_%=; \ r0 = 0; \ exit; \ @@ -450,7 +452,8 @@ l4_%=: /* fake-dead code; targeted from branch A to \
SEC("socket") __description("map access: mixing value pointer and scalar, 2") -__success __failure_unpriv __msg_unpriv("R0 invalid mem access 'scalar'") +__success __failure_unpriv +__msg_unpriv("R2 tried to add from different maps, paths or scalars, pointer arithmetic with it prohibited for !root") __retval(0) __naked void value_pointer_and_scalar_2(void) { @@ -492,6 +495,7 @@ l4_%=: /* fake-dead code; targeted from branch A to \ * prevent dead code sanitization, rejected \ * via branch B however \ */ \ + /* unpriv: nospec (inserted to prevent `R0 invalid mem access 'scalar'`) */\ r0 = *(u8*)(r0 + 0); \ r0 = 0; \ exit; \ @@ -1296,9 +1300,13 @@ l0_%=: r0 = 1; \
SEC("socket") __description("map access: value_ptr -= unknown scalar, 2") -__success __failure_unpriv -__msg_unpriv("R0 pointer arithmetic of map value goes out of range") +__success __success_unpriv __retval(1) +#ifdef SPEC_V1 +__xlated_unpriv("r1 &= 7") +__xlated_unpriv("nospec") /* inserted to prevent `R0 pointer arithmetic of map value goes out of range` */ +__xlated_unpriv("r0 -= r1") +#endif __naked void value_ptr_unknown_scalar_2_2(void) { asm volatile (" \ diff --git a/tools/testing/selftests/bpf/verifier/dead_code.c b/tools/testing/selftests/bpf/verifier/dead_code.c index ee454327e5c6..77207b498c6f 100644 --- a/tools/testing/selftests/bpf/verifier/dead_code.c +++ b/tools/testing/selftests/bpf/verifier/dead_code.c @@ -2,14 +2,13 @@ "dead code: start", .insns = { BPF_JMP_IMM(BPF_JA, 0, 0, 2), + /* unpriv: nospec (inserted to prevent "R9 !read_ok") */ BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0), BPF_JMP_IMM(BPF_JA, 0, 0, 2), BPF_MOV64_IMM(BPF_REG_0, 7), BPF_JMP_IMM(BPF_JGE, BPF_REG_0, 10, -4), BPF_EXIT_INSN(), }, - .errstr_unpriv = "R9 !read_ok", - .result_unpriv = REJECT, .result = ACCEPT, .retval = 7, }, diff --git a/tools/testing/selftests/bpf/verifier/jmp32.c b/tools/testing/selftests/bpf/verifier/jmp32.c index 43776f6f92f4..91d83e9cb148 100644 --- a/tools/testing/selftests/bpf/verifier/jmp32.c +++ b/tools/testing/selftests/bpf/verifier/jmp32.c @@ -84,11 +84,10 @@ BPF_JMP32_IMM(BPF_JSET, BPF_REG_7, 0x10, 1), BPF_EXIT_INSN(), BPF_JMP32_IMM(BPF_JGE, BPF_REG_7, 0x10, 1), + /* unpriv: nospec (inserted to prevent "R9 !read_ok") */ BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0), BPF_EXIT_INSN(), }, - .errstr_unpriv = "R9 !read_ok", - .result_unpriv = REJECT, .result = ACCEPT, }, { @@ -149,11 +148,10 @@ BPF_JMP32_IMM(BPF_JEQ, BPF_REG_7, 0x10, 1), BPF_EXIT_INSN(), BPF_JMP32_IMM(BPF_JSGE, BPF_REG_7, 0xf, 1), + /* unpriv: nospec (inserted to prevent "R9 !read_ok") */ BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0), BPF_EXIT_INSN(), }, - .errstr_unpriv = "R9 !read_ok", - .result_unpriv = REJECT, .result = ACCEPT, }, { @@ -214,11 +212,10 @@ BPF_JMP32_IMM(BPF_JNE, BPF_REG_7, 0x10, 1), BPF_JMP_IMM(BPF_JNE, BPF_REG_7, 0x10, 1), BPF_EXIT_INSN(), + /* unpriv: nospec (inserted to prevent "R9 !read_ok") */ BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0), BPF_EXIT_INSN(), }, - .errstr_unpriv = "R9 !read_ok", - .result_unpriv = REJECT, .result = ACCEPT, }, { @@ -283,11 +280,10 @@ BPF_JMP32_REG(BPF_JGE, BPF_REG_7, BPF_REG_8, 1), BPF_EXIT_INSN(), BPF_JMP32_IMM(BPF_JGE, BPF_REG_7, 0x7ffffff0, 1), + /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .errstr_unpriv = "R0 invalid mem access 'scalar'", - .result_unpriv = REJECT, .result = ACCEPT, .retval = 2, .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, @@ -354,11 +350,10 @@ BPF_JMP32_REG(BPF_JGT, BPF_REG_7, BPF_REG_8, 1), BPF_EXIT_INSN(), BPF_JMP_IMM(BPF_JGT, BPF_REG_7, 0x7ffffff0, 1), + /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .errstr_unpriv = "R0 invalid mem access 'scalar'", - .result_unpriv = REJECT, .result = ACCEPT, .retval = 2, .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, @@ -425,11 +420,10 @@ BPF_JMP32_REG(BPF_JLE, BPF_REG_7, BPF_REG_8, 1), BPF_EXIT_INSN(), BPF_JMP32_IMM(BPF_JLE, BPF_REG_7, 0x7ffffff0, 1), + /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .errstr_unpriv = 
"R0 invalid mem access 'scalar'", - .result_unpriv = REJECT, .result = ACCEPT, .retval = 2, .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, @@ -496,11 +490,10 @@ BPF_JMP32_REG(BPF_JLT, BPF_REG_7, BPF_REG_8, 1), BPF_EXIT_INSN(), BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0x7ffffff0, 1), + /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .errstr_unpriv = "R0 invalid mem access 'scalar'", - .result_unpriv = REJECT, .result = ACCEPT, .retval = 2, .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, @@ -567,11 +560,10 @@ BPF_JMP32_REG(BPF_JSGE, BPF_REG_7, BPF_REG_8, 1), BPF_EXIT_INSN(), BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0x7ffffff0, 1), + /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .errstr_unpriv = "R0 invalid mem access 'scalar'", - .result_unpriv = REJECT, .result = ACCEPT, .retval = 2, .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, @@ -638,11 +630,10 @@ BPF_JMP32_REG(BPF_JSGT, BPF_REG_7, BPF_REG_8, 1), BPF_EXIT_INSN(), BPF_JMP_IMM(BPF_JSGT, BPF_REG_7, -2, 1), + /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .errstr_unpriv = "R0 invalid mem access 'scalar'", - .result_unpriv = REJECT, .result = ACCEPT, .retval = 2, .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, @@ -709,11 +700,10 @@ BPF_JMP32_REG(BPF_JSLE, BPF_REG_7, BPF_REG_8, 1), BPF_EXIT_INSN(), BPF_JMP_IMM(BPF_JSLE, BPF_REG_7, 0x7ffffff0, 1), + /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .errstr_unpriv = "R0 invalid mem access 'scalar'", - .result_unpriv = REJECT, .result = ACCEPT, .retval = 2, .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, @@ -780,11 +770,10 @@ BPF_JMP32_REG(BPF_JSLT, BPF_REG_7, BPF_REG_8, 1), BPF_EXIT_INSN(), BPF_JMP32_IMM(BPF_JSLT, BPF_REG_7, -1, 1), + /* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */ BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0), BPF_EXIT_INSN(), }, - .errstr_unpriv = "R0 invalid mem access 'scalar'", - .result_unpriv = REJECT, .result = ACCEPT, .retval = 2, .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, diff --git a/tools/testing/selftests/bpf/verifier/jset.c b/tools/testing/selftests/bpf/verifier/jset.c index 11fc68da735e..e901eefd774a 100644 --- a/tools/testing/selftests/bpf/verifier/jset.c +++ b/tools/testing/selftests/bpf/verifier/jset.c @@ -78,12 +78,11 @@ .insns = { BPF_MOV64_IMM(BPF_REG_0, 1), BPF_JMP_IMM(BPF_JSET, BPF_REG_0, 1, 1), + /* unpriv: nospec (inserted to prevent "R9 !read_ok") */ BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0), BPF_EXIT_INSN(), }, .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, - .errstr_unpriv = "R9 !read_ok", - .result_unpriv = REJECT, .retval = 1, .result = ACCEPT, }, @@ -136,13 +135,12 @@ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_prandom_u32), BPF_ALU64_IMM(BPF_OR, BPF_REG_0, 2), BPF_JMP_IMM(BPF_JSET, BPF_REG_0, 3, 1), + /* unpriv: nospec (inserted to prevent "R9 !read_ok") */ BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0), BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), }, .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, - .errstr_unpriv = "R9 !read_ok", - .result_unpriv = REJECT, .result = ACCEPT, }, { @@ -154,16 +152,16 @@ BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 0xff), BPF_JMP_IMM(BPF_JSET, BPF_REG_1, 0xf0, 3), BPF_JMP_IMM(BPF_JLT, BPF_REG_1, 0x10, 1), + /* unpriv: nospec (inserted to prevent 
"R9 !read_ok") */ BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0), BPF_EXIT_INSN(), BPF_JMP_IMM(BPF_JSET, BPF_REG_1, 0x10, 1), BPF_EXIT_INSN(), BPF_JMP_IMM(BPF_JGE, BPF_REG_1, 0x10, 1), + /* unpriv: nospec (inserted to prevent "R9 !read_ok") */ BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0), BPF_EXIT_INSN(), }, .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, - .errstr_unpriv = "R9 !read_ok", - .result_unpriv = REJECT, .result = ACCEPT, },
This is based on the gadget from the description of commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted branches").
Signed-off-by: Luis Gerhorst luis.gerhorst@fau.de --- .../selftests/bpf/progs/verifier_unpriv.c | 57 +++++++++++++++++++ 1 file changed, 57 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/verifier_unpriv.c b/tools/testing/selftests/bpf/progs/verifier_unpriv.c index c42c3839b30c..43236b93ebb5 100644 --- a/tools/testing/selftests/bpf/progs/verifier_unpriv.c +++ b/tools/testing/selftests/bpf/progs/verifier_unpriv.c @@ -729,4 +729,61 @@ l0_%=: r0 = 0; \ " ::: __clobber_all); }
+SEC("socket") +__description("unpriv: Spectre v1 path-based type confusion of scalar as stack-ptr") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("if r0 != 0x1 goto pc+2") +/* This nospec prevents the exploit because it forces the mispredicted (not + * taken) `if r0 != 0x0 goto l0_%=` to resolve before using r6 as a pointer. + * This causes the CPU to realize that `r6 = r9` should have never executed. It + * ensures that r6 always contains a readable stack slot ptr when the insn after + * the nospec executes. + */ +__xlated_unpriv("nospec") +__xlated_unpriv("r9 = *(u8 *)(r6 +0)") +#endif +__naked void unpriv_spec_v1_type_confusion(void) +{ + asm volatile (" \ + r1 = 0; \ + *(u64*)(r10 - 8) = r1; \ + r2 = r10; \ + r2 += -8; \ + r1 = %[map_hash_8b] ll; \ + call %[bpf_map_lookup_elem]; \ + if r0 == 0 goto l2_%=; \ + /* r0: pointer to a map array entry */ \ + r2 = r10; \ + r2 += -8; \ + r1 = %[map_hash_8b] ll; \ + /* r1, r2: prepared call args */ \ + r6 = r10; \ + r6 += -8; \ + /* r6: pointer to readable stack slot */ \ + r9 = 0xffffc900; \ + r9 <<= 32; \ + /* r9: scalar controlled by attacker */ \ + r0 = *(u64 *)(r0 + 0); /* cache miss */ \ + if r0 != 0x0 goto l0_%=; \ + r6 = r9; \ +l0_%=: if r0 != 0x1 goto l1_%=; \ + r9 = *(u8 *)(r6 + 0); \ +l1_%=: /* leak r9 */ \ + r9 &= 1; \ + r9 <<= 9; \ + *(u64*)(r10 - 8) = r9; \ + call %[bpf_map_lookup_elem]; \ + if r0 == 0 goto l2_%=; \ + /* leak secret into is_cached(map[0|512]): */ \ + r0 = *(u64 *)(r0 + 0); \ +l2_%=: \ + r0 = 0; \ + exit; \ +" : + : __imm(bpf_map_lookup_elem), + __imm_addr(map_hash_8b) + : __clobber_all); +} + char _license[] SEC("license") = "GPL";
Insert a nospec before the access to prevent it from ever using an index that is subject to speculative scalar confusion.
The access itself can either happen directly in the BPF program (reads only, check_stack_read_var_off()) or in a helper (read/write, check_helper_mem_access()).
This relies on the fact that the speculative scalar confusion that leads to the variable-offset stack access going OOB must stem from a prior speculative store bypass or branch bypass. Adding a nospec before the variable-offset stack access forces all previously bypassed stores/branches to complete and causes the stack access to only ever go to the stack slot that is accessed architecturally.
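As a rough sketch of the intended effect (registers, offsets, and bounds below are made up for illustration and not taken from the selftests; pointer-arithmetic and stack-initialization details are omitted), a variable-offset stack read in an unprivileged program would end up being patched roughly like this:

  r2 &= 7                  /* bounded, possibly attacker-influenced offset */
  r1 = r10
  r1 += -16
  r1 += r2                 /* variable-offset stack pointer */
  nospec                   /* patched in before the dereference: all prior
                            * bypassed stores/branches must resolve, so r1
                            * can only hold its architectural value here */
  r3 = *(u8 *)(r1 + 0)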
Alternatively, the variable-offset stack access might be a write that is itself subject to speculative store bypass (this can happen in theory even though this code adds a nospec /before/ the variable-offset write). Only indirect writes by helpers might be affected here (e.g., those taking ARG_PTR_TO_MAP_VALUE); because check_stack_write_var_off() does not use check_stack_range_initialized(), in-program variable-offset writes are not affected. If such an in-helper write can be subject to Spectre v4 and the helper writes/overwrites pointers on the BPF stack, it is already a problem for fixed-offset stack accesses and should be covered by the Spectre v4 sanitization.
Signed-off-by: Luis Gerhorst luis.gerhorst@fau.de Acked-by: Henriette Herzog henriette.herzog@rub.de Cc: Maximilian Ott ott@cs.fau.de Cc: Milan Stephan milan.stephan@fau.de --- kernel/bpf/verifier.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 92490964eb3b..2cd925b915e0 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -7894,6 +7894,11 @@ static int check_atomic(struct bpf_verifier_env *env, struct bpf_insn *insn) } }
+static struct bpf_insn_aux_data *cur_aux(const struct bpf_verifier_env *env) +{ + return &env->insn_aux_data[env->insn_idx]; +} + /* When register 'regno' is used to read the stack (either directly or through * a helper function) make sure that it's within stack boundary and, depending * on the access type and privileges, that all elements of the stack are @@ -7933,18 +7938,18 @@ static int check_stack_range_initialized( if (tnum_is_const(reg->var_off)) { min_off = max_off = reg->var_off.value + off; } else { - /* Variable offset is prohibited for unprivileged mode for + /* Variable offset requires a nospec for unprivileged mode for * simplicity since it requires corresponding support in * Spectre masking for stack ALU. * See also retrieve_ptr_limit(). */ if (!env->bypass_spec_v1) { - char tn_buf[48]; - - tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off); - verbose(env, "R%d variable offset stack access prohibited for !root, var_off=%s\n", - regno, tn_buf); - return -EACCES; + /* Allow the access, but prevent it from using a + * speculative offset using a nospec before the + * dereference op. + */ + cur_aux(env)->nospec = true; + WARN_ON_ONCE(cur_aux(env)->alu_state); } /* Only initialized buffer on stack is allowed to be accessed * with variable offset. With uninitialized buffer it's hard to @@ -11172,11 +11177,6 @@ static int check_get_func_ip(struct bpf_verifier_env *env) return -ENOTSUPP; }
-static struct bpf_insn_aux_data *cur_aux(const struct bpf_verifier_env *env) -{ - return &env->insn_aux_data[env->insn_idx]; -} - static bool loop_flag_is_zero(struct bpf_verifier_env *env) { struct bpf_reg_state *regs = cur_regs(env);
ALU sanitization was introduced to ensure that a subsequent ptr access can never go OOB, even under speculation. This is required because we currently allow speculative scalar confusion. Speculative scalar confusion is possible because Spectre v4 sanitization only adds a nospec after critical stores (e.g., a scalar overwritten with a pointer).
If we add a nospec before the ALU op, none of the operands can be subject to scalar confusion. As an ADD/SUB cannot introduce scalar confusion by itself, the result is also not subject to scalar confusion. Therefore, the subsequent ptr access is always safe.
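As a rough sketch (registers and bounds are made up for illustration and not taken from the selftests; the source of the confusion is only hinted at), the resulting code for an unprivileged pointer ALU op then looks roughly like this:

  /* r0: map value pointer; r1: scalar the verifier has bounded to [0, 7],
   * but whose speculative value may differ from the architectural one
   * (e.g., due to a bypassed store or a mispredicted bounds check)
   */
  nospec                   /* emitted instead of alu_limit masking: all
                            * prior stores/branches resolve, so r1 holds
                            * its architectural, bounded value */
  r0 += r1                 /* the ADD cannot re-introduce confusion */
  r2 = *(u8 *)(r0 + 0)     /* stays in bounds even under speculation */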
We directly fall back to nospec for the sanitization errors REASON_BOUNDS, _TYPE, _PATHS, and _LIMIT, even if we are not on a speculative path.
For REASON_STACK, we now return -ENOMEM directly. Previously, sanitize_err() returned -EACCES for this case; we change it to -ENOMEM because that prevents do_check() from falling back to a nospec if we are on a speculative path. This would not be a serious issue (the verifier would probably run into the -ENOMEM again shortly on the next non-speculative path and still abort verification), but -ENOMEM is more fitting here anyway. An alternative would be -EFAULT, which is also returned for some of the other cases where push_stack() fails, but that is more frequently used for verifier-internal bugs.
Signed-off-by: Luis Gerhorst luis.gerhorst@fau.de Acked-by: Henriette Herzog henriette.herzog@rub.de Cc: Maximilian Ott ott@cs.fau.de Cc: Milan Stephan milan.stephan@fau.de --- kernel/bpf/verifier.c | 85 +++++----------- .../selftests/bpf/progs/verifier_bounds.c | 5 +- .../bpf/progs/verifier_bounds_deduction.c | 45 ++++++--- .../selftests/bpf/progs/verifier_map_ptr.c | 20 +++- .../bpf/progs/verifier_value_ptr_arith.c | 97 ++++++++++++++++--- 5 files changed, 156 insertions(+), 96 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 2cd925b915e0..180cab806199 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -13967,14 +13967,6 @@ static bool check_reg_sane_offset(struct bpf_verifier_env *env, return true; }
-enum { - REASON_BOUNDS = -1, - REASON_TYPE = -2, - REASON_PATHS = -3, - REASON_LIMIT = -4, - REASON_STACK = -5, -}; - static int retrieve_ptr_limit(const struct bpf_reg_state *ptr_reg, u32 *alu_limit, bool mask_to_left) { @@ -13997,11 +13989,13 @@ static int retrieve_ptr_limit(const struct bpf_reg_state *ptr_reg, ptr_reg->umax_value) + ptr_reg->off; break; default: - return REASON_TYPE; + /* Register has pointer with unsupported alu operation. */ + return -EOPNOTSUPP; }
+ /* Register tried access beyond pointer bounds. */ if (ptr_limit >= max) - return REASON_LIMIT; + return -EOPNOTSUPP; *alu_limit = ptr_limit; return 0; } @@ -14022,8 +14016,12 @@ static int update_alu_sanitation_state(struct bpf_insn_aux_data *aux, */ if (aux->alu_state && (aux->alu_state != alu_state || - aux->alu_limit != alu_limit)) - return REASON_PATHS; + aux->alu_limit != alu_limit)) { + /* Tried to perform alu op from different maps, paths or scalars */ + aux->nospec = true; + aux->alu_state = 0; + return 0; + }
/* Corresponding fixup done in do_misc_fixups(). */ aux->alu_state = alu_state; @@ -14104,16 +14102,24 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env,
if (!commit_window) { if (!tnum_is_const(off_reg->var_off) && - (off_reg->smin_value < 0) != (off_reg->smax_value < 0)) - return REASON_BOUNDS; + (off_reg->smin_value < 0) != (off_reg->smax_value < 0)) { + /* Register has unknown scalar with mixed signed bounds. */ + aux->nospec = true; + aux->alu_state = 0; + return 0; + }
info->mask_to_left = (opcode == BPF_ADD && off_is_neg) || (opcode == BPF_SUB && !off_is_neg); }
err = retrieve_ptr_limit(ptr_reg, &alu_limit, info->mask_to_left); - if (err < 0) - return err; + if (err) { + WARN_ON_ONCE(err != -EOPNOTSUPP); + aux->nospec = true; + aux->alu_state = 0; + return 0; + }
if (commit_window) { /* In commit phase we narrow the masking window based on @@ -14166,7 +14172,7 @@ static int sanitize_ptr_alu(struct bpf_verifier_env *env, env->insn_idx); if (!ptr_is_dst_reg && ret) *dst_reg = tmp; - return !ret ? REASON_STACK : 0; + return !ret ? -ENOMEM : 0; }
static void sanitize_mark_insn_seen(struct bpf_verifier_env *env) @@ -14182,45 +14188,6 @@ static void sanitize_mark_insn_seen(struct bpf_verifier_env *env) env->insn_aux_data[env->insn_idx].seen = env->pass_cnt; }
-static int sanitize_err(struct bpf_verifier_env *env, - const struct bpf_insn *insn, int reason, - const struct bpf_reg_state *off_reg, - const struct bpf_reg_state *dst_reg) -{ - static const char *err = "pointer arithmetic with it prohibited for !root"; - const char *op = BPF_OP(insn->code) == BPF_ADD ? "add" : "sub"; - u32 dst = insn->dst_reg, src = insn->src_reg; - - switch (reason) { - case REASON_BOUNDS: - verbose(env, "R%d has unknown scalar with mixed signed bounds, %s\n", - off_reg == dst_reg ? dst : src, err); - break; - case REASON_TYPE: - verbose(env, "R%d has pointer with unsupported alu operation, %s\n", - off_reg == dst_reg ? src : dst, err); - break; - case REASON_PATHS: - verbose(env, "R%d tried to %s from different maps, paths or scalars, %s\n", - dst, op, err); - break; - case REASON_LIMIT: - verbose(env, "R%d tried to %s beyond pointer bounds, %s\n", - dst, op, err); - break; - case REASON_STACK: - verbose(env, "R%d could not be pushed for speculative verification, %s\n", - dst, err); - break; - default: - verbose(env, "verifier internal error: unknown reason (%d)\n", - reason); - break; - } - - return -EACCES; -} - /* check that stack access falls within stack limits and that 'reg' doesn't * have a variable offset. * @@ -14386,7 +14353,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, ret = sanitize_ptr_alu(env, insn, ptr_reg, off_reg, dst_reg, &info, false); if (ret < 0) - return sanitize_err(env, insn, ret, off_reg, dst_reg); + return ret; }
switch (opcode) { @@ -14514,7 +14481,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, ret = sanitize_ptr_alu(env, insn, dst_reg, off_reg, dst_reg, &info, true); if (ret < 0) - return sanitize_err(env, insn, ret, off_reg, dst_reg); + return ret; }
return 0; @@ -15108,7 +15075,7 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, if (sanitize_needed(opcode)) { ret = sanitize_val_alu(env, insn); if (ret < 0) - return sanitize_err(env, insn, ret, NULL, NULL); + return ret; }
/* Calculate sign/unsigned bounds and tnum for alu32 and alu64 bit ops. diff --git a/tools/testing/selftests/bpf/progs/verifier_bounds.c b/tools/testing/selftests/bpf/progs/verifier_bounds.c index 30e16153fdf1..f2ee6d7febda 100644 --- a/tools/testing/selftests/bpf/progs/verifier_bounds.c +++ b/tools/testing/selftests/bpf/progs/verifier_bounds.c @@ -47,9 +47,12 @@ SEC("socket") __description("subtraction bounds (map value) variant 2") __failure __msg("R0 min value is negative, either use unsigned index or do a if (index >=0) check.") -__msg_unpriv("R1 has unknown scalar with mixed signed bounds") +__msg_unpriv("R0 pointer arithmetic of map value goes out of range, prohibited for !root") __naked void bounds_map_value_variant_2(void) { + /* unpriv: nospec inserted to prevent "R1 has unknown scalar with mixed + * signed bounds". + */ asm volatile (" \ r1 = 0; \ *(u64*)(r10 - 8) = r1; \ diff --git a/tools/testing/selftests/bpf/progs/verifier_bounds_deduction.c b/tools/testing/selftests/bpf/progs/verifier_bounds_deduction.c index c506afbdd936..24ecaf89004e 100644 --- a/tools/testing/selftests/bpf/progs/verifier_bounds_deduction.c +++ b/tools/testing/selftests/bpf/progs/verifier_bounds_deduction.c @@ -8,22 +8,26 @@ SEC("socket") __description("check deducing bounds from const, 1") __failure __msg("R0 tried to subtract pointer from scalar") -__msg_unpriv("R1 has pointer with unsupported alu operation") +__failure_unpriv __naked void deducing_bounds_from_const_1(void) { asm volatile (" \ r0 = 1; \ if r0 s>= 1 goto l0_%=; \ -l0_%=: r0 -= r1; \ +l0_%=: /* unpriv: nospec (inserted to prevent `R1 has pointer with unsupported alu operation`) */\ + r0 -= r1; \ exit; \ " ::: __clobber_all); }
SEC("socket") __description("check deducing bounds from const, 2") -__success __failure_unpriv -__msg_unpriv("R1 has pointer with unsupported alu operation") +__success __success_unpriv __retval(1) +#ifdef SPEC_V1 +__xlated_unpriv("nospec") /* inserted to prevent `R1 has pointer with unsupported alu operation` */ +__xlated_unpriv("r1 -= r0") +#endif __naked void deducing_bounds_from_const_2(void) { asm volatile (" \ @@ -40,22 +44,26 @@ l1_%=: r1 -= r0; \ SEC("socket") __description("check deducing bounds from const, 3") __failure __msg("R0 tried to subtract pointer from scalar") -__msg_unpriv("R1 has pointer with unsupported alu operation") +__failure_unpriv __naked void deducing_bounds_from_const_3(void) { asm volatile (" \ r0 = 0; \ if r0 s<= 0 goto l0_%=; \ -l0_%=: r0 -= r1; \ +l0_%=: /* unpriv: nospec (inserted to prevent `R1 has pointer with unsupported alu operation`) */\ + r0 -= r1; \ exit; \ " ::: __clobber_all); }
SEC("socket") __description("check deducing bounds from const, 4") -__success __failure_unpriv -__msg_unpriv("R6 has pointer with unsupported alu operation") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("nospec") /* inserted to prevent `R6 has pointer with unsupported alu operation` */ +__xlated_unpriv("r6 -= r0") +#endif __naked void deducing_bounds_from_const_4(void) { asm volatile (" \ @@ -73,12 +81,13 @@ l1_%=: r6 -= r0; \ SEC("socket") __description("check deducing bounds from const, 5") __failure __msg("R0 tried to subtract pointer from scalar") -__msg_unpriv("R1 has pointer with unsupported alu operation") +__failure_unpriv __naked void deducing_bounds_from_const_5(void) { asm volatile (" \ r0 = 0; \ if r0 s>= 1 goto l0_%=; \ + /* unpriv: nospec (inserted to prevent `R1 has pointer with unsupported alu operation`) */\ r0 -= r1; \ l0_%=: exit; \ " ::: __clobber_all); @@ -87,14 +96,15 @@ l0_%=: exit; \ SEC("socket") __description("check deducing bounds from const, 6") __failure __msg("R0 tried to subtract pointer from scalar") -__msg_unpriv("R1 has pointer with unsupported alu operation") +__failure_unpriv __naked void deducing_bounds_from_const_6(void) { asm volatile (" \ r0 = 0; \ if r0 s>= 0 goto l0_%=; \ exit; \ -l0_%=: r0 -= r1; \ +l0_%=: /* unpriv: nospec (inserted to prevent `R1 has pointer with unsupported alu operation`) */\ + r0 -= r1; \ exit; \ " ::: __clobber_all); } @@ -102,14 +112,15 @@ l0_%=: r0 -= r1; \ SEC("socket") __description("check deducing bounds from const, 7") __failure __msg("dereference of modified ctx ptr") -__msg_unpriv("R1 has pointer with unsupported alu operation") +__failure_unpriv __flag(BPF_F_ANY_ALIGNMENT) __naked void deducing_bounds_from_const_7(void) { asm volatile (" \ r0 = %[__imm_0]; \ if r0 s>= 0 goto l0_%=; \ -l0_%=: r1 -= r0; \ +l0_%=: /* unpriv: nospec (inserted to prevent `R1 has pointer with unsupported alu operation`) */\ + r1 -= r0; \ r0 = *(u32*)(r1 + %[__sk_buff_mark]); \ exit; \ " : @@ -121,13 +132,14 @@ l0_%=: r1 -= r0; \ SEC("socket") __description("check deducing bounds from const, 8") __failure __msg("negative offset ctx ptr R1 off=-1 disallowed") -__msg_unpriv("R1 has pointer with unsupported alu operation") +__failure_unpriv __flag(BPF_F_ANY_ALIGNMENT) __naked void deducing_bounds_from_const_8(void) { asm volatile (" \ r0 = %[__imm_0]; \ if r0 s>= 0 goto l0_%=; \ + /* unpriv: nospec (inserted to prevent `R1 has pointer with unsupported alu operation`) */\ r1 += r0; \ l0_%=: r0 = *(u32*)(r1 + %[__sk_buff_mark]); \ exit; \ @@ -140,13 +152,14 @@ l0_%=: r0 = *(u32*)(r1 + %[__sk_buff_mark]); \ SEC("socket") __description("check deducing bounds from const, 9") __failure __msg("R0 tried to subtract pointer from scalar") -__msg_unpriv("R1 has pointer with unsupported alu operation") +__failure_unpriv __naked void deducing_bounds_from_const_9(void) { asm volatile (" \ r0 = 0; \ if r0 s>= 0 goto l0_%=; \ -l0_%=: r0 -= r1; \ +l0_%=: /* unpriv: nospec (inserted to prevent `R1 has pointer with unsupported alu operation`) */\ + r0 -= r1; \ exit; \ " ::: __clobber_all); } diff --git a/tools/testing/selftests/bpf/progs/verifier_map_ptr.c b/tools/testing/selftests/bpf/progs/verifier_map_ptr.c index 11a079145966..a540f29b7532 100644 --- a/tools/testing/selftests/bpf/progs/verifier_map_ptr.c +++ b/tools/testing/selftests/bpf/progs/verifier_map_ptr.c @@ -110,9 +110,17 @@ __naked void ptr_read_ops_field_accepted(void)
SEC("socket") __description("bpf_map_ptr: r = 0, map_ptr = map_ptr + r") -__success __failure_unpriv -__msg_unpriv("R1 has pointer with unsupported alu operation") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("r1 = 0x") /* r0 = %[map_hash_16b] ll */ +/* This nospec is not really needed here, because there is no bypassable + * branch/store that could lead to r0 not containing 0 when `r1 += r0` executes. + * We still keep it like this to not complicate the verifier. + */ +__xlated_unpriv("nospec") /* inserted to prevent `R1 has pointer with unsupported alu operation` */ +__xlated_unpriv("r1 += r0") +#endif __naked void map_ptr_map_ptr_r(void) { asm volatile (" \ @@ -134,9 +142,13 @@ __naked void map_ptr_map_ptr_r(void)
SEC("socket") __description("bpf_map_ptr: r = 0, r = r + map_ptr") -__success __failure_unpriv -__msg_unpriv("R0 has pointer with unsupported alu operation") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("r0 = 0x") /* r0 = %[map_hash_16b] ll */ +__xlated_unpriv("nospec") /* inserted to prevent `R0 has pointer with unsupported alu operation` */ +__xlated_unpriv("r1 += r0") +#endif __naked void _0_r_r_map_ptr(void) { asm volatile (" \ diff --git a/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c b/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c index fcea9819e359..a35090e3de2b 100644 --- a/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c +++ b/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c @@ -41,9 +41,17 @@ struct {
SEC("socket") __description("map access: known scalar += value_ptr unknown vs const") -__success __failure_unpriv -__msg_unpriv("R1 tried to add from different maps, paths or scalars") +__success __success_unpriv __retval(1) +#ifdef SPEC_V1 +__xlated_unpriv("r1 &= 7") +__xlated_unpriv("goto pc+1") +/* l3_%=: */ +__xlated_unpriv("r1 = 3") +/* l4_%=: */ +__xlated_unpriv("nospec") /* inserted to prevent `R1 tried to add from different maps, paths or scalars` */ +__xlated_unpriv("r1 += r0") +#endif __naked void value_ptr_unknown_vs_const(void) { asm volatile (" \ @@ -79,9 +87,14 @@ l2_%=: r0 = 1; \
SEC("socket") __description("map access: known scalar += value_ptr const vs unknown") -__success __failure_unpriv -__msg_unpriv("R1 tried to add from different maps, paths or scalars") +__success __success_unpriv __retval(1) +#ifdef SPEC_V1 +__xlated_unpriv("r1 &= 7") +/* l4_%=: */ +__xlated_unpriv("nospec") /* inserted to prevent `R1 tried to add from different maps, paths or scalars` */ +__xlated_unpriv("r1 += r0") +#endif __naked void value_ptr_const_vs_unknown(void) { asm volatile (" \ @@ -117,9 +130,16 @@ l2_%=: r0 = 1; \
SEC("socket") __description("map access: known scalar += value_ptr const vs const (ne)") -__success __failure_unpriv -__msg_unpriv("R1 tried to add from different maps, paths or scalars") +__success __success_unpriv __retval(1) +#ifdef SPEC_V1 +__xlated_unpriv("goto pc+1") /* to l4, must not be pc+2 as this would skip nospec */ +/* l3_%=: */ +__xlated_unpriv("r1 = 5") +/* l4_%=: */ +__xlated_unpriv("nospec") /* inserted to prevent `R1 tried to add from different maps, paths or scalars` */ +__xlated_unpriv("r1 += r0") +#endif __naked void ptr_const_vs_const_ne(void) { asm volatile (" \ @@ -225,9 +245,18 @@ l2_%=: r0 = 1; \
SEC("socket") __description("map access: known scalar += value_ptr unknown vs unknown (lt)") -__success __failure_unpriv -__msg_unpriv("R1 tried to add from different maps, paths or scalars") +__success __success_unpriv __retval(1) +#ifdef SPEC_V1 +__xlated_unpriv("r1 &= 3") +__xlated_unpriv("goto pc+3") /* must go to l4 (nospec) */ +__xlated_unpriv("r1 = 6") +__xlated_unpriv("r1 = -r1") +__xlated_unpriv("r1 &= 7") +/* l4_%=: */ +__xlated_unpriv("nospec") /* inserted to prevent `R1 tried to add from different maps, paths or scalars` */ +__xlated_unpriv("r1 += r0") +#endif __naked void ptr_unknown_vs_unknown_lt(void) { asm volatile (" \ @@ -265,9 +294,14 @@ l2_%=: r0 = 1; \
SEC("socket") __description("map access: known scalar += value_ptr unknown vs unknown (gt)") -__success __failure_unpriv -__msg_unpriv("R1 tried to add from different maps, paths or scalars") +__success __success_unpriv __retval(1) +#ifdef SPEC_V1 +__xlated_unpriv("r1 &= 3") +/* l4_%=: */ +__xlated_unpriv("nospec") /* inserted to prevent `R1 tried to add from different maps, paths or scalars` */ +__xlated_unpriv("r1 += r0") +#endif __naked void ptr_unknown_vs_unknown_gt(void) { asm volatile (" \ @@ -398,9 +432,27 @@ l2_%=: r0 = 1; \
SEC("socket") __description("map access: mixing value pointer and scalar, 1") -__success __failure_unpriv -__msg_unpriv("R2 tried to add from different maps, paths or scalars, pointer arithmetic with it prohibited for !root") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("goto pc+2") /* to l2, must not be pc+3 as this would skip nospec */ +__xlated_unpriv("r2 = 0") +__xlated_unpriv("r3 = 1048576") +/* l2_%=: */ +__xlated_unpriv("nospec") /* inserted to prevent `R2 tried to add from different maps, paths or scalars, pointer arithmetic with it prohibited for !root` */ +__xlated_unpriv("r2 += r3") +__xlated_unpriv("if r1 != 0x0 goto pc+1") +__xlated_unpriv("goto pc+5") /* to l4 */ +/* l3_%=: */ +__xlated_unpriv("r0 = 322376503") +__xlated_unpriv("nospec") /* inserted to prevent `R2 pointer comparison prohibited` */ +__xlated_unpriv("if r2 != 0x100000 goto pc+2") /* to l4 */ +__xlated_unpriv("r0 = 0") +__xlated_unpriv("exit") +/* l4_%=: */ +__xlated_unpriv("nospec") +__xlated_unpriv("r0 = *(u8 *)(r0 +0)") +#endif __naked void value_pointer_and_scalar_1(void) { asm volatile (" \ @@ -434,7 +486,6 @@ l2_%=: /* common instruction */ \ l3_%=: /* branch B */ \ r0 = 0x13371337; \ /* verifier follows fall-through */ \ - /* unpriv: nospec (inserted to prevent `R2 pointer comparison prohibited`) */\ if r2 != 0x100000 goto l4_%=; \ r0 = 0; \ exit; \ @@ -452,9 +503,24 @@ l4_%=: /* fake-dead code; targeted from branch A to \
SEC("socket") __description("map access: mixing value pointer and scalar, 2") -__success __failure_unpriv -__msg_unpriv("R2 tried to add from different maps, paths or scalars, pointer arithmetic with it prohibited for !root") +__success __success_unpriv __retval(0) +#ifdef SPEC_V1 +__xlated_unpriv("goto pc+2") +__xlated_unpriv("r2 = r0") +__xlated_unpriv("r3 = 0") +__xlated_unpriv("nospec") +__xlated_unpriv("r2 += r3") +__xlated_unpriv("if r1 != 0x0 goto pc+1") +__xlated_unpriv("goto pc+5") +__xlated_unpriv("r0 = 322376503") +__xlated_unpriv("nospec") +__xlated_unpriv("if r2 != 0x100000 goto pc+2") +__xlated_unpriv("r0 = 0") +__xlated_unpriv("exit") +__xlated_unpriv("nospec") /* inserted to prevent `R0 invalid mem access 'scalar'` */ +__xlated_unpriv("r0 = *(u8 *)(r0 +0)") +#endif __naked void value_pointer_and_scalar_2(void) { asm volatile (" \ @@ -495,7 +561,6 @@ l4_%=: /* fake-dead code; targeted from branch A to \ * prevent dead code sanitization, rejected \ * via branch B however \ */ \ - /* unpriv: nospec (inserted to prevent `R0 invalid mem access 'scalar'`) */\ r0 = *(u8*)(r0 + 0); \ r0 = 0; \ exit; \