Hello, this series follows some discussions started in [1] around bpf trampoline limitations in specific cases. When a trampoline is generated for a target function involving many arguments, it has to properly find and save the arguments that have been passed through the stack. While this is doable with basic types (eg: scalars), it brings more uncertainty when dealing with specific types like structs (many ABIs allow passing structures by value if they fit in a register or a pair of registers). The issue is that the layout and stack location of those structures can be altered (ie with attributes like packed or aligned(x)), and this kind of alteration is not encoded in DWARF or BTF, leaving the trampolines clueless about the needed adjustments. Rather than trying to support this specific case, as agreed in [2], this series aims to properly deny it.
It targets all the architectures currently implementing arch_prepare_bpf_trampoline (except aarch64, which was already handled while adding support for many args):
- x86
- s390
- riscv
- powerpc
A small validation function is added in the JIT compiler of each of those architectures, ensuring that no argument passed on the stack is a struct. If any is found, the trampoline creation is cancelled. Any check on arguments already implemented in a JIT compiler has been moved into this new function. On top of that, the series updates the tracing_struct_many_args test, which now merely checks that this case is indeed denied.
[1] https://lore.kernel.org/bpf/20250411-many_args_arm64-v1-0-0a32fe72339e@bootl...
[2] https://lore.kernel.org/bpf/CAADnVQKr3ftNt1uQVrXBE0a2o37ZYRo2PHqCoHUnw6PE5T2...
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com
---
Alexis Lothoré (eBPF Foundation) (7):
      bpf/x86: use define for max regs count used for arguments
      bpf/x86: prevent trampoline attachment when args location on stack is uncertain
      bpf/riscv: prevent trampoline attachment when args location on stack is uncertain
      bpf/s390: prevent trampoline attachment when args location on stack is uncertain
      bpf/powerpc64: use define for max regs count used for arguments
      bpf/powerpc64: prevent trampoline attachment when args location on stack is uncertain
      selftests/bpf: ensure that functions passing structs on stack can not be hooked
 arch/powerpc/net/bpf_jit_comp.c                    | 38 ++++++++++--
 arch/riscv/net/bpf_jit_comp64.c                    | 26 +++++++-
 arch/s390/net/bpf_jit_comp.c                       | 33 ++++++++--
 arch/x86/net/bpf_jit_comp.c                        | 50 ++++++++++++----
 .../selftests/bpf/prog_tests/tracing_struct.c      | 37 +-----------
 .../selftests/bpf/progs/tracing_struct_many_args.c | 70 ----------------------
 .../testing/selftests/bpf/test_kmods/bpf_testmod.c | 43 ++-----------
 7 files changed, 129 insertions(+), 168 deletions(-)
---
base-commit: c4f4f8da70044d8b28fccf73016b4119f3e2fd50
change-id: 20250609-deny_trampoline_structs_on_stack-5bbc7bc20dd1
Best regards,
x86 allows using up to 6 registers to pass arguments between function calls. This value is hardcoded in multiple places; introduce a define for it.
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com
---
 arch/x86/net/bpf_jit_comp.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 15672cb926fc1817f97d2cd1c55d1575803f6958..9689834de1bb1a90fdc28156e0e2a56ac0ff2076 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -61,6 +61,8 @@ static u8 *emit_code(u8 *ptr, u32 bytes, unsigned int len)
 #define EMIT_ENDBR_POISON()
 #endif
 
+#define MAX_REGS_FOR_ARGS 6
+
 static bool is_imm8(int value)
 {
 	return value <= 127 && value >= -128;
@@ -2710,10 +2712,10 @@ static int get_nr_used_regs(const struct btf_func_model *m)
 
 	for (i = 0; i < min_t(int, m->nr_args, MAX_BPF_FUNC_ARGS); i++) {
 		arg_regs = (m->arg_size[i] + 7) / 8;
-		if (nr_used_regs + arg_regs <= 6)
+		if (nr_used_regs + arg_regs <= MAX_REGS_FOR_ARGS)
 			nr_used_regs += arg_regs;
 
-		if (nr_used_regs >= 6)
+		if (nr_used_regs >= MAX_REGS_FOR_ARGS)
 			break;
 	}
 
@@ -2751,7 +2753,7 @@ static void save_args(const struct btf_func_model *m, u8 **prog,
 	 * the arg1-5,arg7 will be passed by regs, and arg6 will
 	 * by stack.
 	 */
-	if (nr_regs + arg_regs > 6) {
+	if (nr_regs + arg_regs > MAX_REGS_FOR_ARGS) {
 		/* copy function arguments from origin stack frame
 		 * into current stack frame.
 		 *
@@ -2811,7 +2813,7 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog,
 	 */
 	for (i = 0; i < min_t(int, m->nr_args, MAX_BPF_FUNC_ARGS); i++) {
 		arg_regs = (m->arg_size[i] + 7) / 8;
-		if (nr_regs + arg_regs <= 6) {
+		if (nr_regs + arg_regs <= MAX_REGS_FOR_ARGS) {
 			for (j = 0; j < arg_regs; j++) {
 				emit_ldx(prog, BPF_DW,
 					 nr_regs == 5 ? X86_REG_R9 : BPF_REG_1 + nr_regs,
@@ -2824,7 +2826,7 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog,
 		stack_size -= 8 * arg_regs;
 	}
 
-	if (nr_regs >= 6)
+	if (nr_regs >= MAX_REGS_FOR_ARGS)
 		break;
 	}
 }
@@ -3149,7 +3151,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
 	stack_size += (sizeof(struct bpf_tramp_run_ctx) + 7) & ~0x7;
 	run_ctx_off = stack_size;
 
-	if (nr_regs > 6 && (flags & BPF_TRAMP_F_CALL_ORIG)) {
+	if (nr_regs > MAX_REGS_FOR_ARGS && (flags & BPF_TRAMP_F_CALL_ORIG)) {
 		/* the space that used to pass arguments on-stack */
 		stack_size += (nr_regs - get_nr_used_regs(m)) * 8;
 		/* make sure the stack pointer is 16-byte aligned if we
When the target function receives more arguments than there are available registers, the additional arguments are passed on the stack, so the generated trampoline needs to read those to prepare the bpf context, and also to prepare the target function's stack when it is in charge of calling it. This works well for scalar types, but if the value is a struct, we can not know for sure its exact location on the stack, as the struct may have been packed or manually aligned to a greater value.
Prevent wrong readings by refusing trampoline attachment if the target function receives a struct on the stack. While at it, move the existing max bpf args check into the new function.
Fixes: 473e3150e30a ("bpf, x86: allow function arguments up to 12 for TRACING")
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com
---
 arch/x86/net/bpf_jit_comp.c | 36 ++++++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 9689834de1bb1a90fdc28156e0e2a56ac0ff2076..120e05a978679c046631cc94d942800c3051ad0a 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -3001,6 +3001,29 @@ static int invoke_bpf_mod_ret(const struct btf_func_model *m, u8 **pprog,
 	return 0;
 }
 
+static int validate_args(const struct btf_func_model *m)
+{
+	int i, arg_regs = 0, nr_regs = 0;
+
+	for (i = 0; i < min_t(int, m->nr_args, MAX_BPF_FUNC_ARGS); i++) {
+		arg_regs = (m->arg_size[i] + 7) / 8;
+
+		if (nr_regs + arg_regs > MAX_REGS_FOR_ARGS &&
+		    m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
+			return -ENOTSUPP;
+		nr_regs += arg_regs;
+	}
+
+	/* x86-64 supports up to MAX_BPF_FUNC_ARGS arguments. 1-6
+	 * are passed through regs, the remains are through stack.
+	 */
+	if (nr_regs > MAX_BPF_FUNC_ARGS)
+		return -ENOTSUPP;
+
+
+	return 0;
+}
+
 /* mov rax, qword ptr [rbp - rounded_stack_depth - 8] */
 #define LOAD_TRAMP_TAIL_CALL_CNT_PTR(stack) \
 	__LOAD_TCC_PTR(-round_up(stack, 8) - 8)
@@ -3089,18 +3112,19 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
 	WARN_ON_ONCE((flags & BPF_TRAMP_F_INDIRECT) &&
 		     (flags & ~(BPF_TRAMP_F_INDIRECT | BPF_TRAMP_F_RET_FENTRY_RET)));
 
+	/* make sure that any argument can be located and processed by the
+	 * trampoline
+	 */
+	ret = validate_args(m);
+	if (ret)
+		return ret;
+
 	/* extra registers for struct arguments */
 	for (i = 0; i < m->nr_args; i++) {
 		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
 			nr_regs += (m->arg_size[i] + 7) / 8 - 1;
 	}
 
-	/* x86-64 supports up to MAX_BPF_FUNC_ARGS arguments. 1-6
-	 * are passed through regs, the remains are through stack.
-	 */
-	if (nr_regs > MAX_BPF_FUNC_ARGS)
-		return -ENOTSUPP;
-
 	/* Generated trampoline stack layout:
 	 *
 	 * RBP + 8         [ return address  ]
On Fri, Jun 13, 2025 at 09:37:11AM +0200, Alexis Lothoré (eBPF Foundation) wrote:
When the target function receives more arguments than available registers, the additional arguments are passed on stack, and so the generated trampoline needs to read those to prepare the bpf context, but also to prepare the target function stack when it is in charge of calling it. This works well for scalar types, but if the value is a struct, we can not know for sure the exact struct location, as it may have been packed or manually aligned to a greater value.
https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf
Has fairly clear rules on how arguments are encoded. Broadly speaking for the kernel, if the structure exceeds 2 registers in size, it is passed as a reference, otherwise it is passed as two registers.
Hi Peter,
On Fri Jun 13, 2025 at 10:11 AM CEST, Peter Zijlstra wrote:
On Fri, Jun 13, 2025 at 09:37:11AM +0200, Alexis Lothoré (eBPF Foundation) wrote:
When the target function receives more arguments than available registers, the additional arguments are passed on stack, and so the generated trampoline needs to read those to prepare the bpf context, but also to prepare the target function stack when it is in charge of calling it. This works well for scalar types, but if the value is a struct, we can not know for sure the exact struct location, as it may have been packed or manually aligned to a greater value.
https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf
Has fairly clear rules on how arguments are encoded. Broadly speaking for the kernel, if the structure exceeds 2 registers in size, it is passed as a reference, otherwise it is passed as two registers.
Maybe my commit wording is not precise enough, but indeed, there's no doubt about whether the struct value is passed on the stack or through a register/a pair of registers. The doubt is rather about the struct location when it is passed _by value_ and _on the stack_: the ABI indeed clearly states that "Structures and unions assume the alignment of their most strictly aligned component" (p.13), but this rule is "silently broken" when a struct has an __attribute__((packed)) or an __attribute__((aligned(X))), and AFAICT this case can not be detected at runtime with current BTF info.
On Fri, Jun 13, 2025 at 10:26:37AM +0200, Alexis Lothoré wrote:
Hi Peter,
On Fri Jun 13, 2025 at 10:11 AM CEST, Peter Zijlstra wrote:
On Fri, Jun 13, 2025 at 09:37:11AM +0200, Alexis Lothoré (eBPF Foundation) wrote:
When the target function receives more arguments than available registers, the additional arguments are passed on stack, and so the generated trampoline needs to read those to prepare the bpf context, but also to prepare the target function stack when it is in charge of calling it. This works well for scalar types, but if the value is a struct, we can not know for sure the exact struct location, as it may have been packed or manually aligned to a greater value.
https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf
Has fairly clear rules on how arguments are encoded. Broadly speaking for the kernel, if the structure exceeds 2 registers in size, it is passed as a reference, otherwise it is passed as two registers.
Maybe my commit wording is not precise enough, but indeed, there's not doubt about whether the struct value is passed on the stack or through a register/a pair of registers. The doubt is rather about the struct location when it is passed _by value_ and _on the stack_: the ABI indeed clearly states that "Structures and unions assume the alignment of their most strictly aligned component" (p.13), but this rule is "silently broken" when a struct has an __attribute__((packed)) or and __attribute__((aligned(X))), and AFAICT this case can not be detected at runtime with current BTF info.
Ah, okay. So it is a failure of BTF. That was indeed not clear.
On Fri Jun 13, 2025 at 10:32 AM CEST, Peter Zijlstra wrote:
On Fri, Jun 13, 2025 at 10:26:37AM +0200, Alexis Lothoré wrote:
Hi Peter,
On Fri Jun 13, 2025 at 10:11 AM CEST, Peter Zijlstra wrote:
On Fri, Jun 13, 2025 at 09:37:11AM +0200, Alexis Lothoré (eBPF Foundation) wrote:
[...]
Maybe my commit wording is not precise enough, but indeed, there's not doubt about whether the struct value is passed on the stack or through a register/a pair of registers. The doubt is rather about the struct location when it is passed _by value_ and _on the stack_: the ABI indeed clearly states that "Structures and unions assume the alignment of their most strictly aligned component" (p.13), but this rule is "silently broken" when a struct has an __attribute__((packed)) or and __attribute__((aligned(X))), and AFAICT this case can not be detected at runtime with current BTF info.
Ah, okay. So it is a failure of BTF. That was indeed not clear.
If I need to respin, I'll rewrite the commit message to include the details above.
Alexis
On Fri, Jun 13, 2025 at 1:59 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
On Fri Jun 13, 2025 at 10:32 AM CEST, Peter Zijlstra wrote:
On Fri, Jun 13, 2025 at 10:26:37AM +0200, Alexis Lothoré wrote:
Hi Peter,
On Fri Jun 13, 2025 at 10:11 AM CEST, Peter Zijlstra wrote:
On Fri, Jun 13, 2025 at 09:37:11AM +0200, Alexis Lothoré (eBPF Foundation) wrote:
[...]
Maybe my commit wording is not precise enough, but indeed, there's not doubt about whether the struct value is passed on the stack or through a register/a pair of registers. The doubt is rather about the struct location when it is passed _by value_ and _on the stack_: the ABI indeed clearly states that "Structures and unions assume the alignment of their most strictly aligned component" (p.13), but this rule is "silently broken" when a struct has an __attribute__((packed)) or and __attribute__((aligned(X))), and AFAICT this case can not be detected at runtime with current BTF info.
Ah, okay. So it is a failure of BTF. That was indeed not clear.
If I need to respin, I'll rewrite the commit message to include the details above.
No need to respin. The cover letter is quite detailed already.
But looking at the patch and this thread I think we need to agree on the long term approach to BTF, since people assume that it's a more compact dwarf and any missing information should be added to it.

Like in this case, special alignment and packed attributes are not expressed in BTF and I believe they should not be. BTF is not a debug format and not a substitute for dwarf. There is no goal to express everything possible in C. It's minimal, because BTF is a _practical_ description of types and data present in the kernel. I don't think the special case of packing and alignment exists in the kernel today, so the current format is sufficient. It doesn't miss anything.

I think we made the arm64 JIT unnecessarily restrictive and are now considering making all other JITs restrictive too for a hypothetical case of some future kernel functions. I feel we're going in the wrong direction.

Instead we should teach pahole to sanitize BTF where functions are using this fancy alignment and packed structs. pahole can see it in dwarf and can skip emitting BTF for such functions. Then the kernel JITs on all architectures won't even see such cases.
The issue was initially discovered by a selftest:
https://lore.kernel.org/bpf/20250411-many_args_arm64-v1-3-0a32fe72339e@bootl...
that attempted to support these two types:

+struct bpf_testmod_struct_arg_4 {
+	__u64 a;
+	__u64 b;
+};
+
+struct bpf_testmod_struct_arg_5 {
+	__int128 a;
+};
The former is present in the kernel. It's more or less sockptr_t, and people want to access it for observability in tracing. The latter doesn't exist in the kernel and we cannot represent it properly in BTF without losing alignment.
So I think we should go back to that series: https://lore.kernel.org/bpf/20250411-many_args_arm64-v1-0-0a32fe72339e@bootl...
remove __int128 selftest, but also teach pahole to recognize types that cannot be represented in BTF and don't emit them either into vmlinux or in kernel module (like in this case it was bpf_testmod.ko) I think that would be a better path forward aligned with the long term goal of BTF.
And before people ask... pahole is a trusted component of the build system. We trust it just as we trust gcc, clang, linker, objtool.
On Sat Jun 14, 2025 at 12:35 AM CEST, Alexei Starovoitov wrote:
On Fri, Jun 13, 2025 at 1:59 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
On Fri Jun 13, 2025 at 10:32 AM CEST, Peter Zijlstra wrote:
On Fri, Jun 13, 2025 at 10:26:37AM +0200, Alexis Lothoré wrote:
[...]
If I need to respin, I'll rewrite the commit message to include the details above.
No need to respin. The cover letter is quite detailed already.
But looking at the patch and this thread I think we need to agree on the long term approach to BTF, since people assume that it's a more compact dwarf and any missing information should be added to it. Like in this case special alignment case and packed attributes are not expressed in BTF and I believe they should not be. BTF is not a debug format and not a substitute for dwarf. There is no goal to express everything possible in C. It's minimal, because BTF is _practical_ description of types and data present in the kernel. I don't think the special case of packing and alignment exists in the kernel today, so the current format is sufficient. It doesn't miss anything. I think we made arm64 JIT unnecessary restrictive and now considering to make all other JITs restrictive too for hypothetical case of some future kernel functions. I feel we're going in the wrong direction. Instead we should teach pahole to sanitize BTF where functions are using this fancy alignment and packed structs. pahole can see it in dwarf and can skip emitting BTF for such functions. Then the kernel JITs on all architectures won't even see such cases.
The issue was initially discovered by a selftest:
https://lore.kernel.org/bpf/20250411-many_args_arm64-v1-3-0a32fe72339e@bootl...
that attempted to support these two types:

+struct bpf_testmod_struct_arg_4 {
+	__u64 a;
+	__u64 b;
+};
+
+struct bpf_testmod_struct_arg_5 {
+	__int128 a;
+};
The former is present in the kernel. It's more or less sockptr_t, and people want to access it for observability in tracing. The latter doesn't exist in the kernel and we cannot represent it properly in BTF without losing alignment.
So I think we should go back to that series: https://lore.kernel.org/bpf/20250411-many_args_arm64-v1-0-0a32fe72339e@bootl...
remove __int128 selftest, but also teach pahole to recognize types that cannot be represented in BTF and don't emit them either into vmlinux or in kernel module (like in this case it was bpf_testmod.ko) I think that would be a better path forward aligned with the long term goal of BTF.
And before people ask... pahole is a trusted component of the build system. We trust it just as we trust gcc, clang, linker, objtool.
So if I understand correctly your point, it would be better to just move those constraints out of the JIT compilers, and just not represent those special cases in BTF, so that it becomes impossible to hook programs on those functions, since they are not even present in BTF info. And so:
- cancel this series
- revert the small ARM64 check about structs passed on stack
- update pahole to make sure that it does not encode info about this specific kind of function.
I still expect some challenges with this. AFAIU pahole uses DWARF to generate BTF, and discussions in [1] highlighted the fact that the attributes altering struct alignment are not reliably encoded in DWARF. Maybe pahole can "guess" whether a struct has been altered, by doing something like btf_is_struct_packed in libbpf? As Andrii mentioned in [2], it may not be able to cover all cases, but that could be a start. If that's indeed the desired direction, I can take a further look at this.
+ CC dwarves ML
Alexis
[1] https://lore.kernel.org/bpf/9a2ba0ad-b34d-42f8-89a6-d9a44f007bdc@linux.dev/
[2] https://lore.kernel.org/bpf/CAEf4BzZHMYyGDZ4c4eNXG7Fm=ecxCCbKhKbQTbCjvWmKtdw...
On Sun, Jun 15, 2025 at 7:00 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
On Sat Jun 14, 2025 at 12:35 AM CEST, Alexei Starovoitov wrote:
On Fri, Jun 13, 2025 at 1:59 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
On Fri Jun 13, 2025 at 10:32 AM CEST, Peter Zijlstra wrote:
On Fri, Jun 13, 2025 at 10:26:37AM +0200, Alexis Lothoré wrote:
[...]
If I need to respin, I'll rewrite the commit message to include the details above.
No need to respin. The cover letter is quite detailed already.
But looking at the patch and this thread I think we need to agree on the long term approach to BTF, since people assume that it's a more compact dwarf and any missing information should be added to it. Like in this case special alignment case and packed attributes are not expressed in BTF and I believe they should not be. BTF is not a debug format and not a substitute for dwarf. There is no goal to express everything possible in C. It's minimal, because BTF is _practical_ description of types and data present in the kernel. I don't think the special case of packing and alignment exists in the kernel today, so the current format is sufficient. It doesn't miss anything. I think we made arm64 JIT unnecessary restrictive and now considering to make all other JITs restrictive too for hypothetical case of some future kernel functions. I feel we're going in the wrong direction. Instead we should teach pahole to sanitize BTF where functions are using this fancy alignment and packed structs. pahole can see it in dwarf and can skip emitting BTF for such functions. Then the kernel JITs on all architectures won't even see such cases.
The issue was initially discovered by a selftest:
https://lore.kernel.org/bpf/20250411-many_args_arm64-v1-3-0a32fe72339e@bootl...
that attempted to support these two types:

+struct bpf_testmod_struct_arg_4 {
+	__u64 a;
+	__u64 b;
+};
+
+struct bpf_testmod_struct_arg_5 {
+	__int128 a;
+};
The former is present in the kernel. It's more or less sockptr_t, and people want to access it for observability in tracing. The latter doesn't exist in the kernel and we cannot represent it properly in BTF without losing alignment.
So I think we should go back to that series: https://lore.kernel.org/bpf/20250411-many_args_arm64-v1-0-0a32fe72339e@bootl...
remove __int128 selftest, but also teach pahole to recognize types that cannot be represented in BTF and don't emit them either into vmlinux or in kernel module (like in this case it was bpf_testmod.ko) I think that would be a better path forward aligned with the long term goal of BTF.
And before people ask... pahole is a trusted component of the build system. We trust it just as we trust gcc, clang, linker, objtool.
So if I understand correctly your point, it would be better to just move out those constraints from the JIT compilers, and just not represent those special cases in BTF, so that it becomes impossible to hook programs on those functions, since they are not event present in BTF info. And so:
- cancel this series
- revert the small ARM64 check about struct passed on stack
- update pahole to make sure that it does not encode info about this specific kind of functions.
yes
I still expect some challenges with this. AFAIU pahole uses DWARF to generate BTF, and discussions in [1] highlighted the fact that the attributes altering the structs alignment are not reliably encoded in DWARF. Maybe pahole can "guess" if a struct has been altered, by doing something like btf_is_struct_packed in libbpf ? As Andrii mentioned in [2], it may not be able to cover all cases, but that could be a start. If that's indeed the desired direction, I can take a further look at this.
so be it. If syzbot was taught to fuzz dwarf I'm sure it would have exposed hundreds of bugs in the format itself and compilers, but since such convoluted constructs are not present in the kernel source code it's not a concern.
When the target function receives more arguments than there are available registers, the additional arguments are passed on the stack, so the generated trampoline needs to read those to prepare the bpf context, and also to prepare the target function's stack when it is in charge of calling it. This works well for scalar types, but if the value is a struct, we can not know for sure its exact location on the stack, as the struct may have been packed or manually aligned to a greater value.
Prevent wrong readings by refusing trampoline attachment if the target function receives a struct on the stack. While at it, move the existing max bpf args check into the new function.
Fixes: 6801b0aef79d ("riscv, bpf: Add 12-argument support for RV64 bpf trampoline")
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com
---
 arch/riscv/net/bpf_jit_comp64.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index 10e01ff06312d9f1e6e213bb069c6ea749ea9af2..ea3a1c3af6bc129057c16a4070c33dbf00e6c611 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -1005,6 +1005,24 @@ static int invoke_bpf_prog(struct bpf_tramp_link *l, int args_off, int retval_of
 	return ret;
 }
 
+static int validate_args(const struct btf_func_model *m)
+{
+	int i, nr_arg_slots, nr_regs = 0;
+
+	if (m->nr_args > MAX_BPF_FUNC_ARGS)
+		return -ENOTSUPP;
+
+	for (i = 0; i < m->nr_args; i++) {
+		nr_arg_slots = round_up(m->arg_size[i], 8) / 8;
+		if (nr_regs + nr_arg_slots > RV_MAX_REG_ARGS &&
+		    m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
+			return -ENOTSUPP;
+		nr_regs += nr_arg_slots;
+	}
+
+	return 0;
+}
+
 static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
					 const struct btf_func_model *m,
					 struct bpf_tramp_links *tlinks,
@@ -1069,8 +1087,12 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
 	if (flags & (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY))
 		return -ENOTSUPP;
 
-	if (m->nr_args > MAX_BPF_FUNC_ARGS)
-		return -ENOTSUPP;
+	/* make sure that any argument can be located and processed by the
+	 * trampoline
+	 */
+	ret = validate_args(m);
+	if (ret)
+		return ret;
 
 	for (i = 0; i < m->nr_args; i++)
 		nr_arg_slots += round_up(m->arg_size[i], 8) / 8;
When the target function receives more arguments than there are available registers, the additional arguments are passed on the stack, so the generated trampoline needs to read those to prepare the bpf context, and also to prepare the target function's stack when it is in charge of calling it. This works well for scalar types, but if the value is a struct, we can not know for sure its exact location on the stack, as the struct may have been packed or manually aligned to a greater value.
Prevent wrong readings by refusing trampoline attachment if the target function receives a struct on the stack. While doing so, move the existing check (ensuring that the number of args passed on the stack is not higher than MAX_NR_STACK_ARGS) into the newly created check function.
Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()")
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com
---
 arch/s390/net/bpf_jit_comp.c | 33 +++++++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index c7f8313ba449716a8f18eafdeb6c77ed3b23f52e..b441feb20e993f54cc0e9a39c67a726f4b61d9f2 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -2566,6 +2566,27 @@ static int alloc_stack(struct bpf_tramp_jit *tjit, size_t size)
 /* -mfentry generates a 6-byte nop on s390x. */
 #define S390X_PATCH_SIZE 6
 
+static int validate_args(const struct btf_func_model *m)
+{
+	int i = 0, nr_reg_args, nr_stack_args;
+
+	nr_reg_args = min_t(int, m->nr_args, MAX_NR_REG_ARGS);
+	nr_stack_args = m->nr_args - nr_reg_args;
+
+	if (nr_stack_args == 0)
+		return 0;
+
+	/* Support as many stack arguments as "mvc" instruction can handle. */
+	if (nr_stack_args > MAX_NR_STACK_ARGS)
+		return -ENOTSUPP;
+
+	for (i = nr_reg_args; i < m->nr_args; i++)
+		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
+			return -ENOTSUPP;
+
+	return 0;
+}
+
 static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
					 struct bpf_tramp_jit *tjit,
					 const struct btf_func_model *m,
@@ -2579,13 +2600,17 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
 	int nr_bpf_args, nr_reg_args, nr_stack_args;
 	struct bpf_jit *jit = &tjit->common;
 	int arg, bpf_arg_off;
-	int i, j;
+	int i, j, ret;
+
+	/* make sure that any argument can be located and processed by the
+	 * trampoline
+	 */
+	ret = validate_args(m);
+	if (ret)
+		return ret;
 
-	/* Support as many stack arguments as "mvc" instruction can handle. */
 	nr_reg_args = min_t(int, m->nr_args, MAX_NR_REG_ARGS);
 	nr_stack_args = m->nr_args - nr_reg_args;
-	if (nr_stack_args > MAX_NR_STACK_ARGS)
-		return -ENOTSUPP;
 
 	/* Return to %r14 in the struct_ops case. */
 	if (flags & BPF_TRAMP_F_INDIRECT)
powerpc allows using up to 8 registers to pass arguments between function calls. This value is hardcoded in multiple places; introduce a define for it.
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com
---
 arch/powerpc/net/bpf_jit_comp.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index c0684733e9d6ac79b4cf653bf1b9ad40eb3e1aca..d313920a42c2310c6b5deab6d82e13af49c8ecb1 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -22,6 +22,8 @@
 
 #include "bpf_jit.h"
 
+#define MAX_REGS_FOR_ARGS 8
+
 /* These offsets are from bpf prog end and stay the same across progs */
 static int bpf_jit_ool_stub, bpf_jit_long_branch_stub;
 
@@ -613,7 +615,7 @@ static void bpf_trampoline_save_args(u32 *image, struct codegen_context *ctx, in
 	param_save_area_offset += STACK_FRAME_MIN_SIZE; /* param save area is past frame header */
 
 	for (int i = 0; i < nr_regs; i++) {
-		if (i < 8) {
+		if (i < MAX_REGS_FOR_ARGS) {
 			EMIT(PPC_RAW_STL(_R3 + i, _R1, regs_off + i * SZL));
 		} else {
 			EMIT(PPC_RAW_LL(_R3, _R1, param_save_area_offset + i * SZL));
@@ -626,7 +628,7 @@ static void bpf_trampoline_save_args(u32 *image, struct codegen_context *ctx, in
 static void bpf_trampoline_restore_args_regs(u32 *image, struct codegen_context *ctx,
					     int nr_regs, int regs_off)
 {
-	for (int i = 0; i < nr_regs && i < 8; i++)
+	for (int i = 0; i < nr_regs && i < MAX_REGS_FOR_ARGS; i++)
 		EMIT(PPC_RAW_LL(_R3 + i, _R1, regs_off + i * SZL));
 }
 
@@ -725,7 +727,9 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
 	 *
 	 * Reserve space for at least 8 registers for now. This can be optimized later.
 	 */
-	bpf_frame_size += (nr_regs > 8 ? nr_regs : 8) * SZL;
+	bpf_frame_size +=
+		(nr_regs > MAX_REGS_FOR_ARGS ? nr_regs : MAX_REGS_FOR_ARGS) *
+		SZL;
 
 	/* Room for struct bpf_tramp_run_ctx */
 	run_ctx_off = bpf_frame_size;
When the target function receives more arguments than there are available registers, the additional arguments are passed on the stack, so the generated trampoline needs to read those to prepare the bpf context, and also to prepare the target function's stack when it is in charge of calling it. This works well for scalar types, but if the value is a struct, we can not know for sure its exact location on the stack, as the struct may have been packed or manually aligned to a greater value.
Prevent wrong readings by refusing trampoline attachment if the target function receives a struct on the stack. While at it, move the existing max bpf args check into the new function.
Fixes: d243b62b7bd3 ("powerpc64/bpf: Add support for bpf trampolines")
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com
---
 arch/powerpc/net/bpf_jit_comp.c | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c index d313920a42c2310c6b5deab6d82e13af49c8ecb1..97f5209a25adb4865e3cc342292c8f15b1985156 100644 --- a/arch/powerpc/net/bpf_jit_comp.c +++ b/arch/powerpc/net/bpf_jit_comp.c @@ -648,6 +648,24 @@ static void bpf_trampoline_restore_args_stack(u32 *image, struct codegen_context bpf_trampoline_restore_args_regs(image, ctx, nr_regs, regs_off); }
+static int validate_args(const struct btf_func_model *m)
+{
+	int nr_regs = m->nr_args, i;
+
+	for (i = 0; i < m->nr_args; i++) {
+		if (m->arg_size[i] > SZL)
+			nr_regs += round_up(m->arg_size[i], SZL) / SZL - 1;
+		if (i >= MAX_REGS_FOR_ARGS &&
+		    m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
+			return -ENOTSUPP;
+	}
+
+	if (nr_regs > MAX_BPF_FUNC_ARGS)
+		return -ENOTSUPP;
+
+	return 0;
+}
+
 static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_image,
 					 void *rw_image_end, void *ro_image,
 					 const struct btf_func_model *m, u32 flags,
@@ -668,15 +686,19 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im
 	if (IS_ENABLED(CONFIG_PPC32))
 		return -EOPNOTSUPP;
+	/* make sure that any argument can be located and processed by the
+	 * trampoline
+	 */
+	ret = validate_args(m);
+	if (ret)
+		return ret;
+
 	nr_regs = m->nr_args;
 	/* Extra registers for struct arguments */
 	for (i = 0; i < m->nr_args; i++)
 		if (m->arg_size[i] > SZL)
 			nr_regs += round_up(m->arg_size[i], SZL) / SZL - 1;
-	if (nr_regs > MAX_BPF_FUNC_ARGS)
-		return -EOPNOTSUPP;
-
 	ctx = &codegen_ctx;
 	memset(ctx, 0, sizeof(*ctx));
When attaching ebpf programs to functions through fentry/fexit, the generated trampolines cannot reliably determine the exact location of arguments passed on the stack when those arguments are structures: the structures' layout can be altered with attributes such as packed or aligned(x), but this information is not encoded in BTF.
Update the tracing_struct_many_args test to check that programs can not be attached to such functions. Not all architectures use the same number of registers to pass arguments, so define a test function with enough arguments (more than 8) to make all currently supported architectures start passing arguments on the stack.
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
---
 .../selftests/bpf/prog_tests/tracing_struct.c      | 37 +-----------
 .../selftests/bpf/progs/tracing_struct_many_args.c | 70 ----------------------
 .../testing/selftests/bpf/test_kmods/bpf_testmod.c | 43 ++-----------
 3 files changed, 6 insertions(+), 144 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/tracing_struct.c b/tools/testing/selftests/bpf/prog_tests/tracing_struct.c
index 19e68d4b353278bf8e2917e62f62c89d14d7fe80..a084f6e5eca4e97b463950feba2142a628e9ec72 100644
--- a/tools/testing/selftests/bpf/prog_tests/tracing_struct.c
+++ b/tools/testing/selftests/bpf/prog_tests/tracing_struct.c
@@ -70,44 +70,9 @@ static void test_struct_many_args(void)
 		return;
 	err = tracing_struct_many_args__attach(skel);
-	if (!ASSERT_OK(err, "tracing_struct_many_args__attach"))
+	if (!ASSERT_EQ(err, -ENOTSUPP, "tracing_struct_many_args__attach"))
 		goto destroy_skel;
-	ASSERT_OK(trigger_module_test_read(256), "trigger_read");
-
-	ASSERT_EQ(skel->bss->t7_a, 16, "t7:a");
-	ASSERT_EQ(skel->bss->t7_b, 17, "t7:b");
-	ASSERT_EQ(skel->bss->t7_c, 18, "t7:c");
-	ASSERT_EQ(skel->bss->t7_d, 19, "t7:d");
-	ASSERT_EQ(skel->bss->t7_e, 20, "t7:e");
-	ASSERT_EQ(skel->bss->t7_f_a, 21, "t7:f.a");
-	ASSERT_EQ(skel->bss->t7_f_b, 22, "t7:f.b");
-	ASSERT_EQ(skel->bss->t7_ret, 133, "t7 ret");
-
-	ASSERT_EQ(skel->bss->t8_a, 16, "t8:a");
-	ASSERT_EQ(skel->bss->t8_b, 17, "t8:b");
-	ASSERT_EQ(skel->bss->t8_c, 18, "t8:c");
-	ASSERT_EQ(skel->bss->t8_d, 19, "t8:d");
-	ASSERT_EQ(skel->bss->t8_e, 20, "t8:e");
-	ASSERT_EQ(skel->bss->t8_f_a, 21, "t8:f.a");
-	ASSERT_EQ(skel->bss->t8_f_b, 22, "t8:f.b");
-	ASSERT_EQ(skel->bss->t8_g, 23, "t8:g");
-	ASSERT_EQ(skel->bss->t8_ret, 156, "t8 ret");
-
-	ASSERT_EQ(skel->bss->t9_a, 16, "t9:a");
-	ASSERT_EQ(skel->bss->t9_b, 17, "t9:b");
-	ASSERT_EQ(skel->bss->t9_c, 18, "t9:c");
-	ASSERT_EQ(skel->bss->t9_d, 19, "t9:d");
-	ASSERT_EQ(skel->bss->t9_e, 20, "t9:e");
-	ASSERT_EQ(skel->bss->t9_f, 21, "t9:f");
-	ASSERT_EQ(skel->bss->t9_g, 22, "t9:f");
-	ASSERT_EQ(skel->bss->t9_h_a, 23, "t9:h.a");
-	ASSERT_EQ(skel->bss->t9_h_b, 24, "t9:h.b");
-	ASSERT_EQ(skel->bss->t9_h_c, 25, "t9:h.c");
-	ASSERT_EQ(skel->bss->t9_h_d, 26, "t9:h.d");
-	ASSERT_EQ(skel->bss->t9_i, 27, "t9:i");
-	ASSERT_EQ(skel->bss->t9_ret, 258, "t9 ret");
-
 destroy_skel:
 	tracing_struct_many_args__destroy(skel);
 }
diff --git a/tools/testing/selftests/bpf/progs/tracing_struct_many_args.c b/tools/testing/selftests/bpf/progs/tracing_struct_many_args.c
index 4742012ace06af949d7f15a21131aaef7ab006e4..1cbedcdc1c42e1fe2f118fdbd1a4ab7fe48b52fb 100644
--- a/tools/testing/selftests/bpf/progs/tracing_struct_many_args.c
+++ b/tools/testing/selftests/bpf/progs/tracing_struct_many_args.c
@@ -8,28 +8,11 @@ struct bpf_testmod_struct_arg_4 {
 	int b;
 };
-struct bpf_testmod_struct_arg_5 {
-	char a;
-	short b;
-	int c;
-	long d;
-};
-
-long t7_a, t7_b, t7_c, t7_d, t7_e, t7_f_a, t7_f_b, t7_ret;
-long t8_a, t8_b, t8_c, t8_d, t8_e, t8_f_a, t8_f_b, t8_g, t8_ret;
-long t9_a, t9_b, t9_c, t9_d, t9_e, t9_f, t9_g, t9_h_a, t9_h_b, t9_h_c, t9_h_d, t9_i, t9_ret;
 SEC("fentry/bpf_testmod_test_struct_arg_7")
 int BPF_PROG2(test_struct_many_args_1, __u64, a, void *, b, short, c, int, d,
 	      void *, e, struct bpf_testmod_struct_arg_4, f)
 {
-	t7_a = a;
-	t7_b = (long)b;
-	t7_c = c;
-	t7_d = d;
-	t7_e = (long)e;
-	t7_f_a = f.a;
-	t7_f_b = f.b;
 	return 0;
 }
@@ -37,59 +20,6 @@ SEC("fexit/bpf_testmod_test_struct_arg_7")
 int BPF_PROG2(test_struct_many_args_2, __u64, a, void *, b, short, c, int, d,
 	      void *, e, struct bpf_testmod_struct_arg_4, f, int, ret)
 {
-	t7_ret = ret;
-	return 0;
-}
-
-SEC("fentry/bpf_testmod_test_struct_arg_8")
-int BPF_PROG2(test_struct_many_args_3, __u64, a, void *, b, short, c, int, d,
-	      void *, e, struct bpf_testmod_struct_arg_4, f, int, g)
-{
-	t8_a = a;
-	t8_b = (long)b;
-	t8_c = c;
-	t8_d = d;
-	t8_e = (long)e;
-	t8_f_a = f.a;
-	t8_f_b = f.b;
-	t8_g = g;
-	return 0;
-}
-
-SEC("fexit/bpf_testmod_test_struct_arg_8")
-int BPF_PROG2(test_struct_many_args_4, __u64, a, void *, b, short, c, int, d,
-	      void *, e, struct bpf_testmod_struct_arg_4, f, int, g,
-	      int, ret)
-{
-	t8_ret = ret;
 	return 0;
 }
-
-SEC("fentry/bpf_testmod_test_struct_arg_9")
-int BPF_PROG2(test_struct_many_args_5, __u64, a, void *, b, short, c, int, d, void *, e,
-	      char, f, short, g, struct bpf_testmod_struct_arg_5, h, long, i)
-{
-	t9_a = a;
-	t9_b = (long)b;
-	t9_c = c;
-	t9_d = d;
-	t9_e = (long)e;
-	t9_f = f;
-	t9_g = g;
-	t9_h_a = h.a;
-	t9_h_b = h.b;
-	t9_h_c = h.c;
-	t9_h_d = h.d;
-	t9_i = i;
-	return 0;
-}
-
-SEC("fexit/bpf_testmod_test_struct_arg_9")
-int BPF_PROG2(test_struct_many_args_6, __u64, a, void *, b, short, c, int, d, void *, e,
-	      char, f, short, g, struct bpf_testmod_struct_arg_5, h, long, i, int, ret)
-{
-	t9_ret = ret;
-	return 0;
-}
-
 char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
index e9e918cdf31ff2b15bf41302ad429e8683b834d6..ff6a4a0fb73679c6c4831ae0662bce2080e53c23 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
@@ -55,13 +55,6 @@ struct bpf_testmod_struct_arg_4 {
 	int b;
 };
-struct bpf_testmod_struct_arg_5 {
-	char a;
-	short b;
-	int c;
-	long d;
-};
-
 __bpf_hook_start();
 noinline int
@@ -101,30 +94,10 @@ bpf_testmod_test_struct_arg_6(struct bpf_testmod_struct_arg_3 *a) {
 	return bpf_testmod_test_struct_arg_result;
 }
-noinline int
-bpf_testmod_test_struct_arg_7(u64 a, void *b, short c, int d, void *e,
-			      struct bpf_testmod_struct_arg_4 f)
-{
-	bpf_testmod_test_struct_arg_result = a + (long)b + c + d +
-			(long)e + f.a + f.b;
-	return bpf_testmod_test_struct_arg_result;
-}
-
-noinline int
-bpf_testmod_test_struct_arg_8(u64 a, void *b, short c, int d, void *e,
-			      struct bpf_testmod_struct_arg_4 f, int g)
+noinline int bpf_testmod_test_struct_arg_7(u64 a, void *b, short c, int d,
+					   void *e, u64 f, u64 g, u64 h,
+					   struct bpf_testmod_struct_arg_4 i)
 {
-	bpf_testmod_test_struct_arg_result = a + (long)b + c + d +
-			(long)e + f.a + f.b + g;
-	return bpf_testmod_test_struct_arg_result;
-}
-
-noinline int
-bpf_testmod_test_struct_arg_9(u64 a, void *b, short c, int d, void *e, char f,
-			      short g, struct bpf_testmod_struct_arg_5 h, long i)
-{
-	bpf_testmod_test_struct_arg_result = a + (long)b + c + d + (long)e +
-			f + g + h.a + h.b + h.c + h.d + i;
 	return bpf_testmod_test_struct_arg_result;
 }
@@ -397,7 +370,6 @@ bpf_testmod_test_read(struct file *file, struct kobject *kobj,
 	struct bpf_testmod_struct_arg_2 struct_arg2 = {2, 3};
 	struct bpf_testmod_struct_arg_3 *struct_arg3;
 	struct bpf_testmod_struct_arg_4 struct_arg4 = {21, 22};
-	struct bpf_testmod_struct_arg_5 struct_arg5 = {23, 24, 25, 26};
 	int i = 1;
 	while (bpf_testmod_return_ptr(i))
@@ -408,13 +380,8 @@ bpf_testmod_test_read(struct file *file, struct kobject *kobj,
 	(void)bpf_testmod_test_struct_arg_3(1, 4, struct_arg2);
 	(void)bpf_testmod_test_struct_arg_4(struct_arg1, 1, 2, 3, struct_arg2);
 	(void)bpf_testmod_test_struct_arg_5();
-	(void)bpf_testmod_test_struct_arg_7(16, (void *)17, 18, 19,
-					    (void *)20, struct_arg4);
-	(void)bpf_testmod_test_struct_arg_8(16, (void *)17, 18, 19,
-					    (void *)20, struct_arg4, 23);
-	(void)bpf_testmod_test_struct_arg_9(16, (void *)17, 18, 19, (void *)20,
-					    21, 22, struct_arg5, 27);
-
+	(void)bpf_testmod_test_struct_arg_7(16, (void *)17, 18, 19, (void *)20,
+					    21, 22, 23, struct_arg4);
 	(void)bpf_testmod_test_arg_ptr_to_struct(&struct_arg1_2);
(void)trace_bpf_testmod_test_raw_tp_null_tp(NULL);