Hello,
this series is a revival of Xu Kuhoai's work to enable larger arguments count for BPF programs on ARM64 ([1]). His initial series received some positive feedback, but lacked some specific case handling around arguments alignment (see AAPCS64 C.14 rule in section 6.8.2, [2]). There as been another attempt from Puranjay Mohan, which was unfortunately missing the same thing ([3]). Since there has been some time between those series and this new one, I chose to send it as a new series rather than a new revision of the existing series.
To support the increased argument counts and arguments larger than registers size (eg: structures), the trampoline does the following: - for bpf programs: arguments are retrieved from both registers and the function stack, and pushed in the trampoline stack as an array of u64 to generate the programs context. It is then passed by pointer to the bpf programs - when the trampoline is in charge of calling the original function: it restores the registers content, and generates a new stack layout for the additional arguments that do not fit in registers.
This new attempt is based on Xu's series and aims to handle the missing alignment concern raised in the reviews discussions. The main novelties are then around arguments alignments: - the first commit is exposing some new info in the BTF function model passed to the JIT compiler to allow it to deduce the needed alignment when configuring the trampoline stack - the second commit is taken from Xu's series, and received the following modifications: - the calc_aux_args computes an expected alignment for each argument - the calc_aux_args computes two different stack space sizes: the one needed to store the bpf programs context, and the original function stacked arguments (which needs alignment). Those stack sizes are in bytes instead of "slots" - when saving/restoring arguments for bpf program or for the original function, make sure to align the load/store accordingly, when relevant - a few typos fixes and some rewording, raised by the review on the original series - the last commit introduces some explicit tests that ensure that the needed alignment is enforced by the trampoline
I marked the series as RFC because it appears that the new tests trigger some failures in CI on x86 and s390, despite the series not touching any code related to those architectures. Some very early investigation/gdb debugging on the x86 side seems to hint that it could be related to the same missing alignment too (based on section 3.2.3 in [4], and so the x86 trampoline would need the same alignment handling ?). For s390 it looks less clear, as all values captured from the bpf test program are set to 0 in the CI output, and I don't have the proper setup yet to check the low level details. I am tempted to isolate those new tests (which were actually useful to spot real issues while tuning the ARM64 trampoline) and add them to the relevant DENYLIST files for x86/s390, but I guess this is not the right direction, so I would gladly take a second opinion on this.
[1] https://lore.kernel.org/all/20230917150752.69612-1-xukuohai@huaweicloud.com/... [2] https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#id82 [3] https://lore.kernel.org/bpf/20240705125336.46820-1-puranjay@kernel.org/ [4] https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com --- Alexis Lothoré (eBPF Foundation) (3): bpf: add struct largest member size in func model bpf/selftests: add tests to validate proper arguments alignment on ARM64 bpf/selftests: enable tracing tests for ARM64
Xu Kuohai (1): bpf, arm64: Support up to 12 function arguments
arch/arm64/net/bpf_jit_comp.c | 235 ++++++++++++++++----- include/linux/bpf.h | 1 + kernel/bpf/btf.c | 25 +++ tools/testing/selftests/bpf/DENYLIST.aarch64 | 3 - .../selftests/bpf/prog_tests/tracing_struct.c | 23 ++ tools/testing/selftests/bpf/progs/tracing_struct.c | 10 +- .../selftests/bpf/progs/tracing_struct_many_args.c | 67 ++++++ .../testing/selftests/bpf/test_kmods/bpf_testmod.c | 50 +++++ 8 files changed, 357 insertions(+), 57 deletions(-) --- base-commit: 91e7eb701b4bc389e7ddfd80ef6e82d1a6d2d368 change-id: 20250220-many_args_arm64-8bd3747e6948
Best regards,
In order to properly JIT the trampolines needed to attach BPF programs to functions, some architectures like ARM64 need to know about the alignment needed for the function arguments. Such alignment can generally be deduced from the argument size, but that's not completely true for composite types. In the specific case of ARM64, the AAPCS64 ABI defines that a composite type which needs to be passed through stack must be aligned on the maximum between 8 and the largest alignment constraint of its first-level members. So the JIT compiler needs more information about the arguments to make sure to generate code that respects those alignment constraints.
For struct arguments, add information about the size of the largest first-level member in the struct btf_func_model to allow the JIT compiler to guess the needed alignment. The information is quite specific, but it allows to keep arch-specific concerns (ie: guessing the final needed alignment for an argument) isolated in each JIT compiler.
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com --- include/linux/bpf.h | 1 + kernel/bpf/btf.c | 25 +++++++++++++++++++++++++ 2 files changed, 26 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 3f0cc89c0622cb1a097999afb78c17102593b6bb..8b34dcf60a0ce09228ff761b962ab67b6a3e2263 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1106,6 +1106,7 @@ struct btf_func_model { u8 nr_args; u8 arg_size[MAX_BPF_FUNC_ARGS]; u8 arg_flags[MAX_BPF_FUNC_ARGS]; + u8 arg_largest_member_size[MAX_BPF_FUNC_ARGS]; };
/* Restore arguments before returning from trampoline to let original function diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 16ba36f34dfab7531babf5753cab9f368cddefa3..5d40911ec90210086a6175d569abb6e52d75ad17 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -7318,6 +7318,29 @@ static int __get_type_size(struct btf *btf, u32 btf_id, return -EINVAL; }
+static u8 __get_largest_member_size(struct btf *btf, const struct btf_type *t) +{ + const struct btf_member *member; + const struct btf_type *mtype; + u8 largest_member_size = 0; + int i; + + if (!__btf_type_is_struct(t)) + return largest_member_size; + + for_each_member(i, t, member) { + mtype = btf_type_by_id(btf, member->type); + while (mtype && btf_type_is_modifier(mtype)) + mtype = btf_type_by_id(btf, mtype->type); + if (!mtype) + return -EINVAL; + if (mtype->size > largest_member_size) + largest_member_size = mtype->size; + } + + return largest_member_size; +} + static u8 __get_type_fmodel_flags(const struct btf_type *t) { u8 flags = 0; @@ -7396,6 +7419,8 @@ int btf_distill_func_proto(struct bpf_verifier_log *log, } m->arg_size[i] = ret; m->arg_flags[i] = __get_type_fmodel_flags(t); + m->arg_largest_member_size[i] = + __get_largest_member_size(btf, t); } m->nr_args = nargs; return 0;
On Fri, Apr 11, 2025 at 10:32:10PM +0200, Alexis Lothoré (eBPF Foundation) wrote:
In order to properly JIT the trampolines needed to attach BPF programs to functions, some architectures like ARM64 need to know about the alignment needed for the function arguments. Such alignment can generally be deduced from the argument size, but that's not completely true for composite types. In the specific case of ARM64, the AAPCS64 ABI defines that a composite type which needs to be passed through stack must be aligned on the maximum between 8 and the largest alignment constraint of its first-level members. So the JIT compiler needs more information about the arguments to make sure to generate code that respects those alignment constraints.
For struct arguments, add information about the size of the largest first-level member in the struct btf_func_model to allow the JIT compiler to guess the needed alignment. The information is quite specific, but it allows to keep arch-specific concerns (ie: guessing the final needed alignment for an argument) isolated in each JIT compiler.
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com
include/linux/bpf.h | 1 + kernel/bpf/btf.c | 25 +++++++++++++++++++++++++ 2 files changed, 26 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 3f0cc89c0622cb1a097999afb78c17102593b6bb..8b34dcf60a0ce09228ff761b962ab67b6a3e2263 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1106,6 +1106,7 @@ struct btf_func_model { u8 nr_args; u8 arg_size[MAX_BPF_FUNC_ARGS]; u8 arg_flags[MAX_BPF_FUNC_ARGS];
- u8 arg_largest_member_size[MAX_BPF_FUNC_ARGS];
}; /* Restore arguments before returning from trampoline to let original function diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 16ba36f34dfab7531babf5753cab9f368cddefa3..5d40911ec90210086a6175d569abb6e52d75ad17 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -7318,6 +7318,29 @@ static int __get_type_size(struct btf *btf, u32 btf_id, return -EINVAL; } +static u8 __get_largest_member_size(struct btf *btf, const struct btf_type *t) +{
- const struct btf_member *member;
- const struct btf_type *mtype;
- u8 largest_member_size = 0;
- int i;
- if (!__btf_type_is_struct(t))
return largest_member_size;
- for_each_member(i, t, member) {
mtype = btf_type_by_id(btf, member->type);
while (mtype && btf_type_is_modifier(mtype))
mtype = btf_type_by_id(btf, mtype->type);
if (!mtype)
return -EINVAL;
should we use __get_type_size for member->type instead ?
jirka
if (mtype->size > largest_member_size)
largest_member_size = mtype->size;
- }
- return largest_member_size;
+}
static u8 __get_type_fmodel_flags(const struct btf_type *t) { u8 flags = 0; @@ -7396,6 +7419,8 @@ int btf_distill_func_proto(struct bpf_verifier_log *log, } m->arg_size[i] = ret; m->arg_flags[i] = __get_type_fmodel_flags(t);
m->arg_largest_member_size[i] =
} m->nr_args = nargs; return 0;__get_largest_member_size(btf, t);
-- 2.49.0
Hello Jiri,
On Mon Apr 14, 2025 at 1:04 PM CEST, Jiri Olsa wrote:
On Fri, Apr 11, 2025 at 10:32:10PM +0200, Alexis Lothoré (eBPF Foundation) wrote:
[...]
- for_each_member(i, t, member) {
mtype = btf_type_by_id(btf, member->type);
while (mtype && btf_type_is_modifier(mtype))
mtype = btf_type_by_id(btf, mtype->type);
if (!mtype)
return -EINVAL;
should we use __get_type_size for member->type instead ?
Ah, yes, thanks for the hint, that will allow to get rid of the manual modifiers skip.
Alexis
jirka
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
In order to properly JIT the trampolines needed to attach BPF programs to functions, some architectures like ARM64 need to know about the alignment needed for the function arguments. Such alignment can generally be deduced from the argument size, but that's not completely true for composite types. In the specific case of ARM64, the AAPCS64 ABI defines that a composite type which needs to be passed through stack must be aligned on the maximum between 8 and the largest alignment constraint of its first-level members. So the JIT compiler needs more information about the arguments to make sure to generate code that respects those alignment constraints.
For struct arguments, add information about the size of the largest first-level member in the struct btf_func_model to allow the JIT compiler to guess the needed alignment. The information is quite
I might be missing something, but how can the *size* of the field be used to calculate that argument's *alignment*? i.e., I don't understand why arg_largest_member_size needs to be calculated instead of arg_largest_member_alignment...
specific, but it allows to keep arch-specific concerns (ie: guessing the final needed alignment for an argument) isolated in each JIT compiler.
couldn't all this information be calculated in the JIT compiler (if JIT needs that) from BTF?
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com
include/linux/bpf.h | 1 + kernel/bpf/btf.c | 25 +++++++++++++++++++++++++ 2 files changed, 26 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 3f0cc89c0622cb1a097999afb78c17102593b6bb..8b34dcf60a0ce09228ff761b962ab67b6a3e2263 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1106,6 +1106,7 @@ struct btf_func_model { u8 nr_args; u8 arg_size[MAX_BPF_FUNC_ARGS]; u8 arg_flags[MAX_BPF_FUNC_ARGS];
u8 arg_largest_member_size[MAX_BPF_FUNC_ARGS];
};
/* Restore arguments before returning from trampoline to let original function diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 16ba36f34dfab7531babf5753cab9f368cddefa3..5d40911ec90210086a6175d569abb6e52d75ad17 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -7318,6 +7318,29 @@ static int __get_type_size(struct btf *btf, u32 btf_id, return -EINVAL; }
+static u8 __get_largest_member_size(struct btf *btf, const struct btf_type *t) +{
const struct btf_member *member;
const struct btf_type *mtype;
u8 largest_member_size = 0;
int i;
if (!__btf_type_is_struct(t))
return largest_member_size;
for_each_member(i, t, member) {
mtype = btf_type_by_id(btf, member->type);
while (mtype && btf_type_is_modifier(mtype))
mtype = btf_type_by_id(btf, mtype->type);
if (!mtype)
return -EINVAL;
if (mtype->size > largest_member_size)
largest_member_size = mtype->size;
}
return largest_member_size;
+}
static u8 __get_type_fmodel_flags(const struct btf_type *t) { u8 flags = 0; @@ -7396,6 +7419,8 @@ int btf_distill_func_proto(struct bpf_verifier_log *log, } m->arg_size[i] = ret; m->arg_flags[i] = __get_type_fmodel_flags(t);
m->arg_largest_member_size[i] =
__get_largest_member_size(btf, t); } m->nr_args = nargs; return 0;
-- 2.49.0
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
In order to properly JIT the trampolines needed to attach BPF programs to functions, some architectures like ARM64 need to know about the alignment needed for the function arguments. Such alignment can generally be deduced from the argument size, but that's not completely true for composite types. In the specific case of ARM64, the AAPCS64 ABI defines that a composite type which needs to be passed through stack must be aligned on the maximum between 8 and the largest alignment constraint of its first-level members. So the JIT compiler needs more information about the arguments to make sure to generate code that respects those alignment constraints.
For struct arguments, add information about the size of the largest first-level member in the struct btf_func_model to allow the JIT compiler to guess the needed alignment. The information is quite
I might be missing something, but how can the *size* of the field be used to calculate that argument's *alignment*? i.e., I don't understand why arg_largest_member_size needs to be calculated instead of arg_largest_member_alignment...
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
specific, but it allows to keep arch-specific concerns (ie: guessing the final needed alignment for an argument) isolated in each JIT compiler.
couldn't all this information be calculated in the JIT compiler (if JIT needs that) from BTF?
From what I understand, the JIT compiler does not have access to BTF info, only a substract from it, arranged in a struct btf_func_model ? This struct btf_func_model already has size info for standard types, but for structs we need some additional info about the members, hence this arg_largest_member_alignment addition in btf_func_model.
Thanks,
Alexis
On 4/17/2025 3:14 PM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
In order to properly JIT the trampolines needed to attach BPF programs to functions, some architectures like ARM64 need to know about the alignment needed for the function arguments. Such alignment can generally be deduced from the argument size, but that's not completely true for composite types. In the specific case of ARM64, the AAPCS64 ABI defines that a composite type which needs to be passed through stack must be aligned on the maximum between 8 and the largest alignment constraint of its first-level members. So the JIT compiler needs more information about the arguments to make sure to generate code that respects those alignment constraints.
For struct arguments, add information about the size of the largest first-level member in the struct btf_func_model to allow the JIT compiler to guess the needed alignment. The information is quite
I might be missing something, but how can the *size* of the field be used to calculate that argument's *alignment*? i.e., I don't understand why arg_largest_member_size needs to be calculated instead of arg_largest_member_alignment...
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
Not exactly. The compiler's "packed" and "alignment" attributes cause a structure to be aligned differently from its natural alignment.
For example, with the following three structures:
struct s0 { __int128 x; };
struct s1 { __int128 x; } __attribute__((packed));
struct s2 { __int128 x; } __attribute__((aligned(64)));
Even though the largest member size is the same, s0 will be aligned to 16 bytes, s1 and s2 are not aligned the same way. s1 has no alignment due to the "packed" attribute, while s2 will be aligned to 64 bytes.
When these three structures are passed as function arguments, they will be located on different positions on the stack.
For the following three functions:
int f0(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s0 g); int f1(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s1 g); int f2(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s2 g);
g will be located at sp+32 in f0, sp + 24 in f1, and some 64-byte aligned stack address in f2.
specific, but it allows to keep arch-specific concerns (ie: guessing the final needed alignment for an argument) isolated in each JIT compiler.
couldn't all this information be calculated in the JIT compiler (if JIT needs that) from BTF?
From what I understand, the JIT compiler does not have access to BTF info,
only a substract from it, arranged in a struct btf_func_model ? This struct btf_func_model already has size info for standard types, but for structs we need some additional info about the members, hence this arg_largest_member_alignment addition in btf_func_model.
Thanks,
Alexis
Hi Xu,
On Thu Apr 17, 2025 at 4:10 PM CEST, Xu Kuohai wrote:
On 4/17/2025 3:14 PM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
I might be missing something, but how can the *size* of the field be used to calculate that argument's *alignment*? i.e., I don't understand why arg_largest_member_size needs to be calculated instead of arg_largest_member_alignment...
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
Not exactly. The compiler's "packed" and "alignment" attributes cause a structure to be aligned differently from its natural alignment.
For example, with the following three structures:
struct s0 { __int128 x; };
struct s1 { __int128 x; } __attribute__((packed));
struct s2 { __int128 x; } __attribute__((aligned(64)));
Even though the largest member size is the same, s0 will be aligned to 16 bytes, s1 and s2 are not aligned the same way. s1 has no alignment due to the "packed" attribute, while s2 will be aligned to 64 bytes.
When these three structures are passed as function arguments, they will be located on different positions on the stack.
For the following three functions:
int f0(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s0 g); int f1(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s1 g); int f2(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s2 g);
g will be located at sp+32 in f0, sp + 24 in f1, and some 64-byte aligned stack address in f2.
Ah, thanks for those clear examples, I completely overlooked this possibility. And now that you mention it, I feel a bit dumb because I now remember that you mentioned this in Puranjay's series...
I took a quick look at the x86 JIT compiler for reference, and saw no code related to this specific case neither. So I searched in the kernel for actual functions taking struct arguments by value AND being declared with some packed or aligned attribute. I only found a handful of those, and none seems to take enough arguments to have the corresponding struct passed on the stack. So rather than supporting this very specific case, I am tempted to just return an error for now during trampoline creation if we detect such structure (and then the JIT compiler can keep using data size to compute alignment, now that it is sure not to receive custom alignments). Or am I missing some actual cases involving those very specific alignments ?
Thanks,
Alexis
On 4/21/2025 12:02 AM, Alexis Lothoré wrote:
Hi Xu,
On Thu Apr 17, 2025 at 4:10 PM CEST, Xu Kuohai wrote:
On 4/17/2025 3:14 PM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
I might be missing something, but how can the *size* of the field be used to calculate that argument's *alignment*? i.e., I don't understand why arg_largest_member_size needs to be calculated instead of arg_largest_member_alignment...
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
Not exactly. The compiler's "packed" and "alignment" attributes cause a structure to be aligned differently from its natural alignment.
For example, with the following three structures:
struct s0 { __int128 x; };
struct s1 { __int128 x; } __attribute__((packed));
struct s2 { __int128 x; } __attribute__((aligned(64)));
Even though the largest member size is the same, s0 will be aligned to 16 bytes, s1 and s2 are not aligned the same way. s1 has no alignment due to the "packed" attribute, while s2 will be aligned to 64 bytes.
When these three structures are passed as function arguments, they will be located on different positions on the stack.
For the following three functions:
int f0(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s0 g); int f1(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s1 g); int f2(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s2 g);
g will be located at sp+32 in f0, sp + 24 in f1, and some 64-byte aligned stack address in f2.
Ah, thanks for those clear examples, I completely overlooked this possibility. And now that you mention it, I feel a bit dumb because I now remember that you mentioned this in Puranjay's series...
I took a quick look at the x86 JIT compiler for reference, and saw no code related to this specific case neither. So I searched in the kernel for actual functions taking struct arguments by value AND being declared with some packed or aligned attribute. I only found a handful of those, and none seems to take enough arguments to have the corresponding struct passed on the stack. So rather than supporting this very specific case, I am tempted to just return an error for now during trampoline creation if we detect such structure (and then the JIT compiler can keep using data size to compute alignment, now that it is sure not to receive custom alignments). Or am I missing some actual cases involving those very specific alignments ?
How can we reliably 'detect' the case? If a function has such a parameter but we fail to detect it, the BPF trampoline will pass an incorrect value to the function, which is also unacceptable.
Thanks,
Alexis
On Mon Apr 21, 2025 at 4:14 AM CEST, Xu Kuohai wrote:
On 4/21/2025 12:02 AM, Alexis Lothoré wrote:
Hi Xu,
On Thu Apr 17, 2025 at 4:10 PM CEST, Xu Kuohai wrote:
On 4/17/2025 3:14 PM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
Ah, thanks for those clear examples, I completely overlooked this possibility. And now that you mention it, I feel a bit dumb because I now remember that you mentioned this in Puranjay's series...
I took a quick look at the x86 JIT compiler for reference, and saw no code related to this specific case neither. So I searched in the kernel for actual functions taking struct arguments by value AND being declared with some packed or aligned attribute. I only found a handful of those, and none seems to take enough arguments to have the corresponding struct passed on the stack. So rather than supporting this very specific case, I am tempted to just return an error for now during trampoline creation if we detect such structure (and then the JIT compiler can keep using data size to compute alignment, now that it is sure not to receive custom alignments). Or am I missing some actual cases involving those very specific alignments ?
How can we reliably 'detect' the case? If a function has such a parameter but we fail to detect it, the BPF trampoline will pass an incorrect value to the function, which is also unacceptable.
That's a question I still have to answer :) I imagined being able to detect it thanks to some info somewhere in BTF, but I have to dig further to find how.
Alexis
On Thu, Apr 17, 2025 at 12:14 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
In order to properly JIT the trampolines needed to attach BPF programs to functions, some architectures like ARM64 need to know about the alignment needed for the function arguments. Such alignment can generally be deduced from the argument size, but that's not completely true for composite types. In the specific case of ARM64, the AAPCS64 ABI defines that a composite type which needs to be passed through stack must be aligned on the maximum between 8 and the largest alignment constraint of its first-level members. So the JIT compiler needs more information about the arguments to make sure to generate code that respects those alignment constraints.
For struct arguments, add information about the size of the largest first-level member in the struct btf_func_model to allow the JIT compiler to guess the needed alignment. The information is quite
I might be missing something, but how can the *size* of the field be used to calculate that argument's *alignment*? i.e., I don't understand why arg_largest_member_size needs to be calculated instead of arg_largest_member_alignment...
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
If you mean the size of "primitive" field and/or array element (applied recursively for all embedded structs/unions) then yes, that's close enough. But saying just "largest struct member" is wrong, because for
struct blah { struct { int whatever[128]; } heya; };
blah.heya has a large size, but alignment is still just 4 bytes.
I'd suggest looking at btf__align_of() in libbpf (tools/lib/bpf/btf.c) to see how we calculate alignment there. It seems to work decently enough. It won't cover any arch-specific extra rules like double needing 16-byte alignment (I vaguely remember something like that for some architectures, but I might be misremembering), or anything similar. It also won't detect (I don't think it's possible without DWARF) artificially increased alignment with attribute((aligned(N))).
specific, but it allows to keep arch-specific concerns (ie: guessing the final needed alignment for an argument) isolated in each JIT compiler.
couldn't all this information be calculated in the JIT compiler (if JIT needs that) from BTF?
From what I understand, the JIT compiler does not have access to BTF info, only a substract from it, arranged in a struct btf_func_model ? This struct btf_func_model already has size info for standard types, but for structs we need some additional info about the members, hence this arg_largest_member_alignment addition in btf_func_model.
Thanks,
Alexis
-- Alexis Lothoré, Bootlin Embedded Linux and Kernel engineering https://bootlin.com
Hi Andrii,
On Wed Apr 23, 2025 at 7:15 PM CEST, Andrii Nakryiko wrote:
On Thu, Apr 17, 2025 at 12:14 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
If you mean the size of "primitive" field and/or array element (applied recursively for all embedded structs/unions) then yes, that's close enough. But saying just "largest struct member" is wrong, because for
struct blah { struct { int whatever[128]; } heya; };
blah.heya has a large size, but alignment is still just 4 bytes.
Indeed, that's another case making my proposal fail :)
I'd suggest looking at btf__align_of() in libbpf (tools/lib/bpf/btf.c) to see how we calculate alignment there. It seems to work decently enough. It won't cover any arch-specific extra rules like double needing 16-byte alignment (I vaguely remember something like that for some architectures, but I might be misremembering), or anything similar. It also won't detect (I don't think it's possible without DWARF) artificially increased alignment with attribute((aligned(N))).
Thanks for the pointer, I'll take a look at it. The more we discuss this series, the less member size sounds relevant for what I'm trying to achieve here.
Following Xu's comments, I have been thinking about how I could detect the custom alignments and packing on structures, and I was wondering if I could somehow benefit from __attribute__ encoding in BTF info ([1]). But following your hint, I also see some btf_is_struct_packed() in tools/lib/bpf/btf_dump.c that could help. I'll dig this further and see if I can manage to make something work with all of this.
Thanks,
Alexis
[1] https://lore.kernel.org/bpf/20250130201239.1429648-1-ihor.solodrai@linux.dev...
On 4/24/2025 3:24 AM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 23, 2025 at 7:15 PM CEST, Andrii Nakryiko wrote:
On Thu, Apr 17, 2025 at 12:14 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
If you mean the size of "primitive" field and/or array element (applied recursively for all embedded structs/unions) then yes, that's close enough. But saying just "largest struct member" is wrong, because for
struct blah { struct { int whatever[128]; } heya; };
blah.heya has a large size, but alignment is still just 4 bytes.
Indeed, that's another case making my proposal fail :)
I'd suggest looking at btf__align_of() in libbpf (tools/lib/bpf/btf.c) to see how we calculate alignment there. It seems to work decently enough. It won't cover any arch-specific extra rules like double needing 16-byte alignment (I vaguely remember something like that for some architectures, but I might be misremembering), or anything similar. It also won't detect (I don't think it's possible without DWARF) artificially increased alignment with attribute((aligned(N))).
Thanks for the pointer, I'll take a look at it. The more we discuss this series, the less member size sounds relevant for what I'm trying to achieve here.
Following Xu's comments, I have been thinking about how I could detect the custom alignments and packing on structures, and I was wondering if I could somehow benefit from __attribute__ encoding in BTF info ([1]). But following your hint, I also see some btf_is_struct_packed() in tools/lib/bpf/btf_dump.c that could help. I'll dig this further and see if I can manage to make something work with all of this.
With DWARF info, we might not need to detect the structure alignment anymore, since the DW_AT_location attribute tells us where the structure parameter is located on the stack, and DW_AT_byte_size gives us the size of the structure.
Thanks,
Alexis
[1] https://lore.kernel.org/bpf/20250130201239.1429648-1-ihor.solodrai@linux.dev...
Hi Xu,
On Thu Apr 24, 2025 at 2:00 PM CEST, Xu Kuohai wrote:
On 4/24/2025 3:24 AM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 23, 2025 at 7:15 PM CEST, Andrii Nakryiko wrote:
On Thu, Apr 17, 2025 at 12:14 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
Thanks for the pointer, I'll take a look at it. The more we discuss this series, the less member size sounds relevant for what I'm trying to achieve here.
Following Xu's comments, I have been thinking about how I could detect the custom alignments and packing on structures, and I was wondering if I could somehow benefit from __attribute__ encoding in BTF info ([1]). But following your hint, I also see some btf_is_struct_packed() in tools/lib/bpf/btf_dump.c that could help. I'll dig this further and see if I can manage to make something work with all of this.
With DWARF info, we might not need to detect the structure alignment anymore, since the DW_AT_location attribute tells us where the structure parameter is located on the stack, and DW_AT_byte_size gives us the size of the structure.
I am not sure to follow you here, because DWARF info is not accessible from kernel at runtime, right ? Or are you meaning that we could, at build time, enrich the BTF info embedded in the kernel thanks to DWARF info ?
Thanks,
Alexis
On Thu, Apr 24, 2025 at 6:38 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Xu,
On Thu Apr 24, 2025 at 2:00 PM CEST, Xu Kuohai wrote:
On 4/24/2025 3:24 AM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 23, 2025 at 7:15 PM CEST, Andrii Nakryiko wrote:
On Thu, Apr 17, 2025 at 12:14 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
Thanks for the pointer, I'll take a look at it. The more we discuss this series, the less member size sounds relevant for what I'm trying to achieve here.
Following Xu's comments, I have been thinking about how I could detect the custom alignments and packing on structures, and I was wondering if I could somehow benefit from __attribute__ encoding in BTF info ([1]). But following your hint, I also see some btf_is_struct_packed() in tools/lib/bpf/btf_dump.c that could help. I'll dig this further and see if I can manage to make something work with all of this.
With DWARF info, we might not need to detect the structure alignment anymore, since the DW_AT_location attribute tells us where the structure parameter is located on the stack, and DW_AT_byte_size gives us the size of the structure.
I am not sure to follow you here, because DWARF info is not accessible from kernel at runtime, right ? Or are you meaning that we could, at build time, enrich the BTF info embedded in the kernel thanks to DWARF info ?
Sounds like arm64 has complicated rules for stack alignment and stack offset computation for passing 9th+ argument.
Since your analysis shows: "there are about 200 functions accept 9 to 12 arguments, so adding support for up to 12 function arguments." I say, let's keep the existing limitation: if (nregs > 8) return -ENOTSUPP;
If there is a simple and dumb way to detect that arg9+ are scalars with simple stack passing rules, then, sure, let's support those too, but fancy packed/align(x)/etc let's ignore.
Hello Alexei,
On Fri Apr 25, 2025 at 1:14 AM CEST, Alexei Starovoitov wrote:
On Thu, Apr 24, 2025 at 6:38 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
[...]
With DWARF info, we might not need to detect the structure alignment anymore, since the DW_AT_location attribute tells us where the structure parameter is located on the stack, and DW_AT_byte_size gives us the size of the structure.
I am not sure to follow you here, because DWARF info is not accessible from kernel at runtime, right ? Or are you meaning that we could, at build time, enrich the BTF info embedded in the kernel thanks to DWARF info ?
Sounds like arm64 has complicated rules for stack alignment and stack offset computation for passing 9th+ argument.
AFAICT, arm64 has some specificities for large types, but not that much compared to x86 for example. If I take a look at System V ABI ([1]), I see pretty much the same constraints: - p.18: "Arguments of type __int128 offer the same operations as INTEGERs, [...] with the exception that arguments of type __int128 that are stored in memory must be aligned on a 16-byte boundary" - p.13: "Structures and unions assume the alignment of their most strictly aligned component" - the custom packing and alignments attributes will end up having the same consequence on both architectures
As I mentioned in my cover letter, the new tests covering those same alignment constraints for ARM64 break on x86, which makes me think other archs are also silently ignoring those cases.
Since your analysis shows: "there are about 200 functions accept 9 to 12 arguments, so adding support for up to 12 function arguments." I say, let's keep the existing limitation: if (nregs > 8) return -ENOTSUPP;
If there is a simple and dumb way to detect that arg9+ are scalars with simple stack passing rules, then, sure, let's support those too, but fancy packed/align(x)/etc let's ignore.
[1] https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf
From: Xu Kuohai xukuohai@huawei.com
Currently ARM64 bpf trampoline supports up to 8 function arguments. According to the statistics from commit 473e3150e30a ("bpf, x86: allow function arguments up to 12 for TRACING"), there are about 200 functions accept 9 to 12 arguments, so adding support for up to 12 function arguments.
Due to bpf only supporting function arguments up to 16 bytes, according to AAPCS64, starting from the first argument, each argument is first attempted to be loaded to 1 or 2 smallest registers from x0-x7, if there are no enough registers to hold the entire argument, then all remaining arguments starting from this one are pushed to the stack for passing.
Signed-off-by: Xu Kuohai xukuohai@huawei.com Co-developed-by: Alexis Lothoré alexis.lothore@bootlin.com Signed-off-by: Alexis Lothoré alexis.lothore@bootlin.com --- arch/arm64/net/bpf_jit_comp.c | 235 ++++++++++++++++++++++++++++++++---------- 1 file changed, 182 insertions(+), 53 deletions(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index 70d7c89d3ac907798e86e0051e7b472c252c1412..3618f42b1a0019590d47db0ec0fce1f9f51b01af 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -2064,7 +2064,7 @@ bool bpf_jit_supports_subprog_tailcalls(void) }
static void invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l, - int args_off, int retval_off, int run_ctx_off, + int bargs_off, int retval_off, int run_ctx_off, bool save_ret) { __le32 *branch; @@ -2106,7 +2106,7 @@ static void invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l, branch = ctx->image + ctx->idx; emit(A64_NOP, ctx);
- emit(A64_ADD_I(1, A64_R(0), A64_SP, args_off), ctx); + emit(A64_ADD_I(1, A64_R(0), A64_SP, bargs_off), ctx); if (!p->jited) emit_addr_mov_i64(A64_R(1), (const u64)p->insnsi, ctx);
@@ -2131,7 +2131,7 @@ static void invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l, }
static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl, - int args_off, int retval_off, int run_ctx_off, + int bargs_off, int retval_off, int run_ctx_off, __le32 **branches) { int i; @@ -2141,7 +2141,7 @@ static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl, */ emit(A64_STR64I(A64_ZR, A64_SP, retval_off), ctx); for (i = 0; i < tl->nr_links; i++) { - invoke_bpf_prog(ctx, tl->links[i], args_off, retval_off, + invoke_bpf_prog(ctx, tl->links[i], bargs_off, retval_off, run_ctx_off, true); /* if (*(u64 *)(sp + retval_off) != 0) * goto do_fexit; @@ -2155,23 +2155,142 @@ static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl, } }
-static void save_args(struct jit_ctx *ctx, int args_off, int nregs) +struct arg_aux { + /* how many args are passed through registers, the rest of the args are + * passed through stack + */ + int args_in_regs; + /* how many registers are used to pass arguments */ + int regs_for_args; + /* how much stack is used for additional args passed to bpf program + * that did not fit in original function registers + **/ + int bstack_for_args; + /* home much stack is used for additional args passed to the + * original function when called from trampoline (this one needs + * arguments to be properly aligned) + */ + int ostack_for_args; + u8 arg_align[MAX_BPF_FUNC_ARGS]; +}; + +static u8 calc_arg_align(const struct btf_func_model *m, int i) +{ + if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) + return max(8, m->arg_largest_member_size[i]); + return max(8, m->arg_size[i]); +} + +static void calc_arg_aux(const struct btf_func_model *m, + struct arg_aux *a) { int i; + int nregs; + int slots; + int stack_slots; + + for (i = 0; i < m->nr_args; i++) + a->arg_align[i] = calc_arg_align(m, i); + + /* verifier ensures m->nr_args <= MAX_BPF_FUNC_ARGS */ + for (i = 0, nregs = 0; i < m->nr_args; i++) { + slots = (m->arg_size[i] + 7) / 8; + if (nregs + slots <= 8) /* passed through register ? */ + nregs += slots; + else + break; + }
- for (i = 0; i < nregs; i++) { - emit(A64_STR64I(i, A64_SP, args_off), ctx); - args_off += 8; + a->args_in_regs = i; + a->regs_for_args = nregs; + a->ostack_for_args = 0; + + /* the rest arguments are passed through stack */ + for (a->ostack_for_args = 0, a->bstack_for_args = 0; + i < m->nr_args; i++) { + stack_slots = (m->arg_size[i] + 7) / 8; + /* AAPCS 64 C.14: arguments passed on stack must be aligned to + * max(8, arg_natural_alignment) + */ + a->bstack_for_args += stack_slots * 8; + a->ostack_for_args = round_up(a->ostack_for_args + + stack_slots * 8, a->arg_align[i]); + } +} + +static void clear_garbage(struct jit_ctx *ctx, int reg, int effective_bytes) +{ + if (effective_bytes) { + int garbage_bits = 64 - 8 * effective_bytes; +#ifdef CONFIG_CPU_BIG_ENDIAN + /* garbage bits are at the right end */ + emit(A64_LSR(1, reg, reg, garbage_bits), ctx); + emit(A64_LSL(1, reg, reg, garbage_bits), ctx); +#else + /* garbage bits are at the left end */ + emit(A64_LSL(1, reg, reg, garbage_bits), ctx); + emit(A64_LSR(1, reg, reg, garbage_bits), ctx); +#endif } }
-static void restore_args(struct jit_ctx *ctx, int args_off, int nregs) +static void save_args(struct jit_ctx *ctx, int bargs_off, int oargs_off, + const struct btf_func_model *m, + const struct arg_aux *a, + bool for_call_origin) { int i; + int reg; + int doff; + int soff; + int slots; + u8 tmp = bpf2a64[TMP_REG_1]; + + /* store arguments to the stack for the bpf program, or restore + * arguments from stack for the original function + */ + for (reg = 0; reg < a->regs_for_args; reg++) { + emit(for_call_origin ? + A64_LDR64I(reg, A64_SP, bargs_off) : + A64_STR64I(reg, A64_SP, bargs_off), + ctx); + bargs_off += 8; + } + + soff = 32; /* on stack arguments start from FP + 32 */ + doff = (for_call_origin ? oargs_off : bargs_off); + + /* save on stack arguments */ + for (i = a->args_in_regs; i < m->nr_args; i++) { + slots = (m->arg_size[i] + 7) / 8; + /* AAPCS C.14: additional arguments on stack must be + * aligned on max(8, arg_natural_alignment) + */ + soff = round_up(soff, a->arg_align[i]); + if (for_call_origin) + doff = round_up(doff, a->arg_align[i]); + /* verifier ensures arg_size <= 16, so slots equals 1 or 2 */ + while (slots-- > 0) { + emit(A64_LDR64I(tmp, A64_FP, soff), ctx); + /* if there is unused space in the last slot, clear + * the garbage contained in the space. + */ + if (slots == 0 && !for_call_origin) + clear_garbage(ctx, tmp, m->arg_size[i] % 8); + emit(A64_STR64I(tmp, A64_SP, doff), ctx); + soff += 8; + doff += 8; + } + } +}
- for (i = 0; i < nregs; i++) { - emit(A64_LDR64I(i, A64_SP, args_off), ctx); - args_off += 8; +static void restore_args(struct jit_ctx *ctx, int bargs_off, int nregs) +{ + int reg; + + for (reg = 0; reg < nregs; reg++) { + emit(A64_LDR64I(reg, A64_SP, bargs_off), ctx); + bargs_off += 8; } }
@@ -2194,17 +2313,21 @@ static bool is_struct_ops_tramp(const struct bpf_tramp_links *fentry_links) */ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, struct bpf_tramp_links *tlinks, void *func_addr, - int nregs, u32 flags) + const struct btf_func_model *m, + const struct arg_aux *a, + u32 flags) { int i; int stack_size; int retaddr_off; int regs_off; int retval_off; - int args_off; - int nregs_off; + int bargs_off; + int nfuncargs_off; int ip_off; int run_ctx_off; + int oargs_off; + int nfuncargs; struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY]; struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT]; struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN]; @@ -2213,31 +2336,38 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, bool is_struct_ops = is_struct_ops_tramp(fentry);
/* trampoline stack layout: - * [ parent ip ] - * [ FP ] - * SP + retaddr_off [ self ip ] - * [ FP ] + * [ parent ip ] + * [ FP ] + * SP + retaddr_off [ self ip ] + * [ FP ] * - * [ padding ] align SP to multiples of 16 + * [ padding ] align SP to multiples of 16 * - * [ x20 ] callee saved reg x20 - * SP + regs_off [ x19 ] callee saved reg x19 + * [ x20 ] callee saved reg x20 + * SP + regs_off [ x19 ] callee saved reg x19 * - * SP + retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or - * BPF_TRAMP_F_RET_FENTRY_RET + * SP + retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or + * BPF_TRAMP_F_RET_FENTRY_RET + * [ arg reg N ] + * [ ... ] + * SP + bargs_off [ arg reg 1 ] for bpf * - * [ arg reg N ] - * [ ... ] - * SP + args_off [ arg reg 1 ] + * SP + nfuncargs_off [ arg regs count ] * - * SP + nregs_off [ arg regs count ] + * SP + ip_off [ traced function ] BPF_TRAMP_F_IP_ARG flag * - * SP + ip_off [ traced function ] BPF_TRAMP_F_IP_ARG flag + * SP + run_ctx_off [ bpf_tramp_run_ctx ] * - * SP + run_ctx_off [ bpf_tramp_run_ctx ] + * [ stack arg N ] + * [ ... ] + * SP + oargs_off [ stack arg 1 ] for original func */
stack_size = 0; + oargs_off = stack_size; + if (flags & BPF_TRAMP_F_CALL_ORIG) + stack_size += a->ostack_for_args; + run_ctx_off = stack_size; /* room for bpf_tramp_run_ctx */ stack_size += round_up(sizeof(struct bpf_tramp_run_ctx), 8); @@ -2247,13 +2377,14 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, if (flags & BPF_TRAMP_F_IP_ARG) stack_size += 8;
- nregs_off = stack_size; + nfuncargs_off = stack_size; /* room for args count */ stack_size += 8;
- args_off = stack_size; + bargs_off = stack_size; /* room for args */ - stack_size += nregs * 8; + nfuncargs = a->regs_for_args + a->bstack_for_args / 8; + stack_size += 8 * nfuncargs;
/* room for return value */ retval_off = stack_size; @@ -2300,11 +2431,11 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, }
/* save arg regs count*/ - emit(A64_MOVZ(1, A64_R(10), nregs, 0), ctx); - emit(A64_STR64I(A64_R(10), A64_SP, nregs_off), ctx); + emit(A64_MOVZ(1, A64_R(10), nfuncargs, 0), ctx); + emit(A64_STR64I(A64_R(10), A64_SP, nfuncargs_off), ctx);
- /* save arg regs */ - save_args(ctx, args_off, nregs); + /* save args for bpf */ + save_args(ctx, bargs_off, oargs_off, m, a, false);
/* save callee saved registers */ emit(A64_STR64I(A64_R(19), A64_SP, regs_off), ctx); @@ -2320,7 +2451,7 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, }
for (i = 0; i < fentry->nr_links; i++) - invoke_bpf_prog(ctx, fentry->links[i], args_off, + invoke_bpf_prog(ctx, fentry->links[i], bargs_off, retval_off, run_ctx_off, flags & BPF_TRAMP_F_RET_FENTRY_RET);
@@ -2330,12 +2461,13 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, if (!branches) return -ENOMEM;
- invoke_bpf_mod_ret(ctx, fmod_ret, args_off, retval_off, + invoke_bpf_mod_ret(ctx, fmod_ret, bargs_off, retval_off, run_ctx_off, branches); }
if (flags & BPF_TRAMP_F_CALL_ORIG) { - restore_args(ctx, args_off, nregs); + /* save args for original func */ + save_args(ctx, bargs_off, oargs_off, m, a, true); /* call original func */ emit(A64_LDR64I(A64_R(10), A64_SP, retaddr_off), ctx); emit(A64_ADR(A64_LR, AARCH64_INSN_SIZE * 2), ctx); @@ -2354,7 +2486,7 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, }
for (i = 0; i < fexit->nr_links; i++) - invoke_bpf_prog(ctx, fexit->links[i], args_off, retval_off, + invoke_bpf_prog(ctx, fexit->links[i], bargs_off, retval_off, run_ctx_off, false);
if (flags & BPF_TRAMP_F_CALL_ORIG) { @@ -2368,7 +2500,7 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, }
if (flags & BPF_TRAMP_F_RESTORE_REGS) - restore_args(ctx, args_off, nregs); + restore_args(ctx, bargs_off, a->regs_for_args);
/* restore callee saved register x19 and x20 */ emit(A64_LDR64I(A64_R(19), A64_SP, regs_off), ctx); @@ -2428,14 +2560,13 @@ int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags, .idx = 0, }; struct bpf_tramp_image im; + struct arg_aux aaux; int nregs, ret;
nregs = btf_func_model_nregs(m); - /* the first 8 registers are used for arguments */ - if (nregs > 8) - return -ENOTSUPP;
- ret = prepare_trampoline(&ctx, &im, tlinks, func_addr, nregs, flags); + calc_arg_aux(m, &aaux); + ret = prepare_trampoline(&ctx, &im, tlinks, func_addr, m, &aaux, flags); if (ret < 0) return ret;
@@ -2462,9 +2593,10 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image, u32 flags, struct bpf_tramp_links *tlinks, void *func_addr) { - int ret, nregs; - void *image, *tmp; u32 size = ro_image_end - ro_image; + struct arg_aux aaux; + void *image, *tmp; + int ret;
/* image doesn't need to be in module memory range, so we can * use kvmalloc. @@ -2480,13 +2612,10 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image, .write = true, };
- nregs = btf_func_model_nregs(m); - /* the first 8 registers are used for arguments */ - if (nregs > 8) - return -ENOTSUPP;
jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image)); - ret = prepare_trampoline(&ctx, im, tlinks, func_addr, nregs, flags); + calc_arg_aux(m, &aaux); + ret = prepare_trampoline(&ctx, im, tlinks, func_addr, m, &aaux, flags);
if (ret > 0 && validate_code(&ctx) < 0) { ret = -EINVAL;
When dealing with large types (>8 bytes), ARM64 trampolines need to take extra care about the arguments alignment to respect the calling convention set by AAPCS64.
Add two tests ensuring that the BPF trampoline arranges arguments with the relevant layout. The two new tests involve almost the same arguments, except that the second one requires a more specific alignment to be set by the trampoline when preparing arguments before calling the the target function.
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com --- .../selftests/bpf/prog_tests/tracing_struct.c | 23 ++++++++ tools/testing/selftests/bpf/progs/tracing_struct.c | 10 +++- .../selftests/bpf/progs/tracing_struct_many_args.c | 67 ++++++++++++++++++++++ .../testing/selftests/bpf/test_kmods/bpf_testmod.c | 50 ++++++++++++++++ 4 files changed, 149 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/tracing_struct.c b/tools/testing/selftests/bpf/prog_tests/tracing_struct.c
index 19e68d4b353278bf8e2917e62f62c89d14d7fe80..ecb5d38539f8b2fc275f93713ce2a7aad908b929 100644
--- a/tools/testing/selftests/bpf/prog_tests/tracing_struct.c
+++ b/tools/testing/selftests/bpf/prog_tests/tracing_struct.c
@@ -108,6 +108,29 @@ static void test_struct_many_args(void)
 	ASSERT_EQ(skel->bss->t9_i, 27, "t9:i");
 	ASSERT_EQ(skel->bss->t9_ret, 258, "t9 ret");

+	ASSERT_EQ(skel->bss->t10_a_a, 27, "t10:a.a");
+	ASSERT_EQ(skel->bss->t10_a_b, 28, "t10:a.b");
+	ASSERT_EQ(skel->bss->t10_b_a, 29, "t10:b.a");
+	ASSERT_EQ(skel->bss->t10_b_b, 30, "t10:b.b");
+	ASSERT_EQ(skel->bss->t10_c_a, 31, "t10:c.a");
+	ASSERT_EQ(skel->bss->t10_c_b, 32, "t10:c.b");
+	ASSERT_EQ(skel->bss->t10_d_a, 33, "t10:d.a");
+	ASSERT_EQ(skel->bss->t10_d_b, 34, "t10:d.b");
+	ASSERT_EQ(skel->bss->t10_e, 35, "t10:e");
+	ASSERT_EQ(skel->bss->t10_f_a, 36, "t10:f.a");
+	ASSERT_EQ(skel->bss->t10_f_b, 37, "t10:f.b");
+
+	ASSERT_EQ(skel->bss->t10_ret, 352, "t10 ret");
+
+	ASSERT_EQ(skel->bss->t11_a, 38, "t11:a");
+	ASSERT_EQ(skel->bss->t11_b, 39, "t11:b");
+	ASSERT_EQ(skel->bss->t11_c, 40, "t11:c");
+	ASSERT_EQ(skel->bss->t11_d, 41, "t11:d");
+	ASSERT_EQ(skel->bss->t11_e, 42, "t11:e");
+	ASSERT_EQ(skel->bss->t11_f, 43, "t11:f");
+
+	ASSERT_EQ(skel->bss->t11_ret, 243, "t11 ret");
+
 destroy_skel:
 	tracing_struct_many_args__destroy(skel);
 }
diff --git a/tools/testing/selftests/bpf/progs/tracing_struct.c b/tools/testing/selftests/bpf/progs/tracing_struct.c
index c435a3a8328ab1580c63967a7f0ab779aa7ca4d4..feabf618a2e011b0d08eeaa6cc09fba1922ecb3f 100644
--- a/tools/testing/selftests/bpf/progs/tracing_struct.c
+++ b/tools/testing/selftests/bpf/progs/tracing_struct.c
@@ -18,6 +18,15 @@ struct bpf_testmod_struct_arg_3 {
 	int b[];
 };

+struct bpf_testmod_struct_arg_4 {
+	__u64 a;
+	__u64 b;
+};
+
+struct bpf_testmod_struct_arg_5 {
+	__int128 a;
+};
+
 long t1_a_a, t1_a_b, t1_b, t1_c, t1_ret, t1_nregs;
 __u64 t1_reg0, t1_reg1, t1_reg2, t1_reg3;
 long t2_a, t2_b_a, t2_b_b, t2_c, t2_ret;
@@ -25,7 +34,6 @@ long t3_a, t3_b, t3_c_a, t3_c_b, t3_ret;
 long t4_a_a, t4_b, t4_c, t4_d, t4_e_a, t4_e_b, t4_ret;
 long t5_ret;
 int t6;
-
 SEC("fentry/bpf_testmod_test_struct_arg_1")
 int BPF_PROG2(test_struct_arg_1, struct bpf_testmod_struct_arg_2, a, int, b, int, c)
 {
diff --git a/tools/testing/selftests/bpf/progs/tracing_struct_many_args.c b/tools/testing/selftests/bpf/progs/tracing_struct_many_args.c
index 4742012ace06af949d7f15a21131aaef7ab006e4..6f086b3d32d5f89e426aa4b79daa24bdb42861db 100644
--- a/tools/testing/selftests/bpf/progs/tracing_struct_many_args.c
+++ b/tools/testing/selftests/bpf/progs/tracing_struct_many_args.c
@@ -18,6 +18,14 @@ struct bpf_testmod_struct_arg_5 {
 long t7_a, t7_b, t7_c, t7_d, t7_e, t7_f_a, t7_f_b, t7_ret;
 long t8_a, t8_b, t8_c, t8_d, t8_e, t8_f_a, t8_f_b, t8_g, t8_ret;
 long t9_a, t9_b, t9_c, t9_d, t9_e, t9_f, t9_g, t9_h_a, t9_h_b, t9_h_c, t9_h_d, t9_i, t9_ret;
+__u64 t10_a_a, t10_a_b, t10_b_a, t10_b_b, t10_c_a, t10_c_b, t10_d_a, t10_d_b,
+	t10_f_a, t10_f_b;
+short t10_e;
+int t10_ret;
+__int128 t11_a, t11_b, t11_c, t11_d, t11_f;
+short t11_e;
+int t11_ret;
+
 SEC("fentry/bpf_testmod_test_struct_arg_7")
 int BPF_PROG2(test_struct_many_args_1, __u64, a, void *, b, short, c, int, d,
@@ -92,4 +100,63 @@ int BPF_PROG2(test_struct_many_args_6, __u64, a, void *, b, short, c, int, d, vo
 	return 0;
 }
+SEC("fentry/bpf_testmod_test_struct_arg_10") +int BPF_PROG2(test_struct_many_args_7, struct bpf_testmod_struct_arg_4, a, + struct bpf_testmod_struct_arg_4, b, + struct bpf_testmod_struct_arg_4, c, + struct bpf_testmod_struct_arg_4, d, short, e, + struct bpf_testmod_struct_arg_4, f) +{ + t10_a_a = a.a; + t10_a_b = a.b; + t10_b_a = b.a; + t10_b_b = b.b; + t10_c_a = c.a; + t10_c_b = c.b; + t10_d_a = d.a; + t10_d_b = d.b; + t10_e = e; + t10_f_a = f.a; + t10_f_b = f.b; + return 0; +} + +SEC("fexit/bpf_testmod_test_struct_arg_10") +int BPF_PROG2(test_struct_many_args_8, struct bpf_testmod_struct_arg_4, a, + struct bpf_testmod_struct_arg_4, b, + struct bpf_testmod_struct_arg_4, c, + struct bpf_testmod_struct_arg_4, d, short, e, + struct bpf_testmod_struct_arg_4, f, int, ret) +{ + t10_ret = ret; + return 0; +} + +SEC("fentry/bpf_testmod_test_struct_arg_11") +int BPF_PROG2(test_struct_many_args_9, struct bpf_testmod_struct_arg_5, a, + struct bpf_testmod_struct_arg_5, b, + struct bpf_testmod_struct_arg_5, c, + struct bpf_testmod_struct_arg_5, d, int, e, + struct bpf_testmod_struct_arg_5, f) +{ + t11_a = a.a; + t11_b = b.a; + t11_c = c.a; + t11_d = d.a; + t11_e = e; + t11_f = f.a; + return 0; +} + +SEC("fexit/bpf_testmod_test_struct_arg_11") +int BPF_PROG2(test_struct_many_args_10, struct bpf_testmod_struct_arg_5, a, + struct bpf_testmod_struct_arg_5, b, + struct bpf_testmod_struct_arg_5, c, + struct bpf_testmod_struct_arg_5, d, int, e, + struct bpf_testmod_struct_arg_5, f, int, ret) +{ + t11_ret = ret; + return 0; +} + char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c index f38eaf0d35efa712ec288f06983c86b02c0d3e0e..96c80da725c978f2e48df8602dd63155971a7bf6 100644 --- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c +++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c @@ -62,6 +62,15 @@ struct bpf_testmod_struct_arg_5 { long d; };
+struct bpf_testmod_struct_arg_6 { + u64 a; + u64 b; +}; + +struct bpf_testmod_struct_arg_7 { + __int128 a; +}; + __bpf_hook_start();
noinline int @@ -128,6 +137,29 @@ bpf_testmod_test_struct_arg_9(u64 a, void *b, short c, int d, void *e, char f, return bpf_testmod_test_struct_arg_result; }
+noinline int bpf_testmod_test_struct_arg_10(struct bpf_testmod_struct_arg_6 a, + struct bpf_testmod_struct_arg_6 b, + struct bpf_testmod_struct_arg_6 c, + struct bpf_testmod_struct_arg_6 d, + short e, + struct bpf_testmod_struct_arg_6 f) +{ + bpf_testmod_test_struct_arg_result = + a.a + a.b + b.a + b.b + c.a + c.b + d.a + d.b + e + f.a + f.b; + return bpf_testmod_test_struct_arg_result; +} + +noinline int bpf_testmod_test_struct_arg_11(struct bpf_testmod_struct_arg_7 a, + struct bpf_testmod_struct_arg_7 b, + struct bpf_testmod_struct_arg_7 c, + struct bpf_testmod_struct_arg_7 d, + short e, + struct bpf_testmod_struct_arg_7 f) +{ + bpf_testmod_test_struct_arg_result = a.a + b.a + c.a + d.a + e + f.a; + return bpf_testmod_test_struct_arg_result; +} + noinline int bpf_testmod_test_arg_ptr_to_struct(struct bpf_testmod_struct_arg_1 *a) { bpf_testmod_test_struct_arg_result = a->a; @@ -394,6 +426,16 @@ bpf_testmod_test_read(struct file *file, struct kobject *kobj, struct bpf_testmod_struct_arg_3 *struct_arg3; struct bpf_testmod_struct_arg_4 struct_arg4 = {21, 22}; struct bpf_testmod_struct_arg_5 struct_arg5 = {23, 24, 25, 26}; + struct bpf_testmod_struct_arg_6 struct_arg6_a = {27, 28}; + struct bpf_testmod_struct_arg_6 struct_arg6_b = {29, 30}; + struct bpf_testmod_struct_arg_6 struct_arg6_c = {31, 32}; + struct bpf_testmod_struct_arg_6 struct_arg6_d = {33, 34}; + struct bpf_testmod_struct_arg_6 struct_arg6_f = {36, 37}; + struct bpf_testmod_struct_arg_7 struct_arg7_a = {38}; + struct bpf_testmod_struct_arg_7 struct_arg7_b = {39}; + struct bpf_testmod_struct_arg_7 struct_arg7_c = {40}; + struct bpf_testmod_struct_arg_7 struct_arg7_d = {41}; + struct bpf_testmod_struct_arg_7 struct_arg7_f = {43}; int i = 1;
while (bpf_testmod_return_ptr(i)) @@ -411,6 +453,14 @@ bpf_testmod_test_read(struct file *file, struct kobject *kobj, (void)bpf_testmod_test_struct_arg_9(16, (void *)17, 18, 19, (void *)20, 21, 22, struct_arg5, 27);
+ (void)bpf_testmod_test_struct_arg_10(struct_arg6_a, struct_arg6_b, + struct_arg6_c, struct_arg6_d, 35, + struct_arg6_f); + + (void)bpf_testmod_test_struct_arg_11(struct_arg7_a, struct_arg7_b, + struct_arg7_c, struct_arg7_d, 42, + struct_arg7_f); + (void)bpf_testmod_test_arg_ptr_to_struct(&struct_arg1_2);
(void)trace_bpf_testmod_test_raw_tp_null(NULL);
The fentry_many_args, fexit_many_args and struct_many_args tests were disabled on ARM64 because the trampoline lacked support for functions with many arguments.

Now that the previous commits bring in this support, drop those last denylisted tests, which removes DENYLIST.aarch64 entirely.
Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
---
 tools/testing/selftests/bpf/DENYLIST.aarch64 | 3 ---
 1 file changed, 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64
deleted file mode 100644
index 6d8feda27ce9de07d77d6e384666082923e3dc76..0000000000000000000000000000000000000000
--- a/tools/testing/selftests/bpf/DENYLIST.aarch64
+++ /dev/null
@@ -1,3 +0,0 @@
-fentry_test/fentry_many_args # fentry_many_args:FAIL:fentry_many_args_attach unexpected error: -524
-fexit_test/fexit_many_args # fexit_many_args:FAIL:fexit_many_args_attach unexpected error: -524
-tracing_struct/struct_many_args # struct_many_args:FAIL:tracing_struct_many_args__attach unexpected error: -524