 
            Hello,
this series is a revival of Xu Kuhoai's work to enable larger arguments count for BPF programs on ARM64 ([1]). His initial series received some positive feedback, but lacked some specific case handling around arguments alignment (see AAPCS64 C.14 rule in section 6.8.2, [2]). There as been another attempt from Puranjay Mohan, which was unfortunately missing the same thing ([3]). Since there has been some time between those series and this new one, I chose to send it as a new series rather than a new revision of the existing series.
To support the increased argument counts and arguments larger than registers size (eg: structures), the trampoline does the following: - for bpf programs: arguments are retrieved from both registers and the function stack, and pushed in the trampoline stack as an array of u64 to generate the programs context. It is then passed by pointer to the bpf programs - when the trampoline is in charge of calling the original function: it restores the registers content, and generates a new stack layout for the additional arguments that do not fit in registers.
This new attempt is based on Xu's series and aims to handle the missing alignment concern raised in the reviews discussions. The main novelties are then around arguments alignments: - the first commit is exposing some new info in the BTF function model passed to the JIT compiler to allow it to deduce the needed alignment when configuring the trampoline stack - the second commit is taken from Xu's series, and received the following modifications: - the calc_aux_args computes an expected alignment for each argument - the calc_aux_args computes two different stack space sizes: the one needed to store the bpf programs context, and the original function stacked arguments (which needs alignment). Those stack sizes are in bytes instead of "slots" - when saving/restoring arguments for bpf program or for the original function, make sure to align the load/store accordingly, when relevant - a few typos fixes and some rewording, raised by the review on the original series - the last commit introduces some explicit tests that ensure that the needed alignment is enforced by the trampoline
I marked the series as RFC because it appears that the new tests trigger some failures in CI on x86 and s390, despite the series not touching any code related to those architectures. Some very early investigation/gdb debugging on the x86 side seems to hint that it could be related to the same missing alignment too (based on section 3.2.3 in [4], and so the x86 trampoline would need the same alignment handling ?). For s390 it looks less clear, as all values captured from the bpf test program are set to 0 in the CI output, and I don't have the proper setup yet to check the low level details. I am tempted to isolate those new tests (which were actually useful to spot real issues while tuning the ARM64 trampoline) and add them to the relevant DENYLIST files for x86/s390, but I guess this is not the right direction, so I would gladly take a second opinion on this.
[1] https://lore.kernel.org/all/20230917150752.69612-1-xukuohai@huaweicloud.com/... [2] https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#id82 [3] https://lore.kernel.org/bpf/20240705125336.46820-1-puranjay@kernel.org/ [4] https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com --- Alexis Lothoré (eBPF Foundation) (3): bpf: add struct largest member size in func model bpf/selftests: add tests to validate proper arguments alignment on ARM64 bpf/selftests: enable tracing tests for ARM64
Xu Kuohai (1): bpf, arm64: Support up to 12 function arguments
arch/arm64/net/bpf_jit_comp.c | 235 ++++++++++++++++----- include/linux/bpf.h | 1 + kernel/bpf/btf.c | 25 +++ tools/testing/selftests/bpf/DENYLIST.aarch64 | 3 - .../selftests/bpf/prog_tests/tracing_struct.c | 23 ++ tools/testing/selftests/bpf/progs/tracing_struct.c | 10 +- .../selftests/bpf/progs/tracing_struct_many_args.c | 67 ++++++ .../testing/selftests/bpf/test_kmods/bpf_testmod.c | 50 +++++ 8 files changed, 357 insertions(+), 57 deletions(-) --- base-commit: 91e7eb701b4bc389e7ddfd80ef6e82d1a6d2d368 change-id: 20250220-many_args_arm64-8bd3747e6948
Best regards,
 
            In order to properly JIT the trampolines needed to attach BPF programs to functions, some architectures like ARM64 need to know about the alignment needed for the function arguments. Such alignment can generally be deduced from the argument size, but that's not completely true for composite types. In the specific case of ARM64, the AAPCS64 ABI defines that a composite type which needs to be passed through stack must be aligned on the maximum between 8 and the largest alignment constraint of its first-level members. So the JIT compiler needs more information about the arguments to make sure to generate code that respects those alignment constraints.
For struct arguments, add information about the size of the largest first-level member in the struct btf_func_model to allow the JIT compiler to guess the needed alignment. The information is quite specific, but it allows to keep arch-specific concerns (ie: guessing the final needed alignment for an argument) isolated in each JIT compiler.
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com --- include/linux/bpf.h | 1 + kernel/bpf/btf.c | 25 +++++++++++++++++++++++++ 2 files changed, 26 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 3f0cc89c0622cb1a097999afb78c17102593b6bb..8b34dcf60a0ce09228ff761b962ab67b6a3e2263 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1106,6 +1106,7 @@ struct btf_func_model { u8 nr_args; u8 arg_size[MAX_BPF_FUNC_ARGS]; u8 arg_flags[MAX_BPF_FUNC_ARGS]; + u8 arg_largest_member_size[MAX_BPF_FUNC_ARGS]; };
/* Restore arguments before returning from trampoline to let original function diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 16ba36f34dfab7531babf5753cab9f368cddefa3..5d40911ec90210086a6175d569abb6e52d75ad17 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -7318,6 +7318,29 @@ static int __get_type_size(struct btf *btf, u32 btf_id, return -EINVAL; }
+static u8 __get_largest_member_size(struct btf *btf, const struct btf_type *t) +{ + const struct btf_member *member; + const struct btf_type *mtype; + u8 largest_member_size = 0; + int i; + + if (!__btf_type_is_struct(t)) + return largest_member_size; + + for_each_member(i, t, member) { + mtype = btf_type_by_id(btf, member->type); + while (mtype && btf_type_is_modifier(mtype)) + mtype = btf_type_by_id(btf, mtype->type); + if (!mtype) + return -EINVAL; + if (mtype->size > largest_member_size) + largest_member_size = mtype->size; + } + + return largest_member_size; +} + static u8 __get_type_fmodel_flags(const struct btf_type *t) { u8 flags = 0; @@ -7396,6 +7419,8 @@ int btf_distill_func_proto(struct bpf_verifier_log *log, } m->arg_size[i] = ret; m->arg_flags[i] = __get_type_fmodel_flags(t); + m->arg_largest_member_size[i] = + __get_largest_member_size(btf, t); } m->nr_args = nargs; return 0;
 
            On Fri, Apr 11, 2025 at 10:32:10PM +0200, Alexis Lothoré (eBPF Foundation) wrote:
In order to properly JIT the trampolines needed to attach BPF programs to functions, some architectures like ARM64 need to know about the alignment needed for the function arguments. Such alignment can generally be deduced from the argument size, but that's not completely true for composite types. In the specific case of ARM64, the AAPCS64 ABI defines that a composite type which needs to be passed through stack must be aligned on the maximum between 8 and the largest alignment constraint of its first-level members. So the JIT compiler needs more information about the arguments to make sure to generate code that respects those alignment constraints.
For struct arguments, add information about the size of the largest first-level member in the struct btf_func_model to allow the JIT compiler to guess the needed alignment. The information is quite specific, but it allows to keep arch-specific concerns (ie: guessing the final needed alignment for an argument) isolated in each JIT compiler.
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com
include/linux/bpf.h | 1 + kernel/bpf/btf.c | 25 +++++++++++++++++++++++++ 2 files changed, 26 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 3f0cc89c0622cb1a097999afb78c17102593b6bb..8b34dcf60a0ce09228ff761b962ab67b6a3e2263 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1106,6 +1106,7 @@ struct btf_func_model { u8 nr_args; u8 arg_size[MAX_BPF_FUNC_ARGS]; u8 arg_flags[MAX_BPF_FUNC_ARGS];
- u8 arg_largest_member_size[MAX_BPF_FUNC_ARGS];
}; /* Restore arguments before returning from trampoline to let original function diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 16ba36f34dfab7531babf5753cab9f368cddefa3..5d40911ec90210086a6175d569abb6e52d75ad17 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -7318,6 +7318,29 @@ static int __get_type_size(struct btf *btf, u32 btf_id, return -EINVAL; } +static u8 __get_largest_member_size(struct btf *btf, const struct btf_type *t) +{
- const struct btf_member *member;
- const struct btf_type *mtype;
- u8 largest_member_size = 0;
- int i;
- if (!__btf_type_is_struct(t))
return largest_member_size;- for_each_member(i, t, member) {
mtype = btf_type_by_id(btf, member->type);
while (mtype && btf_type_is_modifier(mtype))
mtype = btf_type_by_id(btf, mtype->type);
if (!mtype)
return -EINVAL;
should we use __get_type_size for member->type instead ?
jirka
if (mtype->size > largest_member_size)
largest_member_size = mtype->size;- }
- return largest_member_size;
+}
static u8 __get_type_fmodel_flags(const struct btf_type *t) { u8 flags = 0; @@ -7396,6 +7419,8 @@ int btf_distill_func_proto(struct bpf_verifier_log *log, } m->arg_size[i] = ret; m->arg_flags[i] = __get_type_fmodel_flags(t);
m->arg_largest_member_size[i] =} m->nr_args = nargs; return 0;
__get_largest_member_size(btf, t);-- 2.49.0
 
            Hello Jiri,
On Mon Apr 14, 2025 at 1:04 PM CEST, Jiri Olsa wrote:
On Fri, Apr 11, 2025 at 10:32:10PM +0200, Alexis Lothoré (eBPF Foundation) wrote:
[...]
- for_each_member(i, t, member) {
mtype = btf_type_by_id(btf, member->type);
while (mtype && btf_type_is_modifier(mtype))
mtype = btf_type_by_id(btf, mtype->type);
if (!mtype)
return -EINVAL;should we use __get_type_size for member->type instead ?
Ah, yes, thanks for the hint, that will allow to get rid of the manual modifiers skip.
Alexis
jirka
 
            On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
In order to properly JIT the trampolines needed to attach BPF programs to functions, some architectures like ARM64 need to know about the alignment needed for the function arguments. Such alignment can generally be deduced from the argument size, but that's not completely true for composite types. In the specific case of ARM64, the AAPCS64 ABI defines that a composite type which needs to be passed through stack must be aligned on the maximum between 8 and the largest alignment constraint of its first-level members. So the JIT compiler needs more information about the arguments to make sure to generate code that respects those alignment constraints.
For struct arguments, add information about the size of the largest first-level member in the struct btf_func_model to allow the JIT compiler to guess the needed alignment. The information is quite
I might be missing something, but how can the *size* of the field be used to calculate that argument's *alignment*? i.e., I don't understand why arg_largest_member_size needs to be calculated instead of arg_largest_member_alignment...
specific, but it allows to keep arch-specific concerns (ie: guessing the final needed alignment for an argument) isolated in each JIT compiler.
couldn't all this information be calculated in the JIT compiler (if JIT needs that) from BTF?
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com
include/linux/bpf.h | 1 + kernel/bpf/btf.c | 25 +++++++++++++++++++++++++ 2 files changed, 26 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 3f0cc89c0622cb1a097999afb78c17102593b6bb..8b34dcf60a0ce09228ff761b962ab67b6a3e2263 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1106,6 +1106,7 @@ struct btf_func_model { u8 nr_args; u8 arg_size[MAX_BPF_FUNC_ARGS]; u8 arg_flags[MAX_BPF_FUNC_ARGS];
u8 arg_largest_member_size[MAX_BPF_FUNC_ARGS];};
/* Restore arguments before returning from trampoline to let original function diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 16ba36f34dfab7531babf5753cab9f368cddefa3..5d40911ec90210086a6175d569abb6e52d75ad17 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -7318,6 +7318,29 @@ static int __get_type_size(struct btf *btf, u32 btf_id, return -EINVAL; }
+static u8 __get_largest_member_size(struct btf *btf, const struct btf_type *t) +{
const struct btf_member *member;
const struct btf_type *mtype;
u8 largest_member_size = 0;
int i;
if (!__btf_type_is_struct(t))
return largest_member_size;
for_each_member(i, t, member) {
mtype = btf_type_by_id(btf, member->type);
while (mtype && btf_type_is_modifier(mtype))
mtype = btf_type_by_id(btf, mtype->type);
if (!mtype)
return -EINVAL;
if (mtype->size > largest_member_size)
largest_member_size = mtype->size;
}
return largest_member_size;+}
static u8 __get_type_fmodel_flags(const struct btf_type *t) { u8 flags = 0; @@ -7396,6 +7419,8 @@ int btf_distill_func_proto(struct bpf_verifier_log *log, } m->arg_size[i] = ret; m->arg_flags[i] = __get_type_fmodel_flags(t);
m->arg_largest_member_size[i] =
__get_largest_member_size(btf, t); } m->nr_args = nargs; return 0;-- 2.49.0
 
            Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
In order to properly JIT the trampolines needed to attach BPF programs to functions, some architectures like ARM64 need to know about the alignment needed for the function arguments. Such alignment can generally be deduced from the argument size, but that's not completely true for composite types. In the specific case of ARM64, the AAPCS64 ABI defines that a composite type which needs to be passed through stack must be aligned on the maximum between 8 and the largest alignment constraint of its first-level members. So the JIT compiler needs more information about the arguments to make sure to generate code that respects those alignment constraints.
For struct arguments, add information about the size of the largest first-level member in the struct btf_func_model to allow the JIT compiler to guess the needed alignment. The information is quite
I might be missing something, but how can the *size* of the field be used to calculate that argument's *alignment*? i.e., I don't understand why arg_largest_member_size needs to be calculated instead of arg_largest_member_alignment...
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
specific, but it allows to keep arch-specific concerns (ie: guessing the final needed alignment for an argument) isolated in each JIT compiler.
couldn't all this information be calculated in the JIT compiler (if JIT needs that) from BTF?
From what I understand, the JIT compiler does not have access to BTF info, only a substract from it, arranged in a struct btf_func_model ? This struct btf_func_model already has size info for standard types, but for structs we need some additional info about the members, hence this arg_largest_member_alignment addition in btf_func_model.
Thanks,
Alexis
 
            On 4/17/2025 3:14 PM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
In order to properly JIT the trampolines needed to attach BPF programs to functions, some architectures like ARM64 need to know about the alignment needed for the function arguments. Such alignment can generally be deduced from the argument size, but that's not completely true for composite types. In the specific case of ARM64, the AAPCS64 ABI defines that a composite type which needs to be passed through stack must be aligned on the maximum between 8 and the largest alignment constraint of its first-level members. So the JIT compiler needs more information about the arguments to make sure to generate code that respects those alignment constraints.
For struct arguments, add information about the size of the largest first-level member in the struct btf_func_model to allow the JIT compiler to guess the needed alignment. The information is quite
I might be missing something, but how can the *size* of the field be used to calculate that argument's *alignment*? i.e., I don't understand why arg_largest_member_size needs to be calculated instead of arg_largest_member_alignment...
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
Not exactly. The compiler's "packed" and "alignment" attributes cause a structure to be aligned differently from its natural alignment.
For example, with the following three structures:
struct s0 { __int128 x; };
struct s1 { __int128 x; } __attribute__((packed));
struct s2 { __int128 x; } __attribute__((aligned(64)));
Even though the largest member size is the same, s0 will be aligned to 16 bytes, s1 and s2 are not aligned the same way. s1 has no alignment due to the "packed" attribute, while s2 will be aligned to 64 bytes.
When these three structures are passed as function arguments, they will be located on different positions on the stack.
For the following three functions:
int f0(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s0 g); int f1(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s1 g); int f2(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s2 g);
g will be located at sp+32 in f0, sp + 24 in f1, and some 64-byte aligned stack address in f2.
specific, but it allows to keep arch-specific concerns (ie: guessing the final needed alignment for an argument) isolated in each JIT compiler.
couldn't all this information be calculated in the JIT compiler (if JIT needs that) from BTF?
From what I understand, the JIT compiler does not have access to BTF info,
only a substract from it, arranged in a struct btf_func_model ? This struct btf_func_model already has size info for standard types, but for structs we need some additional info about the members, hence this arg_largest_member_alignment addition in btf_func_model.
Thanks,
Alexis
 
            Hi Xu,
On Thu Apr 17, 2025 at 4:10 PM CEST, Xu Kuohai wrote:
On 4/17/2025 3:14 PM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
I might be missing something, but how can the *size* of the field be used to calculate that argument's *alignment*? i.e., I don't understand why arg_largest_member_size needs to be calculated instead of arg_largest_member_alignment...
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
Not exactly. The compiler's "packed" and "alignment" attributes cause a structure to be aligned differently from its natural alignment.
For example, with the following three structures:
struct s0 { __int128 x; };
struct s1 { __int128 x; } __attribute__((packed));
struct s2 { __int128 x; } __attribute__((aligned(64)));
Even though the largest member size is the same, s0 will be aligned to 16 bytes, s1 and s2 are not aligned the same way. s1 has no alignment due to the "packed" attribute, while s2 will be aligned to 64 bytes.
When these three structures are passed as function arguments, they will be located on different positions on the stack.
For the following three functions:
int f0(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s0 g); int f1(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s1 g); int f2(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s2 g);
g will be located at sp+32 in f0, sp + 24 in f1, and some 64-byte aligned stack address in f2.
Ah, thanks for those clear examples, I completely overlooked this possibility. And now that you mention it, I feel a bit dumb because I now remember that you mentioned this in Puranjay's series...
I took a quick look at the x86 JIT compiler for reference, and saw no code related to this specific case neither. So I searched in the kernel for actual functions taking struct arguments by value AND being declared with some packed or aligned attribute. I only found a handful of those, and none seems to take enough arguments to have the corresponding struct passed on the stack. So rather than supporting this very specific case, I am tempted to just return an error for now during trampoline creation if we detect such structure (and then the JIT compiler can keep using data size to compute alignment, now that it is sure not to receive custom alignments). Or am I missing some actual cases involving those very specific alignments ?
Thanks,
Alexis
 
            On 4/21/2025 12:02 AM, Alexis Lothoré wrote:
Hi Xu,
On Thu Apr 17, 2025 at 4:10 PM CEST, Xu Kuohai wrote:
On 4/17/2025 3:14 PM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
I might be missing something, but how can the *size* of the field be used to calculate that argument's *alignment*? i.e., I don't understand why arg_largest_member_size needs to be calculated instead of arg_largest_member_alignment...
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
Not exactly. The compiler's "packed" and "alignment" attributes cause a structure to be aligned differently from its natural alignment.
For example, with the following three structures:
struct s0 { __int128 x; };
struct s1 { __int128 x; } __attribute__((packed));
struct s2 { __int128 x; } __attribute__((aligned(64)));
Even though the largest member size is the same, s0 will be aligned to 16 bytes, s1 and s2 are not aligned the same way. s1 has no alignment due to the "packed" attribute, while s2 will be aligned to 64 bytes.
When these three structures are passed as function arguments, they will be located on different positions on the stack.
For the following three functions:
int f0(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s0 g); int f1(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s1 g); int f2(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s2 g);
g will be located at sp+32 in f0, sp + 24 in f1, and some 64-byte aligned stack address in f2.
Ah, thanks for those clear examples, I completely overlooked this possibility. And now that you mention it, I feel a bit dumb because I now remember that you mentioned this in Puranjay's series...
I took a quick look at the x86 JIT compiler for reference, and saw no code related to this specific case neither. So I searched in the kernel for actual functions taking struct arguments by value AND being declared with some packed or aligned attribute. I only found a handful of those, and none seems to take enough arguments to have the corresponding struct passed on the stack. So rather than supporting this very specific case, I am tempted to just return an error for now during trampoline creation if we detect such structure (and then the JIT compiler can keep using data size to compute alignment, now that it is sure not to receive custom alignments). Or am I missing some actual cases involving those very specific alignments ?
How can we reliably 'detect' the case? If a function has such a parameter but we fail to detect it, the BPF trampoline will pass an incorrect value to the function, which is also unacceptable.
Thanks,
Alexis
 
            On Mon Apr 21, 2025 at 4:14 AM CEST, Xu Kuohai wrote:
On 4/21/2025 12:02 AM, Alexis Lothoré wrote:
Hi Xu,
On Thu Apr 17, 2025 at 4:10 PM CEST, Xu Kuohai wrote:
On 4/17/2025 3:14 PM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
Ah, thanks for those clear examples, I completely overlooked this possibility. And now that you mention it, I feel a bit dumb because I now remember that you mentioned this in Puranjay's series...
I took a quick look at the x86 JIT compiler for reference, and saw no code related to this specific case neither. So I searched in the kernel for actual functions taking struct arguments by value AND being declared with some packed or aligned attribute. I only found a handful of those, and none seems to take enough arguments to have the corresponding struct passed on the stack. So rather than supporting this very specific case, I am tempted to just return an error for now during trampoline creation if we detect such structure (and then the JIT compiler can keep using data size to compute alignment, now that it is sure not to receive custom alignments). Or am I missing some actual cases involving those very specific alignments ?
How can we reliably 'detect' the case? If a function has such a parameter but we fail to detect it, the BPF trampoline will pass an incorrect value to the function, which is also unacceptable.
That's a question I still have to answer :) I imagined being able to detect it thanks to some info somewhere in BTF, but I have to dig further to find how.
Alexis
 
            On Thu, Apr 17, 2025 at 12:14 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
In order to properly JIT the trampolines needed to attach BPF programs to functions, some architectures like ARM64 need to know about the alignment needed for the function arguments. Such alignment can generally be deduced from the argument size, but that's not completely true for composite types. In the specific case of ARM64, the AAPCS64 ABI defines that a composite type which needs to be passed through stack must be aligned on the maximum between 8 and the largest alignment constraint of its first-level members. So the JIT compiler needs more information about the arguments to make sure to generate code that respects those alignment constraints.
For struct arguments, add information about the size of the largest first-level member in the struct btf_func_model to allow the JIT compiler to guess the needed alignment. The information is quite
I might be missing something, but how can the *size* of the field be used to calculate that argument's *alignment*? i.e., I don't understand why arg_largest_member_size needs to be calculated instead of arg_largest_member_alignment...
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
If you mean the size of "primitive" field and/or array element (applied recursively for all embedded structs/unions) then yes, that's close enough. But saying just "largest struct member" is wrong, because for
struct blah { struct { int whatever[128]; } heya; };
blah.heya has a large size, but alignment is still just 4 bytes.
I'd suggest looking at btf__align_of() in libbpf (tools/lib/bpf/btf.c) to see how we calculate alignment there. It seems to work decently enough. It won't cover any arch-specific extra rules like double needing 16-byte alignment (I vaguely remember something like that for some architectures, but I might be misremembering), or anything similar. It also won't detect (I don't think it's possible without DWARF) artificially increased alignment with attribute((aligned(N))).
specific, but it allows to keep arch-specific concerns (ie: guessing the final needed alignment for an argument) isolated in each JIT compiler.
couldn't all this information be calculated in the JIT compiler (if JIT needs that) from BTF?
From what I understand, the JIT compiler does not have access to BTF info, only a substract from it, arranged in a struct btf_func_model ? This struct btf_func_model already has size info for standard types, but for structs we need some additional info about the members, hence this arg_largest_member_alignment addition in btf_func_model.
Thanks,
Alexis
-- Alexis Lothoré, Bootlin Embedded Linux and Kernel engineering https://bootlin.com
 
            Hi Andrii,
On Wed Apr 23, 2025 at 7:15 PM CEST, Andrii Nakryiko wrote:
On Thu, Apr 17, 2025 at 12:14 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
If you mean the size of "primitive" field and/or array element (applied recursively for all embedded structs/unions) then yes, that's close enough. But saying just "largest struct member" is wrong, because for
struct blah { struct { int whatever[128]; } heya; };
blah.heya has a large size, but alignment is still just 4 bytes.
Indeed, that's another case making my proposal fail :)
I'd suggest looking at btf__align_of() in libbpf (tools/lib/bpf/btf.c) to see how we calculate alignment there. It seems to work decently enough. It won't cover any arch-specific extra rules like double needing 16-byte alignment (I vaguely remember something like that for some architectures, but I might be misremembering), or anything similar. It also won't detect (I don't think it's possible without DWARF) artificially increased alignment with attribute((aligned(N))).
Thanks for the pointer, I'll take a look at it. The more we discuss this series, the less member size sounds relevant for what I'm trying to achieve here.
Following Xu's comments, I have been thinking about how I could detect the custom alignments and packing on structures, and I was wondering if I could somehow benefit from __attribute__ encoding in BTF info ([1]). But following your hint, I also see some btf_is_struct_packed() in tools/lib/bpf/btf_dump.c that could help. I'll dig this further and see if I can manage to make something work with all of this.
Thanks,
Alexis
[1] https://lore.kernel.org/bpf/20250130201239.1429648-1-ihor.solodrai@linux.dev...
 
            On 4/24/2025 3:24 AM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 23, 2025 at 7:15 PM CEST, Andrii Nakryiko wrote:
On Thu, Apr 17, 2025 at 12:14 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
Indeed I initially checked whether I could return directly some alignment info from btf, but it then involves the alignment computation in the btf module. Since there could be minor differences between architectures about alignment requirements, I though it would be better to in fact keep alignment computation out of the btf module. For example, I see that 128 bits values are aligned on 16 bytes on ARM64, while being aligned on 8 bytes on S390.
And since for ARM64, all needed alignments are somehow derived from size (it is either directly size for fundamental types, or alignment of the largest member for structs, which is then size of largest member), returning the size seems to be enough to allow the JIT side to compute alignments.
If you mean the size of "primitive" field and/or array element (applied recursively for all embedded structs/unions) then yes, that's close enough. But saying just "largest struct member" is wrong, because for
struct blah { struct { int whatever[128]; } heya; };
blah.heya has a large size, but alignment is still just 4 bytes.
Indeed, that's another case making my proposal fail :)
I'd suggest looking at btf__align_of() in libbpf (tools/lib/bpf/btf.c) to see how we calculate alignment there. It seems to work decently enough. It won't cover any arch-specific extra rules like double needing 16-byte alignment (I vaguely remember something like that for some architectures, but I might be misremembering), or anything similar. It also won't detect (I don't think it's possible without DWARF) artificially increased alignment with attribute((aligned(N))).
Thanks for the pointer, I'll take a look at it. The more we discuss this series, the less member size sounds relevant for what I'm trying to achieve here.
Following Xu's comments, I have been thinking about how I could detect the custom alignments and packing on structures, and I was wondering if I could somehow benefit from __attribute__ encoding in BTF info ([1]). But following your hint, I also see some btf_is_struct_packed() in tools/lib/bpf/btf_dump.c that could help. I'll dig this further and see if I can manage to make something work with all of this.
With DWARF info, we might not need to detect the structure alignment anymore, since the DW_AT_location attribute tells us where the structure parameter is located on the stack, and DW_AT_byte_size gives us the size of the structure.
Thanks,
Alexis
[1] https://lore.kernel.org/bpf/20250130201239.1429648-1-ihor.solodrai@linux.dev...
 
            Hi Xu,
On Thu Apr 24, 2025 at 2:00 PM CEST, Xu Kuohai wrote:
On 4/24/2025 3:24 AM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 23, 2025 at 7:15 PM CEST, Andrii Nakryiko wrote:
On Thu, Apr 17, 2025 at 12:14 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
Thanks for the pointer, I'll take a look at it. The more we discuss this series, the less member size sounds relevant for what I'm trying to achieve here.
Following Xu's comments, I have been thinking about how I could detect the custom alignments and packing on structures, and I was wondering if I could somehow benefit from __attribute__ encoding in BTF info ([1]). But following your hint, I also see some btf_is_struct_packed() in tools/lib/bpf/btf_dump.c that could help. I'll dig this further and see if I can manage to make something work with all of this.
With DWARF info, we might not need to detect the structure alignment anymore, since the DW_AT_location attribute tells us where the structure parameter is located on the stack, and DW_AT_byte_size gives us the size of the structure.
I am not sure to follow you here, because DWARF info is not accessible from kernel at runtime, right ? Or are you meaning that we could, at build time, enrich the BTF info embedded in the kernel thanks to DWARF info ?
Thanks,
Alexis
 
            On Thu, Apr 24, 2025 at 6:38 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Xu,
On Thu Apr 24, 2025 at 2:00 PM CEST, Xu Kuohai wrote:
On 4/24/2025 3:24 AM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 23, 2025 at 7:15 PM CEST, Andrii Nakryiko wrote:
On Thu, Apr 17, 2025 at 12:14 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
Thanks for the pointer, I'll take a look at it. The more we discuss this series, the less member size sounds relevant for what I'm trying to achieve here.
Following Xu's comments, I have been thinking about how I could detect the custom alignments and packing on structures, and I was wondering if I could somehow benefit from __attribute__ encoding in BTF info ([1]). But following your hint, I also see some btf_is_struct_packed() in tools/lib/bpf/btf_dump.c that could help. I'll dig this further and see if I can manage to make something work with all of this.
With DWARF info, we might not need to detect the structure alignment anymore, since the DW_AT_location attribute tells us where the structure parameter is located on the stack, and DW_AT_byte_size gives us the size of the structure.
I am not sure to follow you here, because DWARF info is not accessible from kernel at runtime, right ? Or are you meaning that we could, at build time, enrich the BTF info embedded in the kernel thanks to DWARF info ?
Sounds like arm64 has complicated rules for stack alignment and stack offset computation for passing 9th+ argument.
Since your analysis shows: "there are about 200 functions accept 9 to 12 arguments, so adding support for up to 12 function arguments." I say, let's keep the existing limitation: if (nregs > 8) return -ENOTSUPP;
If there is a simple and dumb way to detect that arg9+ are scalars with simple stack passing rules, then, sure, let's support those too, but fancy packed/align(x)/etc let's ignore.
 
            Hello Alexei,
On Fri Apr 25, 2025 at 1:14 AM CEST, Alexei Starovoitov wrote:
On Thu, Apr 24, 2025 at 6:38 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
[...]
With DWARF info, we might not need to detect the structure alignment anymore, since the DW_AT_location attribute tells us where the structure parameter is located on the stack, and DW_AT_byte_size gives us the size of the structure.
I am not sure to follow you here, because DWARF info is not accessible from kernel at runtime, right ? Or are you meaning that we could, at build time, enrich the BTF info embedded in the kernel thanks to DWARF info ?
Sounds like arm64 has complicated rules for stack alignment and stack offset computation for passing 9th+ argument.
AFAICT, arm64 has some specificities for large types, but not that much compared to x86 for example. If I take a look at System V ABI ([1]), I see pretty much the same constraints: - p.18: "Arguments of type __int128 offer the same operations as INTEGERs, [...] with the exception that arguments of type __int128 that are stored in memory must be aligned on a 16-byte boundary" - p.13: "Structures and unions assume the alignment of their most strictly aligned component" - the custom packing and alignments attributes will end up having the same consequence on both architectures
As I mentioned in my cover letter, the new tests covering those same alignment constraints for ARM64 break on x86, which makes me think other archs are also silently ignoring those cases.
Since your analysis shows: "there are about 200 functions accept 9 to 12 arguments, so adding support for up to 12 function arguments." I say, let's keep the existing limitation: if (nregs > 8) return -ENOTSUPP;
If there is a simple and dumb way to detect that arg9+ are scalars with simple stack passing rules, then, sure, let's support those too, but fancy packed/align(x)/etc let's ignore.
[1] https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf
 
            On 4/24/2025 9:38 PM, Alexis Lothoré wrote:
Hi Xu,
On Thu Apr 24, 2025 at 2:00 PM CEST, Xu Kuohai wrote:
On 4/24/2025 3:24 AM, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 23, 2025 at 7:15 PM CEST, Andrii Nakryiko wrote:
On Thu, Apr 17, 2025 at 12:14 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
On Fri, Apr 11, 2025 at 1:32 PM Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com wrote:
[...]
Thanks for the pointer, I'll take a look at it. The more we discuss this series, the less member size sounds relevant for what I'm trying to achieve here.
Following Xu's comments, I have been thinking about how I could detect the custom alignments and packing on structures, and I was wondering if I could somehow benefit from __attribute__ encoding in BTF info ([1]). But following your hint, I also see some btf_is_struct_packed() in tools/lib/bpf/btf_dump.c that could help. I'll dig this further and see if I can manage to make something work with all of this.
With DWARF info, we might not need to detect the structure alignment anymore, since the DW_AT_location attribute tells us where the structure parameter is located on the stack, and DW_AT_byte_size gives us the size of the structure.
I am not sure to follow you here, because DWARF info is not accessible from kernel at runtime, right ? Or are you meaning that we could, at build time, enrich the BTF info embedded in the kernel thanks to DWARF info ?
Sorry for the confusion.
What I meant is that there are two DWARF attributes, DW_AT_location and DW_AT_byte_size, which tell us the position and size of function parameters.
For the example earlier:
struct s2 { __int128 x; } __attribute__((aligned(64)));
int f2(__int128 a, __int128 b, __int128 c, int64_t d, __int128 e, int64_t f, struct s2 g) { return 0; }
On my build host, the DW_AT_location attributes for "e", "f", and "g" are:
<2><ee>: Abbrev Number: 2 (DW_TAG_formal_parameter) <ef> DW_AT_name : e ... <f6> DW_AT_location : 2 byte block: 91 0 (DW_OP_fbreg: 0)
<2><f9>: Abbrev Number: 2 (DW_TAG_formal_parameter) <fa> DW_AT_name : f ... <101> DW_AT_location : 2 byte block: 91 10 (DW_OP_fbreg: 16)
<2><104>: Abbrev Number: 2 (DW_TAG_formal_parameter) <105> DW_AT_name : g ... <10c> DW_AT_location : 2 byte block: 83 0 (DW_OP_breg19 (x19): 0)
We can see "e" and "f" are at fp+0 and fp+16, but "g" is in x19+0. Disassembly shows x19 holds a 64-byte aligned stack address.
For the two questions you mentioned, I’m not sure if we can access DWARF attributes at runtime. As for adding parameter locations to BTF at building time, I think it means we would need to record CPU-related register info in BTF, which I don’t think is a good idea.
Thanks,
Alexis
 
            On Fri, 2025-04-25 at 17:23 +0800, Xu Kuohai wrote:
[...]
For the two questions you mentioned, I’m not sure if we can access DWARF attributes at runtime. As for adding parameter locations to BTF at building time, I think it means we would need to record CPU-related register info in BTF, which I don’t think is a good idea.
Another option would be for pahole to check if function parameter DW_AT_locaction is placed in accordance with ABI. These flags can be recorded in a dedicated section or smth like this. Having said that, DW_AT_locaction seem to be not very reliable. E.g. for bpf_testmod.ko generated by clang 19.1.7 I don't see DW_AT_locaction specified for parameters a, b, c.
 
            Hi all, a simpler version of this series has been merged, and so I am now taking a look at the issue I have put aside in the merged version: dealing with more specific data layout for arguments passed on stack. For example, a function can pass small structs on stack, but this need special care when generating the corresponding bpf trampolines. Those structs have specific alignment specified by the target ABI, but it can also be altered with attributes packing the structure or modifying the alignment.
Some platforms already support structs on stack (see tracing_struct_many_args test), but as discussed earlier, those may suffer from the same kind of issue mentioned above.
On Wed Apr 23, 2025 at 9:24 PM CEST, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 23, 2025 at 7:15 PM CEST, Andrii Nakryiko wrote:
On Thu, Apr 17, 2025 at 12:14 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
[...]
I'd suggest looking at btf__align_of() in libbpf (tools/lib/bpf/btf.c) to see how we calculate alignment there. It seems to work decently enough. It won't cover any arch-specific extra rules like double needing 16-byte alignment (I vaguely remember something like that for some architectures, but I might be misremembering), or anything similar. It also won't detect (I don't think it's possible without DWARF) artificially increased alignment with attribute((aligned(N))).
Thanks for the pointer, I'll take a look at it. The more we discuss this series, the less member size sounds relevant for what I'm trying to achieve here.
Following Xu's comments, I have been thinking about how I could detect the custom alignments and packing on structures, and I was wondering if I could somehow benefit from __attribute__ encoding in BTF info ([1]). But following your hint, I also see some btf_is_struct_packed() in tools/lib/bpf/btf_dump.c that could help. I'll dig this further and see if I can manage to make something work with all of this.
Andrii's comment above illustrates well my current issue: when functions pass arguments on stack, we are missing info for some of them to correctly build trampolines, especially for struct, which can have attributes like __attribute__((packed)) or __attribute__((align(x))). [1] seems to be a recent solution implemented for BTF to cover this need. IIUC it encodes any arbitratry attribute affecting a data type or function, so if I have some struct like this one in my kernel or a module:
struct foo { short b int a; } __packed;
I would expect the corresponding BTF data to have some BTF_KIND_DECL_TAG describing the "packed" attribute for the corresponding structure, but I fail to find any of those when running:
$ bpftool btf dump file vmlinux format raw
In there I see some DECL_TAG but those are mostly 'bpf_kfunc', I see not arbitrary attribute like 'packed' or 'aligned(x)'.
What I really need to do in the end is to be able to parse those alignments attributes info in the kernel at runtime when generating trampolines, but I hoped to be able to see them with bpftool first to validate the concept.
I started taking a look further at this and stumbled upon [2] in which Alan gives many more details about the feature, so I did the following checks: - kernel version 6.15.0-rc4 from bpf-next_base: it contains the updated Makefile.btf calling pahole with `--btf_features=attributes` - pahole v1.30 $ pahole --supported_btf_features encode_force,var,float,decl_tag,type_tag,enum64,optimized_func,consistent_func,decl_tag_kfuncs,reproducible_build,distilled_base,global_var,attributes This pahole comes from my distro pkg manager, but I have also done the same test with a freshly built pahole, while taking care of pulling the libbpf submodule. - bpftool v7.6.0 bpftool v7.6.0 using libbpf v1.6 features: llvm, skeletons
Could I be missing something obvious ? Or did I misunderstand the actual attribute encoding feature ?
Thanks,
Alexis
[1] https://lore.kernel.org/bpf/20250130201239.1429648-1-ihor.solodrai@linux.dev... [2] https://lore.kernel.org/all/CA+icZUW31vpS=R3zM6G4FMkzuiQovqtd+e-8ihwsK_A-QtS...
 
            On 6/4/25 2:02 AM, Alexis Lothoré wrote:
Hi all, a simpler version of this series has been merged, and so I am now taking a look at the issue I have put aside in the merged version: dealing with more specific data layout for arguments passed on stack. For example, a function can pass small structs on stack, but this need special care when generating the corresponding bpf trampolines. Those structs have specific alignment specified by the target ABI, but it can also be altered with attributes packing the structure or modifying the alignment.
Some platforms already support structs on stack (see tracing_struct_many_args test), but as discussed earlier, those may suffer from the same kind of issue mentioned above.
On Wed Apr 23, 2025 at 9:24 PM CEST, Alexis Lothoré wrote:
Hi Andrii,
On Wed Apr 23, 2025 at 7:15 PM CEST, Andrii Nakryiko wrote:
On Thu, Apr 17, 2025 at 12:14 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Andrii,
On Wed Apr 16, 2025 at 11:24 PM CEST, Andrii Nakryiko wrote:
[...]
I'd suggest looking at btf__align_of() in libbpf (tools/lib/bpf/btf.c) to see how we calculate alignment there. It seems to work decently enough. It won't cover any arch-specific extra rules like double needing 16-byte alignment (I vaguely remember something like that for some architectures, but I might be misremembering), or anything similar. It also won't detect (I don't think it's possible without DWARF) artificially increased alignment with attribute((aligned(N))).
Thanks for the pointer, I'll take a look at it. The more we discuss this series, the less member size sounds relevant for what I'm trying to achieve here.
Following Xu's comments, I have been thinking about how I could detect the custom alignments and packing on structures, and I was wondering if I could somehow benefit from __attribute__ encoding in BTF info ([1]). But following your hint, I also see some btf_is_struct_packed() in tools/lib/bpf/btf_dump.c that could help. I'll dig this further and see if I can manage to make something work with all of this.
Andrii's comment above illustrates well my current issue: when functions pass arguments on stack, we are missing info for some of them to correctly build trampolines, especially for struct, which can have attributes like __attribute__((packed)) or __attribute__((align(x))). [1] seems to be a recent solution implemented for BTF to cover this need. IIUC it encodes any arbitratry attribute affecting a data type or function, so if I have some struct like this one in my kernel or a module:
struct foo { short b int a; } __packed;
I would expect the corresponding BTF data to have some BTF_KIND_DECL_TAG describing the "packed" attribute for the corresponding structure, but I fail to find any of those when running:
$ bpftool btf dump file vmlinux format raw
In there I see some DECL_TAG but those are mostly 'bpf_kfunc', I see not arbitrary attribute like 'packed' or 'aligned(x)'.
What I really need to do in the end is to be able to parse those alignments attributes info in the kernel at runtime when generating trampolines, but I hoped to be able to see them with bpftool first to validate the concept.
I started taking a look further at this and stumbled upon [2] in which Alan gives many more details about the feature, so I did the following checks:
- kernel version 6.15.0-rc4 from bpf-next_base: it contains the updated Makefile.btf calling pahole with `--btf_features=attributes`
- pahole v1.30 $ pahole --supported_btf_features encode_force,var,float,decl_tag,type_tag,enum64,optimized_func,consistent_func,decl_tag_kfuncs,reproducible_build,distilled_base,global_var,attributes This pahole comes from my distro pkg manager, but I have also done the same test with a freshly built pahole, while taking care of pulling the libbpf submodule.
- bpftool v7.6.0 bpftool v7.6.0 using libbpf v1.6 features: llvm, skeletons
Could I be missing something obvious ? Or did I misunderstand the actual attribute encoding feature ?
Hi Alexis.
The changes recently landed in pahole and libbpf re attributes had a very narrow goal: passing through particular attributes for some BPF kfuncs from the kernel source to vmlinux.h
BTF now has a way of encoding any attribute (as opposed to only bpf type/decl tags) by setting type/decl tag kind flag [1]. So it is possible to represent attributes like packed and aligned in BTF.
However, the BTF tags need to be generated by something, in case of vmlinux by pahole. Pahole generates BTF by parsing DWARF. And, as far as I understand, attributes are not (can not be?) represented in DWARF in a generic way, it really depends on specifics of the attribute.
In order to support packed/aligned, pahole needs to know how to figure them out from DWARF input and add the tags to BTF. And this does not happen right now, which is why you don't see anything in bpftool output.
[1] https://lore.kernel.org/bpf/20250130201239.1429648-1-ihor.solodrai@linux.dev...
Thanks,
Alexis
[1] https://lore.kernel.org/bpf/20250130201239.1429648-1-ihor.solodrai@linux.dev... [2] https://lore.kernel.org/all/CA+icZUW31vpS=R3zM6G4FMkzuiQovqtd+e-8ihwsK_A-QtS...
 
            Hi Ihor,
On Wed Jun 4, 2025 at 7:31 PM CEST, Ihor Solodrai wrote:
On 6/4/25 2:02 AM, Alexis Lothoré wrote:
[...]
Could I be missing something obvious ? Or did I misunderstand the actual attribute encoding feature ?
Hi Alexis.
The changes recently landed in pahole and libbpf re attributes had a very narrow goal: passing through particular attributes for some BPF kfuncs from the kernel source to vmlinux.h
BTF now has a way of encoding any attribute (as opposed to only bpf type/decl tags) by setting type/decl tag kind flag [1]. So it is possible to represent attributes like packed and aligned in BTF.
However, the BTF tags need to be generated by something, in case of vmlinux by pahole. Pahole generates BTF by parsing DWARF. And, as far as I understand, attributes are not (can not be?) represented in DWARF in a generic way, it really depends on specifics of the attribute.
In order to support packed/aligned, pahole needs to know how to figure them out from DWARF input and add the tags to BTF. And this does not happen right now, which is why you don't see anything in bpftool output.
[1] https://lore.kernel.org/bpf/20250130201239.1429648-1-ihor.solodrai@linux.dev...
Thanks for the details ! I have missed this possibility, as I have been assuming that DWARF info was exposing the needed info. I'll take a look at it, but if those attributes can not be represented by DWARF, I'll have to find another way of getting those packing/alignment modifications on data type (eg: re-use/share btf__align_of from libbpf, as suggested by Andrii, but it may not able to cover all cases).
Thanks,
Alexis
 
            On Thu, Jun 5, 2025 at 12:35 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Ihor,
On Wed Jun 4, 2025 at 7:31 PM CEST, Ihor Solodrai wrote:
On 6/4/25 2:02 AM, Alexis Lothoré wrote:
[...]
Could I be missing something obvious ? Or did I misunderstand the actual attribute encoding feature ?
Hi Alexis.
The changes recently landed in pahole and libbpf re attributes had a very narrow goal: passing through particular attributes for some BPF kfuncs from the kernel source to vmlinux.h
BTF now has a way of encoding any attribute (as opposed to only bpf type/decl tags) by setting type/decl tag kind flag [1]. So it is possible to represent attributes like packed and aligned in BTF.
However, the BTF tags need to be generated by something, in case of vmlinux by pahole. Pahole generates BTF by parsing DWARF. And, as far as I understand, attributes are not (can not be?) represented in DWARF in a generic way, it really depends on specifics of the attribute.
In order to support packed/aligned, pahole needs to know how to figure them out from DWARF input and add the tags to BTF. And this does not happen right now, which is why you don't see anything in bpftool output.
[1] https://lore.kernel.org/bpf/20250130201239.1429648-1-ihor.solodrai@linux.dev...
Thanks for the details ! I have missed this possibility, as I have been assuming that DWARF info was exposing the needed info. I'll take a look at it, but if those attributes can not be represented by DWARF, I'll have to find another way of getting those packing/alignment modifications on data type (eg: re-use/share btf__align_of from libbpf, as suggested by Andrii, but it may not able to cover all cases).
Not sure all the trouble is worth it. I feel it's a corner case. Something we don't need to fix.
 
            Hi Alexei,
On Thu Jun 5, 2025 at 6:09 PM CEST, Alexei Starovoitov wrote:
On Thu, Jun 5, 2025 at 12:35 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Ihor,
On Wed Jun 4, 2025 at 7:31 PM CEST, Ihor Solodrai wrote:
On 6/4/25 2:02 AM, Alexis Lothoré wrote:
[...]
Thanks for the details ! I have missed this possibility, as I have been assuming that DWARF info was exposing the needed info. I'll take a look at it, but if those attributes can not be represented by DWARF, I'll have to find another way of getting those packing/alignment modifications on data type (eg: re-use/share btf__align_of from libbpf, as suggested by Andrii, but it may not able to cover all cases).
Not sure all the trouble is worth it. I feel it's a corner case. Something we don't need to fix.
TBH I don't own any specific use case really needing this handling, so if it does not feel worth the trouble, I'm fine with not trying to support this. My effort is rather motivated by the goal of aligning the ARM64 features with other platform, and so of getting rid of tools/testing/selftests/bpf/DENYLIST.aarch64.
For the record, this effort also showed that the same kind of issue affects other platforms already supporting many args + structs passed by value ([1]) - structs alignment with specific alignment constraints are not specifically handled (eg: a struct with an __int128 as a top-level member, leading to a 16 byte alignment requirement) - packing and custom alignment is not handled
From there, I could do two different things: 1. do nothing, keep ARM64 as-is with the current version which has been recently merged: ARM64 then denies attachment to any function trying to pass a struct by value on stack. We keep the tracing_struct tests denied for ARM64. Other platforms still allow to attach such functions, but may be parsing wrongly arguments in those specific cases. 2. add the constraint applied on ARM64 (refusing attachment when structs are passed through stack) to other JIT compilers. Then update the tracing_struct test to ensure this specific case is properly denied on all platforms to avoid risking reading wrongly arguments passed through stack when structs or large types are involved.
I tend to think 2. is better, but let me know if you have a different opinion here.
Thanks,
Alexis
 
            On Fri, Jun 6, 2025 at 12:45 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Alexei,
On Thu Jun 5, 2025 at 6:09 PM CEST, Alexei Starovoitov wrote:
On Thu, Jun 5, 2025 at 12:35 AM Alexis Lothoré alexis.lothore@bootlin.com wrote:
Hi Ihor,
On Wed Jun 4, 2025 at 7:31 PM CEST, Ihor Solodrai wrote:
On 6/4/25 2:02 AM, Alexis Lothoré wrote:
[...]
Thanks for the details ! I have missed this possibility, as I have been assuming that DWARF info was exposing the needed info. I'll take a look at it, but if those attributes can not be represented by DWARF, I'll have to find another way of getting those packing/alignment modifications on data type (eg: re-use/share btf__align_of from libbpf, as suggested by Andrii, but it may not able to cover all cases).
Not sure all the trouble is worth it. I feel it's a corner case. Something we don't need to fix.
TBH I don't own any specific use case really needing this handling, so if it does not feel worth the trouble, I'm fine with not trying to support this. My effort is rather motivated by the goal of aligning the ARM64 features with other platform, and so of getting rid of tools/testing/selftests/bpf/DENYLIST.aarch64.
For the record, this effort also showed that the same kind of issue affects other platforms already supporting many args + structs passed by value ([1])
- structs alignment with specific alignment constraints are not specifically handled (eg: a struct with an __int128 as a top-level member, leading to a 16 byte alignment requirement)
- packing and custom alignment is not handled
From there, I could do two different things:
- do nothing, keep ARM64 as-is with the current version which has been
recently merged: ARM64 then denies attachment to any function trying to pass a struct by value on stack. We keep the tracing_struct tests denied for ARM64. Other platforms still allow to attach such functions, but may be parsing wrongly arguments in those specific cases. 2. add the constraint applied on ARM64 (refusing attachment when structs are passed through stack) to other JIT compilers. Then update the tracing_struct test to ensure this specific case is properly denied on all platforms to avoid risking reading wrongly arguments passed through stack when structs or large types are involved.
I tend to think 2. is better, but let me know if you have a different opinion here.
Agree. tracing_struct_many_args is working on x86, but assumptions about BTF being able to express everything about calling convention were not correct, so let's roll back.
 
            From: Xu Kuohai xukuohai@huawei.com
Currently ARM64 bpf trampoline supports up to 8 function arguments. According to the statistics from commit 473e3150e30a ("bpf, x86: allow function arguments up to 12 for TRACING"), there are about 200 functions accept 9 to 12 arguments, so adding support for up to 12 function arguments.
Due to bpf only supporting function arguments up to 16 bytes, according to AAPCS64, starting from the first argument, each argument is first attempted to be loaded to 1 or 2 smallest registers from x0-x7, if there are no enough registers to hold the entire argument, then all remaining arguments starting from this one are pushed to the stack for passing.
Signed-off-by: Xu Kuohai xukuohai@huawei.com Co-developed-by: Alexis Lothoré alexis.lothore@bootlin.com Signed-off-by: Alexis Lothoré alexis.lothore@bootlin.com --- arch/arm64/net/bpf_jit_comp.c | 235 ++++++++++++++++++++++++++++++++---------- 1 file changed, 182 insertions(+), 53 deletions(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index 70d7c89d3ac907798e86e0051e7b472c252c1412..3618f42b1a0019590d47db0ec0fce1f9f51b01af 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -2064,7 +2064,7 @@ bool bpf_jit_supports_subprog_tailcalls(void) }
static void invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l, - int args_off, int retval_off, int run_ctx_off, + int bargs_off, int retval_off, int run_ctx_off, bool save_ret) { __le32 *branch; @@ -2106,7 +2106,7 @@ static void invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l, branch = ctx->image + ctx->idx; emit(A64_NOP, ctx);
- emit(A64_ADD_I(1, A64_R(0), A64_SP, args_off), ctx); + emit(A64_ADD_I(1, A64_R(0), A64_SP, bargs_off), ctx); if (!p->jited) emit_addr_mov_i64(A64_R(1), (const u64)p->insnsi, ctx);
@@ -2131,7 +2131,7 @@ static void invoke_bpf_prog(struct jit_ctx *ctx, struct bpf_tramp_link *l, }
static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl, - int args_off, int retval_off, int run_ctx_off, + int bargs_off, int retval_off, int run_ctx_off, __le32 **branches) { int i; @@ -2141,7 +2141,7 @@ static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl, */ emit(A64_STR64I(A64_ZR, A64_SP, retval_off), ctx); for (i = 0; i < tl->nr_links; i++) { - invoke_bpf_prog(ctx, tl->links[i], args_off, retval_off, + invoke_bpf_prog(ctx, tl->links[i], bargs_off, retval_off, run_ctx_off, true); /* if (*(u64 *)(sp + retval_off) != 0) * goto do_fexit; @@ -2155,23 +2155,142 @@ static void invoke_bpf_mod_ret(struct jit_ctx *ctx, struct bpf_tramp_links *tl, } }
-static void save_args(struct jit_ctx *ctx, int args_off, int nregs) +struct arg_aux { + /* how many args are passed through registers, the rest of the args are + * passed through stack + */ + int args_in_regs; + /* how many registers are used to pass arguments */ + int regs_for_args; + /* how much stack is used for additional args passed to bpf program + * that did not fit in original function registers + **/ + int bstack_for_args; + /* home much stack is used for additional args passed to the + * original function when called from trampoline (this one needs + * arguments to be properly aligned) + */ + int ostack_for_args; + u8 arg_align[MAX_BPF_FUNC_ARGS]; +}; + +static u8 calc_arg_align(const struct btf_func_model *m, int i) +{ + if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) + return max(8, m->arg_largest_member_size[i]); + return max(8, m->arg_size[i]); +} + +static void calc_arg_aux(const struct btf_func_model *m, + struct arg_aux *a) { int i; + int nregs; + int slots; + int stack_slots; + + for (i = 0; i < m->nr_args; i++) + a->arg_align[i] = calc_arg_align(m, i); + + /* verifier ensures m->nr_args <= MAX_BPF_FUNC_ARGS */ + for (i = 0, nregs = 0; i < m->nr_args; i++) { + slots = (m->arg_size[i] + 7) / 8; + if (nregs + slots <= 8) /* passed through register ? */ + nregs += slots; + else + break; + }
- for (i = 0; i < nregs; i++) { - emit(A64_STR64I(i, A64_SP, args_off), ctx); - args_off += 8; + a->args_in_regs = i; + a->regs_for_args = nregs; + a->ostack_for_args = 0; + + /* the rest arguments are passed through stack */ + for (a->ostack_for_args = 0, a->bstack_for_args = 0; + i < m->nr_args; i++) { + stack_slots = (m->arg_size[i] + 7) / 8; + /* AAPCS 64 C.14: arguments passed on stack must be aligned to + * max(8, arg_natural_alignment) + */ + a->bstack_for_args += stack_slots * 8; + a->ostack_for_args = round_up(a->ostack_for_args + + stack_slots * 8, a->arg_align[i]); + } +} + +static void clear_garbage(struct jit_ctx *ctx, int reg, int effective_bytes) +{ + if (effective_bytes) { + int garbage_bits = 64 - 8 * effective_bytes; +#ifdef CONFIG_CPU_BIG_ENDIAN + /* garbage bits are at the right end */ + emit(A64_LSR(1, reg, reg, garbage_bits), ctx); + emit(A64_LSL(1, reg, reg, garbage_bits), ctx); +#else + /* garbage bits are at the left end */ + emit(A64_LSL(1, reg, reg, garbage_bits), ctx); + emit(A64_LSR(1, reg, reg, garbage_bits), ctx); +#endif } }
-static void restore_args(struct jit_ctx *ctx, int args_off, int nregs) +static void save_args(struct jit_ctx *ctx, int bargs_off, int oargs_off, + const struct btf_func_model *m, + const struct arg_aux *a, + bool for_call_origin) { int i; + int reg; + int doff; + int soff; + int slots; + u8 tmp = bpf2a64[TMP_REG_1]; + + /* store arguments to the stack for the bpf program, or restore + * arguments from stack for the original function + */ + for (reg = 0; reg < a->regs_for_args; reg++) { + emit(for_call_origin ? + A64_LDR64I(reg, A64_SP, bargs_off) : + A64_STR64I(reg, A64_SP, bargs_off), + ctx); + bargs_off += 8; + } + + soff = 32; /* on stack arguments start from FP + 32 */ + doff = (for_call_origin ? oargs_off : bargs_off); + + /* save on stack arguments */ + for (i = a->args_in_regs; i < m->nr_args; i++) { + slots = (m->arg_size[i] + 7) / 8; + /* AAPCS C.14: additional arguments on stack must be + * aligned on max(8, arg_natural_alignment) + */ + soff = round_up(soff, a->arg_align[i]); + if (for_call_origin) + doff = round_up(doff, a->arg_align[i]); + /* verifier ensures arg_size <= 16, so slots equals 1 or 2 */ + while (slots-- > 0) { + emit(A64_LDR64I(tmp, A64_FP, soff), ctx); + /* if there is unused space in the last slot, clear + * the garbage contained in the space. + */ + if (slots == 0 && !for_call_origin) + clear_garbage(ctx, tmp, m->arg_size[i] % 8); + emit(A64_STR64I(tmp, A64_SP, doff), ctx); + soff += 8; + doff += 8; + } + } +}
- for (i = 0; i < nregs; i++) { - emit(A64_LDR64I(i, A64_SP, args_off), ctx); - args_off += 8; +static void restore_args(struct jit_ctx *ctx, int bargs_off, int nregs) +{ + int reg; + + for (reg = 0; reg < nregs; reg++) { + emit(A64_LDR64I(reg, A64_SP, bargs_off), ctx); + bargs_off += 8; } }
@@ -2194,17 +2313,21 @@ static bool is_struct_ops_tramp(const struct bpf_tramp_links *fentry_links) */ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, struct bpf_tramp_links *tlinks, void *func_addr, - int nregs, u32 flags) + const struct btf_func_model *m, + const struct arg_aux *a, + u32 flags) { int i; int stack_size; int retaddr_off; int regs_off; int retval_off; - int args_off; - int nregs_off; + int bargs_off; + int nfuncargs_off; int ip_off; int run_ctx_off; + int oargs_off; + int nfuncargs; struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY]; struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT]; struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN]; @@ -2213,31 +2336,38 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, bool is_struct_ops = is_struct_ops_tramp(fentry);
/* trampoline stack layout: - * [ parent ip ] - * [ FP ] - * SP + retaddr_off [ self ip ] - * [ FP ] + * [ parent ip ] + * [ FP ] + * SP + retaddr_off [ self ip ] + * [ FP ] * - * [ padding ] align SP to multiples of 16 + * [ padding ] align SP to multiples of 16 * - * [ x20 ] callee saved reg x20 - * SP + regs_off [ x19 ] callee saved reg x19 + * [ x20 ] callee saved reg x20 + * SP + regs_off [ x19 ] callee saved reg x19 * - * SP + retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or - * BPF_TRAMP_F_RET_FENTRY_RET + * SP + retval_off [ return value ] BPF_TRAMP_F_CALL_ORIG or + * BPF_TRAMP_F_RET_FENTRY_RET + * [ arg reg N ] + * [ ... ] + * SP + bargs_off [ arg reg 1 ] for bpf * - * [ arg reg N ] - * [ ... ] - * SP + args_off [ arg reg 1 ] + * SP + nfuncargs_off [ arg regs count ] * - * SP + nregs_off [ arg regs count ] + * SP + ip_off [ traced function ] BPF_TRAMP_F_IP_ARG flag * - * SP + ip_off [ traced function ] BPF_TRAMP_F_IP_ARG flag + * SP + run_ctx_off [ bpf_tramp_run_ctx ] * - * SP + run_ctx_off [ bpf_tramp_run_ctx ] + * [ stack arg N ] + * [ ... ] + * SP + oargs_off [ stack arg 1 ] for original func */
stack_size = 0; + oargs_off = stack_size; + if (flags & BPF_TRAMP_F_CALL_ORIG) + stack_size += a->ostack_for_args; + run_ctx_off = stack_size; /* room for bpf_tramp_run_ctx */ stack_size += round_up(sizeof(struct bpf_tramp_run_ctx), 8); @@ -2247,13 +2377,14 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, if (flags & BPF_TRAMP_F_IP_ARG) stack_size += 8;
- nregs_off = stack_size; + nfuncargs_off = stack_size; /* room for args count */ stack_size += 8;
- args_off = stack_size; + bargs_off = stack_size; /* room for args */ - stack_size += nregs * 8; + nfuncargs = a->regs_for_args + a->bstack_for_args / 8; + stack_size += 8 * nfuncargs;
/* room for return value */ retval_off = stack_size; @@ -2300,11 +2431,11 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, }
/* save arg regs count*/ - emit(A64_MOVZ(1, A64_R(10), nregs, 0), ctx); - emit(A64_STR64I(A64_R(10), A64_SP, nregs_off), ctx); + emit(A64_MOVZ(1, A64_R(10), nfuncargs, 0), ctx); + emit(A64_STR64I(A64_R(10), A64_SP, nfuncargs_off), ctx);
- /* save arg regs */ - save_args(ctx, args_off, nregs); + /* save args for bpf */ + save_args(ctx, bargs_off, oargs_off, m, a, false);
/* save callee saved registers */ emit(A64_STR64I(A64_R(19), A64_SP, regs_off), ctx); @@ -2320,7 +2451,7 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, }
for (i = 0; i < fentry->nr_links; i++) - invoke_bpf_prog(ctx, fentry->links[i], args_off, + invoke_bpf_prog(ctx, fentry->links[i], bargs_off, retval_off, run_ctx_off, flags & BPF_TRAMP_F_RET_FENTRY_RET);
@@ -2330,12 +2461,13 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, if (!branches) return -ENOMEM;
- invoke_bpf_mod_ret(ctx, fmod_ret, args_off, retval_off, + invoke_bpf_mod_ret(ctx, fmod_ret, bargs_off, retval_off, run_ctx_off, branches); }
if (flags & BPF_TRAMP_F_CALL_ORIG) { - restore_args(ctx, args_off, nregs); + /* save args for original func */ + save_args(ctx, bargs_off, oargs_off, m, a, true); /* call original func */ emit(A64_LDR64I(A64_R(10), A64_SP, retaddr_off), ctx); emit(A64_ADR(A64_LR, AARCH64_INSN_SIZE * 2), ctx); @@ -2354,7 +2486,7 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, }
for (i = 0; i < fexit->nr_links; i++) - invoke_bpf_prog(ctx, fexit->links[i], args_off, retval_off, + invoke_bpf_prog(ctx, fexit->links[i], bargs_off, retval_off, run_ctx_off, false);
if (flags & BPF_TRAMP_F_CALL_ORIG) { @@ -2368,7 +2500,7 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im, }
if (flags & BPF_TRAMP_F_RESTORE_REGS) - restore_args(ctx, args_off, nregs); + restore_args(ctx, bargs_off, a->regs_for_args);
/* restore callee saved register x19 and x20 */ emit(A64_LDR64I(A64_R(19), A64_SP, regs_off), ctx); @@ -2428,14 +2560,13 @@ int arch_bpf_trampoline_size(const struct btf_func_model *m, u32 flags, .idx = 0, }; struct bpf_tramp_image im; + struct arg_aux aaux; int nregs, ret;
nregs = btf_func_model_nregs(m); - /* the first 8 registers are used for arguments */ - if (nregs > 8) - return -ENOTSUPP;
- ret = prepare_trampoline(&ctx, &im, tlinks, func_addr, nregs, flags); + calc_arg_aux(m, &aaux); + ret = prepare_trampoline(&ctx, &im, tlinks, func_addr, m, &aaux, flags); if (ret < 0) return ret;
@@ -2462,9 +2593,10 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image, u32 flags, struct bpf_tramp_links *tlinks, void *func_addr) { - int ret, nregs; - void *image, *tmp; u32 size = ro_image_end - ro_image; + struct arg_aux aaux; + void *image, *tmp; + int ret;
/* image doesn't need to be in module memory range, so we can * use kvmalloc. @@ -2480,13 +2612,10 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *ro_image, .write = true, };
- nregs = btf_func_model_nregs(m); - /* the first 8 registers are used for arguments */ - if (nregs > 8) - return -ENOTSUPP;
jit_fill_hole(image, (unsigned int)(ro_image_end - ro_image)); - ret = prepare_trampoline(&ctx, im, tlinks, func_addr, nregs, flags); + calc_arg_aux(m, &aaux); + ret = prepare_trampoline(&ctx, im, tlinks, func_addr, m, &aaux, flags);
if (ret > 0 && validate_code(&ctx) < 0) { ret = -EINVAL;
 
            When dealing with large types (>8 bytes), ARM64 trampolines need to take extra care about the arguments alignment to respect the calling convention set by AAPCS64.
Add two tests ensuring that the BPF trampoline arranges arguments with the relevant layout. The two new tests involve almost the same arguments, except that the second one requires a more specific alignment to be set by the trampoline when preparing arguments before calling the the target function.
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com --- .../selftests/bpf/prog_tests/tracing_struct.c | 23 ++++++++ tools/testing/selftests/bpf/progs/tracing_struct.c | 10 +++- .../selftests/bpf/progs/tracing_struct_many_args.c | 67 ++++++++++++++++++++++ .../testing/selftests/bpf/test_kmods/bpf_testmod.c | 50 ++++++++++++++++ 4 files changed, 149 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/tracing_struct.c b/tools/testing/selftests/bpf/prog_tests/tracing_struct.c index 19e68d4b353278bf8e2917e62f62c89d14d7fe80..ecb5d38539f8b2fc275f93713ce2a7aad908b929 100644 --- a/tools/testing/selftests/bpf/prog_tests/tracing_struct.c +++ b/tools/testing/selftests/bpf/prog_tests/tracing_struct.c @@ -108,6 +108,29 @@ static void test_struct_many_args(void) ASSERT_EQ(skel->bss->t9_i, 27, "t9:i"); ASSERT_EQ(skel->bss->t9_ret, 258, "t9 ret");
+ ASSERT_EQ(skel->bss->t10_a_a, 27, "t10:a.a"); + ASSERT_EQ(skel->bss->t10_a_b, 28, "t10:a.b"); + ASSERT_EQ(skel->bss->t10_b_a, 29, "t10:b.a"); + ASSERT_EQ(skel->bss->t10_b_b, 30, "t10:b.b"); + ASSERT_EQ(skel->bss->t10_c_a, 31, "t10:c.a"); + ASSERT_EQ(skel->bss->t10_c_b, 32, "t10:c.b"); + ASSERT_EQ(skel->bss->t10_d_a, 33, "t10:d.a"); + ASSERT_EQ(skel->bss->t10_d_b, 34, "t10:d.b"); + ASSERT_EQ(skel->bss->t10_e, 35, "t10:e"); + ASSERT_EQ(skel->bss->t10_f_a, 36, "t10:f.a"); + ASSERT_EQ(skel->bss->t10_f_b, 37, "t10:f.b"); + + ASSERT_EQ(skel->bss->t10_ret, 352, "t10 ret"); + + ASSERT_EQ(skel->bss->t11_a, 38, "t11:a"); + ASSERT_EQ(skel->bss->t11_b, 39, "t11:b"); + ASSERT_EQ(skel->bss->t11_c, 40, "t11:c"); + ASSERT_EQ(skel->bss->t11_d, 41, "t11:d"); + ASSERT_EQ(skel->bss->t11_e, 42, "t11:e"); + ASSERT_EQ(skel->bss->t11_f, 43, "t11:f"); + + ASSERT_EQ(skel->bss->t11_ret, 243, "t11 ret"); + destroy_skel: tracing_struct_many_args__destroy(skel); } diff --git a/tools/testing/selftests/bpf/progs/tracing_struct.c b/tools/testing/selftests/bpf/progs/tracing_struct.c index c435a3a8328ab1580c63967a7f0ab779aa7ca4d4..feabf618a2e011b0d08eeaa6cc09fba1922ecb3f 100644 --- a/tools/testing/selftests/bpf/progs/tracing_struct.c +++ b/tools/testing/selftests/bpf/progs/tracing_struct.c @@ -18,6 +18,15 @@ struct bpf_testmod_struct_arg_3 { int b[]; };
+struct bpf_testmod_struct_arg_4 { + __u64 a; + __u64 b; +}; + +struct bpf_testmod_struct_arg_5 { + __int128 a; +}; + long t1_a_a, t1_a_b, t1_b, t1_c, t1_ret, t1_nregs; __u64 t1_reg0, t1_reg1, t1_reg2, t1_reg3; long t2_a, t2_b_a, t2_b_b, t2_c, t2_ret; @@ -25,7 +34,6 @@ long t3_a, t3_b, t3_c_a, t3_c_b, t3_ret; long t4_a_a, t4_b, t4_c, t4_d, t4_e_a, t4_e_b, t4_ret; long t5_ret; int t6; - SEC("fentry/bpf_testmod_test_struct_arg_1") int BPF_PROG2(test_struct_arg_1, struct bpf_testmod_struct_arg_2, a, int, b, int, c) { diff --git a/tools/testing/selftests/bpf/progs/tracing_struct_many_args.c b/tools/testing/selftests/bpf/progs/tracing_struct_many_args.c index 4742012ace06af949d7f15a21131aaef7ab006e4..6f086b3d32d5f89e426aa4b79daa24bdb42861db 100644 --- a/tools/testing/selftests/bpf/progs/tracing_struct_many_args.c +++ b/tools/testing/selftests/bpf/progs/tracing_struct_many_args.c @@ -18,6 +18,14 @@ struct bpf_testmod_struct_arg_5 { long t7_a, t7_b, t7_c, t7_d, t7_e, t7_f_a, t7_f_b, t7_ret; long t8_a, t8_b, t8_c, t8_d, t8_e, t8_f_a, t8_f_b, t8_g, t8_ret; long t9_a, t9_b, t9_c, t9_d, t9_e, t9_f, t9_g, t9_h_a, t9_h_b, t9_h_c, t9_h_d, t9_i, t9_ret; +__u64 t10_a_a, t10_a_b, t10_b_a, t10_b_b, t10_c_a, t10_c_b, t10_d_a, t10_d_b, + t10_f_a, t10_f_b; +short t10_e; +int t10_ret; +__int128 t11_a, t11_b, t11_c, t11_d, t11_f; +short t11_e; +int t11_ret; +
SEC("fentry/bpf_testmod_test_struct_arg_7") int BPF_PROG2(test_struct_many_args_1, __u64, a, void *, b, short, c, int, d, @@ -92,4 +100,63 @@ int BPF_PROG2(test_struct_many_args_6, __u64, a, void *, b, short, c, int, d, vo return 0; }
+SEC("fentry/bpf_testmod_test_struct_arg_10") +int BPF_PROG2(test_struct_many_args_7, struct bpf_testmod_struct_arg_4, a, + struct bpf_testmod_struct_arg_4, b, + struct bpf_testmod_struct_arg_4, c, + struct bpf_testmod_struct_arg_4, d, short, e, + struct bpf_testmod_struct_arg_4, f) +{ + t10_a_a = a.a; + t10_a_b = a.b; + t10_b_a = b.a; + t10_b_b = b.b; + t10_c_a = c.a; + t10_c_b = c.b; + t10_d_a = d.a; + t10_d_b = d.b; + t10_e = e; + t10_f_a = f.a; + t10_f_b = f.b; + return 0; +} + +SEC("fexit/bpf_testmod_test_struct_arg_10") +int BPF_PROG2(test_struct_many_args_8, struct bpf_testmod_struct_arg_4, a, + struct bpf_testmod_struct_arg_4, b, + struct bpf_testmod_struct_arg_4, c, + struct bpf_testmod_struct_arg_4, d, short, e, + struct bpf_testmod_struct_arg_4, f, int, ret) +{ + t10_ret = ret; + return 0; +} + +SEC("fentry/bpf_testmod_test_struct_arg_11") +int BPF_PROG2(test_struct_many_args_9, struct bpf_testmod_struct_arg_5, a, + struct bpf_testmod_struct_arg_5, b, + struct bpf_testmod_struct_arg_5, c, + struct bpf_testmod_struct_arg_5, d, int, e, + struct bpf_testmod_struct_arg_5, f) +{ + t11_a = a.a; + t11_b = b.a; + t11_c = c.a; + t11_d = d.a; + t11_e = e; + t11_f = f.a; + return 0; +} + +SEC("fexit/bpf_testmod_test_struct_arg_11") +int BPF_PROG2(test_struct_many_args_10, struct bpf_testmod_struct_arg_5, a, + struct bpf_testmod_struct_arg_5, b, + struct bpf_testmod_struct_arg_5, c, + struct bpf_testmod_struct_arg_5, d, int, e, + struct bpf_testmod_struct_arg_5, f, int, ret) +{ + t11_ret = ret; + return 0; +} + char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c index f38eaf0d35efa712ec288f06983c86b02c0d3e0e..96c80da725c978f2e48df8602dd63155971a7bf6 100644 --- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c +++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c @@ -62,6 +62,15 @@ struct bpf_testmod_struct_arg_5 { long d; };
+struct bpf_testmod_struct_arg_6 { + u64 a; + u64 b; +}; + +struct bpf_testmod_struct_arg_7 { + __int128 a; +}; + __bpf_hook_start();
noinline int @@ -128,6 +137,29 @@ bpf_testmod_test_struct_arg_9(u64 a, void *b, short c, int d, void *e, char f, return bpf_testmod_test_struct_arg_result; }
+noinline int bpf_testmod_test_struct_arg_10(struct bpf_testmod_struct_arg_6 a, + struct bpf_testmod_struct_arg_6 b, + struct bpf_testmod_struct_arg_6 c, + struct bpf_testmod_struct_arg_6 d, + short e, + struct bpf_testmod_struct_arg_6 f) +{ + bpf_testmod_test_struct_arg_result = + a.a + a.b + b.a + b.b + c.a + c.b + d.a + d.b + e + f.a + f.b; + return bpf_testmod_test_struct_arg_result; +} + +noinline int bpf_testmod_test_struct_arg_11(struct bpf_testmod_struct_arg_7 a, + struct bpf_testmod_struct_arg_7 b, + struct bpf_testmod_struct_arg_7 c, + struct bpf_testmod_struct_arg_7 d, + short e, + struct bpf_testmod_struct_arg_7 f) +{ + bpf_testmod_test_struct_arg_result = a.a + b.a + c.a + d.a + e + f.a; + return bpf_testmod_test_struct_arg_result; +} + noinline int bpf_testmod_test_arg_ptr_to_struct(struct bpf_testmod_struct_arg_1 *a) { bpf_testmod_test_struct_arg_result = a->a; @@ -394,6 +426,16 @@ bpf_testmod_test_read(struct file *file, struct kobject *kobj, struct bpf_testmod_struct_arg_3 *struct_arg3; struct bpf_testmod_struct_arg_4 struct_arg4 = {21, 22}; struct bpf_testmod_struct_arg_5 struct_arg5 = {23, 24, 25, 26}; + struct bpf_testmod_struct_arg_6 struct_arg6_a = {27, 28}; + struct bpf_testmod_struct_arg_6 struct_arg6_b = {29, 30}; + struct bpf_testmod_struct_arg_6 struct_arg6_c = {31, 32}; + struct bpf_testmod_struct_arg_6 struct_arg6_d = {33, 34}; + struct bpf_testmod_struct_arg_6 struct_arg6_f = {36, 37}; + struct bpf_testmod_struct_arg_7 struct_arg7_a = {38}; + struct bpf_testmod_struct_arg_7 struct_arg7_b = {39}; + struct bpf_testmod_struct_arg_7 struct_arg7_c = {40}; + struct bpf_testmod_struct_arg_7 struct_arg7_d = {41}; + struct bpf_testmod_struct_arg_7 struct_arg7_f = {43}; int i = 1;
while (bpf_testmod_return_ptr(i)) @@ -411,6 +453,14 @@ bpf_testmod_test_read(struct file *file, struct kobject *kobj, (void)bpf_testmod_test_struct_arg_9(16, (void *)17, 18, 19, (void *)20, 21, 22, struct_arg5, 27);
+ (void)bpf_testmod_test_struct_arg_10(struct_arg6_a, struct_arg6_b, + struct_arg6_c, struct_arg6_d, 35, + struct_arg6_f); + + (void)bpf_testmod_test_struct_arg_11(struct_arg7_a, struct_arg7_b, + struct_arg7_c, struct_arg7_d, 42, + struct_arg7_f); + (void)bpf_testmod_test_arg_ptr_to_struct(&struct_arg1_2);
(void)trace_bpf_testmod_test_raw_tp_null(NULL);
 
            On Fri, 2025-04-11 at 22:32 +0200, Alexis Lothoré (eBPF Foundation) wrote:
When dealing with large types (>8 bytes), ARM64 trampolines need to take extra care about the arguments alignment to respect the calling convention set by AAPCS64.
Add two tests ensuring that the BPF trampoline arranges arguments with the relevant layout. The two new tests involve almost the same arguments, except that the second one requires a more specific alignment to be set by the trampoline when preparing arguments before calling the the target function.
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com
[...]
+SEC("fentry/bpf_testmod_test_struct_arg_11") +int BPF_PROG2(test_struct_many_args_9, struct bpf_testmod_struct_arg_5, a,
struct bpf_testmod_struct_arg_5, b,
struct bpf_testmod_struct_arg_5, c,
struct bpf_testmod_struct_arg_5, d, int, e,
struct bpf_testmod_struct_arg_5, f)
Hello Alexis,
I'm trying to double check the error you've seen for x86. I see that tracing_struct/struct_many_args fails with assertion: "test_struct_many_args:FAIL:t11:f unexpected t11:f: actual 35 != expected 43". Could you please help me understand this test? The function listened to is defined as accepting 'struct bpf_testmod_struct_arg_7', at the same time this function uses 'struct bpf_testmod_struct_arg_5'. Nevertheless, the assertion persists even with correct types.
+{
- t11_a = a.a;
- t11_b = b.a;
- t11_c = c.a;
- t11_d = d.a;
- t11_e = e;
- t11_f = f.a;
- return 0;
+}
[...]
 
            Hello Eduard,
On Mon Apr 28, 2025 at 9:01 AM CEST, Eduard Zingerman wrote:
On Fri, 2025-04-11 at 22:32 +0200, Alexis Lothoré (eBPF Foundation) wrote:
When dealing with large types (>8 bytes), ARM64 trampolines need to take extra care about the arguments alignment to respect the calling convention set by AAPCS64.
Add two tests ensuring that the BPF trampoline arranges arguments with the relevant layout. The two new tests involve almost the same arguments, except that the second one requires a more specific alignment to be set by the trampoline when preparing arguments before calling the the target function.
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com
[...]
+SEC("fentry/bpf_testmod_test_struct_arg_11") +int BPF_PROG2(test_struct_many_args_9, struct bpf_testmod_struct_arg_5, a,
struct bpf_testmod_struct_arg_5, b,
struct bpf_testmod_struct_arg_5, c,
struct bpf_testmod_struct_arg_5, d, int, e,
struct bpf_testmod_struct_arg_5, f)Hello Alexis,
I'm trying to double check the error you've seen for x86.
Thanks for taking a look at this.
I see that tracing_struct/struct_many_args fails with assertion: "test_struct_many_args:FAIL:t11:f unexpected t11:f: actual 35 != expected 43". Could you please help me understand this test?
Sure. When we attach fentry/fexit programs to a kernel function, a small trampoline is generated to handle the attached programs execution. This trampoline has to: 1. properly read and aggregate the functions arguments into a bpf tracing context. Those arguments comes from registers, and possibly from the stack if not all function arguments fit in registers 2. if the trampoline is in charge of calling the target function (that's the case for example when we have both a fentry and a fexit programs attached), it must prepare the arguments for the target function, by filling again the registers, and possibly pushing the additional arguments on the stack at the relevant offset.
This test is about validating the first point: ensuring that the trampoline has correctly read the initial arguments from the target function and forwarded them to the fentry program. So the actual test triggers the targeted function, it triggers the attached fentry program which record the values passed by the trampoline in t11_a, t11_b, etc, and then the test checks back that those "spied" values are the one that it has actually passed when executing the target function.
The function listened to is defined as accepting 'struct bpf_testmod_struct_arg_7', at the same time this function uses 'struct bpf_testmod_struct_arg_5'.
That's not an accidental mistake, those are in fact the same definition. bpf_testmod_struct_arg_7 is the kernel side definition in bpf_testmod.c:
struct bpf_testmod_struct_arg_7 { __int128 a; };
and struct bpf_testmode_struct_arg_5 is the one defined in the bpf test program:
struct bpf_testmod_struct_arg_5 { __int128 a; };
I agree while being correct, this is confusing :/ I have kind of blindly applied the logic I have observed in here for declaring test structs, and so declared it twice (in kernel part and in bpf part), just incrementing the index of the last defined structure. But I guess it could benefit from some index syncing, or better, some refactoring to properly share those declarations. I'll think about it for a new rev.
Nevertheless, the assertion persists even with correct types.
So I digged a bit further to better share my observations here. This is the function stack when entering the trampoline after having triggered the target function execution:
(gdb) x/64b $rbp+0x18 0xffffc9000015fd60: 41 0 0 0 0 0 0 0 0xffffc9000015fd68: 0 0 0 0 0 0 0 0 0xffffc9000015fd70: 42 0 0 0 0 0 0 0 0xffffc9000015fd78: 35 0 0 0 0 0 0 0 0xffffc9000015fd80: 43 0 0 0 0 0 0 0 0xffffc9000015fd88: 0 0 0 0 0 0 0 0
We see the arguments that did not fit in registers, so d, e and f.
This is the ebpf context generated by the trampoline for the fentry program, from the content of the stack above + the registers:
(gdb) x/128b $rbp-60 0xffffc9000015fce8: 38 0 0 0 0 0 0 0 0xffffc9000015fcf0: 0 0 0 0 0 0 0 0 0xffffc9000015fcf8: 39 0 0 0 0 0 0 0 0xffffc9000015fd00: 0 0 0 0 0 0 0 0 0xffffc9000015fd08: 40 0 0 0 0 0 0 0 0xffffc9000015fd10: 0 0 0 0 0 0 0 0 0xffffc9000015fd18: 41 0 0 0 0 0 0 0 0xffffc9000015fd20: 0 0 0 0 0 0 0 0 0xffffc9000015fd28: 42 0 0 0 0 0 0 0 0xffffc9000015fd30: 35 0 0 0 0 0 0 0 0xffffc9000015fd38: 43 0 0 0 0 0 0 0 0xffffc9000015fd40: 37 0 0 0 0 0 0 0
So IIUC, this is wrong because the "e" variable in the bpf program being an int (and about to receive value 42), it occupies only 1 "tracing context 8-byte slot", so the value 43 (representing the content for variable f), should be right after it, at 0xffffc9000015fd30. What we have instead is a hole, very likely because we copied silently the alignment from the original function call (and I guess this 35 value is a remnant from the previous test, which uses values from 27 to 37)
Regardless of this issue, based on discussion from last week, I think I'll go for the implementation suggested by Alexei: handling the nominal cases, and detecting and blocking the non trivial cases (eg: structs passed on stack). It sounds reasonable as there seems to be no exisiting kernel function currently able to trigger those very specific cases, so it could be added later if this changes.
Alexis
 
            Alexis Lothoré alexis.lothore@bootlin.com writes:
[...]
The function listened to is defined as accepting 'struct bpf_testmod_struct_arg_7', at the same time this function uses 'struct bpf_testmod_struct_arg_5'.
That's not an accidental mistake, those are in fact the same definition. bpf_testmod_struct_arg_7 is the kernel side definition in bpf_testmod.c:
struct bpf_testmod_struct_arg_7 { __int128 a; };
and struct bpf_testmode_struct_arg_5 is the one defined in the bpf test program:
struct bpf_testmod_struct_arg_5 { __int128 a; };
Apologies, but I'm still confused: - I apply this series on top of: 224ee86639f5 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc4")
- line 12 of tracing_struct_many_args.c has the following definition:
struct bpf_testmod_struct_arg_5 { char a; short b; int c; long d; };
- line 135 of the same file has the following definition:
SEC("fentry/bpf_testmod_test_struct_arg_11") int BPF_PROG2(test_struct_many_args_9, struct bpf_testmod_struct_arg_5, a, struct bpf_testmod_struct_arg_5, b, struct bpf_testmod_struct_arg_5, c, struct bpf_testmod_struct_arg_5, d, int, e, struct bpf_testmod_struct_arg_5, f)
- line 70 of tools/testing/selftests/bpf/test_kmods/bpf_testmod.c:
struct bpf_testmod_struct_arg_7 { __int128 a; };
- line 152 of the same file:
noinline int bpf_testmod_test_struct_arg_11(struct bpf_testmod_struct_arg_7 a, struct bpf_testmod_struct_arg_7 b, struct bpf_testmod_struct_arg_7 c, struct bpf_testmod_struct_arg_7 d, short e, struct bpf_testmod_struct_arg_7 f)
Do I use a wrong base to apply the series?
[...]
Nevertheless, the assertion persists even with correct types.
So I digged a bit further to better share my observations here. This is the function stack when entering the trampoline after having triggered the target function execution:
(gdb) x/64b $rbp+0x18 0xffffc9000015fd60: 41 0 0 0 0 0 0 0 0xffffc9000015fd68: 0 0 0 0 0 0 0 0 0xffffc9000015fd70: 42 0 0 0 0 0 0 0 0xffffc9000015fd78: 35 0 0 0 0 0 0 0 0xffffc9000015fd80: 43 0 0 0 0 0 0 0 0xffffc9000015fd88: 0 0 0 0 0 0 0 0
We see the arguments that did not fit in registers, so d, e and f.
This is the ebpf context generated by the trampoline for the fentry program, from the content of the stack above + the registers:
(gdb) x/128b $rbp-60 0xffffc9000015fce8: 38 0 0 0 0 0 0 0 0xffffc9000015fcf0: 0 0 0 0 0 0 0 0 0xffffc9000015fcf8: 39 0 0 0 0 0 0 0 0xffffc9000015fd00: 0 0 0 0 0 0 0 0 0xffffc9000015fd08: 40 0 0 0 0 0 0 0 0xffffc9000015fd10: 0 0 0 0 0 0 0 0 0xffffc9000015fd18: 41 0 0 0 0 0 0 0 0xffffc9000015fd20: 0 0 0 0 0 0 0 0 0xffffc9000015fd28: 42 0 0 0 0 0 0 0 0xffffc9000015fd30: 35 0 0 0 0 0 0 0 0xffffc9000015fd38: 43 0 0 0 0 0 0 0 0xffffc9000015fd40: 37 0 0 0 0 0 0 0
So IIUC, this is wrong because the "e" variable in the bpf program being an int (and about to receive value 42), it occupies only 1 "tracing context 8-byte slot", so the value 43 (representing the content for variable f), should be right after it, at 0xffffc9000015fd30. What we have instead is a hole, very likely because we copied silently the alignment from the original function call (and I guess this 35 value is a remnant from the previous test, which uses values from 27 to 37)
Interesting, thank you for the print outs.
Regardless of this issue, based on discussion from last week, I think I'll go for the implementation suggested by Alexei: handling the nominal cases, and detecting and blocking the non trivial cases (eg: structs passed on stack). It sounds reasonable as there seems to be no exisiting kernel function currently able to trigger those very specific cases, so it could be added later if this changes.
Yes, this makes sense.
 
            On Mon Apr 28, 2025 at 6:52 PM CEST, Eduard Zingerman wrote:
Alexis Lothoré alexis.lothore@bootlin.com writes:
[...]
The function listened to is defined as accepting 'struct bpf_testmod_struct_arg_7', at the same time this function uses 'struct bpf_testmod_struct_arg_5'.
That's not an accidental mistake, those are in fact the same definition. bpf_testmod_struct_arg_7 is the kernel side definition in bpf_testmod.c:
struct bpf_testmod_struct_arg_7 { __int128 a; };
and struct bpf_testmode_struct_arg_5 is the one defined in the bpf test program:
struct bpf_testmod_struct_arg_5 { __int128 a; };
Apologies, but I'm still confused:
I apply this series on top of: 224ee86639f5 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc4")
line 12 of tracing_struct_many_args.c has the following definition:
struct bpf_testmod_struct_arg_5 { char a; short b; int c; long d; };
line 135 of the same file has the following definition:
SEC("fentry/bpf_testmod_test_struct_arg_11") int BPF_PROG2(test_struct_many_args_9, struct bpf_testmod_struct_arg_5, a, struct bpf_testmod_struct_arg_5, b, struct bpf_testmod_struct_arg_5, c, struct bpf_testmod_struct_arg_5, d, int, e, struct bpf_testmod_struct_arg_5, f)
line 70 of tools/testing/selftests/bpf/test_kmods/bpf_testmod.c:
struct bpf_testmod_struct_arg_7 { __int128 a; };
line 152 of the same file:
noinline int bpf_testmod_test_struct_arg_11(struct bpf_testmod_struct_arg_7 a, struct bpf_testmod_struct_arg_7 b, struct bpf_testmod_struct_arg_7 c, struct bpf_testmod_struct_arg_7 d, short e, struct bpf_testmod_struct_arg_7 f)
Do I use a wrong base to apply the series?
Argh, no, no, you are right, I checked again and I made some confusions between progs/tracing_struct.c and progs/tracing_struct_many_args.c. I initially did most of the work in tracing_struct.c, and eventually moved the code to tracing_struct_many_args.c before sending my series, but I apparently forgot to move bpf_testmod_struct_arg_5 declaration in tracing_struct_many_args.c (and so, to rename it, since this name is already used in there). As a consequence the bpf program is actually using the wrong struct layout. So thanks for insisting and spotting this issue !
I fixed my mess locally in order to re-run the gdb analysis mentioned in my previous mail, and the bug seems to be the same (unexpected t11:f: actual 35 != expected 43), with the same layout issue on the bpf context passed on the stack ("lucky" mistake ?). However, thinking more about this, I feel like there is still something that I have missed:
0xffffc900001dbce8: 38 0 0 0 0 0 0 0 0xffffc900001dbcf0: 0 0 0 0 0 0 0 0 0xffffc900001dbcf8: 39 0 0 0 0 0 0 0 0xffffc900001dbd00: 0 0 0 0 0 0 0 0 0xffffc900001dbd08: 40 0 0 0 0 0 0 0 0xffffc900001dbd10: 0 0 0 0 0 0 0 0 0xffffc900001dbd18: 41 0 0 0 0 0 0 0 0xffffc900001dbd20: 0 0 0 0 0 0 0 0 0xffffffffc04016a6: 42 0 0 0 0 0 0 0 0xffffc900001dbd30: 35 0 0 0 0 0 0 0 0xffffc900001dbd38: 43 0 0 0 0 0 0 0 0xffffc900001dbd40: 37 0 0 0 0 0 0 0
If things really behaved correctly, f would not have the correct value but would still be handled as a 16 bytes value, so the test would not fail with "actual 35 != 43", but something like "actual 27254487904906932132179118915584 != 43" (43 << 64 | 35) I guess. I still need to sort this out.
Alexis
 
            On Mon Apr 28, 2025 at 10:41 PM CEST, Alexis Lothoré wrote:
On Mon Apr 28, 2025 at 6:52 PM CEST, Eduard Zingerman wrote:
Alexis Lothoré alexis.lothore@bootlin.com writes:
If things really behaved correctly, f would not have the correct value but would still be handled as a 16 bytes value, so the test would not fail with "actual 35 != 43", but something like "actual 27254487904906932132179118915584 != 43" (43 << 64 | 35) I guess. I still need to sort this out.
And so indeed, the broken value is a big one:
(gdb) p skel->bss->t11_f $4 = 793209995169510719523 (gdb) p/x skel->bss->t11_f $5 = 0x2b0000000000000023 (gdb)
But we see the 35 (0x23) value in the error log because the formatters used in ASSERT_EQ truncate the actual value.
Alexis
 
            The fentry_many_args, fexit_many_args and struct_many_args tests were disabled on ARM64 due to the lack of many args support.
With the previous commits bringing in this missing support, drop the last denied tests.
Signed-off-by: Alexis Lothoré (eBPF Foundation) alexis.lothore@bootlin.com --- tools/testing/selftests/bpf/DENYLIST.aarch64 | 3 --- 1 file changed, 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64 deleted file mode 100644 index 6d8feda27ce9de07d77d6e384666082923e3dc76..0000000000000000000000000000000000000000 --- a/tools/testing/selftests/bpf/DENYLIST.aarch64 +++ /dev/null @@ -1,3 +0,0 @@ -fentry_test/fentry_many_args # fentry_many_args:FAIL:fentry_many_args_attach unexpected error: -524 -fexit_test/fexit_many_args # fexit_many_args:FAIL:fexit_many_args_attach unexpected error: -524 -tracing_struct/struct_many_args # struct_many_args:FAIL:tracing_struct_many_args__attach unexpected error: -524
linux-kselftest-mirror@lists.linaro.org







