On Mon, Aug 26, 2024 at 3:53 AM Tony Ambardar tony.ambardar@gmail.com wrote:
On Fri, Aug 23, 2024 at 12:47:47PM -0700, Andrii Nakryiko wrote:
On Thu, Aug 22, 2024 at 2:25 AM Tony Ambardar tony.ambardar@gmail.com wrote:
From: Tony Ambardar tony.ambardar@gmail.com
Allow bpf_object__open() to access files of either endianness, and convert included BPF programs to native byte-order in-memory for introspection.
Signed-off-by: Tony Ambardar tony.ambardar@gmail.com
 tools/lib/bpf/libbpf.c          | 21 +++++++++++++++++++--
 tools/lib/bpf/libbpf_internal.h | 11 +++++++++++
 2 files changed, 30 insertions(+), 2 deletions(-)
Instructions are not the only data that would need swapping. We have user's data sections and stuff like that, which, generally speaking, isn't that safe to just byteswap.
I do understand the appeal of being endianness-agnostic, but it doesn't extend all the way to actually loading BPF programs. At least I wouldn't start there.
Yes, absolutely. I first planned to move the endianness check from "open" to "load" functions, but got waylaid tracing skeleton code into the latter and left the check in place to keep making progress. Let me figure out the best place to put a check without breaking things.
checking early during load should work just fine, I don't expect any problems
We need to make open phase endianness agnostic, load should just fail for swapped endianness case. So let's record the fact that we are not in native endianness, and fail early in load step.
This will still allow us to generate skeletons and stuff like that, right?
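To make the open/load split above concrete, here is a minimal sketch. It is not libbpf code: `struct sketch_obj`, `sketch_open()` and `sketch_load()` are hypothetical stand-ins, and the `-EOPNOTSUPP` return value is an assumption, since the thread does not say which error load should report.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of struct bpf_object. */
struct sketch_obj {
	bool native_endian;	/* recorded once, during the open phase */
	bool loaded;
};

/* Open phase: accept either byte order and only remember which one we saw. */
static int sketch_open(struct sketch_obj *obj, bool file_is_native)
{
	obj->native_endian = file_is_native;
	obj->loaded = false;
	return 0;
}

/* Load phase: reject foreign-endian objects before doing any real work. */
static int sketch_load(struct sketch_obj *obj)
{
	if (!obj->native_endian)
		return -EOPNOTSUPP;	/* assumed error code, not from the thread */
	obj->loaded = true;
	return 0;
}
```

With this shape, introspection and skeleton generation only need a successful open, while only load depends on the recorded flag.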
[...]
/* change BPF program insns to native endianness for introspection */
if (bpf_object__check_endianness(obj))
let's rename this to "is_native_endianness()" and return true/false. "check" makes sense as something that errors out, but now it's purely a query, so "check" naming is confusing.
Right, I mistook this as exported before and left it.
yeah, that double underscore is very misleading and I'd like to get rid of it, but my last attempt failed, so we are stuck with that for now
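As a rough illustration of the query-style helper discussed above, the sketch below returns true/false instead of an error code. The name `sketch_is_native_endianness()`, its signature and the `SKETCH_*` constants are hypothetical; the constants mirror the ELFDATA2LSB/ELFDATA2MSB values from the ELF specification, and the host byte order comes from the GCC/Clang predefined `__BYTE_ORDER__` macro.

```c
#include <stdbool.h>

/* ELF e_ident[EI_DATA] encodings, with values taken from the ELF spec. */
enum { SKETCH_ELFDATA2LSB = 1, SKETCH_ELFDATA2MSB = 2 };

/* Pure query: true when the file's data encoding matches the host's. */
static bool sketch_is_native_endianness(unsigned char ei_data)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	return ei_data == SKETCH_ELFDATA2LSB;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return ei_data == SKETCH_ELFDATA2MSB;
#else
# error "unsupported host byte order"
#endif
}
```

A caller can then write `if (!sketch_is_native_endianness(ei_data))` and decide for itself whether that is an error, which is the distinction the rename is meant to express.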
BTW, so libelf will transparently byte-swap relocations and stuff like that to native endianness, is that right?
Correct. Sections with types like ELF_T_REL (.rel) and ELF_T_SYM (.symtab) get translated automagically. See patch #3 for example.
ok, thanks for confirming
[...]
+static inline void bpf_insn_bswap(struct bpf_insn *insn)
+{
/* dst_reg & src_reg nibbles */
__u8 *regs = (__u8 *)insn + offsetofend(struct bpf_insn, code);
*regs = (*regs >> 4) | (*regs << 4);
hm... we have fields, just do a brain-dead swap instead of all this mucking with offsetofend(
__u8 tmp_reg = insn->dst_reg;
insn->dst_reg = insn->src_reg;
insn->src_reg = tmp_reg;
?
Main reason for this is most compilers recognize the shift/or statement pattern and emit a rotate op as I recall. And the offsetofend() seemed clearest at documenting "the byte after opcode" while not obscuring these are nibble fields. So would prefer to leave it unless you have strong objections or I'm off the mark somehow. Let me know either way? Thanks!
I do strongly prefer not having to use offsetofend() and pointer manipulations. Whatever tiny performance difference is completely irrelevant here. Let's go with a cleaner approach, please.
insn->off = bswap_16(insn->off);
insn->imm = bswap_32(insn->imm);
+}
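For reference, the field-based variant requested in the review could look roughly like the sketch below. `struct sketch_insn` is a hypothetical mirror of `struct bpf_insn` (bit-field order as on a little-endian host) so the example builds without kernel headers, and the GCC/Clang `__builtin_bswap16()`/`__builtin_bswap32()` builtins stand in for `bswap_16()`/`bswap_32()`. Because the two 4-bit register fields share the single byte after the opcode, exchanging them has the same effect as the nibble rotation in the original patch.

```c
#include <stdint.h>

/* Hypothetical mirror of struct bpf_insn, for illustration only. */
struct sketch_insn {
	uint8_t code;		/* opcode, a single byte: no swap needed */
	uint8_t dst_reg:4;	/* destination register */
	uint8_t src_reg:4;	/* source register */
	int16_t off;		/* signed offset */
	int32_t imm;		/* signed immediate */
};

static inline void sketch_insn_bswap(struct sketch_insn *insn)
{
	uint8_t tmp_reg = insn->dst_reg;

	/* Exchange the two register nibbles through the named fields. */
	insn->dst_reg = insn->src_reg;
	insn->src_reg = tmp_reg;

	/* Byte-swap the multi-byte fields. */
	insn->off = (int16_t)__builtin_bswap16((uint16_t)insn->off);
	insn->imm = (int32_t)__builtin_bswap32((uint32_t)insn->imm);
}
```

Applying the function twice restores the original instruction, which makes for a quick sanity check.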
 /* Unconditionally dup FD, ensuring it doesn't use [0, 2] range.
  * Original FD is not closed or altered in any other way.
  * Preserves original FD value, if it's invalid (negative).
-- 2.34.1