On Tue, 17 Dec 2024 at 16:47, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> Since we're on this topic, Daniel is looking to reuse format_decode() in bpf_bprintf_prepare() to get rid of our manual format validation.
That was literally why I started looking into this - the many separate type formats actually end up causing format_decode() (and the callers) to have to generate multiple different cases, which then in turn cause either a jump table or - more commonly, due to the CPU indirect branch mitigations - a chain of fairly ugly conditionals.
Compressing the state table for the types from 11 down to 4 types helps a bit, but then dealing with the "smaller than int" things as just 'int' (with the formatting flags kept separate) also ends up avoiding some unnecessary cases.
Because in the end, 'size_t' and 'long' are the same thing, even on architectures like 32-bit x86 where 'size_t' really is 'unsigned int' - simply because the only thing that matters for fetching the value is the size, which is 32-bit.
(The whole "is it signed" and the truncation to smaller-than-int etc is then something we have to handle anyway via the 'printf_spec' thing).
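As a rough illustration of the point above (this is not the actual patch; fetch_class, num_spec, classify() and fetch_one() are names made up for this sketch, and it assumes 'size_t' is the size of either 'long' or 'long long'), the fetch can be keyed purely on size, with the sign and the truncation applied afterwards:

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fetch classes, named for this sketch only. */
enum fetch_class { FETCH_INT, FETCH_LONG, FETCH_LONG_LONG };

/* Hypothetical stand-in for the sign/width details kept in the spec. */
struct num_spec {
	enum fetch_class fetch;	/* which size to pull off the va_list */
	int is_signed;		/* sign handling is separate from the fetch */
	int narrow_bytes;	/* 0 = no truncation, 1 = "hh", 2 = "h" */
};

/* Map a length modifier to a fetch class purely by size. */
static enum fetch_class classify(const char *mod)
{
	if (!strcmp(mod, "ll"))
		return FETCH_LONG_LONG;
	if (!strcmp(mod, "l"))
		return FETCH_LONG;
	if (!strcmp(mod, "z"))
		return sizeof(size_t) == sizeof(long) ? FETCH_LONG : FETCH_LONG_LONG;
	if (!strcmp(mod, "t"))
		return sizeof(ptrdiff_t) == sizeof(long) ? FETCH_LONG : FETCH_LONG_LONG;
	/* "hh", "h" and no modifier: the caller already promoted to int. */
	return FETCH_INT;
}

/* Fetch one numeric argument by size, then apply sign and truncation. */
static long long fetch_one(const struct num_spec *spec, ...)
{
	va_list ap;
	long long v;

	va_start(ap, spec);
	switch (spec->fetch) {
	case FETCH_LONG_LONG:
		v = va_arg(ap, long long);
		break;
	case FETCH_LONG:
		if (spec->is_signed)
			v = va_arg(ap, long);
		else
			v = (long long)va_arg(ap, unsigned long);
		break;
	default:
		if (spec->is_signed)
			v = va_arg(ap, int);
		else
			v = va_arg(ap, unsigned int);
		break;
	}
	va_end(ap);

	/* Truncation to char/short happens after the fetch, not in it. */
	if (spec->narrow_bytes == 1)
		v = spec->is_signed ? (signed char)v : (unsigned char)v;
	else if (spec->narrow_bytes == 2)
		v = spec->is_signed ? (short)v : (unsigned short)v;
	return v;
}
```

With this shape, "%hhu" and "%u" share the same fetch path and differ only in the narrow_bytes field, so the number of cases in the fetch switch does not grow with the number of length modifiers.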
So I have a patch series to clean some of this up and avoid the extra states. I'm not entirely happy with it, though, and I've been going back and forth on some of the code, so I'm not ready to post it or have anybody use it as a basis for some "real" cleanups.
I guess I could at least post the "turn 11 different types into 4" part. I have other things in there, but that part seems fairly unambiguously good.
Let me go separate that part out and maybe people can point out where I've done something silly.
Linus