Hi Mike,
On Mon, Dec 08, 2025 at 02:47:21PM +0000, Mike Leach wrote:
[...]
I tested locally and did not see the GCC complaint for this approach. And this is a global structure with about 16KiB (~4K items x
Which is precisely the issue - why use 16k bytes of space when a pair of indexed tables will use 21 x 32bit locations per table -> 168 bytes
- 100x smaller!
This space matters little to high end server systems but is much more important in smaller embedded systems.
Regarding performance and footprint: my approach avoids any conversion for standard registers; we only end up needing conversion for the non-standard registers anyway.
I understand your concern about using an array for conversion: it costs 16KiB of memory but benefits performance a bit. It is a trade-off between memory and speed. As said, we can use a static function for register conversion instead; the side effect is that this might take more time.
Given that CTI registers are accessed over MMIO, I don't think an extra branch instruction (checking the flag) would cause a significant penalty, since the flag is set once at init and never changed afterwards.
Moreover the table + inline helper is more efficient at extracting the correct offset value. The helper is a simple de-reference - whereas the helper functions you suggest require the code to make the comparison at every register access. The "if qcom ..." may be contained in one place in the source code, but is called and executed for every access.
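For reference, a rough sketch of the table + inline helper idea as described above. All names here are hypothetical, the index list is cut down to four entries instead of 21, and the non-standard offsets are made up for illustration; none of it is taken from the actual patches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register indices; the thread mentions 21 per table. */
enum cti_reg_idx {
	CTI_IDX_CONTROL,
	CTI_IDX_INTACK,
	CTI_IDX_APPSET,
	CTI_IDX_GATE,
	CTI_IDX_MAX,
};

/* Illustrative offsets for the standard layout. */
static const uint32_t cti_std_offsets[CTI_IDX_MAX] = {
	[CTI_IDX_CONTROL] = 0x000,
	[CTI_IDX_INTACK]  = 0x010,
	[CTI_IDX_APPSET]  = 0x014,
	[CTI_IDX_GATE]    = 0x140,
};

/* Made-up offsets standing in for a non-standard layout. */
static const uint32_t cti_alt_offsets[CTI_IDX_MAX] = {
	[CTI_IDX_CONTROL] = 0x000,
	[CTI_IDX_INTACK]  = 0x020,
	[CTI_IDX_APPSET]  = 0x004,
	[CTI_IDX_GATE]    = 0x844,
};

struct cti_sketch_dev {
	const uint32_t *offsets;	/* selected once at probe time */
};

/* One table dereference per access, no per-access comparison. */
static inline uint32_t cti_reg_off(const struct cti_sketch_dev *d,
				   enum cti_reg_idx idx)
{
	return d->offsets[idx];
}

/* Footprint figure from the thread: 2 tables x 21 entries x 4 bytes. */
#define CTI_ASSUMED_REGS 21
static_assert(2 * CTI_ASSUMED_REGS * sizeof(uint32_t) == 168,
	      "two 21-entry tables take 168 bytes");
```

The table pointer is picked once when the device is probed, so each later access only pays for the indexed load.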
Why add inefficiencies, either in footprint or execution?
This is about how we design a driver that supports both a standard IP and non-standard implementations.
Because the standard IP is well defined, its register layout should be the default; it keeps the code simple and makes future CTI extensions easier. For non-standard IPs, we only apply the register translations needed.
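To make the idea concrete, a minimal sketch of what I mean, assuming a per-device flag set once at init. The struct, flag, helper names and the translated offsets are all hypothetical and only illustrate the shape of the code, not the actual patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard layout used directly as the default (illustrative subset). */
#define CTICONTROL	0x000
#define CTIINTACK	0x010
#define CTIGATE		0x140

struct cti_sketch_drvdata {
	bool is_nonstd;		/* hypothetical flag, set once at init */
};

/* Translate only the registers that differ; offsets here are made up. */
static uint32_t cti_nonstd_xlate(uint32_t off)
{
	switch (off) {
	case CTIINTACK:
		return 0x020;
	case CTIGATE:
		return 0x844;
	default:
		return off;	/* same location as the standard layout */
	}
}

/* Standard IP: no conversion. Non-standard IP: one extra branch. */
static inline uint32_t cti_reg_addr(const struct cti_sketch_drvdata *d,
				    uint32_t off)
{
	if (d->is_nonstd)
		return cti_nonstd_xlate(off);
	return off;
}
```

With this shape the standard path stays a plain offset, and the translation cost is only paid on the non-standard implementation.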
TBH, the optimization topic feels a bit over-designed to me at this point. The CTI module is configured once and remains untouched until it is disabled, so it is not a hot path.
Thanks, Leo