Hi,
On Wed, Apr 20, 2011 at 1:42 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
On Wed, 20 Apr 2011, Dave Martin wrote:
On Tue, Apr 19, 2011 at 01:33:12PM -0400, Nicolas Pitre wrote:
You must not use static variables in the decompressor. For one thing, that breaks the ability to XIP the decompressor code and move writable data elsewhere.
So the fix is indeed to _not_ declare any global variable as static in this case.
After some thinking about this, I think I agree.
Having to relocate a GOT-full of addresses many of which are actually at fixed PC-relative offsets just for this capability is a bit annoying, but the GNU tools don't support other models very well.
You cannot relocate PC-relative offsets at run time. Those references are spread throughout the code into literal pools. Forcing all references to go through the GOT makes it possible for the code to relocate selected parts of itself at run time.
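To make the GOT point concrete, here is a minimal sketch in C of patching only the GOT slots that point into a region that has moved. This is not the decompressor's actual startup code; the function name, parameters and addresses are all made up for illustration, and the GOT is modelled as a plain array.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only: add `delta` to every GOT slot whose value lies in
 * [bss_start, bss_end). All names are hypothetical. Slots that point at
 * code or read-only data are left alone.
 */
size_t relocate_got_bss(uintptr_t *got, size_t nslots,
                        uintptr_t bss_start, uintptr_t bss_end,
                        uintptr_t delta)
{
    size_t patched = 0;

    for (size_t i = 0; i < nslots; i++) {
        if (got[i] >= bss_start && got[i] < bss_end) {
            got[i] += delta;    /* this reference now follows the moved data */
            patched++;
        }
    }
    return patched;
}
```

A PC-relative reference has no such slot to patch, since the offset is baked into a literal pool next to the code that uses it.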
My point was that relocatability implies overhead, and the GOT potentially contains a load of relocations for code and read-only data which will never get moved in practice.
For writable/uninitialised data, it's different of course -- we often will need to relocate that in real situations (as observed here). I'd guessed that only part of the GOT in the compressed loader was addressing such data, but actually, it seems to be pretty much all of it, as you suggest.
So the number of useless relocations, and any associated overhead, looks low (if any).
We might be able to reduce the size of the GOT by building with -fvisibility=hidden, and making judicious use of "extern" on all data declarations/definitions:
[gcc-4.4.info] `extern' declarations are not affected by `-fvisibility', so a lot of code can be recompiled with `-fvisibility=hidden' with no modifications. However, this means that calls to `extern' functions with no explicit visibility will use the PLT, so it is more effective to use `__attribute ((visibility))' and/or `#pragma GCC visibility' to tell the compiler which `extern' declarations should be treated as hidden.
This only seems to work reliably for data definitions; plus the toolchain behaviour may "evolve" with respect to obscure features like this.
That doesn't solve the problem at all. In this case, we really want _all_ data references to go through the GOT, meaning that everything would have to be marked extern. The only references which are OK to be PC relative are read-only references, and therefore they can just be marked as static const.
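As a rough sketch of the declaration pattern described here, assuming the code is built position-independent (for example with GCC's -fpic); the variable and function names are invented and do not come from the decompressor sources:

```c
#include <stddef.h>

/* Read-only data: static const is fine, since a PC-relative reference
 * moves together with the code. */
static const unsigned char magic[4] = { 0x1f, 0x8b, 0x08, 0x00 };

/* Writable data: not static and not initialised, so it lands in .bss and
 * is expected to be referenced through the GOT in a -fpic build. */
unsigned long bytes_out;

/* Return 1 if buf starts with the magic bytes, 0 otherwise. */
int has_magic(const unsigned char *buf, size_t len)
{
    if (len < sizeof(magic))
        return 0;
    for (size_t i = 0; i < sizeof(magic); i++) {
        if (buf[i] != magic[i])
            return 0;
    }
    return 1;
}

/* Accumulate into the writable global. */
void count_output(size_t n)
{
    bytes_out += n;
}
```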
So if we wanted to achieve such a thing reliably, we'd probably need explicit visibility attributes on the affected declarations.
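For reference, an explicit visibility attribute on a declaration would look roughly like this in GCC syntax. The table and accessor names are hypothetical, and the example deliberately uses read-only data: hidden visibility may let the compiler skip the GOT, which is what writable data must not do in this case.

```c
/* Hypothetical read-only table shared between files. The hidden
 * visibility on the declaration carries over to the definition. */
extern const unsigned int crc_seeds[2] __attribute__((visibility("hidden")));

const unsigned int crc_seeds[2] = { 0x04c11db7u, 0xedb88320u };

/* Accessor; the index is masked so it always stays in range. */
unsigned int crc_seed_at(unsigned int i)
{
    return crc_seeds[i & 1u];
}
```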
Like I said, it's about all of them.
The advantage is unlikely to be huge though since the GOT is small anyway; and we wouldn't be able to throw away the GOT relocation code completely, because of the need to relocate bss references...
In fact, all that remains in the GOT, assuming that const data is marked static, are .bss references. Again, for simplicity's sake, we don't support initialized and writable global variables, as in the XIP case those would have to be copied into RAM and the GOT patched accordingly. In practice, doing without them is not hard to achieve. To ensure that, we simply discard the .data section early in the linker script.
Sure -- my observations were simply based around the fact that we're using the tools to do something they don't feel well adapted to, compared with other tools with a more embedded/bare-metal focus. So if there were a better or more correct way to use the tools to get the results we need, it would be worth considering. But from the discussion it sounds like the code already does pretty much the best thing possible anyway.
Cheers
---Dave