arm thumb veneer question


John Rigby

20 Sep 2010

7:54 p.m.

While trying out the u-boot-next branch I found a problem. First, some explanation. On most platforms, U-Boot is linked to the address at which it first starts running. For example, when using NOR flash, U-Boot will be linked to an address in flash. Very early in the boot process, U-Boot copies itself to the top of RAM and jumps there. This relocation has worked for years on powerpc and other arches. The -next tree adds this for ARM and it almost works.

The part that does not work is that some veneer routines do not get fixed up.

Here is an example. A routine called i2c_init calls __aeabi_idiv. Here is the disassembly:

...
 288: e59f0148 ldr r0, [pc, #328] ; 3d8 <i2c_init+0x1a4>
 28c: e1a01083 lsl r1, r3, #1
 290: ebfffffe bl 0 <__aeabi_idiv>
 294: e2507006 subs r7, r0, #6
 298: 4a000001 bmi 2a4 <i2c_init+0x70>

Later, after this .o is linked with everything else and libgcc, that morphs into:

8000b384: e59f0148 ldr r0, [pc, #328] ; 8000b4d4 <_end+0xfff97c98>
8000b388: e1a01083 lsl r1, r3, #1
8000b38c: eb00aa43 bl 80035ca0 <____aeabi_idiv_veneer>
8000b390: e2507006 subs r7, r0, #6
8000b394: 4a000001 bmi 8000b3a0 <i2c_init+0x70>
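
As a cross-check on the listing above, the branch target can be recomputed from the instruction word itself. This is a minimal sketch, assuming the standard ARM-state BL encoding (a signed 24-bit word offset, relative to the address of the instruction plus 8). The helper name decode_arm_bl is made up for illustration and is not part of any toolchain.

```python
def decode_arm_bl(address: int, word: int) -> int:
    """Return the target of an ARM-state BL instruction located at `address`."""
    # Bits 27..24 are 0b1011 for BL (the condition field in bits 31..28 is ignored here).
    if (word >> 24) & 0xF != 0xB:
        raise ValueError("not a BL encoding")
    imm24 = word & 0x00FFFFFF
    if imm24 & 0x800000:          # sign-extend the 24-bit offset field
        imm24 -= 1 << 24
    # The offset counts 4-byte words and is relative to the BL address plus 8.
    return (address + 8 + (imm24 << 2)) & 0xFFFFFFFF


# The linked call in i2c_init: "8000b38c: eb00aa43 bl 80035ca0"
print(hex(decode_arm_bl(0x8000B38C, 0xEB00AA43)))  # 0x80035ca0
```

The result matches the address of ____aeabi_idiv_veneer, so the call itself is PC-relative and survives being copied elsewhere; only what the veneer does next is in question.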

and the veneer version is at the end of text with other veneers:

80035ca0 <____aeabi_idiv_veneer>:
80035ca0: e51ff004 ldr pc, [pc, #-4] ; 80035ca4 <_end+0xfffc2468>
80035ca4: 80035999 .word 0x80035999

80035ca8 <____aeabi_llsl_veneer>:
80035ca8: e51ff004 ldr pc, [pc, #-4] ; 80035cac <_end+0xfffc2470>
80035cac: 80035c7d .word 0x80035c7d

80035cb0 <____aeabi_lasr_veneer>:
80035cb0: e51ff004 ldr pc, [pc, #-4] ; 80035cb4 <_end+0xfffc2478>
80035cb4: 80035c61 .word 0x80035c61

80035cb8 <____aeabi_llsr_veneer>:
80035cb8: e51ff004 ldr pc, [pc, #-4] ; 80035cbc <_end+0xfffc2480>
80035cbc: 80035c49 .word 0x80035c49

80035cc0 <____aeabi_uidivmod_veneer>:
80035cc0: e51ff004 ldr pc, [pc, #-4] ; 80035cc4 <_end+0xfffc2488>
80035cc4: 8003597d .word 0x8003597d

80035cc8 <____aeabi_uidiv_veneer>:
80035cc8: e51ff004 ldr pc, [pc, #-4] ; 80035ccc <_end+0xfffc2490>
80035ccc: 80035721 .word 0x80035721

80035cd0 <____aeabi_idivmod_veneer>:
80035cd0: e51ff004 ldr pc, [pc, #-4] ; 80035cd4 <_end+0xfffc2498>
80035cd4: 80035c2d .word 0x80035c2d
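
Every veneer in the listing above has the same shape: one "ldr pc, [pc, #-4]" followed by a literal word. The sketch below works out which address that load reads, assuming the usual ARM-state rule that pc reads as the instruction address plus 8. The helper name veneer_literal_address is hypothetical.

```python
def veneer_literal_address(address: int, word: int) -> int:
    """Address of the word loaded by an 'ldr pc, [pc, #+/-imm12]' at `address`."""
    # Match LDR (immediate) with Rn = Rd = pc; bit 23 (add/subtract) is masked out.
    if word & 0xFF7FF000 != 0xE51FF000:
        raise ValueError("not an 'ldr pc, [pc, #imm]' instruction")
    imm12 = word & 0xFFF
    add = bool(word & (1 << 23))
    pc = address + 8                  # pc reads 8 bytes ahead in ARM state
    return pc + imm12 if add else pc - imm12


# First veneer in the listing: "80035ca0: e51ff004 ldr pc, [pc, #-4]"
print(hex(veneer_literal_address(0x80035CA0, 0xE51FF004)))  # 0x80035ca4
```

So the instruction loads the word stored directly after it. That word is an absolute link-time address, which is why the veneer is sensitive to the image being moved.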

Then if we look at 80035998 we see some Thumb code.

80035998 <__aeabi_idiv>:
80035998: 2900 cmp r1, #0
8003599a: f000 813e beq.w 80035c1a <.divsi3_nodiv0+0x27c>
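
Note that the veneer literal is 0x80035999 while __aeabi_idiv sits at 0x80035998. On ARMv5T and later, a value loaded into pc with bit 0 set selects Thumb state, and the real address is the value with that bit cleared. A tiny sketch of the split (the function name is made up for illustration):

```python
def split_interworking_address(value: int):
    """Split an interworking branch target into (address, is_thumb)."""
    address = value & 0xFFFFFFFE      # bit 0 is a state flag, not part of the address
    is_thumb = bool(value & 1)
    return address, is_thumb


address, is_thumb = split_interworking_address(0x80035999)
print(hex(address), is_thumb)  # 0x80035998 True
```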

When U-Boot copies itself to RAM it relocates the jump tables it knows about, and it could relocate the addresses in the veneer routines if it knew about them.

There are at least three possible ways to fix these:

1) U-Boot has its own private libgcc, and if I use it the problem goes away.
2) Is there an option for the toolchain to use an ARM libgcc instead of Thumb?
3) Is there a way to find the veneers at runtime and fix them up?

All input welcome. Thanks, John


Wolfgang Denk

20 Sep 2010

8:21 p.m.

Dear John Rigby,

In message AANLkTin+NyvQr6hnDOt82WzwSwM+4Tw+h+Py5Rt1Xh6D@mail.gmail.com you wrote:

...

The part that does not work is that some veneer routines do not get fixed up.

These veneer routines seem to be specific to some (pretty recent?) tool chain versions. We haven't seen any of these with older tool chains (say, up to and including gcc 4.2.x).

Can anybody shed some light on 1) when these routines have been introduced and 2) what their exact function is?

Is the specific tool chain in question available somewhere for testing?

...

When u-boot copies itself to ram it relocates the jump tables it knows about and could relocate the addresses in the veneer routines if it knew about them.

The relocation performed by U-Boot is based on the fact that we compile the code with -fPIC and then rely on entries in the GOT to apply the relocation offset to the addresses registered there.

It seems these veneer routines have not been entered into the GOT.
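
To make the failure mode concrete, here is a toy model of GOT-based relocation. It is only a sketch under stated assumptions: the image is three 32-bit little-endian words, the first two playing the role of GOT slots and the third playing the role of a veneer literal that lives outside the GOT. The GOT slot values and the RAM base 0x87F00000 are invented for the example, and relocate_got is a hypothetical helper, not U-Boot's actual relocation code.

```python
import struct


def relocate_got(image: bytearray, got_start: int, got_end: int, delta: int) -> None:
    """Add `delta` to every 32-bit little-endian word in [got_start, got_end)."""
    for off in range(got_start, got_end, 4):
        (value,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (value + delta) & 0xFFFFFFFF)


LINK_BASE = 0x80000000                     # link-time base, as in the listings above
RAM_BASE = 0x87F00000                      # hypothetical relocation target
delta = RAM_BASE - LINK_BASE

# Two made-up GOT slots, then the veneer literal for __aeabi_idiv.
image = bytearray(struct.pack("<3I", 0x80001000, 0x80002000, 0x80035999))
relocate_got(image, 0, 8, delta)           # only the GOT range is patched

got0, got1, veneer_literal = struct.unpack_from("<3I", image, 0)
print(hex(got0), hex(got1), hex(veneer_literal))
```

After the fixup the two GOT slots point into the relocated copy, while the veneer literal still holds the link-time address, so a call through the veneer would jump back to the original location.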

Note: it would be helpful if somebody could verify this.

If so, the question is whether there is a problem with the handling of -fPIC code.

...

There are at least three possible ways to fix these:

u-boot has its own private libgcc and if I use it the problem goes away.

In U-Boot we consider this always as a last resort workaround for broken tool chains. We prefer to see the problems fixed at the cause - either in U-Boot or in the tool chain, wherever the problem may be.

...

is there an option for the toolchain to use an arm libgcc instead of thumb?

I cannot comment on that. I don't even know which tool chain was used, or which versions of gcc and binutils.

...

is there a way to find the veneers at runtime and fix them up?

The problem should go away automatically when the addresses of these routines somehow make their way into the GOT.

Best regards,

Wolfgang Denk

-- DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de The price one pays for pursuing any profession, or calling, is an intimate knowledge of its ugly side. - James Baldwin

Loïc Minier

22 Sep 2010

12:28 p.m.

Hey

On Mon, Sep 20, 2010, Wolfgang Denk wrote:

...

Is the specific tool chain in question available somewhere for testing?

Sure thing; the Linaro toolchain is available in source form from: https://launchpad.net/linaro-toolchain more specifically the GCC branch: https://launchpad.net/gcc-linaro (there are tarball downloads or you can check it out with bzr)

There are some pre-built packages for the development series of Ubuntu ("maverick"); "apt-get install gcc-arm-linux-gnueabi" should do the trick. There aren't any other pre-built binaries available yet though (well, there are some Debian armhf ones, but I don't think you care about these). If you're running Ubuntu 10.04 ("lucid"), these instructions allow installing the maverick packages: https://wiki.linaro.org/MichaelHope/Sandbox/CrossCompilerOnLucid

NB: the Ubuntu binaries are built for eglibc (linux-gnu-eabi), i.e. not no-libc.

-- Loïc Minier

Dave Martin

21 Sep 2010

11:23 a.m.

Hi,

John Rigby wrote:

...

 288: e59f0148 ldr r0, [pc, #328] ; 3d8 <i2c_init+0x1a4>
 28c: e1a01083 lsl r1, r3, #1
 290: ebfffffe bl 0 <__aeabi_idiv>
 294: e2507006 subs r7, r0, #6
 298: 4a000001 bmi 2a4 <i2c_init+0x70>

I believe such calls are getting resolved via a veneer because of a combination of the thumb2-ness of libgcc and the toolchain being used.

In principle, the linker can know that it is linking for >= ARMv5T due to the way it was configured and the way the input objects were built, but GNU ld is conservative and doesn't do this automatically. As a result, it has to generate a veneer, reached via a normal non-interworking branch. ld has no way to know that the veneer needs to be PIC and use the GOT, so it isn't and doesn't.

ld.info says:

The `--use-blx' switch enables the linker to use ARM/Thumb BLX instructions (available on ARMv5t and above) in various situations. Currently it is used to perform calls via the PLT from Thumb code using BLX rather than using BX and a mode-switching stub before each PLT entry. This should lead to such calls executing slightly faster.

...so you might explicitly want to enable this whenever building for ARMv5 or later.

Wolfgang, can you foresee any reason not to do that? As far as I can see it will be safe so long as we don't use it when building for architectures <ARMv5 (where the BLX instruction isn't supported).

Otherwise, the only straightforward workaround is to build with cc -fPIC and ld -shared, to guarantee that every absolute address is resolved via the GOT instead of hard-coded veneers, then let U-Boot patch it all up. This causes non-relative branches to jump via a veneer in .plt instead. If extra trampolines are needed to reach .plt, those will be generated as PIC code in this case too. Of course, we don't want the linker to resolve symbols using actual shared libraries, so we need -nostdlib and explicit references to .a files if there might be ambiguity about library selection. Since U-Boot is bare-metal, I'm guessing these requirements are either met or nearly met already... but there might be gotchas, such as the fixup code in U-Boot ending up with fixups inside itself.

Alternatively, using ld --emit-relocs and then embedding the relocation information in the image so that U-Boot can use it could help to solve the problem. I'm guessing that isn't set up at present, though.
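
A rough idea of what the post-processing side of the --emit-relocs approach could look like: walk a table of Elf32_Rel entries and add the load offset to every word covered by an absolute relocation. This is only a sketch. It assumes the relocation table has already been extracted from the ELF file, that r_offset holds link-time virtual addresses, and that only R_ARM_ABS32 needs patching; the function name and the sample data are hypothetical.

```python
import struct

R_ARM_ABS32 = 2    # absolute 32-bit address: must be adjusted when the image moves
R_ARM_CALL = 28    # PC-relative BL/BLX: unaffected when the whole image moves


def apply_abs32_relocs(image: bytearray, rel_table: bytes, link_base: int, delta: int) -> int:
    """Patch every R_ARM_ABS32 target in `image`; return the number of words patched."""
    patched = 0
    for r_offset, r_info in struct.iter_unpack("<II", rel_table):
        if r_info & 0xFF != R_ARM_ABS32:       # low byte of r_info is the type
            continue
        off = r_offset - link_base             # virtual address -> offset in image
        (value,) = struct.unpack_from("<I", image, off)
        struct.pack_into("<I", image, off, (value + delta) & 0xFFFFFFFF)
        patched += 1
    return patched


# Toy image: one BL instruction followed by one veneer-style literal.
link_base = 0x80000000
image = bytearray(struct.pack("<2I", 0xEB000000, 0x80035999))
rel_table = struct.pack("<4I",
                        link_base + 0, (1 << 8) | R_ARM_CALL,
                        link_base + 4, (2 << 8) | R_ARM_ABS32)
count = apply_abs32_relocs(image, rel_table, link_base, 0x07F00000)
bl_word, literal = struct.unpack_from("<2I", image, 0)
print(count, hex(bl_word), hex(literal))
```

Because the delta is even, the Thumb bit in the literal is preserved by the addition.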

[...]

...

u-boot has its own private libgcc and if I use it the problem goes away.

Hopefully not necessary--- I agree with Wolfgang's concerns on relying on this.

...

is there an option for the toolchain to use an arm libgcc instead of thumb?

You'd need to rebuild the toolchain (or at least libgcc). I believe that no ARM libgcc is built at present for the linaro/Ubuntu tools. I don't think the GCC packages currently support this kind of thing well. --use-blx is probably the better workaround.

...

is there a way to find the veneers at runtime and fix them up?

No. Even if you can find the veneers, you would need to make assumptions about their structure which may break when the toolchain gets upgraded... unless you write the veneers by hand in the first place (bad...)

So the options are to avoid the veneers using --use-blx; to post-process the relocations output from ld --emit-relocs; or to do a fully PIC link with ld -shared. The latter options feel like overkill, unless U-Boot already supports this, or evolves a general need for it in the future.

Cheers ---Dave

Wolfgang Denk

8:29 p.m.

Dear Dave Martin,

In message AANLkTikHv1SpjcyRbRxGK2QeCoq96tsTken3LmP5bsse@mail.gmail.com you wrote:

...

I believe such calls are getting resolved via a veneer because of a combination of the thumb2-ness of libgcc and the toolchain being used.

In principle, the linker can know that it is linking for >= ARMv5T due to the way it was configured and the way the input objects were built, but GNU ld is conservative and doesn't do this automatically. As a result, it has to generate a veneer, reached via a normal non-interworking branch. ld has no way to know that the veneer needs to be PIC and use the GOT, so it isn't and doesn't.

Stupid question: why not?

...

The `--use-blx' switch enables the linker to use ARM/Thumb BLX instructions (available on ARMv5t and above) in various situations. Currently it is used to perform calls via the PLT from Thumb code using BLX rather than using BX and a mode-switching stub before each PLT entry. This should lead to such calls executing slightly faster.

...so you might explicitly want to enable this whenever building for ARMv5 or later.

Wolfgang, can you foresee any reason not to do that? As far as I can see it will be safe so long as we don't use it when building for architectures <ARMv5 (where the BLX instruction isn't supported).

If care is taken that it causes no conflicts with older tool chains, I'm happy with that. I guess that can be added to arch/arm/cpu/armv7/config.mk

...

Alternatively, using ld --emit-relocs and then embedding the relocation information in the image so that U-Boot can use it could help to solve the problem. I'm guessing that isn't set up at present, though.

Is there any information available about relative code sizes / performance numbers of "--emit-relocs" versus "--use-blx"?

...

...

is there an option for the toolchain to use an arm libgcc instead of thumb?

You'd need to rebuild the toolchain (or at least libgcc). I believe that no ARM libgcc is built at present for the linaro/Ubuntu tools. I don't think the GCC packages currently support this kind of thing well.

I think that should be fixed. I guess you will run into that again sooner or later.

BTW: why does nobody answer my questions?

Can anybody shed some light on 1) when these routines have been introduced ... ?

Is the specific tool chain in question available somewhere for testing?

Best regards,

Wolfgang Denk

-- DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de Old programmers never die, they just branch to a new address.

Nicolas Pitre

9:32 p.m.

On Tue, 21 Sep 2010, Wolfgang Denk wrote:

...

Is there any information available about relative code sizes / performance numbers of "--emit-relocs" versus "--use-blx"?

The blx instruction will always win on both counts: it is smaller and faster than a veneer.

...

BTW: why does nobody answer my questions?

    Can anybody shed some light on 1) when these routines have
    been introduced ... ?

The "veneers" are just code stubs that the linker automatically insert into the final binary in order to work around some incompatibility issues.

For example, if one of your .o file contains the following instruction:

bl foobar

During the link phase the linker may realize that the foobar function is too far away from the call site above (the bl instruction is relative to the current pc and has a limited range). In that case the linker has two choices: abort the link, or append to your .o this code:

foobar_veneer:
    ldr pc, [pc, #-4]    /* pc is always 8 bytes ahead */
    .word foobar

and then the "bl foobar" is modified to branch to foobar_veneer instead in order to produce an absolute call.
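
The range limit mentioned above can be stated precisely. Assuming the ARM-state BL encoding (a signed 24-bit word offset relative to the call site plus 8), the reachable window is roughly +/-32 MB. The sketch below checks whether a direct call would fit; the function name is made up for illustration.

```python
def arm_bl_in_range(call_site: int, target: int) -> bool:
    """True if an ARM-state BL at `call_site` can reach `target` directly."""
    offset = target - (call_site + 8)          # BL offsets are relative to pc = site + 8
    # The encoded field is a signed 24-bit count of 4-byte words.
    return offset % 4 == 0 and -(1 << 25) <= offset < (1 << 25)


# The i2c_init call from earlier in the thread is well within range ...
print(arm_bl_in_range(0x8000B38C, 0x80035CA0))   # True
# ... whereas a target 64 MB away could not be encoded and would need a veneer.
print(arm_bl_in_range(0x80000000, 0x84000000))   # False
```

The first result is a reminder that the veneers in this thread are not there for range reasons; distance alone would not have required them.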

Those veneers are also used for other things, such as ARM vs Thumb interworking, which is the case in this thread.

...

    Is the specific tool chain in question available somewhere
    for testing?

As far as I know, any reasonably recent toolchain (e.g. like toolchains released even 2 years ago) will emit veneers when required.

Nicolas

Wolfgang Denk

9:44 p.m.

Dear Nicolas Pitre,

In message alpine.LFD.2.00.1009211711080.13233@xanadu.home you wrote:

...

...
    Is the specific tool chain in question available somewhere
    for testing?
As far as I know, any reasonably recent toolchain (e.g. like toolchains released even 2 years ago) will emit veneers when required.

Thanks. But that was not really my question.

I understand the answer is "no", then?

Best regards,

Wolfgang Denk

-- DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de In the bathtub of history the truth is harder to hold than the soap, and much more difficult to find ... - Terry Pratchett, _Sourcery_

Nicolas Pitre

22 Sep 2010

12:42 a.m.

On Tue, 21 Sep 2010, Wolfgang Denk wrote:

...

Dear Nicolas Pitre,

In message alpine.LFD.2.00.1009211711080.13233@xanadu.home you wrote:

...
...
    Is the specific tool chain in question available somewhere
    for testing?
As far as I know, any reasonably recent toolchain (e.g. like toolchains released even 2 years ago) will emit veneers when required.
Thanks. But that was not really my question.

I understand the answer is "no", then?

If you want the exact same toolchain, you may have a look at: http://lists.linaro.org/pipermail/linaro-toolchain/2010-September/000155.htm...

Nicolas

Dave Martin

9:22 a.m.

On Wed, Sep 22, 2010 at 1:42 AM, Nicolas Pitre nicolas.pitre@linaro.org wrote:

[...]

...

If you want the exact same toolchain, you may have a look at: http://lists.linaro.org/pipermail/linaro-toolchain/2010-September/000155.htm...

I don't know exactly when --use-blx was introduced, but it has apparently existed for a long time; the vast majority of arm-*eabi toolchains should support it.

I guess we could have a configure-time test to see whether the option is supported, or only use it for thumb-2 capable platforms (the latter option may be the more sensible one, and will not cruft-ify the build system)

Cheers ---Dave

Dave Martin

10:14 a.m.

On Tue, Sep 21, 2010 at 9:29 PM, Wolfgang Denk wd@denx.de wrote:

...

Dear Dave Martin,

In message AANLkTikHv1SpjcyRbRxGK2QeCoq96tsTken3LmP5bsse@mail.gmail.com you wrote:

...
I believe such calls are getting resolved via a veneer because of a combination of the thumb2-ness of libgcc and the toolchain being used.

In principle, the linker can know that it is linking for >= ARMv5T due to the way it was configured and the way the input objects were built, but GNU ld is conservative and doesn't do this automatically. As a result, it has to generate a veneer, reached via a normal non-interworking branch. ld has no way to know that the veneer needs to be PIC and use the GOT, so it isn't and doesn't.

Stupid question: why not?

Because U-Boot doesn't build PIC for ARM (I notice it does for some other arches).

u-boot$ grep -irl -- '-fpic|-shared' `find . -name Makefile* -o -name *.mk`
./arch/mips/config.mk
./arch/sparc/cpu/leon2/config.mk
./arch/sparc/cpu/leon3/config.mk
./arch/powerpc/cpu/mpc85xx/config.mk
./arch/powerpc/cpu/mpc5xxx/config.mk
./arch/powerpc/cpu/mpc824x/config.mk
./arch/powerpc/cpu/mpc86xx/config.mk
./arch/powerpc/cpu/mpc83xx/config.mk
./arch/powerpc/cpu/mpc8xx/config.mk
./arch/powerpc/cpu/mpc8260/config.mk
./arch/powerpc/cpu/mpc8220/config.mk
./arch/powerpc/cpu/ppc4xx/config.mk
./arch/powerpc/cpu/74xx_7xx/config.mk
./arch/powerpc/cpu/mpc5xx/config.mk
./arch/powerpc/cpu/mpc512x/config.mk
./arch/avr32/config.mk
./arch/m68k/cpu/mcf5227x/config.mk
./arch/m68k/cpu/mcf547x_8x/config.mk
./arch/m68k/cpu/mcf532x/config.mk
./arch/m68k/cpu/mcf523x/config.mk
./arch/m68k/cpu/mcf5445x/config.mk

Note that ARM code is fairly PIC even without -fPIC, except for references to data (which are usually absolute), and certain cases of veneers/trampolines inserted by the linker (as we saw).

One thing I'm confused about: why do references to read-only data not cause a problem? I would expect the read-only data to need to be relocated along with .text, but currently u-boot.bin references these with absolute addresses (because of no -fPIC). Are we just tending to get lucky, i.e., in practice U-Boot is usually run at the link address and not copied elsewhere?

To illustrate, here's an example: note the absence of a GOT or any relocations in u-boot, and the non-relocatable absolute reference to a string in .rodata.str1.1

u-boot$ objdump -dr net/net.o
[...]
000009d4 <ArpRequest>:
[...]
 ab8: 00000079 .word 0x00000079
   ab8: R_ARM_ABS32 .rodata.str1.1
[...]

u-boot$ objdump -dh u-boot

u-boot: file format elf32-littlearm

Sections:
Idx Name             Size      VMA       LMA       File off  Algn
  0 .text            00014c4c  06000000  06000000  00008000  2**5
                     CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata          00000fdc  06014c4c  06014c4c  0001cc4c  2**2
                     CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .rodata.str1.1   00004180  06015c28  06015c28  0001dc28  2**0
                     CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .data            00000914  06019da8  06019da8  00021da8  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  4 .u_boot_cmd      00000378  0601a6bc  0601a6bc  000226bc  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  5 .bss             00035f30  0601aa34  0601aa34  00022a34  2**2
                     ALLOC
  6 .ARM.attributes  0000002d  00000000  00000000  00022a34  2**0
                     CONTENTS, READONLY
  7 .comment         00000041  00000000  00000000  00022a61  2**0
                     CONTENTS, READONLY
  8 .debug_line      00005dd0  00000000  00000000  00022aa2  2**0
                     CONTENTS, READONLY, DEBUGGING
  9 .debug_info      0001756a  00000000  00000000  00028872  2**0
                     CONTENTS, READONLY, DEBUGGING
 10 .debug_abbrev    00006726  00000000  00000000  0003fddc  2**0
                     CONTENTS, READONLY, DEBUGGING
 11 .debug_aranges   000006a0  00000000  00000000  00046508  2**3
                     CONTENTS, READONLY, DEBUGGING
 12 .debug_loc       0000f178  00000000  00000000  00046ba8  2**0
                     CONTENTS, READONLY, DEBUGGING
 13 .debug_pubnames  000022f6  00000000  00000000  00055d20  2**0
                     CONTENTS, READONLY, DEBUGGING
 14 .debug_ranges    00000880  00000000  00000000  00058016  2**0
                     CONTENTS, READONLY, DEBUGGING
 15 .debug_str       00004cd5  00000000  00000000  00058896  2**0
                     CONTENTS, READONLY, DEBUGGING
 16 .debug_frame     00003178  00000000  00000000  0005d56c  2**2
                     CONTENTS, READONLY, DEBUGGING
[...]
06001a44 <ArpRequest>:
[...]
 6001b28: 06016065 .word 0x06016065
[...]
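
To confirm that the literal 0x06016065 really is an absolute pointer into the string section, one can look it up against the section table from the objdump output above. A small sketch, using only the loadable read-only and data sections copied from that table (the helper name locate is made up):

```python
# (name, VMA, size) copied from the objdump -h output above
SECTIONS = [
    (".text",          0x06000000, 0x00014C4C),
    (".rodata",        0x06014C4C, 0x00000FDC),
    (".rodata.str1.1", 0x06015C28, 0x00004180),
    (".data",          0x06019DA8, 0x00000914),
]


def locate(address: int):
    """Return (section name, offset within section) or None if not covered."""
    for name, vma, size in SECTIONS:
        if vma <= address < vma + size:
            return name, address - vma
    return None


name, offset = locate(0x06016065)
print(name, hex(offset))  # .rodata.str1.1 0x43d
```

The literal resolves to a link-time address inside .rodata.str1.1, so it would be stale if the image ran from anywhere other than 0x06000000.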

As to why the linker doesn't automatically know that the expensive veneers aren't needed, I don't think there's any really good reason; it's just not implemented AFAIK.

ld doesn't know the target architecture in the same way that gcc does --- I believe ld doesn't understand a -march= switch for most architectures.

ld could guess the target architecture based on information the compiler puts in the objects, but this might be a future thing. Some ARM toolchains do it, ld currently doesn't. I think it may get done in the future, but in the meantime we need to manage without :/

[...]

...

Is there any information available about relative code sizes / performance numbers of "--emit-relocs" versus "--use-blx"?

I don't have numbers, but it's straightforward to answer: the resulting code should be identical at run-time. But --use-blx gets the linker to do the work for you, whereas --emit-relocs requires something else in the build system to perform these fixups. Both are smaller and faster than branching via veneers (the current behaviour).

...

...
...

is there an option for the toolchain to use an arm libgcc instead of thumb?

You'd need to rebuild the toolchain (or at least libgcc). I believe that no ARM libgcc is built at present for the linaro/Ubuntu tools. I don't think the GCC packages currently support this kind of thing well.

I think that should be fixed. I guess you will run into that again sooner or later.

Indeed... I believe it is being looked at in relation to multiarch; there seems to be a general consensus that multilibs isn't really scalable enough. But we're going to have to put up with this for a while in the interim...

[...]

...

Can anybody shed some light on 1) when these routines have been introduced ... ?

I think this was already answered, but to clarify from my side: nothing has been introduced. The observed behaviour is something the linker does when it sees a mixture of ARM and Thumb code, so it happens as a side-effect of using a toolchain which has a Thumb-2 libgcc to build ARM code (i.e., U-Boot). Because most people have non-Thumb toolchains, the problem hasn't been observed before...

This also means that using such a toolchain to build U-Boot for a platform which doesn't support Thumb-2 will result in a broken build containing Thumb-2 libgcc code that the target can't run. But this shouldn't affect toolchains which default to ARM (the usual case, except for Ubuntu/linaro), and in particular shouldn't break any toolchain/U-Boot/platform combinations which currently work.

Again, fixing this properly really requires the multiple-libgcc problem to be solved.

Cheers ---Dave

Wolfgang Denk

10:48 a.m.

Dear Dave Martin,

In message AANLkTi=3GCRv00HRA=aXzrV1XxJTcBAV=2H_QeTR0=2R@mail.gmail.com you wrote:

...

...
...
non-interworking branch. ld has no way to know that the veneer needs to be PIC and use the GOT, so it isn't and doesn't.

Stupid question: why not?

Because U-Boot doesn't build PIC for ARM (I notice it does for some other arches).

You are referring to old code. John explicitly mentioned that he was working on the "next" branch, which has this:

# needed for relocation
PLATFORM_RELFLAGS += -fPIC

Note that BEFORE commit f1d2b31 ("ARM: add relocation support") we did not see such issues (at least I never did, and my understanding is that John didn't either).

Best regards,

Wolfgang Denk

-- DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de Time is fluid ... like a river with currents, eddies, backwash. -- Spock, "The City on the Edge of Forever", stardate 3134.0

Dave Martin

12:29 p.m.

Hi,

On Wed, Sep 22, 2010 at 11:48 AM, Wolfgang Denk wd@denx.de wrote:

[...]

...

...
Because U-Boot doesn't build PIC for ARM (I notice it does for some other arches).

...

You are referring to old code. John explicitly mentioned that he was working on the "next" branch, which has this:

# needed for relocation
PLATFORM_RELFLAGS += -fPIC

Ah! Apologies, I missed that... looks like I'm out of date on some assumptions.

This tells the compiler to generate PIC code, but it doesn't tell the linker to generate PIC output... which matters if the linker needs to add extra code during the link.

Two solutions come to mind:

a) In order for linker-added stuff to be PIC, you could link with -shared. This will definitely PIC-ify any veneers added by the linker and push related relocations into the GOT. Strictly speaking it might be wrong not to do this if you expect the output from the linker to be fully PIC -- if so, this may apply to all arches where the linker may generate code. Naturally, it's necessary to ensure that the U-Boot ELF image doesn't accidentally get linked against any shared libs. Checking for DT_NEEDED entries in the u-boot ELF image would be a way to sanity-check this, but the way U-Boot drives the link looks pretty safe (no -l options; explicit references to .a libs only, etc.)

b) For ARM specifically, there is also a --pic-veneer option which may also Do The Right Thing in the specific case of these veneers, even when not using ld -shared. Again, I don't know precisely which toolchain versions support this; if we want to be really safe it may be necessary to probe for support for this option at configure-time.

--use-blx is probably still a good idea in either case, but this is more about optimisation than correctness. The optimisation is probably still worthwhile though, if it might speed up calls into libgcc.

Cheers ---Dave

Loïc Minier

12:38 p.m.

On Wed, Sep 22, 2010, Dave Martin wrote:

...

This tells the compiler to generate PIC code, but it doesn't tell the linker to generate PIC output... which matters if the linker needs to add extra code during the link.

Perhaps a stupid question, but why -fPIC/-shared and not -fPIE/-pie?

-- Loïc Minier

Dave Martin

12:55 p.m.

Hi,

On Wed, Sep 22, 2010 at 1:38 PM, Loïc Minier loic.minier@linaro.org wrote:

...

On Wed, Sep 22, 2010, Dave Martin wrote:

...
This tells the compiler to generate PIC code, but it doesn't tell the linker to generate PIC output... which matters if the linker needs to add extra code during the link.

Perhaps a stupid question, but why -fPIC/-shared and not -fPIE/-pie?

Dunno :)

I'm not a toolchain expert, so I'm happy to be overridden... but my _guess_ is:

I think that in practice (at least on arm) cc -fPIC = cc -fPIE, and ld -pie just forces ld to generate PIC veneers (as for -shared). Beyond this, I think ld -shared / -pie / (nothing) probably just changes which linker script is used by default. U-Boot overrides the default with its own linker script anyway, so it may make no difference.

---Dave

Loïc Minier

1:04 p.m.

On Wed, Sep 22, 2010, Dave Martin wrote:

...

I'm not a toolchain expert, so I'm happy to be overridden... but my _guess_ is:

I think that in practice (at least on arm) cc -fPIC = cc -fPIE, and ld -pie just forces ld to generate PIC veneers (as for -shared). Beyond this, I think ld -shared / -pie / (nothing) probably just changes which linker script is used by default. U-Boot overrides the default with its own linker script anyway, so it may make no difference.

Catching up on email, I just came across: http://article.gmane.org/gmane.comp.boot-loaders.u-boot/84789 so it seems to be different, but not significantly

I don't care too strongly, but it might make sense to try to use the flag which means exactly what we want to allow for future optimizations?

-- Loïc Minier

Dave Martin

1:23 p.m.

Hi,

On Wed, Sep 22, 2010 at 2:04 PM, Loïc Minier loic.minier@linaro.org wrote:

...

On Wed, Sep 22, 2010, Dave Martin wrote:

...
I'm not a toolchain expert, so I'm happy to be overridden... but my _guess_ is:

I think that in practice (at least on arm) cc -fPIC = cc -fPIE, and ld -pie just forces ld to generate PIC veneers (as for -shared). Beyond this, I think ld -shared / -pie / (nothing) probably just changes which linker script is used by default. U-Boot overrides the default with its own linker script anyway, so it may make no difference.

Catching up on email, I just came across: http://article.gmane.org/gmane.comp.boot-loaders.u-boot/84789 so it seems to be different, but not significantly

Hmmm, interesting discussion: looks like my guess was naive ;)

It looks like -fPIE provides some savings after all, since it enables the compiler to make some additional assumptions.

...

I don't care too strongly, but it might make sense to try to use the flag which means exactly what we want to allow for future optimizations?

Since U-Boot is now trying to build PIC anyway, I suggest that for things to work robustly one of the following is needed anyway:

* -fPIE + -pie
* -fPIC + -shared (less optimal?)

Logically, pie seems to be the closest match to what is actually happening in U-Boot.

Then we can optionally add:

* --use-blx (to get rid of ARM<->Thumb interworking veneers)

--pic-veneer could be used as a temporary fix, but this feels fragile to me: it only controls a specific aspect of linker behaviour, so we could hit other problems in the future. We could do this as a short-term workaround to allow the linaro toolchain to be used in the meantime, though.

Cheers ---Dave

linaro-toolchain@lists.linaro.org

15 comments

participants

tags (0)

participants (5)

Dave Martin
John Rigby
Loïc Minier
Nicolas Pitre
Wolfgang Denk