+ linaro-toolchain
Hello Ulrich,
I want to revisit this old thread. Sorry for the sloppy follow-up. But, this time around I have more data.
On Tue, Mar 15, 2011 at 9:00 PM, Ulrich Weigand Ulrich.Weigand@de.ibm.com wrote:
I could reproduce the problem on older tool-chain(Sourcery G++ Lite 2010q1-202) [1] with a modified version of your sample code:
a.c: ==== extern void foo (void) __attribute__ ((weak, alias ("__foo")));
void __foo (void) { }
extern void call_foo(void);
int main (void) { call_foo (); }
b.S: ==== .text .align 2 .global foo foo: push {r7} add r7, sp, #0 mov sp, r7 pop {r7} bx lr .size foo, .-foo
c.S .text .align 2
.global call_foo call_foo: bl foo bx lr
.global __aeabi_unwind_cpp_pr0 __aeabi_unwind_cpp_pr0: bx lr
Now, I build them using the following commands, which is similar to what U-Boot does:
arm-none-linux-gnueabi-gcc -mthumb -mthumb-interwork -c a.c arm-none-linux-gnueabi-gcc -mthumb -mthumb-interwork -c b.S arm-none-linux-gnueabi-gcc -mthumb -mthumb-interwork -c c.S arm-none-linux-gnueabi-ld -r a.o -o alib.o arm-none-linux-gnueabi-ld -r b.o -o blib.o arm-none-linux-gnueabi-ld -r c.o -o clib.o arm-none-linux-gnueabi-ld --start-group clib.o alib.o blib.o --end-group -o a.out armobjdump -S --reloc a.out
You will get something like: 00008094 <call_foo>: 8094: fa000006 blx 80b4 <foo> 8098: e12fff1e bx lr
Please note that that the 'blx' is not correct. Now, do the following change:
diff --git a/b.S b/b.S index e0f2de9..96dba1f 100644 --- a/b.S +++ b/b.S @@ -1,5 +1,6 @@ .text .align 2 +.type foo, %function .global foo foo: push {r7}
And build it again the same way and you will see: 00008094 <call_foo>: 8094: eb000006 bl 80b4 <foo> 8098: e12fff1e bx lr
I can't reproduce this on Linaro GCC 2012.01, so looks like the problem is solved in recent tool-chains. However, sadly I could reproduce a different but similar problem with Linaro GCC 2012.01. This time the call is from C(Thumb) to assembly(ARM) and no weakly linked symbols are involved.
a.c: ==== int main (void) { foo (); }
b.S: ==== .text .align 2 .global foo foo: push {r7} add r7, sp, #0 mov sp, r7 pop {r7} bx lr .size foo, .-foo
.global __aeabi_unwind_cpp_pr0 __aeabi_unwind_cpp_pr0: bx lr
arm-linux-gnueabi-gcc -mthumb -mthumb-interwork -c a.c arm-linux-gnueabi-gcc -mthumb -mthumb-interwork -c b.S arm-linux-gnueabi-ld -r a.o -o alib.o arm-linux-gnueabi-ld -r b.o -o blib.o arm-linux-gnueabi-ld --start-group alib.o blib.o --end-group -o a.out arm-linux-gnueabi-objdump -S --reloc a.out
gives: 8076: af00 add r7, sp, #0 8078: f000 f802 bl 8080 <foo> 807c: 4618 mov r0, r3
It should have been "blx 8080 <foo>", isn't it? Again, %function solves it.
I agree that not marking the assembly functions ' %function' is a problem in the code, so it's not a critical bug. But I would've been happier if the linker refused to link it rather than branching with the wrong instruction. Isn't that a problem?
Problem No:2 ************* Linaro GCC 2012.01 is giving a new problem w.r.to Thumb build that is not existing in Sourcery G++ Lite 2010q1-202. However, I couldn't reproduce this problem with a small program like above. So, let me give you reference to the original u-boot code that shows the problem and steps to reproduce it.
tree: git://github.com/aneeshv/u-boot.git branch: thumb
The above branch has mainline u-boot with 4 additional patches from me for enabling Thumb build. You can build it like this:
make CROSS_COMPILE=arm-linux-gnueabi- ARCH=arm distclean make CROSS_COMPILE=arm-linux-gnueabi- ARCH=arm omap4_sdp4430
This builds two images u-boot and u-boot-spl. SPL is a tiny u-boot that runs from internal RAM and loads the u-boot to SDRAM. Now, please have a look at the map file of u-boot-spl which is at: spl/u-boot-spl.map
I see the following in my map file:
/spl/u-boot-spl.map: ================= .rodata.wkup_padconf_array_essential_4460 0x40309583 0x4 board/ti/sdp4430/libsdp4430.o 0x40309583 wkup_padconf_array_essential_4460 .rodata.wkup_padconf_array_essential 0x40309587 0xc board/ti/sdp4430/libsdp4430.o 0x40309587 wkup_padconf_array_essential .rodata.core_padconf_array_essential 0x40309593 0x60 board/ti/sdp4430/libsdp4430.o 0x40309593 core_padconf_array_essential
Please note that the .rodata symbols have odd addresses. These arrays actually need to be aligned at least to half-word boundary. In fact, in the image I verified that they are put at even addresses. So, the symbols have been kept as real address +1. I understand that this is the convention for Thumb functions. I guess the tool-chain did it for data too?? And I am getting unaligned access aborts on accessing them.
I notice that this doesn't happen with all .rodata. symbols in the image. I couldn't see any difference between working and non-working files nor any difference in the command used to build them!
Well, this doesn't happen if I don't use "-fdata-sections" in gcc options. So, apply the following patch and you will see that those symbols have even addresses now.
diff --git a/config.mk b/config.mk index ddaa477..723286a 100644 --- a/config.mk +++ b/config.mk @@ -190,7 +190,7 @@ CPPFLAGS := $(DBGFLAGS) $(OPTFLAGS) $(RELFLAGS) \
# Enable garbage collection of un-used sections for SPL ifeq ($(CONFIG_SPL_BUILD),y) -CPPFLAGS += -ffunction-sections -fdata-sections +CPPFLAGS += -ffunction-sections LDFLAGS_FINAL += --gc-sections endif
/spl/u-boot-spl.map: ================= .rodata 0x40309204 0x38c board/ti/sdp4430/libsdp4430.o 0x40309204 core_padconf_array_essential 0x40309264 wkup_padconf_array_essential 0x40309270 wkup_padconf_array_essential_4460 0x40309274 core_padconf_array_non_essential 0x40309540 wkup_padconf_array_non_essential 0x40309588 wkup_padconf_array_non_essential_4430
Will you be able to look into these?
Thanks, Aneesh
[1] Sourcery G++ Lite 2010q1-202 arm-none-linux-gnueabi-gcc (Sourcery G++ Lite 2010q1-202) 4.4.1 GNU ld (Sourcery G++ Lite 2010q1-202) - binutils 2.19.51.20090709
[2] Linaro 4.6-2012.01 arm-linux-gnueabi-gcc (crosstool-NG linaro-1.13.1-2012.01-20120125 - Linaro GCC 2012.01) 4.6.3 20120105 (prerelease) GNU ld (crosstool-NG linaro-1.13.1-2012.01-20120125 - Linaro GCC 2012.01) 2.22
"V, Aneesh" aneesh@ti.com wrote:
Well, if the target symbol of a branch is not marked as %function, the linker has no way of knowing whether that target is ARM or Thumb, so it cannot specifically error out if (and only if) the instruction is wrong.
The linker *could* in theory give an error *unconditionally* whenever it detects a branch to a non-%function symbol. The reason this is not done is probably for backwards compatibility with old hand-written code, say from an ARM-only era: branches to non-function symbols used to be allowed, and in fact work fine if all code is ARM-only. Adding an error would break such old code.
[snip]
This seems strange. How did you verify "that they are put at even addresses"? I can reproduce the odd addresses of .rodata symbols. However, this occurs simply because the linker put *no* alignment requirement whatsoever on those sections:
Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [snip] [11] .rodata.wkup_padc PROGBITS 00000000 000100 000004 00 A 0 0 1 [12] .rodata.wkup_padc PROGBITS 00000000 000104 000048 00 A 0 0 1 [13] .rodata.wkup_padc PROGBITS 00000000 00014c 00000c 00 A 0 0 1 [14] .rodata.wkup_padc PROGBITS 00000000 000158 000004 00 A 0 0 1
Note the "Al" column values of 1. In the final executable, those sections happen to end up immediately following a .rodata.str string section with odd lenght, and since they don't have any alignment requirement, they start out at an odd address.
The reason for the lack of alignment requirement is actually in the source:
const struct pad_conf_entry core_padconf_array_essential[] = {
where
struct pad_conf_entry {
u16 offset;
u16 val;
} __attribute__ ((packed));
The "packed" attribute specifies that all struct elements ought to be considered to have alignment requirement 1 instead of their default alignment. Thus the whole struct ends up having alignment requirement 1; and since the section contains only a single variable of such struct type, this is then also the alignment requirement of the section.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
-- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294
On Mon, Feb 20, 2012 at 06:59:53PM +0100, Ulrich Weigand wrote:
Is "packed" even wanted here?
Based on a very brief skim of the code, it looks like the packed attribute is an unnecessary attempt to save some space -- unnecessary because all ARM ABI variants I know of (actually, all arches I know of) will pack that structure into a word anyway as a result of natural alignment of the members. It doesn't look to me like the packed structure is used to map a memory-mapped peripheral directly or otherwise communicate with the outside world -- which would be the only situations in which a packed structure would normally make sense.
Of course, I may have missed something...
Cheers ---Dave
On Tue 21 Feb 2012 09:57:12 GMT, Dave Martin wrote:
I'm not sure, but I believe that the compiler requires (prefers) any structs that you want included inside packed structs to be themselves packed, so you can end up with some structs with apparently unnecessary attributes on them.
It might also have an effect when you place the struct inside an unpacked struct?
Without context I've no idea whether that's what's going on here. Of course, a no-op "packed" attribute ought to be harmless...
Andrew
Andrew Stubbs andrew.stubbs@linaro.org wrote on 21.02.2012 11:56:07:
I don't see why the compiler would care. Its just that if an inner struct already has padding somewhere, this padding doesn't go away (the struct layout is not recomputed) just because it is embedded in a larger struct which is marked packed. [Marking the outer struct packed will still affect the overall alignment requirement of the inner struct; it just won't affect it inner layout.]
It might also have an effect when you place the struct inside an unpacked struct?
Yes, of course: if the struct is packed, then the struct as a whole has an alignment requirement of 1, which may affect the placement of the struct as an element within an outside (unpacked) struct.
Without context I've no idea whether that's what's going on here. Of course, a no-op "packed" attribute ought to be harmless...
It is not really "no-op": it reduces the alignment requirement of the struct from 2 (in this case) to 1, which is actually exactly what's causing the problems in the original test case.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
-- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294
On Tue, Feb 21, 2012 at 12:05:54PM +0100, Ulrich Weigand wrote:
With packed, the compiler can validly emit misaligned memory accesses, which is bad when building bare-metal code where the underlying memory may be strongly-ordered, since the CPU will throw exceptions in that case.
I don't know whether that applies to this case, though.
I can't see struct pad_conf_entry nested inside anything else, so it just looks unnecessary for this to be packed (I still didn't check the context of this thoroughly, however...)
What happens if that attribute is simply removed from all the affected headers?
Cheers ---Dave
Oops! Sorry. These mails skipped my Inbox and went into a sub-folder in my mail client that I hadn't used for a long time. I didn't realize that I had mails!
On Monday 20 February 2012 11:29 PM, Ulrich Weigand wrote:
Ok. Agree. I never used to use %function when I wrote assembly functions earlier. I am sure a lot of code will break if this was enforced.
Sorry, that was an error in interpreting the data. The data is put at the odd address given in the map file.
Hmm.. Thanks. Removing packed seems to help. The following also helped.
@@ -34,7 +34,7 @@ struct pad_conf_entry {
u16 val;
-} __attribute__ ((packed)); +} __attribute__ ((packed)) __attribute__ ((aligned(2)));
BTW, just for my understanding: The effect of adding __attribute__ ((packed)) to a structure is equivalent to adding it for each field in the structure, right? And because the first field doesn't have any alignment requirement, the struct also doesn't have any alignment requirement right?
Wonder why the issue never occurred until we started building in Thumb.
Thanks a lot for looking into this.
best regards, Aneesh
Aneesh V aneesh@ti.com wrote on 23.02.2012 11:27:40:
1;
Right.
And because the first field doesn't have any alignment requirement, the struct also doesn't have any alignment requirement right?
Sort of. The alignment requirement of the struct is the maximum (actually, the least common multiple, but since we're talking only powers of two, that amounts to the same) of the alignment requirements of all members. However, attribute packed for a struct applies to each element separately, and thus each element has alignment requirement 1, which is then also the maximum.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
-- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294
On 23/02/12 10:27, Aneesh V wrote:
If you've not used %function on ARM, then your code is semantically broken even if it isn't syntactically broken.
The ABI rules for dealing with interworking and and out-of-range branches all rely on %function being used correctly.
R.
On Thursday 23 February 2012 07:26 PM, Richard Earnshaw wrote:
Hmm.. I learned it the hard way:)
BTW, very few assembly functions in U-Boot had %function or equivalent. I am trying to clean it up at least for armv7.
Thanks, Aneesh
linaro-toolchain@lists.linaro.org