On Wed, Oct 16, 2013 at 12:13:28PM +0100, Ben Dooks wrote:
On 15/10/13 23:38, Taras Kondratiuk wrote:
Hi
I was debugging kprobes-test for BE8 and noticed that some data fields are stored in LE instead of BE. It happens because these data fields get interpreted as instructions.
Is it a known issue?
I reported the crashes to Tixy along with a different method of solving the problem (changed to using pointers to the strings) a while ago. However it seems that nothing has happened to fix this.
Since kprobes seems to work with the fixed tests I forgot to follow up and prod Jon about looking into this problem.
Jon, if you are not interested in fixing this, then please let me know and we can get a patch sorted to fix it.
PS, I am going to leave this out of the current be8 patchset as I want to get that merged, and at the moment kprobes-test is not essential to getting the system started.
For example:

    test_align_fail_data:
            bx      lr
            .byte   0xaa
            .align
            .word   0x12345678

I would expect to see something like this:

    00000000 <test_align_fail_data>:
       0:   e12fff1e    bx      lr
       4:   aa          .byte   0xaa
       5:   00          .byte   0x00
       6:   0000        .short  0x0000
       8:   12345678    .word   0x12345678

But instead I have:

    00000000 <test_align_fail_data>:
       0:   e12fff1e    bx      lr
       4:   aa          .byte   0xaa
       5:   00          .byte   0x00
       6:   0000        .short  0x0000
       8:   12345678    eorsne  r5, r4, #120, 12        ; 0x7800000
As a result the word 0x12345678 will be stored in LE.
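To make the consequence concrete, here is a small sketch in Python (not part of the original test; the variable names are made up for illustration). It assumes only what is described above: BE8 keeps data big-endian but stores instructions little-endian, so a data word that gets treated as an instruction ends up byte-reversed.

```python
import struct

WORD = 0x12345678

# How a big-endian data word should sit in memory.
as_data_be = struct.pack(">I", WORD)

# How the same word ends up if it is treated as a BE8 instruction
# (instructions are stored little-endian in a BE8 image).
as_insn_le = struct.pack("<I", WORD)

# What a big-endian data load would read back from the swapped bytes.
(read_back,) = struct.unpack(">I", as_insn_le)

print(as_data_be.hex())  # 12345678
print(as_insn_le.hex())  # 78563412
print(hex(read_back))    # 0x78563412
```

So code that loads this word as data on a BE8 system would see 0x78563412 rather than 0x12345678.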
I've run several tests and here are my observations:
- Double ".align" fixes the issue :)
- Behavior is the same for LE/BE, ARM/Thumb, GCC 4.4.1/4.6.x/4.8.2
- Size of alignment doesn't matter.
- Issue happens only if previous data is not instruction-aligned and 0's are added before NOPs.
- Explicit filling with 0's (.align , 0) fixes the issue, but as a side effect data @0x4 is interpreted as a single ".word 0xaa000000" instead of ".byte .byte .short". I'm not sure if there can be any functional difference because of this.
There isn't. This is just objdump's pretty-printing behaviour.
Objdump prints out the data in naturally-aligned units, but this has nothing to do with whether .byte, .short or .word/.long was used in the source.
- Issue doesn't happen if there is no instructions before data (no "bx lr" in the example).
- Issue doesn't happen if data after .align is defined as ".type <symbol>, %object".
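On the pretty-printing point above (data shown in naturally-aligned units), the following Python sketch is a simplified model, not objdump's actual code. The function names are hypothetical, and it assumes that each $d mapping symbol starts a new data region and that the ".align , 0" variant leaves a single $d at offset 4.

```python
NAMES = {1: ".byte", 2: ".short", 4: ".word"}


def natural_units(start, end):
    """Split [start, end) into naturally-aligned 1/2/4-byte units."""
    units = []
    addr = start
    while addr < end:
        # Take the largest unit that is aligned here and still fits.
        for size in (4, 2, 1):
            if addr % size == 0 and addr + size <= end:
                units.append((addr, size))
                addr += size
                break
    return units


def describe(regions):
    """Return (address, directive) pairs for a list of data regions."""
    return [(addr, NAMES[size])
            for start, end in regions
            for addr, size in natural_units(start, end)]


# Original test case: $d at 4 and at 5, next $a at 8.
split_case = describe([(4, 5), (5, 8)])
# Assumed layout with explicit zero fill: one $d at 4, data up to 8.
single_case = describe([(4, 8)])

print(split_case)   # [(4, '.byte'), (5, '.byte'), (6, '.short')]
print(single_case)  # [(4, '.word')]
```

Under this model the bytes in memory are identical in both cases; only the way the region is carved up for display differs.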
Thanks for getting down to a simple test case.
Unfortunately, objdump can and does get confused about data/instruction boundaries, so the output you see above may be misleading.
Displaying the symbol table with --special-syms will list the magic symbols that mark the instruction and data boundaries, to help debug this kind of situation.
However, in this case, I think you've found a bug in the assembler, as shown below.
Before linking, the final $a symbol (indicating the start of ARM instructions) is at address 8, so in this case objdump is correct to show 0x12345678 as an instruction.
After linking, the mapping symbols ($[atd]) remain as before, and the linker has byteswapped this "instruction" (as it should).
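As a rough illustration of why the misplaced $a matters, here is a Python sketch of a BE8-style byteswap driven by mapping symbols. It is a simplified model and not the linker's implementation: be8_swap is a hypothetical helper, and it only handles ARM ($a) regions, ignoring Thumb ($t). The section bytes and symbol offsets are copied from the test.o dump in this mail.

```python
def be8_swap(section, mapping):
    """Byte-reverse every 32-bit word that lies in a $a (ARM code) region.

    mapping is a list of (offset, kind) pairs sorted by offset, where
    kind is "$a" or "$d"; a region runs up to the next mapping symbol.
    """
    out = bytearray(section)
    for i, (start, kind) in enumerate(mapping):
        end = mapping[i + 1][0] if i + 1 < len(mapping) else len(section)
        if kind != "$a":
            continue  # data regions are left in big-endian order
        for off in range(start, end - 3, 4):
            out[off:off + 4] = section[off:off + 4][::-1]
    return bytes(out)


# .text contents of test.o (big-endian object file).
text_o = bytes.fromhex("e12fff1e" "aa000000" "12345678")

# Mapping symbols as emitted by the assembler: note the $a at offset 8.
actual = be8_swap(text_o, [(0, "$a"), (4, "$d"), (5, "$d"), (8, "$a")])

# What one would expect if offset 8 were still covered by $d.
expected = be8_swap(text_o, [(0, "$a"), (4, "$d"), (5, "$d")])

print(actual.hex())    # 1eff2fe1aa00000078563412
print(expected.hex())  # 1eff2fe1aa00000012345678
```

The first result matches the .text contents of the linked image in the dump, which is consistent with the linker doing the right thing given the mapping symbols it was handed.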
This is likely related to the magic for inserting the extensible NOP-padding fragment which implements the .align in code sections. That is code, and requires a $a mapping symbol, but that somehow goes AWOL or gets displaced after the alignment padding ...
I can't quite get my head around what is going on in binutils/gas/config/tc-arm.c. We would need to understand that before we can identify a reliable workaround.
My view is to fix this by not doing complicated things by trying to save a bit of space by embedding strings into the code. It is not as if we cannot get the compiler to put the strings into the relevant data area and give us a pointer we can use.
The code in this case is /not easy/ to follow and it would be nice if it could be cleaned up to just take the string as an argument to the test code instead of trying to find it via assembly magic.
My conclusion is the same as yours -- we should avoid these clever tricks in asm for now, if possible. Otherwise, we need to identify a workaround...
Cheers ---Dave
[terminal spew follows]
$ arm-linux-gnueabi-as --version
GNU assembler (GNU Binutils) 2.23.2
$ arm-linux-gnueabi-as -EB -o test.o test.s
$ arm-linux-gnueabi-ld -EB --be8 -o test test.o
arm-linux-gnueabi-ld: warning: cannot find entry symbol _start; defaulting to 0000000000008054
$ arm-linux-gnueabi-objdump -dst --special-syms test.o

test.o:     file format elf32-bigarm

SYMBOL TABLE:
00000000 l    d  .text  00000000 .text
00000000 l    d  .data  00000000 .data
00000000 l    d  .bss   00000000 .bss
00000000 l       .text  00000000 test_align_fail_data
00000000 l       .text  00000000 $a
00000004 l       .text  00000000 $d
00000005 l       .text  00000000 $d
00000008 l       .text  00000000 $a
00000000 l    d  .ARM.attributes        00000000 .ARM.attributes

Contents of section .text:
 0000 e12fff1e aa000000 12345678           ./.......4Vx
Contents of section .ARM.attributes:
 0000 41000000 15616561 62690001 0000000b  A....aeabi......
 0010 06020801 0901                        ......

Disassembly of section .text:

00000000 <test_align_fail_data>:
   0:   e12fff1e    bx      lr
   4:   aa          .byte   0xaa
   5:   00          .byte   0x00
   6:   0000        .short  0x0000
   8:   12345678    eorsne  r5, r4, #120, 12        ; 0x7800000

$ arm-linux-gnueabi-objdump -dst --special-syms test

test:     file format elf32-bigarm

SYMBOL TABLE:
00008054 l    d  .text  00000000 .text
00000000 l    d  .ARM.attributes        00000000 .ARM.attributes
00000000 l    df *ABS*  00000000 test.o
00008054 l       .text  00000000 test_align_fail_data
00008054 l       .text  00000000 $a
00008058 l       .text  00000000 $d
00008059 l       .text  00000000 $d
0000805c l       .text  00000000 $a
00000000 l    df *ABS*  00000000
00010060 g       .text  00000000 _bss_end__
00010060 g       .text  00000000 __bss_start__
00010060 g       .text  00000000 __bss_end__
00000000         *UND*  00000000 _start
00010060 g       .text  00000000 __bss_start
00010060 g       .text  00000000 __end__
00010060 g       .text  00000000 _edata
00010060 g       .text  00000000 _end

Contents of section .text:
 8054 1eff2fe1 aa000000 78563412           ../.....xV4.
Contents of section .ARM.attributes:
 0000 41000000 15616561 62690001 0000000b  A....aeabi......
 0010 06020801 0901                        ......

Disassembly of section .text:

00008054 <test_align_fail_data>:
    8054:   e12fff1e    bx      lr
    8058:   aa          .byte   0xaa
    8059:   00          .byte   0x00
    805a:   0000        .short  0x0000
    805c:   12345678    eorsne  r5, r4, #120, 12        ; 0x7800000
On 10/16/2013 10:25 PM, Dave Martin wrote:
However, in this case, I think you've found a bug in the assembler, as shown below.
Thanks for confirming the issue. Does it make sense to file a GCC bug?
Taras Kondratiuk taras.kondratiuk@linaro.org writes:
Thanks for confirming the issue. Does it make sense to file a GCC bug?
It seems like a binutils (gas) bug to me.
linaro-kernel@lists.linaro.org