On 12 December 2013 21:02, Michael Hudson-Doyle <michael.hudson@linaro.org> wrote:
Hi,
Thanks for the response.
Will Newton <will.newton@linaro.org> writes:
On 12 December 2013 08:00, Michael Hudson-Doyle <michael.hudson@linaro.org> wrote:
Hi all,
I have a bit of a strange one. I'm not after a full solution, just any hints that quickly come to mind :)
After a few simple patches I have a build of mongodb for aarch64 (built with gcc-4.8). However, all of the test binaries that the build spits out immediately segfault. gdb-ing shows that they segfault inside this macro:
TSP_DECLARE(OwnedOstreamVector, threadOstreamCache);
This expands to:
# define TSP_DECLARE(T,p) \
    extern __thread T* _ ## p; \
    template<> inline T* TSP<T>::get() const { return _ ## p; } \
    extern TSP<T> p;
And indeed, it's mongo::TSP<mongo::OwnedPointerVector<...> >::get() const that we're segfaulting in. This is the disassembly of this function (at -O0) with the faulting instruction marked:
   0x00000000004b4b6c <+0>: stp x29, x30, [sp,#-32]!
   0x00000000004b4b70 <+4>: mov x29, sp
   0x00000000004b4b74 <+8>: str x0, [x29,#16]
   0x00000000004b4b78 <+12>: adrp x0, 0x64c000
   0x00000000004b4b7c <+16>: ldr x0, [x0,#776]
   0x00000000004b4b80 <+20>: nop
   0x00000000004b4b84 <+24>: nop
   0x00000000004b4b88 <+28>: mrs x1, tpidr_el0
   0x00000000004b4b8c <+32>: add x0, x1, x0
=> 0x00000000004b4b90 <+36>: ldr x0, [x0]
   0x00000000004b4b94 <+40>: ldp x29, x30, [sp],#32
   0x00000000004b4b98 <+44>: ret
And the registers:
(gdb) info registers
x0 0x7fb863fd70 548554407280
This value looks surprisingly large if it is an offset from TP (x1).
Yeah, it does a bit, doesn't it.
(gdb) p/x $x0 - $x1
$9 = 0x648680
(not really a suspicious number)
I guess I don't understand the adrp code. My understanding is that:
0x00000000004b4b78 <+12>: adrp x0, 0x64c000
would result in 0x4b4000 + 0x64c000 in x0 and then
The disassembler may have done this for you, would 0x64c000 make more sense?
0x00000000004b4b7c <+16>: ldr x0, [x0,#776]
reads from 0x4b4000 + 0x64c000 + 776 but
(gdb) x 0x4b4000 + 0x64c000 + 776
0xb00308: Cannot access memory at address 0xb00308
(I'm not sure if the disassembly for adrp has the immediate shifted or not, but anyway:
(gdb) x 0x4b4000 + (0x64c000<<12) + 776
0x4c4b4308: Cannot access memory at address 0x4c4b4308)
So I'm clearly missing something here...
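The arithmetic in question can be sketched as follows. This assumes, as Will suggests, that the disassembler prints the already-resolved page address for adrp rather than the raw immediate; the helper names are made up for illustration. Under that assumption x0 holds 0x64c000 after the adrp, and the ldr reads from 0x64c000 + 776.

```python
PAGE_MASK = 0xFFF  # adrp operates on 4 KiB pages


def adrp_target(pc: int, page_delta: int) -> int:
    """Value adrp writes: the PC's page base plus a page count scaled by 4 KiB."""
    return (pc & ~PAGE_MASK) + (page_delta << 12)


def page_delta_for(pc: int, target_page: int) -> int:
    """Recover the encoded page count from a resolved target page address."""
    return (target_page - (pc & ~PAGE_MASK)) >> 12


pc = 0x4B4B78             # address of the adrp in the listing
shown_operand = 0x64C000  # operand as printed in the disassembly (assumed resolved)

delta = page_delta_for(pc, shown_operand)
slot = adrp_target(pc, delta) + 776  # address the following ldr reads from

print(hex(delta), hex(slot))  # 0x198 0x64c308
```

If the assumption holds, `x/gx 0x64c308` in gdb would be the address to inspect, not 0xb00308.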
x1 0x7fb7ff76f0 548547819248
Have you tried printing the memory at this address? It looks like it is probably ok...
Yeah, it's fine.
I guess that means that the thread pointer is probably correct.
(gdb) x/20g $x1
0x7fb7ff76f0: 0x0000007fb7ff7e28 0x0000000000000000
0x7fb7ff7700: 0x0000000000000000 0x0000000000000000
0x7fb7ff7710: 0x0000000000000000 0x0000000000000000
0x7fb7ff7720: 0x0000000000000000 0x0000007fb7e5ce50
0x7fb7ff7730: 0x0000007fb7e5fff8 0x0000000000000000
0x7fb7ff7740: 0x0000007fb7e1bab8 0x0000007fb7e1b4b8
0x7fb7ff7750: 0x0000007fb7e1c3b8 0x0000007fb7e5c550
0x7fb7ff7760: 0x0000000000000000 0x0000000000000000
0x7fb7ff7770: 0x0000000000000000 0x0000000000000000
0x7fb7ff7780: 0x0000000000000000 0x0000000000000000
The end of /proc/$pid/maps looks like this:
7fb7fd3000-7fb7fee000 r-xp 00000000 08:01 4330216 /lib/aarch64-linux-gnu/ld-2.17.so
7fb7ff3000-7fb7ffc000 rwxp 00000000 00:00 0
7fb7ffc000-7fb7ffe000 r-xp 00000000 00:00 0 [vdso]
7fb7ffe000-7fb7fff000 r-xp 0001b000 08:01 4330216 /lib/aarch64-linux-gnu/ld-2.17.so
7fb7fff000-7fb8001000 rwxp 0001c000 08:01 4330216 /lib/aarch64-linux-gnu/ld-2.17.so
7ffffdf000-8000000000 rwxp 00000000 00:00 0 [stack]
So $x1 is within a random 36k map and $x0 is off in la la land between a bit of ld-2.17.so and the stack.
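The same check can be done mechanically. The sketch below parses the maps excerpt quoted above and looks up the two register values; `parse_maps` and `find_region` are illustrative helpers, not part of any tool.

```python
MAPS = """\
7fb7fd3000-7fb7fee000 r-xp 00000000 08:01 4330216 /lib/aarch64-linux-gnu/ld-2.17.so
7fb7ff3000-7fb7ffc000 rwxp 00000000 00:00 0
7fb7ffc000-7fb7ffe000 r-xp 00000000 00:00 0 [vdso]
7fb7ffe000-7fb7fff000 r-xp 0001b000 08:01 4330216 /lib/aarch64-linux-gnu/ld-2.17.so
7fb7fff000-7fb8001000 rwxp 0001c000 08:01 4330216 /lib/aarch64-linux-gnu/ld-2.17.so
7ffffdf000-8000000000 rwxp 00000000 00:00 0 [stack]
"""


def parse_maps(text):
    """Turn /proc/<pid>/maps text into (start, end, perms, name) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        name = fields[5] if len(fields) > 5 else "[anonymous]"
        regions.append((start, end, fields[1], name))
    return regions


def find_region(regions, addr):
    """Return the mapping containing addr, or None if addr is unmapped."""
    for region in regions:
        if region[0] <= addr < region[1]:
            return region
    return None


regions = parse_maps(MAPS)
x1_region = find_region(regions, 0x7FB7FF76F0)  # thread pointer
x0_region = find_region(regions, 0x7FB863FD70)  # faulting address

print(x1_region)
print(x0_region)
```

Only the tail of the maps file is quoted in the mail, so a `None` result here only means "not in the quoted excerpt"; for $x0 that is enough, since it lies above the last ld-2.17.so mapping and below the stack.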
x2 0x0 0
x3 0x7fb7fc11b8 548547596728
x4 0x1 1
x5 0x0 0
x6 0x50 80
x7 0x0 0
x8 0x0 0
x9 0x6165727473676f4c 7018141438804717388
x10 0x0 0
x11 0x0 0
x12 0x2 2
x13 0x10 16
x14 0x0 0
x15 0x7fb7e5e590 548546143632
x16 0x64b3d8 6599640
x17 0x7fb7f667d0 548547225552
x18 0x7fffffdab0 549755804336
x19 0x7fffffed50 549755809104
x20 0xb 11
x21 0xb 11
x22 0x6500b0 6619312
x23 0x650070 6619248
x24 0x7fffffff 2147483647
x25 0x64db40 6609728
x26 0x7fffffeda0 549755809184
x27 0x653d00 6634752
x28 0x7fffffe750 549755807568
x29 0x7fffffe4d0 549755806928
x30 0x4b4ed4 4935380
sp 0x7fffffe4d0 0x7fffffe4d0
pc 0x4b4b90 0x4b4b90 <mongo::TSP<mongo::OwnedPointerVector<std::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> > > >::get() const+36>
cpsr 0x20000000 536870912
fpsr 0x0 0
fpcr 0x0 0
If I recompile this object file without -fPIC, it works.
I guess I see three things that could be wrong:
- The operand to "adrp x0, 0x64c000"[1]
- The operand to "ldr x0, [x0,#776]"
Is there a dynamic reloc for this GOT slot?
How would I tell? :)
Generally the TLS code will load the TP then load an offset from the GOT that the dynamic linker has fixed up based on a dynamic relocation which should reference the correct symbol etc.
I would guess that 0x64c000 is the base of the GOT and 776 is the offset into it (but I could be wrong). objdump -h will give you the layout of the sections, objdump -R will dump the relocations.
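As a rough model of the sequence Will describes (load an offset from the GOT, add the thread pointer, dereference), the sketch below works backwards from the register dump to the value the GOT slot must have held at the time of the fault. It is only a model of the add at <+32>; `tls_address` and `implied_slot_value` are made-up names, and whether the recovered value is a sensible TP offset is exactly what the relocation dump would have to confirm.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def tls_address(tp: int, got_slot_value: int) -> int:
    """Address the final ldr dereferences: thread pointer plus GOT-supplied offset."""
    return (tp + got_slot_value) & MASK64


def implied_slot_value(tp: int, address: int) -> int:
    """Invert tls_address: the offset that must have been loaded from the GOT."""
    return (address - tp) & MASK64


tp = 0x7FB7FF76F0        # x1, from mrs x1, tpidr_el0
faulting = 0x7FB863FD70  # x0 at the faulting ldr

slot_value = implied_slot_value(tp, faulting)
print(hex(slot_value))  # 0x648680
```

This matches the `p/x $x0 - $x1` result earlier in the thread, so the add itself behaves as expected and the suspect is the value stored in the GOT slot.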