Hi all,
I have a bit of a strange one. I'm not after a full solution, just any hints that quickly come to mind :)
After a few simple patches I have a build of mongodb for aarch64 (built with gcc-4.8). However, all of the test binaries that the build spits out immediately segfault. gdb-ing shows that they segfault inside this macro:
TSP_DECLARE(OwnedOstreamVector, threadOstreamCache);
TSP_DECLARE is defined as:
# define TSP_DECLARE(T,p) \
    extern __thread T* _ ## p; \
    template<> inline T* TSP<T>::get() const { return _ ## p; } \
    extern TSP<T> p;
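For anyone who wants to poke at this outside the mongodb tree, here is a rough standalone sketch of the same pattern. The TSP class body, the TSP_DEFINE macro, the threadCounter variable and the tlsRoundTrip helper are all stand-ins invented for illustration (they are not copied from the mongodb sources), so treat it as an approximation of the real code rather than a faithful extract.

```cpp
#include <cassert>

// Hypothetical stand-in for mongo's TSP<T>; only get() and reset() are modelled.
template <class T>
class TSP {
public:
    T* get() const;
    void reset(T* v);
};

// Same shape as the macro quoted above.
#define TSP_DECLARE(T, p)                                      \
    extern __thread T* _##p;                                   \
    template <> inline T* TSP<T>::get() const { return _##p; } \
    extern TSP<T> p;

// Assumed counterpart that defines the thread-local slot (illustrative only).
#define TSP_DEFINE(T, p)                               \
    __thread T* _##p;                                  \
    template <> void TSP<T>::reset(T* v) { _##p = v; } \
    TSP<T> p;

TSP_DECLARE(int, threadCounter)
TSP_DEFINE(int, threadCounter)

// Stores a pointer in the thread-local slot and reads it back through get(),
// which is the accessor that faults in the real build.
bool tlsRoundTrip() {
    static int value = 42;
    threadCounter.reset(&value);
    return threadCounter.get() == &value;
}
```

Calling tlsRoundTrip() from a small main() and building with and without -fPIC (at -O0, as in the failing build) might show whether the problem reproduces with nothing but the __thread access involved.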
And indeed, it's mongo::TSP<mongo::OwnedPointerVector<...> >::get() const that we're segfaulting in. This is the disassembly of this function (at -O0) with the faulting instruction marked:
   0x00000000004b4b6c <+0>:  stp x29, x30, [sp,#-32]!
   0x00000000004b4b70 <+4>:  mov x29, sp
   0x00000000004b4b74 <+8>:  str x0, [x29,#16]
   0x00000000004b4b78 <+12>: adrp x0, 0x64c000
   0x00000000004b4b7c <+16>: ldr x0, [x0,#776]
   0x00000000004b4b80 <+20>: nop
   0x00000000004b4b84 <+24>: nop
   0x00000000004b4b88 <+28>: mrs x1, tpidr_el0
   0x00000000004b4b8c <+32>: add x0, x1, x0
=> 0x00000000004b4b90 <+36>: ldr x0, [x0]
   0x00000000004b4b94 <+40>: ldp x29, x30, [sp],#32
   0x00000000004b4b98 <+44>: ret
And the registers:
(gdb) info registers
x0   0x7fb863fd70 548554407280
x1   0x7fb7ff76f0 548547819248
x2   0x0 0
x3   0x7fb7fc11b8 548547596728
x4   0x1 1
x5   0x0 0
x6   0x50 80
x7   0x0 0
x8   0x0 0
x9   0x6165727473676f4c 7018141438804717388
x10  0x0 0
x11  0x0 0
x12  0x2 2
x13  0x10 16
x14  0x0 0
x15  0x7fb7e5e590 548546143632
x16  0x64b3d8 6599640
x17  0x7fb7f667d0 548547225552
x18  0x7fffffdab0 549755804336
x19  0x7fffffed50 549755809104
x20  0xb 11
x21  0xb 11
x22  0x6500b0 6619312
x23  0x650070 6619248
x24  0x7fffffff 2147483647
x25  0x64db40 6609728
x26  0x7fffffeda0 549755809184
x27  0x653d00 6634752
x28  0x7fffffe750 549755807568
x29  0x7fffffe4d0 549755806928
x30  0x4b4ed4 4935380
sp   0x7fffffe4d0 0x7fffffe4d0
pc   0x4b4b90 0x4b4b90 <mongo::TSP<mongo::OwnedPointerVector<std::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> > > >::get() const+36>
cpsr 0x20000000 536870912
fpsr 0x0 0
fpcr 0x0 0
If I recompile this object file without -fPIC, it works.
I guess I see three things that could be wrong:
1) The operand to "adrp x0, 0x64c000"[1]
2) The operand to "ldr x0, [x0,#776]"
3) The value of tpidr_el0
Oh, and I guess:
4) The setup of tls has gone wrong and the address in x0 _ought_ to be accessible but isn't for some reason.
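As a sanity check on the numbers, the arithmetic implied by the disassembly and register dump can be written out. This is only a sketch: the constant and function names are made up for illustration, and it assumes x1 still holds the tpidr_el0 value read at +28, which should be true since nothing writes x1 between there and the fault.

```cpp
#include <cassert>
#include <cstdint>

// Register values copied from the "info registers" dump.
constexpr std::uint64_t kFaultAddr = 0x7fb863fd70ULL;  // x0 at the faulting ldr
constexpr std::uint64_t kThreadPtr = 0x7fb7ff76f0ULL;  // x1, read from tpidr_el0

// Address of the slot read by "adrp x0, 0x64c000" + "ldr x0, [x0,#776]".
constexpr std::uint64_t gotSlotAddress() {
    return 0x64c000ULL + 776;  // 0x64c308
}

// "add x0, x1, x0" at +32 gives x0 = tp + loaded, so loaded = x0 - tp.
constexpr std::uint64_t loadedGotValue() {
    return kFaultAddr - kThreadPtr;  // 0x648680
}

// Runtime check of the same arithmetic, for use from a small main().
inline void checkTlsArithmetic() {
    assert(gotSlotAddress() == 0x64c308ULL);
    assert(loadedGotValue() == 0x648680ULL);
}
```

So the value loaded from 0x64c308 works out to 0x648680. That is in the same range as the binary's own data addresses (x16 is 0x64b3d8 and the GOT sits around 0x64c000), which might mean the slot holds something closer to an absolute address than a small thread-pointer-relative offset. That is only a reading of the numbers, though, not something I have confirmed.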
Any hints on which of these seems most likely to be the culprit?
Cheers,
mwh
[1] FWIW, objdump reports 0x64c000 as "_GLOBAL_OFFSET_TABLE_+0x2d0"; not sure why that doesn't show up in gdb's disassembly.