Hi,
I'm working on AArch64 support for libatomic-ops (part of the Boehm GC). I mainly use GCC's __atomic builtins for this, but in our 4.7 version they don't use the load-acquire / store-release instructions now available in the ARMv8 ISA. Mainline GCC does use these instructions (in atomic.md), but not in their exclusive form. I assume this is because of the performance penalty, but I'd like your opinion on that point, as I don't find the ARMv8 ISA really clear.
If we want to implement an atomic load acquire, is
LDAR x1, [x0]
sufficient, or do we have to write it like this:
L:  LDAXR x0, [x3]
    STXR  w1, x0, [x3]
    CBNZ  w1, L
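For reference, here is a minimal sketch of the C-level operations I am trying to provide, written with GCC's __atomic builtins. The helper names load_acquire_u64 and store_release_u64 are made up for this mail and are not the libatomic-ops names. Which instruction sequence the compiler emits for them depends on the GCC version and target, which is exactly what my question is about.

```c
#include <stdint.h>

/* Hypothetical helper names, for illustration only. */

/* Atomic load with acquire ordering: later memory accesses in
 * program order cannot be reordered before this load. */
static inline uint64_t load_acquire_u64(const volatile uint64_t *addr)
{
    return __atomic_load_n(addr, __ATOMIC_ACQUIRE);
}

/* Atomic store with release ordering: earlier memory accesses in
 * program order cannot be reordered after this store. */
static inline void store_release_u64(volatile uint64_t *addr, uint64_t value)
{
    __atomic_store_n(addr, value, __ATOMIC_RELEASE);
}
```

Compiling this with -O2 -S and looking at the generated assembly should show whether a given toolchain picks a plain LDAR or something heavier.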
Thanks,
Yvan