(not a review, just looking at this diff stat) is changing the fastpath
  read_unlock()        read_lock()

from something like

  fence rw,w           amoadd.w
  amoadd.w             fence r,rw

to

  fence rw,rw          amoadd.w
  amoadd.w             fence rw,rw

regardless of Zalasr or !Zalasr. Similarly for other atomic operations with release or acquire semantics. I guess the change was not intentional? If it was intentional, it should be properly mentioned in the changelog.
Sorry about that. It is intended. The atomic operation before __atomic_acquire_fence, or the operation after __atomic_release_fence, can be just a single ld or sd instruction instead of amocas or amoswap. In such cases, when the store-release operation becomes 'sd.rl', an __atomic_acquire_fence implemented as 'fence r, rw' can no longer ensure FENCE.TSO. Thus I replaced it with 'fence rw, rw'.
But you could apply changes similar to those you made for xchg & cmpxchg: use .AQ and .RL for the other atomic RMW operations as well, no? AFAICS, that is what arm64 actually does in arch/arm64/include/asm/atomic_{ll_sc,lse}.h.
Andrea
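
To make the two approaches discussed above concrete, here is a minimal userspace sketch in C11 atomics. It is not the kernel's implementation: the `readers` counter and all function names are hypothetical, and the real rwlock fastpath also has to check for writers. The "fenced" pair mirrors the fence-plus-amoadd.w shape, while the "annotated" pair puts the ordering on the RMW itself, which a RISC-V compiler may lower to an AMO carrying .aq or .rl (the exact instructions depend on the toolchain).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical reader count; not the kernel's rwlock layout. */
static atomic_int readers;

/* Variant 1: relaxed RMW plus a separate fence. */
static void read_lock_fenced(void)
{
	atomic_fetch_add_explicit(&readers, 1, memory_order_relaxed);
	/* Acquire fence after the RMW: later accesses stay after it. */
	atomic_thread_fence(memory_order_acquire);
}

static void read_unlock_fenced(void)
{
	/* Release fence before the RMW: earlier accesses stay before it. */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_sub_explicit(&readers, 1, memory_order_relaxed);
}

/* Variant 2: ordering annotated on the RMW itself, no separate fence. */
static void read_lock_annotated(void)
{
	atomic_fetch_add_explicit(&readers, 1, memory_order_acquire);
}

static void read_unlock_annotated(void)
{
	atomic_fetch_sub_explicit(&readers, 1, memory_order_release);
}

/* Single-threaded sanity check: returns the count after balanced pairs. */
static int balanced_pairs_leave_count(void)
{
	read_lock_fenced();
	read_lock_annotated();
	assert(atomic_load(&readers) == 2);
	read_unlock_annotated();
	read_unlock_fenced();
	return atomic_load(&readers);
}
```

Calling `balanced_pairs_leave_count()` only checks that the counter bookkeeping is balanced; it says nothing about ordering, which would need a litmus test or a memory-model tool to examine.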
This is also the common implementation on other architectures that use aq/rl instructions, like ARM. And you certainly already knew it~