On Fri, Sep 19, 2025 at 6:39 PM Xu Lu <luxu.kernel@bytedance.com> wrote:
Hi Andrea,
On Fri, Sep 19, 2025 at 6:04 PM Andrea Parri <parri.andrea@gmail.com> wrote:
On Fri, Sep 19, 2025 at 03:37:06PM +0800, Xu Lu wrote:
This patch adds support for the Zalasr ISA extension, which supplies the real load acquire/store release instructions.
The specification can be found here: https://github.com/riscv/riscv-zalasr/blob/main/chapter2.adoc
This patch series has been tested with LTP on QEMU with Brensan's Zalasr support patch [1].
Some false-positive spacing errors are reported during patch checking, so I have CCed the maintainers of checkpatch.pl as well.
[1] https://lore.kernel.org/all/CAGPSXwJEdtqW=nx71oufZp64nK6tK=0rytVEcz4F-gfvCOX...
v3:
- Apply acquire/release semantics to arch_xchg/arch_cmpxchg operations
so as to ensure FENCE.TSO ordering between operations which precede the UNLOCK+LOCK sequence and operations which follow the sequence. Thanks to Andrea.
- Support hwprobe of Zalasr.
- Allow Zalasr extensions for Guest/VM.
v2:
- Adjust the order of Zalasr and Zalrsc in dt-bindings. Thanks to
Conor.
Xu Lu (8):
  riscv: add ISA extension parsing for Zalasr
  dt-bindings: riscv: Add Zalasr ISA extension description
  riscv: hwprobe: Export Zalasr extension
  riscv: Introduce Zalasr instructions
  riscv: Use Zalasr for smp_load_acquire/smp_store_release
  riscv: Apply acquire/release semantics to arch_xchg/arch_cmpxchg operations
  RISC-V: KVM: Allow Zalasr extensions for Guest/VM
  KVM: riscv: selftests: Add Zalasr extensions to get-reg-list test
 Documentation/arch/riscv/hwprobe.rst               |   5 +-
 .../devicetree/bindings/riscv/extensions.yaml      |   5 +
 arch/riscv/include/asm/atomic.h                    |   6 -
 arch/riscv/include/asm/barrier.h                   |  91 ++++++++++--
 arch/riscv/include/asm/cmpxchg.h                   | 136 ++++++++----------
 arch/riscv/include/asm/hwcap.h                     |   1 +
 arch/riscv/include/asm/insn-def.h                  |  79 ++++++++++
 arch/riscv/include/uapi/asm/hwprobe.h              |   1 +
 arch/riscv/include/uapi/asm/kvm.h                  |   1 +
 arch/riscv/kernel/cpufeature.c                     |   1 +
 arch/riscv/kernel/sys_hwprobe.c                    |   1 +
 arch/riscv/kvm/vcpu_onereg.c                       |   2 +
 .../selftests/kvm/riscv/get-reg-list.c             |   4 +
 13 files changed, 242 insertions(+), 91 deletions(-)
I wouldn't have rushed this submission while the discussion on v2 seems so much alive; IAC, to add and link to that discussion, this version
Thanks. This version was sent out to show my solution to the FENCE.TSO problem you pointed out before. I will continue to improve it, and I look forward to more suggestions from you.
(not a review, just looking at this diff stat) is changing the fastpath

	read_unlock()
	read_lock()

from something like

	fence rw,w
	amoadd.w
	amoadd.w
	fence r,rw

to

	fence rw,rw
	amoadd.w
	amoadd.w
	fence rw,rw

no matter Zalasr or !Zalasr. Similarly for other atomic operations with release or acquire semantics. I guess the change was not intentional? If it was intentional, it should be properly mentioned in the changelog.
Sorry about that. It is intentional. The atomic operation before __atomic_acquire_fence, or the operation after __atomic_release_fence, can be just a single ld or sd instruction instead of amocas or amoswap. In such cases, when the store-release operation becomes 'sd.rl', an __atomic_acquire_fence implemented as 'fence r, rw' cannot ensure FENCE.TSO ordering anymore. Thus I replaced it with 'fence rw, rw'.
This is also the common implementation on other architectures that use aq/rl instructions, such as ARM. And you certainly already knew it~
I will make it a separate commit and provide more detail in the changelog. Maybe the alternative mechanism can be applied to accelerate it.
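For readers following the fast-path discussion above, here is a minimal user-space sketch in C11 atomics of the two fence shapes being compared. It is not the kernel's rwlock code: the 'readers' counter and every 'sketch_*' name are hypothetical, and the instruction notes in the comments assume the usual RISC-V mapping of C11 fences (release to 'fence rw,w', acquire to 'fence r,rw', seq_cst to 'fence rw,rw').

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical reader count; NOT the kernel's rwlock layout. */
static atomic_int readers;

/* Older shape: relaxed AMO, then an acquire fence (assumed: fence r,rw). */
static void sketch_read_lock(void)
{
	atomic_fetch_add_explicit(&readers, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
}

/* Older shape: release fence (assumed: fence rw,w), then a relaxed AMO. */
static void sketch_read_unlock(void)
{
	int prev;

	atomic_thread_fence(memory_order_release);
	prev = atomic_fetch_sub_explicit(&readers, 1, memory_order_relaxed);
	assert(prev > 0);	/* unlock without a matching lock is a bug */
	(void)prev;
}

/* Shape described for v3: relaxed AMO, then a full fence (assumed: fence rw,rw). */
static void sketch_read_lock_full(void)
{
	atomic_fetch_add_explicit(&readers, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}

/* Shape described for v3: full fence (assumed: fence rw,rw), then a relaxed AMO. */
static void sketch_read_unlock_full(void)
{
	int prev;

	atomic_thread_fence(memory_order_seq_cst);
	prev = atomic_fetch_sub_explicit(&readers, 1, memory_order_relaxed);
	assert(prev > 0);
	(void)prev;
}

/* Helper for inspecting the counter in examples and tests. */
static int sketch_reader_count(void)
{
	return atomic_load_explicit(&readers, memory_order_relaxed);
}
```

The two pairs behave identically in a single-threaded run; the difference is only in which earlier and later accesses each fence orders, which is the cost being questioned in the review.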
Best Regards, Xu Lu
Andrea