Re: [PATCH v4 5/7] arm64: Add support for FEAT_{LS64, LS64_V}

17 Sep 2025

On 2025/9/17 11:51, Yicong Yang wrote:
...
On 2025/9/16 22:56, Catalin Marinas wrote:
...
On Mon, Sep 15, 2025 at 04:29:25PM +0800, Yicong Yang wrote:
...
On 2025/9/12 21:47, Jonathan Cameron wrote:
...
On Thu, 11 Sep 2025 16:50:14 +0100
Will Deacon <will@kernel.org> wrote:
...
On Tue, Sep 09, 2025 at 09:48:04AM +0800, Yicong Yang wrote:
...
per ARM DDI0487 L.b section C3.2.6,
When the instructions access a memory type that is not one of the following,
  a data abort for unsupported Exclusive or atomic access is generated...
That's about the memory _type_. I'm talking about a supported memory type
(e.g. writeback cacheable) but when the physical location doesn't support
the instruction. That's captured a little later in the same section:
| If the target memory location does not support the LD64B or ST64B
  | instructions, then one of the following behaviors occurs:
  |  * A stage 1 Data Abort, reported using the DFSC code of 0b110101,
  |    is generated.
  |  * The instruction performs the memory accesses, but the accesses
  |    are not single-copy atomic above the byte level
Is this a new addition to the L.b release of the ARM ARM? Maybe it was
there before in some other form (or a different place). At least it
doesn't say "unpredictable".
I think it's new in L.b. I didn't find it mentioned in K.a.
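As a rough illustration of the first behaviour quoted above, a fault handler could recognise the "unsupported Exclusive or atomic access" abort by decoding the syndrome register. This is a minimal sketch, not the code from the patch: the macro and function names are hypothetical, and only the field positions (EC in ESR_ELx[31:26], DFSC in ESR_ELx[5:0]), the Data Abort EC values and the DFSC value 0b110101 are taken from the architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; field layout is from the Arm ARM. */
#define ESR_EC_SHIFT		26
#define ESR_EC_MASK		0x3fULL	/* EC is ESR_ELx[31:26] */
#define ESR_DFSC_MASK		0x3fULL	/* DFSC is ESR_ELx[5:0] */

#define ESR_EC_DABT_LOWER	0x24	/* Data Abort from a lower EL */
#define ESR_EC_DABT_SAME	0x25	/* Data Abort without a change in EL */

#define DFSC_UNSUPP_EXCL_ATOMIC	0x35	/* 0b110101 */

/* True if the syndrome describes a Data Abort at all. */
static inline bool esr_is_data_abort(uint64_t esr)
{
	uint64_t ec = (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;

	return ec == ESR_EC_DABT_LOWER || ec == ESR_EC_DABT_SAME;
}

/* True if the Data Abort reports an unsupported Exclusive or atomic access. */
static inline bool esr_is_unsupp_excl_atomic(uint64_t esr)
{
	return esr_is_data_abort(esr) &&
	       (esr & ESR_DFSC_MASK) == DFSC_UNSUPP_EXCL_ATOMIC;
}
```

Note that this only covers the case where the hardware chooses to fault. The second permitted behaviour (the access is performed but is not single-copy atomic) produces no syndrome at all, so nothing like this can detect it.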
...
...
...
...
and I think that's a bad interface to expose blindly to userspace solely
as a boolean hwcap.
Nasty, so now I'm curious. Any thoughts on how to expose what regions it is appropriate
for?  I can think of various heavy weight options but wondering if there is a simple
solution.
In my understanding the hwcap only describes the capabilities of the CPU, not
those of the whole system. Users should make sure the feature works as expected if the
CPU supports it and they are going to use it. Specifically, LS64 is intended for
device memory only, so the user should take responsibility for using it only on
supported memory.
We have other cases like MTE where we avoid exposing the HWCAP to user
if we know the memory system does not support MTE, though we intercepted
this early and asked the (micro)architects to tie the CPU ID field to
what the system supports.
But we lack an identification mechanism for the memory system like the one we have for the CPU,
so this is just a restriction on the hardware vendor: if a feature is not supported by the whole
system (SoC), do not advertise it in the CPU's ID field. Otherwise, I think what we currently do
when capabilities mismatch or do not work together as expected is to use an erratum workaround
to disable the feature or work around the problem on that particular platform.
This is also the case for LS64, but it is a bit more complex, since it involves a completer outside
the SoC (the device), which could be a hotpluggable one (PCIe). On the SoC side we can restrict
advertising the feature to the case where it is fully supported (which is what we've already done
on our hardware).
...
...
This may raise a similar question about using other atomic instructions (e.g. LSE) on
memory that does not support atomicity. I found this restriction in ARM DDI0487 L.b section B2.2.6:
Some system implementations might not support atomic instructions for all regions of the
  memory
With exclusives or atomics, we require that the general purpose (system)
RAM supports the feature, otherwise Linux won't work properly (I don't
think we specifically documented this but it would be fairly obvious
when the kernel doesn't boot or user-space randomly crashes).
Yes, the spec requires general purpose memory to support atomicity:
The architecture only requires that Conventional memory that is mapped in this way supports
  this functionality
Otherwise users need to know whether the target memory agent supports atomicity.
...
...
And if an atomic instruction is performed on unsupported memory, the implementation is
allowed to do one of the following:

  * The instruction generates a synchronous External abort.
  * The instruction generates a System Error interrupt.
  * The instruction generates an IMPLEMENTATION DEFINED MMU fault reported using the Data
    Abort Fault status code of ESR_ELx.DFSC = 110101.
  * The instruction is treated as a NOP.
  * The instructions are performed, but there is no guarantee that the memory accesses were
    performed atomically in regard to other agents that access memory. In this case, the
    instruction might also generate a System Error interrupt.

If the instruction is performed without generating an SEI, as in the last case, the situation
is quite similar to that of LS64.
The difference is that we don't support Linux on such systems.
Arguably, the use of LD/ST64B* is fairly specialised and won't be used
on the general purpose RAM and by random applications. It needs a device
driver to create the NC/Device mapping and specific programs/libraries
to access it. I'm not sure the LS64 properties are guaranteed by the
device alone or the device together with the interconnect. I suspect the
latter and neither the kernel driver nor user space can tell. In the
best case, you get a fault and realise the system doesn't work as
expected. Worse is the non-atomicity with potentially silent corruption.
It will be the latter: both the interconnect and the target device need to
support it. But I think the driver developer (kernel driver or userspace
driver) must know the support status; otherwise they
should not use it.
For general purpose RAM we currently have the fault mechanism to prevent ld/st64b
usage. (There may be an exception if FEAT_LS64WB is supported. It was
introduced in the latest feature list and allows ld/st64b on memory with
the write-back attribute, but no spec describes it and I currently have
no details about this feature from any public documents.)
...
So, to Will's point, the HWCAP is not sufficient for user space to make
an informed decision on whether it can safely use the LS64 instructions.
Can a (generic) device driver tell or do we need additional information
in firmware tables to advertise the correct behaviour?
My thought is that driver developers should know whether their
device supports it if they are going to use it. Information in a
firmware table should be fine for platform devices, but it cannot describe
hotpluggable devices such as PCIe endpoints, which may
not be listed in a firmware table.
Another option is that we drop the advertisement from the hwcap and cpuinfo
and let users retrieve the CPU support from the ID register directly or from
dmesg. Does that make sense?
thanks.
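
For reference, reading the CPU support "from the ID reg directly" could look roughly like the sketch below. It assumes the LS64 field is ID_AA64ISAR1_EL1[63:60] with 0b0001 meaning LD64B/ST64B and 0b0010 adding ST64BV, as described in the Arm ARM; the helper names are hypothetical. It also assumes the kernel's ID register emulation for user space makes this field visible, which is not something this thread has settled (fields that are not visible read as zero).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper names; field position per the Arm ARM. */
#define ISAR1_LS64_SHIFT	60
#define ISAR1_LS64_MASK		0xfULL

/* Extract the LS64 field from an ID_AA64ISAR1_EL1 value. */
static inline unsigned int isar1_ls64_field(uint64_t isar1)
{
	return (unsigned int)((isar1 >> ISAR1_LS64_SHIFT) & ISAR1_LS64_MASK);
}

/* LD64B/ST64B are implemented when the field is 0b0001 or higher. */
static inline bool cpu_has_ld64b_st64b(uint64_t isar1)
{
	return isar1_ls64_field(isar1) >= 1;
}

/* ST64BV is implemented when the field is 0b0010 or higher. */
static inline bool cpu_has_st64bv(uint64_t isar1)
{
	return isar1_ls64_field(isar1) >= 2;
}

#if defined(__aarch64__)
/*
 * On arm64 Linux, user space MRS reads of ID registers are trapped and
 * emulated by the kernel. Assumption: the LS64 field is exposed this way.
 */
static inline uint64_t read_id_aa64isar1(void)
{
	uint64_t val;

	__asm__ volatile("mrs %0, ID_AA64ISAR1_EL1" : "=r"(val));
	return val;
}
#endif
```

A caller would pass the result of read_id_aa64isar1() to cpu_has_ld64b_st64b(). Like the hwcap, this only reports what the CPU implements; it says nothing about whether the interconnect and the target device support the 64-byte accesses.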
