On Thu, Jul 20, 2023 at 11:31:13AM +0100, Will Deacon wrote:
On Wed, Jul 12, 2023 at 12:02:30PM +0100, Mark Brown wrote:
- /*
* This is a memset() but we don't want the compiler to
* optimise it into either instructions or a library call
* which might be incompatible with streaming mode.
*/
- for (i = 0; i < td->live_sz; i++) {
asm volatile("nop"
: "+m" (*dest_uc)
:
: "memory");
I don't think it's safe to use "+m" here, since the compiler can assume that the address is used exactly once in the asm. If a post-indexed addressing mode is generated, then you can end up with register corruption.
Stepping back, why not use either barrier() or OPTIMIZER_HIDE_VAR() instead?
That should work. I was mostly just open coding OPTIMIZER_HIDE_VAR() and noticed that memory constraints were a thing.
The most robust fix would be to write all of the streaming mode code in asm, but I can appreciate that's a tonne of work for a testcase.
It's probably more proportionate to add a dependency on toolchain support for SME, but that'd mean we hardly ever run the tests.
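For reference, a rough sketch of what the barrier() and OPTIMIZER_HIDE_VAR() variants discussed above might look like. This is not the actual patch: the two macros are local stand-ins modelled on the kernel's definitions, and the function names are made up for illustration. Neither variant stops the compiler from emitting instructions that are invalid in streaming mode elsewhere; they are only intended to keep the loop from being recognised as a memset().

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's barrier(): a compiler-only memory clobber. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Local stand-in for OPTIMIZER_HIDE_VAR(): makes the value of var opaque. */
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/*
 * Hypothetical helper: zero a buffer one byte at a time. The barrier()
 * after each store is meant to discourage the compiler from merging the
 * stores into a memset() call or wider instructions.
 */
void zero_with_barrier(unsigned char *dest, size_t len)
{
	size_t i;

	assert(dest != NULL || len == 0);

	for (i = 0; i < len; i++) {
		dest[i] = 0;
		barrier();
	}
}

/*
 * Hypothetical helper: same loop, but the pointer is hidden from the
 * optimiser on every iteration, so the compiler cannot prove that the
 * stores cover one contiguous region.
 */
void zero_with_hide_var(unsigned char *dest, size_t len)
{
	size_t i;

	assert(dest != NULL || len == 0);

	for (i = 0; i < len; i++) {
		OPTIMIZER_HIDE_VAR(dest);
		*dest++ = 0;
	}
}
```

Unlike the "+m" version, neither asm statement here takes a memory operand, so the compiler has no address to fold into a post-indexed addressing mode.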