On Tue, Dec 03, 2024 at 05:24:39PM +0000, Mark Brown wrote:
On Tue, Dec 03, 2024 at 05:00:08PM +0000, Dave Martin wrote:
On Tue, Dec 03, 2024 at 04:00:45PM +0000, Mark Brown wrote:
[...]
We know that the only bit of register state which is not up to date at this point is the SME vector length, we don't configure that for tasks that do not have SME. SVCR is always configured since we have to exit streaming mode for FPSIMD and SVE to work properly so we know it's already 0, all the other SME specific state is gated by controls in SVCR.
fpsimd_flush_task_state() means that we do the necessary work when re-entering userspace, but is there a problem with simply marking all the FPSIMD/vector state as stale? If FPSR or FPCR is dirty for example, it now looks like they won't get written back to thread struct if there is a context switch before current re-enters userspace?
Maybe the other flags distinguish these cases -- I haven't fully got my head around it.
We are doing fpsimd_flush_task_state() in the TIF_FOREIGN_FPSTATE case so we know there is no dirty state in the registers.
Ah, that wasn't obvious from the diff context, but you're right.
I was confused by the fpsimd_bind_task_to_cpu() call; I forgot that there are reasons to call this even when TIF_FOREIGN_FPSTATE is clear. Perhaps it would be worth splitting some of those uses up, but it would need some thinking about. Doesn't really belong in this series anyway.
(Actually, the ARM ARM says (IMHTLZ) that toggling PSTATE.SM by any means causes FPSR to become 0x800009f. I'm not sure where that fits in -- do we handle that anywhere? I guess the "soft" SM toggling via
Urgh, not seen that one - that needs handling in the signal entry path and ptrace. That will have been defined while the feature was being implemented. It's not relevant here though since we are in the SME access trap, we might be trapping due to an SMSTART or equivalent operation but that SMSTART has not yet run at the point where we return to userspace.
ptrace, signal delivery or maybe exec, ought to set this? Not sure how that interacts with the expected behaviour of the fenv(3) API... Hmm. I see no corresponding statement about FPCR.)
Fun. I'm not sure how the ABI is defined there by libc.
I guess this should be left as-is, for now. There's an argument for sanitising FPCR/FPSR on signal delivery, but neither signal(7) nor fenv(3) give any clue about the expected behaviour...
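For reference, the FPSR value quoted above (0x800009f) decodes as all the cumulative exception flags plus QC. A minimal userspace sketch, assuming the architectural AArch64 FPSR bit positions (IOC, DZC, OFC, UFC, IXC in bits 0-4, IDC in bit 7, QC in bit 27). The macro and function names are made up for illustration and are not the kernel's:

```c
#include <stdint.h>
#include <stdio.h>

/* AArch64 FPSR cumulative status bits (illustrative macro names). */
#define FPSR_IOC	(UINT32_C(1) << 0)	/* invalid operation */
#define FPSR_DZC	(UINT32_C(1) << 1)	/* divide by zero */
#define FPSR_OFC	(UINT32_C(1) << 2)	/* overflow */
#define FPSR_UFC	(UINT32_C(1) << 3)	/* underflow */
#define FPSR_IXC	(UINT32_C(1) << 4)	/* inexact */
#define FPSR_IDC	(UINT32_C(1) << 7)	/* input denormal */
#define FPSR_QC		(UINT32_C(1) << 27)	/* cumulative saturation */

/* Value discussed in this thread for FPSR after a PSTATE.SM toggle. */
#define FPSR_AFTER_SM_TOGGLE	UINT32_C(0x0800009f)

static const struct {
	uint32_t mask;
	const char *name;
} fpsr_flags[] = {
	{ FPSR_IOC, "IOC" }, { FPSR_DZC, "DZC" }, { FPSR_OFC, "OFC" },
	{ FPSR_UFC, "UFC" }, { FPSR_IXC, "IXC" }, { FPSR_IDC, "IDC" },
	{ FPSR_QC,  "QC"  },
};

#define NR_FPSR_FLAGS	(sizeof(fpsr_flags) / sizeof(fpsr_flags[0]))

/* Return the bits of @fpsr that are not one of the flags listed above. */
static uint32_t fpsr_unknown_bits(uint32_t fpsr)
{
	for (size_t i = 0; i < NR_FPSR_FLAGS; i++)
		fpsr &= ~fpsr_flags[i].mask;
	return fpsr;
}

/*
 * Write the space-separated names of the flags set in @fpsr into @buf.
 * Returns the number of names written; stops early if @buf is too small.
 */
static int fpsr_describe(uint32_t fpsr, char *buf, size_t len)
{
	size_t used = 0;
	int count = 0;

	if (len)
		buf[0] = '\0';

	for (size_t i = 0; i < NR_FPSR_FLAGS; i++) {
		int n;

		if (!(fpsr & fpsr_flags[i].mask))
			continue;

		n = snprintf(buf + used, len - used, "%s%s",
			     count ? " " : "", fpsr_flags[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;

		used += (size_t)n;
		count++;
	}

	return count;
}
```

Feeding FPSR_AFTER_SM_TOGGLE to fpsr_describe() with a 64-byte buffer lists all seven flags, and fpsr_unknown_bits() returns 0 for it, i.e. the value is "every sticky status bit set" rather than anything more exotic. That is what makes the fenv(3) interaction awkward: fetestexcept() would presumably see every exception as raised.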
For ptrace, the user has the opportunity to specify exactly what they want to happen to all the registers, so I suppose it's best to stick to the current model and require the tracer to specify all changes explicitly rather than add new magic ptrace behaviour.
Not relevant for this series, in any case.
Cheers
---Dave