On Tue, 21 Jan 2025 15:37:13 +0000, Mark Rutland mark.rutland@arm.com wrote:
On Tue, Jan 21, 2025 at 11:20:04AM +0000, Marc Zyngier wrote:
Hi Mark,
[...]
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 8c4c1a2186cc5..e4053a90ed240 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -1711,8 +1711,24 @@ void fpsimd_kvm_prepare(void) */ get_cpu_fpsimd_context();
- if (test_and_clear_thread_flag(TIF_SVE)) {
sve_to_fpsimd(current);
- if (!test_thread_flag(TIF_FOREIGN_FPSTATE) &&
test_and_clear_thread_flag(TIF_SVE)) {
sve_user_disable();
I'm pretty happy with this fix. However...
/*
* The KVM hyp code doesn't set fp_type when saving the host's
* FPSIMD state. Set fp_type here in case the hyp code saves
* the host state.
Should KVM do that? The comment seems to indicate that this is papering over yet another bug...
Yes; really this should be done at hyp (and at that point, hyp could actually save the entire host SVE state), but that's a larger change and more painful for backporting, which is why I didn't go that route. I'm happy to go try to fix hyp to do that, or I can make the comment more explicit that this is a bodge, if that's all you're after?
Alternatively, we could take the large hammer approach and always save and unbind the host state prior to entering the guest, so that hyp doesn't need to save anything. An unconditional call to fpsimd_save_and_flush_cpu_state() would suffice, and that'd also implicitly fix the SME issue below.
I think I'd rather see that. Even if that costs us a few hundred cycles on vcpu_load(), I would take that any time over the current fragile/broken behaviour.
*
* If hyp code does not save the host state, then the host
* state remains live on the CPU and saved fp_type is
* irrelevant until it is overwritten by a later call to
* fpsimd_save_user_state().
I'm not sure I understand this. If fp_type is irrelevant, surely it is *forever* irrelevant, not until something else happens. Or am I missing something?
Sorry, this was not very clear.
What this is trying to say is that *while the state is live on a CPU* fp_type is irrelevant, and it's only meaningful when saving/restoring state. As above, the only reason to set it here is so that *if* hyp saves and unbinds the state, fp_type will accurately describe what the hyp code saved.
The key thing is that there are two possibilities:
(1) The guest doesn't use FPSIMD/SVE, and no trap is taken to save the host state. In this case, fp_type is not consumed before the next time state has to be written back to memory (the act of which will set fp_type).
So in this case, setting fp_type is redundant but benign.
(2) The guest *does* use FPSIMD/SVE, and a trap is taken to hyp to save the host state. In this case the hyp code will save the task's FPSIMD state to task->thread.uw.fpsimd_state, but will not update task->thread.fp_type accordingly, and:
* If fp_type happened to be FP_STATE_FPSIMD, all is good and a later restore will load the state saved by the hyp code. * If fp_type happened to be FP_STATE_SVE, a later restore will load stale state from task->thread.sve_state.
... does that make sense?
It does now, thanks. But with your above alternative suggestion, this becomes completely moot, right?
*
* This is *NOT* sufficient when CONFIG_ARM64_SME=y, where
* fp_type can be FP_STATE_SVE regardless of TIF_SVE.
*/
BUILD_BUG_ON(IS_ENABLED(CONFIG_ARM64_SME));
I'd rather not have this build-time failure, as this is very likely to annoy a lot of people. Instead, just make SME unselectable with KVM:
I'm happy to change this, but FWIW I'd used BUILD_BUG() here because it kept that associated with the comment and logic, and because we disabled SME in commit:
81235ae0c846e1fb4 ("arm64: Kconfig: Make SME depend on BROKEN for now)"
... which was CC'd stable, and so this *shouldn't* blow up on anything with that commit.
My experience is that people do set CONFIG_BROKEN, and don't expect the kernel to fail to compile -- they "only" expect it to misbehave.
So I can:
(a) Add the dependency, as you suggest.
(b) Leave that as-is.
(c) Solve this in a different way so that we don't need a BUILD_BUG() or dependency. e.g. fix the SME case at the same time, at the cost of possibly needing to do a bit more work when backporting.
... any preference?
My preference would be on (c), if at all possible. My understanding is now that the fpsimd_save_and_flush_cpu_state() approach solves all of these problems, at the expense of a bit of overhead.
Did I get that correctly?
Thanks,
M.