Hi Marc,
On 5/10/21 4:04 PM, Marc Zyngier wrote:
On Mon, 10 May 2021 15:55:28 +0100, Alexandru Elisei alexandru.elisei@arm.com wrote:
Hi Marc,
On 5/10/21 10:49 AM, Marc Zyngier wrote:
KVM currently updates PC (and the corresponding exception state) using a two phase approach: first by setting a set of flags, then by converting these flags into a state update when the vcpu is about to enter the guest.
However, this creates a disconnect with userspace if the vcpu thread returns there with any exception/PC flag set. In this case, the exposed
The code seems to handle only the KVM_ARM64_PENDING_EXCEPTION flag. Is the "PC flag" a reference to the KVM_ARM64_INCREMENT_PC flag?
No, it does handle both exception and PC increment, unless I have completely bodged something (entirely possible).
The message is correct, my bad.
context is wrong, as userpsace doesn't have access to these flags
s/userpsace/userspace
(they aren't architectural). It also means that these flags are preserved across a reset, which isn't expected.
To solve this problem, force an explicit synchronisation of the exception state on vcpu exit to userspace. As an optimisation for nVHE systems, only perform this when there is something pending.
Reported-by: Zenghui Yu yuzenghui@huawei.com Signed-off-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org # 5.11
arch/arm64/include/asm/kvm_asm.h | 1 + arch/arm64/kvm/arm.c | 10 ++++++++++ arch/arm64/kvm/hyp/exception.c | 4 ++-- arch/arm64/kvm/hyp/nvhe/hyp-main.c | 8 ++++++++ 4 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index d5b11037401d..5e9b33cbac51 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -63,6 +63,7 @@ #define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector 18 #define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize 19 #define __KVM_HOST_SMCCC_FUNC___pkvm_mark_hyp 20 +#define __KVM_HOST_SMCCC_FUNC___kvm_adjust_pc 21 #ifndef __ASSEMBLY__ diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 1cb39c0803a4..d62a7041ebd1 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -897,6 +897,16 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) kvm_sigset_deactivate(vcpu);
- /*
* In the unlikely event that we are returning to userspace
* with pending exceptions or PC adjustment, commit these
I'm going to assume "PC adjustment" means the KVM_ARM64_INCREMENT_PC flag. Please correct me if that's not true, but if that's the case, then the flag isn't handled below.
* adjustments in order to give userspace a consistent view of
* the vcpu state.
*/
- if (unlikely(vcpu->arch.flags & (KVM_ARM64_PENDING_EXCEPTION |
KVM_ARM64_EXCEPT_MASK)))
The condition seems to suggest that it is valid to set KVM_ARM64_EXCEPT_{AA32,AA64}_* without setting KVM_ARM64_PENDING_EXCEPTION, which looks rather odd to me. Is that a valid use of the KVM_ARM64_EXCEPT_MASK bits? If it's not (the existing code always sets the exception type with the KVM_ARM64_PENDING_EXCEPTION), that I was thinking that checking only the KVM_ARM64_PENDING_EXCEPTION flag would make the intention clearer.
No, you are missing this (subtle) comment in kvm_host.h:
<quote> /* * Overlaps with KVM_ARM64_EXCEPT_MASK on purpose so that it can't be * set together with an exception... */ #define KVM_ARM64_INCREMENT_PC (1 << 9) /* Increment PC */ </quote>
So (KVM_ARM64_PENDING_EXCEPTION | KVM_ARM64_EXCEPT_MASK) checks for *both* an exception and a PC increment.
Then how about explicitly checking for the KVM_ARM64_PENDING_EXCEPTION and KVM_ARM64_INCREMENT_PC flags, like it's done in __kvm_adjust_pc? That would certainly make the code easier to understand, as it's not immediately obvious that the EXCEPT mask includes the INCREMENT_PC flag.
Thanks,
Alex