Recently while trying to fix some unit tests I found a CVE in SVM nested code.
In 'shutdown_interception' vmexit handler we call kvm_vcpu_reset.
However if running nested and L1 doesn't intercept shutdown, we will still end up running this function and trigger a bug in it.
The bug is that this function resets the 'vcpu->arch.hflags' without properly leaving the nested state, which leaves the vCPU in inconsistent state, which later triggers a kernel panic in SVM code.
The same bug can likely be triggered by sending INIT via local apic to a vCPU which runs a nested guest.
On VMX we are lucky that the issue can't happen because VMX always intercepts triple faults, thus triple fault in L2 will always be redirected to L1. Plus the 'handle_triple_fault' of VMX doesn't reset the vCPU.
INIT IPI can't happen on VMX either because INIT events are masked while in VMX mode.
First 4 patches in this series address the above issue, and are already posted on the list with title, ('nSVM: fix L0 crash if L2 has shutdown condtion which L1 doesn't intercept') I addressed the review feedback and also added a unit test to hit this issue.
In addition to these patches I noticed that KVM doesn't honour SHUTDOWN intercept bit of L1 on SVM, and I included a fix to do so - its only for correctness as a normal hypervisor should always intercept SHUTDOWN. A unit test on the other hand might want to not do so. I also extendted the triple_fault_test selftest to hit this issue.
Finaly I found another security issue, I found a way to trigger a kernel non rate limited printk on SVM from the guest, and last patch in the series fixes that.
A unit test I posted to kvm-unit-tests project hits this issue, so no selftest was added.
Best regards, Maxim Levitsky
Maxim Levitsky (9): KVM: x86: nSVM: leave nested mode on vCPU free KVM: x86: nSVM: harden svm_free_nested against freeing vmcb02 while still in use KVM: x86: add kvm_leave_nested KVM: x86: forcibly leave nested mode on vCPU reset KVM: selftests: move idt_entry to header kvm: selftests: add svm nested shutdown test KVM: x86: allow L1 to not intercept triple fault KVM: selftests: add svm part to triple_fault_test KVM: x86: remove exit_int_info warning in svm_handle_exit
arch/x86/kvm/svm/nested.c | 12 +++- arch/x86/kvm/svm/svm.c | 10 +-- arch/x86/kvm/vmx/nested.c | 4 +- arch/x86/kvm/x86.c | 29 ++++++-- tools/testing/selftests/kvm/.gitignore | 1 + tools/testing/selftests/kvm/Makefile | 1 + .../selftests/kvm/include/x86_64/processor.h | 13 ++++ .../selftests/kvm/lib/x86_64/processor.c | 13 ---- .../kvm/x86_64/svm_nested_shutdown_test.c | 71 +++++++++++++++++++ .../kvm/x86_64/triple_fault_event_test.c | 71 ++++++++++++++----- 10 files changed, 174 insertions(+), 51 deletions(-) create mode 100644 tools/testing/selftests/kvm/x86_64/svm_nested_shutdown_test.c
If the VM was terminated while nested, we free the nested state while the vCPU still is in nested mode.
Soon a warning will be added for this condition.
Cc: stable@vger.kernel.org Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- arch/x86/kvm/svm/svm.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index d22a809d923339..e9cec1b692051c 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1440,6 +1440,7 @@ static void svm_vcpu_free(struct kvm_vcpu *vcpu) */ svm_clear_current_vmcb(svm->vmcb);
+ svm_leave_nested(vcpu); svm_free_nested(svm);
sev_free_vcpu(vcpu);
Make sure that KVM uses vmcb01 before freeing nested state, and warn if that is not the case.
This is a minimal fix for CVE-2022-3344 making the kernel print a warning instead of a kernel panic.
Cc: stable@vger.kernel.org Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- arch/x86/kvm/svm/nested.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index b258d6988f5dde..b74da40c1fc40c 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -1126,6 +1126,9 @@ void svm_free_nested(struct vcpu_svm *svm) if (!svm->nested.initialized) return;
+ if (WARN_ON_ONCE(svm->vmcb != svm->vmcb01.ptr)) + svm_switch_vmcb(svm, &svm->vmcb01); + svm_vcpu_free_msrpm(svm->nested.msrpm); svm->nested.msrpm = NULL;
add kvm_leave_nested which wraps a call to nested_ops->leave_nested into a function.
Cc: stable@vger.kernel.org Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- arch/x86/kvm/svm/nested.c | 3 --- arch/x86/kvm/vmx/nested.c | 3 --- arch/x86/kvm/x86.c | 8 +++++++- 3 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index b74da40c1fc40c..bcc4f6620f8aec 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -1147,9 +1147,6 @@ void svm_free_nested(struct vcpu_svm *svm) svm->nested.initialized = false; }
-/* - * Forcibly leave nested mode in order to be able to reset the VCPU later on. - */ void svm_leave_nested(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 61a2e551640a08..1ebe141a0a015f 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -6441,9 +6441,6 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu, return kvm_state.size; }
-/* - * Forcibly leave nested mode in order to be able to reset the VCPU later on. - */ void vmx_leave_nested(struct kvm_vcpu *vcpu) { if (is_guest_mode(vcpu)) { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index cd9eb13e2ed7fc..316ab1d5317f92 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -627,6 +627,12 @@ static void kvm_queue_exception_vmexit(struct kvm_vcpu *vcpu, unsigned int vecto ex->payload = payload; }
+/* Forcibly leave the nested mode in cases like a vCPU reset */ +static void kvm_leave_nested(struct kvm_vcpu *vcpu) +{ + kvm_x86_ops.nested_ops->leave_nested(vcpu); +} + static void kvm_multiple_exception(struct kvm_vcpu *vcpu, unsigned nr, bool has_error, u32 error_code, bool has_payload, unsigned long payload, bool reinject) @@ -5193,7 +5199,7 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu, if (events->flags & KVM_VCPUEVENT_VALID_SMM) { #ifdef CONFIG_KVM_SMM if (!!(vcpu->arch.hflags & HF_SMM_MASK) != events->smi.smm) { - kvm_x86_ops.nested_ops->leave_nested(vcpu); + kvm_leave_nested(vcpu); kvm_smm_changed(vcpu, events->smi.smm); }
While not obivous, kvm_vcpu_reset() leaves the nested mode by clearing 'vcpu->arch.hflags' but it does so without all the required housekeeping.
On SVM, it is possible to have a vCPU reset while in guest mode because unlike VMX, on SVM, INIT's are not latched in SVM non root mode and in addition to that L1 doesn't have to intercept triple fault, which should also trigger L1's reset if happens in L2 while L1 didn't intercept it.
If one of the above conditions happen, KVM will continue to use vmcb02 while not having in the guest mode.
Later the IA32_EFER will be cleared which will lead to freeing of the nested guest state which will (correctly) free the vmcb02, but since KVM still uses it (incorrectly) this will lead to a use after free and kernel crash.
This issue is assigned CVE-2022-3344
Cc: stable@vger.kernel.org Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- arch/x86/kvm/x86.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 316ab1d5317f92..3fd900504e683b 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -11694,8 +11694,18 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) WARN_ON_ONCE(!init_event && (old_cr0 || kvm_read_cr3(vcpu) || kvm_read_cr4(vcpu)));
+ /* + * SVM doesn't unconditionally VM-Exit on INIT and SHUTDOWN, thus it's + * possible to INIT the vCPU while L2 is active. Force the vCPU back + * into L1 as EFER.SVME is cleared on INIT (along with all other EFER + * bits), i.e. virtualization is disabled. + */ + if (is_guest_mode(vcpu)) + kvm_leave_nested(vcpu); + kvm_lapic_reset(vcpu, init_event);
+ WARN_ON_ONCE(is_guest_mode(vcpu) || is_smm(vcpu)); vcpu->arch.hflags = 0;
vcpu->arch.smi_pending = 0;
struct idt_entry will be used for a test which will break IDT on purpose.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- .../selftests/kvm/include/x86_64/processor.h | 13 +++++++++++++ tools/testing/selftests/kvm/lib/x86_64/processor.c | 13 ------------- 2 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h index e8ca0d8a6a7e0a..5da0c5e2a7afc4 100644 --- a/tools/testing/selftests/kvm/include/x86_64/processor.h +++ b/tools/testing/selftests/kvm/include/x86_64/processor.h @@ -748,6 +748,19 @@ struct ex_regs { uint64_t rflags; };
+struct idt_entry { + uint16_t offset0; + uint16_t selector; + uint16_t ist : 3; + uint16_t : 5; + uint16_t type : 4; + uint16_t : 1; + uint16_t dpl : 2; + uint16_t p : 1; + uint16_t offset1; + uint32_t offset2; uint32_t reserved; +}; + void vm_init_descriptor_tables(struct kvm_vm *vm); void vcpu_init_descriptor_tables(struct kvm_vcpu *vcpu); void vm_install_exception_handler(struct kvm_vm *vm, int vector, diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c index 39c4409ef56a6a..41c1c73c464d48 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/processor.c +++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c @@ -1074,19 +1074,6 @@ void kvm_get_cpu_address_width(unsigned int *pa_bits, unsigned int *va_bits) } }
-struct idt_entry { - uint16_t offset0; - uint16_t selector; - uint16_t ist : 3; - uint16_t : 5; - uint16_t type : 4; - uint16_t : 1; - uint16_t dpl : 2; - uint16_t p : 1; - uint16_t offset1; - uint32_t offset2; uint32_t reserved; -}; - static void set_idt_entry(struct kvm_vm *vm, int vector, unsigned long addr, int dpl, unsigned short selector) {
Add test that tests that on SVM if L1 doesn't intercept SHUTDOWN, then L2 crashes L1 and doesn't crash L2
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- tools/testing/selftests/kvm/.gitignore | 1 + tools/testing/selftests/kvm/Makefile | 1 + .../kvm/x86_64/svm_nested_shutdown_test.c | 71 +++++++++++++++++++ 3 files changed, 73 insertions(+) create mode 100644 tools/testing/selftests/kvm/x86_64/svm_nested_shutdown_test.c
diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore index 2f0d705db9dba5..05d980fb083d17 100644 --- a/tools/testing/selftests/kvm/.gitignore +++ b/tools/testing/selftests/kvm/.gitignore @@ -41,6 +41,7 @@ /x86_64/svm_vmcall_test /x86_64/svm_int_ctl_test /x86_64/svm_nested_soft_inject_test +/x86_64/svm_nested_shutdown_test /x86_64/sync_regs_test /x86_64/tsc_msrs_test /x86_64/tsc_scaling_sync diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 0172eb6cb6eee2..4a2caef2c9396f 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -101,6 +101,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/state_test TEST_GEN_PROGS_x86_64 += x86_64/vmx_preemption_timer_test TEST_GEN_PROGS_x86_64 += x86_64/svm_vmcall_test TEST_GEN_PROGS_x86_64 += x86_64/svm_int_ctl_test +TEST_GEN_PROGS_x86_64 += x86_64/svm_nested_shutdown_test TEST_GEN_PROGS_x86_64 += x86_64/svm_nested_soft_inject_test TEST_GEN_PROGS_x86_64 += x86_64/tsc_scaling_sync TEST_GEN_PROGS_x86_64 += x86_64/sync_regs_test diff --git a/tools/testing/selftests/kvm/x86_64/svm_nested_shutdown_test.c b/tools/testing/selftests/kvm/x86_64/svm_nested_shutdown_test.c new file mode 100644 index 00000000000000..3155edf81f5474 --- /dev/null +++ b/tools/testing/selftests/kvm/x86_64/svm_nested_shutdown_test.c @@ -0,0 +1,71 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * svm_nested_shutdown_test + * + * Copyright (C) 2022, Red Hat, Inc. + * + * Nested SVM testing: test that unintercepted shutdown in L2 doesn't crash the host + */ + +#include "test_util.h" +#include "kvm_util.h" +#include "processor.h" +#include "svm_util.h" + + +static void l2_guest_code(struct svm_test_data *svm) +{ + __asm__ __volatile__("ud2"); +} + +static void l1_guest_code(struct svm_test_data *svm, struct idt_entry * idt) +{ + #define L2_GUEST_STACK_SIZE 64 + unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE]; + struct vmcb *vmcb = svm->vmcb; + + generic_svm_setup(svm, l2_guest_code, + &l2_guest_stack[L2_GUEST_STACK_SIZE]); + + vmcb->control.intercept &= ~(BIT(INTERCEPT_SHUTDOWN)); + + idt[6].p = 0; // #UD is intercepted but its injection will cause #NP + idt[11].p = 0; // #NP is not intercepted and will cause another #NP will be be converted to #DF + idt[8].p = 0; // #DF will cause #NP which will cause SHUTDOWN + + run_guest(vmcb, svm->vmcb_gpa); + + /* should not reach here */ + GUEST_ASSERT(0); +} + +void test_run(bool emulated_shutdown) +{ +} + +int main(int argc, char *argv[]) +{ + struct kvm_vcpu *vcpu; + struct kvm_run *run; + vm_vaddr_t svm_gva; + struct kvm_vm *vm; + + TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_SVM)); + + vm = vm_create_with_one_vcpu(&vcpu, l1_guest_code); + vm_init_descriptor_tables(vm); + vcpu_init_descriptor_tables(vcpu); + + vcpu_alloc_svm(vm, &svm_gva); + + vcpu_args_set(vcpu, 2, svm_gva, vm->idt); + run = vcpu->run; + + vcpu_run(vcpu); + TEST_ASSERT(run->exit_reason == KVM_EXIT_SHUTDOWN, + "Got exit_reason other than KVM_EXIT_SHUTDOWN: %u (%s)\n", + run->exit_reason, + exit_reason_str(run->exit_reason)); + + kvm_vm_free(vm); +}
This is SVM correctness fix - although a sane L1 would intercept SHUTDOWN event, it doesn't have to, so we have to honour this.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- arch/x86/kvm/svm/nested.c | 6 ++++++ arch/x86/kvm/vmx/nested.c | 1 + arch/x86/kvm/x86.c | 11 ++++++----- 3 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index bcc4f6620f8aec..3aa9184d1e4ed7 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -1092,6 +1092,12 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
static void nested_svm_triple_fault(struct kvm_vcpu *vcpu) { + struct vcpu_svm *svm = to_svm(vcpu); + + if (!vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_SHUTDOWN)) + return; + + kvm_clear_request(KVM_REQ_TRIPLE_FAULT, vcpu); nested_svm_simple_vmexit(to_svm(vcpu), SVM_EXIT_SHUTDOWN); }
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 1ebe141a0a015f..7924dea9367813 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -4855,6 +4855,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
static void nested_vmx_triple_fault(struct kvm_vcpu *vcpu) { + kvm_clear_request(KVM_REQ_TRIPLE_FAULT, vcpu); nested_vmx_vmexit(vcpu, EXIT_REASON_TRIPLE_FAULT, 0, 0); }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 3fd900504e683b..f0a0102a78f5c3 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9741,7 +9741,7 @@ static void update_cr8_intercept(struct kvm_vcpu *vcpu)
int kvm_check_nested_events(struct kvm_vcpu *vcpu) { - if (kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu)) { + if (kvm_test_request(KVM_REQ_TRIPLE_FAULT, vcpu)) { kvm_x86_ops.nested_ops->triple_fault(vcpu); return 1; } @@ -10255,15 +10255,16 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) r = 0; goto out; } - if (kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu)) { - if (is_guest_mode(vcpu)) { + if (kvm_test_request(KVM_REQ_TRIPLE_FAULT, vcpu)) { + if (is_guest_mode(vcpu)) kvm_x86_ops.nested_ops->triple_fault(vcpu); - } else { + + if (kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu)) { vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN; vcpu->mmio_needed = 0; r = 0; - goto out; } + goto out; } if (kvm_check_request(KVM_REQ_APF_HALT, vcpu)) { /* Page is swapped out. Do synthetic halt */
Add a SVM implementation to triple_fault_test to test that emulated/injected shutdown works.
Since instead of the VMX, the SVM allows the hypervisor to avoid intercepting shutdown in guest, don't intercept shutdown to test that KVM suports this correctly.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- .../kvm/x86_64/triple_fault_event_test.c | 71 ++++++++++++++----- 1 file changed, 54 insertions(+), 17 deletions(-)
diff --git a/tools/testing/selftests/kvm/x86_64/triple_fault_event_test.c b/tools/testing/selftests/kvm/x86_64/triple_fault_event_test.c index 70b44f0b52fef2..78d2941f7c1f0d 100644 --- a/tools/testing/selftests/kvm/x86_64/triple_fault_event_test.c +++ b/tools/testing/selftests/kvm/x86_64/triple_fault_event_test.c @@ -3,6 +3,7 @@ #include "kvm_util.h" #include "processor.h" #include "vmx.h" +#include "svm_util.h"
#include <string.h> #include <sys/ioctl.h> @@ -20,10 +21,11 @@ static void l2_guest_code(void) : : [port] "d" (ARBITRARY_IO_PORT) : "rax"); }
-void l1_guest_code(struct vmx_pages *vmx) -{ #define L2_GUEST_STACK_SIZE 64 - unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE]; +unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE]; + +void l1_guest_code_vmx(struct vmx_pages *vmx) +{
GUEST_ASSERT(vmx->vmcs_gpa); GUEST_ASSERT(prepare_for_vmx_operation(vmx)); @@ -38,24 +40,51 @@ void l1_guest_code(struct vmx_pages *vmx) GUEST_DONE(); }
+void l1_guest_code_svm(struct svm_test_data *svm) +{ + struct vmcb *vmcb = svm->vmcb; + + generic_svm_setup(svm, l2_guest_code, + &l2_guest_stack[L2_GUEST_STACK_SIZE]); + + /* don't intercept shutdown to test the case of SVM allowing to do so */ + vmcb->control.intercept &= ~(BIT(INTERCEPT_SHUTDOWN)); + + run_guest(vmcb, svm->vmcb_gpa); + + /* should not reach here, L1 should crash */ + GUEST_ASSERT(0); +} + int main(void) { struct kvm_vcpu *vcpu; struct kvm_run *run; struct kvm_vcpu_events events; - vm_vaddr_t vmx_pages_gva; struct ucall uc;
- TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_VMX)); + bool has_vmx = kvm_cpu_has(X86_FEATURE_VMX); + bool has_svm = kvm_cpu_has(X86_FEATURE_SVM); + + TEST_REQUIRE(has_vmx || has_svm);
TEST_REQUIRE(kvm_has_cap(KVM_CAP_X86_TRIPLE_FAULT_EVENT));
- vm = vm_create_with_one_vcpu(&vcpu, l1_guest_code); - vm_enable_cap(vm, KVM_CAP_X86_TRIPLE_FAULT_EVENT, 1);
+ if (has_vmx) { + vm_vaddr_t vmx_pages_gva; + vm = vm_create_with_one_vcpu(&vcpu, l1_guest_code_vmx); + vcpu_alloc_vmx(vm, &vmx_pages_gva); + vcpu_args_set(vcpu, 1, vmx_pages_gva); + } else { + vm_vaddr_t svm_gva; + vm = vm_create_with_one_vcpu(&vcpu, l1_guest_code_svm); + vcpu_alloc_svm(vm, &svm_gva); + vcpu_args_set(vcpu, 1, svm_gva); + } + + vm_enable_cap(vm, KVM_CAP_X86_TRIPLE_FAULT_EVENT, 1); run = vcpu->run; - vcpu_alloc_vmx(vm, &vmx_pages_gva); - vcpu_args_set(vcpu, 1, vmx_pages_gva); vcpu_run(vcpu);
TEST_ASSERT(run->exit_reason == KVM_EXIT_IO, @@ -78,13 +107,21 @@ int main(void) "No triple fault pending"); vcpu_run(vcpu);
- switch (get_ucall(vcpu, &uc)) { - case UCALL_DONE: - break; - case UCALL_ABORT: - REPORT_GUEST_ASSERT(uc); - default: - TEST_FAIL("Unexpected ucall: %lu", uc.cmd); - }
+ if (has_svm) { + TEST_ASSERT(run->exit_reason == KVM_EXIT_SHUTDOWN, + "Got exit_reason other than KVM_EXIT_SHUTDOWN: %u (%s)\n", + run->exit_reason, + exit_reason_str(run->exit_reason)); + return 0; + } else { + switch (get_ucall(vcpu, &uc)) { + case UCALL_DONE: + break; + case UCALL_ABORT: + REPORT_GUEST_ASSERT(uc); + default: + TEST_FAIL("Unexpected ucall: %lu", uc.cmd); + } + } }
It is valid to receive external interrupt and have broken IDT entry, which will lead to #GP with exit_int_into that will contain the index of the IDT entry (e.g any value).
Other exceptions can happen as well, like #NP or #SS (if stack switch fails).
Thus this warning can be user triggred and has very little value.
Cc: stable@vger.kernel.org Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- arch/x86/kvm/svm/svm.c | 9 --------- 1 file changed, 9 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index e9cec1b692051c..36f651ce842174 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -3428,15 +3428,6 @@ static int svm_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath) return 0; }
- if (is_external_interrupt(svm->vmcb->control.exit_int_info) && - exit_code != SVM_EXIT_EXCP_BASE + PF_VECTOR && - exit_code != SVM_EXIT_NPF && exit_code != SVM_EXIT_TASK_SWITCH && - exit_code != SVM_EXIT_INTR && exit_code != SVM_EXIT_NMI) - printk(KERN_ERR "%s: unexpected exit_int_info 0x%x " - "exit_code 0x%x\n", - __func__, svm->vmcb->control.exit_int_info, - exit_code); - if (exit_fastpath != EXIT_FASTPATH_NONE) return 1;
On Thu, 2022-11-03 at 15:57 +0200, Maxim Levitsky wrote:
Recently while trying to fix some unit tests I found a CVE in SVM nested code.
In 'shutdown_interception' vmexit handler we call kvm_vcpu_reset.
However if running nested and L1 doesn't intercept shutdown, we will still end up running this function and trigger a bug in it.
The bug is that this function resets the 'vcpu->arch.hflags' without properly leaving the nested state, which leaves the vCPU in inconsistent state, which later triggers a kernel panic in SVM code.
The same bug can likely be triggered by sending INIT via local apic to a vCPU which runs a nested guest.
On VMX we are lucky that the issue can't happen because VMX always intercepts triple faults, thus triple fault in L2 will always be redirected to L1. Plus the 'handle_triple_fault' of VMX doesn't reset the vCPU.
INIT IPI can't happen on VMX either because INIT events are masked while in VMX mode.
First 4 patches in this series address the above issue, and are already posted on the list with title, ('nSVM: fix L0 crash if L2 has shutdown condtion which L1 doesn't intercept') I addressed the review feedback and also added a unit test to hit this issue.
In addition to these patches I noticed that KVM doesn't honour SHUTDOWN intercept bit of L1 on SVM, and I included a fix to do so - its only for correctness as a normal hypervisor should always intercept SHUTDOWN. A unit test on the other hand might want to not do so. I also extendted the triple_fault_test selftest to hit this issue.
Finaly I found another security issue, I found a way to trigger a kernel non rate limited printk on SVM from the guest, and last patch in the series fixes that.
A unit test I posted to kvm-unit-tests project hits this issue, so no selftest was added.
Best regards, Maxim Levitsky
Maxim Levitsky (9): KVM: x86: nSVM: leave nested mode on vCPU free KVM: x86: nSVM: harden svm_free_nested against freeing vmcb02 while still in use KVM: x86: add kvm_leave_nested KVM: x86: forcibly leave nested mode on vCPU reset KVM: selftests: move idt_entry to header kvm: selftests: add svm nested shutdown test KVM: x86: allow L1 to not intercept triple fault KVM: selftests: add svm part to triple_fault_test KVM: x86: remove exit_int_info warning in svm_handle_exit
arch/x86/kvm/svm/nested.c | 12 +++- arch/x86/kvm/svm/svm.c | 10 +-- arch/x86/kvm/vmx/nested.c | 4 +- arch/x86/kvm/x86.c | 29 ++++++-- tools/testing/selftests/kvm/.gitignore | 1 + tools/testing/selftests/kvm/Makefile | 1 + .../selftests/kvm/include/x86_64/processor.h | 13 ++++ .../selftests/kvm/lib/x86_64/processor.c | 13 ---- .../kvm/x86_64/svm_nested_shutdown_test.c | 71 +++++++++++++++++++ .../kvm/x86_64/triple_fault_event_test.c | 71 ++++++++++++++----- 10 files changed, 174 insertions(+), 51 deletions(-) create mode 100644 tools/testing/selftests/kvm/x86_64/svm_nested_shutdown_test.c
-- 2.34.3
I jumped the gun a bit with this patch series, there are few checkpatch.pl issues and some leftovers I didn't remove. I'll resend this shortly.
Best regards, Maxim Levitsky
linux-kselftest-mirror@lists.linaro.org