This is a resend of a few patches that implement few SVM's optional features for nesting.
I was testing these patches during last few weeks with various nested configurations and I was unable to find any issues.
I also implemented support for nested vGIF in the last patch.
Best regards, Maxim Levitsky
Maxim Levitsky (6): KVM: x86: SVM: add module param to control LBR virtualization KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running KVM: x86: nSVM: implement nested LBR virtualization KVM: x86: nSVM: implement nested VMLOAD/VMSAVE KVM: x86: nSVM: support PAUSE filter threshold and count when cpu_pm=on KVM: x86: SVM: implement nested vGIF
arch/x86/kvm/svm/nested.c | 86 ++++++++++++++++++++--- arch/x86/kvm/svm/svm.c | 140 ++++++++++++++++++++++++++++++++------ arch/x86/kvm/svm/svm.h | 38 +++++++++-- 3 files changed, 228 insertions(+), 36 deletions(-)
This is useful for debug and also makes it consistent with the rest of the SVM optional features.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- arch/x86/kvm/svm/svm.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 21bb81710e0f6..2b5f8e10d686d 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -186,12 +186,13 @@ module_param(vls, int, 0444); static int vgif = true; module_param(vgif, int, 0444);
+static int tsc_scaling = true; +module_param(tsc_scaling, int, 0444); + /* enable/disable LBR virtualization */ static int lbrv = true; module_param(lbrv, int, 0444);
-static int tsc_scaling = true; -module_param(tsc_scaling, int, 0444);
/* * enable / disable AVIC. Because the defaults differ for APICv
Maxim Levitsky mlevitsk@redhat.com writes:
This is useful for debug and also makes it consistent with the rest of the SVM optional features.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com
arch/x86/kvm/svm/svm.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 21bb81710e0f6..2b5f8e10d686d 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -186,12 +186,13 @@ module_param(vls, int, 0444); static int vgif = true; module_param(vgif, int, 0444); +static int tsc_scaling = true; +module_param(tsc_scaling, int, 0444);
/* enable/disable LBR virtualization */ static int lbrv = true; module_param(lbrv, int, 0444); -static int tsc_scaling = true; -module_param(tsc_scaling, int, 0444);
Subject line says "KVM: x86: SVM: add module param to control LBR virtualization" but the patch just moves 'tsc_scaling' param a couple lines up. Was this intended or is this a rebase artifact and the patch just needs to be dropped?
/*
- enable / disable AVIC. Because the defaults differ for APICv
On Mon, 2021-11-01 at 17:21 +0100, Vitaly Kuznetsov wrote:
Maxim Levitsky mlevitsk@redhat.com writes:
This is useful for debug and also makes it consistent with the rest of the SVM optional features.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com
arch/x86/kvm/svm/svm.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 21bb81710e0f6..2b5f8e10d686d 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -186,12 +186,13 @@ module_param(vls, int, 0444); static int vgif = true; module_param(vgif, int, 0444); +static int tsc_scaling = true; +module_param(tsc_scaling, int, 0444);
/* enable/disable LBR virtualization */ static int lbrv = true; module_param(lbrv, int, 0444); -static int tsc_scaling = true; -module_param(tsc_scaling, int, 0444);
Subject line says "KVM: x86: SVM: add module param to control LBR virtualization" but the patch just moves 'tsc_scaling' param a couple lines up. Was this intended or is this a rebase artifact and the patch just needs to be dropped?
Yes, I didn't notice when I rebased the series last time, that this patch was accepted already.
Sorry for the noise.
Best regards, Maxim Levitsky
/*
- enable / disable AVIC. Because the defaults differ for APICv
When L2 is running without LBR virtualization, we should ensure that L1's LBR msrs continue to update as usual.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- arch/x86/kvm/svm/nested.c | 11 +++++ arch/x86/kvm/svm/svm.c | 98 +++++++++++++++++++++++++++++++-------- arch/x86/kvm/svm/svm.h | 2 + 3 files changed, 92 insertions(+), 19 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index f8b7bc04b3e7a..9239631846d88 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -517,6 +517,9 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12 svm->vcpu.arch.dr6 = vmcb12->save.dr6 | DR6_ACTIVE_LOW; vmcb_mark_dirty(svm->vmcb, VMCB_DR); } + + if (unlikely(svm->vmcb01.ptr->control.virt_ext & LBR_CTL_ENABLE_MASK)) + svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb); }
static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) @@ -574,6 +577,9 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) svm->vmcb->control.event_inj = svm->nested.ctl.event_inj; svm->vmcb->control.event_inj_err = svm->nested.ctl.event_inj_err;
+ svm->vmcb->control.virt_ext = svm->vmcb01.ptr->control.virt_ext & + LBR_CTL_ENABLE_MASK; + nested_svm_transition_tlb_flush(vcpu);
/* Enter Guest-Mode */ @@ -835,6 +841,11 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
svm_switch_vmcb(svm, &svm->vmcb01);
+ if (unlikely(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK)) { + svm_copy_lbrs(svm->nested.vmcb02.ptr, svm->vmcb); + svm_update_lbrv(vcpu); + } + /* * On vmexit the GIF is set to false and * no event can be injected in L1. diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 2b5f8e10d686d..c08f8804d107f 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -799,6 +799,17 @@ static void init_msrpm_offsets(void) } }
+void svm_copy_lbrs(struct vmcb *from_vmcb, struct vmcb *to_vmcb) +{ + to_vmcb->save.dbgctl = from_vmcb->save.dbgctl; + to_vmcb->save.br_from = from_vmcb->save.br_from; + to_vmcb->save.br_to = from_vmcb->save.br_to; + to_vmcb->save.last_excp_from = from_vmcb->save.last_excp_from; + to_vmcb->save.last_excp_to = from_vmcb->save.last_excp_to; + + vmcb_mark_dirty(to_vmcb, VMCB_LBR); +} + static void svm_enable_lbrv(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); @@ -808,6 +819,10 @@ static void svm_enable_lbrv(struct kvm_vcpu *vcpu) set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, 1, 1); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, 1, 1); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 1, 1); + + /* Move the LBR msrs to the vmcb02 so that the guest can see them. */ + if (is_guest_mode(vcpu)) + svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb); }
static void svm_disable_lbrv(struct kvm_vcpu *vcpu) @@ -819,6 +834,63 @@ static void svm_disable_lbrv(struct kvm_vcpu *vcpu) set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, 0, 0); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, 0, 0); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 0, 0); + + /* + * Move the LBR msrs back to the vmcb01 to avoid copying them + * on nested guest entries. + */ + if (is_guest_mode(vcpu)) + svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr); +} + +static int svm_get_lbr_msr(struct vcpu_svm *svm, u32 index) +{ + /* + * If the LBR virtualization is disabled, the LBR msrs are always + * kept in the vmcb01 to avoid copying them on nested guest entries. + * + * If nested, and the LBR virtualization is enabled/disabled, the msrs + * are moved between the vmcb01 and vmcb02 as needed. + */ + struct vmcb *vmcb = + (svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK) ? + svm->vmcb : svm->vmcb01.ptr; + + switch (index) { + case MSR_IA32_DEBUGCTLMSR: + return vmcb->save.dbgctl; + case MSR_IA32_LASTBRANCHFROMIP: + return vmcb->save.br_from; + case MSR_IA32_LASTBRANCHTOIP: + return vmcb->save.br_to; + case MSR_IA32_LASTINTFROMIP: + return vmcb->save.last_excp_from; + case MSR_IA32_LASTINTTOIP: + return vmcb->save.last_excp_to; + default: + KVM_BUG(false, svm->vcpu.kvm, + "%s: Unknown MSR 0x%x", __func__, index); + return 0; + } +} + +void svm_update_lbrv(struct kvm_vcpu *vcpu) +{ + struct vcpu_svm *svm = to_svm(vcpu); + + bool enable_lbrv = svm_get_lbr_msr(svm, MSR_IA32_DEBUGCTLMSR) & + DEBUGCTLMSR_LBR; + + bool current_enable_lbrv = !!(svm->vmcb->control.virt_ext & + LBR_CTL_ENABLE_MASK); + + if (enable_lbrv == current_enable_lbrv) + return; + + if (enable_lbrv) + svm_enable_lbrv(vcpu); + else + svm_disable_lbrv(vcpu); }
void disable_nmi_singlestep(struct vcpu_svm *svm) @@ -2762,25 +2834,12 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) case MSR_TSC_AUX: msr_info->data = svm->tsc_aux; break; - /* - * Nobody will change the following 5 values in the VMCB so we can - * safely return them on rdmsr. They will always be 0 until LBRV is - * implemented. - */ case MSR_IA32_DEBUGCTLMSR: - msr_info->data = svm->vmcb->save.dbgctl; - break; case MSR_IA32_LASTBRANCHFROMIP: - msr_info->data = svm->vmcb->save.br_from; - break; case MSR_IA32_LASTBRANCHTOIP: - msr_info->data = svm->vmcb->save.br_to; - break; case MSR_IA32_LASTINTFROMIP: - msr_info->data = svm->vmcb->save.last_excp_from; - break; case MSR_IA32_LASTINTTOIP: - msr_info->data = svm->vmcb->save.last_excp_to; + msr_info->data = svm_get_lbr_msr(svm, msr_info->index); break; case MSR_VM_HSAVE_PA: msr_info->data = svm->nested.hsave_msr; @@ -3011,12 +3070,13 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) if (data & DEBUGCTL_RESERVED_BITS) return 1;
- svm->vmcb->save.dbgctl = data; - vmcb_mark_dirty(svm->vmcb, VMCB_LBR); - if (data & (1ULL<<0)) - svm_enable_lbrv(vcpu); + if (svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK) + svm->vmcb->save.dbgctl = data; else - svm_disable_lbrv(vcpu); + svm->vmcb01.ptr->save.dbgctl = data; + + svm_update_lbrv(vcpu); + break; case MSR_VM_HSAVE_PA: /* diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 0d7bbe548ac3e..ff84f84f697f7 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -420,6 +420,8 @@ u32 svm_msrpm_offset(u32 msr); u32 *svm_vcpu_alloc_msrpm(void); void svm_vcpu_init_msrpm(struct kvm_vcpu *vcpu, u32 *msrpm); void svm_vcpu_free_msrpm(u32 *msrpm); +void svm_copy_lbrs(struct vmcb *from_vmcb, struct vmcb *to_vmcb); +void svm_update_lbrv(struct kvm_vcpu *vcpu);
int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer); void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
This was tested with kvm-unit-test that was developed for this purpose.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- arch/x86/kvm/svm/nested.c | 21 +++++++++++++++++++-- arch/x86/kvm/svm/svm.c | 8 ++++++++ arch/x86/kvm/svm/svm.h | 1 + 3 files changed, 28 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 9239631846d88..eca0f2f41bf30 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -518,8 +518,19 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12 vmcb_mark_dirty(svm->vmcb, VMCB_DR); }
- if (unlikely(svm->vmcb01.ptr->control.virt_ext & LBR_CTL_ENABLE_MASK)) + if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) { + svm_copy_lbrs(vmcb12, svm->vmcb); + /* + * TODO: to avoid TOC/TOU race, + * make sure that we only pick LBR enable bit from + * the guest. + */ + svm->vmcb->save.dbgctl &= LBR_CTL_ENABLE_MASK; + svm_update_lbrv(&svm->vcpu); + + } else if (unlikely(svm->vmcb01.ptr->control.virt_ext & LBR_CTL_ENABLE_MASK)) { svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb); + } }
static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) @@ -579,6 +590,9 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm)
svm->vmcb->control.virt_ext = svm->vmcb01.ptr->control.virt_ext & LBR_CTL_ENABLE_MASK; + if (svm->lbrv_enabled) + svm->vmcb->control.virt_ext |= + (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK);
nested_svm_transition_tlb_flush(vcpu);
@@ -841,7 +855,10 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
svm_switch_vmcb(svm, &svm->vmcb01);
- if (unlikely(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK)) { + if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) { + svm_copy_lbrs(svm->nested.vmcb02.ptr, vmcb12); + svm_update_lbrv(vcpu); + } else if (unlikely(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK)) { svm_copy_lbrs(svm->nested.vmcb02.ptr, svm->vmcb); svm_update_lbrv(vcpu); } diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index c08f8804d107f..0ae8fa9400902 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -884,6 +884,10 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu) bool current_enable_lbrv = !!(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK);
+ if (unlikely(is_guest_mode(vcpu) && svm->lbrv_enabled)) + if (unlikely(svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK)) + enable_lbrv = true; + if (enable_lbrv == current_enable_lbrv) return;
@@ -1016,6 +1020,9 @@ static __init void svm_set_cpu_caps(void) if (tsc_scaling) kvm_cpu_cap_set(X86_FEATURE_TSCRATEMSR);
+ if (lbrv) + kvm_cpu_cap_set(X86_FEATURE_LBRV); + /* Nested VM can receive #VMEXIT instead of triggering #GP */ kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK); } @@ -4147,6 +4154,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) guest_cpuid_has(vcpu, X86_FEATURE_NRIPS);
svm->tsc_scaling_enabled = tsc_scaling && guest_cpuid_has(vcpu, X86_FEATURE_TSCRATEMSR); + svm->lbrv_enabled = lbrv && guest_cpuid_has(vcpu, X86_FEATURE_LBRV);
svm_recalc_instruction_intercepts(vcpu, svm);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index ff84f84f697f7..c3d46fdf4b9ad 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -164,6 +164,7 @@ struct vcpu_svm { /* cached guest cpuid flags for faster access */ bool nrips_enabled : 1; bool tsc_scaling_enabled : 1; + bool lbrv_enabled : 1;
u32 ldr_reg; u32 dfr_reg;
This was tested by booting L1,L2,L3 (all Linux) and checking that no VMLOAD/VMSAVE vmexits happened.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- arch/x86/kvm/svm/nested.c | 35 +++++++++++++++++++++++++++++------ arch/x86/kvm/svm/svm.c | 7 +++++++ arch/x86/kvm/svm/svm.h | 8 +++++++- 3 files changed, 43 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index eca0f2f41bf30..2dc97cca68f7c 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -119,6 +119,20 @@ static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu) vcpu->arch.walk_mmu = &vcpu->arch.root_mmu; }
+static bool nested_vmcb_needs_vls_intercept(struct vcpu_svm *svm) +{ + if (!svm->v_vmload_vmsave_enabled) + return true; + + if (!nested_npt_enabled(svm)) + return true; + + if (!(svm->nested.ctl.virt_ext & VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK)) + return true; + + return false; +} + void recalc_intercepts(struct vcpu_svm *svm) { struct vmcb_control_area *c, *h, *g; @@ -159,8 +173,17 @@ void recalc_intercepts(struct vcpu_svm *svm) if (!intercept_smi) vmcb_clr_intercept(c, INTERCEPT_SMI);
- vmcb_set_intercept(c, INTERCEPT_VMLOAD); - vmcb_set_intercept(c, INTERCEPT_VMSAVE); + if (nested_vmcb_needs_vls_intercept(svm)) { + /* + * If the virtual VMLOAD/VMSAVE is not enabled for the L2, + * we must intercept these instructions to correctly + * emulate them in case L1 doesn't intercept them. + */ + vmcb_set_intercept(c, INTERCEPT_VMLOAD); + vmcb_set_intercept(c, INTERCEPT_VMSAVE); + } else { + WARN_ON(!(c->virt_ext & VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK)); + } }
static void copy_vmcb_control_area(struct vmcb_control_area *dst, @@ -402,10 +425,7 @@ static void nested_save_pending_event_to_vmcb12(struct vcpu_svm *svm, vmcb12->control.exit_int_info = exit_int_info; }
-static inline bool nested_npt_enabled(struct vcpu_svm *svm) -{ - return svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE; -} +
static void nested_svm_transition_tlb_flush(struct kvm_vcpu *vcpu) { @@ -594,6 +614,9 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) svm->vmcb->control.virt_ext |= (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK);
+ if (!nested_vmcb_needs_vls_intercept(svm)) + svm->vmcb->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK; + nested_svm_transition_tlb_flush(vcpu);
/* Enter Guest-Mode */ diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 0ae8fa9400902..77fd0922c4060 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1023,6 +1023,9 @@ static __init void svm_set_cpu_caps(void) if (lbrv) kvm_cpu_cap_set(X86_FEATURE_LBRV);
+ if (vls) + kvm_cpu_cap_set(X86_FEATURE_V_VMSAVE_VMLOAD); + /* Nested VM can receive #VMEXIT instead of triggering #GP */ kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK); } @@ -1274,6 +1277,8 @@ static inline void init_vmcb_after_set_cpuid(struct kvm_vcpu *vcpu)
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_SYSENTER_EIP, 0, 0); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_SYSENTER_ESP, 0, 0); + + svm->v_vmload_vmsave_enabled = false; } else { /* * If hardware supports Virtual VMLOAD VMSAVE then enable it @@ -4156,6 +4161,8 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) svm->tsc_scaling_enabled = tsc_scaling && guest_cpuid_has(vcpu, X86_FEATURE_TSCRATEMSR); svm->lbrv_enabled = lbrv && guest_cpuid_has(vcpu, X86_FEATURE_LBRV);
+ svm->v_vmload_vmsave_enabled = vls && guest_cpuid_has(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD); + svm_recalc_instruction_intercepts(vcpu, svm);
/* For sev guests, the memory encryption bit is not reserved in CR3. */ diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index c3d46fdf4b9ad..ec8dd09e41e69 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -161,10 +161,11 @@ struct vcpu_svm { unsigned int3_injected; unsigned long int3_rip;
- /* cached guest cpuid flags for faster access */ + /* optional nested SVM features that are enabled for this guest */ bool nrips_enabled : 1; bool tsc_scaling_enabled : 1; bool lbrv_enabled : 1; + bool v_vmload_vmsave_enabled : 1;
u32 ldr_reg; u32 dfr_reg; @@ -412,6 +413,11 @@ static inline bool gif_set(struct vcpu_svm *svm) return !!(svm->vcpu.arch.hflags & HF_GIF_MASK); }
+static inline bool nested_npt_enabled(struct vcpu_svm *svm) +{ + return svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE; +} + /* svm.c */ #define MSR_INVALID 0xffffffffU
Allow L1 to use these settings if L0 disables PAUSE interception (AKA cpu_pm=on)
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- arch/x86/kvm/svm/nested.c | 6 ++++++ arch/x86/kvm/svm/svm.c | 17 +++++++++++++++++ arch/x86/kvm/svm/svm.h | 2 ++ 3 files changed, 25 insertions(+)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 2dc97cca68f7c..15be37368380d 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -617,6 +617,12 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) if (!nested_vmcb_needs_vls_intercept(svm)) svm->vmcb->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK;
+ if (svm->pause_filter_enabled) + svm->vmcb->control.pause_filter_count = svm->nested.ctl.pause_filter_count; + + if (svm->pause_threshold_enabled) + svm->vmcb->control.pause_filter_thresh = svm->nested.ctl.pause_filter_thresh; + nested_svm_transition_tlb_flush(vcpu);
/* Enter Guest-Mode */ diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 77fd0922c4060..ff1447a3551fc 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1026,6 +1026,12 @@ static __init void svm_set_cpu_caps(void) if (vls) kvm_cpu_cap_set(X86_FEATURE_V_VMSAVE_VMLOAD);
+ if (pause_filter_count) + kvm_cpu_cap_set(X86_FEATURE_PAUSEFILTER); + + if (pause_filter_thresh) + kvm_cpu_cap_set(X86_FEATURE_PFTHRESHOLD); + /* Nested VM can receive #VMEXIT instead of triggering #GP */ kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK); } @@ -4163,6 +4169,17 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
svm->v_vmload_vmsave_enabled = vls && guest_cpuid_has(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD);
+ if (kvm_pause_in_guest(vcpu->kvm)) { + svm->pause_filter_enabled = pause_filter_count > 0 && + guest_cpuid_has(vcpu, X86_FEATURE_PAUSEFILTER); + + svm->pause_threshold_enabled = pause_filter_thresh > 0 && + guest_cpuid_has(vcpu, X86_FEATURE_PFTHRESHOLD); + } else { + svm->pause_filter_enabled = false; + svm->pause_threshold_enabled = false; + } + svm_recalc_instruction_intercepts(vcpu, svm);
/* For sev guests, the memory encryption bit is not reserved in CR3. */ diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index ec8dd09e41e69..75781d66cbd60 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -166,6 +166,8 @@ struct vcpu_svm { bool tsc_scaling_enabled : 1; bool lbrv_enabled : 1; bool v_vmload_vmsave_enabled : 1; + bool pause_filter_enabled : 1; + bool pause_threshold_enabled : 1;
u32 ldr_reg; u32 dfr_reg;
In case L1 enables vGIF for L2, the L2 cannot affect L1's GIF, regardless of STGI/CLGI intercepts, and since VM entry enables GIF, this means that L1's GIF is always 1 while L2 is running.
Thus in this case leave L1's vGIF in vmcb01, while letting L2 control the vGIF thus implementing nested vGIF.
Also allow KVM to toggle L1's GIF during nested entry/exit by always using vmcb01.
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com --- arch/x86/kvm/svm/nested.c | 17 +++++++++++++---- arch/x86/kvm/svm/svm.c | 5 +++++ arch/x86/kvm/svm/svm.h | 25 +++++++++++++++++++++---- 3 files changed, 39 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 15be37368380d..4e4e3aea519be 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -384,6 +384,10 @@ void nested_sync_control_from_vmcb02(struct vcpu_svm *svm) */ mask &= ~V_IRQ_MASK; } + + if (nested_vgif_enabled(svm)) + mask |= V_GIF_MASK; + svm->nested.ctl.int_ctl &= ~mask; svm->nested.ctl.int_ctl |= svm->vmcb->control.int_ctl & mask; } @@ -555,10 +559,8 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) { - const u32 int_ctl_vmcb01_bits = - V_INTR_MASKING_MASK | V_GIF_MASK | V_GIF_ENABLE_MASK; - - const u32 int_ctl_vmcb12_bits = V_TPR_MASK | V_IRQ_INJECTION_BITS_MASK; + u32 int_ctl_vmcb01_bits = V_INTR_MASKING_MASK; + u32 int_ctl_vmcb12_bits = V_TPR_MASK | V_IRQ_INJECTION_BITS_MASK;
struct kvm_vcpu *vcpu = &svm->vcpu;
@@ -573,6 +575,13 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) */ WARN_ON(kvm_apicv_activated(svm->vcpu.kvm));
+ + + if (svm->vgif_enabled && (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK)) + int_ctl_vmcb12_bits |= (V_GIF_MASK | V_GIF_ENABLE_MASK); + else + int_ctl_vmcb01_bits |= (V_GIF_MASK | V_GIF_ENABLE_MASK); + /* Copied from vmcb01. msrpm_base can be overwritten later. */ svm->vmcb->control.nested_ctl = svm->vmcb01.ptr->control.nested_ctl; svm->vmcb->control.iopm_base_pa = svm->vmcb01.ptr->control.iopm_base_pa; diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index ff1447a3551fc..0461a3430d529 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1032,6 +1032,9 @@ static __init void svm_set_cpu_caps(void) if (pause_filter_thresh) kvm_cpu_cap_set(X86_FEATURE_PFTHRESHOLD);
+ if (vgif) + kvm_cpu_cap_set(X86_FEATURE_VGIF); + /* Nested VM can receive #VMEXIT instead of triggering #GP */ kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK); } @@ -4180,6 +4183,8 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) svm->pause_threshold_enabled = false; }
+ svm->vgif_enabled = vgif && guest_cpuid_has(vcpu, X86_FEATURE_VGIF); + svm_recalc_instruction_intercepts(vcpu, svm);
/* For sev guests, the memory encryption bit is not reserved in CR3. */ diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 75781d66cbd60..06e5c43ce18a8 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -168,6 +168,7 @@ struct vcpu_svm { bool v_vmload_vmsave_enabled : 1; bool pause_filter_enabled : 1; bool pause_threshold_enabled : 1; + bool vgif_enabled : 1;
u32 ldr_reg; u32 dfr_reg; @@ -386,31 +387,47 @@ static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit) return vmcb_is_intercept(&svm->vmcb->control, bit); }
+static bool nested_vgif_enabled(struct vcpu_svm *svm) +{ + if (!is_guest_mode(&svm->vcpu) || !svm->vgif_enabled) + return false; + return svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK; +} + static inline bool vgif_enabled(struct vcpu_svm *svm) { - return !!(svm->vmcb->control.int_ctl & V_GIF_ENABLE_MASK); + struct vmcb *vmcb = nested_vgif_enabled(svm) ? svm->vmcb01.ptr : svm->vmcb; + + return !!(vmcb->control.int_ctl & V_GIF_ENABLE_MASK); }
static inline void enable_gif(struct vcpu_svm *svm) { + struct vmcb *vmcb = nested_vgif_enabled(svm) ? svm->vmcb01.ptr : svm->vmcb; + if (vgif_enabled(svm)) - svm->vmcb->control.int_ctl |= V_GIF_MASK; + vmcb->control.int_ctl |= V_GIF_MASK; else svm->vcpu.arch.hflags |= HF_GIF_MASK; }
static inline void disable_gif(struct vcpu_svm *svm) { + struct vmcb *vmcb = nested_vgif_enabled(svm) ? svm->vmcb01.ptr : svm->vmcb; + if (vgif_enabled(svm)) - svm->vmcb->control.int_ctl &= ~V_GIF_MASK; + vmcb->control.int_ctl &= ~V_GIF_MASK; else svm->vcpu.arch.hflags &= ~HF_GIF_MASK; + }
static inline bool gif_set(struct vcpu_svm *svm) { + struct vmcb *vmcb = nested_vgif_enabled(svm) ? svm->vmcb01.ptr : svm->vmcb; + if (vgif_enabled(svm)) - return !!(svm->vmcb->control.int_ctl & V_GIF_MASK); + return !!(vmcb->control.int_ctl & V_GIF_MASK); else return !!(svm->vcpu.arch.hflags & HF_GIF_MASK); }
On Mon, 2021-11-01 at 16:03 +0200, Maxim Levitsky wrote:
This is a resend of a few patches that implement few SVM's optional features for nesting.
I was testing these patches during last few weeks with various nested configurations and I was unable to find any issues.
I also implemented support for nested vGIF in the last patch.
Best regards, Maxim Levitsky
Maxim Levitsky (6): KVM: x86: SVM: add module param to control LBR virtualization KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running KVM: x86: nSVM: implement nested LBR virtualization KVM: x86: nSVM: implement nested VMLOAD/VMSAVE KVM: x86: nSVM: support PAUSE filter threshold and count when cpu_pm=on KVM: x86: SVM: implement nested vGIF
arch/x86/kvm/svm/nested.c | 86 ++++++++++++++++++++--- arch/x86/kvm/svm/svm.c | 140 ++++++++++++++++++++++++++++++++------ arch/x86/kvm/svm/svm.h | 38 +++++++++-- 3 files changed, 228 insertions(+), 36 deletions(-)
-- 2.26.3
Kind ping on these patches.
Best regards, Maxim Levitsky
On Tue, 2021-11-16 at 23:38 +0200, Maxim Levitsky wrote:
On Mon, 2021-11-01 at 16:03 +0200, Maxim Levitsky wrote:
This is a resend of a few patches that implement few SVM's optional features for nesting.
I was testing these patches during last few weeks with various nested configurations and I was unable to find any issues.
I also implemented support for nested vGIF in the last patch.
Best regards, Maxim Levitsky
Maxim Levitsky (6): KVM: x86: SVM: add module param to control LBR virtualization KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running KVM: x86: nSVM: implement nested LBR virtualization KVM: x86: nSVM: implement nested VMLOAD/VMSAVE KVM: x86: nSVM: support PAUSE filter threshold and count when cpu_pm=on KVM: x86: SVM: implement nested vGIF
arch/x86/kvm/svm/nested.c | 86 ++++++++++++++++++++--- arch/x86/kvm/svm/svm.c | 140 ++++++++++++++++++++++++++++++++------ arch/x86/kvm/svm/svm.h | 38 +++++++++-- 3 files changed, 228 insertions(+), 36 deletions(-)
-- 2.26.3
Kind ping on these patches.
Another kind ping on these patches.
Best regards, Maxim Levitsky
Best regards, Maxim Levitsky
linux-kselftest-mirror@lists.linaro.org