The patch below does not apply to the 6.12-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id, to <stable@vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x fbe5e5f030c22ae717ee422aaab0e00ea84fab5e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to 'stable@vger.kernel.org' --in-reply-to '2025112046-confider-smelting-6296@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fbe5e5f030c22ae717ee422aaab0e00ea84fab5e Mon Sep 17 00:00:00 2001
From: Yosry Ahmed <yosry.ahmed@linux.dev>
Date: Sat, 8 Nov 2025 00:45:20 +0000
Subject: [PATCH] KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()
svm_update_lbrv() is called when MSR_IA32_DEBUGCTLMSR is updated, and on nested transitions where LBRV is used. It checks whether LBRV enablement needs to be changed in the current VMCB, and if it does, it also recalculates the intercepts to the LBR MSRs.
However, there are cases where the intercepts need to be updated even when LBRV enablement doesn't change. Example scenario:
- L1 has MSR_IA32_DEBUGCTLMSR cleared.
- L1 runs L2 without LBR_CTL_ENABLE (i.e. no LBRV).
- L2 sets DEBUGCTLMSR_LBR in MSR_IA32_DEBUGCTLMSR; svm_update_lbrv() sets LBR_CTL_ENABLE in VMCB02 and disables the intercepts to the LBR MSRs.
- L2 exits to L1; svm_update_lbrv() is not called on this transition.
- L1 clears MSR_IA32_DEBUGCTLMSR; svm_update_lbrv() finds that LBR_CTL_ENABLE is already cleared in VMCB01 and does nothing.
- The intercepts remain disabled, so L1's reads of the LBR MSRs return the host's values.
Fix it by always recalculating intercepts in svm_update_lbrv().
Fixes: 1d5a1b5860ed ("KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running")
Cc: stable@vger.kernel.org
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251108004524.1600006-3-yosry.ahmed@linux.dev
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 39538098002b..53201f13a43c 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -806,25 +806,29 @@ void svm_copy_lbrs(struct vmcb *to_vmcb, struct vmcb *from_vmcb)
 	vmcb_mark_dirty(to_vmcb, VMCB_LBR);
 }
 
-void svm_enable_lbrv(struct kvm_vcpu *vcpu)
+static void __svm_enable_lbrv(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
 	svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
-	svm_recalc_lbr_msr_intercepts(vcpu);
 
 	/* Move the LBR msrs to the vmcb02 so that the guest can see them. */
 	if (is_guest_mode(vcpu))
 		svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr);
 }
 
-static void svm_disable_lbrv(struct kvm_vcpu *vcpu)
+void svm_enable_lbrv(struct kvm_vcpu *vcpu)
+{
+	__svm_enable_lbrv(vcpu);
+	svm_recalc_lbr_msr_intercepts(vcpu);
+}
+
+static void __svm_disable_lbrv(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
 	KVM_BUG_ON(sev_es_guest(vcpu->kvm), vcpu->kvm);
 
 	svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
-	svm_recalc_lbr_msr_intercepts(vcpu);
 
 	/*
 	 * Move the LBR msrs back to the vmcb01 to avoid copying them
@@ -853,13 +857,18 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu)
 			   (is_guest_mode(vcpu) && guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) &&
 			    (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
 
-	if (enable_lbrv == current_enable_lbrv)
-		return;
+	if (enable_lbrv && !current_enable_lbrv)
+		__svm_enable_lbrv(vcpu);
+	else if (!enable_lbrv && current_enable_lbrv)
+		__svm_disable_lbrv(vcpu);
 
-	if (enable_lbrv)
-		svm_enable_lbrv(vcpu);
-	else
-		svm_disable_lbrv(vcpu);
+	/*
+	 * During nested transitions, it is possible that the current VMCB has
+	 * LBR_CTL set, but the previous LBR_CTL had it cleared (or vice versa).
+	 * In this case, even though LBR_CTL does not need an update, intercepts
+	 * do, so always recalculate the intercepts here.
+	 */
+	svm_recalc_lbr_msr_intercepts(vcpu);
 }
 
 void disable_nmi_singlestep(struct vcpu_svm *svm)
This is a backport to 6.12.y of LBR virtualization fixes that recently landed in Linus's tree.
Patch 1 is not a backport; it introduces a helper that exists in Linus's tree to make the following backports more straightforward to apply.
Patch 2 should already be in queue-6.12, but it's included here as the remaining patches depend on it.
Yosry Ahmed (5):
  KVM: SVM: Introduce svm_recalc_lbr_msr_intercepts()
  KVM: SVM: Mark VMCB_LBR dirty when MSR_IA32_DEBUGCTLMSR is updated
  KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()
  KVM: nSVM: Fix and simplify LBR virtualization handling with nested
  KVM: SVM: Fix redundant updates of LBR MSR intercepts
 arch/x86/kvm/svm/nested.c | 20 +++----
 arch/x86/kvm/svm/svm.c    | 96 ++++++++++++++++++++-------------------
 arch/x86/kvm/svm/svm.h    |  1 +
 3 files changed, 57 insertions(+), 60 deletions(-)
Introduce a helper that updates the intercepts for the LBR MSRs, similar to the one introduced upstream by commit 160f143cc131 ("KVM: SVM: Manually recalc all MSR intercepts on userspace MSR filter change"). The main difference is that this version uses set_msr_interception(), which has inverted polarity compared to svm_set_intercept_for_msr().
This is intended to simplify incoming backports. No functional changes intended.
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
---
 arch/x86/kvm/svm/svm.c | 32 +++++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index c48a466db0c3..6deccc1b1669 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -995,18 +995,31 @@ void svm_copy_lbrs(struct vmcb *to_vmcb, struct vmcb *from_vmcb)
 	vmcb_mark_dirty(to_vmcb, VMCB_LBR);
 }
 
-void svm_enable_lbrv(struct kvm_vcpu *vcpu)
+static void svm_recalc_lbr_msr_intercepts(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	bool intercept = !(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK);
 
-	svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
-	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHFROMIP, 1, 1);
-	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, 1, 1);
-	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, 1, 1);
-	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 1, 1);
+	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHFROMIP,
+			     !intercept, !intercept);
+	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP,
+			     !intercept, !intercept);
+	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP,
+			     !intercept, !intercept);
+	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP,
+			     !intercept, !intercept);
 
 	if (sev_es_guest(vcpu->kvm))
-		set_msr_interception(vcpu, svm->msrpm, MSR_IA32_DEBUGCTLMSR, 1, 1);
+		set_msr_interception(vcpu, svm->msrpm, MSR_IA32_DEBUGCTLMSR,
+				     !intercept, !intercept);
+}
+
+void svm_enable_lbrv(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
+	svm_recalc_lbr_msr_intercepts(vcpu);
 
 	/* Move the LBR msrs to the vmcb02 so that the guest can see them. */
 	if (is_guest_mode(vcpu))
@@ -1020,10 +1033,7 @@ static void svm_disable_lbrv(struct kvm_vcpu *vcpu)
 	KVM_BUG_ON(sev_es_guest(vcpu->kvm), vcpu->kvm);
 
 	svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
-	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHFROMIP, 0, 0);
-	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, 0, 0);
-	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, 0, 0);
-	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 0, 0);
+	svm_recalc_lbr_msr_intercepts(vcpu);
 
 	/*
 	 * Move the LBR msrs back to the vmcb01 to avoid copying them
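For clarity, here is a minimal user-space C sketch of the polarity difference noted in the commit message; the *_model() functions are illustrative stand-ins for the kernel's set_msr_interception() and svm_set_intercept_for_msr(), not real kernel code:

#include <stdbool.h>
#include <stdio.h>

/* 6.12.y-style API model: nonzero read/write mean "allow direct access". */
static void set_msr_interception_model(unsigned int msr, int read, int write)
{
	printf("MSR %#x: read %s, write %s\n", msr,
	       read ? "passthrough" : "intercepted",
	       write ? "passthrough" : "intercepted");
}

/* Upstream-style API model: true means "intercept". */
static void svm_set_intercept_for_msr_model(unsigned int msr, bool intercept)
{
	/* Same effect, inverted polarity. */
	set_msr_interception_model(msr, !intercept, !intercept);
}

int main(void)
{
	/* LBR_CTL_ENABLE_MASK clear -> the LBR MSRs must be intercepted. */
	bool intercept = true;

	/* The backported helper passes !intercept to the old API... */
	set_msr_interception_model(0x1db /* MSR_IA32_LASTBRANCHFROMIP */,
				   !intercept, !intercept);
	/* ...which is exactly what the upstream API expresses directly. */
	svm_set_intercept_for_msr_model(0x1db, intercept);
	return 0;
}

Both calls print the same result, which is why the backported helper can mechanically substitute for the upstream one.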
The APM lists the DbgCtlMsr field as being tracked by the VMCB_LBR clean bit. Always clear the bit when MSR_IA32_DEBUGCTLMSR is updated.
The history is complicated: the bit was correctly cleared for L1 before commit 1d5a1b5860ed ("KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running"). At that point, svm_set_msr() started relying on svm_update_lbrv() to clear the bit, but when nested virtualization is enabled the latter does not always clear it, even if MSR_IA32_DEBUGCTLMSR changed. Go back to clearing it directly in svm_set_msr().
Fixes: 1d5a1b5860ed ("KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running")
Reported-by: Matteo Rizzo <matteorizzo@google.com>
Reported-by: evn@google.com
Co-developed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251108004524.1600006-2-yosry.ahmed@linux.dev
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit dc55b3c3f61246e483e50c85d8d5366f9567e188)
---
 arch/x86/kvm/svm/svm.c | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 6deccc1b1669..1e1210e5ef45 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3267,7 +3267,11 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		if (data & DEBUGCTL_RESERVED_BITS)
 			return 1;
 
+		if (svm_get_lbr_vmcb(svm)->save.dbgctl == data)
+			break;
+
 		svm_get_lbr_vmcb(svm)->save.dbgctl = data;
+		vmcb_mark_dirty(svm->vmcb, VMCB_LBR);
 		svm_update_lbrv(vcpu);
 		break;
 	case MSR_VM_HSAVE_PA:
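To illustrate why the explicit vmcb_mark_dirty() matters, here is a minimal user-space C model of the clean-bit semantics described above; the struct layout and bit index are simplified assumptions, not the real VMCB format:

#include <stdint.h>
#include <stdio.h>

#define VMCB_LBR_MODEL 10	/* illustrative bit index, not the real one */

struct vmcb_model {
	uint32_t clean;		/* clean bits: 1 = CPU may reuse cached state */
	uint64_t dbgctl;	/* tracked by the VMCB_LBR clean bit per the APM */
};

static void vmcb_mark_dirty_model(struct vmcb_model *vmcb, int bit)
{
	vmcb->clean &= ~(1u << bit);
}

/* What the CPU conceptually does with this field on VMRUN. */
static uint64_t cpu_load_dbgctl(const struct vmcb_model *vmcb, uint64_t cached)
{
	return (vmcb->clean & (1u << VMCB_LBR_MODEL)) ? cached : vmcb->dbgctl;
}

int main(void)
{
	struct vmcb_model vmcb = { .clean = 1u << VMCB_LBR_MODEL, .dbgctl = 0 };
	uint64_t cpu_cache = 0;

	/* Buggy path: update dbgctl but leave the clean bit set. */
	vmcb.dbgctl = 0x1;	/* DEBUGCTLMSR_LBR */
	printf("without mark_dirty: CPU uses %#llx (stale)\n",
	       (unsigned long long)cpu_load_dbgctl(&vmcb, cpu_cache));

	/* Fixed path: clear the clean bit so the CPU reloads the field. */
	vmcb_mark_dirty_model(&vmcb, VMCB_LBR_MODEL);
	printf("with mark_dirty:    CPU uses %#llx (fresh)\n",
	       (unsigned long long)cpu_load_dbgctl(&vmcb, cpu_cache));
	return 0;
}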
svm_update_lbrv() is called when MSR_IA32_DEBUGCTLMSR is updated, and on nested transitions where LBRV is used. It checks whether LBRV enablement needs to be changed in the current VMCB, and if it does, it also recalculates the intercepts to the LBR MSRs.
However, there are cases where the intercepts need to be updated even when LBRV enablement doesn't change. Example scenario:
- L1 has MSR_IA32_DEBUGCTLMSR cleared.
- L1 runs L2 without LBR_CTL_ENABLE (i.e. no LBRV).
- L2 sets DEBUGCTLMSR_LBR in MSR_IA32_DEBUGCTLMSR; svm_update_lbrv() sets LBR_CTL_ENABLE in VMCB02 and disables the intercepts to the LBR MSRs.
- L2 exits to L1; svm_update_lbrv() is not called on this transition.
- L1 clears MSR_IA32_DEBUGCTLMSR; svm_update_lbrv() finds that LBR_CTL_ENABLE is already cleared in VMCB01 and does nothing.
- The intercepts remain disabled, so L1's reads of the LBR MSRs return the host's values.
Fix it by always recalculating intercepts in svm_update_lbrv().
Fixes: 1d5a1b5860ed ("KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running")
Cc: stable@vger.kernel.org
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251108004524.1600006-3-yosry.ahmed@linux.dev
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit fbe5e5f030c22ae717ee422aaab0e00ea84fab5e)
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
---
 arch/x86/kvm/svm/svm.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 1e1210e5ef45..e3e570d11b26 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1014,26 +1014,30 @@ static void svm_recalc_lbr_msr_intercepts(struct kvm_vcpu *vcpu)
 				     !intercept, !intercept);
 }
 
-void svm_enable_lbrv(struct kvm_vcpu *vcpu)
+static void __svm_enable_lbrv(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
 	svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
-	svm_recalc_lbr_msr_intercepts(vcpu);
 
 	/* Move the LBR msrs to the vmcb02 so that the guest can see them. */
 	if (is_guest_mode(vcpu))
 		svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr);
 }
 
-static void svm_disable_lbrv(struct kvm_vcpu *vcpu)
+void svm_enable_lbrv(struct kvm_vcpu *vcpu)
+{
+	__svm_enable_lbrv(vcpu);
+	svm_recalc_lbr_msr_intercepts(vcpu);
+}
+
+static void __svm_disable_lbrv(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
 	KVM_BUG_ON(sev_es_guest(vcpu->kvm), vcpu->kvm);
 
 	svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
-	svm_recalc_lbr_msr_intercepts(vcpu);
 
 	/*
 	 * Move the LBR msrs back to the vmcb01 to avoid copying them
@@ -1062,13 +1066,18 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu)
 			   (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
 			    (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
 
-	if (enable_lbrv == current_enable_lbrv)
-		return;
+	if (enable_lbrv && !current_enable_lbrv)
+		__svm_enable_lbrv(vcpu);
+	else if (!enable_lbrv && current_enable_lbrv)
+		__svm_disable_lbrv(vcpu);
 
-	if (enable_lbrv)
-		svm_enable_lbrv(vcpu);
-	else
-		svm_disable_lbrv(vcpu);
+	/*
+	 * During nested transitions, it is possible that the current VMCB has
+	 * LBR_CTL set, but the previous LBR_CTL had it cleared (or vice versa).
+	 * In this case, even though LBR_CTL does not need an update, intercepts
+	 * do, so always recalculate the intercepts here.
+	 */
+	svm_recalc_lbr_msr_intercepts(vcpu);
 }
 
 void disable_nmi_singlestep(struct vcpu_svm *svm)
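The scenario in the commit message can be reduced to a small user-space C model (all names are illustrative; the shared flag stands in for the per-vCPU MSR bitmap) showing how the old early return leaves the intercepts stale across a VMCB switch, while always recalculating fixes it:

#include <stdbool.h>
#include <stdio.h>

struct vmcb_model { bool lbr_ctl; };

static bool lbr_msrs_intercepted = true;	/* shared MSR bitmap state */

static void update_lbrv(struct vmcb_model *cur, bool want, bool always_recalc)
{
	if (!always_recalc && cur->lbr_ctl == want)
		return;				/* old early return */
	cur->lbr_ctl = want;			/* LBR_CTL tracks 'want' */
	lbr_msrs_intercepted = !want;		/* recalc the intercepts */
}

static void run(bool always_recalc)
{
	struct vmcb_model vmcb01 = { false }, vmcb02 = { false };

	lbr_msrs_intercepted = true;

	update_lbrv(&vmcb02, true, always_recalc);	/* L2 enables LBR */
	/* nested #VMEXIT switches back to vmcb01; no update call here */
	update_lbrv(&vmcb01, false, always_recalc);	/* L1 clears DEBUGCTL */

	printf("%s: intercepts %s\n",
	       always_recalc ? "fixed" : "buggy",
	       lbr_msrs_intercepted ? "enabled (correct)"
				    : "disabled (L1 sees host MSRs)");
}

int main(void)
{
	run(false);	/* early return skips the vmcb01 recalc */
	run(true);	/* unconditional recalc restores the intercepts */
	return 0;
}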
The current scheme for handling LBRV when nested virtualization is used is very complicated, especially when L1 does not enable LBRV (i.e. does not set LBR_CTL_ENABLE_MASK).
To avoid copying LBRs between VMCB01 and VMCB02 on every nested transition, the current implementation switches between using VMCB01 or VMCB02 as the source of truth for the LBRs while L2 is running. If L2 enables LBR, VMCB02 is used as the source of truth. When L2 disables LBR, the LBRs are copied to VMCB01 and VMCB01 is used as the source of truth. This introduces significant complexity and, in some cases, incorrect behavior.
For example, on a nested #VMEXIT, the LBRs are only copied from VMCB02 to VMCB01 if LBRV is enabled in VMCB01. This is because L2's writes to MSR_IA32_DEBUGCTLMSR to enable LBR are intercepted and propagated to VMCB01 instead of VMCB02. However, LBRV is only enabled in VMCB02 when L2 is running.
This means that if L2 enables LBR and exits to L1, the LBRs will not be propagated from VMCB02 to VMCB01, because LBRV is disabled in VMCB01.
There is no meaningful difference in CPUID rate in L2 when copying LBRs on every nested transition vs. the current approach, so do the simple and correct thing and always copy LBRs between VMCB01 and VMCB02 on nested transitions (when LBRV is disabled by L1). Drop the conditional LBRs copying in __svm_{enable/disable}_lbrv() as it is now unnecessary.
VMCB02 becomes the only source of truth for LBRs when L2 is running, regardless of whether L1 enabled LBRV, so drop svm_get_lbr_vmcb() and use svm->vmcb directly in its place.
Fixes: 1d5a1b5860ed ("KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running")
Cc: stable@vger.kernel.org
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251108004524.1600006-4-yosry.ahmed@linux.dev
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 8a4821412cf2c1429fffa07c012dd150f2edf78c)
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
---
 arch/x86/kvm/svm/nested.c | 20 ++++++-----------
 arch/x86/kvm/svm/svm.c    | 47 +++++++++------------------------------
 2 files changed, 17 insertions(+), 50 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 2dcb9c870d5a..ff6379fdf222 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -602,11 +602,10 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12
 		 */
 		svm_copy_lbrs(vmcb02, vmcb12);
 		vmcb02->save.dbgctl &= ~DEBUGCTL_RESERVED_BITS;
-		svm_update_lbrv(&svm->vcpu);
-
-	} else if (unlikely(vmcb01->control.virt_ext & LBR_CTL_ENABLE_MASK)) {
+	} else {
 		svm_copy_lbrs(vmcb02, vmcb01);
 	}
+	svm_update_lbrv(&svm->vcpu);
 }
 
 static inline bool is_evtinj_soft(u32 evtinj)
@@ -731,11 +730,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 		svm->soft_int_next_rip = vmcb12_rip;
 	}
 
-	vmcb02->control.virt_ext = vmcb01->control.virt_ext &
-				   LBR_CTL_ENABLE_MASK;
-	if (guest_can_use(vcpu, X86_FEATURE_LBRV))
-		vmcb02->control.virt_ext |=
-			(svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK);
+	/* LBR_CTL_ENABLE_MASK is controlled by svm_update_lbrv() */
 
 	if (!nested_vmcb_needs_vls_intercept(svm))
 		vmcb02->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK;
@@ -1066,13 +1061,12 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
 
 	if (unlikely(guest_can_use(vcpu, X86_FEATURE_LBRV) &&
-		     (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
+		     (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK)))
 		svm_copy_lbrs(vmcb12, vmcb02);
-		svm_update_lbrv(vcpu);
-	} else if (unlikely(vmcb01->control.virt_ext & LBR_CTL_ENABLE_MASK)) {
+	else
 		svm_copy_lbrs(vmcb01, vmcb02);
-		svm_update_lbrv(vcpu);
-	}
+
+	svm_update_lbrv(vcpu);
 
 	if (vnmi) {
 		if (vmcb02->control.int_ctl & V_NMI_BLOCKING_MASK)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e3e570d11b26..a5e3dc482845 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1016,13 +1016,7 @@ static void svm_recalc_lbr_msr_intercepts(struct kvm_vcpu *vcpu)
 
 static void __svm_enable_lbrv(struct kvm_vcpu *vcpu)
 {
-	struct vcpu_svm *svm = to_svm(vcpu);
-
-	svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
-
-	/* Move the LBR msrs to the vmcb02 so that the guest can see them. */
-	if (is_guest_mode(vcpu))
-		svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr);
+	to_svm(vcpu)->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
 }
 
 void svm_enable_lbrv(struct kvm_vcpu *vcpu)
@@ -1033,36 +1027,15 @@ void svm_enable_lbrv(struct kvm_vcpu *vcpu)
 
 static void __svm_disable_lbrv(struct kvm_vcpu *vcpu)
 {
-	struct vcpu_svm *svm = to_svm(vcpu);
-
 	KVM_BUG_ON(sev_es_guest(vcpu->kvm), vcpu->kvm);
-
-	svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
-
-	/*
-	 * Move the LBR msrs back to the vmcb01 to avoid copying them
-	 * on nested guest entries.
-	 */
-	if (is_guest_mode(vcpu))
-		svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb);
-}
-
-static struct vmcb *svm_get_lbr_vmcb(struct vcpu_svm *svm)
-{
-	/*
-	 * If LBR virtualization is disabled, the LBR MSRs are always kept in
-	 * vmcb01. If LBR virtualization is enabled and L1 is running VMs of
-	 * its own, the MSRs are moved between vmcb01 and vmcb02 as needed.
-	 */
-	return svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK ? svm->vmcb :
-								   svm->vmcb01.ptr;
+	to_svm(vcpu)->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
 }
 
 void svm_update_lbrv(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	bool current_enable_lbrv = svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK;
-	bool enable_lbrv = (svm_get_lbr_vmcb(svm)->save.dbgctl & DEBUGCTLMSR_LBR) ||
+	bool enable_lbrv = (svm->vmcb->save.dbgctl & DEBUGCTLMSR_LBR) ||
 			   (is_guest_mode(vcpu) && guest_can_use(vcpu, X86_FEATURE_LBRV) &&
 			    (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK));
 
@@ -2991,19 +2964,19 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		msr_info->data = svm->tsc_aux;
 		break;
 	case MSR_IA32_DEBUGCTLMSR:
-		msr_info->data = svm_get_lbr_vmcb(svm)->save.dbgctl;
+		msr_info->data = svm->vmcb->save.dbgctl;
 		break;
 	case MSR_IA32_LASTBRANCHFROMIP:
-		msr_info->data = svm_get_lbr_vmcb(svm)->save.br_from;
+		msr_info->data = svm->vmcb->save.br_from;
 		break;
 	case MSR_IA32_LASTBRANCHTOIP:
-		msr_info->data = svm_get_lbr_vmcb(svm)->save.br_to;
+		msr_info->data = svm->vmcb->save.br_to;
 		break;
 	case MSR_IA32_LASTINTFROMIP:
-		msr_info->data = svm_get_lbr_vmcb(svm)->save.last_excp_from;
+		msr_info->data = svm->vmcb->save.last_excp_from;
 		break;
 	case MSR_IA32_LASTINTTOIP:
-		msr_info->data = svm_get_lbr_vmcb(svm)->save.last_excp_to;
+		msr_info->data = svm->vmcb->save.last_excp_to;
 		break;
 	case MSR_VM_HSAVE_PA:
 		msr_info->data = svm->nested.hsave_msr;
@@ -3276,10 +3249,10 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		if (data & DEBUGCTL_RESERVED_BITS)
 			return 1;
 
-		if (svm_get_lbr_vmcb(svm)->save.dbgctl == data)
+		if (svm->vmcb->save.dbgctl == data)
 			break;
 
-		svm_get_lbr_vmcb(svm)->save.dbgctl = data;
+		svm->vmcb->save.dbgctl = data;
 		vmcb_mark_dirty(svm->vmcb, VMCB_LBR);
 		svm_update_lbrv(vcpu);
 		break;
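A minimal user-space sketch of the resulting invariant (illustrative types only, not kernel code): because the LBRs are copied on every nested transition, the current VMCB always holds the latest values, so svm_get_lbr_vmcb()'s guesswork becomes unnecessary:

#include <stdint.h>
#include <stdio.h>

struct vmcb_model { uint64_t br_from; };

static void svm_copy_lbrs_model(struct vmcb_model *to, struct vmcb_model *from)
{
	to->br_from = from->br_from;
}

int main(void)
{
	struct vmcb_model vmcb01 = { 0 }, vmcb02 = { 0 };
	struct vmcb_model *cur = &vmcb01;

	cur->br_from = 0xf00;			/* L1 records a branch */

	svm_copy_lbrs_model(&vmcb02, &vmcb01);	/* nested VMRUN: 01 -> 02 */
	cur = &vmcb02;
	cur->br_from = 0xbaa;			/* L2 records a branch */

	svm_copy_lbrs_model(&vmcb01, &vmcb02);	/* nested #VMEXIT: 02 -> 01 */
	cur = &vmcb01;

	/* No stale value: the current VMCB always has the latest LBRs. */
	printf("br_from after exit: %#llx\n", (unsigned long long)cur->br_from);
	return 0;
}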
Don't update the LBR MSR intercept bitmaps if they're already up-to-date, as unconditionally updating the intercepts forces KVM to recalculate the MSR bitmaps for vmcb02 on every nested VMRUN. The redundant updates are functionally okay; however, they neuter an optimization in Hyper-V nested virtualization enlightenments and this manifests as a self-test failure.
In particular, Hyper-V lets L1 mark "nested enlightenments" as clean, i.e. tell KVM that no changes were made to the MSR bitmap since the last VMRUN. The hyperv_svm_test KVM selftest intentionally changes the MSR bitmap "without telling KVM about it" to verify that KVM honors the clean hint, and the test fails because KVM notices the changed bitmap anyway:
==== Test Assertion Failure ====
  x86/hyperv_svm_test.c:120: vmcb->control.exit_code == 0x081
  pid=193558 tid=193558 errno=4 - Interrupted system call
     1	0x0000000000411361: assert_on_unhandled_exception at processor.c:659
     2	0x0000000000406186: _vcpu_run at kvm_util.c:1699
     3	 (inlined by) vcpu_run at kvm_util.c:1710
     4	0x0000000000401f2a: main at hyperv_svm_test.c:175
     5	0x000000000041d0d3: __libc_start_call_main at libc-start.o:?
     6	0x000000000041f27c: __libc_start_main_impl at ??:?
     7	0x00000000004021a0: _start at ??:?
  vmcb->control.exit_code == SVM_EXIT_VMMCALL
Do *not* fix this by skipping svm_hv_vmcb_dirty_nested_enlightenments() when svm_set_intercept_for_msr() performs a no-op change: changes to the L0 MSR interception bitmap are only triggered by full CPUID updates and MSR filter updates, both of which should be rare. Changing svm_set_intercept_for_msr() would risk hiding unintended pessimizations like this one, and is actually more complex than this change.
Fixes: fbe5e5f030c2 ("KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()")
Cc: stable@vger.kernel.org
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251112013017.1836863-1-yosry.ahmed@linux.dev
[Rewritten commit message based on mailing list discussion. - Paolo]
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3fa05f96fc08dff5e846c2cc283a249c1bf029a1)
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
---
 arch/x86/kvm/svm/svm.c | 6 ++++++
 arch/x86/kvm/svm/svm.h | 1 +
 2 files changed, 7 insertions(+)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a5e3dc482845..71b32e64e801 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1000,6 +1000,9 @@ static void svm_recalc_lbr_msr_intercepts(struct kvm_vcpu *vcpu)
 	struct vcpu_svm *svm = to_svm(vcpu);
 	bool intercept = !(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK);
 
+	if (intercept == svm->lbr_msrs_intercepted)
+		return;
+
 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHFROMIP,
 			     !intercept, !intercept);
 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP,
@@ -1012,6 +1015,8 @@ static void svm_recalc_lbr_msr_intercepts(struct kvm_vcpu *vcpu)
 	if (sev_es_guest(vcpu->kvm))
 		set_msr_interception(vcpu, svm->msrpm, MSR_IA32_DEBUGCTLMSR,
 				     !intercept, !intercept);
+
+	svm->lbr_msrs_intercepted = intercept;
 }
 
 static void __svm_enable_lbrv(struct kvm_vcpu *vcpu)
@@ -1450,6 +1455,7 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
 	}
 
 	svm->x2avic_msrs_intercepted = true;
+	svm->lbr_msrs_intercepted = true;
 
 	svm->vmcb01.ptr = page_address(vmcb01_page);
 	svm->vmcb01.pa = __sme_set(page_to_pfn(vmcb01_page) << PAGE_SHIFT);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 1aa9b1e468cb..cada5db654b8 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -324,6 +324,7 @@ struct vcpu_svm {
 	bool guest_state_loaded;
 
 	bool x2avic_msrs_intercepted;
+	bool lbr_msrs_intercepted;
 
 	/* Guest GIF value, used when vGIF is not enabled */
 	bool guest_gif;
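The pattern added here is plain intercept-state caching. A minimal user-space C sketch (illustrative names; a counter stands in for the actual MSR bitmap writes) of how the cached flag turns repeated recalculations into no-ops:

#include <stdbool.h>
#include <stdio.h>

static bool lbr_msrs_intercepted = true;	/* cached state, init to true */
static int bitmap_writes;			/* counts real bitmap updates */

static void recalc_lbr_msr_intercepts(bool lbr_enabled)
{
	bool intercept = !lbr_enabled;

	if (intercept == lbr_msrs_intercepted)
		return;		/* no-op: leave the MSR bitmap untouched */

	bitmap_writes++;	/* stands in for the set_msr_interception() calls */
	lbr_msrs_intercepted = intercept;
}

int main(void)
{
	/* Called on every nested transition; only real changes cost anything. */
	recalc_lbr_msr_intercepts(true);	/* write #1 */
	recalc_lbr_msr_intercepts(true);	/* skipped */
	recalc_lbr_msr_intercepts(true);	/* skipped */
	recalc_lbr_msr_intercepts(false);	/* write #2 */

	printf("bitmap writes: %d\n", bitmap_writes);	/* prints 2 */
	return 0;
}

Because the no-op calls never touch the bitmap, vmcb02's MSR bitmap is no longer marked dirty on every nested VMRUN, and the Hyper-V clean hint is honored again.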