Bug-report: https://lore.kernel.org/all/915c0e00-b92d-4e37-9d4b-0f6a4580da97@oracle.com/
Summary: When commit 7c62c442b6eb ("x86/vmscape: Enumerate VMSCAPE bug") was backported to 6.12.y, VULNBL_AMD(0x1a, SRSO | VMSCAPE) was added even though 6.12.y doesn't have commit 877818802c3e ("x86/bugs: Add SRSO_USER_KERNEL_NO support").
Boris Ostrovsky suggested backporting three commits to 6.12.y:

1. 877818802c3e ("x86/bugs: Add SRSO_USER_KERNEL_NO support")
2. 8442df2b49ed ("x86/bugs: KVM: Add support for SRSO_MSR_FIX")
3. e3417ab75ab2 ("KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0 <=> 1 VM count transitions") -- a fix for 2, maybe optional
These three patches change the current mitigation status on Turin for 6.12.48 from Safe RET to Reduced Speculation; leaving it at Safe RET likely causes heavy performance regressions.
Tested on Turin: [ 3.188134] Speculative Return Stack Overflow: Mitigation: Reduced Speculation
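For completeness, the same status can be confirmed at runtime from the standard sysfs vulnerabilities interface; a minimal userspace sketch (not part of this series):

/*
 * Sketch (assumption: the standard sysfs vulnerabilities interface is
 * available): confirm the reported SRSO mitigation without grepping dmesg.
 */
#include <stdio.h>

int main(void)
{
	char buf[128];
	FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow", "r");

	if (!f) {
		perror("fopen");
		return 1;
	}
	if (fgets(buf, sizeof(buf), f))
		fputs(buf, stdout);	/* e.g. "Mitigation: Reduced Speculation" */
	fclose(f);
	return 0;
}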
Backports:
1. Patch 1 had a minor conflict because the VMSCAPE backport already added VULNBL_AMD(0x1a, SRSO | VMSCAPE); the resolution is to skip that line.
2. Patches 2 and 3 are clean cherry-picks; 3 is a fix for 2.
Note: I checked whether other stable trees (6.6 down to 5.10) have this problem; they don't have this backport issue.
Thanks, Harshit
Borislav Petkov (1):
  x86/bugs: KVM: Add support for SRSO_MSR_FIX

Borislav Petkov (AMD) (1):
  x86/bugs: Add SRSO_USER_KERNEL_NO support

Sean Christopherson (1):
  KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0 <=> 1 VM count transitions
 Documentation/admin-guide/hw-vuln/srso.rst | 13 +++++
 arch/x86/include/asm/cpufeatures.h         |  5 ++
 arch/x86/include/asm/msr-index.h           |  1 +
 arch/x86/kernel/cpu/bugs.c                 | 28 ++++++++--
 arch/x86/kvm/svm/svm.c                     | 65 ++++++++++++++++++++++
 arch/x86/kvm/svm/svm.h                     |  2 +
 arch/x86/lib/msr.c                         |  2 +
 7 files changed, 112 insertions(+), 4 deletions(-)
From: "Borislav Petkov (AMD)" bp@alien8.de
[ Upstream commit 877818802c3e970f67ccb53012facc78bef5f97a ]
If the machine has:
CPUID Fn8000_0021_EAX[30] (SRSO_USER_KERNEL_NO) -- If this bit is 1, it indicates the CPU is not subject to the SRSO vulnerability across user/kernel boundaries.
have it fall back to IBPB on VMEXIT only, in the case it is going to run VMs:
Speculative Return Stack Overflow: Mitigation: IBPB on VMEXIT only
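(Not part of the patch, just for reference: the enumeration can be checked from userspace with a small sketch based on GCC's <cpuid.h>, using the leaf and bit position quoted above.)

/*
 * Userspace sketch (assumes GCC/clang <cpuid.h>): report the
 * SRSO_USER_KERNEL_NO bit, CPUID Fn8000_0021_EAX[30] per the text above.
 */
#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

	if (!__get_cpuid_count(0x80000021, 0, &eax, &ebx, &ecx, &edx)) {
		puts("CPUID leaf 0x80000021 not available");
		return 1;
	}
	printf("SRSO_USER_KERNEL_NO: %u\n", (eax >> 30) & 1);
	return 0;
}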
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/20241202120416.6054-2-bp@kernel.org
(cherry picked from commit 877818802c3e970f67ccb53012facc78bef5f97a)
[Harshit: Conflicts resolved as commit 7c62c442b6eb ("x86/vmscape: Enumerate
 VMSCAPE bug") has already been applied to 6.12.y]
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/bugs.c         | 4 ++++
 2 files changed, 5 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 90f1f2f9d314..3fc47f25cafc 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -464,6 +464,7 @@
 #define X86_FEATURE_SBPB		(20*32+27) /* Selective Branch Prediction Barrier */
 #define X86_FEATURE_IBPB_BRTYPE		(20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
 #define X86_FEATURE_SRSO_NO		(20*32+29) /* CPU is not affected by SRSO */
+#define X86_FEATURE_SRSO_USER_KERNEL_NO	(20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
 
 /*
  * Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 06bbc297c26c..c3ea29efe26f 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2810,6 +2810,9 @@ static void __init srso_select_mitigation(void)
 		break;
 
 	case SRSO_CMD_SAFE_RET:
+		if (boot_cpu_has(X86_FEATURE_SRSO_USER_KERNEL_NO))
+			goto ibpb_on_vmexit;
+
 		if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
 			/*
 			 * Enable the return thunk for generated code
@@ -2861,6 +2864,7 @@ static void __init srso_select_mitigation(void)
 		}
 		break;
 
+ibpb_on_vmexit:
 	case SRSO_CMD_IBPB_ON_VMEXIT:
 		if (IS_ENABLED(CONFIG_MITIGATION_IBPB_ENTRY)) {
 			if (has_microcode) {
From: Borislav Petkov <bp@alien8.de>
[ Upstream commit 8442df2b49ed9bcd67833ad4f091d15ac91efd00 ]
Add support for
CPUID Fn8000_0021_EAX[31] (SRSO_MSR_FIX). If this bit is 1, it indicates that software may use MSR BP_CFG[BpSpecReduce] to mitigate SRSO.
Enable BpSpecReduce to mitigate SRSO across guest/host boundaries.
Switch back to enabling the bit when virtualization is enabled and to clear the bit when virtualization is disabled because using a MSR slot would clear the bit when the guest is exited and any training the guest has done, would potentially influence the host kernel when execution enters the kernel and hasn't VMRUN the guest yet.
More detail on the public thread in Link below.
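(For reference only, not part of the patch: on an affected host the programmed bit can be read back through the msr driver; this sketch assumes the msr module is loaded and root privileges.)

/*
 * Sketch (assumptions: msr module loaded, run as root): read MSR_ZEN4_BP_CFG
 * (0xc001102e) on CPU 0 and report the BpSpecReduce bit (bit 4).
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	uint64_t val;
	int fd = open("/dev/cpu/0/msr", O_RDONLY);

	if (fd < 0) {
		perror("open /dev/cpu/0/msr");
		return 1;
	}
	if (pread(fd, &val, sizeof(val), 0xc001102e) != (ssize_t)sizeof(val)) {
		perror("rdmsr BP_CFG");
		close(fd);
		return 1;
	}
	printf("BP_CFG = 0x%016llx, BpSpecReduce = %llu\n",
	       (unsigned long long)val, (unsigned long long)((val >> 4) & 1));
	close(fd);
	return 0;
}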
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20241202120416.6054-1-bp@kernel.org
(cherry picked from commit 8442df2b49ed9bcd67833ad4f091d15ac91efd00)
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
---
 Documentation/admin-guide/hw-vuln/srso.rst | 13 ++++++++++++
 arch/x86/include/asm/cpufeatures.h         |  4 ++++
 arch/x86/include/asm/msr-index.h           |  1 +
 arch/x86/kernel/cpu/bugs.c                 | 24 ++++++++++++++++++----
 arch/x86/kvm/svm/svm.c                     |  6 ++++++
 arch/x86/lib/msr.c                         |  2 ++
 6 files changed, 46 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
index 2ad1c05b8c88..66af95251a3d 100644
--- a/Documentation/admin-guide/hw-vuln/srso.rst
+++ b/Documentation/admin-guide/hw-vuln/srso.rst
@@ -104,7 +104,20 @@ The possible values in this file are:
 
    (spec_rstack_overflow=ibpb-vmexit)
 
+ * 'Mitigation: Reduced Speculation':
+   This mitigation gets automatically enabled when the above one "IBPB on
+   VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
+
+   It gets automatically enabled on machines which have the
+   SRSO_USER_KERNEL_NO=1 CPUID bit. In that case, the code logic is to switch
+   to the above =ibpb-vmexit mitigation because the user/kernel boundary is
+   not affected anymore and thus "safe RET" is not needed.
+
+   After enabling the IBPB on VMEXIT mitigation option, the BpSpecReduce bit
+   is detected (functionality present on all such machines) and that
+   practically overrides IBPB on VMEXIT as it has a lot less performance
+   impact and takes care of the guest->host attack vector too.
 
 In order to exploit vulnerability, an attacker needs to:
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 3fc47f25cafc..16a8c1f3ff65 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -465,6 +465,10 @@
 #define X86_FEATURE_IBPB_BRTYPE		(20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
 #define X86_FEATURE_SRSO_NO		(20*32+29) /* CPU is not affected by SRSO */
 #define X86_FEATURE_SRSO_USER_KERNEL_NO	(20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
+#define X86_FEATURE_SRSO_BP_SPEC_REDUCE	(20*32+31) /*
+						    * BP_CFG[BpSpecReduce] can be used to mitigate SRSO for VMs.
+						    * (SRSO_MSR_FIX in the official doc).
+						    */
 
 /*
  * Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 2b6e3127ef4e..21d07aa9400c 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -728,6 +728,7 @@
 
 /* Zen4 */
 #define MSR_ZEN4_BP_CFG				0xc001102e
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT	4
 #define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT	5
 
 /* Fam 19h MSRs */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index c3ea29efe26f..f3cb559a598d 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2718,6 +2718,7 @@ enum srso_mitigation {
 	SRSO_MITIGATION_SAFE_RET,
 	SRSO_MITIGATION_IBPB,
 	SRSO_MITIGATION_IBPB_ON_VMEXIT,
+	SRSO_MITIGATION_BP_SPEC_REDUCE,
 };
 
 enum srso_mitigation_cmd {
@@ -2735,7 +2736,8 @@ static const char * const srso_strings[] = {
 	[SRSO_MITIGATION_MICROCODE]		= "Vulnerable: Microcode, no safe RET",
 	[SRSO_MITIGATION_SAFE_RET]		= "Mitigation: Safe RET",
 	[SRSO_MITIGATION_IBPB]			= "Mitigation: IBPB",
-	[SRSO_MITIGATION_IBPB_ON_VMEXIT]	= "Mitigation: IBPB on VMEXIT only"
+	[SRSO_MITIGATION_IBPB_ON_VMEXIT]	= "Mitigation: IBPB on VMEXIT only",
+	[SRSO_MITIGATION_BP_SPEC_REDUCE]	= "Mitigation: Reduced Speculation"
 };
 
 static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
@@ -2774,7 +2776,7 @@ static void __init srso_select_mitigation(void)
 	    srso_cmd == SRSO_CMD_OFF) {
 		if (boot_cpu_has(X86_FEATURE_SBPB))
 			x86_pred_cmd = PRED_CMD_SBPB;
-		return;
+		goto out;
 	}
 
 	if (has_microcode) {
@@ -2786,7 +2788,7 @@ static void __init srso_select_mitigation(void)
 		 */
 		if (boot_cpu_data.x86 < 0x19 && !cpu_smt_possible()) {
 			setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
-			return;
+			goto out;
 		}
 
 		if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
@@ -2866,6 +2868,12 @@ static void __init srso_select_mitigation(void)
 
 ibpb_on_vmexit:
 	case SRSO_CMD_IBPB_ON_VMEXIT:
+		if (boot_cpu_has(X86_FEATURE_SRSO_BP_SPEC_REDUCE)) {
+			pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
+			srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
+			break;
+		}
+
 		if (IS_ENABLED(CONFIG_MITIGATION_IBPB_ENTRY)) {
 			if (has_microcode) {
 				setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
@@ -2887,7 +2895,15 @@ static void __init srso_select_mitigation(void)
 	}
 
 out:
-	pr_info("%s\n", srso_strings[srso_mitigation]);
+	/*
+	 * Clear the feature flag if this mitigation is not selected as that
+	 * feature flag controls the BpSpecReduce MSR bit toggling in KVM.
+	 */
+	if (srso_mitigation != SRSO_MITIGATION_BP_SPEC_REDUCE)
+		setup_clear_cpu_cap(X86_FEATURE_SRSO_BP_SPEC_REDUCE);
+
+	if (srso_mitigation != SRSO_MITIGATION_NONE)
+		pr_info("%s\n", srso_strings[srso_mitigation]);
 }
 
 #undef pr_fmt
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 800f781475c0..2984851ba890 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -608,6 +608,9 @@ static void svm_disable_virtualization_cpu(void)
 	kvm_cpu_svm_disable();
 
 	amd_pmu_disable_virt();
+
+	if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+		msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
 }
 
 static int svm_enable_virtualization_cpu(void)
@@ -685,6 +688,9 @@ static int svm_enable_virtualization_cpu(void)
 		rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
 	}
 
+	if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+		msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+
 	return 0;
 }
diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c
index 4bf4fad5b148..5a18ecc04a6c 100644
--- a/arch/x86/lib/msr.c
+++ b/arch/x86/lib/msr.c
@@ -103,6 +103,7 @@ int msr_set_bit(u32 msr, u8 bit)
 {
 	return __flip_bit(msr, bit, true);
 }
+EXPORT_SYMBOL_GPL(msr_set_bit);
 
 /**
  * msr_clear_bit - Clear @bit in a MSR @msr.
@@ -118,6 +119,7 @@ int msr_clear_bit(u32 msr, u8 bit)
 {
 	return __flip_bit(msr, bit, false);
 }
+EXPORT_SYMBOL_GPL(msr_clear_bit);
 
 #ifdef CONFIG_TRACEPOINTS
 void do_trace_write_msr(unsigned int msr, u64 val, int failed)
On Fri, Sep 19, 2025 at 10:32:59AM -0700, Harshit Mogalapalli wrote:
> From: Borislav Petkov <bp@alien8.de>
>
> [ Upstream commit 8442df2b49ed9bcd67833ad4f091d15ac91efd00 ]
>
> Add support for
>
>   CPUID Fn8000_0021_EAX[31] (SRSO_MSR_FIX). If this bit is 1, it indicates
>   that software may use MSR BP_CFG[BpSpecReduce] to mitigate SRSO.
>
> Enable BpSpecReduce to mitigate SRSO across guest/host boundaries.
>
> Switch back to enabling the bit when virtualization is enabled and to clear
> the bit when virtualization is disabled because using a MSR slot would clear
> the bit when the guest is exited and any training the guest has done, would
> potentially influence the host kernel when execution enters the kernel and
> hasn't VMRUN the guest yet.
>
> More detail on the public thread in Link below.
>
> Co-developed-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> Link: https://lore.kernel.org/r/20241202120416.6054-1-bp@kernel.org
> (cherry picked from commit 8442df2b49ed9bcd67833ad4f091d15ac91efd00)
This and the next patch don't need those "cherry picked from" lines - that's what the "Upstream commit ..." tag is for.
But Greg will zap that when applying.
Other than that, LGTM.
Thx for doing that.
From: Sean Christopherson <seanjc@google.com>
[ Upstream commit e3417ab75ab2e7dca6372a1bfa26b1be3ac5889e ]
Set the magic BP_SPEC_REDUCE bit to mitigate SRSO when running VMs if and only if KVM has at least one active VM. Leaving the bit set at all times unfortunately degrades performance by a wee bit more than expected.
Use a dedicated spinlock and counter instead of hooking virtualization enablement, as changing the behavior of kvm.enable_virt_at_load based on SRSO_BP_SPEC_REDUCE is painful, and has its own drawbacks, e.g. could result in performance issues for flows that are sensitive to VM creation latency.
Defer setting BP_SPEC_REDUCE until VMRUN is imminent to avoid impacting performance on CPUs that aren't running VMs, e.g. if a setup is using housekeeping CPUs. Setting BP_SPEC_REDUCE in task context, i.e. without blasting IPIs to all CPUs, also helps avoid serializing 1<=>N transitions without incurring a gross amount of complexity (see the Link for details on how ugly coordinating via IPIs gets).
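(Illustration only, not the kernel code: the scheme above, lock-free in steady state and taking the lock only on the 0 <=> 1 transitions, can be seen in this small userspace analogue.)

/*
 * Userspace analogue of the 0 <=> 1 VM-count scheme: the lock is taken only
 * on the first and last transitions, so steady-state create/destroy stays
 * lock-free.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static pthread_mutex_t srso_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int nr_vms;

static void vm_init(void)
{
	int cur = atomic_load(&nr_vms);

	/* Fast path: equivalent of atomic_inc_not_zero(). */
	while (cur && !atomic_compare_exchange_weak(&nr_vms, &cur, cur + 1))
		;
	if (cur)
		return;

	/* 0 => 1: serialize against a concurrent last-VM teardown. */
	pthread_mutex_lock(&srso_lock);
	atomic_fetch_add(&nr_vms, 1);
	pthread_mutex_unlock(&srso_lock);
}

static void vm_destroy(void)
{
	if (atomic_fetch_sub(&nr_vms, 1) != 1)
		return;

	pthread_mutex_lock(&srso_lock);
	/* Re-check: a new VM may have raced in before we took the lock. */
	if (atomic_load(&nr_vms) == 0)
		puts("last VM gone: clear BP_SPEC_REDUCE on all CPUs");
	pthread_mutex_unlock(&srso_lock);
}

int main(void)
{
	vm_init();
	vm_destroy();
	return 0;
}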
Link: https://lore.kernel.org/all/aBOnzNCngyS_pQIW@google.com
Fixes: 8442df2b49ed ("x86/bugs: KVM: Add support for SRSO_MSR_FIX")
Reported-by: Michael Larabel <Michael@michaellarabel.com>
Closes: https://www.phoronix.com/review/linux-615-amd-regression
Cc: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20250505180300.973137-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
(cherry picked from commit e3417ab75ab2e7dca6372a1bfa26b1be3ac5889e)
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
---
 arch/x86/kvm/svm/svm.c | 71 ++++++++++++++++++++++++++++++++++++++----
 arch/x86/kvm/svm/svm.h |  2 ++
 2 files changed, 67 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 2984851ba890..ed609e4f0d06 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -608,9 +608,6 @@ static void svm_disable_virtualization_cpu(void)
 	kvm_cpu_svm_disable();
 
 	amd_pmu_disable_virt();
-
-	if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
-		msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
 }
 
 static int svm_enable_virtualization_cpu(void)
@@ -688,9 +685,6 @@ static int svm_enable_virtualization_cpu(void)
 		rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
 	}
 
-	if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
-		msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
-
 	return 0;
 }
@@ -1513,6 +1507,63 @@ static void svm_vcpu_free(struct kvm_vcpu *vcpu)
 	__free_pages(virt_to_page(svm->msrpm), get_order(MSRPM_SIZE));
 }
 
+#ifdef CONFIG_CPU_MITIGATIONS
+static DEFINE_SPINLOCK(srso_lock);
+static atomic_t srso_nr_vms;
+
+static void svm_srso_clear_bp_spec_reduce(void *ign)
+{
+	struct svm_cpu_data *sd = this_cpu_ptr(&svm_data);
+
+	if (!sd->bp_spec_reduce_set)
+		return;
+
+	msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+	sd->bp_spec_reduce_set = false;
+}
+
+static void svm_srso_vm_destroy(void)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+		return;
+
+	if (atomic_dec_return(&srso_nr_vms))
+		return;
+
+	guard(spinlock)(&srso_lock);
+
+	/*
+	 * Verify a new VM didn't come along, acquire the lock, and increment
+	 * the count before this task acquired the lock.
+	 */
+	if (atomic_read(&srso_nr_vms))
+		return;
+
+	on_each_cpu(svm_srso_clear_bp_spec_reduce, NULL, 1);
+}
+
+static void svm_srso_vm_init(void)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE))
+		return;
+
+	/*
+	 * Acquire the lock on 0 => 1 transitions to ensure a potential 1 => 0
+	 * transition, i.e. destroying the last VM, is fully complete, e.g. so
+	 * that a delayed IPI doesn't clear BP_SPEC_REDUCE after a vCPU runs.
+	 */
+	if (atomic_inc_not_zero(&srso_nr_vms))
+		return;
+
+	guard(spinlock)(&srso_lock);
+
+	atomic_inc(&srso_nr_vms);
+}
+#else
+static void svm_srso_vm_init(void) { }
+static void svm_srso_vm_destroy(void) { }
+#endif
+
 static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -1545,6 +1596,11 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 	    (!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
 		kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
 
+	if (cpu_feature_enabled(X86_FEATURE_SRSO_BP_SPEC_REDUCE) &&
+	    !sd->bp_spec_reduce_set) {
+		sd->bp_spec_reduce_set = true;
+		msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+	}
 	svm->guest_state_loaded = true;
 }
@@ -5011,6 +5067,8 @@ static void svm_vm_destroy(struct kvm *kvm)
 {
 	avic_vm_destroy(kvm);
 	sev_vm_destroy(kvm);
+
+	svm_srso_vm_destroy();
 }
 
 static int svm_vm_init(struct kvm *kvm)
@@ -5036,6 +5094,7 @@ static int svm_vm_init(struct kvm *kvm)
 			return ret;
 	}
 
+	svm_srso_vm_init();
 	return 0;
 }
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index d114efac7af7..1aa9b1e468cb 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -335,6 +335,8 @@ struct svm_cpu_data {
 	u32 next_asid;
 	u32 min_asid;
 
+	bool bp_spec_reduce_set;
+
 	struct vmcb *save_area;
 	unsigned long save_area_pa;