Currently vmx enables SECONDARY_EXEC_ENCLS_EXITING even when sgx is not set in the host MSR.
When booting a guest, KVM checks that the cpuid bit is actually set in vmx.c, and if not, it does not enable the feature.
However, for nested virtualization this control bit is set blindly and is propagated to VMCS12 and then to VMCS02. Therefore, when L1 tries to boot its guest, the host attempts a VM-entry (VMLAUNCH/VMRESUME) with a VMCS02 that enables a feature the hardware does not support, and the entry fails with VM-instruction error 0x7.
According to section "Secondary Processor-Based VM-Execution Controls" in the Intel SDM, software should *always* check the value in the actual MSR_IA32_VMX_PROCBASED_CTLS2 before enabling this bit.
Not updating enable_sgx is responsible for a second bug: vmx_set_cpu_caps() doesn't clear the SGX bits when hardware support is unavailable. This is a much less problematic bug as it only pops up if SGX is soft-disabled (the case being handled by cpu_has_sgx()) or if SGX is supported for bare metal but not in the VMCS (will never happen when running on bare metal, but can theoretically happen when running in a VM).
Last but not least, KVM should ideally have module params reflect KVM's actual configuration.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2127128
Fixes: 72add915fbd5 ("KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC")
Cc: stable@vger.kernel.org
Suggested-by: Sean Christopherson <seanjc@google.com>
Suggested-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9dba04b6b019..ea0c65d3c08a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8263,6 +8263,11 @@ static __init int hardware_setup(void)
 	if (!cpu_has_virtual_nmis())
 		enable_vnmi = 0;
 
+#ifdef CONFIG_X86_SGX_KVM
+	if (!cpu_has_vmx_encls_vmexit())
+		enable_sgx = false;
+#endif
+
 	/*
 	 * set_apic_access_page_addr() is used to reload apic access
 	 * page upon invalidation.  No need to do anything if not
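The SDM rule cited in the changelog above (consult IA32_VMX_PROCBASED_CTLS2 before setting a secondary control) can be illustrated with a small user-space sketch. This is not KVM code: the helper names ctl2_allowed1() and filter_ctl2() are made up for illustration, and the MSR value is assumed to be passed in as a plain 64-bit integer rather than read with RDMSR. The sketch relies only on the documented layout, where bits 63:32 of the MSR hold the allowed-1 settings and "enable ENCLS exiting" is bit 15 of the secondary controls.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 15 of the secondary processor-based controls: "enable ENCLS exiting". */
#define SECONDARY_EXEC_ENCLS_EXITING (UINT32_C(1) << 15)

/*
 * msr_val is a raw IA32_VMX_PROCBASED_CTLS2 value (assumed to be supplied
 * by the caller). Bits 63:32 are the allowed-1 settings: a control may be
 * set to 1 only if its bit is 1 in that half.
 * Hypothetical helper, not a kernel function.
 */
static inline bool ctl2_allowed1(uint64_t msr_val, uint32_t ctl)
{
	uint32_t allowed1 = (uint32_t)(msr_val >> 32);

	return (allowed1 & ctl) == ctl;
}

/* Drop every requested control that the CPU does not allow to be 1. */
static inline uint32_t filter_ctl2(uint64_t msr_val, uint32_t wanted)
{
	return wanted & (uint32_t)(msr_val >> 32);
}
```

With a helper like this, a caller would only request SECONDARY_EXEC_ENCLS_EXITING when ctl2_allowed1() returns true for it, which is in spirit what the cpu_has_vmx_encls_vmexit() check in the hunk above gates on.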
Shortlog scope is still wrong, should be "KVM: nVMX:"
The shortlog is also somewhat misleading/confusing, as it's not at all obvious that "sgx enabled" means "KVM's sgx_module param is enabled" and not "SGX is enabled in the system".
E.g.
KVM: nVMX: Advertise ENCLS_EXITING to L1 iff SGX is fully supported
On Tue, Oct 25, 2022, Emanuele Giuseppe Esposito wrote:
> Currently vmx
s/vmx/KVM
> enables SECONDARY_EXEC_ENCLS_EXITING even when sgx is not set in the host MSR.
"sgx is not set in the host MSR" is ambiguous. "sgx ... in the host MSR" could easily refer to the SGX_ENABLED bit in IA32_FEATURE_CONTROL, it could refer to the ENCLS_EXITING bit in the allowed-1 half of IA32_VMX_PROCBASED_CTLS2, etc...
In other words, please be more precise.
This statement is also wrong in that it implies that KVM _always_ sets ENCLS_EXITING, whereas the bug is purely limited to nested virtualization.
E.g.
Clear enable_sgx if ENCLS-exiting is not supported, i.e. if SGX cannot be virtualized. This fixes a bug where KVM would advertise ENCLS-exiting to L1 and propagate the control from vmcs12 to vmcs02 even if ENCLS-exiting isn't supported in secondary execution controls, e.g. because SGX isn't fully enabled, and thus induce an unexpected VM-Fail in L1.
> When booting a guest, KVM checks that the cpuid bit is actually set in vmx.c, and if not, it does not enable the feature.
Again, this has nothing to do with the failure.
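The failure mode discussed in this review can be sketched as a toy model. Everything here is hypothetical (struct toy_host and the toy_* helpers do not exist in KVM); the model only assumes what the thread states: the control is advertised to L1 based on enable_sgx, it is propagated from vmcs12 to vmcs02, and a VM-entry fails if vmcs02 sets a control that the hardware's allowed-1 bits do not permit.

```c
#include <stdbool.h>
#include <stdint.h>

#define ENCLS_EXITING (UINT32_C(1) << 15)

/* Hypothetical stand-in for host state; not a KVM structure. */
struct toy_host {
	uint32_t hw_allowed1;	/* allowed-1 half of the real CTLS2 MSR */
	bool enable_sgx;	/* models the module param */
};

/* Models the fix: clear the param when hardware lacks ENCLS-exiting. */
static inline void toy_hardware_setup(struct toy_host *h, bool apply_fix)
{
	if (apply_fix && !(h->hw_allowed1 & ENCLS_EXITING))
		h->enable_sgx = false;
}

/* What L1 would see as allowed-1 for ENCLS-exiting: driven by the param. */
static inline uint32_t toy_nested_allowed1(const struct toy_host *h)
{
	return h->enable_sgx ? ENCLS_EXITING : 0;
}

/* VM-entry with vmcs02 succeeds only if no control exceeds hardware. */
static inline bool toy_vmentry_ok(const struct toy_host *h, uint32_t vmcs02_ctl2)
{
	return (vmcs02_ctl2 & ~h->hw_allowed1) == 0;
}
```

In this model, a host with hw_allowed1 == 0 and enable_sgx left true lets L1 set ENCLS_EXITING, and toy_vmentry_ok() then rejects the resulting vmcs02. Once toy_hardware_setup() clears the param, the bit is never advertised and the entry check passes.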
On 10/25/22 19:21, Sean Christopherson wrote:
> Shortlog scope is still wrong, should be "KVM: nVMX:"
> The shortlog is also somewhat misleading/confusing, as it's not at all obvious that "sgx enabled" means "KVM's sgx_module param is enabled" and not "SGX is enabled in the system".
> E.g.
> KVM: nVMX: Advertise ENCLS_EXITING to L1 iff SGX is fully supported
Queued with this commit message:
---
KVM: VMX: fully disable SGX if SECONDARY_EXEC_ENCLS_EXITING unavailable
Clear enable_sgx if ENCLS-exiting is not supported, i.e. if SGX cannot be virtualized. When KVM is loaded, adjust_vmx_controls checks that the bit is available before enabling the feature; however, other parts of the code check enable_sgx and not clearing the variable caused two different bugs, mostly affecting nested virtualization scenarios.
First, because enable_sgx remained true, SECONDARY_EXEC_ENCLS_EXITING would be marked available in the capability MSRs that are accessed by a nested hypervisor. KVM would then propagate the control from vmcs12 to vmcs02 even if it isn't supported by the processor, thus causing an unexpected VM-Fail (exit code 0x7) in L1.
Second, vmx_set_cpu_caps() would not clear the SGX bits when hardware support is unavailable. This is a much less problematic bug as it only happens if SGX is soft-disabled (available in the processor but hidden in CPUID) or if SGX is supported for bare metal but not in the VMCS (will never happen when running on bare metal, but can theoretically happen when running in a VM).
Last but not least, this ensures that module params in sysfs reflect KVM's actual configuration.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2127128
Fixes: 72add915fbd5 ("KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC")
Cc: stable@vger.kernel.org
Suggested-by: Sean Christopherson <seanjc@google.com>
Suggested-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20221025123749.2201649-1-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
The bug is strictly speaking not in nVMX, although that's where most of the symptoms surface.
Paolo