RE: [PATCH v3 13/22] kvm: x86: Intercept #NM for saving IA32_XFD_ERR

29 Dec 2021

...
From: Sean Christopherson seanjc@google.com
Sent: Wednesday, December 29, 2021 8:10 AM
On Wed, Dec 22, 2021, Jing Liu wrote:
...
Guest IA32_XFD_ERR is generally modified in two places:

  - Set by CPU when #NM is triggered;
  - Cleared by guest in its #NM handler;

Intercept #NM for the first case once the guest writes a nonzero value to
XFD for the first time, which indicates that the guest may use XFD and thus
generate the exception. #NM is rare if the guest doesn't use dynamic features.
Otherwise, there is at most one exception per guest task for a given
dynamic feature.
Save the current XFD_ERR value to the guest_fpu container in the #NM
VM-exit handler. This must be done with interrupt/preemption disabled,
Assuming my below understanding is correct, drop the "preemption" bit, it's
misleading.
Code-wise, yes. Conceptually we just want to highlight that this operation
must be completed while both interrupts and preemption are disabled.
But we can also drop "preemption" if you prefer, since preemption is
certainly disabled when interrupts are disabled.
...
...
otherwise the unsaved MSR value may be clobbered by host operations.
Inject a virtual #NM to the guest after saving the MSR value.
Restore the host value (always ZERO outside of the host #NM
handler) before enabling preemption.
AIUI, changelog is wrong, code is right.  This must be done before _IRQs_ are
enabled, same as handling TIF_NEED_FPU_LOAD.
yes
...
...
Restore the guest value from the guest_fpu container right before
entering the guest (with preemption disabled).
Same complaint about preemption.
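
To make the ordering in the changelog concrete, here is a rough user-space model of the three steps (save on #NM exit, put the host value back, reload the guest value before entry). Everything named model_* is made up for illustration and is not a kernel API; the assert() only stands in for "IRQs are still off here", which is the constraint agreed on above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_cpu {
	uint64_t hw_xfd_err;	/* stands in for the physical IA32_XFD_ERR MSR */
	bool irqs_enabled;	/* stands in for the host IRQ state */
};

struct model_vcpu {
	uint64_t guest_xfd_err;	/* stands in for guest_fpu.xfd_err */
	bool nm_pending;	/* stands in for a queued virtual #NM */
};

/* #NM VM-exit: save the guest value before anything can clobber it. */
static void model_nm_exit_irqoff(struct model_cpu *cpu, struct model_vcpu *vcpu)
{
	assert(!cpu->irqs_enabled);
	vcpu->guest_xfd_err = cpu->hw_xfd_err;
	vcpu->nm_pending = true;	/* virtual #NM is injected afterwards */
}

/* Still with IRQs off: the host expects XFD_ERR to be zero. */
static void model_restore_host_irqoff(struct model_cpu *cpu)
{
	assert(!cpu->irqs_enabled);
	cpu->hw_xfd_err = 0;
}

/* Right before VM-entry: reload the saved guest value. */
static void model_vm_enter_irqoff(struct model_cpu *cpu,
				  const struct model_vcpu *vcpu)
{
	assert(!cpu->irqs_enabled);
	cpu->hw_xfd_err = vcpu->guest_xfd_err;
}
```

If IRQs were enabled between the exit and the save, a host #NM could overwrite the MSR first, which is the clobbering the changelog warns about.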
...
Suggested-by: Thomas Gleixner tglx@linutronix.de
Signed-off-by: Jing Liu jing2.liu@intel.com

 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/vmx/vmcs.h         |  5 +++++
 arch/x86/kvm/vmx/vmx.c          | 22 +++++++++++++++++++++-
 arch/x86/kvm/x86.c              |  6 ++++++
 4 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
...
index 555f4de47ef2..f7a661f35d1a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -640,6 +640,7 @@ struct kvm_vcpu_arch {
 	u64 smi_count;
 	bool tpr_access_reporting;
 	bool xsaves_enabled;
+	bool trap_nm;
 	u64 ia32_xss;
 	u64 microcode_version;
 	u64 arch_capabilities;

...
...
@@ -763,6 +764,9 @@ void vmx_update_exception_bitmap(struct kvm_vcpu *vcpu)
...
 		vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, match);
 	}
 
+	if (vcpu->arch.trap_nm)
+		eb |= (1u << NM_VECTOR);
+
 	vmcs_write32(EXCEPTION_BITMAP, eb);
 }
@@ -1960,6 +1964,13 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
...
 	case MSR_KERNEL_GS_BASE:
 		vmx_write_guest_kernel_gs_base(vmx, data);
 		break;
+	case MSR_IA32_XFD:
+		ret = kvm_set_msr_common(vcpu, msr_info);
+		if (!ret && data) {
+			vcpu->arch.trap_nm = true;
+			vmx_update_exception_bitmap(vcpu);
This is wrong, it fails to clear vcpu->arch.trap_nm and update the bitmap if the
MSR is cleared.
In concept you are right if just looking at this patch. It's pointless to
trap #NM if guest XFD is cleared.
But here we need to think about patch 22, which disables write interception
for XFD. With that in mind, we use the first non-zero write as the
hint that the guest might enable XFD-related usages, and thus always
trap #NM after this point.
It's not a good ordering, but Paolo wants to put the optimization at the
end of this series. But we do need to put a clear comment here explaining
the always-trap policy.
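
A small user-space sketch of the two policies being compared, in case it helps the discussion. The struct and the model_* functions are hypothetical stand-ins, not KVM code; only NM_VECTOR (vector 7) and the `1u << NM_VECTOR` bit come from the patch. The sticky variant mirrors the always-trap policy; the other derives the intercept from the last XFD value KVM saw.

```c
#include <stdbool.h>
#include <stdint.h>

#define NM_VECTOR 7	/* #NM (device not available) is exception vector 7 */

/* Hypothetical model of the relevant vCPU state; not the real KVM struct. */
struct model_vcpu {
	uint64_t guest_xfd;	/* last XFD value KVM observed */
	bool trap_nm;		/* sticky: set on the first non-zero write */
};

/* Intercepted XFD write under the always-trap policy. */
static void model_write_xfd(struct model_vcpu *vcpu, uint64_t data)
{
	vcpu->guest_xfd = data;
	if (data)
		vcpu->trap_nm = true;	/* never cleared afterwards */
}

/* Exception bitmap under the always-trap (sticky flag) policy. */
static uint32_t model_bitmap_sticky(const struct model_vcpu *vcpu, uint32_t eb)
{
	if (vcpu->trap_nm)
		eb |= 1u << NM_VECTOR;
	return eb;
}

/* Exception bitmap derived from the last observed XFD value instead. */
static uint32_t model_bitmap_from_value(const struct model_vcpu *vcpu,
					uint32_t eb)
{
	if (vcpu->guest_xfd)
		eb |= 1u << NM_VECTOR;
	return eb;
}
```

After a non-zero write followed by a zero write, the sticky variant keeps bit 7 set while the value-derived one drops it. Once XFD writes are no longer intercepted, the observed value can go stale, which is the argument for the sticky flag.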
...
But why even bother with an extra flag?  Can't vmx_update_exception_bitmap() get
the guest's MSR_IA32_XFD value and intercept #NM accordingly?
Above is the reason for the extra flag.
...
Then you could even handle this fully in kvm_set_msr_common(), e.g.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2c9606380bca..c6c936d2b298 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3704,6 +3704,8 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 1;
 
 		fpu_update_guest_xfd(&vcpu->arch.guest_fpu, data);
+		/* Blah blah blah blah */
+		static_call(kvm_x86_update_exception_bitmap)(vcpu);
 		break;
 	case MSR_IA32_XFD_ERR:
 		if (!msr_info->host_initiated &&




...

+		}
+		break;
 #endif
 	case MSR_IA32_SYSENTER_CS:
 		if (is_guest_mode(vcpu))
@@ -4746,7 +4757,7 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
...
 	vect_info = vmx->idt_vectoring_info;
 	intr_info = vmx_get_intr_info(vcpu);
 
-	if (is_machine_check(intr_info) || is_nmi(intr_info))
+	if (is_machine_check(intr_info) || is_nmi(intr_info) ||
+	    is_nm(intr_info))
...
 		return 1; /* handled by handle_exception_nmi_irqoff() */
 
 	if (is_invalid_opcode(intr_info))
@@ -6350,6 +6361,12 @@ static void handle_interrupt_nmi_irqoff(struct kvm_vcpu *vcpu,
...
 	kvm_after_interrupt(vcpu);
 }
+static void handle_exception_nm(struct kvm_vcpu *vcpu)
This needs a different name, it's waaaay too close to the base handle_exception_nmi(),
which runs with IRQs _on_.  And please add "_irqoff" at the end.  Maybe
handle_nm_fault_irqoff()?
sounds good.
...
...
+{
+	rdmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err);
+	kvm_queue_exception(vcpu, NM_VECTOR);
+}



 static void handle_exception_nmi_irqoff(struct vcpu_vmx *vmx)
 {
 	const unsigned long nmi_entry = (unsigned long)asm_exc_nmi_noist;
@@ -6358,6 +6375,9 @@ static void handle_exception_nmi_irqoff(struct vcpu_vmx *vmx)
...
 	/* if exit due to PF check for async PF */
 	if (is_page_fault(intr_info))
 		vmx->vcpu.arch.apf.host_apf_flags = kvm_read_and_reset_apf_flags();
...
+	/* if exit due to NM, handle before preemptions are enabled */
+	else if (is_nm(intr_info))
Same naming complaint about this helper, it looks like an is_nmi() typo.
is_nm_fault()?
will fix
...
...

+		handle_exception_nm(&vmx->vcpu);
 	/* Handle machine checks before interrupts are enabled */
 	else if (is_machine_check(intr_info))
 		kvm_machine_check();
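
On the is_nm() vs. is_nmi() naming point: the two checks decode different things from the VM-exit interruption information, so a one-letter difference is easy to misread. Below is a stand-alone sketch of the decoding, assuming the usual layout (vector in bits 7:0, type in bits 10:8, valid in bit 31). The macros are redefined locally for the example, and the model_* helper names are illustrative, not the ones in vmcs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local definitions for the sketch; layout follows the interruption-info format. */
#define INTR_INFO_VECTOR_MASK		0xffu
#define INTR_INFO_INTR_TYPE_MASK	0x700u
#define INTR_INFO_VALID_MASK		0x80000000u
#define INTR_TYPE_NMI_INTR		(2u << 8)	/* type 2: NMI */
#define INTR_TYPE_HARD_EXCEPTION	(3u << 8)	/* type 3: hardware exception */
#define NM_VECTOR			7u		/* #NM, device not available */

/* True for a valid hardware exception with the given vector. */
static bool model_is_exception_n(uint32_t intr_info, uint32_t vector)
{
	const uint32_t mask = INTR_INFO_VECTOR_MASK | INTR_INFO_INTR_TYPE_MASK |
			      INTR_INFO_VALID_MASK;

	return (intr_info & mask) ==
	       (INTR_TYPE_HARD_EXCEPTION | vector | INTR_INFO_VALID_MASK);
}

/* #NM fault: hardware exception, vector 7. */
static bool model_is_nm_fault(uint32_t intr_info)
{
	return model_is_exception_n(intr_info, NM_VECTOR);
}

/* NMI: identified by the interruption type alone, not by exception vector. */
static bool model_is_nmi(uint32_t intr_info)
{
	const uint32_t mask = INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK;

	return (intr_info & mask) == (INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK);
}
```

With an explicit "_fault" suffix, a reader cannot mistake the #NM check for a typo of the NMI check.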


    
