On Tue, Nov 16, 2021, Paolo Bonzini wrote:
Currently, checks for whether VT-d PI can be used refer to the current status of the feature in the current vCPU; or they more or less pick vCPU 0 in case a specific vCPU is not available.
However, these checks do not attempt to synchronize with changes to the IRTE. In particular, there is no path that updates the IRTE when APICv is re-activated on vCPU 0; and there is no path to wake up a CPU that has APICv disabled, if the wakeup occurs because of an IRTE that points to a posted interrupt.
Ooooh, I think I get it now. You're saying that if pi_update_irte() configured the IRQ to post the IRQ to a vCPU, and then that vCPU disables APICv, because KVM doesn't go back and fixup the IRTE, the device will send the IRQ to the current posted interrupt vector, not to the non-posted vector. That makes sense.
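To make sure I'm reading the failure mode correctly, here is a minimal toy model of it. Everything in it (the toy_* names, the two-state IRTE mode) is made up for illustration and is not the kernel's actual data layout or the real pi_update_irte() logic; it only shows how an IRTE that was configured while APICv was active goes stale once the vCPU disables APICv and nothing revisits the IRTE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, NOT the kernel's structures. */
enum toy_irte_mode { TOY_IRTE_REMAPPED, TOY_IRTE_POSTED };

struct toy_vcpu { bool apicv_active; };
struct toy_irte { enum toy_irte_mode mode; };

/*
 * Assumed old behaviour: the IRTE mode is derived from the vCPU's
 * APICv state at the moment of the update, and never revisited.
 */
static void toy_update_irte(struct toy_irte *irte, const struct toy_vcpu *vcpu)
{
	assert(irte && vcpu);
	irte->mode = vcpu->apicv_active ? TOY_IRTE_POSTED : TOY_IRTE_REMAPPED;
}

/* APICv is turned off, but nobody goes back to fix up the IRTE. */
static void toy_disable_apicv(struct toy_vcpu *vcpu)
{
	assert(vcpu);
	vcpu->apicv_active = false;
}

/* True if the device would still post to a vCPU with APICv disabled. */
static bool toy_irte_is_stale(const struct toy_irte *irte,
			      const struct toy_vcpu *vcpu)
{
	assert(irte && vcpu);
	return irte->mode == TOY_IRTE_POSTED && !vcpu->apicv_active;
}
```

In this model, calling toy_update_irte() on a vCPU with APICv active and then toy_disable_apicv() leaves toy_irte_is_stale() returning true, which is the situation described above.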
To fix this, always go through the VT-d PI path as long as there are assigned devices and APICv is available on both the host and the VM side. Since the relevant condition was copied over three times, take the hint and factor it into a separate function.
Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
 arch/x86/kvm/vmx/posted_intr.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 5f81ef092bd4..b64dd1374ed9 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -5,6 +5,7 @@
 #include <asm/cpu.h>
 
 #include "lapic.h"
+#include "irq.h"
 #include "posted_intr.h"
 #include "trace.h"
 #include "vmx.h"
@@ -77,13 +78,18 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
 		pi_set_on(pi_desc);
 }
+static bool vmx_can_use_vtd_pi(struct kvm *kvm)
+{
+	return kvm_arch_has_assigned_device(kvm) &&
+		irq_remapping_cap(IRQ_POSTING_CAP) &&
+		irqchip_in_kernel(kvm) && enable_apicv;
Bad indentation/alignment.
Not that it's likely to matter, but would it make sense to invert the checks so that they're short-circuited on the faster KVM checks? E.g. fastest to slowest:
	return irqchip_in_kernel(kvm) && enable_apicv &&
	       kvm_arch_has_assigned_device(kvm) &&
	       irq_remapping_cap(IRQ_POSTING_CAP);
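As a rough illustration of why the ordering could matter, here is a standalone sketch of the short-circuit behaviour. The toy_* helpers are hypothetical stand-ins, not the real KVM or IOMMU functions, and which checks are actually slower in the kernel is an assumption here; the counter only shows that `&&` skips the later operands once an earlier one is false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-VM state. */
struct toy_kvm {
	bool irqchip_in_kernel;
	bool has_assigned_device;
};

static bool toy_enable_apicv = true;	/* stand-in for a module param */
static bool toy_posting_cap = true;	/* stand-in for an IOMMU capability */
static int toy_slow_checks;		/* counts calls to the "slow" checks */

/* Assumed cheap: a plain field read. */
static bool toy_irqchip_in_kernel(const struct toy_kvm *kvm)
{
	return kvm->irqchip_in_kernel;
}

/* Assumed slower checks; each call bumps the counter. */
static bool toy_has_assigned_device(const struct toy_kvm *kvm)
{
	toy_slow_checks++;
	return kvm->has_assigned_device;
}

static bool toy_irq_remapping_cap(void)
{
	toy_slow_checks++;
	return toy_posting_cap;
}

/* Cheap checks first, so a false result skips the slower ones. */
static bool toy_can_use_vtd_pi(const struct toy_kvm *kvm)
{
	assert(kvm);
	return toy_irqchip_in_kernel(kvm) && toy_enable_apicv &&
	       toy_has_assigned_device(kvm) &&
	       toy_irq_remapping_cap();
}
```

With irqchip_in_kernel false, toy_can_use_vtd_pi() returns false without touching either slow check; with everything true, both slow checks run once.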
Nits aside,
Reviewed-by: Sean Christopherson seanjc@google.com