 
On Tue, Aug 26, 2025 at 3:29 PM Sagi Shahar <sagis@google.com> wrote:
On Tue, Aug 26, 2025 at 3:14 PM Ira Weiny <ira.weiny@intel.com> wrote:
Sean Christopherson wrote:
On Wed, Aug 20, 2025, Sagi Shahar wrote:
TDX requires special handling for VM and VCPU initialization for various reasons:
- Special ioctls for creating VM and VCPU.
- TDX registers are inaccessible to KVM.
- TDX requires a special boot code trampoline for loading parameters.
- TDX only supports KVM_CAP_SPLIT_IRQCHIP.
Please split this up and elaborate at least a little bit on why each flow needs special handling for TDX. Even for someone like me who is fairly familiar with TDX, there's too much "Trust me bro" and not enough explanation of why selftests really need all of these special paths for TDX.
At least four patches, one for each of your bullet points. Probably 5 or 6, as I think the CPUID handling warrants its own patch.
Hook this special handling into __vm_create() and vm_arch_vcpu_add() using the utility functions added in previous patches.
Signed-off-by: Sagi Shahar <sagis@google.com>
 tools/testing/selftests/kvm/lib/kvm_util.c    | 24 ++++++++-
 .../testing/selftests/kvm/lib/x86/processor.c | 49 ++++++++++++++-----
 2 files changed, 61 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index b4c8702ba4bd..d9f0ff97770d 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -4,6 +4,7 @@
  * Copyright (C) 2018, Google LLC.
  */
+#include "tdx/tdx_util.h"
 #include "test_util.h"
 #include "kvm_util.h"
 #include "processor.h"
@@ -465,7 +466,7 @@ void kvm_set_files_rlimit(uint32_t nr_vcpus)
 static bool is_guest_memfd_required(struct vm_shape shape)
 {
 #ifdef __x86_64__
-	return shape.type == KVM_X86_SNP_VM;
+	return (shape.type == KVM_X86_SNP_VM || shape.type == KVM_X86_TDX_VM);
 #else
 	return false;
 #endif
@@ -499,6 +500,12 @@ struct kvm_vm *__vm_create(struct vm_shape shape, uint32_t nr_runnable_vcpus,
 	for (i = 0; i < NR_MEM_REGIONS; i++)
 		vm->memslots[i] = 0;
 
+	if (is_tdx_vm(vm)) {
+		/* Setup additional mem regions for TDX. */
+		vm_tdx_setup_boot_code_region(vm);
+		vm_tdx_setup_boot_parameters_region(vm, nr_runnable_vcpus);
+	}
+
 	kvm_vm_elf_load(vm, program_invocation_name);
 
 	/*
@@ -1728,11 +1735,26 @@ void *addr_gpa2alias(struct kvm_vm *vm, vm_paddr_t gpa)
 	return (void *) ((uintptr_t) region->host_alias + offset);
 }
 
+static bool is_split_irqchip_required(struct kvm_vm *vm)
+{
+#ifdef __x86_64__
+	return is_tdx_vm(vm);
+#else
+	return false;
+#endif
+}
+
 /* Create an interrupt controller chip for the specified VM. */
 void vm_create_irqchip(struct kvm_vm *vm)
 {
 	int r;
 
+	if (is_split_irqchip_required(vm)) {
+		vm_enable_cap(vm, KVM_CAP_SPLIT_IRQCHIP, 24);
+		vm->has_irqchip = true;
+		return;
+	}
Ugh. IMO, this is a KVM bug. Allowing KVM_CREATE_IRQCHIP for a TDX VM is simply wrong. It _can't_ work. Waiting until KVM_CREATE_VCPU to fail setup is terrible ABI.
If we stretch the meaning of ENOTTY a bit and return that when trying to create a fully in-kernel IRQCHIP for a TDX VM, then the selftests code Just Works thanks to the code below, which handles the scenario where KVM was built without
^^^^^^^^^^
I'm not following. Was there supposed to be a patch attached?
I think Sean is referring to the existing implementation, which fell outside the diff context and so was left out of the patch:
/*
- Allocate a fully in-kernel IRQ chip by default, but fall back to a
- split model (x86 only) if that fails (KVM x86 allows compiling out
- support for KVM_CREATE_IRQCHIP).
*/ r = __vm_ioctl(vm, KVM_CREATE_IRQCHIP, NULL); if (r && errno == ENOTTY && kvm_has_cap(KVM_CAP_SPLIT_IRQCHIP)) vm_enable_cap(vm, KVM_CAP_SPLIT_IRQCHIP, 24); else TEST_ASSERT_VM_VCPU_IOCTL(!r, KVM_CREATE_IRQCHIP, r, vm);
/*
- Allocate a fully in-kernel IRQ chip by default, but fall back to a
- split model (x86 only) if that fails (KVM x86 allows compiling out
- support for KVM_CREATE_IRQCHIP).
*/ r = __vm_ioctl(vm, KVM_CREATE_IRQCHIP, NULL); if (r && errno == ENOTTY && kvm_has_cap(KVM_CAP_SPLIT_IRQCHIP)) vm_enable_cap(vm, KVM_CAP_SPLIT_IRQCHIP, 24); else TEST_ASSERT_VM_VCPU_IOCTL(!r, KVM_CREATE_IRQCHIP, r, vm); /*
- Allocate a fully in-kernel IRQ chip by default, but fall back to a
- split model (x86 only) if that fails (KVM x86 allows compiling out
- support for KVM_CREATE_IRQCHIP).
*/ r = __vm_ioctl(vm, KVM_CREATE_IRQCHIP, NULL); if (r && errno == ENOTTY && kvm_has_cap(KVM_CAP_SPLIT_IRQCHIP)) vm_enable_cap(vm, KVM_CAP_SPLIT_IRQCHIP, 24); else TEST_ASSERT_VM_VCPU_IOCTL(!r, KVM_CREATE_IRQCHIP, r, vm);
Sorry, I messed up the paste somehow:
	/*
	 * Allocate a fully in-kernel IRQ chip by default, but fall back to a
	 * split model (x86 only) if that fails (KVM x86 allows compiling out
	 * support for KVM_CREATE_IRQCHIP).
	 */
	r = __vm_ioctl(vm, KVM_CREATE_IRQCHIP, NULL);
	if (r && errno == ENOTTY && kvm_has_cap(KVM_CAP_SPLIT_IRQCHIP))
		vm_enable_cap(vm, KVM_CAP_SPLIT_IRQCHIP, 24);
	else
		TEST_ASSERT_VM_VCPU_IOCTL(!r, KVM_CREATE_IRQCHIP, r, vm);
Ira
support for in-kernel I/O APIC (and PIC and PIT).
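
To make the ENOTTY suggestion concrete, here is a rough, fully mocked sketch of the fallback flow. Nothing in it talks to real KVM: struct fake_vm, fake_create_irqchip() and fake_vm_create_irqchip() are hypothetical stand-ins, and returning ENOTTY for a TDX VM is the behavior proposed above, not something this sketch claims KVM does today. The point is only that, if KVM_CREATE_IRQCHIP failed with ENOTTY for TDX, the existing fallback would pick the split model without a TDX-specific branch in the selftests.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Outcome of the (mocked) irqchip setup. */
enum irqchip_mode { IRQCHIP_FAILED, IRQCHIP_KERNEL, IRQCHIP_SPLIT };

/* Hypothetical stand-in for a VM; NOT the selftests' struct kvm_vm. */
struct fake_vm {
	bool is_tdx;		/* VM type is TDX */
	bool kernel_irqchip;	/* full in-kernel irqchip support built in */
	bool has_split_cap;	/* mock of kvm_has_cap(KVM_CAP_SPLIT_IRQCHIP) */
};

/*
 * Mock of KVM_CREATE_IRQCHIP. Assumption: under the proposed ABI, KVM
 * returns ENOTTY for a TDX VM, the same errno it uses when in-kernel
 * irqchip support is compiled out.
 */
static int fake_create_irqchip(const struct fake_vm *vm)
{
	if (vm->is_tdx || !vm->kernel_irqchip) {
		errno = ENOTTY;
		return -1;
	}
	return 0;
}

/* Mirrors the shape of the existing fallback, with no TDX special case. */
static enum irqchip_mode fake_vm_create_irqchip(const struct fake_vm *vm)
{
	int r;

	assert(vm != NULL);

	r = fake_create_irqchip(vm);
	if (r && errno == ENOTTY && vm->has_split_cap)
		return IRQCHIP_SPLIT;

	/* The real code asserts on failure; the sketch reports it instead. */
	return r ? IRQCHIP_FAILED : IRQCHIP_KERNEL;
}
```

With this shape, a mocked TDX VM that advertises the split capability ends up in IRQCHIP_SPLIT, a regular VM stays on IRQCHIP_KERNEL, and a TDX VM without the capability fails at irqchip creation rather than later at vCPU creation.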