KVM's implementation of nested SVM treats PAT the same way whether or not nested NPT is enabled: L1 and L2 share a PAT.
This is correct when nested NPT is disabled, but incorrect when nested NPT is enabled. When nested NPT is enabled, L1 and L2 have independent PATs.
The architectural specification for this separation is unusual. There is a "guest PAT register" that is accessed by references to the PAT MSR in guest mode, but it is different from the (host) PAT MSR. Other resources that have distinct host and guest values have a shared storage location, and the values are swapped on VM-entry/VM-exit.
In https://lore.kernel.org/kvm/20251107201151.3303170-1-jmattson@google.com/, I proposed an implementation that adhered to the architectural specification. It had a few warts. The worst was the necessity of "fixing up" KVM_SET_MSRS when executing KVM_SET_NESTED_STATE if L2 was active and nested NPT was enabled when a snapshot was taken. Aside from Yosry's clarification, no one has responded. I will take silence to imply rejection. That's okay; I wasn't fond of that implementation myself.
The current series treats PAT just like any other resource with distinct host and guest values. There is a single shared storage location (vcpu->arch.pat), and the values are swapped on VM-entry/VM-exit. Though this implementation doesn't precisely follow the architectural specification, the guest visible behavior is the same as architected.
The first three patches ensure that the vmcb01.g_pat value at VMRUN is preserved through virtual SMM and serialization. When NPT is enabled, this field holds the host (L1) hPAT value from emulated VMRUN to emulated #VMEXIT.
The fourth patch restores the (L1) hPAT value from vmcb01.g_pat at emulated #VMEXIT. Note that this is not architected, but it is required for this implementation, because hPAT and gPAT occupy the same storage location.
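The next three patches handle loading vmcb12.g_pat into the (L2) guest PAT register at VMRUN. Most of this behavior is architected, but the architectural specification states that the value is loaded into the guest PAT register, leaving the hPAT register unchanged. In this implementation, the load overwrites the shared storage location instead, which is why L1's value must be preserved in vmcb01.g_pat.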
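For readers unfamiliar with PAT encoding, here is a minimal sketch of the kind of check a vmcb12.g_pat validity test would need. It assumes only the architectural rule that each of the eight byte-wide PAT entries must hold memory type 0, 1, 4, 5, 6, or 7 (types 2 and 3 and anything above 7 are reserved). The name toy_pat_valid() is hypothetical and this is not the code from the series.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not the code in this series).
 * Returns true if every one of the eight PAT entries holds a non-reserved
 * memory type: 0 (UC), 1 (WC), 4 (WT), 5 (WP), 6 (WB), or 7 (UC-).
 */
bool toy_pat_valid(uint64_t pat)
{
	for (int i = 0; i < 8; i++) {
		uint8_t type = (pat >> (i * 8)) & 0xff;

		/* Types 2 and 3 are reserved, as is anything above 7. */
		if (type > 7 || type == 2 || type == 3)
			return false;
	}
	return true;
}
```

With this helper, the power-on default 0x0007040600070406 is accepted, while a value with 0x02 or 0x08 in any byte is rejected.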
The eighth patch stores the (L2) guest PAT register into vmcb12.g_pat on emulated #VMEXIT, as architected.
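Taken together, the save, load, and restore steps described above can be modeled roughly as follows. This is a toy sketch, not KVM code: struct toy_vcpu, toy_vmrun(), and toy_vmexit() are hypothetical names, the pat field merely stands in for vcpu->arch.pat, and details such as validity checking, SMM, and serialization are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical types and helpers, not KVM code. */
struct toy_vcpu {
	uint64_t pat;          /* single shared slot (models vcpu->arch.pat) */
	uint64_t vmcb01_g_pat; /* holds L1's hPAT while L2 is active */
	bool nested_npt;       /* nested NPT enabled for this L2? */
};

/* Emulated VMRUN: stash L1's hPAT, then load L2's gPAT into the shared slot. */
void toy_vmrun(struct toy_vcpu *v, uint64_t vmcb12_g_pat)
{
	if (!v->nested_npt)
		return; /* L1 and L2 share one PAT; nothing to swap */

	v->vmcb01_g_pat = v->pat;
	v->pat = vmcb12_g_pat;
}

/* Emulated #VMEXIT: save L2's gPAT to vmcb12, then restore L1's hPAT. */
void toy_vmexit(struct toy_vcpu *v, uint64_t *vmcb12_g_pat)
{
	if (!v->nested_npt)
		return;

	*vmcb12_g_pat = v->pat;
	v->pat = v->vmcb01_g_pat;
}
```

In this model, a PAT write performed while L2 is active lands in the shared slot, is written back to vmcb12.g_pat by toy_vmexit(), and never disturbs the value that L1 sees after the exit. With nested_npt clear, both helpers are no-ops, which matches L1 and L2 sharing a PAT.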
The ninth patch fixes the emulation of WRMSR(IA32_PAT) when nested NPT is enabled.
The tenth patch introduces a new KVM selftest to validate virtualized PAT behavior.
Jim Mattson (10):
  KVM: x86: nSVM: Add g_pat to fields copied by svm_copy_vmrun_state()
  KVM: x86: nSVM: Add VALID_GPAT flag to kvm_svm_nested_state_hdr
  KVM: x86: nSVM: Handle legacy SVM nested state in SET_NESTED_STATE
  KVM: x86: nSVM: Restore L1's PAT on emulated #VMEXIT from L2 to L1
  KVM: x86: nSVM: Cache g_pat in vmcb_save_area_cached
  KVM: x86: nSVM: Add validity check for VMCB12 g_pat
  KVM: x86: nSVM: Set vmcb02.g_pat correctly for nested NPT
  KVM: x86: nSVM: Save gPAT to vmcb12.g_pat on emulated #VMEXIT from L2 to L1
  KVM: x86: nSVM: Fix assignment to IA32_PAT from L2
  KVM: selftests: nSVM: Add svm_nested_pat test
 arch/x86/include/uapi/asm/kvm.h               |   3 +
 arch/x86/kvm/svm/nested.c                     |  74 +++-
 arch/x86/kvm/svm/svm.c                        |  14 +-
 arch/x86/kvm/svm/svm.h                        |   2 +-
 tools/testing/selftests/kvm/Makefile.kvm      |   1 +
 .../selftests/kvm/x86/svm_nested_pat_test.c   | 357 ++++++++++++++++++
 6 files changed, 432 insertions(+), 19 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86/svm_nested_pat_test.c
base-commit: f62b64b970570c92fe22503b0cdc65be7ce7fc7c