On Wednesday, December 29, 2021 8:39 AM, Sean Christopherson wrote:
To: Liu, Jing2 <jing2.liu@intel.com>
Cc: x86@kernel.org; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; linux-doc@vger.kernel.org; linux-kselftest@vger.kernel.org; tglx@linutronix.de; mingo@redhat.com; bp@alien8.de; dave.hansen@linux.intel.com; pbonzini@redhat.com; corbet@lwn.net; shuah@kernel.org; Nakajima, Jun <jun.nakajima@intel.com>; Tian, Kevin <kevin.tian@intel.com>; jing2.liu@linux.intel.com; Zeng, Guang <guang.zeng@intel.com>; Wang, Wei W <wei.w.wang@intel.com>; Zhong, Yang <yang.zhong@intel.com>
Subject: Re: [PATCH v3 19/22] kvm: x86: Get/set expanded xstate buffer
Shortlog needs to have a verb somewhere.
On Wed, Dec 22, 2021, Jing Liu wrote:
From: Guang Zeng <guang.zeng@intel.com>
When AMX is enabled it requires a larger xstate buffer than the legacy hardcoded 4KB one. Exising kvm ioctls
Existing
(KVM_[G|S]ET_XSAVE under KVM_CAP_XSAVE) are not suitable for this purpose.
...
Reuse KVM_SET_XSAVE for both old/new formats by reimplementing it to do properly-sized memdup_user() based on the guest fpu container.
I'm confused, the first sentence says KVM_SET_XSAVE isn't suitable, the second says it can be reused with minimal effort.
Probably "doesn't support" sounds better than "isn't suitable" above. But I plan to reword it a bit:
With KVM_CAP_XSAVE, userspace uses a hardcoded 4KB buffer to get/set xstate data from/to KVM. This doesn't work when dynamic features (e.g. AMX) are used by the guest, as KVM uses a full expanded xstate buffer for the guest fpu emulation, which is larger than 4KB.
Add KVM_CAP_XSAVE2. Userspace gets the required xstate buffer size from KVM via KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2). KVM_SET_XSAVE is extended to accept the larger xstate data passed from userspace. For backward compatibility, a new KVM_GET_XSAVE2 ioctl is added rather than extending KVM_GET_XSAVE to work with the larger buffer size. (Link: https://lkml.org/lkml/2021/12/15/510)
Also, update the api doc with the new KVM_GET_XSAVE2 ioctl.
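To illustrate the size negotiation described in the reworded changelog, here is a minimal userspace sketch. It is not code from the patch: xsave_buf_size() and want_xsave2() are hypothetical helpers, cap_ret stands in for the value returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2), and treating a non-positive return as "capability absent" is an assumption.

```c
#include <stddef.h>

/* The legacy struct kvm_xsave is 1024 32-bit words, i.e. 4KB. */
#define LEGACY_XSAVE_SIZE 4096u

/*
 * Hypothetical helper: pick the size of the buffer userspace allocates
 * for saving/restoring xstate.  cap_ret models the return value of
 * ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_XSAVE2).
 * Assumption: a value <= 0 means KVM_CAP_XSAVE2 is not available, so
 * only the 4KB KVM_GET_XSAVE layout can be used.
 */
static size_t xsave_buf_size(int cap_ret)
{
	if (cap_ret <= 0)
		return LEGACY_XSAVE_SIZE;

	/* Never allocate less than the legacy layout. */
	if ((size_t)cap_ret < LEGACY_XSAVE_SIZE)
		return LEGACY_XSAVE_SIZE;

	return (size_t)cap_ret;
}

/* Hypothetical helper: should userspace call KVM_GET_XSAVE2? */
static int want_xsave2(int cap_ret)
{
	return cap_ret > 0;
}
```

With this shape, a VMM would allocate xsave_buf_size(cap_ret) bytes once and pass the same buffer to KVM_GET_XSAVE2 and KVM_SET_XSAVE, falling back to KVM_GET_XSAVE only when want_xsave2() is false.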
...
@@ -5367,7 +5382,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		break;
 	}
 	case KVM_SET_XSAVE: {
-		u.xsave = memdup_user(argp, sizeof(*u.xsave));
+		int size = vcpu->arch.guest_fpu.uabi_size;
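As a rough illustration of what the hunk above changes, here is a userspace model of the copy-in step. It is only a sketch: struct fpu_model, memdup_model() and set_xsave_copy_in() are hypothetical names, and malloc()/memcpy() stand in for the kernel's memdup_user(), which returns an ERR_PTR rather than NULL on failure.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the vCPU's guest FPU container. */
struct fpu_model {
	size_t uabi_size;	/* models vcpu->arch.guest_fpu.uabi_size */
};

/* Userspace model of memdup_user(): duplicate len bytes, NULL on failure. */
static void *memdup_model(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}

/*
 * Model of the reworked KVM_SET_XSAVE copy-in: the length comes from the
 * vCPU's FPU container instead of a fixed sizeof(), so state that lives
 * beyond the first 4KB is copied as well.
 */
static void *set_xsave_copy_in(const struct fpu_model *fpu, const void *ubuf)
{
	return memdup_model(ubuf, fpu->uabi_size);
}
```

The flip side is that the user buffer must really be uabi_size bytes long, which is why the getter used to fill it matters.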
IIUC, reusing KVM_SET_XSAVE works by requiring that userspace use KVM_GET_XSAVE2 if userspace has expanded the guest FPU size by exposing relevant features to the guest via guest CPUID. If so, then that needs to be enforced in KVM_GET_XSAVE, otherwise userspace will get subtle corruption by invoking the wrong ioctl, e.g.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2c9606380bca..5d2acbd52df5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5386,6 +5386,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		break;
 	}
 	case KVM_GET_XSAVE: {
+		r = -EINVAL;
+		if (vcpu->arch.guest_fpu.uabi_size > sizeof(struct kvm_xsave))
+			break;
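The effect of that check can be modeled in plain C. This is a sketch under stated assumptions, not the KVM implementation: struct vcpu_model and get_xsave_legacy() are hypothetical, and the zero-filled tail is a choice made for this model only.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define LEGACY_XSAVE_BYTES 4096u	/* size of the legacy struct kvm_xsave */

/* Hypothetical model of the per-vCPU state the ioctl reads. */
struct vcpu_model {
	size_t uabi_size;		/* models guest_fpu.uabi_size */
	const unsigned char *state;	/* uabi_size bytes of xstate */
};

/*
 * Model of the legacy KVM_GET_XSAVE path with the proposed check: once
 * the guest's xstate no longer fits in 4KB, fail with -EINVAL instead of
 * silently handing back truncated state.
 */
static int get_xsave_legacy(const struct vcpu_model *vcpu,
			    unsigned char out[LEGACY_XSAVE_BYTES])
{
	if (vcpu->uabi_size > LEGACY_XSAVE_BYTES)
		return -EINVAL;

	/* Zero the tail so unused bytes are well defined (model choice). */
	memset(out, 0, LEGACY_XSAVE_BYTES);
	memcpy(out, vcpu->state, vcpu->uabi_size);
	return 0;
}
```

Userspace that hits -EINVAL here knows it has to switch to KVM_GET_XSAVE2 with a buffer of the size reported by KVM_CAP_XSAVE2.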
Looks good to me.
Thanks, Wei