On Thu, Nov 10, 2022, Dave Hansen wrote:
On 11/10/22 16:03, Kyle Huey wrote:
On Tue, Nov 8, 2022 at 10:23 AM Dave Hansen dave.hansen@intel.com wrote:
BTW, I'd love to know if KVM *REALLY* depends on this.
Unlikely, but nearly impossible to know for sure. Copy+pasting my response[1] to an earlier version.
: Hrm, the current behavior has been KVM ABI for a very long time. : : It's definitely odd because all other components will be initialized due to their : bits being cleared in the header during kvm_load_guest_fpu(), and it probably : wouldn't cause problems in practice as most VMMs likely do "all or nothing" loads. : But, in theory, userspace could save/restore a subset of guest XSTATE and rely on : the kernel not overwriting guest PKRU when its bit is cleared in the header. : : All that said, I don't see any reason to force KVM to change at this time, it's : trivial enough to handle KVM's oddities while providing sane behavior for others. : Nullify the pointer in the guest path and then update copy_uabi_to_xstate() to : play nice with a NULL pointer, e.g. : : /* : * Nullify @vpkru to preserve its current value if PKRU's bit isn't set : * in the header. KVM's odd ABI is to leave PKRU untouched in this : * case (all other components are eventually re-initialized). : */ : if (!(kstate->regs.xsave.header.xfeatures & XFEATURE_MASK_PKRU)) : vpkru = NULL; : : return copy_uabi_from_kernel_to_xstate(kstate, ustate, vpkru);
It'd be nice to kill if not.
I don't disagree, my hesitation is purely that doing so might subtly break userspace.
That said, I'm 99.9% certain no traditional VMM, e.g. QEMU, is relying on this behavior, as doing KVM_SET_XSAVE with anything except the guest's xfeatures mask would corrupt guest XSAVE state for everything except PKRU. I.e. for all intents and purposes, a traditional VMM must do KVM_GET_SAVE => KVM_SET_XSAVE without touching the xfeatures mask.
And for non-traditional usage of KVM, I would be quite surprised if any of those use cases utilize PKRU in the guest, let alone play games with KVM_{G,S}SET_XSAVE.
So, I'm not completely opposed to "fixing" KVM's ABI, but it should be done as a separate patch that is tagged "KVM: x86:" and clearly states that it's changing KVM's ABI in a way that could theoretically break userspace.
Would something like this be more clear?
if (hdr.xfeatures & XFEATURE_MASK_PKRU) { struct pkru_state *xpkru; xpkru = __raw_xsave_addr(xsave, XFEATURE_PKRU); *pkru = xpkru->pkru; } else { /* * KVM may pass a NULL 'pkru' to indicate * that it does not need PKRU updated. */ if (pkru) *pkru = 0; }
Yeah, Sean Christopherson suggested this (with the else and if collapsed into a single level) when I submitted this previously.
I generally agree with Sean, but he's also been guilty of an atrocity or two over the years. :)
Heh, just one or two? I'll call that a win.
While I generally like low levels of indentation I also think my version is much more clear in this case.
I've no objection to a standalone if. My suggestion[2] was in response to code that zeroed @pkru before the XFEATURE_MASK_PKRU check.
if (pkru) *pkru = 0;
if (hdr.xfeatures & XFEATURE_MASK_PKRU) { struct pkru_state *xpkru; xpkru = __raw_xsave_addr(xsave, XFEATURE_PKRU); *pkru = xpkru->pkru; }
[1] https://lore.kernel.org/all/Yv6szXuKGv75wWmm@google.com [2] https://lore.kernel.org/all/YxDP6jie4cwzZIHp@google.com