On Tue, Nov 9, 2021, at 5:43 AM, Brian Geffon wrote:
Hi Dave,
On Tue, Nov 9, 2021 at 1:49 AM Dave Hansen <dave.hansen@intel.com> wrote:
Well, gosh, it's making it back to the software init value. If you do:
echo 0x15555554 > /sys/kernel/debug/x86/init_pkru
do you end up with 0x15555554 as the value?
What's interesting is that writing to init_pkru fails with -EINVAL for me, and I've traced it down to get_xsave_addr() returning NULL on the following check:
	/*
	 * This assumes the last 'xsave*' instruction to
	 * have requested that 'xfeature_nr' be saved.
	 * If it did not, we might be seeing and old value
	 * of the field in the buffer.
	 *
	 * This can happen because the last 'xsave' did not
	 * request that this feature be saved (unlikely)
	 * or because the "init optimization" caused it
	 * to not be saved.
	 */
	if (!(xsave->header.xfeatures & BIT_ULL(xfeature_nr)))
		return NULL;
Here's an excerpt from an old email that I, perhaps unwisely, sent to Dave but not to a public list:
static inline void write_pkru(u32 pkru)
{
	struct pkru_state *pk;

	if (!boot_cpu_has(X86_FEATURE_OSPKE))
		return;

	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);

	/*
	 * The PKRU value in xstate needs to be in sync with the value that is
	 * written to the CPU. The FPU restore on return to userland would
	 * otherwise load the previous value again.
	 */
	fpregs_lock();
	if (pk)
		pk->pkru = pkru;

^^^ else we just write to the PKRU register but leave XINUSE[PKRU] clear on return to usermode? That seems... unwise.

	__write_pkru(pkru);
	fpregs_unlock();
}
I bet you're hitting exactly this bug. The fix ended up being a whole series of patches, but the gist of it is that the write_pkru() slow path needs to set the xfeature bit in the xsave buffer and then do the write. It should be possible to make a little patch to do just this in a couple lines of code.
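To make the failure mode concrete, here is a rough user-space model of it. This is a sketch for illustration, not kernel code and not the actual fix that was merged. Everything prefixed with model_ is made up, the struct layout is simplified, and the restore step assumes that a component whose xfeatures bit is clear gets loaded with its init value (0 for PKRU), which is my understanding of how XRSTOR behaves. XFEATURE_PKRU is set to 9 on the assumption that it matches the kernel's numbering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the kernel's feature numbering; illustration only. */
#define XFEATURE_PKRU 9
#define BIT_ULL(n) (1ULL << (n))

struct pkru_state {
	uint32_t pkru;
	uint32_t pad;
};

/* Hypothetical, simplified stand-in for the task's xsave buffer. */
struct xsave_model {
	uint64_t xfeatures;	/* stands in for xsave->header.xfeatures */
	struct pkru_state pk;	/* stands in for the PKRU component */
};

/* Hypothetical stand-in for the CPU's PKRU register. */
static uint32_t model_pkru_reg;

/* Same check as get_xsave_addr(): NULL if the feature bit is clear. */
static struct pkru_state *model_get_xsave_addr(struct xsave_model *xs, int nr)
{
	if (!(xs->xfeatures & BIT_ULL(nr)))
		return NULL;
	return &xs->pk;
}

/* Mirrors the quoted write_pkru(): the buffer is skipped when pk is NULL. */
static void model_write_pkru_buggy(struct xsave_model *xs, uint32_t pkru)
{
	struct pkru_state *pk = model_get_xsave_addr(xs, XFEATURE_PKRU);

	if (pk)
		pk->pkru = pkru;
	model_pkru_reg = pkru;
}

/* Sketch of the idea: mark the feature in use first, then write both copies. */
static void model_write_pkru_fixed(struct xsave_model *xs, uint32_t pkru)
{
	struct pkru_state *pk;

	xs->xfeatures |= BIT_ULL(XFEATURE_PKRU);
	pk = model_get_xsave_addr(xs, XFEATURE_PKRU);
	assert(pk != NULL);	/* cannot be NULL once the bit is set */
	pk->pkru = pkru;
	model_pkru_reg = pkru;
}

/* Models the FPU restore on return to usermode (assumed init value: 0). */
static void model_restore(struct xsave_model *xs)
{
	struct pkru_state *pk = model_get_xsave_addr(xs, XFEATURE_PKRU);

	model_pkru_reg = pk ? pk->pkru : 0;
}
```

In this model, starting from a zeroed buffer, model_write_pkru_buggy() followed by model_restore() leaves the register at 0, so the written value is lost. The same sequence with model_write_pkru_fixed() keeps the written value, because the buffer and the register now agree and the feature bit says the buffer copy is valid.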