On 2/16/22 02:05, Greg KH wrote:
How was this tested, and what do the maintainers of this subsystem think? And will you be around to fix the bugs in this when they are found?
This has been trivial to reproduce. I've used a small repro (https://gist.github.com/bgaff/9f8cbfc8dd22e60f9492e4f0aff8f04f) and was also able to reproduce it using the protection_keys selftests on an 11th Gen Core i5-1135G7. I'm happy to commit to addressing any bugs that may appear. I'll see what the maintainers say. There is also a smaller fix for this specific issue that just involves using this_cpu_read() in switch_fpu_finish(), although that approach isn't as clean.
Can you add the test to the in-kernel tests so that we make sure it is fixed and never comes back?
It would be great if Brian could confirm this, but I'm 99% sure that it can be reproduced with the vm/protection_keys.c selftest if you run it for long enough.
The symptom here is corruption of the PKRU register. I created *lots* of bugs like this during protection keys development, so the selftest keeps a shadow copy of the register specifically to watch for corruption.
It's _plausible_ that no one ever ran the pkey selftests with a clang-compiled kernel for long enough to hit this issue.
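For readers unfamiliar with the shadow-copy technique mentioned above, here is a minimal userspace sketch of the idea. It is not the selftest's code: the register is modelled by a plain variable (fake_pkru) so the example runs anywhere, and every name in it is hypothetical. A real test would read and write the register with RDPKRU/WRPKRU instead.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the hardware PKRU register (an assumption for
 * illustration; real code would use the RDPKRU/WRPKRU instructions). */
uint32_t fake_pkru;

/* Shadow copy: the value the test believes the register should hold. */
uint32_t shadow_pkru;

/* Read the (simulated) register. */
uint32_t read_pkru(void)
{
	return fake_pkru;
}

/* Every deliberate write updates the shadow first, then the register. */
void write_pkru(uint32_t val)
{
	shadow_pkru = val;
	fake_pkru = val;
}

/* Compare the register against the shadow.
 * Returns 1 if they match, 0 if the register changed behind our back. */
int pkru_matches_shadow(void)
{
	uint32_t now = read_pkru();

	if (now != shadow_pkru) {
		fprintf(stderr, "PKRU corrupted: got 0x%08x, expected 0x%08x\n",
			(unsigned)now, (unsigned)shadow_pkru);
		return 0;
	}
	return 1;
}
```

Because all intentional writes go through write_pkru(), any mismatch seen by pkru_matches_shadow() means something else (for example, a bad restore on context switch) changed the register. Calling the check often and over a long run is what gives a rare corruption a chance to show up.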