On Tue, Apr 22, 2025, Borislav Petkov wrote:
On Tue, Apr 22, 2025 at 10:22:54AM -0700, Sean Christopherson wrote:
Because I really hate wagging the dog and "fixing" the kernel because something else can't be bothered. I didn't object stronger to that fix because it is meh, more of those "if I'm a guest" gunk which we sprinkle nowadays and that's apparently not that awful-ish...
FWIW, I think splattering X86_FEATURE_HYPERVISOR everywhere is quite awful. There are definitely cases where the kernel needs to know if it's running as a guest, because the behavior of "hardware" fundamentally changes in ways that can't be enumerated otherwise. E.g. that things like the HPET are fully emulated and thus will be prone to significant jitter.
But when it comes to feature enumeration, IMO sprinkling HYPERVISOR everywhere is unnecessary because it's the hypervisor/VMM's responsibility to present a sane model. And I also think it's outright dangerous, because everywhere the kernel does X for bare metal and Y for guest results in reduced test coverage.
E.g. things like syzkaller and other bots will largely be testing the HYPERVISOR code, while humans will largely be testing and using the bare metal code.
All valid points...
At least one case justifies the X86_FEATURE_HYPERVISOR check: microcode loading and we've chewed that topic back then with Xen ad nauseam.
Yeah, from my perspective, ucode loading falls into the "fundamentally different" bucket.
But I'd love to whack as many of such checks as possible.
$ git grep X86_FEATURE_HYPERVISOR | wc -l 60
I think I should start whacking at those and CC you if I'm not sure. It'll be a long-term, low prio thing but it'll be a good cleanup.
I did a quick pass. Most of the usage is "fine". E.g. explicit PV code, cases where checking for HYPERVISOR is the least awful option, etc.
Looks sketchy, might be worth investigating? -------------------------------------------- arch/x86/kernel/cpu/amd.c: if (!cpu_has(c, X86_FEATURE_HYPERVISOR) && arch/x86/kernel/cpu/amd.c: if (!cpu_has(c, X86_FEATURE_HYPERVISOR) && !cpu_has(c, X86_FEATURE_IBPB_BRTYPE)) { arch/x86/kernel/cpu/amd.c: if (c->x86_model < 0x14 && cpu_has(c, X86_FEATURE_LAHF_LM) && !cpu_has(c, X86_FEATURE_HYPERVISOR)) { arch/x86/kernel/cpu/amd.c: if (!cpu_has(c, X86_FEATURE_HYPERVISOR)) { arch/x86/kernel/cpu/amd.c: if (!cpu_has(c, X86_FEATURE_HYPERVISOR)) { arch/x86/kernel/cpu/amd.c: if (cpu_has(c, X86_FEATURE_HYPERVISOR)) arch/x86/kernel/cpu/amd.c: if (!cpu_has(c, X86_FEATURE_HYPERVISOR)) { arch/x86/kernel/cpu/amd.c: if (!cpu_has(c, X86_FEATURE_HYPERVISOR)) arch/x86/kernel/cpu/topology_amd.c: if (!boot_cpu_has(X86_FEATURE_HYPERVISOR) && tscan->c->x86_model <= 0x3) { arch/x86/mm/init_64.c: if (!boot_cpu_has(X86_FEATURE_HYPERVISOR)) { arch/x86/mm/pat/set_memory.c: return !cpu_feature_enabled(X86_FEATURE_HYPERVISOR); drivers/platform/x86/intel/pmc/pltdrv.c: if (cpu_feature_enabled(X86_FEATURE_HYPERVISOR) && !xen_initial_domain()) drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c: if (cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) --------------------------------------
Could do with some love, but not horrible. ------------------------------------------ Eww. Optimization to lessen the pain of DR7 interception. It'd be nice to clean this up at some point, especially with things like SEV-ES with DebugSwap, where DR7 is never intercepted. arch/x86/include/asm/debugreg.h: if (static_cpu_has(X86_FEATURE_HYPERVISOR) && !hw_breakpoint_active()) arch/x86/kernel/hw_breakpoint.c: * When in guest (X86_FEATURE_HYPERVISOR), local_db_save()
This usage should be restricted to just the FMS matching, but unfortunately needs to be kept for that check. arch/x86/kernel/cpu/bus_lock.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
Most of these look sane, e.g. are just being transparent about the state of mitigations when running in a VM. The use in update_srbds_msr() is the only one that stands out as somewhat sketchy. arch/x86/kernel/cpu/bugs.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) arch/x86/kernel/cpu/bugs.c: else if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) arch/x86/kernel/cpu/bugs.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { arch/x86/kernel/cpu/bugs.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { arch/x86/kernel/cpu/bugs.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { arch/x86/kernel/cpu/bugs.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { ------------------------------------------
Don't bother ------------------------------------------ Most of these look sane, e.g. are just being transparent about the state of mitigations when running in a VM. The use in update_srbds_msr() is the only one that stands out as somewhat sketchy. arch/x86/kernel/cpu/bugs.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) arch/x86/kernel/cpu/bugs.c: else if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) arch/x86/kernel/cpu/bugs.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { arch/x86/kernel/cpu/bugs.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { arch/x86/kernel/cpu/bugs.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { arch/x86/kernel/cpu/bugs.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
Perf, don't bother. PMUs are notoriously virtualization-unfriendly, and perf has had to resort to detecting its running in a VM to avoid crashing the kernel, and I don't see this being fully solved any time soon. arch/x86/events/core.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { arch/x86/events/intel/core.c: if (!boot_cpu_has(X86_FEATURE_HYPERVISOR)) arch/x86/events/intel/core.c: int assume = 3 * !boot_cpu_has(X86_FEATURE_HYPERVISOR); arch/x86/events/intel/cstate.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) arch/x86/events/intel/uncore.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
PV code of one form or another. arch/x86/include/asm/acrn.h: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) arch/x86/hyperv/ivm.c: if (!cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) arch/x86/kernel/cpu/mshyperv.c: if (!boot_cpu_has(X86_FEATURE_HYPERVISOR)) arch/x86/kernel/cpu/vmware.c: * If !boot_cpu_has(X86_FEATURE_HYPERVISOR), vmware_hypercall_mode arch/x86/kernel/cpu/vmware.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) { arch/x86/kernel/jailhouse.c: !boot_cpu_has(X86_FEATURE_HYPERVISOR)) arch/x86/kernel/kvm.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) arch/x86/kernel/paravirt.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) arch/x86/kernel/tsc.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR) || arch/x86/kvm/vmx/vmx.c: if (!static_cpu_has(X86_FEATURE_HYPERVISOR) || arch/x86/virt/svm/cmdline.c: if (!cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) {
Ugh. Eliding WBINVD when running as a VM. Probably the least awful option as there's no sane way to enumerate that WBINVD is a nop, and a "passthrough" setup can (and should) simply omit HYPERVISOR. arch/x86/include/asm/acenv.h: if (!cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) \
Skip sanity check on TSC deadline timer. Makes sense to keep; either the timer is emulated and thus not subject to hardware errata, or its passed through, in which case HYPERVSIOR arguably shouldn't be set. arch/x86/kernel/apic/apic.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
This "feature" is awful, but getting rid of it may not be feasible. https://lore.kernel.org/all/20250201005048.657470-1-seanjc@google.com arch/x86/kernel/cpu/mtrr/generic.c: if (!cpu_feature_enabled(X86_FEATURE_HYPERVISOR))
Exempting VMs from a gross workaround for old, buggy Intel chipsets. Fine to keep. drivers/acpi/processor_idle.c: if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
More mitigation crud, probably not worth pursuing. arch/x86/kernel/cpu/common.c: boot_cpu_has(X86_FEATURE_HYPERVISOR))) arch/x86/kernel/cpu/common.c: if (cpu_has(c, X86_FEATURE_HYPERVISOR)) {
LOL. Skip ucode revision check when detecting bad Spectre mitigation. arch/x86/kernel/cpu/intel.c: if (cpu_has(c, X86_FEATURE_HYPERVISOR)) ------------------------------------------