On 23/10/24 01:30, Sean Christopherson wrote:
On Tue, Oct 22, 2024, Adrian Hunter wrote:
On 22/10/24 19:30, Sean Christopherson wrote:
LOL, yeah, this needs to be burned with fire. It's wildly broken. So for stable@,
It doesn't seem wildly broken. Just the VMM passing invalid CPUID and KVM not validating it.
Heh, I agree with "just", but unfortunately "just ... not validating" a large swath of userspace inputs is pretty wildly broken. More importantly, it's not easy to fix. E.g. KVM could require the inputs to exactly match hardware, but that creates an ABI that I'm not entirely sure is desirable in the long term.
Although the CPUID ABI does not really change. KVM does not support emulating Intel PT, so accepting CPUID that the hardware cannot support seems like a bit of a lie.
But it's not all or nothing, e.g. KVM should support exposing fewer address ranges than are supported by hardware, so that the same virtual CPU model can be run on different generations of hardware.
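To illustrate the "not all or nothing" point, here is a rough sketch of a subset check: a guest configuration is accepted when it asks for no more than hardware offers, rather than having to match hardware exactly. The struct and function names are hypothetical and are not existing KVM code; only the CPUID leaf 0x14, subleaf 1 field layout in the comments is architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for illustration, not an existing KVM structure. */
struct pt_caps {
	unsigned int num_addr_ranges;	/* CPUID.(EAX=0x14,ECX=1):EAX[2:0] */
	uint16_t mtc_periods;		/* CPUID.(EAX=0x14,ECX=1):EAX[31:16], bitmap */
	uint16_t cycle_thresholds;	/* CPUID.(EAX=0x14,ECX=1):EBX[15:0], bitmap */
	uint16_t psb_freqs;		/* CPUID.(EAX=0x14,ECX=1):EBX[31:16], bitmap */
};

/* Hypothetical helper: true if the guest asks for no more than the host has. */
static bool pt_caps_is_subset(const struct pt_caps *guest,
			      const struct pt_caps *host)
{
	/* Fewer address ranges than hardware is fine, more is not. */
	if (guest->num_addr_ranges > host->num_addr_ranges)
		return false;

	/* Every bit set in a guest bitmap must also be set in the host's. */
	if (guest->mtc_periods & ~host->mtc_periods)
		return false;
	if (guest->cycle_thresholds & ~host->cycle_thresholds)
		return false;
	if (guest->psb_freqs & ~host->psb_freqs)
		return false;

	return true;
}
```

With a check of this shape, a virtual CPU model that advertises one address range would be accepted on hardware with two, while a model advertising more cycle thresholds than hardware supports would be rejected.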
Aren't there other features that KVM does not support if the hardware support is not there?
Many. But either features are one-off things without configurable properties, or KVM does the right thing (usually). E.g. nested virtualization heavily relies on hardware, and has a plethora of knobs, but KVM (usually) honors and validates the configuration provided by userspace.
To some degree, a testing and debugging feature does not have to be available in 100% of cases because it can still be useful when it is available.
I don't disagree, but "works on my machine" is how KVM has gotten into so many messes with such features. I also don't necessarily disagree with supporting a very limited subset of use cases, but I want such support to come as a well-defined package with proper guard rails, docs, and ideally tests.
Ok, so how about: leave VMM to choose CPUID, but then map it to what the hardware actually supports for what is possible. So the guest user might not get trace data exactly as expected, or perhaps not at all, but at least KVM doesn't die. Then add documentation to explain how it all works.
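As a minimal sketch of what "map it to what the hardware actually supports" could look like for CPUID leaf 0x14, subleaf 1: take the smaller address range count and intersect the capability bitmaps. The type, macro, and function names below are made up for illustration and are not taken from KVM; the register layout is the architectural one.

```c
#include <stdint.h>

/* Architectural layout of CPUID.(EAX=0x14,ECX=1); the names are hypothetical. */
#define PT_NUM_ADDR_RANGES_MASK	0x00000007u	/* EAX[2:0] */
#define PT_MTC_PERIODS_MASK	0xffff0000u	/* EAX[31:16], bitmap */

struct pt_leaf1 {
	uint32_t eax;
	uint32_t ebx;	/* [15:0] cycle thresholds, [31:16] PSB frequencies */
};

/* Hypothetical helper: reduce the VMM's request to what the host can do. */
static struct pt_leaf1 pt_clamp_leaf1(struct pt_leaf1 guest,
				      struct pt_leaf1 host)
{
	uint32_t g_ranges = guest.eax & PT_NUM_ADDR_RANGES_MASK;
	uint32_t h_ranges = host.eax & PT_NUM_ADDR_RANGES_MASK;
	uint32_t ranges = g_ranges < h_ranges ? g_ranges : h_ranges;
	struct pt_leaf1 out;

	/* Bitmaps: keep only encodings both sides advertise; drop reserved bits. */
	out.eax = (guest.eax & host.eax & PT_MTC_PERIODS_MASK) | ranges;
	out.ebx = guest.ebx & host.ebx;

	return out;
}
```

Under this approach a guest asking for a cycle threshold that the host lacks would simply lose that encoding, which is the "might not get trace data exactly as expected" case described above.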
Note, the number of address ranges is not that much of an issue because currently all processors that support Intel PT virtualization have 2.
I have a feeling QEMU was targeting compatibility with IceLake, which would probably work for all processors that support Intel PT virtualization except for one feature: the maximum number of cycle thresholds (dropped from 2048 to 16).