Emanuele Giuseppe Esposito <eesposit@redhat.com> writes:
On 31/03/2021 05:01, Sean Christopherson wrote:
On Tue, Mar 30, 2021, Emanuele Giuseppe Esposito wrote:
Calling the KVM_GET_[SUPPORTED/EMULATED]_CPUID ioctl requires the nent field of struct kvm_cpuid2 to be large enough to hold all entries that KVM will set. If nent is too high, KVM adjusts it to the right value; if it is too low, -E2BIG is returned.
However, when filling the entries, do_cpuid_func() requires an additional entry, so if the right nent is known in advance, passing the exact number of entries won't work: it has to be increased by one.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
 arch/x86/kvm/cpuid.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 6bd2f8b830e4..5412b48b9103 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -975,6 +975,12 @@ int kvm_dev_ioctl_get_cpuid(struct kvm_cpuid2 *cpuid,
 	if (cpuid->nent < 1)
 		return -E2BIG;
+	/* if there are X entries, we need to allocate at least X+1
+	 * entries but return the actual number of entries
+	 */
+	cpuid->nent++;
I don't see how this can be correct.
If this bonus entry really is needed, then won't that be reflected in array.nent? I.e. won't KVM overrun the userspace buffer?
If it's not reflected in array.nent, that would imply there's an off-by-one check somewhere, or KVM is creating an entry that it doesn't copy to userspace. The former seems unlikely as there are literally only two checks against maxnent, and they both look correct (famous last words...).
KVM does decrement array->nent in one specific case (CPUID.0xD.2..64), i.e. a false positive is theoretically possible, but that carries a WARN and requires a kernel or CPU bug as well. And fudging nent for that case would still break normal use cases due to the overrun problem.
What am I missing?
(Maybe I should have sent this series as an RFC)
The problem I noticed while writing the KVM_GET_EMULATED_CPUID selftest is the following: assume there are 3 KVM-emulated entries, and the user sets cpuid->nent = 3. This should work, because KVM sets 3 array->entries[] and copies them to user space.
However, when the 3rd entry (array->entries[2]) is populated inside KVM, array->nent is incremented to 3 (in do_host_cpuid() and __do_cpuid_func_emulated()). At that point, the loops in kvm_dev_ioctl_get_cpuid() and get_cpuid_func() can iterate once more and hit the

	if (array->nent >= array->maxnent)
		return -E2BIG;

check in __do_cpuid_func_emulated() and do_host_cpuid(), which returns the error. I agree that the check is needed there, because the code that follows accesses the array entry at index array->nent. But as far as I understand, that access can be pointless: the switch statement may fall to the default case without setting the entry, leaving array->nent at 3.
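The scenario described above can be reproduced with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `struct sketch_entry`, `struct sketch_array`, `emulated_check_first()` and `fill_leaves()` are hypothetical names, and the linear walk over leaves 0..last_func is a simplification of how the real callers iterate.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for kvm_cpuid_entry2 / kvm_cpuid_array. */
struct sketch_entry {
	uint32_t function;
};

struct sketch_array {
	struct sketch_entry *entries;
	int maxnent;
	int nent;
};

/*
 * Models the ordering discussed in the thread: the capacity check runs
 * before we know whether this leaf will emit an entry at all.
 */
static int emulated_check_first(struct sketch_array *array, uint32_t func)
{
	if (array->nent >= array->maxnent)
		return -E2BIG;

	switch (func) {
	case 0:
	case 1:
	case 7:
		/* Only these leaves emit an entry in this model. */
		array->entries[array->nent++].function = func;
		break;
	default:
		/* Nothing emitted, nent unchanged. */
		break;
	}
	return 0;
}

/* Simplified walk over leaves 0..last_func (an assumption of this sketch). */
static int fill_leaves(struct sketch_array *array, uint32_t last_func)
{
	for (uint32_t func = 0; func <= last_func; func++) {
		int r = emulated_check_first(array, func);

		if (r)
			return r;
		assert(array->nent <= array->maxnent);
	}
	return 0;
}
```

In this model, calling fill_leaves() up to leaf 8 with maxnent = 3 returns -E2BIG even though exactly 3 entries were written, because leaf 8 reaches the capacity check with nent == 3 and would not have emitted anything. With maxnent = 4 the same walk returns 0 and nent is still 3.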
The problem seems to be exclusive to __do_cpuid_func_emulated(); do_host_cpuid() always does
entry = &array->entries[array->nent++];
Something like (completely untested and stupid):
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 6bd2f8b830e4..54dcabd3abec 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -565,14 +565,22 @@ static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
 	return entry;
 }
 
+static bool cpuid_func_emulated(u32 func)
+{
+	return (func == 0) || (func == 1) || (func == 7);
+}
+
 static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
 {
 	struct kvm_cpuid_entry2 *entry;
 
+	if (!cpuid_func_emulated(func))
+		return 0;
+
 	if (array->nent >= array->maxnent)
 		return -E2BIG;
 
-	entry = &array->entries[array->nent];
+	entry = &array->entries[array->nent++];
 	entry->function = func;
 	entry->index = 0;
 	entry->flags = 0;
@@ -580,18 +588,14 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
 	switch (func) {
 	case 0:
 		entry->eax = 7;
-		++array->nent;
 		break;
 	case 1:
 		entry->ecx = F(MOVBE);
-		++array->nent;
 		break;
 	case 7:
 		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
 		entry->eax = 0;
 		entry->ecx = F(RDPID);
-		++array->nent;
-	default:
 		break;
 	}
should do the job, right?
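For illustration, the reordering suggested in the diff (filter out non-emulated leaves before the capacity check) can be modelled in userspace as follows. This is a rough sketch, not the kernel implementation: `struct demo_entry`, `struct demo_array`, `emulated_filter_first()` and `fill_filtered()` are hypothetical names, and the linear walk over leaves is an assumption made to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kvm_cpuid_entry2 / kvm_cpuid_array. */
struct demo_entry {
	uint32_t function;
};

struct demo_array {
	struct demo_entry *entries;
	int maxnent;
	int nent;
};

/* Mirrors the helper from the diff: only leaves 0, 1 and 7 emit entries. */
static bool cpuid_func_emulated(uint32_t func)
{
	return func == 0 || func == 1 || func == 7;
}

/*
 * Filter first, then check capacity: a leaf that emits nothing can no
 * longer trip -E2BIG on an exactly-sized buffer.
 */
static int emulated_filter_first(struct demo_array *array, uint32_t func)
{
	if (!cpuid_func_emulated(func))
		return 0;

	if (array->nent >= array->maxnent)
		return -E2BIG;

	array->entries[array->nent++].function = func;
	return 0;
}

/* Simplified walk over leaves 0..last_func (an assumption of this sketch). */
static int fill_filtered(struct demo_array *array, uint32_t last_func)
{
	for (uint32_t func = 0; func <= last_func; func++) {
		int r = emulated_filter_first(array, func);

		if (r)
			return r;
		assert(array->nent <= array->maxnent);
	}
	return 0;
}
```

In this model an exactly-sized buffer (maxnent = 3) now succeeds with nent == 3, while a buffer that really is too small (maxnent = 2) still fails with -E2BIG when leaf 7 needs a third slot.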