Re: [PATCH 2/7] KVM: x86: Implement Hyper-V's vCPU suspended state

15 Oct 2024


On 15.10.24 17:58, Sean Christopherson wrote:
...
...
And from a performance perspective, synchronizing on kvm->srcu is going to be
susceptible to random slowdowns, because writers will have to wait until all vCPUs
drop SRCU, even if they have nothing to do with PV TLB flushes.  E.g. if vCPUs
are faulting in memory from swap, uninhibiting TLB flushes could be stalled
unnecessarily for an extended duration.
This should be an easy fix, right? Just create a separate SRCU that is used only for the TLB flushes.
...
Lastly, KVM_REQ_EVENT is a big hammer (triggers a lot of processing) and semantically
misleading (there is no event to process).  At a glance, KVM_REQ_UNBLOCK is likely
more appropriate.
Before we spend too much time cleaning things up, I want to first settle on the
overall design, because it's not clear to me that punting HvTranslateVirtualAddress
to userspace is a net positive.  We agreed that VTLs should be modeled primarily
in userspace, but that doesn't automatically make punting everything to userspace
the best option, especially given the discussion at KVM Forum with respect to
implementing VTLs, VMPLs, TD partitions, etc.
I wasn't at the discussion, so maybe I'm missing something, but the hypercall
still needs VTL awareness. For one, it is primarily executed from VTL0 and
primarily targets VTL1 (primarily here means "that's what I see when I boot
Windows Server 2019"), so it would need to know which vCPU is the corresponding
VTL (this assumes one vCPU per VTL, as in the QEMU implementation). To make
matters worse, the hypercall can also arbitrarily choose to target a different
VP. This would require a way to map (VP index, VTL) -> (vcpu_id) within KVM.
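To make the mapping concrete, here is a minimal userspace sketch of a (VP index, VTL) -> vcpu_id table. This is not existing KVM code: the struct, the function names, and the limits (two VTLs, 64 VPs) are all hypothetical and only illustrate the kind of lookup the hypercall handler would need, assuming one vCPU per VTL as in the QEMU implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define HV_NUM_VTLS   2    /* assumption: only VTL0 and VTL1 are modeled */
#define MAX_VPS       64   /* assumption: arbitrary illustrative limit */
#define VCPU_ID_NONE  (-1) /* no vCPU registered for this (VP, VTL) pair */

/* Hypothetical table: one vcpu_id slot per (VP index, VTL) pair. */
struct vtl_vcpu_map {
	int vcpu_id[MAX_VPS][HV_NUM_VTLS];
};

/* Mark every slot as unassigned. */
static void vtl_map_init(struct vtl_vcpu_map *m)
{
	for (size_t vp = 0; vp < MAX_VPS; vp++)
		for (size_t vtl = 0; vtl < HV_NUM_VTLS; vtl++)
			m->vcpu_id[vp][vtl] = VCPU_ID_NONE;
}

/* Register the vCPU backing a given VTL of a given VP; -1 if out of range. */
static int vtl_map_set(struct vtl_vcpu_map *m, uint32_t vp_index,
		       uint8_t vtl, int vcpu_id)
{
	if (vp_index >= MAX_VPS || vtl >= HV_NUM_VTLS)
		return -1;
	m->vcpu_id[vp_index][vtl] = vcpu_id;
	return 0;
}

/* Resolve (VP index, VTL) to a vcpu_id, or VCPU_ID_NONE if unknown. */
static int vtl_map_lookup(const struct vtl_vcpu_map *m, uint32_t vp_index,
			  uint8_t vtl)
{
	if (vp_index >= MAX_VPS || vtl >= HV_NUM_VTLS)
		return VCPU_ID_NONE;
	return m->vcpu_id[vp_index][vtl];
}
```

The table itself is trivial. The hard part is that something (presumably userspace) has to populate it and keep it in sync, which is exactly the VTL awareness KVM would need to gain.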
...
The cover letters for this series and KVM_TRANSLATE2 simply say they're needed
for HvTranslateVirtualAddress, but neither series nor Nicolas' patch to punt
HVCALL_TRANSLATE_VIRTUAL_ADDRESS[*] justifies the split between userspace and
KVM.  And it very much is a split, because there are obviously a lot of details
around TlbFlushInhibit that bleed into KVM.
Side topic, what actually clears HvRegisterInterceptSuspend.TlbFlushInhibit?  The
TLFS just says
After the memory intercept routine performs instruction completion, it should
  clear the TlbFlushInhibit bit of the HvRegisterInterceptSuspend register.
but I can't find anything that says _how_ it clears TlbFlushInhibit.
The register cannot be accessed using the HvSetVpRegisters hypercall, but the TLFS
talks about it elsewhere. I'm assuming this is a formatting issue (there are a few
elsewhere). In 15.5.1.3 it says
To unlock the TLB, the higher VTL can clear this bit. Also, once a VP returns
  to a lower VTL, it releases all TLB locks which it holds at the time.
The QEMU implementation also just uninhibits on intercept exit, and that, at least,
does not crash.
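As a toy model of how I read the two TLFS statements, assuming a single per-VP inhibit flag (a simplification) and hypothetical names throughout, none of which are taken from KVM or QEMU:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified per-VP state. */
struct vp_model {
	uint8_t active_vtl;        /* VTL the VP is currently running in */
	bool    tlb_flush_inhibit; /* models InterceptSuspend.TlbFlushInhibit */
};

/* A memory intercept is delivered to a higher VTL; the TLB gets locked. */
static void vp_enter_intercept(struct vp_model *vp, uint8_t target_vtl)
{
	vp->active_vtl = target_vtl;
	vp->tlb_flush_inhibit = true;
}

/* Path 1: the higher VTL explicitly clears the bit to unlock the TLB. */
static void vp_clear_inhibit(struct vp_model *vp)
{
	vp->tlb_flush_inhibit = false;
}

/* Path 2: returning to a lower VTL drops any TLB lock still held. */
static void vp_return_to_lower_vtl(struct vp_model *vp)
{
	if (vp->active_vtl > 0)
		vp->active_vtl--;
	vp->tlb_flush_inhibit = false;
}

/* A PV TLB flush may proceed only while the flag is clear. */
static bool vp_tlb_flush_allowed(const struct vp_model *vp)
{
	return !vp->tlb_flush_inhibit;
}
```

In this model, uninhibiting on intercept exit corresponds to the second path, so the explicit clear is not strictly required for correctness.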
Nikolas
