Take the vCPU's mmu_seq snapshot as an "unsigned long" instead of an "int" when checking to see if a page fault is stale, as the sequence count is stored as an "unsigned long" everywhere else in KVM. This fixes a bug where KVM will effectively hang vCPUs due to always thinking page faults are stale, which results in KVM refusing to "fix" faults.
mmu_invalidate_seq (née mmu_notifier_seq) is a sequence counter used when KVM is handling page faults to detect if userspace mappings relevant to the guest were invalidated between snapshotting the counter and acquiring mmu_lock, i.e. to ensure that the userspace mapping KVM is using to resolve the page fault is fresh. If KVM sees that the counter has changed, KVM simply resumes the guest without fixing the fault.
What _should_ happen is that the source of the mmu_notifier invalidations eventually goes away, mmu_invalidate_seq becomes stable, and KVM can once again fix guest page fault(s).
But for a long-lived VM and/or a VM that the host just doesn't particularly like, it's possible for a VM to be on the receiving end of 2 billion (with a B) mmu_notifier invalidations. When that happens, bit 31 will be set in mmu_invalidate_seq. This causes the value to be turned into a 32-bit negative value when implicitly cast to an "int" by is_page_fault_stale(), and then sign-extended into a 64-bit unsigned when the signed "int" is implicitly cast back to an "unsigned long" on the call to mmu_invalidate_retry_hva().
As a result of the casting and sign-extension, given a sequence counter of e.g. 0x8002dc25, mmu_invalidate_retry_hva() ends up doing
if (0x8002dc25 != 0xffffffff8002dc25)
and signals that the page fault is stale and needs to be retried even though the sequence counter is stable, and KVM effectively hangs any vCPU that takes a page fault (EPT violation or #NPF when TDP is enabled).
Note, upstream commit ba6e3fe25543 ("KVM: x86/mmu: Grab mmu_invalidate_seq in kvm_faultin_pfn()") unknowingly fixed the bug in v6.3 when refactoring how KVM tracks the sequence counter snapshot.
Reported-by: Brian Rak brak@vultr.com Reported-by: Amaan Cheval amaan.cheval@gmail.com Reported-by: Eric Wheeler kvm@lists.ewheeler.net Closes: https://lore.kernel.org/all/f023d927-52aa-7e08-2ee5-59a2fbc65953@gameservers... Fixes: a955cad84cda ("KVM: x86/mmu: Retry page fault if root is invalidated by memslot update") Signed-off-by: Sean Christopherson seanjc@google.com --- arch/x86/kvm/mmu/mmu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 230108a90cf3..beca03556379 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4212,7 +4212,8 @@ static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) * root was invalidated by a memslot update or a relevant mmu_notifier fired. */ static bool is_page_fault_stale(struct kvm_vcpu *vcpu, - struct kvm_page_fault *fault, int mmu_seq) + struct kvm_page_fault *fault, + unsigned long mmu_seq) { struct kvm_mmu_page *sp = to_shadow_page(vcpu->arch.mmu->root.hpa);
base-commit: 802aacbbffe2512dce9f8f33ad99d01cfec435de
On Wed, Aug 23, 2023 at 06:01:04PM -0700, Sean Christopherson wrote:
Take the vCPU's mmu_seq snapshot as an "unsigned long" instead of an "int" when checking to see if a page fault is stale, as the sequence count is stored as an "unsigned long" everywhere else in KVM. This fixes a bug where KVM will effectively hang vCPUs due to always thinking page faults are stale, which results in KVM refusing to "fix" faults.
mmu_invalidate_seq (née mmu_notifier_seq) is a sequence counter used when KVM is handling page faults to detect if userspace mappings relevant to the guest were invalidated between snapshotting the counter and acquiring mmu_lock, i.e. to ensure that the userspace mapping KVM is using to resolve the page fault is fresh. If KVM sees that the counter has changed, KVM simply resumes the guest without fixing the fault.
What _should_ happen is that the source of the mmu_notifier invalidations eventually goes away, mmu_invalidate_seq becomes stable, and KVM can once again fix guest page fault(s).
But for a long-lived VM and/or a VM that the host just doesn't particularly like, it's possible for a VM to be on the receiving end of 2 billion (with a B) mmu_notifier invalidations. When that happens, bit 31 will be set in mmu_invalidate_seq. This causes the value to be turned into a 32-bit negative value when implicitly cast to an "int" by is_page_fault_stale(), and then sign-extended into a 64-bit unsigned when the signed "int" is implicitly cast back to an "unsigned long" on the call to mmu_invalidate_retry_hva().
As a result of the casting and sign-extension, given a sequence counter of e.g. 0x8002dc25, mmu_invalidate_retry_hva() ends up doing
if (0x8002dc25 != 0xffffffff8002dc25)
and signals that the page fault is stale and needs to be retried even though the sequence counter is stable, and KVM effectively hangs any vCPU that takes a page fault (EPT violation or #NPF when TDP is enabled).
Note, upstream commit ba6e3fe25543 ("KVM: x86/mmu: Grab mmu_invalidate_seq in kvm_faultin_pfn()") unknowingly fixed the bug in v6.3 when refactoring how KVM tracks the sequence counter snapshot.
Reported-by: Brian Rak brak@vultr.com Reported-by: Amaan Cheval amaan.cheval@gmail.com Reported-by: Eric Wheeler kvm@lists.ewheeler.net Closes: https://lore.kernel.org/all/f023d927-52aa-7e08-2ee5-59a2fbc65953@gameservers... Fixes: a955cad84cda ("KVM: x86/mmu: Retry page fault if root is invalidated by memslot update") Signed-off-by: Sean Christopherson seanjc@google.com
What is the git commit id of this change in Linus's tree?
thanks,
greg k-h
On Thu, Aug 24, 2023, Greg Kroah-Hartman wrote:
On Wed, Aug 23, 2023 at 06:01:04PM -0700, Sean Christopherson wrote:
Note, upstream commit ba6e3fe25543 ("KVM: x86/mmu: Grab mmu_invalidate_seq in kvm_faultin_pfn()") unknowingly fixed the bug in v6.3 when refactoring how KVM tracks the sequence counter snapshot.
Reported-by: Brian Rak brak@vultr.com Reported-by: Amaan Cheval amaan.cheval@gmail.com Reported-by: Eric Wheeler kvm@lists.ewheeler.net Closes: https://lore.kernel.org/all/f023d927-52aa-7e08-2ee5-59a2fbc65953@gameservers... Fixes: a955cad84cda ("KVM: x86/mmu: Retry page fault if root is invalidated by memslot update") Signed-off-by: Sean Christopherson seanjc@google.com
What is the git commit id of this change in Linus's tree?
There is none. Commit ba6e3fe25543 (landed in v6.3) unknowingly fixed the bug as part of a completely unrelated refactoring.
On Thu, Aug 24, 2023 at 06:46:59AM -0700, Sean Christopherson wrote:
On Thu, Aug 24, 2023, Greg Kroah-Hartman wrote:
On Wed, Aug 23, 2023 at 06:01:04PM -0700, Sean Christopherson wrote:
Note, upstream commit ba6e3fe25543 ("KVM: x86/mmu: Grab mmu_invalidate_seq in kvm_faultin_pfn()") unknowingly fixed the bug in v6.3 when refactoring how KVM tracks the sequence counter snapshot.
Reported-by: Brian Rak brak@vultr.com Reported-by: Amaan Cheval amaan.cheval@gmail.com Reported-by: Eric Wheeler kvm@lists.ewheeler.net Closes: https://lore.kernel.org/all/f023d927-52aa-7e08-2ee5-59a2fbc65953@gameservers... Fixes: a955cad84cda ("KVM: x86/mmu: Retry page fault if root is invalidated by memslot update") Signed-off-by: Sean Christopherson seanjc@google.com
What is the git commit id of this change in Linus's tree?
There is none. Commit ba6e3fe25543 (landed in v6.3) unknowingly fixed the bug as part of a completely unrelated refactoring.
Ah, missed that in the text here, thanks!
greg k-h
On Thu, Aug 24, 2023 at 04:46:44PM +0200, Greg Kroah-Hartman wrote:
On Thu, Aug 24, 2023 at 06:46:59AM -0700, Sean Christopherson wrote:
On Thu, Aug 24, 2023, Greg Kroah-Hartman wrote:
On Wed, Aug 23, 2023 at 06:01:04PM -0700, Sean Christopherson wrote:
Note, upstream commit ba6e3fe25543 ("KVM: x86/mmu: Grab mmu_invalidate_seq in kvm_faultin_pfn()") unknowingly fixed the bug in v6.3 when refactoring how KVM tracks the sequence counter snapshot.
Reported-by: Brian Rak brak@vultr.com Reported-by: Amaan Cheval amaan.cheval@gmail.com Reported-by: Eric Wheeler kvm@lists.ewheeler.net Closes: https://lore.kernel.org/all/f023d927-52aa-7e08-2ee5-59a2fbc65953@gameservers... Fixes: a955cad84cda ("KVM: x86/mmu: Retry page fault if root is invalidated by memslot update") Signed-off-by: Sean Christopherson seanjc@google.com
What is the git commit id of this change in Linus's tree?
There is none. Commit ba6e3fe25543 (landed in v6.3) unknowingly fixed the bug as part of a completely unrelated refactoring.
Ah, missed that in the text here, thanks!
Now queued up, thanks.
greg k-h
linux-stable-mirror@lists.linaro.org