Upstream commit:
f775b13eedee ("x86,kvm: move qemu/guest FPU switching out to vcpu_run")
introduced a bug, which was later fixed by upstream commit:
5663d8f9bbe4 ("kvm: x86: fix WARN due to uninitialized guest FPU state")
For reasons unknown, both commits were initially passed over for inclusion in the 4.14 stable branch despite being tagged for stable. Eventually, someone noticed that the fixup, commit 5663d8f9bbe4, was missing from stable [1], and so it was queued up for 4.14 and included in release v4.14.79.
Even later, the original buggy patch, commit f775b13eedee, was also applied to the 4.14 stable branch. Through an unlucky coincidence, the incorrect ordering did not generate a conflict between the two patches, so v4.14.94 and later releases contain a spurious call to kvm_load_guest_fpu() in kvm_arch_vcpu_ioctl_run(). As a result, KVM may reload stale guest FPU state, e.g. after accepting an INIT event. This can manifest as crashes during boot, segfaults, failed checksums, and so on.
Remove the unwanted kvm_{load,put}_guest_fpu() calls, i.e. make kvm_arch_vcpu_ioctl_run() look like commit 5663d8f9bbe4 was backported after commit f775b13eedee.
[1] https://www.spinics.net/lists/stable/msg263931.html
Fixes: 4124a4cff344 ("x86,kvm: move qemu/guest FPU switching out to vcpu_run")
Cc: stable@vger.kernel.org
Cc: Sasha Levin <sashal@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Reported-by: Roman Mamedov
Reported-by: Thomas Lindroth <thomas.lindroth@gmail.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/x86.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 130be2efafbe..af7ab2c71786 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7423,14 +7423,12 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		}
 	}
 
-	kvm_load_guest_fpu(vcpu);
-
 	if (unlikely(vcpu->arch.complete_userspace_io)) {
 		int (*cui)(struct kvm_vcpu *) = vcpu->arch.complete_userspace_io;
 		vcpu->arch.complete_userspace_io = NULL;
 		r = cui(vcpu);
 		if (r <= 0)
-			goto out_fpu;
+			goto out;
 	} else
 		WARN_ON(vcpu->arch.pio.count || vcpu->mmio_needed);
 
@@ -7439,8 +7437,6 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	else
 		r = vcpu_run(vcpu);
 
-out_fpu:
-	kvm_put_guest_fpu(vcpu);
 out:
 	kvm_put_guest_fpu(vcpu);
 	post_kvm_run_save(vcpu);
On Mon, Jan 28, 2019 at 12:51:02PM -0800, Sean Christopherson wrote:
I agree with your analysis and the patch makes sense. Hopefully one of the KVM folks can Ack.
--
Thanks,
Sasha
On 28/01/19 23:14, Sasha Levin wrote:
I agree with your analysis and the patch makes sense. Hopefully one of the KVM folks can Ack.
Acked-by: Paolo Bonzini pbonzini@redhat.com
Paolo
On Mon, Jan 28, 2019 at 12:51:02PM -0800, Sean Christopherson wrote:
Thanks so much for this, sorry for the mis-merge, nice catch!
Now queued up.
greg k-h
On 1/28/19 9:51 PM, Sean Christopherson wrote:
I applied this patch on top of a standard 4.14.96 kernel and ran the stress-ng test for a few hours. No errors or other problems to report.
linux-stable-mirror@lists.linaro.org