This is a note to let you know that I've just added the patch titled
KVM: X86: Fix softlockup when get the current kvmclock
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-fix-softlockup-when-get-the-current-kvmclock.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Wanpeng Li <kernellwp(a)gmail.com>
Date: Mon, 20 Nov 2017 14:55:05 -0800
Subject: KVM: X86: Fix softlockup when get the current kvmclock
From: Wanpeng Li <kernellwp(a)gmail.com>
[ Upstream commit e70b57a6ce4e8b92a56a615ae79bdb2bd66035e7 ]
watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [qemu-system-x86:10185]
CPU: 6 PID: 10185 Comm: qemu-system-x86 Tainted: G OE 4.14.0-rc4+ #4
RIP: 0010:kvm_get_time_scale+0x4e/0xa0 [kvm]
Call Trace:
get_time_ref_counter+0x5a/0x80 [kvm]
kvm_hv_process_stimers+0x120/0x5f0 [kvm]
kvm_arch_vcpu_ioctl_run+0x4b4/0x1690 [kvm]
kvm_vcpu_ioctl+0x33a/0x620 [kvm]
do_vfs_ioctl+0xa1/0x5d0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x1e/0xa9
This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and
cpu-hotplug stress simultaneously. __this_cpu_read(cpu_tsc_khz) returns 0
(set in kvmclock_cpu_down_prep()) when the pCPU is unhotplug which results
in kvm_get_time_scale() gets into an infinite loop.
This patch fixes it by treating the unhotplug pCPU as not using master clock.
Reviewed-by: Radim Krčmář <rkrcmar(a)redhat.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/x86.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1795,10 +1795,13 @@ u64 get_kvmclock_ns(struct kvm *kvm)
/* both __this_cpu_read() and rdtsc() should be on the same cpu */
get_cpu();
- kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL,
- &hv_clock.tsc_shift,
- &hv_clock.tsc_to_system_mul);
- ret = __pvclock_read_cycles(&hv_clock, rdtsc());
+ if (__this_cpu_read(cpu_tsc_khz)) {
+ kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL,
+ &hv_clock.tsc_shift,
+ &hv_clock.tsc_to_system_mul);
+ ret = __pvclock_read_cycles(&hv_clock, rdtsc());
+ } else
+ ret = ktime_get_boot_ns() + ka->kvmclock_offset;
put_cpu();
Patches currently in stable-queue which might be from kernellwp(a)gmail.com are
queue-4.14/kvm-x86-fix-softlockup-when-get-the-current-kvmclock.patch
This is a note to let you know that I've just added the patch titled
KVM: X86: Fix operand/address-size during instruction decoding
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-fix-operand-address-size-during-instruction-decoding.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
Date: Sun, 5 Nov 2017 16:54:47 -0800
Subject: KVM: X86: Fix operand/address-size during instruction decoding
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
[ Upstream commit 3853be2603191829b442b64dac6ae8ba0c027bf9 ]
Pedro reported:
During tests that we conducted on KVM, we noticed that executing a "PUSH %ES"
instruction under KVM produces different results on both memory and the SP
register depending on whether EPT support is enabled. With EPT the SP is
reduced by 4 bytes (and the written value is 0-padded) but without EPT support
it is only reduced by 2 bytes. The difference can be observed when the CS.DB
field is 1 (32-bit) but not when it's 0 (16-bit).
The internal segment descriptor cache exist even in real/vm8096 mode. The CS.D
also should be respected instead of just default operand/address-size/66H
prefix/67H prefix during instruction decoding. This patch fixes it by also
adjusting operand/address-size according to CS.D.
Reported-by: Pedro Fonseca <pfonseca(a)cs.washington.edu>
Tested-by: Pedro Fonseca <pfonseca(a)cs.washington.edu>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Pedro Fonseca <pfonseca(a)cs.washington.edu>
Signed-off-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/emulate.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -5009,6 +5009,8 @@ int x86_decode_insn(struct x86_emulate_c
bool op_prefix = false;
bool has_seg_override = false;
struct opcode opcode;
+ u16 dummy;
+ struct desc_struct desc;
ctxt->memop.type = OP_NONE;
ctxt->memopp = NULL;
@@ -5027,6 +5029,11 @@ int x86_decode_insn(struct x86_emulate_c
switch (mode) {
case X86EMUL_MODE_REAL:
case X86EMUL_MODE_VM86:
+ def_op_bytes = def_ad_bytes = 2;
+ ctxt->ops->get_segment(ctxt, &dummy, &desc, NULL, VCPU_SREG_CS);
+ if (desc.d)
+ def_op_bytes = def_ad_bytes = 4;
+ break;
case X86EMUL_MODE_PROT16:
def_op_bytes = def_ad_bytes = 2;
break;
Patches currently in stable-queue which might be from wanpeng.li(a)hotmail.com are
queue-4.14/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.14/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.14/kvm-nvmx-fix-mmu-context-after-vmlaunch-vmresume-failure.patch
queue-4.14/kvm-vmx-fix-rflags-cache-during-vcpu-reset.patch
queue-4.14/kvm-x86-fix-operand-address-size-during-instruction-decoding.patch
queue-4.14/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.14/kvm-nvmx-nsvm-don-t-intercept-ud-when-running-l2.patch
queue-4.14/kvm-x86-fix-softlockup-when-get-the-current-kvmclock.patch
queue-4.14/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: emulator: Return to user-mode on L1 CPL=0 emulation failure
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Liran Alon <liran.alon(a)oracle.com>
Date: Sun, 5 Nov 2017 16:56:33 +0200
Subject: KVM: x86: emulator: Return to user-mode on L1 CPL=0 emulation failure
From: Liran Alon <liran.alon(a)oracle.com>
[ Upstream commit 1f4dcb3b213235e642088709a1c54964d23365e9 ]
On this case, handle_emulation_failure() fills kvm_run with
internal-error information which it expects to be delivered
to user-mode for further processing.
However, the code reports a wrong return-value which makes KVM to never
return to user-mode on this scenario.
Fixes: 6d77dbfc88e3 ("KVM: inject #UD if instruction emulation fails and exit to
userspace")
Signed-off-by: Liran Alon <liran.alon(a)oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/x86.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5416,7 +5416,7 @@ static int handle_emulation_failure(stru
vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
vcpu->run->internal.ndata = 0;
- r = EMULATE_FAIL;
+ r = EMULATE_USER_EXIT;
}
kvm_queue_exception(vcpu, UD_VECTOR);
Patches currently in stable-queue which might be from liran.alon(a)oracle.com are
queue-4.14/kvm-nvmx-fix-vmx_check_nested_events-return-value-in-case-an-event-was-reinjected-to-l2.patch
queue-4.14/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.14/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.14/kvm-x86-ioapic-fix-level-triggered-eoi-and-ioapic-reconfigure-race.patch
queue-4.14/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.14/kvm-nvmx-nsvm-don-t-intercept-ud-when-running-l2.patch
queue-4.14/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: emulate #UD while in guest mode
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-emulate-ud-while-in-guest-mode.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 11 Jan 2018 16:55:24 +0100
Subject: KVM: x86: emulate #UD while in guest mode
From: Paolo Bonzini <pbonzini(a)redhat.com>
[ Upstream commit bd89525a823ce6edddcedbe9aed79faa1b9cf544 ]
This reverts commits ae1f57670703656cc9f293722c3b8b6782f8ab3f
and ac9b305caa0df6f5b75d294e4b86c1027648991e.
If the hardware doesn't support MOVBE, but L0 sets CPUID.01H:ECX.MOVBE
in L1's emulated CPUID information, then L1 is likely to pass that
CPUID bit through to L2. L2 will expect MOVBE to work, but if L1
doesn't intercept #UD, then any MOVBE instruction executed in L2 will
raise #UD, and the exception will be delivered in L2.
Commit ac9b305caa0df6f5b75d294e4b86c1027648991e is a better and more
complete version of ae1f57670703 ("KVM: nVMX: Do not emulate #UD while
in guest mode"); however, neither considers the above case.
Suggested-by: Jim Mattson <jmattson(a)redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/svm.c | 9 +--------
arch/x86/kvm/vmx.c | 5 +----
2 files changed, 2 insertions(+), 12 deletions(-)
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -362,7 +362,6 @@ static void recalc_intercepts(struct vcp
{
struct vmcb_control_area *c, *h;
struct nested_state *g;
- u32 h_intercept_exceptions;
mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
@@ -373,14 +372,9 @@ static void recalc_intercepts(struct vcp
h = &svm->nested.hsave->control;
g = &svm->nested;
- /* No need to intercept #UD if L1 doesn't intercept it */
- h_intercept_exceptions =
- h->intercept_exceptions & ~(1U << UD_VECTOR);
-
c->intercept_cr = h->intercept_cr | g->intercept_cr;
c->intercept_dr = h->intercept_dr | g->intercept_dr;
- c->intercept_exceptions =
- h_intercept_exceptions | g->intercept_exceptions;
+ c->intercept_exceptions = h->intercept_exceptions | g->intercept_exceptions;
c->intercept = h->intercept | g->intercept;
}
@@ -2195,7 +2189,6 @@ static int ud_interception(struct vcpu_s
{
int er;
- WARN_ON_ONCE(is_guest_mode(&svm->vcpu));
er = emulate_instruction(&svm->vcpu, EMULTYPE_TRAP_UD);
if (er == EMULATE_USER_EXIT)
return 0;
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1891,7 +1891,7 @@ static void update_exception_bitmap(stru
{
u32 eb;
- eb = (1u << PF_VECTOR) | (1u << MC_VECTOR) |
+ eb = (1u << PF_VECTOR) | (1u << UD_VECTOR) | (1u << MC_VECTOR) |
(1u << DB_VECTOR) | (1u << AC_VECTOR);
if ((vcpu->guest_debug &
(KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP)) ==
@@ -1909,8 +1909,6 @@ static void update_exception_bitmap(stru
*/
if (is_guest_mode(vcpu))
eb |= get_vmcs12(vcpu)->exception_bitmap;
- else
- eb |= 1u << UD_VECTOR;
vmcs_write32(EXCEPTION_BITMAP, eb);
}
@@ -5921,7 +5919,6 @@ static int handle_exception(struct kvm_v
return 1; /* already handled by vmx_vcpu_run() */
if (is_invalid_opcode(intr_info)) {
- WARN_ON_ONCE(is_guest_mode(vcpu));
er = emulate_instruction(vcpu, EMULTYPE_TRAP_UD);
if (er == EMULATE_USER_EXIT)
return 0;
Patches currently in stable-queue which might be from pbonzini(a)redhat.com are
queue-4.14/kvm-nvmx-fix-mmu-context-after-vmlaunch-vmresume-failure.patch
queue-4.14/kvm-vmx-fix-rflags-cache-during-vcpu-reset.patch
queue-4.14/kvm-x86-emulate-ud-while-in-guest-mode.patch
queue-4.14/kvm-x86-fix-operand-address-size-during-instruction-decoding.patch
queue-4.14/kvm-nvmx-nsvm-don-t-intercept-ud-when-running-l2.patch
queue-4.14/kvm-let-kvm_set_signal_mask-work-as-advertised.patch
queue-4.14/kvm-x86-fix-softlockup-when-get-the-current-kvmclock.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: Don't re-execute instruction when not passing CR2 value
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Liran Alon <liran.alon(a)oracle.com>
Date: Sun, 5 Nov 2017 16:56:34 +0200
Subject: KVM: x86: Don't re-execute instruction when not passing CR2 value
From: Liran Alon <liran.alon(a)oracle.com>
[ Upstream commit 9b8ae63798cb97e785a667ff27e43fa6220cb734 ]
In case of instruction-decode failure or emulation failure,
x86_emulate_instruction() will call reexecute_instruction() which will
attempt to use the cr2 value passed to x86_emulate_instruction().
However, when x86_emulate_instruction() is called from
emulate_instruction(), cr2 is not passed (passed as 0) and therefore
it doesn't make sense to execute reexecute_instruction() logic at all.
Fixes: 51d8b66199e9 ("KVM: cleanup emulate_instruction")
Signed-off-by: Liran Alon <liran.alon(a)oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/kvm/vmx.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1156,7 +1156,8 @@ int x86_emulate_instruction(struct kvm_v
static inline int emulate_instruction(struct kvm_vcpu *vcpu,
int emulation_type)
{
- return x86_emulate_instruction(vcpu, 0, emulation_type, NULL, 0);
+ return x86_emulate_instruction(vcpu, 0,
+ emulation_type | EMULTYPE_NO_REEXECUTE, NULL, 0);
}
void kvm_enable_efer_bits(u64);
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6607,7 +6607,7 @@ static int handle_invalid_guest_state(st
if (kvm_test_request(KVM_REQ_EVENT, vcpu))
return 1;
- err = emulate_instruction(vcpu, EMULTYPE_NO_REEXECUTE);
+ err = emulate_instruction(vcpu, 0);
if (err == EMULATE_USER_EXIT) {
++vcpu->stat.mmio_exits;
Patches currently in stable-queue which might be from liran.alon(a)oracle.com are
queue-4.14/kvm-nvmx-fix-vmx_check_nested_events-return-value-in-case-an-event-was-reinjected-to-l2.patch
queue-4.14/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.14/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.14/kvm-x86-ioapic-fix-level-triggered-eoi-and-ioapic-reconfigure-race.patch
queue-4.14/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.14/kvm-nvmx-nsvm-don-t-intercept-ud-when-running-l2.patch
queue-4.14/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
This is a note to let you know that I've just added the patch titled
KVM: VMX: Fix rflags cache during vCPU reset
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-vmx-fix-rflags-cache-during-vcpu-reset.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
Date: Mon, 20 Nov 2017 14:52:21 -0800
Subject: KVM: VMX: Fix rflags cache during vCPU reset
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
[ Upstream commit c37c28730bb031cc8a44a130c2555c0f3efbe2d0 ]
Reported by syzkaller:
*** Guest State ***
CR0: actual=0x0000000080010031, shadow=0x0000000060000010, gh_mask=fffffffffffffff7
CR4: actual=0x0000000000002061, shadow=0x0000000000000000, gh_mask=ffffffffffffe8f1
CR3 = 0x000000002081e000
RSP = 0x000000000000fffa RIP = 0x0000000000000000
RFLAGS=0x00023000 DR7 = 0x00000000000000
^^^^^^^^^^
------------[ cut here ]------------
WARNING: CPU: 6 PID: 24431 at /home/kernel/linux/arch/x86/kvm//x86.c:7302 kvm_arch_vcpu_ioctl_run+0x651/0x2ea0 [kvm]
CPU: 6 PID: 24431 Comm: reprotest Tainted: G W OE 4.14.0+ #26
RIP: 0010:kvm_arch_vcpu_ioctl_run+0x651/0x2ea0 [kvm]
RSP: 0018:ffff880291d179e0 EFLAGS: 00010202
Call Trace:
kvm_vcpu_ioctl+0x479/0x880 [kvm]
do_vfs_ioctl+0x142/0x9a0
SyS_ioctl+0x74/0x80
entry_SYSCALL_64_fastpath+0x23/0x9a
The failed vmentry is triggered by the following beautified testcase:
#include <unistd.h>
#include <sys/syscall.h>
#include <string.h>
#include <stdint.h>
#include <linux/kvm.h>
#include <fcntl.h>
#include <sys/ioctl.h>
long r[5];
int main()
{
struct kvm_debugregs dr = { 0 };
r[2] = open("/dev/kvm", O_RDONLY);
r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
struct kvm_guest_debug debug = {
.control = 0xf0403,
.arch = {
.debugreg[6] = 0x2,
.debugreg[7] = 0x2
}
};
ioctl(r[4], KVM_SET_GUEST_DEBUG, &debug);
ioctl(r[4], KVM_RUN, 0);
}
which testcase tries to setup the processor specific debug
registers and configure vCPU for handling guest debug events through
KVM_SET_GUEST_DEBUG. The KVM_SET_GUEST_DEBUG ioctl will get and set
rflags in order to set TF bit if single step is needed. All regs' caches
are reset to avail and GUEST_RFLAGS vmcs field is reset to 0x2 during vCPU
reset. However, the cache of rflags is not reset during vCPU reset. The
function vmx_get_rflags() returns an unreset rflags cache value since
the cache is marked avail, it is 0 after boot. Vmentry fails if the
rflags reserved bit 1 is 0.
This patch fixes it by resetting both the GUEST_RFLAGS vmcs field and
its cache to 0x2 during vCPU reset.
Reported-by: Dmitry Vyukov <dvyukov(a)google.com>
Tested-by: Dmitry Vyukov <dvyukov(a)google.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Signed-off-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5608,7 +5608,7 @@ static void vmx_vcpu_reset(struct kvm_vc
vmcs_write64(GUEST_IA32_DEBUGCTL, 0);
}
- vmcs_writel(GUEST_RFLAGS, 0x02);
+ kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
kvm_rip_write(vcpu, 0xfff0);
vmcs_writel(GUEST_GDTR_BASE, 0);
Patches currently in stable-queue which might be from wanpeng.li(a)hotmail.com are
queue-4.14/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.14/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.14/kvm-nvmx-fix-mmu-context-after-vmlaunch-vmresume-failure.patch
queue-4.14/kvm-vmx-fix-rflags-cache-during-vcpu-reset.patch
queue-4.14/kvm-x86-fix-operand-address-size-during-instruction-decoding.patch
queue-4.14/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.14/kvm-nvmx-nsvm-don-t-intercept-ud-when-running-l2.patch
queue-4.14/kvm-x86-fix-softlockup-when-get-the-current-kvmclock.patch
queue-4.14/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
This is a note to let you know that I've just added the patch titled
KVM: nVMX/nSVM: Don't intercept #UD when running L2
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-nvmx-nsvm-don-t-intercept-ud-when-running-l2.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Liran Alon <liran.alon(a)oracle.com>
Date: Mon, 6 Nov 2017 16:15:10 +0200
Subject: KVM: nVMX/nSVM: Don't intercept #UD when running L2
From: Liran Alon <liran.alon(a)oracle.com>
[ Upstream commit ac9b305caa0df6f5b75d294e4b86c1027648991e ]
When running L2, #UD should be intercepted by L1 or just forwarded
directly to L2. It should not reach L0 x86 emulator.
Therefore, set intercept for #UD only based on L1 exception-bitmap.
Also add WARN_ON_ONCE() on L0 #UD intercept handlers to make sure
it is never reached while running L2.
This improves commit ae1f57670703 ("KVM: nVMX: Do not emulate #UD while
in guest mode") by removing an unnecessary exit from L2 to L0 on #UD
when L1 doesn't intercept it.
In addition, SVM L0 #UD intercept handler doesn't handle correctly the
case it is raised from L2. In this case, it should forward the #UD to
guest instead of x86 emulator. As done in VMX #UD intercept handler.
This commit fixes this issue as-well.
Signed-off-by: Liran Alon <liran.alon(a)oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Reviewed-by: Paolo Bonzini <pbonzini(a)redhat.com>
Reviewed-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/svm.c | 9 ++++++++-
arch/x86/kvm/vmx.c | 9 ++++-----
2 files changed, 12 insertions(+), 6 deletions(-)
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -362,6 +362,7 @@ static void recalc_intercepts(struct vcp
{
struct vmcb_control_area *c, *h;
struct nested_state *g;
+ u32 h_intercept_exceptions;
mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
@@ -372,9 +373,14 @@ static void recalc_intercepts(struct vcp
h = &svm->nested.hsave->control;
g = &svm->nested;
+ /* No need to intercept #UD if L1 doesn't intercept it */
+ h_intercept_exceptions =
+ h->intercept_exceptions & ~(1U << UD_VECTOR);
+
c->intercept_cr = h->intercept_cr | g->intercept_cr;
c->intercept_dr = h->intercept_dr | g->intercept_dr;
- c->intercept_exceptions = h->intercept_exceptions | g->intercept_exceptions;
+ c->intercept_exceptions =
+ h_intercept_exceptions | g->intercept_exceptions;
c->intercept = h->intercept | g->intercept;
}
@@ -2189,6 +2195,7 @@ static int ud_interception(struct vcpu_s
{
int er;
+ WARN_ON_ONCE(is_guest_mode(&svm->vcpu));
er = emulate_instruction(&svm->vcpu, EMULTYPE_TRAP_UD);
if (er == EMULATE_USER_EXIT)
return 0;
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1891,7 +1891,7 @@ static void update_exception_bitmap(stru
{
u32 eb;
- eb = (1u << PF_VECTOR) | (1u << UD_VECTOR) | (1u << MC_VECTOR) |
+ eb = (1u << PF_VECTOR) | (1u << MC_VECTOR) |
(1u << DB_VECTOR) | (1u << AC_VECTOR);
if ((vcpu->guest_debug &
(KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP)) ==
@@ -1909,6 +1909,8 @@ static void update_exception_bitmap(stru
*/
if (is_guest_mode(vcpu))
eb |= get_vmcs12(vcpu)->exception_bitmap;
+ else
+ eb |= 1u << UD_VECTOR;
vmcs_write32(EXCEPTION_BITMAP, eb);
}
@@ -5919,10 +5921,7 @@ static int handle_exception(struct kvm_v
return 1; /* already handled by vmx_vcpu_run() */
if (is_invalid_opcode(intr_info)) {
- if (is_guest_mode(vcpu)) {
- kvm_queue_exception(vcpu, UD_VECTOR);
- return 1;
- }
+ WARN_ON_ONCE(is_guest_mode(vcpu));
er = emulate_instruction(vcpu, EMULTYPE_TRAP_UD);
if (er == EMULATE_USER_EXIT)
return 0;
Patches currently in stable-queue which might be from liran.alon(a)oracle.com are
queue-4.14/kvm-nvmx-fix-vmx_check_nested_events-return-value-in-case-an-event-was-reinjected-to-l2.patch
queue-4.14/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.14/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.14/kvm-x86-ioapic-fix-level-triggered-eoi-and-ioapic-reconfigure-race.patch
queue-4.14/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.14/kvm-nvmx-nsvm-don-t-intercept-ud-when-running-l2.patch
queue-4.14/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
This is a note to let you know that I've just added the patch titled
KVM: nVMX: Fix vmx_check_nested_events() return value in case an event was reinjected to L2
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-nvmx-fix-vmx_check_nested_events-return-value-in-case-an-event-was-reinjected-to-l2.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Liran Alon <liran.alon(a)oracle.com>
Date: Sun, 5 Nov 2017 16:07:43 +0200
Subject: KVM: nVMX: Fix vmx_check_nested_events() return value in case an event was reinjected to L2
From: Liran Alon <liran.alon(a)oracle.com>
[ Upstream commit 917dc6068bc12a2dafffcf0e9d405ddb1b8780cb ]
vmx_check_nested_events() should return -EBUSY only in case there is a
pending L1 event which requires a VMExit from L2 to L1 but such a
VMExit is currently blocked. Such VMExits are blocked either
because nested_run_pending=1 or an event was reinjected to L2.
vmx_check_nested_events() should return 0 in case there are no
pending L1 events which requires a VMExit from L2 to L1 or if
a VMExit from L2 to L1 was done internally.
However, upstream commit which introduced blocking in case an event was
reinjected to L2 (commit acc9ab601327 ("KVM: nVMX: Fix pending events
injection")) contains a bug: It returns -EBUSY even if there are no
pending L1 events which requires VMExit from L2 to L1.
This commit fix this issue.
Fixes: acc9ab601327 ("KVM: nVMX: Fix pending events injection")
Signed-off-by: Liran Alon <liran.alon(a)oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11114,13 +11114,12 @@ static int vmx_check_nested_events(struc
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
unsigned long exit_qual;
-
- if (kvm_event_needs_reinjection(vcpu))
- return -EBUSY;
+ bool block_nested_events =
+ vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu);
if (vcpu->arch.exception.pending &&
nested_vmx_check_exception(vcpu, &exit_qual)) {
- if (vmx->nested.nested_run_pending)
+ if (block_nested_events)
return -EBUSY;
nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
vcpu->arch.exception.pending = false;
@@ -11129,14 +11128,14 @@ static int vmx_check_nested_events(struc
if (nested_cpu_has_preemption_timer(get_vmcs12(vcpu)) &&
vmx->nested.preemption_timer_expired) {
- if (vmx->nested.nested_run_pending)
+ if (block_nested_events)
return -EBUSY;
nested_vmx_vmexit(vcpu, EXIT_REASON_PREEMPTION_TIMER, 0, 0);
return 0;
}
if (vcpu->arch.nmi_pending && nested_exit_on_nmi(vcpu)) {
- if (vmx->nested.nested_run_pending)
+ if (block_nested_events)
return -EBUSY;
nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
NMI_VECTOR | INTR_TYPE_NMI_INTR |
@@ -11152,7 +11151,7 @@ static int vmx_check_nested_events(struc
if ((kvm_cpu_has_interrupt(vcpu) || external_intr) &&
nested_exit_on_intr(vcpu)) {
- if (vmx->nested.nested_run_pending)
+ if (block_nested_events)
return -EBUSY;
nested_vmx_vmexit(vcpu, EXIT_REASON_EXTERNAL_INTERRUPT, 0, 0);
return 0;
Patches currently in stable-queue which might be from liran.alon(a)oracle.com are
queue-4.14/kvm-nvmx-fix-vmx_check_nested_events-return-value-in-case-an-event-was-reinjected-to-l2.patch
queue-4.14/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.14/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.14/kvm-x86-ioapic-fix-level-triggered-eoi-and-ioapic-reconfigure-race.patch
queue-4.14/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.14/kvm-nvmx-nsvm-don-t-intercept-ud-when-running-l2.patch
queue-4.14/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
This is a note to let you know that I've just added the patch titled
KVM: nVMX: Fix mmu context after VMLAUNCH/VMRESUME failure
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-nvmx-fix-mmu-context-after-vmlaunch-vmresume-failure.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 13:45:42 CET 2018
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
Date: Sun, 5 Nov 2017 16:54:49 -0800
Subject: KVM: nVMX: Fix mmu context after VMLAUNCH/VMRESUME failure
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
[ Upstream commit 5af4157388adad82c339e3742fb6b67840721347 ]
Commit 4f350c6dbcb (kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure
properly) can result in L1(run kvm-unit-tests/run_tests.sh vmx_controls in L1)
null pointer deference and also L0 calltrace when EPT=0 on both L0 and L1.
In L1:
BUG: unable to handle kernel paging request at ffffffffc015bf8f
IP: vmx_vcpu_run+0x202/0x510 [kvm_intel]
PGD 146e13067 P4D 146e13067 PUD 146e15067 PMD 3d2686067 PTE 3d4af9161
Oops: 0003 [#1] PREEMPT SMP
CPU: 2 PID: 1798 Comm: qemu-system-x86 Not tainted 4.14.0-rc4+ #6
RIP: 0010:vmx_vcpu_run+0x202/0x510 [kvm_intel]
Call Trace:
WARNING: kernel stack frame pointer at ffffb86f4988bc18 in qemu-system-x86:1798 has bad value 0000000000000002
In L0:
-----------[ cut here ]------------
WARNING: CPU: 6 PID: 4460 at /home/kernel/linux/arch/x86/kvm//vmx.c:9845 vmx_inject_page_fault_nested+0x130/0x140 [kvm_intel]
CPU: 6 PID: 4460 Comm: qemu-system-x86 Tainted: G OE 4.14.0-rc7+ #25
RIP: 0010:vmx_inject_page_fault_nested+0x130/0x140 [kvm_intel]
Call Trace:
paging64_page_fault+0x500/0xde0 [kvm]
? paging32_gva_to_gpa_nested+0x120/0x120 [kvm]
? nonpaging_page_fault+0x3b0/0x3b0 [kvm]
? __asan_storeN+0x12/0x20
? paging64_gva_to_gpa+0xb0/0x120 [kvm]
? paging64_walk_addr_generic+0x11a0/0x11a0 [kvm]
? lock_acquire+0x2c0/0x2c0
? vmx_read_guest_seg_ar+0x97/0x100 [kvm_intel]
? vmx_get_segment+0x2a6/0x310 [kvm_intel]
? sched_clock+0x1f/0x30
? check_chain_key+0x137/0x1e0
? __lock_acquire+0x83c/0x2420
? kvm_multiple_exception+0xf2/0x220 [kvm]
? debug_check_no_locks_freed+0x240/0x240
? debug_smp_processor_id+0x17/0x20
? __lock_is_held+0x9e/0x100
kvm_mmu_page_fault+0x90/0x180 [kvm]
kvm_handle_page_fault+0x15c/0x310 [kvm]
? __lock_is_held+0x9e/0x100
handle_exception+0x3c7/0x4d0 [kvm_intel]
vmx_handle_exit+0x103/0x1010 [kvm_intel]
? kvm_arch_vcpu_ioctl_run+0x1628/0x2e20 [kvm]
The commit avoids to load host state of vmcs12 as vmcs01's guest state
since vmcs12 is not modified (except for the VM-instruction error field)
if the checking of vmcs control area fails. However, the mmu context is
switched to nested mmu in prepare_vmcs02() and it will not be reloaded
since load_vmcs12_host_state() is skipped when nested VMLAUNCH/VMRESUME
fails. This patch fixes it by reloading mmu context when nested
VMLAUNCH/VMRESUME fails.
Reviewed-by: Jim Mattson <jmattson(a)google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan(a)oracle.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Jim Mattson <jmattson(a)google.com>
Signed-off-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 34 ++++++++++++++++++++++------------
1 file changed, 22 insertions(+), 12 deletions(-)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11339,6 +11339,24 @@ static void prepare_vmcs12(struct kvm_vc
kvm_clear_interrupt_queue(vcpu);
}
+static void load_vmcs12_mmu_host_state(struct kvm_vcpu *vcpu,
+ struct vmcs12 *vmcs12)
+{
+ u32 entry_failure_code;
+
+ nested_ept_uninit_mmu_context(vcpu);
+
+ /*
+ * Only PDPTE load can fail as the value of cr3 was checked on entry and
+ * couldn't have changed.
+ */
+ if (nested_vmx_load_cr3(vcpu, vmcs12->host_cr3, false, &entry_failure_code))
+ nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_PDPTE_FAIL);
+
+ if (!enable_ept)
+ vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
+}
+
/*
* A part of what we need to when the nested L2 guest exits and we want to
* run its L1 parent, is to reset L1's guest state to the host state specified
@@ -11352,7 +11370,6 @@ static void load_vmcs12_host_state(struc
struct vmcs12 *vmcs12)
{
struct kvm_segment seg;
- u32 entry_failure_code;
if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_EFER)
vcpu->arch.efer = vmcs12->host_ia32_efer;
@@ -11379,17 +11396,7 @@ static void load_vmcs12_host_state(struc
vcpu->arch.cr4_guest_owned_bits = ~vmcs_readl(CR4_GUEST_HOST_MASK);
vmx_set_cr4(vcpu, vmcs12->host_cr4);
- nested_ept_uninit_mmu_context(vcpu);
-
- /*
- * Only PDPTE load can fail as the value of cr3 was checked on entry and
- * couldn't have changed.
- */
- if (nested_vmx_load_cr3(vcpu, vmcs12->host_cr3, false, &entry_failure_code))
- nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_PDPTE_FAIL);
-
- if (!enable_ept)
- vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
+ load_vmcs12_mmu_host_state(vcpu, vmcs12);
if (enable_vpid) {
/*
@@ -11615,6 +11622,9 @@ static void nested_vmx_vmexit(struct kvm
* accordingly.
*/
nested_vmx_failValid(vcpu, VMXERR_ENTRY_INVALID_CONTROL_FIELD);
+
+ load_vmcs12_mmu_host_state(vcpu, vmcs12);
+
/*
* The emulated instruction was already skipped in
* nested_vmx_run, but the updated RIP was never
Patches currently in stable-queue which might be from wanpeng.li(a)hotmail.com are
queue-4.14/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.14/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.14/kvm-nvmx-fix-mmu-context-after-vmlaunch-vmresume-failure.patch
queue-4.14/kvm-vmx-fix-rflags-cache-during-vcpu-reset.patch
queue-4.14/kvm-x86-fix-operand-address-size-during-instruction-decoding.patch
queue-4.14/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.14/kvm-nvmx-nsvm-don-t-intercept-ud-when-running-l2.patch
queue-4.14/kvm-x86-fix-softlockup-when-get-the-current-kvmclock.patch
queue-4.14/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch