This is a note to let you know that I've just added the patch titled
lockd: fix "list_add double add" caused by legacy signal interface
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
lockd-fix-list_add-double-add-caused-by-legacy-signal-interface.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 14:14:46 CET 2018
From: Vasily Averin <vvs(a)virtuozzo.com>
Date: Mon, 13 Nov 2017 07:25:40 +0300
Subject: lockd: fix "list_add double add" caused by legacy signal interface
From: Vasily Averin <vvs(a)virtuozzo.com>
[ Upstream commit 81833de1a46edce9ca20cfe079872ac1c20ef359 ]
restart_grace() uses hardcoded init_net.
It can cause to "list_add double add" in following scenario:
1) nfsd and lockd was started in several net namespaces
2) nfsd in init_net was stopped (lockd was not stopped because
it have users from another net namespaces)
3) lockd got signal, called restart_grace() -> set_grace_period()
and enabled lock_manager in hardcoded init_net.
4) nfsd in init_net is started again,
its lockd_up() calls set_grace_period() and tries to add
lock_manager into init_net 2nd time.
Jeff Layton suggest:
"Make it safe to call locks_start_grace multiple times on the same
lock_manager. If it's already on the global grace_list, then don't try
to add it again. (But we don't intentionally add twice, so for now we
WARN about that case.)
With this change, we also need to ensure that the nfsd4 lock manager
initializes the list before we call locks_start_grace. While we're at
it, move the rest of the nfsd_net initialization into
nfs4_state_create_net. I see no reason to have it spread over two
functions like it is today."
Suggested patch was updated to generate warning in described situation.
Suggested-by: Jeff Layton <jlayton(a)redhat.com>
Signed-off-by: Vasily Averin <vvs(a)virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/nfs_common/grace.c | 6 +++++-
fs/nfsd/nfs4state.c | 7 ++++---
2 files changed, 9 insertions(+), 4 deletions(-)
--- a/fs/nfs_common/grace.c
+++ b/fs/nfs_common/grace.c
@@ -30,7 +30,11 @@ locks_start_grace(struct net *net, struc
struct list_head *grace_list = net_generic(net, grace_net_id);
spin_lock(&grace_lock);
- list_add(&lm->list, grace_list);
+ if (list_empty(&lm->list))
+ list_add(&lm->list, grace_list);
+ else
+ WARN(1, "double list_add attempt detected in net %x %s\n",
+ net->ns.inum, (net == &init_net) ? "(init_net)" : "");
spin_unlock(&grace_lock);
}
EXPORT_SYMBOL_GPL(locks_start_grace);
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6792,6 +6792,10 @@ static int nfs4_state_create_net(struct
INIT_LIST_HEAD(&nn->sessionid_hashtbl[i]);
nn->conf_name_tree = RB_ROOT;
nn->unconf_name_tree = RB_ROOT;
+ nn->boot_time = get_seconds();
+ nn->grace_ended = false;
+ nn->nfsd4_manager.block_opens = true;
+ INIT_LIST_HEAD(&nn->nfsd4_manager.list);
INIT_LIST_HEAD(&nn->client_lru);
INIT_LIST_HEAD(&nn->close_lru);
INIT_LIST_HEAD(&nn->del_recall_lru);
@@ -6846,9 +6850,6 @@ nfs4_state_start_net(struct net *net)
ret = nfs4_state_create_net(net);
if (ret)
return ret;
- nn->boot_time = get_seconds();
- nn->grace_ended = false;
- nn->nfsd4_manager.block_opens = true;
locks_start_grace(net, &nn->nfsd4_manager);
nfsd4_client_tracking_init(net);
printk(KERN_INFO "NFSD: starting %ld-second grace period (net %p)\n",
Patches currently in stable-queue which might be from vvs(a)virtuozzo.com are
queue-4.4/lockd-fix-list_add-double-add-caused-by-legacy-signal-interface.patch
queue-4.4/grace-replace-bug_on-by-warn_once-in-exit_net-hook.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: ioapic: Preserve read-only values in the redirection table
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 14:14:46 CET 2018
From: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
Date: Sun, 5 Nov 2017 15:52:33 +0200
Subject: KVM: x86: ioapic: Preserve read-only values in the redirection table
From: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
[ Upstream commit b200dded0a6974a3b69599832b2203483920ab25 ]
According to 82093AA (IOAPIC) manual, Remote IRR and Delivery Status are
read-only. QEMU implements the bits as RO in commit 479c2a1cb7fb
("ioapic: keep RO bits for IOAPIC entry").
Signed-off-by: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
Reviewed-by: Liran Alon <liran.alon(a)oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Reviewed-by: Steve Rutherford <srutherford(a)google.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/ioapic.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -268,6 +268,7 @@ static void ioapic_write_indirect(struct
{
unsigned index;
bool mask_before, mask_after;
+ int old_remote_irr, old_delivery_status;
union kvm_ioapic_redirect_entry *e;
switch (ioapic->ioregsel) {
@@ -290,6 +291,9 @@ static void ioapic_write_indirect(struct
return;
e = &ioapic->redirtbl[index];
mask_before = e->fields.mask;
+ /* Preserve read-only fields */
+ old_remote_irr = e->fields.remote_irr;
+ old_delivery_status = e->fields.delivery_status;
if (ioapic->ioregsel & 1) {
e->bits &= 0xffffffff;
e->bits |= (u64) val << 32;
@@ -297,6 +301,8 @@ static void ioapic_write_indirect(struct
e->bits &= ~0xffffffffULL;
e->bits |= (u32) val;
}
+ e->fields.remote_irr = old_remote_irr;
+ e->fields.delivery_status = old_delivery_status;
/*
* Some OSes (Linux, Xen) assume that Remote IRR bit will
Patches currently in stable-queue which might be from nikita.leshchenko(a)oracle.com are
queue-4.4/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.4/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.4/kvm-x86-ioapic-fix-level-triggered-eoi-and-ioapic-reconfigure-race.patch
queue-4.4/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.4/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: ioapic: Fix level-triggered EOI and IOAPIC reconfigure race
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-ioapic-fix-level-triggered-eoi-and-ioapic-reconfigure-race.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 14:14:46 CET 2018
From: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
Date: Sun, 5 Nov 2017 15:52:29 +0200
Subject: KVM: x86: ioapic: Fix level-triggered EOI and IOAPIC reconfigure race
From: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
[ Upstream commit 0fc5a36dd6b345eb0d251a65c236e53bead3eef7 ]
KVM uses ioapic_handled_vectors to track vectors that need to notify the
IOAPIC on EOI. The problem is that IOAPIC can be reconfigured while an
interrupt with old configuration is pending or running and
ioapic_handled_vectors only remembers the newest configuration;
thus EOI from the old interrupt is not delievered to the IOAPIC.
A previous commit db2bdcbbbd32
("KVM: x86: fix edge EOI and IOAPIC reconfig race")
addressed this issue by adding pending edge-triggered interrupts to
ioapic_handled_vectors, fixing this race for edge-triggered interrupts.
The commit explicitly ignored level-triggered interrupts,
but this race applies to them as well:
1) IOAPIC sends a level triggered interrupt vector to VCPU0
2) VCPU0's handler deasserts the irq line and reconfigures the IOAPIC
to route the vector to VCPU1. The reconfiguration rewrites only the
upper 32 bits of the IOREDTBLn register. (Causes KVM to update
ioapic_handled_vectors for VCPU0 and it no longer includes the vector.)
3) VCPU0 sends EOI for the vector, but it's not delievered to the
IOAPIC because the ioapic_handled_vectors doesn't include the vector.
4) New interrupts are not delievered to VCPU1 because remote_irr bit
is set forever.
Therefore, the correct behavior is to add all pending and running
interrupts to ioapic_handled_vectors.
This commit introduces a slight performance hit similar to
commit db2bdcbbbd32 ("KVM: x86: fix edge EOI and IOAPIC reconfig race")
for the rare case that the vector is reused by a non-IOAPIC source on
VCPU0. We prefer to keep solution simple and not handle this case just
as the original commit does.
Fixes: db2bdcbbbd32 ("KVM: x86: fix edge EOI and IOAPIC reconfig race")
Signed-off-by: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
Reviewed-by: Liran Alon <liran.alon(a)oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/ioapic.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -247,8 +247,7 @@ void kvm_ioapic_scan_entry(struct kvm_vc
index == RTC_GSI) {
if (kvm_apic_match_dest(vcpu, NULL, 0,
e->fields.dest_id, e->fields.dest_mode) ||
- (e->fields.trig_mode == IOAPIC_EDGE_TRIG &&
- kvm_apic_pending_eoi(vcpu, e->fields.vector)))
+ kvm_apic_pending_eoi(vcpu, e->fields.vector))
__set_bit(e->fields.vector,
(unsigned long *)eoi_exit_bitmap);
}
Patches currently in stable-queue which might be from nikita.leshchenko(a)oracle.com are
queue-4.4/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.4/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.4/kvm-x86-ioapic-fix-level-triggered-eoi-and-ioapic-reconfigure-race.patch
queue-4.4/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.4/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: ioapic: Clear Remote IRR when entry is switched to edge-triggered
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 14:14:46 CET 2018
From: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
Date: Sun, 5 Nov 2017 15:52:32 +0200
Subject: KVM: x86: ioapic: Clear Remote IRR when entry is switched to edge-triggered
From: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
[ Upstream commit a8bfec2930525808c01f038825d1df3904638631 ]
Some OSes (Linux, Xen) use this behavior to clear the Remote IRR bit for
IOAPICs without an EOI register. They simulate the EOI message manually
by changing the trigger mode to edge and then back to level, with the
entry being masked during this.
QEMU implements this feature in commit ed1263c363c9
("ioapic: clear remote irr bit for edge-triggered interrupts")
As a side effect, this commit removes an incorrect behavior where Remote
IRR was cleared when the redirection table entry was rewritten. This is not
consistent with the manual and also opens an opportunity for a strange
behavior when a redirection table entry is modified from an interrupt
handler that handles the same entry: The modification will clear the
Remote IRR bit even though the interrupt handler is still running.
Signed-off-by: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
Reviewed-by: Liran Alon <liran.alon(a)oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Reviewed-by: Steve Rutherford <srutherford(a)google.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/ioapic.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -296,8 +296,17 @@ static void ioapic_write_indirect(struct
} else {
e->bits &= ~0xffffffffULL;
e->bits |= (u32) val;
- e->fields.remote_irr = 0;
}
+
+ /*
+ * Some OSes (Linux, Xen) assume that Remote IRR bit will
+ * be cleared by IOAPIC hardware when the entry is configured
+ * as edge-triggered. This behavior is used to simulate an
+ * explicit EOI on IOAPICs that don't have the EOI register.
+ */
+ if (e->fields.trig_mode == IOAPIC_EDGE_TRIG)
+ e->fields.remote_irr = 0;
+
mask_after = e->fields.mask;
if (mask_before != mask_after)
kvm_fire_mask_notifiers(ioapic->kvm, KVM_IRQCHIP_IOAPIC, index, mask_after);
Patches currently in stable-queue which might be from nikita.leshchenko(a)oracle.com are
queue-4.4/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.4/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.4/kvm-x86-ioapic-fix-level-triggered-eoi-and-ioapic-reconfigure-race.patch
queue-4.4/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.4/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
This is a note to let you know that I've just added the patch titled
KVM: X86: Fix operand/address-size during instruction decoding
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-fix-operand-address-size-during-instruction-decoding.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 14:14:46 CET 2018
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
Date: Sun, 5 Nov 2017 16:54:47 -0800
Subject: KVM: X86: Fix operand/address-size during instruction decoding
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
[ Upstream commit 3853be2603191829b442b64dac6ae8ba0c027bf9 ]
Pedro reported:
During tests that we conducted on KVM, we noticed that executing a "PUSH %ES"
instruction under KVM produces different results on both memory and the SP
register depending on whether EPT support is enabled. With EPT the SP is
reduced by 4 bytes (and the written value is 0-padded) but without EPT support
it is only reduced by 2 bytes. The difference can be observed when the CS.DB
field is 1 (32-bit) but not when it's 0 (16-bit).
The internal segment descriptor cache exist even in real/vm8096 mode. The CS.D
also should be respected instead of just default operand/address-size/66H
prefix/67H prefix during instruction decoding. This patch fixes it by also
adjusting operand/address-size according to CS.D.
Reported-by: Pedro Fonseca <pfonseca(a)cs.washington.edu>
Tested-by: Pedro Fonseca <pfonseca(a)cs.washington.edu>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Pedro Fonseca <pfonseca(a)cs.washington.edu>
Signed-off-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/emulate.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -4978,6 +4978,8 @@ int x86_decode_insn(struct x86_emulate_c
bool op_prefix = false;
bool has_seg_override = false;
struct opcode opcode;
+ u16 dummy;
+ struct desc_struct desc;
ctxt->memop.type = OP_NONE;
ctxt->memopp = NULL;
@@ -4996,6 +4998,11 @@ int x86_decode_insn(struct x86_emulate_c
switch (mode) {
case X86EMUL_MODE_REAL:
case X86EMUL_MODE_VM86:
+ def_op_bytes = def_ad_bytes = 2;
+ ctxt->ops->get_segment(ctxt, &dummy, &desc, NULL, VCPU_SREG_CS);
+ if (desc.d)
+ def_op_bytes = def_ad_bytes = 4;
+ break;
case X86EMUL_MODE_PROT16:
def_op_bytes = def_ad_bytes = 2;
break;
Patches currently in stable-queue which might be from wanpeng.li(a)hotmail.com are
queue-4.4/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.4/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.4/kvm-vmx-fix-rflags-cache-during-vcpu-reset.patch
queue-4.4/kvm-x86-fix-operand-address-size-during-instruction-decoding.patch
queue-4.4/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.4/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: emulator: Return to user-mode on L1 CPL=0 emulation failure
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 14:14:46 CET 2018
From: Liran Alon <liran.alon(a)oracle.com>
Date: Sun, 5 Nov 2017 16:56:33 +0200
Subject: KVM: x86: emulator: Return to user-mode on L1 CPL=0 emulation failure
From: Liran Alon <liran.alon(a)oracle.com>
[ Upstream commit 1f4dcb3b213235e642088709a1c54964d23365e9 ]
On this case, handle_emulation_failure() fills kvm_run with
internal-error information which it expects to be delivered
to user-mode for further processing.
However, the code reports a wrong return-value which makes KVM to never
return to user-mode on this scenario.
Fixes: 6d77dbfc88e3 ("KVM: inject #UD if instruction emulation fails and exit to
userspace")
Signed-off-by: Liran Alon <liran.alon(a)oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/x86.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5153,7 +5153,7 @@ static int handle_emulation_failure(stru
vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
vcpu->run->internal.ndata = 0;
- r = EMULATE_FAIL;
+ r = EMULATE_USER_EXIT;
}
kvm_queue_exception(vcpu, UD_VECTOR);
Patches currently in stable-queue which might be from liran.alon(a)oracle.com are
queue-4.4/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.4/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.4/kvm-x86-ioapic-fix-level-triggered-eoi-and-ioapic-reconfigure-race.patch
queue-4.4/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.4/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: Don't re-execute instruction when not passing CR2 value
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 14:14:46 CET 2018
From: Liran Alon <liran.alon(a)oracle.com>
Date: Sun, 5 Nov 2017 16:56:34 +0200
Subject: KVM: x86: Don't re-execute instruction when not passing CR2 value
From: Liran Alon <liran.alon(a)oracle.com>
[ Upstream commit 9b8ae63798cb97e785a667ff27e43fa6220cb734 ]
In case of instruction-decode failure or emulation failure,
x86_emulate_instruction() will call reexecute_instruction() which will
attempt to use the cr2 value passed to x86_emulate_instruction().
However, when x86_emulate_instruction() is called from
emulate_instruction(), cr2 is not passed (passed as 0) and therefore
it doesn't make sense to execute reexecute_instruction() logic at all.
Fixes: 51d8b66199e9 ("KVM: cleanup emulate_instruction")
Signed-off-by: Liran Alon <liran.alon(a)oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/kvm/vmx.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -998,7 +998,8 @@ int x86_emulate_instruction(struct kvm_v
static inline int emulate_instruction(struct kvm_vcpu *vcpu,
int emulation_type)
{
- return x86_emulate_instruction(vcpu, 0, emulation_type, NULL, 0);
+ return x86_emulate_instruction(vcpu, 0,
+ emulation_type | EMULTYPE_NO_REEXECUTE, NULL, 0);
}
void kvm_enable_efer_bits(u64);
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6023,7 +6023,7 @@ static int handle_invalid_guest_state(st
if (test_bit(KVM_REQ_EVENT, &vcpu->requests))
return 1;
- err = emulate_instruction(vcpu, EMULTYPE_NO_REEXECUTE);
+ err = emulate_instruction(vcpu, 0);
if (err == EMULATE_USER_EXIT) {
++vcpu->stat.mmio_exits;
Patches currently in stable-queue which might be from liran.alon(a)oracle.com are
queue-4.4/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.4/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.4/kvm-x86-ioapic-fix-level-triggered-eoi-and-ioapic-reconfigure-race.patch
queue-4.4/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.4/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
This is a note to let you know that I've just added the patch titled
KVM: VMX: Fix rflags cache during vCPU reset
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-vmx-fix-rflags-cache-during-vcpu-reset.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 14:14:46 CET 2018
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
Date: Mon, 20 Nov 2017 14:52:21 -0800
Subject: KVM: VMX: Fix rflags cache during vCPU reset
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
[ Upstream commit c37c28730bb031cc8a44a130c2555c0f3efbe2d0 ]
Reported by syzkaller:
*** Guest State ***
CR0: actual=0x0000000080010031, shadow=0x0000000060000010, gh_mask=fffffffffffffff7
CR4: actual=0x0000000000002061, shadow=0x0000000000000000, gh_mask=ffffffffffffe8f1
CR3 = 0x000000002081e000
RSP = 0x000000000000fffa RIP = 0x0000000000000000
RFLAGS=0x00023000 DR7 = 0x00000000000000
^^^^^^^^^^
------------[ cut here ]------------
WARNING: CPU: 6 PID: 24431 at /home/kernel/linux/arch/x86/kvm//x86.c:7302 kvm_arch_vcpu_ioctl_run+0x651/0x2ea0 [kvm]
CPU: 6 PID: 24431 Comm: reprotest Tainted: G W OE 4.14.0+ #26
RIP: 0010:kvm_arch_vcpu_ioctl_run+0x651/0x2ea0 [kvm]
RSP: 0018:ffff880291d179e0 EFLAGS: 00010202
Call Trace:
kvm_vcpu_ioctl+0x479/0x880 [kvm]
do_vfs_ioctl+0x142/0x9a0
SyS_ioctl+0x74/0x80
entry_SYSCALL_64_fastpath+0x23/0x9a
The failed vmentry is triggered by the following beautified testcase:
#include <unistd.h>
#include <sys/syscall.h>
#include <string.h>
#include <stdint.h>
#include <linux/kvm.h>
#include <fcntl.h>
#include <sys/ioctl.h>
long r[5];
int main()
{
struct kvm_debugregs dr = { 0 };
r[2] = open("/dev/kvm", O_RDONLY);
r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
struct kvm_guest_debug debug = {
.control = 0xf0403,
.arch = {
.debugreg[6] = 0x2,
.debugreg[7] = 0x2
}
};
ioctl(r[4], KVM_SET_GUEST_DEBUG, &debug);
ioctl(r[4], KVM_RUN, 0);
}
which testcase tries to setup the processor specific debug
registers and configure vCPU for handling guest debug events through
KVM_SET_GUEST_DEBUG. The KVM_SET_GUEST_DEBUG ioctl will get and set
rflags in order to set TF bit if single step is needed. All regs' caches
are reset to avail and GUEST_RFLAGS vmcs field is reset to 0x2 during vCPU
reset. However, the cache of rflags is not reset during vCPU reset. The
function vmx_get_rflags() returns an unreset rflags cache value since
the cache is marked avail, it is 0 after boot. Vmentry fails if the
rflags reserved bit 1 is 0.
This patch fixes it by resetting both the GUEST_RFLAGS vmcs field and
its cache to 0x2 during vCPU reset.
Reported-by: Dmitry Vyukov <dvyukov(a)google.com>
Tested-by: Dmitry Vyukov <dvyukov(a)google.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Signed-off-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/vmx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4954,7 +4954,7 @@ static void vmx_vcpu_reset(struct kvm_vc
vmcs_write64(GUEST_IA32_DEBUGCTL, 0);
}
- vmcs_writel(GUEST_RFLAGS, 0x02);
+ kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
kvm_rip_write(vcpu, 0xfff0);
vmcs_writel(GUEST_GDTR_BASE, 0);
Patches currently in stable-queue which might be from wanpeng.li(a)hotmail.com are
queue-4.4/kvm-x86-don-t-re-execute-instruction-when-not-passing-cr2-value.patch
queue-4.4/kvm-x86-ioapic-preserve-read-only-values-in-the-redirection-table.patch
queue-4.4/kvm-vmx-fix-rflags-cache-during-vcpu-reset.patch
queue-4.4/kvm-x86-fix-operand-address-size-during-instruction-decoding.patch
queue-4.4/kvm-x86-emulator-return-to-user-mode-on-l1-cpl-0-emulation-failure.patch
queue-4.4/kvm-x86-ioapic-clear-remote-irr-when-entry-is-switched-to-edge-triggered.patch
This is a note to let you know that I've just added the patch titled
kmemleak: add scheduling point to kmemleak_scan()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kmemleak-add-scheduling-point-to-kmemleak_scan.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Thu Feb 1 14:14:46 CET 2018
From: Yisheng Xie <xieyisheng1(a)huawei.com>
Date: Wed, 29 Nov 2017 16:11:08 -0800
Subject: kmemleak: add scheduling point to kmemleak_scan()
From: Yisheng Xie <xieyisheng1(a)huawei.com>
[ Upstream commit bde5f6bc68db51128f875a756e9082a6c6ff7b4c ]
kmemleak_scan() will scan struct page for each node and it can be really
large and resulting in a soft lockup. We have seen a soft lockup when
do scan while compile kernel:
watchdog: BUG: soft lockup - CPU#53 stuck for 22s! [bash:10287]
[...]
Call Trace:
kmemleak_scan+0x21a/0x4c0
kmemleak_write+0x312/0x350
full_proxy_write+0x5a/0xa0
__vfs_write+0x33/0x150
vfs_write+0xad/0x1a0
SyS_write+0x52/0xc0
do_syscall_64+0x61/0x1a0
entry_SYSCALL64_slow_path+0x25/0x25
Fix this by adding cond_resched every MAX_SCAN_SIZE.
Link: http://lkml.kernel.org/r/1511439788-20099-1-git-send-email-xieyisheng1@huaw…
Signed-off-by: Yisheng Xie <xieyisheng1(a)huawei.com>
Suggested-by: Catalin Marinas <catalin.marinas(a)arm.com>
Acked-by: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/kmemleak.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1394,6 +1394,8 @@ static void kmemleak_scan(void)
if (page_count(page) == 0)
continue;
scan_block(page, page + 1, NULL);
+ if (!(pfn % (MAX_SCAN_SIZE / sizeof(*page))))
+ cond_resched();
}
}
put_online_mems();
Patches currently in stable-queue which might be from xieyisheng1(a)huawei.com are
queue-4.4/kmemleak-add-scheduling-point-to-kmemleak_scan.patch