The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023112056-opposing-bobbing-e37e@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
9cfec6d097c6 ("KVM: x86: Fix lapic timer interrupt lost after loading a snapshot.")
b3f257a84696 ("KVM: x86: Track required APICv inhibits with variable, not callback")
9a364857ab4f ("KVM: SVM: Inhibit AVIC if vCPUs are aliased in logical mode")
5063c41bebac ("KVM: x86: Inhibit APICv/AVIC if the optimized physical map is disabled")
2008fab34530 ("KVM: x86: Inhibit APIC memslot if x2APIC and AVIC are enabled")
c482f2cebe2d ("KVM: x86: Move APIC access page helper to common x86 code")
f651a0089548 ("KVM: x86: Don't inhibit APICv/AVIC if xAPIC ID mismatch is due to 32-bit ID")
0e311d33bfbe ("KVM: SVM: Introduce hybrid-AVIC mode")
4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
d2fe6bf5b881 ("KVM: SVM: Update max number of vCPUs supported for x2AVIC mode")
ec1d7e6ab9ff ("KVM: SVM: Drop unused AVIC / kvm_x86_ops declarations")
5bdae49fc2f6 ("KVM: SEV: fix misplaced closing parenthesis")
3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base")
a9603ae0e4ee ("KVM: x86: document AVIC/APICv inhibit reasons")
a4cfff3f0f8c ("Merge branch 'kvm-older-features' into HEAD")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2 Mon Sep 17 00:00:00 2001
From: Haitao Shan <hshan(a)google.com>
Date: Tue, 12 Sep 2023 16:55:45 -0700
Subject: [PATCH] KVM: x86: Fix lapic timer interrupt lost after loading a
snapshot.
When running android emulator (which is based on QEMU 2.12) on
certain Intel hosts with kernel version 6.3-rc1 or above, guest
will freeze after loading a snapshot. This is almost 100%
reproducible. By default, the android emulator will use snapshot
to speed up the next launching of the same android guest. So
this breaks the android emulator badly.
I tested QEMU 8.0.4 from Debian 12 with an Ubuntu 22.04 guest by
running command "loadvm" after "savevm". The same issue is
observed. At the same time, none of our AMD platforms is impacted.
More experiments show that loading the KVM module with
"enable_apicv=false" can workaround it.
The issue started to show up after commit 8e6ed96cdd50 ("KVM: x86:
fire timer when it is migrated and expired, and in oneshot mode").
However, as is pointed out by Sean Christopherson, it is introduced
by commit 967235d32032 ("KVM: vmx: clear pending interrupts on
KVM_SET_LAPIC"). commit 8e6ed96cdd50 ("KVM: x86: fire timer when
it is migrated and expired, and in oneshot mode") just makes it
easier to hit the issue.
Having both commits, the oneshot lapic timer gets fired immediately
inside the KVM_SET_LAPIC call when loading the snapshot. On Intel
platforms with APIC virtualization and posted interrupt processing,
this eventually leads to setting the corresponding PIR bit. However,
the whole PIR bits get cleared later in the same KVM_SET_LAPIC call
by apicv_post_state_restore. This leads to timer interrupt lost.
The fix is to move vmx_apicv_post_state_restore to the beginning of
the KVM_SET_LAPIC call and rename to vmx_apicv_pre_state_restore.
What vmx_apicv_post_state_restore does is actually clearing any
former apicv state and this behavior is more suitable to carry out
in the beginning.
Fixes: 967235d32032 ("KVM: vmx: clear pending interrupts on KVM_SET_LAPIC")
Cc: stable(a)vger.kernel.org
Suggested-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Haitao Shan <hshan(a)google.com>
Link: https://lore.kernel.org/r/20230913000215.478387-1-hshan@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index e3054e3e46d5..9b419f0de713 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -108,6 +108,7 @@ KVM_X86_OP_OPTIONAL(vcpu_blocking)
KVM_X86_OP_OPTIONAL(vcpu_unblocking)
KVM_X86_OP_OPTIONAL(pi_update_irte)
KVM_X86_OP_OPTIONAL(pi_start_assignment)
+KVM_X86_OP_OPTIONAL(apicv_pre_state_restore)
KVM_X86_OP_OPTIONAL(apicv_post_state_restore)
KVM_X86_OP_OPTIONAL_RET0(dy_apicv_has_pending_interrupt)
KVM_X86_OP_OPTIONAL(set_hv_timer)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 17715cb8731d..f77568c6a326 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1709,6 +1709,7 @@ struct kvm_x86_ops {
int (*pi_update_irte)(struct kvm *kvm, unsigned int host_irq,
uint32_t guest_irq, bool set);
void (*pi_start_assignment)(struct kvm *kvm);
+ void (*apicv_pre_state_restore)(struct kvm_vcpu *vcpu);
void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index dcd60b39e794..c6a5a5945021 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2670,6 +2670,8 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
u64 msr_val;
int i;
+ static_call_cond(kvm_x86_apicv_pre_state_restore)(vcpu);
+
if (!init_event) {
msr_val = APIC_DEFAULT_PHYS_BASE | MSR_IA32_APICBASE_ENABLE;
if (kvm_vcpu_is_reset_bsp(vcpu))
@@ -2977,6 +2979,8 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
struct kvm_lapic *apic = vcpu->arch.apic;
int r;
+ static_call_cond(kvm_x86_apicv_pre_state_restore)(vcpu);
+
kvm_lapic_set_base(vcpu, vcpu->arch.apic_base);
/* set SPIV separately to get count of SW disabled APICs right */
apic_set_spiv(apic, *((u32 *)(s->regs + APIC_SPIV)));
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 72e3943f3693..9bba5352582c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6912,7 +6912,7 @@ static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
vmcs_write64(EOI_EXIT_BITMAP3, eoi_exit_bitmap[3]);
}
-static void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu)
+static void vmx_apicv_pre_state_restore(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -8286,7 +8286,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
.refresh_apicv_exec_ctrl = vmx_refresh_apicv_exec_ctrl,
.load_eoi_exitmap = vmx_load_eoi_exitmap,
- .apicv_post_state_restore = vmx_apicv_post_state_restore,
+ .apicv_pre_state_restore = vmx_apicv_pre_state_restore,
.required_apicv_inhibits = VMX_REQUIRED_APICV_INHIBITS,
.hwapic_irr_update = vmx_hwapic_irr_update,
.hwapic_isr_update = vmx_hwapic_isr_update,
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023112055-hardener-designer-a77b@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
9cfec6d097c6 ("KVM: x86: Fix lapic timer interrupt lost after loading a snapshot.")
b3f257a84696 ("KVM: x86: Track required APICv inhibits with variable, not callback")
9a364857ab4f ("KVM: SVM: Inhibit AVIC if vCPUs are aliased in logical mode")
5063c41bebac ("KVM: x86: Inhibit APICv/AVIC if the optimized physical map is disabled")
2008fab34530 ("KVM: x86: Inhibit APIC memslot if x2APIC and AVIC are enabled")
c482f2cebe2d ("KVM: x86: Move APIC access page helper to common x86 code")
f651a0089548 ("KVM: x86: Don't inhibit APICv/AVIC if xAPIC ID mismatch is due to 32-bit ID")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9cfec6d097c607e36199cf0cfbb8cf5acbd8e9b2 Mon Sep 17 00:00:00 2001
From: Haitao Shan <hshan(a)google.com>
Date: Tue, 12 Sep 2023 16:55:45 -0700
Subject: [PATCH] KVM: x86: Fix lapic timer interrupt lost after loading a
snapshot.
When running android emulator (which is based on QEMU 2.12) on
certain Intel hosts with kernel version 6.3-rc1 or above, guest
will freeze after loading a snapshot. This is almost 100%
reproducible. By default, the android emulator will use snapshot
to speed up the next launching of the same android guest. So
this breaks the android emulator badly.
I tested QEMU 8.0.4 from Debian 12 with an Ubuntu 22.04 guest by
running command "loadvm" after "savevm". The same issue is
observed. At the same time, none of our AMD platforms is impacted.
More experiments show that loading the KVM module with
"enable_apicv=false" can workaround it.
The issue started to show up after commit 8e6ed96cdd50 ("KVM: x86:
fire timer when it is migrated and expired, and in oneshot mode").
However, as is pointed out by Sean Christopherson, it is introduced
by commit 967235d32032 ("KVM: vmx: clear pending interrupts on
KVM_SET_LAPIC"). commit 8e6ed96cdd50 ("KVM: x86: fire timer when
it is migrated and expired, and in oneshot mode") just makes it
easier to hit the issue.
Having both commits, the oneshot lapic timer gets fired immediately
inside the KVM_SET_LAPIC call when loading the snapshot. On Intel
platforms with APIC virtualization and posted interrupt processing,
this eventually leads to setting the corresponding PIR bit. However,
the whole PIR bits get cleared later in the same KVM_SET_LAPIC call
by apicv_post_state_restore. This leads to timer interrupt lost.
The fix is to move vmx_apicv_post_state_restore to the beginning of
the KVM_SET_LAPIC call and rename to vmx_apicv_pre_state_restore.
What vmx_apicv_post_state_restore does is actually clearing any
former apicv state and this behavior is more suitable to carry out
in the beginning.
Fixes: 967235d32032 ("KVM: vmx: clear pending interrupts on KVM_SET_LAPIC")
Cc: stable(a)vger.kernel.org
Suggested-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Haitao Shan <hshan(a)google.com>
Link: https://lore.kernel.org/r/20230913000215.478387-1-hshan@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index e3054e3e46d5..9b419f0de713 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -108,6 +108,7 @@ KVM_X86_OP_OPTIONAL(vcpu_blocking)
KVM_X86_OP_OPTIONAL(vcpu_unblocking)
KVM_X86_OP_OPTIONAL(pi_update_irte)
KVM_X86_OP_OPTIONAL(pi_start_assignment)
+KVM_X86_OP_OPTIONAL(apicv_pre_state_restore)
KVM_X86_OP_OPTIONAL(apicv_post_state_restore)
KVM_X86_OP_OPTIONAL_RET0(dy_apicv_has_pending_interrupt)
KVM_X86_OP_OPTIONAL(set_hv_timer)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 17715cb8731d..f77568c6a326 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1709,6 +1709,7 @@ struct kvm_x86_ops {
int (*pi_update_irte)(struct kvm *kvm, unsigned int host_irq,
uint32_t guest_irq, bool set);
void (*pi_start_assignment)(struct kvm *kvm);
+ void (*apicv_pre_state_restore)(struct kvm_vcpu *vcpu);
void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index dcd60b39e794..c6a5a5945021 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2670,6 +2670,8 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
u64 msr_val;
int i;
+ static_call_cond(kvm_x86_apicv_pre_state_restore)(vcpu);
+
if (!init_event) {
msr_val = APIC_DEFAULT_PHYS_BASE | MSR_IA32_APICBASE_ENABLE;
if (kvm_vcpu_is_reset_bsp(vcpu))
@@ -2977,6 +2979,8 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
struct kvm_lapic *apic = vcpu->arch.apic;
int r;
+ static_call_cond(kvm_x86_apicv_pre_state_restore)(vcpu);
+
kvm_lapic_set_base(vcpu, vcpu->arch.apic_base);
/* set SPIV separately to get count of SW disabled APICs right */
apic_set_spiv(apic, *((u32 *)(s->regs + APIC_SPIV)));
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 72e3943f3693..9bba5352582c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6912,7 +6912,7 @@ static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
vmcs_write64(EOI_EXIT_BITMAP3, eoi_exit_bitmap[3]);
}
-static void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu)
+static void vmx_apicv_pre_state_restore(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -8286,7 +8286,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
.refresh_apicv_exec_ctrl = vmx_refresh_apicv_exec_ctrl,
.load_eoi_exitmap = vmx_load_eoi_exitmap,
- .apicv_post_state_restore = vmx_apicv_post_state_restore,
+ .apicv_pre_state_restore = vmx_apicv_pre_state_restore,
.required_apicv_inhibits = VMX_REQUIRED_APICV_INHIBITS,
.hwapic_irr_update = vmx_hwapic_irr_update,
.hwapic_isr_update = vmx_hwapic_isr_update,
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x d6800af51c76b6dae20e6023bbdc9b3da3ab5121
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023112013-chemicals-trousers-113f@gregkh' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
d6800af51c76 ("KVM: x86: hyper-v: Don't auto-enable stimer on write from user-space")
013cc6ebbf41 ("x86/kvm/hyper-v: avoid spurious pending stimer on vCPU init")
87a8d795b2f1 ("x86/hyper-v: Stop caring about EOI for direct stimers")
8644f771e07c ("x86/kvm/hyper-v: direct mode for synthetic timers")
6a058a1eadc3 ("x86/kvm/hyper-v: use stimer config definition from hyperv-tlfs.h")
0aa67255f54d ("x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h")
7deec5e0df74 ("x86: kvm: hyperv: don't retry message delivery for periodic timers")
3a0e7731724f ("x86: kvm: hyperv: simplify SynIC message delivery")
f21dd494506a ("KVM: x86: hyperv: optimize sparse VP set processing")
e6b6c483ebe9 ("KVM: x86: hyperv: fix 'tlb_lush' typo")
214ff83d4473 ("KVM: x86: hyperv: implement PV IPI send hypercalls")
2cefc5feb80c ("KVM: x86: hyperv: optimize kvm_hv_flush_tlb() for vp_index == vcpu_idx case")
0b0a31badb2d ("KVM: x86: hyperv: valid_bank_mask should be 'u64'")
a812297c4fd9 ("KVM: x86: hyperv: optimize 'all cpus' case in kvm_hv_flush_tlb()")
aa069a996951 ("KVM: PPC: Book3S HV: Add a VM capability to enable nested virtualization")
9d67121a4fce ("Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d6800af51c76b6dae20e6023bbdc9b3da3ab5121 Mon Sep 17 00:00:00 2001
From: Nicolas Saenz Julienne <nsaenz(a)amazon.com>
Date: Tue, 17 Oct 2023 15:51:02 +0000
Subject: [PATCH] KVM: x86: hyper-v: Don't auto-enable stimer on write from
user-space
Don't apply the stimer's counter side effects when modifying its
value from user-space, as this may trigger spurious interrupts.
For example:
- The stimer is configured in auto-enable mode.
- The stimer's count is set and the timer enabled.
- The stimer expires, an interrupt is injected.
- The VM is live migrated.
- The stimer config and count are deserialized, auto-enable is ON, the
stimer is re-enabled.
- The stimer expires right away, and injects an unwarranted interrupt.
Cc: stable(a)vger.kernel.org
Fixes: 1f4b34f825e8 ("kvm/x86: Hyper-V SynIC timers")
Signed-off-by: Nicolas Saenz Julienne <nsaenz(a)amazon.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Link: https://lore.kernel.org/r/20231017155101.40677-1-nsaenz@amazon.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 7c2dac6824e2..238afd7335e4 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -727,10 +727,12 @@ static int stimer_set_count(struct kvm_vcpu_hv_stimer *stimer, u64 count,
stimer_cleanup(stimer);
stimer->count = count;
- if (stimer->count == 0)
- stimer->config.enable = 0;
- else if (stimer->config.auto_enable)
- stimer->config.enable = 1;
+ if (!host) {
+ if (stimer->count == 0)
+ stimer->config.enable = 0;
+ else if (stimer->config.auto_enable)
+ stimer->config.enable = 1;
+ }
if (stimer->config.enable)
stimer_mark_pending(stimer, false);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x d6800af51c76b6dae20e6023bbdc9b3da3ab5121
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023112012-decency-frying-1e27@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
d6800af51c76 ("KVM: x86: hyper-v: Don't auto-enable stimer on write from user-space")
013cc6ebbf41 ("x86/kvm/hyper-v: avoid spurious pending stimer on vCPU init")
87a8d795b2f1 ("x86/hyper-v: Stop caring about EOI for direct stimers")
8644f771e07c ("x86/kvm/hyper-v: direct mode for synthetic timers")
6a058a1eadc3 ("x86/kvm/hyper-v: use stimer config definition from hyperv-tlfs.h")
0aa67255f54d ("x86/hyper-v: move synic/stimer control structures definitions to hyperv-tlfs.h")
7deec5e0df74 ("x86: kvm: hyperv: don't retry message delivery for periodic timers")
3a0e7731724f ("x86: kvm: hyperv: simplify SynIC message delivery")
f21dd494506a ("KVM: x86: hyperv: optimize sparse VP set processing")
e6b6c483ebe9 ("KVM: x86: hyperv: fix 'tlb_lush' typo")
214ff83d4473 ("KVM: x86: hyperv: implement PV IPI send hypercalls")
2cefc5feb80c ("KVM: x86: hyperv: optimize kvm_hv_flush_tlb() for vp_index == vcpu_idx case")
0b0a31badb2d ("KVM: x86: hyperv: valid_bank_mask should be 'u64'")
a812297c4fd9 ("KVM: x86: hyperv: optimize 'all cpus' case in kvm_hv_flush_tlb()")
aa069a996951 ("KVM: PPC: Book3S HV: Add a VM capability to enable nested virtualization")
9d67121a4fce ("Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d6800af51c76b6dae20e6023bbdc9b3da3ab5121 Mon Sep 17 00:00:00 2001
From: Nicolas Saenz Julienne <nsaenz(a)amazon.com>
Date: Tue, 17 Oct 2023 15:51:02 +0000
Subject: [PATCH] KVM: x86: hyper-v: Don't auto-enable stimer on write from
user-space
Don't apply the stimer's counter side effects when modifying its
value from user-space, as this may trigger spurious interrupts.
For example:
- The stimer is configured in auto-enable mode.
- The stimer's count is set and the timer enabled.
- The stimer expires, an interrupt is injected.
- The VM is live migrated.
- The stimer config and count are deserialized, auto-enable is ON, the
stimer is re-enabled.
- The stimer expires right away, and injects an unwarranted interrupt.
Cc: stable(a)vger.kernel.org
Fixes: 1f4b34f825e8 ("kvm/x86: Hyper-V SynIC timers")
Signed-off-by: Nicolas Saenz Julienne <nsaenz(a)amazon.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Link: https://lore.kernel.org/r/20231017155101.40677-1-nsaenz@amazon.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 7c2dac6824e2..238afd7335e4 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -727,10 +727,12 @@ static int stimer_set_count(struct kvm_vcpu_hv_stimer *stimer, u64 count,
stimer_cleanup(stimer);
stimer->count = count;
- if (stimer->count == 0)
- stimer->config.enable = 0;
- else if (stimer->config.auto_enable)
- stimer->config.enable = 1;
+ if (!host) {
+ if (stimer->count == 0)
+ stimer->config.enable = 0;
+ else if (stimer->config.auto_enable)
+ stimer->config.enable = 1;
+ }
if (stimer->config.enable)
stimer_mark_pending(stimer, false);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 7d08f21f8c6307cb05cabb8d86e90ff6ccba57e9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023112024-timing-octane-cc05@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
7d08f21f8c63 ("x86/PCI: Avoid PME from D3hot/D3cold for AMD Rembrandt and Phoenix USB4")
606012dddebb ("PCI: Fix up L1SS capability for Intel Apollo Lake Root Port")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7d08f21f8c6307cb05cabb8d86e90ff6ccba57e9 Mon Sep 17 00:00:00 2001
From: Mario Limonciello <mario.limonciello(a)amd.com>
Date: Wed, 4 Oct 2023 09:49:59 -0500
Subject: [PATCH] x86/PCI: Avoid PME from D3hot/D3cold for AMD Rembrandt and
Phoenix USB4
Iain reports that USB devices can't be used to wake a Lenovo Z13 from
suspend. This occurs because on some AMD platforms, even though the Root
Ports advertise PME_Support for D3hot and D3cold, wakeup events from
devices on a USB4 controller don't result in wakeup interrupts from the
Root Port when amd-pmc has put the platform in a hardware sleep state.
If amd-pmc will be involved in the suspend, remove D3hot and D3cold from
the PME_Support mask of Root Ports above USB4 controllers so we avoid those
states if we need wakeups.
Restore D3 support at resume so that it can be used by runtime suspend.
This affects both AMD Rembrandt and Phoenix SoCs.
"pm_suspend_target_state == PM_SUSPEND_ON" means we're doing runtime
suspend, and amd-pmc will not be involved. In that case PMEs work as
advertised in D3hot/D3cold, so we don't need to do anything.
Note that amd-pmc is technically optional, and there's no need for this
quirk if it's not present, but we assume it's always present because power
consumption is so high without it.
Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
Link: https://lore.kernel.org/r/20231004144959.158840-1-mario.limonciello@amd.com
Reported-by: Iain Lane <iain(a)orangesquash.org.uk>
Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-exte…
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
[bhelgaas: commit log, move to arch/x86/pci/fixup.c, add #includes]
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: stable(a)vger.kernel.org
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index e3ec02e6ac9f..f347c20247d3 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -3,9 +3,11 @@
* Exceptions for specific devices. Usually work-arounds for fatal design flaws.
*/
+#include <linux/bitfield.h>
#include <linux/delay.h>
#include <linux/dmi.h>
#include <linux/pci.h>
+#include <linux/suspend.h>
#include <linux/vgaarb.h>
#include <asm/amd_nb.h>
#include <asm/hpet.h>
@@ -904,3 +906,60 @@ static void chromeos_fixup_apl_pci_l1ss_capability(struct pci_dev *dev)
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x5ad6, chromeos_save_apl_pci_l1ss_capability);
DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, 0x5ad6, chromeos_fixup_apl_pci_l1ss_capability);
+
+#ifdef CONFIG_SUSPEND
+/*
+ * Root Ports on some AMD SoCs advertise PME_Support for D3hot and D3cold, but
+ * if the SoC is put into a hardware sleep state by the amd-pmc driver, the
+ * Root Ports don't generate wakeup interrupts for USB devices.
+ *
+ * When suspending, remove D3hot and D3cold from the PME_Support advertised
+ * by the Root Port so we don't use those states if we're expecting wakeup
+ * interrupts. Restore the advertised PME_Support when resuming.
+ */
+static void amd_rp_pme_suspend(struct pci_dev *dev)
+{
+ struct pci_dev *rp;
+
+ /*
+ * PM_SUSPEND_ON means we're doing runtime suspend, which means
+ * amd-pmc will not be involved so PMEs during D3 work as advertised.
+ *
+ * The PMEs *do* work if amd-pmc doesn't put the SoC in the hardware
+ * sleep state, but we assume amd-pmc is always present.
+ */
+ if (pm_suspend_target_state == PM_SUSPEND_ON)
+ return;
+
+ rp = pcie_find_root_port(dev);
+ if (!rp->pm_cap)
+ return;
+
+ rp->pme_support &= ~((PCI_PM_CAP_PME_D3hot|PCI_PM_CAP_PME_D3cold) >>
+ PCI_PM_CAP_PME_SHIFT);
+ dev_info_once(&rp->dev, "quirk: disabling D3cold for suspend\n");
+}
+
+static void amd_rp_pme_resume(struct pci_dev *dev)
+{
+ struct pci_dev *rp;
+ u16 pmc;
+
+ rp = pcie_find_root_port(dev);
+ if (!rp->pm_cap)
+ return;
+
+ pci_read_config_word(rp, rp->pm_cap + PCI_PM_PMC, &pmc);
+ rp->pme_support = FIELD_GET(PCI_PM_CAP_PME_MASK, pmc);
+}
+/* Rembrandt (yellow_carp) */
+DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x162e, amd_rp_pme_suspend);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x162e, amd_rp_pme_resume);
+DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x162f, amd_rp_pme_suspend);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x162f, amd_rp_pme_resume);
+/* Phoenix (pink_sardine) */
+DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x1668, amd_rp_pme_suspend);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x1668, amd_rp_pme_resume);
+DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x1669, amd_rp_pme_suspend);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x1669, amd_rp_pme_resume);
+#endif /* CONFIG_SUSPEND */
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 7d08f21f8c6307cb05cabb8d86e90ff6ccba57e9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023112023-shifter-dividing-57a0@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
7d08f21f8c63 ("x86/PCI: Avoid PME from D3hot/D3cold for AMD Rembrandt and Phoenix USB4")
606012dddebb ("PCI: Fix up L1SS capability for Intel Apollo Lake Root Port")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7d08f21f8c6307cb05cabb8d86e90ff6ccba57e9 Mon Sep 17 00:00:00 2001
From: Mario Limonciello <mario.limonciello(a)amd.com>
Date: Wed, 4 Oct 2023 09:49:59 -0500
Subject: [PATCH] x86/PCI: Avoid PME from D3hot/D3cold for AMD Rembrandt and
Phoenix USB4
Iain reports that USB devices can't be used to wake a Lenovo Z13 from
suspend. This occurs because on some AMD platforms, even though the Root
Ports advertise PME_Support for D3hot and D3cold, wakeup events from
devices on a USB4 controller don't result in wakeup interrupts from the
Root Port when amd-pmc has put the platform in a hardware sleep state.
If amd-pmc will be involved in the suspend, remove D3hot and D3cold from
the PME_Support mask of Root Ports above USB4 controllers so we avoid those
states if we need wakeups.
Restore D3 support at resume so that it can be used by runtime suspend.
This affects both AMD Rembrandt and Phoenix SoCs.
"pm_suspend_target_state == PM_SUSPEND_ON" means we're doing runtime
suspend, and amd-pmc will not be involved. In that case PMEs work as
advertised in D3hot/D3cold, so we don't need to do anything.
Note that amd-pmc is technically optional, and there's no need for this
quirk if it's not present, but we assume it's always present because power
consumption is so high without it.
Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
Link: https://lore.kernel.org/r/20231004144959.158840-1-mario.limonciello@amd.com
Reported-by: Iain Lane <iain(a)orangesquash.org.uk>
Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-exte…
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
[bhelgaas: commit log, move to arch/x86/pci/fixup.c, add #includes]
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: stable(a)vger.kernel.org
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index e3ec02e6ac9f..f347c20247d3 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -3,9 +3,11 @@
* Exceptions for specific devices. Usually work-arounds for fatal design flaws.
*/
+#include <linux/bitfield.h>
#include <linux/delay.h>
#include <linux/dmi.h>
#include <linux/pci.h>
+#include <linux/suspend.h>
#include <linux/vgaarb.h>
#include <asm/amd_nb.h>
#include <asm/hpet.h>
@@ -904,3 +906,60 @@ static void chromeos_fixup_apl_pci_l1ss_capability(struct pci_dev *dev)
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x5ad6, chromeos_save_apl_pci_l1ss_capability);
DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, 0x5ad6, chromeos_fixup_apl_pci_l1ss_capability);
+
+#ifdef CONFIG_SUSPEND
+/*
+ * Root Ports on some AMD SoCs advertise PME_Support for D3hot and D3cold, but
+ * if the SoC is put into a hardware sleep state by the amd-pmc driver, the
+ * Root Ports don't generate wakeup interrupts for USB devices.
+ *
+ * When suspending, remove D3hot and D3cold from the PME_Support advertised
+ * by the Root Port so we don't use those states if we're expecting wakeup
+ * interrupts. Restore the advertised PME_Support when resuming.
+ */
+static void amd_rp_pme_suspend(struct pci_dev *dev)
+{
+ struct pci_dev *rp;
+
+ /*
+ * PM_SUSPEND_ON means we're doing runtime suspend, which means
+ * amd-pmc will not be involved so PMEs during D3 work as advertised.
+ *
+ * The PMEs *do* work if amd-pmc doesn't put the SoC in the hardware
+ * sleep state, but we assume amd-pmc is always present.
+ */
+ if (pm_suspend_target_state == PM_SUSPEND_ON)
+ return;
+
+ rp = pcie_find_root_port(dev);
+ if (!rp->pm_cap)
+ return;
+
+ rp->pme_support &= ~((PCI_PM_CAP_PME_D3hot|PCI_PM_CAP_PME_D3cold) >>
+ PCI_PM_CAP_PME_SHIFT);
+ dev_info_once(&rp->dev, "quirk: disabling D3cold for suspend\n");
+}
+
+static void amd_rp_pme_resume(struct pci_dev *dev)
+{
+ struct pci_dev *rp;
+ u16 pmc;
+
+ rp = pcie_find_root_port(dev);
+ if (!rp->pm_cap)
+ return;
+
+ pci_read_config_word(rp, rp->pm_cap + PCI_PM_PMC, &pmc);
+ rp->pme_support = FIELD_GET(PCI_PM_CAP_PME_MASK, pmc);
+}
+/* Rembrandt (yellow_carp) */
+DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x162e, amd_rp_pme_suspend);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x162e, amd_rp_pme_resume);
+DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x162f, amd_rp_pme_suspend);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x162f, amd_rp_pme_resume);
+/* Phoenix (pink_sardine) */
+DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x1668, amd_rp_pme_suspend);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x1668, amd_rp_pme_resume);
+DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x1669, amd_rp_pme_suspend);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x1669, amd_rp_pme_resume);
+#endif /* CONFIG_SUSPEND */
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 7d08f21f8c6307cb05cabb8d86e90ff6ccba57e9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023112022-mutilated-twelve-7956@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
7d08f21f8c63 ("x86/PCI: Avoid PME from D3hot/D3cold for AMD Rembrandt and Phoenix USB4")
606012dddebb ("PCI: Fix up L1SS capability for Intel Apollo Lake Root Port")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7d08f21f8c6307cb05cabb8d86e90ff6ccba57e9 Mon Sep 17 00:00:00 2001
From: Mario Limonciello <mario.limonciello(a)amd.com>
Date: Wed, 4 Oct 2023 09:49:59 -0500
Subject: [PATCH] x86/PCI: Avoid PME from D3hot/D3cold for AMD Rembrandt and
Phoenix USB4
Iain reports that USB devices can't be used to wake a Lenovo Z13 from
suspend. This occurs because on some AMD platforms, even though the Root
Ports advertise PME_Support for D3hot and D3cold, wakeup events from
devices on a USB4 controller don't result in wakeup interrupts from the
Root Port when amd-pmc has put the platform in a hardware sleep state.
If amd-pmc will be involved in the suspend, remove D3hot and D3cold from
the PME_Support mask of Root Ports above USB4 controllers so we avoid those
states if we need wakeups.
Restore D3 support at resume so that it can be used by runtime suspend.
This affects both AMD Rembrandt and Phoenix SoCs.
"pm_suspend_target_state == PM_SUSPEND_ON" means we're doing runtime
suspend, and amd-pmc will not be involved. In that case PMEs work as
advertised in D3hot/D3cold, so we don't need to do anything.
Note that amd-pmc is technically optional, and there's no need for this
quirk if it's not present, but we assume it's always present because power
consumption is so high without it.
Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
Link: https://lore.kernel.org/r/20231004144959.158840-1-mario.limonciello@amd.com
Reported-by: Iain Lane <iain(a)orangesquash.org.uk>
Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-exte…
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
[bhelgaas: commit log, move to arch/x86/pci/fixup.c, add #includes]
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: stable(a)vger.kernel.org
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index e3ec02e6ac9f..f347c20247d3 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -3,9 +3,11 @@
* Exceptions for specific devices. Usually work-arounds for fatal design flaws.
*/
+#include <linux/bitfield.h>
#include <linux/delay.h>
#include <linux/dmi.h>
#include <linux/pci.h>
+#include <linux/suspend.h>
#include <linux/vgaarb.h>
#include <asm/amd_nb.h>
#include <asm/hpet.h>
@@ -904,3 +906,60 @@ static void chromeos_fixup_apl_pci_l1ss_capability(struct pci_dev *dev)
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x5ad6, chromeos_save_apl_pci_l1ss_capability);
DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL, 0x5ad6, chromeos_fixup_apl_pci_l1ss_capability);
+
+#ifdef CONFIG_SUSPEND
+/*
+ * Root Ports on some AMD SoCs advertise PME_Support for D3hot and D3cold, but
+ * if the SoC is put into a hardware sleep state by the amd-pmc driver, the
+ * Root Ports don't generate wakeup interrupts for USB devices.
+ *
+ * When suspending, remove D3hot and D3cold from the PME_Support advertised
+ * by the Root Port so we don't use those states if we're expecting wakeup
+ * interrupts. Restore the advertised PME_Support when resuming.
+ */
+static void amd_rp_pme_suspend(struct pci_dev *dev)
+{
+ struct pci_dev *rp;
+
+ /*
+ * PM_SUSPEND_ON means we're doing runtime suspend, which means
+ * amd-pmc will not be involved so PMEs during D3 work as advertised.
+ *
+ * The PMEs *do* work if amd-pmc doesn't put the SoC in the hardware
+ * sleep state, but we assume amd-pmc is always present.
+ */
+ if (pm_suspend_target_state == PM_SUSPEND_ON)
+ return;
+
+ rp = pcie_find_root_port(dev);
+ if (!rp->pm_cap)
+ return;
+
+ rp->pme_support &= ~((PCI_PM_CAP_PME_D3hot|PCI_PM_CAP_PME_D3cold) >>
+ PCI_PM_CAP_PME_SHIFT);
+ dev_info_once(&rp->dev, "quirk: disabling D3cold for suspend\n");
+}
+
+static void amd_rp_pme_resume(struct pci_dev *dev)
+{
+ struct pci_dev *rp;
+ u16 pmc;
+
+ rp = pcie_find_root_port(dev);
+ if (!rp->pm_cap)
+ return;
+
+ pci_read_config_word(rp, rp->pm_cap + PCI_PM_PMC, &pmc);
+ rp->pme_support = FIELD_GET(PCI_PM_CAP_PME_MASK, pmc);
+}
+/* Rembrandt (yellow_carp) */
+DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x162e, amd_rp_pme_suspend);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x162e, amd_rp_pme_resume);
+DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x162f, amd_rp_pme_suspend);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x162f, amd_rp_pme_resume);
+/* Phoenix (pink_sardine) */
+DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x1668, amd_rp_pme_suspend);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x1668, amd_rp_pme_resume);
+DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_AMD, 0x1669, amd_rp_pme_suspend);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x1669, amd_rp_pme_resume);
+#endif /* CONFIG_SUSPEND */