The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x a6b87bfc2ab5bccb7ad953693c85d9062aef3fdd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081235-dinghy-spectrum-644a@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a6b87bfc2ab5bccb7ad953693c85d9062aef3fdd Mon Sep 17 00:00:00 2001
From: Alan Stern <stern(a)rowland.harvard.edu>
Date: Wed, 23 Jul 2025 10:37:04 -0400
Subject: [PATCH] HID: core: Harden s32ton() against conversion to 0 bits
Testing by the syzbot fuzzer showed that the HID core gets a
shift-out-of-bounds exception when it tries to convert a 32-bit
quantity to a 0-bit quantity. Ideally this should never occur, but
there are buggy devices and some might have a report field with size
set to zero; we shouldn't reject the report or the device just because
of that.
Instead, harden the s32ton() routine so that it returns a reasonable
result instead of crashing when it is called with the number of bits
set to 0 -- the same as what snto32() does.
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
Reported-by: syzbot+b63d677d63bcac06cf90(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-usb/68753a08.050a0220.33d347.0008.GAE@google.…
Tested-by: syzbot+b63d677d63bcac06cf90(a)syzkaller.appspotmail.com
Fixes: dde5845a529f ("[PATCH] Generic HID layer - code split")
Cc: stable(a)vger.kernel.org
Link: https://patch.msgid.link/613a66cd-4309-4bce-a4f7-2905f9bce0c9@rowland.harva…
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index ef1f79951d9b..f7d4efcae603 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -66,8 +66,12 @@ static s32 snto32(__u32 value, unsigned int n)
static u32 s32ton(__s32 value, unsigned int n)
{
- s32 a = value >> (n - 1);
+ s32 a;
+ if (!value || !n)
+ return 0;
+
+ a = value >> (n - 1);
if (a && a != -1)
return value < 0 ? 1 << (n - 1) : (1 << (n - 1)) - 1;
return value & ((1 << n) - 1);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x a6b87bfc2ab5bccb7ad953693c85d9062aef3fdd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081233-stamp-fiddling-ffca@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a6b87bfc2ab5bccb7ad953693c85d9062aef3fdd Mon Sep 17 00:00:00 2001
From: Alan Stern <stern(a)rowland.harvard.edu>
Date: Wed, 23 Jul 2025 10:37:04 -0400
Subject: [PATCH] HID: core: Harden s32ton() against conversion to 0 bits
Testing by the syzbot fuzzer showed that the HID core gets a
shift-out-of-bounds exception when it tries to convert a 32-bit
quantity to a 0-bit quantity. Ideally this should never occur, but
there are buggy devices and some might have a report field with size
set to zero; we shouldn't reject the report or the device just because
of that.
Instead, harden the s32ton() routine so that it returns a reasonable
result instead of crashing when it is called with the number of bits
set to 0 -- the same as what snto32() does.
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
Reported-by: syzbot+b63d677d63bcac06cf90(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-usb/68753a08.050a0220.33d347.0008.GAE@google.…
Tested-by: syzbot+b63d677d63bcac06cf90(a)syzkaller.appspotmail.com
Fixes: dde5845a529f ("[PATCH] Generic HID layer - code split")
Cc: stable(a)vger.kernel.org
Link: https://patch.msgid.link/613a66cd-4309-4bce-a4f7-2905f9bce0c9@rowland.harva…
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index ef1f79951d9b..f7d4efcae603 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -66,8 +66,12 @@ static s32 snto32(__u32 value, unsigned int n)
static u32 s32ton(__s32 value, unsigned int n)
{
- s32 a = value >> (n - 1);
+ s32 a;
+ if (!value || !n)
+ return 0;
+
+ a = value >> (n - 1);
if (a && a != -1)
return value < 0 ? 1 << (n - 1) : (1 << (n - 1)) - 1;
return value & ((1 << n) - 1);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x a6b87bfc2ab5bccb7ad953693c85d9062aef3fdd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081234-unmapped-serpent-9b0d@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a6b87bfc2ab5bccb7ad953693c85d9062aef3fdd Mon Sep 17 00:00:00 2001
From: Alan Stern <stern(a)rowland.harvard.edu>
Date: Wed, 23 Jul 2025 10:37:04 -0400
Subject: [PATCH] HID: core: Harden s32ton() against conversion to 0 bits
Testing by the syzbot fuzzer showed that the HID core gets a
shift-out-of-bounds exception when it tries to convert a 32-bit
quantity to a 0-bit quantity. Ideally this should never occur, but
there are buggy devices and some might have a report field with size
set to zero; we shouldn't reject the report or the device just because
of that.
Instead, harden the s32ton() routine so that it returns a reasonable
result instead of crashing when it is called with the number of bits
set to 0 -- the same as what snto32() does.
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
Reported-by: syzbot+b63d677d63bcac06cf90(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-usb/68753a08.050a0220.33d347.0008.GAE@google.…
Tested-by: syzbot+b63d677d63bcac06cf90(a)syzkaller.appspotmail.com
Fixes: dde5845a529f ("[PATCH] Generic HID layer - code split")
Cc: stable(a)vger.kernel.org
Link: https://patch.msgid.link/613a66cd-4309-4bce-a4f7-2905f9bce0c9@rowland.harva…
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index ef1f79951d9b..f7d4efcae603 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -66,8 +66,12 @@ static s32 snto32(__u32 value, unsigned int n)
static u32 s32ton(__s32 value, unsigned int n)
{
- s32 a = value >> (n - 1);
+ s32 a;
+ if (!value || !n)
+ return 0;
+
+ a = value >> (n - 1);
if (a && a != -1)
return value < 0 ? 1 << (n - 1) : (1 << (n - 1)) - 1;
return value & ((1 << n) - 1);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x a6b87bfc2ab5bccb7ad953693c85d9062aef3fdd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081233-observing-prepay-25da@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a6b87bfc2ab5bccb7ad953693c85d9062aef3fdd Mon Sep 17 00:00:00 2001
From: Alan Stern <stern(a)rowland.harvard.edu>
Date: Wed, 23 Jul 2025 10:37:04 -0400
Subject: [PATCH] HID: core: Harden s32ton() against conversion to 0 bits
Testing by the syzbot fuzzer showed that the HID core gets a
shift-out-of-bounds exception when it tries to convert a 32-bit
quantity to a 0-bit quantity. Ideally this should never occur, but
there are buggy devices and some might have a report field with size
set to zero; we shouldn't reject the report or the device just because
of that.
Instead, harden the s32ton() routine so that it returns a reasonable
result instead of crashing when it is called with the number of bits
set to 0 -- the same as what snto32() does.
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
Reported-by: syzbot+b63d677d63bcac06cf90(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-usb/68753a08.050a0220.33d347.0008.GAE@google.…
Tested-by: syzbot+b63d677d63bcac06cf90(a)syzkaller.appspotmail.com
Fixes: dde5845a529f ("[PATCH] Generic HID layer - code split")
Cc: stable(a)vger.kernel.org
Link: https://patch.msgid.link/613a66cd-4309-4bce-a4f7-2905f9bce0c9@rowland.harva…
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index ef1f79951d9b..f7d4efcae603 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -66,8 +66,12 @@ static s32 snto32(__u32 value, unsigned int n)
static u32 s32ton(__s32 value, unsigned int n)
{
- s32 a = value >> (n - 1);
+ s32 a;
+ if (!value || !n)
+ return 0;
+
+ a = value >> (n - 1);
if (a && a != -1)
return value < 0 ? 1 << (n - 1) : (1 << (n - 1)) - 1;
return value & ((1 << n) - 1);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x a6b87bfc2ab5bccb7ad953693c85d9062aef3fdd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081232-immovable-among-3b59@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a6b87bfc2ab5bccb7ad953693c85d9062aef3fdd Mon Sep 17 00:00:00 2001
From: Alan Stern <stern(a)rowland.harvard.edu>
Date: Wed, 23 Jul 2025 10:37:04 -0400
Subject: [PATCH] HID: core: Harden s32ton() against conversion to 0 bits
Testing by the syzbot fuzzer showed that the HID core gets a
shift-out-of-bounds exception when it tries to convert a 32-bit
quantity to a 0-bit quantity. Ideally this should never occur, but
there are buggy devices and some might have a report field with size
set to zero; we shouldn't reject the report or the device just because
of that.
Instead, harden the s32ton() routine so that it returns a reasonable
result instead of crashing when it is called with the number of bits
set to 0 -- the same as what snto32() does.
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
Reported-by: syzbot+b63d677d63bcac06cf90(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-usb/68753a08.050a0220.33d347.0008.GAE@google.…
Tested-by: syzbot+b63d677d63bcac06cf90(a)syzkaller.appspotmail.com
Fixes: dde5845a529f ("[PATCH] Generic HID layer - code split")
Cc: stable(a)vger.kernel.org
Link: https://patch.msgid.link/613a66cd-4309-4bce-a4f7-2905f9bce0c9@rowland.harva…
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index ef1f79951d9b..f7d4efcae603 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -66,8 +66,12 @@ static s32 snto32(__u32 value, unsigned int n)
static u32 s32ton(__s32 value, unsigned int n)
{
- s32 a = value >> (n - 1);
+ s32 a;
+ if (!value || !n)
+ return 0;
+
+ a = value >> (n - 1);
if (a && a != -1)
return value < 0 ? 1 << (n - 1) : (1 << (n - 1)) - 1;
return value & ((1 << n) - 1);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081238-veggie-abrasive-ce80@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk(a)redhat.com>
Date: Tue, 10 Jun 2025 16:20:10 -0700
Subject: [PATCH] KVM: VMX: Preserve host's DEBUGCTLMSR_FREEZE_IN_SMM while
running the guest
Set/clear DEBUGCTLMSR_FREEZE_IN_SMM in GUEST_IA32_DEBUGCTL based on the
host's pre-VM-Enter value, i.e. preserve the host's FREEZE_IN_SMM setting
while running the guest. When running with the "default treatment of SMIs"
in effect (the only mode KVM supports), SMIs do not generate a VM-Exit that
is visible to host (non-SMM) software, and instead transitions directly
from VMX non-root to SMM. And critically, DEBUGCTL isn't context switched
by hardware on SMI or RSM, i.e. SMM will run with whatever value was
resident in hardware at the time of the SMI.
Failure to preserve FREEZE_IN_SMM results in the PMU unexpectedly counting
events while the CPU is executing in SMM, which can pollute profiling and
potentially leak information into the guest.
Check for changes in FREEZE_IN_SMM prior to every entry into KVM's inner
run loop, as the bit can be toggled in IRQ context via IPI callback (SMP
function call), by way of /sys/devices/cpu/freeze_on_smi.
Add a field in kvm_x86_ops to communicate which DEBUGCTL bits need to be
preserved, as FREEZE_IN_SMM is only supported and defined for Intel CPUs,
i.e. explicitly checking FREEZE_IN_SMM in common x86 is at best weird, and
at worst could lead to undesirable behavior in the future if AMD CPUs ever
happened to pick up a collision with the bit.
Exempt TDX vCPUs, i.e. protected guests, from the check, as the TDX Module
owns and controls GUEST_IA32_DEBUGCTL.
WARN in SVM if KVM_RUN_LOAD_DEBUGCTL is set, mostly to document that the
lack of handling isn't a KVM bug (TDX already WARNs on any run_flag).
Lastly, explicitly reload GUEST_IA32_DEBUGCTL on a VM-Fail that is missed
by KVM but detected by hardware, i.e. in nested_vmx_restore_host_state().
Doing so avoids the need to track host_debugctl on a per-VMCS basis, as
GUEST_IA32_DEBUGCTL is unconditionally written by prepare_vmcs02() and
load_vmcs12_host_state(). For the VM-Fail case, even though KVM won't
have actually entered the guest, vcpu_enter_guest() will have run with
vmcs02 active and thus could result in vmcs01 being run with a stale value.
Cc: stable(a)vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Co-developed-by: Sean Christopherson <seanjc(a)google.com>
Link: https://lore.kernel.org/r/20250610232010.162191-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4f7ee2dbf10e..5e0415a8ee3f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1677,6 +1677,7 @@ static inline u16 kvm_lapic_irq_dest_mode(bool dest_mode_logical)
enum kvm_x86_run_flags {
KVM_RUN_FORCE_IMMEDIATE_EXIT = BIT(0),
KVM_RUN_LOAD_GUEST_DR6 = BIT(1),
+ KVM_RUN_LOAD_DEBUGCTL = BIT(2),
};
struct kvm_x86_ops {
@@ -1707,6 +1708,12 @@ struct kvm_x86_ops {
void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu);
void (*vcpu_put)(struct kvm_vcpu *vcpu);
+ /*
+ * Mask of DEBUGCTL bits that are owned by the host, i.e. that need to
+ * match the host's value even while the guest is active.
+ */
+ const u64 HOST_OWNED_DEBUGCTL;
+
void (*update_exception_bitmap)(struct kvm_vcpu *vcpu);
int (*get_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index c85cbce6d2f6..4a6d4460f947 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -915,6 +915,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
.vcpu_load = vt_op(vcpu_load),
.vcpu_put = vt_op(vcpu_put),
+ .HOST_OWNED_DEBUGCTL = DEBUGCTLMSR_FREEZE_IN_SMM,
+
.update_exception_bitmap = vt_op(update_exception_bitmap),
.get_feature_msr = vmx_get_feature_msr,
.get_msr = vt_op(get_msr),
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index ef20184b8b11..c69df3aba8d1 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4861,6 +4861,9 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
WARN_ON(kvm_set_dr(vcpu, 7, vmcs_readl(GUEST_DR7)));
}
+ /* Reload DEBUGCTL to ensure vmcs01 has a fresh FREEZE_IN_SMM value. */
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Note that calling vmx_set_{efer,cr0,cr4} is important as they
* handle a variety of side effects to KVM's software model.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a77d325fe78b..0db1bd383302 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7377,6 +7377,9 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
if (run_flags & KVM_RUN_LOAD_GUEST_DR6)
set_debugreg(vcpu->arch.dr6, 6);
+ if (run_flags & KVM_RUN_LOAD_DEBUGCTL)
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Refresh vmcs.HOST_CR3 if necessary. This must be done immediately
* prior to VM-Enter, as the kernel may load a new ASID (PCID) any time
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index c20a4185d10a..076af78af151 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -419,12 +419,25 @@ bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
static inline void vmx_guest_debugctl_write(struct kvm_vcpu *vcpu, u64 val)
{
+ WARN_ON_ONCE(val & DEBUGCTLMSR_FREEZE_IN_SMM);
+
+ val |= vcpu->arch.host_debugctl & DEBUGCTLMSR_FREEZE_IN_SMM;
vmcs_write64(GUEST_IA32_DEBUGCTL, val);
}
static inline u64 vmx_guest_debugctl_read(void)
{
- return vmcs_read64(GUEST_IA32_DEBUGCTL);
+ return vmcs_read64(GUEST_IA32_DEBUGCTL) & ~DEBUGCTLMSR_FREEZE_IN_SMM;
+}
+
+static inline void vmx_reload_guest_debugctl(struct kvm_vcpu *vcpu)
+{
+ u64 val = vmcs_read64(GUEST_IA32_DEBUGCTL);
+
+ if (!((val ^ vcpu->arch.host_debugctl) & DEBUGCTLMSR_FREEZE_IN_SMM))
+ return;
+
+ vmx_guest_debugctl_write(vcpu, val & ~DEBUGCTLMSR_FREEZE_IN_SMM);
}
/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7fe3549bbd93..179c2c550c8d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10779,7 +10779,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
dm_request_for_irq_injection(vcpu) &&
kvm_cpu_accept_dm_intr(vcpu);
fastpath_t exit_fastpath;
- u64 run_flags;
+ u64 run_flags, debug_ctl;
bool req_immediate_exit = false;
@@ -11051,7 +11051,17 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
set_debugreg(0, 7);
}
- vcpu->arch.host_debugctl = get_debugctlmsr();
+ /*
+ * Refresh the host DEBUGCTL snapshot after disabling IRQs, as DEBUGCTL
+ * can be modified in IRQ context, e.g. via SMP function calls. Inform
+ * vendor code if any host-owned bits were changed, e.g. so that the
+ * value loaded into hardware while running the guest can be updated.
+ */
+ debug_ctl = get_debugctlmsr();
+ if ((debug_ctl ^ vcpu->arch.host_debugctl) & kvm_x86_ops.HOST_OWNED_DEBUGCTL &&
+ !vcpu->arch.guest_state_protected)
+ run_flags |= KVM_RUN_LOAD_DEBUGCTL;
+ vcpu->arch.host_debugctl = debug_ctl;
guest_timing_enter_irqoff();
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081237-acclaim-finished-f854@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk(a)redhat.com>
Date: Tue, 10 Jun 2025 16:20:10 -0700
Subject: [PATCH] KVM: VMX: Preserve host's DEBUGCTLMSR_FREEZE_IN_SMM while
running the guest
Set/clear DEBUGCTLMSR_FREEZE_IN_SMM in GUEST_IA32_DEBUGCTL based on the
host's pre-VM-Enter value, i.e. preserve the host's FREEZE_IN_SMM setting
while running the guest. When running with the "default treatment of SMIs"
in effect (the only mode KVM supports), SMIs do not generate a VM-Exit that
is visible to host (non-SMM) software, and instead transitions directly
from VMX non-root to SMM. And critically, DEBUGCTL isn't context switched
by hardware on SMI or RSM, i.e. SMM will run with whatever value was
resident in hardware at the time of the SMI.
Failure to preserve FREEZE_IN_SMM results in the PMU unexpectedly counting
events while the CPU is executing in SMM, which can pollute profiling and
potentially leak information into the guest.
Check for changes in FREEZE_IN_SMM prior to every entry into KVM's inner
run loop, as the bit can be toggled in IRQ context via IPI callback (SMP
function call), by way of /sys/devices/cpu/freeze_on_smi.
Add a field in kvm_x86_ops to communicate which DEBUGCTL bits need to be
preserved, as FREEZE_IN_SMM is only supported and defined for Intel CPUs,
i.e. explicitly checking FREEZE_IN_SMM in common x86 is at best weird, and
at worst could lead to undesirable behavior in the future if AMD CPUs ever
happened to pick up a collision with the bit.
Exempt TDX vCPUs, i.e. protected guests, from the check, as the TDX Module
owns and controls GUEST_IA32_DEBUGCTL.
WARN in SVM if KVM_RUN_LOAD_DEBUGCTL is set, mostly to document that the
lack of handling isn't a KVM bug (TDX already WARNs on any run_flag).
Lastly, explicitly reload GUEST_IA32_DEBUGCTL on a VM-Fail that is missed
by KVM but detected by hardware, i.e. in nested_vmx_restore_host_state().
Doing so avoids the need to track host_debugctl on a per-VMCS basis, as
GUEST_IA32_DEBUGCTL is unconditionally written by prepare_vmcs02() and
load_vmcs12_host_state(). For the VM-Fail case, even though KVM won't
have actually entered the guest, vcpu_enter_guest() will have run with
vmcs02 active and thus could result in vmcs01 being run with a stale value.
Cc: stable(a)vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Co-developed-by: Sean Christopherson <seanjc(a)google.com>
Link: https://lore.kernel.org/r/20250610232010.162191-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4f7ee2dbf10e..5e0415a8ee3f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1677,6 +1677,7 @@ static inline u16 kvm_lapic_irq_dest_mode(bool dest_mode_logical)
enum kvm_x86_run_flags {
KVM_RUN_FORCE_IMMEDIATE_EXIT = BIT(0),
KVM_RUN_LOAD_GUEST_DR6 = BIT(1),
+ KVM_RUN_LOAD_DEBUGCTL = BIT(2),
};
struct kvm_x86_ops {
@@ -1707,6 +1708,12 @@ struct kvm_x86_ops {
void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu);
void (*vcpu_put)(struct kvm_vcpu *vcpu);
+ /*
+ * Mask of DEBUGCTL bits that are owned by the host, i.e. that need to
+ * match the host's value even while the guest is active.
+ */
+ const u64 HOST_OWNED_DEBUGCTL;
+
void (*update_exception_bitmap)(struct kvm_vcpu *vcpu);
int (*get_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index c85cbce6d2f6..4a6d4460f947 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -915,6 +915,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
.vcpu_load = vt_op(vcpu_load),
.vcpu_put = vt_op(vcpu_put),
+ .HOST_OWNED_DEBUGCTL = DEBUGCTLMSR_FREEZE_IN_SMM,
+
.update_exception_bitmap = vt_op(update_exception_bitmap),
.get_feature_msr = vmx_get_feature_msr,
.get_msr = vt_op(get_msr),
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index ef20184b8b11..c69df3aba8d1 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4861,6 +4861,9 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
WARN_ON(kvm_set_dr(vcpu, 7, vmcs_readl(GUEST_DR7)));
}
+ /* Reload DEBUGCTL to ensure vmcs01 has a fresh FREEZE_IN_SMM value. */
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Note that calling vmx_set_{efer,cr0,cr4} is important as they
* handle a variety of side effects to KVM's software model.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a77d325fe78b..0db1bd383302 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7377,6 +7377,9 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
if (run_flags & KVM_RUN_LOAD_GUEST_DR6)
set_debugreg(vcpu->arch.dr6, 6);
+ if (run_flags & KVM_RUN_LOAD_DEBUGCTL)
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Refresh vmcs.HOST_CR3 if necessary. This must be done immediately
* prior to VM-Enter, as the kernel may load a new ASID (PCID) any time
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index c20a4185d10a..076af78af151 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -419,12 +419,25 @@ bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
static inline void vmx_guest_debugctl_write(struct kvm_vcpu *vcpu, u64 val)
{
+ WARN_ON_ONCE(val & DEBUGCTLMSR_FREEZE_IN_SMM);
+
+ val |= vcpu->arch.host_debugctl & DEBUGCTLMSR_FREEZE_IN_SMM;
vmcs_write64(GUEST_IA32_DEBUGCTL, val);
}
static inline u64 vmx_guest_debugctl_read(void)
{
- return vmcs_read64(GUEST_IA32_DEBUGCTL);
+ return vmcs_read64(GUEST_IA32_DEBUGCTL) & ~DEBUGCTLMSR_FREEZE_IN_SMM;
+}
+
+static inline void vmx_reload_guest_debugctl(struct kvm_vcpu *vcpu)
+{
+ u64 val = vmcs_read64(GUEST_IA32_DEBUGCTL);
+
+ if (!((val ^ vcpu->arch.host_debugctl) & DEBUGCTLMSR_FREEZE_IN_SMM))
+ return;
+
+ vmx_guest_debugctl_write(vcpu, val & ~DEBUGCTLMSR_FREEZE_IN_SMM);
}
/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7fe3549bbd93..179c2c550c8d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10779,7 +10779,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
dm_request_for_irq_injection(vcpu) &&
kvm_cpu_accept_dm_intr(vcpu);
fastpath_t exit_fastpath;
- u64 run_flags;
+ u64 run_flags, debug_ctl;
bool req_immediate_exit = false;
@@ -11051,7 +11051,17 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
set_debugreg(0, 7);
}
- vcpu->arch.host_debugctl = get_debugctlmsr();
+ /*
+ * Refresh the host DEBUGCTL snapshot after disabling IRQs, as DEBUGCTL
+ * can be modified in IRQ context, e.g. via SMP function calls. Inform
+ * vendor code if any host-owned bits were changed, e.g. so that the
+ * value loaded into hardware while running the guest can be updated.
+ */
+ debug_ctl = get_debugctlmsr();
+ if ((debug_ctl ^ vcpu->arch.host_debugctl) & kvm_x86_ops.HOST_OWNED_DEBUGCTL &&
+ !vcpu->arch.guest_state_protected)
+ run_flags |= KVM_RUN_LOAD_DEBUGCTL;
+ vcpu->arch.host_debugctl = debug_ctl;
guest_timing_enter_irqoff();
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081236-platonic-plant-3b8f@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk(a)redhat.com>
Date: Tue, 10 Jun 2025 16:20:10 -0700
Subject: [PATCH] KVM: VMX: Preserve host's DEBUGCTLMSR_FREEZE_IN_SMM while
running the guest
Set/clear DEBUGCTLMSR_FREEZE_IN_SMM in GUEST_IA32_DEBUGCTL based on the
host's pre-VM-Enter value, i.e. preserve the host's FREEZE_IN_SMM setting
while running the guest. When running with the "default treatment of SMIs"
in effect (the only mode KVM supports), SMIs do not generate a VM-Exit that
is visible to host (non-SMM) software, and instead transitions directly
from VMX non-root to SMM. And critically, DEBUGCTL isn't context switched
by hardware on SMI or RSM, i.e. SMM will run with whatever value was
resident in hardware at the time of the SMI.
Failure to preserve FREEZE_IN_SMM results in the PMU unexpectedly counting
events while the CPU is executing in SMM, which can pollute profiling and
potentially leak information into the guest.
Check for changes in FREEZE_IN_SMM prior to every entry into KVM's inner
run loop, as the bit can be toggled in IRQ context via IPI callback (SMP
function call), by way of /sys/devices/cpu/freeze_on_smi.
Add a field in kvm_x86_ops to communicate which DEBUGCTL bits need to be
preserved, as FREEZE_IN_SMM is only supported and defined for Intel CPUs,
i.e. explicitly checking FREEZE_IN_SMM in common x86 is at best weird, and
at worst could lead to undesirable behavior in the future if AMD CPUs ever
happened to pick up a collision with the bit.
Exempt TDX vCPUs, i.e. protected guests, from the check, as the TDX Module
owns and controls GUEST_IA32_DEBUGCTL.
WARN in SVM if KVM_RUN_LOAD_DEBUGCTL is set, mostly to document that the
lack of handling isn't a KVM bug (TDX already WARNs on any run_flag).
Lastly, explicitly reload GUEST_IA32_DEBUGCTL on a VM-Fail that is missed
by KVM but detected by hardware, i.e. in nested_vmx_restore_host_state().
Doing so avoids the need to track host_debugctl on a per-VMCS basis, as
GUEST_IA32_DEBUGCTL is unconditionally written by prepare_vmcs02() and
load_vmcs12_host_state(). For the VM-Fail case, even though KVM won't
have actually entered the guest, vcpu_enter_guest() will have run with
vmcs02 active and thus could result in vmcs01 being run with a stale value.
Cc: stable(a)vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Co-developed-by: Sean Christopherson <seanjc(a)google.com>
Link: https://lore.kernel.org/r/20250610232010.162191-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4f7ee2dbf10e..5e0415a8ee3f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1677,6 +1677,7 @@ static inline u16 kvm_lapic_irq_dest_mode(bool dest_mode_logical)
enum kvm_x86_run_flags {
KVM_RUN_FORCE_IMMEDIATE_EXIT = BIT(0),
KVM_RUN_LOAD_GUEST_DR6 = BIT(1),
+ KVM_RUN_LOAD_DEBUGCTL = BIT(2),
};
struct kvm_x86_ops {
@@ -1707,6 +1708,12 @@ struct kvm_x86_ops {
void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu);
void (*vcpu_put)(struct kvm_vcpu *vcpu);
+ /*
+ * Mask of DEBUGCTL bits that are owned by the host, i.e. that need to
+ * match the host's value even while the guest is active.
+ */
+ const u64 HOST_OWNED_DEBUGCTL;
+
void (*update_exception_bitmap)(struct kvm_vcpu *vcpu);
int (*get_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index c85cbce6d2f6..4a6d4460f947 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -915,6 +915,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
.vcpu_load = vt_op(vcpu_load),
.vcpu_put = vt_op(vcpu_put),
+ .HOST_OWNED_DEBUGCTL = DEBUGCTLMSR_FREEZE_IN_SMM,
+
.update_exception_bitmap = vt_op(update_exception_bitmap),
.get_feature_msr = vmx_get_feature_msr,
.get_msr = vt_op(get_msr),
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index ef20184b8b11..c69df3aba8d1 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4861,6 +4861,9 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
WARN_ON(kvm_set_dr(vcpu, 7, vmcs_readl(GUEST_DR7)));
}
+ /* Reload DEBUGCTL to ensure vmcs01 has a fresh FREEZE_IN_SMM value. */
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Note that calling vmx_set_{efer,cr0,cr4} is important as they
* handle a variety of side effects to KVM's software model.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a77d325fe78b..0db1bd383302 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7377,6 +7377,9 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
if (run_flags & KVM_RUN_LOAD_GUEST_DR6)
set_debugreg(vcpu->arch.dr6, 6);
+ if (run_flags & KVM_RUN_LOAD_DEBUGCTL)
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Refresh vmcs.HOST_CR3 if necessary. This must be done immediately
* prior to VM-Enter, as the kernel may load a new ASID (PCID) any time
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index c20a4185d10a..076af78af151 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -419,12 +419,25 @@ bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
static inline void vmx_guest_debugctl_write(struct kvm_vcpu *vcpu, u64 val)
{
+ WARN_ON_ONCE(val & DEBUGCTLMSR_FREEZE_IN_SMM);
+
+ val |= vcpu->arch.host_debugctl & DEBUGCTLMSR_FREEZE_IN_SMM;
vmcs_write64(GUEST_IA32_DEBUGCTL, val);
}
static inline u64 vmx_guest_debugctl_read(void)
{
- return vmcs_read64(GUEST_IA32_DEBUGCTL);
+ return vmcs_read64(GUEST_IA32_DEBUGCTL) & ~DEBUGCTLMSR_FREEZE_IN_SMM;
+}
+
+static inline void vmx_reload_guest_debugctl(struct kvm_vcpu *vcpu)
+{
+ u64 val = vmcs_read64(GUEST_IA32_DEBUGCTL);
+
+ if (!((val ^ vcpu->arch.host_debugctl) & DEBUGCTLMSR_FREEZE_IN_SMM))
+ return;
+
+ vmx_guest_debugctl_write(vcpu, val & ~DEBUGCTLMSR_FREEZE_IN_SMM);
}
/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7fe3549bbd93..179c2c550c8d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10779,7 +10779,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
dm_request_for_irq_injection(vcpu) &&
kvm_cpu_accept_dm_intr(vcpu);
fastpath_t exit_fastpath;
- u64 run_flags;
+ u64 run_flags, debug_ctl;
bool req_immediate_exit = false;
@@ -11051,7 +11051,17 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
set_debugreg(0, 7);
}
- vcpu->arch.host_debugctl = get_debugctlmsr();
+ /*
+ * Refresh the host DEBUGCTL snapshot after disabling IRQs, as DEBUGCTL
+ * can be modified in IRQ context, e.g. via SMP function calls. Inform
+ * vendor code if any host-owned bits were changed, e.g. so that the
+ * value loaded into hardware while running the guest can be updated.
+ */
+ debug_ctl = get_debugctlmsr();
+ if ((debug_ctl ^ vcpu->arch.host_debugctl) & kvm_x86_ops.HOST_OWNED_DEBUGCTL &&
+ !vcpu->arch.guest_state_protected)
+ run_flags |= KVM_RUN_LOAD_DEBUGCTL;
+ vcpu->arch.host_debugctl = debug_ctl;
guest_timing_enter_irqoff();
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081235-galleria-reapply-8a83@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk(a)redhat.com>
Date: Tue, 10 Jun 2025 16:20:10 -0700
Subject: [PATCH] KVM: VMX: Preserve host's DEBUGCTLMSR_FREEZE_IN_SMM while
running the guest
Set/clear DEBUGCTLMSR_FREEZE_IN_SMM in GUEST_IA32_DEBUGCTL based on the
host's pre-VM-Enter value, i.e. preserve the host's FREEZE_IN_SMM setting
while running the guest. When running with the "default treatment of SMIs"
in effect (the only mode KVM supports), SMIs do not generate a VM-Exit that
is visible to host (non-SMM) software, and instead transitions directly
from VMX non-root to SMM. And critically, DEBUGCTL isn't context switched
by hardware on SMI or RSM, i.e. SMM will run with whatever value was
resident in hardware at the time of the SMI.
Failure to preserve FREEZE_IN_SMM results in the PMU unexpectedly counting
events while the CPU is executing in SMM, which can pollute profiling and
potentially leak information into the guest.
Check for changes in FREEZE_IN_SMM prior to every entry into KVM's inner
run loop, as the bit can be toggled in IRQ context via IPI callback (SMP
function call), by way of /sys/devices/cpu/freeze_on_smi.
Add a field in kvm_x86_ops to communicate which DEBUGCTL bits need to be
preserved, as FREEZE_IN_SMM is only supported and defined for Intel CPUs,
i.e. explicitly checking FREEZE_IN_SMM in common x86 is at best weird, and
at worst could lead to undesirable behavior in the future if AMD CPUs ever
happened to pick up a collision with the bit.
Exempt TDX vCPUs, i.e. protected guests, from the check, as the TDX Module
owns and controls GUEST_IA32_DEBUGCTL.
WARN in SVM if KVM_RUN_LOAD_DEBUGCTL is set, mostly to document that the
lack of handling isn't a KVM bug (TDX already WARNs on any run_flag).
Lastly, explicitly reload GUEST_IA32_DEBUGCTL on a VM-Fail that is missed
by KVM but detected by hardware, i.e. in nested_vmx_restore_host_state().
Doing so avoids the need to track host_debugctl on a per-VMCS basis, as
GUEST_IA32_DEBUGCTL is unconditionally written by prepare_vmcs02() and
load_vmcs12_host_state(). For the VM-Fail case, even though KVM won't
have actually entered the guest, vcpu_enter_guest() will have run with
vmcs02 active and thus could result in vmcs01 being run with a stale value.
Cc: stable(a)vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Co-developed-by: Sean Christopherson <seanjc(a)google.com>
Link: https://lore.kernel.org/r/20250610232010.162191-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4f7ee2dbf10e..5e0415a8ee3f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1677,6 +1677,7 @@ static inline u16 kvm_lapic_irq_dest_mode(bool dest_mode_logical)
enum kvm_x86_run_flags {
KVM_RUN_FORCE_IMMEDIATE_EXIT = BIT(0),
KVM_RUN_LOAD_GUEST_DR6 = BIT(1),
+ KVM_RUN_LOAD_DEBUGCTL = BIT(2),
};
struct kvm_x86_ops {
@@ -1707,6 +1708,12 @@ struct kvm_x86_ops {
void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu);
void (*vcpu_put)(struct kvm_vcpu *vcpu);
+ /*
+ * Mask of DEBUGCTL bits that are owned by the host, i.e. that need to
+ * match the host's value even while the guest is active.
+ */
+ const u64 HOST_OWNED_DEBUGCTL;
+
void (*update_exception_bitmap)(struct kvm_vcpu *vcpu);
int (*get_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index c85cbce6d2f6..4a6d4460f947 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -915,6 +915,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
.vcpu_load = vt_op(vcpu_load),
.vcpu_put = vt_op(vcpu_put),
+ .HOST_OWNED_DEBUGCTL = DEBUGCTLMSR_FREEZE_IN_SMM,
+
.update_exception_bitmap = vt_op(update_exception_bitmap),
.get_feature_msr = vmx_get_feature_msr,
.get_msr = vt_op(get_msr),
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index ef20184b8b11..c69df3aba8d1 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4861,6 +4861,9 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
WARN_ON(kvm_set_dr(vcpu, 7, vmcs_readl(GUEST_DR7)));
}
+ /* Reload DEBUGCTL to ensure vmcs01 has a fresh FREEZE_IN_SMM value. */
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Note that calling vmx_set_{efer,cr0,cr4} is important as they
* handle a variety of side effects to KVM's software model.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a77d325fe78b..0db1bd383302 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7377,6 +7377,9 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
if (run_flags & KVM_RUN_LOAD_GUEST_DR6)
set_debugreg(vcpu->arch.dr6, 6);
+ if (run_flags & KVM_RUN_LOAD_DEBUGCTL)
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Refresh vmcs.HOST_CR3 if necessary. This must be done immediately
* prior to VM-Enter, as the kernel may load a new ASID (PCID) any time
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index c20a4185d10a..076af78af151 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -419,12 +419,25 @@ bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
static inline void vmx_guest_debugctl_write(struct kvm_vcpu *vcpu, u64 val)
{
+ WARN_ON_ONCE(val & DEBUGCTLMSR_FREEZE_IN_SMM);
+
+ val |= vcpu->arch.host_debugctl & DEBUGCTLMSR_FREEZE_IN_SMM;
vmcs_write64(GUEST_IA32_DEBUGCTL, val);
}
static inline u64 vmx_guest_debugctl_read(void)
{
- return vmcs_read64(GUEST_IA32_DEBUGCTL);
+ return vmcs_read64(GUEST_IA32_DEBUGCTL) & ~DEBUGCTLMSR_FREEZE_IN_SMM;
+}
+
+static inline void vmx_reload_guest_debugctl(struct kvm_vcpu *vcpu)
+{
+ u64 val = vmcs_read64(GUEST_IA32_DEBUGCTL);
+
+ if (!((val ^ vcpu->arch.host_debugctl) & DEBUGCTLMSR_FREEZE_IN_SMM))
+ return;
+
+ vmx_guest_debugctl_write(vcpu, val & ~DEBUGCTLMSR_FREEZE_IN_SMM);
}
/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7fe3549bbd93..179c2c550c8d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10779,7 +10779,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
dm_request_for_irq_injection(vcpu) &&
kvm_cpu_accept_dm_intr(vcpu);
fastpath_t exit_fastpath;
- u64 run_flags;
+ u64 run_flags, debug_ctl;
bool req_immediate_exit = false;
@@ -11051,7 +11051,17 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
set_debugreg(0, 7);
}
- vcpu->arch.host_debugctl = get_debugctlmsr();
+ /*
+ * Refresh the host DEBUGCTL snapshot after disabling IRQs, as DEBUGCTL
+ * can be modified in IRQ context, e.g. via SMP function calls. Inform
+ * vendor code if any host-owned bits were changed, e.g. so that the
+ * value loaded into hardware while running the guest can be updated.
+ */
+ debug_ctl = get_debugctlmsr();
+ if ((debug_ctl ^ vcpu->arch.host_debugctl) & kvm_x86_ops.HOST_OWNED_DEBUGCTL &&
+ !vcpu->arch.guest_state_protected)
+ run_flags |= KVM_RUN_LOAD_DEBUGCTL;
+ vcpu->arch.host_debugctl = debug_ctl;
guest_timing_enter_irqoff();
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081234-labrador-unturned-2c21@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk(a)redhat.com>
Date: Tue, 10 Jun 2025 16:20:10 -0700
Subject: [PATCH] KVM: VMX: Preserve host's DEBUGCTLMSR_FREEZE_IN_SMM while
running the guest
Set/clear DEBUGCTLMSR_FREEZE_IN_SMM in GUEST_IA32_DEBUGCTL based on the
host's pre-VM-Enter value, i.e. preserve the host's FREEZE_IN_SMM setting
while running the guest. When running with the "default treatment of SMIs"
in effect (the only mode KVM supports), SMIs do not generate a VM-Exit that
is visible to host (non-SMM) software, and instead transitions directly
from VMX non-root to SMM. And critically, DEBUGCTL isn't context switched
by hardware on SMI or RSM, i.e. SMM will run with whatever value was
resident in hardware at the time of the SMI.
Failure to preserve FREEZE_IN_SMM results in the PMU unexpectedly counting
events while the CPU is executing in SMM, which can pollute profiling and
potentially leak information into the guest.
Check for changes in FREEZE_IN_SMM prior to every entry into KVM's inner
run loop, as the bit can be toggled in IRQ context via IPI callback (SMP
function call), by way of /sys/devices/cpu/freeze_on_smi.
Add a field in kvm_x86_ops to communicate which DEBUGCTL bits need to be
preserved, as FREEZE_IN_SMM is only supported and defined for Intel CPUs,
i.e. explicitly checking FREEZE_IN_SMM in common x86 is at best weird, and
at worst could lead to undesirable behavior in the future if AMD CPUs ever
happened to pick up a collision with the bit.
Exempt TDX vCPUs, i.e. protected guests, from the check, as the TDX Module
owns and controls GUEST_IA32_DEBUGCTL.
WARN in SVM if KVM_RUN_LOAD_DEBUGCTL is set, mostly to document that the
lack of handling isn't a KVM bug (TDX already WARNs on any run_flag).
Lastly, explicitly reload GUEST_IA32_DEBUGCTL on a VM-Fail that is missed
by KVM but detected by hardware, i.e. in nested_vmx_restore_host_state().
Doing so avoids the need to track host_debugctl on a per-VMCS basis, as
GUEST_IA32_DEBUGCTL is unconditionally written by prepare_vmcs02() and
load_vmcs12_host_state(). For the VM-Fail case, even though KVM won't
have actually entered the guest, vcpu_enter_guest() will have run with
vmcs02 active and thus could result in vmcs01 being run with a stale value.
Cc: stable(a)vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Co-developed-by: Sean Christopherson <seanjc(a)google.com>
Link: https://lore.kernel.org/r/20250610232010.162191-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4f7ee2dbf10e..5e0415a8ee3f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1677,6 +1677,7 @@ static inline u16 kvm_lapic_irq_dest_mode(bool dest_mode_logical)
enum kvm_x86_run_flags {
KVM_RUN_FORCE_IMMEDIATE_EXIT = BIT(0),
KVM_RUN_LOAD_GUEST_DR6 = BIT(1),
+ KVM_RUN_LOAD_DEBUGCTL = BIT(2),
};
struct kvm_x86_ops {
@@ -1707,6 +1708,12 @@ struct kvm_x86_ops {
void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu);
void (*vcpu_put)(struct kvm_vcpu *vcpu);
+ /*
+ * Mask of DEBUGCTL bits that are owned by the host, i.e. that need to
+ * match the host's value even while the guest is active.
+ */
+ const u64 HOST_OWNED_DEBUGCTL;
+
void (*update_exception_bitmap)(struct kvm_vcpu *vcpu);
int (*get_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index c85cbce6d2f6..4a6d4460f947 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -915,6 +915,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
.vcpu_load = vt_op(vcpu_load),
.vcpu_put = vt_op(vcpu_put),
+ .HOST_OWNED_DEBUGCTL = DEBUGCTLMSR_FREEZE_IN_SMM,
+
.update_exception_bitmap = vt_op(update_exception_bitmap),
.get_feature_msr = vmx_get_feature_msr,
.get_msr = vt_op(get_msr),
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index ef20184b8b11..c69df3aba8d1 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4861,6 +4861,9 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
WARN_ON(kvm_set_dr(vcpu, 7, vmcs_readl(GUEST_DR7)));
}
+ /* Reload DEBUGCTL to ensure vmcs01 has a fresh FREEZE_IN_SMM value. */
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Note that calling vmx_set_{efer,cr0,cr4} is important as they
* handle a variety of side effects to KVM's software model.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a77d325fe78b..0db1bd383302 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7377,6 +7377,9 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
if (run_flags & KVM_RUN_LOAD_GUEST_DR6)
set_debugreg(vcpu->arch.dr6, 6);
+ if (run_flags & KVM_RUN_LOAD_DEBUGCTL)
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Refresh vmcs.HOST_CR3 if necessary. This must be done immediately
* prior to VM-Enter, as the kernel may load a new ASID (PCID) any time
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index c20a4185d10a..076af78af151 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -419,12 +419,25 @@ bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
static inline void vmx_guest_debugctl_write(struct kvm_vcpu *vcpu, u64 val)
{
+ WARN_ON_ONCE(val & DEBUGCTLMSR_FREEZE_IN_SMM);
+
+ val |= vcpu->arch.host_debugctl & DEBUGCTLMSR_FREEZE_IN_SMM;
vmcs_write64(GUEST_IA32_DEBUGCTL, val);
}
static inline u64 vmx_guest_debugctl_read(void)
{
- return vmcs_read64(GUEST_IA32_DEBUGCTL);
+ return vmcs_read64(GUEST_IA32_DEBUGCTL) & ~DEBUGCTLMSR_FREEZE_IN_SMM;
+}
+
+static inline void vmx_reload_guest_debugctl(struct kvm_vcpu *vcpu)
+{
+ u64 val = vmcs_read64(GUEST_IA32_DEBUGCTL);
+
+ if (!((val ^ vcpu->arch.host_debugctl) & DEBUGCTLMSR_FREEZE_IN_SMM))
+ return;
+
+ vmx_guest_debugctl_write(vcpu, val & ~DEBUGCTLMSR_FREEZE_IN_SMM);
}
/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7fe3549bbd93..179c2c550c8d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10779,7 +10779,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
dm_request_for_irq_injection(vcpu) &&
kvm_cpu_accept_dm_intr(vcpu);
fastpath_t exit_fastpath;
- u64 run_flags;
+ u64 run_flags, debug_ctl;
bool req_immediate_exit = false;
@@ -11051,7 +11051,17 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
set_debugreg(0, 7);
}
- vcpu->arch.host_debugctl = get_debugctlmsr();
+ /*
+ * Refresh the host DEBUGCTL snapshot after disabling IRQs, as DEBUGCTL
+ * can be modified in IRQ context, e.g. via SMP function calls. Inform
+ * vendor code if any host-owned bits were changed, e.g. so that the
+ * value loaded into hardware while running the guest can be updated.
+ */
+ debug_ctl = get_debugctlmsr();
+ if ((debug_ctl ^ vcpu->arch.host_debugctl) & kvm_x86_ops.HOST_OWNED_DEBUGCTL &&
+ !vcpu->arch.guest_state_protected)
+ run_flags |= KVM_RUN_LOAD_DEBUGCTL;
+ vcpu->arch.host_debugctl = debug_ctl;
guest_timing_enter_irqoff();
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081233-clang-reusable-6ef9@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk(a)redhat.com>
Date: Tue, 10 Jun 2025 16:20:10 -0700
Subject: [PATCH] KVM: VMX: Preserve host's DEBUGCTLMSR_FREEZE_IN_SMM while
running the guest
Set/clear DEBUGCTLMSR_FREEZE_IN_SMM in GUEST_IA32_DEBUGCTL based on the
host's pre-VM-Enter value, i.e. preserve the host's FREEZE_IN_SMM setting
while running the guest. When running with the "default treatment of SMIs"
in effect (the only mode KVM supports), SMIs do not generate a VM-Exit that
is visible to host (non-SMM) software, and instead transitions directly
from VMX non-root to SMM. And critically, DEBUGCTL isn't context switched
by hardware on SMI or RSM, i.e. SMM will run with whatever value was
resident in hardware at the time of the SMI.
Failure to preserve FREEZE_IN_SMM results in the PMU unexpectedly counting
events while the CPU is executing in SMM, which can pollute profiling and
potentially leak information into the guest.
Check for changes in FREEZE_IN_SMM prior to every entry into KVM's inner
run loop, as the bit can be toggled in IRQ context via IPI callback (SMP
function call), by way of /sys/devices/cpu/freeze_on_smi.
Add a field in kvm_x86_ops to communicate which DEBUGCTL bits need to be
preserved, as FREEZE_IN_SMM is only supported and defined for Intel CPUs,
i.e. explicitly checking FREEZE_IN_SMM in common x86 is at best weird, and
at worst could lead to undesirable behavior in the future if AMD CPUs ever
happened to pick up a collision with the bit.
Exempt TDX vCPUs, i.e. protected guests, from the check, as the TDX Module
owns and controls GUEST_IA32_DEBUGCTL.
WARN in SVM if KVM_RUN_LOAD_DEBUGCTL is set, mostly to document that the
lack of handling isn't a KVM bug (TDX already WARNs on any run_flag).
Lastly, explicitly reload GUEST_IA32_DEBUGCTL on a VM-Fail that is missed
by KVM but detected by hardware, i.e. in nested_vmx_restore_host_state().
Doing so avoids the need to track host_debugctl on a per-VMCS basis, as
GUEST_IA32_DEBUGCTL is unconditionally written by prepare_vmcs02() and
load_vmcs12_host_state(). For the VM-Fail case, even though KVM won't
have actually entered the guest, vcpu_enter_guest() will have run with
vmcs02 active and thus could result in vmcs01 being run with a stale value.
Cc: stable(a)vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Co-developed-by: Sean Christopherson <seanjc(a)google.com>
Link: https://lore.kernel.org/r/20250610232010.162191-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4f7ee2dbf10e..5e0415a8ee3f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1677,6 +1677,7 @@ static inline u16 kvm_lapic_irq_dest_mode(bool dest_mode_logical)
enum kvm_x86_run_flags {
KVM_RUN_FORCE_IMMEDIATE_EXIT = BIT(0),
KVM_RUN_LOAD_GUEST_DR6 = BIT(1),
+ KVM_RUN_LOAD_DEBUGCTL = BIT(2),
};
struct kvm_x86_ops {
@@ -1707,6 +1708,12 @@ struct kvm_x86_ops {
void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu);
void (*vcpu_put)(struct kvm_vcpu *vcpu);
+ /*
+ * Mask of DEBUGCTL bits that are owned by the host, i.e. that need to
+ * match the host's value even while the guest is active.
+ */
+ const u64 HOST_OWNED_DEBUGCTL;
+
void (*update_exception_bitmap)(struct kvm_vcpu *vcpu);
int (*get_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index c85cbce6d2f6..4a6d4460f947 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -915,6 +915,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
.vcpu_load = vt_op(vcpu_load),
.vcpu_put = vt_op(vcpu_put),
+ .HOST_OWNED_DEBUGCTL = DEBUGCTLMSR_FREEZE_IN_SMM,
+
.update_exception_bitmap = vt_op(update_exception_bitmap),
.get_feature_msr = vmx_get_feature_msr,
.get_msr = vt_op(get_msr),
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index ef20184b8b11..c69df3aba8d1 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4861,6 +4861,9 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
WARN_ON(kvm_set_dr(vcpu, 7, vmcs_readl(GUEST_DR7)));
}
+ /* Reload DEBUGCTL to ensure vmcs01 has a fresh FREEZE_IN_SMM value. */
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Note that calling vmx_set_{efer,cr0,cr4} is important as they
* handle a variety of side effects to KVM's software model.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a77d325fe78b..0db1bd383302 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7377,6 +7377,9 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
if (run_flags & KVM_RUN_LOAD_GUEST_DR6)
set_debugreg(vcpu->arch.dr6, 6);
+ if (run_flags & KVM_RUN_LOAD_DEBUGCTL)
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Refresh vmcs.HOST_CR3 if necessary. This must be done immediately
* prior to VM-Enter, as the kernel may load a new ASID (PCID) any time
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index c20a4185d10a..076af78af151 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -419,12 +419,25 @@ bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
static inline void vmx_guest_debugctl_write(struct kvm_vcpu *vcpu, u64 val)
{
+ WARN_ON_ONCE(val & DEBUGCTLMSR_FREEZE_IN_SMM);
+
+ val |= vcpu->arch.host_debugctl & DEBUGCTLMSR_FREEZE_IN_SMM;
vmcs_write64(GUEST_IA32_DEBUGCTL, val);
}
static inline u64 vmx_guest_debugctl_read(void)
{
- return vmcs_read64(GUEST_IA32_DEBUGCTL);
+ return vmcs_read64(GUEST_IA32_DEBUGCTL) & ~DEBUGCTLMSR_FREEZE_IN_SMM;
+}
+
+static inline void vmx_reload_guest_debugctl(struct kvm_vcpu *vcpu)
+{
+ u64 val = vmcs_read64(GUEST_IA32_DEBUGCTL);
+
+ if (!((val ^ vcpu->arch.host_debugctl) & DEBUGCTLMSR_FREEZE_IN_SMM))
+ return;
+
+ vmx_guest_debugctl_write(vcpu, val & ~DEBUGCTLMSR_FREEZE_IN_SMM);
}
/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7fe3549bbd93..179c2c550c8d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10779,7 +10779,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
dm_request_for_irq_injection(vcpu) &&
kvm_cpu_accept_dm_intr(vcpu);
fastpath_t exit_fastpath;
- u64 run_flags;
+ u64 run_flags, debug_ctl;
bool req_immediate_exit = false;
@@ -11051,7 +11051,17 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
set_debugreg(0, 7);
}
- vcpu->arch.host_debugctl = get_debugctlmsr();
+ /*
+ * Refresh the host DEBUGCTL snapshot after disabling IRQs, as DEBUGCTL
+ * can be modified in IRQ context, e.g. via SMP function calls. Inform
+ * vendor code if any host-owned bits were changed, e.g. so that the
+ * value loaded into hardware while running the guest can be updated.
+ */
+ debug_ctl = get_debugctlmsr();
+ if ((debug_ctl ^ vcpu->arch.host_debugctl) & kvm_x86_ops.HOST_OWNED_DEBUGCTL &&
+ !vcpu->arch.guest_state_protected)
+ run_flags |= KVM_RUN_LOAD_DEBUGCTL;
+ vcpu->arch.host_debugctl = debug_ctl;
guest_timing_enter_irqoff();
The patch below does not apply to the 6.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.15.y
git checkout FETCH_HEAD
git cherry-pick -x 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081232-language-safehouse-4f99@gregkh' --subject-prefix 'PATCH 6.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6b1dd26544d045f6a79e8c73572c0c0db3ef3c1a Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk(a)redhat.com>
Date: Tue, 10 Jun 2025 16:20:10 -0700
Subject: [PATCH] KVM: VMX: Preserve host's DEBUGCTLMSR_FREEZE_IN_SMM while
running the guest
Set/clear DEBUGCTLMSR_FREEZE_IN_SMM in GUEST_IA32_DEBUGCTL based on the
host's pre-VM-Enter value, i.e. preserve the host's FREEZE_IN_SMM setting
while running the guest. When running with the "default treatment of SMIs"
in effect (the only mode KVM supports), SMIs do not generate a VM-Exit that
is visible to host (non-SMM) software, and instead transitions directly
from VMX non-root to SMM. And critically, DEBUGCTL isn't context switched
by hardware on SMI or RSM, i.e. SMM will run with whatever value was
resident in hardware at the time of the SMI.
Failure to preserve FREEZE_IN_SMM results in the PMU unexpectedly counting
events while the CPU is executing in SMM, which can pollute profiling and
potentially leak information into the guest.
Check for changes in FREEZE_IN_SMM prior to every entry into KVM's inner
run loop, as the bit can be toggled in IRQ context via IPI callback (SMP
function call), by way of /sys/devices/cpu/freeze_on_smi.
Add a field in kvm_x86_ops to communicate which DEBUGCTL bits need to be
preserved, as FREEZE_IN_SMM is only supported and defined for Intel CPUs,
i.e. explicitly checking FREEZE_IN_SMM in common x86 is at best weird, and
at worst could lead to undesirable behavior in the future if AMD CPUs ever
happened to pick up a collision with the bit.
Exempt TDX vCPUs, i.e. protected guests, from the check, as the TDX Module
owns and controls GUEST_IA32_DEBUGCTL.
WARN in SVM if KVM_RUN_LOAD_DEBUGCTL is set, mostly to document that the
lack of handling isn't a KVM bug (TDX already WARNs on any run_flag).
Lastly, explicitly reload GUEST_IA32_DEBUGCTL on a VM-Fail that is missed
by KVM but detected by hardware, i.e. in nested_vmx_restore_host_state().
Doing so avoids the need to track host_debugctl on a per-VMCS basis, as
GUEST_IA32_DEBUGCTL is unconditionally written by prepare_vmcs02() and
load_vmcs12_host_state(). For the VM-Fail case, even though KVM won't
have actually entered the guest, vcpu_enter_guest() will have run with
vmcs02 active and thus could result in vmcs01 being run with a stale value.
Cc: stable(a)vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Co-developed-by: Sean Christopherson <seanjc(a)google.com>
Link: https://lore.kernel.org/r/20250610232010.162191-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4f7ee2dbf10e..5e0415a8ee3f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1677,6 +1677,7 @@ static inline u16 kvm_lapic_irq_dest_mode(bool dest_mode_logical)
enum kvm_x86_run_flags {
KVM_RUN_FORCE_IMMEDIATE_EXIT = BIT(0),
KVM_RUN_LOAD_GUEST_DR6 = BIT(1),
+ KVM_RUN_LOAD_DEBUGCTL = BIT(2),
};
struct kvm_x86_ops {
@@ -1707,6 +1708,12 @@ struct kvm_x86_ops {
void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu);
void (*vcpu_put)(struct kvm_vcpu *vcpu);
+ /*
+ * Mask of DEBUGCTL bits that are owned by the host, i.e. that need to
+ * match the host's value even while the guest is active.
+ */
+ const u64 HOST_OWNED_DEBUGCTL;
+
void (*update_exception_bitmap)(struct kvm_vcpu *vcpu);
int (*get_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr);
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index c85cbce6d2f6..4a6d4460f947 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -915,6 +915,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
.vcpu_load = vt_op(vcpu_load),
.vcpu_put = vt_op(vcpu_put),
+ .HOST_OWNED_DEBUGCTL = DEBUGCTLMSR_FREEZE_IN_SMM,
+
.update_exception_bitmap = vt_op(update_exception_bitmap),
.get_feature_msr = vmx_get_feature_msr,
.get_msr = vt_op(get_msr),
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index ef20184b8b11..c69df3aba8d1 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4861,6 +4861,9 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
WARN_ON(kvm_set_dr(vcpu, 7, vmcs_readl(GUEST_DR7)));
}
+ /* Reload DEBUGCTL to ensure vmcs01 has a fresh FREEZE_IN_SMM value. */
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Note that calling vmx_set_{efer,cr0,cr4} is important as they
* handle a variety of side effects to KVM's software model.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a77d325fe78b..0db1bd383302 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7377,6 +7377,9 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
if (run_flags & KVM_RUN_LOAD_GUEST_DR6)
set_debugreg(vcpu->arch.dr6, 6);
+ if (run_flags & KVM_RUN_LOAD_DEBUGCTL)
+ vmx_reload_guest_debugctl(vcpu);
+
/*
* Refresh vmcs.HOST_CR3 if necessary. This must be done immediately
* prior to VM-Enter, as the kernel may load a new ASID (PCID) any time
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index c20a4185d10a..076af78af151 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -419,12 +419,25 @@ bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
static inline void vmx_guest_debugctl_write(struct kvm_vcpu *vcpu, u64 val)
{
+ WARN_ON_ONCE(val & DEBUGCTLMSR_FREEZE_IN_SMM);
+
+ val |= vcpu->arch.host_debugctl & DEBUGCTLMSR_FREEZE_IN_SMM;
vmcs_write64(GUEST_IA32_DEBUGCTL, val);
}
static inline u64 vmx_guest_debugctl_read(void)
{
- return vmcs_read64(GUEST_IA32_DEBUGCTL);
+ return vmcs_read64(GUEST_IA32_DEBUGCTL) & ~DEBUGCTLMSR_FREEZE_IN_SMM;
+}
+
+static inline void vmx_reload_guest_debugctl(struct kvm_vcpu *vcpu)
+{
+ u64 val = vmcs_read64(GUEST_IA32_DEBUGCTL);
+
+ if (!((val ^ vcpu->arch.host_debugctl) & DEBUGCTLMSR_FREEZE_IN_SMM))
+ return;
+
+ vmx_guest_debugctl_write(vcpu, val & ~DEBUGCTLMSR_FREEZE_IN_SMM);
}
/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7fe3549bbd93..179c2c550c8d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10779,7 +10779,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
dm_request_for_irq_injection(vcpu) &&
kvm_cpu_accept_dm_intr(vcpu);
fastpath_t exit_fastpath;
- u64 run_flags;
+ u64 run_flags, debug_ctl;
bool req_immediate_exit = false;
@@ -11051,7 +11051,17 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
set_debugreg(0, 7);
}
- vcpu->arch.host_debugctl = get_debugctlmsr();
+ /*
+ * Refresh the host DEBUGCTL snapshot after disabling IRQs, as DEBUGCTL
+ * can be modified in IRQ context, e.g. via SMP function calls. Inform
+ * vendor code if any host-owned bits were changed, e.g. so that the
+ * value loaded into hardware while running the guest can be updated.
+ */
+ debug_ctl = get_debugctlmsr();
+ if ((debug_ctl ^ vcpu->arch.host_debugctl) & kvm_x86_ops.HOST_OWNED_DEBUGCTL &&
+ !vcpu->arch.guest_state_protected)
+ run_flags |= KVM_RUN_LOAD_DEBUGCTL;
+ vcpu->arch.host_debugctl = debug_ctl;
guest_timing_enter_irqoff();
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 095686e6fcb4150f0a55b1a25987fad3d8af58d6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081226-sublease-shoptalk-bc18@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 095686e6fcb4150f0a55b1a25987fad3d8af58d6 Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk(a)redhat.com>
Date: Tue, 10 Jun 2025 16:20:08 -0700
Subject: [PATCH] KVM: nVMX: Check vmcs12->guest_ia32_debugctl on nested
VM-Enter
Add a consistency check for L2's guest_ia32_debugctl, as KVM only supports
a subset of hardware functionality, i.e. KVM can't rely on hardware to
detect illegal/unsupported values. Failure to check the vmcs12 value
would allow the guest to load any harware-supported value while running L2.
Take care to exempt BTF and LBR from the validity check in order to match
KVM's behavior for writes via WRMSR, but without clobbering vmcs12. Even
if VM_EXIT_SAVE_DEBUG_CONTROLS is set in vmcs12, L1 can reasonably expect
that vmcs12->guest_ia32_debugctl will not be modified if writes to the MSR
are being intercepted.
Arguably, KVM _should_ update vmcs12 if VM_EXIT_SAVE_DEBUG_CONTROLS is set
*and* writes to MSR_IA32_DEBUGCTLMSR are not being intercepted by L1, but
that would incur non-trivial complexity and wouldn't change the fact that
KVM's handling of DEBUGCTL is blatantly broken. I.e. the extra complexity
is not worth carrying.
Cc: stable(a)vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Co-developed-by: Sean Christopherson <seanjc(a)google.com>
Link: https://lore.kernel.org/r/20250610232010.162191-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 7211c71d4241..1b8b0642fc2d 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2663,7 +2663,8 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
if (vmx->nested.nested_run_pending &&
(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) {
kvm_set_dr(vcpu, 7, vmcs12->guest_dr7);
- vmcs_write64(GUEST_IA32_DEBUGCTL, vmcs12->guest_ia32_debugctl);
+ vmcs_write64(GUEST_IA32_DEBUGCTL, vmcs12->guest_ia32_debugctl &
+ vmx_get_supported_debugctl(vcpu, false));
} else {
kvm_set_dr(vcpu, 7, vcpu->arch.dr7);
vmcs_write64(GUEST_IA32_DEBUGCTL, vmx->nested.pre_vmenter_debugctl);
@@ -3156,7 +3157,8 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
return -EINVAL;
if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS) &&
- CC(!kvm_dr7_valid(vmcs12->guest_dr7)))
+ (CC(!kvm_dr7_valid(vmcs12->guest_dr7)) ||
+ CC(!vmx_is_valid_debugctl(vcpu, vmcs12->guest_ia32_debugctl, false))))
return -EINVAL;
if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PAT) &&
@@ -4608,6 +4610,12 @@ static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
(vmcs12->vm_entry_controls & ~VM_ENTRY_IA32E_MODE) |
(vm_entry_controls_get(to_vmx(vcpu)) & VM_ENTRY_IA32E_MODE);
+ /*
+ * Note! Save DR7, but intentionally don't grab DEBUGCTL from vmcs02.
+ * Writes to DEBUGCTL that aren't intercepted by L1 are immediately
+ * propagated to vmcs12 (see vmx_set_msr()), as the value loaded into
+ * vmcs02 doesn't strictly track vmcs12.
+ */
if (vmcs12->vm_exit_controls & VM_EXIT_SAVE_DEBUG_CONTROLS)
vmcs12->guest_dr7 = vcpu->arch.dr7;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4f827a75d980..6a8b78e954cd 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2174,7 +2174,7 @@ static u64 nested_vmx_truncate_sysenter_addr(struct kvm_vcpu *vcpu,
return (unsigned long)data;
}
-static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated)
+u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated)
{
u64 debugctl = 0;
@@ -2193,8 +2193,7 @@ static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated
return debugctl;
}
-static bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data,
- bool host_initiated)
+bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
{
u64 invalid;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index b5758c33c60f..392e66c7e5fe 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -414,6 +414,9 @@ static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu);
+u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated);
+bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated);
+
/*
* Note, early Intel manuals have the write-low and read-high bitmap offsets
* the wrong way round. The bitmaps control MSRs 0x00000000-0x00001fff and
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 095686e6fcb4150f0a55b1a25987fad3d8af58d6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081225-unsettled-hardener-28aa@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 095686e6fcb4150f0a55b1a25987fad3d8af58d6 Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk(a)redhat.com>
Date: Tue, 10 Jun 2025 16:20:08 -0700
Subject: [PATCH] KVM: nVMX: Check vmcs12->guest_ia32_debugctl on nested
VM-Enter
Add a consistency check for L2's guest_ia32_debugctl, as KVM only supports
a subset of hardware functionality, i.e. KVM can't rely on hardware to
detect illegal/unsupported values. Failure to check the vmcs12 value
would allow the guest to load any harware-supported value while running L2.
Take care to exempt BTF and LBR from the validity check in order to match
KVM's behavior for writes via WRMSR, but without clobbering vmcs12. Even
if VM_EXIT_SAVE_DEBUG_CONTROLS is set in vmcs12, L1 can reasonably expect
that vmcs12->guest_ia32_debugctl will not be modified if writes to the MSR
are being intercepted.
Arguably, KVM _should_ update vmcs12 if VM_EXIT_SAVE_DEBUG_CONTROLS is set
*and* writes to MSR_IA32_DEBUGCTLMSR are not being intercepted by L1, but
that would incur non-trivial complexity and wouldn't change the fact that
KVM's handling of DEBUGCTL is blatantly broken. I.e. the extra complexity
is not worth carrying.
Cc: stable(a)vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Co-developed-by: Sean Christopherson <seanjc(a)google.com>
Link: https://lore.kernel.org/r/20250610232010.162191-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 7211c71d4241..1b8b0642fc2d 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2663,7 +2663,8 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
if (vmx->nested.nested_run_pending &&
(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) {
kvm_set_dr(vcpu, 7, vmcs12->guest_dr7);
- vmcs_write64(GUEST_IA32_DEBUGCTL, vmcs12->guest_ia32_debugctl);
+ vmcs_write64(GUEST_IA32_DEBUGCTL, vmcs12->guest_ia32_debugctl &
+ vmx_get_supported_debugctl(vcpu, false));
} else {
kvm_set_dr(vcpu, 7, vcpu->arch.dr7);
vmcs_write64(GUEST_IA32_DEBUGCTL, vmx->nested.pre_vmenter_debugctl);
@@ -3156,7 +3157,8 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
return -EINVAL;
if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS) &&
- CC(!kvm_dr7_valid(vmcs12->guest_dr7)))
+ (CC(!kvm_dr7_valid(vmcs12->guest_dr7)) ||
+ CC(!vmx_is_valid_debugctl(vcpu, vmcs12->guest_ia32_debugctl, false))))
return -EINVAL;
if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PAT) &&
@@ -4608,6 +4610,12 @@ static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
(vmcs12->vm_entry_controls & ~VM_ENTRY_IA32E_MODE) |
(vm_entry_controls_get(to_vmx(vcpu)) & VM_ENTRY_IA32E_MODE);
+ /*
+ * Note! Save DR7, but intentionally don't grab DEBUGCTL from vmcs02.
+ * Writes to DEBUGCTL that aren't intercepted by L1 are immediately
+ * propagated to vmcs12 (see vmx_set_msr()), as the value loaded into
+ * vmcs02 doesn't strictly track vmcs12.
+ */
if (vmcs12->vm_exit_controls & VM_EXIT_SAVE_DEBUG_CONTROLS)
vmcs12->guest_dr7 = vcpu->arch.dr7;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4f827a75d980..6a8b78e954cd 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2174,7 +2174,7 @@ static u64 nested_vmx_truncate_sysenter_addr(struct kvm_vcpu *vcpu,
return (unsigned long)data;
}
-static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated)
+u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated)
{
u64 debugctl = 0;
@@ -2193,8 +2193,7 @@ static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated
return debugctl;
}
-static bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data,
- bool host_initiated)
+bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
{
u64 invalid;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index b5758c33c60f..392e66c7e5fe 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -414,6 +414,9 @@ static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu);
+u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated);
+bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated);
+
/*
* Note, early Intel manuals have the write-low and read-high bitmap offsets
* the wrong way round. The bitmaps control MSRs 0x00000000-0x00001fff and
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 095686e6fcb4150f0a55b1a25987fad3d8af58d6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081224-earlobe-tilt-6881@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 095686e6fcb4150f0a55b1a25987fad3d8af58d6 Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <mlevitsk(a)redhat.com>
Date: Tue, 10 Jun 2025 16:20:08 -0700
Subject: [PATCH] KVM: nVMX: Check vmcs12->guest_ia32_debugctl on nested
VM-Enter
Add a consistency check for L2's guest_ia32_debugctl, as KVM only supports
a subset of hardware functionality, i.e. KVM can't rely on hardware to
detect illegal/unsupported values. Failure to check the vmcs12 value
would allow the guest to load any harware-supported value while running L2.
Take care to exempt BTF and LBR from the validity check in order to match
KVM's behavior for writes via WRMSR, but without clobbering vmcs12. Even
if VM_EXIT_SAVE_DEBUG_CONTROLS is set in vmcs12, L1 can reasonably expect
that vmcs12->guest_ia32_debugctl will not be modified if writes to the MSR
are being intercepted.
Arguably, KVM _should_ update vmcs12 if VM_EXIT_SAVE_DEBUG_CONTROLS is set
*and* writes to MSR_IA32_DEBUGCTLMSR are not being intercepted by L1, but
that would incur non-trivial complexity and wouldn't change the fact that
KVM's handling of DEBUGCTL is blatantly broken. I.e. the extra complexity
is not worth carrying.
Cc: stable(a)vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk(a)redhat.com>
Co-developed-by: Sean Christopherson <seanjc(a)google.com>
Link: https://lore.kernel.org/r/20250610232010.162191-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 7211c71d4241..1b8b0642fc2d 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2663,7 +2663,8 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
if (vmx->nested.nested_run_pending &&
(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS)) {
kvm_set_dr(vcpu, 7, vmcs12->guest_dr7);
- vmcs_write64(GUEST_IA32_DEBUGCTL, vmcs12->guest_ia32_debugctl);
+ vmcs_write64(GUEST_IA32_DEBUGCTL, vmcs12->guest_ia32_debugctl &
+ vmx_get_supported_debugctl(vcpu, false));
} else {
kvm_set_dr(vcpu, 7, vcpu->arch.dr7);
vmcs_write64(GUEST_IA32_DEBUGCTL, vmx->nested.pre_vmenter_debugctl);
@@ -3156,7 +3157,8 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
return -EINVAL;
if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS) &&
- CC(!kvm_dr7_valid(vmcs12->guest_dr7)))
+ (CC(!kvm_dr7_valid(vmcs12->guest_dr7)) ||
+ CC(!vmx_is_valid_debugctl(vcpu, vmcs12->guest_ia32_debugctl, false))))
return -EINVAL;
if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PAT) &&
@@ -4608,6 +4610,12 @@ static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
(vmcs12->vm_entry_controls & ~VM_ENTRY_IA32E_MODE) |
(vm_entry_controls_get(to_vmx(vcpu)) & VM_ENTRY_IA32E_MODE);
+ /*
+ * Note! Save DR7, but intentionally don't grab DEBUGCTL from vmcs02.
+ * Writes to DEBUGCTL that aren't intercepted by L1 are immediately
+ * propagated to vmcs12 (see vmx_set_msr()), as the value loaded into
+ * vmcs02 doesn't strictly track vmcs12.
+ */
if (vmcs12->vm_exit_controls & VM_EXIT_SAVE_DEBUG_CONTROLS)
vmcs12->guest_dr7 = vcpu->arch.dr7;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4f827a75d980..6a8b78e954cd 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2174,7 +2174,7 @@ static u64 nested_vmx_truncate_sysenter_addr(struct kvm_vcpu *vcpu,
return (unsigned long)data;
}
-static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated)
+u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated)
{
u64 debugctl = 0;
@@ -2193,8 +2193,7 @@ static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated
return debugctl;
}
-static bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data,
- bool host_initiated)
+bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated)
{
u64 invalid;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index b5758c33c60f..392e66c7e5fe 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -414,6 +414,9 @@ static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu);
+u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated);
+bool vmx_is_valid_debugctl(struct kvm_vcpu *vcpu, u64 data, bool host_initiated);
+
/*
* Note, early Intel manuals have the write-low and read-high bitmap offsets
* the wrong way round. The bitmaps control MSRs 0x00000000-0x00001fff and
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 303084ad12767db64c84ba8fcd0450aec38c8534
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081259-clutter-elusive-1a09@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 303084ad12767db64c84ba8fcd0450aec38c8534 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Mon, 21 Jul 2025 11:19:50 +0100
Subject: [PATCH] KVM: arm64: Filter out HCR_EL2 bits when running in
hypervisor context
Most HCR_EL2 bits are not supposed to affect EL2 at all, but only
the guest. However, we gladly merge these bits with the host's
HCR_EL2 configuration, irrespective of entering L1 or L2.
This leads to some funky behaviour, such as L1 trying to inject
a virtual SError for L2, and getting a taste of its own medecine.
Not quite what the architecture anticipated.
In the end, the only bits that matter are those we have defined as
invariants, either because we've made them RESx (E2H, HCD...), or
that we actively refuse to merge because the mess with KVM's own
logic.
Use the sanitisation infrastructure to get the RES1 bits, and let
things rip in a safer way.
Fixes: 04ab519bb86df ("KVM: arm64: nv: Configure HCR_EL2 for FEAT_NV2")
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20250721101955.535159-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index 477f1580ffea..e482181c6632 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -48,8 +48,7 @@ DEFINE_PER_CPU(unsigned long, kvm_hyp_vector);
static u64 __compute_hcr(struct kvm_vcpu *vcpu)
{
- u64 guest_hcr = __vcpu_sys_reg(vcpu, HCR_EL2);
- u64 hcr = vcpu->arch.hcr_el2;
+ u64 guest_hcr, hcr = vcpu->arch.hcr_el2;
if (!vcpu_has_nv(vcpu))
return hcr;
@@ -68,10 +67,21 @@ static u64 __compute_hcr(struct kvm_vcpu *vcpu)
if (!vcpu_el2_e2h_is_set(vcpu))
hcr |= HCR_NV1;
+ /*
+ * Nothing in HCR_EL2 should impact running in hypervisor
+ * context, apart from bits we have defined as RESx (E2H,
+ * HCD and co), or that cannot be set directly (the EXCLUDE
+ * bits). Given that we OR the guest's view with the host's,
+ * we can use the 0 value as the starting point, and only
+ * use the config-driven RES1 bits.
+ */
+ guest_hcr = kvm_vcpu_apply_reg_masks(vcpu, HCR_EL2, 0);
+
write_sysreg_s(vcpu->arch.ctxt.vncr_array, SYS_VNCR_EL2);
} else {
host_data_clear_flag(VCPU_IN_HYP_CONTEXT);
+ guest_hcr = __vcpu_sys_reg(vcpu, HCR_EL2);
if (guest_hcr & HCR_NV) {
u64 va = __fix_to_virt(vncr_fixmap(smp_processor_id()));
The patch below does not apply to the 6.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.15.y
git checkout FETCH_HEAD
git cherry-pick -x 303084ad12767db64c84ba8fcd0450aec38c8534
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081258-convent-unscrew-2e57@gregkh' --subject-prefix 'PATCH 6.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 303084ad12767db64c84ba8fcd0450aec38c8534 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Mon, 21 Jul 2025 11:19:50 +0100
Subject: [PATCH] KVM: arm64: Filter out HCR_EL2 bits when running in
hypervisor context
Most HCR_EL2 bits are not supposed to affect EL2 at all, but only
the guest. However, we gladly merge these bits with the host's
HCR_EL2 configuration, irrespective of entering L1 or L2.
This leads to some funky behaviour, such as L1 trying to inject
a virtual SError for L2, and getting a taste of its own medecine.
Not quite what the architecture anticipated.
In the end, the only bits that matter are those we have defined as
invariants, either because we've made them RESx (E2H, HCD...), or
that we actively refuse to merge because the mess with KVM's own
logic.
Use the sanitisation infrastructure to get the RES1 bits, and let
things rip in a safer way.
Fixes: 04ab519bb86df ("KVM: arm64: nv: Configure HCR_EL2 for FEAT_NV2")
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20250721101955.535159-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index 477f1580ffea..e482181c6632 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -48,8 +48,7 @@ DEFINE_PER_CPU(unsigned long, kvm_hyp_vector);
static u64 __compute_hcr(struct kvm_vcpu *vcpu)
{
- u64 guest_hcr = __vcpu_sys_reg(vcpu, HCR_EL2);
- u64 hcr = vcpu->arch.hcr_el2;
+ u64 guest_hcr, hcr = vcpu->arch.hcr_el2;
if (!vcpu_has_nv(vcpu))
return hcr;
@@ -68,10 +67,21 @@ static u64 __compute_hcr(struct kvm_vcpu *vcpu)
if (!vcpu_el2_e2h_is_set(vcpu))
hcr |= HCR_NV1;
+ /*
+ * Nothing in HCR_EL2 should impact running in hypervisor
+ * context, apart from bits we have defined as RESx (E2H,
+ * HCD and co), or that cannot be set directly (the EXCLUDE
+ * bits). Given that we OR the guest's view with the host's,
+ * we can use the 0 value as the starting point, and only
+ * use the config-driven RES1 bits.
+ */
+ guest_hcr = kvm_vcpu_apply_reg_masks(vcpu, HCR_EL2, 0);
+
write_sysreg_s(vcpu->arch.ctxt.vncr_array, SYS_VNCR_EL2);
} else {
host_data_clear_flag(VCPU_IN_HYP_CONTEXT);
+ guest_hcr = __vcpu_sys_reg(vcpu, HCR_EL2);
if (guest_hcr & HCR_NV) {
u64 va = __fix_to_virt(vncr_fixmap(smp_processor_id()));
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x c6e35dff58d348c1a9489e9b3b62b3721e62631d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081254-rejoicing-peroxide-4695@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c6e35dff58d348c1a9489e9b3b62b3721e62631d Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Sun, 20 Jul 2025 11:22:29 +0100
Subject: [PATCH] KVM: arm64: Check for SYSREGS_ON_CPU before accessing the CPU
state
Mark Brown reports that since we commit to making exceptions
visible without the vcpu being loaded, the external abort selftest
fails.
Upon investigation, it turns out that the code that makes registers
affected by an exception visible to the guest is completely broken
on VHE, as we don't check whether the system registers are loaded
on the CPU at this point. We managed to get away with this so far,
but that's obviously as bad as it gets,
Add the required checksm and document the absolute need to check
for the SYSREGS_ON_CPU flag before calling into any of the
__vcpu_write_sys_reg_to_cpu()__vcpu_read_sys_reg_from_cpu() helpers.
Reported-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/18535df8-e647-4643-af9a-bb780af03a70@sirena.org.uk
Link: https://lore.kernel.org/r/20250720102229.179114-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e54d29feb469..d373d555a69b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1169,6 +1169,8 @@ static inline bool __vcpu_read_sys_reg_from_cpu(int reg, u64 *val)
* System registers listed in the switch are not saved on every
* exit from the guest but are only saved on vcpu_put.
*
+ * SYSREGS_ON_CPU *MUST* be checked before using this helper.
+ *
* Note that MPIDR_EL1 for the guest is set by KVM via VMPIDR_EL2 but
* should never be listed below, because the guest cannot modify its
* own MPIDR_EL1 and MPIDR_EL1 is accessed for VCPU A from VCPU B's
@@ -1221,6 +1223,8 @@ static inline bool __vcpu_write_sys_reg_to_cpu(u64 val, int reg)
* System registers listed in the switch are not restored on every
* entry to the guest but are only restored on vcpu_load.
*
+ * SYSREGS_ON_CPU *MUST* be checked before using this helper.
+ *
* Note that MPIDR_EL1 for the guest is set by KVM via VMPIDR_EL2 but
* should never be listed below, because the MPIDR should only be set
* once, before running the VCPU, and never changed later.
diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
index 7dafd10e52e8..95d186e0bf54 100644
--- a/arch/arm64/kvm/hyp/exception.c
+++ b/arch/arm64/kvm/hyp/exception.c
@@ -26,7 +26,8 @@ static inline u64 __vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
if (unlikely(vcpu_has_nv(vcpu)))
return vcpu_read_sys_reg(vcpu, reg);
- else if (__vcpu_read_sys_reg_from_cpu(reg, &val))
+ else if (vcpu_get_flag(vcpu, SYSREGS_ON_CPU) &&
+ __vcpu_read_sys_reg_from_cpu(reg, &val))
return val;
return __vcpu_sys_reg(vcpu, reg);
@@ -36,7 +37,8 @@ static inline void __vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
{
if (unlikely(vcpu_has_nv(vcpu)))
vcpu_write_sys_reg(vcpu, val, reg);
- else if (!__vcpu_write_sys_reg_to_cpu(val, reg))
+ else if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU) ||
+ !__vcpu_write_sys_reg_to_cpu(val, reg))
__vcpu_assign_sys_reg(vcpu, reg, val);
}
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x c6e35dff58d348c1a9489e9b3b62b3721e62631d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081253-shrunk-karate-ad70@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c6e35dff58d348c1a9489e9b3b62b3721e62631d Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Sun, 20 Jul 2025 11:22:29 +0100
Subject: [PATCH] KVM: arm64: Check for SYSREGS_ON_CPU before accessing the CPU
state
Mark Brown reports that since we commit to making exceptions
visible without the vcpu being loaded, the external abort selftest
fails.
Upon investigation, it turns out that the code that makes registers
affected by an exception visible to the guest is completely broken
on VHE, as we don't check whether the system registers are loaded
on the CPU at this point. We managed to get away with this so far,
but that's obviously as bad as it gets,
Add the required checksm and document the absolute need to check
for the SYSREGS_ON_CPU flag before calling into any of the
__vcpu_write_sys_reg_to_cpu()__vcpu_read_sys_reg_from_cpu() helpers.
Reported-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/18535df8-e647-4643-af9a-bb780af03a70@sirena.org.uk
Link: https://lore.kernel.org/r/20250720102229.179114-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e54d29feb469..d373d555a69b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1169,6 +1169,8 @@ static inline bool __vcpu_read_sys_reg_from_cpu(int reg, u64 *val)
* System registers listed in the switch are not saved on every
* exit from the guest but are only saved on vcpu_put.
*
+ * SYSREGS_ON_CPU *MUST* be checked before using this helper.
+ *
* Note that MPIDR_EL1 for the guest is set by KVM via VMPIDR_EL2 but
* should never be listed below, because the guest cannot modify its
* own MPIDR_EL1 and MPIDR_EL1 is accessed for VCPU A from VCPU B's
@@ -1221,6 +1223,8 @@ static inline bool __vcpu_write_sys_reg_to_cpu(u64 val, int reg)
* System registers listed in the switch are not restored on every
* entry to the guest but are only restored on vcpu_load.
*
+ * SYSREGS_ON_CPU *MUST* be checked before using this helper.
+ *
* Note that MPIDR_EL1 for the guest is set by KVM via VMPIDR_EL2 but
* should never be listed below, because the MPIDR should only be set
* once, before running the VCPU, and never changed later.
diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
index 7dafd10e52e8..95d186e0bf54 100644
--- a/arch/arm64/kvm/hyp/exception.c
+++ b/arch/arm64/kvm/hyp/exception.c
@@ -26,7 +26,8 @@ static inline u64 __vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
if (unlikely(vcpu_has_nv(vcpu)))
return vcpu_read_sys_reg(vcpu, reg);
- else if (__vcpu_read_sys_reg_from_cpu(reg, &val))
+ else if (vcpu_get_flag(vcpu, SYSREGS_ON_CPU) &&
+ __vcpu_read_sys_reg_from_cpu(reg, &val))
return val;
return __vcpu_sys_reg(vcpu, reg);
@@ -36,7 +37,8 @@ static inline void __vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
{
if (unlikely(vcpu_has_nv(vcpu)))
vcpu_write_sys_reg(vcpu, val, reg);
- else if (!__vcpu_write_sys_reg_to_cpu(val, reg))
+ else if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU) ||
+ !__vcpu_write_sys_reg_to_cpu(val, reg))
__vcpu_assign_sys_reg(vcpu, reg, val);
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 17ec2f965344ee3fd6620bef7ef68792f4ac3af0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081206-eclipse-preview-8baf@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 17ec2f965344ee3fd6620bef7ef68792f4ac3af0 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Tue, 10 Jun 2025 16:20:06 -0700
Subject: [PATCH] KVM: VMX: Allow guest to set DEBUGCTL.RTM_DEBUG if RTM is
supported
Let the guest set DEBUGCTL.RTM_DEBUG if RTM is supported according to the
guest CPUID model, as debug support is supposed to be available if RTM is
supported, and there are no known downsides to letting the guest debug RTM
aborts.
Note, there are no known bug reports related to RTM_DEBUG, the primary
motivation is to reduce the probability of breaking existing guests when a
future change adds a missing consistency check on vmcs12.GUEST_DEBUGCTL
(KVM currently lets L2 run with whatever hardware supports; whoops).
Note #2, KVM already emulates DR6.RTM, and doesn't restrict access to
DR7.RTM.
Fixes: 83c529151ab0 ("KVM: x86: expose Intel cpu new features (HLE, RTM) to guest")
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20250610232010.162191-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index b7dded3c8113..fa878b136eba 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -419,6 +419,7 @@
#define DEBUGCTLMSR_FREEZE_PERFMON_ON_PMI (1UL << 12)
#define DEBUGCTLMSR_FREEZE_IN_SMM_BIT 14
#define DEBUGCTLMSR_FREEZE_IN_SMM (1UL << DEBUGCTLMSR_FREEZE_IN_SMM_BIT)
+#define DEBUGCTLMSR_RTM_DEBUG BIT(15)
#define MSR_PEBS_FRONTEND 0x000003f7
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4ee6cc796855..311f6fa53b67 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2186,6 +2186,10 @@ static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated
(host_initiated || intel_pmu_lbr_is_enabled(vcpu)))
debugctl |= DEBUGCTLMSR_LBR | DEBUGCTLMSR_FREEZE_LBRS_ON_PMI;
+ if (boot_cpu_has(X86_FEATURE_RTM) &&
+ (host_initiated || guest_cpu_cap_has(vcpu, X86_FEATURE_RTM)))
+ debugctl |= DEBUGCTLMSR_RTM_DEBUG;
+
return debugctl;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 17ec2f965344ee3fd6620bef7ef68792f4ac3af0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081205-resurface-filler-e78b@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 17ec2f965344ee3fd6620bef7ef68792f4ac3af0 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Tue, 10 Jun 2025 16:20:06 -0700
Subject: [PATCH] KVM: VMX: Allow guest to set DEBUGCTL.RTM_DEBUG if RTM is
supported
Let the guest set DEBUGCTL.RTM_DEBUG if RTM is supported according to the
guest CPUID model, as debug support is supposed to be available if RTM is
supported, and there are no known downsides to letting the guest debug RTM
aborts.
Note, there are no known bug reports related to RTM_DEBUG, the primary
motivation is to reduce the probability of breaking existing guests when a
future change adds a missing consistency check on vmcs12.GUEST_DEBUGCTL
(KVM currently lets L2 run with whatever hardware supports; whoops).
Note #2, KVM already emulates DR6.RTM, and doesn't restrict access to
DR7.RTM.
Fixes: 83c529151ab0 ("KVM: x86: expose Intel cpu new features (HLE, RTM) to guest")
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20250610232010.162191-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index b7dded3c8113..fa878b136eba 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -419,6 +419,7 @@
#define DEBUGCTLMSR_FREEZE_PERFMON_ON_PMI (1UL << 12)
#define DEBUGCTLMSR_FREEZE_IN_SMM_BIT 14
#define DEBUGCTLMSR_FREEZE_IN_SMM (1UL << DEBUGCTLMSR_FREEZE_IN_SMM_BIT)
+#define DEBUGCTLMSR_RTM_DEBUG BIT(15)
#define MSR_PEBS_FRONTEND 0x000003f7
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4ee6cc796855..311f6fa53b67 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2186,6 +2186,10 @@ static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated
(host_initiated || intel_pmu_lbr_is_enabled(vcpu)))
debugctl |= DEBUGCTLMSR_LBR | DEBUGCTLMSR_FREEZE_LBRS_ON_PMI;
+ if (boot_cpu_has(X86_FEATURE_RTM) &&
+ (host_initiated || guest_cpu_cap_has(vcpu, X86_FEATURE_RTM)))
+ debugctl |= DEBUGCTLMSR_RTM_DEBUG;
+
return debugctl;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 8466d393700f9ccef68134d3349f4e0a087679b9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081206-rejoicing-democrat-be9e@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8466d393700f9ccef68134d3349f4e0a087679b9 Mon Sep 17 00:00:00 2001
From: Ammar Faizi <ammarfaizi2(a)gnuweeb.org>
Date: Wed, 6 Aug 2025 07:31:05 +0700
Subject: [PATCH] net: usbnet: Fix the wrong netif_carrier_on() call
The commit referenced in the Fixes tag causes usbnet to malfunction
(identified via git bisect). Post-commit, my external RJ45 LAN cable
fails to connect. Linus also reported the same issue after pulling that
commit.
The code has a logic error: netif_carrier_on() is only called when the
link is already on. Fix this by moving the netif_carrier_on() call
outside the if-statement entirely. This ensures it is always called
when EVENT_LINK_CARRIER_ON is set and properly clears it regardless
of the link state.
Cc: stable(a)vger.kernel.org
Cc: Armando Budianto <sprite(a)gnuweeb.org>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Suggested-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Link: https://lore.kernel.org/all/CAHk-=wjqL4uF0MG_c8+xHX1Vv8==sPYQrtzbdA3kzi9628…
Closes: https://lore.kernel.org/netdev/CAHk-=wjKh8X4PT_mU1kD4GQrbjivMfPn-_hXa6han_B…
Closes: https://lore.kernel.org/netdev/0752dee6-43d6-4e1f-81d2-4248142cccd2@gnuweeb…
Fixes: 0d9cfc9b8cb1 ("net: usbnet: Avoid potential RCU stall on LINK_CHANGE event")
Signed-off-by: Ammar Faizi <ammarfaizi2(a)gnuweeb.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index a38ffbf4b3f0..511c4154cf74 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1120,6 +1120,9 @@ static void __handle_link_change(struct usbnet *dev)
if (!test_bit(EVENT_DEV_OPEN, &dev->flags))
return;
+ if (test_and_clear_bit(EVENT_LINK_CARRIER_ON, &dev->flags))
+ netif_carrier_on(dev->net);
+
if (!netif_carrier_ok(dev->net)) {
/* kill URBs for reading packets to save bus bandwidth */
unlink_urbs(dev, &dev->rxq);
@@ -1129,9 +1132,6 @@ static void __handle_link_change(struct usbnet *dev)
* tx queue is stopped by netcore after link becomes off
*/
} else {
- if (test_and_clear_bit(EVENT_LINK_CARRIER_ON, &dev->flags))
- netif_carrier_on(dev->net);
-
/* submitting URBs for reading packets */
queue_work(system_bh_wq, &dev->bh_work);
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 8466d393700f9ccef68134d3349f4e0a087679b9
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081204-broken-atlas-ad4f@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8466d393700f9ccef68134d3349f4e0a087679b9 Mon Sep 17 00:00:00 2001
From: Ammar Faizi <ammarfaizi2(a)gnuweeb.org>
Date: Wed, 6 Aug 2025 07:31:05 +0700
Subject: [PATCH] net: usbnet: Fix the wrong netif_carrier_on() call
The commit referenced in the Fixes tag causes usbnet to malfunction
(identified via git bisect). Post-commit, my external RJ45 LAN cable
fails to connect. Linus also reported the same issue after pulling that
commit.
The code has a logic error: netif_carrier_on() is only called when the
link is already on. Fix this by moving the netif_carrier_on() call
outside the if-statement entirely. This ensures it is always called
when EVENT_LINK_CARRIER_ON is set and properly clears it regardless
of the link state.
Cc: stable(a)vger.kernel.org
Cc: Armando Budianto <sprite(a)gnuweeb.org>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Suggested-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Link: https://lore.kernel.org/all/CAHk-=wjqL4uF0MG_c8+xHX1Vv8==sPYQrtzbdA3kzi9628…
Closes: https://lore.kernel.org/netdev/CAHk-=wjKh8X4PT_mU1kD4GQrbjivMfPn-_hXa6han_B…
Closes: https://lore.kernel.org/netdev/0752dee6-43d6-4e1f-81d2-4248142cccd2@gnuweeb…
Fixes: 0d9cfc9b8cb1 ("net: usbnet: Avoid potential RCU stall on LINK_CHANGE event")
Signed-off-by: Ammar Faizi <ammarfaizi2(a)gnuweeb.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index a38ffbf4b3f0..511c4154cf74 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1120,6 +1120,9 @@ static void __handle_link_change(struct usbnet *dev)
if (!test_bit(EVENT_DEV_OPEN, &dev->flags))
return;
+ if (test_and_clear_bit(EVENT_LINK_CARRIER_ON, &dev->flags))
+ netif_carrier_on(dev->net);
+
if (!netif_carrier_ok(dev->net)) {
/* kill URBs for reading packets to save bus bandwidth */
unlink_urbs(dev, &dev->rxq);
@@ -1129,9 +1132,6 @@ static void __handle_link_change(struct usbnet *dev)
* tx queue is stopped by netcore after link becomes off
*/
} else {
- if (test_and_clear_bit(EVENT_LINK_CARRIER_ON, &dev->flags))
- netif_carrier_on(dev->net);
-
/* submitting URBs for reading packets */
queue_work(system_bh_wq, &dev->bh_work);
}
Now that I have a proper email address, add me to the CC list so I don't
have to subscribe to <stable(a)vger.kernel.org> only to get this mail.
Signed-off-by: Achill Gilgenast <achill(a)achill.org>
---
scripts/quilt-mail | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/scripts/quilt-mail b/scripts/quilt-mail
index e0df3d935f..531d9fafb9 100755
--- a/scripts/quilt-mail
+++ b/scripts/quilt-mail
@@ -181,7 +181,8 @@ CC_NAMES=("linux-kernel(a)vger\.kernel\.org"
"rwarsow(a)gmx\.de"
"conor(a)kernel\.org"
"hargar(a)microsoft\.com"
- "broonie(a)kernel\.org")
+ "broonie(a)kernel\.org"
+ "achill(a)achill\.org")
#CC_LIST="stable(a)vger\.kernel\.org"
CC_LIST="patches(a)lists.linux.dev"
--
2.50.1
This reverts commit 10af0273a35ab4513ca1546644b8c853044da134.
While this change was merged, it is not the preferred solution.
During review of a similar change to the gpio-mlxbf2 driver, the
use of "platform_get_irq_optional" was identified as the preferred
solution, so let's use it for gpio-mlxbf3 driver as well.
Cc: stable(a)vger.kernel.org
Fixes: 10af0273a35a ("gpio: mlxbf3: only get IRQ for device instance 0")
Signed-off-by: David Thompson <davthompson(a)nvidia.com>
---
drivers/gpio/gpio-mlxbf3.c | 54 ++++++++++++++------------------------
1 file changed, 19 insertions(+), 35 deletions(-)
diff --git a/drivers/gpio/gpio-mlxbf3.c b/drivers/gpio/gpio-mlxbf3.c
index 9875e34bde72..10ea71273c89 100644
--- a/drivers/gpio/gpio-mlxbf3.c
+++ b/drivers/gpio/gpio-mlxbf3.c
@@ -190,9 +190,7 @@ static int mlxbf3_gpio_probe(struct platform_device *pdev)
struct mlxbf3_gpio_context *gs;
struct gpio_irq_chip *girq;
struct gpio_chip *gc;
- char *colon_ptr;
int ret, irq;
- long num;
gs = devm_kzalloc(dev, sizeof(*gs), GFP_KERNEL);
if (!gs)
@@ -229,39 +227,25 @@ static int mlxbf3_gpio_probe(struct platform_device *pdev)
gc->owner = THIS_MODULE;
gc->add_pin_ranges = mlxbf3_gpio_add_pin_ranges;
- colon_ptr = strchr(dev_name(dev), ':');
- if (!colon_ptr) {
- dev_err(dev, "invalid device name format\n");
- return -EINVAL;
- }
-
- ret = kstrtol(++colon_ptr, 16, &num);
- if (ret) {
- dev_err(dev, "invalid device instance\n");
- return ret;
- }
-
- if (!num) {
- irq = platform_get_irq(pdev, 0);
- if (irq >= 0) {
- girq = &gs->gc.irq;
- gpio_irq_chip_set_chip(girq, &gpio_mlxbf3_irqchip);
- girq->default_type = IRQ_TYPE_NONE;
- /* This will let us handle the parent IRQ in the driver */
- girq->num_parents = 0;
- girq->parents = NULL;
- girq->parent_handler = NULL;
- girq->handler = handle_bad_irq;
-
- /*
- * Directly request the irq here instead of passing
- * a flow-handler because the irq is shared.
- */
- ret = devm_request_irq(dev, irq, mlxbf3_gpio_irq_handler,
- IRQF_SHARED, dev_name(dev), gs);
- if (ret)
- return dev_err_probe(dev, ret, "failed to request IRQ");
- }
+ irq = platform_get_irq(pdev, 0);
+ if (irq >= 0) {
+ girq = &gs->gc.irq;
+ gpio_irq_chip_set_chip(girq, &gpio_mlxbf3_irqchip);
+ girq->default_type = IRQ_TYPE_NONE;
+ /* This will let us handle the parent IRQ in the driver */
+ girq->num_parents = 0;
+ girq->parents = NULL;
+ girq->parent_handler = NULL;
+ girq->handler = handle_bad_irq;
+
+ /*
+ * Directly request the irq here instead of passing
+ * a flow-handler because the irq is shared.
+ */
+ ret = devm_request_irq(dev, irq, mlxbf3_gpio_irq_handler,
+ IRQF_SHARED, dev_name(dev), gs);
+ if (ret)
+ return dev_err_probe(dev, ret, "failed to request IRQ");
}
platform_set_drvdata(pdev, gs);
--
2.43.2
The gpio-mlxbf3 driver interfaces with two GPIO controllers,
device instance 0 and 1. There is a single IRQ resource shared
between the two controllers, and it is found in the ACPI table for
device instance 0. The driver should not use platform_get_irq(),
otherwise this error is logged when probing instance 1:
mlxbf3_gpio MLNXBF33:01: error -ENXIO: IRQ index 0 not found
Cc: stable(a)vger.kernel.org
Fixes: cd33f216d241 ("gpio: mlxbf3: Add gpio driver support")
Signed-off-by: David Thompson <davthompson(a)nvidia.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
---
drivers/gpio/gpio-mlxbf3.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-mlxbf3.c b/drivers/gpio/gpio-mlxbf3.c
index 10ea71273c89..ed29b07d16c1 100644
--- a/drivers/gpio/gpio-mlxbf3.c
+++ b/drivers/gpio/gpio-mlxbf3.c
@@ -227,7 +227,7 @@ static int mlxbf3_gpio_probe(struct platform_device *pdev)
gc->owner = THIS_MODULE;
gc->add_pin_ranges = mlxbf3_gpio_add_pin_ranges;
- irq = platform_get_irq(pdev, 0);
+ irq = platform_get_irq_optional(pdev, 0);
if (irq >= 0) {
girq = &gs->gc.irq;
gpio_irq_chip_set_chip(girq, &gpio_mlxbf3_irqchip);
--
2.43.2
This reverts commit 10af0273a35ab4513ca1546644b8c853044da134.
While this change was merged, it is not the preferred solution.
During review of a similar change to the gpio-mlxbf2 driver, the
use of "platform_get_irq_optional" was identified as the preferred
solution, so let's use it for gpio-mlxbf3 driver as well.
Cc: stable(a)vger.kernel.org
Fixes: 10af0273a35a ("gpio: mlxbf3: only get IRQ for device instance 0")
Signed-off-by: David Thompson <davthompson(a)nvidia.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
---
drivers/gpio/gpio-mlxbf3.c | 54 ++++++++++++++------------------------
1 file changed, 19 insertions(+), 35 deletions(-)
diff --git a/drivers/gpio/gpio-mlxbf3.c b/drivers/gpio/gpio-mlxbf3.c
index 9875e34bde72..10ea71273c89 100644
--- a/drivers/gpio/gpio-mlxbf3.c
+++ b/drivers/gpio/gpio-mlxbf3.c
@@ -190,9 +190,7 @@ static int mlxbf3_gpio_probe(struct platform_device *pdev)
struct mlxbf3_gpio_context *gs;
struct gpio_irq_chip *girq;
struct gpio_chip *gc;
- char *colon_ptr;
int ret, irq;
- long num;
gs = devm_kzalloc(dev, sizeof(*gs), GFP_KERNEL);
if (!gs)
@@ -229,39 +227,25 @@ static int mlxbf3_gpio_probe(struct platform_device *pdev)
gc->owner = THIS_MODULE;
gc->add_pin_ranges = mlxbf3_gpio_add_pin_ranges;
- colon_ptr = strchr(dev_name(dev), ':');
- if (!colon_ptr) {
- dev_err(dev, "invalid device name format\n");
- return -EINVAL;
- }
-
- ret = kstrtol(++colon_ptr, 16, &num);
- if (ret) {
- dev_err(dev, "invalid device instance\n");
- return ret;
- }
-
- if (!num) {
- irq = platform_get_irq(pdev, 0);
- if (irq >= 0) {
- girq = &gs->gc.irq;
- gpio_irq_chip_set_chip(girq, &gpio_mlxbf3_irqchip);
- girq->default_type = IRQ_TYPE_NONE;
- /* This will let us handle the parent IRQ in the driver */
- girq->num_parents = 0;
- girq->parents = NULL;
- girq->parent_handler = NULL;
- girq->handler = handle_bad_irq;
-
- /*
- * Directly request the irq here instead of passing
- * a flow-handler because the irq is shared.
- */
- ret = devm_request_irq(dev, irq, mlxbf3_gpio_irq_handler,
- IRQF_SHARED, dev_name(dev), gs);
- if (ret)
- return dev_err_probe(dev, ret, "failed to request IRQ");
- }
+ irq = platform_get_irq(pdev, 0);
+ if (irq >= 0) {
+ girq = &gs->gc.irq;
+ gpio_irq_chip_set_chip(girq, &gpio_mlxbf3_irqchip);
+ girq->default_type = IRQ_TYPE_NONE;
+ /* This will let us handle the parent IRQ in the driver */
+ girq->num_parents = 0;
+ girq->parents = NULL;
+ girq->parent_handler = NULL;
+ girq->handler = handle_bad_irq;
+
+ /*
+ * Directly request the irq here instead of passing
+ * a flow-handler because the irq is shared.
+ */
+ ret = devm_request_irq(dev, irq, mlxbf3_gpio_irq_handler,
+ IRQF_SHARED, dev_name(dev), gs);
+ if (ret)
+ return dev_err_probe(dev, ret, "failed to request IRQ");
}
platform_set_drvdata(pdev, gs);
--
2.43.2
Hi everyone,
This is v2 of my series to backport the critical security fix,
identified as CVE-2020-12965 ("Transient Execution of Non-Canonical Accesses"),
to the 6.6.y stable kernel tree.
Linus Torvalds's second proposed solution offers a more targeted and
smaller backport for CVE-2020-12965 compared to backporting the entire
patch series.
This alternative would focus solely on the user address masking
logic that addresses the AMD speculation issue with non-canonical
addresses.
Instead of introducing the extensive "runtime-constant"
infrastructure seen in the larger patch series, this solution would:
- Introduce a single new variable for the USER_PTR_MAX
value.
- Use an actual memory load to access this USER_PTR_MAX value, rather than
leveraging the runtime_const mechanism.
While this approach would result in a noticeably smaller and more
localized patch, it would differ from what's currently in the
mainline kernel. This divergence would necessitate significant
additional testing to ensure its stability.
I am ready to implement the second proposed solution if the
maintainers wish to move forward in that direction, understanding the
testing implications. Please let me know your preference.
Changes in v2:
==============
- Incorporated the commit 91309a708: x86: use cmov for user address
as suggested by David Laight. This commit is now included as the first patch
in the series.
This series addresses the CVE-2020-12965 vulnerability by
introducing the necessary x86 infrastructure and the specific fix for user
address masking non-canonical speculation issues.
v1:
==============
This patch series backports a critical security fix, identified as
CVE-2020-12965 ("Transient Execution of Non-Canonical Accesses"), to the
6.6.y stable kernel tree.
David Laight (1):
x86: fix off-by-one in access_ok()
Linus Torvalds (6):
vfs: dcache: move hashlen_hash() from callers into d_hash()
runtime constants: add default dummy infrastructure
runtime constants: add x86 architecture support
arm64: add 'runtime constant' support
x86: fix user address masking non-canonical speculation issue
x86: use cmov for user address masking
arch/arm64/include/asm/runtime-const.h | 92 ++++++++++++++++++++++++++
arch/arm64/kernel/vmlinux.lds.S | 3 +
arch/x86/include/asm/runtime-const.h | 61 +++++++++++++++++
arch/x86/include/asm/uaccess_64.h | 44 +++++++-----
arch/x86/kernel/cpu/common.c | 10 +++
arch/x86/kernel/vmlinux.lds.S | 4 ++
arch/x86/lib/getuser.S | 10 ++-
fs/dcache.c | 17 +++--
include/asm-generic/Kbuild | 1 +
include/asm-generic/runtime-const.h | 15 +++++
include/asm-generic/vmlinux.lds.h | 8 +++
11 files changed, 242 insertions(+), 23 deletions(-)
create mode 100644 arch/arm64/include/asm/runtime-const.h
create mode 100644 arch/x86/include/asm/runtime-const.h
create mode 100644 include/asm-generic/runtime-const.h
--
2.50.1.470.g6ba607880d-goog
Add the missing write_blocked check for updating sysfs related to uncore
efficiency latency control (ELC). If write operation is blocked return
error.
Fixes: bb516dc79c4a ("platform/x86/intel-uncore-freq: Add support for efficiency latency control")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
---
Non urgent patch. It can go through regular merge window even if it has fix tag.
This is not a current production use case.
Rebased on
https://kernel.googlesource.com/pub/scm/linux/kernel/git/pdx86/platform-dri…
for-next
.../x86/intel/uncore-frequency/uncore-frequency-tpmi.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
index 6df55c8e16b7..bfcf92aa4d69 100644
--- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
+++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency-tpmi.c
@@ -192,9 +192,14 @@ static int uncore_read_control_freq(struct uncore_data *data, unsigned int *valu
static int write_eff_lat_ctrl(struct uncore_data *data, unsigned int val, enum uncore_index index)
{
struct tpmi_uncore_cluster_info *cluster_info;
+ struct tpmi_uncore_struct *uncore_root;
u64 control;
cluster_info = container_of(data, struct tpmi_uncore_cluster_info, uncore_data);
+ uncore_root = cluster_info->uncore_root;
+
+ if (uncore_root->write_blocked)
+ return -EPERM;
if (cluster_info->root_domain)
return -ENODATA;
--
2.49.0
Hello,
I have come across the linux kernel
commit 805e3ce5e0e32b31dcecc0774c57c17a1f13cef6 merged into the 6.15 kernel.
Kernel Commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
Currently, we need this fix into the latest 6.8 based kernel as our servers
are all based on Ubuntu 24.04 with 6.8 based kernels. Please let us know
the process for the same.
Could you please help in this regard. Any help on this would be greatly
appreciated.
Thanks,
Srikanth
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 6b445309eec2bc0594f3e24c7777aeef891d386e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081220-superbowl-aerospace-880f@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6b445309eec2bc0594f3e24c7777aeef891d386e Mon Sep 17 00:00:00 2001
From: Paulo Alcantara <pc(a)manguebit.org>
Date: Thu, 31 Jul 2025 20:46:42 -0300
Subject: [PATCH] smb: client: default to nonativesocket under POSIX mounts
SMB3.1.1 POSIX mounts require sockets to be created with NFS reparse
points.
Cc: linux-cifs(a)vger.kernel.org
Cc: Ralph Boehme <slow(a)samba.org>
Cc: David Howells <dhowells(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Reported-by: Matthew Richardson <m.richardson(a)ed.ac.uk>
Closes: https://marc.info/?i=1124e7cd-6a46-40a6-9f44-b7664a66654b@ed.ac.uk
Signed-off-by: Paulo Alcantara (Red Hat) <pc(a)manguebit.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c
index cc8bd79ebca9..072383899e81 100644
--- a/fs/smb/client/fs_context.c
+++ b/fs/smb/client/fs_context.c
@@ -1652,6 +1652,7 @@ static int smb3_fs_context_parse_param(struct fs_context *fc,
pr_warn_once("conflicting posix mount options specified\n");
ctx->linux_ext = 1;
ctx->no_linux_ext = 0;
+ ctx->nonativesocket = 1; /* POSIX mounts use NFS style reparse points */
}
break;
case Opt_nocase:
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 6b445309eec2bc0594f3e24c7777aeef891d386e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081229-blizzard-gout-ee04@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6b445309eec2bc0594f3e24c7777aeef891d386e Mon Sep 17 00:00:00 2001
From: Paulo Alcantara <pc(a)manguebit.org>
Date: Thu, 31 Jul 2025 20:46:42 -0300
Subject: [PATCH] smb: client: default to nonativesocket under POSIX mounts
SMB3.1.1 POSIX mounts require sockets to be created with NFS reparse
points.
Cc: linux-cifs(a)vger.kernel.org
Cc: Ralph Boehme <slow(a)samba.org>
Cc: David Howells <dhowells(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Reported-by: Matthew Richardson <m.richardson(a)ed.ac.uk>
Closes: https://marc.info/?i=1124e7cd-6a46-40a6-9f44-b7664a66654b@ed.ac.uk
Signed-off-by: Paulo Alcantara (Red Hat) <pc(a)manguebit.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c
index cc8bd79ebca9..072383899e81 100644
--- a/fs/smb/client/fs_context.c
+++ b/fs/smb/client/fs_context.c
@@ -1652,6 +1652,7 @@ static int smb3_fs_context_parse_param(struct fs_context *fc,
pr_warn_once("conflicting posix mount options specified\n");
ctx->linux_ext = 1;
ctx->no_linux_ext = 0;
+ ctx->nonativesocket = 1; /* POSIX mounts use NFS style reparse points */
}
break;
case Opt_nocase:
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 6b445309eec2bc0594f3e24c7777aeef891d386e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081229-unbraided-drinking-9839@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6b445309eec2bc0594f3e24c7777aeef891d386e Mon Sep 17 00:00:00 2001
From: Paulo Alcantara <pc(a)manguebit.org>
Date: Thu, 31 Jul 2025 20:46:42 -0300
Subject: [PATCH] smb: client: default to nonativesocket under POSIX mounts
SMB3.1.1 POSIX mounts require sockets to be created with NFS reparse
points.
Cc: linux-cifs(a)vger.kernel.org
Cc: Ralph Boehme <slow(a)samba.org>
Cc: David Howells <dhowells(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Reported-by: Matthew Richardson <m.richardson(a)ed.ac.uk>
Closes: https://marc.info/?i=1124e7cd-6a46-40a6-9f44-b7664a66654b@ed.ac.uk
Signed-off-by: Paulo Alcantara (Red Hat) <pc(a)manguebit.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c
index cc8bd79ebca9..072383899e81 100644
--- a/fs/smb/client/fs_context.c
+++ b/fs/smb/client/fs_context.c
@@ -1652,6 +1652,7 @@ static int smb3_fs_context_parse_param(struct fs_context *fc,
pr_warn_once("conflicting posix mount options specified\n");
ctx->linux_ext = 1;
ctx->no_linux_ext = 0;
+ ctx->nonativesocket = 1; /* POSIX mounts use NFS style reparse points */
}
break;
case Opt_nocase:
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 8e7d178d06e8937454b6d2f2811fa6a15656a214
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081256-unhitched-broadcast-2eef@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8e7d178d06e8937454b6d2f2811fa6a15656a214 Mon Sep 17 00:00:00 2001
From: Thorsten Blum <thorsten.blum(a)linux.dev>
Date: Wed, 6 Aug 2025 03:03:49 +0200
Subject: [PATCH] smb: server: Fix extension string in
ksmbd_extract_shortname()
In ksmbd_extract_shortname(), strscpy() is incorrectly called with the
length of the source string (excluding the NUL terminator) rather than
the size of the destination buffer. This results in "__" being copied
to 'extension' rather than "___" (two underscores instead of three).
Use the destination buffer size instead to ensure that the string "___"
(three underscores) is copied correctly.
Cc: stable(a)vger.kernel.org
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Signed-off-by: Thorsten Blum <thorsten.blum(a)linux.dev>
Acked-by: Namjae Jeon <linkinjeon(a)kernel.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/server/smb_common.c b/fs/smb/server/smb_common.c
index 425c756bcfb8..b23203a1c286 100644
--- a/fs/smb/server/smb_common.c
+++ b/fs/smb/server/smb_common.c
@@ -515,7 +515,7 @@ int ksmbd_extract_shortname(struct ksmbd_conn *conn, const char *longname,
p = strrchr(longname, '.');
if (p == longname) { /*name starts with a dot*/
- strscpy(extension, "___", strlen("___"));
+ strscpy(extension, "___", sizeof(extension));
} else {
if (p) {
p++;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 8e7d178d06e8937454b6d2f2811fa6a15656a214
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081253-lapping-empathic-d0ae@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8e7d178d06e8937454b6d2f2811fa6a15656a214 Mon Sep 17 00:00:00 2001
From: Thorsten Blum <thorsten.blum(a)linux.dev>
Date: Wed, 6 Aug 2025 03:03:49 +0200
Subject: [PATCH] smb: server: Fix extension string in
ksmbd_extract_shortname()
In ksmbd_extract_shortname(), strscpy() is incorrectly called with the
length of the source string (excluding the NUL terminator) rather than
the size of the destination buffer. This results in "__" being copied
to 'extension' rather than "___" (two underscores instead of three).
Use the destination buffer size instead to ensure that the string "___"
(three underscores) is copied correctly.
Cc: stable(a)vger.kernel.org
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Signed-off-by: Thorsten Blum <thorsten.blum(a)linux.dev>
Acked-by: Namjae Jeon <linkinjeon(a)kernel.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/server/smb_common.c b/fs/smb/server/smb_common.c
index 425c756bcfb8..b23203a1c286 100644
--- a/fs/smb/server/smb_common.c
+++ b/fs/smb/server/smb_common.c
@@ -515,7 +515,7 @@ int ksmbd_extract_shortname(struct ksmbd_conn *conn, const char *longname,
p = strrchr(longname, '.');
if (p == longname) { /*name starts with a dot*/
- strscpy(extension, "___", strlen("___"));
+ strscpy(extension, "___", sizeof(extension));
} else {
if (p) {
p++;
commit bcb0fda3c2da9fe4721d3e73d80e778c038e7d27 upstream.
The IOPOLL path posts CQEs when the io_kiocb is marked as completed,
so it cannot rely on the usual retry that non-IOPOLL requests do for
read/write requests.
If -EAGAIN is received and the request should be retried, go through
the normal completion path and let the normal flush logic catch it and
reissue it, like what is done for !IOPOLL reads or writes.
Fixes: d803d123948f ("io_uring/rw: handle -EAGAIN retry at IO completion time")
Reported-by: John Garry <john.g.garry(a)oracle.com>
Link: https://lore.kernel.org/io-uring/2b43ccfa-644d-4a09-8f8f-39ad71810f41@oracl…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Sumanth Gavini <sumanth.gavini(a)yahoo.com>
---
io_uring/rw.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 4ff3442ac2ee..6a84c4a39ce9 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -326,11 +326,10 @@ static void io_complete_rw_iopoll(struct kiocb *kiocb, long res)
if (kiocb->ki_flags & IOCB_WRITE)
io_req_end_write(req);
if (unlikely(res != req->cqe.res)) {
- if (res == -EAGAIN && io_rw_should_reissue(req)) {
+ if (res == -EAGAIN && io_rw_should_reissue(req))
req->flags |= REQ_F_REISSUE | REQ_F_PARTIAL_IO;
- return;
- }
- req->cqe.res = res;
+ else
+ req->cqe.res = res;
}
/* order with io_iopoll_complete() checking ->iopoll_completed */
--
2.43.0
From: Gabor Juhos <j4g8y7(a)gmail.com>
Since commit 3d1f08b032dc ("mtd: spinand: Use the external ECC engine
logic") the spinand_write_page() function ignores the errors returned
by spinand_wait(). Change the code to propagate those up to the stack
as it was done before the offending change.
Cc: stable(a)vger.kernel.org
Fixes: 3d1f08b032dc ("mtd: spinand: Use the external ECC engine logic")
Signed-off-by: Gabor Juhos <j4g8y7(a)gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
drivers/mtd/nand/spi/core.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
index daf6efb87d8..3e21a06dd0f 100644
--- a/drivers/mtd/nand/spi/core.c
+++ b/drivers/mtd/nand/spi/core.c
@@ -691,7 +691,10 @@ int spinand_write_page(struct spinand_device *spinand,
SPINAND_WRITE_INITIAL_DELAY_US,
SPINAND_WRITE_POLL_DELAY_US,
&status);
- if (!ret && (status & STATUS_PROG_FAILED))
+ if (ret)
+ return ret;
+
+ if (status & STATUS_PROG_FAILED)
return -EIO;
return spinand_ondie_ecc_finish_io_req(nand, (struct nand_page_io_req *)req);
--
2.47.2
On Fri, Aug 08, 2025 at 07:27:26PM -0400, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> sched: Do not call __put_task_struct() on rt if pi_blocked_on is set
>
> to the 6.16-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> sched-do-not-call-__put_task_struct-on-rt-if-pi_bloc.patch
> and it can be found in the queue-6.16 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Hi Sasha!
I am the author of that patch and I sent a follow-up patch that fixes a
mistake I made on the original patch:
[RESEND PATCH] sched: restore the behavior of put_task_struct() for non-rt
https://lore.kernel.org/all/aJOwe_ZS5rHXMrsO@uudg.org/
The patch you have does not match what is in the description. I unfortunately
sent the wrong version of the patch on the verge of leaving for a long vacation
and only noticed that when I returned. The code is correct, but does not
match the commit description and is a change that I would like to propose
later as an RFC, not the simpler bugfix originally intended.
I suggest waiting for the follow-up patch mentioned above to include both on
stable kernels.
Sorry for the inconvenience,
Luis
>
>
> commit 7bed29f5ad955444283dca82476dd92c59923f73
> Author: Luis Claudio R. Goncalves <lgoncalv(a)redhat.com>
> Date: Mon Jul 7 11:03:59 2025 -0300
>
> sched: Do not call __put_task_struct() on rt if pi_blocked_on is set
>
> [ Upstream commit 8671bad873ebeb082afcf7b4501395c374da6023 ]
>
> With PREEMPT_RT enabled, some of the calls to put_task_struct() coming
> from rt_mutex_adjust_prio_chain() could happen in preemptible context and
> with a mutex enqueued. That could lead to this sequence:
>
> rt_mutex_adjust_prio_chain()
> put_task_struct()
> __put_task_struct()
> sched_ext_free()
> spin_lock_irqsave()
> rtlock_lock() ---> TRIGGERS
> lockdep_assert(!current->pi_blocked_on);
>
> This is not a SCHED_EXT bug. The first cleanup function called by
> __put_task_struct() is sched_ext_free() and it happens to take a
> (RT) spin_lock, which in the scenario described above, would trigger
> the lockdep assertion of "!current->pi_blocked_on".
>
> Crystal Wood was able to identify the problem as __put_task_struct()
> being called during rt_mutex_adjust_prio_chain(), in the context of
> a process with a mutex enqueued.
>
> Instead of adding more complex conditions to decide when to directly
> call __put_task_struct() and when to defer the call, unconditionally
> resort to the deferred call on PREEMPT_RT to simplify the code.
>
> Fixes: 893cdaaa3977 ("sched: avoid false lockdep splat in put_task_struct()")
> Suggested-by: Crystal Wood <crwood(a)redhat.com>
> Signed-off-by: Luis Claudio R. Goncalves <lgoncalv(a)redhat.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
> Reviewed-by: Wander Lairson Costa <wander(a)redhat.com>
> Reviewed-by: Valentin Schneider <vschneid(a)redhat.com>
> Reviewed-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
> Link: https://lore.kernel.org/r/aGvTz5VaPFyj0pBV@uudg.org
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
> index ca1db4b92c32..58ce71715268 100644
> --- a/include/linux/sched/task.h
> +++ b/include/linux/sched/task.h
> @@ -135,24 +135,17 @@ static inline void put_task_struct(struct task_struct *t)
> return;
>
> /*
> - * In !RT, it is always safe to call __put_task_struct().
> - * Under RT, we can only call it in preemptible context.
> - */
> - if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible()) {
> - static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP);
> -
> - lock_map_acquire_try(&put_task_map);
> - __put_task_struct(t);
> - lock_map_release(&put_task_map);
> - return;
> - }
> -
> - /*
> - * under PREEMPT_RT, we can't call put_task_struct
> + * Under PREEMPT_RT, we can't call __put_task_struct
> * in atomic context because it will indirectly
> - * acquire sleeping locks.
> + * acquire sleeping locks. The same is true if the
> + * current process has a mutex enqueued (blocked on
> + * a PI chain).
> + *
> + * In !RT, it is always safe to call __put_task_struct().
> + * Though, in order to simplify the code, resort to the
> + * deferred call too.
> *
> - * call_rcu() will schedule delayed_put_task_struct_rcu()
> + * call_rcu() will schedule __put_task_struct_rcu_cb()
> * to be called in process context.
> *
> * __put_task_struct() is called when
> @@ -165,7 +158,7 @@ static inline void put_task_struct(struct task_struct *t)
> *
> * delayed_free_task() also uses ->rcu, but it is only called
> * when it fails to fork a process. Therefore, there is no
> - * way it can conflict with put_task_struct().
> + * way it can conflict with __put_task_struct().
> */
> call_rcu(&t->rcu, __put_task_struct_rcu_cb);
> }
>
---end quoted text---
Hi all,
Commit 41532b785e (tls: separate no-async decryption request handling from async) [1] actually covers a UAF read and write bug in the kernel, and should be backported to 6.1. As of now, it has only been backported to 6.6, back from the time when the patch was committed. The commit mentions a non-reproducible UAF that was previously observed, but we managed to hit the vulnerable case.
The vulnerable case is when a user wraps an existing crypto algorithm (such as gcm or ghash) in cryptd. By default, cryptd-wrapped algorithms have a higher priority than the base variant. tls_decrypt_sg allocates the aead request, and triggers the crypto handling with tls_do_decryption. When the crypto is handled by cryptd, it gets dispatched to a worker that handles it and initially returns EINPROGRESS. While older LTS versions (5.4, 5.10, and 5.15) seem to have an additional crypto_wait_req call in those cases, 6.1 just returns success and frees the aead request. The cryptd worker could still be operating in this case, which causes a UAF.
However, this vulnerability only occurs when the CPU is without AVX support (perhaps this is why there were reproducibility difficulties). With AVX, aesni_init calls simd_register_aeads_compat to force the crypto subsystem to use the SIMD version and avoids the async issues raised by cryptd. While I doubt many people are using host systems without AVX these days, this environment is pretty common in VMs when QEMU uses KVM without using the "-cpu host" flag.
The following is a repro, and can be triggered from unprivileged users. Multishot KASAN shows multiple UAF reads and writes, and ends up panicking the system with a null dereference.
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <sched.h>
#include <arpa/inet.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/tcp.h>
#include <linux/tls.h>
#include <sys/resource.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <linux/if_alg.h>
#include <signal.h>
#include <sys/wait.h>
#include <time.h>
struct tls_conn {
int tx;
int rx;
};
void tls_enable(int sk, int type) {
struct tls12_crypto_info_aes_gcm_256 tls_ci = {
.info.version = TLS_1_3_VERSION,
.info.cipher_type = TLS_CIPHER_AES_GCM_256,
};
setsockopt(sk, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
setsockopt(sk, SOL_TLS, type, &tls_ci, sizeof(tls_ci));
}
struct tls_conn *tls_create_conn(int port) {
int s0 = socket(AF_INET, SOCK_STREAM, 0);
int s1 = socket(AF_INET, SOCK_STREAM, 0);
struct sockaddr_in a = {
.sin_family = AF_INET,
.sin_port = htons(port),
.sin_addr = htobe32(0),
};
bind(s0, (struct sockaddr*)&a, sizeof(a));
listen(s0, 1);
connect(s1, (struct sockaddr *)&a, sizeof(a));
int s2 = accept(s0, 0, 0);
close(s0);
tls_enable(s1, TLS_TX);
tls_enable(s2, TLS_RX);
struct tls_conn *t = calloc(1, sizeof(struct tls_conn));
t->tx = s1;
t->rx = s2;
return t;
}
void tls_destroy_conn(struct tls_conn *t) {
close(t->tx);
close(t->rx);
free(t);
}
int tls_send(struct tls_conn *t, char *data, size_t size) {
return sendto(t->tx, data, size, 0, NULL, 0);
}
int tls_recv(struct tls_conn *t, char *data, size_t size) {
return recvfrom(t->rx, data, size, 0, NULL, NULL);
}
int crypto_register_algo(char *type, char *name) {
int s = socket(AF_ALG, SOCK_SEQPACKET, 0);
struct sockaddr_alg sa = {};
sa.salg_family = AF_ALG;
strcpy(sa.salg_type, type);
strcpy(sa.salg_name, name);
bind(s, (struct sockaddr *)&sa, sizeof(sa));
close(s);
return 0;
}
int main(void) {
char buff[0x2000];
crypto_register_algo("aead", "cryptd(gcm(aes))");
struct tls_conn *t = tls_create_conn(20000);
tls_send(t, buff, 0x10);
tls_recv(t, buff, 0x100);
}
Feel free to let us know if you have any questions and if there is anything else we can do to help.
Best,
Will
Savy
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
The hwprobe vDSO data for some keys, like MISALIGNED_VECTOR_PERF,
is determined by an asynchronous kthread. This can create a race
condition where the kthread finishes after the vDSO data has
already been populated, causing userspace to read stale values.
To fix this race, a new 'ready' flag is added to the vDSO data,
initialized to 'false' during arch_initcall_sync. This flag is
checked by both the vDSO's user-space code and the riscv_hwprobe
syscall. The syscall serves as a one-time gate, using a completion
to wait for any pending probes before populating the data and
setting the flag to 'true', thus ensuring userspace reads fresh
values on its first request.
Reported-by: Tsukasa OI <research_trasio(a)irq.a4lg.com>
Closes: https://lore.kernel.org/linux-riscv/760d637b-b13b-4518-b6bf-883d55d44e7f@ir…
Fixes: e7c9d66e313b ("RISC-V: Report vector unaligned access speed hwprobe")
Cc: Palmer Dabbelt <palmer(a)dabbelt.com>
Cc: Alexandre Ghiti <alexghiti(a)rivosinc.com>
Cc: Olof Johansson <olof(a)lixom.net>
Cc: stable(a)vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti(a)rivosinc.com>
Co-developed-by: Palmer Dabbelt <palmer(a)dabbelt.com>
Signed-off-by: Palmer Dabbelt <palmer(a)dabbelt.com>
Signed-off-by: Jingwei Wang <wangjingwei(a)iscas.ac.cn>
---
Changes in v8:
- Based on feedback from Alexandre, reverted the initcall back to
arch_initcall_sync since the DO_ONCE logic makes a late init redundant.
- Added the required Reviewed-by and Signed-off-by tags and fixed minor
formatting nits.
Changes in v7:
- Refined the on-demand synchronization by using the DO_ONCE_SLEEPABLE
macro.
- Fixed a build error for nommu configs and addressed several coding
style issues reported by the CI.
Changes in v6:
- Based on Palmer's feedback, reworked the synchronization to be on-demand,
deferring the wait until the first hwprobe syscall via a 'ready' flag.
This avoids the boot-time regression from v5's approach.
Changes in v5:
- Reworked the synchronization logic to a robust "sentinel-count"
pattern based on feedback from Alexandre.
- Fixed a "multiple definition" linker error for nommu builds by changing
the header-file stub functions to `static inline`, as pointed out by Olof.
- Updated the commit message to better explain the rationale for moving
the vDSO initialization to `late_initcall`.
Changes in v4:
- Reworked the synchronization mechanism based on feedback from Palmer
and Alexandre.
- Instead of a post-hoc refresh, this version introduces a robust
completion-based framework using an atomic counter to ensure async
probes are finished before populating the vDSO.
- Moved the vdso data initialization to a late_initcall to avoid
impacting boot time.
Changes in v3:
- Retained existing blank line.
Changes in v2:
- Addressed feedback from Yixun's regarding #ifdef CONFIG_MMU usage.
- Updated commit message to provide a high-level summary.
- Added Fixes tag for commit e7c9d66e313b.
v1: https://lore.kernel.org/linux-riscv/20250521052754.185231-1-wangjingwei@isc…
arch/riscv/include/asm/hwprobe.h | 7 +++
arch/riscv/include/asm/vdso/arch_data.h | 6 ++
arch/riscv/kernel/sys_hwprobe.c | 70 ++++++++++++++++++----
arch/riscv/kernel/unaligned_access_speed.c | 9 ++-
arch/riscv/kernel/vdso/hwprobe.c | 2 +-
5 files changed, 79 insertions(+), 15 deletions(-)
diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h
index 7fe0a379474ae2c6..5fe10724d307dc99 100644
--- a/arch/riscv/include/asm/hwprobe.h
+++ b/arch/riscv/include/asm/hwprobe.h
@@ -41,4 +41,11 @@ static inline bool riscv_hwprobe_pair_cmp(struct riscv_hwprobe *pair,
return pair->value == other_pair->value;
}
+#ifdef CONFIG_MMU
+void riscv_hwprobe_register_async_probe(void);
+void riscv_hwprobe_complete_async_probe(void);
+#else
+static inline void riscv_hwprobe_register_async_probe(void) {}
+static inline void riscv_hwprobe_complete_async_probe(void) {}
+#endif
#endif
diff --git a/arch/riscv/include/asm/vdso/arch_data.h b/arch/riscv/include/asm/vdso/arch_data.h
index da57a3786f7a53c8..88b37af55175129b 100644
--- a/arch/riscv/include/asm/vdso/arch_data.h
+++ b/arch/riscv/include/asm/vdso/arch_data.h
@@ -12,6 +12,12 @@ struct vdso_arch_data {
/* Boolean indicating all CPUs have the same static hwprobe values. */
__u8 homogeneous_cpus;
+
+ /*
+ * A gate to check and see if the hwprobe data is actually ready, as
+ * probing is deferred to avoid boot slowdowns.
+ */
+ __u8 ready;
};
#endif /* __RISCV_ASM_VDSO_ARCH_DATA_H */
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index 0b170e18a2beba57..44201160b9c43177 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -5,6 +5,9 @@
* more details.
*/
#include <linux/syscalls.h>
+#include <linux/completion.h>
+#include <linux/atomic.h>
+#include <linux/once.h>
#include <asm/cacheflush.h>
#include <asm/cpufeature.h>
#include <asm/hwprobe.h>
@@ -452,28 +455,32 @@ static int hwprobe_get_cpus(struct riscv_hwprobe __user *pairs,
return 0;
}
-static int do_riscv_hwprobe(struct riscv_hwprobe __user *pairs,
- size_t pair_count, size_t cpusetsize,
- unsigned long __user *cpus_user,
- unsigned int flags)
-{
- if (flags & RISCV_HWPROBE_WHICH_CPUS)
- return hwprobe_get_cpus(pairs, pair_count, cpusetsize,
- cpus_user, flags);
+#ifdef CONFIG_MMU
- return hwprobe_get_values(pairs, pair_count, cpusetsize,
- cpus_user, flags);
+static DECLARE_COMPLETION(boot_probes_done);
+static atomic_t pending_boot_probes = ATOMIC_INIT(1);
+
+void riscv_hwprobe_register_async_probe(void)
+{
+ atomic_inc(&pending_boot_probes);
}
-#ifdef CONFIG_MMU
+void riscv_hwprobe_complete_async_probe(void)
+{
+ if (atomic_dec_and_test(&pending_boot_probes))
+ complete(&boot_probes_done);
+}
-static int __init init_hwprobe_vdso_data(void)
+static int complete_hwprobe_vdso_data(void)
{
struct vdso_arch_data *avd = vdso_k_arch_data;
u64 id_bitsmash = 0;
struct riscv_hwprobe pair;
int key;
+ if (unlikely(!atomic_dec_and_test(&pending_boot_probes)))
+ wait_for_completion(&boot_probes_done);
+
/*
* Initialize vDSO data with the answers for the "all CPUs" case, to
* save a syscall in the common case.
@@ -501,13 +508,52 @@ static int __init init_hwprobe_vdso_data(void)
* vDSO should defer to the kernel for exotic cpu masks.
*/
avd->homogeneous_cpus = id_bitsmash != 0 && id_bitsmash != -1;
+
+ /*
+ * Make sure all the VDSO values are visible before we look at them.
+ * This pairs with the implicit "no speculativly visible accesses"
+ * barrier in the VDSO hwprobe code.
+ */
+ smp_wmb();
+ avd->ready = true;
+ return 0;
+}
+
+static int __init init_hwprobe_vdso_data(void)
+{
+ struct vdso_arch_data *avd = vdso_k_arch_data;
+
+ /*
+ * Prevent the vDSO cached values from being used, as they're not ready
+ * yet.
+ */
+ avd->ready = false;
return 0;
}
arch_initcall_sync(init_hwprobe_vdso_data);
+#else
+
+static int complete_hwprobe_vdso_data(void) { return 0; }
+
#endif /* CONFIG_MMU */
+static int do_riscv_hwprobe(struct riscv_hwprobe __user *pairs,
+ size_t pair_count, size_t cpusetsize,
+ unsigned long __user *cpus_user,
+ unsigned int flags)
+{
+ DO_ONCE_SLEEPABLE(complete_hwprobe_vdso_data);
+
+ if (flags & RISCV_HWPROBE_WHICH_CPUS)
+ return hwprobe_get_cpus(pairs, pair_count, cpusetsize,
+ cpus_user, flags);
+
+ return hwprobe_get_values(pairs, pair_count, cpusetsize,
+ cpus_user, flags);
+}
+
SYSCALL_DEFINE5(riscv_hwprobe, struct riscv_hwprobe __user *, pairs,
size_t, pair_count, size_t, cpusetsize, unsigned long __user *,
cpus, unsigned int, flags)
diff --git a/arch/riscv/kernel/unaligned_access_speed.c b/arch/riscv/kernel/unaligned_access_speed.c
index ae2068425fbcd207..aa912c62fb70ba0e 100644
--- a/arch/riscv/kernel/unaligned_access_speed.c
+++ b/arch/riscv/kernel/unaligned_access_speed.c
@@ -379,6 +379,7 @@ static void check_vector_unaligned_access(struct work_struct *work __always_unus
static int __init vec_check_unaligned_access_speed_all_cpus(void *unused __always_unused)
{
schedule_on_each_cpu(check_vector_unaligned_access);
+ riscv_hwprobe_complete_async_probe();
return 0;
}
@@ -473,8 +474,12 @@ static int __init check_unaligned_access_all_cpus(void)
per_cpu(vector_misaligned_access, cpu) = unaligned_vector_speed_param;
} else if (!check_vector_unaligned_access_emulated_all_cpus() &&
IS_ENABLED(CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS)) {
- kthread_run(vec_check_unaligned_access_speed_all_cpus,
- NULL, "vec_check_unaligned_access_speed_all_cpus");
+ riscv_hwprobe_register_async_probe();
+ if (IS_ERR(kthread_run(vec_check_unaligned_access_speed_all_cpus,
+ NULL, "vec_check_unaligned_access_speed_all_cpus"))) {
+ pr_warn("Failed to create vec_unalign_check kthread\n");
+ riscv_hwprobe_complete_async_probe();
+ }
}
/*
diff --git a/arch/riscv/kernel/vdso/hwprobe.c b/arch/riscv/kernel/vdso/hwprobe.c
index 2ddeba6c68dda09b..bf77b4c1d2d8e803 100644
--- a/arch/riscv/kernel/vdso/hwprobe.c
+++ b/arch/riscv/kernel/vdso/hwprobe.c
@@ -27,7 +27,7 @@ static int riscv_vdso_get_values(struct riscv_hwprobe *pairs, size_t pair_count,
* homogeneous, then this function can handle requests for arbitrary
* masks.
*/
- if ((flags != 0) || (!all_cpus && !avd->homogeneous_cpus))
+ if ((flags != 0) || (!all_cpus && !avd->homogeneous_cpus) || unlikely(!avd->ready))
return riscv_hwprobe(pairs, pair_count, cpusetsize, cpus, flags);
/* This is something we can handle, fill out the pairs. */
--
2.50.1
From: Guchun Chen <guchun.chen(a)amd.com>
[ Upstream commit 248b061689a40f4fed05252ee2c89f87cf26d7d8 ]
In current code, when a PCI error state pci_channel_io_normal is detectd,
it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI
driver will continue the execution of PCI resume callback report_resume by
pci_walk_bridge, and the callback will go into amdgpu_pci_resume
finally, where write lock is releasd unconditionally without acquiring
such lock first. In this case, a deadlock will happen when other threads
start to acquire the read lock.
To fix this, add a member in amdgpu_device strucutre to cache
pci_channel_state, and only continue the execution in amdgpu_pci_resume
when it's pci_channel_io_frozen.
Fixes: c9a6b82f45e2 ("drm/amdgpu: Implement DPC recovery")
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
Signed-off-by: Guchun Chen <guchun.chen(a)amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
[Shivani: Modified to apply on 5.10.y]
Signed-off-by: Shivani Agarwal <shivani.agarwal(a)broadcom.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 ++++++
2 files changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index ff5555353eb4..683bbefc39c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -997,6 +997,7 @@ struct amdgpu_device {
bool in_pci_err_recovery;
struct pci_saved_state *pci_state;
+ pci_channel_state_t pci_channel_state;
};
static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 40d2f0ed1c75..8efd3ee2621f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4944,6 +4944,8 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
return PCI_ERS_RESULT_DISCONNECT;
}
+ adev->pci_channel_state = state;
+
switch (state) {
case pci_channel_io_normal:
return PCI_ERS_RESULT_CAN_RECOVER;
@@ -5079,6 +5081,10 @@ void amdgpu_pci_resume(struct pci_dev *pdev)
DRM_INFO("PCI error: resume callback!!\n");
+ /* Only continue execution for the case of pci_channel_io_frozen */
+ if (adev->pci_channel_state != pci_channel_io_frozen)
+ return;
+
for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
struct amdgpu_ring *ring = adev->rings[i];
--
2.40.4
Hi,
The first four patches in this series are miscellaneous fixes and
improvements in the Cadence and TI CSI-RX drivers around probing, fwnode
and link creation.
The last two patches add support for transmitting multiple pixels per
clock on the internal bus between Cadence CSI-RX bridge and TI CSI-RX
wrapper. As this internal bus is 32-bit wide, the maximum number of
pixels that can be transmitted per cycle depend upon the format's bit
width. Secondly, the downstream element must support unpacking of
multiple pixels.
Thus we export a module function that can be used by the downstream
driver to negotiate the pixels per cycle on the output pixel stream of
the Cadence bridge.
Signed-off-by: Jai Luthra <jai.luthra(a)ideasonboard.com>
---
Changes in v3:
- Move cdns-csi2rx header to include/media
- Export symbol from cdns-csi2rx.c to be used only through
the j721e-csi2rx.c module namespace
- Other minor fixes suggested by Sakari
- Add Abhilash's T-by tags
- Link to v2: https://lore.kernel.org/r/20250410-probe_fixes-v2-0-801bc6eebdea@ideasonboa…
Changes in v2:
- Rebase on v6.15-rc1
- Fix lkp warnings in PATCH 5/6 missing header for FIELD_PREP
- Add R-By tags from Devarsh and Changhuang
- Link to v1: https://lore.kernel.org/r/20250324-probe_fixes-v1-0-5cd5b9e1cfac@ideasonboa…
---
Jai Luthra (6):
media: ti: j721e-csi2rx: Use devm_of_platform_populate
media: ti: j721e-csi2rx: Use fwnode_get_named_child_node
media: ti: j721e-csi2rx: Fix source subdev link creation
media: cadence: csi2rx: Implement get_fwnode_pad op
media: cadence: cdns-csi2rx: Support multiple pixels per clock cycle
media: ti: j721e-csi2rx: Support multiple pixels per clock
MAINTAINERS | 1 +
drivers/media/platform/cadence/cdns-csi2rx.c | 74 ++++++++++++++++------
drivers/media/platform/ti/Kconfig | 3 +-
.../media/platform/ti/j721e-csi2rx/j721e-csi2rx.c | 65 ++++++++++++++-----
include/media/cadence/cdns-csi2rx.h | 19 ++++++
5 files changed, 127 insertions(+), 35 deletions(-)
---
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
change-id: 20250314-probe_fixes-7e0ec33c7fee
Best regards,
--
Jai Luthra <jai.luthra(a)ideasonboard.com>
From: James Smart <jsmart2021(a)gmail.com>
[ Upstream commit 1854f53ccd88ad4e7568ddfafafffe71f1ceb0a6 ]
If an FC link down transition while PLOGIs are outstanding to fabric well
known addresses, outstanding ABTS requests may result in a NULL pointer
dereference. Driver unload requests may hang with repeated "2878" log
messages.
The Link down processing results in ABTS requests for outstanding ELS
requests. The Abort WQEs are sent for the ELSs before the driver had set
the link state to down. Thus the driver is sending the Abort with the
expectation that an ABTS will be sent on the wire. The Abort request is
stalled waiting for the link to come up. In some conditions the driver may
auto-complete the ELSs thus if the link does come up, the Abort completions
may reference an invalid structure.
Fix by ensuring that Abort set the flag to avoid link traffic if issued due
to conditions where the link failed.
Link: https://lore.kernel.org/r/20211020211417.88754-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee(a)broadcom.com>
Signed-off-by: Justin Tee <justin.tee(a)broadcom.com>
Signed-off-by: James Smart <jsmart2021(a)gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
[Shivani: Modified to apply on 5.10.y]
Signed-off-by: Shivani Agarwal <shivani.agarwal(a)broadcom.com>
---
drivers/scsi/lpfc/lpfc_sli.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index ff39c596f000..49931577da38 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -11432,10 +11432,12 @@ lpfc_sli_abort_iotag_issue(struct lpfc_hba *phba, struct lpfc_sli_ring *pring,
if (cmdiocb->iocb_flag & LPFC_IO_FOF)
abtsiocbp->iocb_flag |= LPFC_IO_FOF;
- if (phba->link_state >= LPFC_LINK_UP)
- iabt->ulpCommand = CMD_ABORT_XRI_CN;
- else
+ if (phba->link_state < LPFC_LINK_UP ||
+ (phba->sli_rev == LPFC_SLI_REV4 &&
+ phba->sli4_hba.link_state.status == LPFC_FC_LA_TYPE_LINK_DOWN))
iabt->ulpCommand = CMD_CLOSE_XRI_CN;
+ else
+ iabt->ulpCommand = CMD_ABORT_XRI_CN;
abtsiocbp->iocb_cmpl = lpfc_sli_abort_els_cmpl;
abtsiocbp->vport = vport;
--
2.40.4
The quilt patch titled
Subject: proc: proc_maps_open allow proc_mem_open to return NULL
has been removed from the -mm tree. Its filename was
proc-proc_maps_open-allow-proc_mem_open-to-return-null.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Jialin Wang <wjl.linux(a)gmail.com>
Subject: proc: proc_maps_open allow proc_mem_open to return NULL
Date: Fri, 8 Aug 2025 00:54:55 +0800
The commit 65c66047259f ("proc: fix the issue of proc_mem_open returning
NULL") caused proc_maps_open() to return -ESRCH when proc_mem_open()
returns NULL. This breaks legitimate /proc/<pid>/maps access for kernel
threads since kernel threads have NULL mm_struct.
The regression causes perf to fail and exit when profiling a kernel
thread:
# perf record -v -g -p $(pgrep kswapd0)
...
couldn't open /proc/65/task/65/maps
This patch partially reverts the commit to fix it.
Link: https://lkml.kernel.org/r/20250807165455.73656-1-wjl.linux@gmail.com
Fixes: 65c66047259f ("proc: fix the issue of proc_mem_open returning NULL")
Signed-off-by: Jialin Wang <wjl.linux(a)gmail.com>
Cc: Penglei Jiang <superman.xpt(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/task_mmu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/proc/task_mmu.c~proc-proc_maps_open-allow-proc_mem_open-to-return-null
+++ a/fs/proc/task_mmu.c
@@ -340,8 +340,8 @@ static int proc_maps_open(struct inode *
priv->inode = inode;
priv->mm = proc_mem_open(inode, PTRACE_MODE_READ);
- if (IS_ERR_OR_NULL(priv->mm)) {
- int err = priv->mm ? PTR_ERR(priv->mm) : -ESRCH;
+ if (IS_ERR(priv->mm)) {
+ int err = PTR_ERR(priv->mm);
seq_release_private(inode, file);
return err;
_
Patches currently in -mm which might be from wjl.linux(a)gmail.com are
The quilt patch titled
Subject: userfaultfd: fix a crash in UFFDIO_MOVE when PMD is a migration entry
has been removed from the -mm tree. Its filename was
userfaultfd-fix-a-crash-in-uffdio_move-when-pmd-is-a-migration-entry.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Suren Baghdasaryan <surenb(a)google.com>
Subject: userfaultfd: fix a crash in UFFDIO_MOVE when PMD is a migration entry
Date: Wed, 6 Aug 2025 15:00:22 -0700
When UFFDIO_MOVE encounters a migration PMD entry, it proceeds with
obtaining a folio and accessing it even though the entry is swp_entry_t.
Add the missing check and let split_huge_pmd() handle migration entries.
While at it also remove unnecessary folio check.
[surenb(a)google.com: remove extra folio check, per David]
Link: https://lkml.kernel.org/r/20250807200418.1963585-1-surenb@google.com
Link: https://lkml.kernel.org/r/20250806220022.926763-1-surenb@google.com
Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Suren Baghdasaryan <surenb(a)google.com>
Reported-by: syzbot+b446dbe27035ef6bd6c2(a)syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68794b5c.a70a0220.693ce.0050.GAE@google.com/
Reviewed-by: Peter Xu <peterx(a)redhat.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Lokesh Gidra <lokeshgidra(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/userfaultfd.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
--- a/mm/userfaultfd.c~userfaultfd-fix-a-crash-in-uffdio_move-when-pmd-is-a-migration-entry
+++ a/mm/userfaultfd.c
@@ -1821,13 +1821,16 @@ ssize_t move_pages(struct userfaultfd_ct
/* Check if we can move the pmd without splitting it. */
if (move_splits_huge_pmd(dst_addr, src_addr, src_start + len) ||
!pmd_none(dst_pmdval)) {
- struct folio *folio = pmd_folio(*src_pmd);
+ /* Can be a migration entry */
+ if (pmd_present(*src_pmd)) {
+ struct folio *folio = pmd_folio(*src_pmd);
- if (!folio || (!is_huge_zero_folio(folio) &&
- !PageAnonExclusive(&folio->page))) {
- spin_unlock(ptl);
- err = -EBUSY;
- break;
+ if (!is_huge_zero_folio(folio) &&
+ !PageAnonExclusive(&folio->page)) {
+ spin_unlock(ptl);
+ err = -EBUSY;
+ break;
+ }
}
spin_unlock(ptl);
_
Patches currently in -mm which might be from surenb(a)google.com are
mm-limit-the-scope-of-vma_start_read.patch
mm-change-vma_start_read-to-drop-rcu-lock-on-failure.patch
selftests-proc-test-procmap_query-ioctl-while-vma-is-concurrently-modified.patch
fs-proc-task_mmu-factor-out-proc_maps_private-fields-used-by-procmap_query.patch
fs-proc-task_mmu-execute-procmap_query-ioctl-under-per-vma-locks.patch
commit 5af1f84ed13a416297ab9ced7537f4d5ae7f329a upstream.
Connections may be cleanup while waiting for the commands to complete so
this attempts to check if the connection handle remains valid in case of
errors that would lead to call hci_conn_failed:
BUG: KASAN: slab-use-after-free in hci_conn_failed+0x1f/0x160
Read of size 8 at addr ffff888001376958 by task kworker/u3:0/52
CPU: 0 PID: 52 Comm: kworker/u3:0 Not tainted
6.5.0-rc1-00527-g2dfe76d58d3a #5615
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
1.16.2-1.fc38 04/01/2014
Workqueue: hci0 hci_cmd_sync_work
Call Trace:
<TASK>
dump_stack_lvl+0x1d/0x70
print_report+0xce/0x620
? __virt_addr_valid+0xd4/0x150
? hci_conn_failed+0x1f/0x160
kasan_report+0xd1/0x100
? hci_conn_failed+0x1f/0x160
hci_conn_failed+0x1f/0x160
hci_abort_conn_sync+0x237/0x360
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
Signed-off-by: Sumanth Gavini <sumanth.gavini(a)yahoo.com>
---
net/bluetooth/hci_sync.c | 43 +++++++++++++++++++++++++++-------------
1 file changed, 29 insertions(+), 14 deletions(-)
diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c
index 3f905ee4338f..acff47da799a 100644
--- a/net/bluetooth/hci_sync.c
+++ b/net/bluetooth/hci_sync.c
@@ -5525,31 +5525,46 @@ static int hci_reject_conn_sync(struct hci_dev *hdev, struct hci_conn *conn,
int hci_abort_conn_sync(struct hci_dev *hdev, struct hci_conn *conn, u8 reason)
{
- int err;
+ int err = 0;
+ u16 handle = conn->handle;
switch (conn->state) {
case BT_CONNECTED:
case BT_CONFIG:
- return hci_disconnect_sync(hdev, conn, reason);
+ err = hci_disconnect_sync(hdev, conn, reason);
+ break;
case BT_CONNECT:
err = hci_connect_cancel_sync(hdev, conn);
- /* Cleanup hci_conn object if it cannot be cancelled as it
- * likelly means the controller and host stack are out of sync.
- */
- if (err) {
- hci_dev_lock(hdev);
- hci_conn_failed(conn, err);
- hci_dev_unlock(hdev);
- }
- return err;
+ break;
case BT_CONNECT2:
- return hci_reject_conn_sync(hdev, conn, reason);
+ err = hci_reject_conn_sync(hdev, conn, reason);
+ break;
default:
conn->state = BT_CLOSED;
- break;
+ return 0;
}
- return 0;
+ /* Cleanup hci_conn object if it cannot be cancelled as it
+ * likelly means the controller and host stack are out of sync
+ * or in case of LE it was still scanning so it can be cleanup
+ * safely.
+ */
+ if (err) {
+ struct hci_conn *c;
+
+ /* Check if the connection hasn't been cleanup while waiting
+ * commands to complete.
+ */
+ c = hci_conn_hash_lookup_handle(hdev, handle);
+ if (!c || c != conn)
+ return 0;
+
+ hci_dev_lock(hdev);
+ hci_conn_failed(conn, err);
+ hci_dev_unlock(hdev);
+ }
+
+ return err;
}
static int hci_disconnect_all_sync(struct hci_dev *hdev, u8 reason)
--
2.43.0
The patch titled
Subject: init: handle bootloader identifier in kernel parameters
has been added to the -mm mm-nonmm-unstable branch. Its filename is
init-handle-bootloader-identifier-in-kernel-parameters.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-nonmm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Huacai Chen <chenhuacai(a)loongson.cn>
Subject: init: handle bootloader identifier in kernel parameters
Date: Mon, 21 Jul 2025 18:13:43 +0800
BootLoader (Grub, LILO, etc) may pass an identifier such as "BOOT_IMAGE=
/boot/vmlinuz-x.y.z" to kernel parameters. But these identifiers are not
recognized by the kernel itself so will be passed to user space. However
user space init program also doesn't recognized it.
KEXEC/KDUMP (kexec-tools) may also pass an identifier such as "kexec" on
some architectures.
We cannot change BootLoader's behavior, because this behavior exists for
many years, and there are already user space programs search BOOT_IMAGE=
in /proc/cmdline to obtain the kernel image locations:
https://github.com/linuxdeepin/deepin-ab-recovery/blob/master/util.go
(search getBootOptions)
https://github.com/linuxdeepin/deepin-ab-recovery/blob/master/main.go
(search getKernelReleaseWithBootOption)
So the the best way is handle (ignore) it by the kernel itself, which can
avoid such boot warnings (if we use something like init=/bin/bash,
bootloader identifier can even cause a crash):
Kernel command line: BOOT_IMAGE=(hd0,1)/vmlinuz-6.x root=/dev/sda3 ro console=tty
Unknown kernel command line parameters "BOOT_IMAGE=(hd0,1)/vmlinuz-6.x", will be passed to user space.
Link: https://lkml.kernel.org/r/20250721101343.3283480-1-chenhuacai@loongson.cn
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Jan Kara <jack(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
init/main.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
--- a/init/main.c~init-handle-bootloader-identifier-in-kernel-parameters
+++ a/init/main.c
@@ -544,6 +544,12 @@ static int __init unknown_bootoption(cha
const char *unused, void *arg)
{
size_t len = strlen(param);
+ /*
+ * Well-known bootloader identifiers:
+ * 1. LILO/Grub pass "BOOT_IMAGE=...";
+ * 2. kexec/kdump (kexec-tools) pass "kexec".
+ */
+ const char *bootloader[] = { "BOOT_IMAGE=", "kexec", NULL };
/* Handle params aliased to sysctls */
if (sysctl_is_alias(param))
@@ -551,6 +557,12 @@ static int __init unknown_bootoption(cha
repair_env_string(param, val);
+ /* Handle bootloader identifier */
+ for (int i = 0; bootloader[i]; i++) {
+ if (!strncmp(param, bootloader[i], strlen(bootloader[i])))
+ return 0;
+ }
+
/* Handle obsolete-style parameters */
if (obsolete_checksetup(param))
return 0;
_
Patches currently in -mm which might be from chenhuacai(a)loongson.cn are
init-handle-bootloader-identifier-in-kernel-parameters.patch
The patch titled
Subject: mm/damon/core: fix commit_ops_filters by using correct nth function
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-core-fix-commit_ops_filters-by-using-correct-nth-function.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Sang-Heon Jeon <ekffu200098(a)gmail.com>
Subject: mm/damon/core: fix commit_ops_filters by using correct nth function
Date: Sun, 10 Aug 2025 21:42:01 +0900
damos_commit_ops_filters() incorrectly uses damos_nth_filter() which
iterates core_filters. As a result, performing a commit unintentionally
corrupts ops_filters.
Add damos_nth_ops_filter() which iterates ops_filters. Use this function
to fix issues caused by wrong iteration.
Link: https://lkml.kernel.org/r/20250810124201.15743-1-ekffu200098@gmail.com
Fixes: 3607cc590f18 ("mm/damon/core: support committing ops_filters") # 6.15.x
Signed-off-by: Sang-Heon Jeon <ekffu200098(a)gmail.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/core.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
--- a/mm/damon/core.c~mm-damon-core-fix-commit_ops_filters-by-using-correct-nth-function
+++ a/mm/damon/core.c
@@ -845,6 +845,18 @@ static struct damos_filter *damos_nth_fi
return NULL;
}
+static struct damos_filter *damos_nth_ops_filter(int n, struct damos *s)
+{
+ struct damos_filter *filter;
+ int i = 0;
+
+ damos_for_each_ops_filter(filter, s) {
+ if (i++ == n)
+ return filter;
+ }
+ return NULL;
+}
+
static void damos_commit_filter_arg(
struct damos_filter *dst, struct damos_filter *src)
{
@@ -908,7 +920,7 @@ static int damos_commit_ops_filters(stru
int i = 0, j = 0;
damos_for_each_ops_filter_safe(dst_filter, next, dst) {
- src_filter = damos_nth_filter(i++, src);
+ src_filter = damos_nth_ops_filter(i++, src);
if (src_filter)
damos_commit_filter(dst_filter, src_filter);
else
_
Patches currently in -mm which might be from ekffu200098(a)gmail.com are
mm-damon-core-fix-commit_ops_filters-by-using-correct-nth-function.patch
mm-damon-update-expired-description-of-damos_action.patch
docs-mm-damon-design-fix-typo-s-sz_trtied-sz_tried.patch
--
Hi,
PERDIS SUPER U is a leading retail group in France with numerous
outlets across the country. After reviewing your company profile and
products, we’re very interested in establishing a long-term partnership.
Kindly share your product catalog or website so we can review your
offerings and pricing. We are ready to place orders and begin
cooperation.Please note: Our payment terms are SWIFT, 14 days after
delivery.
Looking forward to your response.
Best regards,
Dominique Schelcher
Director, PERDIS SUPER U
RUE DE SAVOIE, 45600 SAINT-PÈRE-SUR-LOIRE
VAT: FR65380071464
www.magasins-u.com
The patch titled
Subject: squashfs: fix memory leak in squashfs_fill_super
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
squashfs-fix-memory-leak-in-squashfs_fill_super.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Phillip Lougher <phillip(a)squashfs.org.uk>
Subject: squashfs: fix memory leak in squashfs_fill_super
Date: Mon, 11 Aug 2025 23:37:40 +0100
If sb_min_blocksize returns 0, squashfs_fill_super exits without freeing
allocated memory (sb->s_fs_info).
Fix this by moving the call to sb_min_blocksize to before memory is
allocated.
Link: https://lkml.kernel.org/r/20250811223740.110392-1-phillip@squashfs.org.uk
Fixes: 734aa85390ea ("Squashfs: check return result of sb_min_blocksize")
Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk>
Reported-by: Scott GUO <scottzhguo(a)tencent.com>
Closes: https://lore.kernel.org/all/20250811061921.3807353-1-scott_gzh@163.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/squashfs/super.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
--- a/fs/squashfs/super.c~squashfs-fix-memory-leak-in-squashfs_fill_super
+++ a/fs/squashfs/super.c
@@ -187,10 +187,15 @@ static int squashfs_fill_super(struct su
unsigned short flags;
unsigned int fragments;
u64 lookup_table_start, xattr_id_table_start, next_table;
- int err;
+ int err, devblksize = sb_min_blocksize(sb, SQUASHFS_DEVBLK_SIZE);
TRACE("Entered squashfs_fill_superblock\n");
+ if (!devblksize) {
+ errorf(fc, "squashfs: unable to set blocksize\n");
+ return -EINVAL;
+ }
+
sb->s_fs_info = kzalloc(sizeof(*msblk), GFP_KERNEL);
if (sb->s_fs_info == NULL) {
ERROR("Failed to allocate squashfs_sb_info\n");
@@ -201,12 +206,7 @@ static int squashfs_fill_super(struct su
msblk->panic_on_errors = (opts->errors == Opt_errors_panic);
- msblk->devblksize = sb_min_blocksize(sb, SQUASHFS_DEVBLK_SIZE);
- if (!msblk->devblksize) {
- errorf(fc, "squashfs: unable to set blocksize\n");
- return -EINVAL;
- }
-
+ msblk->devblksize = devblksize;
msblk->devblksize_log2 = ffz(~msblk->devblksize);
mutex_init(&msblk->meta_index_mutex);
_
Patches currently in -mm which might be from phillip(a)squashfs.org.uk are
squashfs-fix-memory-leak-in-squashfs_fill_super.patch
Hi,
To Fix CVE-2021-47498 b4459b11e840 is required, but it has a dependency
on e2118b3c3d94 ("rearrange core declarations for extended use
from dm-zone.c"). Therefore backported both patches for v5.10.
Thanks,
Shivani
Shivani Agarwal (2):
dm: rearrange core declarations for extended use from dm-zone.c
dm rq: don't queue request to blk-mq during DM suspend
drivers/md/dm-core.h | 52 ++++++++++++++++++++++++++++++++++++++
drivers/md/dm-rq.c | 8 ++++++
drivers/md/dm.c | 59 ++++++--------------------------------------
3 files changed, 67 insertions(+), 52 deletions(-)
--
2.40.4
So we've had this regression in 9p for.. almost a year, which is way too
long, but there was no "easy" reproducer until yesterday (thank you
again!!)
It turned out to be a bug with iov_iter on folios,
iov_iter_get_pages_alloc2() would advance the iov_iter correctly up to
the end edge of a folio and the later copy_to_iter() fails on the
iterate_folioq() bug.
Happy to consider alternative ways of fixing this, now there's a
reproducer it's all much clearer; for the bug to be visible we basically
need to make and IO with non-contiguous folios in the iov_iter which is
not obvious to test with synthetic VMs, with size that triggers a
zero-copy read followed by a non-zero-copy read.
Signed-off-by: Dominique Martinet <asmadeus(a)codewreck.org>
---
Changes in v2:
- Fixed 'remain' being used uninitialized in iterate_folioq when going
through the goto
- s/forwarded/advanced in commit message
- Link to v1: https://lore.kernel.org/r/20250811-iot_iter_folio-v1-0-d9c223adf93c@codewre…
---
Dominique Martinet (2):
iov_iter: iterate_folioq: fix handling of offset >= folio size
iov_iter: iov_folioq_get_pages: don't leave empty slot behind
include/linux/iov_iter.h | 5 ++++-
lib/iov_iter.c | 6 +++---
2 files changed, 7 insertions(+), 4 deletions(-)
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250811-iot_iter_folio-1b7849f88fed
Best regards,
--
Dominique Martinet <asmadeus(a)codewreck.org>
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 31cd31c9e17ece125aad27259501a2af69ccb020
Gitweb: https://git.kernel.org/tip/31cd31c9e17ece125aad27259501a2af69ccb020
Author: Fushuai Wang <wangfushuai(a)baidu.com>
AuthorDate: Mon, 11 Aug 2025 11:50:44 -07:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Mon, 11 Aug 2025 13:28:07 -07:00
x86/fpu: Fix NULL dereference in avx512_status()
Problem
-------
With CONFIG_X86_DEBUG_FPU enabled, reading /proc/[kthread]/arch_status
causes a warning and a NULL pointer dereference.
This is because the AVX-512 timestamp code uses x86_task_fpu() but
doesn't check it for NULL. CONFIG_X86_DEBUG_FPU addles that function
for kernel threads (PF_KTHREAD specifically), making it return NULL.
The point of the warning was to ensure that kernel threads only access
task->fpu after going through kernel_fpu_begin()/_end(). Note: all
kernel tasks exposed in /proc have a valid task->fpu.
Solution
--------
One option is to silence the warning and check for NULL from
x86_task_fpu(). However, that warning is fairly fresh and seems like a
defense against misuse of the FPU state in kernel threads.
Instead, stop outputting AVX-512_elapsed_ms for kernel threads
altogether. The data was garbage anyway because avx512_timestamp is
only updated for user threads, not kernel threads.
If anyone ever wants to track kernel thread AVX-512 use, they can come
back later and do it properly, separate from this bug fix.
[ dhansen: mostly rewrite changelog ]
Fixes: 22aafe3bcb67 ("x86/fpu: Remove init_task FPU state dependencies, add debugging warning for PF_KTHREAD tasks")
Co-developed-by: Sohil Mehta <sohil.mehta(a)intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta(a)intel.com>
Signed-off-by: Fushuai Wang <wangfushuai(a)baidu.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/20250811185044.2227268-1-sohil.mehta%40intel.com
---
arch/x86/kernel/fpu/xstate.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 12ed75c..28e4fd6 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1881,19 +1881,20 @@ long fpu_xstate_prctl(int option, unsigned long arg2)
#ifdef CONFIG_PROC_PID_ARCH_STATUS
/*
* Report the amount of time elapsed in millisecond since last AVX512
- * use in the task.
+ * use in the task. Report -1 if no AVX-512 usage.
*/
static void avx512_status(struct seq_file *m, struct task_struct *task)
{
- unsigned long timestamp = READ_ONCE(x86_task_fpu(task)->avx512_timestamp);
- long delta;
+ unsigned long timestamp;
+ long delta = -1;
- if (!timestamp) {
- /*
- * Report -1 if no AVX512 usage
- */
- delta = -1;
- } else {
+ /* AVX-512 usage is not tracked for kernel threads. Don't report anything. */
+ if (task->flags & (PF_KTHREAD | PF_USER_WORKER))
+ return;
+
+ timestamp = READ_ONCE(x86_task_fpu(task)->avx512_timestamp);
+
+ if (timestamp) {
delta = (long)(jiffies - timestamp);
/*
* Cap to LONG_MAX if time difference > LONG_MAX
From: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
Temperature sensor returns the temperature of the mechanical parts
of the chip. If both accel and gyro are off, temperature sensor is
also automatically turned off and return invalid data.
In this case, returning EBUSY error code is better then EINVAL and
indicates userspace that it needs to retry reading temperature in
another context.
Fixes: bc3eb0207fb5 ("iio: imu: inv_icm42600: add temperature sensor support")
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
Cc: stable(a)vger.kernel.org
---
drivers/iio/imu/inv_icm42600/inv_icm42600_temp.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_temp.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_temp.c
index 8b15afca498cb5dfa7e056a60d3c78e419f11b29..1756f3d07049a26038776a35d9242f3dd1320354 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_temp.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_temp.c
@@ -32,8 +32,12 @@ static int inv_icm42600_temp_read(struct inv_icm42600_state *st, s16 *temp)
goto exit;
*temp = (s16)be16_to_cpup(raw);
+ /*
+ * Temperature data is invalid if both accel and gyro are off.
+ * Return EBUSY in this case.
+ */
if (*temp == INV_ICM42600_DATA_INVALID)
- ret = -EINVAL;
+ ret = -EBUSY;
exit:
mutex_unlock(&st->lock);
---
base-commit: 6408dba154079656d069a6a25fb3a8954959474c
change-id: 20250807-inv-icm42600-change-temperature-error-code-65d16a98c6e1
Best regards,
--
Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
Use 'min(len, x)' as the index instead of just 'len' to NUL-terminate
the copied 'line' string at the correct position.
Add a newline after the local variable declarations to silence a
checkpatch warning.
Cc: stable(a)vger.kernel.org
Fixes: 692d97c380c6 ("kconfig: new configuration interface (nconfig)")
Signed-off-by: Thorsten Blum <thorsten.blum(a)linux.dev>
---
scripts/kconfig/nconf.gui.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/scripts/kconfig/nconf.gui.c b/scripts/kconfig/nconf.gui.c
index 7206437e784a..ec021ebd2c52 100644
--- a/scripts/kconfig/nconf.gui.c
+++ b/scripts/kconfig/nconf.gui.c
@@ -175,8 +175,9 @@ void fill_window(WINDOW *win, const char *text)
for (i = 0; i < total_lines; i++) {
char tmp[x+10];
const char *line = get_line(text, i);
- int len = get_line_length(line);
- strncpy(tmp, line, min(len, x));
+ int len = min(get_line_length(line), x);
+
+ strncpy(tmp, line, len);
tmp[len] = '\0';
mvwprintw(win, i, 0, "%s", tmp);
}
--
2.50.1
The following commit has been merged into the locking/urgent branch of tip:
Commit-ID: dfb36e4a8db0cd56f92d4cb445f54e85a9b40897
Gitweb: https://git.kernel.org/tip/dfb36e4a8db0cd56f92d4cb445f54e85a9b40897
Author: Waiman Long <longman(a)redhat.com>
AuthorDate: Mon, 11 Aug 2025 10:11:47 -04:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Mon, 11 Aug 2025 17:53:21 +02:00
futex: Use user_write_access_begin/_end() in futex_put_value()
Commit cec199c5e39b ("futex: Implement FUTEX2_NUMA") introduced the
futex_put_value() helper to write a value to the given user
address.
However, it uses user_read_access_begin() before the write. For
architectures that differentiate between read and write accesses, like
PowerPC, futex_put_value() fails with -EFAULT.
Fix that by using the user_write_access_begin/user_write_access_end() pair
instead.
Fixes: cec199c5e39b ("futex: Implement FUTEX2_NUMA")
Signed-off-by: Waiman Long <longman(a)redhat.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/20250811141147.322261-1-longman@redhat.com
---
kernel/futex/futex.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h
index c74eac5..2cd5709 100644
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -319,13 +319,13 @@ static __always_inline int futex_put_value(u32 val, u32 __user *to)
{
if (can_do_masked_user_access())
to = masked_user_access_begin(to);
- else if (!user_read_access_begin(to, sizeof(*to)))
+ else if (!user_write_access_begin(to, sizeof(*to)))
return -EFAULT;
unsafe_put_user(val, to, Efault);
- user_read_access_end();
+ user_write_access_end();
return 0;
Efault:
- user_read_access_end();
+ user_write_access_end();
return -EFAULT;
}
There is a fix 17ba9cde11c2bfebbd70867b0a2ac4a22e573379
introduced in v6.8 to fix the problem introduced by
the original fix 66951d98d9bf45ba25acf37fe0747253fafdf298,
and they together fix the CVE-2024-26661.
Since this is the first time I submit the changes on vulns project,
not sure if the changes in my patch are exact, @Greg, please point
out the problems if there are and I will fix them.
Thanks!
Qingfeng Hao (1):
CVE-2024-26661: change the sha1 of the cve id
cve/published/2024/CVE-2024-26661.dyad | 6 ++--
cve/published/2024/CVE-2024-26661.json | 46 ++------------------------
cve/published/2024/CVE-2024-26661.sha1 | 2 +-
3 files changed, 5 insertions(+), 49 deletions(-)
--
2.34.1
All the other ref clocks provided by this driver have the bi_tcxo
as parent. The eDP refclk is the only one without a parent, leading
to reporting its rate as 0. So set its parent to bi_tcxo, just like
the rest of the refclks.
Cc: stable(a)vger.kernel.org # v6.9
Fixes: 06aff116199c ("clk: qcom: Add TCSR clock driver for x1e80100")
Signed-off-by: Abel Vesa <abel.vesa(a)linaro.org>
---
drivers/clk/qcom/tcsrcc-x1e80100.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/clk/qcom/tcsrcc-x1e80100.c b/drivers/clk/qcom/tcsrcc-x1e80100.c
index ff61769a08077e916157a03c789ab3d5b0c090f6..a367e1f55622d990929984facb62185b551d6c50 100644
--- a/drivers/clk/qcom/tcsrcc-x1e80100.c
+++ b/drivers/clk/qcom/tcsrcc-x1e80100.c
@@ -29,6 +29,10 @@ static struct clk_branch tcsr_edp_clkref_en = {
.enable_mask = BIT(0),
.hw.init = &(const struct clk_init_data) {
.name = "tcsr_edp_clkref_en",
+ .parent_data = &(const struct clk_parent_data){
+ .index = DT_BI_TCXO_PAD,
+ },
+ .num_parents = 1,
.ops = &clk_branch2_ops,
},
},
---
base-commit: 79fb37f39b77bbf9a56304e9af843cd93a7a1916
change-id: 20250730-clk-qcom-tcsrcc-x1e80100-parent-edp-refclk-8b17a56f050a
Best regards,
--
Abel Vesa <abel.vesa(a)linaro.org>
On Mon, 11 Aug 2025 at 16:18, Sasha Levin <sashal(a)kernel.org> wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> ARM: s3c/gpio: complete the conversion to new GPIO value setters
>
> to the 6.16-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> arm-s3c-gpio-complete-the-conversion-to-new-gpio-val.patch
> and it can be found in the queue-6.16 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
This is not something we should backport. This is a refactoring targeting v6.17.
Bartosz
>
>
> commit c09ce1148e4748b856b6d13a8b9fa568a1e687a2
> Author: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
> Date: Wed Jul 30 09:14:43 2025 +0200
>
> ARM: s3c/gpio: complete the conversion to new GPIO value setters
>
> [ Upstream commit 3dca3d51b933beb3f35a60472ed2110d1bd7046a ]
>
> Commit fb52f3226cab ("ARM: s3c/gpio: use new line value setter
> callbacks") correctly changed the assignment of the callback but missed
> the check one liner higher. Change it now too to using the recommended
> callback as the legacy one is going away soon.
>
> Fixes: fb52f3226cab ("ARM: s3c/gpio: use new line value setter callbacks")
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
> Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/arch/arm/mach-s3c/gpio-samsung.c b/arch/arm/mach-s3c/gpio-samsung.c
> index 206a492fbaf5..3ee4ad969cc2 100644
> --- a/arch/arm/mach-s3c/gpio-samsung.c
> +++ b/arch/arm/mach-s3c/gpio-samsung.c
> @@ -516,7 +516,7 @@ static void __init samsung_gpiolib_add(struct samsung_gpio_chip *chip)
> gc->direction_input = samsung_gpiolib_2bit_input;
> if (!gc->direction_output)
> gc->direction_output = samsung_gpiolib_2bit_output;
> - if (!gc->set)
> + if (!gc->set_rv)
> gc->set_rv = samsung_gpiolib_set;
> if (!gc->get)
> gc->get = samsung_gpiolib_get;
Move ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings() to
linux/pgtable.h so that they can be used outside of vmalloc and ioremap.
Cc: <stable(a)vger.kernel.org>
Fixes: 8d400913c231 ("x86/vmemmap: handle unpopulated sub-pmd ranges")
Signed-off-by: Harry Yoo <harry.yoo(a)oracle.com>
---
include/linux/pgtable.h | 16 ++++++++++++++++
include/linux/vmalloc.h | 16 ----------------
2 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 4c035637eeb7..ba699df6ef69 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1467,6 +1467,22 @@ static inline void modify_prot_commit_ptes(struct vm_area_struct *vma, unsigned
}
#endif
+/*
+ * Architectures can set this mask to a combination of PGTBL_P?D_MODIFIED values
+ * and let generic vmalloc and ioremap code know when arch_sync_kernel_mappings()
+ * needs to be called.
+ */
+#ifndef ARCH_PAGE_TABLE_SYNC_MASK
+#define ARCH_PAGE_TABLE_SYNC_MASK 0
+#endif
+
+/*
+ * There is no default implementation for arch_sync_kernel_mappings(). It is
+ * relied upon the compiler to optimize calls out if ARCH_PAGE_TABLE_SYNC_MASK
+ * is 0.
+ */
+void arch_sync_kernel_mappings(unsigned long start, unsigned long end);
+
#endif /* CONFIG_MMU */
/*
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index fdc9aeb74a44..2759dac6be44 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -219,22 +219,6 @@ extern int remap_vmalloc_range(struct vm_area_struct *vma, void *addr,
int vmap_pages_range(unsigned long addr, unsigned long end, pgprot_t prot,
struct page **pages, unsigned int page_shift);
-/*
- * Architectures can set this mask to a combination of PGTBL_P?D_MODIFIED values
- * and let generic vmalloc and ioremap code know when arch_sync_kernel_mappings()
- * needs to be called.
- */
-#ifndef ARCH_PAGE_TABLE_SYNC_MASK
-#define ARCH_PAGE_TABLE_SYNC_MASK 0
-#endif
-
-/*
- * There is no default implementation for arch_sync_kernel_mappings(). It is
- * relied upon the compiler to optimize calls out if ARCH_PAGE_TABLE_SYNC_MASK
- * is 0.
- */
-void arch_sync_kernel_mappings(unsigned long start, unsigned long end);
-
/*
* Lowlevel-APIs (not for driver use!)
*/
--
2.43.0
When fuzzing USB with syzkaller on a PREEMPT_RT enabled kernel, following
bug is triggered in the ksoftirqd context.
| BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
| in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 30, name: ksoftirqd/1
| preempt_count: 0, expected: 0
| RCU nest depth: 2, expected: 2
| CPU: 1 UID: 0 PID: 30 Comm: ksoftirqd/1 Tainted: G W 6.16.0-rc1-rt1 #11 PREEMPT_RT
| Tainted: [W]=WARN
| Hardware name: QEMU KVM Virtual Machine, BIOS 2025.02-8 05/13/2025
| Call trace:
| show_stack+0x2c/0x3c (C)
| __dump_stack+0x30/0x40
| dump_stack_lvl+0x148/0x1d8
| dump_stack+0x1c/0x3c
| __might_resched+0x2e4/0x52c
| rt_spin_lock+0xa8/0x1bc
| kcov_remote_start+0xb0/0x490
| __usb_hcd_giveback_urb+0x2d0/0x5e8
| usb_giveback_urb_bh+0x234/0x3c4
| process_scheduled_works+0x678/0xd18
| bh_worker+0x2f0/0x59c
| workqueue_softirq_action+0x104/0x14c
| tasklet_action+0x18/0x8c
| handle_softirqs+0x208/0x63c
| run_ksoftirqd+0x64/0x264
| smpboot_thread_fn+0x4ac/0x908
| kthread+0x5e8/0x734
| ret_from_fork+0x10/0x20
To reproduce on PREEMPT_RT kernel:
$ git remote add rt-devel git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git
$ git fetch rt-devel
$ git checkout -b v6.16-rc1-rt1 v6.16-rc1-rt1
I have attached the syzlang and the C source code converted by syz-prog2c:
Link: https://gist.github.com/kzall0c/9455aaa246f4aa1135353a51753adbbe
Then, run with a PREEMPT_RT config.
This issue was introduced by commit
f85d39dd7ed8 ("kcov, usb: disable interrupts in kcov_remote_start_usb_softirq").
However, this creates a conflict on PREEMPT_RT kernels. The local_irq_save()
call establishes an atomic context where sleeping is forbidden. Inside this
context, kcov_remote_start() is called, which on PREEMPT_RT uses sleeping
locks (spinlock_t and local_lock_t are mapped to rt_mutex). This results in
a sleeping function called from invalid context.
On PREEMPT_RT, interrupt handlers are threaded, so the re-entrancy scenario
is already safely handled by the existing local_lock_t and the global
kcov_remote_lock within kcov_remote_start(). Therefore, the outer
local_irq_save() is not necessary.
This preserves the intended re-entrancy protection for non-RT kernels while
resolving the locking violation on PREEMPT_RT kernels.
After making this modification and testing it, syzkaller fuzzing the
PREEMPT_RT kernel is now running without stopping on latest announced
Real-time Linux.
Link: https://lore.kernel.org/linux-rt-devel/20250610080307.LMm1hleC@linutronix.d…
Fixes: f85d39dd7ed8 ("kcov, usb: disable interrupts in kcov_remote_start_usb_softirq")
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp>
Cc: Alan Stern <stern(a)rowland.harvard.edu>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Byungchul Park <byungchul(a)sk.com>
Cc: stable(a)vger.kernel.org
Cc: kasan-dev(a)googlegroups.com
Cc: syzkaller(a)googlegroups.com
Cc: linux-usb(a)vger.kernel.org
Cc: linux-rt-devel(a)lists.linux.dev
Signed-off-by: Yunseong Kim <ysk(a)kzalloc.com>
---
include/linux/kcov.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/linux/kcov.h b/include/linux/kcov.h
index 75a2fb8b16c3..c5e1b2dd0bb7 100644
--- a/include/linux/kcov.h
+++ b/include/linux/kcov.h
@@ -85,7 +85,9 @@ static inline unsigned long kcov_remote_start_usb_softirq(u64 id)
unsigned long flags = 0;
if (in_serving_softirq()) {
+#ifndef CONFIG_PREEMPT_RT
local_irq_save(flags);
+#endif
kcov_remote_start_usb(id);
}
@@ -96,7 +98,9 @@ static inline void kcov_remote_stop_softirq(unsigned long flags)
{
if (in_serving_softirq()) {
kcov_remote_stop();
+#ifndef CONFIG_PREEMPT_RT
local_irq_restore(flags);
+#endif
}
}
--
2.50.0
The BIOS can leave the AUX power well enabled on an output, even if this
isn't required (on platforms where the AUX power is only needed for an
AUX access). This was observed at least on PTL. To avoid the WARN which
would be triggered by this during the HW readout, convert the WARN to a
debug message.
Cc: stable(a)vger.kernel.org # v6.8+
Reported-by: Charlton Lin <charlton.lin(a)intel.com>
Tested-by: Khaled Almahallawy <khaled.almahallawy(a)intel.com>
Reviewed-by: Mika Kahola <mika.kahola(a)intel.com>
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
---
drivers/gpu/drm/i915/display/intel_tc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_tc.c b/drivers/gpu/drm/i915/display/intel_tc.c
index 8b6826fc06855..0ff5a8951e734 100644
--- a/drivers/gpu/drm/i915/display/intel_tc.c
+++ b/drivers/gpu/drm/i915/display/intel_tc.c
@@ -1497,11 +1497,11 @@ static void intel_tc_port_reset_mode(struct intel_tc_port *tc,
intel_display_power_flush_work(display);
if (!intel_tc_cold_requires_aux_pw(dig_port)) {
enum intel_display_power_domain aux_domain;
- bool aux_powered;
aux_domain = intel_aux_power_domain(dig_port);
- aux_powered = intel_display_power_is_enabled(display, aux_domain);
- drm_WARN_ON(display->drm, aux_powered);
+ if (intel_display_power_is_enabled(display, aux_domain))
+ drm_dbg_kms(display->drm, "Port %s: AUX unexpectedly powered\n",
+ tc->port_name);
}
tc_phy_disconnect(tc);
--
2.49.1
Use the cached max lane count value on LNL+, to account for scenarios
where this value is queried after the HW cleared the corresponding pin
assignment value in the TCSS_DDI_STATUS register after the sink got
disconnected.
For consistency, follow-up changes will use the cached max lane count
value on other platforms as well and will also cache the pin assignment
value in a similar way.
Cc: stable(a)vger.kernel.org # v6.8+
Reported-by: Charlton Lin <charlton.lin(a)intel.com>
Tested-by: Khaled Almahallawy <khaled.almahallawy(a)intel.com>
Reviewed-by: Mika Kahola <mika.kahola(a)intel.com>
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
---
drivers/gpu/drm/i915/display/intel_tc.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_tc.c b/drivers/gpu/drm/i915/display/intel_tc.c
index b0edbce2060ff..8b6826fc06855 100644
--- a/drivers/gpu/drm/i915/display/intel_tc.c
+++ b/drivers/gpu/drm/i915/display/intel_tc.c
@@ -395,12 +395,16 @@ static void read_pin_configuration(struct intel_tc_port *tc)
int intel_tc_port_max_lane_count(struct intel_digital_port *dig_port)
{
+ struct intel_display *display = to_intel_display(dig_port);
struct intel_tc_port *tc = to_tc_port(dig_port);
if (!intel_encoder_is_tc(&dig_port->base))
return 4;
- return get_max_lane_count(tc);
+ if (DISPLAY_VER(display) < 20)
+ return get_max_lane_count(tc);
+
+ return tc->max_lane_count;
}
void intel_tc_port_set_fia_lane_count(struct intel_digital_port *dig_port,
--
2.49.1
On LNL+ for a disconnected sink the pin assignment value gets cleared by
the HW/FW as soon as the sink gets disconnected, even if the PHY
ownership got acquired already by the BIOS/driver (and hence the PHY
itself is still connected and used by the display). During HW readout
this can result in detecting the PHY's max lane count as 0 - matching
the above cleared aka NONE pin assignment HW state. For a connected PHY
the driver in general (outside of intel_tc.c) expects the max lane count
value to be valid for the video mode enabled on the corresponding output
(1, 2 or 4). Ensure this by setting the max lane count to 4 in this
case. Note, that it doesn't matter if this lane count happened to be
more than the max lane count with which the PHY got connected and
enabled, since the only thing the driver can do with such an output -
where the DP-alt sink is disconnected - is to disable the output.
v2: Rebased on change reading out the pin configuration only if the PHY
is connected.
Cc: stable(a)vger.kernel.org # v6.8+
Reported-by: Charlton Lin <charlton.lin(a)intel.com>
Tested-by: Khaled Almahallawy <khaled.almahallawy(a)intel.com>
Reviewed-by: Mika Kahola <mika.kahola(a)intel.com>
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
---
drivers/gpu/drm/i915/display/intel_tc.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_tc.c b/drivers/gpu/drm/i915/display/intel_tc.c
index 752900f1c115c..b0edbce2060ff 100644
--- a/drivers/gpu/drm/i915/display/intel_tc.c
+++ b/drivers/gpu/drm/i915/display/intel_tc.c
@@ -23,6 +23,7 @@
#include "intel_modeset_lock.h"
#include "intel_tc.h"
+#define DP_PIN_ASSIGNMENT_NONE 0x0
#define DP_PIN_ASSIGNMENT_C 0x3
#define DP_PIN_ASSIGNMENT_D 0x4
#define DP_PIN_ASSIGNMENT_E 0x5
@@ -308,6 +309,8 @@ static int lnl_tc_port_get_max_lane_count(struct intel_digital_port *dig_port)
REG_FIELD_GET(TCSS_DDI_STATUS_PIN_ASSIGNMENT_MASK, val);
switch (pin_assignment) {
+ case DP_PIN_ASSIGNMENT_NONE:
+ return 0;
default:
MISSING_CASE(pin_assignment);
fallthrough;
@@ -1159,6 +1162,12 @@ static void xelpdp_tc_phy_get_hw_state(struct intel_tc_port *tc)
tc->lock_wakeref = tc_cold_block(tc);
read_pin_configuration(tc);
+ /*
+ * Set a valid lane count value for a DP-alt sink which got
+ * disconnected. The driver can only disable the output on this PHY.
+ */
+ if (tc->max_lane_count == 0)
+ tc->max_lane_count = 4;
}
drm_WARN_ON(display->drm,
--
2.49.1
The TypeC PHY HW readout during driver loading and system resume
determines which TypeC mode the PHY is in (legacy/DP-alt/TBT-alt) and
whether the PHY is connected, based on the PHY's Owned and Ready flags.
For the PHY to be in DP-alt or legacy mode and for the PHY to be in the
connected state in these modes, both the Owned (set by the BIOS/driver)
and the Ready (set by the HW) flags should be set.
On ICL-MTL the HW kept the PHY's Ready flag set after the driver
connected the PHY by acquiring the PHY ownership (by setting the Owned
flag), until the driver disconnected the PHY by releasing the PHY
ownership (by clearing the Owned flag). On LNL+ this has changed, in
that the HW clears the Ready flag as soon as the sink gets disconnected,
even if the PHY ownership was acquired already and hence the PHY is
being used by the display.
When inheriting the HW state from BIOS for a PHY connected in DP-alt
mode on which the sink got disconnected - i.e. in a case where the sink
was connected while BIOS/GOP was running and so the sink got enabled
connecting the PHY, but the user disconnected the sink by the time the
driver loaded - the PHY Owned but not Ready state must be accounted for
on LNL+ according to the above. Do that by assuming on LNL+ that the PHY
is connected in DP-alt mode whenever the PHY Owned flag is set,
regardless of the PHY Ready flag.
This fixes a problem on LNL+, where the PHY TypeC mode / connected state
was detected incorrectly for a DP-alt sink, which got connected and then
disconnected by the user in the above way.
v2: Rename tc_phy_in_legacy_or_dp_alt_mode() to tc_phy_owned_by_display().
(Luca, Jani)
Cc: Jani Nikula <jani.nikula(a)intel.com>
Cc: stable(a)vger.kernel.org # v6.8+
Reported-by: Charlton Lin <charlton.lin(a)intel.com>
Tested-by: Khaled Almahallawy <khaled.almahallawy(a)intel.com>
Reviewed-by: Mika Kahola <mika.kahola(a)intel.com>
Reviewed-by: Luca Coelho <luciano.coelho(a)intel.com>
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
---
drivers/gpu/drm/i915/display/intel_tc.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_tc.c b/drivers/gpu/drm/i915/display/intel_tc.c
index 3bc57579fe53e..d8247d1a8319b 100644
--- a/drivers/gpu/drm/i915/display/intel_tc.c
+++ b/drivers/gpu/drm/i915/display/intel_tc.c
@@ -1226,14 +1226,18 @@ static void tc_phy_get_hw_state(struct intel_tc_port *tc)
tc->phy_ops->get_hw_state(tc);
}
-static bool tc_phy_is_ready_and_owned(struct intel_tc_port *tc,
- bool phy_is_ready, bool phy_is_owned)
+static bool tc_phy_owned_by_display(struct intel_tc_port *tc,
+ bool phy_is_ready, bool phy_is_owned)
{
struct intel_display *display = to_intel_display(tc->dig_port);
- drm_WARN_ON(display->drm, phy_is_owned && !phy_is_ready);
+ if (DISPLAY_VER(display) < 20) {
+ drm_WARN_ON(display->drm, phy_is_owned && !phy_is_ready);
- return phy_is_ready && phy_is_owned;
+ return phy_is_ready && phy_is_owned;
+ } else {
+ return phy_is_owned;
+ }
}
static bool tc_phy_is_connected(struct intel_tc_port *tc,
@@ -1244,7 +1248,7 @@ static bool tc_phy_is_connected(struct intel_tc_port *tc,
bool phy_is_owned = tc_phy_is_owned(tc);
bool is_connected;
- if (tc_phy_is_ready_and_owned(tc, phy_is_ready, phy_is_owned))
+ if (tc_phy_owned_by_display(tc, phy_is_ready, phy_is_owned))
is_connected = port_pll_type == ICL_PORT_DPLL_MG_PHY;
else
is_connected = port_pll_type == ICL_PORT_DPLL_DEFAULT;
@@ -1352,7 +1356,7 @@ tc_phy_get_current_mode(struct intel_tc_port *tc)
phy_is_ready = tc_phy_is_ready(tc);
phy_is_owned = tc_phy_is_owned(tc);
- if (!tc_phy_is_ready_and_owned(tc, phy_is_ready, phy_is_owned)) {
+ if (!tc_phy_owned_by_display(tc, phy_is_ready, phy_is_owned)) {
mode = get_tc_mode_in_phy_not_owned_state(tc, live_mode);
} else {
drm_WARN_ON(display->drm, live_mode == TC_PORT_TBT_ALT);
--
2.49.1
It is possible for a malicious HID device to trigger a signed integer
overflow (undefined behaviour) in set_abs() in the following expression
by supplying bogus logical maximum and minimum values:
int fuzz = snratio ? (fmax - fmin) / snratio : 0;
For example, if the logical_maximum is INT_MAX and logical_minimum is -1
then (fmax - fmin) resolves to INT_MAX + 1, which does not fit in a 32-bit
signed int, so the subtraction overflows. Fix this by computing the
difference in a 64 bit context.
Fixes: 5519cab477b6 ("HID: hid-multitouch: support for PixCir-based panels")
Cc: stable(a)vger.kernel.org
Signed-off-by: Qasim Ijaz <qasdev00(a)gmail.com>
---
drivers/hid/hid-multitouch.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c
index 22c6314a8843..687638ed6d0f 100644
--- a/drivers/hid/hid-multitouch.c
+++ b/drivers/hid/hid-multitouch.c
@@ -540,7 +540,8 @@ static void set_abs(struct input_dev *input, unsigned int code,
{
int fmin = field->logical_minimum;
int fmax = field->logical_maximum;
- int fuzz = snratio ? (fmax - fmin) / snratio : 0;
+ s64 diff = (s64)fmax - (s64)fmin;
+ int fuzz = snratio ? (int)div_s64(diff, snratio) : 0;
input_set_abs_params(input, code, fmin, fmax, fuzz, 0);
input_abs_set_res(input, code, hidinput_calc_abs_res(field, code));
}
--
2.39.5
From: John Garry <john.g.garry(a)oracle.com>
[ Upstream commit 448dfecc7ff807822ecd47a5c052acedca7d09e8 ]
In blk_stack_limits(), we check that the t->chunk_sectors value is a
multiple of the t->physical_block_size value.
However, by finding the chunk_sectors value in bytes, we may overflow
the unsigned int which holds chunk_sectors, so change the check to be
based on sectors.
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: John Garry <john.g.garry(a)oracle.com>
Reviewed-by: Damien Le Moal <dlemoal(a)kernel.org>
Link: https://lore.kernel.org/r/20250729091448.1691334-2-john.g.garry@oracle.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
## Extensive Analysis
This commit should be backported to stable kernel trees for the
following reasons:
### 1. **It fixes a real integer overflow bug**
The original code performs `(t->chunk_sectors << 9)` which can cause an
integer overflow. Since both `chunk_sectors` and `physical_block_size`
are `unsigned int` (32-bit), when `chunk_sectors` is larger than
8,388,607 (2^23 - 1), shifting it left by 9 bits (multiplying by 512)
will overflow the 32-bit unsigned integer. This overflow can lead to
incorrect alignment checks.
### 2. **The fix is minimal and contained**
The change is a single-line modification that replaces:
```c
if ((t->chunk_sectors << 9) & (t->physical_block_size - 1))
```
with:
```c
if (t->chunk_sectors % (t->physical_block_size >> SECTOR_SHIFT))
```
This mathematically equivalent check avoids the overflow by dividing
`physical_block_size` by the sector size (512 bytes) instead of
multiplying `chunk_sectors` by 512.
### 3. **It's a correctness issue in critical block layer code**
The `blk_stack_limits()` function is fundamental to the block layer's
device stacking functionality, used by MD, DM, and other stacked block
devices. An overflow here could lead to:
- Incorrect alignment detection
- Improper chunk_sectors being cleared when it shouldn't be
- Potential data corruption or I/O errors in stacked block devices
### 4. **The fix has low regression risk**
- The new calculation is mathematically equivalent to the intended check
- It only changes how the comparison is performed, not the logic
- The fix has been reviewed by multiple maintainers (Hannes Reinecke,
Martin K. Petersen, Damien Le Moal)
- No new features or architectural changes are introduced
### 5. **Real-world impact potential**
Modern storage devices can have large chunk sizes. For example:
- NVMe devices with large zones could have chunk_sectors approaching or
exceeding the overflow threshold
- RAID configurations with large stripe sizes
- This becomes more likely as storage devices grow in capacity and
complexity
### 6. **Follows stable kernel criteria**
According to stable kernel rules, this qualifies because it:
- Fixes a bug that users can potentially hit
- Is obviously correct and tested
- Is small (single line change)
- Fixes only one thing
- Has no dependencies on other patches
The commit message clearly describes the problem (overflow in unsigned
int) and the solution (changing the check to avoid overflow). The fix
maintains the same semantic meaning while being overflow-safe, making it
an ideal candidate for stable backporting.
block/blk-settings.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index a000daafbfb4..88890e904320 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -779,7 +779,7 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
}
/* chunk_sectors a multiple of the physical block size? */
- if ((t->chunk_sectors << 9) & (t->physical_block_size - 1)) {
+ if (t->chunk_sectors % (t->physical_block_size >> SECTOR_SHIFT)) {
t->chunk_sectors = 0;
t->flags |= BLK_FLAG_MISALIGNED;
ret = -1;
--
2.39.5
When building with CONFIG_CMODEL_MEDLOW and CONFIG_LTO_CLANG, there is a
series of errors due to some files being unconditionally compiled with
'-mcmodel=medany', mismatching with the rest of the kernel built with
'-mcmodel=medlow':
ld.lld: error: Function Import: link error: linking module flags 'Code Model': IDs have conflicting values: 'i32 3' from vmlinux.a(init.o at 899908), and 'i32 1' from vmlinux.a(net-traces.o at 1014628)
Only allow LTO to be performed when CONFIG_CMODEL_MEDANY is enabled to
ensure there will be no code model mismatch errors. An alternative
solution would be disabling LTO for the files with a different code
model than the main kernel like some specialized areas of the kernel do
but doing that for individual files is not as sustainable than
forbidding the combination altogether.
Cc: stable(a)vger.kernel.org
Fixes: 021d23428bdb ("RISC-V: build: Allow LTO to be selected")
Reported-by: kernel test robot <lkp(a)intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506290255.KBVM83vZ-lkp@intel.com/
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
arch/riscv/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 36061f4732b7..4eee737a050f 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -68,7 +68,7 @@ config RISCV
select ARCH_SUPPORTS_HUGE_PFNMAP if TRANSPARENT_HUGEPAGE
select ARCH_SUPPORTS_HUGETLBFS if MMU
# LLD >= 14: https://github.com/llvm/llvm-project/issues/50505
- select ARCH_SUPPORTS_LTO_CLANG if LLD_VERSION >= 140000
+ select ARCH_SUPPORTS_LTO_CLANG if LLD_VERSION >= 140000 && CMODEL_MEDANY
select ARCH_SUPPORTS_LTO_CLANG_THIN if LLD_VERSION >= 140000
select ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS if 64BIT && MMU
select ARCH_SUPPORTS_PAGE_TABLE_CHECK if MMU
---
base-commit: fda589c286040d9ba2d72a0eaf0a13945fc48026
change-id: 20250710-riscv-restrict-lto-to-medany-f1b7dd5c9bba
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
From: Masahiro Yamada <masahiroy(a)kernel.org>
[ Upstream commit cae9cdbcd9af044810bcceeb43a87accca47c71d ]
The on_treeview2_cursor_changed() handler is connected to both the left
and right tree views, but it hardcodes model2 (the GtkTreeModel of the
right tree view). This is incorrect. Get the associated model from the
view.
Signed-off-by: Masahiro Yamada <masahiroy(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Let me check the commit you provided more carefully. Looking at the
diff:
The commit changes line 851-852 in the `on_treeview2_cursor_changed`
function from:
```c
if (gtk_tree_selection_get_selected(selection, &model2, &iter)) {
gtk_tree_model_get(model2, &iter, COL_MENU, &menu, -1);
```
To:
```c
if (gtk_tree_selection_get_selected(selection, &model, &iter)) {
gtk_tree_model_get(model, &iter, COL_MENU, &menu, -1);
```
Where `model` is obtained from: `GtkTreeModel *model =
gtk_tree_view_get_model(treeview);`
This is indeed a bug fix. The function `on_treeview2_cursor_changed()`
is connected as a signal handler to BOTH tree views (tree1_w and
tree2_w), as shown in the connection code around lines 989 and 997.
However, the old code hardcoded `model2` which is specifically the model
for the right tree view. This would cause incorrect behavior when the
handler is triggered from the left tree view (tree1_w).
**Backport Status: YES**
## Extensive Analysis:
### Bug Description:
The commit fixes a clear bug in the gconf GUI tool for kernel
configuration. The `on_treeview2_cursor_changed()` function is used as a
signal handler for cursor changes in both the left and right tree views
of the configuration interface. However, the function incorrectly
hardcoded `model2` (the GtkTreeModel for the right tree view) instead of
dynamically getting the model from whichever view triggered the event.
### Code Analysis:
1. **Signal Connection** (lines 988-997 in the repository):
- The same handler is connected to both tree views:
```c
g_signal_connect(tree1_w, "cursor-changed",
G_CALLBACK(on_treeview2_cursor_changed), NULL);
g_signal_connect(tree2_w, "cursor-changed",
G_CALLBACK(on_treeview2_cursor_changed), NULL);
```
2. **The Bug**: When the cursor changes in tree1_w (left tree), the
handler would incorrectly use model2 (right tree's model) to get the
selected item, potentially causing:
- Incorrect help text display
- Crashes if the models have different structures
- Undefined behavior when accessing invalid iterators
3. **The Fix**: The commit correctly obtains the model from the treeview
parameter that triggered the event:
```c
GtkTreeModel *model = gtk_tree_view_get_model(treeview);
```
This ensures the correct model is used regardless of which tree view
triggered the event.
### Backport Justification:
1. **Clear Bug Fix**: This fixes an obvious programming error that
affects functionality
2. **Small and Contained**: The change is minimal (2 lines modified) and
localized to a single function
3. **Low Risk**: The fix is straightforward and correct - getting the
model from the actual tree view instead of hardcoding
4. **User Impact**: This bug could affect users of the gconf
configuration tool, potentially causing crashes or incorrect behavior
5. **No Architectural Changes**: This is a simple bug fix with no design
changes
6. **No New Features**: Pure bug fix, no functionality additions
This is an ideal candidate for stable backporting as it's a clear,
minimal bug fix that improves reliability without introducing risk.
scripts/kconfig/gconf.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/scripts/kconfig/gconf.c b/scripts/kconfig/gconf.c
index c0f46f189060..abe4cfe66b14 100644
--- a/scripts/kconfig/gconf.c
+++ b/scripts/kconfig/gconf.c
@@ -942,13 +942,14 @@ on_treeview2_key_press_event(GtkWidget * widget,
void
on_treeview2_cursor_changed(GtkTreeView * treeview, gpointer user_data)
{
+ GtkTreeModel *model = gtk_tree_view_get_model(treeview);
GtkTreeSelection *selection;
GtkTreeIter iter;
struct menu *menu;
selection = gtk_tree_view_get_selection(treeview);
- if (gtk_tree_selection_get_selected(selection, &model2, &iter)) {
- gtk_tree_model_get(model2, &iter, COL_MENU, &menu, -1);
+ if (gtk_tree_selection_get_selected(selection, &model, &iter)) {
+ gtk_tree_model_get(model, &iter, COL_MENU, &menu, -1);
text_insert_help(menu);
}
}
--
2.39.5
From: Hyejeong Choi <hjeong.choi(a)samsung.com>
commit 72c7d62583ebce7baeb61acce6057c361f73be4a upstream.
smp_store_mb() inserts memory barrier after storing operation.
It is different with what the comment is originally aiming so Null
pointer dereference can be happened if memory update is reordered.
Signed-off-by: Hyejeong Choi <hjeong.choi(a)samsung.com>
Fixes: a590d0fdbaa5 ("dma-buf: Update reservation shared_count after adding the new fence")
CC: stable(a)vger.kernel.org
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Link: https://lore.kernel.org/r/20250513020638.GA2329653@au1-maretx-p37.eng.sarc.…
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
[Conflict resolved by applying changes from dma_resv_add_fence() in the original fix to dma_resv_add_shared_fence() in current code base]
Signed-off-by: Jay Wang <wanjay(a)amazon.com>
---
drivers/dma-buf/dma-resv.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 789f72db097f..16eb0b389e08 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -291,8 +291,9 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
replace:
RCU_INIT_POINTER(fobj->shared[i], fence);
- /* pointer update must be visible before we extend the shared_count */
- smp_store_mb(fobj->shared_count, count);
+ /* fence update must be visible before we extend the shared_count */
+ smp_wmb();
+ fobj->shared_count = count;
write_seqcount_end(&obj->seq);
dma_fence_put(old);
--
2.47.3
During the integration of the RTL8239 POE chip + its frontend MCU, it was
noticed that multi-byte operations were basically broken in the current
driver.
Tests using SMBus Block Writes showed that the data (after the Wr maker +
Ack) was mixed up on the wire. At first glance, it looked like an
endianness problem. But for transfers where the number of count + data
bytes was not divisible by 4, the last bytes were not looking like an
endianness problem because they were in the wrong order but not for example
0 - which would be the case for an endianness problem with 32 bit
registers. At the end, it turned out to be the way how i2c_write tried to
add the bytes to the send registers.
Each 32 bit register was used similar to a shift register - shifting the
various bytes up the register while the next one is added to the least
significant byte. But the I2C controller expects the first byte of the
transmission in the least significant byte of the first register. And the
last byte (assuming it is a 16 byte transfer) is expected in the most
significant byte of the fourth register.
While doing these tests, it was also observed that the count byte was
missing from the SMBus Block Writes. The driver just removed them from the
data->block (from the I2C subsystem). But the I2C controller DOES NOT
automatically add this byte - for example by using the configured
transmission length.
The RTL8239 MCU is not actually an SMBus compliant device. Instead, it
expects I2C Block Reads + I2C Block Writes. But according to the already
identified bugs in the driver, it was clear that the I2C controller can
simply be modified to not send the count byte for I2C_SMBUS_I2C_BLOCK_DATA.
The receive part just needs to write the content of the receive buffer to
the correct position in data->block.
While the on-wire format was now correct, reads were still not possible
against the MCU (for the RTL8239 POE chip). It was always timing out
because the 2ms were not enough for sending the read request and then
receiving the 12 byte answer.
These changes were originally submitted to OpenWrt. But there are plans to
migrate OpenWrt to the upstream Linux driver. As a result, the pull request
was stopped and the changes were redone against this driver.
For reasons of transparency: The work on I2C_SMBUS_I2C_BLOCK_DATA support
for the RTL8239-MCU was done on RTL931xx. All problems were therefore
detected with the patches from Jonas Jelonek [1] and not the vanilla Linux
driver. But looking through the code, it seems like these are NOT
regressions introduced by the RTL931x patchset.
I've picked up Alex Guo's patch [2] to reduce conflicts between pending
fixes.
[1] https://patchwork.ozlabs.org/project/linux-i2c/cover/20250727114800.3046-1-…
[2] https://lore.kernel.org/r/20250615235248.529019-1-alexguo1023@gmail.com
Signed-off-by: Sven Eckelmann <sven(a)narfation.org>
---
Changes in v4:
- Provide only "write" examples for "i2c: rtl9300: Fix multi-byte I2C write"
- drop the second initialization of vals in rtl9300_i2c_write() directly in
the "Fix multi-byte I2C write" fix
- indicate in target branch for each patch in PATCH prefix
- minor commit message cleanups
- Link to v3: https://lore.kernel.org/r/20250804-i2c-rtl9300-multi-byte-v3-0-e20607e1b28c…
Changes in v3:
- integrated patch
https://lore.kernel.org/r/20250615235248.529019-1-alexguo1023@gmail.com
to avoid conflicts in the I2C_SMBUS_BLOCK_DATA code
- added Fixes and stable(a)vger.kernel.org to Alex Guo's patch
- added Chris Packham's Reviewed-by/Acked-by
- Link to v2: https://lore.kernel.org/r/20250803-i2c-rtl9300-multi-byte-v2-0-9b7b759fe2b6…
Changes in v2:
- add the missing transfer width and read length increase for the SMBus
Write/Read
- Link to v1: https://lore.kernel.org/r/20250802-i2c-rtl9300-multi-byte-v1-0-5f687e0098e2…
---
Alex Guo (1):
[i2c-host-fixes] i2c: rtl9300: Fix out-of-bounds bug in rtl9300_i2c_smbus_xfer
Harshal Gohel (2):
[i2c-host-fixes] i2c: rtl9300: Fix multi-byte I2C write
[i2c-host] i2c: rtl9300: Implement I2C block read and write
Sven Eckelmann (2):
[i2c-host-fixes] i2c: rtl9300: Increase timeout for transfer polling
[i2c-host-fixes] i2c: rtl9300: Add missing count byte for SMBus Block Ops
drivers/i2c/busses/i2c-rtl9300.c | 50 +++++++++++++++++++++++++++++++++-------
1 file changed, 42 insertions(+), 8 deletions(-)
---
base-commit: 09eaa2a604ed1bfda7e6fb10488127ce8fdc8048
change-id: 20250802-i2c-rtl9300-multi-byte-edaa1fb0872c
Best regards,
--
Sven Eckelmann <sven(a)narfation.org>