Sohil reported seeing a split lock warning when running a test that generates userspace #DB:
x86/split lock detection: #DB: sigtrap_loop_64/4614 took a bus_lock trap at address: 0x4011ae
We investigated the issue and figured out:
1) The warning is a false positive.
2) It is not caused by the test itself.
3) It occurs even when Bus Lock Detection (BLD) is disabled.
4) It only happens on the first #DB on a CPU.
And the root cause is, at boot time, Linux zeros DR6. This leads to different DR6 values depending on whether the CPU supports BLD:
1) On CPUs with BLD support, DR6 becomes 0xFFFF07F0 (bit 11, DR6.BLD, is cleared).
2) On CPUs without BLD, DR6 becomes 0xFFFF0FF0.
Since only BLD-induced #DB exceptions clear DR6.BLD and other debug exceptions leave it unchanged, even if the first #DB is unrelated to BLD, DR6.BLD is still cleared. As a result, such a first #DB is misinterpreted as a BLD #DB, and a false warning is triggerred.
Fix the bug by initializing DR6 by writing its architectural reset value at boot time.
DR7 suffers from a similar issue, apply the same fix.
This patch set is based on tip/x86/urgent branch as of today.
Link to the previous patch set v2: https://lore.kernel.org/lkml/20250617073234.1020644-1-xin@zytor.com/
Changes in v3: *) Polish initialize_debug_regs() (PeterZ). *) Rewrite the comment for DR6_RESERVED definition (Sohil and Sean). *) Reword the patch 2's changelog using Sean's description. *) Explain the definition of DR7_FIXED_1 (Sohil). *) Collect TB, RB, AB (PeterZ, Sohil and Sean).
Xin Li (Intel) (2): x86/traps: Initialize DR6 by writing its architectural reset value x86/traps: Initialize DR7 by writing its architectural reset value
arch/x86/include/asm/debugreg.h | 19 ++++++++++++---- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/include/uapi/asm/debugreg.h | 21 ++++++++++++++++- arch/x86/kernel/cpu/common.c | 24 ++++++++------------ arch/x86/kernel/kgdb.c | 2 +- arch/x86/kernel/process_32.c | 2 +- arch/x86/kernel/process_64.c | 2 +- arch/x86/kernel/traps.c | 34 +++++++++++++++++----------- arch/x86/kvm/x86.c | 4 ++-- 9 files changed, 72 insertions(+), 38 deletions(-)
base-commit: 2aebf5ee43bf0ed225a09a30cf515d9f2813b759
Initialize DR6 by writing its architectural reset value to avoid incorrectly zeroing DR6 to clear DR6.BLD at boot time, which leads to a false bus lock detected warning.
The Intel SDM says:
1) Certain debug exceptions may clear bits 0-3 of DR6.
2) BLD induced #DB clears DR6.BLD and any other debug exception doesn't modify DR6.BLD.
3) RTM induced #DB clears DR6.RTM and any other debug exception sets DR6.RTM.
To avoid confusion in identifying debug exceptions, debug handlers should set DR6.BLD and DR6.RTM, and clear other DR6 bits before returning.
The DR6 architectural reset value 0xFFFF0FF0, already defined as macro DR6_RESERVED, satisfies these requirements, so just use it to reinitialize DR6 whenever needed.
Since clear_all_debug_regs() no longer zeros all debug registers, rename it to initialize_debug_regs() to better reflect its current behavior.
Since debug_read_clear_dr6() no longer clears DR6, rename it to debug_read_reset_dr6() to better reflect its current behavior.
Reported-by: Sohil Mehta sohil.mehta@intel.com Link: https://lore.kernel.org/lkml/06e68373-a92b-472e-8fd9-ba548119770c@intel.com/ Fixes: ebb1064e7c2e9 ("x86/traps: Handle #DB for bus lock") Suggested-by: H. Peter Anvin (Intel) hpa@zytor.com Tested-by: Sohil Mehta sohil.mehta@intel.com Reviewed-by: H. Peter Anvin (Intel) hpa@zytor.com Reviewed-by: Sohil Mehta sohil.mehta@intel.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Xin Li (Intel) xin@zytor.com Cc: stable@vger.kernel.org ---
Changes in v3: *) Polish initialize_debug_regs() (PeterZ). *) Rewrite the comment for DR6_RESERVED definition (Sohil and Sean). *) Collect TB, RB, AB (PeterZ and Sohil).
Changes in v2: *) Use debug register index 6 rather than DR_STATUS (PeterZ and Sean). *) Move this patch the first of the patch set to ease backporting. --- arch/x86/include/uapi/asm/debugreg.h | 21 ++++++++++++++++- arch/x86/kernel/cpu/common.c | 24 ++++++++------------ arch/x86/kernel/traps.c | 34 +++++++++++++++++----------- 3 files changed, 51 insertions(+), 28 deletions(-)
diff --git a/arch/x86/include/uapi/asm/debugreg.h b/arch/x86/include/uapi/asm/debugreg.h index 0007ba077c0c..41da492dfb01 100644 --- a/arch/x86/include/uapi/asm/debugreg.h +++ b/arch/x86/include/uapi/asm/debugreg.h @@ -15,7 +15,26 @@ which debugging register was responsible for the trap. The other bits are either reserved or not of interest to us. */
-/* Define reserved bits in DR6 which are always set to 1 */ +/* + * Define bits in DR6 which are set to 1 by default. + * + * This is also the DR6 architectural value following Power-up, Reset or INIT. + * + * Note, with the introduction of Bus Lock Detection (BLD) and Restricted + * Transactional Memory (RTM), the DR6 register has been modified: + * + * 1) BLD flag (bit 11) is no longer reserved to 1 if the CPU supports + * Bus Lock Detection. The assertion of a bus lock could clear it. + * + * 2) RTM flag (bit 16) is no longer reserved to 1 if the CPU supports + * restricted transactional memory. #DB occurred inside an RTM region + * could clear it. + * + * Apparently, DR6.BLD and DR6.RTM are active low bits. + * + * As a result, DR6_RESERVED is an incorrect name now, but it is kept for + * compatibility. + */ #define DR6_RESERVED (0xFFFF0FF0)
#define DR_TRAP0 (0x1) /* db0 */ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 8feb8fd2957a..0f6c280a94f0 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -2243,20 +2243,16 @@ EXPORT_PER_CPU_SYMBOL(__stack_chk_guard); #endif #endif
-/* - * Clear all 6 debug registers: - */ -static void clear_all_debug_regs(void) +static void initialize_debug_regs(void) { - int i; - - for (i = 0; i < 8; i++) { - /* Ignore db4, db5 */ - if ((i == 4) || (i == 5)) - continue; - - set_debugreg(0, i); - } + /* Control register first -- to make sure everything is disabled. */ + set_debugreg(0, 7); + set_debugreg(DR6_RESERVED, 6); + /* dr5 and dr4 don't exist */ + set_debugreg(0, 3); + set_debugreg(0, 2); + set_debugreg(0, 1); + set_debugreg(0, 0); }
#ifdef CONFIG_KGDB @@ -2417,7 +2413,7 @@ void cpu_init(void)
load_mm_ldt(&init_mm);
- clear_all_debug_regs(); + initialize_debug_regs(); dbg_restore_debug_regs();
doublefault_init_cpu_tss(); diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index c5c897a86418..36354b470590 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1022,24 +1022,32 @@ static bool is_sysenter_singlestep(struct pt_regs *regs) #endif }
-static __always_inline unsigned long debug_read_clear_dr6(void) +static __always_inline unsigned long debug_read_reset_dr6(void) { unsigned long dr6;
+ get_debugreg(dr6, 6); + dr6 ^= DR6_RESERVED; /* Flip to positive polarity */ + /* * The Intel SDM says: * - * Certain debug exceptions may clear bits 0-3. The remaining - * contents of the DR6 register are never cleared by the - * processor. To avoid confusion in identifying debug - * exceptions, debug handlers should clear the register before - * returning to the interrupted task. + * Certain debug exceptions may clear bits 0-3 of DR6. + * + * BLD induced #DB clears DR6.BLD and any other debug + * exception doesn't modify DR6.BLD. * - * Keep it simple: clear DR6 immediately. + * RTM induced #DB clears DR6.RTM and any other debug + * exception sets DR6.RTM. + * + * To avoid confusion in identifying debug exceptions, + * debug handlers should set DR6.BLD and DR6.RTM, and + * clear other DR6 bits before returning. + * + * Keep it simple: write DR6 with its architectural reset + * value 0xFFFF0FF0, defined as DR6_RESERVED, immediately. */ - get_debugreg(dr6, 6); set_debugreg(DR6_RESERVED, 6); - dr6 ^= DR6_RESERVED; /* Flip to positive polarity */
return dr6; } @@ -1239,13 +1247,13 @@ static noinstr void exc_debug_user(struct pt_regs *regs, unsigned long dr6) /* IST stack entry */ DEFINE_IDTENTRY_DEBUG(exc_debug) { - exc_debug_kernel(regs, debug_read_clear_dr6()); + exc_debug_kernel(regs, debug_read_reset_dr6()); }
/* User entry, runs on regular task stack */ DEFINE_IDTENTRY_DEBUG_USER(exc_debug) { - exc_debug_user(regs, debug_read_clear_dr6()); + exc_debug_user(regs, debug_read_reset_dr6()); }
#ifdef CONFIG_X86_FRED @@ -1264,7 +1272,7 @@ DEFINE_FREDENTRY_DEBUG(exc_debug) { /* * FRED #DB stores DR6 on the stack in the format which - * debug_read_clear_dr6() returns for the IDT entry points. + * debug_read_reset_dr6() returns for the IDT entry points. */ unsigned long dr6 = fred_event_data(regs);
@@ -1279,7 +1287,7 @@ DEFINE_FREDENTRY_DEBUG(exc_debug) /* 32 bit does not have separate entry points. */ DEFINE_IDTENTRY_RAW(exc_debug) { - unsigned long dr6 = debug_read_clear_dr6(); + unsigned long dr6 = debug_read_reset_dr6();
if (user_mode(regs)) exc_debug_user(regs, dr6);
Initialize DR7 by writing its architectural reset value to always set bit 10, which is reserved to '1', when "clearing" DR7 so as not to trigger unanticipated behavior if said bit is ever unreserved, e.g. as a feature enabling flag with inverted polarity.
Tested-by: Sohil Mehta sohil.mehta@intel.com Reviewed-by: H. Peter Anvin (Intel) hpa@zytor.com Reviewed-by: Sohil Mehta sohil.mehta@intel.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Sean Christopherson seanjc@google.com Signed-off-by: Xin Li (Intel) xin@zytor.com ---
Changes in v3: *) Reword the changelog using Sean's description. *) Explain the definition of DR7_FIXED_1 (Sohil). *) Collect TB, RB, AB (PeterZ, Sohil and Sean).
Changes in v2: *) Use debug register index 7 rather than DR_CONTROL (PeterZ and Sean). *) Use DR7_FIXED_1 as the architectural reset value of DR7 (Sean). --- arch/x86/include/asm/debugreg.h | 19 +++++++++++++++---- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kernel/cpu/common.c | 2 +- arch/x86/kernel/kgdb.c | 2 +- arch/x86/kernel/process_32.c | 2 +- arch/x86/kernel/process_64.c | 2 +- arch/x86/kvm/x86.c | 4 ++-- 7 files changed, 22 insertions(+), 11 deletions(-)
diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h index 363110e6b2e3..a2c1f2d24b64 100644 --- a/arch/x86/include/asm/debugreg.h +++ b/arch/x86/include/asm/debugreg.h @@ -9,6 +9,14 @@ #include <asm/cpufeature.h> #include <asm/msr.h>
+/* + * Define bits that are always set to 1 in DR7, only bit 10 is + * architecturally reserved to '1'. + * + * This is also the init/reset value for DR7. + */ +#define DR7_FIXED_1 0x00000400 + DECLARE_PER_CPU(unsigned long, cpu_dr7);
#ifndef CONFIG_PARAVIRT_XXL @@ -100,8 +108,8 @@ static __always_inline void native_set_debugreg(int regno, unsigned long value)
static inline void hw_breakpoint_disable(void) { - /* Zero the control register for HW Breakpoint */ - set_debugreg(0UL, 7); + /* Reset the control register for HW Breakpoint */ + set_debugreg(DR7_FIXED_1, 7);
/* Zero-out the individual HW breakpoint address registers */ set_debugreg(0UL, 0); @@ -125,9 +133,12 @@ static __always_inline unsigned long local_db_save(void) return 0;
get_debugreg(dr7, 7); - dr7 &= ~0x400; /* architecturally set bit */ + + /* Architecturally set bit */ + dr7 &= ~DR7_FIXED_1; if (dr7) - set_debugreg(0, 7); + set_debugreg(DR7_FIXED_1, 7); + /* * Ensure the compiler doesn't lower the above statements into * the critical section; disabling breakpoints late would not diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index b4a391929cdb..639d9bcee842 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -31,6 +31,7 @@
#include <asm/apic.h> #include <asm/pvclock-abi.h> +#include <asm/debugreg.h> #include <asm/desc.h> #include <asm/mtrr.h> #include <asm/msr-index.h> @@ -249,7 +250,6 @@ enum x86_intercept_stage; #define DR7_BP_EN_MASK 0x000000ff #define DR7_GE (1 << 9) #define DR7_GD (1 << 13) -#define DR7_FIXED_1 0x00000400 #define DR7_VOLATILE 0xffff2bff
#define KVM_GUESTDBG_VALID_MASK \ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 0f6c280a94f0..27125e009847 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -2246,7 +2246,7 @@ EXPORT_PER_CPU_SYMBOL(__stack_chk_guard); static void initialize_debug_regs(void) { /* Control register first -- to make sure everything is disabled. */ - set_debugreg(0, 7); + set_debugreg(DR7_FIXED_1, 7); set_debugreg(DR6_RESERVED, 6); /* dr5 and dr4 don't exist */ set_debugreg(0, 3); diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c index 102641fd2172..8b1a9733d13e 100644 --- a/arch/x86/kernel/kgdb.c +++ b/arch/x86/kernel/kgdb.c @@ -385,7 +385,7 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs) struct perf_event *bp;
/* Disable hardware debugging while we are in kgdb: */ - set_debugreg(0UL, 7); + set_debugreg(DR7_FIXED_1, 7); for (i = 0; i < HBP_NUM; i++) { if (!breakinfo[i].enabled) continue; diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c index a10e180cbf23..3ef15c2f152f 100644 --- a/arch/x86/kernel/process_32.c +++ b/arch/x86/kernel/process_32.c @@ -93,7 +93,7 @@ void __show_regs(struct pt_regs *regs, enum show_regs_mode mode,
/* Only print out debug registers if they are in their non-default state. */ if ((d0 == 0) && (d1 == 0) && (d2 == 0) && (d3 == 0) && - (d6 == DR6_RESERVED) && (d7 == 0x400)) + (d6 == DR6_RESERVED) && (d7 == DR7_FIXED_1)) return;
printk("%sDR0: %08lx DR1: %08lx DR2: %08lx DR3: %08lx\n", diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index 8d6cf25127aa..b972bf72fb8b 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -133,7 +133,7 @@ void __show_regs(struct pt_regs *regs, enum show_regs_mode mode,
/* Only print out debug registers if they are in their non-default state. */ if (!((d0 == 0) && (d1 == 0) && (d2 == 0) && (d3 == 0) && - (d6 == DR6_RESERVED) && (d7 == 0x400))) { + (d6 == DR6_RESERVED) && (d7 == DR7_FIXED_1))) { printk("%sDR0: %016lx DR1: %016lx DR2: %016lx\n", log_lvl, d0, d1, d2); printk("%sDR3: %016lx DR6: %016lx DR7: %016lx\n", diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index b58a74c1722d..a9d992d5652f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -11035,7 +11035,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
if (unlikely(vcpu->arch.switch_db_regs && !(vcpu->arch.switch_db_regs & KVM_DEBUGREG_AUTO_SWITCH))) { - set_debugreg(0, 7); + set_debugreg(DR7_FIXED_1, 7); set_debugreg(vcpu->arch.eff_db[0], 0); set_debugreg(vcpu->arch.eff_db[1], 1); set_debugreg(vcpu->arch.eff_db[2], 2); @@ -11044,7 +11044,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)) kvm_x86_call(set_dr6)(vcpu, vcpu->arch.dr6); } else if (unlikely(hw_breakpoint_active())) { - set_debugreg(0, 7); + set_debugreg(DR7_FIXED_1, 7); }
vcpu->arch.host_debugctl = get_debugctlmsr();
Hi,
Thanks for your patch.
FYI: kernel test robot notices the stable kernel rule is not satisfied.
The check is based on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#opti...
Rule: add the tag "Cc: stable@vger.kernel.org" in the sign-off area to have the patch automatically included in the stable tree. Subject: [PATCH v3 2/2] x86/traps: Initialize DR7 by writing its architectural reset value Link: https://lore.kernel.org/stable/20250618172723.1651465-3-xin%40zytor.com
linux-stable-mirror@lists.linaro.org