Robert Gill reported below #GP when dosemu software was executing vm86() system call:
general protection fault: 0000 [#1] PREEMPT SMP CPU: 4 PID: 4610 Comm: dosemu.bin Not tainted 6.6.21-gentoo-x86 #1 Hardware name: Dell Inc. PowerEdge 1950/0H723K, BIOS 2.7.0 10/30/2010 EIP: restore_all_switch_stack+0xbe/0xcf EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000 ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: ff8affdc DS: 0000 ES: 0000 FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00010046 CR0: 80050033 CR2: 00c2101c CR3: 04b6d000 CR4: 000406d0 Call Trace: show_regs+0x70/0x78 die_addr+0x29/0x70 exc_general_protection+0x13c/0x348 exc_bounds+0x98/0x98 handle_exception+0x14d/0x14d exc_bounds+0x98/0x98 restore_all_switch_stack+0xbe/0xcf exc_bounds+0x98/0x98 restore_all_switch_stack+0xbe/0xcf
This only happens when VERW based mitigations like MDS/RFDS are enabled. This is because segment registers with an arbitrary user value can result in #GP when executing VERW. Intel SDM vol. 2C documents the following behavior for VERW instruction:
#GP(0) - If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
CLEAR_CPU_BUFFERS macro executes VERW instruction before returning to user space. Replace CLEAR_CPU_BUFFERS with a safer version that uses %ss to refer VERW operand mds_verw_sel. This ensures VERW will not #GP for an arbitrary user %ds. Also, in NMI return path, move VERW to after RESTORE_ALL_NMI that touches GPRs.
For clarity, below are the locations where the new CLEAR_CPU_BUFFERS_SAFE version is being used:
* entry_INT80_32(), entry_SYSENTER_32() and interrupts (via handle_exception_return) do:
restore_all_switch_stack: [...] mov %esi,%esi verw %ss:0xc0fc92c0 <------------- iret
* Opportunistic SYSEXIT:
[...] verw %ss:0xc0fc92c0 <------------- btrl $0x9,(%esp) popf pop %eax sti sysexit
* nmi_return and nmi_from_espfix: mov %esi,%esi verw %ss:0xc0fc92c0 <------------- jmp .Lirq_return
Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition") Cc: stable@vger.kernel.org # 5.10+ Reported-by: Robert Gill rtgill82@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707 Closes: https://lore.kernel.org/all/8c77ccfd-d561-45a1-8ed5-6b75212c7a58@leemhuis.in... Suggested-by: Dave Hansen dave.hansen@linux.intel.com Suggested-by: Brian Gerst brgerst@gmail.com # Use %ss Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com --- v4: - Further simplify the patch by using %ss for all VERW calls in 32-bit mode (Brian). - In NMI exit path move VERW after RESTORE_ALL_NMI that touches GPRs (Dave).
v3: https://lore.kernel.org/r/20240701-fix-dosemu-vm86-v3-1-b1969532c75a@linux.i... - Simplify CLEAR_CPU_BUFFERS_SAFE by using %ss instead of %ds (Brian). - Do verw before popf in SYSEXIT path (Jari).
v2: https://lore.kernel.org/r/20240627-fix-dosemu-vm86-v2-1-d5579f698e77@linux.i... - Safe guard against any other system calls like vm86() that might change %ds (Dave).
v1: https://lore.kernel.org/r/20240426-fix-dosemu-vm86-v1-1-88c826a3f378@linux.i... ---
--- arch/x86/entry/entry_32.S | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index d3a814efbff6..d54f6002e5a0 100644 --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -253,6 +253,16 @@ .Lend_@: .endm
+/* + * Safer version of CLEAR_CPU_BUFFERS that uses %ss to reference VERW operand + * mds_verw_sel. This ensures VERW will not #GP for an arbitrary user %ds. + */ +.macro CLEAR_CPU_BUFFERS_SAFE + ALTERNATIVE "jmp .Lskip_verw@", "", X86_FEATURE_CLEAR_CPU_BUF + verw %ss:_ASM_RIP(mds_verw_sel) +.Lskip_verw@: +.endm + .macro RESTORE_INT_REGS popl %ebx popl %ecx @@ -871,6 +881,8 @@ SYM_FUNC_START(entry_SYSENTER_32)
/* Now ready to switch the cr3 */ SWITCH_TO_USER_CR3 scratch_reg=%eax + /* Clobbers ZF */ + CLEAR_CPU_BUFFERS_SAFE
/* * Restore all flags except IF. (We restore IF separately because @@ -881,7 +893,6 @@ SYM_FUNC_START(entry_SYSENTER_32) BUG_IF_WRONG_CR3 no_user_check=1 popfl popl %eax - CLEAR_CPU_BUFFERS
/* * Return back to the vDSO, which will pop ecx and edx. @@ -951,7 +962,7 @@ restore_all_switch_stack:
/* Restore user state */ RESTORE_REGS pop=4 # skip orig_eax/error_code - CLEAR_CPU_BUFFERS + CLEAR_CPU_BUFFERS_SAFE .Lirq_return: /* * ARCH_HAS_MEMBARRIER_SYNC_CORE rely on IRET core serialization @@ -1144,7 +1155,6 @@ SYM_CODE_START(asm_exc_nmi)
/* Not on SYSENTER stack. */ call exc_nmi - CLEAR_CPU_BUFFERS jmp .Lnmi_return
.Lnmi_from_sysenter_stack: @@ -1165,6 +1175,7 @@ SYM_CODE_START(asm_exc_nmi)
CHECK_AND_APPLY_ESPFIX RESTORE_ALL_NMI cr3_reg=%edi pop=4 + CLEAR_CPU_BUFFERS_SAFE jmp .Lirq_return
#ifdef CONFIG_X86_ESPFIX32 @@ -1206,6 +1217,7 @@ SYM_CODE_START(asm_exc_nmi) * 1 - orig_ax */ lss (1+5+6)*4(%esp), %esp # back to espfix stack + CLEAR_CPU_BUFFERS_SAFE jmp .Lirq_return #endif SYM_CODE_END(asm_exc_nmi)
--- base-commit: f2661062f16b2de5d7b6a5c42a9a5c96326b8454 change-id: 20240426-fix-dosemu-vm86-dd111a01737e