On May 3, 2019, at 6:22 AM, Steven Rostedt rostedt@goodmis.org wrote:
On Fri, 3 May 2019 11:29:59 +0200 Peter Zijlstra peterz@infradead.org wrote:
OMG, WTF, ARGH... That code is fsck'ing horrible. I'd almost argue to always do the INT3 thing, just to avoid games like that.
Hehe, that's almost the exact same thoughts I had when seeing this code ;-)
That said; for normal traps ®s->sp is indeed the previous context -- if it doesn't fall off the stack. Your hack detects the regular INT3 frame. Howver if regs->sp has been modified (int3_emulate_push, for example) your detectoring comes unstuck.
Yep. I realized the issue as well. But wanted to make sure this did work when sp wasn't changed.
Now, it is rather unlikely these two code paths interact, but just to be safe, something like so might be more reliable:
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c index 4b8ee05dd6ad..aceaad0cc9a9 100644 --- a/arch/x86/kernel/ptrace.c +++ b/arch/x86/kernel/ptrace.c @@ -163,6 +163,9 @@ static inline bool invalid_selector(u16 value)
- stack pointer we fall back to regs as stack if no previous stack
- exists.
- There is a special case for INT3, there we construct a full pt_regs
- environment. We can detect this case by a high bit in regs->cs
- This is valid only for kernel mode traps.
*/ unsigned long kernel_stack_pointer(struct pt_regs *regs) @@ -171,6 +174,9 @@ unsigned long kernel_stack_pointer(struct pt_regs *regs) unsigned long sp = (unsigned long)®s->sp; u32 *prev_esp;
- if (regs->__csh & (1 << 13)) /* test CS_FROM_INT3 */
return regs->sp;
Thanks, I was looking into doing something like this (setting a flag in the int3 code), but didn't have the time to see the best way to do this.
I'll add this version of the code and run it through my tests.
-- Steve
if (context == (sp & ~(THREAD_SIZE - 1))) return sp;
--- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -388,6 +388,7 @@
#define CS_FROM_ENTRY_STACK (1 << 31) #define CS_FROM_USER_CR3 (1 << 30) +#define CS_FROM_INT3 (1 << 29)
.macro SWITCH_TO_KERNEL_STACK
@@ -1515,6 +1516,9 @@ ENTRY(int3)
add $16, 12(%esp) # point sp back at the previous context
andl $0x0000ffff, 4(%esp)
orl $CS_FROM_INT3, 4(%esp)
pushl $-1 # orig_eax; mark as interrupt
SAVE_ALL
So here’s a somewhat nutty suggestion: how about we tweak the 32-bit entry code to emulate the sane 64-bit frame, not just for int3 but always? Basically, the entry asm for entries from kernel mode would do, roughly:
push $0 ;dummy for call emulation push %ss push $0 ;a dummy for ESP push 3*4(%esp) ;EFLAGS push 3*4(%esp) ;CS push 3*4(%esp) ;EIP push %rax lea 7*4(%esp), %rax mov %rax, 4*4(%esp) ;ESP
And the exit asm would do a little dance to write EFLAGS, CS, and EIP to the right spot, then load ESP-3*4 into %esp and do IRET.
Now the annoying kernel_stack_pointer() hack can just go away, since regs->sp is always correct!
I probably screwed up some arithmetic there, but it’s the idea that counts :)