On Thu, 9 May 2019 19:14:16 +0200 Peter Zijlstra peterz@infradead.org wrote:
On Thu, May 09, 2019 at 11:01:06PM +0900, Masami Hiramatsu wrote:
On Thu, 9 May 2019 10:14:31 +0200 Peter Zijlstra peterz@infradead.org wrote:
But what I'd love to do is something like the belwo patch, and make all the trampolines (very much including ftrace) use that. Such that we then only have 1 copy of this magic (well, 2 because x86_64 also needs an implementation of this of course).
OK, but I will make kretprobe integrated with func-graph tracer, since it is inefficient that we have 2 different hidden return stack...
Anyway,
Changing ftrace over to this would be a little more work but it can easily chain things a little to get its original context back:
ENTRY(ftrace_regs_caller) GLOBAL(ftrace_regs_func) push ftrace_stub push ftrace_regs_handler jmp call_to_exception_trampoline END(ftrace_regs_caller)
typedef void (*ftrace_func_t)(unsigned long, unsigned long, struct ftrace_op *, struct pt_regs *);
struct ftrace_regs_stack { ftrace_func_t func; unsigned long parent_ip; };
void ftrace_regs_handler(struct pr_regs *regs) { struct ftrace_regs_stack *st = (void *)regs->sp; ftrace_func_t func = st->func;
regs->sp += sizeof(long); /* pop func */
Sorry, why pop here?
Otherwise it stays on the return stack and bad things happen. Note how the below trampoline thing uses regs->sp.
func(regs->ip, st->parent_ip, function_trace_op, regs); }
Hmm? I didn't look into the function_graph thing, but I imagine it can be added without too much pain.
Yes, that should be good for function_graph trampoline too. We use very similar technic.
Ideally also the optimized kprobe trampoline, but I've not managed to fully comprehend that one.
As you pointed in other reply, save/restore can be a macro, but each trampoline code is slightly different. Optprobe template has below parts
(jumped from probed address) [store regs] [setup function arguments (pt_regs and probed address)] [handler call] [restore regs] [execute copied instruction] [jump back to probed address]
Note that there is a limitation that if it is optiomized probe, user handler can not change regs->ip. (we can not use "ret" after executed a copied instruction, which must run on same stack)
--- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -1576,3 +1576,100 @@ ENTRY(rewind_stack_do_exit) call do_exit 1: jmp 1b END(rewind_stack_do_exit)
+/*
- Transforms a CALL frame into an exception frame; IOW it pretends the CALL we
- just did was in fact scribbled with an INT3.
- Use this trampoline like:
- PUSH $func
- JMP call_to_exception_trampoline
- $func will see regs->ip point at the CALL instruction and must therefore
- modify regs->ip in order to make progress (just like a normal INT3 scribbled
- CALL).
- NOTE: we do not restore any of the segment registers.
- */
+ENTRY(call_to_exception_trampoline)
- /*
* On entry the stack looks like:
*
* 2*4(%esp) <previous context>
* 1*4(%esp) RET-IP
* 0*4(%esp) func
*
* transform this into:
*
* 19*4(%esp) <previous context>
* 18*4(%esp) gap / RET-IP
* 17*4(%esp) gap / func
* 16*4(%esp) ss
* 15*415*4(%esp) sp / <previous context>
isn't this "&<previous context>" ?
Yes.
* 14*4(%esp) flags
* 13*4(%esp) cs
* 12*4(%esp) ip / RET-IP
* 11*4(%esp) orig_eax
* 10*4(%esp) gs
* 9*4(%esp) fs
* 8*4(%esp) es
* 7*4(%esp) ds
* 6*4(%esp) eax
* 5*4(%esp) ebp
* 4*4(%esp) edi
* 3*4(%esp) esi
* 2*4(%esp) edx
* 1*4(%esp) ecx
* 0*4(%esp) ebx
*/
- pushl %ss
- pushl %esp # points at ss
- addl $3*4, (%esp) # point it at <previous context>
- pushfl
- pushl %cs
- pushl 5*4(%esp) # RET-IP
- subl 5, (%esp) # point at CALL instruction
- pushl $-1
- pushl %gs
- pushl %fs
- pushl %es
- pushl %ds
- pushl %eax
- pushl %ebp
- pushl %edi
- pushl %esi
- pushl %edx
- pushl %ecx
- pushl %ebx
- ENCODE_FRAME_POINTER
- movl %esp, %eax # 1st argument: pt_regs
- movl 17*4(%esp), %ebx # func
- CALL_NOSPEC %ebx
- movl PT_OLDESP(%esp), %eax
Is PT_OLDESP(%esp) "<previous context>" or "&<previous contex>"?
The latter.
- movl PT_EIP(%esp), %ecx
- movl %ecx, -1*4(%eax)
Ah, OK, so $func must set the true return address to regs->ip instead of returning it.
Just so.
- movl PT_EFLAGS(%esp), %ecx
- movl %ecx, -2*4(%eax)
- movl PT_EAX(%esp), %ecx
- movl %ecx, -3*4(%eax)
So, at this point, the stack becomes
3*4(%esp) ®s->sp 2*4(%esp) RET-IP 1*4(%esp) eflags 0*4(%esp) eax
Correct?
Yes, relative to regs->sp, which is why we need to pop 'func', otherwise it stays on the stack.
- popl %ebx
- popl %ecx
- popl %edx
- popl %esi
- popl %edi
- popl %ebp
- lea -3*4(%eax), %esp
- popl %eax
- popfl
- ret
+END(call_to_exception_trampoline) --- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -731,29 +731,8 @@ asm( ".global kretprobe_trampoline\n" ".type kretprobe_trampoline, @function\n" "kretprobe_trampoline:\n"
- /* We don't bother saving the ss register */
-#ifdef CONFIG_X86_64
- " pushq %rsp\n"
- " pushfq\n"
- SAVE_REGS_STRING
- " movq %rsp, %rdi\n"
- " call trampoline_handler\n"
- /* Replace saved sp with true return address. */
- " movq %rax, 19*8(%rsp)\n"
- RESTORE_REGS_STRING
- " popfq\n"
-#else
- " pushl %esp\n"
- " pushfl\n"
- SAVE_REGS_STRING
- " movl %esp, %eax\n"
- " call trampoline_handler\n"
- /* Replace saved sp with true return address. */
- " movl %eax, 15*4(%esp)\n"
- RESTORE_REGS_STRING
- " popfl\n"
-#endif
- " ret\n"
Here, we need a gap for storing ret-ip, because kretprobe_trampoline is the address which is returned from the target function. We have no "ret-ip" here at this point. So something like
- "push $0\n" /* This is a gap, will be filled with real return address*/
The trampoline already provides a gap, trampoline_handler() will need to use int3_emulate_push() if it wants to inject something on the return stack.
I guess you mean the int3 case. This trampoline is used as a return destination. When the target function is called, kretprobe interrupts the first instruction, and replace the return address with this trampoline. When a "ret" instruction is done, it returns to this trampoline. Thus the stack frame start with previous context here. As you described above,
* On entry the stack looks like:
*
* 2*4(%esp) <previous context>
* 1*4(%esp) RET-IP
* 0*4(%esp) func
From this trampoline call, the stack looks like:
* 1*4(%esp) <previous context> * 0*4(%esp) func
So we need one more push.
- "push trampoline_handler\n"
- "jmp call_to_exception_trampoline\n" ".size kretprobe_trampoline, .-kretprobe_trampoline\n"
); NOKPROBE_SYMBOL(kretprobe_trampoline);
Thank you,