 
            Hi Ingo,
Here is v3 series of fixing kretprobe incorrect stacking order patches.
I found that this series did not merged yet, so I just ported it on the tip tree.
In this version, I just added Steve's Ack and sort tags. Please push it as an urgent fix.
(1) kprobe incorrct stacking order problem
On recent talk with Andrea, I started more precise investigation on the kernel panic with kretprobes on notrace functions, which Francis had been reported last year ( https://lkml.org/lkml/2017/7/14/466 ).
See the investigation details in https://lkml.kernel.org/r/154686789378.15479.2886543882215785247.stgit@devbo...
When we put a kretprobe on ftrace_ops_assist_func() and put another kretprobe on probed-function, below happens
<caller> -><probed-function> ->fentry ->ftrace_ops_assist_func() ->int3 ->kprobe_int3_handler() ...->pre_handler_kretprobe() push the return address (*fentry*) of ftrace_ops_assist_func() to top of the kretprobe list and replace it with kretprobe_trampoline. <-kprobe_int3_handler() <-(int3) ->kprobe_ftrace_handler() ...->pre_handler_kretprobe() push the return address (caller) of probed-function to top of the kretprobe list and replace it with kretprobe_trampoline. <-(kprobe_ftrace_handler()) <-(ftrace_ops_assist_func()) [kretprobe_trampoline] ->tampoline_handler() pop the return address (caller) from top of the kretprobe list <-(trampoline_handler()) <caller> [run caller with incorrect stack information] <-(<caller>) !!KERNEL PANIC!!
Therefore, this kernel panic happens only when we put 2 k*ret*probes on ftrace_ops_assist_func() and other functions. If we put kprobes, it doesn't cause any issue, since it doesn't change the return address.
To fix (or just avoid) this issue, we can introduce a frame pointer verification to skip wrong order entries. And I also would like to blacklist those functions because those are part of ftrace-based kprobe handling routine.
(2) kretprobe trampoline recursion problem
This was found by Andrea in the previous thread https://lkml.kernel.org/r/20190107183444.GA5966@xps-13
---- echo "r:event_1 __fdget" >> kprobe_events echo "r:event_2 _raw_spin_lock_irqsave" >> kprobe_events echo 1 > events/kprobes/enable [DEADLOCK] ----
Because kretprobe trampoline_handler uses spinlock for protecting hash table, if we probe the spinlock itself, it causes deadlock. Thank you Andrea and Steve for discovering this root cause!!
This bug has been introduced with the asm-coded trampoline code, since previously it used another kprobe for hooking the function return placeholder (which only has a nop) and trampoline handler was called from that kprobe.
To fix this bug, I introduced a dummy kprobe and set it in current_kprobe as we did in old days.
Thank you,
---
Masami Hiramatsu (3): x86/kprobes: Verify stack frame on kretprobe kprobes: Mark ftrace mcount handler functions nokprobe x86/kprobes: Fix to avoid kretprobe recursion
arch/x86/kernel/kprobes/core.c | 48 ++++++++++++++++++++++++++++++++++++++-- include/linux/kprobes.h | 1 + kernel/trace/ftrace.c | 6 ++++- 3 files changed, 52 insertions(+), 3 deletions(-)
-- Masami Hiramatsu (Linaro) mhiramat@kernel.org
 
            Verify the stack frame pointer on kretprobe trampoline handler, If the stack frame pointer does not match, it skips the wrong entry and tries to find correct one.
This can happen if user puts the kretprobe on the function which can be used in the path of ftrace user-function call. Such functions should not be probed, so this adds a warning message that reports which function should be blacklisted.
Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Tested-by: Andrea Righi righi.andrea@gmail.com Cc: stable@vger.kernel.org --- arch/x86/kernel/kprobes/core.c | 26 ++++++++++++++++++++++++++ include/linux/kprobes.h | 1 + 2 files changed, 27 insertions(+)
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c index a034cb808e7e..18fbe9be2d68 100644 --- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -569,6 +569,7 @@ void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs) unsigned long *sara = stack_addr(regs);
ri->ret_addr = (kprobe_opcode_t *) *sara; + ri->fp = sara;
/* Replace the return addr with trampoline addr */ *sara = (unsigned long) &kretprobe_trampoline; @@ -759,15 +760,21 @@ static __used void *trampoline_handler(struct pt_regs *regs) unsigned long flags, orig_ret_address = 0; unsigned long trampoline_address = (unsigned long)&kretprobe_trampoline; kprobe_opcode_t *correct_ret_addr = NULL; + void *frame_pointer; + bool skipped = false;
INIT_HLIST_HEAD(&empty_rp); kretprobe_hash_lock(current, &head, &flags); /* fixup registers */ #ifdef CONFIG_X86_64 regs->cs = __KERNEL_CS; + /* On x86-64, we use pt_regs->sp for return address holder. */ + frame_pointer = ®s->sp; #else regs->cs = __KERNEL_CS | get_kernel_rpl(); regs->gs = 0; + /* On x86-32, we use pt_regs->flags for return address holder. */ + frame_pointer = ®s->flags; #endif regs->ip = trampoline_address; regs->orig_ax = ~0UL; @@ -789,8 +796,25 @@ static __used void *trampoline_handler(struct pt_regs *regs) if (ri->task != current) /* another task is sharing our hash bucket */ continue; + /* + * Return probes must be pushed on this hash list correct + * order (same as return order) so that it can be poped + * correctly. However, if we find it is pushed it incorrect + * order, this means we find a function which should not be + * probed, because the wrong order entry is pushed on the + * path of processing other kretprobe itself. + */ + if (ri->fp != frame_pointer) { + if (!skipped) + pr_warn("kretprobe is stacked incorrectly. Trying to fixup.\n"); + skipped = true; + continue; + }
orig_ret_address = (unsigned long)ri->ret_addr; + if (skipped) + pr_warn("%ps must be blacklisted because of incorrect kretprobe order\n", + ri->rp->kp.addr);
if (orig_ret_address != trampoline_address) /* @@ -808,6 +832,8 @@ static __used void *trampoline_handler(struct pt_regs *regs) if (ri->task != current) /* another task is sharing our hash bucket */ continue; + if (ri->fp != frame_pointer) + continue;
orig_ret_address = (unsigned long)ri->ret_addr; if (ri->rp && ri->rp->handler) { diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h index 201f0f2683f2..9a897256e481 100644 --- a/include/linux/kprobes.h +++ b/include/linux/kprobes.h @@ -173,6 +173,7 @@ struct kretprobe_instance { struct kretprobe *rp; kprobe_opcode_t *ret_addr; struct task_struct *task; + void *fp; char data[0]; };
 
            Mark ftrace mcount handler functions nokprobe since probing on these functions with kretprobe pushes return address incorrectly on kretprobe shadow stack.
Reported-by: Francis Deslauriers francis.deslauriers@efficios.com Tested-by: Andrea Righi righi.andrea@gmail.com Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Acked-by: Steven Rostedt (VMware) rostedt@goodmis.org Cc: stable@vger.kernel.org --- - Changes in v2: Fix to include kprobes.h (Thanks Andrea!) --- kernel/trace/ftrace.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index aac7847c0214..f546ae5102e0 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -33,6 +33,7 @@ #include <linux/list.h> #include <linux/hash.h> #include <linux/rcupdate.h> +#include <linux/kprobes.h>
#include <trace/events/sched.h>
@@ -6216,7 +6217,7 @@ void ftrace_reset_array_ops(struct trace_array *tr) tr->ops->func = ftrace_stub; }
-static inline void +static nokprobe_inline void __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip, struct ftrace_ops *ignored, struct pt_regs *regs) { @@ -6276,11 +6277,13 @@ static void ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip, { __ftrace_ops_list_func(ip, parent_ip, NULL, regs); } +NOKPROBE_SYMBOL(ftrace_ops_list_func); #else static void ftrace_ops_no_ops(unsigned long ip, unsigned long parent_ip) { __ftrace_ops_list_func(ip, parent_ip, NULL, NULL); } +NOKPROBE_SYMBOL(ftrace_ops_no_ops); #endif
/* @@ -6307,6 +6310,7 @@ static void ftrace_ops_assist_func(unsigned long ip, unsigned long parent_ip, preempt_enable_notrace(); trace_clear_recursion(bit); } +NOKPROBE_SYMBOL(ftrace_ops_assist_func);
/** * ftrace_ops_get_func - get the function a trampoline should call
 
            Fix to avoid kretprobe recursion loop by setting a dummy kprobes to current_kprobe per-cpu variable.
This bug has been introduced with the asm-coded trampoline code, since previously it used another kprobe for hooking the function return placeholder (which only has a nop) and trampoline handler was called from that kprobe.
This revives the old lost kprobe again.
With this fix, we don't see deadlock anymore.
# echo "r:event_1 __fdget" >> kprobe_events # echo "r:event_2 _raw_spin_lock_irqsave" >> kprobe_events # echo 1 > events/kprobes/enable
And you can see that all inner-called kretprobe are skipped.
# cat kprobe_profile event_1 235 0 event_2 19375 19612
The 1st column is recorded count and the 2nd is missed count. Above shows (event_1 rec) + (event_2 rec) ~= (event_2 missed) (some difference are here because the counter is racy)
Fixes: c9becf58d935 ("[PATCH] kretprobe: kretprobe-booster") Reported-by: Andrea Righi righi.andrea@gmail.com Tested-by: Andrea Righi righi.andrea@gmail.com Signed-off-by: Masami Hiramatsu mhiramat@kernel.org Cc: stable@vger.kernel.org --- arch/x86/kernel/kprobes/core.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c index 18fbe9be2d68..fed46ddb1eef 100644 --- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -749,11 +749,16 @@ asm( NOKPROBE_SYMBOL(kretprobe_trampoline); STACK_FRAME_NON_STANDARD(kretprobe_trampoline);
+static struct kprobe kretprobe_kprobe = { + .addr = (void *)kretprobe_trampoline, +}; + /* * Called from kretprobe_trampoline */ static __used void *trampoline_handler(struct pt_regs *regs) { + struct kprobe_ctlblk *kcb; struct kretprobe_instance *ri = NULL; struct hlist_head *head, empty_rp; struct hlist_node *tmp; @@ -763,6 +768,17 @@ static __used void *trampoline_handler(struct pt_regs *regs) void *frame_pointer; bool skipped = false;
+ preempt_disable(); + + /* + * Set a dummy kprobe for avoiding kretprobe recursion. + * Since kretprobe never run in kprobe handler, kprobe must not + * be running at this point. + */ + kcb = get_kprobe_ctlblk(); + __this_cpu_write(current_kprobe, &kretprobe_kprobe); + kcb->kprobe_status = KPROBE_HIT_ACTIVE; + INIT_HLIST_HEAD(&empty_rp); kretprobe_hash_lock(current, &head, &flags); /* fixup registers */ @@ -838,10 +854,9 @@ static __used void *trampoline_handler(struct pt_regs *regs) orig_ret_address = (unsigned long)ri->ret_addr; if (ri->rp && ri->rp->handler) { __this_cpu_write(current_kprobe, &ri->rp->kp); - get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE; ri->ret_addr = correct_ret_addr; ri->rp->handler(ri, regs); - __this_cpu_write(current_kprobe, NULL); + __this_cpu_write(current_kprobe, &kretprobe_kprobe); }
recycle_rp_inst(ri, &empty_rp); @@ -857,6 +872,9 @@ static __used void *trampoline_handler(struct pt_regs *regs)
kretprobe_hash_unlock(current, &flags);
+ __this_cpu_write(current_kprobe, NULL); + preempt_enable(); + hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) { hlist_del(&ri->hlist); kfree(ri);
 
            Hi Ingo,
I was looking through at urgent patches that need to go upstream (as I'm working on a couple of patches now), and saw this series. It doesn't look like it was applied yet.
Do you want to take this, or would you want me to? It touches arch/x86, so I'll take it if one of the x86 maintainers acks it. Or you can take it as I already acked the generic patch.
-- Steve
On Sun, 24 Feb 2019 01:49:22 +0900 Masami Hiramatsu mhiramat@kernel.org wrote:
Hi Ingo,
Here is v3 series of fixing kretprobe incorrect stacking order patches.
I found that this series did not merged yet, so I just ported it on the tip tree.
In this version, I just added Steve's Ack and sort tags. Please push it as an urgent fix.
(1) kprobe incorrct stacking order problem
On recent talk with Andrea, I started more precise investigation on the kernel panic with kretprobes on notrace functions, which Francis had been reported last year ( https://lkml.org/lkml/2017/7/14/466 ).
See the investigation details in https://lkml.kernel.org/r/154686789378.15479.2886543882215785247.stgit@devbo...
When we put a kretprobe on ftrace_ops_assist_func() and put another kretprobe on probed-function, below happens
<caller> -><probed-function> ->fentry ->ftrace_ops_assist_func() ->int3 ->kprobe_int3_handler() ...->pre_handler_kretprobe() push the return address (*fentry*) of ftrace_ops_assist_func() to top of the kretprobe list and replace it with kretprobe_trampoline. <-kprobe_int3_handler() <-(int3) ->kprobe_ftrace_handler() ...->pre_handler_kretprobe() push the return address (caller) of probed-function to top of the kretprobe list and replace it with kretprobe_trampoline. <-(kprobe_ftrace_handler()) <-(ftrace_ops_assist_func()) [kretprobe_trampoline] ->tampoline_handler() pop the return address (caller) from top of the kretprobe list <-(trampoline_handler()) <caller> [run caller with incorrect stack information] <-(<caller>) !!KERNEL PANIC!!
Therefore, this kernel panic happens only when we put 2 k*ret*probes on ftrace_ops_assist_func() and other functions. If we put kprobes, it doesn't cause any issue, since it doesn't change the return address.
To fix (or just avoid) this issue, we can introduce a frame pointer verification to skip wrong order entries. And I also would like to blacklist those functions because those are part of ftrace-based kprobe handling routine.
(2) kretprobe trampoline recursion problem
This was found by Andrea in the previous thread https://lkml.kernel.org/r/20190107183444.GA5966@xps-13
echo "r:event_1 __fdget" >> kprobe_events echo "r:event_2 _raw_spin_lock_irqsave" >> kprobe_events echo 1 > events/kprobes/enable [DEADLOCK]
Because kretprobe trampoline_handler uses spinlock for protecting hash table, if we probe the spinlock itself, it causes deadlock. Thank you Andrea and Steve for discovering this root cause!!
This bug has been introduced with the asm-coded trampoline code, since previously it used another kprobe for hooking the function return placeholder (which only has a nop) and trampoline handler was called from that kprobe.
To fix this bug, I introduced a dummy kprobe and set it in current_kprobe as we did in old days.
Thank you,
Masami Hiramatsu (3): x86/kprobes: Verify stack frame on kretprobe kprobes: Mark ftrace mcount handler functions nokprobe x86/kprobes: Fix to avoid kretprobe recursion
arch/x86/kernel/kprobes/core.c | 48 ++++++++++++++++++++++++++++++++++++++-- include/linux/kprobes.h | 1 + kernel/trace/ftrace.c | 6 ++++- 3 files changed, 52 insertions(+), 3 deletions(-)
-- Masami Hiramatsu (Linaro) mhiramat@kernel.org
linux-stable-mirror@lists.linaro.org

