[ This is the non-RFC version.
It went through and passed all my tests. If there's no objections I'm going to include this in my pull request. I still have patches in my INBOX that may still be included, so I need to run those through my tests as well, so a pull request wont be immediate. ]
Nicolai Stange discovered that Live Kernel Patching can have unforseen consequences if tracing is enabled when there are functions that are patched. The reason being, is that Live Kernel patching is built on top of ftrace, which will have the patched functions call the live kernel trampoline directly, and that trampoline will modify the regs->ip address to return to the patched function.
But in the transition between changing the call to the customized trampoline, the tracing code is needed to have its handler called an well, so the function fentry location must be changed from calling the live kernel patching trampoline, to the ftrace_reg_caller trampoline which will iterate through all the registered ftrace handlers for that function.
During this transition, a break point is added to do the live code modifications. But if that break point is hit, it just skips calling any handler, and makes the call site act as a nop. For tracing, the worse that can happen is that you miss a function being traced, but for live kernel patching the affects are more severe, as the old buggy function is now called.
To solve this, an int3_emulate_call() is created for x86_64 to allow ftrace on x86_64 to emulate the call to ftrace_regs_caller() which will make sure all the registered handlers to that function are still called. And this keeps live kernel patching happy!
To mimimize the changes, and to avoid controversial patches, this only changes x86_64. Due to the way x86_32 implements the regs->sp the complexity of emulating calls on that platform is too much for stable patches, and live kernel patching does not support x86_32 anyway.
Josh Poimboeuf (1): x86_64: Add gap to int3 to allow for call emulation
Peter Zijlstra (2): x86_64: Allow breakpoints to emulate call instructions ftrace/x86_64: Emulate call function while updating in breakpoint handler
---- arch/x86/entry/entry_64.S | 18 ++++++++++++++++-- arch/x86/include/asm/text-patching.h | 28 ++++++++++++++++++++++++++++ arch/x86/kernel/ftrace.c | 32 +++++++++++++++++++++++++++----- 3 files changed, 71 insertions(+), 7 deletions(-)
From: Josh Poimboeuf jpoimboe@redhat.com
To allow an int3 handler to emulate a call instruction, it must be able to push a return address onto the stack. Add a gap to the stack to allow the int3 handler to push the return address and change the return from int3 to jump straight to the emulated called function target.
Link: http://lkml.kernel.org/r/20181130183917.hxmti5josgq4clti@treble Link: http://lkml.kernel.org/r/20190502162133.GX2623@hirez.programming.kicks-ass.n...
[ Note, this is needed to allow Live Kernel Patching to not miss calling a patched function when tracing is enabled. -- Steven Rostedt ]
Cc: stable@vger.kernel.org Fixes: b700e7f03df5 ("livepatch: kernel: add support for live patching") Signed-off-by: Josh Poimboeuf jpoimboe@redhat.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org --- arch/x86/entry/entry_64.S | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 1f0efdb7b629..27fcc6fbdd52 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -879,7 +879,7 @@ apicinterrupt IRQ_WORK_VECTOR irq_work_interrupt smp_irq_work_interrupt * @paranoid == 2 is special: the stub will never switch stacks. This is for * #DF: if the thread stack is somehow unusable, we'll still get a useful OOPS. */ -.macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1 +.macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1 create_gap=0 ENTRY(\sym) UNWIND_HINT_IRET_REGS offset=\has_error_code*8
@@ -899,6 +899,20 @@ ENTRY(\sym) jnz .Lfrom_usermode_switch_stack_@ .endif
+ .if \create_gap == 1 + /* + * If coming from kernel space, create a 6-word gap to allow the + * int3 handler to emulate a call instruction. + */ + testb $3, CS-ORIG_RAX(%rsp) + jnz .Lfrom_usermode_no_gap_@ + .rept 6 + pushq 5*8(%rsp) + .endr + UNWIND_HINT_IRET_REGS offset=8 +.Lfrom_usermode_no_gap_@: + .endif + .if \paranoid call paranoid_entry .else @@ -1130,7 +1144,7 @@ apicinterrupt3 HYPERV_STIMER0_VECTOR \ #endif /* CONFIG_HYPERV */
idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK -idtentry int3 do_int3 has_error_code=0 +idtentry int3 do_int3 has_error_code=0 create_gap=1 idtentry stack_segment do_stack_segment has_error_code=1
#ifdef CONFIG_XEN_PV
From: Peter Zijlstra peterz@infradead.org
In order to allow breakpoints to emulate call instructions, they need to push the return address onto the stack. The x86_64 int3 handler adds a small gap to allow the stack to grow some. Use this gap to add the return address to be able to emulate a call instruction at the breakpoint location.
These helper functions are added:
int3_emulate_jmp(): changes the location of the regs->ip to return there.
(The next two are only for x86_64) int3_emulate_push(): to push the address onto the gap in the stack int3_emulate_call(): push the return address and change regs->ip
Cc: Andy Lutomirski luto@kernel.org Cc: Nicolai Stange nstange@suse.de Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: "H. Peter Anvin" hpa@zytor.com Cc: the arch/x86 maintainers x86@kernel.org Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Jiri Kosina jikos@kernel.org Cc: Miroslav Benes mbenes@suse.cz Cc: Petr Mladek pmladek@suse.com Cc: Joe Lawrence joe.lawrence@redhat.com Cc: Shuah Khan shuah@kernel.org Cc: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Sebastian Andrzej Siewior bigeasy@linutronix.de Cc: Mimi Zohar zohar@linux.ibm.com Cc: Juergen Gross jgross@suse.com Cc: Nick Desaulniers ndesaulniers@google.com Cc: Nayna Jain nayna@linux.ibm.com Cc: Masahiro Yamada yamada.masahiro@socionext.com Cc: Joerg Roedel jroedel@suse.de Cc: "open list:KERNEL SELFTEST FRAMEWORK" linux-kselftest@vger.kernel.org Cc: stable@vger.kernel.org Fixes: b700e7f03df5 ("livepatch: kernel: add support for live patching") Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org [ Modified to only work for x86_64 and added comment to int3_emulate_push() ] Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org --- arch/x86/include/asm/text-patching.h | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h index e85ff65c43c3..05861cc08787 100644 --- a/arch/x86/include/asm/text-patching.h +++ b/arch/x86/include/asm/text-patching.h @@ -39,4 +39,32 @@ extern int poke_int3_handler(struct pt_regs *regs); extern void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler); extern int after_bootmem;
+static inline void int3_emulate_jmp(struct pt_regs *regs, unsigned long ip) +{ + regs->ip = ip; +} + +#define INT3_INSN_SIZE 1 +#define CALL_INSN_SIZE 5 + +#ifdef CONFIG_X86_64 +static inline void int3_emulate_push(struct pt_regs *regs, unsigned long val) +{ + /* + * The int3 handler in entry_64.S adds a gap between the + * stack where the break point happened, and the saving of + * pt_regs. We can extend the original stack because of + * this gap. See the idtentry macro's create_gap option. + */ + regs->sp -= sizeof(unsigned long); + *(unsigned long *)regs->sp = val; +} + +static inline void int3_emulate_call(struct pt_regs *regs, unsigned long func) +{ + int3_emulate_push(regs, regs->ip - INT3_INSN_SIZE + CALL_INSN_SIZE); + int3_emulate_jmp(regs, func); +} +#endif + #endif /* _ASM_X86_TEXT_PATCHING_H */
From: Peter Zijlstra peterz@infradead.org
Nicolai Stange discovered[1] that if live kernel patching is enabled, and the function tracer started tracing the same function that was patched, the conversion of the fentry call site during the translation of going from calling the live kernel patch trampoline to the iterator trampoline, would have as slight window where it didn't call anything.
As live kernel patching depends on ftrace to always call its code (to prevent the function being traced from being called, as it will redirect it). This small window would allow the old buggy function to be called, and this can cause undesirable results.
Nicolai submitted new patches[2] but these were controversial. As this is similar to the static call emulation issues that came up a while ago[3]. But after some debate[4][5] adding a gap in the stack when entering the breakpoint handler allows for pushing the return address onto the stack to easily emulate a call.
[1] http://lkml.kernel.org/r/20180726104029.7736-1-nstange@suse.de [2] http://lkml.kernel.org/r/20190427100639.15074-1-nstange@suse.de [3] http://lkml.kernel.org/r/3cf04e113d71c9f8e4be95fb84a510f085aa4afa.1541711457... [4] http://lkml.kernel.org/r/CAHk-=wh5OpheSU8Em_Q3Hg8qw_JtoijxOdPtHru6d+5K8TWM=A... [5] http://lkml.kernel.org/r/CAHk-=wjvQxY4DvPrJ6haPgAa6b906h=MwZXO6G8OtiTGe=N7_w...
[ Live kernel patching is not implemented on x86_32, thus the emulate calls are only for x86_64. ]
Cc: Andy Lutomirski luto@kernel.org Cc: Nicolai Stange nstange@suse.de Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@redhat.com Cc: Borislav Petkov bp@alien8.de Cc: "H. Peter Anvin" hpa@zytor.com Cc: the arch/x86 maintainers x86@kernel.org Cc: Josh Poimboeuf jpoimboe@redhat.com Cc: Jiri Kosina jikos@kernel.org Cc: Miroslav Benes mbenes@suse.cz Cc: Petr Mladek pmladek@suse.com Cc: Joe Lawrence joe.lawrence@redhat.com Cc: Shuah Khan shuah@kernel.org Cc: Konrad Rzeszutek Wilk konrad.wilk@oracle.com Cc: Tim Chen tim.c.chen@linux.intel.com Cc: Sebastian Andrzej Siewior bigeasy@linutronix.de Cc: Mimi Zohar zohar@linux.ibm.com Cc: Juergen Gross jgross@suse.com Cc: Nick Desaulniers ndesaulniers@google.com Cc: Nayna Jain nayna@linux.ibm.com Cc: Masahiro Yamada yamada.masahiro@socionext.com Cc: Joerg Roedel jroedel@suse.de Cc: "open list:KERNEL SELFTEST FRAMEWORK" linux-kselftest@vger.kernel.org Cc: stable@vger.kernel.org Fixes: b700e7f03df5 ("livepatch: kernel: add support for live patching") Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org [ Changed to only implement emulated calls for x86_64 ] Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org --- arch/x86/kernel/ftrace.c | 32 +++++++++++++++++++++++++++----- 1 file changed, 27 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c index ef49517f6bb2..bd553b3af22e 100644 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -29,6 +29,7 @@ #include <asm/kprobes.h> #include <asm/ftrace.h> #include <asm/nops.h> +#include <asm/text-patching.h>
#ifdef CONFIG_DYNAMIC_FTRACE
@@ -231,6 +232,7 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr, }
static unsigned long ftrace_update_func; +static unsigned long ftrace_update_func_call;
static int update_ftrace_func(unsigned long ip, void *new) { @@ -259,6 +261,8 @@ int ftrace_update_ftrace_func(ftrace_func_t func) unsigned char *new; int ret;
+ ftrace_update_func_call = (unsigned long)func; + new = ftrace_call_replace(ip, (unsigned long)func); ret = update_ftrace_func(ip, new);
@@ -294,13 +298,28 @@ int ftrace_int3_handler(struct pt_regs *regs) if (WARN_ON_ONCE(!regs)) return 0;
- ip = regs->ip - 1; - if (!ftrace_location(ip) && !is_ftrace_caller(ip)) - return 0; + ip = regs->ip - INT3_INSN_SIZE;
- regs->ip += MCOUNT_INSN_SIZE - 1; +#ifdef CONFIG_X86_64 + if (ftrace_location(ip)) { + int3_emulate_call(regs, (unsigned long)ftrace_regs_caller); + return 1; + } else if (is_ftrace_caller(ip)) { + if (!ftrace_update_func_call) { + int3_emulate_jmp(regs, ip + CALL_INSN_SIZE); + return 1; + } + int3_emulate_call(regs, ftrace_update_func_call); + return 1; + } +#else + if (ftrace_location(ip) || is_ftrace_caller(ip)) { + int3_emulate_jmp(regs, ip + CALL_INSN_SIZE); + return 1; + } +#endif
- return 1; + return 0; } NOKPROBE_SYMBOL(ftrace_int3_handler);
@@ -859,6 +878,8 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
func = ftrace_ops_get_func(ops);
+ ftrace_update_func_call = (unsigned long)func; + /* Do a safe modify in case the trampoline is executing */ new = ftrace_call_replace(ip, (unsigned long)func); ret = update_ftrace_func(ip, new); @@ -960,6 +981,7 @@ static int ftrace_mod_jmp(unsigned long ip, void *func) { unsigned char *new;
+ ftrace_update_func_call = 0UL; new = ftrace_jmp_replace(ip, (unsigned long)func);
return update_ftrace_func(ip, new);
On Tue, 07 May 2019 21:55:59 -0400 Steven Rostedt rostedt@goodmis.org wrote:
[ This is the non-RFC version.
It went through and passed all my tests. If there's no objections I'm going to include this in my pull request. I still have patches in my INBOX that may still be included, so I need to run those through my tests as well, so a pull request wont be immediate. ]
Nicolai Stange discovered that Live Kernel Patching can have unforseen consequences if tracing is enabled when there are functions that are patched. The reason being, is that Live Kernel patching is built on top of ftrace, which will have the patched functions call the live kernel trampoline directly, and that trampoline will modify the regs->ip address to return to the patched function.
But in the transition between changing the call to the customized trampoline, the tracing code is needed to have its handler called an well, so the function fentry location must be changed from calling the live kernel patching trampoline, to the ftrace_reg_caller trampoline which will iterate through all the registered ftrace handlers for that function.
During this transition, a break point is added to do the live code modifications. But if that break point is hit, it just skips calling any handler, and makes the call site act as a nop. For tracing, the worse that can happen is that you miss a function being traced, but for live kernel patching the affects are more severe, as the old buggy function is now called.
To solve this, an int3_emulate_call() is created for x86_64 to allow ftrace on x86_64 to emulate the call to ftrace_regs_caller() which will make sure all the registered handlers to that function are still called. And this keeps live kernel patching happy!
Out of curiosity, would you have any idea to re-use these function for other use-case? Maybe kprobes can reuse it, but very limited use-case.
To mimimize the changes, and to avoid controversial patches, this only changes x86_64. Due to the way x86_32 implements the regs->sp the complexity of emulating calls on that platform is too much for stable patches, and live kernel patching does not support x86_32 anyway.
This series looks good to me.
Reviewed-by: Masami Hiramatsu mhiramat@kernel.org
Thanks!
On Wed, 8 May 2019 13:30:22 +0900 Masami Hiramatsu mhiramat@kernel.org wrote:
To solve this, an int3_emulate_call() is created for x86_64 to allow ftrace on x86_64 to emulate the call to ftrace_regs_caller() which will make sure all the registered handlers to that function are still called. And this keeps live kernel patching happy!
Out of curiosity, would you have any idea to re-use these function for other use-case? Maybe kprobes can reuse it, but very limited use-case.
Yes, but only for x86_64.
To mimimize the changes, and to avoid controversial patches, this only changes x86_64. Due to the way x86_32 implements the regs->sp the complexity of emulating calls on that platform is too much for stable patches, and live kernel patching does not support x86_32 anyway.
This series looks good to me.
Reviewed-by: Masami Hiramatsu mhiramat@kernel.org
Thanks Masami!
-- Steve
Steven Rostedt rostedt@goodmis.org writes:
[ This is the non-RFC version.
It went through and passed all my tests. If there's no objections I'm going to include this in my pull request. I still have patches in my INBOX that may still be included, so I need to run those through my tests as well, so a pull request wont be immediate. ]
<snip />
Josh Poimboeuf (1): x86_64: Add gap to int3 to allow for call emulation
Peter Zijlstra (2): x86_64: Allow breakpoints to emulate call instructions ftrace/x86_64: Emulate call function while updating in breakpoint handler
Reviewed-and-tested-by: Nicolai Stange nstange@suse.de
for the whole series. Many, many thanks to everybody involved!
I'll resend that live patching selftest once this fix here has been merged.
Nicolai
linux-kselftest-mirror@lists.linaro.org