On Fri, Aug 05, 2022 at 04:04:38PM -0400, Paul Gortmaker wrote:
The panic comes from the sanity test code, but after trying to boil down the .config differences between the kitchen sink our test team uses, and a "defconfig", it seems there are at least a couple extra dependencies for creating a reproducer:
make defconfig
echo CONFIG_FUNCTION_TRACER=y >> .config
echo CONFIG_KPROBES_SANITY_TEST=y >> .config
echo CONFIG_UNWINDER_FRAME_POINTER=y >> .config
yes "" | make oldconfig
Note that ftrace is probably just opening the door to CONFIG_KPROBES_ON_FTRACE=y
The report I got was with gcc-11 on an Atom; I was able to reproduce it with the default gcc-7 found on Ubuntu 18.04 and booting on a Xeon v2 - so it seems to not be specific to gcc options or processor features.
I don't know if the v5.15 backports were specifically tested to be fully bisectable, but if we assume they are, a bisect between v5.15.56 and v5.15.57 says:
commit 1d61a2988612ac0632134454d5407c63ae0b9d42 (refs/bisect/bad)
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Tue Jun 14 23:15:45 2022 +0200

    x86: Use return-thunk in asm code

    commit aa3d480315ba6c3025a60958e1981072ea37c3df upstream.

    Use the return thunk in asm code. If the thunk isn't needed, it will
    get patched into a RET instruction during boot by apply_returns().
Splat follows:
rcu: Hierarchical SRCU implementation.
Kprobe smoke test: started
BUG: unable to handle page fault for address: ffffffffc110f3e7
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD b2c60f067 P4D b2c60f067 PUD b2c611067 PMD 0
Oops: 0010 [#1] SMP NOPTI
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.57 #33
Hardware name: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.02.06.E006.013120181511 01/31/2018
RIP: 0010:0xffffffffc110f3e7
Code: Unable to access opcode bytes at RIP 0xffffffffc110f3bd.
RSP: 0000:ffffae4bc006be38 EFLAGS: 00010246
RAX: ffffffffb973f310 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000005856e7bd
RBP: ffffae4bc006be60 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
R13: ffffffffbae38560 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8c92df800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffc110f3bd CR3: 0000000b2c60c001 CR4: 00000000001706f0
Call Trace:
 <TASK>
 ? kprobe_target+0x5/0x20
 ? init_test_probes+0x78/0x420
 init_kprobes+0x16c/0x18e
 ? init_optprobes+0x27/0x27
 do_one_initcall+0x43/0x1d0
 kernel_init_freeable+0xf1/0x240
 ? rest_init+0xd0/0xd0
 kernel_init+0x1a/0x120
 ret_from_fork+0x1f/0x30
 </TASK>
Modules linked in:
CR2: ffffffffc110f3e7
---[ end trace 759f040622219261 ]---
Can you try the patch below?
Thanks.
Cascardo.
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 74c2f88a43d0..6bb479ce1ae4 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -321,12 +321,12 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size)
 	unsigned long offset;
 	unsigned long npages;
 	unsigned long size;
-	unsigned long retq;
 	unsigned long *ptr;
 	void *trampoline;
 	void *ip;
 	/* 48 8b 15 <offset> is movq <offset>(%rip), %rdx */
 	unsigned const char op_ref[] = { 0x48, 0x8b, 0x15 };
+	unsigned const char retq[] = { RET_INSN_OPCODE, INT3_INSN_OPCODE };
 	union ftrace_op_code_union op_ptr;
 	int ret;

@@ -364,15 +364,10 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size)
 		goto fail;

 	ip = trampoline + size;
-
-	/* The trampoline ends with ret(q) */
-	retq = (unsigned long)ftrace_stub;
 	if (cpu_feature_enabled(X86_FEATURE_RETHUNK))
 		memcpy(ip, text_gen_insn(JMP32_INSN_OPCODE, ip, &__x86_return_thunk), JMP32_INSN_SIZE);
 	else
-		ret = copy_from_kernel_nofault(ip, (void *)retq, RET_SIZE);
-	if (WARN_ON(ret < 0))
-		goto fail;
+		memcpy(ip, retq, sizeof(retq));

 	/* No need to test direct calls on created trampolines */
 	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
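
For anyone reading along, here is a rough user-space sketch of the two trampoline endings the patched code chooses between: a 5-byte jmp rel32 to the return thunk, or a literal ret followed by int3. It is not kernel code. emit_trampoline_tail() and its parameters are hypothetical names made up for illustration, the opcode macros are redefined locally with the usual x86 encodings, and a little-endian host is assumed. The point it tries to show is that a rel32 jump is computed from the address where it will execute, so the bytes have to be generated for the destination rather than copied from somewhere else.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the x86 opcode values; names mirror the kernel macros. */
#define RET_INSN_OPCODE   0xC3
#define INT3_INSN_OPCODE  0xCC
#define JMP32_INSN_OPCODE 0xE9
#define JMP32_INSN_SIZE   5

/*
 * Hypothetical helper (not a kernel function): write the final
 * instruction of a trampoline into buf. "ip" is the address at which
 * those bytes will execute and "thunk" is the address of the return
 * thunk. Returns the number of bytes written.
 */
static size_t emit_trampoline_tail(uint8_t *buf, uint64_t ip,
				   uint64_t thunk, int use_rethunk)
{
	static const uint8_t retq[] = { RET_INSN_OPCODE, INT3_INSN_OPCODE };

	if (use_rethunk) {
		/*
		 * rel32 is relative to the end of the jmp instruction.
		 * Assumes the distance fits in 32 bits and that the host
		 * is little-endian, as x86 is.
		 */
		int32_t rel = (int32_t)(thunk - (ip + JMP32_INSN_SIZE));

		buf[0] = JMP32_INSN_OPCODE;
		memcpy(&buf[1], &rel, sizeof(rel));
		return JMP32_INSN_SIZE;
	}

	/* Plain return, padded with int3, written as literal bytes. */
	memcpy(buf, retq, sizeof(retq));
	return sizeof(retq);
}
```

With ip = 0x1000 and thunk = 0x2000, the thunk case yields e9 fb 0f 00 00 (0x2000 - 0x1005 = 0xffb), and the plain case yields c3 cc.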