These patches address some issues I spotted while looking at kprobes and uprobes.
Patch 1 is the most pressing, as a uprobes user can trigger a kernel BUG(). Patches 2 and 3 fix latent endianness bugs which only manifest on big-endian kernels, and patchs 4-6 clean things up so that it's harder to get this wrong again in future.
Mark.
Mark Rutland (6): arm64: probes: Remove broken LDR (literal) uprobe support arm64: probes: Fix simulate_ldr*_literal() arm64: probes: Fix uprobes for big-endian kernels arm64: probes: Move kprobes-specific fields arm64: probes: Cleanup kprobes endianness conversions arm64: probes: Remove probe_opcode_t
arch/arm64/include/asm/probes.h | 11 +++---- arch/arm64/include/asm/uprobes.h | 8 ++--- arch/arm64/kernel/probes/decode-insn.c | 22 ++++++++----- arch/arm64/kernel/probes/decode-insn.h | 2 +- arch/arm64/kernel/probes/kprobes.c | 39 ++++++++++++------------ arch/arm64/kernel/probes/simulate-insn.c | 18 +++++------ arch/arm64/kernel/probes/uprobes.c | 8 ++--- 7 files changed, 53 insertions(+), 55 deletions(-)
The simulate_ldr_literal() and simulate_ldrsw_literal() functions are unsafe to use for uprobes. Both functions were originally written for use with kprobes, and access memory with plain C accesses. When uprobes was added, these were reused unmodified even though they cannot safely access user memory.
There are three key problems:
1) The plain C accesses do not have corresponding extable entries, and thus if they encounter a fault the kernel will treat these as unintentional accesses to user memory, resulting in a BUG() which will kill the kernel thread, and likely lead to further issues (e.g. lockup or panic()).
2) The plain C accesses are subject to HW PAN and SW PAN, and so when either is in use, any attempt to simulate an access to user memory will fault. Thus neither simulate_ldr_literal() nor simulate_ldrsw_literal() can do anything useful when simulating a user instruction on any system with HW PAN or SW PAN.
3) The plain C accesses are privileged, as they run in kernel context, and in practice can access a small range of kernel virtual addresses. The instructions they simulate have a range of +/-1MiB, and since the simulated instructions must itself be a user instructions in the TTBR0 address range, these can address the final 1MiB of the TTBR1 acddress range by wrapping downwards from an address in the first 1MiB of the TTBR0 address range.
In contemporary kernels the last 8MiB of TTBR1 address range is reserved, and accesses to this will always fault, meaning this is no worse than (1).
Historically, it was theoretically possible for the linear map or vmemmap to spill into the final 8MiB of the TTBR1 address range, but in practice this is extremely unlikely to occur as this would require either:
* Having enough physical memory to fill the entire linear map all the way to the final 1MiB of the TTBR1 address range.
* Getting unlucky with KASLR randomization of the linear map such that the populated region happens to overlap with the last 1MiB of the TTBR address range.
... and in either case if we were to spill into the final page there would be larger problems as the final page would alias with error pointers.
Practically speaking, (1) and (2) are the big issues. Given there have been no reports of problems since the broken code was introduced, it appears that no-one is relying on probing these instructions with uprobes.
Avoid these issues by not allowing uprobes on LDR (literal) and LDRSW (literal), limiting the use of simulate_ldr_literal() and simulate_ldrsw_literal() to kprobes. Attempts to place uprobes on LDR (literal) and LDRSW (literal) will be rejected as arm_probe_decode_insn() will return INSN_REJECTED. In future we can consider introducing working uprobes support for these instructions, but this will require more significant work.
Fixes: 9842ceae9fa8deae ("arm64: Add uprobe support") Cc: stable@vger.kernel.org Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org --- arch/arm64/kernel/probes/decode-insn.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kernel/probes/decode-insn.c b/arch/arm64/kernel/probes/decode-insn.c index 968d5fffe2330..3496d6169e59b 100644 --- a/arch/arm64/kernel/probes/decode-insn.c +++ b/arch/arm64/kernel/probes/decode-insn.c @@ -99,10 +99,6 @@ arm_probe_decode_insn(probe_opcode_t insn, struct arch_probe_insn *api) aarch64_insn_is_blr(insn) || aarch64_insn_is_ret(insn)) { api->handler = simulate_br_blr_ret; - } else if (aarch64_insn_is_ldr_lit(insn)) { - api->handler = simulate_ldr_literal; - } else if (aarch64_insn_is_ldrsw_lit(insn)) { - api->handler = simulate_ldrsw_literal; } else { /* * Instruction cannot be stepped out-of-line and we don't @@ -140,6 +136,17 @@ arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi) probe_opcode_t insn = le32_to_cpu(*addr); probe_opcode_t *scan_end = NULL; unsigned long size = 0, offset = 0; + struct arch_probe_insn *api = &asi->api; + + if (aarch64_insn_is_ldr_lit(insn)) { + api->handler = simulate_ldr_literal; + decoded = INSN_GOOD_NO_SLOT; + } else if (aarch64_insn_is_ldrsw_lit(insn)) { + api->handler = simulate_ldrsw_literal; + decoded = INSN_GOOD_NO_SLOT; + } else { + decoded = arm_probe_decode_insn(insn, &asi->api); + }
/* * If there's a symbol defined in front of and near enough to @@ -157,7 +164,6 @@ arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi) else scan_end = addr - MAX_ATOMIC_CONTEXT_SIZE; } - decoded = arm_probe_decode_insn(insn, &asi->api);
if (decoded != INSN_REJECTED && scan_end) if (is_probed_address_atomic(addr - 1, scan_end))
The simulate_ldr_literal() code always loads a 64-bit quantity, and when simulating a 32-bit load into a 'W' register, it discards the most significant 32 bits. For big-endian kernels this means that the relevant bits are discarded, and the value returned is the the subsequent 32 bits in memory (i.e. the value at addr + 4).
Additionally, simulate_ldr_literal() and simulate_ldrsw_literal() use a plain C load, which the compiler may tear or elide (e.g. if the target is the zero register). Today this doesn't happen to matter, but it may matter in future if trampoline code uses a LDR (literal) or LDRSW (literal).
Update simulate_ldr_literal() and simulate_ldrsw_literal() to use an appropriately-sized READ_ONCE() to perform the access, which avoids these problems.
Fixes: 39a67d49ba353630 ("arm64: kprobes instruction simulation support") Cc: stable@vger.kernel.org Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org --- arch/arm64/kernel/probes/simulate-insn.c | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/kernel/probes/simulate-insn.c b/arch/arm64/kernel/probes/simulate-insn.c index 22d0b32524763..b65334ab79d2b 100644 --- a/arch/arm64/kernel/probes/simulate-insn.c +++ b/arch/arm64/kernel/probes/simulate-insn.c @@ -171,17 +171,15 @@ simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs) void __kprobes simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs) { - u64 *load_addr; + unsigned long load_addr; int xn = opcode & 0x1f; - int disp;
- disp = ldr_displacement(opcode); - load_addr = (u64 *) (addr + disp); + load_addr = addr + ldr_displacement(opcode);
if (opcode & (1 << 30)) /* x0-x30 */ - set_x_reg(regs, xn, *load_addr); + set_x_reg(regs, xn, READ_ONCE(*(u64 *)load_addr)); else /* w0-w30 */ - set_w_reg(regs, xn, *load_addr); + set_w_reg(regs, xn, READ_ONCE(*(u32 *)load_addr));
instruction_pointer_set(regs, instruction_pointer(regs) + 4); } @@ -189,14 +187,12 @@ simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs) void __kprobes simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs) { - s32 *load_addr; + unsigned long load_addr; int xn = opcode & 0x1f; - int disp;
- disp = ldr_displacement(opcode); - load_addr = (s32 *) (addr + disp); + load_addr = addr + ldr_displacement(opcode);
- set_x_reg(regs, xn, *load_addr); + set_x_reg(regs, xn, READ_ONCE(*(s32 *)load_addr));
instruction_pointer_set(regs, instruction_pointer(regs) + 4); }
The arm64 uprobes code is broken for big-endian kernels as it doesn't convert the in-memory instruction encoding (which is always little-endian) into the kernel's native endianness before analyzing and simulating instructions. This may result in a few distinct problems:
* The kernel may may erroneously reject probing an instruction which can safely be probed.
* The kernel may erroneously erroneously permit stepping an instruction out-of-line when that instruction cannot be stepped out-of-line safely.
* The kernel may erroneously simulate instruction incorrectly dur to interpretting the byte-swapped encoding.
The endianness mismatch isn't caught by the compiler or sparse because:
* The arch_uprobe::{insn,ixol} fields are encoded as arrays of u8, so the compiler and sparse have no idea these contain a little-endian 32-bit value. The core uprobes code populates these with a memcpy() which similarly does not handle endianness.
* While the uprobe_opcode_t type is an alias for __le32, both arch_uprobe_analyze_insn() and arch_uprobe_skip_sstep() cast from u8[] to the similarly-named probe_opcode_t, which is an alias for u32. Hence there is no endianness conversion warning.
Fix this by changing the arch_uprobe::{insn,ixol} fields to __le32 and adding the appropriate __le32_to_cpu() conversions prior to consuming the instruction encoding. The core uprobes copies these fields as opaque ranges of bytes, and so is unaffected by this change.
At the same time, remove MAX_UINSN_BYTES and consistently use AARCH64_INSN_SIZE for clarity.
Tested with the following:
| #include <stdio.h> | #include <stdbool.h> | | #define noinline __attribute__((noinline)) | | static noinline void *adrp_self(void) | { | void *addr; | | asm volatile( | " adrp %x0, adrp_self\n" | " add %x0, %x0, :lo12:adrp_self\n" | : "=r" (addr)); | } | | | int main(int argc, char *argv) | { | void *ptr = adrp_self(); | bool equal = (ptr == adrp_self); | | printf("adrp_self => %p\n" | "adrp_self() => %p\n" | "%s\n", | adrp_self, ptr, equal ? "EQUAL" : "NOT EQUAL"); | | return 0; | }
.... where the adrp_self() function was compiled to:
| 00000000004007e0 <adrp_self>: | 4007e0: 90000000 adrp x0, 400000 <__ehdr_start> | 4007e4: 911f8000 add x0, x0, #0x7e0 | 4007e8: d65f03c0 ret
Before this patch, the ADRP is not recognized, and is assumed to be steppable, resulting in corruption of the result:
| # ./adrp-self | adrp_self => 0x4007e0 | adrp_self() => 0x4007e0 | EQUAL | # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events | # echo 1 > /sys/kernel/tracing/events/uprobes/enable | # ./adrp-self | adrp_self => 0x4007e0 | adrp_self() => 0xffffffffff7e0 | NOT EQUAL
After this patch, the ADRP is correctly recognized and simulated:
| # ./adrp-self | adrp_self => 0x4007e0 | adrp_self() => 0x4007e0 | EQUAL | # | # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events | # echo 1 > /sys/kernel/tracing/events/uprobes/enable | # ./adrp-self | adrp_self => 0x4007e0 | adrp_self() => 0x4007e0 | EQUAL
Fixes: 9842ceae9fa8deae ("arm64: Add uprobe support") Cc: stable@vger.kernel.org Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org --- arch/arm64/include/asm/uprobes.h | 8 +++----- arch/arm64/kernel/probes/uprobes.c | 4 ++-- 2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/include/asm/uprobes.h b/arch/arm64/include/asm/uprobes.h index 2b09495499c61..014b02897f8e2 100644 --- a/arch/arm64/include/asm/uprobes.h +++ b/arch/arm64/include/asm/uprobes.h @@ -10,11 +10,9 @@ #include <asm/insn.h> #include <asm/probes.h>
-#define MAX_UINSN_BYTES AARCH64_INSN_SIZE - #define UPROBE_SWBP_INSN cpu_to_le32(BRK64_OPCODE_UPROBES) #define UPROBE_SWBP_INSN_SIZE AARCH64_INSN_SIZE -#define UPROBE_XOL_SLOT_BYTES MAX_UINSN_BYTES +#define UPROBE_XOL_SLOT_BYTES AARCH64_INSN_SIZE
typedef __le32 uprobe_opcode_t;
@@ -23,8 +21,8 @@ struct arch_uprobe_task {
struct arch_uprobe { union { - u8 insn[MAX_UINSN_BYTES]; - u8 ixol[MAX_UINSN_BYTES]; + __le32 insn; + __le32 ixol; }; struct arch_probe_insn api; bool simulate; diff --git a/arch/arm64/kernel/probes/uprobes.c b/arch/arm64/kernel/probes/uprobes.c index d49aef2657cdf..a2f137a595fc1 100644 --- a/arch/arm64/kernel/probes/uprobes.c +++ b/arch/arm64/kernel/probes/uprobes.c @@ -42,7 +42,7 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, else if (!IS_ALIGNED(addr, AARCH64_INSN_SIZE)) return -EINVAL;
- insn = *(probe_opcode_t *)(&auprobe->insn[0]); + insn = le32_to_cpu(auprobe->insn);
switch (arm_probe_decode_insn(insn, &auprobe->api)) { case INSN_REJECTED: @@ -108,7 +108,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs) if (!auprobe->simulate) return false;
- insn = *(probe_opcode_t *)(&auprobe->insn[0]); + insn = le32_to_cpu(auprobe->insn); addr = instruction_pointer(regs);
if (auprobe->api.handler)
We share struct arch_probe_insn between krpboes and uprobes, but most of its fields aren't necessary for uprobes:
* The 'insn' field is only used by kprobes as a pointer to the XOL slot.
* The 'restore' field is only used by probes as the PC to restore after stepping an instruction in the XOL slot.
* The 'pstate_cc' field isn't used by kprobes or uprobes, and seems to only exist as a result of copy-pasting the 32-bit arm implementation of kprobes.
As these fields live in struct arch_probe_insn they cannot use definitions that only exist when CONFIG_KPROBES=y, such as the kprobe_opcode_t typedef, which we'd like to use in subsequent patches.
Clean this up by removing the 'pstate_cc' field, and moving the kprobes-specific fields into the kprobes-specific struct arch_specific_insn. To make it clear that the fields are related to stepping instructions in the XOL slot, 'insn' is renamed to 'xol_insn' and 'restore' is renamed to 'xol_restore'
At the same time, remove the misleading and useless comment above struct arch_probe_insn.
The should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org --- arch/arm64/include/asm/probes.h | 8 +++----- arch/arm64/kernel/probes/kprobes.c | 30 +++++++++++++++--------------- 2 files changed, 18 insertions(+), 20 deletions(-)
diff --git a/arch/arm64/include/asm/probes.h b/arch/arm64/include/asm/probes.h index 006946745352e..4aa54322794da 100644 --- a/arch/arm64/include/asm/probes.h +++ b/arch/arm64/include/asm/probes.h @@ -12,18 +12,16 @@ typedef u32 probe_opcode_t; typedef void (probes_handler_t) (u32 opcode, long addr, struct pt_regs *);
-/* architecture specific copy of original instruction */ struct arch_probe_insn { - probe_opcode_t *insn; - pstate_check_t *pstate_cc; probes_handler_t *handler; - /* restore address after step xol */ - unsigned long restore; }; #ifdef CONFIG_KPROBES typedef u32 kprobe_opcode_t; struct arch_specific_insn { struct arch_probe_insn api; + probe_opcode_t *xol_insn; + /* restore address after step xol */ + unsigned long xol_restore; }; #endif
diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c index 4268678d0e86c..222419a41a400 100644 --- a/arch/arm64/kernel/probes/kprobes.c +++ b/arch/arm64/kernel/probes/kprobes.c @@ -43,7 +43,7 @@ post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
static void __kprobes arch_prepare_ss_slot(struct kprobe *p) { - kprobe_opcode_t *addr = p->ainsn.api.insn; + kprobe_opcode_t *addr = p->ainsn.xol_insn;
/* * Prepare insn slot, Mark Rutland points out it depends on a coupe of @@ -70,14 +70,14 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p) /* * Needs restoring of return address after stepping xol. */ - p->ainsn.api.restore = (unsigned long) p->addr + + p->ainsn.xol_restore = (unsigned long) p->addr + sizeof(kprobe_opcode_t); }
static void __kprobes arch_prepare_simulate(struct kprobe *p) { /* This instructions is not executed xol. No need to adjust the PC */ - p->ainsn.api.restore = 0; + p->ainsn.xol_restore = 0; }
static void __kprobes arch_simulate_insn(struct kprobe *p, struct pt_regs *regs) @@ -110,18 +110,18 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p) return -EINVAL;
case INSN_GOOD_NO_SLOT: /* insn need simulation */ - p->ainsn.api.insn = NULL; + p->ainsn.xol_insn = NULL; break;
case INSN_GOOD: /* instruction uses slot */ - p->ainsn.api.insn = get_insn_slot(); - if (!p->ainsn.api.insn) + p->ainsn.xol_insn = get_insn_slot(); + if (!p->ainsn.xol_insn) return -ENOMEM; break; }
/* prepare the instruction */ - if (p->ainsn.api.insn) + if (p->ainsn.xol_insn) arch_prepare_ss_slot(p); else arch_prepare_simulate(p); @@ -148,9 +148,9 @@ void __kprobes arch_disarm_kprobe(struct kprobe *p)
void __kprobes arch_remove_kprobe(struct kprobe *p) { - if (p->ainsn.api.insn) { - free_insn_slot(p->ainsn.api.insn, 0); - p->ainsn.api.insn = NULL; + if (p->ainsn.xol_insn) { + free_insn_slot(p->ainsn.xol_insn, 0); + p->ainsn.xol_insn = NULL; } }
@@ -205,9 +205,9 @@ static void __kprobes setup_singlestep(struct kprobe *p, }
- if (p->ainsn.api.insn) { + if (p->ainsn.xol_insn) { /* prepare for single stepping */ - slot = (unsigned long)p->ainsn.api.insn; + slot = (unsigned long)p->ainsn.xol_insn;
kprobes_save_local_irqflag(kcb, regs); instruction_pointer_set(regs, slot); @@ -245,8 +245,8 @@ static void __kprobes post_kprobe_handler(struct kprobe *cur, struct kprobe_ctlblk *kcb, struct pt_regs *regs) { /* return addr restore if non-branching insn */ - if (cur->ainsn.api.restore != 0) - instruction_pointer_set(regs, cur->ainsn.api.restore); + if (cur->ainsn.xol_restore != 0) + instruction_pointer_set(regs, cur->ainsn.xol_restore);
/* restore back original saved kprobe variables and continue */ if (kcb->kprobe_status == KPROBE_REENTER) { @@ -348,7 +348,7 @@ kprobe_breakpoint_ss_handler(struct pt_regs *regs, unsigned long esr) struct kprobe *cur = kprobe_running();
if (cur && (kcb->kprobe_status & (KPROBE_HIT_SS | KPROBE_REENTER)) && - ((unsigned long)&cur->ainsn.api.insn[1] == addr)) { + ((unsigned long)&cur->ainsn.xol_insn[1] == addr)) { kprobes_restore_local_irqflag(kcb, regs); post_kprobe_handler(cur, kcb, regs);
The core kprobes code uses kprobe_opcode_t for the in-memory representation of an instruction, using 'kprobe_opcode_t *' for XOL slots. As arm64 instructions are always little-endian 32-bit values, kprobes_opcode_t should be __le32, but at the moment kprobe_opcode_t is typedef'd to u32.
Today there is no functional issue as we convert values via cpu_to_le32() and le32_to_cpu() where necessary, but these conversions are inconsistent with the types used, causing sparse warnings:
| CHECK arch/arm64/kernel/probes/kprobes.c | arch/arm64/kernel/probes/kprobes.c:102:21: warning: cast to restricted __le32 | CHECK arch/arm64/kernel/probes/decode-insn.c | arch/arm64/kernel/probes/decode-insn.c:122:46: warning: cast to restricted __le32 | arch/arm64/kernel/probes/decode-insn.c:124:50: warning: cast to restricted __le32 | arch/arm64/kernel/probes/decode-insn.c:136:31: warning: cast to restricted __le32
Improve this by making kprobes_opcode_t a typedef for __le32 and consistently using this for pointers to executable instructions. With this change we can rely on the type system to tell us where conversions are necessary.
Since kprobe::opcode is changed from u32 to __le32, the existing le32_to_cpu() converion moves from the point this is initialized (in arch_prepare_kprobe()) to the points this is consumed when passed to a handler or text patching function. As kprobe::opcode isn't altered or consumed elsewhere, this shouldn't result in a functional change.
Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org --- arch/arm64/include/asm/probes.h | 4 ++-- arch/arm64/kernel/probes/decode-insn.c | 2 +- arch/arm64/kernel/probes/kprobes.c | 9 +++++---- 3 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/include/asm/probes.h b/arch/arm64/include/asm/probes.h index 4aa54322794da..11e809733b7d9 100644 --- a/arch/arm64/include/asm/probes.h +++ b/arch/arm64/include/asm/probes.h @@ -16,10 +16,10 @@ struct arch_probe_insn { probes_handler_t *handler; }; #ifdef CONFIG_KPROBES -typedef u32 kprobe_opcode_t; +typedef __le32 kprobe_opcode_t; struct arch_specific_insn { struct arch_probe_insn api; - probe_opcode_t *xol_insn; + kprobe_opcode_t *xol_insn; /* restore address after step xol */ unsigned long xol_restore; }; diff --git a/arch/arm64/kernel/probes/decode-insn.c b/arch/arm64/kernel/probes/decode-insn.c index 3496d6169e59b..147d6ddf3a4c9 100644 --- a/arch/arm64/kernel/probes/decode-insn.c +++ b/arch/arm64/kernel/probes/decode-insn.c @@ -134,7 +134,7 @@ arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi) { enum probe_insn decoded; probe_opcode_t insn = le32_to_cpu(*addr); - probe_opcode_t *scan_end = NULL; + kprobe_opcode_t *scan_end = NULL; unsigned long size = 0, offset = 0; struct arch_probe_insn *api = &asi->api;
diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c index 222419a41a400..48d88e07611d4 100644 --- a/arch/arm64/kernel/probes/kprobes.c +++ b/arch/arm64/kernel/probes/kprobes.c @@ -64,7 +64,7 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p) * the BRK exception handler, so it is unnecessary to generate * Contex-Synchronization-Event via ISB again. */ - aarch64_insn_patch_text_nosync(addr, p->opcode); + aarch64_insn_patch_text_nosync(addr, le32_to_cpu(p->opcode)); aarch64_insn_patch_text_nosync(addr + 1, BRK64_OPCODE_KPROBES_SS);
/* @@ -85,7 +85,7 @@ static void __kprobes arch_simulate_insn(struct kprobe *p, struct pt_regs *regs) struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
if (p->ainsn.api.handler) - p->ainsn.api.handler((u32)p->opcode, (long)p->addr, regs); + p->ainsn.api.handler(le32_to_cpu(p->opcode), (long)p->addr, regs);
/* single step simulated, now go for post processing */ post_kprobe_handler(p, kcb, regs); @@ -99,7 +99,7 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p) return -EINVAL;
/* copy instruction */ - p->opcode = le32_to_cpu(*p->addr); + p->opcode = *p->addr;
if (search_exception_tables(probe_addr)) return -EINVAL; @@ -142,8 +142,9 @@ void __kprobes arch_arm_kprobe(struct kprobe *p) void __kprobes arch_disarm_kprobe(struct kprobe *p) { void *addr = p->addr; + u32 insn = le32_to_cpu(p->opcode);
- aarch64_insn_patch_text(&addr, &p->opcode, 1); + aarch64_insn_patch_text(&addr, &insn, 1); }
void __kprobes arch_remove_kprobe(struct kprobe *p)
The probe_opcode_t typedef for u32 isn't necessary, and is a source of confusion as it is easily confused with kprobe_opcode_t, which is a typedef for __le32.
The typedef is only used within arch/arm64, and all of arm64's commn insn code uses u32 for the endian-agnostic value of an instruction, so it'd be clearer to use u32 consistently.
Remove probe_opcode_t and use u32 directly.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: Catalin Marinas catalin.marnias@arm.com Cc: Will Deacon will@kernel.org --- arch/arm64/include/asm/probes.h | 1 - arch/arm64/kernel/probes/decode-insn.c | 4 ++-- arch/arm64/kernel/probes/decode-insn.h | 2 +- arch/arm64/kernel/probes/uprobes.c | 4 ++-- 4 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/include/asm/probes.h b/arch/arm64/include/asm/probes.h index 11e809733b7d9..d493688863094 100644 --- a/arch/arm64/include/asm/probes.h +++ b/arch/arm64/include/asm/probes.h @@ -9,7 +9,6 @@
#include <asm/insn.h>
-typedef u32 probe_opcode_t; typedef void (probes_handler_t) (u32 opcode, long addr, struct pt_regs *);
struct arch_probe_insn { diff --git a/arch/arm64/kernel/probes/decode-insn.c b/arch/arm64/kernel/probes/decode-insn.c index 147d6ddf3a4c9..41b100bcb041d 100644 --- a/arch/arm64/kernel/probes/decode-insn.c +++ b/arch/arm64/kernel/probes/decode-insn.c @@ -73,7 +73,7 @@ static bool __kprobes aarch64_insn_is_steppable(u32 insn) * INSN_GOOD_NO_SLOT If instruction is supported but doesn't use its slot. */ enum probe_insn __kprobes -arm_probe_decode_insn(probe_opcode_t insn, struct arch_probe_insn *api) +arm_probe_decode_insn(u32 insn, struct arch_probe_insn *api) { /* * Instructions reading or modifying the PC won't work from the XOL @@ -133,7 +133,7 @@ enum probe_insn __kprobes arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi) { enum probe_insn decoded; - probe_opcode_t insn = le32_to_cpu(*addr); + u32 insn = le32_to_cpu(*addr); kprobe_opcode_t *scan_end = NULL; unsigned long size = 0, offset = 0; struct arch_probe_insn *api = &asi->api; diff --git a/arch/arm64/kernel/probes/decode-insn.h b/arch/arm64/kernel/probes/decode-insn.h index 8b758c5a20622..0e4195de82061 100644 --- a/arch/arm64/kernel/probes/decode-insn.h +++ b/arch/arm64/kernel/probes/decode-insn.h @@ -28,6 +28,6 @@ enum probe_insn __kprobes arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi); #endif enum probe_insn __kprobes -arm_probe_decode_insn(probe_opcode_t insn, struct arch_probe_insn *asi); +arm_probe_decode_insn(u32 insn, struct arch_probe_insn *asi);
#endif /* _ARM_KERNEL_KPROBES_ARM64_H */ diff --git a/arch/arm64/kernel/probes/uprobes.c b/arch/arm64/kernel/probes/uprobes.c index a2f137a595fc1..fa0b7941d204c 100644 --- a/arch/arm64/kernel/probes/uprobes.c +++ b/arch/arm64/kernel/probes/uprobes.c @@ -34,7 +34,7 @@ unsigned long uprobe_get_swbp_addr(struct pt_regs *regs) int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm, unsigned long addr) { - probe_opcode_t insn; + u32 insn;
/* TODO: Currently we do not support AARCH32 instruction probing */ if (mm->context.flags & MMCF_AARCH32) @@ -102,7 +102,7 @@ bool arch_uprobe_xol_was_trapped(struct task_struct *t)
bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs) { - probe_opcode_t insn; + u32 insn; unsigned long addr;
if (!auprobe->simulate)
On Tue, 08 Oct 2024 16:58:45 +0100, Mark Rutland wrote:
These patches address some issues I spotted while looking at kprobes and uprobes.
Patch 1 is the most pressing, as a uprobes user can trigger a kernel BUG(). Patches 2 and 3 fix latent endianness bugs which only manifest on big-endian kernels, and patchs 4-6 clean things up so that it's harder to get this wrong again in future.
[...]
Applied first three (fixes) to arm64 (for-next/fixes), thanks!
[1/6] arm64: probes: Remove broken LDR (literal) uprobe support https://git.kernel.org/arm64/c/acc450aa0709 [2/6] arm64: probes: Fix simulate_ldr*_literal() https://git.kernel.org/arm64/c/50f813e57601 [3/6] arm64: probes: Fix uprobes for big-endian kernels https://git.kernel.org/arm64/c/13f8f1e05f1d
Cheers,
On Tue, 08 Oct 2024 16:58:45 +0100, Mark Rutland wrote:
These patches address some issues I spotted while looking at kprobes and uprobes.
Patch 1 is the most pressing, as a uprobes user can trigger a kernel BUG(). Patches 2 and 3 fix latent endianness bugs which only manifest on big-endian kernels, and patchs 4-6 clean things up so that it's harder to get this wrong again in future.
[...]
Applied to arm64 (for-next/probes), thanks! The branch also contains the first three patches in the series from arm64 for-next/fixes.
[4/6] arm64: probes: Move kprobes-specific fields https://git.kernel.org/arm64/c/6105c5d46d0b [5/6] arm64: probes: Cleanup kprobes endianness conversions https://git.kernel.org/arm64/c/dd0eb50e7c71 [6/6] arm64: probes: Remove probe_opcode_t https://git.kernel.org/arm64/c/14762109de02
linux-stable-mirror@lists.linaro.org