Hi,
Here is the 3rd version of the series of patches to support accessing function entry data from function *return* probes (including kretprobe and the fprobe exit event). The previous version is here:
https://lore.kernel.org/all/170891987362.609861.6767830614537418260.stgit@de...
In this version, [1/8] is a bugfix patch (note that it has already been pushed to probes-fixes-v6.8-rc5 and is included here just for reference). I also updated the changelog and fixed a build error in [4/8], fixed a selftest error in [6/8], updated the document in [8/8], and added Steve's Reviewed-by.
This allows us to access the results of functions that return an error code and pass their actual results via function parameters, such as structure-initialization functions.

For example, vfs_open() links the file structure to the inode and updates the mode, so we can trace those changes.
 # echo 'f vfs_open mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events
 # echo 'f vfs_open%return mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events
 # echo 1 > events/fprobes/enable
 # cat trace
              sh-131     [006] ...1.  1945.714346: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x2 inode=0x0
              sh-131     [006] ...1.  1945.714358: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4d801e inode=0xffff888008470168
             cat-143     [007] ...1.  1945.717949: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0
             cat-143     [007] ...1.  1945.717956: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4a801d inode=0xffff888005f78d28
             cat-143     [007] ...1.  1945.720616: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0
             cat-143     [007] ...1.  1945.728263: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0xa800d inode=0xffff888004ada8d8
As you can see, those fields are initialized by the time the function exits.

This series is based on the v6.8-rc5 kernel. You can also check it out from:
https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=to...
Thank you,
---
Masami Hiramatsu (Google) (8):
      fprobe: Fix to allocate entry_data_size buffer with rethook instances
      tracing/fprobe-event: cleanup: Fix a wrong comment in fprobe event
      tracing/probes: Cleanup probe argument parser
      tracing/probes: cleanup: Set trace_probe::nr_args at trace_probe_init
      tracing: Remove redundant #else block for BTF args from README
      tracing/probes: Support $argN in return probe (kprobe and fprobe)
      selftests/ftrace: Add test cases for entry args at function exit
      Documentation: tracing: Add entry argument access at function exit
 Documentation/trace/fprobetrace.rst                |   31 +
 Documentation/trace/kprobetrace.rst                |    9
 kernel/trace/fprobe.c                              |   14 -
 kernel/trace/trace.c                               |    5
 kernel/trace/trace_eprobe.c                        |    8
 kernel/trace/trace_fprobe.c                        |   59 ++-
 kernel/trace/trace_kprobe.c                        |   58 ++-
 kernel/trace/trace_probe.c                         |  417 ++++++++++++++------
 kernel/trace/trace_probe.h                         |   30 +
 kernel/trace/trace_probe_tmpl.h                    |   10
 kernel/trace/trace_uprobe.c                        |   14 -
 .../ftrace/test.d/dynevent/fprobe_entry_arg.tc     |   18 +
 .../ftrace/test.d/dynevent/fprobe_syntax_errors.tc |    4
 .../ftrace/test.d/kprobe/kprobe_syntax_errors.tc   |    2
 .../ftrace/test.d/kprobe/kretprobe_entry_arg.tc    |   18 +
 15 files changed, 521 insertions(+), 176 deletions(-)
 create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/fprobe_entry_arg.tc
 create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_entry_arg.tc
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Allocate the fprobe::entry_data_size buffer together with each rethook instance. If fprobe does not allocate an entry_data_size buffer for each rethook instance, the fprobe entry handler can cause a buffer overrun when it stores entry data.
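To illustrate why each rethook instance needs those extra bytes, here is a minimal sketch of a module-style fprobe user that stores data at entry and reads it back at exit. It assumes the v6.8 fprobe callback signatures; the probed symbol ("vfs_open") and the data layout are arbitrary examples, not part of this series.

/*
 * Minimal sketch (illustrative only). Every rethook instance must reserve
 * entry_data_size bytes, otherwise the write in the entry handler overruns
 * the fprobe_rethook_node that precedes the entry data.
 */
#include <linux/fprobe.h>
#include <linux/jiffies.h>
#include <linux/module.h>
#include <linux/sched.h>

struct my_entry_data {
	unsigned long ts;
	pid_t pid;
};

static int my_entry(struct fprobe *fp, unsigned long entry_ip,
		    unsigned long ret_ip, struct pt_regs *regs,
		    void *entry_data)
{
	struct my_entry_data *data = entry_data;

	/* Stored in the per-instance buffer sized by entry_data_size. */
	data->ts = jiffies;
	data->pid = current->pid;
	return 0;
}

static void my_exit(struct fprobe *fp, unsigned long entry_ip,
		    unsigned long ret_ip, struct pt_regs *regs,
		    void *entry_data)
{
	struct my_entry_data *data = entry_data;

	pr_info("pid=%d spent %lu jiffies in the probed function\n",
		data->pid, jiffies - data->ts);
}

static struct fprobe my_fprobe = {
	.entry_handler	 = my_entry,
	.exit_handler	 = my_exit,
	/* This size has to be reserved with each rethook instance. */
	.entry_data_size = sizeof(struct my_entry_data),
};

static int __init my_init(void)
{
	return register_fprobe(&my_fprobe, "vfs_open", NULL);
}

static void __exit my_fini(void)
{
	unregister_fprobe(&my_fprobe);
}

module_init(my_init);
module_exit(my_fini);
MODULE_LICENSE("GPL");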
Link: https://lore.kernel.org/all/170920576727.107552.638161246679734051.stgit@dev...
Reported-by: Jiri Olsa <olsajiri@gmail.com>
Closes: https://lore.kernel.org/all/Zd9eBn2FTQzYyg7L@krava/
Fixes: 4bbd93455659 ("kprobes: kretprobe scalability improvement")
Cc: stable@vger.kernel.org
Tested-by: Jiri Olsa <olsajiri@gmail.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/fprobe.c |   14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 6cd2a4e3afb8..9ff018245840 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -189,9 +189,6 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
 {
 	int size;
 
-	if (num <= 0)
-		return -EINVAL;
-
 	if (!fp->exit_handler) {
 		fp->rethook = NULL;
 		return 0;
@@ -199,15 +196,16 @@ static int fprobe_init_rethook(struct fprobe *fp, int num)
 
 	/* Initialize rethook if needed */
 	if (fp->nr_maxactive)
-		size = fp->nr_maxactive;
+		num = fp->nr_maxactive;
 	else
-		size = num * num_possible_cpus() * 2;
-	if (size <= 0)
+		num *= num_possible_cpus() * 2;
+	if (num <= 0)
 		return -EINVAL;
 
+	size = sizeof(struct fprobe_rethook_node) + fp->entry_data_size;
+
 	/* Initialize rethook */
-	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler,
-				    sizeof(struct fprobe_rethook_node), size);
+	fp->rethook = rethook_alloc((void *)fp, fprobe_exit_handler, size, num);
 	if (IS_ERR(fp->rethook))
 		return PTR_ERR(fp->rethook);
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Even though this is the fprobe-event code, the comment says "Kretprobe handler". Fix the comment to say it is a function exit handler.
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_fprobe.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index 7d2ddbcfa377..3ccef4d82235 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -210,7 +210,7 @@ fentry_trace_func(struct trace_fprobe *tf, unsigned long entry_ip,
 }
 NOKPROBE_SYMBOL(fentry_trace_func);
 
-/* Kretprobe handler */
+/* function exit handler */
 static nokprobe_inline void
 __fexit_trace_func(struct trace_fprobe *tf, unsigned long entry_ip,
 		   unsigned long ret_ip, struct pt_regs *regs,
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Clean up traceprobe_parse_probe_arg_body() by splitting out the type parser and the fetch_insn post-processing part into their own functions. This makes no functional change.
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_probe.c |  230 ++++++++++++++++++++++++++------------------
 1 file changed, 137 insertions(+), 93 deletions(-)
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c index 34289f9c6707..67a0b9cbb648 100644 --- a/kernel/trace/trace_probe.c +++ b/kernel/trace/trace_probe.c @@ -1090,67 +1090,45 @@ static int __parse_bitfield_probe_arg(const char *bf, return (BYTES_TO_BITS(t->size) < (bw + bo)) ? -EINVAL : 0; }
-/* String length checking wrapper */ -static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size, - struct probe_arg *parg, - struct traceprobe_parse_context *ctx) +/* Split type part from @arg and return it. */ +static char *parse_probe_arg_type(char *arg, struct probe_arg *parg, + struct traceprobe_parse_context *ctx) { - struct fetch_insn *code, *scode, *tmp = NULL; - char *t, *t2, *t3; - int ret, len; - char *arg; + char *t = NULL, *t2, *t3; + int offs;
- arg = kstrdup(argv, GFP_KERNEL); - if (!arg) - return -ENOMEM; - - ret = -EINVAL; - len = strlen(arg); - if (len > MAX_ARGSTR_LEN) { - trace_probe_log_err(ctx->offset, ARG_TOO_LONG); - goto out; - } else if (len == 0) { - trace_probe_log_err(ctx->offset, NO_ARG_BODY); - goto out; - } - - ret = -ENOMEM; - parg->comm = kstrdup(arg, GFP_KERNEL); - if (!parg->comm) - goto out; - - ret = -EINVAL; t = strchr(arg, ':'); if (t) { - *t = '\0'; - t2 = strchr(++t, '['); + *t++ = '\0'; + t2 = strchr(t, '['); if (t2) { *t2++ = '\0'; t3 = strchr(t2, ']'); if (!t3) { - int offs = t2 + strlen(t2) - arg; + offs = t2 + strlen(t2) - arg;
trace_probe_log_err(ctx->offset + offs, ARRAY_NO_CLOSE); - goto out; + return ERR_PTR(-EINVAL); } else if (t3[1] != '\0') { trace_probe_log_err(ctx->offset + t3 + 1 - arg, BAD_ARRAY_SUFFIX); - goto out; + return ERR_PTR(-EINVAL); } *t3 = '\0'; if (kstrtouint(t2, 0, &parg->count) || !parg->count) { trace_probe_log_err(ctx->offset + t2 - arg, BAD_ARRAY_NUM); - goto out; + return ERR_PTR(-EINVAL); } if (parg->count > MAX_ARRAY_LEN) { trace_probe_log_err(ctx->offset + t2 - arg, ARRAY_TOO_BIG); - goto out; + return ERR_PTR(-EINVAL); } } } + offs = t ? t - arg : 0;
/* * Since $comm and immediate string can not be dereferenced, @@ -1161,74 +1139,52 @@ static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size, strncmp(arg, "\"", 2) == 0)) { /* The type of $comm must be "string", and not an array type. */ if (parg->count || (t && strcmp(t, "string"))) { - trace_probe_log_err(ctx->offset + (t ? (t - arg) : 0), - NEED_STRING_TYPE); - goto out; + trace_probe_log_err(ctx->offset + offs, NEED_STRING_TYPE); + return ERR_PTR(-EINVAL); } parg->type = find_fetch_type("string", ctx->flags); } else parg->type = find_fetch_type(t, ctx->flags); + if (!parg->type) { - trace_probe_log_err(ctx->offset + (t ? (t - arg) : 0), BAD_TYPE); - goto out; + trace_probe_log_err(ctx->offset + offs, BAD_TYPE); + return ERR_PTR(-EINVAL); }
- code = tmp = kcalloc(FETCH_INSN_MAX, sizeof(*code), GFP_KERNEL); - if (!code) - goto out; - code[FETCH_INSN_MAX - 1].op = FETCH_OP_END; - - ctx->last_type = NULL; - ret = parse_probe_arg(arg, parg->type, &code, &code[FETCH_INSN_MAX - 1], - ctx); - if (ret) - goto fail; - - /* Update storing type if BTF is available */ - if (IS_ENABLED(CONFIG_PROBE_EVENTS_BTF_ARGS) && - ctx->last_type) { - if (!t) { - parg->type = find_fetch_type_from_btf_type(ctx); - } else if (strstr(t, "string")) { - ret = check_prepare_btf_string_fetch(t, &code, ctx); - if (ret) - goto fail; - } - } - parg->offset = *size; - *size += parg->type->size * (parg->count ?: 1); + return t; +}
- if (parg->count) { - len = strlen(parg->type->fmttype) + 6; - parg->fmt = kmalloc(len, GFP_KERNEL); - if (!parg->fmt) { - ret = -ENOMEM; - goto out; - } - snprintf(parg->fmt, len, "%s[%d]", parg->type->fmttype, - parg->count); - } +/* After parsing, adjust the fetch_insn according to the probe_arg */ +static int finalize_fetch_insn(struct fetch_insn *code, + struct probe_arg *parg, + char *type, + int type_offset, + struct traceprobe_parse_context *ctx) +{ + struct fetch_insn *scode; + int ret;
- ret = -EINVAL; /* Store operation */ if (parg->type->is_string) { + /* Check bad combination of the type and the last fetch_insn. */ if (!strcmp(parg->type->name, "symstr")) { if (code->op != FETCH_OP_REG && code->op != FETCH_OP_STACK && code->op != FETCH_OP_RETVAL && code->op != FETCH_OP_ARG && code->op != FETCH_OP_DEREF && code->op != FETCH_OP_TP_ARG) { - trace_probe_log_err(ctx->offset + (t ? (t - arg) : 0), + trace_probe_log_err(ctx->offset + type_offset, BAD_SYMSTRING); - goto fail; + return -EINVAL; } } else { if (code->op != FETCH_OP_DEREF && code->op != FETCH_OP_UDEREF && code->op != FETCH_OP_IMM && code->op != FETCH_OP_COMM && code->op != FETCH_OP_DATA && code->op != FETCH_OP_TP_ARG) { - trace_probe_log_err(ctx->offset + (t ? (t - arg) : 0), + trace_probe_log_err(ctx->offset + type_offset, BAD_STRING); - goto fail; + return -EINVAL; } } + if (!strcmp(parg->type->name, "symstr") || (code->op == FETCH_OP_IMM || code->op == FETCH_OP_COMM || code->op == FETCH_OP_DATA) || code->op == FETCH_OP_TP_ARG || @@ -1244,9 +1200,10 @@ static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size, code++; if (code->op != FETCH_OP_NOP) { trace_probe_log_err(ctx->offset, TOO_MANY_OPS); - goto fail; + return -EINVAL; } } + /* If op == DEREF, replace it with STRING */ if (!strcmp(parg->type->name, "ustring") || code->op == FETCH_OP_UDEREF) @@ -1267,47 +1224,134 @@ static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size, code++; if (code->op != FETCH_OP_NOP) { trace_probe_log_err(ctx->offset, TOO_MANY_OPS); - goto fail; + return -E2BIG; } code->op = FETCH_OP_ST_RAW; code->size = parg->type->size; } + + /* Save storing fetch_insn. */ scode = code; + /* Modify operation */ - if (t != NULL) { - ret = __parse_bitfield_probe_arg(t, parg->type, &code); + if (type != NULL) { + /* Bitfield needs a special fetch_insn. */ + ret = __parse_bitfield_probe_arg(type, parg->type, &code); if (ret) { - trace_probe_log_err(ctx->offset + t - arg, BAD_BITFIELD); - goto fail; + trace_probe_log_err(ctx->offset + type_offset, BAD_BITFIELD); + return ret; } } else if (IS_ENABLED(CONFIG_PROBE_EVENTS_BTF_ARGS) && ctx->last_type) { + /* If user not specified the type, try parsing BTF bitfield. */ ret = parse_btf_bitfield(&code, ctx); if (ret) - goto fail; + return ret; } - ret = -EINVAL; + /* Loop(Array) operation */ if (parg->count) { if (scode->op != FETCH_OP_ST_MEM && scode->op != FETCH_OP_ST_STRING && scode->op != FETCH_OP_ST_USTRING) { - trace_probe_log_err(ctx->offset + (t ? (t - arg) : 0), - BAD_STRING); - goto fail; + trace_probe_log_err(ctx->offset + type_offset, BAD_STRING); + return -EINVAL; } code++; if (code->op != FETCH_OP_NOP) { trace_probe_log_err(ctx->offset, TOO_MANY_OPS); - goto fail; + return -E2BIG; } code->op = FETCH_OP_LP_ARRAY; code->param = parg->count; } + + /* Finalize the fetch_insn array. */ code++; code->op = FETCH_OP_END;
- ret = 0; + return 0; +} + +/* String length checking wrapper */ +static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size, + struct probe_arg *parg, + struct traceprobe_parse_context *ctx) +{ + struct fetch_insn *code, *tmp = NULL; + char *type, *arg; + int ret, len; + + len = strlen(argv); + if (len > MAX_ARGSTR_LEN) { + trace_probe_log_err(ctx->offset, ARG_TOO_LONG); + return -E2BIG; + } else if (len == 0) { + trace_probe_log_err(ctx->offset, NO_ARG_BODY); + return -EINVAL; + } + + arg = kstrdup(argv, GFP_KERNEL); + if (!arg) + return -ENOMEM; + + parg->comm = kstrdup(arg, GFP_KERNEL); + if (!parg->comm) { + ret = -ENOMEM; + goto out; + } + + type = parse_probe_arg_type(arg, parg, ctx); + if (IS_ERR(type)) { + ret = PTR_ERR(type); + goto out; + } + + code = tmp = kcalloc(FETCH_INSN_MAX, sizeof(*code), GFP_KERNEL); + if (!code) { + ret = -ENOMEM; + goto out; + } + code[FETCH_INSN_MAX - 1].op = FETCH_OP_END; + + ctx->last_type = NULL; + ret = parse_probe_arg(arg, parg->type, &code, &code[FETCH_INSN_MAX - 1], + ctx); + if (ret < 0) + goto fail; + + /* Update storing type if BTF is available */ + if (IS_ENABLED(CONFIG_PROBE_EVENTS_BTF_ARGS) && + ctx->last_type) { + if (!type) { + parg->type = find_fetch_type_from_btf_type(ctx); + } else if (strstr(type, "string")) { + ret = check_prepare_btf_string_fetch(type, &code, ctx); + if (ret) + goto fail; + } + } + parg->offset = *size; + *size += parg->type->size * (parg->count ?: 1); + + if (parg->count) { + len = strlen(parg->type->fmttype) + 6; + parg->fmt = kmalloc(len, GFP_KERNEL); + if (!parg->fmt) { + ret = -ENOMEM; + goto out; + } + snprintf(parg->fmt, len, "%s[%d]", parg->type->fmttype, + parg->count); + } + + ret = finalize_fetch_insn(code, parg, type, type ? type - arg : 0, ctx); + if (ret < 0) + goto fail; + + for (; code < tmp + FETCH_INSN_MAX; code++) + if (code->op == FETCH_OP_END) + break; /* Shrink down the code buffer */ parg->code = kcalloc(code - tmp + 1, sizeof(*code), GFP_KERNEL); if (!parg->code) @@ -1316,7 +1360,7 @@ static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size, memcpy(parg->code, tmp, sizeof(*code) * (code - tmp + 1));
fail: - if (ret) { + if (ret < 0) { for (code = tmp; code < tmp + FETCH_INSN_MAX; code++) if (code->op == FETCH_NOP_SYMBOL || code->op == FETCH_OP_DATA)
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Instead of incrementing trace_probe::nr_args argument by argument, initialize it in trace_probe_init(). Without this change, there is no way to know the number of trace_probe arguments while they are still being parsed. This is a cleanup, so the behavior is not changed.
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
Changes in v3:
 - Fix build error in trace_{kprobe,fprobe}.c.
 - Update changelog.
---
 kernel/trace/trace_eprobe.c |    2 +-
 kernel/trace/trace_fprobe.c |    2 +-
 kernel/trace/trace_kprobe.c |    2 +-
 kernel/trace/trace_probe.c  |   10 ++++++----
 kernel/trace/trace_probe.h  |    2 +-
 kernel/trace/trace_uprobe.c |    2 +-
 6 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/trace_eprobe.c
index 03c851f57969..eb72def7410f 100644
--- a/kernel/trace/trace_eprobe.c
+++ b/kernel/trace/trace_eprobe.c
@@ -220,7 +220,7 @@ static struct trace_eprobe *alloc_event_probe(const char *group,
 	if (!ep->event_system)
 		goto error;
 
-	ret = trace_probe_init(&ep->tp, this_event, group, false);
+	ret = trace_probe_init(&ep->tp, this_event, group, false, nargs);
 	if (ret < 0)
 		goto error;
 
diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index 3ccef4d82235..5109650b0d82 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -389,7 +389,7 @@ static struct trace_fprobe *alloc_trace_fprobe(const char *group,
 	tf->tpoint = tpoint;
 	tf->fp.nr_maxactive = maxactive;
 
-	ret = trace_probe_init(&tf->tp, event, group, false);
+	ret = trace_probe_init(&tf->tp, event, group, false, nargs);
 	if (ret < 0)
 		goto error;
 
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index c4c6e0e0068b..843f13f839df 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -290,7 +290,7 @@ static struct trace_kprobe *alloc_trace_kprobe(const char *group,
 	INIT_HLIST_NODE(&tk->rp.kp.hlist);
 	INIT_LIST_HEAD(&tk->rp.kp.list);
 
-	ret = trace_probe_init(&tk->tp, event, group, false);
+	ret = trace_probe_init(&tk->tp, event, group, false, nargs);
 	if (ret < 0)
 		goto error;
 
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 67a0b9cbb648..93f36f8a108e 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -1423,9 +1423,6 @@ int traceprobe_parse_probe_arg(struct trace_probe *tp, int i, const char *arg,
 	struct probe_arg *parg = &tp->args[i];
 	const char *body;
 
-	/* Increment count for freeing args in error case */
-	tp->nr_args++;
-
 	body = strchr(arg, '=');
 	if (body) {
 		if (body - arg > MAX_ARG_NAME_LEN) {
@@ -1810,7 +1807,7 @@ void trace_probe_cleanup(struct trace_probe *tp)
 }
 
 int trace_probe_init(struct trace_probe *tp, const char *event,
-		     const char *group, bool alloc_filter)
+		     const char *group, bool alloc_filter, int nargs)
 {
 	struct trace_event_call *call;
 	size_t size = sizeof(struct trace_probe_event);
@@ -1846,6 +1843,11 @@ int trace_probe_init(struct trace_probe *tp, const char *event,
 		goto error;
 	}
 
+	tp->nr_args = nargs;
+	/* Make sure pointers in args[] are NULL */
+	if (nargs)
+		memset(tp->args, 0, sizeof(tp->args[0]) * nargs);
+
 	return 0;
 
 error:
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index c1877d018269..ed8d1052f8a7 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -338,7 +338,7 @@ static inline bool trace_probe_has_single_file(struct trace_probe *tp)
 }
 
 int trace_probe_init(struct trace_probe *tp, const char *event,
-		     const char *group, bool alloc_filter);
+		     const char *group, bool alloc_filter, int nargs);
 void trace_probe_cleanup(struct trace_probe *tp);
 int trace_probe_append(struct trace_probe *tp, struct trace_probe *to);
 void trace_probe_unlink(struct trace_probe *tp);
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index a84b85d8aac1..796ebcae9b38 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -337,7 +337,7 @@ alloc_trace_uprobe(const char *group, const char *event, int nargs, bool is_ret)
 	if (!tu)
 		return ERR_PTR(-ENOMEM);
 
-	ret = trace_probe_init(&tu->tp, event, group, true);
+	ret = trace_probe_init(&tu->tp, event, group, true, nargs);
 	if (ret < 0)
 		goto error;
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Remove the redundant #else block for BTF args from the README message. This is a cleanup, so the resulting message is not changed.
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 8198bfc54b58..516c585f5879 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5747,11 +5747,9 @@ static const char readme_msg[] =
 	"\t    args: <name>=fetcharg[:type]\n"
 	"\t fetcharg: (%<register>|$<efield>), @<address>, @<symbol>[+|-<offset>],\n"
 #ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API
-#ifdef CONFIG_PROBE_EVENTS_BTF_ARGS
 	"\t           $stack<index>, $stack, $retval, $comm, $arg<N>,\n"
+#ifdef CONFIG_PROBE_EVENTS_BTF_ARGS
 	"\t           <argname>[->field[->field|.field...]],\n"
-#else
-	"\t           $stack<index>, $stack, $retval, $comm, $arg<N>,\n"
 #endif
 #else
 	"\t           $stack<index>, $stack, $retval, $comm,\n"
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Support accessing $argN in return probe events. This helps users record entry data in the function return (exit) event, combining the function entry/exit information into a single event, and record the resulting values (e.g. an allocated or initialized object) at function exit.
For example, if we have a function `int init_foo(struct foo *obj, int param)`, sometimes we want to check how `obj` is initialized. In such a case, we can define a new return event like below:
# echo 'r init_foo retval=$retval param=$arg2 field1=+0($arg1)' >> kprobe_events
This records the function parameter `param` and the resulting `obj->field1` value (the dereference is done at function exit) in one event.
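For reference, a hypothetical init_foo() matching the prototype above could look like the following; the struct layout and field names are made up purely for illustration.

#include <linux/errno.h>

/* Hypothetical structure and initializer, for illustration only. */
struct foo {
	int field1;
	int field2;
};

int init_foo(struct foo *obj, int param)
{
	if (param < 0)
		return -EINVAL;		/* @obj stays untouched on error */

	obj->field1 = param * 2;	/* only valid once init_foo() returns */
	obj->field2 = 0;

	return 0;
}

At function entry obj->field1 is not set yet, so dereferencing $arg1 only gives a meaningful value in the return event, which is exactly what the $argN access at exit enables.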
This also supports fprobe, BTF args, and '$arg*'. So if CONFIG_DEBUG_INFO_BTF is enabled, we can trace both the function parameters and the return value with the following command:
# echo 'f target_function%return $arg* $retval' >> dynamic_events
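Under the hood, the kprobe-event side reuses the existing kretprobe entry_handler/data_size mechanism (and the fprobe entry_data buffer for fprobe events) to snapshot arguments at entry and read them back at exit. A rough sketch of that underlying idea using the raw kretprobe API is below; the probed symbol is an arbitrary example, and regs_get_kernel_argument() requires CONFIG_HAVE_FUNCTION_ARG_ACCESS_API.

#include <linux/kprobes.h>
#include <linux/module.h>
#include <linux/ptrace.h>

/* Save the first argument into the per-instance buffer at function entry. */
static int my_entry_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	*(unsigned long *)ri->data = regs_get_kernel_argument(regs, 0);
	return 0;
}

/* Read it back at function return, together with the return value. */
static int my_ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
	pr_info("arg1 was 0x%lx, retval is 0x%lx\n",
		*(unsigned long *)ri->data, regs_return_value(regs));
	return 0;
}

static struct kretprobe my_kretprobe = {
	.kp.symbol_name	= "vfs_open",	/* arbitrary example target */
	.entry_handler	= my_entry_handler,
	.handler	= my_ret_handler,
	.data_size	= sizeof(unsigned long),
};

static int __init my_init(void)
{
	return register_kretprobe(&my_kretprobe);
}

static void __exit my_fini(void)
{
	unregister_kretprobe(&my_kretprobe);
}

module_init(my_init);
module_exit(my_fini);
MODULE_LICENSE("GPL");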
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
Changes in v3:
 - Fix the syntax error check selftests, because some of those syntax errors no longer happen.
Changes in v2:
 - Update README message.
---
 kernel/trace/trace.c                               |    1
 kernel/trace/trace_eprobe.c                        |    6 -
 kernel/trace/trace_fprobe.c                        |   55 ++++--
 kernel/trace/trace_kprobe.c                        |   56 +++++-
 kernel/trace/trace_probe.c                         |  177 ++++++++++++++++++--
 kernel/trace/trace_probe.h                         |   28 +++
 kernel/trace/trace_probe_tmpl.h                    |   10 +
 kernel/trace/trace_uprobe.c                        |   12 +
 .../ftrace/test.d/dynevent/fprobe_syntax_errors.tc |    4
 .../ftrace/test.d/kprobe/kprobe_syntax_errors.tc   |    2
 10 files changed, 289 insertions(+), 62 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 516c585f5879..71a96decc276 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -5755,6 +5755,7 @@ static const char readme_msg[] = "\t $stack<index>, $stack, $retval, $comm,\n" #endif "\t +|-[u]<offset>(<fetcharg>), \imm-value, \"imm-string"\n" + "\t kernel return probes support: $retval, $arg<N>, $comm\n" "\t type: s8/16/32/64, u8/16/32/64, x8/16/32/64, char, string, symbol,\n" "\t b<bit-width>@<bit-offset>/<container-size>, ustring,\n" "\t symstr, <type>\[<array-size>\]\n" diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/trace_eprobe.c index eb72def7410f..b0e0ec85912e 100644 --- a/kernel/trace/trace_eprobe.c +++ b/kernel/trace/trace_eprobe.c @@ -390,8 +390,8 @@ static int get_eprobe_size(struct trace_probe *tp, void *rec)
/* Note that we don't verify it, since the code does not come from user space */ static int -process_fetch_insn(struct fetch_insn *code, void *rec, void *dest, - void *base) +process_fetch_insn(struct fetch_insn *code, void *rec, void *edata, + void *dest, void *base) { unsigned long val; int ret; @@ -438,7 +438,7 @@ __eprobe_trace_func(struct eprobe_data *edata, void *rec) return;
entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event); - store_trace_args(&entry[1], &edata->ep->tp, rec, sizeof(*entry), dsize); + store_trace_args(&entry[1], &edata->ep->tp, rec, NULL, sizeof(*entry), dsize);
trace_event_buffer_commit(&fbuffer); } diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c index 5109650b0d82..4f4280815522 100644 --- a/kernel/trace/trace_fprobe.c +++ b/kernel/trace/trace_fprobe.c @@ -4,6 +4,7 @@ * Copyright (C) 2022 Google LLC. */ #define pr_fmt(fmt) "trace_fprobe: " fmt +#include <asm/ptrace.h>
#include <linux/fprobe.h> #include <linux/module.h> @@ -129,8 +130,8 @@ static bool trace_fprobe_is_registered(struct trace_fprobe *tf) * from user space. */ static int -process_fetch_insn(struct fetch_insn *code, void *rec, void *dest, - void *base) +process_fetch_insn(struct fetch_insn *code, void *rec, void *edata, + void *dest, void *base) { struct pt_regs *regs = rec; unsigned long val; @@ -152,6 +153,9 @@ process_fetch_insn(struct fetch_insn *code, void *rec, void *dest, case FETCH_OP_ARG: val = regs_get_kernel_argument(regs, code->param); break; + case FETCH_OP_EDATA: + val = *(unsigned long *)((unsigned long)edata + code->offset); + break; #endif case FETCH_NOP_SYMBOL: /* Ignore a place holder */ code++; @@ -184,7 +188,7 @@ __fentry_trace_func(struct trace_fprobe *tf, unsigned long entry_ip, if (trace_trigger_soft_disabled(trace_file)) return;
- dsize = __get_data_size(&tf->tp, regs); + dsize = __get_data_size(&tf->tp, regs, NULL);
entry = trace_event_buffer_reserve(&fbuffer, trace_file, sizeof(*entry) + tf->tp.size + dsize); @@ -194,7 +198,7 @@ __fentry_trace_func(struct trace_fprobe *tf, unsigned long entry_ip, fbuffer.regs = regs; entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event); entry->ip = entry_ip; - store_trace_args(&entry[1], &tf->tp, regs, sizeof(*entry), dsize); + store_trace_args(&entry[1], &tf->tp, regs, NULL, sizeof(*entry), dsize);
trace_event_buffer_commit(&fbuffer); } @@ -211,10 +215,23 @@ fentry_trace_func(struct trace_fprobe *tf, unsigned long entry_ip, NOKPROBE_SYMBOL(fentry_trace_func);
/* function exit handler */ +static int trace_fprobe_entry_handler(struct fprobe *fp, unsigned long entry_ip, + unsigned long ret_ip, struct pt_regs *regs, + void *entry_data) +{ + struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp); + + if (tf->tp.entry_arg) + store_trace_entry_data(entry_data, &tf->tp, regs); + + return 0; +} +NOKPROBE_SYMBOL(trace_fprobe_entry_handler) + static nokprobe_inline void __fexit_trace_func(struct trace_fprobe *tf, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, - struct trace_event_file *trace_file) + void *entry_data, struct trace_event_file *trace_file) { struct fexit_trace_entry_head *entry; struct trace_event_buffer fbuffer; @@ -227,7 +244,7 @@ __fexit_trace_func(struct trace_fprobe *tf, unsigned long entry_ip, if (trace_trigger_soft_disabled(trace_file)) return;
- dsize = __get_data_size(&tf->tp, regs); + dsize = __get_data_size(&tf->tp, regs, entry_data);
entry = trace_event_buffer_reserve(&fbuffer, trace_file, sizeof(*entry) + tf->tp.size + dsize); @@ -238,19 +255,19 @@ __fexit_trace_func(struct trace_fprobe *tf, unsigned long entry_ip, entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event); entry->func = entry_ip; entry->ret_ip = ret_ip; - store_trace_args(&entry[1], &tf->tp, regs, sizeof(*entry), dsize); + store_trace_args(&entry[1], &tf->tp, regs, entry_data, sizeof(*entry), dsize);
trace_event_buffer_commit(&fbuffer); }
static void fexit_trace_func(struct trace_fprobe *tf, unsigned long entry_ip, - unsigned long ret_ip, struct pt_regs *regs) + unsigned long ret_ip, struct pt_regs *regs, void *entry_data) { struct event_file_link *link;
trace_probe_for_each_link_rcu(link, &tf->tp) - __fexit_trace_func(tf, entry_ip, ret_ip, regs, link->file); + __fexit_trace_func(tf, entry_ip, ret_ip, regs, entry_data, link->file); } NOKPROBE_SYMBOL(fexit_trace_func);
@@ -269,7 +286,7 @@ static int fentry_perf_func(struct trace_fprobe *tf, unsigned long entry_ip, if (hlist_empty(head)) return 0;
- dsize = __get_data_size(&tf->tp, regs); + dsize = __get_data_size(&tf->tp, regs, NULL); __size = sizeof(*entry) + tf->tp.size + dsize; size = ALIGN(__size + sizeof(u32), sizeof(u64)); size -= sizeof(u32); @@ -280,7 +297,7 @@ static int fentry_perf_func(struct trace_fprobe *tf, unsigned long entry_ip,
entry->ip = entry_ip; memset(&entry[1], 0, dsize); - store_trace_args(&entry[1], &tf->tp, regs, sizeof(*entry), dsize); + store_trace_args(&entry[1], &tf->tp, regs, NULL, sizeof(*entry), dsize); perf_trace_buf_submit(entry, size, rctx, call->event.type, 1, regs, head, NULL); return 0; @@ -289,7 +306,8 @@ NOKPROBE_SYMBOL(fentry_perf_func);
static void fexit_perf_func(struct trace_fprobe *tf, unsigned long entry_ip, - unsigned long ret_ip, struct pt_regs *regs) + unsigned long ret_ip, struct pt_regs *regs, + void *entry_data) { struct trace_event_call *call = trace_probe_event_call(&tf->tp); struct fexit_trace_entry_head *entry; @@ -301,7 +319,7 @@ fexit_perf_func(struct trace_fprobe *tf, unsigned long entry_ip, if (hlist_empty(head)) return;
- dsize = __get_data_size(&tf->tp, regs); + dsize = __get_data_size(&tf->tp, regs, entry_data); __size = sizeof(*entry) + tf->tp.size + dsize; size = ALIGN(__size + sizeof(u32), sizeof(u64)); size -= sizeof(u32); @@ -312,7 +330,7 @@ fexit_perf_func(struct trace_fprobe *tf, unsigned long entry_ip,
entry->func = entry_ip; entry->ret_ip = ret_ip; - store_trace_args(&entry[1], &tf->tp, regs, sizeof(*entry), dsize); + store_trace_args(&entry[1], &tf->tp, regs, entry_data, sizeof(*entry), dsize); perf_trace_buf_submit(entry, size, rctx, call->event.type, 1, regs, head, NULL); } @@ -343,10 +361,10 @@ static void fexit_dispatcher(struct fprobe *fp, unsigned long entry_ip, struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp);
if (trace_probe_test_flag(&tf->tp, TP_FLAG_TRACE)) - fexit_trace_func(tf, entry_ip, ret_ip, regs); + fexit_trace_func(tf, entry_ip, ret_ip, regs, entry_data); #ifdef CONFIG_PERF_EVENTS if (trace_probe_test_flag(&tf->tp, TP_FLAG_PROFILE)) - fexit_perf_func(tf, entry_ip, ret_ip, regs); + fexit_perf_func(tf, entry_ip, ret_ip, regs, entry_data); #endif } NOKPROBE_SYMBOL(fexit_dispatcher); @@ -1109,6 +1127,11 @@ static int __trace_fprobe_create(int argc, const char *argv[]) goto error; /* This can be -ENOMEM */ }
+ if (is_return && tf->tp.entry_arg) { + tf->fp.entry_handler = trace_fprobe_entry_handler; + tf->fp.entry_data_size = traceprobe_get_entry_data_size(&tf->tp); + } + ret = traceprobe_set_print_fmt(&tf->tp, is_return ? PROBE_PRINT_RETURN : PROBE_PRINT_NORMAL); if (ret < 0) diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c index 843f13f839df..14099cc17fc9 100644 --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -740,6 +740,9 @@ static unsigned int number_of_same_symbols(char *func_name) return ctx.count; }
+static int trace_kprobe_entry_handler(struct kretprobe_instance *ri, + struct pt_regs *regs); + static int __trace_kprobe_create(int argc, const char *argv[]) { /* @@ -948,6 +951,11 @@ static int __trace_kprobe_create(int argc, const char *argv[]) if (ret) goto error; /* This can be -ENOMEM */ } + /* entry handler for kretprobe */ + if (is_return && tk->tp.entry_arg) { + tk->rp.entry_handler = trace_kprobe_entry_handler; + tk->rp.data_size = traceprobe_get_entry_data_size(&tk->tp); + }
ptype = is_return ? PROBE_PRINT_RETURN : PROBE_PRINT_NORMAL; ret = traceprobe_set_print_fmt(&tk->tp, ptype); @@ -1303,8 +1311,8 @@ static const struct file_operations kprobe_profile_ops = {
/* Note that we don't verify it, since the code does not come from user space */ static int -process_fetch_insn(struct fetch_insn *code, void *rec, void *dest, - void *base) +process_fetch_insn(struct fetch_insn *code, void *rec, void *edata, + void *dest, void *base) { struct pt_regs *regs = rec; unsigned long val; @@ -1329,6 +1337,9 @@ process_fetch_insn(struct fetch_insn *code, void *rec, void *dest, case FETCH_OP_ARG: val = regs_get_kernel_argument(regs, code->param); break; + case FETCH_OP_EDATA: + val = *(unsigned long *)((unsigned long)edata + code->offset); + break; #endif case FETCH_NOP_SYMBOL: /* Ignore a place holder */ code++; @@ -1359,7 +1370,7 @@ __kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs, if (trace_trigger_soft_disabled(trace_file)) return;
- dsize = __get_data_size(&tk->tp, regs); + dsize = __get_data_size(&tk->tp, regs, NULL);
entry = trace_event_buffer_reserve(&fbuffer, trace_file, sizeof(*entry) + tk->tp.size + dsize); @@ -1368,7 +1379,7 @@ __kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs,
fbuffer.regs = regs; entry->ip = (unsigned long)tk->rp.kp.addr; - store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize); + store_trace_args(&entry[1], &tk->tp, regs, NULL, sizeof(*entry), dsize);
trace_event_buffer_commit(&fbuffer); } @@ -1384,6 +1395,31 @@ kprobe_trace_func(struct trace_kprobe *tk, struct pt_regs *regs) NOKPROBE_SYMBOL(kprobe_trace_func);
/* Kretprobe handler */ + +static int trace_kprobe_entry_handler(struct kretprobe_instance *ri, + struct pt_regs *regs) +{ + struct kretprobe *rp = get_kretprobe(ri); + struct trace_kprobe *tk; + + /* + * There is a small chance that get_kretprobe(ri) returns NULL when + * the kretprobe is unregister on another CPU between kretprobe's + * trampoline_handler and this function. + */ + if (unlikely(!rp)) + return -ENOENT; + + tk = container_of(rp, struct trace_kprobe, rp); + + /* store argument values into ri->data as entry data */ + if (tk->tp.entry_arg) + store_trace_entry_data(ri->data, &tk->tp, regs); + + return 0; +} + + static nokprobe_inline void __kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri, struct pt_regs *regs, @@ -1399,7 +1435,7 @@ __kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri, if (trace_trigger_soft_disabled(trace_file)) return;
- dsize = __get_data_size(&tk->tp, regs); + dsize = __get_data_size(&tk->tp, regs, ri->data);
entry = trace_event_buffer_reserve(&fbuffer, trace_file, sizeof(*entry) + tk->tp.size + dsize); @@ -1409,7 +1445,7 @@ __kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri, fbuffer.regs = regs; entry->func = (unsigned long)tk->rp.kp.addr; entry->ret_ip = get_kretprobe_retaddr(ri); - store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize); + store_trace_args(&entry[1], &tk->tp, regs, ri->data, sizeof(*entry), dsize);
trace_event_buffer_commit(&fbuffer); } @@ -1557,7 +1593,7 @@ kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs) if (hlist_empty(head)) return 0;
- dsize = __get_data_size(&tk->tp, regs); + dsize = __get_data_size(&tk->tp, regs, NULL); __size = sizeof(*entry) + tk->tp.size + dsize; size = ALIGN(__size + sizeof(u32), sizeof(u64)); size -= sizeof(u32); @@ -1568,7 +1604,7 @@ kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs)
entry->ip = (unsigned long)tk->rp.kp.addr; memset(&entry[1], 0, dsize); - store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize); + store_trace_args(&entry[1], &tk->tp, regs, NULL, sizeof(*entry), dsize); perf_trace_buf_submit(entry, size, rctx, call->event.type, 1, regs, head, NULL); return 0; @@ -1593,7 +1629,7 @@ kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri, if (hlist_empty(head)) return;
- dsize = __get_data_size(&tk->tp, regs); + dsize = __get_data_size(&tk->tp, regs, ri->data); __size = sizeof(*entry) + tk->tp.size + dsize; size = ALIGN(__size + sizeof(u32), sizeof(u64)); size -= sizeof(u32); @@ -1604,7 +1640,7 @@ kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
entry->func = (unsigned long)tk->rp.kp.addr; entry->ret_ip = get_kretprobe_retaddr(ri); - store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize); + store_trace_args(&entry[1], &tk->tp, regs, ri->data, sizeof(*entry), dsize); perf_trace_buf_submit(entry, size, rctx, call->event.type, 1, regs, head, NULL); } diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c index 93f36f8a108e..217169de0920 100644 --- a/kernel/trace/trace_probe.c +++ b/kernel/trace/trace_probe.c @@ -594,6 +594,8 @@ static int parse_btf_field(char *fieldname, const struct btf_type *type, return 0; }
+static int __store_entry_arg(struct trace_probe *tp, int argnum); + static int parse_btf_arg(char *varname, struct fetch_insn **pcode, struct fetch_insn *end, struct traceprobe_parse_context *ctx) @@ -618,11 +620,7 @@ static int parse_btf_arg(char *varname, return -EOPNOTSUPP; }
- if (ctx->flags & TPARG_FL_RETURN) { - if (strcmp(varname, "$retval") != 0) { - trace_probe_log_err(ctx->offset, NO_BTFARG); - return -ENOENT; - } + if (ctx->flags & TPARG_FL_RETURN && !strcmp(varname, "$retval")) { code->op = FETCH_OP_RETVAL; /* Check whether the function return type is not void */ if (query_btf_context(ctx) == 0) { @@ -654,11 +652,21 @@ static int parse_btf_arg(char *varname, const char *name = btf_name_by_offset(ctx->btf, params[i].name_off);
if (name && !strcmp(name, varname)) { - code->op = FETCH_OP_ARG; - if (ctx->flags & TPARG_FL_TPOINT) - code->param = i + 1; - else - code->param = i; + if (tparg_is_function_entry(ctx->flags)) { + code->op = FETCH_OP_ARG; + if (ctx->flags & TPARG_FL_TPOINT) + code->param = i + 1; + else + code->param = i; + } else if (tparg_is_function_return(ctx->flags)) { + code->op = FETCH_OP_EDATA; + ret = __store_entry_arg(ctx->tp, i); + if (ret < 0) { + /* internal error */ + return ret; + } + code->offset = ret; + } tid = params[i].type; goto found; } @@ -755,6 +763,110 @@ static int check_prepare_btf_string_fetch(char *typename,
#endif
+#ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API + +static int __store_entry_arg(struct trace_probe *tp, int argnum) +{ + struct probe_entry_arg *earg = tp->entry_arg; + bool match = false; + int i, offset; + + if (!earg) { + earg = kzalloc(sizeof(*tp->entry_arg), GFP_KERNEL); + if (!earg) + return -ENOMEM; + earg->size = 2 * tp->nr_args + 1; + earg->code = kcalloc(earg->size, sizeof(struct fetch_insn), + GFP_KERNEL); + if (!earg->code) { + kfree(earg); + return -ENOMEM; + } + /* Fill the code buffer with 'end' to simplify it */ + for (i = 0; i < earg->size; i++) + earg->code[i].op = FETCH_OP_END; + tp->entry_arg = earg; + } + + offset = 0; + for (i = 0; i < earg->size - 1; i++) { + switch (earg->code[i].op) { + case FETCH_OP_END: + earg->code[i].op = FETCH_OP_ARG; + earg->code[i].param = argnum; + earg->code[i + 1].op = FETCH_OP_ST_EDATA; + earg->code[i + 1].offset = offset; + return offset; + case FETCH_OP_ARG: + match = (earg->code[i].param == argnum); + break; + case FETCH_OP_ST_EDATA: + offset = earg->code[i].offset; + if (match) + return offset; + offset += sizeof(unsigned long); + break; + default: + break; + } + } + return -ENOSPC; +} + +int traceprobe_get_entry_data_size(struct trace_probe *tp) +{ + struct probe_entry_arg *earg = tp->entry_arg; + int i, size = 0; + + if (!earg) + return 0; + + for (i = 0; i < earg->size; i++) { + switch (earg->code[i].op) { + case FETCH_OP_END: + goto out; + case FETCH_OP_ST_EDATA: + size = earg->code[i].offset + sizeof(unsigned long); + break; + default: + break; + } + } +out: + return size; +} + +void store_trace_entry_data(void *edata, struct trace_probe *tp, struct pt_regs *regs) +{ + struct probe_entry_arg *earg = tp->entry_arg; + unsigned long val; + int i; + + if (!earg) + return; + + for (i = 0; i < earg->size; i++) { + struct fetch_insn *code = &earg->code[i]; + + switch (code->op) { + case FETCH_OP_ARG: + val = regs_get_kernel_argument(regs, code->param); + break; + case FETCH_OP_ST_EDATA: + *(unsigned long *)((unsigned long)edata + code->offset) = val; + break; + case FETCH_OP_END: + goto end; + default: + break; + } + } +end: + return; +} +NOKPROBE_SYMBOL(store_trace_entry_data) +#endif + #define PARAM_MAX_STACK (THREAD_SIZE / sizeof(unsigned long))
/* Parse $vars. @orig_arg points '$', which syncs to @ctx->offset */ @@ -830,7 +942,7 @@ static int parse_probe_vars(char *orig_arg, const struct fetch_type *t,
#ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API len = str_has_prefix(arg, "arg"); - if (len && tparg_is_function_entry(ctx->flags)) { + if (len) { ret = kstrtoul(arg + len, 10, ¶m); if (ret) goto inval; @@ -839,15 +951,29 @@ static int parse_probe_vars(char *orig_arg, const struct fetch_type *t, err = TP_ERR_BAD_ARG_NUM; goto inval; } + param--; /* argN starts from 1, but internal arg[N] starts from 0 */
- code->op = FETCH_OP_ARG; - code->param = (unsigned int)param - 1; - /* - * The tracepoint probe will probe a stub function, and the - * first parameter of the stub is a dummy and should be ignored. - */ - if (ctx->flags & TPARG_FL_TPOINT) - code->param++; + if (tparg_is_function_entry(ctx->flags)) { + code->op = FETCH_OP_ARG; + code->param = (unsigned int)param; + /* + * The tracepoint probe will probe a stub function, and the + * first parameter of the stub is a dummy and should be ignored. + */ + if (ctx->flags & TPARG_FL_TPOINT) + code->param++; + } else if (tparg_is_function_return(ctx->flags)) { + /* function entry argument access from return probe */ + ret = __store_entry_arg(ctx->tp, param); + if (ret < 0) /* This error should be an internal error */ + return ret; + + code->op = FETCH_OP_EDATA; + code->offset = ret; + } else { + err = TP_ERR_NOFENTRY_ARGS; + goto inval; + } return 0; } #endif @@ -1037,7 +1163,8 @@ parse_probe_arg(char *arg, const struct fetch_type *type, break; default: if (isalpha(arg[0]) || arg[0] == '_') { /* BTF variable */ - if (!tparg_is_function_entry(ctx->flags)) { + if (!tparg_is_function_entry(ctx->flags) && + !tparg_is_function_return(ctx->flags)) { trace_probe_log_err(ctx->offset, NOSUP_BTFARG); return -EINVAL; } @@ -1423,6 +1550,7 @@ int traceprobe_parse_probe_arg(struct trace_probe *tp, int i, const char *arg, struct probe_arg *parg = &tp->args[i]; const char *body;
+ ctx->tp = tp; body = strchr(arg, '='); if (body) { if (body - arg > MAX_ARG_NAME_LEN) { @@ -1479,7 +1607,8 @@ static int argv_has_var_arg(int argc, const char *argv[], int *args_idx, if (str_has_prefix(argv[i], "$arg")) { trace_probe_log_set_index(i + 2);
- if (!tparg_is_function_entry(ctx->flags)) { + if (!tparg_is_function_entry(ctx->flags) && + !tparg_is_function_return(ctx->flags)) { trace_probe_log_err(0, NOFENTRY_ARGS); return -EINVAL; } @@ -1802,6 +1931,12 @@ void trace_probe_cleanup(struct trace_probe *tp) for (i = 0; i < tp->nr_args; i++) traceprobe_free_probe_arg(&tp->args[i]);
+ if (tp->entry_arg) { + kfree(tp->entry_arg->code); + kfree(tp->entry_arg); + tp->entry_arg = NULL; + } + if (tp->event) trace_probe_unlink(tp); } diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h index ed8d1052f8a7..cef3a50628a3 100644 --- a/kernel/trace/trace_probe.h +++ b/kernel/trace/trace_probe.h @@ -92,6 +92,7 @@ enum fetch_op { FETCH_OP_ARG, /* Function argument : .param */ FETCH_OP_FOFFS, /* File offset: .immediate */ FETCH_OP_DATA, /* Allocated data: .data */ + FETCH_OP_EDATA, /* Entry data: .offset */ // Stage 2 (dereference) op FETCH_OP_DEREF, /* Dereference: .offset */ FETCH_OP_UDEREF, /* User-space Dereference: .offset */ @@ -102,6 +103,7 @@ enum fetch_op { FETCH_OP_ST_STRING, /* String: .offset, .size */ FETCH_OP_ST_USTRING, /* User String: .offset, .size */ FETCH_OP_ST_SYMSTR, /* Kernel Symbol String: .offset, .size */ + FETCH_OP_ST_EDATA, /* Store Entry Data: .offset */ // Stage 4 (modify) op FETCH_OP_MOD_BF, /* Bitfield: .basesize, .lshift, .rshift */ // Stage 5 (loop) op @@ -232,6 +234,11 @@ struct probe_arg { const struct fetch_type *type; /* Type of this argument */ };
+struct probe_entry_arg { + struct fetch_insn *code; + unsigned int size; /* The entry data size */ +}; + struct trace_uprobe_filter { rwlock_t rwlock; int nr_systemwide; @@ -253,6 +260,7 @@ struct trace_probe { struct trace_probe_event *event; ssize_t size; /* trace entry size */ unsigned int nr_args; + struct probe_entry_arg *entry_arg; /* This is only for return probe */ struct probe_arg args[]; };
@@ -355,6 +363,18 @@ int trace_probe_create(const char *raw_command, int (*createfn)(int, const char int trace_probe_print_args(struct trace_seq *s, struct probe_arg *args, int nr_args, u8 *data, void *field);
+#ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API +int traceprobe_get_entry_data_size(struct trace_probe *tp); +/* This is a runtime function to store entry data */ +void store_trace_entry_data(void *edata, struct trace_probe *tp, struct pt_regs *regs); +#else /* !CONFIG_HAVE_FUNCTION_ARG_ACCESS_API */ +static inline int traceprobe_get_entry_data_size(struct trace_probe *tp) +{ + return 0; +} +#define store_trace_entry_data(edata, tp, regs) do { } while (0) +#endif + #define trace_probe_for_each_link(pos, tp) \ list_for_each_entry(pos, &(tp)->event->files, list) #define trace_probe_for_each_link_rcu(pos, tp) \ @@ -381,6 +401,11 @@ static inline bool tparg_is_function_entry(unsigned int flags) return (flags & TPARG_FL_LOC_MASK) == (TPARG_FL_KERNEL | TPARG_FL_FENTRY); }
+static inline bool tparg_is_function_return(unsigned int flags) +{ + return (flags & TPARG_FL_LOC_MASK) == (TPARG_FL_KERNEL | TPARG_FL_RETURN); +} + struct traceprobe_parse_context { struct trace_event_call *event; /* BTF related parameters */ @@ -392,6 +417,7 @@ struct traceprobe_parse_context { const struct btf_type *last_type; /* Saved type */ u32 last_bitoffs; /* Saved bitoffs */ u32 last_bitsize; /* Saved bitsize */ + struct trace_probe *tp; unsigned int flags; int offset; }; @@ -506,7 +532,7 @@ extern int traceprobe_define_arg_fields(struct trace_event_call *event_call, C(NO_BTFARG, "This variable is not found at this probe point"),\ C(NO_BTF_ENTRY, "No BTF entry for this probe point"), \ C(BAD_VAR_ARGS, "$arg* must be an independent parameter without name etc."),\ - C(NOFENTRY_ARGS, "$arg* can be used only on function entry"), \ + C(NOFENTRY_ARGS, "$arg* can be used only on function entry or exit"), \ C(DOUBLE_ARGS, "$arg* can be used only once in the parameters"), \ C(ARGS_2LONG, "$arg* failed because the argument list is too long"), \ C(ARGIDX_2BIG, "$argN index is too big"), \ diff --git a/kernel/trace/trace_probe_tmpl.h b/kernel/trace/trace_probe_tmpl.h index 3935b347f874..2caf0d2afb32 100644 --- a/kernel/trace/trace_probe_tmpl.h +++ b/kernel/trace/trace_probe_tmpl.h @@ -54,7 +54,7 @@ fetch_apply_bitfield(struct fetch_insn *code, void *buf) * If dest is NULL, don't store result and return required dynamic data size. */ static int -process_fetch_insn(struct fetch_insn *code, void *rec, +process_fetch_insn(struct fetch_insn *code, void *rec, void *edata, void *dest, void *base); static nokprobe_inline int fetch_store_strlen(unsigned long addr); static nokprobe_inline int @@ -232,7 +232,7 @@ process_fetch_insn_bottom(struct fetch_insn *code, unsigned long val,
/* Sum up total data length for dynamic arrays (strings) */ static nokprobe_inline int -__get_data_size(struct trace_probe *tp, struct pt_regs *regs) +__get_data_size(struct trace_probe *tp, struct pt_regs *regs, void *edata) { struct probe_arg *arg; int i, len, ret = 0; @@ -240,7 +240,7 @@ __get_data_size(struct trace_probe *tp, struct pt_regs *regs) for (i = 0; i < tp->nr_args; i++) { arg = tp->args + i; if (unlikely(arg->dynamic)) { - len = process_fetch_insn(arg->code, regs, NULL, NULL); + len = process_fetch_insn(arg->code, regs, edata, NULL, NULL); if (len > 0) ret += len; } @@ -251,7 +251,7 @@ __get_data_size(struct trace_probe *tp, struct pt_regs *regs)
/* Store the value of each argument */ static nokprobe_inline void -store_trace_args(void *data, struct trace_probe *tp, void *rec, +store_trace_args(void *data, struct trace_probe *tp, void *rec, void *edata, int header_size, int maxlen) { struct probe_arg *arg; @@ -266,7 +266,7 @@ store_trace_args(void *data, struct trace_probe *tp, void *rec, /* Point the dynamic data area if needed */ if (unlikely(arg->dynamic)) *dl = make_data_loc(maxlen, dyndata - base); - ret = process_fetch_insn(arg->code, rec, dl, base); + ret = process_fetch_insn(arg->code, rec, edata, dl, base); if (arg->dynamic && likely(ret > 0)) { dyndata += ret; maxlen -= ret; diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c index 796ebcae9b38..9e461362450a 100644 --- a/kernel/trace/trace_uprobe.c +++ b/kernel/trace/trace_uprobe.c @@ -211,8 +211,8 @@ static unsigned long translate_user_vaddr(unsigned long file_offset)
/* Note that we don't verify it, since the code does not come from user space */ static int -process_fetch_insn(struct fetch_insn *code, void *rec, void *dest, - void *base) +process_fetch_insn(struct fetch_insn *code, void *rec, void *edata, + void *dest, void *base) { struct pt_regs *regs = rec; unsigned long val; @@ -1490,11 +1490,11 @@ static int uprobe_dispatcher(struct uprobe_consumer *con, struct pt_regs *regs) if (WARN_ON_ONCE(!uprobe_cpu_buffer)) return 0;
- dsize = __get_data_size(&tu->tp, regs); + dsize = __get_data_size(&tu->tp, regs, NULL); esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
ucb = uprobe_buffer_get(); - store_trace_args(ucb->buf, &tu->tp, regs, esize, dsize); + store_trace_args(ucb->buf, &tu->tp, regs, NULL, esize, dsize);
if (trace_probe_test_flag(&tu->tp, TP_FLAG_TRACE)) ret |= uprobe_trace_func(tu, regs, ucb, dsize); @@ -1525,11 +1525,11 @@ static int uretprobe_dispatcher(struct uprobe_consumer *con, if (WARN_ON_ONCE(!uprobe_cpu_buffer)) return 0;
- dsize = __get_data_size(&tu->tp, regs); + dsize = __get_data_size(&tu->tp, regs, NULL); esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
ucb = uprobe_buffer_get(); - store_trace_args(ucb->buf, &tu->tp, regs, esize, dsize); + store_trace_args(ucb->buf, &tu->tp, regs, NULL, esize, dsize);
if (trace_probe_test_flag(&tu->tp, TP_FLAG_TRACE)) uretprobe_trace_func(tu, func, regs, ucb, dsize); diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_syntax_errors.tc b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_syntax_errors.tc index 20e42c030095..61877d166451 100644 --- a/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_syntax_errors.tc +++ b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_syntax_errors.tc @@ -34,7 +34,9 @@ check_error 'f vfs_read ^$stack10000' # BAD_STACK_NUM
check_error 'f vfs_read ^$arg10000' # BAD_ARG_NUM
+if !grep -q 'kernel return probes support:' README; then check_error 'f vfs_read $retval ^$arg1' # BAD_VAR +fi check_error 'f vfs_read ^$none_var' # BAD_VAR check_error 'f vfs_read ^'$REG # BAD_VAR
@@ -99,7 +101,9 @@ if grep -q "<argname>" README; then check_error 'f vfs_read args=^$arg*' # BAD_VAR_ARGS check_error 'f vfs_read +0(^$arg*)' # BAD_VAR_ARGS check_error 'f vfs_read $arg* ^$arg*' # DOUBLE_ARGS +if !grep -q 'kernel return probes support:' README; then check_error 'f vfs_read%return ^$arg*' # NOFENTRY_ARGS +fi check_error 'f vfs_read ^hoge' # NO_BTFARG check_error 'f kfree ^$arg10' # NO_BTFARG (exceed the number of parameters) check_error 'f kfree%return ^$retval' # NO_RETVAL diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc index 65fbb26fd58c..a16c6a6f6055 100644 --- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc @@ -108,7 +108,9 @@ if grep -q "<argname>" README; then check_error 'p vfs_read args=^$arg*' # BAD_VAR_ARGS check_error 'p vfs_read +0(^$arg*)' # BAD_VAR_ARGS check_error 'p vfs_read $arg* ^$arg*' # DOUBLE_ARGS +if !grep -q 'kernel return probes support:' README; then check_error 'r vfs_read ^$arg*' # NOFENTRY_ARGS +fi check_error 'p vfs_read+8 ^$arg*' # NOFENTRY_ARGS check_error 'p vfs_read ^hoge' # NO_BTFARG check_error 'p kfree ^$arg10' # NO_BTFARG (exceed the number of parameters)
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Add kretprobe and function-exit probe test cases to check whether entry arguments can be accessed correctly at function exit.
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 .../ftrace/test.d/dynevent/fprobe_entry_arg.tc     |   18 ++++++++++++++++++
 .../ftrace/test.d/kprobe/kretprobe_entry_arg.tc    |   18 ++++++++++++++++++
 2 files changed, 36 insertions(+)
 create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/fprobe_entry_arg.tc
 create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_entry_arg.tc
diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_entry_arg.tc b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_entry_arg.tc
new file mode 100644
index 000000000000..d183b8a8ecf8
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_entry_arg.tc
@@ -0,0 +1,18 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Function return probe entry argument access
+# requires: dynamic_events 'f[:[<group>/][<event>]] <func-name>':README 'kernel return probes support:':README
+
+echo 'f:tests/myevent1 vfs_open arg=$arg1' >> dynamic_events
+echo 'f:tests/myevent2 vfs_open%return arg=$arg1' >> dynamic_events
+
+echo 1 > events/tests/enable
+
+echo > trace
+cat trace > /dev/null
+
+function streq() {
+	test $1 = $2
+}
+
+streq `grep -A 1 -m 1 myevent1 trace | sed -r 's/^.*(arg=.*)/\1/' `
diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_entry_arg.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_entry_arg.tc
new file mode 100644
index 000000000000..53b82f36a1d0
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_entry_arg.tc
@@ -0,0 +1,18 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Kretprobe entry argument access
+# requires: kprobe_events 'kernel return probes support:':README
+
+echo 'p:myevent1 vfs_open arg=$arg1' >> kprobe_events
+echo 'r:myevent2 vfs_open arg=$arg1' >> kprobe_events
+
+echo 1 > events/kprobes/enable
+
+echo > trace
+cat trace > /dev/null
+
+function streq() {
+	test $1 = $2
+}
+
+streq `grep -A 1 -m 1 myevent1 trace | sed -r 's/^.*(arg=.*)/\1/' `
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Add notes about entry argument access at function exit probes for the kprobe and fprobe trace events.
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
Changes in v3:
 - Add a usage example in fprobetrace.rst.
---
 Documentation/trace/fprobetrace.rst |   31 +++++++++++++++++++++++++++++++
 Documentation/trace/kprobetrace.rst |    9 +++++++++
 2 files changed, 40 insertions(+)
diff --git a/Documentation/trace/fprobetrace.rst b/Documentation/trace/fprobetrace.rst index e35e6b18df40..0f187e3796e4 100644 --- a/Documentation/trace/fprobetrace.rst +++ b/Documentation/trace/fprobetrace.rst @@ -70,6 +70,14 @@ Synopsis of fprobe-events
For the details of TYPE, see :ref:`kprobetrace documentation <kprobetrace_types>`.
+Function arguments at exit +-------------------------- +Function arguments can be accessed at exit probe using $arg<N> fetcharg. This +is useful to record the function parameter and return value at once, and +trace the difference of structure fields (for debuging a function whether it +correctly updates the given data structure or not) +See the :ref:`sample<fprobetrace_exit_args_sample>` below for how it works. + BTF arguments ------------- BTF (BPF Type Format) argument allows user to trace function and tracepoint @@ -218,3 +226,26 @@ traceprobe event, you can trace that field as below. <idle>-0 [000] d..3. 5606.690317: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="kworker/0:1" usage=1 start_time=137000000 kworker/0:1-14 [000] d..3. 5606.690339: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="swapper/0" usage=2 start_time=0 <idle>-0 [000] d..3. 5606.692368: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="kworker/0:1" usage=1 start_time=137000000 + +.. _fprobetrace_exit_args_sample: + +The return probe allows us to access the results of some functions, which returns +the error code and its results are passed via function parameter, such as an +structure-initialization function. + +For example, vfs_open() will link the file structure to the inode and update +mode. You can trace that changes with return probe. +:: + + # echo 'f vfs_open mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events + # echo 'f vfs_open%%return mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events + # echo 1 > events/fprobes/enable + # cat trace + sh-131 [006] ...1. 1945.714346: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x2 inode=0x0 + sh-131 [006] ...1. 1945.714358: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4d801e inode=0xffff888008470168 + cat-143 [007] ...1. 1945.717949: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0 + cat-143 [007] ...1. 1945.717956: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4a801d inode=0xffff888005f78d28 + cat-143 [007] ...1. 1945.720616: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0 + cat-143 [007] ...1. 1945.728263: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0xa800d inode=0xffff888004ada8d8 + +You can see the `file::f_mode` and `file::f_inode` are upated in `vfs_open()`. diff --git a/Documentation/trace/kprobetrace.rst b/Documentation/trace/kprobetrace.rst index bf9cecb69fc9..a49662ccd53c 100644 --- a/Documentation/trace/kprobetrace.rst +++ b/Documentation/trace/kprobetrace.rst @@ -70,6 +70,15 @@ Synopsis of kprobe_events (*3) this is useful for fetching a field of data structures. (*4) "u" means user-space dereference. See :ref:`user_mem_access`.
+Function arguments at kretprobe +------------------------------- +Function arguments can be accessed at kretprobe using $arg<N> fetcharg. This +is useful to record the function parameter and return value at once, and +trace the difference of structure fields (for debuging a function whether it +correctly updates the given data structure or not). +See the :ref:`sample<fprobetrace_exit_args_sample>` in fprobe event for how +it works. + .. _kprobetrace_types:
Types