Extend fprobe to support list-style filters and explicit entry/exit suffixes. Currentyl, fprobe only supports a single symbol (or wildcard) per event. This patch allows users to specify a comma-separated list of symbols.
New Syntax: - f:[GRP/][EVENT] func1,func2,func3:entry - f:[GRP/][EVENT] func1,func2,func3:exit
Logic changes: - Refactor parsing logic into 'parse_fprobe_spec' - Support '!' prefix for exclusion - Disable BTF lookup ('ctx->funcname = NULL') when a list or wildcard is used, as a single function signature cannot apply to multiple functions. - Reject legacy '%return' suffix when used with lists or wildcards - Update tracefs/README
Testing: Verified on x86_64 via QEMU. Checked registration of lists, exclusions, and explicit suffixes. Verified rejection of invalid syntax including trailing commas and mixed legacy/new syntax.
Seokwoo Chung (Ryan) (3): docs: tracing/fprobe: Document list filters and :entry/:exit tracing/fprobe: Support comma-separated symbols and :entry/:exit selftests/ftrace: Add accept cases for fprobe list syntax
Changes in v4: - Added validation to reject trailing commas (empty tokens) in symbol lists - Added vaildation to reject mixed of list syntax with %return suffix - Refactored parse_fprobe_spec to user __free(kfree) for automatic memory cleanup - Removed the now-unused parse_symbol_and_return function to avoid compiler warnings. - Tigtened %return detection to ensure it only matches as a strict suffix, not a substring - Link to v3: https://lore.kernel.org/lkml/20250904103219.f4937968362bfff1ecd3f004@kernel....
Documentation/trace/fprobetrace.rst | 17 +- kernel/trace/trace.c | 3 +- kernel/trace/trace_fprobe.c | 209 ++++++++++++++---- .../ftrace/test.d/dynevent/fprobe_list.tc | 92 ++++++++ 4 files changed, 269 insertions(+), 52 deletions(-) create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc
Update fprobe event documentation to describe comma-separated symbol lists, exclusions, and explicit suffixes.
Signed-off-by: Seokwoo Chung (Ryan) seokwoo.chung130@gmail.com --- Documentation/trace/fprobetrace.rst | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/Documentation/trace/fprobetrace.rst b/Documentation/trace/fprobetrace.rst index b4c2ca3d02c1..bbcfd57f0005 100644 --- a/Documentation/trace/fprobetrace.rst +++ b/Documentation/trace/fprobetrace.rst @@ -25,14 +25,18 @@ Synopsis of fprobe-events ------------------------- ::
- f[:[GRP1/][EVENT1]] SYM [FETCHARGS] : Probe on function entry - f[MAXACTIVE][:[GRP1/][EVENT1]] SYM%return [FETCHARGS] : Probe on function exit + f[:[GRP1/][EVENT1]] SYM[%return] [FETCHARGS] : Single function + f[:[GRP1/][EVENT1]] SYM[,[!]SYM[,...]][:entry|:exit] [FETCHARGS] :Multiple + function t[:[GRP2/][EVENT2]] TRACEPOINT [FETCHARGS] : Probe on tracepoint
GRP1 : Group name for fprobe. If omitted, use "fprobes" for it. GRP2 : Group name for tprobe. If omitted, use "tracepoints" for it. EVENT1 : Event name for fprobe. If omitted, the event name is - "SYM__entry" or "SYM__exit". + - For a single literal symbol, the event name is + "SYM__entry" or "SYM__exit". + - For a *list or any wildcard*, an explicit [GRP1/][EVENT1] is + required; otherwise the parser rejects it. EVENT2 : Event name for tprobe. If omitted, the event name is the same as "TRACEPOINT", but if the "TRACEPOINT" starts with a digit character, "_TRACEPOINT" is used. @@ -40,6 +44,13 @@ Synopsis of fprobe-events can be probed simultaneously, or 0 for the default value as defined in Documentation/trace/fprobe.rst
+ SYM : Function name or comma-separated list of symbols. + - SYM prefixed with "!" are exclusions. + - ":entry" suffix means it probes entry of given symbols + (default) + - ":exit" suffix means it probes exit of given symbols. + - "%return" suffix means it probes exit of SYM (single + symbol). FETCHARGS : Arguments. Each probe can have up to 128 args. ARG : Fetch "ARG" function argument using BTF (only for function entry or tracepoint.) (*1)
- Update DEFINE_FREE to use standard __free() - Extend fprobe to support multiple symbols per event. Add parsing logic for lists, ! exclusions, and explicit suffixes. Update tracefs/README to reflect the new syntax
Signed-off-by: Seokwoo Chung (Ryan) seokwoo.chung130@gmail.com --- kernel/trace/trace.c | 3 +- kernel/trace/trace_fprobe.c | 209 +++++++++++++++++++++++++++--------- 2 files changed, 163 insertions(+), 49 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index d1e527cf2aae..e0b77268a702 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -5518,7 +5518,8 @@ static const char readme_msg[] = "\t r[maxactive][:[<group>/][<event>]] <place> [<args>]\n" #endif #ifdef CONFIG_FPROBE_EVENTS - "\t f[:[<group>/][<event>]] <func-name>[%return] [<args>]\n" + "\t f[:[<group>/][<event>]] <func-name>[:entry|:exit] [<args>]\n" + "\t (single symbols still accept %return)\n" "\t t[:[<group>/][<event>]] <tracepoint> [<args>]\n" #endif #ifdef CONFIG_HIST_TRIGGERS diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c index 8001dbf16891..6307d7d7dd9c 100644 --- a/kernel/trace/trace_fprobe.c +++ b/kernel/trace/trace_fprobe.c @@ -187,11 +187,14 @@ DEFINE_FREE(tuser_put, struct tracepoint_user *, */ struct trace_fprobe { struct dyn_event devent; + char *filter; struct fprobe fp; + bool list_mode; + char *nofilter; const char *symbol; + struct trace_probe tp; bool tprobe; struct tracepoint_user *tuser; - struct trace_probe tp; };
static bool is_trace_fprobe(struct dyn_event *ev) @@ -559,6 +562,8 @@ static void free_trace_fprobe(struct trace_fprobe *tf) trace_probe_cleanup(&tf->tp); if (tf->tuser) tracepoint_user_put(tf->tuser); + kfree(tf->filter); + kfree(tf->nofilter); kfree(tf->symbol); kfree(tf); } @@ -838,7 +843,12 @@ static int __register_trace_fprobe(struct trace_fprobe *tf) if (trace_fprobe_is_tracepoint(tf)) return __regsiter_tracepoint_fprobe(tf);
- /* TODO: handle filter, nofilter or symbol list */ + /* Registration path: + * - list_mode: pass filter/nofilter + * - single: pass symbol only (legacy) + */ + if (tf->list_mode) + return register_fprobe(&tf->fp, tf->filter, tf->nofilter); return register_fprobe(&tf->fp, tf->symbol, NULL); }
@@ -1154,60 +1164,119 @@ static struct notifier_block tprobe_event_module_nb = { }; #endif /* CONFIG_MODULES */
-static int parse_symbol_and_return(int argc, const char *argv[], - char **symbol, bool *is_return, - bool is_tracepoint) +static bool has_wildcard(const char *s) { - char *tmp = strchr(argv[1], '%'); - int i; + return s && (strchr(s, '*') || strchr(s, '?')); +}
- if (tmp) { - int len = tmp - argv[1]; +static int parse_fprobe_spec(const char *in, bool is_tracepoint, + char **base, bool *is_return, bool *list_mode, + char **filter, char **nofilter) +{ + char *work __free(kfree) = NULL; + char *b __free(kfree) = NULL; + char *f __free(kfree) = NULL; + char *nf __free(kfree) = NULL; + bool legacy_ret = false; + bool list = false; + const char *p; + int ret = 0;
- if (!is_tracepoint && !strcmp(tmp, "%return")) { - *is_return = true; - } else { - trace_probe_log_err(len, BAD_ADDR_SUFFIX); - return -EINVAL; - } - *symbol = kmemdup_nul(argv[1], len, GFP_KERNEL); - } else - *symbol = kstrdup(argv[1], GFP_KERNEL); - if (!*symbol) - return -ENOMEM; + if (!in || !base || !is_return || !list_mode || !filter || !nofilter) + return -EINVAL;
- if (*is_return) - return 0; + *base = NULL; *filter = NULL; *nofilter = NULL; + *is_return = false; *list_mode = false;
if (is_tracepoint) { - tmp = *symbol; - while (*tmp && (isalnum(*tmp) || *tmp == '_')) - tmp++; - if (*tmp) { - /* find a wrong character. */ - trace_probe_log_err(tmp - *symbol, BAD_TP_NAME); - kfree(*symbol); - *symbol = NULL; + if (strchr(in, ',') || strchr(in, ':')) return -EINVAL; - } + if (strstr(in, "%return")) + return -EINVAL; + for (p = in; *p; p++) + if (!isalnum(*p) && *p != '_') + return -EINVAL; + b = kstrdup(in, GFP_KERNEL); + if (!b) + return -ENOMEM; + *base = no_free_ptr(b); + return 0; }
- /* If there is $retval, this should be a return fprobe. */ - for (i = 2; i < argc; i++) { - tmp = strstr(argv[i], "$retval"); - if (tmp && !isalnum(tmp[7]) && tmp[7] != '_') { - if (is_tracepoint) { - trace_probe_log_set_index(i); - trace_probe_log_err(tmp - argv[i], RETVAL_ON_PROBE); - kfree(*symbol); - *symbol = NULL; + work = kstrdup(in, GFP_KERNEL); + if (!work) + return -ENOMEM; + + p = strstr(work, "%return"); + if (p && p[7] == '\0') { + *is_return = true; + legacy_ret = true; + *(char *)p = '\0'; + } else { + /* + * If "symbol:entry" or "symbol:exit" is given, it is new + * style probe. + */ + p = strrchr(work, ':'); + if (p) { + if (!strcmp(p, ":exit")) { + *is_return = true; + *(char *)p = '\0'; + } else if (!strcmp(p, ":entry")) { + *(char *)p = '\0'; + } else { return -EINVAL; } - *is_return = true; - break; } } - return 0; + + list = !!strchr(work, ','); + + if (list && legacy_ret) { + return -EINVAL; + } + + if (legacy_ret) + *is_return = true; + + b = kstrdup(work, GFP_KERNEL); + if (!b) + return -ENOMEM; + + if (list) { + char *tmp = b, *tok; + size_t fsz, nfsz; + + fsz = nfsz = strlen(b) + 1; + + f = kzalloc(fsz, GFP_KERNEL); + nf = kzalloc(nfsz, GFP_KERNEL); + if (!f || !nf) + return -ENOMEM; + + while ((tok = strsep(&tmp, ",")) != NULL) { + char *dst; + bool neg = (*tok == '!'); + + if (*tok == '\0') { + trace_probe_log_err(tmp - b - 1, BAD_TP_NAME); + return -EINVAL; + + if (neg) + tok++; + dst = neg ? nf : f; + if (dst[0] != '\0') + strcat(dst, ","); + strcat(dst, tok); + } + *list_mode = true; + } + + *base = no_free_ptr(b); + *filter = no_free_ptr(f); + *nofilter = no_free_ptr(nf); + + return ret; }
static int trace_fprobe_create_internal(int argc, const char *argv[], @@ -1241,6 +1310,8 @@ static int trace_fprobe_create_internal(int argc, const char *argv[], const char *event = NULL, *group = FPROBE_EVENT_SYSTEM; struct module *mod __free(module_put) = NULL; const char **new_argv __free(kfree) = NULL; + char *parsed_nofilter __free(kfree) = NULL; + char *parsed_filter __free(kfree) = NULL; char *symbol __free(kfree) = NULL; char *ebuf __free(kfree) = NULL; char *gbuf __free(kfree) = NULL; @@ -1249,6 +1320,7 @@ static int trace_fprobe_create_internal(int argc, const char *argv[], char *dbuf __free(kfree) = NULL; int i, new_argc = 0, ret = 0; bool is_tracepoint = false; + bool list_mode = false; bool is_return = false;
if ((argv[0][0] != 'f' && argv[0][0] != 't') || argc < 2) @@ -1270,11 +1342,26 @@ static int trace_fprobe_create_internal(int argc, const char *argv[],
trace_probe_log_set_index(1);
- /* a symbol(or tracepoint) must be specified */ - ret = parse_symbol_and_return(argc, argv, &symbol, &is_return, is_tracepoint); + /* Parse spec early (single vs list, suffix, base symbol) */ + ret = parse_fprobe_spec(argv[1], is_tracepoint, &symbol, &is_return, + &list_mode, &parsed_filter, &parsed_nofilter); if (ret < 0) return -EINVAL;
+ for (i = 2; i < argc; i++) { + char *tmp = strstr(argv[i], "$retval"); + + if (tmp && !isalnum(tmp[7]) && tmp[7] != '_') { + if (is_tracepoint) { + trace_probe_log_set_index(i); + trace_probe_log_err(tmp - argv[i], RETVAL_ON_PROBE); + return -EINVAL; + } + is_return = true; + break; + } + } + trace_probe_log_set_index(0); if (event) { gbuf = kmalloc(MAX_EVENT_NAME_LEN, GFP_KERNEL); @@ -1287,6 +1374,15 @@ static int trace_fprobe_create_internal(int argc, const char *argv[], }
if (!event) { + /* + * Event name rules: + * - For list/wildcard: require explicit [GROUP/]EVENT + * - For single literal: autogenerate symbol__entry/symbol__exit + */ + if (list_mode || has_wildcard(symbol)) { + trace_probe_log_err(0, NO_GROUP_NAME); + return -EINVAL; + } ebuf = kmalloc(MAX_EVENT_NAME_LEN, GFP_KERNEL); if (!ebuf) return -ENOMEM; @@ -1322,7 +1418,8 @@ static int trace_fprobe_create_internal(int argc, const char *argv[], NULL, NULL, NULL, sbuf); } } - if (!ctx->funcname) + + if (!list_mode && !has_wildcard(symbol) && !is_tracepoint) ctx->funcname = symbol;
abuf = kmalloc(MAX_BTF_ARGS_LEN, GFP_KERNEL); @@ -1356,6 +1453,21 @@ static int trace_fprobe_create_internal(int argc, const char *argv[], return ret; }
+ /* carry list parsing result into tf */ + if (!is_tracepoint) { + tf->list_mode = list_mode; + if (parsed_filter) { + tf->filter = kstrdup(parsed_filter, GFP_KERNEL); + if (!tf->filter) + return -ENOMEM; + } + if (parsed_nofilter) { + tf->nofilter = kstrdup(parsed_nofilter, GFP_KERNEL); + if (!tf->nofilter) + return -ENOMEM; + } + } + /* parse arguments */ for (i = 0; i < argc; i++) { trace_probe_log_set_index(i + 2); @@ -1442,8 +1554,9 @@ static int trace_fprobe_show(struct seq_file *m, struct dyn_event *ev) seq_printf(m, ":%s/%s", trace_probe_group_name(&tf->tp), trace_probe_name(&tf->tp));
- seq_printf(m, " %s%s", trace_fprobe_symbol(tf), - trace_fprobe_is_return(tf) ? "%return" : ""); + seq_printf(m, " %s", trace_fprobe_symbol(tf)); + if (!trace_fprobe_is_tracepoint(tf) && trace_fprobe_is_return(tf)) + seq_puts(m, ":exit");
for (i = 0; i < tf->tp.nr_args; i++) seq_printf(m, " %s=%s", tf->tp.args[i].name, tf->tp.args[i].comm);
Signed-off-by: Seokwoo Chung (Ryan) seokwoo.chung130@gmail.com --- .../ftrace/test.d/dynevent/fprobe_list.tc | 92 +++++++++++++++++++ 1 file changed, 92 insertions(+) create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc
diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc new file mode 100644 index 000000000000..45e57c6f487d --- /dev/null +++ b/tools/testing/selftests/ftrace/test.d/dynevent/fprobe_list.tc @@ -0,0 +1,92 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 +# description: Fprobe event list syntax and :entry/:exit suffixes +# requires: dynamic_events "f[:[<group>/][<event>]] <func-name>[:entry|:exit] [<args>]":README + +# Setup symbols to test. These are common kernel functions. +PLACE=vfs_read +PLACE2=vfs_write +PLACE3=vfs_open + +echo 0 > events/enable +echo > dynamic_events + +# Get baseline count of enabled functions (should be 0 if clean, but be safe) +if [ -f enabled_functions ]; then + ocnt=`cat enabled_functions | wc -l` +else + ocnt=0 +fi + +# Test 1: List default (entry) with exclusion +# Target: Trace vfs_read and vfs_open, but EXCLUDE vfs_write +echo "f:test/list_entry $PLACE,!$PLACE2,$PLACE3" >> dynamic_events +grep -q "test/list_entry" dynamic_events +test -d events/test/list_entry + +echo 1 > events/test/list_entry/enable + +grep -q "$PLACE" enabled_functions +grep -q "$PLACE3" enabled_functions +! grep -q "$PLACE2" enabled_functions + +# Check count (Baseline + 2 new functions) +cnt=`cat enabled_functions | wc -l` +if [ $cnt -ne $((ocnt + 2)) ]; then + exit_fail +fi + +# Cleanup Test 1 +echo 0 > events/test/list_entry/enable +echo "-:test/list_entry" >> dynamic_events +! grep -q "test/list_entry" dynamic_events + +# Count should return to baseline +cnt=`cat enabled_functions | wc -l` +if [ $cnt -ne $ocnt ]; then + exit_fail +fi + +# Test 2: List with explicit :entry suffix +# (Should behave exactly like Test 1) +echo "f:test/list_entry_exp $PLACE,!$PLACE2,$PLACE3:entry" >> dynamic_events +grep -q "test/list_entry_exp" dynamic_events +test -d events/test/list_entry_exp + +echo 1 > events/test/list_entry_exp/enable + +grep -q "$PLACE" enabled_functions +grep -q "$PLACE3" enabled_functions +! grep -q "$PLACE2" enabled_functions + +cnt=`cat enabled_functions | wc -l` +if [ $cnt -ne $((ocnt + 2)) ]; then + exit_fail +fi + +# Cleanup Test 2 +echo 0 > events/test/list_entry_exp/enable +echo "-:test/list_entry_exp" >> dynamic_events + +# Test 3: List with :exit suffix +echo "f:test/list_exit $PLACE,!$PLACE2,$PLACE3:exit" >> dynamic_events +grep -q "test/list_exit" dynamic_events +test -d events/test/list_exit + +echo 1 > events/test/list_exit/enable + +# Even for return probes, enabled_functions lists the attached symbols +grep -q "$PLACE" enabled_functions +grep -q "$PLACE3" enabled_functions +! grep -q "$PLACE2" enabled_functions + +cnt=`cat enabled_functions | wc -l` +if [ $cnt -ne $((ocnt + 2)) ]; then + exit_fail +fi + +# Cleanup Test 3 +echo 0 > events/test/list_exit/enable +echo "-:test/list_exit" >> dynamic_events + +clear_trace
linux-kselftest-mirror@lists.linaro.org