For cases like IPv6 addresses, having a means to supply tracing predicates for fields with more than 8 bytes would be convenient. This series provides a simple way to support this by allowing simple ==, != memory comparison with the predicate supplied when the size of the field exceeds 8 bytes. For example, to trace ::1, the predicate
"dst == 0x00000000000000000000000000000001"
..could be used.
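As an aside, a predicate of this form can be generated from a textual address rather than typed by hand. The following is a minimal userspace sketch, assuming a POSIX system with inet_pton(); ipv6_to_predicate() is a hypothetical helper invented for illustration and is not part of this series.

```c
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the series): build a predicate such as
 *   dst == 0x00000000000000000000000000000001
 * from a textual IPv6 address. Returns 0 on success, -1 on failure.
 */
static int ipv6_to_predicate(const char *field, const char *addr,
			     char *out, size_t outlen)
{
	struct in6_addr in6;
	size_t i;
	int n;

	/* inet_pton() returns 1 only for a valid address string. */
	if (inet_pton(AF_INET6, addr, &in6) != 1)
		return -1;

	n = snprintf(out, outlen, "%s == 0x", field);
	/* Need room for two hex digits per byte plus the trailing NUL. */
	if (n < 0 || (size_t)n + 2 * sizeof(in6.s6_addr) + 1 > outlen)
		return -1;

	/* Addresses are stored in network byte order; emit bytes as-is. */
	for (i = 0; i < sizeof(in6.s6_addr); i++)
		n += snprintf(out + n, outlen - n, "%02x", in6.s6_addr[i]);

	return 0;
}
```

The resulting string always carries exactly 32 hex digits, which is what the exact-size requirement described in this series expects for a 16-byte field.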
When investigating this initially, I stumbled upon a kernel crash when specifying a predicate for a non-string field that is not 1, 2, 4, or 8 bytes in size. Patch 1 fixes it. Patch 2 provides the support for > 8 byte fields via a memcmp()-style predicate. Patch 3 adds tests for filter predicates, and patch 4 documents the fact that for fields larger than 8 bytes, only == and != are supported.
Alan Maguire (2):
  tracing: predicate matching trigger crashes for > 8-byte arrays
  tracing: support > 8 byte array filter predicates

Oracle Public Cloud User (2):
  selftests/ftrace: add test coverage for filter predicates
  tracing: document > 8 byte numeric filtering support

 Documentation/trace/events.rst                |  9 +++
 kernel/trace/trace_events_filter.c            | 59 +++++++++++++++++-
 .../selftests/ftrace/test.d/event/filter.tc   | 62 +++++++++++++++++++
 3 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/ftrace/test.d/event/filter.tc
The following (wrong) use of tracepoint filtering was enough to trigger a null-pointer dereference crash:
  cd /sys/kernel/debug/tracing
  echo "saddr_v6 == 0x0100007f" > tcp/tcp_receive_reset/filter
  echo 1 > tcp/tcp_receive_reset/enable
  wget https://localhost
This works fine if saddr - a 4-byte array representing the source address - is used instead.
The fix is to handle the case where we encounter an unexpected field size.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 kernel/trace/trace_events_filter.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 4b1057ab9d96..65e01c8d48d9 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -1490,6 +1490,11 @@ static int parse_pred(const char *str, void *data,
 		else {
 			pred->fn = select_comparison_fn(pred->op, field->size,
 							field->is_signed);
+			if (!pred->fn) {
+				parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP,
+					    pos + i);
+				goto err_free;
+			}
 			if (pred->op == OP_NE)
 				pred->not = 1;
 		}
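To illustrate the failure mode outside the kernel, here is a minimal userspace sketch, assuming a simplified selector that only knows 1, 2, 4 and 8 byte fields. All names (pred_fn, select_eq_fn, pred_setup) are hypothetical stand-ins and not the kernel's own; the point is only that a selector which can return NULL needs its result checked at setup time, before anything calls through the pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a predicate callback type. */
typedef int (*pred_fn)(const void *field, uint64_t val);

/* memcpy() is used so unaligned field addresses are handled safely. */
static int eq_u8(const void *f, uint64_t v)
{
	uint8_t x;

	memcpy(&x, f, sizeof(x));
	return x == (uint8_t)v;
}

static int eq_u16(const void *f, uint64_t v)
{
	uint16_t x;

	memcpy(&x, f, sizeof(x));
	return x == (uint16_t)v;
}

static int eq_u32(const void *f, uint64_t v)
{
	uint32_t x;

	memcpy(&x, f, sizeof(x));
	return x == (uint32_t)v;
}

static int eq_u64(const void *f, uint64_t v)
{
	uint64_t x;

	memcpy(&x, f, sizeof(x));
	return x == v;
}

/* Returns NULL for any size other than 1, 2, 4 or 8 bytes. */
static pred_fn select_eq_fn(size_t size)
{
	switch (size) {
	case 1: return eq_u8;
	case 2: return eq_u16;
	case 4: return eq_u32;
	case 8: return eq_u64;
	default: return NULL;
	}
}

struct pred {
	pred_fn fn;
	uint64_t val;
};

/* Reject the predicate up front instead of storing a NULL callback. */
static int pred_setup(struct pred *p, size_t field_size, uint64_t val)
{
	p->fn = select_eq_fn(field_size);
	if (!p->fn)
		return -1;
	p->val = val;
	return 0;
}
```

With this shape, a 16-byte field is refused when the filter is written, so the event path never sees a predicate without a callback.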
On Sun, 7 Aug 2022 23:21:20 +0100 Alan Maguire alan.maguire@oracle.com wrote:
> The following (wrong) use of tracepoint filtering was enough to trigger a null-pointer dereference crash:
>
>   cd /sys/kernel/debug/tracing
>   echo "saddr_v6 == 0x0100007f" > tcp/tcp_receive_reset/filter
>   echo 1 > tcp/tcp_receive_reset/enable
>   wget https://localhost
>
> This works fine if saddr - a 4-byte array representing the source address - is used instead.
The patch series is a new feature so it would need to go into the next merge window. But this patch looks to be a bug fix, so I'll pull this one in separately, and tag it for stable.
Thanks,
-- Steve
> Fix is to handle case where we encounter an unexpected size.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  kernel/trace/trace_events_filter.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
> index 4b1057ab9d96..65e01c8d48d9 100644
> --- a/kernel/trace/trace_events_filter.c
> +++ b/kernel/trace/trace_events_filter.c
> @@ -1490,6 +1490,11 @@ static int parse_pred(const char *str, void *data,
>  		else {
>  			pred->fn = select_comparison_fn(pred->op, field->size,
>  							field->is_signed);
> +			if (!pred->fn) {
> +				parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP,
> +					    pos + i);
> +				goto err_free;
> +			}
>  			if (pred->op == OP_NE)
>  				pred->not = 1;
>  		}
For > 8 byte values, allow simple binary '==' and '!=' predicates where the user passes in a hex ASCII representation of the desired value. This representation must match the field size exactly, and a simple memory comparison between the predicate and actual values is carried out. This allows predicates on, for example, IPv6 addresses, such as filtering on ::1:
  cd /sys/kernel/debug/tracing/events/tcp/tcp_receive_reset
  echo "saddr_v6 == 0x00000000000000000000000000000001" > filter
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 kernel/trace/trace_events_filter.c | 54 +++++++++++++++++++++++++++++-
 1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_events_filter.c b/kernel/trace/trace_events_filter.c
index 65e01c8d48d9..31c900b6a83c 100644
--- a/kernel/trace/trace_events_filter.c
+++ b/kernel/trace/trace_events_filter.c
@@ -147,6 +147,8 @@ enum {
 	PROCESS_OR	= 4,
 };
 
+static int filter_pred_memcmp(struct filter_pred *pred, void *event);
+
 /*
  * Without going into a formal proof, this explains the method that is used in
  * parsing the logical expressions.
@@ -583,8 +585,11 @@ predicate_parse(const char *str, int nr_parens, int nr_preds,
 	kfree(op_stack);
 	kfree(inverts);
 	if (prog_stack) {
-		for (i = 0; prog_stack[i].pred; i++)
+		for (i = 0; prog_stack[i].pred; i++) {
+			if (prog_stack[i].pred->fn == filter_pred_memcmp)
+				kfree((u8 *)prog_stack[i].pred->val);
 			kfree(prog_stack[i].pred);
+		}
 		kfree(prog_stack);
 	}
 	return ERR_PTR(ret);
@@ -841,6 +846,14 @@ static int filter_pred_none(struct filter_pred *pred, void *event)
 	return 0;
 }
 
+static int filter_pred_memcmp(struct filter_pred *pred, void *event)
+{
+	u8 *mem = (u8 *)(event + pred->offset);
+	u8 *cmp = (u8 *)(pred->val);
+
+	return (memcmp(mem, cmp, pred->field->size) == 0) ^ pred->not;
+}
+
 /*
  * regex_match_foo - Basic regex callbacks
  *
@@ -1443,6 +1456,45 @@ static int parse_pred(const char *str, void *data,
 		/* go past the last quote */
 		i++;
 
+	} else if (str[i] == '0' && tolower(str[i + 1]) == 'x' &&
+		   field->size > 8) {
+		u8 *pred_val;
+
+		/* For sizes > 8 bytes, we store a binary representation
+		 * for comparison; only '==' and '!=' are supported.
+		 * To keep things simple, the predicate value must specify
+		 * a value that matches the field size exactly, with leading
+		 * 0s if necessary.
+		 */
+		if (pred->op != OP_EQ && pred->op != OP_NE) {
+			parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + i);
+			goto err_free;
+		}
+
+		/* skip required 0x */
+		s += 2;
+		i += 2;
+
+		while (isalnum(str[i]))
+			i++;
+
+		len = i - s;
+		if (len != (field->size * 2)) {
+			parse_error(pe, FILT_ERR_ILLEGAL_FIELD_OP, pos + s);
+			goto err_free;
+		}
+
+		pred_val = kzalloc(field->size, GFP_KERNEL);
+		if (hex2bin(pred_val, str + s, field->size)) {
+			parse_error(pe, FILT_ERR_ILLEGAL_INTVAL, pos + s);
+			kfree(pred_val);
+			goto err_free;
+		}
+		pred->val = (u64)pred_val;
+		pred->fn = filter_pred_memcmp;
+		if (pred->op == OP_NE)
+			pred->not = 1;
+
 	} else if (isdigit(str[i]) || str[i] == '-') {
 
 		/* Make sure the field is not a string */
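The parsing and matching steps in this patch can be sketched in userspace as follows. This is a minimal illustration under stated assumptions, not the kernel code: struct mem_pred, mem_pred_init() and mem_pred_match() are hypothetical names, the hex decoding is hand-rolled in place of the kernel's hex2bin(), and calloc() stands in for kzalloc() (with the allocation result checked).

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace analogue of a memcmp()-style predicate. */
struct mem_pred {
	unsigned char *val;	/* binary value to compare against */
	size_t size;		/* field size in bytes */
	int not;		/* 1 for '!=', 0 for '==' */
};

/* Decode one hex digit; returns -1 if c is not a hex digit. */
static int hex_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = (char)tolower((unsigned char)c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Parse "0x<hex>" where <hex> must be exactly 2 * size digits, leading
 * zeros included. Returns 0 on success, -1 on any error.
 */
static int mem_pred_init(struct mem_pred *p, const char *str, size_t size,
			 int negate)
{
	size_t i;

	if (size == 0 || str[0] != '0' ||
	    tolower((unsigned char)str[1]) != 'x')
		return -1;
	str += 2;	/* skip required 0x */

	if (strlen(str) != 2 * size)
		return -1;

	p->val = calloc(1, size);
	if (!p->val)
		return -1;

	for (i = 0; i < size; i++) {
		int hi = hex_nibble(str[2 * i]);
		int lo = hex_nibble(str[2 * i + 1]);

		if (hi < 0 || lo < 0) {
			free(p->val);
			p->val = NULL;
			return -1;
		}
		p->val[i] = (unsigned char)((hi << 4) | lo);
	}
	p->size = size;
	p->not = negate;
	return 0;
}

/* Byte-wise equality, inverted when the predicate was '!='. */
static int mem_pred_match(const struct mem_pred *p, const void *field)
{
	return (memcmp(field, p->val, p->size) == 0) ^ p->not;
}
```

Because the comparison is byte-wise, the hex string is interpreted in memory order, which for IPv6 addresses is network byte order; that is why ::1 is written with the 01 byte last.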
Add tests verifying that filter predicates work for 1/2/4/8/16 byte values and strings; use predicates at event and subsystem level.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 .../selftests/ftrace/test.d/event/filter.tc | 62 +++++++++++++++++++
 1 file changed, 62 insertions(+)
 create mode 100644 tools/testing/selftests/ftrace/test.d/event/filter.tc

diff --git a/tools/testing/selftests/ftrace/test.d/event/filter.tc b/tools/testing/selftests/ftrace/test.d/event/filter.tc
new file mode 100644
index 000000000000..396383519f84
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/event/filter.tc
@@ -0,0 +1,62 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: event tracing - enable filter predicates
+# requires: set_event events/sched
+# flags:
+
+do_reset() {
+    echo 0 > ${event}/enable
+    echo 0 > ${event}/filter
+    clear_trace
+}
+
+fail() { #msg
+    echo $1
+    exit_fail
+}
+
+# verify filter predicates at trace event/subsys level for
+# - string (prev_comm)
+# - 1-byte value (common_flags)
+# - 2-byte value (common_type)
+# - 4-byte value (next_pid)
+# - 8-byte value (prev_state)
+
+for event in events/sched/sched_switch events/sched
+do
+    for filter in "prev_comm == 'ping'" \
+                  "common_flags != 0" \
+                  "common_type >= 0" \
+                  "next_pid > 0" \
+                  "prev_state != 0"
+    do
+        echo "$filter" > ${event}/filter
+        echo 1 > ${event}/enable
+        yield
+        count=`grep sched_switch trace|wc -l`
+        if [ $count -lt 1 ]; then
+            fail "at least one $event should be recorded for '$filter'"
+        fi
+        do_reset
+    done
+done
+
+# verify '==', '!=' filter predicates for 16-byte array at event/subsys
+# level
+
+LOCALHOST="-6 ::1"
+for event in events/fib6/fib6_table_lookup events/fib6 ; do
+    for filter in "dst == 0x00000000000000000000000000000001" \
+                  "src != 0x00000000000000000000000000000001"
+    do
+        echo "$filter" > ${event}/filter
+        echo 1 > ${event}/enable
+        yield
+        count=`grep fib6_table_lookup trace|wc -l`
+        if [ $count -lt 1 ]; then
+            fail "at least one $event should be recorded for '$filter'"
+        fi
+        do_reset
+    done
+done
+exit 0
For values > 8 bytes in size, only == and != filter predicates are supported; document this.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 Documentation/trace/events.rst | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index c47f381d0c00..318dba2fe3ee 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -186,6 +186,15 @@ The operators available for numeric fields are:

 ==, !=, <, <=, >, >=, &

+For numeric fields larger than 8 bytes, only
+
+==, !=
+
+...are allowed, and values for comparison must match field size exactly.
+For example, to match the "::1" IPv6 address:
+
+"dst == 0x00000000000000000000000000000001"
+
 And for string fields they are:

 ==, !=, ~