On Mon, 10 Jan 2022 11:15:20 +0800 Pingfan Liu kernelfans@gmail.com wrote:
Hi Steven,
This patch passed my test. But I have some concern, please see comment inline.
Thanks, can I add a "Tested-by:" from you?
(when I have my final version)
index 996920ed1812..cf0fa9a785c7 100644 --- a/kernel/trace/trace_events_filter.c +++ b/kernel/trace/trace_events_filter.c @@ -5,6 +5,7 @@
- Copyright (C) 2009 Tom Zanussi tzanussi@gmail.com
*/ +#include <linux/uaccess.h> #include <linux/module.h> #include <linux/ctype.h> #include <linux/mutex.h> @@ -654,12 +655,50 @@ DEFINE_EQUALITY_PRED(32); DEFINE_EQUALITY_PRED(16); DEFINE_EQUALITY_PRED(8); +/* user space strings temp buffer */ +#define USTRING_BUF_SIZE 512
Should it be PATH_MAX(4096) in case of matching against a file path?
I went back and forth with this size in my head, and since I currently do not free it, and it is 4 * nr_cpus in size, I went with the smallest number I felt was OK.
We can increase it in the future, and even expose the size to user space.
+struct ustring_buffer {
- char buffer[USTRING_BUF_SIZE];
+};
+static __percpu struct ustring_buffer *ustring_per_cpu;
+static __always_inline char *test_string(char *str) +{
- struct ustring_buffer *ubuf;
- char __user *ustr;
- char *kstr;
- if (!ustring_per_cpu)
return NULL;
- ubuf = this_cpu_ptr(ustring_per_cpu);
- kstr = ubuf->buffer;
- if (likely((unsigned long)str >= TASK_SIZE)) {
/* For safety, do not trust the string pointer */
if (!strncpy_from_kernel_nofault(kstr, str, USTRING_BUF_SIZE))
Since no other trace_event_class except event_class_syscall_enter tries to uaccess, so the unreliable source only comes from event_class_syscall_enter.
In that case, the access to kernel address is forbidden. So here just return -EACCES ?
I changed the way all pointers work. Any event that access a string pointer (there are a few), and we filter on it, then I want it to go through this path.
And returning -EACCES is useless this is done in the filtering logic and that error will not be exposed to anyone. The best we can do is to fail the filter.
return NULL;
- } else {
/* user space address? */
ustr = str;
if (!strncpy_from_user_nofault(kstr, ustr, USTRING_BUF_SIZE))
return NULL;
- }
- return kstr;
+}
/* Filter predicate for fixed sized arrays of characters */ static int filter_pred_string(struct filter_pred *pred, void *event) { char *addr = (char *)(event + pred->offset); int cmp, match;
- addr = test_string(addr);
Among all of trace_event_class, only event_class_syscall_enter exposed to this fault (uprobe does not uaccess). So I think the strncpy_*() can be avoided based on class, which improves performance.
The thing is, tracing should never cause a fault in the system. If a pointer is bad and you filter on it, it should not cause a crash. Your patch showed me we have this issues with kernel pointers too. And since we have no safe "strncmp" the best we can do is a safe "strncpy" and then compare it.
You actually exposed more issues than the one you were trying to solve, and this patch now addresses those issues.
Yes, it will impact performance, but robustness always trumps performance.
- if (!addr)
return 0;
- cmp = pred->regex.match(addr, &pred->regex, pred->regex.field_len);
match = cmp ^ pred->not; @@ -671,10 +710,16 @@ static int filter_pred_string(struct filter_pred *pred, void *event) static int filter_pred_pchar(struct filter_pred *pred, void *event) { char **addr = (char **)(event + pred->offset);
- char *str; int cmp, match;
- int len = strlen(*addr) + 1; /* including tailing '\0' */
- int len;
- str = test_string(*addr);
- if (!str)
return 0;
- cmp = pred->regex.match(*addr, &pred->regex, len);
- len = strlen(str) + 1; /* including tailing '\0' */
- cmp = pred->regex.match(str, &pred->regex, len);
match = cmp ^ pred->not; @@ -784,6 +829,10 @@ static int filter_pred_none(struct filter_pred *pred, void *event) static int regex_match_full(char *str, struct regex *r, int len) {
- str = test_string(str);
Since all regex_match_*() are called in filter_pred_*(), which have already protected codes from page fault. So no need to double check.
Ah, you're right. I only need to add this to filter_pred_pchar() and not the regex.
Thanks, I'll send a v2.
-- Steve