From: wuqiang <wuqiang.matt@bytedance.com>
[ Upstream commit 3b7ddab8a19aefc768f345fd3782af35b4a68d9b ]
The default value of maxactive is set to num_possible_cpus() on nonpreemptible systems. On a 2-core system, only 2 kretprobe instances are allocated by default, and these 2 instances for an execve kretprobe are very likely to be used up by a pipelined command.
Here's the testcase: a shell script was added to crontab, and the content of the script is:
  #!/bin/sh
  do_something_magic `tr -dc a-z < /dev/urandom | head -c 10`
cron triggers a series of program executions (4 times every hour). Event loss is then typically noticed after 3-4 hours of testing.
The issue is caused by a burst of execve requests. The best number of kretprobe instances differs case by case, and it is the user's duty to determine it, but num_possible_cpus() is inadequate as the default value, especially on systems with a small number of CPUs.
This patch makes the preemption logic the default, thus increasing the minimum maxactive to 10 on nonpreemptible systems.
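As a rough illustration of the change in the default, the following userspace sketch compares the two computations. The helper names old_default_maxactive() and new_default_maxactive() are hypothetical and do not exist in the kernel; ncpus stands in for num_possible_cpus() and the preemptible flag stands in for CONFIG_PREEMPTION.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not kernel code): default maxactive before this
 * patch. "ncpus" stands in for num_possible_cpus(); "preemptible" stands
 * in for CONFIG_PREEMPTION being enabled.
 */
static unsigned int old_default_maxactive(unsigned int ncpus, bool preemptible)
{
	if (preemptible)
		return (2 * ncpus > 10) ? 2 * ncpus : 10;
	return ncpus;
}

/*
 * Hypothetical helper (not kernel code): default maxactive after this
 * patch, i.e. max(10, 2 * ncpus) regardless of the preemption model.
 */
static unsigned int new_default_maxactive(unsigned int ncpus)
{
	return (2 * ncpus > 10) ? 2 * ncpus : 10;
}
```

For the 2-core nonpreemptible case described above, old_default_maxactive(2, false) yields 2, while new_default_maxactive(2) yields 10.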
Link: https://lore.kernel.org/all/20221110081502.492289-1-wuqiang.matt@bytedance.c...
Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
Reviewed-by: Solar Designer <solar@openwall.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 Documentation/trace/kprobes.rst | 3 +--
 kernel/kprobes.c                | 8 ++------
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/Documentation/trace/kprobes.rst b/Documentation/trace/kprobes.rst
index f318bceda1e6..97d086b23ce8 100644
--- a/Documentation/trace/kprobes.rst
+++ b/Documentation/trace/kprobes.rst
@@ -131,8 +131,7 @@ For example, if the function is non-recursive and is called with a
 spinlock held, maxactive = 1 should be enough. If the function is
 non-recursive and can never relinquish the CPU (e.g., via a semaphore
 or preemption), NR_CPUS should be enough. If maxactive <= 0, it is
-set to a default value. If CONFIG_PREEMPT is enabled, the default
-is max(10, 2*NR_CPUS). Otherwise, the default is NR_CPUS.
+set to a default value: max(10, 2*NR_CPUS).
 
 It's not a disaster if you set maxactive too low; you'll just miss
 some probes. In the kretprobe struct, the nmissed field is set to
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 6d2a8623ec7b..f2413aae1aba 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2209,13 +2209,9 @@ int register_kretprobe(struct kretprobe *rp)
 	rp->kp.post_handler = NULL;
 
 	/* Pre-allocate memory for max kretprobe instances */
-	if (rp->maxactive <= 0) {
-#ifdef CONFIG_PREEMPTION
+	if (rp->maxactive <= 0)
 		rp->maxactive = max_t(unsigned int, 10, 2*num_possible_cpus());
-#else
-		rp->maxactive = num_possible_cpus();
-#endif
-	}
+
 #ifdef CONFIG_KRETPROBE_ON_RETHOOK
 	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler);
 	if (!rp->rh)