Bernd Edlinger bernd.edlinger@hotmail.de writes:
This fixes a deadlock in the tracer when tracing a multi-threaded application that calls execve while more than one thread is running.
I observed that when running strace on the gcc test suite, it always blocks after a while, when expect calls execve, because other threads have to be terminated. They send ptrace events, but the strace is no longer able to respond, since it is blocked in vm_access.
The deadlock always happens when strace needs to access the tracee's process mmap while another thread in the tracee starts to execve a child process, but that cannot continue until the PTRACE_EVENT_EXIT is handled and the WIFEXITED event is received:
A couple of things.
Why do we think it is safe to change the behavior exposed to userspace? I do not mean the deadlock case, but all of the times the current code would not deadlock.
Especially given that this is a small window, it might be hard for people to track down and report, so we need a strong argument that this won't break existing userspace before we just change things.
Usually surveying all of the users of a system call that we can find and checking to see if they might be affected by the change in behavior is difficult enough that we usually opt for not being lazy and preserving the behavior.
This patch is now up to two changes in behavior that could potentially affect a whole array of programs. Adding linux-api so that this change in behavior can be documented if/when this change goes through.
If you can split the documentation and test fixes out into separate patches, that would help in reviewing this code. Otherwise, please make it explicit that you are changing documentation about behavior that is changing with this patch.
Eric
diff --git a/tools/testing/selftests/ptrace/vmaccess.c b/tools/testing/selftests/ptrace/vmaccess.c
new file mode 100644
index 0000000..6d8a048
--- /dev/null
+++ b/tools/testing/selftests/ptrace/vmaccess.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (c) 2020 Bernd Edlinger bernd.edlinger@hotmail.de
+ * All rights reserved.
+ *
+ * Check whether /proc/$pid/mem can be accessed without causing deadlocks
+ * when de_thread is blocked with ->cred_guard_mutex held.
+ */
+
+#include "../kselftest_harness.h"
+#include <stdio.h>
+#include <fcntl.h>
+#include <pthread.h>
+#include <signal.h>
+#include <unistd.h>
+#include <sys/ptrace.h>
+
+static void *thread(void *arg)
+{
+	ptrace(PTRACE_TRACEME, 0, 0L, 0L);
+	return NULL;
+}
+
+TEST(vmaccess)
+{
+	int f, pid = fork();
+	char mm[64];
+
+	if (!pid) {
+		pthread_t pt;
+
+		pthread_create(&pt, NULL, thread, NULL);
+		pthread_join(pt, NULL);
+		execlp("true", "true", NULL);
+	}
+
+	sleep(1);
+	sprintf(mm, "/proc/%d/mem", pid);
+	f = open(mm, O_RDONLY);
+	ASSERT_LE(0, f);
+	close(f);
+	f = kill(pid, SIGCONT);
+	ASSERT_EQ(0, f);
+}
+
+TEST(attach)
+{
+	int f, pid = fork();
+
+	if (!pid) {
+		pthread_t pt;
+
+		pthread_create(&pt, NULL, thread, NULL);
+		pthread_join(pt, NULL);
+		execlp("true", "true", NULL);
+	}
+
+	sleep(1);
+	f = ptrace(PTRACE_ATTACH, pid, 0L, 0L);

To be meaningful, this code needs to learn to loop when ptrace returns -EAGAIN, because that is pretty much what any self-respecting userspace process will do.
At which point I am not certain we can say that the behavior has improved enough to no longer be a deadlock.

+	ASSERT_EQ(EAGAIN, errno);
+	ASSERT_EQ(f, -1);
+	f = kill(pid, SIGCONT);
+	ASSERT_EQ(0, f);
+}
+
+TEST_HARNESS_MAIN
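
For reference, the kind of retry loop meant in the comment on TEST(attach) might look roughly like the sketch below. It is only an illustration: attach_retry(), the retry count, and the 10 ms delay are made-up names and values, not part of the patch, and it assumes PTRACE_ATTACH fails with errno set to EAGAIN in the way this patch proposes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <time.h>

/*
 * Hypothetical helper, not part of the patch: retry PTRACE_ATTACH while
 * it fails with EAGAIN (the error the patch under review proposes to
 * return while an execve is in progress in the target).
 *
 * Returns 0 on success. Returns -1 with errno set on any other failure,
 * or with errno == EAGAIN once max_tries attempts have been used up.
 */
static long attach_retry(pid_t pid, int max_tries)
{
	struct timespec delay = { 0, 10 * 1000 * 1000 }; /* 10 ms, arbitrary */
	int i;

	assert(max_tries > 0);

	for (i = 0; i < max_tries; i++) {
		long ret = ptrace(PTRACE_ATTACH, pid, 0L, 0L);

		/* Stop on success or on any error other than EAGAIN. */
		if (ret == 0 || errno != EAGAIN)
			return ret;

		/* Back off briefly before trying again. */
		nanosleep(&delay, NULL);
	}

	/* nanosleep() may have touched errno, so restore it explicitly. */
	errno = EAGAIN;
	return -1;
}
```

A tracer written this way simply keeps coming back for as long as the exec is stuck, so if the exec is in turn waiting on that tracer, the pair can still make no progress. That is the concern about whether returning -EAGAIN really removes the deadlock.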
Eric