On 13. 02. 19, 19:38, Greg Kroah-Hartman wrote:
4.20-stable review patch. If anyone has any objections, please let me know.
From: Eric W. Biederman ebiederm@xmission.com
commit 35634ffa1751b6efd8cf75010b509dcb0263e29b upstream.
Recently syzkaller was able to create unkillablle processes by creating a timer that is delivered as a thread local signal on SIGHUP, and receiving SIGHUP SA_NODEFERER. Ultimately causing a loop failing to deliver SIGHUP but always trying.
Upon examination it turns out part of the problem is actually most of the solution. Since 2.5 signal delivery has found all fatal signals, marked the signal group for death, and queued SIGKILL in every threads thread queue relying on signal->group_exit_code to preserve the information of which was the actual fatal signal.
The conversion of all fatal signals to SIGKILL results in the synchronous signal heuristic in next_signal kicking in and preferring SIGHUP to SIGKILL. Which is especially problematic as all fatal signals have already been transformed into SIGKILL.
Instead of dequeueing signals and depending upon SIGKILL to be the first signal dequeued, first test if the signal group has already been marked for death. This guarantees that nothing in the signal queue can prevent a process that needs to exit from exiting.
Cc: stable@vger.kernel.org Tested-by: Dmitry Vyukov dvyukov@google.com Reported-by: Dmitry Vyukov dvyukov@google.com Ref: ebf5ebe31d2c ("[PATCH] signal-fixes-2.5.59-A4") History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Signed-off-by: "Eric W. Biederman" ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
This patch breaks strace self-tests in 4.20.9. In particular, "threads-execve": https://github.com/strace/strace/blob/master/tests/threads-execve.c https://github.com/strace/strace/blob/master/tests/threads-execve.test
The test received some fix a day ago, but it did not help in this case: https://github.com/strace/strace/commit/2a50278b9
Only a revert of the above patch helped.
I don't know if the strace's test is broken (which is quite usual in cases like these) or the patch affects some user-visible behaviour -- e.g. could this be a reason for sh failures in the build farm?
Any ideas?
The failure is (the test output is non-unified diff: "<" lines are expected, ">" is actual output from strace):
FAIL: threads-execve
11,12c11 < 19311 execve("../threads-execve", ["../threads-execve", "8", "2"], 0x7ffc2447c258 /* 63 vars */ <unfinished ...>
< 19181 <... rt_sigsuspend resumed>) = ?
19311 execve("../threads-execve", ["../threads-execve", "8", "2"], 0x7ffc2447c258 /* 63 vars */ <pid changed to 19181 ...>
17,18c16 < 19395 execve("../threads-execve", ["../threads-execve", "8", "3"], 0x7ffdedb69ee8 /* 63 vars */ <unfinished ...>
< 19181 <... nanosleep resumed> <unfinished ...>) = ?
19395 execve("../threads-execve", ["../threads-execve", "8", "3"], 0x7ffdedb69ee8 /* 63 vars */ <pid changed to 19181 ...>
... 11,12c11 < 22715 execve("../threads-execve", ["../threads-execve", "8", "2"], 0x7fff2ea03388 /* 63 vars */ <unfinished ...>
< 22657 <... rt_sigsuspend resumed>) = ?
22715 execve("../threads-execve", ["../threads-execve", "8", "2"], 0x7fff2ea03388 /* 63 vars */ <pid changed to 22657 ...>
17,18c16 < 22764 execve("../threads-execve", ["../threads-execve", "8", "3"], 0x7ffc5ea29658 /* 63 vars */ <unfinished ...>
< 22657 <... nanosleep resumed> <unfinished ...>) = ?
22764 execve("../threads-execve", ["../threads-execve", "8", "3"], 0x7ffc5ea29658 /* 63 vars */ <pid changed to 22657 ...>
threads-execve.test: failed test: ../../strace -a21 -f -esignal=none -e trace=execve,exit,nanosleep,rt_sigsuspend ../threads-execve output mismatch
kernel/signal.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/kernel/signal.c +++ b/kernel/signal.c @@ -2393,6 +2393,11 @@ relock: goto relock; }
- /* Has this task already been marked for death? */
- ksig->info.si_signo = signr = SIGKILL;
- if (signal_group_exit(signal))
goto fatal;
- for (;;) { struct k_sigaction *ka;
@@ -2488,6 +2493,7 @@ relock: continue; }
- fatal: spin_unlock_irq(&sighand->siglock);
/*