On Tue, Feb 19, 2019 at 07:23:41AM +0100, Jiri Slaby wrote:
On 13. 02. 19, 19:38, Greg Kroah-Hartman wrote:
4.20-stable review patch. If anyone has any objections, please let me know.
From: Eric W. Biederman <ebiederm@xmission.com>
commit 35634ffa1751b6efd8cf75010b509dcb0263e29b upstream.
Recently syzkaller was able to create unkillable processes by creating a timer that is delivered as a thread-local signal on SIGHUP, and receiving SIGHUP with SA_NODEFER. This ultimately caused a loop that kept trying, and failing, to deliver SIGHUP.
Upon examination it turns out part of the problem is actually most of the solution. Since 2.5, signal delivery has found all fatal signals, marked the signal group for death, and queued SIGKILL in every thread's queue, relying on signal->group_exit_code to preserve the information of which signal was actually fatal.
The conversion of all fatal signals to SIGKILL results in the synchronous signal heuristic in next_signal kicking in and preferring SIGHUP to SIGKILL, which is especially problematic because all fatal signals have already been transformed into SIGKILL.
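The selection problem described above can be illustrated with a small user-space toy model. This is only a sketch and not the kernel's next_signal: the toy_* names are hypothetical, the synchronous set is abbreviated to a single signal as an assumption, and the signal numbers are the Linux x86 values hard-coded for portability. In this model the lowest-numbered unblocked pending signal wins, so a pending SIGHUP (1) is chosen ahead of a pending SIGKILL (9).

```c
#include <assert.h>
#include <stdint.h>

/* Linux x86 signal numbers, hard-coded so the toy model is portable. */
enum { TOY_SIGHUP = 1, TOY_SIGKILL = 9, TOY_SIGSEGV = 11 };

/* Bit representing signal number sig (signals are 1-based). */
static uint64_t toy_sigbit(int sig)
{
    assert(sig >= 1 && sig <= 64);
    return UINT64_C(1) << (sig - 1);
}

/* Assumption: an abbreviated "synchronous" set holding only SIGSEGV. */
static uint64_t toy_sync_mask(void)
{
    return toy_sigbit(TOY_SIGSEGV);
}

/*
 * Pick the next signal to deliver from the pending set.
 * Synchronous signals are preferred; otherwise the lowest-numbered
 * unblocked pending signal is chosen. Returns 0 if nothing is pending.
 */
static int toy_next_signal(uint64_t pending, uint64_t blocked)
{
    uint64_t x = pending & ~blocked;
    int sig = 1;

    if (!x)
        return 0;
    if (x & toy_sync_mask())
        x &= toy_sync_mask();
    while (!(x & 1)) {   /* x is nonzero, so this terminates */
        x >>= 1;
        sig++;
    }
    return sig;
}
```

Calling toy_next_signal(toy_sigbit(TOY_SIGHUP) | toy_sigbit(TOY_SIGKILL), 0) selects TOY_SIGHUP in this model, which is the ordering that keeps SIGKILL from being seen first.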
Instead of dequeueing signals and depending upon SIGKILL to be the first signal dequeued, first test if the signal group has already been marked for death. This guarantees that nothing in the signal queue can prevent a process that needs to exit from exiting.
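The idea of checking the group-exit state before touching the queue can be sketched in user space as follows. This is a hedged illustration under stated assumptions, not the actual kernel change: struct sketch_task, its fields, and the sketch_* functions are hypothetical stand-ins for the signal group state and for get_signal, and the signal numbers are the Linux x86 values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Linux x86 signal numbers, hard-coded for the sketch. */
enum { SKETCH_SIGHUP = 1, SKETCH_SIGKILL = 9 };

/* Hypothetical stand-in for the per-task and signal-group state. */
struct sketch_task {
    uint64_t pending;     /* bit (sig - 1) set when sig is pending */
    bool group_exit;      /* models "signal group marked for death" */
    int group_exit_code;  /* models signal->group_exit_code */
};

static uint64_t sketch_bit(int sig)
{
    assert(sig >= 1 && sig <= 64);
    return UINT64_C(1) << (sig - 1);
}

/* Remove and return the lowest-numbered pending signal, or 0 if none. */
static int sketch_dequeue(struct sketch_task *t)
{
    uint64_t x = t->pending;
    int sig = 1;

    if (!x)
        return 0;
    while (!(x & 1)) {
        x >>= 1;
        sig++;
    }
    t->pending &= t->pending - 1;  /* clear the lowest set bit */
    return sig;
}

/*
 * Test the group-exit mark first. If the group is already marked for
 * death, report SIGKILL no matter what else is queued; otherwise fall
 * back to the normal dequeue order.
 */
static int sketch_get_signal(struct sketch_task *t)
{
    if (t->group_exit) {
        t->pending &= ~sketch_bit(SKETCH_SIGKILL);
        return SKETCH_SIGKILL;
    }
    return sketch_dequeue(t);
}
```

With both SIGHUP and SIGKILL pending, sketch_get_signal returns SKETCH_SIGKILL when group_exit is set and SKETCH_SIGHUP when it is not, so in this model nothing left in the queue can delay the exit.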
Cc: stable@vger.kernel.org
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Ref: ebf5ebe31d2c ("[PATCH] signal-fixes-2.5.59-A4")
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch breaks strace self-tests in 4.20.9. In particular, "threads-execve":
https://github.com/strace/strace/blob/master/tests/threads-execve.c
https://github.com/strace/strace/blob/master/tests/threads-execve.test
The test received a fix a day ago, but it did not help in this case:
https://github.com/strace/strace/commit/2a50278b9
Only a revert of the above patch helped.
I don't know whether strace's test is broken (which is quite usual in cases like these) or whether the patch affects some user-visible behaviour -- e.g. could this be a reason for sh failures in the build farm?
Any ideas?
Does cf43a757fd49 ("signal: Restore the stop PTRACE_EVENT_EXIT") help with this? It's queued up for the next round of stable releases and is in Linus's tree.
thanks,
greg k-h