Hi folks,
I'm the maintainer in Debian for strace. Trying to reproduce https://bugs.debian.org/963462 on my machine (Thinkpad T470), I've found a repeatable hard lockup running the strace testsuite. Each time it seems to have failed in a slightly different place in the testsuite (suggesting it's not one particular syscall test that's triggering the failure). I initially found this using Debian's current Buster kernel (4.19.118+2+deb10u1), then backtracking I found that 4.19.98+1+deb10u1 worked fine.
I've bisected along the linux-4.19.y stable branch to find the failure point, and I've arrived at the following commit:
e58f543fc7c0926f31a49619c1a3648e49e8d233 is the first bad commit
commit e58f543fc7c0926f31a49619c1a3648e49e8d233
Author: Jann Horn <jannh@google.com>
Date:   Thu Sep 13 18:12:09 2018 +0200
apparmor: don't try to replace stale label in ptrace access check
[ Upstream commit 1f8266ff58840d698a1e96d2274189de1bdf7969 ]
As a comment above begin_current_label_crit_section() explains, begin_current_label_crit_section() must run in sleepable context because when label_is_stale() is true, aa_replace_current_label() runs, which uses prepare_creds(), which can sleep. Until now, the ptrace access check (which runs with a task lock held) violated this rule.
Also add a might_sleep() assertion to begin_current_label_crit_section(), because asserts are less likely to be ignored than comments.
Fixes: b2d09ae449ced ("apparmor: move ptrace checks to using labels")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
:040000 040000 ca92f885a38c1747b812116f19de6967084a647e 865a227665e460e159502f21e8a16e6fa590bf50 M security
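For context, the constraint the commit message describes is the usual "don't sleep in atomic context" rule. A rough sketch of the problematic pattern, using simplified, made-up function names rather than the actual AppArmor code:

#include <linux/sched.h>
#include <linux/sched/task.h>
#include <linux/cred.h>
#include <linux/kernel.h>

/*
 * Illustrative sketch only. A helper that may need to replace the task's
 * credentials: prepare_creds() can sleep, so every caller must be in
 * sleepable context, which is exactly what might_sleep() asserts.
 */
static int replace_label_if_stale(void)
{
	struct cred *new;

	might_sleep();		/* complain loudly if called from atomic context */

	new = prepare_creds();	/* may sleep (GFP_KERNEL allocation) */
	if (!new)
		return -ENOMEM;
	/* ... install the refreshed label into the new creds here ... */
	return commit_creds(new);
}

/*
 * The bug pattern the commit removes: calling such a helper while holding
 * a spinlock (here task_lock()), i.e. from atomic context.
 */
static int bad_access_check(struct task_struct *task)
{
	int err;

	task_lock(task);		/* spinlock: no sleeping allowed from here */
	err = replace_label_if_stale();	/* BUG: may sleep under a spinlock */
	task_unlock(task);
	return err;
}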
Considering I'm running strace build tests to provoke this bug, finding the failure in a commit talking about ptrace changes does look very suspicious...!
Annoyingly, I can't reproduce this on my disparate other machines here, suggesting it's maybe(?) timing related.
Hope this helps - happy to give more information, test things, etc.
On Fri, Jun 26, 2020 at 12:35:58PM +0100, Steve McIntyre wrote:
...
So if you just revert this one patch, all works well?
I've added the authors of the patch to the cc: list...
Also, does this problem happen on Linus's tree?
thanks,
greg k-h
Hey Greg,
On Fri, Jun 26, 2020 at 03:41:32PM +0200, Greg KH wrote:
On Fri, Jun 26, 2020 at 12:35:58PM +0100, Steve McIntyre wrote:
...
So if you just revert this one patch, all works well?
Correct.
I've added the authors of the patch to the cc: list...
Also, does this problem happen on Linus's tree?
Just building to check that now...
On Fri, Jun 26, 2020 at 02:45:18PM +0100, Steve McIntyre wrote:
...
Also, does this problem happen on Linus's tree?
Just building to check that now...
Current head here (3e08a95294a4fb3702bb3d35ed08028433c37fe6) works fine.
On Fri, Jun 26, 2020 at 3:41 PM Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jun 26, 2020 at 12:35:58PM +0100, Steve McIntyre wrote:
...
Considering I'm running strace build tests to provoke this bug, finding the failure in a commit talking about ptrace changes does look very suspicious...!
Annoyingly, I can't reproduce this on my disparate other machines here, suggesting it's maybe(?) timing related.
Does "hard lockup" mean that the HARDLOCKUP_DETECTOR infrastructure prints a warning to dmesg? If so, can you share that warning?
If you don't have any way to see console output, and you don't have a working serial console setup or such, you may want to try re-running those tests while the kernel is booted with netconsole enabled to log to a different machine over UDP (see https://www.kernel.org/doc/Documentation/networking/netconsole.txt).
You may want to try setting the sysctl kernel.sysrq=1 , then when the system has locked up, press ALT+PRINT+L (to generate stack traces for all active CPUs from NMI context), and maybe also ALT+PRINT+T and ALT+PRINT+W (to collect more information about active tasks).
(If you share stack traces from these things with us, it would be helpful if you could run them through scripts/decode_stacktrace.sh from the kernel tree first, to add line number information.)
Trying to isolate the problem:
__end_current_label_crit_section and end_current_label_crit_section are aliases of each other (via #define), so that line change can't have done anything.
That leaves two possibilities AFAICS:

 - the might_sleep() call by itself is causing issues for one of the remaining users of begin_current_label_crit_section() (because it causes preemption to happen more often where it didn't happen on PREEMPT_VOLUNTARY before, or because it's trying to print a warning message with the runqueue lock held, or something like that)
 - the lack of "if (aa_replace_current_label(label) == 0) aa_put_label(label);" in __begin_current_label_crit_section() is somehow causing issues
You could try to see whether just adding the might_sleep(), or just replacing the begin_current_label_crit_section() call with __begin_current_label_crit_section(), triggers the bug.
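For reference, the difference between the two helpers is roughly the following (paraphrased from security/apparmor/include/cred.h in 4.19; see the tree for the exact definitions):

/* Does not try to update current's creds, so it is usable from atomic context. */
static inline struct aa_label *__begin_current_label_crit_section(void)
{
	struct aa_label *label = aa_current_raw_label();

	if (label_is_stale(label))
		label = aa_get_newest_label(label);

	return label;
}

/* May call aa_replace_current_label() -> prepare_creds(), so it must only
 * be used from sleepable context; hence the might_sleep() added by the patch. */
static inline struct aa_label *begin_current_label_crit_section(void)
{
	struct aa_label *label = aa_current_raw_label();

	might_sleep();

	if (label_is_stale(label)) {
		label = aa_get_newest_label(label);
		if (aa_replace_current_label(label) == 0)
			aa_put_label(label);
	}

	return label;
}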
If you could recompile the kernel with CONFIG_DEBUG_ATOMIC_SLEEP (if that isn't already set in your kernel config), that might help track down the problem, unless it magically makes the problem stop triggering (which I guess would be conceivable if this indeed is a race).
Hi Jann,
On Fri, Jun 26, 2020 at 04:25:59PM +0200, Jann Horn wrote:
On Fri, Jun 26, 2020 at 3:41 PM Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jun 26, 2020 at 12:35:58PM +0100, Steve McIntyre wrote:
...
Considering I'm running strace build tests to provoke this bug, finding the failure in a commit talking about ptrace changes does look very suspicious...!
Annoyingly, I can't reproduce this on my disparate other machines here, suggesting it's maybe(?) timing related.
Does "hard lockup" mean that the HARDLOCKUP_DETECTOR infrastructure prints a warning to dmesg? If so, can you share that warning?
I mean the machine locks hard - X stops updating, the mouse/keyboard stop responding. No pings, etc. When I reboot, there's nothing in the logs.
If you don't have any way to see console output, and you don't have a working serial console setup or such, you may want to try re-running those tests while the kernel is booted with netconsole enabled to log to a different machine over UDP (see https://www.kernel.org/doc/Documentation/networking/netconsole.txt).
ACK, will try that now for you.
You may want to try setting the sysctl kernel.sysrq=1 , then when the system has locked up, press ALT+PRINT+L (to generate stack traces for all active CPUs from NMI context), and maybe also ALT+PRINT+T and ALT+PRINT+W (to collect more information about active tasks).
Nod.
(If you share stack traces from these things with us, it would be helpful if you could run them through scripts/decode_stacktrace.sh from the kernel tree first, to add line number information.)
ACK.
...
If you could recompile the kernel with CONFIG_DEBUG_ATOMIC_SLEEP - if that isn't already set in your kernel config -, that might help track down the problem, unless it magically makes the problem stop triggering (which I guess would be conceivable if this indeed is a race).
OK, will try that second...
On 6/26/20 9:50 AM, Steve McIntyre wrote:
...
OK, will try that second...
I have not been able to reproduce this, but looking at linux-4.19.y it appears that

  1f8266ff5884 apparmor: don't try to replace stale label in ptrace access check

was picked without

  ca3fde5214e1 apparmor: don't try to replace stale label in ptraceme check

Both of them are marked as Fixes: b2d09ae449ced ("apparmor: move ptrace checks to using labels"), so I would expect them to be picked together.
ptraceme is potentially updating the task's cred while the access check is running.
Try building after picking ca3fde5214e1 apparmor: don't try to replace stale label in ptraceme check
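For anyone following along, the companion commit is a small change to apparmor_ptrace_traceme() in security/apparmor/lsm.c; with it applied the function looks roughly like this (paraphrased, not a verbatim copy of the upstream patch):

static int apparmor_ptrace_traceme(struct task_struct *parent)
{
	struct aa_label *tracer, *tracee;
	int error;

	/* Non-replacing variant: never calls prepare_creds(), so it is
	 * safe even though ptrace_traceme() holds tasklist_lock here. */
	tracee = __begin_current_label_crit_section();
	tracer = aa_get_task_label(parent);
	error = aa_may_ptrace(tracer, tracee, AA_PTRACE_TRACE);
	aa_put_label(tracer);
	__end_current_label_crit_section(tracee);

	return error;
}

In other words, like the access-check fix, it stops trying to swap in a fresh label from a context where sleeping is not allowed.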
On Fri, Jun 26, 2020 at 10:29:28AM -0700, John Johansen wrote:
On 6/26/20 9:50 AM, Steve McIntyre wrote:
...
Try building after picking ca3fde5214e1 apparmor: don't try to replace stale label in ptraceme check
Bingo! With that one change the test suite runs to completion, no lockup.
\o/
Thanks guys, I think we've found the cause here.
On Fri, Jun 26, 2020 at 08:01:18PM +0100, Steve McIntyre wrote:
On Fri, Jun 26, 2020 at 10:29:28AM -0700, John Johansen wrote:
On 6/26/20 9:50 AM, Steve McIntyre wrote:
...
Bingo! With that one change the test suite runs to completion, no lockup.
I've queued ca3fde5214e1 for 4.19 and 4.14, thank you for your work!
Hi again,
On Fri, Jun 26, 2020 at 05:50:00PM +0100, Steve McIntyre wrote:
On Fri, Jun 26, 2020 at 04:25:59PM +0200, Jann Horn wrote:
On Fri, Jun 26, 2020 at 3:41 PM Greg KH gregkh@linuxfoundation.org wrote:
On Fri, Jun 26, 2020 at 12:35:58PM +0100, Steve McIntyre wrote:
...
Output passed through scripts/decode_stacktrace.sh attached.
Just about to try John's suggestion next.
On Fri, Jun 26, 2020 at 7:52 PM Steve McIntyre steve@einval.com wrote:
...
Output passed through scripts/decode_stacktrace.sh attached.
Just about to try John's suggestion next.
Okay, so this is some sort of deadlock...
Looking at the NMI backtraces, all the CPUs are blocked on spinlocks:

CPU 3 is blocked on current->sighand->siglock, in tty_open_proc_set_tty()
CPU 1 is blocked on... I'm not sure which lock, somewhere in do_wait()
CPU 2 is blocked on something, somewhere in ptrace_stop()
CPU 0 is stuck on a lock in do_exit()
So I think it's probably something like a classic deadlock, a sleeping-in-atomic issue, or a lock-balancing issue (or memory corruption, which can cause all kinds of weird errors)?
If it really is a classic deadlock, CONFIG_PROVE_LOCKING=y should be able to pinpoint the issue. If it is a sleeping-in-atomic issue, CONFIG_DEBUG_ATOMIC_SLEEP=y should help. If it is memory corruption, CONFIG_KASAN=y should discover it... but that might majorly mess up the timing, so if this really is a race, that might not work.
Maybe flip all of those on, and if it doesn't reproduce anymore, turn off CONFIG_KASAN and try again?
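As a toy illustration of what those options catch (a standalone demo module with made-up names, unrelated to the AppArmor code):

#include <linux/module.h>
#include <linux/spinlock.h>
#include <linux/mutex.h>

static DEFINE_SPINLOCK(lock_a);
static DEFINE_SPINLOCK(lock_b);
static DEFINE_MUTEX(sleepy_mutex);

static int __init debug_demo_init(void)
{
	/* Sleeping in atomic context: with CONFIG_DEBUG_ATOMIC_SLEEP this
	 * prints "BUG: sleeping function called from invalid context",
	 * because mutex_lock() may sleep while a spinlock is held. */
	spin_lock(&lock_a);
	mutex_lock(&sleepy_mutex);
	mutex_unlock(&sleepy_mutex);
	spin_unlock(&lock_a);

	/* Lock-order inversion: with CONFIG_PROVE_LOCKING (lockdep) the
	 * second ordering below is reported as a possible circular locking
	 * dependency before it ever turns into a real deadlock. */
	spin_lock(&lock_a);
	spin_lock(&lock_b);
	spin_unlock(&lock_b);
	spin_unlock(&lock_a);

	spin_lock(&lock_b);
	spin_lock(&lock_a);	/* lockdep warns: a -> b vs. b -> a */
	spin_unlock(&lock_a);
	spin_unlock(&lock_b);

	return 0;
}

static void __exit debug_demo_exit(void)
{
}

module_init(debug_demo_init);
module_exit(debug_demo_exit);
MODULE_LICENSE("GPL");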