On Tue, 30 Jan 2018 15:35:54 +1100 Michael Ellerman <mpe@ellerman.id.au> wrote:
alexander.levin@verizon.com writes:
On Thu, Dec 14, 2017 at 12:10:39AM +1100, Michael Ellerman wrote:
alexander.levin@verizon.com writes:
From: Nicholas Piggin <npiggin@gmail.com>
[ Upstream commit 064996d62a33ffe10264b5af5dca92d54f60f806 ]
The SMP hardlockup watchdog cross-checks other CPUs for lockups, which causes xmon headaches because it's assuming interrupts hard disabled means no watchdog troubles. Try to improve that by calling touch_nmi_watchdog() in obvious places where secondaries are spinning.
Also annotate these spin loops with spin_begin/end calls.
These macros didn't exist until 4.13, and haven't been backported AFAIK.
But the touch_nmi_watchdog() bits are something we want in stable, right?
I don't think you need them unless you've also backported arch/powerpc/kernel/watchdog.c, which I don't think you have.
Maybe Nick can confirm?
I'm not 100% sure. The CPUs only check themselves for lockups. They will blow their threshold while in xmon, but when they come out of xmon, I think a quirk of our local_irq_enable() implementation saves us: it explicitly checks timers and runs them first, before re-enabling hard interrupts, so our heartbeat starts up again just before the perf interrupt would come in to report the lockup.
I think.
Given that we've had no reports of misbehaviour of the old perf watchdog, I would say you can skip the backport.
Thanks,
Nick
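For anyone following along, the pattern the quoted commit message describes (touching the watchdog inside a spin loop and bracketing the loop with spin_begin/end) might look roughly like the sketch below. This is a userspace illustration, not the patch itself: the four stubs stand in for the kernel's spin_begin(), spin_cpu_relax(), spin_end() and touch_nmi_watchdog(), and wait_for_flag(), max_spins and watchdog_touches are hypothetical names made up for the example.

```c
#include <stdatomic.h>

/*
 * Userspace stand-ins so the sketch compiles outside the kernel.
 * In the kernel these are provided by the tree itself; on powerpc,
 * spin_begin()/spin_end() lower and restore the SMT thread priority.
 */
static unsigned long watchdog_touches;	/* illustration only: counts stub calls */

static void spin_begin(void)         { /* kernel: lower thread priority */ }
static void spin_cpu_relax(void)     { /* kernel: relax inside the loop */ }
static void spin_end(void)           { /* kernel: restore thread priority */ }
static void touch_nmi_watchdog(void) { watchdog_touches++; }

/*
 * Hypothetical helper: spin until *flag becomes nonzero or max_spins
 * iterations have passed. Returns 1 if the flag was seen, 0 on timeout.
 */
int wait_for_flag(atomic_int *flag, unsigned long max_spins)
{
	unsigned long i;
	int seen = 0;

	spin_begin();
	for (i = 0; i < max_spins; i++) {
		if (atomic_load(flag)) {
			seen = 1;
			break;
		}
		spin_cpu_relax();
		/* Tell the watchdog this CPU is spinning on purpose. */
		touch_nmi_watchdog();
	}
	spin_end();

	return seen;
}
```

The watchdog is touched once per iteration here purely to keep the sketch short; where exactly the real patch places its calls is not shown in this thread.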