I can't test your branch, because the mainline kernel currently lacks too many features needed by Loongson-3. By the way, your approach is based on NMI, but I don't think NMI is available on every MIPS board.
Huacai
------------------ Original ------------------
From: "Paul Burton" <paul.burton@mips.com>
Date: Thu, Jun 14, 2018 05:21 AM
To: "Huacai Chen" <chenhc@lemote.com>
Cc: "Ralf Baechle" <ralf@linux-mips.org>; "James Hogan" <james.hogan@mips.com>; "Steven J . Hill" <Steven.Hill@cavium.com>; "linux-mips" <linux-mips@linux-mips.org>; "Fuxin Zhang" <zhangfx@lemote.com>; "wuzhangjin" <wuzhangjin@gmail.com>; "stable" <stable@vger.kernel.org>
Subject: Re: [PATCH] MIPS: Fix arch_trigger_cpumask_backtrace()
Hi Huacai,
On Mon, Feb 05, 2018 at 11:42:47AM +0800, Huacai Chen wrote:
SysRq-L and the RCU stall detector call arch_trigger_cpumask_backtrace() to trigger backtraces on other CPUs, but its behavior is totally broken. The root cause is that arch_trigger_cpumask_backtrace() uses call-function IPIs in IRQ context, which triggers deadlocks in smp_call_function_single() and smp_call_function_many().
This patch fixes arch_trigger_cpumask_backtrace() by:
1. Using a dedicated IPI (SMP_CPU_BACKTRACE) to trigger backtraces;
2. If the calling CPU is in the target cpumask, doing its backtrace locally and clearing it from the mask;
3. Using a spinlock to avoid interleaved backtrace output from CPUs running in parallel;
4. Handling the SMP_CPU_BACKTRACE IPI for Loongson-3.
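For readers following the thread, the first three steps can be modelled roughly in user-space C. This is only a sketch under stated assumptions, not the code from the patch: the bitmask type, the `ipi_pending` word standing in for the dedicated IPI, and the names `trigger_cpumask_backtrace()`, `handle_backtrace_ipi()` and `dump_backtrace()` are all hypothetical, and a C11 `atomic_flag` stands in for the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

#define NR_CPUS 4                       /* hypothetical CPU count for this model */

typedef unsigned int cpumask_model_t;   /* one bit per CPU; stand-in for cpumask_t */

static atomic_flag backtrace_lock = ATOMIC_FLAG_INIT; /* step 3: serialises output */
static unsigned int ipi_pending;        /* CPUs with a modelled backtrace IPI raised */
static unsigned int dumped;             /* CPUs that have printed a backtrace */

/* Print one CPU's backtrace while holding the lock so output cannot interleave. */
static void dump_backtrace(int cpu)
{
    while (atomic_flag_test_and_set(&backtrace_lock))
        ;                               /* spin until the lock is free */
    printf("CPU%d: <backtrace would be printed here>\n", cpu);
    dumped |= 1u << cpu;
    atomic_flag_clear(&backtrace_lock);
}

/* Caller side: handle the calling CPU locally, then signal the rest. */
static void trigger_cpumask_backtrace(cpumask_model_t mask, int self)
{
    assert(self >= 0 && self < NR_CPUS);
    mask &= (1u << NR_CPUS) - 1;        /* ignore bits beyond the modelled CPUs */

    /* Step 2: if the caller is a target, dump it here and clear its bit. */
    if (mask & (1u << self)) {
        dump_backtrace(self);
        mask &= ~(1u << self);
    }

    /* Step 1: raise the dedicated IPI on the remaining CPUs (modelled as bits). */
    ipi_pending |= mask;
}

/* Receiver side: what the dedicated IPI handler would do on `cpu`. */
static void handle_backtrace_ipi(int cpu)
{
    if (ipi_pending & (1u << cpu)) {
        ipi_pending &= ~(1u << cpu);
        dump_backtrace(cpu);
    }
}
```

Calling `trigger_cpumask_backtrace(0xF, 1)` dumps CPU 1 immediately and leaves CPUs 0, 2 and 3 pending until `handle_backtrace_ipi()` runs for each of them. Because the caller never waits on the call-function path, the model avoids the kind of wait that deadlocks in smp_call_function_single() and smp_call_function_many().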
I have attempted to implement SMP_CPU_BACKTRACE for all MIPS CPUs, but I failed because some of their IPIs are not extensible. :(
Interesting - I've been using a similar patch internally for a little while which can be seen here:
https://git.linux-mips.org/cgit/linux-mti.git/commit/?h=eng-v4.15&id=f46...
Mine uses the generic nmi_trigger_cpumask_backtrace() infrastructure to handle most of the work, and just has to deal with sending the IPIs. It relies upon some changes from Matt to do that for the generic platform.
If you have a chance could you test the branch below & let me know whether it works for you?
git://git.kernel.org/pub/scm/linux/kernel/git/paulburton/linux.git
Branch "wip-cpumask-backtrace".
Hopefully with a little more work we can fix this up generically.
Thanks, Paul