Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side

17 Jul 2023

On Mon, Jul 17, 2023 at 6:45 PM Peter Zijlstra peterz@infradead.org wrote:
...
On Mon, Jul 17, 2023 at 07:33:25AM +0800, Guo Ren wrote:
...
On Mon, Jul 10, 2023 at 4:02 PM Peter Zijlstra peterz@infradead.org wrote:
...
On Sun, Jul 09, 2023 at 10:30:22AM +0800, Guo Ren wrote:
...
On Wed, Jul 5, 2023 at 12:40 AM Peter Zijlstra peterz@infradead.org wrote:
...
On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@kernel.org wrote:
...
From: Guo Ren guoren@linux.alibaba.com
irqentry_nmi_enter/exit forces the current context into in_interrupt.
That makes the kernel panic, but kdb still needs "ebreak" to debug the
kernel.
Replacing irqentry_nmi_enter/exit with exception_enter/exit corrects
handle_break on the kernel side.
This doesn't explain much if anything :/
I'm confused (probably because I don't know RISC-V very well), what's
EBREAK and how does it happen?
EBREAK is just a RISC-V instruction that raises a breakpoint exception.
...
Specifically, if EBREAK can happen inside a local_irq_disable() region,
then the below change is actively wrong. Any exception/interrupt that
can happen while local_irq_disable() must be treated like an NMI.
When the ebreak happens outside a local_irq_disable() region,
__nmi_enter still forces handle_break() into the in_interrupt() state. So how
And why is that a problem? I think I'm missing something fundamental
here...
irqentry_nmi_enter() forces in_interrupt=true on the current context,
even though the ebreak happened in a context where in_interrupt=false.
That trips a lot of checks, such as:
        if (in_interrupt())
                panic("Fatal exception in interrupt");
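To make the effect described above concrete, here is a minimal userspace sketch of the bookkeeping involved. It is not kernel code: every `model_*` and `MODEL_*` name is hypothetical, and the bit layout is a simplified assumption that follows the usual preempt_count arrangement (softirq, hardirq and NMI fields above the preemption count). It only illustrates why an NMI-style entry makes an in_interrupt()-style check true, even when the trap came from plain task context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the preempt_count bit fields. */
#define MODEL_HARDIRQ_OFFSET (1u << 16)   /* one level of hardirq nesting */
#define MODEL_NMI_OFFSET     (1u << 20)   /* one level of NMI nesting */
#define MODEL_IRQ_MASK       0x00ffff00u  /* softirq | hardirq | NMI bits */

static unsigned int model_preempt_count;  /* 0 means plain task context */

/* NMI-style entry: bumps both the NMI and the hardirq counts. */
static void model_nmi_enter(void)
{
	model_preempt_count += MODEL_NMI_OFFSET + MODEL_HARDIRQ_OFFSET;
}

static void model_nmi_exit(void)
{
	model_preempt_count -= MODEL_NMI_OFFSET + MODEL_HARDIRQ_OFFSET;
}

/* True whenever any softirq, hardirq or NMI bit is set. */
static bool model_in_interrupt(void)
{
	return (model_preempt_count & MODEL_IRQ_MASK) != 0;
}

/* Mirrors the quoted check: would a die()-like path escalate to panic? */
static bool model_die_would_panic(void)
{
	return model_in_interrupt();
}

/* Walks through the scenario: a breakpoint taken from task context. */
static void model_demo(void)
{
	model_preempt_count = 0;
	assert(!model_die_would_panic());  /* task context: no panic */
	model_nmi_enter();
	assert(model_die_would_panic());   /* same trap, now "in interrupt" */
	model_nmi_exit();
	assert(!model_die_would_panic());  /* state restored on exit */
}
```

In this model the handler cannot tell a breakpoint taken in task context from one taken inside a real interrupt once the NMI-style entry has run, which is the situation the check above reacts to.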
Why would you do that?!?
Are you trying to differentiate between an exception and an
interrupt?
You *could* have ebreak in an interrupt, right? So why panic the machine
if that happens?
Do you mean the patch below? Yes, it could fix this.

diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index f910dfccbf5d..92899db6696b 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -85,8 +85,6 @@ void die(struct pt_regs *regs, const char *str)
        spin_unlock_irqrestore(&die_lock, flags);
        oops_exit();
-       if (in_interrupt())
-               panic("Fatal exception in interrupt");
        if (panic_on_oops)
                panic("Fatal exception");
        if (ret != NOTIFY_STOP)
diff --git a/kernel/exit.c b/kernel/exit.c
index edb50b4c9972..a46a1aef66ce 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -940,8 +940,6 @@ void __noreturn make_task_dead(int signr)
        struct task_struct *tsk = current;
        unsigned int limit;
-       if (unlikely(in_interrupt()))
-               panic("Aiee, killing interrupt handler!");
        if (unlikely(!tsk->pid))
                panic("Attempted to kill the idle task!");
But how does x86 deal with it without the kernel/exit.c modification?
...
...
It would make the kernel panic, but we don't want a panic; we want to get back to the shell.
e.g.:
echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
-- 
Best Regards
 Guo Ren

    
