Currently, when kdb is compiled with keyboard support, then we will use schedule_work() to provoke reset of the keyboard status. Unfortunately schedule_work() gets called from the kgdboc post-debug-exception handler. That risks deadlock since schedule_work() is not NMI-safe and, even on platforms where the NMI is not directly used for debugging, the debug trap can have NMI-like behaviour depending on where breakpoints are placed.
Fix this by using the irq work system, which is NMI-safe, to defer the call to schedule_work() to a point when it is safe to call.
Reported-by: Liuye liu.yeC@h3c.com Closes: https://lore.kernel.org/all/20240228025602.3087748-1-liu.yeC@h3c.com/ Cc: stable@vger.kernel.org Reviewed-by: Douglas Anderson dianders@chromium.org Signed-off-by: Daniel Thompson daniel.thompson@linaro.org --- @Greg: I'm assuming this could/should go via your tree but feel free to share an ack if you want me to hoover it up instead.
Changes in v2: - Fix typo in the big comment (thanks Doug) - Link to v1: https://lore.kernel.org/r/20240419-kgdboc_fix_schedule_work-v1-1-ff19881677e... --- drivers/tty/serial/kgdboc.c | 30 +++++++++++++++++++++++++++++- 1 file changed, 29 insertions(+), 1 deletion(-)
diff --git a/drivers/tty/serial/kgdboc.c b/drivers/tty/serial/kgdboc.c index 7ce7bb1640054..58ea1e1391cee 100644 --- a/drivers/tty/serial/kgdboc.c +++ b/drivers/tty/serial/kgdboc.c @@ -19,6 +19,7 @@ #include <linux/console.h> #include <linux/vt_kern.h> #include <linux/input.h> +#include <linux/irq_work.h> #include <linux/module.h> #include <linux/platform_device.h> #include <linux/serial_core.h> @@ -48,6 +49,25 @@ static struct kgdb_io kgdboc_earlycon_io_ops; static int (*earlycon_orig_exit)(struct console *con); #endif /* IS_BUILTIN(CONFIG_KGDB_SERIAL_CONSOLE) */
+/* + * When we leave the debug trap handler we need to reset the keyboard status + * (since the original keyboard state gets partially clobbered by kdb use of + * the keyboard). + * + * The path to deliver the reset is somewhat circuitous. + * + * To deliver the reset we register an input handler, reset the keyboard and + * then deregister the input handler. However, to get this done right, we do + * have to carefully manage the calling context because we can only register + * input handlers from task context. + * + * In particular we need to trigger the action from the debug trap handler with + * all its NMI and/or NMI-like oddities. To solve this the kgdboc trap exit code + * (the "post_exception" callback) uses irq_work_queue(), which is NMI-safe, to + * schedule a callback from a hardirq context. From there we have to defer the + * work again, this time using schedule_work(), to get a callback using the + * system workqueue, which runs in task context. + */ #ifdef CONFIG_KDB_KEYBOARD static int kgdboc_reset_connect(struct input_handler *handler, struct input_dev *dev, @@ -99,10 +119,17 @@ static void kgdboc_restore_input_helper(struct work_struct *dummy)
static DECLARE_WORK(kgdboc_restore_input_work, kgdboc_restore_input_helper);
+static void kgdboc_queue_restore_input_helper(struct irq_work *unused) +{ + schedule_work(&kgdboc_restore_input_work); +} + +static DEFINE_IRQ_WORK(kgdboc_restore_input_irq_work, kgdboc_queue_restore_input_helper); + static void kgdboc_restore_input(void) { if (likely(system_state == SYSTEM_RUNNING)) - schedule_work(&kgdboc_restore_input_work); + irq_work_queue(&kgdboc_restore_input_irq_work); }
static int kgdboc_register_kbd(char **cptr) @@ -133,6 +160,7 @@ static void kgdboc_unregister_kbd(void) i--; } } + irq_work_sync(&kgdboc_restore_input_irq_work); flush_work(&kgdboc_restore_input_work); } #else /* ! CONFIG_KDB_KEYBOARD */
--- base-commit: 0bbac3facb5d6cc0171c45c9873a2dc96bea9680 change-id: 20240419-kgdboc_fix_schedule_work-f0cb44b8a354
Best regards,
On Wed, Apr 24, 2024 at 03:21:41PM +0100, Daniel Thompson wrote:
Currently, when kdb is compiled with keyboard support, then we will use schedule_work() to provoke reset of the keyboard status. Unfortunately schedule_work() gets called from the kgdboc post-debug-exception handler. That risks deadlock since schedule_work() is not NMI-safe and, even on platforms where the NMI is not directly used for debugging, the debug trap can have NMI-like behaviour depending on where breakpoints are placed.
Fix this by using the irq work system, which is NMI-safe, to defer the call to schedule_work() to a point when it is safe to call.
Reported-by: Liuye liu.yeC@h3c.com Closes: https://lore.kernel.org/all/20240228025602.3087748-1-liu.yeC@h3c.com/ Cc: stable@vger.kernel.org Reviewed-by: Douglas Anderson dianders@chromium.org Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
@Greg: I'm assuming this could/should go via your tree but feel free to share an ack if you want me to hoover it up instead.
Hoover away!
Acked-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
On Wed, 24 Apr 2024 15:21:41 +0100, Daniel Thompson wrote:
Currently, when kdb is compiled with keyboard support, then we will use schedule_work() to provoke reset of the keyboard status. Unfortunately schedule_work() gets called from the kgdboc post-debug-exception handler. That risks deadlock since schedule_work() is not NMI-safe and, even on platforms where the NMI is not directly used for debugging, the debug trap can have NMI-like behaviour depending on where breakpoints are placed.
[...]
Applied, thanks!
[1/1] serial: kgdboc: Fix NMI-safety problems from keyboard reset code commit: b2aba15ad6f908d1a620fd97f6af5620c3639742
Best regards,
linux-stable-mirror@lists.linaro.org