From: John Ogness john.ogness@linutronix.de
[ Upstream commit 7b23a66db55ed0a55b020e913f0d6f6d52a1ad2c ]
A semaphore is not NMI-safe, even when using down_trylock(). Both down_trylock() and up() are using internal spinlocks and up() might even call wake_up_process().
In the panic() code path it gets even worse because the internal spinlocks of the semaphore may have been taken by a CPU that has been stopped.
To reduce the risk of deadlocks caused by the console semaphore in the panic path, make the following changes:
- First check if any consoles have implemented the unblank() callback. If not, then there is no reason to take the console semaphore anyway. (This check is also useful for the non-panic path since the locking/unlocking of the console lock can be quite expensive due to console printing.)
- If the panic path is in NMI context, bail out without attempting to take the console semaphore or calling any unblank() callbacks. Bailing out is acceptable because console_unblank() would already bail out if the console semaphore is contended. The alternative of ignoring the console semaphore and calling the unblank() callbacks anyway is a bad idea because these callbacks are also not NMI-safe.
If consoles with unblank() callbacks exist and console_unblank() is called from a non-NMI panic context, it will still attempt a down_trylock(). This could still result in a deadlock if one of the stopped CPUs is holding the semaphore internal spinlock. But this is a risk that the kernel has been (and continues to be) willing to take.
Signed-off-by: John Ogness john.ogness@linutronix.de Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Reviewed-by: Petr Mladek pmladek@suse.com Signed-off-by: Petr Mladek pmladek@suse.com Link: https://lore.kernel.org/r/20230717194607.145135-3-john.ogness@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/printk/printk.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 6a333adce3b33..653ad62ded417 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -3045,9 +3045,27 @@ EXPORT_SYMBOL(console_conditional_schedule);
void console_unblank(void) { + bool found_unblank = false; struct console *c; int cookie;
+ /* + * First check if there are any consoles implementing the unblank() + * callback. If not, there is no reason to continue and take the + * console lock, which in particular can be dangerous if + * @oops_in_progress is set. + */ + cookie = console_srcu_read_lock(); + for_each_console_srcu(c) { + if ((console_srcu_read_flags(c) & CON_ENABLED) && c->unblank) { + found_unblank = true; + break; + } + } + console_srcu_read_unlock(cookie); + if (!found_unblank) + return; + /* * Stop console printing because the unblank() callback may * assume the console is not within its write() callback. @@ -3056,6 +3074,16 @@ void console_unblank(void) * In that case, attempt a trylock as best-effort. */ if (oops_in_progress) { + /* Semaphores are not NMI-safe. */ + if (in_nmi()) + return; + + /* + * Attempting to trylock the console lock can deadlock + * if another CPU was stopped while modifying the + * semaphore. "Hope and pray" that this is not the + * current situation. + */ if (down_trylock_console_sem() != 0) return; } else
From: John Ogness john.ogness@linutronix.de
[ Upstream commit 51a1d258e50e03a0216bf42b6af9ff34ec402ac1 ]
When in a panic situation, non-panic CPUs should avoid holding the console lock so as not to contend with the panic CPU. This is already implemented with abandon_console_lock_in_panic(), which is checked after each printed line. However, non-panic CPUs should also avoid trying to acquire the console lock during a panic.
Modify console_trylock() to fail and console_lock() to block() when called from a non-panic CPU during a panic.
Signed-off-by: John Ogness john.ogness@linutronix.de Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Reviewed-by: Petr Mladek pmladek@suse.com Signed-off-by: Petr Mladek pmladek@suse.com Link: https://lore.kernel.org/r/20230717194607.145135-4-john.ogness@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/printk/printk.c | 45 ++++++++++++++++++++++++------------------ 1 file changed, 26 insertions(+), 19 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 653ad62ded417..d4df72efe91b5 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2585,6 +2585,25 @@ static int console_cpu_notify(unsigned int cpu) return 0; }
+/* + * Return true when this CPU should unlock console_sem without pushing all + * messages to the console. This reduces the chance that the console is + * locked when the panic CPU tries to use it. + */ +static bool abandon_console_lock_in_panic(void) +{ + if (!panic_in_progress()) + return false; + + /* + * We can use raw_smp_processor_id() here because it is impossible for + * the task to be migrated to the panic_cpu, or away from it. If + * panic_cpu has already been set, and we're not currently executing on + * that CPU, then we never will be. + */ + return atomic_read(&panic_cpu) != raw_smp_processor_id(); +} + /** * console_lock - block the console subsystem from printing * @@ -2597,6 +2616,10 @@ void console_lock(void) { might_sleep();
+ /* On panic, the console_lock must be left to the panic cpu. */ + while (abandon_console_lock_in_panic()) + msleep(1000); + down_console_sem(); if (console_suspended) return; @@ -2615,6 +2638,9 @@ EXPORT_SYMBOL(console_lock); */ int console_trylock(void) { + /* On panic, the console_lock must be left to the panic cpu. */ + if (abandon_console_lock_in_panic()) + return 0; if (down_trylock_console_sem()) return 0; if (console_suspended) { @@ -2633,25 +2659,6 @@ int is_console_locked(void) } EXPORT_SYMBOL(is_console_locked);
-/* - * Return true when this CPU should unlock console_sem without pushing all - * messages to the console. This reduces the chance that the console is - * locked when the panic CPU tries to use it. - */ -static bool abandon_console_lock_in_panic(void) -{ - if (!panic_in_progress()) - return false; - - /* - * We can use raw_smp_processor_id() here because it is impossible for - * the task to be migrated to the panic_cpu, or away from it. If - * panic_cpu has already been set, and we're not currently executing on - * that CPU, then we never will be. - */ - return atomic_read(&panic_cpu) != raw_smp_processor_id(); -} - /* * Check if the given console is currently capable and allowed to print * records.
From: John Ogness john.ogness@linutronix.de
[ Upstream commit eacb04ff3c5b8662a65f380ae450250698448cff ]
Currently console_flush_on_panic() will attempt to acquire the console lock when flushing the buffer on panic. If it fails to acquire the lock, it continues anyway because this is the last chance to get any pending records printed.
The reason why the console lock was attempted at all was to prevent any other CPUs from acquiring the console lock for printing while the panic CPU was printing. But as of the previous commit, non-panic CPUs will no longer attempt to acquire the console lock in a panic situation. Therefore it is no longer strictly necessary for a panic CPU to acquire the console lock.
Avoiding taking the console lock when flushing in panic has the additional benefit of avoiding possible deadlocks due to semaphore usage in NMI context (semaphores are not NMI-safe) and avoiding possible deadlocks if another CPU accesses the semaphore and is stopped while holding one of the semaphore's internal spinlocks.
Signed-off-by: John Ogness john.ogness@linutronix.de Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Reviewed-by: Petr Mladek pmladek@suse.com Signed-off-by: Petr Mladek pmladek@suse.com Link: https://lore.kernel.org/r/20230717194607.145135-5-john.ogness@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/printk/printk.c | 28 +++++++++++++++++++--------- 1 file changed, 19 insertions(+), 9 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index d4df72efe91b5..e00753aa035bc 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -3120,14 +3120,24 @@ void console_unblank(void) */ void console_flush_on_panic(enum con_flush_mode mode) { + bool handover; + u64 next_seq; + /* - * If someone else is holding the console lock, trylock will fail - * and may_schedule may be set. Ignore and proceed to unlock so - * that messages are flushed out. As this can be called from any - * context and we don't want to get preempted while flushing, - * ensure may_schedule is cleared. + * Ignore the console lock and flush out the messages. Attempting a + * trylock would not be useful because: + * + * - if it is contended, it must be ignored anyway + * - console_lock() and console_trylock() block and fail + * respectively in panic for non-panic CPUs + * - semaphores are not NMI-safe + */ + + /* + * If another context is holding the console lock, + * @console_may_schedule might be set. Clear it so that + * this context does not call cond_resched() while flushing. */ - console_trylock(); console_may_schedule = 0;
if (mode == CONSOLE_REPLAY_ALL) { @@ -3140,15 +3150,15 @@ void console_flush_on_panic(enum con_flush_mode mode) cookie = console_srcu_read_lock(); for_each_console_srcu(c) { /* - * If the above console_trylock() failed, this is an - * unsynchronized assignment. But in that case, the + * This is an unsynchronized assignment, but the * kernel is in "hope and pray" mode anyway. */ c->seq = seq; } console_srcu_read_unlock(cookie); } - console_unlock(); + + console_flush_all(false, &next_seq, &handover); }
/*
From: John Ogness john.ogness@linutronix.de
[ Upstream commit 696ffaf50e1f8dbc66223ff614473f945f5fb8d8 ]
Printing to consoles can be deferred for several reasons:
- explicitly with printk_deferred() - printk() in NMI context - recursive printk() calls
The current implementation is not consistent. For printk_deferred(), irq work is scheduled twice. For NMI und recursive, panic CPU suppression and caller delays are not properly enforced.
Correct these inconsistencies by consolidating the deferred printing code so that vprintk_deferred() is the top-level function for deferred printing and vprintk_emit() will perform whichever irq_work queueing is appropriate.
Also add kerneldoc for wake_up_klogd() and defer_console_output() to clarify their differences and appropriate usage.
Signed-off-by: John Ogness john.ogness@linutronix.de Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Reviewed-by: Petr Mladek pmladek@suse.com Signed-off-by: Petr Mladek pmladek@suse.com Link: https://lore.kernel.org/r/20230717194607.145135-6-john.ogness@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/printk/printk.c | 35 ++++++++++++++++++++++++++++------- kernel/printk/printk_safe.c | 9 ++------- 2 files changed, 30 insertions(+), 14 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index e00753aa035bc..7a3077d92ab17 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2308,7 +2308,11 @@ asmlinkage int vprintk_emit(int facility, int level, preempt_enable(); }
- wake_up_klogd(); + if (in_sched) + defer_console_output(); + else + wake_up_klogd(); + return printed_len; } EXPORT_SYMBOL(vprintk_emit); @@ -3843,11 +3847,33 @@ static void __wake_up_klogd(int val) preempt_enable(); }
+/** + * wake_up_klogd - Wake kernel logging daemon + * + * Use this function when new records have been added to the ringbuffer + * and the console printing of those records has already occurred or is + * known to be handled by some other context. This function will only + * wake the logging daemon. + * + * Context: Any context. + */ void wake_up_klogd(void) { __wake_up_klogd(PRINTK_PENDING_WAKEUP); }
+/** + * defer_console_output - Wake kernel logging daemon and trigger + * console printing in a deferred context + * + * Use this function when new records have been added to the ringbuffer, + * this context is responsible for console printing those records, but + * the current context is not allowed to perform the console printing. + * Trigger an irq_work context to perform the console printing. This + * function also wakes the logging daemon. + * + * Context: Any context. + */ void defer_console_output(void) { /* @@ -3864,12 +3890,7 @@ void printk_trigger_flush(void)
int vprintk_deferred(const char *fmt, va_list args) { - int r; - - r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, fmt, args); - defer_console_output(); - - return r; + return vprintk_emit(0, LOGLEVEL_SCHED, NULL, fmt, args); }
int _printk_deferred(const char *fmt, ...) diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c index ef0f9a2044da1..6d10927a07d83 100644 --- a/kernel/printk/printk_safe.c +++ b/kernel/printk/printk_safe.c @@ -38,13 +38,8 @@ asmlinkage int vprintk(const char *fmt, va_list args) * Use the main logbuf even in NMI. But avoid calling console * drivers that might have their own locks. */ - if (this_cpu_read(printk_context) || in_nmi()) { - int len; - - len = vprintk_store(0, LOGLEVEL_DEFAULT, NULL, fmt, args); - defer_console_output(); - return len; - } + if (this_cpu_read(printk_context) || in_nmi()) + return vprintk_deferred(fmt, args);
/* No obstacles. */ return vprintk_default(fmt, args);
From: John Ogness john.ogness@linutronix.de
[ Upstream commit 132a90d1527fedba2d95085c951ccf00dbbebe41 ]
Currently abandon_console_lock_in_panic() is only used to determine if the current CPU should immediately release the console lock because another CPU is in panic. However, later this function will be used by the CPU to immediately release other resources in this situation.
Rename the function to other_cpu_in_panic(), which is a better description and does not assume it is related to the console lock.
Signed-off-by: John Ogness john.ogness@linutronix.de Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Reviewed-by: Petr Mladek pmladek@suse.com Signed-off-by: Petr Mladek pmladek@suse.com Link: https://lore.kernel.org/r/20230717194607.145135-8-john.ogness@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/printk/internal.h | 2 ++ kernel/printk/printk.c | 15 ++++++++------- 2 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h index 2a17704136f1d..7d4979d5c3ce6 100644 --- a/kernel/printk/internal.h +++ b/kernel/printk/internal.h @@ -103,3 +103,5 @@ struct printk_message { u64 seq; unsigned long dropped; }; + +bool other_cpu_in_panic(void); diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 7a3077d92ab17..36e2c316bcd2b 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2590,11 +2590,12 @@ static int console_cpu_notify(unsigned int cpu) }
/* - * Return true when this CPU should unlock console_sem without pushing all - * messages to the console. This reduces the chance that the console is - * locked when the panic CPU tries to use it. + * Return true if a panic is in progress on a remote CPU. + * + * On true, the local CPU should immediately release any printing resources + * that may be needed by the panic CPU. */ -static bool abandon_console_lock_in_panic(void) +bool other_cpu_in_panic(void) { if (!panic_in_progress()) return false; @@ -2621,7 +2622,7 @@ void console_lock(void) might_sleep();
/* On panic, the console_lock must be left to the panic cpu. */ - while (abandon_console_lock_in_panic()) + while (other_cpu_in_panic()) msleep(1000);
down_console_sem(); @@ -2643,7 +2644,7 @@ EXPORT_SYMBOL(console_lock); int console_trylock(void) { /* On panic, the console_lock must be left to the panic cpu. */ - if (abandon_console_lock_in_panic()) + if (other_cpu_in_panic()) return 0; if (down_trylock_console_sem()) return 0; @@ -2959,7 +2960,7 @@ static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handove any_progress = true;
/* Allow panic_cpu to take over the consoles safely. */ - if (abandon_console_lock_in_panic()) + if (other_cpu_in_panic()) goto abandon;
if (do_cond_resched)
From: Mark Brown broonie@kernel.org
[ Upstream commit c51592a95f360aabf2b8a5691c550e1749dc41eb ]
The sun8i thermal driver reads calibration data via the nvmem API at startup, updating the device configuration and not referencing the data again. Rather than explicitly freeing the nvmem data the driver relies on devm_ to release it, even though the data is never referenced again. The allocation is still tracked so it's not leaked but this is notable when looking at the code and is a little wasteful so let's instead explicitly free the nvmem after we're done with it.
Signed-off-by: Mark Brown broonie@kernel.org Acked-by: Jernej Skrabec jernej.skrabec@gmail.com Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Link: https://lore.kernel.org/r/20230719-thermal-sun8i-free-nvmem-v1-1-f553d5afef7... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thermal/sun8i_thermal.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/thermal/sun8i_thermal.c b/drivers/thermal/sun8i_thermal.c index d4d241686c810..56a00f1efe47a 100644 --- a/drivers/thermal/sun8i_thermal.c +++ b/drivers/thermal/sun8i_thermal.c @@ -286,7 +286,7 @@ static int sun8i_ths_calibrate(struct ths_device *tmdev) size_t callen; int ret = 0;
- calcell = devm_nvmem_cell_get(dev, "calibration"); + calcell = nvmem_cell_get(dev, "calibration"); if (IS_ERR(calcell)) { if (PTR_ERR(calcell) == -EPROBE_DEFER) return -EPROBE_DEFER; @@ -316,6 +316,8 @@ static int sun8i_ths_calibrate(struct ths_device *tmdev)
kfree(caldata); out: + if (!IS_ERR(calcell)) + nvmem_cell_put(calcell); return ret; }
linux-stable-mirror@lists.linaro.org