September 2023 - Linux-stable-mirror

[for-linus][PATCH 08/15] tracing: Have event inject files inc the trace array ref count

by Steven Rostedt

From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>

The event inject files add events for a specific trace array. For an
instance, if the file is opened and the instance is deleted, reading or
writing to the file will cause a use after free.

Up the ref count of the trace_array when an event inject file is opened.

Link: https://lkml.kernel.org/r/20230907024804.292337868@goodmis.org
Link: https://lore.kernel.org/all/1cb3aee2-19af-c472-e265-05176fe9bd84@huawei.com/

Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Zheng Yejian <zhengyejian1(a)huawei.com>
Fixes: 6c3edaf9fd6a ("tracing: Introduce trace event injection")
Tested-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju(a)linaro.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
 kernel/trace/trace_events_inject.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_events_inject.c b/kernel/trace/trace_events_inject.c
index abe805d471eb..8650562bdaa9 100644
--- a/kernel/trace/trace_events_inject.c
+++ b/kernel/trace/trace_events_inject.c
@@ -328,7 +328,8 @@ event_inject_read(struct file *file, char __user *buf, size_t size,
 }
 
 const struct file_operations event_inject_fops = {
-	.open = tracing_open_generic,
+	.open = tracing_open_file_tr,
 	.read = event_inject_read,
 	.write = event_inject_write,
+	.release = tracing_release_file_tr,
 };
--
2.40.1


[for-linus][PATCH 07/15] tracing: Have option files inc the trace array ref count

by Steven Rostedt

From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>

The option files update the options for a given trace array. For an
instance, if the file is opened and the instance is deleted, reading or
writing to the file will cause a use after free.

Up the ref count of the trace_array when an option file is opened.

Link: https://lkml.kernel.org/r/20230907024804.086679464@goodmis.org
Link: https://lore.kernel.org/all/1cb3aee2-19af-c472-e265-05176fe9bd84@huawei.com/

Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Zheng Yejian <zhengyejian1(a)huawei.com>
Fixes: 8530dec63e7b4 ("tracing: Add tracing_check_open_get_tr()")
Tested-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju(a)linaro.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
 kernel/trace/trace.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index b82df33d20ff..0608ad20cf30 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -8988,12 +8988,33 @@ trace_options_write(struct file *filp, const char __user *ubuf, size_t cnt,
 	return cnt;
 }
 
+static int tracing_open_options(struct inode *inode, struct file *filp)
+{
+	struct trace_option_dentry *topt = inode->i_private;
+	int ret;
+
+	ret = tracing_check_open_get_tr(topt->tr);
+	if (ret)
+		return ret;
+
+	filp->private_data = inode->i_private;
+	return 0;
+}
+
+static int tracing_release_options(struct inode *inode, struct file *file)
+{
+	struct trace_option_dentry *topt = file->private_data;
+
+	trace_array_put(topt->tr);
+	return 0;
+}
 
 static const struct file_operations trace_options_fops = {
-	.open = tracing_open_generic,
+	.open = tracing_open_options,
 	.read = trace_options_read,
 	.write = trace_options_write,
 	.llseek = generic_file_llseek,
+	.release = tracing_release_options,
 };
 
 /*
--
2.40.1


[for-linus][PATCH 06/15] tracing: Have current_trace inc the trace array ref count

by Steven Rostedt

From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>

The current_trace updates the trace array tracer. For an instance, if the
file is opened and the instance is deleted, reading or writing to the file
will cause a use after free.

Up the ref count of the trace array when current_trace is opened.

Link: https://lkml.kernel.org/r/20230907024803.877687227@goodmis.org
Link: https://lore.kernel.org/all/1cb3aee2-19af-c472-e265-05176fe9bd84@huawei.com/

Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Zheng Yejian <zhengyejian1(a)huawei.com>
Fixes: 8530dec63e7b4 ("tracing: Add tracing_check_open_get_tr()")
Tested-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju(a)linaro.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
 kernel/trace/trace.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index c8b8b4c6feaf..b82df33d20ff 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -7791,10 +7791,11 @@ static const struct file_operations tracing_max_lat_fops = {
 #endif
 
 static const struct file_operations set_tracer_fops = {
-	.open = tracing_open_generic,
+	.open = tracing_open_generic_tr,
 	.read = tracing_set_trace_read,
 	.write = tracing_set_trace_write,
 	.llseek = generic_file_llseek,
+	.release = tracing_release_generic_tr,
 };
 
 static const struct file_operations tracing_pipe_fops = {
--
2.40.1


[for-linus][PATCH 05/15] tracing: Have tracing_max_latency inc the trace array ref count

by Steven Rostedt

From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>

The tracing_max_latency file points to the trace_array max_latency field.
For an instance, if the file is opened and the instance is deleted,
reading or writing to the file will cause a use after free.

Up the ref count of the trace_array when tracing_max_latency is opened.

Link: https://lkml.kernel.org/r/20230907024803.666889383@goodmis.org
Link: https://lore.kernel.org/all/1cb3aee2-19af-c472-e265-05176fe9bd84@huawei.com/

Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Zheng Yejian <zhengyejian1(a)huawei.com>
Fixes: 8530dec63e7b4 ("tracing: Add tracing_check_open_get_tr()")
Tested-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju(a)linaro.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
 kernel/trace/trace.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 0827037ee3b8..c8b8b4c6feaf 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1772,7 +1772,7 @@ static void trace_create_maxlat_file(struct trace_array *tr,
 	init_irq_work(&tr->fsnotify_irqwork, latency_fsnotify_workfn_irq);
 	tr->d_max_latency = trace_create_file("tracing_max_latency",
 					      TRACE_MODE_WRITE,
-					      d_tracer, &tr->max_latency,
+					      d_tracer, tr,
 					      &tracing_max_lat_fops);
 }
 
@@ -1805,7 +1805,7 @@ void latency_fsnotify(struct trace_array *tr)
 
 #define trace_create_maxlat_file(tr, d_tracer)				\
 	trace_create_file("tracing_max_latency", TRACE_MODE_WRITE,	\
-			  d_tracer, &tr->max_latency, &tracing_max_lat_fops)
+			  d_tracer, tr, &tracing_max_lat_fops)
 
 #endif
 
@@ -6717,14 +6717,18 @@ static ssize_t
 tracing_max_lat_read(struct file *filp, char __user *ubuf,
 		     size_t cnt, loff_t *ppos)
 {
-	return tracing_nsecs_read(filp->private_data, ubuf, cnt, ppos);
+	struct trace_array *tr = filp->private_data;
+
+	return tracing_nsecs_read(&tr->max_latency, ubuf, cnt, ppos);
 }
 
 static ssize_t
 tracing_max_lat_write(struct file *filp, const char __user *ubuf,
 		      size_t cnt, loff_t *ppos)
 {
-	return tracing_nsecs_write(filp->private_data, ubuf, cnt, ppos);
+	struct trace_array *tr = filp->private_data;
+
+	return tracing_nsecs_write(&tr->max_latency, ubuf, cnt, ppos);
 }
 
 #endif
@@ -7778,10 +7782,11 @@ static const struct file_operations tracing_thresh_fops = {
 
 #ifdef CONFIG_TRACER_MAX_TRACE
 static const struct file_operations tracing_max_lat_fops = {
-	.open = tracing_open_generic,
+	.open = tracing_open_generic_tr,
 	.read = tracing_max_lat_read,
 	.write = tracing_max_lat_write,
 	.llseek = generic_file_llseek,
+	.release = tracing_release_generic_tr,
 };
 #endif
 
--
2.40.1
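
Archive note (not part of the patch): the change above stops handing the file a pointer to the max_latency field and hands it the owning trace_array instead, so that open and release can find the object whose ref count they must adjust. A minimal userspace sketch of that idea follows. Every name in it (maxlat_array, maxlat_file, maxlat_open and so on) is hypothetical and only models the pattern; none of it is kernel API.

```c
/* Userspace model only: all names here are hypothetical, not kernel API. */
#include <assert.h>
#include <stddef.h>

/* Stands in for the object that owns the field and carries the ref count. */
struct maxlat_array {
	int ref;
	unsigned long max_latency;
};

/* Stands in for an open file with its private data pointer. */
struct maxlat_file {
	void *private_data;
};

/* Open stores the owner, not &tr->max_latency, and pins it. */
static void maxlat_open(struct maxlat_file *f, struct maxlat_array *tr)
{
	tr->ref++;
	f->private_data = tr;
}

/* Read reaches the field through the owner. */
static unsigned long maxlat_read(const struct maxlat_file *f)
{
	const struct maxlat_array *tr = f->private_data;

	return tr->max_latency;
}

/* Release can drop the reference because it can find the owner again. */
static void maxlat_release(struct maxlat_file *f)
{
	struct maxlat_array *tr = f->private_data;

	assert(tr->ref > 0);
	tr->ref--;
	f->private_data = NULL;
}
```

Had the private data stayed a pointer to the field, the release step in this model would have no direct way to reach the ref count, which is presumably why the patch changes what trace_create_file() is given.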


[for-linus][PATCH 04/15] tracing: Increase trace array ref count on enable and filter files

by Steven Rostedt

From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>

When the trace event enable and filter files are opened, increment the
trace array ref counter, otherwise they can be accessed when the trace
array is being deleted. The ref counter keeps the trace array from being
deleted while those files are opened.

Link: https://lkml.kernel.org/r/20230907024803.456187066@goodmis.org
Link: https://lore.kernel.org/all/1cb3aee2-19af-c472-e265-05176fe9bd84@huawei.com/

Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Fixes: 8530dec63e7b4 ("tracing: Add tracing_check_open_get_tr()")
Tested-by: Linux Kernel Functional Testing <lkft(a)linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju(a)linaro.org>
Reported-by: Zheng Yejian <zhengyejian1(a)huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
 kernel/trace/trace.c        | 27 +++++++++++++++++++++++++++
 kernel/trace/trace.h        |  2 ++
 kernel/trace/trace_events.c |  6 ++++--
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 35783a7baf15..0827037ee3b8 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4973,6 +4973,33 @@ int tracing_open_generic_tr(struct inode *inode, struct file *filp)
 	return 0;
 }
 
+/*
+ * The private pointer of the inode is the trace_event_file.
+ * Update the tr ref count associated to it.
+ */
+int tracing_open_file_tr(struct inode *inode, struct file *filp)
+{
+	struct trace_event_file *file = inode->i_private;
+	int ret;
+
+	ret = tracing_check_open_get_tr(file->tr);
+	if (ret)
+		return ret;
+
+	filp->private_data = inode->i_private;
+
+	return 0;
+}
+
+int tracing_release_file_tr(struct inode *inode, struct file *filp)
+{
+	struct trace_event_file *file = inode->i_private;
+
+	trace_array_put(file->tr);
+
+	return 0;
+}
+
 static int tracing_mark_open(struct inode *inode, struct file *filp)
 {
 	stream_open(inode, filp);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 5669dd1f90d9..77debe53f07c 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -610,6 +610,8 @@ void tracing_reset_all_online_cpus(void);
 void tracing_reset_all_online_cpus_unlocked(void);
 int tracing_open_generic(struct inode *inode, struct file *filp);
 int tracing_open_generic_tr(struct inode *inode, struct file *filp);
+int tracing_open_file_tr(struct inode *inode, struct file *filp);
+int tracing_release_file_tr(struct inode *inode, struct file *filp);
 bool tracing_is_disabled(void);
 bool tracer_tracing_is_on(struct trace_array *tr);
 void tracer_tracing_on(struct trace_array *tr);
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index ed367d713be0..2af92177b765 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2103,9 +2103,10 @@ static const struct file_operations ftrace_set_event_notrace_pid_fops = {
 };
 
 static const struct file_operations ftrace_enable_fops = {
-	.open = tracing_open_generic,
+	.open = tracing_open_file_tr,
 	.read = event_enable_read,
 	.write = event_enable_write,
+	.release = tracing_release_file_tr,
 	.llseek = default_llseek,
 };
 
@@ -2122,9 +2123,10 @@ static const struct file_operations ftrace_event_id_fops = {
 };
 
 static const struct file_operations ftrace_event_filter_fops = {
-	.open = tracing_open_generic,
+	.open = tracing_open_file_tr,
 	.read = event_filter_read,
 	.write = event_filter_write,
+	.release = tracing_release_file_tr,
 	.llseek = default_llseek,
 };
 
--
2.40.1
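
Archive note (not part of the patch): the commit message says the ref counter keeps the trace array from being deleted while the files are open. The sketch below models that lifetime rule in plain userspace C. It assumes that removal is refused while references are held and that opening a removed instance fails; the error codes chosen here (-EBUSY, -ENODEV) and all names (refmodel_instance, refmodel_open, refmodel_release, refmodel_remove) are illustrative assumptions, not the kernel's implementation.

```c
/* Userspace model only: names and error codes are illustrative assumptions. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct refmodel_instance {
	int ref;	/* number of open files pinning the instance */
	bool live;	/* false once the instance has been removed */
};

/* Models the open path: refuse a removed instance, otherwise pin it. */
static int refmodel_open(struct refmodel_instance *inst)
{
	if (!inst->live)
		return -ENODEV;
	inst->ref++;
	return 0;
}

/* Models the release path: drop the reference taken at open. */
static void refmodel_release(struct refmodel_instance *inst)
{
	assert(inst->ref > 0);
	inst->ref--;
}

/* Models instance removal: refused while any file is still open. */
static int refmodel_remove(struct refmodel_instance *inst)
{
	if (inst->ref > 0)
		return -EBUSY;
	inst->live = false;
	return 0;
}
```

In this model a reader or writer can never observe a removed instance through an open file, because removal cannot succeed until every matching release has run. That is the property the open/release pair added in the patch is meant to provide.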


[for-linus][PATCH 01/15] tracefs: Add missing lockdown check to tracefs_create_dir()

by Steven Rostedt

From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>

The function tracefs_create_dir() was missing a lockdown check and was
called by the RV code. This led to inconsistent behavior, with this
function returning success while other tracefs functions failed. This
caused the inode to be freed by the wrong kmem_cache.

Link: https://lkml.kernel.org/r/20230905182711.692687042@goodmis.org
Link: https://lore.kernel.org/all/202309050916.58201dc6-oliver.sang@intel.com/

Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Ajay Kaher <akaher(a)vmware.com>
Cc: Ching-lin Yu <chinglinyu(a)google.com>
Fixes: bf8e602186ec4 ("tracing: Do not create tracefs files if tracefs lockdown is in effect")
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
 fs/tracefs/inode.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index de5b72216b1a..3b8dd938b1c8 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -673,6 +673,9 @@ static struct dentry *__create_dir(const char *name, struct dentry *parent,
  */
 struct dentry *tracefs_create_dir(const char *name, struct dentry *parent)
 {
+	if (security_locked_down(LOCKDOWN_TRACEFS))
+		return NULL;
+
 	return __create_dir(name, parent, &simple_dir_inode_operations);
 }
 
--
2.40.1
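
Archive note (not part of the patch): the fix puts the policy check in the public entry point so that every creation function fails the same way under lockdown. A minimal userspace sketch of that gating pattern is below. The flag lockmodel_locked_down stands in for the result of security_locked_down(), and all lockmodel_* names are hypothetical.

```c
/* Userspace model only: lockmodel_* names are hypothetical, not kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct lockmodel_dir {
	const char *name;
};

/* Stands in for the answer a lockdown query would give. */
static bool lockmodel_locked_down;

/* A single static slot is enough for the demonstration. */
static struct lockmodel_dir lockmodel_slot;

/* Internal helper: creates unconditionally, like an unchecked worker. */
static struct lockmodel_dir *lockmodel_create_dir_unchecked(const char *name)
{
	lockmodel_slot.name = name;
	return &lockmodel_slot;
}

/* Public entry point: checks policy first, then delegates. */
static struct lockmodel_dir *lockmodel_create_dir(const char *name)
{
	if (lockmodel_locked_down)
		return NULL;

	return lockmodel_create_dir_unchecked(name);
}
```

Callers of the public function see NULL under lockdown, matching what the other creation paths report, so no caller ends up with a half-created object.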


[PATCH AUTOSEL 5.10] thermal/drivers/sun8i: Free calibration nvmem after reading it

by Sasha Levin

From: Mark Brown <broonie(a)kernel.org>

[ Upstream commit c51592a95f360aabf2b8a5691c550e1749dc41eb ]

The sun8i thermal driver reads calibration data via the nvmem API at
startup, updating the device configuration and not referencing the data
again. Rather than explicitly freeing the nvmem data the driver relies
on devm_ to release it, even though the data is never referenced again.
The allocation is still tracked so it's not leaked but this is notable
when looking at the code and is a little wasteful so let's instead
explicitly free the nvmem after we're done with it.

Signed-off-by: Mark Brown <broonie(a)kernel.org>
Acked-by: Jernej Skrabec <jernej.skrabec(a)gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
Link: https://lore.kernel.org/r/20230719-thermal-sun8i-free-nvmem-v1-1-f553d5afef…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 drivers/thermal/sun8i_thermal.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/sun8i_thermal.c b/drivers/thermal/sun8i_thermal.c
index e053b06280172..6a0e809174731 100644
--- a/drivers/thermal/sun8i_thermal.c
+++ b/drivers/thermal/sun8i_thermal.c
@@ -285,7 +285,7 @@ static int sun8i_ths_calibrate(struct ths_device *tmdev)
 	size_t callen;
 	int ret = 0;
 
-	calcell = devm_nvmem_cell_get(dev, "calibration");
+	calcell = nvmem_cell_get(dev, "calibration");
 	if (IS_ERR(calcell)) {
 		if (PTR_ERR(calcell) == -EPROBE_DEFER)
 			return -EPROBE_DEFER;
@@ -315,6 +315,8 @@ static int sun8i_ths_calibrate(struct ths_device *tmdev)
 
 	kfree(caldata);
 out:
+	if (!IS_ERR(calcell))
+		nvmem_cell_put(calcell);
 	return ret;
 }
 
--
2.40.1
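
Archive note (not part of the patch): the change swaps a device-managed acquisition for a plain get paired with an explicit put on the common exit path. The userspace sketch below models that shape, including the detail that the put must be skipped when the get failed. The cellmodel_* names, the NULL-on-failure convention and the treatment of a missing cell as non-fatal are assumptions made for the model, not the nvmem API.

```c
/* Userspace model only: cellmodel_* names are hypothetical, not the nvmem API. */
#include <assert.h>
#include <stddef.h>

struct cellmodel_cell {
	int value;
};

static struct cellmodel_cell cellmodel_calibration = { .value = 42 };

/* Counts handles currently held, so a leak is observable. */
static int cellmodel_held;

/* Returns NULL on failure in this model (the kernel uses ERR_PTR values). */
static struct cellmodel_cell *cellmodel_get(int available)
{
	if (!available)
		return NULL;
	cellmodel_held++;
	return &cellmodel_calibration;
}

static void cellmodel_put(struct cellmodel_cell *cell)
{
	(void)cell;
	assert(cellmodel_held > 0);
	cellmodel_held--;
}

/* Read the value once, then release the handle on the shared exit path. */
static int cellmodel_calibrate(int available, int *out)
{
	struct cellmodel_cell *cell = cellmodel_get(available);
	int ret = 0;

	if (!cell)
		goto out;	/* model assumption: missing data is not fatal */

	*out = cell->value;
out:
	if (cell)
		cellmodel_put(cell);	/* only put what was actually obtained */
	return ret;
}
```

After either outcome the held count is back to zero, which is the behavior the explicit nvmem_cell_put() in the patch is after: nothing stays allocated for the lifetime of the device when it is only needed during calibration.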


[PATCH AUTOSEL 5.15 1/2] printk: Consolidate console deferred printing

by Sasha Levin

From: John Ogness <john.ogness(a)linutronix.de>

[ Upstream commit 696ffaf50e1f8dbc66223ff614473f945f5fb8d8 ]

Printing to consoles can be deferred for several reasons:

- explicitly with printk_deferred()
- printk() in NMI context
- recursive printk() calls

The current implementation is not consistent. For printk_deferred(),
irq work is scheduled twice. For NMI and recursive, panic CPU
suppression and caller delays are not properly enforced.

Correct these inconsistencies by consolidating the deferred printing
code so that vprintk_deferred() is the top-level function for
deferred printing and vprintk_emit() will perform whichever irq_work
queueing is appropriate.

Also add kerneldoc for wake_up_klogd() and defer_console_output() to
clarify their differences and appropriate usage.

Signed-off-by: John Ogness <john.ogness(a)linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Reviewed-by: Petr Mladek <pmladek(a)suse.com>
Signed-off-by: Petr Mladek <pmladek(a)suse.com>
Link: https://lore.kernel.org/r/20230717194607.145135-6-john.ogness@linutronix.de
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 kernel/printk/printk.c      | 35 ++++++++++++++++++++++++++++-------
 kernel/printk/printk_safe.c |  9 ++-------
 2 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 8d856b7c2e5af..8b110b245d92c 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2269,7 +2269,11 @@ asmlinkage int vprintk_emit(int facility, int level,
 		preempt_enable();
 	}
 
-	wake_up_klogd();
+	if (in_sched)
+		defer_console_output();
+	else
+		wake_up_klogd();
+
 	return printed_len;
 }
 EXPORT_SYMBOL(vprintk_emit);
@@ -3277,11 +3281,33 @@ static void __wake_up_klogd(int val)
 	preempt_enable();
 }
 
+/**
+ * wake_up_klogd - Wake kernel logging daemon
+ *
+ * Use this function when new records have been added to the ringbuffer
+ * and the console printing of those records has already occurred or is
+ * known to be handled by some other context. This function will only
+ * wake the logging daemon.
+ *
+ * Context: Any context.
+ */
 void wake_up_klogd(void)
 {
 	__wake_up_klogd(PRINTK_PENDING_WAKEUP);
 }
 
+/**
+ * defer_console_output - Wake kernel logging daemon and trigger
+ *	console printing in a deferred context
+ *
+ * Use this function when new records have been added to the ringbuffer,
+ * this context is responsible for console printing those records, but
+ * the current context is not allowed to perform the console printing.
+ * Trigger an irq_work context to perform the console printing. This
+ * function also wakes the logging daemon.
+ *
+ * Context: Any context.
+ */
 void defer_console_output(void)
 {
 	/*
@@ -3298,12 +3324,7 @@ void printk_trigger_flush(void)
 
 int vprintk_deferred(const char *fmt, va_list args)
 {
-	int r;
-
-	r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, fmt, args);
-	defer_console_output();
-
-	return r;
+	return vprintk_emit(0, LOGLEVEL_SCHED, NULL, fmt, args);
 }
 
 int _printk_deferred(const char *fmt, ...)
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index ef0f9a2044da1..6d10927a07d83 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -38,13 +38,8 @@ asmlinkage int vprintk(const char *fmt, va_list args)
 	 * Use the main logbuf even in NMI. But avoid calling console
 	 * drivers that might have their own locks.
 	 */
-	if (this_cpu_read(printk_context) || in_nmi()) {
-		int len;
-
-		len = vprintk_store(0, LOGLEVEL_DEFAULT, NULL, fmt, args);
-		defer_console_output();
-		return len;
-	}
+	if (this_cpu_read(printk_context) || in_nmi())
+		return vprintk_deferred(fmt, args);
 
 	/* No obstacles. */
 	return vprintk_default(fmt, args);
--
2.40.1
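
Archive note (not part of the patch): the consolidation makes one emit function choose between "defer console output" and "only wake the log daemon", so the deferred entry point no longer queues work a second time. The userspace sketch below models that dispatch with two counters. It assumes the in_sched flag seen in the diff is derived from the scheduler log level, which the diff itself does not show; the printmodel_* names and the level constants are hypothetical.

```c
/* Userspace model only: printmodel_* names and constants are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define PRINTMODEL_LEVEL_DEFAULT	(-1)
#define PRINTMODEL_LEVEL_SCHED		(-2)

static int printmodel_wakeups;		/* times only the daemon was woken */
static int printmodel_deferrals;	/* times deferred printing was queued */

/*
 * Models the emit step: storing the record and direct console printing
 * are omitted; only the final wake-or-defer decision is kept.
 */
static int printmodel_emit(int level, int len)
{
	bool in_sched = (level == PRINTMODEL_LEVEL_SCHED);

	if (in_sched)
		printmodel_deferrals++;
	else
		printmodel_wakeups++;

	return len;
}

/* Models the deferred entry point: it delegates and queues nothing itself. */
static int printmodel_deferred(int len)
{
	return printmodel_emit(PRINTMODEL_LEVEL_SCHED, len);
}
```

In this model one deferred call results in exactly one deferral and no separate wakeup, which corresponds to the "irq work is scheduled twice" problem the commit message describes being removed.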


[PATCH AUTOSEL 6.1 1/3] printk: Keep non-panic-CPUs out of console lock

by Sasha Levin

From: John Ogness <john.ogness(a)linutronix.de>

[ Upstream commit 51a1d258e50e03a0216bf42b6af9ff34ec402ac1 ]

When in a panic situation, non-panic CPUs should avoid holding the
console lock so as not to contend with the panic CPU. This is already
implemented with abandon_console_lock_in_panic(), which is checked
after each printed line. However, non-panic CPUs should also avoid
trying to acquire the console lock during a panic.

Modify console_trylock() to fail and console_lock() to block when
called from a non-panic CPU during a panic.

Signed-off-by: John Ogness <john.ogness(a)linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Reviewed-by: Petr Mladek <pmladek(a)suse.com>
Signed-off-by: Petr Mladek <pmladek(a)suse.com>
Link: https://lore.kernel.org/r/20230717194607.145135-4-john.ogness@linutronix.de
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 kernel/printk/printk.c | 45 ++++++++++++++++++++++++------------------
 1 file changed, 26 insertions(+), 19 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index e4f1e7478b521..4b9429f3fd6d8 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2552,6 +2552,25 @@ static int console_cpu_notify(unsigned int cpu)
 	return 0;
 }
 
+/*
+ * Return true when this CPU should unlock console_sem without pushing all
+ * messages to the console. This reduces the chance that the console is
+ * locked when the panic CPU tries to use it.
+ */
+static bool abandon_console_lock_in_panic(void)
+{
+	if (!panic_in_progress())
+		return false;
+
+	/*
+	 * We can use raw_smp_processor_id() here because it is impossible for
+	 * the task to be migrated to the panic_cpu, or away from it. If
+	 * panic_cpu has already been set, and we're not currently executing on
+	 * that CPU, then we never will be.
+	 */
+	return atomic_read(&panic_cpu) != raw_smp_processor_id();
+}
+
 /**
  * console_lock - lock the console system for exclusive use.
  *
@@ -2564,6 +2583,10 @@ void console_lock(void)
 {
 	might_sleep();
 
+	/* On panic, the console_lock must be left to the panic cpu. */
+	while (abandon_console_lock_in_panic())
+		msleep(1000);
+
 	down_console_sem();
 	if (console_suspended)
 		return;
@@ -2582,6 +2605,9 @@ EXPORT_SYMBOL(console_lock);
  */
 int console_trylock(void)
 {
+	/* On panic, the console_lock must be left to the panic cpu. */
+	if (abandon_console_lock_in_panic())
+		return 0;
 	if (down_trylock_console_sem())
 		return 0;
 	if (console_suspended) {
@@ -2600,25 +2626,6 @@ int is_console_locked(void)
 }
 EXPORT_SYMBOL(is_console_locked);
 
-/*
- * Return true when this CPU should unlock console_sem without pushing all
- * messages to the console. This reduces the chance that the console is
- * locked when the panic CPU tries to use it.
- */
-static bool abandon_console_lock_in_panic(void)
-{
-	if (!panic_in_progress())
-		return false;
-
-	/*
-	 * We can use raw_smp_processor_id() here because it is impossible for
-	 * the task to be migrated to the panic_cpu, or away from it. If
-	 * panic_cpu has already been set, and we're not currently executing on
-	 * that CPU, then we never will be.
-	 */
-	return atomic_read(&panic_cpu) != raw_smp_processor_id();
-}
-
 /*
  * Check if the given console is currently capable and allowed to print
  * records.
--
2.40.1
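
Archive note (not part of the patch): the trylock change can be pictured as one extra guard ahead of the normal acquisition attempt. The userspace sketch below models it with an explicit CPU number instead of raw_smp_processor_id() and a plain flag instead of the console semaphore. The panicmodel_* names and the "no panic" sentinel value are assumptions made for the model; only the 1-on-success, 0-on-failure return convention is taken from the diff.

```c
/* Userspace model only: panicmodel_* names are hypothetical, not kernel API. */
#include <assert.h>
#include <stdbool.h>

#define PANICMODEL_CPU_NONE	(-1)	/* model sentinel: no panic in progress */

static int panicmodel_panic_cpu = PANICMODEL_CPU_NONE;
static bool panicmodel_locked;

/* True when a panic is in progress and the caller is not the panic CPU. */
static bool panicmodel_abandon(int this_cpu)
{
	if (panicmodel_panic_cpu == PANICMODEL_CPU_NONE)
		return false;

	return panicmodel_panic_cpu != this_cpu;
}

/* Returns 1 on success and 0 on failure, like the function it models. */
static int panicmodel_trylock(int this_cpu)
{
	/* During a panic the lock is left to the panic CPU. */
	if (panicmodel_abandon(this_cpu))
		return 0;
	if (panicmodel_locked)
		return 0;
	panicmodel_locked = true;
	return 1;
}

static void panicmodel_unlock(void)
{
	panicmodel_locked = false;
}
```

The blocking variant in the patch uses the same predicate in a sleep loop, so a non-panic CPU never reaches the acquisition step at all once a panic has begun.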


[PATCH AUTOSEL 6.4 1/6] printk: Reduce console_unblank() usage in unsafe scenarios

by Sasha Levin

From: John Ogness <john.ogness(a)linutronix.de>

[ Upstream commit 7b23a66db55ed0a55b020e913f0d6f6d52a1ad2c ]

A semaphore is not NMI-safe, even when using down_trylock(). Both
down_trylock() and up() are using internal spinlocks and up()
might even call wake_up_process().

In the panic() code path it gets even worse because the internal
spinlocks of the semaphore may have been taken by a CPU that has
been stopped.

To reduce the risk of deadlocks caused by the console semaphore in
the panic path, make the following changes:

- First check if any consoles have implemented the unblank()
  callback. If not, then there is no reason to take the console
  semaphore anyway. (This check is also useful for the non-panic
  path since the locking/unlocking of the console lock can be
  quite expensive due to console printing.)

- If the panic path is in NMI context, bail out without attempting
  to take the console semaphore or calling any unblank() callbacks.
  Bailing out is acceptable because console_unblank() would already
  bail out if the console semaphore is contended. The alternative of
  ignoring the console semaphore and calling the unblank() callbacks
  anyway is a bad idea because these callbacks are also not NMI-safe.

If consoles with unblank() callbacks exist and console_unblank() is
called from a non-NMI panic context, it will still attempt a
down_trylock(). This could still result in a deadlock if one of the
stopped CPUs is holding the semaphore internal spinlock. But this
is a risk that the kernel has been (and continues to be) willing
to take.

Signed-off-by: John Ogness <john.ogness(a)linutronix.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky(a)chromium.org>
Reviewed-by: Petr Mladek <pmladek(a)suse.com>
Signed-off-by: Petr Mladek <pmladek(a)suse.com>
Link: https://lore.kernel.org/r/20230717194607.145135-3-john.ogness@linutronix.de
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 kernel/printk/printk.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 6a333adce3b33..653ad62ded417 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3045,9 +3045,27 @@ EXPORT_SYMBOL(console_conditional_schedule);
 
 void console_unblank(void)
 {
+	bool found_unblank = false;
 	struct console *c;
 	int cookie;
 
+	/*
+	 * First check if there are any consoles implementing the unblank()
+	 * callback. If not, there is no reason to continue and take the
+	 * console lock, which in particular can be dangerous if
+	 * @oops_in_progress is set.
+	 */
+	cookie = console_srcu_read_lock();
+	for_each_console_srcu(c) {
+		if ((console_srcu_read_flags(c) & CON_ENABLED) && c->unblank) {
+			found_unblank = true;
+			break;
+		}
+	}
+	console_srcu_read_unlock(cookie);
+	if (!found_unblank)
+		return;
+
 	/*
 	 * Stop console printing because the unblank() callback may
 	 * assume the console is not within its write() callback.
@@ -3056,6 +3074,16 @@ void console_unblank(void)
 	 * In that case, attempt a trylock as best-effort.
 	 */
 	if (oops_in_progress) {
+		/* Semaphores are not NMI-safe. */
+		if (in_nmi())
+			return;
+
+		/*
+		 * Attempting to trylock the console lock can deadlock
+		 * if another CPU was stopped while modifying the
+		 * semaphore. "Hope and pray" that this is not the
+		 * current situation.
+		 */
 		if (down_trylock_console_sem() != 0)
 			return;
 	} else
--
2.40.1
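
Archive note (not part of the patch): the two changes listed in the commit message amount to two early exits before any locking is attempted. The userspace sketch below models that control flow over a small array of consoles. Lock acquisition is omitted and assumed to succeed, the oops and NMI states are passed in as parameters, and every unblankmodel_* name is hypothetical.

```c
/* Userspace model only: unblankmodel_* names are hypothetical, not kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct unblankmodel_console {
	bool enabled;
	void (*unblank)(void);	/* optional callback, may be NULL */
};

static int unblankmodel_calls;

static void unblankmodel_cb(void)
{
	unblankmodel_calls++;
}

/*
 * Returns true if the callbacks were run, false if the function bailed
 * out early. Taking the console lock is left out of the model.
 */
static bool unblankmodel_run(const struct unblankmodel_console *c, size_t n,
			     bool oops, bool nmi)
{
	bool found = false;
	size_t i;

	/* Exit 1: nothing to do if no enabled console has the callback. */
	for (i = 0; i < n; i++) {
		if (c[i].enabled && c[i].unblank) {
			found = true;
			break;
		}
	}
	if (!found)
		return false;

	/* Exit 2: in the oops path, never touch the lock from NMI context. */
	if (oops && nmi)
		return false;

	for (i = 0; i < n; i++) {
		if (c[i].enabled && c[i].unblank)
			c[i].unblank();
	}
	return true;
}
```

The ordering matters in the model as in the patch: the cheap scan runs first, so the common case with no unblank() callbacks never reaches the point where locking would be considered.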

