The Turris Mox shares the moxtet IRQ with various devices on the board,
so mark the IRQ as shared in the driver as well.
Without this, loading the module fails with:
genirq: Flags mismatch irq 40. 00002002 (moxtet) vs. 00002080 (mcp7940x)
Signed-off-by: Sjoerd Simons <sjoerd(a)collabora.com>
Cc: stable(a)vger.kernel.org # v6.2+
---
(no changes since v1)
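For readers decoding the error above, here is a minimal userspace sketch of
the sharing test, not the genirq code itself. The flag values are the ones
from include/linux/interrupt.h; irq_flags_compatible() and
irq_flags_self_check() are hypothetical names, and the real check also
compares trigger types and per-CPU settings, which this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in include/linux/interrupt.h. */
#define IRQF_TRIGGER_FALLING 0x00000002u
#define IRQF_SHARED          0x00000080u
#define IRQF_ONESHOT         0x00002000u

/*
 * Simplified model (an assumption, not the kernel function) of the test
 * applied when a second handler is requested on an already claimed line.
 */
bool irq_flags_compatible(unsigned int new_flags, unsigned int old_flags)
{
	/* Both the existing and the new handler must agree to share. */
	if (!((old_flags & new_flags) & IRQF_SHARED))
		return false;
	/* All handlers on one line must agree on IRQF_ONESHOT. */
	if ((old_flags & IRQF_ONESHOT) != (new_flags & IRQF_ONESHOT))
		return false;
	return true;
}

/* Replays the two flag words from the error message. */
void irq_flags_self_check(void)
{
	/* 0x2002 is ONESHOT | TRIGGER_FALLING: IRQF_SHARED is missing. */
	assert((IRQF_ONESHOT | IRQF_TRIGGER_FALLING) == 0x00002002u);
	assert(!irq_flags_compatible(0x00002002u, 0x00002080u));
	/* With IRQF_SHARED added, the request passes this simplified test. */
	assert(irq_flags_compatible(0x00002002u | IRQF_SHARED, 0x00002080u));
}
```

In this model the moxtet request (00002002) is rejected only because it lacks
bit 0x80, which is the bit the one-line change adds.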
drivers/bus/moxtet.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/bus/moxtet.c b/drivers/bus/moxtet.c
index 5eb0fe73ddc4..48c18f95660a 100644
--- a/drivers/bus/moxtet.c
+++ b/drivers/bus/moxtet.c
@@ -755,7 +755,7 @@ static int moxtet_irq_setup(struct moxtet *moxtet)
moxtet->irq.masked = ~0;
ret = request_threaded_irq(moxtet->dev_irq, NULL, moxtet_irq_thread_fn,
- IRQF_ONESHOT, "moxtet", moxtet);
+ IRQF_SHARED | IRQF_ONESHOT, "moxtet", moxtet);
if (ret < 0)
goto err_free;
--
2.43.0
A refcount issue can appear in __fwnode_link_del() due to the
pr_debug() call:
WARNING: CPU: 0 PID: 901 at lib/refcount.c:25 refcount_warn_saturate+0xe5/0x110
Call Trace:
<TASK>
...
of_node_get+0x1e/0x30
of_fwnode_get+0x28/0x40
fwnode_full_name_string+0x34/0x90
fwnode_string+0xdb/0x140
...
vsnprintf+0x17b/0x630
...
__fwnode_link_del+0x25/0xa0
fwnode_links_purge+0x39/0xb0
of_node_release+0xd9/0x180
...
Indeed, an fwnode (of_node) is being destroyed, so of_node_release()
is called because the of_node refcount reached 0.
From of_node_release(), a chain of function calls leads to
a pr_debug() call that uses %pfwf to print the fwnode full name.
The issue is not present if we change %pfwf to %pfwP.
To print the full name, %pfwf iterates over the current node and its
parents and obtains/drops a reference to each node involved.
To allow printing the full name (%pfwf) of a node while it is
being destroyed, do not obtain/drop a reference to the current node.
Fixes: a92eb7621b9f ("lib/vsprintf: Make use of fwnode API to obtain node names and separators")
Cc: stable(a)vger.kernel.org
Signed-off-by: Herve Codina <herve.codina(a)bootlin.com>
Reviewed-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
---
Changes v2 -> v3
- Fix typo in comment ("ie parents node" -> "i.e. parent nodes")
- Add 'Reviewed-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>'
- Add 'Reviewed-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>'
Changes v1 -> v2
- Avoid handling current node out of the loop. Instead obtain/drop references
in the loop based on the depth value.
- Remove some of the backtrace lines in the commit log.
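The depth-based approach can be illustrated with a small userspace sketch.
This is not the vsprintf code: struct node, node_get(), node_put(),
get_nth_parent(), full_name() and the refcount_warnings counter are all
hypothetical stand-ins, and the counter only marks the places where a
refcount warning would be expected in the kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a firmware node with a reference count. */
struct node {
	const char *name;
	struct node *parent;
	int refcount;
};

/* Counts get/put operations attempted on a node whose refcount is 0. */
int refcount_warnings;

struct node *node_get(struct node *n)
{
	if (n->refcount == 0)
		refcount_warnings++;	/* a refcount warning would fire here */
	else
		n->refcount++;
	return n;
}

void node_put(struct node *n)
{
	if (n->refcount == 0)
		refcount_warnings++;
	else
		n->refcount--;
}

int count_parents(const struct node *n)
{
	int count = 0;

	for (n = n->parent; n; n = n->parent)
		count++;
	return count;
}

/* Returns the depth-th ancestor (0 is the node itself) with a reference. */
struct node *get_nth_parent(struct node *n, int depth)
{
	while (depth-- > 0)
		n = n->parent;
	return node_get(n);
}

/*
 * Writes "/root/.../node" into buf. With skip_self_ref set, references are
 * taken on ancestors only, so the node may already have a refcount of 0.
 */
void full_name(struct node *n, char *buf, size_t size, int skip_self_ref)
{
	size_t len = 0;

	buf[0] = '\0';
	for (int depth = count_parents(n); depth >= 0; depth--) {
		int take_ref = depth || !skip_self_ref;
		struct node *cur = take_ref ? get_nth_parent(n, depth) : n;

		if (len < size)
			len += (size_t)snprintf(buf + len, size - len,
						"/%s", cur->name);
		if (take_ref)
			node_put(cur);
	}
}
```

Calling full_name() with skip_self_ref set on a node whose refcount is
already 0 leaves refcount_warnings untouched, while the old behaviour
(skip_self_ref clear) increments it for the get and for the put.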
lib/vsprintf.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index afb88b24fa74..2aa408441cd3 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -2110,15 +2110,20 @@ char *fwnode_full_name_string(struct fwnode_handle *fwnode, char *buf,
/* Loop starting from the root node to the current node. */
for (depth = fwnode_count_parents(fwnode); depth >= 0; depth--) {
- struct fwnode_handle *__fwnode =
- fwnode_get_nth_parent(fwnode, depth);
+ /*
+ * Only get a reference for other nodes (i.e. parent nodes).
+ * fwnode refcount may be 0 here.
+ */
+ struct fwnode_handle *__fwnode = depth ?
+ fwnode_get_nth_parent(fwnode, depth) : fwnode;
buf = string(buf, end, fwnode_get_name_prefix(__fwnode),
default_str_spec);
buf = string(buf, end, fwnode_get_name(__fwnode),
default_str_spec);
- fwnode_handle_put(__fwnode);
+ if (depth)
+ fwnode_handle_put(__fwnode);
}
return buf;
--
2.41.0
Hi all,
This series fixes some long-standing issues in the kernel that prevent
some machines from working properly.
Hopefully it will rescue some systems in the wild :-)
Thanks
Signed-off-by: Jiaxun Yang <jiaxun.yang(a)flygoat.com>
---
Changes in v2:
- Typo and style fixes
- Link to v1: https://lore.kernel.org/r/20231101-loongson64_fixes-v1-0-2a2582a4bfa9@flygo…
---
Jiaxun Yang (3):
MIPS: Loongson64: Reserve vgabios memory on boot
MIPS: Loongson64: Enable DMA noncoherent support
MIPS: Loongson64: Handle more memory types passed from firmware
arch/mips/Kconfig | 2 +
arch/mips/include/asm/mach-loongson64/boot_param.h | 9 ++++-
arch/mips/loongson64/env.c | 10 ++++-
arch/mips/loongson64/init.c | 47 ++++++++++++++--------
4 files changed, 49 insertions(+), 19 deletions(-)
---
base-commit: 9c2d379d63450ae464eeab45462e0cb573cd97d0
change-id: 20231101-loongson64_fixes-0afb1b503d1e
Best regards,
--
Jiaxun Yang <jiaxun.yang(a)flygoat.com>
io_uring sets up the io worker kernel thread via a syscall out of a
user space process. This process might have used the FPU, and since
copy_thread() didn't clear FPU states for kernel threads, a BUG()
is triggered for using the FPU inside the kernel. Move code around
to always clear FPU state for user and kernel threads.
Cc: stable(a)vger.kernel.org
Reported-by: Aurelien Jarno <aurel32(a)debian.org>
Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1055021
Signed-off-by: Thomas Bogendoerfer <tsbogend(a)alpha.franken.de>
---
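The effect of the reordering can be modelled in a few lines of userspace C.
This is only a sketch under stated assumptions: struct task, the bit position
of TIF_USEDFPU, and copy_thread_old()/copy_thread_new() are hypothetical, and
it assumes (as the commit message implies) that the kernel-thread branch
returned before reaching the old clearing code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit position; the real value is architecture specific. */
#define TIF_USEDFPU (1u << 0)

/* Hypothetical stand-in for the thread flags of a task. */
struct task {
	unsigned int flags;
};

/* Old ordering: the kernel-thread path returns before the flag is cleared. */
void copy_thread_old(struct task *child, const struct task *parent,
		     bool kernel_thread)
{
	child->flags = parent->flags;	/* the child starts as a copy */
	if (kernel_thread)
		return;			/* FPU state is inherited */
	child->flags &= ~TIF_USEDFPU;
}

/* New ordering: clear first, so both paths start without FPU state. */
void copy_thread_new(struct task *child, const struct task *parent,
		     bool kernel_thread)
{
	child->flags = parent->flags;
	child->flags &= ~TIF_USEDFPU;
	if (kernel_thread)
		return;
}
```

With a parent that has used the FPU, only copy_thread_old() with
kernel_thread set leaves TIF_USEDFPU in the child, which matches the io
worker case described above.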
arch/mips/kernel/process.c | 25 +++++++++++++------------
1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 5387ed0a5186..b630604c577f 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -121,6 +121,19 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
/* Put the stack after the struct pt_regs. */
childksp = (unsigned long) childregs;
p->thread.cp0_status = (read_c0_status() & ~(ST0_CU2|ST0_CU1)) | ST0_KERNEL_CUMASK;
+
+ /*
+ * New tasks lose permission to use the fpu. This accelerates context
+ * switching for most programs since they don't use the fpu.
+ */
+ clear_tsk_thread_flag(p, TIF_USEDFPU);
+ clear_tsk_thread_flag(p, TIF_USEDMSA);
+ clear_tsk_thread_flag(p, TIF_MSA_CTX_LIVE);
+
+#ifdef CONFIG_MIPS_MT_FPAFF
+ clear_tsk_thread_flag(p, TIF_FPUBOUND);
+#endif /* CONFIG_MIPS_MT_FPAFF */
+
if (unlikely(args->fn)) {
/* kernel thread */
unsigned long status = p->thread.cp0_status;
@@ -149,20 +162,8 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
p->thread.reg29 = (unsigned long) childregs;
p->thread.reg31 = (unsigned long) ret_from_fork;
- /*
- * New tasks lose permission to use the fpu. This accelerates context
- * switching for most programs since they don't use the fpu.
- */
childregs->cp0_status &= ~(ST0_CU2|ST0_CU1);
- clear_tsk_thread_flag(p, TIF_USEDFPU);
- clear_tsk_thread_flag(p, TIF_USEDMSA);
- clear_tsk_thread_flag(p, TIF_MSA_CTX_LIVE);
-
-#ifdef CONFIG_MIPS_MT_FPAFF
- clear_tsk_thread_flag(p, TIF_FPUBOUND);
-#endif /* CONFIG_MIPS_MT_FPAFF */
-
#ifdef CONFIG_MIPS_FP_SUPPORT
atomic_set(&p->thread.bd_emu_frame, BD_EMUFRAME_NONE);
#endif
--
2.35.3
Hi, all
We are encountering a perf related soft lockup as shown below:
[25023823.265138] watchdog: BUG: soft lockup - CPU#29 stuck for 45s! [YD:3284696]
[25023823.275772] net_failover virtio_scsi failover
[25023823.276750] CPU: 29 PID: 3284696 Comm: YD Kdump: loaded Not tainted 4.19.90-23.18.v2101.ky10.aarch64 #1
[25023823.278257] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[25023823.279475] pstate: 80400005 (Nzcv daif +PAN -UAO)
[25023823.280516] pc : perf_iterate_sb+0x1b8/0x1f0
[25023823.281530] lr : perf_iterate_sb+0x18c/0x1f0
[25023823.282529] sp : ffff801f282efbf0
[25023823.283446] x29: ffff801f282efbf0 x28: ffff801f207a8b80
[25023823.284551] x27: 0000000000000000 x26: ffff801f99b355e8
[25023823.285674] x25: 0000000000000000 x24: ffff8019e2fbd800
[25023823.286770] x23: ffff0000093f0018 x22: ffff801f282efc40
[25023823.287864] x21: ffff000008255f60 x20: ffff801ffdf58e80
[25023823.288964] x19: ffff8019f1c27800 x18: 0000000000000000
[25023823.290060] x17: 0000000000000000 x16: 0000000000000000
[25023823.291164] x15: 0400000000000000 x14: 0000000000000000
[25023823.292266] x13: ffff000008c6e340 x12: 0000000000000002
[25023823.293381] x11: ffff000008c6e318 x10: 00000019e5feff20
[25023823.294486] x9 : ffff8019fb49c000 x8 : 0058e6fd335b260e
[25023823.295597] x7 : 0000000100321ed8 x6 : ffff00003d083780
[25023823.296715] x5 : 00ffffffffffffff x4 : 0000801ff4ae0000
[25023823.297860] x3 : ffff801ffdf64cc0 x2 : ffff000009858758
[25023823.298977] x1 : 0000000000000000 x0 : ffff8019e2fbd800
[25023823.300090] Call trace:
[25023823.300962] perf_iterate_sb+0x1b8/0x1f0
[25023823.301961] perf_event_task+0x78/0x80
[25023823.302946] perf_event_exit_task+0xa4/0xb0
[25023823.303978] do_exit+0x38c/0x5d0
[25023823.304932] do_group_exit+0x3c/0xd8
[25023823.305904] get_signal+0x12c/0x740
[25023823.306859] do_signal+0x158/0x260
[25023823.307795] do_notify_resume+0xd8/0x358
[25023823.308781] work_pending+0x8/0x10
We got a vmcore by enabling panic_on_soft_lockup. From the vmcore we
found that the perf_event accessed through
perf_iterate_sb -> perf_iterate_sb_cpu -> event_filter_match ->
pmu_filter_match -> for_each_sibling_event
had been removed:
#define for_each_sibling_event(sibling, event) \
if ((event)->group_leader == (event)) \
list_for_each_entry((sibling), &(event)->sibling_list, sibling_list)
#define list_for_each_entry(pos, head, member) \
for (pos = __container_of((head)->next, pos, member); \
&pos->member != (head); \
pos = __container_of(pos->member.next, pos, member))
crash> struct perf_event ffff8019e2fbd800
struct perf_event {
event_entry = {
next = 0xffff8019f1c27800,
prev = 0xdead000000000200
},
...
state = PERF_EVENT_STATE_DEAD,
...
}
By the way, we also found another process which is deleting sibling_list:
crash> bt 3284533
PID: 3284533 TASK: ffff801f901ae880 CPU: 16 COMMAND: "YD"
#0 [ffff801f8cd977f0] __switch_to at ffff000008088ba4
#1 [ffff801f8cd97810] __schedule at ffff000008bf10c4
#2 [ffff801f8cd97890] schedule at ffff000008bf17b0
#3 [ffff801f8cd978a0] schedule_timeout at ffff000008bf5b10
#4 [ffff801f8cd97960] wait_for_common at ffff000008bf2530
#5 [ffff801f8cd979f0] wait_for_completion at ffff000008bf2644
#6 [ffff801f8cd97a10] __wait_rcu_gp at ffff000008171c00
#7 [ffff801f8cd97a80] synchronize_sched at ffff000008179da8
#8 [ffff801f8cd97ad0] perf_trace_event_unreg at ffff000008216d50
#9 [ffff801f8cd97b00] perf_trace_destroy at ffff000008217148
#10 [ffff801f8cd97b20] tp_perf_event_destroy at ffff000008256ae0
#11 [ffff801f8cd97b30] _free_event at ffff00000825f21c
#12 [ffff801f8cd97b70] put_event at ffff00000825faf0
#13 [ffff801f8cd97b80] perf_event_release_kernel at ffff00000825fcb8
#14 [ffff801f8cd97be0] perf_release at ffff00000825fdbc
#15 [ffff801f8cd97bf0] __fput at ffff00000832f0b8
#16 [ffff801f8cd97c30] ____fput at ffff00000832f28c
#17 [ffff801f8cd97c50] task_work_run at ffff00000810f8c8
#18 [ffff801f8cd97c90] do_exit at ffff0000080ef458
#19 [ffff801f8cd97cf0] do_group_exit at ffff0000080ef738
#20 [ffff801f8cd97d20] get_signal at ffff0000080fdde0
#21 [ffff801f8cd97d90] do_signal at ffff00000808e488
#22 [ffff801f8cd97e80] do_notify_resume at ffff00000808e7f4
#23 [ffff801f8cd97ff0] work_pending at ffff000008083f60
So it's reasonable to suspect that perf_iterate_sb is traversing
sibling_list while another process is deleting from it, which eventually
causes for_each_sibling_event to loop endlessly and thus triggers the
soft lockup.
The race scenario could thus be:
CPU 29:                         CPU 16:
                                perf_event_release_kernel
                                --> mutex_lock(&ctx->mutex)
                                --> perf_remove_from_context
                                --> perf_group_detach(event);
for_each_sibling_event()        --> list_del_init(&event->sibling_list)
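One way such a traversal can spin forever is shown by the userspace sketch
below. It is an illustration of the suspected mechanism, not a reproduction
of the perf code: list_init(), list_add_tail(), list_del_init() and
walk_to_head() are local stand-ins written for this example (list_del_init()
follows the kernel semantics of unlinking an entry and then pointing it at
itself), and the step limit exists only so the demonstration terminates.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the entry, then make it point at itself. */
void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	list_init(e);
}

/*
 * Continue a traversal from pos until head is reached. Returns the number
 * of steps taken, or -1 once max_steps is exceeded, which stands in for
 * "would loop forever".
 */
long walk_to_head(const struct list_head *pos, const struct list_head *head,
		  long max_steps)
{
	long steps = 0;

	while (pos != head) {
		if (steps++ >= max_steps)
			return -1;
		pos = pos->next;
	}
	return steps;
}
```

If a walker is positioned on an entry at the moment another context calls
list_del_init() on that entry, the entry's next pointer now refers to itself,
so the termination test against the list head can never succeed and
walk_to_head() returns -1.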
As commit f3c0eba287049 ("perf: Add a few assertions") said:
"Notable for_each_sibling_event() relies on exclusion from
modification. This would normally be holding either ctx->lock or
ctx->mutex, however due to how things are constructed disabling IRQs
is a valid and sufficient substitute for ctx->lock."
We think it's necessary to hold ctx->mutex, but current LTS kernels
such as 4.19, 5.4, 5.10, and 6.1 do not do so:
perf_event_task
--> perf_iterate_sb
--> perf_iterate_sb_cpu
--> event_filter_match
--> pmu_filter_match
--> for_each_sibling_event
Commit bd27568117664 ("perf: Rewrite core context handling") removed
the pmu_filter_match operation, so that commit may serve as a temporary
workaround for this issue.
But we need to confirm whether there is a race on sibling_list and,
if so, how to fix the current LTS branches.
Thanks in advance.
The move from wait_for_completion_*() to
wait_for_completion_interruptible_*() was intended to allow (very) long
SPI memory transfers to be stopped upon user request instead of freezing
the machine forever, as the timeout value could now be significantly
bigger.
However, depending on the user logic, applications can receive many
signals for their own "internal" purposes that have nothing to do with
the requested kernel operations, hence interrupting SPI transfers upon
any signal is probably not a wise choice. Instead, let's switch to
wait_for_completion_killable_*() to only catch the "important"
signals. This was likely the intended behavior anyway.
Fixes: e0205d6203c2 ("spi: atmel: Prevent false timeouts on long transfers")
Cc: stable(a)vger.kernel.org
Reported-by: Ronald Wahl <ronald.wahl(a)raritan.com>
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
Hello Ronald, this is only compile tested, please let me know if that
fixes your use case or if you still suffer from interrupted transfers.
Thanks!
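For reference, a small userspace sketch of how the return value of the timed
wait is interpreted in the hunk below. xfer_wait_result() is a hypothetical
helper written for this note; the convention it encodes (positive when the
completion fired, 0 on timeout, -ERESTARTSYS when the wait was cut short by a
signal) is the documented one for the *_timeout() completion helpers, and
ERESTARTSYS is defined locally because it is a kernel-internal value.

```c
#include <assert.h>
#include <string.h>

/* Kernel-internal errno value, not exposed by the userspace <errno.h>. */
#define ERESTARTSYS 512

/* Hypothetical helper mapping the wait result to a short description. */
const char *xfer_wait_result(long ret_timeout)
{
	if (ret_timeout > 0)
		return "completed";	/* jiffies left before the timeout */
	return ret_timeout == 0 ? "timeout" : "canceled";
}

/* Exercises the three possible outcomes. */
void xfer_wait_self_check(void)
{
	assert(strcmp(xfer_wait_result(25), "completed") == 0);
	assert(strcmp(xfer_wait_result(0), "timeout") == 0);
	assert(strcmp(xfer_wait_result(-ERESTARTSYS), "canceled") == 0);
}
```

With the killable variant, the "canceled" outcome should only be reachable
through a fatal signal rather than any signal delivered to the task.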
---
drivers/spi/spi-atmel.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/spi/spi-atmel.c b/drivers/spi/spi-atmel.c
index 6aa8adbe4170..2e8860865af9 100644
--- a/drivers/spi/spi-atmel.c
+++ b/drivers/spi/spi-atmel.c
@@ -1336,8 +1336,8 @@ static int atmel_spi_one_transfer(struct spi_controller *host,
}
dma_timeout = msecs_to_jiffies(spi_controller_xfer_timeout(host, xfer));
- ret_timeout = wait_for_completion_interruptible_timeout(&as->xfer_completion,
- dma_timeout);
+ ret_timeout = wait_for_completion_killable_timeout(&as->xfer_completion,
+ dma_timeout);
if (ret_timeout <= 0) {
dev_err(&spi->dev, "spi transfer %s\n",
!ret_timeout ? "timeout" : "canceled");
--
2.34.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 2e84dc37920012b458e9458b19fc4ed33f81bc74
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023112330-squealer-strife-0ecc@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
2e84dc379200 ("driver core: Release all resources during unbind before updating device links")
25f3bcfc54bc ("driver core: Add dma_cleanup callback in bus_type")
9ad307213fa4 ("driver core: Refactor multiple copies of device cleanup")
d8f7a5484f21 ("driver core: Free DMA range map when device is released")
885e50253bfd ("driver core: Move driver_sysfs_remove() after driver_sysfs_add()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2e84dc37920012b458e9458b19fc4ed33f81bc74 Mon Sep 17 00:00:00 2001
From: Saravana Kannan <saravanak(a)google.com>
Date: Tue, 17 Oct 2023 18:38:50 -0700
Subject: [PATCH] driver core: Release all resources during unbind before
updating device links
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This commit fixes a bug in commit 9ed9895370ae ("driver core: Functional
dependencies tracking support") where the device link status was
incorrectly updated in the driver unbind path before all the device's
resources were released.
Fixes: 9ed9895370ae ("driver core: Functional dependencies tracking support")
Cc: stable <stable(a)kernel.org>
Reported-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Closes: https://lore.kernel.org/all/20231014161721.f4iqyroddkcyoefo@pengutronix.de/
Signed-off-by: Saravana Kannan <saravanak(a)google.com>
Cc: Thierry Reding <thierry.reding(a)gmail.com>
Cc: Yang Yingliang <yangyingliang(a)huawei.com>
Cc: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Matti Vaittinen <mazziesaccount(a)gmail.com>
Cc: James Clark <james.clark(a)arm.com>
Acked-by: "Rafael J. Wysocki" <rafael(a)kernel.org>
Tested-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Acked-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Link: https://lore.kernel.org/r/20231018013851.3303928-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index a528cec24264..0c3725c3eefa 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -1274,8 +1274,8 @@ static void __device_release_driver(struct device *dev, struct device *parent)
if (dev->bus && dev->bus->dma_cleanup)
dev->bus->dma_cleanup(dev);
- device_links_driver_cleanup(dev);
device_unbind_cleanup(dev);
+ device_links_driver_cleanup(dev);
klist_remove(&dev->p->knode_driver);
device_pm_check_callbacks(dev);