The commit 80dd33cf72d1 ("drivers: base: Fix device link removal")
introduces a workqueue to release the consumer and supplier devices used
in the devlink.
In the job queued, devices are release and in turn, when all the
references to these devices are dropped, the release function of the
device itself is called.
Nothing is present to provide some synchronisation with this workqueue
in order to ensure that all ongoing releasing operations are done and
so, some other operations can be started safely.
For instance, in the following sequence:
1) of_platform_depopulate()
2) of_overlay_remove()
During the step 1, devices are released and related devlinks are removed
(jobs pushed in the workqueue).
During the step 2, OF nodes are destroyed but, without any
synchronisation with devlink removal jobs, of_overlay_remove() can raise
warnings related to missing of_node_put():
ERROR: memory leak, expected refcount 1 instead of 2
Indeed, the missing of_node_put() call is going to be done, too late,
from the workqueue job execution.
Introduce device_link_wait_removal() to offer a way to synchronize
operations waiting for the end of devlink removals (i.e. end of
workqueue jobs).
Also, as a flushing operation is done on the workqueue, the workqueue
used is moved from a system-wide workqueue to a local one.
Cc: stable(a)vger.kernel.org
Signed-off-by: Herve Codina <herve.codina(a)bootlin.com>
Tested-by: Luca Ceresoli <luca.ceresoli(a)bootlin.com>
Reviewed-by: Nuno Sa <nuno.sa(a)analog.com>
Reviewed-by: Saravana Kannan <saravanak(a)google.com>
---
drivers/base/core.c | 26 +++++++++++++++++++++++---
include/linux/device.h | 1 +
2 files changed, 24 insertions(+), 3 deletions(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 7e3af0ad770a..f2242aadffb0 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -44,6 +44,7 @@ static bool fw_devlink_is_permissive(void);
static void __fw_devlink_link_to_consumers(struct device *dev);
static bool fw_devlink_drv_reg_done;
static bool fw_devlink_best_effort;
+static struct workqueue_struct *device_link_wq;
/**
* __fwnode_link_add - Create a link between two fwnode_handles.
@@ -533,12 +534,26 @@ static void devlink_dev_release(struct device *dev)
/*
* It may take a while to complete this work because of the SRCU
* synchronization in device_link_release_fn() and if the consumer or
- * supplier devices get deleted when it runs, so put it into the "long"
- * workqueue.
+ * supplier devices get deleted when it runs, so put it into the
+ * dedicated workqueue.
*/
- queue_work(system_long_wq, &link->rm_work);
+ queue_work(device_link_wq, &link->rm_work);
}
+/**
+ * device_link_wait_removal - Wait for ongoing devlink removal jobs to terminate
+ */
+void device_link_wait_removal(void)
+{
+ /*
+ * devlink removal jobs are queued in the dedicated work queue.
+ * To be sure that all removal jobs are terminated, ensure that any
+ * scheduled work has run to completion.
+ */
+ flush_workqueue(device_link_wq);
+}
+EXPORT_SYMBOL_GPL(device_link_wait_removal);
+
static struct class devlink_class = {
.name = "devlink",
.dev_groups = devlink_groups,
@@ -4165,9 +4180,14 @@ int __init devices_init(void)
sysfs_dev_char_kobj = kobject_create_and_add("char", dev_kobj);
if (!sysfs_dev_char_kobj)
goto char_kobj_err;
+ device_link_wq = alloc_workqueue("device_link_wq", 0, 0);
+ if (!device_link_wq)
+ goto wq_err;
return 0;
+ wq_err:
+ kobject_put(sysfs_dev_char_kobj);
char_kobj_err:
kobject_put(sysfs_dev_block_kobj);
block_kobj_err:
diff --git a/include/linux/device.h b/include/linux/device.h
index 1795121dee9a..d7d8305a72e8 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1249,6 +1249,7 @@ void device_link_del(struct device_link *link);
void device_link_remove(void *consumer, struct device *supplier);
void device_links_supplier_sync_state_pause(void);
void device_links_supplier_sync_state_resume(void);
+void device_link_wait_removal(void);
/* Create alias, so I can be autoloaded. */
#define MODULE_ALIAS_CHARDEV(major,minor) \
--
2.44.0
Commit fc663711b944 ("scsi: core: Remove the /proc/scsi/${proc_name} directory
earlier") fixed a bug related to modules loading/unloading, by adding a
call to scsi_proc_hostdir_rm() on scsi_remove_host(). But that led to a
potential duplicate call to the hostdir_rm() routine, since it's also
called from scsi_host_dev_release(). That triggered a regression report,
which was then fixed by commit be03df3d4bfe ("scsi: core: Fix a procfs host
directory removal regression"). The fix just dropped the hostdir_rm() call
from dev_release().
But happens that this proc directory is created on scsi_host_alloc(),
and that function "pairs" with scsi_host_dev_release(), while
scsi_remove_host() pairs with scsi_add_host(). In other words, it seems
the reason for removing the proc directory on dev_release() was meant to
cover cases in which a SCSI host structure was allocated, but the call
to scsi_add_host() didn't happen. And that pattern happens to exist in
some error paths, for example.
Syzkaller causes that by using USB raw gadget device, error'ing on usb-storage
driver, at usb_stor_probe2(). By checking that path, we can see that the
BadDevice label leads to a scsi_host_put() after a SCSI host allocation, but
there's no call to scsi_add_host() in such path. That leads to messages like
this in dmesg (and a leak of the SCSI host proc structure):
usb-storage 4-1:87.51: USB Mass Storage device detected
proc_dir_entry 'scsi/usb-storage' already registered
WARNING: CPU: 1 PID: 3519 at fs/proc/generic.c:377 proc_register+0x347/0x4e0 fs/proc/generic.c:376
The proper fix seems to still call scsi_proc_hostdir_rm() on dev_release(),
but guard that with the state check for SHOST_CREATED; there is even a
comment in scsi_host_dev_release() detailing that: such conditional is
meant for cases where the SCSI host was allocated but there was no calls
to {add,remove}_host(), like the usb-storage case.
This is what we propose here and with that, the error path of usb-storage
does not trigger the warning anymore.
Reported-by: syzbot+c645abf505ed21f931b5(a)syzkaller.appspotmail.com
Fixes: be03df3d4bfe ("scsi: core: Fix a procfs host directory removal regression")
Cc: stable(a)vger.kernel.org
Cc: Bart Van Assche <bvanassche(a)acm.org>
Cc: John Garry <john.g.garry(a)oracle.com>
Cc: Shin'ichiro Kawasaki <shinichiro.kawasaki(a)wdc.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com>
---
Hi folks, thanks in advance for reviews.
The issue with usb-storage happens upstream but despite we have a
repro in the aforementioned Syzkaller report, it's only easy to reproduce
the proc_dir_entry warning in a timely manner on stable right now.
The reason for that is commit 036abd614007 ("scsi: core: Introduce a new
list for SCSI proc directory entries") not being present on stable. This
commit (indirectly) bumps the ->present field from unsigned char to uint,
and the bug reproducer shows the warning whenever such field overflows,
hence it's way easier/faster to see this problem in stable, though it's
present in latest v6.8.0 too.
Cheers,
Guilherme
drivers/scsi/hosts.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index d7f51b84f3c7..445f4a220df3 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -353,12 +353,13 @@ static void scsi_host_dev_release(struct device *dev)
if (shost->shost_state == SHOST_CREATED) {
/*
- * Free the shost_dev device name here if scsi_host_alloc()
- * and scsi_host_put() have been called but neither
+ * Free the shost_dev device name and remove the proc host dir
+ * here if scsi_host_{alloc,put}() have been called but neither
* scsi_host_add() nor scsi_remove_host() has been called.
* This avoids that the memory allocated for the shost_dev
- * name is leaked.
+ * name as well as the proc dir structure are leaked.
*/
+ scsi_proc_hostdir_rm(shost->hostt);
kfree(dev_name(&shost->shost_dev));
}
--
2.43.0
sg_remove_sfp_usercontext() must not use sg_device_destroy() after
calling scsi_device_put().
sg_device_destroy() is accessling the device queue. Which will be set to
NULL if scsi_device_put() removes the last reference to the sg device.
Link: https://lore.kernel.org/r/20240305150509.23896-1-Alexander@wetzel-home.de
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Alexander Wetzel <Alexander(a)wetzel-home.de>
---
This is my best shot for a real fix of the issue.
I confirmed with printk's that I get the NULL pointer freeze ony when
scsi_device_put() is deleting the last reference to the device.
In the cases where it's not crashing there is still a reference left
after the call.
I don't see any obvious down side of simply swapping the calls.
The alternative would by my first patch, just without the WARN_ON.
Alexander
---
drivers/scsi/sg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 86210e4dd0d3..80e0d1981191 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -2232,8 +2232,8 @@ sg_remove_sfp_usercontext(struct work_struct *work)
"sg_remove_sfp: sfp=0x%p\n", sfp));
kfree(sfp);
- scsi_device_put(sdp->device);
kref_put(&sdp->d_ref, sg_device_destroy);
+ scsi_device_put(sdp->device);
module_put(THIS_MODULE);
}
--
2.44.0
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: c2ddeb29612f7ca84ed10c6d4f3ac99705135447
Gitweb: https://git.kernel.org/tip/c2ddeb29612f7ca84ed10c6d4f3ac99705135447
Author: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
AuthorDate: Mon, 25 Mar 2024 13:58:08 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Mon, 25 Mar 2024 23:45:21 +01:00
genirq: Introduce IRQF_COND_ONESHOT and use it in pinctrl-amd
There is a problem when a driver requests a shared interrupt line to run a
threaded handler on it without IRQF_ONESHOT set if that flag has been set
already for the IRQ in question by somebody else. Namely, the request
fails which usually leads to a probe failure even though the driver might
have worked just fine with IRQF_ONESHOT, but it does not want to use it by
default. Currently, the only way to handle this is to try to request the
IRQ without IRQF_ONESHOT, but with IRQF_PROBE_SHARED set and if this fails,
try again with IRQF_ONESHOT set. However, this is a bit cumbersome and not
very clean.
When commit 7a36b901a6eb ("ACPI: OSL: Use a threaded interrupt handler for
SCI") switched the ACPI subsystem over to using a threaded interrupt
handler for the SCI, it had to use IRQF_ONESHOT for it because that's
required due to the way the SCI handler works (it needs to walk all of the
enabled GPEs before the interrupt line can be unmasked). The SCI interrupt
line is not shared with other users very often due to the SCI handling
overhead, but on sone systems it is shared and when the other user of it
attempts to install a threaded handler, a flags mismatch related to
IRQF_ONESHOT may occur.
As it turned out, that happened to the pinctrl-amd driver and so commit
4451e8e8415e ("pinctrl: amd: Add IRQF_ONESHOT to the interrupt request")
attempted to address the issue by adding IRQF_ONESHOT to the interrupt
flags in that driver, but this is now causing an IRQF_ONESHOT-related
mismatch to occur on another system which cannot boot as a result of it.
Clearly, pinctrl-amd can work with IRQF_ONESHOT if need be, but it should
not set that flag by default, so it needs a way to indicate that to the
interrupt subsystem.
To that end, introdcuce a new interrupt flag, IRQF_COND_ONESHOT, which will
only have effect when the IRQ line is shared and IRQF_ONESHOT has been set
for it already, in which case it will be promoted to the latter.
This is sufficient for drivers sharing the interrupt line with the SCI as
it is requested by the ACPI subsystem before any drivers are probed, so
they will always see IRQF_ONESHOT set for the interrupt in question.
Fixes: 4451e8e8415e ("pinctrl: amd: Add IRQF_ONESHOT to the interrupt request")
Reported-by: Francisco Ayala Le Brun <francisco(a)videowindow.eu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Linus Walleij <linus.walleij(a)linaro.org>
Cc: 6.8+ <stable(a)vger.kernel.org> # 6.8+
Closes: https://lore.kernel.org/lkml/CAN-StX1HqWqi+YW=t+V52-38Mfp5fAz7YHx4aH-CQjgyN…
Link: https://lore.kernel.org/r/12417336.O9o76ZdvQC@kreacher
---
drivers/pinctrl/pinctrl-amd.c | 2 +-
include/linux/interrupt.h | 3 +++
kernel/irq/manage.c | 9 +++++++--
3 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
index 49f89b7..7f66ec7 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -1159,7 +1159,7 @@ static int amd_gpio_probe(struct platform_device *pdev)
}
ret = devm_request_irq(&pdev->dev, gpio_dev->irq, amd_gpio_irq_handler,
- IRQF_SHARED | IRQF_ONESHOT, KBUILD_MODNAME, gpio_dev);
+ IRQF_SHARED | IRQF_COND_ONESHOT, KBUILD_MODNAME, gpio_dev);
if (ret)
goto out2;
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 76121c2..5c9bdd3 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -67,6 +67,8 @@
* later.
* IRQF_NO_DEBUG - Exclude from runnaway detection for IPI and similar handlers,
* depends on IRQF_PERCPU.
+ * IRQF_COND_ONESHOT - Agree to do IRQF_ONESHOT if already set for a shared
+ * interrupt.
*/
#define IRQF_SHARED 0x00000080
#define IRQF_PROBE_SHARED 0x00000100
@@ -82,6 +84,7 @@
#define IRQF_COND_SUSPEND 0x00040000
#define IRQF_NO_AUTOEN 0x00080000
#define IRQF_NO_DEBUG 0x00100000
+#define IRQF_COND_ONESHOT 0x00200000
#define IRQF_TIMER (__IRQF_TIMER | IRQF_NO_SUSPEND | IRQF_NO_THREAD)
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index ad3eaf2..bf9ae8a 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1643,8 +1643,13 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
}
if (!((old->flags & new->flags) & IRQF_SHARED) ||
- (oldtype != (new->flags & IRQF_TRIGGER_MASK)) ||
- ((old->flags ^ new->flags) & IRQF_ONESHOT))
+ (oldtype != (new->flags & IRQF_TRIGGER_MASK)))
+ goto mismatch;
+
+ if ((old->flags & IRQF_ONESHOT) &&
+ (new->flags & IRQF_COND_ONESHOT))
+ new->flags |= IRQF_ONESHOT;
+ else if ((old->flags ^ new->flags) & IRQF_ONESHOT)
goto mismatch;
/* All handlers must agree on per-cpuness */
The commit 3a2dbc510c43 ("driver core: fw_devlink: Don't purge child
fwnode's consumer links") introduces the possibility to use the
supplier's parent device instead of the supplier itself.
In that case the supplier fwnode used is not updated and is no more
consistent with the supplier device used.
Use the fwnode consistent with the supplier device when checking flags.
Fixes: 3a2dbc510c43 ("driver core: fw_devlink: Don't purge child fwnode's consumer links")
Cc: stable(a)vger.kernel.org
Signed-off-by: Herve Codina <herve.codina(a)bootlin.com>
---
Changes v2 -> v3:
Do not update the supplier handle in order to keep the original handle
for debug traces.
Changes v1 -> v2:
Remove sup_handle check and related pr_debug() call as sup_handle cannot be
invalid if sup_dev is valid.
drivers/base/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index b93f3c5716ae..0d335b0dc396 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2163,7 +2163,7 @@ static int fw_devlink_create_devlink(struct device *con,
* supplier device indefinitely.
*/
if (sup_dev->links.status == DL_DEV_NO_DRIVER &&
- sup_handle->flags & FWNODE_FLAG_INITIALIZED) {
+ sup_dev->fwnode->flags & FWNODE_FLAG_INITIALIZED) {
dev_dbg(con,
"Not linking %pfwf - dev might never probe\n",
sup_handle);
--
2.44.0