The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x bc8aeb04fd80cb8cfae3058445c84410fd0beb5e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120614-amuser-purging-2004@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bc8aeb04fd80cb8cfae3058445c84410fd0beb5e Mon Sep 17 00:00:00 2001
From: Chao Yu <chao(a)kernel.org>
Date: Thu, 21 Nov 2024 22:17:16 +0800
Subject: [PATCH] f2fs: fix to drop all discards after creating snapshot on lvm
device
Piergiorgio reported a bug in bugzilla as below:
------------[ cut here ]------------
WARNING: CPU: 2 PID: 969 at fs/f2fs/segment.c:1330
RIP: 0010:__submit_discard_cmd+0x27d/0x400 [f2fs]
Call Trace:
__issue_discard_cmd+0x1ca/0x350 [f2fs]
issue_discard_thread+0x191/0x480 [f2fs]
kthread+0xcf/0x100
ret_from_fork+0x31/0x50
ret_from_fork_asm+0x1a/0x30
w/ below testcase, it can reproduce this bug quickly:
- pvcreate /dev/vdb
- vgcreate myvg1 /dev/vdb
- lvcreate -L 1024m -n mylv1 myvg1
- mount /dev/myvg1/mylv1 /mnt/f2fs
- dd if=/dev/zero of=/mnt/f2fs/file bs=1M count=20
- sync
- rm /mnt/f2fs/file
- sync
- lvcreate -L 1024m -s -n mylv1-snapshot /dev/myvg1/mylv1
- umount /mnt/f2fs
The root cause is: it will update discard_max_bytes of mounted lvm
device to zero after creating snapshot on this lvm device, then,
__submit_discard_cmd() will pass parameter @nr_sects w/ zero value
to __blkdev_issue_discard(), it returns a NULL bio pointer, result
in panic.
This patch changes as below for fixing:
1. Let's drop all remained discards in f2fs_unfreeze() if snapshot
of lvm device is created.
2. Checking discard_max_bytes before submitting discard during
__submit_discard_cmd().
Cc: stable(a)vger.kernel.org
Fixes: 35ec7d574884 ("f2fs: split discard command in prior to block layer")
Reported-by: Piergiorgio Sartor <piergiorgio.sartor(a)nexgo.de>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219484
Signed-off-by: Chao Yu <chao(a)kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 4236040e3994..eade36c5ef13 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1290,16 +1290,18 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
wait_list, issued);
return 0;
}
-
- /*
- * Issue discard for conventional zones only if the device
- * supports discard.
- */
- if (!bdev_max_discard_sectors(bdev))
- return -EOPNOTSUPP;
}
#endif
+ /*
+ * stop issuing discard for any of below cases:
+ * 1. device is conventional zone, but it doesn't support discard.
+ * 2. device is regulare device, after snapshot it doesn't support
+ * discard.
+ */
+ if (!bdev_max_discard_sectors(bdev))
+ return -EOPNOTSUPP;
+
trace_f2fs_issue_discard(bdev, dc->di.start, dc->di.len);
lstart = dc->di.lstart;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index c0670cd61956..fc7d463dee15 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1760,6 +1760,18 @@ static int f2fs_freeze(struct super_block *sb)
static int f2fs_unfreeze(struct super_block *sb)
{
+ struct f2fs_sb_info *sbi = F2FS_SB(sb);
+
+ /*
+ * It will update discard_max_bytes of mounted lvm device to zero
+ * after creating snapshot on this lvm device, let's drop all
+ * remained discards.
+ * We don't need to disable real-time discard because discard_max_bytes
+ * will recover after removal of snapshot.
+ */
+ if (test_opt(sbi, DISCARD) && !f2fs_hw_support_discard(sbi))
+ f2fs_issue_discard_timeout(sbi);
+
clear_sbi_flag(F2FS_SB(sb), SBI_IS_FREEZING);
return 0;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x bc8aeb04fd80cb8cfae3058445c84410fd0beb5e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120613-dating-jawless-e589@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bc8aeb04fd80cb8cfae3058445c84410fd0beb5e Mon Sep 17 00:00:00 2001
From: Chao Yu <chao(a)kernel.org>
Date: Thu, 21 Nov 2024 22:17:16 +0800
Subject: [PATCH] f2fs: fix to drop all discards after creating snapshot on lvm
device
Piergiorgio reported a bug in bugzilla as below:
------------[ cut here ]------------
WARNING: CPU: 2 PID: 969 at fs/f2fs/segment.c:1330
RIP: 0010:__submit_discard_cmd+0x27d/0x400 [f2fs]
Call Trace:
__issue_discard_cmd+0x1ca/0x350 [f2fs]
issue_discard_thread+0x191/0x480 [f2fs]
kthread+0xcf/0x100
ret_from_fork+0x31/0x50
ret_from_fork_asm+0x1a/0x30
w/ below testcase, it can reproduce this bug quickly:
- pvcreate /dev/vdb
- vgcreate myvg1 /dev/vdb
- lvcreate -L 1024m -n mylv1 myvg1
- mount /dev/myvg1/mylv1 /mnt/f2fs
- dd if=/dev/zero of=/mnt/f2fs/file bs=1M count=20
- sync
- rm /mnt/f2fs/file
- sync
- lvcreate -L 1024m -s -n mylv1-snapshot /dev/myvg1/mylv1
- umount /mnt/f2fs
The root cause is: it will update discard_max_bytes of mounted lvm
device to zero after creating snapshot on this lvm device, then,
__submit_discard_cmd() will pass parameter @nr_sects w/ zero value
to __blkdev_issue_discard(), it returns a NULL bio pointer, result
in panic.
This patch changes as below for fixing:
1. Let's drop all remained discards in f2fs_unfreeze() if snapshot
of lvm device is created.
2. Checking discard_max_bytes before submitting discard during
__submit_discard_cmd().
Cc: stable(a)vger.kernel.org
Fixes: 35ec7d574884 ("f2fs: split discard command in prior to block layer")
Reported-by: Piergiorgio Sartor <piergiorgio.sartor(a)nexgo.de>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219484
Signed-off-by: Chao Yu <chao(a)kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 4236040e3994..eade36c5ef13 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1290,16 +1290,18 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
wait_list, issued);
return 0;
}
-
- /*
- * Issue discard for conventional zones only if the device
- * supports discard.
- */
- if (!bdev_max_discard_sectors(bdev))
- return -EOPNOTSUPP;
}
#endif
+ /*
+ * stop issuing discard for any of below cases:
+ * 1. device is conventional zone, but it doesn't support discard.
+ * 2. device is regulare device, after snapshot it doesn't support
+ * discard.
+ */
+ if (!bdev_max_discard_sectors(bdev))
+ return -EOPNOTSUPP;
+
trace_f2fs_issue_discard(bdev, dc->di.start, dc->di.len);
lstart = dc->di.lstart;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index c0670cd61956..fc7d463dee15 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1760,6 +1760,18 @@ static int f2fs_freeze(struct super_block *sb)
static int f2fs_unfreeze(struct super_block *sb)
{
+ struct f2fs_sb_info *sbi = F2FS_SB(sb);
+
+ /*
+ * It will update discard_max_bytes of mounted lvm device to zero
+ * after creating snapshot on this lvm device, let's drop all
+ * remained discards.
+ * We don't need to disable real-time discard because discard_max_bytes
+ * will recover after removal of snapshot.
+ */
+ if (test_opt(sbi, DISCARD) && !f2fs_hw_support_discard(sbi))
+ f2fs_issue_discard_timeout(sbi);
+
clear_sbi_flag(F2FS_SB(sb), SBI_IS_FREEZING);
return 0;
}
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x bc8aeb04fd80cb8cfae3058445c84410fd0beb5e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120612-helmet-unwed-8cfe@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bc8aeb04fd80cb8cfae3058445c84410fd0beb5e Mon Sep 17 00:00:00 2001
From: Chao Yu <chao(a)kernel.org>
Date: Thu, 21 Nov 2024 22:17:16 +0800
Subject: [PATCH] f2fs: fix to drop all discards after creating snapshot on lvm
device
Piergiorgio reported a bug in bugzilla as below:
------------[ cut here ]------------
WARNING: CPU: 2 PID: 969 at fs/f2fs/segment.c:1330
RIP: 0010:__submit_discard_cmd+0x27d/0x400 [f2fs]
Call Trace:
__issue_discard_cmd+0x1ca/0x350 [f2fs]
issue_discard_thread+0x191/0x480 [f2fs]
kthread+0xcf/0x100
ret_from_fork+0x31/0x50
ret_from_fork_asm+0x1a/0x30
w/ below testcase, it can reproduce this bug quickly:
- pvcreate /dev/vdb
- vgcreate myvg1 /dev/vdb
- lvcreate -L 1024m -n mylv1 myvg1
- mount /dev/myvg1/mylv1 /mnt/f2fs
- dd if=/dev/zero of=/mnt/f2fs/file bs=1M count=20
- sync
- rm /mnt/f2fs/file
- sync
- lvcreate -L 1024m -s -n mylv1-snapshot /dev/myvg1/mylv1
- umount /mnt/f2fs
The root cause is: it will update discard_max_bytes of mounted lvm
device to zero after creating snapshot on this lvm device, then,
__submit_discard_cmd() will pass parameter @nr_sects w/ zero value
to __blkdev_issue_discard(), it returns a NULL bio pointer, result
in panic.
This patch changes as below for fixing:
1. Let's drop all remained discards in f2fs_unfreeze() if snapshot
of lvm device is created.
2. Checking discard_max_bytes before submitting discard during
__submit_discard_cmd().
Cc: stable(a)vger.kernel.org
Fixes: 35ec7d574884 ("f2fs: split discard command in prior to block layer")
Reported-by: Piergiorgio Sartor <piergiorgio.sartor(a)nexgo.de>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219484
Signed-off-by: Chao Yu <chao(a)kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk(a)kernel.org>
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 4236040e3994..eade36c5ef13 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1290,16 +1290,18 @@ static int __submit_discard_cmd(struct f2fs_sb_info *sbi,
wait_list, issued);
return 0;
}
-
- /*
- * Issue discard for conventional zones only if the device
- * supports discard.
- */
- if (!bdev_max_discard_sectors(bdev))
- return -EOPNOTSUPP;
}
#endif
+ /*
+ * stop issuing discard for any of below cases:
+ * 1. device is conventional zone, but it doesn't support discard.
+ * 2. device is regulare device, after snapshot it doesn't support
+ * discard.
+ */
+ if (!bdev_max_discard_sectors(bdev))
+ return -EOPNOTSUPP;
+
trace_f2fs_issue_discard(bdev, dc->di.start, dc->di.len);
lstart = dc->di.lstart;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index c0670cd61956..fc7d463dee15 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1760,6 +1760,18 @@ static int f2fs_freeze(struct super_block *sb)
static int f2fs_unfreeze(struct super_block *sb)
{
+ struct f2fs_sb_info *sbi = F2FS_SB(sb);
+
+ /*
+ * It will update discard_max_bytes of mounted lvm device to zero
+ * after creating snapshot on this lvm device, let's drop all
+ * remained discards.
+ * We don't need to disable real-time discard because discard_max_bytes
+ * will recover after removal of snapshot.
+ */
+ if (test_opt(sbi, DISCARD) && !f2fs_hw_support_discard(sbi))
+ f2fs_issue_discard_timeout(sbi);
+
clear_sbi_flag(F2FS_SB(sb), SBI_IS_FREEZING);
return 0;
}
Hi stable tree maintainers,
Please revert the backports of
44c76825d6ee ("x86: Increase brk randomness entropy for 64-bit systems")
namely:
5.4: 03475167fda50b8511ef620a27409b08365882e1
5.10: 25d31baf922c1ee987efd6fcc9c7d4ab539c66b4
5.15: 06cb3463aa58906cfff72877eb7f50cb26e9ca93
6.1: b0cde867b80a5e81fcbc0383e138f5845f2005ee
6.6: 1a45994fb218d93dec48a3a86f68283db61e0936
There seems to be a bad interaction between this change and older
PIE-built qemu-user-static (for aarch64) binaries[1]. Investigation
continues to see if this will need to be reverted from 6.6, 6.11,
and mainline. But for now, it's clearly a problem for older kernels with
older qemu.
Thanks!
-Kees
[1] https://lore.kernel.org/all/202411201000.F3313C02@keescook/
--
Kees Cook
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x bac3b10b78e54b7da3cede397258f75a2180609b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120659-hasty-crunching-8425@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bac3b10b78e54b7da3cede397258f75a2180609b Mon Sep 17 00:00:00 2001
From: Saravana Kannan <saravanak(a)google.com>
Date: Wed, 30 Oct 2024 10:10:07 -0700
Subject: [PATCH] driver core: fw_devlink: Stop trying to optimize cycle
detection logic
In attempting to optimize fw_devlink runtime, I introduced numerous cycle
detection bugs by foregoing cycle detection logic under specific
conditions. Each fix has further narrowed the conditions for optimization.
It's time to give up on these optimization attempts and just run the cycle
detection logic every time fw_devlink tries to create a device link.
The specific bug report that triggered this fix involved a supplier fwnode
that never gets a device created for it. Instead, the supplier fwnode is
represented by the device that corresponds to an ancestor fwnode.
In this case, fw_devlink didn't do any cycle detection because the cycle
detection logic is only run when a device link is created between the
devices that correspond to the actual consumer and supplier fwnodes.
With this change, fw_devlink will run cycle detection logic even when
creating SYNC_STATE_ONLY proxy device links from a device that is an
ancestor of a consumer fwnode.
Reported-by: Tomi Valkeinen <tomi.valkeinen(a)ideasonboard.com>
Closes: https://lore.kernel.org/all/1a1ab663-d068-40fb-8c94-f0715403d276@ideasonboa…
Fixes: 6442d79d880c ("driver core: fw_devlink: Improve detection of overlapping cycles")
Cc: stable <stable(a)kernel.org>
Tested-by: Tomi Valkeinen <tomi.valkeinen(a)ideasonboard.com>
Signed-off-by: Saravana Kannan <saravanak(a)google.com>
Link: https://lore.kernel.org/r/20241030171009.1853340-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 54b25570a492..633fb4283282 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1989,10 +1989,10 @@ static struct device *fwnode_get_next_parent_dev(const struct fwnode_handle *fwn
*
* Return true if one or more cycles were found. Otherwise, return false.
*/
-static bool __fw_devlink_relax_cycles(struct device *con,
+static bool __fw_devlink_relax_cycles(struct fwnode_handle *con_handle,
struct fwnode_handle *sup_handle)
{
- struct device *sup_dev = NULL, *par_dev = NULL;
+ struct device *sup_dev = NULL, *par_dev = NULL, *con_dev = NULL;
struct fwnode_link *link;
struct device_link *dev_link;
bool ret = false;
@@ -2009,22 +2009,22 @@ static bool __fw_devlink_relax_cycles(struct device *con,
sup_handle->flags |= FWNODE_FLAG_VISITED;
- sup_dev = get_dev_from_fwnode(sup_handle);
-
/* Termination condition. */
- if (sup_dev == con) {
+ if (sup_handle == con_handle) {
pr_debug("----- cycle: start -----\n");
ret = true;
goto out;
}
+ sup_dev = get_dev_from_fwnode(sup_handle);
+ con_dev = get_dev_from_fwnode(con_handle);
/*
* If sup_dev is bound to a driver and @con hasn't started binding to a
* driver, sup_dev can't be a consumer of @con. So, no need to check
* further.
*/
if (sup_dev && sup_dev->links.status == DL_DEV_DRIVER_BOUND &&
- con->links.status == DL_DEV_NO_DRIVER) {
+ con_dev && con_dev->links.status == DL_DEV_NO_DRIVER) {
ret = false;
goto out;
}
@@ -2033,7 +2033,7 @@ static bool __fw_devlink_relax_cycles(struct device *con,
if (link->flags & FWLINK_FLAG_IGNORE)
continue;
- if (__fw_devlink_relax_cycles(con, link->supplier)) {
+ if (__fw_devlink_relax_cycles(con_handle, link->supplier)) {
__fwnode_link_cycle(link);
ret = true;
}
@@ -2048,7 +2048,7 @@ static bool __fw_devlink_relax_cycles(struct device *con,
else
par_dev = fwnode_get_next_parent_dev(sup_handle);
- if (par_dev && __fw_devlink_relax_cycles(con, par_dev->fwnode)) {
+ if (par_dev && __fw_devlink_relax_cycles(con_handle, par_dev->fwnode)) {
pr_debug("%pfwf: cycle: child of %pfwf\n", sup_handle,
par_dev->fwnode);
ret = true;
@@ -2066,7 +2066,7 @@ static bool __fw_devlink_relax_cycles(struct device *con,
!(dev_link->flags & DL_FLAG_CYCLE))
continue;
- if (__fw_devlink_relax_cycles(con,
+ if (__fw_devlink_relax_cycles(con_handle,
dev_link->supplier->fwnode)) {
pr_debug("%pfwf: cycle: depends on %pfwf\n", sup_handle,
dev_link->supplier->fwnode);
@@ -2114,11 +2114,6 @@ static int fw_devlink_create_devlink(struct device *con,
if (link->flags & FWLINK_FLAG_IGNORE)
return 0;
- if (con->fwnode == link->consumer)
- flags = fw_devlink_get_flags(link->flags);
- else
- flags = FW_DEVLINK_FLAGS_PERMISSIVE;
-
/*
* In some cases, a device P might also be a supplier to its child node
* C. However, this would defer the probe of C until the probe of P
@@ -2139,25 +2134,23 @@ static int fw_devlink_create_devlink(struct device *con,
return -EINVAL;
/*
- * SYNC_STATE_ONLY device links don't block probing and supports cycles.
- * So, one might expect that cycle detection isn't necessary for them.
- * However, if the device link was marked as SYNC_STATE_ONLY because
- * it's part of a cycle, then we still need to do cycle detection. This
- * is because the consumer and supplier might be part of multiple cycles
- * and we need to detect all those cycles.
+ * Don't try to optimize by not calling the cycle detection logic under
+ * certain conditions. There's always some corner case that won't get
+ * detected.
*/
- if (!device_link_flag_is_sync_state_only(flags) ||
- flags & DL_FLAG_CYCLE) {
- device_links_write_lock();
- if (__fw_devlink_relax_cycles(con, sup_handle)) {
- __fwnode_link_cycle(link);
- flags = fw_devlink_get_flags(link->flags);
- pr_debug("----- cycle: end -----\n");
- dev_info(con, "Fixed dependency cycle(s) with %pfwf\n",
- sup_handle);
- }
- device_links_write_unlock();
+ device_links_write_lock();
+ if (__fw_devlink_relax_cycles(link->consumer, sup_handle)) {
+ __fwnode_link_cycle(link);
+ pr_debug("----- cycle: end -----\n");
+ pr_info("%pfwf: Fixed dependency cycle(s) with %pfwf\n",
+ link->consumer, sup_handle);
}
+ device_links_write_unlock();
+
+ if (con->fwnode == link->consumer)
+ flags = fw_devlink_get_flags(link->flags);
+ else
+ flags = FW_DEVLINK_FLAGS_PERMISSIVE;
if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
sup_dev = fwnode_get_next_parent_dev(sup_handle);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x bac3b10b78e54b7da3cede397258f75a2180609b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024120658-pesky-tactile-da5a@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From bac3b10b78e54b7da3cede397258f75a2180609b Mon Sep 17 00:00:00 2001
From: Saravana Kannan <saravanak(a)google.com>
Date: Wed, 30 Oct 2024 10:10:07 -0700
Subject: [PATCH] driver core: fw_devlink: Stop trying to optimize cycle
detection logic
In attempting to optimize fw_devlink runtime, I introduced numerous cycle
detection bugs by foregoing cycle detection logic under specific
conditions. Each fix has further narrowed the conditions for optimization.
It's time to give up on these optimization attempts and just run the cycle
detection logic every time fw_devlink tries to create a device link.
The specific bug report that triggered this fix involved a supplier fwnode
that never gets a device created for it. Instead, the supplier fwnode is
represented by the device that corresponds to an ancestor fwnode.
In this case, fw_devlink didn't do any cycle detection because the cycle
detection logic is only run when a device link is created between the
devices that correspond to the actual consumer and supplier fwnodes.
With this change, fw_devlink will run cycle detection logic even when
creating SYNC_STATE_ONLY proxy device links from a device that is an
ancestor of a consumer fwnode.
Reported-by: Tomi Valkeinen <tomi.valkeinen(a)ideasonboard.com>
Closes: https://lore.kernel.org/all/1a1ab663-d068-40fb-8c94-f0715403d276@ideasonboa…
Fixes: 6442d79d880c ("driver core: fw_devlink: Improve detection of overlapping cycles")
Cc: stable <stable(a)kernel.org>
Tested-by: Tomi Valkeinen <tomi.valkeinen(a)ideasonboard.com>
Signed-off-by: Saravana Kannan <saravanak(a)google.com>
Link: https://lore.kernel.org/r/20241030171009.1853340-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 54b25570a492..633fb4283282 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1989,10 +1989,10 @@ static struct device *fwnode_get_next_parent_dev(const struct fwnode_handle *fwn
*
* Return true if one or more cycles were found. Otherwise, return false.
*/
-static bool __fw_devlink_relax_cycles(struct device *con,
+static bool __fw_devlink_relax_cycles(struct fwnode_handle *con_handle,
struct fwnode_handle *sup_handle)
{
- struct device *sup_dev = NULL, *par_dev = NULL;
+ struct device *sup_dev = NULL, *par_dev = NULL, *con_dev = NULL;
struct fwnode_link *link;
struct device_link *dev_link;
bool ret = false;
@@ -2009,22 +2009,22 @@ static bool __fw_devlink_relax_cycles(struct device *con,
sup_handle->flags |= FWNODE_FLAG_VISITED;
- sup_dev = get_dev_from_fwnode(sup_handle);
-
/* Termination condition. */
- if (sup_dev == con) {
+ if (sup_handle == con_handle) {
pr_debug("----- cycle: start -----\n");
ret = true;
goto out;
}
+ sup_dev = get_dev_from_fwnode(sup_handle);
+ con_dev = get_dev_from_fwnode(con_handle);
/*
* If sup_dev is bound to a driver and @con hasn't started binding to a
* driver, sup_dev can't be a consumer of @con. So, no need to check
* further.
*/
if (sup_dev && sup_dev->links.status == DL_DEV_DRIVER_BOUND &&
- con->links.status == DL_DEV_NO_DRIVER) {
+ con_dev && con_dev->links.status == DL_DEV_NO_DRIVER) {
ret = false;
goto out;
}
@@ -2033,7 +2033,7 @@ static bool __fw_devlink_relax_cycles(struct device *con,
if (link->flags & FWLINK_FLAG_IGNORE)
continue;
- if (__fw_devlink_relax_cycles(con, link->supplier)) {
+ if (__fw_devlink_relax_cycles(con_handle, link->supplier)) {
__fwnode_link_cycle(link);
ret = true;
}
@@ -2048,7 +2048,7 @@ static bool __fw_devlink_relax_cycles(struct device *con,
else
par_dev = fwnode_get_next_parent_dev(sup_handle);
- if (par_dev && __fw_devlink_relax_cycles(con, par_dev->fwnode)) {
+ if (par_dev && __fw_devlink_relax_cycles(con_handle, par_dev->fwnode)) {
pr_debug("%pfwf: cycle: child of %pfwf\n", sup_handle,
par_dev->fwnode);
ret = true;
@@ -2066,7 +2066,7 @@ static bool __fw_devlink_relax_cycles(struct device *con,
!(dev_link->flags & DL_FLAG_CYCLE))
continue;
- if (__fw_devlink_relax_cycles(con,
+ if (__fw_devlink_relax_cycles(con_handle,
dev_link->supplier->fwnode)) {
pr_debug("%pfwf: cycle: depends on %pfwf\n", sup_handle,
dev_link->supplier->fwnode);
@@ -2114,11 +2114,6 @@ static int fw_devlink_create_devlink(struct device *con,
if (link->flags & FWLINK_FLAG_IGNORE)
return 0;
- if (con->fwnode == link->consumer)
- flags = fw_devlink_get_flags(link->flags);
- else
- flags = FW_DEVLINK_FLAGS_PERMISSIVE;
-
/*
* In some cases, a device P might also be a supplier to its child node
* C. However, this would defer the probe of C until the probe of P
@@ -2139,25 +2134,23 @@ static int fw_devlink_create_devlink(struct device *con,
return -EINVAL;
/*
- * SYNC_STATE_ONLY device links don't block probing and supports cycles.
- * So, one might expect that cycle detection isn't necessary for them.
- * However, if the device link was marked as SYNC_STATE_ONLY because
- * it's part of a cycle, then we still need to do cycle detection. This
- * is because the consumer and supplier might be part of multiple cycles
- * and we need to detect all those cycles.
+ * Don't try to optimize by not calling the cycle detection logic under
+ * certain conditions. There's always some corner case that won't get
+ * detected.
*/
- if (!device_link_flag_is_sync_state_only(flags) ||
- flags & DL_FLAG_CYCLE) {
- device_links_write_lock();
- if (__fw_devlink_relax_cycles(con, sup_handle)) {
- __fwnode_link_cycle(link);
- flags = fw_devlink_get_flags(link->flags);
- pr_debug("----- cycle: end -----\n");
- dev_info(con, "Fixed dependency cycle(s) with %pfwf\n",
- sup_handle);
- }
- device_links_write_unlock();
+ device_links_write_lock();
+ if (__fw_devlink_relax_cycles(link->consumer, sup_handle)) {
+ __fwnode_link_cycle(link);
+ pr_debug("----- cycle: end -----\n");
+ pr_info("%pfwf: Fixed dependency cycle(s) with %pfwf\n",
+ link->consumer, sup_handle);
}
+ device_links_write_unlock();
+
+ if (con->fwnode == link->consumer)
+ flags = fw_devlink_get_flags(link->flags);
+ else
+ flags = FW_DEVLINK_FLAGS_PERMISSIVE;
if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
sup_dev = fwnode_get_next_parent_dev(sup_handle);