The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 51888393cc64dd0462d0b96c13ab94873abbc030
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025082128-casually-sensuous-5677@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 51888393cc64dd0462d0b96c13ab94873abbc030 Mon Sep 17 00:00:00 2001
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Wed, 9 Jul 2025 12:41:45 +0200
Subject: [PATCH] PM: runtime: Take active children into account in
pm_runtime_get_if_in_use()
For all practical purposes, there is no difference between the situation
in which a given device is not ignoring children and its active child
count is nonzero and the situation in which its runtime PM usage counter
is nonzero. However, pm_runtime_get_if_in_use() will only increment the
device's usage counter and return 1 in the latter case.
For consistency, make it do so in the former case either by adjusting
pm_runtime_get_conditional() and update the related kerneldoc comments
accordingly.
Fixes: c111566bea7c ("PM: runtime: Add pm_runtime_get_if_active()")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson(a)linaro.org>
Reviewed-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Cc: 5.10+ <stable(a)vger.kernel.org> # 5.10+: c0ef3df8dbae: PM: runtime: Simplify pm_runtime_get_if_active() usage
Cc: 5.10+ <stable(a)vger.kernel.org> # 5.10+
Link: https://patch.msgid.link/12700973.O9o76ZdvQC@rjwysocki.net
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index c55a7c70bc1a..2ba0dfd1de5a 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1191,10 +1191,12 @@ EXPORT_SYMBOL_GPL(__pm_runtime_resume);
*
* Return -EINVAL if runtime PM is disabled for @dev.
*
- * Otherwise, if the runtime PM status of @dev is %RPM_ACTIVE and either
- * @ign_usage_count is %true or the runtime PM usage counter of @dev is not
- * zero, increment the usage counter of @dev and return 1. Otherwise, return 0
- * without changing the usage counter.
+ * Otherwise, if its runtime PM status is %RPM_ACTIVE and (1) @ign_usage_count
+ * is set, or (2) @dev is not ignoring children and its active child count is
+ * nonero, or (3) the runtime PM usage counter of @dev is not zero, increment
+ * the usage counter of @dev and return 1.
+ *
+ * Otherwise, return 0 without changing the usage counter.
*
* If @ign_usage_count is %true, this function can be used to prevent suspending
* the device when its runtime PM status is %RPM_ACTIVE.
@@ -1216,7 +1218,8 @@ static int pm_runtime_get_conditional(struct device *dev, bool ign_usage_count)
retval = -EINVAL;
} else if (dev->power.runtime_status != RPM_ACTIVE) {
retval = 0;
- } else if (ign_usage_count) {
+ } else if (ign_usage_count || (!dev->power.ignore_children &&
+ atomic_read(&dev->power.child_count) > 0)) {
retval = 1;
atomic_inc(&dev->power.usage_count);
} else {
@@ -1249,10 +1252,16 @@ EXPORT_SYMBOL_GPL(pm_runtime_get_if_active);
* @dev: Target device.
*
* Increment the runtime PM usage counter of @dev if its runtime PM status is
- * %RPM_ACTIVE and its runtime PM usage counter is greater than 0, in which case
- * it returns 1. If the device is in a different state or its usage_count is 0,
- * 0 is returned. -EINVAL is returned if runtime PM is disabled for the device,
- * in which case also the usage_count will remain unmodified.
+ * %RPM_ACTIVE and its runtime PM usage counter is greater than 0 or it is not
+ * ignoring children and its active child count is nonzero. 1 is returned in
+ * this case.
+ *
+ * If @dev is in a different state or it is not in use (that is, its usage
+ * counter is 0, or it is ignoring children, or its active child count is 0),
+ * 0 is returned.
+ *
+ * -EINVAL is returned if runtime PM is disabled for the device, in which case
+ * also the usage counter of @dev is not updated.
*/
int pm_runtime_get_if_in_use(struct device *dev)
{
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 086a0e516f7b3844e6328a5c69e2708b66b0ce18
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025082120-canteen-quickly-68c4@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 086a0e516f7b3844e6328a5c69e2708b66b0ce18 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Thu, 24 Jul 2025 11:19:06 +0200
Subject: [PATCH] usb: dwc3: imx8mp: fix device leak at unbind
Make sure to drop the reference to the dwc3 device taken by
of_find_device_by_node() on probe errors and on driver unbind.
Fixes: 6dd2565989b4 ("usb: dwc3: add imx8mp dwc3 glue layer driver")
Cc: stable(a)vger.kernel.org # 5.12
Cc: Li Jun <jun.li(a)nxp.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Link: https://lore.kernel.org/r/20250724091910.21092-2-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/dwc3/dwc3-imx8mp.c b/drivers/usb/dwc3/dwc3-imx8mp.c
index 3edc5aca76f9..bce6af82f54c 100644
--- a/drivers/usb/dwc3/dwc3-imx8mp.c
+++ b/drivers/usb/dwc3/dwc3-imx8mp.c
@@ -244,7 +244,7 @@ static int dwc3_imx8mp_probe(struct platform_device *pdev)
IRQF_ONESHOT, dev_name(dev), dwc3_imx);
if (err) {
dev_err(dev, "failed to request IRQ #%d --> %d\n", irq, err);
- goto depopulate;
+ goto put_dwc3;
}
device_set_wakeup_capable(dev, true);
@@ -252,6 +252,8 @@ static int dwc3_imx8mp_probe(struct platform_device *pdev)
return 0;
+put_dwc3:
+ put_device(&dwc3_imx->dwc3->dev);
depopulate:
of_platform_depopulate(dev);
remove_swnode:
@@ -265,8 +267,11 @@ static int dwc3_imx8mp_probe(struct platform_device *pdev)
static void dwc3_imx8mp_remove(struct platform_device *pdev)
{
+ struct dwc3_imx8mp *dwc3_imx = platform_get_drvdata(pdev);
struct device *dev = &pdev->dev;
+ put_device(&dwc3_imx->dwc3->dev);
+
pm_runtime_get_sync(dev);
of_platform_depopulate(dev);
device_remove_software_node(dev);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x ed62a62a18bc144f73eadf866ae46842e8f6606e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025082137-grandly-daytime-751f@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ed62a62a18bc144f73eadf866ae46842e8f6606e Mon Sep 17 00:00:00 2001
From: Damien Le Moal <dlemoal(a)kernel.org>
Date: Wed, 18 Jun 2025 16:25:19 +0900
Subject: [PATCH] ata: Fix SATA_MOBILE_LPM_POLICY description in Kconfig
Improve the description of the possible default SATA link power
management policies and add the missing description for policy 5.
No functional changes.
Fixes: a5ec5a7bfd1f ("ata: ahci: Support state with min power but Partial low power state")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Niklas Cassel <cassel(a)kernel.org>
diff --git a/drivers/ata/Kconfig b/drivers/ata/Kconfig
index e00536b49552..120a2b7067fc 100644
--- a/drivers/ata/Kconfig
+++ b/drivers/ata/Kconfig
@@ -117,23 +117,39 @@ config SATA_AHCI
config SATA_MOBILE_LPM_POLICY
int "Default SATA Link Power Management policy"
- range 0 4
+ range 0 5
default 3
depends on SATA_AHCI
help
Select the Default SATA Link Power Management (LPM) policy to use
for chipsets / "South Bridges" supporting low-power modes. Such
chipsets are ubiquitous across laptops, desktops and servers.
+ Each policy combines power saving states and features:
+ - Partial: The Phy logic is powered but is in a reduced power
+ state. The exit latency from this state is no longer than
+ 10us).
+ - Slumber: The Phy logic is powered but is in an even lower power
+ state. The exit latency from this state is potentially
+ longer, but no longer than 10ms.
+ - DevSleep: The Phy logic may be powered down. The exit latency from
+ this state is no longer than 20 ms, unless otherwise
+ specified by DETO in the device Identify Device Data log.
+ - HIPM: Host Initiated Power Management (host automatically
+ transitions to partial and slumber).
+ - DIPM: Device Initiated Power Management (device automatically
+ transitions to partial and slumber).
- The value set has the following meanings:
+ The possible values for the default SATA link power management
+ policies are:
0 => Keep firmware settings
- 1 => Maximum performance
- 2 => Medium power
- 3 => Medium power with Device Initiated PM enabled
- 4 => Minimum power
+ 1 => No power savings (maximum performance)
+ 2 => HIPM (Partial)
+ 3 => HIPM (Partial) and DIPM (Partial and Slumber)
+ 4 => HIPM (Partial and DevSleep) and DIPM (Partial and Slumber)
+ 5 => HIPM (Slumber and DevSleep) and DIPM (Partial and Slumber)
- Note "Minimum power" is known to cause issues, including disk
- corruption, with some disks and should not be used.
+ Excluding the value 0, higher values represent policies with higher
+ power savings.
config SATA_AHCI_PLATFORM
tristate "Platform AHCI SATA support"
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 1473e9e7679bd4f5a62d1abccae894fb86de280f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025082155-sympathy-finally-e1ad@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1473e9e7679bd4f5a62d1abccae894fb86de280f Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Thu, 24 Jul 2025 11:19:09 +0200
Subject: [PATCH] usb: musb: omap2430: fix device leak at unbind
Make sure to drop the reference to the control device taken by
of_find_device_by_node() during probe when the driver is unbound.
Fixes: 8934d3e4d0e7 ("usb: musb: omap2430: Don't use omap_get_control_dev()")
Cc: stable(a)vger.kernel.org # 3.13
Cc: Roger Quadros <rogerq(a)kernel.org>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Link: https://lore.kernel.org/r/20250724091910.21092-5-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/musb/omap2430.c b/drivers/usb/musb/omap2430.c
index 2970967a4fd2..36f756f9b7f6 100644
--- a/drivers/usb/musb/omap2430.c
+++ b/drivers/usb/musb/omap2430.c
@@ -400,7 +400,7 @@ static int omap2430_probe(struct platform_device *pdev)
ret = platform_device_add_resources(musb, pdev->resource, pdev->num_resources);
if (ret) {
dev_err(&pdev->dev, "failed to add resources\n");
- goto err2;
+ goto err_put_control_otghs;
}
if (populate_irqs) {
@@ -413,7 +413,7 @@ static int omap2430_probe(struct platform_device *pdev)
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!res) {
ret = -EINVAL;
- goto err2;
+ goto err_put_control_otghs;
}
musb_res[i].start = res->start;
@@ -441,14 +441,14 @@ static int omap2430_probe(struct platform_device *pdev)
ret = platform_device_add_resources(musb, musb_res, i);
if (ret) {
dev_err(&pdev->dev, "failed to add IRQ resources\n");
- goto err2;
+ goto err_put_control_otghs;
}
}
ret = platform_device_add_data(musb, pdata, sizeof(*pdata));
if (ret) {
dev_err(&pdev->dev, "failed to add platform_data\n");
- goto err2;
+ goto err_put_control_otghs;
}
pm_runtime_enable(glue->dev);
@@ -463,7 +463,9 @@ static int omap2430_probe(struct platform_device *pdev)
err3:
pm_runtime_disable(glue->dev);
-
+err_put_control_otghs:
+ if (!IS_ERR(glue->control_otghs))
+ put_device(glue->control_otghs);
err2:
platform_device_put(musb);
@@ -477,6 +479,8 @@ static void omap2430_remove(struct platform_device *pdev)
platform_device_unregister(glue->musb);
pm_runtime_disable(glue->dev);
+ if (!IS_ERR(glue->control_otghs))
+ put_device(glue->control_otghs);
}
#ifdef CONFIG_PM
Kernel initialize "jiffies" timer as 5 minutes below zero, as shown in
include/linux/jiffies.h
/*
* Have the 32 bit jiffies value wrap 5 minutes after boot
* so jiffies wrap bugs show up earlier.
*/
#define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))
And they cast unsigned value to signed to cover wraparound
#define time_after_eq(a,b) \
(typecheck(unsigned long, a) && \
typecheck(unsigned long, b) && \
((long)((a) - (b)) >= 0))
In 64bit system, these might not be a problem because wrapround occurs
300 million years after the boot, assuming HZ value is 1000.
With same assuming, In 32bit system, wraparound occurs 5 minutues after
the initial boot and every 49 days after the first wraparound. And about
25 days after first wraparound, it continues quota charging window up to
next 25 days.
Example 1: initial boot
jiffies=0xFFFB6C20, charged_from+interval=0x000003E8
time_after_eq(jiffies, charged_from+interval)=(long)0xFFFB6838; In
signed values, it is considered negative so it is false.
Example 2: after about 25 days first wraparound
jiffies=0x800004E8, charged_from+interval=0x000003E8
time_after_eq(jiffies, charged_from+interval)=(long)0x80000100; In
signed values, it is considered negative so it is false
So, change quota->charged_from to jiffies at damos_adjust_quota() when
it is consider first charge window.
In theory; but almost impossible; quota->total_charged_sz and
qutoa->charged_from should be both zero even if it is not in first
charge window. But It will only delay one reset_interval, So it is not
big problem.
Fixes: 2b8a248d5873 ("mm/damon/schemes: implement size quota for schemes application speed control") # 5.16
Cc: stable(a)vger.kernel.org
Signed-off-by: Sang-Heon Jeon <ekffu200098(a)gmail.com>
---
Changes from v1 [1]
- not change current default value of quota->charged_from
- set quota->charged_from when it is consider first charge below
- add more description of jiffies and wraparound example to commit
messages
SeongJae, please re-check Fixes commit is valid. Thank you.
[1] https://lore.kernel.org/damon/20250818183803.1450539-1-ekffu200098@gmail.co…
---
mm/damon/core.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/mm/damon/core.c b/mm/damon/core.c
index cb41fddca78c..93bad6d0da5b 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -2130,6 +2130,10 @@ static void damos_adjust_quota(struct damon_ctx *c, struct damos *s)
if (!quota->ms && !quota->sz && list_empty("a->goals))
return;
+ /* First charge window */
+ if (!quota->total_charged_sz && !quota->charged_from)
+ quota->charged_from = jiffies;
+
/* New charge window starts */
if (time_after_eq(jiffies, quota->charged_from +
msecs_to_jiffies(quota->reset_interval))) {
--
2.43.0
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x e2374953461947eee49f69b3e3204ff080ef31b1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025082112-segment-delta-e613@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e2374953461947eee49f69b3e3204ff080ef31b1 Mon Sep 17 00:00:00 2001
From: Tzung-Bi Shih <tzungbi(a)kernel.org>
Date: Tue, 22 Jul 2025 12:05:13 +0000
Subject: [PATCH] platform/chrome: cros_ec: Unregister notifier in
cros_ec_unregister()
The blocking notifier is registered in cros_ec_register(); however, it
isn't unregistered in cros_ec_unregister().
Fix it.
Fixes: 42cd0ab476e2 ("platform/chrome: cros_ec: Query EC protocol version if EC transitions between RO/RW")
Cc: stable(a)vger.kernel.org
Reviewed-by: Benson Leung <bleung(a)chromium.org>
Link: https://lore.kernel.org/r/20250722120513.234031-1-tzungbi@kernel.org
Signed-off-by: Tzung-Bi Shih <tzungbi(a)kernel.org>
diff --git a/drivers/platform/chrome/cros_ec.c b/drivers/platform/chrome/cros_ec.c
index 110771a8645e..fd58781a2fb7 100644
--- a/drivers/platform/chrome/cros_ec.c
+++ b/drivers/platform/chrome/cros_ec.c
@@ -318,6 +318,9 @@ EXPORT_SYMBOL(cros_ec_register);
*/
void cros_ec_unregister(struct cros_ec_device *ec_dev)
{
+ if (ec_dev->mkbp_event_supported)
+ blocking_notifier_chain_unregister(&ec_dev->event_notifier,
+ &ec_dev->notifier_ready);
platform_device_unregister(ec_dev->pd);
platform_device_unregister(ec_dev->ec);
mutex_destroy(&ec_dev->lock);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 1473e9e7679bd4f5a62d1abccae894fb86de280f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025082155-easing-flavorful-b21d@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1473e9e7679bd4f5a62d1abccae894fb86de280f Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Thu, 24 Jul 2025 11:19:09 +0200
Subject: [PATCH] usb: musb: omap2430: fix device leak at unbind
Make sure to drop the reference to the control device taken by
of_find_device_by_node() during probe when the driver is unbound.
Fixes: 8934d3e4d0e7 ("usb: musb: omap2430: Don't use omap_get_control_dev()")
Cc: stable(a)vger.kernel.org # 3.13
Cc: Roger Quadros <rogerq(a)kernel.org>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Link: https://lore.kernel.org/r/20250724091910.21092-5-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/musb/omap2430.c b/drivers/usb/musb/omap2430.c
index 2970967a4fd2..36f756f9b7f6 100644
--- a/drivers/usb/musb/omap2430.c
+++ b/drivers/usb/musb/omap2430.c
@@ -400,7 +400,7 @@ static int omap2430_probe(struct platform_device *pdev)
ret = platform_device_add_resources(musb, pdev->resource, pdev->num_resources);
if (ret) {
dev_err(&pdev->dev, "failed to add resources\n");
- goto err2;
+ goto err_put_control_otghs;
}
if (populate_irqs) {
@@ -413,7 +413,7 @@ static int omap2430_probe(struct platform_device *pdev)
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!res) {
ret = -EINVAL;
- goto err2;
+ goto err_put_control_otghs;
}
musb_res[i].start = res->start;
@@ -441,14 +441,14 @@ static int omap2430_probe(struct platform_device *pdev)
ret = platform_device_add_resources(musb, musb_res, i);
if (ret) {
dev_err(&pdev->dev, "failed to add IRQ resources\n");
- goto err2;
+ goto err_put_control_otghs;
}
}
ret = platform_device_add_data(musb, pdata, sizeof(*pdata));
if (ret) {
dev_err(&pdev->dev, "failed to add platform_data\n");
- goto err2;
+ goto err_put_control_otghs;
}
pm_runtime_enable(glue->dev);
@@ -463,7 +463,9 @@ static int omap2430_probe(struct platform_device *pdev)
err3:
pm_runtime_disable(glue->dev);
-
+err_put_control_otghs:
+ if (!IS_ERR(glue->control_otghs))
+ put_device(glue->control_otghs);
err2:
platform_device_put(musb);
@@ -477,6 +479,8 @@ static void omap2430_remove(struct platform_device *pdev)
platform_device_unregister(glue->musb);
pm_runtime_disable(glue->dev);
+ if (!IS_ERR(glue->control_otghs))
+ put_device(glue->control_otghs);
}
#ifdef CONFIG_PM
commit f0c6eab5e45c529f449fbc595873719e00de6d79 upstream.
A BPF scheduler may want to use the built-in idle cpumasks in ops.init()
before the scheduler is fully initialized, either directly or through a
BPF timer for example.
However, this would result in an error, since the idle state has not
been properly initialized yet.
This can be easily verified by modifying scx_simple to call
scx_bpf_get_idle_cpumask() in ops.init():
$ sudo scx_simple
DEBUG DUMP
===========================================================================
scx_simple[121] triggered exit kind 1024:
runtime error (built-in idle tracking is disabled)
...
Fix this by properly initializing the idle state before ops.init() is
called. With this change applied:
$ sudo scx_simple
local=2 global=0
local=19 global=11
local=23 global=11
...
Fixes: d73249f88743d ("sched_ext: idle: Make idle static keys private")
Signed-off-by: Andrea Righi <arighi(a)nvidia.com>
Reviewed-by: Joel Fernandes <joelagnelf(a)nvidia.com>
Signed-off-by: Tejun Heo <tj(a)kernel.org>
[ Backport to 6.12:
- Original commit doesn't apply cleanly to 6.12 since d73249f88743d is
not present.
- This backport applies the same logical fix to prevent BPF scheduler
failures while accessing idle cpumasks from ops.init(). ]
Signed-off-by: Andrea Righi <arighi(a)nvidia.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Andrea Righi <arighi(a)nvidia.com>
---
kernel/sched/ext.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index c801dd20c63d9..7eae1c64f7348 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -5220,6 +5220,13 @@ static int scx_ops_enable(struct sched_ext_ops *ops, struct bpf_link *link)
for_each_possible_cpu(cpu)
cpu_rq(cpu)->scx.cpuperf_target = SCX_CPUPERF_ONE;
+ if (!ops->update_idle || (ops->flags & SCX_OPS_KEEP_BUILTIN_IDLE)) {
+ reset_idle_masks();
+ static_branch_enable(&scx_builtin_idle_enabled);
+ } else {
+ static_branch_disable(&scx_builtin_idle_enabled);
+ }
+
/*
* Keep CPUs stable during enable so that the BPF scheduler can track
* online CPUs by watching ->on/offline_cpu() after ->init().
@@ -5287,13 +5294,6 @@ static int scx_ops_enable(struct sched_ext_ops *ops, struct bpf_link *link)
if (scx_ops.cpu_acquire || scx_ops.cpu_release)
static_branch_enable(&scx_ops_cpu_preempt);
- if (!ops->update_idle || (ops->flags & SCX_OPS_KEEP_BUILTIN_IDLE)) {
- reset_idle_masks();
- static_branch_enable(&scx_builtin_idle_enabled);
- } else {
- static_branch_disable(&scx_builtin_idle_enabled);
- }
-
/*
* Lock out forks, cgroup on/offlining and moves before opening the
* floodgate so that they don't wander into the operations prematurely.
--
2.50.1
Since the commit 96336ec70264 ("PCI: Perform reset_resource() and build
fail list in sync") the failed list is always built and returned to let
the caller decide what to do with the failures. The caller may want to
retry resource fitting and assignment and before that can happen, the
resources should be restored to their original state (a reset
effectively clears the struct resource), which requires returning them
on the failed list so that the original state remains stored in the
associated struct pci_dev_resource.
Resource resizing is different from the ordinary resource fitting and
assignment in that it only considers part of the resources. This means
failures for other resource types are not relevant at all and should be
ignored. As resize doesn't unassign such unrelated resources, those
resource ending up into the failed list implies assignment of that
resource must have failed before resize too. The check in
pci_reassign_bridge_resources() to decide if the whole assignment is
successful, however, is based on list emptiness which will cause false
negatives when the failed list has resources with an unrelated type.
If the failed list is not empty, call pci_required_resource_failed()
and extend it to be able to filter on specific resource types too (if
provided).
Calling pci_required_resource_failed() at this point is slightly
problematic because the resource itself is reset when the failed list
is constructed in __assign_resources_sorted(). As a result,
pci_resource_is_optional() does not have access to the original
resource flags. This could be worked around by restoring and
re-reseting the resource around the call to pci_resource_is_optional(),
however, it shouldn't cause issue as resource resizing is meant for
64-bit prefetchable resources according to Christian König (see the
Link which unfortunately doesn't point directly to Christian's reply
because lore didn't store that email at all).
Fixes: 96336ec70264 ("PCI: Perform reset_resource() and build fail list in sync")
Link: https://lore.kernel.org/all/c5d1b5d8-8669-5572-75a7-0b480f581ac1@linux.inte…
Reported-by: D Scott Phillips <scott(a)os.amperecomputing.com>
Tested-by: D Scott Phillips <scott(a)os.amperecomputing.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Reviewed-by: D Scott Phillips <scott(a)os.amperecomputing.com>
Cc: Christian König <christian.koenig(a)amd.com>
Cc: <stable(a)vger.kernel.org>
---
drivers/pci/setup-bus.c | 26 ++++++++++++++++++--------
1 file changed, 18 insertions(+), 8 deletions(-)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 24863d8d0053..dbbd80d78d3d 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -28,6 +28,10 @@
#include <linux/acpi.h>
#include "pci.h"
+#define PCI_RES_TYPE_MASK \
+ (IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH |\
+ IORESOURCE_MEM_64)
+
unsigned int pci_flags;
EXPORT_SYMBOL_GPL(pci_flags);
@@ -384,13 +388,19 @@ static bool pci_need_to_release(unsigned long mask, struct resource *res)
}
/* Return: @true if assignment of a required resource failed. */
-static bool pci_required_resource_failed(struct list_head *fail_head)
+static bool pci_required_resource_failed(struct list_head *fail_head,
+ unsigned long type)
{
struct pci_dev_resource *fail_res;
+ type &= PCI_RES_TYPE_MASK;
+
list_for_each_entry(fail_res, fail_head, list) {
int idx = pci_resource_num(fail_res->dev, fail_res->res);
+ if (type && (fail_res->flags & PCI_RES_TYPE_MASK) != type)
+ continue;
+
if (!pci_resource_is_optional(fail_res->dev, idx))
return true;
}
@@ -504,7 +514,7 @@ static void __assign_resources_sorted(struct list_head *head,
}
/* Without realloc_head and only optional fails, nothing more to do. */
- if (!pci_required_resource_failed(&local_fail_head) &&
+ if (!pci_required_resource_failed(&local_fail_head, 0) &&
list_empty(realloc_head)) {
list_for_each_entry(save_res, &save_head, list) {
struct resource *res = save_res->res;
@@ -1708,10 +1718,6 @@ static void __pci_bridge_assign_resources(const struct pci_dev *bridge,
}
}
-#define PCI_RES_TYPE_MASK \
- (IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH |\
- IORESOURCE_MEM_64)
-
static void pci_bridge_release_resources(struct pci_bus *bus,
unsigned long type)
{
@@ -2449,8 +2455,12 @@ int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
free_list(&added);
if (!list_empty(&failed)) {
- ret = -ENOSPC;
- goto cleanup;
+ if (pci_required_resource_failed(&failed, type)) {
+ ret = -ENOSPC;
+ goto cleanup;
+ }
+ /* Only resources with unrelated types failed (again) */
+ free_list(&failed);
}
list_for_each_entry(dev_res, &saved, list) {
--
2.39.5
pdev_sort_resources() uses pdev_resources_assignable() helper to decide
if device's resources cannot be assigned. pbus_size_mem(), on the other
hand, does not do the same check. This could lead into a situation
where a resource ends up on realloc_head list but is not on the head
list, which is turn prevents emptying the resource from the
realloc_head list in __assign_resources_sorted().
A non-empty realloc_head is unacceptable because it triggers an
internal sanity check as show in this log with a device that has class
0 (PCI_CLASS_NOT_DEFINED):
pci 0001:01:00.0: [144d:a5a5] type 00 class 0x000000 PCIe Endpoint
pci 0001:01:00.0: BAR 0 [mem 0x00000000-0x000fffff 64bit]
pci 0001:01:00.0: ROM [mem 0x00000000-0x0000ffff pref]
pci 0001:01:00.0: enabling Extended Tags
pci 0001:01:00.0: PME# supported from D0 D3hot D3cold
pci 0001:01:00.0: 15.752 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x2 link at 0001:00:00.0 (capable of 31.506 Gb/s with 16.0 GT/s PCIe x2 link)
pcieport 0001:00:00.0: bridge window [mem 0x00100000-0x001fffff] to [bus 01-ff] add_size 100000 add_align 100000
pcieport 0001:00:00.0: bridge window [mem 0x40000000-0x401fffff]: assigned
------------[ cut here ]------------
kernel BUG at drivers/pci/setup-bus.c:2532!
Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
...
Call trace:
pci_assign_unassigned_bus_resources+0x110/0x114 (P)
pci_rescan_bus+0x28/0x48
Use pdev_resources_assignable() also within pbus_size_mem() to skip
processing of non-assignable resources which removes the disparity in
between what resources pdev_sort_resources() and pbus_size_mem()
consider. As non-assignable resources are no longer processed, they are
not added to the realloc_head list, thus the sanity check no longer
triggers.
This disparity problem is very old but only now became apparent after
the commit 2499f5348431 ("PCI: Rework optional resource handling") that
made the ROM resources optional when calculating bridge window sizes
which required adding the resource to the realloc_head list.
Previously, bridge windows were just sized larger than necessary.
Fixes: 2499f5348431 ("PCI: Rework optional resource handling")
Reported-by: Tudor Ambarus <tudor.ambarus(a)linaro.org>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org>
---
drivers/pci/setup-bus.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index f90d49cd07da..24863d8d0053 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1191,6 +1191,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
resource_size_t r_size;
if (r->parent || (r->flags & IORESOURCE_PCI_FIXED) ||
+ !pdev_resources_assignable(dev) ||
((r->flags & mask) != type &&
(r->flags & mask) != type2 &&
(r->flags & mask) != type3))
--
2.39.5