Hello,
This is Chenglong from Google Container Optimized OS. I'm reporting a
severe CPU hang regression that occurs after a high volume of file
creation and subsequent cgroup cleanup.
Through bisection, the issue appears to be caused by a chain reaction
between three commits related to writeback, unbound workqueues, and
CPU-hogging detection. The issue is greatly alleviated on the latest
mainline kernel but is not fully resolved, still occurring
intermittently (~1 in 10 runs).
How to reproduce
The kernel v6.1 is good. The hang is reliably triggered(over 80%
chance) on kernels v6.6 and 6.12 and intermittently on
mainline(6.17-rc7) with the following steps:
Environment: A machine with a fast SSD and a high core count (e.g.,
Google Cloud's N2-standard-128).
Workload: Concurrently generate a large number of files (e.g., 2
million) using multiple services managed by systemd-run. This creates
significant I/O and cgroup churn.
Trigger: After the file generation completes, terminate the
systemd-run services.
Result: Shortly after the services are killed, the system's CPU load
spikes, leading to a massive number of kworker/+inode_switch_wbs
threads and a system-wide hang/livelock where the machine becomes
unresponsive (20s - 300s).
Analysis and Problematic Commits
1. The initial commit: The process begins with a worker that can get
stuck busy-waiting on a spinlock.
Commit: ("writeback, cgroup: release dying cgwbs by switching attached inodes")
Effect: This introduced the inode_switch_wbs_work_fn worker to clean
up cgroup writeback structures. Under our test load, this worker
appears to hit a highly contended wb->list_lock spinlock, causing it
to burn 100% CPU without sleeping.
2. The Kworker Explosion: A subsequent change misinterprets the
spinning worker from Stage 1, leading to a runaway feedback loop of
worker creation.
Commit: 616db8779b1e ("workqueue: Automatically mark CPU-hogging work
items CPU_INTENSIVE")
Effect: This logic sees the spinning worker, marks it as
CPU_INTENSIVE, and excludes it from concurrency management. To handle
the work backlog, it spawns a new kworker, which then also gets stuck
on the same lock, repeating the cycle. This directly causes the
kworker count to explode from <50 to 100-2000+.
3. The System-Wide Lockdown: The final piece allows this localized
worker explosion to saturate the entire system.
Commit: 8639ecebc9b1 ("workqueue: Implement non-strict affinity scope
for unbound workqueues")
Effect: This change introduced non-strict affinity as the default. It
allows the hundreds of kworkers created in Stage 2 to be spread by the
scheduler across all available CPU cores, turning the problem into a
system-wide hang.
Current Status and Mitigation
Mainline Status: On the latest mainline kernel, the hang is far less
frequent and the kworker counts are reduced back to normal (<50),
suggesting other changes have partially mitigated the issue. However,
the hang still occurs, and when it does, the kworker count still
explodes (e.g., 300+), indicating the underlying feedback loop
remains.
Workaround: A reliable mitigation is to revert to the old workqueue
behavior by setting affinity_strict to 1. This contains the kworker
proliferation to a single CPU pod, preventing the system-wide hang.
Questions
Given that the issue is not fully resolved, could you please provide
some guidance?
1. Is this a known issue, and are there patches in development that
might fully address the underlying spinlock contention or the kworker
feedback loop?
2. Is there a better long-term mitigation we can apply other than
forcing strict affinity?
Thank you for your time and help.
Best regards,
Chenglong
On Wed, Sep 24, 2025 at 05:24:15PM -0700, Chenglong Tang wrote:
> The kernel v6.1 is good. The hang is reliably triggered(over 80% chance) on
> kernels v6.6 and 6.12 and intermittently on mainline(6.17-rc7) with the
> following steps:
> -
>
> *Environment:* A machine with a fast SSD and a high core count (e.g.,
> Google Cloud's N2-standard-128).
> -
>
> *Workload:* Concurrently generate a large number of files (e.g., 2 million)
> using multiple services managed by systemd-run. This creates significant
> I/O and cgroup churn.
> -
>
> *Trigger:* After the file generation completes, terminate the systemd-run
> services.
> -
>
> *Result:* Shortly after the services are killed, the system's CPU load
> spikes, leading to a massive number of kworker/+inode_switch_wbs threads
> and a system-wide hang/livelock where the machine becomes unresponsive (20s
> - 300s).
Sounds like:
http://lkml.kernel.org/r/20250912103522.2935-1-jack@suse.cz
Can you see whether those patches resolve the problem?
Thanks.
--
tejun
Make sure to drop the reference taken to the ahb platform device when
looking up its driver data while enabling the smmu.
Note that holding a reference to a device does not prevent its driver
data from going away.
Fixes: 89c788bab1f0 ("ARM: tegra: Add SMMU enabler in AHB")
Cc: stable(a)vger.kernel.org # 3.5
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/amba/tegra-ahb.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/amba/tegra-ahb.c b/drivers/amba/tegra-ahb.c
index c0e8b765522d..f23c3ed01810 100644
--- a/drivers/amba/tegra-ahb.c
+++ b/drivers/amba/tegra-ahb.c
@@ -144,6 +144,7 @@ int tegra_ahb_enable_smmu(struct device_node *dn)
if (!dev)
return -EPROBE_DEFER;
ahb = dev_get_drvdata(dev);
+ put_device(dev);
val = gizmo_readl(ahb, AHB_ARBITRATION_XBAR_CTRL);
val |= AHB_ARBITRATION_XBAR_CTRL_SMMU_INIT_DONE;
gizmo_writel(ahb, val, AHB_ARBITRATION_XBAR_CTRL);
--
2.49.1
Make sure to drop the reference taken to the pbs platform device when
looking up its driver data.
Note that holding a reference to a device does not prevent its driver
data from going away so there is no point in keeping the reference.
Fixes: 5b2dd77be1d8 ("soc: qcom: add QCOM PBS driver")
Cc: stable(a)vger.kernel.org # 6.9
Cc: Anjelique Melendez <quic_amelende(a)quicinc.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/soc/qcom/qcom-pbs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/soc/qcom/qcom-pbs.c b/drivers/soc/qcom/qcom-pbs.c
index 1cc5d045f9dd..06b4a596e275 100644
--- a/drivers/soc/qcom/qcom-pbs.c
+++ b/drivers/soc/qcom/qcom-pbs.c
@@ -173,6 +173,8 @@ struct pbs_dev *get_pbs_client_device(struct device *dev)
return ERR_PTR(-EINVAL);
}
+ platform_device_put(pdev);
+
return pbs;
}
EXPORT_SYMBOL_GPL(get_pbs_client_device);
--
2.49.1
Make sure to drop the reference taken to the mbox platform device when
looking up its driver data.
Note that holding a reference to a device does not prevent its driver
data from going away so there is no point in keeping the reference.
Fixes: 6e1457fcad3f ("soc: apple: mailbox: Add ASC/M3 mailbox driver")
Cc: stable(a)vger.kernel.org # 6.8
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/soc/apple/mailbox.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/soc/apple/mailbox.c b/drivers/soc/apple/mailbox.c
index 49a0955e82d6..1685da1da23d 100644
--- a/drivers/soc/apple/mailbox.c
+++ b/drivers/soc/apple/mailbox.c
@@ -299,11 +299,18 @@ struct apple_mbox *apple_mbox_get(struct device *dev, int index)
return ERR_PTR(-EPROBE_DEFER);
mbox = platform_get_drvdata(pdev);
- if (!mbox)
- return ERR_PTR(-EPROBE_DEFER);
+ if (!mbox) {
+ mbox = ERR_PTR(-EPROBE_DEFER);
+ goto out_put_pdev;
+ }
+
+ if (!device_link_add(dev, &pdev->dev, DL_FLAG_AUTOREMOVE_CONSUMER)) {
+ mbox = ERR_PTR(-ENODEV);
+ goto out_put_pdev;
+ }
- if (!device_link_add(dev, &pdev->dev, DL_FLAG_AUTOREMOVE_CONSUMER))
- return ERR_PTR(-ENODEV);
+out_put_pdev:
+ put_device(&pdev->dev);
return mbox;
}
--
2.49.1
The ACPI has ways to annotate the location of a USB device. Wire that
annotation to a v4l2 control.
To support all possible devices, add a way to annotate USB devices on DT
as well. The original binding discussion happened here:
https://lore.kernel.org/linux-devicetree/20241212-usb-orientation-v1-1-0b69…
The following patches are needed regardless if we finally add support
for orientation and rotation or not:
- media: uvcvideo: Always set default_value
- media: uvcvideo: Set a function for UVC_EXT_GPIO_UNIT
Signed-off-by: Ricardo Ribalda <ribalda(a)chromium.org>
---
Changes in v3:
- refactor dt bindings
- add media: uvcvideo: Use current_value for read-only controls
- get_(max|cur|def) = swentity_get_cur
- virtual_entity add codestyle
- Codestyle
- Fix xu get_info and get_len
- Drop ACPI_DEVICE_SWNODE_DEV_ROTATION
- Add missing select V4L2_FWNODE
- Link to v2: https://lore.kernel.org/r/20250605-uvc-orientation-v2-0-5710f9d030aa@chromi…
Changes in v2:
- Add support for rotation
- Rename fwnode to swentity
- Remove the patch to move the gpio file
- Remove patches already in media-committers
- Change priority of data origins
- Patch mipi-disco
- Link to v1: https://lore.kernel.org/r/20250403-uvc-orientation-v1-0-1a0cc595a62d@chromi…
---
Ricardo Ribalda (12):
media: uvcvideo: Always set default_value
media: uvcvideo: Set a function for UVC_EXT_GPIO_UNIT
media: v4l: fwnode: Support ACPI's _PLD for v4l2_fwnode_device_parse
ACPI: mipi-disco-img: Do not duplicate rotation info into swnodes
media: ipu-bridge: Use v4l2_fwnode_device_parse helper
media: ipu-bridge: Use v4l2_fwnode for unknown rotations
dt-bindings: media: Add usb-camera-module
media: uvcvideo: Add support for V4L2_CID_CAMERA_ORIENTATION
media: uvcvideo: Fill ctrl->info.selector earlier
media: uvcvideo: Add uvc_ctrl_query_entity helper
media: uvcvideo: Use current_value for read-only controls
media: uvcvideo: Add support for V4L2_CID_CAMERA_ROTATION
.../bindings/media/usb-camera-module.yaml | 46 +++++
MAINTAINERS | 1 +
drivers/acpi/mipi-disco-img.c | 15 --
drivers/media/pci/intel/Kconfig | 1 +
drivers/media/pci/intel/ipu-bridge.c | 58 +++---
drivers/media/usb/uvc/Kconfig | 1 +
drivers/media/usb/uvc/Makefile | 3 +-
drivers/media/usb/uvc/uvc_ctrl.c | 201 +++++++++++++++------
drivers/media/usb/uvc/uvc_driver.c | 22 ++-
drivers/media/usb/uvc/uvc_entity.c | 3 +-
drivers/media/usb/uvc/uvc_swentity.c | 107 +++++++++++
drivers/media/usb/uvc/uvcvideo.h | 22 +++
drivers/media/v4l2-core/v4l2-fwnode.c | 84 ++++++++-
include/acpi/acpi_bus.h | 1 -
include/linux/usb/uvc.h | 3 +
15 files changed, 441 insertions(+), 127 deletions(-)
---
base-commit: afb100a5ea7a13d7e6937dcd3b36b19dc6cc9328
change-id: 20250403-uvc-orientation-5f7f19da5adb
Best regards,
--
Ricardo Ribalda <ribalda(a)chromium.org>