From: Su Hui suhui@nfschina.com
[ Upstream commit 7919407eca2ef562fa6c98c41cfdf6f6cdd69d92 ]
When errors like these are encountered:
xhci_hcd 0000:4a:00.2: xHCI dying or halted, can't queue_command
xhci_hcd 0000:4a:00.2: FIXME: allocate a command ring segment
usb usb5-port6: couldn't allocate usb_device
it is hard to know whether xhc_state is dying or halted. So it is better to print the value of xhc_state, which can help locate the reason for the bug.
Signed-off-by: Su Hui suhui@nfschina.com
Link: https://lore.kernel.org/r/20250725060117.1773770-1-suhui@nfschina.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit is suitable for backporting to stable kernel trees for the following reasons:
1. **Enhanced Debugging for Real-World Issues**: The commit improves debugging of USB xHCI host controller failures by printing the actual `xhc_state` value when `queue_command` fails. The commit message shows real error messages users encounter ("xHCI dying or halted, can't queue_command"), demonstrating this is a real-world debugging problem.
2. **Minimal and Safe Change**: The change is extremely small and safe - it only modifies a debug print statement from:
```c
xhci_dbg(xhci, "xHCI dying or halted, can't queue_command\n");
```
to:
```c
xhci_dbg(xhci, "xHCI dying or halted, can't queue_command. state: 0x%x\n",
	 xhci->xhc_state);
```
3. **No Functional Changes**: This is a pure diagnostic improvement. It doesn't change any logic, control flow, or data structures. It only adds the state value (0x%x format) to an existing debug message.
4. **Important for Troubleshooting**: The xHCI driver is critical for USB functionality, and when it fails with "dying or halted" states, knowing the exact state helps diagnose whether:
   - `XHCI_STATE_DYING` (0x1) - controller is dying
   - `XHCI_STATE_HALTED` (0x2) - controller is halted
   - Both states (0x3) - controller has both flags set
This distinction is valuable for debugging hardware issues, driver bugs, or system problems.
5. **Zero Risk of Regression**: Adding a parameter to a debug print statement has no risk of introducing regressions. The worst case is the debug message prints the state value.
6. **Follows Stable Rules**: This meets stable kernel criteria as it:
   - Fixes a real debugging limitation
   - Is obviously correct
   - Has been tested (signed-off and accepted by Greg KH)
   - Is small (single line change)
   - Doesn't add new features, just improves existing diagnostics
The commit helps system administrators and developers diagnose USB issues more effectively by providing the actual state value rather than just saying "dying or halted", making it a valuable debugging enhancement for stable kernels.
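The flag decoding described in point 4 can be sketched in userspace C. This is only an illustration: the two flag values are the ones quoted above (0x1 and 0x2), while `xhc_state_describe()` is a hypothetical helper that does not exist in the xHCI driver.

```c
/* Flag values as quoted in the analysis above. */
#define XHCI_STATE_DYING	(1 << 0)	/* 0x1 */
#define XHCI_STATE_HALTED	(1 << 1)	/* 0x2 */

/*
 * Hypothetical helper (not part of the driver): turn the value printed
 * after "state: 0x" into a readable description. Other bits are ignored.
 */
const char *xhc_state_describe(unsigned int state)
{
	int dying = !!(state & XHCI_STATE_DYING);
	int halted = !!(state & XHCI_STATE_HALTED);

	if (dying && halted)
		return "dying and halted";
	if (dying)
		return "dying";
	if (halted)
		return "halted";
	return "neither dying nor halted";
}
```

With the patched message, a log line ending in `state: 0x2` would decode to "halted" under these assumptions.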
 drivers/usb/host/xhci-ring.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 94c9c9271658..131e7530ec4a 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -4372,7 +4372,8 @@ static int queue_command(struct xhci_hcd *xhci, struct xhci_command *cmd,
 
 	if ((xhci->xhc_state & XHCI_STATE_DYING) ||
 		(xhci->xhc_state & XHCI_STATE_HALTED)) {
-		xhci_dbg(xhci, "xHCI dying or halted, can't queue_command\n");
+		xhci_dbg(xhci, "xHCI dying or halted, can't queue_command. state: 0x%x\n",
+			 xhci->xhc_state);
 		return -ESHUTDOWN;
 	}
 
From: Dave Penkler dpenkler@gmail.com
[ Upstream commit f50d5e0c1f80d004510bf77cb0e1759103585c00 ]
A new version of a bona fide genuine NI-USB-HS+ adaptor sends new response codes to the initialization sequence.
Add the checking for these response codes to suppress console warning messages.
Signed-off-by: Dave Penkler dpenkler@gmail.com
Link: https://lore.kernel.org/r/20250722164810.2621-1-dpenkler@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis, I'll now provide my assessment of whether this commit should be backported to stable kernel trees.
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Commit Analysis
The commit adds support for new response codes from newer NI-USB-HS+ hardware adaptors. Looking at the specific code changes:
1. **Lines 2082-2086**: The comment is updated from "NI-USB-HS+ sends 0xf here" to "NI-USB-HS+ sends 0xf or 0x19 here", and the check is expanded to accept 0x19 as a valid response code.
2. **Lines 2113-2119**: The comment is updated to indicate that new HS+ hardware sends 0x59 at position [10], and the check is expanded to accept 0x59 as a valid response code.
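The two expanded checks can be sketched as standalone predicates. The accepted values are taken from the patched conditions; the function names are hypothetical, and the sketch assumes (as in the driver code shown in the diff) that 0x0 at either position just means "not ready yet" and is never flagged.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: is the byte at position [6] an unexpected value? */
bool byte6_unexpected(uint8_t b)
{
	if (b == 0x0)
		return false;	/* not ready yet, nothing to validate */
	/* 0x19 is the value newly accepted by this commit */
	return b != 0x2 && b != 0xe && b != 0xf && b != 0x16 && b != 0x19;
}

/* Hypothetical helper: is the byte at position [10] an unexpected value? */
bool byte10_unexpected(uint8_t b)
{
	if (b == 0x0)
		return false;	/* not ready yet, nothing to validate */
	/* 0x59 is the value newly accepted by this commit */
	return b != 0x96 && b != 0x7 && b != 0x6e && b != 0x59;
}
```

Each position keeps its own set of accepted values, so 0x59 is still reported as unexpected at position [6], and 0x19 at position [10].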
## Rationale for Backporting
1. **Fixes a Real User Issue**: This fixes a bug where newer NI-USB-HS+ hardware generates console warning messages due to unrecognized response codes. Without this fix, users with newer hardware see error messages like "unexpected data: buffer[x]=0x19, expected..." when the hardware is actually functioning correctly.
2. **Minimal Risk**: The change is extremely contained - it only adds two additional valid response codes (0x19 and 0x59) to existing validation checks. No logic changes, no structural modifications, just expanding the set of accepted values.
3. **Hardware Compatibility**: This is a hardware enablement fix that allows Linux to properly support newer versions of existing hardware without generating spurious warnings.
4. **Staging Driver Context**: While this is in staging, the gpib subsystem appears to be actively maintained (based on recent commit history), and hardware compatibility fixes are important for users of this specialized equipment.
5. **Clear Bug Fix**: The commit message explicitly states it "suppress[es] console warning messages" - this is fixing incorrect error reporting, not adding new features.
6. **No Architectural Changes**: The patch doesn't change any APIs, data structures, or driver architecture. It's purely adding two constants to existing validation logic.
7. **Tested Hardware**: The commit message indicates this was tested with "a bona fide genuine NI-USB-HS+ adaptor", suggesting real hardware validation.
## Stable Tree Criteria Met
- ✓ Fixes a bug that affects users (spurious warnings with newer hardware)
- ✓ Small and contained change (4 lines modified)
- ✓ No side effects beyond the intended fix
- ✓ No architectural changes
- ✓ Hardware enablement for existing driver
The commit clearly meets the stable tree criteria as a minimal hardware compatibility fix that resolves user-visible warnings without any risk of regression.
 drivers/staging/gpib/ni_usb/ni_usb_gpib.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/gpib/ni_usb/ni_usb_gpib.c b/drivers/staging/gpib/ni_usb/ni_usb_gpib.c
index 7cf25c95787f..73ea72f34c0a 100644
--- a/drivers/staging/gpib/ni_usb/ni_usb_gpib.c
+++ b/drivers/staging/gpib/ni_usb/ni_usb_gpib.c
@@ -2079,10 +2079,10 @@ static int ni_usb_hs_wait_for_ready(struct ni_usb_priv *ni_priv)
 		}
 		if (buffer[++j] != 0x0) { // [6]
 			ready = 1;
-			// NI-USB-HS+ sends 0xf here
+			// NI-USB-HS+ sends 0xf or 0x19 here
 			if (buffer[j] != 0x2 && buffer[j] != 0xe && buffer[j] != 0xf &&
-			    buffer[j] != 0x16) {
-				dev_err(&usb_dev->dev, "unexpected data: buffer[%i]=0x%x, expected 0x2, 0xe, 0xf or 0x16\n",
+			    buffer[j] != 0x16 && buffer[j] != 0x19) {
+				dev_err(&usb_dev->dev, "unexpected data: buffer[%i]=0x%x, expected 0x2, 0xe, 0xf, 0x16 or 0x19\n",
 					j, (int)buffer[j]);
 				unexpected = 1;
 			}
@@ -2110,11 +2110,11 @@ static int ni_usb_hs_wait_for_ready(struct ni_usb_priv *ni_priv)
 				j, (int)buffer[j]);
 			unexpected = 1;
 		}
-		if (buffer[++j] != 0x0) {
+		if (buffer[++j] != 0x0) { // [10] MC usb-488 sends 0x7 here, new HS+ sends 0x59
 			ready = 1;
-			if (buffer[j] != 0x96 && buffer[j] != 0x7 && buffer[j] != 0x6e) {
-// [10] MC usb-488 sends 0x7 here
-				dev_err(&usb_dev->dev, "unexpected data: buffer[%i]=0x%x, expected 0x96, 0x07 or 0x6e\n",
+			if (buffer[j] != 0x96 && buffer[j] != 0x7 && buffer[j] != 0x6e &&
+			    buffer[j] != 0x59) {
+				dev_err(&usb_dev->dev, "unexpected data: buffer[%i]=0x%x, expected 0x96, 0x07, 0x6e or 0x59\n",
 					j, (int)buffer[j]);
 				unexpected = 1;
 			}
From: Moon Hee Lee moonhee.lee.ca@gmail.com
[ Upstream commit 661e9cd196598c7d2502260ebbe60970546cca35 ]
The test_kexec_jump program builds correctly when invoked from the top-level selftests/Makefile, which explicitly sets the OUTPUT variable. However, building directly in tools/testing/selftests/kexec fails with:
make: *** No rule to make target '/test_kexec_jump', needed by 'test_kexec_jump.sh'. Stop.
This failure occurs because the Makefile rule relies on $(OUTPUT), which is undefined in direct builds.
Fix this by listing test_kexec_jump in TEST_GEN_PROGS, the standard way to declare generated test binaries in the kselftest framework. This ensures the binary is built regardless of invocation context and properly removed by make clean.
Link: https://lore.kernel.org/r/20250702171704.22559-2-moonhee.lee.ca@gmail.com
Acked-by: Shuah Khan skhan@linuxfoundation.org
Signed-off-by: Moon Hee Lee moonhee.lee.ca@gmail.com
Acked-by: Baoquan He bhe@redhat.com
Acked-by: David Woodhouse dwmw@amazon.co.uk
Signed-off-by: Shuah Khan skhan@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Perfect! This confirms the exact build issue described in the commit message. The problem is that `$(OUTPUT)` is undefined when building directly in the kexec directory.
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real bug**: The commit fixes a build failure that prevents the kexec selftests from building when invoked directly from `tools/testing/selftests/kexec/`. The error message confirms this: `make: *** No rule to make target '/test_kexec_jump', needed by 'test_kexec_jump.sh'. Stop.`
2. **Small and contained fix**: The change is minimal - it simply replaces the custom Makefile rule:
```makefile
test_kexec_jump.sh: $(OUTPUT)/test_kexec_jump
```
with the standard kselftest framework approach:
```makefile
TEST_GEN_PROGS := test_kexec_jump
```
3. **No architectural changes**: This is purely a build system fix that doesn't change any kernel functionality or introduce new features. It just fixes the Makefile to use the standard kselftest variables.
4. **Minimal risk**: The change uses the established kselftest framework pattern (`TEST_GEN_PROGS`) which is well-tested and widely used throughout the kernel selftests. From examining `lib.mk`, we can see that `TEST_GEN_PROGS` automatically handles the `$(OUTPUT)` prefix and proper clean targets.
5. **Fixes testing infrastructure**: Having working selftests is important for kernel stability testing. This fix ensures that the kexec jump functionality can be properly tested in stable kernels, which is particularly important given that kexec is a critical feature for system reliability.
6. **Clear regression**: The bug prevents a previously working test from building in certain scenarios. While the test builds correctly when invoked from the top-level selftests Makefile (which sets OUTPUT), direct builds fail. This is a regression in usability that affects developers and CI systems.
The fix follows the documented pattern for kselftest Makefiles and will ensure consistent behavior regardless of how the tests are invoked. This is exactly the type of targeted bug fix that belongs in stable kernels - it fixes a specific issue without introducing new functionality or risk.
 tools/testing/selftests/kexec/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kexec/Makefile b/tools/testing/selftests/kexec/Makefile
index e3000ccb9a5d..874cfdd3b75b 100644
--- a/tools/testing/selftests/kexec/Makefile
+++ b/tools/testing/selftests/kexec/Makefile
@@ -12,7 +12,7 @@ include ../../../scripts/Makefile.arch
 
 ifeq ($(IS_64_BIT)$(ARCH_PROCESSED),1x86)
 TEST_PROGS += test_kexec_jump.sh
-test_kexec_jump.sh: $(OUTPUT)/test_kexec_jump
+TEST_GEN_PROGS := test_kexec_jump
 endif
 
 include ../lib.mk
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 4ff3aeb664f7dfe824ba91ffb0b203397a8d431e ]
The Lenovo Yoga 6 13ALC6 82ND has a similar BIOS problem as other Lenovo laptops from that vintage that causes a rather long resume from suspend.
Add it to the quirk list that manipulates the scratch register to avoid the issue.
Reported-by: Adam Berglund adam.f.berglund@hotmail.com
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4434
Tested-by: Adam Berglund adam.f.berglund@hotmail.com
Signed-off-by: Mario Limonciello mario.limonciello@amd.com
Reviewed-by: Hans de Goede hansg@kernel.org
Link: https://lore.kernel.org/r/20250718172307.1928744-1-superm1@kernel.org
Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Backport Justification
1. **Fixes a real user-impacting bug**: The commit addresses a BIOS firmware bug that causes "rather long resume from suspend" on the Lenovo Yoga 6 13ALC6 laptop. This is a significant usability issue that affects users of this specific hardware model during suspend/resume operations.
2. **Minimal and contained change**: The patch simply adds 9 lines (a comment plus an 8-line entry) to an existing DMI quirk table (lines 193-201 in the diff). It adds one new entry:
```c
{
	.ident = "Lenovo Yoga 6 13ALC6",
	.driver_data = &quirk_s2idle_bug,
	.matches = {
		DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
		DMI_MATCH(DMI_PRODUCT_NAME, "82ND"),
	}
},
```
3. **Follows established pattern**: The fix uses the exact same mechanism (`quirk_s2idle_bug`) that's already applied to 20+ other Lenovo laptops in the same file. The workaround manipulates a scratch register (FCH_PM_SCRATCH) to skip a problematic SMI handler during suspend-to-idle resume, as documented in the comment at lines 241-248.
4. **No architectural changes**: This is purely a device-specific quirk addition to an existing quirk framework. No new functionality is introduced, no APIs are changed, and no kernel subsystem architecture is modified.
5. **Low regression risk**: The change only affects systems that match the specific DMI strings (LENOVO board vendor + 82ND product name). It cannot affect any other hardware.
6. **Tested by reporter**: The commit message indicates "Tested-by: Adam Berglund" who was also the reporter of the issue, confirming the fix works on the affected hardware.
7. **Follows stable tree rules**: This is exactly the type of commit that stable trees want:
   - Fixes a real bug (long resume times)
   - Minimal change (9 lines)
   - Hardware-specific fix with no broader impact
   - Already tested on affected hardware
The issue being fixed is documented at https://gitlab.freedesktop.org/drm/amd/-/issues/4434, and the workaround applies the same proven fix that's been working for numerous other Lenovo models with similar BIOS issues related to suspend-to-idle resume performance when IOMMU is enabled for NVMe devices.
 drivers/platform/x86/amd/pmc/pmc-quirks.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/platform/x86/amd/pmc/pmc-quirks.c b/drivers/platform/x86/amd/pmc/pmc-quirks.c
index 131f10b68308..ded4c84f5ed1 100644
--- a/drivers/platform/x86/amd/pmc/pmc-quirks.c
+++ b/drivers/platform/x86/amd/pmc/pmc-quirks.c
@@ -190,6 +190,15 @@ static const struct dmi_system_id fwbug_list[] = {
 			DMI_MATCH(DMI_PRODUCT_NAME, "82XQ"),
 		}
 	},
+	/* https://gitlab.freedesktop.org/drm/amd/-/issues/4434 */
+	{
+		.ident = "Lenovo Yoga 6 13ALC6",
+		.driver_data = &quirk_s2idle_bug,
+		.matches = {
+			DMI_MATCH(DMI_BOARD_VENDOR, "LENOVO"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "82ND"),
+		}
+	},
 	/* https://gitlab.freedesktop.org/drm/amd/-/issues/2684 */
 	{
 		.ident = "HP Laptop 15s-eq2xxx",
From: Prashant Malani pmalani@google.com
[ Upstream commit 0a1416a49e63c320f6e6c1c8d07e1b58c0d4a3f3 ]
AMU counters on certain CPPC-based platforms tend to yield inaccurate delivered performance measurements on systems that are idle/mostly idle. This results in an inaccurate frequency being stored by cpufreq in its policy structure when the CPU is brought online. [1]
Consequently, if the userspace governor tries to set the frequency to a new value, there is a possibility that it would be the erroneous value stored earlier. In such a scenario, cpufreq would assume that the requested frequency has already been set and return early, resulting in the correct/new frequency request never making it to the hardware.
Since the operating frequency is liable to this sort of inconsistency, mark the CPPC driver with CPUFREQ_NEED_UPDATE_LIMITS so that it is always invoked when a target frequency update is requested.
Link: https://lore.kernel.org/linux-pm/20250619000925.415528-3-pmalani@google.com/ [1]
Suggested-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
Signed-off-by: Prashant Malani pmalani@google.com
Acked-by: Viresh Kumar viresh.kumar@linaro.org
Link: https://patch.msgid.link/20250722055611.130574-2-pmalani@google.com
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Code Change Analysis
The change is minimal and focused - it only adds the `CPUFREQ_NEED_UPDATE_LIMITS` flag to the cppc_cpufreq_driver struct:
```c
-	.flags = CPUFREQ_CONST_LOOPS,
+	.flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
```
This single-line change modifies the driver flags in drivers/cpufreq/cppc_cpufreq.c:928.
## Bug Being Fixed
The commit addresses a real bug where:
1. **AMU counters on CPPC platforms provide inaccurate measurements** when systems are idle/mostly idle
2. **This causes cpufreq to store incorrect frequency values** in its policy structure during CPU online
3. **Userspace frequency changes fail silently** because cpufreq's early return optimization (`if (target_freq == policy->cur)`) prevents the hardware from receiving the correct frequency request
The bug manifests as userspace being unable to set CPU frequencies correctly on affected CPPC platforms, which is a functional regression that impacts power management and performance.
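The failure mode can be sketched with a simplified model of the early-return check. This is a userspace illustration under stated assumptions, not the cpufreq core code: `NEED_UPDATE_LIMITS_SKETCH`, `struct policy_sketch` and `request_reaches_driver()` are hypothetical stand-ins for `CPUFREQ_NEED_UPDATE_LIMITS`, the policy structure and the core's target path.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the CPUFREQ_NEED_UPDATE_LIMITS driver flag. */
#define NEED_UPDATE_LIMITS_SKETCH	(1u << 0)

/* Simplified policy: only the cached current frequency (kHz) matters here. */
struct policy_sketch {
	unsigned int cur;
};

/*
 * Returns true when a frequency request would be passed on to the driver.
 * Without the flag, a request equal to the cached value is dropped, even
 * if the cached value was wrong to begin with.
 */
bool request_reaches_driver(const struct policy_sketch *policy,
			    unsigned int driver_flags,
			    unsigned int target_freq)
{
	if (target_freq == policy->cur &&
	    !(driver_flags & NEED_UPDATE_LIMITS_SKETCH))
		return false;	/* early return: hardware never sees it */

	return true;
}
```

If the cached `cur` is an erroneous value that happens to equal the requested frequency, the request is only forwarded when the flag is set, which is the behaviour the commit message describes.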
## Why This Qualifies for Stable
1. **Fixes a real bug**: The issue prevents proper CPU frequency scaling on CPPC-based platforms, affecting power management functionality
2. **Minimal risk**: The change is extremely small (one flag addition) and well-contained to the CPPC driver
3. **Clear fix semantics**: The `CPUFREQ_NEED_UPDATE_LIMITS` flag is an established mechanism (introduced in commit 1c534352f47f) specifically designed for this scenario where drivers need frequency updates even when target_freq == policy->cur
4. **Similar precedent**: Other cpufreq drivers (intel_pstate, amd-pstate) already use this flag for similar reasons, showing it's a proven solution
5. **No architectural changes**: This doesn't introduce new features or change kernel architecture - it simply ensures the driver behaves correctly
6. **Recent related fixes**: The git history shows ongoing work to fix CPPC feedback counter issues (e.g., commit c47195631960), indicating this is part of stabilizing CPPC functionality
## Impact Assessment
- **Affected systems**: CPPC-based platforms with AMU counters that exhibit the idle measurement issue
- **Risk**: Very low - the flag only bypasses an optimization when needed, forcing frequency updates to reach hardware
- **Testing**: The change has been acked by the cpufreq maintainer (Viresh Kumar) and suggested by the PM subsystem maintainer (Rafael J. Wysocki)
The commit follows stable tree rules by fixing an important bug with minimal changes and low regression risk.
 drivers/cpufreq/cppc_cpufreq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index b7c688a5659c..c8e63c7aa9a3 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -925,7 +925,7 @@ static struct freq_attr *cppc_cpufreq_attr[] = {
 };
 
 static struct cpufreq_driver cppc_cpufreq_driver = {
-	.flags = CPUFREQ_CONST_LOOPS,
+	.flags = CPUFREQ_CONST_LOOPS | CPUFREQ_NEED_UPDATE_LIMITS,
 	.verify = cppc_verify_policy,
 	.target = cppc_cpufreq_set_target,
 	.get = cppc_cpufreq_get_rate,
From: Cynthia Huang cynthia@andestech.com
[ Upstream commit 04850819c65c8242072818655d4341e70ae998b5 ]
The kernel does not provide sys_futex() on 32-bit architectures that do not support 32-bit time representations, such as riscv32.
As a result, glibc cannot define SYS_futex, causing compilation failures in tests that rely on this syscall. Define SYS_futex as SYS_futex_time64 in such cases to ensure successful compilation and compatibility.
Signed-off-by: Cynthia Huang cynthia@andestech.com
Signed-off-by: Ben Zong-You Xie ben717@andestech.com
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Reviewed-by: Muhammad Usama Anjum usama.anjum@collabora.com
Link: https://lore.kernel.org/all/20250710103630.3156130-1-ben717@andestech.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real compilation failure bug**: The commit addresses a build failure in kernel selftests on 32-bit architectures with 64-bit time_t, specifically riscv32. This prevents the futex selftests from compiling on these architectures, which is a functional bug that affects testing infrastructure.
2. **Simple and contained fix**: The change is minimal - it only adds a conditional preprocessor definition that maps `SYS_futex` to `SYS_futex_time64` when the former is not defined but the latter is. The fix is:
```c
#if !defined(SYS_futex) && defined(SYS_futex_time64)
#define SYS_futex SYS_futex_time64
#endif
```
3. **No risk of regression**: The change is guarded by preprocessor conditionals that only activate when `SYS_futex` is not defined AND `SYS_futex_time64` is defined. This means it has zero impact on architectures where `SYS_futex` is already defined, ensuring no regressions on existing systems.
4. **Affects kernel testing infrastructure**: While this is in the selftests directory and not core kernel code, having working selftests is critical for kernel stability and quality assurance. The futex selftests are important for validating futex functionality across different architectures.
5. **Addresses Y2038 compatibility**: This fix is part of the broader Y2038 compatibility effort where 32-bit architectures are transitioning to 64-bit time_t. As more 32-bit architectures adopt 64-bit time_t, this fix becomes increasingly important.
6. **Clear problem and solution**: The commit message clearly explains the issue (glibc cannot define SYS_futex on certain architectures) and provides a clean solution that maintains compatibility.
The fix follows stable kernel rules by being a minimal change that fixes an important bug without introducing new features or architectural changes. It's confined to the testing infrastructure and has clear boundaries with no side effects beyond enabling compilation of the futex selftests on affected architectures.
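As a rough illustration of how the fallback is consumed, the sketch below wraps a `FUTEX_WAKE` call around `syscall()`. It assumes a Linux system with glibc-style headers; `futex_wake_sketch()` is a hypothetical helper, not the selftest's own `futex()` wrapper.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Same fallback as in the patch: only active when SYS_futex is missing. */
#if !defined(SYS_futex) && defined(SYS_futex_time64)
#define SYS_futex SYS_futex_time64
#endif

/*
 * Hypothetical helper: wake up to nr_wake waiters blocked on uaddr.
 * Returns the number of woken waiters, or -1 with errno set.
 */
long futex_wake_sketch(uint32_t *uaddr, int nr_wake)
{
	return syscall(SYS_futex, uaddr, FUTEX_WAKE, nr_wake, NULL, NULL, 0);
}
```

On architectures where the C library already defines `SYS_futex`, the preprocessor block is skipped and the call compiles exactly as before.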
 tools/testing/selftests/futex/include/futextest.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/tools/testing/selftests/futex/include/futextest.h b/tools/testing/selftests/futex/include/futextest.h
index ddbcfc9b7bac..7a5fd1d5355e 100644
--- a/tools/testing/selftests/futex/include/futextest.h
+++ b/tools/testing/selftests/futex/include/futextest.h
@@ -47,6 +47,17 @@ typedef volatile u_int32_t futex_t;
 					 FUTEX_PRIVATE_FLAG)
 #endif
 
+/*
+ * SYS_futex is expected from system C library, in glibc some 32-bit
+ * architectures (e.g. RV32) are using 64-bit time_t, therefore it doesn't have
+ * SYS_futex defined but just SYS_futex_time64. Define SYS_futex as
+ * SYS_futex_time64 in this situation to ensure the compilation and the
+ * compatibility.
+ */
+#if !defined(SYS_futex) && defined(SYS_futex_time64)
+#define SYS_futex SYS_futex_time64
+#endif
+
 /**
  * futex() - SYS_futex syscall wrapper
  * @uaddr:	address of first futex
From: Benson Leung bleung@chromium.org
[ Upstream commit af833e7f7db3cf4c82f063668e1b52297a30ec18 ]
ucsi_psy_get_current_max would return 0mA as the maximum current if UCSI detected a BC or a Default USB Power source.
The comment in this function is correct that we can't tell the difference between DCP/CDP or SDP chargers, but we can guarantee that at least one unit of USB 1.1/2.0 power, which is 100mA, is available. That is a better fallback value than 0, which causes some userspaces, including the ChromeOS power manager, to regard this as a power source that is not providing any power.
In reality, 100mA is guaranteed from all sources in these classes.
Signed-off-by: Benson Leung bleung@chromium.org
Reviewed-by: Jameson Thies jthies@google.com
Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com
Reviewed-by: Sebastian Reichel sebastian.reichel@collabora.com
Link: https://lore.kernel.org/r/20250717200805.3710473-1-bleung@chromium.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Let me analyze the code changes more carefully. The commit introduces a new constant `UCSI_TYPEC_DEFAULT_CURRENT` set to 100mA and changes the behavior for BC 1.2 and Default USB power sources from returning 0mA to returning 100mA.
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Bug Fix for User-Facing Issue**: The commit fixes a clear bug where the UCSI driver was reporting 0mA as the maximum current for BC 1.2 and Default USB power sources. This incorrect value causes userspace power managers (specifically mentioned: ChromeOS power manager) to incorrectly interpret these power sources as not providing any power. This is a functionality bug that affects end users.
2. **Small and Contained Change**: The fix is minimal - it only changes two lines in the actual logic:
   - Line 167 in `ucsi_psy_get_current_max()`: Changes from `val->intval = 0;` to `val->intval = UCSI_TYPEC_DEFAULT_CURRENT * 1000;`
   - Adds a new constant definition in the header file
3. **Minimal Risk of Regression**: The change is very conservative:
   - It only affects the BC (Battery Charging) and DEFAULT power operation modes
   - The 100mA value is the guaranteed minimum from USB 1.1/2.0 specification (1 unit load)
   - It doesn't change behavior for any other power modes (PD, TypeC 1.5A, TypeC 3.0A)
   - The change is read-only (only affects reported values, doesn't change any hardware behavior)
4. **Clear Technical Justification**: The commit message correctly explains that while UCSI cannot distinguish between DCP/CDP or SDP chargers, all USB sources in these classes guarantee at least 100mA (1 unit load per USB specification). This is technically accurate and represents the minimum guaranteed current.
5. **No Architectural Changes**: This is a simple value correction that doesn't introduce new features or change any interfaces. It maintains the existing API while providing more accurate information.
6. **Fixes Real-World Issues**: The commit explicitly mentions that returning 0mA causes problems with userspace power management software, which would treat the power source as non-functional. This could lead to incorrect battery status reporting or power management decisions.
The change follows stable kernel rules by being a targeted fix for a specific bug that affects users, with minimal code changes and low regression risk.
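The new fallback can be sketched as a small mapping from operation mode to reported current. The three constants match the header values in the patch (in mA), and the `* 1000` conversion mirrors the patched line; the enum and `current_max_ua()` are hypothetical simplifications that leave out the PD case entirely.

```c
/* Current limits in mA, as defined in ucsi.h after the patch. */
#define UCSI_TYPEC_DEFAULT_CURRENT	100
#define UCSI_TYPEC_1_5_CURRENT		1500
#define UCSI_TYPEC_3_0_CURRENT		3000

/* Hypothetical modes; the driver uses UCSI_CONSTAT_PWR_OPMODE_* values. */
enum opmode_sketch {
	OPMODE_DEFAULT,
	OPMODE_BC,
	OPMODE_TYPEC_1_5,
	OPMODE_TYPEC_3_0,
};

/* Returns the reported maximum current in microamps (mA * 1000). */
int current_max_ua(enum opmode_sketch mode)
{
	switch (mode) {
	case OPMODE_TYPEC_1_5:
		return UCSI_TYPEC_1_5_CURRENT * 1000;
	case OPMODE_TYPEC_3_0:
		return UCSI_TYPEC_3_0_CURRENT * 1000;
	case OPMODE_BC:
	case OPMODE_DEFAULT:
	default:
		/* Previously 0; one USB unit load is the safe lower bound. */
		return UCSI_TYPEC_DEFAULT_CURRENT * 1000;
	}
}
```

Under these assumptions a BC or Default source now reports 100000 µA instead of 0, so userspace no longer sees it as supplying no power.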
 drivers/usb/typec/ucsi/psy.c  | 2 +-
 drivers/usb/typec/ucsi/ucsi.h | 7 ++++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/typec/ucsi/psy.c b/drivers/usb/typec/ucsi/psy.c
index 62ac69730405..62a9d68bb66d 100644
--- a/drivers/usb/typec/ucsi/psy.c
+++ b/drivers/usb/typec/ucsi/psy.c
@@ -164,7 +164,7 @@ static int ucsi_psy_get_current_max(struct ucsi_connector *con,
 	case UCSI_CONSTAT_PWR_OPMODE_DEFAULT:
 	/* UCSI can't tell b/w DCP/CDP or USB2/3x1/3x2 SDP chargers */
 	default:
-		val->intval = 0;
+		val->intval = UCSI_TYPEC_DEFAULT_CURRENT * 1000;
 		break;
 	}
 	return 0;
diff --git a/drivers/usb/typec/ucsi/ucsi.h b/drivers/usb/typec/ucsi/ucsi.h
index 5a8f947fcece..f644bc251863 100644
--- a/drivers/usb/typec/ucsi/ucsi.h
+++ b/drivers/usb/typec/ucsi/ucsi.h
@@ -481,9 +481,10 @@ struct ucsi {
 #define UCSI_MAX_SVID		5
 #define UCSI_MAX_ALTMODES	(UCSI_MAX_SVID * 6)
 
-#define UCSI_TYPEC_VSAFE5V	5000
-#define UCSI_TYPEC_1_5_CURRENT	1500
-#define UCSI_TYPEC_3_0_CURRENT	3000
+#define UCSI_TYPEC_VSAFE5V		5000
+#define UCSI_TYPEC_DEFAULT_CURRENT	100
+#define UCSI_TYPEC_1_5_CURRENT		1500
+#define UCSI_TYPEC_3_0_CURRENT		3000
 
 struct ucsi_connector {
 	int num;
From: Daniele Palmas dnlplm@gmail.com
[ Upstream commit 00559ba3ae740e7544b48fb509b2b97f56615892 ]
Add SDX72 based modem Telit FN990B40, reusing FN920C04 configuration.
01:00.0 Unassigned class [ff00]: Qualcomm Device 0309
        Subsystem: Device 1c5d:201a
Signed-off-by: Daniele Palmas dnlplm@gmail.com
[mani: added sdx72 in the comment to identify the chipset]
Signed-off-by: Manivannan Sadhasivam mani@kernel.org
Link: https://patch.msgid.link/20250716091836.999364-1-dnlplm@gmail.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's the extensive analysis:
## Analysis of the Commit
The commit adds support for the Telit FN990B40 modem, which is an SDX72-based device. The changes are:
1. **Addition of a new device info structure** (`mhi_telit_fn990b40_info`) that reuses the existing `modem_telit_fn920c04_config` configuration
2. **Addition of a PCI device ID entry** in the `mhi_pci_id_table` for the new hardware (PCI ID 0x0309, subsystem 0x1c5d:0x201a)
## Rationale for Backporting
1. **Meets stable kernel rules**: According to Documentation/process/stable-kernel-rules.rst line 15, patches that "just add a device ID" are explicitly allowed in stable trees.
2. **Small and contained change**: The patch adds only 13 lines according to the diffstat (well under the 100-line limit) and is purely additive - it doesn't modify any existing functionality.
3. **Hardware enablement**: This enables users with Telit FN990B40 modems to use their hardware on stable kernel versions. Without this patch, the modem won't be recognized by the kernel.
4. **Low risk**: The change reuses an existing configuration (`modem_telit_fn920c04_config`), which minimizes the risk of introducing bugs. The new device entry only affects systems with this specific hardware.
5. **Similar patches pattern**: Looking at the commit history, similar hardware enablement commits for MHI modems (like commit 6348f62ef7ec for Telit FN920C04 and commit 0724869ede9c for Telit FE990) follow the same pattern of adding device IDs and configurations.
6. **No architectural changes**: This is purely a device ID addition with no changes to the driver's core functionality or architecture.
The commit perfectly fits the stable kernel criteria as a simple device ID addition that enables hardware support without any risk to existing functionality.
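One detail visible in the diff is that the subsystem-specific entry is placed before the generic 0x0309 entry. The sketch below models a first-match table walk to show why that ordering matters; it is a simplified userspace model, not the PCI core. `struct id_entry`, `match_first()` and the label strings are hypothetical, and 0x17cb is assumed to be the Qualcomm vendor ID behind `PCI_VENDOR_ID_QCOM`.

```c
#include <stddef.h>
#include <stdint.h>

#define ANY_ID	0xffffffffu	/* hypothetical stand-in for PCI_ANY_ID */

struct id_entry {
	uint32_t vendor, device, subvendor, subdevice;
	const char *label;	/* illustrative label, not a driver name */
};

/* Specific subsystem match first, generic device match second. */
static const struct id_entry id_table[] = {
	{ 0x17cb, 0x0309, 0x1c5d, 0x201a, "telit-fn990b40" },
	{ 0x17cb, 0x0309, ANY_ID, ANY_ID, "generic-0309" },
};

/* Returns the label of the first matching entry, or NULL if none match. */
const char *match_first(uint32_t vendor, uint32_t device,
			uint32_t subvendor, uint32_t subdevice)
{
	for (size_t i = 0; i < sizeof(id_table) / sizeof(id_table[0]); i++) {
		const struct id_entry *e = &id_table[i];

		if (e->vendor != vendor || e->device != device)
			continue;
		if (e->subvendor != ANY_ID && e->subvendor != subvendor)
			continue;
		if (e->subdevice != ANY_ID && e->subdevice != subdevice)
			continue;
		return e->label;
	}
	return NULL;
}
```

If the two entries were swapped in this model, the wildcard entry would match the Telit subsystem IDs first and the specific entry would never be reached.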
 drivers/bus/mhi/host/pci_generic.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/bus/mhi/host/pci_generic.c b/drivers/bus/mhi/host/pci_generic.c
index 589cb6722316..3fde90fe660d 100644
--- a/drivers/bus/mhi/host/pci_generic.c
+++ b/drivers/bus/mhi/host/pci_generic.c
@@ -818,6 +818,16 @@ static const struct mhi_pci_dev_info mhi_telit_fn920c04_info = {
 	.edl_trigger = true,
 };
 
+static const struct mhi_pci_dev_info mhi_telit_fn990b40_info = {
+	.name = "telit-fn990b40",
+	.config = &modem_telit_fn920c04_config,
+	.bar_num = MHI_PCI_DEFAULT_BAR_NUM,
+	.dma_data_width = 32,
+	.sideband_wake = false,
+	.mru_default = 32768,
+	.edl_trigger = true,
+};
+
 static const struct mhi_pci_dev_info mhi_netprisma_lcur57_info = {
 	.name = "netprisma-lcur57",
 	.edl = "qcom/prog_firehose_sdx24.mbn",
@@ -865,6 +875,9 @@ static const struct pci_device_id mhi_pci_id_table[] = {
 		.driver_data = (kernel_ulong_t) &mhi_telit_fe990a_info },
 	{ PCI_DEVICE(PCI_VENDOR_ID_QCOM, 0x0308),
 		.driver_data = (kernel_ulong_t) &mhi_qcom_sdx65_info },
+	/* Telit FN990B40 (sdx72) */
+	{ PCI_DEVICE_SUB(PCI_VENDOR_ID_QCOM, 0x0309, 0x1c5d, 0x201a),
+		.driver_data = (kernel_ulong_t) &mhi_telit_fn990b40_info },
 	{ PCI_DEVICE(PCI_VENDOR_ID_QCOM, 0x0309),
 		.driver_data = (kernel_ulong_t) &mhi_qcom_sdx75_info },
 	/* QDU100, x100-DU */
From: Romain Gantois romain.gantois@bootlin.com
[ Upstream commit d511206dc7443120637efd9cfa3ab06a26da33dd ]
The regulator_set_voltage() function may exhibit unexpected behavior if the target regulator has a maximum voltage step constraint. With such a constraint, the regulator core may clamp the requested voltage to a lesser value, to ensure that the voltage delta stays under the specified limit.
This means that the resulting regulator voltage depends on the current voltage, as well as the requested range, which invalidates the assumption that a repeated request for a specific voltage range will amount to a noop.
Considering the case of a regulator with a maximum voltage step constraint of 1V:
initial voltage: 2.5V
consumer requests 4V
expected result: 3.5V
resulting voltage: 3.5V

consumer requests 4V again
expected result: 4V
actual result: 3.5V
Correct this by repeating attempts to balance the regulator voltage until the result converges.
Signed-off-by: Romain Gantois romain.gantois@bootlin.com
Link: https://patch.msgid.link/20250718-regulator-stepping-v2-1-e28c9ac5d54a@bootl...
Signed-off-by: Mark Brown broonie@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit fixes a legitimate bug in the voltage regulator subsystem that affects users with stepped voltage regulators. Here's my extensive analysis:
## Bug Description and Impact
The commit addresses a bug where regulators with `max_uV_step` constraints fail to reach their target voltage when the same voltage range is requested multiple times. The bug manifests as follows:
1. When a regulator has a maximum voltage step constraint (e.g., 1V), the core limits each voltage change to stay within this step
2. The optimization at line 3814 (`if (voltage->min_uV == min_uV && voltage->max_uV == max_uV) goto out;`) assumes repeated requests are no-ops
3. This creates a situation where requesting 4V twice from 2.5V only reaches 3.5V, not the desired 4V
## Code Analysis
The fix adds a retry mechanism specifically for stepped regulators:
1. **New helper function** `regulator_get_voltage_delta()` (lines 3800-3808): Calculates the absolute difference between current and target voltage
2. **Retry loop** (lines 3865-3893): After the initial voltage setting, if `max_uV_step` is configured, it:
   - Checks whether the target voltage has not yet been reached (delta > 0)
   - Repeatedly calls `regulator_balance_voltage()` until convergence
   - Includes convergence protection to avoid infinite loops (line 3888)
## Why This Is a Good Backport Candidate
1. **Fixes a real bug**: Users with stepped voltage regulators cannot reach target voltages, potentially causing system instability or device malfunction
2. **Minimally invasive changes**: The fix is well-contained within `regulator_set_voltage_unlocked()` and only affects regulators with `max_uV_step` constraints
3. **No API/ABI changes**: Only internal implementation changes, no external interfaces modified
4. **Low regression risk**: The new code only executes for regulators with `max_uV_step` set
5. **Safety checks included**: The convergence check prevents infinite loops (`if (new_delta - delta > rdev->constraints->max_uV_step)`)
## Specific Code References
- The bug is in the optimization at drivers/regulator/core.c:3814-3815
- The fix adds retry logic at drivers/regulator/core.c:3865-3893
- Only affects regulators where `rdev->constraints->max_uV_step > 0`
- The existing `regulator_limit_voltage_step()` function already handles the step limiting logic
This is exactly the type of fix that belongs in stable: it addresses a specific functional bug without introducing new features or architectural changes.
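The 2.5V to 4V example from the commit message can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `model_step_toward()` and `model_converge()` are hypothetical helpers, a single balancing pass is assumed to move the output by at most one step, and the bail-out is a plain "no progress" check rather than the exact `new_delta - delta > max_uV_step` test used in the patch.

```c
#include <stdlib.h>

/*
 * Hypothetical model of one balancing pass: move toward the target,
 * but by no more than max_step_uV.
 */
static int model_step_toward(int current_uV, int target_uV, int max_step_uV)
{
	int diff = target_uV - current_uV;

	if (diff > max_step_uV)
		return current_uV + max_step_uV;
	if (diff < -max_step_uV)
		return current_uV - max_step_uV;
	return target_uV;
}

/*
 * Repeat passes until the delta to the target is zero.
 * Returns the number of passes taken, or -1 if a pass made no progress.
 */
static int model_converge(int *current_uV, int target_uV, int max_step_uV)
{
	int delta = abs(*current_uV - target_uV);
	int passes = 0;

	while (delta > 0) {
		int new_delta;

		*current_uV = model_step_toward(*current_uV, target_uV,
						max_step_uV);
		new_delta = abs(*current_uV - target_uV);
		passes++;

		/* Simplified guard: stop if this pass did not get closer. */
		if (new_delta >= delta)
			return -1;

		delta = new_delta;
	}

	return passes;
}
```

Starting from 2500000 uV with a 1000000 uV step and a 4000000 uV target, the model needs two passes (3.5V, then 4V), which matches the behaviour the commit message says a single request fails to deliver.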
 drivers/regulator/core.c | 43 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index cbd6d53ebfb5..8ed9b96518cf 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -3797,6 +3797,16 @@ static int _regulator_do_set_suspend_voltage(struct regulator_dev *rdev,
 	return 0;
 }
 
+static int regulator_get_voltage_delta(struct regulator_dev *rdev, int uV)
+{
+	int current_uV = regulator_get_voltage_rdev(rdev);
+
+	if (current_uV < 0)
+		return current_uV;
+
+	return abs(current_uV - uV);
+}
+
 static int regulator_set_voltage_unlocked(struct regulator *regulator,
 					  int min_uV, int max_uV,
 					  suspend_state_t state)
@@ -3804,8 +3814,8 @@ static int regulator_set_voltage_unlocked(struct regulator *regulator,
 	struct regulator_dev *rdev = regulator->rdev;
 	struct regulator_voltage *voltage = &regulator->voltage[state];
 	int ret = 0;
+	int current_uV, delta, new_delta;
 	int old_min_uV, old_max_uV;
-	int current_uV;
 
 	/* If we're setting the same range as last time the change
 	 * should be a noop (some cpufreq implementations use the same
@@ -3852,6 +3862,37 @@ static int regulator_set_voltage_unlocked(struct regulator *regulator,
 		voltage->max_uV = old_max_uV;
 	}
 
+	if (rdev->constraints->max_uV_step > 0) {
+		/* For regulators with a maximum voltage step, reaching the desired
+		 * voltage might take a few retries.
+		 */
+		ret = regulator_get_voltage_delta(rdev, min_uV);
+		if (ret < 0)
+			goto out;
+
+		delta = ret;
+
+		while (delta > 0) {
+			ret = regulator_balance_voltage(rdev, state);
+			if (ret < 0)
+				goto out;
+
+			ret = regulator_get_voltage_delta(rdev, min_uV);
+			if (ret < 0)
+				goto out;
+
+			new_delta = ret;
+
+			/* check that voltage is converging quickly enough */
+			if (new_delta - delta > rdev->constraints->max_uV_step) {
+				ret = -EWOULDBLOCK;
+				goto out;
+			}
+
+			delta = new_delta;
+		}
+	}
+
 out:
 	return ret;
 }
On Sun, Aug 03, 2025 at 08:22:18PM -0400, Sasha Levin wrote:
> From: Romain Gantois romain.gantois@bootlin.com
>
> [ Upstream commit d511206dc7443120637efd9cfa3ab06a26da33dd ]
>
> The regulator_set_voltage() function may exhibit unexpected behavior if the target regulator has a maximum voltage step constraint. With such a constraint, the regulator core may clamp the requested voltage to a lesser value, to ensure that the voltage delta stays under the specified limit.
This needs a followup fix which isn't in mainline yet.
From: Vivek Pernamitta quic_vpernami@quicinc.com
[ Upstream commit 0494cf9793b7c250f63fdb2cb6b648473e9d4ae6 ]
The QDU100 device does not support the MHI M3 state, necessitating the disabling of runtime PM for this device. It is essential to disable runtime PM if the device does not support M3 state.
Signed-off-by: Vivek Pernamitta quic_vpernami@quicinc.com
[mani: Fixed the kdoc comment for no_m3]
Signed-off-by: Manivannan Sadhasivam mani@kernel.org
Reviewed-by: Krishna Chaitanya Chundru krishna.chundru@oss.qualcomm.com
Link: https://patch.msgid.link/20250425-vdev_next-20250411_pm_disable-v4-1-d4870a7...
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis, here is my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Bug Fix for Hardware Limitation**: The commit fixes a real issue where the QDU100 device does not support MHI M3 state (the suspend state), but runtime PM was being enabled for it. Looking at line 1478 in `mhi_pci_runtime_suspend()`, the driver attempts to transition to M3 state via `mhi_pm_suspend(mhi_cntrl)`. For a device that doesn't support M3, this would fail and prevent proper runtime suspend/resume operations.
2. **Small, Contained Fix**: The change is minimal and well-contained:
   - Adds a single boolean field `no_m3` to the device info structure
   - Sets it to true only for the QDU100 device configuration
   - Modifies the runtime PM enablement condition from checking only `pci_pme_capable(pdev, PCI_D3hot)` to also checking `!(info->no_m3)`

3. **Prevents Runtime PM Issues**: Without this fix, QDU100 devices would have runtime PM enabled but would fail to properly suspend, leading to:
   - Error messages ("failed to suspend device")
   - Potential power management issues
   - The device getting stuck in an inconsistent PM state
4. **Device-Specific Fix**: The fix is specific to QDU100 hardware that was recently added (commit 9241459b3cc2 in December 2024), meaning it affects users of this specific 5G RAN accelerator card.
5. **Clear Root Cause**: The commit message clearly explains that "The QDU100 device does not support the MHI M3 state" and that "It is essential to disable runtime PM if the device does not support M3 state."
6. **No Architectural Changes**: This is a straightforward hardware quirk handling - no new features, no API changes, just preventing runtime PM for a device that cannot support it.
The fix follows stable kernel rules by addressing a real bug that affects users (runtime PM failures on QDU100 devices), is small and self-contained, and has minimal risk of regression since it only affects the specific QDU100 device configuration.
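The changed condition can be illustrated with a tiny userspace model. This is a sketch, not driver code: `model_dev_info` and `model_runtime_pm_allowed()` are hypothetical names, and the boolean `pme_capable_d3hot` stands in for the result of `pci_pme_capable(pdev, PCI_D3hot)`.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for struct mhi_pci_dev_info. */
struct model_dev_info {
	const char *name;
	bool no_m3; /* true if the device cannot enter the MHI M3 state */
};

/*
 * Runtime suspend is only worth enabling when the device can wake the
 * host via PME from D3hot and can actually enter M3.
 */
static bool model_runtime_pm_allowed(bool pme_capable_d3hot,
				     const struct model_dev_info *info)
{
	return pme_capable_d3hot && !info->no_m3;
}
```

With `no_m3` left at its default of false, every existing device entry behaves as before; only an entry that opts in, as QDU100 does in the patch, is excluded from runtime PM.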
 drivers/bus/mhi/host/pci_generic.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/bus/mhi/host/pci_generic.c b/drivers/bus/mhi/host/pci_generic.c
index 3fde90fe660d..9fc1f014767a 100644
--- a/drivers/bus/mhi/host/pci_generic.c
+++ b/drivers/bus/mhi/host/pci_generic.c
@@ -43,6 +43,7 @@
  * @mru_default: default MRU size for MBIM network packets
  * @sideband_wake: Devices using dedicated sideband GPIO for wakeup instead
  *		   of inband wake support (such as sdx24)
+ * @no_m3: M3 not supported
  */
 struct mhi_pci_dev_info {
 	const struct mhi_controller_config *config;
@@ -54,6 +55,7 @@ struct mhi_pci_dev_info {
 	unsigned int dma_data_width;
 	unsigned int mru_default;
 	bool sideband_wake;
+	bool no_m3;
 };
 
 #define MHI_CHANNEL_CONFIG_UL(ch_num, ch_name, el_count, ev_ring) \
@@ -295,6 +297,7 @@ static const struct mhi_pci_dev_info mhi_qcom_qdu100_info = {
 	.bar_num = MHI_PCI_DEFAULT_BAR_NUM,
 	.dma_data_width = 32,
 	.sideband_wake = false,
+	.no_m3 = true,
 };
 
 static const struct mhi_channel_config mhi_qcom_sa8775p_channels[] = {
@@ -1319,8 +1322,8 @@ static int mhi_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	/* start health check */
 	mod_timer(&mhi_pdev->health_check_timer, jiffies + HEALTH_CHECK_PERIOD);
 
-	/* Only allow runtime-suspend if PME capable (for wakeup) */
-	if (pci_pme_capable(pdev, PCI_D3hot)) {
+	/* Allow runtime suspend only if both PME from D3Hot and M3 are supported */
+	if (pci_pme_capable(pdev, PCI_D3hot) && !(info->no_m3)) {
 		pm_runtime_set_autosuspend_delay(&pdev->dev, 2000);
 		pm_runtime_use_autosuspend(&pdev->dev);
 		pm_runtime_mark_last_busy(&pdev->dev);
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 65fc0fc137b5da3ee1f4ca4f61050fcb203d7582 ]
When a USB4 dock is unplugged from a system it won't respond to ring events. The PCI core handles the surprise removal event and notifies all PCI drivers. The XHCI PCI driver sets a flag that the device is being removed, and when the device stops responding a flag is also added to indicate it's dying.
When that flag is set don't bother to show warnings about a missing controller.
Signed-off-by: Mario Limonciello mario.limonciello@amd.com
Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com
Acked-by: Mathias Nyman mathias.nyman@linux.intel.com
Link: https://lore.kernel.org/r/20250717073107.488599-3-mathias.nyman@linux.intel....
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of this commit, here is my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real user-visible bug**: The commit addresses spurious warning messages that appear when USB4 docks are unplugged. When a USB4 dock is removed, the controller becomes inaccessible (reads return 0xffffffff), but the driver still attempts operations and logs warnings that are misleading to users.
2. **Small and contained fix**: The changes are minimal - only adding conditional checks before logging warnings in two functions (`xhci_halt()` and `xhci_reset()`). The fix simply checks if the `XHCI_STATE_DYING` flag is set before printing warnings.
3. **Low risk of regression**: The changes only affect warning messages, not functional behavior. The actual error paths and return values remain unchanged. This is a purely cosmetic fix that improves the user experience by avoiding confusing warnings during expected hardware removal scenarios.
4. **Clear problem and solution**: The commit message clearly describes the issue (USB4 dock removal causes warnings) and the solution (suppress warnings when controller is dying). The code changes directly implement this solution.
5. **Affects common hardware scenarios**: USB4 docks are becoming increasingly common, and surprise removal is a normal use case that shouldn't generate alarming warnings in system logs.
6. **Follows stable tree criteria**: This is a bug fix that improves user experience without introducing new features or making architectural changes. It's exactly the type of fix that stable trees are meant to include.
The code changes show:
- In `xhci_halt()`: Added check `if (!(xhci->xhc_state & XHCI_STATE_DYING))` before `xhci_warn()`
- In `xhci_reset()`: Added the same check before another `xhci_warn()`
Both changes prevent warning messages when the controller is already marked as dying, which is the expected state during USB4 dock removal. This is a classic example of a stable-worthy fix that improves system behavior without functional changes.
 drivers/usb/host/xhci.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 8a819e853288..47151ca527bf 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -121,7 +121,8 @@ int xhci_halt(struct xhci_hcd *xhci)
 	ret = xhci_handshake(&xhci->op_regs->status,
 			STS_HALT, STS_HALT, XHCI_MAX_HALT_USEC);
 	if (ret) {
-		xhci_warn(xhci, "Host halt failed, %d\n", ret);
+		if (!(xhci->xhc_state & XHCI_STATE_DYING))
+			xhci_warn(xhci, "Host halt failed, %d\n", ret);
 		return ret;
 	}
 
@@ -180,7 +181,8 @@ int xhci_reset(struct xhci_hcd *xhci, u64 timeout_us)
 	state = readl(&xhci->op_regs->status);
 
 	if (state == ~(u32)0) {
-		xhci_warn(xhci, "Host not accessible, reset failed.\n");
+		if (!(xhci->xhc_state & XHCI_STATE_DYING))
+			xhci_warn(xhci, "Host not accessible, reset failed.\n");
 		return -ENODEV;
 	}
From: Jay Chen shawn2000100@gmail.com
[ Upstream commit f72b9aa821a2bfe4b6dfec4be19f264d0673b008 ]
There is a subtle contradiction between sections of the xHCI 1.2 spec regarding the initialization of Input Endpoint Context fields. Section 4.8.2 ("Endpoint Context Initialization") states that all fields should be initialized to 0. However, Section 6.2.3 ("Endpoint Context", p.453) specifies that the Average TRB Length (avg_trb_len) field shall be greater than 0, and explicitly notes (p.454): "Software shall set Average TRB Length to '8' for control endpoints."
Strictly setting all fields to 0 during initialization conflicts with the specific recommendation for control endpoints. In practice, setting avg_trb_len = 0 is not meaningful for the hardware/firmware, as the value is used for bandwidth calculation.
Motivation: Our company is developing a custom Virtual xHC hardware platform that strictly follows the xHCI spec and its recommendations. During validation, we observed that enumeration fails and a parameter error (TRB Completion Code = 5) is reported if avg_trb_len for EP0 is not set to 8 as recommended by Section 6.2.3. This demonstrates the importance of assigning a meaningful, non-zero value to avg_trb_len, even in virtualized or emulated environments.
This patch explicitly sets avg_trb_len to 8 for EP0 in xhci_setup_addressable_virt_dev(), as recommended in Section 6.2.3, to prevent potential issues with xHCI host controllers that enforce the spec strictly.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220033
Signed-off-by: Jay Chen shawn2000100@gmail.com
Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com
Link: https://lore.kernel.org/r/20250717073107.488599-4-mathias.nyman@linux.intel....
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and code, here's my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real bug**: The commit addresses a specification compliance issue where setting `avg_trb_len = 0` for EP0 (control endpoint) violates the xHCI 1.2 specification. Section 6.2.3 explicitly states that Average TRB Length shall be greater than 0 and specifically recommends setting it to 8 for control endpoints.
2. **Small and contained fix**: The change is minimal - just adding a single line:
   ```c
   ep0_ctx->tx_info = cpu_to_le32(EP_AVG_TRB_LENGTH(8));
   ```
   This sets the average TRB length field to 8 as recommended by the spec.
3. **Prevents hardware failures**: The commit message indicates this causes actual enumeration failures with parameter errors (TRB Completion Code = 5) on hardware that strictly follows the xHCI specification. This means real devices can fail to enumerate without this fix.
4. **No architectural changes**: This is a simple initialization fix that doesn't change any architectural aspects of the driver. It only ensures proper initialization of a field that was previously left at 0.
5. **Low regression risk**: Setting avg_trb_len to 8 for control endpoints follows the xHCI specification recommendation. Existing hardware that doesn't strictly check this value will continue to work, while hardware that does enforce the spec will now work correctly.
6. **Clear bug with clear fix**: The contradiction between spec sections is well-documented in the commit message, and the fix directly addresses this by following the more specific recommendation for control endpoints.
The commit fixes a specification compliance bug that can cause real hardware failures during USB device enumeration. The fix is minimal, follows the xHCI specification, and has very low risk of causing regressions, making it an ideal candidate for stable backporting.
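As a rough illustration of what the added line encodes, the sketch below builds the `tx_info` dword for EP0 in userspace. It assumes the Average TRB Length field occupies the low 16 bits of that dword; `MODEL_EP_AVG_TRB_LENGTH`, `model_ep0_tx_info()` and `model_avg_trb_len_valid()` are hypothetical names, and the `cpu_to_le32()` byte-order conversion done by the driver is left out.

```c
#include <stdint.h>

/* Assumption: Average TRB Length lives in bits 15:0 of the tx_info dword. */
#define MODEL_EP_AVG_TRB_LENGTH(p) ((uint32_t)(p) & 0xffff)

/* Value written for the control endpoint, per the Section 6.2.3 recommendation. */
static uint32_t model_ep0_tx_info(void)
{
	return MODEL_EP_AVG_TRB_LENGTH(8);
}

/* Models a strict controller that rejects a zero Average TRB Length. */
static int model_avg_trb_len_valid(uint32_t tx_info)
{
	return (tx_info & 0xffff) > 0;
}
```

A context left at its zero-initialized value fails the model's check, while the value from `model_ep0_tx_info()` passes, which mirrors the parameter error described in the commit message.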
 drivers/usb/host/xhci-mem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 6680afa4f596..07289333a1e8 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -1195,6 +1195,8 @@ int xhci_setup_addressable_virt_dev(struct xhci_hcd *xhci, struct usb_device *ud
 	ep0_ctx->deq = cpu_to_le64(dev->eps[0].ring->first_seg->dma |
 				   dev->eps[0].ring->cycle_state);
 
+	ep0_ctx->tx_info = cpu_to_le32(EP_AVG_TRB_LENGTH(8));
+
 	trace_xhci_setup_addressable_virt_device(dev);
 
 	/* Steps 7 and 8 were done in xhci_alloc_virt_device() */
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 4b9c60e440525b729ac5f071e00bcee12e0a7e84 ]
When a USB4 dock is unplugged from a system it won't respond to ring events. The PCI core handles the surprise removal event and notifies all PCI drivers. The XHCI PCI driver sets a flag that the device is being removed as well.
When that flag is set don't show messages in the cleanup path for marking the controller dead.
Signed-off-by: Mario Limonciello mario.limonciello@amd.com
Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com
Acked-by: Mathias Nyman mathias.nyman@linux.intel.com
Link: https://lore.kernel.org/r/20250717073107.488599-2-mathias.nyman@linux.intel....
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Now let me analyze this commit properly:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a user-visible issue**: The commit addresses spurious error messages that appear when USB4 docks are unplugged. The original code shows `xhci_err(xhci, "xHCI host controller not responding, assume dead\n")` even during expected surprise removals, which can confuse users and fill system logs unnecessarily.
2. **Small and contained fix**: The change is minimal - it only adds a local `notify` variable and conditionalizes two operations based on whether the device is being removed:
   - The error message printing (line 1384 → lines 1379-1381)
   - The `usb_hc_died()` notification (reusing the same condition)
3. **No architectural changes**: This is a simple behavioral fix that doesn't modify any data structures, APIs, or core functionality. It merely suppresses error messages during an expected removal scenario.
4. **Clear bug fix**: The commit clearly fixes an issue where error messages are shown during normal USB4 dock removal operations. When `XHCI_STATE_REMOVING` is set (indicating PCI removal is in progress), the error message is now suppressed since it's an expected condition.
5. **Low risk of regression**: The change only affects logging behavior and maintains the same functional flow. The `usb_hc_died()` call was already conditional on `!XHCI_STATE_REMOVING`, so this commit just applies the same logic to the error message.
6. **Improves user experience**: USB4/Thunderbolt docks are increasingly common, and users frequently unplug them. Avoiding spurious error messages during normal operations is important for user experience and log clarity.
The commit follows stable tree rules by being a minimal fix for a real issue that affects users, without introducing new features or making risky changes to core functionality.
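The control flow of the patched `xhci_hc_died()` can be mimicked in a small userspace model. The sketch assumes the flag values `DYING = 1 << 0` and `REMOVING = 1 << 2`; `model_hc`, `model_hc_died()` and the two counters are hypothetical and only record whether the error message and the core notification would have happened.

```c
#include <stdbool.h>

/* Assumed flag values, modelled on the driver's xhc_state bits. */
#define MODEL_STATE_DYING    (1u << 0)
#define MODEL_STATE_REMOVING (1u << 2)

/* Hypothetical stand-in for the host controller state. */
struct model_hc {
	unsigned int state;
	int err_logged;    /* times the "not responding" error was printed */
	int core_notified; /* times the USB core was told the HC died */
};

static void model_hc_died(struct model_hc *hc)
{
	bool notify;

	/* Already marked dying: nothing more to do. */
	if (hc->state & MODEL_STATE_DYING)
		return;

	/* Sample the removing flag once and reuse it for both decisions. */
	notify = !(hc->state & MODEL_STATE_REMOVING);
	if (notify)
		hc->err_logged++;

	hc->state |= MODEL_STATE_DYING;

	if (notify)
		hc->core_notified++;
}
```

In this model a controller that is being removed is still marked dying, but neither the error message nor the core notification fires, and a second call is a no-op in both cases.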
 drivers/usb/host/xhci-ring.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 131e7530ec4a..ecd757d482c5 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1376,12 +1376,15 @@ static void xhci_kill_endpoint_urbs(struct xhci_hcd *xhci,
  */
 void xhci_hc_died(struct xhci_hcd *xhci)
 {
+	bool notify;
 	int i, j;
 
 	if (xhci->xhc_state & XHCI_STATE_DYING)
 		return;
 
-	xhci_err(xhci, "xHCI host controller not responding, assume dead\n");
+	notify = !(xhci->xhc_state & XHCI_STATE_REMOVING);
+	if (notify)
+		xhci_err(xhci, "xHCI host controller not responding, assume dead\n");
 	xhci->xhc_state |= XHCI_STATE_DYING;
 
 	xhci_cleanup_command_queue(xhci);
@@ -1395,7 +1398,7 @@ void xhci_hc_died(struct xhci_hcd *xhci)
 	}
 
 	/* inform usb core hc died if PCI remove isn't already handling it */
-	if (!(xhci->xhc_state & XHCI_STATE_REMOVING))
+	if (notify)
 		usb_hc_died(xhci_to_hcd(xhci));
 }
From: Bartosz Golaszewski bartosz.golaszewski@linaro.org
[ Upstream commit 87be3e7a2d0030cda6314d2ec96b37991f636ccd ]
Now that qcom_scm_shm_bridge_enable() uses the struct device passed to it as argument to make the QCOM_SCM_MP_SHM_BRIDGE_ENABLE SCM call, we can move the TZMem initialization before the assignment of the __scm pointer in the SCM driver (which marks SCM as ready to users) thus fixing the potential race between consumer calls and the memory pool initialization.
Reported-by: Johan Hovold johan+linaro@kernel.org
Closes: https://lore.kernel.org/all/20250120151000.13870-1-johan+linaro@kernel.org/
Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org
Link: https://lore.kernel.org/r/20250630-qcom-scm-race-v2-3-fa3851c98611@linaro.or...
Signed-off-by: Bjorn Andersson andersson@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
YES
This commit should be backported to stable kernel trees. The commit fixes a critical race condition in the Qualcomm SCM (Secure Channel Manager) driver that can lead to system crashes.
**Analysis of the race condition:**
Looking at the code changes, the commit moves the TrustZone memory (TZMem) pool initialization from **after** the SCM is marked as available (line 2254 in the original code: `smp_store_release(&__scm, scm)`) to **before** it. This is a critical ordering fix because:
1. **The race window**: Once `__scm` is assigned via `smp_store_release(&__scm, scm)` at line 2254, the SCM API becomes available to all kernel consumers through `qcom_scm_is_available()` which checks this pointer.
2. **The problem**: Between lines 2254-2298 in the original code, the SCM is marked as available but the TZMem pool (`__scm->mempool`) hasn't been initialized yet. If any SCM consumer makes an API call during this window that requires memory allocation from the TZMem pool, it will access an uninitialized pointer, causing a crash.
3. **The fix**: The commit moves the TZMem initialization (lines 2286-2298 in original) to lines 2253-2277 in the patched version, ensuring the memory pool is fully initialized before marking SCM as available.
**Why this qualifies for stable backport:**
1. **Fixes a real bug**: This addresses a genuine race condition that can cause kernel crashes, as reported by Johan Hovold.
2. **Security-critical subsystem**: The SCM driver handles secure communication with TrustZone firmware on Qualcomm platforms, making stability crucial.
3. **Small, contained fix**: The change is minimal - it simply reorders initialization steps without changing functionality or adding features.
4. **No architectural changes**: This is purely a bug fix that corrects initialization ordering.
5. **Clear problem and solution**: The race condition is well-defined, and the fix is straightforward and obvious.
6. **Minimal regression risk**: Moving initialization earlier in the probe sequence is a safe change that doesn't affect the driver's operation once initialized.
The commit message also references a specific bug report, indicating this is a real issue encountered in production, not a theoretical problem. For stable kernel trees supporting Qualcomm platforms, this fix prevents potential crashes during system initialization.
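The ordering rule behind the fix (finish initialization, then publish the pointer) can be sketched with C11 atomics in userspace. This is an illustration only: `model_scm`, `model_scm_ptr`, `model_probe()` and `model_get()` are hypothetical, and `atomic_store_explicit(..., memory_order_release)` and `atomic_load_explicit(..., memory_order_acquire)` stand in for the kernel's `smp_store_release()` and `smp_load_acquire()`.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the SCM driver state. */
struct model_scm {
	void *mempool;
};

/* Published pointer; NULL means "not available yet". */
static _Atomic(struct model_scm *) model_scm_ptr;

/*
 * Set up everything consumers rely on first, and publish the pointer
 * last. Any failure returns before the pointer becomes visible.
 */
static int model_probe(struct model_scm *scm, void *pool)
{
	if (!pool)
		return -1; /* pool setup failed: never publish */

	scm->mempool = pool;

	/* Release store: the mempool write is visible before the pointer. */
	atomic_store_explicit(&model_scm_ptr, scm, memory_order_release);
	return 0;
}

/* Consumer side: the acquire load pairs with the release store above. */
static struct model_scm *model_get(void)
{
	return atomic_load_explicit(&model_scm_ptr, memory_order_acquire);
}
```

A consumer that sees a non-NULL pointer from `model_get()` is then guaranteed to also see an initialized `mempool`, which is the property the reordering in the patch restores.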
 drivers/firmware/qcom/qcom_scm.c | 53 ++++++++++++++++----------------
 1 file changed, 26 insertions(+), 27 deletions(-)

diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
index f63b716be5b0..4e510eb70c79 100644
--- a/drivers/firmware/qcom/qcom_scm.c
+++ b/drivers/firmware/qcom/qcom_scm.c
@@ -2250,7 +2250,32 @@ static int qcom_scm_probe(struct platform_device *pdev)
 	if (ret)
 		return ret;
 
-	/* Paired with smp_load_acquire() in qcom_scm_is_available(). */
+	ret = of_reserved_mem_device_init(scm->dev);
+	if (ret && ret != -ENODEV)
+		return dev_err_probe(scm->dev, ret,
+				     "Failed to setup the reserved memory region for TZ mem\n");
+
+	ret = qcom_tzmem_enable(scm->dev);
+	if (ret)
+		return dev_err_probe(scm->dev, ret,
+				     "Failed to enable the TrustZone memory allocator\n");
+
+	memset(&pool_config, 0, sizeof(pool_config));
+	pool_config.initial_size = 0;
+	pool_config.policy = QCOM_TZMEM_POLICY_ON_DEMAND;
+	pool_config.max_size = SZ_256K;
+
+	scm->mempool = devm_qcom_tzmem_pool_new(scm->dev, &pool_config);
+	if (IS_ERR(scm->mempool))
+		return dev_err_probe(scm->dev, PTR_ERR(scm->mempool),
+				     "Failed to create the SCM memory pool\n");
+
+	/*
+	 * Paired with smp_load_acquire() in qcom_scm_is_available().
+	 *
+	 * This marks the SCM API as ready to accept user calls and can only
+	 * be called after the TrustZone memory pool is initialized.
+	 */
 	smp_store_release(&__scm, scm);
 
 	irq = platform_get_irq_optional(pdev, 0);
@@ -2283,32 +2308,6 @@ static int qcom_scm_probe(struct platform_device *pdev)
 	if (of_property_read_bool(pdev->dev.of_node, "qcom,sdi-enabled") || !download_mode)
 		qcom_scm_disable_sdi();
 
-	ret = of_reserved_mem_device_init(__scm->dev);
-	if (ret && ret != -ENODEV) {
-		dev_err_probe(__scm->dev, ret,
-			      "Failed to setup the reserved memory region for TZ mem\n");
-		goto err;
-	}
-
-	ret = qcom_tzmem_enable(__scm->dev);
-	if (ret) {
-		dev_err_probe(__scm->dev, ret,
-			      "Failed to enable the TrustZone memory allocator\n");
-		goto err;
-	}
-
-	memset(&pool_config, 0, sizeof(pool_config));
-	pool_config.initial_size = 0;
-	pool_config.policy = QCOM_TZMEM_POLICY_ON_DEMAND;
-	pool_config.max_size = SZ_256K;
-
-	__scm->mempool = devm_qcom_tzmem_pool_new(__scm->dev, &pool_config);
-	if (IS_ERR(__scm->mempool)) {
-		ret = dev_err_probe(__scm->dev, PTR_ERR(__scm->mempool),
-				    "Failed to create the SCM memory pool\n");
-		goto err;
-	}
-
 	/*
 	 * Initialize the QSEECOM interface.
 	 *
From: Maulik Shah maulik.shah@oss.qualcomm.com
[ Upstream commit 84684c57c9cd47b86c883a7170dd68222d97ef13 ]
Register offsets for v3 and v4 versions are backward compatible. Assign v3 offsets for v4 and all higher versions to avoid ending up using v2 offsets.
Signed-off-by: Maulik Shah maulik.shah@oss.qualcomm.com
Reviewed-by: Konrad Dybcio konrad.dybcio@oss.qualcomm.com
Reviewed-by: Neil Armstrong neil.armstrong@linaro.org
Link: https://lore.kernel.org/r/20250623-rsc_v4-v1-1-275b27bc5e3c@oss.qualcomm.com
Signed-off-by: Bjorn Andersson andersson@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit, here is my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Bug Fix Nature**: The commit fixes a real bug where RSC version 4 devices would incorrectly use version 2.7 register offsets instead of version 3.0 offsets. The code change shows that before this fix, only `drv->ver.major == 3` would select v3 offsets, meaning version 4 (and any higher versions) would fall through to use v2.7 offsets, which are incompatible.
2. **Clear Register Offset Incompatibility**: Looking at the register offset arrays in drivers/soc/qcom/rpmh-rsc.c:
   - v2.7 offsets: RSC_DRV_CMD_OFFSET = 20, RSC_DRV_CMD_WAIT_FOR_CMPL = 0x10
   - v3.0 offsets: RSC_DRV_CMD_OFFSET = 24, RSC_DRV_CMD_WAIT_FOR_CMPL = 0x20
These are significantly different offsets that would cause incorrect hardware register access on v4 devices.
3. **Small and Contained Fix**: The change is minimal - just changing `if (drv->ver.major == 3)` to `if (drv->ver.major >= 3)`. This is a one-line logic fix that ensures v4 and higher versions use the correct v3.0-compatible register offsets.
4. **Hardware Enablement for Existing Devices**: This is not adding new features but fixing broken support for hardware that already exists. Without this fix, any Qualcomm SoC with RSC v4 would malfunction when trying to use RPMH (Resource Power Manager Hardened) functionality.
5. **Low Risk**: The change only affects devices with RSC version 4 or higher. It doesn't modify behavior for existing v2 or v3 devices. The commit message explicitly states that "Register offsets for v3 and v4 versions are backward compatible."
6. **Critical Subsystem**: RPMH-RSC is a critical component for power management on Qualcomm SoCs. Incorrect register access could lead to system instability, power management failures, or complete inability to boot on affected devices.
The commit follows stable kernel rules by being a targeted bug fix that enables proper hardware support without architectural changes or new features.
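The effect of the one-character change is easy to see in a userspace model of the selection logic. The names below (`model_regs`, `model_select_old()`, `model_select_new()`) are hypothetical; the enum values merely label which offset table the driver would pick.

```c
/* Labels for the two register offset tables the driver chooses between. */
enum model_regs {
	MODEL_REGS_V2_7,
	MODEL_REGS_V3_0,
};

/* Selection before the patch: only an exact major version of 3 gets v3.0. */
static enum model_regs model_select_old(unsigned int major)
{
	return major == 3 ? MODEL_REGS_V3_0 : MODEL_REGS_V2_7;
}

/* Selection after the patch: major version 3 and anything newer gets v3.0. */
static enum model_regs model_select_new(unsigned int major)
{
	return major >= 3 ? MODEL_REGS_V3_0 : MODEL_REGS_V2_7;
}
```

For major version 4 the old selection falls back to the v2.7 table while the new one picks v3.0; versions 2 and 3 select the same table either way.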
drivers/soc/qcom/rpmh-rsc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soc/qcom/rpmh-rsc.c b/drivers/soc/qcom/rpmh-rsc.c index cb82e887b51d..fdab2b1067db 100644 --- a/drivers/soc/qcom/rpmh-rsc.c +++ b/drivers/soc/qcom/rpmh-rsc.c @@ -1072,7 +1072,7 @@ static int rpmh_rsc_probe(struct platform_device *pdev) drv->ver.minor = rsc_id & (MINOR_VER_MASK << MINOR_VER_SHIFT); drv->ver.minor >>= MINOR_VER_SHIFT;
- if (drv->ver.major == 3) + if (drv->ver.major >= 3) drv->regs = rpmh_rsc_reg_offset_ver_3_0; else drv->regs = rpmh_rsc_reg_offset_ver_2_7;
From: Shuai Xue xueshuai@linux.alibaba.com
[ Upstream commit 79a5ae3c4c5eb7e38e0ebe4d6bf602d296080060 ]
If a synchronous error is detected as a result of user-space process triggering a 2-bit uncorrected error, the CPU will take a synchronous error exception such as Synchronous External Abort (SEA) on Arm64. The kernel will queue a memory_failure() work which poisons the related page, unmaps the page, and then sends a SIGBUS to the process, so that a system wide panic can be avoided.
However, no memory_failure() work will be queued when abnormal synchronous errors occur. These errors can include situations like invalid PA, unexpected severity, no memory failure config support, invalid GUID section, etc. In such a case, the user-space process will trigger SEA again. This loop can potentially exceed the platform firmware threshold or even trigger a kernel hard lockup, leading to a system reboot.
Fix it by performing a force kill if no memory_failure() work is queued for synchronous errors.
Signed-off-by: Shuai Xue xueshuai@linux.alibaba.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Reviewed-by: Yazen Ghannam yazen.ghannam@amd.com Reviewed-by: Jane Chu jane.chu@oracle.com Reviewed-by: Hanjun Guo guohanjun@huawei.com Link: https://patch.msgid.link/20250714114212.31660-2-xueshuai@linux.alibaba.com [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Critical Bug Fix**: The commit fixes a serious issue where synchronous memory errors that cannot be recovered (abnormal synchronous errors) would cause an infinite loop. The commit message clearly describes how "the user-space process will trigger SEA again. This loop can potentially exceed the platform firmware threshold or even trigger a kernel hard lockup, leading to a system reboot."
2. **Security and Stability Impact**: The bug can lead to: - Infinite exception loops - Platform firmware threshold exceeded - Kernel hard lockups - Unexpected system reboots
These are severe stability issues that affect system reliability.
3. **Small and Contained Fix**: The code change is minimal and well-contained:
```c
+	/*
+	 * If no memory failure work is queued for abnormal synchronous
+	 * errors, do a force kill.
+	 */
+	if (sync && !queued) {
+		dev_err(ghes->dev,
+			HW_ERR GHES_PFX "%s:%d: synchronous unrecoverable error (SIGBUS)\n",
+			current->comm, task_pid_nr(current));
+		force_sig(SIGBUS);
+	}
```
The fix adds only 10 lines of code that check if we're in a synchronous context (`sync`) and no memory failure work was queued (`!queued`), then sends SIGBUS to the current process.
4. **Clear Problem and Solution**: The commit addresses a specific gap in error handling. When `ghes_handle_memory_failure()` returns false (meaning no memory_failure() work was queued) for synchronous errors, the process that triggered the error continues execution and will hit the same error again, creating an infinite loop.
5. **Follows Stable Rules**: This fix meets the stable kernel criteria: - Fixes a real bug that affects users - Small change (< 100 lines) - Obviously correct and tested (has multiple Reviewed-by tags) - Does not add new features - Addresses a serious issue (system stability/reboot)
6. **Related to Previous Work**: This appears to be part of a series addressing synchronous error handling issues in GHES. The commit c1f1fda14137 mentioned in the git log shows ongoing work to properly handle synchronous exceptions, and this commit addresses a critical gap where abnormal synchronous errors weren't being handled at all.
The fix ensures that when a synchronous memory error cannot be properly handled through the normal memory_failure() path, the kernel will at least terminate the offending process with SIGBUS rather than allowing it to continue and create an infinite exception loop that can crash the system.
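The decision added at the end of `ghes_do_proc()` can be modeled as a small pure function. This is only a sketch under simplifying assumptions: the `demo_` names and the three-value result are hypothetical, and the real code calls `force_sig(SIGBUS)` rather than returning a value.

```c
#include <stdbool.h>

/* Possible outcomes of handling one error record (hypothetical model). */
enum demo_action {
	DEMO_NONE,            /* nothing further to do */
	DEMO_RECOVERY_QUEUED, /* memory_failure() work will handle the page */
	DEMO_FORCE_KILL       /* stands in for force_sig(SIGBUS) */
};

/*
 * sync:   the error was raised synchronously by the current task
 * queued: memory_failure() work was queued for the faulting page
 */
enum demo_action demo_ghes_decide(bool sync, bool queued)
{
	if (queued)
		return DEMO_RECOVERY_QUEUED;
	if (sync)
		return DEMO_FORCE_KILL;
	return DEMO_NONE;
}
```

The case the patch targets is `sync == true` with `queued == false`: before the fix nothing happened there, so the task would re-execute the faulting access.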
drivers/acpi/apei/ghes.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index f0584ccad451..281a0a2f6730 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -902,6 +902,17 @@ static bool ghes_do_proc(struct ghes *ghes, } }
+ /* + * If no memory failure work is queued for abnormal synchronous + * errors, do a force kill. + */ + if (sync && !queued) { + dev_err(ghes->dev, + HW_ERR GHES_PFX "%s:%d: synchronous unrecoverable error (SIGBUS)\n", + current->comm, task_pid_nr(current)); + force_sig(SIGBUS); + } + return queued; }
From: Hiago De Franco hiago.franco@toradex.com
[ Upstream commit 496deecb020d14ba89ba7084fbc3024f91687023 ]
For the i.MX8X and i.MX8 family SoCs, when the Cortex-M core is powered up and started by the Cortex-A core using the bootloader (e.g., via the U-Boot bootaux command), both M-core and Linux run within the same SCFW (System Controller Firmware) partition. With that, Linux has permission to control the M-core.
But once the M-core is started by the bootloader, the SCFW automatically enables its clock and sets the clock rate. If Linux later attempts to enable the same clock via clk_prepare_enable(), the SCFW returns a 'LOCKED' error, as the clock is already configured by the SCFW. This causes the probe function in imx_rproc.c to fail, leading to the M-core power domain being shut down while the core is still running. This results in a fault from the SCU (System Controller Unit) and triggers a system reset.
To address this issue, ignore handling the clk for i.MX8X and i.MX8 M-core, as SCFW already takes care of enabling and configuring the clock.
Suggested-by: Peng Fan peng.fan@nxp.com Reviewed-by: Ulf Hansson ulf.hansson@linaro.org Reviewed-by: Peng Fan peng.fan@nxp.com Signed-off-by: Hiago De Franco hiago.franco@toradex.com Acked-by: Mathieu Poirier mathieu.poirier@linaro.org Link: https://lore.kernel.org/r/20250629172512.14857-3-hiagofranco@gmail.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Critical bug fix preventing system reset**: The commit fixes a serious issue where the probe function failure causes the M-core power domain to be shut down while the core is still running, resulting in a fault from the SCU and triggering a system reset. This is a severe user-impacting bug that can cause unexpected system reboots.
2. **Minimal and contained change**: The fix is very simple: it adds `|| dcfg->method == IMX_RPROC_SCU_API` to an existing condition in the `imx_rproc_clk_enable()` function and updates the accompanying comment. The diff touches two lines and only affects the clock handling logic for i.MX8X and i.MX8 family SoCs when using the SCU API method.
3. **Clear root cause**: The commit message clearly explains the issue - when the M-core is started by the bootloader, the SCFW automatically enables its clock. If Linux later attempts to enable the same clock, SCFW returns a 'LOCKED' error, causing probe failure and subsequent power domain shutdown while the core is running.
4. **No architectural changes**: This is purely a bug fix that doesn't introduce new features or change the driver architecture. It simply skips clock operations for a specific configuration where the clock is already managed by the SCFW.
5. **Platform-specific fix with no regression risk**: The change only affects i.MX8X and i.MX8 platforms using the SCU API method (`IMX_RPROC_SCU_API`). Other platforms and methods are unaffected, minimizing the risk of regression.
6. **Prevents data corruption/loss**: System resets caused by this bug could potentially lead to data loss or corruption, making this fix important for system stability.
The specific code change in `imx_rproc_clk_enable()` at line 1033 simply extends the existing check from:
```c
if (dcfg->method == IMX_RPROC_NONE)
```
to:
```c
if (dcfg->method == IMX_RPROC_NONE || dcfg->method == IMX_RPROC_SCU_API)
```
This ensures that for SCU-managed cores, Linux doesn't attempt to enable clocks that are already controlled by the SCFW, preventing the probe failure and subsequent system reset.
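As a rough illustration of the early-return condition, the sketch below models "does Linux touch the clock?" as a predicate over the control method. The enum and function names are hypothetical and chosen only for this example; they mirror the idea of the driver's method field, not its exact definition.

```c
#include <stdbool.h>

/* Hypothetical subset of remote-core control methods. */
enum demo_method {
	DEMO_METHOD_NONE,
	DEMO_METHOD_MMIO,
	DEMO_METHOD_SMC,
	DEMO_METHOD_SCU_API
};

/* Before the fix: the clock is skipped only when Linux has no control. */
bool demo_linux_handles_clk_old(enum demo_method m)
{
	return m != DEMO_METHOD_NONE;
}

/* After the fix: SCU-managed cores are skipped as well. */
bool demo_linux_handles_clk_new(enum demo_method m)
{
	return m != DEMO_METHOD_NONE && m != DEMO_METHOD_SCU_API;
}
```

Only the SCU API case changes behavior; every other method yields the same answer from both predicates.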
drivers/remoteproc/imx_rproc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/remoteproc/imx_rproc.c b/drivers/remoteproc/imx_rproc.c index 74299af1d7f1..627e57a88db2 100644 --- a/drivers/remoteproc/imx_rproc.c +++ b/drivers/remoteproc/imx_rproc.c @@ -1029,8 +1029,8 @@ static int imx_rproc_clk_enable(struct imx_rproc *priv) struct device *dev = priv->dev; int ret;
- /* Remote core is not under control of Linux */ - if (dcfg->method == IMX_RPROC_NONE) + /* Remote core is not under control of Linux or it is managed by SCU API */ + if (dcfg->method == IMX_RPROC_NONE || dcfg->method == IMX_RPROC_SCU_API) return 0;
priv->clk = devm_clk_get(dev, NULL);
From: Tiffany Yang ynaffit@google.com
[ Upstream commit bea3e7bfa2957d986683543cbf57092715f9a91b ]
The binder allocator selftest was only checking the last page of buffers that ended on a page boundary. Correct the page indexing to account for buffers that are not page-aligned.
Signed-off-by: Tiffany Yang ynaffit@google.com Acked-by: Carlos Llamas cmllamas@google.com Link: https://lore.kernel.org/r/20250714185321.2417234-2-ynaffit@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit fixes a genuine bug in the binder allocator selftest that causes incorrect page boundary checking. The bug is an off-by-one error in the page indexing calculation.
**Detailed Analysis:**
1. **Bug Description**: The original loop condition `i < end / PAGE_SIZE` covers every page only when the buffers end exactly on a page boundary; otherwise integer division rounds down and the final, partially used page is skipped. For example, assuming 4096-byte pages: - If `end = 8192` (exactly 2 pages), `end / PAGE_SIZE = 2`, so the loop checks pages 0 and 1, which is the full range - If `end = 8000`, `end / PAGE_SIZE = 1`, so the loop checks only page 0 even though the buffers extend into page 1 - The fix `i <= (end - 1) / PAGE_SIZE` uses the index of the page containing the last byte, so both cases are covered
2. **Impact**: This bug causes the selftest to skip validation of the last page for buffers that do not end on a page boundary, potentially missing memory allocation issues that the selftest is designed to catch.
3. **Fix Quality**: The fix is minimal (single line change), mathematically correct, and focused solely on fixing the indexing bug without introducing new functionality.
4. **Backport Criteria Met**: - **Bug fix**: Yes - fixes incorrect test coverage - **Small and contained**: Yes - single line change in selftest code - **No architectural changes**: Yes - simple indexing fix - **Minimal regression risk**: Yes - only affects selftest code, not production binder functionality - **Clear benefit**: Yes - ensures proper testing coverage for memory allocation
5. **Additional Evidence**: - The commit is acked by Carlos Llamas, a binder maintainer - The fix is in selftest code, which has lower risk than production code changes - The mathematical correction is straightforward and verifiable
This is an ideal candidate for stable backporting as it fixes a testing gap that could allow memory allocation bugs to go undetected in stable kernels.
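Following the commit message, the difference between the two loop bounds shows up for buffers that are not page-aligned. The sketch below counts how many pages each loop visits; the 4096-byte page size and the `demo_` names are assumptions made for illustration.

```c
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u /* assumed page size for illustration */

/* Pages visited by the old loop: for (i = 0; i < end / PAGE_SIZE; i++) */
size_t demo_pages_checked_old(size_t end)
{
	return end / DEMO_PAGE_SIZE;
}

/*
 * Pages visited by the new loop: for (i = 0; i <= (end - 1) / PAGE_SIZE; i++)
 * Requires end > 0.
 */
size_t demo_pages_checked_new(size_t end)
{
	return (end - 1) / DEMO_PAGE_SIZE + 1;
}
```

For a page-aligned end such as 8192 both variants visit two pages; for an unaligned end such as 8000 the old bound visits one page and the new bound visits two.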
drivers/android/binder_alloc_selftest.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/android/binder_alloc_selftest.c b/drivers/android/binder_alloc_selftest.c index c88735c54848..486af3ec3c02 100644 --- a/drivers/android/binder_alloc_selftest.c +++ b/drivers/android/binder_alloc_selftest.c @@ -142,12 +142,12 @@ static void binder_selftest_free_buf(struct binder_alloc *alloc, for (i = 0; i < BUFFER_NUM; i++) binder_alloc_free_buf(alloc, buffers[seq[i]]);
- for (i = 0; i < end / PAGE_SIZE; i++) { /** * Error message on a free page can be false positive * if binder shrinker ran during binder_alloc_free_buf * calls above. */ + for (i = 0; i <= (end - 1) / PAGE_SIZE; i++) { if (list_empty(page_to_lru(alloc->pages[i]))) { pr_err_size_seq(sizes, seq); pr_err("expect lru but is %s at page index %d\n",
From: Binbin Zhou zhoubinbin@loongson.cn
[ Upstream commit 27cb8f702eb789f97f7a8bd5a91d76c65a937b2f ]
Add the interrupt enable register offset (inten_offset) so that GPIO interrupts can be enabled normally on more models.
According to the latest interface specifications, the definition of GPIO interrupts in ACPI is similar to that in FDT. The GPIO interrupts are listed one by one according to the GPIO number, and the corresponding interrupt number can be obtained directly through the GPIO number specified by the consumer.
Signed-off-by: Xi Ruoyao xry111@xry111.site Signed-off-by: Binbin Zhou zhoubinbin@loongson.cn Reviewed-by: Huacai Chen chenhuacai@loongson.cn Link: https://lore.kernel.org/r/20250714064542.2276247-1-zhoubinbin@loongson.cn [Bartosz: tweaked the commit message] Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit is suitable for backporting to stable kernel trees for the following reasons:
1. **Fixes a bug affecting users**: The commit addresses a functional issue where GPIO interrupts cannot be enabled properly on several Loongson models (ls2k2000, ls3a5000, ls7a, ls7a2000, ls3a6000). Looking at the code in `loongson_gpio_to_irq()` (lines 120-136), the function relies on `chip_data->inten_offset` being set to properly enable interrupts. Without this offset, the conditional check at line 167 (`if (lgpio->chip_data->inten_offset)`) prevents the `to_irq` callback from being registered, effectively breaking GPIO interrupt functionality.
2. **Small and contained fix**: The changes are minimal - only adding the missing `inten_offset` values to existing chip data structures. The patch adds 6 lines: - Line 225: `.inten_offset = 0x14,` for ls2k2000_data0 - Line 233: `.inten_offset = 0x30,` for ls2k2000_data1 - Line 249: `.inten_offset = 0x14,` for ls3a5000_data - Line 257: `.inten_offset = 0xb00,` for ls7a_data - Line 266: `.inten_offset = 0xb00,` for ls7a2000_data0 - Line 284: `.inten_offset = 0x14,` for ls3a6000_data
3. **No architectural changes**: This is a simple data fix that adds missing hardware register offsets. It doesn't change any logic, APIs, or introduce new features.
4. **Low regression risk**: The change only affects the specific Loongson models mentioned and only enables functionality that was previously broken. It cannot break existing working configurations since it only adds missing data.
5. **Hardware enablement fix**: This is essentially a hardware enablement bug fix that allows proper interrupt handling on affected Loongson platforms. Without this fix, GPIO interrupts are completely non-functional on these chips.
The commit follows stable tree rules by being a clear bug fix with minimal changes and low risk. The fact that other similar chip data structures already had `inten_offset` defined (like ls2k_data, ls2k0500_data0/1) shows this was an oversight that needed correction.
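To illustrate why a missing initializer matters, the sketch below models the check described above: a field left out of a designated initializer is zero, and a zero `inten_offset` is treated as "no interrupt support". The struct layout and the `demo_` names are simplified assumptions, with offset values taken from the ls3a5000 hunk of the diff.

```c
#include <stdbool.h>

/* Simplified, hypothetical version of the per-chip register layout. */
struct demo_chip_data {
	unsigned int conf_offset;
	unsigned int in_offset;
	unsigned int out_offset;
	unsigned int inten_offset; /* 0 means "not provided" */
};

/* Mirrors the idea of only wiring up to_irq when inten_offset is set. */
bool demo_irq_supported(const struct demo_chip_data *d)
{
	return d->inten_offset != 0;
}

/* Before the patch: inten_offset is omitted and therefore zero. */
const struct demo_chip_data demo_ls3a5000_before = {
	.conf_offset = 0x0,
	.in_offset = 0xc,
	.out_offset = 0x8,
};

/* After the patch: the interrupt enable register offset is filled in. */
const struct demo_chip_data demo_ls3a5000_after = {
	.conf_offset = 0x0,
	.in_offset = 0xc,
	.out_offset = 0x8,
	.inten_offset = 0x14,
};
```

In this model `demo_irq_supported()` is false for the "before" data and true for the "after" data, which is the behavioral change the six added lines produce.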
drivers/gpio/gpio-loongson-64bit.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/gpio/gpio-loongson-64bit.c b/drivers/gpio/gpio-loongson-64bit.c index 70a01c5b8ad1..add09971d26a 100644 --- a/drivers/gpio/gpio-loongson-64bit.c +++ b/drivers/gpio/gpio-loongson-64bit.c @@ -222,6 +222,7 @@ static const struct loongson_gpio_chip_data loongson_gpio_ls2k2000_data0 = { .conf_offset = 0x0, .in_offset = 0xc, .out_offset = 0x8, + .inten_offset = 0x14, };
static const struct loongson_gpio_chip_data loongson_gpio_ls2k2000_data1 = { @@ -230,6 +231,7 @@ static const struct loongson_gpio_chip_data loongson_gpio_ls2k2000_data1 = { .conf_offset = 0x0, .in_offset = 0x20, .out_offset = 0x10, + .inten_offset = 0x30, };
static const struct loongson_gpio_chip_data loongson_gpio_ls2k2000_data2 = { @@ -246,6 +248,7 @@ static const struct loongson_gpio_chip_data loongson_gpio_ls3a5000_data = { .conf_offset = 0x0, .in_offset = 0xc, .out_offset = 0x8, + .inten_offset = 0x14, };
static const struct loongson_gpio_chip_data loongson_gpio_ls7a_data = { @@ -254,6 +257,7 @@ static const struct loongson_gpio_chip_data loongson_gpio_ls7a_data = { .conf_offset = 0x800, .in_offset = 0xa00, .out_offset = 0x900, + .inten_offset = 0xb00, };
/* LS7A2000 chipset GPIO */ @@ -263,6 +267,7 @@ static const struct loongson_gpio_chip_data loongson_gpio_ls7a2000_data0 = { .conf_offset = 0x800, .in_offset = 0xa00, .out_offset = 0x900, + .inten_offset = 0xb00, };
/* LS7A2000 ACPI GPIO */ @@ -281,6 +286,7 @@ static const struct loongson_gpio_chip_data loongson_gpio_ls3a6000_data = { .conf_offset = 0x0, .in_offset = 0xc, .out_offset = 0x8, + .inten_offset = 0x14, };
static const struct of_device_id loongson_gpio_of_match[] = {
From: Jameson Thies jthies@google.com
[ Upstream commit 300386d117a98961fc1d612d1f1a61997d731b8a ]
cros_ec_ucsi fails to allocate a UCSI instance in its probe function because it does not define all operations checked by ucsi_create. Update cros_ec_ucsi operations to use the same function for read_cci and poll_cci.
Signed-off-by: Jameson Thies jthies@google.com Reviewed-by: Benson Leung bleung@chromium.org Link: https://lore.kernel.org/r/20250711202033.2201305-1-jthies@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
The timeline is as follows: 1. The cros_ec_ucsi driver was added on 2024-12-31 2. The poll_cci requirement was added on 2025-02-17 (after the cros_ec_ucsi driver was already in the kernel) 3. This broke the cros_ec_ucsi driver because it didn't have the poll_cci operation defined
**Backport Status: YES**
This commit fixes a regression where the cros_ec_ucsi driver fails to allocate a UCSI instance during probe because it lacks the poll_cci operation that became mandatory in commit 976e7e9bdc77 ("acpi: typec: ucsi: Introduce a ->poll_cci method").
The commit meets all criteria for stable backporting:
1. **Fixes a real bug**: The driver completely fails to probe without this fix, preventing ChromeOS EC-based UCSI devices from working at all. The error occurs in ucsi_create() at drivers/usb/typec/ucsi/ucsi.c:1933 where it checks for the presence of all required operations including poll_cci.
2. **Small and contained change**: The fix is minimal - it only adds one line to the operations structure (`.poll_cci = cros_ucsi_read_cci,`), reusing the existing read_cci implementation which is appropriate for this driver.
3. **No side effects**: The change simply allows the driver to pass the operations validation check. Using the same function for both read_cci and poll_cci is the correct approach for drivers that don't have the ACPI-specific sync issues that prompted the poll_cci split.
4. **Fixes a regression**: This is fixing a regression introduced by commit 976e7e9bdc77, which itself was marked for stable. Any stable kernel that includes 976e7e9bdc77 but not this fix will have a broken cros_ec_ucsi driver.
5. **Clear fix relationship**: The commit message clearly identifies the problem (ucsi_create fails due to missing operation) and the solution is straightforward.
This should be backported to any stable kernel that includes both: - commit f1a2241778d9 ("usb: typec: ucsi: Implement ChromeOS UCSI driver") - commit 976e7e9bdc77 ("acpi: typec: ucsi: Introduce a ->poll_cci method")
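The failure mode can be sketched as an "all callbacks present" check over an operations table. This is a simplified model: the three-field struct, the `void` signatures, and the `demo_` names are hypothetical, and the real `ucsi_create()` validates more operations than shown here.

```c
#include <stdbool.h>

/* Hypothetical, reduced operations table. */
struct demo_ucsi_ops {
	int (*read_version)(void);
	int (*read_cci)(void);
	int (*poll_cci)(void);
};

/* Creation is refused unless every required callback is provided. */
bool demo_ops_complete(const struct demo_ucsi_ops *ops)
{
	return ops && ops->read_version && ops->read_cci && ops->poll_cci;
}

static int demo_read_version(void) { return 0; }
static int demo_read_cci(void) { return 0; }

/* Before the fix: poll_cci is left NULL. */
const struct demo_ucsi_ops demo_ops_before = {
	.read_version = demo_read_version,
	.read_cci = demo_read_cci,
};

/* After the fix: poll_cci reuses the read_cci implementation. */
const struct demo_ucsi_ops demo_ops_after = {
	.read_version = demo_read_version,
	.read_cci = demo_read_cci,
	.poll_cci = demo_read_cci,
};
```

Pointing both members at the same function is enough to satisfy the check, which matches the one-line nature of the patch.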
drivers/usb/typec/ucsi/cros_ec_ucsi.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/usb/typec/ucsi/cros_ec_ucsi.c b/drivers/usb/typec/ucsi/cros_ec_ucsi.c index 4ec1c6d22310..eed2a7d0ebc6 100644 --- a/drivers/usb/typec/ucsi/cros_ec_ucsi.c +++ b/drivers/usb/typec/ucsi/cros_ec_ucsi.c @@ -137,6 +137,7 @@ static int cros_ucsi_sync_control(struct ucsi *ucsi, u64 cmd, u32 *cci, static const struct ucsi_operations cros_ucsi_ops = { .read_version = cros_ucsi_read_version, .read_cci = cros_ucsi_read_cci, + .poll_cci = cros_ucsi_read_cci, .read_message_in = cros_ucsi_read_message_in, .async_control = cros_ucsi_async_control, .sync_control = cros_ucsi_sync_control,
From: André Draszik andre.draszik@linaro.org
[ Upstream commit 31611223fb34a3e9320cdfc4f4395072a13ea78e ]
This driver calls enable_irq_wake() during probe() unconditionally, and never issues the required corresponding disable_irq_wake() to disable hardware interrupt wakeup signals.
Additionally, whether or not a device should wake-up the system is meant to be a policy decision based on sysfs (.../power/wakeup) in the first place.
Update the driver to use the standard approach to enable/disable IRQ wake during the suspend/resume callbacks. This solves both issues described above.
Signed-off-by: André Draszik andre.draszik@linaro.org Reviewed-by: Badhri Jagan Sridharan badhri@google.com Link: https://lore.kernel.org/r/20250707-max77759-irq-wake-v1-1-d367f633e4bc@linar... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Clear Bug Fix**: The commit fixes a resource management bug where `enable_irq_wake()` is called during probe without a corresponding `disable_irq_wake()`. This violates the kernel's IRQ wake API usage requirements, leading to: - Resource leak in the IRQ subsystem - Incorrect wake behavior that cannot be controlled via sysfs - Potential system power management issues
2. **Small and Contained Change**: The fix is minimal and well-contained: - Removes the problematic `max_tcpci_init_alert()` function - Moves IRQ request directly into probe - Adds proper suspend/resume callbacks with correct `enable_irq_wake()`/`disable_irq_wake()` pairing - Total change is about 40 lines of straightforward code
3. **Low Risk of Regression**: - The change follows established kernel patterns for IRQ wake handling in PM callbacks - Similar patterns exist in other USB Type-C drivers (e.g., fusb302.c uses suspend/resume for IRQ wake) - The fix only affects power management behavior, not core functionality - Uses `device_may_wakeup()` to properly check policy before enabling wake
4. **Clear Impact on Users**: - Without this fix, devices cannot properly control wake-on-USB-C events through sysfs - The unconditional `enable_irq_wake()` means the device will always wake the system on USB-C events, regardless of user preference - This affects power consumption and battery life on systems with USB Type-C
5. **No Architectural Changes**: The commit: - Doesn't introduce new features - Doesn't change any APIs or interfaces - Simply fixes incorrect usage of existing kernel APIs - Maintains backward compatibility
The commit message clearly explains the problem and solution, the code changes are minimal and correct, and it fixes a real bug that affects system power management behavior. This is exactly the type of fix that stable kernels should receive.
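The pairing requirement can be pictured with a simple wake-depth counter. The sketch below is a user-space model with hypothetical `demo_` names; it does not call any kernel API and only mirrors the shape of the new suspend/resume callbacks.

```c
#include <stdbool.h>

/* Hypothetical device model: policy flag plus a wake-enable depth counter. */
struct demo_dev {
	bool may_wakeup; /* stands in for device_may_wakeup() */
	int wake_depth;  /* +1 per enable_irq_wake(), -1 per disable_irq_wake() */
};

/* Old behavior: wake is enabled once at probe and never released. */
void demo_probe_old(struct demo_dev *d)
{
	d->wake_depth++;
}

/* New behavior: enable on suspend only if policy allows it ... */
void demo_suspend(struct demo_dev *d)
{
	if (d->may_wakeup)
		d->wake_depth++;
}

/* ... and release it again on resume under the same condition. */
void demo_resume(struct demo_dev *d)
{
	if (d->may_wakeup)
		d->wake_depth--;
}
```

After any number of suspend/resume cycles the counter returns to zero in the new scheme, whereas the old probe-time enable leaves it at one regardless of the sysfs policy.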
drivers/usb/typec/tcpm/tcpci_maxim_core.c | 46 +++++++++++++++-------- 1 file changed, 30 insertions(+), 16 deletions(-)
diff --git a/drivers/usb/typec/tcpm/tcpci_maxim_core.c b/drivers/usb/typec/tcpm/tcpci_maxim_core.c index b5a5ed40faea..ff3604be79da 100644 --- a/drivers/usb/typec/tcpm/tcpci_maxim_core.c +++ b/drivers/usb/typec/tcpm/tcpci_maxim_core.c @@ -421,21 +421,6 @@ static irqreturn_t max_tcpci_isr(int irq, void *dev_id) return IRQ_WAKE_THREAD; }
-static int max_tcpci_init_alert(struct max_tcpci_chip *chip, struct i2c_client *client) -{ - int ret; - - ret = devm_request_threaded_irq(chip->dev, client->irq, max_tcpci_isr, max_tcpci_irq, - (IRQF_TRIGGER_LOW | IRQF_ONESHOT), dev_name(chip->dev), - chip); - - if (ret < 0) - return ret; - - enable_irq_wake(client->irq); - return 0; -} - static int max_tcpci_start_toggling(struct tcpci *tcpci, struct tcpci_data *tdata, enum typec_cc_status cc) { @@ -532,7 +517,9 @@ static int max_tcpci_probe(struct i2c_client *client)
chip->port = tcpci_get_tcpm_port(chip->tcpci);
- ret = max_tcpci_init_alert(chip, client); + ret = devm_request_threaded_irq(&client->dev, client->irq, max_tcpci_isr, max_tcpci_irq, + (IRQF_TRIGGER_LOW | IRQF_ONESHOT), dev_name(chip->dev), + chip); if (ret < 0) return dev_err_probe(&client->dev, ret, "IRQ initialization failed\n"); @@ -544,6 +531,32 @@ static int max_tcpci_probe(struct i2c_client *client) return 0; }
+#ifdef CONFIG_PM_SLEEP +static int max_tcpci_resume(struct device *dev) +{ + struct i2c_client *client = to_i2c_client(dev); + int ret = 0; + + if (client->irq && device_may_wakeup(dev)) + ret = disable_irq_wake(client->irq); + + return ret; +} + +static int max_tcpci_suspend(struct device *dev) +{ + struct i2c_client *client = to_i2c_client(dev); + int ret = 0; + + if (client->irq && device_may_wakeup(dev)) + ret = enable_irq_wake(client->irq); + + return ret; +} +#endif /* CONFIG_PM_SLEEP */ + +static SIMPLE_DEV_PM_OPS(max_tcpci_pm_ops, max_tcpci_suspend, max_tcpci_resume); + static const struct i2c_device_id max_tcpci_id[] = { { "maxtcpc" }, { } @@ -562,6 +575,7 @@ static struct i2c_driver max_tcpci_i2c_driver = { .driver = { .name = "maxtcpc", .of_match_table = of_match_ptr(max_tcpci_of_match), + .pm = &max_tcpci_pm_ops, }, .probe = max_tcpci_probe, .id_table = max_tcpci_id,
From: Guillaume La Roque glaroque@baylibre.com
[ Upstream commit fcddcb7e8f38a40db99f87a962c5d0a153a76566 ]
Select PM_GENERIC_DOMAINS instead of depending on it to ensure it is always enabled when TI_SCI_PM_DOMAINS is selected. Since PM_GENERIC_DOMAINS is an implicit symbol, it can only be enabled through 'select' and cannot be explicitly enabled in configuration. This simplifies the dependency chain and prevents build issues.
Signed-off-by: Guillaume La Roque glaroque@baylibre.com Reviewed-by: Nishanth Menon nm@ti.com Link: https://lore.kernel.org/r/20250715-depspmdomain-v2-1-6f0eda3ce824@baylibre.c... Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a build configuration issue**: The commit message explicitly states this change "prevents build issues". The problem is that PM_GENERIC_DOMAINS is an implicit symbol that cannot be explicitly enabled by users in configuration menus. Using "depends on" therefore leaves users with a dependency they cannot satisfy directly: TI_SCI_PM_DOMAINS stays unavailable unless some other option happens to select PM_GENERIC_DOMAINS.
2. **Small and contained fix**: The change is minimal - it only modifies one line in a Kconfig file, changing from `depends on PM_GENERIC_DOMAINS` to `select PM_GENERIC_DOMAINS if PM`. This is a low-risk change that only affects the configuration system.
3. **Follows established patterns**: The code shows that other subsystems handle PM_GENERIC_DOMAINS the same way. From my search results, I can see that drivers like QCOM_GDSC, AMD GPU drivers, and others use `select PM_GENERIC_DOMAINS if PM` rather than `depends on PM_GENERIC_DOMAINS`.
4. **Prevents user-facing issues**: Without this fix, users trying to enable TI_SCI_PM_DOMAINS would face a confusing situation where the option might not be available in menuconfig because they cannot satisfy the dependency on PM_GENERIC_DOMAINS (since it's an implicit symbol).
5. **No architectural changes**: This is purely a Kconfig dependency fix that doesn't change any runtime behavior or introduce new features. It only fixes the configuration logic.
6. **Platform-specific fix**: This only affects TI SoC configurations and has no impact on other architectures or subsystems, minimizing the risk of regression.
The commit follows stable tree rules by fixing an important configuration/build issue with minimal risk and no new features or architectural changes.
drivers/pmdomain/ti/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pmdomain/ti/Kconfig b/drivers/pmdomain/ti/Kconfig index 67c608bf7ed0..5386b362a7ab 100644 --- a/drivers/pmdomain/ti/Kconfig +++ b/drivers/pmdomain/ti/Kconfig @@ -10,7 +10,7 @@ if SOC_TI config TI_SCI_PM_DOMAINS tristate "TI SCI PM Domains Driver" depends on TI_SCI_PROTOCOL - depends on PM_GENERIC_DOMAINS + select PM_GENERIC_DOMAINS if PM help Generic power domain implementation for TI device implementing the TI SCI protocol.
From: Bartosz Golaszewski bartosz.golaszewski@linaro.org
[ Upstream commit ff0f0d7c6587e38c308be9905e36f86e98fb9c1f ]
regmap_update_bits() can fail so check its return value in wcd_gpio_direction_output() for consistency with the rest of the code and propagate any errors.
Link: https://lore.kernel.org/r/20250709-gpiochip-set-rv-gpio-remaining-v1-2-b8950... Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
Based on my analysis, here's my determination:
**YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **It fixes a real bug**: The `wcd_gpio_direction_output()` function in drivers/gpio/gpio-wcd934x.c was not checking the return value of the first `regmap_update_bits()` call (lines 50-51 in the original code). This could lead to silent failures where the GPIO direction register update fails, but the function continues to update the value register, potentially leaving the GPIO in an inconsistent state.
2. **The fix is minimal and contained**: The change is very small - it only adds 4 lines of code to check the return value and propagate the error. This minimizes the risk of introducing regressions.
3. **Consistency with the rest of the driver**: Looking at the code, other functions in the same driver already properly check regmap operation return values: - `wcd_gpio_get_direction()` checks the return value of `regmap_read()` at line 27-29 - `wcd_gpio_direction_input()` already returns the result of `regmap_update_bits()` directly
4. **Consistency with other GPIO drivers**: My examination of other GPIO drivers (gpio-lp87565.c, gpio-lp873x.c, gpio-tps65912.c, gpio-adp5585.c) shows that properly checking return values from regmap operations in `direction_output` functions is the standard practice.
5. **Potential for real-world impact**: The WCD934x is an audio codec used in Qualcomm SoCs like SDM845. A failure to properly set GPIO direction could cause issues with audio functionality or other peripherals connected through these GPIOs. The bug could manifest as intermittent failures that are hard to debug.
6. **No architectural changes**: The commit doesn't introduce new features or change any APIs. It's purely a bug fix that adds proper error handling.
7. **Clear bug pattern**: The pattern of not checking the first regmap operation but checking (returning) the second one suggests this was an oversight rather than intentional behavior.
The commit follows stable kernel rules by being a clear bug fix that addresses a potential reliability issue without introducing new functionality or making architectural changes.
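The inconsistent state mentioned above can be reproduced in a small model. The sketch below uses a fake register map whose update can be made to fail for one register; all `demo_`/`DEMO_` names and the error value are hypothetical and stand in for `regmap_update_bits()` and the driver's direction and value registers.

```c
#define DEMO_REG_DIR 1
#define DEMO_REG_VAL 2
#define DEMO_EIO 5 /* illustrative error number */

/* Fake register map: one register can be configured to fail on update. */
struct demo_map {
	int fail_reg;
	int dir_written;
	int val_written;
};

/* Stand-in for regmap_update_bits(): returns 0 or a negative error. */
int demo_update(struct demo_map *m, int reg)
{
	if (reg == m->fail_reg)
		return -DEMO_EIO;
	if (reg == DEMO_REG_DIR)
		m->dir_written = 1;
	else
		m->val_written = 1;
	return 0;
}

/* Before the fix: the direction update result is ignored. */
int demo_direction_output_old(struct demo_map *m)
{
	demo_update(m, DEMO_REG_DIR);
	return demo_update(m, DEMO_REG_VAL);
}

/* After the fix: a failed direction update is propagated immediately. */
int demo_direction_output_new(struct demo_map *m)
{
	int ret = demo_update(m, DEMO_REG_DIR);

	if (ret)
		return ret;
	return demo_update(m, DEMO_REG_VAL);
}
```

When the direction register update fails, the old variant still writes the value register and reports success, while the new variant returns the error and leaves the value register untouched.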
 drivers/gpio/gpio-wcd934x.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpio/gpio-wcd934x.c b/drivers/gpio/gpio-wcd934x.c
index 2bba27b13947..cfa7b0a50c8e 100644
--- a/drivers/gpio/gpio-wcd934x.c
+++ b/drivers/gpio/gpio-wcd934x.c
@@ -46,9 +46,12 @@ static int wcd_gpio_direction_output(struct gpio_chip *chip, unsigned int pin,
 					 int val)
 {
 	struct wcd_gpio_data *data = gpiochip_get_data(chip);
+	int ret;
 
-	regmap_update_bits(data->map, WCD_REG_DIR_CTL_OFFSET,
-			   WCD_PIN_MASK(pin), WCD_PIN_MASK(pin));
+	ret = regmap_update_bits(data->map, WCD_REG_DIR_CTL_OFFSET,
+				 WCD_PIN_MASK(pin), WCD_PIN_MASK(pin));
+	if (ret)
+		return ret;
 
 	return regmap_update_bits(data->map, WCD_REG_VAL_CTL_OFFSET,
 				  WCD_PIN_MASK(pin),
From: Lifeng Zheng zhenglifeng1@huawei.com
[ Upstream commit 0ae204405095abfbc2d694ee0fbb49bcbbe55c57 ]
Detect the result of starting old governor in cpufreq_set_policy(). If it fails, exit the governor and clear policy->governor.
Signed-off-by: Lifeng Zheng zhenglifeng1@huawei.com
Link: https://patch.msgid.link/20250709104145.2348017-5-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Bug Analysis
The commit fixes a **resource leak and inconsistent state bug** in the cpufreq governor switching code. Looking at the code changes in `cpufreq_set_policy()`:
### The Bug:
In the original code (lines 2716-2721), when switching governors fails and we need to restore the old governor:
```c
if (old_gov) {
	policy->governor = old_gov;
	if (cpufreq_init_governor(policy))
		policy->governor = NULL;
	else
		cpufreq_start_governor(policy);  // Bug: no error handling here
}
```

If `cpufreq_start_governor()` fails, the code doesn't handle the error. This leaves the system in an **inconsistent state** where:
1. The governor is initialized (`cpufreq_init_governor` succeeded)
2. But the governor failed to start (`cpufreq_start_governor` failed)
3. The policy still points to a non-functional governor
4. Resources allocated during `cpufreq_init_governor` are **leaked** (module reference count, governor's init() allocations)

### The Fix:
```c
if (cpufreq_init_governor(policy)) {
	policy->governor = NULL;
} else if (cpufreq_start_governor(policy)) {
	cpufreq_exit_governor(policy);  // NEW: Clean up on failure
	policy->governor = NULL;        // NEW: Clear the governor pointer
}
```
## Why This Should Be Backported:
1. **Fixes a Real Bug**: This addresses a resource leak where `cpufreq_init_governor()` acquires resources (notably `try_module_get()` at line 2442 and potential governor->init() allocations) that aren't cleaned up if `cpufreq_start_governor()` fails.
2. **Small and Contained Fix**: The change is minimal - 5 insertions and 3 deletions that add proper error handling. It doesn't change any APIs or introduce new functionality.
3. **Prevents System Instability**: Leaving the cpufreq subsystem in an inconsistent state (initialized but not started governor) could lead to: - Module reference count leaks - Memory leaks from governor init allocations - Potential crashes if the partially-initialized governor is accessed later
4. **Error Path Fix**: This is clearly an error handling path that was incorrectly implemented. The pattern of calling `cpufreq_exit_governor()` after a failed `cpufreq_start_governor()` is already used elsewhere in the code (line 2711).
5. **No Architectural Changes**: The fix simply adds missing cleanup code in an error path. It doesn't change the normal operation flow or introduce new features.
6. **Critical Subsystem**: The cpufreq subsystem is critical for power management and system stability. Bugs here can affect system reliability.
The commit follows the stable kernel rules perfectly - it's a clear bug fix that prevents resource leaks and system instability, with minimal risk of regression since it only affects an error path that was already broken.
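The rollback logic discussed above can be modelled in a few lines of user-space C. This is a sketch under assumptions, not the cpufreq code: `struct fake_governor`, `struct fake_policy`, and the `fake_*` helpers are hypothetical, and a plain counter stands in for the module reference and any allocations made at init time.

```c
#include <stddef.h>

/* Hypothetical model of a governor; the real struct cpufreq_governor differs. */
struct fake_governor {
	int refcount;		/* stands in for the module reference taken at init */
	int start_fails;	/* failure injection for the start step */
};

struct fake_policy {
	struct fake_governor *governor;
};

static int fake_init_governor(struct fake_policy *p)
{
	p->governor->refcount++;	/* resource acquired */
	return 0;
}

static int fake_start_governor(struct fake_policy *p)
{
	return p->governor->start_fails ? -1 : 0;
}

static void fake_exit_governor(struct fake_policy *p)
{
	p->governor->refcount--;	/* resource released */
}

/* Restore the old governor, undoing init if start fails (the patched shape). */
static void fake_restore_governor(struct fake_policy *p, struct fake_governor *old_gov)
{
	if (!old_gov)
		return;

	p->governor = old_gov;
	if (fake_init_governor(p)) {
		p->governor = NULL;
	} else if (fake_start_governor(p)) {
		fake_exit_governor(p);
		p->governor = NULL;
	}
}
```

In this model a failed start leaves the counter back at zero and the policy without a governor, which is the consistent state the patch aims for.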
 drivers/cpufreq/cpufreq.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d7426e1d8bdd..10a22c6f5519 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -2715,10 +2715,12 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	pr_debug("starting governor %s failed\n", policy->governor->name);
 	if (old_gov) {
 		policy->governor = old_gov;
-		if (cpufreq_init_governor(policy))
+		if (cpufreq_init_governor(policy)) {
 			policy->governor = NULL;
-		else
-			cpufreq_start_governor(policy);
+		} else if (cpufreq_start_governor(policy)) {
+			cpufreq_exit_governor(policy);
+			policy->governor = NULL;
+		}
 	}
 
 	return ret;
From: Li RongQing lirongqing@baidu.com
[ Upstream commit fc64e0421598aaa87d61184f6777b52614a095be ]
Users may disable HWP in firmware, in which case intel_pstate wouldn't load unless the CPU model is explicitly supported.
Signed-off-by: Li RongQing lirongqing@baidu.com
Link: https://patch.msgid.link/20250623105601.3924-1-lirongqing@baidu.com
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's the extensive analysis:
**Nature of the Change:** The commit adds support for Intel Granite Rapids processors (both D and X variants) in the intel_pstate driver when Hardware P-states (HWP) is disabled in firmware. This is a hardware enablement patch that adds two new CPU model IDs to existing lists: - `INTEL_GRANITERAPIDS_D` and `INTEL_GRANITERAPIDS_X` are added to `intel_pstate_cpu_ids[]` table
**Why This Is a Good Backport Candidate:**
1. **Fixes a Real User-Facing Issue**: Without this patch, users who have Granite Rapids CPUs with HWP disabled in firmware cannot use the intel_pstate driver at all. The commit message explicitly states "intel_pstate wouldn't load unless the CPU model is explicitly supported." This means affected systems would fall back to less efficient CPU frequency scaling drivers, significantly impacting performance and power efficiency.
2. **Minimal and Safe Change**: The patch only adds two lines to an existing CPU ID table:
```c
+	X86_MATCH(INTEL_GRANITERAPIDS_D, core_funcs),
+	X86_MATCH(INTEL_GRANITERAPIDS_X, core_funcs),
```
These entries follow the exact same pattern as all other CPU entries and use the standard `core_funcs` handler, which is already used by many other Intel CPU models.
3. **No Architectural Changes**: The commit doesn't introduce any new features, modify existing logic, or change any kernel interfaces. It's purely a hardware enablement patch that extends existing functionality to new hardware.
4. **Low Risk of Regression**: - The change is contained entirely within CPU model detection tables - It only affects systems with the specific Granite Rapids CPU models - Uses the well-tested `core_funcs` implementation already used by Sapphire Rapids, Emerald Rapids, and other server CPUs - The code path is only triggered on systems with these specific CPUs
5. **Follows Established Pattern**: Looking at the surrounding code, Granite Rapids entries were already present in the `intel_pstate_cpu_oob_ids[]` table (for out-of-band control), indicating this is completing support that was partially added previously.
6. **Important for Server Deployments**: Granite Rapids is a server CPU platform, and enterprise users often disable HWP for specific workload requirements or compatibility reasons. Without this patch, they lose access to the intel_pstate driver entirely.
The commit meets all the criteria for stable backporting: it fixes a real bug (driver fails to load on supported hardware), the fix is minimal and contained, there's no risk to existing systems, and it provides important functionality for affected users.
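To illustrate why a missing table entry keeps a driver from loading, here is a rough user-space sketch of a sentinel-terminated ID table lookup. Everything in it is an assumption made for illustration: `struct fake_cpu_id`, the `MODEL_*` numbers, and `fake_match_cpu()` are hypothetical and are not the kernel's `struct x86_cpu_id` or its matching code.

```c
#include <stddef.h>

/* Hypothetical model numbers, chosen only for illustration. */
enum { MODEL_A = 0x10, MODEL_B = 0x20, MODEL_NEW = 0x30 };

struct fake_cpu_id {
	int model;		/* 0 marks the terminating entry */
	const char *funcs;	/* stands in for the driver_data pointer */
};

static const struct fake_cpu_id fake_ids[] = {
	{ MODEL_A, "core_funcs" },
	{ MODEL_B, "core_funcs" },
	{ MODEL_NEW, "core_funcs" },	/* the kind of entry the patch adds */
	{ 0, NULL }
};

/* Walk the table until the sentinel; NULL means "model not supported". */
static const struct fake_cpu_id *fake_match_cpu(const struct fake_cpu_id *table, int model)
{
	for (; table->model != 0; table++)
		if (table->model == model)
			return table;
	return NULL;
}
```

A model that is absent from the table yields `NULL`, which corresponds to the "wouldn't load" case from the commit message; adding the entry is enough to make the lookup succeed.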
 drivers/cpufreq/intel_pstate.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 64587d318267..1782b29dc595 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2775,6 +2775,8 @@ static const struct x86_cpu_id intel_pstate_cpu_ids[] = {
 	X86_MATCH(INTEL_TIGERLAKE, core_funcs),
 	X86_MATCH(INTEL_SAPPHIRERAPIDS_X, core_funcs),
 	X86_MATCH(INTEL_EMERALDRAPIDS_X, core_funcs),
+	X86_MATCH(INTEL_GRANITERAPIDS_D, core_funcs),
+	X86_MATCH(INTEL_GRANITERAPIDS_X, core_funcs),
 	{}
 };
 MODULE_DEVICE_TABLE(x86cpu, intel_pstate_cpu_ids);
From: Alexander Kochetkov al.kochet@gmail.com
[ Upstream commit 7cdb433bb44cdc87dc5260cdf15bf03cc1cd1814 ]
To bring up secondary CPUs, the main CPU writes trampoline code to SRAM. The trampoline code is written while the secondary CPUs are powered on (at least that is true for the RK3188 CPU). Sometimes that leads to a kernel hang, probably because a secondary CPU executes the trampoline code when the kernel doesn't expect it.
The patch moves SRAM initialization step to the point where all secondary CPUs are powered down.
That fixes rarely hangs on RK3188: [ 0.091568] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000 [ 0.091996] rockchip_smp_prepare_cpus: ncores 4
Signed-off-by: Alexander Kochetkov al.kochet@gmail.com
Link: https://lore.kernel.org/r/20250703140453.1273027-1-al.kochet@gmail.com
Signed-off-by: Heiko Stuebner heiko@sntech.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and the code context, here's my determination:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real bug affecting users**: The commit explicitly mentions that it "fixes rarely hangs on RK3188" during SMP initialization. This is a critical boot-time issue that prevents the system from functioning properly.
2. **Small and contained fix**: The change is minimal - it only moves the SRAM initialization code (`rockchip_smp_prepare_sram()`) to execute after all secondary CPUs are powered down. The code movement is straightforward: - Lines 281-285 (old location) are moved to lines 321-325 (new location) - The fix is contained within a single function (`rockchip_smp_prepare_cpus()`)
3. **Clear root cause**: The commit message explains the race condition clearly - secondary CPUs may execute trampoline code while it's being written to SRAM, causing unpredictable behavior and hangs.
4. **No architectural changes**: This is purely a timing/ordering fix that doesn't introduce new features or change the architecture of the code.
5. **Critical subsystem**: SMP initialization is a critical kernel subsystem. A hang during boot makes the system unusable, which is a severe issue for affected users.
6. **Platform-specific fix**: The change only affects Rockchip ARM platforms (specifically those with Cortex-A9 CPUs like RK3188), limiting the scope of potential regressions.
7. **Follows stable tree rules**: The fix addresses an important bug (boot hang), is small and self-contained, has minimal risk of regression, and doesn't introduce new features.
The key insight from the code analysis is that the original code had a clear race condition where:
1. Secondary CPUs could still be powered on when `rockchip_smp_prepare_cpus()` ran (per the commit message, at least on RK3188)
2. Trampoline code was written to SRAM at the start of the Cortex-A9 branch, before the loop that forces the secondary cores off
3. Secondary CPUs could potentially execute incomplete/corrupted trampoline code

The fix ensures all secondary CPUs are powered down before the trampoline code is written to SRAM, eliminating the race condition.
 arch/arm/mach-rockchip/platsmp.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/arm/mach-rockchip/platsmp.c b/arch/arm/mach-rockchip/platsmp.c
index 36915a073c23..f432d22bfed8 100644
--- a/arch/arm/mach-rockchip/platsmp.c
+++ b/arch/arm/mach-rockchip/platsmp.c
@@ -279,11 +279,6 @@ static void __init rockchip_smp_prepare_cpus(unsigned int max_cpus)
 	}
 
 	if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A9) {
-		if (rockchip_smp_prepare_sram(node)) {
-			of_node_put(node);
-			return;
-		}
-
 		/* enable the SCU power domain */
 		pmu_set_power_domain(PMU_PWRDN_SCU, true);
 
@@ -316,11 +311,19 @@ static void __init rockchip_smp_prepare_cpus(unsigned int max_cpus)
 		asm ("mrc p15, 1, %0, c9, c0, 2\n" : "=r" (l2ctlr));
 		ncores = ((l2ctlr >> 24) & 0x3) + 1;
 	}
-	of_node_put(node);
 
 	/* Make sure that all cores except the first are really off */
 	for (i = 1; i < ncores; i++)
 		pmu_set_power_domain(0 + i, false);
+
+	if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A9) {
+		if (rockchip_smp_prepare_sram(node)) {
+			of_node_put(node);
+			return;
+		}
+	}
+
+	of_node_put(node);
 }
 
 static void __init rk3036_smp_prepare_cpus(unsigned int max_cpus)
From: Lifeng Zheng zhenglifeng1@huawei.com
[ Upstream commit 914cc799b28f17d369d5b4db3b941957d18157e8 ]
Replace sscanf() with kstrtoul() in set_freq_store() and check the result to avoid invalid input.
Signed-off-by: Lifeng Zheng zhenglifeng1@huawei.com
Link: https://lore.kernel.org/lkml/20250421030020.3108405-2-zhenglifeng1@huawei.co...
Signed-off-by: Chanwoo Choi cw00.choi@samsung.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
## Backport Status: YES
This commit should be backported to stable kernel trees for the following reasons:
1. **Security Issue - Input Validation Bug**: The original code uses `sscanf(buf, "%lu", &wanted)` at line 45 without checking the return value. This is a security vulnerability because: - `sscanf()` returns the number of successfully parsed items - If the input is malformed or contains invalid characters, `sscanf()` may return 0 but leave `wanted` uninitialized - The code then proceeds to use this potentially uninitialized value in `data->user_frequency = wanted` - This could lead to setting arbitrary frequency values from uninitialized stack data
2. **Real User Impact**: The devfreq userspace governor allows users to manually set device frequencies through sysfs. Invalid input handling could cause: - Setting incorrect frequencies based on uninitialized memory - Potential system instability if invalid frequencies are applied to hardware - Security implications as uninitialized stack data could be used
3. **Small and Contained Fix**: The change is minimal and low-risk: - Replaces `sscanf()` with `kstrtoul()` which has proper error checking - Adds explicit error handling that returns early on invalid input - The change is confined to a single function (`set_freq_store()`) - No architectural changes or new features added
4. **Follows Kernel Best Practices**: The kernel has been systematically replacing `sscanf()` with `kstrto*()` functions for better input validation. This is evident from similar commits found in the git log (e.g., commit a5556fa1107d for asus-wmi).
5. **Critical Subsystem**: While devfreq might not be as critical as core memory management, it controls device frequency scaling which can affect: - Power management - System performance - Hardware stability
6. **No Negative Side Effects**: The change only adds proper validation and doesn't modify the core functionality. Valid inputs will continue to work exactly as before.
The commit fixes a clear bug (missing input validation) that could lead to undefined behavior and potential security issues, making it an excellent candidate for stable backporting according to stable tree rules.
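The difference between unchecked and strict parsing can be approximated in user space. The sketch below is not `kstrtoul()`: `strict_parse_ulong()` is a hypothetical helper built on the standard `strtoul()`, and it is deliberately a little stricter than the kernel function (for example, it rejects a leading `+`).

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * User-space approximation of strict parsing; this is NOT kstrtoul() itself.
 * Returns 0 and stores the value on success, a negative errno otherwise.
 */
static int strict_parse_ulong(const char *buf, unsigned long *out)
{
	char *end;
	unsigned long val;

	/* strtoul() silently accepts a leading '-' or whitespace; reject both. */
	if (!isdigit((unsigned char)buf[0]))
		return -EINVAL;

	errno = 0;
	val = strtoul(buf, &end, 0);	/* base 0: decimal, 0x hex, or 0 octal */
	if (errno == ERANGE)
		return -ERANGE;
	if (*end == '\n')
		end++;			/* sysfs writes usually end in a newline */
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */

	*out = val;
	return 0;
}
```

Because the output variable is written only on success, a caller that returns early on a non-zero result never consumes an unset value, which is the property the patch relies on.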
 drivers/devfreq/governor_userspace.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/devfreq/governor_userspace.c b/drivers/devfreq/governor_userspace.c
index d1aa6806b683..175de0c0b50e 100644
--- a/drivers/devfreq/governor_userspace.c
+++ b/drivers/devfreq/governor_userspace.c
@@ -9,6 +9,7 @@
 #include <linux/slab.h>
 #include <linux/device.h>
 #include <linux/devfreq.h>
+#include <linux/kstrtox.h>
 #include <linux/pm.h>
 #include <linux/mutex.h>
 #include <linux/module.h>
@@ -39,10 +40,13 @@ static ssize_t set_freq_store(struct device *dev, struct device_attribute *attr,
 	unsigned long wanted;
 	int err = 0;
 
+	err = kstrtoul(buf, 0, &wanted);
+	if (err)
+		return err;
+
 	mutex_lock(&devfreq->lock);
 	data = devfreq->governor_data;
 
-	sscanf(buf, "%lu", &wanted);
 	data->user_frequency = wanted;
 	data->valid = true;
 	err = update_devfreq(devfreq);
From: Shubhrajyoti Datta shubhrajyoti.datta@amd.com
[ Upstream commit b1dc7f097b78eb8d25b071ead2384b07a549692b ]
Clear the ECC error and counter registers during initialization/probe to avoid reporting stale errors that may have occurred before EDAC registration.
For that, unify the Zynq and ZynqMP ECC state reading paths and simplify the code.
[ bp: Massage commit message. Fix an -Wsometimes-uninitialized warning as reported by
Reported-by: kernel test robot lkp@intel.com
Closes: https://lore.kernel.org/oe-kbuild-all/202507141048.obUv3ZUm-lkp@intel.com ]

Signed-off-by: Shubhrajyoti Datta shubhrajyoti.datta@amd.com
Signed-off-by: Borislav Petkov (AMD) bp@alien8.de
Link: https://lore.kernel.org/20250713050753.7042-1-shubhrajyoti.datta@amd.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit is suitable for backporting to stable kernel trees based on the following analysis:
## Bug Fix Nature The commit fixes a clear bug where the EDAC driver could report stale ECC errors from before the driver was initialized. This is evident from the commit message: "Clear the ECC error and counter registers during initialization/probe to avoid reporting stale errors that may have occurred before EDAC registration."
## Code Changes Analysis
1. **Problem Being Fixed**:
   - The driver wasn't clearing ECC error counters during initialization
   - This could lead to false error reports from pre-boot or pre-driver-load errors
   - Users could see incorrect ECC error counts that don't reflect actual runtime errors

2. **Solution Implementation**:
   - The patch consolidates the ECC state checking into a single `get_ecc_state()` function
   - Adds counter clearing logic for both Zynq and ZynqMP platforms:
   ```c
   // For Zynq:
   clearval = ECC_CTRL_CLR_CE_ERR | ECC_CTRL_CLR_UE_ERR;
   writel(clearval, priv->baseaddr + ECC_CTRL_OFST);
   writel(0x0, priv->baseaddr + ECC_CTRL_OFST);

   // For ZynqMP:
   clearval = readl(priv->baseaddr + ECC_CLR_OFST) |
              ECC_CTRL_CLR_CE_ERR | ECC_CTRL_CLR_CE_ERRCNT |
              ECC_CTRL_CLR_UE_ERR | ECC_CTRL_CLR_UE_ERRCNT;
   writel(clearval, priv->baseaddr + ECC_CLR_OFST);
   ```
3. **Scope and Risk Assessment**: - Changes are confined to the synopsys_edac driver - No API changes or interface modifications - The refactoring simplifies code by removing duplicate `zynq_get_ecc_state()` and `zynqmp_get_ecc_state()` functions - Adds platform type enum for cleaner platform identification - Risk is minimal as it only affects initialization path
4. **Stable Tree Criteria Compliance**:
   - ✓ Fixes a real bug (stale error reporting)
   - ✓ Small and contained changes (46 insertions, 51 deletions in one file)
   - ✓ No new features added
   - ✓ No architectural changes
   - ✓ Low regression risk
   - ✓ Clear fix for a specific problem
5. **Additional Evidence**: - The maintainer (Borislav Petkov) accepted and massaged the patch - Fixed a compiler warning caught by kernel test robot - The bug could affect production systems by showing false ECC errors
This is a textbook example of a stable-worthy patch: it fixes a specific bug (stale ECC error reporting) with minimal, focused changes that don't introduce new functionality or risk.
drivers/edac/synopsys_edac.c | 97 +++++++++++++++++------------------- 1 file changed, 46 insertions(+), 51 deletions(-)
diff --git a/drivers/edac/synopsys_edac.c b/drivers/edac/synopsys_edac.c index 5ed32a3299c4..51143b3257de 100644 --- a/drivers/edac/synopsys_edac.c +++ b/drivers/edac/synopsys_edac.c @@ -332,20 +332,26 @@ struct synps_edac_priv { #endif };
+enum synps_platform_type { + ZYNQ, + ZYNQMP, + SYNPS, +}; + /** * struct synps_platform_data - synps platform data structure. + * @platform: Identifies the target hardware platform * @get_error_info: Get EDAC error info. * @get_mtype: Get mtype. * @get_dtype: Get dtype. - * @get_ecc_state: Get ECC state. * @get_mem_info: Get EDAC memory info * @quirks: To differentiate IPs. */ struct synps_platform_data { + enum synps_platform_type platform; int (*get_error_info)(struct synps_edac_priv *priv); enum mem_type (*get_mtype)(const void __iomem *base); enum dev_type (*get_dtype)(const void __iomem *base); - bool (*get_ecc_state)(void __iomem *base); #ifdef CONFIG_EDAC_DEBUG u64 (*get_mem_info)(struct synps_edac_priv *priv); #endif @@ -720,51 +726,38 @@ static enum dev_type zynqmp_get_dtype(const void __iomem *base) return dt; }
-/** - * zynq_get_ecc_state - Return the controller ECC enable/disable status. - * @base: DDR memory controller base address. - * - * Get the ECC enable/disable status of the controller. - * - * Return: true if enabled, otherwise false. - */ -static bool zynq_get_ecc_state(void __iomem *base) +static bool get_ecc_state(struct synps_edac_priv *priv) { + u32 ecctype, clearval; enum dev_type dt; - u32 ecctype; - - dt = zynq_get_dtype(base); - if (dt == DEV_UNKNOWN) - return false;
- ecctype = readl(base + SCRUB_OFST) & SCRUB_MODE_MASK; - if ((ecctype == SCRUB_MODE_SECDED) && (dt == DEV_X2)) - return true; - - return false; -} - -/** - * zynqmp_get_ecc_state - Return the controller ECC enable/disable status. - * @base: DDR memory controller base address. - * - * Get the ECC enable/disable status for the controller. - * - * Return: a ECC status boolean i.e true/false - enabled/disabled. - */ -static bool zynqmp_get_ecc_state(void __iomem *base) -{ - enum dev_type dt; - u32 ecctype; - - dt = zynqmp_get_dtype(base); - if (dt == DEV_UNKNOWN) - return false; - - ecctype = readl(base + ECC_CFG0_OFST) & SCRUB_MODE_MASK; - if ((ecctype == SCRUB_MODE_SECDED) && - ((dt == DEV_X2) || (dt == DEV_X4) || (dt == DEV_X8))) - return true; + if (priv->p_data->platform == ZYNQ) { + dt = zynq_get_dtype(priv->baseaddr); + if (dt == DEV_UNKNOWN) + return false; + + ecctype = readl(priv->baseaddr + SCRUB_OFST) & SCRUB_MODE_MASK; + if (ecctype == SCRUB_MODE_SECDED && dt == DEV_X2) { + clearval = ECC_CTRL_CLR_CE_ERR | ECC_CTRL_CLR_UE_ERR; + writel(clearval, priv->baseaddr + ECC_CTRL_OFST); + writel(0x0, priv->baseaddr + ECC_CTRL_OFST); + return true; + } + } else { + dt = zynqmp_get_dtype(priv->baseaddr); + if (dt == DEV_UNKNOWN) + return false; + + ecctype = readl(priv->baseaddr + ECC_CFG0_OFST) & SCRUB_MODE_MASK; + if (ecctype == SCRUB_MODE_SECDED && + (dt == DEV_X2 || dt == DEV_X4 || dt == DEV_X8)) { + clearval = readl(priv->baseaddr + ECC_CLR_OFST) | + ECC_CTRL_CLR_CE_ERR | ECC_CTRL_CLR_CE_ERRCNT | + ECC_CTRL_CLR_UE_ERR | ECC_CTRL_CLR_UE_ERRCNT; + writel(clearval, priv->baseaddr + ECC_CLR_OFST); + return true; + } + }
return false; } @@ -934,18 +927,18 @@ static int setup_irq(struct mem_ctl_info *mci, }
static const struct synps_platform_data zynq_edac_def = { + .platform = ZYNQ, .get_error_info = zynq_get_error_info, .get_mtype = zynq_get_mtype, .get_dtype = zynq_get_dtype, - .get_ecc_state = zynq_get_ecc_state, .quirks = 0, };
static const struct synps_platform_data zynqmp_edac_def = { + .platform = ZYNQMP, .get_error_info = zynqmp_get_error_info, .get_mtype = zynqmp_get_mtype, .get_dtype = zynqmp_get_dtype, - .get_ecc_state = zynqmp_get_ecc_state, #ifdef CONFIG_EDAC_DEBUG .get_mem_info = zynqmp_get_mem_info, #endif @@ -957,10 +950,10 @@ static const struct synps_platform_data zynqmp_edac_def = { };
static const struct synps_platform_data synopsys_edac_def = { + .platform = SYNPS, .get_error_info = zynqmp_get_error_info, .get_mtype = zynqmp_get_mtype, .get_dtype = zynqmp_get_dtype, - .get_ecc_state = zynqmp_get_ecc_state, .quirks = (DDR_ECC_INTR_SUPPORT | DDR_ECC_INTR_SELF_CLEAR #ifdef CONFIG_EDAC_DEBUG | DDR_ECC_DATA_POISON_SUPPORT @@ -1390,10 +1383,6 @@ static int mc_probe(struct platform_device *pdev) if (!p_data) return -ENODEV;
- if (!p_data->get_ecc_state(baseaddr)) { - edac_printk(KERN_INFO, EDAC_MC, "ECC not enabled\n"); - return -ENXIO; - }
layers[0].type = EDAC_MC_LAYER_CHIP_SELECT; layers[0].size = SYNPS_EDAC_NR_CSROWS; @@ -1413,6 +1402,12 @@ static int mc_probe(struct platform_device *pdev) priv = mci->pvt_info; priv->baseaddr = baseaddr; priv->p_data = p_data; + if (!get_ecc_state(priv)) { + edac_printk(KERN_INFO, EDAC_MC, "ECC not enabled\n"); + rc = -ENODEV; + goto free_edac_mc; + } + spin_lock_init(&priv->reglock);
mc_init(mci, pdev);
From: Kuninori Morimoto kuninori.morimoto.gx@renesas.com
[ Upstream commit f40ecc2743652c0b0f19935f81baf57c601eb7f0 ]
ASoC has 2 functions to set bias level. (A) snd_soc_dapm_force_bias_level() (B) snd_soc_dapm_set_bias_level()
snd_soc_dapm_force_bias_level() (A) will set dapm->bias_level (a) on success.
(A) int snd_soc_dapm_force_bias_level(...) { ... if (ret == 0) (a) dapm->bias_level = level; ... }
snd_soc_dapm_set_bias_level() (B) is also a function that sets bias_level. It will call snd_soc_dapm_force_bias_level() (A) inside, but doesn't set dapm->bias_level by itself. One note is that (A) might not be called.
(B) static int snd_soc_dapm_set_bias_level(...) { ... ret = snd_soc_card_set_bias_level(...); ... if (dapm != &card->dapm) (A) ret = snd_soc_dapm_force_bias_level(...); ... ret = snd_soc_card_set_bias_level_post(...); ... }
dapm->bias_level will be set if (A) was called, but might not be set if (B) was called, even though it calls the set_bias_level() function.
We should set dapm->bias_level if we call snd_soc_dapm_set_bias_level() (B), too.
Signed-off-by: Kuninori Morimoto kuninori.morimoto.gx@renesas.com
Link: https://patch.msgid.link/87qzyn4g4h.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown broonie@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis, here's my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Bug Fix**: This is a clear bug fix that addresses an inconsistency in bias level management. The commit message clearly explains that `snd_soc_dapm_set_bias_level()` doesn't always update `dapm->bias_level`, particularly when dealing with card-level DAPM contexts where `snd_soc_dapm_force_bias_level()` is not called.
2. **Small and Contained**: The fix is minimal - it only adds 4 lines of code:
```c
/* success */
if (ret == 0)
	snd_soc_dapm_init_bias_level(dapm, level);
```
This ensures that `dapm->bias_level` is always updated when the bias level change succeeds.
3. **Important Correctness Issue**: The inconsistency could lead to incorrect bias level tracking, which affects power management decisions in the ASoC subsystem. Code that checks `dapm->bias_level` (as shown in lines like `if (dapm->bias_level == SND_SOC_BIAS_ON)`) may make wrong decisions if the bias level isn't properly tracked.
4. **No Architectural Changes**: This is a straightforward fix that doesn't introduce new features or change the architecture. It simply ensures consistent behavior between the two bias level setting functions.
5. **Low Risk**: The change only affects the success path (when `ret == 0`) and uses an existing inline function `snd_soc_dapm_init_bias_level()` that simply sets `dapm->bias_level = level`. This is the same operation that `snd_soc_dapm_force_bias_level()` performs on success.
6. **Affects User-Facing Functionality**: Incorrect bias level tracking can lead to power management issues, potentially causing audio devices to not power up/down correctly, which directly impacts users.
The commit fixes a real bug where the DAPM bias level state could become out of sync with the actual hardware state, particularly for card-level DAPM contexts. This is exactly the type of bug fix that stable kernels should include - it's small, targeted, fixes a real issue, and has minimal risk of introducing regressions.
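The state-tracking issue can be shown with a heavily simplified model. Note the assumptions: `struct fake_dapm`, `enum fake_bias`, and both `fake_*` functions are hypothetical, the `is_card_dapm` flag stands in for the `dapm != &card->dapm` check, and the card-level callbacks are omitted entirely.

```c
/* Hypothetical, heavily simplified model; not the real ASoC structures. */
enum fake_bias { FAKE_BIAS_OFF, FAKE_BIAS_STANDBY, FAKE_BIAS_ON };

struct fake_dapm {
	enum fake_bias bias_level;
	int is_card_dapm;	/* models the "dapm == &card->dapm" case */
};

/* Models (A): records the level itself on success. */
static int fake_force_bias_level(struct fake_dapm *dapm, enum fake_bias level)
{
	dapm->bias_level = level;
	return 0;
}

/* Models (B) after the patch: records the level whenever the call succeeds. */
static int fake_set_bias_level(struct fake_dapm *dapm, enum fake_bias level)
{
	int ret = 0;

	/* (A) is skipped for the card-level context, as in the commit message. */
	if (!dapm->is_card_dapm)
		ret = fake_force_bias_level(dapm, level);

	if (ret == 0)
		dapm->bias_level = level;	/* the step the patch adds */

	return ret;
}
```

Without the final assignment, the card-level context in this model would keep reporting its old level even though the call returned success.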
 sound/soc/soc-dapm.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c
index f26f9e9d7ce7..7d9c9e8839f6 100644
--- a/sound/soc/soc-dapm.c
+++ b/sound/soc/soc-dapm.c
@@ -739,6 +739,10 @@ static int snd_soc_dapm_set_bias_level(struct snd_soc_dapm_context *dapm,
 out:
 	trace_snd_soc_bias_level_done(dapm, level);
 
+	/* success */
+	if (ret == 0)
+		snd_soc_dapm_init_bias_level(dapm, level);
+
 	return ret;
 }
From: David Collins david.collins@oss.qualcomm.com
[ Upstream commit f8e157ff2df46ddabd930815d196895976227831 ]
Certain TEMP_ALARM GEN2 PMIC peripherals need over-temperature stage 2 automatic PMIC partial shutdown. This will ensure that in the event of reaching the hotter stage 3 over-temperature threshold, repeated faults will be avoided during the automatic PMIC hardware full shutdown. Modify the stage 2 shutdown control logic to ensure that stage 2 shutdown is enabled on all affected PMICs. Read the digital major and minor revision registers to identify these PMICs.
Signed-off-by: David Collins david.collins@oss.qualcomm.com
Signed-off-by: Anjelique Melendez anjelique.melendez@oss.qualcomm.com
Link: https://lore.kernel.org/r/20250710224555.3047790-2-anjelique.melendez@oss.qu...
Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my detailed analysis:
## Key Reasons for Backporting:
### 1. **Fixes a Hardware Safety Issue** The commit addresses a critical hardware protection mechanism where certain PMIC peripherals require stage 2 automatic partial shutdown to prevent "repeated faults" during stage 3 over-temperature events. Without this fix, affected hardware could experience repeated thermal faults during critical temperature conditions, potentially leading to system instability or hardware damage.
### 2. **Targeted Hardware-Specific Fix**
The code changes identify specific PMIC revisions that require this protection:
```c
switch (dig_revision) {
case 0x0001:
case 0x0002:
case 0x0100:
case 0x0101:
	chip->require_stage2_shutdown = true;
	break;
}
```
This shows it's a targeted fix for known hardware issues, not a general enhancement.

### 3. **Small and Contained Change**
The fix is minimal and self-contained:
- Adds reading of DIG_MINOR register
- Adds a `require_stage2_shutdown` flag to the chip structure
- Modifies the logic in `qpnp_tm_update_critical_trip_temp()` to respect this flag
- Total change is 34 insertions and 9 deletions in a single file, with clear boundaries
### 4. **Low Risk of Regression** - The change only affects specific PMIC revisions (0x0001, 0x0002, 0x0100, 0x0101) - For other hardware, behavior remains unchanged - The modification is purely additive - it doesn't alter existing functionality for unaffected devices
### 5. **Prevents Hardware Malfunction** The commit message states that with stage 2 shutdown enabled, "repeated faults will be avoided during the automatic PMIC hardware full shutdown" once the stage 3 over-temperature threshold is reached. This indicates a real hardware malfunction scenario that users could encounter without this patch.
### 6. **Follows Stable Tree Rules** According to stable kernel rules, this qualifies because it: - Fixes a real bug (hardware protection failure) - Is not a new feature - Has minimal changes - Addresses a specific hardware issue that affects users
### Technical Analysis:
The key change is in the `qpnp_tm_update_critical_trip_temp()` function where the condition is modified from:
```c
if (disable_s2_shutdown)
	reg |= SHUTDOWN_CTRL1_OVERRIDE_S2;
```
to:
```c
if (disable_stage2_shutdown && !chip->require_stage2_shutdown)
	reg |= SHUTDOWN_CTRL1_OVERRIDE_STAGE2;
```
This ensures that for the affected PMIC revisions, stage 2 shutdown is never disabled, providing the necessary hardware protection against thermal events.
Given that this fixes a hardware-specific thermal protection issue that could lead to system instability or damage, and the fix is minimal with low regression risk, this commit is an excellent candidate for stable backporting.
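The revision check and the guarded override bit can be combined into a small stand-alone sketch. Several details are assumptions made for illustration: the `fake_*` names are hypothetical, and packing the revision as `(major << 8) | minor` is inferred from the `0x0100`-style constants rather than taken from the driver source.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAKE_OVERRIDE_STAGE2	(1u << 6)	/* mirrors BIT(6) from the patch */
#define FAKE_RATE_25HZ		(1u << 3)	/* mirrors BIT(3) from the patch */

/* Assumption: the revision is packed as (major << 8) | minor. */
static uint16_t fake_dig_revision(uint8_t major, uint8_t minor)
{
	return (uint16_t)((major << 8) | minor);
}

/* Revisions listed in the explanation above as needing stage 2 shutdown. */
static bool fake_require_stage2_shutdown(uint16_t dig_revision)
{
	switch (dig_revision) {
	case 0x0001:
	case 0x0002:
	case 0x0100:
	case 0x0101:
		return true;
	default:
		return false;
	}
}

/* Build the control value; the override bit is set only when allowed. */
static uint8_t fake_shutdown_ctrl1(bool disable_stage2, bool require_stage2)
{
	uint8_t reg = FAKE_RATE_25HZ;

	if (disable_stage2 && !require_stage2)
		reg |= FAKE_OVERRIDE_STAGE2;
	return reg;
}
```

For a revision on the list, the override bit stays clear even when the caller asks to disable stage 2 shutdown; for any other revision the earlier behaviour is unchanged.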
drivers/thermal/qcom/qcom-spmi-temp-alarm.c | 43 ++++++++++++++++----- 1 file changed, 34 insertions(+), 9 deletions(-)
diff --git a/drivers/thermal/qcom/qcom-spmi-temp-alarm.c b/drivers/thermal/qcom/qcom-spmi-temp-alarm.c index a81e7d6e865f..4b91cc13ce34 100644 --- a/drivers/thermal/qcom/qcom-spmi-temp-alarm.c +++ b/drivers/thermal/qcom/qcom-spmi-temp-alarm.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0-only /* * Copyright (c) 2011-2015, 2017, 2020, The Linux Foundation. All rights reserved. + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. */
#include <linux/bitops.h> @@ -16,6 +17,7 @@
#include "../thermal_hwmon.h"
+#define QPNP_TM_REG_DIG_MINOR 0x00 #define QPNP_TM_REG_DIG_MAJOR 0x01 #define QPNP_TM_REG_TYPE 0x04 #define QPNP_TM_REG_SUBTYPE 0x05 @@ -31,7 +33,7 @@ #define STATUS_GEN2_STATE_MASK GENMASK(6, 4) #define STATUS_GEN2_STATE_SHIFT 4
-#define SHUTDOWN_CTRL1_OVERRIDE_S2 BIT(6) +#define SHUTDOWN_CTRL1_OVERRIDE_STAGE2 BIT(6) #define SHUTDOWN_CTRL1_THRESHOLD_MASK GENMASK(1, 0)
#define SHUTDOWN_CTRL1_RATE_25HZ BIT(3) @@ -78,6 +80,7 @@ struct qpnp_tm_chip { /* protects .thresh, .stage and chip registers */ struct mutex lock; bool initialized; + bool require_stage2_shutdown;
struct iio_channel *adc; const long (*temp_map)[THRESH_COUNT][STAGE_COUNT]; @@ -220,13 +223,13 @@ static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip, { long stage2_threshold_min = (*chip->temp_map)[THRESH_MIN][1]; long stage2_threshold_max = (*chip->temp_map)[THRESH_MAX][1]; - bool disable_s2_shutdown = false; + bool disable_stage2_shutdown = false; u8 reg;
WARN_ON(!mutex_is_locked(&chip->lock));
/* - * Default: S2 and S3 shutdown enabled, thresholds at + * Default: Stage 2 and Stage 3 shutdown enabled, thresholds at * lowest threshold set, monitoring at 25Hz */ reg = SHUTDOWN_CTRL1_RATE_25HZ; @@ -241,12 +244,12 @@ static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip, chip->thresh = THRESH_MAX - ((stage2_threshold_max - temp) / TEMP_THRESH_STEP); - disable_s2_shutdown = true; + disable_stage2_shutdown = true; } else { chip->thresh = THRESH_MAX;
if (chip->adc) - disable_s2_shutdown = true; + disable_stage2_shutdown = true; else dev_warn(chip->dev, "No ADC is configured and critical temperature %d mC is above the maximum stage 2 threshold of %ld mC! Configuring stage 2 shutdown at %ld mC.\n", @@ -255,8 +258,8 @@ static int qpnp_tm_update_critical_trip_temp(struct qpnp_tm_chip *chip,
skip: reg |= chip->thresh; - if (disable_s2_shutdown) - reg |= SHUTDOWN_CTRL1_OVERRIDE_S2; + if (disable_stage2_shutdown && !chip->require_stage2_shutdown) + reg |= SHUTDOWN_CTRL1_OVERRIDE_STAGE2;
return qpnp_tm_write(chip, QPNP_TM_REG_SHUTDOWN_CTRL1, reg); } @@ -350,8 +353,8 @@ static int qpnp_tm_probe(struct platform_device *pdev) { struct qpnp_tm_chip *chip; struct device_node *node; - u8 type, subtype, dig_major; - u32 res; + u8 type, subtype, dig_major, dig_minor; + u32 res, dig_revision; int ret, irq;
node = pdev->dev.of_node; @@ -402,6 +405,11 @@ static int qpnp_tm_probe(struct platform_device *pdev) return dev_err_probe(&pdev->dev, ret, "could not read dig_major\n");
+ ret = qpnp_tm_read(chip, QPNP_TM_REG_DIG_MINOR, &dig_minor); + if (ret < 0) + return dev_err_probe(&pdev->dev, ret, + "could not read dig_minor\n"); + if (type != QPNP_TM_TYPE || (subtype != QPNP_TM_SUBTYPE_GEN1 && subtype != QPNP_TM_SUBTYPE_GEN2)) { dev_err(&pdev->dev, "invalid type 0x%02x or subtype 0x%02x\n", @@ -415,6 +423,23 @@ static int qpnp_tm_probe(struct platform_device *pdev) else chip->temp_map = &temp_map_gen1;
+ if (chip->subtype == QPNP_TM_SUBTYPE_GEN2) { + dig_revision = (dig_major << 8) | dig_minor; + /* + * Check if stage 2 automatic partial shutdown must remain + * enabled to avoid potential repeated faults upon reaching + * over-temperature stage 3. + */ + switch (dig_revision) { + case 0x0001: + case 0x0002: + case 0x0100: + case 0x0101: + chip->require_stage2_shutdown = true; + break; + } + } + /* * Register the sensor before initializing the hardware to be able to * read the trip points. get_temp() returns the default temperature
From: Thomas Weißschuh linux@weissschuh.net
[ Upstream commit d5094bcb5bfdfea2cf0de8aaf77cc65db56cbdb5 ]
Nolibc assumes that the kernel ABI uses time values that are as large as a long integer. For most ABIs this holds true. But for x32 this is not correct, as it uses 32bit longs but 64bit times.
Also the 'struct stat' implementation of nolibc relies on timespec::tv_sec and time_t being the same type. While timespec::tv_sec comes from the kernel and is of type __kernel_old_time_t, time_t is defined within nolibc.
Switch to the __kernel_old_time_t to always get the correct type.
Signed-off-by: Thomas Weißschuh linux@weissschuh.net Link: https://lore.kernel.org/r/20250712-nolibc-x32-v1-1-6d81cb798710@weissschuh.n... Acked-by: Willy Tarreau w@1wt.eu Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
Based on my analysis of this commit, here is my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Bug Fix**: This fixes a real ABI compatibility bug where nolibc incorrectly assumes `time_t` is always the same size as `long`. On x32 architecture (x86-64 with 32-bit pointers), `long` is 32-bit but kernel time values are 64-bit, causing type mismatches.
2. **Clear Impact**: The commit message clearly explains the issue - the `struct stat` implementation in nolibc relies on `timespec::tv_sec` and `time_t` being the same type. Looking at the code: - In `/include/uapi/linux/time.h:12`, `struct timespec` uses `__kernel_old_time_t tv_sec` - In the original nolibc code at `tools/include/nolibc/std.h:30`, `time_t` was defined as `signed long` - In `tools/include/nolibc/types.h:183-185`, the stat structure uses unions that expect `time_t` and `struct timespec` members to be compatible
3. **Small and Contained Fix**: The change is minimal - just changing one typedef from `signed long` to `__kernel_old_time_t` and adding the necessary include. This is exactly the kind of focused fix suitable for stable.
4. **No Feature Changes**: This doesn't add new functionality or change the API - it just corrects the type definition to match what the kernel actually uses.
5. **Low Risk**: The change uses the kernel's own type definition (`__kernel_old_time_t`) which is already defined as `__kernel_long_t` in the generic case, so for most architectures this is a no-op. It only changes behavior on architectures like x32 where it's actually needed.
6. **Prevents Runtime Failures**: Without this fix, nolibc programs on x32 would have incorrect stat structures where time fields don't align properly with kernel expectations, potentially causing data corruption or incorrect time values.
The commit follows stable kernel rules by fixing an important bug with minimal changes and low regression risk. It's a clear correctness fix for a specific architecture compatibility issue.
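The type mismatch can be illustrated with a small user-space model. This is a sketch under stated assumptions: the `model_*` typedefs are hypothetical stand-ins for the x32 ABI (32-bit `long`, 64-bit kernel time), the unions only imitate the shape of the nolibc `struct stat` time members, and the width chosen for `tv_nsec` is irrelevant to the point being made.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for x32: 32-bit long, 64-bit kernel time value. */
typedef int32_t model_x32_long;
typedef int64_t model_kernel_old_time_t;

struct model_timespec {
	model_kernel_old_time_t tv_sec;
	int64_t tv_nsec; /* width not relevant to this sketch */
};

/* Before the patch: time_t is "signed long". */
union model_stat_time_old {
	model_x32_long st_atime;
	struct model_timespec st_atim;
};

/* After the patch: time_t is __kernel_old_time_t. */
union model_stat_time_new {
	model_kernel_old_time_t st_atime;
	struct model_timespec st_atim;
};

/* True when the bare seconds member has the same width as tv_sec. */
static bool old_alias_matches(void)
{
	return sizeof(((union model_stat_time_old *)0)->st_atime) ==
	       sizeof(((struct model_timespec *)0)->tv_sec);
}

static bool new_alias_matches(void)
{
	return sizeof(((union model_stat_time_new *)0)->st_atime) ==
	       sizeof(((struct model_timespec *)0)->tv_sec);
}
```

In the old layout the seconds alias covers only half of `tv_sec`; in the new layout both members have the same width.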
tools/include/nolibc/std.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/include/nolibc/std.h b/tools/include/nolibc/std.h
index adda7333d12e..ba950f0e7338 100644
--- a/tools/include/nolibc/std.h
+++ b/tools/include/nolibc/std.h
@@ -16,6 +16,8 @@
 #include "stdint.h"
 #include "stddef.h"
 
+#include <linux/types.h>
+
 /* those are commonly provided by sys/types.h */
 typedef unsigned int dev_t;
 typedef unsigned long ino_t;
@@ -27,6 +29,6 @@ typedef unsigned long nlink_t;
 typedef signed long off_t;
 typedef signed long blksize_t;
 typedef signed long blkcnt_t;
-typedef signed long time_t;
+typedef __kernel_old_time_t time_t;
 
 #endif /* _NOLIBC_STD_H */
From: David Lechner dlechner@baylibre.com
[ Upstream commit 5a2f15c5a8e017d0951e6dc62aa7b5b634f56881 ]
Fix overallocating the size of the scan buffer by converting bits to bytes. The size is meant to be in bytes, so scanbits needs to be divided by 8.
Signed-off-by: David Lechner dlechner@baylibre.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Nuno Sá nuno.sa@analog.com Link: https://patch.msgid.link/20250701-iio-adc-ad7173-add-spi-offload-support-v3-... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit fixes a clear memory overallocation bug in the ad_sigma_delta ADC driver that has been present since kernel v5.19. The analysis shows:
1. **Bug Description**: The code incorrectly calculates buffer size by treating `storagebits` (which is in bits) as if it were in bytes. The `storagebits` field in `struct iio_scan_type` represents the number of bits needed to store a sample, typically 8, 16, 24, or 32 bits.
2. **Impact**: The bug causes the driver to allocate 8x more memory than needed. For example: - If `storagebits = 16` (2 bytes per sample) and `slot = 4` - Buggy calculation: `ALIGN(4 * 16, 8) = 64 bytes` - Correct calculation: `ALIGN(4 * 16 / 8, 8) = 8 bytes`
This wastes kernel memory and could potentially lead to memory exhaustion in systems with many IIO devices.
3. **Fix Quality**: The fix is minimal and correct - simply dividing by 8 to convert bits to bytes. This is consistent with how `storagebits` is used throughout the IIO subsystem, as evidenced by the grep results showing `sizeof(u32) * 8` assignments.
4. **Affected Versions**: The bug was introduced in commit 8bea9af887de4 ("iio: adc: ad_sigma_delta: Add sequencer support") which first appeared in v5.19 and is present in all releases since then (v5.19, v6.0, v6.1, v6.10, v6.11).
5. **Stable Criteria Met**: - ✓ Fixes a real bug (memory overallocation) - ✓ Small, contained fix (single line change) - ✓ No architectural changes - ✓ Low risk of regression - ✓ Bug affects users (wastes memory) - ✓ Clear and obvious fix
The commit should be backported to all stable kernels from v5.19 onwards to fix this memory waste issue in the industrial I/O subsystem.
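The size arithmetic above can be checked with a short user-space sketch. The function names are hypothetical, and `align_up()` is a local stand-in for the kernel's `ALIGN()` macro; the trailing `sizeof(int64_t)` term comes from the timestamp line in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to a multiple of a; a must be a power of two, like ALIGN(). */
static size_t align_up(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Pre-patch: storagebits (a bit count) is treated as a byte count. */
static size_t samples_buf_size_buggy(size_t slot, size_t storagebits)
{
	return align_up(slot * storagebits, 8) + sizeof(int64_t);
}

/* Post-patch: bits are converted to bytes before aligning. */
static size_t samples_buf_size_fixed(size_t slot, size_t storagebits)
{
	return align_up(slot * storagebits / 8, 8) + sizeof(int64_t);
}
```

For `slot = 4` and `storagebits = 16`, the sample area is 64 bytes before the fix and 8 bytes after it; including the 8-byte timestamp, the totals are 72 and 16 bytes.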
drivers/iio/adc/ad_sigma_delta.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/adc/ad_sigma_delta.c b/drivers/iio/adc/ad_sigma_delta.c
index 4c5f8d29a559..6b3ef7ef403e 100644
--- a/drivers/iio/adc/ad_sigma_delta.c
+++ b/drivers/iio/adc/ad_sigma_delta.c
@@ -489,7 +489,7 @@ static int ad_sd_buffer_postenable(struct iio_dev *indio_dev)
 		return ret;
 	}
 
-	samples_buf_size = ALIGN(slot * indio_dev->channels[0].scan_type.storagebits, 8);
+	samples_buf_size = ALIGN(slot * indio_dev->channels[0].scan_type.storagebits / 8, 8);
 	samples_buf_size += sizeof(int64_t);
 	samples_buf = devm_krealloc(&sigma_delta->spi->dev, sigma_delta->samples_buf,
 				    samples_buf_size, GFP_KERNEL);
From: Bartosz Golaszewski bartosz.golaszewski@linaro.org
[ Upstream commit a0b2a6bbff8c26aafdecd320f38f52c341d5cafa ]
regmap_update_bits() can fail, check its return value like we do elsewhere in the driver.
Link: https://lore.kernel.org/r/20250707-gpiochip-set-rv-gpio-round4-v1-2-35668aaa... Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Bug Fix with Real Impact**: The commit fixes a bug where the return value of `regmap_update_bits()` in `tps65912_gpio_direction_output()` is not checked (line 54-55 in the original code). This could lead to silent failures where the GPIO initial value is not set correctly, but the function continues to configure the direction, potentially leaving the GPIO in an inconsistent state.
2. **Consistency with Driver Pattern**: The driver already checks return values of `regmap_update_bits()` in other functions: - `tps65912_gpio_direction_input()` (line 44) properly returns the result - The second `regmap_update_bits()` call in `tps65912_gpio_direction_output()` (line 57) also returns the result - `tps65912_gpio_get_direction()` checks the return value of `regmap_read()`
3. **Small and Contained Fix**: The patch is minimal - it only adds: - An `int ret` variable declaration - Captures the return value of the first `regmap_update_bits()` call - Adds an error check that returns early if the operation failed
4. **No Side Effects**: This change doesn't introduce any new functionality or alter the existing behavior when operations succeed. It only adds proper error handling that was missing.
5. **Hardware Communication Reliability**: The TPS65912 is a PMIC (Power Management IC) that communicates over I2C/SPI. Hardware communication can fail due to various reasons (bus errors, device issues), and not checking return values can lead to incorrect GPIO states which could affect system stability or connected peripherals.
6. **Follows Kernel Best Practices**: The kernel coding standards require checking return values of functions that can fail, especially for hardware operations. This fix brings the code in line with those standards.
The commit message clearly indicates this is a bug fix ("regmap_update_bits() can fail, check its return value like we do elsewhere in the driver"), making it a perfect candidate for stable backporting as it improves driver reliability without any risk of regression.
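The error-propagation pattern can be sketched with a fake bus in user space. Everything here is hypothetical scaffolding: `struct fake_regmap`, `fake_update_bits()` and the `MODEL_*` constants are placeholders, not the regmap API or the real TPS65912 register values. Only the control flow mirrors the patched `tps65912_gpio_direction_output()`.

```c
#include <errno.h>

/* Placeholder register and mask values, chosen only for illustration. */
#define MODEL_GPIO1_REG 0x10u
#define MODEL_SET_MASK  0x01u
#define MODEL_CFG_MASK  0x04u

/* Hypothetical fake bus that can fail a chosen write. */
struct fake_regmap {
	int writes_attempted;
	int fail_on_write; /* 1-based index of the write that fails, 0 = never */
};

static int fake_update_bits(struct fake_regmap *map, unsigned int reg,
			    unsigned int mask, unsigned int val)
{
	(void)reg;
	(void)mask;
	(void)val;
	map->writes_attempted++;
	return map->writes_attempted == map->fail_on_write ? -EIO : 0;
}

/* Same shape as the patched function: stop if setting the value fails. */
static int direction_output_checked(struct fake_regmap *map,
				    unsigned int offset, int value)
{
	int ret;

	/* Set the initial value */
	ret = fake_update_bits(map, MODEL_GPIO1_REG + offset,
			       MODEL_SET_MASK, value ? MODEL_SET_MASK : 0);
	if (ret)
		return ret;

	/* Only switch the pin to output once the value is known to be set */
	return fake_update_bits(map, MODEL_GPIO1_REG + offset,
				MODEL_CFG_MASK, MODEL_CFG_MASK);
}
```

When the first write fails, the caller sees the error and the direction write is never issued, so the pin is not switched to output with an unknown level.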
drivers/gpio/gpio-tps65912.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/gpio/gpio-tps65912.c b/drivers/gpio/gpio-tps65912.c index fab771cb6a87..bac757c191c2 100644 --- a/drivers/gpio/gpio-tps65912.c +++ b/drivers/gpio/gpio-tps65912.c @@ -49,10 +49,13 @@ static int tps65912_gpio_direction_output(struct gpio_chip *gc, unsigned offset, int value) { struct tps65912_gpio *gpio = gpiochip_get_data(gc); + int ret;
/* Set the initial value */ - regmap_update_bits(gpio->tps->regmap, TPS65912_GPIO1 + offset, - GPIO_SET_MASK, value ? GPIO_SET_MASK : 0); + ret = regmap_update_bits(gpio->tps->regmap, TPS65912_GPIO1 + offset, + GPIO_SET_MASK, value ? GPIO_SET_MASK : 0); + if (ret) + return ret;
return regmap_update_bits(gpio->tps->regmap, TPS65912_GPIO1 + offset, GPIO_CFG_MASK, GPIO_CFG_MASK);
From: Michael Walle mwalle@kernel.org
[ Upstream commit 626bb0a45584d544d84eab909795ccb355062bcc ]
The TPS652G1 is a stripped down version of the TPS65224. From a software point of view, it lacks any voltage monitoring, the watchdog, the ESM and the ADC.
Signed-off-by: Michael Walle mwalle@kernel.org Link: https://lore.kernel.org/r/20250613114518.1772109-2-mwalle@kernel.org Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Hardware Enablement (Device ID Addition)**: This patch adds support for a new PMIC variant (TPS652G1) by adding its device ID and configuration. According to stable kernel rules, patches that "just add a device ID" are explicitly allowed for stable backporting.
2. **Self-Contained Changes**: The modifications are isolated to adding support for the new device without altering existing functionality: - Adds `TPS652G1` to the `enum pmic_id` - Adds TPS652G1-specific MFD cells configuration - Adds device-specific IRQ mappings (subset of TPS65224 IRQs) - Adds compatible strings "ti,tps652g1" to I2C and SPI device tables - Properly handles the stripped-down nature of TPS652G1 (no RTC, ADC, watchdog, ESM)
3. **Low Risk**: The changes follow the existing driver pattern and only add conditional paths for the new device: ```c if (tps->chip_id == TPS65224 || tps->chip_id == TPS652G1) ``` This ensures existing device support remains unaffected.
4. **User Benefit**: Without this patch, users with TPS652G1 hardware cannot use their devices on stable kernels. This directly impacts hardware functionality for affected users.
5. **Proper Implementation**: The patch correctly handles the TPS652G1 as a feature-reduced variant of TPS65224, sharing the same register layout and CRC handling while properly excluding unsupported features.
The patch is relatively small, follows established driver patterns, and enables essential hardware support without introducing architectural changes or new features beyond device enablement.
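The chip-ID conditionals touched by the patch can be summarized as two predicates. This is only a sketch: the helper names are hypothetical, and the position of `TPS6594` as the first enumerator is an assumption, since the diff context shows only the later entries.

```c
#include <stdbool.h>

/* Enumerator order after TPS6594 follows the diff; TPS6594 first is assumed. */
enum pmic_id {
	TPS6594,
	TPS6593,
	LP8764,
	TPS65224,
	TPS652G1,
};

/* Hypothetical helper: "No RTC for LP8764, TPS65224 and TPS652G1". */
static bool pmic_has_rtc(enum pmic_id id)
{
	return id != LP8764 && id != TPS65224 && id != TPS652G1;
}

/* Hypothetical helper: chips that take the TPS65224 CRC and volatile-table paths. */
static bool pmic_uses_tps65224_paths(enum pmic_id id)
{
	return id == TPS65224 || id == TPS652G1;
}
```

Existing chip IDs return the same answers as before the patch; only the new `TPS652G1` value adds a case.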
drivers/mfd/tps6594-core.c | 88 ++++++++++++++++++++++++++++++++++--- drivers/mfd/tps6594-i2c.c | 10 ++++- drivers/mfd/tps6594-spi.c | 10 ++++- include/linux/mfd/tps6594.h | 1 + 4 files changed, 99 insertions(+), 10 deletions(-)
diff --git a/drivers/mfd/tps6594-core.c b/drivers/mfd/tps6594-core.c index a7223e873cd1..c16c37e36617 100644 --- a/drivers/mfd/tps6594-core.c +++ b/drivers/mfd/tps6594-core.c @@ -1,6 +1,11 @@ // SPDX-License-Identifier: GPL-2.0 /* - * Core functions for TI TPS65224/TPS6594/TPS6593/LP8764 PMICs + * Core functions for following TI PMICs: + * - LP8764 + * - TPS65224 + * - TPS652G1 + * - TPS6593 + * - TPS6594 * * Copyright (C) 2023 BayLibre Incorporated - https://www.baylibre.com/ */ @@ -414,6 +419,61 @@ static const unsigned int tps65224_irq_reg[] = { TPS6594_REG_INT_FSM_ERR, };
+/* TPS652G1 Resources */
+
+static const struct mfd_cell tps652g1_common_cells[] = {
+	MFD_CELL_RES("tps6594-pfsm", tps65224_pfsm_resources),
+	MFD_CELL_RES("tps6594-pinctrl", tps65224_pinctrl_resources),
+	MFD_CELL_NAME("tps6594-regulator"),
+};
+
+static const struct regmap_irq tps652g1_irqs[] = {
+	/* INT_GPIO register */
+	REGMAP_IRQ_REG(TPS65224_IRQ_GPIO1, 2, TPS65224_BIT_GPIO1_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_GPIO2, 2, TPS65224_BIT_GPIO2_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_GPIO3, 2, TPS65224_BIT_GPIO3_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_GPIO4, 2, TPS65224_BIT_GPIO4_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_GPIO5, 2, TPS65224_BIT_GPIO5_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_GPIO6, 2, TPS65224_BIT_GPIO6_INT),
+
+	/* INT_STARTUP register */
+	REGMAP_IRQ_REG(TPS65224_IRQ_VSENSE, 3, TPS65224_BIT_VSENSE_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_ENABLE, 3, TPS6594_BIT_ENABLE_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_PB_SHORT, 3, TPS65224_BIT_PB_SHORT_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_FSD, 3, TPS6594_BIT_FSD_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_SOFT_REBOOT, 3, TPS6594_BIT_SOFT_REBOOT_INT),
+
+	/* INT_MISC register */
+	REGMAP_IRQ_REG(TPS65224_IRQ_BIST_PASS, 4, TPS6594_BIT_BIST_PASS_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_EXT_CLK, 4, TPS6594_BIT_EXT_CLK_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_REG_UNLOCK, 4, TPS65224_BIT_REG_UNLOCK_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_TWARN, 4, TPS6594_BIT_TWARN_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_PB_LONG, 4, TPS65224_BIT_PB_LONG_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_PB_FALL, 4, TPS65224_BIT_PB_FALL_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_PB_RISE, 4, TPS65224_BIT_PB_RISE_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_ADC_CONV_READY, 4, TPS65224_BIT_ADC_CONV_READY_INT),
+
+	/* INT_MODERATE_ERR register */
+	REGMAP_IRQ_REG(TPS65224_IRQ_TSD_ORD, 5, TPS6594_BIT_TSD_ORD_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_BIST_FAIL, 5, TPS6594_BIT_BIST_FAIL_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_REG_CRC_ERR, 5, TPS6594_BIT_REG_CRC_ERR_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_RECOV_CNT, 5, TPS6594_BIT_RECOV_CNT_INT),
+
+	/* INT_SEVERE_ERR register */
+	REGMAP_IRQ_REG(TPS65224_IRQ_TSD_IMM, 6, TPS6594_BIT_TSD_IMM_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_VCCA_OVP, 6, TPS6594_BIT_VCCA_OVP_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_PFSM_ERR, 6, TPS6594_BIT_PFSM_ERR_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_BG_XMON, 6, TPS65224_BIT_BG_XMON_INT),
+
+	/* INT_FSM_ERR register */
+	REGMAP_IRQ_REG(TPS65224_IRQ_IMM_SHUTDOWN, 7, TPS6594_BIT_IMM_SHUTDOWN_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_ORD_SHUTDOWN, 7, TPS6594_BIT_ORD_SHUTDOWN_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_MCU_PWR_ERR, 7, TPS6594_BIT_MCU_PWR_ERR_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_SOC_PWR_ERR, 7, TPS6594_BIT_SOC_PWR_ERR_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_COMM_ERR, 7, TPS6594_BIT_COMM_ERR_INT),
+	REGMAP_IRQ_REG(TPS65224_IRQ_I2C2_ERR, 7, TPS65224_BIT_I2C2_ERR_INT),
+};
+
 static inline unsigned int tps6594_get_irq_reg(struct regmap_irq_chip_data *data,
 					       unsigned int base, int index)
 {
@@ -443,7 +503,7 @@ static int tps6594_handle_post_irq(void *irq_drv_data)
 	 * a new interrupt.
 	 */
 	if (tps->use_crc) {
-		if (tps->chip_id == TPS65224) {
+		if (tps->chip_id == TPS65224 || tps->chip_id == TPS652G1) {
 			regmap_reg = TPS6594_REG_INT_FSM_ERR;
 			mask_val = TPS6594_BIT_COMM_ERR_INT;
 		} else {
@@ -481,6 +541,18 @@ static struct regmap_irq_chip tps65224_irq_chip = {
 	.handle_post_irq = tps6594_handle_post_irq,
 };
+static struct regmap_irq_chip tps652g1_irq_chip = {
+	.ack_base = TPS6594_REG_INT_BUCK,
+	.ack_invert = 1,
+	.clear_ack = 1,
+	.init_ack_masked = 1,
+	.num_regs = ARRAY_SIZE(tps65224_irq_reg),
+	.irqs = tps652g1_irqs,
+	.num_irqs = ARRAY_SIZE(tps652g1_irqs),
+	.get_irq_reg = tps6594_get_irq_reg,
+	.handle_post_irq = tps6594_handle_post_irq,
+};
+
 static const struct regmap_range tps6594_volatile_ranges[] = {
 	regmap_reg_range(TPS6594_REG_INT_TOP, TPS6594_REG_STAT_READBACK_ERR),
 	regmap_reg_range(TPS6594_REG_RTC_STATUS, TPS6594_REG_RTC_STATUS),
@@ -507,7 +579,7 @@ static int tps6594_check_crc_mode(struct tps6594 *tps, bool primary_pmic)
 	int ret;
 	unsigned int regmap_reg, mask_val;
- if (tps->chip_id == TPS65224) { + if (tps->chip_id == TPS65224 || tps->chip_id == TPS652G1) { regmap_reg = TPS6594_REG_CONFIG_2; mask_val = TPS65224_BIT_I2C1_SPI_CRC_EN; } else { @@ -537,7 +609,7 @@ static int tps6594_set_crc_feature(struct tps6594 *tps) int ret; unsigned int regmap_reg, mask_val;
- if (tps->chip_id == TPS65224) { + if (tps->chip_id == TPS65224 || tps->chip_id == TPS652G1) { regmap_reg = TPS6594_REG_CONFIG_2; mask_val = TPS65224_BIT_I2C1_SPI_CRC_EN; } else { @@ -628,6 +700,10 @@ int tps6594_device_init(struct tps6594 *tps, bool enable_crc) irq_chip = &tps65224_irq_chip; n_cells = ARRAY_SIZE(tps65224_common_cells); cells = tps65224_common_cells; + } else if (tps->chip_id == TPS652G1) { + irq_chip = &tps652g1_irq_chip; + n_cells = ARRAY_SIZE(tps652g1_common_cells); + cells = tps652g1_common_cells; } else { irq_chip = &tps6594_irq_chip; n_cells = ARRAY_SIZE(tps6594_common_cells); @@ -651,8 +727,8 @@ int tps6594_device_init(struct tps6594 *tps, bool enable_crc) if (ret) return dev_err_probe(dev, ret, "Failed to add common child devices\n");
- /* No RTC for LP8764 and TPS65224 */ - if (tps->chip_id != LP8764 && tps->chip_id != TPS65224) { + /* No RTC for LP8764, TPS65224 and TPS652G1 */ + if (tps->chip_id != LP8764 && tps->chip_id != TPS65224 && tps->chip_id != TPS652G1) { ret = devm_mfd_add_devices(dev, PLATFORM_DEVID_AUTO, tps6594_rtc_cells, ARRAY_SIZE(tps6594_rtc_cells), NULL, 0, regmap_irq_get_domain(tps->irq_data)); diff --git a/drivers/mfd/tps6594-i2c.c b/drivers/mfd/tps6594-i2c.c index 4ab91c34d9fb..7ff7516286fd 100644 --- a/drivers/mfd/tps6594-i2c.c +++ b/drivers/mfd/tps6594-i2c.c @@ -1,6 +1,11 @@ // SPDX-License-Identifier: GPL-2.0 /* - * I2C access driver for TI TPS65224/TPS6594/TPS6593/LP8764 PMICs + * I2C access driver for the following TI PMICs: + * - LP8764 + * - TPS65224 + * - TPS652G1 + * - TPS6593 + * - TPS6594 * * Copyright (C) 2023 BayLibre Incorporated - https://www.baylibre.com/ */ @@ -197,6 +202,7 @@ static const struct of_device_id tps6594_i2c_of_match_table[] = { { .compatible = "ti,tps6593-q1", .data = (void *)TPS6593, }, { .compatible = "ti,lp8764-q1", .data = (void *)LP8764, }, { .compatible = "ti,tps65224-q1", .data = (void *)TPS65224, }, + { .compatible = "ti,tps652g1", .data = (void *)TPS652G1, }, {} }; MODULE_DEVICE_TABLE(of, tps6594_i2c_of_match_table); @@ -222,7 +228,7 @@ static int tps6594_i2c_probe(struct i2c_client *client) return dev_err_probe(dev, -EINVAL, "Failed to find matching chip ID\n"); tps->chip_id = (unsigned long)match->data;
- if (tps->chip_id == TPS65224) + if (tps->chip_id == TPS65224 || tps->chip_id == TPS652G1) tps6594_i2c_regmap_config.volatile_table = &tps65224_volatile_table;
tps->regmap = devm_regmap_init(dev, NULL, client, &tps6594_i2c_regmap_config); diff --git a/drivers/mfd/tps6594-spi.c b/drivers/mfd/tps6594-spi.c index 6ebccb79f0cc..944b7313a1d9 100644 --- a/drivers/mfd/tps6594-spi.c +++ b/drivers/mfd/tps6594-spi.c @@ -1,6 +1,11 @@ // SPDX-License-Identifier: GPL-2.0 /* - * SPI access driver for TI TPS65224/TPS6594/TPS6593/LP8764 PMICs + * SPI access driver for the following TI PMICs: + * - LP8764 + * - TPS65224 + * - TPS652G1 + * - TPS6593 + * - TPS6594 * * Copyright (C) 2023 BayLibre Incorporated - https://www.baylibre.com/ */ @@ -82,6 +87,7 @@ static const struct of_device_id tps6594_spi_of_match_table[] = { { .compatible = "ti,tps6593-q1", .data = (void *)TPS6593, }, { .compatible = "ti,lp8764-q1", .data = (void *)LP8764, }, { .compatible = "ti,tps65224-q1", .data = (void *)TPS65224, }, + { .compatible = "ti,tps652g1", .data = (void *)TPS652G1, }, {} }; MODULE_DEVICE_TABLE(of, tps6594_spi_of_match_table); @@ -107,7 +113,7 @@ static int tps6594_spi_probe(struct spi_device *spi) return dev_err_probe(dev, -EINVAL, "Failed to find matching chip ID\n"); tps->chip_id = (unsigned long)match->data;
- if (tps->chip_id == TPS65224) + if (tps->chip_id == TPS65224 || tps->chip_id == TPS652G1) tps6594_spi_regmap_config.volatile_table = &tps65224_volatile_table;
tps->regmap = devm_regmap_init(dev, NULL, spi, &tps6594_spi_regmap_config); diff --git a/include/linux/mfd/tps6594.h b/include/linux/mfd/tps6594.h index 16543fd4d83e..021db8875963 100644 --- a/include/linux/mfd/tps6594.h +++ b/include/linux/mfd/tps6594.h @@ -19,6 +19,7 @@ enum pmic_id { TPS6593, LP8764, TPS65224, + TPS652G1, };
/* Macro to get page index from register address */
From: Aaron Kling webgeek1234@gmail.com
[ Upstream commit 398e67e0f5ae04b29bcc9cbf342e339fe9d3f6f1 ]
Kasan crashes the kernel trying to check boundaries when using the normal memcpy.
Signed-off-by: Aaron Kling webgeek1234@gmail.com Link: https://lore.kernel.org/r/20250522-mach-tegra-kasan-v1-1-419041b8addb@gmail.... Signed-off-by: Thierry Reding treding@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
Based on my analysis of this commit, here's my determination:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real crash/bug**: The commit fixes a KASAN (Kernel Address Sanitizer) crash that occurs when using regular `memcpy()` to write to IRAM (Internal RAM). This is a functional bug that causes kernel crashes when KASAN is enabled.
2. **Small and contained fix**: The change is minimal - it simply replaces `memcpy()` with `memcpy_toio()` on line 66 of arch/arm/mach-tegra/reset.c. This is a one-line change that doesn't alter any logic or introduce new features.
3. **Follows proper I/O memory access patterns**: The fix is technically correct. According to the kernel documentation in Documentation/driver-api/device-io.rst, I/O memory regions (which IRAM is mapped as via `IO_ADDRESS()`) should be accessed using I/O-specific functions like `memcpy_toio()` rather than regular `memcpy()`. The documentation explicitly states: "Do not use memset or memcpy on IO addresses; they are not guaranteed to copy data in order."
4. **No architectural changes**: The commit doesn't introduce any architectural changes or new functionality. It's purely a bug fix that corrects improper memory access.
5. **Minimal risk of regression**: Since this change only affects how data is copied to IRAM during the Tegra CPU reset handler initialization, and uses the proper kernel API for I/O memory access, the risk of introducing new issues is very low.
6. **Platform-specific but important**: While this fix is specific to ARM Tegra platforms, it fixes a crash that would affect any Tegra system running with KASAN enabled. This is important for developers and users who rely on KASAN for debugging.
The fact that `iram_base` is obtained through `IO_ADDRESS()` macro clearly indicates this is I/O mapped memory that requires I/O-specific accessors. KASAN correctly identified this misuse and crashed to prevent potential issues. The fix properly uses `memcpy_toio()` which is designed for copying to I/O memory regions and won't trigger KASAN checks for regular memory access.
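The idea of a dedicated I/O copy routine can be illustrated in user space. This is a hypothetical model, not the kernel's `memcpy_toio()`, whose real implementation is architecture-specific: `model_memcpy_toio()` simply performs byte stores through a `volatile` pointer so that each store is emitted and the copy does not go through the ordinary `memcpy()` path.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of an I/O copy. Each byte is stored
 * through a volatile pointer, so the compiler may not merge, reorder
 * or drop the stores, and the call never resolves to memcpy().
 */
static void model_memcpy_toio(volatile void *dst, const void *src, size_t count)
{
	volatile uint8_t *d = dst;
	const uint8_t *s = src;

	while (count--)
		*d++ = *s++;
}
```

In the patch, the destination is `iram_base` and the source is the reset handler image; here any byte buffer can stand in for both.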
arch/arm/mach-tegra/reset.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/mach-tegra/reset.c b/arch/arm/mach-tegra/reset.c
index d5c805adf7a8..ea706fac6358 100644
--- a/arch/arm/mach-tegra/reset.c
+++ b/arch/arm/mach-tegra/reset.c
@@ -63,7 +63,7 @@ static void __init tegra_cpu_reset_handler_enable(void)
 	BUG_ON(is_enabled);
 	BUG_ON(tegra_cpu_reset_handler_size > TEGRA_IRAM_RESET_HANDLER_SIZE);
 
-	memcpy(iram_base, (void *)__tegra_cpu_reset_handler_start,
+	memcpy_toio(iram_base, (void *)__tegra_cpu_reset_handler_start,
 			tegra_cpu_reset_handler_size);
 
 	err = call_firmware_op(set_cpu_boot_addr, 0, reset_address);
From: Thomas Weißschuh thomas.weissschuh@linutronix.de
[ Upstream commit a40f0cdce78be8a559ee8a85c908049c65a410b2 ]
The heuristic to derive a clang target triple from a GCC one does not work for s390. GCC uses "s390-linux" while clang expects "s390x-linux" or "systemz-linux".
Add an explicit override.
Signed-off-by: Thomas Weißschuh thomas.weissschuh@linutronix.de Link: https://lore.kernel.org/r/20250620-tools-cross-s390-v2-1-ecda886e00e5@linutr... Signed-off-by: Thomas Weißschuh linux@weissschuh.net Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
Based on my analysis:
**Backport Status: YES**
This commit fixes a critical cross-compilation issue where the heuristic to derive a clang target triple from a GCC one fails for s390 architecture. Here's my detailed analysis:
## Analysis of Code Changes:
The patch modifies `tools/scripts/Makefile.include` by replacing the single `CLANG_CROSS_FLAGS` assignment at line 104 with three lines:
```makefile
-CLANG_CROSS_FLAGS := --target=$(notdir $(CROSS_COMPILE:%-=%))
+CLANG_TARGET := $(notdir $(CROSS_COMPILE:%-=%))
+CLANG_TARGET := $(subst s390-linux,s390x-linux,$(CLANG_TARGET))
+CLANG_CROSS_FLAGS := --target=$(CLANG_TARGET)
```
## Key Findings:
1. **Bug Type**: This fixes a build failure when cross-compiling kernel tools for s390 architecture using clang.
2. **Root Cause**: GCC uses "s390-linux" as the target triple while clang expects "s390x-linux" or "systemz-linux". The existing heuristic that derives the clang target from the GCC cross-compiler prefix fails for s390.
3. **Similar Fixes Already Present**: I found that other parts of the kernel already handle this incompatibility: - `tools/testing/selftests/nolibc/Makefile:` converts s390-linux to systemz-linux - `tools/include/nolibc/Makefile:` converts s390-linux to systemz-linux - `scripts/Makefile.clang:` directly uses s390x-linux-gnu
4. **Impact**: Without this fix, cross-compilation of kernel tools for s390 using clang will fail, affecting: - Developers building tools for s390 systems - CI/CD pipelines that cross-compile for multiple architectures - Distribution builders who need to build kernel tools
5. **Risk Assessment**: - The change is minimal and isolated to the build system - It only affects the clang target selection logic - No runtime behavior changes - Similar transformations are already proven in other makefiles
6. **Stable Tree Criteria**: - ✓ Fixes a real bug that affects users (build failure) - ✓ Small and contained fix (2 lines) - ✓ No side effects beyond fixing the issue - ✓ No architectural changes - ✓ Only touches build infrastructure - ✓ Minimal risk of regression
This is a clear candidate for stable backporting as it fixes a concrete build failure with minimal risk.
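The effect of the `$(subst ...)` override can be modeled in C to make the before and after triples concrete. This is a sketch only: `clang_target_from_gcc()` is a hypothetical helper, and unlike Make's `subst` it rewrites just the first occurrence, which is assumed to be sufficient for a target triple.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical model of $(subst s390-linux,s390x-linux,...).
 * Writes the clang triple into out; returns 0 on success, -1 if out is too small.
 */
static int clang_target_from_gcc(const char *gcc_triple, char *out, size_t out_len)
{
	static const char from[] = "s390-linux";
	static const char to[] = "s390x-linux";
	const char *hit = strstr(gcc_triple, from);
	int n;

	if (!hit)
		/* No s390 prefix: the GCC triple is passed through unchanged. */
		n = snprintf(out, out_len, "%s", gcc_triple);
	else
		/* Copy the part before the match, the replacement, then the rest. */
		n = snprintf(out, out_len, "%.*s%s%s",
			     (int)(hit - gcc_triple), gcc_triple,
			     to, hit + strlen(from));

	return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

A prefix that is already `s390x-linux-gnu` does not contain the substring `s390-linux`, so it passes through unchanged, as do triples for other architectures.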
tools/scripts/Makefile.include | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include
index 5158250988ce..ded48263dd5e 100644
--- a/tools/scripts/Makefile.include
+++ b/tools/scripts/Makefile.include
@@ -101,7 +101,9 @@ else ifneq ($(CROSS_COMPILE),)
 # Allow userspace to override CLANG_CROSS_FLAGS to specify their own
 # sysroots and flags or to avoid the GCC call in pure Clang builds.
 ifeq ($(CLANG_CROSS_FLAGS),)
-CLANG_CROSS_FLAGS := --target=$(notdir $(CROSS_COMPILE:%-=%))
+CLANG_TARGET := $(notdir $(CROSS_COMPILE:%-=%))
+CLANG_TARGET := $(subst s390-linux,s390x-linux,$(CLANG_TARGET))
+CLANG_CROSS_FLAGS := --target=$(CLANG_TARGET)
 GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)gcc 2>/dev/null))
 ifneq ($(GCC_TOOLCHAIN_DIR),)
 CLANG_CROSS_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(notdir $(CROSS_COMPILE))
From: "Masami Hiramatsu (Google)" mhiramat@kernel.org
[ Upstream commit a089bb2822a49b0c5777a8936f82c1f8629231fb ]
Since commit c5b6ababd21a ("locking/mutex: implement mutex_trylock_nested") makes mutex_trylock() an inlined function if CONFIG_DEBUG_LOCK_ALLOC=y, we cannot use mutex_trylock() for testing the glob filter of ftrace.
Use mutex_unlock instead.
Link: https://lore.kernel.org/r/175151680309.2149615.9795104805153538717.stgit@mhi...
Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org
Acked-by: Steven Rostedt (Google) rostedt@goodmis.org
Signed-off-by: Shuah Khan skhan@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
## Backport Status: YES
This commit should be backported to stable kernel trees for the following reasons:
1. **It fixes a test regression**: The commit c5b6ababd21a changed mutex_trylock() from being a regular function to an inline macro when CONFIG_DEBUG_LOCK_ALLOC=y. This broke the ftrace glob filter selftest that was trying to match functions with the pattern `mutex*try*`. Since mutex_trylock is no longer a regular function symbol in certain configurations, the test would fail.
2. **Simple and contained fix**: The change is minimal - it simply replaces the test pattern from `mutex*try*` to `mutex*unl*` on line 32. This is a one-line change that:
   - Changes the glob pattern from matching mutex_trylock functions to mutex_unlock functions
   - Maintains the same test logic and purpose (testing glob pattern matching)
   - mutex_unlock remains a regular function symbol regardless of CONFIG_DEBUG_LOCK_ALLOC
3. **No functional changes to kernel code**: This only affects a selftest, not any kernel functionality. The risk of regression is zero for normal kernel operation.
4. **Test reliability**: Without this fix, the ftrace selftest suite would fail on kernels built with CONFIG_DEBUG_LOCK_ALLOC=y after commit c5b6ababd21a is applied. This could:
   - Cause false test failures in CI/CD pipelines
   - Make it harder to detect real ftrace issues
   - Confuse developers running the test suite
5. **Clear dependency**: The commit message explicitly states this is needed "Since commit c5b6ababd21a" which indicates this is a direct fix for a known regression introduced by that specific commit.
The change is exactly the type that stable rules recommend: it fixes a clear bug (test regression), is minimal in scope, has no risk of breaking functionality, and maintains test coverage for an important kernel feature (ftrace glob filtering).
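Why the old pattern stops matching can be shown with a small user-space sketch. It uses POSIX `fnmatch()` as a stand-in for ftrace's in-kernel glob matcher (an assumption that holds for simple `*` patterns like these), and `traceable_symbols` is a hypothetical, heavily shortened list of what a CONFIG_DEBUG_LOCK_ALLOC=y kernel might expose once `mutex_trylock()` is inlined.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Count how many symbol names match the shell-style glob `pattern`. */
static size_t count_glob_matches(const char *pattern,
				 const char *const *symbols, size_t n)
{
	size_t hits = 0;

	assert(pattern && symbols);
	for (size_t i = 0; i < n; i++)
		if (fnmatch(pattern, symbols[i], 0) == 0)
			hits++;
	return hits;
}

/*
 * Hypothetical symbol list: mutex_trylock is absent because it is
 * assumed to have been inlined and so has no traceable symbol.
 */
static const char *const traceable_symbols[] = {
	"mutex_lock",
	"mutex_unlock",
	"mutex_is_locked",
};
```

With this list, `mutex*try*` matches nothing, which is the situation the selftest ran into, while `mutex*unl*` still finds `mutex_unlock`.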
 .../testing/selftests/ftrace/test.d/ftrace/func-filter-glob.tc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/ftrace/test.d/ftrace/func-filter-glob.tc b/tools/testing/selftests/ftrace/test.d/ftrace/func-filter-glob.tc
index 4b994b6df5ac..ed81eaf2afd6 100644
--- a/tools/testing/selftests/ftrace/test.d/ftrace/func-filter-glob.tc
+++ b/tools/testing/selftests/ftrace/test.d/ftrace/func-filter-glob.tc
@@ -29,7 +29,7 @@ ftrace_filter_check 'schedule*' '^schedule.*$'
 ftrace_filter_check '*pin*lock' '.*pin.*lock$'
 
 # filter by start*mid*
-ftrace_filter_check 'mutex*try*' '^mutex.*try.*'
+ftrace_filter_check 'mutex*unl*' '^mutex.*unl.*'
 
 # Advanced full-glob matching feature is recently supported.
 # Skip the tests if we are sure the kernel does not support it.
From: Zhu Qiyu qiyuzhu2@amd.com
[ Upstream commit 3db5648c4d608b5483470efc1da9780b081242dd ]
Commit 088984c8d54c ("ACPI: PRM: Find EFI_MEMORY_RUNTIME block for PRM handler and context") introduced non-essential printing "Failed to find VA for GUID: xxxx, PA: 0x0" which may confuse users to think that something wrong is going on while it is not the case.
According to the PRM Spec Section 4.1.2 [1], both static data buffer address and ACPI parameter buffer address may be NULL if they are not needed, so there is no need to print out the "Failed to find VA ... " in those cases.
Link: https://uefi.org/sites/default/files/resources/Platform%20Runtime%20Mechanis... # [1]
Signed-off-by: Zhu Qiyu qiyuzhu2@amd.com
Link: https://patch.msgid.link/20250704014104.82524-1-qiyuzhu2@amd.com
[ rjw: Edits in new comments, subject and changelog ]
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees for the following reasons:
## Bug Fix for User-Visible Issue

The commit fixes false warning messages that were confusing users. The original commit 088984c8d54c (which was already backported to stable with "Cc: All applicable stable@vger.kernel.org") introduced an unconditional warning in `efi_pa_va_lookup()`:

```c
pr_warn("Failed to find VA for GUID: %pUL, PA: 0x%llx", guid, pa);
```

This warning was printed even when the physical address (PA) was legitimately 0/NULL, which according to the PRM specification is valid for both `static_data_buffer_address` and `acpi_param_buffer_address` when they are not needed.

## Small and Contained Fix

The fix is minimal and surgical:
1. Removes the unconditional warning from `efi_pa_va_lookup()`
2. Adds conditional warnings only when addresses are non-zero but lookup fails:
   - For handler_addr: warns if lookup fails (this should never be zero)
   - For static_data_buffer_addr: only warns if `handler_info->static_data_buffer_address` is non-zero but lookup fails
   - For acpi_param_buffer_addr: only warns if `handler_info->acpi_param_buffer_address` is non-zero but lookup fails

## Regression Fix

Since the problematic commit 088984c8d54c has been backported to v6.12 stable kernels, this fix addresses a regression in stable trees where users are seeing confusing warning messages for legitimate NULL addresses.

## Low Risk

The change:
- Does not modify any functional behavior
- Only adjusts when warning messages are printed
- Has no architectural changes or side effects
- Is confined to the ACPI PRM subsystem

## Clear Benefit

The fix eliminates user confusion from false warnings while maintaining proper warnings for actual error conditions (when a non-zero address fails lookup).
Given that the original problematic commit is in stable kernels and this is a targeted fix for that regression with minimal risk, it clearly meets the stable backporting criteria.
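The warning policy described above reduces to two small predicates. The sketch below is a user-space model, not the driver code: `warn_for_handler` and `warn_for_optional_buffer` are hypothetical names, `pa` stands for the physical address from the PRMT table and `va` for what `efi_pa_va_lookup()` returned (0 meaning "not found").

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The handler address is mandatory: any failed lookup deserves a warning. */
static bool warn_for_handler(uint64_t va)
{
	return va == 0;
}

/*
 * Static data and ACPI parameter buffers are optional: a zero PA is
 * legitimate, so warn only when a non-zero PA could not be translated.
 */
static bool warn_for_optional_buffer(uint64_t pa, uint64_t va)
{
	return va == 0 && pa != 0;
}

/* Truth table for the optional-buffer case (illustrative addresses). */
static void prm_warning_examples(void)
{
	assert(!warn_for_optional_buffer(0, 0));	/* buffer not used */
	assert(warn_for_optional_buffer(0x1000, 0));	/* lookup failed */
	assert(!warn_for_optional_buffer(0x1000, 0xffff800000001000ULL));
}
```

The first row of the truth table is the case that previously produced the misleading "PA: 0x0" message.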
 drivers/acpi/prmt.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/prmt.c b/drivers/acpi/prmt.c
index e549914a636c..be033bbb126a 100644
--- a/drivers/acpi/prmt.c
+++ b/drivers/acpi/prmt.c
@@ -85,8 +85,6 @@ static u64 efi_pa_va_lookup(efi_guid_t *guid, u64 pa)
 		}
 	}
 
-	pr_warn("Failed to find VA for GUID: %pUL, PA: 0x%llx", guid, pa);
-
 	return 0;
 }
 
@@ -154,13 +152,37 @@ acpi_parse_prmt(union acpi_subtable_headers *header, const unsigned long end)
 		guid_copy(&th->guid, (guid_t *)handler_info->handler_guid);
 		th->handler_addr =
 			(void *)efi_pa_va_lookup(&th->guid, handler_info->handler_address);
+		/*
+		 * Print a warning message if handler_addr is zero which is not expected to
+		 * ever happen.
+		 */
+		if (unlikely(!th->handler_addr))
+			pr_warn("Failed to find VA of handler for GUID: %pUL, PA: 0x%llx",
+				&th->guid, handler_info->handler_address);
 
 		th->static_data_buffer_addr =
 			efi_pa_va_lookup(&th->guid, handler_info->static_data_buffer_address);
+		/*
+		 * According to the PRM specification, static_data_buffer_address can be zero,
+		 * so avoid printing a warning message in that case.  Otherwise, if the
+		 * return value of efi_pa_va_lookup() is zero, print the message.
+		 */
+		if (unlikely(!th->static_data_buffer_addr && handler_info->static_data_buffer_address))
+			pr_warn("Failed to find VA of static data buffer for GUID: %pUL, PA: 0x%llx",
+				&th->guid, handler_info->static_data_buffer_address);
 
 		th->acpi_param_buffer_addr =
 			efi_pa_va_lookup(&th->guid, handler_info->acpi_param_buffer_address);
 
+		/*
+		 * According to the PRM specification, acpi_param_buffer_address can be zero,
+		 * so avoid printing a warning message in that case.  Otherwise, if the
+		 * return value of efi_pa_va_lookup() is zero, print the message.
+		 */
+		if (unlikely(!th->acpi_param_buffer_addr && handler_info->acpi_param_buffer_address))
+			pr_warn("Failed to find VA of acpi param buffer for GUID: %pUL, PA: 0x%llx",
+				&th->guid, handler_info->acpi_param_buffer_address);
+
 	} while (++cur_handler < tm->handler_count && (handler_info = get_next_handler(handler_info)));
 
 	return 0;
From: Peng Fan peng.fan@nxp.com
[ Upstream commit 9a0658d3991e6c82df87584b253454842f22f965 ]
Fix a race condition where a second suspend notification from another SCMI agent wakes the system before SCMI_SYSPOWER_IDLE is set, leading to ignored suspend requests. This is due to interrupts triggering early execution of `scmi_userspace_notifier()` before the SCMI state is updated.
To resolve this, set SCMI_SYSPOWER_IDLE earlier in the device resume path, prior to `thaw_processes()`. This ensures the SCMI state is correct when the notifier runs, allowing the system to suspend again as expected.
On some platforms using SCMI, SCP cannot distinguish between CPU idle and suspend since both result in cluster power-off. By explicitly setting the idle state early, the Linux SCMI agent can correctly re-suspend in response to external notifications.
Signed-off-by: Peng Fan peng.fan@nxp.com
Message-Id: 20250704-scmi-pm-v2-2-9316cec2f9cc@nxp.com
Signed-off-by: Sudeep Holla sudeep.holla@arm.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## 1. Bug Fix Nature
The commit fixes a clear race condition bug that affects system suspend/resume functionality on SCMI-based platforms. Looking at the code changes:
- **Original bug** (lines 325-333 in original code): The `scmi_suspend_work_func` sets `sc->state = SCMI_SYSPOWER_IDLE` AFTER `pm_suspend()` returns
- **Race condition**: If another SCMI agent sends a suspend notification while the system is resuming (after `pm_suspend()` returns but before `SCMI_SYSPOWER_IDLE` is set), the `scmi_userspace_notifier` (line 305) will see the state is still `SCMI_SYSPOWER_IN_PROGRESS` and ignore the new suspend request
- **Impact**: The system fails to re-suspend when it should, breaking power management functionality
## 2. Fix is Small and Contained
The fix is minimal and well-contained:
- Adds PM ops structure with only a resume callback
- Moves the state reset from work function to PM resume callback
- Total change is about 20 lines of code
- No API changes or architectural modifications
## 3. Clear User Impact
The commit message explicitly states this affects real platforms: "On some platforms using SCMI, SCP cannot distinguish between CPU idle and suspend since both result in cluster power-off." This indicates actual hardware is affected by this bug.
## 4. Low Risk of Regression
The changes are:
- Limited to the SCMI power control driver
- Only modifies the timing of when `SCMI_SYSPOWER_IDLE` is set
- Uses standard PM callbacks (`dev_pm_ops`)
- No changes to core logic or protocol handling
## 5. Follows Stable Criteria
The fix meets stable kernel criteria:
- Fixes a real bug (race condition in suspend/resume)
- Small, focused change (~20 lines)
- Already tested and merged upstream
- Clear explanation of the problem and solution
- No new features added
## 6. Technical Correctness
The fix is technically sound:
- Setting `SCMI_SYSPOWER_IDLE` in the PM resume callback ensures it happens before `thaw_processes()` completes
- This guarantees the state is correct when interrupts are re-enabled and the notifier can run
- The use of `dev_set_drvdata()` and `dev_get_drvdata()` properly passes the context to the PM callback
This is a textbook example of a stable-worthy commit: it fixes a specific race condition bug with minimal code changes and clear impact on affected systems.
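The ordering problem can be modelled with a tiny single-threaded state machine. This is a sketch under assumptions, not the driver: all names below are hypothetical simplifications, and real concurrency is replaced by calling the steps in the order in which they would interleave.

```c
#include <assert.h>
#include <stdbool.h>

enum syspower_state { SYSPOWER_IDLE, SYSPOWER_IN_PROGRESS };

struct syspower_model {
	enum syspower_state state;
	int suspends_accepted;
};

/* Models the notifier: a suspend request is honoured only when idle. */
static bool notify_suspend(struct syspower_model *sc)
{
	assert(sc);
	if (sc->state != SYSPOWER_IDLE)
		return false;		/* request silently ignored */
	sc->state = SYSPOWER_IN_PROGRESS;
	sc->suspends_accepted++;
	return true;
}

/* Models the state reset, wherever it is performed. */
static void mark_idle(struct syspower_model *sc)
{
	assert(sc);
	sc->state = SYSPOWER_IDLE;
}
```

In the old ordering, a second `notify_suspend()` arrives between wake-up and `mark_idle()` and returns false. With the reset moved into the resume callback, `mark_idle()` runs first and the second request is accepted.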
 .../firmware/arm_scmi/scmi_power_control.c | 22 ++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/firmware/arm_scmi/scmi_power_control.c b/drivers/firmware/arm_scmi/scmi_power_control.c
index 21f467a92942..ab0cee0d4bec 100644
--- a/drivers/firmware/arm_scmi/scmi_power_control.c
+++ b/drivers/firmware/arm_scmi/scmi_power_control.c
@@ -46,6 +46,7 @@
 #include <linux/math.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
+#include <linux/pm.h>
 #include <linux/printk.h>
 #include <linux/reboot.h>
 #include <linux/scmi_protocol.h>
@@ -324,12 +325,7 @@ static int scmi_userspace_notifier(struct notifier_block *nb,
 
 static void scmi_suspend_work_func(struct work_struct *work)
 {
-	struct scmi_syspower_conf *sc =
-		container_of(work, struct scmi_syspower_conf, suspend_work);
-
 	pm_suspend(PM_SUSPEND_MEM);
-
-	sc->state = SCMI_SYSPOWER_IDLE;
 }
 
 static int scmi_syspower_probe(struct scmi_device *sdev)
@@ -354,6 +350,7 @@ static int scmi_syspower_probe(struct scmi_device *sdev)
 	sc->required_transition = SCMI_SYSTEM_MAX;
 	sc->userspace_nb.notifier_call = &scmi_userspace_notifier;
 	sc->dev = &sdev->dev;
+	dev_set_drvdata(&sdev->dev, sc);
 
 	INIT_WORK(&sc->suspend_work, scmi_suspend_work_func);
 
@@ -363,6 +360,18 @@ static int scmi_syspower_probe(struct scmi_device *sdev)
 						       NULL, &sc->userspace_nb);
 }
 
+static int scmi_system_power_resume(struct device *dev)
+{
+	struct scmi_syspower_conf *sc = dev_get_drvdata(dev);
+
+	sc->state = SCMI_SYSPOWER_IDLE;
+	return 0;
+}
+
+static const struct dev_pm_ops scmi_system_power_pmops = {
+	SET_SYSTEM_SLEEP_PM_OPS(NULL, scmi_system_power_resume)
+};
+
 static const struct scmi_device_id scmi_id_table[] = {
 	{ SCMI_PROTOCOL_SYSTEM, "syspower" },
 	{ },
@@ -370,6 +379,9 @@ static const struct scmi_device_id scmi_id_table[] = {
 MODULE_DEVICE_TABLE(scmi, scmi_id_table);
 
 static struct scmi_driver scmi_system_power_driver = {
+	.driver	= {
+		.pm = &scmi_system_power_pmops,
+	},
 	.name = "scmi-system-power",
 	.probe = scmi_syspower_probe,
 	.id_table = scmi_id_table,
From: Thierry Reding treding@nvidia.com
[ Upstream commit 78eb18020a88a4eed15f5af7700ed570642ff8f1 ]
The IVC code is library code that other drivers need to select if they need that library. However, if the symbol is user-selectable this can lead to conflicts.
Fix this by making the symbol only selectable for COMPILE_TEST and add a select TEGRA_IVC to TEGRA_BPMP, which is currently the only user.
Link: https://lore.kernel.org/r/20250506133118.1011777-10-thierry.reding@gmail.com
Signed-off-by: Thierry Reding treding@nvidia.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit is suitable for backporting to stable kernel trees. Here's an extensive analysis based on the commit message and code changes:
## Analysis
### 1. Bug Fix Nature

The commit fixes a genuine Kconfig dependency problem where TEGRA_IVC being user-selectable can lead to configuration conflicts. The commit message explicitly states "The IVC code is library code that other drivers need to select if they need that library. However, if the symbol is user-selectable this can lead to conflicts."

### 2. Size and Scope

The fix is minimal and contained - only 4 lines changed in a single Kconfig file:
- Line 5: Changed `bool "Tegra IVC protocol"` to `bool "Tegra IVC protocol" if COMPILE_TEST`
- Line 16: Removed `TEGRA_IVC` from the depends line
- Line 18: Added `select TEGRA_IVC`

### 3. No Major Side Effects

The change doesn't introduce any functional modifications to the code itself. It only changes how the Kconfig symbol is selected:
- TEGRA_IVC becomes non-user-selectable (except for COMPILE_TEST)
- TEGRA_BPMP now explicitly selects TEGRA_IVC instead of depending on it

### 4. No Architectural Changes

This is purely a Kconfig dependency fix with no changes to the actual driver implementation or kernel architecture.

### 5. Critical Subsystem Impact

While this touches firmware drivers, the change is limited to configuration logic and doesn't modify any runtime behavior.

### 6. Stable Tree Considerations

Based on my investigation:
- TEGRA_IVC is indeed library code (as seen in drivers/firmware/tegra/ivc.c)
- Multiple SoC configs (ARCH_TEGRA_186_SOC, ARCH_TEGRA_194_SOC, ARCH_TEGRA_234_SOC) also select TEGRA_IVC
- Currently, TEGRA_BPMP appears to be the only driver module using the IVC library functions directly
- The library exports functions like `tegra_ivc_*` that are used by BPMP

### 7. Risk Assessment

The risk is minimal because:
- The change converts a user-visible dependency into an automatic selection
- This prevents users from creating invalid configurations
- The actual IVC and BPMP code remains unchanged
- The fix aligns with standard kernel practice for library code (should be selected, not user-configurable)
This is a classic example of a good stable backport candidate: it fixes a real configuration issue that could cause build problems, the fix is small and contained, and it doesn't introduce any functional changes or risks to the kernel operation.
 drivers/firmware/tegra/Kconfig | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/tegra/Kconfig b/drivers/firmware/tegra/Kconfig
index cde1ab8bd9d1..91f2320c0d0f 100644
--- a/drivers/firmware/tegra/Kconfig
+++ b/drivers/firmware/tegra/Kconfig
@@ -2,7 +2,7 @@
 menu "Tegra firmware driver"
 
 config TEGRA_IVC
-	bool "Tegra IVC protocol"
+	bool "Tegra IVC protocol" if COMPILE_TEST
 	depends on ARCH_TEGRA
 	help
 	  IVC (Inter-VM Communication) protocol is part of the IPC
@@ -13,8 +13,9 @@ config TEGRA_IVC
 
 config TEGRA_BPMP
 	bool "Tegra BPMP driver"
-	depends on ARCH_TEGRA && TEGRA_HSP_MBOX && TEGRA_IVC
+	depends on ARCH_TEGRA && TEGRA_HSP_MBOX
 	depends on !CPU_BIG_ENDIAN
+	select TEGRA_IVC
 	help
 	  BPMP (Boot and Power Management Processor) is designed to off-loading
 	  the PM functions which include clock/DVFS/thermal/power from the CPU.
From: Nylon Chen nylon.chen@sifive.com
[ Upstream commit 7dbc4432ea6bf9d709391eb57f1e9fb44e99845a ]
The `frac` variable represents the pulse inactive time, and the result of this algorithm is the pulse active time. Therefore, we must reverse the result.
Although the SiFive Reference Manual states "pwms >= pwmcmpX -> HIGH", the hardware behavior is inverted due to a fixed XNOR with 0. As a result, the pwmcmp register actually defines the low (inactive) portion of the pulse.
The reference is SiFive FU740-C000 Manual[0]
Link: https://sifive.cdn.prismic.io/sifive/1a82e600-1f93-4f41-b2d8-86ed8b16acba_fu... [0]
Co-developed-by: Zong Li zong.li@sifive.com
Signed-off-by: Zong Li zong.li@sifive.com
Co-developed-by: Vincent Chen vincent.chen@sifive.com
Signed-off-by: Vincent Chen vincent.chen@sifive.com
Signed-off-by: Nylon Chen nylon.chen@sifive.com
Link: https://lore.kernel.org/r/20250529035341.51736-3-nylon.chen@sifive.com
Signed-off-by: Uwe Kleine-König ukleinek@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Critical Bug Fix Analysis
1. **The commit fixes a significant functional bug** - The PWM algorithm was fundamentally broken:
   - The `frac` variable was incorrectly used (lines 158-161 in original, now 168-171)
   - The driver was writing the duty cycle value directly to PWMCMP register (line 193), but the hardware actually expects the **inactive** (low) time
   - This results in inverted PWM output behavior where users get the opposite duty cycle from what they request
2. **Hardware behavior discrepancy** - The commit reveals and fixes a critical mismatch between documentation and actual hardware:
   - The documentation states "pwms >= pwmcmpX -> HIGH"
   - But the hardware has a hard-tied XNOR with 0 that inverts this behavior
   - The driver now correctly compensates for this hardware quirk
3. **User-visible impact**:
   - **In `pwm_sifive_get_state()`**: The driver now correctly inverts the value read from hardware (lines 115->122-126)
   - **In `pwm_sifive_apply()`**: The driver now inverts the duty cycle before writing to hardware (lines 162->171)
   - **Polarity change**: Changed from `PWM_POLARITY_INVERSED` to `PWM_POLARITY_NORMAL` (lines 126->137, 142->152)
   - **Documentation fix**: Updated comment from "cannot generate 100% duty" to "cannot generate 0% duty" (lines 11->27, 160->170)
4. **The fix is relatively small and contained**:
   - Changes are isolated to the PWM algorithm logic
   - No architectural changes or new features
   - Simple mathematical inversion: `duty = (1U << PWM_SIFIVE_CMPWIDTH) - 1 - inactive`
5. **No risky side effects**:
   - The change is straightforward and mathematically correct
   - Doesn't affect other subsystems
   - Maintains the same register interface
6. **Affects all SiFive PWM users**:
   - Any system using SiFive FU540 or FU740 SoCs would have incorrect PWM output
   - This includes various RISC-V development boards and embedded systems
   - Users would get inverted duty cycles, potentially breaking motor controls, LED dimming, etc.
7. **Clear documentation reference**:
   - The commit references the official SiFive FU740-C000 Manual
   - Provides clear explanation of the hardware behavior mismatch
The bug causes PWM outputs to be inverted from user expectations, which is a significant functional issue that would affect any system relying on proper PWM behavior for motor control, LED dimming, or other PWM-dependent functionality. The fix is clean, minimal, and addresses a clear hardware/software mismatch that exists in production systems.
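The arithmetic described in the patch can be checked in isolation. The following is a user-space sketch, not the driver: the helper names are hypothetical, `CMPWIDTH` is assumed to be 16 (standing in for `PWM_SIFIVE_CMPWIDTH`), and the rounding mirrors a round-to-closest division.

```c
#include <assert.h>
#include <stdint.h>

#define CMPWIDTH 16	/* assumed comparator width */
#define CMP_MAX ((1U << CMPWIDTH) - 1)

/* Scale a duty cycle to comparator counts, rounded to closest, clamped. */
static uint32_t duty_to_frac(uint64_t duty_ns, uint64_t period_ns)
{
	uint64_t num, frac;

	assert(period_ns > 0 && duty_ns <= period_ns);
	num = duty_ns * (1ULL << CMPWIDTH);
	frac = (num + period_ns / 2) / period_ns;
	return frac < CMP_MAX ? (uint32_t)frac : CMP_MAX;
}

/* Value written to pwmcmp: the inactive (low) part of the period. */
static uint32_t frac_to_inactive(uint32_t frac)
{
	return CMP_MAX - frac;
}

/* Inverse used when reading the register back. */
static uint32_t inactive_to_duty(uint32_t inactive)
{
	return CMP_MAX - inactive;
}
```

For a 25% duty cycle on a 1 ms period, `duty_to_frac(250000, 1000000)` gives 16384 and the register value becomes 65535 - 16384 = 49151. A requested 0% duty yields `frac` 0 and a register value of 65535, the largest inactive count this scheme can express.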
 drivers/pwm/pwm-sifive.c | 39 +++++++++++++++++++++++++++++++--------
 1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/drivers/pwm/pwm-sifive.c b/drivers/pwm/pwm-sifive.c
index d5b647e6be78..f3694801d3ee 100644
--- a/drivers/pwm/pwm-sifive.c
+++ b/drivers/pwm/pwm-sifive.c
@@ -4,11 +4,28 @@
  * For SiFive's PWM IP block documentation please refer Chapter 14 of
  * Reference Manual : https://static.dev.sifive.com/FU540-C000-v1.0.pdf
  *
+ * PWM output inversion: According to the SiFive Reference manual
+ * the output of each comparator is high whenever the value of pwms is
+ * greater than or equal to the corresponding pwmcmpX[Reference Manual].
+ *
+ * Figure 29 in the same manual shows that the pwmcmpXcenter bit is
+ * hard-tied to 0 (XNOR), which effectively inverts the comparison so that
+ * the output goes HIGH when `pwms < pwmcmpX`.
+ *
+ * In other words, each pwmcmp register actually defines the **inactive**
+ * (low) period of the pulse, not the active time exactly opposite to what
+ * the documentation text implies.
+ *
+ * To compensate, this driver always **inverts** the duty value when reading
+ * or writing pwmcmp registers , so that users interact with a conventional
+ * **active-high** PWM interface.
+ *
+ *
  * Limitations:
  * - When changing both duty cycle and period, we cannot prevent in
  *   software that the output might produce a period with mixed
  *   settings (new period length and old duty cycle).
- * - The hardware cannot generate a 100% duty cycle.
+ * - The hardware cannot generate a 0% duty cycle.
  * - The hardware generates only inverted output.
  */
 #include <linux/clk.h>
@@ -110,9 +127,14 @@ static int pwm_sifive_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
 				struct pwm_state *state)
 {
 	struct pwm_sifive_ddata *ddata = pwm_sifive_chip_to_ddata(chip);
-	u32 duty, val;
+	u32 duty, val, inactive;
 
-	duty = readl(ddata->regs + PWM_SIFIVE_PWMCMP(pwm->hwpwm));
+	inactive = readl(ddata->regs + PWM_SIFIVE_PWMCMP(pwm->hwpwm));
+	/*
+	 * PWM hardware uses 'inactive' counts in pwmcmp, so invert to get actual duty.
+	 * Here, 'inactive' is the low time and we compute duty as max_count - inactive.
+	 */
+	duty = (1U << PWM_SIFIVE_CMPWIDTH) - 1 - inactive;
 
 	state->enabled = duty > 0;
 
@@ -123,7 +145,7 @@ static int pwm_sifive_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
 	state->period = ddata->real_period;
 	state->duty_cycle =
 		(u64)duty * ddata->real_period >> PWM_SIFIVE_CMPWIDTH;
-	state->polarity = PWM_POLARITY_INVERSED;
+	state->polarity = PWM_POLARITY_NORMAL;
 
 	return 0;
 }
@@ -137,9 +159,9 @@ static int pwm_sifive_apply(struct pwm_chip *chip, struct pwm_device *pwm,
 	unsigned long long num;
 	bool enabled;
 	int ret = 0;
-	u32 frac;
+	u32 frac, inactive;
 
-	if (state->polarity != PWM_POLARITY_INVERSED)
+	if (state->polarity != PWM_POLARITY_NORMAL)
 		return -EINVAL;
 
 	cur_state = pwm->state;
@@ -157,8 +179,9 @@ static int pwm_sifive_apply(struct pwm_chip *chip, struct pwm_device *pwm,
 	 */
 	num = (u64)duty_cycle * (1U << PWM_SIFIVE_CMPWIDTH);
 	frac = DIV64_U64_ROUND_CLOSEST(num, state->period);
-	/* The hardware cannot generate a 100% duty cycle */
+	/* The hardware cannot generate a 0% duty cycle */
 	frac = min(frac, (1U << PWM_SIFIVE_CMPWIDTH) - 1);
+	inactive = (1U << PWM_SIFIVE_CMPWIDTH) - 1 - frac;
 
 	mutex_lock(&ddata->lock);
 	if (state->period != ddata->approx_period) {
@@ -190,7 +213,7 @@ static int pwm_sifive_apply(struct pwm_chip *chip, struct pwm_device *pwm,
 		}
 	}
 
-	writel(frac, ddata->regs + PWM_SIFIVE_PWMCMP(pwm->hwpwm));
+	writel(inactive, ddata->regs + PWM_SIFIVE_PWMCMP(pwm->hwpwm));
 
 	if (!state->enabled)
 		clk_disable(ddata->clk);
Hello,
On Sun, Aug 03, 2025 at 08:22:50PM -0400, Sasha Levin wrote:
> From: Nylon Chen nylon.chen@sifive.com
>
> [ Upstream commit 7dbc4432ea6bf9d709391eb57f1e9fb44e99845a ]
>
> The `frac` variable represents the pulse inactive time, and the result of this algorithm is the pulse active time. Therefore, we must reverse the result.
>
> Although the SiFive Reference Manual states "pwms >= pwmcmpX -> HIGH", the hardware behavior is inverted due to a fixed XNOR with 0. As a result, the pwmcmp register actually defines the low (inactive) portion of the pulse.
>
> The reference is SiFive FU740-C000 Manual[0]
>
> Link: https://sifive.cdn.prismic.io/sifive/1a82e600-1f93-4f41-b2d8-86ed8b16acba_fu... [0]
>
> Co-developed-by: Zong Li zong.li@sifive.com
> Signed-off-by: Zong Li zong.li@sifive.com
> Co-developed-by: Vincent Chen vincent.chen@sifive.com
> Signed-off-by: Vincent Chen vincent.chen@sifive.com
> Signed-off-by: Nylon Chen nylon.chen@sifive.com
> Link: https://lore.kernel.org/r/20250529035341.51736-3-nylon.chen@sifive.com
> Signed-off-by: Uwe Kleine-König ukleinek@kernel.org
> Signed-off-by: Sasha Levin sashal@kernel.org
Please drop this patch, see https://lore.kernel.org/all/52ycm5nf2jrxdmdmcijz57xhm2twspjmmiign6zq6rp3d5wt... for the rationale.
Best regards
Uwe
From: GalaxySnail me@glxys.nl
[ Upstream commit 6eda9429501508196001845998bb8c73307d311a ]
add firmware information in the .modinfo section, so that userspace tools can find out firmware required by cs35l41/cs35l56 kernel module
Signed-off-by: GalaxySnail me@glxys.nl
Reviewed-by: Richard Fitzgerald rf@opensource.cirrus.com
Link: https://patch.msgid.link/20250624101716.2365302-2-me@glxys.nl
Signed-off-by: Takashi Iwai tiwai@suse.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis, I can now provide a determination on whether this commit should be backported:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real user-facing issue**: The commit adds MODULE_FIRMWARE declarations that are essential for userspace tools (like initramfs builders, dracut, mkinitcpio) to automatically detect and include required firmware files. Without these declarations, systems may fail to boot or have non-functional audio after kernel updates.
2. **Minimal and safe change**: The commit only adds six MODULE_FIRMWARE declarations:
   - For cs35l41: `cirrus/cs35l41-*.wmfw` and `cirrus/cs35l41-*.bin`
   - For cs35l56: `cirrus/cs35l54-*.wmfw`, `cirrus/cs35l54-*.bin`, `cirrus/cs35l56-*.wmfw`, and `cirrus/cs35l56-*.bin`

   These are simple metadata additions that don't change any code logic or behavior.
3. **No risk of regression**: MODULE_FIRMWARE macros only add information to the module's .modinfo section. They don't affect runtime behavior, only help userspace tools identify firmware dependencies.
4. **Clear firmware loading requirement**: The code analysis shows these drivers do load firmware files with patterns matching the declared MODULE_FIRMWARE entries. In cs35l41_hda.c:
   ```c
   *filename = kasprintf(GFP_KERNEL, "cirrus/%s-%s-%s.%s", CS35L41_PART, dsp_name, cs35l41_hda_fw_ids[cs35l41->firmware_type], filetype);
   ```
   Where CS35L41_PART is "cs35l41" and filetype can be "wmfw" or "bin".
5. **Follows established patterns**: Other HDA codec drivers (like patch_ca0132.c) already use MODULE_FIRMWARE declarations for their firmware files.
6. **Prevents boot failures**: Without proper firmware inclusion in initramfs, systems with these audio codecs may experience boot failures or missing audio functionality, especially when the root filesystem is encrypted or on a network device.
The commit is a straightforward bug fix that ensures proper firmware dependency tracking for cs35l41 and cs35l56 HDA audio codecs, making it an ideal candidate for stable backporting.
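Whether a generated firmware name falls under one of the declared wildcards can be sketched in user space. This is an illustration only: `firmware_name_declared` is a hypothetical helper, POSIX `fnmatch()` stands in for whatever matching a given initramfs tool performs, and the DSP name and firmware id used in the note below are made-up sample values.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build a name using the "cirrus/%s-%s-%s.%s" format quoted above and
 * test it against a MODULE_FIRMWARE-style glob.
 */
static bool firmware_name_declared(const char *part, const char *dsp_name,
				   const char *fw_id, const char *filetype,
				   const char *declared_glob)
{
	char name[128];
	int len = snprintf(name, sizeof(name), "cirrus/%s-%s-%s.%s",
			   part, dsp_name, fw_id, filetype);

	assert(len > 0 && (size_t)len < sizeof(name));	/* no truncation */
	return fnmatch(declared_glob, name, 0) == 0;
}
```

For example, with the sample values `"dsp1"` and `"spk-prot"`, a `wmfw` file for `cs35l41` matches `cirrus/cs35l41-*.wmfw`, while the corresponding `bin` file needs the separate `cirrus/cs35l41-*.bin` declaration.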
 sound/pci/hda/cs35l41_hda.c | 2 ++
 sound/pci/hda/cs35l56_hda.c | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/sound/pci/hda/cs35l41_hda.c b/sound/pci/hda/cs35l41_hda.c
index d5bc81099d0d..17cdce91fdbf 100644
--- a/sound/pci/hda/cs35l41_hda.c
+++ b/sound/pci/hda/cs35l41_hda.c
@@ -2085,3 +2085,5 @@ MODULE_IMPORT_NS("SND_SOC_CS_AMP_LIB");
 MODULE_AUTHOR("Lucas Tanure, Cirrus Logic Inc, tanureal@opensource.cirrus.com");
 MODULE_LICENSE("GPL");
 MODULE_IMPORT_NS("FW_CS_DSP");
+MODULE_FIRMWARE("cirrus/cs35l41-*.wmfw");
+MODULE_FIRMWARE("cirrus/cs35l41-*.bin");
diff --git a/sound/pci/hda/cs35l56_hda.c b/sound/pci/hda/cs35l56_hda.c
index 886c53184fec..f48077f5ca45 100644
--- a/sound/pci/hda/cs35l56_hda.c
+++ b/sound/pci/hda/cs35l56_hda.c
@@ -1176,3 +1176,7 @@ MODULE_IMPORT_NS("SND_SOC_CS_AMP_LIB");
 MODULE_AUTHOR("Richard Fitzgerald rf@opensource.cirrus.com");
 MODULE_AUTHOR("Simon Trimmer simont@opensource.cirrus.com");
 MODULE_LICENSE("GPL");
+MODULE_FIRMWARE("cirrus/cs35l54-*.wmfw");
+MODULE_FIRMWARE("cirrus/cs35l54-*.bin");
+MODULE_FIRMWARE("cirrus/cs35l56-*.wmfw");
+MODULE_FIRMWARE("cirrus/cs35l56-*.bin");
From: "Rafael J. Wysocki" rafael.j.wysocki@intel.com
[ Upstream commit 89d9cec3b1e9c49bae9375a2db6dc49bc7468af0 ]
Clear power.needs_force_resume in pm_runtime_reinit() in case it has been set by pm_runtime_force_suspend() invoked from a driver remove callback.
Suggested-by: Ulf Hansson ulf.hansson@linaro.org
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
Reviewed-by: Ulf Hansson ulf.hansson@linaro.org
Link: https://patch.msgid.link/9495163.CDJkKcVGEf@rjwysocki.net
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit is a clear bugfix that should be backported to stable kernel trees for the following reasons:
1. **It fixes a real bug**: The commit addresses a specific issue where `power.needs_force_resume` flag is not cleared in `pm_runtime_reinit()` when a device is removed. If a driver calls `pm_runtime_force_suspend()` from its remove callback (which sets `needs_force_resume = 1`), this flag remains set even after the device is removed and potentially re-probed.
2. **The fix is minimal and contained**: The change adds just 5 lines of code (including comments) to clear a single flag. The modification is:
   ```c
   dev->power.needs_force_resume = false;
   ```
   This is a very low-risk change that only affects the specific condition being fixed.
3. **It prevents state leakage**: Looking at the code flow:
   - `pm_runtime_force_suspend()` sets `dev->power.needs_force_resume = 1` (line in runtime.c)
   - When a driver is removed, `pm_runtime_remove()` calls `pm_runtime_reinit()`
   - Without this fix, if the device is re-probed, it would still have `needs_force_resume = 1` from the previous instance
   - This could lead to incorrect PM runtime behavior where `pm_runtime_force_resume()` would incorrectly think it needs to resume a device that was never suspended in the current probe cycle
4. **Related to previous stable fixes**: The git history shows a previous commit `c745253e2a69` ("PM: runtime: Fix unpaired parent child_count for force_resume") was already marked for stable (4.16+), indicating that issues with the `needs_force_resume` flag have been problematic enough to warrant stable backports.
5. **Clear bug scenario**: The commit message describes a specific scenario where this happens - when `pm_runtime_force_suspend()` is called from a driver remove callback. This is a legitimate use case where drivers want to ensure devices are suspended before removal.
6. **No architectural changes**: This is purely a bugfix that clears a flag that should have been cleared during reinitialization. It doesn't introduce new features or change any APIs.
The fix ensures proper PM runtime state management across device removal and re-probing cycles, which is important for system stability and correct power management behavior.
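The state leak described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: `struct fake_dev_pm` and the `model_*` helpers are hypothetical stand-ins, not kernel code, and they reproduce nothing beyond the single flag discussed here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the runtime PM fields of struct device. */
struct fake_dev_pm {
	bool needs_force_resume;
};

/* Models pm_runtime_force_suspend() setting the flag. */
static void model_force_suspend(struct fake_dev_pm *pm)
{
	pm->needs_force_resume = true;
}

/* Models pm_runtime_reinit(); clear_flag selects patched vs. unpatched behaviour. */
static void model_reinit(struct fake_dev_pm *pm, bool clear_flag)
{
	if (clear_flag)
		pm->needs_force_resume = false;
}

/* Returns the flag value that a freshly re-probed driver would observe. */
static bool flag_after_reprobe(bool patched)
{
	struct fake_dev_pm pm = { .needs_force_resume = false };

	model_force_suspend(&pm);	/* invoked from the driver remove callback */
	model_reinit(&pm, patched);	/* driver core reinitializes runtime PM */
	return pm.needs_force_resume;
}
```

In this model `flag_after_reprobe(false)` still reports the stale flag, while `flag_after_reprobe(true)` reports it cleared, which is the behaviour the patch aims for.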
 drivers/base/power/runtime.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index c55a7c70bc1a..1ef26216f971 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1854,6 +1854,11 @@ void pm_runtime_reinit(struct device *dev)
 				pm_runtime_put(dev->parent);
 		}
 	}
+	/*
+	 * Clear power.needs_force_resume in case it has been set by
+	 * pm_runtime_force_suspend() invoked from a driver remove callback.
+	 */
+	dev->power.needs_force_resume = false;
 }
 
 /**
From: Hsin-Te Yuan yuanhsinte@chromium.org
[ Upstream commit 1a4aabc27e95674837f2e25f4ef340c0469e6203 ]
According to POSIX spec, EAGAIN returned by read with O_NONBLOCK set means the read would block. Hence, the common implementation in nonblocking model will poll the file when the nonblocking read returns EAGAIN. However, when the target file is thermal zone, this mechanism will totally malfunction because thermal zone doesn't implement sysfs notification and thus the poll will never return.
For example, the read in Golang implements such method and sometimes hangs at reading some thermal zones via sysfs.
Change to return -ENODATA instead of -EAGAIN to userspace.
Signed-off-by: Hsin-Te Yuan yuanhsinte@chromium.org
Link: https://patch.msgid.link/20250620-temp-v3-1-6becc6aeb66c@chromium.org
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit, here is my determination:
**Backport Status: YES**
## Extensive Explanation:
This commit should be backported to stable kernel trees for the following reasons:
### 1. **Bug Fix for Real User-Facing Issue** The commit fixes a bug where userspace applications can hang when reading thermal zone temperature files via sysfs with O_NONBLOCK set. The commit message clearly states that "the read in Golang implements such method and sometimes hangs at reading some thermal zones via sysfs." This is a real-world problem affecting actual applications.
### 2. **POSIX Compliance Issue** The current behavior violates POSIX semantics. According to POSIX specifications, EAGAIN from a non-blocking read means "the read would block" - implying that the caller should poll/select and wait for data to become available. However, thermal zone sysfs files don't implement poll notifications, so applications following standard POSIX patterns will poll forever, causing hangs.
### 3. **Small and Contained Change**
The code change is minimal and localized:
```c
-	if (ret)
-		return ret;
+	if (!ret)
+		return sprintf(buf, "%d\n", temperature);

-	return sprintf(buf, "%d\n", temperature);
+	if (ret == -EAGAIN)
+		return -ENODATA;
+
+	return ret;
```
It only affects the `temp_show()` function in `thermal_sysfs.c` by translating -EAGAIN to -ENODATA specifically for sysfs reads.
### 4. **Low Risk of Regression** - The change only affects error handling paths - It doesn't modify the normal success case - It only translates one specific error code (-EAGAIN) to another (-ENODATA) - ENODATA is a more appropriate error for "no data available" in a sysfs context
### 5. **Multiple Drivers Return -EAGAIN**
My search found at least 13 thermal drivers that can return -EAGAIN from their get_temp operations, including:
- imx8mm_thermal.c
- imx_thermal.c
- tegra-bpmp-thermal.c
- qoriq_thermal.c
- lvts_thermal.c
- rockchip_thermal.c
- exynos_tmu.c
- sun8i_thermal.c
- stm_thermal.c
- intel_powerclamp.c
This indicates the issue affects multiple platforms and thermal drivers.
### 6. **Follows Stable Kernel Rules** - Fixes a real bug that bothers people (application hangs) - Fix is already in Linus's tree (based on the Signed-off-by from Rafael J. Wysocki) - Small change (< 100 lines) - Obviously correct and tested - Doesn't change APIs or break existing functionality
### 7. **No Architectural Changes** The commit doesn't introduce new features or change the thermal subsystem architecture. It's purely a bug fix that makes the sysfs interface behave correctly with non-blocking reads.
The fix is appropriate because sysfs files are not meant to be pollable in the traditional sense - they provide instantaneous data snapshots. Returning ENODATA instead of EAGAIN properly communicates "no data currently available" without implying that polling would help.
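The error translation can be tried out in userspace with a minimal sketch like the following. `model_temp_show()` is a hypothetical name, and its `get_temp_ret` parameter merely stands in for the return value of `thermal_zone_get_temp()`; the real function operates on sysfs buffers inside the kernel.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Userspace model of the patched temp_show() error handling.
 * get_temp_ret stands in for the result of thermal_zone_get_temp().
 */
static int model_temp_show(int get_temp_ret, int temperature,
			   char *buf, size_t len)
{
	if (!get_temp_ret)
		return snprintf(buf, len, "%d\n", temperature);

	/* The file never signals readiness, so "try again" would mislead pollers. */
	if (get_temp_ret == -EAGAIN)
		return -ENODATA;

	/* Every other error is passed through unchanged. */
	return get_temp_ret;
}
```

Only the -EAGAIN case is rewritten; success and all other error codes behave as before, which matches the low regression risk argued above.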
 drivers/thermal/thermal_sysfs.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
index 24b9055a0b6c..d80612506a33 100644
--- a/drivers/thermal/thermal_sysfs.c
+++ b/drivers/thermal/thermal_sysfs.c
@@ -40,10 +40,13 @@ temp_show(struct device *dev, struct device_attribute *attr, char *buf)
 
 	ret = thermal_zone_get_temp(tz, &temperature);
 
-	if (ret)
-		return ret;
+	if (!ret)
+		return sprintf(buf, "%d\n", temperature);
 
-	return sprintf(buf, "%d\n", temperature);
+	if (ret == -EAGAIN)
+		return -ENODATA;
+
+	return ret;
 }
 
 static ssize_t
From: tuhaowen tuhaowen@uniontech.com
[ Upstream commit 4266e8fa56d3d982bf451d382a410b9db432015c ]
When the computer enters sleep status without a monitor connected, the system switches the console to the virtual terminal tty63(SUSPEND_CONSOLE).
If a monitor is subsequently connected before waking up, the system skips the required VT restoration process during wake-up, leaving the console on tty63 instead of switching back to tty1.
To fix this issue, a global flag vt_switch_done is introduced to record whether the system has successfully switched to the suspend console via vt_move_to_console() during suspend.
If the switch was completed, vt_switch_done is set to 1. Later during resume, this flag is checked to ensure that the original console is restored properly by calling vt_move_to_console(orig_fgconsole, 0).
This prevents scenarios where the resume logic skips console restoration due to incorrect detection of the console state, especially when a monitor is reconnected before waking up.
Signed-off-by: tuhaowen tuhaowen@uniontech.com
Link: https://patch.msgid.link/20250611032345.29962-1-tuhaowen@uniontech.com
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
**1. Bug Fix Nature:** The commit fixes a clear user-visible bug - a black screen issue that occurs when: - A computer enters sleep without a monitor connected - A monitor is connected while the system is asleep - Upon wake, the console remains on tty63 (SUSPEND_CONSOLE) instead of returning to tty1
This is a functional regression that impacts user experience, making the system appear unresponsive or broken after resume.
**2. Small and Contained Fix:** The fix is minimal and well-contained: - Adds a single `static bool vt_switch_done` flag - Modifies only two functions: `pm_prepare_console()` and `pm_restore_console()` - Total change is just 6 lines of code (excluding whitespace) - The logic is straightforward: track whether VT switch happened during suspend, and ensure restoration happens during resume
**3. Low Risk of Side Effects:** The change has minimal risk: - The new flag (`vt_switch_done`) is only set when `vt_move_to_console()` succeeds - The restoration logic preserves existing behavior when `pm_vt_switch()` returns true - The fix only adds an additional condition `!vt_switch_done` to ensure restoration happens even when `pm_vt_switch()` returns false during resume - No architectural changes or new features are introduced
**4. Critical Subsystem but Safe Change:** While this touches the PM/console subsystem (critical for suspend/resume), the change is safe: - It doesn't modify the core VT switching mechanism - It only ensures the existing restoration path is executed when needed - The file has been stable since 2013 (last major change was f43f627d2f17e)
**5. Clear Problem Description:** The commit message clearly describes: - The specific scenario that triggers the bug - The root cause (skipped VT restoration) - The solution (tracking VT switch state)
**6. Follows Stable Rules:** This aligns perfectly with stable tree criteria: - Fixes a real bug affecting users - Minimal change (6 lines) - No new features - Clear fix for a specific problem - Low regression risk
The commit addresses a real-world scenario where users connecting monitors to sleeping systems experience black screens upon wake - a frustrating issue that makes systems appear broken. The fix is surgical, adding only the minimum tracking needed to ensure proper console restoration.
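The flag logic can be illustrated with a small userspace model. Everything here is a sketch: the `model_*` functions and `restored_count` are hypothetical, `vt_switch_required` stands in for the result of `pm_vt_switch()` at resume time, and the `orig_fgconsole >= 0` check of the real code is left out for brevity.

```c
#include <stdbool.h>

static bool vt_switch_done_model;	/* mirrors the new vt_switch_done flag */
static int restored_count;		/* counts simulated console restorations */

/* Models the tail of pm_prepare_console() once the suspend console is active. */
static void model_prepare_console(void)
{
	vt_switch_done_model = true;
}

/*
 * Models pm_restore_console(). vt_switch_required stands in for the
 * value pm_vt_switch() returns when the system resumes.
 */
static void model_restore_console(bool vt_switch_required)
{
	if (!vt_switch_required && !vt_switch_done_model)
		return;

	restored_count++;	/* stands in for vt_move_to_console(orig_fgconsole, 0) */
	vt_switch_done_model = false;
}
```

With the flag set during suspend, a resume where `pm_vt_switch()` reports false still restores the console once, and a second resume call without a preceding suspend does nothing.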
 kernel/power/console.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/power/console.c b/kernel/power/console.c
index fcdf0e14a47d..19c48aa5355d 100644
--- a/kernel/power/console.c
+++ b/kernel/power/console.c
@@ -16,6 +16,7 @@
 #define SUSPEND_CONSOLE	(MAX_NR_CONSOLES-1)
 
 static int orig_fgconsole, orig_kmsg;
+static bool vt_switch_done;
 
 static DEFINE_MUTEX(vt_switch_mutex);
 
@@ -136,17 +137,21 @@ void pm_prepare_console(void)
 	if (orig_fgconsole < 0)
 		return;
 
+	vt_switch_done = true;
+
 	orig_kmsg = vt_kmsg_redirect(SUSPEND_CONSOLE);
 	return;
 }
 
 void pm_restore_console(void)
 {
-	if (!pm_vt_switch())
+	if (!pm_vt_switch() && !vt_switch_done)
 		return;
 
 	if (orig_fgconsole >= 0) {
 		vt_move_to_console(orig_fgconsole, 0);
 		vt_kmsg_redirect(orig_kmsg);
 	}
+
+	vt_switch_done = false;
 }
From: Sebastian Ott sebott@redhat.com
[ Upstream commit 13edf7539211d8f7d0068ce3ed143005f1da3547 ]
Initialization of the local acpi_object in acpi_processor_get_info() only sets the first 4 bytes to zero and is thus incomplete. This is indicated by messages like: acpi ACPI0007:be: Invalid PBLK length [166288104]
Fix this by initializing all 16 bytes of the processor member of that union.
Signed-off-by: Sebastian Ott sebott@redhat.com
Link: https://patch.msgid.link/20250703124215.12522-1-sebott@redhat.com
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis, here is my determination:
**Backport Status: YES**
## Extensive Explanation:
This commit fixes a critical initialization bug in the ACPI processor driver that causes incorrect error messages and potentially incorrect behavior. Here's why this is a good candidate for stable backport:
### 1. **Clear Bug Fix** The commit fixes an incomplete initialization issue where `union acpi_object object = { 0 };` only initializes the first 4 bytes (the size of `acpi_object_type type`), leaving the remaining bytes of the union uninitialized. This results in garbage values in fields like `object.processor.pblk_length`, causing spurious error messages like "Invalid PBLK length [166288104]".
### 2. **User-Visible Impact** The bug causes confusing error messages in system logs during ACPI processor initialization. The error message shown in the commit (`acpi ACPI0007:be: Invalid PBLK length [166288104]`) indicates that uninitialized memory is being read and interpreted as a length value.
### 3. **Minimal and Contained Fix**
The fix is extremely simple - changing:
```c
union acpi_object object = { 0 };
```
to:
```c
union acpi_object object = { .processor = { 0 } };
```
This ensures proper initialization of all 16 bytes of the processor member (the size given in the commit message); the union as a whole is 24 bytes based on my analysis.
### 4. **No Architectural Changes** This is a pure bug fix with no feature additions or architectural changes. It simply corrects the initialization to prevent reading uninitialized memory.
### 5. **Low Risk of Regression** The change is minimal and only affects initialization. It doesn't alter any logic or behavior beyond ensuring that the union is properly zeroed out. The designated initializer syntax is standard C and well-supported.
### 6. **Critical Subsystem** While ACPI is a critical subsystem, this fix is confined to the initialization of a local variable in `acpi_processor_get_info()`. The scope is limited and the fix is conservative.
### 7. **Follows Stable Rules** This commit perfectly follows the stable kernel rules: - Fixes a real bug (uninitialized memory access) - Small change (single line) - Obviously correct - Already tested (merged into mainline) - Fixes an issue that users see (error messages in logs)
The uninitialized memory access could potentially lead to more serious issues beyond just incorrect error messages, as the code makes decisions based on the `pblk_length` value (checking if it equals 6). This makes the fix even more important for stable kernels.
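The effect of the designated initializer can be sketched in userspace. The union below is a hypothetical, simplified stand-in and does not reproduce the ACPICA layout of `union acpi_object`. The C standard only guarantees that `{ 0 }` initializes the first named member, so whether the remaining bytes end up zeroed depends on the compiler, whereas initializing through the `processor` member zeroes every field of that struct.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for union acpi_object. */
union fake_acpi_object {
	uint32_t type;			/* first member: the only one { 0 } must cover */
	struct {
		uint32_t type;
		uint32_t proc_id;
		uint64_t pblk_address;
		uint32_t pblk_length;
	} processor;
};

/* Returns pblk_length after initializing through the processor member. */
static uint32_t pblk_length_after_init(void)
{
	union fake_acpi_object object = { .processor = { 0 } };

	/* All fields of .processor are zero, including the ones after type. */
	return object.processor.pblk_length;
}
```

The sketch deliberately does not read `pblk_length` after a plain `{ 0 }` initializer, because that value is exactly what the patch treats as unreliable.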
 drivers/acpi/acpi_processor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 7cf6101cb4c7..2a99f5eb6962 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -275,7 +275,7 @@ static inline int acpi_processor_hotadd_init(struct acpi_processor *pr,
 
 static int acpi_processor_get_info(struct acpi_device *device)
 {
-	union acpi_object object = { 0 };
+	union acpi_object object = { .processor = { 0 } };
 	struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
 	struct acpi_processor *pr = acpi_driver_data(device);
 	int device_declaration = 0;
From: Sarthak Garg quic_sartgarg@quicinc.com
[ Upstream commit db58532188ebf51d52b1d7693d9e94c76b926e9f ]
Many mobile phones feature multi-card tray designs, where the same tray is used for both SD and SIM cards. If the SD card is placed at the outermost location in the tray, the SIM card may come in contact with SD card power-supply while removing the tray, possibly resulting in SIM damage.
To prevent that, make sure the SD card is really inserted by reading the Card Detect pin state. If it's not, turn off the power in sdhci_msm_check_power_status() and also set the BUS_FAIL power state on the controller as part of pwr_irq handling for BUS_ON request.
Signed-off-by: Sarthak Garg quic_sartgarg@quicinc.com
Acked-by: Adrian Hunter adrian.hunter@intel.com
Link: https://lore.kernel.org/r/20250701100659.3310386-1-quic_sartgarg@quicinc.com
Signed-off-by: Ulf Hansson ulf.hansson@linaro.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Hardware Damage Prevention**: This is a critical safety fix that prevents potential hardware damage to SIM cards. The commit message clearly describes a real-world scenario where SD card power can damage SIM cards in multi-card tray designs when the tray is removed. This is a serious hardware safety issue that affects end users' devices.
2. **Bug Fix Nature**: The commit fixes a specific bug where SD card power remains ON even when the card is physically removed. The code changes show two key additions:
   - In `sdhci_msm_check_power_status()`: Added check at lines 1625-1629 that, for a BUS_ON request, turns off power (writes 0 to SDHCI_POWER_CONTROL) when a card is registered (`mmc->card`) but no longer detected (`!mmc->ops->get_cd(mmc)`)
   - In `sdhci_msm_handle_pwr_irq()`: Added check at lines 1689-1694 that sets BUS_FAIL state when attempting to power on the bus under the same condition
3. **Small and Contained Fix**: The changes are minimal and well-contained:
   - Only 13 lines of actual code changes
   - Changes are localized to the sdhci-msm driver
   - No architectural changes or new features
   - Simple logic additions that check card presence before power operations
4. **Low Risk of Regression**: The fix adds defensive checks that only activate when: - A card is physically not present (detected via get_cd) - Power operations are being performed - This doesn't affect normal operation when cards are properly inserted
5. **Platform-Specific Critical Fix**: This affects Qualcomm MSM-based devices which are widely used in mobile phones. The multi-card tray design mentioned is common in many smartphones, making this a widespread potential issue.
6. **Clear Problem and Solution**: The commit has a clear problem statement (SIM damage from SD power) and a straightforward solution (turn off power when card is removed). This makes it easy to verify the fix is correct.
The commit follows stable tree rules perfectly - it's an important bugfix that prevents hardware damage, has minimal code changes, doesn't introduce new features, and has very low regression risk. This is exactly the type of safety-critical fix that stable kernels should include.
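The card-presence condition can be sketched as a userspace model. `struct fake_host`, `MODEL_REQ_BUS_ON` and `model_check_power()` are hypothetical names, the bit value is illustrative rather than taken from the driver, and register writes are reduced to a single field.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_REQ_BUS_ON 0x2u	/* illustrative value, not taken from the driver */

/* Hypothetical stand-in for the host state the added check looks at. */
struct fake_host {
	bool card_registered;	/* stands in for mmc->card != NULL */
	bool card_detected;	/* stands in for mmc->ops->get_cd(mmc) */
	uint8_t pwr;		/* stands in for host->pwr */
};

/* Models the check added to sdhci_msm_check_power_status(). */
static void model_check_power(struct fake_host *host, uint32_t req_type)
{
	if ((req_type & MODEL_REQ_BUS_ON) && host->card_registered &&
	    !host->card_detected)
		host->pwr = 0;	/* the driver also writes 0 to SDHCI_POWER_CONTROL */
}
```

Power is only dropped when all three conditions hold, so a host with a card that is still detected keeps its power setting in this model.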
 drivers/mmc/host/sdhci-msm.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/mmc/host/sdhci-msm.c b/drivers/mmc/host/sdhci-msm.c
index 66c0d1ba2a33..bc6ca49652f8 100644
--- a/drivers/mmc/host/sdhci-msm.c
+++ b/drivers/mmc/host/sdhci-msm.c
@@ -1564,6 +1564,7 @@ static void sdhci_msm_check_power_status(struct sdhci_host *host, u32 req_type)
 {
 	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
 	struct sdhci_msm_host *msm_host = sdhci_pltfm_priv(pltfm_host);
+	struct mmc_host *mmc = host->mmc;
 	bool done = false;
 	u32 val = SWITCHABLE_SIGNALING_VOLTAGE;
 	const struct sdhci_msm_offset *msm_offset =
@@ -1621,6 +1622,12 @@ static void sdhci_msm_check_power_status(struct sdhci_host *host, u32 req_type)
 				"%s: pwr_irq for req: (%d) timed out\n",
 				mmc_hostname(host->mmc), req_type);
 	}
+
+	if ((req_type & REQ_BUS_ON) && mmc->card && !mmc->ops->get_cd(mmc)) {
+		sdhci_writeb(host, 0, SDHCI_POWER_CONTROL);
+		host->pwr = 0;
+	}
+
 	pr_debug("%s: %s: request %d done\n", mmc_hostname(host->mmc),
 			__func__, req_type);
 }
@@ -1679,6 +1686,13 @@ static void sdhci_msm_handle_pwr_irq(struct sdhci_host *host, int irq)
 		udelay(10);
 	}
 
+	if ((irq_status & CORE_PWRCTL_BUS_ON) && mmc->card &&
+	    !mmc->ops->get_cd(mmc)) {
+		msm_host_writel(msm_host, CORE_PWRCTL_BUS_FAIL, host,
+				msm_offset->core_pwrctl_ctl);
+		return;
+	}
+
 	/* Handle BUS ON/OFF*/
 	if (irq_status & CORE_PWRCTL_BUS_ON) {
 		pwr_state = REQ_BUS_ON;
From: Breno Leitao leitao@debian.org
[ Upstream commit 4734c8b46b901cff2feda8b82abc710b65dc31c1 ]
When a GHES (Generic Hardware Error Source) triggers a panic, add the TAINT_MACHINE_CHECK taint flag to the kernel. This explicitly marks the kernel as tainted due to a machine check event, improving diagnostics and post-mortem analysis. The taint is set with LOCKDEP_STILL_OK to indicate lockdep remains valid.
At large scale deployment, this helps to quickly determine panics that are coming due to hardware failures.
Signed-off-by: Breno Leitao leitao@debian.org
Reviewed-by: Tony Luck tony.luck@intel.com
Link: https://patch.msgid.link/20250702-add_tain-v1-1-9187b10914b9@debian.org
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of this commit, here is my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real diagnostic issue**: The commit adds the TAINT_MACHINE_CHECK flag when GHES (Generic Hardware Error Source) triggers a panic. This is important for post-mortem analysis at scale, as explicitly stated in the commit message: "At large scale deployment, this helps to quickly determine panics that are coming due to hardware failures."
2. **Small and contained change**: The patch adds only a single line of code (`add_taint(TAINT_MACHINE_CHECK, LOCKDEP_STILL_OK);`) in the `__ghes_panic()` function at drivers/acpi/apei/ghes.c:1091. This meets the stable kernel rule of being under 100 lines.
3. **Obviously correct**: The change follows established kernel patterns. Looking at the grep results, other machine check handlers already use this same pattern: - arch/x86/kernel/cpu/mce/core.c:1640 - arch/powerpc/kernel/mce.c:332 - arch/x86/kernel/cpu/mce/p5.c:40 - arch/x86/kernel/cpu/mce/winchip.c:24
4. **No architectural changes**: This is purely a diagnostic improvement that adds taint information without changing any functionality or behavior of the GHES panic path.
5. **Minimal risk**: The change uses LOCKDEP_STILL_OK flag, indicating that lockdep remains valid after the taint, which is the safer option compared to LOCKDEP_NOW_UNRELIABLE used in some other machine check paths.
6. **Real benefit for users**: For organizations running Linux at scale, being able to quickly identify hardware-related panics through the taint flag provides significant operational value for triaging issues.
The commit meets all the stable kernel criteria: it's small, obviously correct, fixes a real diagnostic limitation that affects users (especially at scale), and has been reviewed by Tony Luck who is a recognized maintainer in the RAS (Reliability, Availability, and Serviceability) subsystem.
 drivers/acpi/apei/ghes.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 281a0a2f6730..bda33a0f0a01 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -1099,6 +1099,8 @@ static void __ghes_panic(struct ghes *ghes,
 
 	__ghes_print_estatus(KERN_EMERG, ghes->generic, estatus);
 
+	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_STILL_OK);
+
 	ghes_clear_estatus(ghes, estatus, buf_paddr, fixmap_idx);
 
 	if (!panic_timeout)
From: Biju Das biju.das.jz@bp.renesas.com
[ Upstream commit de2942828e7670526289f098df7e50b112e8ff1e ]
The interrupt controller found on RZ/G3E doesn't provide any facility to configure the wakeup sources. That's the reason why the driver lacks the irq_set_wake() callback for the interrupt chip.
But this prevent to properly enter power management states like "suspend to idle".
Enable the flags IRQCHIP_SKIP_SET_WAKE and IRQCHIP_MASK_ON_SUSPEND so the interrupt suspend logic can handle the chip correctly.
Signed-off-by: Biju Das biju.das.jz@bp.renesas.com
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be
Link: https://lore.kernel.org/all/20250701105923.52151-1-biju.das.jz@bp.renesas.co...
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Summary This commit fixes a real bug where the Renesas RZ/G3E interrupt controller prevents the system from properly entering power management states like "suspend to idle" due to missing irqchip flags.
## Detailed Analysis
### 1. **Bug Fix Nature** The commit clearly fixes a functional bug as stated in the commit message: - "But this prevent to properly enter power management states like 'suspend to idle'" - The interrupt controller lacks an `irq_set_wake()` callback, which is necessary for proper suspend/resume functionality
### 2. **Small and Contained Change**
The code change is minimal and well-contained:
```c
.flags = IRQCHIP_SET_TYPE_MASKED,
```
changed to:
```c
.flags = IRQCHIP_MASK_ON_SUSPEND |
	 IRQCHIP_SET_TYPE_MASKED |
	 IRQCHIP_SKIP_SET_WAKE,
```
This is a simple addition of two flags to the interrupt chip structure.
### 3. **Clear Understanding of Flags** From the kernel documentation in include/linux/irq.h: - **IRQCHIP_MASK_ON_SUSPEND**: "Mask non wake irqs in the suspend path" - This ensures interrupts that shouldn't wake the system are properly masked during suspend - **IRQCHIP_SKIP_SET_WAKE**: "Skip chip.irq_set_wake(), for this irq chip" - This tells the kernel to skip calling the non-existent irq_set_wake() callback
### 4. **Pattern Consistency**
Looking at the similar Renesas driver (irq-renesas-rzg2l.c:446-448), we can see the same flags are used:
```c
.flags = IRQCHIP_MASK_ON_SUSPEND |
	 IRQCHIP_SET_TYPE_MASKED |
	 IRQCHIP_SKIP_SET_WAKE,
```
This shows it's a well-established pattern for Renesas interrupt controllers that lack wakeup source configuration.
### 5. **No Side Effects** The change only adds flags that: - Enable proper masking of interrupts during suspend (preventing spurious interrupts) - Tell the kernel to skip calling a non-existent function - No functional changes beyond fixing the suspend issue - No API changes or new features
### 6. **Affects Users** This bug directly affects users trying to use power management features on RZ/G3E hardware. Without this fix, the system cannot properly enter suspend states, which impacts: - Power consumption - Battery life (on battery-powered devices) - System reliability during suspend/resume cycles
### 7. **Low Risk** The change has minimal risk of regression because: - It only adds standard irqchip flags that are well-tested in the kernel - The same flags are used in other similar drivers - It doesn't modify any existing logic, just adds proper flag configuration - The flags only affect suspend/resume paths, not normal operation
### 8. **Driver Maturity** While this is a relatively new driver (added in 2024), the fix addresses a clear oversight in the initial implementation. The driver is already in mainline, so systems using this hardware need this fix for proper operation.
## Conclusion This commit meets all the criteria for stable backporting: - Fixes a real bug that affects users - Minimal, well-understood change - No architectural changes or new features - Low risk of regression - Follows established patterns in similar drivers
The fix enables proper power management functionality on affected hardware, which is important for system operation and user experience.
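The role of the skip flag can be sketched with a userspace model. The `MODEL_*` bit values are illustrative only (the authoritative definitions are in include/linux/irq.h), and `model_set_wake()` is a hypothetical simplification of the decision described above, not the kernel's actual wake-handling code.

```c
#include <stdbool.h>

/* Illustrative bit values; see include/linux/irq.h for the real definitions. */
#define MODEL_SET_TYPE_MASKED	(1u << 0)
#define MODEL_MASK_ON_SUSPEND	(1u << 2)
#define MODEL_SKIP_SET_WAKE	(1u << 4)

/*
 * Models the decision described above: with the skip flag set, a missing
 * irq_set_wake() callback is not treated as an error.
 * Returns 0 on success and -1 when wake configuration is unsupported.
 */
static int model_set_wake(unsigned int chip_flags, bool has_set_wake_cb)
{
	if (chip_flags & MODEL_SKIP_SET_WAKE)
		return 0;

	return has_set_wake_cb ? 0 : -1;
}
```

A chip with only the original flag and no callback fails in this model, while the patched flag combination succeeds without needing the callback.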
 drivers/irqchip/irq-renesas-rzv2h.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-renesas-rzv2h.c b/drivers/irqchip/irq-renesas-rzv2h.c
index 69b32c19e8ff..76fb1354e2aa 100644
--- a/drivers/irqchip/irq-renesas-rzv2h.c
+++ b/drivers/irqchip/irq-renesas-rzv2h.c
@@ -427,7 +427,9 @@ static const struct irq_chip rzv2h_icu_chip = {
 	.irq_retrigger = irq_chip_retrigger_hierarchy,
 	.irq_set_type = rzv2h_icu_set_type,
 	.irq_set_affinity = irq_chip_set_affinity_parent,
-	.flags = IRQCHIP_SET_TYPE_MASKED,
+	.flags = IRQCHIP_MASK_ON_SUSPEND |
+		 IRQCHIP_SET_TYPE_MASKED |
+		 IRQCHIP_SKIP_SET_WAKE,
 };
 
 static int rzv2h_icu_alloc(struct irq_domain *domain, unsigned int virq, unsigned int nr_irqs,
From: Thomas Weißschuh thomas.weissschuh@linutronix.de
[ Upstream commit 1158220b24674edaf885433153deb4f0e5c7d331 ]
The TAP specification requires that the output begins with a header line. If vgetrandom_init() fails and skips the test, that header line is missing.
Call vgetrandom_init() after ksft_print_header().
Signed-off-by: Thomas Weißschuh thomas.weissschuh@linutronix.de
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Reviewed-by: Muhammad Usama Anjum usama.anjum@collabora.com
Link: https://lore.kernel.org/all/20250611-selftests-vdso-fixes-v3-8-e62e37a6bcf5@...
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Analysis of the Commit
### 1. Bug Description The commit fixes a test output formatting bug in the vDSO getrandom selftest. Looking at the code changes in `tools/testing/selftests/vDSO/vdso_test_getrandom.c`:
**Before the fix (lines 239-245 and 296-298 in the original file):**
```c
static void kselftest(void)
{
	// ... variables ...
	ksft_print_header();
	ksft_set_plan(2);
	// ... test code ...
}

int main(int argc, char *argv[])
{
	vgetrandom_init(); // This was called BEFORE ksft_print_header()

	if (argc == 1) {
		kselftest();
		return 0;
	}
	// ...
}
```

**After the fix:**
```c
static void kselftest(void)
{
	// ... variables ...
	ksft_print_header();
	vgetrandom_init(); // Moved AFTER ksft_print_header()
	ksft_set_plan(2);
	// ... test code ...
}

int main(int argc, char *argv[])
{
	if (argc == 1) {
		kselftest();
		return 0;
	}

	// ...
	vgetrandom_init(); // Called here for non-test modes
	// ...
}
```
### 2. Why This is a Bug
The TAP (Test Anything Protocol) specification requires that test output begins with a version header line (`TAP version 13`). The `ksft_print_header()` function prints this required header.
Looking at `vgetrandom_init()` (lines 113-131), it contains multiple `ksft_exit_skip()` calls that can terminate the test early: - Line 121: If `AT_SYSINFO_EHDR` is not present - Line 125: If the vDSO symbol is missing - Line 128: If the CPU doesn't have runtime support
When `vgetrandom_init()` was called before `ksft_print_header()`, these skip conditions would exit the test without printing the TAP header, violating the TAP specification and potentially breaking test harnesses that parse the output.
### 3. Stable Backport Criteria Assessment
✓ **Fixes a bug that affects users**: Yes, this fixes incorrect test output that can break automated test systems and CI/CD pipelines that parse TAP output.
✓ **Relatively small and contained**: Yes, the fix is minimal - it just reorders two function calls to ensure proper TAP header output.
✓ **No major side effects**: The change only affects test output ordering, not functionality. The test logic remains identical.
✓ **No architectural changes**: This is a simple reordering of existing calls with no structural changes.
✓ **Minimal risk**: The change is confined to the selftest code and cannot affect kernel runtime behavior.
✓ **Important for testing infrastructure**: Correct TAP output is crucial for test automation systems used in continuous integration and validation of stable kernels.
### 4. Additional Considerations
- The commit message clearly describes the problem and solution - The fix follows established kselftest patterns (header before any potential exits) - This is a test-only change with zero impact on production kernel code - The bug could cause false test results in automated systems that expect valid TAP output
This is an ideal candidate for stable backporting as it fixes a real bug in test infrastructure with minimal risk and clear benefits for kernel testing and validation.
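The ordering problem can be sketched in userspace by building the output a harness would see. `model_run()` is a hypothetical helper; the skip line is simplified for illustration and is not the exact text printed by the kselftest helpers.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Builds the output produced when the init step may decide to skip.
 * header_first selects the patched ordering. out must hold at least
 * one byte; output that does not fit is truncated.
 */
static void model_run(bool header_first, bool init_skips, char *out, size_t len)
{
	out[0] = '\0';
	if (header_first)
		strncat(out, "TAP version 13\n", len - strlen(out) - 1);
	if (init_skips) {
		strncat(out, "1..0 # SKIP\n", len - strlen(out) - 1);
		return;	/* models ksft_exit_skip() ending the run */
	}
	if (!header_first)
		strncat(out, "TAP version 13\n", len - strlen(out) - 1);
	strncat(out, "1..2\n", len - strlen(out) - 1);
}
```

With `header_first` set, the skip output still begins with the TAP header; with the old ordering the header is missing whenever the init step skips.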
 tools/testing/selftests/vDSO/vdso_test_getrandom.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/vDSO/vdso_test_getrandom.c b/tools/testing/selftests/vDSO/vdso_test_getrandom.c
index 95057f7567db..ff8d5675da2b 100644
--- a/tools/testing/selftests/vDSO/vdso_test_getrandom.c
+++ b/tools/testing/selftests/vDSO/vdso_test_getrandom.c
@@ -242,6 +242,7 @@ static void kselftest(void)
 	pid_t child;
 
 	ksft_print_header();
+	vgetrandom_init();
 	ksft_set_plan(2);
 
 	for (size_t i = 0; i < 1000; ++i) {
@@ -295,8 +296,6 @@ static void usage(const char *argv0)
 
 int main(int argc, char *argv[])
 {
-	vgetrandom_init();
-
 	if (argc == 1) {
 		kselftest();
 		return 0;
@@ -306,6 +305,9 @@ int main(int argc, char *argv[])
 		usage(argv[0]);
 		return 1;
 	}
+
+	vgetrandom_init();
+
 	if (!strcmp(argv[1], "bench-single"))
 		bench_single();
 	else if (!strcmp(argv[1], "bench-multi"))
From: Eliav Farber farbere@amazon.com
[ Upstream commit 6bca1e955830808dc90e0506b2951b4256b81bbb ]
The interrupt handler in pps_gpio_probe() is registered after calling pps_register_source() using devm_request_irq(). However, in the corresponding remove function, pps_unregister_source() is called before the IRQ is freed, since devm-managed resources are released after the remove function completes.
This creates a potential race condition where an interrupt may occur after the PPS source is unregistered but before the handler is removed, possibly leading to a kernel panic.
To prevent this, switch from devm-managed IRQ registration to manual management by using request_irq() and calling free_irq() explicitly in the remove path before unregistering the PPS source. This ensures the interrupt handler is safely removed before deactivating the PPS source.
Signed-off-by: Eliav Farber farbere@amazon.com Link: https://lore.kernel.org/r/20250527053355.37185-1-farbere@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and the surrounding context, here is my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a Real Bug**: The commit addresses a genuine race condition where an interrupt can occur after the PPS source is unregistered but before the IRQ handler is freed. This can lead to a kernel panic - a serious stability issue that affects users.
2. **Clear Race Condition Pattern**: The code shows the problematic ordering:
   - In probe: `pps_register_source()` → `devm_request_irq()`
   - In remove: `pps_unregister_source()` → (implicit devm cleanup frees IRQ)
This creates a window where the hardware can generate an interrupt after the PPS source is gone but before the handler is removed, causing the handler to access freed memory.
3. **Minimal and Contained Fix**: The change is small and focused:
   - Changes `devm_request_irq()` to `request_irq()`
   - Adds explicit `free_irq()` call before `pps_unregister_source()`
   - Only 3 lines of functional code changes
   - No new features or architectural changes
4. **Similar to Other Stable Fixes**: This follows a common pattern seen in other drivers where devm resource ordering causes issues in removal paths. The kernel has many similar fixes for CAN drivers, network drivers, and other subsystems that have been backported to stable.
5. **Low Risk of Regression**: The change is straightforward and doesn't modify the driver's functionality - it only fixes the resource cleanup ordering. The manual IRQ management is a well-established pattern.
6. **Affects User-Visible Stability**: A kernel panic during device removal/module unload is a serious issue that can affect system stability, especially in environments where PPS devices might be dynamically added/removed or during system shutdown.
The commit message clearly describes the problem, the root cause, and the solution. The fix is conservative and follows established kernel patterns for fixing devm ordering issues. This is exactly the type of bug fix that stable kernel rules recommend for backporting.
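The ordering problem described above can be illustrated with a small userspace model. This is a sketch only, not kernel code: `struct model`, `maybe_interrupt()`, `remove_devm()` and `remove_manual()` are hypothetical names, and the model merely records whether an interrupt could arrive while the handler is installed but the source is already gone.

```c
#include <stdbool.h>

/* Userspace model of the teardown ordering; not kernel code. */
struct model {
	bool source_registered;	/* stands in for the PPS source */
	bool irq_requested;	/* stands in for the installed IRQ handler */
	bool raced;		/* set if the handler could run without a source */
};

/* Record whether an interrupt arriving now would hit a missing source. */
static void maybe_interrupt(struct model *m)
{
	if (m->irq_requested && !m->source_registered)
		m->raced = true;
}

/* devm-style: remove() unregisters first, the IRQ is released afterwards. */
static void remove_devm(struct model *m)
{
	m->source_registered = false;	/* models pps_unregister_source() */
	maybe_interrupt(m);		/* window: handler still installed */
	m->irq_requested = false;	/* models devm cleanup after remove() */
}

/* Manual: free the IRQ first, then unregister the source. */
static void remove_manual(struct model *m)
{
	m->irq_requested = false;	/* models free_irq() */
	maybe_interrupt(m);		/* handler already gone, nothing fires */
	m->source_registered = false;	/* models pps_unregister_source() */
}
```

In this model only `remove_devm()` leaves a window in which `raced` can be set, which is the window the patch closes by calling `free_irq()` first.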
drivers/pps/clients/pps-gpio.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/pps/clients/pps-gpio.c b/drivers/pps/clients/pps-gpio.c index 47d9891de368..935da68610c7 100644 --- a/drivers/pps/clients/pps-gpio.c +++ b/drivers/pps/clients/pps-gpio.c @@ -210,8 +210,8 @@ static int pps_gpio_probe(struct platform_device *pdev) }
/* register IRQ interrupt handler */ - ret = devm_request_irq(dev, data->irq, pps_gpio_irq_handler, - get_irqf_trigger_flags(data), data->info.name, data); + ret = request_irq(data->irq, pps_gpio_irq_handler, + get_irqf_trigger_flags(data), data->info.name, data); if (ret) { pps_unregister_source(data->pps); dev_err(dev, "failed to acquire IRQ %d\n", data->irq); @@ -228,6 +228,7 @@ static void pps_gpio_remove(struct platform_device *pdev) { struct pps_gpio_device_data *data = platform_get_drvdata(pdev);
+ free_irq(data->irq, data); pps_unregister_source(data->pps); timer_delete_sync(&data->echo_timer); /* reset echo pin in any case */
On 04/08/25 02:23, Sasha Levin wrote:
From: Eliav Farber farbere@amazon.com
[ Upstream commit 6bca1e955830808dc90e0506b2951b4256b81bbb ]
The interrupt handler in pps_gpio_probe() is registered after calling pps_register_source() using devm_request_irq(). However, in the corresponding remove function, pps_unregister_source() is called before the IRQ is freed, since devm-managed resources are released after the remove function completes.
This creates a potential race condition where an interrupt may occur after the PPS source is unregistered but before the handler is removed, possibly leading to a kernel panic.
To prevent this, switch from devm-managed IRQ registration to manual management by using request_irq() and calling free_irq() explicitly in the remove path before unregistering the PPS source. This ensures the interrupt handler is safely removed before deactivating the PPS source.
Signed-off-by: Eliav Farber farbere@amazon.com Link: https://lore.kernel.org/r/20250527053355.37185-1-farbere@amazon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org
Signed-off-by: Rodolfo Giometti giometti@enneenne.com
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and the surrounding context, here is my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
**Fixes a Real Bug**: The commit addresses a genuine race condition where an interrupt can occur after the PPS source is unregistered but before the IRQ handler is freed. This can lead to a kernel panic - a serious stability issue that affects users.
**Clear Race Condition Pattern**: The code shows the problematic ordering:
- In probe: `pps_register_source()` → `devm_request_irq()`
- In remove: `pps_unregister_source()` → (implicit devm cleanup frees IRQ)
This creates a window where the hardware can generate an interrupt after the PPS source is gone but before the handler is removed, causing the handler to access freed memory.
**Minimal and Contained Fix**: The change is small and focused:
- Changes `devm_request_irq()` to `request_irq()`
- Adds explicit `free_irq()` call before `pps_unregister_source()`
- Only 3 lines of functional code changes
- No new features or architectural changes
**Similar to Other Stable Fixes**: This follows a common pattern seen in other drivers where devm resource ordering causes issues in removal paths. The kernel has many similar fixes for CAN drivers, network drivers, and other subsystems that have been backported to stable.
**Low Risk of Regression**: The change is straightforward and doesn't modify the driver's functionality - it only fixes the resource cleanup ordering. The manual IRQ management is a well-established pattern.
**Affects User-Visible Stability**: A kernel panic during device removal/module unload is a serious issue that can affect system stability, especially in environments where PPS devices might be dynamically added/removed or during system shutdown.
The commit message clearly describes the problem, the root cause, and the solution. The fix is conservative and follows established kernel patterns for fixing devm ordering issues. This is exactly the type of bug fix that stable kernel rules recommend for backporting.
drivers/pps/clients/pps-gpio.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/pps/clients/pps-gpio.c b/drivers/pps/clients/pps-gpio.c
index 47d9891de368..935da68610c7 100644
--- a/drivers/pps/clients/pps-gpio.c
+++ b/drivers/pps/clients/pps-gpio.c
@@ -210,8 +210,8 @@ static int pps_gpio_probe(struct platform_device *pdev)
 	}
 
 	/* register IRQ interrupt handler */
-	ret = devm_request_irq(dev, data->irq, pps_gpio_irq_handler,
-			get_irqf_trigger_flags(data), data->info.name, data);
+	ret = request_irq(data->irq, pps_gpio_irq_handler,
+			get_irqf_trigger_flags(data), data->info.name, data);
 	if (ret) {
 		pps_unregister_source(data->pps);
 		dev_err(dev, "failed to acquire IRQ %d\n", data->irq);
@@ -228,6 +228,7 @@ static void pps_gpio_remove(struct platform_device *pdev)
 {
 	struct pps_gpio_device_data *data = platform_get_drvdata(pdev);
 
+	free_irq(data->irq, data);
 	pps_unregister_source(data->pps);
 	timer_delete_sync(&data->echo_timer);
 	/* reset echo pin in any case */
From: Peter Robinson pbrobinson@gmail.com
[ Upstream commit 1d99f92f71b6b4b2eee776562c991428490f71ef ]
The BRCMSTB and BRCMSTB_RESCAL reset drivers are also used in the BCM2712, AKA the RPi5. The RPi platforms have typically used the ARCH_BCM2835, and the PCIe support for this SoC can use this config which depends on these drivers so enable building them when just that arch option is enabled to ensure the platform works as expected.
Signed-off-by: Peter Robinson pbrobinson@gmail.com Acked-by: Florian Fainelli florian.fainelli@broadcom.com Link: https://lore.kernel.org/r/20250630175301.846082-1-pbrobinson@gmail.com Signed-off-by: Philipp Zabel p.zabel@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Analysis of the Commit
### 1. **Nature of the Change**
The commit makes a simple Kconfig dependency update to enable two existing reset controller drivers (`RESET_BRCMSTB` and `RESET_BRCMSTB_RESCAL`) for the `ARCH_BCM2835` platform. The changes are:
- Lines 54-55: Adds `ARCH_BCM2835` to the depends and default lines for `RESET_BRCMSTB`
- Lines 63-64: Adds `ARCH_BCM2835` to the depends and default lines for `RESET_BRCMSTB_RESCAL`
- Line 67: Updates the help text to mention BCM2712 alongside BCM7216
### 2. **This is a Bug Fix** This commit fixes a configuration bug where the Raspberry Pi 5 (BCM2712) platform, which uses `ARCH_BCM2835`, cannot properly enable critical reset controllers needed for PCIe functionality. Without this fix, users building kernels for RPi5 with just `ARCH_BCM2835` enabled would have non-functional PCIe support.
### 3. **Minimal and Contained Change**
The commit is extremely minimal - per the diffstat it replaces 5 lines with 5 new ones in a single Kconfig file. It doesn't touch any C code, doesn't change any logic, and only updates build configuration dependencies. This is the ideal type of change for stable backporting.
### 4. **No Side Effects** The change has no negative side effects: - It only enables building existing drivers for an additional platform - The drivers themselves are already tested and working (used by BRCMSTB platforms) - The change is additive - it doesn't remove or modify existing functionality - No API/ABI changes
### 5. **Platform-Specific Fix** While the change is platform-specific (Raspberry Pi 5/BCM2712), it's important because: - RPi platforms are widely used - PCIe support is critical functionality - The fix enables existing, tested code to be built for a platform that needs it
### 6. **Clear Hardware Support Issue** From the commit message, this addresses a real hardware support issue where the BCM2712 (RPi5) SoC reuses the same reset controller IP blocks as other Broadcom STB SoCs but couldn't enable them due to missing Kconfig dependencies.
### 7. **Low Risk** The risk of regression is essentially zero because: - Only affects kernel builds that explicitly enable `ARCH_BCM2835` - Doesn't change behavior for existing platforms - The drivers being enabled are mature and already in use
### 8. **Follows Stable Rules**
This fits the stable kernel criteria:
- Fixes a real bug (missing hardware support)
- Obviously correct (just adding a platform to an existing dependency)
- Small change (5 insertions, 5 deletions)
- Reviewed by the platform maintainer (Acked-by from Florian Fainelli of Broadcom)
The commit enables essential functionality (PCIe support via reset controllers) for a popular platform (Raspberry Pi 5) through a minimal, risk-free configuration change. This is exactly the type of fix that should be backported to ensure stable kernels properly support current hardware.
drivers/reset/Kconfig | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/reset/Kconfig b/drivers/reset/Kconfig index d85be5899da6..ec8c953cb73d 100644 --- a/drivers/reset/Kconfig +++ b/drivers/reset/Kconfig @@ -51,8 +51,8 @@ config RESET_BERLIN
config RESET_BRCMSTB tristate "Broadcom STB reset controller" - depends on ARCH_BRCMSTB || COMPILE_TEST - default ARCH_BRCMSTB + depends on ARCH_BRCMSTB || ARCH_BCM2835 || COMPILE_TEST + default ARCH_BRCMSTB || ARCH_BCM2835 help This enables the reset controller driver for Broadcom STB SoCs using a SUN_TOP_CTRL_SW_INIT style controller. @@ -60,11 +60,11 @@ config RESET_BRCMSTB config RESET_BRCMSTB_RESCAL tristate "Broadcom STB RESCAL reset controller" depends on HAS_IOMEM - depends on ARCH_BRCMSTB || COMPILE_TEST - default ARCH_BRCMSTB + depends on ARCH_BRCMSTB || ARCH_BCM2835 || COMPILE_TEST + default ARCH_BRCMSTB || ARCH_BCM2835 help This enables the RESCAL reset controller for SATA, PCIe0, or PCIe1 on - BCM7216. + BCM7216 or the BCM2712.
config RESET_EYEQ bool "Mobileye EyeQ reset controller"
From: Charles Keepax ckeepax@opensource.cirrus.com
[ Upstream commit 775f5729b47d8737f4f98e0141f61b3358245398 ]
Zero is a valid SDCA IRQ interrupt position so add a special value to indicate that the IRQ is not used.
Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Charles Keepax ckeepax@opensource.cirrus.com Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.dev Link: https://patch.msgid.link/20250624122844.2761627-6-ckeepax@opensource.cirrus.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
Now I have a complete picture of the issue. Let me provide my analysis:
**Backport Status: YES**
This commit fixes a subtle but important bug in interrupt handling for the SDCA (SoundWire Device Class for Audio) subsystem. Here's why this should be backported:
1. **It fixes a real bug**: Without this change, when the `mipi-sdca-control-interrupt-position` property is missing from the device tree/ACPI, the `control->interrupt_position` field is never assigned and keeps the value 0 left by the zero-initializing `devm_kcalloc()` allocation. Since 0 is a valid interrupt position in SDCA, this causes the code to incorrectly treat controls without interrupts as having interrupt position 0.
2. **The bug has functional impact**: Looking at the interrupt handling code in `sdca_irq_populate()` (added in commit b126394d9ec6), it checks:
   ```c
   if (irq == SDCA_NO_INTERRUPT) {
           continue;
   }
   ```
   Without this fix, controls that should have no interrupt (property missing) would be incorrectly processed as having interrupt position 0, potentially causing:
   - Incorrect interrupt registration
   - Conflicts with legitimate interrupt position 0 handlers
   - Runtime errors when the interrupt framework tries to handle non-existent interrupts
3. **The fix is minimal and safe**: The change simply adds:
   ```c
   else
           control->interrupt_position = SDCA_NO_INTERRUPT;
   ```
   This ensures that when the interrupt position property is missing, the field is explicitly set to -1 (SDCA_NO_INTERRUPT) instead of remaining 0.
4. **It follows stable tree rules**:
   - Fixes a clear bug (a field left at 0 where "no interrupt" was meant)
   - Small, contained change (4 lines added across two files)
   - No architectural changes or new features
   - Low risk of regression
   - Only affects SDCA subsystem
5. **The bug exists in stable kernels**: The interrupt handling code that depends on this distinction was added in commit b126394d9ec6 ("ASoC: SDCA: Generic interrupt support"), so any stable kernel that includes that commit needs this fix.
The commit message clearly states the problem: "Zero is a valid SDCA IRQ interrupt position so add a special value to indicate that the IRQ is not used." This is a classic initialization bug where the absence of a value (property not present) needs to be distinguished from a valid zero value.
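The sentinel idea can be shown in isolation. The following is a minimal userspace sketch, not the SDCA code: `NO_INTERRUPT` and `interrupt_position()` are hypothetical names, and a NULL pointer is assumed to stand in for "property not present".

```c
#include <stddef.h>

/* Hypothetical sentinel, mirroring the role of SDCA_NO_INTERRUPT. */
#define NO_INTERRUPT (-1)

/*
 * Return the interrupt position, or NO_INTERRUPT when the property is absent.
 * A NULL pointer models "property not present in firmware".
 */
static int interrupt_position(const unsigned int *prop)
{
	if (prop)
		return (int)*prop;
	return NO_INTERRUPT;
}
```

With the sentinel, a present property with value 0 and an absent property produce different results, so a caller can skip controls that have no interrupt without also skipping position 0.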
include/sound/sdca_function.h | 2 ++ sound/soc/sdca/sdca_functions.c | 2 ++ 2 files changed, 4 insertions(+)
diff --git a/include/sound/sdca_function.h b/include/sound/sdca_function.h index eaedb54a8322..b43bda42eeca 100644 --- a/include/sound/sdca_function.h +++ b/include/sound/sdca_function.h @@ -16,6 +16,8 @@ struct device; struct sdca_entity; struct sdca_function_desc;
+#define SDCA_NO_INTERRUPT -1 + /* * The addressing space for SDCA relies on 7 bits for Entities, so a * maximum of 128 Entities per function can be represented. diff --git a/sound/soc/sdca/sdca_functions.c b/sound/soc/sdca/sdca_functions.c index de213a69e0da..dd503acc0c77 100644 --- a/sound/soc/sdca/sdca_functions.c +++ b/sound/soc/sdca/sdca_functions.c @@ -911,6 +911,8 @@ static int find_sdca_entity_control(struct device *dev, struct sdca_entity *enti &tmp); if (!ret) control->interrupt_position = tmp; + else + control->interrupt_position = SDCA_NO_INTERRUPT;
control->label = find_sdca_control_label(dev, entity, control); if (!control->label)
From: Gerd Hoffmann kraxel@redhat.com
[ Upstream commit 7b22e0432981c2fa230f1b493082b7e67112c4aa ]
In case efi_mm is active go use the userspace instruction decoder which supports fetching instructions from active_mm. This is needed to make instruction emulation work for EFI runtime code, so it can use CPUID and RDMSR.
EFI runtime code uses the CPUID instruction to gather information about the environment it is running in, such as SEV being enabled or not, and choose (if needed) the SEV code path for ioport access.
EFI runtime code uses the RDMSR instruction to get the location of the CAA page (see SVSM spec, section 4.2 - "Post Boot").
The big picture behind this is that the kernel needs to be able to properly handle #VC exceptions that come from EFI runtime services. Since EFI runtime services have a special page table mapping for the EFI virtual address space, the efi_mm context must be used when decoding instructions during #VC handling.
[ bp: Massage. ]
Signed-off-by: Gerd Hoffmann kraxel@redhat.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Reviewed-by: Pankaj Gupta pankaj.gupta@amd.com Link: https://lore.kernel.org/20250626114014.373748-2-kraxel@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit message and code changes, I can provide my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real bug affecting users**: The commit fixes instruction emulation for EFI runtime services in SEV-enabled systems. Without this fix, EFI runtime services cannot properly use CPUID and RDMSR instructions, which are essential for: - Determining if SEV is enabled - Getting the location of the CAA page (per SVSM spec) - Choosing the correct code path for I/O port access
2. **Small and contained fix**: The change is minimal - it only modifies the `vc_decode_insn()` function in arch/x86/coco/sev/vc-handle.c by adding a single condition check: `|| mm_is_efi(current->active_mm)`. This additional check ensures that when EFI runtime services are active (using efi_mm), the userspace instruction decoder is used instead of the kernel decoder.
3. **No architectural changes**: The fix doesn't introduce any new features or change the architecture. It simply fixes the existing instruction decoding logic to handle the special case of EFI runtime services.
4. **Critical for SEV functionality**: Systems with AMD SEV (Secure Encrypted Virtualization) enabled need proper #VC exception handling for EFI runtime services to function correctly. Without this fix, these systems may experience failures when EFI runtime services attempt to execute CPUID or RDMSR instructions.
5. **Low regression risk**: The change only affects the code path when `mm_is_efi()` returns true, which is a very specific scenario (EFI runtime services execution). It doesn't modify the existing behavior for normal kernel or userspace instruction decoding.
6. **Clear problem and solution**: The commit message clearly explains the issue (EFI runtime services use special page table mappings that require the userspace decoder) and the solution is straightforward.
The fix is essential for proper operation of SEV-enabled systems that rely on EFI runtime services, making it a good candidate for stable backporting according to stable tree rules.
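The decoder selection reduces to a two-input decision. A minimal userspace sketch follows, assuming two booleans in place of `user_mode(ctxt->regs)` and `mm_is_efi(current->active_mm)`; `enum decoder` and `pick_decoder()` are hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* Hypothetical labels for the two decode paths. */
enum decoder { DECODE_KERNEL, DECODE_USER };

/*
 * Pick the user-style decoder for user mode, and also when the EFI
 * address space is active even though the CPU is in kernel mode.
 */
static enum decoder pick_decoder(bool user_mode, bool efi_mm_active)
{
	if (user_mode || efi_mm_active)
		return DECODE_USER;
	return DECODE_KERNEL;
}
```

Only the kernel-mode case with `efi_mm_active` set changes behavior relative to the pre-patch logic; the other three input combinations select the same decoder as before.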
arch/x86/coco/sev/vc-handle.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/arch/x86/coco/sev/vc-handle.c b/arch/x86/coco/sev/vc-handle.c index 0989d98da130..faf1fce89ed4 100644 --- a/arch/x86/coco/sev/vc-handle.c +++ b/arch/x86/coco/sev/vc-handle.c @@ -17,6 +17,7 @@ #include <linux/mm.h> #include <linux/io.h> #include <linux/psp-sev.h> +#include <linux/efi.h> #include <uapi/linux/sev-guest.h>
#include <asm/init.h> @@ -178,9 +179,15 @@ static enum es_result __vc_decode_kern_insn(struct es_em_ctxt *ctxt) return ES_OK; }
+/* + * User instruction decoding is also required for the EFI runtime. Even though + * the EFI runtime is running in kernel mode, it uses special EFI virtual + * address mappings that require the use of efi_mm to properly address and + * decode. + */ static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt) { - if (user_mode(ctxt->regs)) + if (user_mode(ctxt->regs) || mm_is_efi(current->active_mm)) return __vc_decode_user_insn(ctxt); else return __vc_decode_kern_insn(ctxt);
From: Zijun Hu zijun.hu@oss.qualcomm.com
[ Upstream commit 0ef1fe4bc38673db72e39b700b29c50dfcc5a415 ]
misc_init() returns -EIO for __register_chrdev() invocation failure, but:
- -EIO is for I/O error normally, but __register_chrdev() does not do I/O. - -EIO can not cover various error codes returned by __register_chrdev().
Fix by returning error code of __register_chrdev().
Signed-off-by: Zijun Hu zijun.hu@oss.qualcomm.com Link: https://lore.kernel.org/r/20250620-fix_mischar-v1-3-6c2716bbf1fa@oss.qualcom... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Bug Fix Nature**: This is a clear bug fix that corrects improper error handling in `misc_init()`. The original code hardcoded `-EIO` regardless of the actual error returned by `__register_chrdev()`, which could mask the real failure reason.
2. **Small and Contained Change**: The fix is minimal - just 4 lines changed: - Removed the hardcoded `err = -EIO;` assignment - Changed `__register_chrdev()` from being used in an if condition to storing its return value - Updated the error check to properly test for negative error codes (`err < 0`)
3. **Improves Diagnostic Accuracy**: By returning the actual error code from `__register_chrdev()`, system administrators and developers can get accurate information about why the misc device initialization failed. Looking at the `__register_chrdev()` implementation, it can return various error codes like `-ENOMEM`, `-EBUSY`, or error codes from `__register_chrdev_region()`.
4. **No Architectural Changes**: This commit doesn't introduce any new features or change the architecture. It simply fixes error propagation in the initialization path.
5. **Low Risk of Regression**: The change is straightforward error handling correction. The behavior remains the same for successful cases, and only the error code changes for failure cases. This cannot break any working systems.
6. **Affects Core Subsystem**: The misc device subsystem is fundamental to many kernel drivers and userspace applications. Having accurate error codes during initialization failures is important for debugging boot issues.
7. **Follows Stable Rules**: This meets the stable kernel criteria - it's a real bug fix (incorrect error codes), it's small (4 lines), and it has minimal risk while providing clear benefit.
The commit properly preserves the error path behavior while ensuring the actual error from `__register_chrdev()` is propagated up, making it an ideal candidate for stable backporting.
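The difference between the two error-handling patterns can be sketched in userspace C. This is an illustration under stated assumptions: `fake_register()` is a hypothetical stand-in for `__register_chrdev()` that simply returns the result it is given, and `init_old()` / `init_new()` are hypothetical names for the before and after behavior.

```c
#include <errno.h>

/* Hypothetical stand-in for the registration call: 0 or a negative errno. */
static int fake_register(int result)
{
	return result;
}

/* Old pattern: any failure is reported as -EIO. */
static int init_old(int result)
{
	if (fake_register(result))
		return -EIO;
	return 0;
}

/* New pattern: the callee's error code is passed through unchanged. */
static int init_new(int result)
{
	int err = fake_register(result);

	if (err < 0)
		return err;
	return 0;
}
```

Both variants return 0 on success; they differ only in which code the caller sees on failure, for example `-ENOMEM` instead of a generic `-EIO`.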
drivers/char/misc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/char/misc.c b/drivers/char/misc.c index d5accc10a110..5247d0ec0f4c 100644 --- a/drivers/char/misc.c +++ b/drivers/char/misc.c @@ -296,8 +296,8 @@ static int __init misc_init(void) if (err) goto fail_remove;
- err = -EIO; - if (__register_chrdev(MISC_MAJOR, 0, MINORMASK + 1, "misc", &misc_fops)) + err = __register_chrdev(MISC_MAJOR, 0, MINORMASK + 1, "misc", &misc_fops); + if (err < 0) goto fail_printk; return 0;
From: Hans de Goede hansg@kernel.org
[ Upstream commit 35e8a426b16adbecae7a4e0e3c00fc8d0273db53 ]
mei_cl_bus_dev_release() also frees the mei-client (struct mei_cl) belonging to the device being released.
If there are bugs like the just fixed bug in the ACE/CSI2 mei drivers, the mei-client being freed might still be part of the mei_device's file_list and iterating over this list after the freeing will then trigger a use-after-free bug.
Add a check to mei_cl_bus_dev_release() to make sure that the to-be-freed mei-client is not on the mei_device's file_list.
Signed-off-by: Hans de Goede hansg@kernel.org Link: https://lore.kernel.org/r/20250623085052.12347-11-hansg@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Security Impact Analysis
The commit adds a defensive check to detect use-after-free vulnerabilities in the MEI (Management Engine Interface) subsystem. The code change is minimal but addresses a serious security issue:
```c
+	struct mei_device *mdev = cldev->cl->dev;
+	struct mei_cl *cl;
+
+	list_for_each_entry(cl, &mdev->file_list, link)
+		WARN_ON(cl == cldev->cl);
+
 	kfree(cldev->cl);
```
## Why This Is a Critical Backport Candidate
1. **Use-After-Free Detection**: The commit targets a classic memory corruption bug. When `mei_cl_bus_dev_release()` frees the mei-client structure via `kfree(cldev->cl)`, if that client is still on the `mei_device->file_list`, any subsequent iteration over that list will access freed memory, potentially leading to:
   - Kernel crashes
   - Information disclosure
   - Arbitrary code execution with kernel privileges

   The added check does not stop the free from happening; it emits a warning so that the underlying driver bug can be found.
2. **Widespread Hardware Impact**: MEI is present on virtually all modern Intel systems with Management Engine, making this a broadly applicable fix affecting millions of devices.
3. **Low Risk, High Value**: The change adds only 6 lines of defensive code that:
   - Doesn't modify any existing logic flow
   - Only adds a warning mechanism
   - Costs one walk of `file_list` per device release, which is not a hot path
   - Is very unlikely to introduce regressions
4. **Security Context**: The commit message references "just fixed bug in the ACE/CSI2 mei drivers," indicating this is addressing real-world vulnerabilities already discovered in MEI client drivers. This suggests similar bugs may exist in other MEI drivers.
5. **Stable Kernel Criteria Compliance**:
   - ✓ Helps catch a serious bug class (use-after-free)
   - ✓ Minimal change (6 lines)
   - ✓ No new features
   - ✓ Obvious correctness
   - ✓ Accepted upstream (Signed-off-by from Greg Kroah-Hartman)
## Technical Details
The fix works by iterating through `mdev->file_list` before freeing `cldev->cl` and issuing a `WARN_ON()` if the to-be-freed client is still in the list. This serves as an early warning system to catch driver bugs before they cause memory corruption.
## Recommendation
This should be backported to all currently maintained stable kernel branches (6.1.x, 6.6.x, 6.12.x) with priority given to LTS kernels. The combination of: - Security impact (use-after-free in kernel space) - Wide hardware coverage (Intel MEI) - Minimal risk (detection-only change) - Real-world bug evidence (ACE/CSI2 drivers)
Makes this an ideal stable backport candidate that meets all the criteria for inclusion in stable kernels.
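The detection step amounts to a membership test performed before the free. A minimal userspace sketch, assuming a plain singly linked list in place of the kernel's `struct list_head`; `struct client` and `still_on_list()` are hypothetical names, not MEI APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal client; the kernel links clients via struct list_head. */
struct client {
	struct client *next;
};

/* Return true if cl is still linked on the list starting at head. */
static bool still_on_list(const struct client *head, const struct client *cl)
{
	for (const struct client *it = head; it; it = it->next)
		if (it == cl)
			return true;
	return false;
}
```

A release path could call such a check just before freeing the client and warn when it returns true, which is the role `WARN_ON()` plays in the patch.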
drivers/misc/mei/bus.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/misc/mei/bus.c b/drivers/misc/mei/bus.c index 67176caf5416..1958c043ac14 100644 --- a/drivers/misc/mei/bus.c +++ b/drivers/misc/mei/bus.c @@ -1301,10 +1301,16 @@ static void mei_dev_bus_put(struct mei_device *bus) static void mei_cl_bus_dev_release(struct device *dev) { struct mei_cl_device *cldev = to_mei_cl_device(dev); + struct mei_device *mdev = cldev->cl->dev; + struct mei_cl *cl;
mei_cl_flush_queues(cldev->cl, NULL); mei_me_cl_put(cldev->me_cl); mei_dev_bus_put(cldev->bus); + + list_for_each_entry(cl, &mdev->file_list, link) + WARN_ON(cl == cldev->cl); + kfree(cldev->cl); kfree(cldev); }
From: Ulf Hansson ulf.hansson@linaro.org
[ Upstream commit 47a255f7d2eabee06cfbf5b1c2379749442fd01d ]
In the error path of sd_set_power_mode() we don't update host->power_mode, which could lead to an imbalance of the runtime PM usage count. Fix this by always updating host->power_mode.
Reviewed-by: Avri Altman avri.altman@sandisk.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Acked-by: Ricky Wu ricky_wu@realtek.com Link: https://lore.kernel.org/r/20250610111633.504366-2-ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Bug Analysis
The commit fixes a **runtime PM reference counting bug** in the `sd_set_power_mode()` function. The issue is in the error handling path:
### Original Code Problem:
```c
if (power_mode == MMC_POWER_OFF) {
	err = sd_power_off(host);
	pm_runtime_put_noidle(sdmmc_dev(host));
} else {
	pm_runtime_get_noresume(sdmmc_dev(host));
	err = sd_power_on(host);
}

if (!err)
	host->power_mode = power_mode; // Only updated on success
```
### The Bug:
1. When transitioning to `MMC_POWER_OFF`, the code calls `pm_runtime_put_noidle()` **regardless** of whether `sd_power_off()` succeeded
2. When transitioning to `MMC_POWER_ON`, the code calls `pm_runtime_get_noresume()` **before** calling `sd_power_on()`
3. However, `host->power_mode` is only updated if there's no error
This creates a **runtime PM reference count imbalance** because:
- If `sd_power_off()` fails, the runtime PM reference is still decremented, but `host->power_mode` remains at its previous value (likely `MMC_POWER_ON`)
- If `sd_power_on()` fails, the runtime PM reference is still incremented, but `host->power_mode` remains at its previous value (likely `MMC_POWER_OFF`)
- The early return check `if (power_mode == host->power_mode)` at the beginning of the function will prevent future attempts to fix the power state, leaving the runtime PM count permanently imbalanced
## Why This Is Suitable for Stable:
1. **Fixes a real bug**: Runtime PM reference count imbalances can lead to: - Device staying powered when it should be suspended - Device being suspended when it should be active - Power management failures - Potential system hangs or crashes
2. **Minimal change**: The fix is a simple one-liner that always updates `host->power_mode` regardless of error status
3. **Low regression risk**: - The change doesn't introduce new functionality - It doesn't alter the control flow - It simply ensures state consistency
4. **Clear bug fix**: This is not a feature addition or optimization - it's fixing incorrect error handling that violates runtime PM rules
5. **Affects users**: This bug can manifest in real-world scenarios when SD card power operations fail, particularly with flaky hardware or during system stress
6. **Subsystem contained**: The fix is limited to a single MMC host driver and doesn't affect core kernel infrastructure
The commit follows stable kernel rules by being a minimal, targeted fix for a real bug that can affect system stability and power management.
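The imbalance can be reproduced with a small userspace model. This is a sketch under stated assumptions, not the driver code: `struct host`, `set_power_mode()` and `run_sequence()` are hypothetical, the usage count is a plain integer, and a `fixed` flag selects between the old and the patched update of `power_mode`.

```c
#include <stdbool.h>

enum power_mode { POWER_OFF, POWER_ON };

/* Hypothetical model of the host: a usage count and the cached power mode. */
struct host {
	int usage_count;
	enum power_mode power_mode;
	bool hw_fails;		/* when true, power on/off operations fail */
};

/* Returns 0 on success, -1 on failure; fixed selects the patched behavior. */
static int set_power_mode(struct host *h, enum power_mode mode, bool fixed)
{
	int err;

	if (mode == h->power_mode)
		return 0;		/* early return: nothing to do */

	if (mode == POWER_OFF)
		h->usage_count--;	/* models pm_runtime_put_noidle() */
	else
		h->usage_count++;	/* models pm_runtime_get_noresume() */

	err = h->hw_fails ? -1 : 0;

	if (fixed || !err)
		h->power_mode = mode;
	return err;
}

/* Failed power-on, a retry, then power-off; returns the final usage count. */
static int run_sequence(bool fixed)
{
	struct host h = { 0, POWER_OFF, true };

	set_power_mode(&h, POWER_ON, fixed);	/* fails */
	set_power_mode(&h, POWER_ON, fixed);	/* retry */
	h.hw_fails = false;
	set_power_mode(&h, POWER_OFF, fixed);
	return h.usage_count;
}
```

In the unpatched variant the retry takes a second reference and the final power-off is skipped by the early return, so the count never returns to zero; in the patched variant the retry is skipped and the power-off drops the single reference.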
 drivers/mmc/host/rtsx_usb_sdmmc.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/mmc/host/rtsx_usb_sdmmc.c b/drivers/mmc/host/rtsx_usb_sdmmc.c
index d229c2b83ea9..8c35cb85a9c0 100644
--- a/drivers/mmc/host/rtsx_usb_sdmmc.c
+++ b/drivers/mmc/host/rtsx_usb_sdmmc.c
@@ -1029,9 +1029,7 @@ static int sd_set_power_mode(struct rtsx_usb_sdmmc *host,
 		err = sd_power_on(host);
 	}
 
-	if (!err)
-		host->power_mode = power_mode;
-
+	host->power_mode = power_mode;
 	return err;
 }
 
From: Gwendal Grignou gwendal@chromium.org
[ Upstream commit 981d7f91aeda17424b29f033249f4fa7cd2a7556 ]
When the EC/ISH starts, it can take a while for all the sensors to be up and running or declared broken.
If the sensor stack returns -EBUSY when checking for sensor information, retry up to 50 times. It has been observed that a 100ms wait time is enough to have valid sensors ready. It can take more time in case a sensor is really broken and is not coming up.
Signed-off-by: Gwendal Grignou gwendal@google.com
Link: https://lore.kernel.org/r/20250623210518.306740-1-gwendal@google.com
Signed-off-by: Tzung-Bi Shih tzungbi@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real user-impacting bug**: The commit addresses a race condition where the ChromeOS EC sensor hub fails to initialize sensors properly during EC/ISH startup. Without this fix, sensors may appear as broken or unavailable to userspace when they are actually just slow to initialize.
2. **Small and contained fix**: The change is minimal - it only adds:
   - A retry loop around `cros_ec_cmd_xfer_status()` for -EBUSY errors
   - A 5-6ms delay between retries
   - A maximum of 50 retries (approximately 250-300ms total timeout)
   - A warning message when retries were needed
3. **Clear bug fix pattern**: The EC returning -EBUSY (EC_RES_BUSY) is documented behavior indicating "Up but too busy. Should retry". The original code didn't handle this case, leading to sensors being skipped during initialization.
4. **Low regression risk**:
   - The retry logic only triggers on -EBUSY errors, not affecting the normal path
   - The change is isolated to sensor initialization during probe
   - Similar retry patterns exist in other parts of the cros_ec subsystem (e.g., commit 11799564fc7e)
   - The dev_warn to dev_err change is appropriate as sensor initialization failure is an error condition
5. **Hardware-specific timing issue**: This fixes a timing-dependent hardware initialization issue that can manifest differently across different EC/ISH firmware versions and boot conditions, making it important for stable operation across various Chromebook models.
6. **No architectural changes**: The commit doesn't introduce new features or change any interfaces - it simply makes the existing sensor initialization more robust by handling a known EC busy state.
The commit follows established patterns in the cros_ec subsystem for handling EC_RES_BUSY responses and is a straightforward reliability improvement that should be included in stable kernels to ensure consistent sensor availability on ChromeOS devices.
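The bounded retry on -EBUSY can be sketched in user space as follows. This is an illustration under stated assumptions, not the driver code: `fake_ec` and `fake_xfer()` are hypothetical stand-ins for the EC and for `cros_ec_cmd_xfer_status()`, and the 5-6ms sleep is only noted in a comment so the sketch runs instantly.

```c
#include <errno.h>

#define DEMO_INFO_RETRIES 50

/* Hypothetical EC that answers -EBUSY a fixed number of times. */
struct fake_ec {
	int busy_replies;
	int calls;
};

static int fake_xfer(struct fake_ec *ec)
{
	ec->calls++;
	if (ec->busy_replies > 0) {
		ec->busy_replies--;
		return -EBUSY;
	}
	return 0;
}

/* Returns how many retries were needed, or -1 if the EC stayed busy. */
int demo_retries_needed(int busy_replies)
{
	struct fake_ec ec = { busy_replies, 0 };
	int retries = DEMO_INFO_RETRIES;
	int ret;

	do {
		ret = fake_xfer(&ec);
		if (ret == -EBUSY) {
			/* The kernel code sleeps 5-6 ms here via usleep_range(). */
			retries--;
		}
	} while (ret == -EBUSY && retries);

	if (ret < 0)
		return -1;
	return DEMO_INFO_RETRIES - retries;
}
```

With this loop shape, an EC that is busy for 49 attempts still succeeds on the 50th call, while one that is busy for 50 attempts exhausts the budget and the sensor is reported as failed.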
 drivers/platform/chrome/cros_ec_sensorhub.c | 23 +++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/platform/chrome/cros_ec_sensorhub.c b/drivers/platform/chrome/cros_ec_sensorhub.c
index 50cdae67fa32..9bad8f72680e 100644
--- a/drivers/platform/chrome/cros_ec_sensorhub.c
+++ b/drivers/platform/chrome/cros_ec_sensorhub.c
@@ -8,6 +8,7 @@
 
 #include <linux/init.h>
 #include <linux/device.h>
+#include <linux/delay.h>
 #include <linux/mod_devicetable.h>
 #include <linux/module.h>
 #include <linux/platform_data/cros_ec_commands.h>
@@ -18,6 +19,7 @@
 #include <linux/types.h>
 
 #define DRV_NAME "cros-ec-sensorhub"
+#define CROS_EC_CMD_INFO_RETRIES 50
 
 static void cros_ec_sensorhub_free_sensor(void *arg)
 {
@@ -53,7 +55,7 @@ static int cros_ec_sensorhub_register(struct device *dev,
 	int sensor_type[MOTIONSENSE_TYPE_MAX] = { 0 };
 	struct cros_ec_command *msg = sensorhub->msg;
 	struct cros_ec_dev *ec = sensorhub->ec;
-	int ret, i;
+	int ret, i, retries;
 	char *name;
 
 
@@ -65,12 +67,25 @@ static int cros_ec_sensorhub_register(struct device *dev,
 		sensorhub->params->cmd = MOTIONSENSE_CMD_INFO;
 		sensorhub->params->info.sensor_num = i;
 
-		ret = cros_ec_cmd_xfer_status(ec->ec_dev, msg);
+		retries = CROS_EC_CMD_INFO_RETRIES;
+		do {
+			ret = cros_ec_cmd_xfer_status(ec->ec_dev, msg);
+			if (ret == -EBUSY) {
+				/* The EC is still busy initializing sensors. */
+				usleep_range(5000, 6000);
+				retries--;
+			}
+		} while (ret == -EBUSY && retries);
+
 		if (ret < 0) {
-			dev_warn(dev, "no info for EC sensor %d : %d/%d\n",
-				 i, ret, msg->result);
+			dev_err(dev, "no info for EC sensor %d : %d/%d\n",
+				i, ret, msg->result);
 			continue;
 		}
+		if (retries < CROS_EC_CMD_INFO_RETRIES) {
+			dev_warn(dev, "%d retries needed to bring up sensor %d\n",
+				 CROS_EC_CMD_INFO_RETRIES - retries, i);
+		}
 
 		switch (sensorhub->resp->info.type) {
 		case MOTIONSENSE_TYPE_ACCEL:
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 5f7e54b23e4d253eff3b10b12d6fa92d28d7dddc ]
We used to call hda_jackpoll_work() directly from a couple of places to update the jack state and notify user-space, but this makes the code flow rather fragile. Namely, because of those direct calls, hda_jackpoll_work() uses snd_hda_power_up_pm() and *_down_pm() calls instead of the standard snd_hda_power_up() and *_down() calls. The latter pair assures the runtime PM resume sync, so it can avoid the race against the PM callbacks gracefully, while the former pair may continue if called concurrently, hence it may race (by design).
In this patch, we change the call pattern of hda_jackpoll_work(); now all callers are replaced with the standard snd_hda_jack_report_sync() and the additional schedule_delayed_work().
Since hda_jackpoll_work() is called only from the associated work, it's always outside the PM code path, and we can safely use snd_hda_power_up() and *_down() there instead. This allows us to remove the racy check of power-state in hda_jackpoll_work(), as well as the tricky cancel_delayed_work() and rescheduling at hda_codec_runtime_suspend().
Reported-by: Joakim Zhang joakim.zhang@cixtech.com
Closes: https://lore.kernel.org/20250619020844.2974160-1-joakim.zhang@cixtech.com
Tested-by: Joakim Zhang joakim.zhang@cixtech.com
Signed-off-by: Takashi Iwai tiwai@suse.de
Link: https://patch.msgid.link/20250623131437.10670-4-tiwai@suse.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Key Reasons for Backporting:
### 1. **Fixes a Real Race Condition Bug**
The commit addresses a race condition in the HDA (High Definition Audio) jack polling mechanism. The race occurs because:
- Direct calls to `hda_jackpoll_work()` used `snd_hda_power_up_pm()` instead of `snd_hda_power_up()`
- The `*_pm()` variants don't provide runtime PM resume synchronization
- This allows concurrent execution and races with PM callbacks

### 2. **Bug Reported and Tested by Users**
- Reported-by: Joakim Zhang with a specific bug report
- Tested-by: Joakim Zhang confirming the fix works
- This indicates real-world impact on users

### 3. **Small and Contained Fix**
The patch statistics show:
- Only 1 file changed: `sound/pci/hda/hda_codec.c`
- 41 lines changed (14 insertions, 27 deletions)
- Net reduction in code complexity
- Changes are localized to jack polling logic

### 4. **Fixes Without Adding Features**
The commit purely fixes the race condition by:
- Removing direct calls to `hda_jackpoll_work()`
- Always scheduling work via `schedule_delayed_work()`
- Switching from `snd_hda_power_up_pm()` to safer `snd_hda_power_up()`
- Removing the racy `cancel_delayed_work_sync()` in suspend path

### 5. **Follows Stable Tree Criteria**
- **Important bug fix**: Race conditions in power management can cause system instability
- **Minimal risk**: Changes are well-contained to jack polling logic
- **No architectural changes**: Just refactors the calling pattern
- **Clear improvement**: Removes racy code patterns

### 6. **Critical Subsystem**
The HDA audio subsystem is critical for many systems, and race conditions in PM paths can lead to:
- Audio device failures after suspend/resume
- Potential system hangs during PM transitions
- Unpredictable behavior in jack detection

### 7. **Simplifies Code**
The fix actually simplifies the code by:
- Removing the complex power state check: `if (!codec->jackpoll_interval && snd_hdac_is_power_on(&codec->core))`
- Eliminating the tricky `cancel_delayed_work()` in suspend
- Making the work scheduling pattern consistent
The commit message clearly explains the race condition mechanism and the fix approach. The change pattern shows careful consideration of PM synchronization issues, which are notoriously difficult to debug in production systems. This is exactly the type of bug fix that stable kernels should include to prevent hard-to-reproduce issues in deployed systems.
 sound/pci/hda/hda_codec.c | 41 +++++++++++++--------------------------
 1 file changed, 14 insertions(+), 27 deletions(-)

diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c
index c018beeecd3d..d205f13653bb 100644
--- a/sound/pci/hda/hda_codec.c
+++ b/sound/pci/hda/hda_codec.c
@@ -639,24 +639,16 @@ static void hda_jackpoll_work(struct work_struct *work)
 	struct hda_codec *codec =
 		container_of(work, struct hda_codec, jackpoll_work.work);
 
-	/* for non-polling trigger: we need nothing if already powered on */
-	if (!codec->jackpoll_interval && snd_hdac_is_power_on(&codec->core))
+	if (!codec->jackpoll_interval)
 		return;
 
 	/* the power-up/down sequence triggers the runtime resume */
-	snd_hda_power_up_pm(codec);
+	snd_hda_power_up(codec);
 	/* update jacks manually if polling is required, too */
-	if (codec->jackpoll_interval) {
-		snd_hda_jack_set_dirty_all(codec);
-		snd_hda_jack_poll_all(codec);
-	}
-	snd_hda_power_down_pm(codec);
-
-	if (!codec->jackpoll_interval)
-		return;
-
-	schedule_delayed_work(&codec->jackpoll_work,
-			      codec->jackpoll_interval);
+	snd_hda_jack_set_dirty_all(codec);
+	snd_hda_jack_poll_all(codec);
+	schedule_delayed_work(&codec->jackpoll_work, codec->jackpoll_interval);
+	snd_hda_power_down(codec);
 }
 
 /* release all pincfg lists */
@@ -2895,12 +2887,12 @@ static void hda_call_codec_resume(struct hda_codec *codec)
 		snd_hda_regmap_sync(codec);
 	}
 
-	if (codec->jackpoll_interval)
-		hda_jackpoll_work(&codec->jackpoll_work.work);
-	else
-		snd_hda_jack_report_sync(codec);
+	snd_hda_jack_report_sync(codec);
 	codec->core.dev.power.power_state = PMSG_ON;
 	snd_hdac_leave_pm(&codec->core);
+	if (codec->jackpoll_interval)
+		schedule_delayed_work(&codec->jackpoll_work,
+				      codec->jackpoll_interval);
 }
 
 static int hda_codec_runtime_suspend(struct device *dev)
@@ -2912,8 +2904,6 @@ static int hda_codec_runtime_suspend(struct device *dev)
 	if (!codec->card)
 		return 0;
 
-	cancel_delayed_work_sync(&codec->jackpoll_work);
-
 	state = hda_call_codec_suspend(codec);
 	if (codec->link_down_at_suspend ||
 	    (codec_has_clkstop(codec) && codec_has_epss(codec) &&
@@ -2921,10 +2911,6 @@ static int hda_codec_runtime_suspend(struct device *dev)
 		snd_hdac_codec_link_down(&codec->core);
 	snd_hda_codec_display_power(codec, false);
 
-	if (codec->bus->jackpoll_in_suspend &&
-		(dev->power.power_state.event != PM_EVENT_SUSPEND))
-		schedule_delayed_work(&codec->jackpoll_work,
-					codec->jackpoll_interval);
 	return 0;
 }
 
@@ -3086,10 +3072,11 @@ int snd_hda_codec_build_controls(struct hda_codec *codec)
 	if (err < 0)
 		return err;
 
+	snd_hda_jack_report_sync(codec); /* call at the last init point */
 	if (codec->jackpoll_interval)
-		hda_jackpoll_work(&codec->jackpoll_work.work);
-	else
-		snd_hda_jack_report_sync(codec); /* call at the last init point */
+		schedule_delayed_work(&codec->jackpoll_work,
+				      codec->jackpoll_interval);
+
 	sync_power_up_states(codec);
 	return 0;
 }
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 1adcbdf54f76e1004bdf71df4eb1888c26e7ad06 ]
Although the jack polling is canceled at shutdown in snd_hda_codec_shutdown(), it might be still re-triggered when the work is being processed at cancel_delayed_work_sync() call. This may result in the unexpected hardware access that should have been already disabled.
To ensure that the jack polling stops, clear codec->jackpoll_interval at shutdown.
Reported-by: Joakim Zhang joakim.zhang@cixtech.com
Closes: https://lore.kernel.org/20250619020844.2974160-4-joakim.zhang@cixtech.com
Tested-by: Joakim Zhang joakim.zhang@cixtech.com
Signed-off-by: Takashi Iwai tiwai@suse.de
Link: https://patch.msgid.link/20250623131437.10670-2-tiwai@suse.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Bug Fix Analysis
The commit fixes a **race condition** during codec shutdown where jack polling work could be re-triggered even after attempting to cancel it. This is evident from the commit message: "it might be still re-triggered when the work is being processed at cancel_delayed_work_sync() call."
The race condition occurs because:
1. `cancel_delayed_work_sync()` waits for the current work to complete
2. While waiting, if the work function (`hda_jackpoll_work`) is executing, it checks `codec->jackpoll_interval` at line 635 of the original code
3. If `jackpoll_interval` is still non-zero, the work reschedules itself with `schedule_delayed_work(&codec->jackpoll_work, codec->jackpoll_interval)` at line 641
4. This creates a situation where hardware access occurs after shutdown has supposedly completed
## Code Change Analysis
The fix is **minimal and surgical** - just one line added:
```c
codec->jackpoll_interval = 0; /* don't poll any longer */
```
This is placed **before** the `cancel_delayed_work_sync()` call, ensuring that even if the work function is currently running, it won't reschedule itself because the interval check at the beginning of `hda_jackpoll_work()` will fail.
## Impact Assessment
1. **User-visible bug**: Yes - unexpected hardware access during shutdown can cause system hangs, crashes, or hardware-related issues
2. **Fix size**: Extremely small - single line addition
3. **Risk assessment**: Very low - setting a value to 0 before canceling work is a safe operation
4. **Subsystem criticality**: ALSA HD-Audio is widely used, but this specific fix is in the shutdown path
5. **Testing**: The commit has been tested by the reporter (Joakim Zhang)
## Stable Tree Criteria
This commit meets stable tree requirements:
- Fixes a real bug (race condition causing unexpected hardware access)
- Minimal change (1 line)
- Clear fix with obvious correctness
- No new features or architectural changes
- Only affects the shutdown path, limiting scope
- Has been tested by the bug reporter
The fix prevents potential system instability during shutdown, making it an important candidate for stable backporting.
 sound/pci/hda/hda_codec.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c
index d205f13653bb..0398df0f159a 100644
--- a/sound/pci/hda/hda_codec.c
+++ b/sound/pci/hda/hda_codec.c
@@ -3006,6 +3006,7 @@ void snd_hda_codec_shutdown(struct hda_codec *codec)
 	if (!codec->core.registered)
 		return;
 
+	codec->jackpoll_interval = 0; /* don't poll any longer */
 	cancel_delayed_work_sync(&codec->jackpoll_work);
 	list_for_each_entry(cpcm, &codec->pcm_list_head, list)
 		snd_pcm_suspend_all(cpcm->pcm);
From: George Gaidarov gdgaidarov+lkml@gmail.com
[ Upstream commit 493f9c930e5ff72b3508755b45488d1ae2c9650e ]
Device ID '0xa740' is shared by i7-14700, i7-14700K, and i7-14700T. Device ID '0xa704' is shared by i5-14600, i5-14600K, and i5-14600T.
Tested locally on my i7-14700K.
Signed-off-by: George Gaidarov gdgaidarov+lkml@gmail.com
Reviewed-by: Qiuxu Zhuo qiuxu.zhuo@intel.com
Signed-off-by: Tony Luck tony.luck@intel.com
Signed-off-by: Borislav Petkov (AMD) bp@alien8.de
Link: https://lore.kernel.org/r/20250529162933.1228735-1-gdgaidarov+lkml@gmail.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Device ID Addition Only**: This commit adds exactly two new PCI device IDs (`0xa740` and `0xa704`) to enable EDAC (Error Detection and Correction) support for Intel Core i5-14600 and i7-14700 processors. According to the stable kernel rules at Documentation/process/stable-kernel-rules.rst:15, patches that "just add a device ID" are explicitly allowed in stable trees.
2. **Minimal and Safe Change**: The patch is extremely small (4 lines of actual code change), adding only:
   - Two `#define` statements for the new device IDs
   - Two entries to the `ie31200_pci_tbl[]` array that map these IDs to the existing `rpl_s_cfg` configuration
3. **No Functional Changes**: The commit doesn't introduce any new functionality, algorithms, or modify existing behavior. It simply extends hardware support using the already-tested Raptor Lake-S configuration (`rpl_s_cfg`) for these new processor variants.
4. **Tested Hardware**: The commit message indicates "Tested locally on my i7-14700K", showing the change has been verified on actual hardware.
5. **Pattern of Similar Commits**: Looking at the driver's history, similar hardware enablement commits have been regularly added (e.g., commits `180f091224a0`, `021681830e41`, `05a61c6cb631`), following the same pattern of adding device IDs for new processor variants.
6. **User Benefit**: Without this patch, users with these specific Intel processors cannot use EDAC functionality to detect and report memory errors, which is an important reliability feature for systems using ECC memory.
The commit perfectly fits the stable tree criteria of being a simple device ID addition that enables existing, tested functionality for new hardware variants without any risk of regression to existing systems.
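To illustrate why a device ID addition is so low-risk, here is a simplified user-space sketch of how an ID table maps hardware to a shared configuration. The `demo_*` names and structures are hypothetical and much simpler than the kernel's `struct pci_device_id`; only the device IDs (taken from the patch) and Intel's vendor ID 0x8086 are real.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's per-platform configuration. */
struct demo_cfg {
	const char *name;
};

const struct demo_cfg demo_rpl_s_cfg = { "rpl_s" };

struct demo_pci_id {
	uint16_t vendor;
	uint16_t device;
	const struct demo_cfg *cfg;
};

/* Zero-terminated table; the two new IDs reuse the existing config. */
static const struct demo_pci_id demo_tbl[] = {
	{ 0x8086, 0xa700, &demo_rpl_s_cfg },
	{ 0x8086, 0xa740, &demo_rpl_s_cfg }, /* new in this patch */
	{ 0x8086, 0xa704, &demo_rpl_s_cfg }, /* new in this patch */
	{ 0, 0, NULL }
};

/* Returns the matching configuration, or NULL for unknown hardware. */
const struct demo_cfg *demo_match(uint16_t vendor, uint16_t device)
{
	const struct demo_pci_id *id;

	for (id = demo_tbl; id->vendor; id++) {
		if (id->vendor == vendor && id->device == device)
			return id->cfg;
	}
	return NULL;
}
```

Adding a row only changes the outcome for devices that previously matched nothing, so lookups for already-supported IDs return the same configuration as before.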
 drivers/edac/ie31200_edac.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/edac/ie31200_edac.c b/drivers/edac/ie31200_edac.c
index a53612be4b2f..6aac6672ba38 100644
--- a/drivers/edac/ie31200_edac.c
+++ b/drivers/edac/ie31200_edac.c
@@ -91,6 +91,8 @@
 #define PCI_DEVICE_ID_INTEL_IE31200_RPL_S_2 0x4640
 #define PCI_DEVICE_ID_INTEL_IE31200_RPL_S_3 0x4630
 #define PCI_DEVICE_ID_INTEL_IE31200_RPL_S_4 0xa700
+#define PCI_DEVICE_ID_INTEL_IE31200_RPL_S_5 0xa740
+#define PCI_DEVICE_ID_INTEL_IE31200_RPL_S_6 0xa704
 
 /* Alder Lake-S */
 #define PCI_DEVICE_ID_INTEL_IE31200_ADL_S_1 0x4660
@@ -740,6 +742,8 @@ static const struct pci_device_id ie31200_pci_tbl[] = {
 	{ PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IE31200_RPL_S_2), (kernel_ulong_t)&rpl_s_cfg},
 	{ PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IE31200_RPL_S_3), (kernel_ulong_t)&rpl_s_cfg},
 	{ PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IE31200_RPL_S_4), (kernel_ulong_t)&rpl_s_cfg},
+	{ PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IE31200_RPL_S_5), (kernel_ulong_t)&rpl_s_cfg},
+	{ PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IE31200_RPL_S_6), (kernel_ulong_t)&rpl_s_cfg},
 	{ PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IE31200_ADL_S_1), (kernel_ulong_t)&rpl_s_cfg},
 	{ 0, } /* 0 terminated list. */
 };
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
[ Upstream commit 9f85fdb9fc5a1bd308a10a0a7d7e34f2712ba58b ]
The purpose of the warning is to prevent an unexpected change to the return thunk mitigation. However, there are legitimate cases where the return thunk is intentionally set more than once. For example, ITS and SRSO both can set the return thunk after retbleed has set it. In both cases retbleed is still mitigated.
Replace the warning with an info about the active return thunk.
Suggested-by: Borislav Petkov bp@alien8.de
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Borislav Petkov (AMD) bp@alien8.de
Link: https://lore.kernel.org/20250611-eibrs-fix-v4-3-5ff86cac6c61@linux.intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Key Reasons for Backporting:
### 1. **Fixes a Real User-Visible Issue**
The code change shows that the original warning was triggering in legitimate scenarios:
```c
-	if (x86_return_thunk != __x86_return_thunk)
-		pr_warn("x86/bugs: return thunk changed\n");
```
This warning would appear in kernel logs whenever multiple CPU vulnerability mitigations interact, which is a normal and expected behavior. The false positive warnings could:
- Cause unnecessary alarm for system administrators
- Clutter kernel logs with misleading information
- Potentially mask real issues by creating noise

### 2. **Small, Contained Change**
The commit is minimal and focused:
- Only modifies 3 lines of functional code
- Changes are confined to a single function (`set_return_thunk()`)
- Simply replaces a conditional warning with an unconditional info message:
```c
+	pr_info("active return thunk: %ps\n", thunk);
```

### 3. **No Risk of Regression**
The change is safe because:
- It doesn't modify any security logic or mitigation behavior
- The function still sets the return thunk exactly as before
- The new info message provides better visibility into which mitigation is active
- The `%ps` format specifier safely prints the function symbol name

### 4. **Addresses Security Mitigation Interactions**
The commit message explicitly states that "ITS and SRSO both can set the return thunk after retbleed has set it" and that "retbleed is still mitigated." This is important for stable kernels because:
- Multiple CVE mitigations (Retbleed, SRSO, ITS) need to coexist
- The warning was incorrectly flagging legitimate mitigation sequences
- Stable kernels need these mitigations to work together without false alarms

### 5. **Part of Critical Security Infrastructure**
The x86/bugs subsystem handles CPU vulnerability mitigations for:
- CVE-2023-20569 (SRSO)
- CVE-2024-28956 (ITS)
- Retbleed vulnerabilities

These are critical security features that need to function correctly in stable kernels.

### 6. **Improves Operational Visibility**
The new info message (`pr_info("active return thunk: %ps\n", thunk);`) actually improves debugging and system monitoring by:
- Showing which specific return thunk is currently active
- Providing better audit trail for security mitigation changes
- Using proper function symbol printing with `%ps`

## Conclusion:
This commit meets all the criteria for stable backporting:
- Fixes a real issue (false positive warnings)
- Minimal risk (3-line change, no logic modifications)
- Improves system behavior without introducing new features
- Addresses interaction between critical security mitigations
- Makes the kernel more reliable and less confusing for operators
The change is particularly important for stable kernels that need to support multiple CPU vulnerability mitigations simultaneously without generating misleading warnings.
 arch/x86/kernel/cpu/bugs.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index f4d3abb12317..ea7f94438254 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -115,10 +115,9 @@ void (*x86_return_thunk)(void) __ro_after_init = __x86_return_thunk;
 
 static void __init set_return_thunk(void *thunk)
 {
-	if (x86_return_thunk != __x86_return_thunk)
-		pr_warn("x86/bugs: return thunk changed\n");
-
 	x86_return_thunk = thunk;
+
+	pr_info("active return thunk: %ps\n", thunk);
 }
 
 /* Update SPEC_CTRL MSR and its cached copy unconditionally */
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
[ Upstream commit 98ff5c071d1cde9426b0bfa449c43d49ec58f1c4 ]
The retbleed select function leaves the mitigation to AUTO in some cases. Moreover, the update function can also set the mitigation to AUTO. This is inconsistent with other mitigations and requires explicit handling of AUTO at the end of update step.
Make sure a mitigation gets selected in the select step, and do not change it to AUTO in the update step. When no mitigation can be selected leave it to NONE, which is what AUTO was getting changed to in the end.
Suggested-by: Borislav Petkov bp@alien8.de
Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com
Signed-off-by: Borislav Petkov (AMD) bp@alien8.de
Acked-by: Borislav Petkov (AMD) bp@alien8.de
Link: https://lore.kernel.org/20250611-eibrs-fix-v4-1-5ff86cac6c61@linux.intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Now let me analyze the impact and determine if this should be backported:
## Backport Status: YES
This commit should be backported to stable kernel trees for the following reasons:
### 1. **Fixes a Bug in Security Mitigation Logic**
The commit fixes an inconsistency in the RETBleed mitigation selection logic where the mitigation could remain in the `AUTO` state after the selection and update steps. This is a bug in the security mitigation implementation that could potentially leave systems incorrectly configured.

### 2. **Small and Contained Fix**
The changes are minimal and well-contained:
- Removes the problematic `AUTO` state assignments in `retbleed_update_mitigation()`
- Adds proper mitigation selection for Intel CPUs in `retbleed_select_mitigation()`
- Removes unnecessary code paths (the `goto out` pattern)
- Total changes are limited to ~30 lines of code in a single file

### 3. **Fixes Security Mitigation Consistency**
The commit message explicitly states: "This is inconsistent with other mitigations and requires explicit handling of AUTO at the end of update step." This inconsistency in security mitigation handling is exactly the type of bug that should be fixed in stable kernels.

### 4. **Low Risk of Regression**
The changes are straightforward:
- Instead of leaving mitigation as `AUTO`, it now explicitly selects `NONE` when no mitigation is available
- For Intel CPUs, it properly selects EIBRS/IBRS/NONE based on CPU features during the select phase
- The logic flow is simplified by removing the `goto out` pattern

### 5. **Security-Critical Subsystem**
RETBleed is a serious CPU vulnerability (CVE-2022-29900 and CVE-2022-29901) that allows arbitrary speculative code execution. Any bugs in its mitigation logic should be promptly fixed in stable kernels to ensure users are properly protected.

### 6. **Clear Intent for Stable**
While not explicitly marked with a "Cc: stable" tag, this is clearly a bugfix for security mitigation code, which typically qualifies for stable backporting according to stable kernel rules.
The commit ensures that the RETBleed mitigation selection process works correctly and consistently, preventing situations where the mitigation might be left in an undefined `AUTO` state. This is particularly important for security-conscious environments that rely on proper CPU vulnerability mitigations.
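The Intel branch added to the select step can be sketched as a pure function. This is an illustrative model only: the `demo_*` enum and function are hypothetical, and the two boolean parameters stand in for the kernel's `boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)` and `boot_cpu_has(X86_FEATURE_IBRS)` checks.

```c
#include <stdbool.h>

/* Hypothetical subset of the mitigation states relevant to Intel here. */
enum demo_retbleed {
	DEMO_RETBLEED_NONE,
	DEMO_RETBLEED_IBRS,
	DEMO_RETBLEED_EIBRS
};

/*
 * Always returns a concrete choice; there is no AUTO value left over
 * for a later step to resolve.
 */
enum demo_retbleed demo_select_intel(bool has_eibrs, bool has_ibrs)
{
	if (has_eibrs)
		return DEMO_RETBLEED_EIBRS;
	if (has_ibrs)
		return DEMO_RETBLEED_IBRS;
	return DEMO_RETBLEED_NONE;
}
```

Because every path returns a concrete value, the update step no longer needs a final "if still AUTO, fall back to NONE" check, which matches the lines the patch removes.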
 arch/x86/kernel/cpu/bugs.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index ea7f94438254..1f4146b9101a 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1250,6 +1250,14 @@ static void __init retbleed_select_mitigation(void)
 			retbleed_mitigation = RETBLEED_MITIGATION_IBPB;
 		else
 			retbleed_mitigation = RETBLEED_MITIGATION_NONE;
+	} else if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) {
+		/* Final mitigation depends on spectre-v2 selection */
+		if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED))
+			retbleed_mitigation = RETBLEED_MITIGATION_EIBRS;
+		else if (boot_cpu_has(X86_FEATURE_IBRS))
+			retbleed_mitigation = RETBLEED_MITIGATION_IBRS;
+		else
+			retbleed_mitigation = RETBLEED_MITIGATION_NONE;
 	}
 }
 
@@ -1258,9 +1266,6 @@ static void __init retbleed_update_mitigation(void)
 	if (!boot_cpu_has_bug(X86_BUG_RETBLEED) || cpu_mitigations_off())
 		return;
 
-	if (retbleed_mitigation == RETBLEED_MITIGATION_NONE)
-		goto out;
-
 	/*
 	 * retbleed=stuff is only allowed on Intel. If stuffing can't be used
 	 * then a different mitigation will be selected below.
@@ -1271,7 +1276,7 @@ static void __init retbleed_update_mitigation(void)
 	    its_mitigation == ITS_MITIGATION_RETPOLINE_STUFF) {
 		if (spectre_v2_enabled != SPECTRE_V2_RETPOLINE) {
 			pr_err("WARNING: retbleed=stuff depends on spectre_v2=retpoline\n");
-			retbleed_mitigation = RETBLEED_MITIGATION_AUTO;
+			retbleed_mitigation = RETBLEED_MITIGATION_NONE;
 		} else {
 			if (retbleed_mitigation != RETBLEED_MITIGATION_STUFF)
 				pr_info("Retbleed mitigation updated to stuffing\n");
@@ -1297,15 +1302,11 @@ static void __init retbleed_update_mitigation(void)
 			if (retbleed_mitigation != RETBLEED_MITIGATION_STUFF)
 				pr_err(RETBLEED_INTEL_MSG);
 		}
-		/* If nothing has set the mitigation yet, default to NONE. */
-		if (retbleed_mitigation == RETBLEED_MITIGATION_AUTO)
-			retbleed_mitigation = RETBLEED_MITIGATION_NONE;
 	}
-out:
+
 	pr_info("%s\n", retbleed_strings[retbleed_mitigation]);
 }
 
-
 static void __init retbleed_apply_mitigation(void)
 {
 	bool mitigate_smt = false;
From: Mark Brown broonie@kernel.org
[ Upstream commit c4ca928a6db1593802cd945f075a7e21dd0430c1 ]
We currently log parse failures for ELD data and some disconnection events as errors without rate limiting. These log messages can be triggered very frequently in some situations, especially ELD parsing when there is nothing connected to a HDMI port which will generate:
hdmi-audio-codec hdmi-audio-codec.1.auto: HDMI: Unknown ELD version 0
While there's doubtless work that could be done on reducing the number of connection notification callbacks it's possible these may be legitimately generated by poor quality physical connections so let's use rate limiting to mitigate the log spam for the parse errors and lower the severity for disconnect logging to debug level.
Signed-off-by: Mark Brown broonie@kernel.org
Link: https://patch.msgid.link/20250613-asoc-hdmi-eld-logging-v1-1-76d64154d969@ke...
Signed-off-by: Mark Brown broonie@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit is a good candidate for backporting to stable kernel trees for the following reasons:
1. **Fixes a real user-facing bug**: The commit addresses log spam issues that can fill system logs with repeated error messages. When no HDMI display is connected, the driver continuously logs "HDMI: Unknown ELD version 0" errors, which can consume disk space and make logs difficult to analyze.
2. **Small and contained fix**: The changes are minimal and focused:
   - Changes `dev_err()` to `dev_err_ratelimited()` in two locations within `hdac_hdmi_parse_eld()` (lines 1233-1234 and 1242-1243)
   - Changes `dev_err()` to `dev_dbg()` for disconnect messages in `hdac_hdmi_present_sense()` (lines 1302-1303)
   - Total of just 3 logging statement modifications
3. **No functional changes**: The commit only modifies logging behavior without changing any driver logic, state management, or hardware interaction. This minimizes regression risk.
4. **Affects a common use case**: The issue occurs when HDMI ports are unconnected, which is a normal operating condition for many systems. Users may have HDMI ports they never use, or may frequently plug/unplug monitors.
5. **No architectural changes**: The fix doesn't introduce new features or modify the driver's architecture. It simply rate-limits existing error messages and downgrades one message's severity.
6. **Clear problem and solution**: The commit message clearly describes the issue (log spam from ELD parsing failures) and the straightforward solution (rate limiting and severity adjustment).
The specific code changes show:
- In `hdac_hdmi_parse_eld()`: Rate-limiting prevents flooding logs when ELD data is invalid/unavailable
- In `hdac_hdmi_present_sense()`: Downgrading disconnect messages from error to debug level is appropriate since disconnection is a normal event, not an error condition
This meets the stable kernel criteria of being a simple fix for an annoying bug that affects users without introducing new risks.
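As a rough illustration of what rate limiting buys here, the sketch below models a window-and-burst limiter in user space. It is not the kernel's implementation: `demo_ratelimit` and its helpers are hypothetical, time is passed in explicitly as milliseconds, and the 5-second window with a burst of 10 is an assumption chosen to resemble the kernel's default rate-limit parameters.

```c
#include <stdbool.h>

/* Hypothetical limiter: at most `burst` messages per `interval_ms`. */
struct demo_ratelimit {
	long interval_ms;
	int burst;
	long window_start_ms;
	int printed;
	int suppressed;
};

/* Returns true if a message arriving at now_ms may be printed. */
static bool demo_ratelimit_ok(struct demo_ratelimit *rs, long now_ms)
{
	if (now_ms - rs->window_start_ms >= rs->interval_ms) {
		/* A new window begins; the budget is refilled. */
		rs->window_start_ms = now_ms;
		rs->printed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->suppressed++;
	return false;
}

/* Feeds `events` messages spaced `step_ms` apart; returns how many print. */
int demo_count_printed(int events, long step_ms)
{
	struct demo_ratelimit rs = { 5000, 10, 0, 0, 0 };
	int printed = 0;
	int i;

	for (i = 0; i < events; i++) {
		if (demo_ratelimit_ok(&rs, i * step_ms))
			printed++;
	}
	return printed;
}
```

Under these assumptions, 100 parse errors arriving 1 ms apart produce only 10 log lines, while 100 errors arriving one second apart are all printed, so occasional genuine failures stay visible.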
 sound/soc/codecs/hdac_hdmi.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/sound/soc/codecs/hdac_hdmi.c b/sound/soc/codecs/hdac_hdmi.c
index 1139a2754ca3..056d98154682 100644
--- a/sound/soc/codecs/hdac_hdmi.c
+++ b/sound/soc/codecs/hdac_hdmi.c
@@ -1232,7 +1232,8 @@ static int hdac_hdmi_parse_eld(struct hdac_device *hdev,
 						>> DRM_ELD_VER_SHIFT;
 
 	if (ver != ELD_VER_CEA_861D && ver != ELD_VER_PARTIAL) {
-		dev_err(&hdev->dev, "HDMI: Unknown ELD version %d\n", ver);
+		dev_err_ratelimited(&hdev->dev,
+				    "HDMI: Unknown ELD version %d\n", ver);
 		return -EINVAL;
 	}
 
@@ -1240,7 +1241,8 @@ static int hdac_hdmi_parse_eld(struct hdac_device *hdev,
 		DRM_ELD_MNL_MASK) >> DRM_ELD_MNL_SHIFT;
 
 	if (mnl > ELD_MAX_MNL) {
-		dev_err(&hdev->dev, "HDMI: MNL Invalid %d\n", mnl);
+		dev_err_ratelimited(&hdev->dev,
+				    "HDMI: MNL Invalid %d\n", mnl);
 		return -EINVAL;
 	}
 
@@ -1299,8 +1301,8 @@ static void hdac_hdmi_present_sense(struct hdac_hdmi_pin *pin,
 
 	if (!port->eld.monitor_present || !port->eld.eld_valid) {
 
-		dev_err(&hdev->dev, "%s: disconnect for pin:port %d:%d\n",
-						__func__, pin->nid, port->id);
+		dev_dbg(&hdev->dev, "%s: disconnect for pin:port %d:%d\n",
+			__func__, pin->nid, port->id);
 
 		/*
 		 * PCMs are not registered during device probe, so don't
From: Alok Tiwari alok.a.tiwari@oracle.com
[ Upstream commit 87aafc8580acf87fcaf1a7e30ed858d8c8d37d81 ]
The code mistakenly used a hardcoded index (codec[1]) instead of iterating over the codec array using the loop variable i. Use codec[i] instead of codec[1] to match the loop iteration.
Signed-off-by: Alok Tiwari alok.a.tiwari@oracle.com
Link: https://patch.msgid.link/20250621185233.4081094-1-alok.a.tiwari@oracle.com
Signed-off-by: Takashi Iwai tiwai@suse.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees. Here's the extensive explanation:
## Bug Analysis
The bug is a clear coding error in the ICH4 codec configuration code within the ALSA intel8x0 driver. Looking at the code change:
```c
- tmp |= chip->ac97_sdin[pcm->r[0].codec[1]->num] << ICH_DI2L_SHIFT;
+ tmp |= chip->ac97_sdin[pcm->r[0].codec[i]->num] << ICH_DI2L_SHIFT;
```
The issue is in a loop that iterates from `i = 1` to `i < 4` (lines 2250-2255), checking each codec slot. However, the original code hardcoded `codec[1]` instead of using the loop variable `codec[i]`. This means:
1. **The loop was pointless** - it would always use codec[1] regardless of which iteration found a valid codec
2. **Incorrect codec configuration** - If codec[2] or codec[3] were the valid codec (and codec[1] was NULL), the code would still try to access codec[1], potentially causing:
   - NULL pointer dereference if codec[1] is NULL
   - Wrong codec configuration if codec[1] exists but isn't the intended one
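The intended behaviour of the loop can be sketched in standalone C. The names `struct codec`, `MAX_CODECS` and `first_secondary_codec_num()` are hypothetical stand-ins chosen for illustration, not the driver's own types; the only point is that the slot found by the loop is the slot that gets read.

```c
#include <stddef.h>

#define MAX_CODECS 4  /* assumption: four slots, matching the loop bound i < 4 */

/* Hypothetical stand-in for the driver's codec object. */
struct codec {
    int num;
};

/*
 * Return the `num` of the first populated slot in codec[1..3],
 * or -1 if none of those slots is populated.
 * The slot is read through the loop variable `i`, as in the fixed code;
 * reading codec[1] here instead would dereference NULL whenever only
 * codec[2] or codec[3] is populated.
 */
static int first_secondary_codec_num(struct codec *codec[MAX_CODECS])
{
    for (int i = 1; i < MAX_CODECS; i++) {
        if (codec[i])
            return codec[i]->num;
    }
    return -1;
}
```

Calling this with only slot 2 populated returns that codec's `num`, which is the case the hardcoded index got wrong.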
## Impact Assessment
This bug affects:
- **Hardware**: Intel ICH4 (Intel I/O Controller Hub 4) chipsets only
- **Functionality**: AC'97 codec configuration for multiple SDIN (Serial Data In) paths
- **When triggered**: When using ICH4 with multiple codecs where the second valid codec is not in slot 1
## Backport Suitability
This commit meets all criteria for stable backporting:
1. **Fixes a real bug**: Clear array indexing error that can cause incorrect hardware configuration or crashes
2. **Small and contained**: Single line change, exactly 1 insertion and 1 deletion
3. **No side effects**: The fix simply corrects the loop to work as originally intended
4. **No architectural changes**: Pure bug fix with no API/ABI changes
5. **Critical subsystem**: ALSA sound drivers are important for user experience
6. **Long-standing bug**: The code has been incorrect since the initial git import (2005), making it important to fix in all maintained kernels
7. **Clear fix**: The correction is obvious - using the loop variable instead of a hardcoded index
The bug has existed since the kernel's initial git import in 2005 (commit 1da177e4c3f4), meaning it affects all kernel versions and should be backported to all stable trees that still receive updates.
sound/pci/intel8x0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/pci/intel8x0.c b/sound/pci/intel8x0.c
index 51e7f1f1a48e..b521cec20333 100644
--- a/sound/pci/intel8x0.c
+++ b/sound/pci/intel8x0.c
@@ -2249,7 +2249,7 @@ static int snd_intel8x0_mixer(struct intel8x0 *chip, int ac97_clock,
 		tmp |= chip->ac97_sdin[0] << ICH_DI1L_SHIFT;
 		for (i = 1; i < 4; i++) {
 			if (pcm->r[0].codec[i]) {
-				tmp |= chip->ac97_sdin[pcm->r[0].codec[1]->num] << ICH_DI2L_SHIFT;
+				tmp |= chip->ac97_sdin[pcm->r[0].codec[i]->num] << ICH_DI2L_SHIFT;
 				break;
 			}
 		}
From: Markus Stockhausen markus.stockhausen@gmx.de
[ Upstream commit 2250db8628a0d8293ad2e0671138b848a185fba1 ]
Devices of the Realtek MIPS Otto platform use the official rtl-otto-timer as clock event generator and CPU clocksource. It is registered for each CPU startup via cpuhp_setup_state() and forces the affinity of the clockevent interrupts to the appropriate CPU via irq_force_affinity().
On the "smaller" devices with a vendor specific interrupt controller (supported by irq-realtek-rtl) the registration works fine. The "larger" RTL931x series is based on a MIPS interAptiv dual core with a MIPS GIC controller. Interrupt routing setup is cancelled because gic_set_affinity() does not accept the current (not yet online) CPU as a target.
Relax the checks by evaluating the force parameter that is provided for exactly this purpose like in other drivers. With this the affinity can be set as follows:
- force = false: allow to set affinity to any online cpu
- force = true: allow to set affinity to any cpu

Co-developed-by: Sebastian Gottschall s.gottschall@dd-wrt.com
Signed-off-by: Sebastian Gottschall s.gottschall@dd-wrt.com
Signed-off-by: Markus Stockhausen markus.stockhausen@gmx.de
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Link: https://lore.kernel.org/all/20250621054952.380374-1-markus.stockhausen@gmx.d...
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit, here is my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real bug affecting users**: The commit fixes a specific issue where clockevent interrupt affinity setup fails on Realtek RTL931x devices with MIPS GIC controllers. Without this fix, the timer registration during CPU hotplug fails because `gic_set_affinity()` rejects setting affinity to CPUs that are not yet online.
2. **Small and contained fix**: The change is minimal - it only adds 4 lines of code to properly handle the `force` parameter that was already being passed but ignored. The logic change is straightforward:
   - When `force=false`: Use existing behavior (only allow online CPUs)
   - When `force=true`: Allow any CPU in the mask
3. **Follows established patterns**: The commit shows that multiple other timer drivers (`timer-rtl-otto.c`, `timer-tegra.c`, `exynos_mct.c`, etc.) use `irq_force_affinity()` during CPU hotplug, which eventually calls the chip's `set_affinity` handler with `force=true`. The MIPS GIC driver was incorrectly ignoring this parameter.
4. **No architectural changes**: This is purely a bug fix that makes the MIPS GIC driver honor the existing `force` parameter semantics. It doesn't introduce new features or change any architecture.
5. **Clear use case**: The commit message clearly explains that this affects real hardware (Realtek RTL931x series) and prevents proper timer initialization during CPU hotplug, which is a fundamental kernel operation.
6. **Low regression risk**: The change only affects behavior when `force=true` is explicitly passed, which is specifically for CPU hotplug scenarios. Normal interrupt affinity operations (with `force=false`) remain unchanged.
7. **Aligns with kernel conventions**: The kernel documentation for `irq_force_affinity()` explicitly states it's "Solely for low level cpu hotplug code, where we need to make per cpu interrupts affine before the cpu becomes online." This commit makes the MIPS GIC driver comply with this intended behavior.
The fix is essential for proper operation of affected hardware and follows the stable kernel criteria of being an important bugfix with minimal risk.
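The `force` semantics described above can be modelled in userspace with plain bitmasks. This is a rough sketch, not the driver code: `mask_first()` and `pick_target_cpu()` are hypothetical helpers, `NR_CPUS` is set to an arbitrary small value, and a `uint32_t` stands in for `struct cpumask`.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8  /* assumption: small value chosen for illustration */

/* Index of the lowest set bit in `mask`, or NR_CPUS if no bit is set. */
static unsigned int mask_first(uint32_t mask)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (mask & (UINT32_C(1) << cpu))
            return cpu;
    }
    return NR_CPUS;
}

/*
 * Pick the CPU an interrupt should be routed to.
 * force == false: only CPUs that are both requested and online qualify.
 * force == true:  any requested CPU qualifies, online or not.
 * Returns -1 when no CPU qualifies (the driver returns -EINVAL there).
 */
static int pick_target_cpu(uint32_t requested, uint32_t online, bool force)
{
    unsigned int cpu = force ? mask_first(requested)
                             : mask_first(requested & online);

    if (cpu >= NR_CPUS)
        return -1;
    return (int)cpu;
}
```

With only CPU 0 online and CPU 1 requested, the unforced call fails while the forced call selects CPU 1, which mirrors the hotplug case where the timer interrupt must be bound before the CPU is marked online.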
drivers/irqchip/irq-mips-gic.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/irqchip/irq-mips-gic.c b/drivers/irqchip/irq-mips-gic.c
index 34e8d09c12a0..19a57c5e2b2e 100644
--- a/drivers/irqchip/irq-mips-gic.c
+++ b/drivers/irqchip/irq-mips-gic.c
@@ -375,9 +375,13 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *cpumask,
 	/*
 	 * The GIC specifies that we can only route an interrupt to one VP(E),
 	 * ie. CPU in Linux parlance, at a time. Therefore we always route to
-	 * the first online CPU in the mask.
+	 * the first forced or online CPU in the mask.
 	 */
-	cpu = cpumask_first_and(cpumask, cpu_online_mask);
+	if (force)
+		cpu = cpumask_first(cpumask);
+	else
+		cpu = cpumask_first_and(cpumask, cpu_online_mask);
+
 	if (cpu >= NR_CPUS)
 		return -EINVAL;
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
[ Upstream commit 6b3cb7f4341cbf62d41ccf6ea906dbe66be8aa3d ]
Parsing the dapm_widget_tokens is also needed for DSPless mode as it is setting the snd_soc_dapm_widget.no_wname_in_kcontrol_name flag for the kcontrol creation from DAPM widgets. Without that flag set, the following warnings might appear because of long control names:
ALSA: Control name 'eqiir.2.1 Post Mixer Analog Playback IIR Eq bytes' truncated to 'eqiir.2.1 Post Mixer Analog Playback IIR Eq'
ALSA: Control name 'eqfir.2.1 Post Mixer Analog Playback FIR Eq bytes' truncated to 'eqfir.2.1 Post Mixer Analog Playback FIR Eq'
ALSA: Control name 'drc.2.1 Post Mixer Analog Playback DRC bytes' truncated to 'drc.2.1 Post Mixer Analog Playback DRC byte'
ALSA: Control name 'drc.2.1 Post Mixer Analog Playback DRC switch' truncated to 'drc.2.1 Post Mixer Analog Playback DRC swit'
ALSA: Control name 'gain.15.1 Pre Mixer Deepbuffer HDA Analog Volume' truncated to 'gain.15.1 Pre Mixer Deepbuffer HDA Analog V'

Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com
Reviewed-by: Liam Girdwood liam.r.girdwood@intel.com
Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com
Link: https://patch.msgid.link/20250619102640.12068-1-peter.ujfalusi@linux.intel.c...
Signed-off-by: Mark Brown broonie@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Bug Fix**: This commit fixes a functional bug in DSPless mode where control names are being truncated due to excessive length. The warnings shown in the commit message indicate real issues that affect user-visible behavior:
   - "Control name 'eqiir.2.1 Post Mixer Analog Playback IIR Eq bytes' truncated..."
   - Multiple similar truncation warnings for various audio controls
2. **Small and Contained Fix**: The code change is minimal and localized:
   - Adds only ~10 lines of code to parse `dapm_widget_tokens` in the `sof_dspless_widget_ready()` function
   - Simply moves existing token parsing logic that was already present in the regular SOF mode (line 1438) to also execute in DSPless mode
   - No architectural changes or new features introduced
3. **Low Risk**: The change has minimal regression risk:
   - Only affects DSPless mode operation, not the standard DSP mode
   - Reuses existing, tested parsing code (`sof_parse_tokens` with `dapm_widget_tokens`)
   - The token being parsed (`SOF_TKN_COMP_NO_WNAME_IN_KCONTROL_NAME`) sets a flag that prevents widget names from being included in control names, thus avoiding the truncation
4. **Clear Root Cause**: The issue occurs because without parsing these tokens, the `no_wname_in_kcontrol_name` flag in `snd_soc_dapm_widget` (line 542 in soc-dapm.h) isn't set, causing ALSA to create overly long control names that exceed the 44-character limit and get truncated.
5. **Part of DSPless Mode Support**: DSPless mode is an important feature that allows audio to work without DSP firmware, and this fixes a user-visible issue (truncated control names) in that mode. The git history shows multiple DSPless-related fixes have been backported before (e.g., commit ef0128afa165).
The commit follows stable tree rules by fixing an important bug with minimal changes and low risk of regression.
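The truncation itself can be reproduced with a small userspace sketch. Several things here are assumptions made for illustration: the helper `build_kcontrol_name()` is hypothetical, the 44-byte buffer (43 visible characters plus the terminator) is inferred from the truncated names quoted in the commit message, and the split of the long name into a widget part (`eqiir.2.1`) and a control part is a guess rather than something taken from the topology files.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed name buffer size, including the terminating NUL. */
#define CTL_NAME_MAXLEN 44

/*
 * Build a control name into `out`.
 * no_wname == false: "<widget> <control>" (widget name is prefixed).
 * no_wname == true:  "<control>" only.
 * Returns true if the name did not fit and was truncated.
 */
static bool build_kcontrol_name(char out[CTL_NAME_MAXLEN],
                                const char *widget, const char *control,
                                bool no_wname)
{
    int n;

    if (no_wname)
        n = snprintf(out, CTL_NAME_MAXLEN, "%s", control);
    else
        n = snprintf(out, CTL_NAME_MAXLEN, "%s %s", widget, control);

    /* snprintf() reports the length the full string would have needed. */
    return n >= CTL_NAME_MAXLEN;
}
```

Under these assumptions, prefixing the widget name yields a 49-character name that is cut to 43 characters, while leaving the prefix out keeps the 39-character control name intact.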
sound/soc/sof/topology.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/sound/soc/sof/topology.c b/sound/soc/sof/topology.c index d612d693efc3..b6d5c8024f8c 100644 --- a/sound/soc/sof/topology.c +++ b/sound/soc/sof/topology.c @@ -2378,14 +2378,25 @@ static int sof_dspless_widget_ready(struct snd_soc_component *scomp, int index, struct snd_soc_dapm_widget *w, struct snd_soc_tplg_dapm_widget *tw) { + struct snd_soc_tplg_private *priv = &tw->priv; + int ret; + + /* for snd_soc_dapm_widget.no_wname_in_kcontrol_name */ + ret = sof_parse_tokens(scomp, w, dapm_widget_tokens, + ARRAY_SIZE(dapm_widget_tokens), + priv->array, le32_to_cpu(priv->size)); + if (ret < 0) { + dev_err(scomp->dev, "failed to parse dapm widget tokens for %s\n", + w->name); + return ret; + } + if (WIDGET_IS_DAI(w->id)) { static const struct sof_topology_token dai_tokens[] = { {SOF_TKN_DAI_TYPE, SND_SOC_TPLG_TUPLE_TYPE_STRING, get_token_dai_type, 0}}; struct snd_sof_dev *sdev = snd_soc_component_get_drvdata(scomp); - struct snd_soc_tplg_private *priv = &tw->priv; struct snd_sof_widget *swidget; struct snd_sof_dai *sdai; - int ret;
swidget = kzalloc(sizeof(*swidget), GFP_KERNEL); if (!swidget)
From: Joseph Tilahun jtilahun@astranis.com
[ Upstream commit 33a2515abd45c64911955ff1da179589db54f99f ]
The serial info sometimes produces negative TX/RX counts. E.g.:
3: uart:FSL_LPUART mmio:0x02970000 irq:46 tx:-1595870545 rx:339619 RTS|CTS|DTR|DSR|CD
It appears that the print format specifiers don't match with the types of the respective variables. E.g.: All of the fields in struct uart_icount are u32, but the format specifier used is %d, even though u32 is unsigned and %d is for signed integers. Update drivers/tty/serial/serial_core.c to use the proper format specifiers. Reference https://docs.kernel.org/core-api/printk-formats.html as the documentation for what format specifiers are the proper ones to use for a given C type.
Signed-off-by: Joseph Tilahun jtilahun@astranis.com
Link: https://lore.kernel.org/r/20250610065653.3750067-1-jtilahun@astranis.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **It fixes a real user-visible bug**: The commit fixes incorrect format specifiers that cause negative values to be displayed for TX/RX counts in `/proc/tty/driver/serial`. The example shows:
   ```
   3: uart:FSL_LPUART mmio:0x02970000 irq:46 tx:-1595870545 rx:339619
   ```
   This is clearly wrong as TX/RX counts should never be negative.
2. **The fix is simple and contained**: The changes only modify format specifiers from `%d` (signed) to `%u` (unsigned) in print statements. Looking at the code:
   - All `uart_icount` fields (tx, rx, frame, parity, brk, overrun, buf_overrun) are defined as `__u32` (unsigned 32-bit)
   - `port->line` is `unsigned int`
   - `port->irq` is `unsigned int`
   - `port->uartclk` is `unsigned int`
   - Other fields like `close_delay`, `closing_wait`, `io_type`, `iomem_reg_shift` are also unsigned types
3. **Low risk of regression**: The changes are purely cosmetic - they only affect how values are displayed, not the actual functionality. No logic changes, no structural changes, just format string corrections.
4. **Affects a core subsystem**: The serial core is used by many serial drivers, so this bug affects multiple platforms and drivers. The fix benefits all users of the serial subsystem.
5. **Follows stable tree rules**: This is exactly the type of fix suitable for stable:
   - Fixes an actual bug (incorrect display of statistics)
   - Minimal change
   - Obviously correct (matching format specifiers to variable types)
   - No new features or architectural changes
The commit is a straightforward correctness fix that improves the reliability of kernel diagnostics output without any risk of breaking functionality.
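The signed/unsigned mismatch is easy to reproduce in userspace. The sketch below uses two hypothetical helpers, `format_count_signed()` and `format_count_unsigned()`, and the counter value 2699096751, which is the unsigned 32-bit value whose bit pattern reads as the -1595870545 shown in the report. The explicit cast to `int32_t` is implementation-defined for values above `INT32_MAX`; it is assumed here to wrap, as it does on common compilers.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 32-bit counter the way the old code effectively did:
 * the same bits interpreted as a signed integer.
 * Assumption: the out-of-range conversion to int32_t wraps (two's complement).
 */
static void format_count_signed(char *buf, size_t len, uint32_t count)
{
    snprintf(buf, len, "tx:%" PRId32, (int32_t)count);
}

/* Format the counter with a specifier that matches its unsigned type. */
static void format_count_unsigned(char *buf, size_t len, uint32_t count)
{
    snprintf(buf, len, "tx:%" PRIu32, count);
}
```

For the counter value 2699096751 the signed variant produces `tx:-1595870545` and the unsigned variant produces `tx:2699096751`, so the negative number in the report is a display problem rather than a corrupted counter.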
drivers/tty/serial/serial_core.c | 44 ++++++++++++++++---------------- 1 file changed, 22 insertions(+), 22 deletions(-)
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c index 1f7708a91fc6..8a1482131257 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -1337,28 +1337,28 @@ static void uart_sanitize_serial_rs485_delays(struct uart_port *port, if (!port->rs485_supported.delay_rts_before_send) { if (rs485->delay_rts_before_send) { dev_warn_ratelimited(port->dev, - "%s (%d): RTS delay before sending not supported\n", + "%s (%u): RTS delay before sending not supported\n", port->name, port->line); } rs485->delay_rts_before_send = 0; } else if (rs485->delay_rts_before_send > RS485_MAX_RTS_DELAY) { rs485->delay_rts_before_send = RS485_MAX_RTS_DELAY; dev_warn_ratelimited(port->dev, - "%s (%d): RTS delay before sending clamped to %u ms\n", + "%s (%u): RTS delay before sending clamped to %u ms\n", port->name, port->line, rs485->delay_rts_before_send); }
if (!port->rs485_supported.delay_rts_after_send) { if (rs485->delay_rts_after_send) { dev_warn_ratelimited(port->dev, - "%s (%d): RTS delay after sending not supported\n", + "%s (%u): RTS delay after sending not supported\n", port->name, port->line); } rs485->delay_rts_after_send = 0; } else if (rs485->delay_rts_after_send > RS485_MAX_RTS_DELAY) { rs485->delay_rts_after_send = RS485_MAX_RTS_DELAY; dev_warn_ratelimited(port->dev, - "%s (%d): RTS delay after sending clamped to %u ms\n", + "%s (%u): RTS delay after sending clamped to %u ms\n", port->name, port->line, rs485->delay_rts_after_send); } } @@ -1388,14 +1388,14 @@ static void uart_sanitize_serial_rs485(struct uart_port *port, struct serial_rs4 rs485->flags &= ~SER_RS485_RTS_AFTER_SEND;
dev_warn_ratelimited(port->dev, - "%s (%d): invalid RTS setting, using RTS_ON_SEND instead\n", + "%s (%u): invalid RTS setting, using RTS_ON_SEND instead\n", port->name, port->line); } else { rs485->flags |= SER_RS485_RTS_AFTER_SEND; rs485->flags &= ~SER_RS485_RTS_ON_SEND;
dev_warn_ratelimited(port->dev, - "%s (%d): invalid RTS setting, using RTS_AFTER_SEND instead\n", + "%s (%u): invalid RTS setting, using RTS_AFTER_SEND instead\n", port->name, port->line); } } @@ -1834,7 +1834,7 @@ static void uart_wait_until_sent(struct tty_struct *tty, int timeout)
expire = jiffies + timeout;
- pr_debug("uart_wait_until_sent(%d), jiffies=%lu, expire=%lu...\n", + pr_debug("uart_wait_until_sent(%u), jiffies=%lu, expire=%lu...\n", port->line, jiffies, expire);
/* @@ -2028,7 +2028,7 @@ static void uart_line_info(struct seq_file *m, struct uart_state *state) return;
mmio = uport->iotype >= UPIO_MEM; - seq_printf(m, "%d: uart:%s %s%08llX irq:%d", + seq_printf(m, "%u: uart:%s %s%08llX irq:%u", uport->line, uart_type(uport), mmio ? "mmio:0x" : "port:", mmio ? (unsigned long long)uport->mapbase @@ -2050,18 +2050,18 @@ static void uart_line_info(struct seq_file *m, struct uart_state *state) if (pm_state != UART_PM_STATE_ON) uart_change_pm(state, pm_state);
- seq_printf(m, " tx:%d rx:%d", + seq_printf(m, " tx:%u rx:%u", uport->icount.tx, uport->icount.rx); if (uport->icount.frame) - seq_printf(m, " fe:%d", uport->icount.frame); + seq_printf(m, " fe:%u", uport->icount.frame); if (uport->icount.parity) - seq_printf(m, " pe:%d", uport->icount.parity); + seq_printf(m, " pe:%u", uport->icount.parity); if (uport->icount.brk) - seq_printf(m, " brk:%d", uport->icount.brk); + seq_printf(m, " brk:%u", uport->icount.brk); if (uport->icount.overrun) - seq_printf(m, " oe:%d", uport->icount.overrun); + seq_printf(m, " oe:%u", uport->icount.overrun); if (uport->icount.buf_overrun) - seq_printf(m, " bo:%d", uport->icount.buf_overrun); + seq_printf(m, " bo:%u", uport->icount.buf_overrun);
#define INFOBIT(bit, str) \ if (uport->mctrl & (bit)) \ @@ -2553,7 +2553,7 @@ uart_report_port(struct uart_driver *drv, struct uart_port *port) break; }
- pr_info("%s%s%s at %s (irq = %d, base_baud = %d) is a %s\n", + pr_info("%s%s%s at %s (irq = %u, base_baud = %u) is a %s\n", port->dev ? dev_name(port->dev) : "", port->dev ? ": " : "", port->name, @@ -2561,7 +2561,7 @@ uart_report_port(struct uart_driver *drv, struct uart_port *port)
/* The magic multiplier feature is a bit obscure, so report it too. */ if (port->flags & UPF_MAGIC_MULTIPLIER) - pr_info("%s%s%s extra baud rates supported: %d, %d", + pr_info("%s%s%s extra baud rates supported: %u, %u", port->dev ? dev_name(port->dev) : "", port->dev ? ": " : "", port->name, @@ -2960,7 +2960,7 @@ static ssize_t close_delay_show(struct device *dev, struct tty_port *port = dev_get_drvdata(dev);
uart_get_info(port, &tmp); - return sprintf(buf, "%d\n", tmp.close_delay); + return sprintf(buf, "%u\n", tmp.close_delay); }
static ssize_t closing_wait_show(struct device *dev, @@ -2970,7 +2970,7 @@ static ssize_t closing_wait_show(struct device *dev, struct tty_port *port = dev_get_drvdata(dev);
uart_get_info(port, &tmp); - return sprintf(buf, "%d\n", tmp.closing_wait); + return sprintf(buf, "%u\n", tmp.closing_wait); }
static ssize_t custom_divisor_show(struct device *dev, @@ -2990,7 +2990,7 @@ static ssize_t io_type_show(struct device *dev, struct tty_port *port = dev_get_drvdata(dev);
uart_get_info(port, &tmp); - return sprintf(buf, "%d\n", tmp.io_type); + return sprintf(buf, "%u\n", tmp.io_type); }
static ssize_t iomem_base_show(struct device *dev, @@ -3010,7 +3010,7 @@ static ssize_t iomem_reg_shift_show(struct device *dev, struct tty_port *port = dev_get_drvdata(dev);
uart_get_info(port, &tmp); - return sprintf(buf, "%d\n", tmp.iomem_reg_shift); + return sprintf(buf, "%u\n", tmp.iomem_reg_shift); }
static ssize_t console_show(struct device *dev, @@ -3146,7 +3146,7 @@ static int serial_core_add_one_port(struct uart_driver *drv, struct uart_port *u state->pm_state = UART_PM_STATE_UNDEFINED; uart_port_set_cons(uport, drv->cons); uport->minor = drv->tty_driver->minor_start + uport->line; - uport->name = kasprintf(GFP_KERNEL, "%s%d", drv->dev_name, + uport->name = kasprintf(GFP_KERNEL, "%s%u", drv->dev_name, drv->tty_driver->name_base + uport->line); if (!uport->name) return -ENOMEM; @@ -3185,7 +3185,7 @@ static int serial_core_add_one_port(struct uart_driver *drv, struct uart_port *u device_set_wakeup_capable(tty_dev, 1); } else { uport->flags |= UPF_DEAD; - dev_err(uport->dev, "Cannot register tty device on line %d\n", + dev_err(uport->dev, "Cannot register tty device on line %u\n", uport->line); }
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
[ Upstream commit 2d91cb261cac6d885954b8f5da28b5c176c18131 ]
snd_soc_remove_pcm_runtime() might be called with rtd == NULL, which leads to a NULL pointer dereference. This was reproduced with topology loading and marking a link as ignored due to a missing hardware component on the system. On module removal, soc_tplg_remove_link() would call snd_soc_remove_pcm_runtime() with rtd == NULL because the link was ignored and no runtime was created.

Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com
Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com
Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com
Reviewed-by: Liam Girdwood liam.r.girdwood@intel.com
Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com
Link: https://patch.msgid.link/20250619084222.559-3-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown broonie@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees. The analysis reveals this is a critical NULL pointer dereference fix that prevents kernel crashes.
**Key reasons for backporting:**
1. **Fixes a Real Crash**: The commit adds a NULL check to prevent a kernel panic that occurs when `snd_soc_remove_pcm_runtime()` is called with `rtd == NULL`. The code change shows this clearly:
   ```c
   + if (!rtd)
   +         return;
   ```
   Without this check, the subsequent line `snd_soc_card_remove_dai_link(card, rtd->dai_link)` would dereference NULL.
2. **Minimal and Safe Fix**: The change is extremely small (3 lines) and defensive - it simply adds a NULL check with early return. This follows the stable kernel rule of minimal, contained fixes.
3. **Affects Production Systems**: The commit message indicates this occurs during topology loading when hardware components are missing - a real-world scenario. The issue manifests on module removal when `soc_tplg_remove_link()` calls the function with NULL.
4. **Well-Reviewed**: The commit has 5 Reviewed-by tags from Intel ASoC maintainers, indicating thorough review and consensus on the fix.
5. **No Side Effects**: The fix has no architectural changes or new features - it purely adds defensive programming to prevent crashes. The function already had EXPORT_SYMBOL_GPL, indicating it's part of the kernel API that could be called from various contexts.
This is exactly the type of commit stable trees want: a small, obvious fix for a real crash scenario with no risk of regression.
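The guard pattern can be shown with a stripped-down userspace model. The types `struct dai_link` and `struct pcm_runtime` and the function `remove_pcm_runtime()` are hypothetical simplifications; unlike the real function, which returns `void`, this version returns a status purely so the behaviour can be checked.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the ASoC structures. */
struct dai_link {
    int removed;
};

struct pcm_runtime {
    struct dai_link *dai_link;
};

/*
 * Tear down a runtime.
 * Returns 0 when there is nothing to do (rtd is NULL, e.g. the link was
 * ignored and no runtime was ever created), 1 after a real removal.
 */
static int remove_pcm_runtime(struct pcm_runtime *rtd)
{
    if (!rtd)
        return 0;  /* early return: rtd->dai_link must not be touched */

    rtd->dai_link->removed = 1;
    return 1;
}
```

The early return keeps the caller simple: a removal path can pass whatever it looked up, including NULL, without having to know whether a runtime was created.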
sound/soc/soc-core.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c index 67bebc339148..16bbc074dc5f 100644 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -1139,6 +1139,9 @@ static int snd_soc_compensate_channel_connection_map(struct snd_soc_card *card, void snd_soc_remove_pcm_runtime(struct snd_soc_card *card, struct snd_soc_pcm_runtime *rtd) { + if (!rtd) + return; + lockdep_assert_held(&client_mutex);
/*
From: Tomasz Michalec tmichalec@google.com
[ Upstream commit df9a825f330e76c72d1985bc9bdc4b8981e3d15f ]
If pmc_usb_probe is called before SCU IPC is registered, pmc_usb_probe will fail.
Return -EPROBE_DEFER when pmc_usb_probe doesn't get SCU IPC device, so the probe function can be called again after SCU IPC is initialized.
Signed-off-by: Tomasz Michalec tmichalec@google.com
Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com
Link: https://lore.kernel.org/r/20250610154058.1859812-1-tmichalec@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and the kernel codebase, here's my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Clear Bug Fix**: The commit fixes a real probe ordering issue where `pmc_usb_probe` fails if called before the SCU IPC driver is initialized. The current code returns `-ENODEV` which prevents the driver from ever loading, while `-EPROBE_DEFER` allows the kernel to retry probing later.
2. **Established Pattern**: Looking at the codebase, other Intel platform drivers that depend on SCU IPC already use this pattern:
   - `drivers/watchdog/intel-mid_wdt.c`: Returns `-EPROBE_DEFER` when `devm_intel_scu_ipc_dev_get()` returns NULL
   - `drivers/mfd/intel_soc_pmic_bxtwc.c`: Returns `-EPROBE_DEFER` in the same scenario
   - `drivers/platform/x86/intel/telemetry/pltdrv.c`: Also uses `-EPROBE_DEFER`
3. **Small and Contained Change**: The fix is a one-line change (`-ENODEV` → `-EPROBE_DEFER`) that only affects the error handling path during probe. It doesn't introduce new functionality or change any existing behavior when SCU IPC is available.
4. **No Architecture Changes**: This is purely a bug fix that corrects incorrect error handling. It doesn't introduce new features or make architectural changes to the driver.
5. **Minimal Risk**: The change is extremely low risk - it only affects the error path when SCU IPC isn't yet available, and the deferred probe mechanism is a well-established kernel pattern designed specifically for handling driver dependencies.
6. **Real-World Impact**: Without this fix, users could experience USB Type-C functionality failures on Intel platforms if the drivers happen to probe in the wrong order. This is particularly problematic on systems where driver probe order is non-deterministic.
The commit follows the stable tree rules by fixing an important bug (probe failure due to ordering) with minimal risk and no new features. The fact that other Intel drivers already use this pattern confirms this is the correct approach for handling SCU IPC dependencies.
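Why the choice of error code matters can be sketched with a toy model of deferred probing. Everything below is hypothetical userspace code: `ipc_dev_get()`, `mux_probe()` and `probe_with_retries()` are illustrative names, and the retry loop is a crude stand-in for the driver core's deferred-probe list. `EPROBE_DEFER` is defined locally with the kernel-internal value 517 because it is not part of userspace `<errno.h>`.

```c
#include <stdbool.h>
#include <stddef.h>

#define EPROBE_DEFER 517  /* kernel-internal errno value, defined here for the sketch */

/* Hypothetical provider device that may not be registered yet. */
struct ipc_dev {
    int id;
};

static struct ipc_dev scu_ipc = { .id = 1 };
static bool scu_ipc_registered;

/* Return the provider, or NULL while it is not registered. */
static struct ipc_dev *ipc_dev_get(void)
{
    return scu_ipc_registered ? &scu_ipc : NULL;
}

/* Consumer probe: ask to be retried later when the provider is missing. */
static int mux_probe(void)
{
    if (!ipc_dev_get())
        return -EPROBE_DEFER;
    return 0;
}

/*
 * Toy deferred-probe loop: retry while the probe asks for deferral.
 * The provider registers just before attempt number `ready_at`.
 * Returns the attempt on which the probe succeeded, or -EPROBE_DEFER
 * if it was still deferring after `max_tries` attempts.
 */
static int probe_with_retries(int max_tries, int ready_at)
{
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        if (attempt == ready_at)
            scu_ipc_registered = true;
        if (mux_probe() != -EPROBE_DEFER)
            return attempt;
    }
    return -EPROBE_DEFER;
}
```

A probe that returned a hard error such as `-ENODEV` on the first attempt would never reach the later attempts in this model, which is the ordering failure the commit removes.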
drivers/usb/typec/mux/intel_pmc_mux.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/typec/mux/intel_pmc_mux.c b/drivers/usb/typec/mux/intel_pmc_mux.c index 65dda9183e6f..1698428654ab 100644 --- a/drivers/usb/typec/mux/intel_pmc_mux.c +++ b/drivers/usb/typec/mux/intel_pmc_mux.c @@ -754,7 +754,7 @@ static int pmc_usb_probe(struct platform_device *pdev)
pmc->ipc = devm_intel_scu_ipc_dev_get(&pdev->dev); if (!pmc->ipc) - return -ENODEV; + return -EPROBE_DEFER;
pmc->dev = &pdev->dev;
From: Oliver Neukum oneukum@suse.com
[ Upstream commit 503bbde34cc3dd2acd231f277ba70c3f9ed22e59 ]
Checking for the endpoint type is no reason for a WARN, as that can cause a reboot. A driver not checking the endpoint type must not cause a reboot, as there is just no point in this. We cannot prevent a device from doing something incorrect as a reaction to a transfer. Hence warning for a mere assumption being wrong is not sensible.
Signed-off-by: Oliver Neukum oneukum@suse.com
Acked-by: Alan Stern stern@rowland.harvard.edu
Link: https://lore.kernel.org/r/20250612122149.2559724-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of this commit, here is my determination:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real issue that affects users**: The commit addresses a problem where a `dev_WARN()` can cause system reboots when `panic_on_warn` is enabled. This is a serious issue because a driver bug (not checking endpoint types properly) should not be able to crash the entire system.
2. **Small and contained fix**: The change is minimal - it only replaces `dev_WARN()` with `dev_warn_once()` on line 503 of drivers/usb/core/urb.c. This is a one-line change that doesn't affect any other functionality.
3. **Clear side effects**: The only behavioral change is that:
   - The warning will no longer trigger a kernel panic when `panic_on_warn` is set
   - The warning will only be printed once instead of potentially multiple times
   - No backtrace will be generated
4. **No architectural changes**: This is a simple logging level change that doesn't modify any USB subsystem architecture or functionality.
5. **Affects critical kernel subsystem**: While USB is a critical subsystem, this change actually makes it more stable by preventing potential system crashes.
6. **Follows stable tree rules**: This is clearly a bugfix that improves system stability. The commit message explicitly states that "A driver not checking the endpoint type must not cause a reboot" - this is a stability improvement that prevents denial-of-service scenarios.
7. **Similar fixes in the kernel**: There's precedent for this type of fix, as shown by commit 281cb9d65a95 ("bnxt_en: Make PTP timestamp HWRM more silent") which made a similar conversion from `netdev_WARN_ONCE()` to `netdev_warn_once()` for the same reason.
The key insight from the code is that `dev_WARN()` calls `WARN()` which can trigger a kernel panic if `panic_on_warn` is set. This means a malicious or buggy USB device could potentially crash the system just by triggering this warning. Converting to `dev_warn_once()` maintains the diagnostic value while removing the crash risk.
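The difference between the two warning styles can be modelled in a few lines of userspace C. This is a simplified sketch under stated assumptions: `loud_warn()` and `quiet_warn_once()` are hypothetical functions, a counter stands in for an actual panic, and the once-flag here is per function, whereas the kernel's `*_once` helpers keep one flag per call site.

```c
#include <stdbool.h>
#include <stdio.h>

static bool panic_on_warn_enabled;  /* models the panic_on_warn setting */
static int panics;                  /* how often a real kernel would have panicked */
static int messages;                /* how many lines were logged */

/* WARN()-style reporting: logs every time and escalates under panic_on_warn. */
static void loud_warn(const char *msg)
{
    fprintf(stderr, "WARNING: %s\n", msg);
    messages++;
    if (panic_on_warn_enabled)
        panics++;  /* stand-in for the panic path */
}

/* warn_once-style reporting: logs the first occurrence only, never escalates. */
static void quiet_warn_once(const char *msg)
{
    static bool warned;

    if (warned)
        return;
    warned = true;
    fprintf(stderr, "warning: %s\n", msg);
    messages++;
}
```

With `panic_on_warn_enabled` set, every `loud_warn()` call counts as a panic, while repeated `quiet_warn_once()` calls leave a single log line and no escalation.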
 drivers/usb/core/urb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/core/urb.c b/drivers/usb/core/urb.c
index 5e52a35486af..120de3c499d2 100644
--- a/drivers/usb/core/urb.c
+++ b/drivers/usb/core/urb.c
@@ -500,7 +500,7 @@ int usb_submit_urb(struct urb *urb, gfp_t mem_flags)

 	/* Check that the pipe's type matches the endpoint's type */
 	if (usb_pipe_type_check(urb->dev, urb->pipe))
-		dev_WARN(&dev->dev, "BOGUS urb xfer, pipe %x != type %x\n",
+		dev_warn_once(&dev->dev, "BOGUS urb xfer, pipe %x != type %x\n",
 			usb_pipetype(urb->pipe), pipetypes[xfertype]);

 	/* Check against a simple/standard policy */
From: Yongbo Zhang giraffesnn123@gmail.com
[ Upstream commit 1c2d81bded1993bb2c7125a911db63612cdc8d40 ]
When the gpio irqchip is connected to a slow bus (e.g., i2c bus or virtio bus), calling disable_irq_nosync() in the top-half ISR handler will trigger the following kernel BUG:
  BUG: scheduling while atomic: RenderEngine/253/0x00010002
  ...
  Call trace:
   dump_backtrace+0x0/0x1c8
   show_stack+0x1c/0x2c
   dump_stack_lvl+0xdc/0x12c
   dump_stack+0x1c/0x64
   __schedule_bug+0x64/0x80
   schedule_debug+0x98/0x118
   __schedule+0x68/0x704
   schedule+0xa0/0xe8
   schedule_timeout+0x38/0x124
   wait_for_common+0xa4/0x134
   wait_for_completion+0x1c/0x2c
   _virtio_gpio_req+0xf8/0x198
   virtio_gpio_irq_bus_sync_unlock+0x94/0xf0
   __irq_put_desc_unlock+0x50/0x54
   disable_irq_nosync+0x64/0x94
   fusb302_irq_intn+0x24/0x84
   __handle_irq_event_percpu+0x84/0x278
   handle_irq_event+0x64/0x14c
   handle_level_irq+0x134/0x1d4
   generic_handle_domain_irq+0x40/0x68
   virtio_gpio_event_vq+0xb0/0x130
   vring_interrupt+0x7c/0x90
   vm_interrupt+0x88/0xd8
   __handle_irq_event_percpu+0x84/0x278
   handle_irq_event+0x64/0x14c
   handle_fasteoi_irq+0x110/0x210
   __handle_domain_irq+0x80/0xd0
   gic_handle_irq+0x78/0x154
   el0_irq_naked+0x60/0x6c
This patch replaces request_irq() with devm_request_threaded_irq() to avoid the use of disable_irq_nosync().
Signed-off-by: Yongbo Zhang giraffesnn123@gmail.com
Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com
Link: https://lore.kernel.org/r/20250526043433.673097-1-giraffesnn123@gmail.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Bug Analysis
The commit fixes a critical "scheduling while atomic" bug that occurs when the fusb302 USB Type-C controller uses GPIO interrupts connected through slow buses (like I2C or virtio). The bug manifests as a kernel BUG whose call trace (quoted in the commit message) shows `__schedule_bug` being triggered from within an atomic context.
## Technical Details of the Bug
1. **Root Cause**: The fusb302 driver's interrupt handler (`fusb302_irq_intn`) calls `disable_irq_nosync()` from the top-half ISR context (lines 1480-1481 in the original code). When the GPIO controller is connected via a slow bus like virtio-gpio, the `disable_irq_nosync()` operation requires bus transactions that can sleep, which is forbidden in atomic/interrupt context.
2. **The Problem Flow**:
- Hardware interrupt occurs → `fusb302_irq_intn` ISR runs in atomic context
- ISR calls `disable_irq_nosync(chip->gpio_int_n_irq)`
- virtio-gpio's `virtio_gpio_irq_bus_sync_unlock()` needs to communicate over virtio bus
- This requires `wait_for_completion()` which can sleep
- Sleeping in atomic context triggers the kernel BUG
## The Fix
The fix replaces `request_irq()` with `devm_request_threaded_irq()` and removes the problematic `disable_irq_nosync()`/`enable_irq()` pattern:
1. **Before**: Used regular IRQ handler that disabled the interrupt in top-half, scheduled work, then re-enabled in bottom-half
2. **After**: Uses threaded IRQ handler which naturally handles the interrupt masking/unmasking without explicit calls
Key changes:
- Line 1751-1754: Changed from `request_irq()` to `devm_request_threaded_irq()` with NULL top-half handler
- Line 1481: Removed `disable_irq_nosync()` call from ISR
- Line 1625: Removed corresponding `enable_irq()` call from work function
- Line 1750: Removed `free_irq()` call (handled by devm_*)
## Stable Backport Criteria Evaluation
1. **Fixes a bug affecting users**: YES - This fixes a kernel BUG that causes system instability
2. **Relatively small and contained**: YES - Only 12 lines changed (4 insertions, 8 deletions), confined to one driver file
3. **Unexpected side effects**: NO - The fix is straightforward and only touches IRQ handling in this driver
4. **Major architectural changes**: NO - Simply changes IRQ registration method
5. **Critical kernel subsystems**: NO - Limited to USB Type-C controller driver
6. **Stable tree mention**: NO - But follows stable tree rules perfectly
7. **Follows stable rules**: YES - Important bugfix with minimal risk
## Additional Considerations
- The bug only manifests with specific hardware configurations (virtio-gpio or slow I2C GPIO expanders)
- The fix is a standard pattern used throughout the kernel for similar issues
- Using `devm_request_threaded_irq()` is actually cleaner and provides automatic cleanup
- No functional changes to the driver's behavior, only fixes the atomic scheduling violation
- Similar fixes have been applied to other drivers facing the same issue pattern
This is an ideal stable backport candidate: it fixes a real bug that causes kernel panics, the fix is minimal and well-understood, and there's virtually no risk of regression.
 drivers/usb/typec/tcpm/fusb302.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/typec/tcpm/fusb302.c b/drivers/usb/typec/tcpm/fusb302.c
index f15c63d3a8f4..f2801279c4b5 100644
--- a/drivers/usb/typec/tcpm/fusb302.c
+++ b/drivers/usb/typec/tcpm/fusb302.c
@@ -1477,9 +1477,6 @@ static irqreturn_t fusb302_irq_intn(int irq, void *dev_id)
 	struct fusb302_chip *chip = dev_id;
 	unsigned long flags;

-	/* Disable our level triggered IRQ until our irq_work has cleared it */
-	disable_irq_nosync(chip->gpio_int_n_irq);
-
 	spin_lock_irqsave(&chip->irq_lock, flags);
 	if (chip->irq_suspended)
 		chip->irq_while_suspended = true;
@@ -1622,7 +1619,6 @@ static void fusb302_irq_work(struct work_struct *work)
 	}
 done:
 	mutex_unlock(&chip->lock);
-	enable_irq(chip->gpio_int_n_irq);
 }

 static int init_gpio(struct fusb302_chip *chip)
@@ -1747,9 +1743,10 @@ static int fusb302_probe(struct i2c_client *client)
 		goto destroy_workqueue;
 	}

-	ret = request_irq(chip->gpio_int_n_irq, fusb302_irq_intn,
-			  IRQF_ONESHOT | IRQF_TRIGGER_LOW,
-			  "fsc_interrupt_int_n", chip);
+	ret = devm_request_threaded_irq(dev, chip->gpio_int_n_irq,
+					NULL, fusb302_irq_intn,
+					IRQF_ONESHOT | IRQF_TRIGGER_LOW,
+					"fsc_interrupt_int_n", chip);
 	if (ret < 0) {
 		dev_err(dev, "cannot request IRQ for GPIO Int_N, ret=%d", ret);
 		goto tcpm_unregister_port;
@@ -1774,7 +1771,6 @@ static void fusb302_remove(struct i2c_client *client)
 	struct fusb302_chip *chip = i2c_get_clientdata(client);

 	disable_irq_wake(chip->gpio_int_n_irq);
-	free_irq(chip->gpio_int_n_irq, chip);
 	cancel_work_sync(&chip->irq_work);
 	cancel_delayed_work_sync(&chip->bc_lvl_handler);
 	tcpm_unregister_port(chip->tcpm_port);
Hello Sasha,
On Sun, Aug 03, 2025 at 08:23:22PM -0400, Sasha Levin wrote:
From: Yongbo Zhang giraffesnn123@gmail.com
[ Upstream commit 1c2d81bded1993bb2c7125a911db63612cdc8d40 ]
I suggest waiting a bit with backporting this commit until the discussion I just started has been resolved (this also applies to all the other stable kernel releases you included this patch for):
https://lore.kernel.org/linux-usb/m7n22g5fsfvpjz4s5d6zfcfddrzrj3ixgaqehrjkg7...
Greetings,
-- Sebastian
From: Radhey Shyam Pandey radhey.shyam.pandey@amd.com
[ Upstream commit 70627bf82e36e61c40c3315e1206e4ea4c02e668 ]
Add a shutdown callback to ensure that the XHCI stack is properly shut down in the reboot/shutdown path.
In the kexec flow, kernel_restart_prepare() performs the actions necessary to prepare the system for a restart and invokes device_shutdown(). To ensure a proper shutdown, attach the dwc3 shutdown implementation, which mirrors the remove method.
$ kexec -e

<snip>
xhci-hcd xhci-hcd.0.auto: remove, state 1
usb usb1: USB disconnect, device number 1
usb 1-1: USB disconnect, device number 6
xhci-hcd xhci-hcd.0.auto: USB bus 1 deregistered
kexec_core: Starting new kernel
Signed-off-by: Radhey Shyam Pandey radhey.shyam.pandey@amd.com
Acked-by: Thinh Nguyen Thinh.Nguyen@synopsys.com
Link: https://lore.kernel.org/r/1748977771-714153-1-git-send-email-radhey.shyam.pa...
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my detailed analysis:
## Analysis of the Commit
### 1. **Bug Fix Nature**
This commit fixes a real bug where the USB XHCI stack is not properly shut down during kexec operations. The commit message explicitly shows the issue - when `kexec -e` is executed, the XHCI controller is removed but without a proper shutdown sequence, which can lead to:
- Memory corruption if the controller continues DMA operations
- System instability during kexec transitions
- Potential crashes after kexec
### 2. **Simple and Contained Fix**
The code change is minimal and low-risk:
```diff
+ .shutdown = dwc3_xlnx_remove,
```
It simply reuses the existing `dwc3_xlnx_remove` function as the shutdown callback. This is a common pattern in kernel drivers.
### 3. **Follows Established Pattern**
My analysis shows this follows an established pattern in the dwc3 driver family:
- `dwc3-of-simple.c` uses its remove function for shutdown
- `dwc3-rtk.c` has a dedicated shutdown function that calls similar cleanup
- The recent Tegra XUSB commit (3a3be3a1ebf0) added a shutdown callback for similar reasons (preventing SMMU-related crashes)
### 4. **Clear Side Effects**
The `dwc3_xlnx_remove` function performs standard cleanup:
- Depopulates child devices (`of_platform_depopulate`)
- Disables and unprepares clocks
- Manages runtime PM state
These are all safe operations for shutdown context.
### 5. **Meets Stable Criteria**
- **Fixes a real bug**: Prevents potential system instability during kexec
- **Small change**: Single line addition
- **Low risk**: Reuses existing, tested code path
- **No new features**: Pure bug fix
- **Clear benefit**: Improves system reliability during reboot/kexec operations
### 6. **Similar Issues in Other Drivers**
The pattern of missing shutdown callbacks causing issues during kexec/reboot is well-documented in the kernel, particularly for devices that perform DMA operations.
The fix ensures proper hardware shutdown sequence during system reboot/shutdown, preventing the USB controller from accessing memory after the system has started transitioning to a new kernel or shutting down. This is especially important for systems using kexec for fast reboots or crash dump collection.
 drivers/usb/dwc3/dwc3-xilinx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/dwc3/dwc3-xilinx.c b/drivers/usb/dwc3/dwc3-xilinx.c
index 4ca7f6240d07..09c3c5c226ab 100644
--- a/drivers/usb/dwc3/dwc3-xilinx.c
+++ b/drivers/usb/dwc3/dwc3-xilinx.c
@@ -422,6 +422,7 @@ static const struct dev_pm_ops dwc3_xlnx_dev_pm_ops = {
 static struct platform_driver dwc3_xlnx_driver = {
 	.probe		= dwc3_xlnx_probe,
 	.remove		= dwc3_xlnx_remove,
+	.shutdown	= dwc3_xlnx_remove,
 	.driver		= {
 		.name		= "dwc3-xilinx",
 		.of_match_table	= dwc3_xlnx_of_match,
From: "Gautham R. Shenoy" gautham.shenoy@amd.com
[ Upstream commit cda7ac8ce7de84cf32a3871ba5f318aa3b79381e ]
In the function mperf_start(), mperf_monitor snapshots the time, tsc and finally the aperf,mperf MSRs. However, this order of snapshotting is reversed in mperf_stop(). As a result, the C0 residency (which is computed as delta_mperf * 100 / delta_tsc) is under-reported on CPUs that are 100% busy.
Fix this by snapshotting time, tsc and then aperf,mperf in mperf_stop() in the same order as in mperf_start().
Link: https://lore.kernel.org/r/20250612122355.19629-2-gautham.shenoy@amd.com
Signed-off-by: Gautham R. Shenoy gautham.shenoy@amd.com
Signed-off-by: Shuah Khan skhan@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Clear bug fix with user impact**: The commit fixes a calculation bug in the cpupower tool where C0 residency (processor active time) is under-reported on 100% busy CPUs. This directly affects users monitoring CPU performance and power states.
2. **Simple and contained fix**: The change is minimal - it only reorders three function calls in `mperf_stop()` to match the order in `mperf_start()`:
- Before: `mperf_measure_stats()` → `mperf_get_tsc()` → `clock_gettime()`
- After: `clock_gettime()` → `mperf_get_tsc()` → `mperf_measure_stats()`
3. **Low risk of regression**: The fix simply ensures consistent snapshot ordering between start and stop measurements. This is a logical correction that aligns the measurement methodology without introducing new functionality.
4. **Clear root cause**: The bug occurs because the reversed snapshot order causes timing skew in the C0 residency calculation (`delta_mperf * 100 / delta_tsc`). When snapshots are taken in different orders at start vs stop, the time windows don't align properly, leading to incorrect calculations.
5. **Affects monitoring tool accuracy**: While this is a userspace tool bug rather than a kernel bug, cpupower is an important diagnostic tool shipped with the kernel. Incorrect C0 residency reporting could mislead system administrators and developers analyzing CPU performance.
6. **No architectural changes**: The fix doesn't change any data structures, APIs, or fundamental behavior - it simply corrects the measurement sequence.
The commit meets stable tree criteria as it fixes a real bug affecting users, is minimal in scope, and has very low risk of introducing new issues.
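The effect of the snapshot order on `delta_mperf * 100 / delta_tsc` can be checked with a small userspace model. This is only a sketch under simplifying assumptions: on a 100% busy CPU both counters are taken to advance one count per tick, so a counter's value equals the tick at which it is read, and `gap` is a hypothetical delay between the two reads. The struct and function names are invented for illustration and are not part of cpupower.

```c
#include <stdint.h>

/* Hypothetical model: each counter's value is the tick at which it was read. */
struct snapshot {
	uint64_t tsc;
	uint64_t mperf;
};

/* Read TSC first, then MPERF `gap` ticks later (the mperf_start() order). */
static struct snapshot snap_tsc_then_mperf(uint64_t now, uint64_t gap)
{
	struct snapshot s = { .tsc = now, .mperf = now + gap };
	return s;
}

/* Read MPERF first, then TSC `gap` ticks later (the old mperf_stop() order). */
static struct snapshot snap_mperf_then_tsc(uint64_t now, uint64_t gap)
{
	struct snapshot s = { .tsc = now + gap, .mperf = now };
	return s;
}

/* C0 residency in hundredths of a percent: delta_mperf * 100 / delta_tsc. */
static uint64_t c0_centi_percent(struct snapshot start, struct snapshot end)
{
	uint64_t delta_mperf = end.mperf - start.mperf;
	uint64_t delta_tsc = end.tsc - start.tsc;

	return delta_mperf * 10000 / delta_tsc;
}
```

With a start snapshot at tick 10 and an end snapshot at tick 1000, both with a gap of 2, the reversed stop order gives `delta_mperf = 988` and `delta_tsc = 992`, so the result falls just short of 100%. Using the same order at both ends gives 990 for both deltas and exactly 100%.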
 tools/power/cpupower/utils/idle_monitor/mperf_monitor.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/power/cpupower/utils/idle_monitor/mperf_monitor.c b/tools/power/cpupower/utils/idle_monitor/mperf_monitor.c
index 73b6b10cbdd2..5ae02c3d5b64 100644
--- a/tools/power/cpupower/utils/idle_monitor/mperf_monitor.c
+++ b/tools/power/cpupower/utils/idle_monitor/mperf_monitor.c
@@ -240,9 +240,9 @@ static int mperf_stop(void)
 	int cpu;

 	for (cpu = 0; cpu < cpu_count; cpu++) {
-		mperf_measure_stats(cpu);
-		mperf_get_tsc(&tsc_at_measure_end[cpu]);
 		clock_gettime(CLOCK_REALTIME, &time_end[cpu]);
+		mperf_get_tsc(&tsc_at_measure_end[cpu]);
+		mperf_measure_stats(cpu);
 	}

 	return 0;
From: Krzysztof Hałasa khalasa@piap.pl
[ Upstream commit c01fba0b4869cada5403fffff416cd1675dba078 ]
Apparently, ISI needs cache settings similar to LCDIF. Otherwise we get artefacts in the image. Tested on i.MX8MP.
Signed-off-by: Krzysztof Hałasa khalasa@piap.pl
Link: https://lore.kernel.org/r/m3ldr69lsw.fsf@t19.piap.pl
Signed-off-by: Ulf Hansson ulf.hansson@linaro.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and the surrounding code:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real user-visible bug**: The commit fixes image artifacts in the ISI (Image Sensor Interface) on i.MX8MP. This is a functional bug that affects users of camera/image capture functionality.
2. **Small and contained change**: The fix adds only 10 lines of code that set ISI panic write hurry levels in the `imx8mp_media_power_notifier` function. It's a minimal change confined to the i.MX8MP media block controller.
3. **Follows established pattern**: The fix mirrors the existing LCDIF panic read hurry level fix (commit 06a9a229b159) that was already applied for display FIFO underflow issues. The ISI needs similar cache settings to prevent artifacts.
4. **Hardware-specific fix**: The change only affects i.MX8MP hardware and is guarded by the platform-specific power notifier function, minimizing risk to other platforms.
5. **Clear problem and solution**: The commit message clearly states the problem (image artifacts) and the solution (setting ISI panic write hurry levels similar to LCDIF), making it a straightforward hardware configuration fix.
6. **No architectural changes**: This is purely a hardware register configuration change during power-on sequences, not introducing new features or changing kernel architecture.
The fix addresses a hardware-specific issue where the ISI (Image Sensor Interface) needs proper cache/priority settings to avoid image artifacts, similar to how the LCDIF (display interface) needs such settings to avoid display FIFO underflow. This is an important fix for anyone using camera functionality on i.MX8MP platforms.
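To see which register bits the new `regmap_set_bits()` call touches, the mask arithmetic can be reproduced in userspace. The sketch below uses simplified 32-bit stand-ins for the kernel's `GENMASK()` and `FIELD_PREP()`; the `SKETCH_*` macros and `isi_hurry_bits()` are hypothetical names introduced here, while the bit ranges are taken from the patch.

```c
#include <stdint.h>

/* Simplified 32-bit stand-ins for the kernel's GENMASK() and FIELD_PREP(). */
#define SKETCH_GENMASK(h, l) \
	((uint32_t)((~0u >> (31 - (h))) & (~0u << (l))))
/* Multiplying by the mask's lowest set bit shifts val into the field. */
#define SKETCH_FIELD_PREP(mask, val) \
	((uint32_t)(((uint32_t)(val) * ((mask) & (~(mask) + 1u))) & (mask)))

/* Bit ranges as defined in the patch. */
#define ISI_V_WR_HURRY	SKETCH_GENMASK(28, 26)
#define ISI_U_WR_HURRY	SKETCH_GENMASK(25, 23)
#define ISI_Y_WR_HURRY	SKETCH_GENMASK(22, 20)

/* Hypothetical helper: the bits set when all three fields get `level`. */
static uint32_t isi_hurry_bits(uint32_t level)
{
	return SKETCH_FIELD_PREP(ISI_V_WR_HURRY, level) |
	       SKETCH_FIELD_PREP(ISI_U_WR_HURRY, level) |
	       SKETCH_FIELD_PREP(ISI_Y_WR_HURRY, level);
}
```

With the level 7 used in the patch, the three 3-bit fields are fully set, so the call sets bits 20 through 28 of the register at offset 0x50, i.e. `0x1FF00000`.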
 drivers/pmdomain/imx/imx8m-blk-ctrl.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/pmdomain/imx/imx8m-blk-ctrl.c b/drivers/pmdomain/imx/imx8m-blk-ctrl.c
index 912802b5215b..5c83e5599f1e 100644
--- a/drivers/pmdomain/imx/imx8m-blk-ctrl.c
+++ b/drivers/pmdomain/imx/imx8m-blk-ctrl.c
@@ -665,6 +665,11 @@ static const struct imx8m_blk_ctrl_data imx8mn_disp_blk_ctl_dev_data = {
 #define LCDIF_1_RD_HURRY	GENMASK(15, 13)
 #define LCDIF_0_RD_HURRY	GENMASK(12, 10)

+#define ISI_CACHE_CTRL		0x50
+#define ISI_V_WR_HURRY		GENMASK(28, 26)
+#define ISI_U_WR_HURRY		GENMASK(25, 23)
+#define ISI_Y_WR_HURRY		GENMASK(22, 20)
+
 static int imx8mp_media_power_notifier(struct notifier_block *nb,
 				       unsigned long action, void *data)
 {
@@ -694,6 +699,11 @@ static int imx8mp_media_power_notifier(struct notifier_block *nb,
 		regmap_set_bits(bc->regmap, LCDIF_ARCACHE_CTRL,
 				FIELD_PREP(LCDIF_1_RD_HURRY, 7) |
 				FIELD_PREP(LCDIF_0_RD_HURRY, 7));
+		/* Same here for ISI */
+		regmap_set_bits(bc->regmap, ISI_CACHE_CTRL,
+				FIELD_PREP(ISI_V_WR_HURRY, 7) |
+				FIELD_PREP(ISI_U_WR_HURRY, 7) |
+				FIELD_PREP(ISI_Y_WR_HURRY, 7));
 	}

 	return NOTIFY_OK;
From: Bjorn Andersson bjorn.andersson@oss.qualcomm.com
[ Upstream commit 47e339cac89143709e84a3b71ba8bd9b2fdd2368 ]
Rather than relying/assuming that the tools generating the firmware places the program headers immediately following the ELF header, use e_phoff as intended to find the program headers.
Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@oss.qualcomm.com
Signed-off-by: Bjorn Andersson bjorn.andersson@oss.qualcomm.com
Link: https://lore.kernel.org/r/20250610-mdt-loader-validation-and-fixes-v2-3-f707...
Signed-off-by: Bjorn Andersson andersson@kernel.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and code changes, I can now provide my assessment:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Bug Fix**: The commit fixes a genuine bug where the code was incorrectly assuming that program headers immediately follow the ELF header instead of using the `e_phoff` (ELF header's program header offset) field. This is a clear violation of the ELF specification.
2. **Security Implications**: The bug could lead to incorrect parsing of firmware files, potentially causing:
- Out-of-bounds memory access if `e_phoff` points to a location beyond `sizeof(ehdr)`
- Loading incorrect data as program headers
- Firmware loading failures or crashes in Qualcomm devices
3. **Minimal Risk**: The fix is simple and contained - it changes 5 lines in the same pattern:
```c
- phdrs = (struct elf32_phdr *)(ehdr + 1);
+ phdrs = (struct elf32_phdr *)(fw->data + ehdr->e_phoff);
```
4. **Affects Multiple Functions**: The bug affects critical firmware loading functions:
- `qcom_mdt_get_size()` - calculates memory requirements
- `qcom_mdt_read_metadata()` - reads firmware metadata for authentication
- `qcom_mdt_pas_init()` - initializes firmware loading
- `qcom_mdt_bins_are_split()` - determines firmware structure
- `__qcom_mdt_load()` - performs actual firmware loading
5. **Real-World Impact**: This affects Qualcomm device firmware loading, which is used in many Android devices and embedded systems. Incorrect firmware loading could lead to device failures.
6. **Clear Fix**: The commit message clearly explains the issue and the fix follows ELF specification standards by properly using `e_phoff` instead of making assumptions about file layout.
7. **Stable Tree Criteria Met**:
- Fixes a real bug that affects users
- Small, self-contained change
- No architectural changes
- Low risk of regression
- Important for device stability
The fix ensures proper ELF parsing according to specification and prevents potential security issues from incorrect memory access during firmware loading on Qualcomm platforms.
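The difference between the two lookups can be demonstrated in userspace with glibc's `<elf.h>` types. This is a sketch only, not the mdt_loader code: all three helpers are hypothetical, the bounds checks are an addition for the example and are not part of this patch, and `memcpy()` is used instead of pointer casts to avoid alignment assumptions.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read program header `idx` from the table located via e_phoff. */
static int read_phdr_at_phoff(const uint8_t *data, size_t size,
			      unsigned int idx, Elf32_Phdr *out)
{
	Elf32_Ehdr ehdr;

	if (size < sizeof(ehdr))
		return -1;
	memcpy(&ehdr, data, sizeof(ehdr));

	/* Bounds checks: added for this sketch only. */
	if (idx >= ehdr.e_phnum || ehdr.e_phoff > size)
		return -1;
	if ((size - ehdr.e_phoff) / sizeof(*out) <= idx)
		return -1;

	memcpy(out, data + ehdr.e_phoff + (size_t)idx * sizeof(*out),
	       sizeof(*out));
	return 0;
}

/* Old assumption: the table starts right after the ELF header. */
static int read_phdr_assuming_adjacent(const uint8_t *data, size_t size,
				       unsigned int idx, Elf32_Phdr *out)
{
	size_t off = sizeof(Elf32_Ehdr) + (size_t)idx * sizeof(*out);

	if (off > size || size - off < sizeof(*out))
		return -1;
	memcpy(out, data + off, sizeof(*out));
	return 0;
}

/* Build a tiny image with one PT_LOAD header `pad` bytes after the ehdr. */
static size_t build_image(uint8_t *buf, size_t cap, size_t pad)
{
	Elf32_Ehdr ehdr;
	Elf32_Phdr phdr;
	size_t total = sizeof(ehdr) + pad + sizeof(phdr);

	if (total > cap)
		return 0;
	memset(buf, 0, total);
	memset(&ehdr, 0, sizeof(ehdr));
	memset(&phdr, 0, sizeof(phdr));
	ehdr.e_phoff = (Elf32_Off)(sizeof(ehdr) + pad);
	ehdr.e_phnum = 1;
	phdr.p_type = PT_LOAD;
	phdr.p_filesz = 0x1234;
	memcpy(buf, &ehdr, sizeof(ehdr));
	memcpy(buf + ehdr.e_phoff, &phdr, sizeof(phdr));
	return total;
}
```

When `pad` is 0 both readers return the same header. With any padding, only the `e_phoff`-based reader finds the `PT_LOAD` entry, while the adjacent-table reader returns whatever bytes happen to follow the ELF header.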
 drivers/soc/qcom/mdt_loader.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c
index b2c0fb55d4ae..44589d10b15b 100644
--- a/drivers/soc/qcom/mdt_loader.c
+++ b/drivers/soc/qcom/mdt_loader.c
@@ -83,7 +83,7 @@ ssize_t qcom_mdt_get_size(const struct firmware *fw)
 	int i;

 	ehdr = (struct elf32_hdr *)fw->data;
-	phdrs = (struct elf32_phdr *)(ehdr + 1);
+	phdrs = (struct elf32_phdr *)(fw->data + ehdr->e_phoff);

 	for (i = 0; i < ehdr->e_phnum; i++) {
 		phdr = &phdrs[i];
@@ -135,7 +135,7 @@ void *qcom_mdt_read_metadata(const struct firmware *fw, size_t *data_len,
 	void *data;

 	ehdr = (struct elf32_hdr *)fw->data;
-	phdrs = (struct elf32_phdr *)(ehdr + 1);
+	phdrs = (struct elf32_phdr *)(fw->data + ehdr->e_phoff);

 	if (ehdr->e_phnum < 2)
 		return ERR_PTR(-EINVAL);
@@ -215,7 +215,7 @@ int qcom_mdt_pas_init(struct device *dev, const struct firmware *fw,
 	int i;

 	ehdr = (struct elf32_hdr *)fw->data;
-	phdrs = (struct elf32_phdr *)(ehdr + 1);
+	phdrs = (struct elf32_phdr *)(fw->data + ehdr->e_phoff);

 	for (i = 0; i < ehdr->e_phnum; i++) {
 		phdr = &phdrs[i];
@@ -270,7 +270,7 @@ static bool qcom_mdt_bins_are_split(const struct firmware *fw, const char *fw_na
 	int i;

 	ehdr = (struct elf32_hdr *)fw->data;
-	phdrs = (struct elf32_phdr *)(ehdr + 1);
+	phdrs = (struct elf32_phdr *)(fw->data + ehdr->e_phoff);

 	for (i = 0; i < ehdr->e_phnum; i++) {
 		/*
@@ -312,7 +312,7 @@ static int __qcom_mdt_load(struct device *dev, const struct firmware *fw,

 	is_split = qcom_mdt_bins_are_split(fw, fw_name);
 	ehdr = (struct elf32_hdr *)fw->data;
-	phdrs = (struct elf32_phdr *)(ehdr + 1);
+	phdrs = (struct elf32_phdr *)(fw->data + ehdr->e_phoff);

 	for (i = 0; i < ehdr->e_phnum; i++) {
 		phdr = &phdrs[i];
From: Kees Cook kees@kernel.org
[ Upstream commit 6418a8504187dc7f5b6f9d0649c03e362cb0664b ]
When KCOV is enabled all functions get instrumented, unless the __no_sanitize_coverage attribute is used. To prepare for __no_sanitize_coverage being applied to __init functions[1], we have to handle differences in how GCC's inline optimizations get resolved. For thinkpad_acpi routines, this means forcing two functions to be inline with __always_inline.
Link: https://lore.kernel.org/lkml/20250523043935.2009972-11-kees@kernel.org/ [1]
Signed-off-by: Kees Cook kees@kernel.org
Link: https://lore.kernel.org/r/20250529181831.work.439-kees@kernel.org
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
## Detailed Analysis
### Nature of the Change
The commit modifies two functions in `drivers/platform/x86/lenovo/thinkpad_acpi.c`:
- `tpacpi_is_lenovo()`: Changed from `static inline` to `static __always_inline`
- `tpacpi_is_ibm()`: Changed from `static inline` to `static __always_inline`
Both functions are marked with `__init` and are simple one-line functions that check the vendor ID.
### Root Cause
This is a build fix addressing a subtle interaction between:
1. KCOV kernel coverage instrumentation
2. GCC's inline optimization heuristics
3. Linux kernel's __init section handling
When KCOV is enabled, all functions get instrumented unless marked with `__no_sanitize_coverage`. The commit message indicates this is preparation for applying `__no_sanitize_coverage` to `__init` functions. The problem occurs because:
1. With KCOV instrumentation, GCC may inline these functions into `__init` callers
2. Without KCOV instrumentation (when `__no_sanitize_coverage` is applied), GCC's heuristics change and it may decide NOT to inline them
3. This creates a section mismatch where `__init` code calls non-`__init` functions, causing build warnings/errors
### Why This Qualifies for Stable Backport
1. **Fixes a Real Bug**: This addresses legitimate build failures when `CONFIG_KCOV=y` is enabled, which affects:
- Kernel developers doing coverage testing
- CI/CD systems running kernel tests
- Distribution builders enabling KCOV for testing
2. **Minimal Risk**: The change is extremely conservative:
- Only changes inline hints from `inline` to `__always_inline`
- No functional changes whatsoever
- Affects only two simple getter functions
- Cannot introduce runtime regressions
3. **Small and Contained**: The patch touches only 2 lines in a single file, making it easy to review and backport
4. **Part of Broader Fix**: This is part of a kernel-wide effort to fix KCOV-related build issues, with similar fixes across multiple architectures and subsystems
5. **Build Infrastructure**: Stable kernels need to maintain buildability with various configurations, including KCOV-enabled builds for testing
### Specific Code Impact

Looking at the changed functions:

```c
-static inline bool __pure __init tpacpi_is_lenovo(void)
+static __always_inline bool __pure __init tpacpi_is_lenovo(void)
 {
 	return thinkpad_id.vendor == PCI_VENDOR_ID_LENOVO;
 }

-static inline bool __pure __init tpacpi_is_ibm(void)
+static __always_inline bool __pure __init tpacpi_is_ibm(void)
 {
 	return thinkpad_id.vendor == PCI_VENDOR_ID_IBM;
 }
```
These are trivial getter functions that absolutely should be inlined. Using `__always_inline` ensures consistent behavior regardless of KCOV configuration, preventing section mismatch warnings.
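The inlining pattern can be tried outside the kernel. What follows is a minimal user-space sketch, assuming GCC or Clang. `ALWAYS_INLINE`, `struct id_data`, `describe_vendor()`, and the `VENDOR_ID_*` names are stand-ins invented for illustration; the kernel's `__init` and `__pure` annotations are left out because they only have meaning in a kernel build.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's __always_inline (GCC/Clang attribute assumed). */
#define ALWAYS_INLINE inline __attribute__((__always_inline__))

/* Illustrative names; the driver uses PCI_VENDOR_ID_* from pci_ids.h. */
#define VENDOR_ID_IBM    0x1014
#define VENDOR_ID_LENOVO 0x17aa

struct id_data {
	unsigned int vendor;
};

static struct id_data thinkpad_id = { .vendor = VENDOR_ID_LENOVO };

/* Forced inline: no out-of-line copy is left behind for a caller to reference. */
static ALWAYS_INLINE bool is_lenovo(void)
{
	return thinkpad_id.vendor == VENDOR_ID_LENOVO;
}

static ALWAYS_INLINE bool is_ibm(void)
{
	return thinkpad_id.vendor == VENDOR_ID_IBM;
}

/* Hypothetical caller: returns 1 for Lenovo, 2 for IBM, 0 otherwise. */
int describe_vendor(void)
{
	if (is_lenovo())
		return 1;
	if (is_ibm())
		return 2;
	return 0;
}
```

With plain `inline` the compiler is free to emit a separate function body and call it; the attribute removes that freedom, which is the property the patch relies on.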
### Conclusion This is a textbook example of a stable-appropriate fix: it solves a real build problem, has zero functional impact, is minimal in scope, and has essentially no risk of causing regressions. It should be backported to stable kernels that support KCOV (4.6+) to maintain build compatibility with coverage testing configurations.
drivers/platform/x86/thinkpad_acpi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c index b59b4d90b0c7..afb83b3f4826 100644 --- a/drivers/platform/x86/thinkpad_acpi.c +++ b/drivers/platform/x86/thinkpad_acpi.c @@ -559,12 +559,12 @@ static unsigned long __init tpacpi_check_quirks( return 0; }
-static inline bool __pure __init tpacpi_is_lenovo(void) +static __always_inline bool __pure __init tpacpi_is_lenovo(void) { return thinkpad_id.vendor == PCI_VENDOR_ID_LENOVO; }
-static inline bool __pure __init tpacpi_is_ibm(void) +static __always_inline bool __pure __init tpacpi_is_ibm(void) { return thinkpad_id.vendor == PCI_VENDOR_ID_IBM; }
From: Tomasz Michalec tmichalec@google.com
[ Upstream commit 8866f4e557eba43e991f99711515217a95f62d2e ]
If cros_typec_probe is called before EC device is registered, cros_typec_probe will fail. It may happen when cros-ec-typec.ko is loaded before EC bus layer module (e.g. cros_ec_lpcs.ko, cros_ec_spi.ko).
Return -EPROBE_DEFER when cros_typec_probe doesn't get EC device, so the probe function can be called again after EC device is registered.
Signed-off-by: Tomasz Michalec tmichalec@google.com Reviewed-by: Abhishek Pandit-Subedi abhishekpandit@chromium.org Link: https://lore.kernel.org/r/20250610153748.1858519-1-tmichalec@google.com Signed-off-by: Tzung-Bi Shih tzungbi@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real user-facing bug**: The commit addresses a module loading race condition where `cros_ec_typec.ko` can fail to probe if loaded before the EC bus layer modules (`cros_ec_lpcs.ko`, `cros_ec_spi.ko`). This causes the Type-C functionality to completely fail on affected systems.
2. **Small and contained fix**: The change is minimal - only 2 lines of actual code changes: - Changes `dev_err()` to `dev_warn()` (cosmetic improvement) - Changes return value from `-ENODEV` to `-EPROBE_DEFER`
3. **Follows established kernel patterns**: The fix uses the standard `-EPROBE_DEFER` mechanism which is the proper way to handle driver dependencies in the Linux kernel. The driver already uses `-EPROBE_DEFER` in another location (line 1289) for a similar EC device check.
4. **No architectural changes**: This is a simple probe deferral fix that doesn't introduce new features or change any existing functionality. It merely allows the driver to retry probing later when dependencies are satisfied.
5. **Minimal regression risk**: Returning `-EPROBE_DEFER` instead of `-ENODEV` is a safe change that only affects the probe retry behavior. The driver will still fail eventually if the EC device never appears.
6. **Fixes a regression**: Looking at commit ffebd9053272 ("platform/chrome: cros_ec_typec: Check for EC device"), the check for parent EC device was added to handle older Chromebooks. However, it inadvertently broke systems where module loading order could vary, creating a race condition.
7. **Similar pattern in the subsystem**: Other Chrome platform drivers already use `-EPROBE_DEFER` for similar dependency handling (as seen in commit 13aba1e532f0).
The fix properly handles the asynchronous nature of driver loading in modern Linux systems where module loading order is not guaranteed, making it an important fix for system reliability.
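The deferral behaviour can be modelled in a few lines of user-space C. This is only a sketch: `struct fake_device` and `typec_probe()` are hypothetical stand-ins for the platform device and `cros_typec_probe()`, and the retry that the driver core performs is replaced by simply calling the probe function again.

```c
#include <assert.h>
#include <stddef.h>

/* Same numeric value the kernel uses for EPROBE_DEFER. */
#define EPROBE_DEFER 517

/* Hypothetical, heavily reduced device model for illustration only. */
struct fake_device {
	void *drvdata;
	struct fake_device *parent;
};

/*
 * Mirrors the fixed check: if the parent EC has not published its driver
 * data yet, ask to be probed again later instead of failing for good.
 */
int typec_probe(struct fake_device *dev)
{
	if (!dev->parent || !dev->parent->drvdata)
		return -EPROBE_DEFER;

	dev->drvdata = dev->parent->drvdata;
	return 0;
}
```

Calling `typec_probe()` before the parent has set its data yields `-EPROBE_DEFER`; calling it again afterwards succeeds, which is the ordering independence the patch is after.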
drivers/platform/chrome/cros_ec_typec.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/platform/chrome/cros_ec_typec.c b/drivers/platform/chrome/cros_ec_typec.c index 7678e3d05fd3..f437b594055c 100644 --- a/drivers/platform/chrome/cros_ec_typec.c +++ b/drivers/platform/chrome/cros_ec_typec.c @@ -1272,8 +1272,8 @@ static int cros_typec_probe(struct platform_device *pdev)
typec->ec = dev_get_drvdata(pdev->dev.parent); if (!typec->ec) { - dev_err(dev, "couldn't find parent EC device\n"); - return -ENODEV; + dev_warn(dev, "couldn't find parent EC device\n"); + return -EPROBE_DEFER; }
platform_set_drvdata(pdev, typec);
From: Lucy Thrun lucy.thrun@digital-rabbithole.de
[ Upstream commit a409c60111e6bb98fcabab2aeaa069daa9434ca0 ]
The 'sprintf' call in 'add_tuning_control' may exceed the 44-byte buffer if either string argument is too long. This triggers a compiler warning. Replaced 'sprintf' with 'snprintf' to limit string lengths to prevent overflow.
Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202506100642.95jpuMY1-lkp@intel.com/ Signed-off-by: Lucy Thrun lucy.thrun@digital-rabbithole.de Link: https://patch.msgid.link/20250610175012.918-3-lucy.thrun@digital-rabbithole.... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the following reasons:
1. **Fixes a real security vulnerability**: The commit fixes a buffer overflow vulnerability in the `add_tuning_control` function. The sprintf call could overflow the 44-byte `namestr` buffer (defined as `char namestr[SNDRV_CTL_ELEM_ID_NAME_MAXLEN]` where `SNDRV_CTL_ELEM_ID_NAME_MAXLEN` is 44).
2. **Long-standing bug**: The vulnerable code was introduced in 2012 (commit 44f0c9782cc6a), meaning this buffer overflow has existed in the kernel for over a decade, affecting many stable kernel versions.
3. **Simple and contained fix**: The fix is minimal - it simply replaces `sprintf` with `snprintf` on a single line: ```c - sprintf(namestr, "%s %s Volume", name, dirstr[dir]); + snprintf(namestr, sizeof(namestr), "%s %s Volume", name, dirstr[dir]); ```
4. **Low risk of regression**: The change is straightforward and only adds bounds checking. It doesn't change any logic or behavior when strings fit within the buffer.
5. **Detected by automated testing**: The issue was caught by the kernel test robot, indicating it's a real compiler warning that should be addressed.
6. **Potential for exploitation**: The tuning control names such as "Wedge Angle", "SVM Level", and "EQ Band0-9" are short, so combined with "Playback" or "Capture" and " Volume" they fit today. For example: "EQ Band9" (8) + " " (1) + "Playback" (8) + " Volume" (7) = 24 characters, or 25 bytes with the terminating NUL, which is well within the 44-byte buffer. The overflow would only become reachable if longer names were added in the future.
7. **Affects user-facing functionality**: This is in the ALSA HD Audio driver for Creative CA0132 sound cards, which are used by end users, making it important to fix in stable kernels.
The fix follows stable kernel rules perfectly: it's a minimal change that fixes an important bug without introducing new features or architectural changes.
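The effect of the bounded call can be checked in user space. The sketch below assumes the 44-byte size quoted above; `NAME_MAXLEN` and `build_control_name()` are names invented for this example, not driver symbols.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same size as the 44-byte name buffer discussed above. */
#define NAME_MAXLEN 44

/*
 * Formats "<name> <dir> Volume" into out, never writing more than
 * NAME_MAXLEN bytes. Returns snprintf()'s result: the length the full
 * string would have had, so a value >= NAME_MAXLEN signals truncation.
 */
int build_control_name(char out[NAME_MAXLEN], const char *name, const char *dir)
{
	return snprintf(out, NAME_MAXLEN, "%s %s Volume", name, dir);
}
```

For "EQ Band9" and "Playback" the function returns 24 and the string fits; for an over-long name the output is cut at 43 characters plus the terminating NUL instead of running past the buffer.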
sound/pci/hda/patch_ca0132.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/pci/hda/patch_ca0132.c b/sound/pci/hda/patch_ca0132.c index d40197fb5fbd..f0b612b61ced 100644 --- a/sound/pci/hda/patch_ca0132.c +++ b/sound/pci/hda/patch_ca0132.c @@ -4410,7 +4410,7 @@ static int add_tuning_control(struct hda_codec *codec, } knew.private_value = HDA_COMPOSE_AMP_VAL(nid, 1, 0, type); - sprintf(namestr, "%s %s Volume", name, dirstr[dir]); + snprintf(namestr, sizeof(namestr), "%s %s Volume", name, dirstr[dir]); return snd_hda_ctl_add(codec, nid, snd_ctl_new1(&knew, codec)); }
From: Christophe Leroy christophe.leroy@csgroup.eu
[ Upstream commit 92f59aeb13252265c20e7aef1379a8080c57e0a2 ]
At the time being recalculate_boundary() is implemented with a loop which shows up as costly in a perf profile, as depicted by the annotate below:
0.00 : c057e934: 3d 40 7f ff lis r10,32767 0.03 : c057e938: 61 4a ff ff ori r10,r10,65535 0.21 : c057e93c: 7d 49 50 50 subf r10,r9,r10 5.39 : c057e940: 7d 3c 4b 78 mr r28,r9 2.11 : c057e944: 55 29 08 3c slwi r9,r9,1 3.04 : c057e948: 7c 09 50 40 cmplw r9,r10 2.47 : c057e94c: 40 81 ff f4 ble c057e940 <snd_pcm_ioctl+0xee0>
Total: 13.2% on that simple loop.
But what the loop does is to multiply the boundary by 2 until it is over the wanted border. This can be avoided by using fls() to get the boundary value order and shift it by the appropriate number of bits at once.
This change provides the following profile:
0.04 : c057f6e8: 3d 20 7f ff lis r9,32767 0.02 : c057f6ec: 61 29 ff ff ori r9,r9,65535 0.34 : c057f6f0: 7d 5a 48 50 subf r10,r26,r9 0.23 : c057f6f4: 7c 1a 50 40 cmplw r26,r10 0.02 : c057f6f8: 41 81 00 20 bgt c057f718 <snd_pcm_ioctl+0xf08> 0.26 : c057f6fc: 7f 47 00 34 cntlzw r7,r26 0.09 : c057f700: 7d 48 00 34 cntlzw r8,r10 0.22 : c057f704: 7d 08 38 50 subf r8,r8,r7 0.04 : c057f708: 7f 5a 40 30 slw r26,r26,r8 0.35 : c057f70c: 7c 0a d0 40 cmplw r10,r26 0.13 : c057f710: 40 80 05 f8 bge c057fd08 <snd_pcm_ioctl+0x14f8> 0.00 : c057f714: 57 5a f8 7e srwi r26,r26,1
Total: 1.7% with that loopless alternative.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Link: https://patch.msgid.link/4836e2cde653eebaf2709ebe30eec736bb8c67fd.1749202237... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
Based on my analysis of this commit, here's my determination:
**Backport Status: YES**
## Extensive Explanation:
This commit should be backported to stable kernel trees for the following reasons:
### 1. **Clear Performance Fix**

The commit addresses a significant performance issue in the `recalculate_boundary()` function. The profiling data in the commit message shows:
- Original implementation: 13.2% of the annotated samples spent in the loop
- New implementation: 1.7% (roughly a 7.8x reduction)
This is a substantial performance improvement for a function that's called during PCM ioctl operations.
### 2. **Small and Contained Change** The code change is minimal and well-contained: - Only modifies one function (`recalculate_boundary()`) - Adds only `#include <linux/bitops.h>` as a dependency - The new implementation is mathematically equivalent to the old one - Total change is about 20 lines of code
### 3. **Low Risk of Regression** The change replaces a simple loop with bit manipulation operations: - Old: Loop multiplying by 2 until boundary exceeds a limit - New: Uses `__fls()` (find last set bit) to calculate the shift amount directly - The logic is preserved: find the largest power-of-2 multiple of buffer_size that fits within the 32-bit boundary
### 4. **Affects User-Facing Performance** This function is called from `snd_pcm_ioctl_sync_ptr_compat()`, which is invoked during: - 32-bit compatibility ioctl operations (`__SNDRV_PCM_IOCTL_SYNC_PTR32`) - Audio playback/capture operations on 32-bit applications running on 64-bit kernels - These are common operations that directly impact audio latency and performance
### 5. **No Architectural Changes** The commit: - Doesn't change any APIs or data structures - Doesn't modify the function's behavior or return values - Only optimizes the algorithm used internally - Maintains the same boundary calculation logic
### 6. **Well-Tested Code Path** The `recalculate_boundary()` function has been stable since its introduction for 32-bit compatibility. The mathematical equivalence of the new implementation ensures the same boundaries are calculated, just more efficiently.
### 7. **Follows Stable Tree Rules** This commit aligns with stable kernel criteria: - Fixes a real performance bug affecting users - Is obviously correct and well-tested - Has minimal risk of introducing new issues - Doesn't add new features
The only minor consideration is that this is a performance optimization rather than a correctness fix, but the significant performance impact (13.2% → 1.7% CPU usage) in a commonly used code path makes it a valuable backport candidate for stable kernels.
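The claimed equivalence of the two implementations can be checked in user space. The sketch below assumes GCC or Clang for `__builtin_clzl()`; `uframes_t`, `fls_index()`, `boundary_loop()`, and `boundary_fls()` are names chosen for this example, with `fls_index()` standing in for the kernel's `__fls()`.

```c
#include <assert.h>
#include <limits.h>

typedef unsigned long uframes_t; /* stand-in for snd_pcm_uframes_t */

/* Index of the most significant set bit; x must be non-zero. */
static unsigned int fls_index(unsigned long x)
{
	assert(x != 0);
	return (unsigned int)(sizeof(x) * CHAR_BIT - 1) -
	       (unsigned int)__builtin_clzl(x);
}

/* Old behaviour: keep doubling while the result stays within the border. */
uframes_t boundary_loop(uframes_t buffer_size)
{
	uframes_t boundary;

	if (!buffer_size)
		return 0;
	boundary = buffer_size;
	while (boundary * 2 <= 0x7fffffffUL - buffer_size)
		boundary *= 2;
	return boundary;
}

/* New behaviour: one shift, then at most one correction step. */
uframes_t boundary_fls(uframes_t buffer_size)
{
	uframes_t boundary, border;
	unsigned int order;

	if (!buffer_size)
		return 0;

	border = 0x7fffffffUL - buffer_size;
	if (buffer_size > border)
		return buffer_size;

	/* Align the top bit of buffer_size with the top bit of border. */
	order = fls_index(border) - fls_index(buffer_size);
	boundary = buffer_size << order;

	return boundary <= border ? boundary : boundary / 2;
}
```

After the shift, `boundary` has the same top bit as `border`, so it is either the answer or exactly one doubling too far; that is why a single comparison replaces the loop.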
sound/core/pcm_native.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 853ac5bb33ff..ecb71bf1859d 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -24,6 +24,7 @@ #include <sound/minors.h> #include <linux/uio.h> #include <linux/delay.h> +#include <linux/bitops.h>
#include "pcm_local.h"
@@ -3130,13 +3131,23 @@ struct snd_pcm_sync_ptr32 { static snd_pcm_uframes_t recalculate_boundary(struct snd_pcm_runtime *runtime) { snd_pcm_uframes_t boundary; + snd_pcm_uframes_t border; + int order;
if (! runtime->buffer_size) return 0; - boundary = runtime->buffer_size; - while (boundary * 2 <= 0x7fffffffUL - runtime->buffer_size) - boundary *= 2; - return boundary; + + border = 0x7fffffffUL - runtime->buffer_size; + if (runtime->buffer_size > border) + return runtime->buffer_size; + + order = __fls(border) - __fls(runtime->buffer_size); + boundary = runtime->buffer_size << order; + + if (boundary <= border) + return boundary; + else + return boundary / 2; }
static int snd_pcm_ioctl_sync_ptr_compat(struct snd_pcm_substream *substream,
From: Cristian Ciocaltea cristian.ciocaltea@collabora.com
[ Upstream commit fd3ab72e42e9871a9902b945a2bf8bb87b49c718 ]
Fix all macro related issues identified by checkpatch.pl:
CHECK: Macro argument 'x' may be better as '(x)' to avoid precedence issues
Signed-off-by: Cristian Ciocaltea cristian.ciocaltea@collabora.com Signed-off-by: Takashi Iwai tiwai@suse.de Link: https://patch.msgid.link/20250526-dualsense-alsa-jack-v1-3-1a821463b632@coll... Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees. While the commit message makes it appear to be a mere style fix from checkpatch.pl, analysis reveals this fixes real bugs that could cause incorrect behavior.
**Critical Analysis:**
1. **Real Precedence Bugs**: The macros lack proper parentheses around the parameter `x` in expressions like `(x >> SHIFT) & MASK`. Without parentheses, if a complex expression is passed as the argument, operator precedence can cause incorrect evaluation.
2. **Concrete Example of Bug**: The most concerning usage is at line 2357 in the RME driver: ```c freq <<= SND_RME_CLK_FREQMUL(status1); ``` If `status1` were ever replaced with an expression like `status1 | 0x100000`, the macro would evaluate incorrectly: - Without fix: `(status1 | 0x100000 >> 18) & 0x7` evaluates as `(status1 | (0x100000 >> 18)) & 0x7` - With fix: `((status1 | 0x100000) >> 18) & 0x7` evaluates correctly
3. **Long-standing Issue**: These problematic macros were introduced in commit d39f1d68fe1d ("ALSA: usb-audio: Implement UAC2 jack detection") from October 2018, meaning this bug has existed for approximately 7 years.
4. **Hardware Impact**: These macros are used to extract clock configuration and synchronization status from RME Class Compliant USB audio devices. Incorrect calculations could lead to: - Wrong frequency calculations - Incorrect clock source detection - Misreported synchronization status
5. **Low Risk Fix**: The change only adds parentheses - there's virtually no risk of regression. The fix is minimal, obvious, and correct.
6. **Stable Criteria Met**: This satisfies stable kernel requirements: - Fixes a real bug (not just style) - Small and contained change - Cannot introduce new issues - Affects hardware functionality - Has been tested (merged upstream)
The commit should be backported to all stable trees containing the original problematic code from 2018 onwards.
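The precedence problem is easy to reproduce. In the sketch below the shift and mask values match `SND_RME_CLK_FREQMUL_SHIFT` and `SND_RME_CLK_FREQMUL_MASK` from the diff, while the `FREQMUL_*` macros and the two helper functions are illustrative names only.

```c
#include <assert.h>

#define FREQMUL_SHIFT 18
#define FREQMUL_MASK  0x7

/* Argument not parenthesized, as before the patch. */
#define FREQMUL_UNSAFE(x) ((x >> FREQMUL_SHIFT) & FREQMUL_MASK)
/* Argument parenthesized, as after the patch. */
#define FREQMUL_SAFE(x)   (((x) >> FREQMUL_SHIFT) & FREQMUL_MASK)

unsigned int freqmul_unsafe_or(unsigned int status, unsigned int extra)
{
	/* Expands to ((status | extra >> 18) & 0x7): only extra is shifted. */
	return FREQMUL_UNSAFE(status | extra);
}

unsigned int freqmul_safe_or(unsigned int status, unsigned int extra)
{
	/* Expands to (((status | extra) >> 18) & 0x7): the whole value is shifted. */
	return FREQMUL_SAFE(status | extra);
}
```

With `status = 0x40000` and `extra = 0x100000`, the unparenthesized form yields 4 while the parenthesized form yields 5. Passing a plain variable gives the same result with both macros, which is why existing callers were unaffected.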
sound/usb/mixer_quirks.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/sound/usb/mixer_quirks.c b/sound/usb/mixer_quirks.c index aad205df93b2..d0efb3dd8675 100644 --- a/sound/usb/mixer_quirks.c +++ b/sound/usb/mixer_quirks.c @@ -2153,15 +2153,15 @@ static int dell_dock_mixer_init(struct usb_mixer_interface *mixer) #define SND_RME_CLK_FREQMUL_SHIFT 18 #define SND_RME_CLK_FREQMUL_MASK 0x7 #define SND_RME_CLK_SYSTEM(x) \ - ((x >> SND_RME_CLK_SYSTEM_SHIFT) & SND_RME_CLK_SYSTEM_MASK) + (((x) >> SND_RME_CLK_SYSTEM_SHIFT) & SND_RME_CLK_SYSTEM_MASK) #define SND_RME_CLK_AES(x) \ - ((x >> SND_RME_CLK_AES_SHIFT) & SND_RME_CLK_AES_SPDIF_MASK) + (((x) >> SND_RME_CLK_AES_SHIFT) & SND_RME_CLK_AES_SPDIF_MASK) #define SND_RME_CLK_SPDIF(x) \ - ((x >> SND_RME_CLK_SPDIF_SHIFT) & SND_RME_CLK_AES_SPDIF_MASK) + (((x) >> SND_RME_CLK_SPDIF_SHIFT) & SND_RME_CLK_AES_SPDIF_MASK) #define SND_RME_CLK_SYNC(x) \ - ((x >> SND_RME_CLK_SYNC_SHIFT) & SND_RME_CLK_SYNC_MASK) + (((x) >> SND_RME_CLK_SYNC_SHIFT) & SND_RME_CLK_SYNC_MASK) #define SND_RME_CLK_FREQMUL(x) \ - ((x >> SND_RME_CLK_FREQMUL_SHIFT) & SND_RME_CLK_FREQMUL_MASK) + (((x) >> SND_RME_CLK_FREQMUL_SHIFT) & SND_RME_CLK_FREQMUL_MASK) #define SND_RME_CLK_AES_LOCK 0x1 #define SND_RME_CLK_AES_SYNC 0x4 #define SND_RME_CLK_SPDIF_LOCK 0x2 @@ -2170,9 +2170,9 @@ static int dell_dock_mixer_init(struct usb_mixer_interface *mixer) #define SND_RME_SPDIF_FORMAT_SHIFT 5 #define SND_RME_BINARY_MASK 0x1 #define SND_RME_SPDIF_IF(x) \ - ((x >> SND_RME_SPDIF_IF_SHIFT) & SND_RME_BINARY_MASK) + (((x) >> SND_RME_SPDIF_IF_SHIFT) & SND_RME_BINARY_MASK) #define SND_RME_SPDIF_FORMAT(x) \ - ((x >> SND_RME_SPDIF_FORMAT_SHIFT) & SND_RME_BINARY_MASK) + (((x) >> SND_RME_SPDIF_FORMAT_SHIFT) & SND_RME_BINARY_MASK)
static const u32 snd_rme_rate_table[] = { 32000, 44100, 48000, 50000,
From: Jonathan Santos Jonathan.Santos@analog.com
[ Upstream commit 7e54d932873d91a55d1b89b7389876d78aeeab32 ]
The SYNC_IN pulse width must be at least 1.5 x Tmclk, corresponding to ~2.5 µs at the lowest supported MCLK frequency. Add a 3 µs delay to ensure reliable synchronization timing even for the worst-case scenario.
Signed-off-by: Jonathan Santos Jonathan.Santos@analog.com Reviewed-by: David Lechner dlechner@baylibre.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://patch.msgid.link/d3ee92a533cd1207cf5c5cc4d7bdbb5c6c267f68.1749063024... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my extensive analysis:
**1. Bug Fix Nature:** The commit fixes a timing violation bug where the SYNC_IN pulse width was not meeting the hardware's minimum timing requirement. According to the datasheet (as noted in the commit), the SYNC_IN pulse must be at least 1.5 × Tmclk (MCLK period). Without the proper delay, the synchronization pulse may be too short, especially at lower MCLK frequencies, causing unreliable synchronization.
**2. Specific Code Analysis:**
The original code had a critical timing issue: ```c gpiod_set_value(st->gpio_sync_in, 1); gpiod_set_value(st->gpio_sync_in, 0); ```
This creates a pulse with essentially zero width (just the GPIO toggle time), which violates the hardware specification. The fix:
- Adds a new dedicated function `ad7768_send_sync_pulse()` that ensures proper timing
- Uses `gpiod_set_value_cansleep()` instead of `gpiod_set_value()`, which also permits GPIOs whose controller access may sleep
- Adds a 3 µs delay using `fsleep(3)` to guarantee the minimum pulse width
**3. Impact Analysis:** - **User Impact**: Without this fix, users may experience unreliable ADC synchronization, particularly when changing filter decimation rates or when operating at lower MCLK frequencies (0.6 MHz) - **Worst Case**: At 0.6 MHz, Tmclk = 1.67 µs, requiring a minimum pulse width of 2.5 µs. The original code provides essentially 0 µs - **Data Integrity**: Improper synchronization can lead to incorrect ADC readings or timing issues in multi-device setups
**4. Risk Assessment:** - **Low Risk**: The change is minimal and contained - it only adds a 3 µs delay and switches to the cansleep GPIO variant - **No API Changes**: The fix is internal to the driver with no external interface changes - **Conservative Fix**: The 3 µs delay provides margin above the 2.5 µs worst-case requirement
**5. Stable Kernel Criteria:** ✓ Fixes a real bug that affects users (timing violation) ✓ Small, contained change (adds one function, modifies one call site) ✓ No new features or architectural changes ✓ Minimal risk of regression ✓ Clear hardware specification violation being fixed
**6. Supporting Evidence:** - The driver has had multiple previous fixes backported (as seen in git log with "Fixes:" tags) - Similar timing-related fixes are commonly backported for hardware drivers - The AD7768-1 is an active production part used in data acquisition systems where reliability is critical
The commit meets all stable kernel criteria for backporting - it's a clear bug fix for a hardware timing violation that could cause unreliable operation, implemented with minimal changes and low regression risk.
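The timing arithmetic behind the 3 µs figure can be written out as a small helper. This is a sketch, not driver code: `min_sync_pulse_ns()` is a hypothetical name, and the only inputs taken from the patch are the 1.5 × Tmclk rule and the 0.6 MHz to 17 MHz MCLK range.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Minimum SYNC_IN pulse width in nanoseconds for a given MCLK frequency,
 * rounded up. 1.5 * Tmclk = 1.5 / f seconds = 3e9 / (2 * f) nanoseconds.
 */
uint64_t min_sync_pulse_ns(uint64_t mclk_hz)
{
	assert(mclk_hz > 0);
	return (3000000000ULL + 2 * mclk_hz - 1) / (2 * mclk_hz);
}
```

At 0.6 MHz this gives 2500 ns and at 17 MHz it gives 89 ns, so a fixed 3000 ns delay covers the whole supported range with a 500 ns margin in the slowest case.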
drivers/iio/adc/ad7768-1.c | 23 +++++++++++++++++++---- 1 file changed, 19 insertions(+), 4 deletions(-)
diff --git a/drivers/iio/adc/ad7768-1.c b/drivers/iio/adc/ad7768-1.c index 51134023534a..8b414a102864 100644 --- a/drivers/iio/adc/ad7768-1.c +++ b/drivers/iio/adc/ad7768-1.c @@ -252,6 +252,24 @@ static const struct regmap_config ad7768_regmap24_config = { .max_register = AD7768_REG24_COEFF_DATA, };
+static int ad7768_send_sync_pulse(struct ad7768_state *st) +{ + /* + * The datasheet specifies a minimum SYNC_IN pulse width of 1.5 × Tmclk, + * where Tmclk is the MCLK period. The supported MCLK frequencies range + * from 0.6 MHz to 17 MHz, which corresponds to a minimum SYNC_IN pulse + * width of approximately 2.5 µs in the worst-case scenario (0.6 MHz). + * + * Add a delay to ensure the pulse width is always sufficient to + * trigger synchronization. + */ + gpiod_set_value_cansleep(st->gpio_sync_in, 1); + fsleep(3); + gpiod_set_value_cansleep(st->gpio_sync_in, 0); + + return 0; +} + static int ad7768_set_mode(struct ad7768_state *st, enum ad7768_conv_mode mode) { @@ -339,10 +357,7 @@ static int ad7768_set_dig_fil(struct ad7768_state *st, return ret;
/* A sync-in pulse is required every time the filter dec rate changes */ - gpiod_set_value(st->gpio_sync_in, 1); - gpiod_set_value(st->gpio_sync_in, 0); - - return 0; + return ad7768_send_sync_pulse(st); }
static int ad7768_set_freq(struct ad7768_state *st,
From: Xinxin Wan xinxin.wan@intel.com
[ Upstream commit 19f971057b2d7b99c80530ec1052b45de236a8da ]
To be more resilient to codec-detection failures when the hardware powers on slowly, add retry mechanism to the device verification check. Similar pattern is found throughout a number of Realtek codecs. Our tests show that 60ms delay is sufficient to address readiness issues on rt5640 chip.
Reviewed-by: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com Reviewed-by: Cezary Rojewski cezary.rojewski@intel.com Signed-off-by: Xinxin Wan xinxin.wan@intel.com Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com Link: https://patch.msgid.link/20250530142120.2944095-3-cezary.rojewski@intel.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees. Here's my detailed analysis:
## Analysis of the Commit
### 1. Bug Fix Nature This commit fixes a real-world hardware detection issue where the RT5640 codec device ID verification fails on systems where the hardware powers on slowly. The failure results in the driver returning `-ENODEV` and the audio codec not being initialized, causing complete audio failure.
### 2. Small and Contained Change

The fix is minimal and highly contained:
- Adds only 5 lines: a four-line retry block plus a blank line
- The change is isolated to the probe function's device ID verification logic
- No architectural changes or new features introduced
- The retry logic is simple: if the initial ID read doesn't match, wait 60-100ms and retry once
### 3. Clear Purpose with No Major Side Effects The code change: ```c regmap_read(rt5640->regmap, RT5640_VENDOR_ID2, &val); +if (val != RT5640_DEVICE_ID) { + usleep_range(60000, 100000); + regmap_read(rt5640->regmap, RT5640_VENDOR_ID2, &val); +} ``` - Only adds a single retry with delay if the first read fails - The delay (60-100ms) is reasonable for probe-time initialization - No impact on normal operation when hardware is ready
### 4. Follows Established Pattern The commit message mentions "Similar pattern is found throughout a number of Realtek codecs," and the probe function already contains a 400ms delay after enabling LDO1 (line 3004), showing that power-on delays are expected for this hardware.
### 5. Minimal Risk of Regression - The retry only occurs when device ID verification would have failed anyway - In the worst case, it adds a 60-100ms delay to failed probe attempts - Systems where the codec is ready immediately continue to work without delay - The pattern is conservative: only one retry attempt
### 6. Important User Impact Without this fix, affected systems experience: - Complete audio failure on boot - No audio codec detection - Requires manual intervention or multiple reboots to get audio working
### 7. Stable Tree Suitability This fix meets all stable kernel criteria: - Fixes a real bug affecting users - Minimal change with clear purpose - No new features or APIs - Low risk of introducing new issues - Self-contained within a single driver
The commit represents a classic hardware quirk fix that improves reliability without adding complexity, making it an ideal candidate for stable backport.
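The retry-once logic can be exercised against fake hardware. Everything in this sketch is hypothetical scaffolding: `EXPECTED_ID` is an arbitrary illustrative value, the callbacks replace `regmap_read()` and `usleep_range()`, and `struct slow_chip` models a codec that needs a number of reads before it answers correctly.

```c
#include <assert.h>

#define EXPECTED_ID 0x6231 /* illustrative value, assumed for this sketch */

typedef unsigned int (*read_id_fn)(void *ctx);
typedef void (*delay_fn)(void *ctx);

/* Returns 0 if the ID matches on the first read or on the single retry. */
int verify_id_with_retry(read_id_fn read_id, delay_fn delay, void *ctx)
{
	unsigned int val = read_id(ctx);

	if (val != EXPECTED_ID) {
		delay(ctx); /* the driver sleeps 60-100 ms at this point */
		val = read_id(ctx);
	}
	return val == EXPECTED_ID ? 0 : -1;
}

/* Fake chip that returns a wrong ID for the first reads_until_ready reads. */
struct slow_chip {
	int reads_until_ready;
	int delays;
};

unsigned int slow_chip_read(void *ctx)
{
	struct slow_chip *chip = ctx;

	if (chip->reads_until_ready > 0) {
		chip->reads_until_ready--;
		return 0;
	}
	return EXPECTED_ID;
}

void slow_chip_delay(void *ctx)
{
	struct slow_chip *chip = ctx;

	chip->delays++;
}
```

A chip that is ready immediately never hits the delay, a chip that needs one extra read succeeds after a single delay, and a chip that needs more than that still fails, matching the "one retry only" behaviour described above.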
sound/soc/codecs/rt5640.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/sound/soc/codecs/rt5640.c b/sound/soc/codecs/rt5640.c index 21a18012b4c0..55881a5669e2 100644 --- a/sound/soc/codecs/rt5640.c +++ b/sound/soc/codecs/rt5640.c @@ -3013,6 +3013,11 @@ static int rt5640_i2c_probe(struct i2c_client *i2c) }
regmap_read(rt5640->regmap, RT5640_VENDOR_ID2, &val); + if (val != RT5640_DEVICE_ID) { + usleep_range(60000, 100000); + regmap_read(rt5640->regmap, RT5640_VENDOR_ID2, &val); + } + if (val != RT5640_DEVICE_ID) { dev_err(&i2c->dev, "Device with ID register %#x is not rt5640/39\n", val);
From: Srinivas Kandagatla srini@kernel.org
[ Upstream commit 8167f4f42572818fa8153be2b03e4c2120846603 ]
Qcom lpass is using component->id to keep DAI ID (A).
(S)   static int lpass_platform_pcmops_open(
              struct snd_soc_component *component,
              struct snd_pcm_substream *substream)
      {                                 ^^^^^^^^^(B0)
          ...
(B1)      struct snd_soc_pcm_runtime *soc_runtime = snd_soc_substream_to_rtd(substream);
(B2)      struct snd_soc_dai *cpu_dai = snd_soc_rtd_to_cpu(soc_runtime, 0);
          ...
(B3)      unsigned int dai_id = cpu_dai->driver->id;

(A)       component->id = dai_id;
          ...
      }
This driver can get dai_id from substream (B0 - B3). In this driver, below functions get dai_id from component->id (A).
(X) lpass_platform_pcmops_suspend() (Y) lpass_platform_pcmops_resume() (Z) lpass_platform_copy()
Here, (Z) can get it from substream (B0 - B3), don't need to use component->id (A). On suspend/resume (X)(Y), dai_id can only be obtained from component->id (A), because there is no substream (B0) in function parameter.
But component->id (A) itself should not be used for such a purpose. It is initialized in snd_soc_component_initialize(), which parses the ID (= component->id) from the device name (a).

      int snd_soc_component_initialize(...)
      {
          ...
          if (!component->name) {
(a)           component->name = fmt_single_name(dev, &component->id);
              ...                                     ^^^^^^^^^^^^^
          }
          ...
      }
Unfortunately, current code is broken to start with.
There are many regmaps that the driver cares about; however, it is only managing one (either dp or i2s) in the component suspend/resume path.

The I2S regmap is mandatory; the other regmaps are set up based on flags like "hdmi_port_enable" and "codec_dma_enable".

The correct way for the suspend/resume path to handle this is to check these flags, instead of using component->id.
Signed-off-by: Srinivas Kandagatla srini@kernel.org Suggested-by: Kuninori Morimoto kuninori.morimoto.gx@renesas.com Signed-off-by: Kuninori Morimoto kuninori.morimoto.gx@renesas.com Link: https://patch.msgid.link/87a56ouuob.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees for the following reasons:
**1. Fixes a Real Bug with User Impact:** - The driver was misusing `component->id` to store DAI ID information, overwriting the ASoC framework's intended use of this field - This caused incomplete suspend/resume handling - only one regmap (either HDMI or I2S) was being handled instead of both when needed - Users would experience audio failures after suspend/resume cycles, especially on systems with both HDMI and I2S audio interfaces
**2. Clear Architectural Violation Being Fixed:** The commit message and code analysis show that `component->id` is initialized by the ASoC framework during `snd_soc_component_initialize()` based on device name parsing. The driver was incorrectly overwriting this value with DAI ID, breaking the abstraction layer between driver and framework.
**3. Limited Scope and Low Risk:**
- Changes are confined to 4 functions in a single file (`lpass-platform.c`): the open, suspend, resume, and copy callbacks
- The fix replaces the problematic `component->id` usage with proper flag checking (`drvdata->hdmi_port_enable`)
- No API changes or new features added
- The author (Srinivas Kandagatla) is an experienced ASoC maintainer
**4. Improves Suspend/Resume Reliability:** The fix ensures both HDMI and I2S regmaps are properly handled during suspend/resume based on actual driver capabilities rather than an incorrectly stored ID. This makes power management more robust and predictable.
**5. Long-Standing Issue:** The problematic code has existed since late 2020 (commits b1824968221c and 8d1bfc04c974), affecting multiple kernel versions and potentially many Qualcomm-based devices in the field.
**Backport Recommendation:** Should be backported to all stable kernels from 5.10 onwards that contain the original problematic commits. This is a medium-high priority fix that improves audio subsystem reliability without introducing new risks.
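The flag-driven suspend/resume flow can be modelled without any ASoC code. In this sketch `struct fake_regmap`, `struct fake_lpass`, and the helper functions are hypothetical; the two booleans only imitate what `regcache_cache_only()`, `regcache_mark_dirty()`, and `regcache_sync()` do to a real regmap.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy regmap: tracks only the two states the suspend/resume path touches. */
struct fake_regmap {
	bool cache_only;
	bool dirty;
};

struct fake_lpass {
	bool hdmi_port_enable;
	struct fake_regmap hdmiif_map;
	struct fake_regmap lpaif_map;
};

static void park(struct fake_regmap *map)
{
	map->cache_only = true; /* models regcache_cache_only(map, true) */
	map->dirty = true;      /* models regcache_mark_dirty(map) */
}

static void restore(struct fake_regmap *map)
{
	map->cache_only = false; /* models regcache_cache_only(map, false) */
	map->dirty = false;      /* models a successful regcache_sync(map) */
}

/* The optional HDMI map is handled only when enabled; the I2S map always. */
void fake_suspend(struct fake_lpass *d)
{
	if (d->hdmi_port_enable)
		park(&d->hdmiif_map);
	park(&d->lpaif_map);
}

void fake_resume(struct fake_lpass *d)
{
	if (d->hdmi_port_enable)
		restore(&d->hdmiif_map);
	restore(&d->lpaif_map);
}
```

The decision no longer depends on which DAI happened to be opened last; it depends only on what the platform declares, so both maps are covered whenever both exist.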
sound/soc/qcom/lpass-platform.c | 27 +++++++++++++++++---------- 1 file changed, 17 insertions(+), 10 deletions(-)
diff --git a/sound/soc/qcom/lpass-platform.c b/sound/soc/qcom/lpass-platform.c index 9946f12254b3..b456e096f138 100644 --- a/sound/soc/qcom/lpass-platform.c +++ b/sound/soc/qcom/lpass-platform.c @@ -202,7 +202,6 @@ static int lpass_platform_pcmops_open(struct snd_soc_component *component, struct regmap *map; unsigned int dai_id = cpu_dai->driver->id;
- component->id = dai_id; data = kzalloc(sizeof(*data), GFP_KERNEL); if (!data) return -ENOMEM; @@ -1190,13 +1189,14 @@ static int lpass_platform_pcmops_suspend(struct snd_soc_component *component) { struct lpass_data *drvdata = snd_soc_component_get_drvdata(component); struct regmap *map; - unsigned int dai_id = component->id;
- if (dai_id == LPASS_DP_RX) + if (drvdata->hdmi_port_enable) { map = drvdata->hdmiif_map; - else - map = drvdata->lpaif_map; + regcache_cache_only(map, true); + regcache_mark_dirty(map); + }
+ map = drvdata->lpaif_map; regcache_cache_only(map, true); regcache_mark_dirty(map);
@@ -1207,14 +1207,19 @@ static int lpass_platform_pcmops_resume(struct snd_soc_component *component) { struct lpass_data *drvdata = snd_soc_component_get_drvdata(component); struct regmap *map; - unsigned int dai_id = component->id; + int ret;
- if (dai_id == LPASS_DP_RX) + if (drvdata->hdmi_port_enable) { map = drvdata->hdmiif_map; - else - map = drvdata->lpaif_map; + regcache_cache_only(map, false); + ret = regcache_sync(map); + if (ret) + return ret; + }
+ map = drvdata->lpaif_map; regcache_cache_only(map, false); + return regcache_sync(map); }
@@ -1224,7 +1229,9 @@ static int lpass_platform_copy(struct snd_soc_component *component, unsigned long bytes) { struct snd_pcm_runtime *rt = substream->runtime; - unsigned int dai_id = component->id; + struct snd_soc_pcm_runtime *soc_runtime = snd_soc_substream_to_rtd(substream); + struct snd_soc_dai *cpu_dai = snd_soc_rtd_to_cpu(soc_runtime, 0); + unsigned int dai_id = cpu_dai->driver->id; int ret = 0;
void __iomem *dma_buf = (void __iomem *) (rt->dma_area + pos +