Re: [PATCH v1] Revert "usb: xhci: Implement xhci_handshake_check_state() helper"

22 May 2025


      On Thu, May 22, 2025 at 5:24 AM Mathias Nyman mathias.nyman@intel.com wrote:
...
On 22.5.2025 5.21, Roy Luo wrote:
...
...
...
...
Udipto Goswami, can you recall the platforms that needed this workaroud?
and do we have an easy way to detect those?
Hi Mathias,
From what I recall, we saw this issue coming up on our QCOM mobile
platforms but it was not consistent. It was only reported in long runs
i believe. The most recent instance when I pushed this patch was with
platform SM8650, it was a watchdog timeout issue where xhci_reset() ->
xhci_handshake() polling read timeout upon xhci remove. Unfortunately
I was not able to simulate the scenario for more granular testing and
had validated it with long hours stress testing.
The callstack was like so:
Full call stack on core6:
-000|readl([X19] addr = 0xFFFFFFC03CC08020)
-001|xhci_handshake(inline)
-001|xhci_reset([X19] xhci = 0xFFFFFF8942052250, [X20] timeout_us = 10000000)
-002|xhci_resume([X20] xhci = 0xFFFFFF8942052250, [?] hibernated = ?)
-003|xhci_plat_runtime_resume([locdesc] dev = ?)
-004|pm_generic_runtime_resume([locdesc] dev = ?)
-005|__rpm_callback([X23] cb = 0xFFFFFFE3F09307D8, [X22] dev =
0xFFFFFF890F619C10)
-006|rpm_callback(inline)
-006|rpm_resume([X19] dev = 0xFFFFFF890F619C10,
[NSD:0xFFFFFFC041453AD4] rpmflags = 4)
-007|__pm_runtime_resume([X20] dev = 0xFFFFFF890F619C10, [X19] rpmflags = 4)
-008|pm_runtime_get_sync(inline)
-008|xhci_plat_remove([X20] dev = 0xFFFFFF890F619C00)
Thank you for clarifying this.
So patch avoids the long timeout by always cutting xhci reinit path short in
xhci_resume() if resume was caused by pm_runtime_get_sync() call in
xhci_plat_remove()
void xhci_plat_remove(struct platform_device *dev)
{
         xhci->xhc_state |= XHCI_STATE_REMOVING;
         pm_runtime_get_sync(&dev->dev);
         ...
}
I think we can revert this patch, and just make sure that we don't reset the
host in the reinit path of xhci_resume() if XHCI_STATE_REMOVING is set.
Just return immediately instead.
Just to be sure, are you proposing that we skip xhci_reset() within
the reinit path
of xhci_resume()? If we do that, could that lead to issues with
subsequent operations
in the reinit sequence, such as xhci_init() or xhci_run()?
I suggest to only skip xhci_reset in xhci_resume() if XHCI_STATE_REMOVING is set.
This should be similar to what is going on already.
xhci_reset() currently returns -ENODEV if XHCI_STATE_REMOVING is set, unless reset
completes extremely fast. xhci_resume() bails out if xhci_reset() returns error:
xhci_resume()
   ...
   if (power_lost) {
     ...
     retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC);
     spin_unlock_irq(&xhci->lock);
     if (retval)
       return retval;
...
Do you prefer to group the change to skip xhci_reset() within the
reinit path together
with this revert? or do you want it to be sent and reviewed separately?
First a patch that bails out from xhci_resume() if XHCI_STATE_REMOVING is set
and we are in the reinit (power_lost) path about to call xhci_reset();
Then a second patch that reverts 6ccb83d6c497 ("usb: xhci: Implement
xhci_handshake_check_state()
Does this sound reasonable?
should avoid the QCOM 10sec watchdog issue as next xhci_rest() called
in xhci_remove() path has a short 250ms timeout, and ensure the
SNPS DWC3 USB regression won't trigger.
Thanks
Mathias
Thanks for the clarification! SGTM.
I've sent out a new patchset accordingly
https://lore.kernel.org/linux-usb/20250522190912.457583-1-royluo@google.com/
Thanks,
Roy

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH v1] Revert "usb: xhci: Implement xhci_handshake_check_state() helper"