On Fri, 16 May 2025 23:38:33 +0000, Thinh Nguyen wrote:
But on the other hand, xhci_handshake() has long timeouts because the handshakes themselves can take a surprisingly long time (and sometimes still succeed), so any reliance on handshake completing before timeout is frankly a bug in itself.
This patch simply honors the contract between the software and hardware, allowing the handshake to complete. It doesn't assume the handshake will finish on time. If it times out, then it times out and returns a failure.
As Michał pointed out, disregarding the xhci handshake timeout is not proper. The change 6ccb83d6c497 seems to workaround some different watchdog warning timeout instead of resolving the actual issue. The watchdog timeout should not be less than the handshake timeout here.
There is certainly one real problem, which has likely existed since forever: some of those handshakes cause system-wide freezes. I haven't investigated it thoroughly, but I suspect the main culprit is the one in xhci_abort_cmd_ring(), which holds the spinlock for a few seconds if the xHC is particularly slow to complete the abort. This probably causes xhci_irq() to spin and disrupt other IRQs.
I encounter it sometimes with ASMedia controllers, but I guess anyone can simulate it by inserting artificial delays near xhci_handshake().
Regards, Michal