On 2.3.2024 17.55, Chris Yokum wrote:
The submission of >512 URBs is via usbfs, yes. This worked forever, and still works on EHCI, it's just been failing on xHCI once the indicated change was applied.
We have found a regression bug, where more than 512 URBs cannot be reliably submitted to xHCI. URBs beyond that return 0x00 instead of valid data in the buffer.
FWIW, that's f5af638f0609af ("xhci: Fix transfer ring expansion size calculation") [v6.5-rc1] from Mathias.
Attached is a test program that demonstrates the problem. We used a few different USB-to-Serial adapters with no driver installed as a convenient way to reproduce. We check the TRB debug information before and after to verify the actual number of allocated TRBs.
Could you send me that test program as well?
Ah, so this is just through usbfs?
With some adapters on unaffected kernels, the TRB map gets expanded correctly. This directly corresponds to correct functional behavior. On affected kernels, the TRB ring does not expand, and our functional tests also fail.
We don't know exactly why this happens. Some adapters do work correctly, so there seems to also be some subtle problem that was being masked by the liberal expansion of the TRB ring in older kernels. We also saw on one system that the TRB expansion did work correctly with one particular adapter. However, on all systems at least two adapters did exhibit the problem and fail.
Ok, I see, this could be the empty ring exception check in xhci-ring.c:
It could falsely assume the ring is empty when it has in fact been filled up in one go by queuing several small URBs.
static unsigned int xhci_ring_expansion_needed(struct xhci_hcd *xhci, struct xhci_ring *ring,
					       unsigned int num_trbs)
{
	...
	/* Empty ring special case, enqueue stuck on link trb while dequeue advanced */
	if (trb_is_link(ring->enqueue) && ring->enq_seg->next->trbs == ring->dequeue)
		return 0;
	...
}
https://elixir.bootlin.com/linux/v6.7/source/drivers/usb/host/xhci-ring.c#L3...
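
To illustrate the ambiguity, here is a minimal toy model in C. It is not kernel code: the toy_* names, the tiny segment size, and the queued counter are all assumptions made for illustration, and the model simply parks enqueue on the link TRB once the last slot of a segment is filled, which simplifies what the driver does. It only shows that a check of the same shape as the one quoted above gives the same answer for a fully drained ring and for a ring that was filled in one go.

```c
#include <stdbool.h>
#include <stddef.h>

#define TRBS_PER_SEG 4	/* toy size; the last slot of each segment is the link TRB */
#define NUM_SEGS     2

struct toy_trb { bool is_link; };

struct toy_seg {
	struct toy_trb trbs[TRBS_PER_SEG];
	struct toy_seg *next;
};

struct toy_ring {
	struct toy_seg segs[NUM_SEGS];
	struct toy_seg *enq_seg, *deq_seg;
	struct toy_trb *enqueue, *dequeue;
	unsigned int queued;	/* ground truth, tracked only by this model */
};

static void toy_ring_init(struct toy_ring *r)
{
	for (int i = 0; i < NUM_SEGS; i++) {
		for (int j = 0; j < TRBS_PER_SEG; j++)
			r->segs[i].trbs[j].is_link = (j == TRBS_PER_SEG - 1);
		r->segs[i].next = &r->segs[(i + 1) % NUM_SEGS];
	}
	r->enq_seg = r->deq_seg = &r->segs[0];
	r->enqueue = r->dequeue = &r->segs[0].trbs[0];
	r->queued = 0;
}

/* Producer: step over a link TRB if parked on one, then queue one TRB. */
static void toy_queue_one(struct toy_ring *r)
{
	if (r->enqueue->is_link) {
		r->enq_seg = r->enq_seg->next;
		r->enqueue = &r->enq_seg->trbs[0];
	}
	r->enqueue++;
	r->queued++;
}

/* Consumer: complete one TRB; dequeue follows link TRBs immediately. */
static void toy_complete_one(struct toy_ring *r)
{
	r->dequeue++;
	if (r->dequeue->is_link) {
		r->deq_seg = r->deq_seg->next;
		r->dequeue = &r->deq_seg->trbs[0];
	}
	r->queued--;
}

/* Same shape as the quoted "empty ring special case" condition. */
static bool toy_looks_empty(const struct toy_ring *r)
{
	return r->enqueue->is_link && r->enq_seg->next->trbs == r->dequeue;
}

/* Fill every usable slot in one go; optionally let the "hardware" drain it. */
static void toy_fill(struct toy_ring *r, bool drain)
{
	const unsigned int usable = NUM_SEGS * (TRBS_PER_SEG - 1);

	for (unsigned int i = 0; i < usable; i++)
		toy_queue_one(r);
	if (drain)
		for (unsigned int i = 0; i < usable; i++)
			toy_complete_one(r);
}
```

In this model, toy_fill(&ring, true) and toy_fill(&ring, false) leave enqueue and dequeue at identical positions, so toy_looks_empty() is true in both cases even though queued differs. The pointer positions alone cannot tell the two states apart.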
Can you help me test some patches on your setup?
Thanks
Mathias