On 29.7.2025 20.25, WeitaoWang-oc@zhaoxin.com wrote:
On 2025/7/28 21:16, Mathias Nyman wrote:
On 25.7.2025 21.51, Weitao Wang wrote:
In such a scenario, device-A with slot_id equal to 1 is disconnecting while device-B is enumerating, device-B will fail to enumerate in the follow sequence.
1.[device-A] send disable slot command 2.[device-B] send enable slot command 3.[device-A] disable slot command completed and wakeup waiting thread 4.[device-B] enable slot command completed with slot_id equal to 1 and wakeup waiting thread 5.[device-B] driver check this slot_id was used by someone(device-A) in xhci_alloc_virt_device, this device fails to enumerate as this conflict 6.[device-A] xhci->devs[slot_id] set to NULL in xhci_free_virt_device
To fix driver's slot_id resources conflict, let the xhci_free_virt_device functionm call in the interrupt handler when disable slot command success.
Cc: stable@vger.kernel.org Fixes: 7faac1953ed1 ("xhci: avoid race between disable slot command and host runtime suspend") Signed-off-by: Weitao Wang WeitaoWang-oc@zhaoxin.com
Nice catch, good to get this fixed.
This however has the downside of doing a lot in interrupt context.
what if we only clear some strategic pointers in the interrupt context, and then do all the actual unmapping and endpoint ring segments freeing, contexts freeing ,etc later?
Pseudocode:
xhci_handle_cmd_disable_slot(xhci, slot_id, comp_code) { if (cmd_comp_code == COMP_SUCCESS) { xhci->dcbaa->dev_context_ptrs[slot_id] = 0; xhci->devs[slot_id] = NULL; } }
xhci_disable_and_free_slot(xhci, slot_id) { struct xhci_virt_device *vdev = xhci->devs[slot_id];
xhci_disable_slot(xhci, slot_id); xhci_free_virt_device(xhci, vdev, slot_id); }
xhci_free_virt_device(xhci, vdev, slot_id) { if (xhci->dcbaa->dev_context_ptrs[slot_id] == vdev->out_ctx->dma) xhci->dcbaa->dev_context_ptrs[slot_id] = 0;
// free and unmap things just like before ...
if (xhci->devs[slot_id] == vdev) xhci->devs[slot_id] = NULL;
kfee(vdev); }
Hi Mathias,
Yes, your suggestion is a better revision, I made some modifications to the patch which is listed below. Please help to review again. Thanks for your help.
drivers/usb/host/xhci-hub.c | 3 +-- drivers/usb/host/xhci-mem.c | 21 ++++++++++----------- drivers/usb/host/xhci-ring.c | 9 +++++++-- drivers/usb/host/xhci.c | 23 ++++++++++++++++------- drivers/usb/host/xhci.h | 3 ++- 5 files changed, 36 insertions(+), 23 deletions(-)
diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c index 92bb84f8132a..b3a59ce1b3f4 100644 --- a/drivers/usb/host/xhci-hub.c +++ b/drivers/usb/host/xhci-hub.c @@ -704,8 +704,7 @@ static int xhci_enter_test_mode(struct xhci_hcd *xhci, if (!xhci->devs[i]) continue;
- retval = xhci_disable_slot(xhci, i); - xhci_free_virt_device(xhci, i); + retval = xhci_disable_and_free_slot(xhci, i); if (retval) xhci_err(xhci, "Failed to disable slot %d, %d. Enter test mode anyway\n", i, retval); diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c index 6680afa4f596..fc4aca2e65bc 100644 --- a/drivers/usb/host/xhci-mem.c +++ b/drivers/usb/host/xhci-mem.c @@ -865,21 +865,18 @@ int xhci_alloc_tt_info(struct xhci_hcd *xhci, * will be manipulated by the configure endpoint, allocate device, or update * hub functions while this function is removing the TT entries from the list. */ -void xhci_free_virt_device(struct xhci_hcd *xhci, int slot_id) +void xhci_free_virt_device(struct xhci_hcd *xhci, struct xhci_virt_device *dev, + int slot_id) { - struct xhci_virt_device *dev; int i; int old_active_eps = 0;
/* Slot ID 0 is reserved */ - if (slot_id == 0 || !xhci->devs[slot_id]) + if (slot_id == 0 || !dev) return;
- dev = xhci->devs[slot_id];
- xhci->dcbaa->dev_context_ptrs[slot_id] = 0; - if (!dev) - return; + if (xhci->dcbaa->dev_context_ptrs[slot_id] == dev->out_ctx->dma)
forgot that dev_context_ptrs[] values are stored as le64 while out_ctx->dma is in whatever cpu uses.
So above should be: if (xhci->dcbaa->dev_context_ptrs[slot_id] == cpu_to_le64(dev->out_ctx->dma))
Otherwise it looks good to me
Thanks Mathias