On 14.10.2019 13.16, Johan Hovold wrote:
On Fri, Oct 11, 2019 at 03:58:42PM +0300, Mathias Nyman wrote:
commit ef513be0a905 ("usb: xhci: Add Clear_TT_Buffer") schedules work to clear TT buffer, but causes a use-after-free regression at the same time
Make sure hub_tt_work finishes before endpoint is disabled, otherwise the work will dereference already freed endpoint and device related pointers.
This was triggered when usb core failed to read the configuration descriptor of a FS/LS device during enumeration. xhci driver queued clear_tt_work while usb core freed and reallocated a new device for the next enumeration attempt.
EHCI driver implents ehci_endpoint_disable() that makes sure clear_tt_work has finished before it returns, but xhci lacks this support. usb core will call hcd->driver->endpoint_disable() callback before disabling endpoints, so we want this in xhci as well.
The added xhci_endpoint_disable() is based on ehci_endpoint_disable()
I used essentially the same reproducer as you did for debugging this after I first hit it with an actually stalled control endpoint, and this patch works also with my fault-injection hack.
I've reviewed the code and it looks good to me except for one mostly theoretical issue. You need to check ep->hc_priv while holding the xhci->lock in xhci_clear_tt_buffer_complete() or you could end up having xhci_endpoint_disable() reschedule indefinitely while waiting for EP_CLEARING_TT to be cleared on a sufficiently weakly ordered system.
Good point, I'll change that
Since cfbb8a84c2d2 ("xhci: Fix NULL pointer dereference in xhci_clear_tt_buffer_complete()") isn't needed anymore and is slightly misleading, I suggest amending the patch with the following:
I'll add those changes and your tags to the patch
Thanks Mathias