On Tue, Nov 18, 2025, Alan Stern wrote:
On Wed, Nov 19, 2025 at 01:49:12AM +0000, Thinh Nguyen wrote:
On Mon, Nov 17, 2025, Alan Stern wrote:
Hi Alan,
Can you help give your opinion on this?
Well, I think the change to the API was made because drivers _were_ calling these routines in interrupt context. That's what the commit's description says, anyway.
One way out of the problem would be to change the kerneldoc for usb_ep_disable(). Instead of saying that pending requests will complete before the all returns, say that the the requests will be marked for cancellation (with -ESHUTDOWN) before the call returns, but the actual completions might happen asynchronously later on.
The burden of synchronization would be shifted to the gadget drivers. The problem with this is that gadget drivers may modify the requests after usb_ep_disable() when it should not (e.g. the controller may still be processing the buffer). Also, gadget drivers shouldn't call usb_ep_enabled() until the requests are returned.
No, they probably shouldn't, although I don't know if that would actually cause any trouble. It's not a good idea, in any case -- particularly if the drivers want to re-use the same requests as before.
Right.
The problem is that function drivers' ->set_alt() callbacks are expected to do two things: disable all the endpoints from the old altsetting and enable all the endpoints in the new altsetting. There's no way for any part of the gadget core to intervene between those things (for instance, to wait for requests to complete).
The difficulty comes when a gadget driver has to handle a Set-Interface request, or Set-Config for the same configuration. The endpoints for the old altsetting/config have to be disabled and then the endpoints for the new altsetting/config have to be enabled, all while managing any
Right.
pending requests. I don't know how various function drivers handle this, just that f_mass_storage is very careful about taking care of everything in a separate kernel thread that explicitly dequeues the pending requests and flushes the endpoints. In fact, this scenario was the whole reason for inventing the DELAYED_STATUS mechanism, because it was impossible to do all the necessary work within the callback routine for a control-request interrupt handler.
If we want to keep usb_ep_disable in interrupt context, should we revise the wording such that gadget drivers must ensure pending requests are dequeued before calling usb_ep_disable()? That requests are expected to be given back before usb_ep_disable().
Or perhaps revert the commit above (commit b0d5d2a71641), fix the dwc3 driver for that, and gadget drivers need to follow the original requirement.
Function drivers would have to go to great lengths to guarantee that requests had completed before the endpoint is re-enabled. Right now their ->set_alt() callback routines are designed to run in interrupt context; they can't afford to wait for requests to complete.
Why is ->set_alt() designed for interrupt context? We can't expect requests to be completed before usb_ep_disable() completes _and_ also expect usb_ep_disable() be able to be called in interrupt context.
The easiest way out is for usb_ep_disable() to do what the kerneldoc says: ensure that pending requests do complete before it returns. Can dwc3 do this? (And what if at some time in the future we want to start
The dwc3 can do that, but we need to note that usb_ep_disable() must be executed in process context and might sleep. I suspect we may run into some issues from some function drivers that expected usb_ep_disable() to be executable in interrupt context.
using an asynchronous bottom half for request completions, like usbcore does for URBs?)
Which one are you referring to? From what I see, even the host side expected ->endpoint_disable to be executed in process context.
Perhaps we can introduce endpoint_flush() on gadget side for synchronization if we want to keep usb_ep_disable() to be asynchronous?
Let's face it; the situation is a mess.
Glad you're here to help with the mess :)
Thanks, Thinh