On 4/2/2025 5:06 AM, Thinh Nguyen wrote:
Hi Frode,
On Tue, Apr 01, 2025, Krishna Kurapati wrote:
On 4/1/2025 5:38 PM, Frode Isaksen wrote:
On 4/1/25 7:43 AM, Krishna Kurapati wrote:
On 3/28/2025 4:14 PM, Frode Isaksen wrote:
From: Frode Isaksen frode@meta.com
The event count is read from register DWC3_GEVNTCOUNT. There is a check for the count being zero, but not for exceeding the event buffer length. Check that event count does not exceed event buffer length, avoiding an out-of-bounds access when memcpy'ing the event. Crash log: Unable to handle kernel paging request at virtual address ffffffc0129be000 pc : __memcpy+0x114/0x180 lr : dwc3_check_event_buf+0xec/0x348 x3 : 0000000000000030 x2 : 000000000000dfc4 x1 : ffffffc0129be000 x0 : ffffff87aad60080 Call trace: __memcpy+0x114/0x180 dwc3_interrupt+0x24/0x34
Signed-off-by: Frode Isaksen frode@meta.com Fixes: ebbb2d59398f ("usb: dwc3: gadget: use evt->cache for processing events") Cc: stable@vger.kernel.org
v1 -> v2: Added Fixes and Cc tag.
This bug was discovered, tested and fixed (no more crashes seen) on Meta Quest 3 device. Also tested on T.I. AM62x board.
drivers/usb/dwc3/gadget.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 63fef4a1a498..548e112167f3 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -4564,7 +4564,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt) count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0)); count &= DWC3_GEVNTCOUNT_MASK; - if (!count) + if (!count || count > evt->length) return IRQ_NONE; evt->count = count;
I did see this issue previously ([1] on 5.10) on SAR2130 (upstreamed recently). Can you help check if the issue is same on your end if you can reproduce it easily. Thinh also provided some debug pointers to check suspecting it to be a HW issue.
Seems to be exactly the same issue, and your fix looks OK as well. I'm happy to abandon my patch and let yo provide the fix.
NAK. I tried to skip copying data beyond 4K which is not the right approach. Thinh was tending more towards your line of code changes. So your code looks fine, but an error log indicating the presence of this issue might be helpful.
Note that I am not able to reproduce this locally and it happens very seldom.
It was very hard to reproduce this issue. Only two instances reported on SAR2130 on my end.
I still wonder what's current behavior of the HW to properly respond here. If the device is dead, register read often returns all Fs, which may be the case you're seeing here. If so, we should properly prevent the driver from accessing the device and properly teardown the driver.
If this is a momentary bleep/lost of power in the device, perhaps your change here is sufficient and the driver can continue to access the device.
With the difficulty of reproducing this issue, can you confirm that the device still operates properly after this change?
Unfortunately, I did not test this particular change of returning when ev count is invalid. I stress tested the change of copying only 4K [1], but didn't see any issue. I suspect we didn't hit the issue later again in the course of 14 day testing.
[1]: https://lore.kernel.org/all/20230521100330.22478-1-quic_kriskura@quicinc.com...
Regards, Krishna,