On Tue, Oct 11, 2022 at 11:54 AM Ferry Toth fntoth@gmail.com wrote:
Hi,
On 10-10-2022 at 23:35, Andrey Smirnov wrote:
On Mon, Oct 10, 2022 at 1:52 PM Ferry Toth fntoth@gmail.com wrote:
Hi
On 10-10-2022 at 13:04, Ferry Toth wrote:
Hi
On 10-10-2022 07:02, Andrey Smirnov wrote:
On Fri, Oct 7, 2022 at 6:07 AM Ferry Toth fntoth@gmail.com wrote:
On 07-10-2022 04:11, Thinh Nguyen wrote:
> On Thu, Oct 06, 2022, Ferry Toth wrote:
>> Hi
>>
>> On 06-10-2022 04:12, Thinh Nguyen wrote:
>>> On Wed, Oct 05, 2022, Ferry Toth wrote:
>>>> Hi,
>>>>
>>>> Thanks!
>>>>
>>>> Does the failure only happen the first time host is initialized? Or can
>>>> it recover after switching to device then back to host mode?
>>>>
>>>> I can switch back and forth and device mode works each time, host mode
>>>> remains dead.
>>> Ok.
>>>
>>>> Probably the failure happens if some step(s) in dwc3_core_init() hasn't
>>>> completed.
>>>>
>>>> tusb1210 is a phy driver right? The issue is probably because we didn't
>>>> initialize the phy yet. So, I suspect placing dwc3_get_extcon() after
>>>> initializing the phy will probably solve the dependency problem.
>>>>
>>>> You can try something for yourself or I can provide something to test
>>>> later if you don't mind (maybe next week if it's ok).
>>>>
>>>> Yes, the code move I mentioned above "moves dwc3_get_extcon() until after
>>>> dwc3_core_init() but just before dwc3_core_init_mode(). AFAIU initially
>>>> dwc3_get_extcon() was called from within dwc3_core_init_mode() but only for
>>>> case USB_DR_MODE_OTG. So with this change order of events is more or less
>>>> unchanged" solves the issue.
>>>>
>>> I saw the experiment you did from the link you provided. We want to also
>>> confirm exactly which step in dwc3_core_init() was needed.
>> Ok. I first tried the code move suggested by Andrey (didn't work). Then
>> after reading the actual code I moved a bit further.
>>
>> This move was on top of -rc6 without any reverts. I did not make additional
>> changes to dwc3_core_init()
>>
>> So current v6.0 has: dwc3_get_extcon - dwc3_get_dr_mode - ... - dwc3_core_init - .. - dwc3_core_init_mode (not working)
>>
>> I changed to: dwc3_get_dr_mode - dwc3_get_extcon - .. - dwc3_core_init - .. - dwc3_core_init_mode (no change)
>>
>> Then to: dwc3_get_dr_mode - .. - dwc3_core_init - .. - dwc3_get_extcon - dwc3_core_init_mode (works)
>>
>> .. are what I believe for this issue irrelevant calls to
>> dwc3_alloc_scratch_buffers, dwc3_check_params and dwc3_debugfs_init.
>>
> Right. Thanks for narrowing it down. There are still many steps in
> dwc3_core_init(). We have some suspicion, but we still haven't confirmed
> the exact cause of the failure. We can write a proper patch once we know
> the reason.
If you would like me to test your suspicion, just tell me what to do :-)
OK, Ferry, I think I'm going to need clarification on specifics on your test setup. Can you share your kernel config, maybe your "/proc/config.gz", somewhere? When you say you are running vanilla Linux, do you mean it or do you mean vanilla tree + some patch delta?
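For reference, a minimal sketch of pulling the relevant options out of a kernel config so the two setups can be compared. It assumes the running kernel was built with CONFIG_IKCONFIG_PROC (otherwise /proc/config.gz does not exist), and the helper name and the list of symbols are only a guess at what matters for this thread, not something anyone above specified:

```shell
# Hypothetical helper: print the dwc3/ULPI/tusb1210/extcon options
# from a kernel config. Usage: show_usb_config [config-file]
# Defaults to /proc/config.gz (needs CONFIG_IKCONFIG_PROC).
show_usb_config() {
    cfg="${1:-/proc/config.gz}"
    # Symbol list is an assumption; extend it as needed.
    pattern='CONFIG_(USB_DWC3|USB_ULPI_BUS|PHY_TUSB1210|EXTCON_INTEL_MRFLD)'
    case "$cfg" in
        *.gz) zcat "$cfg" ;;   # compressed config, e.g. /proc/config.gz
        *)    cat "$cfg" ;;    # plain .config from a build tree
    esac | grep -E "$pattern"
}
```

Running `show_usb_config` on the target, or `show_usb_config path/to/.config` against a build tree, lists both the enabled options and the "is not set" lines, which makes it easy to see whether a driver is built-in, a module, or disabled.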
For v6.0 I can get the exact details tonight. But earlier I had this for v5.17:
https://github.com/htot/meta-intel-edison/blob/master/meta-intel-edison-bsp/...
There are 2 patches referred to in #67 and #68. One is related to the infinite loop. The other is, I believe, also needed to get dwc3 to work.
All the kernel config options are applied as .cfg fragments.
Patches and .cfg's here:
https://github.com/htot/meta-intel-edison/tree/master/meta-intel-edison-bsp/...
Updated Yocto recipe for v6.0 here:
https://github.com/htot/meta-intel-edison/blob/honister/meta-intel-edison-bs...
#75-#77 are the 2 reverts from Andy, + one SOF revert (not related to this thread).
Please drop all of this https://github.com/htot/meta-intel-edison/blob/honister/meta-intel-edison-bs... and redo the testing. Assuming things are still broken, that's how you want to do the bisecting.
I removed 4 patches:
0043b-TODO-driver-core-Break-infinite-loop-when-deferred-p.patch
0044-REVERTME-usb-dwc3-gadget-skip-endpoints-ep-18-in-out.patch
0001-Revert-USB-fixup-for-merge-issue-with-usb-dwc3-Don-t.patch
0001-Revert-usb-dwc3-Don-t-switch-OTG-peripheral-if-extco.patch
Please remove all custom patches so we are on the same page. I don't suspect the 8250-related changes to affect anything, but I also would like to be testing the same thing.
I'm testing vanilla v6.0 and indeed, as you expect, the kernel boots (no infinite loop). However, dwc3 host mode is not working as in your case; device mode works fine (Yocto configures a set of gadgets for me).
What do you do to test host mode working? lsusb? Something else? Asking to make sure I'm doing something equivalent on my end.
Just to be sure whether I could have bisected without 0043a, I added back the 2 0001-Revert* patches and indeed I ran into the infinite loop, with the console continuously spitting out:
debugfs: Directory 'dwc3.0.auto' with parent 'ulpi' already present!
tusb1210 dwc3.0.auto.ulpi: error -110 writing val 0x41 to reg 0x80
so yes it seems either 0043b or your patch "usb: dwc3: Don't switch OTG -> peripheral if extcon is present" is needed to boot (break the infinite loop). But your patch is in my case not sufficient to make host mode work.
Next step would be to establish if USB is working before my patch. You should be able to avoid the boot loop if you disable the "phy-tusb1210" driver. The driver fails to probe anyway, so it's not very likely to be crucial for functioning, so it should allow you to try things with my patch reverted:
git revert 8bd6b8c4b100 0f0101719138
After that, if things start working, it'd make sense to re-do your function re-arranging experiment to re-validate it.
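A minimal sketch of the "disable the driver" step, assuming the Kconfig symbol is CONFIG_PHY_TUSB1210 (the name used in mainline drivers/phy/ti/Kconfig; please check it against your tree) and that it is applied as a .cfg fragment like the other options in this setup:

```
# Assumed symbol name for the "phy-tusb1210" driver; verify in your tree.
# CONFIG_PHY_TUSB1210 is not set
```

If other options select this symbol, the fragment alone may not be enough, so it is worth confirming the result in the final .config after the build.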
As I understand it, this depends a bit on the timing. I might have a different initrd (built by Yocto vs. Buildroot). For instance, I see I have extcon-intel-mrfld in the initrd and dwc3 / phy-tusb1210 built-in.
You mentioned that your rootfs image does some gadget configuration for you. Can this be disabled? If yes, it'd make sense to check if this could be a variable explaining the difference.
What U-Boot version are you running? AFAICT U-Boot will touch that particular IP block, so this might be somewhat relevant.