On Thu, Oct 10, 2024 at 03:30:05PM -0700, Doug Anderson wrote:
On Wed, Oct 9, 2024 at 7:10 AM Johan Hovold <johan@kernel.org> wrote:
On Thu, Oct 03, 2024 at 11:30:08AM -0700, Doug Anderson wrote:
Hmmm, when I look at that commit it makes me think that the problem that commit e83766334f96 ("tty: serial: qcom_geni_serial: No need to stop tx/rx on UART shutdown") was fixing was re-introduced by commit d8aca2f96813 ("tty: serial: qcom-geni-serial: stop operations in progress at shutdown"). ...and indeed, it was. :(
I can't interact with kgdb if I do this:
- ssh over to DUT
- Kill the console process (on ChromeOS stop console-ttyMSM0)
- Drop in the debugger (echo g > /proc/sysrq-trigger)
Yeah, don't do that then. ;)
The problem is, I don't always have a choice. As talked about in the message of commit e83766334f96 ("tty: serial: qcom_geni_serial: No need to stop tx/rx on UART shutdown"), the above steps attempt to simulate what happened organically: a crash in late shutdown. During shutdown the agetty has been killed by the init system and I don't have a choice about it. If I get a kernel crash at that point (which isn't uncommon since shutdown code tends to trigger seldom-used code paths) then I can't debug it. :(
Ok, thanks for clarifying.
Not sure how your "console process" works, but this should only happen if you do not enable the serial console (console=ttyMSM0) and then try to use a polled console (as enabling the console will prevent port shutdown from being called).
That simply doesn't seem to be the case for me. The port shutdown seems to be called. To confirm, I put a printout at the start of qcom_geni_serial_shutdown(). I see in my /proc/cmdline:
console=ttyMSM0,115200n8
...and I indeed verify that I see console messages on my UART. I then run:
stop console-ttyMSM0
...and I see on the UART:
[ 92.916964] DOUG: qcom_geni_serial_shutdown
[ 92.922703] init: console-ttyMSM0 main process (611) killed by TERM signal
Console messages keep coming out the UART even though the agetty isn't there.
And this is with a Chromium kernel, not mainline?
If you take a look at tty_port_shutdown() there's a hack in there for consoles that was added back in 2010 and that prevents shutdown() from being called for console ports.
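To illustrate the console special case described above, here is a minimal userspace model in C. It is only a sketch: struct model_port, model_driver_shutdown() and model_tty_port_shutdown() are hypothetical names made up for this example, and the real tty_port_shutdown() also handles locking and modem control lines, which are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tty port; not the kernel's struct tty_port. */
struct model_port {
	bool console;        /* port is in use as a kernel console */
	bool initialized;    /* port has been started up */
	int shutdown_calls;  /* how often the driver's shutdown hook ran */
};

/* Models the driver's shutdown hook (e.g. stopping TX/RX). */
static void model_driver_shutdown(struct model_port *p)
{
	p->shutdown_calls++;
}

/* Models the guard: console ports never reach the driver's shutdown hook. */
static void model_tty_port_shutdown(struct model_port *p)
{
	assert(p != NULL);

	if (p->console)
		return;

	if (p->initialized) {
		p->initialized = false;
		model_driver_shutdown(p);
	}
}
```

In this model, closing a port with console set leaves shutdown_calls at zero, while the same close on a non-console port runs the hook exactly once.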
But perhaps you manage to hit shutdown() via some other path. Serial core is not yet using tty_port_hangup() so a hangup might trigger that...
Could you check that with a dump_stack()?
Now I (via ssh) drop into the debugger:
echo g > /proc/sysrq-trigger
I see the "kgdb" prompt but I can't interact with it because qcom_geni_serial_shutdown() stopped RX.
How about simply amending poll_get_char() so that it enables the receiver if it's not already enabled?
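Roughly what I have in mind, written as a userspace model in C rather than as a patch against the driver. Everything here is an assumption for illustration: struct model_uart, model_start_rx(), model_poll_get_char() and MODEL_NO_POLL_CHAR are hypothetical names, and the real driver would have to program the hardware instead of flipping a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stands in for the kernel's "no character available" return value. */
#define MODEL_NO_POLL_CHAR (-1)

/* Hypothetical stand-in for the UART state; not the driver's real structure. */
struct model_uart {
	bool rx_enabled;      /* models whether the receiver is running */
	const char *rx_fifo;  /* pending input, consumed one byte at a time */
};

/* Models (re)starting the receiver. */
static void model_start_rx(struct model_uart *u)
{
	u->rx_enabled = true;
}

/* Polled read: make sure RX is running before looking for a character. */
static int model_poll_get_char(struct model_uart *u)
{
	assert(u != NULL);

	/* If shutdown() stopped the receiver earlier, turn it back on here. */
	if (!u->rx_enabled)
		model_start_rx(u);

	if (u->rx_fifo == NULL || *u->rx_fifo == '\0')
		return MODEL_NO_POLL_CHAR;

	return (unsigned char)*u->rx_fifo++;
}
```

The point of the sketch is only that the polled path would no longer depend on whatever state shutdown() left the receiver in.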
Johan