Hi, Geert,
On 27.11.2024 18:28, Geert Uytterhoeven wrote:
Hi Claudiu,
Thanks for your patch, which is now commit 3791ea69a4858b81 ("serial: sh-sci: Clean sci_ports[0] after at earlycon exit") in tty/tty-next.
On Wed, Nov 6, 2024 at 1:02 PM Claudiu claudiu.beznea@tuxon.dev wrote:
From: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com
The early_console_setup() function initializes the sci_ports[0].port with an object of type struct uart_port obtained from the object of type struct earlycon_device received as argument by the early_console_setup().
It may happen that later, when the rest of the serial ports are probed, the serial port that was used as earlycon (e.g., port A) to be mapped to a different position in sci_ports[] and the slot 0 to be used by a different serial port (e.g., port B), as follows:
sci_ports[0] = port A sci_ports[X] = port B
Haven't you mixed A and B?
In this case, the new port mapped at index zero will have associated data that was used for earlycon.
Oops, do you have a simple reproducer for this?
It is reproducible with patches: - [PATCH 6/9] arm64: dts: renesas: rzg3s-smarc: Fix the debug serial alias - [PATCH 9/9] arm64: dts: renesas: r9a08g045s33-smarc-pmod: Add overlay for SCIF1
After boot, cat /dev/ttySC0 will lead to the issue described.
In case this happens, after Linux boot, any access to the serial port that maps on sci_ports[0] (port A) will block the serial port that was used as earlycon (port B).
Again, A <-> B?
To fix this, add early_console_exit() that clean the sci_ports[0] at earlycon exit time.
Fixes: 0b0cced19ab1 ("serial: sh-sci: Add CONFIG_SERIAL_EARLYCON support") Cc: stable@vger.kernel.org Signed-off-by: Claudiu Beznea claudiu.beznea.uj@bp.renesas.com
This causes a crash (lock-up without any output) when CONFIG_DEBUG_SPINLOCK=y (e.g. CONFIG_PROVE_LOCKING=y).
I missed to check this. Thank you for testing it.
--- a/drivers/tty/serial/sh-sci.c +++ b/drivers/tty/serial/sh-sci.c @@ -3546,6 +3546,32 @@ sh_early_platform_init_buffer("earlyprintk", &sci_driver, #ifdef CONFIG_SERIAL_SH_SCI_EARLYCON static struct plat_sci_port port_cfg __initdata;
+static int early_console_exit(struct console *co) +{
struct sci_port *sci_port = &sci_ports[0];
struct uart_port *port = &sci_port->port;
unsigned long flags;
int locked = 1;
if (port->sysrq)
locked = 0;
else if (oops_in_progress)
locked = uart_port_trylock_irqsave(port, &flags);
else
uart_port_lock_irqsave(port, &flags);
/*
* Clean the slot used by earlycon. A new SCI device might
* map to this slot.
*/
memset(sci_ports, 0, sizeof(*sci_port));
Nit: I'd rather use "*sci_port" instead of "sci_ports".
That would be better, indeed.
if (locked)
uart_port_unlock_irqrestore(port, flags);
"BUG: spinlock bad magic", as you've just cleared the port, including the spinlock.
I guess we can just remove all locking from this function to fix this?
I'll look to it.
However, could it happen that the new device taking slot 0 is probed before the early console is terminated?
I don't know to answer this. In my testing I haven't encountered it.
In that case, its active sci_ports[] entry would be cleared when early_console_exit() is called.
Also, what happens if "earlycon keep_bootcon" is passed on the kernel command line, and the new device takes slot 0?
I checked it with earlycon and the serial device being on slot 0. In this case it was OK.
Thanks!
return 0;
+}
static int __init early_console_setup(struct earlycon_device *device, int type) { @@ -3562,6 +3588,8 @@ static int __init early_console_setup(struct earlycon_device *device, SCSCR_RE | SCSCR_TE | port_cfg.scscr);
device->con->write = serial_console_write;
device->con->exit = early_console_exit;
return 0;
} static int __init sci_early_console_setup(struct earlycon_device *device,
Gr{oetje,eeting}s,
Geert