On 20.08.24 11:30, Beleswar Prasad Padhi wrote:
Hi Jan,
On 19-08-2024 22:17, Jan Kiszka wrote:
From: Jan Kiszka jan.kiszka@siemens.com
When k3_r5_cluster_rproc_exit is run, core 1 is shutdown and removed first. When core 0 should then be stopped before its removal, it will find core1->rproc as NULL already and crashes. Happens on rmmod e.g.
Did you check this on top of -next-20240820 tag? There was a series[0] which was merged recently which fixed this condition. I don't see this issue when trying on top of -next-20240820 tag. [0]: https://lore.kernel.org/all/20240808074127.2688131-1-b-padhi@ti.com/
I didn't try those yet, I was on 6.11-rcX. But from reading them quickly, I'm not seeing the two issues I found directly addressed there.
Fixes: 3c8a9066d584 ("remoteproc: k3-r5: Do not allow core1 to power up before core0 via sysfs") CC: stable@vger.kernel.org Signed-off-by: Jan Kiszka jan.kiszka@siemens.com
There might be one more because I can still make this driver crash after an operator error. Were error scenarios tested at all?
Can you point out what is this issue more specifically, and I can take this up then.
Try starting core1 before core0, and then again - system will hang or crash while trying to wipe ATCM. I do not understand this problem yet - same VA is used, and I cannot see where/how the region should have been unmapped in between.
Jan
drivers/remoteproc/ti_k3_r5_remoteproc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c index eb09d2e9b32a..9ebd7a34e638 100644 --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c @@ -646,7 +646,8 @@ static int k3_r5_rproc_stop(struct rproc *rproc) /* do not allow core 0 to stop before core 1 */ core1 = list_last_entry(&cluster->cores, struct k3_r5_core, elem); - if (core != core1 && core1->rproc->state != RPROC_OFFLINE) { + if (core != core1 && core1->rproc && + core1->rproc->state != RPROC_OFFLINE) { dev_err(dev, "%s: can not stop core 0 before core 1\n", __func__); ret = -EPERM;