On 21-08-2024 23:40, Jan Kiszka wrote:
On 21.08.24 07:30, Beleswar Prasad Padhi wrote:
On 19-08-2024 20:54, Jan Kiszka wrote:
From: Jan Kiszka <jan.kiszka@siemens.com>
By simply bailing out, the driver was violating its rule and internal
assumptions that either both or no rproc should be initialized. E.g.,
this could cause the first core to be available but not the second one,
leading to crashes on its shutdown later on while trying to dereference
that second instance.

If the rproc is registered with a device lifecycle managed function (devm_rproc_add()), bailing out with an error code will work.
Fixes: 61f6f68447ab ("remoteproc: k3-r5: Wait for core0 power-up before powering up core1")
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
 drivers/remoteproc/ti_k3_r5_remoteproc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
index 39a47540c590..eb09d2e9b32a 100644
--- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
+++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
@@ -1332,7 +1332,7 @@ static int k3_r5_cluster_rproc_init(struct platform_device *pdev)
 			dev_err(dev,
 				"Timed out waiting for %s core to power up!\n",
 				rproc->name);
-			return ret;
+			goto err_powerup;
 		}
 	}
 
@@ -1348,6 +1348,7 @@ static int k3_r5_cluster_rproc_init(struct platform_device *pdev)
 		}
 	}
 
+err_powerup:
 	rproc_del(rproc);
Please use devm_rproc_add() to avoid having to do rproc_del() manually here.
This is just the tip of the iceberg. The whole code needs to be reworked accordingly so that we can drop these gotos, not just this one.
You are correct. Unfortunately, the organic growth of this driver has resulted in a need to refactor. I plan to do this and post the refactoring soon. This should be part of the overall refactoring as suggested by Mathieu[2]. But for the immediate problem, your fix does patch things up, hence:
Acked-by: Beleswar Padhi <b-padhi@ti.com>
[2]: https://lore.kernel.org/all/Zr4w8Vj0mVo5sBsJ@p14s/
Just look at k3_r5_reserved_mem_init. Your change in [1] was also too early in this regard, breaking current error handling additionally.
Curious, could you point out how the change in [1] breaks the current error handling?
I'll stop my whac-a-mole. Someone needs to sit down and do that for the complete code consistently. And test the error cases.
Jan
[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?...
 err_add:
 	k3_r5_reserved_mem_exit(kproc);
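
To illustrate the "either both or no rproc" rule discussed in this thread, here is a minimal user-space sketch of the goto-based unwind. It is not the driver code: struct fake_core, fake_core_add(), fake_core_del(), fake_core_wait_powerup() and fake_cluster_init() are all hypothetical stand-ins invented for this example, and the power-up timeout is simulated with a flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#define NUM_CORES 2

/* Hypothetical stand-in for one remote core; not the driver's real struct. */
struct fake_core {
	const char *name;
	bool registered;	/* set by fake_core_add(), cleared by fake_core_del() */
	bool fail_powerup;	/* knob to simulate the power-up timeout */
};

/* Stand-in for registering a core (assumed to always succeed here). */
static int fake_core_add(struct fake_core *core)
{
	core->registered = true;
	return 0;
}

/* Stand-in for unregistering a core. */
static void fake_core_del(struct fake_core *core)
{
	core->registered = false;
}

/* Stand-in for waiting on power-up; reports a timeout when asked to. */
static int fake_core_wait_powerup(const struct fake_core *core)
{
	return core->fail_powerup ? -ETIMEDOUT : 0;
}

/*
 * Register every core or none. A bare "return ret" on the timeout path
 * would leave the current core (and any earlier ones) registered; jumping
 * to err_powerup undoes the current core first and then falls through to
 * undo the earlier ones.
 */
static int fake_cluster_init(struct fake_core cores[NUM_CORES])
{
	int i, ret;

	for (i = 0; i < NUM_CORES; i++) {
		ret = fake_core_add(&cores[i]);
		if (ret)
			goto err_unwind;

		ret = fake_core_wait_powerup(&cores[i]);
		if (ret) {
			fprintf(stderr, "Timed out waiting for %s to power up!\n",
				cores[i].name);
			goto err_powerup;
		}
	}

	return 0;

err_powerup:
	fake_core_del(&cores[i]);
err_unwind:
	while (--i >= 0)
		fake_core_del(&cores[i]);
	return ret;
}
```

Calling fake_cluster_init() on an array whose second entry has fail_powerup set should return -ETIMEDOUT and leave both entries unregistered. A device-managed registration such as devm_rproc_add(), as suggested above, would move this unwinding out of the function and into the device's release path.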