On Mon, Apr 13, 2026 at 01:45:50PM +0800, Jie Gan wrote:
[...]
> > @@ -1787,15 +1808,32 @@ static int coresight_pm_save(struct coresight_path *path)
> > +	to = list_prev_entry(coresight_path_last_node(path), link);
> > +	coresight_disable_path_from_to(path, from, to);
> > +
> > +	ret = coresight_pm_device_save(coresight_get_sink(path));
> > +	if (ret)
> > +		goto sink_failed;
> > +
> > +	return 0;
> > +
> > +sink_failed:
> > +	if (!coresight_enable_path_from_to(path, coresight_get_mode(source),
> > +					   from, to))
> > +		coresight_pm_device_restore(source);
>
> I have gone through the history messages, and I have a question about this
> point: how can we handle the scenario where coresight_enable_path_from_to()
> fails? It means we never call coresight_pm_device_restore() for the ETM,
> leaving the ETM in the OS lock state until the CPU resets.
From a design perspective, if any failure occurs in the idle flow, the priority is to avoid making things worse, especially partial enable/disable sequences that could lead to lockups.

The case you mentioned is a typical risk: if the path between the source and the sink fails to be re-enabled, it is unsafe to go further and enable the source anyway. We rely on the per-CPU flag "percpu_pm_failed" to disable idle states. Likewise, if ETE/TRBE fails to be disabled and the CPU is then turned off, that might also cause a lockup.
> Consider that we are calling etm4_disable_hw() with the OS lock set:
> etm4_disable_hw() -> etm4_disable_trace_unit() -> etm4x_wait_status()
> (may time out here?)
This is expected. I don't want to introduce a _recovery_ mechanism for CPU PM failures; that would be complex and over-engineered. The CPU PM notifier is low-level code, and in my experience PM issues are easily observed once CPU idle is enabled, so they should be resolved during the development phase.

In many cases PM issues are not caused by the CoreSight drivers themselves but by other modules (e.g., clock or regulator drivers). The log "Failed in coresight PM save ..." points developers at such bugs. As said, percpu_pm_failed is a last resort to prevent the platform from locking up when a PM bug is present.
Thanks,
Leo