On Thu, Aug 21, 2025 at 10:45:10AM +0100, James Clark wrote:
[...]
The flow is updated with this change:
CPU0 CPU1 etm4_enable() ` etm4_enable_sysfs() ` smp_call_function_single() ----> etm4_enable_hw_smp_call() ` coresight_take_mode(SYSFS) Failed, set back to DISABLED ` coresight_set_mode(DISABLED)
CPU idle: etm4_cpu_save() ` coresight_get_mode() ^^^ Read out the DISABLED mode
There's no lock though, so can't it do this:
CPU0 CPU1 etm4_enable() ` etm4_enable_sysfs() ` smp_call_function_single() ---> etm4_enable_hw_smp_call() `coresight_take_mode(SYSFS)
CPU idle: etm4_cpu_save() ` coresight_get_mode() ^^ Intermediate SYSFS mode
No.
The point is a target CPU will execute etm4_enable_hw_smp_call() and CPU idle flow sequentially. It is no chance for the idle flow to preempt the ETM flow. This is why using the target CPU to set the ETM state machine will dismiss the race condition.
This is why I was voting for changing the compare and swap mode stuff to spinlocks so we can be sure it's correct. The lock shouldn't be released until the mode is at its final value.
The question is a spinlock usage. If both initiate CPUs and target CPU try to acquire the spinlock, the low level's CPU idle flow (etm4_cpu_save() / etm4_cpu_restore()) needs to wait for high level's SysFS operation to complete - which is what we try to avoid.
If we move the state machine on the target CPU - the operation and its state transition on the same CPU so that dismiss a race condition. And the idle flow is sequential to the ETM enabling/disabling, then, we can achieve lockless approach.
A global picture is ETM's state is used in multiple places (Sysfs knobs, ETM enabling / disabling, CPU idle and hotplug flow), if we start to use spinlock for exclusively access the statch machine, then everywhere will use it. This is why this series wants to do reliable state transition based on appropriate atomic APIs (it also can promise atomicity and ordering but lockless).
Thanks, Leo