On 4/24/25 04:30, Yabin Cui wrote:
> Similar to ETE, TRBE may lose its context when a CPU enters low
> power state. To make things worse, if ETE state is restored without
> restoring TRBE state, it can lead to a stuck CPU due to an enabled
> source device with no enabled sink devices.
Could you please provide some more details about when the cpu gets
stuck e.g dmesg, traces etc. Also consider adding those details in
the commit message as well to establish the problem, this patch is
trying to address.
>
> This patch introduces support for "arm,coresight-loses-context-with-cpu"
> in the TRBE driver. When present, TRBE registers are saved before
> and restored after CPU low power state. To prevent CPU hangs, TRBE
> state is always saved after ETE state and restored after ETE state.
The save and restore order here is
1. Save ETE
2. Save TRBE
3. Restore ETE
4. Restore TRBE
>
> Signed-off-by: Yabin Cui <yabinc(a)google.com>
> ---
> .../coresight/coresight-etm4x-core.c | 13 ++++-
> drivers/hwtracing/coresight/coresight-trbe.c | 53 +++++++++++++++++++
> include/linux/coresight.h | 6 +++
> 3 files changed, 71 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index e5972f16abff..1bbaa1249206 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -1863,6 +1863,7 @@ static int __etm4_cpu_save(struct etmv4_drvdata *drvdata)
> static int etm4_cpu_save(struct etmv4_drvdata *drvdata)
> {
> int ret = 0;
> + struct coresight_device *sink;
>
> /* Save the TRFCR irrespective of whether the ETM is ON */
> if (drvdata->trfcr)
> @@ -1871,8 +1872,14 @@ static int etm4_cpu_save(struct etmv4_drvdata *drvdata)
> * Save and restore the ETM Trace registers only if
> * the ETM is active.
> */
> - if (coresight_get_mode(drvdata->csdev) && drvdata->save_state)
> + if (coresight_get_mode(drvdata->csdev) && drvdata->save_state) {
> ret = __etm4_cpu_save(drvdata);
> + if (ret == 0) {
> + sink = coresight_get_percpu_sink(drvdata->cpu);
> + if (sink && sink_ops(sink)->percpu_save)
> + sink_ops(sink)->percpu_save(sink);
> + }
> + }
> return ret;
> }
>
> @@ -1977,6 +1984,10 @@ static void __etm4_cpu_restore(struct etmv4_drvdata *drvdata)
>
> static void etm4_cpu_restore(struct etmv4_drvdata *drvdata)
> {
> + struct coresight_device *sink = coresight_get_percpu_sink(drvdata->cpu);
> +
> + if (sink && sink_ops(sink)->percpu_restore)
> + sink_ops(sink)->percpu_restore(sink);
TRBE gets restored first which contradicts the order mentioned
earlier in the commit message ?
> if (drvdata->trfcr)
> write_trfcr(drvdata->save_trfcr);
> if (drvdata->state_needs_restore)
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index fff67aac8418..38bf46951a82 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -115,6 +115,13 @@ static int trbe_errata_cpucaps[] = {
> */
> #define TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES 256
>
> +struct trbe_save_state {
> + u64 trblimitr;
> + u64 trbbaser;
> + u64 trbptr;
> + u64 trbsr;
> +};
This seems to accommodate all required TRBE registers. Although this is
very intuitive could you please also add a comment above this structure
explaining the elements like other existing structures in the file ?
> +
> /*
> * struct trbe_cpudata: TRBE instance specific data
> * @trbe_flag - TRBE dirty/access flag support
> @@ -123,6 +130,9 @@ static int trbe_errata_cpucaps[] = {
> * @cpu - CPU this TRBE belongs to.
> * @mode - Mode of current operation. (perf/disabled)
> * @drvdata - TRBE specific drvdata
> + * @state_needs_save - Need to save trace registers when entering cpu idle
> + * @state_needs_restore - Need to restore trace registers when exiting cpu idle
> + * @save_state - Saved trace registers
> * @errata - Bit map for the errata on this TRBE.
> */
> struct trbe_cpudata {
> @@ -133,6 +143,9 @@ struct trbe_cpudata {
> enum cs_mode mode;
> struct trbe_buf *buf;
> struct trbe_drvdata *drvdata;
> + bool state_needs_save;
This tracks whether coresight_loses_context_with_cpu() is supported
on the CPU or not.
> + bool state_needs_restore;
This tracks whether the state has been saved earlier and hence can
be restored later on when required.
> + struct trbe_save_state save_state;
> DECLARE_BITMAP(errata, TRBE_ERRATA_MAX);
> };
>
> @@ -1187,12 +1200,49 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev)
> return IRQ_HANDLED;
> }
>
> +static void arm_trbe_cpu_save(struct coresight_device *csdev)
> +{
> + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
> + struct trbe_save_state *state = &cpudata->save_state;
Please move the above statement after the following conditional
block. Because struct trbe_save_state is not going to be required
if arm_trbe_cpu_save() returns prematurely from here.
> +
> + if (cpudata->mode == CS_MODE_DISABLED || !cpudata->state_needs_save)
> + return;> +
> + state->trbbaser = read_sysreg_s(SYS_TRBBASER_EL1);
> + state->trbptr = read_sysreg_s(SYS_TRBPTR_EL1);
> + state->trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
> + state->trbsr = read_sysreg_s(SYS_TRBSR_EL1);
> + cpudata->state_needs_restore = true;
This looks right.
> +}
> +
> +static void arm_trbe_cpu_restore(struct coresight_device *csdev)
> +{
> + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
> + struct trbe_save_state *state = &cpudata->save_state;
Please move this assignment inside the block where these registers
actually get restored after all checks are done.
> +
> + if (cpudata->state_needs_restore) {
> + /*
> + * To avoid disruption of normal tracing, restore trace
> + * registers only when TRBE lost power (TRBLIMITR == 0).
> + */
Although this sounds like a reasonable condition, but what happens
when TRBE restoration is skipped ?
> + if (read_sysreg_s(SYS_TRBLIMITR_EL1) == 0) {
state = &cpudata->save_state
> + write_sysreg_s(state->trbbaser, SYS_TRBBASER_EL1);
> + write_sysreg_s(state->trbptr, SYS_TRBPTR_EL1);
> + write_sysreg_s(state->trbsr, SYS_TRBSR_EL1);
> + set_trbe_enabled(cpudata, state->trblimitr);
> + }
> + cpudata->state_needs_restore = false;
That means earlier captured state is dropped without a restoration.
> + }
> +}
> +
> static const struct coresight_ops_sink arm_trbe_sink_ops = {
> .enable = arm_trbe_enable,
> .disable = arm_trbe_disable,
> .alloc_buffer = arm_trbe_alloc_buffer,
> .free_buffer = arm_trbe_free_buffer,
> .update_buffer = arm_trbe_update_buffer,
> + .percpu_save = arm_trbe_cpu_save,
> + .percpu_restore = arm_trbe_cpu_restore,
Why this percpu_* prefix ?
> };
>
> static const struct coresight_ops arm_trbe_cs_ops = {
> @@ -1358,6 +1408,9 @@ static void arm_trbe_probe_cpu(void *info)
> cpudata->trbe_flag = get_trbe_flag_update(trbidr);
> cpudata->cpu = cpu;
> cpudata->drvdata = drvdata;
> + cpudata->state_needs_save = coresight_loses_context_with_cpu(
> + &drvdata->pdev->dev);
> + cpudata->state_needs_restore = false;
The init here looks good.
> return;
> cpu_clear:
> cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
> index d79a242b271d..fec375d02535 100644
> --- a/include/linux/coresight.h
> +++ b/include/linux/coresight.h
> @@ -362,6 +362,10 @@ enum cs_mode {
> * @alloc_buffer: initialises perf's ring buffer for trace collection.
> * @free_buffer: release memory allocated in @get_config.
> * @update_buffer: update buffer pointers after a trace session.
> + * @percpu_save: saves state when CPU enters idle state.
> + * Only set for percpu sink.
> + * @percpu_restore: restores state when CPU exits idle state.
> + * only set for percpu sink.
> */
> struct coresight_ops_sink {
> int (*enable)(struct coresight_device *csdev, enum cs_mode mode,
> @@ -374,6 +378,8 @@ struct coresight_ops_sink {
> unsigned long (*update_buffer)(struct coresight_device *csdev,
> struct perf_output_handle *handle,
> void *sink_config);
> + void (*percpu_save)(struct coresight_device *csdev);
> + void (*percpu_restore)(struct coresight_device *csdev);
Again - why this percpu_* prefix ?
> };
>
> /**
On 2025/4/23 10:07, Jie Gan wrote:
>
>
> On 4/22/2025 8:52 PM, hejunhao wrote:
>>
>> On 2025/4/18 15:12, Jie Gan wrote:
>>>
>>>
>>> On 4/18/2025 1:58 PM, Junhao He wrote:
>>>> When trying to run perf and sysfs mode simultaneously, the WARN_ON()
>>>> in tmc_etr_enable_hw() is triggered sometimes:
>>>>
>>>> WARNING: CPU: 42 PID: 3911571 at drivers/hwtracing/coresight/
>>>> coresight-tmc-etr.c:1060 tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc]
>>>> [..snip..]
>>>> Call trace:
>>>> tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc] (P)
>>>> tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc] (L)
>>>> tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc]
>>>> coresight_enable_path+0x1c8/0x218 [coresight]
>>>> coresight_enable_sysfs+0xa4/0x228 [coresight]
>>>> enable_source_store+0x58/0xa8 [coresight]
>>>> dev_attr_store+0x20/0x40
>>>> sysfs_kf_write+0x4c/0x68
>>>> kernfs_fop_write_iter+0x120/0x1b8
>>>> vfs_write+0x2c8/0x388
>>>> ksys_write+0x74/0x108
>>>> __arm64_sys_write+0x24/0x38
>>>> el0_svc_common.constprop.0+0x64/0x148
>>>> do_el0_svc+0x24/0x38
>>>> el0_svc+0x3c/0x130
>>>> el0t_64_sync_handler+0xc8/0xd0
>>>> el0t_64_sync+0x1ac/0x1b0
>>>> ---[ end trace 0000000000000000 ]---
>>>>
>>>> Since the sysfs buffer allocation and the hardware enablement is not
>>>> in the same critical region, it's possible to race with the perf
>>>>
>>>> mode:
>>>> [sysfs mode] [perf mode]
>>>> tmc_etr_get_sysfs_buffer()
>>>> spin_lock(&drvdata->spinlock)
>>>> [sysfs buffer allocation]
>>>> spin_unlock(&drvdata->spinlock)
>>>> spin_lock(&drvdata->spinlock)
>>>> tmc_etr_enable_hw()
>>>> drvdata->etr_buf =
>>>> etr_perf->etr_buf
>>>> spin_unlock(&drvdata->spinlock)
>>>> spin_lock(&drvdata->spinlock)
>>>> tmc_etr_enable_hw()
>>>> WARN_ON(drvdata->etr_buf) // WARN sicne etr_buf initialized at
>>>> the perf side
>>>> spin_unlock(&drvdata->spinlock)
>>>>
>>>> To resolve this, configure the tmc-etr mode before invoking
>>>> `enable_perf()` or sysfs interfaces. Prior to mode configuration,
>>>> explicitly check if the tmc-etr sink is already enabled in a
>>>> different mode to prevent race conditions between mode transitions.
>>>> Furthermore, enforce spinlock protection around the critical
>>>> sections to serialize concurrent accesses from sysfs and perf
>>>> subsystems.
>>>
>>> Is any issue observed during this race condition?
>>>
>>> The etr_buf is checked before assign it to sysfs_buf or perf_buf.
>>> In my understanding, the warning raised to indicate the etr already
>>> enabled with sysfs mode or perf mode, so terminate current enable
>>> session, right?
>>>
>>> Thanks,
>>> Jie
>>>
>>
>> The sysfs task is executing the process to enable tmc-etr and has
>> allocated
>> a sysfs_buf. However, the sysfs spinlock does not ensure atomicity of
>> the
>> entire enable_etr_sink_sysfs() function execution. In this scenario,
>> the perf
>> session may preemptively call tmc_etr_enable_hw() to enable tmc-etr.
>>
>> Consequently, during race conditions, perf always prioritizes
>> execution, and
>> the sysfs_buf remains un-released, leading to memory leaks. Then this
>> triggers
>> a WARN_ON(drvdata->etr_buf) warning.
>
> The pre-allocated sysfs_buf is stored in drvdata->sysfs_buf. And I
> think it's fine to remain unreleased until the ETR is enabled with
> sysfs mode.
Yes, you're right. The tmc-etr is retaining the sysfs_buf, so no memory
leak occurs.
> The ETR will not setup a new sysfs_buf if drvdata->sysfs_buf already
> exists. But in other words, it definitely consumes some
> memory(especailly allocated a large size of sysfs_buf) for a while or
> forever if we dont enable the ETR with sysfs mode in the future?
>
> I think this is the issue try to address?
>
Patch 3 of patchset [1] is visible, serving as the basis, and you can
understand what the patch aims to achieve.
This patch aims to avoid race conditions, as warnings can lead to
concerns about potential hidden bugs, such
as getting out of sync. I may need to update the commit to include the
commit log for patch 1's fix in patchset [1].
[1]:
https://lore.kernel.org/linux-arm-kernel/20241202092419.11777-4-yangyicong@…
Best regards,
Junhao.
> Thanks,
> Jie
>
>>
>> This patch also addresses the issue described in [1], simplifying the
>> setup and
>> checks for "mode".
>>
>> [1] Closes: https://lore.kernel.org/linux-arm-
>> kernel/20241202092419.11777-2-yangyicong(a)huawei.com/
>>
>> Best regards,
>> Junhao.
>>
>>>>
>>>> Fixes: 296b01fd106e ("coresight: Refactor out buffer allocation
>>>> function for ETR")
>>>> Reported-by: Yicong Yang <yangyicong(a)hisilicon.com>
>>>> Closes: https://lore.kernel.org/linux-arm-
>>>> kernel/20241202092419.11777-2-yangyicong(a)huawei.com/
>>>> Signed-off-by: Junhao He <hejunhao3(a)huawei.com>
>>>> ---
>>>> .../hwtracing/coresight/coresight-tmc-etr.c | 77
>>>> +++++++++++--------
>>>> 1 file changed, 47 insertions(+), 30 deletions(-)
>>>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/
>>>> drivers/hwtracing/coresight/coresight-tmc-etr.c
>>>> index a48bb85d0e7f..3d94d64cacaa 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
>>>> @@ -1190,11 +1190,6 @@ static struct etr_buf
>>>> *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)
>>>> spin_lock_irqsave(&drvdata->spinlock, flags);
>>>> }
>>>> - if (drvdata->reading || coresight_get_mode(csdev) ==
>>>> CS_MODE_PERF) {
>>>> - ret = -EBUSY;
>>>> - goto out;
>>>> - }
>>>> -
>>>> /*
>>>> * If we don't have a buffer or it doesn't match the
>>>> requested size,
>>>> * use the buffer allocated above. Otherwise reuse the
>>>> existing buffer.
>>>> @@ -1205,7 +1200,6 @@ static struct etr_buf
>>>> *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)
>>>> drvdata->sysfs_buf = new_buf;
>>>> }
>>>> -out:
>>>> spin_unlock_irqrestore(&drvdata->spinlock, flags);
>>>> /* Free memory outside the spinlock if need be */
>>>> @@ -1216,7 +1210,7 @@ static struct etr_buf
>>>> *tmc_etr_get_sysfs_buffer(struct coresight_device *csdev)
>>>> static int tmc_enable_etr_sink_sysfs(struct coresight_device
>>>> *csdev)
>>>> {
>>>> - int ret = 0;
>>>> + int ret;
>>>> unsigned long flags;
>>>> struct tmc_drvdata *drvdata =
>>>> dev_get_drvdata(csdev->dev.parent);
>>>> struct etr_buf *sysfs_buf = tmc_etr_get_sysfs_buffer(csdev);
>>>> @@ -1226,23 +1220,10 @@ static int tmc_enable_etr_sink_sysfs(struct
>>>> coresight_device *csdev)
>>>> spin_lock_irqsave(&drvdata->spinlock, flags);
>>>> - /*
>>>> - * In sysFS mode we can have multiple writers per sink. Since
>>>> this
>>>> - * sink is already enabled no memory is needed and the HW need
>>>> not be
>>>> - * touched, even if the buffer size has changed.
>>>> - */
>>>> - if (coresight_get_mode(csdev) == CS_MODE_SYSFS) {
>>>> - csdev->refcnt++;
>>>> - goto out;
>>>> - }
>>>> -
>>>> ret = tmc_etr_enable_hw(drvdata, sysfs_buf);
>>>> - if (!ret) {
>>>> - coresight_set_mode(csdev, CS_MODE_SYSFS);
>>>> + if (!ret)
>>>> csdev->refcnt++;
>>>> - }
>>>> -out:
>>>> spin_unlock_irqrestore(&drvdata->spinlock, flags);
>>>> if (!ret)
>>>> @@ -1652,11 +1633,6 @@ static int tmc_enable_etr_sink_perf(struct
>>>> coresight_device *csdev, void *data)
>>>> struct etr_perf_buffer *etr_perf = etm_perf_sink_config(handle);
>>>> spin_lock_irqsave(&drvdata->spinlock, flags);
>>>> - /* Don't use this sink if it is already claimed by sysFS */
>>>> - if (coresight_get_mode(csdev) == CS_MODE_SYSFS) {
>>>> - rc = -EBUSY;
>>>> - goto unlock_out;
>>>> - }
>>>> if (WARN_ON(!etr_perf || !etr_perf->etr_buf)) {
>>>> rc = -EINVAL;
>>>> @@ -1685,7 +1661,6 @@ static int tmc_enable_etr_sink_perf(struct
>>>> coresight_device *csdev, void *data)
>>>> if (!rc) {
>>>> /* Associate with monitored process. */
>>>> drvdata->pid = pid;
>>>> - coresight_set_mode(csdev, CS_MODE_PERF);
>>>> drvdata->perf_buf = etr_perf->etr_buf;
>>>> csdev->refcnt++;
>>>> }
>>>> @@ -1698,14 +1673,56 @@ static int tmc_enable_etr_sink_perf(struct
>>>> coresight_device *csdev, void *data)
>>>> static int tmc_enable_etr_sink(struct coresight_device *csdev,
>>>> enum cs_mode mode, void *data)
>>>> {
>>>> + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>>>> + enum cs_mode old_mode;
>>>> + int rc;
>>>> +
>>>> + scoped_guard(spinlock_irqsave, &drvdata->spinlock) {
>>>> + old_mode = coresight_get_mode(csdev);
>>>> + if (old_mode != CS_MODE_DISABLED && old_mode != mode)
>>>> + return -EBUSY;
>>>> +
>>>> + if (drvdata->reading)
>>>> + return -EBUSY;
>>>> +
>>>> + /*
>>>> + * In sysFS mode we can have multiple writers per sink.
>>>> Since this
>>>> + * sink is already enabled no memory is needed and the HW
>>>> need not be
>>>> + * touched, even if the buffer size has changed.
>>>> + */
>>>> + if (old_mode == CS_MODE_SYSFS) {
>>>> + csdev->refcnt++;
>>>> + return 0;
>>>> + }
>>>> +
>>>> + /*
>>>> + * minor note:
>>>> + * When sysfs-task1 get locked, it setup the mode first. Then
>>>> + * sysfs-task2 gets locked,it will directly return success
>>>> even
>>>> + * when the tmc-etr is not enabled at this moment.
>>>> Ultimately,
>>>> + * sysfs-task1 will still successfully enable tmc-etr.
>>>> + * This is a transient state and does not cause an anomaly.
>>>> + */
>>>> + coresight_set_mode(csdev, mode);
>>>> + }
>>>> +
>>>> switch (mode) {
>>>> case CS_MODE_SYSFS:
>>>> - return tmc_enable_etr_sink_sysfs(csdev);
>>>> + rc = tmc_enable_etr_sink_sysfs(csdev);
>>>> + break;
>>>> case CS_MODE_PERF:
>>>> - return tmc_enable_etr_sink_perf(csdev, data);
>>>> + rc = tmc_enable_etr_sink_perf(csdev, data);
>>>> + break;
>>>> default:
>>>> - return -EINVAL;
>>>> + rc = -EINVAL;
>>>> }
>>>> +
>>>> + scoped_guard(spinlock_irqsave, &drvdata->spinlock) {
>>>> + if (rc && old_mode != mode)
>>>> + coresight_set_mode(csdev, old_mode);
>>>> + }
>>>> +
>>>> + return rc;
>>>> }
>>>> static int tmc_disable_etr_sink(struct coresight_device *csdev)
>>>
>>> .
>>>
>>
>
> .
>
The system on chip (SoC) consists of main APSS(Applications processor
subsytem) and additional processors like modem, lpass. There is
coresight-etm driver for etm trace of APSS. Coresight remote etm driver
is for enabling and disabling the etm trace of remote processors.
It uses QMI interface to communicate with remote processors' software
and uses coresight framework to configure the connection from remote
etm source to TMC sinks.
Example to capture the remote etm trace:
Enable source:
echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
echo 1 > /sys/bus/coresight/devices/remote_etm0/enable_source
Capture the trace:
cat /dev/tmc_etf0 > /data/remote_etm.bin
Disable source:
echo 0 > /sys/bus/coresight/devices/remote_etm0/enable_source
Changes since V4:
1. Add coresight QMI driver
2. Add coresight qmi node and qcom,qmi-id of modem-etm in msm8996 dtsi
Changes since V3:
1. Use different compatible for different remote etms in dt.
2. Get qmi instance id from the match table data in driver.
Change since V2:
1. Change qcom,inst-id to qcom,qmi-id
2. Fix the error in code for type of remote_etm_remove
3. Depend on QMI helper in Kconfig
Changes since V1:
1. Remove unused content
2. Use CORESIGHT_DEV_SUBTYPE_SOURCE_OTHERS as remote etm source type.
3. Use enabled instead of enable in driver data.
4. Validate instance id value where it's read from the DT.
Mao Jinlong (5):
dt-bindings: arm: Add CoreSight QMI component description
coresight: Add coresight QMI driver
dt-bindings: arm: Add qcom,qmi-id for remote etm
coresight: Add remote etm support
arm64: dts: qcom: msm8996: Add coresight qmi node
.../bindings/arm/qcom,coresight-qmi.yaml | 65 +++++
.../arm/qcom,coresight-remote-etm.yaml | 10 +
arch/arm64/boot/dts/qcom/msm8996.dtsi | 11 +
drivers/hwtracing/coresight/Kconfig | 24 ++
drivers/hwtracing/coresight/Makefile | 2 +
drivers/hwtracing/coresight/coresight-qmi.c | 209 +++++++++++++++
drivers/hwtracing/coresight/coresight-qmi.h | 107 ++++++++
.../coresight/coresight-remote-etm.c | 252 ++++++++++++++++++
8 files changed, 680 insertions(+)
create mode 100644 Documentation/devicetree/bindings/arm/qcom,coresight-qmi.yaml
create mode 100644 drivers/hwtracing/coresight/coresight-qmi.c
create mode 100644 drivers/hwtracing/coresight/coresight-qmi.h
create mode 100644 drivers/hwtracing/coresight/coresight-remote-etm.c
--
2.25.1
On 4/24/25 01:31, Yabin Cui wrote:
> On Tue, Apr 22, 2025 at 7:21 AM Leo Yan <leo.yan(a)arm.com> wrote:
>>
>> On Mon, Apr 21, 2025 at 02:58:18PM -0700, Yabin Cui wrote:
>>> The cs_etm PMU, regardless of the underlying trace sink (ETF, ETR or
>>> TRBE), doesn't require contiguous pages for its AUX buffer.
>>
>> Though contiguous pages are not mandatory for TRBE, I would set the
>> PERF_PMU_CAP_AUX_NO_SG flag for it. This can potentially benefit
>> performance.
>
> As explained in the patch 1/2, my use case periodically collects ETM data
> from the field (using both TRBE and ETR), and needs to reduce memory
> fragmentation. If the performance impact is big, we can make it user
> configurable. Otherwise, shall we default it to non-contiguous pages?
But is not that already happening ? cs_etm does not set the PMU cap
PERF_PMU_CAP_AUX_NO_SG that means it can allocate non-contig memory
chunk. Where am I missing ?
>
>>
>> For non per CPU sinks, it is fine to allocate non-contiguous pages.
>>
>> Thanks,
>> Leo
>>
>>> This patch adds the PERF_PMU_CAP_AUX_NON_CONTIGUOUS_PAGES capability
>>> to the cs_etm PMU. This allows the kernel to allocate non-contiguous
>>> pages for the AUX buffer, reducing memory fragmentation when using
>>> cs_etm.
>>>
>>> Signed-off-by: Yabin Cui <yabinc(a)google.com>
>>> ---
>>> drivers/hwtracing/coresight/coresight-etm-perf.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
>>> index f4cccd68e625..c98646eca7f8 100644
>>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
>>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
>>> @@ -899,7 +899,8 @@ int __init etm_perf_init(void)
>>> int ret;
>>>
>>> etm_pmu.capabilities = (PERF_PMU_CAP_EXCLUSIVE |
>>> - PERF_PMU_CAP_ITRACE);
>>> + PERF_PMU_CAP_ITRACE |
>>> + PERF_PMU_CAP_AUX_NON_CONTIGUOUS_PAGES);
>>>
>>> etm_pmu.attr_groups = etm_pmu_attr_groups;
>>> etm_pmu.task_ctx_nr = perf_sw_context;
>>> --
>>> 2.49.0.805.g082f7c87e0-goog
>>>
>>>
>
Hi,
On Wed, 23 Apr 2025 at 11:37, Jie Gan <jie.gan(a)oss.qualcomm.com> wrote:
>
> From: Mao Jinlong <quic_jinlmao(a)quicinc.com>
>
> From: Mao Jinlong <quic_jinlmao(a)quicinc.com>
>
What are these extra email addresses for?
> ETF may fail to re-enable after reading, and driver->reading will
> not be set to false, this will cause failure to enable/disable to ETF.
> This change set driver->reading to false even if re-enabling fail.
>
> Fixes: 669c4614236a7 ("coresight: tmc: Don't enable TMC when it's not ready.")
This SHA and message appear not be present in any upstream / coresight branch.
> Co-developed-by: Yuanfang Zhang <quic_yuanfang(a)quicinc.com>
> Signed-off-by: Yuanfang Zhang <quic_yuanfang(a)quicinc.com>
> Signed-off-by: Mao Jinlong <quic_jinlmao(a)quicinc.com>
> Signed-off-by: Jie Gan <jie.gan(a)oss.qualcomm.com>
> ---
> drivers/hwtracing/coresight/coresight-tmc-etf.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> index d858740001c2..8c9f14e36bc2 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> @@ -87,6 +87,12 @@ static void __tmc_etb_disable_hw(struct tmc_drvdata *drvdata)
> {
> CS_UNLOCK(drvdata->base);
>
> + /* Check if the etf already disabled*/
> + if (!(readl_relaxed(drvdata->base + TMC_CTL) & TMC_CTL_CAPT_EN)) {
> + CS_LOCK(drvdata->base);
> + return;
> + }
> +
What does this have to do with the stated function of the patch - this
is unnecessary.
Under what scenario can this function be called with the ETB
previously disabled?
> tmc_flush_and_stop(drvdata);
> /*
> * When operating in sysFS mode the content of the buffer needs to be
> @@ -747,7 +753,6 @@ int tmc_read_unprepare_etb(struct tmc_drvdata *drvdata)
> char *buf = NULL;
> enum tmc_mode mode;
> unsigned long flags;
> - int rc = 0;
>
> /* config types are set a boot time and never change */
> if (WARN_ON_ONCE(drvdata->config_type != TMC_CONFIG_TYPE_ETB &&
> @@ -773,11 +778,7 @@ int tmc_read_unprepare_etb(struct tmc_drvdata *drvdata)
> * can't be NULL.
> */
> memset(drvdata->buf, 0, drvdata->size);
> - rc = __tmc_etb_enable_hw(drvdata);
> - if (rc) {
> - raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
> - return rc;
> - }
> + __tmc_etb_enable_hw(drvdata);
Dropping a valid error check is not acceptable. If a TMC cannot be
re-enabled, then that is a hardware error that needs noting and
dealing with.
Regards
Mike
> } else {
> /*
> * The ETB/ETF is not tracing and the buffer was just read.
> --
> 2.34.1
>
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
On Tue, Apr 15, 2025 at 11:46:49AM -0700, Yabin Cui wrote:
> When enabling a SINK or LINK type coresight device fails, the
> associated helpers should be disabled.
>
> Signed-off-by: Yabin Cui <yabinc(a)google.com>
> Suggested-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
> ---
> drivers/hwtracing/coresight/coresight-core.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index fb43ef6a3b1f..d9fcea69d221 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -486,8 +486,10 @@ int coresight_enable_path(struct coresight_path *path, enum cs_mode mode,
> * that need disabling. Disabling the path here
> * would mean we could disrupt an existing session.
> */
> - if (ret)
> + if (ret) {
> + coresight_disable_helpers(csdev, path);
I think we can do better for code consolidation - we can use a central
place for error handling. I will give details below.
> goto out;
> + }
> break;
> case CORESIGHT_DEV_TYPE_SOURCE:
> /* sources are enabled from either sysFS or Perf */
> @@ -497,15 +499,17 @@ int coresight_enable_path(struct coresight_path *path, enum cs_mode mode,
> child = list_next_entry(nd, link)->csdev;
> ret = coresight_enable_link(csdev, parent, child, source);
> if (ret)
> - goto err;
> + goto err_disable_helpers;
> break;
> default:
> - goto err;
> + goto err_disable_helpers;
I know this is not a problem introduced by your patch - for an error
case, it returns 0. This will hide unexpected issues. I would like
to suggest to return -EINVAL for unknown types.
> }
> }
>
> out:
> return ret;
> +err_disable_helpers:
> + coresight_disable_helpers(csdev, path);
> err:
> coresight_disable_path_from(path, nd);
> goto out;
I am just wandering if we can handle errors in a unified way and
without using goto. I would change the code as below.
The point is to use a general flow for error handling, include a
sink error. For sink error, we still invoke
coresight_disable_path_from() for an empty operation.
Also, I think we need an additional patch for error handling in
coresight_enable_helpers(). If any errors are detected while enabling
a helper, we should disable the helpers that have already been
enabled.
Please let me know if you have any questions.
Thanks,
Leo
---8<---
diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index fb43ef6a3b1f..cf2a3708a05e 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -465,7 +465,7 @@ int coresight_enable_path(struct coresight_path *path, enum cs_mode mode,
/* Enable all helpers adjacent to the path first */
ret = coresight_enable_helpers(csdev, mode, path);
if (ret)
- goto err;
+ goto err_disable_path;
/*
* ETF devices are tricky... They can be a link or a sink,
* depending on how they are configured. If an ETF has been
@@ -487,7 +487,7 @@ int coresight_enable_path(struct coresight_path *path, enum cs_mode mode,
* would mean we could disrupt an existing session.
*/
if (ret)
- goto out;
+ goto err_disable_helpers;
break;
case CORESIGHT_DEV_TYPE_SOURCE:
/* sources are enabled from either sysFS or Perf */
@@ -497,18 +497,21 @@ int coresight_enable_path(struct coresight_path *path, enum cs_mode mode,
child = list_next_entry(nd, link)->csdev;
ret = coresight_enable_link(csdev, parent, child, source);
if (ret)
- goto err;
+ goto err_disable_helpers;
break;
default:
- goto err;
+ ret = -EINVAL;
+ goto err_disable_helpers;
}
}
-out:
- return ret;
-err:
+ return 0;
+
+err_disable_helpers:
+ coresight_disable_helpers(csdev, path);
+err_disable_path:
coresight_disable_path_from(path, nd);
- goto out;
+ return ret;
}
> --
> 2.49.0.604.gff1f9ca942-goog
>
Hi Yabin,
Sorry for late reply as I was on vacation for the past two weeks.
On Tue, Apr 15, 2025 at 11:46:48AM -0700, Yabin Cui wrote:
> When tracing ETM data on multiple CPUs concurrently via the
> perf interface, the CATU device is shared across different CPU
> paths. This can lead to race conditions when multiple CPUs attempt
> to enable or disable the CATU device simultaneously.
>
> To address these race conditions, this patch introduces the
> following changes:
>
> 1. The enable and disable operations for the CATU device are not
> reentrant. Therefore, a spinlock is added to ensure that only
> one CPU can enable or disable a given CATU device at any point
> in time.
>
> 2. A reference counter is used to manage the enable/disable state
> of the CATU device. The device is enabled when the first CPU
> requires it and is only disabled when the last CPU finishes
> using it. This ensures the device remains active as long as at
> least one CPU needs it.
>
> Signed-off-by: Yabin Cui <yabinc(a)google.com>
LGTM:
Reviewed-by: Leo Yan <leo.yan(a)arm.com>
As Jie reminded, please add James' review tag in next spin.
Thanks,
Leo
> ---
> drivers/hwtracing/coresight/coresight-catu.c | 25 +++++++++++++-------
> drivers/hwtracing/coresight/coresight-catu.h | 1 +
> 2 files changed, 18 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-catu.c b/drivers/hwtracing/coresight/coresight-catu.c
> index fa170c966bc3..3909b562b077 100644
> --- a/drivers/hwtracing/coresight/coresight-catu.c
> +++ b/drivers/hwtracing/coresight/coresight-catu.c
> @@ -458,12 +458,17 @@ static int catu_enable_hw(struct catu_drvdata *drvdata, enum cs_mode cs_mode,
> static int catu_enable(struct coresight_device *csdev, enum cs_mode mode,
> void *data)
> {
> - int rc;
> + int rc = 0;
> struct catu_drvdata *catu_drvdata = csdev_to_catu_drvdata(csdev);
>
> - CS_UNLOCK(catu_drvdata->base);
> - rc = catu_enable_hw(catu_drvdata, mode, data);
> - CS_LOCK(catu_drvdata->base);
> + guard(raw_spinlock_irqsave)(&catu_drvdata->spinlock);
> + if (csdev->refcnt == 0) {
> + CS_UNLOCK(catu_drvdata->base);
> + rc = catu_enable_hw(catu_drvdata, mode, data);
> + CS_LOCK(catu_drvdata->base);
> + }
> + if (!rc)
> + csdev->refcnt++;
> return rc;
> }
>
> @@ -486,12 +491,15 @@ static int catu_disable_hw(struct catu_drvdata *drvdata)
>
> static int catu_disable(struct coresight_device *csdev, void *__unused)
> {
> - int rc;
> + int rc = 0;
> struct catu_drvdata *catu_drvdata = csdev_to_catu_drvdata(csdev);
>
> - CS_UNLOCK(catu_drvdata->base);
> - rc = catu_disable_hw(catu_drvdata);
> - CS_LOCK(catu_drvdata->base);
> + guard(raw_spinlock_irqsave)(&catu_drvdata->spinlock);
> + if (--csdev->refcnt == 0) {
> + CS_UNLOCK(catu_drvdata->base);
> + rc = catu_disable_hw(catu_drvdata);
> + CS_LOCK(catu_drvdata->base);
> + }
> return rc;
> }
>
> @@ -550,6 +558,7 @@ static int __catu_probe(struct device *dev, struct resource *res)
> dev->platform_data = pdata;
>
> drvdata->base = base;
> + raw_spin_lock_init(&drvdata->spinlock);
> catu_desc.access = CSDEV_ACCESS_IOMEM(base);
> catu_desc.pdata = pdata;
> catu_desc.dev = dev;
> diff --git a/drivers/hwtracing/coresight/coresight-catu.h b/drivers/hwtracing/coresight/coresight-catu.h
> index 141feac1c14b..755776cd19c5 100644
> --- a/drivers/hwtracing/coresight/coresight-catu.h
> +++ b/drivers/hwtracing/coresight/coresight-catu.h
> @@ -65,6 +65,7 @@ struct catu_drvdata {
> void __iomem *base;
> struct coresight_device *csdev;
> int irq;
> + raw_spinlock_t spinlock;
> };
>
> #define CATU_REG32(name, offset) \
> --
> 2.49.0.604.gff1f9ca942-goog
>
On Mon, Apr 21, 2025 at 02:58:18PM -0700, Yabin Cui wrote:
> The cs_etm PMU, regardless of the underlying trace sink (ETF, ETR or
> TRBE), doesn't require contiguous pages for its AUX buffer.
Though contiguous pages are not mandatory for TRBE, I would set the
PERF_PMU_CAP_AUX_NO_SG flag for it. This can potentially benefit
performance.
For non per CPU sinks, it is fine to allocate non-contiguous pages.
Thanks,
Leo
> This patch adds the PERF_PMU_CAP_AUX_NON_CONTIGUOUS_PAGES capability
> to the cs_etm PMU. This allows the kernel to allocate non-contiguous
> pages for the AUX buffer, reducing memory fragmentation when using
> cs_etm.
>
> Signed-off-by: Yabin Cui <yabinc(a)google.com>
> ---
> drivers/hwtracing/coresight/coresight-etm-perf.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> index f4cccd68e625..c98646eca7f8 100644
> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> @@ -899,7 +899,8 @@ int __init etm_perf_init(void)
> int ret;
>
> etm_pmu.capabilities = (PERF_PMU_CAP_EXCLUSIVE |
> - PERF_PMU_CAP_ITRACE);
> + PERF_PMU_CAP_ITRACE |
> + PERF_PMU_CAP_AUX_NON_CONTIGUOUS_PAGES);
>
> etm_pmu.attr_groups = etm_pmu_attr_groups;
> etm_pmu.task_ctx_nr = perf_sw_context;
> --
> 2.49.0.805.g082f7c87e0-goog
>
>