This series enables AUX pause and resume on Arm CoreSight.
The first patch extracts the trace unit control operations into two functions, which AUX pause and resume will reuse.
Patches 02 and 03 change the ETMv4 driver to prepare callback functions for AUX pause and resume.
Patch 04 changes the ETM perf layer to support AUX pause and resume in a perf session. Patch 05 re-enables sinks after a buffer update; building on that, patch 06 updates the buffer on AUX pause, which mitigates the trace data loss issue.
Patch 07 documents the AUX pause usages with Arm CoreSight.
This patch set has been verified on the Hikey960 board.
It is suggested to disable CPUIdle (add the `nohlt` option to the Linux command line) when verifying this series. Issues were found in the ETM and funnel drivers during CPU suspend and resume; these will be addressed separately.
Changes from v3:
- Re-enabled sink in buffer update callbacks (Suzuki).

Changes from v2:
- Rebased on CoreSight next branch.
- Dropped the uAPI 'update_buf_on_pause' and updated the documentation accordingly (Suzuki).
- Renamed ETM callbacks to .pause_perf() and .resume_perf() (Suzuki).
- Minor improvement for error handling in the AUX resume flow.

Changes from v1:
- Added validation of function pointers in pause and resume APIs (Mike).
Leo Yan (7):
  coresight: etm4x: Extract the trace unit controlling
  coresight: Introduce pause and resume APIs for source
  coresight: etm4x: Hook pause and resume callbacks
  coresight: perf: Support AUX trace pause and resume
  coresight: tmc: Re-enable sink after buffer update
  coresight: perf: Update buffer on AUX pause
  Documentation: coresight: Document AUX pause and resume
 Documentation/trace/coresight/coresight-perf.rst   |  31 +++++++++
 drivers/hwtracing/coresight/coresight-core.c       |  22 +++++++
 drivers/hwtracing/coresight/coresight-etm-perf.c   |  84 +++++++++++++++++++++++-
 drivers/hwtracing/coresight/coresight-etm4x-core.c | 143 +++++++++++++++++++++++++++++------------
 drivers/hwtracing/coresight/coresight-etm4x.h      |   2 +
 drivers/hwtracing/coresight/coresight-priv.h       |   2 +
 drivers/hwtracing/coresight/coresight-tmc-etf.c    |   9 +++
 drivers/hwtracing/coresight/coresight-tmc-etr.c    |  10 +++
 include/linux/coresight.h                          |   4 ++
 9 files changed, 265 insertions(+), 42 deletions(-)
The trace unit is controlled during ETM hardware enabling and disabling. Subsequent changes to support AUX pause and resume will reuse the same operations.
Extract the operations into the etm4_{enable|disable}_trace_unit() functions. As a minor improvement, etm4_enable_trace_unit() now returns the timeout error to callers.
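The effect of returning the timeout error can be illustrated with a small user-space model. This is only a sketch: `fake_trace_unit`, `fake_wait_not_idle()`, `fake_enable_trace_unit()` and `fake_enable_hw()` are hypothetical stand-ins, not driver code, and the polling loop merely imitates waiting for TRCSTATR.IDLE to clear.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the trace unit; not the real etmv4_drvdata. */
struct fake_trace_unit {
	bool enabled;          /* models the enable bit written to TRCPRGCTLR */
	int polls_until_ready; /* polls before IDLE reads 0; negative = never */
};

/* Poll the modelled IDLE bit: 0 once it clears, -1 if max_polls is exhausted. */
static int fake_wait_not_idle(const struct fake_trace_unit *tu, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (tu->polls_until_ready >= 0 && i >= tu->polls_until_ready)
			return 0;
	}
	return -1;
}

/* Extracted helper: enable the unit and report a timeout to the caller. */
static int fake_enable_trace_unit(struct fake_trace_unit *tu)
{
	assert(tu);
	tu->enabled = true;
	if (fake_wait_not_idle(tu, 100))
		return -ETIME;
	return 0;
}

/* Caller: the helper's result becomes the return code of the enable path. */
static int fake_enable_hw(struct fake_trace_unit *tu)
{
	int rc;

	/* ... configuration registers would be programmed here ... */
	rc = fake_enable_trace_unit(tu);
	/* ... the lock would be re-taken here regardless of rc ... */
	return rc;
}
```

With the helper returning an error code, a stuck unit surfaces as `-ETIME` at the caller instead of only producing a log message.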
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
---
 drivers/hwtracing/coresight/coresight-etm4x-core.c | 103 +++++++++++++++++++++++++----------------
 1 file changed, 62 insertions(+), 41 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index e5972f16abff..53cb0569dbbf 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -431,6 +431,44 @@ static int etm4x_wait_status(struct csdev_access *csa, int pos, int val)
 	return coresight_timeout(csa, TRCSTATR, pos, val);
 }

+static int etm4_enable_trace_unit(struct etmv4_drvdata *drvdata)
+{
+	struct coresight_device *csdev = drvdata->csdev;
+	struct device *etm_dev = &csdev->dev;
+	struct csdev_access *csa = &csdev->access;
+
+	/*
+	 * ETE mandates that the TRCRSR is written to before
+	 * enabling it.
+	 */
+	if (etm4x_is_ete(drvdata))
+		etm4x_relaxed_write32(csa, TRCRSR_TA, TRCRSR);
+
+	etm4x_allow_trace(drvdata);
+	/* Enable the trace unit */
+	etm4x_relaxed_write32(csa, 1, TRCPRGCTLR);
+
+	/* Synchronize the register updates for sysreg access */
+	if (!csa->io_mem)
+		isb();
+
+	/* wait for TRCSTATR.IDLE to go back down to '0' */
+	if (etm4x_wait_status(csa, TRCSTATR_IDLE_BIT, 0)) {
+		dev_err(etm_dev,
+			"timeout while waiting for Idle Trace Status\n");
+		return -ETIME;
+	}
+
+	/*
+	 * As recommended by section 4.3.7 ("Synchronization when using the
+	 * memory-mapped interface") of ARM IHI 0064D
+	 */
+	dsb(sy);
+	isb();
+
+	return 0;
+}
+
 static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
 {
 	int i, rc;
@@ -539,33 +577,7 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
 		etm4x_relaxed_write32(csa, trcpdcr | TRCPDCR_PU, TRCPDCR);
 	}

-	/*
-	 * ETE mandates that the TRCRSR is written to before
-	 * enabling it.
-	 */
-	if (etm4x_is_ete(drvdata))
-		etm4x_relaxed_write32(csa, TRCRSR_TA, TRCRSR);
-
-	etm4x_allow_trace(drvdata);
-	/* Enable the trace unit */
-	etm4x_relaxed_write32(csa, 1, TRCPRGCTLR);
-
-	/* Synchronize the register updates for sysreg access */
-	if (!csa->io_mem)
-		isb();
-
-	/* wait for TRCSTATR.IDLE to go back down to '0' */
-	if (etm4x_wait_status(csa, TRCSTATR_IDLE_BIT, 0))
-		dev_err(etm_dev,
-			"timeout while waiting for Idle Trace Status\n");
-
-	/*
-	 * As recommended by section 4.3.7 ("Synchronization when using the
-	 * memory-mapped interface") of ARM IHI 0064D
-	 */
-	dsb(sy);
-	isb();
-
+	rc = etm4_enable_trace_unit(drvdata);
 done:
 	etm4_cs_lock(drvdata, csa);

@@ -884,25 +896,12 @@ static int etm4_enable(struct coresight_device *csdev, struct perf_event *event,
 	return ret;
 }

-static void etm4_disable_hw(void *info)
+static void etm4_disable_trace_unit(struct etmv4_drvdata *drvdata)
 {
 	u32 control;
-	struct etmv4_drvdata *drvdata = info;
-	struct etmv4_config *config = &drvdata->config;
 	struct coresight_device *csdev = drvdata->csdev;
 	struct device *etm_dev = &csdev->dev;
 	struct csdev_access *csa = &csdev->access;
-	int i;
-
-	etm4_cs_unlock(drvdata, csa);
-	etm4_disable_arch_specific(drvdata);
-
-	if (!drvdata->skip_power_up) {
-		/* power can be removed from the trace unit now */
-		control = etm4x_relaxed_read32(csa, TRCPDCR);
-		control &= ~TRCPDCR_PU;
-		etm4x_relaxed_write32(csa, control, TRCPDCR);
-	}

 	control = etm4x_relaxed_read32(csa, TRCPRGCTLR);

@@ -943,6 +942,28 @@ static void etm4_disable_hw(void *info)
 	 * of ARM IHI 0064H.b.
 	 */
 	isb();
+}
+
+static void etm4_disable_hw(void *info)
+{
+	u32 control;
+	struct etmv4_drvdata *drvdata = info;
+	struct etmv4_config *config = &drvdata->config;
+	struct coresight_device *csdev = drvdata->csdev;
+	struct csdev_access *csa = &csdev->access;
+	int i;
+
+	etm4_cs_unlock(drvdata, csa);
+	etm4_disable_arch_specific(drvdata);
+
+	if (!drvdata->skip_power_up) {
+		/* power can be removed from the trace unit now */
+		control = etm4x_relaxed_read32(csa, TRCPDCR);
+		control &= ~TRCPDCR_PU;
+		etm4x_relaxed_write32(csa, control, TRCPDCR);
+	}
+
+	etm4_disable_trace_unit(drvdata);

 	/* read the status of the single shot comparators */
 	for (i = 0; i < drvdata->nr_ss_cmp; i++) {
On 01/04/2025 19:07, Leo Yan wrote:
> The trace unit is controlled during ETM hardware enabling and disabling. Subsequent changes to support AUX pause and resume will reuse the same operations.
>
> Extract the operations into the etm4_{enable|disable}_trace_unit() functions. As a minor improvement, etm4_enable_trace_unit() now returns the timeout error to callers.
>
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> Reviewed-by: Mike Leach <mike.leach@linaro.org>
The patch looks good to me. One comment below, nothing with your patch though.
[...]
>  	control = etm4x_relaxed_read32(csa, TRCPRGCTLR);
>
> @@ -943,6 +942,28 @@ static void etm4_disable_hw(void *info)
>  	 * of ARM IHI 0064H.b.
>  	 */
>  	isb();
> +}
> +
> +static void etm4_disable_hw(void *info)
> +{
> +	u32 control;
> +	struct etmv4_drvdata *drvdata = info;
> +	struct etmv4_config *config = &drvdata->config;
> +	struct coresight_device *csdev = drvdata->csdev;
> +	struct csdev_access *csa = &csdev->access;
> +	int i;
> +
> +	etm4_cs_unlock(drvdata, csa);
> +	etm4_disable_arch_specific(drvdata);
> +
> +	if (!drvdata->skip_power_up) {
> +		/* power can be removed from the trace unit now */
> +		control = etm4x_relaxed_read32(csa, TRCPDCR);
> +		control &= ~TRCPDCR_PU;
> +		etm4x_relaxed_write32(csa, control, TRCPDCR);
> +	}
Shouldn't we delay ^^ until we have read all the registers back and disabled the trace ? As I said, this is an existing problem and may need to be fixed. It would be good to move it after we have done everything with the accesses. If we start with that patch, we could easily backport it to stable.
Suzuki
> +	etm4_disable_trace_unit(drvdata);
>
>  	/* read the status of the single shot comparators */
>  	for (i = 0; i < drvdata->nr_ss_cmp; i++) {
Introduce APIs for pausing and resuming a trace source and export them as GPL symbols.
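The shape of these APIs, optional callbacks that are validated before being called, can be sketched in plain C. All `demo_*` names below are hypothetical and the structures are simplified stand-ins for the CoreSight ones; only the dispatch pattern is the point.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_source;

/* Optional operations; a driver may leave either pointer NULL. */
struct demo_source_ops {
	int (*resume_perf)(struct demo_source *src);
	void (*pause_perf)(struct demo_source *src);
};

struct demo_source {
	bool percpu;   /* models a per-CPU trace source */
	bool tracing;
	const struct demo_source_ops *ops;
};

/* Pause is best effort: do nothing if the source cannot pause. */
static void demo_pause_source(struct demo_source *src)
{
	assert(src && src->ops);
	if (!src->percpu)
		return;
	if (src->ops->pause_perf)
		src->ops->pause_perf(src);
}

/* Resume reports -EOPNOTSUPP when the operation is unavailable. */
static int demo_resume_source(struct demo_source *src)
{
	assert(src && src->ops);
	if (!src->percpu)
		return -EOPNOTSUPP;
	if (!src->ops->resume_perf)
		return -EOPNOTSUPP;
	return src->ops->resume_perf(src);
}

static void demo_etm_pause(struct demo_source *src)
{
	src->tracing = false;
}

static int demo_etm_resume(struct demo_source *src)
{
	src->tracing = true;
	return 0;
}

/* A driver that implements both callbacks, and one that implements none. */
static const struct demo_source_ops demo_etm_ops = {
	.resume_perf = demo_etm_resume,
	.pause_perf = demo_etm_pause,
};
static const struct demo_source_ops demo_legacy_ops = { NULL, NULL };
```

Checking the pointers inside the API keeps callers free of per-driver knowledge: a source without the callbacks is simply left running on pause and yields an error on resume.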
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
---
 drivers/hwtracing/coresight/coresight-core.c | 22 ++++++++++++++++++++++
 drivers/hwtracing/coresight/coresight-priv.h |  2 ++
 include/linux/coresight.h                    |  4 ++++
 3 files changed, 28 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index fb43ef6a3b1f..d4c3000608f2 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -367,6 +367,28 @@ void coresight_disable_source(struct coresight_device *csdev, void *data)
 }
 EXPORT_SYMBOL_GPL(coresight_disable_source);

+void coresight_pause_source(struct coresight_device *csdev)
+{
+	if (!coresight_is_percpu_source(csdev))
+		return;
+
+	if (source_ops(csdev)->pause_perf)
+		source_ops(csdev)->pause_perf(csdev);
+}
+EXPORT_SYMBOL_GPL(coresight_pause_source);
+
+int coresight_resume_source(struct coresight_device *csdev)
+{
+	if (!coresight_is_percpu_source(csdev))
+		return -EOPNOTSUPP;
+
+	if (!source_ops(csdev)->resume_perf)
+		return -EOPNOTSUPP;
+
+	return source_ops(csdev)->resume_perf(csdev);
+}
+EXPORT_SYMBOL_GPL(coresight_resume_source);
+
 /*
  * coresight_disable_path_from : Disable components in the given path beyond
  * @nd in the list. If @nd is NULL, all the components, except the SOURCE are
diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
index 82644aff8d2b..2d9baa9d8228 100644
--- a/drivers/hwtracing/coresight/coresight-priv.h
+++ b/drivers/hwtracing/coresight/coresight-priv.h
@@ -249,5 +249,7 @@ void coresight_add_helper(struct coresight_device *csdev,
 void coresight_set_percpu_sink(int cpu, struct coresight_device *csdev);
 struct coresight_device *coresight_get_percpu_sink(int cpu);
 void coresight_disable_source(struct coresight_device *csdev, void *data);
+void coresight_pause_source(struct coresight_device *csdev);
+int coresight_resume_source(struct coresight_device *csdev);

 #endif
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index d79a242b271d..c95c72e07e02 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -398,6 +398,8 @@ struct coresight_ops_link {
  * is associated to.
  * @enable: enables tracing for a source.
  * @disable: disables tracing for a source.
+ * @resume_perf: resumes tracing for a source in perf session.
+ * @pause_perf: pauses tracing for a source in perf session.
  */
 struct coresight_ops_source {
 	int (*cpu_id)(struct coresight_device *csdev);
@@ -405,6 +407,8 @@ struct coresight_ops_source {
 		      enum cs_mode mode, struct coresight_path *path);
 	void (*disable)(struct coresight_device *csdev,
 			struct perf_event *event);
+	int (*resume_perf)(struct coresight_device *csdev);
+	void (*pause_perf)(struct coresight_device *csdev);
 };

 /**
Add callbacks for pausing and resuming the tracer.
A "paused" flag in the driver data indicates whether the tracer is paused. If the flag is set, the driver skips starting the hardware trace. The flag is always set to false in sysfs mode, meaning the tracer is never paused in that case.
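The behaviour of the flag can be modelled as a small state machine. The sketch below is an assumption-laden illustration rather than the driver: `demo_etm` and the `demo_*` functions are hypothetical, and the hardware is reduced to two booleans.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum demo_mode { DEMO_MODE_DISABLED, DEMO_MODE_SYSFS, DEMO_MODE_PERF };

/* Hypothetical, simplified driver state. */
struct demo_etm {
	enum demo_mode mode;
	bool paused;        /* mirrors the "paused" flag described above */
	bool hw_configured; /* configuration registers programmed */
	bool trace_unit_on; /* trace unit actually running */
};

/* Configure the hardware, but start the trace unit only if not paused. */
static void demo_enable_hw(struct demo_etm *etm)
{
	assert(etm);
	etm->hw_configured = true;
	if (!etm->paused)
		etm->trace_unit_on = true;
}

/* Perf mode: the initial pause state comes from the event. */
static void demo_enable_perf(struct demo_etm *etm, bool aux_paused)
{
	etm->paused = aux_paused;
	etm->mode = DEMO_MODE_PERF;
	demo_enable_hw(etm);
}

/* Sysfs mode: the tracer is never paused. */
static void demo_enable_sysfs(struct demo_etm *etm)
{
	etm->paused = false;
	etm->mode = DEMO_MODE_SYSFS;
	demo_enable_hw(etm);
}

static int demo_resume_perf(struct demo_etm *etm)
{
	if (etm->mode != DEMO_MODE_PERF)
		return -EINVAL;
	etm->trace_unit_on = true;
	etm->paused = false;
	return 0;
}

static void demo_pause_perf(struct demo_etm *etm)
{
	if (etm->mode != DEMO_MODE_PERF)
		return;
	etm->trace_unit_on = false;
	etm->paused = true;
}
```

An event that starts paused therefore ends up with the hardware fully configured but idle, so a later resume only has to switch the trace unit on.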
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
---
 drivers/hwtracing/coresight/coresight-etm4x-core.c | 42 +++++++++++++++++++++++++++++++++++++++++-
 drivers/hwtracing/coresight/coresight-etm4x.h      |  2 ++
 2 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index 53cb0569dbbf..5b69446db947 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -577,7 +577,8 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
 		etm4x_relaxed_write32(csa, trcpdcr | TRCPDCR_PU, TRCPDCR);
 	}

-	rc = etm4_enable_trace_unit(drvdata);
+	if (!drvdata->paused)
+		rc = etm4_enable_trace_unit(drvdata);
 done:
 	etm4_cs_lock(drvdata, csa);

@@ -820,6 +821,9 @@ static int etm4_enable_perf(struct coresight_device *csdev,

 	drvdata->trcid = path->trace_id;

+	/* Populate pause state */
+	drvdata->paused = !!READ_ONCE(event->hw.aux_paused);
+
 	/* And enable it */
 	ret = etm4_enable_hw(drvdata);

@@ -846,6 +850,9 @@ static int etm4_enable_sysfs(struct coresight_device *csdev, struct coresight_pa

 	drvdata->trcid = path->trace_id;

+	/* Tracer will never be paused in sysfs mode */
+	drvdata->paused = false;
+
 	/*
 	 * Executing etm4_enable_hw on the cpu whose ETM is being enabled
 	 * ensures that register writes occur when cpu is powered.
@@ -1080,10 +1087,43 @@ static void etm4_disable(struct coresight_device *csdev,
 	coresight_set_mode(csdev, CS_MODE_DISABLED);
 }

+static int etm4_resume_perf(struct coresight_device *csdev)
+{
+	struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+	struct csdev_access *csa = &csdev->access;
+
+	if (coresight_get_mode(csdev) != CS_MODE_PERF)
+		return -EINVAL;
+
+	etm4_cs_unlock(drvdata, csa);
+	etm4_enable_trace_unit(drvdata);
+	etm4_cs_lock(drvdata, csa);
+
+	drvdata->paused = false;
+	return 0;
+}
+
+static void etm4_pause_perf(struct coresight_device *csdev)
+{
+	struct etmv4_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+	struct csdev_access *csa = &csdev->access;
+
+	if (coresight_get_mode(csdev) != CS_MODE_PERF)
+		return;
+
+	etm4_cs_unlock(drvdata, csa);
+	etm4_disable_trace_unit(drvdata);
+	etm4_cs_lock(drvdata, csa);
+
+	drvdata->paused = true;
+}
+
 static const struct coresight_ops_source etm4_source_ops = {
 	.cpu_id = etm4_cpu_id,
 	.enable = etm4_enable,
 	.disable = etm4_disable,
+	.resume_perf = etm4_resume_perf,
+	.pause_perf = etm4_pause_perf,
 };

 static const struct coresight_ops etm4_cs_ops = {
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
index bd7db36ba197..ac649515054d 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.h
+++ b/drivers/hwtracing/coresight/coresight-etm4x.h
@@ -983,6 +983,7 @@ struct etmv4_save_state {
  * @state_needs_restore: True when there is context to restore after PM exit
  * @skip_power_up: Indicates if an implementation can skip powering up
  * the trace unit.
+ * @paused: Indicates if the trace unit is paused.
  * @arch_features: Bitmap of arch features of etmv4 devices.
  */
 struct etmv4_drvdata {
@@ -1036,6 +1037,7 @@ struct etmv4_drvdata {
 	struct etmv4_save_state *save_state;
 	bool state_needs_restore;
 	bool skip_power_up;
+	bool paused;
 	DECLARE_BITMAP(arch_features, ETM4_IMPDEF_FEATURE_MAX);
 };
This commit supports AUX trace pause and resume in a perf session for Arm CoreSight.
First, we need to decide which flag can indicate the CoreSight PMU event has started. The 'event->hw.state' cannot be used for this purpose because its initial value and the value after hardware trace enabling are both 0.
On the other hand, the context value 'ctxt->event_data' stores the ETM private info. This pointer is valid only when the PMU event has been enabled. It is safe to permit AUX trace pause and resume operations only when it is not a NULL pointer.
To achieve fine-grained control of the pause and resume, only the tracer is disabled and enabled. This avoids the unnecessary complexity and latency caused by manipulating the entire link path.
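The guard described above, using the private context pointer as the "event has started" indicator, can be sketched as follows. `demo_ctxt` and the `demo_event_*` functions are hypothetical simplifications; the real code operates on `ctxt->event_data` and calls into the source pause and resume APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU context: event_data is non-NULL only while the event runs. */
struct demo_ctxt {
	void *event_data;
	bool tracer_on;
};

static void demo_event_start(struct demo_ctxt *ctxt, void *event_data)
{
	assert(ctxt && event_data);
	ctxt->event_data = event_data;
	ctxt->tracer_on = true;
}

static void demo_event_stop(struct demo_ctxt *ctxt)
{
	ctxt->tracer_on = false;
	ctxt->event_data = NULL;
}

/* Pause touches only the tracer, and only for a started event. */
static void demo_event_pause(struct demo_ctxt *ctxt)
{
	if (!ctxt->event_data)
		return;
	ctxt->tracer_on = false;
}

/* Resume on an event that never started is a successful no-op. */
static int demo_event_resume(struct demo_ctxt *ctxt)
{
	if (!ctxt->event_data)
		return 0;
	ctxt->tracer_on = true;
	return 0;
}
```

Because pause and resume never touch `event_data`, the rest of the path (links and sink) stays set up across a pause, which is what keeps the operation cheap.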
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
---
 drivers/hwtracing/coresight/coresight-etm-perf.c | 45 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index f4cccd68e625..2dcf1809cb7f 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -365,6 +365,18 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
 			continue;
 		}

+		/*
+		 * If AUX pause feature is enabled but the ETM driver does not
+		 * support the operations, clear this CPU from the mask and
+		 * continue to next one.
+		 */
+		if (event->attr.aux_start_paused &&
+		    (!source_ops(csdev)->pause_perf || !source_ops(csdev)->resume_perf)) {
+			dev_err_once(&csdev->dev, "AUX pause is not supported.\n");
+			cpumask_clear_cpu(cpu, mask);
+			continue;
+		}
+
 		/*
 		 * No sink provided - look for a default sink for all the ETMs,
 		 * where this event can be scheduled.
@@ -450,6 +462,15 @@ static void *etm_setup_aux(struct perf_event *event, void **pages,
 	goto out;
 }

+static int etm_event_resume(struct coresight_device *csdev,
+			    struct etm_ctxt *ctxt)
+{
+	if (!ctxt->event_data)
+		return 0;
+
+	return coresight_resume_source(csdev);
+}
+
 static void etm_event_start(struct perf_event *event, int flags)
 {
 	int cpu = smp_processor_id();
@@ -463,6 +484,14 @@ static void etm_event_start(struct perf_event *event, int flags)
 	if (!csdev)
 		goto fail;

+	if (flags & PERF_EF_RESUME) {
+		if (etm_event_resume(csdev, ctxt) < 0) {
+			dev_err(&csdev->dev, "Failed to resume ETM event.\n");
+			goto fail;
+		}
+		return;
+	}
+
 	/* Have we messed up our tracking ? */
 	if (WARN_ON(ctxt->event_data))
 		goto fail;
@@ -545,6 +574,16 @@ static void etm_event_start(struct perf_event *event, int flags)
 	return;
 }

+static void etm_event_pause(struct coresight_device *csdev,
+			    struct etm_ctxt *ctxt)
+{
+	if (!ctxt->event_data)
+		return;
+
+	/* Stop tracer */
+	coresight_pause_source(csdev);
+}
+
 static void etm_event_stop(struct perf_event *event, int mode)
 {
 	int cpu = smp_processor_id();
@@ -555,6 +594,9 @@ static void etm_event_stop(struct perf_event *event, int mode)
 	struct etm_event_data *event_data;
 	struct coresight_path *path;

+	if (mode & PERF_EF_PAUSE)
+		return etm_event_pause(csdev, ctxt);
+
 	/*
 	 * If we still have access to the event_data via handle,
 	 * confirm that we haven't messed up the tracking.
@@ -899,7 +941,8 @@ int __init etm_perf_init(void)
 	int ret;

 	etm_pmu.capabilities = (PERF_PMU_CAP_EXCLUSIVE |
-				PERF_PMU_CAP_ITRACE);
+				PERF_PMU_CAP_ITRACE |
+				PERF_PMU_CAP_AUX_PAUSE);

 	etm_pmu.attr_groups = etm_pmu_attr_groups;
 	etm_pmu.task_ctx_nr = perf_sw_context;
The buffer update callbacks disable the sink before syncing data but fail to re-enable it afterward. This is fine in the general flow, because the sink will be re-enabled the next time the PMU event is activated.
However, during AUX pause and resume, if the sink is disabled in the buffer update callback, there is no chance to re-enable it when AUX resumes.
To address this, the callbacks now check the event state 'event->hw.state'. If the event is in an active state (0), the sink is re-enabled.
For the TMC ETR driver, buffer updates are not fully protected by the driver's spinlock. In this case, the sink is not re-enabled if its reference counter is 0, in order to avoid race conditions where the sink may have been completely disabled.
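The two re-enable conditions can be contrasted in a short user-space model. Everything named `demo_*` or `DEMO_*` is hypothetical; `DEMO_HES_STOPPED` merely plays the role of a non-zero (inactive) event state, and the `disabled_meanwhile` parameter is an artificial way to imitate the sink being released while the lock is not held.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_HES_STOPPED 0x01u /* any non-zero state means "not active" */

/* Hypothetical, simplified sink. */
struct demo_sink {
	unsigned int refcnt;
	bool hw_enabled;
};

/*
 * ETF-style update: the whole sequence is modelled as running under one
 * lock, so the event state alone decides whether to re-enable.
 */
static void demo_update_etf(struct demo_sink *sink, unsigned int hw_state)
{
	assert(sink);
	if (!sink->refcnt)
		return;               /* modelled early check under the lock */
	sink->hw_enabled = false;     /* flush and stop before syncing */
	/* ... trace data would be copied to the perf buffer here ... */
	if (!hw_state)
		sink->hw_enabled = true;
}

/*
 * ETR-style update: the lock is modelled as dropped while syncing, so
 * the reference count is checked again before re-enabling.
 */
static void demo_update_etr(struct demo_sink *sink, unsigned int hw_state,
			    bool disabled_meanwhile)
{
	assert(sink);
	sink->hw_enabled = false;
	/* lock not held here: another context may fully disable the sink */
	if (disabled_meanwhile)
		sink->refcnt = 0;
	/* lock re-taken: only touch hardware that is still in use */
	if (sink->refcnt && !hw_state)
		sink->hw_enabled = true;
}
```

The extra `refcnt` test in the ETR variant is what prevents hardware from being switched back on after its last user has gone.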
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etf.c |  9 +++++++++
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 10 ++++++++++
 2 files changed, 19 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index d858740001c2..7584cc03d8e6 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -482,6 +482,7 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
 	unsigned long offset, to_read = 0, flags;
 	struct cs_buffers *buf = sink_config;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+	struct perf_event *event = handle->event;

 	if (!buf)
 		return 0;
@@ -586,6 +587,14 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
 	 * is expected by the perf ring buffer.
 	 */
 	CS_LOCK(drvdata->base);
+
+	/*
+	 * If the event is active, it is triggered during an AUX pause.
+	 * Re-enable the sink so that it is ready when AUX resume is invoked.
+	 */
+	if (!event->hw.state)
+		__tmc_etb_enable_hw(drvdata);
+
 out:
 	raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 76a8cb29b68a..8923fbc6e1a0 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1636,6 +1636,7 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 	struct etr_perf_buffer *etr_perf = config;
 	struct etr_buf *etr_buf = etr_perf->etr_buf;
+	struct perf_event *event = handle->event;

 	raw_spin_lock_irqsave(&drvdata->spinlock, flags);

@@ -1705,6 +1706,15 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
 	 */
 	smp_wmb();

+	/*
+	 * If the event is active, it is triggered during an AUX pause.
+	 * Re-enable the sink so that it is ready when AUX resume is invoked.
+	 */
+	raw_spin_lock_irqsave(&drvdata->spinlock, flags);
+	if (csdev->refcnt && !event->hw.state)
+		__tmc_etr_enable_hw(drvdata);
+	raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
+
 out:
 	/*
 	 * Don't set the TRUNCATED flag in snapshot mode because 1) the
Hi Leo,
On Tue, 1 Apr 2025 at 19:07, Leo Yan leo.yan@arm.com wrote:
> The buffer update callbacks disable the sink before syncing data but fail to re-enable it afterward. This is fine in the general flow, because the sink will be re-enabled the next time the PMU event is activated.
>
> However, during AUX pause and resume, if the sink is disabled in the buffer update callback, there is no chance to re-enable it when AUX resumes.
>
> To address this, the callbacks now check the event state 'event->hw.state'. If the event is in an active state (0), the sink is re-enabled.
>
> For the TMC ETR driver, buffer updates are not fully protected by the driver's spinlock. In this case, the sink is not re-enabled if its reference counter is 0, in order to avoid race conditions where the sink may have been completely disabled.
>
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-tmc-etf.c |  9 +++++++++
>  drivers/hwtracing/coresight/coresight-tmc-etr.c | 10 ++++++++++
>  2 files changed, 19 insertions(+)
>
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> index d858740001c2..7584cc03d8e6 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
> @@ -482,6 +482,7 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
>  	unsigned long offset, to_read = 0, flags;
>  	struct cs_buffers *buf = sink_config;
>  	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> +	struct perf_event *event = handle->event;
>
>  	if (!buf)
>  		return 0;
> @@ -586,6 +587,14 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
>  	 * is expected by the perf ring buffer.
>  	 */
>  	CS_LOCK(drvdata->base);
> +
> +	/*
> +	 * If the event is active, it is triggered during an AUX pause.
> +	 * Re-enable the sink so that it is ready when AUX resume is invoked.
> +	 */
> +	if (!event->hw.state)
> +		__tmc_etb_enable_hw(drvdata);
Think that the refcnt should be checked here too.
Does the ETB case need to be handled? - somewhat confusingly the coresight-tmc-etf.c file handles both ETF and ETB.
Regards
Mike
> +
>  out:
>  	raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
>
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 76a8cb29b68a..8923fbc6e1a0 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -1636,6 +1636,7 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
>  	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
>  	struct etr_perf_buffer *etr_perf = config;
>  	struct etr_buf *etr_buf = etr_perf->etr_buf;
> +	struct perf_event *event = handle->event;
>
>  	raw_spin_lock_irqsave(&drvdata->spinlock, flags);
>
> @@ -1705,6 +1706,15 @@ tmc_update_etr_buffer(struct coresight_device *csdev,
>  	 */
>  	smp_wmb();
>
> +	/*
> +	 * If the event is active, it is triggered during an AUX pause.
> +	 * Re-enable the sink so that it is ready when AUX resume is invoked.
> +	 */
> +	raw_spin_lock_irqsave(&drvdata->spinlock, flags);
> +	if (csdev->refcnt && !event->hw.state)
> +		__tmc_etr_enable_hw(drvdata);
> +	raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
> +
>  out:
>  	/*
>  	 * Don't set the TRUNCATED flag in snapshot mode because 1) the
> --
> 2.34.1
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
Hi Mike,
On Wed, Apr 02, 2025 at 04:05:10PM +0100, Mike Leach wrote:
[...]
> > @@ -482,6 +482,7 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
> >  	unsigned long offset, to_read = 0, flags;
> >  	struct cs_buffers *buf = sink_config;
> >  	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
> > +	struct perf_event *event = handle->event;
> >
> >  	if (!buf)
> >  		return 0;
> > @@ -586,6 +587,14 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev,
> >  	 * is expected by the perf ring buffer.
> >  	 */
> >  	CS_LOCK(drvdata->base);
> > +
> > +	/*
> > +	 * If the event is active, it is triggered during an AUX pause.
> > +	 * Re-enable the sink so that it is ready when AUX resume is invoked.
> > +	 */
> > +	if (!event->hw.state)
> > +		__tmc_etb_enable_hw(drvdata);
>
> Think that the refcnt should be checked here too.

No, the ETF driver uses a spinlock to guard the entire region that checks refcnt and updates the buffer, and this code is still inside the same critical region. This is why checking refcnt is not needed here.

> Does the ETB case need to be handled? - somewhat confusingly the coresight-tmc-etf.c file handles both ETF and ETB.

ETF is for link mode, and ETB is for sink mode. Updating the buffer applies only to sink mode, which is why I use __tmc_etb_enable_hw() here. Does that make sense?
I also have a question about the paired operations (this applies to both the ETF and ETR drivers).

Now the flow is:

  tmc_update_etf_buffer() {
      tmc_flush_and_stop();
      update buffer;
      __tmc_etb_enable_hw();
  }
The operations are not paired between tmc_flush_and_stop() and __tmc_etb_enable_hw().
The tmc_flush_and_stop() function only controls the TMC_FFCR register. I'm not sure whether I need to extract the TMC_FFCR operations from __tmc_etb_enable_hw() to use them for recovery in the update buffer. Or do you think re-enabling the hardware in this patch is the safer approach?
Thanks, Leo
On Wed, 2 Apr 2025 at 16:58, Leo Yan leo.yan@arm.com wrote:
[...]
Looks OK to me.
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Because sinks such as ETR and ETB do not support interrupt handling, hardware trace data might be lost for continuously running tasks.
This commit takes advantage of AUX pause to update the trace buffer, which mitigates the trace data loss issue.
A per-CPU sink has its own interrupt handling, so there would be a race condition between updating the buffer in NMI context and the sink's interrupt handler. To avoid the race, this commit disallows updating the buffer on AUX pause for per-CPU sinks. Currently, this applies only to TRBE.
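The decision sequence on pause can be sketched with a toy sink and output handle. The `demo_*` types and functions are hypothetical; committing and restarting the handle stands in for ending the current AUX output and beginning a new one, and the perf ring buffer itself is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_sink {
	bool percpu;           /* models a per-CPU sink with its own interrupt */
	unsigned long pending; /* bytes of trace waiting in the sink */
	unsigned long (*update_buffer)(struct demo_sink *sink);
};

/* Hypothetical output handle: counts committed bytes and restarts. */
struct demo_handle {
	unsigned long committed;
	int restarts;
};

/* Drain the sink and report how many bytes were collected. */
static unsigned long demo_drain(struct demo_sink *sink)
{
	unsigned long n = sink->pending;

	sink->pending = 0;
	return n;
}

/* Called after the tracer has been paused. */
static void demo_update_on_pause(struct demo_sink *sink,
				 struct demo_handle *handle)
{
	unsigned long size;

	assert(sink && handle);

	/* A per-CPU sink is left to its own interrupt handler. */
	if (sink->percpu)
		return;

	if (!sink->update_buffer)
		return;

	size = sink->update_buffer(sink);
	if (!size)
		return;

	handle->committed += size; /* end the current output with the data */
	handle->restarts++;        /* begin a fresh output for later trace */
}
```

Draining on every pause bounds how much trace can accumulate in a sink that cannot signal when it is full, which is the mitigation the commit message describes.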
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 drivers/hwtracing/coresight/coresight-etm-perf.c | 43 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 2dcf1809cb7f..f1551c08ecb2 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -574,14 +574,53 @@ static void etm_event_start(struct perf_event *event, int flags)
 	return;
 }
 
-static void etm_event_pause(struct coresight_device *csdev,
+static void etm_event_pause(struct perf_event *event,
+			    struct coresight_device *csdev,
 			    struct etm_ctxt *ctxt)
 {
+	int cpu = smp_processor_id();
+	struct coresight_device *sink;
+	struct perf_output_handle *handle = &ctxt->handle;
+	struct coresight_path *path;
+	unsigned long size;
+
 	if (!ctxt->event_data)
 		return;
 
 	/* Stop tracer */
 	coresight_pause_source(csdev);
+
+	path = etm_event_cpu_path(ctxt->event_data, cpu);
+	sink = coresight_get_sink(path);
+	if (WARN_ON_ONCE(!sink))
+		return;
+
+	/*
+	 * The per CPU sink has own interrupt handling, it might have
+	 * race condition with updating buffer on AUX trace pause if
+	 * it is invoked from NMI. To avoid the race condition,
+	 * disallows updating buffer for the per CPU sink case.
+	 */
+	if (coresight_is_percpu_sink(sink))
+		return;
+
+	if (WARN_ON_ONCE(handle->event != event))
+		return;
+
+	if (!sink_ops(sink)->update_buffer)
+		return;
+
+	size = sink_ops(sink)->update_buffer(sink, handle,
+					     ctxt->event_data->snk_config);
+	if (READ_ONCE(handle->event)) {
+		if (!size)
+			return;
+
+		perf_aux_output_end(handle, size);
+		perf_aux_output_begin(handle, event);
+	} else {
+		WARN_ON_ONCE(size);
+	}
 }
 
 static void etm_event_stop(struct perf_event *event, int mode)
@@ -595,7 +634,7 @@ static void etm_event_stop(struct perf_event *event, int mode)
 	struct coresight_path *path;
 
 	if (mode & PERF_EF_PAUSE)
-		return etm_event_pause(csdev, ctxt);
+		return etm_event_pause(event, csdev, ctxt);
 
 	/*
 	 * If we still have access to the event_data via handle,
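
The sink-selection rule described in the commit message above can be condensed into a small standalone sketch. It is a simplified model rather than the kernel code: `struct sink_desc`, `enum pause_action` and `pause_policy()` are hypothetical names, and the two booleans stand in for `coresight_is_percpu_sink()` and for whether the sink provides an `update_buffer` callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes of the pause path; names are illustrative only. */
enum pause_action {
	PAUSE_SOURCE_ONLY,	/* tracer paused, sink buffer left alone */
	PAUSE_AND_UPDATE,	/* tracer paused, trace data drained to perf */
};

/* Hypothetical sink description; not a kernel structure. */
struct sink_desc {
	bool percpu;		/* per-CPU sink with its own IRQ handler */
	bool has_update_buffer;	/* sink implements a buffer update callback */
};

static enum pause_action pause_policy(const struct sink_desc *sink)
{
	assert(sink != NULL);

	/* Updating from NMI could race with the sink's own IRQ handler. */
	if (sink->percpu)
		return PAUSE_SOURCE_ONLY;

	/* Nothing to call if the sink cannot update the buffer. */
	if (!sink->has_update_buffer)
		return PAUSE_SOURCE_ONLY;

	return PAUSE_AND_UPDATE;
}
```

Under this model only a shared sink that can update its buffer is drained on pause, and a per-CPU sink is never drained from the pause path.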
This adds a description of AUX pause and resume. It introduces what AUX pause and resume are and records usage examples.
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 Documentation/trace/coresight/coresight-perf.rst | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/Documentation/trace/coresight/coresight-perf.rst b/Documentation/trace/coresight/coresight-perf.rst
index d087aae7d492..30be89320621 100644
--- a/Documentation/trace/coresight/coresight-perf.rst
+++ b/Documentation/trace/coresight/coresight-perf.rst
@@ -78,6 +78,37 @@ enabled like::
 
 Please refer to the kernel configuration help for more information.
 
+Fine-grained tracing with AUX pause and resume
+----------------------------------------------
+
+Arm CoreSight may generate a large amount of hardware trace data, which
+will lead to overhead in recording and distract users when reviewing
+profiling result. To mitigate the issue of excessive trace data, Perf
+provides AUX pause and resume functionality for fine-grained tracing.
+
+The AUX pause and resume can be triggered by associated events. These
+events can be ftrace tracepoints (including static and dynamic
+tracepoints) or PMU events (e.g. CPU PMU cycle event). To create a perf
+session with AUX pause / resume, three configuration terms are
+introduced:
+
+- "aux-action=start-paused": it is specified for the cs_etm PMU event to
+  launch in a paused state.
+- "aux-action=pause": an associated event is specified with this term
+  to pause AUX trace.
+- "aux-action=resume": an associated event is specified with this term
+  to resume AUX trace.
+
+Example for triggering AUX pause and resume with ftrace tracepoints::
+
+  perf record -e cs_etm/aux-action=start-paused/k,syscalls:sys_enter_openat/aux-action=resume/,syscalls:sys_exit_openat/aux-action=pause/ ls
+
+Example for triggering AUX pause and resume with PMU event::
+
+  perf record -a -e cs_etm/aux-action=start-paused/k \
+          -e cycles/aux-action=pause,period=10000000/ \
+          -e cycles/aux-action=resume,period=1050000/ -- sleep 1
+
 Perf test - Verify kernel and userspace perf CoreSight work
 -----------------------------------------------------------
On 01/04/2025 19:07, Leo Yan wrote:
[...]
The series looks good to me, except for the comment on Patch 1.
I would like to get an Ack from James, as he has looked at it in the past.
Suzuki
On 01/04/2025 7:07 pm, Leo Yan wrote:
[...]
Reviewed-by: James Clark <james.clark@linaro.org>