Add a test to confirm that default sink selection skips over an ETF
and returns an ETR even if it's further away.
This also makes it easier to add new unit tests in the future.
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
---
Changes in v4:
- Rename etm to src now that it's not CORESIGHT_DEV_SUBTYPE_SOURCE_PROC
- Remove the now empty src_ops too
- Fix a rebase mistake in the Makefile that removed CTCU
- Link to v3: https://lore.kernel.org/r/20250312-james-cs-kunit-test-v3-1-dcfb69730161@li…
Changes in v3:
- Use CORESIGHT_DEV_SUBTYPE_SOURCE_BUS type instead of the default
(CORESIGHT_DEV_SUBTYPE_SOURCE_PROC) so that the test still works even
when TRBE sinks are registered. This also removes the need for the
fake CPU ID callback.
- Link to v2: https://lore.kernel.org/r/20250305-james-cs-kunit-test-v2-1-83ba682b976c@li…
Changes in v2:
- Let devm free everything rather than doing individual kfrees:
"Like with managed drivers, KUnit-managed fake devices are
automatically cleaned up when the test finishes, but can be manually
cleaned up early with kunit_device_unregister()."
- Link to v1: https://lore.kernel.org/r/20250225164639.522741-1-james.clark@linaro.org
---
drivers/hwtracing/coresight/Kconfig | 9 +++
drivers/hwtracing/coresight/Makefile | 1 +
drivers/hwtracing/coresight/coresight-core.c | 1 +
.../hwtracing/coresight/coresight-kunit-tests.c | 74 ++++++++++++++++++++++
4 files changed, 85 insertions(+)
diff --git a/drivers/hwtracing/coresight/Kconfig b/drivers/hwtracing/coresight/Kconfig
index ecd7086a5b83..f064e3d172b3 100644
--- a/drivers/hwtracing/coresight/Kconfig
+++ b/drivers/hwtracing/coresight/Kconfig
@@ -259,4 +259,13 @@ config CORESIGHT_DUMMY
To compile this driver as a module, choose M here: the module will be
called coresight-dummy.
+
+config CORESIGHT_KUNIT_TESTS
+ tristate "Enable Coresight unit tests"
+ depends on KUNIT
+ default KUNIT_ALL_TESTS
+ help
+ Enable Coresight unit tests. Only useful for development and not
+ intended for production.
+
endif
diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile
index 8e62c3150aeb..4e6ea5b05b01 100644
--- a/drivers/hwtracing/coresight/Makefile
+++ b/drivers/hwtracing/coresight/Makefile
@@ -53,3 +53,4 @@ obj-$(CONFIG_ULTRASOC_SMB) += ultrasoc-smb.o
obj-$(CONFIG_CORESIGHT_DUMMY) += coresight-dummy.o
obj-$(CONFIG_CORESIGHT_CTCU) += coresight-ctcu.o
coresight-ctcu-y := coresight-ctcu-core.o
+obj-$(CONFIG_CORESIGHT_KUNIT_TESTS) += coresight-kunit-tests.o
diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index fb43ef6a3b1f..47af75ba7a00 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -959,6 +959,7 @@ coresight_find_default_sink(struct coresight_device *csdev)
}
return csdev->def_sink;
}
+EXPORT_SYMBOL_GPL(coresight_find_default_sink);
static int coresight_remove_sink_ref(struct device *dev, void *data)
{
diff --git a/drivers/hwtracing/coresight/coresight-kunit-tests.c b/drivers/hwtracing/coresight/coresight-kunit-tests.c
new file mode 100644
index 000000000000..c8f361767c45
--- /dev/null
+++ b/drivers/hwtracing/coresight/coresight-kunit-tests.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <kunit/test.h>
+#include <kunit/device.h>
+#include <linux/coresight.h>
+
+#include "coresight-priv.h"
+
+static struct coresight_device *coresight_test_device(struct device *dev)
+{
+ struct coresight_device *csdev = devm_kcalloc(dev, 1,
+ sizeof(struct coresight_device),
+ GFP_KERNEL);
+ csdev->pdata = devm_kcalloc(dev, 1,
+ sizeof(struct coresight_platform_data),
+ GFP_KERNEL);
+ return csdev;
+}
+
+static void test_default_sink(struct kunit *test)
+{
+ /*
+ * Source -> ETF -> ETR -> CATU
+ * ^
+ * | default
+ */
+ struct device *dev = kunit_device_register(test, "coresight_kunit");
+ struct coresight_device *src = coresight_test_device(dev),
+ *etf = coresight_test_device(dev),
+ *etr = coresight_test_device(dev),
+ *catu = coresight_test_device(dev);
+ struct coresight_connection conn = {};
+
+ src->type = CORESIGHT_DEV_TYPE_SOURCE;
+ /*
+ * Don't use CORESIGHT_DEV_SUBTYPE_SOURCE_PROC, that would always return
+ * a TRBE sink if one is registered.
+ */
+ src->subtype.source_subtype = CORESIGHT_DEV_SUBTYPE_SOURCE_BUS;
+ etf->type = CORESIGHT_DEV_TYPE_LINKSINK;
+ etf->subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_BUFFER;
+ etr->type = CORESIGHT_DEV_TYPE_SINK;
+ etr->subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_SYSMEM;
+ catu->type = CORESIGHT_DEV_TYPE_HELPER;
+
+ conn.src_dev = src;
+ conn.dest_dev = etf;
+ coresight_add_out_conn(dev, src->pdata, &conn);
+
+ conn.src_dev = etf;
+ conn.dest_dev = etr;
+ coresight_add_out_conn(dev, etf->pdata, &conn);
+
+ conn.src_dev = etr;
+ conn.dest_dev = catu;
+ coresight_add_out_conn(dev, etr->pdata, &conn);
+
+ KUNIT_ASSERT_PTR_EQ(test, coresight_find_default_sink(src), etr);
+}
+
+static struct kunit_case coresight_testcases[] = {
+ KUNIT_CASE(test_default_sink),
+ {}
+};
+
+static struct kunit_suite coresight_test_suite = {
+ .name = "coresight_test_suite",
+ .test_cases = coresight_testcases,
+};
+
+kunit_test_suites(&coresight_test_suite);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("James Clark <james.clark@linaro.org>");
+MODULE_DESCRIPTION("Arm CoreSight KUnit tests");
---
base-commit: 3eadce8308bc8d808fd9e3a9d211c84215087451
change-id: 20250305-james-cs-kunit-test-3af1df2401e6
Best regards,
--
James Clark <james.clark@linaro.org>
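
The behaviour that test_default_sink() checks can be illustrated with a small userspace model. This is only a sketch, assuming that sinks are ranked by subtype first and by distance from the source second, as the commit message describes. Every name below (model_dev, model_sink_subtype, model_find_default_sink) is hypothetical and is not a kernel API.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's sink subtypes; higher ranks win. */
enum model_sink_subtype {
	MODEL_SINK_NONE = 0,	/* not a sink (source, helper) */
	MODEL_SINK_BUFFER,	/* e.g. an ETF */
	MODEL_SINK_SYSMEM,	/* e.g. an ETR */
};

/* Hypothetical device with at most one output connection. */
struct model_dev {
	const char *name;
	enum model_sink_subtype sink_subtype;
	struct model_dev *next;	/* NULL at the end of the chain */
};

/*
 * Walk the chain downstream of src and return the best sink. A higher
 * subtype always wins; the nearer device wins only between equal subtypes.
 */
static struct model_dev *model_find_default_sink(struct model_dev *src)
{
	struct model_dev *best = NULL;
	struct model_dev *dev;

	for (dev = src->next; dev; dev = dev->next) {
		if (dev->sink_subtype == MODEL_SINK_NONE)
			continue;
		/* Strict '>' keeps the nearer device when subtypes tie. */
		if (!best || dev->sink_subtype > best->sink_subtype)
			best = dev;
	}
	return best;
}
```

Building the chain src -> etf -> etr -> catu with these types and calling model_find_default_sink() on src yields the ETR, which mirrors the assertion in the KUnit test.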
On 29/04/2025 10:31 pm, Yabin Cui wrote:
> perf always allocates contiguous AUX pages based on aux_watermark.
> However, this contiguous allocation doesn't benefit all PMUs. For
> instance, ARM SPE and TRBE operate with virtual pages, and Coresight
> ETR allocates a separate buffer. For these PMUs, allocating contiguous
> AUX pages unnecessarily exacerbates memory fragmentation. This
> fragmentation can prevent their use on long-running devices.
>
> This patch modifies the perf driver to allocate non-contiguous AUX
> pages by default. For PMUs that can benefit from contiguous pages (
> Intel PT and BTS), a new PMU capability, PERF_PMU_CAP_AUX_PREFER_LARGE
> is introduced to maintain their existing behavior.
>
> Signed-off-by: Yabin Cui <yabinc@google.com>
> ---
> Changes since v1:
> In v1, default is preferring contiguous pages, and add a flag to
> allocate non-contiguous pages. In v2, default is allocating
> non-contiguous pages, and add a flag to prefer contiguous pages.
>
> v1 patchset:
> perf,coresight: Reduce fragmentation with non-contiguous AUX pages for
> cs_etm
>
> arch/x86/events/intel/bts.c | 3 ++-
> arch/x86/events/intel/pt.c | 3 ++-
> include/linux/perf_event.h | 1 +
> kernel/events/ring_buffer.c | 18 +++++++++++-------
> 4 files changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
> index a95e6c91c4d7..9129f00e4b9f 100644
> --- a/arch/x86/events/intel/bts.c
> +++ b/arch/x86/events/intel/bts.c
> @@ -625,7 +625,8 @@ static __init int bts_init(void)
> return -ENOMEM;
>
> bts_pmu.capabilities = PERF_PMU_CAP_AUX_NO_SG | PERF_PMU_CAP_ITRACE |
> - PERF_PMU_CAP_EXCLUSIVE;
> + PERF_PMU_CAP_EXCLUSIVE |
> + PERF_PMU_CAP_AUX_PREFER_LARGE;
> bts_pmu.task_ctx_nr = perf_sw_context;
> bts_pmu.event_init = bts_event_init;
> bts_pmu.add = bts_event_add;
> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
> index fa37565f6418..37179e813b8c 100644
> --- a/arch/x86/events/intel/pt.c
> +++ b/arch/x86/events/intel/pt.c
> @@ -1866,7 +1866,8 @@ static __init int pt_init(void)
>
> pt_pmu.pmu.capabilities |= PERF_PMU_CAP_EXCLUSIVE |
> PERF_PMU_CAP_ITRACE |
> - PERF_PMU_CAP_AUX_PAUSE;
> + PERF_PMU_CAP_AUX_PAUSE |
> + PERF_PMU_CAP_AUX_PREFER_LARGE;
> pt_pmu.pmu.attr_groups = pt_attr_groups;
> pt_pmu.pmu.task_ctx_nr = perf_sw_context;
> pt_pmu.pmu.event_init = pt_event_init;
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 0069ba6866a4..56d77348c511 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -301,6 +301,7 @@ struct perf_event_pmu_context;
> #define PERF_PMU_CAP_AUX_OUTPUT 0x0080
> #define PERF_PMU_CAP_EXTENDED_HW_TYPE 0x0100
> #define PERF_PMU_CAP_AUX_PAUSE 0x0200
> +#define PERF_PMU_CAP_AUX_PREFER_LARGE 0x0400
>
> /**
> * pmu::scope
> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
> index 5130b119d0ae..d76249ce4f17 100644
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -679,7 +679,7 @@ int rb_alloc_aux(struct perf_buffer *rb, struct perf_event *event,
> {
> bool overwrite = !(flags & RING_BUFFER_WRITABLE);
> int node = (event->cpu == -1) ? -1 : cpu_to_node(event->cpu);
> - int ret = -ENOMEM, max_order;
> + int ret = -ENOMEM, max_order = 0;
>
> if (!has_aux(event))
> return -EOPNOTSUPP;
> @@ -689,8 +689,8 @@ int rb_alloc_aux(struct perf_buffer *rb, struct perf_event *event,
>
> if (!overwrite) {
> /*
> - * Watermark defaults to half the buffer, and so does the
> - * max_order, to aid PMU drivers in double buffering.
> + * Watermark defaults to half the buffer, to aid PMU drivers
> + * in double buffering.
> */
> if (!watermark)
> watermark = min_t(unsigned long,
> @@ -698,16 +698,20 @@ int rb_alloc_aux(struct perf_buffer *rb, struct perf_event *event,
> (unsigned long)nr_pages << (PAGE_SHIFT - 1));
>
> /*
> - * Use aux_watermark as the basis for chunking to
> + * For PMUs that prefer large contiguous buffers,
> + * use aux_watermark as the basis for chunking to
> * help PMU drivers honor the watermark.
> */
> - max_order = get_order(watermark);
> + if (event->pmu->capabilities & PERF_PMU_CAP_AUX_PREFER_LARGE)
> + max_order = get_order(watermark);
> } else {
> /*
> - * We need to start with the max_order that fits in nr_pages,
> + * For PMUs that prefer large contiguous buffers,
> + * we need to start with the max_order that fits in nr_pages,
> * not the other way around, hence ilog2() and not get_order.
> */
> - max_order = ilog2(nr_pages);
> + if (event->pmu->capabilities & PERF_PMU_CAP_AUX_PREFER_LARGE)
> + max_order = ilog2(nr_pages);

Doesn't this one need to be 'PERF_PMU_CAP_AUX_PREFER_LARGE |
PERF_PMU_CAP_AUX_NO_SG'? Otherwise the NO_SG test further down doesn't
work for devices that only set NO_SG and not PREFER_LARGE.

NO_SG implies PREFER_LARGE behavior, except that NO_SG additionally hard
fails if it can't do it in one alloc. But I think you shouldn't have to
set them both to get the correct behavior.

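
As an untested userspace sketch of the check suggested above: the helper name aux_wants_high_order() is hypothetical and not part of the patch. The PREFER_LARGE value is taken from the hunk quoted above, and the NO_SG value is assumed to match mainline include/linux/perf_event.h.

```c
#include <stdbool.h>

/* Illustrative copies of the capability bits; NO_SG value is an assumption. */
#define PERF_PMU_CAP_AUX_NO_SG		0x0004
#define PERF_PMU_CAP_AUX_PREFER_LARGE	0x0400

/*
 * Hypothetical helper: true if the PMU should start from a high allocation
 * order, either because it prefers large chunks or because it cannot do
 * scatter-gather and needs a single contiguous allocation.
 */
static bool aux_wants_high_order(int capabilities)
{
	return !!(capabilities &
		  (PERF_PMU_CAP_AUX_PREFER_LARGE | PERF_PMU_CAP_AUX_NO_SG));
}
```

With a check of this shape, a PMU that sets only NO_SG still gets a non-zero max_order, so it would not need to set both flags.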
This series enables AUX pause and resume on Arm CoreSight.

The first patch extracts the trace unit control operations into two
functions, which AUX pause and resume will use.

Patches 02 and 03 change the ETMv4 driver to prepare callback functions
for AUX pause and resume.

Patch 04 changes the ETM perf layer to support AUX pause and resume in a
perf session. Patch 05 re-enables sinks after a buffer update; building
on that, patch 06 updates the buffer on AUX pause, which mitigates
trace data loss.

Patch 07 documents AUX pause usage with Arm CoreSight.

This patch set has been verified on the Hikey960 board.

It is suggested to disable CPUIdle (add the `nohlt` option to the Linux
command line) when verifying this series. Issues were found in the ETM
and funnel drivers during CPU suspend and resume; these will be
addressed separately.

Changes from v3:
- Re-enabled sink in buffer update callbacks (Suzuki).
Changes from v2:
- Rebased on CoreSight next branch.
- Dropped the uAPI 'update_buf_on_pause' and updated the documentation
accordingly (Suzuki).
- Renamed ETM callbacks to .pause_perf() and .resume_perf() (Suzuki).
- Minor improvement for error handling in the AUX resume flow.
Changes from v1:
- Added validation of function pointers in the pause and resume APIs (Mike).
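
The function pointer validation mentioned in the changelog can be sketched in userspace as follows. This is a minimal illustration, not the code from the series: only the callback names pause_perf and resume_perf come from the changelog, while the struct layouts, the return types, the -EOPNOTSUPP error code and the demo callbacks are all assumptions.

```c
#include <errno.h>
#include <stddef.h>

struct model_source;

/* Hypothetical ops table; either callback may be left NULL by a driver. */
struct model_source_ops {
	int (*resume_perf)(struct model_source *src);
	void (*pause_perf)(struct model_source *src);
};

struct model_source {
	const struct model_source_ops *ops;
	int paused;
};

/* Resume tracing, refusing sources that do not implement the callback. */
static int model_resume_source(struct model_source *src)
{
	if (!src || !src->ops || !src->ops->resume_perf)
		return -EOPNOTSUPP;
	return src->ops->resume_perf(src);
}

/* Pause tracing; silently does nothing if the callback is missing. */
static void model_pause_source(struct model_source *src)
{
	if (!src || !src->ops || !src->ops->pause_perf)
		return;
	src->ops->pause_perf(src);
}

/* Demo callbacks standing in for a driver's implementation. */
static int demo_resume(struct model_source *src)
{
	src->paused = 0;
	return 0;
}

static void demo_pause(struct model_source *src)
{
	src->paused = 1;
}
```

Checking the pointers in the wrapper means callers do not have to know whether a given source driver has hooked up pause and resume.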
Leo Yan (7):
coresight: etm4x: Extract the trace unit controlling
coresight: Introduce pause and resume APIs for source
coresight: etm4x: Hook pause and resume callbacks
coresight: perf: Support AUX trace pause and resume
coresight: tmc: Re-enable sink after buffer update
coresight: perf: Update buffer on AUX pause
Documentation: coresight: Document AUX pause and resume
Documentation/trace/coresight/coresight-perf.rst | 31 +++++++++
drivers/hwtracing/coresight/coresight-core.c | 22 +++++++
drivers/hwtracing/coresight/coresight-etm-perf.c | 84 +++++++++++++++++++++++-
drivers/hwtracing/coresight/coresight-etm4x-core.c | 143 +++++++++++++++++++++++++++++------------
drivers/hwtracing/coresight/coresight-etm4x.h | 2 +
drivers/hwtracing/coresight/coresight-priv.h | 2 +
drivers/hwtracing/coresight/coresight-tmc-etf.c | 9 +++
drivers/hwtracing/coresight/coresight-tmc-etr.c | 10 +++
include/linux/coresight.h | 4 ++
9 files changed, 265 insertions(+), 42 deletions(-)
--
2.34.1