From: Arnd Bergmann <arnd(a)arndb.de>
Export this_cpu_has_cap() for use by modules. It is used by the
TRBE driver. Without this patch, TRBE fails to build as a
module:
ERROR: modpost: "this_cpu_has_cap" [drivers/hwtracing/coresight/coresight-trbe.ko] undefined!
Fixes: 8a1065127d95 ("coresight: trbe: Add infrastructure for Errata handling")
Cc: Will Deacon <will(a)kernel.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
[ change to EXPORT_SYMBOL_GPL ]
Acked-by: Catalin Marinas <catalin.marinas(a)arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
---
arch/arm64/kernel/cpufeature.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index f8a3067d10c6..82e68c69bb99 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2839,6 +2839,7 @@ bool this_cpu_has_cap(unsigned int n)
return false;
}
+EXPORT_SYMBOL_GPL(this_cpu_has_cap);
/*
* This helper function is used in a narrow window when,
--
2.25.4
On 21/10/2021 08:38, Tao Zhang wrote:
> Add TPDA and TPDM support to DTS for RB5 board. This change is a
> sample for validating. After applying this patch, the new TPDM and
> TPDA nodes can be observed at the coresight devices path. TPDM and
> TPDA hardware can be operated by commands.
>
> List the commands for validating this series patches as below.
> echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
> echo 1 > /sys/bus/coresight/devices/tpdm0/enable_source
> echo 1 > /sys/bus/coresight/devices/tpdm0/integration_test
> echo 2 > /sys/bus/coresight/devices/tpdm0/integration_test
> cat /dev/tmc_etf0 > /data/etf-tpdm0.bin
> echo 0 > /sys/bus/coresight/devices/tpdm0/enable_source
> echo 0 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
> echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
> echo 1 > /sys/bus/coresight/devices/tpdm1/enable_source
> echo 1 > /sys/bus/coresight/devices/tpdm1/integration_test
> echo 2 > /sys/bus/coresight/devices/tpdm1/integration_test
> cat /dev/tmc_etf0 > /data/etf-tpdm1.bin
> echo 0 > /sys/bus/coresight/devices/tpdm1/enable_source
> echo 0 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
> echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
> echo 1 > /sys/bus/coresight/devices/tpdm2/enable_source
> echo 1 > /sys/bus/coresight/devices/tpdm2/integration_test
> echo 2 > /sys/bus/coresight/devices/tpdm2/integration_test
> cat /dev/tmc_etf0 > /data/etf-tpdm2.bin
> echo 0 > /sys/bus/coresight/devices/tpdm2/enable_source
> echo 0 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
>
> If the data from TPDMs can be obtained from the ETF, it means
> that the TPDMs verification is successful. At the same time,
How can we decode the TPDM trace? Is there a public decoder
available?
> since TPDM0, TPDM1 and TPDM2 are all connected to the same
> funnel "funnel@6c2d000" and output via different output ports,
> it also means that the following patches verification is
> successful.
> coresight: add support to enable more coresight paths
> coresight: funnel: add support for multiple output ports
>
> Signed-off-by: Tao Zhang <quic_taozha(a)quicinc.com>
> ---
> arch/arm64/boot/dts/qcom/qrb5165-rb5.dts | 439 +++++++++++++++++++++++
> 1 file changed, 439 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts b/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts
> index 8ac96f8e79d4..bcec8b181e11 100644
> --- a/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts
> +++ b/arch/arm64/boot/dts/qcom/qrb5165-rb5.dts
> @@ -222,6 +222,445 @@
> +
> + funnel@6b04000 {
> + compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
> + arm,primecell-periphid = <0x000bb908>;
> +
> + reg = <0 0x6b04000 0 0x1000>;
> + reg-names = "funnel-base";
> +
> + clocks = <&aoss_qmp>;
> + clock-names = "apb_pclk";
> +
> + out-ports {
> + port {
> + merge_funnel_out: endpoint {
> + remote-endpoint =
> + <&etf_in>;
> + };
> + };
> + };
> +
> + in-ports {
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + port@7 {
> + reg = <7>;
> + swao_funnel_in7: endpoint {
> + slave-mode;
This is obsolete, with the new in-ports/out-ports construct.
Suzuki
Hi Leo,
Would you be ok with the current patch the way it is? In case it's of
any help, I'm sharing the testing steps that James and I went through
when testing this internally, in case you want to add to them:
- Test that only a portion of the buffer is saved until there is a wraparound
$ ./perf record -vvv -e arm_spe/period=148576/u -S -- taskset --cpu-list 0 stress --cpu 1 & while true; do sleep 0.2; killall -s USR2 perf; done
- Test snapshot mode in CPU mode
$ sudo ./perf record -vvv -C 0 -e arm_spe/period=148576/u -S -- taskset --cpu-list 0 stress --cpu 1 &
- Test that auxtrace buffers correspond to an aux record
- Test snapshot default sizes in sudo and user modes
- Test small snapshot size
$ ./perf record -vvv -e arm_spe/period=148576/u -S1000 -m16,16 -- taskset --cpu-list 0 stress --cpu 1 &
If there are any concerns with the patches, please let me know and I
will try to address them.
Thanks,
German
On 13/10/2021 08:51, Will Deacon wrote:
> On Wed, Oct 13, 2021 at 08:39:16AM +0800, Leo Yan wrote:
>> On Mon, Oct 11, 2021 at 04:55:37PM +0100, German Gomez wrote:
>>> On 06/10/2021 10:51, Leo Yan wrote:
>>>> On Wed, Oct 06, 2021 at 10:35:20AM +0100, German Gomez wrote:
>>>>
>>>> [...]
>>>>
>>>>>> So simply say, I think the head pointer monotonically increasing is
>>>>>> the right thing to do in Arm SPE driver.
>>>>> I will talk to James about how we can proceed on this.
>>>> Thanks!
>>> I took this offline with James and, though it looks possible to patch
>>> the SPE driver to have a monotonically increasing head pointer in order
>>> to simplify the handling in the perf tool, it could be a breaking change
>>> for users of the perf_event_open syscall that currently rely on the way
>>> it works now.
>> Here I cannot create the connection between AUX head pointer and the
>> breakage of calling perf_event_open().
>>
>> Could you elaborate what's the reason the monotonical increasing head
>> pointer will lead to the breakage for perf_event_open()?
> It's a user-visible change in behaviour, isn't it? Therefore we risk
> breaking applications that rely on the current behaviour if we change it
> unconditionally.
>
> Given that the driver has always worked like this and it doesn't sound like
> it's the end of the world to deal with it in userspace (after all, it's
> aligned with intel-pt), then I don't think we should change it.
>
> Will
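To make the trade-off discussed above concrete, here is a minimal userspace sketch of how a consumer might locate snapshot data under each head-pointer convention. It is not taken from the SPE driver or from perf: the struct, both function names, the power-of-two buffer size, and the "zero-initialised buffer, probe the tail for non-zero bytes" heuristic are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of where the newest data ends in the AUX buffer. */
struct snapshot_view {
	uint64_t offset;  /* write position inside the buffer */
	bool wrapped;     /* true if old data follows the write position */
};

/* Monotonically increasing head: wrap state follows from the value itself. */
static struct snapshot_view view_monotonic(uint64_t head, uint64_t size)
{
	struct snapshot_view v;

	assert(size && (size & (size - 1)) == 0); /* assumed power of two */
	v.offset = head & (size - 1);
	v.wrapped = head >= size;
	return v;
}

/*
 * Wrapping head: head is already an offset, so wrap state has to be
 * inferred. Assumed heuristic: the buffer starts zeroed, so any non-zero
 * byte at or after head, within the last 'probe' bytes, means that data
 * from an earlier pass is still present.
 */
static struct snapshot_view view_wrapping(const uint8_t *buf, uint64_t size,
					  uint64_t head, uint64_t probe)
{
	struct snapshot_view v = { head, false };
	uint64_t start;

	assert(probe <= size && head < size);
	start = size - probe;
	if (head > start)
		start = head;
	for (uint64_t i = start; i < size; i++) {
		if (buf[i]) {
			v.wrapped = true;
			break;
		}
	}
	return v;
}
```

With the first convention the consumer needs nothing but the head value; with the second it needs access to the buffer contents and a heuristic, which is the extra userspace handling the thread refers to.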
Hi Arnd,
Thanks for the report.
On 29/10/2021 11:31, Arnd Bergmann wrote:
> On Fri, Oct 15, 2021 at 12:31 AM Suzuki K Poulose
> <suzuki.poulose(a)arm.com> wrote:
>>
>> +static void trbe_check_errata(struct trbe_cpudata *cpudata)
>> +{
>> + int i;
>> +
>> + for (i = 0; i < TRBE_ERRATA_MAX; i++) {
>> + int cap = trbe_errata_cpucaps[i];
>> +
>> + if (WARN_ON_ONCE(cap < 0))
>> + return;
>> + if (this_cpu_has_cap(cap))
>> + set_bit(i, cpudata->errata);
>> + }
>> +}
>
> this_cpu_has_cap() is private to arch/arm64 and not exported, so this causes
> a build failure when used from a loadable module:
>
> ERROR: modpost: "this_cpu_has_cap"
> [drivers/hwtracing/coresight/coresight-trbe.ko] undefined!
>
> Should this symbol be exported or do we need a different workaround?
This should be exported. I can send in a patch.
Suzuki
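For readers following along, the call pattern in question can be sketched in plain userspace C. This is only an illustration: the erratum names, the capability numbers, and the stubbed this_cpu_has_cap() below are made up, and the bitmask return value stands in for the set_bit() calls on cpudata->errata. In the kernel the real function lives in arch/arm64/kernel/cpufeature.c, and a module can only link against it once it is exported.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical errata indices; the real list lives in the TRBE driver. */
enum { SKETCH_ERRATUM_A, SKETCH_ERRATUM_B, TRBE_ERRATA_MAX };

/* Made-up capability numbers, one per erratum. */
static const int trbe_errata_cpucaps[TRBE_ERRATA_MAX] = { 5, 9 };

/* Stub: pretend the current CPU only has capability 9. */
static bool this_cpu_has_cap(unsigned int n)
{
	return n == 9;
}

/* Returns a bitmask with bit i set when erratum i applies to this CPU. */
static unsigned long trbe_check_errata(void)
{
	unsigned long errata = 0;

	for (int i = 0; i < TRBE_ERRATA_MAX; i++) {
		int cap = trbe_errata_cpucaps[i];

		assert(cap >= 0); /* stands in for WARN_ON_ONCE() */
		if (this_cpu_has_cap((unsigned int)cap))
			errata |= 1UL << i;
	}
	return errata;
}
```

With the stub above, only the second erratum is reported, so the function returns a mask with bit 1 set.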
On Thu, Oct 21, 2021 at 03:38:47PM +0800, Tao Zhang wrote:
> Current coresight implementation only supports enabling source
> ETMs or STM. This patch adds support to enable more kinds of
> coresight source to sink paths. We build a path from source to
> sink when any source is enabled and store it in a list. When the
> source is disabled, we fetch the corresponding path from the list
> and decrement the refcount on each device in the path. The device
> is disabled if the refcount reaches zero. Don't store path to
> coresight data structure of source to avoid unnecessary change to
> ABI.
> Since some targets may have coresight sources other than STM and
> ETMs, we need to add this change to support these coresight
> devices.
>
> Signed-off-by: Tingwei Zhang <tingwei(a)codeaurora.org>
> Signed-off-by: Tao Zhang <quic_taozha(a)quicinc.com>
> ---
> drivers/hwtracing/coresight/coresight-core.c | 100 +++++++++++--------
> 1 file changed, 56 insertions(+), 44 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index 8a18c71df37a..1e621d61307a 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -37,18 +37,16 @@ struct coresight_node {
> };
>
> /*
> - * When operating Coresight drivers from the sysFS interface, only a single
> - * path can exist from a tracer (associated to a CPU) to a sink.
> + * struct coresight_path - path from source to sink
> + * @path: Address of path list.
> + * @link: hook to the list.
> */
> -static DEFINE_PER_CPU(struct list_head *, tracer_path);
> +struct coresight_path {
> + struct list_head *path;
> + struct list_head link;
> +};
For sources associated with a CPU, like ETMs, having a per-cpu way of storing
paths is a definite advantage and should be kept that way.
>
> -/*
> - * As of this writing only a single STM can be found in CS topologies. Since
> - * there is no way to know if we'll ever see more and what kind of
> - * configuration they will enact, for the time being only define a single path
> - * for STM.
> - */
> -static struct list_head *stm_path;
> +static LIST_HEAD(cs_active_paths);
Then there are sources that aren't associated with a CPU, like STMs and TPDMs.
Perhaps using an IDR, or the hash of the device name as a key into a hash
table, would be better than doing a sequential search, especially as the
list of devices is bound to increase over time.
>
> /*
> * When losing synchronisation a new barrier packet needs to be inserted at the
> @@ -354,6 +352,7 @@ static void coresight_disable_sink(struct coresight_device *csdev)
> if (ret)
> return;
> coresight_control_assoc_ectdev(csdev, false);
> + csdev->activated = false;
I don't see why this is needed, and without proper documentation there is no way
for me to guess the logic behind the change. The ->activated flag should be
manipulated from the command line interface only.
> csdev->enable = false;
> }
>
> @@ -590,6 +589,20 @@ int coresight_enable_path(struct list_head *path, u32 mode, void *sink_data)
> goto out;
> }
>
> +static struct coresight_device *coresight_get_source(struct list_head *path)
> +{
> + struct coresight_device *csdev;
> +
> + if (!path)
> + return NULL;
> +
> + csdev = list_first_entry(path, struct coresight_node, link)->csdev;
> + if (csdev->type != CORESIGHT_DEV_TYPE_SOURCE)
> + return NULL;
> +
> + return csdev;
> +}
> +
> struct coresight_device *coresight_get_sink(struct list_head *path)
> {
> struct coresight_device *csdev;
> @@ -1086,9 +1099,23 @@ static int coresight_validate_source(struct coresight_device *csdev,
> return 0;
> }
>
> +static int coresight_store_path(struct list_head *path)
> +{
> + struct coresight_path *node;
> +
> + node = kzalloc(sizeof(struct coresight_path), GFP_KERNEL);
> + if (!node)
> + return -ENOMEM;
> +
> + node->path = path;
> + list_add(&node->link, &cs_active_paths);
> +
> + return 0;
> +}
> +
> int coresight_enable(struct coresight_device *csdev)
> {
> - int cpu, ret = 0;
> + int ret = 0;
> struct coresight_device *sink;
> struct list_head *path;
> enum coresight_dev_subtype_source subtype;
> @@ -1133,25 +1160,9 @@ int coresight_enable(struct coresight_device *csdev)
> if (ret)
> goto err_source;
>
> - switch (subtype) {
> - case CORESIGHT_DEV_SUBTYPE_SOURCE_PROC:
> - /*
> - * When working from sysFS it is important to keep track
> - * of the paths that were created so that they can be
> - * undone in 'coresight_disable()'. Since there can only
> - * be a single session per tracer (when working from sysFS)
> - * a per-cpu variable will do just fine.
> - */
> - cpu = source_ops(csdev)->cpu_id(csdev);
> - per_cpu(tracer_path, cpu) = path;
> - break;
> - case CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE:
> - stm_path = path;
> - break;
> - default:
> - /* We can't be here */
> - break;
> - }
> + ret = coresight_store_path(path);
> + if (ret)
> + goto err_source;
>
> out:
> mutex_unlock(&coresight_mutex);
> @@ -1168,8 +1179,11 @@ EXPORT_SYMBOL_GPL(coresight_enable);
>
> void coresight_disable(struct coresight_device *csdev)
> {
> - int cpu, ret;
> + int ret;
> struct list_head *path = NULL;
> + struct coresight_path *cspath = NULL;
> + struct coresight_path *cspath_next = NULL;
> + struct coresight_device *src_csdev = NULL;
>
> mutex_lock(&coresight_mutex);
>
> @@ -1180,20 +1194,18 @@ void coresight_disable(struct coresight_device *csdev)
> if (!csdev->enable || !coresight_disable_source(csdev))
> goto out;
>
> - switch (csdev->subtype.source_subtype) {
> - case CORESIGHT_DEV_SUBTYPE_SOURCE_PROC:
> - cpu = source_ops(csdev)->cpu_id(csdev);
> - path = per_cpu(tracer_path, cpu);
> - per_cpu(tracer_path, cpu) = NULL;
> - break;
> - case CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE:
> - path = stm_path;
> - stm_path = NULL;
> - break;
> - default:
> - /* We can't be here */
> - break;
> + list_for_each_entry_safe(cspath, cspath_next, &cs_active_paths, link) {
> + src_csdev = coresight_get_source(cspath->path);
> + if (!src_csdev)
> + continue;
> + if (src_csdev == csdev) {
> + path = cspath->path;
> + list_del(&cspath->link);
> + kfree(cspath);
See my comment above - I agree that sources _not_ associated with a CPU should
be handled differently. CPU bound sources should be kept untouched.
That is all the time I had for today; I will continue tomorrow.
Thanks,
Mathieu
> + }
> }
> + if (path == NULL)
> + goto out;
>
> coresight_disable_path(path);
> coresight_release_path(path);
> --
> 2.17.1
>
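The split suggested in the review comments (keep a per-CPU slot for CPU-bound sources, use a keyed lookup for everything else) could look roughly like the userspace sketch below. Every name here is hypothetical, the fixed sizes are arbitrary, and a real kernel implementation would more likely use per-CPU variables together with an IDR or the kernel hashtable helpers; locking is omitted entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NR_CPUS    8   /* arbitrary for this sketch */
#define SKETCH_NR_BUCKETS 16  /* arbitrary for this sketch */

struct sketch_path { int id; };  /* placeholder for a source-to-sink path */

struct sketch_entry {
	const char *name;            /* device name, used as the key */
	struct sketch_path *path;
	struct sketch_entry *next;   /* bucket chain */
};

static struct sketch_path *cpu_paths[SKETCH_NR_CPUS];
static struct sketch_entry *buckets[SKETCH_NR_BUCKETS];

/* Simple string hash (djb2 variant); any reasonable hash would do. */
static unsigned int hash_name(const char *s)
{
	unsigned int h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h;
}

/* CPU-bound sources (for example ETMs): constant-time slot per CPU. */
static void store_cpu_path(int cpu, struct sketch_path *p)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	cpu_paths[cpu] = p;
}

static struct sketch_path *take_cpu_path(int cpu)
{
	struct sketch_path *p;

	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	p = cpu_paths[cpu];
	cpu_paths[cpu] = NULL;
	return p;
}

/* Other sources (for example STM, TPDM): hashed by name; caller owns 'e'. */
static void store_named_path(struct sketch_entry *e)
{
	struct sketch_entry **head = &buckets[hash_name(e->name) % SKETCH_NR_BUCKETS];

	e->next = *head;
	*head = e;
}

/* Unlinks the entry for 'name' and returns its path, or NULL if absent. */
static struct sketch_path *take_named_path(const char *name)
{
	struct sketch_entry **pp = &buckets[hash_name(name) % SKETCH_NR_BUCKETS];

	for (; *pp; pp = &(*pp)->next) {
		if (strcmp((*pp)->name, name) == 0) {
			struct sketch_entry *e = *pp;

			*pp = e->next;
			e->next = NULL;
			return e->path;
		}
	}
	return NULL;
}
```

The point of the split is that disabling a CPU-bound source never walks a list, while named sources pay only for the entries sharing their bucket rather than for every active path.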