On Fri, Jun 26, 2026 at 08:09:58PM +0800, Jie Gan wrote:
[...]
> I have another proposal: what if we allocate the ATID in trace_noc_id() when
> the device does not already have a valid ATID?
>
> Possible scenarios:
>
> If the itnoc device is connected to a TPDM device (which has no ATID),
> trace_noc_id() will be invoked via coresight_path_assign_trace_id(), and a
> valid ATID can be allocated for the path.
>
> If the itnoc device is connected to sources other than TPDM, trace_noc_id()
> will never be invoked, and therefore no ATID will be allocated for the
> device, saving resources.
TBH, I'm not sure I can make a judgement here, as I don't have enough
knowledge of the topology. And I'm not sure whether the listed
connections cover all possible cases.
I also found commit 5799dee92dc2:
| This patch adds platform driver support for the CoreSight Interconnect
| TNOC, Interconnect TNOC is a CoreSight link that forwards trace data
| from a subsystem to the Aggregator TNOC. Compared to Aggregator TNOC,
| it does not have aggregation and ATID functionality.
With your proposal, wouldn't ATID be allocated for the interconnect
TNOC while being skipped for the Aggregator TNOC? That seems to
contradict the commit log.
Thanks,
Leo
Hi Jie,
On Fri, Jun 26, 2026 at 10:03:41AM +0800, Jie Gan wrote:
[...]
> Hi Leo,
>
> To be honest, I would prefer not to modify the interconnect platform driver.
> On some Qualcomm platforms, multiple itnoc devices reside within small
> blocks(one or more than one for each block) and are connected to a dummy
> source. In such cases, two ATIDs are allocated for a path (the dummy source
> and the itnoc), which is inefficient. This is why the itnoc platform driver
> created to avoid this waste.
>
> The TraceNoC (called as AG TraceNoC) is a generic TraceNoC device which
> connected to multiple source and link devices, aggregating data from all
> source devices into a single output path.
As I said, it may be fragile to couple a specific device property (ATID)
to the AMBA driver.
You're now facing a case where a device cannot be registered as an AMBA
device, so it cannot use ATID. Likewise, I can imagine in future where a
device is registered as an AMBA device, but you don't want ATID.
> This device is implemented as an AMBA device but lacks proper hardware
> configuration. As a result, it must be handled in the driver as a
> workaround, which unfortunately breaks the original design intent.
Seems to me, it is not reasonable to pretend an AMBA device but AMBA
ID registers are absent.
How about add a new DT property ("qcom,tnoc-enable-atid") to force
enabling ATID?
Thanks,
Leo
On Thu, Jun 25, 2026 at 09:01:18AM +0800, Jie Gan wrote:
[...]
> > > However, I believe it is acceptable to allocate an ATID for the itNoC device
> > > and the issue can be fixed with this way.
> >
> > I think so.
>
> Hi Suzuki/Leo
>
> Which solution do you prefer to address the issue?
I will leave this to Suzuki.
> The interconnect traceNoC platform driver is intended for the itnoc device,
> implying that no TPDM devices are connected to it. So, if I modify it to
> allocate an ATID, I think it would be better to rename the “itnoc” node
> accordingly? Or it's ok to leave it as-is?
>
> BTW, the traceNoC device definitely is an AMBA device with CID/PID
> registers.
Just to share a bit thoughts on the driver's design.
I think it would be better to keep the probe function generic. The AMBA
probe should not be specific to TraceNoC, and the platform probe should
not be only dedicated to the interconnect TraceNoC. The probe function
should simply handle a device that appears on either the AMBA bus or the
platform bus.
So the question is: if allocat an ATID for all traceNoC devices, do you
still need to distinguish TraceNoC types? If no, then the code can be
unified.
Thanks,
Leo
On Wed, Jun 24, 2026 at 11:08:32PM +0800, Jie Gan wrote:
[...]
> > Why does it fail ? power management ? hw broken ? Is it really AMBA or
> > do you pretend that to be an AMBA device by faking the CID/PID?
>
> The CID reads as 0 from the register, which I suspect is a hardware design
> issue. I have not yet confirmed this with the hardware team. As a
> workaround, I provided a fake periphid via a DT property to bypass
> amba_read_periphid.
>
>
> Leo commented in other thread:
> >>tnoc.c registers both an AMBA driver and a platform driver. Shouldn't >>it
> >>be registered as a platform device instead?
>
> The platform driver is intended for the interconnect TraceNoC device and is
> not designed to allocate an ATID. The issue is that the TPDM device borrows
> the ATID from the TraceNoC device, resulting in the ATID always being 0 when
> associated with an interconnect NoC device.
>
> However, I believe it is acceptable to allocate an ATID for the itNoC device
> and the issue can be fixed with this way.
I think so.
On Wed, Jun 24, 2026 at 05:49:26PM +0800, Jie Gan wrote:
> The AMBA bus attempts to read the CID/PID of a device before invoking
> its probe function if the arm,primecell-periphid property is absent.
> This causes a deferred probe issue for the TraceNoC device, as the
> CID/PID cannot be read from the periphid register.
> Add the arm,primecell-periphid property to bypass the AMBA bus
> check and resolve the probe issue.
tnoc.c registers both an AMBA driver and a platform driver. Shouldn't it
be registered as a platform device instead?
Thanks,
Leo
On 24/06/2026 14:48, Jie Gan wrote:
>
>
> On 6/24/2026 9:27 PM, Konrad Dybcio wrote:
>> On 6/24/26 11:49 AM, Jie Gan wrote:
>>> The AMBA bus attempts to read the CID/PID of a device before invoking
>>> its probe function if the arm,primecell-periphid property is absent.
>>> This causes a deferred probe issue for the TraceNoC device, as the
>>> CID/PID cannot be read from the periphid register.
>>
>> Why does it probe defer?
>>
>
> For an AMBA device, the periphid is mandatory for probing. In the
> amba_match function, AMBA attempts to read the periphid from the CID/PID
> registers if the arm,primecell-periphid property is absent in the device
> tree. If this read fails, it returns -EPROBE_DEFER, and the probe
> ultimately fails.
Why does it fail ? power management ? hw broken ? Is it really AMBA or
do you pretend that to be an AMBA device by faking the CID/PID?
Suzuki
> Most AMBA devices expose valid CID/PID registers, so specifying
> arm,primecell-periphid in the device tree is usually unnecessary.
> However, for the TraceNoC device in this case, AMBA cannot reliably read
> the periphid from the corresponding registers.
>
>> And is this required for all TNOC devices?
>
> So far, the TNOC device has been added to sm8750, Glymur, and Kaanapali
> platforms, and all exhibit probe failures due to the same root cause.
>
> I prefer to fix it on Kaanapali first.
>
> Thanks,
> Jie
>
>>
>> Konrad
>
On Mon, Jun 15, 2026 at 08:32:28PM +0800, Yingchao Deng wrote:
> The Qualcomm extended CTI is a heavily parameterized version of ARM’s
> CSCTI. It allows a debugger to send to trigger events to a processor or to
> send a trigger event to one or more processors when a trigger event occurs
> on another processor on the same SoC, or even between SoCs.
>
> Qualcomm extended CTI supports up to 128 triggers. And some of the register
> offsets are changed.
>
> The commands to configure CTI triggers are the same as ARM's CTI.
This series looks good to me and I tested:
- Confirmed no any change for standard CTI;
- Perf command;
- Build bisection.
Tested-by: Leo Yan <leo.yan(a)arm.com>
Mike may also want to take a look in case there is any overlap with his
claim tags init patch [1]. I don't have a strong preference on which
series goes in first, as the claim tag handling in this series looks
clean enough to me.
@Yingchao, just a small suggestion: rc1–rc4 is usually the best window
for getting patches merged. If you receive comments and are able to
address them quickly, it may be worth respin patches during that
period to try catching the next merge window. This is no guarantees,
just thought it might be useful to share as a general tip :)
[1] https://lore.kernel.org/linux-arm-kernel/20260619160148.499223-2-mike.leach…
All CoreSight compliant components have an implementation defined number
of 0 to 8 claim tag bits in the claim tag registers.
These are used to claim the CoreSight resources by system agents.
ARM recommends implementions have 4 claim tag bits, though a valid
implementation can have 0 claim tags bits.
The CoreSight drivers implement a 2 claim tag bit protocol to allow
self hosted and external debug agents to manage access to the hardware.
However, if there are less than 2 claim tags available the protocol
incorrectly returns an error on device claim, as no checks are made.
If insufficient claim tags are present in a component then the protocol
must return success on claim / disclaim to allow components to be used
normally.
Add initialisation to read the CLAIMSET bits to establish the number of
available claim tag bits, and adjust the claim returns accordingly.
Cache the claimtag protocol availablity in the coresight_device to reduce
reads for the main claim/disclaim api.
changes since v2:
1) consolidated API to remove the API calls using just cs_access, which were
used purely to clear down stale self claim tags, replace with a normal
coresight_device API for initialisation, to match the claim/disclaim API.
This does both the check on availability and the stale tag clearance.
Updated all drivers to use the new init functionality
2) Added option for drivers to skip claim tag checking completely for devices
with no-compliant hardware, that do not implement registers at the claim tag
location, or do not operate correctly to indicate the correct number of
claim tags for the device.
changes since v1:
1) Added claim tag availability cache into coresight_device when using the
main coresight_claim_device() / coresight_disclaim_device() API.
Applies to coresight/next
Mike Leach (1):
coresight: fix issue where coresight component has no claimtags
drivers/hwtracing/coresight/coresight-catu.c | 6 +-
drivers/hwtracing/coresight/coresight-core.c | 139 ++++++++++++++++--
.../hwtracing/coresight/coresight-cti-core.c | 7 +-
drivers/hwtracing/coresight/coresight-etb10.c | 9 +-
.../coresight/coresight-etm3x-core.c | 8 +-
.../coresight/coresight-etm4x-core.c | 8 +-
.../hwtracing/coresight/coresight-funnel.c | 7 +-
drivers/hwtracing/coresight/coresight-priv.h | 7 +
.../coresight/coresight-replicator.c | 9 +-
.../hwtracing/coresight/coresight-tmc-core.c | 7 +-
include/linux/coresight.h | 23 ++-
11 files changed, 205 insertions(+), 25 deletions(-)
--
2.43.0
This series adds thread-stack and synthesized callchain support for Arm
CoreSight, which comes from older series [1] but heavily rewritten.
CS ETM previously kept last-branch state in a per-trace-queue buffer.
That effectively makes the state per CPU, while the call/return history
belongs to a thread. This series moves branch tracking to the common
thread-stack code.
The series records CoreSight branches with thread_stack__event(), uses
thread_stack__br_sample() for last branch entries, flushes thread stacks
after decoder resets.
A decoder reset between AUX trace buffers is treated as a global trace
discontinuity, so all thread stacks are flushed, so avoids carrying
stale call/return history across a trace discontinuity.
One limitation remains for instructions emulated by the kernel. In that
case the exception return address may not match the return address
stored in the thread stack, because after exception return can be one
instruction ahead. The stack can still recover when a later return
matches an upper caller. Given emulated instructions are not the common
target for performance callchain analysis. Supporting this would require
extending the common thread-stack path to accept both the real target
address and an adjusted address for stack matching, so this series
leaves that extra complexity out.
The series has been tested on Orion6 board:
perf test 136 -vvv
136: CoreSight synthesized callchain:
--- start ---
test child forked, pid 3539
---- end(0) ----
136: CoreSight synthesized callchain : Ok
perf script --itrace=g16i10il64
callchain_test 17468 [005] 1031003.229943: 10 instructions:
aaaac32507c4 main+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
ffff90bd233c call_init+0x9c (inlined)
ffff90bd233c __libc_start_main_impl+0x9c (inlined)
aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 17468 [005] 1031003.229943: 10 instructions:
aaaac3250774 do_svc+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac3250798 print+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac32507b0 foo+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac32507c8 main+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
ffff90bd233c call_init+0x9c (inlined)
ffff90bd233c __libc_start_main_impl+0x9c (inlined)
aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 17468 [005] 1031003.229944: 10 instructions:
ffff800080010c20 vectors+0x420 ([kernel.kallsyms])
aaaac3250784 do_svc+0x1c (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac3250798 print+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac32507b0 foo+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac32507c8 main+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
ffff90bd233c call_init+0x9c (inlined)
ffff90bd233c __libc_start_main_impl+0x9c (inlined)
aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)
Note, the test fails on Juno board which is caused by many discontinuity
packets (mainly caused by NO_SYNC elem). This is likely caused by the
FIFO overflow on the path.
[1] https://lore.kernel.org/linux-arm-kernel/20200220052701.7754-1-leo.yan@lina…
Signed-off-by: Leo Yan <leo.yan(a)arm.com>
---
Changes in v9:
- Added patch 01 to fixed thread leak during trace queue init (sashiko).
- Added check in instruction and branch samples in
cs_etm__add_stack_event() (sashiko).
- Released frontend_thread properly in cs_etm__context() (sashiko).
- Refined cs_etm__flush_all_stack() to use switch (sashiko).
- Gathered James' review tags.
- Rebased on the latest perf-tools-next.
- Link to v8: https://lore.kernel.org/r/20260611-b4-arm_cs_callchain_support_v1-v8-0-7379…
Changes in v8:
- Updated test_arm_coresight_disasm.sh to pass "--itrace=b" and updated
examples in arm-cs-trace-disasm.py (James).
- Removed static annotation in callchain workload and renamed functions
with prefix "callchain_" to reduce naming conflict (James).
- For callchain test pre-condition check, removed the aarch64 check and
added the root permission check (James).
- Resolved the shellcheck errors (James).
- Link to v7: https://lore.kernel.org/r/20260611-b4-arm_cs_callchain_support_v1-v7-0-1ba7…
Changes in v7:
- Rebased on the latest perf-tools-next.
- Used struct_size() for allocation callchain struct (James).
- Added a helper cs_etm__packet_has_taken_branch() (James).
- Minor improvements for the callchain test (used record-ctl FIFO and
reworked the validation callstack push / pop).
- Link to v6: https://lore.kernel.org/r/20260526-b4-arm_cs_callchain_support_v1-v6-0-f9f4…
Changes in v6:
- Heavily rewrote the patches since restarted the work after 6 years.
- Changed to use the common thread-stack for branch stack and callchain
management.
- Added a callchain test.
- Link to v5: https://lore.kernel.org/linux-arm-kernel/20200220052701.7754-1-leo.yan@lina…
Changes in v5:
- Addressed Mike's suggestion for performance improvement for function
cs_etm__instr_addr() for quick calculation for non T32;
- Removed the patch 'perf cs-etm: Synchronize instruction sample with
the thread stack' (Mike);
- Fixed the issue for exception is taken for branch target address
accessing, for the branch sample and stack thread handling, the
related patches are 01, 02, 07;
- Fixed the stack thread handling for instruction emulation and single
step with patches 08, 09.
- Link to v4: https://lore.kernel.org/linux-arm-kernel/20200203020716.31832-1-leo.yan@lin…
---
Leo Yan (9):
perf cs-etm: Fix thread leaks on trace queue init failure
perf cs-etm: Filter synthesized branch samples
perf cs-etm: Decode ETE exception packets
perf cs-etm: Refactor instruction size handling
perf cs-etm: Use thread-stack for last branch entries
perf cs-etm: Flush thread stacks after decoder reset
perf cs-etm: Support call indentation
perf cs-etm: Synthesize callchains for instruction samples
perf test: Add Arm CoreSight callchain test
tools/perf/Documentation/perf-test.txt | 6 +-
tools/perf/scripts/python/arm-cs-trace-disasm.py | 9 +-
tools/perf/tests/builtin-test.c | 1 +
tools/perf/tests/shell/coresight/callchain.sh | 172 ++++++++++
.../shell/coresight/test_arm_coresight_disasm.sh | 4 +-
tools/perf/tests/tests.h | 1 +
tools/perf/tests/workloads/Build | 2 +
tools/perf/tests/workloads/callchain.c | 33 ++
tools/perf/util/cs-etm.c | 367 +++++++++++++--------
9 files changed, 446 insertions(+), 149 deletions(-)
---
base-commit: 44543bb53ebfecb5ce4e890053a666affab9e482
change-id: 20260521-b4-arm_cs_callchain_support_v1-2c2a70719bcc
Best regards,
--
Leo Yan <leo.yan(a)arm.com>
This series adds thread-stack and synthesized callchain support for Arm
CoreSight, which comes from older series [1] but heavily rewritten.
CS ETM previously kept last-branch state in a per-trace-queue buffer.
That effectively makes the state per CPU, while the call/return history
belongs to a thread. This series moves branch tracking to the common
thread-stack code.
The series records CoreSight branches with thread_stack__event(), uses
thread_stack__br_sample() for last branch entries, flushes thread stacks
after decoder resets.
A decoder reset between AUX trace buffers is treated as a global trace
discontinuity, so all thread stacks are flushed, so avoids carrying
stale call/return history across a trace discontinuity.
One limitation remains for instructions emulated by the kernel. In that
case the exception return address may not match the return address
stored in the thread stack, because after exception return can be one
instruction ahead. The stack can still recover when a later return
matches an upper caller. Given emulated instructions are not the common
target for performance callchain analysis. Supporting this would require
extending the common thread-stack path to accept both the real target
address and an adjusted address for stack matching, so this series
leaves that extra complexity out.
The series has been tested on Orion6 board:
perf test 136 -vvv
136: CoreSight synthesized callchain:
--- start ---
test child forked, pid 3539
---- end(0) ----
136: CoreSight synthesized callchain : Ok
perf script --itrace=g16i10il64
callchain_test 17468 [005] 1031003.229943: 10 instructions:
aaaac32507c4 main+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
ffff90bd233c call_init+0x9c (inlined)
ffff90bd233c __libc_start_main_impl+0x9c (inlined)
aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 17468 [005] 1031003.229943: 10 instructions:
aaaac3250774 do_svc+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac3250798 print+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac32507b0 foo+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac32507c8 main+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
ffff90bd233c call_init+0x9c (inlined)
ffff90bd233c __libc_start_main_impl+0x9c (inlined)
aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 17468 [005] 1031003.229944: 10 instructions:
ffff800080010c20 vectors+0x420 ([kernel.kallsyms])
aaaac3250784 do_svc+0x1c (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac3250798 print+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac32507b0 foo+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac32507c8 main+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
ffff90bd233c call_init+0x9c (inlined)
ffff90bd233c __libc_start_main_impl+0x9c (inlined)
aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)
Note, the test fails on Juno board which is caused by many discontinuity
packets (mainly caused by NO_SYNC elem). This is likely caused by the
FIFO overflow on the path.
[1] https://lore.kernel.org/linux-arm-kernel/20200220052701.7754-1-leo.yan@lina…
Signed-off-by: Leo Yan <leo.yan(a)arm.com>
---
Changes in v10:
- Change to syscall(SYS_gettid) for build failure on x86 (James).
- Extracted sample thread stack into cs_etm__sample_branch_stack().
- Link to v9: https://lore.kernel.org/r/20260616-b4-arm_cs_callchain_support_v1-v9-0-f8fa…
Changes in v9:
- Added patch 01 to fixed thread leak during trace queue init (sashiko).
- Added check in instruction and branch samples in
cs_etm__add_stack_event() (sashiko).
- Released frontend_thread properly in cs_etm__context() (sashiko).
- Refined cs_etm__flush_all_stack() to use switch (sashiko).
- Gathered James' review tags.
- Rebased on the latest perf-tools-next.
- Link to v8: https://lore.kernel.org/r/20260611-b4-arm_cs_callchain_support_v1-v8-0-7379…
Changes in v8:
- Updated test_arm_coresight_disasm.sh to pass "--itrace=b" and updated
examples in arm-cs-trace-disasm.py (James).
- Removed static annotation in callchain workload and renamed functions
with prefix "callchain_" to reduce naming conflict (James).
- For callchain test pre-condition check, removed the aarch64 check and
added the root permission check (James).
- Resolved the shellcheck errors (James).
- Link to v7: https://lore.kernel.org/r/20260611-b4-arm_cs_callchain_support_v1-v7-0-1ba7…
Changes in v7:
- Rebased on the latest perf-tools-next.
- Used struct_size() for allocation callchain struct (James).
- Added a helper cs_etm__packet_has_taken_branch() (James).
- Minor improvements for the callchain test (used record-ctl FIFO and
reworked the validation callstack push / pop).
- Link to v6: https://lore.kernel.org/r/20260526-b4-arm_cs_callchain_support_v1-v6-0-f9f4…
Changes in v6:
- Heavily rewrote the patches since restarted the work after 6 years.
- Changed to use the common thread-stack for branch stack and callchain
management.
- Added a callchain test.
- Link to v5: https://lore.kernel.org/linux-arm-kernel/20200220052701.7754-1-leo.yan@lina…
Changes in v5:
- Addressed Mike's suggestion for performance improvement for function
cs_etm__instr_addr() for quick calculation for non T32;
- Removed the patch 'perf cs-etm: Synchronize instruction sample with
the thread stack' (Mike);
- Fixed the issue for exception is taken for branch target address
accessing, for the branch sample and stack thread handling, the
related patches are 01, 02, 07;
- Fixed the stack thread handling for instruction emulation and single
step with patches 08, 09.
- Link to v4: https://lore.kernel.org/linux-arm-kernel/20200203020716.31832-1-leo.yan@lin…
---
Leo Yan (9):
perf cs-etm: Fix thread leaks on trace queue init failure
perf cs-etm: Filter synthesized branch samples
perf cs-etm: Decode ETE exception packets
perf cs-etm: Refactor instruction size handling
perf cs-etm: Use thread-stack for last branch entries
perf cs-etm: Flush thread stacks after decoder reset
perf cs-etm: Support call indentation
perf cs-etm: Synthesize callchains for instruction samples
perf test: Add Arm CoreSight callchain test
tools/perf/Documentation/perf-test.txt | 6 +-
tools/perf/scripts/python/arm-cs-trace-disasm.py | 9 +-
tools/perf/tests/builtin-test.c | 1 +
tools/perf/tests/shell/coresight/callchain.sh | 172 ++++++++++
.../shell/coresight/test_arm_coresight_disasm.sh | 4 +-
tools/perf/tests/tests.h | 1 +
tools/perf/tests/workloads/Build | 2 +
tools/perf/tests/workloads/callchain.c | 33 ++
tools/perf/util/cs-etm.c | 377 +++++++++++++--------
9 files changed, 454 insertions(+), 151 deletions(-)
---
base-commit: 8c214ad8cb8d692c82c6466b8e88973dbfa8e064
change-id: 20260521-b4-arm_cs_callchain_support_v1-2c2a70719bcc
Best regards,
--
Leo Yan <leo.yan(a)arm.com>