Hi Mark,
CoreSight trace collection is broken on v5.0-rc6 due to this commit:
9dff0aa95a32 perf/core: Don't WARN() for impossible ring-buffer sizes
Before:
root@juno:/home/linaro# perf record -e cs_etm/@20070000.etr/u --per-thread uname
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.036 MB perf.data ]
root@juno:/home/linaro#
After:
root@juno:/home/linaro# perf record -e cs_etm/@20070000.etr/u --per-thread uname
failed to mmap with 12 (Cannot allocate memory)
root@juno:/home/linaro#
The problem is related to the order_base_2() [1] test: with nr_pages equal
to 128 the size being tested is 1264, which yields an order of 11 and
leads directly to the error path.
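The arithmetic can be checked in user space. The sketch below is a stand-in for order_base_2() (the smallest k such that 2^k >= n), not the kernel macro itself; the function name is hypothetical.

```c
/*
 * User-space stand-in for order_base_2(): returns the smallest k
 * such that (1UL << k) >= n. Illustration only, not the kernel macro.
 */
static unsigned int order_base_2_sketch(unsigned long n)
{
	unsigned int order = 0;

	while ((1UL << order) < n)
		order++;
	return order;
}
```

For a size of 1264 this returns 11, because 2^10 = 1024 is too small and 2^11 = 2048 is the first power of two that fits.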
The results are the same with linux-next 20190213. This was tested on a
Juno R0 and R1 with a 4K page configuration. I haven't tried but I'm
pretty sure it breaks IntelPT as well.
Please have a look when you have a minute. Leo Yan and I will be happy to
test patches.
Thanks,
Mathieu
[1].
https://elixir.bootlin.com/linux/v5.0-rc6/source/kernel/events/ring_buffer.…
This is important - see below.
Once this change is released I will send a patch to update the HOWTO.md on
github.
Mathieu
---------- Forwarded message ---------
From: Arnaldo Carvalho de Melo <acme(a)kernel.org>
Date: Tue, 12 Feb 2019 at 11:04
Subject: HEADS UP: disable building with opencsd by default
To: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Jiri Olsa <jolsa(a)kernel.org>, Kim Phillips <kim.phillips(a)arm.com>, <
linux-arm-kernel(a)lists.infradead.org>, Mike Leach <mike.leach(a)arm.com>,
Namhyung Kim <namhyung(a)kernel.org>, Peter Zijlstra <peterz(a)infradead.org>,
Suzuki Poulouse <suzuki.poulose(a)arm.com>, Ingo Molnar <mingo(a)kernel.org>
Please see the rationale on the cset, this test-all.c thing was done by
Ingo long ago to speed up the build of the tools by checking first if
the common set of libraries is present, and because this isn't being
properly checked I left several cases pass, like opencsd, libaio and
libcrypto, that I'm working as well and have fixes in my perf/core
branch.
- Arnaldo
commit b7213083e3f3d69fffb5229034b3ac3ca9d41038
Author: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Date: Tue Feb 12 14:37:15 2019 -0300
perf coresight: Do not test for libopencsd by default
Since it is not yet that generally available, avoid testing for the
presence of libopencsd in the fast path test-all.bin feature test.
# dnf search opencsd
No matches found.
# dnf search OpenCSD
No matches found.
# cat /etc/fedora-release
Fedora release 29 (Twenty Nine)
#
I.e. right now, in my system test-all.bin is failing all the time since
Fedora29 doesn't have libopencsd available:
$ cat /tmp/build/perf/feature/test-all.make.output
In file included from test-all.c:174:
test-libopencsd.c:2:10: fatal error: opencsd/c_api/opencsd_c_api.h:
No such file or directory
#include <opencsd/c_api/opencsd_c_api.h>
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
See:
6ab2b762befd ("perf build: Disable libbabeltrace check by default")
For the rationale, as soon as libopencsd becomes more generally packaged
and available, we do the same thing we did with babeltrace, enabling it
by default, as done in:
24787afbcd01 ("perf tools: Enable LIBBABELTRACE by default")
For now, to explicitly ask for opencsd, make sure you have it installed
and use:
make -C tools/perf CORESIGHT=1
The feature test output will be there as an empty file:
$ ls -la /tmp/build/perf/feature/test-libopencsd.make.output
-rw-rw-r--. 1 acme acme 0 Feb 12 14:49 /tmp/build/perf/feature/test-libopencsd.make.output
Because the binary used for the feature check was successfully built:
$ ls -la /tmp/build/perf/feature/test-libopencsd.bin
-rwxrwxr-x. 1 acme acme 18336 Feb 12 14:49 /tmp/build/perf/feature/test-libopencsd.bin
$ ldd /tmp/build/perf/feature/test-libopencsd.bin
linux-vdso.so.1 (0x00007fffe18cc000)
libopencsd_c_api.so.0 => /lib64/libopencsd_c_api.so.0 (0x00007fb8e67f6000)
libopencsd.so.0 => /lib64/libopencsd.so.0 (0x00007fb8e676f000)
libc.so.6 => /lib64/libc.so.6 (0x00007fb8e65a9000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fb8e6411000)
libm.so.6 => /lib64/libm.so.6 (0x00007fb8e628d000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fb8e6272000)
/lib64/ld-linux-x86-64.so.2 (0x00007fb8e6828000)
$
And the resulting perf binary will be linked with it:
$ ldd ~/bin/perf | grep opencsd
libopencsd_c_api.so.0 => /lib64/libopencsd_c_api.so.0 (0x00007fd43097f000)
libopencsd.so.0 => /lib64/libopencsd.so.0 (0x00007fd4308f8000)
$
To make sure this gets built before pushing things upstream I have a
ubuntu:19.04-x-arm64 container that has:
[root@quaco x-arm64]# grep CORESIGHT Dockerfile
ENV EXTRA_MAKE_ARGS=CORESIGHT=1
[root@quaco x-arm64]#
So that I always build with libopencsd before pushing things upstream.
Cc: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Jiri Olsa <jolsa(a)kernel.org>
Cc: Kim Phillips <kim.phillips(a)arm.com>
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Mike Leach <mike.leach(a)arm.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose(a)arm.com>
Link: https://lkml.kernel.org/n/tip-20vyy39jw9jgrijesi30fgox@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 5467c6bf9ceb..bb9dca65eb5f 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -70,7 +70,6 @@ FEATURE_TESTS_BASIC := \
sched_getcpu \
sdt \
setns \
- libopencsd \
libaio
# FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list
@@ -84,6 +83,7 @@ FEATURE_TESTS_EXTRA := \
libbabeltrace \
libbfd-liberty \
libbfd-liberty-z \
+ libopencsd \
libunwind-debug-frame \
libunwind-debug-frame-arm \
libunwind-debug-frame-aarch64 \
diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c
index 20cdaa4fc112..74329957553a 100644
--- a/tools/build/feature/test-all.c
+++ b/tools/build/feature/test-all.c
@@ -170,10 +170,6 @@
# include "test-setns.c"
#undef main
-#define main main_test_libopencsd
-# include "test-libopencsd.c"
-#undef main
-
#define main main_test_libaio
# include "test-libaio.c"
#undef main
@@ -217,7 +213,6 @@ int main(int argc, char *argv[])
main_test_sched_getcpu();
main_test_sdt();
main_test_setns();
- main_test_libopencsd();
main_test_libaio();
return 0;
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index b441c88cafa1..e3bf29f942a8 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -386,7 +386,8 @@ ifeq ($(feature-setns), 1)
$(call detected,CONFIG_SETNS)
endif
-ifndef NO_CORESIGHT
+ifdef CORESIGHT
+ $(call feature_check,libopencsd)
ifeq ($(feature-libopencsd), 1)
CFLAGS += -DHAVE_CSTRACE_SUPPORT $(LIBOPENCSD_CFLAGS)
LDFLAGS += $(LIBOPENCSD_LDFLAGS)
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 09df1c8a4ec9..c2ccc54618d1 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -102,7 +102,7 @@ include ../scripts/utilities.mak
# When selected, pass LLVM_CONFIG=/path/to/llvm-config to `make' if
# llvm-config is not in $PATH.
#
-# Define NO_CORESIGHT if you do not want support for CoreSight trace decoding.
+# Define CORESIGHT if you DO WANT support for CoreSight trace decoding.
#
# Define NO_AIO if you do not want support of Posix AIO based trace
# streaming for record mode. Currently Posix AIO trace streaming is
The latest ARM CoreSight specification updates the component identification
requirements for all components attached to an AMBA bus. (ARM IHI 0029E)
This specification defines bits 15:12 in the ComponentID (CID) value as the
device class. Identification requirements now depend on this class.
Class 0xF: Traditional components identified by Peripheral ID (PID) only.
Class 0x9: CoreSight components may be identified by a Universal Component
Identifier (UCI) consisting of the PID plus CoreSight DevType and DevArch
values.
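As a minimal sketch of the class field described above, the helper below extracts bits 15:12 from a 32-bit CID value. The macro, enum and function names are hypothetical and are not taken from the patches.

```c
#include <stdint.h>

/* The device class lives in bits 15:12 of the Component ID (CID). */
#define CID_CLASS_SHIFT	12
#define CID_CLASS_MASK	0xfu

/* Hypothetical names for the two classes discussed in this cover letter. */
enum cid_class_sketch {
	CID_CLASS_CORESIGHT  = 0x9,	/* may be identified by a UCI */
	CID_CLASS_PERIPHERAL = 0xf,	/* identified by PID only */
};

static inline unsigned int cid_class(uint32_t cid)
{
	return (cid >> CID_CLASS_SHIFT) & CID_CLASS_MASK;
}
```

With this helper a CID of 0xB105900D decodes to class 0x9 and 0xB105F00D decodes to class 0xF.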
Current and future ARM CoreSight IP will now use the same PID for
components on the same function - e.g. the ETM, CTI, PMU and Debug elements
associated with a core. The first core to use this UCI method is the A35,
which currently has binding entries in the ETMv4 driver.
This patchset prepares for the addition of the upcoming CTI driver, which
will need to bind correctly with the A35 and with newly reported devices
that share a PID across multiple components. It overcomes the limitation of
binding by PID alone, which can no longer work.
Patch 0001: Adds new UCI data structure and uses it with existing drivers
that use private data field in amba_id. This fixes issue from prior set.
Patch 0002: Implements the UCI matching code in the AMBA core code.
Patch 0003: Update ETMv4 driver to use UCI as appropriate.
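The matching idea behind these patches can be sketched as follows. All structure and function names here are hypothetical and simplified for illustration; they are not the definitions added by the patches. The sketch assumes that a table entry without UCI data falls back to PID-only matching.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical UCI data attached to an ID table entry. */
struct uci_sketch {
	uint32_t devarch;
	uint32_t devarch_mask;
	uint32_t devtype;
};

/* Hypothetical ID table entry: PID plus optional UCI data. */
struct id_entry_sketch {
	uint32_t pid;
	uint32_t pid_mask;
	const struct uci_sketch *uci;	/* NULL: match on PID alone */
};

/* Hypothetical view of the ID registers read from a device. */
struct device_sketch {
	uint32_t pid;
	uint32_t cid;
	uint32_t devarch;
	uint32_t devtype;
};

static bool id_matches(const struct id_entry_sketch *id,
		       const struct device_sketch *dev)
{
	/* The PID must match for every class of component. */
	if ((dev->pid & id->pid_mask) != id->pid)
		return false;

	/* Only class 0x9 (CoreSight) components are checked against UCI. */
	if (((dev->cid >> 12) & 0xf) != 0x9 || !id->uci)
		return true;

	return dev->devtype == id->uci->devtype &&
	       (dev->devarch & id->uci->devarch_mask) == id->uci->devarch;
}
```

Two drivers can then list the same PID in their tables and still bind to different components, because the DevType and DevArch comparison separates them.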
Thanks
Mike
Tested on DB410, Juno; kernel 5.0-rc6
Changes since v4:
re-based on 5.0-rc6
Added reviewed/tested by tags.
Changes since v3:
Fix UCI structure to allow CoreSight drivers to set private data. This
fixes a bug where non-UCI private data would cause a driver binding
mismatch (e.g. STM).
Add CS_AMBA macros to simplify building AMBA ID tables, with and without
UCI settings.
Changes since v2:
Simplification of amba_cs_uci_id_match().
Fix CID class bitfield comments.
Dropped RFC tag on patchset.
Mike Leach (3):
drivers: amba: Updates to component identification for driver
matching.
drivers: amba: Update component matching to use the CoreSight UCI
values.
coresight: etmv4: Update ID register table to add UCI support
drivers/amba/bus.c | 45 +++++++++++++++----
drivers/hwtracing/coresight/coresight-etm3x.c | 44 ++++++------------
drivers/hwtracing/coresight/coresight-etm4x.c | 21 +++++----
drivers/hwtracing/coresight/coresight-priv.h | 40 +++++++++++++++++
drivers/hwtracing/coresight/coresight-stm.c | 14 ++----
drivers/hwtracing/coresight/coresight-tmc.c | 30 ++++---------
include/linux/amba/bus.h | 39 ++++++++++++++++
7 files changed, 153 insertions(+), 80 deletions(-)
--
2.19.1
This patch series adds support for sample flags so that perf can print
them for branch instructions.
Patch 0001 saves the last branch information in the packet structure:
the instruction type, subtype and condition flag, which together identify
the kind of branch instruction. It passes this information from the
decoder layer to cs-etm.c, so that cs-etm.c is the central place where
sample flags are set.
Patch 0002 is used to set sample flags for instruction range packet.
Patch 0003 is used to set sample flags for trace discontinuity packet.
Patches 0004/0005/0006 prepare for exception packet handling: patch 0004
adds the exception number to the packet; patches 0005/0006 use a
traceID/metadata tuple to look up the metadata pointer from the traceID.
This tells us whether the CPU is connected to an ETMv3 or an ETMv4, which
have completely different definitions for exception numbers.
Patch 0007 sets sample flags for the exception packet; patch 0008 adds
sample flags for the exception return packet.
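The flag selection described for patches 0001 and 0002 can be sketched roughly as below. This is an illustration under stated assumptions, not the code from the series: the enum, the function name and the flag values are local stand-ins, and the real decision uses the instruction type, subtype and condition flag reported by the decoder.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the perf sample flag bits; values are illustrative. */
#define FLAG_BRANCH		(1u << 0)
#define FLAG_CALL		(1u << 1)
#define FLAG_RETURN		(1u << 2)
#define FLAG_CONDITIONAL	(1u << 3)

/* Hypothetical summary of the last instruction in a range packet. */
enum last_instr_kind {
	INSTR_BRANCH,		/* plain branch */
	INSTR_BRANCH_LINK,	/* branch with link, i.e. a call */
	INSTR_RETURN,		/* function return */
	INSTR_OTHER,		/* not a branch */
};

static uint32_t range_packet_flags(enum last_instr_kind kind, bool conditional)
{
	uint32_t flags;

	switch (kind) {
	case INSTR_BRANCH:
		flags = FLAG_BRANCH;
		break;
	case INSTR_BRANCH_LINK:
		flags = FLAG_BRANCH | FLAG_CALL;
		break;
	case INSTR_RETURN:
		flags = FLAG_BRANCH | FLAG_RETURN;
		break;
	default:
		return 0;	/* no branch, no flags */
	}

	if (conditional)
		flags |= FLAG_CONDITIONAL;
	return flags;
}
```

In this sketch a plain branch corresponds to what perf script prints as jmp, a conditional branch to jcc, and the call and return cases to call and return.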
This patch set applies on top of acme's perf/core branch at the latest
commit bdec77cfe58d ("Merge remote-tracking branch 'tip/perf/urgent'
into perf/core").
It has been build-tested with perf for x86 and arm64 and verified with
the command below:
Before:
# perf script -F,-time,+flags -k vmlinux
[...]
main 6650 [001] 1 branches: f7b08490 lib_loop_test+0xc (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: f7b084a2 lib_loop_test+0x1e (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: f7b084a2 lib_loop_test+0x1e (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: f7b084a2 lib_loop_test+0x1e (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: f7b084a2 lib_loop_test+0x1e (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: f7b084a2 lib_loop_test+0x1e (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: f7b084b0 lib_loop_test+0x2c (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: 6b65a8 main+0x1c (/root/coresight_test/main)
main 6650 [001] 1 branches: 6b6448 printf@plt+0x8 (/root/coresight_test/main)
main 6650 [001] 1 branches: 6b642c _init+0x18 (/root/coresight_test/main)
main 6650 [001] 1 branches: f7b2d23c [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: f7b2906e [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: f7b2559a [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: f7b2559a [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: f7b2559a [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: f7b2559a [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: f7b2559a [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: f7b255ee [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: f7b25634 [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
[...]
After:
# perf script -F,-time,+flags -k vmlinux
[...]
main 6650 [001] 1 branches: jmp f7b08490 lib_loop_test+0xc (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: jcc f7b084a2 lib_loop_test+0x1e (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: jcc f7b084a2 lib_loop_test+0x1e (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: jcc f7b084a2 lib_loop_test+0x1e (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: jcc f7b084a2 lib_loop_test+0x1e (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: jcc f7b084a2 lib_loop_test+0x1e (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: return f7b084b0 lib_loop_test+0x2c (/root/coresight_test/libcstest.so)
main 6650 [001] 1 branches: call 6b65a8 main+0x1c (/root/coresight_test/main)
main 6650 [001] 1 branches: return 6b6448 printf@plt+0x8 (/root/coresight_test/main)
main 6650 [001] 1 branches: return 6b642c _init+0x18 (/root/coresight_test/main)
main 6650 [001] 1 branches: call f7b2d23c [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: call f7b2906e [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: jcc f7b2559a [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: jcc f7b2559a [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: jcc f7b2559a [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: jcc f7b2559a [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: jcc f7b2559a [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: jmp f7b255ee [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
main 6650 [001] 1 branches: call f7b25634 [unknown] (/usr/lib/arm-linux-gnueabihf/ld-2.27.so)
[...]
Changes from v6:
* Addressed Mathieu's suggestion for patch 0005: refined its commit log
and comment and fixed the cs_etm__get_cpu() definitions in the header
file; also build-tested perf for Arm and x86.
* Added Mathieu's review tags.
Changes from v5:
* Addressed Rob's suggestion to add specification info for exception
number encoding;
* Added Rob's review tag in patch 0007.
Changes from v4:
* Fixed typos in comments, and removed redundant info from commit log;
* Addressed Mathieu's suggestion to add helper functions for metadata
fields (CS_ETM_CPU and CS_ETM_MAGIC) accessing;
* Addressed Mathieu's suggestion to include headers with alphabetical
order.
Changes from v3:
* Fixed typos in commit logs;
* Rearranged fields in cs_etm_packet by grouping with same variable
types;
* Fixed the ETMv4 exception number, as pointed out by Mike;
* Fixed ETMv4 SVC / SMC / HVC being reported as the same CALL, by
checking for the svc instruction to distinguish them;
* Refined ETMv4 exception return packet handling.
Changes from v2:
* Addressed Mathieu's suggestion to split one big patch to 3 small
patches for setting sample flags, one is for instruction range
packet, one is for discontinuity packet and one is for exception
packet.
* Added support for the ETMv3 exception packet.
* Followed Mathieu's suggestion to move all sample flags handling
from decoder layer to cs-etm.c, thus it has enough info to set flags
based on trace context in single place.
Changes from v1:
* Moved exception packets handling patches into patch series 'perf
cs-etm: Correct packets handling'.
* Added sample flags fixing up for TRACE_OFF packet.
* Created a new function which is used to maintain flags fixing up.
Leo Yan (8):
perf cs-etm: Add last instruction information in packet
perf cs-etm: Set sample flags for instruction range packet
perf cs-etm: Set sample flags for trace discontinuity
perf cs-etm: Add exception number in exception packet
perf cs-etm: Change tuple from traceID-CPU# to traceID-metadata
perf cs-etm: Add traceID in packet
perf cs-etm: Set sample flags for exception packet
perf cs-etm: Set sample flags for exception return packet
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 41 +-
.../perf/util/cs-etm-decoder/cs-etm-decoder.h | 6 +
tools/perf/util/cs-etm.c | 394 +++++++++++++++++-
tools/perf/util/cs-etm.h | 53 ++-
4 files changed, 476 insertions(+), 18 deletions(-)
--
2.17.1
The latest ARM CoreSight specification updates the component identification
requirements for all components attached to an AMBA bus. (ARM IHI 0029E)
This specification defines bits 15:12 in the ComponentID (CID) value as the
device class. Identification requirements now depend on this class.
Class 0xF: Traditional components identified by Peripheral ID (PID) only.
Class 0x9: CoreSight components may be identified by a Universal Component
Identifier (UCI) consisting of the PID plus CoreSight DevType and DevArch
values.
Current and future ARM CoreSight IP will now use the same PID for
components on the same function - e.g. the ETM, CTI, PMU and Debug elements
associated with a core. The first core to use this UCI method is the A35,
which currently has binding entries in the ETMv4 driver.
This patchset prepares for the addition of the upcoming CTI driver, which
will need to bind correctly with the A35 and with newly reported devices
that share a PID across multiple components. It overcomes the limitation of
binding by PID alone, which can no longer work.
Patch 0001: Adds new UCI data structure and uses it with existing drivers
that use private data field in amba_id. This fixes issue from prior set.
Patch 0002: Implements the UCI matching code in the AMBA core code.
Patch 0003: Update ETMv4 driver to use UCI as appropriate.
Thanks
Mike
Tested on DB410, Juno; kernel 5.0-rc4
Changes since v3:
Fix UCI structure to allow CoreSight drivers to set private data. This
fixes a bug where non-UCI private data would cause a driver binding
mismatch (e.g. STM).
Add CS_AMBA macros to simplify building AMBA ID tables, with and without
UCI settings.
Changes since v2:
Simplification of amba_cs_uci_id_match().
Fix CID class bitfield comments.
Dropped RFC tag on patchset.
Mike Leach (3):
drivers: amba: Updates to component identification for driver
matching.
drivers: amba: Update component matching to use the CoreSight UCI
values.
coresight: etmv4: Update ID register table to add UCI support
drivers/amba/bus.c | 45 +++++++++++++++----
drivers/hwtracing/coresight/coresight-etm3x.c | 44 ++++++------------
drivers/hwtracing/coresight/coresight-etm4x.c | 21 +++++----
drivers/hwtracing/coresight/coresight-priv.h | 40 +++++++++++++++++
drivers/hwtracing/coresight/coresight-stm.c | 14 ++----
drivers/hwtracing/coresight/coresight-tmc.c | 30 ++++---------
include/linux/amba/bus.h | 39 ++++++++++++++++
7 files changed, 153 insertions(+), 80 deletions(-)
--
2.19.1
From: Leo Yan <leo.yan(a)linaro.org>
When returning from an exception, we need to distinguish a system call
return from the return of any other exception type in order to set
sample flags. Because the exception return packet doesn't contain the
exception number, we cannot decide the sample flags based on it.
On the other hand, the exception return packet is followed by an
instruction range packet; this range packet delivers the start address
after exception handling, so we can check whether the instruction just
before the start address is an SVC. If an SVC instruction is found
immediately ahead of the return address, it is an exception return for
a system call; otherwise it is a normal return for other exceptions.
This patch sets sample flags for the exception return packet. First it
simply sets the sample flags to PERF_IP_FLAG_INTERRUPT for all exception
returns, since at this point the exact exception type is unknown. The
decision on whether it is a system call return is deferred until the
next instruction range packet arrives: if there is an SVC instruction
just before the start address, the sample flags are changed to
PERF_IP_FLAG_SYSCALLRET.
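The deferred decision can be sketched in isolation as follows. This is a simplified illustration, not the patch's code: it handles only the A64 SVC encoding, reads instructions from a plain array instead of the traced program's memory, and uses local stand-ins for the perf flag bits. The function and parameter names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the perf sample flag bits; values are illustrative. */
#define FLAG_BRANCH	(1u << 0)
#define FLAG_RETURN	(1u << 2)
#define FLAG_SYSCALLRET	(1u << 4)
#define FLAG_INTERRUPT	(1u << 6)

/* A64 SVC is 0xd4000001 with a 16-bit immediate in bits 20:5. */
static bool is_a64_svc(uint32_t instr)
{
	return (instr & 0xffe0001fu) == 0xd4000001u;
}

/*
 * code[] holds nwords A64 instruction words starting at address base;
 * start_addr is where execution resumes after the exception. Returns
 * the possibly corrected flags for the previous packet.
 */
static uint32_t fixup_exception_return(uint32_t prev_flags,
				       const uint32_t *code, uint64_t base,
				       size_t nwords, uint64_t start_addr)
{
	const uint32_t exc_ret = FLAG_BRANCH | FLAG_RETURN | FLAG_INTERRUPT;
	uint64_t idx;

	/* Only an exception return marked earlier is a candidate. */
	if (prev_flags != exc_ret || start_addr < base + 4)
		return prev_flags;

	idx = (start_addr - base) / 4 - 1;	/* word just before start_addr */
	if (idx < nwords && is_a64_svc(code[idx]))
		return FLAG_BRANCH | FLAG_RETURN | FLAG_SYSCALLRET;

	return prev_flags;
}
```

The exception return is first recorded with the interrupt flag, and the flag is only rewritten once the following range packet supplies the address needed to look at the preceding instruction.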
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Mike Leach <mike.leach(a)linaro.org>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Robert Walker <robert.walker(a)arm.com>
Cc: Suzuki K Poulouse <suzuki.poulose(a)arm.com>
Cc: coresight ml <coresight(a)lists.linaro.org>
Cc: linux-arm-kernel(a)lists.infradead.org
Link: http://lkml.kernel.org/r/20190129122842.32041-9-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
---
tools/perf/util/cs-etm.c | 44 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 44 insertions(+)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index a714b31656ea..8b3f882d6e2f 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -1372,6 +1372,20 @@ static int cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
if (prev_packet->sample_type == CS_ETM_DISCONTINUITY)
prev_packet->flags |= PERF_IP_FLAG_BRANCH |
PERF_IP_FLAG_TRACE_BEGIN;
+
+ /*
+ * If the previous packet is an exception return packet
+ * and the return address just follows SVC instruction,
+ * it needs to calibrate the previous packet sample flags
+ * as PERF_IP_FLAG_SYSCALLRET.
+ */
+ if (prev_packet->flags == (PERF_IP_FLAG_BRANCH |
+ PERF_IP_FLAG_RETURN |
+ PERF_IP_FLAG_INTERRUPT) &&
+ cs_etm__is_svc_instr(etmq, packet, packet->start_addr))
+ prev_packet->flags = PERF_IP_FLAG_BRANCH |
+ PERF_IP_FLAG_RETURN |
+ PERF_IP_FLAG_SYSCALLRET;
break;
case CS_ETM_DISCONTINUITY:
/*
@@ -1422,6 +1436,36 @@ static int cs_etm__set_sample_flags(struct cs_etm_queue *etmq)
prev_packet->flags = packet->flags;
break;
case CS_ETM_EXCEPTION_RET:
+ /*
+ * When the exception return packet is inserted, since
+ * exception return packet is not used standalone for
+ * generating samples and it's affiliation to the previous
+ * instruction range packet; so set previous range packet
+ * flags to tell perf it is an exception return branch.
+ *
+ * The exception return can be for either system call or
+ * other exception types; unfortunately the packet doesn't
+ * contain exception type related info so we cannot decide
+ * the exception type purely based on exception return packet.
+ * If we record the exception number from exception packet and
+ * reuse it for exception return packet, this is not reliable
+ * due the trace can be discontinuity or the interrupt can
+ * be nested, thus the recorded exception number cannot be
+ * used for exception return packet for these two cases.
+ *
+ * For exception return packet, we only need to distinguish the
+ * packet is for system call or for other types. Thus the
+ * decision can be deferred when receive the next packet which
+ * contains the return address, based on the return address we
+ * can read out the previous instruction and check if it's a
+ * system call instruction and then calibrate the sample flag
+ * as needed.
+ */
+ if (prev_packet->sample_type == CS_ETM_RANGE)
+ prev_packet->flags = PERF_IP_FLAG_BRANCH |
+ PERF_IP_FLAG_RETURN |
+ PERF_IP_FLAG_INTERRUPT;
+ break;
case CS_ETM_EMPTY:
default:
break;
--
2.20.1
From: Leo Yan <leo.yan(a)linaro.org>
If packet processing wants to know which ETM version a packet is bound
to, it needs to access the metadata and decide based on the metadata
magic number. But we cannot simply use the logical CPU ID as an index
into the sequential metadata array: when the system has hotplugged-off
CPUs, the metadata array is only allocated for online CPUs, so the
logical CPU number doesn't match its index in the array.
This patch changes the tuple from traceID-CPU# to traceID-metadata, so
that the tuple can be used to retrieve the metadata pointer from the
traceID.
For safe access to metadata fields, this patch provides the helper
function cs_etm__get_cpu(), which returns the CPU number for a given
traceID; cs_etm_decoder__buffer_packet() is the first consumer of this
helper function.
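A minimal sketch of the traceID-to-metadata idea, assuming a small linear table in place of the RB tree used by perf. The structure, enum and function names are hypothetical, and the metadata layout is reduced to a magic field and a CPU field for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced metadata layout: magic number, then CPU number. */
enum { META_MAGIC, META_CPU, META_MAX };

/* Hypothetical node tying a traceID to its metadata block. */
struct traceid_node_sketch {
	uint8_t trace_id;
	uint64_t *metadata;
};

/* Look up the CPU number for a traceID; returns 0 on success, -1 if absent. */
static int get_cpu_sketch(const struct traceid_node_sketch *nodes, size_t n,
			  uint8_t trace_id, int *cpu)
{
	for (size_t i = 0; i < n; i++) {
		if (nodes[i].trace_id == trace_id) {
			*cpu = (int)nodes[i].metadata[META_CPU];
			return 0;
		}
	}
	return -1;
}
```

Because each node points at the whole metadata block, the same lookup also gives access to the magic number. The CPU number is read from the block itself, never inferred from the block's position in the array, so it stays correct when some CPUs are offline.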
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Mike Leach <mike.leach(a)linaro.org>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Robert Walker <robert.walker(a)arm.com>
Cc: Suzuki K Poulouse <suzuki.poulose(a)arm.com>
Cc: coresight ml <coresight(a)lists.linaro.org>
Cc: linux-arm-kernel(a)lists.infradead.org
Link: http://lkml.kernel.org/r/20190129122842.32041-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
---
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 8 +++---
tools/perf/util/cs-etm.c | 26 ++++++++++++++-----
tools/perf/util/cs-etm.h | 9 ++++++-
3 files changed, 31 insertions(+), 12 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index 294efa76c9e3..cdd38ffd10d2 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -305,14 +305,12 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder,
enum cs_etm_sample_type sample_type)
{
u32 et = 0;
- struct int_node *inode = NULL;
+ int cpu;
if (decoder->packet_count >= MAX_BUFFER - 1)
return OCSD_RESP_FATAL_SYS_ERR;
- /* Search the RB tree for the cpu associated with this traceID */
- inode = intlist__find(traceid_list, trace_chan_id);
- if (!inode)
+ if (cs_etm__get_cpu(trace_chan_id, &cpu) < 0)
return OCSD_RESP_FATAL_SYS_ERR;
et = decoder->tail;
@@ -322,7 +320,7 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder,
decoder->packet_buffer[et].sample_type = sample_type;
decoder->packet_buffer[et].isa = CS_ETM_ISA_UNKNOWN;
- decoder->packet_buffer[et].cpu = *((int *)inode->priv);
+ decoder->packet_buffer[et].cpu = cpu;
decoder->packet_buffer[et].start_addr = CS_ETM_INVAL_ADDR;
decoder->packet_buffer[et].end_addr = CS_ETM_INVAL_ADDR;
decoder->packet_buffer[et].instr_count = 0;
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 1aa29633ce77..a5497a761db7 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -97,6 +97,20 @@ static u32 cs_etm__get_v7_protocol_version(u32 etmidr)
return CS_ETM_PROTO_ETMV3;
}
+int cs_etm__get_cpu(u8 trace_chan_id, int *cpu)
+{
+ struct int_node *inode;
+ u64 *metadata;
+
+ inode = intlist__find(traceid_list, trace_chan_id);
+ if (!inode)
+ return -EINVAL;
+
+ metadata = inode->priv;
+ *cpu = (int)metadata[CS_ETM_CPU];
+ return 0;
+}
+
static void cs_etm__packet_dump(const char *pkt_string)
{
const char *color = PERF_COLOR_BLUE;
@@ -252,7 +266,7 @@ static void cs_etm__free(struct perf_session *session)
cs_etm__free_events(session);
session->auxtrace = NULL;
- /* First remove all traceID/CPU# nodes for the RB tree */
+ /* First remove all traceID/metadata nodes for the RB tree */
intlist__for_each_entry_safe(inode, tmp, traceid_list)
intlist__remove(traceid_list, inode);
/* Then the RB tree itself */
@@ -1519,9 +1533,9 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
0xffffffff);
/*
- * Create an RB tree for traceID-CPU# tuple. Since the conversion has
- * to be made for each packet that gets decoded, optimizing access in
- * anything other than a sequential array is worth doing.
+ * Create an RB tree for traceID-metadata tuple. Since the conversion
+ * has to be made for each packet that gets decoded, optimizing access
+ * in anything other than a sequential array is worth doing.
*/
traceid_list = intlist__new(NULL);
if (!traceid_list) {
@@ -1587,8 +1601,8 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
err = -EINVAL;
goto err_free_metadata;
}
- /* All good, associate the traceID with the CPU# */
- inode->priv = &metadata[j][CS_ETM_CPU];
+ /* All good, associate the traceID with the metadata pointer */
+ inode->priv = metadata[j];
}
/*
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h
index 37f8d48179ca..fb5fc6538b7f 100644
--- a/tools/perf/util/cs-etm.h
+++ b/tools/perf/util/cs-etm.h
@@ -53,7 +53,7 @@ enum {
CS_ETMV4_PRIV_MAX,
};
-/* RB tree for quick conversion between traceID and CPUs */
+/* RB tree for quick conversion between traceID and metadata pointers */
struct intlist *traceid_list;
#define KiB(x) ((x) * 1024)
@@ -69,6 +69,7 @@ static const u64 __perf_cs_etmv4_magic = 0x4040404040404040ULL;
#ifdef HAVE_CSTRACE_SUPPORT
int cs_etm__process_auxtrace_info(union perf_event *event,
struct perf_session *session);
+int cs_etm__get_cpu(u8 trace_chan_id, int *cpu);
#else
static inline int
cs_etm__process_auxtrace_info(union perf_event *event __maybe_unused,
@@ -76,6 +77,12 @@ cs_etm__process_auxtrace_info(union perf_event *event __maybe_unused,
{
return -1;
}
+
+static inline int cs_etm__get_cpu(u8 trace_chan_id __maybe_unused,
+ int *cpu __maybe_unused)
+{
+ return -1;
+}
#endif
#endif
--
2.20.1