4.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leo Yan <leo.yan(a)linaro.org>
commit d6c9c05fe1eb4b213b183d8a1e79416256dc833a upstream.
Since commit edeb0c90df35 ("perf tools: Stop fallbacking to kallsyms for
vdso symbols lookup"), kernel addresses can no longer be resolved to
kernel symbols with the command 'perf script -k vmlinux'. The reason is
that CoreSight samples always set the CPU mode to PERF_RECORD_MISC_USER,
so the lookup fails to find the corresponding map/dso in the flow below:
  process_sample_event()
  `-> machine__resolve()
      `-> thread__find_map(thread, sample->cpumode, sample->ip, al);
This flow passes 'sample->cpumode' to tell thread__find_map() the CPU
mode. Previously PERF_RECORD_MISC_USER was always passed, but this caused
no failure until commit edeb0c90df35 ("perf tools: Stop fallbacking to
kallsyms for vdso symbols lookup") was merged. Even with the wrong CPU
mode, thread__find_map() first failed to find a map but then fell back
to the kernel map, a fallback that existed for vdso symbol lookup. The
latest code has removed that fallback, so with CPU mode
PERF_RECORD_MISC_USER a map can no longer be found for a kernel address.
This patch corrects the CPU mode of the samples. It adds a new helper
function, cs_etm__cpu_mode(), which derives the CPU mode from the address
and the info in the machine structure. The helper goes slightly further
than distinguishing kernel and user mode: it also checks for host/guest
and hypervisor mode. Finally, the patch uses the helper for instruction
and branch samples, and also in cs_etm__mem_access() as a minor cleanup.
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
Cc: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: David Miller <davem(a)davemloft.net>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: coresight(a)lists.linaro.org
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: stable(a)kernel.org # v4.19
Link: http://lkml.kernel.org/r/1540883908-17018-1-git-send-email-leo.yan@linaro.o…
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/util/cs-etm.c | 39 ++++++++++++++++++++++++++++++---------
1 file changed, 30 insertions(+), 9 deletions(-)
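The decision made by the new helper can be tried outside perf with a small standalone sketch. This is only an illustration under assumptions: the sketch_* names, the struct and the enum values are hypothetical stand-ins rather than perf definitions, and the perf_guest flag is modelled as a struct field instead of perf's global variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PERF_RECORD_MISC_* CPU modes. */
enum sketch_cpumode {
	SKETCH_KERNEL,
	SKETCH_USER,
	SKETCH_HYPERVISOR,
	SKETCH_GUEST_KERNEL,
	SKETCH_GUEST_USER,
};

/* Hypothetical reduction of the state the helper consults. */
struct sketch_machine {
	bool is_host;          /* what machine__is_host() would report */
	bool perf_guest;       /* stand-in for perf's perf_guest flag */
	uint64_t kernel_start; /* first kernel address */
};

/* Classify an address the same way the patch's helper does. */
enum sketch_cpumode sketch_cpu_mode(const struct sketch_machine *m,
				    uint64_t address)
{
	assert(m != NULL);

	/* Addresses at or above kernel_start are kernel addresses. */
	if (address >= m->kernel_start)
		return m->is_host ? SKETCH_KERNEL : SKETCH_GUEST_KERNEL;

	/* Below kernel_start: user on the host, guest user or hypervisor otherwise. */
	if (m->is_host)
		return SKETCH_USER;
	return m->perf_guest ? SKETCH_GUEST_USER : SKETCH_HYPERVISOR;
}
```

Calling sketch_cpu_mode() with a host machine and an address at or above kernel_start yields the kernel mode, which is the case the synthesized samples previously mislabelled as user.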
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -244,6 +244,27 @@ static void cs_etm__free(struct perf_ses
zfree(&aux);
}
+static u8 cs_etm__cpu_mode(struct cs_etm_queue *etmq, u64 address)
+{
+ struct machine *machine;
+
+ machine = etmq->etm->machine;
+
+ if (address >= etmq->etm->kernel_start) {
+ if (machine__is_host(machine))
+ return PERF_RECORD_MISC_KERNEL;
+ else
+ return PERF_RECORD_MISC_GUEST_KERNEL;
+ } else {
+ if (machine__is_host(machine))
+ return PERF_RECORD_MISC_USER;
+ else if (perf_guest)
+ return PERF_RECORD_MISC_GUEST_USER;
+ else
+ return PERF_RECORD_MISC_HYPERVISOR;
+ }
+}
+
static u32 cs_etm__mem_access(struct cs_etm_queue *etmq, u64 address,
size_t size, u8 *buffer)
{
@@ -258,10 +279,7 @@ static u32 cs_etm__mem_access(struct cs_
return -1;
machine = etmq->etm->machine;
- if (address >= etmq->etm->kernel_start)
- cpumode = PERF_RECORD_MISC_KERNEL;
- else
- cpumode = PERF_RECORD_MISC_USER;
+ cpumode = cs_etm__cpu_mode(etmq, address);
thread = etmq->thread;
if (!thread) {
@@ -653,7 +671,7 @@ static int cs_etm__synth_instruction_sam
struct perf_sample sample = {.ip = 0,};
event->sample.header.type = PERF_RECORD_SAMPLE;
- event->sample.header.misc = PERF_RECORD_MISC_USER;
+ event->sample.header.misc = cs_etm__cpu_mode(etmq, addr);
event->sample.header.size = sizeof(struct perf_event_header);
sample.ip = addr;
@@ -665,7 +683,7 @@ static int cs_etm__synth_instruction_sam
sample.cpu = etmq->packet->cpu;
sample.flags = 0;
sample.insn_len = 1;
- sample.cpumode = event->header.misc;
+ sample.cpumode = event->sample.header.misc;
if (etm->synth_opts.last_branch) {
cs_etm__copy_last_branch_rb(etmq);
@@ -706,12 +724,15 @@ static int cs_etm__synth_branch_sample(s
u64 nr;
struct branch_entry entries;
} dummy_bs;
+ u64 ip;
+
+ ip = cs_etm__last_executed_instr(etmq->prev_packet);
event->sample.header.type = PERF_RECORD_SAMPLE;
- event->sample.header.misc = PERF_RECORD_MISC_USER;
+ event->sample.header.misc = cs_etm__cpu_mode(etmq, ip);
event->sample.header.size = sizeof(struct perf_event_header);
- sample.ip = cs_etm__last_executed_instr(etmq->prev_packet);
+ sample.ip = ip;
sample.pid = etmq->pid;
sample.tid = etmq->tid;
sample.addr = cs_etm__first_executed_instr(etmq->packet);
@@ -720,7 +741,7 @@ static int cs_etm__synth_branch_sample(s
sample.period = 1;
sample.cpu = etmq->packet->cpu;
sample.flags = 0;
- sample.cpumode = PERF_RECORD_MISC_USER;
+ sample.cpumode = event->sample.header.misc;
/*
* perf report cannot handle events without a branch stack
This is a note to let you know that I've just added the patch titled
perf cs-etm: Correct CPU mode for samples
to the 4.19-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-cs-etm-correct-cpu-mode-for-samples.patch
and it can be found in the queue-4.19 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d6c9c05fe1eb4b213b183d8a1e79416256dc833a Mon Sep 17 00:00:00 2001
From: Leo Yan <leo.yan(a)linaro.org>
Date: Tue, 30 Oct 2018 15:18:28 +0800
Subject: perf cs-etm: Correct CPU mode for samples
Patches currently in stable-queue which might be from leo.yan(a)linaro.org are
queue-4.19/perf-cs-etm-correct-cpu-mode-for-samples.patch
queue-4.19/perf-intel-pt-bts-calculate-cpumode-for-synthesized-samples.patch
queue-4.19/perf-intel-pt-insert-callchain-context-into-synthesized-callchains.patch
This set adds support for ETMv3/PTM1.1 trace decoding. The work has been
tested on TC2 and ST-Microelectronics' mp157c-ev1 board and applies cleanly
on 4.20-rc2 and Acme's perf/core branch [1].
*** Before this set ***
$ perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
DCD_ETMV4_0020 : 0x0003 (OCSD_ERR_NOT_INIT) [Component not initialised.]; No decoder configuration information
DCD_ETMV4_0022 : 0x0003 (OCSD_ERR_NOT_INIT) [Component not initialised.]; No decoder configuration information
DCD_ETMV4_0024 : 0x0003 (OCSD_ERR_NOT_INIT) [Component not initialised.]; No decoder configuration information
DCD_ETMV4_0020 : 0x0003 (OCSD_ERR_NOT_INIT) [Component not initialised.]; No decoder configuration information
DCD_ETMV4_0022 : 0x0003 (OCSD_ERR_NOT_INIT) [Component not initialised.]; No decoder configuration information
DCD_ETMV4_0024 : 0x0003 (OCSD_ERR_NOT_INIT) [Component not initialised.]; No decoder configuration information
Warning:
AUX data lost 2 times out of 2!
Error:
The perf.data file has no samples!
*** After this set ***
[...]
# Samples: 12K of event 'branches'
# Event count (approx.): 12049
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................ .......................
#
28.18% 28.18% uname libc-2.19.so [.] strcmp
9.13% 9.13% uname libc-2.19.so [.] strcpy
7.87% 7.87% uname libc-2.19.so [.] strnlen
5.58% 5.58% uname libc-2.19.so [.] strlen
2.24% 2.24% uname libc-2.19.so [.] __rawmemchr
1.91% 1.91% uname ld-2.19.so [.] 0x000000000001156a
1.49% 1.49% uname libc-2.19.so [.] __argz_stringify
1.46% 1.46% uname libc-2.19.so [.] malloc
0.96% 0.96% uname libc-2.19.so [.] 0x0000000000054770
0.91% 0.91% uname libc-2.19.so [.] 0x000000000002430a
0.85% 0.85% uname ld-2.19.so [.] 0x0000000000007244
0.83% 0.83% uname libc-2.19.so [.] __stpcpy
[...]
Regards,
Mathieu
[1]. "6909b0a13389 perf stat: Use perf_evsel__is_clocki() for clock events"
Mathieu Poirier (3):
perf tools: Add configuration for ETMv3 trace protocol
perf tools: Add support for ETMv3 trace decoding
perf tools: Add support for PTMv1.1 decoding
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 31 +++++++++++
tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 9 +++
tools/perf/util/cs-etm.c | 73 ++++++++++++++++++++-----
3 files changed, 99 insertions(+), 14 deletions(-)
--
2.7.4
This set addresses problems observed with the CLAIM tag feature. The first
patch adds support for CLAIM tags to the ETB10 drivers. The remaining 3
patches deal with properly handling the tags on ETF and ETM3x devices.
Regards,
Mathieu
Changes since V1:
* Added Suzuki's review tags to patch 1 and 2.
* Addressed ordering issues in ETM3x enable/disable functions (Leo Yan)
Mathieu Poirier (4):
coresight: etb10: Add support for CLAIM tag
coresight: etf: Release CLAIM tag after disabling the HW
coresight: etm3x: Deal with CLAIM tag before and after accessing HW
coresight: etm3x: Release CLAIM tag when operated from perf
drivers/hwtracing/coresight/coresight-etb10.c | 23 +++++++++++++++++------
drivers/hwtracing/coresight/coresight-etm3x.c | 17 +++++++++--------
drivers/hwtracing/coresight/coresight-tmc-etf.c | 2 +-
3 files changed, 27 insertions(+), 15 deletions(-)
--
2.7.4
Hi all,
I have found that 'perf script' fails to parse kernel symbols for Arm
CoreSight trace data when no kernel vmlinux file is specified. Without a
kernel symbol file, the perf tool falls back to /proc/kallsyms for kernel
symbol parsing, and as a result it runs into the flow below:
  thread__find_addr_map(thread, cpumode, MAP__FUNCTION, address, &al);
  map__load(al.map);
  dso__data_read_offset(al.map->dso, machine, offset, buffer, size);
  `-> data_read_offset()
I can see that data_read_offset() returns failure. This is caused by the
offset sanity check "if (offset > dso->data.file_size)" (the whole
function is pasted below for context): when perf uses "/proc/kallsyms"
to load kernel symbols, 'dso->data.file_size' is set to zero, so the
sanity check treats any non-zero offset as beyond the end of the file.
I still don't understand how the dso/map code supports "/proc/kallsyms"
and have no idea how to fix this issue, though I have spent some time
looking into it. Could you give me some suggestions? Or, even better, if
you have a fix for this, I would be glad to test it on my side.
static ssize_t data_read_offset(struct dso *dso, struct machine *machine,
				u64 offset, u8 *data, ssize_t size)
{
	if (data_file_size(dso, machine))
		return -1;
	/* Check the offset sanity. */
	if (offset > dso->data.file_size)
		return -1;
	if (offset + size < offset)
		return -1;
	return cached_read(dso, machine, offset, data, size);
}
Thanks,
Leo Yan
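The failure described above can be reproduced in isolation with a minimal sketch of the two sanity checks. The names sketch_dso_data and sketch_offset_ok() are hypothetical, the struct keeps only the one field the check reads, and size is taken as an unsigned value here so that the wrap-around test is well defined; none of this is perf code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the single dso field the check consults. */
struct sketch_dso_data {
	uint64_t file_size; /* zero models the /proc/kallsyms case from the report */
};

/* Returns 0 when both sanity checks pass, -1 otherwise. */
int sketch_offset_ok(const struct sketch_dso_data *data,
		     uint64_t offset, uint64_t size)
{
	assert(data != NULL);

	/* Reject offsets beyond the recorded file size. */
	if (offset > data->file_size)
		return -1;

	/* Reject reads whose end would wrap around the address space. */
	if (offset + size < offset)
		return -1;

	return 0;
}
```

With file_size set to zero, any non-zero offset is rejected by the first check, which matches the behaviour reported for the kallsyms case.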
This set addresses 3 problems observed with the CLAIM tag feature. The first
patch adds support for CLAIM tags to the ETB10 drivers. The second and third
patches deal with releasing the tags on ETF and ETM3x devices.
Review and testing would be appreciated.
Regards,
Mathieu
Mathieu Poirier (3):
coresight: etb10: Add support for CLAIM tag
coresight: etf: Release CLAIM tag after disabling the HW
coresight: etm3x: Release CLAIM tag when operated from perf
drivers/hwtracing/coresight/coresight-etb10.c | 23 +++++++++++++++++------
drivers/hwtracing/coresight/coresight-etm3x.c | 2 ++
drivers/hwtracing/coresight/coresight-tmc-etf.c | 2 +-
3 files changed, 20 insertions(+), 7 deletions(-)
--
2.7.4
This set adds support for ETMv3/PTM1.1 trace decoding. The work has been
tested on TC2 and ST-Microelectronics' mp157c-ev1 board and applies cleanly
on 4.20-rc1. Although not related to trace decoding, the following patches
[1][2][3] are required for ETMv3/PTM1.1 traces to be generated properly on
a 4.20-rc1 mainline kernel.
I am planning to post this on the kernel mailing list in a week.
Regards,
Mathieu
Changes since V1:
* Address a problem with protocol identification in [3/3]
[1]. https://lore.kernel.org/patchwork/patch/1007184/
[2]. https://lore.kernel.org/patchwork/patch/1007185/
[3]. https://lore.kernel.org/patchwork/patch/1007186/
Mathieu Poirier (3):
perf tools: Add configuration for ETMv3 trace protocol
perf tools: Add support for ETMv3 trace decoding
perf tools: Add support for PTMv1.1 decoding
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 31 +++++++++++
tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 9 +++
tools/perf/util/cs-etm.c | 73 ++++++++++++++++++++-----
3 files changed, 99 insertions(+), 14 deletions(-)
--
2.7.4
Changes in this release:
Functional update: Add additional information about the last
instruction to the generic output packet & update docs for updated
output packet.
Bugfix: typecast removed from OCSD_VER_NUM in ocsd_if_version.h to
allow use in C pre-processor.
Bugfix: ETMV4: Interworking ISA change between A32-T32 occasionally
missed during instruction decode.
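The reason a typecast prevents use in the C pre-processor can be shown with a short sketch. The SKETCH_* macro names and the version values are hypothetical and are not the contents of ocsd_if_version.h; the point is only that #if accepts integer constant expressions but not casts.

```c
#include <assert.h>

/* Hypothetical version components, for illustration only. */
#define SKETCH_VER_MAJOR 0x1
#define SKETCH_VER_MINOR 0x2
#define SKETCH_VER_PATCH 0x3

/* No typecast: a plain integer constant expression, usable in #if. */
#define SKETCH_VER_NUM \
	((SKETCH_VER_MAJOR << 16) | (SKETCH_VER_MINOR << 8) | SKETCH_VER_PATCH)

/*
 * Had the macro been written as ((unsigned int)(...)), the #if below
 * would fail to compile, because the pre-processor does not know type
 * names and cannot evaluate a cast.
 */
#if SKETCH_VER_NUM >= 0x010200
#define SKETCH_HAS_NEW_PACKET_INFO 1
#else
#define SKETCH_HAS_NEW_PACKET_INFO 0
#endif

/* The same macro still works in ordinary compile-time checks. */
static_assert(SKETCH_VER_NUM == 0x010203, "unexpected sketch version");

/* Reports which branch the pre-processor selected. */
int sketch_has_new_packet_info(void)
{
	return SKETCH_HAS_NEW_PACKET_INFO;
}
```

Client code can then select between code paths at build time with a version comparison in #if, which is the use the bugfix enables.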
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK