Hi James,
Changes since v2:
- Remove const to non-const change and copy strings where needed instead.
- Use sizeof() instead of PATH_MAX
- Append "will not be set accurately." to new error message
- Remove unneeded stat() call
- Rebase on perf/core
==========================
Changes since v1:
- Add 3 refactor commits for sysfs reading around pmu.c as suggested by Arnaldo here [1]
- The dependency on [2] has now reached mainline so is no longer blocking
- Rebase on perf/core
German Gomez (4): perf pmu: Add function to check if a pmu file exists perf cs_etm: Keep separate symbols for ETMv4 and ETE parameters perf cs_etm: Record ts_source in AUXTRACE_INFO for ETMv4 and ETE perf cs_etm: Set the time field in the synthetic samples
James Clark (3): perf: Remove duplication around EVENT_SOURCE_DEVICE_PATH perf: Use perf_pmu__open_file() and perf_pmu__scan_file() perf: Remove remaining duplication of bus/event_source/devices/...
tools/perf/arch/arm/util/auxtrace.c | 5 +- tools/perf/arch/arm/util/cs-etm.c | 91 ++++++++++++-- tools/perf/arch/x86/util/pmu.c | 12 +- tools/perf/util/cputopo.c | 9 +- tools/perf/util/cs-etm-base.c | 34 ++++-- tools/perf/util/cs-etm.c | 86 ++++++++++++-- tools/perf/util/cs-etm.h | 13 +- tools/perf/util/pmu-hybrid.c | 27 +---- tools/perf/util/pmu.c | 177 +++++++++++----------------- tools/perf/util/pmu.h | 10 +- 10 files changed, 284 insertions(+), 180 deletions(-)
base-commit: 09e6f9f98370be9a9f8978139e0eb1be87d1125f
I have tested this patch set on our platform and able to see updated timestamp values in perf script's output.
$ perf record -e cs_etm/cycacc,@tmc_etr0/k -C 9 taskset -c 9 sleep 2 $ perf script --itrace=i1ns --ns -Fcomm,tid,pid,time,cpu,event,ip,sym,addr,symoff,flags,callindent
At certain points noticed that hardware emits same timestamp packets but with updated cycle count (CC) values. A small snippet of the log:
Idx:100; ID:12; I_ADDR_S_IS0 : Address, Short, IS0.; Addr=0xFFFF8000086A761C ~[0x761C] Idx:103; ID:12; I_TIMESTAMP : Timestamp.; Updated val = 0x2f373e37e02; CC=0x3d Idx:107; ID:12; I_ATOM_F2 : Atom format 2.; EN Idx:108; ID:12; I_TIMESTAMP : Timestamp.; Updated val = 0x2f373e37e02; CC=0x3f Idx:112; ID:12; I_ATOM_F1 : Atom format 1.; N Idx:113; ID:12; I_TIMESTAMP : Timestamp.; Updated val = 0x2f373e37e02; CC=0x45 Idx:116; ID:12; I_ATOM_F1 : Atom format 1.; E Idx:117; ID:12; I_ADDR_S_IS0 : Address, Short, IS0.; Addr=0xFFFF8000086B52D4 ~[0x152D4]
Since the source of timestamp is the Generic Timer block and the CPUs run at higher frequencies, this behaviour could be possible on high performance ARM cores.
Having consecutive timestamps with same value is resulting in a slightly jumbled order (in nanosecs) in perf script's time column. A snippet corresponding to the Coresight trace data mentioned above: ... perf 965/965 [001] 3182.286629044: instructions:k: return 0 ffff8000086a761c coresight_timeout+0xc8 perf 965/965 [001] 3182.286629044: instructions:k: return 0 ffff8000086a7620 coresight_timeout+0xcc perf 965/965 [001] 3182.286629046: instructions:k: jmp 0 ffff8000086a75c8 coresight_timeout+0x74 perf 965/965 [001] 3182.286629046: instructions:k: jmp 0 ffff8000086a75cc coresight_timeout+0x78 perf 965/965 [001] 3182.286629044: instructions:k: jcc 0 ffff8000086a75d0 coresight_timeout+0x7c perf 965/965 [001] 3182.286629044: instructions:k: jcc 0 ffff8000086a75d4 coresight_timeout+0x80 perf 965/965 [001] 3182.286629044: instructions:k: jcc 0 ffff8000086a75d8 coresight_timeout+0x84 perf 965/965 [001] 3182.286629044: instructions:k: jcc 0 ffff8000086a75dc coresight_timeout+0x88 perf 965/965 [001] 3182.286629044: instructions:k: jcc 0 ffff8000086a75e0 coresight_timeout+0x8c perf 965/965 [001] 3182.286629044: instructions:k: jcc 0 ffff8000086a75e4 coresight_timeout+0x90 perf 965/965 [001] 3182.286629044: instructions:k: jcc 0 ffff8000086a75e8 coresight_timeout+0x94 perf 965/965 [001] 3182.286629044: instructions:k: jcc 0 ffff8000086a75ec coresight_timeout+0x98
Perf's do_soft_timestamp() logic in cs_etm_decoder.c file is incrementing the HW timestamp based on instruction count. Since the next timestamp also has the same value, it could be leading to this jumbled order.
We would like to know if this has been observed on other platforms ? And what could be a solution in SW for this ?
With Regards, Tanmay