CoreSight

coresight@lists.linaro.org


[PATCH] coresight: tmc: Refactor loops in etb dump

by Leo Yan

The ETB dump function tmc_etb_dump_hw() has nested loops. The second
level loop iterates an index in the range [0 .. drvdata->memwidth), but
the index isn't actually used in the code, so the second level loop is
useless.

This patch removes the second level loop; the refactor also reduces
indentation and lets us use 'break' in place of the 'goto' label.

Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 9c599c9..8b34161 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -34,23 +34,20 @@ static void tmc_etb_dump_hw(struct tmc_drvdata *drvdata)
 {
 	char *bufp;
 	u32 read_data, lost;
-	int i;

 	/* Check if the buffer wrapped around. */
 	lost = readl_relaxed(drvdata->base + TMC_STS) & TMC_STS_FULL;
 	bufp = drvdata->buf;
 	drvdata->len = 0;
 	while (1) {
-		for (i = 0; i < drvdata->memwidth; i++) {
-			read_data = readl_relaxed(drvdata->base + TMC_RRD);
-			if (read_data == 0xFFFFFFFF)
-				goto done;
-			memcpy(bufp, &read_data, 4);
-			bufp += 4;
-			drvdata->len += 4;
-		}
+		read_data = readl_relaxed(drvdata->base + TMC_RRD);
+		if (read_data == 0xFFFFFFFF)
+			break;
+		memcpy(bufp, &read_data, 4);
+		bufp += 4;
+		drvdata->len += 4;
 	}
-done:
+
 	if (lost)
 		coresight_insert_barrier_packet(drvdata->buf);
 	return;
--
2.7.4
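The shape of the simplified loop can be tried outside the kernel. The following is only a user-space model, not the driver code: `struct fake_tmc` and `fake_rrd_read()` are hypothetical stand-ins for `readl_relaxed(drvdata->base + TMC_RRD)`, and the sketch assumes, as the patch does, that the read register returns 0xFFFFFFFF once the buffer has been drained. The bound on the destination size is an addition for safety in the model; the driver loop itself is `while (1)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RRD_EMPTY 0xFFFFFFFFu	/* sentinel value checked by the patch */

/* Hypothetical stand-in for the TMC RAM read data register. */
struct fake_tmc {
	const uint32_t *words;	/* trace words still held by the "hardware" */
	size_t nr_words;
	size_t pos;
};

/* Model of readl_relaxed(base + TMC_RRD): next word, or the sentinel. */
static uint32_t fake_rrd_read(struct fake_tmc *tmc)
{
	if (tmc->pos >= tmc->nr_words)
		return RRD_EMPTY;
	return tmc->words[tmc->pos++];
}

/* Single-level drain loop, mirroring the shape of the refactored code. */
static size_t drain_buffer(struct fake_tmc *tmc, char *buf, size_t buf_size)
{
	char *bufp = buf;
	size_t len = 0;
	uint32_t read_data;

	assert(tmc && buf);
	while (len + sizeof(read_data) <= buf_size) {
		read_data = fake_rrd_read(tmc);
		if (read_data == RRD_EMPTY)
			break;	/* replaces the old "goto done" */
		memcpy(bufp, &read_data, sizeof(read_data));
		bufp += sizeof(read_data);
		len += sizeof(read_data);
	}
	return len;
}
```

Calling `drain_buffer()` on a model holding three words returns 12 and leaves those words at the start of `buf`, which is the same result the nested version would give, since the inner index never influenced the read.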

7 years, 5 months

[PATCH v3 0/9] coresight: Update device tree bindings

by Suzuki K Poulose

Coresight uses DT graph bindings to describe the connections of the
components. However, we have some undocumented usage of the bindings to
describe some of the properties of the connections.

The coresight driver needs to know the hardware ports involved in the
connection and the direction of data flow to effectively manage the
trace sessions. So far we have relied on the "port" address (as
described by the generic graph bindings) to represent the hardware port
of the component for a connection.

The hardware uses a separate numbering scheme for input and output
ports, which implies that we could have two different (input and output)
ports with the same port number. This could create problems in the graph
bindings, where the label of the port wouldn't match the address.

e.g., with the existing bindings we get:

	port@0 {	// Output port 0
		reg = <0>;
		...
	};

	port@1 {
		reg = <0>;	// Input port 0
		endpoint {
			slave-mode;
			...
		};
	};

With the new enforcement in the DT rules, mismatches in label and
address are not allowed (as seen in the case of port@1). So, we need a
new mechanism to describe the hardware port number reliably.

Also, we relied on an undocumented "slave-mode" property (see the above
example) to indicate if the port is an input port. Let us formalise and
switch to a new property to describe the direction of data flow.

There were three options considered for the hardware port number scheme:

1) Use natural ordering in the DT to infer the hardware port number,
   i.e., mandate that all ports are listed in the DT, in ascending
   order for each class (input and output respectively).

   Pros:
    - We don't need new properties, and if the existing DTS list them
      in order (which most of them do), they work out of the box.
   Cons:
    - We must list all the ports even if the system cannot/shouldn't
      use them.
    - It is prone to human errors (if the order is not kept).

2) Use an explicit property to list both the hw port number and the
   direction. Define "coresight,hwid" as a 2 member array of u32, where
   the members are the port number and the direction respectively.

   e.g.,
	port@0 {
		reg = <0>;
		endpoint {
			coresight,hwid = <0 1>;	// Port # 0, Output
		};
	};

	port@1 {
		reg = <1>;
		endpoint {
			coresight,hwid = <0 0>;	// Port # 0, Input
		};
	};

   Pros:
    - The bindings are formal but not so reader friendly and could
      potentially lead to human errors.
   Cons:
    - Backward compatibility is lost.

3) Use explicit properties (implemented in the series) for the hardware
   port id and direction. We define a new property "coresight,hwid" for
   each endpoint in coresight devices to specify the hardware port
   number explicitly. Also use a separate property "direction" to
   specify the direction of the data flow.

   e.g.,
	port@0 {
		reg = <0>;
		endpoint {
			direction = <1>;	// Output
			coresight,hwid = <0>;	// Port # 0
		};
	};

	port@1 {
		reg = <1>;
		endpoint {
			direction = <0>;	// Input
			coresight,hwid = <0>;	// Port # 0
		};
	};

   Pros:
    - The bindings are formal and reader friendly, and less prone to
      errors.
   Cons:
    - Backward compatibility is lost.

After a round of discussions [1], the following option (4) is adopted:

4) Group ports based on the directions under a dedicated node. This has
   been checked with the upstream DTC tool to resolve the "address
   mismatch" issue.

   e.g.,
	out-ports {		// Output ports for this component
		port@0 {	// Outport 0
			reg = <0>;
			endpoint { ... };
		};

		port@1 {	// Outport 1
			reg = <1>;
			endpoint { ... };
		};
	};

	in-ports {		// Input ports for this component
		port@0 {	// Inport 0
			reg = <0>;
			endpoint { ... };
		};

		port@1 {	// Inport 1
			reg = <1>;
			endpoint { ... };
		};
	};

This series implements Option (4) listed above and falls back to the old
bindings if the new bindings are not available. This allows systems with
the old bindings to work with the new driver. The driver now issues a
warning (once) when it encounters the old bindings.

The series contains the DT update for the Juno platform. The remaining
in-kernel sources could be updated once we are fine with the proposal.

It also cleans up the platform parsing code to reduce the memory usage
by reusing the platform description.

Applies on coresight/next

Changes since V2:
 - Clean of_coresight_parse_endpoint() to return 1 to indicate a
   connection record was updated.
 - Drop documentation for old bindings

Changes since V1:
 - Implement the proposal by Rob.
 - Drop the DTS updates for all platforms except Juno
 - Drop the incorrect fix in coresight_register. Instead document the
   code to prevent people trying to un-fix it again.
 - Add a patch to drop remote device references in DT graph parsing
 - Split of_node refcount fixing patch, fix a typo in the comment.
 - Add Reviewed-by tags from Mathieu.
 - Drop patches picked up for 4.18-rc series

Changes since RFC:
 - Fixed style issues
 - Fix an existing memory leak in coresight_register (found during the
   code update)
 - Fix missing of_node_put() in the existing driver (Reported-by Mathieu)
 - Update the existing dts in kernel tree.

Suzuki K Poulose (9):
  coresight: Document error handling in coresight_register
  coresight: platform: Refactor graph endpoint parsing
  coresight: platform: Fix refcounting for graph nodes
  coresight: platform: Fix leaking device reference
  coresight: Fix remote endpoint parsing
  coresight: Add helper to check if the endpoint is input
  coresight: platform: Cleanup coresight connection handling
  coresight: Cleanup coresight DT bindings
  dts: juno: Update coresight bindings

 .../devicetree/bindings/arm/coresight.txt   |  95 +++++---
 arch/arm64/boot/dts/arm/juno-base.dtsi      | 161 ++++++------
 arch/arm64/boot/dts/arm/juno-cs-r1r2.dtsi   |  52 ++--
 arch/arm64/boot/dts/arm/juno.dts            |  13 +-
 drivers/hwtracing/coresight/coresight.c     |  35 +--
 drivers/hwtracing/coresight/of_coresight.c  | 269 ++++++++++++++-------
 include/linux/coresight.h                   |   9 +-
 7 files changed, 359 insertions(+), 275 deletions(-)

--
2.7.4
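The fallback rule described in this cover letter (prefer the in-ports/out-ports grouping, otherwise use the legacy "slave-mode" property) can be sketched in a few lines. This is an illustrative user-space model only: `classify_port()` and `enum port_dir` are hypothetical names, and the real parsing in of_coresight.c works on device tree nodes rather than strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum port_dir { PORT_DIR_INPUT, PORT_DIR_OUTPUT };

/*
 * Hypothetical helper: classify a port from the name of the node that
 * groups it (new binding). When the port sits under any other parent,
 * fall back to the legacy "slave-mode" endpoint property.
 */
static enum port_dir classify_port(const char *parent_name, bool has_slave_mode)
{
	assert(parent_name);
	if (strcmp(parent_name, "in-ports") == 0)
		return PORT_DIR_INPUT;
	if (strcmp(parent_name, "out-ports") == 0)
		return PORT_DIR_OUTPUT;
	/* Legacy binding: direction is carried by the endpoint itself. */
	return has_slave_mode ? PORT_DIR_INPUT : PORT_DIR_OUTPUT;
}
```

With the grouping nodes the direction no longer depends on a per-endpoint property, which is why port@0 can appear once under each group with a matching `reg` value.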

7 years, 5 months

Re: ThunderX2 and Coresight bring up

by Mathieu Poirier

On Wed, 8 Aug 2018 at 01:59, Tomasz Nowicki <tnowicki(a)caviumnetworks.com> wrote:
>
> Hi Mathieu,
>
> It's been a while but I am back to Coresight.
>
> Let me remind my setup and the issue I am struggling with now.
>
> Kernel baseline:
> https://github.com/Linaro/perf-opencsd (perf-opencsd-v4.16)
> OpenCSD:
> https://github.com/Linaro/OpenCSD.git (master)
>
> The simplest Coresight components path I used as a start point:
> ETMv4.1 -> TDR -> FUNNEL -> ETF
>
> As I mentioned TDR is built by Cavium and it was added to aggregate 128
> inputs into one output rather than cascading funnels. TDR has its own
> driver just to keep path connected in Linux Coresight framework.
>
> Here is how I catch some trace data:
> sudo perf record -C 0 -e cs_etm/@etf0/ --per-thread test_app

The above command line tells perf to trace everything that is happening
on CPU0 for as long as "test_app" is executing. In this case the
"--per-thread" option is ignored. This is called a CPU-wide trace
scenario and is currently not supported for CS (I am currently working
on it).

If you want to make sure "test_app" executes on CPU0 and that you trace
just that, you will need to use the "taskset" utility:

sudo perf record -e cs_etm/@etf0/ --per-thread taskset 0x1 test_app

An alternative to the above would be to CPU-hotplug out CPU128-255 while
you are testing.

Let's start with that before going further.

Thanks,
Mathieu

>
> I need to use -C because my machine has 2 nodes, 32 cores (128 threads)
> each and each node has a different ETF. So I have to specify which CPU is
> the source for the specified ETF sink (ETF0 can be a sink for
> CPU0-CPU127, ETF1 can be a sink for CPU128-CPU255). Otherwise Linux
> cannot find a path for ETMs related to CPU128-CPU255 if I specify ETF0 as
> a sink.
>
> Overall, I can see some data using:
> # sudo perf report --stdio --dump
> [...]
> . ... CoreSight ETM Trace data: size 16384 bytes
>         Frame deformatter: Found 4 FSYNCS
>         ID:12 RESET operation on trace decode path
>         Idx:108; ID:12; I_NOT_SYNC : I Stream not synchronised
>         Idx:455; ID:12; I_ASYNC : Alignment Synchronisation.
>         Idx:468; ID:12; I_TRACE_INFO : Trace Info.; INFO=0x0
>         Idx:470; ID:12; I_TRACE_ON : Trace On.
>         Idx:471; ID:12; I_CTXT : Context Packet.; Ctxt: AArch64,EL0, NS;
>         Idx:473; ID:12; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000AAABE0B09584;
>         Idx:483; ID:12; I_ATOM_F1 : Atom format 1.; N
>         Idx:484; ID:12; I_TIMESTAMP : Timestamp.; Updated val = 0x1b6a5d937cc1
>         Idx:492; ID:12; I_ATOM_F3 : Atom format 3.; NNE
>         Idx:493; ID:12; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000AAABE0B0D210;
>         Idx:504; ID:12; I_ATOM_F3 : Atom format 3.; NEE
>         Idx:505; ID:12; I_ATOM_F3 : Atom format 3.; NEN
>         Idx:506; ID:12; I_ATOM_F6 : Atom format 6.; EEEN
>         Idx:507; ID:12; I_ATOM_F3 : Atom format 3.; NNE
>         Idx:508; ID:12; I_ATOM_F1 : Atom format 1.; N
>         Idx:509; ID:12; I_ATOM_F3 : Atom format 3.; NNN
>         Idx:510; ID:12; I_ATOM_F3 : Atom format 3.; EEN
>         Idx:512; ID:12; I_ATOM_F1 : Atom format 1.; E
> [...]
>
> However, I still see errors while using:
> # sudo perf report --stdio
> 0x1e8 [0x60]: failed to process type: 1
> Error:
> failed to process sample
> # To display the perf.data header info, please use
> --header/--header-only options.
>
> The reason is that cs_etm__process_event() is failing on:
>         if (!etm->timeless_decoding)
>                 return -EINVAL;
>
> and etm->timeless_decoding is set up in cs_etm__is_timeless_decoding().
> For some events the time bit is set and so far I failed to figure out
> what is going on. Have you met a similar issue so far? Any pointers or
> hints are very appreciated.
>
> One more comment below.
>
> On 10.01.2018 21:10, Mathieu Poirier wrote:
> > On 10 January 2018 at 06:57, Tomasz Nowicki <tnowicki(a)caviumnetworks.com> wrote:
> >> Hello Mathieu,
> >>
> >> Thank you for your response. Please see comments below.
> >>
> >> On 08.01.2018 17:53, Mathieu Poirier wrote:
> >>>
> >>> Good day Tomasz,
> >>>
> >>> On 5 January 2018 at 05:51, tn <Tomasz.Nowicki(a)caviumnetworks.com> wrote:
> >>>>
> >>>> Hi Mathieu,
> >>>>
> >>>> I am bringing up Coresight functionality on ThunderX2. While ramping
> >>>> up I come across your Connect session:
> >>>>
> >>>> which I found very helpful.
> >>>
> >>> Perfect - a few things have changed since then, see below.
> >>>
> >>>>
> >>>> During my research I had to create new Coresight component driver for
> >>>> Linux, here is the story. For ThunderX2, we aggregate data trace from
> >>>> all 128 ETMs into one funnel inport using so called TDR (Trace Data
> >>>> Ring) component. This should be transparent to software and does not
> >>>> require configuration at all.
> >>>> However, Linux Coresight framework requires components to be
> >>>> connected each other so we cannot leave funnel and ETMs disconnected
> >>>> in DT. I decided to create pure software component i.e. TDR which is
> >>>> meant to connect chain only, no actions on registers.
> >>>
> >>> Is this TDR an ARM IP or built in-house by Cavium?
> >>
> >> This is Cavium specific component which I am going to upstream once I test
> >> the whole functionality.
> >>
> >>> And I suppose it was added there to aggregate 128 input into one
> >>> output rather than cascading funnels?
> >>
> >> Correct.
> >>
> >>>>
> >>>> Now I am able to enable ETF sink and path from ETM via TDR via FUNNEL
> >>>> up to ETF and gather some data. To be sure things work properly I
> >>>> want to decode data using Linaro OpenCSD library following
> >>>> instructions from here:
> >>>>
> >>>> https://community.arm.com/tools/b/blog/posts/do-a-coresight-trace-on-linux-…
> >>>
> >>> Thanks for pointing this out, I didn't know about it.
> >>>
> >>>> but still got error while doing 'perf report' step. Kernel perf tool
> >>>> support for OpenCSD is out of tree for now so I may miss some patches.
> >>>
> >>> Can you get me a pastebin of the errors you're getting?
> >>
> >> Sure, see:
> >> https://pastebin.com/6YDq8KfC
> >> As you see there is not much info about error cause.
> >>
> >>>
> >>>>
> >>>> Here is my setup:
> >>>> https://github.com/Linaro/perf-opencsd/commits/upstream-v1 (+ ThunderX2
> >>>> specific patches)
> >>>
> >>> Oh boy... I wasn't expecting people to use that but I suppose it is
> >>> the right thing to do. Keep going with that code.
> >>>
> >>>> https://github.com/Linaro/OpenCSD/commits/master
> >>>
> >>> This, in combination with the upstream-v1 branch should work properly.
> >>> That's how I test things on my Juno and Dragon board.
> >>>
> >>>>
> >>>> # echo 1 > etf0/enable_sink
> >>>> # perf record -C 0 -e cs_etm// sleep 2
> >>>
> >>> Ok, that won't work as the -C option is currently not supported (I am
> >>> working on it). I also suggest to make sure you have the very latest
> >>> TIP [1] on branch [2] and to carefully read the README.md. We
> >>> recently updated the instructions to fit the newest development.
> >>> Lastly we have deprecated enabling the sink from the sysFS interface -
> >>> it can still work but no guarantees are provided. It is better to
> >>> specify the sink as part of the perf record command line, as shown in
> >>> the most recent HOWTO.md.
> >>
> >> I am able to specify sink as part of the perf record command line only for
> >> Linux Perf master branch:
> >> https://github.com/Linaro/perf-opencsd/commits/master
> >>
> >> For upstream-v1 branch I am getting:
> >> $ perf record -vvv -e cs_etm/@etf0/ --per-thread uname
> >> Using CPUID 0x00000000420f5160
> >> perf: util/evsel.c:783: apply_config_terms: Assertion `!(1)' failed.
> >> Aborted (core dumped)
> >
> > Ok, I've uploaded upstream-v2. With that branch everything works fine
> > on my side, no changes needed. I added a fix for a regression in the
> > perf tip tree and the code required to use the ETR from the perf
> > interface.
> >
> > One thing about the above: "@etf0". Is this really the name you gave
> > to the device in the DT? Look under /sys/bus/coresight/devices/ for
> > an etf entry. What is listed there should be the name of the ETF as
> > it is known to the system.
>
> Indeed, the name is different but for perf command clarity I use shortcut.
>
> Thanks,
> Tomasz
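The "failed to process sample" error quoted in this message comes down to a flag that is derived from the recorded events: per the description above, cs_etm__process_event() bails out unless decoding is "timeless", and an event with the time bit set makes it non-timeless. A minimal model of that decision is sketched below. It is not the perf source: `is_timeless()` is a hypothetical name, and `SAMPLE_TIME_BIT` is a local stand-in assumed to match PERF_SAMPLE_TIME in <linux/perf_event.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in, assumed equal to PERF_SAMPLE_TIME. */
#define SAMPLE_TIME_BIT	(UINT64_C(1) << 2)

/*
 * Model of the check described in the thread: decoding is "timeless"
 * only when no event in the session has the time bit in its sample_type.
 */
static bool is_timeless(const uint64_t *sample_types, size_t nr_events)
{
	assert(sample_types || nr_events == 0);
	for (size_t i = 0; i < nr_events; i++)
		if (sample_types[i] & SAMPLE_TIME_BIT)
			return false;
	return true;
}
```

Under this model a single event recorded with timestamps is enough to flip the session out of timeless mode, which would be consistent with the CPU-wide `-C 0` recording failing while the `--per-thread` form is expected to work.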

7 years, 5 months

Fwd: Failed for ETM decoding with db410c snapshot mode

by Mike Leach

+CoreSight ML and Mathieu

---------- Forwarded message ----------
From: Mike Leach <mike.leach(a)linaro.org>
Date: 3 September 2018 at 17:39
Subject: Re: Failed for ETM decoding with db410c snapshot mode
To: Leo Yan <leo.yan(a)linaro.org>

Hi Leo,

Short summary - there is a problem with the trace collected - not the
decoder. See below for details.

On 3 September 2018 at 08:06, <leo.yan(a)linaro.org> wrote:
> Hi Mike, Mathieu,
>
> [ + CoreSight ML ]
>
> When I work on the CoreSight + perf tool and used crash extension
> program to extract the tracing data from perf aux buffer, finally I
> can get the trace data for about 1.6MB from ETF sink from DB410c board.
>
> To verify the extracted trace data, I used 'snapshot' mode under
> OpenCSD code base, you could see the tar file for this [1]. After
> you download this file, you could place it under OpenCSD folder:
>
>   $ cp db410c_snapshot_kdump.tgz my_opencsd/decoder/tests/snapshots
>   $ cd my_opencsd/decoder/tests/snapshots
>   $ tar zxvf db410c_snapshot_kdump.tgz
>   $ cd db410c_snapshot_kdump
>
>   $ ../../bin/builddir/trc_pkt_lister

This will print raw trace packets as it finds them without attempting
any sort of interpretation.

>   $ ../../bin/builddir/trc_pkt_lister -decode

This will try to decode the raw trace packets into a sequence of
instructions executed (alongside the raw packets). This is where the
packets are being flagged as incorrect.

>
> If I use the command 'trc_pkt_lister' without any extra options, it
> can print out trace packets successfully; but if I add the extra
> option '-decode' it uses 'decode all' mode and it reports the errors as:
>
> 483710 Idx:53086; ID:10; [0xf8 ]; I_ATOM_F3 : Atom format 3.; NNN
> 483711 Idx:53086; ID:10; OCSD_GEN_TRC_ELEM_ADDR_NACC( 0xffff000008abc9f0 )
> 483712 Idx:53088; ID:10; [0xdb ]; I_ATOM_F2 : Atom format 2.; EE
> 483713 Idx:53194; ID:10; [0x6b 0x8c 0x08 0xfa 0xdc 0x95 0x5c ]; I_COND_RES_F1 : Conditional Result, format 1.

This is a conditional result trace packet - however as far as I am aware
the trace unit on an A53 (i.e. DB410 core) cannot produce these.

Additionally in the entire file I see 2 I_COND packets and 1
I_NUM_DS_MKR - a data synchronisation marker packet. Now Data sync can
only ever occur if data trace is supported and enabled. Data trace is
architecturally prohibited for A class v8 cores (and unimplemented on
most A class v7 cores).

If there were tracing of conditional elements occurring, and it were
enabled, then the packets should match up - a cond instruction should
match with one cond result element.

But in the end - even without these inconsistencies - the TRACE_INFO
element at the top of the listing tells me that conditional instruction
trace is disabled.

Thus you are seeing what I believe is the effect of concatenating trace
data buffers together (you mention you have 1.6MB of data from the ETF -
which is not that large), without inserting barrier packets in between.
The decoder cannot spot the boundaries, and will carry on and be out of
sync so can mis-read trace packet payload data as header data, which
will throw off the decode process.

When I look at the raw byte data I am seeing this at the top of the
listing:

Frame Data; Index 0; ID_DATA[????]; ff
Frame Data; Index 0; ID_DATA[0x7f]; 7f ff 7f ff 7f ff

This does not look valid at all to me.

> 483714 DCD_ETMV4_0016 : 0x0018 (OCSD_ERR_BAD_DECODE_PKT) [Reserved or unknown packet in decoder.]; Unsupported packet type.Trace Packet Lister : Data Path fatal error
> 483715 0x0018 (OCSD_ERR_BAD_DECODE_PKT) [Reserved or unknown packet in decoder.]; Unsupported packet type.Trace Packet Lister : Trace buffer done, processed 53216 bytes.
>
> You also could check detailed log trc_pkt_lister.ppl in the shared
> tar packet; After searched for the OpenCSD code and found this error is
> due it cannot support some types of packets [2].
>
> So want to check what's the best for this issue; seems to me we need
> to fix this so it can support well to complete the decoding?
>

The reason we have not implemented support for these packets is that we
have never seen an implementation that generates them.

regards

Mike

> Thanks in advance for suggestion.
> Leo Yan
>
> [1] http://people.linaro.org/~leo.yan/opencsd_db410c/db410c_snapshot_kdump.tgz
> [2] https://github.com/Linaro/OpenCSD/blob/master/decoder/source/etmv4/trc_pkt_…

--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
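Mike's diagnosis is that separately captured buffers were joined without anything marking the joins. One way to picture the fix is a helper that writes a barrier packet ahead of every chunk when building the combined file. The sketch below is illustrative only: `append_chunk()` is a hypothetical helper, and the four-word 0x7fffffff pattern is an assumption about the barrier value used by the kernel's CoreSight code, to be checked against the tree in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed barrier pattern; verify against the kernel tree you are using. */
static const uint32_t barrier_pkt[4] = {
	0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff,
};

/*
 * Append one captured chunk to 'out', preceded by a barrier packet so a
 * decoder can tell where the previous chunk ended. 'used' tracks how many
 * bytes of 'out' are filled. Returns 0 on success, -1 if 'out' is too small
 * (in which case nothing is written).
 */
static int append_chunk(uint8_t *out, size_t out_size, size_t *used,
			const uint8_t *chunk, size_t chunk_len)
{
	assert(out && used && chunk);
	if (*used > out_size ||
	    out_size - *used < sizeof(barrier_pkt) ||
	    out_size - *used - sizeof(barrier_pkt) < chunk_len)
		return -1;

	memcpy(out + *used, barrier_pkt, sizeof(barrier_pkt));
	*used += sizeof(barrier_pkt);
	memcpy(out + *used, chunk, chunk_len);
	*used += chunk_len;
	return 0;
}
```

Whether a given decoder resynchronises on this exact pattern is outside what the thread states, so treat the value as a placeholder for whatever barrier the capture path actually emits.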

7 years, 5 months

[PATCH 1/1] coresight: etm4x: Configure EL2 exception level when kernel is running in HYP

by Tomasz Nowicki

For non-VHE systems the host kernel runs at EL1 and jumps to EL2 whenever
hypervisor code should be executed. In this case the ETM4x driver must
restrict configuration to EL1 when it sets up kernel tracing. However,
there is no separate hypervisor privilege level when VHE is enabled;
the host kernel runs at EL2.

This patch fixes configuration of the TRCACATRn register for VHE systems
so that the ETM_EXLEVEL_NS_HYP bit is used instead of ETM_EXLEVEL_NS_OS
to switch kernel tracing on/off. At the same time, it moves common code
to a new helper.

Signed-off-by: Tomasz Nowicki <tnowicki(a)caviumnetworks.com>
---
 drivers/hwtracing/coresight/coresight-etm4x.c | 39 ++++++++++---------
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
index f79b0ea85d76..5f495c942f99 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x.c
@@ -28,6 +28,7 @@
 #include <linux/pm_runtime.h>
 #include <asm/sections.h>
 #include <asm/local.h>
+#include <asm/virt.h>

 #include "coresight-etm4x.h"
 #include "coresight-etm-perf.h"
@@ -606,7 +607,7 @@ static void etm4_set_default_config(struct etmv4_config *config)
 	config->vinst_ctrl |= BIT(0);
 }

-static u64 etm4_get_access_type(struct etmv4_config *config)
+static u64 etm4_get_ns_access_type(struct etmv4_config *config)
 {
 	u64 access_type = 0;

@@ -617,17 +618,27 @@ static u64 etm4_get_access_type(struct etmv4_config *config)
 	 * Bit[13] Exception level 1 - OS
 	 * Bit[14] Exception level 2 - Hypervisor
 	 * Bit[15] Never implemented
-	 *
-	 * Always stay away from hypervisor mode.
 	 */
-	access_type = ETM_EXLEVEL_NS_HYP;

-	if (config->mode & ETM_MODE_EXCL_KERN)
-		access_type |= ETM_EXLEVEL_NS_OS;
+	if (!is_kernel_in_hyp_mode()) {
+		/* Stay away from hypervisor mode for non-VHE */
+		access_type = ETM_EXLEVEL_NS_HYP;
+		if (config->mode & ETM_MODE_EXCL_KERN)
+			access_type |= ETM_EXLEVEL_NS_OS;
+	} else if (config->mode & ETM_MODE_EXCL_KERN) {
+		access_type = ETM_EXLEVEL_NS_HYP;
+	}

 	if (config->mode & ETM_MODE_EXCL_USER)
 		access_type |= ETM_EXLEVEL_NS_APP;

+	return access_type;
+}
+
+static u64 etm4_get_access_type(struct etmv4_config *config)
+{
+	u64 access_type = etm4_get_ns_access_type(config);
+
 	/*
 	 * EXLEVEL_S, bits[11:8], don't trace anything happening
 	 * in secure state.
@@ -881,20 +892,10 @@ void etm4_config_trace_mode(struct etmv4_config *config)

 	addr_acc = config->addr_acc[ETM_DEFAULT_ADDR_COMP];
 	/* clear default config */
-	addr_acc &= ~(ETM_EXLEVEL_NS_APP | ETM_EXLEVEL_NS_OS);
+	addr_acc &= ~(ETM_EXLEVEL_NS_APP | ETM_EXLEVEL_NS_OS |
+		      ETM_EXLEVEL_NS_HYP);

-	/*
-	 * EXLEVEL_NS, bits[15:12]
-	 * The Exception levels are:
-	 * Bit[12] Exception level 0 - Application
-	 * Bit[13] Exception level 1 - OS
-	 * Bit[14] Exception level 2 - Hypervisor
-	 * Bit[15] Never implemented
-	 */
-	if (mode & ETM_MODE_EXCL_KERN)
-		addr_acc |= ETM_EXLEVEL_NS_OS;
-	else
-		addr_acc |= ETM_EXLEVEL_NS_APP;
+	addr_acc |= etm4_get_ns_access_type(config);

 	config->addr_acc[ETM_DEFAULT_ADDR_COMP] = addr_acc;
 	config->addr_acc[ETM_DEFAULT_ADDR_COMP + 1] = addr_acc;
--
2.17.1
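The decision table implemented by the new helper is easier to see with the kernel dependencies stripped away. The sketch below mirrors the branch structure of etm4_get_ns_access_type() in user space, where a set bit excludes that exception level from tracing. The bit positions follow the comment in the patch; the `MODE_EXCL_*` values are hypothetical placeholders (the driver's ETM_MODE_EXCL_* constants may differ), and `kernel_in_hyp` stands in for is_kernel_in_hyp_mode().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EXLEVEL_NS bit positions as listed in the patch comment (bits[15:12]). */
#define EXLEVEL_NS_APP	(UINT64_C(1) << 12)
#define EXLEVEL_NS_OS	(UINT64_C(1) << 13)
#define EXLEVEL_NS_HYP	(UINT64_C(1) << 14)

/* Hypothetical mode flags; the driver's ETM_MODE_EXCL_* values may differ. */
#define MODE_EXCL_KERN	(1u << 0)
#define MODE_EXCL_USER	(1u << 1)

/* Returns the set of non-secure exception levels to exclude from tracing. */
static uint64_t ns_access_type(uint32_t mode, bool kernel_in_hyp)
{
	uint64_t access_type = 0;

	/* This model only understands the two flags defined above. */
	assert((mode & ~(MODE_EXCL_KERN | MODE_EXCL_USER)) == 0);

	if (!kernel_in_hyp) {
		/* Non-VHE: kernel is at EL1, always stay away from EL2. */
		access_type = EXLEVEL_NS_HYP;
		if (mode & MODE_EXCL_KERN)
			access_type |= EXLEVEL_NS_OS;
	} else if (mode & MODE_EXCL_KERN) {
		/* VHE: kernel is at EL2, so excluding it means excluding EL2. */
		access_type = EXLEVEL_NS_HYP;
	}

	if (mode & MODE_EXCL_USER)
		access_type |= EXLEVEL_NS_APP;

	return access_type;
}
```

For example, with VHE and no exclusions the result is 0 (nothing excluded, so EL2 kernel code is traced), whereas the pre-patch logic would always have set the hypervisor bit and hidden the kernel.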

7 years, 5 months

[PATCH v2] perf: Support for Arm A32/T32 instruction sets in CoreSight trace

by Robert Walker

This patch adds support for generating instruction samples from trace of
AArch32 programs using the A32 and T32 instruction sets.

T32 has variable 2 or 4 byte instruction size, so the conversion between
addresses and instruction counts requires extra information from the trace
decoder, requiring version 0.9.1 of OpenCSD. A check for the new struct
member has been added to the feature check for OpenCSD.

Signed-off-by: Robert Walker <robert.walker(a)arm.com>
---
v2: Minor fixes following review comments from Mathieu
    Rebased on v4.19-rc1

 tools/build/feature/test-libopencsd.c           |  7 +++
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 27 ++++++++++
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 10 ++++
 tools/perf/util/cs-etm.c                        | 71 +++++++++++--------------
 4 files changed, 75 insertions(+), 40 deletions(-)

diff --git a/tools/build/feature/test-libopencsd.c b/tools/build/feature/test-libopencsd.c
index 5ff1246..d96b2df 100644
--- a/tools/build/feature/test-libopencsd.c
+++ b/tools/build/feature/test-libopencsd.c
@@ -3,6 +3,13 @@

 int main(void)
 {
+	/*
+	 * Requires ocsd_generic_trace_elem.num_instr_range introduced in
+	 * OpenCSD 0.9
+	 */
+	ocsd_generic_trace_elem elem;
+	(void)elem.num_instr_range;
+
 	(void)ocsd_get_version();
 	return 0;
 }
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index 938def6..73d8384 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -263,9 +263,12 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder)
 	decoder->tail = 0;
 	decoder->packet_count = 0;
 	for (i = 0; i < MAX_BUFFER; i++) {
+		decoder->packet_buffer[i].isa = CS_ETM_ISA_UNKNOWN;
 		decoder->packet_buffer[i].start_addr = CS_ETM_INVAL_ADDR;
 		decoder->packet_buffer[i].end_addr = CS_ETM_INVAL_ADDR;
+		decoder->packet_buffer[i].instr_count = 0;
 		decoder->packet_buffer[i].last_instr_taken_branch = false;
+		decoder->packet_buffer[i].last_instr_size = 0;
 		decoder->packet_buffer[i].exc = false;
 		decoder->packet_buffer[i].exc_ret = false;
 		decoder->packet_buffer[i].cpu = INT_MIN;
@@ -294,11 +297,13 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder,
 	decoder->packet_count++;

 	decoder->packet_buffer[et].sample_type = sample_type;
+	decoder->packet_buffer[et].isa = CS_ETM_ISA_UNKNOWN;
 	decoder->packet_buffer[et].exc = false;
 	decoder->packet_buffer[et].exc_ret = false;
 	decoder->packet_buffer[et].cpu = *((int *)inode->priv);
 	decoder->packet_buffer[et].start_addr = CS_ETM_INVAL_ADDR;
 	decoder->packet_buffer[et].end_addr = CS_ETM_INVAL_ADDR;
+	decoder->packet_buffer[et].instr_count = 0;

 	if (decoder->packet_count == MAX_BUFFER - 1)
 		return OCSD_RESP_WAIT;
@@ -321,8 +326,28 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder,

 	packet = &decoder->packet_buffer[decoder->tail];

+	switch (elem->isa) {
+	case ocsd_isa_aarch64:
+		packet->isa = CS_ETM_ISA_A64;
+		break;
+	case ocsd_isa_arm:
+		packet->isa = CS_ETM_ISA_A32;
+		break;
+	case ocsd_isa_thumb2:
+		packet->isa = CS_ETM_ISA_T32;
+		break;
+	case ocsd_isa_tee:
+	case ocsd_isa_jazelle:
+	case ocsd_isa_custom:
+	case ocsd_isa_unknown:
+	default:
+		packet->isa = CS_ETM_ISA_UNKNOWN;
+	}
+
 	packet->start_addr = elem->st_addr;
 	packet->end_addr = elem->en_addr;
+	packet->instr_count = elem->num_instr_range;
+
 	switch (elem->last_i_type) {
 	case OCSD_INSTR_BR:
 	case OCSD_INSTR_BR_INDIRECT:
@@ -336,6 +361,8 @@ cs_etm_decoder__buffer_range(struct cs_etm_decoder *decoder,
 		break;
 	}

+	packet->last_instr_size = elem->last_instr_sz;
+
 	return ret;
 }

diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
index 612b575..9351bd1 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
@@ -28,11 +28,21 @@ enum cs_etm_sample_type {
 	CS_ETM_TRACE_ON = 1 << 1,
 };

+enum cs_etm_isa {
+	CS_ETM_ISA_UNKNOWN,
+	CS_ETM_ISA_A64,
+	CS_ETM_ISA_A32,
+	CS_ETM_ISA_T32,
+};
+
 struct cs_etm_packet {
 	enum cs_etm_sample_type sample_type;
+	enum cs_etm_isa isa;
 	u64 start_addr;
 	u64 end_addr;
+	u32 instr_count;
 	u8 last_instr_taken_branch;
+	u8 last_instr_size;
 	u8 exc;
 	u8 exc_ret;
 	int cpu;
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 2ae6402..fcaa73f 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -31,14 +31,6 @@

 #define MAX_TIMESTAMP (~0ULL)

-/*
- * A64 instructions are always 4 bytes
- *
- * Only A64 is supported, so can use this constant for converting between
- * addresses and instruction counts, calculting offsets etc
- */
-#define A64_INSTR_SIZE 4
-
 struct cs_etm_auxtrace {
 	struct auxtrace auxtrace;
 	struct auxtrace_queues queues;
@@ -492,21 +484,16 @@ static inline void cs_etm__reset_last_branch_rb(struct cs_etm_queue *etmq)
 	etmq->last_branch_rb->nr = 0;
 }

-static inline u64 cs_etm__last_executed_instr(struct cs_etm_packet *packet)
-{
-	/* Returns 0 for the CS_ETM_TRACE_ON packet */
-	if (packet->sample_type == CS_ETM_TRACE_ON)
-		return 0;
+static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
+					 u64 addr) {
+	u8 instrBytes[2];

-	/*
-	 * The packet records the execution range with an exclusive end address
-	 *
-	 * A64 instructions are constant size, so the last executed
-	 * instruction is A64_INSTR_SIZE before the end address
-	 * Will need to do instruction level decode for T32 instructions as
-	 * they can be variable size (not yet supported).
+	cs_etm__mem_access(etmq, addr, ARRAY_SIZE(instrBytes), instrBytes);
+	/* T32 instruction size is indicated by bits[15:11] of the first
+	 * 16-bit word of the instruction: 0b11101, 0b11110 and 0b11111
+	 * denote a 32-bit instruction.
 	 */
-	return packet->end_addr - A64_INSTR_SIZE;
+	return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
 }

 static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
@@ -518,27 +505,32 @@ static inline u64 cs_etm__first_executed_instr(struct cs_etm_packet *packet)
 	return packet->start_addr;
 }

-static inline u64 cs_etm__instr_count(const struct cs_etm_packet *packet)
+static inline
+u64 cs_etm__last_executed_instr(const struct cs_etm_packet *packet)
 {
-	/*
-	 * Only A64 instructions are currently supported, so can get
-	 * instruction count by dividing.
-	 * Will need to do instruction level decode for T32 instructions as
-	 * they can be variable size (not yet supported).
-	 */
-	return (packet->end_addr - packet->start_addr) / A64_INSTR_SIZE;
+	/* Returns 0 for the CS_ETM_TRACE_ON packet */
+	if (packet->sample_type == CS_ETM_TRACE_ON)
+		return 0;
+
+	return packet->end_addr - packet->last_instr_size;
 }

-static inline u64 cs_etm__instr_addr(const struct cs_etm_packet *packet,
+static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
+				     const struct cs_etm_packet *packet,
 				     u64 offset)
 {
-	/*
-	 * Only A64 instructions are currently supported, so can get
-	 * instruction address by muliplying.
-	 * Will need to do instruction level decode for T32 instructions as
-	 * they can be variable size (not yet supported).
-	 */
-	return packet->start_addr + offset * A64_INSTR_SIZE;
+	if (packet->isa == CS_ETM_ISA_T32) {
+		u64 addr = packet->start_addr;
+
+		while (offset > 0) {
+			addr += cs_etm__t32_instr_size(etmq, addr);
+			offset--;
+		}
+		return addr;
+	}
+
+	/* Assume a 4 byte instruction size (A32/A64) */
+	return packet->start_addr + offset * 4;
 }

 static void cs_etm__update_last_branch_rb(struct cs_etm_queue *etmq)
@@ -867,9 +859,8 @@ static int cs_etm__sample(struct cs_etm_queue *etmq)
 	struct cs_etm_auxtrace *etm = etmq->etm;
 	struct cs_etm_packet *tmp;
 	int ret;
-	u64 instrs_executed;
+	u64 instrs_executed = etmq->packet->instr_count;

-	instrs_executed = cs_etm__instr_count(etmq->packet);
 	etmq->period_instructions += instrs_executed;

 	/*
@@ -899,7 +890,7 @@ static int cs_etm__sample(struct cs_etm_queue *etmq)
 			 * executed, but PC has not advanced to next instruction)
 			 */
 			u64 offset = (instrs_executed - instrs_over - 1);
-			u64 addr = cs_etm__instr_addr(etmq->packet, offset);
+			u64 addr = cs_etm__instr_addr(etmq, etmq->packet, offset);

 			ret = cs_etm__synth_instruction_sample(
 				etmq, addr, etm->instructions_sample_period);
--
2.7.4

7 years, 5 months

Enabling Coresight in atomic context.

by Mike Bazov

Greetings,

I'm trying to enable the Coresight ETMv4 trace from kernel mode. I could not find any documentation on how to do this, other than through the sysfs user-mode interface and perf. To work around that, I looked through the coresight.h header file and found these APIs:

> extern int coresight_enable(struct coresight_device *csdev);
>
> extern void coresight_disable(struct coresight_device *csdev);

The sysfs implementation uses these APIs when enabling and disabling the trace, so I thought they would suit my needs.

The next problem was getting hold of the coresight device data structures, which aren't exported and are only provided internally to perf and sysfs. So I exported the coresight bus type:

> struct bus_type coresight_bustype = {
>         .name = "coresight",
> };
>
> EXPORT_SYMBOL(coresight_bustype);

I then enumerated the bus and looked for devices of type CORESIGHT_DEV_TYPE_SOURCE. Since the only source on my board is ETMv4, there are no conflicts and I should get only ETMv4 devices. This worked: I collected a coresight device for every CPU and enabled the trace successfully while in *non-atomic context*.

The problem is that I must enable the trace in *atomic context*, *synchronously*, on the current thread's CPU (I can't queue work on a workqueue to enable the trace for me). As a result I hit many BUG() errors caused by non-atomic API usage, for example the allocation of the TMC-ETR buffer, which uses GFP_KERNEL:

> dma_alloc_coherent(drvdata->dev, etr_buf->size,
>                    &flat_buf->daddr, GFP_KERNEL);

So my questions are:

1) Is there a better-documented way of enabling coresight from kernel mode? I believe I only achieved this by cheating.
2) I see no exported kernel API to configure the coresight trace attributes (for example, filtering EL0). I can only do so from sysfs. Am I missing something?
3) Are there any plans to make the Coresight infrastructure atomic-context friendly? If so, is the development in progress? If not, how would you suggest tackling the issues I've described in this message?

Thank you!

7 years, 5 months

Coresight with Perf need ETR ?

by Christophe ROULLIER

Hi,

I have followed the procedure to integrate libopenCSD into the perf tool on an ARM target (based on kernel 4.18-rc1). I've seen in the documentation that perf record needs the parameter cs_etm/@xxxxx.etr/u …

I would like to know whether an ETR device is mandatory to use coresight trace with perf. My architecture has:

1 .funnel
1 .tpiu
2 .etm
1 .etf
1 .stm
1 replicator

but no ETR ☹

If it is possible without an ETR, do you have examples of perf record command usage?

I've configured my coresight registers as follows:

echo 1 > /sys/bus/coresight/devices/xxxx.stm/hwevent_select
echo 2 > /sys/bus/coresight/devices/xxxx.stm/hwevent_extmux_select
// Need to track IRQ up of UART
echo 0x200000 > /sys/bus/coresight/devices/xxxx.stm/hwevent_enable
echo 1 > /sys/bus/coresight/devices/yyyy.etf/enable_sink
echo 1 > /sys/bus/coresight/devices/xxxx.stm/enable_source

Which perf record command must I enter to capture my STM event?

Thanks for your help.
Christophe.
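
For targets where the same setup has to be driven from a program rather than a shell, the sysfs sequence in the message can be replayed in C. This is only a sketch under assumptions: cs_write_attr() and cs_stm_to_etf_enable() are hypothetical helper names, the device directories are passed in by the caller because the real device names are elided (xxxx.stm, yyyy.etf) in the message, and the attribute names and values are copied from the echo commands as posted. It keeps the posted order, with the sink enabled before the source.

```c
#include <stdio.h>

/*
 * Write `value` plus a newline to <dev_dir>/<attr>, as
 * `echo value > dev_dir/attr` would. Returns 0 on success, -1 on error.
 */
static int cs_write_attr(const char *dev_dir, const char *attr,
			 const char *value)
{
	char path[512];
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/%s", dev_dir, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;	/* path did not fit in the buffer */

	f = fopen(path, "w");
	if (!f)
		return -1;

	if (fprintf(f, "%s\n", value) < 0) {
		fclose(f);
		return -1;
	}

	/* Buffered write errors are only reported when the file is closed. */
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Replay the sequence from the message: configure the STM hardware
 * events, enable the ETF as sink, then enable the STM as source.
 * Stops at the first write that fails.
 */
static int cs_stm_to_etf_enable(const char *stm_dir, const char *etf_dir)
{
	if (cs_write_attr(stm_dir, "hwevent_select", "1") ||
	    cs_write_attr(stm_dir, "hwevent_extmux_select", "2") ||
	    cs_write_attr(stm_dir, "hwevent_enable", "0x200000") ||
	    cs_write_attr(etf_dir, "enable_sink", "1") ||
	    cs_write_attr(stm_dir, "enable_source", "1"))
		return -1;

	return 0;
}
```

A caller would pass the two device directories found under /sys/bus/coresight/devices/ on the board and check the return value; the sketch does not undo earlier writes when a later one fails.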

7 years, 5 months

Failed for ETM decoding with db410c snapshot mode

by leo.yan＠linaro.org

Hi Mike, Mathieu,

[ + CoreSight ML ]

While working on CoreSight with the perf tool, I used a crash extension program to extract the tracing data from the perf AUX buffer, and I can now get about 1.6MB of trace data from the ETF sink on the DB410c board.

To verify the extracted trace data, I used 'snapshot' mode in the OpenCSD code base; the tar file for this is at [1]. After you download the file, you can place it under the OpenCSD folder:

$ cp db410c_snapshot_kdump.tgz my_opencsd/decoder/tests/snapshots
$ cd my_opencsd/decoder/tests/snapshots
$ tar zxvf db410c_snapshot_kdump.tgz
$ cd db410c_snapshot_kdump
$ ../../bin/builddir/trc_pkt_lister
$ ../../bin/builddir/trc_pkt_lister -decode

If I run 'trc_pkt_lister' without any extra options, it prints the trace packets successfully; but if I add the option '-decode', it uses 'decode all' mode and reports these errors:

483710 Idx:53086; ID:10; [0xf8 ]; I_ATOM_F3 : Atom format 3.; NNN
483711 Idx:53086; ID:10; OCSD_GEN_TRC_ELEM_ADDR_NACC( 0xffff000008abc9f0 )
483712 Idx:53088; ID:10; [0xdb ]; I_ATOM_F2 : Atom format 2.; EE
483713 Idx:53194; ID:10; [0x6b 0x8c 0x08 0xfa 0xdc 0x95 0x5c ]; I_COND_RES_F1 : Conditional Result, format 1.
483714 DCD_ETMV4_0016 : 0x0018 (OCSD_ERR_BAD_DECODE_PKT) [Reserved or unknown packet in decoder.]; Unsupported packet type.Trace Packet Lister : Data Path fatal error
483715 0x0018 (OCSD_ERR_BAD_DECODE_PKT) [Reserved or unknown packet in decoder.]; Unsupported packet type.Trace Packet Lister : Trace buffer done, processed 53216 bytes.

You can also check the detailed log, trc_pkt_lister.ppl, in the shared tar package. After searching the OpenCSD code, I found that this error occurs because the decoder does not support some types of packets [2].

So I want to check what the best approach is for this issue; it seems to me we need to fix this so that the decoder supports these packets and can complete the decoding?

Thanks in advance for any suggestions.

Leo Yan

[1] http://people.linaro.org/~leo.yan/opencsd_db410c/db410c_snapshot_kdump.tgz
[2] https://github.com/Linaro/OpenCSD/blob/master/decoder/source/etmv4/trc_pkt_…

7 years, 5 months

Decoding STM traces with OpenCSD test programs

by Mathieu Poirier

Good day Leo,

Please meet my new friend Christophe. He is from ST-Micro and is currently working on integrating the CS framework into their next platform. I would be grateful if you could share with him the instructions you used to get STM going on the dragon board and how you got the snapshot decoder to decode the traces. Christophe is also looking for a trace snapshot that is known to work properly, to test in his environment.

Many thanks,
Mathieu

7 years, 5 months
