CoreSight

coresight@lists.linaro.org

7 participants
2543 discussions

by Al Grant

Hi, One of the top questions that comes up on CoreSight is how it interacts with CPU power saving, and I'd like to get a handle on where we are with this. This will help us understand if any more work needs to be done. I'd suggest three levels of support: - transparent: use of CoreSight has no effect on CPU power saving - if an idle CPU would have been powered down it's still powered down. Any increased power draw from CoreSight comes from debug/trace blocks being powered up as necessary, not from keeping entire CPUs powered up. - automatic wakelock: use of CoreSight has the effect of disabling powering off of idle CPUs, so there may be a significant increase in power consumption, but it's done automatically and works out of the box. CoreSight itself is fully functional irrespective of how the system is configured. - invasive: power saving must be disabled manually - i.e. you have to get a manual (and possibly device-specific) recipe from somewhere. If you don't then things will break (loss of trace at best, crash at worst). I would hope that perf (all modes) is transparent, and direct use of sysfs is at worst a wakelock... but where are we now? Are there still boards that need manual recipes with the current kernel - either with perf or with sysfs? Thanks, Al

6 years, 10 months

[PATCH v5 0/8] perf cs-etm: Add support for sample flags

by Leo Yan

This patch seris adds support for sample flags so can facilitate perf to print sample flags for branch instruction. Patch 0001 is used to save last branch information in packet structure, this includes instruction type, subtype and condition flag to help making decision for which branch instruction it is. It passes related information from decoder layer to cs-etm.c, so we use cs-etm.c as a central place to set sample flags. Patch 0002 is used to set sample flags for instruction range packet. Patch 0003 is used to set sample flags for trace discontinuity packet. Patches 0004/0005/0006 are preparation for exception packet handling: Patch 0004 addes exception number in packet; pacth 0005/0006 is to use traceID/metadata tuple to access metadata pointer based on traceID, this can help decide if the CPU is connected with ETMv3 or ETMv4, ETMv3 and ETMv4 have totally different definition for exception numbers. Patch 0007 sets sample flags for exception packet; patch 0008 support sample flags for exception return packet. As Mathieu pointed out, one most difficult thing is to handle nest interrupt and trace discontinuity for exception return packet; if we use one data structure to maintain the interrupt context and also track the nest interrupts, this will be complex. So in patch 0008 we simply to check if there have one SVC instruction prior to the return address, by this way we can simply get rid of the dependency between the exception packet and exception return packet for setting sample flags, thus this can let us to set sample flags for exception return packet only based on the sequential instruction range packet. This patch series is applied on the acme's perf core branch [1] with the with latest commit ee412f14693a ("tools include uapi: Sync linux/vhost.h with the kernel sources"). After applying this patch series, we can verify sample flags with below command: # perf script -F,-time,+flags,+ip,+sym,+dso,+addr,+symoff -k vmlinux Changes from v4: * Fixed typos in comments, and removed redundant info from commit log; * Addressed Mathieu's suggestion to add helper functions for metadata fields (CS_ETM_CPU and CS_ETM_MAGIC) accessing; * Addressed Mathieu's suggestion to include headers with alphabetical order. Changes from v3: * Fixed typos in commit logs; * Rearranged fields in cs_etm_packet by grouping with same variable types; * Fixed ETMv4 exception number which pointed by Mike; * Fixed ETMv4 SVC / SMC / HVC in the same CALL, by checking svc instruction to distinguish them; * Refine ETMv4 return exception packet handling. Changes from v2: * Addressed Mathieu's suggestion to split one big patch to 3 small patches for setting sample flags, one is for instruction range packet, one is for discontinuity packet and one is for exception packet. * Added supporting for ETMv3 exception packet. * Followed Mathieu's suggestion to move all sample flags handling from decoder layer to cs-etm.c, thus it has enough info to set flags based on trace context in single place. Changes from v1: * Moved exception packets handling patches into patch series 'perf cs-etm: Correct packets handling'. * Added sample flags fixing up for TRACE_OFF packet. * Created a new function which is used to maintain flags fixing up. [1] https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf/… Leo Yan (8): perf cs-etm: Add last instruction information in packet perf cs-etm: Set sample flags for instruction range packet perf cs-etm: Set sample flags for trace discontinuity perf cs-etm: Add exception number in exception packet perf cs-etm: Change tuple from traceID-CPU# to traceID-metadata perf cs-etm: Add traceID in packet perf cs-etm: Set sample flags for exception packet perf cs-etm: Set sample flags for exception return packet .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 41 +- .../perf/util/cs-etm-decoder/cs-etm-decoder.h | 6 + tools/perf/util/cs-etm.c | 405 +++++++++++++++++- tools/perf/util/cs-etm.h | 45 +- 4 files changed, 479 insertions(+), 18 deletions(-) -- 2.17.1

6 years, 10 months

[PATCH v4 0/8] perf cs-etm: Add support for sample flags

by Leo Yan

This patch seris adds support for sample flags so can facilitate perf to print sample flags for branch instruction. Patch 0001 is used to save last branch information in packet structure, this includes instruction type, subtype and condition flag to help making decision for which branch instruction it is. It passes related information from decoder layer to cs-etm.c, so we use cs-etm.c as a central place to set sample flags. Patch 0002 is used to set sample flags for instruction range packet. Patch 0003 is used to set sample flags for trace discontinuity packet. Patches 0004/0005/0006 are preparation for exception packet handling: Patch 0004 addes exception number in packet; pacth 0005/0006 is to use traceID/metadata pointer tuple to access metadata pointer based on trace ID, so this can help decide if the CPU is connected with ETMv3 or ETMv4, ETMv3 and ETMv4 have totally different definition for exception numbers. Patch 0007 sets sample flags for exception packet; patch 0008 support sample flags for exception return packet. As Mathieu pointed out, one most difficult thing is to handle nest interrupt and trace discontinuity for exception return packet; if we use one data structure to maintain the interrupt context and also track the nest interrupts, this will be complex. So in patch 0008 we simply to check if there have one SVC instruction prior to the return address, by this way we can simply get rid of the dependency between the exception packet and exception return packet for setting sample flags, thus this can let us to set sample flags for exception return packet only based on the sequential instruction range packet. This patch series is applied on the acme's perf core branch [1] with the with latest commit ee412f14693a ("tools include uapi: Sync linux/vhost.h with the kernel sources"). After applying the dependency patches and this patch series, we can verify sample flags with below command: # perf script -F,-time,+flags,+ip,+sym,+dso,+addr,+symoff -k vmlinux Changes from v3: * Fixed typos in commit logs; * Rearranged fields in cs_etm_packet by grouping with same variable types; * Fixed ETMv4 exception number which pointed by Mike; * Fixed ETMv4 SVC / SMC / HVC in the same CALL, by checking svc instruction to distinguish them; * Refine ETMv4 return exception packet handling. Changes from v2: * Addressed Mathieu's suggestion to split one big patch to 3 small patches for setting sample flags, one is for instruction range packet, one is for discontinuity packet and one is for exception packet. * Added supporting for ETMv3 exception packet. * Followed Mathieu's suggestion to move all sample flags handling from decoder layer to cs-etm.c, thus it has enough info to set flags based on trace context in single place. Changes from v1: * Moved exception packets handling patches into patch series 'perf cs-etm: Correct packets handling'. * Added sample flags fixing up for TRACE_OFF packet. * Created a new function which is used to maintain flags fixing up. [1] https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf/… Leo Yan (8): perf cs-etm: Add last instruction information in packet perf cs-etm: Set sample flags for instruction range packet perf cs-etm: Set sample flags for trace discontinuity perf cs-etm: Add exception number in exception packet perf cs-etm: Change tuple from traceID-CPU# to traceID-metadata perf cs-etm: Add traceID in packet perf cs-etm: Set sample flags for exception packet perf cs-etm: Set sample flags for exception return packet .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 36 +- .../perf/util/cs-etm-decoder/cs-etm-decoder.h | 6 + tools/perf/util/cs-etm.c | 383 +++++++++++++++++- tools/perf/util/cs-etm.h | 43 +- 4 files changed, 454 insertions(+), 14 deletions(-) -- 2.17.1

6 years, 10 months

[PATCH AUTOSEL 4.19 77/97] perf cs-etm: Correct packets swapping in cs_etm__flush()

by Sasha Levin

From: Leo Yan <leo.yan(a)linaro.org> [ Upstream commit 43fd56669c28cd354e9228bdb58e4bca1c1a8b66 ] The structure cs_etm_queue uses 'prev_packet' to point to previous packet, this can be used to combine with new coming packet to generate samples. In function cs_etm__flush() it swaps packets only when the flag 'etm->synth_opts.last_branch' is true, this means that it will not swap packets if without option '--itrace=il' to generate last branch entries; thus for this case the 'prev_packet' doesn't point to the correct previous packet and the stale packet still will be used to generate sequential sample. Thus if dump trace with 'perf script' command we can see the incorrect flow with the stale packet's address info. This patch corrects packets swapping in cs_etm__flush(); except using the flag 'etm->synth_opts.last_branch' it also checks the another flag 'etm->sample_branches', if any flag is true then it swaps packets so can save correct content to 'prev_packet'. Finally this can fix the wrong program flow dumping issue. The patch has a minor refactoring to use 'etm->synth_opts.last_branch' instead of 'etmq->etm->synth_opts.last_branch' for condition checking, this is consistent with that is done in cs_etm__sample(). Signed-off-by: Leo Yan <leo.yan(a)linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier(a)linaro.org> Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com> Cc: Jiri Olsa <jolsa(a)redhat.com> Cc: Mike Leach <mike.leach(a)linaro.org> Cc: Namhyung Kim <namhyung(a)kernel.org> Cc: Robert Walker <robert.walker(a)arm.com> Cc: coresight(a)lists.linaro.org Cc: linux-arm-kernel(a)lists.infradead.org Link: http://lkml.kernel.org/r/1544513908-16805-2-git-send-email-leo.yan@linaro.o… Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com> Signed-off-by: Sasha Levin <sashal(a)kernel.org> --- tools/perf/util/cs-etm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index ca577658e890..7b5e15cc6b71 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -1005,7 +1005,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) } swap_packet: - if (etmq->etm->synth_opts.last_branch) { + if (etm->sample_branches || etm->synth_opts.last_branch) { /* * Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for * the next incoming packet. -- 2.19.1

6 years, 10 months

[PATCH AUTOSEL 4.20 093/117] perf cs-etm: Correct packets swapping in cs_etm__flush()

by Sasha Levin

From: Leo Yan <leo.yan(a)linaro.org> [ Upstream commit 43fd56669c28cd354e9228bdb58e4bca1c1a8b66 ] The structure cs_etm_queue uses 'prev_packet' to point to previous packet, this can be used to combine with new coming packet to generate samples. In function cs_etm__flush() it swaps packets only when the flag 'etm->synth_opts.last_branch' is true, this means that it will not swap packets if without option '--itrace=il' to generate last branch entries; thus for this case the 'prev_packet' doesn't point to the correct previous packet and the stale packet still will be used to generate sequential sample. Thus if dump trace with 'perf script' command we can see the incorrect flow with the stale packet's address info. This patch corrects packets swapping in cs_etm__flush(); except using the flag 'etm->synth_opts.last_branch' it also checks the another flag 'etm->sample_branches', if any flag is true then it swaps packets so can save correct content to 'prev_packet'. Finally this can fix the wrong program flow dumping issue. The patch has a minor refactoring to use 'etm->synth_opts.last_branch' instead of 'etmq->etm->synth_opts.last_branch' for condition checking, this is consistent with that is done in cs_etm__sample(). Signed-off-by: Leo Yan <leo.yan(a)linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier(a)linaro.org> Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com> Cc: Jiri Olsa <jolsa(a)redhat.com> Cc: Mike Leach <mike.leach(a)linaro.org> Cc: Namhyung Kim <namhyung(a)kernel.org> Cc: Robert Walker <robert.walker(a)arm.com> Cc: coresight(a)lists.linaro.org Cc: linux-arm-kernel(a)lists.infradead.org Link: http://lkml.kernel.org/r/1544513908-16805-2-git-send-email-leo.yan@linaro.o… Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com> Signed-off-by: Sasha Levin <sashal(a)kernel.org> --- tools/perf/util/cs-etm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index 73430b73570d..c2f0c92623f0 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -1005,7 +1005,7 @@ static int cs_etm__flush(struct cs_etm_queue *etmq) } swap_packet: - if (etmq->etm->synth_opts.last_branch) { + if (etm->sample_branches || etm->synth_opts.last_branch) { /* * Swap PACKET with PREV_PACKET: PACKET becomes PREV_PACKET for * the next incoming packet. -- 2.19.1

6 years, 10 months

Re: [PATCH] dts: arm64: add CoreSight trace support for hi3660

by leo.yan＠linaro.org

Hi Wanglai, [ + CoreSight mailing list ] On Tue, Dec 11, 2018 at 06:05:49PM +0800, Wanglai Shi wrote: > This patch adds devicetree entries for the CoreSight trace > components on hi3660. > > Signed-off-by: Wanglai Shi <shiwanglai(a)hisilicon.com> > --- > .../arm64/boot/dts/hisilicon/hi3660-coresight.dtsi | 428 +++++++++++++++++++++ > 1 file changed, 428 insertions(+) > create mode 100644 arch/arm64/boot/dts/hisilicon/hi3660-coresight.dtsi This patch doesn't work on Hikey960 due CoreSight related dt binding has not been really included by dts file. > diff --git a/arch/arm64/boot/dts/hisilicon/hi3660-coresight.dtsi b/arch/arm64/boot/dts/hisilicon/hi3660-coresight.dtsi > new file mode 100644 > index 0000000..95c79e4 > --- /dev/null > +++ b/arch/arm64/boot/dts/hisilicon/hi3660-coresight.dtsi > @@ -0,0 +1,428 @@ > +/* > + * dtsi for Hisilicon Hi3660 Coresight > + * > + * Copyright (C) 2016-2017 Hisilicon Ltd. s/2017/2018 > + * > + * Author: Wanglai Shi <shiwanglai(a)hisilicon.com> > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License version 2 as > + * publishhed by the Free Software Foundation. > + */ > +/ { > + amba { s/amba/soc > + #address-cells = <2>; > + #size-cells = <2>; > + compatible = "arm,amba-bus"; > + ranges; If under 'soc' node, because 'soc' has defined its address/size cells length, thus you don't need to define these properties repeatly. > + > + /* A53 cluster internal coresight */ > + etm@0,ecc40000 { s/etm@0,ecc40000/etm@ecc40000 > + compatible = "arm,coresight-etm4x","arm,primecell"; Add extra space between two compatible strings: compatible = "arm,coresight-etm4x", "arm,primecell"; Please apply this rule for all below compatible string bindings. > + reg = <0 0xecc40000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + cpu = <&cpu0>; > + port { > + etm0_out_port: endpoint { > + remote-endpoint = <&funnel0_in_port0>; > + }; > + }; Since Suzuki introduced CoreSight DT bindings for "out-ports" and "in-ports", so it's suggested to use more explict way to bind hardware port with specifying direction: out-ports { port { etm0_out: endpoint { remote-endpoint = <&funnel0_in_port0>; }; }; }; > + }; > + > + etm@1,ecd40000 { Remove '1,' > + compatible = "arm,coresight-etm4x","arm,primecell"; > + reg = <0 0xecd40000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + cpu = <&cpu1>; > + port { Suggest for adding 'out-ports'. > + etm1_out_port: endpoint { > + remote-endpoint = <&funnel0_in_port1>; > + }; > + }; > + }; > + > + etm@2,ece40000 { Remove '2,' > + compatible = "arm,coresight-etm4x","arm,primecell"; > + reg = <0 0xece40000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + cpu = <&cpu2>; > + port { Suggest for adding 'out-ports'. > + etm2_out_port: endpoint { > + remote-endpoint = <&funnel0_in_port2>; > + }; > + }; > + }; > + > + etm@3,ecf40000 { Remove '3,' > + compatible = "arm,coresight-etm4x","arm,primecell"; > + reg = <0 0xecf40000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + cpu = <&cpu3>; > + port { Suggest for adding 'out-ports'. > + etm3_out_port: endpoint { > + remote-endpoint = <&funnel0_in_port3>; > + }; > + }; > + }; > + > + funnel0:funnel@0,ec801000 { funnel0: funnel@ec801000 > + compatible = "arm,coresight-funnel","arm,primecell"; > + reg = <0 0xec801000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + ports { > + #address-cells = <1>; > + #size-cells = <0>; > + > + /* funnel output port */ > + port@0 { > + reg = <0>; > + funnel0_out_port: endpoint { > + remote-endpoint = > + <&etf0_in_port>; > + }; > + }; Suggest for below format: out-ports { port { reg = <0>; clusster0_funnel_out_port: endpoint { remote-endpoint = <&etf0_in_port>; }; }; }; > + > + /* funnel input ports */ > + port@1 { Suggest for adding 'in-ports'. BTW, please keep the consistence between node name and registers; e.g. port@0 should be consistent with 'reg = <0>;' and port@1 for 'reg = <1>;' ... So this should be port@0 > + reg = <0>; > + funnel0_in_port0: endpoint { > + slave-mode; > + remote-endpoint = > + <&etm0_out_port>; > + }; > + }; > + > + port@2 { s/port@2/port@1 > + reg = <1>; > + funnel0_in_port1: endpoint { > + slave-mode; > + remote-endpoint = > + <&etm1_out_port>; > + }; > + }; > + > + port@3 { s/port@3/port@2 > + reg = <2>; > + funnel0_in_port2: endpoint { > + slave-mode; > + remote-endpoint = > + <&etm2_out_port>; > + }; > + }; > + > + port@4 { s/port@4/port@3 > + reg = <3>; > + funnel0_in_port3: endpoint { > + slave-mode; > + remote-endpoint = > + <&etm3_out_port>; > + }; > + }; > + }; > + }; > + > + etf0:etf@0,ec802000 { etf0: etf@ec802000 > + compatible = "arm,coresight-tmc","arm,primecell"; > + reg = <0 0xec802000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + ports { Same with upper suggestions: add 'in-ports' and 'out-ports'. > + #address-cells = <1>; > + #size-cells = <0>; > + /* input port */ > + port@0 { > + reg = <0>; > + etf0_in_port: endpoint { > + slave-mode; > + remote-endpoint = > + <&funnel0_out_port>; > + }; > + }; > + > + /* output port */ > + port@1 { s/port@1/port@0 > + reg = <0>; > + etf0_out_port: endpoint { > + remote-endpoint = > + <&funnel2_in_port0>; > + }; > + }; > + }; > + }; > + > + /* A73 cluster internal coresight */ > + etm@4,ed440000 { Same suggestion with CA53 cluster bindings for etm. > + compatible = "arm,coresight-etm4x","arm,primecell"; > + reg = <0 0xed440000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + cpu = <&cpu4>; > + port { > + etm4_out_port: endpoint { > + remote-endpoint = <&funnel1_in_port0>; > + }; > + }; > + }; > + > + etm@5,ed540000 { > + compatible = "arm,coresight-etm4x","arm,primecell"; > + reg = <0 0xed540000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + cpu = <&cpu5>; > + port { > + etm5_out_port: endpoint { > + remote-endpoint = <&funnel1_in_port1>; > + }; > + }; > + }; > + > + etm@6,ed640000 { > + compatible = "arm,coresight-etm4x","arm,primecell"; > + reg = <0 0xed640000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + cpu = <&cpu6>; > + port { > + etm6_out_port: endpoint { > + remote-endpoint = <&funnel1_in_port2>; > + }; > + }; > + }; > + > + etm@7,ed740000 { > + compatible = "arm,coresight-etm4x","arm,primecell"; > + reg = <0 0xed740000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + cpu = <&cpu7>; > + port { > + etm7_out_port: endpoint { > + remote-endpoint = <&funnel1_in_port3>; > + }; > + }; > + }; > + > + funnel1:funnel@1,ed001000 { Same suggestion for funnel0. > + compatible = "arm,coresight-funnel","arm,primecell"; > + reg = <0 0xed001000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + ports { > + #address-cells = <1>; > + #size-cells = <0>; > + > + /* funnel output port */ > + port@0 { > + reg = <0>; > + funnel1_out_port: endpoint { > + remote-endpoint = > + <&etf1_in_port>; > + }; > + }; > + > + /* funnel input ports */ > + port@1 { > + reg = <0>; > + funnel1_in_port0: endpoint { > + slave-mode; > + remote-endpoint = > + <&etm4_out_port>; > + }; > + }; > + > + port@2 { > + reg = <1>; > + funnel1_in_port1: endpoint { > + slave-mode; > + remote-endpoint = > + <&etm5_out_port>; > + }; > + }; > + > + port@3 { > + reg = <2>; > + funnel1_in_port2: endpoint { > + slave-mode; > + remote-endpoint = > + <&etm6_out_port>; > + }; > + }; > + > + port@4 { > + reg = <3>; > + funnel1_in_port3: endpoint { > + slave-mode; > + remote-endpoint = > + <&etm7_out_port>; > + }; > + }; > + }; > + }; > + > + etf1:etf@1,ed002000 { Same suggestion for etf0. > + compatible = "arm,coresight-tmc","arm,primecell"; > + reg = <0 0xed002000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + ports { > + #address-cells = <1>; > + #size-cells = <0>; > + /* input port */ > + port@0 { > + reg = <0>; > + etf1_in_port: endpoint { > + slave-mode; > + remote-endpoint = > + <&funnel1_out_port>; > + }; > + }; > + > + /* output port */ > + port@1 { > + reg = <0>; > + etf1_out_port: endpoint { > + remote-endpoint = > + <&funnel2_in_port0>; > + }; > + }; > + }; > + }; > + > + /* Top coresight config */ > + funnel@2,ec031000 { > + compatible = "arm,coresight-funnel","arm,primecell"; > + reg = <0 0xec031000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + ports { > + #address-cells = <1>; > + #size-cells = <0>; > + > + /* funnel output port */ > + port@0 { > + reg = <0>; > + funnel2_out_port: endpoint { > + remote-endpoint = > + <&etf2_in_port>; > + }; > + }; > + > + /* funnel input ports */ > + port@1 { > + reg = <0>; > + funnel2_in_port0: endpoint { > + slave-mode; > + remote-endpoint = > + <&etf0_out_port>; > + }; > + }; I think this funnel should have two input ports: one is for etf0 and another is for connection etf1, but here it misses for etf1. Do I miss anything for this? > + }; > + }; > + > + etf@2,ec036000 { Same suggestion for etf0/1. > + compatible = "arm,coresight-tmc","arm,primecell"; > + reg = <0 0xec036000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + ports { > + #address-cells = <1>; > + #size-cells = <0>; > + /* input port */ > + port@0 { > + reg = <0>; > + etf2_in_port: endpoint { > + slave-mode; > + remote-endpoint = > + <&funnel2_out_port>; > + }; > + }; > + > + /* output port */ > + port@1 { > + reg = <0>; > + etf2_out_port: endpoint { > + remote-endpoint = > + <&replicator0_in_port>; > + }; > + }; > + }; > + }; > + > + replicator { > + compatible = "arm,coresight-replicator"; Replicator doesn't have clock? > + > + ports { > + #address-cells = <1>; > + #size-cells = <0>; > + > + /* etr out port */ > + port@0 { > + reg = <0>; > + replicator0_out_port0: endpoint { > + remote-endpoint = > + <&etr_in_port>; > + }; > + }; > + /* TPIU out port */ > + port@1 { > + reg = <1>; > + replicator0_out_port1: endpoint { > + remote-endpoint = > + <&tpiu_in_port>; > + }; > + }; > + /* input port */ > + port@2 { > + reg = <0>; > + replicator0_in_port: endpoint { > + slave-mode; > + remote-endpoint = > + <&etf2_out_port>; > + }; > + }; > + }; > + }; > + > + etr@0,ec033000 { > + compatible = "arm,coresight-tmc","arm,primecell"; > + reg = <0 0xec033000 0 0x1000>; > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + ports { > + #address-cells = <1>; > + #size-cells = <0>; Usually etr doesn't need to set address-cells and size-cell anymore. > + > + /* etr input port */ > + port@0 { > + reg = <0>; > + etr_in_port: endpoint { > + slave-mode; >From Documentation/devicetree/bindings/arm/coresight.txt, I cannot see there have 'slave-mode' property. > + remote-endpoint = > + <&replicator0_out_port0>; > + }; > + }; > + }; > + }; > + > + tpiu@ec032000 { > + compatible = "arm,coresight-tpiu", "arm,primecell"; > + reg = <0 0xec032000 0 0x1000>; > + > + clocks = <&pclk>; > + clock-names = "apb_pclk"; > + port { > + tpiu_in_port: endpoint { > + slave-mode; > + remote-endpoint = > + <&replicator0_out_port1>; > + }; > + }; > + }; > + }; I don't see CPU debug module DT binding, could you add them as well? You could use the single one patch to contain CPU debug module or use one dedicated patch, both will be okay for me. Thanks, Leo Yan > +}; > -- > 2.7.4 >

6 years, 11 months

[PATCH v3 0/8] perf cs-etm: Correct packets handling

by Leo Yan

perf cs-etm module converts decoder elements to packets and then we have more context crossing packets to generate synthenize samples, finally perf tool can faciliate samples for statistics and report the results. This patch series is to address several issues found related with packets handling and samples generation when worked firstly on branch sample flags support for Arm CoreSight trace data, so this patch series is dependency for sample flags setting, will send another dedicated patch series for sample flags later. In this patch series, the first two patches are mainly to fix issues in cs_etm__flush(): patch 0001 corrects packets swapping in cs_etm__flush() and this can fix the wrong branch sample caused by the missed packets swapping; patch 0002 is to fix the wrong samples generation with stale packets at the end of trace block. Patch 0003 and 0004 are for minor fixing; patch 0003 removes unused field 'cs_etm_decoder::trace_on', this can simplize the switch-case code for all discontinuity packet generation by using one code block; patch 0004 is to refactor enumeration cs_etm_sample_type. Patch 0005 is to rename CS_ETM_TRACE_ON to CS_ETM_DISCONTINUITY, we use a more general packet type to present trace discontinuity, so it can be used by TRACE_ON event, and also can be used by NO_SYNC and EO_TRACE elements. Patch 0006 is used to support NO_SYNC packet, otherwise the trace decoding cannot reflect the tracing discontinuity caused by NO_SYNC packet. Patch 0007 is used to support EO_TRACE packet, which also introduces the tracing discontinuity at the end of trace and we should save last trace data for it. Patch 0008 is used to generate branch sample for exception packets. Credit to Mike Leach and Robert Walker who made me clear for underlying mechanism for NO_SYNC/EO_TRACE elements, Mike also shared the detailed explanation for why we can treat NO_SYNC and TRACE_ON elements as the same, so except following Mike & Rob suggestion for trace discontinuity consolidation, most commit log of patches 0006/0007 also come from Mike's explanation. This patch series is applied directly on the acme's perf/core branch [1] with latest commit aaab25f03e9e ("perf trace: Allow selecting use the use of the ordered_events code"). With applying the dependency patch, this patch series has been tested for branch samples dumping with below command on Juno board: # perf script -F,-time,+ip,+sym,+dso,+addr,+symoff -k vmlinux Changes from v2: * Addressed Mathieu's comments and suggestions for minor refactoring for removing unused field 'cs_etm_decoder::trace_on' and added one dedicated patch to refactor enumeration cs_etm_sample_type. * Added Mathieu's 'reviewed' tags; Very appreciate Mathieu's many suggestion for crossing several patch series. Changes from v1: * Synced the consistent code in patch 0001 for condition checking. * Introduced new function cs_etm__end_block() for flushing packet at the end of trace block. * Added new patch 0003 to rename CS_ETM_TRACE_ON to CS_ETM_DISCONTINUITY. * Used the same one packet type CS_ETM_DISCONTINUITY for all trace discontinuity (include support TRACE_ON/EO_TRACE/NO_SYNC packets). * Removed tracking exception number patch, which will be added in sample flag patch series. Leo Yan (8): perf cs-etm: Correct packets swapping in cs_etm__flush() perf cs-etm: Avoid stale branch samples when flush packet perf cs-etm: Remove unused 'trace_on' in cs_etm_decoder perf cs-etm: Refactor enumeration cs_etm_sample_type perf cs-etm: Rename CS_ETM_TRACE_ON to CS_ETM_DISCONTINUITY perf cs-etm: Treat NO_SYNC element as trace discontinuity perf cs-etm: Treat EO_TRACE element as trace discontinuity perf cs-etm: Generate branch sample for exception packet tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 42 +++++++++----- tools/perf/util/cs-etm-decoder/cs-etm-decoder.h | 10 ++-- tools/perf/util/cs-etm.c | 77 ++++++++++++++++++++++--- 3 files changed, 100 insertions(+), 29 deletions(-) -- 2.7.4

6 years, 11 months

[PATCH 33/63] perf cs-etm: Generate branch sample for exception packet

by Arnaldo Carvalho de Melo

From: Leo Yan <leo.yan(a)linaro.org> The exception packet appears as one element with 'elem_type' == OCSD_GEN_TRC_ELEM_EXCEPTION or OCSD_GEN_TRC_ELEM_EXCEPTION_RET, which is present for exception entry and exit respectively. The decoder sets the packet fields 'packet->exc' and 'packet->exc_ret' to indicate the exception packets; but exception packets don't have a dedicated sample type and shares the same sample type CS_ETM_RANGE with normal instruction packets. As a result, the exception packets are taken as normal instruction packets and this introduces confusion in mixing different packet types. Furthermore, these instruction range packets will be processed for branch samples only when 'packet->last_instr_taken_branch' is true, otherwise they will be omitted, this can introduce a mess for exception and exception returning due to not having the complete address range info for context switching. To process exception packets properly, this patch introduces two new sample types: CS_ETM_EXCEPTION and CS_ETM_EXCEPTION_RET; these two types of packets will be handled by cs_etm__exception(). The function cs_etm__exception() forces setting the previous CS_ETM_RANGE packet flag 'prev_packet->last_instr_taken_branch' to true, this matches well with the program flow when the exception is trapped from user space to kernel space, no matter if the most recent flow has branch taken or not; this is also safe for returning to user space after exception handling. After exception packets have their own sample type, the packet fields 'packet->exc' and 'packet->exc_ret' aren't needed anymore, so remove them. Signed-off-by: Leo Yan <leo.yan(a)linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier(a)linaro.org> Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com> Cc: Jiri Olsa <jolsa(a)redhat.com> Cc: Mike Leach <mike.leach(a)linaro.org> Cc: Namhyung Kim <namhyung(a)kernel.org> Cc: Robert Walker <robert.walker(a)arm.com> Cc: coresight ml <coresight(a)lists.linaro.org> Cc: linux-arm-kernel(a)lists.infradead.org Link: http://lkml.kernel.org/r/1544513908-16805-9-git-send-email-leo.yan@linaro.o… Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com> --- .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 26 +++++++++++++---- .../perf/util/cs-etm-decoder/cs-etm-decoder.h | 4 +-- tools/perf/util/cs-etm.c | 28 +++++++++++++++++++ 3 files changed, 50 insertions(+), 8 deletions(-) diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index cda6f074bd03..8c155575c6c5 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -290,8 +290,6 @@ static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder) decoder->packet_buffer[i].instr_count = 0; decoder->packet_buffer[i].last_instr_taken_branch = false; decoder->packet_buffer[i].last_instr_size = 0; - decoder->packet_buffer[i].exc = false; - decoder->packet_buffer[i].exc_ret = false; decoder->packet_buffer[i].cpu = INT_MIN; } } @@ -319,8 +317,6 @@ cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder, decoder->packet_buffer[et].sample_type = sample_type; decoder->packet_buffer[et].isa = CS_ETM_ISA_UNKNOWN; - decoder->packet_buffer[et].exc = false; - decoder->packet_buffer[et].exc_ret = false; decoder->packet_buffer[et].cpu = *((int *)inode->priv); decoder->packet_buffer[et].start_addr = CS_ETM_INVAL_ADDR; decoder->packet_buffer[et].end_addr = CS_ETM_INVAL_ADDR; @@ -397,6 +393,22 @@ cs_etm_decoder__buffer_discontinuity(struct cs_etm_decoder *decoder, CS_ETM_DISCONTINUITY); } +static ocsd_datapath_resp_t +cs_etm_decoder__buffer_exception(struct cs_etm_decoder *decoder, + const uint8_t trace_chan_id) +{ + return cs_etm_decoder__buffer_packet(decoder, trace_chan_id, + CS_ETM_EXCEPTION); +} + +static ocsd_datapath_resp_t +cs_etm_decoder__buffer_exception_ret(struct cs_etm_decoder *decoder, + const uint8_t trace_chan_id) +{ + return cs_etm_decoder__buffer_packet(decoder, trace_chan_id, + CS_ETM_EXCEPTION_RET); +} + static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( const void *context, const ocsd_trc_index_t indx __maybe_unused, @@ -420,10 +432,12 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( trace_chan_id); break; case OCSD_GEN_TRC_ELEM_EXCEPTION: - decoder->packet_buffer[decoder->tail].exc = true; + resp = cs_etm_decoder__buffer_exception(decoder, + trace_chan_id); break; case OCSD_GEN_TRC_ELEM_EXCEPTION_RET: - decoder->packet_buffer[decoder->tail].exc_ret = true; + resp = cs_etm_decoder__buffer_exception_ret(decoder, + trace_chan_id); break; case OCSD_GEN_TRC_ELEM_PE_CONTEXT: case OCSD_GEN_TRC_ELEM_ADDR_NACC: diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h index a27231722e27..a6407d41598f 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h @@ -26,6 +26,8 @@ enum cs_etm_sample_type { CS_ETM_EMPTY, CS_ETM_RANGE, CS_ETM_DISCONTINUITY, + CS_ETM_EXCEPTION, + CS_ETM_EXCEPTION_RET, }; enum cs_etm_isa { @@ -43,8 +45,6 @@ struct cs_etm_packet { u32 instr_count; u8 last_instr_taken_branch; u8 last_instr_size; - u8 exc; - u8 exc_ret; int cpu; }; diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c index cea3158915d3..27a374ddf661 100644 --- a/tools/perf/util/cs-etm.c +++ b/tools/perf/util/cs-etm.c @@ -1000,6 +1000,25 @@ static int cs_etm__sample(struct cs_etm_queue *etmq) return 0; } +static int cs_etm__exception(struct cs_etm_queue *etmq) +{ + /* + * When the exception packet is inserted, whether the last instruction + * in previous range packet is taken branch or not, we need to force + * to set 'prev_packet->last_instr_taken_branch' to true. This ensures + * to generate branch sample for the instruction range before the + * exception is trapped to kernel or before the exception returning. + * + * The exception packet includes the dummy address values, so don't + * swap PACKET with PREV_PACKET. This keeps PREV_PACKET to be useful + * for generating instruction and branch samples. + */ + if (etmq->prev_packet->sample_type == CS_ETM_RANGE) + etmq->prev_packet->last_instr_taken_branch = true; + + return 0; +} + static int cs_etm__flush(struct cs_etm_queue *etmq) { int err = 0; @@ -1148,6 +1167,15 @@ static int cs_etm__run_decoder(struct cs_etm_queue *etmq) */ cs_etm__sample(etmq); break; + case CS_ETM_EXCEPTION: + case CS_ETM_EXCEPTION_RET: + /* + * If the exception packet is coming, + * make sure the previous instruction + * range packet to be handled properly. + */ + cs_etm__exception(etmq); + break; case CS_ETM_DISCONTINUITY: /* * Discontinuity in trace, flush -- 2.19.2

6 years, 11 months

[PATCH 32/63] perf cs-etm: Treat EO_TRACE element as trace discontinuity

by Arnaldo Carvalho de Melo

From: Leo Yan <leo.yan(a)linaro.org> If the decoder outputs an EO_TRACE element, it means the end of the trace buffer; this is a discontinuity and in this case the end of trace data needs to be saved. This patch generates a CS_ETM_DISCONTINUITY packet for the EO_TRACE element hereby flushing the end of trace data in cs-etm.c. Signed-off-by: Leo Yan <leo.yan(a)linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier(a)linaro.org> Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com> Cc: Jiri Olsa <jolsa(a)redhat.com> Cc: Mike Leach <mike.leach(a)linaro.org> Cc: Namhyung Kim <namhyung(a)kernel.org> Cc: Robert Walker <robert.walker(a)arm.com> Cc: coresight(a)lists.linaro.org Cc: linux-arm-kernel(a)lists.infradead.org Link: http://lkml.kernel.org/r/1544513908-16805-8-git-send-email-leo.yan@linaro.o… Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com> --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index bee026e76a4c..cda6f074bd03 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -409,6 +409,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( switch (elem->elem_type) { case OCSD_GEN_TRC_ELEM_UNKNOWN: break; + case OCSD_GEN_TRC_ELEM_EO_TRACE: case OCSD_GEN_TRC_ELEM_NO_SYNC: case OCSD_GEN_TRC_ELEM_TRACE_ON: resp = cs_etm_decoder__buffer_discontinuity(decoder, @@ -425,7 +426,6 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( decoder->packet_buffer[decoder->tail].exc_ret = true; break; case OCSD_GEN_TRC_ELEM_PE_CONTEXT: - case OCSD_GEN_TRC_ELEM_EO_TRACE: case OCSD_GEN_TRC_ELEM_ADDR_NACC: case OCSD_GEN_TRC_ELEM_TIMESTAMP: case OCSD_GEN_TRC_ELEM_CYCLE_COUNT: -- 2.19.2

6 years, 11 months

[PATCH 31/63] perf cs-etm: Treat NO_SYNC element as trace discontinuity

by Arnaldo Carvalho de Melo

From: Leo Yan <leo.yan(a)linaro.org> The CoreSight tracer driver might insert barrier packets between different buffers, thus the decoder can spot the boundaries based on the barrier packet; it is possible for the decoder to hit a barrier packet and emit a NO_SYNC element, then the decoder will find a periodic synchronisation point inside that next trace block that starts the trace again but does not have the TRACE_ON element as indicator - usually because this trace block has wrapped the buffer so we have lost the original point when the trace was enabled. In the first case it causes the insertion of a OCSD_GEN_TRC_ELEM_NO_SYNC in the middle of the tracing stream, but as we were not handling the NO_SYNC element properly this ends up making users miss the discontinuity indications. Though OCSD_GEN_TRC_ELEM_NO_SYNC is different from CS_ETM_TRACE_ON when output from the decoder, both indicate that the trace data is discontinuous; this patch treats OCSD_GEN_TRC_ELEM_NO_SYNC as a trace discontinuity and generates a CS_ETM_DISCONTINUITY packet for it, so cs-etm can handle the discontinuity for this case, finally it saves the last trace data for the previous trace block and restart samples for the new block. Signed-off-by: Leo Yan <leo.yan(a)linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier(a)linaro.org> Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com> Cc: Jiri Olsa <jolsa(a)redhat.com> Cc: Mike Leach <mike.leach(a)linaro.org> Cc: Namhyung Kim <namhyung(a)kernel.org> Cc: Robert Walker <robert.walker(a)arm.com> Cc: coresight ml <coresight(a)lists.linaro.org> Cc: linux-arm-kernel(a)lists.infradead.org Link: http://lkml.kernel.org/r/1544513908-16805-7-git-send-email-leo.yan@linaro.o… Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com> --- tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 1 - 1 file changed, 1 deletion(-) diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c index 1039f364f4cc..bee026e76a4c 100644 --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c @@ -410,7 +410,6 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer( case OCSD_GEN_TRC_ELEM_UNKNOWN: break; case OCSD_GEN_TRC_ELEM_NO_SYNC: - break; case OCSD_GEN_TRC_ELEM_TRACE_ON: resp = cs_etm_decoder__buffer_discontinuity(decoder, trace_chan_id); -- 2.19.2

6 years, 11 months

Jump to page:

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

CoreSight