This patchset introduces initial concepts in CoreSight system
configuration management support, to allow more detailed and complex
programming to be applied to CoreSight systems during trace capture.
Configurations consist of 2 elements:-
1) Features - programming combinations for devices, applied to a class of
device on the system (all ETMv4), or individual devices.
2) Configurations - a set of programmed features used when the named
configuration is selected.
Features and configurations are declared as a data table, a set of register,
resource and parameter requirements. Features and configurations are loaded
into the system by the virtual cs_syscfg device. This then matches features
to any registered devices and loads the feature into them.
Individual device classes that support features and configurations register
with cs_syscfg.
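The matching step described above can be pictured with a small, self-contained
C sketch. Every name here (sketch_feature, sketch_device, the SKETCH_MATCH_*
bits) is hypothetical and chosen only for illustration; it is not the
descriptor layout or API used by the patches, just one way a data-table
feature might be matched against registered devices by class.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device-class bits; the real driver's match flags may differ. */
#define SKETCH_MATCH_CLASS_ETM4  (1u << 0)
#define SKETCH_MATCH_CLASS_CTI   (1u << 1)

/* One register programming entry in a feature's data table. */
struct sketch_reg {
	uint32_t offset;	/* register offset within the device */
	uint32_t value;		/* value to program */
};

/* A feature: a named table of register values plus a device-class match. */
struct sketch_feature {
	const char *name;
	uint32_t match_flags;
	const struct sketch_reg *regs;
	size_t nr_regs;
};

/* A device that has registered for configuration support. */
struct sketch_device {
	const char *name;
	uint32_t class_flags;
	const struct sketch_feature *loaded[8];
	size_t nr_loaded;
};

/* Load feat into every registered device whose class matches; return count. */
static size_t sketch_load_feature(const struct sketch_feature *feat,
				  struct sketch_device *devs, size_t nr_devs)
{
	size_t i, loaded = 0;

	for (i = 0; i < nr_devs; i++) {
		struct sketch_device *d = &devs[i];

		if (!(d->class_flags & feat->match_flags))
			continue;	/* wrong device class */
		if (d->nr_loaded >= sizeof(d->loaded) / sizeof(d->loaded[0]))
			continue;	/* no room: simply skipped in this sketch */
		d->loaded[d->nr_loaded++] = feat;
		loaded++;
	}
	return loaded;
}
```

Devices that never register (for example static funnels) are simply absent
from the array, so they need no special handling in such a scheme.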
Once loaded a configuration can be enabled for a specific trace run.
Configurations are registered with the perf cs_etm event as entries in
cs_etm/events. These can be selected on the perf command line as follows:-
perf record -e cs_etm/<config_name>/ ...
This patch set has one pre-loaded configuration and feature.
A named "strobing" feature is provided for ETMv4.
A named "autofdo" configuration is provided. This configuration enables
strobing on any ETM in use.
Thus the command:
perf record -e cs_etm/autofdo/ ...
will trace the supplied application while enabling the "autofdo" configuration
on each ETM as it is enabled by perf. This in turn will enable strobing for
the ETM - with default parameters. Parameters can be adjusted using configfs.
The sink used in the trace run will be automatically selected.
A configuration can supply up to 15 sets of preset parameter values, which will
substitute for the parameter values of any feature used in the configuration.
Preset values are selected as follows:
perf record -e cs_etm/autofdo,preset=1/ ...
(valid presets 1-N, where N is the number supplied in the configuration, not
exceeding 15. preset=0 is the same as not selecting a preset.)
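The preset rules above (0 means "no preset", 1..N selects a supplied set, N is
at most 15) can be sketched as follows. The structure and function names are
hypothetical, and the parameter array size is an arbitrary choice for the
sketch; this is not the code in the patches.

```c
#include <stddef.h>

#define SKETCH_MAX_PRESETS 15	/* limit stated in the text above */
#define SKETCH_MAX_PARAMS  4	/* arbitrary size for this sketch */

struct sketch_config {
	size_t nr_presets;	/* N: number of presets supplied, <= 15 */
	size_t nr_params;	/* parameters per preset */
	unsigned long defaults[SKETCH_MAX_PARAMS];
	unsigned long presets[SKETCH_MAX_PRESETS][SKETCH_MAX_PARAMS];
};

/*
 * Return the parameter values to use for 'preset', or NULL if the
 * selection is out of range. preset == 0 falls back to the defaults.
 */
static const unsigned long *
sketch_select_preset(const struct sketch_config *cfg, unsigned int preset)
{
	if (cfg->nr_presets > SKETCH_MAX_PRESETS)
		return NULL;		/* malformed configuration */
	if (preset == 0)
		return cfg->defaults;	/* same as not selecting a preset */
	if (preset > cfg->nr_presets)
		return NULL;		/* valid presets are 1..N */
	return cfg->presets[preset - 1];
}
```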
Applies to & tested against coresight/next (5.13-rc6 base)
Changes since v7:
Fixed kernel test robot issue - config with CORESIGHT=y & CONFIGFS_FS=m causes
build error. Altered CORESIGHT config to select CONFIGFS_FS.
Reported-by: kernel test robot <lkp(a)intel.com>
Replaced mutex use to protect loaded config lists in coresight devices with per
device spinlock to remove issue when disable called in interrupt context.
Reported-by: Branislav Rankov <branislav.rankov(a)arm.com>
Changes since v6:
Fixed kernel test robot issues-
Reported-by: kernel test robot <lkp(a)intel.com>
Changes since v5:
1) Fix code style issues from auto-build reports, as
Reported-by: kernel test robot <lkp(a)intel.com>
2) Update comments to get consistent docs for API functions.
3) Remove unused #define from autofdo example.
4) Fix perf code style issues from patch 4 (Mathieu)
5) Fix configfs code style issues from patch 9. (Mathieu)
Changes since v4: (based on comments from Mathieu and Suzuki).
No large functional changes - primarily code improvements and naming schema.
1) Updated entire set to ensure a consistent naming scheme was used for
variables and struct members that refer to the key objects in the system.
Suffix _desc used for all references to feature and configuration descriptors,
suffix _csdev used for all references to loaded features and configs in the csdev
instances. (Mathieu & Suzuki).
2) Dropped the 'configurations' sub dir in cs_etm perf directories as superfluous
with the configfs containing the same information. (Mathieu).
3) Simplified perf handling code (Suzuki)
4) Multiple simplifications and improvements for code readability (Mathieu
and Suzuki)
Changes since v3: (Primarily based on comments from Mathieu)
1) Locking mechanisms simplified.
2) Removed the possibility to enable features independently from
configurations. Only configurations can be enabled now. Simplifies programming
logic.
3) Configuration now uses an activate->enable mechanism. This means that perf
will activate a selected configuration at the start of a session (during
setup_aux), and disable at the end of a session (around free_aux)
The active configuration and associated features will be programmed into the
CoreSight device instances when they are enabled. This locks the configuration
into the system while in use. Parameters cannot be altered while this is
in place. This mechanism will be extended in future for dynamic load / unload
of configurations to prevent removal while in use.
4) Removed the custom bus / driver as unnecessary. A single device is
registered to own perf fs elements and configfs.
5) Various other minor issues addressed.
Changes since v2:
1) Added documentation file.
2) Altered cs_syscfg driver to no longer be coresight_device based, and moved
to its own custom bus to remove it from the main coresight bus. (Mathieu)
3) Added configfs support to inspect and control loaded configurations and
features. Allows listing of preset values (Yabin Cui)
4) Dropped sysfs support for adjusting feature parameters on the per device
basis, in favour of a single point adjustment in configfs that is pushed to all
device instances.
5) Altered how the config and preset command line options are handled in perf
and the drivers. (Mathieu and Suzuki).
6) Fixes for various issues and technical points (Mathieu, Yabin)
Changes since v1:
1) Moved preloaded configurations and features out of individual drivers.
2) Added cs_syscfg driver to manage configurations and features. Individual
drivers register with cs_syscfg indicating support for config, and provide
matching information that the system uses to load features into the drivers.
This allows individual drivers to be updated on an as needed basis - and
removes the need to consider devices that cannot benefit from configuration -
static replicators, funnels, tpiu.
3) Added perf selection of configurations.
4) Rebased onto the coresight module loading set.
To follow in future revisions / sets:-
a) load of additional config and features by loadable module.
b) load of additional config and features by configfs
c) enhanced resource management for ETMv4 and checking features have sufficient
resources to be enabled.
d) ECT and CTI support for configuration and features.
Mike Leach (10):
coresight: syscfg: Initial coresight system configuration
coresight: syscfg: Add registration and feature loading for cs devices
coresight: config: Add configuration and feature generic functions
coresight: etm-perf: update to handle configuration selection
coresight: syscfg: Add API to activate and enable configurations
coresight: etm-perf: Update to activate selected configuration
coresight: etm4x: Add complex configuration handlers to etmv4
coresight: config: Add preloaded configurations
coresight: syscfg: Add initial configfs support
Documentation: coresight: Add documentation for CoreSight config
.../trace/coresight/coresight-config.rst | 244 ++++++
Documentation/trace/coresight/coresight.rst | 16 +
drivers/hwtracing/coresight/Kconfig | 1 +
drivers/hwtracing/coresight/Makefile | 7 +-
.../hwtracing/coresight/coresight-cfg-afdo.c | 153 ++++
.../coresight/coresight-cfg-preload.c | 31 +
.../coresight/coresight-cfg-preload.h | 13 +
.../hwtracing/coresight/coresight-config.c | 275 ++++++
.../hwtracing/coresight/coresight-config.h | 253 ++++++
drivers/hwtracing/coresight/coresight-core.c | 12 +-
.../hwtracing/coresight/coresight-etm-perf.c | 150 +++-
.../hwtracing/coresight/coresight-etm-perf.h | 12 +-
.../hwtracing/coresight/coresight-etm4x-cfg.c | 182 ++++
.../hwtracing/coresight/coresight-etm4x-cfg.h | 30 +
.../coresight/coresight-etm4x-core.c | 38 +-
.../coresight/coresight-etm4x-sysfs.c | 3 +
.../coresight/coresight-syscfg-configfs.c | 396 +++++++++
.../coresight/coresight-syscfg-configfs.h | 45 +
.../hwtracing/coresight/coresight-syscfg.c | 829 ++++++++++++++++++
.../hwtracing/coresight/coresight-syscfg.h | 81 ++
include/linux/coresight.h | 11 +
21 files changed, 2746 insertions(+), 36 deletions(-)
create mode 100644 Documentation/trace/coresight/coresight-config.rst
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-afdo.c
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-preload.c
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-preload.h
create mode 100644 drivers/hwtracing/coresight/coresight-config.c
create mode 100644 drivers/hwtracing/coresight/coresight-config.h
create mode 100644 drivers/hwtracing/coresight/coresight-etm4x-cfg.c
create mode 100644 drivers/hwtracing/coresight/coresight-etm4x-cfg.h
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg-configfs.c
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg-configfs.h
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg.c
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg.h
--
2.17.1
On Wed, Jul 14, 2021 at 09:40:15AM +0100, Russell King (Oracle) wrote:
> On Tue, Jul 13, 2021 at 07:13:02PM +0100, Catalin Marinas wrote:
> > We could try to clarify E2.2.1 to simply state that naturally aligned
> > LDRD/STRD are single-copy atomic without any subsequent statement on the
> > translation table.
>
> I think that clarification would be most helpful. Thanks.
Thanks for the suggestion and confirmation, Russell & Catalin.
If so, I will implement the weak functions for
compat_auxtrace_mmap__{read_head|write_tail}; and write the arm/arm64
specific functions using LDRD/STRD instructions.
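As a rough illustration of the weak-function idea, the sketch below shows a
generic fallback that an architecture file could override with a strong
definition (for instance one built on a single naturally aligned LDRD). The
struct, field names and function name are hypothetical stand-ins, not perf's
real definitions, and the memory barriers a real implementation would need
are deliberately left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 64-bit head stored as two 32-bit halves. */
struct sketch_aux_page {
	volatile uint32_t head_lo;
	volatile uint32_t head_hi;
};

/*
 * Generic fallback for a 32-bit reader that cannot load 64 bits atomically:
 * re-read until the high word is unchanged around the low-word read.
 * Declared weak so that an arch-specific strong definition replaces it.
 */
__attribute__((weak))
uint64_t sketch_compat_read_head(struct sketch_aux_page *pg)
{
	uint32_t hi, lo;

	do {
		hi = pg->head_hi;
		lo = pg->head_lo;
	} while (hi != pg->head_hi);	/* retry if the high word moved */

	return ((uint64_t)hi << 32) | lo;
}
```

With this layout, callers only ever see one symbol; whether the generic loop
or an arch override runs is decided at link time.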
For better patch organization, I will use a separate patch set for
enabling the compat functions (in particular patches 10, 11/11) in
the next spin.
Thanks,
Leo
Em Tue, Jul 13, 2021 at 05:31:03PM +0000, Hunter, Adrian escreveu:
> > On Mon, Jul 12, 2021 at 03:14:35PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Sun, Jul 11, 2021 at 06:41:04PM +0800, Leo Yan escreveu:
> > > > +++ b/tools/perf/util/env.c
> > > > @@ -11,6 +11,7 @@
> > > > #include <stdlib.h>
> > > > #include <string.h>
> > > > +int kernel_is_64_bit;
> > > > struct perf_env perf_env;
> > > Why can't this be in 'struct perf_env'?
> > Good question. I considered to add it in struct perf_env but finally I used this
> > way; the reason is this variable "kernel_is_64_bit" is only used during
> > recording phase for AUX ring buffer, and don't use it for report. So seems to
> > me it's over complexity to add a new field and just wonder if it's necessary to
> > save this field as new feature in the perf header.
> I think we store the arch, so if the "kernel_is_64_bit" calculation depends only on arch
> then I guess we don't need a new feature at the moment.
So, I wasn't suggesting to add this info to the perf.data file header,
just to the in-memory 'struct perf_env'.
And also we should avoid unconditionally initializing things that we may
never need, please structure it as:
static void perf_env__init_kernel_mode(struct perf_env *env)
{
	const char *arch = perf_env__raw_arch(env);

	/* Store the result in the env so the lazy getter below can return it. */
	if (!strncmp(arch, "x86_64", 6) || !strncmp(arch, "aarch64", 7) ||
	    !strncmp(arch, "arm64", 5) || !strncmp(arch, "mips64", 6) ||
	    !strncmp(arch, "parisc64", 8) || !strncmp(arch, "riscv64", 7) ||
	    !strncmp(arch, "s390x", 5) || !strncmp(arch, "sparc64", 7))
		env->kernel_is_64_bit = 1;
	else
		env->kernel_is_64_bit = 0;
}

void perf_env__init(struct perf_env *env)
{
	...
	env->kernel_is_64_bit = -1;	/* not yet determined */
	...
}

bool perf_env__kernel_is_64_bit(struct perf_env *env)
{
	if (env->kernel_is_64_bit == -1)
		perf_env__init_kernel_mode(env);

	return env->kernel_is_64_bit;
}
One thing in my TODO is to crack down on the tons of initializations
perf does unconditionally, last time I looked there are lots :-\
- Arnaldo
> > Combining the comment from Adrian in another email, I think it's good to add
> > a new field "compat_mode" in the struct perf_env, and this field will be
> > initialized in build-record.c. Currently we don't need to save this value into
> > the perf file, if later we need to use this value for decoding phase, then we
> > can add a new feature item to save "compat_mode"
> > into the perf file's header.
> > If you have any different idea, please let me know. Thanks!
This patchset consists of refactoring to allow the decoder to be
created in advance when the AUX records are iterated over. The
AUX record flags are used to communicate whether the data is
formatted or not which is the reason this refactoring is required.
These changes result in some simplifications, removal of early exit
conditions etc.
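To make the role of the AUX record flags concrete, here is a minimal sketch of
how a decoder setup path might test them. The macro names are hypothetical and
the bit values (format type held in bits 8 to 15, with 0x0100 meaning raw) are
assumptions made for illustration; check the perf_event UAPI header for the
real definitions before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for illustration only: format type in bits 8-15. */
#define SKETCH_AUX_FORMAT_TYPE_MASK	0xff00u
#define SKETCH_AUX_FORMAT_CORESIGHT	0x0000u	/* frame-formatted trace */
#define SKETCH_AUX_FORMAT_RAW		0x0100u	/* unformatted source data */

/* True if the AUX record describes CoreSight frame-formatted data. */
static bool sketch_aux_is_formatted(uint64_t aux_flags)
{
	return (aux_flags & SKETCH_AUX_FORMAT_TYPE_MASK) ==
	       SKETCH_AUX_FORMAT_CORESIGHT;
}
```

Because the answer is only known once an AUX record has been seen, the
decoder has to be created while the records are iterated, which is the
ordering constraint the refactoring addresses.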
A change was also made to --dump-raw-trace code to allow the
formatted/unformatted status to persist and for the decoder to
not be continually deleted and recreated.
The changes apply on top of the previous patchset "[PATCH v7 0/2] perf
cs-etm: Split Coresight decode by aux records".
Changes since v1:
* Change 'decoders_per_cpu' variable name to 'decoders' and add a comment
* Add a warning that piped mode is best effort, suggested by Suzuki
James Clark (6):
perf cs-etm: Refactor initialisation of kernel start address
perf cs-etm: Split setup and timestamp search functions
perf cs-etm: Only setup queues when they are modified
perf cs-etm: Suppress printing when resetting decoder
perf cs-etm: Use existing decoder instead of resetting it
perf cs-etm: Pass unformatted flag to decoder
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 14 +-
tools/perf/util/cs-etm.c | 185 +++++++++---------
2 files changed, 97 insertions(+), 102 deletions(-)
--
2.28.0
A patch release, OpenCSD v1.1.1, has been made to address C-API include file
issues for the ETE decoder, raised by work on perf tools.
Regards
Mike
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
Add -lstdc++ to perf when linking libopencsd as it is a dependency. It
does not hurt to add it when dynamic linking.
Filter out -static flag when building plugins as they are always built
as dynamic libraries and -static and -dynamic don't work well together
on arm and arm64.
Signed-off-by: Tamas Zsoldos <tamas.zsoldos(a)arm.com>
Signed-off-by: Branislav Rankov <branislav.rankov(a)arm.com>
---
tools/perf/Makefile.config | 2 +-
tools/perf/Makefile.perf | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 73df23dd664c..b014a9bdd0db 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -143,7 +143,7 @@ FEATURE_CHECK_LDFLAGS-libcrypto = -lcrypto
ifdef CSINCLUDES
LIBOPENCSD_CFLAGS := -I$(CSINCLUDES)
endif
-OPENCSDLIBS := -lopencsd_c_api -lopencsd
+OPENCSDLIBS := -lopencsd_c_api -lopencsd -lstdc++
ifdef CSLIBS
LIBOPENCSD_LDFLAGS := -L$(CSLIBS)
endif
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index e47f04e5b51e..cd3cf910fa8a 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -792,7 +792,7 @@ endif
$(patsubst perf-%,%.o,$(PROGRAMS)): $(wildcard */*.h)
-LIBTRACEEVENT_FLAGS += plugin_dir=$(plugindir_SQ) 'EXTRA_CFLAGS=$(EXTRA_CFLAGS)' 'LDFLAGS=$(LDFLAGS)'
+LIBTRACEEVENT_FLAGS += plugin_dir=$(plugindir_SQ) 'EXTRA_CFLAGS=$(EXTRA_CFLAGS)' 'LDFLAGS=$(filter-out -static,$(LDFLAGS))'
$(LIBTRACEEVENT): FORCE
$(Q)$(MAKE) -C $(TRACE_EVENT_DIR) $(LIBTRACEEVENT_FLAGS) O=$(OUTPUT) $(OUTPUT)libtraceevent.a
--
2.17.1
This patchset consists of refactoring to allow the decoder to be
created in advance when the AUX records are iterated over. The
AUX record flags are used to communicate whether the data is
formatted or not which is the reason this refactoring is required.
These changes result in some simplifications, removal of early exit
conditions etc.
A change was also made to --dump-raw-trace code to allow the
formatted/unformatted status to persist and for the decoder to
not be continually deleted and recreated.
The changes apply on top of the previous patchset "[PATCH v7 0/2] perf
cs-etm: Split Coresight decode by aux records".
James Clark (6):
perf cs-etm: Refactor initialisation of kernel start address
perf cs-etm: Split setup and timestamp search functions
perf cs-etm: Only setup queues when they are modified
perf cs-etm: Suppress printing when resetting decoder
perf cs-etm: Use existing decoder instead of resetting it
perf cs-etm: Pass unformatted flag to decoder
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 10 +-
tools/perf/util/cs-etm.c | 174 ++++++++----------
2 files changed, 84 insertions(+), 100 deletions(-)
--
2.28.0
Change applies to perf/core (45237f9898fc)
Changes since v6:
* Fix for snapshot mode where buffers are wrapped. This fix was done by clamping the aux record
size to the size of the buffer (see comment).
* Added an extra debugging printout.
* Typo/formatting fixes.
* Add the change for --dump-raw-trace as a second commit. I planned to do this later, but have now
finished it so I'll submit it at the same time.
* Did some more thorough testing around the different snapshot scenarios.
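The clamping mentioned in the first item can be sketched as below. This is an
assumption about the shape of the fix, not a copy of it: the names are
hypothetical, and keeping the end of the record fixed while trimming the start
is a choice made for this sketch so that the newest data survives.

```c
#include <stdint.h>

/* Range of AUX data to decode for one record (hypothetical type). */
struct sketch_span {
	uint64_t start;
	uint64_t size;
};

/*
 * If a record claims more data than the buffer can hold (a wrapped
 * snapshot), clamp its size to the buffer size, keeping the end fixed.
 */
static struct sketch_span sketch_clamp_aux(uint64_t aux_offset,
					   uint64_t aux_size,
					   uint64_t buf_size)
{
	struct sketch_span s;
	uint64_t end = aux_offset + aux_size;

	if (aux_size > buf_size)
		aux_size = buf_size;	/* older data was overwritten */

	s.start = end - aux_size;
	s.size = aux_size;
	return s;
}
```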
Decoding snapshot files with duplicate data is improved by this patchset because of the reason
mentioned at the end of the testing section. Coincidentally, the same issue is also fixed in
"[PATCH v1 0/3] coresight: Fix for snapshot mode" but by not saving duplicates, rather than not
decoding them.
James Clark (2):
perf cs-etm: Split Coresight decode by aux records
perf cs-etm: Split --dump-raw-trace by AUX records
tools/perf/util/cs-etm.c | 188 ++++++++++++++++++++++++++++++++++++++-
1 file changed, 185 insertions(+), 3 deletions(-)
--
2.28.0
This is a series of fixes addressing the issues in the way we handle
- Self-Hosted trace filter control register for ETM/ETE
- AUX buffer and event handling of TRBE at overflow.
The use of TRUNCATED flag on an IRQ for the TRBE driver is
something that needs to be re-examined. Please see Patch 3 for
more details.
Suzuki K Poulose (5):
coresight: etm4x: Save restore TRFCR_EL1
coresight: etm4x: Use Trace Filtering controls dynamically
coresight: trbe: Keep TRBE disabled on overflow irq
coresight: trbe: Move irq_work queue processing
coresight: trbe: Prohibit tracing while handling an IRQ
.../coresight/coresight-etm4x-core.c | 109 ++++++++++++++----
drivers/hwtracing/coresight/coresight-etm4x.h | 7 +-
drivers/hwtracing/coresight/coresight-trbe.c | 91 ++++++++++-----
3 files changed, 149 insertions(+), 58 deletions(-)
--
2.24.1