CoreSight April 2025

coresight@lists.linaro.org

10 participants
55 discussions

Re: [PATCH v5 1/2] coresight: catu: Introduce refcount and spinlock for enabling/disabling

by Suzuki K Poulose

On 30/04/2025 00:12, Yabin Cui wrote: > When tracing ETM data on multiple CPUs concurrently via the > perf interface, the CATU device is shared across different CPU > paths. This can lead to race conditions when multiple CPUs attempt > to enable or disable the CATU device simultaneously. > > To address these race conditions, this patch introduces the > following changes: > > 1. The enable and disable operations for the CATU device are not > reentrant. Therefore, a spinlock is added to ensure that only > one CPU can enable or disable a given CATU device at any point > in time. > > 2. A reference counter is used to manage the enable/disable state > of the CATU device. The device is enabled when the first CPU > requires it and is only disabled when the last CPU finishes > using it. This ensures the device remains active as long as at > least one CPU needs it. > > Signed-off-by: Yabin Cui <yabinc(a)google.com> > Reviewed-by: James Clark <james.clark(a)linaro.org> > Reviewed-by: Leo Yan <leo.yan(a)arm.com> Looks good to me, I will add : Fixes: fcacb5c154ba ("coresight: Introduce support for Coresight Address Translation Unit") Suzuki > --- > drivers/hwtracing/coresight/coresight-catu.c | 25 +++++++++++++------- > drivers/hwtracing/coresight/coresight-catu.h | 1 + > 2 files changed, 18 insertions(+), 8 deletions(-) > > diff --git a/drivers/hwtracing/coresight/coresight-catu.c b/drivers/hwtracing/coresight/coresight-catu.c > index 96cb48b140af..d4e2e175e077 100644 > --- a/drivers/hwtracing/coresight/coresight-catu.c > +++ b/drivers/hwtracing/coresight/coresight-catu.c > @@ -458,12 +458,17 @@ static int catu_enable_hw(struct catu_drvdata *drvdata, enum cs_mode cs_mode, > static int catu_enable(struct coresight_device *csdev, enum cs_mode mode, > void *data) > { > - int rc; > + int rc = 0; > struct catu_drvdata *catu_drvdata = csdev_to_catu_drvdata(csdev); > > - CS_UNLOCK(catu_drvdata->base); > - rc = catu_enable_hw(catu_drvdata, mode, data); > - CS_LOCK(catu_drvdata->base); > + guard(raw_spinlock_irqsave)(&catu_drvdata->spinlock); > + if (csdev->refcnt == 0) { > + CS_UNLOCK(catu_drvdata->base); > + rc = catu_enable_hw(catu_drvdata, mode, data); > + CS_LOCK(catu_drvdata->base); > + } > + if (!rc) > + csdev->refcnt++; > return rc; > } > > @@ -486,12 +491,15 @@ static int catu_disable_hw(struct catu_drvdata *drvdata) > > static int catu_disable(struct coresight_device *csdev, void *__unused) > { > - int rc; > + int rc = 0; > struct catu_drvdata *catu_drvdata = csdev_to_catu_drvdata(csdev); > > - CS_UNLOCK(catu_drvdata->base); > - rc = catu_disable_hw(catu_drvdata); > - CS_LOCK(catu_drvdata->base); > + guard(raw_spinlock_irqsave)(&catu_drvdata->spinlock); > + if (--csdev->refcnt == 0) { > + CS_UNLOCK(catu_drvdata->base); > + rc = catu_disable_hw(catu_drvdata); > + CS_LOCK(catu_drvdata->base); > + } > return rc; > } > > @@ -550,6 +558,7 @@ static int __catu_probe(struct device *dev, struct resource *res) > dev->platform_data = pdata; > > drvdata->base = base; > + raw_spin_lock_init(&drvdata->spinlock); > catu_desc.access = CSDEV_ACCESS_IOMEM(base); > catu_desc.pdata = pdata; > catu_desc.dev = dev; > diff --git a/drivers/hwtracing/coresight/coresight-catu.h b/drivers/hwtracing/coresight/coresight-catu.h > index 141feac1c14b..755776cd19c5 100644 > --- a/drivers/hwtracing/coresight/coresight-catu.h > +++ b/drivers/hwtracing/coresight/coresight-catu.h > @@ -65,6 +65,7 @@ struct catu_drvdata { > void __iomem *base; > struct coresight_device *csdev; > int irq; > + raw_spinlock_t spinlock; > }; > > #define CATU_REG32(name, offset) \

2 months

Re: [PATCH v5 2/2] coresight: core: Disable helpers for devices that fail to enable

by Leo Yan

On Tue, Apr 29, 2025 at 04:13:00PM -0700, Yabin Cui wrote: > When enabling a SINK or LINK type coresight device fails, the > associated helpers should be disabled. > > Signed-off-by: Yabin Cui <yabinc(a)google.com> > Suggested-by: Suzuki K Poulose <suzuki.poulose(a)arm.com> > Reviewed-by: James Clark <james.clark(a)linaro.org> > Reviewed-by: Mike Leach <mike.leach(a)linaro.org> Reviewed-by: Leo Yan <leo.yan(a)arm.com> > --- > drivers/hwtracing/coresight/coresight-core.c | 15 ++++++++++----- > 1 file changed, 10 insertions(+), 5 deletions(-) > > diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c > index dabec7073aed..d3523f0262af 100644 > --- a/drivers/hwtracing/coresight/coresight-core.c > +++ b/drivers/hwtracing/coresight/coresight-core.c > @@ -465,7 +465,7 @@ int coresight_enable_path(struct coresight_path *path, enum cs_mode mode, > /* Enable all helpers adjacent to the path first */ > ret = coresight_enable_helpers(csdev, mode, path); > if (ret) > - goto err; > + goto err_disable_path; > /* > * ETF devices are tricky... They can be a link or a sink, > * depending on how they are configured. If an ETF has been > @@ -486,8 +486,10 @@ int coresight_enable_path(struct coresight_path *path, enum cs_mode mode, > * that need disabling. Disabling the path here > * would mean we could disrupt an existing session. > */ > - if (ret) > + if (ret) { > + coresight_disable_helpers(csdev, path); > goto out; > + } > break; > case CORESIGHT_DEV_TYPE_SOURCE: > /* sources are enabled from either sysFS or Perf */ > @@ -497,16 +499,19 @@ int coresight_enable_path(struct coresight_path *path, enum cs_mode mode, > child = list_next_entry(nd, link)->csdev; > ret = coresight_enable_link(csdev, parent, child, source); > if (ret) > - goto err; > + goto err_disable_helpers; > break; > default: > - goto err; > + ret = -EINVAL; > + goto err_disable_helpers; > } > } > > out: > return ret; > -err: > +err_disable_helpers: > + coresight_disable_helpers(csdev, path); > +err_disable_path: > coresight_disable_path_from(path, nd); > goto out; > } > -- > 2.49.0.967.g6a0df3ecc3-goog >

2 months

[PATCH] coresight: Fixes device's owner field for registered using coresight_init_driver()

by Junhao He

The coresight_init_driver() of the coresight-core module is called from the sub coresgiht device (such as tmc/stm/funnle/...) module. It calls amba_driver_register() and Platform_driver_register(), which are macro functions that use the coresight-core's module to initialize the caller's owner field. Therefore, when the sub coresight device calls coresight_init_driver(), an incorrect THIS_MODULE value is captured. The sub coesgiht modules can be removed while their callbacks are running, resulting in a general protection failure. Add module parameter to coresight_init_driver() so can be called with the module of the callback. Fixes: 075b7cd7ad7d ("coresight: Add helpers registering/removing both AMBA and platform drivers") Signed-off-by: Junhao He <hejunhao3(a)huawei.com> --- drivers/hwtracing/coresight/coresight-catu.c | 2 +- drivers/hwtracing/coresight/coresight-core.c | 6 +++--- drivers/hwtracing/coresight/coresight-cpu-debug.c | 3 ++- drivers/hwtracing/coresight/coresight-funnel.c | 3 ++- drivers/hwtracing/coresight/coresight-replicator.c | 3 ++- drivers/hwtracing/coresight/coresight-stm.c | 2 +- drivers/hwtracing/coresight/coresight-tmc-core.c | 2 +- drivers/hwtracing/coresight/coresight-tpiu.c | 2 +- include/linux/coresight.h | 2 +- 9 files changed, 14 insertions(+), 11 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-catu.c b/drivers/hwtracing/coresight/coresight-catu.c index bfea880d6dfb..337668f9cfd4 100644 --- a/drivers/hwtracing/coresight/coresight-catu.c +++ b/drivers/hwtracing/coresight/coresight-catu.c @@ -702,7 +702,7 @@ static int __init catu_init(void) { int ret; - ret = coresight_init_driver("catu", &catu_driver, &catu_platform_driver); + ret = coresight_init_driver("catu", &catu_driver, &catu_platform_driver, THIS_MODULE); tmc_etr_set_catu_ops(&etr_catu_buf_ops); return ret; } diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index 9fc6f6b863e0..c546a417836c 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -1399,17 +1399,17 @@ module_init(coresight_init); module_exit(coresight_exit); int coresight_init_driver(const char *drv, struct amba_driver *amba_drv, - struct platform_driver *pdev_drv) + struct platform_driver *pdev_drv, struct module *owner) { int ret; - ret = amba_driver_register(amba_drv); + ret = __amba_driver_register(amba_drv, owner); if (ret) { pr_err("%s: error registering AMBA driver\n", drv); return ret; } - ret = platform_driver_register(pdev_drv); + ret = __platform_driver_register(pdev_drv, owner); if (!ret) return 0; diff --git a/drivers/hwtracing/coresight/coresight-cpu-debug.c b/drivers/hwtracing/coresight/coresight-cpu-debug.c index 75962dae9aa1..cc599c5ef4b2 100644 --- a/drivers/hwtracing/coresight/coresight-cpu-debug.c +++ b/drivers/hwtracing/coresight/coresight-cpu-debug.c @@ -774,7 +774,8 @@ static struct platform_driver debug_platform_driver = { static int __init debug_init(void) { - return coresight_init_driver("debug", &debug_driver, &debug_platform_driver); + return coresight_init_driver("debug", &debug_driver, &debug_platform_driver, + THIS_MODULE); } static void __exit debug_exit(void) diff --git a/drivers/hwtracing/coresight/coresight-funnel.c b/drivers/hwtracing/coresight/coresight-funnel.c index 5a819c8970fb..8f451b051ddc 100644 --- a/drivers/hwtracing/coresight/coresight-funnel.c +++ b/drivers/hwtracing/coresight/coresight-funnel.c @@ -433,7 +433,8 @@ static struct amba_driver dynamic_funnel_driver = { static int __init funnel_init(void) { - return coresight_init_driver("funnel", &dynamic_funnel_driver, &funnel_driver); + return coresight_init_driver("funnel", &dynamic_funnel_driver, &funnel_driver, + THIS_MODULE); } static void __exit funnel_exit(void) diff --git a/drivers/hwtracing/coresight/coresight-replicator.c b/drivers/hwtracing/coresight/coresight-replicator.c index 3e55be9c8418..f7607c72857c 100644 --- a/drivers/hwtracing/coresight/coresight-replicator.c +++ b/drivers/hwtracing/coresight/coresight-replicator.c @@ -438,7 +438,8 @@ static struct amba_driver dynamic_replicator_driver = { static int __init replicator_init(void) { - return coresight_init_driver("replicator", &dynamic_replicator_driver, &replicator_driver); + return coresight_init_driver("replicator", &dynamic_replicator_driver, &replicator_driver, + THIS_MODULE); } static void __exit replicator_exit(void) diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c index 117dbb484543..403eea8f95d4 100644 --- a/drivers/hwtracing/coresight/coresight-stm.c +++ b/drivers/hwtracing/coresight/coresight-stm.c @@ -1046,7 +1046,7 @@ static struct platform_driver stm_platform_driver = { static int __init stm_init(void) { - return coresight_init_driver("stm", &stm_driver, &stm_platform_driver); + return coresight_init_driver("stm", &stm_driver, &stm_platform_driver, THIS_MODULE); } static void __exit stm_exit(void) diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c index b54562f392f3..e31e36635394 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-core.c +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c @@ -742,7 +742,7 @@ static struct platform_driver tmc_platform_driver = { static int __init tmc_init(void) { - return coresight_init_driver("tmc", &tmc_driver, &tmc_platform_driver); + return coresight_init_driver("tmc", &tmc_driver, &tmc_platform_driver, THIS_MODULE); } static void __exit tmc_exit(void) diff --git a/drivers/hwtracing/coresight/coresight-tpiu.c b/drivers/hwtracing/coresight/coresight-tpiu.c index b048e146fbb1..f9ecd05cbe5c 100644 --- a/drivers/hwtracing/coresight/coresight-tpiu.c +++ b/drivers/hwtracing/coresight/coresight-tpiu.c @@ -318,7 +318,7 @@ static struct platform_driver tpiu_platform_driver = { static int __init tpiu_init(void) { - return coresight_init_driver("tpiu", &tpiu_driver, &tpiu_platform_driver); + return coresight_init_driver("tpiu", &tpiu_driver, &tpiu_platform_driver, THIS_MODULE); } static void __exit tpiu_exit(void) diff --git a/include/linux/coresight.h b/include/linux/coresight.h index f09ace92176e..e6c26952ddc2 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -660,7 +660,7 @@ coresight_find_output_type(struct coresight_platform_data *pdata, union coresight_dev_subtype subtype); int coresight_init_driver(const char *drv, struct amba_driver *amba_drv, - struct platform_driver *pdev_drv); + struct platform_driver *pdev_drv, struct module *owner); void coresight_remove_driver(struct amba_driver *amba_drv, struct platform_driver *pdev_drv); -- 2.33.0

2 months

Re: [PATCH v3 00/11] (subset) ARM: qcom: fix APQ8064 schema warnings

by Suzuki K Poulose

On Fri, 25 Apr 2025 20:47:00 +0300, Dmitry Baryshkov wrote: > Rob's bot has reported [1] several warnings for Nexus 4 submisson, > however none of those warnings are specific to that device. Fix all > those warnings for all APQ8064 platforms by extending existing schemas, > adding missing schemas and making APQ8064 DT follow all the schema > files. > > [1]: https://lore.kernel.org/linux-arm-msm/174221818190.3957236.3364090534153729… > > [...] Applied, thanks! [06/11] dt-bindings: arm: arm,coresight-static-replicator: add optional clocks https://git.kernel.org/coresight/c/13e3a882 Best regards, -- Suzuki K Poulose <suzuki.poulose(a)arm.com>

2 months

[PATCH v2 0/2] Add Qualcomm extended CTI support

by Mao Jinlong

The QCOM extended CTI is a heavily parameterized version of ARM’s CSCTI. It allows a debugger to send to trigger events to a processor or to send a trigger event to one or more processors when a trigger event occurs on another processor on the same SoC, or even between SoCs. QCOM extended CTI supports up to 128 triggers. And some of the register offsets are changed. The commands to configure CTI triggers are the same as ARM's CTI. Changes in V2: 1. Add enum for compatible items. 2. Move offset arraies to coresight-cti-core Mao Jinlong (2): dt-bindings: arm: Add Qualcomm extended CTI coresight: cti: Add Qualcomm extended CTI support .../bindings/arm/arm,coresight-cti.yaml | 4 +- .../hwtracing/coresight/coresight-cti-core.c | 127 ++++++++++++++---- .../coresight/coresight-cti-platform.c | 16 ++- .../hwtracing/coresight/coresight-cti-sysfs.c | 124 +++++++++++++---- drivers/hwtracing/coresight/coresight-cti.h | 72 +++++----- 5 files changed, 243 insertions(+), 100 deletions(-) -- 2.25.1

2 months

Re: [PATCH v4] coresight: prevent deactivate active config while enabling the config

by Leo Yan

Hi Levi, On Wed, Mar 26, 2025 at 07:34:25AM +0000, Yeoreum Yun wrote: [...] > > > --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c > > > +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c > > > @@ -1020,6 +1020,9 @@ static void etm4_disable_sysfs(struct coresight_device *csdev) > > > smp_call_function_single(drvdata->cpu, etm4_disable_hw, drvdata, 1); > > > > > > raw_spin_unlock(&drvdata->spinlock); > > > + > > > + cscfg_csdev_disable_active_config(csdev); > > > + > > > > In general, we need to split changes into several patches if each > > addresses a different issue. From my understanding, the change above is > > to fix missing to disable config when disable Sysfs mode. > > > > If so, could we use a seperate patch for this change? > > > > It's not a differnt issue. Without this line, the active count wouldn't > decrese and it raise another issue -- unloadable moudle for active_cnt :( > So I think it should be included in this patch. I read the code again and concluded the change above is not related to locking and would be a separate issue: when we close a Sysfs session, we need to disable a config on a CoreSight device. Could you clarify what is meaning "unloadable moudle for active_cnt"? I only saw "cscfg_mgr->sys_active_cnt" is used for module unloading, but have no clue why "active_cnt" impacts module unloading. > > > cpus_read_unlock(); > > > > > > /* > > > diff --git a/drivers/hwtracing/coresight/coresight-syscfg.c b/drivers/hwtracing/coresight/coresight-syscfg.c > > > index a70c1454b410..6d8c212ad434 100644 > > > --- a/drivers/hwtracing/coresight/coresight-syscfg.c > > > +++ b/drivers/hwtracing/coresight/coresight-syscfg.c > > > @@ -391,14 +391,17 @@ static void cscfg_owner_put(struct cscfg_load_owner_info *owner_info) > > > static void cscfg_remove_owned_csdev_configs(struct coresight_device *csdev, void *load_owner) > > > { > > > struct cscfg_config_csdev *config_csdev, *tmp; > > > + unsigned long flags; > > > > > > if (list_empty(&csdev->config_csdev_list)) > > > return; > > > > > > + raw_spin_lock_irqsave(&csdev->cscfg_csdev_lock, flags); > > > > I think we should use spinlock to guard the condition checking > > list_empty(). > > > > Here the race condition is the 'config_csdev_list' list and > > configurations on the list. For atomicity, we should use lock to > > protect any operations on the list (read, add, delete, etc). > > Interesting... Would you let me know which race it is? > here to check list_empty(), it already guarded with "cscfg_mutex". Thanks for pointing out this. I read the code and understood that in some scenarios the list is protected by the mutex "cscfg_mutex". I would argue for using locking, we need to make clear for two thigns: - What is the race condition; - What locking is used to protect the race condition. For current case, a CoreSight device has a config list, the race condition is the config list will be manipulated by multiple places (e.g., for module loading / unloading, opening or closing a perf or SysFS session). So a spinlock is used to to protect the config list. "cscfg_mutex" is a high level lock, my understanding is to protect the high level operations from the Sysfs knobs, though sometimes it can mitigate the race condition on configuration list mentioned above, but the spinlock is the locking mechanism for the low level's config list on a CoreSight device. > However list_del() is special case because iterating config_csdev_list > can be done without cscfg_mutex -- see > cscfg_csdev_enable_active_config(). > This gurad with spinlock purpose to guard race unloading and > get the config in cscfg_csdev_enable_active_config() > (Please see my response below...). > > the emptiness of config_csdev_list is guarded with cscfg_mutex. > therefore, It seems enough to guard iterating part with spinlock :) Mixed using cscfg_mutex and spinlock get code complexity, and I am a bit concerned this is not best practice. At a glance, I would say 'cscfg_mutex' is purposed to protect the global 'cscfg_mgr', per CoreSight device's config list should be protected by 'cscfg_csdev_lock'. > > A side topic, as here it adds locks for protecting 'config_csdev_list', > > I am wandering why we do not do the same thing for > > 'feature_csdev_list' (See cscfg_remove_owned_csdev_features() and > > cscfg_get_feat_csdev()). > > In case of feature, It's okay since it couldn't be accessed when it > gets failed to get related config. I looked at cscfg_load_feat_csdev(), it uses 'cscfg_csdev_lock' to protect the feature list. This is why I thought the feature list also need to be protected by the lock. Now it is only partially protected. > When we see cscfg_csdev_enable_active_config(), the config could be > accessed without cscfg_mutex lock. so the config need to be guarded with > spin_lock otherwise it could be acquired while unload module > (after get active_cnt in search logic cscfg_csdev_enable_active_config() > and other running unloading process) To make things more clear, I have a questions. 'cscfg_mgr->sys_active_cnt' is used in the cscfg_unload_config_sets() function to decide if can unload module, if any configuration is active, why this variable cannot prevent unloading module? Sorry for late replying. Leo > But feature list is depends on config, If config is safe from > load/unload, this is not an issue so we don't need it. > > Thanks for your review!

2 months

Re: [PATCH] coresight: trbe: Save/restore state across CPU low power state

by Mike Leach

Hi On Fri, 25 Apr 2025 at 23:05, Yabin Cui <yabinc(a)google.com> wrote: > > On Thu, Apr 24, 2025 at 9:16 PM Anshuman Khandual > <anshuman.khandual(a)arm.com> wrote: > > > > On 4/24/25 04:30, Yabin Cui wrote: > > > Similar to ETE, TRBE may lose its context when a CPU enters low > > > power state. To make things worse, if ETE state is restored without > > > restoring TRBE state, it can lead to a stuck CPU due to an enabled > > > source device with no enabled sink devices. > > > > Could you please provide some more details about when the cpu gets > > stuck e.g dmesg, traces etc. Also consider adding those details in > > the commit message as well to establish the problem, this patch is > > trying to address. > > This is already the best I know. When experimenting TRBE locally > (on Pixel 9), if I don't save TRBE state across low power state, when > recording system wide ETM data using TRBE on a CPU core. In a > few seconds, the CPU state becomes unknown (no message in dmesg). > Since the core may hold locks needed by other cores, it soon locks > other cores and causes a watchdog reset. I found the coresight driver > always carefully enables sink before source, and disables sink after > source. I guess there is some risk in not doing so, like the CPU hang > here. Maybe you know why? > > > > > > > > > > This patch introduces support for "arm,coresight-loses-context-with-cpu" > > > in the TRBE driver. When present, TRBE registers are saved before > > > and restored after CPU low power state. To prevent CPU hangs, TRBE > > > state is always saved after ETE state and restored after ETE state. > > > > The save and restore order here is > > > > 1. Save ETE > > 2. Save TRBE > > 3. Restore ETE > > 4. Restore TRBE > > > > > > > > Signed-off-by: Yabin Cui <yabinc(a)google.com> > > > --- > > > .../coresight/coresight-etm4x-core.c | 13 ++++- > > > drivers/hwtracing/coresight/coresight-trbe.c | 53 +++++++++++++++++++ > > > include/linux/coresight.h | 6 +++ > > > 3 files changed, 71 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c > > > index e5972f16abff..1bbaa1249206 100644 > > > --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c > > > +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c > > > @@ -1863,6 +1863,7 @@ static int __etm4_cpu_save(struct etmv4_drvdata *drvdata) > > > static int etm4_cpu_save(struct etmv4_drvdata *drvdata) > > > { > > > int ret = 0; > > > + struct coresight_device *sink; > > > > > > /* Save the TRFCR irrespective of whether the ETM is ON */ > > > if (drvdata->trfcr) > > > @@ -1871,8 +1872,14 @@ static int etm4_cpu_save(struct etmv4_drvdata *drvdata) > > > * Save and restore the ETM Trace registers only if > > > * the ETM is active. > > > */ > > > - if (coresight_get_mode(drvdata->csdev) && drvdata->save_state) > > > + if (coresight_get_mode(drvdata->csdev) && drvdata->save_state) { > > > ret = __etm4_cpu_save(drvdata); > > > + if (ret == 0) { > > > + sink = coresight_get_percpu_sink(drvdata->cpu); > > > + if (sink && sink_ops(sink)->percpu_save) > > > + sink_ops(sink)->percpu_save(sink); > > > + } > > > + } > > > return ret; > > > } > > > > > > @@ -1977,6 +1984,10 @@ static void __etm4_cpu_restore(struct etmv4_drvdata *drvdata) > > > > > > static void etm4_cpu_restore(struct etmv4_drvdata *drvdata) > > > { > > > + struct coresight_device *sink = coresight_get_percpu_sink(drvdata->cpu); > > > + > > > + if (sink && sink_ops(sink)->percpu_restore) > > > + sink_ops(sink)->percpu_restore(sink); > > > > TRBE gets restored first which contradicts the order mentioned > > earlier in the commit message ? > > > An error in the commit message, should be: > To prevent CPU hangs, TRBE state is always saved after ETE state and > restored before ETE state. > > > > > > if (drvdata->trfcr) > > > write_trfcr(drvdata->save_trfcr); > > > if (drvdata->state_needs_restore) > > > diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c > > > index fff67aac8418..38bf46951a82 100644 > > > --- a/drivers/hwtracing/coresight/coresight-trbe.c > > > +++ b/drivers/hwtracing/coresight/coresight-trbe.c > > > @@ -115,6 +115,13 @@ static int trbe_errata_cpucaps[] = { > > > */ > > > #define TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES 256 > > > > > > +struct trbe_save_state { > > > + u64 trblimitr; > > > + u64 trbbaser; > > > + u64 trbptr; > > > + u64 trbsr; > > > +}; > > > > This seems to accommodate all required TRBE registers. Although this is > > very intuitive could you please also add a comment above this structure > > explaining the elements like other existing structures in the file ? > > Will do in the v2 patch. > > > > > + > > > /* > > > * struct trbe_cpudata: TRBE instance specific data > > > * @trbe_flag - TRBE dirty/access flag support > > > @@ -123,6 +130,9 @@ static int trbe_errata_cpucaps[] = { > > > * @cpu - CPU this TRBE belongs to. > > > * @mode - Mode of current operation. (perf/disabled) > > > * @drvdata - TRBE specific drvdata > > > + * @state_needs_save - Need to save trace registers when entering cpu idle > > > + * @state_needs_restore - Need to restore trace registers when exiting cpu idle > > > + * @save_state - Saved trace registers > > > * @errata - Bit map for the errata on this TRBE. > > > */ > > > struct trbe_cpudata { > > > @@ -133,6 +143,9 @@ struct trbe_cpudata { > > > enum cs_mode mode; > > > struct trbe_buf *buf; > > > struct trbe_drvdata *drvdata; > > > + bool state_needs_save; > > > > This tracks whether coresight_loses_context_with_cpu() is supported > > on the CPU or not. > > > > > + bool state_needs_restore; > > > > This tracks whether the state has been saved earlier and hence can > > be restored later on when required. > > Will update the comment in the v2 patch. > > > > > > + struct trbe_save_state save_state; > > > DECLARE_BITMAP(errata, TRBE_ERRATA_MAX); > > > }; > > > > > > @@ -1187,12 +1200,49 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev) > > > return IRQ_HANDLED; > > > } > > > > > > +static void arm_trbe_cpu_save(struct coresight_device *csdev) > > > +{ > > > + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev); > > > + struct trbe_save_state *state = &cpudata->save_state; > > > > Please move the above statement after the following conditional > > block. Because struct trbe_save_state is not going to be required > > if arm_trbe_cpu_save() returns prematurely from here. > > > Will do in the v2 patch. > > > + > > > + if (cpudata->mode == CS_MODE_DISABLED || !cpudata->state_needs_save) > > > + return;> + > > > + state->trbbaser = read_sysreg_s(SYS_TRBBASER_EL1); > > > + state->trbptr = read_sysreg_s(SYS_TRBPTR_EL1); > > > + state->trblimitr = read_sysreg_s(SYS_TRBLIMITR_EL1); > > > + state->trbsr = read_sysreg_s(SYS_TRBSR_EL1); > > > + cpudata->state_needs_restore = true; > > > > This looks right. > > > > > +} > > > + > > > +static void arm_trbe_cpu_restore(struct coresight_device *csdev) > > > +{ > > > + struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev); > > > + struct trbe_save_state *state = &cpudata->save_state; > > Please move this assignment inside the block where these registers > > actually get restored after all checks are done. > > > Will do in the v2 patch. > > > + > > > + if (cpudata->state_needs_restore) { > > > + /* > > > + * To avoid disruption of normal tracing, restore trace > > > + * registers only when TRBE lost power (TRBLIMITR == 0). > > > + */ > > > > Although this sounds like a reasonable condition, but what happens > > when TRBE restoration is skipped ? > > > There are cases when a CPU fails to enter a low power state or only enters > a shallow low power state. So TRBE state may not be lost when calling > arm_trbe_cpu_restore(). In this case, if we write the TRBE registers, probably > we lose ETM data written between save and restore. This may not matter, > but could there be a race condition between this restore function and TRBE > interrupt handler (which could also write TRBE registers)? I can't think > this up very well. So I only restore registers when state is lost. If > the state isn't > lost, TRBE should stay in enabled mode, and continue receiving data from ETE. > > > > + if (read_sysreg_s(SYS_TRBLIMITR_EL1) == 0) { > > > > state = &cpudata->save_state > > > > > + write_sysreg_s(state->trbbaser, SYS_TRBBASER_EL1); > > > + write_sysreg_s(state->trbptr, SYS_TRBPTR_EL1); > > > + write_sysreg_s(state->trbsr, SYS_TRBSR_EL1); > > > + set_trbe_enabled(cpudata, state->trblimitr); > > > + } > > > + cpudata->state_needs_restore = false; > > > > That means earlier captured state is dropped without a restoration. > > > > > + } > > > +} > > > + > > > static const struct coresight_ops_sink arm_trbe_sink_ops = { > > > .enable = arm_trbe_enable, > > > .disable = arm_trbe_disable, > > > .alloc_buffer = arm_trbe_alloc_buffer, > > > .free_buffer = arm_trbe_free_buffer, > > > .update_buffer = arm_trbe_update_buffer, > > > + .percpu_save = arm_trbe_cpu_save, > > > + .percpu_restore = arm_trbe_cpu_restore, > > > > Why this percpu_* prefix ? > > > Added percpu_ prefix because it's only for percpu_sink > (coresight_get_percpu_sink). > Please let me know if you have a better name. > > > }; > > > > > > static const struct coresight_ops arm_trbe_cs_ops = { > > > @@ -1358,6 +1408,9 @@ static void arm_trbe_probe_cpu(void *info) > > > cpudata->trbe_flag = get_trbe_flag_update(trbidr); > > > cpudata->cpu = cpu; > > > cpudata->drvdata = drvdata; > > > + cpudata->state_needs_save = coresight_loses_context_with_cpu( > > > + &drvdata->pdev->dev); > > > + cpudata->state_needs_restore = false; > > > > The init here looks good. > > > > > return; > > > cpu_clear: > > > cpumask_clear_cpu(cpu, &drvdata->supported_cpus); > > > diff --git a/include/linux/coresight.h b/include/linux/coresight.h > > > index d79a242b271d..fec375d02535 100644 > > > --- a/include/linux/coresight.h > > > +++ b/include/linux/coresight.h > > > @@ -362,6 +362,10 @@ enum cs_mode { > > > * @alloc_buffer: initialises perf's ring buffer for trace collection. > > > * @free_buffer: release memory allocated in @get_config. > > > * @update_buffer: update buffer pointers after a trace session. > > > + * @percpu_save: saves state when CPU enters idle state. > > > + * Only set for percpu sink. > > > + * @percpu_restore: restores state when CPU exits idle state. > > > + * only set for percpu sink. > > > */ > > > struct coresight_ops_sink { > > > int (*enable)(struct coresight_device *csdev, enum cs_mode mode, > > > @@ -374,6 +378,8 @@ struct coresight_ops_sink { > > > unsigned long (*update_buffer)(struct coresight_device *csdev, > > > struct perf_output_handle *handle, > > > void *sink_config); > > > + void (*percpu_save)(struct coresight_device *csdev); > > > + void (*percpu_restore)(struct coresight_device *csdev); > > > > Again - why this percpu_* prefix ? > > > > > }; > > > > > > /** I do not think this is the best approach. The TRBE driver has its own power management registration functions, so is it not possible for the save and restore should be handled there, through a PM notifier, just as the ETM/ETE is? Drop the unnecessary DT entry - TRBE is a per cpu architectural device - if a TRBE is present, we know it will power down with the PE. The CoreSight architecture permits an ETE to drive trace to an external sink - so the TRBE might be present but unused, therefore hooking into a source driver that may not be driving trace into the device does not seem wise.. The TRBE PM can follow the model of the ETE / ETMv4 and save and restore if currently in use. This approach will isolate the changes to the TRBE driver where they should be. Regards Mike -- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK

2 months

Re: [PATCH v1 1/2] dt-bindings: arm: Add Qualcomm extended CTI

by Jinlong Mao

On 2025/4/28 15:39, Krzysztof Kozlowski wrote: > On 28/04/2025 09:31, Jinlong Mao wrote: >> >> >> On 2024/9/3 20:42, Krzysztof Kozlowski wrote: >>> On 03/09/2024 14:18, Mao Jinlong wrote: >>>> Add Qualcomm extended CTI support in CTI binding file. Qualcomm >>>> extended CTI supports up to 128 triggers. >>>> >>>> Signed-off-by: Mao Jinlong <quic_jinlmao(a)quicinc.com> >>>> --- >>>> .../devicetree/bindings/arm/arm,coresight-cti.yaml | 14 ++++++++++++++ >>>> 1 file changed, 14 insertions(+) >>>> >>>> diff --git a/Documentation/devicetree/bindings/arm/arm,coresight-cti.yaml b/Documentation/devicetree/bindings/arm/arm,coresight-cti.yaml >>>> index 6a73eaa66a42..141efba7c697 100644 >>>> --- a/Documentation/devicetree/bindings/arm/arm,coresight-cti.yaml >>>> +++ b/Documentation/devicetree/bindings/arm/arm,coresight-cti.yaml >>>> @@ -87,6 +87,10 @@ properties: >>>> - const: arm,coresight-cti-v8-arch >>>> - const: arm,coresight-cti >>>> - const: arm,primecell >>>> + - items: >>>> + - const: qcom,coresight-cti-extended >>> >>> That's just enum in previous entry/list. >> Sorry for the late response. This is a new CTI type. Need the three >> items in compatible at the same time, just like other kind of CTIs. > > > Comment stays valid, you did not address it at all. Hi Krzysztof, Do you mean we only need const: qcom,coresight-cti-extended here ? No need const: arm,coresight-cti and const: arm,primecell as they are in previous entry/list, right ? > >> >>> >>>> + - const: arm,coresight-cti >>>> + - const: arm,primecell >>>> >>>> reg: >>>> maxItems: 1 >>>> @@ -254,6 +258,16 @@ examples: >>>> clocks = <&soc_smc50mhz>; >>>> clock-names = "apb_pclk"; >>>> }; >>>> + # minimum extended CTI definition. >>>> + - | >>> >>> No need for new example. No differences here. >> This is a new type CTI. > > > Comment stays valid, you did not address it at all. > > Best regards, > Krzysztof

2 months, 1 week

Re: [PATCH 1/2] perf: Allow non-contiguous AUX buffer pages via PMU capability

by James Clark

On 23/04/2025 8:52 pm, Yabin Cui wrote: > On Tue, Apr 22, 2025 at 7:10 AM Leo Yan <leo.yan(a)arm.com> wrote: >> >> On Tue, Apr 22, 2025 at 02:49:54PM +0200, Ingo Molnar wrote: >> >> [...] >> >>>> Hi Yabin, >>>> >>>> I was wondering if this is just the opposite of >>>> PERF_PMU_CAP_AUX_NO_SG, and that order 0 should be used by default >>>> for all devices to solve the issue you describe. Because we already >>>> have PERF_PMU_CAP_AUX_NO_SG for devices that need contiguous pages. >>>> Then I found commit 5768402fd9c6 ("perf/ring_buffer: Use high order >>>> allocations for AUX buffers optimistically") that explains that the >>>> current allocation strategy is an optimization. >>>> >>>> Your change seems to decide that for certain devices we want to >>>> optimize for fragmentation rather than performance. If these are >>>> rarely used features specifically when looking at performance should >>>> we not continue to optimize for performance? Or at least make it user >>>> configurable? >>> >>> So there seems to be 3 categories: >>> >>> - 1) Must have physically contiguous AUX buffers, it's a hardware ABI. >>> (PERF_PMU_CAP_AUX_NO_SG for Intel BTS and PT.) >>> >>> - 2) Would be nice to have continguous AUX buffers, for a bit more >>> performance. >>> >>> - 3) Doesn't really care. >>> >>> So we do have #1, and it appears Yabin's usecase is #3? > > Yes, in my usecase, I care much more about MM-friendly than a little potential > performance when using PMU. It's not a rarely used feature. On Android, we > collect ETM data periodically on internal user devices for AutoFDO optimization > (for both userspace libraries and the kernel). Allocating a large > chunk of contiguous > AUX pages (4M for each CPU) periodically is almost unbearable. The kernel may > need to kill many processes to fulfill the request. It affects user > experience even > after using PMU. > > I am totally fine to reuse PERF_PMU_CAP_AUX_NO_SG. If PMUs don't want to > sacrifice performance for MM-friendly, why support scatter gather mode? If there > are strong performance reasons to allocate contiguous AUX pages in > scatter gather > mode, I hope max_order is configurable in userspace. > > Currently, max_order is affected by aux_watermark. But aux_watermark > also affects > how frequently the PMU overflows AUX buffer and notifies userspace. > It's not ideal > to set aux_watermark to 1 page size. So if we want to make max_order user > configurable, maybe we can add a one bit field in perf_event_attr? > >> >> In Yabin's case, the AUX buffer work as a bounce buffer. The hardware >> trace data is copied by a driver from low level's contiguous buffer to >> the AUX buffer. >> >> In this case we cannot benefit much from continguous AUX buffers. >> >> Thanks, >> Leo Hi Yabin, So after doing some testing it looks like there is 0 difference in overhead for max_order=0 vs ensuring the buffer is one contiguous allocation for Arm SPE, and TRBE would be exactly the same. This makes sense because we're vmapping pages individually anyway regardless of the base allocation. Seems like the performance optimization of the optimistically large mappings is only for devices that require extra buffer management stuff other than normal virtual memory. Can we add a new capability PERF_PMU_CAP_AUX_PREFER_LARGE and apply it to Intel PT and BTS? Then the old (before the optimistic large allocs change) max_order=0 behavior becomes the default again, and PREFER_LARGE is just for those two devices. Other and new devices would get the more memory friendly allocations by default, as it's unlikely they'll benefit from anything different. Thanks James

2 months, 1 week

Re: [PATCH v3 06/11] dt-bindings: arm: arm,coresight-static-replicator: add optional clocks

by Suzuki K Poulose

Hi Dmitry On 25/04/2025 18:47, Dmitry Baryshkov wrote: > As most other CoreSight devices the replicator can use either of the > optional clocks. Document those optional clocks in the schema. > Additionally document the one-off case of Zynq-7000 platforms which uses > apb_pclk and two additional debug clocks. What is the plan for merging this series ? I am happy to pick this patch via coresight tree. > > Fixes: 3c15fddf3121 ("dt-bindings: arm: Convert CoreSight bindings to DT schema") > Reviewed-by: Rob Herring (Arm) <robh(a)kernel.org> > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov(a)oss.qualcomm.com> Otherwise, Acked-by: Suzuki K Poulose <suzuki.poulose(a)arm.com> Suzuki > --- > .../bindings/arm/arm,coresight-static-replicator.yaml | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/Documentation/devicetree/bindings/arm/arm,coresight-static-replicator.yaml b/Documentation/devicetree/bindings/arm/arm,coresight-static-replicator.yaml > index a6f793ea03b6c193fc0ff72a45e0249a63a2ba3c..0c1017affbad2f03892b250ad864d9a5dc8d02f1 100644 > --- a/Documentation/devicetree/bindings/arm/arm,coresight-static-replicator.yaml > +++ b/Documentation/devicetree/bindings/arm/arm,coresight-static-replicator.yaml > @@ -30,6 +30,19 @@ properties: > power-domains: > maxItems: 1 > > + clocks: > + minItems: 1 > + maxItems: 3 > + > + clock-names: > + oneOf: > + - items: > + - enum: [apb_pclk, atclk] > + - items: # Zynq-700 > + - const: apb_pclk > + - const: dbg_trc > + - const: dbg_apb > + > in-ports: > $ref: /schemas/graph.yaml#/properties/ports > additionalProperties: false >

2 months, 1 week

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

CoreSight April 2025