This patch series adds support for CoreSight components local to CPU clusters, including funnel, replicator, and TMC, which reside within CPU cluster power domains. These components require special handling due to power domain constraints.
Unlike system-level CoreSight devices, these components share the CPU cluster's power domain. When the cluster enters low-power mode (LPM), their registers become inaccessible. Notably, `pm_runtime_get` alone cannot bring the cluster out of LPM, making standard register access unreliable.
To address this, the series introduces:

- Identifying cluster-bound devices via a new `qcom,cpu-bound-components`
  device tree property.
- Deferred probing: if the associated CPUs are offline during probe,
  initialization is deferred until a CPU hotplug notifier detects a CPU
  coming online.
- Using `smp_call_function_single()` to ensure register accesses
  (initialization, enablement, sysfs reads) always execute on a powered
  CPU within the target cluster.
- Extending the CoreSight link `enable` callback to pass the `cs_mode`,
  allowing drivers to distinguish between sysfs and perf modes and apply
  mode-specific logic.
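To make the shared pattern concrete, here is a minimal sketch of the
cross-CPU register access that the funnel, replicator and TMC patches
below all implement. This is illustrative only, not the driver code
itself; all foo_* names are hypothetical placeholders:

#include <linux/cpumask.h>
#include <linux/io.h>
#include <linux/smp.h>

struct foo_drvdata {
	void __iomem *base;
	struct cpumask *supported_cpus;	/* NULL for system-level devices */
};

struct foo_smp_arg {
	struct foo_drvdata *drvdata;
	u32 offset;
	u32 val;
};

static void foo_read_reg_on_cpu(void *info)
{
	struct foo_smp_arg *arg = info;

	/* Runs via IPI on a CPU inside the cluster, so the domain is on. */
	arg->val = readl_relaxed(arg->drvdata->base + arg->offset);
}

static int foo_read_reg(struct foo_drvdata *drvdata, u32 offset, u32 *val)
{
	struct foo_smp_arg arg = { .drvdata = drvdata, .offset = offset };
	int cpu, ret = -ENODEV;

	if (!drvdata->supported_cpus) {
		/* System-level device: direct access is always safe. */
		*val = readl_relaxed(drvdata->base + offset);
		return 0;
	}

	/* Try the cluster's CPUs until an online one runs the call. */
	for_each_cpu(cpu, drvdata->supported_cpus) {
		ret = smp_call_function_single(cpu, foo_read_reg_on_cpu,
					       &arg, 1);
		if (!ret) {
			*val = arg.val;
			return 0;
		}
	}
	/* All cluster CPUs offline: defer (see the hotplug notifiers). */
	return ret;
}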
Jie Gan (1):
      arm64: dts: qcom: hamoa: Add CoreSight nodes for APSS debug block

Yuanfang Zhang (11):
      dt-bindings: arm: coresight: Add 'qcom,cpu-bound-components' property
      coresight: Pass trace mode to link enable callback
      coresight-funnel: Support CPU cluster funnel initialization
      coresight-funnel: Defer probe when associated CPUs are offline
      coresight-replicator: Support CPU cluster replicator initialization
      coresight-replicator: Defer probe when associated CPUs are offline
      coresight-replicator: Update management interface for CPU-bound devices
      coresight-tmc: Support probe and initialization for CPU cluster TMCs
      coresight-tmc-etf: Refactor enable function for CPU cluster ETF support
      coresight-tmc: Update management interface for CPU-bound TMCs
      coresight-tmc: Defer probe when associated CPUs are offline
Verification:
This series has been verified on sm8750.
Test steps for deferred probe:
1. Limit the system to enable at most 6 CPU cores during boot.
2. echo 1 >/sys/bus/cpu/devices/cpu6/online
3. Check whether ETM6 and ETM7 have been probed.
Test steps for sysfs mode:
echo 1 >/sys/bus/coresight/devices/tmc_etf0/enable_sink
echo 1 >/sys/bus/coresight/devices/etm0/enable_source
echo 1 >/sys/bus/coresight/devices/etm6/enable_source
echo 0 >/sys/bus/coresight/devices/etm0/enable_source
echo 0 >/sys/bus/coresight/devices/etm6/enable_source
echo 0 >/sys/bus/coresight/devices/tmc_etf0/enable_sink
echo 1 >/sys/bus/coresight/devices/tmc_etf1/enable_sink
echo 1 >/sys/bus/coresight/devices/etm0/enable_source
cat /dev/tmc_etf1 >/tmp/etf1.bin
echo 0 >/sys/bus/coresight/devices/etm0/enable_source
echo 0 >/sys/bus/coresight/devices/tmc_etf1/enable_sink
echo 1 >/sys/bus/coresight/devices/tmc_etf2/enable_sink
echo 1 >/sys/bus/coresight/devices/etm6/enable_source
cat /dev/tmc_etf2 >/tmp/etf2.bin
echo 0 >/sys/bus/coresight/devices/etm6/enable_source
echo 0 >/sys/bus/coresight/devices/tmc_etf2/enable_sink
Test steps for sysfs nodes:
cat /sys/bus/coresight/devices/tmc_etf*/mgmt/*
cat /sys/bus/coresight/devices/funnel*/funnel_ctrl
cat /sys/bus/coresight/devices/replicator*/mgmt/*
Test steps for perf mode:
perf record -a -e cs_etm//k -- sleep 5
Signed-off-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
---
Changes in v2:
- Use the qcom,cpu-bound-components device tree property to identify
  devices bound to a cluster.
- Refactor commit message.
- Introduce a supported_cpus field in the drvdata structure to record
  the CPUs that belong to the cluster where the local component resides.
- Link to v1: https://lore.kernel.org/r/20251027-cpu_cluster_component_pm-v1-0-31355ac588c...
---
Jie Gan (1):
      arm64: dts: qcom: hamoa: Add CoreSight nodes for APSS debug block

Yuanfang Zhang (11):
      dt-bindings: arm: coresight: Add 'qcom,cpu-bound-components' property
      coresight-funnel: Support CPU cluster funnel initialization
      coresight-funnel: Defer probe when associated CPUs are offline
      coresight-replicator: Support CPU cluster replicator initialization
      coresight-replicator: Defer probe when associated CPUs are offline
      coresight-replicator: Update management interface for CPU-bound devices
      coresight-tmc: Support probe and initialization for CPU cluster TMCs
      coresight-tmc-etf: Refactor enable function for CPU cluster ETF support
      coresight-tmc: Update management interface for CPU-bound TMCs
      coresight-tmc: Defer probe when associated CPUs are offline
      coresight: Pass trace mode to link enable callback

 .../bindings/arm/arm,coresight-dynamic-funnel.yaml |   5 +
 .../arm/arm,coresight-dynamic-replicator.yaml      |   5 +
 .../devicetree/bindings/arm/arm,coresight-tmc.yaml |   5 +
 arch/arm64/boot/dts/qcom/hamoa.dtsi                | 926 +++++++++++++++++++++
 arch/arm64/boot/dts/qcom/purwa.dtsi                |  12 +
 drivers/hwtracing/coresight/coresight-core.c       |   7 +-
 drivers/hwtracing/coresight/coresight-funnel.c     | 258 +++++-
 drivers/hwtracing/coresight/coresight-replicator.c | 341 +++++++-
 drivers/hwtracing/coresight/coresight-tmc-core.c   | 387 +++++++--
 drivers/hwtracing/coresight/coresight-tmc-etf.c    | 106 ++-
 drivers/hwtracing/coresight/coresight-tmc.h        |  10 +
 drivers/hwtracing/coresight/coresight-tnoc.c       |   3 +-
 drivers/hwtracing/coresight/coresight-tpda.c       |   3 +-
 include/linux/coresight.h                          |   3 +-
 14 files changed, 1902 insertions(+), 169 deletions(-)
---
base-commit: 008d3547aae5bc86fac3eda317489169c3fda112
change-id: 20251016-cpu_cluster_component_pm-ce518f510433
Best regards,
Funnels associated with CPU clusters reside in the cluster's power domain. Unlike dynamic funnels (which are typically system-wide), these per-cluster funnels are only accessible when the cluster is powered on. Standard runtime PM may not suffice to wake up a cluster from low-power states, making direct register access unreliable.
Enhance the funnel driver to support these per-cluster devices:
1. Safe Initialization:
   - Identify CPU cluster funnels via "qcom,cpu-bound-components".
   - Use smp_call_function_single() to perform hardware initialization
     (claim tag clearing) on a CPU within the cluster.
   - Refactor the probe flow to encapsulate device registration in
     funnel_add_coresight_dev().

2. Cross-CPU Enablement:
   - Update funnel_enable() to use smp_call_function_single() when
     enabling the hardware on a cluster-bound funnel.

3. Debug Interface Support:
   - Update funnel_ctrl_show() to safely read the control register via
     cross-CPU calls when necessary.
This ensures that funnel operations remain safe and functional even when the associated CPU cluster is in aggressive low-power states.
Signed-off-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
---
 drivers/hwtracing/coresight/coresight-funnel.c | 183 ++++++++++++++++++++-----
 1 file changed, 152 insertions(+), 31 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-funnel.c b/drivers/hwtracing/coresight/coresight-funnel.c
index 3b248e54471a38f501777fe162fea850d1c851b3..a1264df84ab4c625c63dfbb9b7710b983a10c6b4 100644
--- a/drivers/hwtracing/coresight/coresight-funnel.c
+++ b/drivers/hwtracing/coresight/coresight-funnel.c
@@ -15,6 +15,7 @@
 #include <linux/slab.h>
 #include <linux/of.h>
 #include <linux/platform_device.h>
+#include <linux/pm_domain.h>
 #include <linux/pm_runtime.h>
 #include <linux/coresight.h>
 #include <linux/amba/bus.h>
@@ -40,6 +41,7 @@ DEFINE_CORESIGHT_DEVLIST(funnel_devs, "funnel");
  * @csdev:	component vitals needed by the framework.
  * @priority:	port selection order.
  * @spinlock:	serialize enable/disable operations.
+ * @supported_cpus: Represent the CPUs related to this funnel.
  */
 struct funnel_drvdata {
 	void __iomem		*base;
@@ -48,6 +50,13 @@ struct funnel_drvdata {
 	struct coresight_device	*csdev;
 	unsigned long		priority;
 	raw_spinlock_t		spinlock;
+	struct cpumask		*supported_cpus;
+};
+
+struct funnel_smp_arg {
+	struct funnel_drvdata	*drvdata;
+	int			port;
+	int			rc;
 };
 
 static int dynamic_funnel_enable_hw(struct funnel_drvdata *drvdata, int port)
@@ -76,6 +85,33 @@ static int dynamic_funnel_enable_hw(struct funnel_drvdata *drvdata, int port)
 	return rc;
 }
 
+static void funnel_enable_hw_smp_call(void *info)
+{
+	struct funnel_smp_arg *arg = info;
+
+	arg->rc = dynamic_funnel_enable_hw(arg->drvdata, arg->port);
+}
+
+static int funnel_enable_hw(struct funnel_drvdata *drvdata, int port)
+{
+	int cpu, ret;
+	struct funnel_smp_arg arg = { 0 };
+
+	if (!drvdata->supported_cpus)
+		return dynamic_funnel_enable_hw(drvdata, port);
+
+	arg.drvdata = drvdata;
+	arg.port = port;
+
+	for_each_cpu(cpu, drvdata->supported_cpus) {
+		ret = smp_call_function_single(cpu,
+				funnel_enable_hw_smp_call, &arg, 1);
+		if (!ret)
+			return arg.rc;
+	}
+	return ret;
+}
+
 static int funnel_enable(struct coresight_device *csdev,
 			 struct coresight_connection *in,
 			 struct coresight_connection *out)
@@ -86,19 +122,24 @@ static int funnel_enable(struct coresight_device *csdev,
 	bool first_enable = false;
 
 	raw_spin_lock_irqsave(&drvdata->spinlock, flags);
-	if (in->dest_refcnt == 0) {
-		if (drvdata->base)
-			rc = dynamic_funnel_enable_hw(drvdata, in->dest_port);
-		if (!rc)
-			first_enable = true;
-	}
-	if (!rc)
+
+	if (in->dest_refcnt == 0)
+		first_enable = true;
+	else
 		in->dest_refcnt++;
+
 	raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-	if (first_enable)
-		dev_dbg(&csdev->dev, "FUNNEL inport %d enabled\n",
-			in->dest_port);
+	if (first_enable) {
+		if (drvdata->base)
+			rc = funnel_enable_hw(drvdata, in->dest_port);
+		if (!rc) {
+			in->dest_refcnt++;
+			dev_dbg(&csdev->dev, "FUNNEL inport %d enabled\n",
+				in->dest_port);
+		}
+	}
+
 	return rc;
 }
 
@@ -188,15 +229,39 @@ static u32 get_funnel_ctrl_hw(struct funnel_drvdata *drvdata)
 	return functl;
 }
 
+static void get_funnel_ctrl_smp_call(void *info)
+{
+	struct funnel_smp_arg *arg = info;
+
+	arg->rc = get_funnel_ctrl_hw(arg->drvdata);
+}
+
 static ssize_t funnel_ctrl_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
 {
 	u32 val;
+	int cpu, ret;
 	struct funnel_drvdata *drvdata = dev_get_drvdata(dev->parent);
+	struct funnel_smp_arg arg = { 0 };
 
 	pm_runtime_get_sync(dev->parent);
-
-	val = get_funnel_ctrl_hw(drvdata);
+	if (!drvdata->supported_cpus) {
+		val = get_funnel_ctrl_hw(drvdata);
+	} else {
+		arg.drvdata = drvdata;
+		for_each_cpu(cpu, drvdata->supported_cpus) {
+			ret = smp_call_function_single(cpu,
+					get_funnel_ctrl_smp_call, &arg, 1);
+			if (!ret)
+				break;
+		}
+		if (!ret) {
+			val = arg.rc;
+		} else {
+			pm_runtime_put(dev->parent);
+			return ret;
+		}
+	}
 
 	pm_runtime_put(dev->parent);
 
@@ -211,22 +276,68 @@ static struct attribute *coresight_funnel_attrs[] = {
 };
 ATTRIBUTE_GROUPS(coresight_funnel);
 
+static void funnel_clear_self_claim_tag(struct funnel_drvdata *drvdata)
+{
+	struct csdev_access access = CSDEV_ACCESS_IOMEM(drvdata->base);
+
+	coresight_clear_self_claim_tag(&access);
+}
+
+static void funnel_init_on_cpu(void *info)
+{
+	struct funnel_drvdata *drvdata = info;
+
+	funnel_clear_self_claim_tag(drvdata);
+}
+
+static int funnel_add_coresight_dev(struct device *dev)
+{
+	struct coresight_desc desc = { 0 };
+	struct funnel_drvdata *drvdata = dev_get_drvdata(dev);
+
+	if (drvdata->base) {
+		desc.groups = coresight_funnel_groups;
+		desc.access = CSDEV_ACCESS_IOMEM(drvdata->base);
+	}
+
+	desc.name = coresight_alloc_device_name(&funnel_devs, dev);
+	if (!desc.name)
+		return -ENOMEM;
+
+	desc.type = CORESIGHT_DEV_TYPE_LINK;
+	desc.subtype.link_subtype = CORESIGHT_DEV_SUBTYPE_LINK_MERG;
+	desc.ops = &funnel_cs_ops;
+	desc.pdata = dev->platform_data;
+	desc.dev = dev;
+
+	drvdata->csdev = coresight_register(&desc);
+	if (IS_ERR(drvdata->csdev))
+		return PTR_ERR(drvdata->csdev);
+	return 0;
+}
+
+static struct cpumask *funnel_get_supported_cpus(struct device *dev)
+{
+	struct generic_pm_domain *pd;
+
+	pd = pd_to_genpd(dev->pm_domain);
+	if (pd)
+		return pd->cpus;
+
+	return NULL;
+}
+
 static int funnel_probe(struct device *dev, struct resource *res)
 {
 	void __iomem *base;
 	struct coresight_platform_data *pdata = NULL;
 	struct funnel_drvdata *drvdata;
-	struct coresight_desc desc = { 0 };
-	int ret;
+	int cpu, ret;
 
 	if (is_of_node(dev_fwnode(dev)) &&
 	    of_device_is_compatible(dev->of_node, "arm,coresight-funnel"))
 		dev_warn_once(dev, "Uses OBSOLETE CoreSight funnel binding\n");
 
-	desc.name = coresight_alloc_device_name(&funnel_devs, dev);
-	if (!desc.name)
-		return -ENOMEM;
-
 	drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
 	if (!drvdata)
 		return -ENOMEM;
@@ -244,9 +355,6 @@ static int funnel_probe(struct device *dev, struct resource *res)
 		if (IS_ERR(base))
 			return PTR_ERR(base);
 		drvdata->base = base;
-		desc.groups = coresight_funnel_groups;
-		desc.access = CSDEV_ACCESS_IOMEM(base);
-		coresight_clear_self_claim_tag(&desc.access);
 	}
 
 	dev_set_drvdata(dev, drvdata);
@@ -258,23 +366,36 @@ static int funnel_probe(struct device *dev, struct resource *res)
 	dev->platform_data = pdata;
 
 	raw_spin_lock_init(&drvdata->spinlock);
-	desc.type = CORESIGHT_DEV_TYPE_LINK;
-	desc.subtype.link_subtype = CORESIGHT_DEV_SUBTYPE_LINK_MERG;
-	desc.ops = &funnel_cs_ops;
-	desc.pdata = pdata;
-	desc.dev = dev;
-	drvdata->csdev = coresight_register(&desc);
-	if (IS_ERR(drvdata->csdev))
-		return PTR_ERR(drvdata->csdev);
 
-	return 0;
+	if (fwnode_property_present(dev_fwnode(dev), "qcom,cpu-bound-components")) {
+		drvdata->supported_cpus = funnel_get_supported_cpus(dev);
+		if (!drvdata->supported_cpus)
+			return -EINVAL;
+
+		cpus_read_lock();
+		for_each_cpu(cpu, drvdata->supported_cpus) {
+			ret = smp_call_function_single(cpu,
+					funnel_init_on_cpu, drvdata, 1);
+			if (!ret)
+				break;
+		}
+		cpus_read_unlock();
+
+		if (ret)
+			return 0;
+	} else if (res) {
+		funnel_clear_self_claim_tag(drvdata);
+	}
+
+	return funnel_add_coresight_dev(dev);
 }
 
 static int funnel_remove(struct device *dev)
 {
 	struct funnel_drvdata *drvdata = dev_get_drvdata(dev);
 
-	coresight_unregister(drvdata->csdev);
+	if (drvdata->csdev)
+		coresight_unregister(drvdata->csdev);
 
 	return 0;
 }
Introduce the `qcom,cpu-bound-components` boolean property for CoreSight components (TMC, Funnel, and Replicator).
This property indicates that the component is physically located within a CPU cluster power domain. Such components share the power state of the cluster and may require special handling (e.g., cross-CPU register access) compared to system-wide components.
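For illustration only (not part of the binding change itself), a
cluster-bound funnel node might carry the new property as below; the
unit address, clock, and power-domain phandle are hypothetical and
platform-specific:

funnel@9041000 {
	compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
	reg = <0x9041000 0x1000>;
	clocks = <&aoss_qmp>;
	clock-names = "apb_pclk";
	/* the funnel sits in the CPU cluster's power domain */
	power-domains = <&cluster_pd>;
	qcom,cpu-bound-components;

	/* in-ports / out-ports omitted for brevity */
};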
Signed-off-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
---
 .../devicetree/bindings/arm/arm,coresight-dynamic-funnel.yaml     | 5 +++++
 .../devicetree/bindings/arm/arm,coresight-dynamic-replicator.yaml | 5 +++++
 Documentation/devicetree/bindings/arm/arm,coresight-tmc.yaml      | 5 +++++
 3 files changed, 15 insertions(+)
diff --git a/Documentation/devicetree/bindings/arm/arm,coresight-dynamic-funnel.yaml b/Documentation/devicetree/bindings/arm/arm,coresight-dynamic-funnel.yaml
index b74db15e5f8af2226b817f6af5f533b1bfc74736..a4c7333e8359da9035a9fed999ec99159e00a1d9 100644
--- a/Documentation/devicetree/bindings/arm/arm,coresight-dynamic-funnel.yaml
+++ b/Documentation/devicetree/bindings/arm/arm,coresight-dynamic-funnel.yaml
@@ -57,6 +57,11 @@ properties:
   power-domains:
     maxItems: 1
 
+  qcom,cpu-bound-components:
+    type: boolean
+    description:
+      Indicates whether the funnel is located physically within the CPU cluster.
+
   label:
     description: Description of a coresight device.
diff --git a/Documentation/devicetree/bindings/arm/arm,coresight-dynamic-replicator.yaml b/Documentation/devicetree/bindings/arm/arm,coresight-dynamic-replicator.yaml
index 17ea936b796fd42bb885e539201276a11e91028c..2c6e78f02ed84d95bb4366e4c4bbd1b3953efa32 100644
--- a/Documentation/devicetree/bindings/arm/arm,coresight-dynamic-replicator.yaml
+++ b/Documentation/devicetree/bindings/arm/arm,coresight-dynamic-replicator.yaml
@@ -67,6 +67,11 @@ properties:
       Indicates that the replicator will lose register context when AMBA clock
       is removed which is observed in some replicator designs.
 
+  qcom,cpu-bound-components:
+    type: boolean
+    description:
+      Indicates whether the replicator is located physically within the CPU cluster.
+
   in-ports:
     $ref: /schemas/graph.yaml#/properties/ports
     additionalProperties: false
diff --git a/Documentation/devicetree/bindings/arm/arm,coresight-tmc.yaml b/Documentation/devicetree/bindings/arm/arm,coresight-tmc.yaml
index 96dd5b5f771a39138df9adde0c9c9a6f5583d9da..8c4f2244a5c74dc8654892305025a4e6bccbce07 100644
--- a/Documentation/devicetree/bindings/arm/arm,coresight-tmc.yaml
+++ b/Documentation/devicetree/bindings/arm/arm,coresight-tmc.yaml
@@ -86,6 +86,11 @@ properties:
     $ref: /schemas/types.yaml#/definitions/uint32
     maximum: 15
 
+  qcom,cpu-bound-components:
+    type: boolean
+    description:
+      Indicates whether the TMC-ETF is located physically within the CPU cluster.
+
   in-ports:
     $ref: /schemas/graph.yaml#/properties/ports
     additionalProperties: false
Per-cluster funnels rely on the associated CPU cluster being online to safely access registers during initialization. If all CPUs in the cluster are offline during probe, these operations fail.
Support deferred initialization for these devices:
1. Track funnels that fail to probe due to offline CPUs in a global
   list.
2. Register a CPU hotplug notifier (funnel_online_cpu) to detect when a
   relevant CPU comes online.
3. Upon CPU online, retry the hardware initialization and registration
   with the CoreSight framework.
Signed-off-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
---
 drivers/hwtracing/coresight/coresight-funnel.c | 62 +++++++++++++++++++++++---
 1 file changed, 57 insertions(+), 5 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-funnel.c b/drivers/hwtracing/coresight/coresight-funnel.c
index a1264df84ab4c625c63dfbb9b7710b983a10c6b4..5d114ce1109f4f9a8b108110bdae258f216881d8 100644
--- a/drivers/hwtracing/coresight/coresight-funnel.c
+++ b/drivers/hwtracing/coresight/coresight-funnel.c
@@ -32,6 +32,9 @@
 #define FUNNEL_ENSx_MASK	0xff
 
 DEFINE_CORESIGHT_DEVLIST(funnel_devs, "funnel");
+static LIST_HEAD(funnel_delay_probe);
+static enum cpuhp_state hp_online;
+static DEFINE_SPINLOCK(delay_lock);
 
 /**
  * struct funnel_drvdata - specifics associated to a funnel component
@@ -42,6 +45,8 @@ DEFINE_CORESIGHT_DEVLIST(funnel_devs, "funnel");
  * @priority:	port selection order.
  * @spinlock:	serialize enable/disable operations.
  * @supported_cpus: Represent the CPUs related to this funnel.
+ * @dev:	pointer to the device associated with this funnel.
+ * @link:	list node for adding this funnel to the delayed probe list.
  */
 struct funnel_drvdata {
 	void __iomem		*base;
@@ -51,6 +56,8 @@ struct funnel_drvdata {
 	unsigned long		priority;
 	raw_spinlock_t		spinlock;
 	struct cpumask		*supported_cpus;
+	struct device		*dev;
+	struct list_head	link;
 };
 
 struct funnel_smp_arg {
@@ -371,7 +378,7 @@ static int funnel_probe(struct device *dev, struct resource *res)
 		drvdata->supported_cpus = funnel_get_supported_cpus(dev);
 		if (!drvdata->supported_cpus)
 			return -EINVAL;
-
+		drvdata->dev = dev;
 		cpus_read_lock();
 		for_each_cpu(cpu, drvdata->supported_cpus) {
 			ret = smp_call_function_single(cpu,
@@ -379,10 +386,15 @@ static int funnel_probe(struct device *dev, struct resource *res)
 			if (!ret)
 				break;
 		}
-		cpus_read_unlock();
 
-		if (ret)
+		if (ret) {
+			scoped_guard(spinlock, &delay_lock)
+				list_add(&drvdata->link, &funnel_delay_probe);
+			cpus_read_unlock();
 			return 0;
+		}
+
+		cpus_read_unlock();
 	} else if (res) {
 		funnel_clear_self_claim_tag(drvdata);
 	}
@@ -394,9 +406,12 @@ static int funnel_remove(struct device *dev)
 {
 	struct funnel_drvdata *drvdata = dev_get_drvdata(dev);
 
-	if (drvdata->csdev)
+	if (drvdata->csdev) {
 		coresight_unregister(drvdata->csdev);
-
+	} else {
+		scoped_guard(spinlock, &delay_lock)
+			list_del(&drvdata->link);
+	}
 	return 0;
 }
 
@@ -533,8 +548,41 @@ static struct amba_driver dynamic_funnel_driver = {
 	.id_table	= dynamic_funnel_ids,
 };
 
+static int funnel_online_cpu(unsigned int cpu)
+{
+	struct funnel_drvdata *drvdata, *tmp;
+	int ret;
+
+	list_for_each_entry_safe(drvdata, tmp, &funnel_delay_probe, link) {
+		if (cpumask_test_cpu(cpu, drvdata->supported_cpus)) {
+			scoped_guard(spinlock, &delay_lock)
+				list_del(&drvdata->link);
+
+			ret = pm_runtime_resume_and_get(drvdata->dev);
+			if (ret < 0)
+				return 0;
+
+			funnel_clear_self_claim_tag(drvdata);
+			funnel_add_coresight_dev(drvdata->dev);
+			pm_runtime_put(drvdata->dev);
+		}
+	}
+	return 0;
+}
+
 static int __init funnel_init(void)
 {
+	int ret;
+
+	ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+					"arm/coresight-funnel:online",
+					funnel_online_cpu, NULL);
+
+	if (ret > 0)
+		hp_online = ret;
+	else
+		return ret;
+
 	return coresight_init_driver("funnel", &dynamic_funnel_driver, &funnel_driver,
 				     THIS_MODULE);
 }
@@ -542,6 +590,10 @@ static int __init funnel_init(void)
 static void __exit funnel_exit(void)
 {
 	coresight_remove_driver(&dynamic_funnel_driver, &funnel_driver);
+	if (hp_online) {
+		cpuhp_remove_state_nocalls(hp_online);
+		hp_online = 0;
+	}
 }
 
 module_init(funnel_init);
Per-cluster replicators rely on the associated CPU cluster being online to safely access registers during initialization. If all CPUs in the cluster are offline during probe, these operations fail.
Support deferred initialization for these devices:
1. Track replicators that fail to probe due to offline CPUs in a global
   list.
2. Register a CPU hotplug notifier (`replicator_online_cpu`) to detect
   when a relevant CPU comes online.
3. Upon CPU online, retry the hardware initialization and registration
   with the CoreSight framework.
Signed-off-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
---
 drivers/hwtracing/coresight/coresight-replicator.c | 65 ++++++++++++++++++++--
 1 file changed, 61 insertions(+), 4 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-replicator.c b/drivers/hwtracing/coresight/coresight-replicator.c
index c11da452559c73af6709b39d03b646cb4779736f..f8d13894098f1e414fb0da8d6eeb1da4f0d55a8c 100644
--- a/drivers/hwtracing/coresight/coresight-replicator.c
+++ b/drivers/hwtracing/coresight/coresight-replicator.c
@@ -26,6 +26,9 @@
 #define REPLICATOR_IDFILTER1	0x004
 
 DEFINE_CORESIGHT_DEVLIST(replicator_devs, "replicator");
+static LIST_HEAD(replicator_delay_probe);
+static enum cpuhp_state hp_online;
+static DEFINE_SPINLOCK(delay_lock);
 
 /**
  * struct replicator_drvdata - specifics associated to a replicator component
@@ -37,6 +40,8 @@ DEFINE_CORESIGHT_DEVLIST(replicator_devs, "replicator");
  * @spinlock:	serialize enable/disable operations.
  * @check_idfilter_val: check if the context is lost upon clock removal.
  * @supported_cpus: Represent the CPUs related to this replicator.
+ * @dev:	pointer to the device associated with this replicator.
+ * @link:	link to the delay_probed list.
  */
 struct replicator_drvdata {
 	void __iomem		*base;
@@ -46,6 +51,8 @@ struct replicator_drvdata {
 	raw_spinlock_t		spinlock;
 	bool			check_idfilter_val;
 	struct cpumask		*supported_cpus;
+	struct device		*dev;
+	struct list_head	link;
 };
 
 struct replicator_smp_arg {
@@ -394,7 +401,7 @@ static int replicator_probe(struct device *dev, struct resource *res)
 		drvdata->supported_cpus = replicator_get_supported_cpus(dev);
 		if (!drvdata->supported_cpus)
 			return -EINVAL;
-
+		drvdata->dev = dev;
 		cpus_read_lock();
 		for_each_cpu(cpu, drvdata->supported_cpus) {
 			ret = smp_call_function_single(cpu,
@@ -402,10 +409,15 @@ static int replicator_probe(struct device *dev, struct resource *res)
 			if (!ret)
 				break;
 		}
-		cpus_read_unlock();
 
-		if (ret)
+		if (ret) {
+			scoped_guard(spinlock, &delay_lock)
+				list_add(&drvdata->link, &replicator_delay_probe);
+			cpus_read_unlock();
 			return 0;
+		}
+
+		cpus_read_unlock();
 	} else if (res) {
 		replicator_init_hw(drvdata);
 	}
@@ -419,8 +431,13 @@ static int replicator_remove(struct device *dev)
 {
 	struct replicator_drvdata *drvdata = dev_get_drvdata(dev);
 
-	if (drvdata->csdev)
+	if (drvdata->csdev) {
 		coresight_unregister(drvdata->csdev);
+	} else {
+		scoped_guard(spinlock, &delay_lock)
+			list_del(&drvdata->link);
+	}
+
 	return 0;
 }
 
@@ -552,8 +569,44 @@ static struct amba_driver dynamic_replicator_driver = {
 	.id_table	= dynamic_replicator_ids,
 };
 
+static int replicator_online_cpu(unsigned int cpu)
+{
+	struct replicator_drvdata *drvdata, *tmp;
+	int ret;
+
+	spin_lock(&delay_lock);
+	list_for_each_entry_safe(drvdata, tmp, &replicator_delay_probe, link) {
+		if (cpumask_test_cpu(cpu, drvdata->supported_cpus)) {
+			list_del(&drvdata->link);
+			spin_unlock(&delay_lock);
+			ret = pm_runtime_resume_and_get(drvdata->dev);
+			if (ret < 0)
+				return 0;
+
+			replicator_clear_self_claim_tag(drvdata);
+			replicator_reset(drvdata);
+			replicator_add_coresight_dev(drvdata->dev);
+			pm_runtime_put(drvdata->dev);
+			spin_lock(&delay_lock);
+		}
+	}
+	spin_unlock(&delay_lock);
+	return 0;
+}
+
 static int __init replicator_init(void)
 {
+	int ret;
+
+	ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+					"arm/coresight-replicator:online",
+					replicator_online_cpu, NULL);
+
+	if (ret > 0)
+		hp_online = ret;
+	else
+		return ret;
+
 	return coresight_init_driver("replicator", &dynamic_replicator_driver,
 				     &replicator_driver, THIS_MODULE);
 }
@@ -561,6 +614,10 @@ static int __init replicator_init(void)
 static void __exit replicator_exit(void)
 {
 	coresight_remove_driver(&dynamic_replicator_driver, &replicator_driver);
+	if (hp_online) {
+		cpuhp_remove_state_nocalls(hp_online);
+		hp_online = 0;
+	}
 }
 
 module_init(replicator_init);
Replicators associated with CPU clusters reside in the cluster's power domain. Unlike system-wide replicators, their registers are only accessible when the cluster is powered on. Standard runtime PM may not suffice to wake up a cluster from low-power states, making direct register access unreliable during initialization or operation.
Enhance the replicator driver to support these per-cluster devices:
1. Safe Initialization:
   - Identify per-cluster replicators via device properties.
   - Use smp_call_function_single() to perform hardware initialization
     (reset and claim tag clearing) on a CPU within the cluster.
   - Refactor the probe flow to encapsulate device registration in
     replicator_add_coresight_dev().

2. Cross-CPU Enablement:
   - Update replicator_enable() to use smp_call_function_single() when
     enabling the hardware on a cluster-bound replicator.

3. Claim/Disclaim Handling:
   - Introduce replicator_claim/disclaim_device_unlocked() to manage
     device access safely before full framework registration.
This ensures that replicator operations remain robust even when the associated CPU cluster is in low-power states, while maintaining compatibility with existing system-level replicators.
Signed-off-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
---
 drivers/hwtracing/coresight/coresight-replicator.c | 200 +++++++++++++++++----
 1 file changed, 167 insertions(+), 33 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-replicator.c b/drivers/hwtracing/coresight/coresight-replicator.c
index e6472658235dc479cec91ac18f3737f76f8c74f0..c11da452559c73af6709b39d03b646cb4779736f 100644
--- a/drivers/hwtracing/coresight/coresight-replicator.c
+++ b/drivers/hwtracing/coresight/coresight-replicator.c
@@ -13,6 +13,7 @@
 #include <linux/io.h>
 #include <linux/err.h>
 #include <linux/slab.h>
+#include <linux/pm_domain.h>
 #include <linux/pm_runtime.h>
 #include <linux/property.h>
 #include <linux/clk.h>
@@ -35,6 +36,7 @@ DEFINE_CORESIGHT_DEVLIST(replicator_devs, "replicator");
  * @csdev:	component vitals needed by the framework
  * @spinlock:	serialize enable/disable operations.
  * @check_idfilter_val: check if the context is lost upon clock removal.
+ * @supported_cpus: Represent the CPUs related to this replicator.
  */
 struct replicator_drvdata {
 	void __iomem		*base;
@@ -43,18 +45,61 @@ struct replicator_drvdata {
 	struct coresight_device	*csdev;
 	raw_spinlock_t		spinlock;
 	bool			check_idfilter_val;
+	struct cpumask		*supported_cpus;
 };
 
-static void dynamic_replicator_reset(struct replicator_drvdata *drvdata)
+struct replicator_smp_arg {
+	struct replicator_drvdata	*drvdata;
+	int				outport;
+	int				rc;
+};
+
+static void replicator_clear_self_claim_tag(struct replicator_drvdata *drvdata)
+{
+	struct csdev_access access = CSDEV_ACCESS_IOMEM(drvdata->base);
+
+	coresight_clear_self_claim_tag(&access);
+}
+
+static int replicator_claim_device_unlocked(struct replicator_drvdata *drvdata)
+{
+	struct coresight_device *csdev = drvdata->csdev;
+	struct csdev_access access = CSDEV_ACCESS_IOMEM(drvdata->base);
+	u32 claim_tag;
+
+	if (csdev)
+		return coresight_claim_device_unlocked(csdev);
+
+	writel_relaxed(CORESIGHT_CLAIM_SELF_HOSTED, drvdata->base + CORESIGHT_CLAIMSET);
+
+	claim_tag = readl_relaxed(drvdata->base + CORESIGHT_CLAIMCLR);
+	if (claim_tag != CORESIGHT_CLAIM_SELF_HOSTED) {
+		coresight_clear_self_claim_tag_unlocked(&access);
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+static void replicator_disclaim_device_unlocked(struct replicator_drvdata *drvdata)
 {
 	struct coresight_device *csdev = drvdata->csdev;
+	struct csdev_access access = CSDEV_ACCESS_IOMEM(drvdata->base);
+
+	if (csdev)
+		return coresight_disclaim_device_unlocked(csdev);
 
+	coresight_clear_self_claim_tag_unlocked(&access);
+}
+
+static void dynamic_replicator_reset(struct replicator_drvdata *drvdata)
+{
 	CS_UNLOCK(drvdata->base);
 
-	if (!coresight_claim_device_unlocked(csdev)) {
+	if (!replicator_claim_device_unlocked(drvdata)) {
 		writel_relaxed(0xff, drvdata->base + REPLICATOR_IDFILTER0);
 		writel_relaxed(0xff, drvdata->base + REPLICATOR_IDFILTER1);
-		coresight_disclaim_device_unlocked(csdev);
+		replicator_disclaim_device_unlocked(drvdata);
 	}
 
 	CS_LOCK(drvdata->base);
@@ -116,6 +161,34 @@ static int dynamic_replicator_enable(struct replicator_drvdata *drvdata,
 	return rc;
 }
 
+static void replicator_enable_hw_smp_call(void *info)
+{
+	struct replicator_smp_arg *arg = info;
+
+	arg->rc = dynamic_replicator_enable(arg->drvdata, 0, arg->outport);
+}
+
+static int replicator_enable_hw(struct replicator_drvdata *drvdata,
+				int inport, int outport)
+{
+	int cpu, ret;
+	struct replicator_smp_arg arg = { 0 };
+
+	if (!drvdata->supported_cpus)
+		return dynamic_replicator_enable(drvdata, 0, outport);
+
+	arg.drvdata = drvdata;
+	arg.outport = outport;
+
+	for_each_cpu(cpu, drvdata->supported_cpus) {
+		ret = smp_call_function_single(cpu, replicator_enable_hw_smp_call, &arg, 1);
+		if (!ret)
+			return arg.rc;
+	}
+
+	return ret;
+}
+
 static int replicator_enable(struct coresight_device *csdev,
 			     struct coresight_connection *in,
 			     struct coresight_connection *out)
@@ -126,19 +199,24 @@ static int replicator_enable(struct coresight_device *csdev,
 	bool first_enable = false;
 
 	raw_spin_lock_irqsave(&drvdata->spinlock, flags);
-	if (out->src_refcnt == 0) {
-		if (drvdata->base)
-			rc = dynamic_replicator_enable(drvdata, in->dest_port,
-						       out->src_port);
-		if (!rc)
-			first_enable = true;
-	}
-	if (!rc)
+
+	if (out->src_refcnt == 0)
+		first_enable = true;
+	else
 		out->src_refcnt++;
 	raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-	if (first_enable)
-		dev_dbg(&csdev->dev, "REPLICATOR enabled\n");
+	if (first_enable) {
+		if (drvdata->base)
+			rc = replicator_enable_hw(drvdata, in->dest_port,
+						  out->src_port);
+		if (!rc) {
+			out->src_refcnt++;
+			dev_dbg(&csdev->dev, "REPLICATOR enabled\n");
+			return rc;
+		}
+	}
+
 	return rc;
 }
 
@@ -217,23 +295,69 @@ static const struct attribute_group *replicator_groups[] = {
 	NULL,
 };
 
+static int replicator_add_coresight_dev(struct device *dev)
+{
+	struct coresight_desc desc = { 0 };
+	struct replicator_drvdata *drvdata = dev_get_drvdata(dev);
+
+	if (drvdata->base) {
+		desc.groups = replicator_groups;
+		desc.access = CSDEV_ACCESS_IOMEM(drvdata->base);
+	}
+
+	desc.name = coresight_alloc_device_name(&replicator_devs, dev);
+	if (!desc.name)
+		return -ENOMEM;
+
+	desc.type = CORESIGHT_DEV_TYPE_LINK;
+	desc.subtype.link_subtype = CORESIGHT_DEV_SUBTYPE_LINK_SPLIT;
+	desc.ops = &replicator_cs_ops;
+	desc.pdata = dev->platform_data;
+	desc.dev = dev;
+
+	drvdata->csdev = coresight_register(&desc);
+	if (IS_ERR(drvdata->csdev))
+		return PTR_ERR(drvdata->csdev);
+
+	return 0;
+}
+
+static void replicator_init_hw(struct replicator_drvdata *drvdata)
+{
+	replicator_clear_self_claim_tag(drvdata);
+	replicator_reset(drvdata);
+}
+
+static void replicator_init_on_cpu(void *info)
+{
+	struct replicator_drvdata *drvdata = info;
+
+	replicator_init_hw(drvdata);
+}
+
+static struct cpumask *replicator_get_supported_cpus(struct device *dev)
+{
+	struct generic_pm_domain *pd;
+
+	pd = pd_to_genpd(dev->pm_domain);
+	if (pd)
+		return pd->cpus;
+
+	return NULL;
+}
+
 static int replicator_probe(struct device *dev, struct resource *res)
 {
 	struct coresight_platform_data *pdata = NULL;
 	struct replicator_drvdata *drvdata;
-	struct coresight_desc desc = { 0 };
 	void __iomem *base;
-	int ret;
+	int cpu, ret;
 
 	if (is_of_node(dev_fwnode(dev)) &&
 	    of_device_is_compatible(dev->of_node, "arm,coresight-replicator"))
 		dev_warn_once(dev, "Uses OBSOLETE CoreSight replicator binding\n");
 
-	desc.name = coresight_alloc_device_name(&replicator_devs, dev);
-	if (!desc.name)
-		return -ENOMEM;
-
 	drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
 	if (!drvdata)
 		return -ENOMEM;
@@ -251,9 +375,6 @@ static int replicator_probe(struct device *dev, struct resource *res)
 		if (IS_ERR(base))
 			return PTR_ERR(base);
 		drvdata->base = base;
-		desc.groups = replicator_groups;
-		desc.access = CSDEV_ACCESS_IOMEM(base);
-		coresight_clear_self_claim_tag(&desc.access);
 	}
 
 	if (fwnode_property_present(dev_fwnode(dev),
@@ -268,25 +389,38 @@ static int replicator_probe(struct device *dev, struct resource *res)
 	dev->platform_data = pdata;
 
 	raw_spin_lock_init(&drvdata->spinlock);
-	desc.type = CORESIGHT_DEV_TYPE_LINK;
-	desc.subtype.link_subtype = CORESIGHT_DEV_SUBTYPE_LINK_SPLIT;
-	desc.ops = &replicator_cs_ops;
-	desc.pdata = dev->platform_data;
-	desc.dev = dev;
 
-	drvdata->csdev = coresight_register(&desc);
-	if (IS_ERR(drvdata->csdev))
-		return PTR_ERR(drvdata->csdev);
+	if (fwnode_property_present(dev_fwnode(dev), "qcom,cpu-bound-components")) {
+		drvdata->supported_cpus = replicator_get_supported_cpus(dev);
+		if (!drvdata->supported_cpus)
+			return -EINVAL;
+
+		cpus_read_lock();
+		for_each_cpu(cpu, drvdata->supported_cpus) {
+			ret = smp_call_function_single(cpu,
+					replicator_init_on_cpu, drvdata, 1);
+			if (!ret)
+				break;
+		}
+		cpus_read_unlock();
 
-	replicator_reset(drvdata);
-	return 0;
+		if (ret)
+			return 0;
+	} else if (res) {
+		replicator_init_hw(drvdata);
+	}
+
+	ret = replicator_add_coresight_dev(dev);
+
+	return ret;
 }
 
 static int replicator_remove(struct device *dev)
 {
 	struct replicator_drvdata *drvdata = dev_get_drvdata(dev);
 
-	coresight_unregister(drvdata->csdev);
+	if (drvdata->csdev)
+		coresight_unregister(drvdata->csdev);
 	return 0;
 }
Standard system replicators allow direct register access from any CPU. However, replicators associated with specific CPU clusters share the cluster's power domain and require access via a CPU within that domain.
Replace the standard `coresight_simple_reg*` accessors with custom
handlers (`coresight_replicator_reg*`) to support these devices:
- For cluster-bound replicators (indicated by `supported_cpus`), use
  `smp_call_function_single()` to read registers on an associated CPU.
- For standard replicators, retain the direct access behavior.
This ensures correct operation for per-cluster replicators while maintaining compatibility for existing system-level devices.
Signed-off-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
---
 drivers/hwtracing/coresight/coresight-replicator.c | 61 +++++++++++++++++++++-
 1 file changed, 59 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-replicator.c b/drivers/hwtracing/coresight/coresight-replicator.c
index f8d13894098f1e414fb0da8d6eeb1da4f0d55a8c..a9f22d0e15de21aa06c8d1e193e5db06091efd75 100644
--- a/drivers/hwtracing/coresight/coresight-replicator.c
+++ b/drivers/hwtracing/coresight/coresight-replicator.c
@@ -58,6 +58,7 @@ struct replicator_drvdata {
 struct replicator_smp_arg {
 	struct replicator_drvdata	*drvdata;
 	int				outport;
+	u32				offset;
 	int				rc;
 };
 
@@ -286,9 +287,65 @@ static const struct coresight_ops replicator_cs_ops = {
 	.link_ops	= &replicator_link_ops,
 };
 
+static void replicator_read_register_smp_call(void *info)
+{
+	struct replicator_smp_arg *arg = info;
+
+	arg->rc = readl_relaxed(arg->drvdata->base + arg->offset);
+}
+
+static ssize_t coresight_replicator_reg32_show(struct device *dev,
+					       struct device_attribute *attr,
+					       char *buf)
+{
+	struct replicator_drvdata *drvdata = dev_get_drvdata(dev->parent);
+	struct cs_off_attribute *cs_attr = container_of(attr, struct cs_off_attribute, attr);
+	unsigned long flags;
+	struct replicator_smp_arg arg = { 0 };
+	u32 val;
+	int ret, cpu;
+
+	pm_runtime_get_sync(dev->parent);
+
+	if (!drvdata->supported_cpus) {
+		raw_spin_lock_irqsave(&drvdata->spinlock, flags);
+		val = readl_relaxed(drvdata->base + cs_attr->off);
+		raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
+
+	} else {
+		arg.drvdata = drvdata;
+		arg.offset = cs_attr->off;
+		for_each_cpu(cpu, drvdata->supported_cpus) {
+			ret = smp_call_function_single(cpu,
+					replicator_read_register_smp_call,
+					&arg, 1);
+			if (!ret)
+				break;
+		}
+		if (!ret) {
+			val = arg.rc;
+		} else {
+			pm_runtime_put_sync(dev->parent);
+			return ret;
+		}
+	}
+
+	pm_runtime_put_sync(dev->parent);
+
+	return sysfs_emit(buf, "0x%x\n", val);
+}
+
+#define coresight_replicator_reg32(name, offset)			\
+	(&((struct cs_off_attribute[]) {				\
+	   {								\
+		__ATTR(name, 0444, coresight_replicator_reg32_show, NULL), \
+		offset							\
+	   }								\
+	})[0].attr.attr)
+
 static struct attribute *replicator_mgmt_attrs[] = {
-	coresight_simple_reg32(idfilter0, REPLICATOR_IDFILTER0),
-	coresight_simple_reg32(idfilter1, REPLICATOR_IDFILTER1),
+	coresight_replicator_reg32(idfilter0, REPLICATOR_IDFILTER0),
+	coresight_replicator_reg32(idfilter1, REPLICATOR_IDFILTER1),
 	NULL,
 };
TMC instances associated with CPU clusters reside in the cluster's power domain. Unlike system-level TMCs, their registers are only accessible when the cluster is powered on. Standard runtime PM may not suffice to wake up a cluster from low-power states during probe, making direct register access unreliable.
Refactor the probe sequence to handle these per-cluster devices safely:
1. Identify per-cluster TMCs using the "qcom,cpu-bound-components"
   property.
2. For such devices, use `smp_call_function_single()` to perform
   hardware initialization (`tmc_init_hw_config`) on a CPU within the
   cluster. This ensures the domain is powered during access.
3. Factor out the device registration logic into
   `tmc_add_coresight_dev()`. This allows common registration code to
   be shared between the standard probe path and the deferred probe
   path (used when the associated CPUs are initially offline).
This change ensures reliable initialization for per-cluster TMCs while maintaining backward compatibility for standard system-level TMCs.
Signed-off-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-core.c | 195 +++++++++++++++--------
 drivers/hwtracing/coresight/coresight-tmc.h      |   6 +
 2 files changed, 132 insertions(+), 69 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
index 36599c431be6203e871fdcb8de569cc6701c52bb..0e1b5956398d3cefdd938a8a8404076eb4850b44 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-core.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
@@ -21,6 +21,7 @@
 #include <linux/slab.h>
 #include <linux/dma-mapping.h>
 #include <linux/spinlock.h>
+#include <linux/pm_domain.h>
 #include <linux/pm_runtime.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
@@ -769,56 +770,14 @@ static void register_crash_dev_interface(struct tmc_drvdata *drvdata,
 		 "Valid crash tracedata found\n");
 }
 
-static int __tmc_probe(struct device *dev, struct resource *res)
+static int tmc_add_coresight_dev(struct device *dev)
 {
-	int ret = 0;
-	u32 devid;
-	void __iomem *base;
-	struct coresight_platform_data *pdata = NULL;
-	struct tmc_drvdata *drvdata;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(dev);
 	struct coresight_desc desc = { 0 };
 	struct coresight_dev_list *dev_list = NULL;
+	int ret = 0;
 
-	drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
-	if (!drvdata)
-		return -ENOMEM;
-
-	dev_set_drvdata(dev, drvdata);
-
-	ret = coresight_get_enable_clocks(dev, &drvdata->pclk, &drvdata->atclk);
-	if (ret)
-		return ret;
-
-	ret = -ENOMEM;
-
-	/* Validity for the resource is already checked by the AMBA core */
-	base = devm_ioremap_resource(dev, res);
-	if (IS_ERR(base)) {
-		ret = PTR_ERR(base);
-		goto out;
-	}
-
-	drvdata->base = base;
-	desc.access = CSDEV_ACCESS_IOMEM(base);
-
-	raw_spin_lock_init(&drvdata->spinlock);
-
-	devid = readl_relaxed(drvdata->base + CORESIGHT_DEVID);
-	drvdata->config_type = BMVAL(devid, 6, 7);
-	drvdata->memwidth = tmc_get_memwidth(devid);
-	/* This device is not associated with a session */
-	drvdata->pid = -1;
-	drvdata->etr_mode = ETR_MODE_AUTO;
-
-	if (drvdata->config_type == TMC_CONFIG_TYPE_ETR) {
-		drvdata->size = tmc_etr_get_default_buffer_size(dev);
-		drvdata->max_burst_size = tmc_etr_get_max_burst_size(dev);
-	} else {
-		drvdata->size = readl_relaxed(drvdata->base + TMC_RSZ) * 4;
-	}
-
-	tmc_get_reserved_region(dev);
-
+	desc.access = CSDEV_ACCESS_IOMEM(drvdata->base);
 	desc.dev = dev;
 
 	switch (drvdata->config_type) {
@@ -834,9 +793,9 @@ static int __tmc_probe(struct device *dev, struct resource *res)
 		desc.type = CORESIGHT_DEV_TYPE_SINK;
 		desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_SYSMEM;
 		desc.ops = &tmc_etr_cs_ops;
-		ret = tmc_etr_setup_caps(dev, devid, &desc.access);
+		ret = tmc_etr_setup_caps(dev, drvdata->devid, &desc.access);
 		if (ret)
-			goto out;
+			return ret;
 		idr_init(&drvdata->idr);
 		mutex_init(&drvdata->idr_mutex);
 		dev_list = &etr_devs;
@@ -851,44 +810,141 @@ static int __tmc_probe(struct device *dev, struct resource *res)
 		break;
 	default:
 		pr_err("%s: Unsupported TMC config\n", desc.name);
-		ret = -EINVAL;
-		goto out;
+		return -EINVAL;
 	}
 
 	desc.name = coresight_alloc_device_name(dev_list, dev);
-	if (!desc.name) {
-		ret = -ENOMEM;
+	if (!desc.name)
+		return -ENOMEM;
+
+	drvdata->desc_name = desc.name;
+
+	desc.pdata = dev->platform_data;
+
+	drvdata->csdev = coresight_register(&desc);
+	if (IS_ERR(drvdata->csdev))
+		return PTR_ERR(drvdata->csdev);
+
+	drvdata->miscdev.name = desc.name;
+	drvdata->miscdev.minor = MISC_DYNAMIC_MINOR;
+	drvdata->miscdev.fops = &tmc_fops;
+	ret = misc_register(&drvdata->miscdev);
+	if (ret)
+		coresight_unregister(drvdata->csdev);
+
+	return ret;
+}
+
+static void tmc_clear_self_claim_tag(struct tmc_drvdata *drvdata)
+{
+	struct csdev_access access = CSDEV_ACCESS_IOMEM(drvdata->base);
+
+	coresight_clear_self_claim_tag(&access);
+}
+
+static void tmc_init_hw_config(struct tmc_drvdata *drvdata)
+{
+	u32 devid;
+
+	devid = readl_relaxed(drvdata->base + CORESIGHT_DEVID);
+	drvdata->config_type = BMVAL(devid, 6, 7);
+	drvdata->memwidth = tmc_get_memwidth(devid);
+	drvdata->devid = devid;
+	drvdata->size = readl_relaxed(drvdata->base + TMC_RSZ) * 4;
+	tmc_clear_self_claim_tag(drvdata);
+}
+
+static void tmc_init_on_cpu(void *info)
+{
+	struct tmc_drvdata *drvdata = info;
+
+	tmc_init_hw_config(drvdata);
+}
+
+static struct cpumask *tmc_get_supported_cpus(struct device *dev)
+{
+	struct generic_pm_domain *pd;
+
+	pd = pd_to_genpd(dev->pm_domain);
+	if (pd)
+		return pd->cpus;
+
+	return NULL;
+}
+
+static int __tmc_probe(struct device *dev, struct resource *res)
+{
+	int cpu, ret = 0;
+	void __iomem *base;
+	struct coresight_platform_data *pdata = NULL;
+	struct tmc_drvdata *drvdata;
+
+	drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
+	if (!drvdata)
+		return -ENOMEM;
+
+	dev_set_drvdata(dev, drvdata);
+
+	ret = coresight_get_enable_clocks(dev, &drvdata->pclk, &drvdata->atclk);
+	if (ret)
+		return ret;
+
+	ret = -ENOMEM;
+
+	/* Validity for the resource is already checked by the AMBA core */
+	base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(base)) {
+		ret = PTR_ERR(base);
 		goto out;
 	}
 
+	drvdata->base = base;
+
+	raw_spin_lock_init(&drvdata->spinlock);
+	/* This device is not associated with a session */
+	drvdata->pid = -1;
+	drvdata->etr_mode = ETR_MODE_AUTO;
+	tmc_get_reserved_region(dev);
+
 	pdata = coresight_get_platform_data(dev);
 	if (IS_ERR(pdata)) {
 		ret = PTR_ERR(pdata);
 		goto out;
 	}
 	dev->platform_data = pdata;
-	desc.pdata = pdata;
 
-	coresight_clear_self_claim_tag(&desc.access);
-	drvdata->csdev = coresight_register(&desc);
-	if (IS_ERR(drvdata->csdev)) {
-		ret = PTR_ERR(drvdata->csdev);
-		goto out;
+	if (fwnode_property_present(dev_fwnode(dev), "qcom,cpu-bound-components")) {
+		drvdata->supported_cpus = tmc_get_supported_cpus(dev);
+		if (!drvdata->supported_cpus)
+			return -EINVAL;
+
+		cpus_read_lock();
+		for_each_cpu(cpu, drvdata->supported_cpus) {
+			ret = smp_call_function_single(cpu,
+					tmc_init_on_cpu, drvdata, 1);
+			if (!ret)
+				break;
+		}
+		cpus_read_unlock();
+		if (ret) {
+			ret = 0;
+			goto out;
+		}
+	} else {
+		tmc_init_hw_config(drvdata);
 	}
 
-	drvdata->miscdev.name = desc.name;
-	drvdata->miscdev.minor = MISC_DYNAMIC_MINOR;
-	drvdata->miscdev.fops = &tmc_fops;
-	ret = misc_register(&drvdata->miscdev);
-	if (ret) {
-		coresight_unregister(drvdata->csdev);
-		goto out;
+	if (drvdata->config_type == TMC_CONFIG_TYPE_ETR) {
+		drvdata->size = tmc_etr_get_default_buffer_size(dev);
+		drvdata->max_burst_size = tmc_etr_get_max_burst_size(dev);
 	}
 
+	ret = tmc_add_coresight_dev(dev);
+
 out:
 	if (is_tmc_crashdata_valid(drvdata) &&
 	    !tmc_prepare_crashdata(drvdata))
-		register_crash_dev_interface(drvdata, desc.name);
+		register_crash_dev_interface(drvdata, drvdata->desc_name);
 	return ret;
 }
 
@@ -934,10 +990,12 @@ static void __tmc_remove(struct device *dev)
 	 * etb fops in this case, device is there until last file
 	 * handler to this device is closed.
 	 */
-	misc_deregister(&drvdata->miscdev);
+	if (!drvdata->supported_cpus)
+		misc_deregister(&drvdata->miscdev);
 	if (drvdata->crashdev.fops)
 		misc_deregister(&drvdata->crashdev);
-	coresight_unregister(drvdata->csdev);
+	if (drvdata->csdev)
+		coresight_unregister(drvdata->csdev);
 }
 
 static void tmc_remove(struct amba_device *adev)
@@ -992,7 +1050,6 @@ static void tmc_platform_remove(struct platform_device *pdev)
 
 	if (WARN_ON(!drvdata))
 		return;
-
 	__tmc_remove(&pdev->dev);
 	pm_runtime_disable(&pdev->dev);
 }
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 95473d1310323425b7d136cbd46f118faa7256be..b104b7bf82d2a7a99382636e41d3718cf258d820 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -243,6 +243,9 @@ struct tmc_resrv_buf {
  *		(after crash) by default.
  * @crash_mdata: Reserved memory for storing tmc crash metadata.
  *		Used by ETR/ETF.
+ * @supported_cpus: Represent the CPUs related to this TMC.
+ * @devid:	TMC variant ID inferred from the device configuration register.
+ * @desc_name:	Name to be used while creating crash interface.
  */
 struct tmc_drvdata {
 	struct clk	*atclk;
@@ -273,6 +276,9 @@ struct tmc_drvdata {
 	struct etr_buf		*perf_buf;
 	struct tmc_resrv_buf	resrv_buf;
 	struct tmc_resrv_buf	crash_mdata;
+	struct cpumask		*supported_cpus;
+	u32			devid;
+	const char		*desc_name;
 };
 
 struct etr_buf_operations {
TMC-ETF devices associated with specific CPU clusters share the cluster's power domain. Accessing their registers requires the cluster to be powered on, which can only be guaranteed by running code on a CPU within that cluster.
Refactor the enablement logic to support this requirement:

1. Split `tmc_etf_enable_hw` and `tmc_etb_enable_hw` into local and
   SMP-aware variants:
   - `*_local`: performs the actual register access.
   - `*_smp_call`: wrapper for `smp_call_function_single`.
   - The main entry point now detects if the device is CPU-bound and
     uses `smp_call_function_single` to execute the local variant on an
     appropriate CPU if necessary.

2. Adjust locking in `tmc_enable_etf_sink_sysfs` and
   `tmc_enable_etf_link`:
   - Drop the spinlock before calling `tmc_etf_enable_hw`. This is
     necessary because `smp_call_function_single` (used for cross-CPU
     calls) may require interrupts enabled or might sleep/wait, which
     is unsafe under a spinlock.
   - Re-acquire the lock afterwards to update driver state.
Signed-off-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 87 ++++++++++++++++++++++---
 1 file changed, 77 insertions(+), 10 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 8882b1c4cdc05353fb2efd6a9ba862943048f0ff..11357788e9d93c53980e99e0ef78450e393f4059 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -47,7 +47,7 @@ static int __tmc_etb_enable_hw(struct tmc_drvdata *drvdata)
 	return rc;
 }
 
-static int tmc_etb_enable_hw(struct tmc_drvdata *drvdata)
+static int tmc_etb_enable_hw_local(struct tmc_drvdata *drvdata)
 {
 	int rc = coresight_claim_device(drvdata->csdev);
 
@@ -60,6 +60,36 @@ static int tmc_etb_enable_hw(struct tmc_drvdata *drvdata)
 	return rc;
 }
 
+struct tmc_smp_arg {
+	struct tmc_drvdata	*drvdata;
+	int			rc;
+};
+
+static void tmc_etb_enable_hw_smp_call(void *info)
+{
+	struct tmc_smp_arg *arg = info;
+
+	arg->rc = tmc_etb_enable_hw_local(arg->drvdata);
+}
+
+static int tmc_etb_enable_hw(struct tmc_drvdata *drvdata)
+{
+	int cpu, ret;
+	struct tmc_smp_arg arg = { 0 };
+
+	if (!drvdata->supported_cpus)
+		return tmc_etb_enable_hw_local(drvdata);
+
+	arg.drvdata = drvdata;
+	for_each_cpu(cpu, drvdata->supported_cpus) {
+		ret = smp_call_function_single(cpu,
+				tmc_etb_enable_hw_smp_call, &arg, 1);
+		if (!ret)
+			return arg.rc;
+	}
+	return ret;
+}
+
 static void tmc_etb_dump_hw(struct tmc_drvdata *drvdata)
 {
 	char *bufp;
@@ -130,7 +160,7 @@ static int __tmc_etf_enable_hw(struct tmc_drvdata *drvdata)
 	return rc;
 }
 
-static int tmc_etf_enable_hw(struct tmc_drvdata *drvdata)
+static int tmc_etf_enable_hw_local(struct tmc_drvdata *drvdata)
 {
 	int rc = coresight_claim_device(drvdata->csdev);
 
@@ -143,6 +173,32 @@ static int tmc_etf_enable_hw(struct tmc_drvdata *drvdata)
 	return rc;
 }
 
+static void tmc_etf_enable_hw_smp_call(void *info)
+{
+	struct tmc_smp_arg *arg = info;
+
+	arg->rc = tmc_etf_enable_hw_local(arg->drvdata);
+}
+
+static int tmc_etf_enable_hw(struct tmc_drvdata *drvdata)
+{
+	int cpu, ret;
+	struct tmc_smp_arg arg = { 0 };
+
+	if (!drvdata->supported_cpus)
+		return tmc_etf_enable_hw_local(drvdata);
+
+	arg.drvdata = drvdata;
+
+	for_each_cpu(cpu, drvdata->supported_cpus) {
+		ret = smp_call_function_single(cpu,
+				tmc_etf_enable_hw_smp_call, &arg, 1);
+		if (!ret)
+			return arg.rc;
+	}
+	return ret;
+}
+
 static void tmc_etf_disable_hw(struct tmc_drvdata *drvdata)
 {
 	struct coresight_device *csdev = drvdata->csdev;
@@ -228,7 +284,11 @@ static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev)
 		used = true;
 		drvdata->buf = buf;
 	}
+	raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
+
 	ret = tmc_etb_enable_hw(drvdata);
+
+	raw_spin_lock_irqsave(&drvdata->spinlock, flags);
 	if (!ret) {
 		coresight_set_mode(csdev, CS_MODE_SYSFS);
 		csdev->refcnt++;
@@ -291,7 +351,11 @@ static int tmc_enable_etf_sink_perf(struct coresight_device *csdev,
 			break;
 		}
 
-		ret = tmc_etb_enable_hw(drvdata);
+		if (drvdata->supported_cpus &&
+		    !cpumask_test_cpu(smp_processor_id(), drvdata->supported_cpus))
+			break;
+
+		ret = tmc_etb_enable_hw_local(drvdata);
 		if (!ret) {
 			/* Associate with monitored process. */
 			drvdata->pid = pid;
@@ -376,19 +440,22 @@ static int tmc_enable_etf_link(struct coresight_device *csdev,
 		return -EBUSY;
 	}
 
-	if (csdev->refcnt == 0) {
+	if (csdev->refcnt == 0)
+		first_enable = true;
+
+	if (!first_enable)
+		csdev->refcnt++;
+
+	raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
+
+	if (first_enable) {
 		ret = tmc_etf_enable_hw(drvdata);
 		if (!ret) {
 			coresight_set_mode(csdev, CS_MODE_SYSFS);
-			first_enable = true;
+			csdev->refcnt++;
+			dev_dbg(&csdev->dev, "TMC-ETF enabled\n");
 		}
 	}
-	if (!ret)
-		csdev->refcnt++;
-	raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-	if (first_enable)
-		dev_dbg(&csdev->dev, "TMC-ETF enabled\n");
 	return ret;
 }
The current TMC management interface (sysfs attributes) assumes that device registers can be accessed directly from any CPU. However, for TMCs associated with specific CPU clusters, registers must be accessed from a CPU within that cluster.
Replace the standard `coresight_simple_reg*` handlers with custom
accessors (`coresight_tmc_reg*`). These new handlers check if the TMC
is bound to a specific set of CPUs:
- If bound, they use `smp_call_function_single()` to read the register
  on an appropriate CPU.
- If not bound (global TMC), they fall back to direct access.
This ensures correct register reads for per-cluster TMC devices while maintaining backward compatibility for global TMCs.
Signed-off-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-core.c | 137 ++++++++++++++++++++---
 1 file changed, 123 insertions(+), 14 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
index 0e1b5956398d3cefdd938a8a8404076eb4850b44..5b9f2e57c78f42f0f1460d8a8dcbac72b5f6085e 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-core.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
@@ -458,21 +458,130 @@ static enum tmc_mem_intf_width tmc_get_memwidth(u32 devid)
 	return memwidth;
 }
 
+struct tmc_smp_arg {
+	struct tmc_drvdata	*drvdata;
+	u32			offset;
+	int			rc;
+};
+
+static void tmc_read_reg_smp_call(void *info)
+{
+	struct tmc_smp_arg *arg = info;
+
+	arg->rc = readl_relaxed(arg->drvdata->base + arg->offset);
+}
+
+static u32 cpu_tmc_read_reg(struct tmc_drvdata *drvdata, u32 offset)
+{
+	struct tmc_smp_arg arg = {
+		.drvdata = drvdata,
+		.offset = offset,
+	};
+	int cpu, ret = 0;
+
+	for_each_cpu(cpu, drvdata->supported_cpus) {
+		ret = smp_call_function_single(cpu,
+				tmc_read_reg_smp_call, &arg, 1);
+		if (!ret)
+			return arg.rc;
+	}
+
+	return ret;
+}
+
+static ssize_t coresight_tmc_reg32_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct tmc_drvdata *drvdata = dev_get_drvdata(dev->parent);
+	struct cs_off_attribute *cs_attr = container_of(attr, struct cs_off_attribute, attr);
+	int ret;
+	u32 val;
+
+	ret = pm_runtime_resume_and_get(dev->parent);
+	if (ret < 0)
+		return ret;
+
+	if (!drvdata->supported_cpus)
+		val = readl_relaxed(drvdata->base + cs_attr->off);
+	else
+		val = cpu_tmc_read_reg(drvdata, cs_attr->off);
+
+	pm_runtime_put(dev->parent);
+
+	if (ret < 0)
+		return ret;
+	else
+		return sysfs_emit(buf, "0x%x\n", val);
+}
+
+static ssize_t coresight_tmc_reg64_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct tmc_drvdata *drvdata = dev_get_drvdata(dev->parent);
+	struct cs_pair_attribute *cs_attr = container_of(attr, struct cs_pair_attribute, attr);
+	int ret;
+	u64 val;
+
+	ret = pm_runtime_resume_and_get(dev->parent);
+	if (ret < 0)
+		return ret;
+	if (!drvdata->supported_cpus) {
+		val = readl_relaxed(drvdata->base + cs_attr->lo_off) |
+			((u64)readl_relaxed(drvdata->base + cs_attr->hi_off) << 32);
+	} else {
+		ret = cpu_tmc_read_reg(drvdata, cs_attr->lo_off);
+
+		if (ret < 0)
+			goto out;
+
+		val = ret;
+
+		ret = cpu_tmc_read_reg(drvdata, cs_attr->hi_off);
+		if (ret < 0)
+			goto out;
+
+		val |= ((u64)ret << 32);
+	}
+
+out:
+	pm_runtime_put_sync(dev->parent);
+	if (ret < 0)
+		return ret;
+	else
+		return sysfs_emit(buf, "0x%llx\n", val);
+}
+
+#define coresight_tmc_reg32(name, offset)				\
+	(&((struct cs_off_attribute[]) {				\
+	   {								\
+		__ATTR(name, 0444, coresight_tmc_reg32_show, NULL),	\
+		offset							\
+	   }								\
+	})[0].attr.attr)
+#define coresight_tmc_reg64(name, lo_off, hi_off)			\
+	(&((struct cs_pair_attribute[]) {				\
+	   {								\
+		__ATTR(name, 0444, coresight_tmc_reg64_show, NULL),	\
+		lo_off, hi_off						\
+	   }								\
+	})[0].attr.attr)
+
 static struct attribute *coresight_tmc_mgmt_attrs[] = {
-	coresight_simple_reg32(rsz, TMC_RSZ),
-	coresight_simple_reg32(sts, TMC_STS),
-	coresight_simple_reg64(rrp, TMC_RRP, TMC_RRPHI),
-	coresight_simple_reg64(rwp, TMC_RWP, TMC_RWPHI),
-	coresight_simple_reg32(trg, TMC_TRG),
-	coresight_simple_reg32(ctl, TMC_CTL),
-	coresight_simple_reg32(ffsr, TMC_FFSR),
-	coresight_simple_reg32(ffcr, TMC_FFCR),
-	coresight_simple_reg32(mode, TMC_MODE),
-	coresight_simple_reg32(pscr, TMC_PSCR),
-	coresight_simple_reg32(devid, CORESIGHT_DEVID),
-	coresight_simple_reg64(dba, TMC_DBALO, TMC_DBAHI),
-	coresight_simple_reg32(axictl, TMC_AXICTL),
-	coresight_simple_reg32(authstatus, TMC_AUTHSTATUS),
+	coresight_tmc_reg32(rsz, TMC_RSZ),
+	coresight_tmc_reg32(sts, TMC_STS),
+	coresight_tmc_reg64(rrp, TMC_RRP, TMC_RRPHI),
+	coresight_tmc_reg64(rwp, TMC_RWP, TMC_RWPHI),
+	coresight_tmc_reg32(trg, TMC_TRG),
+	coresight_tmc_reg32(ctl, TMC_CTL),
+	coresight_tmc_reg32(ffsr, TMC_FFSR),
+	coresight_tmc_reg32(ffcr, TMC_FFCR),
+	coresight_tmc_reg32(mode, TMC_MODE),
+	coresight_tmc_reg32(pscr, TMC_PSCR),
+	coresight_tmc_reg32(devid, CORESIGHT_DEVID),
+	coresight_tmc_reg64(dba, TMC_DBALO, TMC_DBAHI),
+	coresight_tmc_reg32(axictl, TMC_AXICTL),
+	coresight_tmc_reg32(authstatus, TMC_AUTHSTATUS),
 	NULL,
 };
On some platforms, the TMC driver may probe before the associated CPUs are online. This prevents the driver from safely accessing the hardware or configuring it via smp_call_function_single(), which requires the target CPU to be online.
To address this, defer the hardware initialization if the associated CPUs are offline:

1. Track such deferred devices in a global list.
2. Register a CPU hotplug callback (`tmc_online_cpu`) to detect when a relevant CPU comes online.
3. Upon CPU online, retry the hardware initialization and registration for the waiting TMC devices.
Signed-off-by: Yuanfang Zhang yuanfang.zhang@oss.qualcomm.com --- drivers/hwtracing/coresight/coresight-tmc-core.c | 59 +++++++++++++++++++++++- drivers/hwtracing/coresight/coresight-tmc.h | 4 ++ 2 files changed, 61 insertions(+), 2 deletions(-)
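[Editor's note: for orientation before the flattened diff below, a minimal sketch of the deferral pattern; `delayed_probes` and `finish_probe()` are simplified, hypothetical names, not the literal patch.]

```c
static LIST_HEAD(delayed_probes);	/* drvdata waiting for a CPU */
static DEFINE_SPINLOCK(delay_lock);

/* CPU hotplug callback: finish probing devices whose cluster just gained a CPU. */
static int tmc_online_cpu(unsigned int cpu)
{
	struct tmc_drvdata *drvdata, *tmp;

	spin_lock(&delay_lock);
	list_for_each_entry_safe(drvdata, tmp, &delayed_probes, link) {
		if (!cpumask_test_cpu(cpu, drvdata->supported_cpus))
			continue;
		list_del(&drvdata->link);
		/* Drop the lock for the slow path, as the real hunk does. */
		spin_unlock(&delay_lock);
		finish_probe(drvdata);	/* init hw, register coresight dev */
		spin_lock(&delay_lock);
	}
	spin_unlock(&delay_lock);
	return 0;
}
```

The callback is registered once at module init via cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, ...), as the diff does.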
diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c index 5b9f2e57c78f42f0f1460d8a8dcbac72b5f6085e..9182fa8e4074a7c9739494b2f5d59be2e96f1d3d 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-core.c +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c @@ -36,6 +36,9 @@ DEFINE_CORESIGHT_DEVLIST(etb_devs, "tmc_etb"); DEFINE_CORESIGHT_DEVLIST(etf_devs, "tmc_etf"); DEFINE_CORESIGHT_DEVLIST(etr_devs, "tmc_etr"); +static LIST_HEAD(tmc_delay_probe); +static enum cpuhp_state hp_online; +static DEFINE_SPINLOCK(delay_lock);
int tmc_wait_for_tmcready(struct tmc_drvdata *drvdata) { @@ -1027,6 +1030,8 @@ static int __tmc_probe(struct device *dev, struct resource *res) if (!drvdata->supported_cpus) return -EINVAL;
+ drvdata->dev = dev; + cpus_read_lock(); for_each_cpu(cpu, drvdata->supported_cpus) { ret = smp_call_function_single(cpu, @@ -1034,11 +1039,16 @@ static int __tmc_probe(struct device *dev, struct resource *res) if (!ret) break; } - cpus_read_unlock(); + if (ret) { + scoped_guard(spinlock, &delay_lock) + list_add(&drvdata->link, &tmc_delay_probe); + cpus_read_unlock(); ret = 0; goto out; } + + cpus_read_unlock(); } else { tmc_init_hw_config(drvdata); } @@ -1103,8 +1113,12 @@ static void __tmc_remove(struct device *dev) misc_deregister(&drvdata->miscdev); if (drvdata->crashdev.fops) misc_deregister(&drvdata->crashdev); - if (drvdata->csdev) + if (drvdata->csdev) { coresight_unregister(drvdata->csdev); + } else { + scoped_guard(spinlock, &delay_lock) + list_del(&drvdata->link); + } }
static void tmc_remove(struct amba_device *adev) @@ -1215,14 +1229,55 @@ static struct platform_driver tmc_platform_driver = { }, };
+static int tmc_online_cpu(unsigned int cpu) +{ + struct tmc_drvdata *drvdata, *tmp; + int ret; + + spin_lock(&delay_lock); + list_for_each_entry_safe(drvdata, tmp, &tmc_delay_probe, link) { + if (cpumask_test_cpu(cpu, drvdata->supported_cpus)) { + list_del(&drvdata->link); + + spin_unlock(&delay_lock); + ret = pm_runtime_resume_and_get(drvdata->dev); + if (ret < 0) + return 0; + + tmc_init_hw_config(drvdata); + tmc_clear_self_claim_tag(drvdata); + tmc_add_coresight_dev(drvdata->dev); + pm_runtime_put(drvdata->dev); + spin_lock(&delay_lock); + } + } + spin_unlock(&delay_lock); + return 0; +} + static int __init tmc_init(void) { + int ret; + + ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, + "arm/coresight-tmc:online", + tmc_online_cpu, NULL); + + if (ret > 0) + hp_online = ret; + else + return ret; + return coresight_init_driver("tmc", &tmc_driver, &tmc_platform_driver, THIS_MODULE); }
static void __exit tmc_exit(void) { coresight_remove_driver(&tmc_driver, &tmc_platform_driver); + if (hp_online) { + cpuhp_remove_state_nocalls(hp_online); + hp_online = 0; + } } module_init(tmc_init); module_exit(tmc_exit); diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index b104b7bf82d2a7a99382636e41d3718cf258d820..2583bc4f556195cd814e674dc66f08909dea61b2 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -246,6 +246,8 @@ struct tmc_resrv_buf { * @supported_cpus: Represent the CPUs related to this TMC. * @devid: TMC variant ID inferred from the device configuration register. * @desc_name: Name to be used while creating crash interface. + * @dev: pointer to the device associated with this TMC. + * @link: link to the delay_probed list. */ struct tmc_drvdata { struct clk *atclk; @@ -279,6 +281,8 @@ struct tmc_drvdata { struct cpumask *supported_cpus; u32 devid; const char *desc_name; + struct device *dev; + struct list_head link; };
struct etr_buf_operations {
Currently, the link enable callback does not receive the CoreSight mode (enum cs_mode). This prevents link drivers from knowing whether they are being enabled for SysFS or Perf.
This distinction is crucial because Perf mode runs in atomic context, where certain operations (like smp_call_function_single()) are unsafe. Without knowing the mode, drivers cannot conditionally avoid these unsafe calls.
Update the `enable` callback in `struct coresight_ops_link` to accept `enum cs_mode`. This allows drivers to implement mode-specific logic, such as using atomic-safe enablement sequences when running in Perf mode. Update all call sites and driver implementations accordingly.
Signed-off-by: Yuanfang Zhang yuanfang.zhang@oss.qualcomm.com --- drivers/hwtracing/coresight/coresight-core.c | 7 ++++--- drivers/hwtracing/coresight/coresight-funnel.c | 21 +++++++++++++++++++- drivers/hwtracing/coresight/coresight-replicator.c | 23 +++++++++++++++++++++- drivers/hwtracing/coresight/coresight-tmc-etf.c | 19 +++++++++++++++++- drivers/hwtracing/coresight/coresight-tnoc.c | 3 ++- drivers/hwtracing/coresight/coresight-tpda.c | 3 ++- include/linux/coresight.h | 3 ++- 7 files changed, 70 insertions(+), 9 deletions(-)
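[Editor's note: illustratively, a link driver can now branch on the mode. A sketch with hypothetical helpers (enable_hw_on_this_cpu(), enable_hw_via_cluster_cpu()), not the literal patch below.]

```c
static int funnel_enable(struct coresight_device *csdev,
			 struct coresight_connection *in,
			 struct coresight_connection *out,
			 enum cs_mode mode)
{
	struct funnel_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);

	if (mode == CS_MODE_PERF) {
		/*
		 * Perf mode may run in atomic context: only touch the
		 * registers if we are already on a CPU inside the cluster,
		 * and never via smp_call_function_single().
		 */
		return enable_hw_on_this_cpu(drvdata, in, out);
	}

	/* SysFS mode runs in process context; cross-CPU calls are safe. */
	return enable_hw_via_cluster_cpu(drvdata, in, out);
}
```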
diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c index c660cf8adb1c7cafff8f85e501f056e4e151e372..1863bdb57281b4fd405cf966d565c581506ea270 100644 --- a/drivers/hwtracing/coresight/coresight-core.c +++ b/drivers/hwtracing/coresight/coresight-core.c @@ -314,7 +314,8 @@ static void coresight_disable_sink(struct coresight_device *csdev) static int coresight_enable_link(struct coresight_device *csdev, struct coresight_device *parent, struct coresight_device *child, - struct coresight_device *source) + struct coresight_device *source, + enum cs_mode mode) { int link_subtype; struct coresight_connection *inconn, *outconn; @@ -331,7 +332,7 @@ static int coresight_enable_link(struct coresight_device *csdev, if (link_subtype == CORESIGHT_DEV_SUBTYPE_LINK_SPLIT && IS_ERR(outconn)) return PTR_ERR(outconn);
- return link_ops(csdev)->enable(csdev, inconn, outconn); + return link_ops(csdev)->enable(csdev, inconn, outconn, mode); }
static void coresight_disable_link(struct coresight_device *csdev, @@ -550,7 +551,7 @@ int coresight_enable_path(struct coresight_path *path, enum cs_mode mode) case CORESIGHT_DEV_TYPE_LINK: parent = list_prev_entry(nd, link)->csdev; child = list_next_entry(nd, link)->csdev; - ret = coresight_enable_link(csdev, parent, child, source); + ret = coresight_enable_link(csdev, parent, child, source, mode); if (ret) goto err_disable_helpers; break; diff --git a/drivers/hwtracing/coresight/coresight-funnel.c b/drivers/hwtracing/coresight/coresight-funnel.c index 5d114ce1109f4f9a8b108110bdae258f216881d8..c50522c2854c7193a8c30b1a603abe566a1c1ccf 100644 --- a/drivers/hwtracing/coresight/coresight-funnel.c +++ b/drivers/hwtracing/coresight/coresight-funnel.c @@ -121,7 +121,8 @@ static int funnel_enable_hw(struct funnel_drvdata *drvdata, int port)
static int funnel_enable(struct coresight_device *csdev, struct coresight_connection *in, - struct coresight_connection *out) + struct coresight_connection *out, + enum cs_mode mode) { int rc = 0; struct funnel_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); @@ -135,6 +136,23 @@ static int funnel_enable(struct coresight_device *csdev, else in->dest_refcnt++;
+ if (mode == CS_MODE_PERF) { + if (first_enable) { + if (drvdata->supported_cpus && + !cpumask_test_cpu(smp_processor_id(), drvdata->supported_cpus)) { + raw_spin_unlock_irqrestore(&drvdata->spinlock, flags); + return -EINVAL; + } + + if (drvdata->base) + rc = dynamic_funnel_enable_hw(drvdata, in->dest_port); + if (!rc) + in->dest_refcnt++; + } + raw_spin_unlock_irqrestore(&drvdata->spinlock, flags); + return rc; + } + raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
if (first_enable) { @@ -183,6 +201,7 @@ static void funnel_disable(struct coresight_device *csdev, dynamic_funnel_disable_hw(drvdata, in->dest_port); last_disable = true; } + raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
if (last_disable) diff --git a/drivers/hwtracing/coresight/coresight-replicator.c b/drivers/hwtracing/coresight/coresight-replicator.c index a9f22d0e15de21aa06c8d1e193e5db06091efd75..cc7d3916b8b9d5d342d6cde0487722eeb8dee78b 100644 --- a/drivers/hwtracing/coresight/coresight-replicator.c +++ b/drivers/hwtracing/coresight/coresight-replicator.c @@ -199,7 +199,8 @@ static int replicator_enable_hw(struct replicator_drvdata *drvdata,
static int replicator_enable(struct coresight_device *csdev, struct coresight_connection *in, - struct coresight_connection *out) + struct coresight_connection *out, + enum cs_mode mode) { int rc = 0; struct replicator_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); @@ -212,6 +213,25 @@ static int replicator_enable(struct coresight_device *csdev, first_enable = true; else out->src_refcnt++; + + if (mode == CS_MODE_PERF) { + if (first_enable) { + if (drvdata->supported_cpus && + !cpumask_test_cpu(smp_processor_id(), drvdata->supported_cpus)) { + raw_spin_unlock_irqrestore(&drvdata->spinlock, flags); + return -EINVAL; + } + + if (drvdata->base) + rc = dynamic_replicator_enable(drvdata, in->dest_port, + out->src_port); + if (!rc) + out->src_refcnt++; + } + raw_spin_unlock_irqrestore(&drvdata->spinlock, flags); + return rc; + } + raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
if (first_enable) { @@ -272,6 +292,7 @@ static void replicator_disable(struct coresight_device *csdev, out->src_port); last_disable = true; } + raw_spin_unlock_irqrestore(&drvdata->spinlock, flags);
if (last_disable) diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 11357788e9d93c53980e99e0ef78450e393f4059..f1b8264b4e5c8a8d38778c25515cbf557c0993b7 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -427,7 +427,8 @@ static int tmc_disable_etf_sink(struct coresight_device *csdev)
static int tmc_enable_etf_link(struct coresight_device *csdev, struct coresight_connection *in, - struct coresight_connection *out) + struct coresight_connection *out, + enum cs_mode mode) { int ret = 0; unsigned long flags; @@ -446,6 +447,22 @@ static int tmc_enable_etf_link(struct coresight_device *csdev, if (!first_enable) csdev->refcnt++;
+ if (mode == CS_MODE_PERF) { + if (first_enable) { + if (drvdata->supported_cpus && + !cpumask_test_cpu(smp_processor_id(), drvdata->supported_cpus)) { + raw_spin_unlock_irqrestore(&drvdata->spinlock, flags); + return -EINVAL; + } + + ret = tmc_etf_enable_hw_local(drvdata); + if (!ret) + csdev->refcnt++; + } + raw_spin_unlock_irqrestore(&drvdata->spinlock, flags); + return ret; + } + raw_spin_unlock_irqrestore(&drvdata->spinlock, flags); if (first_enable) { ret = tmc_etf_enable_hw(drvdata); diff --git a/drivers/hwtracing/coresight/coresight-tnoc.c b/drivers/hwtracing/coresight/coresight-tnoc.c index ff9a0a9cfe96e5f5e3077c750ea2f890cdd50d94..48e9e685b9439d92bdaae9e40d3b3bc2d1ac1cd2 100644 --- a/drivers/hwtracing/coresight/coresight-tnoc.c +++ b/drivers/hwtracing/coresight/coresight-tnoc.c @@ -73,7 +73,8 @@ static void trace_noc_enable_hw(struct trace_noc_drvdata *drvdata) }
static int trace_noc_enable(struct coresight_device *csdev, struct coresight_connection *inport, - struct coresight_connection *outport) + struct coresight_connection *outport, + enum cs_mode mode) { struct trace_noc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
diff --git a/drivers/hwtracing/coresight/coresight-tpda.c b/drivers/hwtracing/coresight/coresight-tpda.c index 3a3825d27f861585ca1d847929747f8096004089..e6f52abc5b023a997c36d74c0e3b1a3de8236ba2 100644 --- a/drivers/hwtracing/coresight/coresight-tpda.c +++ b/drivers/hwtracing/coresight/coresight-tpda.c @@ -190,7 +190,8 @@ static int __tpda_enable(struct tpda_drvdata *drvdata, int port)
static int tpda_enable(struct coresight_device *csdev, struct coresight_connection *in, - struct coresight_connection *out) + struct coresight_connection *out, + enum cs_mode mode) { struct tpda_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); int ret = 0; diff --git a/include/linux/coresight.h b/include/linux/coresight.h index 2b48be97fcd0d7ea2692206692bd33f35ba4ec79..218eb1d1dcef61f5d98ebbfff38370192b8a6e45 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -383,7 +383,8 @@ struct coresight_ops_sink { struct coresight_ops_link { int (*enable)(struct coresight_device *csdev, struct coresight_connection *in, - struct coresight_connection *out); + struct coresight_connection *out, + enum cs_mode mode); void (*disable)(struct coresight_device *csdev, struct coresight_connection *in, struct coresight_connection *out);
From: Jie Gan jie.gan@oss.qualcomm.com
The APSS debug block is built with CoreSight devices like ETM, replicator, funnel and TMC ETF. Add DT nodes for these devices to enable ETM tracing.
Signed-off-by: Jie Gan jie.gan@oss.qualcomm.com Co-developed-by: Yuanfang Zhang yuanfang.zhang@oss.qualcomm.com Signed-off-by: Yuanfang Zhang yuanfang.zhang@oss.qualcomm.com --- arch/arm64/boot/dts/qcom/hamoa.dtsi | 926 ++++++++++++++++++++++++++++++++++++ arch/arm64/boot/dts/qcom/purwa.dtsi | 12 + 2 files changed, 938 insertions(+)
diff --git a/arch/arm64/boot/dts/qcom/hamoa.dtsi b/arch/arm64/boot/dts/qcom/hamoa.dtsi index a17900eacb20396a9792efcfcd6ce6dd877435d1..8c3de8bf058daa681db040c4a9a38253863e6c78 100644 --- a/arch/arm64/boot/dts/qcom/hamoa.dtsi +++ b/arch/arm64/boot/dts/qcom/hamoa.dtsi @@ -305,6 +305,210 @@ eud_in: endpoint { }; };
+ etm-0 { + compatible = "arm,coresight-etm4x-sysreg"; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + cpu = <&cpu0>; + qcom,skip-power-up; + + out-ports { + port { + etm0_out: endpoint { + remote-endpoint = <&ncc0_0_rep_in>; + }; + }; + }; + }; + + etm-1 { + compatible = "arm,coresight-etm4x-sysreg"; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + cpu = <&cpu1>; + qcom,skip-power-up; + + out-ports { + port { + etm1_out: endpoint { + remote-endpoint = <&ncc0_1_rep_in>; + }; + }; + }; + }; + + etm-2 { + compatible = "arm,coresight-etm4x-sysreg"; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + cpu = <&cpu2>; + qcom,skip-power-up; + + out-ports { + port { + etm2_out: endpoint { + remote-endpoint = <&ncc0_2_rep_in>; + }; + }; + }; + }; + + etm-3 { + compatible = "arm,coresight-etm4x-sysreg"; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + cpu = <&cpu3>; + qcom,skip-power-up; + + out-ports { + port { + etm3_out: endpoint { + remote-endpoint = <&ncc0_3_rep_in>; + }; + }; + }; + }; + + etm-4 { + compatible = "arm,coresight-etm4x-sysreg"; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + cpu = <&cpu4>; + qcom,skip-power-up; + + out-ports { + port { + etm4_out: endpoint { + remote-endpoint = <&ncc1_0_rep_in>; + }; + }; + }; + }; + + etm-5 { + compatible = "arm,coresight-etm4x-sysreg"; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + cpu = <&cpu5>; + qcom,skip-power-up; + + out-ports { + port { + etm5_out: endpoint { + remote-endpoint = <&ncc1_1_rep_in>; + }; + }; + }; + }; + + etm-6 { + compatible = "arm,coresight-etm4x-sysreg"; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + cpu = <&cpu6>; + qcom,skip-power-up; + + out-ports { + port { + etm6_out: endpoint { + remote-endpoint = <&ncc1_2_rep_in>; + }; + }; + }; + }; + + etm-7 { + compatible = "arm,coresight-etm4x-sysreg"; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + cpu = <&cpu7>; + qcom,skip-power-up; + + out-ports { + port { + etm7_out: endpoint { + remote-endpoint = <&ncc1_3_rep_in>; + }; + }; + }; + }; + + etm8: etm-8 { + compatible = "arm,coresight-etm4x-sysreg"; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + cpu = <&cpu8>; + qcom,skip-power-up; + + out-ports { + port { + etm8_out: endpoint { + remote-endpoint = <&ncc2_0_rep_in>; + }; + }; + }; + }; + + etm9: etm-9 { + compatible = "arm,coresight-etm4x-sysreg"; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + cpu = <&cpu9>; + qcom,skip-power-up; + + out-ports { + port { + etm9_out: endpoint { + remote-endpoint = <&ncc2_1_rep_in>; + }; + }; + }; + }; + + etm10: etm-10 { + compatible = "arm,coresight-etm4x-sysreg"; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + cpu = <&cpu10>; + qcom,skip-power-up; + + out-ports { + port { + etm10_out: endpoint { + remote-endpoint = <&ncc2_2_rep_in>; + }; + }; + }; + }; + + etm11: etm-11 { + compatible = "arm,coresight-etm4x-sysreg"; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + cpu = <&cpu11>; + qcom,skip-power-up; + + out-ports { + port { + etm11_out: endpoint { + remote-endpoint = <&ncc2_3_rep_in>; + }; + }; + }; + }; + firmware { scm: scm { compatible = "qcom,scm-x1e80100", "qcom,scm"; @@ -6864,6 +7068,14 @@ funnel1_in2: endpoint { }; };
+ port@4 { + reg = <4>; + + funnel1_in4: endpoint { + remote-endpoint = <&apss_funnel_out>; + }; + }; + port@5 { reg = <5>;
@@ -8154,6 +8366,720 @@ ddr_funnel1_out: endpoint { }; };
+ apss_funnel: funnel@12080000 { + compatible = "arm,coresight-dynamic-funnel", "arm,primecell"; + reg = <0x0 0x12080000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + + in-ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + + apss_funnel_in0: endpoint { + remote-endpoint = <&ncc0_etf_out>; + }; + }; + + port@1 { + reg = <1>; + + apss_funnel_in1: endpoint { + remote-endpoint = <&ncc1_etf_out>; + }; + }; + + port@2 { + reg = <2>; + + apss_funnel_in2: endpoint { + remote-endpoint = <&ncc2_etf_out>; + }; + }; + }; + + out-ports { + port { + apss_funnel_out: endpoint { + remote-endpoint = <&funnel1_in4>; + }; + }; + }; + }; + + funnel@13401000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb908>; + reg = <0x0 0x13401000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd0>; + qcom,cpu-bound-components; + + in-ports { + #address-cells = <1>; + #size-cells = <0>; + + port@2 { + reg = <2>; + + ncc0_2_funnel_in2: endpoint { + remote-endpoint = <&ncc0_1_funnel_out>; + }; + }; + }; + + out-ports { + port { + ncc0_2_funnel_out: endpoint { + remote-endpoint = <&ncc0_etf_in>; + }; + }; + }; + }; + + tmc@13409000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb961>; + reg = <0x0 0x13409000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd0>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc0_etf_in: endpoint { + remote-endpoint = <&ncc0_2_funnel_out>; + }; + }; + }; + + out-ports { + port { + ncc0_etf_out: endpoint { + remote-endpoint = <&apss_funnel_in0>; + }; + }; + }; + }; + + replicator@13490000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb909>; + reg = <0x0 0x13490000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd0>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc0_0_rep_in: endpoint { + remote-endpoint = <&etm0_out>; + }; + }; + }; + + out-ports { + port { + ncc0_0_rep_out: endpoint { + remote-endpoint = <&ncc0_1_funnel_in0>; + }; + }; + }; + }; + + replicator@134a0000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb909>; + reg = <0x0 0x134a0000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd0>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc0_1_rep_in: endpoint { + remote-endpoint = <&etm1_out>; + }; + }; + }; + + out-ports { + port { + ncc0_1_rep_out: endpoint { + remote-endpoint = <&ncc0_1_funnel_in1>; + }; + }; + }; + }; + + replicator@134b0000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb909>; + reg = <0x0 0x134b0000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd0>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc0_2_rep_in: endpoint { + remote-endpoint = <&etm2_out>; + }; + }; + }; + + out-ports { + port { + ncc0_2_rep_out: endpoint { + remote-endpoint = <&ncc0_1_funnel_in2>; + }; + }; + }; + }; + + replicator@134c0000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb909>; + reg = <0x0 0x134c0000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd0>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc0_3_rep_in: endpoint { + remote-endpoint = <&etm3_out>; + }; + }; + }; + + out-ports { + port { + ncc0_3_rep_out: endpoint { + remote-endpoint = <&ncc0_1_funnel_in3>; + }; + }; + }; 
+ }; + + funnel@134d0000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb908>; + reg = <0x0 0x134d0000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd0>; + qcom,cpu-bound-components; + + in-ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + + ncc0_1_funnel_in0: endpoint { + remote-endpoint = <&ncc0_0_rep_out>; + }; + }; + + port@1 { + reg = <1>; + + ncc0_1_funnel_in1: endpoint { + remote-endpoint = <&ncc0_1_rep_out>; + }; + }; + + port@2 { + reg = <2>; + + ncc0_1_funnel_in2: endpoint { + remote-endpoint = <&ncc0_2_rep_out>; + }; + }; + + port@3 { + reg = <3>; + + ncc0_1_funnel_in3: endpoint { + remote-endpoint = <&ncc0_3_rep_out>; + }; + }; + }; + + out-ports { + port { + ncc0_1_funnel_out: endpoint { + remote-endpoint = <&ncc0_2_funnel_in2>; + }; + }; + }; + }; + + funnel@13901000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb908>; + reg = <0x0 0x13901000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd1>; + qcom,cpu-bound-components; + + in-ports { + #address-cells = <1>; + #size-cells = <0>; + + port@2 { + reg = <2>; + + ncc1_2_funnel_in2: endpoint { + remote-endpoint = <&ncc1_1_funnel_out>; + }; + }; + }; + + out-ports { + port { + ncc1_2_funnel_out: endpoint { + remote-endpoint = <&ncc1_etf_in>; + }; + }; + }; + }; + + tmc@13909000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb961>; + reg = <0x0 0x13909000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd1>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc1_etf_in: endpoint { + remote-endpoint = <&ncc1_2_funnel_out>; + }; + }; + }; + + out-ports { + port { + ncc1_etf_out: endpoint { + remote-endpoint = <&apss_funnel_in1>; + }; + }; + }; + }; + + replicator@13990000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb909>; + reg = <0x0 0x13990000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd1>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc1_0_rep_in: endpoint { + remote-endpoint = <&etm4_out>; + }; + }; + }; + + out-ports { + port { + ncc1_0_rep_out: endpoint { + remote-endpoint = <&ncc1_1_funnel_in0>; + }; + }; + }; + }; + + replicator@139a0000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb909>; + reg = <0x0 0x139a0000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd1>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc1_1_rep_in: endpoint { + remote-endpoint = <&etm5_out>; + }; + }; + }; + + out-ports { + port { + ncc1_1_rep_out: endpoint { + remote-endpoint = <&ncc1_1_funnel_in1>; + }; + }; + }; + }; + + replicator@139b0000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb909>; + reg = <0x0 0x139b0000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd1>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc1_2_rep_in: endpoint { + remote-endpoint = <&etm6_out>; + }; + }; + }; + + out-ports { + port { + ncc1_2_rep_out: endpoint { + remote-endpoint = <&ncc1_1_funnel_in2>; + }; + }; + }; + }; + + replicator@139c0000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb909>; + reg = <0x0 0x139c0000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd1>; + qcom,cpu-bound-components; + + in-ports { + 
port { + ncc1_3_rep_in: endpoint { + remote-endpoint = <&etm7_out>; + }; + }; + }; + + out-ports { + port { + ncc1_3_rep_out: endpoint { + remote-endpoint = <&ncc1_1_funnel_in3>; + }; + }; + }; + }; + + funnel@139d0000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb908>; + reg = <0x0 0x139d0000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd1>; + qcom,cpu-bound-components; + + in-ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + + ncc1_1_funnel_in0: endpoint { + remote-endpoint = <&ncc1_0_rep_out>; + }; + }; + + port@1 { + reg = <1>; + + ncc1_1_funnel_in1: endpoint { + remote-endpoint = <&ncc1_1_rep_out>; + }; + }; + + port@2 { + reg = <2>; + + ncc1_1_funnel_in2: endpoint { + remote-endpoint = <&ncc1_2_rep_out>; + }; + }; + + port@3 { + reg = <3>; + + ncc1_1_funnel_in3: endpoint { + remote-endpoint = <&ncc1_3_rep_out>; + }; + }; + }; + + out-ports { + port { + ncc1_1_funnel_out: endpoint { + remote-endpoint = <&ncc1_2_funnel_in2>; + }; + }; + }; + }; + + cluster2_funnel_l2: funnel@13e01000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb908>; + reg = <0x0 0x13e01000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd2>; + qcom,cpu-bound-components; + + in-ports { + #address-cells = <1>; + #size-cells = <0>; + + port@2 { + reg = <2>; + + ncc2_2_funnel_in2: endpoint { + remote-endpoint = <&ncc2_1_funnel_out>; + }; + }; + }; + + out-ports { + port { + ncc2_2_funnel_out: endpoint { + remote-endpoint = <&ncc2_etf_in>; + }; + }; + }; + }; + + cluster2_etf: tmc@13e09000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb961>; + reg = <0x0 0x13e09000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd2>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc2_etf_in: endpoint { + remote-endpoint = <&ncc2_2_funnel_out>; + }; + }; + }; + + out-ports { + port { + ncc2_etf_out: endpoint { + remote-endpoint = <&apss_funnel_in2>; + }; + }; + }; + }; + + cluster2_rep_2_0: replicator@13e90000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb909>; + reg = <0x0 0x13e90000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd2>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc2_0_rep_in: endpoint { + remote-endpoint = <&etm8_out>; + }; + }; + }; + + out-ports { + port { + ncc2_0_rep_out: endpoint { + remote-endpoint = <&ncc2_1_funnel_in0>; + }; + }; + }; + }; + + cluster2_rep_2_1: replicator@13ea0000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb909>; + reg = <0x0 0x13ea0000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd2>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc2_1_rep_in: endpoint { + remote-endpoint = <&etm9_out>; + }; + }; + }; + + out-ports { + port { + ncc2_1_rep_out: endpoint { + remote-endpoint = <&ncc2_1_funnel_in1>; + }; + }; + }; + }; + + cluster2_rep_2_2: replicator@13eb0000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb909>; + reg = <0x0 0x13eb0000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd2>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc2_2_rep_in: endpoint { + remote-endpoint = <&etm10_out>; + }; + }; + }; + + out-ports { + port { + ncc2_2_rep_out: endpoint { + remote-endpoint = <&ncc2_1_funnel_in2>; + }; + 
}; + }; + }; + + cluster2_rep_2_3: replicator@13ec0000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb909>; + reg = <0x0 0x13ec0000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd2>; + qcom,cpu-bound-components; + + in-ports { + port { + ncc2_3_rep_in: endpoint { + remote-endpoint = <&etm11_out>; + }; + }; + }; + + out-ports { + port { + ncc2_3_rep_out: endpoint { + remote-endpoint = <&ncc2_1_funnel_in3>; + }; + }; + }; + }; + + cluster2_funnel_l1: funnel@13ed0000 { + compatible = "arm,primecell"; + arm,primecell-periphid = <0x000bb908>; + reg = <0x0 0x13ed0000 0x0 0x1000>; + + clocks = <&aoss_qmp>; + clock-names = "apb_pclk"; + power-domains = <&cluster_pd2>; + qcom,cpu-bound-components; + + in-ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + + ncc2_1_funnel_in0: endpoint { + remote-endpoint = <&ncc2_0_rep_out>; + }; + }; + + port@1 { + reg = <1>; + + ncc2_1_funnel_in1: endpoint { + remote-endpoint = <&ncc2_1_rep_out>; + }; + }; + + port@2 { + reg = <2>; + + ncc2_1_funnel_in2: endpoint { + remote-endpoint = <&ncc2_2_rep_out>; + }; + }; + + port@3 { + reg = <3>; + + ncc2_1_funnel_in3: endpoint { + remote-endpoint = <&ncc2_3_rep_out>; + }; + }; + }; + + out-ports { + port { + ncc2_1_funnel_out: endpoint { + remote-endpoint = <&ncc2_2_funnel_in2>; + }; + }; + }; + }; + apps_smmu: iommu@15000000 { compatible = "qcom,x1e80100-smmu-500", "qcom,smmu-500", "arm,mmu-500"; reg = <0 0x15000000 0 0x100000>; diff --git a/arch/arm64/boot/dts/qcom/purwa.dtsi b/arch/arm64/boot/dts/qcom/purwa.dtsi index 2cecd2dd0de8c39f0702d6983bead2bc2adccf9b..38f2df9e42b60b5f22decfb464381bce214d414d 100644 --- a/arch/arm64/boot/dts/qcom/purwa.dtsi +++ b/arch/arm64/boot/dts/qcom/purwa.dtsi @@ -21,6 +21,18 @@ /delete-node/ &gpu_speed_bin; /delete-node/ &pcie3_phy; /delete-node/ &thermal_zones; +/delete-node/ &etm8; +/delete-node/ &etm9; +/delete-node/ &etm10; +/delete-node/ &etm11; +/delete-node/ &cluster2_funnel_l1; +/delete-node/ &cluster2_funnel_l2; +/delete-node/ &cluster2_etf; +/delete-node/ &cluster2_rep_2_0; +/delete-node/ &cluster2_rep_2_1; +/delete-node/ &cluster2_rep_2_2; +/delete-node/ &cluster2_rep_2_3; +/delete-node/ &apss_funnel_in2;
&gcc { compatible = "qcom,x1p42100-gcc", "qcom,x1e80100-gcc";
Cc: Sudeep
On 18/12/2025 08:09, Yuanfang Zhang wrote:
This patch series adds support for CoreSight components local to CPU clusters, including funnel, replicator, and TMC, which reside within CPU cluster power domains. These components require special handling due to power domain constraints.
Unlike system-level CoreSight devices, these components share the CPU cluster's power domain. When the cluster enters low-power mode (LPM), their registers become inaccessible. Notably, `pm_runtime_get` alone cannot bring the cluster out of LPM, making standard register access unreliable.
Why? AFAIU, we have ways to tie the power domain to that of the cluster, and that can auto-magically keep the cluster powered on for as long as you want to use these components.
Suzuki
To address this, the series introduces:
- Identifying cluster-bound devices via a new `qcom,cpu-bound-components` device tree property.
- Implementing deferred probing: if associated CPUs are offline during probe, initialization is deferred until a CPU hotplug notifier detects the CPU coming online.
- Utilizing `smp_call_function_single()` to ensure register accesses (initialization, enablement, sysfs reads) are always executed on a powered CPU within the target cluster.
- Extending the CoreSight link `enable` callback to pass the `cs_mode`. This allows drivers to distinguish between SysFS and Perf modes and apply mode-specific logic.
Jie Gan (1): arm64: dts: qcom: hamoa: add Coresight nodes for APSS debug block
Yuanfang Zhang (11): dt-bindings: arm: coresight: Add 'qcom,cpu-bound-components' property coresight: Pass trace mode to link enable callback coresight-funnel: Support CPU cluster funnel initialization coresight-funnel: Defer probe when associated CPUs are offline coresight-replicator: Support CPU cluster replicator initialization coresight-replicator: Defer probe when associated CPUs are offline coresight-replicator: Update management interface for CPU-bound devices coresight-tmc: Support probe and initialization for CPU cluster TMCs coresight-tmc-etf: Refactor enable function for CPU cluster ETF support coresight-tmc: Update management interface for CPU-bound TMCs coresight-tmc: Defer probe when associated CPUs are offline
Verification:
This series has been verified on sm8750.
Test steps for delay probe:
1. limit the system to enable at most 6 CPU cores during boot.
2. echo 1 >/sys/bus/cpu/devices/cpu6/online.
3. check whether ETM6 and ETM7 have been probed.
Test steps for sysfs mode:
echo 1 >/sys/bus/coresight/devices/tmc_etf0/enable_sink echo 1 >/sys/bus/coresight/devices/etm0/enable_source echo 1 >/sys/bus/coresight/devices/etm6/enable_source echo 0 >/sys/bus/coresight/devices/etm0/enable_source echo 0 >/sys/bus/coresight/devices/etm6/enable_source echo 0 >/sys/bus/coresight/devices/tmc_etf0/enable_sink
echo 1 >/sys/bus/coresight/devices/tmc_etf1/enable_sink echo 1 >/sys/bus/coresight/devices/etm0/enable_source cat /dev/tmc_etf1 >/tmp/etf1.bin echo 0 >/sys/bus/coresight/devices/etm0/enable_source echo 0 >/sys/bus/coresight/devices/tmc_etf1/enable_sink
echo 1 >/sys/bus/coresight/devices/tmc_etf2/enable_sink echo 1 >/sys/bus/coresight/devices/etm6/enable_source cat /dev/tmc_etf2 >/tmp/etf2.bin echo 0 >/sys/bus/coresight/devices/etm6/enable_source echo 0 >/sys/bus/coresight/devices/tmc_etf2/enable_sink
Test steps for sysfs node:
cat /sys/bus/coresight/devices/tmc_etf*/mgmt/*
cat /sys/bus/coresight/devices/funnel*/funnel_ctrl
cat /sys/bus/coresight/devices/replicator*/mgmt/*
Test steps for perf mode:
perf record -a -e cs_etm//k -- sleep 5
Signed-off-by: Yuanfang Zhang yuanfang.zhang@oss.qualcomm.com
Changes in v2:
- Use the qcom,cpu-bound-components device tree property to identify devices bound to a cluster.
- Refactor commit message.
- Introduce a supported_cpus field in the drvdata structure to record the CPUs that belong to the cluster where the local component resides.
- Link to v1: https://lore.kernel.org/r/20251027-cpu_cluster_component_pm-v1-0-31355ac588c...
Jie Gan (1): arm64: dts: qcom: hamoa: Add CoreSight nodes for APSS debug block
Yuanfang Zhang (11): dt-bindings: arm: coresight: Add 'qcom,cpu-bound-components' property coresight-funnel: Support CPU cluster funnel initialization coresight-funnel: Defer probe when associated CPUs are offline coresight-replicator: Support CPU cluster replicator initialization coresight-replicator: Defer probe when associated CPUs are offline coresight-replicator: Update management interface for CPU-bound devices coresight-tmc: Support probe and initialization for CPU cluster TMCs coresight-tmc-etf: Refactor enable function for CPU cluster ETF support coresight-tmc: Update management interface for CPU-bound TMCs coresight-tmc: Defer probe when associated CPUs are offline coresight: Pass trace mode to link enable callback
.../bindings/arm/arm,coresight-dynamic-funnel.yaml | 5 + .../arm/arm,coresight-dynamic-replicator.yaml | 5 + .../devicetree/bindings/arm/arm,coresight-tmc.yaml | 5 + arch/arm64/boot/dts/qcom/hamoa.dtsi | 926 +++++++++++++++++++++ arch/arm64/boot/dts/qcom/purwa.dtsi | 12 + drivers/hwtracing/coresight/coresight-core.c | 7 +- drivers/hwtracing/coresight/coresight-funnel.c | 258 +++++- drivers/hwtracing/coresight/coresight-replicator.c | 341 +++++++- drivers/hwtracing/coresight/coresight-tmc-core.c | 387 +++++++-- drivers/hwtracing/coresight/coresight-tmc-etf.c | 106 ++- drivers/hwtracing/coresight/coresight-tmc.h | 10 + drivers/hwtracing/coresight/coresight-tnoc.c | 3 +- drivers/hwtracing/coresight/coresight-tpda.c | 3 +- include/linux/coresight.h | 3 +- 14 files changed, 1902 insertions(+), 169 deletions(-)
base-commit: 008d3547aae5bc86fac3eda317489169c3fda112 change-id: 20251016-cpu_cluster_component_pm-ce518f510433
Best regards,
Hi,
On Thu, Dec 18, 2025 at 12:09:40AM -0800, Coresight ML wrote:
[...]
- Utilizing `smp_call_function_single()` to ensure register accesses (initialization, enablement, sysfs reads) are always executed on a powered CPU within the target cluster.
This is a concern, as Mike suggested earlier.
Let me turn this into a general question: how does the Linux kernel manage a power domain shared by multiple hardware modules?
A general solution is to bind a power domain (let's say PD1) to both module A (MOD_A) and module B (MOD_B). Each time before accessing MOD_A or MOD_B, PD1 must be powered on first via the pm_runtime APIs, with its refcount increased accordingly.
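[Editor's note: in driver code, that general solution is the usual runtime-PM bracket; a generic sketch with hypothetical names, not tied to these patches.]

```c
/* Generic runtime-PM pattern for a device in a shared power domain. */
static u32 mod_read_reg(struct device *dev, void __iomem *base, u32 offset)
{
	u32 val;

	/* Powers on the shared domain (PD1), or bumps its refcount. */
	if (pm_runtime_resume_and_get(dev) < 0)
		return 0;

	val = readl_relaxed(base + offset);

	/* Drops the refcount; PD1 may power off once it reaches zero. */
	pm_runtime_put(dev);

	return val;
}
```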
My understanding is that the problem in your case is that the driver fails to create a relationship between the funnel/replicator modules and the cluster power domain. Instead, you are trying to use the CPUs in the same cluster as a delegate for power operations: when you want to access MOD_B, you wake up MOD_A, which shares the same power domain, only to turn on PD1 in order to access MOD_B.
Have you discussed with the firmware and hardware engineers whether it is feasible to provide explicit power and clock control interfaces for the funnel and replicator modules? I can imagine the cluster power domain's design might differ from other device power domains, but should not the hardware provide a sane design that allows software to control power for the access logic within it?
Generally speaking, using smp_call_function_single() only makes sense when accessing logic within the CPU boundary.
P.s., currently you can use "taskset" as a temporary solution without any code change, something like:
taskset -c 0 echo 1 > /sys/bus/coresight/devices/etm0/enable_source
Thanks, Leo
On 12/18/2025 5:32 PM, Suzuki K Poulose wrote:
Cc: Sudeep
On 18/12/2025 08:09, Yuanfang Zhang wrote:
This patch series adds support for CoreSight components local to CPU clusters, including funnel, replicator, and TMC, which reside within CPU cluster power domains. These components require special handling due to power domain constraints.
Unlike system-level CoreSight devices, these components share the CPU cluster's power domain. When the cluster enters low-power mode (LPM), their registers become inaccessible. Notably, `pm_runtime_get` alone cannot bring the cluster out of LPM, making standard register access unreliable.
Why? AFAIU, we have ways to tie the power domain to that of the cluster, and that can auto-magically keep the cluster powered on for as long as you want to use these components.
Suzuki
Hi Suzuki
Runtime PM for CPU devices works a little differently: it is mostly used to manage the hierarchical CPU topology (PSCI OSI mode), talking to the genpd framework to handle the last CPU standing in a cluster. It doesn't actually send an IPI to wake up the CPU device; no .power_on/.power_off callbacks are implemented that would be invoked from the .runtime_resume callback. This behavior is aligned with the upstream kernel.
Yuanfang
On 18/12/2025 16:18, yuanfang zhang wrote:
On 12/18/2025 5:32 PM, Suzuki K Poulose wrote:
Cc: Sudeep
On 18/12/2025 08:09, Yuanfang Zhang wrote:
This patch series adds support for CoreSight components local to CPU clusters, including funnel, replicator, and TMC, which reside within CPU cluster power domains. These components require special handling due to power domain constraints.
Unlike system-level CoreSight devices, these components share the CPU cluster's power domain. When the cluster enters low-power mode (LPM), their registers become inaccessible. Notably, `pm_runtime_get` alone cannot bring the cluster out of LPM, making standard register access unreliable.
Why? AFAIU, we have ways to tie the power domain to that of the cluster, and that can auto-magically keep the cluster powered on for as long as you want to use these components.
Suzuki
Hi Suzuki
Runtime PM for CPU devices works a little differently: it is mostly used to manage the hierarchical CPU topology (PSCI OSI mode), talking to the genpd framework to handle the last CPU standing in a cluster. It doesn't actually send an IPI to wake up the CPU device; no .power_on/.power_off callbacks are implemented that would be invoked from the .runtime_resume callback. This behavior is aligned with the upstream kernel.
Why does it need to wake up the CPU? The firmware can power up the cluster, right? Anyway, to me this all looks like working around a firmware issue. I will let you sort this out with Sudeep, as I am not an expert on cluster power management and standards.
Suzuki
On 12/18/2025 6:40 PM, Leo Yan wrote:
Hi,
On Thu, Dec 18, 2025 at 12:09:40AM -0800, Coresight ML wrote:
[...]
- Utilizing `smp_call_function_single()` to ensure register accesses (initialization, enablement, sysfs reads) are always executed on a powered CPU within the target cluster.
This is a concern, as Mike suggested earlier.
Let me turn this into a general question: how does the Linux kernel manage a power domain shared by multiple hardware modules?
A general solution is to bind a power domain (let's say PD1) to both module A (MOD_A) and module B (MOD_B). Each time before accessing MOD_A or MOD_B, PD1 must be powered on first via the pm_runtime APIs, with its refcount increased accordingly.
My understanding is that the problem in your case is that the driver fails to create a relationship between the funnel/replicator modules and the cluster power domain. Instead, you are trying to use the CPUs in the same cluster as a delegate for power operations: when you want to access MOD_B, you wake up MOD_A, which shares the same power domain, only to turn on PD1 in order to access MOD_B.
Have you discussed with the firmware and hardware engineers whether it is feasible to provide explicit power and clock control interfaces for the funnel and replicator modules? I can imagine the cluster power domain's design might differ from other device power domains, but should not the hardware provide a sane design that allows software to control power for the access logic within it?
It is due to the particular characteristics of the CPU cluster power domain. Runtime PM for CPU devices works a little differently: it is mostly used to manage the hierarchical CPU topology (PSCI OSI mode), talking to the genpd framework to handle the last CPU standing in a cluster. It doesn't actually send an IPI to wake up the CPU device; no .power_on/.power_off callbacks are implemented that would be invoked from the .runtime_resume callback. This behavior is aligned with the upstream kernel.
Generally speaking, using smp_call_function_single() only makes sense when accessing logic within the CPU boundary.
P.s., currently you can use "taskset" as a temporary solution without any code change, something like:
taskset -c 0 echo 1 > /sys/bus/coresight/devices/etm0/enable_source
This can address the runtime issue, but it does not resolve the problem during the probe phase.
Thanks, Yuanfang
On Fri, Dec 19, 2025 at 09:50:18AM +0800, yuanfang zhang wrote:
[...]
It is due to the particular characteristics of the CPU cluster power domain. Runtime PM for CPU devices works a little differently: it is mostly used to manage the hierarchical CPU topology (PSCI OSI mode), talking to the genpd framework to handle the last CPU standing in a cluster. It doesn't actually send an IPI to wake up the CPU device; no .power_on/.power_off callbacks are implemented that would be invoked from the .runtime_resume callback. This behavior is aligned with the upstream kernel.
Just for easier understanding, let me give an example:
funnel0: funnel@10000000 {
	compatible = "arm,coresight-dynamic-funnel", "arm,primecell";
	reg = <0x10000000 0x1000>;

	clocks = <&rpmcc RPM_SMD_QDSS_CLK>, <&rpmcc RPM_SMD_QDSS_A_CLK>;
	clock-names = "apb_pclk", "atclk";
	power-domains = <&cluster0_pd>;
};
If funnel0 is bound to the cluster's power domain, the kernel's genpd will automatically enable the cluster power domain before registers are accessed.

My understanding is that your driver or firmware fails to turn on a cluster power domain without waking up a CPU (and without sending an IPI). That is not a kernel issue or limitation, and there is nothing incorrect in PSCI OSI mode.
As Suzuki said, you might reply directly to Sudeep's questions. That would confirm whether there is any flaw in the common code.
P.s., currently you can use "taskset" as a temporary solution without any code change, something like:
taskset -c 0 echo 1 > /sys/bus/coresight/devices/etm0/enable_source
This can address the runtime issue, but it does not resolve the problem during the probe phase.
Indeed. If you load the driver with insmod, you could temporarily disable CPU idle states first:
exec 3<> /dev/cpu_dma_latency; echo 0 >&3
insmod
exec 3<&-
Thanks, Leo