On Mon, 03 Nov 2025 15:06:20 +0800, Jie Gan wrote:
> Enable CTCU device for QCS8300 platform. Add a fallback mechanism in binding to utilize
> the compatible of the SA8775p platform because the CTCU for QCS8300 shares the same
> configuration as the SA8775p platform.
>
> Changes in V4:
> 1. dtsi file has been renamed from qcs8300.dtsi -> monaco.dtsi
> Link to V3 - https://lore.kernel.org/all/20251013-enable-ctcu-for-qcs8300-v3-0-611e6e0d3…
>
> [...]
Applied, thanks!
[1/2] dt-bindings: arm: add CTCU device for monaco
https://git.kernel.org/coresight/c/51cd1fb70e08
Best regards,
--
Suzuki K Poulose <suzuki.poulose(a)arm.com>
Hi,
On Fri, Dec 19, 2025 at 10:39:49AM +0800, Ma Ke wrote:
[...]
> From the discussion, I note two possible fix directions:
>
> 1. Release the initial reference in etm_setup_aux() (current v2 patch)
> 2. Modify the behavior of coresight_get_sink_by_id() itself so it
> doesn't increase the reference count.
Option 2 is the right way to go.
> To ensure the correctness of the v3 patch, I'd like to confirm which
> patch is preferred. If option 2 is the consensus, I'm happy to modify
> the implementation of coresight_get_sink_by_id() as suggested.
It would be good to use a separate patch to fix
coresight_find_device_by_fwnode(), as mentioned by James:
diff --git a/drivers/hwtracing/coresight/coresight-platform.c b/drivers/hwtracing/coresight/coresight-platform.c
index 0db64c5f4995..2b34f818ba88 100644
--- a/drivers/hwtracing/coresight/coresight-platform.c
+++ b/drivers/hwtracing/coresight/coresight-platform.c
@@ -107,14 +107,16 @@ coresight_find_device_by_fwnode(struct fwnode_handle *fwnode)
 	 * platform bus.
 	 */
 	dev = bus_find_device_by_fwnode(&platform_bus_type, fwnode);
-	if (dev)
-		return dev;
 	/*
 	 * We have a configurable component - circle through the AMBA bus
 	 * looking for the device that matches the endpoint node.
 	 */
-	return bus_find_device_by_fwnode(&amba_bustype, fwnode);
+	if (!dev)
+		dev = bus_find_device_by_fwnode(&amba_bustype, fwnode);
+
+	put_device(dev);
+	return dev;
 }
 /*
@@ -274,7 +276,6 @@ static int of_coresight_parse_endpoint(struct device *dev,
 	of_node_put(rparent);
 	of_node_put(rep);
-	put_device(rdev);
 	return ret;
 }
Thanks for working on this.
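The ownership difference between the two options can be modelled in user space. The following is only a minimal sketch under stated assumptions: `struct obj`, `obj_get()`, `obj_put()`, `find_with_ref()` and `find_borrowed()` are hypothetical stand-ins and not kernel APIs. It shows a lookup that hands the caller a reference versus one that drops the reference before returning, as the diff above does with put_device().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object such as struct device. */
struct obj {
	int id;
	int refcount;
};

/* Take one reference; NULL is passed through unchanged. */
static struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refcount++;
	return o;
}

/* Drop one reference; tolerate NULL so callers need no extra check. */
static void obj_put(struct obj *o)
{
	if (o) {
		assert(o->refcount > 0);
		o->refcount--;
	}
}

/* Lookup that returns a new reference, which the caller must put. */
static struct obj *find_with_ref(struct obj *table, size_t n, int id)
{
	for (size_t i = 0; i < n; i++)
		if (table[i].id == id)
			return obj_get(&table[i]);
	return NULL;
}

/* Lookup that drops the reference before returning: borrowed pointer only. */
static struct obj *find_borrowed(struct obj *table, size_t n, int id)
{
	struct obj *o = find_with_ref(table, n, id);

	obj_put(o);
	return o;
}
```

With find_borrowed() the count is unchanged after the call, so the caller has nothing to release. The returned pointer is only safe to use while something else keeps the object alive, which this model does not check.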
On 19/12/2025 09:08, Jie Gan wrote:
>
>
> On 11/3/2025 3:06 PM, Jie Gan wrote:
>> The CTCU device for monaco shares the same configurations as SA8775p. Add
>> a fallback to enable the CTCU for monaco to utilize the compatible of the
>> SA8775p.
>>
>
> Gentle reminder.
I was under the assumption that this was going via the msm tree. Sorry, I
misunderstood. I can pull this in for v6.20.
Suzuki
>
>> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
>> Acked-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
>> Reviewed-by: Bjorn Andersson <andersson(a)kernel.org>
>> Signed-off-by: Jie Gan <jie.gan(a)oss.qualcomm.com>
>> ---
>> Documentation/devicetree/bindings/arm/qcom,coresight-ctcu.yaml | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/arm/qcom,coresight-ctcu.yaml b/Documentation/devicetree/bindings/arm/qcom,coresight-ctcu.yaml
>> index c969c16c21ef..460f38ddbd73 100644
>> --- a/Documentation/devicetree/bindings/arm/qcom,coresight-ctcu.yaml
>> +++ b/Documentation/devicetree/bindings/arm/qcom,coresight-ctcu.yaml
>> @@ -26,8 +26,13 @@ description: |
>>  properties:
>>    compatible:
>> -    enum:
>> -      - qcom,sa8775p-ctcu
>> +    oneOf:
>> +      - items:
>> +          - enum:
>> +              - qcom,qcs8300-ctcu
>> +          - const: qcom,sa8775p-ctcu
>> +      - enum:
>> +          - qcom,sa8775p-ctcu
>>    reg:
>>      maxItems: 1
>>
>
This patch series adds support for CoreSight components local to CPU clusters,
including funnel, replicator, and TMC, which reside within CPU cluster power
domains. These components require special handling due to power domain
constraints.
Unlike system-level CoreSight devices, these components share the CPU cluster's
power domain. When the cluster enters low-power mode (LPM), their registers
become inaccessible. Notably, `pm_runtime_get` alone cannot bring the cluster
out of LPM, making standard register access unreliable.
To address this, the series:
- Identifies cluster-bound devices via a new `qcom,cpu-bound-components`
  device tree property.
- Implements deferred probing: if the associated CPUs are offline during
  probe, initialization is deferred until a CPU hotplug notifier detects
  the CPU coming online.
- Uses `smp_call_function_single()` to ensure register accesses
  (initialization, enablement, sysfs reads) are always executed on a
  powered CPU within the target cluster.
- Extends the CoreSight link `enable` callback to pass the `cs_mode`.
  This allows drivers to distinguish between SysFS and Perf modes and
  apply mode-specific logic.
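The probe-time decision described in the list above can be sketched in user space. This is an illustration under stated assumptions, not the driver code: plain 64-bit masks stand in for the kernel's cpumask type, and `pick_online_cpu()`, `probe_decision()`, `NO_CPU` and `enum probe_action` are hypothetical names. In the series itself the selected CPU would be the target of `smp_call_function_single()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_CPU (-1)

enum probe_action { PROBE_NOW, PROBE_DEFER };

/*
 * Return the lowest-numbered CPU set in both masks, or NO_CPU when
 * none of the cluster's CPUs is online. Bit n represents CPU n.
 */
static int pick_online_cpu(uint64_t supported_cpus, uint64_t online_cpus)
{
	uint64_t usable = supported_cpus & online_cpus;

	for (int cpu = 0; cpu < 64; cpu++)
		if (usable & (UINT64_C(1) << cpu))
			return cpu;
	return NO_CPU;
}

/*
 * Decide whether probing can proceed now, on *cpu, or has to wait
 * until a hotplug event brings one of the cluster's CPUs online.
 */
static enum probe_action probe_decision(uint64_t supported_cpus,
					uint64_t online_cpus, int *cpu)
{
	assert(cpu != NULL);
	*cpu = pick_online_cpu(supported_cpus, online_cpus);
	return *cpu == NO_CPU ? PROBE_DEFER : PROBE_NOW;
}
```

For a cluster made of CPU6 and CPU7 (mask 0xC0), booting with only CPU0 to CPU5 online yields PROBE_DEFER, and bringing CPU6 online afterwards yields PROBE_NOW with CPU6 selected.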
Jie Gan (1):
arm64: dts: qcom: hamoa: add Coresight nodes for APSS debug block
Yuanfang Zhang (11):
dt-bindings: arm: coresight: Add 'qcom,cpu-bound-components' property
coresight: Pass trace mode to link enable callback
coresight-funnel: Support CPU cluster funnel initialization
coresight-funnel: Defer probe when associated CPUs are offline
coresight-replicator: Support CPU cluster replicator initialization
coresight-replicator: Defer probe when associated CPUs are offline
coresight-replicator: Update management interface for CPU-bound devices
coresight-tmc: Support probe and initialization for CPU cluster TMCs
coresight-tmc-etf: Refactor enable function for CPU cluster ETF support
coresight-tmc: Update management interface for CPU-bound TMCs
coresight-tmc: Defer probe when associated CPUs are offline
Verification:
This series has been verified on sm8750.
Test steps for deferred probe:
1. Limit the system to enable at most 6 CPU cores during boot.
2. echo 1 >/sys/bus/cpu/devices/cpu6/online
3. Check whether ETM6 and ETM7 have been probed.
Test steps for sysfs mode:
echo 1 >/sys/bus/coresight/devices/tmc_etf0/enable_sink
echo 1 >/sys/bus/coresight/devices/etm0/enable_source
echo 1 >/sys/bus/coresight/devices/etm6/enable_source
echo 0 >/sys/bus/coresight/devices/etm0/enable_source
echo 0 >/sys/bus/coresight/devices/etm6/enable_source
echo 0 >/sys/bus/coresight/devices/tmc_etf0/enable_sink
echo 1 >/sys/bus/coresight/devices/tmc_etf1/enable_sink
echo 1 >/sys/bus/coresight/devices/etm0/enable_source
cat /dev/tmc_etf1 >/tmp/etf1.bin
echo 0 >/sys/bus/coresight/devices/etm0/enable_source
echo 0 >/sys/bus/coresight/devices/tmc_etf1/enable_sink
echo 1 >/sys/bus/coresight/devices/tmc_etf2/enable_sink
echo 1 >/sys/bus/coresight/devices/etm6/enable_source
cat /dev/tmc_etf2 >/tmp/etf2.bin
echo 0 >/sys/bus/coresight/devices/etm6/enable_source
echo 0 >/sys/bus/coresight/devices/tmc_etf2/enable_sink
Test steps for sysfs node:
cat /sys/bus/coresight/devices/tmc_etf*/mgmt/*
cat /sys/bus/coresight/devices/funnel*/funnel_ctrl
cat /sys/bus/coresight/devices/replicator*/mgmt/*
Test steps for perf mode:
perf record -a -e cs_etm//k -- sleep 5
Signed-off-by: Yuanfang Zhang <yuanfang.zhang(a)oss.qualcomm.com>
---
Changes in v2:
- Use the qcom,cpu-bound-components device tree property to identify devices
bound to a cluster.
- Refactor commit message.
- Introduce a supported_cpus field in the drvdata structure to record the CPUs
that belong to the cluster where the local component resides.
- Link to v1: https://lore.kernel.org/r/20251027-cpu_cluster_component_pm-v1-0-31355ac588…
---
Jie Gan (1):
arm64: dts: qcom: hamoa: Add CoreSight nodes for APSS debug block
Yuanfang Zhang (11):
dt-bindings: arm: coresight: Add 'qcom,cpu-bound-components' property
coresight-funnel: Support CPU cluster funnel initialization
coresight-funnel: Defer probe when associated CPUs are offline
coresight-replicator: Support CPU cluster replicator initialization
coresight-replicator: Defer probe when associated CPUs are offline
coresight-replicator: Update management interface for CPU-bound devices
coresight-tmc: Support probe and initialization for CPU cluster TMCs
coresight-tmc-etf: Refactor enable function for CPU cluster ETF support
coresight-tmc: Update management interface for CPU-bound TMCs
coresight-tmc: Defer probe when associated CPUs are offline
coresight: Pass trace mode to link enable callback
.../bindings/arm/arm,coresight-dynamic-funnel.yaml | 5 +
.../arm/arm,coresight-dynamic-replicator.yaml | 5 +
.../devicetree/bindings/arm/arm,coresight-tmc.yaml | 5 +
arch/arm64/boot/dts/qcom/hamoa.dtsi | 926 +++++++++++++++++++++
arch/arm64/boot/dts/qcom/purwa.dtsi | 12 +
drivers/hwtracing/coresight/coresight-core.c | 7 +-
drivers/hwtracing/coresight/coresight-funnel.c | 258 +++++-
drivers/hwtracing/coresight/coresight-replicator.c | 341 +++++++-
drivers/hwtracing/coresight/coresight-tmc-core.c | 387 +++++++--
drivers/hwtracing/coresight/coresight-tmc-etf.c | 106 ++-
drivers/hwtracing/coresight/coresight-tmc.h | 10 +
drivers/hwtracing/coresight/coresight-tnoc.c | 3 +-
drivers/hwtracing/coresight/coresight-tpda.c | 3 +-
include/linux/coresight.h | 3 +-
14 files changed, 1902 insertions(+), 169 deletions(-)
---
base-commit: 008d3547aae5bc86fac3eda317489169c3fda112
change-id: 20251016-cpu_cluster_component_pm-ce518f510433
Best regards,
--
Yuanfang Zhang <yuanfang.zhang(a)oss.qualcomm.com>
On 19/12/2025 10:21, Sudeep Holla wrote:
> On Fri, Dec 19, 2025 at 10:13:14AM +0800, yuanfang zhang wrote:
>>
>>
>> On 12/18/2025 7:33 PM, Sudeep Holla wrote:
>>> On Thu, Dec 18, 2025 at 12:09:40AM -0800, Yuanfang Zhang wrote:
>>>> This patch series adds support for CoreSight components local to CPU clusters,
>>>> including funnel, replicator, and TMC, which reside within CPU cluster power
>>>> domains. These components require special handling due to power domain
>>>> constraints.
>>>>
>>>
>>> Could you clarify why PSCI-based power domains associated with clusters in
>>> domain-idle-states cannot address these requirements, given that PSCI CPU-idle
>>> OSI mode was originally intended to support them? My understanding of this
>>> patch series is that OSI mode is unable to do so, which, if accurate, appears
>>> to be a flaw that should be corrected.
>>
>> It is due to the particular characteristics of the CPU cluster power
>> domain. Runtime PM for CPU devices works a little differently: it is mostly
>> used to manage the hierarchical CPU topology (PSCI OSI mode), talking with
>> the genpd framework to manage the last-CPU handling in a cluster.
>
> That is indeed the intended design. Could you clarify which specific
> characteristics differentiate it here?
>
>> It doesn't really send an IPI to wake up the CPU device (it doesn't have the
>> .power_on/.power_off callbacks implemented, which would get invoked from the
>> .runtime_resume callback). This behavior is aligned with the upstream kernel.
>>
>
> I am quite lost here. Why is it necessary to wake up the CPU? If I understand
> correctly, all of this complexity is meant to ensure that the cluster power
> domain is enabled before any of the funnel registers are accessed. Is that
> correct?
>
> If so, and if the cluster domains are already defined as the power domains for
> these funnel devices, then they should be requested to power on automatically
> before any register access occurs. Is that not the case?
>
> What am I missing in this reasoning?
Exactly, this is what I am wondering too. But then you get the "pre-formatted
standard response" without answering our questions.
Suzuki
On 19/12/2025 10:04, Jie Gan wrote:
> From: Tao Zhang <tao.zhang(a)oss.qualcomm.com>
>
> The TPDA_SYNC counter tracks the number of bytes transferred from the
> aggregator. When this count reaches the value programmed in the
> TPDA_SYNCR register, an ASYNC request is triggered, allowing userspace
> tools to accurately parse each valid packet.
>
> Signed-off-by: Tao Zhang <tao.zhang(a)oss.qualcomm.com>
> Reviewed-by: James Clark <james.clark(a)linaro.org>
> Co-developed-by: Jie Gan <jie.gan(a)oss.qualcomm.com>
> Signed-off-by: Jie Gan <jie.gan(a)oss.qualcomm.com>
> ---
> drivers/hwtracing/coresight/coresight-tpda.c | 7 +++++++
> drivers/hwtracing/coresight/coresight-tpda.h | 5 +++++
> 2 files changed, 12 insertions(+)
>
> diff --git a/drivers/hwtracing/coresight/coresight-tpda.c b/drivers/hwtracing/coresight/coresight-tpda.c
> index d25a8bcfb3d4..d378ff8ad77d 100644
> --- a/drivers/hwtracing/coresight/coresight-tpda.c
> +++ b/drivers/hwtracing/coresight/coresight-tpda.c
> @@ -163,6 +163,13 @@ static void tpda_enable_pre_port(struct tpda_drvdata *drvdata)
>  	 */
>  	if (drvdata->trig_flag_ts)
>  		writel_relaxed(0x0, drvdata->base + TPDA_FPID_CR);
> +
> +	val = readl_relaxed(drvdata->base + TPDA_SYNCR);
> +	/* Reset the mode ctrl */
> +	val &= ~TPDA_SYNCR_MODE_CTRL;
> +	/* Program the counter value for TPDA_SYNCR */
> +	val |= TPDA_SYNCR_COUNTER_MASK;
Do we plan to change this value via sysfs? If not, what's the point of
clearing the field? Why not simply set it (as it is all 1s anyway)?
> +	writel_relaxed(val, drvdata->base + TPDA_SYNCR);
> }
>
> static int tpda_enable_port(struct tpda_drvdata *drvdata, int port)
> diff --git a/drivers/hwtracing/coresight/coresight-tpda.h b/drivers/hwtracing/coresight/coresight-tpda.h
> index 8a075cfbc3cc..97e2729c15c9 100644
> --- a/drivers/hwtracing/coresight/coresight-tpda.h
> +++ b/drivers/hwtracing/coresight/coresight-tpda.h
> @@ -9,6 +9,7 @@
> #define TPDA_CR (0x000)
> #define TPDA_Pn_CR(n) (0x004 + (n * 4))
> #define TPDA_FPID_CR (0x084)
> +#define TPDA_SYNCR (0x08C)
>
> /* Cross trigger Global (all ports) flush request bit */
> #define TPDA_CR_FLREQ BIT(0)
> @@ -38,6 +39,10 @@
> #define TPDA_Pn_CR_CMBSIZE GENMASK(7, 6)
> /* Aggregator port DSB data set element size bit */
> #define TPDA_Pn_CR_DSBSIZE BIT(8)
Please add a newline to separate the definitions of different registers.
> +/* TPDA_SYNCR mode control bit */
> +#define TPDA_SYNCR_MODE_CTRL BIT(12)
> +/* TPDA_SYNCR counter mask */
> +#define TPDA_SYNCR_COUNTER_MASK GENMASK(11, 0)
>
> #define TPDA_MAX_INPORTS 32
>
Suzuki
>
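For reference, the effect of the read-modify-write sequence in the quoted hunk can be checked in user space. This is a minimal sketch, assuming only the bit layout given by the patch's own macros (mode control at bit 12, counter in bits 11:0). `BIT32()`, `GENMASK32()` and `tpda_syncr_update()` are local stand-ins written for this example and are not the kernel's helpers.

```c
#include <assert.h>
#include <stdint.h>

/* User-space equivalents of the kernel's BIT() and GENMASK() for 32-bit values. */
#define BIT32(n)        (UINT32_C(1) << (n))
#define GENMASK32(h, l) ((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

/* Field layout copied from the macro definitions in the quoted patch. */
#define TPDA_SYNCR_MODE_CTRL     BIT32(12)
#define TPDA_SYNCR_COUNTER_MASK  GENMASK32(11, 0)

/* Same sequence as the patch, applied to a plain value instead of a register. */
static uint32_t tpda_syncr_update(uint32_t val)
{
	/* The two fields must not overlap for the sequence to make sense. */
	assert((TPDA_SYNCR_MODE_CTRL & TPDA_SYNCR_COUNTER_MASK) == 0);

	val &= ~TPDA_SYNCR_MODE_CTRL;    /* clear the mode control bit */
	val |= TPDA_SYNCR_COUNTER_MASK;  /* set every bit of the counter field */
	return val;
}
```

In this model the counter field always ends up as 0xFFF, bit 12 always ends up clear, and bits 13 and above keep whatever value was read back.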