On 23/01/2023 17:22, Steve Clevenger wrote:
>
>
> On 1/23/2023 2:54 AM, Suzuki K Poulose wrote:
>> On 21/01/2023 07:30, Steve Clevenger wrote:
>>>
>>> Hi Suzuki,
>>>
>>> Comments in-line. Please note the approach I attempted while adding in
>>> the Ampere support was to otherwise not disturb existing driver code for
>>> non-Ampere parts.
>>>
>>> Steve
>>>
>>> On 1/20/2023 3:12 AM, Suzuki K Poulose wrote:
>>>> Hi Steve
>>>>
>>>> Thanks for the patches. Have a few comments below.
>>>>
>>>> On 20/01/2023 00:51, Steve Clevenger wrote:
>>>>> Add Ampere early clear of ETM TRCOSLAR.OSLK prior to TRCIDR1 access.
>>>>> Ampere Computing erratum AC03_DEBUG_06 describes an Ampere
>>>>> Computing design decision MMIO reads are considered the same as an
>>>>> external debug access. If TRCOSLAR.OSLK is set, the TRCIDR1 access
>>>>> results in a bus fault followed by a kernel panic. A TRCIDR1 read
>>>>> is valid regardless of TRCOSLAR.OSLK provided MMIO access
>>>>> (now deprecated) is supported.
>>>>> AC03_DEBUG_06 is described in the AmpereOne Developer Errata:
>>>>> https://solutions.amperecomputing.com/customer-connect/products/AmpereOne-d…
>>>>
>>>> Please could you add this erratum to the :
>>>>
>>>> Documentation/arm64/silicon-errata.rst ?
>>>>
>>>> Given the ETM is v4.6, doesn't it support system instructions and
>>>> that is causing this issue of "MMIO access is considered external" ?
>>>> If it does, I think we should drop all of this and simply wire the
>>>> system instruction access support.
>>
>>> That's not the issue in this case. This MMIO access should've been
>>> allowed by the Ampere ETMv4.6 implementation. Based on comments I've
>>
>> That doesn't answe the question. Please could you confirm the value of
>> ID_AA64DFR0_EL1 on your system ?
> This ID_AA64DFR0_EL1 value came from a TRACE32 debug session connected
> to this part. The ID_AA64DFR0_EL1 value is 0x000F01F210307619. So,
> TraceVer, bits [7:4] are b0001. My understanding is the system register
> interface must be implemented on all ETMv4.6 parts.
So, I don't understand why we are pushing towards enabling the
"obsolete" MMIO interface ? Is this because "ACPI" mandates it ?
Then, please don't. The spec needs an update to reflect the ETMs
with sysreg access and ETEs.
Why not stick to the system register access* ?
* PS: The ACPI support for the ETM/ETE needs additional changes to the
CoreSight driver, *not* the CoreSight ACPI spec. @Anshuman is working on
this at the moment and will be available soon.
The hack patch below should be sufficient to give it a try and if it works.
Kind regards
Suzuki
>
>>
>> Or, are you able to try this on your ACPI based system and see if you
>> are able to use the etm ? (UNTESTED hack !)
>>
>>
>> diff --git a/drivers/acpi/acpi_amba.c b/drivers/acpi/acpi_amba.c
>> index f5b443ab01c2..099966cbac5a 100644
>> --- a/drivers/acpi/acpi_amba.c
>> +++ b/drivers/acpi/acpi_amba.c
>> @@ -22,7 +22,6 @@
>> static const struct acpi_device_id amba_id_list[] = {
>> {"ARMH0061", 0}, /* PL061 GPIO Device */
>> {"ARMH0330", 0}, /* ARM DMA Controller DMA-330 */
>> - {"ARMHC500", 0}, /* ARM CoreSight ETM4x */
>> {"ARMHC501", 0}, /* ARM CoreSight ETR */
>> {"ARMHC502", 0}, /* ARM CoreSight STM */
>> {"ARMHC503", 0}, /* ARM CoreSight Debug */
>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> index 1ea8f173cca0..66670533fd54 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> @@ -3,6 +3,7 @@
>> * Copyright (c) 2014, The Linux Foundation. All rights reserved.
>> */
>>
>> +#include <linux/acpi.h>
>> #include <linux/bitops.h>
>> #include <linux/kernel.h>
>> #include <linux/moduleparam.h>
>> @@ -2286,12 +2287,22 @@ static const struct of_device_id
>> etm4_sysreg_match[] = {
>> {}
>> };
>>
>> +#ifdef CONFIG_ACPI
>> +static const struct acpi_device_id etm4x_acpi_ids[] = {
>> + {"ARMHC500", 0}, /* ARM CoreSight ETM4x */
>> + {}
>> +};
>> +
>> +MODULE_DEVICE_TABLE(acpi, etm4x_acpi_ids);
>> +#endif
>> +
>> static struct platform_driver etm4_platform_driver = {
>> .probe = etm4_probe_platform_dev,
>> .remove = etm4_remove_platform_dev,
>> .driver = {
>> .name = "coresight-etm4x",
>> .of_match_table = etm4_sysreg_match,
>> + .acpi_match_table = ACPI_PTR(etm4x_acpi_ids),
>> .suppress_bind_attrs = true,
>> },
>> };
>>
>>
>>
>>
>>> read in the driver code, the MMIO read access to TRCIDR1 occurs after a
>>> TRCDEVARCH access. The comments suggest this was to accommodate
>>> potentially unreliable TRCDEVARCH (and TRCIDR1) values. This Ampere
>>> ETMv4.6 allows an MMIO access to TRCDEVARCH, but not to TRCIDR1 unless
>>> the TRCOSLAR.OSLK lock is cleared first.
>>>
>>>>
>>>>>
>>>>> Add Ampere ETM PID required for Coresight ETM driver support.
>>>>>
>>>>> Signed-off-by: Steve Clevenger <scclevenger(a)os.amperecomputing.com>
>>>>> ---
>>>>> .../coresight/coresight-etm4x-core.c | 36
>>>>> +++++++++++++++----
>>>>> drivers/hwtracing/coresight/coresight-etm4x.h | 2 ++
>>>>> 2 files changed, 32 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>>> b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>>> index 1cc052979e01..533be1928a09 100644
>>>>> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>>> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>>> @@ -1091,19 +1091,34 @@ static void etm4_init_arch_data(void *info)
>>>>> drvdata = dev_get_drvdata(init_arg->dev);
>>>>> csa = init_arg->csa;
>>>>> + /* Detect the support for OS Lock before we actually use it */
>>>>> + etm_detect_os_lock(drvdata, csa);
>>>>> + > + /*
>>>>> + * For ETM implementations that consider MMIO an external access
>>>>> + * clear TRCOSLAR.OSLK early.
>>>>> + */
>>>>> + if (drvdata->mmio_external)
>>>>> + etm4_os_unlock_csa(drvdata, csa);
>>>>> +
>>>>> /*
>>>>> * If we are unable to detect the access mechanism,
>>>>> * or unable to detect the trace unit type, fail
>>>>> - * early.
>>>>> + * early. Reset TRCOSLAR.OSLK if cleared.
>>>>> */
>>>>> - if (!etm4_init_csdev_access(drvdata, csa))
>>>>> + if (!etm4_init_csdev_access(drvdata, csa)) {
>>>>> + if (drvdata->mmio_external)
>>>>> + etm4_os_lock(drvdata);
>>>>
>>>> Couldn't this unlock/lock sequence be moved into the
>>>> etm4_init_csdev_iomem_access() where it actually matters ?
>>>>
>>>> Or thinking more about it, we could actually move the unlock step early
>>>> for all ETMs irrespective of whether they are affected by this erratum.
>>>> Of course, putting this back, if we fail to detect the ETM properly.
>>>> I don't see any issue with that.
>>
>>
>>> I agree the lock could be cleared earlier in the code. That's what this
>>> patch does for Ampere. If it's decided ok to do for other (or all)
>>> manufacturers, then the Ampere specific ID check goes away in this
>>> place. The Ampere ID check (and flag) to determine whether the [Patch
>>> 2/3] 64-bit access is split into 2 ea. 32-bit accesses would remain, or
>>> use an existing feature mask as suggested by Mike Leach in a later
>>> review.
>>>
>>>>
>>>>> return;
>>>>> + }
>>>>> - /* Detect the support for OS Lock before we actually use it */
>>>>> - etm_detect_os_lock(drvdata, csa);
>>>>> + /*
>>>>> + * Make sure all registers are accessible
>>>>> + * TRCOSLAR.OSLK may already be clear
>>>>> + */
>>>>> + if (!drvdata->mmio_external)
>>>>> + etm4_os_unlock_csa(drvdata, csa);
>>>>> - /* Make sure all registers are accessible */
>>>>> - etm4_os_unlock_csa(drvdata, csa);
>>>>> etm4_cs_unlock(drvdata, csa);
>>>>> etm4_check_arch_features(drvdata, init_arg->pid);
>>>>> @@ -2027,6 +2042,14 @@ static int etm4_probe(struct device *dev, void
>>>>> __iomem *base, u32 etm_pid)
>>>>> init_arg.csa = &access;
>>>>> init_arg.pid = etm_pid;
>>>>> + /*
>>>>> + * Ampere ETM v4.6 considers MMIO access as external. This mask
>>>>> + * isolates the manufacturer JEP106 ID in the PID.
>>>>> + * TRCPIDR2 (JEDC|DES_1) << 16 | TRCPIDR1 (DES_0) << 8)
>>>>> + */
>>>>
>>>> Does it affect all Ampere ETMs ? You seem to be ignoring the
>>>> PDIR1.PART_1, PDIR0_PART_0 fields, which happens to be 0.
>>
>>> This is the first Ampere ETMv4.x implementation. I wrote the ID check
>>> like this specifically because Ampere does not intend to address this
>>> for ETM designs in progress.
>>
>> I would recommend to make this mask stricter and apply this to the
>> current implementation. When there are more, we could add this here,
>> rather than having to leave this work around for all the possible cores.
>>
>>>
>>>>
>>>>> + if ((init_arg.pid & 0x000FF000) == 0x00096000)
>>>>> + drvdata->mmio_external = true;
>>>> Like I said, we may be able to get rid of this flag and do the step for
>>>> all ETMs. But before all of that, I would like to see if this is problem
>>>> because we are skipping the system instruction route.
>>>>
>>
>>> We understand MMIO access is deprecated going forward. There is other
>>> Linux code to be concerned about. For example, AMBA code reads the
>>> component PID/CID. This discovery code uses mapped values digested from
>>> the CoreSight ACPI which are the descriptions and graphs for the
>>
>> With the "proposed" ACPI support for system register, AMBA would not be
>> involved at all.
>>
>>> manufacturer trace implementation. There may be other Linux code I'm not
>>> aware. Note the ASL examples in ARM Document number: DEN0067 specify
>>> MMIO locations for every CoreSight component.
>>
>> Yes, but this was never updated to cover the system register based
>> implementations. I will chase this up.
>>
>>
>> Suzuki
>>
Hi Steve,
On Sat, 21 Jan 2023 at 07:31, Steve Clevenger
<scclevenger(a)os.amperecomputing.com> wrote:
>
>
> Hi Mike,
>
> Comments in-line.
>
> Steve
>
> On 1/20/2023 3:45 AM, Mike Leach wrote:
> > Hi Steve,
> >
> > On Fri, 20 Jan 2023 at 00:52, Steve Clevenger
> > <scclevenger(a)os.amperecomputing.com> wrote:
> >>
> >> Add Ampere early clear of ETM TRCOSLAR.OSLK prior to TRCIDR1 access.
> >> Ampere Computing erratum AC03_DEBUG_06 describes an Ampere
> >> Computing design decision MMIO reads are considered the same as an
> >> external debug access. If TRCOSLAR.OSLK is set, the TRCIDR1 access
> >> results in a bus fault followed by a kernel panic. A TRCIDR1 read
> >> is valid regardless of TRCOSLAR.OSLK provided MMIO access
> >> (now deprecated) is supported.
> >> AC03_DEBUG_06 is described in the AmpereOne Developer Errata:
> >> https://solutions.amperecomputing.com/customer-connect/products/AmpereOne-d…
> >>
> >> Add Ampere ETM PID required for Coresight ETM driver support.
> >>
> >> Signed-off-by: Steve Clevenger <scclevenger(a)os.amperecomputing.com>
> >> ---
> >> .../coresight/coresight-etm4x-core.c | 36 +++++++++++++++----
> >> drivers/hwtracing/coresight/coresight-etm4x.h | 2 ++
> >> 2 files changed, 32 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> >> index 1cc052979e01..533be1928a09 100644
> >> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> >> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> >> @@ -1091,19 +1091,34 @@ static void etm4_init_arch_data(void *info)
> >> drvdata = dev_get_drvdata(init_arg->dev);
> >> csa = init_arg->csa;
> >>
> >
> > As far as I can tell there appears to be an initialisation issue here.
> > etm_probe()
> > ...
> > struct csdev_access access = { 0 };
> > ...
> > init_arg.csa = &access
> >
> > ::call=> etm4_init_arch_data(init_arg)
> >
> > Thus csa is uninitialised?
> It looks to me csa is intended to be initialized to zero? In any case,
> the Ampere check uses only the ETM pid, which is initialized directly above.
>
Sorry, I should have been more explicit.
csa is the addressing abstraction used by all the underlying register
read/write code.
It is initialised to {0} in the calling code, probably to avoid the
kernel tests complaining about uninitialised use of a variable.
However in the etm4_init_csdev_access() function we are using a base
address then it is initialised to:-
struct csdev_access {
io_mem = true;
*base = io_mem_base_addr;
};
and in the access using system registers for an etm4 to:
struct csdev_access {
io_mem = false;
*read = etm4x_sysreg_read()
*write = etm4x_sysreg_write()
};
Thus all underlying register access can use the correct method for the device.
> >
> >> + /* Detect the support for OS Lock before we actually use it */
> >> + etm_detect_os_lock(drvdata, csa);
> >> +
Thus passing a 0 init csa object to the etm_detect_os_lock() fn above
seems to be suspicious.
> >> + /*
> >> + * For ETM implementations that consider MMIO an external access
> >> + * clear TRCOSLAR.OSLK early.
> >> + */
> >> + if (drvdata->mmio_external)
> >> + etm4_os_unlock_csa(drvdata, csa);
> >> +
> >> /*
> >> * If we are unable to detect the access mechanism,
> >> * or unable to detect the trace unit type, fail
> >> - * early.
> >> + * early. Reset TRCOSLAR.OSLK if cleared.
> >> */
> >> - if (!etm4_init_csdev_access(drvdata, csa))
> >> + if (!etm4_init_csdev_access(drvdata, csa)) {
> >
> > This call initialises csa according to sysreg / iomem access requirements
> csa is initialized only when no drvdata->base exists.
Not so - csa is initialised in both circumstances as described above.
> Under what
> circumstance would there be no ETM base given the recommended CoreSight
> ACPI implementation? See the examples in ARM Document number: DEN0067.
This will be used in the ETE devices (which share the etm4 driver), or
any ETM4.6+ that uses the "arm,coresight-etm4x-sysreg" device tree
binding (not sure what the ACPI equivalent is).
So, either way, you need an init csa, before passing it to the driver calls.
Later in the initialisation sequence we generate a coresight_device
object which the csa is bound to, and finally if all is well the
coresight_device is bound to drvdata at which point the device is
ready for use.
It is unfortunate, but to handle the two methods of register access,
the initilialisation process for the driver has become more
complicated with ordering dependencies - to ensure that the rest of
the driver remains simpler when accessing device registers.
As Suzuki mentioned - moving this specific lock requirement into the
_init function would be clearer and ensure that the initialisation
sequences were observed.
Regards
Mike
> >
> >
> >
> >> + if (drvdata->mmio_external)
> >> + etm4_os_lock(drvdata);
> >> return;
> >> + }
> >>
> >> - /* Detect the support for OS Lock before we actually use it */
> >> - etm_detect_os_lock(drvdata, csa);
> >> + /*
> >> + * Make sure all registers are accessible
> >> + * TRCOSLAR.OSLK may already be clear
> >> + */
> >> + if (!drvdata->mmio_external)
> >> + etm4_os_unlock_csa(drvdata, csa);
> >>
> >> - /* Make sure all registers are accessible */
> >> - etm4_os_unlock_csa(drvdata, csa);
> >> etm4_cs_unlock(drvdata, csa);
> >>
> >> etm4_check_arch_features(drvdata, init_arg->pid);
> >> @@ -2027,6 +2042,14 @@ static int etm4_probe(struct device *dev, void __iomem *base, u32 etm_pid)
> >> init_arg.csa = &access;
> >> init_arg.pid = etm_pid;
> >>
> >> + /*
> >> + * Ampere ETM v4.6 considers MMIO access as external. This mask
> >> + * isolates the manufacturer JEP106 ID in the PID.
> >> + * TRCPIDR2 (JEDC|DES_1) << 16 | TRCPIDR1 (DES_0) << 8)
> >> + */
> >> + if ((init_arg.pid & 0x000FF000) == 0x00096000)
> >> + drvdata->mmio_external = true;
> >> +
> >> /*
> >> * Serialize against CPUHP callbacks to avoid race condition
> >> * between the smp call and saving the delayed probe.
> >> @@ -2192,6 +2215,7 @@ static const struct amba_id etm4_ids[] = {
> >> CS_AMBA_ID(0x000bb95e), /* Cortex-A57 */
> >> CS_AMBA_ID(0x000bb95a), /* Cortex-A72 */
> >> CS_AMBA_ID(0x000bb959), /* Cortex-A73 */
> >> + CS_AMBA_UCI_ID(0x00096000, uci_id_etm4),/* Ampere ARMv8 */
> >> CS_AMBA_UCI_ID(0x000bb9da, uci_id_etm4),/* Cortex-A35 */
> >> CS_AMBA_UCI_ID(0x000bbd05, uci_id_etm4),/* Cortex-A55 */
> >> CS_AMBA_UCI_ID(0x000bbd0a, uci_id_etm4),/* Cortex-A75 */
> >> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
> >> index 4b21bb79f168..cf4f9f2e1807 100644
> >> --- a/drivers/hwtracing/coresight/coresight-etm4x.h
> >> +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
> >> @@ -1015,6 +1015,7 @@ struct etmv4_save_state {
> >> * @skip_power_up: Indicates if an implementation can skip powering up
> >> * the trace unit.
> >> * @arch_features: Bitmap of arch features of etmv4 devices.
> >> + * @mmio_external: True if ETM considers MMIO an external access.
> >> */
> >> struct etmv4_drvdata {
> >> void __iomem *base;
> >> @@ -1067,6 +1068,7 @@ struct etmv4_drvdata {
> >> bool state_needs_restore;
> >> bool skip_power_up;
> >> DECLARE_BITMAP(arch_features, ETM4_IMPDEF_FEATURE_MAX);
> >
> > Rather than continue to add bools - is it not worthwhile adding to the
> > bitmap above and extending the arch features API to allow a
> > "has_feature" call?
> I can look into this. I agree using a bool for every exception doesn't
> scale well. Referring to one Suzuki Poulose review comment, his proposal
> to clear TRCOSLAR.OSLK early for all parts would mean one of these bools
> could go away. Otherwise, possibly add one (or more) bit definitions for
> use by the etm4_disable_arch_specific call. The order of this call would
> need to change, depending.
>
> >
> >> + bool mmio_external;
> >> };
> >>
> >> /* Address comparator access types */
> >> --
> >> 2.25.1
> >>
> > Regards
> >
> > Mike
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
On 21/01/2023 07:30, Steve Clevenger wrote:
>
> Hi Suzuki,
>
> Comments in-line. Please note the approach I attempted while adding in
> the Ampere support was to otherwise not disturb existing driver code for
> non-Ampere parts.
>
> Steve
>
> On 1/20/2023 3:12 AM, Suzuki K Poulose wrote:
>> Hi Steve
>>
>> Thanks for the patches. Have a few comments below.
>>
>> On 20/01/2023 00:51, Steve Clevenger wrote:
>>> Add Ampere early clear of ETM TRCOSLAR.OSLK prior to TRCIDR1 access.
>>> Ampere Computing erratum AC03_DEBUG_06 describes an Ampere
>>> Computing design decision MMIO reads are considered the same as an
>>> external debug access. If TRCOSLAR.OSLK is set, the TRCIDR1 access
>>> results in a bus fault followed by a kernel panic. A TRCIDR1 read
>>> is valid regardless of TRCOSLAR.OSLK provided MMIO access
>>> (now deprecated) is supported.
>>> AC03_DEBUG_06 is described in the AmpereOne Developer Errata:
>>> https://solutions.amperecomputing.com/customer-connect/products/AmpereOne-d…
>>
>> Please could you add this erratum to the :
>>
>> Documentation/arm64/silicon-errata.rst ?
>>
>> Given the ETM is v4.6, doesn't it support system instructions and
>> that is causing this issue of "MMIO access is considered external" ?
>> If it does, I think we should drop all of this and simply wire the
>> system instruction access support.
> That's not the issue in this case. This MMIO access should've been
> allowed by the Ampere ETMv4.6 implementation. Based on comments I've
That doesn't answe the question. Please could you confirm the value of
ID_AA64DFR0_EL1 on your system ?
Or, are you able to try this on your ACPI based system and see if you
are able to use the etm ? (UNTESTED hack !)
diff --git a/drivers/acpi/acpi_amba.c b/drivers/acpi/acpi_amba.c
index f5b443ab01c2..099966cbac5a 100644
--- a/drivers/acpi/acpi_amba.c
+++ b/drivers/acpi/acpi_amba.c
@@ -22,7 +22,6 @@
static const struct acpi_device_id amba_id_list[] = {
{"ARMH0061", 0}, /* PL061 GPIO Device */
{"ARMH0330", 0}, /* ARM DMA Controller DMA-330 */
- {"ARMHC500", 0}, /* ARM CoreSight ETM4x */
{"ARMHC501", 0}, /* ARM CoreSight ETR */
{"ARMHC502", 0}, /* ARM CoreSight STM */
{"ARMHC503", 0}, /* ARM CoreSight Debug */
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c
b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index 1ea8f173cca0..66670533fd54 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -3,6 +3,7 @@
* Copyright (c) 2014, The Linux Foundation. All rights reserved.
*/
+#include <linux/acpi.h>
#include <linux/bitops.h>
#include <linux/kernel.h>
#include <linux/moduleparam.h>
@@ -2286,12 +2287,22 @@ static const struct of_device_id
etm4_sysreg_match[] = {
{}
};
+#ifdef CONFIG_ACPI
+static const struct acpi_device_id etm4x_acpi_ids[] = {
+ {"ARMHC500", 0}, /* ARM CoreSight ETM4x */
+ {}
+};
+
+MODULE_DEVICE_TABLE(acpi, etm4x_acpi_ids);
+#endif
+
static struct platform_driver etm4_platform_driver = {
.probe = etm4_probe_platform_dev,
.remove = etm4_remove_platform_dev,
.driver = {
.name = "coresight-etm4x",
.of_match_table = etm4_sysreg_match,
+ .acpi_match_table = ACPI_PTR(etm4x_acpi_ids),
.suppress_bind_attrs = true,
},
};
> read in the driver code, the MMIO read access to TRCIDR1 occurs after a
> TRCDEVARCH access. The comments suggest this was to accommodate
> potentially unreliable TRCDEVARCH (and TRCIDR1) values. This Ampere
> ETMv4.6 allows an MMIO access to TRCDEVARCH, but not to TRCIDR1 unless
> the TRCOSLAR.OSLK lock is cleared first.
>
>>
>>>
>>> Add Ampere ETM PID required for Coresight ETM driver support.
>>>
>>> Signed-off-by: Steve Clevenger <scclevenger(a)os.amperecomputing.com>
>>> ---
>>> .../coresight/coresight-etm4x-core.c | 36 +++++++++++++++----
>>> drivers/hwtracing/coresight/coresight-etm4x.h | 2 ++
>>> 2 files changed, 32 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>> b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>> index 1cc052979e01..533be1928a09 100644
>>> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>> @@ -1091,19 +1091,34 @@ static void etm4_init_arch_data(void *info)
>>> drvdata = dev_get_drvdata(init_arg->dev);
>>> csa = init_arg->csa;
>>> + /* Detect the support for OS Lock before we actually use it */
>>> + etm_detect_os_lock(drvdata, csa);
>>> + > + /*
>>> + * For ETM implementations that consider MMIO an external access
>>> + * clear TRCOSLAR.OSLK early.
>>> + */
>>> + if (drvdata->mmio_external)
>>> + etm4_os_unlock_csa(drvdata, csa);
>>> +
>>> /*
>>> * If we are unable to detect the access mechanism,
>>> * or unable to detect the trace unit type, fail
>>> - * early.
>>> + * early. Reset TRCOSLAR.OSLK if cleared.
>>> */
>>> - if (!etm4_init_csdev_access(drvdata, csa))
>>> + if (!etm4_init_csdev_access(drvdata, csa)) {
>>> + if (drvdata->mmio_external)
>>> + etm4_os_lock(drvdata);
>>
>> Couldn't this unlock/lock sequence be moved into the
>> etm4_init_csdev_iomem_access() where it actually matters ?
>>
>> Or thinking more about it, we could actually move the unlock step early
>> for all ETMs irrespective of whether they are affected by this erratum.
>> Of course, putting this back, if we fail to detect the ETM properly.
>> I don't see any issue with that.
> I agree the lock could be cleared earlier in the code. That's what this
> patch does for Ampere. If it's decided ok to do for other (or all)
> manufacturers, then the Ampere specific ID check goes away in this
> place. The Ampere ID check (and flag) to determine whether the [Patch
> 2/3] 64-bit access is split into 2 ea. 32-bit accesses would remain, or
> use an existing feature mask as suggested by Mike Leach in a later review.
>
>>
>>> return;
>>> + }
>>> - /* Detect the support for OS Lock before we actually use it */
>>> - etm_detect_os_lock(drvdata, csa);
>>> + /*
>>> + * Make sure all registers are accessible
>>> + * TRCOSLAR.OSLK may already be clear
>>> + */
>>> + if (!drvdata->mmio_external)
>>> + etm4_os_unlock_csa(drvdata, csa);
>>> - /* Make sure all registers are accessible */
>>> - etm4_os_unlock_csa(drvdata, csa);
>>> etm4_cs_unlock(drvdata, csa);
>>> etm4_check_arch_features(drvdata, init_arg->pid);
>>> @@ -2027,6 +2042,14 @@ static int etm4_probe(struct device *dev, void
>>> __iomem *base, u32 etm_pid)
>>> init_arg.csa = &access;
>>> init_arg.pid = etm_pid;
>>> + /*
>>> + * Ampere ETM v4.6 considers MMIO access as external. This mask
>>> + * isolates the manufacturer JEP106 ID in the PID.
>>> + * TRCPIDR2 (JEDC|DES_1) << 16 | TRCPIDR1 (DES_0) << 8)
>>> + */
>>
>> Does it affect all Ampere ETMs ? You seem to be ignoring the
>> PDIR1.PART_1, PDIR0_PART_0 fields, which happens to be 0.
> This is the first Ampere ETMv4.x implementation. I wrote the ID check
> like this specifically because Ampere does not intend to address this
> for ETM designs in progress.
I would recommend to make this mask stricter and apply this to the
current implementation. When there are more, we could add this here,
rather than having to leave this work around for all the possible cores.
>
>>
>>> + if ((init_arg.pid & 0x000FF000) == 0x00096000)
>>> + drvdata->mmio_external = true;
>> Like I said, we may be able to get rid of this flag and do the step for
>> all ETMs. But before all of that, I would like to see if this is problem
>> because we are skipping the system instruction route.
>>
> We understand MMIO access is deprecated going forward. There is other
> Linux code to be concerned about. For example, AMBA code reads the
> component PID/CID. This discovery code uses mapped values digested from
> the CoreSight ACPI which are the descriptions and graphs for the
With the "proposed" ACPI support for system register, AMBA would not be
involved at all.
> manufacturer trace implementation. There may be other Linux code I'm not
> aware. Note the ASL examples in ARM Document number: DEN0067 specify
> MMIO locations for every CoreSight component.
Yes, but this was never updated to cover the system register based
implementations. I will chase this up.
Suzuki
Changes since v4:
* Rebase onto perf/core
* Convert new perf_pmu__cpu_slots_per_cycle() function to use
new helper functions
===========================
Changes since v3:
* Scale time estimates by INSTR_PER_NS, rather than assuming 1
instruction = 1ns
* Add a new commit that fixes some issues around timestamps going
backwards
* Use nanoseconds inside cs-etm-decoder.c, rather than storing the
raw time values and converting when a sample is synthesized. This
simplifies some of the code like estimating the first timestamp.
===========================
Changes since v2:
* Remove const to non-const change and copy strings where needed
instead.
* Use sizeof() instead of PATH_MAX
* Append "will not be set accurately." to new error message
* Remove unneeded stat() call
* Rebase on perf/core
==========================
Changes since v1:
* Add 3 refactor commits for sysfs reading around pmu.c as suggested
by Arnaldo here [1]
* The dependency on [2] has now reached mainline so is no longer
blocking
* Rebase on perf/core
[1]: https://lore.kernel.org/all/YnqVqq5QW%2Fb14oPZ@kernel.org/
[2]: https://lore.kernel.org/all/20220503123537.1003035-1-german.gomez@arm.com/
German Gomez (4):
perf pmu: Add function to check if a pmu file exists
perf cs_etm: Keep separate symbols for ETMv4 and ETE parameters
perf cs_etm: Record ts_source in AUXTRACE_INFO for ETMv4 and ETE
perf cs_etm: Set the time field in the synthetic samples
James Clark (4):
perf: Remove duplication around EVENT_SOURCE_DEVICE_PATH
perf: Use perf_pmu__open_file() and perf_pmu__scan_file()
perf: Remove remaining duplication of bus/event_source/devices/...
perf: cs-etm: Ensure that Coresight timestamps don't go backwards
tools/perf/arch/arm/util/auxtrace.c | 5 +-
tools/perf/arch/arm/util/cs-etm.c | 91 ++++++++-
tools/perf/arch/arm64/util/pmu.c | 4 +-
tools/perf/arch/x86/util/pmu.c | 12 +-
tools/perf/util/cputopo.c | 9 +-
tools/perf/util/cs-etm-base.c | 34 +++-
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 68 +++++--
tools/perf/util/cs-etm.c | 95 +++++++++-
tools/perf/util/cs-etm.h | 16 +-
tools/perf/util/pmu-hybrid.c | 27 +--
tools/perf/util/pmu.c | 177 +++++++-----------
tools/perf/util/pmu.h | 10 +-
12 files changed, 351 insertions(+), 197 deletions(-)
base-commit: 1962ab6f6e0b39e4216206205bda14aff87705f3
prerequisite-patch-id: 9722bf86e3e6d16d177ff9a1411992a795a7dcbd
prerequisite-patch-id: b05dbef439c2ea8465f3321532257b0ca29f21f9
prerequisite-patch-id: 92680a4781cbcf010fcb007e6ea030f59e9eaefc
prerequisite-patch-id: 8e3a73a04e4b89b503377b5fac1d89d551159393
prerequisite-patch-id: 09980d8fedcdaa70b220a7802428109f48448a58
--
2.25.1
Changes since v2:
* Remove const to non-const change and copy strings where needed
instead.
* Use sizeof() instead of PATH_MAX
* Append "will not be set accurately." to new error message
* Remove unneeded stat() call
* Rebase on perf/core
==========================
Changes since v1:
* Add 3 refactor commits for sysfs reading around pmu.c as suggested
by Arnaldo here [1]
* The dependency on [2] has now reached mainline so is no longer
blocking
* Rebase on perf/core
[1]: https://lore.kernel.org/all/YnqVqq5QW%2Fb14oPZ@kernel.org/
[2]: https://lore.kernel.org/all/20220503123537.1003035-1-german.gomez@arm.com/
German Gomez (4):
perf pmu: Add function to check if a pmu file exists
perf cs_etm: Keep separate symbols for ETMv4 and ETE parameters
perf cs_etm: Record ts_source in AUXTRACE_INFO for ETMv4 and ETE
perf cs_etm: Set the time field in the synthetic samples
James Clark (3):
perf: Remove duplication around EVENT_SOURCE_DEVICE_PATH
perf: Use perf_pmu__open_file() and perf_pmu__scan_file()
perf: Remove remaining duplication of bus/event_source/devices/...
tools/perf/arch/arm/util/auxtrace.c | 5 +-
tools/perf/arch/arm/util/cs-etm.c | 91 ++++++++++++--
tools/perf/arch/x86/util/pmu.c | 12 +-
tools/perf/util/cputopo.c | 9 +-
tools/perf/util/cs-etm-base.c | 34 ++++--
tools/perf/util/cs-etm.c | 86 ++++++++++++--
tools/perf/util/cs-etm.h | 13 +-
tools/perf/util/pmu-hybrid.c | 27 +----
tools/perf/util/pmu.c | 177 +++++++++++-----------------
tools/perf/util/pmu.h | 10 +-
10 files changed, 284 insertions(+), 180 deletions(-)
base-commit: 09e6f9f98370be9a9f8978139e0eb1be87d1125f
--
2.25.1
Changes since v3:
* Scale time estimates by INSTR_PER_NS, rather than assuming 1
instruction = 1ns
* Add a new commit that fixes some issues around timestamps going
backwards
* Use nanoseconds inside cs-etm-decoder.c, rather than storing the
raw time values and converting when a sample is synthesized. This
simplifies some of the code like estimating the first timestamp.
===========================
Changes since v2:
* Remove const to non-const change and copy strings where needed
instead.
* Use sizeof() instead of PATH_MAX
* Append "will not be set accurately." to new error message
* Remove unneeded stat() call
* Rebase on perf/core
==========================
Changes since v1:
* Add 3 refactor commits for sysfs reading around pmu.c as suggested
by Arnaldo here [1]
* The dependency on [2] has now reached mainline so is no longer
blocking
* Rebase on perf/core
[1]: https://lore.kernel.org/all/YnqVqq5QW%2Fb14oPZ@kernel.org/
[2]: https://lore.kernel.org/all/20220503123537.1003035-1-german.gomez@arm.com/
German Gomez (4):
perf pmu: Add function to check if a pmu file exists
perf cs_etm: Keep separate symbols for ETMv4 and ETE parameters
perf cs_etm: Record ts_source in AUXTRACE_INFO for ETMv4 and ETE
perf cs_etm: Set the time field in the synthetic samples
James Clark (4):
perf: Remove duplication around EVENT_SOURCE_DEVICE_PATH
perf: Use perf_pmu__open_file() and perf_pmu__scan_file()
perf: Remove remaining duplication of bus/event_source/devices/...
perf: cs-etm: Ensure that Coresight timestamps don't go backwards
tools/perf/arch/arm/util/auxtrace.c | 5 +-
tools/perf/arch/arm/util/cs-etm.c | 91 ++++++++-
tools/perf/arch/x86/util/pmu.c | 12 +-
tools/perf/util/cputopo.c | 9 +-
tools/perf/util/cs-etm-base.c | 34 +++-
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 68 +++++--
tools/perf/util/cs-etm.c | 95 +++++++++-
tools/perf/util/cs-etm.h | 16 +-
tools/perf/util/pmu-hybrid.c | 27 +--
tools/perf/util/pmu.c | 177 +++++++-----------
tools/perf/util/pmu.h | 10 +-
11 files changed, 349 insertions(+), 195 deletions(-)
base-commit: 69b41ac87e4a664de78a395ff97166f0b2943210
prerequisite-patch-id: 9722bf86e3e6d16d177ff9a1411992a795a7dcbd
prerequisite-patch-id: b05dbef439c2ea8465f3321532257b0ca29f21f9
prerequisite-patch-id: 92680a4781cbcf010fcb007e6ea030f59e9eaefc
prerequisite-patch-id: 8e3a73a04e4b89b503377b5fac1d89d551159393
prerequisite-patch-id: 09980d8fedcdaa70b220a7802428109f48448a58
prerequisite-patch-id: 711843c93d5d6bdf4d73e024949950f4e4de9e1a
--
2.25.1
OpenCSD version 1.4 is released with support for FEAT_ITE.
This adds a new packet type, with associated output element ID in
the packet type enum - OCSD_GEN_TRC_ELEM_INSTRUMENTATION.
As we just ignore this packet in perf, add to the switch statement
to avoid the "enum not handled in switch error", but conditionally
so as not to break the perf build for older OpenCSD installations.
Signed-off-by: Mike Leach <mike.leach(a)linaro.org>
---
tools/perf/util/cs-etm-decoder/cs-etm-decoder.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index fa3aa9c0fb2e..48e7121880a9 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -604,6 +604,9 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
case OCSD_GEN_TRC_ELEM_CUSTOM:
case OCSD_GEN_TRC_ELEM_SYNC_MARKER:
case OCSD_GEN_TRC_ELEM_MEMTRANS:
+#if (OCSD_VER_NUM >= 0x010400)
+ case OCSD_GEN_TRC_ELEM_INSTRUMENTATION:
+#endif
default:
break;
}
--
2.17.1
Version 1.4.0 of OpenCSD is released.
This contains support for the new packet introduced by PE FEAT_ITE.
Also contains fix for github issue #52
Regards
Mike
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
This series adds support for the trace performance monitoring and
diagnostics hardware (TPDM and TPDA). It is composed of two major
elements.
a) Changes for original coresight framework to support for TPDM and TPDA.
b) Add driver code for TPDM and TPDA.
Introduction of changes for original coresight framework
Support TPDM as new coresight source.
Since only STM and ETM are supported as coresight source originally.
TPDM is a newly added coresight source. We need to change
the original way of saving coresight path to support more types source
for coresight driver.
The following patch is to add support more coresight sources.
coresight: core: Use IDR for non-cpu bound sources' paths.
Introduction of TPDM and TPDA
TPDM - The trace performance monitoring and diagnostics monitor or TPDM in
short serves as data collection component for various dataset types
specified in the QPMDA(Qualcomm performance monitoring and diagnostics
architecture) spec. The primary use case of the TPDM is to collect data
from different data sources and send it to a TPDA for packetization,
timestamping and funneling.
Coresight: Add coresight TPDM source driver
dt-bindings: arm: Adds CoreSight TPDM hardware definitions
coresight-tpdm: Add DSB dataset support
coresight-tpdm: Add integration test support
TPDA - The trace performance monitoring and diagnostics aggregator or
TPDA in short serves as an arbitration and packetization engine for the
performance monitoring and diagnostics network as specified in the QPMDA
(Qualcomm performance monitoring and diagnostics architecture)
specification. The primary use case of the TPDA is to provide
packetization, funneling and timestamping of Monitor data as specified
in the QPMDA specification.
The following patch is to add driver for TPDA.
Coresight: Add TPDA link driver
dt-bindings: arm: Adds CoreSight TPDA hardware definitions
The last patch of this series is a device tree modification, which add
the TPDM and TPDA configuration to device tree for validating.
ARM: dts: msm: Add tpdm mm/prng for sm8250
Once this series patches are applied properly, the tpdm and tpda nodes
should be observed at the coresight path /sys/bus/coresight/devices
e.g.
/sys/bus/coresight/devices # ls -l | grep tpd
tpda0 -> ../../../devices/platform/soc(a)0/6004000.tpda/tpda0
tpdm0 -> ../../../devices/platform/soc(a)0/6c08000.mm.tpdm/tpdm0
We can use the commands are similar to the below to validate TPDMs.
Enable coresight sink first.
echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
echo 1 > /sys/bus/coresight/devices/tpdm0/enable_source
echo 1 > /sys/bus/coresight/devices/tpdm0/integration_test
echo 2 > /sys/bus/coresight/devices/tpdm0/integration_test
The test data will be collected in the coresight sink which is enabled.
If rwp register of the sink is keeping updating when do
integration_test (by cat tmc_etf0/mgmt/rwp), it means there is data
generated from TPDM to sink.
There must be a tpda between tpdm and the sink. When there are some
other trace event hw components in the same HW block with tpdm, tpdm
and these hw components will connect to the coresight funnel. When
there is only tpdm trace hw in the HW block, tpdm will connect to
tpda directly.
+---------------+ +-------------+
| tpdm@6c08000 | |tpdm@684C000 |
+-------|-------+ +------|------+
| |
+-------|-------+ |
| funnel@6c0b000| |
+-------|-------+ |
| |
+-------|-------+ |
|funnel@6c2d000 | |
+-------|-------+ |
| |
| +---------------+ |
+----- tpda@6004000 -----------+
+-------|-------+
|
+-------|-------+
|funnel@6005000 |
+---------------+
This patch series depends on patch series:
[v7,00/15] coresight: Add new API to allocate trace source ID values
https://patchwork.kernel.org/project/linux-arm-kernel/cover/20230116124928.…
TPDM_TPDA commit tree:
https://git.codelinaro.org/clo/linux-kernel/coresight/-/commits/tpdm-tpda-v…
Changes in V17:
1. Rebase changes on V7 coresight: Add new API to allocate trace source ID values
2. Add documentation for TPDA and TPDM under
Documentation/tracing/coresight/. (Suzuki K Poulose <suzuki.poulose(a)arm.com>)
Changes in V16:
1. Update device tree changes to match up with device tree bindings.
3. Update the copyright year to 2023.
Changes in V15:
1. coresight-tpda: Add more comments in trace id function.
2. qcom,coresight-tpdm.yaml: Add more comments in description.
3. Push "arm64: dts: qcom: sm8250: Add coresight components" out this series.
Changes in V14:
rebase to "[v5,00/14] coresight: Add new API to allocate trace source ID values" and latest 6.x kernel
Changes in V13:
1. Fix the conflicts when apply patches to the latest base line.
Changes in V12:
1. Clear bits for atid before setting them and relese atid when tpda
remove. -- Suzuki K Poulose <suzuki.poulose(a)arm.com>
Changes in V11:
1. Change dev_info to dev_dbg in TPDM/TPDA drivers. -- Suzuki K Poulose <suzuki.poulose(a)arm.com>
2. Merge sysfs API change of integration_test to integration_test driver
change. -- Suzuki K Poulose <suzuki.poulose(a)arm.com>
Changes in V10:
1. Fix the error of TPDM yaml file. -- Rob Herring <robh(a)kernel.org>
Changes in V9:
1. Rename yaml file for TPDM/TPDA and fix the error for the yaml files.
-- Rob Herring <robh(a)kernel.org>
Changes in V8:
1. Use spinlock to protect drvdata of TPDM/TPDA -- Suzuki K Poulose <suzuki.poulose(a)arm.com>
2. Use CORESIGHT_DEV_SUBTYPE_SOURCE_OTHERS as source type for TPDM -- Suzuki K Poulose <suzuki.poulose(a)arm.com>
3. Fix the warning for yaml file of TPDM/TPDA -- Rob Herring <robh(a)kernel.org>
Changes in V7:
1. Update the commit title and move the changes to right place which
is sorted by address for dtsi changes. -- Konrad Dybcio <konrad.dybcio(a)somainline.org>
Changes in V6:
1. Update maintainers in tpdm/tpda yaml file. -- Mike Leach <mike.leach(a)linaro.org>
2. Set the .remove function pointer in the amba_driver structure
of tpdm/tpda driver. Add tpda_remove function for tpda driver. -- Mike Leach <mike.leach(a)linaro.org>
3. Define datasets of tpdm as unsigned long. -- Mike Leach <mike.leach(a)linaro.org>
4. Move all coresight nodes to sm8250.dtsi.
-- Mike Leach <mike.leach(a)linaro.org>;Konrad Dybcio <konrad.dybcio(a)somainline.org>
5. Remove CORESIGHT_TPDM_INTEGRATION_TEST config. -- Mike Leach <mike.leach(a)linaro.org>
Changes in V5:
1. Keep the ETM source paths per-CPU and use IDR for other sources'
paths. (Suzuki K Poulose <suzuki.poulose(a)arm.com>)
Changes in V4:
1. Drop trace id for tpdm source as its trace atid is defined by the tpda.
Allocate tpda's atid dynamically. (Mike Leach)
Changes in V3:
1. Use bitmap to assign the trace id. (Mathieu Poirier)
Changes in V2:
1. Use IDR to store the path of sources. (Mathieu Poirier)
2. Only add integration_test/enable/disable for TPDM. No other configs.
(Mathieu Poirier)
3. Move coresight dtsi changes to sm8250.dtsi. (Suzuki K Poulose)
Mao Jinlong (9):
coresight: core: Use IDR for non-cpu bound sources' paths.
Coresight: Add coresight TPDM source driver
dt-bindings: arm: Add CoreSight TPDM hardware
coresight-tpdm: Add DSB dataset support
coresight-tpdm: Add integration test support
Coresight: Add TPDA link driver
dt-bindings: arm: Adds CoreSight TPDA hardware definitions
Documentation: trace: Add documentation for TPDM and TPDA
arm64: dts: qcom: sm8250: Add tpdm mm/prng
.../testing/sysfs-bus-coresight-devices-tpdm | 13 +
.../bindings/arm/qcom,coresight-tpda.yaml | 129 +++++++++
.../bindings/arm/qcom,coresight-tpdm.yaml | 93 +++++++
.../trace/coresight/coresight-tpda.rst | 52 ++++
.../trace/coresight/coresight-tpdm.rst | 43 +++
MAINTAINERS | 1 +
arch/arm64/boot/dts/qcom/sm8250.dtsi | 164 +++++++++++
drivers/hwtracing/coresight/Kconfig | 23 ++
drivers/hwtracing/coresight/Makefile | 2 +
drivers/hwtracing/coresight/coresight-core.c | 42 ++-
drivers/hwtracing/coresight/coresight-tpda.c | 211 ++++++++++++++
drivers/hwtracing/coresight/coresight-tpda.h | 35 +++
drivers/hwtracing/coresight/coresight-tpdm.c | 259 ++++++++++++++++++
drivers/hwtracing/coresight/coresight-tpdm.h | 62 +++++
include/linux/coresight.h | 1 +
15 files changed, 1118 insertions(+), 12 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm
create mode 100644 Documentation/devicetree/bindings/arm/qcom,coresight-tpda.yaml
create mode 100644 Documentation/devicetree/bindings/arm/qcom,coresight-tpdm.yaml
create mode 100644 Documentation/trace/coresight/coresight-tpda.rst
create mode 100644 Documentation/trace/coresight/coresight-tpdm.rst
create mode 100644 drivers/hwtracing/coresight/coresight-tpda.c
create mode 100644 drivers/hwtracing/coresight/coresight-tpda.h
create mode 100644 drivers/hwtracing/coresight/coresight-tpdm.c
create mode 100644 drivers/hwtracing/coresight/coresight-tpdm.h
--
2.39.0