With current design, the name of the non-cpu bounded coresight
component is the device type with the number. And with 'ls' command
we can get the register address of the component. But from these
information, we can't know what the HW or system the component belongs
to. Add label in DT and show the hw information by reading label sysfs
node.
cti_sys0 -> ../../../devices/platform/soc(a)0/138f0000.cti/cti_sys0
cti_sys1 -> ../../../devices/platform/soc(a)0/13900000.cti/cti_sys1
tpdm0 -> ../../../devices/platform/soc(a)0/10b0d000.tpdm/tpdm0
tpdm1 -> ../../../devices/platform/soc(a)0/10c28000.tpdm/tpdm1
tpdm2 -> ../../../devices/platform/soc(a)0/10c29000.tpdm/tpdm2
/sys/bus/coresight/devices # cat cti*/label
cti_dlct_0
cti_dlct_1
cti_apss_0
cti_apss_1
cti_apss_2
Change since V4:
1. Add label in DT and add label sysfs node for each coresight device.
Change since V3:
1. Change device-name to arm,cs-dev-name.
2. Add arm,cs-dev-name to only CTI and sources' dt-binding.
Change since V2:
1. Fix the error in coresight core.
drivers/hwtracing/coresight/coresight-core.c:1775:7: error: assigning to 'char *' from 'const char *' discards qualifiers
2. Fix the warning when run dtbinding check.
Documentation/devicetree/bindings/arm/arm,coresight-cpu-debug.yaml: device-name: missing type definition
Change since V1:
1. Change coresight-name to device name.
2. Add the device-name in coresight dt bindings.
Mao Jinlong (2):
dt-bindings: arm: Add label in the coresight components
coresight: Add label sysfs node support
.../testing/sysfs-bus-coresight-devices-cti | 6 ++++
.../sysfs-bus-coresight-devices-funnel | 6 ++++
.../testing/sysfs-bus-coresight-devices-tpdm | 6 ++++
.../bindings/arm/arm,coresight-cti.yaml | 6 ++++
.../arm/arm,coresight-dummy-sink.yaml | 6 ++++
.../arm/arm,coresight-dummy-source.yaml | 6 ++++
.../arm/arm,coresight-dynamic-funnel.yaml | 6 ++++
.../arm/arm,coresight-dynamic-replicator.yaml | 6 ++++
.../arm/arm,coresight-static-funnel.yaml | 6 ++++
.../arm/arm,coresight-static-replicator.yaml | 6 ++++
.../bindings/arm/arm,coresight-tmc.yaml | 6 ++++
.../bindings/arm/qcom,coresight-tpda.yaml | 6 ++++
.../bindings/arm/qcom,coresight-tpdm.yaml | 6 ++++
drivers/hwtracing/coresight/coresight-sysfs.c | 32 +++++++++++++++++++
14 files changed, 110 insertions(+)
--
2.17.1
In our hardware design, by combining a funnel and a replicator, it
implement a hardware device with one-to-one correspondence between
output ports and input ports. The programming usage on this device
is the same as funnel. The software uses a funnel and a static
replicator to implement the driver of this device. Since original
funnels only support a single output connection and original
replicator only support a single input connection, the code needs
to be modified to support this new feature. The following is a
typical topology diagram of multi-port output mechanism.
|----------| |---------| |----------| |---------|
| TPDM 0 | | Source0 | | Source 1 | | TPDM 1 |
|----------| |---------| |----------| |---------|
| | | |
| | | |
| --------- | | |
| | | |
| | | |
| | | |
\-------------/ ---------------------- |
\ Funnel 0 / | |
----------- | ------------------------------
| | |
| | |
\------------------/
\ Funnel 1 / ----|
\--------------/ |
| |----> Combine a funnel and a
| | static replicator
/-----------------\ |
/ replicator 0 \ ----|
/---------------------\
| | |
| | |-----------|
| |---------| |
| |TPDM0 |TPDM1
| \-----------------/
| \ TPDA 0 /
| \-------------/
| |
| |
|Source0/1 |
\-------------------------------/
\ Funnel 2 /
\---------------------------/
Changes in V6:
1. Optimize the prompt content of the warning log
when the filter handle is not a trace source.
-- Suzuki K Poulose
2. Reset the filter device and fwnode if it is not
a trace source.
-- Suzuki K Poulose
Changes in V5:
1. Replace "filter-src" with "filter-source" in the
dt-binding document.
-- Suzuki K Poulose
2. Optimize the comments of the patch "coresight:
Add support for trace filtering by source" due to bad
example.
-- Suzuki K Poulose
3. Correct spelling errors in the patch "coresight:
Add support for trace filtering by source".
-- Suzuki K Poulose
4. Optimize the function "coresight_blocks_source".
-- Suzuki K Poulose
5. Add { } in the function "of_coresight_parse_endpoint".
-- Suzuki K Poulose
6. Adjust the order of the patches.
-- Suzuki K Poulose
7. Adjust the alignment in "coresight-platform.c".
-- Suzuki K Poulose
Changes in V4:
1. Use "coresight_get_source(path)" in the function
"coresight_disable_path_from" instead of explicitly
passing the source.
-- Suzuki K Poulose
2. Optimize the order of the input parameters for
"_coresight_build_path".
-- Suzuki K Poulose
3. Reuse the method "coresight_block_source" in
"_coresight_build_path".
-- Suzuki K Poulose
4. Remove the unnecessary () in "coresight_build_path".
-- Suzuki K Poulose
5. Add a helper to check if a device is SOURCE.
-- Suzuki K Poulose
6. Adjust the posistion of setting "still_orphan" in
"coresight_build_path".
-- Suzuki K Poulose
Changes in V3:
1. Rename the function "coresight_source_filter" to
"coresight_block_source". And refine this function.
-- Suzuki K Poulose
2. Rename the parameters of the function
"coresight_find_out_connection" to avoid confusion.
-- Suzuki K Poulose
3. Get the source of path in "coresight_enable_path" and
"coresight_disable_path".
-- Suzuki K Poulose
4. Fix filter source device before skip the port in
"coresight_orphan_match".
-- Suzuki K Poulose
5. Make sure the device still orphan if whter is a filter
source firmware node but the filter source device is null.
-- Suzuki K Poulose
6. Walk through the entire coresight bus and fixup the
"filter_src_dev" if the source is being removed.
-- Suzuki K Poulose
7. Refine the commit description of patch#2.
-- Suzuki K Poulose
8. Fix the warning reported by kernel test robot.
-- kernel test robot.
9. Use the source device directly if the port has a
hardcoded filter in "tpda_get_element_size".
-- Suzuki K Poulose
Changes in V2:
1. Change the reference for endpoint property in dt-binding.
-- Krzysztof Kozlowski
2. Change the property name "filter_src" to "filter-src".
-- Krzysztof Kozlowski
3. Fix the errors in running 'make dt_binding_check'.
-- Rob Herring
4. Pass in the source parameter instead of path.
-- Suzuki K Poulose
5. Reset the "filter_src_dev" if the "src" csdev is being removed.
-- Suzuki K Poulose
6. Add a warning if the "filter_src_dev" is of not the
type DEV_TYPE_SOURCE.
-- Suzuki K Poulose
7. Optimize the procedure for handling all possible cases.
-- Suzuki K Poulose
Changes in V1:
1. Add a static replicator connect to a funnel to implement the
correspondence between the output ports and the input ports on
funnels.
-- Suzuki K Poulose
2. Add filter_src_dev and filter_src_dev phandle to
"coresight_connection" struct, and populate them if there is one.
-- Suzuki K Poulose
3. To look at the phandle and then fixup/remove the filter_src
device in fixup/remove connections.
-- Suzuki K Poulose
4. When TPDA reads DSB/CMB element size, it is implemented by
looking up filter src device in the connections.
-- Suzuki K Poulose
Tao Zhang (4):
dt-bindings: arm: qcom,coresight-static-replicator: Add property for
source filtering
coresight: Add a helper to check if a device is source
coresight: Add support for trace filtering by source
coresight-tpda: Optimize the function of reading element size
.../arm/arm,coresight-static-replicator.yaml | 19 ++-
drivers/hwtracing/coresight/coresight-core.c | 113 +++++++++++++++---
.../hwtracing/coresight/coresight-platform.c | 21 ++++
drivers/hwtracing/coresight/coresight-tpda.c | 13 +-
include/linux/coresight.h | 12 +-
5 files changed, 155 insertions(+), 23 deletions(-)
--
2.17.1
Hi
On 04/12/2024 10:13, yuanfang zhang wrote:
> From: Yuanfang Zhang <quic_yuanfang(a)quicinc.com>
>
> As recommended by section 4.3.7 ("Synchronization when using system
> instructions to progrom the trace unit") of ARM IHI 0064H.b, the
> self-hosted trace analyzer must perform a Context synchronization
> event between writing to the TRCPRGCTLR and reading the TRCSTATR.
>
Thanks for the patch. Please could you add the above in a comment
in the changes below ?
Also, add Fixes tag ? (It may go all the way back to initial support)
Otherwise looks sensible to me.
Suzuki
> Signed-off-by: Yuanfang Zhang <quic_yuanfang(a)quicinc.com>
> ---
> drivers/hwtracing/coresight/coresight-etm4x-core.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index 66d44a404ad0..5da2c523c30a 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -906,6 +906,7 @@ static void etm4_disable_hw(void *info)
> tsb_csync();
> etm4x_relaxed_write32(csa, control, TRCPRGCTLR);
>
> + isb();
> /* wait for TRCSTATR.PMSTABLE to go to '1' */
> if (coresight_timeout(csa, TRCSTATR, TRCSTATR_PMSTABLE_BIT, 1))
> dev_err(etm_dev,
Hi Yicong
On 02/12/2024 09:24, Yicong Yang wrote:
> From: Yicong Yang <yangyicong(a)hisilicon.com>
>
> When trying to run perf and sysfs mode simultaneously, the WARN_ON()
> in tmc_etr_enable_hw() is triggered sometimes:
>
> WARNING: CPU: 42 PID: 3911571 at drivers/hwtracing/coresight/coresight-tmc-etr.c:1060 tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc]
> [..snip..]
> Call trace:
> tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc] (P)
> tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc] (L)
> tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc]
> coresight_enable_path+0x1c8/0x218 [coresight]
> coresight_enable_sysfs+0xa4/0x228 [coresight]
> enable_source_store+0x58/0xa8 [coresight]
> dev_attr_store+0x20/0x40
> sysfs_kf_write+0x4c/0x68
> kernfs_fop_write_iter+0x120/0x1b8
> vfs_write+0x2c8/0x388
> ksys_write+0x74/0x108
> __arm64_sys_write+0x24/0x38
> el0_svc_common.constprop.0+0x64/0x148
> do_el0_svc+0x24/0x38
> el0_svc+0x3c/0x130
> el0t_64_sync_handler+0xc8/0xd0
> el0t_64_sync+0x1ac/0x1b0
> ---[ end trace 0000000000000000 ]---
>
> Since the enablement of sysfs mode is separeted into two critical regions,
> one for sysfs buffer allocation and another for hardware enablement, it's
> possible to race with the perf mode. Fix this by double check whether
> the perf mode's been used before enabling the hardware in sysfs mode.
Thanks for the fix. Some minor comments below.
It needs a Fixes tag.
>
> Signed-off-by: Yicong Yang <yangyicong(a)hisilicon.com>
> ---
> .../hwtracing/coresight/coresight-tmc-etr.c | 30 +++++++++++++++++++
> 1 file changed, 30 insertions(+)
>
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index ad83714ca4dc..d382d95da5ff 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -1230,6 +1230,36 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
>
> spin_lock_irqsave(&drvdata->spinlock, flags);
>
> + /*
> + * Since the sysfs buffer allocation and the hardware enablement is not
> + * in the same critical region, it's possible to race with the perf
> + * mode:
> + * [sysfs mode] [perf mode]
> + * tmc_etr_get_sysfs_buffer()
> + * spin_lock(&drvdata->spinlock)
> + * [sysfs buffer allocation]
> + * spin_unlock(&drvdata->spinlock)
> + * spin_lock(&drvdata->spinlock)
> + * tmc_etr_enable_hw()
> + * drvdata->etr_buf = etr_perf->etr_buf
> + * spin_unlock(&drvdata->spinlock)
> + * spin_lock(&drvdata->spinlock)
> + * tmc_etr_enable_hw()
> + * WARN_ON(drvdata->etr_buf) // WARN sicne etr_buf initialized at
> + * the perf side
> + * spin_unlock(&drvdata->spinlock)
> + *
> + * So check here before continue.
> + */
> + if (coresight_get_mode(csdev) == CS_MODE_PERF) {
> + drvdata->sysfs_buf = NULL;
> + spin_unlock_irqrestore(&drvdata->spinlock, flags);
> +
> + /* Free allocated memory out side of the spinlock */
> + tmc_etr_free_sysfs_buf(sysfs_buf);
> + return -EBUSY;
> + }
With this in place, I think we should remove the check for CS_MODE_PERF
in get_etr_sysfs_buf() to avoid confusion (which I believe opened up
this race)
Suzuki
> +
> /*
> * In sysFS mode we can have multiple writers per sink. Since this
> * sink is already enabled no memory is needed and the HW need not be
On 25/11/2024 9:48 am, Yeoreum Yun wrote:
> In some coresight drivers, drvdata->spinlock can be held during __schedule()
> by perf_event_task_sched_out()/in().
>
> Since drvdata->spinlock type is spinlock_t and
> perf_event_task_sched_out()/in() is called after acquiring rq_lock,
> which is raw_spinlock_t (an unsleepable lock),
> this poses an issue in PREEMPT_RT kernel where spinlock_t is sleepable.
>
> To address this,change type drvdata->spinlock in some coresight drivers,
> which can be called by perf_event_task_sched_out()/in(),
> from spinlock_t to raw_spinlock_t.
>
> Levi Yun (9):
> coresight: change coresight_device lock type to raw_spinlock_t
> coresight-etm4x: change etmv4_drvdata spinlock type to raw_spinlock_t
> coresight: change coresight_trace_id_map's lock type to raw_spinlock_t
> coresight-cti: change cti_drvdata spinlock's type to raw_spinlock_t
> coresight-etb10: change etb_drvdata spinlock's type to raw_spinlock_t
> coresight-funnel: change funnel_drvdata spinlock's type to
> raw_spinlock_t
> coresight-replicator: change replicator_drvdata spinlock's type to
> raw_spinlock_t
> coresight-tmc: change tmc_drvdata spinlock's type to raw_spinlock_t
> coresight/ultrasoc: change cti_drvdata spinlock's type to
> raw_spinlock_t
>
> .../hwtracing/coresight/coresight-config.c | 21 +-
> .../hwtracing/coresight/coresight-config.h | 2 +-
> drivers/hwtracing/coresight/coresight-core.c | 2 +-
> .../hwtracing/coresight/coresight-cti-core.c | 65 +-
> .../hwtracing/coresight/coresight-cti-sysfs.c | 135 +++--
> drivers/hwtracing/coresight/coresight-cti.h | 2 +-
> drivers/hwtracing/coresight/coresight-etb10.c | 62 +-
> .../coresight/coresight-etm4x-core.c | 71 ++-
> .../coresight/coresight-etm4x-sysfs.c | 566 +++++++++---------
> drivers/hwtracing/coresight/coresight-etm4x.h | 2 +-
> .../hwtracing/coresight/coresight-funnel.c | 34 +-
> .../coresight/coresight-replicator.c | 36 +-
> .../hwtracing/coresight/coresight-syscfg.c | 75 ++-
> .../hwtracing/coresight/coresight-tmc-core.c | 9 +-
> .../hwtracing/coresight/coresight-tmc-etf.c | 195 +++---
> .../hwtracing/coresight/coresight-tmc-etr.c | 199 +++---
> drivers/hwtracing/coresight/coresight-tmc.h | 2 +-
> .../hwtracing/coresight/coresight-trace-id.c | 93 ++-
> drivers/hwtracing/coresight/ultrasoc-smb.c | 12 +-
> drivers/hwtracing/coresight/ultrasoc-smb.h | 2 +-
> include/linux/coresight.h | 4 +-
> 21 files changed, 751 insertions(+), 838 deletions(-)
>
> --
> LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
>
Can TPIU be used as a sink with Perf? If so that one is missing the update.
Makes me wonder if something like a "perf_component" flag in
coresight_device or a #define would make sense. Then the lock type gets
auto picked. Seems like in a few years we will have completely forgotten
why some of the coresight locks are raw and others aren't.
On 25/11/2024 9:48 am, Yeoreum Yun wrote:
> In some coresight drivers, drvdata->spinlock can be held during __schedule()
> by perf_event_task_sched_out()/in().
>
> Since drvdata->spinlock type is spinlock_t and
> perf_event_task_sched_out()/in() is called after acquiring rq_lock,
> which is raw_spinlock_t (an unsleepable lock),
> this poses an issue in PREEMPT_RT kernel where spinlock_t is sleepable.
>
> To address this,change type drvdata->spinlock in some coresight drivers,
> which can be called by perf_event_task_sched_out()/in(),
> from spinlock_t to raw_spinlock_t.
>
> Levi Yun (9):
> coresight: change coresight_device lock type to raw_spinlock_t
> coresight-etm4x: change etmv4_drvdata spinlock type to raw_spinlock_t
> coresight: change coresight_trace_id_map's lock type to raw_spinlock_t
> coresight-cti: change cti_drvdata spinlock's type to raw_spinlock_t
> coresight-etb10: change etb_drvdata spinlock's type to raw_spinlock_t
> coresight-funnel: change funnel_drvdata spinlock's type to
> raw_spinlock_t
> coresight-replicator: change replicator_drvdata spinlock's type to
> raw_spinlock_t
> coresight-tmc: change tmc_drvdata spinlock's type to raw_spinlock_t
> coresight/ultrasoc: change cti_drvdata spinlock's type to
> raw_spinlock_t
>
> .../hwtracing/coresight/coresight-config.c | 21 +-
> .../hwtracing/coresight/coresight-config.h | 2 +-
> drivers/hwtracing/coresight/coresight-core.c | 2 +-
> .../hwtracing/coresight/coresight-cti-core.c | 65 +-
> .../hwtracing/coresight/coresight-cti-sysfs.c | 135 +++--
> drivers/hwtracing/coresight/coresight-cti.h | 2 +-
> drivers/hwtracing/coresight/coresight-etb10.c | 62 +-
> .../coresight/coresight-etm4x-core.c | 71 ++-
> .../coresight/coresight-etm4x-sysfs.c | 566 +++++++++---------
> drivers/hwtracing/coresight/coresight-etm4x.h | 2 +-
> .../hwtracing/coresight/coresight-funnel.c | 34 +-
> .../coresight/coresight-replicator.c | 36 +-
> .../hwtracing/coresight/coresight-syscfg.c | 75 ++-
> .../hwtracing/coresight/coresight-tmc-core.c | 9 +-
> .../hwtracing/coresight/coresight-tmc-etf.c | 195 +++---
> .../hwtracing/coresight/coresight-tmc-etr.c | 199 +++---
> drivers/hwtracing/coresight/coresight-tmc.h | 2 +-
> .../hwtracing/coresight/coresight-trace-id.c | 93 ++-
> drivers/hwtracing/coresight/ultrasoc-smb.c | 12 +-
> drivers/hwtracing/coresight/ultrasoc-smb.h | 2 +-
> include/linux/coresight.h | 4 +-
> 21 files changed, 751 insertions(+), 838 deletions(-)
>
> --
> LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
>
Reviewed-by: James Clark <james.clark(a)linaro.org>
Somewhat related to this change, but not something that needs to be done
now:
All of the spinlocks for the syfs store/read functions don't need to be
raw spinlocks, or spinlocks at all. They're never called from the
scheduling code and can be locked by coresight_mutex instead.
coresight_mutex is held on the sysfs enable/disable path when those
values are actually used.
Same point as here:
https://lore.kernel.org/linux-arm-kernel/9a637e74-d81d-405c-bad0-c97ec1aa4b…
Probably needs a review of which values in each device might be shared
between perf mode and sysfs mode when they shouldn't be, as there was
one in the link above.
I thought the only thing shared between sysfs and perf should be the
'mode' which is taken with a compare and swap without locking anyway.
On 26/11/2024 4:23 pm, Oliver Upton wrote:
> On Thu, Nov 21, 2024 at 12:50:10PM +0000, James Clark wrote:
>>
>>
>> On 20/11/2024 5:31 pm, Oliver Upton wrote:
>>> On Tue, Nov 12, 2024 at 10:37:10AM +0000, James Clark wrote:
>>>> +void kvm_set_trfcr(u64 host_trfcr, u64 guest_trfcr)
>>>> +{
>>>> + if (kvm_arm_skip_trace_state())
>>>> + return;
>>>> +
>>>> + if (has_vhe())
>>>> + write_sysreg_s(guest_trfcr, SYS_TRFCR_EL12);
>>>> + else
>>>> + if (host_trfcr != guest_trfcr) {
>>>> + *host_data_ptr(host_debug_state.trfcr_el1) = guest_trfcr;
>>>
>>> Huh? That's going into host_debug_state, which is the dumping grounds
>>> for *host* context when entering a guest.
>>>
>>> Not sure why we'd stick a *guest* value in there...
>>>
>>
>> Only to save a 3rd storage place for trfcr when just the register and one
>> place is technically enough. But yes if it's more readable to have
>> guest_trfcr_el1 separately then that makes sense.
>
> Yeah, since this is all per-cpu data at this point rather than per-vCPU,
> it isn't the end of the world to use a few extra bytes.
>
>> That works, it would be nice to have it consistent and have it that way for
>> filtering, like kvm_set_guest_trace_filters(bool kernel, bool user). But I
>> suppose we can justify not doing it there because we're not really
>> interpreting the TRFCR value just writing it whole.
>
> Agreed, the biggest thing I'd want to see in the exported interfaces
> like this is to have enable/disable helpers to tell KVM when a driver
> wants KVM to start/stop managing a piece of state while in a guest.
>
> Then the hypervisor code can blindly save/restore some opaque values to
> whatever registers it needs to update.
>
>>> What if trace is disabled in the guest or in the host? Do we need to
>>> synchronize when transitioning from an enabled -> disabled state like we
>>> do today?
>>>
>>
>> By synchronize do you mean the tsb_csync()? I can only see it being
>> necessary for the TRBE case because then writing to the buffer is fatal.
>> Without TRBE the trace sinks still work and the boundary of when exactly
>> tracing is disabled in the kernel isn't critical.
>
> Ack, I had the blinders on that we cared only about TRBE here.
>
>>> I took a stab at this, completely untested of course && punts on
>>> protected mode. But this is _generally_ how I'd like to see everything
>>> fit together.
>>>
>>
>> Would you expect to see the protected mode stuff ignored if I sent another
>> version more like yours below? Or was that just skipped to keep the example
>> shorter?
>
> Skipped since I slapped this together in a hurry.
>
>> I think I'm a bit uncertain on that one because removing HAS_TRBE means you
>> can't check if TRBE is enabled or not in protected mode and it will go wrong
>> if it is.
>
> The protected mode hypervisor will need two bits of information.
> Detecting that the feature is present can be done in the kernel so long
> as the corresponding static key / cpucap is toggled before we drop
> privileges.
>
> Whether or not it is programmable + enabled is a decision that must be
> made by observing hardware state from the hypervisor before entering a
> guest.
>
> [...]
>
>>> +void kvm_enable_trbe(u64 guest_trfcr)
>>> +{
>>> + if (WARN_ON_ONCE(preemptible()))
>>> + return;
>>> +
>>> + if (has_vhe()) {
>>> + write_sysreg_s(guest_trfcr, SYS_TRFCR_EL12);
>>> + return;
>>> + }
>>> +
>>> + *host_data_ptr(guest_trfcr_el1) = guest_trfcr;
>>> + host_data_set_flag(HOST_TRBE_ENABLED);
>>
>> FWIW TRBE and TRF are separate features, so this wouldn't do the filtering
>> correctly if TRBE wasn't in use, but I can split it out into
>> separate kvm_enable_trbe(void) and kvm_set_guest_filters(u64 guest_trfcr).
>
> KVM manages the same piece of state (TRFCR_EL1) either way though right?
>
> The expectation I had is that KVM is informed any time a trace session
> (TRBE or otherwise) is enabled/disabled on a CPU, likely with a TRFCR_EL1
> of 0 if guest mode is excluded.
>
> The function names might need massaging, but I was hoping to have a
> single set of enable/disable knobs to cover all bases here.
>
I sent another version, it did come out much simpler and still does all
the same things as before.
I didn't manage to make a single enable/disable knob though. The thing
is the filtering is set on the source side of the driver and trbe is a
sink thing. I would have to couple them together and add knowledge of
the sink type to the source to make it work.
That would then open up the possibility for anyone adding a new source
to get the trbe bit wrong in the future. Having KVM override the filter
setting when trbe is in use seems a lot safer and easier to understand.
FEAT_TRF is a Coresight feature that allows trace capture to be
completely filtered at different exception levels, unlike the existing
TRCVICTLR controls which may still emit target addresses of branches,
even if the following trace is filtered.
Without FEAT_TRF, it was possible to start a trace session on a host and
also collect trace from the guest as TRCVICTLR was never programmed to
exclude guests (and it could still emit target addresses even if it
was).
With FEAT_TRF, the current behavior of trace in guests exists depends on
whether nVHE or VHE are being used. Both of the examples below are from
the host's point of view, as Coresight isn't accessible from guests.
This patchset is only relevant to when FEAT_TRF exists, otherwise there
is no change.
Current behavior:
nVHE/pKVM:
Because the host and the guest are both using TRFCR_EL1, trace will be
generated in guests depending on the same filter rules the host is
using. For example if the host is tracing userspace only, then guest
userspace trace will also be collected.
(This is further limited by whether TRBE is used because an issue
with TRBE means that it's completely disabled in nVHE guests, but it's
possible to have other tracing components.)
VHE:
With VHE, the host filters will be in TRFCR_EL2, but the filters in
TRFCR_EL1 will be active when the guest is running. Because we don't
write to TRFCR_EL1, guest trace will be completely disabled.
New behavior:
The guest filtering rules from the Perf session are now honored for both
nVHE and VHE modes. This is done by either writing to TRFCR_EL12 at the
start of the Perf session and doing nothing else further, or caching the
guest value and writing it at guest switch for nVHE. In pKVM, trace is
now be disabled for both protected and unprotected guests.
There is also an optimization where the Coresight drivers pass their
enabled state to KVM. This means in the common case KVM doesn't have to
touch any sysregs when the feature isn't in use.
Applies to kvmarm/next (60ad25e14a) but includes two commits from Oliver
for a conflicting change to move TRBE and SPE flags to host data [7].
---
Changes since V7 [6]:
* Drop SPE changes
* Change the interface to be based on intent, i.e kvm_enable_trbe()
rather than passing the raw register value
* Drop change to re-use vcpu_flags mechanism in favour of [7]
* Simplify by using the same switch function to and from guest
Changes since V6 [5]:
* Implement a better "do nothing" case where both the SPE and Coresight
drivers give the enabled state to KVM, allowing some register
reads to be dropped.
* Move the state and feature flags out of the vCPU into the per-CPU
host_debug_state.
* Simplify the switch logic by adding a new flag HOST_STATE_SWAP_TRFCR
and only storing a single TRFCR value.
* Rename vcpu flag macros to a more generic kvm_flag...
Changes since V5 [4]:
* Sort new sysreg entries by encoding
* Add a comment about sorting arch/arm64/tools/sysreg
* Warn on preemptible() before calling smp_processor_id()
* Pickup tags
* Change TRFCR_EL2 from SysregFields to Sysreg because it was only
used once
Changes since V4 [3]:
* Remove all V3 changes that made it work in pKVM and just disable
trace there instead
* Restore PMU host/hyp state sharing back to how it was
(kvm_pmu_update_vcpu_events())
* Simplify some of the duplication in the comments and function docs
* Add a WARN_ON_ONCE() if kvm_etm_set_guest_trfcr() is called when
the trace filtering feature doesn't exist.
* Split sysreg change into a tools update followed by the new register
addition
Changes since V3:
* Create a new shared area to store the host state instead of copying
it before each VCPU run
* Drop commit that moved SPE and trace registers from host_debug_state
into the kvm sysregs array because the guest values were never used
* Document kvm_etm_set_guest_trfcr()
* Guard kvm_etm_set_guest_trfcr() with a feature check
* Drop Mark B and Suzuki's review tags on the sysreg patch because it
turned out that broke the Perf build and needed some unconventional
changes to fix it (as in: to update the tools copy of the headers in
the same commit as the kernel changes)
Changes since V2:
* Add a new iflag to signify presence of FEAT_TRF and keep the
existing TRBE iflag. This fixes the issue where TRBLIMITR_EL1 was
being accessed even if TRBE didn't exist
* Reword a commit message
Changes since V1:
* Squashed all the arm64/tools/sysreg changes into the first commit
* Add a new commit to move SPE and TRBE regs into the kvm sysreg array
* Add a comment above the TRFCR global that it's per host CPU rather
than vcpu
Changes since nVHE RFC [1]:
* Re-write just in terms of the register value to be written for the
host and the guest. This removes some logic from the hyp code and
a value of kvm_vcpu_arch:trfcr_el1 = 0 no longer means "don't
restore".
* Remove all the conditional compilation and new files.
* Change the kvm_etm_update_vcpu_events macro to a function.
* Re-use DEBUG_STATE_SAVE_TRFCR so iflags don't need to be expanded
anymore.
* Expand the cover letter.
Changes since VHE v3 [2]:
* Use the same interface as nVHE mode so TRFCR_EL12 is now written by
kvm.
[1]: https://lore.kernel.org/kvmarm/20230804101317.460697-1-james.clark@arm.com/
[2]: https://lore.kernel.org/kvmarm/20230905102117.2011094-1-james.clark@arm.com/
[3]: https://lore.kernel.org/linux-arm-kernel/20240104162714.1062610-1-james.cla…
[4]: https://lore.kernel.org/all/20240220100924.2761706-1-james.clark@arm.com/
[5]: https://lore.kernel.org/linux-arm-kernel/20240226113044.228403-1-james.clar…
[6]: https://lore.kernel.org/kvmarm/20241112103717.589952-1-james.clark@linaro.o…
[7]: https://lore.kernel.org/kvmarm/20241115224924.2132364-4-oliver.upton@linux.…
James Clark (6):
arm64/sysreg: Add a comment that the sysreg file should be sorted
tools: arm64: Update sysreg.h header files
arm64/sysreg/tools: Move TRFCR definitions to sysreg
KVM: arm64: coresight: Give TRBE enabled state to KVM
KVM: arm64: Support trace filtering for guests
coresight: Pass guest TRFCR value to KVM
Oliver Upton (2):
KVM: arm64: Get rid of __kvm_get_mdcr_el2() and related warts
KVM: arm64: Track presence of SPE/TRBE in kvm_host_data instead of
vCPU
arch/arm64/include/asm/kvm_asm.h | 5 +-
arch/arm64/include/asm/kvm_host.h | 39 +-
arch/arm64/include/asm/sysreg.h | 12 -
arch/arm64/kvm/arm.c | 5 +-
arch/arm64/kvm/debug.c | 92 ++--
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 61 +--
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 6 -
arch/arm64/kvm/hyp/vhe/debug-sr.c | 5 -
arch/arm64/tools/sysreg | 38 ++
.../coresight/coresight-etm4x-core.c | 43 +-
drivers/hwtracing/coresight/coresight-etm4x.h | 2 +-
drivers/hwtracing/coresight/coresight-priv.h | 3 +
drivers/hwtracing/coresight/coresight-trbe.c | 5 +
tools/arch/arm64/include/asm/sysreg.h | 410 +++++++++++++++++-
tools/include/linux/kasan-tags.h | 15 +
15 files changed, 609 insertions(+), 132 deletions(-)
create mode 100644 tools/include/linux/kasan-tags.h
--
2.34.1