Changes in V3:
- Rebased to linux-perf-tools branch.
- Squash symbol-elf.c and symbol.h into same commit.
- In map.c, merge dso__is_pie() call into existing if statement.
- In arm-cs-trace-disasm.py, remove debug artifacts.
Changes in V2:
- In dso__is_pie() (symbol-elf.c), Decrease indentation, add null pointer
checks per Leo Yan review.
- Updated mailing list distribution
Fedora 37 distributed shared binary and executable mapped files show a
zero text section offset. Starting with the Fedora 38 distribution, the
shared binary and executable mapped files show a non-zero text section
offset for some binaries. The text offset parameter is never passed into
the arm-cs-trace-disasm.py script to allow the script to adjust the
start/end address range passed to objdump. This adjustment is required
to correctly offset into the dso text section. Not doing so results in
an incorrect user instruction trace display for Fedora 38 (and later)
user trace output.
Steve Clevenger (4):
Add dso__is_pie call to identify ELF PIE
Force MAPPING_TYPE__IDENTIY for PIE
Add map pgoff to python dictionary based on MAPPING_TYPE
Adjust objdump start/end range per map pgoff parameter
.../scripts/python/arm-cs-trace-disasm.py | 9 ++-
tools/perf/util/map.c | 4 +-
.../scripting-engines/trace-event-python.c | 13 +++-
tools/perf/util/symbol-elf.c | 60 +++++++++++++++++++
tools/perf/util/symbol.h | 1 +
5 files changed, 79 insertions(+), 8 deletions(-)
--
2.25.1
On 8/27/2024 1:23 PM, Leo Yan wrote:
> On 8/26/2024 10:35 PM, Steve Clevenger wrote:
>>
>> Changes in V2:
>> - Updated mailing list distribution
>>
>> Extract map_pgoff parameter from the dictionary, and adjust start/end
>> range passed to objdump based on the value.
>>
>> The start_addr/stop_addr address checks are changed to print a warning
>> only if verbose == True. This script repeatedly sees a zero value passed
>> in for
>> start_addr = cpu_data[str(cpu) + 'addr']
>>
>> These zero values are not a new problem. The start_addr/stop_addr warning
>> clutters the instruction trace output, hence this change.
>>
>> Signed-off-by: Steve Clevenger <scclevenger(a)os.amperecomputing.com>
>> ---
>> .../perf/scripts/python/arm-cs-trace-disasm.py | 17 +++++++++++++++--
>> 1 file changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py b/tools/perf/scripts/python/arm-cs-trace-disasm.py
>> index d973c2baed1c..6bf806078f9a 100755
>> --- a/tools/perf/scripts/python/arm-cs-trace-disasm.py
>> +++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py
>> @@ -187,6 +187,7 @@ def process_event(param_dict):
>> dso_start = get_optional(param_dict, "dso_map_start")
>> dso_end = get_optional(param_dict, "dso_map_end")
>> symbol = get_optional(param_dict, "symbol")
>> + map_pgoff = get_optional(param_dict, "map_pgoff")
>>
>> cpu = sample["cpu"]
>> ip = sample["ip"]
>> @@ -250,13 +251,25 @@ def process_event(param_dict):
>> return
>>
>> if (start_addr < int(dso_start) or start_addr > int(dso_end)):
>> - print("Start address 0x%x is out of range [ 0x%x .. 0x%x ] for dso %s" % (start_addr, int(dso_start), int(dso_end), dso))
>> + if (options.verbose == True):
>> + print("Start address 0x%x is out of range [ 0x%x .. 0x%x ] for dso %s" % (start_addr, int(dso_start), int(dso_end), dso))
>> return
>>
>> if (stop_addr < int(dso_start) or stop_addr > int(dso_end)):
>> - print("Stop address 0x%x is out of range [ 0x%x .. 0x%x ] for dso %s" % (stop_addr, int(dso_start), int(dso_end), dso))
>> + if (options.verbose == True):
>> + print("Stop address 0x%x is out of range [ 0x%x .. 0x%x ] for dso %s" % (stop_addr, int(dso_start), int(dso_end), dso))
>> return
>>
>> + if map_pgoff != None and map_pgoff != '[unknown]':
>> + if (dso == "[kernel.kallsyms]"):
>> + dso_vm_start = 0
>> + map_pgoff = '[unknown]'
>> + else:
>> + dso_vm_start = int(dso_start)
>> + start_addr += map_pgoff
>> + stop_addr += map_pgoff
>> + map_pgoff = '[unknown]'
>
> Every sample has its own field `map_pgoff`. So I am confused why we need the
> sentence: map_pgoff = '[unknown]'. And you can see with above change, we will
> set dso_vm_start delicately with below code section.
>
> How about a simple change like below?
>
> @@ -267,7 +268,7 @@ def process_event(param_dict):
>
> dso_fname = get_dso_file_path(dso, dso_bid)
> if path.exists(dso_fname):
> - print_disam(dso_fname, dso_vm_start, start_addr, stop_addr)
> + print_disam(dso_fname, dso_vm_start, start_addr + map_pgoff, stop_addr +
> map_pgoff)
>
> If this change does not work, I am curious if anything we missed to handle in
> the perf's C code (frontend).
>
> Thanks,
> Leo
Leo,
Sorry, the map_pgoff = '[unknown]' instances are actually leftover debug
artifacts used for print type debug. I resorted to print debug after I
didn't see success attempting mixed-mode debug with Visual Studio Code
on AARCH64. If you have a known working launch configuration, please
share it. See V3 of the patch series. I've rebased to the
perf-tools-next branch.
Steve
>
>> if (options.objdump_name != None):
>> # It doesn't need to decrease virtual memory offset for disassembly
>> # for kernel dso and executable file dso, so in this case we set
>> --
>> 2.25.1
>>
On 8/27/2024 12:55 PM, Leo Yan wrote:
> On 8/26/2024 10:35 PM, Steve Clevenger wrote:
>>
>> Changes in V2:
>> - Updated mailing list distribution
>>
>> Use dso__is_pie() to check whether the DSO file is a Position
>> Independent Executable (PIE). If PIE, change the MAPPING_TYPE to
>> MAPPING_TYPE__IDENTITY so a zero map pgoff (text offset) is passed
>> into the script.
>>
>>
>> Signed-off-by: Steve Clevenger <scclevenger(a)os.amperecomputing.com>
>> ---
>> tools/perf/util/map.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
>> index e1d14936a60d..df7c06fc373e 100644
>> --- a/tools/perf/util/map.c
>> +++ b/tools/perf/util/map.c
>> @@ -171,8 +171,11 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
>> assert(!dso__kernel(dso));
>> map__init(result, start, start + len, pgoff, dso);
>>
>> + if (map->pgoff && !no_dso)
>> + no_dso = dso__is_pie(dso); // PIE check
>
> Here the logic is whatever the `pgoff` is (zero or non-zero), we need to
> always set the mapping type as MAPPING_TYPE__IDENTITY. It is no need to check
> pgoff and is no matter with 'no_dso'. So we can simply check dso__is_pie() and
> set IDENTITY flag for the true case.
>
> if (dso__is_pie(dso)) {
> map__set_mapping_type(map, MAPPING_TYPE__IDENTITY);
> } else if (anon || no_dso) {
> ...
> }
>
> BTW, please rebase the series on the top of the perf-tools-next branch [1], I
> saw this patch has merging conflict on it.
>
> Thanks,
> Leo
Hi Leo,
I combined the dso__is_pie() call onto the end of the existing 'if'
statement you now show as an 'else if'. Please see V3 of the patch series.
Steve
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git
> branch: perf-tools-next
>> +
>> if (anon || no_dso) {
>> - map->mapping_type = MAPPING_TYPE__IDENTITY;
>> + map__set_mapping_type(map, MAPPING_TYPE__IDENTITY);
>>
>> /*
>> * Set memory without DSO as loaded. All map__find_*
>> --
>> 2.25.1
>>
Changes in V2:
- Updated mailing list distribution
Fedora 37 distributed shared binary and executable mapped files show a
zero text section offset. Starting with the Fedora 38 distribution, the
shared binary and executable mapped files show a non-zero text section
offset for some binaries. The text offset parameter is never passed into
the arm-cs-trace-disasm.py script to allow the script to adjust the
start/end address range passed to objdump. This adjustment is required
to correctly offset into the dso text section. Not doing so results in
an incorrect user instruction trace display for Fedora 38 (and later)
user trace output.
Steve Clevenger (5):
Add dso__is_pie call to identify ELF PIE
Add dso__is_pie prototype
Force MAPPING_TYPE__IDENTIY for PIE
Add map pgoff to python dictionary based on MAPPING_TYPE
Adjust objdump start/end range per map pgoff parameter
.../scripts/python/arm-cs-trace-disasm.py | 17 +++++-
tools/perf/util/map.c | 5 +-
.../scripting-engines/trace-event-python.c | 13 +++-
tools/perf/util/symbol-elf.c | 60 +++++++++++++++++++
tools/perf/util/symbol.h | 1 +
5 files changed, 90 insertions(+), 6 deletions(-)
--
2.25.1
On 23/08/2024 10:57 am, Ganapatrao Kulkarni wrote:
>
> Hi James/Mike,
>
> On 23-08-2024 02:33 pm, James Clark wrote:
>>
>>
>> On 19/08/2024 11:59 am, Mike Leach wrote:
>>> Hi,
>>>
>>> A new branch of OpenCSD is available - ocsd-consistency-checks-1.5.4-rc1
>>>
>>> Testing I managed to do confirms the N atom on unconditional branches
>>> appear to work. I do not have a test case for the range
>>> discontinuities.
>>>
>>> The checks are enabled using operation flags on decoder creation. See
>>> the docs for details.
>>>
>>> Mike
>>>
>>
>> Hi Mike,
>>
>> I tested the new OpenCSD and I don't see the error anymore in the
>> disassembly script. I'm not sure if we need to go any further and add
>> the backwards check, it looks like just a later symptom and the checks
>> that you've added already prevent it.
>>
>> If you release a new version I can send the perf patch. I was going to
>> use these flags if that looks right to you? As far as I know that's the
>> set that can be always on and won't fail on bad hardware?
>>
>> I also assumed that ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK can be given even
>> for etmv3 and it's just a nop?
>>
>> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> index e917985bbbe6..90967fd807e6 100644
>> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>> @@ -685,9 +685,14 @@ cs_etm_decoder__create_etm_decoder(struct cs_etm_decoder_params *d_params,
>> return 0;
>>
>> if (d_params->operation == CS_ETM_OPERATION_DECODE) {
>> + int decode_flags = OCSD_CREATE_FLG_FULL_DECODER;
>> +#ifdef OCSD_OPFLG_N_UNCOND_DIR_BR_CHK
>> + decode_flags |= OCSD_OPFLG_N_UNCOND_DIR_BR_CHK | OCSD_OPFLG_CHK_RANGE_CONTINUE |
>> + ETM4_OPFLG_PKTDEC_AA64_OPCODE_CHK;
>> +#endif
>> if (ocsd_dt_create_decoder(decoder->dcd_tree,
>> decoder->decoder_name,
>> - OCSD_CREATE_FLG_FULL_DECODER,
>> + decode_flags,
>> trace_config, &csid))
>> return -1;
>>
>
> I tried Mike's branch with above James's patch and still the segfault is happening to us.
>
Did you update OpenCSD as well?
On 2024/8/23 15:39, Krzysztof Kozlowski wrote:
> On Fri, Aug 23, 2024 at 03:16:07PM +0800, Jinlong Mao wrote:
>>
>>
>> On 2024/8/22 15:41, Krzysztof Kozlowski wrote:
>>> On Wed, Aug 21, 2024 at 11:41:18PM -0700, Mao Jinlong wrote:
>>>> qcom,qmi-id is the instance id used by qmi API to communicate with
>>>> remote processor.
>>>>
>>>> Signed-off-by: Mao Jinlong <quic_jinlmao(a)quicinc.com>
>>>> ---
>>>> .../bindings/arm/qcom,coresight-remote-etm.yaml | 10 ++++++++++
>>>> 1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/Documentation/devicetree/bindings/arm/qcom,coresight-remote-etm.yaml b/Documentation/devicetree/bindings/arm/qcom,coresight-remote-etm.yaml
>>>> index 4fd5752978cd..27e5f18bfedf 100644
>>>> --- a/Documentation/devicetree/bindings/arm/qcom,coresight-remote-etm.yaml
>>>> +++ b/Documentation/devicetree/bindings/arm/qcom,coresight-remote-etm.yaml
>>>> @@ -20,6 +20,13 @@ properties:
>>>> compatible:
>>>> const: qcom,coresight-remote-etm
>>>> + qcom,qmi-id:
>>>> + $ref: /schemas/types.yaml#/definitions/uint32
>>>> + description:
>>>> + This id is used by qmi API to communicate with remote processor for
>>>> + enabling and disabling remote etm. Each processor has its unique instance
>>>> + id.
>>>> +
>>>> out-ports:
>>>> $ref: /schemas/graph.yaml#/properties/ports
>>>> additionalProperties: false
>>>> @@ -31,6 +38,7 @@ properties:
>>>> required:
>>>> - compatible
>>>> + - qcom,qmi-id
>>>
>>> That's an ABI break.
>>>
>>> Best regards,
>>> Krzysztof
>> Hi Krzysztof,
>>
>> Sorry, I didn't get your point.
>> Could you please share more details ?
>
> Adding new required properties is an ABI break. Nothing in commit msg
> explained why this is okay or even needed.
>
> Best regards,
> Krzysztof
Thank you. I will update in the commit msg.
>
On 2024/8/22 15:41, Krzysztof Kozlowski wrote:
> On Wed, Aug 21, 2024 at 11:41:18PM -0700, Mao Jinlong wrote:
>> qcom,qmi-id is the instance id used by qmi API to communicate with
>> remote processor.
>>
>> Signed-off-by: Mao Jinlong <quic_jinlmao(a)quicinc.com>
>> ---
>> .../bindings/arm/qcom,coresight-remote-etm.yaml | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/arm/qcom,coresight-remote-etm.yaml b/Documentation/devicetree/bindings/arm/qcom,coresight-remote-etm.yaml
>> index 4fd5752978cd..27e5f18bfedf 100644
>> --- a/Documentation/devicetree/bindings/arm/qcom,coresight-remote-etm.yaml
>> +++ b/Documentation/devicetree/bindings/arm/qcom,coresight-remote-etm.yaml
>> @@ -20,6 +20,13 @@ properties:
>> compatible:
>> const: qcom,coresight-remote-etm
>>
>> + qcom,qmi-id:
>> + $ref: /schemas/types.yaml#/definitions/uint32
>> + description:
>> + This id is used by qmi API to communicate with remote processor for
>> + enabling and disabling remote etm. Each processor has its unique instance
>> + id.
>> +
>> out-ports:
>> $ref: /schemas/graph.yaml#/properties/ports
>> additionalProperties: false
>> @@ -31,6 +38,7 @@ properties:
>>
>> required:
>> - compatible
>> + - qcom,qmi-id
>
> That's an ABI break.
>
> Best regards,
> Krzysztof
Hi Krzysztof,
Sorry, I didn't get your point.
Could you please share more details ?
Thanks
Jinlong Mao
>
>
In our hardware design, by combining a funnel and a replicator, it
implement a hardware device with one-to-one correspondence between
output ports and input ports. The programming usage on this device
is the same as funnel. The software uses a funnel and a static
replicator to implement the driver of this device. Since original
funnels only support a single output connection and original
replicator only support a single input connection, the code needs
to be modified to support this new feature. The following is a
typical topology diagram of multi-port output mechanism.
|----------| |---------| |----------| |---------|
| TPDM 0 | | Source0 | | Source 1 | | TPDM 1 |
|----------| |---------| |----------| |---------|
| | | |
| | | |
| --------- | | |
| | | |
| | | |
| | | |
\-------------/ ---------------------- |
\ Funnel 0 / | |
----------- | ------------------------------
| | |
| | |
\------------------/
\ Funnel 1 / ----|
\--------------/ |
| |----> Combine a funnel and a
| | static replicator
/-----------------\ |
/ replicator 0 \ ----|
/---------------------\
| | |
| | |-----------|
| |---------| |
| |TPDM0 |TPDM1
| \-----------------/
| \ TPDA 0 /
| \-------------/
| |
| |
|Source0/1 |
\-------------------------------/
\ Funnel 2 /
\---------------------------/
Changes in V3:
1. Rename the function "coresight_source_filter" to
"coresight_block_source". And refine this function.
-- Suzuki K Poulose
2. Rename the parameters of the function
"coresight_find_out_connection" to avoid confusion.
-- Suzuki K Poulose
3. Get the source of path in "coresight_enable_path" and
"coresight_disable_path".
-- Suzuki K Poulose
4. Fix filter source device before skip the port in
"coresight_orphan_match".
-- Suzuki K Poulose
5. Make sure the device still orphan if whter is a filter
source firmware node but the filter source device is null.
-- Suzuki K Poulose
6. Walk through the entire coresight bus and fixup the
"filter_src_dev" if the source is being removed.
-- Suzuki K Poulose
7. Refine the commit description of patch#2.
-- Suzuki K Poulose
8. Call "fwnode_handle_put" to "drop" the refcount for the
device firmware node.
-- Suzuki K Poulose
9. Fix the warning reported by kernel test robot.
-- kernel test robot.
10. Use the source device directly if the port has a
hardcoded filter in "tpda_get_element_size".
-- Suzuki K Poulose
Changes in V2:
1. Change the reference for endpoint property in dt-binding.
-- Krzysztof Kozlowski
2. Change the property name "filter_src" to "filter-src".
-- Krzysztof Kozlowski
3. Fix the errors in running 'make dt_binding_check'.
-- Rob Herring
4. Pass in the source parameter instead of path.
-- Suzuki K Poulose
5. Reset the "filter_src_dev" if the "src" csdev is being removed.
-- Suzuki K Poulose
6. Add a warning if the "filter_src_dev" is of not the
type DEV_TYPE_SOURCE.
-- Suzuki K Poulose
7. Optimize the procedure for handling all possible cases.
-- Suzuki K Poulose
Changes in V1:
1. Add a static replicator connect to a funnel to implement the
correspondence between the output ports and the input ports on
funnels.
-- Suzuki K Poulose
2. Add filter_src_dev and filter_src_dev phandle to
"coresight_connection" struct, and populate them if there is one.
-- Suzuki K Poulose
3. To look at the phandle and then fixup/remove the filter_src
device in fixup/remove connections.
-- Suzuki K Poulose
4. When TPDA reads DSB/CMB element size, it is implemented by
looking up filter src device in the connections.
-- Suzuki K Poulose
Tao Zhang (3):
dt-bindings: arm: qcom,coresight-static-replicator: Add property for
source filtering
coresight: Add support for trace filtering by source
coresight-tpda: Optimize the function of reading element size
.../arm/arm,coresight-static-replicator.yaml | 19 ++-
drivers/hwtracing/coresight/coresight-core.c | 136 +++++++++++++++---
.../hwtracing/coresight/coresight-platform.c | 18 +++
drivers/hwtracing/coresight/coresight-tpda.c | 11 +-
include/linux/coresight.h | 5 +
5 files changed, 164 insertions(+), 25 deletions(-)
--
2.17.1