All,
During Connect the suggestion was made that each working group should have
its own IRC Channel for discussions and topics relating to the group in
particular (as opposed to #linaro which is 'generic' Linaro conversations).
Therefore I have just set up #linaro-tcwg on Freenode for the Toolchain
Working Group.
This channel is public and open to anyone who wants to talk with the TCWG
group about anything toolchain related.
Thanks,
Matt
--
Matthew Gretton-Dann
Toolchain Working Group, Linaro
Hi,
Food for thought for today's sync up. I've been writting QEMU plugins to
exercise the plugin system and see what sort of useful information you
can extract when you can control the instruction stream.
For example I now have a plugin that can break down instruction counts
for any given run, for example a kernel boot:
Instruction Classes:
Class: UDEF not counted
Class: SVE (68 hits)
Class: Reserved (0 hits)
Class: PCrel addr (4589078 hits)
Class: Add/Sub (imm,tags) (0 hits)
Class: Add/Sub (imm) (26832113 hits)
Class: Logical (imm) (74304974 hits)
Class: Move Wide (imm) (10933759 hits)
Class: Bitfield (71470957 hits)
Class: Extract (85655 hits)
Class: Data Proc Imm (0 hits)
Class: Cond Branch (imm) (37227632 hits)
Class: Exception Gen (6 hits)
Class: NOP not counted
Class: Hints (244825554 hits)
Class: Barriers (1668558 hits)
Class: PSTATE (202144 hits)
Class: System Insn (7132992 hits)
Class: System Reg (2268308 hits)
Class: Branch (reg) (6280976 hits)
Class: Branch (imm) (18347905 hits)
Class: Cmp & Branch (180167025 hits)
Class: Tst & Branch (4092972 hits)
Class: Branches (0 hits)
Class: AdvSimd ldstmult (0 hits)
Class: AdvSimd ldstmult++ (0 hits)
Class: AdvSimd ldst (0 hits)
Class: AdvSimd ldst++ (0 hits)
Class: ldst excl (160861365 hits)
Class: Prefetch (0 hits)
Class: Load Reg (lit) (12828544 hits)
Class: ldst noalloc pair (0 hits)
Class: ldst pair (60381349 hits)
Class: ldst reg (0 hits)
Class: Atomic ldst (0 hits)
Class: ldst reg (reg off) (0 hits)
Class: ldst reg (pac) (0 hits)
Class: ldst reg (imm) (119597941 hits)
Class: Loads & Stores (0 hits)
Class: Data Proc Reg (113586343 hits)
Class: Scalar FP (0 hits)
Class: Unclassified (0 hits)
You can break down each class to individual instructions. For example
the Hints are mostly:
Individual Instructions:
Instr: wfe (132400072 hits) (op=0xd503205f/ Hints)
Instr: sevl (66433640 hits) (op=0xd50320bf/ Hints)
Instr: yield (29619246 hits) (op=0xd503203f/ Hints)
Instr: wfi (2865 hits) (op=0xd503207f/ Hints)
So I'm looking for a similar experiment that would be useful for the
memory sub-system. When I chatted to Maxim we thought maybe a simplified
cache line simulator might be useful. The aim wouldn't be to simulate
what a real cache might do but to be useful say for identifying regions
of code which might be susceptible to cache line bouncing. So as
compiler writers what sort of run time memory behaviour would you like
to track? What sort of information would be useful to extract with such
a tool?
I'm open to ideas ;-)
--
Alex Bennée
== This Week ==
* PR88837 (7/10)
- Addressed all upstream suggestions.
- Found a (hackish) way to test patch with qemu (multiple issues).
- Sorting thru "strange" testsuite fallout most of which seems
unrelated to my patch.
* PR88833 (2/10)
- Looking at fwprop pass
* Misc (1/10)
- Meetings
== Next Week ==
- Continue ongoing tasks.
(Short week, three days.)
Progress:
* VIRT-65 [QEMU upstream maintainership]
+ continuing with the conversion of the VFP decoder to 'decodetree'.
With some useful advice from RTH I have now got a big chunk of
it done, and it looks like this will provide:
- better places to put "UNDEF if CPU doesn't have double support" checks
- checks of "VFP enabled?" only after all UNDEF checks have happened
- cleanup of a lot of code that uses some TCG globals cpu_F0 and cpu_F1,
which is weird ancient style and overdue for a cleanup
- a VFP decoder which isn't a single thousand-line function with multiple
nested switch statements
thanks
-- PMM
3 day week.
[LLVM-122] BTI and PAC support in lld, llvm-readobj, llvm-objdump
- Now in upstream review. Most of the week spent writing and updating tests.
Some time reviewing some asm goto patches patches.
Planned absences:
Holiday Friday 31st May.
== Progress ==
* Out of office 22 & 24 May
* [GlobalISel] Refactor CallLowering [LLVM-568]
- In progress, likely going to take a while
- Found a minor bug in the lowering for AArch64 (I can get it to
crash on some edge case), not sure if it's worth fixing independently
since it gets fixed anyway with the refactoring that I have in
progress
- Trying to understand an AMDGPU failure
== Plan ==
* Out of office 29 May - 10 June
* More of the same
o LLVM
* 8.0.1-rc1 ARM and AArch64 Binaries uploaded.
* Buildbots: One fixe committed upstream.
* Machine outliner:
- Fixed liveness issue.
- Preparing pat6ch for re-submission
o Misc
* Various meetings and discussions.
[VIRT-343 # ARMv8.5-RNG, Random Number Generator ]
Merged!
[VIRT-263 # ARMv8.1-VHE Virtual Host Extensions ]
Started dusting off and rebasing wip.
[VIRT-339 # ARMv8.5-BTI, Branch Target Identification ]
Started reviewing the kernel patch set for this feature.
[VIRT-327 # Richard's upstream QEMU work ]
PR for tcg gvec work.
PR for Sato-san's RX target.
Patch set to update capstone and enable s390x.
GSOC: Review v3 of Jan's enable risu for x86 patch set.
[Other]
Travel arrangements for Xilinx meeting in San Jose, June 13.
Will need to pick Peter's brain re m-profile before then.
r~
== This Week ==
* PR88833 (4/10)
- Started investigating the issue, it seems one of the code-movement
RTL passes like cse2
do not remove identical register copies resulting in extra register move.
* PR88837 (5/10)
- Patch almost approved offline by Richard, suggested me to move
discussion upstream.
- Observed "strange" issue with return value vectors on qemu for
run-time tests for fixed-length vectors. Turned out due to mismatch
in vector-length at compile and run time -;)
- Trying to run SVE tests with qemu.
* Misc (1/10)
- Meetings
== Next Week ==
- Continue ongoing tasks
[LLVM-122] BTI and PAC support in LLD
Implementation now working, have written BTI tests, next step is
finishing off PAC tests.
[Misc]
Helped out debugging an asm-goto problem on ARM targets.
Investigated a GNU ld LMA overlap when VMA and LMA got out of sync.
Helped out with CMSIS use of ld scripts when using a fast-model,
needed to get LMA == VMA for program header covering BSS.