Hi there. A first-pass list of summit sessions is up at:
https://wiki.linaro.org/MichaelHope/Sandbox/1111Blueprints
The next step is to investigate these areas and come up with a basic
plan that can be discussed during the summit.
I've put your names against the sessions as follows:
Andrew: Broad tuning
Dave: String routines everywhere
Dave: QEMU topic #2
Doug: STM support
Ira: Vectoriser and NEON performance
Ken: 64 bit sync primitives
Ken: Good backtracing
Ken: End-developer tools bluesky
Michael: Publish benchmarking of the toolchain
Michael: Binary builds
Michael: Deeper validation
Peter: QEMU bluesky
Ramana: Thumb-2 performance brainstorm
Ramana: GCC backend rework
Ulrich: GDB as a cross-debugger
Ira and Dave, I know you won't be at the summit but we'll see about
being able to call in.
Could you all please have a read of the outline, investigate these
topics, and draft up a blueprint-style list of work items to achieve
it? Record any notes in the sandbox page[5] or in a child page if
needed. Larger topics may warrant a specification[1].
I'd like these done by the end of next week. I'd expect to spend up
to a day on the topics you already understand and more on broad
topics.
For reference, see the draft TRs[2] and spreadsheet[3]. I've
added some GDB topics to the spreadsheet that still need to go onto
the wiki page.
The outlines should touch each of these TRs in some way so let me know
if I've missed anything.
There's more good information on the process and style at:
https://wiki.linaro.org/Process/Blueprints
https://wiki.linaro.org/Process/WorkItemsHowto
https://wiki.linaro.org/Resources/FAQ
Questions? Need more detail? Let me know.
-- Michael
[1] https://wiki.linaro.org/Process/SpecTemplate
[2] https://wiki.linaro.org/Cycles/1111/TechnicalTopics/Toolchain
[3] https://spreadsheets.google.com/ccc?key=0Ap7fWLePADFVdHkxYy1INTZmMEd4bkwxSG…
[4] there is no 4...
[5] https://wiki.linaro.org/MichaelHope/Sandbox/1111Blueprints
Hi,
Agenda for today's performance call. Sorry about the last-minute
posting; I'll put this on the wiki soon.
1. Sync-up on what's been happening around the group:
a. Coremark regressions.
b. Thumb2 constants patch.
c. divmodsi4 and vfp register moves.
d. DENBench investigations.
2. Planning for the summit and turning some of the ideas into blueprints.
3. AOB.
cheers
Ramana
== Last week ==
* PR48250, rehaul arm_legitimize_reload_address(). Richard Sandiford
caught a bug of mine where I overlooked the valid index range of NEON
quad-word load/stores. Quickly whipped up a fix, soon approved and
committed upstream.
* LP #744754, ICE in NEON struct-mode auto-inc-dec MEMs. Pushed upstream
patch for a merge to Linaro 4.5.
* PR46888, bit-field insert optimization patch. Resumed investigating,
mailed Andrew Pinski for more information on that REG_EQUAL note issue
he mentioned on gcc-patches; can't quite reproduce it myself.
* CoreMark ARMv6/v7 regressions: posted a patch set to gcc-patches.
Still awaiting review.
* Reported an issue (LP #748138) to Bernd and AndrewS which seems to
be related to the shrink-wrap patch. The ICE does not seem to be
avoided by passing -fno-shrink-wrap.
* A few tasks related to Linaro-Budapest event travel.
== This week ==
* Do the merge of the new combine patches to Linaro, and test.
* LP #689887 is still in progress.
* Hope to experiment with a few more optimization ideas.
Michael mentioned that some users reported seeing better performance from
RVCT using arm_neon.h than they did when coding directly in assembler.
He suggested we try the same thing for GCC. Here's an experiment using
the example that Jim Huang posted to the dev list recently:
https://wiki.linaro.org/RichardSandiford/Sandbox/IntrinsicsPerformance
The summary is that the C version needs to borrow a trick from the
assembly code in order to be competitive. If it does that, though,
the C code can be faster. I think this is mostly down to scheduling,
although I haven't checked in detail yet.
Richard
== String and Memory routines ==
* Profiled denbench with perf and produced a set of stats to show
which programs spent how much time in libc and how much time was
spent in each routine. While some of the benchmarks are good (like
aes) and spend almost no time in libc, some of the others (MPEG
codecs especially) seem to spend a significant amount of time in libc.
* Ran all of denbench through latrace to generate sets of library
calls; post-processed them to extract the section between the clock()
calls (and hence in the timed portion) and analysed the hot library
calls. I've looked at some of the output but not all of it yet; I
get output like:
Memcpy stats (dst align/src align/length/number of occurrences/total
size copied)
memcpy: 0,0,1 , 1588520, 1588520
memcpy: 16,28,4096 , 1, 4096
memcpy: 4,20,16384 , 855, 14008320
This shows that a bunch of tests do an inordinate number of 1-byte
memcpy's, and a few hundred larger memcpy's with an address %32
which is 4 (and destination %32 is 20) - so not aligned but at least
equally misaligned.
* Started writing up a report on some of the stats
* Also started to try and extract the same stuff from SPEC2k6
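
As a rough illustration of the post-processing described above, here is a minimal Python sketch that parses the memcpy stat lines, checks that each total equals length times count, and tests whether source and destination are "equally misaligned". This is not the script actually used for the report. The field order is assumed to follow the header line (dst align, src align, length, occurrences, total), the function names are hypothetical, and the modulus of 16 used for the alignment check is my assumption.

```python
from collections import namedtuple

# Field order is assumed to follow the header line in the report:
# dst align, src align, length, number of occurrences, total size copied.
MemcpyStat = namedtuple("MemcpyStat", "dst_align src_align length count total")


def parse_stat(line):
    """Parse one 'memcpy: dst,src,len , count, total' line (assumed format)."""
    name, sep, fields = line.partition(":")
    if not sep or name.strip() != "memcpy":
        raise ValueError("not a memcpy stat line: %r" % line)
    values = [int(v) for v in fields.split(",")]  # int() tolerates padding
    if len(values) != 5:
        raise ValueError("expected 5 fields: %r" % line)
    return MemcpyStat(*values)


def is_consistent(stat):
    """Total bytes copied should equal length times number of calls."""
    return stat.length * stat.count == stat.total


def same_offset(stat, modulus=16):
    """True when source and destination share the same offset modulo
    `modulus` (16 is an assumption), i.e. they are equally misaligned."""
    return (stat.dst_align - stat.src_align) % modulus == 0


def one_byte_call_share(stats):
    """Fraction of all recorded calls that copy exactly one byte."""
    calls = sum(s.count for s in stats)
    return sum(s.count for s in stats if s.length == 1) / calls


# The three sample lines quoted in the report.
SAMPLE = [
    "memcpy: 0,0,1 , 1588520, 1588520",
    "memcpy: 16,28,4096 , 1, 4096",
    "memcpy: 4,20,16384 , 855, 14008320",
]
stats = [parse_stat(line) for line in SAMPLE]

if __name__ == "__main__":
    for s in stats:
        print(s, "consistent:", is_consistent(s), "same offset:", same_offset(s))
    print("share of 1-byte calls:", one_byte_call_share(stats))
```

The offset check is symmetric in source and destination, so it gives the same answer whichever of the two alignment columns comes first.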
== QEMU ==
* Tested Peter's QEMU release earlier in the week (on Lucid, so
didn't hit his Natty bug)
* Wrote up a couple of specs (one for TrustZone and the other for
Device Tree integration)
== GDB ==
* Created Linaro GDB 7.2-2011.04-0 release.
* Committed patch to fix accessing "fpscr" register to Linaro GDB.
* Failure to disable address space randomization (bug #616001) has been
fixed in the kernel; closed the Linaro GDB bug.
== GCC ==
* Ongoing analysis of bug #759409 (Profiled bootstrap fails). Identified
two independent problems, one related to a new CodeSourcery feature,
and one a mis-optimization of final-stage cc1plus which is also present
in FSF GCC 4.5 (PR 43085). Ongoing investigation to track down the
root cause of the latter problem.
== Schedule ==
* Public holidays 04/22 - 04/25.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
RAG:
Red:
Amber:
Green: another monthly qemu-linaro release out on schedule
Current Milestones:
                             | Planned    | Estimate   | Actual     |
qemu-linaro 2011-04          | 2011-04-21 | 2011-04-21 | 2011-04-21 |
Historical Milestones:
finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
first qemu-linaro release    | 2011-02-08 | 2011-02-08 | 2011-02-08 |
qemu-linaro 2011-03          | 2011-03-08 | 2011-03-08 | 2011-03-08 |
== maintain-beagle-models ==
* qemu-linaro 2011.04 tested and released
* had to do another last minute -1 respin to fix a problem caught by
ubuntu package builds; we need to come up with a process that lets
us do test package builds prior to release so we can fix this sort
of issue in a less last-minute fashion
== merge-correctness-fixes ==
* sent patch: fix semihosting SYS_HEAPINFO (seems to have issues though)
* sent patch: UNDEFs in Neon load/store space
* sent patches: fix build issues on sparc
* sent patch: bump the initrd load address to work with bigger kernels
* sent patch: set Invalid flag for float-to-int conversion of NaN
* sent patch: move vld/vst multiple to helper functions
* reviewed patches from Aurelien doing some general softfloat cleanup
* sent out a version of my performance counters patch which just does
a basic dummy implementation without the cycle counter (since the
cycle counter bits were going off down a blind alley rather and this
part is the last thing needed to be able to boot Linaro vexpress
images on stock upstream QEMU)
== other ==
* trying to nail down proposed QEMU work for next cycle; work-in-progress:
https://wiki.linaro.org/PeterMaydell/Qemu1111
* meetings: toolchain, standup
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) 15-16 August: QEMU/KVM strand at LinuxCon NA, Vancouver
[LinuxCon proper follows on 17-19th]
Hi,
libunwind:
* added initial support for resuming at a certain stack frame
* posted unw_resume support plus some testsuite fixes on the mailing list
* there are still some issues left if signal handlers/frames are involved
Note: Friday is a public holiday.
Regards
Ken