On Wed, Feb 1, 2012 at 2:51 PM, Michael Hope michael.hope@linaro.org wrote:
On Wed, Feb 1, 2012 at 7:09 AM, Christian Robottom Reis kiko@linaro.org wrote:
On Tue, Jan 31, 2012 at 02:22:27PM +1300, Michael Hope wrote:
One of our cards this quarter was -O3 as a performance theme, which included doing a write-up on the advantages and usability of -O3. This write-up is at: https://wiki.linaro.org/Internal/ToolChain/BuildingAtO3
A sanitised version with non-sharable benchmark data is at: https://wiki.linaro.org/Internal/ToolChain/BuildingAtO3
Could someone review it for me please? Both facts and style. Peter, care to nit it?
This is fantastic work. I actually don't have a lot to add -- it presents the alternatives clearly and summarizes the results. And it's nice that we almost always win!
However, what's the story with SPEC2K? Half of me thinks we'd probably want to extend the life of this card for enough time to at least know (or address) the issues.
Yeah, it's not good. I'll look into it so we understand the reason before publishing.
Data first. Here's a plot of the relative scores for the benchmarks that make up SPEC 2000: https://wiki.linaro.org/Internal/People/MichaelHope?action=AttachFile&do...
(restricted, sorry non-Linaro people)
Åsa, this was made with: http://bazaar.launchpad.net/~linaro-toolchain-dev/linaro-toolchain-benchmark...
and is the graph I was talking about last week.
The green line is how much slower -Os is than -O2. The purple line is how much faster -O3 -fno-tree-vectorize is than -O2. The blue line is -O3 vs -O2. The difference between the purple and blue lines is the improvement due to the vectoriser.
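For anyone without access to the plot, here is a rough sketch of how the three lines and the vectoriser contribution relate. The scores are made-up placeholders, not benchmark data, and the sketch assumes higher scores are better; the helper name relative_gain is only for illustration.

```python
def relative_gain(score, baseline):
    """Fractional gain of score over baseline (positive means faster).

    Assumes higher scores are better.
    """
    return score / baseline - 1.0


# Hypothetical scores for a single benchmark; NOT measured results.
scores = {"Os": 90.0, "O2": 100.0, "O3_novect": 104.0, "O3": 110.0}

green = relative_gain(scores["Os"], scores["O2"])          # -Os vs -O2
purple = relative_gain(scores["O3_novect"], scores["O2"])  # -O3 -fno-tree-vectorize vs -O2
blue = relative_gain(scores["O3"], scores["O2"])           # -O3 vs -O2

# The gap between blue and purple is what the vectoriser adds (or costs).
vectoriser = blue - purple

for name, value in [("green", green), ("purple", purple),
                    ("blue", blue), ("vectoriser", vectoriser)]:
    print(f"{name:>10}: {value:+.1%}")
```

A negative vectoriser value would correspond to a case like eon, where turning the vectoriser on makes things worse.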
mesa regresses at -O3 novect. The vectoriser helps slightly.
eon improves at -O3 novect then regresses heavily when the vectoriser turns on.
gap regresses at -O3 novect. The vectoriser pulls it to a net win.
And that's about it. The others are OK wins but nothing to compensate for the losses.
The Fortran based benchmarks are missing. I'll fix that and we'll see what's next.
If we continue to use train runs then we need to normalise the results similar to what ref does. At the moment long tests like equake dominate the results over short tests like gcc.
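To illustrate the domination problem, here is a minimal sketch comparing a total-time summary with a geometric mean of per-benchmark ratios, which is roughly the style of normalisation SPEC applies to ref runs. The timings are invented for the example and the function names are hypothetical.

```python
import math

# Hypothetical train-run times in seconds; NOT measured data.
baseline = {"equake": 100.0, "gcc": 2.0}   # e.g. built at -O2
candidate = {"equake": 100.0, "gcc": 1.0}  # e.g. built at -O3


def total_time_ratio(base, cand):
    """Speed-up computed from summed run times; long tests dominate."""
    return sum(base.values()) / sum(cand.values())


def geomean_ratio(base, cand):
    """Geometric mean of per-benchmark speed-ups; each test counts equally."""
    logs = [math.log(base[name] / cand[name]) for name in base]
    return math.exp(sum(logs) / len(logs))


total = total_time_ratio(baseline, candidate)
geomean = geomean_ratio(baseline, candidate)
print(f"total-time speed-up: {total:.3f}")
print(f"geomean speed-up:    {geomean:.3f}")
```

With these placeholder numbers gcc doubles in speed, but the total-time figure barely moves because equake accounts for nearly all of the run time. The geometric mean weights each benchmark's ratio equally regardless of how long it runs.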
-- Michael