On 16 August 2012 13:04, Mans Rullgard mans.rullgard@linaro.org wrote:
On 15 August 2012 22:38, Mans Rullgard mans.rullgard@linaro.org wrote:
On 15 August 2012 17:17, Matthew Gretton-Dann matthew.gretton-dann@linaro.org wrote:
On 'a popular embedded benchmark', PGO gives a 14% improvement and LTO gives 7%. I don't have numbers for both together, or for SPEC.
On 'a popular media coding library' PGO gains 2-5% in general, and in one case as much as 11%. The relative gains are larger on average with hand-written assembly disabled, but absolute performance is obviously nowhere near what you get with it enabled.
On the same library, the LTO build is 2-3.5 _times_ slower than the non-LTO build on all tests, although it does pass the test suite.
Sorry for crying wolf. I redid the LTO build and the huge performance drop is gone. Now I'm getting minor gains on most tests and 5-8% loss on a few. More worrying is that it is now failing a few tests. I'll look into both issues and report back.
The test failures are caused by a known bug in 4.8 trunk (54132).
After hacking the build system to make sure exactly the same optimisation flags are used when compiling and linking, I'm getting a 4% gain at best and a 1% loss on a couple of tests. Most tests change by less than 1%.
If the optimisation flags for compiling and linking differ, all kinds of bad things seem to happen.
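
For anyone wanting to check their own build for this, here is a rough sketch of a flag comparison in Python. It is not what I used in the build system hack; the helper names (codegen_flags, flag_mismatch), the choice of -O/-f/-m as the "interesting" prefixes, and the example command lines are all made up for illustration.

```python
import shlex

# Prefixes treated as optimisation/code-generation options.
# This selection is an assumption for the sketch, not an exhaustive list.
CODEGEN_PREFIXES = ("-O", "-f", "-m")


def codegen_flags(command):
    """Return the set of -O*, -f* and -m* arguments found in a command line."""
    return {arg for arg in shlex.split(command) if arg.startswith(CODEGEN_PREFIXES)}


def flag_mismatch(compile_cmd, link_cmd):
    """Return (only_in_compile, only_in_link) as two sorted lists."""
    at_compile = codegen_flags(compile_cmd)
    at_link = codegen_flags(link_cmd)
    return sorted(at_compile - at_link), sorted(at_link - at_compile)


if __name__ == "__main__":
    # Hypothetical command lines; substitute the ones your build really runs.
    compile_cmd = "gcc -O3 -flto -mcpu=cortex-a9 -mfpu=neon -c foo.c -o foo.o"
    link_cmd = "gcc -O2 -flto foo.o -o foo"

    only_compile, only_link = flag_mismatch(compile_cmd, link_cmd)
    print("only at compile time:", only_compile)
    print("only at link time:", only_link)
```

Two empty lists mean the compile and link steps agree on these flags. Anything else is worth fixing before trusting LTO numbers.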