I thought I'd send an update on the SPEC 2000 twolf variance. We're seeing high run-to-run variance in the results for the SPEC 2000 twolf, vpr, and galgel benchmarks. I've run tests on a PandaBoard, Origen, and IGEPv2 and measured a coefficient of variation of 0.014, 0.017, and 0.003 respectively. The PandaBoard and Origen are both Cortex-A9 while the IGEPv2 is a Cortex-A8, which suggests that the problem is Cortex-A9 specific. twolf is hard on the cache, so my theory is that it's something to do with the memory subsystem. I currently have a PandaBoard running with SMP, heap randomisation, virtual address randomisation, and the branch predictor all turned off, and the CPU downclocked to the non-overdrive 600 MHz. I'll let this run overnight and see if the results are tighter.
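For reference, the coefficient of variation is the standard deviation of the run times divided by their mean. A minimal sketch of the calculation follows, assuming the sample (n-1) standard deviation is wanted; the run times are made-up placeholders, not measurements from any of the boards, and the function name is only illustrative.

```python
import statistics


def coefficient_of_variation(samples):
    """Return the sample standard deviation divided by the mean."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("mean is zero, coefficient of variation is undefined")
    # statistics.stdev uses the n-1 (sample) denominator.
    return statistics.stdev(samples) / mean


# Hypothetical benchmark run times in seconds (placeholder data only).
runs = [100.0, 102.0, 98.0, 101.0, 99.0]
cv = coefficient_of_variation(runs)
print(f"coefficient of variation = {cv:.4f}")
```

A lower value means tighter results, so the same function can be used to compare the overnight run against the earlier figures.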
To solve Andrew's immediate problem, I'm running SPEC on the 64-bit core register shifts on the OMAP3. The results there are tight enough that we should be able to show any regressions.
-- Michael