I've gone through and checked the 64-bit operation improvements that Andrew has made to GCC. For everything but the Cortex-A8, GCC uses the NEON unit for 64-bit operations, and Andrew's improvements mean we can stay in NEON for longer without an expensive transfer back and forth to the core registers.
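For anyone who wants to poke at this themselves, here is a minimal sketch of the kind of code involved: a short chain of 64-bit operations where each intermediate value feeds the next. The function name mix64 is made up for illustration and is not taken from the benchmarks. Whether the chain actually stays in NEON registers depends on the GCC version and the target options, so treat this as something to inspect rather than a statement of what the compiler does.

```c
#include <stdint.h>

/* Hypothetical example, not from the benchmark suite: a chain of
 * 64-bit operations with no 32-bit work in between. */
uint64_t mix64(uint64_t a, uint64_t b, uint64_t c)
{
    uint64_t t = a + b;   /* 64-bit add */
    t ^= c;               /* 64-bit exclusive-or */
    return t << 3;        /* 64-bit shift left by a constant */
}
```

Compiling with something like -O2 -S -mcpu=cortex-a9 -mfpu=neon -mfloat-abi=softfp and reading the assembly should show whether the intermediate values move between NEON and the core registers. The shift at the end is there because shift-left-by-n is the case still being worked on.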
The results are here: https://wiki.linaro.org/MichaelHope/Sandbox/64BitOperations
Once we've fixed the shift-left-by-n pattern I'll turn this into an Outputs[1] page.
Benchmark results have been sent to the linaro-toolchain-benchmarks list.
-- Michael