On Wed, Sep 15, 2010 at 10:25 AM, Guillaume Letellier <Guillaume.Letellier@arm.com> wrote:
The Linaro toolchain doesn't target a specific platform but is generic for armv7 platforms. Are you expecting to see those optimisations turned on in the Linaro toolchain?
Sorry, I don't understand the question. We want to spread these routines out and get them integrated into all of the upstream C libraries including NewLib, Bionic, and GLIBC.
My concern is that you want to spread it too widely! If the NEON-optimised memcpy() goes into GLIBC then I assume it will be used for any armv7 platform (unless I'm mistaken, you don't have a mechanism to detect whether GLIBC runs on a Cortex-A8 or A9, and you don't have two different versions of the glibc library for the two CPUs). So this library might be good for the A8 but not for the other CPUs.
GLIBC has a mechanism for picking the best routines to use based on the CPU capabilities. This means that GLIBC can include A8 and A9 versions both with and without NEON, Ubuntu can ship all of these versions, and the dynamic linker can choose the best one based on the chip it is running on.
NewLib and Bionic are set at compile time but are normally used on a fixed platform.
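To make the idea of choosing a routine from the CPU capabilities concrete, here is a minimal sketch in C. It is not the GLIBC mechanism itself, which picks between whole library builds at load time; it only illustrates the same decision with a function pointer. The capability bits (CAP_NEON, CAP_CORTEX_A9), the three routine names, and the policy of preferring a non-NEON copy on the A9 are all assumptions made for illustration, and each "tuned" routine simply forwards to memcpy() so the sketch builds on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical capability bits. A real system would obtain these from the
   kernel or the dynamic linker rather than hard-coding them. */
#define CAP_NEON      (1u << 0)
#define CAP_CORTEX_A9 (1u << 1)

typedef void *(*memcpy_fn)(void *dst, const void *src, size_t n);

/* Stand-ins for tuned routines. Each forwards to memcpy() so that the
   sketch stays portable; real versions would be hand-written assembly. */
static void *memcpy_generic(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

static void *memcpy_a8_neon(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

static void *memcpy_a9(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Pick a routine from the capability bits.
   Assumed policy, for illustration only: an A9 gets the non-NEON A9
   routine even if NEON is present; any other NEON core gets the A8 NEON
   routine; everything else gets the generic one. */
static memcpy_fn select_memcpy(unsigned caps)
{
    if (caps & CAP_CORTEX_A9)
        return memcpy_a9;
    if (caps & CAP_NEON)
        return memcpy_a8_neon;
    return memcpy_generic;
}

/* Convenience wrapper: select a routine and copy with it. */
static void *copy_with_best(unsigned caps, void *dst, const void *src, size_t n)
{
    memcpy_fn impl = select_memcpy(caps);
    assert(impl != NULL);
    return impl(dst, src, n);
}
```

A caller would compute the capability bits once at start-up and keep the returned pointer, so the selection cost is not paid on every copy. A library fixed at compile time, as described for NewLib and Bionic, would instead build in exactly one of these routines.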
I assume the turn on cost is amortised across a run. Note that if the data is not in the L1 cache then the NEON unit wins even for small-ish (~64 byte) copies.
Only on Cortex-A8. But still expensive power-wise.
Could you point me at a reference on the power consumption of the NEON unit? It will take power, but I don't know how much or how significant it is.
-- Michael