The Linaro toolchain doesn't target a specific platform but is generic
for ARMv7 platforms. Are you expecting to see those optimisations turned on in the Linaro toolchain?
Sorry, I don't understand the question. We want to spread these routines out and get them integrated into all of the upstream C libraries, including Newlib, Bionic, and glibc.
My concern is that you want to spread it too widely! If the NEON-optimised memcpy() goes into glibc, then I assume it will be used on every ARMv7 platform. Unless I'm mistaken, you don't have a mechanism to detect whether glibc is running on a Cortex-A8 or a Cortex-A9, and you don't have two different versions of the glibc library for the two CPUs. So this library might be good for the A8 but not for the other CPUs.
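To illustrate the kind of detection mechanism I mean, here is a minimal sketch of a runtime check on Linux. It is not something glibc does today as far as I know. It assumes /proc/cpuinfo exposes a "CPU part" field, which reads 0xc08 on a Cortex-A8 and 0xc09 on a Cortex-A9. The function names and the policy of picking the NEON copy only on the A8 are hypothetical, made up for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* ARM primary part numbers as printed in the "CPU part" field. */
#define PART_CORTEX_A8 0xc08
#define PART_CORTEX_A9 0xc09

/* Hypothetical labels for the two copy strategies under discussion. */
enum copy_impl { COPY_GENERIC, COPY_NEON };

/* Return the first "CPU part" value found in cpuinfo-style text, or -1. */
long parse_cpu_part(const char *cpuinfo)
{
    const char *p = strstr(cpuinfo, "CPU part");
    if (p == NULL)
        return -1;
    p = strchr(p, ':');
    if (p == NULL)
        return -1;
    /* Base 16 skips leading whitespace and accepts an optional "0x". */
    return strtol(p + 1, NULL, 16);
}

/* Assumed policy for this sketch: only the Cortex-A8 gets the NEON copy. */
enum copy_impl choose_copy_impl(long part)
{
    return part == PART_CORTEX_A8 ? COPY_NEON : COPY_GENERIC;
}

/* Read the start of /proc/cpuinfo and decide; fall back to the generic copy. */
enum copy_impl detect_copy_impl(void)
{
    char buf[4096];
    size_t n;
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f == NULL)
        return COPY_GENERIC;
    n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    return choose_copy_impl(parse_cpu_part(buf));
}
```

A library would call something like detect_copy_impl() once at start-up and cache the result, rather than parsing the file on every copy.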
My understanding is that the NEON unit is enabled per process, so once you've turned it on it should stay on.
It's turned off by the kernel at context switch. For a thread dealing with a lot of data, it makes sense. Turning on NEON for a small copy doesn't make sense on embedded platforms.
I assume the turn on cost is amortised across a run. Note that if the data is not in the L1 cache then the NEON unit wins even for small-ish (~64 byte) copies.
Only on Cortex-A8. But still expensive power-wise.
Guillaume