== String routines ==
 * Finally finished the ltrace analysis of the whole of SPEC 2k6 and have written it up; I'll proofread it next week and then send it out to the benchmark list.
 * Ran memset and memcpy benchmarks at larger-than-cache sizes on A9.
 * For memcpy at larger-than-cache sizes (or probably mainly cache-miss data), Neon comes back to winning over ARM. My suspicion is that with cache hits we run out of bandwidth on Neon, but that doesn't happen in the cache-miss case; why it's faster in that case I'm not sure yet.
 * memset is still not faster with Neon, even at large sizes where the destination isn't in the cache.
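For anyone wanting to repeat the larger-than-cache measurements, a minimal sketch of that kind of timing harness follows. It is not the harness actually used for the numbers above. The names `copy_fn` and `copy_mb_per_s` are made up for illustration, and it assumes a POSIX system with `clock_gettime`. The routine under test is passed in as a function pointer, so a Neon and a plain ARM implementation with the memcpy signature could be compared with the same code.

```c
#define _POSIX_C_SOURCE 199309L
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical type: any routine with the same signature as memcpy. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Monotonic time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Hypothetical helper: run `iters` copies of `bytes` bytes through `fn`
 * and return the throughput in MiB/s. Returns -1.0 if allocation fails,
 * if the copy is wrong, or if the elapsed time is too short to measure.
 * Choose `bytes` well above the last-level cache size to get the
 * cache-miss case, or well below it to get the cache-hit case.
 */
double copy_mb_per_s(copy_fn fn, size_t bytes, int iters)
{
    /* Call through a volatile pointer so the compiler cannot drop
       or merge the repeated copies. */
    copy_fn volatile copy = fn;
    unsigned char *src = malloc(bytes);
    unsigned char *dst = malloc(bytes);
    double result = -1.0;

    if (src != NULL && dst != NULL) {
        /* Touch both buffers first so page faults are not timed. */
        memset(src, 0x5a, bytes);
        memset(dst, 0, bytes);

        double start = now_seconds();
        for (int i = 0; i < iters; i++)
            copy(dst, src, bytes);
        double elapsed = now_seconds() - start;

        /* Only report a figure if the copy was correct and measurable. */
        if (elapsed > 0.0 && memcmp(dst, src, bytes) == 0)
            result = ((double)bytes * iters) / (1024.0 * 1024.0) / elapsed;
    }

    free(src);
    free(dst);
    return result;
}
```

Calling `copy_mb_per_s(memcpy, (size_t)64 << 20, 8)` would time eight 64 MiB copies with the C library's memcpy; the 64 MiB figure is only an example of a size comfortably larger than the A9's caches. A memset variant would need a different function pointer type but could otherwise follow the same pattern.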
== Other ==
 * Started looking at 64-bit atomics.
 * Looking at the pot of QEMU work with Peter.
Dave