== String routines ==
* Wrote a Thumb-optimised strchr. As expected, it performs well on longer runs, but it is slower at sizes <16 bytes, and a lot of strchr calls are very short, so it's probably of no benefit in most cases
( https://wiki.linaro.org/WorkingGroups/ToolChain/Benchmarks/InitialStrchr?act... )
* Wrote a NEON memcpy. As previously found with memset, it performs well on A8 but poorly on A9. It does, however, handle an unaligned source or destination quite well even on A9; the unaligned vld1 case carries relatively little penalty. It performs comparably to the Bionic implementation: mine is a bit faster on shorter calls and Bionic is better on longer ones. I think that's because Bionic makes careful use of preloads, where I have so far used none.
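
To illustrate the two trade-offs above, here is a rough portable C sketch; it is not the Thumb or NEON code itself. The names strchr_wordwise, memcpy_preload and PRELOAD_DISTANCE are hypothetical, and the 4-byte word, 64-byte block and 256-byte preload distance are assumptions chosen for illustration, not tuned values. The first routine shows the fixed setup (alignment loop, byte replication) that a short string pays for without benefiting from the word loop; the second shows where preloads would slot into a block copy, using the GCC/Clang __builtin_prefetch builtin.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the four bytes in w is zero (standard bit trick). */
static int has_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}

/* Hypothetical sketch of a word-at-a-time strchr.
 * Caveat: the word loop loads whole aligned words, so it may read up to
 * three bytes past the terminator. An aligned word never crosses a page,
 * which is what assembly versions rely on, but strict ISO C does not
 * permit this unless the buffer itself extends that far. */
char *strchr_wordwise(const char *s, int c)
{
    const unsigned char ch = (unsigned char)c;

    /* Setup cost 1: byte loop until s is 4-byte aligned. */
    while (((uintptr_t)s & 3u) != 0) {
        if ((unsigned char)*s == ch)
            return (char *)s;
        if (*s == '\0')
            return NULL;
        s++;
    }

    /* Setup cost 2: replicate the target byte into every byte lane. */
    const uint32_t pattern = 0x01010101u * ch;

    /* Word loop: stop at the first word holding a match or the NUL. */
    for (;;) {
        uint32_t w;
        memcpy(&w, s, sizeof w);
        if (has_zero_byte(w) || has_zero_byte(w ^ pattern))
            break;
        s += 4;
    }

    /* Tail: locate the exact byte inside that word. */
    for (;;) {
        if ((unsigned char)*s == ch)
            return (char *)s;
        if (*s == '\0')
            return NULL;
        s++;
    }
}

/* Assumed preload distance in bytes; a real value would need benchmarking. */
#define PRELOAD_DISTANCE 256

/* Hypothetical sketch of a block copy with source preloads. */
void *memcpy_preload(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= 64) {
        /* Hint the cache about data we will need a few blocks ahead,
         * but only while that address is still inside the source. */
        if (n > PRELOAD_DISTANCE)
            __builtin_prefetch(s + PRELOAD_DISTANCE, 0, 0);
        memcpy(d, s, 64);   /* fixed-size copy, normally inlined */
        d += 64;
        s += 64;
        n -= 64;
    }
    memcpy(d, s, n);        /* remaining 0..63 bytes */
    return dst;
}
```

For a string of only a few bytes, strchr_wordwise may finish inside the alignment loop or run the word loop once and then rescan the same word in the tail, which is consistent with the short-call slowdown seen in the benchmarks. Whether preloads of this kind explain Bionic's advantage on longer copies is still only my guess.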
I'm on holiday up to and including 5th April.
Dave
linaro-toolchain@lists.linaro.org