Hi Will,
Thanks for the reply.
Will Newton <will.newton@linaro.org> writes:
On 27 August 2013 04:16, Michael Hudson-Doyle <michael.hudson@linaro.org> wrote:
Hi Michael,
There has been interest from LEG members to ensure that optimal library routines are used on their platforms. My understanding is that the "correct" way of doing this these days is to use ifuncs to select the best implementation for a given system.
I see that glibc 2.18 contains an ifunc-ed version of memcpy. Does the TCWG have a hit list of other functions that might get the same treatment? If so, does it have a plan and the resources to implement them? If it's a matter of resources, I think LEG might be able to help there.
I am not aware of any immediate plans for adding new functions, but we have a few JIRA cards open to write better implementations of various string functions. memset in particular looks like it should be fairly easy to optimise further with NEON.
OK. The main libc functions I've seen showing up in perf reports have been memcpy and malloc/free so I think you're on target there...
We have tests and benchmarks for string functions in the cortex-strings package on launchpad and development of new code has been done there under a liberal license before pushing to glibc/newlib/bionic/etc.
Right. But as far as you know offhand there's nothing in the pipeline that is likely to have different optimal implementations on A8/A9/A15?
There's been some investigation into optimising libm too, but I think any work on that would likely be further away still.
OK.
Note that ifunc resolvers get passed the HWCAP bits from the kernel, so that is the granularity at which a decision can easily be made. That essentially means you can dispatch based on the presence of NEON or VFP rather than, say, any micro-architectural details (although you could in theory extract this information in the resolver if it is easily accessible in userspace).
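To make the HWCAP-based dispatch concrete, here is a minimal sketch in C of the kind of decision a resolver could make. It is not glibc's actual code. The names select_memset, memset_generic, memset_vfp and memset_neon are hypothetical, and all three variants are plain byte loops standing in for real tuned routines. The two bit values are the ones 32-bit ARM Linux defines in <asm/hwcap.h>, copied under sketch-local names so that the example compiles on any host.

```c
#include <stddef.h>

/* Bit positions as defined for 32-bit ARM Linux in <asm/hwcap.h>,
   repeated under sketch-local names (an assumption made for portability). */
#define SKETCH_HWCAP_VFP  (1UL << 6)
#define SKETCH_HWCAP_NEON (1UL << 12)

typedef void *(*memset_fn)(void *, int, size_t);

/* Shared byte loop; a real variant would use VFP or NEON stores instead. */
static void *fill_bytes(void *dst, int c, size_t n)
{
    unsigned char *p = dst;
    while (n--)
        *p++ = (unsigned char)c;
    return dst;
}

/* Hypothetical implementations, distinct only so the choice is observable. */
static void *memset_generic(void *dst, int c, size_t n) { return fill_bytes(dst, c, n); }
static void *memset_vfp(void *dst, int c, size_t n)     { return fill_bytes(dst, c, n); }
static void *memset_neon(void *dst, int c, size_t n)    { return fill_bytes(dst, c, n); }

/* Same shape as an ARM ifunc resolver: it receives the HWCAP word and
   returns the implementation to bind. Only feature bits are visible here,
   so A8, A9 and A15 (all of which can report NEON) look identical. */
static memset_fn select_memset(unsigned long hwcap)
{
    if (hwcap & SKETCH_HWCAP_NEON)
        return memset_neon;
    if (hwcap & SKETCH_HWCAP_VFP)
        return memset_vfp;
    return memset_generic;
}
```

With GCC and glibc, a function of this shape could be attached to a symbol via the ifunc function attribute so that the choice is made once at relocation time. Outside a resolver, ordinary code on an ARM system can obtain the same word with getauxval(AT_HWCAP) from <sys/auxv.h> and pass it to select_memset.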
Oh hm. So you can't easily separate A8/A9/A15 per se? That's potentially unfortunate from what I hear, but is quite a way away from my area of expertise :-)
Cheers, mwh