On Mon, Sep 5, 2011 at 9:32 PM, David Gilbert <david.gilbert@linaro.org> wrote:
On 5 September 2011 04:21, Michael Hope <michael.hope@linaro.org> wrote:
On Fri, Sep 2, 2011 at 4:08 PM, Michael Hope <michael.hope@linaro.org> wrote:
http://people.linaro.org/~michaelh/incoming/strings-performance/sizes-strlen...
That's very nice, although quite bizarre. Even the lower end of the steps is suitably fast, so there's not really anything to worry about, but it would be great to understand where the 1500 cycle difference is going at the large end.
I've re-run the strlen tests on four different A9 chips which cover four different revisions of the A9 core. See: http://people.linaro.org/~michaelh/incoming/variants-strlen-08.png
I'm afraid I don't know how to turn the /proc/cpuinfo variant and revision into an ARM rXpY revision number. vela is v1:r0. ursa is v1:r2. leo is v2:r1. silverbell is v0:r1. All machines have different clock speeds, so I've normalised each graph to that machine's 64 byte loop performance.
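For what it's worth, here is a rough sketch of one way the conversion might be done. It assumes the usual reading of the ARM Main ID Register, where the variant field is the major revision (the X in rXpY) and the revision field is the minor revision (the Y), and that /proc/cpuinfo reports those two fields as "CPU variant" and "CPU revision". The function name arm_rxpy and the sample text are made up for illustration; I haven't checked this against the machines above.

```python
import re


def arm_rxpy(cpuinfo_text):
    """Return an 'rXpY' string from /proc/cpuinfo text, or None if fields are missing.

    Assumption: 'CPU variant' is the major revision and 'CPU revision'
    is the minor revision, following the ARM Main ID Register layout.
    """
    variant = re.search(r"^CPU variant\s*:\s*(\S+)", cpuinfo_text, re.MULTILINE)
    revision = re.search(r"^CPU revision\s*:\s*(\S+)", cpuinfo_text, re.MULTILINE)
    if not (variant and revision):
        return None
    # Base 0 lets int() accept both hex ("0x1") and plain decimal ("2").
    major = int(variant.group(1), 0)
    minor = int(revision.group(1), 0)
    return f"r{major}p{minor}"


# Hypothetical sample in the style of an ARM /proc/cpuinfo (not real output).
sample = "CPU variant\t: 0x1\nCPU revision\t: 2\n"
print(arm_rxpy(sample))
```

If that mapping holds, vela would be r1p0, ursa r1p2, leo r2p1 and silverbell r0p1. On a real board you would pass in the contents of /proc/cpuinfo instead of the sample string.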
The two v1 devices (vela and ursa) have the funny response past 256 bytes. The v0 (silverbell) and v2 (leo) devices don't. It could be something to do with the variant or the memory subsystems.
The good news is that the new strlen() is much faster than the alternatives so we're not blocked.
-- Michael