Re: Linaro GCC 4.4 and 4.5 2010.09 released

14 Sep 2010


On Wed, Sep 15, 2010 at 5:19 AM, Guillaume Letellier
<Guillaume.Letellier@arm.com> wrote:
...
Hi,
...
Also available is an early release of optimised string routines for
the Cortex-A series, including a mix of NEON and Thumb-2 versions of
memcpy(), memset(), strcpy(), strcmp(), and strlen().  For more
information see:
 https://launchpad.net/cortex-strings
My understanding is that the NEON optimisation will give some performance gain *ONLY* on Cortex-A8, but it will also burn more energy. On other CPUs, e.g. Cortex-A9, there is no performance gain, yet it will still cost more energy.
I've heard that too but never had it confirmed.  I will ask.  The
output of this project will be a set of routines specialised for
Thumb-2, NEON, Cortex-A8, and Cortex-A9, where there is a benefit in
doing variants for each.  We need good non-NEON versions as NEON is
optional and it can't be used in the Linux kernel.
...
The Linaro toolchain doesn't target a specific platform but is generic for ARMv7 platforms. Are you expecting to see those optimisations turned on in the Linaro toolchain?
Sorry, I don't understand the question.  We want to spread these
routines out and get them integrated into all of the upstream C
libraries including NewLib, Bionic, and GLIBC.
...
The NEON-optimised version is also beneficial for large copies, but not for short copies when the NEON unit has to be powered up (the Linux kernel will take an exception to turn it on). I guess your benchmark didn't take that into account. Can the NEON-optimised version be changed so that it is not used for small copies?
My understanding is that the NEON unit is enabled per process, so once
you've turned it on it should stay on.  I assume the turn-on cost is
amortised across a run.  Note that if the data is not in the L1 cache
then the NEON unit wins even for small-ish (~64 byte) copies.
-- Michael
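
For illustration only, the "don't use NEON for small copies" idea from the
question above could be sketched as a size-based dispatcher.  This is not
code from cortex-strings: the names copy_dispatch, copy_scalar and
copy_wide, and the 64-byte SMALL_COPY_THRESHOLD, are all hypothetical, and
copy_wide is a plain memcpy() stand-in rather than a real NEON routine.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cut-off; a real value would have to come from measurement. */
#define SMALL_COPY_THRESHOLD 64

/* Stand-in for a scalar (non-NEON) copy routine: simple byte loop. */
static void *copy_scalar(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/* Stand-in for a NEON copy routine; here it only defers to memcpy(). */
static void *copy_wide(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Pick a routine by length: short copies avoid the "wide" path. */
void *copy_dispatch(void *dst, const void *src, size_t n)
{
    if (n < SMALL_COPY_THRESHOLD)
        return copy_scalar(dst, src, n);
    return copy_wide(dst, src, n);
}
```

A length-only test like this cannot see whether the data is already in the
L1 cache, which the reply above notes can make NEON win even at around 64
bytes, so any fixed threshold would be a compromise.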
