On Wed, Dec 8, 2010 at 7:10 PM, Michael Hope michael.hope@linaro.org wrote:
On Wed, Dec 8, 2010 at 6:59 PM, Prashanth S prashanth.s@samsung.com wrote:
Dear All
Our team in Samsung collected some performance metrics for the following 3 GCC cross compilers
1. Gentoo Compiler (part of Chrome OS Build Environment)
2. GCC 4.4.1 (Code Sourcery)
3. Linaro (gcc-linaro-4.5-2010.11-1)
The Linaro toolchain was built with Michael Hope's script. We only modified the flags: "GCCFLAGS = --with-mode=thumb --with-arch=armv7-a --with-float=softfp --with-fpu=neon --with-fpu=vfpv3-d16"
Some notes on the flags themselves:
The options in your email are quite decent: -mtune=cortex-a9 -mfloat-abi=softfp -mfpu=neon -ftree-vectorize -fomit-frame-pointer -ffast-math -mcpu=cortex-a9 -O3
You can prune that back a bit to: -mcpu=cortex-a9 -mfloat-abi=softfp -mfpu=neon -ffast-math -O3
as -O turns on -fomit-frame-pointer, -O3 turns on -ftree-vectorize, and -mcpu implies -mtune.
You can prune this back even further by configuring and building the compiler using --with-cpu and similar options. These set the defaults so you can use just: -ffast-math -O3
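If you want to check which flags a given -O level already turns on for your particular compiler build, GCC can report this itself via "-Q --help=optimizers". The following is only a rough sketch, assuming Python 3.7 or later on the build host; the helper names (parse_enabled, enabled_at, redundant) are made up for illustration and the regular expression assumes the usual "-fname [enabled]" report layout.

```python
import re
import subprocess

# Matches report lines such as "  -ftree-vectorize    [enabled]".
# Assumption: the compiler prints one option per line in this layout.
_ENABLED = re.compile(r"^\s*(-f[\w-]+)\s+\[enabled\]\s*$")


def parse_enabled(report):
    """Return the set of -f options marked [enabled] in a -Q --help report."""
    flags = set()
    for line in report.splitlines():
        match = _ENABLED.match(line)
        if match:
            flags.add(match.group(1))
    return flags


def enabled_at(level, compiler="gcc"):
    """Ask the compiler which optimizer flags are on at a level such as "-O3"."""
    result = subprocess.run(
        [compiler, level, "-Q", "--help=optimizers"],
        capture_output=True, text=True, check=True,
    )
    return parse_enabled(result.stdout)


def redundant(explicit, level, compiler="gcc"):
    """Return the explicitly listed flags that the -O level already enables."""
    return sorted(set(explicit) & enabled_at(level, compiler))
```

For example, redundant(["-ftree-vectorize", "-fomit-frame-pointer"], "-O3", compiler="arm-linux-gnueabi-gcc") would list whichever of the two flags can be dropped. The compiler name there is a placeholder; substitute the name of your own cross compiler.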
Be careful with -ffast-math. It will improve performance, but floating-point calculations no longer follow the full IEEE 754 standard. I'd leave it off until you have verified individual packages.
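One concrete way results can change: -ffast-math enables -fassociative-math, which lets the compiler reorder floating-point additions. The small sketch below uses Python rather than C, on the assumption that its floats are IEEE 754 doubles (true on common platforms). It only shows that two orderings of the same sum differ; it does not show what GCC will actually choose for any given program.

```python
a, b, c = 1e16, -1e16, 1.0

# Evaluation order as written in the source: the large terms cancel first.
as_written = (a + b) + c

# One reordering that is permitted once addition is treated as associative.
# Here c is lost: 1.0 is only half the spacing between doubles near 1e16,
# so b + c rounds back to b.
reassociated = a + (b + c)

print(as_written)    # 1.0
print(reassociated)  # 0.0
```

Code that relies on a particular evaluation order (compensated summation, for instance) is the kind of thing worth checking per package before enabling the flag.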
-mfloat-abi=hard will give better performance but may involve a porting effort.
-fno-common also gives a small improvement in various benchmarks, but may break some programs.
Note that the 'best' options depend on the individual program. It's not unusual for programs to sometimes do better with -O2 than -O3, or better without -ftree-vectorize. I'd be interested in any situations you run across.
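A simple way to compare option sets per program is to build one binary per set and time each of them. The following is a minimal sketch only, assuming Python 3 is available wherever the binaries run; best_time and rank are hypothetical helper names, and taking the fastest of several runs is just one reasonable choice for reducing noise.

```python
import subprocess
import time


def best_time(command, repeats=5):
    """Return the fastest wall-clock time in seconds over several runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


def rank(variants, repeats=5):
    """Sort a {label: command} mapping by best time, fastest first."""
    results = {label: best_time(cmd, repeats) for label, cmd in variants.items()}
    return sorted(results.items(), key=lambda item: item[1])
```

For example, rank({"-O2": ["./bench-O2"], "-O3": ["./bench-O3"]}) returns the labels with their times, fastest first. The binary names are placeholders for builds of the same benchmark made with each option set.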
-- Michael