2011/3/24 Michael Hope michael.hope@linaro.org:
On Fri, Mar 25, 2011 at 2:59 AM, Barry Song 21cnbao@gmail.com wrote:
2011/3/24 Andrew Stubbs andrew.stubbs@linaro.org
On 24/03/11 11:05, Imre Kaloz wrote:
On Thu, 24 Mar 2011 11:36:17 +0100, Andrew Stubbs andrew.stubbs@linaro.org wrote:
However, you can build your own compiler from the Linaro sources, and then build the libraries you need to match, and you can have v5 support. This is not a straightforward process. :(
You can always use the OpenWrt buildroot to easily build a custom Linaro-based cross-compiler; just make sure you select the right libc for your needs (we use uClibc by default) and a target similar to yours.
Or OpenEmbedded or CrossTool / CrossTool-NG.
Thank you all! You have really helped me a lot. Your toolchain team is great!
In fact, I already knew how to compile a toolchain. As I said, I have built one with simple options:
Configured with: ../gcc-linaro-4.4-2011.02-0/configure --target=arm-none-linux-gnueabi --prefix=/home/vmuser/development/toolchain/build-toolchain/tools --enable-languages=c,c++ --disable-libgomp
Thread model: posix
gcc version 4.4.5 (Linaro GCC 4.4-2011.02-0)
With it, U-Boot works with arch=armv5.
I want to know whether any performance is lost with my simple configure options if the toolchain is used for ARMv7 with VFPv3. The GCC documentation says that the options used when configuring GCC become its defaults when it runs. But what real benefit does a toolchain get from being configured with a default arch and FPU, given that we can still switch the arch options each time we run gcc?
Is glibc the key? If glibc is compiled with GCC options for a specific ARM arch and floating-point unit, does that improve glibc's performance on that arch? And will that glibc then no longer support other archs or SoCs that lack the specified floating-point unit?
Hi Barry. The short answer is 'it depends' :)
GCC is more than a compiler and includes other things such as a runtime library (libgcc) and hooks into the libc for features like thread local storage. These are built for the architecture and floating point unit options you pass to GCC's configure, so if you want one toolchain that runs everywhere then you need to configure it for the lowest common denominator (normally an ARMv5T in ARM mode with no FPU). A similar argument applies to GLIBC.
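To make the configure-time defaults concrete, here is a minimal sketch of how the two cases might look. It assumes GCC's ARM configure options --with-arch, --with-float and --with-fpu; the helper name configure_line is made up for illustration, the --prefix option is left out, and the script only prints the command lines, it does not run them.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a GCC configure line with the
# given ARM defaults baked in.
# Usage: configure_line ARCH FLOAT_ABI [FPU]
configure_line() {
    line="../gcc-linaro-4.4-2011.02-0/configure --target=arm-none-linux-gnueabi"
    line="$line --enable-languages=c,c++ --disable-libgomp"
    line="$line --with-arch=$1 --with-float=$2"
    # Only name an FPU when one was given.
    if [ -n "${3:-}" ]; then
        line="$line --with-fpu=$3"
    fi
    printf '%s\n' "$line"
}

# Lowest common denominator: ARMv5T, software floating point.
configure_line armv5t soft

# Tuned for a Cortex-A9 class target: ARMv7-A with VFPv3.
configure_line armv7-a softfp vfpv3
```

Passing -march=armv7-a -mfpu=vfpv3 -mfloat-abi=softfp on the gcc command line changes how your own sources are compiled, but libgcc and glibc stay as they were built, which is why the configure-time choice still matters.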
The next question is, does this matter for your application? What workload will your product run and will it be meaningfully affected by this lowest common denominator build?
* If your application uses a lot of floating point, then the lack of FPU support in GLIBC matters
Completely right. I installed Linaro GCC 4.5 with:
sudo add-apt-repository ppa:linaro-maintainers/toolchain
sudo apt-get update
sudo apt-get install gcc-4.5-arm-linux-gnueabi
Then I ran the Whetstone benchmark on a low-frequency Cortex-A9 FPGA with VFPv3.
The result showed that the Linaro toolchain from apt-get in fact uses a generic glibc: it gave only 3.3 Whetstone MIPS. I then compiled a glibc with VFPv3 support to replace the one from apt-get, and the new toolchain gave 16.7 Whetstone MIPS, roughly a 400% improvement.
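For reference, the improvement figure can be checked from the two Whetstone numbers above. This is only a small arithmetic sketch; the function name speedup is made up for the example.

```shell
#!/bin/sh
# speedup OLD NEW: print the speedup factor and the percentage improvement.
#   factor      = NEW / OLD
#   improvement = (NEW - OLD) / OLD * 100
speedup() {
    LC_ALL=C awk -v old="$1" -v new="$2" 'BEGIN {
        printf "%.2fx faster, %.0f%% improvement\n", new / old, (new - old) / old * 100
    }'
}

# 3.3 MIPS with the generic glibc, 16.7 MIPS with the VFPv3 glibc.
speedup 3.3 16.7   # prints: 5.06x faster, 406% improvement
```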
* If your product has limited memory, then the smaller code size of Thumb-2 is worthwhile
* If you need to squeeze out another 5% in performance, then using ARMv7 instead of ARMv5 will help
Completely right. I ran the edn benchmark from http://www.mrtc.mdh.se/projects/wcet/wcet_bench/edn/edn.c. EDN exercises memory and fixed-point performance. My test results show a 5-10% performance improvement for edn with Linaro 4.5 and the armv7-a option.
There are other technical solutions such as:
* Building libgcc and glibc for the different variants and picking the best at link time
* Building them and picking at dynamic link time using hwcaps or similar
These make sense for a generic binary toolchain such as the CodeSourcery one Andrew mentioned, and for generic distributions such as Ubuntu but not for a focused end user product.
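Picking a variant at run time depends on knowing what the CPU supports. As a rough illustration only (the dynamic linker gets the hardware capabilities from the kernel directly and does not parse text files), a script could make a similar decision from the Features line that ARM Linux kernels print in /proc/cpuinfo. The function name has_cpu_feature is an assumption made up for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FEATURE appears as a whole word on the
# "Features" line of a cpuinfo-style file.
# Usage: has_cpu_feature FEATURE [CPUINFO_FILE]
has_cpu_feature() {
    feature=$1
    cpuinfo=${2:-/proc/cpuinfo}
    # -w matches whole words only, so "vfp" does not match "vfpv3".
    grep -i '^Features' "$cpuinfo" 2>/dev/null | grep -qw -- "$feature"
}

# Choose between a VFPv3 build and the generic build of a library.
if has_cpu_feature vfpv3; then
    echo "use the VFPv3 build"
else
    echo "use the generic build"
fi
```

On a machine without a Features line (for example an x86 build host) the check simply fails and the generic build is chosen.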
So my conclusion is that the best choice for a Cortex-A9 with VFPv3 is:
1. using Linaro GCC 4.4/4.5 with ARMv7 optimization
2. using a glibc compiled with VFPv3 support
One strange thing I found was that nbench enters an endless loop in the ASSIGNMENT case and never finishes it. nbench can be downloaded from http://www.tux.org/~mayer/linux/nbench-byte-2.2.3.tar.gz. I haven't had time to figure out whether it is a Linaro GCC bug, so for now I just skip this case by writing NBENCH.CONF as follows:
CUSTOMRUN=T
DONUMSORT=T
DOLU=T
DOSTRINGSORT=T
DOBITFIELD=T
DOEMF=T
DOFOUR=T
DOASSIGN=F
ASSIGNMINSECONDS=1
DOHUFF=T
DOIDEA=T
If you have time, could you help find the cause?
-- Michael
Thanks Barry