On 9 October 2012 12:09, Peter Maydell peter.maydell@linaro.org wrote:
On 9 October 2012 11:21, Matthew Gretton-Dann matthew.gretton-dann@linaro.org wrote:
On 9 October 2012 10:37, Jubi Taneja jubitaneja@gmail.com wrote:
I wanted to compare the objdump output of an application built with VFPv3 support against one built with VFPv4 support. I tried -mfpu=vfpv3 and -mfpu=vfpv4 with an ARM Cortex-A15 toolchain on my test code, but I cannot see any difference between the two objdumps.
Try the following (tested against FSF GCC):
/* arm-none-linux-gnueabi-gcc -mcpu=cortex-a15 -mfpu=vfpv4 -S -o- /tmp/fma.c -mfloat-abi=hard -O2 */
float f(float a, float b, float c)
{
  return a * b + c;
}
/* end of fma.c */
I would have expected that you would need a gcc option to tell it that non-IEEE floating point results are OK. Otherwise it's not valid to emit a fused multiply-add for a*b+c because IEEE specifies that you should get a rounding step between the multiply and the add. Or does gcc default to non-IEEE arithmetic?
GCC defaults to -ffp-contract=fast, which according to the manual:
enables floating-point expression contraction such as forming of fused multiply-add operations if the target has native support for them
As Peter notes, that is not IEEE compliant.
Thanks,
Matt