Hi Renato,

 

I think that, to make the best possible judgement here, we would need answers to the following questions:

 

·         Does this result in non-compliance with IEEE 754 regarding denormals? NaNs? INFs? Something else?

·         Also, does the C/C++ standard say anything about IEEE 754 compliance?

·         I checked the OpenCL 1.1 spec, which says that IEEE 754-compliant treatment of INFs and NaNs is mandatory, support for signalling NaNs is not required, and support for denormalized numbers is optional (see section 7.2).

·         I'm guessing that Clang's default is to produce fully IEEE 754-compliant code. Is it? Is that the right choice? Or would not handling denormals fully correctly be a better default? What about NaNs, INFs, and others?
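To make the NaN/INF part of these questions concrete, here is a minimal C sketch of the kind of check one could run on each target. The helper names (mul_f32, overflow_gives_inf, and so on) are hypothetical and only for illustration; the expected results are what IEEE 754 specifies in the default rounding mode, not a statement about what any particular NEON or VFP unit actually does.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: multiply through volatile variables so the
   compiler cannot fold the product at compile time. */
float mul_f32(float a, float b) {
    volatile float x = a, y = b;
    volatile float r = x * y;
    return r;
}

/* 1 if an overflowing product becomes an infinity. */
int overflow_gives_inf(void) {
    float r = mul_f32(FLT_MAX, 2.0f);
    return isinf(r) != 0;
}

/* 1 if INF * 0 is an invalid operation that yields NaN. */
int inf_times_zero_gives_nan(void) {
    float r = mul_f32(INFINITY, 0.0f);
    return isnan(r) != 0;
}

/* 1 if a quiet NaN operand propagates to the result. */
int nan_propagates(void) {
    float r = mul_f32(NAN, 1.0f);
    return isnan(r) != 0;
}
```

All three functions should return 1 on an implementation that follows IEEE 754 for single-precision multiplication; a 0 from any of them on a given core would point to where compliance is lost.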

 

Thanks,

 

Kristof

 

From: Renato Golin [mailto:renato.golin@linaro.org]
Sent: 19 March 2013 21:56
To: Linaro Toolchain
Cc: Kristof Beyls; Tim Northover
Subject: LLVM ARM NEON VMUL.f32

 

Hi folks,

 

I found an issue while fixing a test using the wrong VMUL.f32, and I'd like to know what our choice should be on this slightly controversial topic.

 

Basically, LLVM chooses to lower single-precision FMUL to NEON's VMUL.f32 instead of VFP's version because, on some cores (A8, A5 and Apple's Swift), the VFP variant is really slow.

 

This is all cool and dandy, but NEON is not IEEE 754 compliant, so the result is slightly different. So slightly that only one test, which was really pushing the boundaries (i.e. going below FLT_MIN), caught it.
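As a rough illustration of the "going below FLT_MIN" case, the C sketch below classifies the result of a single-precision multiply. The function name classify_product is hypothetical, and this is not the failing test itself. It assumes the compiler lowers the multiply to the instruction under discussion; the volatile variables are only there to stop constant folding.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: multiply two floats at run time and return the
   fpclassify() category of the product (FP_NORMAL, FP_SUBNORMAL,
   FP_ZERO, FP_INFINITE or FP_NAN). */
int classify_product(float a, float b) {
    volatile float x = a, y = b;
    volatile float r = x * y;
    float v = r; /* copy out so the macro reads a plain float */
    return fpclassify(v);
}
```

With IEEE 754 gradual underflow, classify_product(FLT_MIN, 0.5f) returns FP_SUBNORMAL. If the unit doing the multiply flushes denormals to zero, the same call returns FP_ZERO, which is the sort of small difference only a boundary-pushing test would notice.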

 

There are two ways we can go here:

 

1. Strict IEEE compatibility: *only* lower to NEON's VMUL if unsafe-math is on. This will make generic single-precision code slower, but you can always turn unsafe-math on if you want more speed.

 

2. Continue using NEON for f32 by default and put a note somewhere that people should turn this option (FeatureNEONForFP) off on A5/A8 if they *really* care about maximum IEEE compliance.

 

Apple has already said that, for Darwin, option 2 is still the choice. Do we agree and ignore this issue? Or do we want strict conformance by default for GNU/EABI?

 

GCC uses fmuls...

 

cheers,

--renato

