On Thu, Jul 21, 2016 at 11:30 PM, Jim Wilson jim.wilson@linaro.org wrote:
On Thu, Jul 21, 2016 at 6:33 PM, Jeffrey Walton noloader@gmail.com wrote:
I think the vfpd32 CPU flag means I have 32 D-registers. The neon and vfpv3 CPU flags mean I want something more than -mfpu=neon-fp16, but I'm not sure what that is.
neon implies vfpv3 and 32 D-registers and asimd/neon support, so that part is correct. It isn't obvious to me whether you have half-precision float support. The "half" printed by the kernel means that half-word loads are supported, which is only false for some obsolete parts, I think. The kernel doesn't appear to check whether the hardware has half-precision float support, so you can't determine that from /proc/cpuinfo.
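To make the flag discussion concrete, here is a minimal sketch of pulling the flags out of the "Features" line of /proc/cpuinfo. The helper name parse_features and the SAMPLE text are hypothetical; the sample only reuses flag names mentioned in this thread and is not output from the actual board.

```python
def parse_features(cpuinfo_text):
    """Return the set of flags on the first 'Features' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            return set(value.split())
    return set()


# Illustrative sample only (assumption): not taken from real hardware.
SAMPLE = "Processor\t: ARMv7\nFeatures\t: half neon vfpv3 vfpd32\n"

flags = parse_features(SAMPLE)
# "half" here is the half-word load flag, not half-precision float support.
print(sorted(flags))
```

On a real system the text would come from reading /proc/cpuinfo; a sample string is used here so the sketch runs anywhere.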
OK, so looking at http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0002a/ch01s03..., it appears the "minimum" for the -mfpu option is VFPv3-D16. Since I have 32 D-registers, I can use VFPv3-D32, which should equate to -mfpu=neon-vfp3 (which does not seem to exist).
I can't use -mfpu=neon-vfpv4 because vfpv4 is not signaled, so the hardware could be missing the half-precision and FMA extensions implied by vfpv4.
So I guess the question is, what do I use for -mfpu=neon-vfp3 (or -mfpu=neon-vfp3-d32)? Is -mfpu=neon enough?
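The reasoning above can be sketched as a small lookup from kernel flags to a candidate -mfpu value. This is an illustration under assumptions, not a definitive rule: suggest_mfpu is a hypothetical helper, and it relies on GCC's documentation describing -mfpu=neon as an alias for neon-vfpv3 (NEON with VFPv3 and 32 D-registers), which is worth confirming against your compiler version. It never suggests neon-fp16, because half-precision support cannot be read from /proc/cpuinfo.

```python
def suggest_mfpu(flags):
    """Map a set of /proc/cpuinfo feature flags to a conservative GCC -mfpu value.

    Hypothetical helper; returns None when no VFPv3-or-later flag is present.
    """
    # NEON implies 32 D-registers; vfpd32 signals them explicitly.
    has_d32 = "neon" in flags or "vfpd32" in flags
    if "neon" in flags:
        # Only step up to neon-vfpv4 when the kernel actually reports vfpv4.
        return "neon-vfpv4" if "vfpv4" in flags else "neon"
    if "vfpv4" in flags:
        return "vfpv4" if has_d32 else "vfpv4-d16"
    if "vfpv3" in flags:
        return "vfpv3" if has_d32 else "vfpv3-d16"
    return None


# Flags discussed in this thread: neon, vfpv3 and vfpd32, but no vfpv4.
print(suggest_mfpu({"half", "neon", "vfpv3", "vfpd32"}))
```

The choice is deliberately conservative: it picks the lowest option consistent with the reported flags, so the generated code should not use instructions the hardware may lack.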
Thanks again for the help with this.