What -mfpu option is used with the neon, vfpv3 and vfpd32 flags?


Jeffrey Walton

22 Jul 2016

1:33 a.m.

Hi Everyone,

I'm looking at the features of a BeagleBone Black. Its /proc/cpuinfo is below.

I think the vfpd32 cpu flag means I have 32 D-registers. The neon and vfpv3 cpu flags mean I want something more than -mfpu=neon-fp16, but I'm not sure what that is.

My question is, what GCC ARM option is used when we encounter the neon, vfpv3 and vfpd32 flags?

Thanks in advance.

**********

$ cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 2 (v7l)
BogoMIPS        : 996.14
Features        : half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpd32
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x3
CPU part        : 0xc08
CPU revision    : 2

Hardware        : Generic AM33XX (Flattened Device Tree)
Revision        : 0000
Serial          : 0000000000000000


Jim Wilson

22 Jul 2016

3:30 a.m.

On Thu, Jul 21, 2016 at 6:33 PM, Jeffrey Walton noloader@gmail.com wrote:

> I think the vfpd32 cpu flag means I have 32 D-registers. The neon and vfpv3 cpu flags mean I want something more than -mfpu=neon-fp16, but I'm not sure what that is.

neon implies vfpv3, 32 D-registers and asimd/neon support, so that part is correct. It isn't obvious to me whether you have half-precision float support. The "half" printed by the kernel means that half-word loads are supported, which is only false for some obsolete parts, I think. The kernel doesn't appear to check whether the hardware has half-precision float support, so you can't determine that from /proc/cpuinfo.

Jim
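The point about the Features line can be illustrated with a minimal Python sketch. It only splits the flags out of /proc/cpuinfo-style text; the helper name `parse_features` and the abbreviated `SAMPLE` text are illustrative, not part of any existing tool.

```python
def parse_features(cpuinfo_text):
    """Return the set of flags on the first 'Features' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


# Abbreviated copy of the BeagleBone Black output quoted in this thread.
SAMPLE = """\
processor       : 0
model name      : ARMv7 Processor rev 2 (v7l)
Features        : half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpd32
CPU part        : 0xc08
"""

features = parse_features(SAMPLE)

# Flags that matter for choosing -mfpu; vfpv4 is absent on this board.
print(sorted(features & {"neon", "vfpv3", "vfpd32", "vfpv4"}))

# "half" is present, but as explained above it refers to half-word
# loads, not half-precision floating point, so it must not be read
# as fp16 support.
print("half" in features)
```

On a real board the text would come from `open("/proc/cpuinfo").read()` instead of `SAMPLE`.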

Jeffrey Walton

3:48 a.m.

On Thu, Jul 21, 2016 at 11:30 PM, Jim Wilson jim.wilson@linaro.org wrote:

> On Thu, Jul 21, 2016 at 6:33 PM, Jeffrey Walton noloader@gmail.com wrote:
>
>> I think the vfpd32 cpu flag means I have 32 D-registers. The neon and vfpv3 cpu flags mean I want something more than -mfpu=neon-fp16, but I'm not sure what that is.
>
> neon implies vfpv3, 32 D-registers and asimd/neon support, so that part is correct. It isn't obvious to me whether you have half-precision float support. The "half" printed by the kernel means that half-word loads are supported, which is only false for some obsolete parts, I think. The kernel doesn't appear to check whether the hardware has half-precision float support, so you can't determine that from /proc/cpuinfo.

Thanks Jim.

Is there an arm-msr-tools or similar that has setuid so we can access the MSRs?

My thinking is, I can tell people to install arm-msr-tools so we can query for the features directly. I want to avoid telling people to run a test script as root.

Jeff

Jim Wilson

4:15 a.m.

On Thu, Jul 21, 2016 at 8:48 PM, Jeffrey Walton noloader@gmail.com wrote:

> Is there an arm-msr-tools or similar that has setuid so we can access the MSRs?

I'm not familiar with any such tool, but I haven't looked for one before. I found an x86 msr-tools project at github with a web search. It seems to be a standard part of debian/ubuntu x86 distros. I don't see an obvious arm equivalent.

You could ask the kernel developers to add a hardware capability (hwcap) check for half-precision fp and emit that info into the /proc/cpuinfo file, though it would take a while for that to be implemented and propagate to your users.

Jim
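For context on the hwcap suggestion: the hardware capability word is already readable by an unprivileged process through the auxiliary vector, so no setuid tool is needed for the bits the kernel does export. The sketch below decodes a 32-bit ARM hwcap word in Python. The bit positions are a subset taken from my reading of the kernel's 32-bit ARM `asm/hwcap.h` and should be checked against your kernel headers; `decode_arm_hwcap` and `read_hwcap` are hypothetical helper names. Consistent with the reply above, no half-precision bit is listed.

```python
import ctypes

AT_HWCAP = 16  # auxiliary-vector key, as defined in <elf.h>

# Assumed subset of 32-bit ARM HWCAP bit positions (verify against asm/hwcap.h).
ARM_HWCAP_BITS = {
    1: "half",       # half-word loads/stores, NOT half-precision float
    2: "thumb",
    4: "fastmult",
    6: "vfp",
    7: "edsp",
    11: "thumbee",
    12: "neon",
    13: "vfpv3",
    14: "vfpv3d16",
    15: "tls",
    16: "vfpv4",
    19: "vfpd32",
}


def decode_arm_hwcap(hwcap):
    """Return the flag names whose bits are set in a 32-bit ARM hwcap word."""
    return [name for bit, name in sorted(ARM_HWCAP_BITS.items())
            if hwcap & (1 << bit)]


def read_hwcap():
    """Read AT_HWCAP for the current process via libc getauxval (no root needed).

    Only meaningful with ARM_HWCAP_BITS when run on a 32-bit ARM Linux system;
    other architectures assign different meanings to the bits.
    """
    libc = ctypes.CDLL(None)
    libc.getauxval.restype = ctypes.c_ulong
    libc.getauxval.argtypes = [ctypes.c_ulong]
    return libc.getauxval(AT_HWCAP)


# Rebuild the word implied by the BeagleBone Black flags quoted in this thread.
beaglebone_hwcap = sum(1 << bit for bit in (1, 2, 4, 6, 7, 11, 12, 13, 15, 19))
print(decode_arm_hwcap(beaglebone_hwcap))
```

On the board itself, `decode_arm_hwcap(read_hwcap())` would be the call to try.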

Jeffrey Walton

4:13 a.m.

On Thu, Jul 21, 2016 at 11:30 PM, Jim Wilson jim.wilson@linaro.org wrote:

> On Thu, Jul 21, 2016 at 6:33 PM, Jeffrey Walton noloader@gmail.com wrote:
>
>> I think the vfpd32 cpu flag means I have 32 D-registers. The neon and vfpv3 cpu flags mean I want something more than -mfpu=neon-fp16, but I'm not sure what that is.
>
> neon implies vfpv3, 32 D-registers and asimd/neon support, so that part is correct. It isn't obvious to me whether you have half-precision float support. The "half" printed by the kernel means that half-word loads are supported, which is only false for some obsolete parts, I think. The kernel doesn't appear to check whether the hardware has half-precision float support, so you can't determine that from /proc/cpuinfo.

OK, so looking at http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0002a/ch01s03..., it appears the minimum for the -mfpu option is VFPv3-D16. Since I have 32 D-regs, I can use VFPv3-D32, which should equate to something like -mfpu=neon-vfp3 (which does not seem to exist).

I can't use -mfpu=neon-vfpv4 because vfpv4 is not signaled, so the half-precision and FMA extensions implied by vfpv4 could be missing.

So I guess the question is, what do I use for -mfpu=neon-vfp3 (or -mfpu=neon-vfp3-d32)? Is -mfpu=neon enough?

Thanks again for the help with this.

Jim Wilson

4:19 a.m.

On Thu, Jul 21, 2016 at 9:13 PM, Jeffrey Walton noloader@gmail.com wrote:

> So I guess the question is, what do I use for -mfpu=neon-vfp3 (or -mfpu=neon-vfp3-d32)? Is -mfpu=neon enough?

The -mfpu=neon option is enough. neon implies vfpv3 and 32 D registers.

Jim
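The reasoning in this thread can be summarized as a small lookup. This is only a sketch for 32-bit pre-ARMv8 parts, assuming the flag names printed in /proc/cpuinfo; `suggest_mfpu` is a hypothetical helper, and it deliberately ignores half-precision support because, as discussed above, that cannot be read from the flags.

```python
def suggest_mfpu(features):
    """Suggest a GCC -mfpu value from /proc/cpuinfo feature flags.

    Sketch only: covers 32-bit ARM VFP/NEON flags and ignores fp16.
    Returns None when no VFP or NEON flag is present.
    """
    features = set(features)
    if "neon" in features:
        # neon implies vfpv3 and 32 D-registers, so plain "neon" is enough
        # unless the kernel also reports vfpv4.
        return "neon-vfpv4" if "vfpv4" in features else "neon"
    if "vfpv4" in features:
        return "vfpv4" if "vfpd32" in features else "vfpv4-d16"
    if "vfpv3" in features:
        return "vfpv3" if "vfpd32" in features else "vfpv3-d16"
    if "vfp" in features:
        return "vfp"
    return None


beaglebone = "half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpd32".split()
print(suggest_mfpu(beaglebone))
```

For the flags in the original post this selects `neon`, matching the answer above. The result would then be passed to the compiler as `-mfpu=neon`, together with whichever `-mfloat-abi` setting the target system uses.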

Jeffrey Walton

4:21 a.m.

On Fri, Jul 22, 2016 at 12:19 AM, Jim Wilson jim.wilson@linaro.org wrote:

> On Thu, Jul 21, 2016 at 9:13 PM, Jeffrey Walton noloader@gmail.com wrote:
>
>> So I guess the question is, what do I use for -mfpu=neon-vfp3 (or -mfpu=neon-vfp3-d32)? Is -mfpu=neon enough?
>
> The -mfpu=neon option is enough. neon implies vfpv3 and 32 D registers.

Perfect, thanks.

Jeff

Richard Earnshaw

9:14 a.m.

On 22/07/16 05:21, Jeffrey Walton wrote:

> On Fri, Jul 22, 2016 at 12:19 AM, Jim Wilson jim.wilson@linaro.org wrote:
>
>> On Thu, Jul 21, 2016 at 9:13 PM, Jeffrey Walton noloader@gmail.com wrote:
>>
>>> So I guess the question is, what do I use for -mfpu=neon-vfp3 (or -mfpu=neon-vfp3-d32)? Is -mfpu=neon enough?
>>
>> The -mfpu=neon option is enough. neon implies vfpv3 and 32 D registers.
>
> Perfect, thanks.
>
> Jeff
> _______________________________________________
> linaro-toolchain mailing list
> linaro-toolchain@lists.linaro.org
> https://lists.linaro.org/mailman/listinfo/linaro-toolchain

According to https://beagleboard.org/black, this board contains a Cortex-A8. So -mfpu=neon is correct.

https://community.arm.com/groups/tools/blog/2013/04/15/arm-cortex-a-processo...

R.
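The Cortex-A8 identification can also be cross-checked from the /proc/cpuinfo output in the original post, where CPU implementer 0x41 (ARM Ltd.) and CPU part 0xc08 correspond to a Cortex-A8. A minimal Python sketch follows; the part table is a small, deliberately incomplete subset, and `identify_core` is a hypothetical helper name.

```python
ARM_LTD_IMPLEMENTER = 0x41

# Small, incomplete subset of ARM Ltd. part numbers for 32-bit Cortex-A cores.
CORTEX_A_PARTS = {
    0xC05: "Cortex-A5",
    0xC07: "Cortex-A7",
    0xC08: "Cortex-A8",
    0xC09: "Cortex-A9",
    0xC0F: "Cortex-A15",
}


def identify_core(cpuinfo_text):
    """Return a core name from the implementer/part fields, or None if unknown."""
    fields = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    try:
        implementer = int(fields["CPU implementer"], 16)
        part = int(fields["CPU part"], 16)
    except (KeyError, ValueError):
        return None
    if implementer != ARM_LTD_IMPLEMENTER:
        return None
    return CORTEX_A_PARTS.get(part)


sample = "CPU implementer : 0x41\nCPU part        : 0xc08\n"
print(identify_core(sample))
```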

participants (3)

Jeffrey Walton
Jim Wilson
Richard Earnshaw