Some news from the qemu mailing list that I think might be of interest to gcc folks here:
Christophe Lyon from ST has kindly released a large set of test cases of Neon intrinsics: http://gitorious.org/arm-neon-tests/arm-neon-tests (the tests themselves are more aimed at testing qemu, so they just produce output to be compared against a reference generated from running on hardware).
However they don't currently compile with gcc (but are ok with armcc). From the README:
# The tests currently fail to build with GCC/ARM: # - no support for Neon_Overflow/fpsrc register # - ICE when compiling ref_vldX.c, ref_vldX_lane.c, ref_vstX_lane.c # - fails to compile vst1_lane.c # - missing include files: dspfns.h, armdsp.h
Maybe it's worth somebody having a look at this, at least enough to find out whether the ICEs are things we already know about or have perhaps already fixed in linaro gcc?
thanks -- PMM
On Wed, Jan 26, 2011 at 3:54 AM, Peter Maydell peter.maydell@linaro.org wrote:
Some news from the qemu mailing list that I think might be of interest to gcc folks here:
Christophe Lyon from ST has kindly released a large set of test cases of Neon intrinsics: http://gitorious.org/arm-neon-tests/arm-neon-tests (the tests themselves are more aimed at testing qemu, so they just produce output to be compared against a reference generated from running on hardware).
However they don't currently compile with gcc (but are ok with armcc). From the README:
# The tests currently fail to build with GCC/ARM: # - no support for Neon_Overflow/fpsrc register # - ICE when compiling ref_vldX.c, ref_vldX_lane.c, ref_vstX_lane.c
gcc-linaro-2011.01 no longer ICEs.
# - fails to compile vst1_lane.c
The particular error is:
In file included from ref_vst1_lane.c:27:0: .../4.5.2/include/arm_neon.h: In function 'exec_vst1_lane': .../4.5.2/include/arm_neon.h:8424:33: error: argument must be a constant .../4.5.2/include/arm_neon.h:8448:33: error: argument must be a constant
A bit of bisection shows that these two lines:
TEST_VST1_LANE(q, int, s, 8, 16, 15); TEST_VST1_LANE(q, uint, u, 8, 16, 10);
are the cause. The other, similar lines are fine.
# - missing include files: dspfns.h, armdsp.h
dspfns.h is: https://bugs.launchpad.net/gcc-linaro/+bug/625256
armdsp.h isn't tracked anywhere. Both of these are ARM specific and we should talk about it on Monday.
Maybe it's worth somebody having a look at this, at least enough to find out whether the ICEs are things we already know about or have perhaps already fixed in linaro gcc?
Richard Sandiford, could you have a look at the vst1_lane faults?
-- Michael
Hi!
On 26.01.2011 02:40, Michael Hope wrote:
On Wed, Jan 26, 2011 at 3:54 AM, Peter Maydell peter.maydell@linaro.org wrote:
Christophe Lyon from ST has kindly released a large set of test cases of Neon intrinsics: http://gitorious.org/arm-neon-tests/arm-neon-tests
For the record, these tests aim at being exhaustive in the sense that: - each intrinsic variant is called - when there are corner cases to be expected, each intrinsics is called multiple times with different values. I am not sure the tests are readable enough for this to be obvious :-)
Note that they test only valid combinations (when parameters have mandatory ranges, armcc refuses to compile if the effective parameter is out of the range). It means that the tests could be enhanced in this respect, to make sure that GCC behaves correctly when fed with illegal parameters.
Obviously, I may have missed some corner cases :-)
# The tests currently fail to build with GCC/ARM: # - no support for Neon_Overflow/fpsrc register # - ICE when compiling ref_vldX.c, ref_vldX_lane.c, ref_vstX_lane.c
gcc-linaro-2011.01 no longer ICEs.
Good news! Indeed I only tested with the latest 2010 Linaro release.
# - fails to compile vst1_lane.c
The particular error is:
Cool, my README has been clear enough for you to rebuild ;-)
In file included from ref_vst1_lane.c:27:0: .../4.5.2/include/arm_neon.h: In function 'exec_vst1_lane': .../4.5.2/include/arm_neon.h:8424:33: error: argument must be a constant .../4.5.2/include/arm_neon.h:8448:33: error: argument must be a constant
A bit of bisection shows that these two lines:
TEST_VST1_LANE(q, int, s, 8, 16, 15); TEST_VST1_LANE(q, uint, u, 8, 16, 10);
are the cause. The other, similar lines are fine.
To save you some time, these 2 lines invoke the vst1q_lane_s8 with lane==15 and vst1q_lane_u8 with lane==10. I have tried lower values, and the test then compiles OK. However the prototypes are: vst1q_lane_s8 (int8_t * __a, int8x16_t __b, const int __c) vst1q_lane_u8 (uint8_t * __a, uint8x16_t __b, const int __c)
This makes me think that somewhere GCC knows that 'c' has a validity range, but this is hidden from the end-user.
ARM's armcc has such decorations: __is_constrange(0,15,lane)
# - missing include files: dspfns.h, armdsp.h
dspfns.h is: https://bugs.launchpad.net/gcc-linaro/+bug/625256
armdsp.h isn't tracked anywhere. Both of these are ARM specific and we should talk about it on Monday.
Although not exactly related to Neon, I may also mention that some ARM intrinsics are missing from GCC, such as __ssat, __usat (I have recently fixed bugs in qemu related to these two, which I discovered thanks to my tests). The list of missing intrinsics includes: __clz __qadd __qdbl __qsub __rbit __rev __ssat __usat
Maybe it's worth somebody having a look at this, at least enough to find out whether the ICEs are things we already know about or have perhaps already fixed in linaro gcc?
Richard Sandiford, could you have a look at the vst1_lane faults?
I hope I have provided info to save you some time investigating. We can have a look here too.
Christophe.
Michael Hope michael.hope@linaro.org writes:
On Wed, Jan 26, 2011 at 3:54 AM, Peter Maydell peter.maydell@linaro.org wrote:
Maybe it's worth somebody having a look at this, at least enough to find out whether the ICEs are things we already know about or have perhaps already fixed in linaro gcc?
Richard Sandiford, could you have a look at the vst1_lane faults?
Sure.
Richard
Michael Hope michael.hope@linaro.org writes:
On Wed, Jan 26, 2011 at 3:54 AM, Peter Maydell peter.maydell@linaro.org wrote:
Some news from the qemu mailing list that I think might be of interest to gcc folks here:
Christophe Lyon from ST has kindly released a large set of test cases of Neon intrinsics: http://gitorious.org/arm-neon-tests/arm-neon-tests (the tests themselves are more aimed at testing qemu, so they just produce output to be compared against a reference generated from running on hardware).
However they don't currently compile with gcc (but are ok with armcc). From the README:
# The tests currently fail to build with GCC/ARM: # - no support for Neon_Overflow/fpsrc register # - ICE when compiling ref_vldX.c, ref_vldX_lane.c, ref_vstX_lane.c
gcc-linaro-2011.01 no longer ICEs.
It fails with -marm:
(insn 1817 1816 1818 2 /home/export/usr/gcc-linaro/H-x86_64-unknown-linux-gnu/bin/../lib/gcc/arm-linux-gnueabi/4.5.2/include/arm_neon.h:921 5 (parallel [ (set (reg:CI 303 [ D.14795 ]) (unspec:CI [ (mem:CI (reg:SI 3 r3 [1023]) [0 S48 A64]) (reg:CI 303 [ D.14795 ]) (unspec:V8HI [ (const_int 0 [0x0]) ] 191) ] 106)) (set (reg:SI 3 r3 [1023]) (plus:SI (reg:SI 3 r3 [1023]) (const_int 24 [0x18]))) ]) 1614 {neon_vld3qav8hi} (nil)) ref_vldX.c:157: confused by earlier errors, bailing out
I suspect the original testing was using a normal -marm default instead of Linaro's -mthumb.
The problem is that register 303 is spilled to the stack, and the stack slot address isn't legitimate for CImode (it's too far from the frame pointer). Reload rightly decides to reload the address into a temporary reload register, but the ARM backend also says that the load must go through a GENERAL_REGS reload register:
Reloads for insn # 1817 Reload 0: reload_in (SI) = (plus:SI (reg/f:SI 11 fp) (const_int -7548 [0xffffffffffffe284])) CORE_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine reload_in_reg: (plus:SI (reg/f:SI 11 fp) (const_int -7548 [0xffffffffffffe284])) Reload 1: reload_in (SI) = (plus:SI (reg/f:SI 11 fp) (const_int -7548 [0xffffffffffffe284])) CORE_REGS, RELOAD_FOR_OTHER_ADDRESS (opnum = 0), can't combine reload_in_reg: (plus:SI (reg/f:SI 11 fp) (const_int -7548 [0xffffffffffffe284])) Reload 2: GENERAL_REGS, RELOAD_FOR_OTHER_ADDRESS (opnum = 0), can't combine, secondary_reload_p Reload 3: GENERAL_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0), can't combine, secondary_reload_p Reload 4: reload_in (CI) = (mem/c:CI (plus:SI (reg/f:SI 11 fp) (const_int -7548 [0xffffffffffffe284])) [0 %sfp+-7496 S48 A64]) reload_out (CI) = (mem/c:CI (plus:SI (reg/f:SI 11 fp) (const_int -7548 [0xffffffffffffe284])) [0 %sfp+-7496 S48 A64]) VFP_REGS, RELOAD_OTHER (opnum = 0), can't combine reload_in_reg: (reg:CI 303 [ D.14795 ]) reload_out_reg: (reg:CI 303 [ D.14795 ]) secondary_in_reload = 2, secondary_out_reload = 3
where secondary reloads 2 and 3 are bogus.
This comes from two related problems in coproc_secondary_reload_class: it doesn't handle structure modes like CImode, and it checks whether the MEM is already legitimate. The latter is wrong because the memory is still in its unreloaded form. The structure (and vector) move patterns handle all valid addresses, and reload will take care of invalid addresses for us, so we should simply check for a MEM.
The patch below seems to fix the ICEs. I'll test and submit one I've looked at the lane problem.
Richard
=== modified file 'gcc/config/arm/arm.c' --- gcc/config/arm/arm.c 2011-01-13 16:06:19 +0000 +++ gcc/config/arm/arm.c 2011-01-28 11:16:07 +0000 @@ -9285,11 +9285,14 @@ return GENERAL_REGS; }
+ /* The neon move patterns handle all legitimate vector and struct + addresses. */ if (TARGET_NEON + && MEM_P (x) && (GET_MODE_CLASS (mode) == MODE_VECTOR_INT - || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT) - && neon_vector_mem_operand (x, 0)) - return NO_REGS; + || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT + || VALID_NEON_STRUCT_MODE (mode))) + return NO_REGS;
if (arm_coproc_mem_operand (x, wb) || s_register_operand (x, mode)) return NO_REGS;
linaro-toolchain@lists.linaro.org