On Tue, Apr 26, 2011 at 10:30 PM, Lee Moore moore@imperas.com wrote:
Hi All,
This is based upon gcc version 4.5.3 (20110221 pre-release). Any help appreciated.
This shows a bug in the Linaro gcc compiler with the ARM NEON vset_lane intrinsic. Note in the objdump that the vmov.8 instructions that place the value in the vector for the non-q version use lane 1 where they should use lanes 2 and 3:
    18: ee410bb0    vmov.8  d17[1], r0
    1c: ee420bb0    vmov.8  d18[1], r0
    20: ee400b90    vmov.8  d16[0], r0
    3c: ee440bb0    vmov.8  d20[1], r0
For the q version the vmov.8 instructions are correct:
    40: ee420bf0    vmov.8  d18[3], r0
    54: ee420bd0    vmov.8  d18[2], r0
    64: ee400b90    vmov.8  d16[0], r0
    70: ee420bb0    vmov.8  d18[1], r0
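To rule out a disassembler error, the lane index can be read directly from the instruction words in the listings. The following is a minimal sketch, assuming the A1 (ARM state) encoding of VMOV (core register to scalar) as laid out in the ARM Architecture Reference Manual; the struct and function names are hypothetical and not part of any toolchain.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of a decoded "vmov.8 dN[lane], rT" instruction (hypothetical type). */
struct vmov8_fields {
    unsigned dreg;  /* destination D register number, 0..31 */
    unsigned lane;  /* byte lane within the D register, 0..7 */
    unsigned rt;    /* source core register number, 0..15 */
};

/*
 * Decode an ARM-state VMOV (core register to scalar) with 8-bit elements.
 * Assumed bit layout:
 *   cond 1110 0 opc1[2] 0 Vd[4] Rt[4] 1011 D opc2[2] 1 0000
 * For the .8 form, opc1<1> is 1 and the lane is opc1<0>:opc2.
 * Returns false if the word does not match that form.
 */
static bool decode_vmov8(uint32_t insn, struct vmov8_fields *out)
{
    /* Fixed bits: [27:23]=11100, [20]=0, [11:8]=1011, [4:0]=10000. */
    if ((insn & 0x0F900F1Fu) != 0x0E000B10u)
        return false;

    /* opc1<1> (bit 22) must be set for byte-sized elements. */
    if (((insn >> 22) & 1u) == 0)
        return false;

    out->lane = (((insn >> 21) & 1u) << 2) | ((insn >> 5) & 3u);
    out->dreg = (((insn >> 7) & 1u) << 4) | ((insn >> 16) & 0xFu);
    out->rt   = (insn >> 12) & 0xFu;
    return true;
}
```

Under that assumed layout, passing 0xee410bb0 from the non-q listing yields d17, lane 1, r0, and 0xee420bf0 from the q listing yields d18, lane 3, r0, which matches what objdump printed in both cases.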
/* Source code */
#include <arm_neon.h>

static uint8x8_t vec[5];
static uint8x16_t qvec[5];

void set(uint8_t value)
{
    vec[1] = vset_lane_u8(value, vec[0], 3);
    vec[2] = vset_lane_u8(value, vec[0], 2);
    vec[3] = vset_lane_u8(value, vec[0], 1);
    vec[4] = vset_lane_u8(value, vec[0], 0);

    qvec[1] = vsetq_lane_u8(value, qvec[0], 3);
    qvec[2] = vsetq_lane_u8(value, qvec[0], 2);
    qvec[3] = vsetq_lane_u8(value, qvec[0], 1);
    qvec[4] = vsetq_lane_u8(value, qvec[0], 0);
}
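For comparison, the result each vset_lane_u8 call is expected to produce can be modelled in portable C, without NEON. This is only a sketch of the intended semantics (copy the vector, replace exactly one byte lane); model_u8x8, model_set_lane_u8 and changed_lane are hypothetical names used for illustration and are not part of arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for uint8x8_t: eight byte lanes, lane 0 first. */
typedef struct { uint8_t lane[8]; } model_u8x8;

/* Model of vset_lane_u8: return a copy of vec with one lane replaced. */
static model_u8x8 model_set_lane_u8(uint8_t value, model_u8x8 vec, unsigned lane)
{
    assert(lane < 8);           /* the intrinsic requires a lane in 0..7 */
    vec.lane[lane] = value;
    return vec;
}

/*
 * Return the index of the single lane in which a and b differ,
 * or -1 if they differ in no lane or in more than one lane.
 */
static int changed_lane(model_u8x8 a, model_u8x8 b)
{
    int found = -1;
    for (int i = 0; i < 8; i++) {
        if (a.lane[i] != b.lane[i]) {
            if (found != -1)
                return -1;      /* more than one lane changed */
            found = i;
        }
    }
    return found;
}
```

On the target, the same check could be applied by storing vec[0] and each result to byte arrays and comparing them: with a value that differs from every lane of vec[0], the lane reported as changed should equal the third argument passed to the intrinsic.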
Hi Lee. Thanks for the bug report. Sorry for the delay in replying but we've been busy with the summit and this slipped through.
I've logged Launchpad bug #784375 with your test case. The good news is that it's already been fixed in Linaro GCC 4.5/4.6 2011.05 which is due out on Thursday.
-- Michael