On 30 November 2011 20:28, Michael Hope michael.hope@linaro.org wrote:
On Thu, Dec 1, 2011 at 12:20 AM, Ira Rosen ira.rosen@linaro.org wrote:
On 30 November 2011 02:33, Michael Hope michael.hope@linaro.org wrote:
Peeling and using the vld1.i64 {d16-d17}, [r1:64]! form should be faster for larger loops. For some reason vld1.i64 ..., [r1:128] gives an illegal instruction trap on my board. Note that the :128 is in bits.
Are you sure the address is 128-bit aligned? I think the reason for the failure is the behaviour of memalign. Changing the memalign calls at the top from 8 to ALIGN appears to fix the problem. Or was that deliberate?
Regards, Ramana
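
To illustrate the alignment point above: the :128 qualifier is in bits, so it requires a 16-byte-aligned address, and a buffer allocated with an alignment of 8 is only guaranteed to satisfy :64. The following is a minimal sketch in C, not the code from the thread. It assumes ALIGN is 16 (the thread does not give its value), uses posix_memalign in place of memalign, and the helper names is_aligned and alloc_aligned are hypothetical.

```c
#define _POSIX_C_SOURCE 200112L /* for posix_memalign */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 16 bytes == 128 bits, matching the [r1:128] hint. */
#define ALIGN 16

/* Returns 1 if p is aligned to `bytes` (must be a power of two), else 0. */
static int is_aligned(const void *p, size_t bytes)
{
    return ((uintptr_t)p & (bytes - 1)) == 0;
}

/* Hypothetical helper: allocate n bytes aligned to ALIGN.
 * Returns NULL on failure; release the buffer with free(). */
static void *alloc_aligned(size_t n)
{
    void *p = NULL;
    if (posix_memalign(&p, ALIGN, n) != 0)
        return NULL;
    /* Verify the pointer really is safe for a :128 access. */
    assert(is_aligned(p, ALIGN));
    return p;
}
```

Calling is_aligned(ptr, 16) on each buffer before running the loop would show whether the addresses passed to the [r1:128] form actually meet the alignment the instruction promises.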