Re: [gnu-arm-releases] Re: [PATCH, WIP] NEON quadword vectors in big-endian mode (#10061, #7306)

2 Dec 2010


On Thu, Dec 02, 2010 at 10:54:32AM +0200, Ira Rosen wrote:
...
On 1 December 2010 17:57, Daniel Jacobowitz <dan@codesourcery.com> wrote:
...
On Wed, Dec 01, 2010 at 11:16:16AM +0200, Ira Rosen wrote:
...
The meaning of the builtin (or maybe a new tree code would be better?)
is that the elements of v0, v1 and v2 are deinterleaved. I wanted the
MEM_REFs, since we actually have three data accesses here, and
something (builtin or tree code) to indicate the deinterleaving. Since
the vectors are passed to the builtin, I don't think it's a problem if
the statements get separated. When the expander sees the builtin, it
has to remove the loads it created for the MEM_REFs and create a new
"vector load multiple and deinterleave". Is that possible?
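
For readers following the thread, the two-step form described above (three separate vector accesses, then a deinterleave of values already in registers) can be modelled in scalar C roughly as follows. This is only a sketch: the function name, the 4-lane width and the use of int elements are assumptions made for illustration and are not taken from the patch.

```c
#include <stddef.h>

#define LANES 4  /* assumed vector width in elements (illustrative only) */

/* Hypothetical scalar model of the two-step form.  v0, v1 and v2 stand
   for the results of three independent loads of interleaved data:
     v0 = a0 b0 c0 a1,  v1 = b1 c1 a2 b2,  v2 = c2 a3 b3 c3
   The deinterleave step then works purely on those loaded values.  */
void deinterleave3(const int v0[LANES], const int v1[LANES],
                   const int v2[LANES],
                   int a[LANES], int b[LANES], int c[LANES])
{
    int flat[3 * LANES];

    /* Concatenate the three already-loaded vectors.  */
    for (size_t i = 0; i < LANES; i++) {
        flat[i] = v0[i];
        flat[LANES + i] = v1[i];
        flat[2 * LANES + i] = v2[i];
    }

    /* Pick every third element for each output vector.  */
    for (size_t i = 0; i < LANES; i++) {
        a[i] = flat[3 * i];
        b[i] = flat[3 * i + 1];
        c[i] = flat[3 * i + 2];
    }
}
```

In this form the deinterleave only sees values, not addresses, which is why the expander would have to trace v0, v1 and v2 back to their loads before it could emit a combined load.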
This is a problem I've struggled with before.  My only caution is that
representing the MEM_REFs separately from the deinterleaving in the IR
allows all sorts of ways (many we haven't thought of yet) for them to
get separated, and there's no instruction to efficiently implement the
deinterleaving from registers.  For instance, suppose a pseudo gets
propagated into the builtin and we can't find the MEM_REFs any more.
The resulting code could easily be worse than pre-vectorization.
I see. So one builtin for everything, like
vector_load_deinterleave (v0, v1, v2,..., stride,...)
is our only option?
It's not the only option; the way you've described might work, too.
But yes, it's my opinion that a single builtin is less likely to
generate something the compiler can't recover from.
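
As a rough illustration of the single-operation form, here is a scalar C model in which the memory access and the deinterleave are one inseparable step. The name load_deinterleave_model and its signature are hypothetical; the thread only sketches the builtin as vector_load_deinterleave (v0, v1, v2,..., stride,...) and does not fix its arguments.

```c
#include <stddef.h>

/* Hypothetical scalar model of a fused load-and-deinterleave.
   src holds n groups of `stride` interleaved elements; element k of
   group i is written to out[k][i].  Name, signature and element type
   are assumptions for illustration, not the builtin from the thread.  */
void load_deinterleave_model(const int *src, size_t stride, size_t n,
                             int **out)
{
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < stride; k++)
            out[k][i] = src[i * stride + k];
}
```

Because the address is an operand of the operation itself, there is no separate load that later passes could move away from the deinterleave. With a stride of 3 this is the access pattern that the NEON VLD3 instruction performs in a single step.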
-- 
Daniel Jacobowitz
CodeSourcery
