On Mon, 18 Oct 2010, Ira Rosen wrote:
Does that mean that the vectorizer will be aware of specific
instructions?
I would imagine that it would need to know what permutations are available, yes (GIMPLE and RTL would have some form of general permuting load/store operation, which the vectorizer would only generate where relevant instructions exist for the chosen permutation).
So, there will be a new tree code, e.g. PERM_LOAD_EXPR, and the vectorizer will use it for misaligned loads on big-endian targets (or maybe on little-endian ones as well), and for strided loads. The vectorizer will check whether the instruction is supported, passing the desired stride (1, 2 or 3) as input, and will receive a mask. It will use the mask to permute all other relevant vectors (like vectors of constants) if necessary, so that all the generic GIMPLE and RTL stays correct. And later, when assembly code is generated, everything should be permuted again?
Yes, something like that.
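
To make the scheme above a bit more concrete, here is a rough scalar model of it in C. It is only a sketch under assumptions: target_perm_load_mask, perm_load, permute, perm_store and add_const_strided are hypothetical names, not existing GCC hooks or tree codes, and the lane-reversing mask is an arbitrary stand-in for whatever order a real instruction would deliver. The point is only that the same mask applied to the load is also applied to the constants, and undone at the store.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUNITS 16

/* Hypothetical target query (not a real GCC hook): says whether a
   permuting load exists for this stride and, if so, fills MASK so that
   lane i of the loaded vector holds logical element mask[i].
   Assumption for illustration: the target delivers lanes reversed.  */
int target_perm_load_mask(int stride, size_t nunits, size_t *mask)
{
    if (stride < 1 || stride > 3 || nunits > MAX_NUNITS)
        return 0;
    for (size_t i = 0; i < nunits; i++)
        mask[i] = nunits - 1 - i;
    return 1;
}

/* Model of the permuting strided load: lane i gets mem[mask[i] * stride].  */
void perm_load(const int *mem, int stride, const size_t *mask,
               int *lanes, size_t nunits)
{
    assert(stride >= 1);
    for (size_t i = 0; i < nunits; i++)
        lanes[i] = mem[mask[i] * (size_t)stride];
}

/* Apply the same mask to another vector, e.g. a vector of constants.  */
void permute(const int *src, const size_t *mask, int *dst, size_t nunits)
{
    for (size_t i = 0; i < nunits; i++)
        dst[i] = src[mask[i]];
}

/* Undo the permutation when writing results back in logical order.  */
void perm_store(const int *lanes, const size_t *mask, int *out, size_t nunits)
{
    for (size_t i = 0; i < nunits; i++)
        out[mask[i]] = lanes[i];
}

/* out[k] = mem[k * stride] + consts[k], computed on permuted lanes.
   Returns 0 if the (hypothetical) target has no suitable instruction.  */
int add_const_strided(const int *mem, int stride, const int *consts,
                      int *out, size_t nunits)
{
    size_t mask[MAX_NUNITS];
    int loaded[MAX_NUNITS], c[MAX_NUNITS], sum[MAX_NUNITS];

    if (!target_perm_load_mask(stride, nunits, mask))
        return 0;

    perm_load(mem, stride, mask, loaded, nunits);
    permute(consts, mask, c, nunits);       /* keep constants lane-aligned */
    for (size_t i = 0; i < nunits; i++)
        sum[i] = loaded[i] + c[i];
    perm_store(sum, mask, out, nunits);     /* permute back at the end */
    return 1;
}
```

With this model, a caller would fall back to some other strategy whenever add_const_strided returns 0, which corresponds to the vectorizer only generating the permuting operation where the target has a matching instruction.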