Hi,
I need to learn much more about ARM architecture, but I have some initial comments.
Julian Brown <julian@codesourcery.com> wrote on 15/09/2010 11:37:21 AM:
- automatic vector size selection (it's currently selected by command line switch)
Generally (check assumption) I think that wider vectors may make inner loops more efficient, but may increase the size of setup/teardown code (e.g. setup: increased versioning; teardown: increased insns for reduction ops). More importantly, sometimes larger vectors may inhibit vectorization.
We ideally want to calculate costs per vector-size per-loop (or per other vectorization opportunity).
There is a patch http://gcc.gnu.org/ml/gcc-patches/2010-03/msg00167.html that was not committed to mainline (and I think not to vect256, but I am not sure about that). This patch tries to vectorize for the wider option unless it is impossible because of data dependence constraints.
I agree with that cost model approach.
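To make the per-vector-size, per-loop costing concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions: the struct, the field names and the cost formula are all hypothetical and are not taken from the GCC sources or from the patch mentioned above. Each candidate size gets a setup cost (versioning), a per-iteration body cost, a teardown cost (e.g. the final reduction), and a flag saying whether vectorization at that size is possible at all.

```c
#include <stddef.h>

/* Hypothetical per-loop cost parameters for one candidate vector size.
   None of these names come from GCC; they only illustrate the idea. */
struct vec_cost {
    unsigned vf;            /* elements processed per vector iteration */
    unsigned setup_cost;    /* e.g. versioning checks before the loop */
    unsigned body_cost;     /* cost of one vector iteration */
    unsigned teardown_cost; /* e.g. final reduction of a vector accumulator */
    int legal;              /* 0 if this size inhibits vectorization */
};

/* Estimated cost of n scalar iterations when using candidate c:
   setup + vector iterations + scalar epilogue for leftovers + teardown. */
static unsigned long estimate_cost(const struct vec_cost *c, unsigned long n,
                                   unsigned scalar_iter_cost)
{
    unsigned long vec_iters = n / c->vf;
    unsigned long leftover = n % c->vf;
    return c->setup_cost + vec_iters * c->body_cost
           + leftover * scalar_iter_cost + c->teardown_cost;
}

/* Return the index of the cheapest legal candidate,
   or -1 if leaving the loop scalar is estimated to be at least as cheap. */
int pick_vector_size(const struct vec_cost *cands, size_t ncands,
                     unsigned long n, unsigned scalar_iter_cost)
{
    unsigned long best = n * scalar_iter_cost; /* cost of the scalar loop */
    int best_idx = -1;
    for (size_t i = 0; i < ncands; i++) {
        if (!cands[i].legal)
            continue; /* larger vectors may inhibit vectorization */
        unsigned long cost = estimate_cost(&cands[i], n, scalar_iter_cost);
        if (cost < best) {
            best = cost;
            best_idx = (int)i;
        }
    }
    return best_idx;
}
```

With made-up numbers, a short loop can favour the narrower size (or no vectorization) because the setup/teardown overhead dominates, while a long loop favours the wider size. This also differs from a "prefer the wider option unless impossible" policy, which would only consult the legal flag.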
- ensure that all gcc vectorizer pattern names are implemented in the machine description (those that can be).
In my opinion, we had better concentrate on:
- Conversely, perhaps identify NEON capabilities not covered by GCC patterns, and add them to gcc (e.g. vld2/vld3/vld4 insns)
Most of the existing vectorizer patterns were inspired by Altivec's capabilities. I think our approach should originate from the architecture and not the other way around. For example, I don't think we should spend time implementing vect_extract_even/odd and vect_interleave_high/low (even though they seem to match VUZP and VZIP), when we have those amazing VLD2/3/4 and VST2/3/4 instructions.
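As an illustration of the kind of access I have in mind, here is a small scalar C sketch. It uses no NEON intrinsics, and the function names are hypothetical. The stride-2 reads in the first loop are the de-interleaving pattern that a two-element structure load such as VLD2 performs directly, and the stride-2 writes in the second loop are the matching VST2 pattern, so no separate extract-even/odd or interleave step would be needed.

```c
#include <stddef.h>
#include <stdint.h>

/* Split interleaved stereo samples (L0 R0 L1 R1 ...) into two planes.
   The reads in[2*i] and in[2*i + 1] form the stride-2 access that a
   de-interleaving structure load (VLD2) covers in one instruction. */
void split_stereo(const int16_t *in, int16_t *left, int16_t *right,
                  size_t frames)
{
    for (size_t i = 0; i < frames; i++) {
        left[i] = in[2 * i];
        right[i] = in[2 * i + 1];
    }
}

/* Inverse operation: interleave two planes back into one buffer.
   The stride-2 writes match what an interleaving store (VST2) does. */
void merge_stereo(const int16_t *left, const int16_t *right, int16_t *out,
                  size_t frames)
{
    for (size_t i = 0; i < frames; i++) {
        out[2 * i] = left[i];
        out[2 * i + 1] = right[i];
    }
}
```

The same shape with a stride of 3 or 4 (e.g. RGB or RGBA pixels) would correspond to VLD3/VST3 and VLD4/VST4.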
I've not even started on looking at:
- loops with more than two basic blocks (caused by if statements (anything else?))
What do you mean by that? If-conversion improvements?
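To be clear about what I mean by if-conversion, here is a minimal sketch in C (function names are hypothetical). The first loop has a branch in its body, so its control flow graph has more than two basic blocks. The second is the same computation with the branch replaced by a select, which leaves a single-block body. I am assuming here that the unconditional store is safe, which holds in this example because a[i] is read on every iteration anyway.

```c
#include <stddef.h>

/* Loop body contains a branch, so the loop has more than two basic blocks. */
void clamp_branchy(int *a, size_t n, int limit)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] > limit)
            a[i] = limit;
    }
}

/* If-converted form: the conditional store becomes an unconditional
   store of a selected value, giving a straight-line loop body. */
void clamp_if_converted(int *a, size_t n, int limit)
{
    for (size_t i = 0; i < n; i++) {
        int v = a[i];
        a[i] = (v > limit) ? limit : v;
    }
}
```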
Do you (Ira) have access to the ARM ISA docs detailing the NEON instructions?
I have "ARM® Architecture Reference Manual ARM®v7-A and ARM®v7-R edition".
Ira
Cheers,
Julian
[attachment "CS308-vectorization-improvements.txt" deleted by Ira Rosen/Haifa/IBM]