Mixed vector sizes

10 Nov 2010


Hi,
I started to look into mixed vector sizes (in the same loop). My main reason
for this was to allow widening and narrowing instructions, which have
different vector sizes for src and dest, to work properly. My example was
widen_mult (int = short * short), whose implementation I thought was not
optimal. But now that I have a working GCC mainline for ARM, I see that it
works just fine.
short ub[], ua[];
int c[];
for (i = 0; i < n; i++)
    c[i] = ub[i] * ua[i];
is compiled as:
.L11:
        add     r1, r1, #1
        vldmia  r4!, {d18-d19}
        cmp     r5, r1
        vldmia  ip!, {d16-d17}
        vmull.s16 q10, d18, d16
        vstr    d20, [r3, #-32]
        vstr    d21, [r3, #-24]
        vmull.s16 q8, d19, d17
        vstr    d16, [r3, #-16]
        vstr    d17, [r3, #-8]
        add     r3, r3, #32
        bhi     .L11
which looks good to me at least from the vmull point of view.
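For readers who do not read NEON assembly, here is a rough portable C model of what the loop body above appears to do: each iteration loads eight shorts from each input (one q register, i.e. two d halves), performs two widening multiplies of four lanes each, and stores eight ints (32 bytes). The function names and the scalar epilogue are my own illustration and do not come from the compiler output; this is a sketch of the data flow, not of GCC's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference: int = short * short, as in the source loop. */
static void widen_mult_scalar(const short *a, const short *b, int *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] * b[i];
}

/* Hypothetical model of one vmull.s16: four 16-bit lanes in (a d register
   per operand), four 32-bit lanes out (a q register). */
static void vmull_s16_model(const short *a, const short *b, int *out)
{
    for (int lane = 0; lane < 4; lane++)
        out[lane] = (int)a[lane] * (int)b[lane];
}

/* Model of the loop body: 8 shorts per input per iteration, two widening
   multiplies, 8 ints stored. */
static void widen_mult_blocked(const short *a, const short *b, int *c, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        vmull_s16_model(a + i,     b + i,     c + i);     /* low halves,  like d18 * d16 -> q10 */
        vmull_s16_model(a + i + 4, b + i + 4, c + i + 4); /* high halves, like d19 * d17 -> q8  */
    }
    /* Scalar epilogue for leftover elements: an assumption, since the
       listing above shows only the vector loop. */
    for (; i < n; i++)
        c[i] = a[i] * b[i];
}

/* Small self-check: both versions must agree, including on the tail. */
static void widen_mult_self_check(void)
{
    short a[11], b[11];
    int ref[11], blk[11];
    for (int i = 0; i < 11; i++) {
        a[i] = (short)(i * 100 - 500);
        b[i] = (short)(7 - i);
    }
    widen_mult_scalar(a, b, ref, 11);
    widen_mult_blocked(a, b, blk, 11);
    for (int i = 0; i < 11; i++)
        assert(ref[i] == blk[i]);
}
```

The point of the model is that source and destination already have different vector widths here (64-bit d inputs, 128-bit q results), and the two vmull instructions cover both halves of each loaded q register without extra unpacking.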
Does anyone have an example where mixed vector size instructions are not used
properly?
Another reason for mixed sizes could be cases where only part of the loop
can be vectorized with the wider vectors. I don't know how common this is.
Are there any other reasons to implement mixed vector sizes? I understand
that this can be a useful feature; I am just not sure it's the most
important one.
Thanks,
Ira
