Testing the SMS register pressure estimation on libav micro-benchmarks and EEMBC. Discussed the implementation with Ayal; he had some ideas to consider regarding it.
Looking into the regressions of SMSed kernels in libav which are not related to register pressure. Consulted with Ayal regarding the case in dsputil-ssd_int8_vs_int16_c, where we have a severe regression with SMS. It seemed that the regression was due to an avoidable dependence between accumulations; more specifically, we had the following case in the vector code:
vec1 = vec1 + ...
...
vec1 = vec1 + ...
...
vec1 = vec1 + ...
...
vec1 = vec1 + ...
To resolve this, I implemented a hack similar to the MVE optimization in the loop unroller, as follows:
vec1 = vec1 + ...
...
vec2 = vec2 + ...
...
vec3 = vec3 + ...
...
vec4 = vec4 + ...
This gives a ~4.5% improvement for the non-SMSed version. The SMS version now shows no regression, because the problematic loop that caused it now fails to be SMSed; I'm looking into the reason.
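For reference, the accumulator-splitting idea can be sketched at the source level in scalar C. This is only an illustration, not the actual libav kernel or the GCC change: the function names ssd_single_acc and ssd_four_acc are hypothetical, and the real transformation was applied by the compiler to vector registers. The point is that the four partial sums carry no dependence on each other inside the loop and are combined only once, after it.

```c
#include <stddef.h>
#include <stdint.h>

/* Baseline: every addition depends on the previous one through `acc`,
 * which serializes the accumulations (hypothetical helper). */
uint32_t ssd_single_acc(const int8_t *a, const int16_t *b, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t d = a[i] - b[i];
        acc += (uint32_t)(d * d);
    }
    return acc;
}

/* Split version: four independent accumulators, so the four additions
 * in one iteration do not depend on each other (hypothetical helper). */
uint32_t ssd_four_acc(const int8_t *a, const int16_t *b, size_t n)
{
    uint32_t acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        int32_t d0 = a[i]     - b[i];
        int32_t d1 = a[i + 1] - b[i + 1];
        int32_t d2 = a[i + 2] - b[i + 2];
        int32_t d3 = a[i + 3] - b[i + 3];
        acc0 += (uint32_t)(d0 * d0);
        acc1 += (uint32_t)(d1 * d1);
        acc2 += (uint32_t)(d2 * d2);
        acc3 += (uint32_t)(d3 * d3);
    }

    /* Leftover elements when n is not a multiple of 4. */
    for (; i < n; i++) {
        int32_t d = a[i] - b[i];
        acc0 += (uint32_t)(d * d);
    }

    /* Combine the partial sums once, outside the loop. */
    return acc0 + acc1 + acc2 + acc3;
}
```

Since unsigned integer addition is associative, both functions return the same value for the same input. The same splitting applied to floating-point sums would change the rounding order, so it is not automatically safe there.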
Another regression, seen in idct-internal-8, is apparently related to the do-loop optimization (SMS actually failed to be applied to this loop). Applying the patch that extends SMS to recognise doloop resolves the regression. (http://gcc.gnu.org/ml/gcc-patches/2011-09/msg02051.html; the patch is not in mainline yet)