Hi,
Thanks again for measuring this.
mjpegenc
  before:  500000 runs take 7.31085s
  after:   500000 runs take 3.04492s
  speedup: x2.4
mjpegenc and aacsbr-2 contain simple accumulation without load/store dependences, so SMS succeeds in improving them. aacsbr-1 also contains such an accumulation; however, doloop fails on it. I will try to run it with the recent patch that avoids using doloop (http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01807.html) and with -fira-algorithm=priority, which you found avoids the spill issue.
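
To illustrate the kind of loop meant by "simple accumulation without load/store dependence", here is a minimal sketch in C. It is a hypothetical example, not the actual code from mjpegenc or aacsbr; the function name and signature are assumptions made only for illustration. The only loop-carried dependence is through the accumulator, and the loop body contains no stores, so no iteration's load can depend on an earlier iteration's store.

```c
#include <stddef.h>

/* Hypothetical example, not taken from FFmpeg: a simple accumulation loop.
 * The loads from a[] and b[] are independent across iterations, and there
 * are no stores in the loop body. The only loop-carried dependence is the
 * update of acc. */
float sum_products(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += a[i] * b[i];   /* accumulate; nothing is written to memory */
    return acc;
}
```

In a loop of this shape the loads and multiplies of later iterations can in principle be overlapped with the accumulation of earlier ones, which is the overlap that SMS (enabled with -fmodulo-sched) tries to exploit. Whether it actually does so depends on the target and on the rest of the pipeline, e.g. doloop recognition.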
Thanks, Revital