Hi,
Yes, I also noticed that. When I tested it, only one reg-move was created, so the scheduling patch would have no effect on it.
FWIW, looking at the results I posted yesterday, the scheduling patch did improve the results compared with the non-scheduling patch:
You are right! This was my mistake, sorry about that.
mjpegenc
  before:  500000 runs take 28.6987s
  after:   500000 runs take 7.31342s
  speedup: x3.92
That single register move wasn't schedulable within the current ii, so the patch used a higher ii without the move. Unfortunately, while the new loop needs fewer spills, it doesn't avoid them completely.
Right, this can also be seen in the results you posted today evaluating the effect of SMS on the vectorized code.
Thanks, Revital