On 8 August 2012 16:24, Ulrich Weigand <Ulrich.Weigand@de.ibm.com> wrote:
Hello,
I've had a look at the mp3player performance regressions (seen with only *some* data sets) with the vector-alignment patch. Interestingly, it turns out that the patch basically does not change the generated code for the hot spot (the inv_mdct routine) at all. (The *only* change is which bits of the incoming pointer the run-time alignment check generated by the vectorizer tests for. But this has no practical consequences, since the check itself is not hot, and the *decision* made by the check is the same anyway -- everything is in fact properly aligned at run time.)
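For readers unfamiliar with such checks, here is a rough sketch in C of what "which bits of the incoming pointer the check tests for" means. This is not the code the vectorizer emits; the helper names (is_aligned, check_16, check_8) are made up for illustration. A 16-byte check looks at the low four address bits, an 8-byte check at the low three.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if p is a multiple of `align`.
   `align` must be a power of two, so (align - 1) is a mask of the low bits. */
static bool is_aligned(const void *p, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Tests the low 4 bits of the address (mask 0xF). */
static bool check_16(const void *p) { return is_aligned(p, 16); }

/* Tests the low 3 bits of the address (mask 0x7). */
static bool check_8(const void *p) { return is_aligned(p, 8); }
```

For a pointer that really is 16-byte aligned, both checks return true, which matches the observation that the decision taken at run time does not change.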
The other difference introduced by the vector-alignment patch, outside of the generated code, is that some global arrays used to be forcibly aligned to 16 bytes by the vectorizer, and they are now only aligned to 8 bytes. To check whether this makes a difference, I've hacked the compiler to always force all global arrays to be 16-byte aligned. And interestingly enough, this appears to fix this particular performance regression ...
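The hack described above was made inside the compiler. As a source-level approximation for a single array, assuming GCC's aligned attribute, something like the following could be used to experiment. The array name and size are hypothetical and not taken from the mp3player source.

```c
#include <stdint.h>

/* Hypothetical global table, forced to a 16-byte boundary with the
   GCC `aligned` attribute instead of relying on the default alignment. */
static float window_table[36] __attribute__((aligned(16)));

/* Returns how many bytes the table's address lies past the previous
   16-byte boundary; 0 means the table is 16-byte aligned. */
static unsigned table_misalignment(void)
{
    return (unsigned)((uintptr_t)window_table & 15u);
}
```

Calling table_misalignment() at start-up and printing the result is a cheap way to confirm what alignment a given build actually gave the array.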
Are those arrays involved in any of the hot code?