I ran my usual set of benchmarks of libav compiled with the current gcc releases (hand-written assembly disabled). The results are in this spreadsheet: https://docs.google.com/spreadsheet/ccc?key=0AguHvNGaLXy9dHExeWZ1YWZ1c0s2Vnp...
First the good news: almost everything is faster with 4.6+ than with linaro-4.5.
The bad news is that some things have regressed since 4.6, even if not all the way back to 4.5 levels. A few especially problematic pieces stand out:
- The mp3 test performs 5-15% worse. This regression is (mostly) attributable to the ff_mpadsp_apply_window_fixed [1] function. We have looked at this one before.
- FLAC is 9% slower in upstream 4.7/4.8 compared to Linaro releases. Here flac_lpc_16_c [2] and flac_decorrelate_indep_c_16 [3] are mainly to blame.
- MPEG2/MPEG4 decoding is ~10% slower with vectorisation turned on. The culprit here is the ff_simple_idct_8_c [4] function.
- H.264 and DTS seem 1-2% slower, although this could be just noise.
- Code size has increased by ~10% in all post-4.6 releases.
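For anyone not familiar with the FLAC item above, here is a minimal sketch of the kind of loop an LPC reconstruction involves. This is not the libav source: the name lpc_restore_sketch and its parameters are made up for illustration, and the real flac_lpc_16_c [2] may be structured differently.

```c
#include <stdint.h>

/*
 * Hypothetical, simplified fixed-point LPC reconstruction.
 * NOT the libav implementation; names and layout are assumptions.
 *
 * On entry, samples[0..order-1] hold warm-up samples and
 * samples[order..len-1] hold residuals. On exit, the whole
 * buffer holds reconstructed samples.
 */
static void lpc_restore_sketch(int32_t *samples, const int32_t *coeffs,
                               int order, int shift, int len)
{
    for (int i = order; i < len; i++) {
        int32_t sum = 0;

        /* Short reduction over the previous 'order' outputs. */
        for (int j = 0; j < order; j++)
            sum += coeffs[j] * samples[i - order + j];

        /* Each output feeds the next iterations (loop-carried dependency). */
        samples[i] += sum >> shift;
    }
}
```

In a loop of this shape every output depends on the outputs just before it, so the outer loop cannot be vectorised as written. Only the short inner reduction is a candidate.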
All builds were compiled with -O3 -mcpu=cortex-a9. The vectorised builds additionally use -fvect-cost-model; without this flag the results are much worse.
[1] http://git.libav.org/?p=libav.git%3Ba=blob%3Bf=libavcodec/mpegaudiodsp_templ...
[2] http://git.libav.org/?p=libav.git%3Ba=blob%3Bf=libavcodec/flacdsp.c%3Bh=a2e3...
[3] http://git.libav.org/?p=libav.git%3Ba=blob%3Bf=libavcodec/flacdsp_template.c...
[4] http://git.libav.org/?p=libav.git%3Ba=blob%3Bf=libavcodec/simple_idct_templa...