Hi,
* regtested the vzip/vuzp patch
* looked into the big-endian build
* applied all the required patches and checked that Viterbi gets vectorized, giving ~2x performance improvement (compiled with a cross-compiler)
* looked into the vld/vst implementation - mostly discussions with Richard
* DenBench analysis:
  - there are loops that should get vectorized with the vzip/vuzp patch; I'll check them next week
  - sad8_c (a hot function from mp4encode) needs reduction SLP (which I implemented several weeks ago) and the ability to jump over an unknown stride in loop SLP - I am looking into this
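
For reference, here is a rough sketch of the kind of loop the sad8_c item refers to. It is not the DenBench source; the name sad8_sketch, the signature and the 8x8 block size are assumptions made for illustration. The eight unrolled statements all accumulate into one sum (the reduction that SLP has to recognize), and both pointers advance by a stride that is only known at run time (the unknown stride that loop SLP has to step over).

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sum-of-absolute-differences over an 8x8 block.
 * Illustrative only: name, signature and block size are assumptions. */
uint32_t sad8_sketch(const uint8_t *cur, const uint8_t *ref, uint32_t stride)
{
    uint32_t sad = 0;
    int row;

    for (row = 0; row < 8; row++) {
        /* Eight isomorphic statements feeding a single accumulator:
         * a candidate for reduction SLP. */
        sad += abs(cur[0] - ref[0]);
        sad += abs(cur[1] - ref[1]);
        sad += abs(cur[2] - ref[2]);
        sad += abs(cur[3] - ref[3]);
        sad += abs(cur[4] - ref[4]);
        sad += abs(cur[5] - ref[5]);
        sad += abs(cur[6] - ref[6]);
        sad += abs(cur[7] - ref[7]);

        /* The row step is a run-time value, so the distance between
         * consecutive rows is not known at compile time. */
        cur += stride;
        ref += stride;
    }
    return sad;
}
```

Within a row the eight loads are contiguous, so each row maps onto one 8-byte vector load per operand; only the step between rows depends on the stride.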
Ira
linaro-toolchain@lists.linaro.org