== Progress ==
* Tried a number of testcases for the shuffles. This needed __builtin_shuffle support in the C++ frontend. Fortunately a patch already existed; I tested it, it looked good, and it is now committed upstream. The original author had some concerns about whether it would work in C++, specifically that it might break C++11 and constexpr, so that needs to be looked at.
* Briefly investigated a regression in Linaro GCC 4.6 with Neon intrinsics. It looks like my patch to allow LTO to proceed has had some fallout. We really need some good tests for intrinsics in the GCC testsuite.
* Looked at the Android documents and commented.
* Did some upstream patch review.
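For readers unfamiliar with the builtin, here is a minimal sketch of what __builtin_shuffle does on GCC generic vectors. It is only an illustration, not the Neon intrinsics patch itself; the type name v4si and the helper names reverse4 and zip_lo are hypothetical, and the code assumes a GCC that provides the builtin.

```c
#include <stdint.h>

/* GCC generic vector type: four 32-bit integers in 16 bytes.
   The name v4si is chosen for this sketch only. */
typedef int32_t v4si __attribute__((vector_size(16)));

/* Single-input form: element i of the result is v[mask[i]].
   This mask reverses the element order. (Hypothetical helper.) */
static v4si reverse4(v4si v)
{
    const v4si mask = { 3, 2, 1, 0 };
    return __builtin_shuffle(v, mask);
}

/* Two-input form: mask indices 0..3 select from a, 4..7 select from b.
   This mask interleaves the low halves of a and b. (Hypothetical helper.) */
static v4si zip_lo(v4si a, v4si b)
{
    const v4si mask = { 0, 4, 1, 5 };
    return __builtin_shuffle(a, b, mask);
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, reverse4(a) gives {4, 3, 2, 1} and zip_lo(a, b) gives {1, 10, 2, 20}.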
== Plans ==
* Follow up on the C++11 issues with the __builtin_shuffle patch, if any.
* Commit the __builtin_shuffle variation of the Neon intrinsics patch into FSF 4.8. The improvements obtained are real and nice, at least for the testcases that we could check after finishing them up.
* There is some follow-up work that should tie in nicely with the costs rework: lower-subreg ends up splitting things rather badly in some cases with vld4-style intrinsics and for V4SF copies, so it is better that we try to get the costs right. I suspect this might be harming us in a few cases with auto-vectorized code as well, especially where we vectorize with the large vldn instructions.
* Investigate the 4.6 regression with Neon intrinsics.
* Auto-inc-dec scheduler work.