== This week ==
* Spent almost all the week on GCC's auto inc/dec pass. I first
continued with the incremental "clean ups" and recoding that I'd
started during free time at Budapest, with the idea of bolting the new
optimisations on top of that. However, in the end, I decided it would
be better to rewrite the pass entirely, using a different approach.
I've now got an early prototype of that rewrite, and it seems to be
working as expected on the test cases I've tried so far. I'm running
a regression test over the weekend, although TBH, I expect it to fail
at this stage.
* Tested the fix for vzip, vunz and vtrn. Went well, so I'll submit
next week.
* Blueprints.
== Next week ==
* More auto inc/dec:
  * Round off some known rough edges in the prototype.
  * Fix bugs.
  * Run benchmarks.
  * Run code comparison tests (diffing assembly code), both on ARM and
    on other targets of interest.
Richard
The Linaro Toolchain Working Group is pleased to announce the release
of Linaro GDB 7.2.
Linaro GDB 7.2 2011.05-0 is the sixth release in the 7.2 series. Based
off the latest GDB 7.2, it includes a number of ARM-focused bug fixes.
This release fixes:
* LP: #615972 Neon registers missing in core files
* LP: #615978 Failure to software single-step into signal handler
* LP: #615996 gdb.cp/templates.exp failures
The source tarball is available at:
https://launchpad.net/gdb-linaro/+milestone/7.2-2011.05-0
More information on Linaro GDB is available at:
https://launchpad.net/gdb-linaro
-- Michael
Can somebody please explain how development happens on qemu-linaro?
I've taken a look here [0] and, if I'm not mistaken, there's no code in the
repo. I can see a lot of blueprints, but I don't understand how work on
those blueprints is being done, or when it will be done. Oh, and what
exactly is the 'qemu-linaro' tarball in the repo?
I'm not sure how newbie this question is, but please bear with me. :D
Thanks in advance.
[0] https://launchpad.net/qemu-linaro
--
Karim Allah Ahmed.
LinkedIn <http://eg.linkedin.com/pub/karim-allah-ahmed/13/829/550/>
Hello,
* Sent 5 SMS-related patches for review upstream.
* Backported two SMS patches from mainline to gcc-linaro and
gcc-linaro/4.6 (fixes for unfreed memory).
Thanks,
Revital
Hi,
* committed a patch that supports reductions in SLP (upstream)
* continued analyzing benchmarks: ffmpeg, EEMBC telecom, office, networking
* started to look into implementation of reverse accesses for Neon
* blueprints
Ira
The Linaro Toolchain Working Group is pleased to announce the release
of both Linaro GCC 4.5 and Linaro GCC 4.6.
Linaro GCC 4.5 2011.05 is the tenth release in the 4.5 series. Based
off the latest GCC 4.5.3+svn173417, it adds new optimisations, much
improved support for strided load/stores, and fixes for many of the
issues found in the last month.
Interesting changes in 4.5 include:
* Updates to 4.5.3+r173417
* Performance improvements in NEON strided loads and stores
* Performance improvements targeted at EEMBC CoreMark
* Precompiled header support on recent Linux kernels
Fixes:
* LP: #660156: Heap randomisation causes PCH testsuite failures
* LP: #784375: vset_lane_u8 intrinsic generates wrong lane number
* LP: #759409: Profiled bootstrap fails in FSF GCC 4.5
* LP: #723086: Test regressions in the Fortran test suite
The strided load/store improvements allow both NEON intrinsics and the
vectoriser to efficiently access values that occur at every n'th
address, such as all of the red values in an RGB image or all of the
left channel samples in an interleaved audio array. Previous versions of
GCC would unpack the values onto the stack instead of using the registers
directly.
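As a rough illustration of the access pattern described above, here is a minimal scalar sketch in portable C. The function name `extract_red` and the packed R, G, B byte layout are assumptions made for this example only; whether a given compiler version vectorises this loop depends on the target and the flags used.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy every third byte, starting at offset 0, out of a packed RGB buffer.
   `extract_red` is a hypothetical name used only for this sketch, and the
   R, G, B, R, G, B, ... byte layout is an assumption. */
void extract_red(const uint8_t *rgb, uint8_t *red, size_t pixels)
{
    for (size_t i = 0; i < pixels; i++)
        red[i] = rgb[3 * i];   /* strided read: one value every 3 bytes */
}
```

Calling `extract_red(rgb, red, n)` on an `n`-pixel buffer fills `red` with the `n` red samples. The same shape of loop, with a stride of 2, would pick the left channel out of an interleaved stereo array.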
The CoreMark work improves the code generation for the hot
functions in the benchmark. This release is now on par with Linaro GCC
4.4 and significantly ahead of other FSF or Linaro 4.5 based
compilers. This fixes the long-standing problems of ARMv5 being
faster than ARMv7 and 4.4 based compilers being faster than 4.5 based
ones.
Linaro GCC 4.6 2011.05 is the third release in the 4.6 series. Based
off the latest GCC 4.6.0+svn173480, it adds new optimisations,
vectoriser improvements, and continues with the merge of many
ARM-focused changes.
Interesting changes include:
* Updates to 4.6.0+r173480
* Brings forward more of the performance improvements from Linaro GCC 4.5
* Adds support for swing-modulo scheduling
* Fixes precompiled header support on recent Linux kernels
* Changes the default NEON vector size to quads
* Adds auto-detection of the best vector size
* Adds vectorisation improvements due to better if-conversion
Fixes:
* LP: #714921: Uses an unreasonable amount of memory to compile QEMU on armel
* LP: #723086: Test regressions in the Fortran test suite
The source tarballs are available from:
https://launchpad.net/gcc-linaro/+milestone/4.5-2011.05-0
https://launchpad.net/gcc-linaro/+milestone/4.6-2011.05-0
Downloads are available from the Linaro GCC page on Launchpad:
https://launchpad.net/gcc-linaro
Mailing list: http://lists.linaro.org/mailman/listinfo/linaro-toolchain
Bugs: https://bugs.launchpad.net/gcc-linaro/
Questions? https://ask.linaro.org/
Interested in commercial support? Inquire at support(a)linaro.org
-- Michael
Hi All,
This is based upon gcc version 4.5.3 (20110221 pre-release).
Any help appreciated.
This shows a bug in the Linaro gcc compiler with the ARM NEON
vset_lane intrinsic.
Note in the objdump that the vmov.8 instructions that place the
value in the vector for the non-q version use lane 1 where they should
use lanes 2 and 3:
18: ee410bb0 vmov.8 d17[1], r0
1c: ee420bb0 vmov.8 d18[1], r0
20: ee400b90 vmov.8 d16[0], r0
3c: ee440bb0 vmov.8 d20[1], r0
For the q version the vmov.8 instructions are correct:
40: ee420bf0 vmov.8 d18[3], r0
54: ee420bd0 vmov.8 d18[2], r0
64: ee400b90 vmov.8 d16[0], r0
70: ee420bb0 vmov.8 d18[1], r0
/* Source code */
#include <arm_neon.h>
static uint8x8_t vec[5];
static uint8x16_t qvec[5];

void set(uint8_t value)
{
    /* 64-bit (non-q) variant: lanes 3 and 2 are miscompiled. */
    vec[1] = vset_lane_u8(value, vec[0], 3);
    vec[2] = vset_lane_u8(value, vec[0], 2);
    vec[3] = vset_lane_u8(value, vec[0], 1);
    vec[4] = vset_lane_u8(value, vec[0], 0);
    /* 128-bit (q) variant: lane numbers are correct. */
    qvec[1] = vsetq_lane_u8(value, qvec[0], 3);
    qvec[2] = vsetq_lane_u8(value, qvec[0], 2);
    qvec[3] = vsetq_lane_u8(value, qvec[0], 1);
    qvec[4] = vsetq_lane_u8(value, qvec[0], 0);
}
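
For comparison, the result expected from each call can be modelled in portable C. This is only a sketch of the intended semantics (a copy of the input vector with one lane replaced). The `model_u8x8` type and `model_set_lane_u8` function are hypothetical stand-ins, not part of arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for an 8-lane vector of bytes; this is NOT the real
   uint8x8_t type, just a hypothetical model for checking expectations. */
typedef struct { uint8_t lane[8]; } model_u8x8;

/* Return a copy of `vec` with the selected lane replaced by `value`.
   All other lanes are left unchanged. */
model_u8x8 model_set_lane_u8(uint8_t value, model_u8x8 vec, unsigned lane)
{
    assert(lane < 8);          /* an 8-lane vector has lanes 0..7 */
    vec.lane[lane] = value;    /* vec is passed by value, so the caller's copy is untouched */
    return vec;
}
```

Under this model, a call with lane 3 changes only element 3 of the result, which is what the `d..[3]` operand in the disassembly should reflect.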
Thx
Lee
Hi there. The 2011.05 release has been spun and is testing up well.
The 4.5 and 4.6 branches are now open so feel free to commit any
approved patches.
-- Michael
Progress:
* Attended LDS from 9th-14th May.
Plans:
* Look at Thumb2 performance blueprint and break it down.
* Investigate more headroom for SPEC2k starting this week.
* Thumb2 performance call this week.
Meetings:
* 1-1s
* T2 performance.
Hello,
- Attended Linaro@UDS.
- SMS patches to support the ARM do-loop pattern were approved in mainline
and merged into gcc-linaro 4.6 and 4.5.
- Sent a merge request for two patches in trunk. (SMS_fixes_for_unfreed_memory)
- Implemented an optimization for the stage-count and am now testing it.
Thanks,
Revital