Merged FSF GCC 4.7 to the Linaro GCC 4.7 branch.
Merged from GCC 4.6.3 release to Linaro GCC 4.6 branch.
Wrote and posted a patch to load DImode immediate constant into NEON
registers properly. Unfortunately, testing showed a bootstrap stage 2
vs. 3 miscompare, so there's something not quite right. However,
disassembly of the binaries hasn't revealed any problems, so this
failure is still a mystery. More investigation required.
Wrote and posted a patch to do DImode negation in NEON. Realised that I
had forgotten to do the core-register fall-back case; posted a new
version. Again, there's something annoyingly subtle that prevents
bootstrap. This time it looks like some sort of wrong code bug.
Investigating.
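For readers unfamiliar with the core-register fall-back mentioned above: a 64-bit (DImode) negation on a 32-bit core has to be done as two 32-bit halves with a borrow between them. The sketch below only models that arithmetic; it is not the posted patch, and the function name and the (lo, hi) representation are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF

def negate_di(lo, hi):
    """Negate a 64-bit value held as two 32-bit halves (hypothetical helper)."""
    # Low half: 0 - lo, wrapped to 32 bits.
    new_lo = (0 - lo) & MASK32
    # A borrow is needed whenever the low half was non-zero.
    borrow = 1 if lo != 0 else 0
    # High half: 0 - hi - borrow, wrapped to 32 bits.
    new_hi = (0 - hi - borrow) & MASK32
    return new_lo, new_hi
```

For example, negate_di(1, 0) gives (0xFFFFFFFF, 0xFFFFFFFF), the two halves of -1.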
Wrote and posted a patch to do DImode one's complement in NEON. Richard
E questioned how it was written though. The tests passed successfully,
so that's a novelty this week!
Looked for other NEON instructions missing support. Didn't find any ...
but the machine description isn't exactly straightforward.
Considered the problem of choosing whether or not to do an operation
in NEON. Discussed the existing state and possible solutions
with Ramana and Bernd (thanks, guys). Thought about it some more. Posted
a vague description of what might fix it to the linaro-toolchain list.
Awaiting replies.
Hi!
* Development benchmarks:
Finished the first implementation, sent to Michael for review.
* This became a very short week because of sickness.
Plans for week 10 are to triage existing bugs, and to get going again with
the SunSpider benchmark.
Regards
Åsa
Summary:
* Multilib on linaro toolchain.
Details:
1. Multilib test for linaro toolchain.
* Fix the multilib build issue when multiarch patch is applied.
* Fix a bug in crosstool-ng upstream to build multilib for glibc/eglibc.
* Successfully built a multilib toolchain for armv6t2, cortex-a8 and
cortex-a9. Tests show it can link the correct libraries.
2. Still issues:
* Multilib and multiarch use different directory structures.
The multilib build cannot find the correct directories in the prebuilt
oneiric-sysroot.
* Build armv5t arm mode lib when default mode is thumb for armv7-a.
Annual leave:
* Mar. 1-2.
Plans:
* Finalize the multilib solution for linaro binary toolchain.
* Work on code size optimization for the embedded toolchain.
Best regards!
-Zhenqiang
> The basic idea is that we add a new RTL optimization pass (or two) that
> assesses the usage of pseudo registers, and makes recommendations about
> what register class each should end up in, if there's a choice. These
> recommendations would then be used by later passes to get a better use
> of NEON. I might call this the "prealloc" pass, or something.
That sounds very much like the pre-reload that "new-ra" had at one
point (http://gcc.gnu.org/viewcvs/branches/new-regalloc-branch/gcc/pre-reload.c).
The problem with pre-reload for new-ra was that it was basically
reload instead of something nicer and cleaner. It also only ran just
before the register allocator, which is too late for the problem you
are trying to solve.
> Firstly, for each pseudo-register in a function, the pass would look at
> the insn constraints for each "def" and "use", and see how the registers
> relate to one another. This might determine things like "if rN is in
> class A, then rM must be also in class A".
At SUSE I tried to do this with the webizer pass (web.c). I wrote down
the ideas we implemented at the time (see
http://gcc.gnu.org/ml/gcc/2005-01/msg00179.html):
- web class, to replace regclass and choose register classes for webs
instead of pseudos. This also includes splitting webs if a register
in a web really wants to be in two different classes to satisfy
constraints in two different insns. Right now, as far as I
understand, regclass just picks one and lets reload figure out how to
fix up that mistake.
- A semi-strict RTL mode. Right now there is just strict and
non-strict. On the branch there is a semi-strict mode which is the
same as strict RTL except that pseudo-registers are still allowed.
- pre-reload (which is related to web class) to make sure as many insn
constraints as possible are satisfied before the register allocator
goes to work. Basically, after pre-reload the insn stream should be
in semi-strict RTL form.
I used the webizer to unify defs and uses. I would split a web if it
needed multiple register classes (I inserted a mov, without checking
that a move existed from the source to the target register class), and
I put pseudos r1 and r2 in the same register class if there was an
insn (set (r1) (r2)) somewhere. The selection of the register classes
had a cost function, but I used rtx_cost, which is not very effective,
really. But I never took this experiment very far because for x86-64
the plan didn't work as well as I had hoped. I don't remember the
details, but the biggest problem I had with the experimental
implementation of these ideas (apart from lots of trouble with recog
for semi-strict RTL) was that there is a bit of an ordering problem
between combine on the one hand and web-based register classes on the other. If
you assign classes too early and don't allow things to change, then
combine fails too often. If you assign register classes after combine,
you may not get the instructions selected the way you want them to be.
This was when GCC still had the old local-alloc.c and global.c
allocators. Things may be different (better) with IRA and the upcoming
LRA stuff.
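The web-unification step described above can be modelled roughly as a union-find over pseudos: a move between two pseudos merges their webs, each insn constraint adds a required class to the web, and a web that ends up needing more than one class is a candidate for splitting. This is only a sketch of the idea under those assumptions. The class name `Webs`, its methods, and the register-class strings are hypothetical and do not come from web.c or the branch.

```python
class Webs:
    """Toy union-find that groups pseudos into webs (hypothetical model)."""

    def __init__(self):
        self.parent = {}    # pseudo -> parent pseudo
        self.classes = {}   # web root -> set of required register classes

    def find(self, reg):
        if reg not in self.parent:
            self.parent[reg] = reg
            self.classes[reg] = set()
        while self.parent[reg] != reg:
            # Path halving keeps the trees shallow.
            self.parent[reg] = self.parent[self.parent[reg]]
            reg = self.parent[reg]
        return reg

    def union(self, a, b):
        """Record a move such as (set (a) (b)): both go in one web."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra
            self.classes[ra] |= self.classes.pop(rb)

    def require(self, reg, reg_class):
        """Record that some insn constraint wants reg in reg_class."""
        self.classes[self.find(reg)].add(reg_class)

    def conflicts(self):
        """Webs that want more than one class and would need splitting."""
        return {r: c for r, c in self.classes.items() if len(c) > 1}


webs = Webs()
webs.union("r1", "r2")        # an insn (set (r1) (r2)) exists somewhere
webs.require("r1", "NEON")    # one insn wants r1 in a NEON register
webs.require("r2", "CORE")    # another wants r2 in a core register
webs.require("r3", "CORE")    # unrelated pseudo, no conflict
split_candidates = webs.conflicts()
```

Here the web containing r1 and r2 is reported as a split candidate, while r3 is not. A real pass would also need the cost function and the move-availability check that the text says were weak or missing in the experiment.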
If you plan to work on this, I would suggest you discuss the plan on
the GCC mailing list also, with Jeff Law and Vladimir Makarov in CC
because they are working on a reload rewrite (LRA).
Ciao!
Steven
Hi,
OpenEmbedded:
* Worked on the meta-linaro layer and added libgcc and crosssdk
recipes to satisfy some bitbake dependencies
* I had to apply a few patches to build the linaro toolchain the OE
way (mostly gcc configury)
* successfully built the sato and Qt images
* Moved on to test the February release of the linaro binary toolchain
and (probably) hit an issue with unaligned SD card images used
with QEMU
* the guest kernel fails with: attempt to access beyond end of device
* /proc/partitions shows different block sizes (host vs. guest)
* the image size gets calculated on the fly by OE
* posted a patch that allows specifying a rootfs size alignment
* not seen on trunk as they use IDE
* Started to rebase the meta-linaro layer against current OE-core
* created https://wiki.linaro.org/KenWerner/Sandbox/OEMetaLinaroCard
based on the existing card of David R.
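The rootfs size alignment mentioned in the SD card item amounts to rounding the computed image size up to a whole multiple of some alignment. A minimal sketch of that arithmetic, assuming sizes in KiB, is shown below. The function name and units are illustrative and are not taken from the posted patch or from OE.

```python
def align_up(size_kib, alignment_kib):
    """Round size_kib up to the next multiple of alignment_kib."""
    if alignment_kib <= 0:
        raise ValueError("alignment must be positive")
    return ((size_kib + alignment_kib - 1) // alignment_kib) * alignment_kib
```

For example, align_up(1000, 512) returns 1024, while a size that is already aligned, such as 1024, is returned unchanged.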
Regards,
Ken
== GCC ==
* Fixed mainline regression causing ICE in certain outer-loop
vectorization cases.
* Merged fwprop-subreg patch into Linaro GCC 4.7.
* Completed patch to generate usat/ssat instructions
where appropriate; checked into GCC mainline.
Merge requests to Linaro GCC 4.6 and 4.7 pending.
* Ongoing work on improving end-of-loop value computation.
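As background for the usat/ssat item: these ARM instructions clamp a value to an n-bit unsigned or signed range, so a compiler can use them when it recognises a clamp written as a pair of min/max operations. The sketch below models only the saturation semantics. It does not show which source patterns the patch actually matches, and the function names are just labels for the model.

```python
def ssat(value, bits):
    """Clamp value to the signed range of a bits-wide integer."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))

def usat(value, bits):
    """Clamp value to the unsigned range of a bits-wide integer."""
    return max(0, min((1 << bits) - 1, value))
```

For example, ssat(200, 8) gives 127 and usat(-5, 8) gives 0.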
Best regards
Ulrich Weigand
=== Progress ===
* Finished off PGO patch - sent upstream.
* Finished off the ABI tests - sent upstream.
* Investigated fixes for LP 942307 - a problem with kernel builds for
Android. Backported a fix that Uli made last year.
* Upstream patch review.
* Small configury done for SPEC2k as far as HC partitioning goes.
* Some Android benchmark investigations.
* Recovered from a broken upgrade of my laptop from natty to oneiric
and then went all the way to Precise. It works reasonably well!
=== Plans ===
* Commit all approved and tested patches.
* Check on HC partitioning results from SPEC2k and make sure there is
an improvement and the feature works!
* Investigate https://bugs.launchpad.net/gcc-linaro/+bug/924726 in a
little more detail.
* Get back to partial-partial PRE.
Absences.
* 1 week holiday sometime before that - to be booked.
* Linaro Connect Q2.12 - May 28 - June 1 - travel booked - hotel to be booked.
Current Milestones:
|| || Planned || Estimate || Actual ||
||cp15-rework || 2012-01-06 || 2012-??-?? || ||
(new blueprints & reestimate for this one pending)
Historical Milestones:
||a15-usermode-support || 2011-11-10 || 2011-11-10 || 2011-10-27 ||
||upstream-omap3-cleanup || 2011-11-10 || 2011-12-15 || 2011-12-12 ||
||initial-a15-system-model || 2012-01-27 || 2012-01-27 || 2012-01-17 ||
||qemu-kvm-getting-started || 2012-03-04?|| 2012-03-04?|| 2012-02-01 ||
== cp15-rework ==
* ploughing through conversion of cp15 registers to new design:
patchset now 20 patches long, still TODO crn={0,1,6,7,9}
== other ==
* reviewed more Xilinx Zynq model patches
* looking at BE8 support: Paul Brook has posted some patches
to support this in user mode
* LP:944645: fixed bug where we weren't clearing the IT bits when
entering an M profile exception handler
* sent out an arm-devs.next pullreq
* trying to track down why linux-user is failing brk() and thus
causing bash segfaults
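To illustrate the LP:944645 item: the IT (if-then) state lives in the program status register, and if it is left set on exception entry the first instructions of the handler could be treated as part of a stale IT block. The sketch below assumes the architectural layout in which IT[1:0] sit at bits 26:25 and IT[7:2] at bits 15:10. It models only the masking and is not QEMU's code, and the names are hypothetical.

```python
# Assumed layout: IT[1:0] in bits 26:25, IT[7:2] in bits 15:10.
IT_MASK = (0x3 << 25) | (0x3F << 10)   # 0x0600FC00

def clear_it_bits(psr):
    """Return the 32-bit status value with the IT state cleared."""
    return psr & ~IT_MASK & 0xFFFFFFFF
```

All other bits, including the Thumb bit at bit 24, pass through unchanged.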