=== Progress ===
LP1296601 (ICE in push_minipool_fix) [5/10]
* completed a prototype fix
* submitted RFC patch to gcc-patches
* still awaiting review
PR60609 (Error: value of 256 too large for field of 1 bytes) [3/10]
* implemented fix and posted to gcc-patches
* approved, subject to further testing on Thumb-1
libvpx NEON assembler vs intrinsics performance investigation [2/10]
* looking at disassembly, code is not terribly aesthetically pleasing
* in some cases clang looks better
=== Plan ===
write up libvpx investigation
follow up/ping LP1296601
NEON scheduling TCWG-135
TCWG-156 (5/10)
* Hacked v7 memcpy into a memset
* Much fiddling with builds, targets
* Kicked off a benchmark run
Misc
* Meetings (1/10)
* Finding hardware/setting up working environments/figuring out workflows
(4/10)
== Issues ==
* none
== Progress ==
* Launchpad bugs:
o TCWG-422 : ICE in assign_by_spills building linux btrfs module (1/10)
- New failure after first fix reported.
- reduced new testcase.
- Fix committed by Vladimir as rev209038
o Backported "Internal compiler error in push_reload during
bootstrap stage 2" to GCC 4.7 (1/10)
- analysed validation results.
- re-spawned some jobs.
* Backports review: (5/10)
o cortex-a53 support backport:
- We are still not able to validate it on aarch64-linux-gnu target
with a compiler configured to default to cortex-a53, but no regressions
observed in the generic case and on bare_metal (with cortex-a53).
* Misc:
o Cbuildv1 baby-sitting (2/10)
- Toolchain64 disk was full.
o Various meetings (1/10).
== Next ==
- Mainly 4.7 and 4.8 April releases
- TCWG-413 Spec2006 (5/10)
- Analysed 456.hmmer
- In the process of opening performance bug reports
- Started looking at 453.povray
- TCWG-291 CRC (2/10)
- Not seeing performance improvement with redundant "and" instruction gone
- Analysing with perf to see the reason
- LP1301335 (3/10)
- SLP vectorizer ICEs for QT5 Webkit for Linaro 4.7
- Doesn’t occur in trunk/4.8/4.7 FSF
- Proposed a patch (as a merge request) that fixes it
- I also see some FAIL -> PASS in the regression with this patch
- This patch is only relevant for Linaro 4.7 so we can’t/don’t need to
upstream it (?)
== Plan ==
Continue with Spec2006 and crc
4 day week 31-Oct local holiday
Bug fix (2/10)
* Looking at a register allocation issue with ARMv7 hard float. (3/10)
Tried changing the machine description pattern in the gcc 4.8 branch to match trunk.
The issue does not occur on trunk because arm64 has moved to LRA;
with LRA turned off, the bug occurs there too.
Trying to find out whether it is easy to fix in reload or better to wait for the LRA backport.
PGO - AArch64 (TCWG-179) (3/10)
* Native CPU2006 runs on V8 foundation model.
SPEC runs use -O3 -mcpu=cortex-a57. INT benchmark failures seen with mcf
and h264ref.
The remaining benchmarks are still running.
* Tried to use the Ubuntu saucy core image on the V8 foundation model, but
mounting NFS fails.
* Trying to install QEMU user static for aarch64 and use them from
chroot environment
GLIBC Systemtap (2/10)
* Respun the libc SystemTap probe patch for glibc. Will Newton is
testing it on hardware.
Meetings (2/10)
* Attended 1-1 with Ryan to discuss 2014 goal planning.
* Attended 1-1 with Maxim to discuss PGO work.
== Progress ==
* Kernel (TCWG-417)
- Implementing named register global variables (D3261)
- Helping Milosz and Vinicius (LLVMLinux) to get a kernel ready
- First LLVM-compiled kernel booted on Versatile Express hardware
* Background
- Reviewing patches, etc.
- Apple merged their ARM64 back-end, fiddling bots
- Making the new TableGen docs official
- Jira farming
- Became code owner for the ARM Linux support
- LLVM Foundation announced
- Trying to run SPEC on AArch64
* Time
- CARD-124 6/10
- Others 4/10
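For readers unfamiliar with the feature behind D3261: in GNU C, a named register global variable binds a file-scope variable to a machine register, so reading the variable reads the register. The following is a minimal sketch, assuming a GCC-compatible compiler. The `sketch_` names and the per-architecture register table are illustrative assumptions, not taken from the patch.

```c
/* Pick the stack pointer's name for the architecture being compiled.
   This table is an illustrative assumption, not part of D3261. */
#if defined(__arm__) || defined(__aarch64__)
#define SKETCH_SP_NAME "sp"
#elif defined(__x86_64__)
#define SKETCH_SP_NAME "rsp"
#elif defined(__i386__)
#define SKETCH_SP_NAME "esp"
#else
#error "add the stack pointer register name for this architecture"
#endif

/* GNU C extension: reads of this variable read the register directly. */
register unsigned long sketch_stack_pointer __asm__(SKETCH_SP_NAME);

/* Hypothetical accessor: returns the stack pointer as seen in this call. */
static unsigned long sketch_read_sp(void)
{
    return sketch_stack_pointer;
}
```

Only the stack pointer is used here because it is never handed out by the register allocator, which keeps the sketch portable across compilers.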
== Plan ==
* Holiday for two and a half weeks
* Follow up the named register patch
== Progress ==
* glibc patch review (2/10)
* Helping out with aarch64 glibc setjmp/longjmp Systemtap probes testing (1/10)
* Investigated and submitted patch for gas ARM alignment issue (3/10)
* Committed library and script for malloc logging (1/10, TCWG-423)
* Rebased and tidied up malloc microbenchmark (2/10, TCWG-160)
* Various small binutils and glibc patches (1/10)
== Issues ==
* None
== Plan ==
* Submit patch for glibc malloc microbenchmark
--
Will Newton
Toolchain Working Group, Linaro
Hi all,
I've just filed a bug on glibc I'd love you to take a look at:
https://sourceware.org/bugzilla/show_bug.cgi?id=16796
Here's the description to save clicking:
Hi,
There is a test in glibc (tst-tls5) that tests that
((uintptr_t)pthread_self())%16 is zero. But watch this:
(t-mwhudson)mwhudson@am1:~$ cat btp.c
#include <stdint.h>
#include <stdio.h>
#include <pthread.h>
int
main(int argc, char** argv)
{
  uintptr_t p = (uintptr_t)__builtin_thread_pointer();
  uintptr_t q = (uintptr_t)pthread_self();
  printf("p: %lx %ld\n", p, p%16);
  printf("q: %lx %ld\n", q, q%16);
}
(t-mwhudson)mwhudson@am1:~$ gcc -o btp btp.c -lpthread
(t-mwhudson)mwhudson@am1:~$ ulimit -s unlimited
(t-mwhudson)mwhudson@am1:~$ ./btp
p: 2000028d88 8
q: 2000028698 8
(t-mwhudson)mwhudson@am1:~$ ulimit -S -s 8192
(t-mwhudson)mwhudson@am1:~$ ./btp
p: 7f7fd086f0 0
q: 7f7fd08000 0
So something is clearly wrong; maybe it's just that the test is too
strict, but somehow that seems a bit unlikely. FWIW, this doesn't
happen if you don't link with libpthread so maaaaybe it's a bug in
something that ends up in libpthread's .init section?
Cheers,
mwh
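The modulus check that tst-tls5 applies, as described in the report above, can be reproduced on the four reported values. This is a minimal sketch, not the glibc test itself; the helper name `misalignment` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of glibc: how far `addr` sits past the
   previous multiple of `align`. A result of 0 means `addr` is aligned. */
static unsigned long misalignment(uintptr_t addr, size_t align)
{
    assert(align != 0);
    return (unsigned long)(addr % align);
}
```

Applied to the transcript, `misalignment(0x2000028d88, 16)` and `misalignment(0x2000028698, 16)` both give 8 (the run with an unlimited stack), while `0x7f7fd086f0` and `0x7f7fd08000` give 0 (the run with an 8192 KB soft limit).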
== Week of March 24th ==
- STREAM regression (TCWG-388, 5/10)
-- Finished prototype patch. The patch adds modeling of ARM L2 auto-prefetcher hardware to GCC scheduler (the model is very simple as auto-prefetcher is very lightly documented). Half of the patch cleans up and improves GCC scheduler, and the other half implements the auto-prefetcher model.
-- While looking into ARM scheduling support, I noticed that ARM doesn't use multipass lookahead scheduling, which surprised me. Enabled it (multipass scheduling) in my patches.
- Looked into lll_timed_wait Glibc/uClibc bug upstream (1/10)
-- https://sourceware.org/ml/libc-alpha/2014-03/msg00905.html
- Various discussions and reviews (4/10)
== Week of March 30th ==
- STREAM regression (TCWG-388)
-- Benchmark patches on SPEC2k and find/confirm best values for tuning parameters:
--- dfa_lookahead: should normally be issue_rate-1.
--- L2 auto-prefetcher queue depth: new tuning knob.
-- Investigate any performance regressions from the patches.
- lll_timed_wait Glibc/uClibc bug
-- Make sure it is fixed upstream. Possibly backport to Linaro branches.
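The dfa_lookahead rule of thumb listed above (issue_rate-1) can be written down as a one-line function. In GCC the lookahead value is supplied by the `TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD` target hook and the issue rate by `TARGET_SCHED_ISSUE_RATE`. The helper below is a hypothetical stand-alone sketch of the arithmetic only, not code from the patch, and it assumes a result of 0 means no lookahead.

```c
#include <assert.h>

/* Hypothetical helper, not GCC code: the "issue_rate - 1" rule of thumb.
   Assumption: a result of 0 means the scheduler does no lookahead. */
static int sketch_dfa_lookahead(int issue_rate)
{
    assert(issue_rate >= 1);
    return issue_rate - 1;
}
```

Under this rule a dual-issue core gets a lookahead of 1 and a triple-issue core a lookahead of 2; the benchmarking planned above is what would confirm or adjust those values.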
--
Maxim Kuvyrkov
www.linaro.org
== Issues ==
* none
== Progress ==
* Launchpad bugs:
o TCWG-422 : ICE in assign_by_spills building linux btrfs module (1/10)
- created blueprint for :
https://bugs.launchpad.net/gcc-linaro/+bug/1296676
- Reported upstream as :
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60650
- Reduced testcase
- Fix committed by Vladimir as rev208876
- still ICE when configured for arm-none-linux-gnueabihf
o Backported "Internal compiler error in push_reload during
bootstrap stage 2" to GCC 4.7 (1/10)
- https://bugs.launchpad.net/gcc-linaro/+bug/1129013
- Some testsuite regressions observed; will investigate.
* Backports review: (5/10)
o reviewed backport for pr60264 and rev202663
o cortex-a53 support backport:
- Analysed testsuite regression
- 22K LOC patch under review
* LRA on AArch32:
o TCWG-345 : Analyse performance of LRA for ARM. (1/10)
- looked at the perf tool results
* Misc:
o Various meetings (1/10).
o Various support to team members (1/10)
o Cbuildv1 baby-sitting (Calxeda nodes have to be restarted after
each upgrade!)
== Next ==
- continue cortex-a53 review
- continue on backports.
- continue on TCWG-345.