All,
As I mentioned on Monday, due to new members of the team coming on
board I am going to rejig the meeting times from 18th December.
My proposal is as follows (all times UK Local - UTC+0h winter, UTC+1h summer):
0. These changes will come into effect the week commencing 18th December.
1. The weekly call will happen on a Monday, at 0930 and 1600. You
only need to make it to one instance of the call. For those who are
physically near each other arranging it so that both calls get
attended would be good and notes shared would be good.
2. The Stand-up call will happen on a Thursday, alternating between
0930 and 1600. This is 'best effort' - but please try to attend at
least every other week.
3. The performance call will go weekly, and will happen on a Thursday,
alternating between 0930 and 1600. If you are on the invite list to
this call again please try to attend at least every other week, but if
you can manage every week that would be great.
If you have any comments on these please can you raise them by the end
of the week as I would like to confirm what we are going to do at the
call on Monday (at the normal time of 0915UTC).
Thanks,
Matt
--
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-dann(a)linaro.org
The minutes of the performance call held on 3 December 2012 can be found at:
https://wiki.linaro.org/WorkingGroups/ToolChain/Meetings/2012-12-03
In summary the actions from the meeting are:
* ACTION: Yvan will do the trunk merge this week
* ACTION: Yvan to do the GCC release next week
* ACTION: Christophe to do the GDB release branch merge
* ACTION: Matt volunteered Zhenqiang as the QEMU manual tester this month
* ACTION: Matt: want shrinkwrap to go upstream
* ACTION: Matt - Can blueprint improvements in shrink wrap past that
* ACTION: Matt to create a blueprint on aarch32 ARMv8 instructions in QEMU
* ACTION: Matt to decide if aarch32 ARMv8 instructions in QEMU needs a card
* ACTION: Christophe to send Michael cross build failures to investigate
Thanks,
-- Michael
== Progress ==
* Turn off 64-bits bitops in Neon: patch proposed upstream after
positive benchmarking.
Re-submitted after request to add testcases and documentation for
the new option.
* Disable peeling: running benchmarks with peeling completely disabled
to see the impact.
* PGO/hot-cold partitioning: tested new patch from google, which
solves some of the ICEs, but makes new ones appear.
* builtin_bswap16 backport to Linaro-4.7: checking if there would be a
possibility to keep the testcase which fails in one of our
configurations, after rebasing my branch.
* Trying to bootstrap gcc-linaro/4.7 on board
* Internal support
== Next ==
* Follow-up on 64-bits bitops in Neon
* Look at benchmarks results with peeling disabled
* finish builtin_bswap16 backport
Summary:
* Verify shrink-wrap related bugs.
* Validate and release Linaro toolchain binary 2012.11 release.
* Collect performance data for different branch costs.
Details:
* Validate and release Linaro toolchain binary 2012.11.
* Test aarch64 toolchain. All the basic tests PASS except gdbserver.
* Collect performance data for branch cost combination. For eembcv1,
some combinations have more than 2% performance improvement on
PandaBoard THUMB mode. More test results will come later.
* Verify shrink-wrap related bugs (http://goo.gl/6fGg5). All pass with
the new patch. Native tests are ongoing. Identify the root cause why
the copy, which blocks the shrink-wrap optimization for 453.povray
benchmark, is not optimized.
Plans:
* Finalize the aarch64 toolchain binary release plan.
* Collect performance data for branch cost tuning.
* Enhance shrink-wrap to optimize the copy.
Best regards!
-Zhenqiang
== Progress ==
* Boehm GC AArch64 support:
- Read wikis and papers on the memory model
- Reported an issue with the current ARMv7 atomic builtins
- Submitted the fix, which was approved
- Improving libatomic-ops AArch64 support with load-acquire/
store-release usage.
== Next ==
* Continue on the Boehm GC AArch64 support.
[Short week: 3 days]
* looked at (but failed to reproduce) a hang in QEMU reported
by Christoffer when shutting down a KVM ARM guest using TUN/TAP
networking
* investigated LP:1084148 (segfault in qemu usermode) sufficiently
to diagnose it as probably another of qemu's "can't handle
multithreaded guest programs" bugs
* fixed some problems with QEMU's secondary CPU boot code which
were masked by errors in QEMU's GIC model but revealed by
real hardware (ie KVM); fixed the GIC model bugs as well
* investigated LP:955379 (cmake hangs under qemu-arm-static).
Tracked down to a race condition involving signal delivery,
the fix to which would require the significant redesign I
sketched out here a year or so ago:
http://lists.gnu.org/archive/html/qemu-devel/2011-12/msg00384.html
KVM blueprint progress tracker:
http://ex.seabright.co.nz/helpers/backlog?group_by=topic&colour_by=state&pr…
-- PMM
== Blueprints ==
Initial Current Actual
initial-aarch64-backport 31 Oct 2012 7 Dec 2012*
aarch64-baremetal-testing 31 Oct 2012 7 Dec 2012*
fix-gcc-multiarch-testing 31 Dec 2012 31 Dec 2012
backport-fma-intrinsic 31 Dec 2012 31 Dec 2012
fused-multiply-add-support 31 Dec 2012 31 Dec 2012
gcc-investigate-lra-for-arm 31 Dec 2012 31 Dec 2012
== Progress ==
* Admin
* Interviewing
* Preparation for taking over from Michael
* Investigate patches for literal pool layout bug
* Applied
* PINGed triplet backport patches upstream
* Other bug issues
* Including an issue running SPEC2K on x86 with recent trunk
* And a 4.6 gcc-linaro only issue
== Next Week ==
* Start leading Toolchain team
* Run HOT/COLD partitioning benchmarks
* Analyse ARM results
* On x86_64 to see what the actual benefit we could get
* initial-aarch64-backport & aarch64-baremetal-testing
* Finish documentation
* gcc-investigate-lra-for-arm
* Analyse benchmarks
* fix-gcc-multiarch-testing
* Come up with strawman proposal for updating testsuite to handle
testing with varying command-line options.
== Future ==
* backport-fma-intrinsic & fused-multiply-add-support
* Backport patches once fix-gcc-multiarch-testing has been done.
== Planned Leave ==
* Monday 24 December - Monday 31 December
--
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-dann(a)linaro.org
Hi,
I think I have identified some issues with the atomic builtins, but I want
your advises.
For instance :
A: __atomic_store_n (addr, val, __ATOMIC_SEQ_CST);
gives the armv7 code:
DMB sy
STR r1, [r0]
DMB sy
but if I have well understood, the DMBs instructions only provide the
property that the
code is sequentially consistent, but not the atomicity for which we have to
use the
LDREX/STREX instructions. Thus I think that the code should be :
DMB sy
1: LDREX r2, [r0]
STREX r1, r2, [r0]
TEQ r1, #0
BNE 1b
B: __atomic_load_n (addr, __ATOMIC_ACQUIRE);
gives the armv7 code:
DMB sy
LDR r0, [r0]
but the load-acquire semantique specifies that all loads and stores
appearing in program order
after the load-acquire will be observed after the load-acquire, thus the
DMB should be after the
LDR, no ?
--
Yvan