== Progress ==
VRP-based zero/sign extension
- Got some review comments on the patch and started addressing them
- Split the patch into two: 1. propagating the value range and 2. doing
the RTL expansion
- Testing in progress
specfp regression
- Benchmarked spec2k on A15 with trunk and couldn't reproduce the regression
- Benchmarked spec2k on A9 with trunk and couldn't reproduce it with
revisions between the 24th and the 28th
== Leave ==
- Monday off sick
== Plan ==
VRP-based zero/sign extension
- Send the patch for review
specfp regression
- Benchmark the trunk version from the 23rd
- Resolve the regression
Sorry, this covers the last two weeks, not just one.
== Progress ==
* Created database schema for DejaGnu test results
* Created data schema for benchmarks
* Wrote scripts to convert benchmark and test data into a form that
can be imported into a database; added them to the DejaGnu branch
* Imported all historical benchmark data
* Imported most historical test results (GCC still importing)
* Did some experimental graphs of test results
* Read lots of web pages to come up to speed on Linaro; registered
for lots of sites and accounts
* Learned about Cbuild and LAVA
* Started on Cbuild v2
* Installed Ubuntu on Chromebook
== Plan ==
* Write Cbuild v2 design doc
* Continue work on Cbuild v2 to be able to use it for the June release
* Get remote testing working with Chromebook & foundation model
* More support tasks resulting from the move off Launchpad
- rob -
Greetings,
I'm using the Linaro toolchain with Eclipse (Juno) under Windows and
OpenOCD to write firmware for an STM32F20x-based design (using an ST-Link2
debugger).
In general, that all works fairly well.
The part I'm having problems with is debugging (step-in, etc.) from Eclipse.
The execution flow seems chaotic when single-stepping through C code: it
skips statements, jumps into the middle of a function and then back to its
start, loops over certain statements even though there is no loop in the
code, and so on. (It's close to useless.)
I have seen this behavior with other IDEs and toolchains when code was
built with optimization turned on.
However, I specify 'no optimization' (-O0) when I build my code.
My questions:
a) Is there some implicit optimization being done by the compiler, even
though I tell it not to, which may affect proper debugging?
b) Are other people using Eclipse (Juno) and are they seeing the same
issue? Are there any known ways to fix this chaotic debugger behavior?
Kind regards,
~ Paul Claessen
== Progress ==
* binutils testsuite on ARM finally green in cbuild!
* Tested the bionic memcpy patches and pushed them to Gerrit.
* Investigated binutils native AArch64 testsuite failures (not IFUNC related).
* Made a start on the DeveloperTools/LibraryPerformance wiki.
* Started looking at the Android memcpy problem on Galaxy Nexus.
== Issues ==
* binutils "make; make check" takes over 24 hours on the foundation model!
== Plan ==
* Respin the AArch64 IFUNC binutils patch once the relocation number is allocated.
* Set up git mirrors for binutils, glibc and newlib.
* Android memcpy issue.
--
Will Newton
Toolchain Working Group, Linaro
== Progress ==
* Disable-peeling: looking at how to have less aggressive vectorization
* Libsanitizer/aarch64: initiated upstream discussion
* PGO/LTO bug reported by Doko: SD card too small to reproduce the problem
* Merges for linaro-gcc-2013.06: started looking at what to backport,
started merges
* Jira/wiki: started cleanup/collecting new cards
* Internal support
== Next ==
* Jira: update status on cards/blueprints backported from Launchpad
* Merges for linaro-gcc-2013.06: continue collecting relevant merges
* Disable-peeling: continue investigating vectorizer behaviour
* Libsanitizer/aarch64: look at frame implementation
* PGO/LTO: complete build of python
* Neon intrinsics: continue improving crc with vuzp/veor
Progress:
* misc
** got raring/aarch64 cross build set up
** reducing the number of places that need changing for a new qemu
target: sent some simple configure patches
** some 32-bit cleanup work that will help with getting John's
AArch64 patches to work
** tested Huawei's aarch64 patches and confirmed they work
** rebased qemu-linaro (and passed the results to Serge H for Ubuntu)
** sent patches which make QEMU builds for arm/ppc/microblaze guests
require libfdt, since a non-FDT-aware ARM QEMU is rapidly becoming
less and less useful
Plans:
* handover from John Rigby
* VIRT-55: talk to Andre about testing; investigate testing migration
using LAVA
* set up a new qemu-linaro tree/branch as our CI/LAVA input [to keep it
separate from our "we release this" tree]
* restart work on upstreaming omap3 patches as part of my generic qemu
maintenance work (will reduce our maintenance burden in the long term)
-- PMM
== Progress ==
* Buildbots
- Self-hosting bot online
- Fiddling with MCJIT tests to get bots green
* Benchmarks
- Running Phoronix benchmarks: GCC vs. LLVM, good results
- Got a sample of the PerfDB SQLite database, writing some queries
* Jira/Wiki farming
- Creating loads of new cards, blueprints, sub-tasks
- Adding content to the wiki pages about processes, cards, etc.
* Release 3.3
- RC2 is out, no regressions, already in the official repository
* EuroLLVM 2013
- Monthly call, wrap-up, preview of next year's
== Plan ==
* Try running a CBuild benchmark with LLVM 3.3 (Rob?)
* Automate release process, maybe we can do that every month
* Automate Phoronix test (GCC+LLVMrel+LLVMsvn)
* Follow up on Panda/Arndale ordering, needed for buildbots
* Try to extract useful information from perf database
Hi all,
I've spent a little while porting an optimization from Python 3 to
Python 2.7 (http://bugs.python.org/issue4753). The idea of the patch is
to improve performance by dispatching opcodes on computed labels rather
than a big switch -- and so confusing the branch predictor less.
The problem with this is that the last bit of code for each opcode ends
up being the same, so common subexpression elimination wants to coalesce
all these bits, which neatly and completely nullifies the point of the
optimization. Playing around with building directly from source, it
seems that -fno-gcse prevents gcc from doing this, and the resulting
interpreter shows a small performance improvement over a build that does
not include the patch.
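To make the shape concrete, here is a rough sketch of the kind of dispatch
involved (purely illustrative, not the actual ceval.c code): each opcode
body ends with the same fetch-and-jump tail, and it is precisely these
duplicated tails that GCSE is tempted to merge back into one shared block,
recreating the single indirect branch of a big switch.

    /* Simplified sketch of computed-goto dispatch (a GCC extension);
     * illustrative only, not the real ceval.c interpreter. */
    #define DISPATCH() goto *dispatch_table[*pc++]

    static int run(const unsigned char *pc, long *acc)
    {
        static void *dispatch_table[] = { &&op_incr, &&op_decr, &&op_halt };

        DISPATCH();

    op_incr:
        (*acc)++;
        DISPATCH();   /* duplicated tail: its own indirect branch */
    op_decr:
        (*acc)--;
        DISPATCH();   /* duplicated tail: its own indirect branch */
    op_halt:
        return 0;
    }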
However, when I build a debian package containing the patch, I see no
improvement at all. My theory, and I'd like you guys to tell me if this
makes sense, is that this is because the Debian package uses link time
optimization, and so even though I carefully compile ceval.c with
-fno-gcse, the common subexpression elimination happens anyway at link
time. I've tried staring at disassembly to confirm or deny this but I
don't know ARM assembly very well and the compiled function is roughly
10k instructions long so I didn't get very far with this (I can supply
the disassembly if someone wants to see it!).
Is there some way I can tell GCC not to perform CSE on a section
of code? I guess I can make sure that the whole program, linker step
and all, is compiled with -fno-gcse but that seems a bit of a blunt
hammer.
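For what it's worth, here is a sketch of the narrower knobs GCC offers (the
per-function optimize attribute and the matching pragma); the function names
are just illustrative, and whether these options survive the LTO link is
exactly the open question above.

    /* Sketch: per-function and per-region ways to ask GCC for -fno-gcse.
     * Function names are illustrative, not the real ceval.c entry points,
     * and the interaction with LTO is unverified. */

    /* 1. Attach the optimize attribute to the function definition. */
    __attribute__((optimize("no-gcse")))
    void interpreter_loop(void)
    {
        /* ... opcode dispatch loop ... */
    }

    /* 2. Or push/pop the option around a region of the file. */
    #pragma GCC push_options
    #pragma GCC optimize ("no-gcse")
    void another_hot_function(void)
    {
        /* ... */
    }
    #pragma GCC pop_options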
I'd also be interested if you think this class of optimization makes
little sense on ARM and then I'll stop and find something else to do :-)
Cheers,
mwh
The v8 Foundation Model User Guide has a bare-metal hello world example that uses semihosting. The Makefile uses the ARM tools, however. Is there equivalent support for this example using a bare-metal version of the GNU tools, such as gcc-linaro-aarch64-none-elf-4.8-2013.04-20130422_linux.tar.xz? I took a look, but didn't see a way to do this.
Of course, running the Linaro Linux port on the v8 Foundation Model allows one to run hello world and much more, but I'm currently only interested in a bare-metal target using the GNU tools.
Thanks, Don
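[For reference, a rough sketch of what a semihosted bare-metal "hello" could
look like with a GNU AArch64 toolchain, going straight to the documented
AArch64 semihosting convention (operation number in W0, parameter in X1,
trap via HLT #0xF000) from inline assembly. The startup code, linker script,
exact toolchain invocation, and whether the 2013.04 aarch64-none-elf
release's newlib provides a higher-level semihosting layer are all left
open.]

    /* Sketch only: bare-metal semihosted output for an AArch64 model.
     * Assumes the standard AArch64 semihosting convention (operation in
     * W0, argument in X1, trap via HLT #0xF000); startup code, linker
     * script and the exact toolchain invocation are omitted. */

    #define SYS_WRITE0 0x04UL   /* write a NUL-terminated string */

    static void semihost_write0(const char *s)
    {
        register unsigned long op  __asm__("x0") = SYS_WRITE0;
        register const char   *arg __asm__("x1") = s;

        __asm__ volatile("hlt #0xf000"
                         : "+r"(op)
                         : "r"(arg)
                         : "memory");
    }

    int main(void)            /* reached from whatever startup code is used */
    {
        semihost_write0("Hello from the Foundation Model\n");
        return 0;             /* a real program would follow with SYS_EXIT */
    }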