Summary:
* Test shrink-wrap code
Details:
1. Add simple_return support in function thumb2_expand_return for
shrink-wrap. Here is the make check status:
* One new failure is due to a code size increase. We would disable it when
optimizing a function for size on Thumb-2.
* The other new failures are due to DWARF info. The root cause is an ICE in
function maybe_record_trace_start:
gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row));
Here is the failing code segment:
tst ... L1
push {r4}
...
ldr r4, ...
L1:
bx lr // common simple return from two branches.
Here are the results for cur_row and ti->beg_row of trace starting at L1:
{cfa = {offset = 0, base_offset = 0, reg = 13, indirect = 0, in_use =
0}, cfa_cfi = 0x0, reg_save = 0x0}
{cfa = {offset = 4, base_offset = 0, reg = 13, indirect = 0, in_use =
0}, cfa_cfi = 0x0, reg_save = 0x7ffff726ab70}
I tried gcc-linaro-4.5-2011.03. It does not generate the common bx lr:
test L1
push {r4}
...
pop {r4}
bx lr
L1:
bx lr
There is a similar bug, but its fix does not help us:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50833
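The row mismatch at L1 can be illustrated with a small model. The following Python sketch is not GCC code: Row, step and rows_at_label are hypothetical names, a row only tracks the CFA offset and the set of saved registers, and the elided "ldr r4, ..." is assumed not to change the row because its operands are not shown. It shows that when the fall-through path reaches L1 without undoing the push, the two edges into L1 carry different rows, which is the condition the assertion rejects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """Simplified stand-in for a CFI row: CFA offset from sp plus saved registers."""
    cfa_offset: int = 0
    saved: frozenset = frozenset()


def step(row, insn):
    """Return the row after one instruction, given as (opcode, reg, reg, ...)."""
    op, *regs = insn
    if op == "push":
        return Row(row.cfa_offset + 4 * len(regs), row.saved | frozenset(regs))
    if op == "pop":
        return Row(row.cfa_offset - 4 * len(regs), row.saved - frozenset(regs))
    return row  # assumption: every other instruction leaves the row unchanged


def rows_at_label(fallthrough):
    """Rows on the two edges into L1: the early branch and the fall-through path."""
    branch_row = Row()  # the branch to L1 is taken before anything is pushed
    row = Row()
    for insn in fallthrough:
        row = step(row, insn)
    return branch_row, row


# Shrink-wrapped code: the fall-through path reaches the shared "bx lr" after a push.
branch_row, fall_row = rows_at_label([("push", "r4"), ("ldr", "r4")])
print(branch_row == fall_row)  # the rows differ, so a consistency check would fail

# With a pop before the return (as the 4.5 output has) the row is back at the entry state.
branch_row, fall_row = rows_at_label([("push", "r4"), ("pop", "r4")])
print(branch_row == fall_row)
```

In this model the fall-through row has offset 4 and one saved register, while the branch row has offset 0 and none, matching the shape of the two rows printed above.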
Plans:
* Continue shrink-wrap task.
Best regards!
-Zhenqiang
== Progress ==
* Neon vext support for builtin_shuffle:
* Committed vext patch upstream, as well as a small cleanup patch.
* Merged vext support into gcc-linaro/4.7 branch.
* Posted upstream a follow-up patch to make vext tests support
big-endian.
Filed PR 54517: wrong code generation in big-endian with inline in
these tests.
* Updated patch to fix 3 testcases in big-endian after upstream comments.
* Implement builtin_bswap16:
* Posted upstream a patch to implement bswap16. Discussion on-going,
to have less duplication between arm and thumb patterns.
* Investigating how to make GCC catch the (x<<8)|(x>>8) construct
(where x is unsigned short) and map it to rev16, like
builtin_bswap16.
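As a reference for what the (x<<8)|(x>>8) construct computes, here is a minimal Python sketch (the function names are made up for illustration). Python integers are unbounded, so the masks stand in for the truncation to unsigned short that the C expression relies on. The sketch only illustrates the arithmetic, not how GCC would match the pattern.

```python
def swap_bytes_16(x):
    """Swap the two bytes of a 16-bit value, mirroring (x << 8) | (x >> 8)."""
    x &= 0xFFFF  # model x as an unsigned 16-bit value
    return ((x << 8) | (x >> 8)) & 0xFFFF  # truncate the result back to 16 bits


def swap_bytes_16_reference(x):
    """Same result via a byte-order round trip, used as a cross-check."""
    return int.from_bytes((x & 0xFFFF).to_bytes(2, "little"), "big")


for value in (0x0000, 0x00FF, 0x1234, 0xABCD, 0xFFFF):
    assert swap_bytes_16(value) == swap_bytes_16_reference(value)

print(hex(swap_bytes_16(0x1234)))  # 0x3412
```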
== Next ==
* Continue with bswap16 support.
Current Milestones:
|| || Planned || Estimate || Actual ||
|| clean up kvm-qemu cp i/f || 2012-09-20 || 2012-09-20 || ||
|| fake-trustzone || 2012-10-15 || 2012-10-15 || ||
Also planned: general keeping up with kernel changes; upstream patch
review; qemu-linaro releases. May change dates to align with overall
KVM plan for the quarter when that is finalised.
Previous Milestones:
||cp15-rework || 2012-01-06 || 2012-06-23 || 2012-06-24 ||
||a15-lpae-support || 2012-07-13 || 2012-07-20 || 2012-07-20 ||
== track-kvm-abi-changes ==
* merged in Christoffer's patches altering the IRQ delivery ABI
== other ==
* resent some patches as qemu trunk has reopened after 1.2 release
* misc upstream review work
* prep for qemu-linaro 2012.09 release
* AFDS (annual review) season again
KVM blueprint progress tracker:
http://ex.seabright.co.nz/helpers/backlog?group_by=topic&colour_by=state&pr…
-- PMM
== Progress ==
* Started looking at symbol_ref splitting benchmark results
* One big regression ~18%
* Started to investigate whether code alignment was the problem as before
* Hot/Cold partitioning in PGO:
* https://blueprints.launchpad.net/gcc-linaro/+spec/hot-cold-partitioning-in-…
* Sent fixes so far upstream
* Spent most of the week looking at a Silent Code Gen fault in reload.
* Admin
* Some interviewing
== Next Week ==
* symbol_ref splitting
* Test code alignment hypothesis
* Test v2 patch.
* Hot/Cold Partitioning:
* Investigate remaining silent code-gen faults and non-termination
issue in SPEC
* If failures are fixed start profiledbootstraps and tests on the
central boards.
== Future ==
* Look at Cards for Vectorization, PGO and LTO with Michael.
--
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-dann(a)linaro.org
All,
I have run into a(nother) problem with reload with
-freorder-blocks-and-partition.
Attached is my WIP patch (some of this has been sent up to gcc-patches
for review), and also the profile information (tarred up) I have
gathered in the train session.
I get a segfault when executing crafty, which seems to come from
incorrect code generation in iterate.c.
The following shows some of the RTL dump after IRA
(insn 3087 1471 3088 163 (clobber (reg:DI 682 [ D.7985 ])) -1
(nil))
(insn 3088 3087 3085 163 (set (subreg:SI (reg:DI 682 [ D.7985 ]) 0)
(sign_extend:SI (mem/c:QI (reg/f:SI 1417) [0
transposition_id+0 S1 A8]))) 735 {*thumb2_extendqisi_v6}
(expr_list:REG_DEAD (reg/f:SI 1417)
(nil)))
...
(insn 3089 1477 1478 163 (set (subreg:SI (reg:DI 682 [ D.7985 ]) 4)
(ashiftrt:SI (subreg:SI (reg:DI 682 [ D.7985 ]) 0)
(const_int 31 [0x1f]))) 130 {*arm_shiftsi3}
(nil))
...
(insn 2898 2622 2899 203 (set (reg:SI 1677)
(mem/u/c:SI (symbol_ref/u:SI ("*.LC60") [flags 0x2]) [2 S4
A32])) 635 {*thumb2_movsi_vfp}
(insn_list:REG_LABEL_OPERAND 1525 (expr_list:REG_EQUIV (label_ref:SI 1525)
(nil))))
(insn 2899 2898 3491 203 (set (reg:SI 1678)
(ior:SI (reg:SI 1677)
(const_int 1 [0x1]))) 98 {*iorsi3_insn}
(expr_list:REG_DEAD (reg:SI 1677)
(nil)))
(insn 3491 2899 3492 203 (set (reg:DI 1734 [orig:682 D.7985 ] [682])
(reg:DI 682 [ D.7985 ])) 636 {*movdi_vfp}
(nil))
...
(jump_insn 2900 3499 2625 203 (set (pc)
(reg:SI 1678)) 727 {*thumb2_indirect_jump}
(expr_list:REG_DEAD (reg:SI 1678)
(expr_list:REG_CROSSING_JUMP (nil)
(nil))))
...
Insns 2898, 2899, and 2900 form the standard Thumb-2 indirect jump
sequence. Insn 3491 is a move that has been generated as part of
emit_moves in IRA for the loop it belongs to (effectively copying r682
into r1734).
Now, despite the compiler thinking that r1678 is live throughout insn 3491,
after reload this part of the RTL dump looks like:
(insn 2898 2622 2899 212 (set (reg:SI 4 r4 [1677])
(mem/u/c:SI (symbol_ref/u:SI ("*.LC60") [flags 0x2]) [2 S4
A32])) 635 {*thumb2_movsi_vfp}
(insn_list:REG_LABEL_OPERAND 1525 (expr_list:REG_EQUIV (label_ref:SI 1525)
(nil))))
(insn 2899 2898 3491 212 (set (reg:SI 4 r4 [1678])
(ior:SI (reg:SI 4 r4 [1677])
(const_int 1 [0x1]))) 98 {*iorsi3_insn}
(nil))
(insn 3491 2899 3493 212 (set (reg:DI 4 r4 [orig:682 D.7985 ] [682])
(mem/c:DI (plus:SI (reg/f:SI 13 sp)
(const_int 24 [0x18])) [9 %sfp+-672 S8 A64])) 636 {*movdi_vfp}
(nil))
...
(jump_insn 2900 3497 2625 212 (set (pc)
(reg:SI 4 r4 [1678])) 727 {*thumb2_indirect_jump}
(expr_list:REG_CROSSING_JUMP (nil)
(nil)))
That is, all of r682, r1678, and r1734 have been assigned to hard
register r4. This is incorrect, as insn 2900 wants to be using r1678
from insn 2899.
Looking at the logs, it seems to me that r1734 is assigned r4 because its
original is r682, and that is assigned r4.
The reload dump says the following about the liveness of the registers
for various insns:
insn=3087, live_throughout: ..., dead_or_set: 682
insn=3088, live_throughout: ..., dead_or_set: 682
insn=3089, live_throughout: ..., dead_or_set: 682
insn=2898, live_throughout: ..., 682, ..., dead_or_set: 1677
insn=2899, live_throughout: ..., 682, ..., dead_or_set: 1677, 1678
insn=3491, live_throughout: ..., 682, 1678, ..., dead_or_set: 1734
insn=2900, live_throughout: ..., 682, 1734, ..., dead_or_set: 1678
This suggests to me that the compiler should know assigning the same
hard-register to r682 and r1678 is incorrect as they have overlapping
live-ranges, and are not duplicates of each other.
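The overlap can be checked mechanically from the figures quoted above. The following Python sketch is a simplified model, not the logic reload or IRA actually uses: find_conflicts is a hypothetical helper, the elided "..." entries are left out, and multi-word modes, spills and equivalences are ignored. It treats two pseudos as conflicting at an insn when one is live throughout it and the other is either live throughout it as well or dies or is set there, and it reports those that share a hard register.

```python
from itertools import combinations

# (insn, live_throughout, dead_or_set) as quoted from the reload dump.
# Only the pseudos named in the mail are modelled; "..." entries are omitted.
LIVENESS = [
    (3087, set(), {682}),
    (3088, set(), {682}),
    (3089, set(), {682}),
    (2898, {682}, {1677}),
    (2899, {682}, {1677, 1678}),
    (3491, {682, 1678}, {1734}),
    (2900, {682, 1734}, {1678}),
]

# Hard register number per pseudo, following the mail's reading of the
# post-reload dump (everything ends up in r4).
ASSIGNMENT = {682: 4, 1677: 4, 1678: 4, 1734: 4}


def find_conflicts(liveness, assignment):
    """Return (insn, a, b), a < b, for pseudos live together that share a hard register."""
    conflicts = []
    for insn, live_throughout, dead_or_set in liveness:
        # Pseudos live across the insn overlap with each other ...
        pairs = set(combinations(sorted(live_throughout), 2))
        # ... and with anything that dies or is set at the insn.
        pairs |= {
            tuple(sorted((a, b)))
            for a in live_throughout
            for b in dead_or_set
            if a != b
        }
        for a, b in sorted(pairs):
            if assignment[a] == assignment[b]:
                conflicts.append((insn, a, b))
    return conflicts


for insn, a, b in find_conflicts(LIVENESS, ASSIGNMENT):
    print(f"insn {insn}: r{a} and r{b} overlap but share r{ASSIGNMENT[a]}")
```

Two pseudos that both appear only in dead_or_set for the same insn (r1677 and r1678 at insn 2899) are deliberately not flagged, since one can die exactly where the other is set.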
The compiler is configured as follows:
Target: arm-none-linux-gnueabi
Configured with:
/work/sources/gcc-fsf-enable-hot-cold-partitioning/configure
--target=arm-none-linux-gnueabi
--prefix=/work/builds/gcc-fsf-enable-hot-cold-partitioning-arm-none-linux-gnueabi/tools
--with-sysroot=/work/builds/gcc-fsf-enable-hot-cold-partitioning-arm-none-linux-gnueabi/sysroot
--disable-libssp --disable-libgomp --disable-libmudflap
--enable-languages=c,c++,fortran --with-cpu=cortex-a9 --with-fpu=neon
--with-float=softfp --enable-build-with-cxx : (reconfigured)
/work/sources/gcc-fsf-enable-hot-cold-partitioning/configure
--target=arm-none-linux-gnueabi
--prefix=/work/builds/gcc-fsf-enable-hot-cold-partitioning-arm-none-linux-gnueabi/tools
--with-sysroot=/work/builds/gcc-fsf-enable-hot-cold-partitioning-arm-none-linux-gnueabi/sysroot
--disable-libssp --disable-libgomp --disable-libmudflap
--enable-languages=c,c++,fortran --with-cpu=cortex-a9 --with-fpu=neon
--with-float=softfp --enable-build-with-cxx
Thread model: posix
gcc version 4.8.0 20120821 (experimental) (GCC)
The gcc command line looks like:
./xgcc -B`pwd` -march=armv7-a -mtune=cortex-a9 -mthumb -mfpu=neon
-mvectorize-with-neon-quad -mfloat-abi=softfp
-fprofile-use=.../186.crafty -freorder-blocks-and-partition
-fno-common -fdump-noaddr -O3 -dp -save-temps iterate.c -o iterate.o
Does anyone have any hints as to where I should go looking?
Thanks,
Matt
--
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-dann(a)linaro.org
First I am trying to learn and am in no way an expert.
I have searched and searched for a walkthrough on how to compile LKMs for my
pandaboard ES. My target is the pandaboard running Linaro 12.08. I am using
Linaro toolchain binary 12.08. My confusion comes from me not knowing where
the kernel sources are (really how to get them) and the target libraries. I
guess what I expect is the target's rootfs with kernel source to be
somewhere on my Ubuntu Host so I can link to them.
I have tried the following make file configs. But obviously, I don't have
the correct path for KERNDIR as I don't know if it even exists on my host.
And the second make file seems like a native compile.
My previous experience with buildroot:
KERNDIR=/opt/buildroot/build_arm/linux-2.6.20.7/
export CROSS_COMPILE=arm-linux-
export ARCH=arm
obj-m += pwr_led.o
all:
	make -C $(KERNDIR) M=$(PWD) modules
clean:
	make -C $(KERNDIR) M=$(PWD) clean
Web examples:
obj-m = foo.o
KVERSION = $(shell uname -r)
all:
	make -C /lib/modules/$(KVERSION)/build M=$(PWD) modules
clean:
	make -C /lib/modules/$(KVERSION)/build M=$(PWD) clean
Could you please set my head straight by pointing me to a webpage or briefly
walking through the steps?
Thanks,
Todd
Assaf,
Just to let you know that linaro-toolchain-dev(a)lists.launchpad.net is
a closed list; a better place for this question is
linaro-toolchain(a)lists.linaro.org.
On 3 September 2012 11:41, Assaf Hoffman <hoffman(a)marvell.com> wrote:
> Hi,
>
> Where can I find Linaro toolchain manuals?
The manuals are distributed as *.info files, as per FSF GCC. These
are found in .../share/info/ under wherever you installed your
toolchain.
One way to view these is to run info as follows:
info -f .../share/info/gcc.info
> I’m looking for Linaro supported optimization flag list.
>
> Is it a super set of FSF GCC?
Linaro GCC 4.X supports all the options of FSF GCC 4.X. It may also
support options that were added in later versions of FSF GCC if the
appropriate functionality has been backported. The info files will
document these new options.
Unfortunately, I do not have an easy to read list of these options (if
indeed there are any), so I can't provide you with any further
pointers at the moment.
Thanks,
Matt
--
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-dann(a)linaro.org
Summary:
* Release Linaro binary toolchain 2012.08
* Start shrink-wrap task
Details:
1. Validate and release Linaro binary toolchain 2012.08.
2. Start shrink-wrap task
https://launchpad.net/gcc-linaro/+spec/shrink-wrapping:
* Learn the background from the mail-list discussion:
http://old.nabble.com/Shrink-wrapping%3A-Introduction-to31220423.html
* Build X86/MIPS trunk toolchain to check the changes.
* Try to apply the ARM backend related patches. They give correct
results for several small cases, but there are lots of new failures in
gcc make-check. "pop_multiple_with_writeback_and_return" is not handled
correctly.
Plans:
* Continue the shrink-wrap work.
Best regards!
-Zhenqiang
(Short week: 4 days, bank holiday)
Current Milestones:
|| || Planned || Estimate || Actual ||
|| clean up kvm-qemu cp i/f || 2012-09-20 || 2012-09-20 || ||
|| fake-trustzone || 2012-10-15 || 2012-10-15 || ||
Also planned: general keeping up with kernel changes; upstream patch
review; qemu-linaro releases. May change dates to align with overall
KVM plan for the quarter when that is finalised.
Previous Milestones:
||cp15-rework || 2012-01-06 || 2012-06-23 || 2012-06-24 ||
||a15-lpae-support || 2012-07-13 || 2012-07-20 || 2012-07-20 ||
== track-kvm-abi-changes ==
* working through design of how we do sync of register state between
QEMU and the kernel and how we handle migration (the two turn out
to be related and I now have a nice looking design that resolves
both of these at once)
* both the interrupt injection ABI and the coprocessor access
ABI are changing (again!)
== other ==
* fixed an embarrassing bug which made qemu-system-arm segfault
on 32 bit hosts; luckily spotted just in time for QEMU 1.2 release
* AFDS (annual review) season again
KVM blueprint progress tracker:
http://ex.seabright.co.nz/helpers/backlog?group_by=topic&colour_by=state&pr…
-- PMM
== Progress ==
* Bank Holiday on Monday
* Got symbol_ref split benchmarking going
* Hot/Cold partitioning in PGO:
* https://blueprints.launchpad.net/gcc-linaro/+spec/hot-cold-partitioning-in-…
* Investigated and fixed all compile-time failures in SPEC
* Investigated and fixed a silent code-gen fault in SPEC
== Next Week ==
* Look at symbol_ref split benchmarking results
* Hot/Cold Partitioning:
* Investigate remaining silent code-gen faults and non-termination
issue in SPEC
* If failures are fixed start profiledbootstraps and tests on the
central boards.
== Future ==
* Look at Cards for Vectorization, PGO and LTO with Michael.
--
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-dann(a)linaro.org