All,
I have run into a(nother) problem with reload with -freorder-blocks-and-partition.
Attached are my WIP patch (some of which has been sent to gcc-patches for review) and the profile information (tarred up) that I gathered in the train session.
I get a segfault when executing crafty, which seems to come from incorrect code generation in iterate.c.
The following shows some of the RTL dump after IRA:

(insn 3087 1471 3088 163 (clobber (reg:DI 682 [ D.7985 ])) -1 (nil))
(insn 3088 3087 3085 163 (set (subreg:SI (reg:DI 682 [ D.7985 ]) 0) (sign_extend:SI (mem/c:QI (reg/f:SI 1417) [0 transposition_id+0 S1 A8]))) 735 {*thumb2_extendqisi_v6} (expr_list:REG_DEAD (reg/f:SI 1417) (nil)))
...
(insn 3089 1477 1478 163 (set (subreg:SI (reg:DI 682 [ D.7985 ]) 4) (ashiftrt:SI (subreg:SI (reg:DI 682 [ D.7985 ]) 0) (const_int 31 [0x1f]))) 130 {*arm_shiftsi3} (nil))
...
(insn 2898 2622 2899 203 (set (reg:SI 1677) (mem/u/c:SI (symbol_ref/u:SI ("*.LC60") [flags 0x2]) [2 S4 A32])) 635 {*thumb2_movsi_vfp} (insn_list:REG_LABEL_OPERAND 1525 (expr_list:REG_EQUIV (label_ref:SI 1525) (nil))))
(insn 2899 2898 3491 203 (set (reg:SI 1678) (ior:SI (reg:SI 1677) (const_int 1 [0x1]))) 98 {*iorsi3_insn} (expr_list:REG_DEAD (reg:SI 1677) (nil)))
(insn 3491 2899 3492 203 (set (reg:DI 1734 [orig:682 D.7985 ] [682]) (reg:DI 682 [ D.7985 ])) 636 {*movdi_vfp} (nil))
...
(jump_insn 2900 3499 2625 203 (set (pc) (reg:SI 1678)) 727 {*thumb2_indirect_jump} (expr_list:REG_DEAD (reg:SI 1678) (expr_list:REG_CROSSING_JUMP (nil) (nil))))
...
Insns 2898, 2899, and 2900 form the standard Thumb-2 indirect jump sequence. Insn 3491 is a move that has been generated as part of emit_moves in IRA for the loop it belongs to (effectively copying r682 into r1734).
Now, despite r1678 being recorded as live throughout insn 3491 (see the liveness information further down), after reload this part of the RTL dump looks like:

(insn 2898 2622 2899 212 (set (reg:SI 4 r4 [1677]) (mem/u/c:SI (symbol_ref/u:SI ("*.LC60") [flags 0x2]) [2 S4 A32])) 635 {*thumb2_movsi_vfp} (insn_list:REG_LABEL_OPERAND 1525 (expr_list:REG_EQUIV (label_ref:SI 1525) (nil))))
(insn 2899 2898 3491 212 (set (reg:SI 4 r4 [1678]) (ior:SI (reg:SI 4 r4 [1677]) (const_int 1 [0x1]))) 98 {*iorsi3_insn} (nil))
(insn 3491 2899 3493 212 (set (reg:DI 4 r4 [orig:682 D.7985 ] [682]) (mem/c:DI (plus:SI (reg/f:SI 13 sp) (const_int 24 [0x18])) [9 %sfp+-672 S8 A64])) 636 {*movdi_vfp} (nil))
...
(jump_insn 2900 3497 2625 212 (set (pc) (reg:SI 4 r4 [1678])) 727 {*thumb2_indirect_jump} (expr_list:REG_CROSSING_JUMP (nil) (nil)))
That is, all of r682, r1678, and r1734 have been assigned to hard register r4. This is incorrect: insn 2900 wants to use the value of r1678 computed by insn 2899, but insn 3491 overwrites r4 in between.
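To make the failure concrete, here is a small sketch that steps through the four post-reload insns on a toy register file. The label address and the stack slot contents are made-up values (the post does not give them), and the sketch assumes the DImode load fills the r4/r5 pair with the low word in r4. It is only meant to illustrate the clobber, not to model reload.

```python
# Hypothetical values: the real address of label 1525 and the contents of
# the stack slot at sp+24 are not given in the post.
LABEL_1525 = 0x8000
STACK_SLOT = 0x1122334455667788


def run_post_reload_sequence():
    """Step through insns 2898, 2899, 3491 and 2900 as they look after reload."""
    regs = {}
    regs["r4"] = LABEL_1525            # insn 2898: load the label address
    regs["r4"] = regs["r4"] | 1        # insn 2899: set bit 0 of the address
    intended_target = regs["r4"]       # the value insn 2900 is meant to use

    # insn 3491: DImode load into r4, assumed to write the r4/r5 pair
    # with the low word in r4.
    regs["r4"] = STACK_SLOT & 0xFFFFFFFF
    regs["r5"] = STACK_SLOT >> 32

    actual_target = regs["r4"]         # insn 2900: indirect jump through r4
    return intended_target, actual_target


intended, actual = run_post_reload_sequence()
print(f"intended jump target: {intended:#x}, actual: {actual:#x}")
```

With these placeholder values the jump goes to the low word of the reloaded DImode value instead of the label, which would be consistent with the segfault.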
Looking at the logs, it seems to me that r1734 is assigned r4 because its original is r682, and that is assigned r4.
The reload dump says the following about the liveness of the registers for various insns:

insn=3087, live_throughout: ..., dead_or_set: 682
insn=3088, live_throughout: ..., dead_or_set: 682
insn=3089, live_throughout: ..., dead_or_set: 682
insn=2898, live_throughout: ..., 682, ..., dead_or_set: 1677
insn=2899, live_throughout: ..., 682, ..., dead_or_set: 1677, 1678
insn=3491, live_throughout: ..., 682, 1678, ..., dead_or_set: 1734
insn=2900, live_throughout: ..., 682, 1734, ..., dead_or_set: 1678
This suggests to me that the compiler should know that assigning the same hard register to r682 and r1678 is incorrect, as they have overlapping live ranges and are not duplicates of each other.
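The check I have in mind can be sketched as follows. This is not how reload or IRA computes conflicts; it is just a standalone illustration that takes the live_throughout sets from the dump above (with the elided "..." entries left out) together with the r4 assignment described earlier, and reports pseudos that are live across the same insn while sharing a hard register. The names LIVE_THROUGHOUT, HARD_REG and find_conflicts are made up for this example, and register width (DImode taking two hard registers) is ignored.

```python
from itertools import combinations

# live_throughout sets transcribed from the reload dump; the "..." entries
# are elided in the dump and are simply omitted here.
LIVE_THROUGHOUT = {
    2898: {682},
    2899: {682},
    3491: {682, 1678},
    2900: {682, 1734},
}

# Hard register assignment as described in this mail.
HARD_REG = {682: "r4", 1677: "r4", 1678: "r4", 1734: "r4"}


def find_conflicts(live_throughout, hard_reg):
    """Return (insn, pseudo_a, pseudo_b, reg) for every pair of pseudos that
    is live throughout the same insn yet shares a hard register."""
    conflicts = []
    for insn, live in live_throughout.items():
        for a, b in combinations(sorted(live), 2):
            if a in hard_reg and hard_reg.get(a) == hard_reg.get(b):
                conflicts.append((insn, a, b, hard_reg[a]))
    return conflicts


for insn, a, b, reg in find_conflicts(LIVE_THROUGHOUT, HARD_REG):
    print(f"insn {insn}: r{a} and r{b} are both live throughout and both in {reg}")
```

On this data the check flags r682 and r1678 at insn 3491. It also flags r682 and r1734 at insn 2900, which may be harmless if r1734 really is just a copy of r682; the sketch makes no attempt to tell those cases apart.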
The compiler is configured as follows:
Target: arm-none-linux-gnueabi
Configured with: /work/sources/gcc-fsf-enable-hot-cold-partitioning/configure --target=arm-none-linux-gnueabi --prefix=/work/builds/gcc-fsf-enable-hot-cold-partitioning-arm-none-linux-gnueabi/tools --with-sysroot=/work/builds/gcc-fsf-enable-hot-cold-partitioning-arm-none-linux-gnueabi/sysroot --disable-libssp --disable-libgomp --disable-libmudflap --enable-languages=c,c++,fortran --with-cpu=cortex-a9 --with-fpu=neon --with-float=softfp --enable-build-with-cxx : (reconfigured) /work/sources/gcc-fsf-enable-hot-cold-partitioning/configure --target=arm-none-linux-gnueabi --prefix=/work/builds/gcc-fsf-enable-hot-cold-partitioning-arm-none-linux-gnueabi/tools --with-sysroot=/work/builds/gcc-fsf-enable-hot-cold-partitioning-arm-none-linux-gnueabi/sysroot --disable-libssp --disable-libgomp --disable-libmudflap --enable-languages=c,c++,fortran --with-cpu=cortex-a9 --with-fpu=neon --with-float=softfp --enable-build-with-cxx
Thread model: posix
gcc version 4.8.0 20120821 (experimental) (GCC)
The gcc command line looks like:
./xgcc -B`pwd` -march=armv7-a -mtune=cortex-a9 -mthumb -mfpu=neon -mvectorize-with-neon-quad -mfloat-abi=softfp -fprofile-use=.../186.crafty -freorder-blocks-and-partition -fno-common -fdump-noaddr -O3 -dp -save-temps iterate.c -o iterate.o
Does anyone have any hints as to where I should go looking?
Thanks,
Matt