Yao Qi wrote:
  6801       ldr     r1, [r0, #0]
  f831 3013  ldrh.w  r3, [r1, r3, lsl #1]
 -f413 6f00  tst.w   r3, #2048 ; 0x800
 -f43f af41  beq.w   cc <t_run_test+0xcc>
 +0518       lsls    r0, r3, #20
 +f57f af44  bpl.w   cc <t_run_test+0xcc>
  4610       mov     r0, r2
Someone suggested that the slowdown might be caused by the use of r0 in the first instruction. Since r0 is read by the first insn, the third insn (lsls) can't overwrite r0 until the first insn (ldr) is done.
That depends on whether the Cortex-A8 implements any form of register renaming. If it does, this should not be the problem.
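For reference, the two sequences in the disassembly above should take the branch under the same condition: tst/beq branches when bit 11 of r3 is clear, and lsls #20/bpl branches when the shifted value has bit 31 clear, which is the same bit (11 + 20 = 31). Here is a minimal C sketch of that equivalence. The function names are made up for illustration, and it only models the branch condition, not the pipeline behaviour or the write to r0.

```c
#include <assert.h>
#include <stdint.h>

/* Old sequence: tst.w r3, #0x800 ; beq.w
 * The branch is taken when bit 11 of r3 is clear (Z flag set). */
static int branch_taken_tst(uint32_t r3)
{
    return (r3 & 0x800u) == 0;
}

/* New sequence: lsls r0, r3, #20 ; bpl.w
 * The shift moves bit 11 into bit 31, and bpl branches when the
 * N flag is clear, i.e. when bit 31 of the result is 0. */
static int branch_taken_lsls(uint32_t r3)
{
    uint32_t r0 = r3 << 20;   /* note: this clobbers r0 in the real code */
    return (r0 & 0x80000000u) == 0;
}

/* ldrh zero-extends a halfword, so checking all 16-bit values
 * covers every input r3 can hold at this point. */
static void check_equivalence(void)
{
    for (uint32_t v = 0; v <= 0xFFFFu; v++)
        assert(branch_taken_tst(v) == branch_taken_lsls(v));
}
```

So the only functional difference is that the new form writes r0, which is what the dependency question above is about.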
The second patch is Bernd's "Fix an if statement in arm_rtx_costs_1":
http://gcc.gnu.org/ml/gcc-patches/2010-07/msg02096.html
With this patch applied, the EEMBC benchmark numbers are unchanged. Shall we merge this patch into the Linaro 4.5 tree? I am inclined to merge it, but if you have concerns about this patch, let us discuss them here.
As we discussed in the meeting yesterday, our criterion for picking up upstream patches is that they neither slow down the generated code nor increase code size.
Code size is not reduced on the A8 either. I'll re-test this patch on the A9. If there is still no benefit in either size or speed, we won't backport it to Linaro 4.5.
Well, could you try some other benchmarks? Maybe build ffmpeg or the Linux kernel to see whether there are any code generation differences. Even slight improvements may warrant the backport, as it seems quite harmless.
Chung-Lin