Hello,
While testing SMS on Cortex-A9 I see that the latency of a load instruction is 1 cycle when compiling with -mcpu=cortex-a9 -mthumb -mtune=cortex-a9 -O3.
Below is a snippet from the SMS dump file showing the DDG created for the loop in the function foo. It shows the edge between the load of input[i] (insn 181) and the mult instruction (insn 184): [181 -(T,1,0)-> 184] is the true dependence edge created between the two insns, with a latency of 1. On Cortex-A8 the latency of the load is 3, as expected. I've read in the cortex-a9.md file that loads should have a latency of 4 cycles, so I wanted to check whether I should have used a different combination of flags for Cortex-A9, or whether the load latency should indeed be 1 cycle here.
Thanks, Revital
int foo (int max, signed short *input, int y)
{
  int i, accum;

  for (i = 0; i < max; i++)
    {
      accum += (signed int) input[i] * (signed int) input[i+y];
    }
  return accum;
}
The snippet from the DDG:
Node num: 2
(insn 181 178 184 13 (set (reg:SI 216 [ D.2019 ])
        (zero_extend:SI (mem:HI (plus:SI (reg:SI 319 [ ivtmp.34 ])
                    (reg:SI 345)) [2 MEM[base: D.2076_257, index: D.2079_226, offset: 0B]+0 S2 A16]))) tmp.c:7 714 {*thumb2_zero_extendhisi2_v6}
     (nil))
OUT ARCS:  [181 -(A,0,1)-> 176]  [181 -(T,1,0)-> 184]
IN ARCS:  [184 -(A,0,1)-> 181]  [176 -(T,1,0)-> 181]
Node num: 3
(insn 184 181 234 13 (set (reg/v:SI 209 [ accum ])
        (plus:SI (mult:SI (sign_extend:SI (subreg/s/u:HI (reg:SI 212 [ D.2013 ]) 0))
                (sign_extend:SI (subreg/s/u:HI (reg:SI 216 [ D.2019 ]) 0)))
            (reg/v:SI 209 [ accum ]))) tmp.c:7 64 {maddhisi4}
     (expr_list:REG_DEAD (reg:SI 216 [ D.2019 ])
        (expr_list:REG_DEAD (reg:SI 212 [ D.2013 ])
            (nil))))
Hi Revital,
The latency should be 4 cycles. I've been looking into a similar problem, and it appears that some of my earlier Neon scheduler changes broke the scheduler description. I am working on a fix and will send it out once it is done, but that won't be until early next week since I'm travelling this week.
cheers Ramana
-----Original Message-----
From: linaro-toolchain-bounces@lists.linaro.org on behalf of Revital1 Eres
Sent: Sun 09/01/2011 3:41 PM
To: linaro-toolchain@lists.linaro.org
Subject: A question about instructions latencies in Crotex-A9
_______________________________________________ linaro-toolchain mailing list linaro-toolchain@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-toolchain
Hello Ramana,
Thanks for your reply! I'll try your fix once it is ready.
Revital