Hi,
I am looking at the best approach for http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43721 - Failure to optimise (a/b) and (a%b) into a single __aeabi_idivmod call on the ARM architecture.
In summary, the following C code results in one __aeabi_idivmod() call and one __aeabi_idiv() call, even though the former already calculates the quotient:
    int q = a / b;
    int r = a % b;
    return q + r;
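For reference, a minimal self-contained sketch of the pattern in question. The function names are made up for illustration; the second function shows the same result obtained by hand through the standard C div(), which returns quotient and remainder together. Whether a given compiler and target turn either form into a single library call is not something this sketch asserts.

```c
#include <assert.h>
#include <stdlib.h>

/* The pattern from the bug report: quotient and remainder of the same
   operands, written as two separate operators. (Hypothetical name.) */
int quot_plus_rem(int a, int b)
{
    assert(b != 0);
    int q = a / b;
    int r = a % b;
    return q + r;
}

/* The same computation written by hand with the standard div(), which
   yields both results from one call. (Hypothetical name.) */
int quot_plus_rem_combined(int a, int b)
{
    assert(b != 0);
    div_t qr = div(a, b);
    return qr.quot + qr.rem;
}
```

Both functions return the same value for any valid operands; the open question in this thread is how to get the compiler to produce the combined form automatically from the first one.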
My question is what the best way to handle this would be. As I see it, there are a few options, each with some issues.
1. Handle it at the gimple level and try to reduce the two operations to the equivalent of {q, r} = a % b. We should do this for targets without an integer divide instruction. Gimple assign stmts have only one lhs operand (?), so the lhs would have to be made 64-bit to represent the values returned together in R0 and R1. I am not sure of the implications for other architectures.
2. Handle it in expand_divmod. Here, when we see a div or mod operation, we would have to do a linear search for a matching operation to combine it with. If we find one, we can generate __aeabi_idivmod() and cache the result for the matching operation. As I see it, this can get messy and might not be acceptable.
3. Add an RTL pass to process and combine these library calls, possibly using cse. I am still looking at this.
4. Ramana tried a prototype that does the same using target patterns and has ruled it out. (For more information, please refer to https://code.launchpad.net/~ramana/gcc-linaro/divmodsi4-experiments)
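To make the 64-bit idea in option 1 concrete, here is a rough sketch of packing both results into one 64-bit value. The helper names are hypothetical and this is not how GCC represents it internally; it assumes the quotient sits in the low word (R0) and the remainder in the high word (R1), as for a little-endian register pair.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a combined divide/modulo helper: returns the
   quotient in the low 32 bits and the remainder in the high 32 bits. */
uint64_t divmod_packed(int32_t a, int32_t b)
{
    assert(b != 0);
    uint32_t q = (uint32_t)(a / b);
    uint32_t r = (uint32_t)(a % b);
    return ((uint64_t)r << 32) | q;
}

/* Extract the two halves again. Converting an out-of-range uint32_t back
   to int32_t is implementation-defined in C; this sketch assumes the
   usual two's complement wrap-around. */
int32_t packed_quot(uint64_t qr)
{
    return (int32_t)(uint32_t)qr;
}

int32_t packed_rem(uint64_t qr)
{
    return (int32_t)(uint32_t)(qr >> 32);
}
```

With a single 64-bit lhs, a / b and a % b become two cheap extractions from one call result, which is the effect option 1 is after.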
Any suggestions on the best way to handle this?
Thanks, Kugan
Kugan,
I don't have the source code to hand but how are the sin()/cos()->sincos() optimizations handled?
Thanks,
Matt
On 05/06/13 21:27, Matthew Gretton-Dann wrote:
Kugan,
I don't have the source code to hand but how are the sin()/cos()->sincos() optimizations handled?
Thanks Matt. There is a tree-level pass that combines sin()/cos() into sincos(). The commit that added it is: http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=121052. We can try doing something similar here.
Is there any way to know at the tree level that the target does not define integer divide?
Thanks, Kugan
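As a rough illustration of what a sincos-style combining pass would do for div/mod, here is a toy model in C. It is not GCC code: the statement representation, the opcode names, and combine_divmod() are all invented for this sketch, and it assumes straight-line code in which the operands are not reassigned between the two statements.

```c
#include <assert.h>
#include <stddef.h>

/* Invented opcodes for a toy three-address IR. */
enum op { OP_DIV, OP_MOD, OP_DIVMOD_QUOT, OP_DIVMOD_REM, OP_OTHER };

/* One toy statement: lhs = rhs1 <op> rhs2, operands are variable ids. */
struct stmt {
    enum op op;
    int lhs, rhs1, rhs2;
};

/* Pair each division with a later modulo (or vice versa) on the same
   operands and rewrite both to read from one combined divmod result.
   Returns the number of pairs combined. */
size_t combine_divmod(struct stmt *s, size_t n)
{
    assert(s != NULL || n == 0);
    size_t pairs = 0;
    for (size_t i = 0; i < n; i++) {
        if (s[i].op != OP_DIV && s[i].op != OP_MOD)
            continue;
        enum op want = (s[i].op == OP_DIV) ? OP_MOD : OP_DIV;
        for (size_t j = i + 1; j < n; j++) {
            if (s[j].op == want &&
                s[j].rhs1 == s[i].rhs1 && s[j].rhs2 == s[i].rhs2) {
                /* The div half takes the quotient, the mod half the remainder. */
                s[i].op = (want == OP_MOD) ? OP_DIVMOD_QUOT : OP_DIVMOD_REM;
                s[j].op = (want == OP_MOD) ? OP_DIVMOD_REM : OP_DIVMOD_QUOT;
                pairs++;
                break;
            }
        }
    }
    return pairs;
}
```

A real pass would also have to check dominance, operand redefinition, and whether the target benefits at all; the sketch only shows the pairing step.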
On 6 June 2013 12:09, Kugan kugan.vivekanandarajah@linaro.org wrote:
Is there any way to know at the tree level that the target does not define integer divide?
Some targets, e.g. MIPS, have a combined div/mod instruction. Those could benefit from this as well, unless they already achieve that optimisation differently.
On 06/06/13 22:03, Mans Rullgard wrote:
Some targets, e.g. MIPS, have a combined div/mod instruction. Those could benefit from this as well, unless they already achieve that optimisation differently.
Thanks Mans. I will have a look at the MIPS port.
The availability of sincos on a target is specified with the define TARGET_HAS_SINCOS, which is used by the tree-level optimizations. We could do something similar (?) to find out whether the combined operation or run-time library call is available before considering the optimization.
Thanks, Kugan
-----Original Message-----
From: linaro-toolchain-bounces@lists.linaro.org [mailto:linaro-toolchain-bounces@lists.linaro.org] On Behalf Of Kugan
Sent: 07 June 2013 01:58
To: Mans Rullgard
Cc: linaro-toolchain
Subject: Re: Failure to optimise (a/b) and (a%b) into single __aeabi_idivmod call
On 06/06/13 22:03, Mans Rullgard wrote:
Some targets, e.g. MIPS, have a combined div/mod instruction. Those could benefit from this as well, unless they already achieve that optimisation differently.
The optimization is already achieved for ports that have hardware support for divmod but not for cases where you need the libcall support.
Thanks Mans. I will have a look at the MIPS port.
The availability of sincos on a target is specified with the define TARGET_HAS_SINCOS, which is used by the tree-level optimizations. We could do something similar (?) to find out whether the combined operation or run-time library call is available before considering the optimization.
Well, HAVE_divmod<mode> will tell you whether there is a divmod pattern in the backend. If the pattern is absent, I suspect you should also be able to check whether a libcall for divmod is available.
regards Ramana
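The check described above might be modelled roughly as follows. This is a toy sketch, not GCC internals: struct target_info, its fields, and choose_divmod_strategy() are hypothetical names that stand in for "is there a divmod pattern?" and "is there a divmod libcall?".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a target's divmod support. */
struct target_info {
    bool has_divmod_insn;        /* models HAVE_divmod<mode> being true */
    const char *divmod_libcall;  /* combined libcall name, or NULL if none */
};

enum divmod_strategy { DIVMOD_NONE, DIVMOD_INSN, DIVMOD_LIBCALL };

/* Prefer a hardware pattern; otherwise fall back to a combined libcall;
   otherwise leave the div and mod operations separate. */
enum divmod_strategy choose_divmod_strategy(const struct target_info *t)
{
    assert(t != NULL);
    if (t->has_divmod_insn)
        return DIVMOD_INSN;
    if (t->divmod_libcall != NULL)
        return DIVMOD_LIBCALL;
    return DIVMOD_NONE;
}
```

In this model, an ARM-like target without a hardware divide would be described as { false, "__aeabi_idivmod" } and end up on the libcall path, which is the case the bug report is about.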
linaro-toolchain@lists.linaro.org