Hi All, I am using the 2011.3 4.5 Linaro GCC (armv7-a, vfpv3d16) to compile the kernel and modules. I have selected all codecs to be built as modules, i.e. I set
    config SND_SOC_ALL_CODECS
        tristate "Build all ASoC CODEC drivers"
to M, and I2C/SPI too.
Then, in the kernel directory, I run "make" to get both vmlinux and the modules. I found that snd-soc-wm8974.ko, snd-soc-wm8940.ko and snd-soc-wm8510.ko fail with "__aeabi_uldivmod undefined".
If I comment out do_div() in these codec drivers, the issue disappears. But it is strange, because many other codecs use do_div() too, for example:
    sound/soc/codecs/max98088.c
    sound/soc/codecs/max9850.c
    sound/soc/codecs/wm8350.c
    sound/soc/codecs/wm8400.c
    sound/soc/codecs/wm8510.c
    sound/soc/codecs/wm8580.c
    sound/soc/codecs/wm8753.c
    sound/soc/codecs/wm8804.c
    sound/soc/codecs/wm8900.c
    sound/soc/codecs/wm8904.c
    sound/soc/codecs/wm8940.c
    sound/soc/codecs/wm8955.c
    sound/soc/codecs/wm8960.c
    sound/soc/codecs/wm8974.c
    sound/soc/codecs/wm8978.c
    sound/soc/codecs/wm8985.c
    sound/soc/codecs/wm8990.c
    sound/soc/codecs/wm8991.c
    sound/soc/codecs/wm8993.c
    sound/soc/codecs/wm8994.c
    sound/soc/codecs/wm8995.c
    sound/soc/codecs/wm9081.c
and all of them compile except those 3 modules. Is this due to a wrong optimization by gcc?
Other information:
- The old toolchains we were using compile the 3 modules without error.
- If I build all codecs into the kernel image, these 3 drivers don't report an error while compiling.
Thanks Barry
Hi Barry. I think the toolchain is operating correctly here. The current version recognises a divide followed by a modulo and optimises this into a call to the standard EABI function __aeabi_uldivmod(). Note the code:
    do_div(Kpart, source);
    K = Kpart & 0xFFFFFFFF;
    /* Check if we need to round */
    if ((K % 10) >= 5)
        K += 5;
This function is provided by libgcc for normal applications. The kernel provides its own versions in arch/arm/lib/lib1funcs.S but is missing __aeabi_uldivmod (note the 'l' for 64 bit).
-- Michael
linaro-toolchain mailing list linaro-toolchain@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-toolchain
Hi Michael,
In fact the problem happens earlier:
    if (Ndiv < 6) {
        source /= 2;
        pll_div->pre_div = 1;
        Ndiv = target / source;
    } else
        pll_div->pre_div = 0;

    if ((Ndiv < 6) || (Ndiv > 12))
        printk(KERN_WARNING "WM8974 N value %u outwith recommended range!\n", Ndiv);

    pll_div->n = Ndiv;
    Nmod = target % source;
    Kpart = FIXED_PLL_SIZE * (long long)Nmod;

    do_div(Kpart, source);
If I comment out "source /= 2", the problem disappears.
Alternatively, if I add one line before do_div(), everything builds:
    asm("" : "+r"(source));
    do_div(Kpart, source);
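The empty asm statement acts as a compiler-level barrier: the "+r" constraint tells GCC that the asm may read and modify source, so the optimiser can no longer carry assumptions about its value across that point. A minimal sketch, assuming GCC or Clang (extended asm is a compiler extension); hide_value() and half_then_mod() are hypothetical names used only for illustration:

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns v unchanged but hides its value from the
 * optimiser. The asm body is empty, so no instruction is emitted for it.
 */
static inline uint32_t hide_value(uint32_t v)
{
    __asm__("" : "+r"(v));
    return v;
}

/* Hypothetical caller mirroring the shape of the driver code. */
static uint32_t half_then_mod(uint32_t target, uint32_t source)
{
    source /= 2;
    source = hide_value(source);    /* barrier before the division */
    return target % source;
}
```

The barrier does not change the computed result; it only changes what the compiler is allowed to assume, which is presumably why it hides the stray symbol reference here.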
Hi Barry. I can reproduce the problem in Linaro GCC 4.5-2011.04 and GCC 4.5.2. It does not exist in GCC 4.6.0. wm8974_set_dai_pll() inlines the call to pll_factors() and leaves a .globl reference to __aeabi_uldivmod in the code. This function is never called.
Marking pll_factors() as __attribute__((noinline)) also works around the problem.
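A sketch of the noinline workaround, written as a userspace approximation of the PLL calculation quoted earlier in the thread. The struct layout, the value of FIXED_PLL_SIZE, the final division of K by 10 and the use of plain 64-bit division in place of do_div() are assumptions made for illustration, not a copy of the driver:

```c
#include <stdint.h>

/* Assumed value; the thread only shows the name of the constant. */
#define FIXED_PLL_SIZE ((1 << 24) * 10)

struct pll_div {        /* hypothetical layout */
    uint32_t pre_div;
    uint32_t n;
    uint32_t k;
};

/* noinline keeps the division code in its own function body. */
__attribute__((noinline))
static void pll_factors(struct pll_div *d, uint32_t target, uint32_t source)
{
    uint32_t Ndiv = target / source;
    uint32_t Nmod, K;
    uint64_t Kpart;

    if (Ndiv < 6) {
        source /= 2;
        d->pre_div = 1;
        Ndiv = target / source;
    } else {
        d->pre_div = 0;
    }

    d->n = Ndiv;
    Nmod = target % source;
    Kpart = FIXED_PLL_SIZE * (uint64_t)Nmod;
    Kpart /= source;            /* the kernel would use do_div() here */

    K = Kpart & 0xFFFFFFFF;
    if ((K % 10) >= 5)          /* check if we need to round */
        K += 5;
    d->k = K / 10;              /* assumed final scaling step */
}
```

With target = 90316800 and source = 12000000 this yields n = 7 with pre_div = 0; halving the target-to-source ratio by passing source = 24000000 takes the pre_div = 1 branch instead.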
I've logged http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48783 upstream and recorded this as LP: #771551.
-- Michael
Thanks. I really think this is a gcc bug and not an optimization feature. In fact, objdump shows that __aeabi_uldivmod is never called. It only exists in the symbol reference list.
On 27/04/11 09:44, Barry Song wrote:
Thanks. I really think this is a gcc bug and not an optimization feature. In fact, objdump shows that __aeabi_uldivmod is never called. It only exists in the symbol reference list.
Your code contains "Nmod = target % source" so the only reason divmod wouldn't get called is if it got optimized away somehow.
If that were the case then it would be a compiler bug somehow, but when I build the test case in Michael's bug report it looks like __aeabi_uidivmod does get called, AFAICT.
I don't see any reference to __aeabi_uldivmod, so you must have a different compiler or testcase?
Andrew
__aeabi_u*i*divmod does get called in the asm code. What I mean is that __aeabi_u*l*divmod never appears in the asm code when I objdump the target .ko. __aeabi_u*l*divmod only exists in the reference list, i.e. the list of symbols this module depends on. That is why we get a link error. But in fact the module doesn't need to link against this symbol, since it never calls __aeabi_u*l*divmod at the asm level.
On 27/04/11 10:22, Barry Song wrote:
__aeabi_u*l*divmod never appears in the asm code when I objdump the target .ko. __aeabi_u*l*divmod only exists in the reference list, i.e. the list of symbols this module depends on. That is why we get a link error. But in fact the module doesn't need to link against this symbol, since it never calls __aeabi_u*l*divmod at the asm level.
Can you compile with --save-temps and look in the .s file?
If it's never mentioned in there then it's not a compiler bug (at least, not with this testcase) - the reference is coming from elsewhere.
We should be able to narrow things down, at least.
Andrew
Hi Andrew. I uploaded the wrong preprocessed source to the GCC bugzilla entry. It included the __attribute__((noinline)) workaround which hides the problem.
I've fixed that and re-attached the correct version. The assembly version contains the following:
        .size   wm8974_pcm_hw_params, .-wm8974_pcm_hw_params
        .global __aeabi_uidiv
        .global __aeabi_uidivmod
        .global __aeabi_uldivmod
        .align  2
        .thumb
        .thumb_func
        .type   wm8974_set_dai_pll, %function
wm8974_set_dai_pll:
        @ args = 4, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        push    {r3, r4, r5, r6, r7, r8, r9, lr}
        ...
-- Michael
On 27/04/11 10:57, Michael Hope wrote:
Hi Andrew. I uploaded the wrong preprocessed source to the GCC bugzilla entry. It included the __attribute__((noinline)) workaround which hides the problem.
OK, I've now reproduced the problem and reduced the test case.
See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48783
Andrew
I think I've found the source of this. To summarize: during RTL expand (yes, during *expand*), .global __aeabi_* directives are added to the asm output when a use of the libcalls is seen. Later the calls get optimized out in some way, leaving the unused .global symbols...
I would think that, under an EABI toolchain environment, all the AAPCS-defined __aeabi_* runtime functions should always be available (e.g. from libgcc). Thus we should be able to skip adding these .global __aeabi_* directives completely? Ramana?
Yip, so the compiler spots these two lines:
    Ndiv = target / source;
    Nmod = target % source;
and turns them into
    Ndiv, Nmod = __aeabi_uldivmod(target, source)
Depending on the policy of the kernel developers, you should either implement __aeabi_uldivmod in lib1funcs.S or use another kernel construct to prevent the optimisation from happening.
Googling around brought up this thread: http://comments.gmane.org/gmane.linux.kernel/965262
-- Michael
On Tue, 26 Apr 2011, Michael Hope wrote:
Yip, so the compiler spots these two lines: Ndiv = target / source; Nmod = target % source;
and turns them into Ndiv, Nmod = __aeabi_uldivmod(target, source)
Why would gcc do that? All four variables involved here are of type unsigned int, not unsigned long long. Seems to me that __aeabi_uidivmod should have been used here instead.
Nicolas
I think we should dig into do_div. As I have reported, adding the line "asm("" : "+r"(source));" before "do_div(Kpart, source);" also avoids this optimization. So the __aeabi_uldivmod reference should be coming from do_div.
On Tue, 26 Apr 2011, Michael Hope wrote:
Hi Barry. I think the toolchain is operating correctly here. The current version recognises a divide followed by a modulo and optimises this into a call to the standard EABI function __aeabi_uldivmod(). Note the code:
    do_div(Kpart, source);
    K = Kpart & 0xFFFFFFFF;
    /* Check if we need to round */
    if ((K % 10) >= 5)
        K += 5;
This function is provided by libgcc for normal applications. The kernel provides its own versions in arch/arm/lib/lib1funcs.S but is missing __aeabi_uldivmod (note the 'l' for 64 bit).
The kernel is omitting this function on purpose. The idea is to prevent people from ever using 64-bit by 64-bit divisions since they are always costly and avoidable.
This is why the kernel provides a do_div() macro: to allow for a 64-bit dividend with only a 32-bit divisor. And this stems from the fact that gcc has no (or used not to have) patterns to match a division with a 64-bit dividend and a 32-bit divisor, hence it promotes the divisor to a 64-bit value and performs the costly division that the kernel wants to avoid.
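The contract described above can be sketched in portable C. This is not the kernel's implementation: div64_by_32() is a hypothetical name, and the bit-by-bit long division is chosen only because it needs nothing wider than a 64-bit shift, compare and subtract, so no 64-by-64 division helper is required. Like do_div(), it leaves the quotient in place and returns the remainder:

```c
#include <stdint.h>

/* Hypothetical stand-in for do_div(): *n /= base, returns *n % base. */
static uint32_t div64_by_32(uint64_t *n, uint32_t base)
{
    uint32_t hi = (uint32_t)(*n >> 32);
    uint32_t lo = (uint32_t)*n;
    /* The upper half is handled with 32-bit division only. */
    uint64_t quot = (uint64_t)(hi / base) << 32;
    uint64_t rem = hi % base;
    int i;

    /* Shift in the lower 32 bits one at a time (schoolbook division). */
    for (i = 31; i >= 0; i--) {
        rem = (rem << 1) | ((lo >> i) & 1u);
        if (rem >= base) {
            rem -= base;
            quot |= (uint64_t)1 << i;
        }
    }

    *n = quot;
    return (uint32_t)rem;
}
```

A caller would write something like "rem = div64_by_32(&value, 10000);", after which value holds the quotient.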
Worse, gcc isn't smart enough to optimize the operation even when the divisor is constant, which is quite a common operation in the kernel. This is why many years ago I wrote the code for the do_div() version you can find in arch/arm/include/asm/div64.h where the division is turned into a reciprocal multiplication. For example, despite the amount of added C code, do_div(x, 10000) now produces the following assembly code (where x is assigned to r0-r1):
        adr     r4, .L0
        ldmia   r4, {r4-r5}
        umull   r2, r3, r4, r0
        mov     r2, #0
        umlal   r3, r2, r5, r0
        umlal   r3, r2, r4, r1
        mov     r3, #0
        umlal   r2, r3, r5, r1
        mov     r0, r2, lsr #11
        orr     r0, r0, r3, lsl #21
        mov     r1, r3, lsr #11
        ...
.L0:
        .word   948328779
        .word   879609302
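To make the reciprocal multiplication concrete, here is a hedged C rendering of what the assembly above appears to compute. It assumes that the two .word values are the low and high halves of a 64-bit multiplier and that the quotient is the upper 64 bits of the 128-bit product shifted right by 11. The macro and function names are hypothetical:

```c
#include <stdint.h>

/* Multiplier assembled from the two .word values (low word listed first). */
#define RECIP_10000 (((uint64_t)879609302u << 32) | 948328779u)

/* Upper 64 bits of a 64x64-bit product, built from 32-bit partial products. */
static uint64_t mul_hi64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;
    /* Carry out of bits 32..63 of the full product. */
    uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + (uint32_t)lo_hi;

    return hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (cross >> 32);
}

/* x / 10000 without a division instruction or division helper. */
static uint64_t div_by_10000(uint64_t x)
{
    return mul_hi64(x, RECIP_10000) >> 11;
}
```

The multiplier works out to 2^75 / 10000 rounded up, which is consistent with the shift by 11 on top of the 64 bits already dropped.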
But I digress. This is just to say that gcc shouldn't pull __aeabi_uldivmod in this case because:
1) the division and the modulus are not performed on the same operands;
2) the modulus is performed on a 32-bit variable;
3) the do_div() implementation looks like nothing that gcc could recognize as being a division.
Therefore I don't see how the right pattern could have been matched.
Nicolas
On 26/04/11 03:39, Nicolas Pitre wrote:
But I digress. This is just to say that gcc shouldn't pull __aeabi_uldivmod in this case because:
There isn't a library call (or instruction) for a straight 'mod' operation, so GCC always has to use 'divmod', no exceptions.
In any case, optimization of a div and a mod into a single call is problematic, at best, so it's unlikely that's happening. See GCC PR43721.
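As an illustration of the point about 'mod': the two functions below differ only in operand width, and on an ARM EABI target without a hardware divide instruction GCC would be expected to lower each remainder to a divmod helper chosen by that width. The function names are hypothetical, and the helper named in each comment is the one the ARM run-time ABI defines for that width, not something checked against this particular compiler build:

```c
#include <stdint.h>

/* 32-bit operands: the remainder is expected to come from __aeabi_uidivmod. */
static uint32_t rem32(uint32_t a, uint32_t b)
{
    return a % b;
}

/* 64-bit operands: the remainder is expected to come from __aeabi_uldivmod. */
static uint64_t rem64(uint64_t a, uint64_t b)
{
    return a % b;
}
```

Under C's usual arithmetic conversions, promoting either operand to 64 bits, even implicitly, is enough to move an expression from the first case to the second.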
Andrew
Ah, sorry. So the divmod call is there due to the modulo operation only.
-- Michael
Makes sense. Now... why is the uldivmod version used over the uidivmod version when all the variables involved are only 32 bits wide?
Nicolas