From: Sedat Dilek sedat.dilek@gmail.com
It turns out that if your config tickles __builtin_constant_p via differences in choices to inline or not, this now produces invalid assembly:
$ cat foo.c
long a(long b, long c) {
	asm("orb\t%1, %0" : "+q"(c): "r"(b));
	return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'
The "q" constraint only has meaning on -m32; otherwise it is treated as "r".
This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m, or Clang+allyesconfig.
Keep the masking operation to appease sparse (`make C=1`), and add back the cast in order to select the proper 8-bit register alias.
[Nick: reworded]
Cc: stable@vger.kernel.org
Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek sedat.dilek@gmail.com
Reported-by: kernelci.org bot bot@kernelci.org
Suggested-by: Andy Shevchenko andriy.shevchenko@intel.com
Suggested-by: Ilie Halip ilie.halip@gmail.com
Tested-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Nick Desaulniers ndesaulniers@google.com
---
 arch/x86/include/asm/bitops.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "andb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers ndesaulniers@google.com wrote:
From: Sedat Dilek sedat.dilek@gmail.com
It turns out that if your config tickles __builtin_constant_p via differences in choices to inline or not, this now produces invalid assembly:
$ cat foo.c
long a(long b, long c) {
	asm("orb\t%1, %0" : "+q"(c): "r"(b));
	return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'
The "q" constraint only has meaning on -m32; otherwise it is treated as "r".
This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m, or Clang+allyesconfig.
Keep the masking operation to appease sparse (`make C=1`), and add back the cast in order to select the proper 8-bit register alias.
[Nick: reworded]
Cc: stable@vger.kernel.org
Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek sedat.dilek@gmail.com
Reported-by: kernelci.org bot bot@kernelci.org
Suggested-by: Andy Shevchenko andriy.shevchenko@intel.com
Suggested-by: Ilie Halip ilie.halip@gmail.com
Tested-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Nick Desaulniers ndesaulniers@google.com

 arch/x86/include/asm/bitops.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "andb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
Drop & 0xff and change ^ 0xff to ~.
The redundancy is confusing.
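For illustration, that suggestion would amount to something like the following sketch (a hypothetical rewrite, not the posted patch; as the discussion below notes, ~CONST_MASK(nr) is an int with its high bits set):

	/* Hypothetical form of the suggestion: drop the mask, negate with ~. */
	asm volatile(LOCK_PREFIX "orb %1,%0"
		: CONST_MASK_ADDR(nr, addr)
		: "iq" ((u8)CONST_MASK(nr))
		: "memory");
	asm volatile(LOCK_PREFIX "andb %1,%0"
		: CONST_MASK_ADDR(nr, addr)
		: "iq" ((u8)~CONST_MASK(nr)));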
On Tue, May 5, 2020 at 11:07 AM hpa@zytor.com wrote:
On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers ndesaulniers@google.com wrote:
From: Sedat Dilek sedat.dilek@gmail.com
It turns out that if your config tickles __builtin_constant_p via differences in choices to inline or not, this now produces invalid assembly:
$ cat foo.c
long a(long b, long c) {
	asm("orb\t%1, %0" : "+q"(c): "r"(b));
	return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'
The "q" constraint only has meaning on -m32; otherwise it is treated as "r".
This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m, or Clang+allyesconfig.
Keep the masking operation to appease sparse (`make C=1`), and add back the cast in order to select the proper 8-bit register alias.
[Nick: reworded]
Cc: stable@vger.kernel.org
Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek sedat.dilek@gmail.com
Reported-by: kernelci.org bot bot@kernelci.org
Suggested-by: Andy Shevchenko andriy.shevchenko@intel.com
Suggested-by: Ilie Halip ilie.halip@gmail.com
Tested-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Nick Desaulniers ndesaulniers@google.com

 arch/x86/include/asm/bitops.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "andb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
Drop & 0xff and change ^ 0xff to ~.
The redundancy is confusing.
Thanks for the review. While I would also like to have less redundancy, we have ourselves a catch-22 that that suggestion won't resolve.
Without the cast to u8, gcc and clang will not select low-8-bit registers required for the `b` suffix on `orb` and `andb`, which results in an assembler error. Without the mask, sparse will warn about the upper bytes of the value being truncated. (I guess that would have been a more concise commit message).
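Here is a standalone approximation of the catch-22 (a sketch, not the kernel code: LOCK_PREFIX and CONST_MASK_ADDR are replaced by a literal lock prefix and a plain byte memory operand):

#include <stdint.h>
typedef uint8_t u8;

#define CONST_MASK(nr) (1 << ((nr) & 7))	/* evaluates to int */

static void set_bit_const(volatile unsigned long *addr, long nr)
{
	/* Without the (u8) cast the compiler may print a full-width
	 * register name for %1, which the assembler rejects for orb;
	 * without the & 0xff, sparse (make C=1) warns that an int is
	 * being truncated to 8 bits. */
	asm volatile("lock orb %1,%0"
		: "+m" (((volatile u8 *)addr)[nr >> 3])
		: "iq" ((u8)(CONST_MASK(nr) & 0xff))
		: "memory");
}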
On Tue, May 05, 2020 at 11:07:24AM -0700, hpa@zytor.com wrote:
On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers ndesaulniers@google.com wrote:
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "andb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
Drop & 0xff and change ^ 0xff to ~.
But then we're back to sparse being unhappy, no? The thing with ~ is that it will set high bits which will be truncated, which makes sparse sad.
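Spelled out for nr = 0 with the current int-typed CONST_MASK (a small illustration):

#define CONST_MASK(nr) (1 << ((nr) & 7))

/*
 * CONST_MASK(0)        == 0x00000001 (int)
 * ~CONST_MASK(0)       == 0xfffffffe (high bits set)
 * (u8)~CONST_MASK(0)   == 0xfe       (the truncation sparse flags)
 * CONST_MASK(0) ^ 0xff == 0x000000fe (same low byte, no high bits set)
 */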
On Thu, May 7, 2020 at 7:38 AM Peter Zijlstra peterz@infradead.org wrote:
On Tue, May 05, 2020 at 11:07:24AM -0700, hpa@zytor.com wrote:
On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers ndesaulniers@google.com wrote:
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "andb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
Drop & 0xff and change ^ 0xff to ~.
But then we're back to sparse being unhappy, no? The thing with ~ is that it will set high bits which will be truncated, which makes sparse sad.
This change will make sparse happy and allow these cleanups:

#define CONST_MASK(nr) ((u8)1 << ((nr) & 7))
Tested with GCC 9.3.1.
-- Brian Gerst
On Thu, May 7, 2020 at 7:00 AM Brian Gerst brgerst@gmail.com wrote:
This change will make sparse happy and allow these cleanups:

#define CONST_MASK(nr) ((u8)1 << ((nr) & 7))
yep, this is more elegant, IMO. Will send a v3 later with this change. Looking at the uses of CONST_MASK, I noticed arch_change_bit() currently has the (u8) cast from commit 838e8bb71dc0c ("x86: Implement change_bit with immediate operand as "lock xorb""), so that instance can get cleaned up with the above suggestion.
On Thu, May 7, 2020 at 12:19 PM Nick Desaulniers ndesaulniers@google.com wrote:
On Thu, May 7, 2020 at 7:00 AM Brian Gerst brgerst@gmail.com wrote:
This change will make sparse happy and allow these cleanups:

#define CONST_MASK(nr) ((u8)1 << ((nr) & 7))
yep, this is more elegant, IMO. Will send a v3 later with this change. Looking at the uses of CONST_MASK, I noticed arch_change_bit() currently has the (u8) cast from commit 838e8bb71dc0c ("x86: Implement change_bit with immediate operand as "lock xorb""), so that instance can get cleaned up with the above suggestion.
Oh, we need the cast to be the final operation. The binary AND and XOR in 2 of the 3 uses of CONST_MASK implicitly promote the operands of the binary operator to int, so the type of the evaluated subexpression is int. https://wiki.sei.cmu.edu/confluence/display/c/EXP14-C.+Beware+of+integer+pro... So I think this version (v2) is the most precise fix, and would be better than defining more macros or (worse) using metaprogramming.
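A compile-time demonstration of that promotion, using a hypothetical u8-returning variant of the macro:

#include <stdint.h>
typedef uint8_t u8;

#define CONST_MASK_U8(nr) ((u8)1 << ((nr) & 7))

/* Usual arithmetic conversions promote the u8 operand to int before
 * the binary operator (<< or ^) is applied, so the subexpression is
 * int again and the cast is defeated: */
_Static_assert(sizeof(CONST_MASK_U8(0) ^ 0xff) == sizeof(int),
	       "integer promotion makes the XOR an int expression");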
On Thu, May 7, 2020 at 6:29 PM Nick Desaulniers ndesaulniers@google.com wrote:
On Thu, May 7, 2020 at 12:19 PM Nick Desaulniers ndesaulniers@google.com wrote:
On Thu, May 7, 2020 at 7:00 AM Brian Gerst brgerst@gmail.com wrote:
This change will make sparse happy and allow these cleanups:

#define CONST_MASK(nr) ((u8)1 << ((nr) & 7))
yep, this is more elegant, IMO. Will send a v3 later with this change. Looking at the uses of CONST_MASK, I noticed arch_change_bit() currently has the (u8) cast from commit 838e8bb71dc0c ("x86: Implement change_bit with immediate operand as "lock xorb""), so that instance can get cleaned up with the above suggestion.
Oh, we need the cast to be the final operation. The binary AND and XOR in 2 of the 3 uses of CONST_MASK implicitly promote the operands of the binary operator to int, so the type of the evaluated subexpression is int. https://wiki.sei.cmu.edu/confluence/display/c/EXP14-C.+Beware+of+integer+pro... So I think this version (v2) is the most precise fix, and would be better than defining more macros or (worse) using metaprogramming.
One last suggestion. Add the "b" modifier to the mask operand: "orb %b1, %0". That forces the compiler to use the 8-bit register name instead of trying to deduce the width from the input.
-- Brian Gerst
On Thu, May 7, 2020 at 6:57 PM Brian Gerst brgerst@gmail.com wrote:
On Thu, May 7, 2020 at 6:29 PM Nick Desaulniers ndesaulniers@google.com wrote:
On Thu, May 7, 2020 at 12:19 PM Nick Desaulniers ndesaulniers@google.com wrote:
On Thu, May 7, 2020 at 7:00 AM Brian Gerst brgerst@gmail.com wrote:
This change will make sparse happy and allow these cleanups:

#define CONST_MASK(nr) ((u8)1 << ((nr) & 7))
yep, this is more elegant, IMO. Will send a v3 later with this change. Looking at the uses of CONST_MASK, I noticed arch_change_bit() currently has the (u8) cast from commit 838e8bb71dc0c ("x86: Implement change_bit with immediate operand as "lock xorb""), so that instance can get cleaned up with the above suggestion.
Oh, we need the cast to be the final operation. The binary AND and XOR in 2 of the 3 uses of CONST_MASK implicitly promote the operands of the binary operator to int, so the type of the evaluated subexpression is int. https://wiki.sei.cmu.edu/confluence/display/c/EXP14-C.+Beware+of+integer+pro... So I think this version (v2) is the most precise fix, and would be better than defining more macros or (worse) using metaprogramming.
One last suggestion. Add the "b" modifier to the mask operand: "orb %b1, %0". That forces the compiler to use the 8-bit register name instead of trying to deduce the width from the input.
Ah right: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Looks like that works for both compilers. In that case, we can likely drop the `& 0xff`, too. Let me play with that, then I'll hopefully send a v3 today.
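For reference, a sketch of the direction v3 would take (an assumption based on this thread, not the merged patch text):

	asm volatile(LOCK_PREFIX "orb %b1,%0"
		: CONST_MASK_ADDR(nr, addr)
		: "iq" (CONST_MASK(nr))	/* %b1 forces the 8-bit register name */
		: "memory");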
On 2020-05-08 10:21, Nick Desaulniers wrote:
One last suggestion. Add the "b" modifier to the mask operand: "orb %b1, %0". That forces the compiler to use the 8-bit register name instead of trying to deduce the width from the input.
Ah right: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Looks like that works for both compilers. In that case, we can likely drop the `& 0xff`, too. Let me play with that, then I'll hopefully send a v3 today.
Good idea. I requested a while ago that they document these modifiers; they chose not to document them all which in some ways is good; it shows what they are willing to commit to indefinitely.
-hpa
From: Peter Anvin
Sent: 08 May 2020 18:32
On 2020-05-08 10:21, Nick Desaulniers wrote:
One last suggestion. Add the "b" modifier to the mask operand: "orb %b1, %0". That forces the compiler to use the 8-bit register name instead of trying to deduce the width from the input.
Ah right: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Looks like that works for both compilers. In that case, we can likely drop the `& 0xff`, too. Let me play with that, then I'll hopefully send a v3 today.
Good idea. I requested a while ago that they document these modifiers; they chose not to document them all which in some ways is good; it shows what they are willing to commit to indefinitely.
I thought the intention here was to explicitly do a byte access. If the constant bit number has had a div/mod by 8 done on it then the address can be misaligned - so you mustn't do a non-byte sized locked access.
OTOH the original base address must be aligned.
Looking at some instruction timings, BTS/BTR aren't too bad if the bit number is a constant, but they are 6 or 7 clocks slower if it is in %cl. Given these are locked RMW bus cycles they'll always be slow!
How about an asm multi-part alternative that uses a byte offset and byte constant if the compiler thinks the mask is constant or a 4-byte offset and 32bit mask if it doesn't.
The other alternative is to just use BTS/BTR and (maybe) rely on the assembler to add the word offset to the base address.
David
On May 10, 2020 4:59:17 AM PDT, David Laight David.Laight@ACULAB.COM wrote:
From: Peter Anvin
Sent: 08 May 2020 18:32 On 2020-05-08 10:21, Nick Desaulniers wrote:
One last suggestion. Add the "b" modifier to the mask operand: "orb %b1, %0". That forces the compiler to use the 8-bit register name instead of trying to deduce the width from the input.
Ah right: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
Looks like that works for both compilers. In that case, we can likely drop the `& 0xff`, too. Let me play with that, then I'll hopefully send a v3 today.
Good idea. I requested a while ago that they document these modifiers; they chose not to document them all, which in some ways is good: it shows what they are willing to commit to indefinitely.
I thought the intention here was to explicitly do a byte access. If the constant bit number has had a div/mod by 8 done on it then the address can be misaligned - so you mustn't do a non-byte sized locked access.
OTOH the original base address must be aligned.
Looking at some instruction timings, BTS/BTR aren't too bad if the bit number is a constant, but they are 6 or 7 clocks slower if it is in %cl. Given these are locked RMW bus cycles they'll always be slow!
How about an asm multi-part alternative that uses a byte offset and byte constant if the compiler thinks the mask is constant or a 4-byte offset and 32bit mask if it doesn't.
The other alternative is to just use BTS/BTR and (maybe) rely on the assembler to add the word offset to the base address.
David
I don't understand what you are getting at here.
The intent is to do a byte access. The "multi-part asm" you are talking about is also already there...
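For reference, the existing structure being pointed at, reassembled from the hunks quoted earlier in this thread (the storage-class specifiers on the definition are assumed, since the hunk headers only show the signature):

static __always_inline void
arch_set_bit(long nr, volatile unsigned long *addr)
{
	if (__builtin_constant_p(nr)) {
		/* Constant bit: byte-sized locked OR of an immediate mask. */
		asm volatile(LOCK_PREFIX "orb %1,%0"
			: CONST_MASK_ADDR(nr, addr)
			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
			: "memory");
	} else {
		/* Variable bit: bts on the containing long. */
		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
	}
}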
On May 7, 2020 4:34:22 AM PDT, Peter Zijlstra peterz@infradead.org wrote:
On Tue, May 05, 2020 at 11:07:24AM -0700, hpa@zytor.com wrote:
On May 5, 2020 10:44:22 AM PDT, Nick Desaulniers ndesaulniers@google.com wrote:
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "andb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
Drop & 0xff and change ^ 0xff to ~.
But then we're back to sparse being unhappy, no? The thing with ~ is that it will set high bits which will be truncated, which makes sparse sad.
In that case, sparse is just broken.
On Tue, May 05, 2020 at 10:44:22AM -0700, Nick Desaulniers wrote:
From: Sedat Dilek sedat.dilek@gmail.com
It turns out that if your config tickles __builtin_constant_p via differences in choices to inline or not, this now produces invalid assembly:
$ cat foo.c
long a(long b, long c) {
	asm("orb\t%1, %0" : "+q"(c): "r"(b));
	return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'
The "q" constraint only has meaning on -m32; otherwise it is treated as "r".
This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m, or Clang+allyesconfig.
For what it's worth, I don't see this with allyesconfig.
Keep the masking operation to appease sparse (`make C=1`), and add back the cast in order to select the proper 8-bit register alias.
[Nick: reworded]
Cc: stable@vger.kernel.org
The offending commit was added in 5.7-rc1; we shouldn't need to Cc stable since this should be picked up as an -rc fix.
Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek sedat.dilek@gmail.com
Reported-by: kernelci.org bot bot@kernelci.org
Suggested-by: Andy Shevchenko andriy.shevchenko@intel.com
Suggested-by: Ilie Halip ilie.halip@gmail.com
Not to split hairs but this is Ilie's diff, he should probably be the author with Sedat's Reported-by/Tested-by.
https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-608239458
But eh, it's all a team effort plus that can only happen with Ilie's explicit consent for a Signed-off-by.
I am currently doing a set of builds with clang-11 with this patch on top of 5.7-rc4 to make sure that all of the cases I have found work. Once that is done, I'll comment back with a tag.
Tested-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Nick Desaulniers ndesaulniers@google.com

 arch/x86/include/asm/bitops.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "andb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
-- 2.26.2.526.g744177e7f7-goog
Cheers, Nathan
On Wed, May 6, 2020 at 6:30 AM Nathan Chancellor natechancellor@gmail.com wrote:
On Tue, May 05, 2020 at 10:44:22AM -0700, Nick Desaulniers wrote:
From: Sedat Dilek sedat.dilek@gmail.com
It turns out that if your config tickles __builtin_constant_p via differences in choices to inline or not, this now produces invalid assembly:
$ cat foo.c
long a(long b, long c) {
	asm("orb\t%1, %0" : "+q"(c): "r"(b));
	return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'
The "q" constraint only has meaning on -m32; otherwise it is treated as "r".
This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m, or Clang+allyesconfig.
For what it's worth, I don't see this with allyesconfig.
Keep the masking operation to appease sparse (`make C=1`), and add back the cast in order to select the proper 8-bit register alias.
[Nick: reworded]
Cc: stable@vger.kernel.org
The offending commit was added in 5.7-rc1; we shouldn't need to Cc stable since this should be picked up as an -rc fix.
Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek sedat.dilek@gmail.com
Reported-by: kernelci.org bot bot@kernelci.org
Suggested-by: Andy Shevchenko andriy.shevchenko@intel.com
Suggested-by: Ilie Halip ilie.halip@gmail.com
Not to split hairs but this is Ilie's diff, he should probably be the author with Sedat's Reported-by/Tested-by.
https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-608239458
But eh, it's all a team effort plus that can only happen with Ilie's explicit consent for a Signed-off-by.
Digital dementia... Looking back 3 weeks, I had put all the relevant information into the patches in [1], mentioning that the diff is from Ilie. For whatever reason, Ilie did not react to any response in the CBL issue tracker for 3 weeks. I think Nick wants to quickly fix the Kernel-CI-Bot issue seen with Clang. Personally, I hope this patch will be upstreamed in (one of) the next RC releases.
I agree that Cc: stable can be dropped. Checking the causing commit ID:
$ git describe --contains 1651e700664b4
v5.7-rc1~122^2
Thanks.
Regards, - Sedat -
[1] https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-613207374
I am currently doing a set of builds with clang-11 with this patch on top of 5.7-rc4 to make sure that all of the cases I have found work. Once that is done, I'll comment back with a tag.
Tested-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Nick Desaulniers ndesaulniers@google.com

 arch/x86/include/asm/bitops.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "andb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
-- 2.26.2.526.g744177e7f7-goog
Cheers, Nathan
On Tue, May 05, 2020 at 09:30:28PM -0700, Nathan Chancellor wrote:
On Tue, May 05, 2020 at 10:44:22AM -0700, Nick Desaulniers wrote:
From: Sedat Dilek sedat.dilek@gmail.com
It turns out that if your config tickles __builtin_constant_p via differences in choices to inline or not, this now produces invalid assembly:
$ cat foo.c
long a(long b, long c) {
	asm("orb\t%1, %0" : "+q"(c): "r"(b));
	return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'
The "q" constraint only has meaning on -m32; otherwise it is treated as "r".
This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m, or Clang+allyesconfig.
For what it's worth, I don't see this with allyesconfig.
Keep the masking operation to appease sparse (`make C=1`), and add back the cast in order to select the proper 8-bit register alias.
[Nick: reworded]
Cc: stable@vger.kernel.org
The offending commit was added in 5.7-rc1; we shouldn't need to Cc stable since this should be picked up as an -rc fix.
Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek sedat.dilek@gmail.com
Reported-by: kernelci.org bot bot@kernelci.org
Suggested-by: Andy Shevchenko andriy.shevchenko@intel.com
Suggested-by: Ilie Halip ilie.halip@gmail.com
Not to split hairs but this is Ilie's diff, he should probably be the author with Sedat's Reported-by/Tested-by.
https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-608239458
But eh, it's all a team effort plus that can only happen with Ilie's explicit consent for a Signed-off-by.
I am currently doing a set of builds with clang-11 with this patch on top of 5.7-rc4 to make sure that all of the cases I have found work. Once that is done, I'll comment back with a tag.
Reviewed-by: Nathan Chancellor natechancellor@gmail.com
Tested-by: Nathan Chancellor natechancellor@gmail.com # build
Tested-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Nick Desaulniers ndesaulniers@google.com

 arch/x86/include/asm/bitops.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
 			: "memory");
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
@@ -74,7 +74,7 @@ arch_clear_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "andb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) ^ 0xff));
+			: "iq" ((u8)(CONST_MASK(nr) ^ 0xff)));
 	} else {
 		asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
 			: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
-- 2.26.2.526.g744177e7f7-goog
Cheers, Nathan
On Tue, May 5, 2020 at 9:30 PM Nathan Chancellor natechancellor@gmail.com wrote:
On Tue, May 05, 2020 at 10:44:22AM -0700, Nick Desaulniers wrote:
This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m, or Clang+allyesconfig.
For what it's worth, I don't see this with allyesconfig.
Oops, ok, I'll drop that from the commit message in v2. I was testing with the former.
Keep the masking operation to appease sparse (`make C=1`), and add back the cast in order to select the proper 8-bit register alias.
[Nick: reworded]
Cc: stable@vger.kernel.org
The offending commit was added in 5.7-rc1; we shouldn't need to Cc stable since this should be picked up as an -rc fix.
Got it, will drop in v2.
Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek sedat.dilek@gmail.com
Reported-by: kernelci.org bot bot@kernelci.org
Suggested-by: Andy Shevchenko andriy.shevchenko@intel.com
Suggested-by: Ilie Halip ilie.halip@gmail.com
Not to split hairs but this is Ilie's diff, he should probably be the author with Sedat's Reported-by/Tested-by.
https://github.com/ClangBuiltLinux/linux/issues/961#issuecomment-608239458
Ooh, you're right. Sorry about that, Ilie. I'm usually pretty pedantic about getting that right; my mistake. I'll fix that in v2. As Sedat noted, the issue tracker has been a little quiet on this issue, but I'll note that there are extraordinary circumstances going on in the world these days (COVID), so delays should be anticipated.
Ilie, may I put your authorship and signed off by tag on the V2?
But eh, it's all a team effort plus that can only happen with Ilie's explicit consent for a Signed-off-by.
On Tue, May 5, 2020 at 1:47 PM Nick Desaulniers ndesaulniers@google.com wrote:
From: Sedat Dilek sedat.dilek@gmail.com
It turns out that if your config tickles __builtin_constant_p via differences in choices to inline or not, this now produces invalid assembly:
$ cat foo.c
long a(long b, long c) {
	asm("orb\t%1, %0" : "+q"(c): "r"(b));
	return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'
The "q" constraint only has meaning on -m32; otherwise it is treated as "r".
This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m, or Clang+allyesconfig.
Keep the masking operation to appease sparse (`make C=1`), and add back the cast in order to select the proper 8-bit register alias.
[Nick: reworded]
Cc: stable@vger.kernel.org
Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek sedat.dilek@gmail.com
Reported-by: kernelci.org bot bot@kernelci.org
Suggested-by: Andy Shevchenko andriy.shevchenko@intel.com
Suggested-by: Ilie Halip ilie.halip@gmail.com
Tested-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Nick Desaulniers ndesaulniers@google.com

 arch/x86/include/asm/bitops.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
I think a better fix would be to make CONST_MASK() return a u8 value rather than have to cast on every use.
Also I question the need for the "q" constraint. It was added in commit 437a0a54 as a workaround for GCC 3.4.4. Now that the minimum GCC version is 4.6, is this still necessary?
-- Brian Gerst
On May 6, 2020 11:18:09 PM PDT, Brian Gerst brgerst@gmail.com wrote:
On Tue, May 5, 2020 at 1:47 PM Nick Desaulniers ndesaulniers@google.com wrote:
From: Sedat Dilek sedat.dilek@gmail.com
It turns out that if your config tickles __builtin_constant_p via differences in choices to inline or not, this now produces invalid assembly:
$ cat foo.c
long a(long b, long c) {
	asm("orb\t%1, %0" : "+q"(c): "r"(b));
	return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'
The "q" constraint only has meaning on -m32; otherwise it is treated as "r".
This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m, or Clang+allyesconfig.
Keep the masking operation to appease sparse (`make C=1`), and add back the cast in order to select the proper 8-bit register alias.
[Nick: reworded]
Cc: stable@vger.kernel.org
Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek sedat.dilek@gmail.com
Reported-by: kernelci.org bot bot@kernelci.org
Suggested-by: Andy Shevchenko andriy.shevchenko@intel.com
Suggested-by: Ilie Halip ilie.halip@gmail.com
Tested-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Nick Desaulniers ndesaulniers@google.com

 arch/x86/include/asm/bitops.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
I think a better fix would be to make CONST_MASK() return a u8 value rather than have to cast on every use.
Also I question the need for the "q" constraint. It was added in commit 437a0a54 as a workaround for GCC 3.4.4. Now that the minimum GCC version is 4.6, is this still necessary?
-- Brian Gerst
Yes, "q" is needed on i386.
On Thu, May 7, 2020 at 3:02 AM hpa@zytor.com wrote:
On May 6, 2020 11:18:09 PM PDT, Brian Gerst brgerst@gmail.com wrote:
On Tue, May 5, 2020 at 1:47 PM Nick Desaulniers ndesaulniers@google.com wrote:
From: Sedat Dilek sedat.dilek@gmail.com
It turns out that if your config tickles __builtin_constant_p via differences in choices to inline or not, this now produces invalid assembly:
$ cat foo.c
long a(long b, long c) {
	asm("orb\t%1, %0" : "+q"(c): "r"(b));
	return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'
The "q" constraint only has meaning on -m32; otherwise it is treated as "r".
This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m, or Clang+allyesconfig.
Keep the masking operation to appease sparse (`make C=1`), and add back the cast in order to select the proper 8-bit register alias.
[Nick: reworded]
Cc: stable@vger.kernel.org
Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek sedat.dilek@gmail.com
Reported-by: kernelci.org bot bot@kernelci.org
Suggested-by: Andy Shevchenko andriy.shevchenko@intel.com
Suggested-by: Ilie Halip ilie.halip@gmail.com
Tested-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Nick Desaulniers ndesaulniers@google.com

 arch/x86/include/asm/bitops.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
I think a better fix would be to make CONST_MASK() return a u8 value rather than have to cast on every use.
Also I question the need for the "q" constraint. It was added in commit 437a0a54 as a workaround for GCC 3.4.4. Now that the minimum GCC version is 4.6, is this still necessary?
-- Brian Gerst
Yes, "q" is needed on i386.
I think the bug this worked around was that the compiler didn't detect that CONST_MASK(nr) was also constant and doesn't need to be put into a register. The question is does that bug still exist on compiler versions we care about?
-- Brian Gerst
From: Brian Gerst
Sent: 07 May 2020 14:32
...
I think the bug this worked around was that the compiler didn't detect that CONST_MASK(nr) was also constant and doesn't need to be put into a register. The question is does that bug still exist on compiler versions we care about?
Hmmm... That ought to have been fixed instead of worrying about the fact that an invalid register was used.
Alternatively is there any reason not to use the bts/btc instructions? Yes, I know they'll do wider accesses, but variable bit numbers do. It is also possible that the assembler will support constant bit numbers >= 32 by adding to the address offset.
David
On May 7, 2020 8:09:35 AM PDT, David Laight David.Laight@ACULAB.COM wrote:
From: Brian Gerst
Sent: 07 May 2020 14:32
...
I think the bug this worked around was that the compiler didn't detect that CONST_MASK(nr) was also constant and doesn't need to be put into a register. The question is does that bug still exist on compiler versions we care about?
Hmmm... That ought to have been fixed instead of worrying about the fact that an invalid register was used.
Alternatively is there any reason not to use the bts/btc instructions? Yes, I know they'll do wider accesses, but variable bit numbers do. It is also possible that the assembler will support constant bit numbers >= 32 by adding to the address offset.
David
They're slower, and for unaligned locked fields can be severely so.
On May 7, 2020 6:32:24 AM PDT, Brian Gerst brgerst@gmail.com wrote:
On Thu, May 7, 2020 at 3:02 AM hpa@zytor.com wrote:
On May 6, 2020 11:18:09 PM PDT, Brian Gerst brgerst@gmail.com wrote:
On Tue, May 5, 2020 at 1:47 PM Nick Desaulniers ndesaulniers@google.com wrote:
From: Sedat Dilek sedat.dilek@gmail.com
It turns out that if your config tickles __builtin_constant_p via differences in choices to inline or not, this now produces invalid assembly:
$ cat foo.c
long a(long b, long c) {
	asm("orb\t%1, %0" : "+q"(c): "r"(b));
	return c;
}
$ gcc foo.c
foo.c: Assembler messages:
foo.c:2: Error: `%rax' not allowed with `orb'
The "q" constraint only has meaning on -m32; otherwise it is treated as "r".
This is easily reproducible via Clang+CONFIG_STAGING=y+CONFIG_VT6656=m, or Clang+allyesconfig.
Keep the masking operation to appease sparse (`make C=1`), and add back the cast in order to select the proper 8-bit register alias.
[Nick: reworded]
Cc: stable@vger.kernel.org
Cc: Jesse Brandeburg jesse.brandeburg@intel.com
Link: https://github.com/ClangBuiltLinux/linux/issues/961
Link: https://lore.kernel.org/lkml/20200504193524.GA221287@google.com/
Fixes: 1651e700664b4 ("x86: Fix bitops.h warning with a moved cast")
Reported-by: Sedat Dilek sedat.dilek@gmail.com
Reported-by: kernelci.org bot bot@kernelci.org
Suggested-by: Andy Shevchenko andriy.shevchenko@intel.com
Suggested-by: Ilie Halip ilie.halip@gmail.com
Tested-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Sedat Dilek sedat.dilek@gmail.com
Signed-off-by: Nick Desaulniers ndesaulniers@google.com

 arch/x86/include/asm/bitops.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b392571c1f1d..139122e5b25b 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
I think a better fix would be to make CONST_MASK() return a u8 value rather than have to cast on every use.
Also I question the need for the "q" constraint. It was added in commit 437a0a54 as a workaround for GCC 3.4.4. Now that the minimum GCC version is 4.6, is this still necessary?
-- Brian Gerst
Yes, "q" is needed on i386.
I think the bug this worked around was that the compiler didn't detect that CONST_MASK(nr) was also constant and doesn't need to be put into a register. The question is does that bug still exist on compiler versions we care about?
-- Brian Gerst
The compiler is free to do that, including for legit reasons (common subexpression elimination, especially). So yes.
From: Brian Gerst
Sent: 07 May 2020 07:18
...
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
I think a better fix would be to make CONST_MASK() return a u8 value rather than have to cast on every use.
Or assign to a local variable - then it doesn't matter how the value is actually calculated. So:

	u8 mask = CONST_MASK(nr);
	asm volatile(LOCK_PREFIX "orb %1,%0"
		: CONST_MASK_ADDR(nr, addr)
		: "iq" (mask)
David
On May 7, 2020 12:44:44 AM PDT, David Laight David.Laight@ACULAB.COM wrote:
From: Brian Gerst
Sent: 07 May 2020 07:18
...
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
I think a better fix would be to make CONST_MASK() return a u8 value rather than have to cast on every use.
Or assign to a local variable - then it doesn't matter how the value is actually calculated. So:

	u8 mask = CONST_MASK(nr);
	asm volatile(LOCK_PREFIX "orb %1,%0"
		: CONST_MASK_ADDR(nr, addr)
		: "iq" (mask)
David
"const u8" please...
From: hpa@zytor.com
Sent: 07 May 2020 08:59
On May 7, 2020 12:44:44 AM PDT, David Laight David.Laight@ACULAB.COM wrote:
From: Brian Gerst
Sent: 07 May 2020 07:18
...
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
I think a better fix would be to make CONST_MASK() return a u8 value rather than have to cast on every use.
Or assign to a local variable - then it doesn't matter how the value is actually calculated. So:

	u8 mask = CONST_MASK(nr);
	asm volatile(LOCK_PREFIX "orb %1,%0"
		: CONST_MASK_ADDR(nr, addr)
		: "iq" (mask)
David
"const u8" please...
Why, just a waste of disk space.
David
On May 7, 2020 1:35:01 AM PDT, David Laight David.Laight@ACULAB.COM wrote:
From: hpa@zytor.com
Sent: 07 May 2020 08:59
On May 7, 2020 12:44:44 AM PDT, David Laight David.Laight@ACULAB.COM wrote:
From: Brian Gerst
Sent: 07 May 2020 07:18
...
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -54,7 +54,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 	if (__builtin_constant_p(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
 			: CONST_MASK_ADDR(nr, addr)
-			: "iq" (CONST_MASK(nr) & 0xff)
+			: "iq" ((u8)(CONST_MASK(nr) & 0xff))
I think a better fix would be to make CONST_MASK() return a u8 value rather than have to cast on every use.
Or assign to a local variable - then it doesn't matter how the value is actually calculated. So:

	u8 mask = CONST_MASK(nr);
	asm volatile(LOCK_PREFIX "orb %1,%0"
		: CONST_MASK_ADDR(nr, addr)
		: "iq" (mask)
David
"const u8" please...
Why, just a waste of disk space.
David
Descriptive.
On Thu, May 7, 2020 at 10:50 AM David Laight David.Laight@aculab.com wrote:
From: Brian Gerst
Sent: 07 May 2020 07:18
I think a better fix would be to make CONST_MASK() return a u8 value rather than have to cast on every use.
Or assign to a local variable - then it doesn't matter how the value is actually calculated. So:

	u8 mask = CONST_MASK(nr);
Another case, the one with negation, won't work like this, I believe. So, I think the patch we have is good enough; no need to seek for anything fancier.
On Wed, May 6, 2020 at 11:18 PM Brian Gerst brgerst@gmail.com wrote:
I think a better fix would be to make CONST_MASK() return a u8 value rather than have to cast on every use.
Also I question the need for the "q" constraint. It was added in commit 437a0a54 as a workaround for GCC 3.4.4. Now that the minimum GCC version is 4.6, is this still necessary?
TL;DR yes
ah, thanks for the git archeology, it's useful. I don't think this is a compiler bug however, just the compiler being more strict that the `b` suffix on `orb` requires an 8-bit register operand. For 32-bit x86, the `q` asm constraint is required, because not all GPRs have lower 8-bit register aliases, as Arnd found and HPA noted as well.
I like your suggested change to CONST_MASK, and will send that in a bit.
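To make the constraint point concrete, a sketch adapting the reproducer from the commit message (illustrative only; %b is the operand modifier discussed above):

/* On i386 only %eax/%ebx/%ecx/%edx have 8-bit aliases, so "q"
 * restricts the compiler to those; on x86-64 every GPR has a low-8
 * alias, so "q" degenerates to "r". The %b modifier then prints the
 * 8-bit name, so orb gets a byte register on both targets. */
long a(long b, long c)
{
	asm("orb\t%b1, %b0" : "+q"(c) : "q"(b));
	return c;
}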