Since GCC commit 3f13154553f8546a ("df-scan: remove ad-hoc handling of global regs in asms"), global registers are no longer forcibly added to the def-use chain. As a result, current_thread_info(), current_stack_pointer and __my_cpu_offset may be hoisted out of loops because they are no longer treated as "volatile variables".
This optimization is still correct for the current_thread_info() and current_stack_pointer usages because they are associated with a thread. However, it is wrong for __my_cpu_offset, which is associated with a CPU rather than a thread: if the thread migrates to a different CPU inside the loop, __my_cpu_offset should change.
Change the __my_cpu_offset definition to treat it as a "volatile variable", in order to avoid such a mis-optimization.
Cc: stable@vger.kernel.org
Reported-by: Xiaotian Wu <wuxiaotian@loongson.cn>
Reported-by: Miao Wang <shankerwangmiao@gmail.com>
Signed-off-by: Xing Li <lixing@loongson.cn>
Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
Signed-off-by: Rui Wang <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 arch/loongarch/include/asm/percpu.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/loongarch/include/asm/percpu.h b/arch/loongarch/include/asm/percpu.h
index 9b36ac003f89..03b98491d301 100644
--- a/arch/loongarch/include/asm/percpu.h
+++ b/arch/loongarch/include/asm/percpu.h
@@ -29,7 +29,12 @@ static inline void set_my_cpu_offset(unsigned long off)
 	__my_cpu_offset = off;
 	csr_write64(off, PERCPU_BASE_KS);
 }
-#define __my_cpu_offset __my_cpu_offset
+
+#define __my_cpu_offset					\
+({							\
+	__asm__ __volatile__("":"+r"(__my_cpu_offset));	\
+	__my_cpu_offset;				\
+})
 
 #define PERCPU_OP(op, asm_op, c_op)				\
 static __always_inline unsigned long __percpu_##op(void *ptr,	\
On Fri, 2024-03-15 at 10:45 +0800, Huacai Chen wrote:
Since GCC commit 3f13154553f8546a ("df-scan: remove ad-hoc handling of global regs in asms"), global registers are no longer forcibly added to the def-use chain. As a result, current_thread_info(), current_stack_pointer and __my_cpu_offset may be hoisted out of loops because they are no longer treated as "volatile variables".
Ooops... I'm wondering why this issue has not blown up our systems before. The referenced GCC commit far predates the tape-out of LoongArch CPUs.
This optimization is still correct for the current_thread_info() and current_stack_pointer usages because they are associated with a thread. However, it is wrong for __my_cpu_offset, which is associated with a CPU rather than a thread: if the thread migrates to a different CPU inside the loop, __my_cpu_offset should change.
Change the __my_cpu_offset definition to treat it as a "volatile variable", in order to avoid such a mis-optimization.
Cc: stable@vger.kernel.org
I suppose we should add Fixes: 5b0b14e550a0 ("LoongArch: Add atomic/locking header") here.
Reported-by: Xiaotian Wu <wuxiaotian@loongson.cn>
Reported-by: Miao Wang <shankerwangmiao@gmail.com>
Signed-off-by: Xing Li <lixing@loongson.cn>
Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
Signed-off-by: Rui Wang <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
 arch/loongarch/include/asm/percpu.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/loongarch/include/asm/percpu.h b/arch/loongarch/include/asm/percpu.h
index 9b36ac003f89..03b98491d301 100644
--- a/arch/loongarch/include/asm/percpu.h
+++ b/arch/loongarch/include/asm/percpu.h
@@ -29,7 +29,12 @@ static inline void set_my_cpu_offset(unsigned long off)
 	__my_cpu_offset = off;
 	csr_write64(off, PERCPU_BASE_KS);
 }
-#define __my_cpu_offset __my_cpu_offset
+
+#define __my_cpu_offset					\
+({							\
+	__asm__ __volatile__("":"+r"(__my_cpu_offset));	\
+	__my_cpu_offset;				\
+})
 
 #define PERCPU_OP(op, asm_op, c_op)				\
 static __always_inline unsigned long __percpu_##op(void *ptr,	\
Hi, Ruoyao,
On Wed, Mar 20, 2024 at 6:27 PM Xi Ruoyao <xry111@xry111.site> wrote:
On Fri, 2024-03-15 at 10:45 +0800, Huacai Chen wrote:
Since GCC commit 3f13154553f8546a ("df-scan: remove ad-hoc handling of global regs in asms"), global registers are no longer forcibly added to the def-use chain. As a result, current_thread_info(), current_stack_pointer and __my_cpu_offset may be hoisted out of loops because they are no longer treated as "volatile variables".
Ooops... I'm wondering why this issue has not blown up our systems before. The referenced GCC commit far predates the tape-out of LoongArch CPUs.
This optimization is still correct for the current_thread_info() and current_stack_pointer usages because they are associated with a thread. However, it is wrong for __my_cpu_offset, which is associated with a CPU rather than a thread: if the thread migrates to a different CPU inside the loop, __my_cpu_offset should change.
Change the __my_cpu_offset definition to treat it as a "volatile variable", in order to avoid such a mis-optimization.
Cc: stable@vger.kernel.org
I suppose we should add Fixes: 5b0b14e550a0 ("LoongArch: Add atomic/locking header") here.
You are right, but since this patch is in loongarch-next for the pull request, I don't want to change things.
Huacai
Reported-by: Xiaotian Wu <wuxiaotian@loongson.cn>
Reported-by: Miao Wang <shankerwangmiao@gmail.com>
Signed-off-by: Xing Li <lixing@loongson.cn>
Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn>
Signed-off-by: Rui Wang <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
 arch/loongarch/include/asm/percpu.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/loongarch/include/asm/percpu.h b/arch/loongarch/include/asm/percpu.h
index 9b36ac003f89..03b98491d301 100644
--- a/arch/loongarch/include/asm/percpu.h
+++ b/arch/loongarch/include/asm/percpu.h
@@ -29,7 +29,12 @@ static inline void set_my_cpu_offset(unsigned long off)
 	__my_cpu_offset = off;
 	csr_write64(off, PERCPU_BASE_KS);
 }
-#define __my_cpu_offset __my_cpu_offset
+
+#define __my_cpu_offset					\
+({							\
+	__asm__ __volatile__("":"+r"(__my_cpu_offset));	\
+	__my_cpu_offset;				\
+})
 
 #define PERCPU_OP(op, asm_op, c_op)				\
 static __always_inline unsigned long __percpu_##op(void *ptr,	\
-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University