On Wed, Jul 17, 2019 at 5:02 PM Vaibhav Rustagi vaibhavrustagi@google.com wrote:
From: Nick Desaulniers ndesaulniers@google.com
Implementing memcpy and memset in terms of __builtin_memcpy and __builtin_memset is problematic.
GCC at -O2 will replace calls to the builtins with calls to memcpy and memset (but will generate an inline implementation at -Os). Clang will replace the builtins with these calls regardless of optimization level.
$ llvm-objdump -dr arch/x86/purgatory/string.o | tail
0000000000000339 memcpy:
     339: 48 b8 00 00 00 00 00 00 00 00   movabsq $0, %rax
                000000000000033b:  R_X86_64_64  memcpy
     343: ff e0                           jmpq *%rax

0000000000000345 memset:
     345: 48 b8 00 00 00 00 00 00 00 00   movabsq $0, %rax
                0000000000000347:  R_X86_64_64  memset
     34f: ff e0
Such code results in infinite recursion at runtime. This is observed when doing kexec.
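For illustration only, here is a minimal sketch of definitions that contain no builtin for the compiler to turn back into a call to the function being defined. The names simple_memcpy and simple_memset are hypothetical and this is not the code the patch ends up using. An optimizing compiler may still recognise such loops and emit a library call, so freestanding code usually needs extra care (with GCC, for instance, -fno-tree-loop-distribute-patterns).

```c
#include <stddef.h>

/* Hypothetical names; a sketch, not the kernel's implementation. */
void *simple_memcpy(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	/* Copy forward one byte at a time; the regions must not overlap. */
	while (len--)
		*d++ = *s++;
	return dst;
}

void *simple_memset(void *dst, int c, size_t len)
{
	unsigned char *d = dst;

	/* Store the low byte of c into each of the len bytes. */
	while (len--)
		*d++ = (unsigned char)c;
	return dst;
}
```

Unlike the __builtin_memcpy/__builtin_memset wrappers, neither body hands the work back to a function with the same name as the one being defined.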
Just so it's crystal clear to other reviewers, consider this codegen between compilers and optimization levels: https://godbolt.org/z/jcfKsw

So I'd imagine the commit that introduced these implementations very much relied on being compiled at -Os to work.
Instead, reuse an implementation from arch/x86/boot/compressed/string.c if we define warn as a symbol.
Alternatively, I was getting fancy trying to match what GCC lowers __builtin_memcpy/__builtin_memset to:

diff --git a/arch/x86/purgatory/string.c b/arch/x86/purgatory/string.c
index 795ca4f..e055f65 100644
--- a/arch/x86/purgatory/string.c
+++ b/arch/x86/purgatory/string.c
@@ -16,10 +16,23 @@
 void *memcpy(void *dst, const void *src, size_t len)
 {
-	return __builtin_memcpy(dst, src, len);
+	asm(
+		"movq %0, %%rax\n\t"
+		"movq %2, %%rcx\n\t"
+		"rep movsb\n\t"
+		: "=r"(dst) : "r"(src), "ri"(len) : "rax", "rcx");
+	return dst;
 }
 
 void *memset(void *dst, int c, size_t len)
 {
-	return __builtin_memset(dst, c, len);
+	void* ret;
+	asm(
+		"movq %1, %%r8\n\t"
+		"movl %2, %%eax\n\t"
+		"movq %3, %%rcx\n\t"
+		"rep stosb\n\t"
+		"movq %%r8, %0"
+		: "=r"(ret) : "r"(dst), "ri"(c), "ri"(len) : "r8", "eax", "rcx");
+	return ret;
 }
but then Alistair pointed out that we have a proliferation of memcpy+memset definitions in the kernel, and we should probably just reuse an existing one rather than add to the arms race.
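For anyone curious about the inline-asm route anyway, a hedged sketch of how the operands could be tied to the registers that rep movsb actually uses. The name rep_movsb_copy is hypothetical, the portable fallback exists only so the sketch builds on non-x86 targets, and it assumes the direction flag is clear as the x86-64 ABI requires at function boundaries. This is not a proposed kernel patch.

```c
#include <stddef.h>

/* Hypothetical name; a sketch, not a proposed kernel patch. */
void *rep_movsb_copy(void *dst, const void *src, size_t len)
{
	void *ret = dst;
#if defined(__x86_64__)
	/*
	 * "+D", "+S" and "+c" pin dst, src and len to RDI, RSI and RCX,
	 * the registers rep movsb reads and updates. The "memory"
	 * clobber tells the compiler that memory is read and written.
	 */
	__asm__ volatile("rep movsb"
			 : "+D"(dst), "+S"(src), "+c"(len)
			 :
			 : "memory");
#else
	/* Portable fallback so the sketch builds on other targets. */
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len--)
		*d++ = *s++;
#endif
	return ret;
}
```

Letting the constraints place the operands avoids the explicit movq shuffling and the hand-written clobber list.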
Link: https://bugs.chromium.org/p/chromium/issues/detail?id=984056
Reported-by: Vaibhav Rustagi vaibhavrustagi@google.com
Tested-by: Vaibhav Rustagi vaibhavrustagi@google.com
Debugged-by: Vaibhav Rustagi vaibhavrustagi@google.com
Debugged-by: Manoj Gupta manojgupta@google.com
Suggested-by: Alistair Delva adelva@google.com
Signed-off-by: Vaibhav Rustagi vaibhavrustagi@google.com
Signed-off-by: Nick Desaulniers ndesaulniers@google.com
 arch/x86/purgatory/Makefile    |  3 +++
 arch/x86/purgatory/purgatory.c |  6 ++++++
 arch/x86/purgatory/string.c    | 23 -----------------------
 3 files changed, 9 insertions(+), 23 deletions(-)
 delete mode 100644 arch/x86/purgatory/string.c
diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index 3589ec4a28c7..84b8314ddb2d 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -6,6 +6,9 @@ purgatory-y := purgatory.o stack.o setup-x86_$(BITS).o sha256.o entry64.o string
 targets += $(purgatory-y)
 PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))
 
+$(obj)/string.o: $(srctree)/arch/x86/boot/compressed/string.c FORCE
+	$(call if_changed_rule,cc_o_c)
+
 $(obj)/sha256.o: $(srctree)/lib/sha256.c FORCE
 	$(call if_changed_rule,cc_o_c)
diff --git a/arch/x86/purgatory/purgatory.c b/arch/x86/purgatory/purgatory.c
index 6d8d5a34c377..b607bda786f6 100644
--- a/arch/x86/purgatory/purgatory.c
+++ b/arch/x86/purgatory/purgatory.c
@@ -68,3 +68,9 @@ void purgatory(void)
 	}
 	copy_backup_region();
 }
+
+/*
+ * Defined in order to reuse memcpy() and memset() from
+ * arch/x86/boot/compressed/string.c
+ */
+void warn(const char *msg) {}
This is the one part I feel bad about; memcpy() in arch/x86/boot/compressed/string.c calls warn() which would result in an undefined symbol in purgatory.ro. Maybe there's a preferred solution, or this is ok for purgatory/kexec? There's other x86 memsets+memcpys, but IMO this is the smallest incision without playing the satisfy-the-symbol-dependencies game.
If the maintainers are ok with this, then the series looks ready to go to me. Thanks for debugging/sending Vaibhav.
Orthogonally, I showed Hans Boehm the pointer comparisons+subtraction in arch/x86/boot/compressed/string.c's memcpy asking about pointer provenance issues (https://wiki.sei.cmu.edu/confluence/display/c/ARR36-C.+Do+not+subtract+or+co..., http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2090.htm) introduced in commit 00ec2c37031e ("x86/boot: Warn on future overlapping memcpy() use") and he started cursing in Spanish (I don't think he speaks Spanish) and performed the sign of the cross. Y'all need <strikethrough>Jesus</strikethrough>[u]intptr_t.
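To make the [u]intptr_t suggestion concrete, here is a hedged sketch of an overlap check done on integer values instead of on raw pointers. The helper name forward_copy_clobbers is hypothetical and is not what arch/x86/boot/compressed/string.c contains. It assumes a flat address space in which the pointer-to-uintptr_t conversion (implementation-defined in C) yields comparable addresses, as on x86-64 Linux.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: report whether a forward copy of len bytes from
 * src to dst would overwrite source bytes before they are read, i.e.
 * whether dst lies strictly inside (src, src + len).
 */
bool forward_copy_clobbers(const void *dst, const void *src, size_t len)
{
	/* Compare integers, not pointers into possibly different objects. */
	uintptr_t d = (uintptr_t)dst;
	uintptr_t s = (uintptr_t)src;

	return d > s && d - s < len;
}
```

A memcpy() that wants to warn about overlapping use could consult such a check before choosing between a forward copy and memmove().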
diff --git a/arch/x86/purgatory/string.c b/arch/x86/purgatory/string.c
deleted file mode 100644
index 01ad43873ad9..000000000000
--- a/arch/x86/purgatory/string.c
+++ /dev/null
@@ -1,23 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * Simple string functions.
- *
- * Copyright (C) 2014 Red Hat Inc.
- *
- * Author:
- *       Vivek Goyal <vgoyal@redhat.com>
- */
-
-#include <linux/types.h>
-
-#include "../boot/string.c"
-
-void *memcpy(void *dst, const void *src, size_t len)
-{
-	return __builtin_memcpy(dst, src, len);
-}
-
-void *memset(void *dst, int c, size_t len)
-{
-	return __builtin_memset(dst, c, len);
-}
2.22.0.510.g264f2c817a-goog