LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings.
`stpcpy` is just like `strcpy` except: 1. it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement. 2. it requires the parameters not to overlap. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` was first standardized in POSIX.1-2008.
Implement this so that we don't observe linkage failures due to missing symbol definitions for `stpcpy`.
Similar to last year's fire drill with: commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
This optimization was introduced into clang-12.
Cc: stable@vger.kernel.org Link: https://bugs.llvm.org/show_bug.cgi?id=47162 Link: https://github.com/ClangBuiltLinux/linux/issues/1126 Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html Link: https://reviews.llvm.org/D85963 Reported-by: Sami Tolvanen samitolvanen@google.com Signed-off-by: Nick Desaulniers ndesaulniers@google.com --- include/linux/string.h | 3 +++ lib/string.c | 23 +++++++++++++++++++++++ 2 files changed, 26 insertions(+)
diff --git a/include/linux/string.h b/include/linux/string.h index b1f3894a0a3e..e570b9b10f50 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -31,6 +31,9 @@ size_t strlcpy(char *, const char *, size_t); #ifndef __HAVE_ARCH_STRSCPY ssize_t strscpy(char *, const char *, size_t); #endif +#ifndef __HAVE_ARCH_STPCPY +extern char *stpcpy(char *__restrict, const char *__restrict__); +#endif
/* Wraps calls to strscpy()/memset(), no arch specific code required */ ssize_t strscpy_pad(char *dest, const char *src, size_t count); diff --git a/lib/string.c b/lib/string.c index 6012c385fb31..81bc4d62c256 100644 --- a/lib/string.c +++ b/lib/string.c @@ -272,6 +272,29 @@ ssize_t strscpy_pad(char *dest, const char *src, size_t count) } EXPORT_SYMBOL(strscpy_pad);
+#ifndef __HAVE_ARCH_STPCPY +/** + * stpcpy - copy a string from src to dest returning a pointer to the new end + * of dest, including src's NULL terminator. May overrun dest. + * @dest: pointer to end of string being copied into. Must be large enough + * to receive copy. + * @src: pointer to the beginning of string being copied from. Must not overlap + * dest. + * + * stpcpy differs from strcpy in two key ways: + * 1. inputs must not overlap. + * 2. return value is the new NULL terminated character. (for strcpy, the + * return value is a pointer to src. + */ +#undef stpcpy +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src) +{ + while ((*dest++ = *src++) != '\0') + /* nothing */; + return dest; +} +#endif + #ifndef __HAVE_ARCH_STRCAT /** * strcat - Append one %NUL-terminated string to another
On Fri, 2020-08-14 at 17:24 -0700, Nick Desaulniers wrote:
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings.
[]
diff --git a/include/linux/string.h b/include/linux/string.h
[]
@@ -31,6 +31,9 @@ size_t strlcpy(char *, const char *, size_t); #ifndef __HAVE_ARCH_STRSCPY ssize_t strscpy(char *, const char *, size_t); #endif +#ifndef __HAVE_ARCH_STPCPY +extern char *stpcpy(char *__restrict, const char *__restrict__);
Why use two different forms for __restrict and __restrict__? Any real reason to use __restrict__ at all?
It's used nowhere else in the kernel.
$ git grep -w -P '__restrict_{0,2}' scripts/genksyms/keywords.c: // According to rth, c99 defines "_Bool", __restrict", __restrict__", "restrict". KAO scripts/genksyms/keywords.c: { "__restrict__", RESTRICT_KEYW },
On Fri, Aug 14, 2020 at 5:52 PM Joe Perches joe@perches.com wrote:
On Fri, 2020-08-14 at 17:24 -0700, Nick Desaulniers wrote:
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings.
[]
diff --git a/include/linux/string.h b/include/linux/string.h
[]
@@ -31,6 +31,9 @@ size_t strlcpy(char *, const char *, size_t); #ifndef __HAVE_ARCH_STRSCPY ssize_t strscpy(char *, const char *, size_t); #endif +#ifndef __HAVE_ARCH_STPCPY +extern char *stpcpy(char *__restrict, const char *__restrict__);
Why use two different forms for __restrict and __restrict__? Any real reason to use __restrict__ at all?
Bah, sorry, I recently enabled some setting in my ~/.vimrc to help me find my cursor better: " highlight cursor set cursorline set cursorcolumn
Turns out this makes it pretty difficult to see underscores, or the lack thereof. Will fix up.
It's used nowhere else in the kernel.
$ git grep -w -P '__restrict_{0,2}' scripts/genksyms/keywords.c: // According to rth, c99 defines "_Bool", __restrict", __restrict__", "restrict". KAO scripts/genksyms/keywords.c: { "__restrict__", RESTRICT_KEYW },
Hi Nick,
On Fri, Aug 14, 2020 at 5:24 PM Nick Desaulniers ndesaulniers@google.com wrote:
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings.
`stpcpy` is just like `strcpy` except:
- it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement.
- it requires the parameters not to overlap. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` was first standardized in POSIX.1-2008.
Implement this so that we don't observe linkage failures due to missing symbol definitions for `stpcpy`.
Similar to last year's fire drill with: commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
This optimization was introduced into clang-12.
Cc: stable@vger.kernel.org Link: https://bugs.llvm.org/show_bug.cgi?id=47162 Link: https://github.com/ClangBuiltLinux/linux/issues/1126 Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html Link: https://reviews.llvm.org/D85963 Reported-by: Sami Tolvanen samitolvanen@google.com Signed-off-by: Nick Desaulniers ndesaulniers@google.com
include/linux/string.h | 3 +++ lib/string.c | 23 +++++++++++++++++++++++ 2 files changed, 26 insertions(+)
Thank you for the patch! I can confirm that this fixes the build for me with ToT Clang.
Tested-by: Sami Tolvanen samitolvanen@google.com
Sami
On Fri, Aug 14, 2020 at 05:24:15PM -0700, Nick Desaulniers wrote:
+#ifndef __HAVE_ARCH_STPCPY +/**
- stpcpy - copy a string from src to dest returning a pointer to the new end
of dest, including src's NULL terminator. May overrun dest.
- @dest: pointer to end of string being copied into. Must be large enough
to receive copy.
- @src: pointer to the beginning of string being copied from. Must not overlap
dest.
- stpcpy differs from strcpy in two key ways:
- inputs must not overlap.
- return value is the new NULL terminated character. (for strcpy, the
- return value is a pointer to src.
- */
+#undef stpcpy +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src) +{
- while ((*dest++ = *src++) != '\0')
/* nothing */;
- return dest;
+} +#endif
Won't this return a pointer that's one _past_ the terminating NUL? I think you need the increments to be inside the loop body, rather than as part of the condition.
Nit: NUL is more correct than NULL to refer to the string terminator.
On Fri, Aug 14, 2020 at 09:33:10PM -0400, Arvind Sankar wrote:
On Fri, Aug 14, 2020 at 05:24:15PM -0700, Nick Desaulniers wrote:
+#ifndef __HAVE_ARCH_STPCPY +/**
- stpcpy - copy a string from src to dest returning a pointer to the new end
of dest, including src's NULL terminator. May overrun dest.
- @dest: pointer to end of string being copied into. Must be large enough
to receive copy.
- @src: pointer to the beginning of string being copied from. Must not overlap
dest.
- stpcpy differs from strcpy in two key ways:
- inputs must not overlap.
- return value is the new NULL terminated character. (for strcpy, the
- return value is a pointer to src.
- */
+#undef stpcpy +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src) +{
- while ((*dest++ = *src++) != '\0')
/* nothing */;
- return dest;
+} +#endif
Won't this return a pointer that's one _past_ the terminating NUL? I think you need the increments to be inside the loop body, rather than as part of the condition.
Nit: NUL is more correct than NULL to refer to the string terminator.
Also, strcpy (at least the one in the C standard) is undefined if the strings overlap, so that's not really a difference.
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` is just like `strcpy` except it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement.
`stpcpy` was first standardized in POSIX.1-2008.
Implement this so that we don't observe linkage failures due to missing symbol definitions for `stpcpy`.
Similar to last year's fire drill with: commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
This optimization was introduced into clang-12.
Cc: stable@vger.kernel.org Link: https://bugs.llvm.org/show_bug.cgi?id=47162 Link: https://github.com/ClangBuiltLinux/linux/issues/1126 Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html Link: https://reviews.llvm.org/D85963 Suggested-by: Arvind Sankar nivedita@alum.mit.edu Suggested-by: Joe Perches joe@perches.com Reported-by: Sami Tolvanen samitolvanen@google.com Tested-by: Sami Tolvanen samitolvanen@google.com Signed-off-by: Nick Desaulniers ndesaulniers@google.com --- Changes V2: * Added Sami's Tested by; though the patch changed implementation, the missing symbol at link time was the problem Sami was observing. * Fix __restrict -> __restrict__ typo as per Joe. * Drop note about restrict from commit message as per Arvind. * Fix NULL -> NUL as per Arvind; NUL is ASCII '\0'. TIL * Fix off by one error as per Arvind; I had another off by one error in my test program that was masking this.
include/linux/string.h | 3 +++ lib/string.c | 23 +++++++++++++++++++++++ 2 files changed, 26 insertions(+)
diff --git a/include/linux/string.h b/include/linux/string.h index b1f3894a0a3e..7686dbca8582 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -31,6 +31,9 @@ size_t strlcpy(char *, const char *, size_t); #ifndef __HAVE_ARCH_STRSCPY ssize_t strscpy(char *, const char *, size_t); #endif +#ifndef __HAVE_ARCH_STPCPY +extern char *stpcpy(char *__restrict__, const char *__restrict__); +#endif
/* Wraps calls to strscpy()/memset(), no arch specific code required */ ssize_t strscpy_pad(char *dest, const char *src, size_t count); diff --git a/lib/string.c b/lib/string.c index 6012c385fb31..68ddbffbbd58 100644 --- a/lib/string.c +++ b/lib/string.c @@ -272,6 +272,29 @@ ssize_t strscpy_pad(char *dest, const char *src, size_t count) } EXPORT_SYMBOL(strscpy_pad);
+#ifndef __HAVE_ARCH_STPCPY +/** + * stpcpy - copy a string from src to dest returning a pointer to the new end + * of dest, including src's NUL terminator. May overrun dest. + * @dest: pointer to end of string being copied into. Must be large enough + * to receive copy. + * @src: pointer to the beginning of string being copied from. Must not overlap + * dest. + * + * stpcpy differs from strcpy in two key ways: + * 1. inputs must not overlap. + * 2. return value is the new NULL terminated character. (for strcpy, the + * return value is a pointer to src. + */ +#undef stpcpy +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src) +{ + while ((*dest++ = *src++) != '\0') + /* nothing */; + return --dest; +} +#endif + #ifndef __HAVE_ARCH_STRCAT /** * strcat - Append one %NUL-terminated string to another
On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote:
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` is just like `strcpy` except it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement.
`stpcpy` was first standardized in POSIX.1-2008.
Implement this so that we don't observe linkage failures due to missing symbol definitions for `stpcpy`.
Similar to last year's fire drill with: commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
This optimization was introduced into clang-12.
Cc: stable@vger.kernel.org Link: https://bugs.llvm.org/show_bug.cgi?id=47162 Link: https://github.com/ClangBuiltLinux/linux/issues/1126 Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html Link: https://reviews.llvm.org/D85963 Suggested-by: Arvind Sankar nivedita@alum.mit.edu Suggested-by: Joe Perches joe@perches.com Reported-by: Sami Tolvanen samitolvanen@google.com Tested-by: Sami Tolvanen samitolvanen@google.com Signed-off-by: Nick Desaulniers ndesaulniers@google.com
Changes V2:
- Added Sami's Tested by; though the patch changed implementation, the missing symbol at link time was the problem Sami was observing.
- Fix __restrict -> __restrict__ typo as per Joe.
- Drop note about restrict from commit message as per Arvind.
- Fix NULL -> NUL as per Arvind; NUL is ASCII '\0'. TIL
- Fix off by one error as per Arvind; I had another off by one error in my test program that was masking this.
include/linux/string.h | 3 +++ lib/string.c | 23 +++++++++++++++++++++++ 2 files changed, 26 insertions(+)
diff --git a/include/linux/string.h b/include/linux/string.h index b1f3894a0a3e..7686dbca8582 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -31,6 +31,9 @@ size_t strlcpy(char *, const char *, size_t); #ifndef __HAVE_ARCH_STRSCPY ssize_t strscpy(char *, const char *, size_t); #endif +#ifndef __HAVE_ARCH_STPCPY +extern char *stpcpy(char *__restrict__, const char *__restrict__); +#endif /* Wraps calls to strscpy()/memset(), no arch specific code required */ ssize_t strscpy_pad(char *dest, const char *src, size_t count); diff --git a/lib/string.c b/lib/string.c index 6012c385fb31..68ddbffbbd58 100644 --- a/lib/string.c +++ b/lib/string.c @@ -272,6 +272,29 @@ ssize_t strscpy_pad(char *dest, const char *src, size_t count) } EXPORT_SYMBOL(strscpy_pad); +#ifndef __HAVE_ARCH_STPCPY +/**
- stpcpy - copy a string from src to dest returning a pointer to the new end
of dest, including src's NUL terminator. May overrun dest.
- @dest: pointer to end of string being copied into. Must be large enough
to receive copy.
- @src: pointer to the beginning of string being copied from. Must not overlap
dest.
- stpcpy differs from strcpy in two key ways:
- inputs must not overlap.
Looks like you missed my second email: strcpy also does not allow inputs to overlap. Couple typos below.
- return value is the new NULL terminated character. (for strcpy, the
^^ NUL terminator.
- return value is a pointer to src.
^^ dest.)
- */
+#undef stpcpy +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src) +{
- while ((*dest++ = *src++) != '\0')
/* nothing */;
- return --dest;
+} +#endif
#ifndef __HAVE_ARCH_STRCAT /**
- strcat - Append one %NUL-terminated string to another
-- 2.28.0.220.ged08abb693-goog
The kernel-doc comments in string.c currently have a mix of %NUL and NUL, but the former seems to be more common. %NUL-terminator appears to be the preferred wording.
On Fri, 2020-08-14 at 19:09 -0700, Nick Desaulniers wrote:
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` is just like `strcpy` except it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement.
[]
diff --git a/include/linux/string.h b/include/linux/string.h
[]
@@ -31,6 +31,9 @@ size_t strlcpy(char *, const char *, size_t); #ifndef __HAVE_ARCH_STRSCPY ssize_t strscpy(char *, const char *, size_t); #endif +#ifndef __HAVE_ARCH_STPCPY +extern char *stpcpy(char *__restrict__, const char *__restrict__);
If __restrict__ is used, perhaps it should follow the kernel style used by attributes like __iomem and __user
extern char *stpcpy(char __restrict *dest, const char __restrict *src);
(though I would lose the extern too)
char *stpcpy(char __restrict *dest, const char __restrict *src);
On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote:
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` is just like `strcpy` except it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement.
O_O What?
No; this is a _terrible_ API: there is no bounds checking, there are no buffer sizes. Anything using the example sprintf() pattern is _already_ wrong and must be removed from the kernel. (Yes, I realize that the kernel is *filled* with this bad assumption that "I'll never write more than PAGE_SIZE bytes to this buffer", but that's both theoretically wrong ("640k is enough for anybody") and has been known to be wrong in practice too (e.g. when suddenly your writing routine is reachable by splice(2) and you may not have a PAGE_SIZE buffer).
But we cannot _add_ another dangerous string API. We're already in a terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This needs to be addressed up by removing the unbounded sprintf() uses. (And to do so without introducing bugs related to using snprintf() when scnprintf() is expected[4].)
-Kees
[1] https://github.com/KSPP/linux/issues/88 [2] https://github.com/KSPP/linux/issues/89 [3] https://github.com/KSPP/linux/issues/90 [4] https://lore.kernel.org/lkml/20200810092100.GA42813@2f5448a72a42/
Yeah, sprintf calls should be replaced with something safer.
Dňa 15. 8. 2020 o 18:34 užívateľ Kees Cook keescook@chromium.org napísal:
On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote:
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` is just like `strcpy` except it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement.
O_O What?
No; this is a _terrible_ API: there is no bounds checking, there are no buffer sizes. Anything using the example sprintf() pattern is _already_ wrong and must be removed from the kernel. (Yes, I realize that the kernel is *filled* with this bad assumption that "I'll never write more than PAGE_SIZE bytes to this buffer", but that's both theoretically wrong ("640k is enough for anybody") and has been known to be wrong in practice too (e.g. when suddenly your writing routine is reachable by splice(2) and you may not have a PAGE_SIZE buffer).
But we cannot _add_ another dangerous string API. We're already in a terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This needs to be addressed up by removing the unbounded sprintf() uses. (And to do so without introducing bugs related to using snprintf() when scnprintf() is expected[4].)
-Kees
[1] https://github.com/KSPP/linux/issues/88 [2] https://github.com/KSPP/linux/issues/89 [3] https://github.com/KSPP/linux/issues/90 [4] https://lore.kernel.org/lkml/20200810092100.GA42813@2f5448a72a42/
-- Kees Cook
On Sat, Aug 15, 2020 at 9:34 AM Kees Cook keescook@chromium.org wrote:
On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote:
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` is just like `strcpy` except it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement.
O_O What?
No; this is a _terrible_ API: there is no bounds checking, there are no buffer sizes. Anything using the example sprintf() pattern is _already_ wrong and must be removed from the kernel. (Yes, I realize that the kernel is *filled* with this bad assumption that "I'll never write more than PAGE_SIZE bytes to this buffer", but that's both theoretically wrong ("640k is enough for anybody") and has been known to be wrong in practice too (e.g. when suddenly your writing routine is reachable by splice(2) and you may not have a PAGE_SIZE buffer).
But we cannot _add_ another dangerous string API. We're already in a terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This needs to be addressed up by removing the unbounded sprintf() uses. (And to do so without introducing bugs related to using snprintf() when scnprintf() is expected[4].)
Well, everything (-next, mainline, stable) is broken right now (with ToT Clang) without providing this symbol. I'm not going to go clean the entire kernel's use of sprintf to get our CI back to being green.
Thoughts on not exposing the declaration in the header, and changing the comment to be to the effect of: "Exists only for optimizing compilers to replace calls to sprintf with; generally unsafe as bounds checks aren't performed, do not use." It's still a worthwhile optimization to have, even if the use that generated it didn't do any bounds checking.
-Kees
[1] https://github.com/KSPP/linux/issues/88 [2] https://github.com/KSPP/linux/issues/89 [3] https://github.com/KSPP/linux/issues/90 [4] https://lore.kernel.org/lkml/20200810092100.GA42813@2f5448a72a42/
-- Kees Cook
On Sat, 2020-08-15 at 13:47 -0700, Nick Desaulniers wrote:
On Sat, Aug 15, 2020 at 9:34 AM Kees Cook keescook@chromium.org wrote:
On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote:
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` is just like `strcpy` except it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement.
O_O What?
No; this is a _terrible_ API: there is no bounds checking, there are no buffer sizes. Anything using the example sprintf() pattern is _already_ wrong and must be removed from the kernel. (Yes, I realize that the kernel is *filled* with this bad assumption that "I'll never write more than PAGE_SIZE bytes to this buffer", but that's both theoretically wrong ("640k is enough for anybody") and has been known to be wrong in practice too (e.g. when suddenly your writing routine is reachable by splice(2) and you may not have a PAGE_SIZE buffer).
But we cannot _add_ another dangerous string API. We're already in a terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This needs to be addressed up by removing the unbounded sprintf() uses. (And to do so without introducing bugs related to using snprintf() when scnprintf() is expected[4].)
Well, everything (-next, mainline, stable) is broken right now (with ToT Clang) without providing this symbol. I'm not going to go clean the entire kernel's use of sprintf to get our CI back to being green.
Maybe this should get place in compiler-clang.h so it isn't generic and public.
Something like:
--- include/linux/compiler-clang.h | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h index cee0c728d39a..6279f1904e39 100644 --- a/include/linux/compiler-clang.h +++ b/include/linux/compiler-clang.h @@ -61,3 +61,30 @@ #if __has_feature(shadow_call_stack) # define __noscs __attribute__((__no_sanitize__("shadow-call-stack"))) #endif + +#ifndef __HAVE_ARCH_STPCPY +/** + * stpcpy - copy a string from src to dest returning a pointer to the new end + * of dest, including src's NULL terminator. May overrun dest. + * @dest: pointer to buffer being copied into. + * Must be large enough to receive copy. + * @src: pointer to the beginning of string being copied from. + * Must not overlap dest. + * + * This function exists _only_ to support clang's possible conversion of + * sprintf calls to stpcpy. + * + * stpcpy differs from strcpy in two key ways: + * 1. inputs must not overlap. + * 2. return value is dest's NUL termination character after copy. + * (for strcpy, the return value is a pointer to src) + */ + +static inline char *stpcpy(char __restrict *dest, const char __restrict *src) +{ + while ((*dest++ = *src++) != '\0') { + ; /* nothing */ + } + return --dest; +} +#endif
-fno-builtin-stpcpy can be used to disable stpcpy but Nick at llvm bugzilla wrote that these flags are broken with LTO.
Dňa 15. 8. 2020 o 23:24 užívateľ Joe Perches joe@perches.com napísal:
On Sat, 2020-08-15 at 13:47 -0700, Nick Desaulniers wrote:
On Sat, Aug 15, 2020 at 9:34 AM Kees Cook keescook@chromium.org wrote: On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote:
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` is just like `strcpy` except it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement.
O_O What?
No; this is a _terrible_ API: there is no bounds checking, there are no buffer sizes. Anything using the example sprintf() pattern is _already_ wrong and must be removed from the kernel. (Yes, I realize that the kernel is *filled* with this bad assumption that "I'll never write more than PAGE_SIZE bytes to this buffer", but that's both theoretically wrong ("640k is enough for anybody") and has been known to be wrong in practice too (e.g. when suddenly your writing routine is reachable by splice(2) and you may not have a PAGE_SIZE buffer).
But we cannot _add_ another dangerous string API. We're already in a terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This needs to be addressed up by removing the unbounded sprintf() uses. (And to do so without introducing bugs related to using snprintf() when scnprintf() is expected[4].)
Well, everything (-next, mainline, stable) is broken right now (with ToT Clang) without providing this symbol. I'm not going to go clean the entire kernel's use of sprintf to get our CI back to being green.
Maybe this should get place in compiler-clang.h so it isn't generic and public.
Something like:
include/linux/compiler-clang.h | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h index cee0c728d39a..6279f1904e39 100644 --- a/include/linux/compiler-clang.h +++ b/include/linux/compiler-clang.h @@ -61,3 +61,30 @@ #if __has_feature(shadow_call_stack) # define __noscs __attribute__((__no_sanitize__("shadow-call-stack"))) #endif
+#ifndef __HAVE_ARCH_STPCPY +/**
- stpcpy - copy a string from src to dest returning a pointer to the new end
of dest, including src's NULL terminator. May overrun dest.
- @dest: pointer to buffer being copied into.
Must be large enough to receive copy.
- @src: pointer to the beginning of string being copied from.
Must not overlap dest.
- This function exists _only_ to support clang's possible conversion of
- sprintf calls to stpcpy.
- stpcpy differs from strcpy in two key ways:
- inputs must not overlap.
- return value is dest's NUL termination character after copy.
- (for strcpy, the return value is a pointer to src)
- */
+static inline char *stpcpy(char __restrict *dest, const char __restrict *src) +{
- while ((*dest++ = *src++) != '\0') {
; /* nothing */
- }
return --dest;
+} +#endif
On Sat, Aug 15, 2020 at 2:24 PM Joe Perches joe@perches.com wrote:
On Sat, 2020-08-15 at 13:47 -0700, Nick Desaulniers wrote:
On Sat, Aug 15, 2020 at 9:34 AM Kees Cook keescook@chromium.org wrote:
On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote:
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` is just like `strcpy` except it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement.
O_O What?
No; this is a _terrible_ API: there is no bounds checking, there are no buffer sizes. Anything using the example sprintf() pattern is _already_ wrong and must be removed from the kernel. (Yes, I realize that the kernel is *filled* with this bad assumption that "I'll never write more than PAGE_SIZE bytes to this buffer", but that's both theoretically wrong ("640k is enough for anybody") and has been known to be wrong in practice too (e.g. when suddenly your writing routine is reachable by splice(2) and you may not have a PAGE_SIZE buffer).
But we cannot _add_ another dangerous string API. We're already in a terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This needs to be addressed up by removing the unbounded sprintf() uses. (And to do so without introducing bugs related to using snprintf() when scnprintf() is expected[4].)
Well, everything (-next, mainline, stable) is broken right now (with ToT Clang) without providing this symbol. I'm not going to go clean the entire kernel's use of sprintf to get our CI back to being green.
Maybe this should get place in compiler-clang.h so it isn't generic and public.
https://bugs.llvm.org/show_bug.cgi?id=47162#c7 and https://bugs.llvm.org/show_bug.cgi?id=47144 Seem to imply that Clang is not the only compiler that can lower a sequence of libcalls to stpcpy. Do we want to wait until we have a fire drill w/ GCC to move such an implementation from include/linux/compiler-clang.h back in to lib/string.c?
Something like:
include/linux/compiler-clang.h | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h index cee0c728d39a..6279f1904e39 100644 --- a/include/linux/compiler-clang.h +++ b/include/linux/compiler-clang.h @@ -61,3 +61,30 @@ #if __has_feature(shadow_call_stack) # define __noscs __attribute__((__no_sanitize__("shadow-call-stack"))) #endif
+#ifndef __HAVE_ARCH_STPCPY +/**
- stpcpy - copy a string from src to dest returning a pointer to the new end
of dest, including src's NULL terminator. May overrun dest.
- @dest: pointer to buffer being copied into.
Must be large enough to receive copy.
- @src: pointer to the beginning of string being copied from.
Must not overlap dest.
- This function exists _only_ to support clang's possible conversion of
- sprintf calls to stpcpy.
- stpcpy differs from strcpy in two key ways:
- inputs must not overlap.
- return value is dest's NUL termination character after copy.
- (for strcpy, the return value is a pointer to src)
- */
+static inline char *stpcpy(char __restrict *dest, const char __restrict *src) +{
while ((*dest++ = *src++) != '\0') {
; /* nothing */
}
return --dest;
+} +#endif
On Sat, 2020-08-15 at 14:28 -0700, Nick Desaulniers wrote:
On Sat, Aug 15, 2020 at 2:24 PM Joe Perches joe@perches.com wrote:
On Sat, 2020-08-15 at 13:47 -0700, Nick Desaulniers wrote:
On Sat, Aug 15, 2020 at 9:34 AM Kees Cook keescook@chromium.org wrote:
On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote:
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` is just like `strcpy` except it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement.
O_O What?
No; this is a _terrible_ API: there is no bounds checking, there are no buffer sizes. Anything using the example sprintf() pattern is _already_ wrong and must be removed from the kernel. (Yes, I realize that the kernel is *filled* with this bad assumption that "I'll never write more than PAGE_SIZE bytes to this buffer", but that's both theoretically wrong ("640k is enough for anybody") and has been known to be wrong in practice too (e.g. when suddenly your writing routine is reachable by splice(2) and you may not have a PAGE_SIZE buffer).
But we cannot _add_ another dangerous string API. We're already in a terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This needs to be addressed up by removing the unbounded sprintf() uses. (And to do so without introducing bugs related to using snprintf() when scnprintf() is expected[4].)
Well, everything (-next, mainline, stable) is broken right now (with ToT Clang) without providing this symbol. I'm not going to go clean the entire kernel's use of sprintf to get our CI back to being green.
Maybe this should get place in compiler-clang.h so it isn't generic and public.
https://bugs.llvm.org/show_bug.cgi?id=47162#c7 and https://bugs.llvm.org/show_bug.cgi?id=47144 Seem to imply that Clang is not the only compiler that can lower a sequence of libcalls to stpcpy. Do we want to wait until we have a fire drill w/ GCC to move such an implementation from include/linux/compiler-clang.h back in to lib/string.c?
My guess is yes, wait until gcc, if ever, needs it.
On Sat, Aug 15, 2020 at 2:31 PM Joe Perches joe@perches.com wrote:
On Sat, 2020-08-15 at 14:28 -0700, Nick Desaulniers wrote:
On Sat, Aug 15, 2020 at 2:24 PM Joe Perches joe@perches.com wrote:
On Sat, 2020-08-15 at 13:47 -0700, Nick Desaulniers wrote:
On Sat, Aug 15, 2020 at 9:34 AM Kees Cook keescook@chromium.org wrote:
On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote:
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` is just like `strcpy` except it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement.
O_O What?
No; this is a _terrible_ API: there is no bounds checking, there are no buffer sizes. Anything using the example sprintf() pattern is _already_ wrong and must be removed from the kernel. (Yes, I realize that the kernel is *filled* with this bad assumption that "I'll never write more than PAGE_SIZE bytes to this buffer", but that's both theoretically wrong ("640k is enough for anybody") and has been known to be wrong in practice too (e.g. when suddenly your writing routine is reachable by splice(2) and you may not have a PAGE_SIZE buffer).
But we cannot _add_ another dangerous string API. We're already in a terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This needs to be addressed up by removing the unbounded sprintf() uses. (And to do so without introducing bugs related to using snprintf() when scnprintf() is expected[4].)
Well, everything (-next, mainline, stable) is broken right now (with ToT Clang) without providing this symbol. I'm not going to go clean the entire kernel's use of sprintf to get our CI back to being green.
Maybe this should get place in compiler-clang.h so it isn't generic and public.
https://bugs.llvm.org/show_bug.cgi?id=47162#c7 and https://bugs.llvm.org/show_bug.cgi?id=47144 Seem to imply that Clang is not the only compiler that can lower a sequence of libcalls to stpcpy. Do we want to wait until we have a fire drill w/ GCC to move such an implementation from include/linux/compiler-clang.h back in to lib/string.c?
My guess is yes, wait until gcc, if ever, needs it.
The suggestion to use static inline doesn't even make sense. The compiler is lowering calls to other library routines; `stpcpy` isn't being explicitly called. Even if it was, not sure we want it being inlined. No symbol definition will be emitted; problem not solved. And I refuse to add any more code using `extern inline`. Putting the definition in lib/string.c is the most straightforward and avoids revisiting this issue in the future for other toolchains. I'll limit access by removing the declaration, and adding a comment to avoid its use. But if you're going to use a gnu target triple without using -ffreestanding because you *want* libcall optimizations, then you have to provide symbols for all possible library routines!
On 2020-08-15, 'Nick Desaulniers' via Clang Built Linux wrote:
On Sat, Aug 15, 2020 at 2:31 PM Joe Perches joe@perches.com wrote:
On Sat, 2020-08-15 at 14:28 -0700, Nick Desaulniers wrote:
On Sat, Aug 15, 2020 at 2:24 PM Joe Perches joe@perches.com wrote:
On Sat, 2020-08-15 at 13:47 -0700, Nick Desaulniers wrote:
On Sat, Aug 15, 2020 at 9:34 AM Kees Cook keescook@chromium.org wrote:
On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote: > LLVM implemented a recent "libcall optimization" that lowers calls to > `sprintf(dest, "%s", str)` where the return value is used to > `stpcpy(dest, str) - dest`. This generally avoids the machinery involved > in parsing format strings. Calling `sprintf` with overlapping arguments > was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior. > > `stpcpy` is just like `strcpy` except it returns the pointer to the new > tail of `dest`. This allows you to chain multiple calls to `stpcpy` in > one statement.
O_O What?
No; this is a _terrible_ API: there is no bounds checking, there are no buffer sizes. Anything using the example sprintf() pattern is _already_ wrong and must be removed from the kernel. (Yes, I realize that the kernel is *filled* with this bad assumption that "I'll never write more than PAGE_SIZE bytes to this buffer", but that's both theoretically wrong ("640k is enough for anybody") and has been known to be wrong in practice too (e.g. when suddenly your writing routine is reachable by splice(2) and you may not have a PAGE_SIZE buffer).
But we cannot _add_ another dangerous string API. We're already in a terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This needs to be addressed up by removing the unbounded sprintf() uses. (And to do so without introducing bugs related to using snprintf() when scnprintf() is expected[4].)
Well, everything (-next, mainline, stable) is broken right now (with ToT Clang) without providing this symbol. I'm not going to go clean the entire kernel's use of sprintf to get our CI back to being green.
Maybe this should get place in compiler-clang.h so it isn't generic and public.
https://bugs.llvm.org/show_bug.cgi?id=47162#c7 and https://bugs.llvm.org/show_bug.cgi?id=47144 Seem to imply that Clang is not the only compiler that can lower a sequence of libcalls to stpcpy. Do we want to wait until we have a fire drill w/ GCC to move such an implementation from include/linux/compiler-clang.h back in to lib/string.c?
My guess is yes, wait until gcc, if ever, needs it.
The suggestion to use static inline doesn't even make sense. The compiler is lowering calls to other library routines; `stpcpy` isn't being explicitly called. Even if it was, not sure we want it being inlined. No symbol definition will be emitted; problem not solved. And I refuse to add any more code using `extern inline`. Putting the definition in lib/string.c is the most straightforward and avoids revisiting this issue in the future for other toolchains. I'll limit access by removing the declaration, and adding a comment to avoid its use. But if you're going to use a gnu target triple without using -ffreestanding because you *want* libcall optimizations, then you have to provide symbols for all possible library routines!
Adding a definition without a declaration for stpcpy looks good. Clang LTO will work.
(If the kernel does not want to provide these routines, is http://git.kernel.org/linus/6edfba1b33c701108717f4e036320fc39abe1912 probably wrong? (why remove -ffreestanding from the main Makefile) )
On Sun, Aug 16, 2020 at 2:19 AM 'Fangrui Song' via Clang Built Linux clang-built-linux@googlegroups.com wrote:
On 2020-08-15, 'Nick Desaulniers' via Clang Built Linux wrote:
On Sat, Aug 15, 2020 at 2:31 PM Joe Perches joe@perches.com wrote:
On Sat, 2020-08-15 at 14:28 -0700, Nick Desaulniers wrote:
On Sat, Aug 15, 2020 at 2:24 PM Joe Perches joe@perches.com wrote:
On Sat, 2020-08-15 at 13:47 -0700, Nick Desaulniers wrote:
On Sat, Aug 15, 2020 at 9:34 AM Kees Cook keescook@chromium.org wrote: > On Fri, Aug 14, 2020 at 07:09:44PM -0700, Nick Desaulniers wrote: > > LLVM implemented a recent "libcall optimization" that lowers calls to > > `sprintf(dest, "%s", str)` where the return value is used to > > `stpcpy(dest, str) - dest`. This generally avoids the machinery involved > > in parsing format strings. Calling `sprintf` with overlapping arguments > > was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior. > > > > `stpcpy` is just like `strcpy` except it returns the pointer to the new > > tail of `dest`. This allows you to chain multiple calls to `stpcpy` in > > one statement. > > O_O What? > > No; this is a _terrible_ API: there is no bounds checking, there are no > buffer sizes. Anything using the example sprintf() pattern is _already_ > wrong and must be removed from the kernel. (Yes, I realize that the > kernel is *filled* with this bad assumption that "I'll never write more > than PAGE_SIZE bytes to this buffer", but that's both theoretically > wrong ("640k is enough for anybody") and has been known to be wrong in > practice too (e.g. when suddenly your writing routine is reachable by > splice(2) and you may not have a PAGE_SIZE buffer). > > But we cannot _add_ another dangerous string API. We're already in a > terrible mess trying to remove strcpy[1], strlcpy[2], and strncpy[3]. This > needs to be addressed up by removing the unbounded sprintf() uses. (And > to do so without introducing bugs related to using snprintf() when > scnprintf() is expected[4].)
Well, everything (-next, mainline, stable) is broken right now (with ToT Clang) without providing this symbol. I'm not going to go clean the entire kernel's use of sprintf to get our CI back to being green.
Maybe this should get place in compiler-clang.h so it isn't generic and public.
https://bugs.llvm.org/show_bug.cgi?id=47162#c7 and https://bugs.llvm.org/show_bug.cgi?id=47144 Seem to imply that Clang is not the only compiler that can lower a sequence of libcalls to stpcpy. Do we want to wait until we have a fire drill w/ GCC to move such an implementation from include/linux/compiler-clang.h back in to lib/string.c?
My guess is yes, wait until gcc, if ever, needs it.
The suggestion to use static inline doesn't even make sense. The compiler is lowering calls to other library routines; `stpcpy` isn't being explicitly called. Even if it was, not sure we want it being inlined. No symbol definition will be emitted; problem not solved. And I refuse to add any more code using `extern inline`. Putting the definition in lib/string.c is the most straightforward and avoids revisiting this issue in the future for other toolchains. I'll limit access by removing the declaration, and adding a comment to avoid its use. But if you're going to use a gnu target triple without using -ffreestanding because you *want* libcall optimizations, then you have to provide symbols for all possible library routines!
Adding a definition without a declaration for stpcpy looks good. Clang LTO will work.
(If the kernel does not want to provide these routines, is http://git.kernel.org/linus/6edfba1b33c701108717f4e036320fc39abe1912 probably wrong? (why remove -ffreestanding from the main Makefile) )
We had some many issues in arch/x86 where *FLAGS were removed or used differently and had to re-add them :-(.
So if -ffreestanding is used in arch/x86 and was! used in top-level Makefile - it makes sense to re-add it back? ( I cannot speak for archs other than x86. )
- Sedat -
On Sun, Aug 16, 2020 at 07:22:35AM +0200, Sedat Dilek wrote:
On Sun, Aug 16, 2020 at 2:19 AM 'Fangrui Song' via Clang Built Linux clang-built-linux@googlegroups.com wrote:
Adding a definition without a declaration for stpcpy looks good. Clang LTO will work.
(If the kernel does not want to provide these routines, is http://git.kernel.org/linus/6edfba1b33c701108717f4e036320fc39abe1912 probably wrong? (why remove -ffreestanding from the main Makefile) )
We had some many issues in arch/x86 where *FLAGS were removed or used differently and had to re-add them :-(.
So if -ffreestanding is used in arch/x86 and was! used in top-level Makefile - it makes sense to re-add it back? ( I cannot speak for archs other than x86. )
- Sedat -
-ffreestanding disables _all_ builtins and libcall optimizations, which is probably not desirable. If we added it back, we'd need to also go back to #define various string functions to the __builtin versions.
Though I don't understand the original issue, with -ffreestanding, sprintf shouldn't have been turned into strcpy in the first place.
32-bit still has -ffreestanding -- I wonder if that's actually necessary any more?
Why does -fno-builtin-stpcpy not work with clang LTO? Isn't that a compiler bug?
Thanks.
On Sun, Aug 16, 2020 at 8:02 AM Arvind Sankar nivedita@alum.mit.edu wrote:
On Sun, Aug 16, 2020 at 07:22:35AM +0200, Sedat Dilek wrote:
On Sun, Aug 16, 2020 at 2:19 AM 'Fangrui Song' via Clang Built Linux clang-built-linux@googlegroups.com wrote:
Adding a definition without a declaration for stpcpy looks good. Clang LTO will work.
(If the kernel does not want to provide these routines, is http://git.kernel.org/linus/6edfba1b33c701108717f4e036320fc39abe1912 probably wrong? (why remove -ffreestanding from the main Makefile) )
We had some many issues in arch/x86 where *FLAGS were removed or used differently and had to re-add them :-(.
So if -ffreestanding is used in arch/x86 and was! used in top-level Makefile - it makes sense to re-add it back? ( I cannot speak for archs other than x86. )
- Sedat -
-ffreestanding disables _all_ builtins and libcall optimizations, which is probably not desirable. If we added it back, we'd need to also go back to #define various string functions to the __builtin versions.
Though I don't understand the original issue, with -ffreestanding, sprintf shouldn't have been turned into strcpy in the first place.
32-bit still has -ffreestanding -- I wonder if that's actually necessary any more?
Why does -fno-builtin-stpcpy not work with clang LTO? Isn't that a compiler bug?
I just confirmed that adding -fno-builtin-stpcpy to KBUILD_CFLAGS does work with LTO as well.
Sami
On Mon, Aug 17, 2020 at 10:14 AM Sami Tolvanen samitolvanen@google.com wrote:
On Sun, Aug 16, 2020 at 8:02 AM Arvind Sankar nivedita@alum.mit.edu wrote:
On Sun, Aug 16, 2020 at 07:22:35AM +0200, Sedat Dilek wrote:
On Sun, Aug 16, 2020 at 2:19 AM 'Fangrui Song' via Clang Built Linux clang-built-linux@googlegroups.com wrote:
Adding a definition without a declaration for stpcpy looks good. Clang LTO will work.
(If the kernel does not want to provide these routines, is http://git.kernel.org/linus/6edfba1b33c701108717f4e036320fc39abe1912 probably wrong? (why remove -ffreestanding from the main Makefile) )
We had some many issues in arch/x86 where *FLAGS were removed or used differently and had to re-add them :-(.
So if -ffreestanding is used in arch/x86 and was! used in top-level Makefile - it makes sense to re-add it back? ( I cannot speak for archs other than x86. )
- Sedat -
-ffreestanding disables _all_ builtins and libcall optimizations, which is probably not desirable. If we added it back, we'd need to also go
I agree.
back to #define various string functions to the __builtin versions.
Though I don't understand the original issue, with -ffreestanding, sprintf shouldn't have been turned into strcpy in the first place.
Huh? The original issue for this thread is because `-ffreestanding` *isn't* being used for most targets (oh boy, actually mixed usage by ARCH. Looks like MIPS, m68k, superH, xtensa, and 32b x86 use it?); and I'm not suggesting it be used.
32-bit still has -ffreestanding -- I wonder if that's actually necessary any more?
Fair question. Someone will have to go chase git history, since 0a6ef376d4ba covers it up. If anyone has any tricks to do so quickly; I'd love to know. I generally checkout the commit prior, then use vim fugitive to get git blame.
Why does -fno-builtin-stpcpy not work with clang LTO? Isn't that a compiler bug?
Yes; Sami found a recent patch that looks to me like it may have recently solved that bug. https://reviews.llvm.org/D71193 which landed Dec 12 2019. The bug report was based on https://github.com/ClangBuiltLinux/linux/issues/416#issuecomment-472231304 (Issue reported March 8 2019). And I do recall being able to reproduce the bug when I sent commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
Now that that is fixed as reported by Sami below, I don't mind sending a revert for 5f074f3e192f that adds -fno-builtin-bcmp, because the current implementation of bcmp isn't useful.
That said, this libcall optimization/transformation (sprintf->stpcpy) does look useful to me. Kees, do you have thoughts on me providing the implementation without exposing it in a header vs using -fno-builtin-stpcpy? (I would need to add the missing EXPORT_SYMBOL, as pointed out by 0day bot and on the github thread). I don't care either way; I'd just like your input before sending a V+1. Maybe better to just not implement it and never implement it?
I just confirmed that adding -fno-builtin-stpcpy to KBUILD_CFLAGS does work with LTO as well.
Sami
On Mon, Aug 17, 2020 at 11:36:49AM -0700, Nick Desaulniers wrote:
That said, this libcall optimization/transformation (sprintf->stpcpy) does look useful to me. Kees, do you have thoughts on me providing the implementation without exposing it in a header vs using -fno-builtin-stpcpy? (I would need to add the missing EXPORT_SYMBOL, as pointed out by 0day bot and on the github thread). I don't care either way; I'd just like your input before sending a V+1. Maybe better to just not implement it and never implement it?
I think I would ultimately prefer -fno-builtin-stpcpy, but for now, sure, an implementation without a header (and a biiig comment above it detailing why and a reference to the bug) would be fine by me.
On Mon, Aug 17, 2020 at 11:36:49AM -0700, Nick Desaulniers wrote:
Though I don't understand the original issue, with -ffreestanding, sprintf shouldn't have been turned into strcpy in the first place.
Huh? The original issue for this thread is because `-ffreestanding` *isn't* being used for most targets (oh boy, actually mixed usage by ARCH. Looks like MIPS, m68k, superH, xtensa, and 32b x86 use it?); and I'm not suggesting it be used.
Sorry, I meant the issue mentioned in the commit that removed -ffreestanding, not the stpcpy one you're solving now. It says that sprintf got converted into strcpy, which caused failures because back then, strcpy was #define'd to __builtin_strcpy, and the default implementation was actually of a function called __builtin_strcpy o_O, not strcpy.
Anyway, that's water under the bridge now.
6edfba1b33c7 ("x86_64: Don't define string functions to builtin") gcc should handle this anyways, and it causes problems when sprintf is turned into strcpy by gcc behind our backs and the C fallback version of strcpy is actually defining __builtin_strcpy
On Mon, Aug 17, 2020 at 1:13 PM Arvind Sankar nivedita@alum.mit.edu wrote:
On Mon, Aug 17, 2020 at 11:36:49AM -0700, Nick Desaulniers wrote:
Though I don't understand the original issue, with -ffreestanding, sprintf shouldn't have been turned into strcpy in the first place.
Huh? The original issue for this thread is because `-ffreestanding` *isn't* being used for most targets (oh boy, actually mixed usage by ARCH. Looks like MIPS, m68k, superH, xtensa, and 32b x86 use it?); and I'm not suggesting it be used.
Sorry, I meant the issue mentioned in the commit that removed -ffreestanding, not the stpcpy one you're solving now. It says that sprintf got converted into strcpy, which caused failures because back then, strcpy was #define'd to __builtin_strcpy, and the default implementation was actually of a function called __builtin_strcpy o_O, not strcpy.
Anyway, that's water under the bridge now.
6edfba1b33c7 ("x86_64: Don't define string functions to builtin") gcc should handle this anyways, and it causes problems when sprintf is turned into strcpy by gcc behind our backs and the C fallback version of strcpy is actually defining __builtin_strcpy
For fun, I tried removing `-ffreestanding` from arch/x86/Makefile; both gcc and clang can compile+boot the i386 defconfig just fine. Why don't I send a patch removing it with your suggested by in a series of fixes for stpcpy and bcmp?
On Mon, Aug 17, 2020 at 10:14:43AM -0700, Sami Tolvanen wrote:
I just confirmed that adding -fno-builtin-stpcpy to KBUILD_CFLAGS does work with LTO as well.
Oh, I read this out of order; sorry! Yes, if -fno-builtin-stpcpy works, let's use that instead. Doesn't that solve it too?
Hi Nick,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on hnaz-linux-mm/master] [also build test ERROR on kees/for-next/pstore linus/master v5.8 next-20200814] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Nick-Desaulniers/lib-string-c-imple... base: https://github.com/hnaz/linux-mm master config: ia64-defconfig (attached as .config) compiler: ia64-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=ia64
If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot lkp@intel.com
All errors (new ones prefixed by >>, old ones prefixed by <<):
ERROR: modpost: "stpcpy" [sound/pci/ac97/snd-ac97-codec.ko] undefined!
--- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Fri, Aug 14, 2020 at 6:33 PM Arvind Sankar nivedita@alum.mit.edu wrote:
On Fri, Aug 14, 2020 at 05:24:15PM -0700, Nick Desaulniers wrote:
+#ifndef __HAVE_ARCH_STPCPY +/**
- stpcpy - copy a string from src to dest returning a pointer to the new end
of dest, including src's NULL terminator. May overrun dest.
- @dest: pointer to end of string being copied into. Must be large enough
to receive copy.
- @src: pointer to the beginning of string being copied from. Must not overlap
dest.
- stpcpy differs from strcpy in two key ways:
- inputs must not overlap.
- return value is the new NULL terminated character. (for strcpy, the
- return value is a pointer to src.
- */
+#undef stpcpy +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src) +{
while ((*dest++ = *src++) != '\0')
/* nothing */;
return dest;
+} +#endif
Won't this return a pointer that's one _past_ the terminating NUL? I think you need the increments to be inside the loop body, rather than as part of the condition.
Yep, looks like I had a bug in my test program that masked this. Thanks for triple checking.
Nit: NUL is more correct than NULL to refer to the string terminator.
TIL.
From: Nick Desaulniers
Sent: 15 August 2020 01:24
LLVM implemented a recent "libcall optimization" that lowers calls to `sprintf(dest, "%s", str)` where the return value is used to `stpcpy(dest, str) - dest`. This generally avoids the machinery involved in parsing format strings.
`stpcpy` is just like `strcpy` except:
- it returns the pointer to the new tail of `dest`. This allows you to chain multiple calls to `stpcpy` in one statement.
- it requires the parameters not to overlap. Calling `sprintf` with overlapping arguments was clarified in ISO C99 and POSIX.1-2001 to be undefined behavior.
`stpcpy` was first standardized in POSIX.1-2008.
Implement this so that we don't observe linkage failures due to missing symbol definitions for `stpcpy`.
.. ...
diff --git a/include/linux/string.h b/include/linux/string.h index b1f3894a0a3e..e570b9b10f50 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -31,6 +31,9 @@ size_t strlcpy(char *, const char *, size_t); #ifndef __HAVE_ARCH_STRSCPY ssize_t strscpy(char *, const char *, size_t); #endif +#ifndef __HAVE_ARCH_STPCPY +extern char *stpcpy(char *__restrict, const char *__restrict__); +#endif
/* Wraps calls to strscpy()/memset(), no arch specific code required */ ssize_t strscpy_pad(char *dest, const char *src, size_t count); diff --git a/lib/string.c b/lib/string.c index 6012c385fb31..81bc4d62c256 100644 --- a/lib/string.c +++ b/lib/string.c @@ -272,6 +272,29 @@ ssize_t strscpy_pad(char *dest, const char *src, size_t count) } EXPORT_SYMBOL(strscpy_pad);
+#ifndef __HAVE_ARCH_STPCPY
...
+#undef stpcpy +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src) +{
- while ((*dest++ = *src++) != '\0')
/* nothing */;
- return dest;
+} +#endif
Hmmmm.... Maybe the compiler should just inline the above?
OTOH there are faster copies on 64bit systems (for moderate length strings).
It would also be nicer if the compiler actually used/required a symbol in the 'reserved for the implementation' namespace. Then the linker should be able to do a fixup to a differently name symbol - if that is required.
But compiler writers enjoy making embedded coders life hell.
David
- Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
linux-stable-mirror@lists.linaro.org