Hi, Willy
Hi Zhangjin,
On Tue, Aug 08, 2023 at 04:04:05AM +0800, Zhangjin Wu wrote:
As reported and suggested by Willy, the inline __sysret() helper introduces three types of conversions and increases the size:
(1) the "unsigned long" argument to __sysret() forces a sign extension from all sys_* functions that used to return 'int'
(2) the comparison with the error range now has to be performed on a 'unsigned long' instead of an 'int'
(3) the return value from __sysret() is a 'long' (note, a signed long) which then has to be turned back to an 'int' before being returned by the caller to satisfy the caller's prototype.
To fix up this, firstly, let's use macro instead of inline function to preserves the input type and avoids these useless conversions (1), (3).
Secondly, comparison to -MAX_ERRNO inflicts on all integer returns where we could previously keep a simple sign comparison, let's use a new __is_pointer() macro suggested by David Laight to limit the comparison to -MAX_ERRNO (2) only for pointer returns and preserve a simple sign comparison for integer returns as before. The __builtin_choose_expr() is suggested by David Laight to choose different comparisons based on the types to share code.
Thirdly, fix up the following warning by an explicit conversion and let __sysret() be able to accept the (void *) type of argument:
sysroot/powerpc/include/sys.h: In function 'sbrk': sysroot/powerpc/include/sys.h:104:16: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] 104 | return (void *)__sysret(-ENOMEM);
Fourthly, to further workaround the argument type with 'const' flag, must use __auto_type for gcc >= 11.0 and __typeof__((arg) + 0) suggested by David Laight for old gcc versions.
(...)
tools/include/nolibc/sys.h | 74 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 59 insertions(+), 15 deletions(-)
Quite frankly, even if it can be looked at as a piece of art, I don't like it. It's overkill for what we need and it brings in several tricky macros that we don't need and that require a link to their analysis so that nobody breaks them by accident. I mean, if one day we need them, okay we know we can find them, they're perfect for certain use cases. But all this just to avoid a ternary operation is far too much IMHO. That's without counting on the compiler tricks to use the ugly __auto_type when available, and the macro names which continue to possibly interact with user code.
Agree, I don't like __auto_type too, although I have tried to find whether there is a direct macro for it, but NO such one, and the __auto_type in some older versions don't accept 'const' flag, so, I'm also worried about if gcc will change it in the future ;-(
Seems __sysret() is mainly used by us in sys.h, perhaps we can simply assume and guarantee nobody will use 'const' in such cases.
And if you remember, originally you proposed to factor the SET_ERRNO() stuff in every syscall in order to "simplify the code and improve maintainability". It's clear that we've long abandonned that goal here. If we had no other choice, I'd rather roll back to the clean, readable and trustable SET_ERRNO() in every syscall!
Agree, or we simply use the original version without pointer returns support (only sbrk and mmap currently) but convert it to the macro version.
Or, as the idea mentioned by Thomas in a reply: if we can let the sys_ functions use 'long' returns, or even further, we convert all of the sys_ functions to macros and let them preserve input types from library routines and preserve return types from the my_syscall<N> macros.
As we discussed in my our syscall.h proposal, if there is a common my_syscall(), every sys_ function can be simply defined to something like:
#define sys_<NAME>(...) my_syscall(<NAME>, __VA_ARGS__)
In my_syscall(), it can even simply return -ENOSYS if the __NR_xxx is not defined (we init such __NR_xxx to something like __NR_NOSYS):
// sysnr.h
// If worried about the applications use this macro, perhaps we can // use a different prefix, for example, NOLIBC_NR_xxx
#define NOLIBC_NR_NOSYS (-1L)
#ifndef __NR_xxx #define NOLIBC_NR_xxx NOLIBC_NR_NOSYS #else #define NOLIBC_NR_xxx __NR_xxx #endif
// syscall.h
// _my_syscall is similar to syscall() in unistd.h, but without the // __sysret normalization
#define _my_syscalln(N, ...) my_syscall##N(__VA_ARGS__) #define _my_syscall_n(N, ...) _my_syscalln(N, __VA_ARGS__) #define _my_syscall(...) _my_syscall_n(_syscall_narg(__VA_ARGS__), ##__VA_ARGS__)
#define my_syscall(name, ...) \ ({ \ long _ret; \ if (NOLIBC_NR_##name == NOLIBC_NR_NOSYS) \ _ret = -ENOSYS; \ else \ _ret = _my_syscall(NOLIBC_NR_##name, ##__VA_ARGS__); \ _ret; \ })
// sys_<NAME> list, based on unistd.h
#define sys_<NAME>(...) my_syscall(<NAME>, __VA_ARGS__) #define sys_<NAME>(...) my_syscall(<NAME>, __VA_ARGS__) #define sys_<NAME>(...) my_syscall(<NAME>, __VA_ARGS__)
With above conversions, we may be able to predefine all of the sys_<NAME> functions to preserve the input types from library rountines and return types from my_syscall<N> (by default, 'long'). This also follows the suggestion from Arnd: let sys_ not use the other low level syscalls, only use its own.
This may also help us to remove all of the `#ifdef __NR_` wrappers, we can directly check the -ENOSYS in the library routines and try another sys_<NAME> if required, at last, call __sysret() to normalize the errors and return value.
Use dup2 and dup3 as examples, with sysnr.h and syscall.h above, sys.h will work like this, without any #ifdef's:
/* * int dup2(int old, int new); */
static __attribute__((unused)) int dup2(int old, int new) { int ret = sys_dup3(old, new, 0);
if (ret == -ENOSYS) ret = sys_dup2(old, new);
return __sysret(ret); }
/* * int dup3(int old, int new, int flags); */
static __attribute__((unused)) int dup3(int old, int new, int flags) { return __sysret(sys_dup3(old, new, flags)); }
If the above description is not clear enough, I have changed something for this idea with more cleanups and have done some simple tests:
Compare with v5 --> v6 (with gcc-13.2.0):
i386: 160 test(s): 158 passed, 2 skipped, 0 failed => status: warning 19509 -> 19250 x86_64: 160 test(s): 158 passed, 2 skipped, 0 failed => status: warning 22012 -> 21758 arm64: 160 test(s): 158 passed, 2 skipped, 0 failed => status: warning 25868 -> 25804 arm: 160 test(s): 158 passed, 2 skipped, 0 failed => status: warning 23112 -> 22828 mips: 160 test(s): 157 passed, 3 skipped, 0 failed => status: warning 22924 -> 22740 ppc: 160 test(s): 158 passed, 2 skipped, 0 failed => status: warning 26628 -> 26376 ppc64: 160 test(s): 158 passed, 2 skipped, 0 failed => status: warning 27204 -> 26756 ppc64le: 160 test(s): 158 passed, 2 skipped, 0 failed => status: warning 27828 -> 27364 riscv: 160 test(s): 158 passed, 2 skipped, 0 failed => status: warning 21870 -> 21772 s390: 160 test(s): 157 passed, 3 skipped, 0 failed => status: warning 22192 -> 21992
And now we get:
$ wc -l tools/include/nolibc/sys.h 746 tools/include/nolibc/sys.h
$ wc -l tools/include/nolibc/sysnr.h 410 tools/include/nolibc/sysnr.h $ wc -l tools/include/nolibc/syscall.h 110 tools/include/nolibc/syscall.h
Before:
$ wc -l tools/include/nolibc/sys.h 1222 tools/include/nolibc/sys.h
Most of the library routines become one line code.
The main size reduced may be the carefully tuned sys_ selection, for example, linkat v.s. link, the patchset still require some cleanups, will send v6 for more discussion if you agree.
So I just restarted from what I proposed the other day, using a ternary operator as I suggested in order to address the const case, and it gives me the following patch, which is way simpler and still a bit readable. It's made of two nested (?:) :
- the first one to determine if we have to check for the sign or against -MAX_ERRNO to detect an error (depends on the arg's signedness)
- the second one to return either the argument as-is or -1.
The only two tricks are that (typeof(arg))-1 is compared to 1 instead of zero so that gcc doesn't complain that we're comparing against a null pointer, and similarly we compare arg+1 to 1 instead of arg to 0 for the negative case, and that's all. It gives me the expected code and output from gcc-4.7 to 12.3, and clang-13.
I've checked against your version and it's always exactly the same (in fact to be more precise sometimes it's 1-2 bytes smaller but that's only due to the compiler taking liberties with the code ordering, it could as well have done it the other way around, though it did not this time):
26144 zhangjin-v5/nolibc-test--Os-arm64 | 26144 willy/nolibc-test--Os-arm64 23340 zhangjin-v5/nolibc-test--Os-armv5 | 23340 willy/nolibc-test--Os-armv5
[...]
Unless there's any objection, I'll queue this one. And if __sysret() annoys us again in the future I might very well revert that simplification.
Any question about the patch ?
[...]
-/* Syscall return helper for library routines, set errno as -ret when ret is in
- range of [-MAX_ERRNO, -1]
- Note, No official reference states the errno range here aligns with musl
- (src/internal/syscall_ret.c) and glibc (sysdeps/unix/sysv/linux/sysdep.h)
+/* Syscall return helper: takes the syscall value in argument and checks for an
- error in it. For unsigned returns, an error is within [-MAX_ERRNO, -1]. For
- signed returns, an error is any value < 0. When an error is encountered,
- -ret is set into errno and -1 is returned. Otherwise the returned value is
*/
- passed as-is with its type preserved.
-static __inline__ __attribute__((unused, always_inline)) -long __sysret(unsigned long ret) -{
- if (ret >= (unsigned long)-MAX_ERRNO) {
SET_ERRNO(-(long)ret);
return -1;
- }
- return ret;
-} +#define __sysret(arg) \ +({ \
- __typeof__(arg) __sysret_arg = (arg); \
Here ignores the 'const' flag in input type?
- ((((__typeof__(arg)) -1) > (__typeof__(arg)) 1) ? /* unsigned arg? */ \
(uintptr_t)__sysret_arg >= (uintptr_t)-(MAX_ERRNO) : /* errors */ \
(__sysret_arg + 1) < ((__typeof__(arg))1) /* signed: <0 = error */ \
- ) ? ({ \
SET_ERRNO(-(intptr_t)__sysret_arg); \
((__typeof__(arg)) -1); /* return -1 upon error */ \
- }) : __sysret_arg; /* return original value & type on success */ \
+})
To be honest, it is also a little complex when with one "?:" embedded in another, I even don't understand how the 'unsigned arg' branch works, sorry, is it dark magic like the __is_constexpr? ;-)
Thanks, Zhangjin
/* Functions in this file only describe syscalls. They're declared static so
- that the compiler usually decides to inline them while still being allowed
@@ -94,7 +97,7 @@ void *sbrk(intptr_t inc) if (ret && sys_brk(ret + inc) == ret + inc) return ret + inc;
- return (void *)__sysret(-ENOMEM);
- return __sysret((void *)-ENOMEM);
} @@ -682,7 +685,7 @@ void *sys_mmap(void *addr, size_t length, int prot, int flags, int fd, static __attribute__((unused)) void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset) {
- return (void *)__sysret((unsigned long)sys_mmap(addr, length, prot, flags, fd, offset));
- return __sysret(sys_mmap(addr, length, prot, flags, fd, offset));
} static __attribute__((unused)) -- 2.35.3