Hi, Willy, Thomas
Currently, since this part of the discussion is out of the original topic [1], as suggested by Willy, open a new thread for this.
[1]: https://lore.kernel.org/lkml/20230731224929.GA18296@1wt.eu/#R
The original topic [1] tries to enable -Wall (or even -Wextra) to report some issues (include the dead unused functions/data) found during compiling stage, this further propose a method to enable '-ffunction-sections -fdata-sections -Wl,--gc-sections,--print-gc-sections to' find the dead unused functions/data during linking stage:
Just thought about gc-section, do we need to further remove dead code/data in the binary? I don't think it is necessary for nolibc-test itself, but with '-Wl,--gc-sections -Wl,--print-gc-sections' may be a good helper to show us which ones should be dropped or which ones are wrongly declared as public?
Just found '-O3 + -Wl,--gc-section + -Wl,--print-gc-sections' did tell us something as below:
removing unused section '.text.nolibc_raise' removing unused section '.text.nolibc_memmove' removing unused section '.text.nolibc_abort' removing unused section '.text.nolibc_memcpy' removing unused section '.text.__stack_chk_init' removing unused section '.text.is_setting_valid'
These info may help us further add missing 'static' keyword or find another method to to drop the wrongly used status of some functions from the code side.
Let's continue the discussion as below.
On Mon, Jul 31, 2023 at 08:36:05PM +0200, Thomas Wei�schuh wrote:
[...]
It is very easy to add the missing 'static' keyword for is_setting_valid(), but for __stack_chk_init(), is it ok for us to convert it to 'static' and remove the 'weak' attrbute and even the 'section' attribute? seems it is only used by our _start_c() currently.
Making is_setting_valid(), __stack_chk_init() seems indeed useful. Also all the run_foo() test functions.
Most of them could theoretically be turned to static. *But* it causes a problem which is that it will multiply their occurrences in multi-unit programs, and that's in part why we've started to use weak instead. Also if you run through gdb and want to mark a break point, you won't have the symbol when it's static,
Willy, did I misunderstand something again? a simple test shows, seems this is not really always like that, static mainly means 'local', the symbol is still there if without -O2/-Os and is able to be set a breakpoint at:
// test.c: gcc -o test test.c #include <stdio.h>
static int test (void) { printf("hello, world!\n"); }
int main(void) { test();
return 0; }
Even with -Os/-O2, an additional '-g' is able to generate the 'test' symbol for debug as we expect.
and the code will appear at multiple locations,
which is really painful. I'd instead really prefer to avoid static when we don't strictly want to inline the code, and prefer weak when possible because we know many of them will be dropped at link time (and that's the exact purpose).
For the empty __stack_chk_init() one (when the arch not support stackprotector) we used, when with 'weak', it is not possible drop it during link time even with -O3, the weak one will be dropped may be only when there is a global one with the same name is used or the 'weak' one is never really used?
#include <stdio.h>
__attribute__((weak,unused,section(".text.nolibc_memset"))) int test (void) { printf("hello, world!\n"); }
int main(void) { test();
return 0; }
0000000000001060 <main>: 1060: f3 0f 1e fa endbr64 1064: 48 83 ec 08 sub $0x8,%rsp 1068: e8 03 01 00 00 callq 1170 <test> 106d: 31 c0 xor %eax,%eax 106f: 48 83 c4 08 add $0x8,%rsp 1073: c3 retq 1074: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 107b: 00 00 00 107e: 66 90 xchg %ax,%ax
Seems it is either impossible to add a 'inline' keyword again with the 'weak' attribute (warned by compiler), so, the _start_c (itself is always called by _start) will always add an empty call to the weak empty __stack_chk_init(), -Os/-O2/-O3 don't help. for such an empty function, in my opinion, as the size we want to care about, the calling place should be simply removed by compiler.
Test also shows, with current __inline__ method, the calling place is removed, but with c89, the __stack_chk_init() itself will not be droped automatically, only if not with -std=c89, it will be dropped and not appear in the --print-gc-sections result.
Even for a supported architecture, the shorter __stack_chk_init() may be better to inlined to the _start_c()?
So, If my above test is ok, then, we'd better simply convert the whole __stack_chk_init() to a static one as below (I didn't investigate this deeply due to the warning about static and weak conflict at the first time):
diff --git a/tools/include/nolibc/crt.h b/tools/include/nolibc/crt.h index 32e128b0fb62..a5f33fef1672 100644 --- a/tools/include/nolibc/crt.h +++ b/tools/include/nolibc/crt.h @@ -10,7 +10,7 @@ char **environ __attribute__((weak)); const unsigned long *_auxv __attribute__((weak));
-void __stack_chk_init(void); +static void __stack_chk_init(void); static void exit(int);
void _start_c(long *sp) diff --git a/tools/include/nolibc/stackprotector.h b/tools/include/nolibc/stackprotector.h index b620f2b9578d..13f1d0e60387 100644 --- a/tools/include/nolibc/stackprotector.h +++ b/tools/include/nolibc/stackprotector.h @@ -37,8 +37,7 @@ void __stack_chk_fail_local(void) __attribute__((weak,section(".data.nolibc_stack_chk"))) uintptr_t __stack_chk_guard;
-__attribute__((weak,section(".text.nolibc_stack_chk"))) __no_stack_protector -void __stack_chk_init(void) +static __no_stack_protector void __stack_chk_init(void) { my_syscall3(__NR_getrandom, &__stack_chk_guard, sizeof(__stack_chk_guard), 0); /* a bit more randomness in case getrandom() fails, ensure the guard is never 0 */ @@ -46,7 +45,7 @@ void __stack_chk_init(void) __stack_chk_guard ^= (uintptr_t) &__stack_chk_guard; } #else /* !defined(_NOLIBC_STACKPROTECTOR) */ -__inline__ void __stack_chk_init(void) {} +static void __stack_chk_init(void) {} #endif /* defined(_NOLIBC_STACKPROTECTOR) */
#endif /* _NOLIBC_STACKPROTECTOR_H */
Thanks for the clarification. I forgot about that completely!
The stuff from nolibc-test.c itself (run_foo() and is_settings_valid()) should still be done.
Yes, likely. Nolibc-test should be done just like users expect to use nolibc, and nolibc should be the most flexible possible.
For the 'static' keyword we tested above, '-g' may help the debug requirement, so, is ok for us to apply 'static' for them safely now?
A further test shows, with 'static' on _start_c() doesn't help the size, for it is always called from _start(), will never save move instructions, but we need a more 'used' attribute to silence the error 'nolibc-test.c:(.text+0x38cd): undefined reference to `_start_c'', so, reserve it as the before simpler 'void _start_c(void)' may be better?
static __attribute__((used)) void _start_c(long *sp)
Thanks, Zhangjin
Cheers, Willy