On Tue, Aug 06, 2024 at 04:10:02PM +0100, Mark Brown wrote:
On Mon, Aug 05, 2024 at 08:54:54PM -0700, Kees Cook wrote:
This actually segfaults the parent:
# Running test 'Shadow stack with no token'
# [5496] Trying clone3() with flags 0x100 (size 0)
# I am the parent (5496). My child's pid is 5507
Segmentation fault
Oh dear. We possibly manage to corrupt the parent's shadow stack somehow? I don't think I managed to do that in my arm64 testing. This should also be something going wrong in arch_shstk_post_fork().
Let me know what would be most helpful to dig into more...
It'll almost certainly be something in arch_shstk_post_fork(), that's the bit I couldn't test. Just making that always return success should avoid the first fault; the second ought not to crash but will report a fail, as we should be rejecting the shadow stack when we try to consume the token.
It took me a while to figure out where a thread switches shstk (even without this series):
kernel_clone -> copy_process -> copy_thread -> fpu_clone -> update_fpu_shstk (and shstk_alloc_thread_stack is called just before update_fpu_shstk).
I don't understand the token consumption in arch_shstk_post_fork(). This wasn't needed before with the fixed-size new shstk, why is it needed now?
Anyway, my attempt to trace the shstk changes for the test:
write(1, "TAP version 13\n", 15) = 15
write(1, "1..2\n", 5) = 5
clone3({flags=0, exit_signal=18446744073709551615, stack=NULL, stack_size=0}, 104) = -1 EINVAL (Invalid argument)
write(1, "# clone3() syscall supported\n", 29) = 29
map_shadow_stack(NULL, 4096, 0) = 125837480497152
write(1, "# Shadow stack supportd\n", 24) = 24
write(1, "# Running test 'Shadow stack wit"..., 44) = 44
getpid() = 4943
write(1, "# [4943] Trying clone3() with fl"..., 51) = 51
map_shadow_stack(NULL, 4096, 0) = 125837480488960
clone3({flags=CLONE_VM, exit_signal=SIGCHLD, stack=NULL, stack_size=0, /* bytes 88..103 */ "\x00\xf0\x52\xd2\x72\x72\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00"} => {/* bytes 88..103 */ "\x00\xf0\x52\xd2\x72\x72\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00"}, 104) = 4944
getpid() = 4943
write(1, "# I am the parent (4943). My chi"..., 49strace: Process 4944 attached
) = 49
[pid 4944] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_CPERR, si_addr=NULL} ---
[pid 4943] wait4(-1, <unfinished ...>
[pid 4944] +++ killed by SIGSEGV (core dumped) +++
<... wait4 resumed>[{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV && WCOREDUMP(s)}], __WALL, NULL) = 4944
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_DUMPED, si_pid=4944, si_uid=0, si_status=SIGSEGV, si_utime=0, si_stime=0} ---
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x7272d21fffe8} ---
+++ killed by SIGSEGV (core dumped) +++
[ 569.153288] shstk_setup: clone3[4943] ssp:7272d2200000
[ 569.153998] process: copy_thread: clone3[4943] new_ssp:7272d2530000
[ 569.154002] update_fpu_shstk: clone3[4943] ssp:7272d2530000
[ 569.154008] shstk_post_fork: clone3[4944]
[ 569.154011] shstk_post_fork: clone3[4944] sending SIGSEGV post fork
I don't see an update_fpu_shstk for 4944? Should I with this test?
And the parent dies with SEGV_MAPERR??
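For reference, here is a rough Python sketch that decodes the raw "bytes 88..103" strace prints for the second clone3() call and compares them against the addresses in the strace and dmesg output. The interpretation is an assumption on my part: I'm treating the 16 bytes as two little-endian u64 values, a shadow stack base followed by a size, and the variable names are made up for illustration. It only does the arithmetic; it doesn't explain either fault.

```python
import struct

# Bytes 88..103 of the clone3() argument block, copied from the strace output.
# strace dumps them raw because it does not know about the trailing fields.
TAIL = bytes.fromhex("00f052d272720000" "0010000000000000")

# Assumption: two little-endian u64 values, shadow stack base then size.
shstk_base, shstk_size = struct.unpack("<QQ", TAIL)

# Values copied verbatim from the strace and dmesg output.
SECOND_MAP = 125837480488960    # return value of the second map_shadow_stack()
NEW_SSP = 0x7272d2530000        # copy_thread: new_ssp
PARENT_SSP = 0x7272d2200000     # shstk_setup: ssp for 4943
PARENT_FAULT = 0x7272d21fffe8   # si_addr of the parent's SEGV_MAPERR

print(f"base          = {shstk_base:#x}")
print(f"size          = {shstk_size:#x}")
print(f"base + size   = {shstk_base + shstk_size:#x}")
print(f"second map    = {SECOND_MAP:#x}")
print(f"new_ssp       = {NEW_SSP:#x}")
print(f"ssp - si_addr = {PARENT_SSP - PARENT_FAULT:#x}")
```

Under that reading, the base is the address returned by the second map_shadow_stack(), base + size equals the new_ssp that copy_thread reports, and the parent's faulting address sits 0x18 bytes below the ssp printed by shstk_setup.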
I'll keep looking in the morning ...