On Tue, Aug 06, 2024 at 10:08:44PM -0700, Kees Cook wrote:
On Tue, Aug 06, 2024 at 04:10:02PM +0100, Mark Brown wrote:
# Running test 'Shadow stack with no token'
It took me a while to figure out where a thread switches shstk (even without this series):
kernel_clone, copy_process, copy_thread, fpu_clone, update_fpu_shstk (and shstk_alloc_thread_stack is called just before update_fpu_shstk).
I don't understand the token consumption in arch_shstk_post_fork(). This wasn't needed before with the fixed-size new shstk, why is it needed now?
Concerns were raised in earlier rounds of review that, since we now use a previously allocated shadow stack instead of allocating one as part of creating the new thread, someone could use this as part of an exploit: you could just jump on top of any existing shadow stack and cause writes to it.
Anyway, my attempt to trace the shstk changes for the test:
write(1, "TAP version 13\n", 15) = 15
write(1, "1..2\n", 5) = 5
clone3({flags=0, exit_signal=18446744073709551615, stack=NULL, stack_size=0}, 104) = -1 EINVAL (Invalid argument)
write(1, "# clone3() syscall supported\n", 29) = 29
map_shadow_stack(NULL, 4096, 0) = 125837480497152
write(1, "# Shadow stack supportd\n", 24) = 24
write(1, "# Running test 'Shadow stack wit"..., 44) = 44
getpid() = 4943
write(1, "# [4943] Trying clone3() with fl"..., 51) = 51
map_shadow_stack(NULL, 4096, 0) = 125837480488960
clone3({flags=CLONE_VM, exit_signal=SIGCHLD, stack=NULL, stack_size=0, /* bytes 88..103 */ "\x00\xf0\x52\xd2\x72\x72\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00"} => {/* bytes 88..103 */ "\x00\xf0\x52\xd2\x72\x72\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00"}, 104) = 4944
getpid() = 4943
write(1, "# I am the parent (4943). My chi"..., 49strace: Process 4944 attached
) = 49
[pid 4944] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_CPERR, si_addr=NULL} ---
[pid 4943] wait4(-1, <unfinished ...>
[pid 4944] +++ killed by SIGSEGV (core dumped) +++
So we created the thread, then before we get to the wait4() in the parent we start delivering a SEGV_CPERR to the child. The flow for the child is as expected.
<... wait4 resumed>[{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV && WCOREDUMP(s)}], __WALL, NULL) = 4944
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_DUMPED, si_pid=4944, si_uid=0, si_status=SIGSEGV, si_utime=0, si_stime=0} ---
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x7272d21fffe8} ---
+++ killed by SIGSEGV (core dumped) +++
Then the parent gets an ordinary segfault, not a shadow stack specific one, like some memory got deallocated underneath it or a pointer got corrupted.
[ 569.153288] shstk_setup: clone3[4943] ssp:7272d2200000
[ 569.153998] process: copy_thread: clone3[4943] new_ssp:7272d2530000
[ 569.154002] update_fpu_shstk: clone3[4943] ssp:7272d2530000
[ 569.154008] shstk_post_fork: clone3[4944]
[ 569.154011] shstk_post_fork: clone3[4944] sending SIGSEGV post fork
I don't see an update_fpu_shstk for 4944? Should I with this test?
I'd only expect to see one update; my understanding is that this update is for the child but happens in the context of the parent, as the child is not yet started.
Does this help:
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index 27acbdf44c5f..d7005974aff5 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -258,6 +258,8 @@ unsigned long shstk_alloc_thread_stack(struct task_struct *tsk,
 	if (args->shadow_stack) {
 		addr = args->shadow_stack;
 		size = args->shadow_stack_size;
+		shstk->base = 0;
+		shstk->size = 0;
 	} else {
 		/*
 		 * For CLONE_VFORK the child will share the parents