During the discussion of the clone3() support for shadow stacks concerns were raised from the glibc side that since it is not possible to reuse the allocated shadow stack[1]. This means that the benefit of being able to manage allocations is greatly reduced, for example it is not possible to integrate the shadow stacks into the glibc thread stack cache. The stack can be inspected but otherwise it would have to be unmapped and remapped before it could be used again, it's not clear that this is better than managing things in the kernel.
In that discussion I suggested that we could enable reuse by writing a token to the shadow stack of exiting threads, mirroring how the userspace stack pivot instructions write a token to the outgoing stack. As mentioned by Florian[2] glibc already unwinds the stack and exits the thread from the start routine which would integrate nicely with this, the shadow stack pointer will be at the same place as it was when the thread started.
This would not write a token if the thread doesn't exit cleanly, that seems viable to me - users should probably handle this by double checking that a token is present after waiting for the thread.
This is tagged as a RFC since I put it together fairly quickly to demonstrate the proposal and the suggestion hasn't had much response either way from the glibc developers. At the very least we don't currently handle scheduling during exit(), or distinguish why the thread is exiting. I've also not done anything about x86.
[1] https://marc.info/?l=glibc-alpha&m=175821637429537&w=2 [2] https://marc.info/?l=glibc-alpha&m=175733266913483&w=2
Signed-off-by: Mark Brown broonie@kernel.org --- Mark Brown (3): arm64/gcs: Support reuse of GCS for exited threads kselftest/arm64: Validate PR_SHADOW_STACK_EXIT_TOKEN in basic-gcs kselftest/arm64: Add PR_SHADOW_STACK_EXIT_TOKEN to gcs-locking
arch/arm64/include/asm/gcs.h | 3 +- arch/arm64/mm/gcs.c | 25 ++++- include/uapi/linux/prctl.h | 1 + tools/testing/selftests/arm64/gcs/basic-gcs.c | 121 ++++++++++++++++++++++++ tools/testing/selftests/arm64/gcs/gcs-locking.c | 23 +++++ tools/testing/selftests/arm64/gcs/gcs-util.h | 3 +- 6 files changed, 173 insertions(+), 3 deletions(-) --- base-commit: 0b67d4b724b4afed2690c21bef418b8a803c5be2 change-id: 20250919-arm64-gcs-exit-token-82c3c2570aad prerequisite-change-id: 20231019-clone3-shadow-stack-15d40d2bf536
Best regards, -- Mark Brown broonie@kernel.org
Currently when a thread with a userspace allocated stack exits it is not possible for the GCS that was in use by the thread to be reused since no stack switch token will be left on the stack, preventing use with clone3() or userspace stack switching. The only thing userspace can realistically do with the GCS is inspect it or unmap it which is not ideal.
Enable reuse by modelling thread exit like pivoting in userspace with the stack pivot instructions, writing a stack switch token at the top entry of the GCS of the exiting thread. This allows userspace to switch back to the GCS in future, the use of the current stack location should work well with glibc's current behaviour of fully uwninding the stack of threads that exit cleanly.
This patch is an RFC and should not be applied as-is. Currently the token will only be written for the current thread, but will be written regardless of the reason the thread is exiting. This version of the patch does not handle scheduling during exit() at all, the code is racy.
The feature is gated behind a new GCS mode flag PR_SHADOW_STACK_EXIT_TOKEN to ensure that userspace that does not wish to use the tokens never has to see them.
Signed-off-by: Mark Brown broonie@kernel.org --- arch/arm64/include/asm/gcs.h | 3 ++- arch/arm64/mm/gcs.c | 25 ++++++++++++++++++++++++- include/uapi/linux/prctl.h | 1 + 3 files changed, 27 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/gcs.h b/arch/arm64/include/asm/gcs.h index b4bbec9382a1..1ec359d0ad51 100644 --- a/arch/arm64/include/asm/gcs.h +++ b/arch/arm64/include/asm/gcs.h @@ -52,7 +52,8 @@ static inline u64 gcsss2(void) }
#define PR_SHADOW_STACK_SUPPORTED_STATUS_MASK \ - (PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_WRITE | PR_SHADOW_STACK_PUSH) + (PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_WRITE | \ + PR_SHADOW_STACK_PUSH | PR_SHADOW_STACK_EXIT_TOKEN)
#ifdef CONFIG_ARM64_GCS
diff --git a/arch/arm64/mm/gcs.c b/arch/arm64/mm/gcs.c index fd1d5a6655de..4649c2b107a7 100644 --- a/arch/arm64/mm/gcs.c +++ b/arch/arm64/mm/gcs.c @@ -199,14 +199,37 @@ void gcs_set_el0_mode(struct task_struct *task)
void gcs_free(struct task_struct *task) { + unsigned long __user *cap_ptr; + unsigned long cap_val; + int ret; + if (!system_supports_gcs()) return;
if (!task->mm || task->mm != current->mm) return;
- if (task->thread.gcs_base) + if (task->thread.gcs_base) { vm_munmap(task->thread.gcs_base, task->thread.gcs_size); + } else if (task == current && + task->thread.gcs_el0_mode & PR_SHADOW_STACK_EXIT_TOKEN) { + cap_ptr = (unsigned long __user *)read_sysreg_s(SYS_GCSPR_EL0); + cap_ptr--; + cap_val = GCS_CAP(cap_ptr); + + /* + * We can't do anything constructive if this fails, + * and the thread might be exiting due to being in a + * bad state anyway. + */ + put_user_gcs(cap_val, cap_ptr, &ret); + + /* + * Ensure the new cap is ordered before standard + * memory accesses to the same location. + */ + gcsb_dsync(); + }
task->thread.gcspr_el0 = 0; task->thread.gcs_base = 0; diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index ed3aed264aeb..c3c37c39639f 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -352,6 +352,7 @@ struct prctl_mm_map { # define PR_SHADOW_STACK_ENABLE (1UL << 0) # define PR_SHADOW_STACK_WRITE (1UL << 1) # define PR_SHADOW_STACK_PUSH (1UL << 2) +# define PR_SHADOW_STACK_EXIT_TOKEN (1UL << 3)
/* * Prevent further changes to the specified shadow stack
Add a simple test which validates that exit tokens can be written to the GCS of exiting threads, in basic-gcs since this functionality is expected to be used by libcs.
We should add further tests which validate the option being absent.
Signed-off-by: Mark Brown broonie@kernel.org --- tools/testing/selftests/arm64/gcs/basic-gcs.c | 121 ++++++++++++++++++++++++++ 1 file changed, 121 insertions(+)
diff --git a/tools/testing/selftests/arm64/gcs/basic-gcs.c b/tools/testing/selftests/arm64/gcs/basic-gcs.c index 54f9c888249d..5515a5425186 100644 --- a/tools/testing/selftests/arm64/gcs/basic-gcs.c +++ b/tools/testing/selftests/arm64/gcs/basic-gcs.c @@ -360,6 +360,126 @@ static bool test_vfork(void) return pass; }
+/* We can reuse a shadow stack with an exit token */ +static bool test_exit_token(void) +{ + struct clone_args clone_args; + int ret, status; + static bool pass = true; /* These will be used in the thread */ + static uint64_t expected_cap; + static int elem; + static uint64_t *buf; + + /* Ensure we've got exit tokens enabled here */ + ret = my_syscall5(__NR_prctl, PR_SET_SHADOW_STACK_STATUS, + PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_EXIT_TOKEN, + 0, 0, 0); + if (ret != 0) + ksft_exit_fail_msg("Failed to enable exit token: %d\n", ret); + + buf = (void *)my_syscall3(__NR_map_shadow_stack, 0, page_size, + SHADOW_STACK_SET_TOKEN); + if (buf == MAP_FAILED) { + ksft_print_msg("Failed to map %lu byte GCS: %d\n", + page_size, errno); + return false; + } + ksft_print_msg("Mapped GCS at %p-%p\n", buf, + (void *)((uint64_t)buf + page_size)); + + /* We should have a cap token */ + elem = (page_size / sizeof(uint64_t)) - 1; + expected_cap = ((uint64_t)buf + page_size - 8); + expected_cap &= GCS_CAP_ADDR_MASK; + expected_cap |= GCS_CAP_VALID_TOKEN; + if (buf[elem] != expected_cap) { + ksft_print_msg("Cap entry is 0x%llx not 0x%llx\n", + buf[elem], expected_cap); + pass = false; + } + ksft_print_msg("cap token is 0x%llx\n", buf[elem]); + + memset(&clone_args, 0, sizeof(clone_args)); + clone_args.exit_signal = SIGCHLD; + clone_args.flags = CLONE_VM; + clone_args.shstk_token = (uint64_t)&(buf[elem]); + clone_args.stack = (uint64_t)malloc(page_size); + clone_args.stack_size = page_size; + + if (!clone_args.stack) { + ksft_print_msg("Failed to allocate stack\n"); + pass = false; + } + + /* Don't try to clone if we're failing, we might hang */ + if (!pass) + goto out; + + /* There is no wrapper for clone3() in nolibc (or glibc) */ + ret = my_syscall2(__NR_clone3, &clone_args, sizeof(clone_args)); + if (ret == -1) { + ksft_print_msg("clone3() failed: %d\n", errno); + pass = false; + goto out; + } + + if (ret == 0) { + /* In the child, make sure the token is gone */ + if (buf[elem]) { + ksft_print_msg("GCS token was not consumed: %llx\n", + buf[elem]); + pass = false; + } + + /* Make sure we're using the right stack */ + if ((uint64_t)get_gcspr() != (uint64_t)&buf[elem + 1]) { + ksft_print_msg("Child GCSPR_EL0 is %llx not %llx\n", + (uint64_t)get_gcspr(), + (uint64_t)&buf[elem + 1]); + pass = false; + } + + /* We want to exit with *exactly* the same GCS pointer */ + my_syscall1(__NR_exit, 0); + } + + ksft_print_msg("Waiting for child %d\n", ret); + + ret = waitpid(ret, &status, 0); + if (ret == -1) { + ksft_print_msg("Failed to wait for child: %d\n", + errno); + pass = false; + goto out; + } + + if (!WIFEXITED(status)) { + ksft_print_msg("Child exited due to signal %d\n", + WTERMSIG(status)); + pass = false; + } else { + if (WEXITSTATUS(status)) { + ksft_print_msg("Child exited with status %d\n", + WEXITSTATUS(status)); + pass = false; + } + } + + /* The token should have been restored */ + if (buf[elem] == expected_cap) { + ksft_print_msg("Cap entry restored\n"); + } else { + ksft_print_msg("Cap entry is 0x%llx not 0x%llx\n", + buf[elem], expected_cap); + pass = false; + } + +out: + free((void*)clone_args.stack); + munmap(buf, page_size); + return pass; +} + typedef bool (*gcs_test)(void);
static struct { @@ -377,6 +497,7 @@ static struct { { "map_guarded_stack", map_guarded_stack }, { "fork", test_fork }, { "vfork", test_vfork }, + { "exit_token", test_exit_token }, };
int main(void)
We have added PR_SHADOW_STACK_EXIT_TOKEN, ensure that locking works as expected for it.
Signed-off-by: Mark Brown broonie@kernel.org --- tools/testing/selftests/arm64/gcs/gcs-locking.c | 23 +++++++++++++++++++++++ tools/testing/selftests/arm64/gcs/gcs-util.h | 3 ++- 2 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/arm64/gcs/gcs-locking.c b/tools/testing/selftests/arm64/gcs/gcs-locking.c index 989f75a491b7..0e8928096918 100644 --- a/tools/testing/selftests/arm64/gcs/gcs-locking.c +++ b/tools/testing/selftests/arm64/gcs/gcs-locking.c @@ -77,6 +77,29 @@ FIXTURE_VARIANT_ADD(valid_modes, enable_write_push) PR_SHADOW_STACK_PUSH, };
+FIXTURE_VARIANT_ADD(valid_modes, enable_token) +{ + .mode = PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_EXIT_TOKEN, +}; + +FIXTURE_VARIANT_ADD(valid_modes, enable_write_exit) +{ + .mode = PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_WRITE | + PR_SHADOW_STACK_EXIT_TOKEN, +}; + +FIXTURE_VARIANT_ADD(valid_modes, enable_push_exit) +{ + .mode = PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_PUSH | + PR_SHADOW_STACK_EXIT_TOKEN, +}; + +FIXTURE_VARIANT_ADD(valid_modes, enable_write_push_exit) +{ + .mode = PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_WRITE | + PR_SHADOW_STACK_PUSH | PR_SHADOW_STACK_EXIT_TOKEN, +}; + FIXTURE_SETUP(valid_modes) { } diff --git a/tools/testing/selftests/arm64/gcs/gcs-util.h b/tools/testing/selftests/arm64/gcs/gcs-util.h index c99a6b39ac14..1abc9d122ac1 100644 --- a/tools/testing/selftests/arm64/gcs/gcs-util.h +++ b/tools/testing/selftests/arm64/gcs/gcs-util.h @@ -36,7 +36,8 @@ struct user_gcs { # define PR_SHADOW_STACK_PUSH (1UL << 2)
#define PR_SHADOW_STACK_ALL_MODES \ - PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_WRITE | PR_SHADOW_STACK_PUSH + PR_SHADOW_STACK_ENABLE | PR_SHADOW_STACK_WRITE | \ + PR_SHADOW_STACK_PUSH | PR_SHADOW_STACK_EXIT_TOKEN
#define SHADOW_STACK_SET_TOKEN (1ULL << 0) /* Set up a restore token in the shadow stack */ #define SHADOW_STACK_SET_MARKER (1ULL << 1) /* Set up a top of stack merker in the shadow stack */
linux-kselftest-mirror@lists.linaro.org