The 08/29/2024 00:27, Mark Brown wrote:
Unfortunately plain clone() is not extensible and existing clone3() users will not specify a stack so all existing code would be broken if we mandated specifying the stack explicitly. For compatibility with these cases and also x86 (which did not initially implement clone3() support for shadow stacks) if no GCS is specified we will allocate one so when a thread is created which has GCS enabled allocate one for it. We follow the extensively discussed x86 implementation and allocate min(RLIMIT_STACK, 2G). Since the GCS only stores the call stack and not any variables this should be more than sufficient for most applications.
the code has RLIMIT_STACK/2
(which is what i expect on arm64, since gcs entry size is min stack frame / 2 if the stack is correctly aligned)
GCSs allocated via this mechanism will be freed when the thread exits.
i see gcs still mapped after thread exit when testing.
+static unsigned long gcs_size(unsigned long size)
+{
+	if (size)
+		return PAGE_ALIGN(size);
no /2
+	/* Allocate RLIMIT_STACK/2 with limits of PAGE_SIZE..2G */
+	size = PAGE_ALIGN(min_t(unsigned long long,
+				rlimit(RLIMIT_STACK) / 2, SZ_2G));
has /2
+	return max(PAGE_SIZE, size);
+}
+unsigned long gcs_alloc_thread_stack(struct task_struct *tsk,
+				     const struct kernel_clone_args *args)
+{
+	unsigned long addr, size;
+	if (!system_supports_gcs())
+		return 0;
+	if (!task_gcs_el0_enabled(tsk))
+		return 0;
+	if ((args->flags & (CLONE_VFORK | CLONE_VM)) != CLONE_VM) {
+		tsk->thread.gcspr_el0 = read_sysreg_s(SYS_GCSPR_EL0);
+		return 0;
+	}
+	size = args->stack_size;
no /2 (i think this should be divided)
+	size = gcs_size(size);
+	addr = alloc_gcs(0, size);
+	if (IS_ERR_VALUE(addr))
+		return addr;
+	tsk->thread.gcs_base = addr;
+	tsk->thread.gcs_size = size;
+	tsk->thread.gcspr_el0 = addr + size - sizeof(u64);
+	return addr;
+}
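to make the /2 comments concrete: below is a user-space sketch of the sizing with the explicit stack size halved the same way as the rlimit default. this is only a model, not a proposed patch: the model_ names, the fixed 4K page size and passing the rlimit in as a parameter are assumptions for illustration.

```c
#define MODEL_PAGE_SIZE	4096ULL		/* assumption: 4K pages */
#define MODEL_SZ_2G	0x80000000ULL

/* Round x up to the next multiple of the (assumed) page size. */
static unsigned long long model_page_align(unsigned long long x)
{
	return (x + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
}

/*
 * Hypothetical variant of gcs_size(): stack_size == 0 means the caller
 * gave no stack size, so fall back to the stack rlimit. Either way the
 * GCS gets half of the stack size, capped at 2G, rounded up to a page
 * and never smaller than one page.
 */
static unsigned long long model_gcs_size(unsigned long long stack_size,
					 unsigned long long rlimit_stack)
{
	unsigned long long size = stack_size ? stack_size : rlimit_stack;

	size /= 2;
	if (size > MODEL_SZ_2G)
		size = MODEL_SZ_2G;
	size = model_page_align(size);

	return size < MODEL_PAGE_SIZE ? MODEL_PAGE_SIZE : size;
}
```

with this model an explicit 16K stack gives an 8K gcs, and the default path behaves like the quoted code (an 8M stack rlimit gives a 4M gcs).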
...
 void gcs_free(struct task_struct *task)
 {
+	/*
+	 * When fork() with CLONE_VM fails, the child (tsk) already
+	 * has a GCS allocated, and exit_thread() calls this function
+	 * to free it. In this case the parent (current) and the
+	 * child share the same mm struct.
+	 */
+	if (!task->mm || task->mm != current->mm)
+		return;
 	if (task->thread.gcs_base)
 		vm_munmap(task->thread.gcs_base, task->thread.gcs_size);
not sure why this logic fails to free thread gcs (created with clone3 in glibc)
other than the gcs leak, my tests pass.
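
for reference, one way to look for leftover gcs mappings from user space is to count the shadow stack VMAs in /proc/self/smaps before creating a thread and again after joining it. a minimal sketch, assuming gcs mappings carry the "ss" flag on their VmFlags line (as shadow stack mappings do); the helper name is hypothetical and this is not necessarily how the leak above was observed.

```c
#include <stdio.h>
#include <string.h>

/*
 * Count the VMAs whose "VmFlags:" line contains the "ss" flag.
 * f is an open stream with smaps-formatted text, e.g. the result of
 * fopen("/proc/self/smaps", "r"). Lines are assumed to fit in 512 bytes.
 */
static int count_shadow_stack_vmas(FILE *f)
{
	char line[512];
	int count = 0;

	while (fgets(line, sizeof(line), f)) {
		char *tok;

		if (strncmp(line, "VmFlags:", 8) != 0)
			continue;

		/* Flags are short tokens separated by spaces. */
		for (tok = strtok(line + 8, " \n"); tok;
		     tok = strtok(NULL, " \n")) {
			if (strcmp(tok, "ss") == 0) {
				count++;
				break;
			}
		}
	}

	return count;
}
```

if the count after pthread_join() is higher than the count before pthread_create(), a thread gcs is still mapped.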