On Thu, Jan 16, 2020 at 06:29:26PM -0800, Kees Cook wrote:
> On Thu, Jan 16, 2020 at 11:45:18PM +0100, Christian Brauner wrote:
> > As one example where this might be particularly problematic, Jann pointed out that in combination with the upcoming IORING_OP_OPENAT feature, this bug might allow unprivileged users to bypass the capability checks while asynchronously opening files like /proc/*/mem, because the capability checks for this would be performed against kernel credentials.
To follow up on this part of your mail: no, afaict, it's not about winning a race. It's way simpler. When io_uring creates a new kernel context it records the subjective credentials of the caller:
	ctx = io_ring_ctx_alloc(p);
	if (!ctx) {
		if (account_mem)
			io_unaccount_mem(user, ring_pages(p->sq_entries,
					 p->cq_entries));
		free_uid(user);
		return -ENOMEM;
	}
	ctx->compat = in_compat_syscall();
	ctx->account_mem = account_mem;
	ctx->user = user;
------> ctx->creds = get_current_cred(); <------
Later on, when it starts to do work it creates a kernel thread:
		ctx->sqo_thread = kthread_create_on_cpu(io_sq_thread,
							ctx, cpu,
							"io_uring-sq");
	} else {
		ctx->sqo_thread = kthread_create(io_sq_thread, ctx,
						 "io_uring-sq");
	}
and registers io_sq_thread() as the thread's callback. The callback io_sq_thread() runs __with kernel creds__. To prevent this from becoming an issue, io_sq_thread() will override the __subjective credentials__ with the caller's credentials:
	old_cred = override_creds(ctx->creds);
But ptrace_has_cap() currently looks at __task_cred(current) aka __real_cred__, which override_creds() does not touch. This means once IORING_OP_OPENAT and IORING_OP_OPENAT2 land in v5.5-rc6 it is more or less trivial for an unprivileged user to bypass ptrace_may_access().
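To make the mismatch concrete, here is a small userspace sketch of the situation. It is not kernel code: struct toy_cred, struct toy_task and all toy_*() helpers are hypothetical stand-ins I made up for illustration, and the single cap_sys_ptrace flag is a simplification of the real capability sets. It only models the idea that override_creds() swaps the subjective creds while a check against real_cred keeps seeing the kernel thread's own credentials.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cred: one capability flag only. */
struct toy_cred {
	bool cap_sys_ptrace;
};

/* Hypothetical stand-in for task_struct with both credential pointers. */
struct toy_task {
	const struct toy_cred *real_cred;	/* objective creds */
	const struct toy_cred *cred;		/* subjective creds, overridable */
};

/* Models override_creds(): replace the subjective creds, return the old ones. */
static const struct toy_cred *toy_override_creds(struct toy_task *t,
						 const struct toy_cred *new_cred)
{
	const struct toy_cred *old;

	assert(t != NULL && new_cred != NULL);
	old = t->cred;
	t->cred = new_cred;	/* real_cred is deliberately left alone */
	return old;
}

/* Models a capability check that consults real_cred, as described above. */
static bool toy_has_cap_real(const struct toy_task *t)
{
	return t->real_cred->cap_sys_ptrace;
}

/* For contrast: the same check against the subjective creds. */
static bool toy_has_cap_subjective(const struct toy_task *t)
{
	return t->cred->cap_sys_ptrace;
}
```

If you set up a toy_task whose real_cred and cred both point at a fully capable "kernel" toy_cred and then call toy_override_creds() with an unprivileged one, toy_has_cap_real() still reports the capability while toy_has_cap_subjective() does not. That is the whole problem in miniature.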
Christian