On 1/24/20 3:38 AM, Stefan Metzmacher wrote:
Am 22.01.20 um 10:26 schrieb Greg Kroah-Hartman:
From: Jens Axboe axboe@kernel.dk
commit 44d282796f81eb1debc1d7cb53245b4cb3214cb5 upstream.
If the credentials or the mm doesn't match, don't allow the task to submit anything on behalf of this ring. The task that owns the ring can pass the file descriptor to another task, but we don't want to allow that task to submit an SQE that then assumes the ring mm and creds if it needs to go async.
Cc: stable@vger.kernel.org Suggested-by: Stefan Metzmacher metze@samba.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
fs/io_uring.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -3716,6 +3716,12 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned wake_up(&ctx->sqo_wait); submitted = to_submit; } else if (to_submit) {
if (current->mm != ctx->sqo_mm ||
current_cred() != ctx->creds) {
ret = -EPERM;
goto out;
}
I thought about this a bit more.
I'm not sure if this is actually to restrictive, because it means applications like Samba won't be able to use io-uring at all.
As even if current_cred() and ctx->creds describe the same set of uid,gids the != won't ever match again and makes the whole ring unuseable.
I'm not sure about what the best short term solution could be...
- May just doing the check for path based operations?
and fail individual requests with EPERM.
- Or force REQ_F_FORCE_ASYNC for path based operations,
so that they're always executed from within the workqueue with were ctx->creds is active.
- Or (as proposed earlier) do the override_creds/revert_creds dance
(and similar for mm) if needed.
To summaries the problem again:
For path based operations like:
- IORING_OP_CONNECT (maybe also - IORING_OP_ACCEPT???)
- IORING_OP_SEND*, IORING_OP_RECV* on DGRAM sockets
- IORING_OP_OPENAT, IORING_OP_STATX, IORING_OP_OPENAT2
it's important under which current_cred they are called.
Are IORING_OP_MADVISE, IORING_OP_FADVISE and IORING_OP_FALLOCATE are only bound to the credentials of the passed fd they operate on?
The current assumption is that the io_uring_setup() syscall captures the current_cred() to ctx->cred and all operations on the ring are executed under the context of ctx->cred. Therefore all helper threads do the override_creds/revert_creds dance.
But it doesn't - we're expecting them to match, and with this change, we assert that it's the case or return -EPERM.
But the possible non-blocking line execution of operations in the io_uring_enter() syscall doesn't do the override_creds/revert_creds dance and execute the operations under current_cred().
This means it's random depending on filled cached under what credentials an operation is executed.
In order to prevent security problems the current patch is enough, but as outlined above it will make io-uring complete unuseable for applications using any syscall that changes current_cred().
Change 1. would be a little bit better, but still not really useful.
I'd actually prefer solution 3. as it's still possible to make use of non-blocking operations, while the security is the same as solution 2.
For your situation, we need to extend it anyway, and provide a way to swap between personalities. So yeah it won't work as-is for your use case, but we can work on making that the case.