CVE-2021-4197 patchset consists of: [1] 1756d7994ad8 ("cgroup: Use open-time credentials for process migraton perm checks") [2] 0d2b5955b362 ("cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv") [3] e57457641613 ("cgroup: Use open-time cgroup namespace for process migration perm checks") [4] b09c2baa5634 ("selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644") [5] 613e040e4dc2 ("selftests: cgroup: Test open-time credential usage for migration checks") [6] bf35a7879f1d ("selftests: cgroup: Test open-time cgroup namespace usage for migration checks")
Commits [2] and [3] are already preent in 5.10-stable, this patchset includes backports for the other commits.
Backport summary ---------------- 1756d7994ad8 ("cgroup: Use open-time credentials for process migraton perm checks") * Refactoring commit da70862efe006 ("cgroup: cgroup.{procs,threads} factor out common parts") is not present in kernel versions < 5.12, so the original changes to __cgroup_procs_write() had to be applied in both cgroup_threads_write() and cgroup_procs_write() functions.
c2e46f6b3e35 ("selftests/cgroup: Fix build on older distros") * This extra commit was added to fix the following selftest build failure, applies cleanly: ... cgroup_util.c: In function ‘clone_into_cgroup’: group_util.c:343:4: error: ‘struct clone_args’ has no member named ‘cgroup’ 343 | .cgroup = cgroup_fd, | ^~~~~~
All other selftest changes are clean cherry-picks.
Testing ------- The newly introduced selftests (test_cgcore_lesser_euid_open() and test_cgcore_lesser_ns_open()) pass with this series applied:
root@intel-x86-64:~# ./test_core ok 1 test_cgcore_internal_process_constraint ok 2 test_cgcore_top_down_constraint_enable ok 3 test_cgcore_top_down_constraint_disable ok 4 test_cgcore_no_internal_process_constraint_os ok 5 test_cgcore_parent_becomes_threaded ok 6 test_cgcore_invalid_domain ok 7 test_cgcore_populated ok 8 test_cgcore_proc_migration ok 9 test_cgcore_thread_migration ok 10 test_cgcore_destroy ok 11 test_cgcore_lesser_euid_open ok 12 test_cgcore_lesser_ns_open
Sachin Sant (1): selftests/cgroup: Fix build on older distros
Tejun Heo (4): cgroup: Use open-time credentials for process migraton perm checks selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644 selftests: cgroup: Test open-time credential usage for migration checks selftests: cgroup: Test open-time cgroup namespace usage for migration checks
kernel/cgroup/cgroup-v1.c | 7 +- kernel/cgroup/cgroup.c | 17 +- tools/testing/selftests/cgroup/cgroup_util.c | 6 +- tools/testing/selftests/cgroup/test_core.c | 165 +++++++++++++++++++ 4 files changed, 188 insertions(+), 7 deletions(-)
From: Tejun Heo tj@kernel.org
commit 1756d7994ad85c2479af6ae5a9750b92324685af upstream.
cgroup process migration permission checks are performed at write time as whether a given operation is allowed or not is dependent on the content of the write - the PID. This currently uses current's credentials which is a potential security weakness as it may allow scenarios where a less privileged process tricks a more privileged one into writing into a fd that it created.
This patch makes both cgroup2 and cgroup1 process migration interfaces to use the credentials saved at the time of open (file->f_cred) instead of current's.
Reported-by: "Eric W. Biederman" ebiederm@xmission.com Suggested-by: Linus Torvalds torvalds@linuxfoundation.org Fixes: 187fe84067bd ("cgroup: require write perm on common ancestor when moving processes on the default hierarchy") Reviewed-by: Michal Koutný mkoutny@suse.com Signed-off-by: Tejun Heo tj@kernel.org [OP: apply original __cgroup_procs_write() changes to cgroup_threads_write() and cgroup_procs_write(), as the refactoring commit da70862efe006 ("cgroup: cgroup.{procs,threads} factor out common parts") is not present in 5.10-stable] Signed-off-by: Ovidiu Panait ovidiu.panait@windriver.com --- kernel/cgroup/cgroup-v1.c | 7 ++++--- kernel/cgroup/cgroup.c | 17 ++++++++++++++++- 2 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c index 8f0ea12d7cee..1a0a9f820c69 100644 --- a/kernel/cgroup/cgroup-v1.c +++ b/kernel/cgroup/cgroup-v1.c @@ -505,10 +505,11 @@ static ssize_t __cgroup1_procs_write(struct kernfs_open_file *of, goto out_unlock;
/* - * Even if we're attaching all tasks in the thread group, we only - * need to check permissions on one of them. + * Even if we're attaching all tasks in the thread group, we only need + * to check permissions on one of them. Check permissions using the + * credentials from file open to protect against inherited fd attacks. */ - cred = current_cred(); + cred = of->file->f_cred; tcred = get_task_cred(task); if (!uid_eq(cred->euid, GLOBAL_ROOT_UID) && !uid_eq(cred->euid, tcred->uid) && diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 3f8447a5393e..0853289d321a 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -4788,6 +4788,7 @@ static ssize_t cgroup_procs_write(struct kernfs_open_file *of, struct cgroup_file_ctx *ctx = of->priv; struct cgroup *src_cgrp, *dst_cgrp; struct task_struct *task; + const struct cred *saved_cred; ssize_t ret; bool locked;
@@ -4805,9 +4806,16 @@ static ssize_t cgroup_procs_write(struct kernfs_open_file *of, src_cgrp = task_cgroup_from_root(task, &cgrp_dfl_root); spin_unlock_irq(&css_set_lock);
+ /* + * Process and thread migrations follow same delegation rule. Check + * permissions using the credentials from file open to protect against + * inherited fd attacks. + */ + saved_cred = override_creds(of->file->f_cred); ret = cgroup_attach_permissions(src_cgrp, dst_cgrp, of->file->f_path.dentry->d_sb, true, ctx->ns); + revert_creds(saved_cred); if (ret) goto out_finish;
@@ -4832,6 +4840,7 @@ static ssize_t cgroup_threads_write(struct kernfs_open_file *of, struct cgroup_file_ctx *ctx = of->priv; struct cgroup *src_cgrp, *dst_cgrp; struct task_struct *task; + const struct cred *saved_cred; ssize_t ret; bool locked;
@@ -4851,10 +4860,16 @@ static ssize_t cgroup_threads_write(struct kernfs_open_file *of, src_cgrp = task_cgroup_from_root(task, &cgrp_dfl_root); spin_unlock_irq(&css_set_lock);
- /* thread migrations follow the cgroup.procs delegation rule */ + /* + * Process and thread migrations follow same delegation rule. Check + * permissions using the credentials from file open to protect against + * inherited fd attacks. + */ + saved_cred = override_creds(of->file->f_cred); ret = cgroup_attach_permissions(src_cgrp, dst_cgrp, of->file->f_path.dentry->d_sb, false, ctx->ns); + revert_creds(saved_cred); if (ret) goto out_finish;
From: Sachin Sant sachinp@linux.vnet.ibm.com
commit c2e46f6b3e3551558d44c4dc518b9667cb0d5f8b upstream.
On older distros struct clone_args does not have a cgroup member, leading to build errors:
cgroup_util.c: In function 'clone_into_cgroup': cgroup_util.c:343:4: error: 'struct clone_args' has no member named 'cgroup' cgroup_util.c:346:33: error: invalid application of 'sizeof' to incomplete type 'struct clone_args'
But the selftests already have a locally defined version of the structure which is up to date, called __clone_args.
So use __clone_args which fixes the error.
Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Sachin Sant sachinp@linux.vnet.ibm.com> Acked-by: Christian Brauner christian.brauner@ubuntu.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Ovidiu Panait ovidiu.panait@windriver.com --- tools/testing/selftests/cgroup/cgroup_util.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/cgroup/cgroup_util.c b/tools/testing/selftests/cgroup/cgroup_util.c index 05853b0b8831..027014662fb2 100644 --- a/tools/testing/selftests/cgroup/cgroup_util.c +++ b/tools/testing/selftests/cgroup/cgroup_util.c @@ -337,13 +337,13 @@ pid_t clone_into_cgroup(int cgroup_fd) #ifdef CLONE_ARGS_SIZE_VER2 pid_t pid;
- struct clone_args args = { + struct __clone_args args = { .flags = CLONE_INTO_CGROUP, .exit_signal = SIGCHLD, .cgroup = cgroup_fd, };
- pid = sys_clone3(&args, sizeof(struct clone_args)); + pid = sys_clone3(&args, sizeof(struct __clone_args)); /* * Verify that this is a genuine test failure: * ENOSYS -> clone3() not available
From: Tejun Heo tj@kernel.org
commit b09c2baa56347ae65795350dfcc633dedb1c2970 upstream.
0644 is an odd perm to create a cgroup which is a directory. Use the regular 0755 instead. This is necessary for euid switching test case.
Reviewed-by: Michal Koutný mkoutny@suse.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Ovidiu Panait ovidiu.panait@windriver.com --- tools/testing/selftests/cgroup/cgroup_util.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/cgroup/cgroup_util.c b/tools/testing/selftests/cgroup/cgroup_util.c index 027014662fb2..5b16c7b0ae4f 100644 --- a/tools/testing/selftests/cgroup/cgroup_util.c +++ b/tools/testing/selftests/cgroup/cgroup_util.c @@ -219,7 +219,7 @@ int cg_find_unified_root(char *root, size_t len)
int cg_create(const char *cgroup) { - return mkdir(cgroup, 0644); + return mkdir(cgroup, 0755); }
int cg_wait_for_proc_count(const char *cgroup, int count)
From: Tejun Heo tj@kernel.org
commit 613e040e4dc285367bff0f8f75ea59839bc10947 upstream.
When a task is writing to an fd opened by a different task, the perm check should use the credentials of the latter task. Add a test for it.
Tested-by: Michal Koutný mkoutny@suse.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Ovidiu Panait ovidiu.panait@windriver.com --- tools/testing/selftests/cgroup/test_core.c | 68 ++++++++++++++++++++++ 1 file changed, 68 insertions(+)
diff --git a/tools/testing/selftests/cgroup/test_core.c b/tools/testing/selftests/cgroup/test_core.c index 3df648c37876..01b766506973 100644 --- a/tools/testing/selftests/cgroup/test_core.c +++ b/tools/testing/selftests/cgroup/test_core.c @@ -674,6 +674,73 @@ static int test_cgcore_thread_migration(const char *root) return ret; }
+/* + * cgroup migration permission check should be performed based on the + * credentials at the time of open instead of write. + */ +static int test_cgcore_lesser_euid_open(const char *root) +{ + const uid_t test_euid = 65534; /* usually nobody, any !root is fine */ + int ret = KSFT_FAIL; + char *cg_test_a = NULL, *cg_test_b = NULL; + char *cg_test_a_procs = NULL, *cg_test_b_procs = NULL; + int cg_test_b_procs_fd = -1; + uid_t saved_uid; + + cg_test_a = cg_name(root, "cg_test_a"); + cg_test_b = cg_name(root, "cg_test_b"); + + if (!cg_test_a || !cg_test_b) + goto cleanup; + + cg_test_a_procs = cg_name(cg_test_a, "cgroup.procs"); + cg_test_b_procs = cg_name(cg_test_b, "cgroup.procs"); + + if (!cg_test_a_procs || !cg_test_b_procs) + goto cleanup; + + if (cg_create(cg_test_a) || cg_create(cg_test_b)) + goto cleanup; + + if (cg_enter_current(cg_test_a)) + goto cleanup; + + if (chown(cg_test_a_procs, test_euid, -1) || + chown(cg_test_b_procs, test_euid, -1)) + goto cleanup; + + saved_uid = geteuid(); + if (seteuid(test_euid)) + goto cleanup; + + cg_test_b_procs_fd = open(cg_test_b_procs, O_RDWR); + + if (seteuid(saved_uid)) + goto cleanup; + + if (cg_test_b_procs_fd < 0) + goto cleanup; + + if (write(cg_test_b_procs_fd, "0", 1) >= 0 || errno != EACCES) + goto cleanup; + + ret = KSFT_PASS; + +cleanup: + cg_enter_current(root); + if (cg_test_b_procs_fd >= 0) + close(cg_test_b_procs_fd); + if (cg_test_b) + cg_destroy(cg_test_b); + if (cg_test_a) + cg_destroy(cg_test_a); + free(cg_test_b_procs); + free(cg_test_a_procs); + free(cg_test_b); + free(cg_test_a); + return ret; +} + #define T(x) { x, #x } struct corecg_test { int (*fn)(const char *root); @@ -689,6 +756,7 @@ struct corecg_test { T(test_cgcore_proc_migration), T(test_cgcore_thread_migration), T(test_cgcore_destroy), + T(test_cgcore_lesser_euid_open), }; #undef T
From: Tejun Heo tj@kernel.org
commit bf35a7879f1dfb0d050fe779168bcf25c7de66f5 upstream.
When a task is writing to an fd opened by a different task, the perm check should use the cgroup namespace of the latter task. Add a test for it.
Tested-by: Michal Koutný mkoutny@suse.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Ovidiu Panait ovidiu.panait@windriver.com --- tools/testing/selftests/cgroup/test_core.c | 97 ++++++++++++++++++++++ 1 file changed, 97 insertions(+)
diff --git a/tools/testing/selftests/cgroup/test_core.c b/tools/testing/selftests/cgroup/test_core.c index 01b766506973..600123503063 100644 --- a/tools/testing/selftests/cgroup/test_core.c +++ b/tools/testing/selftests/cgroup/test_core.c @@ -1,11 +1,14 @@ /* SPDX-License-Identifier: GPL-2.0 */
+#define _GNU_SOURCE #include <linux/limits.h> +#include <linux/sched.h> #include <sys/types.h> #include <sys/mman.h> #include <sys/wait.h> #include <unistd.h> #include <fcntl.h> +#include <sched.h> #include <stdio.h> #include <errno.h> #include <signal.h> @@ -741,6 +744,99 @@ static int test_cgcore_lesser_euid_open(const char *root) return ret; }
+struct lesser_ns_open_thread_arg { + const char *path; + int fd; + int err; +}; + +static int lesser_ns_open_thread_fn(void *arg) +{ + struct lesser_ns_open_thread_arg *targ = arg; + + targ->fd = open(targ->path, O_RDWR); + targ->err = errno; + return 0; +} + +/* + * cgroup migration permission check should be performed based on the cgroup + * namespace at the time of open instead of write. + */ +static int test_cgcore_lesser_ns_open(const char *root) +{ + static char stack[65536]; + const uid_t test_euid = 65534; /* usually nobody, any !root is fine */ + int ret = KSFT_FAIL; + char *cg_test_a = NULL, *cg_test_b = NULL; + char *cg_test_a_procs = NULL, *cg_test_b_procs = NULL; + int cg_test_b_procs_fd = -1; + struct lesser_ns_open_thread_arg targ = { .fd = -1 }; + pid_t pid; + int status; + + cg_test_a = cg_name(root, "cg_test_a"); + cg_test_b = cg_name(root, "cg_test_b"); + + if (!cg_test_a || !cg_test_b) + goto cleanup; + + cg_test_a_procs = cg_name(cg_test_a, "cgroup.procs"); + cg_test_b_procs = cg_name(cg_test_b, "cgroup.procs"); + + if (!cg_test_a_procs || !cg_test_b_procs) + goto cleanup; + + if (cg_create(cg_test_a) || cg_create(cg_test_b)) + goto cleanup; + + if (cg_enter_current(cg_test_b)) + goto cleanup; + + if (chown(cg_test_a_procs, test_euid, -1) || + chown(cg_test_b_procs, test_euid, -1)) + goto cleanup; + + targ.path = cg_test_b_procs; + pid = clone(lesser_ns_open_thread_fn, stack + sizeof(stack), + CLONE_NEWCGROUP | CLONE_FILES | CLONE_VM | SIGCHLD, + &targ); + if (pid < 0) + goto cleanup; + + if (waitpid(pid, &status, 0) < 0) + goto cleanup; + + if (!WIFEXITED(status)) + goto cleanup; + + cg_test_b_procs_fd = targ.fd; + if (cg_test_b_procs_fd < 0) + goto cleanup; + + if (cg_enter_current(cg_test_a)) + goto cleanup; + + if ((status = write(cg_test_b_procs_fd, "0", 1)) >= 0 || errno != ENOENT) + goto cleanup; + + ret = KSFT_PASS; + +cleanup: + cg_enter_current(root); + if (cg_test_b_procs_fd >= 0) + close(cg_test_b_procs_fd); + if (cg_test_b) + cg_destroy(cg_test_b); + if (cg_test_a) + cg_destroy(cg_test_a); + free(cg_test_b_procs); + free(cg_test_a_procs); + free(cg_test_b); + free(cg_test_a); + return ret; +} + #define T(x) { x, #x } struct corecg_test { int (*fn)(const char *root); @@ -757,6 +853,7 @@ struct corecg_test { T(test_cgcore_thread_migration), T(test_cgcore_destroy), T(test_cgcore_lesser_euid_open), + T(test_cgcore_lesser_ns_open), }; #undef T
On Thu, Apr 07, 2022 at 10:21:30AM +0300, Ovidiu Panait wrote:
CVE-2021-4197 patchset consists of: [1] 1756d7994ad8 ("cgroup: Use open-time credentials for process migraton perm checks") [2] 0d2b5955b362 ("cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv") [3] e57457641613 ("cgroup: Use open-time cgroup namespace for process migration perm checks") [4] b09c2baa5634 ("selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644") [5] 613e040e4dc2 ("selftests: cgroup: Test open-time credential usage for migration checks") [6] bf35a7879f1d ("selftests: cgroup: Test open-time cgroup namespace usage for migration checks")
Commits [2] and [3] are already preent in 5.10-stable, this patchset includes backports for the other commits.
Backport summary
1756d7994ad8 ("cgroup: Use open-time credentials for process migraton perm checks")
- Refactoring commit da70862efe006 ("cgroup: cgroup.{procs,threads} factor out common parts") is not present in kernel versions < 5.12, so the original changes to __cgroup_procs_write() had to be applied in both cgroup_threads_write() and cgroup_procs_write() functions.
c2e46f6b3e35 ("selftests/cgroup: Fix build on older distros")
- This extra commit was added to fix the following selftest build failure, applies cleanly: ... cgroup_util.c: In function ‘clone_into_cgroup’: group_util.c:343:4: error: ‘struct clone_args’ has no member named ‘cgroup’ 343 | .cgroup = cgroup_fd, | ^~~~~~
All other selftest changes are clean cherry-picks.
Testing
The newly introduced selftests (test_cgcore_lesser_euid_open() and test_cgcore_lesser_ns_open()) pass with this series applied:
root@intel-x86-64:~# ./test_core ok 1 test_cgcore_internal_process_constraint ok 2 test_cgcore_top_down_constraint_enable ok 3 test_cgcore_top_down_constraint_disable ok 4 test_cgcore_no_internal_process_constraint_os ok 5 test_cgcore_parent_becomes_threaded ok 6 test_cgcore_invalid_domain ok 7 test_cgcore_populated ok 8 test_cgcore_proc_migration ok 9 test_cgcore_thread_migration ok 10 test_cgcore_destroy ok 11 test_cgcore_lesser_euid_open ok 12 test_cgcore_lesser_ns_open
Sachin Sant (1): selftests/cgroup: Fix build on older distros
Tejun Heo (4): cgroup: Use open-time credentials for process migraton perm checks selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644 selftests: cgroup: Test open-time credential usage for migration checks selftests: cgroup: Test open-time cgroup namespace usage for migration checks
kernel/cgroup/cgroup-v1.c | 7 +- kernel/cgroup/cgroup.c | 17 +- tools/testing/selftests/cgroup/cgroup_util.c | 6 +- tools/testing/selftests/cgroup/test_core.c | 165 +++++++++++++++++++ 4 files changed, 188 insertions(+), 7 deletions(-)
-- 2.25.1
All now queued up, thanks.
greg k-h
linux-stable-mirror@lists.linaro.org