Quoting Ariadne Conill:
"In several other operating systems, it is a hard requirement that the first argument to execve(2) be the name of a program, thus prohibiting a scenario where argc < 1. POSIX 2017 also recommends this behaviour, but it is not an explicit requirement[1]:
The argument arg0 should point to a filename string that is associated with the process being started by one of the exec functions. ... Interestingly, Michael Kerrisk opened an issue about this in 2008[2], but there was no consensus to support fixing this issue then. Hopefully now that CVE-2021-4034 shows practical exploitative use[3] of this bug in a shellcode, we can reconsider."
An examination of existing[4] users of execve(..., NULL, NULL) shows mostly test code, or example rootkit code. While rejecting a NULL argv would be preferred, it looks like the main cause of userspace confusion is an assumption that argc >= 1, and buggy programs may skip argv[0] when iterating. To protect against userspace bugs of this nature, insert an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
Note that this is only done in the argc == 0 case because some userspace programs expect to find envp at exactly argv[argc]. The overlap of these two misguided assumptions is believed to be zero.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html [2] https://bugzilla.kernel.org/show_bug.cgi?id=8408 [3] https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt [4] https://codesearch.debian.net/search?q=execve%5C+*%5C%28%5B%5E%2C%5D%2B%2C+*...
Reported-by: Ariadne Conill ariadne@dereferenced.org Reported-by: Michael Kerrisk mtk.manpages@gmail.com Cc: Matthew Wilcox willy@infradead.org Cc: Christian Brauner brauner@kernel.org Cc: Rich Felker dalias@libc.org Cc: Eric Biederman ebiederm@xmission.com Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Kees Cook keescook@chromium.org --- fs/binfmt_elf.c | 10 +++++++++- fs/exec.c | 7 ++++++- 2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 605017eb9349..e456c48658ad 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -297,7 +297,8 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, ei_index = elf_info - (elf_addr_t *)mm->saved_auxv; sp = STACK_ADD(p, ei_index);
- items = (argc + 1) + (envc + 1) + 1; + /* Make room for extra pointer when argc == 0. See below. */ + items = (min(argc, 1) + 1) + (envc + 1) + 1; bprm->p = STACK_ROUND(sp, items);
/* Point sp at the lowest address on the stack */ @@ -326,6 +327,13 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
/* Populate list of argv pointers back to argv strings. */ p = mm->arg_end = mm->arg_start; + /* + * Include an extra NULL pointer in argv when argc == 0 so + * that argv[1] != envp[0] to help userspace programs from + * mishandling argc == 0. See fs/exec.c bprm_stack_limits(). + */ + if (argc == 0 && put_user(0, sp++)) + return -EFAULT; while (argc-- > 0) { size_t len; if (put_user((elf_addr_t)p, sp++)) diff --git a/fs/exec.c b/fs/exec.c index 79f2c9483302..0b36384e55b1 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -495,8 +495,13 @@ static int bprm_stack_limits(struct linux_binprm *bprm) * the stack. They aren't stored until much later when we can't * signal to the parent that the child has run out of stack space. * Instead, calculate it here so it's possible to fail gracefully. + * + * In the case of argc < 1, make sure there is a NULL pointer gap + * between argv and envp to ensure confused userspace programs don't + * start processing from argv[1], thinking argc can never be 0, + * to block them from walking envp by accident. See fs/binfmt_elf.c. */ - ptr_size = (bprm->argc + bprm->envc) * sizeof(void *); + ptr_size = (min(bprm->argc, 1) + bprm->envc) * sizeof(void *); if (limit <= ptr_size) return -E2BIG; limit -= ptr_size;
On Wed, Jan 26, 2022 at 6:58 PM Kees Cook keescook@chromium.org wrote:
Quoting Ariadne Conill:
"In several other operating systems, it is a hard requirement that the first argument to execve(2) be the name of a program, thus prohibiting a scenario where argc < 1. POSIX 2017 also recommends this behaviour, but it is not an explicit requirement[1]:
The argument arg0 should point to a filename string that is associated with the process being started by one of the exec functions.
... Interestingly, Michael Kerrisk opened an issue about this in 2008[2], but there was no consensus to support fixing this issue then. Hopefully now that CVE-2021-4034 shows practical exploitative use[3] of this bug in a shellcode, we can reconsider."
An examination of existing[4] users of execve(..., NULL, NULL) shows mostly test code, or example rootkit code. While rejecting a NULL argv would be preferred, it looks like the main cause of userspace confusion is an assumption that argc >= 1, and buggy programs may skip argv[0] when iterating. To protect against userspace bugs of this nature, insert an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
Note that this is only done in the argc == 0 case because some userspace programs expect to find envp at exactly argv[argc]. The overlap of these two misguided assumptions is believed to be zero.
Will this result in the executed program being told that argc==0 but having an extra NULL pointer on the stack? If so, I believe this breaks the x86-64 ABI documented at https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf - page 29, figure 3.9 describes the layout of the initial process stack.
Actually, does this even work? Can a program still properly access its environment variables when invoked with argc==0 with this patch applied? AFAIU the way userspace locates envv on x86-64 is by calculating 8*(argc+1)?
Hi,
On Wed, 26 Jan 2022, Jann Horn wrote:
On Wed, Jan 26, 2022 at 6:58 PM Kees Cook keescook@chromium.org wrote:
Quoting Ariadne Conill:
"In several other operating systems, it is a hard requirement that the first argument to execve(2) be the name of a program, thus prohibiting a scenario where argc < 1. POSIX 2017 also recommends this behaviour, but it is not an explicit requirement[1]:
The argument arg0 should point to a filename string that is associated with the process being started by one of the exec functions.
... Interestingly, Michael Kerrisk opened an issue about this in 2008[2], but there was no consensus to support fixing this issue then. Hopefully now that CVE-2021-4034 shows practical exploitative use[3] of this bug in a shellcode, we can reconsider."
An examination of existing[4] users of execve(..., NULL, NULL) shows mostly test code, or example rootkit code. While rejecting a NULL argv would be preferred, it looks like the main cause of userspace confusion is an assumption that argc >= 1, and buggy programs may skip argv[0] when iterating. To protect against userspace bugs of this nature, insert an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
Note that this is only done in the argc == 0 case because some userspace programs expect to find envp at exactly argv[argc]. The overlap of these two misguided assumptions is believed to be zero.
Will this result in the executed program being told that argc==0 but having an extra NULL pointer on the stack? If so, I believe this breaks the x86-64 ABI documented at https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf - page 29, figure 3.9 describes the layout of the initial process stack.
I'm presently compiling a kernel with the patch to see if it works or not.
Actually, does this even work? Can a program still properly access its environment variables when invoked with argc==0 with this patch applied? AFAIU the way userspace locates envv on x86-64 is by calculating 8*(argc+1)?
In the other thread, it was suggested that perhaps we should set up an argv of {"", NULL}. In that case, it seems like it would be safe to claim argc == 1.
What do you think?
Ariadne
On Wed, Jan 26, 2022 at 7:42 PM Ariadne Conill ariadne@dereferenced.org wrote:
On Wed, 26 Jan 2022, Jann Horn wrote:
On Wed, Jan 26, 2022 at 6:58 PM Kees Cook keescook@chromium.org wrote:
Quoting Ariadne Conill:
"In several other operating systems, it is a hard requirement that the first argument to execve(2) be the name of a program, thus prohibiting a scenario where argc < 1. POSIX 2017 also recommends this behaviour, but it is not an explicit requirement[1]:
The argument arg0 should point to a filename string that is associated with the process being started by one of the exec functions.
... Interestingly, Michael Kerrisk opened an issue about this in 2008[2], but there was no consensus to support fixing this issue then. Hopefully now that CVE-2021-4034 shows practical exploitative use[3] of this bug in a shellcode, we can reconsider."
An examination of existing[4] users of execve(..., NULL, NULL) shows mostly test code, or example rootkit code. While rejecting a NULL argv would be preferred, it looks like the main cause of userspace confusion is an assumption that argc >= 1, and buggy programs may skip argv[0] when iterating. To protect against userspace bugs of this nature, insert an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
Note that this is only done in the argc == 0 case because some userspace programs expect to find envp at exactly argv[argc]. The overlap of these two misguided assumptions is believed to be zero.
Will this result in the executed program being told that argc==0 but having an extra NULL pointer on the stack? If so, I believe this breaks the x86-64 ABI documented at https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf - page 29, figure 3.9 describes the layout of the initial process stack.
I'm presently compiling a kernel with the patch to see if it works or not.
Actually, does this even work? Can a program still properly access its environment variables when invoked with argc==0 with this patch applied? AFAIU the way userspace locates envv on x86-64 is by calculating 8*(argc+1)?
In the other thread, it was suggested that perhaps we should set up an argv of {"", NULL}. In that case, it seems like it would be safe to claim argc == 1.
What do you think?
Sounds good to me, since that's something that could also happen normally if userspace calls execve(..., {"", NULL}, ...).
(I'd like it even better if we could just bail out with an error code, but I guess the risk of breakage might be too high with that approach?)
On Wed, Jan 26, 2022 at 08:50:39PM +0100, Jann Horn wrote:
On Wed, Jan 26, 2022 at 7:42 PM Ariadne Conill ariadne@dereferenced.org wrote:
On Wed, 26 Jan 2022, Jann Horn wrote:
On Wed, Jan 26, 2022 at 6:58 PM Kees Cook keescook@chromium.org wrote:
Quoting Ariadne Conill:
"In several other operating systems, it is a hard requirement that the first argument to execve(2) be the name of a program, thus prohibiting a scenario where argc < 1. POSIX 2017 also recommends this behaviour, but it is not an explicit requirement[1]:
The argument arg0 should point to a filename string that is associated with the process being started by one of the exec functions.
... Interestingly, Michael Kerrisk opened an issue about this in 2008[2], but there was no consensus to support fixing this issue then. Hopefully now that CVE-2021-4034 shows practical exploitative use[3] of this bug in a shellcode, we can reconsider."
An examination of existing[4] users of execve(..., NULL, NULL) shows mostly test code, or example rootkit code. While rejecting a NULL argv would be preferred, it looks like the main cause of userspace confusion is an assumption that argc >= 1, and buggy programs may skip argv[0] when iterating. To protect against userspace bugs of this nature, insert an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
Note that this is only done in the argc == 0 case because some userspace programs expect to find envp at exactly argv[argc]. The overlap of these two misguided assumptions is believed to be zero.
Will this result in the executed program being told that argc==0 but having an extra NULL pointer on the stack? If so, I believe this breaks the x86-64 ABI documented at https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf - page 29, figure 3.9 describes the layout of the initial process stack.
I'm presently compiling a kernel with the patch to see if it works or not.
Actually, does this even work? Can a program still properly access its environment variables when invoked with argc==0 with this patch applied? AFAIU the way userspace locates envv on x86-64 is by calculating 8*(argc+1)?
In the other thread, it was suggested that perhaps we should set up an argv of {"", NULL}. In that case, it seems like it would be safe to claim argc == 1.
What do you think?
Sounds good to me, since that's something that could also happen normally if userspace calls execve(..., {"", NULL}, ...).
(I'd like it even better if we could just bail out with an error code, but I guess the risk of breakage might be too high with that approach?)
We can't mutate argc; it'll turn at least some userspace into an infinite loop: https://sources.debian.org/src/valgrind/1:3.18.1-1/none/tests/execve.c/?hl=2...
On Wed, Jan 26, 2022 at 11:58:39AM -0800, Kees Cook wrote:
On Wed, Jan 26, 2022 at 08:50:39PM +0100, Jann Horn wrote:
On Wed, Jan 26, 2022 at 7:42 PM Ariadne Conill ariadne@dereferenced.org wrote:
On Wed, 26 Jan 2022, Jann Horn wrote:
On Wed, Jan 26, 2022 at 6:58 PM Kees Cook keescook@chromium.org wrote:
Quoting Ariadne Conill:
"In several other operating systems, it is a hard requirement that the first argument to execve(2) be the name of a program, thus prohibiting a scenario where argc < 1. POSIX 2017 also recommends this behaviour, but it is not an explicit requirement[1]:
The argument arg0 should point to a filename string that is associated with the process being started by one of the exec functions.
... Interestingly, Michael Kerrisk opened an issue about this in 2008[2], but there was no consensus to support fixing this issue then. Hopefully now that CVE-2021-4034 shows practical exploitative use[3] of this bug in a shellcode, we can reconsider."
An examination of existing[4] users of execve(..., NULL, NULL) shows mostly test code, or example rootkit code. While rejecting a NULL argv would be preferred, it looks like the main cause of userspace confusion is an assumption that argc >= 1, and buggy programs may skip argv[0] when iterating. To protect against userspace bugs of this nature, insert an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
Note that this is only done in the argc == 0 case because some userspace programs expect to find envp at exactly argv[argc]. The overlap of these two misguided assumptions is believed to be zero.
Will this result in the executed program being told that argc==0 but having an extra NULL pointer on the stack? If so, I believe this breaks the x86-64 ABI documented at https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf - page 29, figure 3.9 describes the layout of the initial process stack.
I'm presently compiling a kernel with the patch to see if it works or not.
Actually, does this even work? Can a program still properly access its environment variables when invoked with argc==0 with this patch applied? AFAIU the way userspace locates envv on x86-64 is by calculating 8*(argc+1)?
In the other thread, it was suggested that perhaps we should set up an argv of {"", NULL}. In that case, it seems like it would be safe to claim argc == 1.
What do you think?
Sounds good to me, since that's something that could also happen normally if userspace calls execve(..., {"", NULL}, ...).
(I'd like it even better if we could just bail out with an error code, but I guess the risk of breakage might be too high with that approach?)
We can't mutate argc; it'll turn at least some userspace into an infinite loop: https://sources.debian.org/src/valgrind/1:3.18.1-1/none/tests/execve.c/?hl=2...
How does that become an infinite loop? We obviously wouldn't mutate argc in the caller, just the callee.
Also, there's a version of this where we only mutate argc if we're executing a setuid program, which would remove the privilege escalation part of things.
On Wed, Jan 26, 2022 at 08:08:14PM +0000, Matthew Wilcox wrote:
On Wed, Jan 26, 2022 at 11:58:39AM -0800, Kees Cook wrote:
We can't mutate argc; it'll turn at least some userspace into an infinite loop: https://sources.debian.org/src/valgrind/1:3.18.1-1/none/tests/execve.c/?hl=2...
How does that become an infinite loop? We obviously wouldn't mutate argc in the caller, just the callee.
Oh, sorry, I misread. It's using /bin/true, not argv[0] (another bit of code I found was using argv[0]). Yeah, {"", NULL} could work.
Also, there's a version of this where we only mutate argc if we're executing a setuid program, which would remove the privilege escalation part of things.
True; though I'd like to keep the logic as non-specialized as possible. I don't like making stuff conditional on privilege boundaries if we can make it always happen.
Kees Cook keescook@chromium.org writes:
On Wed, Jan 26, 2022 at 08:08:14PM +0000, Matthew Wilcox wrote:
On Wed, Jan 26, 2022 at 11:58:39AM -0800, Kees Cook wrote:
We can't mutate argc; it'll turn at least some userspace into an infinite loop: https://sources.debian.org/src/valgrind/1:3.18.1-1/none/tests/execve.c/?hl=2...
How does that become an infinite loop? We obviously wouldn't mutate argc in the caller, just the callee.
Oh, sorry, I misread. It's using /bin/true, not argv[0] (another bit of code I found was using argv[0]). Yeah, {"", NULL} could work.
Also, there's a version of this where we only mutate argc if we're executing a setuid program, which would remove the privilege escalation part of things.
True; though I'd like to keep the logic as non-specialized as possible. I don't like making stuff conditional on privilege boundaries if we can make it always happen.
Which I think means turning the argc == 0 case into { "", NULL }. I think we can always do that, and it is already valid in userspace.
The only case I can imagine breaking would be an explicitly testing for argc == 0 and behaving completely differently if that is passed to the program.
Eric
On Wed, Jan 26, 2022 at 07:07:20PM +0100, Jann Horn wrote:
On Wed, Jan 26, 2022 at 6:58 PM Kees Cook keescook@chromium.org wrote:
Quoting Ariadne Conill:
"In several other operating systems, it is a hard requirement that the first argument to execve(2) be the name of a program, thus prohibiting a scenario where argc < 1. POSIX 2017 also recommends this behaviour, but it is not an explicit requirement[1]:
The argument arg0 should point to a filename string that is associated with the process being started by one of the exec functions.
... Interestingly, Michael Kerrisk opened an issue about this in 2008[2], but there was no consensus to support fixing this issue then. Hopefully now that CVE-2021-4034 shows practical exploitative use[3] of this bug in a shellcode, we can reconsider."
An examination of existing[4] users of execve(..., NULL, NULL) shows mostly test code, or example rootkit code. While rejecting a NULL argv would be preferred, it looks like the main cause of userspace confusion is an assumption that argc >= 1, and buggy programs may skip argv[0] when iterating. To protect against userspace bugs of this nature, insert an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
Note that this is only done in the argc == 0 case because some userspace programs expect to find envp at exactly argv[argc]. The overlap of these two misguided assumptions is believed to be zero.
Will this result in the executed program being told that argc==0 but having an extra NULL pointer on the stack? If so, I believe this breaks the x86-64 ABI documented at https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf - page 29, figure 3.9 describes the layout of the initial process stack.
Actually, does this even work? Can a program still properly access its environment variables when invoked with argc==0 with this patch applied? AFAIU the way userspace locates envv on x86-64 is by calculating 8*(argc+1)?
Hrm, yeah, I guess it's libc providing the envp pointer; it's not passes separately. Hrm.
Hi,
On Wed, 26 Jan 2022, Kees Cook wrote:
Quoting Ariadne Conill:
"In several other operating systems, it is a hard requirement that the first argument to execve(2) be the name of a program, thus prohibiting a scenario where argc < 1. POSIX 2017 also recommends this behaviour, but it is not an explicit requirement[1]:
The argument arg0 should point to a filename string that is associated with the process being started by one of the exec functions. ... Interestingly, Michael Kerrisk opened an issue about this in 2008[2], but there was no consensus to support fixing this issue then. Hopefully now that CVE-2021-4034 shows practical exploitative use[3] of this bug in a shellcode, we can reconsider."
An examination of existing[4] users of execve(..., NULL, NULL) shows mostly test code, or example rootkit code. While rejecting a NULL argv would be preferred, it looks like the main cause of userspace confusion is an assumption that argc >= 1, and buggy programs may skip argv[0] when iterating. To protect against userspace bugs of this nature, insert an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
Note that this is only done in the argc == 0 case because some userspace programs expect to find envp at exactly argv[argc]. The overlap of these two misguided assumptions is believed to be zero.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html [2] https://bugzilla.kernel.org/show_bug.cgi?id=8408 [3] https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt [4] https://codesearch.debian.net/search?q=execve%5C+*%5C%28%5B%5E%2C%5D%2B%2C+*...
Reported-by: Ariadne Conill ariadne@dereferenced.org Reported-by: Michael Kerrisk mtk.manpages@gmail.com Cc: Matthew Wilcox willy@infradead.org Cc: Christian Brauner brauner@kernel.org Cc: Rich Felker dalias@libc.org Cc: Eric Biederman ebiederm@xmission.com Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Kees Cook keescook@chromium.org
Tested-by: Ariadne Conill ariadne@dereferenced.org
It seems to work, but I still think bailing early with -EINVAL is a more reasonable position to take. For example, the following code, when used with BusyBox applets results in a segfault, as the multicall stub does not support scenarios where argc < 1:
#include <stdio.h> #include <unistd.h> #include <sys/syscall.h>
int main(int argc, const char **argv) { if (syscall(SYS_execve, "/bin/date", NULL, NULL) < 0) perror("execve"); return 0; }
Ariadne
Hi,
On Wed, 26 Jan 2022, Ariadne Conill wrote:
Hi,
On Wed, 26 Jan 2022, Kees Cook wrote:
Quoting Ariadne Conill:
"In several other operating systems, it is a hard requirement that the first argument to execve(2) be the name of a program, thus prohibiting a scenario where argc < 1. POSIX 2017 also recommends this behaviour, but it is not an explicit requirement[1]:
The argument arg0 should point to a filename string that is associated with the process being started by one of the exec functions. ... Interestingly, Michael Kerrisk opened an issue about this in 2008[2], but there was no consensus to support fixing this issue then. Hopefully now that CVE-2021-4034 shows practical exploitative use[3] of this bug in a shellcode, we can reconsider."
An examination of existing[4] users of execve(..., NULL, NULL) shows mostly test code, or example rootkit code. While rejecting a NULL argv would be preferred, it looks like the main cause of userspace confusion is an assumption that argc >= 1, and buggy programs may skip argv[0] when iterating. To protect against userspace bugs of this nature, insert an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
Note that this is only done in the argc == 0 case because some userspace programs expect to find envp at exactly argv[argc]. The overlap of these two misguided assumptions is believed to be zero.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html [2] https://bugzilla.kernel.org/show_bug.cgi?id=8408 [3] https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt [4] https://codesearch.debian.net/search?q=execve%5C+*%5C%28%5B%5E%2C%5D%2B%2C+*...
Reported-by: Ariadne Conill ariadne@dereferenced.org Reported-by: Michael Kerrisk mtk.manpages@gmail.com Cc: Matthew Wilcox willy@infradead.org Cc: Christian Brauner brauner@kernel.org Cc: Rich Felker dalias@libc.org Cc: Eric Biederman ebiederm@xmission.com Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Kees Cook keescook@chromium.org
Tested-by: Ariadne Conill ariadne@dereferenced.org
It seems to work, but I still think bailing early with -EINVAL is a more reasonable position to take. For example, the following code, when used with BusyBox applets results in a segfault, as the multicall stub does not support scenarios where argc < 1:
#include <stdio.h> #include <unistd.h> #include <sys/syscall.h>
int main(int argc, const char **argv) { if (syscall(SYS_execve, "/bin/date", NULL, NULL) < 0) perror("execve"); return 0; }
Further testing indicates that while things *mostly* work, it results in memory corruption in various tasks, for example, trying to build a new kernel hung, and the gcc process's name was a bunch of uninitialized memory. So, I don't think { NULL, NULL } is a good way to go.
Ariadne
On Wed, Jan 26, 2022 at 09:57:47AM -0800, Kees Cook wrote:
Quoting Ariadne Conill:
"In several other operating systems, it is a hard requirement that the first argument to execve(2) be the name of a program, thus prohibiting a scenario where argc < 1. POSIX 2017 also recommends this behaviour, but it is not an explicit requirement[1]:
The argument arg0 should point to a filename string that is associated with the process being started by one of the exec functions.
... Interestingly, Michael Kerrisk opened an issue about this in 2008[2], but there was no consensus to support fixing this issue then. Hopefully now that CVE-2021-4034 shows practical exploitative use[3] of this bug in a shellcode, we can reconsider."
An examination of existing[4] users of execve(..., NULL, NULL) shows mostly test code, or example rootkit code. While rejecting a NULL argv would be preferred, it looks like the main cause of userspace confusion is an assumption that argc >= 1, and buggy programs may skip argv[0] when iterating. To protect against userspace bugs of this nature, insert an extra NULL pointer in argv when argc == 0, so that argv[1] != envp[0].
Note that this is only done in the argc == 0 case because some userspace programs expect to find envp at exactly argv[argc]. The overlap of these two misguided assumptions is believed to be zero.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html [2] https://bugzilla.kernel.org/show_bug.cgi?id=8408 [3] https://www.qualys.com/2022/01/25/cve-2021-4034/pwnkit.txt [4] https://codesearch.debian.net/search?q=execve%5C+*%5C%28%5B%5E%2C%5D%2B%2C+*...
Reported-by: Ariadne Conill ariadne@dereferenced.org Reported-by: Michael Kerrisk mtk.manpages@gmail.com Cc: Matthew Wilcox willy@infradead.org Cc: Christian Brauner brauner@kernel.org Cc: Rich Felker dalias@libc.org Cc: Eric Biederman ebiederm@xmission.com Cc: Alexander Viro viro@zeniv.linux.org.uk Cc: linux-fsdevel@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Kees Cook keescook@chromium.org
fs/binfmt_elf.c | 10 +++++++++- fs/exec.c | 7 ++++++- 2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 605017eb9349..e456c48658ad 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -297,7 +297,8 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, ei_index = elf_info - (elf_addr_t *)mm->saved_auxv; sp = STACK_ADD(p, ei_index);
- items = (argc + 1) + (envc + 1) + 1;
- /* Make room for extra pointer when argc == 0. See below. */
- items = (min(argc, 1) + 1) + (envc + 1) + 1; bprm->p = STACK_ROUND(sp, items);
/* Point sp at the lowest address on the stack */ @@ -326,6 +327,13 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec, /* Populate list of argv pointers back to argv strings. */ p = mm->arg_end = mm->arg_start;
- /*
* Include an extra NULL pointer in argv when argc == 0 so
* that argv[1] != envp[0] to help userspace programs from
* mishandling argc == 0. See fs/exec.c bprm_stack_limits().
*/
- if (argc == 0 && put_user(0, sp++))
while (argc-- > 0) { size_t len; if (put_user((elf_addr_t)p, sp++))return -EFAULT;
diff --git a/fs/exec.c b/fs/exec.c index 79f2c9483302..0b36384e55b1 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -495,8 +495,13 @@ static int bprm_stack_limits(struct linux_binprm *bprm) * the stack. They aren't stored until much later when we can't * signal to the parent that the child has run out of stack space. * Instead, calculate it here so it's possible to fail gracefully.
*
* In the case of argc < 1, make sure there is a NULL pointer gap
* between argv and envp to ensure confused userspace programs don't
* start processing from argv[1], thinking argc can never be 0,
*/* to block them from walking envp by accident. See fs/binfmt_elf.c.
- ptr_size = (bprm->argc + bprm->envc) * sizeof(void *);
- ptr_size = (min(bprm->argc, 1) + bprm->envc) * sizeof(void *); if (limit <= ptr_size) return -E2BIG; limit -= ptr_size;
-- 2.30.2
This patch is not just wrong, but extremely dangerously wrong, to the point that it may make all suid-root binaries exploitable (at least dynamic linked ones).
The ELF entry point contract is that argv+argc+1==envp, and in fact this is the "preferred" way of computing envp so as to avoid linear search over argv. In musl's dynamic linker we do exactly that; I'm not sure about glibc's. See:
https://git.musl-libc.org/cgit/musl/tree/ldso/dynlink.c?id=v1.2.2#n1740
If argv[argc+1] wrongly contains a null pointer, semantically, that means the environment is empty and auxv starts at the next stack slot. It's an exercise for the reader to populate the environment in a way that this memory wrongly gets interpreted as a meaningful auxv. I'm not sure this is possible, but I wouldn't automatically rule it out.
In short: YOU CANNOT CHANGE/BREAK CONTRACTS TO MITIGATE A VULN. Doing so just makes new vulns in the programs that were correct before.
Silently replacing argc==0 with argc==1 and argv[0]=="" would be a safe variant of this, but I'm really in favor of just erroring out, but *only doing it when the exec is a privilege boundary* (suid/etc.) to minimize the chance of breaking software dependent on allowing argc==0.
Rich
Greeting,
FYI, we noticed the following commit (built with gcc-9):
commit: 4736b95ed241d76c59d34859cb77703cf587dcee ("[PATCH] fs/binfmt_elf: Add padding NULL when argc == 0") url: https://github.com/0day-ci/linux/commits/Kees-Cook/fs-binfmt_elf-Add-padding... base: https://git.kernel.org/cgit/linux/kernel/git/kees/linux.git for-next/pstore patch link: https://lore.kernel.org/linux-fsdevel/20220126175747.3270945-1-keescook@chro...
in testcase: kernel-selftests version: kernel-selftests-x86_64-f050cde9-1_20220127 with following parameters:
group: x86
test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel. test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
If you fix the issue, kindly add following tag Reported-by: kernel test robot oliver.sang@intel.com
gcc -m32 -o /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-4736b95ed241d76c59d34859cb77703cf587dcee/tools/testing/selftests/x86/test_FCOMI_32 -O2 -g -std=gnu99 -p thread -Wall -no-pie -DCAN_BUILD_32 test_FCOMI.c helpers.h -lrt -ldl -lm /usr/bin/ld: bad -plugin-opt option : error: ld returned 1 exit status make: *** [Makefile:75: /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-4736b95ed241d76c59d34859cb77703cf587dcee/tools/testing/selftests/x86/test_FCOMI_32] Error 1
please be noted above failure detail is in attachments.
actually we also saw other types of make failure which all not observed upon parent.
(1) gcc -m32 -o /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-4736b95ed241d76c59d34859cb77703cf587dcee/tools/testing/selftests/x86/test_mremap_vdso_32 -O2 -g -std=gn u99 -pthread -Wall -no-pie -DCAN_BUILD_32 -DCAN_BUILD_64 test_mremap_vdso.c helpers.h -lrt -ldl -lm : error: : No such file or directory : error: ^_: No such file or directory : error: : No such file or directory make: *** [Makefile:75: /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-4736b95ed241d76c59d34859cb77703cf587dcee/tools/testing/selftests/x86/test_mremap_vdso_32] E rror 1
(2) gcc -m32 -o /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-4736b95ed241d76c59d34859cb77703cf587dcee/tools/testing/selftests/x86/check_initial_reg_state_32 -O2 -g -std=gnu99 -pthread -Wall -no-pie -Wl,-ereal_start -static -DCAN_BUILD_32 -DCAN_BUILD_64 check_initial_reg_state.c helpers.h -lrt -ldl -lm : error: too many filenames given. Type --help for usage : fatal error: ~W▒~▒^?: No such file or directory compilation terminated. make: *** [Makefile:75: /usr/src/perf_selftests-x86_64-rhel-8.3-kselftests-4736b95ed241d76c59d34859cb77703cf587dcee/tools/testing/selftests/x86/check_initial_reg_stat e_32] Error 1
To reproduce:
# build kernel cd linux cp config-5.16.0-rc1-00002-g4736b95ed241 .config make HOSTCC=gcc-9 CC=gcc-9 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules make HOSTCC=gcc-9 CC=gcc-9 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install cd <mod-install-dir> find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
git clone https://github.com/intel/lkp-tests.git cd lkp-tests bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
# if come across any failure that blocks the test, # please remove ~/.lkp and /lkp dir to run from a clean state.
--- 0DAY/LKP+ Test Infrastructure Open Source Technology Center https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks, Oliver Sang
linux-stable-mirror@lists.linaro.org