This is touching mm/mmap.c, please ensure to cc- the reviewers (me and Liam, I have cc'd Liam here) as listed in MAINTAINERS when submitting patches for this file.
Also this seems like a really speculative 'please discuss' change so should be an RFC imo.
On Tue, Oct 08, 2024 at 12:55:39AM +0200, Jann Horn wrote:
As explained in the comment block this change adds, we can't tell what userspace's intent is when the stack grows towards an inaccessible VMA.
I have a (highly contrived) C testcase for 32-bit x86 userspace with glibc that mixes malloc(), pthread creation, and recursion in just the right way such that the main stack overflows into malloc() arena memory. (Let me know if you want me to share that.)
You are claiming a fixes from 2017 and cc'ing stable on a non-RFC so yeah... we're going to need more than your word for it :) we will want to be really sure this is a thing before we backport to basically every stable kernel.
Surely this isn't that hard to demonstrate though? You mmap something PROT_NONE just stack gap below the stack, then intentionally extend stack to it, before mprotect()'ing the PROT_NONE region?
I don't know of any specific scenario where this is actually exploitable, but it seems like it could be a security problem for sufficiently unlucky userspace.
I believe we should ensure that, as long as code is compiled with something like -fstack-check, a stack overflow in it can never cause the main stack to overflow into adjacent heap memory.
My fix effectively reverts the behavior for !vma_is_accessible() VMAs to the behavior before commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas"), so I think it should be a fairly safe change even in case A.
Fixes: 561b5e0709e4 ("mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn jannh@google.com
I have tested that Libreoffice still launches after this change, though I don't know if that means anything.
Note that I haven't tested the growsup code.
mm/mmap.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 46 insertions(+), 7 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c index dd4b35a25aeb..971bfd6c1cea 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1064,10 +1064,12 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address) gap_addr = TASK_SIZE;
next = find_vma_intersection(mm, vma->vm_end, gap_addr);
- if (next && vma_is_accessible(next)) {
if (!(next->vm_flags & VM_GROWSUP))
- if (next && !(next->vm_flags & VM_GROWSUP)) {
/* see comments in expand_downwards() */
if (vma_is_accessible(prev))
return -ENOMEM;
if (address == next->vm_start) return -ENOMEM;
/* Check that both stack segments have the same anon_vma? */
I hate that we even maintain this for one single arch I believe at this point?
}
if (next) @@ -1155,10 +1157,47 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address) /* Enforce stack_guard_gap */ prev = vma_prev(&vmi); /* Check that both stack segments have the same anon_vma? */
- if (prev) {
if (!(prev->vm_flags & VM_GROWSDOWN) &&
vma_is_accessible(prev) &&
(address - prev->vm_end < stack_guard_gap))
- if (prev && !(prev->vm_flags & VM_GROWSDOWN) &&
(address - prev->vm_end < stack_guard_gap)) {
/*
* If the previous VMA is accessible, this is the normal case
* where the main stack is growing down towards some unrelated
* VMA. Enforce the full stack guard gap.
*/
if (vma_is_accessible(prev))
return -ENOMEM;
/*
* If the previous VMA is not accessible, we have a problem:
* We can't tell what userspace's intent is.
*
* Case A:
* Maybe userspace wants to use the previous VMA as a
* "guard region" at the bottom of the main stack, in which case
* userspace wants us to grow the stack until it is adjacent to
* the guard region. Apparently some Java runtime environments
* and Rust do that?
* That is kind of ugly, and in that case userspace really ought
* to ensure that the stack is fully expanded immediately, but
* we have to handle this case.
Yeah we can't break userspace on this, no doubt somebody is relying on this _somewhere_.
That said, I wish we disallowed this altogether regardless of accessibility.
*
* Case B:
* But maybe the previous VMA is entirely unrelated to the stack
* and is only *temporarily* PROT_NONE. For example, glibc
* malloc arenas create a big PROT_NONE region and then
* progressively mark parts of it as writable.
* In that case, we must not let the stack become adjacent to
* the previous VMA. Otherwise, after the region later becomes
* writable, a stack overflow will cause the stack to grow into
* the previous VMA, and we won't have any stack gap to protect
* against this.
Should be careful with terminology here, an mprotect() will not allow a merge so by 'grow into' you mean that a stack VMA could become immediately adjacent to a non-stack VMA prior to it which was later made writable.
Perhaps I am being pedantic...
*
* As an ugly tradeoff, enforce a single-page gap.
* A single page will hopefully be small enough to not be
* noticed in case A, while providing the same level of
* protection in case B that normal userspace threads get.
*/
if (address == prev->vm_end) return -ENOMEM;
Ugh, yuck. Not a fan of this at all. Feels like a dreadful hack.
You do raise an important point here, but it strikes me that we should be doing this check in the mprotect()/mmap() MAP_FIXED scenarios where it shouldn't be too costly to check against the next VMA (which we will be obtaining anyway for merge checks)?
That way we don't need a hack like this, and can just disallow the operation. That'd probably be as liable to break the program as an -ENOMEM on a stack expansion would...
}
base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b change-id: 20241008-stack-gap-inaccessible-c7319f7d4b1b -- Jann Horn jannh@google.com