Re: [PATCH v3 1/5] mm: fix off-by-one error in VMA count limit checks

14 Oct 2025

On Mon, Oct 13, 2025 at 11:28 PM Hugh Dickins hughd@google.com wrote:
...
On Mon, 13 Oct 2025, Kalesh Singh wrote:
...
The VMA count limit check in do_mmap() and do_brk_flags() uses a
strict inequality (>), which allows a process's VMA count to exceed
the configured sysctl_max_map_count limit by one.
A process with mm->map_count == sysctl_max_map_count will incorrectly
pass this check and then exceed the limit upon allocation of a new VMA
when its map_count is incremented.
Other VMA allocation paths, such as split_vma(), already use the
correct, inclusive (>=) comparison.
Fix this bug by changing the comparison to be inclusive in do_mmap()
and do_brk_flags(), bringing them in line with the correct behavior
of other allocation paths.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Cc: Andrew Morton akpm@linux-foundation.org
Cc: David Hildenbrand david@redhat.com
Cc: "Liam R. Howlett" Liam.Howlett@oracle.com
Cc: Lorenzo Stoakes lorenzo.stoakes@oracle.com
Cc: Mike Rapoport rppt@kernel.org
Cc: Minchan Kim minchan@kernel.org
Cc: Pedro Falcato pfalcato@suse.de
Reviewed-by: David Hildenbrand david@redhat.com
Reviewed-by: Lorenzo Stoakes lorenzo.stoakes@oracle.com
Reviewed-by: Pedro Falcato pfalcato@suse.de
Acked-by: SeongJae Park sj@kernel.org
Signed-off-by: Kalesh Singh kaleshsingh@google.com

Changes in v3:

Collect Reviewed-by and Acked-by tags.

Changes in v2:

Fix mmap check, per Pedro

mm/mmap.c | 2 +-
 mm/vma.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 644f02071a41..da2cbdc0f87b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -374,7 +374,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
              return -EOVERFLOW;
  /* Too many mappings? */


if (mm->map_count > sysctl_max_map_count)




if (mm->map_count >= sysctl_max_map_count)
        return -ENOMEM;

/*



diff --git a/mm/vma.c b/mm/vma.c
index a2e1ae954662..fba68f13e628 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -2797,7 +2797,7 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma,
      if (!may_expand_vm(mm, vm_flags, len >> PAGE_SHIFT))
              return -ENOMEM;

if (mm->map_count > sysctl_max_map_count)




if (mm->map_count >= sysctl_max_map_count)
        return -ENOMEM;

if (security_vm_enough_memory_mm(mm, len >> PAGE_SHIFT))



--
2.51.0.760.g7b8bcc2412-goog
Sorry for letting you go so far before speaking up (I had to test what
I believed to be true, and had hoped that meanwhile one of your many
illustrious reviewers would say so first, but no): it's a NAK from me.
These are not off-by-ones: at the point of these checks, it is not
known whether an additional map/vma will have to be added, or the
addition will be merged into an existing map/vma.  So the checks
err on the lenient side, letting you get perhaps one more than the
sysctl said, but not allowing any more than that.
Which is all that matters, isn't it? Limiting unrestrained growth.
In this patch you're proposing to change it from erring on the
lenient side to erring on the strict side - prohibiting merges
at the limit which have been allowed for many years.
Whatever one thinks about the merits of erring on the lenient versus
erring on the strict side, I see no reason to make this change now,
and most certainly not with a Fixes Cc: stable. There is no danger
in the current behaviour; there is danger in prohibiting what was
allowed before.
As to the remainder of your series: I have to commend you for doing
a thorough and well-presented job, but I cannot myself see the point in
changing 21 files for what almost amounts to a max_map_count subsystem.
I call it misdirected effort, not at all to my taste, which prefers the
straightforward checks already there; but accept that my taste may be
out of fashion, so won't stand in the way if others think it worthwhile.
Hi Hugh,
Thanks for the detailed review and for taking the time to test the behavior.
You've raised a valid point. I wasn't aware of the history behind the
lenient check for merges. The lack of a comment, like the one that
exists for exceeding the limit in munmap(), led me to misinterpret
this as an off-by-one bug. The convention makes sense if we consider
potential merges.
If it was in-fact the intended behavior, then I agree we should keep
it lenient. It would mean though, that munmap() being able to free a
VMA if a split is required (by permitting exceeding the limit by 1)
would not work in the case where we have already exceeded the limit. I
find this to be inconsistent but this is also the current behavior ...
I will drop this patch and the patch that introduces the
vma_count_remaining() helper, as I see your point about it potentially
being unnecessary overhead.
Regarding your feedback on the rest of the series, I believe the 3
remaining patches are still valuable on their own.
- The selftest adds a comprehensive tests for VMA operations at the
sysctl_max_map_count limit. This will self-document the exact behavior
expected, including the leniency for potential merges that you
highlighted, preventing the kind of misunderstanding that led to my
initial patch.
- The rename of mm_struct->map_count to vma_count, is a
straightforward cleanup for code clarity that makes the purpose of the
field more explicit.
- The tracepoint adds needed observability for telemetry, allowing us
to see when processes are failing in the field due to VMA count limit.
The  selftest, is what  makes up a large portion of the diff you
sited, and with vma_count_remaining() gone the series will not touch
nearly as many files.
Would this be an acceptable path forward?
Thanks,
Kalesh
...
Hugh

    

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH v3 1/5] mm: fix off-by-one error in VMA count limit checks