Hi all,
This series addresses an off-by-one bug in the VMA count limit check and introduces several improvements for clarity, test coverage, and observability around the VMA limit mechanism.
The VMA count limit, controlled by sysctl_max_map_count, is a critical safeguard that prevents a single process from consuming excessive kernel memory by creating too many memory mappings. However, the checks in do_mmap() and do_brk_flags() used a strict inequality, allowing a process to exceed this limit by one VMA.
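To make the off-by-one concrete, here is a minimal userspace sketch, not the kernel code itself. The names max_map_count, may_add_vma_buggy(), may_add_vma_fixed() and fill_until_refused() are hypothetical, and the limit of 4 is chosen only to keep the numbers small. It assumes the pre-fix check had the shape "count > limit" and the fixed check "count >= limit".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for sysctl_max_map_count; tiny for illustration. */
static int max_map_count = 4;

/* Assumed pre-fix shape: refuse only once the count already exceeds the limit. */
static bool may_add_vma_buggy(int map_count)
{
	assert(map_count >= 0);
	return !(map_count > max_map_count);
}

/* Assumed post-fix shape: refuse as soon as the count has reached the limit. */
static bool may_add_vma_fixed(int map_count)
{
	assert(map_count >= 0);
	return !(map_count >= max_map_count);
}

/* Add one mapping at a time until the check refuses; return the final count. */
static int fill_until_refused(bool (*may_add)(int))
{
	int count = 0;

	while (may_add(count))
		count++;
	return count;
}
```

With the limit set to 4, fill_until_refused(may_add_vma_buggy) stops at 5 mappings, while fill_until_refused(may_add_vma_fixed) stops at 4.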
This series begins by fixing this long-standing bug. The subsequent patches build on this by improving the surrounding code. A comprehensive selftest is added to validate VMA operations near the limit, preventing future regressions. The open-coded limit checks are replaced with a centralized helper, vma_count_remaining(), to improve readability. For better code clarity, mm_struct->map_count is renamed to the more apt vma_count.
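As a rough illustration of what a centralized "remaining slots" helper buys over open-coded comparisons, the following is a userspace model only. struct mm_model, max_map_count_model, vma_count_remaining_model() and can_add_vmas() are hypothetical names, and the real vma_count_remaining() in the series may differ in signature and detail. The value 65530 is the usual default of vm.max_map_count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of mm_struct. */
struct mm_model {
	int vma_count;
};

/* Hypothetical stand-in for sysctl_max_map_count (default value). */
static int max_map_count_model = 65530;

/*
 * Return how many more VMAs may be created, clamped at zero so that a
 * limit lowered below the current count never yields a negative result.
 */
static int vma_count_remaining_model(const struct mm_model *mm)
{
	int max_count = max_map_count_model;	/* sample the limit once */
	int count = mm->vma_count;

	return max_count > count ? max_count - count : 0;
}

/* Callers state how many new VMAs an operation may need in the worst case. */
static bool can_add_vmas(const struct mm_model *mm, int needed)
{
	assert(needed >= 0);
	return vma_count_remaining_model(mm) >= needed;
}
```

Expressing each call site as "needs N slots" keeps the comparison direction in one place, which is where the off-by-one was easy to get wrong.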
Finally, a trace event is added to provide observability for processes that fail allocations due to VMA exhaustion, which is valuable for debugging and profiling on production systems.
The major changes in this version are:
1. Rebased on mm-new to resolve prior conflicts.
2. The patches to harden and add assertions for the VMA count have been dropped. David pointed out that these could be racy if sysctl_max_map_count is changed from userspace at just the wrong time.
3. The selftest has been completely rewritten per Lorenzo's feedback to make use of the kselftest harness and vm_util.h helpers.
4. The trace event has also been updated to contain more useful information and has been given a more fitting name, per feedback from Steve and Lorenzo.
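On item 2 above, one way to see why a "count never exceeds the limit" assertion is fragile is the sequential model below. It is a sketch under assumptions, not the dropped patches: struct limit_model, try_add_vma() and invariant_holds() are hypothetical, and the interleaving is modeled by a plain assignment rather than a concurrent sysctl write.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one process's VMA count plus the tunable limit. */
struct limit_model {
	int vma_count;
	int max_map_count;
};

/* Admit a new VMA only while the count is below the current limit. */
static bool try_add_vma(struct limit_model *m)
{
	assert(m->vma_count >= 0);
	if (m->vma_count >= m->max_map_count)
		return false;
	m->vma_count++;
	return true;
}

/* The kind of invariant a hardening assertion would check. */
static bool invariant_holds(const struct limit_model *m)
{
	return m->vma_count <= m->max_map_count;
}
```

If a process legitimately creates 8 mappings under a limit of 8 and the limit is then lowered to 4, invariant_holds() becomes false even though every individual admission check was correct.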
Tested on x86_64 and arm64:
1. Build test: allyesconfig, to cover the map_count -> vma_count rename
2. Selftests:
     cd tools/testing/selftests/mm && \
         make && \
         ./run_vmtests.sh -t max_vma_count
3. vma tests:
     cd tools/testing/vma && \
         make && \
         ./vma
Link to v2: https://lore.kernel.org/r/20250915163838.631445-1-kaleshsingh@google.com/
Thanks to everyone for their comments and feedback on the previous versions.
--Kalesh
Kalesh Singh (5):
  mm: fix off-by-one error in VMA count limit checks
  mm/selftests: add max_vma_count tests
  mm: introduce vma_count_remaining()
  mm: rename mm_struct::map_count to vma_count
  mm/tracing: introduce trace_mm_insufficient_vma_slots event
 MAINTAINERS                               |   2 +
 fs/binfmt_elf.c                           |   2 +-
 fs/coredump.c                             |   2 +-
 include/linux/mm.h                        |   2 -
 include/linux/mm_types.h                  |   2 +-
 include/trace/events/vma.h                |  32 +
 kernel/fork.c                             |   2 +-
 mm/debug.c                                |   2 +-
 mm/internal.h                             |   3 +
 mm/mmap.c                                 |  31 +-
 mm/mremap.c                               |  13 +-
 mm/nommu.c                                |   8 +-
 mm/util.c                                 |   1 -
 mm/vma.c                                  |  39 +-
 mm/vma_internal.h                         |   2 +
 tools/testing/selftests/mm/.gitignore     |   1 +
 tools/testing/selftests/mm/Makefile       |   1 +
 .../selftests/mm/max_vma_count_tests.c    | 672 ++++++++++++++++++
 tools/testing/selftests/mm/run_vmtests.sh |   5 +
 tools/testing/vma/vma.c                   |  32 +-
 tools/testing/vma/vma_internal.h          |  16 +-
 21 files changed, 818 insertions(+), 52 deletions(-)
 create mode 100644 include/trace/events/vma.h
 create mode 100644 tools/testing/selftests/mm/max_vma_count_tests.c
base-commit: 4c4142c93fc19cd75a024e5c81b0532578a9e187