On 08.01.25 17:13, Thomas Weißschuh wrote:
On Wed, Jan 08, 2025 at 02:36:57PM +0100, David Hildenbrand wrote:
On 08.01.25 09:05, Thomas Weißschuh wrote:
On Wed, Jan 08, 2025 at 11:46:19AM +0530, Dev Jain wrote:
On 07/01/25 8:44 pm, Thomas Weißschuh wrote:
If not enough physical memory is available, the kernel may fail mmap(); see __vm_enough_memory() and vm_commit_limit(). In that case the logic in validate_complete_va_space() does not make sense and will even fail incorrectly. Instead, skip the test if no mmap() succeeded.
Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
Cc: stable@vger.kernel.org
CC stable on tests is ... odd.
I thought it was fairly common, but it isn't. Will drop it.
As it's not really a "kernel BUG", it's rather uncommon.
Note that with MAP_NORESERVE, most setups we care about will allow mapping as much as you want, but OOM will fire on access.
Thanks for the hint.
So one could require that /proc/sys/vm/overcommit_memory is set up properly and use MAP_NORESERVE.
Isn't the check for lchunks == 0 essentially exactly this?
I assume paired with MAP_NORESERVE?
Maybe, but it could be better to have something that says "if overcommit_memory is not set up properly I will SKIP this test, but otherwise I expect this to work and will FAIL if it doesn't".
Or would you expect to run into lchunks == 0 even if overcommit_memory is set up properly and MAP_NORESERVE is used? (So little memory that we cannot even create all the VMAs?)
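The SKIP/FAIL split described above could be sketched roughly as follows. This is only an illustration, not code from the patch: read_overcommit_mode(), overcommit_allows_noreserve() and map_chunk_noreserve() are hypothetical helper names, and the sketch assumes that modes 0 (heuristic) and 1 (always) honour MAP_NORESERVE while mode 2 (strict accounting) does not.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical helper: returns vm.overcommit_memory (0, 1 or 2), or -1 if unreadable. */
static int read_overcommit_mode(void)
{
	FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
	int mode = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &mode) != 1)
		mode = -1;
	fclose(f);
	return mode;
}

/*
 * Assumption: with mode 2 (strict accounting) MAP_NORESERVE has no effect,
 * so the test should SKIP there instead of FAILing.
 */
static int overcommit_allows_noreserve(int mode)
{
	return mode == 0 || mode == 1;
}

/* Map one chunk without reserving commit charge for it. */
static void *map_chunk_noreserve(void *hint, size_t size)
{
	assert(size > 0);
	return mmap(hint, size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
}
```

A test built on this would SKIP when overcommit_allows_noreserve() returns 0 (or the sysctl cannot be read), and treat an unexpected lchunks == 0 as a FAIL otherwise.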
Reading from anonymous memory will populate the shared zeropage. To mitigate OOM from "too many page tables", one could simply unmap the pieces as they are verified (or MAP_FIXED over them, to free page tables).
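A minimal sketch of the two ideas in that paragraph, under the assumption that the range was mapped by the test itself: verify pages with reads only (so nothing but the shared zeropage gets populated), then map MAP_FIXED over the range so the old page tables can be freed while the address range stays occupied. The function names are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Read one byte per page. Untouched anonymous memory must read as zero,
 * and a read fault does not allocate a private page.
 * Returns the number of pages whose first byte was not zero.
 */
static size_t touch_pages_readonly(const volatile char *start, size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t bad = 0;

	for (size_t off = 0; off < len; off += page)
		if (start[off] != 0)
			bad++;
	return bad;
}

/*
 * Replace the range with a fresh mapping at the same address. The old
 * mapping is torn down first, which lets the kernel drop its page tables.
 * Only safe for ranges the caller mapped itself.
 */
static int reset_range_in_place(void *start, size_t len)
{
	void *p = mmap(start, len, PROT_READ,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE | MAP_FIXED,
		       -1, 0);

	return p == start ? 0 : -1;
}
```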
The code has to figure out if a verified region was created by mmap(), otherwise an munmap() could crash the process. As the entries from /proc/self/maps may have been merged and (I assume)
Yes, and partial unmap (in chunk granularity?) would split them again.
the ordering of mappings is not guaranteed, some bespoke logic to establish the link will be needed.
My thinking was that you simply process one /proc/self/maps entry in chunks. After processing a chunk, you munmap() it.
So you would process + munmap in chunks.
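The process-then-munmap loop might look roughly like this. It is a sketch under assumptions: the caller already knows the range was created by the test's own mmap() calls (the concern raised earlier in the thread), the chunk size is a multiple of the page size, and verify_and_unmap() and chunk_is_zero() are hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

typedef int (*chunk_check_fn)(const char *chunk, size_t len);

/* Example check: every page of the chunk reads back as zero. */
static int chunk_is_zero(const char *chunk, size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);

	for (size_t off = 0; off < len; off += page)
		if (((const volatile char *)chunk)[off] != 0)
			return -1;
	return 0;
}

/*
 * Walk [start, start + len) in pieces of at most `chunk` bytes: check a
 * piece, then unmap it right away so its page tables do not accumulate.
 * Returns 0 on success, -1 if a check or munmap() fails.
 */
static int verify_and_unmap(char *start, size_t len, size_t chunk,
			    chunk_check_fn check)
{
	assert(chunk > 0);

	while (len > 0) {
		size_t n = len < chunk ? len : chunk;

		if (check(start, n) != 0)
			return -1;
		if (munmap(start, n) != 0)
			return -1;
		start += n;
		len -= n;
	}
	return 0;
}
```

Each partial munmap() splits the VMA, so the corresponding /proc/self/maps entry shrinks as the loop advances.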
Is it fine to rely on CONFIG_ANON_VMA_NAME? That would make it much easier to implement.
Can you elaborate how you would do it?
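For context, one possible reading of the CONFIG_ANON_VMA_NAME idea (not confirmed anywhere in this thread) is to tag the test's mappings with prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ...) and then recognise them in /proc/self/maps by their "[anon:<name>]" suffix. The name "va_range_test" and both helper names below are made up for illustration; the prctl() call fails on kernels built without CONFIG_ANON_VMA_NAME.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/prctl.h>

/* Fallback definitions for older userspace headers. */
#ifndef PR_SET_VMA
#define PR_SET_VMA 0x53564d41
#endif
#ifndef PR_SET_VMA_ANON_NAME
#define PR_SET_VMA_ANON_NAME 0
#endif

/* Hypothetical tag for mappings created by the test. */
#define TEST_VMA_NAME "va_range_test"

/* Tag an anonymous mapping; returns 0 on success, -1 if unsupported. */
static int name_mapping(void *addr, size_t len)
{
	return prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME,
		     (unsigned long)addr, (unsigned long)len,
		     (unsigned long)TEST_VMA_NAME);
}

/* Does this /proc/self/maps line describe one of the tagged mappings? */
static int maps_line_is_ours(const char *line)
{
	return strstr(line, "[anon:" TEST_VMA_NAME "]") != NULL;
}
```

With such a tag, only entries for which maps_line_is_ours() is true would be candidates for munmap().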
Using MAP_NORESERVE and eager munmap()s, the testcase works nicely even in very low physical memory conditions.
Cool.