On Thu, Jan 09, 2025 at 02:05:43PM +0100, David Hildenbrand wrote:
That is clear. The issue would be to figure which chunks are valid to unmap. If something critical like the executable file is unmapped, the process crashes. But see below.
Ah, now I see what you mean. Yes, also the stack etc. will be problematic. So IIUC, you want to limit the munmap optimization only to the manually mmap()ed parts.
Correct.
Is it fine to rely on CONFIG_ANON_VMA_NAME? That would make it much easier to implement.
Can you elaborate how you would do it?
First set the VMA name after mmap():
for (i = 0; i < NR_CHUNKS_LOW; i++) { ptr[i] = mmap(NULL, MAP_CHUNK_SIZE, PROT_READ | PROT_WRITE, MAP_NORESERVE | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (ptr[i] == MAP_FAILED) { if (validate_lower_address_hint()) ksft_exit_fail_msg("mmap unexpectedly succeeded with hint\n"); break; }
validate_addr(ptr[i], 0); if (prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ptr[i], MAP_CHUNK_SIZE, "virtual_address_range")) ksft_exit_fail_msg("prctl(PR_SET_VMA_ANON_NAME) failed: %s\n", strerror(errno));
Likely this would prevent merging of VMAs.
With a 1 GiB chunk size, and NR_CHUNKS_LOW == 128TiB, you'd already require 128k VMAs. The default limit is frequently 64k.
They are merged for me, as they all share the same name.
PR_SET_VMA(2const) even mentions merging:
Note that assigning an attribute to a virtual memory area might prevent it from being merged with adjacent virtual memory areas due to the difference in that attribute's value.
is_mergeable_vma() has an explicit check using anon_vma_name_eq().
We could just scan the ptr / hptr array to see if this is a manual mmap area or not. If this takes too long, one could sort the arrays by address and perform a binary search.
Not the most efficient way of doing it, but maybe good enough for this test?
A naive loop is what I tried first, but it took forever.
Alternatively, store the pointer in a xarray-like tree instead of two arrays. Requires a bit more memory ... and we'd have to find a simple implementation we could just reuse in this test. So maybe there is a simpler way to get it done.
IMO the prctl() is that simpler way. The only real drawback is the dependency on CONFIG_ANON_VMA_NAME. We can add an entry to tools/testing/selftests/mm/config for it.
Thomas