The selftest started failing since commit e93d2521b27f ("x86/vdso: Split virtual clock pages into dedicated mapping") was merged. While debugging I stumbled upon some memory usage optimizations.
With these, the test now runs on a VM with only 60MiB of memory.
Signed-off-by: Thomas Weißschuh thomas.weissschuh@linutronix.de
---
Changes in v4:
- Pick up review tags
- Correct Fixes: of patch 1
- Drop git rebase commit message artifacts
- Replace strtok_r() with strspn() and strcspn()
- Avoid uninitialized read on error in __get_smap_entry()
- Link to v3: https://lore.kernel.org/r/20250113-virtual_address_range-tests-v3-0-f4a8e6b7...

Changes in v3:
- Pick up review tags
- Fix naming around PR_SET_VMA_ANON_NAME helper functions
- Skip selftest if PR_SET_VMA_ANON_NAME is not supported
- Check for VM_IO instead of [vvar name prefix
- Link to v2: https://lore.kernel.org/r/20250110-virtual_address_range-tests-v2-0-262a2bf3...

Changes in v2:
- Drop /dev/null usage
- Avoid overcommit restrictions by dropping PROT_WRITE
- Avoid high memory usage due to PTEs
- Link to v1: https://lore.kernel.org/r/20250107-virtual_address_range-tests-v1-0-3834a2fb...

---
Thomas Weißschuh (4):
      selftests/mm: virtual_address_range: mmap() without PROT_WRITE
      selftests/mm: virtual_address_range: Unmap chunks after validation
      selftests/mm: vm_util: Split up /proc/self/smaps parsing
      selftests/mm: virtual_address_range: Avoid reading from VM_IO mappings

 tools/testing/selftests/mm/config                  |  1 +
 tools/testing/selftests/mm/virtual_address_range.c | 41 ++++++++++++--
 tools/testing/selftests/mm/vm_util.c               | 66 +++++++++++++++++-----
 tools/testing/selftests/mm/vm_util.h               |  1 +
 4 files changed, 92 insertions(+), 17 deletions(-)
---
base-commit: 3043cb9a517b707c12a3f5879f4970c97bfeb3fb
change-id: 20250107-virtual_address_range-tests-95843766fa97
Best regards,
When mapping a chunk larger than the available physical memory with PROT_WRITE while overcommit is disabled, the mapping will fail. This prevents the test from running on systems with less than ~1GiB of memory and triggers an inscrutable test failure. As the mappings are never written to anyway, the flag can be removed.
Fixes: 4e5ce33ceb32 ("selftests/vm: add a test for virtual address range mapping")
Signed-off-by: Thomas Weißschuh thomas.weissschuh@linutronix.de
Acked-by: David Hildenbrand david@redhat.com
Acked-by: Dev Jain dev.jain@arm.com
---
 tools/testing/selftests/mm/virtual_address_range.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
index 2a2b69e91950a37999f606847c9c8328d79890c2..ea6ccf49ef4c552f26317c2a40b09bca1a677f8f 100644
--- a/tools/testing/selftests/mm/virtual_address_range.c
+++ b/tools/testing/selftests/mm/virtual_address_range.c
@@ -166,7 +166,7 @@ int main(int argc, char *argv[])
 	ksft_set_plan(1);
 
 	for (i = 0; i < NR_CHUNKS_LOW; i++) {
-		ptr[i] = mmap(NULL, MAP_CHUNK_SIZE, PROT_READ | PROT_WRITE,
+		ptr[i] = mmap(NULL, MAP_CHUNK_SIZE, PROT_READ,
 			      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 
 		if (ptr[i] == MAP_FAILED) {
@@ -186,7 +186,7 @@ int main(int argc, char *argv[])
 
 	for (i = 0; i < NR_CHUNKS_HIGH; i++) {
 		hint = hint_addr();
-		hptr[i] = mmap(hint, MAP_CHUNK_SIZE, PROT_READ | PROT_WRITE,
+		hptr[i] = mmap(hint, MAP_CHUNK_SIZE, PROT_READ,
 			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 
 		if (hptr[i] == MAP_FAILED)
For each accessed chunk a PTE is created. More than 1GiB of PTEs is used in this way. Remove each PTE after validating a chunk to reduce peak memory usage.
It is important to only unmap memory that was previously mmap()ed, as unmapping other mappings like the stack, heap or executable mappings will crash the process. The mappings read from /proc/self/maps and the return values from mmap() don't allow a simple correlation due to merging and no guaranteed order. To correlate the pointers and mappings use prctl(PR_SET_VMA_ANON_NAME). While it introduces a test dependency, other alternatives would introduce runtime or development overhead.
Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
Signed-off-by: Thomas Weißschuh thomas.weissschuh@linutronix.de
Acked-by: David Hildenbrand david@redhat.com
---
 tools/testing/selftests/mm/config                  |  1 +
 tools/testing/selftests/mm/virtual_address_range.c | 33 ++++++++++++++++++++--
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/mm/config b/tools/testing/selftests/mm/config
index 4309916f629e36498efb07eb606b2f0c49ee6211..a28baa536332f3fcfb1b83759b5fbb432ae80178 100644
--- a/tools/testing/selftests/mm/config
+++ b/tools/testing/selftests/mm/config
@@ -7,3 +7,4 @@ CONFIG_TEST_HMM=m
 CONFIG_GUP_TEST=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
 CONFIG_MEM_SOFT_DIRTY=y
+CONFIG_ANON_VMA_NAME=y
diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
index ea6ccf49ef4c552f26317c2a40b09bca1a677f8f..386e4e46fa65b98af78dee4bb30144eb2b51f528 100644
--- a/tools/testing/selftests/mm/virtual_address_range.c
+++ b/tools/testing/selftests/mm/virtual_address_range.c
@@ -10,6 +10,7 @@
 #include <string.h>
 #include <unistd.h>
 #include <errno.h>
+#include <sys/prctl.h>
 #include <sys/mman.h>
 #include <sys/time.h>
 #include <fcntl.h>
@@ -82,6 +83,24 @@ static void validate_addr(char *ptr, int high_addr)
 		ksft_exit_fail_msg("Bad address %lx\n", addr);
 }
 
+static void mark_range(char *ptr, size_t size)
+{
+	if (prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ptr, size, "virtual_address_range") == -1) {
+		if (errno == EINVAL) {
+			/* Depends on CONFIG_ANON_VMA_NAME */
+			ksft_test_result_skip("prctl(PR_SET_VMA_ANON_NAME) not supported\n");
+			ksft_finished();
+		} else {
+			ksft_exit_fail_perror("prctl(PR_SET_VMA_ANON_NAME) failed\n");
+		}
+	}
+}
+
+static int is_marked_vma(const char *vma_name)
+{
+	return vma_name && !strcmp(vma_name, "[anon:virtual_address_range]\n");
+}
+
 static int validate_lower_address_hint(void)
 {
 	char *ptr;
@@ -116,12 +135,17 @@ static int validate_complete_va_space(void)
 
 	prev_end_addr = 0;
 	while (fgets(line, sizeof(line), file)) {
+		const char *vma_name = NULL;
+		int vma_name_start = 0;
 		unsigned long hop;
 
-		if (sscanf(line, "%lx-%lx %s[rwxp-]",
-			   &start_addr, &end_addr, prot) != 3)
+		if (sscanf(line, "%lx-%lx %4s %*s %*s %*s %n",
+			   &start_addr, &end_addr, prot, &vma_name_start) != 3)
 			ksft_exit_fail_msg("cannot parse /proc/self/maps\n");
 
+		if (vma_name_start)
+			vma_name = line + vma_name_start;
+
 		/* end of userspace mappings; ignore vsyscall mapping */
 		if (start_addr & (1UL << 63))
 			return 0;
@@ -149,6 +173,9 @@ static int validate_complete_va_space(void)
 				return 1;
 			lseek(fd, 0, SEEK_SET);
 
+			if (is_marked_vma(vma_name))
+				munmap((char *)(start_addr + hop), MAP_CHUNK_SIZE);
+
 			hop += MAP_CHUNK_SIZE;
 		}
 	}
@@ -175,6 +202,7 @@ int main(int argc, char *argv[])
 			break;
 		}
 
+		mark_range(ptr[i], MAP_CHUNK_SIZE);
 		validate_addr(ptr[i], 0);
 	}
 	lchunks = i;
@@ -192,6 +220,7 @@ int main(int argc, char *argv[])
 		if (hptr[i] == MAP_FAILED)
 			break;
 
+		mark_range(hptr[i], MAP_CHUNK_SIZE);
 		validate_addr(hptr[i], 1);
 	}
 	hchunks = i;
Upcoming changes want to reuse the /proc/self/smaps parsing logic to parse the VmFlags field. As that works differently from the currently parsed HugePage counters, split up the logic so common functionality can be shared.
While reworking this code, also use the correct sscanf placeholder for the "uint64_t thp" variable.
Signed-off-by: Thomas Weißschuh thomas.weissschuh@linutronix.de
Acked-by: David Hildenbrand david@redhat.com
---
 tools/testing/selftests/mm/vm_util.c | 42 +++++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
index d8d0cf04bb57fd22bd7748fffec6a23c3103e35c..a450ab353f8e710a6bfce347bc3a7309920c70f5 100644
--- a/tools/testing/selftests/mm/vm_util.c
+++ b/tools/testing/selftests/mm/vm_util.c
@@ -2,6 +2,7 @@
 #include <string.h>
 #include <fcntl.h>
 #include <dirent.h>
+#include <inttypes.h>
 #include <sys/ioctl.h>
 #include <linux/userfaultfd.h>
 #include <linux/fs.h>
@@ -193,13 +194,11 @@ unsigned long rss_anon(void)
 	return rss_anon;
 }
 
-bool __check_huge(void *addr, char *pattern, int nr_hpages,
-		  uint64_t hpage_size)
+char *__get_smap_entry(void *addr, const char *pattern, char *buf, size_t len)
 {
-	uint64_t thp = -1;
 	int ret;
 	FILE *fp;
-	char buffer[MAX_LINE_LENGTH];
+	char *entry = NULL;
 	char addr_pattern[MAX_LINE_LENGTH];
 
 	ret = snprintf(addr_pattern, MAX_LINE_LENGTH, "%08lx-",
@@ -211,23 +210,40 @@ bool __check_huge(void *addr, char *pattern, int nr_hpages,
 	if (!fp)
 		ksft_exit_fail_msg("%s: Failed to open file %s\n", __func__, SMAP_FILE_PATH);
 
-	if (!check_for_pattern(fp, addr_pattern, buffer, sizeof(buffer)))
+	if (!check_for_pattern(fp, addr_pattern, buf, len))
 		goto err_out;
 
-	/*
-	 * Fetch the pattern in the same block and check the number of
-	 * hugepages.
-	 */
-	if (!check_for_pattern(fp, pattern, buffer, sizeof(buffer)))
+	/* Fetch the pattern in the same block */
+	if (!check_for_pattern(fp, pattern, buf, len))
 		goto err_out;
 
-	snprintf(addr_pattern, MAX_LINE_LENGTH, "%s%%9ld kB", pattern);
+	/* Trim trailing newline */
+	entry = strchr(buf, '\n');
+	if (entry)
+		*entry = '\0';
 
-	if (sscanf(buffer, addr_pattern, &thp) != 1)
-		ksft_exit_fail_msg("Reading smap error\n");
+	entry = buf + strlen(pattern);
 
 err_out:
 	fclose(fp);
+	return entry;
+}
+
+bool __check_huge(void *addr, char *pattern, int nr_hpages,
+		  uint64_t hpage_size)
+{
+	char buffer[MAX_LINE_LENGTH];
+	uint64_t thp = -1;
+	char *entry;
+
+	entry = __get_smap_entry(addr, pattern, buffer, sizeof(buffer));
+	if (!entry)
+		goto err_out;
+
+	if (sscanf(entry, "%9" SCNu64 " kB", &thp) != 1)
+		ksft_exit_fail_msg("Reading smap error\n");
+
+err_out:
 	return thp == (nr_hpages * (hpage_size >> 10));
 }
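
The placeholder fix mentioned in the commit message can be seen in a small standalone sketch. The helper parse_kb_field() is hypothetical and only mirrors the shape of the parsing in __check_huge(); it does not read /proc/self/smaps itself.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse a line of the form "<key>   <n> kB" and
 * store n in *kb. Returns 0 on success, -1 if the key or number is missing.
 */
static int parse_kb_field(const char *line, const char *key, uint64_t *kb)
{
	size_t keylen;

	assert(line && key && kb);
	keylen = strlen(key);
	if (strncmp(line, key, keylen))
		return -1;

	/* SCNu64 expands to the scanf conversion that matches uint64_t. */
	if (sscanf(line + keylen, "%9" SCNu64 " kB", kb) != 1)
		return -1;

	return 0;
}
```

Using "%ld" with a uint64_t pointer happens to work on common 64-bit Linux targets, but the types are not guaranteed to match; SCNu64 from <inttypes.h> avoids relying on that.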
The virtual_address_range selftest reads from the start of each mapping listed in /proc/self/maps. However not all mappings are valid to be arbitrarily accessed.
For example the vvar data used for virtual clocks on x86 [vvar_vclock] can only be accessed if 1) the kernel configuration enables virtual clocks and 2) the hypervisor provided the data for it. Only the VDSO itself has the necessary information to know this. Since commit e93d2521b27f ("x86/vdso: Split virtual clock pages into dedicated mapping") the virtual clock data was split out into its own mapping, leading to EFAULT from read() during the validation.
Check for the VM_IO flag as a proxy. It is present for the VVAR mappings and MMIO ranges can be dangerous to access arbitrarily.
Reported-by: kernel test robot oliver.sang@intel.com
Closes: https://lore.kernel.org/oe-lkp/202412271148.2656e485-lkp@intel.com
Fixes: e93d2521b27f ("x86/vdso: Split virtual clock pages into dedicated mapping")
Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
Suggested-by: David Hildenbrand david@redhat.com
Link: https://lore.kernel.org/lkml/e97c2a5d-c815-4936-a767-ac42a3220a90@redhat.com...
Signed-off-by: Thomas Weißschuh thomas.weissschuh@linutronix.de

---
I left out the comment about the requirement for check_vmflag_io() to be called with the start address of a mapping. It's the same for the check_huge_*() functions and there it's not documented either. Also there is only a single, correct user and any misuse will instantly result in visible breakage.
---
 tools/testing/selftests/mm/virtual_address_range.c |  4 ++++
 tools/testing/selftests/mm/vm_util.c               | 24 ++++++++++++++++++++++
 tools/testing/selftests/mm/vm_util.h               |  1 +
 3 files changed, 29 insertions(+)

diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
index 386e4e46fa65b98af78dee4bb30144eb2b51f528..b380e102b22f0a44654ab046f257e8c35e8d90e9 100644
--- a/tools/testing/selftests/mm/virtual_address_range.c
+++ b/tools/testing/selftests/mm/virtual_address_range.c
@@ -15,6 +15,7 @@
 #include <sys/time.h>
 #include <fcntl.h>
 
+#include "vm_util.h"
 #include "../kselftest.h"
 
 /*
@@ -159,6 +160,9 @@ static int validate_complete_va_space(void)
 		if (prot[0] != 'r')
 			continue;
 
+		if (check_vmflag_io((void *)start_addr))
+			continue;
+
 		/*
 		 * Confirm whether MAP_CHUNK_SIZE chunk can be found or not.
 		 * If write succeeds, no need to check MAP_CHUNK_SIZE - 1
diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
index a450ab353f8e710a6bfce347bc3a7309920c70f5..8bd2c5e59bc73bdfa617d4b11b448da84e2a3daf 100644
--- a/tools/testing/selftests/mm/vm_util.c
+++ b/tools/testing/selftests/mm/vm_util.c
@@ -400,3 +400,27 @@ unsigned long get_free_hugepages(void)
 	fclose(f);
 	return fhp;
 }
+
+bool check_vmflag_io(void *addr)
+{
+	char buffer[MAX_LINE_LENGTH];
+	const char *flags;
+	size_t flaglen;
+
+	flags = __get_smap_entry(addr, "VmFlags:", buffer, sizeof(buffer));
+	if (!flags)
+		ksft_exit_fail_msg("%s: No VmFlags for %p\n", __func__, addr);
+
+	while (true) {
+		flags += strspn(flags, " ");
+
+		flaglen = strcspn(flags, " ");
+		if (!flaglen)
+			return false;
+
+		if (flaglen == strlen("io") && !memcmp(flags, "io", flaglen))
+			return true;
+
+		flags += flaglen;
+	}
+}
diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
index 2eaed82099255e09ffd38ad9714994397f304685..b60ac68a9dc8893895f49946b258260f7a82218a 100644
--- a/tools/testing/selftests/mm/vm_util.h
+++ b/tools/testing/selftests/mm/vm_util.h
@@ -53,6 +53,7 @@ int uffd_unregister(int uffd, void *addr, uint64_t len);
 int uffd_register_with_ioctls(int uffd, void *addr, uint64_t len,
 			      bool miss, bool wp, bool minor, uint64_t *ioctls);
 unsigned long get_free_hugepages(void);
+bool check_vmflag_io(void *addr);
 
 /*
  * On ppc64 this will only work with radix 2M hugepage size
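
The flag scan in check_vmflag_io() can be tried on plain strings without touching /proc. The sketch below generalizes it to an arbitrary flag name; flags_contain() is a hypothetical helper, not something the patch adds, and the sample flag strings in the note are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if the space-separated list `flags`
 * contains exactly the token `wanted`. Uses the same strspn()/strcspn()
 * walk as check_vmflag_io(), so the input string is never modified.
 */
static bool flags_contain(const char *flags, const char *wanted)
{
	size_t wantlen, len;

	assert(flags && wanted);
	wantlen = strlen(wanted);

	for (;;) {
		/* Skip separators, then measure the next token. */
		flags += strspn(flags, " ");
		len = strcspn(flags, " ");
		if (!len)
			return false;	/* end of string reached */

		if (len == wantlen && !memcmp(flags, wanted, len))
			return true;

		flags += len;
	}
}
```

For a value such as " rd wr io pf dd" the helper reports the "io" token, while "rd ex mr" or a longer token like "iox" does not match. Comparing both length and bytes is what prevents prefix matches.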
On 14.01.25 17:06, Thomas Weißschuh wrote:
> The virtual_address_range selftest reads from the start of each mapping listed in /proc/self/maps. However not all mappings are valid to be arbitrarily accessed.
>
> For example the vvar data used for virtual clocks on x86 [vvar_vclock] can only be accessed if 1) the kernel configuration enables virtual clocks and 2) the hypervisor provided the data for it. Only the VDSO itself has the necessary information to know this. Since commit e93d2521b27f ("x86/vdso: Split virtual clock pages into dedicated mapping") the virtual clock data was split out into its own mapping, leading to EFAULT from read() during the validation.
>
> Check for the VM_IO flag as a proxy. It is present for the VVAR mappings and MMIO ranges can be dangerous to access arbitrarily.
>
> Reported-by: kernel test robot oliver.sang@intel.com
> Closes: https://lore.kernel.org/oe-lkp/202412271148.2656e485-lkp@intel.com
> Fixes: e93d2521b27f ("x86/vdso: Split virtual clock pages into dedicated mapping")
> Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
> Suggested-by: David Hildenbrand david@redhat.com
> Link: https://lore.kernel.org/lkml/e97c2a5d-c815-4936-a767-ac42a3220a90@redhat.com...
> Signed-off-by: Thomas Weißschuh thomas.weissschuh@linutronix.de
Acked-by: David Hildenbrand david@redhat.com
Unfortunately, vsyscall doesn't seem to set VM_IO, it only has
VmFlags: ex
Which is rather weird.
So we cannot remove that special-casing right now,
On 14.01.25 17:06, Thomas Weißschuh wrote:
> The selftest started failing since commit e93d2521b27f ("x86/vdso: Split virtual clock pages into dedicated mapping") was merged. While debugging I stumbled upon some memory usage optimizations.
>
> With these, the test now runs on a VM with only 60MiB of memory.
60 MiB ? Crazy :)