This test case triggers a race between madvise(MADV_DONTNEED) and mmap() on a single huge page, which gets stolen while it is still reserved.
Once the only page is stolen, the memory that was previously mmap()ed (and passed to madvise(MADV_DONTNEED)) raises a SIGBUS when accessed.
I am not adding this test to the run_vmtests.sh script, since the test currently fails upstream.
Breno Leitao (1):
  selftests/mm: add a new test for madv and hugetlb mmap

 tools/testing/selftests/mm/.gitignore         |   1 +
 tools/testing/selftests/mm/Makefile           |   1 +
 .../selftests/mm/hugetlb_madv_vs_map.c        | 124 ++++++++++++++++++
 3 files changed, 126 insertions(+)
 create mode 100644 tools/testing/selftests/mm/hugetlb_madv_vs_map.c
This test stresses the race between madvise(MADV_DONTNEED), a page fault, and a parallel huge page mmap(), which should fail because no huge page is available for mapping.
This test case must run on a system with one and only one huge page available.
# echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
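The precondition above can also be checked from C before the test starts. What follows is only a minimal sketch, not code from the patch: parse_free_hugepages() and exactly_one_free_hugepage() are hypothetical names, and the sketch assumes the HugePages_Free field of /proc/meminfo, which covers the default huge page size pool only.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): extract the HugePages_Free
 * counter from text in /proc/meminfo format. Returns -1 if the field is
 * missing or malformed.
 */
long parse_free_hugepages(const char *meminfo)
{
	const char *key = "HugePages_Free:";
	const char *line = meminfo;

	while (line && *line) {
		if (strncmp(line, key, strlen(key)) == 0) {
			long val;

			if (sscanf(line + strlen(key), "%ld", &val) == 1)
				return val;
			return -1;
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Read /proc/meminfo and report whether exactly one huge page is free. */
int exactly_one_free_hugepage(void)
{
	char buf[8192];
	size_t n;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return 0;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return parse_free_hugepages(buf) == 1;
}
```

A test could call exactly_one_free_hugepage() first and skip itself when the result is 0, instead of running against a pool it was not designed for.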
During setup, the test allocates the only available page, and starts three threads:
 - thread 1:
     * madvise(MADV_DONTNEED) on the allocated huge page
 - thread 2:
     * Write to the allocated huge page
 - thread 3:
     * Try to allocate (steal) an extra huge page (which is not available)
Thread 3 should never succeed in the allocation, since the only huge page was never unmapped and should still be reserved.
Touching the old page after thread 3's allocation will raise a SIGBUS.
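To observe that SIGBUS without killing the test process, the access can be wrapped in a signal handler that jumps back to the caller. This is a sketch under assumptions, not something the patch does: write_raises_sigbus() and on_sigbus() are hypothetical names, and the helper relies on one global jump buffer, so it is only suitable for a single thread at a time.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf probe_env;

/* SIGBUS handler: unwind back to the sigsetjmp() in the probe. */
static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(probe_env, 1);
}

/*
 * Hypothetical helper: write one byte to addr and report whether the
 * access raised SIGBUS. Returns 0 if the write succeeded, 1 if SIGBUS
 * was delivered, and -1 if the handler could not be installed.
 */
int write_raises_sigbus(volatile char *addr)
{
	struct sigaction sa, old;
	int faulted;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, &old) != 0)
		return -1;

	/* The second argument saves the signal mask for siglongjmp(). */
	if (sigsetjmp(probe_env, 1) == 0) {
		*addr = '1';
		faulted = 0;
	} else {
		faulted = 1;
	}

	/* Put the previous SIGBUS disposition back. */
	sigaction(SIGBUS, &old, NULL);
	return faulted;
}
```

Going by the description above, calling write_raises_sigbus(huge_ptr) after thread 3 has obtained a mapping would be expected to return 1 on an affected kernel, and 0 once the page is no longer stolen.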
Signed-off-by: Breno Leitao leitao@debian.org
---
 tools/testing/selftests/mm/.gitignore         |   1 +
 tools/testing/selftests/mm/Makefile           |   1 +
 .../selftests/mm/hugetlb_madv_vs_map.c        | 124 ++++++++++++++++++
 3 files changed, 126 insertions(+)
 create mode 100644 tools/testing/selftests/mm/hugetlb_madv_vs_map.c
diff --git a/tools/testing/selftests/mm/.gitignore b/tools/testing/selftests/mm/.gitignore
index 4ff10ea61461..d26e962f2ac4 100644
--- a/tools/testing/selftests/mm/.gitignore
+++ b/tools/testing/selftests/mm/.gitignore
@@ -46,3 +46,4 @@ gup_longterm
 mkdirty
 va_high_addr_switch
 hugetlb_fault_after_madv
+hugetlb_madv_vs_map
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index dede0bcf97a3..f6e42a773e1e 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -70,6 +70,7 @@ TEST_GEN_FILES += ksm_tests
 TEST_GEN_FILES += ksm_functional_tests
 TEST_GEN_FILES += mdwe_test
 TEST_GEN_FILES += hugetlb_fault_after_madv
+TEST_GEN_FILES += hugetlb_madv_vs_map
 
 ifneq ($(ARCH),arm64)
 TEST_GEN_FILES += soft-dirty
diff --git a/tools/testing/selftests/mm/hugetlb_madv_vs_map.c b/tools/testing/selftests/mm/hugetlb_madv_vs_map.c
new file mode 100644
index 000000000000..d01e8d4901d0
--- /dev/null
+++ b/tools/testing/selftests/mm/hugetlb_madv_vs_map.c
@@ -0,0 +1,124 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * A test case that must run on a system with one and only one huge page available.
+ *	# echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+ *
+ * During setup, the test allocates the only available page, and starts three threads:
+ *  - thread1:
+ *	* madvise(MADV_DONTNEED) on the allocated huge page
+ *  - thread 2:
+ *	* Write to the allocated huge page
+ *  - thread 3:
+ *	* Try to allocate an extra huge page (which must not be available)
+ *
+ *  The test fails if thread3 is able to allocate a page.
+ *
+ *  Touching the first page after thread3's allocation will raise a SIGBUS
+ *
+ *  Author: Breno Leitao leitao@debian.org
+ */
+#include <pthread.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include "vm_util.h"
+#include "../kselftest.h"
+
+#define MMAP_SIZE (1 << 21)
+#define INLOOP_ITER 100
+
+char *huge_ptr;
+
+/* Touch the memory while it is being madvised() */
+void *touch(void *unused)
+{
+	for (int i = 0; i < INLOOP_ITER; i++)
+		huge_ptr[0] = '.';
+
+	return NULL;
+}
+
+void *madv(void *unused)
+{
+	for (int i = 0; i < INLOOP_ITER; i++)
+		madvise(huge_ptr, MMAP_SIZE, MADV_DONTNEED);
+
+	return NULL;
+}
+
+/*
+ * We got here, and there must be no huge page available for mapping
+ * The other hugepage should be flipping from used <-> reserved, because
+ * of madvise(DONTNEED).
+ */
+void *map_extra(void *unused)
+{
+	void *ptr;
+
+	for (int i = 0; i < INLOOP_ITER; i++) {
+		ptr = mmap(NULL, MMAP_SIZE, PROT_READ | PROT_WRITE,
+			   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
+			   -1, 0);
+
+		if ((long)ptr != -1) {
+			/* Touching the other page now will cause a SIGBUS
+			 * huge_ptr[0] = '1';
+			 */
+			return ptr;
+		}
+	}
+
+	return NULL;
+}
+
+int main(void)
+{
+	pthread_t thread1, thread2, thread3;
+	unsigned long free_hugepages;
+	void *ret;
+
+	/*
+	 * On kernel 6.7, we are able to reproduce the problem with ~10
+	 * iterations
+	 */
+	int max = 10;
+
+	free_hugepages = get_free_hugepages();
+
+	if (free_hugepages != 1) {
+		ksft_exit_skip("This test needs one and only one page to execute. Got %lu\n",
+			       free_hugepages);
+	}
+
+	while (max--) {
+		huge_ptr = mmap(NULL, MMAP_SIZE, PROT_READ | PROT_WRITE,
+				MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
+				-1, 0);
+
+		if ((unsigned long)huge_ptr == -1) {
+			ksft_exit_skip("Failed to allocate huge page\n");
+			return KSFT_SKIP;
+		}
+
+		pthread_create(&thread1, NULL, madv, NULL);
+		pthread_create(&thread2, NULL, touch, NULL);
+		pthread_create(&thread3, NULL, map_extra, NULL);
+
+		pthread_join(thread1, NULL);
+		pthread_join(thread2, NULL);
+		pthread_join(thread3, &ret);
+
+		if (ret) {
+			ksft_test_result_fail("Unexpected huge page allocation\n");
+			return KSFT_FAIL;
+		}
+
+		/* Unmap and restart */
+		munmap(huge_ptr, MMAP_SIZE);
+	}
+
+	return KSFT_PASS;
+}
On Fri, 5 Jan 2024 07:54:19 -0800 Breno Leitao leitao@debian.org wrote:
> This test stresses the race between madvise(MADV_DONTNEED), a page fault, and a parallel huge page mmap(), which should fail because no huge page is available for mapping.
> This test case must run on a system with one and only one huge page available.
> # echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
Can't the test framework perform this configuration prior to running the test?
> During setup, the test allocates the only available page, and starts three threads:
>  - thread 1:
>      * madvise(MADV_DONTNEED) on the allocated huge page
>  - thread 2:
>      * Write to the allocated huge page
>  - thread 3:
>      * Try to allocate (steal) an extra huge page (which is not available)
> Thread 3 should never succeed in the allocation, since the only huge page was never unmapped and should still be reserved.
> Touching the old page after thread 3's allocation will raise a SIGBUS.
It's a bit strange to merge a selftest which is expected to fail because of a known but unfixed kernel bug. But I'll toss the test in there anyway, as we deserve to get bug reports ;)
On Tue, Jan 09, 2024 at 09:47:31PM -0800, Andrew Morton wrote:
> On Fri, 5 Jan 2024 07:54:19 -0800 Breno Leitao leitao@debian.org wrote:
> > This test stresses the race between madvise(MADV_DONTNEED), a page fault, and a parallel huge page mmap(), which should fail because no huge page is available for mapping.
> > This test case must run on a system with one and only one huge page available.
> > # echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> Can't the test framework perform this configuration prior to running the test?
We already have this infrastructure in run_vmtests.sh. The "hugetlb_fault_after_madv" selftest needs the same configuration, so once the fix is ready we will just add something like:
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -227,6 +227,7 @@ nr_hugepages_tmp=$(cat /proc/sys/vm/nr_hugepages)
 # For this test, we need one and just one huge page
 echo 1 > /proc/sys/vm/nr_hugepages
 CATEGORY="hugetlb" run_test ./hugetlb_fault_after_madv
+CATEGORY="hugetlb" run_test ./hugetlb_madv_vs_map
 # Restore the previous number of huge pages, since further tests rely on it
 echo "$nr_hugepages_tmp" > /proc/sys/vm/nr_hugepages
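The same save, set and restore pattern could in principle be done from C by a test that wants to configure the pool itself. The sketch below is an illustration only: read_count(), write_count() and save_and_set() are hypothetical helpers, the path is passed in by the caller, and writing to /proc/sys/vm/nr_hugepages would normally require root.

```c
#include <stdio.h>

/* Read a numeric value from a sysctl-style file; returns -1 on error. */
long read_count(const char *path)
{
	long val;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

/* Write a numeric value to a sysctl-style file; returns 0 on success. */
int write_count(const char *path, long val)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%ld\n", val) < 0) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}

/*
 * Remember the current value, then set the file to `wanted`.
 * Returns the saved value so the caller can restore it later with
 * write_count(), or -1 if reading or writing failed.
 */
long save_and_set(const char *path, long wanted)
{
	long saved = read_count(path);

	if (saved < 0 || write_count(path, wanted) != 0)
		return -1;
	return saved;
}
```

A caller would keep the value returned by save_and_set(path, 1), run the test, and then pass that value back to write_count() so that later tests see the original pool size.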
On Fri, 5 Jan 2024 07:54:18 -0800 Breno Leitao leitao@debian.org wrote:
> This test case triggers a race between madvise(MADV_DONTNEED) and mmap() on a single huge page, which gets stolen while it is still reserved.
> Once the only page is stolen, the memory that was previously mmap()ed (and passed to madvise(MADV_DONTNEED)) raises a SIGBUS when accessed.
> I am not adding this test to the run_vmtests.sh script, since the test currently fails upstream.
Oh. Is a fix for this in the pipeline? If so, I assume that once the fix is merged, we enable this test in run_vmtests?
On Fri, Jan 05, 2024 at 08:42:38AM -0800, Andrew Morton wrote:
> On Fri, 5 Jan 2024 07:54:18 -0800 Breno Leitao leitao@debian.org wrote:
> > This test case triggers a race between madvise(MADV_DONTNEED) and mmap() on a single huge page, which gets stolen while it is still reserved.
> > Once the only page is stolen, the memory that was previously mmap()ed (and passed to madvise(MADV_DONTNEED)) raises a SIGBUS when accessed.
> > I am not adding this test to the run_vmtests.sh script, since the test currently fails upstream.
> Oh. Is a fix for this in the pipeline? If so, I assume that once the fix is merged, we enable this test in run_vmtests?
The fix is not ready yet. As soon as the fix lands, I will enable the test in run_vmtests.
On Fri, 2024-01-05 at 08:42 -0800, Andrew Morton wrote:
> On Fri, 5 Jan 2024 07:54:18 -0800 Breno Leitao leitao@debian.org wrote:
> > This test case triggers a race between madvise(MADV_DONTNEED) and mmap() on a single huge page, which gets stolen while it is still reserved.
> > Once the only page is stolen, the memory that was previously mmap()ed (and passed to madvise(MADV_DONTNEED)) raises a SIGBUS when accessed.
> > I am not adding this test to the run_vmtests.sh script, since the test currently fails upstream.
> Oh. Is a fix for this in the pipeline? If so, I assume that once the fix is merged, we enable this test in run_vmtests?
I've got some ideas on how to fix it, and hope to get a fix to you and Mike by next week.
I'll ask Mike if I run into any unexpected complications.
linux-kselftest-mirror@lists.linaro.org