v4: https://lkml.org/lkml/2020/9/2/356
v4-->v5
Based on comments from Artem Bityutskiy, evaluating timer-based wakeup
latencies may not be a fruitful measurement, especially on the x86
platform, which has the capability to pre-arm a CPU when a timer is set.
Hence, only the IPI-based tests are included for latency measurement, to
achieve the expected behaviour across platforms.
For reference, an earlier kernel module + bash selftest approach, which
presents lower deviations and higher accuracy:
https://lkml.org/lkml/2020/7/21/567
---
This patch series introduces a mechanism to measure wakeup latency for
IPI-based interrupts.
The motivation behind this series is to find significant deviations
from the advertised latency values.
To achieve this in userspace, IPI latencies are calculated by sending
information through pipes and inducing a wakeup.
To account for delays from kernel-userspace interactions, baseline
observations are taken on a 100% busy CPU, and subsequent observations
must be considered relative to that baseline.
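To make the pipe-based mechanism concrete, below is a minimal,
illustrative C sketch (not the selftest itself) of the idea: a child
pinned to a destination CPU blocks in read() on a pipe, the parent
pinned to a source CPU writes a timestamp into the pipe, and the
difference between the send timestamp and the wakeup timestamp
approximates the wakeup latency. The CPU numbers and the one-second
settling delay are assumptions for illustration; the baseline run on a
100% busy destination CPU, described above, would be subtracted from
these numbers.

#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

static void pin_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	sched_setaffinity(0, sizeof(set), &set);
}

int main(void)
{
	int pfd[2];
	int src_cpu = 0, dst_cpu = 1;	/* hypothetical CPU pair */
	int64_t t_sent;

	if (pipe(pfd)) {
		perror("pipe");
		return 1;
	}

	if (fork() == 0) {
		/* Receiver: blocks in read() on dst_cpu; the write below
		 * wakes it, which typically involves an IPI when the
		 * destination CPU is idle. */
		int64_t t_woken;

		pin_to_cpu(dst_cpu);
		if (read(pfd[0], &t_sent, sizeof(t_sent)) != sizeof(t_sent))
			return 1;
		t_woken = now_ns();
		printf("%d -> %d wakeup latency: %lld ns\n",
		       src_cpu, dst_cpu, (long long)(t_woken - t_sent));
		return 0;
	}

	/* Sender: let the receiver settle into idle, then timestamp and
	 * write; the timestamp itself travels through the pipe. */
	pin_to_cpu(src_cpu);
	sleep(1);
	t_sent = now_ns();
	if (write(pfd[1], &t_sent, sizeof(t_sent)) != sizeof(t_sent))
		return 1;
	wait(NULL);
	return 0;
}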
One downside of the userspace approach, in contrast to the kernel
implementation, is that the run-to-run variance can turn out to be
high, on the order of milliseconds, which at times is the same scale as
the latencies being measured.
Another downside of the userspace approach is that it takes much longer
to run; hence the command-line options "quick" and "full" are added so
that a quick one-CPU test can be carried out when needed, and a
comprehensive full-system test can be run otherwise.
Usage
---
./cpuidle --mode <full / quick / num_cpus> --output <output location>
full: runs on all CPUs
quick: runs on a random CPU
num_cpus: limits the number of CPUs to run on
Sample output snippet
---------------------
--IPI Latency Test---
SRC_CPU DEST_CPU IPI_Latency(ns)
...
0 5 256178
0 6 478161
0 7 285445
0 8 273553
Expected IPI latency(ns): 100000
Observed Average IPI latency(ns): 248334
Pratik Rajesh Sampat (1):
selftests/cpuidle: Add support for cpuidle latency measurement
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/cpuidle/Makefile | 7 +
tools/testing/selftests/cpuidle/cpuidle.c | 479 ++++++++++++++++++++++
tools/testing/selftests/cpuidle/settings | 1 +
4 files changed, 488 insertions(+)
create mode 100644 tools/testing/selftests/cpuidle/Makefile
create mode 100644 tools/testing/selftests/cpuidle/cpuidle.c
create mode 100644 tools/testing/selftests/cpuidle/settings
--
2.26.2
Changes since v1:
* check_config.sh now invokes the compiler via the Makefile's ($CC),
thanks to Jason Gunthorpe for calling that out.
* Removed a misleading sentence from patch #6, as identified by Ira
Weiny.
* Removed a forward-looking sentence, about using -lpthread in
gup_test.c soon, from the commit message in patch #4, since I'm not yet
sure if my local pthread-based stress tests are actually worthwhile or
not.
Original cover letter, still accurate at this point:
This is based on the latest mmotm.
Summary: This series provides two main things, and a number of smaller
supporting goodies. The two main points are:
1) Add a new sub-test to gup_test, which in turn is a renamed version of
gup_benchmark. This sub-test allows nicer testing of dump_pages(), at
least on user-space pages.
For quite a while, I was doing a quick hack to gup_test.c whenever I
wanted to try out changes to dump_page(). Then Matthew Wilcox asked me
what I meant when I said "I used my dump_page() unit test", and I
realized that it might be nice to check in a polished up version of
that.
Details about how it works and how to use it are in the commit
description for patch #6.
2) Fixes a limitation of hmm-tests: these tests are incredibly useful,
but only if people actually build and run them. And it turns out that
libhugetlbfs is a little too effective at throwing a wrench in the
works, there. So I've added a little configuration check that removes
just two of the 21 hmm-tests, if libhugetlbfs is not available.
Further details in the commit description of patch #8.
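As an aside, one way such a build-time check can work (an illustrative
sketch only, not the actual contents of check_config.sh; see patch #8
for that) is to have the Makefile try to compile and link a tiny probe
against libhugetlbfs, and skip the two hugetlbfs-dependent hmm-tests if
that fails:

/* Hypothetical probe program; gethugepagesize() and <hugetlbfs.h> come
 * from libhugetlbfs, but the real check_config.sh may probe
 * differently. Build test: $(CC) probe.c -lhugetlbfs */
#include <hugetlbfs.h>
#include <stdio.h>

int main(void)
{
	long sz = gethugepagesize();	/* default huge page size */

	printf("huge page size: %ld\n", sz);
	return sz > 0 ? 0 : 1;
}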
Other smaller things that this series does:
a) Remove code duplication by creating gup_test.h.
b) Clear up the sub-test organization, and their invocation within
run_vmtests.sh.
c) Other minor assorted improvements.
John Hubbard (8):
mm/gup_benchmark: rename to mm/gup_test
selftests/vm: use a common gup_test.h
selftests/vm: rename run_vmtests --> run_vmtests.sh
selftests/vm: minor cleanup: Makefile and gup_test.c
selftests/vm: only some gup_test items are really benchmarks
selftests/vm: gup_test: introduce the dump_pages() sub-test
selftests/vm: run_vmtest.sh: update and clean up gup_test invocation
selftests/vm: hmm-tests: remove the libhugetlbfs dependency
Documentation/core-api/pin_user_pages.rst | 6 +-
arch/s390/configs/debug_defconfig | 2 +-
arch/s390/configs/defconfig | 2 +-
mm/Kconfig | 21 +-
mm/Makefile | 2 +-
mm/{gup_benchmark.c => gup_test.c} | 109 ++++++----
mm/gup_test.h | 32 +++
tools/testing/selftests/vm/.gitignore | 3 +-
tools/testing/selftests/vm/Makefile | 38 +++-
tools/testing/selftests/vm/check_config.sh | 31 +++
tools/testing/selftests/vm/config | 2 +-
tools/testing/selftests/vm/gup_benchmark.c | 137 -------------
tools/testing/selftests/vm/gup_test.c | 188 ++++++++++++++++++
tools/testing/selftests/vm/hmm-tests.c | 10 +-
.../vm/{run_vmtests => run_vmtest.sh} | 24 ++-
15 files changed, 404 insertions(+), 203 deletions(-)
rename mm/{gup_benchmark.c => gup_test.c} (59%)
create mode 100644 mm/gup_test.h
create mode 100755 tools/testing/selftests/vm/check_config.sh
delete mode 100644 tools/testing/selftests/vm/gup_benchmark.c
create mode 100644 tools/testing/selftests/vm/gup_test.c
rename tools/testing/selftests/vm/{run_vmtests => run_vmtest.sh} (91%)
--
2.28.0
This is version 2 of the mremap speed up patches previously posted at:
https://lore.kernel.org/r/20200930222130.4175584-1-kaleshsingh@google.com
mremap time can be optimized by moving entries at the PMD/PUD level if
the source and destination addresses are PMD/PUD-aligned and
PMD/PUD-sized. Enable moving at the PMD and PUD levels on arm64 and
x86. Other architectures where this type of move is supported and known
to be safe can also opt in to these optimizations by enabling
HAVE_MOVE_PMD and HAVE_MOVE_PUD.
Observed Performance Improvements for remapping a PUD-aligned 1GB-sized
region on x86 and arm64:
- HAVE_MOVE_PMD is already enabled on x86 : N/A
- Enabling HAVE_MOVE_PUD on x86 : ~13x speed up
- Enabling HAVE_MOVE_PMD on arm64 : ~ 8x speed up
- Enabling HAVE_MOVE_PUD on arm64 : ~19x speed up
Altogether, HAVE_MOVE_PMD and HAVE_MOVE_PUD give a total of ~150x
speed up on arm64.
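For illustration, a PUD-aligned, PUD-sized move from userspace looks
like the sketch below (this is not the series' mremap_test.c; the 3 GB
reservation, the per-PMD touch loop and the fixed destination are
assumptions made so that both source and destination land on 1 GB
boundaries, and a 64-bit system is assumed). With HAVE_MOVE_PUD, such a
move can be done by updating a single PUD entry instead of copying
individual PTEs:

#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>

#define GB	(1UL << 30)
#define PMD	(1UL << 21)

int main(void)
{
	struct timespec t0, t1;
	size_t off;
	long ns;

	/* Over-reserve 3 GB so that 1 GB-aligned source and destination
	 * windows both fit inside the reservation. */
	char *raw = mmap(NULL, 3 * GB, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (raw == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	char *src = (char *)(((uintptr_t)raw + GB - 1) & ~(GB - 1));
	char *dst = src + GB;

	/* Fault in one page per 2 MB block so there are page table
	 * entries to move. */
	for (off = 0; off < GB; off += PMD)
		src[off] = 1;

	clock_gettime(CLOCK_MONOTONIC, &t0);

	/* PUD-aligned source, PUD-aligned destination, PUD-sized length:
	 * with HAVE_MOVE_PUD the kernel can move the whole range at the
	 * PUD level rather than copying per-PTE. */
	void *moved = mremap(src, GB, GB, MREMAP_MAYMOVE | MREMAP_FIXED, dst);

	clock_gettime(CLOCK_MONOTONIC, &t1);
	if (moved == MAP_FAILED) {
		perror("mremap");
		return 1;
	}

	ns = (t1.tv_sec - t0.tv_sec) * 1000000000L +
	     (t1.tv_nsec - t0.tv_nsec);
	printf("1 GB mremap took %ld ns\n", ns);
	return 0;
}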
Changes in v2:
- Reduce mremap_test time by only validating a configurable
threshold of the remapped region, as per John.
- Use a random pattern for mremap validation. Provide pattern
seed in test output, as per John.
- Moved set_pud_at() to separate patch, per Kirill.
- Use switch() instead of ifs in move_pgt_entry(), per Kirill.
- Update commit message with description of Android
garbage collector use case for HAVE_MOVE_PUD, as per Joel.
- Fix build test error reported by kernel test robot in [1].
[1] https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/thread/CKPGL4F…
Kalesh Singh (6):
kselftests: vm: Add mremap tests
arm64: mremap speedup - Enable HAVE_MOVE_PMD
mm: Speedup mremap on 1GB or larger regions
arm64: Add set_pud_at() functions
arm64: mremap speedup - Enable HAVE_MOVE_PUD
x86: mremap speedup - Enable HAVE_MOVE_PUD
arch/Kconfig | 7 +
arch/arm64/Kconfig | 2 +
arch/arm64/include/asm/pgtable.h | 1 +
arch/x86/Kconfig | 1 +
mm/mremap.c | 220 +++++++++++++--
tools/testing/selftests/vm/.gitignore | 1 +
tools/testing/selftests/vm/Makefile | 1 +
tools/testing/selftests/vm/mremap_test.c | 333 +++++++++++++++++++++++
tools/testing/selftests/vm/run_vmtests | 11 +
9 files changed, 547 insertions(+), 30 deletions(-)
create mode 100644 tools/testing/selftests/vm/mremap_test.c
base-commit: 472e5b056f000a778abb41f1e443de58eb259783
--
2.28.0.806.g8561365e88-goog
Hi,
Maybe this should really be an RFC, given that I don't fully understand
why the compaction_test.c program was mmap'ing 1 MB at a time. So
apologies in advance if I've mucked up something important, but if so,
maybe we can still find a way to get this fixed up to something better.
Anyway: there are 20+ tests in tools/testing/selftests/vm/. The entire
running time for these (via run_vmtest.sh) is about 56 seconds, of which
over half is due to just one test: compaction_test, which takes 27 sec!
(A runner-up is HMM, at 11 sec, so it's up for a look next.) The other
tests mostly take a few ms, and a few take 1.0 sec.
This drops the compaction_test run time from 27 sec to 3.3 sec. Enjoy. :)
thanks,
John Hubbard
NVIDIA
John Hubbard (1):
selftests/vm: 8x compaction_test speedup
tools/testing/selftests/vm/compaction_test.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
--
2.28.0
From: Colin Ian King <colin.king(a)canonical.com>
More recent libc implementations now use the openat/openat2 system
calls, so also add do_sys_openat2 to the tracing, ensuring that the
test passes on these systems, where do_sys_open may not be called.
Signed-off-by: Colin Ian King <colin.king(a)canonical.com>
---
.../testing/selftests/ftrace/test.d/kprobe/kprobe_args_user.tc | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_user.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_user.tc
index a30a9c07290d..cf1b4c3e9e6b 100644
--- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_user.tc
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_args_user.tc
@@ -9,6 +9,8 @@ grep -A10 "fetcharg:" README | grep -q '\[u\]<offset>' || exit_unsupported
:;: "user-memory access syntax and ustring working on user memory";:
echo 'p:myevent do_sys_open path=+0($arg2):ustring path2=+u0($arg2):string' \
> kprobe_events
+echo 'p:myevent2 do_sys_openat2 path=+0($arg2):ustring path2=+u0($arg2):string' \
+ > kprobe_events
grep myevent kprobe_events | \
grep -q 'path=+0($arg2):ustring path2=+u0($arg2):string'
--
2.27.0
mremap time can be optimized by moving entries at the PMD/PUD level if
the source and destination addresses are PMD/PUD-aligned and
PMD/PUD-sized. Enable moving at the PMD and PUD levels on arm64 and
x86. Other architectures where this type of move is supported and known
to be safe can also opt in to these optimizations by enabling
HAVE_MOVE_PMD and HAVE_MOVE_PUD.
Observed Performance Improvements for remapping a PUD-aligned 1GB-sized
region on x86 and arm64:
- HAVE_MOVE_PMD is already enabled on x86 : N/A
- Enabling HAVE_MOVE_PUD on x86 : ~13x speed up
- Enabling HAVE_MOVE_PMD on arm64 : ~ 8x speed up
- Enabling HAVE_MOVE_PUD on arm64 : ~19x speed up
Altogether, HAVE_MOVE_PMD and HAVE_MOVE_PUD give a total of ~150x
speed up on arm64.
Kalesh Singh (5):
kselftests: vm: Add mremap tests
arm64: mremap speedup - Enable HAVE_MOVE_PMD
mm: Speedup mremap on 1GB or larger regions
arm64: mremap speedup - Enable HAVE_MOVE_PUD
x86: mremap speedup - Enable HAVE_MOVE_PUD
arch/Kconfig | 7 +
arch/arm64/Kconfig | 2 +
arch/arm64/include/asm/pgtable.h | 1 +
arch/x86/Kconfig | 1 +
mm/mremap.c | 211 +++++++++++++++++---
tools/testing/selftests/vm/.gitignore | 1 +
tools/testing/selftests/vm/Makefile | 1 +
tools/testing/selftests/vm/mremap_test.c | 243 +++++++++++++++++++++++
tools/testing/selftests/vm/run_vmtests | 11 +
9 files changed, 448 insertions(+), 30 deletions(-)
create mode 100644 tools/testing/selftests/vm/mremap_test.c
--
2.28.0.709.gb0816b6eb0-goog