The kernel now has a number of testing and debugging tools, and we've
seen a bit of confusion about what the differences between them are.
Add a basic documentation outlining the testing tools, when to use each,
and how they interact.
This is a pretty quick overview rather than the idealised "kernel
testing guide" that'd probably be optimal, but given the number of times
questions like "When do you use KUnit and when do you use Kselftest?"
are being asked, it seemed worth at least having something. Hopefully
this can form the basis for more detailed documentation later.
Signed-off-by: David Gow <davidgow(a)google.com>
---
Documentation/dev-tools/index.rst | 3 +
Documentation/dev-tools/testing-overview.rst | 102 +++++++++++++++++++
2 files changed, 105 insertions(+)
create mode 100644 Documentation/dev-tools/testing-overview.rst
diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 1b1cf4f5c9d9..f590e5860794 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -7,6 +7,8 @@ be used to work on the kernel. For now, the documents have been pulled
together without any significant effort to integrate them into a coherent
whole; patches welcome!
+A brief overview of testing-specific tools can be found in :doc:`testing-overview`.
+
.. class:: toc-title
Table of contents
@@ -14,6 +16,7 @@ whole; patches welcome!
.. toctree::
:maxdepth: 2
+ testing-overview
coccinelle
sparse
kcov
diff --git a/Documentation/dev-tools/testing-overview.rst b/Documentation/dev-tools/testing-overview.rst
new file mode 100644
index 000000000000..8452adcb8608
--- /dev/null
+++ b/Documentation/dev-tools/testing-overview.rst
@@ -0,0 +1,102 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+Kernel Testing Guide
+====================
+
+
+There are a number of different tools for testing the Linux kernel, so knowing
+when to use each of them can be a challenge. This document provides a rough
+overview of their differences, and how they fit together.
+
+
+Writing and Running Tests
+=========================
+
+The bulk of kernel tests are written using either the :doc:`kselftest
+<kselftest>` or :doc:`KUnit <kunit/index>` frameworks. These both provide
+infrastructure to help make running tests and groups of tests easier, as well
+as providing helpers to aid in writing new tests.
+
+If you're looking to verify the behaviour of the Kernel — particularly specific
+parts of the kernel — then you'll want to use `KUnit` or `kselftest`.
+
+
+The Difference Between KUnit and kselftest
+------------------------------------------
+
+:doc:`KUnit <kunit/index>` is an entirely in-kernel system for "white box"
+testing: because test code is part of the kernel, it can access internal
+structures and functions which aren't exposed to userspace.
+
+`KUnit` tests therefore are best written against small, self-contained parts
+of the kernel, which can be tested in isolation. This aligns well with the
+concept of Unit testing.
+
+For example, a KUnit test might test an individual kernel function (or even a
+single codepath through a function, such as an error handling case), rather
+than a feature as a whole.
+
+There is a KUnit test style guide which may give further pointers
+
+
+:doc:`kselftest <kselftest>`, on the other hand, is largely implemented in
+userspace, and tests are normal userspace scripts or programs.
+
+This makes it easier to write more complicated tests, or tests which need to
+manipulate the overall system state more (e.g., spawning processes, etc.).
+However, it's not possible to call kernel functions directly unless they're
+exposed to userspace (by a syscall, device, filesystem, etc.) Some tests to
+also provide a kernel module which is loaded by the test, though for tests
+which run mostly or entirely within the kernel, `KUnit` may be the better tool.
+
+`kselftest` is therefore suited well to tests of whole features, as these will
+expose an interface to userspace, which can be tested, but not implementation
+details. This aligns well with 'system' or 'end-to-end' testing.
+
+
+Code Coverage Tools
+===================
+
+The Linux Kernel supports two different code coverage mesurement tools. These
+can be used to verify that a test is executing particular functions or lines
+of code. This is useful for determining how much of the kernel is being tested,
+and for finding corner-cases which are not covered by the appropriate test.
+
+:doc:`kcov` is a feature which can be built in to the kernel to allow
+capturing coverage on a per-task level. It's therefore useful for fuzzing and
+other situations where information about code executed during, for example, a
+single syscall is useful.
+
+:doc:`gcov` is GCC's coverage testing tool, which can be used with the kernel
+to get global or per-module coverage. Unlike KCOV, it does not record per-task
+coverage. Coverage data can be read from debugfs, and interpreted using the
+usual gcov tooling.
+
+
+Sanitizers
+==========
+
+The kernel also supports a number of sanitizers, which attempt to detect
+classes of issues when the occur in a running kernel. These typically
+look for undefined behaviour of some kind, such as invalid memory accesses,
+concurrency issues such as data races, or other undefined behaviour like
+integer overflows.
+
+* :doc:`kmemleak` (Kmemleak) detects possible memory leaks.
+* :doc:`kasan` detects invalid memory accesses such as out-of-bounds and
+ use-after-free errors.
+* :doc:`ubsan` detects behaviour that is undefined by the C standard, like
+ integer overflows.
+* :doc:`kcsan` detects data races.
+* :doc:`kfence` is a low-overhead detector of memory issues, which is much
+ faster than KASAN and can be used in production.
+
+These tools tend to test the kernel as a whole, and do not "pass" like
+kselftest or KUnit tests. They can be combined with KUnit or kselftest by
+running tests on a kernel with a sanitizer enabled: you can then be sure
+that none of these errors are occurring during the test.
+
+Some of these sanitizers integrate with KUnit or kselftest and will
+automatically fail tests if an issue is detected by a sanitizer.
+
--
2.31.1.295.g9ea45b61b8-goog
Changelog
RFC v2-->v3
Based on comments by Doug Smythies,
1. Changed commit log to reflect the test must be run as super user.
2. Added a comment specifying a method to run the test bash script
without recompiling.
3. Enable all the idle states after the experiments are completed so
that the system is in a coherent state after the tests have run
4. Correct the return status of a CPU that cannot be off-lined.
RFC v2: https://lkml.org/lkml/2021/4/1/615
---
A kernel module + userspace driver to estimate the wakeup latency
caused by going into stop states. The motivation behind this program is
to find significant deviations behind advertised latency and residency
values.
The patchset measures latencies for two kinds of events. IPIs and Timers
As this is a software-only mechanism, there will additional latencies of
the kernel-firmware-hardware interactions. To account for that, the
program also measures a baseline latency on a 100 percent loaded CPU
and the latencies achieved must be in view relative to that.
To achieve this, we introduce a kernel module and expose its control
knobs through the debugfs interface that the selftests can engage with.
The kernel module provides the following interfaces within
/sys/kernel/debug/latency_test/ for,
IPI test:
ipi_cpu_dest = Destination CPU for the IPI
ipi_cpu_src = Origin of the IPI
ipi_latency_ns = Measured latency time in ns
Timeout test:
timeout_cpu_src = CPU on which the timer to be queued
timeout_expected_ns = Timer duration
timeout_diff_ns = Difference of actual duration vs expected timer
Sample output on a POWER9 system is as follows:
# --IPI Latency Test---
# Baseline Average IPI latency(ns): 3114
# Observed Average IPI latency(ns) - State0: 3265
# Observed Average IPI latency(ns) - State1: 3507
# Observed Average IPI latency(ns) - State2: 3739
# Observed Average IPI latency(ns) - State3: 3807
# Observed Average IPI latency(ns) - State4: 17070
# Observed Average IPI latency(ns) - State5: 1038174
# Observed Average IPI latency(ns) - State6: 1068784
#
# --Timeout Latency Test--
# Baseline Average timeout diff(ns): 1420
# Observed Average timeout diff(ns) - State0: 1640
# Observed Average timeout diff(ns) - State1: 1764
# Observed Average timeout diff(ns) - State2: 1715
# Observed Average timeout diff(ns) - State3: 1845
# Observed Average timeout diff(ns) - State4: 16581
# Observed Average timeout diff(ns) - State5: 939977
# Observed Average timeout diff(ns) - State6: 1073024
Things to keep in mind:
1. This kernel module + bash driver does not guarantee idleness on a
core when the IPI and the Timer is armed. It only invokes sleep and
hopes that the core is idle once the IPI/Timer is invoked onto it.
Hence this program must be run on a completely idle system for best
results
2. Even on a completely idle system, there maybe book-keeping tasks or
jitter tasks that can run on the core we want idle. This can create
outliers in the latency measurement. Thankfully, these outliers
should be large enough to easily weed them out.
3. A userspace only selftest variant was also sent out as RFC based on
suggestions over the previous patchset to simply the kernel
complexeity. However, a userspace only approach had more noise in
the latency measurement due to userspace-kernel interactions
which led to run to run variance and a lesser accurate test.
Another downside of the nature of a userspace program is that it
takes orders of magnitude longer to complete a full system test
compared to the kernel framework.
RFC patch: https://lkml.org/lkml/2020/9/2/356
4. For Intel Systems, the Timer based latencies don't exactly give out
the measure of idle latencies. This is because of a hardware
optimization mechanism that pre-arms a CPU when a timer is set to
wakeup. That doesn't make this metric useless for Intel systems,
it just means that is measuring IPI/Timer responding latency rather
than idle wakeup latencies.
(Source: https://lkml.org/lkml/2020/9/2/610)
For solution to this problem, a hardware based latency analyzer is
devised by Artem Bityutskiy from Intel.
https://youtu.be/Opk92aQyvt0?t=8266https://intel.github.io/wult/
Pratik Rajesh Sampat (2):
cpuidle: Extract IPI based and timer based wakeup latency from idle
states
selftest/cpuidle: Add support for cpuidle latency measurement
drivers/cpuidle/Makefile | 1 +
drivers/cpuidle/test-cpuidle_latency.c | 157 ++++++++++
lib/Kconfig.debug | 10 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/cpuidle/Makefile | 6 +
tools/testing/selftests/cpuidle/cpuidle.sh | 326 +++++++++++++++++++++
tools/testing/selftests/cpuidle/settings | 2 +
7 files changed, 503 insertions(+)
create mode 100644 drivers/cpuidle/test-cpuidle_latency.c
create mode 100644 tools/testing/selftests/cpuidle/Makefile
create mode 100755 tools/testing/selftests/cpuidle/cpuidle.sh
create mode 100644 tools/testing/selftests/cpuidle/settings
--
2.17.1
This series aims to clarify the behavior of the KVM_GET_EMULATED_CPUID
ioctl, and fix a corner case where -E2BIG is returned when
the nent field of struct kvm_cpuid2 is matching the amount of
emulated entries that kvm returns.
Patch 1 proposes the nent field fix to cpuid.c,
patch 2 updates the ioctl documentation accordingly and
patches 3 and 4 extend the x86_64/get_cpuid_test.c selftest to check
the intended behavior of KVM_GET_EMULATED_CPUID.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit(a)redhat.com>
---
v5:
- Better comment in cpuid.c (patch 1)
Emanuele Giuseppe Esposito (4):
KVM: x86: Fix a spurious -E2BIG in KVM_GET_EMULATED_CPUID
Documentation: KVM: update KVM_GET_EMULATED_CPUID ioctl description
selftests: add kvm_get_emulated_cpuid to processor.h
selftests: KVM: extend get_cpuid_test to include
KVM_GET_EMULATED_CPUID
Documentation/virt/kvm/api.rst | 10 +--
arch/x86/kvm/cpuid.c | 33 ++++---
.../selftests/kvm/include/x86_64/processor.h | 1 +
.../selftests/kvm/lib/x86_64/processor.c | 33 +++++++
.../selftests/kvm/x86_64/get_cpuid_test.c | 90 ++++++++++++++++++-
5 files changed, 142 insertions(+), 25 deletions(-)
--
2.30.2
This series aims to clarify the behavior of the KVM_GET_EMULATED_CPUID
ioctl, and fix a corner case where -E2BIG is returned when
the nent field of struct kvm_cpuid2 is matching the amount of
emulated entries that kvm returns.
Patch 1 proposes the nent field fix to cpuid.c,
patch 2 updates the ioctl documentation accordingly and
patches 3 and 4 extend the x86_64/get_cpuid_test.c selftest to check
the intended behavior of KVM_GET_EMULATED_CPUID.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit(a)redhat.com>
---
v4:
- Address nitpicks given in the mailing list
Emanuele Giuseppe Esposito (4):
KVM: x86: Fix a spurious -E2BIG in KVM_GET_EMULATED_CPUID
Documentation: KVM: update KVM_GET_EMULATED_CPUID ioctl description
selftests: add kvm_get_emulated_cpuid to processor.h
selftests: KVM: extend get_cpuid_test to include
KVM_GET_EMULATED_CPUID
Documentation/virt/kvm/api.rst | 10 +--
arch/x86/kvm/cpuid.c | 33 ++++---
.../selftests/kvm/include/x86_64/processor.h | 1 +
.../selftests/kvm/lib/x86_64/processor.c | 33 +++++++
.../selftests/kvm/x86_64/get_cpuid_test.c | 90 ++++++++++++++++++-
5 files changed, 142 insertions(+), 25 deletions(-)
--
2.30.2