v1: https://lkml.org/lkml/2020/7/7/1036
Changelog v1 --> v2
1. Based on Shuah Khan's comment, changed exit code to ksft_skip to
indicate the test is being skipped
2. Changed the busy workload for baseline measurement from
"yes > /dev/null" to "cat /dev/random > /dev/null", based on
observed CPU utilization: "yes" consumes ~60% CPU while the
latter consumes 100% of the CPU, giving more accurate baseline numbers
---
The patch series introduces a mechanism to measure wakeup latency for
IPI and timer based interrupts.
The motivation behind this series is to find significant deviations
from the advertised latency and residency values.
To achieve this, we introduce a kernel module and expose its control
knobs through the debugfs interface that the selftests can engage with.
The kernel module provides the following interfaces within
/sys/kernel/debug/latency_test/ for:
1. IPI test:
ipi_cpu_dest # Destination CPU for the IPI
ipi_cpu_src # Origin of the IPI
ipi_latency_ns # Measured latency time in ns
2. Timeout test:
timeout_cpu_src # CPU on which the timer is to be queued
timeout_expected_ns # Timer duration
timeout_diff_ns # Difference between actual and expected timer duration
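A minimal userspace sketch of driving the IPI knobs listed above. It assumes that writing a CPU number to ipi_cpu_dest triggers the measurement and that ipi_latency_ns holds the result once the write returns; that trigger semantics is an assumption, not something stated in this cover letter. The helper name measure_ipi_latency_ns and its base parameter are hypothetical.

```python
from pathlib import Path

# Directory named in the cover letter; needs debugfs mounted and the module loaded.
DEBUGFS_DIR = Path("/sys/kernel/debug/latency_test")


def measure_ipi_latency_ns(dest_cpu: int, base: Path = DEBUGFS_DIR) -> int:
    """Request an IPI to dest_cpu and return the reported latency in ns.

    Assumption: writing the destination CPU triggers the measurement and
    ipi_latency_ns holds the result once the write has returned.
    """
    (base / "ipi_cpu_dest").write_text(f"{dest_cpu}\n")
    return int((base / "ipi_latency_ns").read_text().strip())
```

The base parameter exists only so the helper can be pointed at a scratch directory when the module is not loaded.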
To include the module, enable the following option and build it as a module:
kernel hacking -> Cpuidle latency selftests
The selftest inserts the module, disables all the idle states and
enables them one by one, testing the following:
1. Keeping the source CPU constant, iterate through all the CPUs measuring
IPI latency for the baseline (CPU is busy with the
"cat /dev/random > /dev/null" workload) and then when the CPU is
allowed to be at rest
2. Iterating through all the CPUs, set the expected timer duration to
be equivalent to the residency of the deepest idle state
enabled, and extract the difference in time between the time of
wakeup and the expected timer duration
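The enable-one-at-a-time loop described above can be sketched against the standard cpuidle sysfs knobs (cpuN/cpuidle/stateM/disable, where writing 1 disables a state). This is a rough sketch, not the series' cpuidle.sh: the helper names and the measure callback are hypothetical, and disabling each state again after measuring it is an assumption about the intended sequencing.

```python
from pathlib import Path
from typing import Callable, Dict, List

CPU_ROOT = Path("/sys/devices/system/cpu")


def idle_state_knobs(state: int, root: Path = CPU_ROOT) -> List[Path]:
    """Return the 'disable' file of the given idle state on every CPU that has it."""
    return sorted(root.glob(f"cpu[0-9]*/cpuidle/state{state}/disable"))


def set_idle_state(state: int, enabled: bool, root: Path = CPU_ROOT) -> None:
    """Enable or disable one idle state on all CPUs (writing 1 disables it)."""
    for knob in idle_state_knobs(state, root):
        knob.write_text("0" if enabled else "1")


def walk_idle_states(num_states: int, measure: Callable[[int], object],
                     root: Path = CPU_ROOT) -> Dict[int, object]:
    """Disable every state, then enable one at a time and call measure(state)."""
    for state in range(num_states):
        set_idle_state(state, False, root)
    results = {}
    for state in range(num_states):
        set_idle_state(state, True, root)
        results[state] = measure(state)
        # Assumption: only the state under test stays enabled.
        set_idle_state(state, False, root)
    return results
```

The sketch leaves every state disabled when it returns; a real script would restore the original settings afterwards.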
Usage
-----
Can be used in conjunction with the rest of the selftests.
Default output location: tools/testing/cpuidle/cpuidle.log
To run this test specifically:
$ make -C tools/testing/selftests TARGETS="cpuidle" run_tests
There are a few optional arguments that the script can take:
[-h <help>]
[-m <location of the module>]
[-o <location of the output>]
Sample output snippet
---------------------
--IPI Latency Test---
--Baseline IPI Latency measurement: CPU Busy--
SRC_CPU DEST_CPU IPI_Latency(ns)
...
0 8 1996
0 9 2125
0 10 1264
0 11 1788
0 12 2045
Baseline Average IPI latency(ns): 1843
---Enabling state: 5---
SRC_CPU DEST_CPU IPI_Latency(ns)
0 8 621719
0 9 624752
0 10 622218
0 11 623968
0 12 621303
Expected IPI latency(ns): 100000
Observed Average IPI latency(ns): 622792
--Timeout Latency Test--
--Baseline Timeout Latency measurement: CPU Busy--
Wakeup_src Baseline_delay(ns)
...
8 2249
9 2226
10 2211
11 2183
12 2263
Baseline Average timeout diff(ns): 2226
---Enabling state: 5---
8 10749
9 10911
10 10912
11 12100
12 73276
Expected timeout(ns): 10000200
Observed Average timeout diff(ns): 23589
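As a quick cross-check of the snippet above, the reported averages equal the truncated integer mean of the rows that are shown. The sketch below assumes the script truncates the way shell integer arithmetic would; the baseline tables also contain elided rows ("..."), so the agreement there may be partly coincidental. The name int_mean is hypothetical.

```python
import statistics


def int_mean(samples):
    """Truncated integer mean, as shell integer arithmetic would produce."""
    return sum(samples) // len(samples)


baseline_ipi = [1996, 2125, 1264, 1788, 2045]
state5_ipi = [621719, 624752, 622218, 623968, 621303]
baseline_timeout = [2249, 2226, 2211, 2183, 2263]
state5_timeout = [10749, 10911, 10912, 12100, 73276]

print(int_mean(baseline_ipi))      # 1843
print(int_mean(state5_ipi))        # 622792
print(int_mean(baseline_timeout))  # 2226
print(int_mean(state5_timeout))    # 23589

# Observed state-5 IPI latency relative to the expected 100000 ns.
print(round(int_mean(state5_ipi) / 100000, 2))  # 6.23

# The 73276 ns sample on CPU 12 pulls the mean well above the median.
print(statistics.median(state5_timeout))  # 10912
```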
Pratik Rajesh Sampat (2):
cpuidle: Trace IPI based and timer based wakeup latency from idle
states
selftest/cpuidle: Add support for cpuidle latency measurement
drivers/cpuidle/Makefile | 1 +
drivers/cpuidle/test-cpuidle_latency.c | 150 ++++++++++++
lib/Kconfig.debug | 10 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/cpuidle/Makefile | 6 +
tools/testing/selftests/cpuidle/cpuidle.sh | 257 +++++++++++++++++++++
tools/testing/selftests/cpuidle/settings | 1 +
7 files changed, 426 insertions(+)
create mode 100644 drivers/cpuidle/test-cpuidle_latency.c
create mode 100644 tools/testing/selftests/cpuidle/Makefile
create mode 100755 tools/testing/selftests/cpuidle/cpuidle.sh
create mode 100644 tools/testing/selftests/cpuidle/settings
--
2.25.4
The goal for this series is to avoid device private memory TLB
invalidations when migrating a range of addresses from system
memory to device private memory and some of those pages have already
been migrated. The approach taken is to introduce a new mmu notifier
invalidation event type and use that in the device driver to skip
invalidation callbacks from migrate_vma_setup(). The device driver is
also then expected to handle device MMU invalidations as part of the
migrate_vma_setup(), migrate_vma_pages(), migrate_vma_finalize() process.
Note that this is opt-in. A device driver can simply invalidate its MMU
in the mmu notifier callback and not handle MMU invalidations in the
migration sequence.
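The opt-in decision described above can be modelled in a few lines. This is only an illustrative Python model of the callback logic, not kernel code: Event, NotifierRange and DeviceDriver are hypothetical names, Event.MIGRATE stands in for the new invalidation event type, and only the migrate_pgmap_owner field name comes from the changelog.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Event(Enum):
    UNMAP = auto()
    MIGRATE = auto()  # stand-in for the new invalidation event type


@dataclass
class NotifierRange:
    event: Event
    start: int
    end: int
    migrate_pgmap_owner: object = None  # field name taken from the v2 changelog


class DeviceDriver:
    def __init__(self, opt_in: bool = True):
        self.opt_in = opt_in
        self.invalidated = []  # (start, end) ranges flushed from the device MMU

    def invalidate_callback(self, rng: NotifierRange) -> None:
        """Model of the driver's mmu notifier invalidation callback."""
        if (self.opt_in and rng.event is Event.MIGRATE
                and rng.migrate_pgmap_owner is self):
            # Our own migration: the device MMU is handled in the migrate sequence.
            return
        self.invalidated.append((rng.start, rng.end))
```

A driver that does not opt in (opt_in=False here) simply invalidates on every callback, matching the fallback behaviour described above.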
This series is based on Jason Gunthorpe's HMM tree (linux-5.8.0-rc4).
Also, this replaces the need for the following two patches I sent:
("mm: fix migrate_vma_setup() src_owner and normal pages")
https://lore.kernel.org/linux-mm/20200622222008.9971-1-rcampbell@nvidia.com
("nouveau: fix mixed normal and device private page migration")
https://lore.kernel.org/lkml/20200622233854.10889-3-rcampbell@nvidia.com
Changes in v2:
Rebase to Jason Gunthorpe's HMM tree.
Added reviewed-by from Bharata B Rao.
Rename the mmu_notifier_range::data field to migrate_pgmap_owner as
suggested by Jason Gunthorpe.
Ralph Campbell (5):
nouveau: fix storing invalid ptes
mm/migrate: add a direction parameter to migrate_vma
mm/notifier: add migration invalidation type
nouveau/svm: use the new migration invalidation
mm/hmm/test: use the new migration invalidation
arch/powerpc/kvm/book3s_hv_uvmem.c | 2 ++
drivers/gpu/drm/nouveau/nouveau_dmem.c | 13 ++++++--
drivers/gpu/drm/nouveau/nouveau_svm.c | 10 +++++-
drivers/gpu/drm/nouveau/nouveau_svm.h | 1 +
.../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 13 +++++---
include/linux/migrate.h | 12 +++++--
include/linux/mmu_notifier.h | 7 ++++
lib/test_hmm.c | 33 +++++++++++--------
mm/migrate.c | 13 ++++++--
9 files changed, 77 insertions(+), 27 deletions(-)
--
2.20.1
Currently, KUnit does not allow the use of tests as a module.
This prevents the implementation of tests that require userspace.
This patchset makes this possible by introducing the use of
the root filesystem in KUnit. It also allows the use of tests
that can be compiled as a module.
Vitor Massaru Iha (3):
kunit: tool: Add support root filesystem in kunit-tool
lib: Allows to borrow mm in userspace on KUnit
lib: Convert test_user_copy to KUnit test
include/kunit/test.h | 1 +
lib/Kconfig.debug | 17 ++
lib/Makefile | 2 +-
lib/kunit/try-catch.c | 15 +-
lib/{test_user_copy.c => user_copy_kunit.c} | 196 +++++++++-----------
tools/testing/kunit/kunit.py | 37 +++-
tools/testing/kunit/kunit_kernel.py | 105 +++++++++--
7 files changed, 238 insertions(+), 135 deletions(-)
rename lib/{test_user_copy.c => user_copy_kunit.c} (55%)
base-commit: 725aca9585956676687c4cb803e88f770b0df2b2
prerequisite-patch-id: 582b6d9d28ce4b71628890ec832df6522ca68de0
--
2.26.2
This fixes the way the Authority Mask Register (AMR) is updated
by the existing pkey tests and adds a new test to verify the
functionality of execute-disabled pkeys.
Previous versions can be found at:
v2: https://lore.kernel.org/linuxppc-dev/20200527030342.13712-1-sandipan@linux.…
v1: https://lore.kernel.org/linuxppc-dev/20200508162332.65316-1-sandipan@linux.…
Changes in v3:
- Fixed AMR writes for existing pkey tests (new patch).
- Moved Hash MMU check under utilities (new patch) and removed duplicate
code.
- Fixed comments on why the pkey permission bits were redefined.
- Switched to existing mfspr() macro for reading AMR.
- Switched to sig_atomic_t as data type for variables updated in the
signal handlers.
- Switched to exit()-ing if the signal handlers come across an unexpected
condition instead of trying to reset page and pkey permissions.
- Switched to write() from printf() for printing error messages from
the signal handlers.
- Switched to getpagesize().
- Renamed fault counter to denote remaining faults.
- Dropped unnecessary randomization for choosing an address to fault at.
- Added additional information on change in permissions due to AMR and
IAMR bits in comments.
- Switched the first instruction word of the executable region to a trap
to test if it is actually overwritten by a no-op later.
- Added a new test scenario where the pkey imposes no restrictions and
an attempt is made to jump to the executable region again.
Changes in v2:
- Added .gitignore entry for test binary.
- Fixed builds for older distros where siginfo_t might not have si_pkey as
a formal member based on discussion with Michael.
Sandipan Das (3):
selftests: powerpc: Fix pkey access right updates
selftests: powerpc: Move Hash MMU check to utilities
selftests: powerpc: Add test for execute-disabled pkeys
tools/testing/selftests/powerpc/include/reg.h | 6 +
.../testing/selftests/powerpc/include/utils.h | 1 +
tools/testing/selftests/powerpc/mm/.gitignore | 1 +
tools/testing/selftests/powerpc/mm/Makefile | 5 +-
.../selftests/powerpc/mm/bad_accesses.c | 28 --
.../selftests/powerpc/mm/pkey_exec_prot.c | 388 ++++++++++++++++++
.../selftests/powerpc/ptrace/core-pkey.c | 2 +-
.../selftests/powerpc/ptrace/ptrace-pkey.c | 2 +-
tools/testing/selftests/powerpc/utils.c | 28 ++
9 files changed, 429 insertions(+), 32 deletions(-)
create mode 100644 tools/testing/selftests/powerpc/mm/pkey_exec_prot.c
--
2.25.1
On Thu, Jul 09, 2020 at 09:27:43AM -0700, Andy Lutomirski wrote:
> On Thu, Jul 9, 2020 at 9:22 AM Dave Hansen <dave.hansen(a)intel.com> wrote:
> >
> > On 7/9/20 9:07 AM, Andy Lutomirski wrote:
> > > On Thu, Jul 9, 2020 at 8:56 AM Dave Hansen <dave.hansen(a)intel.com> wrote:
> > >> On 7/9/20 8:44 AM, Andersen, John wrote:
> > >>> Bits which are allowed to be pinned default to WP for CR0 and SMEP,
> > >>> SMAP, and UMIP for CR4.
> > >> I think it also makes sense to have FSGSBASE in this set.
> > >>
> > >> I know it hasn't been tested, but I think we should do the legwork to
> > >> test it. If not in this set, can we agree that it's a logical next step?
> > > I have no objection to pinning FSGSBASE, but is there a clear
> > > description of the threat model that this whole series is meant to
> > > address? The idea is to provide a degree of protection against an
> > > attacker who is able to convince a guest kernel to write something
> > > inappropriate to CR4, right? How realistic is this?
> >
> > If a quick search can find this:
> >
> > > https://googleprojectzero.blogspot.com/2017/05/exploiting-linux-kernel-via-…
> >
> > I'd pretty confident that the guys doing actual bad things have it in
> > their toolbox too.
> >
>
> True, but we have the existing software CR4 pinning. I suppose the
> virtualization version is stronger.
>
Yes, as Kees said this will be stronger because it stops ROP and other gadget
based techniques which avoid the use of native_write_cr0/4().
With regards to what should be done in this patchset and what in other
patchsets: I have a fix for kexec thanks to Arvind's note about
TRAMPOLINE_32BIT_CODE_SIZE. The physical host boots fine now and the virtual
one can kexec fine.
What remains to be done on that front is to add some identifying information to
the kernel image to declare whether it supports paravirtualized control register
pinning.
Liran suggested adding a section to the built image acting as a flag to signify
support for being kexec'd by a kernel with pinning enabled. If anyone has any
opinions on how they'd like to see this implemented please let me know.
Otherwise I'll just take a stab at it and you'll all see it hopefully in the
next version.
With regards to FSGSBASE, are we open to validating and adding that to the
DEFAULT set as a part of a separate patchset? This patchset is focused on
replicating the functionality we already have natively.
(If anyone got this email twice, sorry I messed up the From: field the first
time around)
Hello
At first, I thought that the proposed system call is capable of
reading *multiple* small files using a single system call - which
would help increase HDD/SSD queue utilization and increase IOPS (I/O
operations per second) - but that isn't the case and the proposed
system call can read just a single file.
Without the ability to read multiple small files using a single system
call, it is impossible to increase IOPS (unless an application is
using multiple reader threads or somehow instructs the kernel to
prefetch multiple files into memory).
While you are at it, why not also add a readfiles system call to read
multiple, presumably small, files? The initial unoptimized
implementation of readfiles syscall can simply call readfile
sequentially.
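To make the suggestion concrete, here is a userspace sketch of that unoptimized approach. Both functions are hypothetical stand-ins built on ordinary open/read/close; they do not reflect the proposed syscall's actual signature, and a kernel-side readfiles could batch or reorder the I/O instead.

```python
import os
from typing import Dict, Iterable


def readfile(path: str, bufsize: int = 4096) -> bytes:
    """Userspace stand-in: open, read up to bufsize bytes, close."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, bufsize)
    finally:
        os.close(fd)


def readfiles(paths: Iterable[str], bufsize: int = 4096) -> Dict[str, bytes]:
    """Unoptimized batch read: one readfile() per path, in order."""
    return {path: readfile(path, bufsize) for path in paths}
```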
Sincerely
Jan (atomsymbol)
With procps v3.3.16, the sysctl command doesn't print the set key and
value on error. This change breaks the livepatch selftest test-ftrace.sh,
which tests the interaction with sysctl ftrace_enabled.
Make it work with all sysctl versions by using the '-q' option.
Explicitly print the final status on success so that it can be verified
in the log. The error message is enough on failure.
Reported-by: Kamalesh Babulal <kamalesh(a)linux.vnet.ibm.com>
Signed-off-by: Petr Mladek <pmladek(a)suse.com>
---
The patch has been created against livepatch.git,
branch for-5.9/selftests-cleanup. But it applies also against
the current Linus' tree.
tools/testing/selftests/livepatch/functions.sh | 3 ++-
tools/testing/selftests/livepatch/test-ftrace.sh | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/livepatch/functions.sh b/tools/testing/selftests/livepatch/functions.sh
index 408529d94ddb..1aba83c87ad3 100644
--- a/tools/testing/selftests/livepatch/functions.sh
+++ b/tools/testing/selftests/livepatch/functions.sh
@@ -75,7 +75,8 @@ function set_dynamic_debug() {
}
function set_ftrace_enabled() {
- result=$(sysctl kernel.ftrace_enabled="$1" 2>&1 | paste --serial --delimiters=' ')
+ result=$(sysctl -q kernel.ftrace_enabled="$1" 2>&1 && \
+ sysctl kernel.ftrace_enabled 2>&1)
echo "livepatch: $result" > /dev/kmsg
}
diff --git a/tools/testing/selftests/livepatch/test-ftrace.sh b/tools/testing/selftests/livepatch/test-ftrace.sh
index 9160c9ec3b6f..552e165512f4 100755
--- a/tools/testing/selftests/livepatch/test-ftrace.sh
+++ b/tools/testing/selftests/livepatch/test-ftrace.sh
@@ -51,7 +51,7 @@ livepatch: '$MOD_LIVEPATCH': initializing patching transition
livepatch: '$MOD_LIVEPATCH': starting patching transition
livepatch: '$MOD_LIVEPATCH': completing patching transition
livepatch: '$MOD_LIVEPATCH': patching complete
-livepatch: sysctl: setting key \"kernel.ftrace_enabled\": Device or resource busy kernel.ftrace_enabled = 0
+livepatch: sysctl: setting key \"kernel.ftrace_enabled\": Device or resource busy
% echo 0 > /sys/kernel/livepatch/$MOD_LIVEPATCH/enabled
livepatch: '$MOD_LIVEPATCH': initializing unpatching transition
livepatch: '$MOD_LIVEPATCH': starting unpatching transition
--
2.26.2