Hi all,
This series proposes a rework of xstate-related tests to improve
maintainability and expand test coverage.
== Motivation: Addressing Missing and New XSTATE Tests ==
With the introduction of AMX, a new test suite [1] was created to verify
dynamic state handling by the kernel as observed from userspace. However,
previous tests for non-dynamic states like AVX lacked ABI validation,
leaving gaps in coverage. While these states currently function without
major issues (following the alternate sigstack fix [2]), xstate testing
in the x86 selftest suite has been largely overlooked.
Now, with Intel introducing another extended state, Advanced Performance
Extensions (APX) [3], a correspondent test case is need. The APX enabling
series will follow shortly and will leverage this refactored selftest
framework.
== Selftest Code Rework ==
To ensure ABI validation and core functionality across various xstates,
refactoring the test code is necessary. Without this, existing code from
amx.c would need to be duplicated, compromising the structural quality of
xstate tests.
This series introduces a shared test framework for extended state
validation, applicable to both existing and new xstates. The test cases
cover:
* Context switching
* ABI compatibility for signal handling
* ABI compatibility for ptrace interactions
== Patch Organization ==
The patchset is structured as follows:
* PATCH1: Preparatory cleanup — removing redundant signal handler
registration code.
* PATCH2/3: Introduce low-level XSAVE helpers and xstate component
enumeration.
* PATCH4/5: Refactor existing test code.
* PATCH6: Introduce a new signal test case.
* PATCH7/8: Consolidate test invocations and clarify the list of
supported features.
* PATCH9: Add test coverage for AVX.
== Coverage and Future Work Considerations ==
Currently, these tests are aligned with 64-bit mode only. Support for
32-bit cases will be considered when necessary, but only after this phase
of rework is finalized.
FWIW, the AMX TILECFG state is trivial, requiring almost constant values.
Additionally, various PKRU tests are already established in
tools/selftests/mm.
This series is based on the tip/master branch. You can also find it in
the following repository:
git://github.com/intel/apx.git selftest-xstate_v1
Thanks,
Chang
[1] https://lore.kernel.org/all/20211026122523.AFB99C1F@davehans-spike.ostc.int…
[2] https://lore.kernel.org/lkml/20210518200320.17239-1-chang.seok.bae@intel.co…
[3] https://www.intel.com/content/www/us/en/developer/articles/technical/advanc…
Chang S. Bae (9):
selftests/x86: Consolidate redundant signal helper functions
selftests/x86/xstate: Refactor XSAVE helpers for general use
selftests/x86/xstate: Enumerate and name xstate components
selftests/x86/xstate: Refactor context switching test
selftests/x86/xstate: Refactor ptrace ABI test
selftests/x86/xstate: Introduce signal ABI test
selftests/x86/xstate: Consolidate test invocations into a single entry
selftests/x86/xstate: Clarify supported xstates
selftests/x86/avx: Add AVX test
tools/testing/selftests/x86/Makefile | 6 +-
tools/testing/selftests/x86/amx.c | 442 +---------------
tools/testing/selftests/x86/avx.c | 12 +
.../selftests/x86/corrupt_xstate_header.c | 14 +-
tools/testing/selftests/x86/entry_from_vm86.c | 24 +-
tools/testing/selftests/x86/fsgsbase.c | 24 +-
tools/testing/selftests/x86/helpers.h | 28 +
tools/testing/selftests/x86/ioperm.c | 25 +-
tools/testing/selftests/x86/iopl.c | 25 +-
tools/testing/selftests/x86/ldt_gdt.c | 18 +-
tools/testing/selftests/x86/mov_ss_trap.c | 14 +-
tools/testing/selftests/x86/ptrace_syscall.c | 24 +-
tools/testing/selftests/x86/sigaltstack.c | 26 +-
tools/testing/selftests/x86/sigreturn.c | 24 +-
.../selftests/x86/single_step_syscall.c | 22 -
.../testing/selftests/x86/syscall_arg_fault.c | 12 -
tools/testing/selftests/x86/syscall_nt.c | 12 -
tools/testing/selftests/x86/sysret_rip.c | 24 +-
tools/testing/selftests/x86/test_vsyscall.c | 13 -
tools/testing/selftests/x86/unwind_vdso.c | 12 -
tools/testing/selftests/x86/xstate.c | 477 ++++++++++++++++++
tools/testing/selftests/x86/xstate.h | 195 +++++++
22 files changed, 753 insertions(+), 720 deletions(-)
create mode 100644 tools/testing/selftests/x86/avx.c
create mode 100644 tools/testing/selftests/x86/xstate.c
create mode 100644 tools/testing/selftests/x86/xstate.h
base-commit: 5bff053d066ba892464995ae4a7246f7a78fce2d
--
2.45.2
v1/v2:
There is only the first patch: RISC-V: Enable cbo.clean/flush in usermode,
which mainly removes the enabling of cbo.inval in user mode.
v3:
Add the functionality of Expose Zicbom and selftests for Zicbom.
v4:
Modify the order of macros, The test_no_cbo_inval function is added
separately.
v5:
1. Modify the order of RISCV_HWPROBE_KEY_ZICBOM_BLOCK_SIZE in hwprobe.rst
2. "TEST_NO_ZICBOINVAL" -> "TEST_NO_CBO_INVAL"
Yunhui Cui (3):
RISC-V: Enable cbo.clean/flush in usermode
RISC-V: hwprobe: Expose Zicbom extension and its block size
RISC-V: selftests: Add TEST_ZICBOM into CBO tests
Documentation/arch/riscv/hwprobe.rst | 6 ++
arch/riscv/include/asm/hwprobe.h | 2 +-
arch/riscv/include/uapi/asm/hwprobe.h | 2 +
arch/riscv/kernel/cpufeature.c | 8 +++
arch/riscv/kernel/sys_hwprobe.c | 6 ++
tools/testing/selftests/riscv/hwprobe/cbo.c | 66 +++++++++++++++++----
6 files changed, 78 insertions(+), 12 deletions(-)
--
2.39.2
A task in the kernel (task_mm_cid_work) runs somewhat periodically to
compact the mm_cid for each process. Add a test to validate that it runs
correctly and timely.
The test spawns 1 thread pinned to each CPU, then each thread, including
the main one, runs in short bursts for some time. During this period, the
mm_cids should be spanning all numbers between 0 and nproc.
At the end of this phase, a thread with high enough mm_cid (>= nproc/2)
is selected to be the new leader, all other threads terminate.
After some time, the only running thread should see 0 as mm_cid, if that
doesn't happen, the compaction mechanism didn't work and the test fails.
The test never fails if only 1 core is available, in which case, we
cannot test anything as the only available mm_cid is 0.
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Signed-off-by: Gabriele Monaco <gmonaco(a)redhat.com>
---
tools/testing/selftests/rseq/.gitignore | 1 +
tools/testing/selftests/rseq/Makefile | 2 +-
.../selftests/rseq/mm_cid_compaction_test.c | 200 ++++++++++++++++++
3 files changed, 202 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/rseq/mm_cid_compaction_test.c
diff --git a/tools/testing/selftests/rseq/.gitignore b/tools/testing/selftests/rseq/.gitignore
index 16496de5f6ce4..2c89f97e4f737 100644
--- a/tools/testing/selftests/rseq/.gitignore
+++ b/tools/testing/selftests/rseq/.gitignore
@@ -3,6 +3,7 @@ basic_percpu_ops_test
basic_percpu_ops_mm_cid_test
basic_test
basic_rseq_op_test
+mm_cid_compaction_test
param_test
param_test_benchmark
param_test_compare_twice
diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile
index 5a3432fceb586..ce1b38f46a355 100644
--- a/tools/testing/selftests/rseq/Makefile
+++ b/tools/testing/selftests/rseq/Makefile
@@ -16,7 +16,7 @@ OVERRIDE_TARGETS = 1
TEST_GEN_PROGS = basic_test basic_percpu_ops_test basic_percpu_ops_mm_cid_test param_test \
param_test_benchmark param_test_compare_twice param_test_mm_cid \
- param_test_mm_cid_benchmark param_test_mm_cid_compare_twice
+ param_test_mm_cid_benchmark param_test_mm_cid_compare_twice mm_cid_compaction_test
TEST_GEN_PROGS_EXTENDED = librseq.so
diff --git a/tools/testing/selftests/rseq/mm_cid_compaction_test.c b/tools/testing/selftests/rseq/mm_cid_compaction_test.c
new file mode 100644
index 0000000000000..7ddde3b657dd6
--- /dev/null
+++ b/tools/testing/selftests/rseq/mm_cid_compaction_test.c
@@ -0,0 +1,200 @@
+// SPDX-License-Identifier: LGPL-2.1
+#define _GNU_SOURCE
+#include <assert.h>
+#include <pthread.h>
+#include <sched.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stddef.h>
+
+#include "../kselftest.h"
+#include "rseq.h"
+
+#define VERBOSE 0
+#define printf_verbose(fmt, ...) \
+ do { \
+ if (VERBOSE) \
+ printf(fmt, ##__VA_ARGS__); \
+ } while (0)
+
+/* 0.5 s */
+#define RUNNER_PERIOD 500000
+/* Number of runs before we terminate or get the token */
+#define THREAD_RUNS 5
+
+/*
+ * Number of times we check that the mm_cid were compacted.
+ * Checks are repeated every RUNNER_PERIOD.
+ */
+#define MM_CID_COMPACT_TIMEOUT 10
+
+struct thread_args {
+ int cpu;
+ int num_cpus;
+ pthread_mutex_t *token;
+ pthread_barrier_t *barrier;
+ pthread_t *tinfo;
+ struct thread_args *args_head;
+};
+
+static void __noreturn *thread_runner(void *arg)
+{
+ struct thread_args *args = arg;
+ int i, ret, curr_mm_cid;
+ cpu_set_t cpumask;
+
+ CPU_ZERO(&cpumask);
+ CPU_SET(args->cpu, &cpumask);
+ ret = pthread_setaffinity_np(pthread_self(), sizeof(cpumask), &cpumask);
+ if (ret) {
+ errno = ret;
+ perror("Error: failed to set affinity");
+ abort();
+ }
+ pthread_barrier_wait(args->barrier);
+
+ for (i = 0; i < THREAD_RUNS; i++)
+ usleep(RUNNER_PERIOD);
+ curr_mm_cid = rseq_current_mm_cid();
+ /*
+ * We select one thread with high enough mm_cid to be the new leader.
+ * All other threads (including the main thread) will terminate.
+ * After some time, the mm_cid of the only remaining thread should
+ * converge to 0, if not, the test fails.
+ */
+ if (curr_mm_cid >= args->num_cpus / 2 &&
+ !pthread_mutex_trylock(args->token)) {
+ printf_verbose(
+ "cpu%d has mm_cid=%d and will be the new leader.\n",
+ sched_getcpu(), curr_mm_cid);
+ for (i = 0; i < args->num_cpus; i++) {
+ if (args->tinfo[i] == pthread_self())
+ continue;
+ ret = pthread_join(args->tinfo[i], NULL);
+ if (ret) {
+ errno = ret;
+ perror("Error: failed to join thread");
+ abort();
+ }
+ }
+ pthread_barrier_destroy(args->barrier);
+ free(args->tinfo);
+ free(args->token);
+ free(args->barrier);
+ free(args->args_head);
+
+ for (i = 0; i < MM_CID_COMPACT_TIMEOUT; i++) {
+ curr_mm_cid = rseq_current_mm_cid();
+ printf_verbose("run %d: mm_cid=%d on cpu%d.\n", i,
+ curr_mm_cid, sched_getcpu());
+ if (curr_mm_cid == 0)
+ exit(EXIT_SUCCESS);
+ usleep(RUNNER_PERIOD);
+ }
+ exit(EXIT_FAILURE);
+ }
+ printf_verbose("cpu%d has mm_cid=%d and is going to terminate.\n",
+ sched_getcpu(), curr_mm_cid);
+ pthread_exit(NULL);
+}
+
+int test_mm_cid_compaction(void)
+{
+ cpu_set_t affinity;
+ int i, j, ret = 0, num_threads;
+ pthread_t *tinfo;
+ pthread_mutex_t *token;
+ pthread_barrier_t *barrier;
+ struct thread_args *args;
+
+ sched_getaffinity(0, sizeof(affinity), &affinity);
+ num_threads = CPU_COUNT(&affinity);
+ tinfo = calloc(num_threads, sizeof(*tinfo));
+ if (!tinfo) {
+ perror("Error: failed to allocate tinfo");
+ return -1;
+ }
+ args = calloc(num_threads, sizeof(*args));
+ if (!args) {
+ perror("Error: failed to allocate args");
+ ret = -1;
+ goto out_free_tinfo;
+ }
+ token = malloc(sizeof(*token));
+ if (!token) {
+ perror("Error: failed to allocate token");
+ ret = -1;
+ goto out_free_args;
+ }
+ barrier = malloc(sizeof(*barrier));
+ if (!barrier) {
+ perror("Error: failed to allocate barrier");
+ ret = -1;
+ goto out_free_token;
+ }
+ if (num_threads == 1) {
+ fprintf(stderr, "Cannot test on a single cpu. "
+ "Skipping mm_cid_compaction test.\n");
+ /* only skipping the test, this is not a failure */
+ goto out_free_barrier;
+ }
+ pthread_mutex_init(token, NULL);
+ ret = pthread_barrier_init(barrier, NULL, num_threads);
+ if (ret) {
+ errno = ret;
+ perror("Error: failed to initialise barrier");
+ goto out_free_barrier;
+ }
+ for (i = 0, j = 0; i < CPU_SETSIZE && j < num_threads; i++) {
+ if (!CPU_ISSET(i, &affinity))
+ continue;
+ args[j].num_cpus = num_threads;
+ args[j].tinfo = tinfo;
+ args[j].token = token;
+ args[j].barrier = barrier;
+ args[j].cpu = i;
+ args[j].args_head = args;
+ if (!j) {
+ /* The first thread is the main one */
+ tinfo[0] = pthread_self();
+ ++j;
+ continue;
+ }
+ ret = pthread_create(&tinfo[j], NULL, thread_runner, &args[j]);
+ if (ret) {
+ errno = ret;
+ perror("Error: failed to create thread");
+ abort();
+ }
+ ++j;
+ }
+ printf_verbose("Started %d threads.\n", num_threads);
+
+ /* Also main thread will terminate if it is not selected as leader */
+ thread_runner(&args[0]);
+
+ /* only reached in case of errors */
+out_free_barrier:
+ free(barrier);
+out_free_token:
+ free(token);
+out_free_args:
+ free(args);
+out_free_tinfo:
+ free(tinfo);
+
+ return ret;
+}
+
+int main(int argc, char **argv)
+{
+ if (!rseq_mm_cid_available()) {
+ fprintf(stderr, "Error: rseq_mm_cid unavailable\n");
+ return -1;
+ }
+ if (test_mm_cid_compaction())
+ return -1;
+ return 0;
+}
--
2.48.1
Some distributions may not enable MPTCP by default. All other MPTCP tests
source mptcp_lib.sh to ensure MPTCP is enabled before testing. However,
the ip_local_port_range test is the only one that does not include this
step.
Let's also ensure MPTCP is enabled in netns for ip_local_port_range so
that it passes on all distributions.
Suggested-by: Davide Caratti <dcaratti(a)redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin(a)gmail.com>
---
tools/testing/selftests/net/ip_local_port_range.sh | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/ip_local_port_range.sh b/tools/testing/selftests/net/ip_local_port_range.sh
index 6c6ad346eaa0..4ff746db1256 100755
--- a/tools/testing/selftests/net/ip_local_port_range.sh
+++ b/tools/testing/selftests/net/ip_local_port_range.sh
@@ -2,4 +2,6 @@
# SPDX-License-Identifier: GPL-2.0
./in_netns.sh \
- sh -c 'sysctl -q -w net.ipv4.ip_local_port_range="40000 49999" && ./ip_local_port_range'
+ sh -c 'sysctl -q -w net.mptcp.enabled=1 && \
+ sysctl -q -w net.ipv4.ip_local_port_range="40000 49999" && \
+ ./ip_local_port_range'
--
2.46.0
Some drivers, like tg3, do not set combined-count:
$ ethtool -l enp4s0f1
Channel parameters for enp4s0f1:
Pre-set maximums:
RX: 4
TX: 4
Other: n/a
Combined: n/a
Current hardware settings:
RX: 4
TX: 1
Other: n/a
Combined: n/a
In the case where combined-count is not set, the ethtool netlink code
in the kernel elides the value and the code in the test:
netnl.channels_get(...)
With a tg3 device, the returned dictionary looks like:
{'header': {'dev-index': 3, 'dev-name': 'enp4s0f1'},
'rx-max': 4,
'rx-count': 4,
'tx-max': 4,
'tx-count': 1}
Note that the key 'combined-count' is missing. As a result of this
missing key the test raises an exception:
# Exception| if channels['combined-count'] == 0:
# Exception| ~~~~~~~~^^^^^^^^^^^^^^^^^^
# Exception| KeyError: 'combined-count'
Change the test to check if 'combined-count' is a key in the dictionary
first and if not assume that this means the driver has separate RX and
TX queues.
With this change, the test now passes successfully on tg3 and mlx5
(which does have a 'combined-count').
Fixes: 1cf270424218 ("net: selftest: add test for netdev netlink queue-get API")
Signed-off-by: Joe Damato <jdamato(a)fastly.com>
---
tools/testing/selftests/drivers/net/queues.py | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/queues.py b/tools/testing/selftests/drivers/net/queues.py
index 38303da957ee..baa8845d9f64 100755
--- a/tools/testing/selftests/drivers/net/queues.py
+++ b/tools/testing/selftests/drivers/net/queues.py
@@ -45,10 +45,13 @@ def addremove_queues(cfg, nl) -> None:
netnl = EthtoolFamily()
channels = netnl.channels_get({'header': {'dev-index': cfg.ifindex}})
- if channels['combined-count'] == 0:
- rx_type = 'rx'
+ if 'combined-count' in channels:
+ if channels['combined-count'] == 0:
+ rx_type = 'rx'
+ else:
+ rx_type = 'combined'
else:
- rx_type = 'combined'
+ rx_type = 'rx'
expected = curr_queues - 1
cmd(f"ethtool -L {cfg.dev['ifname']} {rx_type} {expected}", timeout=10)
base-commit: bc50682128bde778a1ddc457a02d92a637c20c6f
--
2.43.0
Fix three DAMON selftest bugs that causes two and one false positive
failures and success.
SeongJae Park (3):
selftests/damon/damos_quota: make real expectation of quota exceeds
selftests/damon/damon_nr_regions: set ops update for merge results
check to 100ms
selftests/damon/damon_nr_regions: sort collected regiosn before
checking with min/max boundaries
tools/testing/selftests/damon/damon_nr_regions.py | 2 ++
tools/testing/selftests/damon/damos_quota.py | 9 ++++++---
2 files changed, 8 insertions(+), 3 deletions(-)
base-commit: 0ab548cd0961a01f9ef65aa999ca84febcdb04ab
--
2.39.5
GRO tests are timing dependent and can easily flake. This is partially
mitigated in gro.sh by giving each subtest 3 chances to pass. However,
this still flakes on some machines.
Set the device's napi_defer_hard_irqs to 50 so that GRO is less likely
to immediately flush. This already happened in setup_loopback.sh, but
wasn't added to setup_veth.sh. This accounts for most of the reduction
in flakiness.
We also increase the number of chances for success from 3 to 6.
`gro.sh -t <test>` now returns a passing/failing exit code as expected.
gro.c:main no longer erroneously claims a test passes when running as a
server.
Tested: Ran `gro.sh -t large` 100 times with and without this change.
Passed 100/100 with and 64/100 without. Ran inside strace to increase
flakiness.
Signed-off-by: Kevin Krakauer <krakauer(a)google.com>
---
tools/testing/selftests/net/gro.c | 8 +++++---
tools/testing/selftests/net/gro.sh | 5 +++--
tools/testing/selftests/net/setup_veth.sh | 1 +
3 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/net/gro.c b/tools/testing/selftests/net/gro.c
index b2184847e388..d5824eadea10 100644
--- a/tools/testing/selftests/net/gro.c
+++ b/tools/testing/selftests/net/gro.c
@@ -1318,11 +1318,13 @@ int main(int argc, char **argv)
read_MAC(src_mac, smac);
read_MAC(dst_mac, dmac);
- if (tx_socket)
+ if (tx_socket) {
gro_sender();
- else
+ } else {
+ /* Only the receiver exit status determines test success. */
gro_receiver();
+ fprintf(stderr, "Gro::%s test passed.\n", testname);
+ }
- fprintf(stderr, "Gro::%s test passed.\n", testname);
return 0;
}
diff --git a/tools/testing/selftests/net/gro.sh b/tools/testing/selftests/net/gro.sh
index 02c21ff4ca81..703173f8c8a9 100755
--- a/tools/testing/selftests/net/gro.sh
+++ b/tools/testing/selftests/net/gro.sh
@@ -21,7 +21,7 @@ run_test() {
# Each test is run 3 times to deflake, because given the receive timing,
# not all packets that should coalesce will be considered in the same flow
# on every try.
- for tries in {1..3}; do
+ for tries in {1..6}; do
# Actual test starts here
ip netns exec $server_ns ./gro "${ARGS[@]}" "--rx" "--iface" "server" \
1>>log.txt &
@@ -100,5 +100,6 @@ trap cleanup EXIT
if [[ "${test}" == "all" ]]; then
run_all_tests
else
- run_test "${proto}" "${test}"
+ exit_code=$(run_test "${proto}" "${test}")
+ exit $exit_code
fi;
diff --git a/tools/testing/selftests/net/setup_veth.sh b/tools/testing/selftests/net/setup_veth.sh
index 1f78a87f6f37..9882ad730c24 100644
--- a/tools/testing/selftests/net/setup_veth.sh
+++ b/tools/testing/selftests/net/setup_veth.sh
@@ -12,6 +12,7 @@ setup_veth_ns() {
[[ -e /var/run/netns/"${ns_name}" ]] || ip netns add "${ns_name}"
echo 1000000 > "/sys/class/net/${ns_dev}/gro_flush_timeout"
+ echo 50 > "/sys/class/net/${ns_dev}/napi_defer_hard_irqs"
ip link set dev "${ns_dev}" netns "${ns_name}" mtu 65535
ip -netns "${ns_name}" link set dev "${ns_dev}" up
--
2.48.1
This patch series extends the sev_init2 and the sev_smoke test to
exercise the SEV-SNP VM launch workflow.
Primarily, it introduces the architectural defines, its support in the
SEV library and extends the tests to interact with the SEV-SNP ioctl()
wrappers.
Patch 1 - Do not advertize SNP on initialization failure
Patch 2 - SNP test for KVM_SEV_INIT2
Patch 3 - Add vmgexit helper
Patch 4 - Add SMT control interface helper
Patch 5 - Replace assert() with TEST_ASSERT_EQ()
Patch 6 - Introduce SEV+ VM type check
Patch 7 - SNP iotcl() plumbing for the SEV library
Patch 8 - Force set GUEST_MEMFD for SNP
Patch 9 - Cleanups of smoke test - Decouple policy from type
Patch 10 - SNP smoke test
The series is based on
git.kernel.org/pub/scm/virt/kvm/kvm.git next
v6..v7:
Based on comments from Sean -
* Replaced FW check with sev->snp_initialized
* Dropped the patch which removes SEV+ KVM advertizement if INIT fails.
This should be now be resolved by the combination of the patches [1,2]
from Ashish.
* Change vmgexit to an inline function
* Export SMT control parsing interface to kvm_util
Note: hyperv_cpuid KST only compile testeworkbench.editor.empty.hintd
* Replace assert() with TEST_ASSERT_EQ() within SEV library
* Define KVM_SEV_PAGE_TYPE_INVALID for SEV call of encrypt_region()
* Parameterize encrypt_region() to include privatize_region()
* Deduplication of sev test calls between SEV,SEV-ES and SNP
* Removed FW version tests for SNP
* Included testing of SNP_POLICY_DBG
* Dropped most tags from patches that have been changed or indirectly
affected
[1] https://lore.kernel.org/all/d6d08c6b-9602-4f3d-92c2-8db6d50a1b92@amd.com
[2] https://lore.kernel.org/all/f78ddb64087df27e7bcb1ae0ab53f55aa0804fab.173922…
v5..v6:
https://lore.kernel.org/kvm/ab433246-e97c-495b-ab67-b0cb1721fb99@amd.com/
* Rename is_sev_platform_init to sev_fw_initialized (Nikunj)
* Rename KVM CPU feature X86_FEATURE_SNP to X86_FEATURE_SEV_SNP (Nikunj)
* Collected Tags from Nikunj, Pankaj, Srikanth.
v4..v5:
https://lore.kernel.org/kvm/8e7d8172-879e-4a28-8438-343b1c386ec9@amd.com/
* Introduced a check to disable advertising support for SEV, SEV-ES
and SNP when platform initialization fails (Nikunj)
* Remove the redundant SNP check within is_sev_vm() (Nikunj)
* Cleanup of the encrypt_region flow for better readability (Nikunj)
* Refactor paths to use the canonical $(ARCH) to rebase for kvm/next
v3..v4:
https://lore.kernel.org/kvm/20241114234104.128532-1-pratikrajesh.sampat@amd…
* Remove SNP FW API version check in the test and ensure the KVM
capability advertizes the presence of the feature. Retain the minimum
version definitions to exercise these API versions in the smoke test
* Retained only the SNP smoke test and SNP_INIT2 test
* The SNP architectural defined merged with SNP_INIT2 test patch
* SNP shutdown merged with SNP smoke test patch
* Add SEV VM type check to abstract comparisons and reduce clutter
* Define a SNP default policy which sets bits based on the presence of
SMT
* Decouple privatization and encryption for it to be SNP agnostic
* Assert for only positive tests using vm_ioctl()
* Dropped tested-by tags
In summary - based on comments from Sean, I have primarily reduced the
scope of this patch series to focus on breaking down the SNP smoke test
patch (v3 - patch2) to first introduce SEV-SNP support and use this
interface to extend the sev_init2 and the sev_smoke test.
The rest of the v3 patchset that introduces ioctl, pre fault, fallocate
and negative tests, will be re-worked and re-introduced subsequently in
future patch series post addressing the issues discussed.
v2..v3:
https://lore.kernel.org/kvm/20240905124107.6954-1-pratikrajesh.sampat@amd.c…
* Remove the assignments for the prefault and fallocate test type
enums.
* Fix error message for sev launch measure and finish.
* Collect tested-by tags [Peter, Srikanth]](<This patch series extends the sev_init2 and the sev_smoke test to
exercise the SEV-SNP VM launch workflow.
Primarily, it introduces the architectural defines, its support in the SEV
library and extends the tests to interact with the SEV-SNP ioctl()
wrappers.
Patch 1 - Do not advertize SNP on initialization failure
Patch 2 - SNP test for KVM_SEV_INIT2
Patch 3 - Add vmgexit helper
Patch 4 - Helper for SMT control interface
Patch 5 - Replace assert() with TEST_ASSERT_EQ()
Patch 6 - Introduce SEV+ VM type check
Patch 7 - SNP iotcl() plumbing for the SEV library
Patch 8 - Force set GUEST_MEMFD for SNP
Patch 9 - Cleanups of smoke test - Decouple policy from type
Patch 10 - SNP smoke test
The series is based on
git.kernel.org/pub/scm/virt/kvm/kvm.git next
v6..v7
Based on comments from Sean -
* Replaced FW check with sev-%3Esnp_initialized
* Dropped the patch which removes SEV+ KVM advertizement if INIT fails
This should be resolved by the combination of [1][2] from Ashish:
* Change vmgexit to an inline function
* Export SMT control parsing interface to kvm_util
* Replace assert() with TEST_ASSERT_EQ() within SEV library
* Define KVM_SEV_PAGE_TYPE_INVALID for SEV to use it with
encrypt_region()
* Parameterize encrypt_region() to include privatize_region()
functionality
* Deduplication of sev test calls between SEV,SEV-ES and SNP
* Removed FW version tests for SNP
* Included testing of SNP_POLICY_DBG
* Dropped most tags from patches that have directly / indirectly
changed.
[1] https://lore.kernel.org/all/d6d08c6b-9602-4f3d-92c2-8db6d50a1b92@amd.com
[2] https://lore.kernel.org/all/f78ddb64087df27e7bcb1ae0ab53f55aa0804fab.173922…
v5..v6
https://lore.kernel.org/kvm/ab433246-e97c-495b-ab67-b0cb1721fb99@amd.com/
* Rename is_sev_platform_init to sev_fw_initialized (Nikunj)
* Rename KVM CPU feature X86_FEATURE_SNP to X86_FEATURE_SEV_SNP (Nikunj)
* Collected Tags from Nikunj, Pankaj, Srikanth.
v4..v5:
https://lore.kernel.org/kvm/8e7d8172-879e-4a28-8438-343b1c386ec9@amd.com/
* Introduced a check to disable advertising support for SEV, SEV-ES
and SNP when platform initialization fails (Nikunj)
* Remove the redundant SNP check within is_sev_vm() (Nikunj)
* Cleanup of the encrypt_region flow for better readability (Nikunj)
* Refactor paths to use the canonical $(ARCH) to rebase for kvm/next
v3..v4:
https://lore.kernel.org/kvm/20241114234104.128532-1-pratikrajesh.sampat@amd…
* Remove SNP FW API version check in the test and ensure the KVM
capability advertizes the presence of the feature. Retain the minimum
version definitions to exercise these API versions in the smoke test
* Retained only the SNP smoke test and SNP_INIT2 test
* The SNP architectural defined merged with SNP_INIT2 test patch
* SNP shutdown merged with SNP smoke test patch
* Add SEV VM type check to abstract comparisons and reduce clutter
* Define a SNP default policy which sets bits based on the presence of
SMT
* Decouple privatization and encryption for it to be SNP agnostic
* Assert for only positive tests using vm_ioctl()
* Dropped tested-by tags
In summary - based on comments from Sean, I have primarily reduced the
scope of this patch series to focus on breaking down the SNP smoke test
patch (v3 - patch2) to first introduce SEV-SNP support and use this
interface to extend the sev_init2 and the sev_smoke test.
The rest of the v3 patchset that introduces ioctl, pre fault, fallocate
and negative tests, will be re-worked and re-introduced subsequently in
future patch series post addressing the issues discussed.
v2..v3:
https://lore.kernel.org/kvm/20240905124107.6954-1-pratikrajesh.sampat@amd.c…
* Remove the assignments for the prefault and fallocate test type
enums.
* Fix error message for sev launch measure and finish.
* Collect tested-by tags [Peter, Srikanth]
Pratik R. Sampat (10):
KVM: SEV: Disable SEV-SNP support on initialization failure
KVM: selftests: SEV-SNP test for KVM_SEV_INIT2
KVM: selftests: Add vmgexit helper
KVM: selftests: Add SMT control state helper
KVM: selftests: Replace assert() with TEST_ASSERT_EQ()
KVM: selftests: Introduce SEV VM type check
KVM: selftests: Add library support for interacting with SNP
KVM: selftests: Force GUEST_MEMFD flag for SNP VM type
KVM: selftests: Abstractions for SEV to decouple policy from type
KVM: selftests: Add a basic SEV-SNP smoke test
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/kvm/svm/sev.c | 4 +-
drivers/crypto/ccp/sev-dev.c | 8 ++
include/linux/psp-sev.h | 3 +
tools/arch/x86/include/uapi/asm/kvm.h | 1 +
.../testing/selftests/kvm/include/kvm_util.h | 35 +++++++
.../selftests/kvm/include/x86/processor.h | 1 +
tools/testing/selftests/kvm/include/x86/sev.h | 42 ++++++++-
tools/testing/selftests/kvm/lib/kvm_util.c | 7 +-
.../testing/selftests/kvm/lib/x86/processor.c | 4 +-
tools/testing/selftests/kvm/lib/x86/sev.c | 93 +++++++++++++++++--
.../testing/selftests/kvm/x86/hyperv_cpuid.c | 19 ----
.../selftests/kvm/x86/sev_init2_tests.c | 13 +++
.../selftests/kvm/x86/sev_smoke_test.c | 75 +++++++++------
14 files changed, 246 insertions(+), 60 deletions(-)
--
2.43.0
As the vIOMMU infrastructure series part-3, this introduces a new vEVENTQ
object. The existing FAULT object provides a nice notification pathway to
the user space with a queue already, so let vEVENTQ reuse that.
Mimicing the HWPT structure, add a common EVENTQ structure to support its
derivatives: IOMMUFD_OBJ_FAULT (existing) and IOMMUFD_OBJ_VEVENTQ (new).
An IOMMUFD_CMD_VEVENTQ_ALLOC is introduced to allocate vEVENTQ object for
vIOMMUs. One vIOMMU can have multiple vEVENTQs in different types but can
not support multiple vEVENTQs in the same type.
The forwarding part is fairly simple but might need to replace a physical
device ID with a virtual device ID in a driver-level event data structure.
So, this also adds some helpers for drivers to use.
As usual, this series comes with the selftest coverage for this new ioctl
and with a real world use case in the ARM SMMUv3 driver.
This is on Github:
https://github.com/nicolinc/iommufd/commits/iommufd_veventq-v7
Paring QEMU branch for testing:
https://github.com/nicolinc/qemu/commits/wip/for_iommufd_veventq-v7
Changelog
v7
* Rebase on Jason's for-next tree for latest fault.c
* Add Reviewed-by
* Update commit logs
* Add __reserved field sanity
* Skip kfree() on the static header
* Replace "bool on_list" with list_is_last()
* Use u32 for flags in iommufd_vevent_header
* Drop casting in iommufd_viommu_get_vdev_id()
* Update the bounding logic to veventq->sequence
* Add missing cpu_to_le64() around STRTAB_STE_1_MEV
* Reuse veventq->common.lock to fence sequence and num_events
* Rename overflow to lost_events and log it in upon kmalloc failure
* Correct the error handling part in iommufd_veventq_deliver_fetch()
* Add an arm_smmu_clear_vmaster() to simplify identity/blocked domain
attach ops
* Add additional four event records to forward to user space VM, and
update the uAPI doc
* Reuse the existing smmu->streams_mutex lock to fence master->vmaster
pointer, instead of adding a new rwsem
v6
https://lore.kernel.org/all/cover.1737754129.git.nicolinc@nvidia.com/
* Drop supports_veventq viommu op
* Split bug/cosmetics fixes out of the series
* Drop the blocking mutex around copy_to_user()
* Add veventq_depth in uAPI to limit vEVENTQ size
* Revise the documentation for a clear description
* Fix sparse warnings in arm_vmaster_report_event()
* Rework iommufd_viommu_get_vdev_id() to return -ENOENT v.s. 0
* Allow Abort/Bypass STEs to allocate vEVENTQ and set STE.MEV for DoS
mitigations
v5
https://lore.kernel.org/all/cover.1736237481.git.nicolinc@nvidia.com/
* Add Reviewed-by from Baolu
* Reorder the OBJ list as well
* Fix alphabetical order after renaming in v4
* Add supports_veventq viommu op for vEVENTQ type validation
v4
https://lore.kernel.org/all/cover.1735933254.git.nicolinc@nvidia.com/
* Rename "vIRQ" to "vEVENTQ"
* Use flexible array in struct iommufd_vevent
* Add the new ioctl command to union ucmd_buffer
* Fix the alphabetical order in union ucmd_buffer too
* Rename _TYPE_NONE to _TYPE_DEFAULT aligning with vIOMMU naming
v3
https://lore.kernel.org/all/cover.1734477608.git.nicolinc@nvidia.com/
* Rebase on Will's for-joerg/arm-smmu/updates for arm_smmu_event series
* Add "Reviewed-by" lines from Kevin
* Fix typos in comments, kdocs, and jump tags
* Add a patch to sort struct iommufd_ioctl_op
* Update iommufd's userpsace-api documentation
* Update uAPI kdoc to quote SMMUv3 offical spec
* Drop the unused workqueue in struct iommufd_virq
* Drop might_sleep() in iommufd_viommu_report_irq() helper
* Add missing "break" in iommufd_viommu_get_vdev_id() helper
* Shrink the scope of the vmaster's read lock in SMMUv3 driver
* Pass in two arguments to iommufd_eventq_virq_handler() helper
* Move "!ops || !ops->read" validation into iommufd_eventq_init()
* Move "fault->ictx = ictx" closer to iommufd_ctx_get(fault->ictx)
* Update commit message for arm_smmu_attach_prepare/commit_vmaster()
* Keep "iommufd_fault" as-is and rename "iommufd_eventq_virq" to just
"iommufd_virq"
v2
https://lore.kernel.org/all/cover.1733263737.git.nicolinc@nvidia.com/
* Rebase on v6.13-rc1
* Add IOPF and vIRQ in iommufd.rst (userspace-api)
* Add a proper locking in iommufd_event_virq_destroy
* Add iommufd_event_virq_abort with a lockdep_assert_held
* Rename "EVENT_*" to "EVENTQ_*" to describe the objects better
* Reorganize flows in iommufd_eventq_virq_alloc for abort() to work
* Adde struct arm_smmu_vmaster to store vSID upon attaching to a nested
domain, calling a newly added iommufd_viommu_get_vdev_id helper
* Adde an arm_vmaster_report_event helper in arm-smmu-v3-iommufd file
to simplify the routine in arm_smmu_handle_evt() of the main driver
v1
https://lore.kernel.org/all/cover.1724777091.git.nicolinc@nvidia.com/
Thanks!
Nicolin
Nicolin Chen (14):
iommufd/fault: Move two fault functions out of the header
iommufd/fault: Add an iommufd_fault_init() helper
iommufd: Abstract an iommufd_eventq from iommufd_fault
iommufd: Rename fault.c to eventq.c
iommufd: Add IOMMUFD_OBJ_VEVENTQ and IOMMUFD_CMD_VEVENTQ_ALLOC
iommufd/viommu: Add iommufd_viommu_get_vdev_id helper
iommufd/viommu: Add iommufd_viommu_report_event helper
iommufd/selftest: Require vdev_id when attaching to a nested domain
iommufd/selftest: Add IOMMU_TEST_OP_TRIGGER_VEVENT for vEVENTQ
coverage
iommufd/selftest: Add IOMMU_VEVENTQ_ALLOC test coverage
Documentation: userspace-api: iommufd: Update FAULT and VEVENTQ
iommu/arm-smmu-v3: Introduce struct arm_smmu_vmaster
iommu/arm-smmu-v3: Report events that belong to devices attached to
vIOMMU
iommu/arm-smmu-v3: Set MEV bit in nested STE for DoS mitigations
drivers/iommu/iommufd/Makefile | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 35 +
drivers/iommu/iommufd/iommufd_private.h | 135 +++-
drivers/iommu/iommufd/iommufd_test.h | 10 +
include/linux/iommufd.h | 23 +
include/uapi/linux/iommufd.h | 105 +++
tools/testing/selftests/iommu/iommufd_utils.h | 115 ++++
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 72 +++
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 79 ++-
drivers/iommu/iommufd/driver.c | 72 +++
drivers/iommu/iommufd/eventq.c | 597 ++++++++++++++++++
drivers/iommu/iommufd/fault.c | 342 ----------
drivers/iommu/iommufd/hw_pagetable.c | 6 +-
drivers/iommu/iommufd/main.c | 7 +
drivers/iommu/iommufd/selftest.c | 54 ++
drivers/iommu/iommufd/viommu.c | 2 +
tools/testing/selftests/iommu/iommufd.c | 36 ++
.../selftests/iommu/iommufd_fail_nth.c | 7 +
Documentation/userspace-api/iommufd.rst | 17 +
19 files changed, 1308 insertions(+), 408 deletions(-)
create mode 100644 drivers/iommu/iommufd/eventq.c
delete mode 100644 drivers/iommu/iommufd/fault.c
base-commit: 598749522d4254afb33b8a6c1bea614a95896868
--
2.43.0