Problem
=======
When host APEI is unable to claim a synchronous external abort (SEA)
taken during a stage-2 guest abort, today KVM directly injects an async
SError into the VCPU and then resumes it. The injected SError usually
results in an unpleasant guest kernel panic.
One of the major situations that lead to a guest SEA is a VCPU consuming
a recoverable uncorrected memory error (UER), which is not uncommon at
all in modern datacenter servers with large amounts of physical memory.
Although an SError and the resulting guest panic are sufficient to stop
the propagation of corrupted memory, there is room to recover from a UER
in a more graceful manner.
Proposed Solution
=================
Alternatively, KVM can replay the SEA to the faulting VCPU via the
existing KVM_SET_VCPU_EVENTS API. If the memory poison consumption or
the fault that caused the SEA did not come from the guest kernel, the
blast radius can be limited to the consuming or faulting guest userspace
process, so the VM can keep running.
In addition, instead of handling this under the hood without involving
userspace, there are benefits to redirecting the SEA to the VMM:
- VM customers care about the disruptions caused by memory errors, and
the VMM usually has the responsibility to start the process of notifying
the customers of memory error events in their VMs. For example, some
cloud providers emit a critical log in their observability UI [1] and
provide a playbook for customers on how to mitigate disruptions to
their workloads.
- The VMM can protect against future memory error consumption by unmapping
the poisoned pages from the stage-2 page table with KVM userfault, or by
splitting the memslot that contains the poisoned guest pages [2].
- The VMM can keep track of SEA events in the VM. When the VMM decides the
state of the host or the VM is bad enough, e.g. the number of distinct
SEAs exceeds a threshold, it can restart the VM on another healthy host.
- Behavior parity with the x86 architecture. When a machine check
exception (MCE) is caused by a VCPU, the kernel or KVM signals userspace
with SIGBUS to let the VMM either recover from the MCE or terminate
itself along with the VM. The prior RFC proposed implementing SIGBUS on
arm64 as well, but Marc preferred a VCPU exit over a signal [3].
Implementation aside, however, returning the SEA to the VMM is on par
with returning the MCE to the VMM.
Once the SEA is redirected to the VMM, among other actions, the VMM is
encouraged to inject external aborts into the faulting VCPU, which is
already supported by KVM on arm64. We noticed that injecting an
instruction abort is not fully supported by KVM_SET_VCPU_EVENTS; this
patchset completes that support.
New UAPIs
=========
This patchset introduces the following userspace-visible changes to
empower the VMM to control what happens next for an SEA on guest memory:
- KVM_CAP_ARM_SEA_TO_USER. While taking an SEA, if userspace has enabled
this new capability at VM creation, and the SEA is not caused by
memory allocated for the stage-2 translation table, return
KVM_EXIT_ARM_SEA to userspace instead of injecting an SError.
- KVM_EXIT_ARM_SEA. This is the VM exit reason the VMM gets. Details
about the SEA are provided in arm_sea as much as possible, including
the sanitized ESR value at EL2, whether the guest virtual and physical
addresses (GVA and GPA) are available, and their values when they are.
- KVM_CAP_ARM_INJECT_EXT_IABT. Today the VMM can inject an external data
abort into a VCPU via the KVM_SET_VCPU_EVENTS API, but it cannot inject
an external instruction abort the same way. KVM_CAP_ARM_INJECT_EXT_IABT
is a natural extension of KVM_CAP_ARM_INJECT_EXT_DABT that tells the
VMM that KVM_SET_VCPU_EVENTS now supports external instruction aborts.
A rough usage sketch of these UAPIs follows below.
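As a rough sketch (not taken from the patches; the exact arm_sea exit
layout and the ext_iabt_pending field name are assumptions based on the
descriptions above), a VMM that has enabled KVM_CAP_ARM_SEA_TO_USER via
KVM_ENABLE_CAP on the VM fd might react to the new exit like this:

#include <stdbool.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hedged sketch: replay the external abort into the faulting VCPU. */
static void handle_arm_sea(int vcpu_fd, bool is_iabt)
{
        struct kvm_vcpu_events events = {};

        /* Notify the operator, unmap/split the poisoned pages, etc. */

        ioctl(vcpu_fd, KVM_GET_VCPU_EVENTS, &events);
        if (is_iabt)
                events.exception.ext_iabt_pending = 1; /* added by this series */
        else
                events.exception.ext_dabt_pending = 1; /* existing UAPI */
        ioctl(vcpu_fd, KVM_SET_VCPU_EVENTS, &events);
}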
* From v1 [4]:
- Rebased on commit 4d62121ce9b5 ("KVM: arm64: vgic-debug: Avoid
dereferencing NULL ITE pointer").
- Sanitize ESR_EL2 before reporting it to userspace.
- Do not do KVM_EXIT_ARM_SEA when SEA is caused by memory allocated to
stage-2 translation table.
[1] https://cloud.google.com/solutions/sap/docs/manage-host-errors
[2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthoughton@google.com
[3] https://lore.kernel.org/kvm/86pljbqqh0.wl-maz@kernel.org
[4] https://lore.kernel.org/kvm/20250505161412.1926643-1-jiaqiyan@google.com
Jiaqi Yan (5):
KVM: arm64: VM exit to userspace to handle SEA
KVM: arm64: Set FnV for VCPU when FAR_EL2 is invalid
KVM: selftests: Test for KVM_EXIT_ARM_SEA and KVM_CAP_ARM_SEA_TO_USER
KVM: selftests: Test for KVM_CAP_INJECT_EXT_IABT
Documentation: kvm: new uAPI for handling SEA
Raghavendra Rao Ananta (1):
KVM: arm64: Allow userspace to inject external instruction aborts
Documentation/virt/kvm/api.rst | 128 ++++++-
arch/arm64/include/asm/kvm_emulate.h | 67 ++++
arch/arm64/include/asm/kvm_host.h | 8 +
arch/arm64/include/asm/kvm_ras.h | 2 +-
arch/arm64/include/uapi/asm/kvm.h | 3 +-
arch/arm64/kvm/arm.c | 6 +
arch/arm64/kvm/guest.c | 13 +-
arch/arm64/kvm/inject_fault.c | 3 +
arch/arm64/kvm/mmu.c | 59 ++-
include/uapi/linux/kvm.h | 12 +
tools/arch/arm64/include/asm/esr.h | 2 +
tools/arch/arm64/include/uapi/asm/kvm.h | 3 +-
tools/testing/selftests/kvm/Makefile.kvm | 2 +
.../testing/selftests/kvm/arm64/inject_iabt.c | 98 +++++
.../testing/selftests/kvm/arm64/sea_to_user.c | 340 ++++++++++++++++++
tools/testing/selftests/kvm/lib/kvm_util.c | 1 +
16 files changed, 718 insertions(+), 29 deletions(-)
create mode 100644 tools/testing/selftests/kvm/arm64/inject_iabt.c
create mode 100644 tools/testing/selftests/kvm/arm64/sea_to_user.c
--
2.49.0.1266.g31b7d2e469-goog
Some failure modes are handled poorly by kublk. For example, if ublk_drv
is built as a module but not currently loaded into the kernel, ./kublk
add ... just hangs forever. This happens because, in this case (and a few
others), the worker process does not notify its parent (via a write to
the shared eventfd) that it has tried and failed to initialize, so the
parent waits forever. Fix this by ensuring that we always notify the
parent process of any initialization failure, and have the parent print
a (not very descriptive) log line when this happens.
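For context, the general shape of the handshake is sketched below (the
names are illustrative, not kublk's actual helpers; kublk goes through
ublk_send_dev_event() as seen in the diff):

#include <stdint.h>
#include <unistd.h>

extern int do_device_init(void); /* hypothetical init step that may fail */

/*
 * Illustrative sketch: the child must write to the shared eventfd on
 * every exit path out of initialization, otherwise the parent's read()
 * below blocks forever -- which is exactly the hang being fixed here.
 */
static int child_init(int efd)
{
        uint64_t val = 1;
        int ret = do_device_init();

        write(efd, &val, sizeof(val));  /* always wake the parent */
        return ret;
}

static void parent_wait(int efd)
{
        uint64_t val;

        read(efd, &val, sizeof(val));   /* hangs if the child never notifies */
}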
Signed-off-by: Uday Shankar <ushankar(a)purestorage.com>
---
tools/testing/selftests/ublk/kublk.c | 34 +++++++++++++++++++++++-----------
1 file changed, 23 insertions(+), 11 deletions(-)
diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c
index a98e14e4c245965d817b93843ff9a4011291223b..e2d2042810d4bb472e48a0ed91317d2bdf6e2f2a 100644
--- a/tools/testing/selftests/ublk/kublk.c
+++ b/tools/testing/selftests/ublk/kublk.c
@@ -1112,7 +1112,7 @@ static int __cmd_dev_add(const struct dev_ctx *ctx)
__u64 features;
const struct ublk_tgt_ops *ops;
struct ublksrv_ctrl_dev_info *info;
- struct ublk_dev *dev;
+ struct ublk_dev *dev = NULL;
int dev_id = ctx->dev_id;
int ret, i;
@@ -1120,13 +1120,15 @@ static int __cmd_dev_add(const struct dev_ctx *ctx)
if (!ops) {
ublk_err("%s: no such tgt type, type %s\n",
__func__, tgt_type);
- return -ENODEV;
+ ret = -ENODEV;
+ goto fail;
}
if (nr_queues > UBLK_MAX_QUEUES || depth > UBLK_QUEUE_DEPTH) {
ublk_err("%s: invalid nr_queues or depth queues %u depth %u\n",
__func__, nr_queues, depth);
- return -EINVAL;
+ ret = -EINVAL;
+ goto fail;
}
/* default to 1:1 threads:queues if nthreads is unspecified */
@@ -1136,30 +1138,37 @@ static int __cmd_dev_add(const struct dev_ctx *ctx)
if (nthreads > UBLK_MAX_THREADS) {
ublk_err("%s: %u is too many threads (max %u)\n",
__func__, nthreads, UBLK_MAX_THREADS);
- return -EINVAL;
+ ret = -EINVAL;
+ goto fail;
}
if (nthreads != nr_queues && !ctx->per_io_tasks) {
ublk_err("%s: threads %u must be same as queues %u if "
"not using per_io_tasks\n",
__func__, nthreads, nr_queues);
- return -EINVAL;
+ ret = -EINVAL;
+ goto fail;
}
dev = ublk_ctrl_init();
if (!dev) {
ublk_err("%s: can't alloc dev id %d, type %s\n",
__func__, dev_id, tgt_type);
- return -ENOMEM;
+ ret = -ENOMEM;
+ goto fail;
}
/* kernel doesn't support get_features */
ret = ublk_ctrl_get_features(dev, &features);
- if (ret < 0)
- return -EINVAL;
+ if (ret < 0) {
+ ret = -EINVAL;
+ goto fail;
+ }
- if (!(features & UBLK_F_CMD_IOCTL_ENCODE))
- return -ENOTSUP;
+ if (!(features & UBLK_F_CMD_IOCTL_ENCODE)) {
+ ret = -ENOTSUP;
+ goto fail;
+ }
info = &dev->dev_info;
info->dev_id = ctx->dev_id;
@@ -1200,7 +1209,8 @@ static int __cmd_dev_add(const struct dev_ctx *ctx)
fail:
if (ret < 0)
ublk_send_dev_event(ctx, dev, -1);
- ublk_ctrl_deinit(dev);
+ if (dev)
+ ublk_ctrl_deinit(dev);
return ret;
}
@@ -1262,6 +1272,8 @@ static int cmd_dev_add(struct dev_ctx *ctx)
shmctl(ctx->_shmid, IPC_RMID, NULL);
/* wait for child and detach from it */
wait(NULL);
+ if (exit_code == EXIT_FAILURE)
+ ublk_err("%s: command failed\n", __func__);
exit(exit_code);
} else {
exit(EXIT_FAILURE);
---
base-commit: c09a8b00f850d3ca0af998bff1fac4a3f6d11768
change-id: 20250603-ublk_init_fail-b498905159eb
Best regards,
--
Uday Shankar <ushankar(a)purestorage.com>
I checked it using checkpatch.pl and it shows that the patch has no
warnings or errors and is ready to be sent.
v2:
- fixed multiple trailing whitespace errors
- fixed the Signed-off-by mismatch
The test file for the IR decoder used single-line comments
at the top to document its purpose and licensing,
which is inconsistent with the style used throughout the
Linux kernel.
In this patch I converted the file header to a proper multi-line
comment block (/* ... */) that aligns with standard kernel practices.
This improves readability and consistency across selftests, and ensures
the license and documentation are clearly visible in a familiar format.
No functional changes have been made.
Signed-off-by: Abdelrahman Fekry <abdelrahmanfekry375(a)gmail.com>
---
tools/testing/selftests/ir/ir_loopback.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/tools/testing/selftests/ir/ir_loopback.c b/tools/testing/selftests/ir/ir_loopback.c
index f4a15cbdd5ea..c94faa975630 100644
--- a/tools/testing/selftests/ir/ir_loopback.c
+++ b/tools/testing/selftests/ir/ir_loopback.c
@@ -1,14 +1,17 @@
// SPDX-License-Identifier: GPL-2.0
-// test ir decoder
-//
-// Copyright (C) 2018 Sean Young <sean(a)mess.org>
-
-// When sending LIRC_MODE_SCANCODE, the IR will be encoded. rc-loopback
-// will send this IR to the receiver side, where we try to read the decoded
-// IR. Decoding happens in a separate kernel thread, so we will need to
-// wait until that is scheduled, hence we use poll to check for read
-// readiness.
-
+/* Copyright (C) 2018 Sean Young <sean(a)mess.org>
+ *
+ * Selftest for IR decoder
+ *
+ *
+ * When sending LIRC_MODE_SCANCODE, the IR will be encoded.
+ * rc-loopback will send this IR to the receiver side,
+ * where we try to read the decoded IR.
+ * Decoding happens in a separate kernel thread,
+ * so we will need to wait until that is scheduled,
+ * hence we use poll to check for read
+ * readiness.
+ */
#include <linux/lirc.h>
#include <errno.h>
#include <stdio.h>
--
2.25.1
This improves the expressiveness of unprivileged BPF by inserting
speculation barriers instead of rejecting the programs.
The approach was previously presented at LPC'24 [1] and RAID'24 [2].
To mitigate the Spectre v1 (PHT) vulnerability, the kernel rejects
potentially-dangerous unprivileged BPF programs as of
commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted
branches"). In [2], we have analyzed 364 object files from open source
projects (Linux Samples and Selftests, BCC, Loxilb, Cilium, libbpf
Examples, Parca, and Prevail) and found that this affects 31% to 54% of
programs.
To resolve this in the majority of cases, this patchset adds a fall-back
for mitigating Spectre v1 using speculation barriers. The kernel still
optimistically attempts to verify all speculative paths but uses
speculation barriers against v1 when unsafe behavior is detected. This
allows for more programs to be accepted without disabling the BPF
Spectre mitigations (e.g., by setting cpu_mitigations_off()).
For this, it relies on the fact that speculation barriers generally
prevent all later instructions from executing if the speculation was not
correct (not only loads). See patch 7 ("bpf: Fall back to nospec for
Spectre v1") for a detailed description and references to the relevant
vendor documentation (AMD and Intel x86-64, ARM64, and PowerPC).
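As a purely hypothetical illustration (not one of the selftests added by
this series), the program below shows the kind of bounds-check-then-access
pattern that unprivileged users could previously see rejected by the
speculative-path verification; with this series the verifier can instead
accept it and insert a speculation barrier on the mispredictable path:

// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __uint(max_entries, 1);
        __type(key, __u32);
        __type(value, __u64);
} data SEC(".maps");

SEC("socket")
int spectre_v1_gadget(struct __sk_buff *skb)
{
        __u32 zero = 0;
        __u64 *val = bpf_map_lookup_elem(&data, &zero);
        __u64 idx;

        if (!val)
                return 0;
        idx = *val;     /* attacker-influenced index */
        /*
         * Architecturally in bounds, but a mispredicted branch lets the
         * load below run speculatively with an out-of-bounds idx. The
         * verifier can now emit a speculation barrier here instead of
         * rejecting the whole program for unprivileged users.
         */
        if (idx < sizeof(*val))
                return *((char *)val + idx);
        return 0;
}

char _license[] SEC("license") = "GPL";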
In [1] we have measured the overhead of this approach relative to having
mitigations off and including the upstream Spectre v4 mitigations. For
event tracing and stack-sampling profilers, we found that mitigations
increase BPF program execution time by 0% to 62%. For the Loxilb network
load balancer, we have measured a 14% slowdown in SCTP performance but
no significant slowdown for TCP. This overhead only applies to programs
that were previously rejected.
I reran the expressiveness-evaluation with v6.14 and made sure the main
results still match those from [1] and [2] (which used v6.5).
Main design decisions are:
* Do not use separate bytecode insns for v1 and v4 barriers (inspired by
Daniel Borkmann's question at LPC). This simplifies the verifier
significantly and has the only downside that performance on PowerPC is
not as high as it could be.
* Allow archs to still disable v1/v4 mitigations separately by setting
bpf_jit_bypass_spec_v1/v4(). This has the benefit that archs can
benefit from improved BPF expressiveness / performance if they are not
vulnerable (e.g., ARM64 for v4 in the kernel).
* Do not remove the empty BPF_NOSPEC implementation for backends for
which it is unknown whether they are vulnerable to Spectre v1.
[1] https://lpc.events/event/18/contributions/1954/ ("Mitigating
Spectre-PHT using Speculation Barriers in Linux eBPF")
[2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and
Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
Changes:
* v3 -> v4:
- Remove insn parameter from do_check_insn() and extract
process_bpf_exit_full as a function as requested by Eduard
- Investigate apparent sanitize_check_bounds() bug reported by
Kartikeya (does appear to not be a bug but only confusing code),
sent separate patch to document it and add an assert
- Remove already-merged commit 1 ("selftests/bpf: Fix caps for
__xlated/jited_unpriv")
- Drop former commit 10 ("bpf: Allow nospec-protected var-offset stack
access") as it did not include a test and there are other places
where var-off is rejected. Also, none of the tested real-world
programs used var-off in the paper. Therefore keep the old behavior
for now and potentially prepare a patch that converts all cases
later if required.
- Add link to AMD lfence and PowerPC speculation barrier (ori 31,31,0)
documentation
- Move detailed barrier documentation to commit 7 ("bpf: Fall back to
nospec for Spectre v1")
- Link to v3: https://lore.kernel.org/all/20250501073603.1402960-1-luis.gerhorst@fau.de/
* v2 -> v3:
- Fix
https://lore.kernel.org/oe-kbuild-all/202504212030.IF1SLhz6-lkp@intel.com/
and similar by moving the bpf_jit_bypass_spec_v1/v4() prototypes out
of the #ifdef CONFIG_BPF_SYSCALL. Decided not to move them to
filter.h (where similar bpf_jit_*() prototypes live) as they would
still have to be duplicated in bpf.h to be usable to
bpf_bypass_spec_v1/v4() (unless including filter.h in bpf.h is an
option).
- Fix
https://lore.kernel.org/oe-kbuild-all/202504220035.SoGveGpj-lkp@intel.com/
by moving the variable declarations out of the switch-case.
- Build touched C files with W=2 and bpf config on x86 to check that
there are no other warnings introduced.
- Found 3 more checkpatch warnings that can be fixed without degrading
readability.
- Rebase to bpf-next 2025-05-01
- Link to v2: https://lore.kernel.org/bpf/20250421091802.3234859-1-luis.gerhorst@fau.de/
* v1 -> v2:
- Drop former commits 9 ("bpf: Return PTR_ERR from push_stack()") and 11
("bpf: Fall back to nospec for spec path verification") as suggested
by Alexei. This series therefore no longer changes push_stack() to
return PTR_ERR.
- Add detailed explanation of how lfence works internally and how it
affects the algorithm.
- Add tests checking that nospec instructions are inserted in expected
locations using __xlated_unpriv as suggested by Eduard (also,
include a fix for __xlated_unpriv)
- Add a test for the mitigations from the description of
commit 9183671af6db ("bpf: Fix leakage under speculation on
mispredicted branches")
- Remove unused variables from do_check[_insn]() as suggested by
Eduard.
- Remove INSN_IDX_MODIFIED to improve readability as suggested by
Eduard. This also causes the nospec_result-check to run (and fail)
for jumping-ops. Add a warning to assert that this check must never
succeed in that case.
- Add details on the safety of patch 10 ("bpf: Allow nospec-protected
var-offset stack access") based on the feedback on v1.
- Rebase to bpf-next-250420
- Link to v1: https://lore.kernel.org/all/20250313172127.1098195-1-luis.gerhorst@fau.de/
* RFC -> v1:
- rebase to bpf-next-250313
- tests: mark expected successes/new errors
- add bpf_jit_bypass_spec_v1/v4() to avoid #ifdef in
bpf_bypass_spec_v1/v4()
- ensure that nospec with v1-support is implemented for archs for
which GCC supports speculation barriers, except for MIPS
- arm64: emit speculation barrier
- powerpc: change nospec to include v1 barrier
- discuss potential security (archs that do not impl. BPF nospec) and
performance (only PowerPC) regressions
- Link to RFC: https://lore.kernel.org/bpf/20250224203619.594724-1-luis.gerhorst@fau.de/
Luis Gerhorst (9):
bpf: Move insn if/else into do_check_insn()
bpf: Return -EFAULT on misconfigurations
bpf: Return -EFAULT on internal errors
bpf, arm64, powerpc: Add bpf_jit_bypass_spec_v1/v4()
bpf, arm64, powerpc: Change nospec to include v1 barrier
bpf: Rename sanitize_stack_spill to nospec_result
bpf: Fall back to nospec for Spectre v1
selftests/bpf: Add test for Spectre v1 mitigation
bpf: Fall back to nospec for sanitization-failures
arch/arm64/net/bpf_jit.h | 5 +
arch/arm64/net/bpf_jit_comp.c | 28 +-
arch/powerpc/net/bpf_jit_comp64.c | 80 ++-
include/linux/bpf.h | 11 +-
include/linux/bpf_verifier.h | 3 +-
include/linux/filter.h | 2 +-
kernel/bpf/core.c | 32 +-
kernel/bpf/verifier.c | 633 ++++++++++--------
tools/testing/selftests/bpf/progs/bpf_misc.h | 4 +
.../selftests/bpf/progs/verifier_and.c | 8 +-
.../selftests/bpf/progs/verifier_bounds.c | 66 +-
.../bpf/progs/verifier_bounds_deduction.c | 45 +-
.../selftests/bpf/progs/verifier_map_ptr.c | 20 +-
.../selftests/bpf/progs/verifier_movsx.c | 16 +-
.../selftests/bpf/progs/verifier_unpriv.c | 65 +-
.../bpf/progs/verifier_value_ptr_arith.c | 101 ++-
.../selftests/bpf/verifier/dead_code.c | 3 +-
tools/testing/selftests/bpf/verifier/jmp32.c | 33 +-
tools/testing/selftests/bpf/verifier/jset.c | 10 +-
19 files changed, 755 insertions(+), 410 deletions(-)
base-commit: cd2e103d57e5615f9bb027d772f93b9efd567224
--
2.49.0
This improves the expressiveness of unprivileged BPF by inserting
speculation barriers instead of rejecting the programs.
The approach was previously presented at LPC'24 [1] and RAID'24 [2].
To mitigate the Spectre v1 (PHT) vulnerability, the kernel rejects
potentially-dangerous unprivileged BPF programs as of
commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted
branches"). In [2], we have analyzed 364 object files from open source
projects (Linux Samples and Selftests, BCC, Loxilb, Cilium, libbpf
Examples, Parca, and Prevail) and found that this affects 31% to 54% of
programs.
To resolve this in the majority of cases, this patchset adds a fall-back
for mitigating Spectre v1 using speculation barriers. The kernel still
optimistically attempts to verify all speculative paths but uses
speculation barriers against v1 when unsafe behavior is detected. This
allows for more programs to be accepted without disabling the BPF
Spectre mitigations (e.g., by setting cpu_mitigations_off()).
For this, it relies on the fact that speculation barriers prevent all
later instructions from executing if the speculation was not correct (a
rough C illustration follows after the list below):
* On x86_64, lfence acts as full speculation barrier, not only as a
load fence [3]:
An LFENCE instruction or a serializing instruction will ensure that
no later instructions execute, even speculatively, until all prior
instructions complete locally. [...] Inserting an LFENCE instruction
after a bounds check prevents later operations from executing before
the bound check completes.
This was experimentally confirmed in [4].
* ARM's SB speculation barrier instruction also affects "any instruction
that appears later in the program order than the barrier" [5].
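As a rough illustration of the mechanism (plain user-space C with inline
assembly, not the BPF JIT's actual code generation):

#include <stddef.h>
#include <stdint.h>

static uint8_t array[16];

uint8_t read_checked(size_t idx)
{
        uint8_t v = 0;

        if (idx < sizeof(array)) {
                /*
                 * Without a barrier, a mispredicted branch lets the load
                 * below execute speculatively with an out-of-bounds idx.
                 * The barrier keeps later instructions from executing,
                 * even speculatively, until the check has resolved; on
                 * arm64 the equivalent would be the SB instruction [5].
                 */
#if defined(__x86_64__)
                __asm__ __volatile__("lfence" ::: "memory");
#endif
                v = array[idx];
        }
        return v;
}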
In [1] we have measured the overhead of this approach relative to having
mitigations off and including the upstream Spectre v4 mitigations. For
event tracing and stack-sampling profilers, we found that mitigations
increase BPF program execution time by 0% to 62%. For the Loxilb network
load balancer, we have measured a 14% slowdown in SCTP performance but
no significant slowdown for TCP. This overhead only applies to programs
that were previously rejected.
I reran the expressiveness-evaluation with v6.14 and made sure the main
results still match those from [1] and [2] (which used v6.5).
Main design decisions are:
* Do not use separate bytecode insns for v1 and v4 barriers (inspired by
Daniel Borkmann's question at LPC). This simplifies the verifier
significantly and has the only downside that performance on PowerPC is
not as high as it could be.
* Allow archs to still disable v1/v4 mitigations separately by setting
bpf_jit_bypass_spec_v1/v4(). This has the benefit that archs can
benefit from improved BPF expressiveness / performance if they are not
vulnerable (e.g., ARM64 for v4 in the kernel).
* Do not remove the empty BPF_NOSPEC implementation for backends for
which it is unknown whether they are vulnerable to Spectre v1.
[1] https://lpc.events/event/18/contributions/1954/ ("Mitigating
Spectre-PHT using Speculation Barriers in Linux eBPF")
[2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and
Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
[3] https://www.intel.com/content/www/us/en/developer/articles/technical/softwa…
("Managed Runtime Speculative Execution Side Channel Mitigations")
[4] https://dl.acm.org/doi/pdf/10.1145/3359789.3359837 ("Speculator: a
tool to analyze speculative execution attacks and mitigations" -
Section 4.6 "Stopping Speculative Execution")
[5] https://developer.arm.com/documentation/ddi0597/2020-12/Base-Instructions/S…
("SB - Speculation Barrier - Arm Armv8-A A32/T32 Instruction Set Architecture (2020-12)")
Changes:
* v2 -> v3:
- Fix
https://lore.kernel.org/oe-kbuild-all/202504212030.IF1SLhz6-lkp@intel.com/
and similar by moving the bpf_jit_bypass_spec_v1/v4() prototypes out
of the #ifdef CONFIG_BPF_SYSCALL. Decided not to move them to
filter.h (where similar bpf_jit_*() prototypes live) as they would
still have to be duplicated in bpf.h to be usable to
bpf_bypass_spec_v1/v4() (unless including filter.h in bpf.h is an
option).
- Fix
https://lore.kernel.org/oe-kbuild-all/202504220035.SoGveGpj-lkp@intel.com/
by moving the variable declarations out of the switch-case.
- Build touched C files with W=2 and bpf config on x86 to check that
there are no other warnings introduced.
- Found 3 more checkpatch warnings that can be fixed without degrading
readability.
- Rebase to bpf-next 2025-05-01
- Link to v2: https://lore.kernel.org/bpf/20250421091802.3234859-1-luis.gerhorst@fau.de/
* v1 -> v2:
- Drop former commits 9 ("bpf: Return PTR_ERR from push_stack()") and 11
("bpf: Fall back to nospec for spec path verification") as suggested
by Alexei. This series therefore no longer changes push_stack() to
return PTR_ERR.
- Add detailed explanation of how lfence works internally and how it
affects the algorithm.
- Add tests checking that nospec instructions are inserted in expected
locations using __xlated_unpriv as suggested by Eduard (also,
include a fix for __xlated_unpriv)
- Add a test for the mitigations from the description of
commit 9183671af6db ("bpf: Fix leakage under speculation on
mispredicted branches")
- Remove unused variables from do_check[_insn]() as suggested by
Eduard.
- Remove INSN_IDX_MODIFIED to improve readability as suggested by
Eduard. This also causes the nospec_result-check to run (and fail)
for jumping-ops. Add a warning to assert that this check must never
succeed in that case.
- Add details on the safety of patch 10 ("bpf: Allow nospec-protected
var-offset stack access") based on the feedback on v1.
- Rebase to bpf-next-250420
- Link to v1: https://lore.kernel.org/all/20250313172127.1098195-1-luis.gerhorst@fau.de/
* RFC -> v1:
- rebase to bpf-next-250313
- tests: mark expected successes/new errors
- add bpf_jit_bypass_spec_v1/v4() to avoid #ifdef in
bpf_bypass_spec_v1/v4()
- ensure that nospec with v1-support is implemented for archs for
which GCC supports speculation barriers, except for MIPS
- arm64: emit speculation barrier
- powerpc: change nospec to include v1 barrier
- discuss potential security (archs that do not impl. BPF nospec) and
performance (only PowerPC) regressions
- Link to RFC: https://lore.kernel.org/bpf/20250224203619.594724-1-luis.gerhorst@fau.de/
Luis Gerhorst (11):
selftests/bpf: Fix caps for __xlated/jited_unpriv
bpf: Move insn if/else into do_check_insn()
bpf: Return -EFAULT on misconfigurations
bpf: Return -EFAULT on internal errors
bpf, arm64, powerpc: Add bpf_jit_bypass_spec_v1/v4()
bpf, arm64, powerpc: Change nospec to include v1 barrier
bpf: Rename sanitize_stack_spill to nospec_result
bpf: Fall back to nospec for Spectre v1
selftests/bpf: Add test for Spectre v1 mitigation
bpf: Allow nospec-protected var-offset stack access
bpf: Fall back to nospec for sanitization-failures
arch/arm64/net/bpf_jit.h | 5 +
arch/arm64/net/bpf_jit_comp.c | 28 +-
arch/powerpc/net/bpf_jit_comp64.c | 80 ++-
include/linux/bpf.h | 11 +-
include/linux/bpf_verifier.h | 3 +-
include/linux/filter.h | 2 +-
kernel/bpf/core.c | 32 +-
kernel/bpf/verifier.c | 653 ++++++++++--------
tools/testing/selftests/bpf/progs/bpf_misc.h | 4 +
.../selftests/bpf/progs/verifier_and.c | 8 +-
.../selftests/bpf/progs/verifier_bounds.c | 66 +-
.../bpf/progs/verifier_bounds_deduction.c | 45 +-
.../selftests/bpf/progs/verifier_map_ptr.c | 20 +-
.../selftests/bpf/progs/verifier_movsx.c | 16 +-
.../selftests/bpf/progs/verifier_unpriv.c | 65 +-
.../bpf/progs/verifier_value_ptr_arith.c | 101 ++-
tools/testing/selftests/bpf/test_loader.c | 14 +-
.../selftests/bpf/verifier/dead_code.c | 3 +-
tools/testing/selftests/bpf/verifier/jmp32.c | 33 +-
tools/testing/selftests/bpf/verifier/jset.c | 10 +-
20 files changed, 771 insertions(+), 428 deletions(-)
base-commit: 358b1c0f56ebb6996fcec7dcdcf6bae5dcbc8b6c
--
2.49.0
The BTF dumper code currently displays arrays of characters as just that -
arrays, with each character formatted individually. Sometimes this is what
makes sense, but it's nice to be able to treat that array as a string.
This change adds a special case to the btf_dump functionality to allow
arrays of single-byte integer values to be printed as character strings.
Characters for which isprint() returns false are printed as hex-escaped
values. This is enabled when the new ".emit_strings" field is set to 1
in the btf_dump_type_data_opts structure; a usage sketch follows after
the examples below.
As an example, here's what it looks like to dump the string "hello" using
a few different field values for btf_dump_type_data_opts (.compact = 1):
- .emit_strings = 0, .skip_names = 0: (char[6])['h','e','l','l','o',]
- .emit_strings = 0, .skip_names = 1: ['h','e','l','l','o',]
- .emit_strings = 1, .skip_names = 0: (char[6])"hello"
- .emit_strings = 1, .skip_names = 1: "hello"
Here's the string "h\xff", dumped with .compact = 1 and .skip_names = 1:
- .emit_strings = 0: ['h',-1,]
- .emit_strings = 1: "h\xff"
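For reference, a hedged sketch of how a caller might opt in (the BTF
handle and the type id of char[6] are placeholders the caller must
supply):

#include <stdarg.h>
#include <stdio.h>
#include <bpf/btf.h>

static void print_fn(void *ctx, const char *fmt, va_list args)
{
        vprintf(fmt, args);
}

/* btf and char_array_type_id (BTF id of char[6]) are placeholders. */
void dump_hello(struct btf *btf, __u32 char_array_type_id)
{
        LIBBPF_OPTS(btf_dump_type_data_opts, opts,
                .compact = true,
                .emit_strings = true    /* new: print char arrays as strings */
        );
        struct btf_dump *d = btf_dump__new(btf, print_fn, NULL, NULL);
        char data[6] = "hello";

        btf_dump__dump_type_data(d, char_array_type_id, data,
                                 sizeof(data), &opts);
        btf_dump__free(d);
}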
Signed-off-by: Blake Jones <blakejones(a)google.com>
---
tools/lib/bpf/btf.h | 3 ++-
tools/lib/bpf/btf_dump.c | 44 +++++++++++++++++++++++++++++++++++++++-
2 files changed, 45 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 4392451d634b..ccfd905f03df 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -326,9 +326,10 @@ struct btf_dump_type_data_opts {
bool compact; /* no newlines/indentation */
bool skip_names; /* skip member/type names */
bool emit_zeroes; /* show 0-valued fields */
+ bool emit_strings; /* print char arrays as strings */
size_t :0;
};
-#define btf_dump_type_data_opts__last_field emit_zeroes
+#define btf_dump_type_data_opts__last_field emit_strings
LIBBPF_API int
btf_dump__dump_type_data(struct btf_dump *d, __u32 id,
diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c
index 460c3e57fadb..336a6646e0fa 100644
--- a/tools/lib/bpf/btf_dump.c
+++ b/tools/lib/bpf/btf_dump.c
@@ -68,6 +68,7 @@ struct btf_dump_data {
bool compact;
bool skip_names;
bool emit_zeroes;
+ bool emit_strings;
__u8 indent_lvl; /* base indent level */
char indent_str[BTF_DATA_INDENT_STR_LEN];
/* below are used during iteration */
@@ -2028,6 +2029,43 @@ static int btf_dump_var_data(struct btf_dump *d,
return btf_dump_dump_type_data(d, NULL, t, type_id, data, 0, 0);
}
+static int btf_dump_string_data(struct btf_dump *d,
+ const struct btf_type *t,
+ __u32 id,
+ const void *data)
+{
+ const struct btf_array *array = btf_array(t);
+ __u32 i;
+
+ btf_dump_data_pfx(d);
+ btf_dump_printf(d, "\"");
+
+ for (i = 0; i < array->nelems; i++, data++) {
+ char c;
+
+ if (data >= d->typed_dump->data_end)
+ return -E2BIG;
+
+ c = *(char *)data;
+ if (c == '\0') {
+ /*
+ * When printing character arrays as strings, NUL bytes
+ * are always treated as string terminators; they are
+ * never printed.
+ */
+ break;
+ }
+ if (isprint(c))
+ btf_dump_printf(d, "%c", c);
+ else
+ btf_dump_printf(d, "\\x%02x", *(__u8 *)data);
+ }
+
+ btf_dump_printf(d, "\"");
+
+ return 0;
+}
+
static int btf_dump_array_data(struct btf_dump *d,
const struct btf_type *t,
__u32 id,
@@ -2055,8 +2093,11 @@ static int btf_dump_array_data(struct btf_dump *d,
* char arrays, so if size is 1 and element is
* printable as a char, we'll do that.
*/
- if (elem_size == 1)
+ if (elem_size == 1) {
+ if (d->typed_dump->emit_strings)
+ return btf_dump_string_data(d, t, id, data);
d->typed_dump->is_array_char = true;
+ }
}
/* note that we increment depth before calling btf_dump_print() below;
@@ -2544,6 +2585,7 @@ int btf_dump__dump_type_data(struct btf_dump *d, __u32 id,
d->typed_dump->compact = OPTS_GET(opts, compact, false);
d->typed_dump->skip_names = OPTS_GET(opts, skip_names, false);
d->typed_dump->emit_zeroes = OPTS_GET(opts, emit_zeroes, false);
+ d->typed_dump->emit_strings = OPTS_GET(opts, emit_strings, false);
ret = btf_dump_dump_type_data(d, NULL, t, id, data, 0, 0);
--
2.49.0.1204.g71687c7c1d-goog
As titled, add a version file to the kselftest installation dir, so the
user of the tarball can know which kernel version the tarball belongs to.
Signed-off-by: Tianyi Cui <1997cui(a)gmail.com>
---
tools/testing/selftests/Makefile | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index a0a6ba47d600..246e9863b45b 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -291,6 +291,12 @@ ifdef INSTALL_PATH
$(MAKE) -s --no-print-directory OUTPUT=$$BUILD_TARGET COLLECTION=$$TARGET \
-C $$TARGET emit_tests >> $(TEST_LIST); \
done;
+ @if git describe HEAD > /dev/null 2>&1; then \
+ git describe HEAD > $(INSTALL_PATH)/VERSION; \
+ printf "Version saved to $(INSTALL_PATH)/VERSION\n"; \
+ else \
+ printf "Unable to get version from git describe\n"; \
+ fi
else
$(error Error: set INSTALL_PATH to use install)
endif
--
2.47.1
This series introduces VFIO selftests, located in
tools/testing/selftests/vfio/.
VFIO selftests aim to enable kernel developers to write and run tests
that take the form of userspace programs that interact with VFIO and
IOMMUFD uAPIs. VFIO selftests can be used to write functional tests for
new features, regression tests for bugs, and performance tests for
optimizations.
These tests are designed to interact with real PCI devices, i.e. they do
not rely on mocking out or faking any behavior in the kernel. This
allows the tests to exercise not only VFIO but also IOMMUFD, the IOMMU
driver, interrupt remapping, IRQ handling, etc.
We chose selftests to host these tests primarily to enable integration
with the existing KVM selftests. As explained in the next section,
enabling KVM developers to test the interaction between VFIO and KVM is
one of the motivators of this series.
Motivation
-----------------------------------------------------------------------
The main motivation for this series is upcoming development in the
kernel to support Hypervisor Live Updates [1][2]. Live Update is a
specialized reboot process where selected devices are kept operational
and their kernel state is preserved and recreated across a kexec. For
devices, DMA and interrupts may continue during the reboot. VFIO-bound
devices are the main target, since the first usecase of Live Updates is
to enable host kernel upgrades in a Cloud Computing environment without
disrupting running customer VMs.
To prepare for upcoming support for Live Updates in VFIO, IOMMUFD, IOMMU
drivers, the PCI layer, etc., we'd like to first lay the ground work for
exercising and testing VFIO from kernel selftests. This way when we
eventually upstream support for Live Updates, we can also upstream tests
for those changes, rather than purely relying on Live Update integration
tests which would be hard to share and reproduce upstream.
But even without Live Updates, VFIO and IOMMUFD are becoming an
increasingly critical component of running KVM-based VMs in cloud
environments. Virtualized networking and storage are increasingly being
offloaded to smart NICs/cards, and demand for high performance
networking, storage, and AI are also leading to NICs, SSDs, and GPUs
being directly attached to VMs via VFIO.
VFIO selftests increase our ability to test in several ways:
- It enables developers sending VFIO, IOMMUFD, etc. commits upstream to
test their changes against all existing VFIO selftests, reducing the
probability of regressions.
- It enables developers sending VFIO, IOMMUFD, etc. commits upstream to
include tests alongside their changes, increasing the quality of the
code that is merged.
- It enables testing the interaction between VFIO and KVM. There are
some paths in KVM that are only exercised through VFIO, such as IRQ
bypass. The VFIO selftests provide a helper library to enable KVM
developers to write KVM selftests to test those interactions [3].
Design
-----------------------------------------------------------------------
VFIO selftests are designed around interacting with VFIO-managed PCI
devices. As such, the core data structure is struct vfio_pci_device,
which represents a single PCI device.
struct vfio_pci_device *device;
device = vfio_pci_device_init("0000:6a:01.0", iommu_mode);
...
vfio_pci_device_cleanup(device);
vfio_pci_device_init() sets up a container or iommufd, depending on the
iommu_mode argument, to manage DMA mappings, fetches information about
the device and what interrupts it supports from VFIO and caches it, and
mmap()s all mappable BARs for the test to use.
There are helper methods that operate on struct vfio_pci_device to do
things like read and write PCI config space, enable/disable IRQs, and
map memory for DMA.
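As a purely illustrative fragment (these helper names and signatures are
hypothetical; the real helpers live in this series' vfio_util.h):

/* Hypothetical helper names and signatures, for illustration only. */
struct vfio_pci_device *device;
void *buf;

device = vfio_pci_device_init("0000:6a:01.0", iommu_mode);

/* e.g. map an anonymous buffer at an IOVA chosen by the test */
buf = mmap(NULL, SZ_2M, PROT_READ | PROT_WRITE,
           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
vfio_pci_dma_map(device, iova, SZ_2M, buf);

/* ... exercise the device via its mmap()ed BARs, IRQ helpers, etc. ... */

vfio_pci_device_cleanup(device);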
struct vfio_pci_device and its methods do not care about what device
they are actually interacting with. It can be a GPU, a NIC, an SSD, etc.
To keep things simple initially, VFIO selftests only support a single
device per group and per container/iommufd. But it should be possible to
relax those restrictions in the future, e.g. to enable testing with
multiple devices in the same container/iommufd.
Driver Framework
-----------------------------------------------------------------------
In order to support VFIO selftests where a device generates DMA and
interrupts on command, the VFIO selftests include a driver framework.
This framework abstracts away device-specific details, allowing VFIO
selftests to be written in a generic way and then run against different
devices depending on what hardware developers have access to.
The framework also aims to support carrying drivers out-of-tree, e.g.
so that companies can run VFIO selftests with custom/test hardware.
Drivers must implement the following methods:
- probe(): Check if the driver supports a given device.
- init(): Initialize the driver.
- remove(): Deinitialize the driver and reset the device.
- memcpy_start(): Kick off a series of repeated memcpys (DMA reads and
DMA writes).
- memcpy_wait(): Wait for a memcpy operation to complete.
- send_msi(): Make the device send an MSI interrupt.
memcpy_start/wait() are for generating DMA. We separate the operation
into 2 steps so that tests can trigger a long-running DMA operation. We
expect to use this to stress test Live Updates by kicking off a
long-running memcpy operation and then performing a Live Update. These
methods are required to not generate any interrupts.
send_msi() is used for testing MSI and MSI-x interrupts. The driver
tells the test which MSI it will be using via device->driver.msi.
It's the responsibility of the test to set up a region of memory
and map it into the device for use by the driver, e.g. for in-memory
descriptors, before calling init().
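To make the flow concrete, here is a rough sketch of how a test might
drive these ops (the ops-table layout and argument lists are assumptions,
not the actual framework API):

/* Hypothetical sketch; the ops indirection and arguments are assumed. */
struct vfio_pci_driver *driver = &device->driver;

/* ... test has already mapped memory for the driver's descriptors ... */
driver->ops->init(driver);

/* Kick off a long-running DMA workload, then wait for it to finish. */
driver->ops->memcpy_start(driver, src_iova, dst_iova, size, nr_copies);
driver->ops->memcpy_wait(driver);

/* Ask the device to fire the MSI announced in device->driver.msi. */
driver->ops->send_msi(driver);

driver->ops->remove(driver);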
A demo of the driver framework can be found in
tools/testing/selftests/vfio/vfio_pci_driver_test.c.
In addition, this series introduces a new KVM selftest to demonstrate
delivering a device MSI directly into a guest, which can be found in
tools/testing/selftests/kvm/vfio_pci_device_irq_test.c.
Tests
-----------------------------------------------------------------------
There are 5 tests in this series, mostly serving as a proof of concept:
- tools/testing/selftests/vfio/vfio_pci_device_test.c
- tools/testing/selftests/vfio/vfio_pci_driver_test.c
- tools/testing/selftests/vfio/vfio_iommufd_setup_test.c
- tools/testing/selftests/vfio/vfio_dma_mapping_test.c
- tools/testing/selftests/kvm/vfio_pci_device_irq_test.c
Integrating with KVM selftests
-----------------------------------------------------------------------
To support testing the interactions between VFIO and KVM, the VFIO
selftests support sharing its library with the KVM selftest. The patches
at the end of this series demonstrate how that works.
Essentially, we allow the KVM selftests to build their own copy of
tools/testing/selftests/vfio/lib/ and link it into KVM selftests
binaries. This requires minimal changes to the KVM selftests Makefile.
Future Areas of Development
-----------------------------------------------------------------------
Library:
- Driver support for devices that can be used on AMD, ARM, and other
platforms.
- Driver support for a device available in QEMU VMs.
- Support for tests that use multiple devices.
- Support for IOMMU groups with multiple devices.
- Support for multiple devices sharing the same container/iommufd.
- Sharing TEST_ASSERT() macros and other common code between KVM
and VFIO selftests.
Tests:
- DMA mapping performance tests for BARs/HugeTLB/etc.
- Live Update selftests.
- Porting Sean's KVM selftest for posted interrupts to use the VFIO
selftests library [3]
This series can also be found on GitHub:
https://github.com/dmatlack/linux/tree/vfio/selftests/rfc
Cc: Alex Williamson <alex.williamson(a)redhat.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: Kevin Tian <kevin.tian(a)intel.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Sean Christopherson <seanjc(a)google.com>
Cc: Vipin Sharma <vipinsh(a)google.com>
Cc: Josh Hilke <jrhilke(a)google.com>
Cc: Pasha Tatashin <pasha.tatashin(a)soleen.com>
Cc: Saeed Mahameed <saeedm(a)nvidia.com>
Cc: Adithya Jayachandran <ajayachandra(a)nvidia.com>
Cc: Parav Pandit <parav(a)nvidia.com>
Cc: Leon Romanovsky <leonro(a)nvidia.com>
Cc: Vinicius Costa Gomes <vinicius.gomes(a)intel.com>
Cc: Dave Jiang <dave.jiang(a)intel.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
[1] https://lore.kernel.org/all/f35359d5-63e1-8390-619f-67961443bfe1@google.com/
[2] https://lore.kernel.org/all/20250515182322.117840-1-pasha.tatashin@soleen.c…
[3] https://lore.kernel.org/kvm/20250404193923.1413163-68-seanjc@google.com/
David Matlack (28):
selftests: Create tools/testing/selftests/vfio
vfio: selftests: Add a helper library for VFIO selftests
vfio: selftests: Introduce vfio_pci_device_test
tools headers: Add stub definition for __iomem
tools headers: Import asm-generic MMIO helpers
tools headers: Import x86 MMIO helper overrides
tools headers: Import iosubmit_cmds512()
tools headers: Import drivers/dma/ioat/{hw.h,registers.h}
tools headers: Import drivers/dma/idxd/registers.h
tools headers: Import linux/pci_ids.h
vfio: selftests: Keep track of DMA regions mapped into the device
vfio: selftests: Enable asserting MSI eventfds not firing
vfio: selftests: Add a helper for matching vendor+device IDs
vfio: selftests: Add driver framework
vfio: sefltests: Add vfio_pci_driver_test
vfio: selftests: Add driver for Intel CBDMA
vfio: selftests: Add driver for Intel DSA
vfio: selftests: Move helper to get cdev path to libvfio
vfio: selftests: Encapsulate IOMMU mode
vfio: selftests: Add [-i iommu_mode] option to all tests
vfio: selftests: Add vfio_type1v2_mode
vfio: selftests: Add iommufd_compat_type1{,v2} modes
vfio: selftests: Add iommufd mode
vfio: selftests: Make iommufd the default iommu_mode
vfio: selftests: Add a script to help with running VFIO selftests
KVM: selftests: Build and link sefltests/vfio/lib into KVM selftests
KVM: selftests: Test sending a vfio-pci device IRQ to a VM
KVM: selftests: Use real device MSIs in vfio_pci_device_irq_test
Josh Hilke (5):
vfio: selftests: Test basic VFIO and IOMMUFD integration
vfio: selftests: Move vfio dma mapping test to their own file
vfio: selftests: Add test to reset vfio device.
vfio: selftests: Use command line to set hugepage size for DMA mapping
test
vfio: selftests: Validate 2M/1G HugeTLB are mapped as 2M/1G in IOMMU
MAINTAINERS | 7 +
tools/arch/x86/include/asm/io.h | 101 +
tools/arch/x86/include/asm/special_insns.h | 27 +
tools/include/asm-generic/io.h | 482 +++
tools/include/asm/io.h | 11 +
tools/include/drivers/dma/idxd/registers.h | 601 +++
tools/include/drivers/dma/ioat/hw.h | 270 ++
tools/include/drivers/dma/ioat/registers.h | 251 ++
tools/include/linux/compiler.h | 4 +
tools/include/linux/io.h | 4 +-
tools/include/linux/pci_ids.h | 3212 +++++++++++++++++
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/kvm/Makefile.kvm | 6 +-
.../testing/selftests/kvm/include/kvm_util.h | 4 +
tools/testing/selftests/kvm/lib/kvm_util.c | 21 +
.../selftests/kvm/vfio_pci_device_irq_test.c | 173 +
tools/testing/selftests/vfio/.gitignore | 7 +
tools/testing/selftests/vfio/Makefile | 20 +
.../testing/selftests/vfio/lib/drivers/dsa.c | 416 +++
.../testing/selftests/vfio/lib/drivers/ioat.c | 235 ++
.../selftests/vfio/lib/include/vfio_util.h | 271 ++
tools/testing/selftests/vfio/lib/libvfio.mk | 26 +
.../selftests/vfio/lib/vfio_pci_device.c | 573 +++
.../selftests/vfio/lib/vfio_pci_driver.c | 126 +
tools/testing/selftests/vfio/run.sh | 110 +
.../selftests/vfio/vfio_dma_mapping_test.c | 239 ++
.../selftests/vfio/vfio_iommufd_setup_test.c | 133 +
.../selftests/vfio/vfio_pci_device_test.c | 195 +
.../selftests/vfio/vfio_pci_driver_test.c | 256 ++
29 files changed, 7780 insertions(+), 2 deletions(-)
create mode 100644 tools/arch/x86/include/asm/io.h
create mode 100644 tools/arch/x86/include/asm/special_insns.h
create mode 100644 tools/include/asm-generic/io.h
create mode 100644 tools/include/asm/io.h
create mode 100644 tools/include/drivers/dma/idxd/registers.h
create mode 100644 tools/include/drivers/dma/ioat/hw.h
create mode 100644 tools/include/drivers/dma/ioat/registers.h
create mode 100644 tools/include/linux/pci_ids.h
create mode 100644 tools/testing/selftests/kvm/vfio_pci_device_irq_test.c
create mode 100644 tools/testing/selftests/vfio/.gitignore
create mode 100644 tools/testing/selftests/vfio/Makefile
create mode 100644 tools/testing/selftests/vfio/lib/drivers/dsa.c
create mode 100644 tools/testing/selftests/vfio/lib/drivers/ioat.c
create mode 100644 tools/testing/selftests/vfio/lib/include/vfio_util.h
create mode 100644 tools/testing/selftests/vfio/lib/libvfio.mk
create mode 100644 tools/testing/selftests/vfio/lib/vfio_pci_device.c
create mode 100644 tools/testing/selftests/vfio/lib/vfio_pci_driver.c
create mode 100755 tools/testing/selftests/vfio/run.sh
create mode 100644 tools/testing/selftests/vfio/vfio_dma_mapping_test.c
create mode 100644 tools/testing/selftests/vfio/vfio_iommufd_setup_test.c
create mode 100644 tools/testing/selftests/vfio/vfio_pci_device_test.c
create mode 100644 tools/testing/selftests/vfio/vfio_pci_driver_test.c
base-commit: a11a72229881d8ac1d52ea727101bc9c744189c1
prerequisite-patch-id: 3bae97c9e1093148763235f47a84fa040b512d04
--
2.49.0.1151.ga128411c76-goog
Add missing config options for the tso.py test, specifically
to make sure the kernel is built with vxlan and gre tunnels.
I noticed this while adding a TSO-capable QEMU device to the CI.
Previously we only ran virtio tests, and virtio doesn't report LSO
stats on the QEMU we have.
Fixes: 0d0f4174f6c8 ("selftests: drv-net: add a simple TSO test")
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
---
CC: shuah(a)kernel.org
CC: willemb(a)google.com
CC: linux-kselftest(a)vger.kernel.org
---
tools/testing/selftests/drivers/net/hw/config | 6 ++++++
1 file changed, 6 insertions(+)
create mode 100644 tools/testing/selftests/drivers/net/hw/config
diff --git a/tools/testing/selftests/drivers/net/hw/config b/tools/testing/selftests/drivers/net/hw/config
new file mode 100644
index 000000000000..ea4b70d71563
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/hw/config
@@ -0,0 +1,6 @@
+CONFIG_IPV6=y
+CONFIG_IPV6_GRE=y
+CONFIG_NET_IP_TUNNEL=y
+CONFIG_NET_IPGRE=y
+CONFIG_NET_IPGRE_DEMUX=y
+CONFIG_VXLAN=y
--
2.49.0