June 2025 - Linux-kselftest-mirror

[PATCH v4 00/38] Mediated vPMU 4.0 for x86

by Mingwei Zhang

With joint effort from the upstream KVM community, we have come up with the 4th version of mediated vPMU for x86. We have made the following changes on top of the previous RFC v3.

v3 -> v4
 - Rebase the whole patchset on a 6.14-rc3 base.
 - Address Peter's comments on the perf part.
 - Address Sean's comments on the KVM part.
   * Change the keyword "passthrough" to "mediated" in all patches.
   * Change static enabling to dynamic enabling from user space via KVM_CAP_PMU_CAPABILITY.
   * Only support GLOBAL_CTRL save/restore with VMCS exec_ctrl and drop the MSR save/restore list support for GLOBAL_CTRL. As a result, mediated vPMU support is constrained to Sapphire Rapids and later CPUs on the Intel side.
   * Merge some small changes into a single patch.
 - Address Sandipan's comment on the invalid pmu pointer.
 - Add back "eventsel_hw" and "fixed_ctr_ctrl_hw" to avoid directly manipulating pmc->eventsel and pmu->fixed_ctr_ctrl.

Testing (Intel side):
 - Perf-based legacy vPMU (force emulation on/off)
   * Kselftests pmu_counters_test, pmu_event_filter_test and vmx_pmu_caps_test pass.
   * KUT PMU tests pmu, pmu_lbr, pmu_pebs pass.
   * Basic perf counting/sampling tests in 3 scenarios, guest-only, host-only and host-guest coexistence all pass.
 - Mediated vPMU (force emulation on/off)
   * Kselftests pmu_counters_test, pmu_event_filter_test and vmx_pmu_caps_test pass.
   * KUT PMU tests pmu, pmu_lbr, pmu_pebs pass.
   * Basic perf counting/sampling tests in 3 scenarios, guest-only, host-only and host-guest coexistence all pass.
 - Failures. All of the above tests also passed on Intel Granite Rapids, except for one failure in KUT/pmu_pebs.
   * GP counter 0 (0xfffffffffffe): PEBS record (written seq 0) is verified (including size, counters and cfg).
   * The pebs_data_cfg (0xb500000000) doesn't match the effective MSR_PEBS_DATA_CFG (0x0).
   * This failure has nothing to do with this mediated vPMU patch set. It is caused by Granite Rapids supporting timed PEBS, which needs extra support in QEMU and KUT/pmu_pebs.
This extra support will be sent in separate patches later.

Testing (AMD side):
 - Kselftests pmu_counters_test, pmu_event_filter_test and vmx_pmu_caps_test all pass
 - legacy guest with KUT/pmu:
   * qemu option: -cpu host, -perfctr-core
   * with force_emulation_prefix=1: passes
   * with force_emulation_prefix=0: passes
 - perfmon-v1 guest with KUT/pmu:
   * qemu option: -cpu host, -perfmon-v2
   * with force_emulation_prefix=1: passes
   * with force_emulation_prefix=0: passes
 - perfmon-v2 guest with KUT/pmu:
   * qemu option: -cpu host
   * with force_emulation_prefix=1: passes
   * with force_emulation_prefix=0: passes
 - perf_fuzzer (perfmon-v2):
   * fails with a soft lockup in the guest in the current version.
   * the culprit could be between 6.13 ~ 6.14-rc3 within KVM
   * Series tested on 6.12 and 6.13 without issue.

Note: a QEMU series is needed to run mediated vPMU v4:
 - https://lore.kernel.org/all/20250324123712.34096-1-dapeng1.mi@linux.intel.c…

History:
 - RFC v3: https://lore.kernel.org/all/20240801045907.4010984-1-mizhang@google.com/
 - RFC v2: https://lore.kernel.org/all/20240506053020.3911940-1-mizhang@google.com/
 - RFC v1: https://lore.kernel.org/all/20240126085444.324918-1-xiong.y.zhang@linux.int…

Dapeng Mi (18): KVM: x86/pmu: Introduce enable_mediated_pmu global parameter KVM: x86/pmu: Check PMU cpuid configuration from user space KVM: x86: Rename vmx_vmentry/vmexit_ctrl() helpers KVM: x86/pmu: Add perf_capabilities field in struct kvm_host_values{} KVM: x86/pmu: Move PMU_CAP_{FW_WRITES,LBR_FMT} into msr-index.h header KVM: VMX: Add macros to wrap around {secondary,tertiary}_exec_controls_changebit() KVM: x86/pmu: Check if mediated vPMU can intercept rdpmc KVM: x86/pmu/vmx: Save/load guest IA32_PERF_GLOBAL_CTRL with vm_exit/entry_ctrl KVM: x86/pmu: Optimize intel/amd_pmu_refresh() helpers KVM: x86/pmu: Setup PMU MSRs' interception mode KVM: x86/pmu: Handle PMU MSRs interception and event filtering KVM: x86/pmu: Switch host/guest PMU context at vm-exit/vm-entry KVM:
x86/pmu: Handle emulated instruction for mediated vPMU KVM: nVMX: Add macros to simplify nested MSR interception setting KVM: selftests: Add mediated vPMU supported for pmu tests KVM: Selftests: Support mediated vPMU for vmx_pmu_caps_test KVM: Selftests: Fix pmu_counters_test error for mediated vPMU KVM: x86/pmu: Expose enable_mediated_pmu parameter to user space Kan Liang (8): perf: Support get/put mediated PMU interfaces perf: Skip pmu_ctx based on event_type perf: Clean up perf ctx time perf: Add a EVENT_GUEST flag perf: Add generic exclude_guest support perf: Add switch_guest_ctx() interface perf/x86: Support switch_guest_ctx interface perf/x86/intel: Support PERF_PMU_CAP_MEDIATED_VPMU Mingwei Zhang (5): perf/x86: Forbid PMI handler when guest own PMU perf/x86/core: Plumb mediated PMU capability from x86_pmu to x86_pmu_cap KVM: x86/pmu: Exclude PMU MSRs in vmx_get_passthrough_msr_slot() KVM: x86/pmu: introduce eventsel_hw to prepare for pmu event filtering KVM: nVMX: Add nested virtualization support for mediated PMU Sandipan Das (4): perf/x86/core: Do not set bit width for unavailable counters KVM: x86/pmu: Add AMD PMU registers to direct access list KVM: x86/pmu/svm: Set GuestOnly bit and clear HostOnly bit when guest write to event selectors perf/x86/amd: Support PERF_PMU_CAP_MEDIATED_VPMU for AMD host Xiong Zhang (3): x86/irq: Factor out common code for installing kvm irq handler perf: core/x86: Register a new vector for KVM GUEST PMI KVM: x86/pmu: Register KVM_GUEST_PMI_VECTOR handler arch/x86/events/amd/core.c | 2 + arch/x86/events/core.c | 40 +- arch/x86/events/intel/core.c | 5 + arch/x86/include/asm/hardirq.h | 1 + arch/x86/include/asm/idtentry.h | 1 + arch/x86/include/asm/irq.h | 2 +- arch/x86/include/asm/irq_vectors.h | 5 +- arch/x86/include/asm/kvm-x86-pmu-ops.h | 2 + arch/x86/include/asm/kvm_host.h | 10 + arch/x86/include/asm/msr-index.h | 18 +- arch/x86/include/asm/perf_event.h | 1 + arch/x86/include/asm/vmx.h | 1 + arch/x86/kernel/idt.c | 1 + 
arch/x86/kernel/irq.c | 39 +- arch/x86/kvm/cpuid.c | 15 + arch/x86/kvm/pmu.c | 254 ++++++++- arch/x86/kvm/pmu.h | 45 ++ arch/x86/kvm/svm/pmu.c | 148 ++++- arch/x86/kvm/svm/svm.c | 26 + arch/x86/kvm/svm/svm.h | 2 +- arch/x86/kvm/vmx/capabilities.h | 11 +- arch/x86/kvm/vmx/nested.c | 68 ++- arch/x86/kvm/vmx/pmu_intel.c | 224 ++++++-- arch/x86/kvm/vmx/vmx.c | 89 +-- arch/x86/kvm/vmx/vmx.h | 11 +- arch/x86/kvm/x86.c | 63 ++- arch/x86/kvm/x86.h | 2 + include/linux/perf_event.h | 47 +- kernel/events/core.c | 519 ++++++++++++++---- .../beauty/arch/x86/include/asm/irq_vectors.h | 5 +- .../selftests/kvm/include/kvm_test_harness.h | 13 + .../testing/selftests/kvm/include/kvm_util.h | 3 + .../selftests/kvm/include/x86/processor.h | 8 + tools/testing/selftests/kvm/lib/kvm_util.c | 23 + .../selftests/kvm/x86/pmu_counters_test.c | 24 +- .../selftests/kvm/x86/pmu_event_filter_test.c | 8 +- .../selftests/kvm/x86/vmx_pmu_caps_test.c | 2 +- 37 files changed, 1480 insertions(+), 258 deletions(-) base-commit: 0ad2507d5d93f39619fc42372c347d6006b64319 -- 2.49.0.395.g12beb8f557-goog

2 days, 15 hours


[PATCH v7 00/30] TDX KVM selftests

by Sagi Shahar

This is v7 of the TDX selftests now that the base TDX patches have been accepted. This series is based on v6.16-rc1.

No major changes from v6 aside from rebasing.

Thanks,

Changes from v6:
 - Rebased on top of v6.16-rc1

Ackerley Tng (12): KVM: selftests: Add function to allow one-to-one GVA to GPA mappings KVM: selftests: Expose function that sets up sregs based on VM's mode KVM: selftests: Store initial stack address in struct kvm_vcpu KVM: selftests: Add vCPU descriptor table initialization utility KVM: selftests: TDX: Use KVM_TDX_CAPABILITIES to validate TDs' attribute configuration KVM: selftests: TDX: Update load_td_memory_region() for VM memory backed by guest memfd KVM: selftests: Add functions to allow mapping as shared KVM: selftests: KVM: selftests: Expose new vm_vaddr_alloc_private() KVM: selftests: TDX: Add support for TDG.MEM.PAGE.ACCEPT KVM: selftests: TDX: Add support for TDG.VP.VEINFO.GET KVM: selftests: TDX: Add TDX UPM selftest KVM: selftests: TDX: Add TDX UPM selftests for implicit conversion Erdem Aktas (3): KVM: selftests: Add helper functions to create TDX VMs KVM: selftests: TDX: Add TDX lifecycle test KVM: selftests: TDX: Add TDX HLT exit test Isaku Yamahata (1): KVM: selftests: Update kvm_init_vm_address_properties() for TDX Roger Wang (1): KVM: selftests: TDX: Add TDG.VP.INFO test Ryan Afranji (2): KVM: selftests: TDX: Verify the behavior when host consumes a TD private memory KVM: selftests: TDX: Add shared memory test Sagi Shahar (10): KVM: selftests: TDX: Add report_fatal_error test KVM: selftests: TDX: Adding test case for TDX port IO KVM: selftests: TDX: Add basic TDX CPUID test KVM: selftests: TDX: Add basic TDG.VP.VMCALL<GetTdVmCallInfo> test KVM: selftests: TDX: Add TDX IO writes test KVM: selftests: TDX: Add TDX IO reads test KVM: selftests: TDX: Add TDX MSR read/write tests KVM: selftests: TDX: Add TDX MMIO reads test KVM: selftests: TDX: Add TDX MMIO writes test KVM: selftests: TDX: Add TDX CPUID TDVMCALL test Yan Zhao (1): KVM:
selftests: TDX: Test LOG_DIRTY_PAGES flag to a non-GUEST_MEMFD memslot tools/testing/selftests/kvm/Makefile.kvm | 8 + .../testing/selftests/kvm/include/kvm_util.h | 36 + .../selftests/kvm/include/x86/kvm_util_arch.h | 1 + .../selftests/kvm/include/x86/processor.h | 2 + .../selftests/kvm/include/x86/tdx/td_boot.h | 83 ++ .../kvm/include/x86/tdx/td_boot_asm.h | 16 + .../selftests/kvm/include/x86/tdx/tdcall.h | 54 + .../selftests/kvm/include/x86/tdx/tdx.h | 67 + .../selftests/kvm/include/x86/tdx/tdx_util.h | 23 + .../selftests/kvm/include/x86/tdx/test_util.h | 133 ++ tools/testing/selftests/kvm/lib/kvm_util.c | 74 +- .../testing/selftests/kvm/lib/x86/processor.c | 97 +- .../selftests/kvm/lib/x86/tdx/td_boot.S | 100 ++ .../selftests/kvm/lib/x86/tdx/tdcall.S | 163 +++ tools/testing/selftests/kvm/lib/x86/tdx/tdx.c | 243 ++++ .../selftests/kvm/lib/x86/tdx/tdx_util.c | 643 +++++++++ .../selftests/kvm/lib/x86/tdx/test_util.c | 187 +++ .../selftests/kvm/x86/tdx_shared_mem_test.c | 129 ++ .../testing/selftests/kvm/x86/tdx_upm_test.c | 461 ++++++ tools/testing/selftests/kvm/x86/tdx_vm_test.c | 1254 +++++++++++++++++ 20 files changed, 3734 insertions(+), 40 deletions(-) create mode 100644 tools/testing/selftests/kvm/include/x86/tdx/td_boot.h create mode 100644 tools/testing/selftests/kvm/include/x86/tdx/td_boot_asm.h create mode 100644 tools/testing/selftests/kvm/include/x86/tdx/tdcall.h create mode 100644 tools/testing/selftests/kvm/include/x86/tdx/tdx.h create mode 100644 tools/testing/selftests/kvm/include/x86/tdx/tdx_util.h create mode 100644 tools/testing/selftests/kvm/include/x86/tdx/test_util.h create mode 100644 tools/testing/selftests/kvm/lib/x86/tdx/td_boot.S create mode 100644 tools/testing/selftests/kvm/lib/x86/tdx/tdcall.S create mode 100644 tools/testing/selftests/kvm/lib/x86/tdx/tdx.c create mode 100644 tools/testing/selftests/kvm/lib/x86/tdx/tdx_util.c create mode 100644 tools/testing/selftests/kvm/lib/x86/tdx/test_util.c create mode 100644 
tools/testing/selftests/kvm/x86/tdx_shared_mem_test.c create mode 100644 tools/testing/selftests/kvm/x86/tdx_upm_test.c create mode 100644 tools/testing/selftests/kvm/x86/tdx_vm_test.c -- 2.50.0.rc2.692.g299adb8693-goog

2 days, 15 hours


[PATCH] selftests: breakpoints: use suspend_stats to reliably check suspend success

by Moon Hee Lee

The step_after_suspend_test verifies that the system successfully suspended and resumed by setting a timerfd and checking whether the timer fully expired. However, this method is unreliable due to timing races.

In practice, the system may take time to enter suspend, during which the timer may expire just before or during the transition. As a result, the remaining time after resume may show non-zero nanoseconds, even if suspend/resume completed successfully. This leads to false test failures.

Replace the timer-based check with a read from /sys/power/suspend_stats/success. This counter is incremented only after a full suspend/resume cycle, providing a reliable and race-free indicator.

Also remove the unused file descriptor for /sys/power/state, which remained after switching to a system() call to trigger suspend [1].

[1] https://lore.kernel.org/all/20240930224025.2858767-1-yifei.l.liu@oracle.com/

Fixes: c66be905cda2 ("selftests: breakpoints: use remaining time to check if suspend succeed")
Signed-off-by: Moon Hee Lee <moonhee.lee.ca(a)gmail.com>
---
 .../breakpoints/step_after_suspend_test.c     | 41 ++++++++++++++-----
 1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/breakpoints/step_after_suspend_test.c b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
index 8d275f03e977..8d233ac95696 100644
--- a/tools/testing/selftests/breakpoints/step_after_suspend_test.c
+++ b/tools/testing/selftests/breakpoints/step_after_suspend_test.c
@@ -127,22 +127,42 @@ int run_test(int cpu)
 	return KSFT_PASS;
 }
 
+/*
+ * Reads the suspend success count from sysfs.
+ * Returns the count on success or exits on failure.
+ */
+static int get_suspend_success_count_or_fail(void)
+{
+	FILE *fp;
+	int val;
+
+	fp = fopen("/sys/power/suspend_stats/success", "r");
+	if (!fp)
+		ksft_exit_fail_msg(
+			"Failed to open suspend_stats/success: %s\n",
+			strerror(errno));
+
+	if (fscanf(fp, "%d", &val) != 1) {
+		fclose(fp);
+		ksft_exit_fail_msg(
+			"Failed to read suspend success count\n");
+	}
+
+	fclose(fp);
+	return val;
+}
+
 void suspend(void)
 {
-	int power_state_fd;
 	int timerfd;
 	int err;
+	int count_before;
+	int count_after;
 	struct itimerspec spec = {};
 
 	if (getuid() != 0)
 		ksft_exit_skip("Please run the test as root - Exiting.\n");
 
-	power_state_fd = open("/sys/power/state", O_RDWR);
-	if (power_state_fd < 0)
-		ksft_exit_fail_msg(
-			"open(\"/sys/power/state\") failed %s)\n",
-			strerror(errno));
-
 	timerfd = timerfd_create(CLOCK_BOOTTIME_ALARM, 0);
 	if (timerfd < 0)
 		ksft_exit_fail_msg("timerfd_create() failed\n");
@@ -152,14 +172,15 @@ void suspend(void)
 	if (err < 0)
 		ksft_exit_fail_msg("timerfd_settime() failed\n");
 
+	count_before = get_suspend_success_count_or_fail();
+
 	system("(echo mem > /sys/power/state) 2> /dev/null");
 
-	timerfd_gettime(timerfd, &spec);
-	if (spec.it_value.tv_sec != 0 || spec.it_value.tv_nsec != 0)
+	count_after = get_suspend_success_count_or_fail();
+	if (count_after <= count_before)
 		ksft_exit_fail_msg("Failed to enter Suspend state\n");
 
 	close(timerfd);
-	close(power_state_fd);
 }
 
 int main(int argc, char **argv)
-- 
2.43.0
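For readers who want to try the counter-based check outside the kselftest harness, here is a minimal standalone sketch of the same idea: read a decimal counter from a file before and after the suspend attempt, and treat the attempt as successful only if the counter grew. The helper names read_counter() and suspend_succeeded() are hypothetical and not part of the patch. Unlike the patch, which exits through ksft_exit_fail_msg(), this sketch reports errors via a return code so the caller decides what to do.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a single decimal counter from a file such as
 * /sys/power/suspend_stats/success.
 * Returns 0 and stores the value in *out on success, -1 on failure.
 * (read_counter is a hypothetical helper, not taken from the patch.)
 */
static int read_counter(const char *path, long *out)
{
	FILE *fp;
	int ok;

	assert(path != NULL && out != NULL);

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	ok = (fscanf(fp, "%ld", out) == 1);
	fclose(fp);
	return ok ? 0 : -1;
}

/*
 * A suspend/resume cycle counts as successful only if the counter
 * strictly increased between the two reads.
 */
static int suspend_succeeded(long before, long after)
{
	return after > before;
}
```

A caller would read the counter, trigger suspend, read it again, and pass both values to suspend_succeeded().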

2 days, 23 hours


[PATCH v2 0/6] VMM can handle guest SEA via KVM_EXIT_ARM_SEA

by Jiaqi Yan

Problem
=======

When host APEI is unable to claim a synchronous external abort (SEA) during a stage-2 guest abort, KVM today directly injects an async SError into the VCPU and then resumes it. The injected SError usually results in a guest kernel panic.

One of the major sources of guest SEAs is a VCPU consuming a recoverable uncorrected memory error (UER), which is not uncommon at all in modern datacenter servers with large amounts of physical memory. Although SError and guest panic are sufficient to stop the propagation of corrupted memory, there is room to recover from a UER in a more graceful manner.

Proposed Solution
=================

Alternatively, KVM can replay the SEA to the faulting VCPU via the existing KVM_SET_VCPU_EVENTS API. If the memory poison consumption or the fault that causes the SEA is not from the guest kernel, the blast radius can be limited to the consuming or faulting guest userspace process, so the VM can keep running.

In addition, instead of handling this under the hood without involving userspace, there are benefits to redirecting the SEA to the VMM:

- VM customers care about the disruptions caused by memory errors, and the VMM usually has the responsibility to start the process of notifying the customers of memory error events in their VMs. For example, some cloud providers emit a critical log in their observability UI [1] and provide a playbook for customers on how to mitigate disruptions to their workloads.

- The VMM can protect against future memory error consumption by unmapping the poisoned pages from the stage-2 page table with KVM userfault, or by splitting the memslot that contains the poisoned guest pages [2].

- The VMM can keep track of SEA events in the VM. When the VMM thinks the status of the host or the VM is bad enough, e.g. the number of distinct SEAs exceeds a threshold, it can restart the VM on another healthy host.

- Behavior parity with the x86 architecture. When a machine check exception (MCE) is caused by a VCPU, the kernel or KVM signals userspace with SIGBUS to let the VMM either recover from the MCE or terminate itself along with the VM. The prior RFC proposed to implement SIGBUS on arm64 as well, but Marc preferred a VCPU exit over a signal [3]. However, implementation aside, returning an SEA to the VMM is on par with returning an MCE to the VMM.

Once the SEA is redirected to the VMM, among other actions, the VMM is encouraged to inject external aborts into the faulting VCPU, which is already supported by KVM on arm64. We noticed that injecting an instruction abort is not fully supported by KVM_SET_VCPU_EVENTS; this patchset adds the missing support.

New UAPIs
=========

This patchset introduces the following userspace-visible changes to empower the VMM to control what happens next for an SEA on guest memory:

- KVM_CAP_ARM_SEA_TO_USER. While taking an SEA, if userspace has enabled this new capability at VM creation, and the SEA is not caused by memory allocated for the stage-2 translation table, KVM returns KVM_EXIT_ARM_SEA to userspace instead of injecting an SError.

- KVM_EXIT_ARM_SEA. This is the VM exit reason the VMM gets. Details about the SEA are provided in arm_sea as much as possible, including the sanitized ESR value at EL2, whether the guest virtual and physical addresses (GVA and GPA) are available, and their values if so.

- KVM_CAP_ARM_INJECT_EXT_IABT. The VMM today can inject an external data abort into a VCPU via the KVM_SET_VCPU_EVENTS API. However, in the case of an instruction abort, the VMM cannot inject it via KVM_SET_VCPU_EVENTS. KVM_CAP_ARM_INJECT_EXT_IABT is a natural extension of KVM_CAP_ARM_INJECT_EXT_DABT that tells the VMM that KVM_SET_VCPU_EVENTS now supports external instruction aborts.

* From v1 [4]:
  - Rebased on commit 4d62121ce9b5 ("KVM: arm64: vgic-debug: Avoid dereferencing NULL ITE pointer").
  - Sanitize ESR_EL2 before reporting it to userspace.
  - Do not do KVM_EXIT_ARM_SEA when the SEA is caused by memory allocated to the stage-2 translation table.
[1] https://cloud.google.com/solutions/sap/docs/manage-host-errors [2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthoughton@google.com [3] https://lore.kernel.org/kvm/86pljbqqh0.wl-maz@kernel.org [4] https://lore.kernel.org/kvm/20250505161412.1926643-1-jiaqiyan@google.com Jiaqi Yan (5): KVM: arm64: VM exit to userspace to handle SEA KVM: arm64: Set FnV for VCPU when FAR_EL2 is invalid KVM: selftests: Test for KVM_EXIT_ARM_SEA and KVM_CAP_ARM_SEA_TO_USER KVM: selftests: Test for KVM_CAP_INJECT_EXT_IABT Documentation: kvm: new uAPI for handling SEA Raghavendra Rao Ananta (1): KVM: arm64: Allow userspace to inject external instruction aborts Documentation/virt/kvm/api.rst | 128 ++++++- arch/arm64/include/asm/kvm_emulate.h | 67 ++++ arch/arm64/include/asm/kvm_host.h | 8 + arch/arm64/include/asm/kvm_ras.h | 2 +- arch/arm64/include/uapi/asm/kvm.h | 3 +- arch/arm64/kvm/arm.c | 6 + arch/arm64/kvm/guest.c | 13 +- arch/arm64/kvm/inject_fault.c | 3 + arch/arm64/kvm/mmu.c | 59 ++- include/uapi/linux/kvm.h | 12 + tools/arch/arm64/include/asm/esr.h | 2 + tools/arch/arm64/include/uapi/asm/kvm.h | 3 +- tools/testing/selftests/kvm/Makefile.kvm | 2 + .../testing/selftests/kvm/arm64/inject_iabt.c | 98 +++++ .../testing/selftests/kvm/arm64/sea_to_user.c | 340 ++++++++++++++++++ tools/testing/selftests/kvm/lib/kvm_util.c | 1 + 16 files changed, 718 insertions(+), 29 deletions(-) create mode 100644 tools/testing/selftests/kvm/arm64/inject_iabt.c create mode 100644 tools/testing/selftests/kvm/arm64/sea_to_user.c -- 2.49.0.1266.g31b7d2e469-goog
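The cover letter mentions that a VMM can track SEA events and restart the VM elsewhere once the number of distinct SEAs exceeds a threshold. One way such bookkeeping could look on the VMM side is sketched below. Everything here is hypothetical (struct sea_tracker, the fixed capacity, keying on the guest physical address); none of it comes from the patchset or from any KVM uAPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SEA_TRACK_MAX 64	/* hypothetical capacity of the tracker */

/* Hypothetical per-VM bookkeeping kept by the VMM, not by KVM. */
struct sea_tracker {
	uint64_t gpa[SEA_TRACK_MAX];	/* distinct faulting GPAs seen so far */
	size_t count;			/* number of valid entries in gpa[] */
	size_t threshold;		/* policy knob chosen by the VMM */
};

static void sea_tracker_init(struct sea_tracker *t, size_t threshold)
{
	assert(t != NULL);
	t->count = 0;
	t->threshold = threshold;
}

/*
 * Record one SEA at the given guest physical address.
 * Returns true once the number of distinct addresses exceeds the
 * threshold, i.e. when the VMM policy would consider moving the VM.
 * Addresses beyond SEA_TRACK_MAX distinct entries are not stored.
 */
static bool sea_tracker_record(struct sea_tracker *t, uint64_t gpa)
{
	size_t i;

	assert(t != NULL);

	for (i = 0; i < t->count; i++)
		if (t->gpa[i] == gpa)
			return t->count > t->threshold;

	if (t->count < SEA_TRACK_MAX)
		t->gpa[t->count++] = gpa;

	return t->count > t->threshold;
}
```

A real VMM would likely key on page-aligned addresses and combine this with host health signals; the sketch only shows the distinct-count policy.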

3 days, 17 hours


[PATCH 00/33] vfio: Introduce selftests for VFIO

by David Matlack

This series introduces VFIO selftests, located in tools/testing/selftests/vfio/. VFIO selftests aim to enable kernel developers to write and run tests that take the form of userspace programs that interact with VFIO and IOMMUFD uAPIs. VFIO selftests can be used to write functional tests for new features, regression tests for bugs, and performance tests for optimizations. These tests are designed to interact with real PCI devices, i.e. they do not rely on mocking out or faking any behavior in the kernel. This allows the tests to exercise not only VFIO but also IOMMUFD, the IOMMU driver, interrupt remapping, IRQ handling, etc. For more background on the motivation and design of this series, please see the RFC: https://lore.kernel.org/kvm/20250523233018.1702151-1-dmatlack@google.com/ This series can also be found on GitHub: https://github.com/dmatlack/linux/tree/vfio/selftests/v1 Changelog ----------------------------------------------------------------------- RFC: https://lore.kernel.org/kvm/20250523233018.1702151-1-dmatlack@google.com/ - Add symlink to linux/pci_ids.h instead of copying (Jason) - Add symlinks to drivers/dma/*/*.h instead of copying (Jason) - Automatically replicate vfio_dma_mapping_test across backing sources using fixture variants (Jason) - Automatically replicate vfio_dma_mapping_test and vfio_pci_driver_test across all iommu_modes using fixture variants (Jason) - Invert access() check in vfio_dma_mapping_test (me) - Use driver_override instead of add/remove_id (Alex) - Allow tests to get BDF from env var (Alex) - Use KSFT_FAIL instead of 1 to exit with failure (Alex) - Unconditionally create $(LIBVFIO_O_DIRS) to avoid target conflict with ../cgroup/lib/libcgroup.mk when building KVM selftests (me) - Allow VFIO selftests to run automatically by switching from TEST_GEN_PROGS_EXTENDED to TEST_GEN_PROGS. 
Selftests that are run automatically use the $VFIO_SELFTESTS_BDF environment variable to know which device to use (Alex)
- Replace hardcoded SZ_4K with getpagesize() in vfio_dma_mapping_test to support platforms with other page sizes (me)
- Make all global variables static where possible (me)
- Pass argc and argv to test_harness_main() so that users can pass flags to the kselftest harness (me)

Instructions
-----------------------------------------------------------------------

Running VFIO selftests requires a PCI device bound to vfio-pci for the tests to use. The address of this device is passed to the test as a segment:bus:device.function string, which must match the path to the device in /sys/bus/pci/devices/ (e.g. 0000:00:04.0).

Once you have chosen a device, there is a helper script provided to unbind the device from its current driver, bind it to vfio-pci, export the environment variable $VFIO_SELFTESTS_BDF, and launch a shell:

  $ tools/testing/selftests/vfio/run.sh -d 0000:00:04.0 -s

The -d option tells the script which device to use and the -s option tells the script to launch a shell.

Additionally, the VFIO selftest vfio_dma_mapping_test has test cases that rely on HugeTLB pages being available; otherwise they are skipped. To enable those tests, make sure at least one 2MB and one 1GB HugeTLB page are available.

  $ echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
  $ echo 1 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

To run all VFIO selftests using make:

  $ make -C tools/testing/selftests/vfio run_tests

To run individual tests:

  $ tools/testing/selftests/vfio/vfio_dma_mapping_test
  $ tools/testing/selftests/vfio/vfio_dma_mapping_test -v iommufd_anonymous_hugetlb_2mb
  $ tools/testing/selftests/vfio/vfio_dma_mapping_test -r vfio_dma_mapping_test.iommufd_anonymous_hugetlb_2mb.dma_map_unmap

The environment variable $VFIO_SELFTESTS_BDF can be overridden for a specific test by passing in the BDF on the command line as the last positional argument.
$ tools/testing/selftests/vfio/vfio_dma_mapping_test 0000:00:04.0 $ tools/testing/selftests/vfio/vfio_dma_mapping_test -v iommufd_anonymous_hugetlb_2mb 0000:00:04.0 $ tools/testing/selftests/vfio/vfio_dma_mapping_test -r vfio_dma_mapping_test.iommufd_anonymous_hugetlb_2mb.dma_map_unmap 0000:00:04.0 When you are done, free the HugeTLB pages and exit the shell started by run.sh. Exiting the shell will cause the device to be unbound from vfio-pci and bound back to its original driver. $ echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages $ echo 0 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages $ exit It's also possible to use run.sh to run just a single test hermetically, rather than dropping into a shell: $ tools/testing/selftests/vfio/run.sh -d 0000:00:04.0 -- tools/testing/selftests/vfio/vfio_dma_mapping_test -v iommufd_anonymous Tests ----------------------------------------------------------------------- There are 5 tests in this series, mostly to demonstrate as a proof-of-concept: - tools/testing/selftests/vfio/vfio_pci_device_test.c - tools/testing/selftests/vfio/vfio_pci_driver_test.c - tools/testing/selftests/vfio/vfio_iommufd_setup_test.c - tools/testing/selftests/vfio/vfio_dma_mapping_test.c - tools/testing/selftests/kvm/vfio_pci_device_irq_test.c Future Areas of Development ----------------------------------------------------------------------- Library: - Driver support for devices that can be used on AMD, ARM, and other platforms (e.g. mlx5). - Driver support for a device available in QEMU VMs (e.g. pcie-ats-testdev [1]) - Support for tests that use multiple devices. - Support for IOMMU groups with multiple devices. - Support for multiple devices sharing the same container/iommufd. - Sharing TEST_ASSERT() macros and other common code between KVM and VFIO selftests. Tests: - DMA mapping performance tests for BARs/HugeTLB/etc. - Porting tests from https://github.com/awilliam/tests/commits/for-clg/ to selftests. 
- Live Update selftests. - Porting Sean's KVM selftest for posted interrupts to use the VFIO selftests library [2] Cc: Alex Williamson <alex.williamson(a)redhat.com> Cc: Jason Gunthorpe <jgg(a)nvidia.com> Cc: Kevin Tian <kevin.tian(a)intel.com> Cc: Paolo Bonzini <pbonzini(a)redhat.com> Cc: Sean Christopherson <seanjc(a)google.com> Cc: Vipin Sharma <vipinsh(a)google.com> Cc: Josh Hilke <jrhilke(a)google.com> Cc: Aaron Lewis <aaronlewis(a)google.com> Cc: Pasha Tatashin <pasha.tatashin(a)soleen.com> Cc: Saeed Mahameed <saeedm(a)nvidia.com> Cc: Adithya Jayachandran <ajayachandra(a)nvidia.com> Cc: Joel Granados <joel.granados(a)kernel.org> [1] https://github.com/Joelgranados/qemu/blob/pcie-testdev/hw/misc/pcie-ats-tes… [2] https://lore.kernel.org/kvm/20250404193923.1413163-68-seanjc@google.com/ David Matlack (28): selftests: Create tools/testing/selftests/vfio vfio: selftests: Add a helper library for VFIO selftests vfio: selftests: Introduce vfio_pci_device_test tools headers: Add stub definition for __iomem tools headers: Import asm-generic MMIO helpers tools headers: Import x86 MMIO helper overrides tools headers: Import iosubmit_cmds512() tools headers: Add symlink to linux/pci_ids.h vfio: selftests: Keep track of DMA regions mapped into the device vfio: selftests: Enable asserting MSI eventfds not firing vfio: selftests: Add a helper for matching vendor+device IDs vfio: selftests: Add driver framework vfio: sefltests: Add vfio_pci_driver_test dmaengine: ioat: Move system_has_dca_enabled() to dma.h vfio: selftests: Add driver for Intel CBDMA dmaengine: idxd: Allow registers.h to be included from tools/ vfio: selftests: Add driver for Intel DSA vfio: selftests: Move helper to get cdev path to libvfio vfio: selftests: Encapsulate IOMMU mode vfio: selftests: Replicate tests across all iommu_modes vfio: selftests: Add vfio_type1v2_mode vfio: selftests: Add iommufd_compat_type1{,v2} modes vfio: selftests: Add iommufd mode vfio: selftests: Make iommufd the default 
iommu_mode vfio: selftests: Add a script to help with running VFIO selftests KVM: selftests: Build and link sefltests/vfio/lib into KVM selftests KVM: selftests: Test sending a vfio-pci device IRQ to a VM KVM: selftests: Add -d option to vfio_pci_device_irq_test for device-sent MSIs Josh Hilke (5): vfio: selftests: Test basic VFIO and IOMMUFD integration vfio: selftests: Move vfio dma mapping test to their own file vfio: selftests: Add test to reset vfio device. vfio: selftests: Add DMA mapping tests for 2M and 1G HugeTLB vfio: selftests: Validate 2M/1G HugeTLB are mapped as 2M/1G in IOMMU MAINTAINERS | 7 + drivers/dma/idxd/registers.h | 4 + drivers/dma/ioat/dma.h | 2 + drivers/dma/ioat/hw.h | 3 - tools/arch/x86/include/asm/io.h | 101 +++ tools/arch/x86/include/asm/special_insns.h | 27 + tools/include/asm-generic/io.h | 482 ++++++++++++++ tools/include/asm/io.h | 11 + tools/include/linux/compiler.h | 4 + tools/include/linux/io.h | 4 +- tools/include/linux/pci_ids.h | 1 + tools/testing/selftests/Makefile | 1 + tools/testing/selftests/kvm/Makefile.kvm | 4 + .../testing/selftests/kvm/include/kvm_util.h | 4 + tools/testing/selftests/kvm/lib/kvm_util.c | 21 + .../selftests/kvm/vfio_pci_device_irq_test.c | 172 +++++ tools/testing/selftests/vfio/.gitignore | 7 + tools/testing/selftests/vfio/Makefile | 21 + .../selftests/vfio/lib/drivers/dsa/dsa.c | 416 ++++++++++++ .../vfio/lib/drivers/dsa/registers.h | 1 + .../selftests/vfio/lib/drivers/ioat/hw.h | 1 + .../selftests/vfio/lib/drivers/ioat/ioat.c | 235 +++++++ .../vfio/lib/drivers/ioat/registers.h | 1 + .../selftests/vfio/lib/include/vfio_util.h | 295 +++++++++ tools/testing/selftests/vfio/lib/libvfio.mk | 24 + .../selftests/vfio/lib/vfio_pci_device.c | 594 ++++++++++++++++++ .../selftests/vfio/lib/vfio_pci_driver.c | 126 ++++ tools/testing/selftests/vfio/run.sh | 109 ++++ .../selftests/vfio/vfio_dma_mapping_test.c | 199 ++++++ .../selftests/vfio/vfio_iommufd_setup_test.c | 127 ++++ 
.../selftests/vfio/vfio_pci_device_test.c | 176 ++++++ .../selftests/vfio/vfio_pci_driver_test.c | 247 ++++++++ 32 files changed, 3423 insertions(+), 4 deletions(-) create mode 100644 tools/arch/x86/include/asm/io.h create mode 100644 tools/arch/x86/include/asm/special_insns.h create mode 100644 tools/include/asm-generic/io.h create mode 100644 tools/include/asm/io.h create mode 120000 tools/include/linux/pci_ids.h create mode 100644 tools/testing/selftests/kvm/vfio_pci_device_irq_test.c create mode 100644 tools/testing/selftests/vfio/.gitignore create mode 100644 tools/testing/selftests/vfio/Makefile create mode 100644 tools/testing/selftests/vfio/lib/drivers/dsa/dsa.c create mode 120000 tools/testing/selftests/vfio/lib/drivers/dsa/registers.h create mode 120000 tools/testing/selftests/vfio/lib/drivers/ioat/hw.h create mode 100644 tools/testing/selftests/vfio/lib/drivers/ioat/ioat.c create mode 120000 tools/testing/selftests/vfio/lib/drivers/ioat/registers.h create mode 100644 tools/testing/selftests/vfio/lib/include/vfio_util.h create mode 100644 tools/testing/selftests/vfio/lib/libvfio.mk create mode 100644 tools/testing/selftests/vfio/lib/vfio_pci_device.c create mode 100644 tools/testing/selftests/vfio/lib/vfio_pci_driver.c create mode 100755 tools/testing/selftests/vfio/run.sh create mode 100644 tools/testing/selftests/vfio/vfio_dma_mapping_test.c create mode 100644 tools/testing/selftests/vfio/vfio_iommufd_setup_test.c create mode 100644 tools/testing/selftests/vfio/vfio_pci_device_test.c create mode 100644 tools/testing/selftests/vfio/vfio_pci_driver_test.c base-commit: e271ed52b344ac02d4581286961d0c40acc54c03 prerequisite-patch-id: c1decca4653262d3d2451e6fd4422ebff9c0b589 -- 2.50.0.rc2.701.gf1e915cc24-goog
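As an illustration of the device-selection rule described in the Instructions section (a segment:bus:device.function string, taken from the last positional argument if present, else from $VFIO_SELFTESTS_BDF), here is a small standalone sketch. The names struct bdf, parse_bdf() and select_bdf() are hypothetical and are not the helpers used in the series. The parser is also more lenient than sysfs naming, since it accepts short fields such as "0:0:4.0".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical decoded form of a "ssss:bb:dd.f" PCI address. */
struct bdf {
	unsigned int segment;
	unsigned int bus;
	unsigned int device;
	unsigned int function;
};

/*
 * Parse a segment:bus:device.function string with hexadecimal fields,
 * e.g. "0000:00:04.0". Returns 0 on success, -1 on malformed input.
 */
static int parse_bdf(const char *s, struct bdf *out)
{
	char trailing;

	assert(s != NULL && out != NULL);

	/* Exactly four fields must convert; a fifth means trailing junk. */
	if (sscanf(s, "%4x:%2x:%2x.%1x%c", &out->segment, &out->bus,
		   &out->device, &out->function, &trailing) != 4)
		return -1;

	/* PCI allows 32 devices per bus and 8 functions per device. */
	if (out->device > 0x1f || out->function > 7)
		return -1;

	return 0;
}

/*
 * Pick the BDF string to use: the last command-line argument if it
 * parses as a BDF, otherwise the VFIO_SELFTESTS_BDF environment
 * variable (NULL if unset).
 */
static const char *select_bdf(int argc, char **argv)
{
	struct bdf tmp;

	if (argc > 1 && parse_bdf(argv[argc - 1], &tmp) == 0)
		return argv[argc - 1];

	return getenv("VFIO_SELFTESTS_BDF");
}
```

Validating the last argument before using it keeps harness flags such as -v or -r from being mistaken for a device address.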

3 days, 18 hours


[PATCH v3 00/15] Consolidate iommu page table implementations (AMD)

by Jason Gunthorpe

[All the precursor patches are merged now and AMD/RISCV/VTD conversions are written]

Currently each of the iommu page table formats duplicates all of the logic to maintain the page table and perform map/unmap/etc operations. There are several different versions of the algorithms between all the different formats. The io-pgtable system provides an interface to help isolate the page table code from the iommu driver, but doesn't provide tools to implement the common algorithms.

This makes it very hard to improve the state of the pagetable code under the iommu domains as any proposed improvement needs to alter a large number of different driver code paths. Combined with a lack of software based testing this makes improvement in this area very hard.

iommufd wants several new page table operations:
 - More efficient map/unmap operations, using iommufd's batching logic
 - unmap that returns the physical addresses into a batch as it progresses
 - cut that allows splitting areas so large pages can have holes poked in them dynamically (i.e. guestmemfd hitless shared/private transitions)
 - More aggressive freeing of table memory to avoid waste
 - Fragmenting large pages so that dirty tracking can be more granular
 - Reassembling large pages so that VMs can run at full IO performance in migration/dirty tracking error flows
 - KHO integration for kernel live upgrade

Together these are algorithmically complex enough to be a very significant task to go and implement in all the page table formats we support. Just the "server" focused drivers use almost all the formats (ARMv8 S1&S2 / x86 PAE / AMDv1 / VT-D SS / RISCV)

Instead of doing the duplicated work, this series takes the first step to consolidate the algorithms into one place. In spirit it is similar to the work Christoph did a few years back to pull the redundant get_user_pages() implementations out of the arch code into core MM. This unlocked a great deal of improvement in that space in the following years.
I would like to see the same benefit in iommu as well. My first RFC showed a bigger picture with all most all formats and more algorithms. This series reorganizes that to be narrowly focused on just enough to convert the AMD driver to use the new mechanism. kunit tests are provided that allow good testing of the algorithms and all formats on x86, nothing is arch specific. AMD is one of the simpler options as the HW is quite uniform with few different options/bugs while still requiring the complicated contiguous pages support. The HW also has a very simple range based invalidation approach that is easy to implement. The AMD v1 and AMD v2 page table formats are implemented bit for bit identical to the current code, tested using a compare kunit test that checks against the io-pgtable version (on github, see below). Updating the AMD driver to replace the io-pgtable layer with the new stuff is fairly straightforward now. The layering is fixed up in the new version so that all the invalidation goes through function pointers. Several small fixing patches have come out of this as I've been fixing the problems that the test suite uncovers in the current code, and implementing the fixed version in iommupt. On performance, there is a quite wide variety of implementation designs across all the drivers. 
Looking at some key performance across the main formats:

iommu_map():
   pgsz  ,avg new,old ns, min new,old ns  , min % (+ve is better)
     2^12,     53,66    ,      51,63      ,  19.19 (AMDv1)
 256*2^12,    386,1909  ,     367,1795    ,  79.79
 256*2^21,    362,1633  ,     355,1556    ,  77.77
     2^12,     56,62    ,      52,59      ,  11.11 (AMDv2)
 256*2^12,    405,1355  ,     357,1292    ,  72.72
 256*2^21,    393,1160  ,     358,1114    ,  67.67
     2^12,     55,65    ,      53,62      ,  14.14 (VTD second stage)
 256*2^12,    391,518   ,     332,512     ,  35.35
 256*2^21,    383,635   ,     336,624     ,  46.46
     2^12,     57,65    ,      55,63      ,  12.12 (ARM 64 bit)
 256*2^12,    380,389   ,     361,369     ,   2.02
 256*2^21,    358,419   ,     345,400     ,  13.13

iommu_unmap():
   pgsz  ,avg new,old ns, min new,old ns  , min % (+ve is better)
     2^12,     69,88    ,      65,85      ,  23.23 (AMDv1)
 256*2^12,    353,6498  ,     331,6029    ,  94.94
 256*2^21,    373,6014  ,     360,5706    ,  93.93
     2^12,     71,72    ,      66,69      ,   4.04 (AMDv2)
 256*2^12,    228,891   ,     206,871     ,  76.76
 256*2^21,    254,721   ,     245,711     ,  65.65
     2^12,     69,87    ,      65,82      ,  20.20 (VTD second stage)
 256*2^12,    210,321   ,     200,315     ,  36.36
 256*2^21,    255,349   ,     238,342     ,  30.30
     2^12,     72,77    ,      68,74      ,   8.08 (ARM 64 bit)
 256*2^12,    521,357   ,     447,346     , -29.29
 256*2^21,    489,358   ,     433,345     , -25.25

 * Above numbers include additional patches to remove the iommu_pgsize() overheads. gcc 13.3.0, i7-12700

This version provides fairly consistent performance across formats. ARM unmap performance is quite different because this version supports contiguous pages and uses a very different algorithm for unmapping. Though why it is so much worse compared to AMDv1 I haven't figured out yet.

The per-format commits include a more detailed chart.
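The "min %" column can be cross-checked from the "min new,old ns" pair. The sketch below assumes the column is the relative saving of the new code over the old one, (old - new) / old, truncated to a whole percent. That formula is an assumption, not something the cover letter states, but its result agrees with the integer part of every row in the two tables. The fractional digits of the column are not reproduced. The helper name improvement_pct() is hypothetical and does not come from the series.

```c
#include <assert.h>

/*
 * Relative improvement of the new code over the old one, in whole percent.
 * Positive means the new code is faster. C integer division truncates
 * toward zero, so regressions come out as negative values.
 * Hypothetical helper for illustration only.
 */
static long improvement_pct(long new_ns, long old_ns)
{
	return (old_ns - new_ns) * 100 / old_ns;
}
```

For example, the AMDv1 4K map row (min 51 ns new, 63 ns old) gives 19, and the ARM 256*2^12 unmap row (447 ns new, 346 ns old) gives -29.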
There is a second branch:
   https://github.com/jgunthorpe/linux/commits/iommu_pt_all
Containing supporting work and future steps:
 - ARM short descriptor (32 bit), ARM long descriptor (64 bit) formats
 - RISCV format and RISCV conversion
   https://github.com/jgunthorpe/linux/commits/iommu_pt_riscv
 - Support for a DMA incoherent HW page table walker
 - VT-D second stage format and VT-D conversion
   https://github.com/jgunthorpe/linux/commits/iommu_pt_vtd
 - DART v1 & v2 format
 - Draft of an iommufd 'cut' operation to break down huge pages
 - A compare test that checks the iommupt formats against the iopgtable interface, including updating AMD to have a working iopgtable and patches to make VT-D have an iopgtable for testing.
 - A performance test to micro-benchmark map and unmap against iopgtable

My strategy is to go one by one for the drivers:
 - AMD driver conversion
 - RISCV page table and driver
 - Intel VT-D driver and VTDSS page table
 - Flushing improvements for RISCV
 - ARM SMMUv3

And concurrently work on the algorithm side:
 - debugfs content dump, like VT-D has
 - Cut support
 - Increase/Decrease page size support
 - map/unmap batching
 - KHO

As we make more algorithm improvements the value to convert the drivers increases.

This is on github: https://github.com/jgunthorpe/linux/commits/iommu_pt

v2:
 - Rebase on v6.16-rc2
 - s/PT_ENTRY_WORD_SIZE/PT_ITEM_WORD_SIZE/s to follow the language better
 - Comment and documentation updates
 - Add PT_TOP_PHYS_MASK to help manage alignment restrictions on the top pointer
 - Add missed force_aperture = true
 - Make pt_iommu_deinit() take care of the not-yet-inited error case internally as AMD/RISCV/VTD all shared this logic
 - Change gather_range() into gather_range_pages() so it also deals with the page list. This makes the following cache flushing series simpler
 - Fix missed update of unmap->unmapped in some error cases
 - Change clear_contig() to order the gather more logically
 - Remove goto from the error handling in __map_range_leaf()
 - s/log2_/oalog2_/ in places where the argument is an oaddr_t
 - Pass the pts to pt_table_install64/32()
 - Do not use SIGN_EXTEND for the AMDv2 page table because of Vasant's information on how PASID 0 works.
v1: https://patch.msgid.link/r/0-v2-5c26bde5c22d+58b-iommu_pt_jgg@nvidia.com
 - AMD driver only, many code changes
RFC: https://lore.kernel.org/all/0-v1-01fa10580981+1d-iommu_pt_jgg@nvidia.com/

Alejandro Jimenez (1):
  iommu/amd: Use the generic iommu page table

Jason Gunthorpe (14):
  genpt: Generic Page Table base API
  genpt: Add Documentation/ files
  iommupt: Add the basic structure of the iommu implementation
  iommupt: Add the AMD IOMMU v1 page table format
  iommupt: Add iova_to_phys op
  iommupt: Add unmap_pages op
  iommupt: Add map_pages op
  iommupt: Add read_and_clear_dirty op
  iommupt: Add a kunit test for Generic Page Table
  iommupt: Add a mock pagetable format for iommufd selftest to use
  iommufd: Change the selftest to use iommupt instead of xarray
  iommupt: Add the x86 64 bit page table format
  iommu/amd: Remove AMD io_pgtable support
  iommupt: Add a kunit test for the IOMMU implementation

 .clang-format | 1 +
 Documentation/driver-api/generic_pt.rst | 140 ++
 Documentation/driver-api/index.rst | 1 +
 drivers/iommu/Kconfig | 2 +
 drivers/iommu/Makefile | 1 +
 drivers/iommu/amd/Kconfig | 5 +-
 drivers/iommu/amd/Makefile | 2 +-
 drivers/iommu/amd/amd_iommu.h | 1 -
 drivers/iommu/amd/amd_iommu_types.h | 109 +-
 drivers/iommu/amd/io_pgtable.c | 560 --------
 drivers/iommu/amd/io_pgtable_v2.c | 370 ------
 drivers/iommu/amd/iommu.c | 516 ++++----
 drivers/iommu/generic_pt/.kunitconfig | 13 +
 drivers/iommu/generic_pt/Kconfig | 72 ++
 drivers/iommu/generic_pt/fmt/Makefile | 26 +
 drivers/iommu/generic_pt/fmt/amdv1.h | 409 ++++++
 drivers/iommu/generic_pt/fmt/defs_amdv1.h | 21 +
 drivers/iommu/generic_pt/fmt/defs_x86_64.h | 21 +
 drivers/iommu/generic_pt/fmt/iommu_amdv1.c | 15 +
 drivers/iommu/generic_pt/fmt/iommu_mock.c | 10 +
 drivers/iommu/generic_pt/fmt/iommu_template.h | 48 +
 drivers/iommu/generic_pt/fmt/iommu_x86_64.c | 11 +
 drivers/iommu/generic_pt/fmt/x86_64.h | 248 ++++
 drivers/iommu/generic_pt/iommu_pt.h | 1150 +++++++++++++++++
 drivers/iommu/generic_pt/kunit_generic_pt.h | 717 ++++++++++
 drivers/iommu/generic_pt/kunit_iommu.h | 183 +++
 drivers/iommu/generic_pt/kunit_iommu_pt.h | 451 +++++++
 drivers/iommu/generic_pt/pt_common.h | 354 +++++
 drivers/iommu/generic_pt/pt_defs.h | 323 +++++
 drivers/iommu/generic_pt/pt_fmt_defaults.h | 193 +++
 drivers/iommu/generic_pt/pt_iter.h | 640 +++++++++
 drivers/iommu/generic_pt/pt_log2.h | 130 ++
 drivers/iommu/io-pgtable.c | 4 -
 drivers/iommu/iommufd/Kconfig | 1 +
 drivers/iommu/iommufd/iommufd_test.h | 11 +-
 drivers/iommu/iommufd/selftest.c | 439 +++----
 include/linux/generic_pt/common.h | 166 +++
 include/linux/generic_pt/iommu.h | 270 ++++
 include/linux/io-pgtable.h | 2 -
 tools/testing/selftests/iommu/iommufd.c | 60 +-
 tools/testing/selftests/iommu/iommufd_utils.h | 12 +
 41 files changed, 6119 insertions(+), 1589 deletions(-)
 create mode 100644 Documentation/driver-api/generic_pt.rst
 delete mode 100644 drivers/iommu/amd/io_pgtable.c
 delete mode 100644 drivers/iommu/amd/io_pgtable_v2.c
 create mode 100644 drivers/iommu/generic_pt/.kunitconfig
 create mode 100644 drivers/iommu/generic_pt/Kconfig
 create mode 100644 drivers/iommu/generic_pt/fmt/Makefile
 create mode 100644 drivers/iommu/generic_pt/fmt/amdv1.h
 create mode 100644 drivers/iommu/generic_pt/fmt/defs_amdv1.h
 create mode 100644 drivers/iommu/generic_pt/fmt/defs_x86_64.h
 create mode 100644 drivers/iommu/generic_pt/fmt/iommu_amdv1.c
 create mode 100644 drivers/iommu/generic_pt/fmt/iommu_mock.c
 create mode 100644 drivers/iommu/generic_pt/fmt/iommu_template.h
 create mode 100644 drivers/iommu/generic_pt/fmt/iommu_x86_64.c
 create mode 100644 drivers/iommu/generic_pt/fmt/x86_64.h
 create mode 100644 drivers/iommu/generic_pt/iommu_pt.h
 create mode 100644 drivers/iommu/generic_pt/kunit_generic_pt.h
 create mode 100644 drivers/iommu/generic_pt/kunit_iommu.h
 create mode 100644 drivers/iommu/generic_pt/kunit_iommu_pt.h
 create mode 100644 drivers/iommu/generic_pt/pt_common.h
 create mode 100644 drivers/iommu/generic_pt/pt_defs.h
 create mode 100644 drivers/iommu/generic_pt/pt_fmt_defaults.h
 create mode 100644 drivers/iommu/generic_pt/pt_iter.h
 create mode 100644 drivers/iommu/generic_pt/pt_log2.h
 create mode 100644 include/linux/generic_pt/common.h
 create mode 100644 include/linux/generic_pt/iommu.h

base-commit: cd76b0248a38645a3e3f8ca4a48bffc591e9da19
--
2.43.0

1 week


[RFC PATCH v2 0/9] KVM: Enable Nested Virt selftests

by Ganapatrao Kulkarni

This patch series makes the selftests work with NV enabled. The guest code is run in vEL2 instead of EL1. We add a command line option to enable testing of NV. The NV tests are disabled by default.

Around 12 selftests are modified in this series.

Changes since v1:
 - Updated NV helper functions as per comments [1].
 - Modified existing test cases to run guest code in vEL2.

[1] https://lkml.iu.edu/hypermail/linux/kernel/2502.0/07001.html

Ganapatrao Kulkarni (9):
  KVM: arm64: nv: selftests: Add support to run guest code in vEL2.
  KVM: arm64: nv: selftests: Add simple test to run guest code in vEL2
  KVM: arm64: nv: selftests: Enable hypervisor timer tests to run in vEL2
  KVM: arm64: nv: selftests: enable aarch32_id_regs test to run in vEL2
  KVM: arm64: nv: selftests: Enable vgic tests to run in vEL2
  KVM: arm64: nv: selftests: Enable set_id_regs test to run in vEL2
  KVM: arm64: nv: selftests: Enable test to run in vEL2
  KVM: selftests: arm64: Extend kvm_page_table_test to run guest code in vEL2
  KVM: arm64: nv: selftests: Enable page_fault_test test to run in vEL2

 tools/testing/selftests/kvm/Makefile.kvm | 2 +
 tools/testing/selftests/kvm/arch_timer.c | 8 +-
 .../selftests/kvm/arm64/aarch32_id_regs.c | 34 ++++-
 .../testing/selftests/kvm/arm64/arch_timer.c | 118 +++++++++++++++---
 .../selftests/kvm/arm64/nv_guest_hypervisor.c | 68 ++++++++++
 .../selftests/kvm/arm64/page_fault_test.c | 35 +++++-
 .../testing/selftests/kvm/arm64/set_id_regs.c | 57 ++++++++-
 tools/testing/selftests/kvm/arm64/vgic_init.c | 54 +++++++-
 tools/testing/selftests/kvm/arm64/vgic_irq.c | 27 ++--
 .../selftests/kvm/arm64/vgic_lpi_stress.c | 19 ++-
 .../testing/selftests/kvm/guest_print_test.c | 32 +++++
 .../selftests/kvm/include/arm64/arch_timer.h | 16 +++
 .../kvm/include/arm64/kvm_util_arch.h | 3 +
 .../selftests/kvm/include/arm64/nv_util.h | 45 +++++++
 .../selftests/kvm/include/arm64/vgic.h | 1 +
 .../testing/selftests/kvm/include/kvm_util.h | 3 +
 .../selftests/kvm/include/timer_test.h | 1 +
 .../selftests/kvm/kvm_page_table_test.c | 30 ++++-
 tools/testing/selftests/kvm/lib/arm64/nv.c | 46 +++++++
 .../selftests/kvm/lib/arm64/processor.c | 61 ++++++---
 tools/testing/selftests/kvm/lib/arm64/vgic.c | 8 ++
 21 files changed, 604 insertions(+), 64 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/arm64/nv_guest_hypervisor.c
 create mode 100644 tools/testing/selftests/kvm/include/arm64/nv_util.h
 create mode 100644 tools/testing/selftests/kvm/lib/arm64/nv.c

--
2.48.1

1 week, 3 days


[PATCH v2 0/5] KVM: Improve VMware guest support

by Zack Rusin

This is the second version of a series that lets us run VMware Workstation on Linux on top of KVM.

The most significant change in this series is the introduction of CONFIG_KVM_VMWARE which is, in general, a nice cleanup for various bits of VMware compatibility code that have been scattered around KVM. (first patch)

The rest of the series builds upon the VMware platform to implement features that are needed to run VMware guests without any modifications on top of KVM:
 - ability to turn on the VMware backdoor at runtime on a per-vm basis (used to be a kernel boot argument only)
 - support for VMware hypercalls - VMware products have a huge collection of hypercalls, all of which are handled in userspace,
 - support for handling legacy VMware backdoor in L0 in nested configs - in cases where we have WS running a Windows VBS guest, the L0 would be KVM, L1 Hyper-V so by default VMware Tools backdoor calls end up in Hyper-V which cannot handle them, so introduce a cap to let L0 handle those.

The final change in the series is a kselftest of the VMware hypercall functionality.

Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Sean Christopherson <seanjc(a)google.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: x86(a)kernel.org
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Zack Rusin <zack.rusin(a)broadcom.com>
Cc: Doug Covelli <doug.covelli(a)broadcom.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Joel Stanley <joel(a)jms.id.au>
Cc: Isaku Yamahata <isaku.yamahata(a)intel.com>
Cc: kvm(a)vger.kernel.org
Cc: linux-doc(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-kselftest(a)vger.kernel.org

Zack Rusin (5):
  KVM: x86: Centralize KVM's VMware code
  KVM: x86: Allow enabling of the vmware backdoor via a cap
  KVM: x86: Add support for VMware guest specific hypercalls
  KVM: x86: Add support for legacy VMware backdoors in nested setups
  KVM: selftests: x86: Add a test for KVM_CAP_X86_VMWARE_HYPERCALL

 Documentation/virt/kvm/api.rst | 86 +++++++-
 MAINTAINERS | 9 +
 arch/x86/include/asm/kvm_host.h | 13 ++
 arch/x86/kvm/Kconfig | 16 ++
 arch/x86/kvm/Makefile | 1 +
 arch/x86/kvm/emulate.c | 11 +-
 arch/x86/kvm/kvm_vmware.c | 85 ++++++++
 arch/x86/kvm/kvm_vmware.h | 189 ++++++++++++++++++
 arch/x86/kvm/pmu.c | 39 +---
 arch/x86/kvm/pmu.h | 4 -
 arch/x86/kvm/svm/nested.c | 6 +
 arch/x86/kvm/svm/svm.c | 10 +-
 arch/x86/kvm/vmx/nested.c | 6 +
 arch/x86/kvm/vmx/vmx.c | 5 +-
 arch/x86/kvm/x86.c | 74 +++----
 arch/x86/kvm/x86.h | 2 -
 include/uapi/linux/kvm.h | 27 +++
 tools/include/uapi/linux/kvm.h | 3 +
 tools/testing/selftests/kvm/Makefile.kvm | 1 +
 .../selftests/kvm/x86/vmware_hypercall_test.c | 121 +++++++++++
 20 files changed, 614 insertions(+), 94 deletions(-)
 create mode 100644 arch/x86/kvm/kvm_vmware.c
 create mode 100644 arch/x86/kvm/kvm_vmware.h
 create mode 100644 tools/testing/selftests/kvm/x86/vmware_hypercall_test.c

--
2.48.1

1 week, 4 days


[PATCH/RFC] kunit/rtc: Add real support for very slow tests

by Geert Uytterhoeven

When running rtc_lib_test ("lib_test" before my "[PATCH] rtc: Rename lib_test to rtc_lib_test") on m68k/ARAnyM:

    KTAP version 1
    1..1
        KTAP version 1
        # Subtest: rtc_lib_test_cases
        # module: rtc_lib_test
        1..2
        # rtc_time64_to_tm_test_date_range_1000: Test should be marked slow (runtime: 3.222371420s)
        ok 1 rtc_time64_to_tm_test_date_range_1000
        # rtc_time64_to_tm_test_date_range_160000: try timed out
        # rtc_time64_to_tm_test_date_range_160000: test case timed out
        # rtc_time64_to_tm_test_date_range_160000.speed: slow
        not ok 2 rtc_time64_to_tm_test_date_range_160000
    # rtc_lib_test_cases: pass:1 fail:1 skip:0 total:2
    # Totals: pass:1 fail:1 skip:0 total:2
    not ok 1 rtc_lib_test_cases

Commit 02c2d0c2a84172c3 ("kunit: Add speed attribute") added the notion of "very slow" tests, but this is otherwise unused and unhandled.

Hence:
  1. Introduce KUNIT_CASE_VERY_SLOW(),
  2. Increase the timeout by a factor of ten; ideally this should only be done for very slow tests, but I couldn't find how to access kunit_case.attr.case from kunit_try_catch_run(),
  3. Mark rtc_time64_to_tm_test_date_range_1000 slow,
  4. Mark rtc_time64_to_tm_test_date_range_160000 very slow.
Afterwards:

    KTAP version 1
    1..1
        KTAP version 1
        # Subtest: rtc_lib_test_cases
        # module: rtc_lib_test
        1..2
        # rtc_time64_to_tm_test_date_range_1000.speed: slow
        ok 1 rtc_time64_to_tm_test_date_range_1000
        # rtc_time64_to_tm_test_date_range_160000.speed: very_slow
        ok 2 rtc_time64_to_tm_test_date_range_160000
    # rtc_lib_test_cases: pass:2 fail:0 skip:0 total:2
    # Totals: pass:2 fail:0 skip:0 total:2
    ok 1 rtc_lib_test_cases

Signed-off-by: Geert Uytterhoeven <geert(a)linux-m68k.org>
---
 drivers/rtc/rtc_lib_test.c |  4 ++--
 include/kunit/test.h       | 11 +++++++++++
 lib/kunit/try-catch.c      |  3 ++-
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/rtc/rtc_lib_test.c b/drivers/rtc/rtc_lib_test.c
index c30c759662e39b48..fd3210e39d37dbc6 100644
--- a/drivers/rtc/rtc_lib_test.c
+++ b/drivers/rtc/rtc_lib_test.c
@@ -85,8 +85,8 @@ static void rtc_time64_to_tm_test_date_range_1000(struct kunit *test)
 }

 static struct kunit_case rtc_lib_test_cases[] = {
-	KUNIT_CASE(rtc_time64_to_tm_test_date_range_1000),
-	KUNIT_CASE_SLOW(rtc_time64_to_tm_test_date_range_160000),
+	KUNIT_CASE_SLOW(rtc_time64_to_tm_test_date_range_1000),
+	KUNIT_CASE_VERY_SLOW(rtc_time64_to_tm_test_date_range_160000),
 	{}
 };

diff --git a/include/kunit/test.h b/include/kunit/test.h
index 9b773406e01f3c43..4e3c1cae5b41466e 100644
--- a/include/kunit/test.h
+++ b/include/kunit/test.h
@@ -183,6 +183,17 @@ static inline char *kunit_status_to_ok_not_ok(enum kunit_status status)
 		{ .run_case = test_name, .name = #test_name,	\
 		  .attr.speed = KUNIT_SPEED_SLOW, .module_name = KBUILD_MODNAME}

+/**
+ * KUNIT_CASE_VERY_SLOW - A helper for creating a &struct kunit_case
+ * with the very slow attribute
+ *
+ * @test_name: a reference to a test case function.
+ */
+
+#define KUNIT_CASE_VERY_SLOW(test_name)			\
+		{ .run_case = test_name, .name = #test_name,	\
+		  .attr.speed = KUNIT_SPEED_VERY_SLOW, .module_name = KBUILD_MODNAME}
+
 /**
  * KUNIT_CASE_PARAM - A helper for creation a parameterized &struct kunit_case
  *
diff --git a/lib/kunit/try-catch.c b/lib/kunit/try-catch.c
index 6bbe0025b0790bd2..92099c67bb21d0a4 100644
--- a/lib/kunit/try-catch.c
+++ b/lib/kunit/try-catch.c
@@ -56,7 +56,8 @@ static unsigned long kunit_test_timeout(void)
 	 * If tests timeout due to exceeding sysctl_hung_task_timeout_secs,
 	 * the task will be killed and an oops generated.
 	 */
-	return 300 * msecs_to_jiffies(MSEC_PER_SEC); /* 5 min */
+	// FIXME times ten for KUNIT_SPEED_VERY_SLOW?
+	return 10 * 300 * msecs_to_jiffies(MSEC_PER_SEC); /* 5 min */
 }

 void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
--
2.43.0
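The FIXME in the last hunk asks whether the factor of ten should apply only to very slow cases. A minimal userspace sketch of that idea follows, assuming the speed of the running case were available where the timeout is computed (the patch description says it is not, from kunit_try_catch_run()). The enum, the macro and the function name are hypothetical stand-ins, not KUnit's definitions, and the values are in seconds rather than jiffies.

```c
#include <assert.h>

/* Hypothetical stand-ins for the KUnit speed attribute values. */
enum sketch_speed {
	SKETCH_SPEED_NORMAL,
	SKETCH_SPEED_SLOW,
	SKETCH_SPEED_VERY_SLOW,
};

/* The 5 minute default that the patch multiplies unconditionally. */
#define SKETCH_BASE_TIMEOUT_SECS 300UL

/*
 * Scale the timeout only for very slow cases. The factor 10 follows the
 * FIXME in the patch; normal and slow cases keep the default.
 */
static unsigned long sketch_timeout_secs(enum sketch_speed speed)
{
	unsigned long mult = (speed == SKETCH_SPEED_VERY_SLOW) ? 10 : 1;

	return mult * SKETCH_BASE_TIMEOUT_SECS;
}
```

With this shape a normal or slow case would still time out after 300 seconds, and only a case marked very slow would get 3000 seconds.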

1 week, 4 days


[RESEND PATCH] selftests/pidfd: align stack to fix SP alignment exception

by Shuai Xue

The pidfd_test fails on the ARM64 platform with the following error:

    Bail out! pidfd_poll check for premature notification on child thread exec test: Failed

When exception-trace is enabled, the kernel logs the details:

    # echo 1 > /proc/sys/debug/exception-trace
    # dmesg | tail -n 20
    [48628.713023] pidfd_test[1082142]: unhandled exception: SP Alignment, ESR 0x000000009a000000, SP/PC alignment exception in pidfd_test[400000+4000]
    [48628.713049] CPU: 21 PID: 1082142 Comm: pidfd_test Kdump: loaded Tainted: G W E 6.6.71-3_rc1.al8.aarch64 #1
    [48628.713051] Hardware name: AlibabaCloud AliServer-Xuanwu2.0AM-1UC1P-5B/AS1111MG1, BIOS 1.2.M1.AL.P.157.00 07/29/2023
    [48628.713053] pstate: 60001800 (nZCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=-c)
    [48628.713055] pc : 0000000000402100
    [48628.713056] lr : 0000ffff98288f9c
    [48628.713056] sp : 0000ffffde49daa8
    [48628.713057] x29: 0000000000000000 x28: 0000000000000000 x27: 0000000000000000
    [48628.713060] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
    [48628.713062] x23: 0000000000000000 x22: 0000000000000000 x21: 0000000000400e80
    [48628.713065] x20: 0000000000000000 x19: 0000000000402650 x18: 0000000000000000
    [48628.713067] x17: 00000000004200d8 x16: 0000ffff98288f40 x15: 0000ffffde49b92c
    [48628.713070] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
    [48628.713072] x11: 0000000000001011 x10: 0000000000402100 x9 : 0000000000000010
    [48628.713074] x8 : 00000000000000dc x7 : 3861616239346564 x6 : 000000000000000a
    [48628.713077] x5 : 0000ffffde49daa8 x4 : 000000000000000a x3 : 0000ffffde49daa8
    [48628.713079] x2 : 0000ffffde49dadc x1 : 0000ffffde49daa8 x0 : 0000000000000000

According to ARM ARM D1.3.10.2 SP alignment checking:

> When the SP is used as the base address of a calculation, regardless of
> any offset applied by the instruction, if bits [3:0] of the SP are not
> 0b0000, there is a misaligned SP.

To fix it, align the stack to 16 bytes.
Signed-off-by: Shuai Xue <xueshuai(a)linux.alibaba.com>
---
 tools/testing/selftests/pidfd/pidfd_test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/pidfd/pidfd_test.c b/tools/testing/selftests/pidfd/pidfd_test.c
index c081ae91313a..ec161a7c3ff9 100644
--- a/tools/testing/selftests/pidfd/pidfd_test.c
+++ b/tools/testing/selftests/pidfd/pidfd_test.c
@@ -33,7 +33,7 @@ static bool have_pidfd_send_signal;
 static pid_t pidfd_clone(int flags, int *pidfd, int (*fn)(void *))
 {
 	size_t stack_size = 1024;
-	char *stack[1024] = { 0 };
+	char *stack[1024] __attribute__((aligned(16))) = {0};

 #ifdef __ia64__
 	return __clone2(fn, stack, stack_size, flags | SIGCHLD, NULL, pidfd);
--
2.39.3
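The alignment rule quoted from the ARM ARM can be checked with plain bit arithmetic. The sketch below is illustrative only and is not part of the patch; the helper names and the SKETCH_SP_ALIGN macro are hypothetical. It tests bits [3:0] of a stack pointer value and rounds a value down to the previous 16-byte boundary.

```c
#include <assert.h>
#include <stdint.h>

/* AArch64 requires bits [3:0] of SP to be zero, i.e. 16-byte alignment. */
#define SKETCH_SP_ALIGN 16u

/* Return nonzero if the value would pass the SP alignment check. */
static int sp_is_aligned(uint64_t sp)
{
	return (sp & (SKETCH_SP_ALIGN - 1)) == 0;
}

/* Round a stack address down to the previous 16-byte boundary. */
static uint64_t sp_align_down(uint64_t sp)
{
	return sp & ~(uint64_t)(SKETCH_SP_ALIGN - 1);
}
```

Applied to the sp value from the register dump, 0x0000ffffde49daa8, the low four bits are 0b1000, so the check fails. Rounding down gives 0x0000ffffde49daa0, which passes.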

1 week, 6 days

