These patches are based on kvm/next, and are also available at:
https://github.com/mdroth/linux/commits/sev-selftests-ucall-rfc1
== BACKGROUND ==
These patches are a prerequisite for adding selftest support for SEV guests and possibly other confidential computing implementations in the future. They were motivated by a suggestion Paolo made in response to the initial SEV selftest RFC:
https://lore.kernel.org/lkml/20211025035833.yqphcnf5u3lk4zgg@amd.com/T/#m959...
Since the changes touch multiple architectures and ended up creating more churn than expected, I thought it would be a good idea to carve them out into a separate standalone series for reviewers who may be more interested in the ucall changes than anything SEV-related.
To summarize, x86 relies on a ucall implementation that uses PIO instructions to generate an exit to userspace and provide the GVA of a dynamically-allocated ucall struct that resides in guest memory and contains information about how to handle/interpret the exit. This doesn't work for SEV guests for 3 main reasons:
1) The guest memory is generally encrypted at run-time, so the guest needs to ensure the ucall struct is allocated in shared memory.

2) The guest page tables are also encrypted, so the address would need to be a GPA instead of a GVA.

3) The guest vCPU registers may also be encrypted in the case of SEV-ES/SEV-SNP, so the approach of examining vCPU register state has additional requirements, such as requiring guest code to implement a #VC handler that can provide the appropriate registers via a vmgexit.
To address these issues, the SEV selftest RFC1 patchset introduced a set of new SEV-specific interfaces that closely mirrored the functionality of ucall()/get_ucall(), but relied on a pre-allocated/static ucall buffer in shared guest memory, so that guest code could pass messages/state to the host by simply writing to this pre-arranged shared memory region and then generating an exit to userspace (via a halt instruction).
Paolo suggested instead implementing support for test/guest-specific ucall implementations that could be used as an alternative to the default PIO-based ucall implementation as needed, based on test/guest requirements, while still allowing tests to use a common set of interfaces like ucall()/get_ucall().
== OVERVIEW ==
This series implements the above functionality by introducing a new ucall_ops struct that can be used to register a particular ucall implementation as needed, then re-implements the x86/arm64/s390x ucalls in terms of ucall_ops.
But for the purposes of introducing a new ucall_ops implementation appropriate for SEV, there are a couple of issues that resulted in the need for some additional ucall interfaces as well:
a) ucall() doesn't take a pointer to the ucall struct it modifies, so to make it work with an implementation that relies on a pre-allocated ucall struct in shared guest memory, some sort of global lookup functionality would be needed to locate the appropriate ucall struct for a particular VM/vCPU combination, and this would need to be accessible to the guest as well. Guests would then need some way of determining which VM/vCPU identifiers to use for the lookup, which, to do reliably, would likely require seeding the guest with those identifiers in advance. That is possible, but it is much more easily achievable by simply adding a ucall() alternative that accepts a pointer to the ucall struct for that particular VM/vCPU.
b) get_ucall() *does* take a pointer to a ucall struct, but it currently zeroes that struct out and copies the guest's ucall struct into it. It *could* be re-purposed to handle the case where the pointer is an actual pointer to the ucall struct in shared guest memory, but that could cause problems, since callers would need some idea of what the underlying ucall implementation expects. Ideally the interfaces would be agnostic to the ucall implementation.
So to address those issues, this series also allows ucall implementations to optionally be extended to support a set of 'shared' ops that are used in the following manner:
  host:
    uc_gva = ucall_shared_alloc()
    setup_vm_args(vm, uc_gva)

  guest:
    ucall_shared(uc_gva, ...)

  host:
    get_ucall_shared(uc_gva, ...)
and then implements a new ucall implementation, ucall_ops_halt, based around these shared interfaces and halt instructions.
While this doesn't really meet the initial goal of re-using the existing ucall interfaces as-is, the hope is that these *_shared interfaces are general enough to be re-usable for things other than SEV, or at least improve on code readability over the initial SEV-specific interfaces.
Any review/comments are greatly appreciated!
----------------------------------------------------------------
Michael Roth (10):
  kvm: selftests: move base kvm_util.h declarations to kvm_util_base.h
  kvm: selftests: move ucall declarations into ucall_common.h
  kvm: selftests: introduce ucall_ops for test/arch-specific ucall implementations
  kvm: arm64: selftests: use ucall_ops to define default ucall implementation (COMPILE-TESTED ONLY)
  kvm: s390: selftests: use ucall_ops to define default ucall implementation
  kvm: selftests: add ucall interfaces based around shared memory
  kvm: selftests: add ucall_shared ops for PIO
  kvm: selftests: introduce ucall implementation based on halt instructions
  kvm: selftests: add GUEST_SHARED_* macros for shared ucall implementations
  kvm: selftests: add ucall_test to test various ucall functionality
 tools/testing/selftests/kvm/.gitignore             |   1 +
 tools/testing/selftests/kvm/Makefile               |   5 +-
 .../testing/selftests/kvm/include/aarch64/ucall.h  |  18 +
 tools/testing/selftests/kvm/include/kvm_util.h     | 408 +--------
 .../testing/selftests/kvm/include/kvm_util_base.h  | 368 +++++++++++++++++++
 tools/testing/selftests/kvm/include/s390x/ucall.h  |  18 +
 tools/testing/selftests/kvm/include/ucall_common.h | 147 ++++++++
 tools/testing/selftests/kvm/include/x86_64/ucall.h |  19 +
 tools/testing/selftests/kvm/lib/aarch64/ucall.c    |  43 +--
 tools/testing/selftests/kvm/lib/s390x/ucall.c      |  45 +--
 tools/testing/selftests/kvm/lib/ucall_common.c     | 133 +++++++
 tools/testing/selftests/kvm/lib/x86_64/ucall.c     |  82 +++--
 tools/testing/selftests/kvm/ucall_test.c           | 182 +++++++++
 13 files changed, 982 insertions(+), 487 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/aarch64/ucall.h
 create mode 100644 tools/testing/selftests/kvm/include/kvm_util_base.h
 create mode 100644 tools/testing/selftests/kvm/include/s390x/ucall.h
 create mode 100644 tools/testing/selftests/kvm/include/ucall_common.h
 create mode 100644 tools/testing/selftests/kvm/include/x86_64/ucall.h
 create mode 100644 tools/testing/selftests/kvm/lib/ucall_common.c
 create mode 100644 tools/testing/selftests/kvm/ucall_test.c
Between the helper macros and interfaces that will be introduced in subsequent patches, much of kvm_util.h would end up being declarations specific to ucall. Ideally these could be separated out into a separate header, since they are not strictly required for writing guest tests and are mostly self-contained interfaces, other than a reliance on a few core declarations like struct kvm_vm. This doesn't make a big difference in how tests are compiled/written, since all these interfaces will still be packaged up into a single/common libkvm.a used by all tests, but it is still nice to be able to compartmentalize, both to improve readability and to reduce future merge conflicts for common tasks like adding new interfaces to kvm_util.h.
Furthermore, some of the ucall declarations will be arch-specific, requiring various #ifdef'ery in kvm_util.h. Ideally these declarations could live in separate arch-specific headers, e.g. include/<arch>/ucall.h, which would handle arch-specific declarations as well as pulling in common ucall-related declarations shared by all archs.
One simple way to do this would be to #include ucall.h at the bottom of kvm_util.h, after declarations it relies upon like struct kvm_vm. This is brittle however, and doesn't scale easily to other sets of interfaces that may be added in the future.
Instead, move all declarations currently in kvm_util.h into kvm_util_base.h, then have kvm_util.h #include it. With this change, non-base declarations can be selectively moved/introduced into separate headers, which can then be included in kvm_util.h so that individual tests don't need to be touched. Subsequent patches will then move ucall-related declarations into a separate header to meet the above goals.
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 .../testing/selftests/kvm/include/kvm_util.h  | 407 +----
 .../selftests/kvm/include/kvm_util_base.h     | 417 ++++++++++++++++++
 2 files changed, 418 insertions(+), 406 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/kvm_util_base.h
diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index 6a1a37f30494..c860ced3888d 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -7,411 +7,6 @@
 #ifndef SELFTEST_KVM_UTIL_H
 #define SELFTEST_KVM_UTIL_H
-#include "test_util.h" - -#include "asm/kvm.h" -#include "linux/list.h" -#include "linux/kvm.h" -#include <sys/ioctl.h> - -#include "sparsebit.h" - -#define KVM_DEV_PATH "/dev/kvm" -#define KVM_MAX_VCPUS 512 - -#define NSEC_PER_SEC 1000000000L - -/* - * Callers of kvm_util only have an incomplete/opaque description of the - * structure kvm_util is using to maintain the state of a VM. - */ -struct kvm_vm; - -typedef uint64_t vm_paddr_t; /* Virtual Machine (Guest) physical address */ -typedef uint64_t vm_vaddr_t; /* Virtual Machine (Guest) virtual address */ - -/* Minimum allocated guest virtual and physical addresses */ -#define KVM_UTIL_MIN_VADDR 0x2000 -#define KVM_GUEST_PAGE_TABLE_MIN_PADDR 0x180000 - -#define DEFAULT_GUEST_PHY_PAGES 512 -#define DEFAULT_GUEST_STACK_VADDR_MIN 0xab6000 -#define DEFAULT_STACK_PGS 5 - -enum vm_guest_mode { - VM_MODE_P52V48_4K, - VM_MODE_P52V48_64K, - VM_MODE_P48V48_4K, - VM_MODE_P48V48_64K, - VM_MODE_P40V48_4K, - VM_MODE_P40V48_64K, - VM_MODE_PXXV48_4K, /* For 48bits VA but ANY bits PA */ - VM_MODE_P47V64_4K, - VM_MODE_P44V64_4K, - NUM_VM_MODES, -}; - -#if defined(__aarch64__) - -#define VM_MODE_DEFAULT VM_MODE_P40V48_4K -#define MIN_PAGE_SHIFT 12U -#define ptes_per_page(page_size) ((page_size) / 8) - -#elif defined(__x86_64__) - -#define VM_MODE_DEFAULT VM_MODE_PXXV48_4K -#define MIN_PAGE_SHIFT 12U -#define ptes_per_page(page_size) ((page_size) / 8) - -#elif defined(__s390x__) - -#define VM_MODE_DEFAULT VM_MODE_P44V64_4K -#define MIN_PAGE_SHIFT 12U -#define ptes_per_page(page_size) ((page_size) / 16) - -#endif - -#define MIN_PAGE_SIZE (1U << MIN_PAGE_SHIFT) -#define PTES_PER_MIN_PAGE ptes_per_page(MIN_PAGE_SIZE) - -struct vm_guest_mode_params { - unsigned int pa_bits; - unsigned int va_bits; - unsigned int page_size; - unsigned int page_shift; -}; -extern const struct vm_guest_mode_params vm_guest_mode_params[]; - -int open_path_or_exit(const char *path, int flags); -int open_kvm_dev_path_or_exit(void); -int kvm_check_cap(long 
cap); -int vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap); -int vcpu_enable_cap(struct kvm_vm *vm, uint32_t vcpu_id, - struct kvm_enable_cap *cap); -void vm_enable_dirty_ring(struct kvm_vm *vm, uint32_t ring_size); -const char *vm_guest_mode_string(uint32_t i); - -struct kvm_vm *vm_create(enum vm_guest_mode mode, uint64_t phy_pages, int perm); -void kvm_vm_free(struct kvm_vm *vmp); -void kvm_vm_restart(struct kvm_vm *vmp, int perm); -void kvm_vm_release(struct kvm_vm *vmp); -void kvm_vm_get_dirty_log(struct kvm_vm *vm, int slot, void *log); -void kvm_vm_clear_dirty_log(struct kvm_vm *vm, int slot, void *log, - uint64_t first_page, uint32_t num_pages); -uint32_t kvm_vm_reset_dirty_ring(struct kvm_vm *vm); - -int kvm_memcmp_hva_gva(void *hva, struct kvm_vm *vm, const vm_vaddr_t gva, - size_t len); - -void kvm_vm_elf_load(struct kvm_vm *vm, const char *filename); - -void vm_dump(FILE *stream, struct kvm_vm *vm, uint8_t indent); - -/* - * VM VCPU Dump - * - * Input Args: - * stream - Output FILE stream - * vm - Virtual Machine - * vcpuid - VCPU ID - * indent - Left margin indent amount - * - * Output Args: None - * - * Return: None - * - * Dumps the current state of the VCPU specified by @vcpuid, within the VM - * given by @vm, to the FILE stream given by @stream. 
- */ -void vcpu_dump(FILE *stream, struct kvm_vm *vm, uint32_t vcpuid, - uint8_t indent); - -void vm_create_irqchip(struct kvm_vm *vm); - -void vm_userspace_mem_region_add(struct kvm_vm *vm, - enum vm_mem_backing_src_type src_type, - uint64_t guest_paddr, uint32_t slot, uint64_t npages, - uint32_t flags); - -void vcpu_ioctl(struct kvm_vm *vm, uint32_t vcpuid, unsigned long ioctl, - void *arg); -int _vcpu_ioctl(struct kvm_vm *vm, uint32_t vcpuid, unsigned long ioctl, - void *arg); -void vm_ioctl(struct kvm_vm *vm, unsigned long ioctl, void *arg); -int _vm_ioctl(struct kvm_vm *vm, unsigned long cmd, void *arg); -void kvm_ioctl(struct kvm_vm *vm, unsigned long ioctl, void *arg); -int _kvm_ioctl(struct kvm_vm *vm, unsigned long ioctl, void *arg); -void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags); -void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa); -void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot); -void vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpuid); -vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min); -vm_vaddr_t vm_vaddr_alloc_pages(struct kvm_vm *vm, int nr_pages); -vm_vaddr_t vm_vaddr_alloc_page(struct kvm_vm *vm); - -void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, - unsigned int npages); -void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa); -void *addr_gva2hva(struct kvm_vm *vm, vm_vaddr_t gva); -vm_paddr_t addr_hva2gpa(struct kvm_vm *vm, void *hva); -void *addr_gpa2alias(struct kvm_vm *vm, vm_paddr_t gpa); - -/* - * Address Guest Virtual to Guest Physical - * - * Input Args: - * vm - Virtual Machine - * gva - VM virtual address - * - * Output Args: None - * - * Return: - * Equivalent VM physical address - * - * Returns the VM physical address of the translated VM virtual - * address given by @gva. 
- */ -vm_paddr_t addr_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva); - -struct kvm_run *vcpu_state(struct kvm_vm *vm, uint32_t vcpuid); -void vcpu_run(struct kvm_vm *vm, uint32_t vcpuid); -int _vcpu_run(struct kvm_vm *vm, uint32_t vcpuid); -int vcpu_get_fd(struct kvm_vm *vm, uint32_t vcpuid); -void vcpu_run_complete_io(struct kvm_vm *vm, uint32_t vcpuid); -void vcpu_set_guest_debug(struct kvm_vm *vm, uint32_t vcpuid, - struct kvm_guest_debug *debug); -void vcpu_set_mp_state(struct kvm_vm *vm, uint32_t vcpuid, - struct kvm_mp_state *mp_state); -struct kvm_reg_list *vcpu_get_reg_list(struct kvm_vm *vm, uint32_t vcpuid); -void vcpu_regs_get(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_regs *regs); -void vcpu_regs_set(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_regs *regs); - -/* - * VM VCPU Args Set - * - * Input Args: - * vm - Virtual Machine - * vcpuid - VCPU ID - * num - number of arguments - * ... - arguments, each of type uint64_t - * - * Output Args: None - * - * Return: None - * - * Sets the first @num function input registers of the VCPU with @vcpuid, - * per the C calling convention of the architecture, to the values given - * as variable args. Each of the variable args is expected to be of type - * uint64_t. The maximum @num can be is specific to the architecture. 
- */ -void vcpu_args_set(struct kvm_vm *vm, uint32_t vcpuid, unsigned int num, ...); - -void vcpu_sregs_get(struct kvm_vm *vm, uint32_t vcpuid, - struct kvm_sregs *sregs); -void vcpu_sregs_set(struct kvm_vm *vm, uint32_t vcpuid, - struct kvm_sregs *sregs); -int _vcpu_sregs_set(struct kvm_vm *vm, uint32_t vcpuid, - struct kvm_sregs *sregs); -void vcpu_fpu_get(struct kvm_vm *vm, uint32_t vcpuid, - struct kvm_fpu *fpu); -void vcpu_fpu_set(struct kvm_vm *vm, uint32_t vcpuid, - struct kvm_fpu *fpu); -void vcpu_get_reg(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_one_reg *reg); -void vcpu_set_reg(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_one_reg *reg); -#ifdef __KVM_HAVE_VCPU_EVENTS -void vcpu_events_get(struct kvm_vm *vm, uint32_t vcpuid, - struct kvm_vcpu_events *events); -void vcpu_events_set(struct kvm_vm *vm, uint32_t vcpuid, - struct kvm_vcpu_events *events); -#endif -#ifdef __x86_64__ -void vcpu_nested_state_get(struct kvm_vm *vm, uint32_t vcpuid, - struct kvm_nested_state *state); -int vcpu_nested_state_set(struct kvm_vm *vm, uint32_t vcpuid, - struct kvm_nested_state *state, bool ignore_error); -#endif -void *vcpu_map_dirty_ring(struct kvm_vm *vm, uint32_t vcpuid); - -int _kvm_device_check_attr(int dev_fd, uint32_t group, uint64_t attr); -int kvm_device_check_attr(int dev_fd, uint32_t group, uint64_t attr); -int _kvm_create_device(struct kvm_vm *vm, uint64_t type, bool test, int *fd); -int kvm_create_device(struct kvm_vm *vm, uint64_t type, bool test); -int _kvm_device_access(int dev_fd, uint32_t group, uint64_t attr, - void *val, bool write); -int kvm_device_access(int dev_fd, uint32_t group, uint64_t attr, - void *val, bool write); - -int _vcpu_has_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group, - uint64_t attr); -int vcpu_has_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group, - uint64_t attr); -int _vcpu_access_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group, - uint64_t attr, void *val, bool write); -int 
vcpu_access_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group, - uint64_t attr, void *val, bool write); - -const char *exit_reason_str(unsigned int exit_reason); - -void virt_pgd_alloc(struct kvm_vm *vm); - -/* - * VM Virtual Page Map - * - * Input Args: - * vm - Virtual Machine - * vaddr - VM Virtual Address - * paddr - VM Physical Address - * memslot - Memory region slot for new virtual translation tables - * - * Output Args: None - * - * Return: None - * - * Within @vm, creates a virtual translation for the page starting - * at @vaddr to the page starting at @paddr. - */ -void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr); - -vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min, - uint32_t memslot); -vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, - vm_paddr_t paddr_min, uint32_t memslot); -vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm); - -/* - * Create a VM with reasonable defaults - * - * Input Args: - * vcpuid - The id of the single VCPU to add to the VM. - * extra_mem_pages - The number of extra pages to add (this will - * decide how much extra space we will need to - * setup the page tables using memslot 0) - * guest_code - The vCPU's entry point - * - * Output Args: None - * - * Return: - * Pointer to opaque structure that describes the created VM. 
- */ -struct kvm_vm *vm_create_default(uint32_t vcpuid, uint64_t extra_mem_pages, - void *guest_code); - -/* Same as vm_create_default, but can be used for more than one vcpu */ -struct kvm_vm *vm_create_default_with_vcpus(uint32_t nr_vcpus, uint64_t extra_mem_pages, - uint32_t num_percpu_pages, void *guest_code, - uint32_t vcpuids[]); - -/* Like vm_create_default_with_vcpus, but accepts mode and slot0 memory as a parameter */ -struct kvm_vm *vm_create_with_vcpus(enum vm_guest_mode mode, uint32_t nr_vcpus, - uint64_t slot0_mem_pages, uint64_t extra_mem_pages, - uint32_t num_percpu_pages, void *guest_code, - uint32_t vcpuids[]); - -/* - * Adds a vCPU with reasonable defaults (e.g. a stack) - * - * Input Args: - * vm - Virtual Machine - * vcpuid - The id of the VCPU to add to the VM. - * guest_code - The vCPU's entry point - */ -void vm_vcpu_add_default(struct kvm_vm *vm, uint32_t vcpuid, void *guest_code); - -bool vm_is_unrestricted_guest(struct kvm_vm *vm); - -unsigned int vm_get_page_size(struct kvm_vm *vm); -unsigned int vm_get_page_shift(struct kvm_vm *vm); -uint64_t vm_get_max_gfn(struct kvm_vm *vm); -int vm_get_fd(struct kvm_vm *vm); - -unsigned int vm_calc_num_guest_pages(enum vm_guest_mode mode, size_t size); -unsigned int vm_num_host_pages(enum vm_guest_mode mode, unsigned int num_guest_pages); -unsigned int vm_num_guest_pages(enum vm_guest_mode mode, unsigned int num_host_pages); -static inline unsigned int -vm_adjust_num_guest_pages(enum vm_guest_mode mode, unsigned int num_guest_pages) -{ - unsigned int n; - n = vm_num_guest_pages(mode, vm_num_host_pages(mode, num_guest_pages)); -#ifdef __s390x__ - /* s390 requires 1M aligned guest sizes */ - n = (n + 255) & ~255; -#endif - return n; -} - -struct kvm_userspace_memory_region * -kvm_userspace_memory_region_find(struct kvm_vm *vm, uint64_t start, - uint64_t end); - -struct kvm_dirty_log * -allocate_kvm_dirty_log(struct kvm_userspace_memory_region *region); - -int vm_create_device(struct kvm_vm *vm, struct 
kvm_create_device *cd); - -#define sync_global_to_guest(vm, g) ({ \ - typeof(g) *_p = addr_gva2hva(vm, (vm_vaddr_t)&(g)); \ - memcpy(_p, &(g), sizeof(g)); \ -}) - -#define sync_global_from_guest(vm, g) ({ \ - typeof(g) *_p = addr_gva2hva(vm, (vm_vaddr_t)&(g)); \ - memcpy(&(g), _p, sizeof(g)); \ -}) - -void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid); - -/* Common ucalls */ -enum { - UCALL_NONE, - UCALL_SYNC, - UCALL_ABORT, - UCALL_DONE, - UCALL_UNHANDLED, -}; - -#define UCALL_MAX_ARGS 6 - -struct ucall { - uint64_t cmd; - uint64_t args[UCALL_MAX_ARGS]; -}; - -void ucall_init(struct kvm_vm *vm, void *arg); -void ucall_uninit(struct kvm_vm *vm); -void ucall(uint64_t cmd, int nargs, ...); -uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc); - -#define GUEST_SYNC_ARGS(stage, arg1, arg2, arg3, arg4) \ - ucall(UCALL_SYNC, 6, "hello", stage, arg1, arg2, arg3, arg4) -#define GUEST_SYNC(stage) ucall(UCALL_SYNC, 2, "hello", stage) -#define GUEST_DONE() ucall(UCALL_DONE, 0) -#define __GUEST_ASSERT(_condition, _condstr, _nargs, _args...) 
do { \ - if (!(_condition)) \ - ucall(UCALL_ABORT, 2 + _nargs, \ - "Failed guest assert: " \ - _condstr, __LINE__, _args); \ -} while (0) - -#define GUEST_ASSERT(_condition) \ - __GUEST_ASSERT(_condition, #_condition, 0, 0) - -#define GUEST_ASSERT_1(_condition, arg1) \ - __GUEST_ASSERT(_condition, #_condition, 1, (arg1)) - -#define GUEST_ASSERT_2(_condition, arg1, arg2) \ - __GUEST_ASSERT(_condition, #_condition, 2, (arg1), (arg2)) - -#define GUEST_ASSERT_3(_condition, arg1, arg2, arg3) \ - __GUEST_ASSERT(_condition, #_condition, 3, (arg1), (arg2), (arg3)) - -#define GUEST_ASSERT_4(_condition, arg1, arg2, arg3, arg4) \ - __GUEST_ASSERT(_condition, #_condition, 4, (arg1), (arg2), (arg3), (arg4)) - -#define GUEST_ASSERT_EQ(a, b) __GUEST_ASSERT((a) == (b), #a " == " #b, 2, a, b) - -int vm_get_stats_fd(struct kvm_vm *vm); -int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid); - -uint32_t guest_get_vcpuid(void); +#include "kvm_util_base.h"
#endif /* SELFTEST_KVM_UTIL_H */ diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h new file mode 100644 index 000000000000..8fb6aeff5469 --- /dev/null +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h @@ -0,0 +1,417 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * tools/testing/selftests/kvm/include/kvm_util_base.h + * + * Copyright (C) 2018, Google LLC. + */ +#ifndef SELFTEST_KVM_UTIL_BASE_H +#define SELFTEST_KVM_UTIL_BASE_H + +#include "test_util.h" + +#include "asm/kvm.h" +#include "linux/list.h" +#include "linux/kvm.h" +#include <sys/ioctl.h> + +#include "sparsebit.h" + +#define KVM_DEV_PATH "/dev/kvm" +#define KVM_MAX_VCPUS 512 + +#define NSEC_PER_SEC 1000000000L + +/* + * Callers of kvm_util only have an incomplete/opaque description of the + * structure kvm_util is using to maintain the state of a VM. + */ +struct kvm_vm; + +typedef uint64_t vm_paddr_t; /* Virtual Machine (Guest) physical address */ +typedef uint64_t vm_vaddr_t; /* Virtual Machine (Guest) virtual address */ + +/* Minimum allocated guest virtual and physical addresses */ +#define KVM_UTIL_MIN_VADDR 0x2000 +#define KVM_GUEST_PAGE_TABLE_MIN_PADDR 0x180000 + +#define DEFAULT_GUEST_PHY_PAGES 512 +#define DEFAULT_GUEST_STACK_VADDR_MIN 0xab6000 +#define DEFAULT_STACK_PGS 5 + +enum vm_guest_mode { + VM_MODE_P52V48_4K, + VM_MODE_P52V48_64K, + VM_MODE_P48V48_4K, + VM_MODE_P48V48_64K, + VM_MODE_P40V48_4K, + VM_MODE_P40V48_64K, + VM_MODE_PXXV48_4K, /* For 48bits VA but ANY bits PA */ + VM_MODE_P47V64_4K, + VM_MODE_P44V64_4K, + NUM_VM_MODES, +}; + +#if defined(__aarch64__) + +#define VM_MODE_DEFAULT VM_MODE_P40V48_4K +#define MIN_PAGE_SHIFT 12U +#define ptes_per_page(page_size) ((page_size) / 8) + +#elif defined(__x86_64__) + +#define VM_MODE_DEFAULT VM_MODE_PXXV48_4K +#define MIN_PAGE_SHIFT 12U +#define ptes_per_page(page_size) ((page_size) / 8) + +#elif defined(__s390x__) + +#define VM_MODE_DEFAULT VM_MODE_P44V64_4K 
+#define MIN_PAGE_SHIFT 12U +#define ptes_per_page(page_size) ((page_size) / 16) + +#endif + +#define MIN_PAGE_SIZE (1U << MIN_PAGE_SHIFT) +#define PTES_PER_MIN_PAGE ptes_per_page(MIN_PAGE_SIZE) + +struct vm_guest_mode_params { + unsigned int pa_bits; + unsigned int va_bits; + unsigned int page_size; + unsigned int page_shift; +}; +extern const struct vm_guest_mode_params vm_guest_mode_params[]; + +int open_path_or_exit(const char *path, int flags); +int open_kvm_dev_path_or_exit(void); +int kvm_check_cap(long cap); +int vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap); +int vcpu_enable_cap(struct kvm_vm *vm, uint32_t vcpu_id, + struct kvm_enable_cap *cap); +void vm_enable_dirty_ring(struct kvm_vm *vm, uint32_t ring_size); +const char *vm_guest_mode_string(uint32_t i); + +struct kvm_vm *vm_create(enum vm_guest_mode mode, uint64_t phy_pages, int perm); +void kvm_vm_free(struct kvm_vm *vmp); +void kvm_vm_restart(struct kvm_vm *vmp, int perm); +void kvm_vm_release(struct kvm_vm *vmp); +void kvm_vm_get_dirty_log(struct kvm_vm *vm, int slot, void *log); +void kvm_vm_clear_dirty_log(struct kvm_vm *vm, int slot, void *log, + uint64_t first_page, uint32_t num_pages); +uint32_t kvm_vm_reset_dirty_ring(struct kvm_vm *vm); + +int kvm_memcmp_hva_gva(void *hva, struct kvm_vm *vm, const vm_vaddr_t gva, + size_t len); + +void kvm_vm_elf_load(struct kvm_vm *vm, const char *filename); + +void vm_dump(FILE *stream, struct kvm_vm *vm, uint8_t indent); + +/* + * VM VCPU Dump + * + * Input Args: + * stream - Output FILE stream + * vm - Virtual Machine + * vcpuid - VCPU ID + * indent - Left margin indent amount + * + * Output Args: None + * + * Return: None + * + * Dumps the current state of the VCPU specified by @vcpuid, within the VM + * given by @vm, to the FILE stream given by @stream. 
+ */ +void vcpu_dump(FILE *stream, struct kvm_vm *vm, uint32_t vcpuid, + uint8_t indent); + +void vm_create_irqchip(struct kvm_vm *vm); + +void vm_userspace_mem_region_add(struct kvm_vm *vm, + enum vm_mem_backing_src_type src_type, + uint64_t guest_paddr, uint32_t slot, uint64_t npages, + uint32_t flags); + +void vcpu_ioctl(struct kvm_vm *vm, uint32_t vcpuid, unsigned long ioctl, + void *arg); +int _vcpu_ioctl(struct kvm_vm *vm, uint32_t vcpuid, unsigned long ioctl, + void *arg); +void vm_ioctl(struct kvm_vm *vm, unsigned long ioctl, void *arg); +int _vm_ioctl(struct kvm_vm *vm, unsigned long cmd, void *arg); +void kvm_ioctl(struct kvm_vm *vm, unsigned long ioctl, void *arg); +int _kvm_ioctl(struct kvm_vm *vm, unsigned long ioctl, void *arg); +void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags); +void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa); +void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot); +void vm_vcpu_add(struct kvm_vm *vm, uint32_t vcpuid); +vm_vaddr_t vm_vaddr_alloc(struct kvm_vm *vm, size_t sz, vm_vaddr_t vaddr_min); +vm_vaddr_t vm_vaddr_alloc_pages(struct kvm_vm *vm, int nr_pages); +vm_vaddr_t vm_vaddr_alloc_page(struct kvm_vm *vm); + +void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, + unsigned int npages); +void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa); +void *addr_gva2hva(struct kvm_vm *vm, vm_vaddr_t gva); +vm_paddr_t addr_hva2gpa(struct kvm_vm *vm, void *hva); +void *addr_gpa2alias(struct kvm_vm *vm, vm_paddr_t gpa); + +/* + * Address Guest Virtual to Guest Physical + * + * Input Args: + * vm - Virtual Machine + * gva - VM virtual address + * + * Output Args: None + * + * Return: + * Equivalent VM physical address + * + * Returns the VM physical address of the translated VM virtual + * address given by @gva. 
+ */ +vm_paddr_t addr_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva); + +struct kvm_run *vcpu_state(struct kvm_vm *vm, uint32_t vcpuid); +void vcpu_run(struct kvm_vm *vm, uint32_t vcpuid); +int _vcpu_run(struct kvm_vm *vm, uint32_t vcpuid); +int vcpu_get_fd(struct kvm_vm *vm, uint32_t vcpuid); +void vcpu_run_complete_io(struct kvm_vm *vm, uint32_t vcpuid); +void vcpu_set_guest_debug(struct kvm_vm *vm, uint32_t vcpuid, + struct kvm_guest_debug *debug); +void vcpu_set_mp_state(struct kvm_vm *vm, uint32_t vcpuid, + struct kvm_mp_state *mp_state); +struct kvm_reg_list *vcpu_get_reg_list(struct kvm_vm *vm, uint32_t vcpuid); +void vcpu_regs_get(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_regs *regs); +void vcpu_regs_set(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_regs *regs); + +/* + * VM VCPU Args Set + * + * Input Args: + * vm - Virtual Machine + * vcpuid - VCPU ID + * num - number of arguments + * ... - arguments, each of type uint64_t + * + * Output Args: None + * + * Return: None + * + * Sets the first @num function input registers of the VCPU with @vcpuid, + * per the C calling convention of the architecture, to the values given + * as variable args. Each of the variable args is expected to be of type + * uint64_t. The maximum @num can be is specific to the architecture. 
+ */
+void vcpu_args_set(struct kvm_vm *vm, uint32_t vcpuid, unsigned int num, ...);
+
+void vcpu_sregs_get(struct kvm_vm *vm, uint32_t vcpuid,
+		    struct kvm_sregs *sregs);
+void vcpu_sregs_set(struct kvm_vm *vm, uint32_t vcpuid,
+		    struct kvm_sregs *sregs);
+int _vcpu_sregs_set(struct kvm_vm *vm, uint32_t vcpuid,
+		    struct kvm_sregs *sregs);
+void vcpu_fpu_get(struct kvm_vm *vm, uint32_t vcpuid,
+		  struct kvm_fpu *fpu);
+void vcpu_fpu_set(struct kvm_vm *vm, uint32_t vcpuid,
+		  struct kvm_fpu *fpu);
+void vcpu_get_reg(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_one_reg *reg);
+void vcpu_set_reg(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_one_reg *reg);
+#ifdef __KVM_HAVE_VCPU_EVENTS
+void vcpu_events_get(struct kvm_vm *vm, uint32_t vcpuid,
+		     struct kvm_vcpu_events *events);
+void vcpu_events_set(struct kvm_vm *vm, uint32_t vcpuid,
+		     struct kvm_vcpu_events *events);
+#endif
+#ifdef __x86_64__
+void vcpu_nested_state_get(struct kvm_vm *vm, uint32_t vcpuid,
+			   struct kvm_nested_state *state);
+int vcpu_nested_state_set(struct kvm_vm *vm, uint32_t vcpuid,
+			  struct kvm_nested_state *state, bool ignore_error);
+#endif
+void *vcpu_map_dirty_ring(struct kvm_vm *vm, uint32_t vcpuid);
+
+int _kvm_device_check_attr(int dev_fd, uint32_t group, uint64_t attr);
+int kvm_device_check_attr(int dev_fd, uint32_t group, uint64_t attr);
+int _kvm_create_device(struct kvm_vm *vm, uint64_t type, bool test, int *fd);
+int kvm_create_device(struct kvm_vm *vm, uint64_t type, bool test);
+int _kvm_device_access(int dev_fd, uint32_t group, uint64_t attr,
+		       void *val, bool write);
+int kvm_device_access(int dev_fd, uint32_t group, uint64_t attr,
+		      void *val, bool write);
+
+int _vcpu_has_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group,
+			  uint64_t attr);
+int vcpu_has_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group,
+			 uint64_t attr);
+int _vcpu_access_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group,
+			     uint64_t attr, void *val, bool write);
+int vcpu_access_device_attr(struct kvm_vm *vm, uint32_t vcpuid, uint32_t group,
+			    uint64_t attr, void *val, bool write);
+
+const char *exit_reason_str(unsigned int exit_reason);
+
+void virt_pgd_alloc(struct kvm_vm *vm);
+
+/*
+ * VM Virtual Page Map
+ *
+ * Input Args:
+ *   vm - Virtual Machine
+ *   vaddr - VM Virtual Address
+ *   paddr - VM Physical Address
+ *   memslot - Memory region slot for new virtual translation tables
+ *
+ * Output Args: None
+ *
+ * Return: None
+ *
+ * Within @vm, creates a virtual translation for the page starting
+ * at @vaddr to the page starting at @paddr.
+ */
+void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr);
+
+vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
+			     uint32_t memslot);
+vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
+			      vm_paddr_t paddr_min, uint32_t memslot);
+vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm);
+
+/*
+ * Create a VM with reasonable defaults
+ *
+ * Input Args:
+ *   vcpuid - The id of the single VCPU to add to the VM.
+ *   extra_mem_pages - The number of extra pages to add (this will
+ *                     decide how much extra space we will need to
+ *                     setup the page tables using memslot 0)
+ *   guest_code - The vCPU's entry point
+ *
+ * Output Args: None
+ *
+ * Return:
+ *   Pointer to opaque structure that describes the created VM.
+ */
+struct kvm_vm *vm_create_default(uint32_t vcpuid, uint64_t extra_mem_pages,
+				 void *guest_code);
+
+/* Same as vm_create_default, but can be used for more than one vcpu */
+struct kvm_vm *vm_create_default_with_vcpus(uint32_t nr_vcpus, uint64_t extra_mem_pages,
+					    uint32_t num_percpu_pages, void *guest_code,
+					    uint32_t vcpuids[]);
+
+/* Like vm_create_default_with_vcpus, but accepts mode and slot0 memory as a parameter */
+struct kvm_vm *vm_create_with_vcpus(enum vm_guest_mode mode, uint32_t nr_vcpus,
+				    uint64_t slot0_mem_pages, uint64_t extra_mem_pages,
+				    uint32_t num_percpu_pages, void *guest_code,
+				    uint32_t vcpuids[]);
+
+/*
+ * Adds a vCPU with reasonable defaults (e.g. a stack)
+ *
+ * Input Args:
+ *   vm - Virtual Machine
+ *   vcpuid - The id of the VCPU to add to the VM.
+ *   guest_code - The vCPU's entry point
+ */
+void vm_vcpu_add_default(struct kvm_vm *vm, uint32_t vcpuid, void *guest_code);
+
+bool vm_is_unrestricted_guest(struct kvm_vm *vm);
+
+unsigned int vm_get_page_size(struct kvm_vm *vm);
+unsigned int vm_get_page_shift(struct kvm_vm *vm);
+uint64_t vm_get_max_gfn(struct kvm_vm *vm);
+int vm_get_fd(struct kvm_vm *vm);
+
+unsigned int vm_calc_num_guest_pages(enum vm_guest_mode mode, size_t size);
+unsigned int vm_num_host_pages(enum vm_guest_mode mode, unsigned int num_guest_pages);
+unsigned int vm_num_guest_pages(enum vm_guest_mode mode, unsigned int num_host_pages);
+static inline unsigned int
+vm_adjust_num_guest_pages(enum vm_guest_mode mode, unsigned int num_guest_pages)
+{
+	unsigned int n;
+	n = vm_num_guest_pages(mode, vm_num_host_pages(mode, num_guest_pages));
+#ifdef __s390x__
+	/* s390 requires 1M aligned guest sizes */
+	n = (n + 255) & ~255;
+#endif
+	return n;
+}
+
+struct kvm_userspace_memory_region *
+kvm_userspace_memory_region_find(struct kvm_vm *vm, uint64_t start,
+				 uint64_t end);
+
+struct kvm_dirty_log *
+allocate_kvm_dirty_log(struct kvm_userspace_memory_region *region);
+
+int vm_create_device(struct kvm_vm *vm, struct kvm_create_device *cd);
+
+#define sync_global_to_guest(vm, g) ({				\
+	typeof(g) *_p = addr_gva2hva(vm, (vm_vaddr_t)&(g));	\
+	memcpy(_p, &(g), sizeof(g));				\
+})
+
+#define sync_global_from_guest(vm, g) ({			\
+	typeof(g) *_p = addr_gva2hva(vm, (vm_vaddr_t)&(g));	\
+	memcpy(&(g), _p, sizeof(g));				\
+})
+
+void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid);
+
+/* Common ucalls */
+enum {
+	UCALL_NONE,
+	UCALL_SYNC,
+	UCALL_ABORT,
+	UCALL_DONE,
+	UCALL_UNHANDLED,
+};
+
+#define UCALL_MAX_ARGS 6
+
+struct ucall {
+	uint64_t cmd;
+	uint64_t args[UCALL_MAX_ARGS];
+};
+
+void ucall_init(struct kvm_vm *vm, void *arg);
+void ucall_uninit(struct kvm_vm *vm);
+void ucall(uint64_t cmd, int nargs, ...);
+uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc);
+
+#define GUEST_SYNC_ARGS(stage, arg1, arg2, arg3, arg4)	\
+	ucall(UCALL_SYNC, 6, "hello", stage, arg1, arg2, arg3, arg4)
+#define GUEST_SYNC(stage)	ucall(UCALL_SYNC, 2, "hello", stage)
+#define GUEST_DONE()		ucall(UCALL_DONE, 0)
+#define __GUEST_ASSERT(_condition, _condstr, _nargs, _args...) do {	\
+	if (!(_condition))						\
+		ucall(UCALL_ABORT, 2 + _nargs,				\
+			"Failed guest assert: "				\
+			_condstr, __LINE__, _args);			\
+} while (0)
+
+#define GUEST_ASSERT(_condition) \
+	__GUEST_ASSERT(_condition, #_condition, 0, 0)
+
+#define GUEST_ASSERT_1(_condition, arg1) \
+	__GUEST_ASSERT(_condition, #_condition, 1, (arg1))
+
+#define GUEST_ASSERT_2(_condition, arg1, arg2) \
+	__GUEST_ASSERT(_condition, #_condition, 2, (arg1), (arg2))
+
+#define GUEST_ASSERT_3(_condition, arg1, arg2, arg3) \
+	__GUEST_ASSERT(_condition, #_condition, 3, (arg1), (arg2), (arg3))
+
+#define GUEST_ASSERT_4(_condition, arg1, arg2, arg3, arg4) \
+	__GUEST_ASSERT(_condition, #_condition, 4, (arg1), (arg2), (arg3), (arg4))
+
+#define GUEST_ASSERT_EQ(a, b) __GUEST_ASSERT((a) == (b), #a " == " #b, 2, a, b)
+
+int vm_get_stats_fd(struct kvm_vm *vm);
+int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid);
+
+uint32_t guest_get_vcpuid(void);
+
+#endif /* SELFTEST_KVM_UTIL_BASE_H */
Now that core kvm_util declarations have a dedicated home in kvm_util_base.h, move ucall-related declarations out into a separate header.
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 .../testing/selftests/kvm/include/kvm_util.h  |  1 +
 .../selftests/kvm/include/kvm_util_base.h     | 49 ---------------
 .../selftests/kvm/include/ucall_common.h      | 59 +++++++++++++++++++
 3 files changed, 60 insertions(+), 49 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/ucall_common.h

diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index c860ced3888d..c9286811a4cb 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -8,5 +8,6 @@
 #define SELFTEST_KVM_UTIL_H

 #include "kvm_util_base.h"
+#include "ucall_common.h"

 #endif /* SELFTEST_KVM_UTIL_H */
diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 8fb6aeff5469..4e2946ba3ff7 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -360,55 +360,6 @@ int vm_create_device(struct kvm_vm *vm, struct kvm_create_device *cd);

 void assert_on_unhandled_exception(struct kvm_vm *vm, uint32_t vcpuid);

-/* Common ucalls */
-enum {
-	UCALL_NONE,
-	UCALL_SYNC,
-	UCALL_ABORT,
-	UCALL_DONE,
-	UCALL_UNHANDLED,
-};
-
-#define UCALL_MAX_ARGS 6
-
-struct ucall {
-	uint64_t cmd;
-	uint64_t args[UCALL_MAX_ARGS];
-};
-
-void ucall_init(struct kvm_vm *vm, void *arg);
-void ucall_uninit(struct kvm_vm *vm);
-void ucall(uint64_t cmd, int nargs, ...);
-uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc);
-
-#define GUEST_SYNC_ARGS(stage, arg1, arg2, arg3, arg4)	\
-	ucall(UCALL_SYNC, 6, "hello", stage, arg1, arg2, arg3, arg4)
-#define GUEST_SYNC(stage)	ucall(UCALL_SYNC, 2, "hello", stage)
-#define GUEST_DONE()		ucall(UCALL_DONE, 0)
-#define __GUEST_ASSERT(_condition, _condstr, _nargs, _args...) do {	\
-	if (!(_condition))						\
-		ucall(UCALL_ABORT, 2 + _nargs,				\
-			"Failed guest assert: "				\
-			_condstr, __LINE__, _args);			\
-} while (0)
-
-#define GUEST_ASSERT(_condition) \
-	__GUEST_ASSERT(_condition, #_condition, 0, 0)
-
-#define GUEST_ASSERT_1(_condition, arg1) \
-	__GUEST_ASSERT(_condition, #_condition, 1, (arg1))
-
-#define GUEST_ASSERT_2(_condition, arg1, arg2) \
-	__GUEST_ASSERT(_condition, #_condition, 2, (arg1), (arg2))
-
-#define GUEST_ASSERT_3(_condition, arg1, arg2, arg3) \
-	__GUEST_ASSERT(_condition, #_condition, 3, (arg1), (arg2), (arg3))
-
-#define GUEST_ASSERT_4(_condition, arg1, arg2, arg3, arg4) \
-	__GUEST_ASSERT(_condition, #_condition, 4, (arg1), (arg2), (arg3), (arg4))
-
-#define GUEST_ASSERT_EQ(a, b) __GUEST_ASSERT((a) == (b), #a " == " #b, 2, a, b)
-
 int vm_get_stats_fd(struct kvm_vm *vm);
 int vcpu_get_stats_fd(struct kvm_vm *vm, uint32_t vcpuid);

diff --git a/tools/testing/selftests/kvm/include/ucall_common.h b/tools/testing/selftests/kvm/include/ucall_common.h
new file mode 100644
index 000000000000..9eecc9d40b79
--- /dev/null
+++ b/tools/testing/selftests/kvm/include/ucall_common.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * tools/testing/selftests/kvm/include/kvm_util.h
+ *
+ * Copyright (C) 2018, Google LLC.
+ */
+#ifndef SELFTEST_KVM_UCALL_COMMON_H
+#define SELFTEST_KVM_UCALL_COMMON_H
+
+/* Common ucalls */
+enum {
+	UCALL_NONE,
+	UCALL_SYNC,
+	UCALL_ABORT,
+	UCALL_DONE,
+	UCALL_UNHANDLED,
+};
+
+#define UCALL_MAX_ARGS 6
+
+struct ucall {
+	uint64_t cmd;
+	uint64_t args[UCALL_MAX_ARGS];
+};
+
+void ucall_init(struct kvm_vm *vm, void *arg);
+void ucall_uninit(struct kvm_vm *vm);
+void ucall(uint64_t cmd, int nargs, ...);
+uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc);
+
+#define GUEST_SYNC_ARGS(stage, arg1, arg2, arg3, arg4)	\
+	ucall(UCALL_SYNC, 6, "hello", stage, arg1, arg2, arg3, arg4)
+#define GUEST_SYNC(stage)	ucall(UCALL_SYNC, 2, "hello", stage)
+#define GUEST_DONE()		ucall(UCALL_DONE, 0)
+#define __GUEST_ASSERT(_condition, _condstr, _nargs, _args...) do {	\
+	if (!(_condition))						\
+		ucall(UCALL_ABORT, 2 + _nargs,				\
+			"Failed guest assert: "				\
+			_condstr, __LINE__, _args);			\
+} while (0)
+
+#define GUEST_ASSERT(_condition) \
+	__GUEST_ASSERT(_condition, #_condition, 0, 0)
+
+#define GUEST_ASSERT_1(_condition, arg1) \
+	__GUEST_ASSERT(_condition, #_condition, 1, (arg1))
+
+#define GUEST_ASSERT_2(_condition, arg1, arg2) \
+	__GUEST_ASSERT(_condition, #_condition, 2, (arg1), (arg2))
+
+#define GUEST_ASSERT_3(_condition, arg1, arg2, arg3) \
+	__GUEST_ASSERT(_condition, #_condition, 3, (arg1), (arg2), (arg3))
+
+#define GUEST_ASSERT_4(_condition, arg1, arg2, arg3, arg4) \
+	__GUEST_ASSERT(_condition, #_condition, 4, (arg1), (arg2), (arg3), (arg4))
+
+#define GUEST_ASSERT_EQ(a, b) __GUEST_ASSERT((a) == (b), #a " == " #b, 2, a, b)
+
+#endif /* SELFTEST_KVM_UCALL_COMMON_H */
On Fri, Dec 10, 2021 at 10:46:12AM -0600, Michael Roth wrote:
> Now that core kvm_util declarations have a dedicated home in
> kvm_util_base.h, move ucall-related declarations out into a separate
> header.
>
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> ---
>  .../testing/selftests/kvm/include/kvm_util.h  |  1 +
>  .../selftests/kvm/include/kvm_util_base.h     | 49 ---------------
>  .../selftests/kvm/include/ucall_common.h      | 59 +++++++++++++++++++
>  3 files changed, 60 insertions(+), 49 deletions(-)
>  create mode 100644 tools/testing/selftests/kvm/include/ucall_common.h
>
> diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
> index c860ced3888d..c9286811a4cb 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util.h
> @@ -8,5 +8,6 @@
>  #define SELFTEST_KVM_UTIL_H
>
>  #include "kvm_util_base.h"
> +#include "ucall_common.h"
>
>  #endif /* SELFTEST_KVM_UTIL_H */
Now that kvm_util.h is looking like a "libkvm.h", we can do some more header cleanups to make that official. After this series is merged I'll send a series that

 - removes unnecessary includes from kvm_util_common.h and other headers
 - renames kvm_util.h to libkvm.h
 - also includes guest_modes.h and test_util.h from libkvm.h
 - simplifies the includes of all unit tests since they'll be including libkvm.h
 - probably moves include/sparsebit.h to lib, since no unit test needs it

Thanks,
drew
To support SEV, x86 tests will need an alternative to using PIO instructions to handle ucall-related functionality, since ucall structs are currently allocated on the guest stack, which will generally be encrypted memory that can't be accessed by tests through the normal mechanisms (along with some other complications which will require some new ucall interfaces as well).
To prepare for this, introduce a ucall_ops struct and supporting interfaces that can be used to define multiple ucall implementations that can be selected on a per-test basis, and re-work the existing PIO-based ucall implementation to make use of these changes. Subsequent patches will do the same for other archs as well, and then extend this ops interface to address complications when dealing with encrypted/private guest memory.
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 tools/testing/selftests/kvm/Makefile          |  2 +-
 .../testing/selftests/kvm/include/kvm_util.h  | 10 ++
 .../selftests/kvm/include/ucall_common.h      | 17 +++-
 .../selftests/kvm/include/x86_64/ucall.h      | 18 ++++
 .../testing/selftests/kvm/lib/ucall_common.c  | 95 +++++++++++++++++++
 .../testing/selftests/kvm/lib/x86_64/ucall.c  | 46 ++++-----
 6 files changed, 157 insertions(+), 31 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/x86_64/ucall.h
 create mode 100644 tools/testing/selftests/kvm/lib/ucall_common.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index c4e34717826a..05bff4039890 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -34,7 +34,7 @@ ifeq ($(ARCH),s390)
 endif

 LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
-LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
+LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S lib/ucall_common.c
 LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c
 LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c

diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index c9286811a4cb..2701bf98c0db 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -8,6 +8,16 @@
 #define SELFTEST_KVM_UTIL_H

 #include "kvm_util_base.h"
+/*
+ * TODO: ucall.h contains arch-specific declarations along with
+ * ucall_common.h. For now only a subset of archs provide the
+ * new header. Once all archs implement the new header the #include for
+ * ucall_common.h can be dropped.
+ */
+#ifdef __x86_64__
+#include "ucall.h"
+#else
 #include "ucall_common.h"
+#endif

 #endif /* SELFTEST_KVM_UTIL_H */
diff --git a/tools/testing/selftests/kvm/include/ucall_common.h b/tools/testing/selftests/kvm/include/ucall_common.h
index 9eecc9d40b79..fcd32607dcff 100644
--- a/tools/testing/selftests/kvm/include/ucall_common.h
+++ b/tools/testing/selftests/kvm/include/ucall_common.h
@@ -1,8 +1,12 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 /*
- * tools/testing/selftests/kvm/include/kvm_util.h
+ * Common interfaces related to ucall support.
+ *
+ * A ucall is a hypercall to userspace.
  *
  * Copyright (C) 2018, Google LLC.
+ * Copyright (C) 2018, Red Hat, Inc.
+ * Copyright (C) 2021, Advanced Micro Devices, Inc.
  */
 #ifndef SELFTEST_KVM_UCALL_COMMON_H
 #define SELFTEST_KVM_UCALL_COMMON_H
@@ -14,6 +18,7 @@ enum {
 	UCALL_ABORT,
 	UCALL_DONE,
 	UCALL_UNHANDLED,
+	UCALL_NOT_IMPLEMENTED,
 };

 #define UCALL_MAX_ARGS 6
@@ -23,8 +28,18 @@ struct ucall {
 	uint64_t args[UCALL_MAX_ARGS];
 };

+struct ucall_ops {
+	const char *name;
+	void (*init)(struct kvm_vm *vm, void *arg);
+	void (*uninit)(struct kvm_vm *vm);
+	void (*send_cmd)(struct ucall *uc);
+	uint64_t (*recv_cmd)(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc);
+};
+
 void ucall_init(struct kvm_vm *vm, void *arg);
 void ucall_uninit(struct kvm_vm *vm);
+void ucall_init_ops(struct kvm_vm *vm, void *arg, const struct ucall_ops *ops);
+void ucall_uninit_ops(struct kvm_vm *vm);
 void ucall(uint64_t cmd, int nargs, ...);
 uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc);

diff --git a/tools/testing/selftests/kvm/include/x86_64/ucall.h b/tools/testing/selftests/kvm/include/x86_64/ucall.h
new file mode 100644
index 000000000000..8366bdc9c04e
--- /dev/null
+++ b/tools/testing/selftests/kvm/include/x86_64/ucall.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Arch-specific ucall implementations.
+ *
+ * A ucall is a "hypercall to userspace".
+ *
+ * Copyright (C) 2021 Advanced Micro Devices
+ */
+#ifndef SELFTEST_KVM_UCALL_H
+#define SELFTEST_KVM_UCALL_H
+
+#include "ucall_common.h"
+
+extern const struct ucall_ops ucall_ops_pio;
+
+extern const struct ucall_ops ucall_ops_default;
+
+#endif /* SELFTEST_KVM_UCALL_H */
diff --git a/tools/testing/selftests/kvm/lib/ucall_common.c b/tools/testing/selftests/kvm/lib/ucall_common.c
new file mode 100644
index 000000000000..db0129edcbc1
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/ucall_common.c
@@ -0,0 +1,95 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Common interfaces related to ucall support. A ucall is a hypercall to
+ * userspace.
+ *
+ * Copyright (C) 2018, Red Hat, Inc.
+ * Copyright (C) 2021, Advanced Micro Devices, Inc.
+ */
+#include "kvm_util_base.h"
+#include "ucall_common.h"
+
+extern const struct ucall_ops ucall_ops_default;
+
+/* Some archs rely on a default that is available even without ucall_init(). */
+#if defined(__x86_64__) || defined(__s390x__)
+static const struct ucall_ops *ucall_ops = &ucall_ops_default;
+#else
+static const struct ucall_ops *ucall_ops;
+#endif
+
+void ucall_init_ops(struct kvm_vm *vm, void *arg, const struct ucall_ops *ops)
+{
+	TEST_ASSERT(ops, "ucall ops must be specified");
+	ucall_ops = ops;
+	sync_global_to_guest(vm, ucall_ops);
+
+	if (ucall_ops->init)
+		ucall_ops->init(vm, arg);
+}
+
+void ucall_init(struct kvm_vm *vm, void *arg)
+{
+	ucall_init_ops(vm, arg, &ucall_ops_default);
+}
+
+void ucall_uninit_ops(struct kvm_vm *vm)
+{
+	if (ucall_ops && ucall_ops->uninit)
+		ucall_ops->uninit(vm);
+
+	ucall_ops = NULL;
+	sync_global_to_guest(vm, ucall_ops);
+}
+
+void ucall_uninit(struct kvm_vm *vm)
+{
+	ucall_uninit_ops(vm);
+}
+
+static void ucall_process_args(struct ucall *uc, uint64_t cmd, int nargs, va_list va_args)
+{
+	int i;
+
+	nargs = nargs <= UCALL_MAX_ARGS ? nargs : UCALL_MAX_ARGS;
+	uc->cmd = cmd;
+
+	for (i = 0; i < nargs; ++i)
+		uc->args[i] = va_arg(va_args, uint64_t);
+}
+
+/*
+ * Allocate/populate a ucall buffer from the guest's stack and then generate an
+ * exit to host userspace. ucall_ops->send_cmd should have some way of
+ * communicating the address of the ucall buffer to the host.
+ */
+void ucall(uint64_t cmd, int nargs, ...)
+{
+	struct ucall uc;
+	va_list va;
+
+	if (!ucall_ops->send_cmd)
+		return;
+
+	va_start(va, nargs);
+	ucall_process_args(&uc, cmd, nargs, va);
+	va_end(va);
+
+	ucall_ops->send_cmd(&uc);
+}
+
+/*
+ * Parse the ucall buffer allocated by the guest via ucall() to determine what
+ * ucall message/command was sent by the guest. If 'uc' is provided, copy the
+ * contents of the guest's ucall buffer into it.
+ */
+uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
+{
+	if (!ucall_ops->recv_cmd)
+		return UCALL_NOT_IMPLEMENTED;
+
+	if (uc)
+		memset(uc, 0, sizeof(*uc));
+
+	return ucall_ops->recv_cmd(vm, vcpu_id, uc);
+}
diff --git a/tools/testing/selftests/kvm/lib/x86_64/ucall.c b/tools/testing/selftests/kvm/lib/x86_64/ucall.c
index a3489973e290..f5d9aba0d803 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/ucall.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/ucall.c
@@ -1,48 +1,28 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * ucall support. A ucall is a "hypercall to userspace".
+ * Arch-specific ucall implementations.
+ *
+ * A ucall is a "hypercall to userspace".
  *
  * Copyright (C) 2018, Red Hat, Inc.
  */
-#include "kvm_util.h"
+#include "kvm_util_base.h"
+#include "ucall.h"

 #define UCALL_PIO_PORT ((uint16_t)0x1000)

-void ucall_init(struct kvm_vm *vm, void *arg)
-{
-}
-
-void ucall_uninit(struct kvm_vm *vm)
-{
-}
-
-void ucall(uint64_t cmd, int nargs, ...)
+static void ucall_ops_pio_send_cmd(struct ucall *uc)
 {
-	struct ucall uc = {
-		.cmd = cmd,
-	};
-	va_list va;
-	int i;
-
-	nargs = nargs <= UCALL_MAX_ARGS ? nargs : UCALL_MAX_ARGS;
-
-	va_start(va, nargs);
-	for (i = 0; i < nargs; ++i)
-		uc.args[i] = va_arg(va, uint64_t);
-	va_end(va);
-
 	asm volatile("in %[port], %%al"
-		: : [port] "d" (UCALL_PIO_PORT), "D" (&uc) : "rax", "memory");
+		: : [port] "d" (UCALL_PIO_PORT), "D" (uc) : "rax", "memory");
 }

-uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
+static uint64_t ucall_ops_pio_recv_cmd(struct kvm_vm *vm, uint32_t vcpu_id,
+				       struct ucall *uc)
 {
 	struct kvm_run *run = vcpu_state(vm, vcpu_id);
 	struct ucall ucall = {};

-	if (uc)
-		memset(uc, 0, sizeof(*uc));
-
 	if (run->exit_reason == KVM_EXIT_IO && run->io.port == UCALL_PIO_PORT) {
 		struct kvm_regs regs;

@@ -57,3 +37,11 @@ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)

 	return ucall.cmd;
 }
+
+const struct ucall_ops ucall_ops_pio = {
+	.name = "PIO",
+	.send_cmd = ucall_ops_pio_send_cmd,
+	.recv_cmd = ucall_ops_pio_recv_cmd,
+};
+
+const struct ucall_ops ucall_ops_default = ucall_ops_pio;
As with x86, switch the default ucall implementation to using the new ucall_ops infrastructure. With this change ucall_init/ucall_uninit are no longer arch-specific and can now be dropped in favor of the ones in ucall_common.c.
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 tools/testing/selftests/kvm/Makefile          |  2 +-
 .../selftests/kvm/include/aarch64/ucall.h     | 18 ++++++++
 .../testing/selftests/kvm/include/kvm_util.h  |  2 +-
 .../testing/selftests/kvm/lib/aarch64/ucall.c | 43 +++++++----------
 4 files changed, 40 insertions(+), 25 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/aarch64/ucall.h
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 05bff4039890..2d4299961d0e 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -35,7 +35,7 @@ endif

 LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
 LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S lib/ucall_common.c
-LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c
+LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c lib/ucall_common.c
 LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c

 TEST_GEN_PROGS_x86_64 = x86_64/cr4_cpuid_sync_test
diff --git a/tools/testing/selftests/kvm/include/aarch64/ucall.h b/tools/testing/selftests/kvm/include/aarch64/ucall.h
new file mode 100644
index 000000000000..d3189d0a4d68
--- /dev/null
+++ b/tools/testing/selftests/kvm/include/aarch64/ucall.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Arch-specific ucall implementations.
+ *
+ * A ucall is a "hypercall to userspace".
+ *
+ * Copyright (C) 2021 Advanced Micro Devices
+ */
+#ifndef SELFTEST_KVM_UCALL_H
+#define SELFTEST_KVM_UCALL_H
+
+#include "ucall_common.h"
+
+extern const struct ucall_ops ucall_ops_mmio;
+
+extern const struct ucall_ops ucall_ops_default;
+
+#endif /* SELFTEST_KVM_UCALL_H */
diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index 2701bf98c0db..16ec8c53cd81 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -14,7 +14,7 @@
  * new header. Once all archs implement the new header the #include for
  * ucall_common.h can be dropped.
  */
-#ifdef __x86_64__
+#if defined(__x86_64__) || defined (__aarch64__)
 #include "ucall.h"
 #else
 #include "ucall_common.h"
diff --git a/tools/testing/selftests/kvm/lib/aarch64/ucall.c b/tools/testing/selftests/kvm/lib/aarch64/ucall.c
index e0b0164e9af8..ab052ab5d5de 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/ucall.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/ucall.c
@@ -1,11 +1,14 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * ucall support. A ucall is a "hypercall to userspace".
+ * Arch-specific ucall implementations.
+ *
+ * A ucall is a "hypercall to userspace".
  *
  * Copyright (C) 2018, Red Hat, Inc.
  */
-#include "kvm_util.h"
+#include "kvm_util_base.h"
 #include "../kvm_util_internal.h"
+#include "ucall.h"

 static vm_vaddr_t *ucall_exit_mmio_addr;

@@ -22,7 +25,7 @@ static bool ucall_mmio_init(struct kvm_vm *vm, vm_paddr_t gpa)
 	return true;
 }

-void ucall_init(struct kvm_vm *vm, void *arg)
+static void ucall_ops_mmio_init(struct kvm_vm *vm, void *arg)
 {
 	vm_paddr_t gpa, start, end, step, offset;
 	unsigned int bits;
@@ -65,38 +68,22 @@ void ucall_init(struct kvm_vm *vm, void *arg)
 	TEST_FAIL("Can't find a ucall mmio address");
 }

-void ucall_uninit(struct kvm_vm *vm)
+static void ucall_ops_mmio_uninit(struct kvm_vm *vm)
 {
 	ucall_exit_mmio_addr = 0;
 	sync_global_to_guest(vm, ucall_exit_mmio_addr);
 }

-void ucall(uint64_t cmd, int nargs, ...)
+static void ucall_ops_mmio_send_cmd(struct ucall *uc)
 {
-	struct ucall uc = {
-		.cmd = cmd,
-	};
-	va_list va;
-	int i;
-
-	nargs = nargs <= UCALL_MAX_ARGS ? nargs : UCALL_MAX_ARGS;
-
-	va_start(va, nargs);
-	for (i = 0; i < nargs; ++i)
-		uc.args[i] = va_arg(va, uint64_t);
-	va_end(va);
-
-	*ucall_exit_mmio_addr = (vm_vaddr_t)&uc;
+	*ucall_exit_mmio_addr = (vm_vaddr_t)uc;
 }

-uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
+static uint64_t ucall_ops_mmio_recv_cmd(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
 {
 	struct kvm_run *run = vcpu_state(vm, vcpu_id);
 	struct ucall ucall = {};

-	if (uc)
-		memset(uc, 0, sizeof(*uc));
-
 	if (run->exit_reason == KVM_EXIT_MMIO &&
 	    run->mmio.phys_addr == (uint64_t)ucall_exit_mmio_addr) {
 		vm_vaddr_t gva;
@@ -113,3 +100,13 @@ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)

 	return ucall.cmd;
 }
+
+const struct ucall_ops ucall_ops_mmio = {
+	.name = "MMIO",
+	.init = ucall_ops_mmio_init,
+	.uninit = ucall_ops_mmio_uninit,
+	.send_cmd = ucall_ops_mmio_send_cmd,
+	.recv_cmd = ucall_ops_mmio_recv_cmd,
+};
+
+const struct ucall_ops ucall_ops_default = ucall_ops_mmio;
As with x86, switch the default ucall implementation to using the new ucall_ops infrastructure. With this change ucall_init/ucall_uninit are no longer arch-specific and can now be dropped in favor of the ones in ucall_common.c.
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 tools/testing/selftests/kvm/Makefile          |  6 +--
 .../testing/selftests/kvm/include/kvm_util.h  | 10 -----
 .../selftests/kvm/include/s390x/ucall.h       | 18 ++++++++
 tools/testing/selftests/kvm/lib/s390x/ucall.c | 45 +++++++------------
 4 files changed, 38 insertions(+), 41 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/s390x/ucall.h
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 2d4299961d0e..06a02b6fa907 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -33,9 +33,9 @@ ifeq ($(ARCH),s390) UNAME_M := s390x endif
-LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c -LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S lib/ucall_common.c -LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c lib/ucall_common.c +LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c lib/ucall_common.c +LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S +LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
TEST_GEN_PROGS_x86_64 = x86_64/cr4_cpuid_sync_test diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h index 16ec8c53cd81..f6ec4fcd66d9 100644 --- a/tools/testing/selftests/kvm/include/kvm_util.h +++ b/tools/testing/selftests/kvm/include/kvm_util.h @@ -8,16 +8,6 @@ #define SELFTEST_KVM_UTIL_H
#include "kvm_util_base.h" -/* - * TODO: ucall.h contains arch-specific declarations along with - * ucall_common.h. For now only a subset of archs provide the - * new header. Once all archs implement the new header the #include for - * ucall_common.h can be dropped. - */ -#if defined(__x86_64__) || defined (__aarch64__) #include "ucall.h" -#else -#include "ucall_common.h" -#endif
#endif /* SELFTEST_KVM_UTIL_H */ diff --git a/tools/testing/selftests/kvm/include/s390x/ucall.h b/tools/testing/selftests/kvm/include/s390x/ucall.h new file mode 100644 index 000000000000..6ba8bc77d667 --- /dev/null +++ b/tools/testing/selftests/kvm/include/s390x/ucall.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Arch-specific ucall implementations. + * + * A ucall is a "hypercall to userspace". + * + * Copyright (C) 2021 Advanced Micro Devices + */ +#ifndef SELFTEST_KVM_UCALL_H +#define SELFTEST_KVM_UCALL_H + +#include "ucall_common.h" + +extern const struct ucall_ops ucall_ops_diag501; + +extern const struct ucall_ops ucall_ops_default; + +#endif /* SELFTEST_KVM_UCALL_H */ diff --git a/tools/testing/selftests/kvm/lib/s390x/ucall.c b/tools/testing/selftests/kvm/lib/s390x/ucall.c index 9d3b0f15249a..67f1baa42b28 100644 --- a/tools/testing/selftests/kvm/lib/s390x/ucall.c +++ b/tools/testing/selftests/kvm/lib/s390x/ucall.c @@ -1,46 +1,27 @@ // SPDX-License-Identifier: GPL-2.0 /* - * ucall support. A ucall is a "hypercall to userspace". + * Arch-specific ucall implementations. + * + * A ucall is a "hypercall to userspace". * * Copyright (C) 2019 Red Hat, Inc. */ -#include "kvm_util.h" - -void ucall_init(struct kvm_vm *vm, void *arg) -{ -} - -void ucall_uninit(struct kvm_vm *vm) -{ -} +#include "kvm_util_base.h" +#include "ucall.h"
-void ucall(uint64_t cmd, int nargs, ...)
+static void
+ucall_ops_diag501_send_cmd(struct ucall *uc)
 {
-	struct ucall uc = {
-		.cmd = cmd,
-	};
-	va_list va;
-	int i;
-
-	nargs = nargs <= UCALL_MAX_ARGS ? nargs : UCALL_MAX_ARGS;
-
-	va_start(va, nargs);
-	for (i = 0; i < nargs; ++i)
-		uc.args[i] = va_arg(va, uint64_t);
-	va_end(va);
-
 	/* Exit via DIAGNOSE 0x501 (normally used for breakpoints) */
-	asm volatile ("diag 0,%0,0x501" : : "a"(&uc) : "memory");
+	asm volatile ("diag 0,%0,0x501" : : "a"(uc) : "memory");
 }
-uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc) +static uint64_t +ucall_ops_diag501_recv_cmd(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc) { struct kvm_run *run = vcpu_state(vm, vcpu_id); struct ucall ucall = {};
- if (uc) - memset(uc, 0, sizeof(*uc)); - if (run->exit_reason == KVM_EXIT_S390_SIEIC && run->s390_sieic.icptcode == 4 && (run->s390_sieic.ipa >> 8) == 0x83 && /* 0x83 means DIAGNOSE */ @@ -57,3 +38,11 @@ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
return ucall.cmd; } + +const struct ucall_ops ucall_ops_diag501 = { + .name = "diag501", + .send_cmd = ucall_ops_diag501_send_cmd, + .recv_cmd = ucall_ops_diag501_recv_cmd, +}; + +const struct ucall_ops ucall_ops_default = ucall_ops_diag501;
One complication with SEV and likely other Confidential Computing implementations is the fact that guest stacks might be encrypted, so existing interfaces like ucall()/get_ucall() will not work as-is, since they dynamically allocate ucall structs on the guest stack. Additionally, the basic task of communicating the location in guest memory of these structs is complicated by the fact that guest register state may also be encrypted, so existing approaches like reading vCPU register values to get these addresses would need to take this into consideration.
One way to address both of these in a (hopefully) robust way is to introduce a new set of ucall interfaces that rely on tests setting up shared guest memory in advance, so that the ucall struct addresses are communicated between host and guest ahead of time, along with any work needed to ensure the memory is shared/public. With that in place, a ucall implementation only needs to trigger an exit back to host userspace to allow for host/guest communication via this shared memory/ucall struct.
Implement this approach by extending ucall_ops to allow for ucall implementations based on shared memory, and introducing ucall_shared()/get_ucall_shared() analogs to the existing ucall()/get_ucall() interfaces.
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 .../selftests/kvm/include/ucall_common.h      |  5 +++
 .../testing/selftests/kvm/lib/ucall_common.c  | 38 +++++++++++++++++++
 2 files changed, 43 insertions(+)
diff --git a/tools/testing/selftests/kvm/include/ucall_common.h b/tools/testing/selftests/kvm/include/ucall_common.h index fcd32607dcff..ae0e8eec9734 100644 --- a/tools/testing/selftests/kvm/include/ucall_common.h +++ b/tools/testing/selftests/kvm/include/ucall_common.h @@ -34,6 +34,8 @@ struct ucall_ops { void (*uninit)(struct kvm_vm *vm); void (*send_cmd)(struct ucall *uc); uint64_t (*recv_cmd)(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc); + void (*send_cmd_shared)(struct ucall *uc); + uint64_t (*recv_cmd_shared)(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc); };
void ucall_init(struct kvm_vm *vm, void *arg); @@ -42,6 +44,9 @@ void ucall_init_ops(struct kvm_vm *vm, void *arg, const struct ucall_ops *ops); void ucall_uninit_ops(struct kvm_vm *vm); void ucall(uint64_t cmd, int nargs, ...); uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc); +vm_vaddr_t ucall_shared_alloc(struct kvm_vm *vm, int count); +void ucall_shared(struct ucall *uc, uint64_t cmd, int nargs, ...); +uint64_t get_ucall_shared(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc);
#define GUEST_SYNC_ARGS(stage, arg1, arg2, arg3, arg4) \ ucall(UCALL_SYNC, 6, "hello", stage, arg1, arg2, arg3, arg4) diff --git a/tools/testing/selftests/kvm/lib/ucall_common.c b/tools/testing/selftests/kvm/lib/ucall_common.c index db0129edcbc1..8e5738241a7c 100644 --- a/tools/testing/selftests/kvm/lib/ucall_common.c +++ b/tools/testing/selftests/kvm/lib/ucall_common.c @@ -93,3 +93,41 @@ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
 	return ucall_ops->recv_cmd(vm, vcpu_id, uc);
 }
+
+/* Allocate shared memory within a guest for a shared ucall buffer. */
+vm_vaddr_t ucall_shared_alloc(struct kvm_vm *vm, int count)
+{
+	return vm_vaddr_alloc(vm, count * sizeof(struct ucall),
+			      vm_get_page_size(vm));
+}
+
+/*
+ * Populate a shared ucall buffer previously allocated by ucall_shared_alloc()
+ * and then generate an exit to host userspace.
+ */
+void ucall_shared(struct ucall *uc, uint64_t cmd, int nargs, ...)
+{
+	va_list va;
+
+	if (!ucall_ops->send_cmd_shared)
+		return;
+
+	va_start(va, nargs);
+	ucall_process_args(uc, cmd, nargs, va);
+	va_end(va);
+
+	ucall_ops->send_cmd_shared(uc);
+}
+
+/*
+ * Parse a ucall buffer that has been allocated by ucall_shared_alloc() and
+ * shared with the guest in advance to determine the ucall message/command that
+ * was sent by the guest.
+ */
+uint64_t get_ucall_shared(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
+{
+	if (!ucall_ops->recv_cmd_shared)
+		return UCALL_NOT_IMPLEMENTED;
+
+	return ucall_ops->recv_cmd_shared(vm, vcpu_id, uc);
+}
While the PIO-based ucall implementation won't actually be ideal for SEV guests, due to the requirements/constraints on how guest code would need to handle the resulting #VC exceptions generated by PIO instructions, it does provide a simple way to write/convert tests using the new ucall_shared() interfaces so that they can be more easily transitioned to handle running as confidential guests in the future by selecting a different ucall_ops implementation.
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 tools/testing/selftests/kvm/lib/x86_64/ucall.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)
diff --git a/tools/testing/selftests/kvm/lib/x86_64/ucall.c b/tools/testing/selftests/kvm/lib/x86_64/ucall.c index f5d9aba0d803..157d2a102547 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/ucall.c +++ b/tools/testing/selftests/kvm/lib/x86_64/ucall.c @@ -38,10 +38,23 @@ static uint64_t ucall_ops_pio_recv_cmd(struct kvm_vm *vm, uint32_t vcpu_id, return ucall.cmd; }
+static uint64_t ucall_ops_pio_recv_cmd_shared(struct kvm_vm *vm, uint32_t vcpu_id, + struct ucall *uc) +{ + struct kvm_run *run = vcpu_state(vm, vcpu_id); + + if (run->exit_reason == KVM_EXIT_IO && run->io.port == UCALL_PIO_PORT) + vcpu_run_complete_io(vm, vcpu_id); + + return uc->cmd; +} + const struct ucall_ops ucall_ops_pio = { .name = "PIO", .send_cmd = ucall_ops_pio_send_cmd, .recv_cmd = ucall_ops_pio_recv_cmd, + .send_cmd_shared = ucall_ops_pio_send_cmd, + .recv_cmd_shared = ucall_ops_pio_recv_cmd_shared, };
const struct ucall_ops ucall_ops_default = ucall_ops_pio;
With the shared ucall interfaces, the only thing a ucall implementation really needs to do is generate an exit to host userspace so that the shared ucall struct can be examined. This implementation uses a 'halt' instruction to generate such an exit, and is suitable for use with SEV guests and potentially other confidential guest implementations.
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 .../selftests/kvm/include/x86_64/ucall.h      |  1 +
 .../testing/selftests/kvm/lib/x86_64/ucall.c  | 23 +++++++++++++++++++
 2 files changed, 24 insertions(+)
diff --git a/tools/testing/selftests/kvm/include/x86_64/ucall.h b/tools/testing/selftests/kvm/include/x86_64/ucall.h index 8366bdc9c04e..457fc1406746 100644 --- a/tools/testing/selftests/kvm/include/x86_64/ucall.h +++ b/tools/testing/selftests/kvm/include/x86_64/ucall.h @@ -12,6 +12,7 @@ #include "ucall_common.h"
extern const struct ucall_ops ucall_ops_pio; +extern const struct ucall_ops ucall_ops_halt;
extern const struct ucall_ops ucall_ops_default;
diff --git a/tools/testing/selftests/kvm/lib/x86_64/ucall.c b/tools/testing/selftests/kvm/lib/x86_64/ucall.c index 157d2a102547..4dfb12881434 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/ucall.c +++ b/tools/testing/selftests/kvm/lib/x86_64/ucall.c @@ -57,4 +57,27 @@ const struct ucall_ops ucall_ops_pio = { .recv_cmd_shared = ucall_ops_pio_recv_cmd_shared, };
+static void ucall_ops_halt_send_cmd_shared(struct ucall *uc) +{ + asm volatile("hlt" : : : "memory"); +} + +static uint64_t ucall_ops_halt_recv_cmd_shared(struct kvm_vm *vm, uint32_t vcpu_id, + struct ucall *uc) +{ + struct kvm_run *run = vcpu_state(vm, vcpu_id); + + TEST_ASSERT(run->exit_reason == KVM_EXIT_HLT, + "unexpected exit reason: %u (%s)", + run->exit_reason, exit_reason_str(run->exit_reason)); + + return uc->cmd; +} + +const struct ucall_ops ucall_ops_halt = { + .name = "halt", + .send_cmd_shared = ucall_ops_halt_send_cmd_shared, + .recv_cmd_shared = ucall_ops_halt_recv_cmd_shared, +}; + const struct ucall_ops ucall_ops_default = ucall_ops_pio;
Introduce GUEST_SHARED_* macros, which are mostly analogous to existing GUEST_SYNC/GUEST_ASSERT/etc macros used to simplify guest code that uses ucall for host/guest synchronization.
There are also some new CHECK_SHARED_* macros intended to provide similar helpers in host code that can pair directly with the guest versions.
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 .../selftests/kvm/include/ucall_common.h | 68 +++++++++++++++++++
 1 file changed, 68 insertions(+)
diff --git a/tools/testing/selftests/kvm/include/ucall_common.h b/tools/testing/selftests/kvm/include/ucall_common.h index ae0e8eec9734..b9b220dbd6c3 100644 --- a/tools/testing/selftests/kvm/include/ucall_common.h +++ b/tools/testing/selftests/kvm/include/ucall_common.h @@ -48,6 +48,7 @@ vm_vaddr_t ucall_shared_alloc(struct kvm_vm *vm, int count); void ucall_shared(struct ucall *uc, uint64_t cmd, int nargs, ...); uint64_t get_ucall_shared(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc);
+/* Helpers for host/guest synchronization using ucall_shared */ #define GUEST_SYNC_ARGS(stage, arg1, arg2, arg3, arg4) \ ucall(UCALL_SYNC, 6, "hello", stage, arg1, arg2, arg3, arg4) #define GUEST_SYNC(stage) ucall(UCALL_SYNC, 2, "hello", stage) @@ -76,4 +77,71 @@ uint64_t get_ucall_shared(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
#define GUEST_ASSERT_EQ(a, b) __GUEST_ASSERT((a) == (b), #a " == " #b, 2, a, b)
+/* Helper macros for ucall synchronization via shared memory/ucall struct. */
+#define GUEST_SHARED_SYNC_ARGS(uc, stage, arg1, arg2, arg3, arg4) \
+	ucall_shared(uc, UCALL_SYNC, 6, "hello", stage, arg1, arg2, arg3, arg4)
+#define GUEST_SHARED_SYNC(uc, stage) \
+	ucall_shared(uc, UCALL_SYNC, 2, "hello", stage)
+#define GUEST_SHARED_DONE(uc) \
+	ucall_shared(uc, UCALL_DONE, 0)
+#define __GUEST_SHARED_ASSERT(uc, _condition, _condstr, _nargs, _args...) do { \
+	if (!(_condition)) \
+		ucall_shared(uc, UCALL_ABORT, 2 + _nargs, \
+			     "Failed guest assert: " \
+			     _condstr, __LINE__, _args); \
+} while (0)
+
+#define GUEST_SHARED_ASSERT(uc, _condition) \
+	__GUEST_SHARED_ASSERT(uc, _condition, #_condition, 0, 0)
+
+#define GUEST_SHARED_ASSERT_1(uc, _condition, arg1) \
+	__GUEST_SHARED_ASSERT(uc, _condition, #_condition, 1, (arg1))
+
+#define GUEST_SHARED_ASSERT_2(uc, _condition, arg1, arg2) \
+	__GUEST_SHARED_ASSERT(uc, _condition, #_condition, 2, (arg1), (arg2))
+
+#define GUEST_SHARED_ASSERT_3(uc, _condition, arg1, arg2, arg3) \
+	__GUEST_SHARED_ASSERT(uc, _condition, #_condition, 3, (arg1), (arg2), (arg3))
+
+#define GUEST_SHARED_ASSERT_4(uc, _condition, arg1, arg2, arg3, arg4) \
+	__GUEST_SHARED_ASSERT(uc, _condition, #_condition, 4, (arg1), (arg2), (arg3), (arg4))
+
+#define GUEST_SHARED_ASSERT_EQ(uc, a, b) \
+	__GUEST_SHARED_ASSERT(uc, (a) == (b), #a " == " #b, 2, a, b)
+
+#define __CHECK_SHARED_STATE(uc, uc_cmd, uc_cmd_expected) do { \
+	if (uc_cmd != uc_cmd_expected) { \
+		if (uc_cmd == UCALL_ABORT) \
+			TEST_FAIL("Unexpected guest abort: \"%s\" at %s:%ld", \
+				  (const char *)uc->args[0], __FILE__, \
+				  uc->args[1]); \
+		else \
+			TEST_FAIL("Unexpected ucall command/state: %" PRIu64, \
+				  uc_cmd); \
+	} \
+} while (0)
+
+#define CHECK_SHARED_SYNC(vm, vcpu_id, uc, stage) do { \
+	uint64_t uc_cmd = get_ucall_shared(vm, vcpu_id, uc); \
+	TEST_ASSERT(uc_cmd == UCALL_SYNC, \
+		    "Unexpected ucall command/state: %" PRIu64, uc_cmd); \
+	TEST_ASSERT(!strcmp((char *)uc->args[0], "hello"), \
+		    "Invalid ucall signature argument."); \
+	TEST_ASSERT(uc->args[1] == stage, \
+		    "Invalid ucall sync stage: %" PRIu64, uc->args[1]); \
+} while (0)
+
+#define CHECK_SHARED_DONE(vm, vcpu_id, uc) do { \
+	uint64_t uc_cmd = get_ucall_shared(vm, vcpu_id, uc); \
+	__CHECK_SHARED_STATE(uc, uc_cmd, UCALL_DONE); \
+	TEST_ASSERT(uc_cmd == UCALL_DONE, \
+		    "Unexpected ucall command/state: %" PRIu64, uc_cmd); \
+} while (0)
+
+#define CHECK_SHARED_ABORT(vm, vcpu_id, uc) do { \
+	uint64_t uc_cmd = get_ucall_shared(vm, vcpu_id, uc); \
+	TEST_ASSERT(uc_cmd == UCALL_ABORT, \
+		    "Unexpected ucall command/state: %" PRIu64, uc_cmd); \
+} while (0)
+
 #endif /* SELFTEST_KVM_UCALL_COMMON_H */
Tests will initialize the ucall implementation in 3 main ways, depending on the test/architecture:
1) by relying on a default ucall implementation being available without the need to do any additional setup

2) by calling ucall_init() to initialize the default ucall implementation

3) by using ucall_init_ops() to initialize a specific ucall implementation
and in each of these cases it may use the ucall implementation to execute the standard ucall()/get_ucall() interfaces, or the new ucall_shared()/get_ucall_shared() interfaces.
Implement a basic self-test to exercise ucall under all the scenarios that are applicable for a particular architecture.
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 tools/testing/selftests/kvm/.gitignore   |   1 +
 tools/testing/selftests/kvm/Makefile     |   3 +
 tools/testing/selftests/kvm/ucall_test.c | 182 +++++++++++++++++++++++
 3 files changed, 186 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/ucall_test.c
diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore index 3763105029fb..4a801cba9c62 100644 --- a/tools/testing/selftests/kvm/.gitignore +++ b/tools/testing/selftests/kvm/.gitignore @@ -57,3 +57,4 @@ /steal_time /kvm_binary_stats_test /system_counter_offset_test +/ucall_test diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 06a02b6fa907..412de8093e6c 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -88,6 +88,7 @@ TEST_GEN_PROGS_x86_64 += set_memory_region_test TEST_GEN_PROGS_x86_64 += steal_time TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test TEST_GEN_PROGS_x86_64 += system_counter_offset_test +TEST_GEN_PROGS_x86_64 += ucall_test
TEST_GEN_PROGS_aarch64 += aarch64/arch_timer TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions @@ -105,6 +106,7 @@ TEST_GEN_PROGS_aarch64 += rseq_test TEST_GEN_PROGS_aarch64 += set_memory_region_test TEST_GEN_PROGS_aarch64 += steal_time TEST_GEN_PROGS_aarch64 += kvm_binary_stats_test +TEST_GEN_PROGS_aarch64 += ucall_test
TEST_GEN_PROGS_s390x = s390x/memop TEST_GEN_PROGS_s390x += s390x/resets @@ -116,6 +118,7 @@ TEST_GEN_PROGS_s390x += kvm_page_table_test TEST_GEN_PROGS_s390x += rseq_test TEST_GEN_PROGS_s390x += set_memory_region_test TEST_GEN_PROGS_s390x += kvm_binary_stats_test +TEST_GEN_PROGS_s390x += ucall_test
 TEST_GEN_PROGS += $(TEST_GEN_PROGS_$(UNAME_M))
 LIBKVM += $(LIBKVM_$(UNAME_M))
diff --git a/tools/testing/selftests/kvm/ucall_test.c b/tools/testing/selftests/kvm/ucall_test.c
new file mode 100644
index 000000000000..f0e6e4e79786
--- /dev/null
+++ b/tools/testing/selftests/kvm/ucall_test.c
@@ -0,0 +1,182 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * ucall interface/implementation tests.
+ *
+ * Copyright (C) 2021 Advanced Micro Devices
+ */
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+
+#include "test_util.h"
+
+#include "kvm_util.h"
+#include "processor.h"
+
+#define VCPU_ID 2
+#define TOTAL_PAGES 512
+
+enum uc_test_type {
+	UC_TEST_WITHOUT_UCALL_INIT,
+	UC_TEST_WITH_UCALL_INIT,
+	UC_TEST_WITH_UCALL_INIT_OPS,
+	UC_TEST_WITH_UCALL_INIT_OPS_SHARED,
+	UC_TEST_MAX,
+};
+
+struct uc_test_config {
+	enum uc_test_type type;
+	const struct ucall_ops *ops;
+};
+
+static void test_ucall(void)
+{
+	GUEST_SYNC(1);
+	GUEST_SYNC(2);
+	GUEST_DONE();
+	GUEST_ASSERT(false);
+}
+
+static void check_ucall(struct kvm_vm *vm)
+{
+	struct ucall uc_tmp;
+
+	vcpu_run(vm, VCPU_ID);
+	TEST_ASSERT(get_ucall(vm, VCPU_ID, &uc_tmp) == UCALL_SYNC, "sync failed");
+
+	vcpu_run(vm, VCPU_ID);
+	TEST_ASSERT(get_ucall(vm, VCPU_ID, &uc_tmp) == UCALL_SYNC, "sync failed");
+
+	vcpu_run(vm, VCPU_ID);
+	TEST_ASSERT(get_ucall(vm, VCPU_ID, &uc_tmp) == UCALL_DONE, "done failed");
+
+	vcpu_run(vm, VCPU_ID);
+	TEST_ASSERT(get_ucall(vm, VCPU_ID, &uc_tmp) == UCALL_ABORT, "abort failed");
+}
+
+static void test_ucall_shared(struct ucall *uc)
+{
+	GUEST_SHARED_SYNC(uc, 1);
+	GUEST_SHARED_SYNC(uc, 2);
+	GUEST_SHARED_DONE(uc);
+	GUEST_SHARED_ASSERT(uc, false);
+}
+
+static void check_ucall_shared(struct kvm_vm *vm, struct ucall *uc)
+{
+	vcpu_run(vm, VCPU_ID);
+	CHECK_SHARED_SYNC(vm, VCPU_ID, uc, 1);
+
+	vcpu_run(vm, VCPU_ID);
+	CHECK_SHARED_SYNC(vm, VCPU_ID, uc, 2);
+
+	vcpu_run(vm, VCPU_ID);
+	CHECK_SHARED_DONE(vm, VCPU_ID, uc);
+
+	vcpu_run(vm, VCPU_ID);
+	CHECK_SHARED_ABORT(vm, VCPU_ID, uc);
+}
+
+static void __attribute__((__flatten__))
+guest_code(struct ucall *uc)
+{
+	if (uc)
+		test_ucall_shared(uc);
+	else
+		test_ucall();
+}
+
+static struct kvm_vm *setup_vm(void)
+{
+	struct kvm_vm *vm;
+
+	vm = vm_create(VM_MODE_DEFAULT, 0, O_RDWR);
+	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, 0, 0, TOTAL_PAGES, 0);
+
+	/* Set up VCPU and initial guest kernel. */
+	vm_vcpu_add_default(vm, VCPU_ID, guest_code);
+	kvm_vm_elf_load(vm, program_invocation_name);
+
+	return vm;
+}
+
+static void setup_vm_args(struct kvm_vm *vm, vm_vaddr_t uc_gva)
+{
+	vcpu_args_set(vm, VCPU_ID, 1, uc_gva);
+}
+
+static void run_ucall_test(const struct uc_test_config *config)
+{
+	struct kvm_vm *vm = setup_vm();
+	const struct ucall_ops *ops = config->ops;
+	bool is_default_ops = (!ops || ops == &ucall_ops_default);
+	bool shared = (config->type == UC_TEST_WITH_UCALL_INIT_OPS_SHARED);
+
+	pr_info("Testing ucall%s ops for: %s%s\n",
+		shared ? "_shared" : "",
+		ops ? ops->name : "unspecified",
+		is_default_ops ? " (via default)" : "");
+
+	if (config->type == UC_TEST_WITH_UCALL_INIT)
+		ucall_init(vm, NULL);
+	else if (config->type == UC_TEST_WITH_UCALL_INIT_OPS ||
+		 config->type == UC_TEST_WITH_UCALL_INIT_OPS_SHARED)
+		ucall_init_ops(vm, NULL, config->ops);
+
+	if (shared) {
+		struct ucall *uc;
+		vm_vaddr_t uc_gva;
+
+		/* Set up ucall buffer. */
+		uc_gva = ucall_shared_alloc(vm, 1);
+		uc = addr_gva2hva(vm, uc_gva);
+
+		setup_vm_args(vm, uc_gva);
+		check_ucall_shared(vm, uc);
+	} else {
+		setup_vm_args(vm, 0);
+		check_ucall(vm);
+	}
+
+	if (config->type == UC_TEST_WITH_UCALL_INIT)
+		ucall_uninit(vm);
+	else if (config->type == UC_TEST_WITH_UCALL_INIT_OPS ||
+		 config->type == UC_TEST_WITH_UCALL_INIT_OPS_SHARED)
+		ucall_uninit_ops(vm);
+
+	kvm_vm_free(vm);
+}
+
+static const struct uc_test_config test_configs[] = {
+#if defined(__x86_64__)
+	{ UC_TEST_WITHOUT_UCALL_INIT, NULL },
+	{ UC_TEST_WITH_UCALL_INIT, NULL },
+	{ UC_TEST_WITH_UCALL_INIT_OPS, &ucall_ops_default },
+	{ UC_TEST_WITH_UCALL_INIT_OPS, &ucall_ops_pio },
+	{ UC_TEST_WITH_UCALL_INIT_OPS_SHARED, &ucall_ops_pio },
+	{ UC_TEST_WITH_UCALL_INIT_OPS_SHARED, &ucall_ops_halt },
+#elif defined(__aarch64__)
+	{ UC_TEST_WITH_UCALL_INIT, NULL },
+	{ UC_TEST_WITH_UCALL_INIT_OPS, &ucall_ops_default },
+	{ UC_TEST_WITH_UCALL_INIT_OPS, &ucall_ops_mmio },
+#elif defined(__s390x__)
+	{ UC_TEST_WITHOUT_UCALL_INIT, NULL },
+	{ UC_TEST_WITH_UCALL_INIT, NULL },
+	{ UC_TEST_WITH_UCALL_INIT_OPS, &ucall_ops_default },
+	{ UC_TEST_WITH_UCALL_INIT_OPS, &ucall_ops_diag501 },
+#endif
+	{ UC_TEST_MAX, NULL },
+};
+
+int main(int argc, char *argv[])
+{
+	int i;
+
+	for (i = 0; test_configs[i].type != UC_TEST_MAX; i++)
+		run_ucall_test(&test_configs[i]);
+
+	return 0;
+}
On 12/10/21 17:46, Michael Roth wrote:
These patches are based on kvm/next, and are also available at:
https://github.com/mdroth/linux/commits/sev-selftests-ucall-rfc1
Thanks, this is a nice implementation of the concept. I'll check s390 before the next merge window, as I intend to merge this and the SEV tests (which I have already confirmed to work; I haven't yet checked SEV-ES because it's a bit harder for me to install new kernels on the SEV-ES machine I have access to).
Paolo
== BACKGROUND ==
These patches are a prerequisite for adding selftest support for SEV guests and possibly other confidential computing implementations in the future. They were motivated by a suggestion Paolo made in response to the initial SEV selftest RFC:
https://lore.kernel.org/lkml/20211025035833.yqphcnf5u3lk4zgg@amd.com/T/#m959...
Since the changes touch multiple archs and ended up creating a bit more churn than expected, I thought it would be a good idea to carve this out into a separate standalone series for reviewers who may be more interested in the ucall changes than anything SEV-related.
To summarize, x86 relies on a ucall implementation that uses PIO instructions to generate an exit to userspace and to provide the GVA of a dynamically-allocated ucall struct that resides in guest memory and contains information about how to handle/interpret the exit. This doesn't work for SEV guests for 3 main reasons:
- The guest memory is generally encrypted during run-time, so the guest needs to ensure the ucall struct is allocated in shared memory.
- The guest page table is also encrypted, so the address would need to be a GPA instead of a GVA.
- The guest vCPU register may also be encrypted in the case of SEV-ES/SEV-SNP, so the approach of examining vCPU register state has additional requirements such as requiring guest code to implement a #VC handler that can provide the appropriate registers via a vmgexit.
To address these issues, the SEV selftest RFC1 patchset introduced a set of new SEV-specific interfaces that closely mirrored the functionality of ucall()/get_ucall(), but relied on a pre-allocated/static ucall buffer in shared guest memory so that guest code could pass messages/state to the host by simply writing to this pre-arranged shared memory region and then generating an exit to userspace (via a halt instruction).
Paolo suggested instead implementing support for test/guest-specific ucall implementations that could be used as an alternative to the default PIO-based ucall implementation as needed, based on test/guest requirements, while still allowing tests to use a common set of interfaces like ucall()/get_ucall().
== OVERVIEW ==
This series implements the above functionality by introducing a new ucall_ops struct that can be used to register a particular ucall implementation as needed, then re-implements x86/arm64/s390x in terms of ucall_ops.
But for the purposes of introducing a new ucall_ops implementation appropriate for SEV, there are a couple of issues that resulted in the need for some additional ucall interfaces as well:
a) ucall() doesn't take a pointer to the ucall struct it modifies, so to make it work with an implementation that relies on a pre-allocated ucall struct in shared guest memory, some sort of global lookup functionality would be needed to locate the appropriate ucall struct for a particular VM/vcpu combination, and this would need to be made accessible to the guest as well. Guests would then need some way of determining what VM/vcpu identifiers they need to use to do the lookup, which to do reliably would likely require seeding the guest with those identifiers in advance. That is possible, but it is much more easily achieved by simply adding a ucall() alternative that accepts a pointer to the ucall struct for that particular VM/vcpu.
b) get_ucall() *does* take a pointer to a ucall struct, but currently zeroes it out and uses it to copy the guest's ucall struct into. It *could* be re-purposed to handle the case where the pointer is an actual pointer to the ucall struct in shared guest memory, but that could cause problems since callers would need some idea of what the underlying ucall implementation expects. Ideally the interfaces would be agnostic to the ucall implementation.
So to address those issues, this series also allows ucall implementations to optionally be extended to support a set of 'shared' ops that are used in the following manner:
host:
    uc_gva = ucall_shared_alloc()
    setup_vm_args(vm, uc_gva)

guest:
    ucall_shared(uc_gva, ...)

host:
    get_ucall_shared(uc_gva, ...)
and then implements a new ucall implementation, ucall_ops_halt, based around these shared interfaces and halt instructions.
While this doesn't really meet the initial goal of re-using the existing ucall interfaces as-is, the hope is that these *_shared interfaces are general enough to be re-usable for things other than SEV, or at least improve code readability over the initial SEV-specific interfaces.
Any review/comments are greatly appreciated!
Michael Roth (10):
  kvm: selftests: move base kvm_util.h declarations to kvm_util_base.h
  kvm: selftests: move ucall declarations into ucall_common.h
  kvm: selftests: introduce ucall_ops for test/arch-specific ucall implementations
  kvm: arm64: selftests: use ucall_ops to define default ucall implementation (COMPILE-TESTED ONLY)
  kvm: s390: selftests: use ucall_ops to define default ucall implementation
  kvm: selftests: add ucall interfaces based around shared memory
  kvm: selftests: add ucall_shared ops for PIO
  kvm: selftests: introduce ucall implementation based on halt instructions
  kvm: selftests: add GUEST_SHARED_* macros for shared ucall implementations
  kvm: selftests: add ucall_test to test various ucall functionality
 tools/testing/selftests/kvm/.gitignore             |   1 +
 tools/testing/selftests/kvm/Makefile               |   5 +-
 .../testing/selftests/kvm/include/aarch64/ucall.h  |  18 +
 tools/testing/selftests/kvm/include/kvm_util.h     | 408 +--------------------
 .../testing/selftests/kvm/include/kvm_util_base.h  | 368 +++++++++++++++++++
 tools/testing/selftests/kvm/include/s390x/ucall.h  |  18 +
 tools/testing/selftests/kvm/include/ucall_common.h | 147 ++++++++
 tools/testing/selftests/kvm/include/x86_64/ucall.h |  19 +
 tools/testing/selftests/kvm/lib/aarch64/ucall.c    |  43 +--
 tools/testing/selftests/kvm/lib/s390x/ucall.c      |  45 +--
 tools/testing/selftests/kvm/lib/ucall_common.c     | 133 +++++++
 tools/testing/selftests/kvm/lib/x86_64/ucall.c     |  82 +++--
 tools/testing/selftests/kvm/ucall_test.c           | 182 +++++++++
 13 files changed, 982 insertions(+), 487 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/aarch64/ucall.h
 create mode 100644 tools/testing/selftests/kvm/include/kvm_util_base.h
 create mode 100644 tools/testing/selftests/kvm/include/s390x/ucall.h
 create mode 100644 tools/testing/selftests/kvm/include/ucall_common.h
 create mode 100644 tools/testing/selftests/kvm/include/x86_64/ucall.h
 create mode 100644 tools/testing/selftests/kvm/lib/ucall_common.c
 create mode 100644 tools/testing/selftests/kvm/ucall_test.c
On Fri, Dec 10, 2021, Michael Roth wrote:
To summarize, x86 relies on a ucall implementation that uses PIO instructions to generate an exit to userspace and to provide the GVA of a dynamically-allocated ucall struct that resides in guest memory and contains information about how to handle/interpret the exit. This doesn't work for SEV guests for 3 main reasons:
- The guest memory is generally encrypted during run-time, so the guest needs to ensure the ucall struct is allocated in shared memory.
- The guest page table is also encrypted, so the address would need to be a GPA instead of a GVA.
- The guest vCPU register may also be encrypted in the case of SEV-ES/SEV-SNP, so the approach of examining vCPU register state has additional requirements such as requiring guest code to implement a #VC handler that can provide the appropriate registers via a vmgexit.
To address these issues, the SEV selftest RFC1 patchset introduced a set of new SEV-specific interfaces that closely mirrored the functionality of ucall()/get_ucall(), but relied on a pre-allocated/static ucall buffer in shared guest memory so that guest code could pass messages/state to the host by simply writing to this pre-arranged shared memory region and then generating an exit to userspace (via a halt instruction).

Paolo suggested instead implementing support for test/guest-specific ucall implementations that could be used as an alternative to the default PIO-based ucall implementation as needed, based on test/guest requirements, while still allowing tests to use a common set of interfaces like ucall()/get_ucall().
This all seems way more complicated than it needs to be. HLT is _worse_ than PIO on x86 because it triggers a userspace exit if and only if the local APIC is not in-kernel. That is bound to bite someone. The only issue with SEV is the address, not the VM-Exit mechanism. That doesn't change with SEV-ES, SEV-SNP, or TDX, as PIO and HLT will both get reflected as #VC/#VE, i.e. the guest side needs to be updated to use VMGEXIT/TDCALL no matter what, at which point having the hypercall request PIO emulation is just as easy as requesting HLT.
I also don't like having to differentiate between a "shared" and "regular" ucall. I kind of like having to explicitly pass the ucall object being used, but that puts undue burden on simple single-vCPU tests.
The inability to read guest private memory is really the only issue, and that can be easily solved without completely revamping the ucall framework, and without having to update a huge pile of tests to make them play nice with private memory.
This would also be a good opportunity to clean up the stupidity of tests having to manually call ucall_init(), drop the unused/pointless @arg from ucall_init(), and maybe even fix arm64's lurking landmine of not being SMP safe (the address is shared by all vCPUs).
To reduce the burden on tests and avoid ordering issues with creating vCPUs, allocate a ucall struct for every possible vCPU when the VM is created and stuff the GPA of the struct in the struct itself so that the guest can communicate the GPA instead of the GVA. Then confidential VMs just need to make all structs shared.
If all architectures have a way to access a vCPU ID, the ucall structs could be stored as a simple array. If not, a list based allocator would probably suffice.
E.g. something like this, except the list management is in common code instead of x86, and also delete all the per-test ucall_init() calls.
diff --git a/tools/testing/selftests/kvm/lib/x86_64/ucall.c b/tools/testing/selftests/kvm/lib/x86_64/ucall.c
index a3489973e290..9aab6407bd42 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/ucall.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/ucall.c
@@ -8,19 +8,59 @@

 #define UCALL_PIO_PORT ((uint16_t)0x1000)

-void ucall_init(struct kvm_vm *vm, void *arg)
+static struct list_head *ucall_list;
+
+void ucall_init(struct kvm_vm *vm)
 {
+	struct ucall *ucalls;
+	int nr_cpus = kvm_check_cap(KVM_CAP_MAX_VCPUS);
+	int i;
+
+	TEST_ASSERT(!ucall_list, "ucall() can only be used by one VM at a time");
+
+	INIT_LIST_HEAD(&vm->ucall_list);
+
+	ucalls = vm_vaddr_alloc(nr_cpus * sizeof(struct ucall));
+	ucall_make_shared(ucall_list, <size>);
+
+	for (i = 0; i < nr_cpus; i++) {
+		ucalls[i].gpa = addr_gva2gpa(vm, &ucalls[i]);
+
+		list_add(&vm->ucall_list, &ucalls[i].list)
+	}
+
+	ucall_list = &vm->ucall_list;
+	sync_global_to_guest(vm, ucall_list);
 }

 void ucall_uninit(struct kvm_vm *vm)
 {
+	ucall_list = NULL;
+	sync_global_to_guest(vm, ucall_list);
+}
+
+static struct ucall *ucall_alloc(void)
+{
+	struct ucall *uc;
+
+	/* Is there a lock primitive for the guest? */
+	lock_something(&ucall_lock);
+	uc = list_first_entry(ucall_list, struct ucall, list);
+
+	list_del(&uc->list);
+	unlock_something(&ucall_lock);
+}
+
+static void ucall_free(struct ucall *uc)
+{
+	lock_something(&ucall_lock);
+	list_add(&uc->list, ucall_list);
+	unlock_something(&ucall_lock);
 }

 void ucall(uint64_t cmd, int nargs, ...)
 {
-	struct ucall uc = {
-		.cmd = cmd,
-	};
+	struct ucall *uc = ucall_alloc();
 	va_list va;
 	int i;

@@ -32,7 +72,9 @@ void ucall(uint64_t cmd, int nargs, ...)
 	va_end(va);

 	asm volatile("in %[port], %%al"
-		: : [port] "d" (UCALL_PIO_PORT), "D" (&uc) : "rax", "memory");
+		: : [port] "d" (UCALL_PIO_PORT), "D" (uc->gpa) : "rax", "memory");
+
+	ucall_free(uc);
 }

 uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
@@ -47,7 +89,7 @@ uint64_t get_ucall(struct kvm_vm *vm, uint32_t vcpu_id, struct ucall *uc)
 	struct kvm_regs regs;

 	vcpu_regs_get(vm, vcpu_id, &regs);
-	memcpy(&ucall, addr_gva2hva(vm, (vm_vaddr_t)regs.rdi),
+	memcpy(&ucall, addr_gpa2hva(vm, (vm_paddr_t)regs.rdi),
 	       sizeof(ucall));

 	vcpu_run_complete_io(vm, vcpu_id);
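For reference, the guest-side ucall_alloc()/ucall_free() pair in the sketch above boils down to a simple free list. A runnable, userspace-only approximation (no locking, and with invented node/struct names) looks like:

```c
#include <assert.h>
#include <stddef.h>

struct ucall_node {
	struct ucall_node *next;
	unsigned long cmd;
};

/* Head of the free list; set up once per VM in the real proposal. */
static struct ucall_node *free_list;

/* Push n pre-allocated nodes onto the free list (ucall_init() analog). */
static void ucall_pool_add(struct ucall_node *nodes, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		nodes[i].next = free_list;
		free_list = &nodes[i];
	}
}

/* Pop a node for this ucall; NULL if every possible vCPU is mid-ucall. */
static struct ucall_node *ucall_alloc(void)
{
	struct ucall_node *uc = free_list;

	if (uc)
		free_list = uc->next;
	return uc;
}

/* Return the node to the pool once the host has consumed it. */
static void ucall_free(struct ucall_node *uc)
{
	uc->next = free_list;
	free_list = uc;
}
```

The real version would need the guest-safe lock the sketch asks about, since multiple vCPUs can be in ucall() concurrently.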
On Thu, Dec 30, 2021 at 09:11:12PM +0000, Sean Christopherson wrote:
On Fri, Dec 10, 2021, Michael Roth wrote:
To summarize, x86 relies on a ucall that uses PIO instructions to generate an exit to userspace and provide the GVA of a dynamically-allocated ucall struct that resides in guest memory and contains information about how to handle/interpret the exit. This doesn't work for SEV guests for 3 main reasons:
- The guest memory is generally encrypted during run-time, so the guest needs to ensure the ucall struct is allocated in shared memory.
- The guest page table is also encrypted, so the address would need to be a GPA instead of a GVA.
- The guest vCPU register may also be encrypted in the case of SEV-ES/SEV-SNP, so the approach of examining vCPU register state has additional requirements such as requiring guest code to implement a #VC handler that can provide the appropriate registers via a vmgexit.
To address these issues, the SEV selftest RFC1 patchset introduced a set of new SEV-specific interfaces that closely mirrored the functionality of ucall()/get_ucall(), but relied on a pre-allocated/static ucall buffer in shared guest memory so that guest code could pass messages/state to the host by simply writing to this pre-arranged shared memory region and then generating an exit to userspace (via a halt instruction).
Paolo suggested instead implementing support for test/guest-specific ucall implementations that could be used as an alternative to the default PIO-based ucall implementations as-needed based on test/guest requirements, while still allowing for tests to use a common set of interfaces like ucall()/get_ucall().
This all seems way more complicated than it needs to be. HLT is _worse_ than PIO on x86 because it triggers a userspace exit if and only if the local APIC is not in-kernel. That is bound to bite someone.
Hmmm, fair point. It's easy for me to just not use in-kernel APIC in the current SEV tests to avoid the issue, but HLT is being made available as an available ucall implementation for other tests as well, and given in-kernel APIC is set up automatically maybe it's not robust enough.
The only issue with SEV is the address, not the VM-Exit mechanism. That doesn't change with SEV-ES, SEV-SNP, or TDX, as PIO and HLT will both get reflected as #VC/#VE, i.e. the guest side needs to be updated to use VMGEXIT/TDCALL no matter what, at which point having the hypercall request PIO emulation is just as easy as requesting HLT.
I'm not aware of any #VC handling needed for HLT in the case of SEV-ES/SEV-SNP. That was one of the reasons for the SEV tests using this ucall implementation. Of course, at some point, we'd want full support for PIO/MMIO/etc. in the #VC handler, but it's not something I'd planned on adding until after the SEV-SNP tests, since it seems like we'd need to import a bunch of instruction decoding code from elsewhere in the kernel, which is a lot of churn that's not immediately necessary for getting at least some basic tests in place. Since the HLT implementation is only 20 lines of code it seemed like a reasonable stop-gap until we start getting more CoCo tests in place. But the in-kernel APIC issue probably needs more consideration...
Perhaps for *just* PIO, the instruction decoding can be open-coded so it can be added to the initial #VC handler implementation, which would avoid the need for the HLT implementation. I'll take a look at that.
I also don't like having to differentiate between a "shared" and "regular" ucall. I kind of like having to explicitly pass the ucall object being used, but that puts undue burden on simple single-vCPU tests.
I tried to avoid it, but I got hung up on the fact that pre-allocating arrays/lists of ucall structs needs to be done for each VM, and so we'd end up needing some way for a guest to identify which pool its ucall struct should be allocated from. But you've gotten around that by just sync_global_to_guest()'ing for each pool at the time ucall_init() is called, so the guest only ever sees its particular pool. Then the switch from writing GVA to writing GPA solves the translation problem. Nice.
The inability to read guest private memory is really the only issue, and that can be easily solved without completely revamping the ucall framework, and without having to update a huge pile of tests to make them play nice with private memory.
I think the first 5 patches in this series are still relevant cleanups vs. having a complete standalone ucall implementation for each arch, and Andrew has also already started looking at other header cleanups related to patch #1, so maybe Paolo would still like to queue those. Would also provide a better starting point for having a centralized allocator for the ucall structs, which you hinted at wanting below.
But the subsequent patches that add the ucall_shared() interfaces should probably be set aside for now in favor of your proposal.
This would also be a good opportunity to clean up the stupidity of tests having to manually call ucall_init(), drop the unused/pointless @arg from ucall_init(), and maybe even fix arm64's lurking landmine of not being SMP safe (the address is shared by all vCPUs).
I thought you *didn't* want to update a huge pile of tests :) I suppose it's unavoidable, since with your proposal, having something like ucall_init() being called at some point is required, as opposed to the current implementation where it is optional. Are you intending to have it be called automatically by vm_create*()?
To reduce the burden on tests and avoid ordering issues with creating vCPUs, allocate a ucall struct for every possible vCPU when the VM is created and stuff the GPA of the struct in the struct itself so that the guest can communicate the GPA instead of the GVA. Then confidential VMs just need to make all structs shared.
So a separate call like:
ucall_make_shared(vm->ucall_list)
? Might need some good documentation/assertions to make sure it gets called at the right place for confidential VMs, and may need some extra hooks in SEV selftest implementation for switching from private to shared after the memory has already been allocated, but seems reasonable.
If all architectures have a way to access a vCPU ID, the ucall structs could be stored as a simple array. If not, a list based allocator would probably suffice.
I think the list allocator is nicer: generating #VCs for both the PIO and the cpuid checks for vCPU lookup seems like a lot of extra noise to sift through while debugging where an errant test is failing, and the list doesn't seem to have any disadvantage vs. an array.
Thanks,
Mike
On Tue, Jan 04, 2022, Michael Roth wrote:
On Thu, Dec 30, 2021 at 09:11:12PM +0000, Sean Christopherson wrote:
The only issue with SEV is the address, not the VM-Exit mechanism. That doesn't change with SEV-ES, SEV-SNP, or TDX, as PIO and HLT will both get reflected as #VC/#VE, i.e. the guest side needs to be updated to use VMGEXIT/TDCALL no matter what, at which point having the hypercall request PIO emulation is just as easy as requesting HLT.
I'm not aware of any #VC handling needed for HLT in the case of SEV-ES/SEV-SNP. That was one of the reasons for the SEV tests using this ucall implementation.
Ah, you're right, HLT is an "automatic" exit and the CPU takes care of adjusting RIP. TDX is the only one that requires a hypercall.
Of course, at some point, we'd want full support for PIO/MMIO/etc. in the #VC handler, but it's not something I'd planned on adding until after the SEV-SNP tests, since it seems like we'd need to import a bunch of instruction decoding code from elsewhere in the kernel, which is a lot of churn that's not immediately necessary for getting at least some basic tests in place. Since the HLT implementation is only 20 lines of code it seemed like a reasonable stop-gap until we start getting more CoCo tests in place. But the in-kernel APIC issue probably needs more consideration...
Perhaps for *just* PIO, the instruction decoding can be open-coded so it can be added to the initial #VC handler implementation, which would avoid the need for the HLT implementation. I'll take a look at that.
PIO shouldn't require instruction decoding or a #VC handler. What I was thinking is that the guest in the selftest would make a direct #VMGEXIT/TDCALL to request PIO instead of executing an OUT.
I also don't like having to differentiate between a "shared" and "regular" ucall. I kind of like having to explicitly pass the ucall object being used, but that puts undue burden on simple single-vCPU tests.
I tried to avoid it, but I got hung up on that fact that pre-allocating arrays/lists of ucall structs needs to be done for each VM, and so we'd end up needing some way for a guest to identify which pool it's ucall struct should be allocated from. But you've gotten around that by just sync_global_to_guest()'ing for each pool at the time ucall_init() is called, so the guest only ever sees it's particular pool. Then the switch from writing GVA to writing GPA solves the translation problem. Nice.
The inability to read guest private memory is really the only issue, and that can be easily solved without completely revamping the ucall framework, and without having to update a huge pile of tests to make them play nice with private memory.
I think the first 5 patches in this series are still relevant cleanups vs. having a complete standalone ucall implementation for each arch, and Andrew has also already started looking at other header cleanups related to patch #1, so maybe Paolo would still like to queue those. Would also provide a better starting point for having a centralized allocator for the ucall structs, which you hinted at wanting below.
But the subsequent patches that add the ucall_shared() interfaces should probably be set aside for now in favor of your proposal.
This would also be a good opportunity to clean up the stupidity of tests having to manually call ucall_init(), drop the unused/pointless @arg from ucall_init(), and maybe even fix arm64's lurking landmine of not being SMP safe (the address is shared by all vCPUs).
I thought you *didn't* want to update a huge pile of tests :) I suppose it's unavoidable, since with your proposal, having something like ucall_init() being called at some point is required, as opposed to the current implementation where it is optional. Are you intending to have it be called automatically by vm_create*()?
Yeah, I was thinking it could be done at the lowest level vm_create() helper. We'll need to expand vm_create() (or add yet another layer to avoid modifying a pile of tests) to allow opting out of initializing ucall, e.g. sev_migrate_tests.c needs to create multiple concurrent VMs, but happily doesn't need ucall support.
To reduce the burden on tests and avoid ordering issues with creating vCPUs, allocate a ucall struct for every possible vCPU when the VM is created and stuff the GPA of the struct in the struct itself so that the guest can communicate the GPA instead of the GVA. Then confidential VMs just need to make all structs shared.
So a separate call like:
ucall_make_shared(vm->ucall_list)
? Might need some good documentation/assertions to make sure it gets called at the right place for confidential VMs, and may need some extra hooks in SEV selftest implementation for switching from private to shared after the memory has already been allocated, but seems reasonable.
Again, I was thinking that it would be done unconditionally by ucall_init(), i.e. would be automatically handled by the selftest framework and would Just Work for individual tests.
If all architectures have a way to access a vCPU ID, the ucall structs could be stored as a simple array. If not, a list based allocator would probably suffice.
I think the list allocator is nicer: generating #VCs for both the PIO and the cpuid checks for vCPU lookup seems like a lot of extra noise to sift through while debugging where an errant test is failing, and the list doesn't seem to have any disadvantage vs. an array.
Ah, right, I forgot that querying the vCPU ID would require a hypercall.
On Wed, Jan 05, 2022 at 12:17:33AM +0000, Sean Christopherson wrote:
On Tue, Jan 04, 2022, Michael Roth wrote:
On Thu, Dec 30, 2021 at 09:11:12PM +0000, Sean Christopherson wrote:
The only issue with SEV is the address, not the VM-Exit mechanism. That doesn't change with SEV-ES, SEV-SNP, or TDX, as PIO and HLT will both get reflected as #VC/#VE, i.e. the guest side needs to be updated to use VMGEXIT/TDCALL no matter what, at which point having the hypercall request PIO emulation is just as easy as requesting HLT.
I'm not aware of any #VC handling needed for HLT in the case of SEV-ES/SEV-SNP. That was one of the reasons for the SEV tests using this ucall implementation.
Ah, you're right, HLT is an "automatic" exit and the CPU takes care of adjusting RIP. TDX is the only one that requires a hypercall.
Of course, at some point, we'd want full support for PIO/MMIO/etc. in the #VC handler, but it's not something I'd planned on adding until after the SEV-SNP tests, since it seems like we'd need to import a bunch of instruction decoding code from elsewhere in the kernel, which is a lot of churn that's not immediately necessary for getting at least some basic tests in place. Since the HLT implementation is only 20 lines of code it seemed like a reasonable stop-gap until we start getting more CoCo tests in place. But the in-kernel APIC issue probably needs more consideration...
Perhaps for *just* PIO, the instruction decoding can be open-coded so it can be added to the initial #VC handler implementation, which would avoid the need for the HLT implementation. I'll take a look at that.
PIO shouldn't require instruction decoding or a #VC handler. What I was thinking is that the guest in the selftest would make a direct #VMGEXIT/TDCALL to request PIO instead of executing an OUT.
That seems like a nicer approach. But it sort of lends itself to having test-specific ucall implementations in some form. How are you thinking vm_create() should decide what implementation to use? With this series in place it could be something like:
vm_create(..., struct ucall_ops *ops)
  ucall_init_ops(ops)
and with the SEV selftests in their current form it would look something like:
sev_vm_create(...)
  vm_create_with_ucall(..., ops=ucall_ops_pio_vmgexit)
    ucall_init_ops(ops)
is that sort of what you're thinking, or something else?
I also don't like having to differentiate between a "shared" and "regular" ucall. I kind of like having to explicitly pass the ucall object being used, but that puts undue burden on simple single-vCPU tests.
I tried to avoid it, but I got hung up on the fact that pre-allocating arrays/lists of ucall structs needs to be done for each VM, and so we'd end up needing some way for a guest to identify which pool its ucall struct should be allocated from. But you've gotten around that by just sync_global_to_guest()'ing for each pool at the time ucall_init() is called, so the guest only ever sees its particular pool. Then the switch from writing GVA to writing GPA solves the translation problem. Nice.
The inability to read guest private memory is really the only issue, and that can be easily solved without completely revamping the ucall framework, and without having to update a huge pile of tests to make them play nice with private memory.
I think the first 5 patches in this series are still relevant cleanups vs. having a complete standalone ucall implementation for each arch, and Andrew has also already started looking at other header cleanups related to patch #1, so maybe Paolo would still like to queue those. Would also provide a better starting point for having a centralized allocator for the ucall structs, which you hinted at wanting below.
But the subsequent patches that add the ucall_shared() interfaces should probably be set aside for now in favor of your proposal.
This would also be a good opportunity to clean up the stupidity of tests having to manually call ucall_init(), drop the unused/pointless @arg from ucall_init(), and maybe even fix arm64's lurking landmine of not being SMP safe (the address is shared by all vCPUs).
I thought you *didn't* want to update a huge pile of tests :) I suppose it's unavoidable, since with your proposal, having something like ucall_init() being called at some point is required, as opposed to the current implementation where it is optional. Are you intending to have it be called automatically by vm_create*()?
Yeah, I was thinking it could be done at the lowest level vm_create() helper. We'll need to expand vm_create() (or add yet another layer to avoid modifying a pile of tests) to allow opting out of initializing ucall, e.g. sev_migrate_tests.c needs to create multiple concurrent VMs, but happily doesn't need ucall support.
Why does sev_migrate_tests need to opt out? Couldn't it use ucall_ops_pio_vmgexit like that SEV case above?
I ask because there is a ucall() in the exception handling code where some unhandled exceptions result in the guest automatically issuing a ucall(UCALL_UNHANDLED), so even when tests don't use ucall() they might still rely on it if they enable exception handling. So that might be an argument for always setting up at least the default ucall_ops_pio implementation and creating a pool just in case. (or an argument for dropping the UCALL_UNHANDLED handling).
To reduce the burden on tests and avoid ordering issues with creating vCPUs, allocate a ucall struct for every possible vCPU when the VM is created and stuff the GPA of the struct in the struct itself so that the guest can communicate the GPA instead of the GVA. Then confidential VMs just need to make all structs shared.
So a separate call like:
ucall_make_shared(vm->ucall_list)
? Might need some good documentation/assertions to make sure it gets called at the right place for confidential VMs, and may need some extra hooks in SEV selftest implementation for switching from private to shared after the memory has already been allocated, but seems reasonable.
Again, I was thinking that it would be done unconditionally by ucall_init(), i.e. would be automatically handled by the selftest framework and would Just Work for individual tests.
Ok, I'll have to think that through more. Currently with the SEV selftests as they are, we have:
sev_vm_create(policy, npages)
  vm = vm_create(...)
  vm_set_memory_encryption(vm, encrypt_by_default, enc_bit)
  // vm_vaddr_alloc_shared() can be used now
The ucall struct allocations would need to go through vm_vaddr_alloc_shared() to make sure the selftest library tracks/maps the pages as shared, but that vm_set_memory_encryption() happens too late if the ucall_init() stuff is done in vm_create(). It should be possible to pass the vm_set_memory_encryption() arguments directly to vm_create() to allow for what you're proposing, but I guess we'd need a new vm_create() wrapper that handles both the vm_set_memory_encryption() args, along with the ucall_ops above, something like:
sev_vm_create(policy, npages)
  vm = vm_create_coco(..., encrypt_by_default, enc_bit/shared_bit, ucall_ops)
Or were you thinking something else? Just trying to get an idea of how this will all need to tie in with the SEV selftests and what needs to change on that end.
On Wed, Jan 05, 2022, Michael Roth wrote:
On Wed, Jan 05, 2022 at 12:17:33AM +0000, Sean Christopherson wrote:
PIO shouldn't require instruction decoding or a #VC handler. What I was thinking is that the guest in the selftest would make a direct #VMGEXIT/TDCALL to request PIO instead of executing an OUT.
That seems like a nicer approach. But it sort of lends itself to having test-specific ucall implementations in some form. How are you thinking vm_create() should decide what implementation to use? With this series in place it could be something like:
vm_create(..., struct ucall_ops *ops)
  ucall_init_ops(ops)
and with the SEV selftests in their current form it would look something like:
sev_vm_create(...)
  vm_create_with_ucall(..., ops=ucall_ops_pio_vmgexit)
    ucall_init_ops(ops)
is that sort of what you're thinking, or something else?
I keep forgetting ucall() doesn't have access to the VM. But, since we're restricting ucall() to a single VM, we can have a global that sets ucall_ops during ucall_init() based on the VM type, or skip an ops and just open code the behavior in x86's ucall() by snapshotting the VM type. Either way, the goal is to avoid having to pass in ucall_ops at the test level.
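A minimal sketch of that "global ops chosen at init" shape, with invented names and no-op send functions standing in for the real OUT/VMGEXIT paths:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

struct ucall_ops {
	const char *name;
	void (*send)(uint64_t cmd);	/* guest->host exit mechanism */
};

static void send_pio(uint64_t cmd)     { (void)cmd; /* would execute OUT */ }
static void send_vmgexit(uint64_t cmd) { (void)cmd; /* would issue VMGEXIT */ }

static const struct ucall_ops ucall_ops_pio         = { "pio", send_pio };
static const struct ucall_ops ucall_ops_pio_vmgexit = { "pio_vmgexit", send_vmgexit };

/* Selected once at ucall_init() time; tests never see it. */
static const struct ucall_ops *ucall_ops;

enum vm_type { VM_TYPE_DEFAULT, VM_TYPE_SEV_ES };

static void ucall_init_type(enum vm_type type)
{
	ucall_ops = (type == VM_TYPE_SEV_ES) ? &ucall_ops_pio_vmgexit
					     : &ucall_ops_pio;
}
```

Tests then just call ucall(), and the framework picks the right exit mechanism from the VM type, which is the goal stated above.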
Yeah, I was thinking it could be done at the lowest level vm_create() helper. We'll need to expand vm_create() (or add yet another layer to avoid modifying a pile of tests) to allow opting out of initializing ucall, e.g. sev_migrate_tests.c needs to create multiple concurrent VMs, but happily doesn't need ucall support.
Why does sev_migrate_tests need to opt out? Couldn't it use ucall_ops_pio_vmgexit like that SEV case above?
Because it uses multiple VMs, and my rough sketch only allows for a single VM to use ucall. Though I suppose we could simply keep appending to the ucall list for every VM. The requirement would then be that all VMs are of the same type, i.e. utilize the same ucall_ops.
I ask because there is a ucall() in the exception handling code where some unhandled exceptions result in the guest automatically issuing a ucall(UCALL_UNHANDLED), so even when tests don't use ucall() they might still rely on it if they enable exception handling. So that might be an argument for always setting up at least the default ucall_ops_pio implementation and creating a pool just in case. (or an argument for dropping the UCALL_UNHANDLED handling).
The sev_migrate_tests don't even run a guest, hence the quick-and-dirty "solution". Though thinking toward the future, that may be too dirty as it would prevent tests from having multiple "real" VMs.
To reduce the burden on tests and avoid ordering issues with creating vCPUs, allocate a ucall struct for every possible vCPU when the VM is created and stuff the GPA of the struct in the struct itself so that the guest can communicate the GPA instead of the GVA. Then confidential VMs just need to make all structs shared.
So a separate call like:
ucall_make_shared(vm->ucall_list)
? Might need some good documentation/assertions to make sure it gets called at the right place for confidential VMs, and may need some extra hooks in SEV selftest implementation for switching from private to shared after the memory has already been allocated, but seems reasonable.
Again, I was thinking that it would be done unconditionally by ucall_init(), i.e. would be automatically handled by the selftest framework and would Just Work for individual tests.
Ok, I'll have to think that through more. Currently with the SEV selftests as they are, we have:
sev_vm_create(policy, npages)
  vm = vm_create(...)
  vm_set_memory_encryption(vm, encrypt_by_default, enc_bit)
  // vm_vaddr_alloc_shared() can be used now
The ucall struct allocations would need to go through vm_vaddr_alloc_shared() to make sure the selftest library tracks/maps the pages as shared, but that vm_set_memory_encryption() happens too late if the ucall_init() stuff is done in vm_create(). It should be possible to pass the vm_set_memory_encryption() arguments directly to vm_create() to allow for what you're proposing, but I guess we'd need a new vm_create() wrapper that handles both the vm_set_memory_encryption() args, along with the ucall_ops above, something like:
sev_vm_create(policy, npages)
  vm = vm_create_coco(..., encrypt_by_default, enc_bit/shared_bit, ucall_ops)
Or were you thinking something else? Just trying to get an idea of how this will all need to tie in with the SEV selftests and what needs to change on that end.
Hmm, I was thinking the selftest framework would only need to be told the VM type, e.g. DEFAULT, SEV, SEV-ES, SEV-SNP, or TDX, and would then handle setting everything up, e.g. enumerating the C-bit location and encrypting memory as needed.
One thought would be to extend "enum vm_guest_mode" with flags above NUM_VM_MODES to specify the VM type. That way tests that use VM_MODE_DEFAULT would continue to work without any updates.
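A rough sketch of encoding the VM type in bits above the mode, as suggested; all values and VM_TYPE_* names here are invented for illustration:

```c
#include <assert.h>

/* Stand-ins for the real enum vm_guest_mode; values illustrative. */
enum { VM_MODE_DEFAULT = 0, NUM_VM_MODES = 9 };

/* Type flags live above the mode bits, so plain modes keep working. */
#define VM_TYPE_SHIFT	8
#define VM_MODE_MASK	((1u << VM_TYPE_SHIFT) - 1)

#define VM_TYPE_DEFAULT	(0u << VM_TYPE_SHIFT)
#define VM_TYPE_SEV	(1u << VM_TYPE_SHIFT)
#define VM_TYPE_SEV_ES	(2u << VM_TYPE_SHIFT)

static unsigned int vm_mode(unsigned int m) { return m & VM_MODE_MASK; }
static unsigned int vm_type(unsigned int m) { return m & ~VM_MODE_MASK; }
```

Tests that pass VM_MODE_DEFAULT get VM_TYPE_DEFAULT for free, while confidential-VM tests OR in a type flag and the framework derives C-bit setup, ucall ops, etc. from vm_type().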
On Wed, Jan 05, 2022 at 05:43:21PM +0000, Sean Christopherson wrote:
On Wed, Jan 05, 2022, Michael Roth wrote:
On Wed, Jan 05, 2022 at 12:17:33AM +0000, Sean Christopherson wrote:
PIO shouldn't require instruction decoding or a #VC handler. What I was thinking is that the guest in the selftest would make a direct #VMGEXIT/TDCALL to request PIO instead of executing an OUT.
That seems like a nicer approach. But it sort of lends itself to having test-specific ucall implementations in some form. How are you thinking vm_create() should decide what implementation to use? With this series in place it could be something like:
vm_create(..., struct ucall_ops *ops)
  ucall_init_ops(ops)
and with the SEV selftests in their current form it would look something like:
sev_vm_create(...)
  vm_create_with_ucall(..., ops=ucall_ops_pio_vmgexit)
    ucall_init_ops(ops)
is that sort of what you're thinking, or something else?
I keep forgetting ucall() doesn't have access to the VM. But, since we're restricting ucall() to a single VM, we can have a global that sets ucall_ops during ucall_init() based on the VM type, or skip an ops and just open code the behavior in x86's ucall() by snapshotting the VM type. Either way, the goal is to avoid having to pass in ucall_ops at the test level.
Yeah, I was thinking it could be done at the lowest level vm_create() helper. We'll need to expand vm_create() (or add yet another layer to avoid modifying a pile of tests) to allow opting out of initializing ucall, e.g. sev_migrate_tests.c needs to create multiple concurrent VMs, but happily doesn't need ucall support.
Why does sev_migrate_tests need to opt out? Couldn't it use ucall_ops_pio_vmgexit like that SEV case above?
Because it uses multiple VMs, and my rough sketch only allows for a single VM to use ucall. Though I suppose we could simply keep appending to the ucall list for every VM. The requirement would then be that all VMs are of the same type, i.e. utilize the same ucall_ops.
Hmm, maybe I misread your patch. Not supporting multiple VMs was the reason I gave up on having the ucall structs allocated on-demand and went with requiring them to be passed as arguments to ucall().
I thought with your patch you had solved that by having each VM have its own pool, via vm->ucall_list, and then mapping each pool into each guest separately via:
ucall_init(vm):
  ucall_list = vm->ucall_list
  sync_global_to_guest(ucall_list)
then as long as that ucall_init() is done *after* kvm_vm_elf_load() is called for that VM, it will end up with a 'ucall_list' global that points to its own specific vm->ucall_list. Then on the test side it doesn't matter what the 'ucall_list' global is currently set to, since you have the GPA and know which VM exited.
Or am I missing something there?
Although even if that is the case, now that we're proposing doing the ucall_init() inside vm_create(), then we run the risk of a test calling kvm_vm_elf_load() after, which might clobber the guest's copy of the ucall_list global if ucall_init() had since been called for another VM. But that could maybe be worked around by having whatever vm_create() variant we use also do the kvm_vm_elf_load() unconditionally as part of creation.
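The snapshot semantics being relied on here can be modeled in a few lines of plain C, with sync_global_to_guest() reduced to a hypothetical per-VM copy of the host's global (all names illustrative):

```c
#include <assert.h>

/* Host-side global that each guest image also contains a copy of. */
static void *ucall_list;

struct vm {
	void *ucall_list_pool;		/* this VM's pre-allocated pool */
	void *guest_ucall_list;		/* the guest image's copy of the global */
};

/*
 * Stand-in for sync_global_to_guest(vm, ucall_list): snapshot the
 * host's current value of the global into this guest's image.
 */
static void sync_ucall_list_to_guest(struct vm *vm)
{
	vm->guest_ucall_list = ucall_list;
}

static void ucall_init(struct vm *vm)
{
	ucall_list = vm->ucall_list_pool;
	sync_ucall_list_to_guest(vm);
}
```

Because each guest keeps its snapshot, re-pointing the host global for a second VM doesn't disturb a VM that was already initialized; the clobbering risk above only arises if the guest image is reloaded after another VM's ucall_init().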
I ask because there is a ucall() in the exception handling code where some unhandled exceptions result in the guest automatically issuing a ucall(UCALL_UNHANDLED), so even when tests don't use ucall() they might still rely on it if they enable exception handling. So that might be an argument for always setting up at least the default ucall_ops_pio implementation and creating a pool just in case. (or an argument for dropping the UCALL_UNHANDLED handling).
The sev_migrate_tests don't even run a guest, hence the quick-and-dirty "solution". Though thinking toward the future, that may be too dirty as it would prevent tests from having multiple "real" VMs.
To reduce the burden on tests and avoid ordering issues with creating vCPUs, allocate a ucall struct for every possible vCPU when the VM is created and stuff the GPA of the struct in the struct itself so that the guest can communicate the GPA instead of the GVA. Then confidential VMs just need to make all structs shared.
So a separate call like:
ucall_make_shared(vm->ucall_list)
? Might need some good documentation/assertions to make sure it gets called at the right place for confidential VMs, and may need some extra hooks in SEV selftest implementation for switching from private to shared after the memory has already been allocated, but seems reasonable.
Again, I was thinking that it would be done unconditionally by ucall_init(), i.e. would be automatically handled by the selftest framework and would Just Work for individual tests.
Ok, I'll have to think that through more. Currently with the SEV selftests as they are, we have:
sev_vm_create(policy, npages)
  vm = vm_create(...)
  vm_set_memory_encryption(vm, encrypt_by_default, enc_bit)
  // vm_vaddr_alloc_shared() can be used now
The ucall struct allocations would need to go through vm_vaddr_alloc_shared() to make sure the selftest library tracks/maps the pages as shared, but that vm_set_memory_encryption() happens too late if the ucall_init() stuff is done in vm_create(). It should be possible to pass the vm_set_memory_encryption() arguments directly to vm_create() to allow for what you're proposing, but I guess we'd need a new vm_create() wrapper that handles both the vm_set_memory_encryption() args, along with the ucall_ops above, something like:
sev_vm_create(policy, npages)
  vm = vm_create_coco(..., encrypt_by_default, enc_bit/shared_bit, ucall_ops)
Or were you thinking something else? Just trying to get an idea of how this will all need to tie in with the SEV selftests and what needs to change on that end.
Hmm, I was thinking the selftest framework would only need to be told the VM type, e.g. DEFAULT, SEV, SEV-ES, SEV-SNP, or TDX, and would then handle setting everything up, e.g. enumerating the C-bit location and encrypting memory as needed.
One thought would be to extend "enum vm_guest_mode" with flags above NUM_VM_MODES to specify the VM type. That way tests that use VM_MODE_DEFAULT would continue to work without any updates.
Ok, let me see what that approach looks like on the SEV selftest side.
On Wed, Jan 05, 2022, Michael Roth wrote:
On Wed, Jan 05, 2022 at 05:43:21PM +0000, Sean Christopherson wrote:
Because it uses multiple VMs, and my rough sketch only allows for a single VM to use ucall. Though I suppose we could simply keep appending to the ucall list for every VM. The requirement would then be that all VMs are of the same type, i.e. utilize the same ucall_ops.
Hmm, maybe I misread your patch. Not supporting multiple VMs was the reason I gave up on having the ucall structs allocated on-demand and went with requiring them to be passed as arguments to ucall().
I thought with your patch you had solved that by having each VM have its own pool, via vm->ucall_list, and then mapping each pool into each guest separately via:
ucall_init(vm):
	ucall_list = vm->ucall_list
	sync_global_to_guest(ucall_list)
Then, as long as that ucall_init() is done *after* kvm_vm_elf_load() is called for the guest, the guest will end up with a 'ucall_list' global that points to its own specific vm->ucall_list. On the test side it then doesn't matter what the 'ucall_list' global is currently set to, since you have the GPA and know which VM exited.
Or am I missing something there?
Ha, that was not at all intended. But yes, it should work. I'd rather be lucky than good?
Although even if that is the case, now that we're proposing doing the ucall_init() inside vm_create(), we run the risk of a test calling kvm_vm_elf_load() afterward, which might clobber the guest's copy of the ucall_list global if ucall_init() had since been called for another VM. But that could maybe be worked around by having whatever vm_create() variant we use also do the kvm_vm_elf_load() unconditionally as part of creation.
Will sync_global_to_guest() even work as intended if kvm_vm_elf_load() hasn't been called? If not, then sync_global_{to,from}_guest() should really assert if the test hasn't been loaded.
As for ucall_init(), I think the best approach would be to make kvm_vm_elf_load() a static and replace all calls with:
kvm_vm_load_guest(vm);
where its implementation is:
void kvm_vm_load_guest(struct kvm_vm *vm)
{
	kvm_vm_elf_load(vm, program_invocation_name);
	ucall_init(vm);
}
The logic being that if a test creates a VM but never loads any code into the guest, e.g. kvm_create_max_vcpus, then it _can't_ make ucalls.
On Wed, Jan 05, 2022 at 07:40:57PM +0000, Sean Christopherson wrote:
On Wed, Jan 05, 2022, Michael Roth wrote:
On Wed, Jan 05, 2022 at 05:43:21PM +0000, Sean Christopherson wrote:
Because it uses multiple VMs, and my rough sketch only allows for a single VM to use ucall. Though I suppose we could simply keep appending to the ucall list for every VM. The requirement would then be that all VMs are of the same type, i.e. utilize the same ucall_ops.
Hmm, maybe I misread your patch. Not supporting multiple VMs was the reason I gave up on having the ucall structs allocated on-demand and went with requiring them to be passed as arguments to ucall().
I thought with your patch you had solved that by having each VM have its own pool, via vm->ucall_list, and then mapping each pool into each guest separately via:
ucall_init(vm):
	ucall_list = vm->ucall_list
	sync_global_to_guest(ucall_list)
Then, as long as that ucall_init() is done *after* kvm_vm_elf_load() is called for the guest, the guest will end up with a 'ucall_list' global that points to its own specific vm->ucall_list. On the test side it then doesn't matter what the 'ucall_list' global is currently set to, since you have the GPA and know which VM exited.
Or am I missing something there?
Ha, that was not at all intended. But yes, it should work. I'd rather be lucky than good?
:)
Although even if that is the case, now that we're proposing doing the ucall_init() inside vm_create(), we run the risk of a test calling kvm_vm_elf_load() afterward, which might clobber the guest's copy of the ucall_list global if ucall_init() had since been called for another VM. But that could maybe be worked around by having whatever vm_create() variant we use also do the kvm_vm_elf_load() unconditionally as part of creation.
Will sync_global_to_guest() even work as intended if kvm_vm_elf_load() hasn't been called? If not, then sync_global_{to,from}_guest() should really assert if the test hasn't been loaded.
Yah, seems like it would get clobbered by kvm_vm_elf_load() later. And I can't think of any good reason to use sync_global_to_guest() without also needing kvm_vm_elf_load() at some point, so it makes sense to enforce it.
As for ucall_init(), I think the best approach would be to make kvm_vm_elf_load() a static and replace all calls with:
kvm_vm_load_guest(vm);
where its implementation is:
void kvm_vm_load_guest(struct kvm_vm *vm)
{
	kvm_vm_elf_load(vm, program_invocation_name);
	ucall_init(vm);
}
The logic being that if a test creates a VM but never loads any code into the guest, e.g. kvm_create_max_vcpus, then it _can't_ make ucalls.
Makes sense. And if different ops are needed for vmgexit()/tdcall() it could be something like (if based on patches 1-5 of this series, and extending vm_guest_mode as you suggested earlier):
void kvm_vm_load_guest(struct kvm_vm *vm)
{
	kvm_vm_elf_load(vm, program_invocation_name);

	if (vm->mode == VM_MODE_SEV)
		ucall_init_ops(vm, ucall_ops_pio_vmgexit);
	else if (vm->mode == VM_MODE_TDX)
		ucall_init_ops(vm, ucall_ops_pio_tdcall);
	else
		ucall_init_ops(vm, ucall_ops_pio);
}
Shame we have to update all the kvm_vm_elf_load() call-sites, but they'd end up potentially breaking things if left as-is anyway.
Were you planning on sending patches for these changes, or should I incorporate your prototype and take a stab at the other changes as part of v2 of this series?
On Wed, Jan 05, 2022, Michael Roth wrote:
On Wed, Jan 05, 2022 at 07:40:57PM +0000, Sean Christopherson wrote:
As for ucall_init(), I think the best approach would be to make kvm_vm_elf_load() a static and replace all calls with:
kvm_vm_load_guest(vm);
where its implementation is:
void kvm_vm_load_guest(struct kvm_vm *vm)
{
	kvm_vm_elf_load(vm, program_invocation_name);
	ucall_init(vm);
}
The logic being that if a test creates a VM but never loads any code into the guest, e.g. kvm_create_max_vcpus, then it _can't_ make ucalls.
Makes sense. And if different ops are needed for vmgexit()/tdcall() it could be something like (if based on patches 1-5 of this series, and extending vm_guest_mode as you suggested earlier):
void kvm_vm_load_guest(struct kvm_vm *vm)
{
	kvm_vm_elf_load(vm, program_invocation_name);

	if (vm->mode == VM_MODE_SEV)
		ucall_init_ops(vm, ucall_ops_pio_vmgexit);
	else if (vm->mode == VM_MODE_TDX)
I don't think we want to do this here, but instead down in the arch-specific ucall_init(). Also, not sure if I was clear before (can't tell what you interpreted based on the above snippet), but I think we'll want VM_MODE_SEV etc... to be modifiers on top of the VA/PA stuff.
		ucall_init_ops(vm, ucall_ops_pio_tdcall);
	else
		ucall_init_ops(vm, ucall_ops_pio);
}
Shame we have to update all the kvm_vm_elf_load() call-sites, but they'd end up potentially breaking things if left as-is anyway.
Were you planning on sending patches for these changes, or should I incorporate your prototype and take a stab at the other changes as part of v2 of this series?
Nope, all yours. Thanks!
On Wed, Jan 05, 2022 at 10:02:53PM +0000, Sean Christopherson wrote:
On Wed, Jan 05, 2022, Michael Roth wrote:
On Wed, Jan 05, 2022 at 07:40:57PM +0000, Sean Christopherson wrote:
As for ucall_init(), I think the best approach would be to make kvm_vm_elf_load() a static and replace all calls with:
kvm_vm_load_guest(vm);
where its implementation is:
void kvm_vm_load_guest(struct kvm_vm *vm)
{
	kvm_vm_elf_load(vm, program_invocation_name);
	ucall_init(vm);
}
The logic being that if a test creates a VM but never loads any code into the guest, e.g. kvm_create_max_vcpus, then it _can't_ make ucalls.
Makes sense. And if different ops are needed for vmgexit()/tdcall() it could be something like (if based on patches 1-5 of this series, and extending vm_guest_mode as you suggested earlier):
void kvm_vm_load_guest(struct kvm_vm *vm)
{
	kvm_vm_elf_load(vm, program_invocation_name);

	if (vm->mode == VM_MODE_SEV)
		ucall_init_ops(vm, ucall_ops_pio_vmgexit);
	else if (vm->mode == VM_MODE_TDX)
I don't think we want to do this here, but instead down in the arch-specific ucall_init(). Also, not sure if I was clear before (can't tell what you interpreted based on the above snippet), but I think we'll want VM_MODE_SEV etc... to be modifiers on top of the VA/PA stuff.
Ok, something like this (with additional ones added as-needed)?
 #define VM_MODE_DEFAULT	VM_MODE_PXXV48_4K
+#define SEV_VM_MODE_DEFAULT	SEV_VM_MODE_PXXV48_4K

 enum vm_guest_mode {
 	...
 	VM_MODE_PXXV48_4K,
 	...
 	NUM_VM_MODES,
+	SEV_VM_MODE_PXXV48_4K,
+	NUM_VM_MODES_EXTENDED,
 }
		ucall_init_ops(vm, ucall_ops_pio_tdcall);
	else
		ucall_init_ops(vm, ucall_ops_pio);
}
Shame we have to update all the kvm_vm_elf_load() call-sites, but they'd end up potentially breaking things if left as-is anyway.
Were you planning on sending patches for these changes, or should I incorporate your prototype and take a stab at the other changes as part of v2 of this series?
Nope, all yours. Thanks!
Thanks for the suggestions!
linux-kselftest-mirror@lists.linaro.org