Oliver and/or Marc, question for y'all towards the bottom.
On Wed, Feb 28, 2024, Sean Christopherson wrote:
On Wed, Feb 28, 2024, Mark Brown wrote:
On Thu, Feb 08, 2024 at 09:48:39PM +0100, Thomas Huth wrote:
From: Sean Christopherson seanjc@google.com
Extract the code to set a vCPU's entry point out of vm_arch_vcpu_add() and into a new API, vcpu_arch_set_entry_point(). Providing a separate API will allow creating a KVM selftests hardness that can handle tests that use different entry points for sub-tests, whereas *requiring* the entry point to be specified at vCPU creation makes it difficult to create a generic harness, e.g. the boilerplate setup/teardown can't easily create and destroy the VM and vCPUs.
With today's -next I'm seeing most of the KVM selftests failing on an arm64 defconfig with:
# ==== Test Assertion Failure ==== # include/kvm_util_base.h:677: !ret # pid=735 tid=735 errno=9 - Bad file descriptor # 1 0x0000000000410937: vcpu_set_reg at kvm_util_base.h:677 (discriminator 4) # 2 (inlined by) vcpu_arch_set_entry_point at processor.c:370 (discriminator 4) # 3 0x0000000000407bab: vm_vcpu_add at kvm_util_base.h:981 # 4 (inlined by) __vm_create_with_vcpus at kvm_util.c:419 # 5 (inlined by) __vm_create_shape_with_one_vcpu at kvm_util.c:432 # 6 0x000000000040187b: __vm_create_with_one_vcpu at kvm_util_base.h:892 # 7 (inlined by) vm_create_with_one_vcpu at kvm_util_base.h:899 # 8 (inlined by) main at aarch32_id_regs.c:158 # 9 0x0000007fbcbe6dc3: ?? ??:0 # 10 0x0000007fbcbe6e97: ?? ??:0 # 11 0x0000000000401f2f: _start at ??:? # KVM_SET_ONE_REG failed, rc: -1 errno: 9 (Bad file descriptor)
and a bisect pointed to this commit which does look plausibly relevant.
Note that while this was bisected with plain arm64 defconfig and the KVM selftests fragment was not enabled, but enabling the KVM fragment gave the same result as would be expected based on the options enabled by the fragment. We're also seeing an alternative failure pattern where the tests segfault when run in a different environment, I'm also tracking that down but I suspect these are the same issue.
Gah, my bad, I should have at least tested on ARM since I have easy access to such hardware. If I can't figure out what's going wrong in the next few hours, I'll drop this series and we can try again for 6.10.
Sorry :-/
/facepalm
The inner helper doesn't return the vCPU, and by dumb (bad) luck, selftests end up trying to use fd=0.
diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c index ed4ab29f4fad..a9eb17295be4 100644 --- a/tools/testing/selftests/kvm/lib/aarch64/processor.c +++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c @@ -386,6 +386,7 @@ static struct kvm_vcpu *__aarch64_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id, aarch64_vcpu_setup(vcpu, init);
vcpu_set_reg(vcpu, ARM64_CORE_REG(sp_el1), stack_vaddr + stack_size); + return vcpu; }
struct kvm_vcpu *aarch64_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id,
I'll squash the above and force push.
In my defense, I would have caught this when build-testing, as the compiler does warn...
lib/aarch64/processor.c -o /usr/local/google/home/seanjc/go/src/kernel.org/nox/tools/testing/selftests/kvm/lib/aarch64/processor.o lib/aarch64/processor.c: In function ‘__aarch64_vcpu_add’: lib/aarch64/processor.c:389:1: warning: no return statement in function returning non-void [-Wreturn-type] 389 | } | ^ At top level: cc1: note: unrecognized command-line option ‘-Wno-gnu-variable-sized-type-not-at-end’ may have been intended to silence earlier diagnostics
but due to a different issue that is fixed in the kvm-arm tree[*], but not in mine, I built without -Werror and didn't see the new warn in the sea of GUEST_PRINTF warnings.
Ugh, and I still can't enable -Werror, because there are unused functions in aarch64/vpmu_counter_access.c
aarch64/vpmu_counter_access.c:96:20: error: unused function 'enable_counter' [-Werror,-Wunused-function] static inline void enable_counter(int idx) ^ aarch64/vpmu_counter_access.c:104:20: error: unused function 'disable_counter' [-Werror,-Wunused-function] static inline void disable_counter(int idx) ^ 2 errors generated. make: *** [Makefile:278: /usr/local/google/home/seanjc/go/src/kernel.org/nox/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.o] Error 1 make: *** Waiting for unfinished jobs....
Commit 49f31cff9c533d264659356b90445023b04e10fb failed to build with 'make-clang make-arm make -j128'.
Oliver/Marc, any thoughts on how you want to fix the unused function warnings? As evidenced by this goof, being able to compile with -Werror is super helpful.
And another question: is there any reason to not force -Werror for selftests?
[*] https://lore.kernel.org/all/20240202234603.366925-1-seanjc@google.com