On Mon, Nov 01, 2021, Maxim Levitsky wrote:
On Mon, 2021-11-01 at 16:43 +0100, Vitaly Kuznetsov wrote:
Paolo Bonzini <pbonzini@redhat.com> writes:
On 11/08/21 14:29, Maxim Levitsky wrote:
Modify debug_regs test to create a pending interrupt and see that it is blocked when single stepping is done with KVM_GUESTDBG_BLOCKIRQ.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
 .../testing/selftests/kvm/x86_64/debug_regs.c | 24 ++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)
I haven't looked very much at this, but the test fails.
Same here; the test passes on AMD but fails consistently on Intel:
# ./x86_64/debug_regs
==== Test Assertion Failure ====
  x86_64/debug_regs.c:179: run->exit_reason == KVM_EXIT_DEBUG && run->debug.arch.exception == DB_VECTOR && run->debug.arch.pc == target_rip && run->debug.arch.dr6 == target_dr6
  pid=13434 tid=13434 errno=0 - Success
     1	0x00000000004027c6: main at debug_regs.c:179
     2	0x00007f65344cf554: ?? ??:0
     3	0x000000000040294a: _start at ??:?
  SINGLE_STEP[1]: exit 8 exception 1 rip 0x402a25 (should be 0x402a27) dr6 0xffff4ff0 (should be 0xffff4ff0)
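For readers decoding the SINGLE_STEP[1] line above, here is a small, hedged C sketch that checks the two values that matter. It is not part of the selftest; the struct and the toy_* names are made up for illustration. The only architectural fact it relies on is that DR6.BS (bit 14) flags a single-step #DB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DR6.BS (bit 14) is set when a #DB was caused by single-stepping. */
#define TOY_DR6_BS (UINT64_C(1) << 14)

/* Hypothetical holder for the values printed by the failing test. */
struct toy_db_exit {
	uint64_t rip;
	uint64_t dr6;
};

static bool toy_is_single_step(const struct toy_db_exit *e)
{
	return (e->dr6 & TOY_DR6_BS) != 0;
}

/* Signed distance between the reported and the expected RIP, in bytes. */
static int64_t toy_rip_delta(const struct toy_db_exit *e, uint64_t expected_rip)
{
	return (int64_t)(e->rip - expected_rip);
}

/* Values copied from the SINGLE_STEP[1] line of the failure report. */
static void toy_check_report(void)
{
	struct toy_db_exit e = { .rip = 0x402a25, .dr6 = 0xffff4ff0 };

	assert(toy_is_single_step(&e));            /* BS is set */
	assert(toy_rip_delta(&e, 0x402a27) == -2); /* two bytes short */
}
```

Read this way, the report shows a single-step #DB whose dr6 is as expected, but whose RIP is two bytes before the address the test expects.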
(I know I'm late to the party).
Well, that is strange. It passes on my Intel laptop. Just tested (kvm/queue + qemu master, compiled today) :-(
It fails on iteration 1 (there is an iteration 0 before it), which I think means that we start with RIP on the sti, first get #DB at the start of the xor instruction (correctly), and then get #DB at the start of the xor instruction again?
Something is very strange. My laptop has an i7-7600U.
I haven't verified on hardware, but my guess is that this code in vmx_vcpu_run()
	/* When single-stepping over STI and MOV SS, we must clear the
	 * corresponding interruptibility bits in the guest state. Otherwise
	 * vmentry fails as it then expects bit 14 (BS) in pending debug
	 * exceptions being set, but that's not correct for the guest debugging
	 * case. */
	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
		vmx_set_interrupt_shadow(vcpu, 0);
interacts badly with APICv=1. It will kill the STI shadow and cause the IRQ in vmcs.GUEST_RVI to be recognized when it (micro-)architecturally should not. My head is going in circles trying to sort out what would actually happen. Maybe comment that out and/or disable APICv to see if either one makes the test pass?
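To make the suspected interaction concrete, here is a toy C model of it. This is not KVM code and has not been checked against hardware; every toy_* name is hypothetical. It only encodes the rule that a pending IRQ is held off while the STI shadow is active, and shows that clearing the shadow for single-stepping makes the same pending IRQ eligible one instruction early.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified guest state; not a KVM structure. */
struct toy_vcpu {
	bool rflags_if;   /* RFLAGS.IF: maskable interrupts enabled */
	bool sti_shadow;  /* one-instruction blocking that follows STI */
	bool irq_pending; /* an IRQ is waiting (think vmcs.GUEST_RVI) */
};

/* A pending IRQ is recognized only with IF=1 and no STI shadow. */
static bool toy_irq_recognized(const struct toy_vcpu *v)
{
	return v->irq_pending && v->rflags_if && !v->sti_shadow;
}

/* Models dropping the interrupt shadow before entry when single-stepping. */
static void toy_prepare_entry(struct toy_vcpu *v, bool singlestep)
{
	if (singlestep)
		v->sti_shadow = false;
}

/* Walk through the scenario guessed at in the thread. */
static void toy_scenario(void)
{
	/* Guest has just executed STI with an IRQ already pending. */
	struct toy_vcpu v = {
		.rflags_if = true,
		.sti_shadow = true,
		.irq_pending = true,
	};

	/* With the shadow intact the IRQ must wait one more instruction. */
	assert(!toy_irq_recognized(&v));

	/* Clearing the shadow for single-step makes it deliverable now. */
	toy_prepare_entry(&v, true);
	assert(toy_irq_recognized(&v));
}
```

If the real behaviour matches this model, either of the two suggested experiments (skipping the shadow clearing, or disabling APICv so that nothing is pending in GUEST_RVI) would be expected to change the outcome; that is a guess, as stated above.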