On Wed, Apr 16, 2025, Xiaoyao Li wrote:
On 3/24/2025 9:02 PM, Manali Shukla wrote:
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 5fe84f2427b5..f7c925aa0c4f 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7909,6 +7909,25 @@ apply some other policy-based mitigation.
 When exiting to userspace, KVM sets KVM_RUN_X86_BUS_LOCK in vcpu-run->flags, and
 conditionally sets the exit_reason to KVM_EXIT_X86_BUS_LOCK.
 
+Note! KVM_CAP_X86_BUS_LOCK_EXIT on AMD CPUs with the Bus Lock Threshold is close
+enough to INTEL's Bus Lock Detection VM-Exit to allow using
+KVM_CAP_X86_BUS_LOCK_EXIT for AMD CPUs.
+
+The biggest difference between the two features is that Threshold (AMD CPUs) is
+fault-like i.e. the bus lock exit to user space occurs with RIP pointing at the
+offending instruction, whereas Detection (Intel CPUs) is trap-like i.e. the bus
+lock exit to user space occurs with RIP pointing at the instruction right after
+the guilty one.
+
+The bus lock threshold on AMD CPUs provides a per-VMCB counter which is
+decremented every time a bus lock occurs, and a VM-Exit is triggered if and only
+if the bus lock counter is '0'.
+
+To provide Detection-like semantics for AMD CPUs, the bus lock counter has been
+initialized to '0', i.e. exit on every bus lock, and when re-executing the
+guilty instruction, the bus lock counter has been set to '1' to effectively step
+past the instruction.
From an API perspective, I don't think the last two paragraphs matter much to userspace.
It should describe what userspace can/should do. E.g., when KVM exits to userspace due to a bus lock on an AMD platform, RIP points at the instruction that caused the bus lock. Userspace can advance RIP itself before re-entering the guest to make progress. If userspace doesn't change RIP, KVM can handle it internally by ensuring that re-executing the instruction doesn't trigger another bus lock VM-Exit, which allows forward progress.
Agreed. It's not just the last two paragraphs, it's the entire doc update.
The existing documentation very carefully doesn't say anything about *how* the feature is implemented on Intel, so I don't see any reason to mention or compare Bus Lock Threshold vs. Bus Lock Detection. As Xiaoyao said, simply state what is different.
And I would definitely not say anything about whether or not userspace can advance RIP, as doing so will likely crash/corrupt the guest. KVM sets bus_lock_counter to allow forward progress, KVM does NOT skip RIP.
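To make the forward-progress point concrete, here is a toy model of the counter behavior described in the quoted patch text (exit if and only if the counter is 0, otherwise decrement and let the instruction complete). This is only a sketch: struct model_vcpu and both helpers are hypothetical names invented for illustration, not KVM's or the hardware's actual structures, and the instruction length is an arbitrary parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the relevant vCPU state; not KVM's real layout. */
struct model_vcpu {
	uint64_t rip;
	uint32_t bus_lock_counter;
};

/*
 * Model the guest attempting one bus-locking instruction of insn_len bytes.
 * Returns true if the attempt causes an exit. The exit is fault-like: RIP
 * is left pointing at the offending instruction and nothing is consumed.
 */
static bool model_exec_bus_lock_insn(struct model_vcpu *v, uint64_t insn_len)
{
	if (v->bus_lock_counter == 0)
		return true;

	/* Counter is non-zero: consume one credit, the instruction retires. */
	v->bus_lock_counter--;
	v->rip += insn_len;
	return false;
}

/*
 * Model what the host does before re-entering the guest after an exit:
 * grant exactly one bus lock so the same instruction can complete.
 * RIP is deliberately left untouched.
 */
static void model_allow_one_bus_lock(struct model_vcpu *v)
{
	v->bus_lock_counter = 1;
}
```

In this model, a vCPU that starts with a counter of 0 exits on its first bus lock with RIP unchanged, retires the instruction on re-entry once the counter is set to 1, and exits again on the next bus lock because the counter is back to 0. At no point does anything other than the guest's own execution move RIP.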
All in all, I think the only thing that needs to be called out is that RIP will point to the next instruction on Intel, but the offending instruction on AMD.
Unless I'm missing a detail, I think it's just this:
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 5fe84f2427b5..d9788f9152f1 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7909,6 +7909,11 @@ apply some other policy-based mitigation.
 When exiting to userspace, KVM sets KVM_RUN_X86_BUS_LOCK in vcpu-run->flags, and
 conditionally sets the exit_reason to KVM_EXIT_X86_BUS_LOCK.
 
+Due to differences in the underlying hardware implementation, the vCPU's RIP at
+the time of exit diverges between Intel and AMD. On Intel hosts, RIP points at
+the next instruction, i.e. the exit is trap-like. On AMD hosts, RIP points at
+the offending instruction, i.e. the exit is fault-like.
+
 Note! Detected bus locks may be coincident with other exits to userspace, i.e.
 KVM_RUN_X86_BUS_LOCK should be checked regardless of the primary exit reason if
 userspace wants to take action on all detected bus locks.
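
For illustration only, the userspace-visible behavior in the text above could be consumed along these lines. This is a sketch, not VMM code: struct demo_run, the DEMO_* constants and enum demo_vendor are stand-ins invented here so the snippet is self-contained, and their values are placeholders. Real code would include <linux/kvm.h> and use struct kvm_run, KVM_RUN_X86_BUS_LOCK and KVM_EXIT_X86_BUS_LOCK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the uapi definitions; values are placeholders only. */
#define DEMO_RUN_X86_BUS_LOCK	(1u << 1)
#define DEMO_EXIT_IO		2u
#define DEMO_EXIT_X86_BUS_LOCK	33u

/* Hypothetical, trimmed-down stand-in for the shared run structure. */
struct demo_run {
	uint32_t exit_reason;
	uint16_t flags;
};

enum demo_vendor { DEMO_INTEL, DEMO_AMD };

/*
 * A bus lock may be reported alongside some other primary exit reason,
 * so test the flag on every exit instead of keying off exit_reason.
 */
static bool demo_bus_lock_detected(const struct demo_run *run)
{
	return (run->flags & DEMO_RUN_X86_BUS_LOCK) != 0;
}

/*
 * Per the proposed text: on AMD the exit is fault-like, so RIP still
 * points at the offending instruction; on Intel it is trap-like and
 * RIP already points at the next instruction.
 */
static bool demo_rip_at_offending_insn(enum demo_vendor vendor)
{
	return vendor == DEMO_AMD;
}
```

Note that nothing here modifies RIP. A policy built on this (throttling, logging, etc.) only needs to know how to interpret RIP, not to change it.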