On Wed, 21 Dec 2022 16:50:30 +0000, Oliver Upton <oliver.upton@linux.dev> wrote:
On Wed, Dec 21, 2022 at 09:35:06AM +0000, Marc Zyngier wrote:
[...]
- if (kvm_vcpu_abt_iss1tw(vcpu)) {
/*
* Only a permission fault on a S1PTW should be
* considered as a write. Otherwise, page tables baked
* in a read-only memslot will result in an exception
* being delivered in the guest.
Somewhat of a tangent, but:
Aren't we somewhat unaligned with the KVM UAPI by injecting an exception in this case? I know we've been doing it for a while, but it flies in the face of the rules outlined in the KVM_SET_USER_MEMORY_REGION documentation.
That's an interesting point, and I certainly haven't considered that for faults introduced by page table walks.
I'm not sure what userspace can do with that though. The problem is that this is a write for which we don't have useful data: although we know it is a page-table walker access, we don't know what it was about to write. The instruction that caused the write is meaningless (it could either be a load, a store, or an instruction fetch). How do you populate the data[] field then?
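To make the data[] problem concrete, here is a minimal sketch. The struct is a local mirror of the mmio member of struct kvm_run, and fill_mmio_write() is a hypothetical helper, not anything KVM provides. It only illustrates that a write exit needs a value to report, and that a page-table walker update has none:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the mmio member of struct kvm_run (see <linux/kvm.h>). */
struct mmio_exit {
	uint64_t phys_addr;
	uint8_t  data[8];
	uint32_t len;
	uint8_t  is_write;
};

/*
 * Hypothetical helper: describe a guest write as an MMIO exit.
 * 'val' points to the value being stored, or is NULL when that value is
 * unknown, which is the S1 PTW case: the walker's update never shows up
 * in a guest register.
 * Returns 0 on success, -1 if the exit cannot be described.
 */
static int fill_mmio_write(struct mmio_exit *m, uint64_t pa,
			   const void *val, uint32_t len)
{
	assert(m != NULL);

	if (val == NULL || len == 0 || len > sizeof(m->data))
		return -1;

	memset(m, 0, sizeof(*m));
	m->phys_addr = pa;
	m->len = len;
	m->is_write = 1;
	memcpy(m->data, val, len);
	return 0;
}
```

For an ordinary store the value comes from the guest register named by the instruction syndrome. For a walker write there is no such register, so the helper has nothing to copy and bails out.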
If anything, this is closer to KVM_EXIT_ARM_NISV, for which we give userspace the full ESR and ask it to sort it out. I doubt it will be able to, but hey, maybe it is worth a shot. This would need to be a different exit reason though, as NISV is explicitly for non-memslot stuff.
In any case, the documentation for KVM_SET_USER_MEMORY_REGION needs to reflect the fact that KVM_EXIT_MMIO cannot represent a fault due to a S1 PTW.
Oh I completely agree with you here. I probably should have said before, I think the exit would be useless anyway. Getting the documentation in line with the intended behavior seems to be the best fix.
Right. How about something like this?
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 226b40baffb8..72abd018a618 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1381,6 +1381,14 @@ It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl.
 The KVM_SET_MEMORY_REGION does not allow fine grained control over memory
 allocation and is deprecated.
 
+Note: On arm64, a write generated by the page-table walker (to update
+the Access and Dirty flags, for example) never results in a
+KVM_EXIT_MMIO exit when the slot has the KVM_MEM_READONLY flag. This
+is because KVM cannot provide the data that would be written by the
+page-table walker, making it impossible to emulate the access.
+Instead, an abort (data abort if the cause of the page-table update
+was a load or a store, instruction abort if it was an instruction
+fetch) is injected in the guest.
 
 4.36 KVM_SET_TSS_ADDR
 ---------------------
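
As a rough illustration of the rule the note describes, the toy model below classifies a stage-2 fault. It is not the kernel's code: the struct, the enum and classify() are all made up for this sketch, and everything other than a write to a KVM_MEM_READONLY slot is lumped into a single "handled in kernel" outcome.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum outcome {
	HANDLE_IN_KERNEL,	/* anything that is not a write to a RO slot */
	EXIT_MMIO,		/* KVM_EXIT_MMIO, userspace emulates the write */
	INJECT_DABT,		/* data abort injected in the guest */
	INJECT_IABT,		/* instruction abort injected in the guest */
};

/* Hypothetical description of a fault; field names are illustrative. */
struct fault {
	bool is_write;		/* the faulting access is a write */
	bool is_s1ptw;		/* write generated by the S1 page-table walker */
	bool is_iabt;		/* originating access was an instruction fetch */
	bool slot_readonly;	/* memslot has KVM_MEM_READONLY */
};

static enum outcome classify(const struct fault *f)
{
	assert(f != NULL);

	if (!f->slot_readonly || !f->is_write)
		return HANDLE_IN_KERNEL;

	/* Walker write: no data to hand to userspace, so inject an abort. */
	if (f->is_s1ptw)
		return f->is_iabt ? INJECT_IABT : INJECT_DABT;

	/* Ordinary guest store to a read-only slot. */
	return EXIT_MMIO;
}
```

The only point of the model is the ordering: the S1 PTW check comes before the MMIO exit, so a walker write never reaches userspace.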
Thanks,
M.