I lied, there aren't three different definitions of the KVM clock. The fourth is on i386, where it's still based on CLOCK_MONOTONIC (well, boot time which might as well be the same sine we offset it anyway).
If we fix *that* to be based on CLOCK_MONOTONIC_RAW then we can rip out a whole bunch of mechanisms which were added to cope with NTP frequency skew.
This cleans up the mess and gets us back down to the two unavoidable definitions of the KVM clock: when the host and guest TSCs are well behaved and in sync, it's in "master clock" mode where it's defined as a simple arithmetic function of the guest TSC, otherwise it's clamped to the host's CLOCK_MONOTONIC_RAW.
It includes Jack's KVM_[GS]ET_CLOCK_GUEST patch to allow accurate migration. Also my KVM_VCPU_TSC_SCALE which exposes the precise TSC scaling factors. This is needed to get accurate migration of the guest TSC, and can *also* be used by userspace to have vDSO-style access to the KVM clock. Thus allowing hypercalls and other emulated clock devices (e.g. PIT, HPET, ACPI timer) to be based on the KVM clock too, giving *consistency* across a live migration.
I do still need to fix KVM_REQ_MASTERCLOCK_UPDATE so that it doesn't clamp the clock back to CLOCK_MONOTONIC_RAW; we should *update* the ka->kvmclock_offset when we've been running in use_master_clock mode. Should probably do that in kvm_arch_hardware_disable() for timekeeping across hibernation too, but I haven't finished working that one out.
I think there are still some places left where KVM reads the time twice in close(ish) succession and then assumes they were at the *same* time, which I'll audit and fix too.
I also need to flesh out the test cases and do some real testing from VMMs, but I think it's ready for some heckling at least.
https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/clocks
David Woodhouse (8): KVM: x86/xen: Do not corrupt KVM clock in kvm_xen_shared_info_init() KVM: x86: Improve accuracy of KVM clock when TSC scaling is in force KVM: x86: Explicitly disable TSC scaling without CONSTANT_TSC KVM: x86: Add KVM_VCPU_TSC_SCALE and fix the documentation on TSC migration KVM: x86: Avoid NTP frequency skew for KVM clock on 32-bit host KVM: x86: Remove periodic global clock updates KVM: x86: Kill KVM_REQ_GLOBAL_CLOCK_UPDATE KVM: x86: Fix KVM clock precision in __get_kvmclock()
Jack Allister (2): KVM: x86: Add KVM_[GS]ET_CLOCK_GUEST for accurate KVM clock migration KVM: selftests: Add KVM/PV clock selftest to prove timer correction
Documentation/virt/kvm/api.rst | 37 ++ Documentation/virt/kvm/devices/vcpu.rst | 115 ++++-- arch/x86/include/asm/kvm_host.h | 8 +- arch/x86/include/uapi/asm/kvm.h | 6 + arch/x86/kvm/svm/svm.c | 3 +- arch/x86/kvm/vmx/vmx.c | 2 +- arch/x86/kvm/x86.c | 438 +++++++++++++--------- arch/x86/kvm/xen.c | 4 +- include/uapi/linux/kvm.h | 3 + tools/testing/selftests/kvm/Makefile | 1 + tools/testing/selftests/kvm/x86_64/pvclock_test.c | 192 ++++++++++ 11 files changed, 600 insertions(+), 209 deletions(-)