On Mon, 2024-04-22 at 16:11 +0200, Paolo Bonzini wrote:
On Thu, Apr 18, 2024 at 9:46 PM David Woodhouse <dwmw2@infradead.org> wrote:
+	curr_tsc_hz = get_cpu_tsc_khz() * 1000LL;
+	if (unlikely(curr_tsc_hz == 0)) {
+		rc = -EINVAL;
+		goto out;
+	}
+
+	if (kvm_caps.has_tsc_control)
+		curr_tsc_hz = kvm_scale_tsc(curr_tsc_hz,
+					    v->arch.l1_tsc_scaling_ratio);
+
+	/*
+	 * The scaling factors in the hv_clock do not depend solely on the
+	 * TSC frequency *requested* by userspace. They actually use the
+	 * host TSC frequency that was measured/detected by the host kernel,
+	 * scaled by kvm_scale_tsc() with the vCPU's l1_tsc_scaling_ratio.
+	 * So a sanity check that they *precisely* match would have false
+	 * negatives. Allow for a discrepancy of 1 kHz either way.
This is not very clear - if kvm_caps.has_tsc_control, curr_tsc_hz is exactly the "host TSC frequency [...] scaled by kvm_scale_tsc() with the vCPU's l1_tsc_scaling_ratio". But even in that case there is a double rounding issue, I guess.
That's exactly what I'm saying, isn't it?
Perhaps the issue is clearer if I say "that was measured/detected by *each* host kernel"?
The point is this: suppose one host's kernel measured its TSC against the PIT and came up with a value of 3002MHz, and an "identical" host measured against *its* PIT and decided its TSC frequency was 2999MHz. If I then migrate a guest with an explicit TSC frequency of 2500MHz from one host to the other, its effective tsc_to_system_mul and tsc_shift in the pvclock are *different* on the two hosts because...
"The scaling factors in the hv_clock do not depend solely on the TSC frequency *requested* by userspace. They actually use the host TSC frequency that was measured/detected by each host kernel, scaled by kvm_scale_tsc() with the vCPU's l1_tsc_scaling_ratio."
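As a rough illustration of the double rounding, here is a minimal userspace sketch. It is not the KVM code: the helper names scaling_ratio(), scale_khz() and khz_within_tolerance() are hypothetical, the 48 fractional bits are an assumption (the real width depends on the hardware), and it relies on the GCC/Clang unsigned __int128 extension. It models a ratio that is rounded down when computed from the requested and host frequencies, and a host frequency that is rounded down again when scaled by that ratio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 fractional bits in the fixed-point scaling ratio. */
#define FRAC_BITS 48

/* Hypothetical helper: guest/host ratio in fixed point, rounded down. */
static uint64_t scaling_ratio(uint32_t guest_khz, uint32_t host_khz)
{
	assert(host_khz != 0);
	return (uint64_t)((((unsigned __int128)1 << FRAC_BITS) * guest_khz) /
			  host_khz);
}

/* Hypothetical helper: scale the host frequency by the ratio, rounding down again. */
static uint64_t scale_khz(uint32_t host_khz, uint64_t ratio)
{
	return (uint64_t)(((unsigned __int128)host_khz * ratio) >> FRAC_BITS);
}

/* Hypothetical helper: accept a discrepancy of 1 kHz either way. */
static bool khz_within_tolerance(uint64_t a, uint64_t b)
{
	uint64_t diff = a > b ? a - b : b - a;

	return diff <= 1;
}
```

Because both steps round down, scale_khz(host_khz, scaling_ratio(guest_khz, host_khz)) can come out 1 kHz below the requested guest_khz on one host and exactly equal to it on another, which is why an exact comparison would be too strict and a check like khz_within_tolerance() is used in this sketch.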
Or did I misunderstand your objection?