On 9/19/2024 12:07 AM, Michael Kelley wrote:
From: Naman Jain namjain@linux.microsoft.com Sent: Monday, September 16, 2024 10:39 PM
read_hv_sched_clock_tsc() assumes that the Hyper-V clock counter is bigger than the variable hv_sched_clock_offset, which is cached during early boot, but depending on the timing this assumption may be false when a hibernated VM starts again (the clock counter starts from 0 again) and is resuming back (Note: hv_init_tsc_clocksource() is not called during hibernation/resume); consequently, read_hv_sched_clock_tsc() may return a negative integer (which is interpreted as a huge positive integer since the return type is u64) and new kernel messages are prefixed with huge timestamps before read_hv_sched_clock_tsc() grows big enough (which typically takes several seconds).
Fix the issue by saving the Hyper-V clock counter just before the suspend, and using it to correct the hv_sched_clock_offset in resume. This makes hv tsc page based sched_clock continuous and ensures that post resume, it starts from where it left off during suspend. Override x86_platform.save_sched_clock_state and x86_platform.restore_sched_clock_state routines to correct this as soon as possible.
Note: if Invariant TSC is available, the issue doesn't happen because
- we don't register read_hv_sched_clock_tsc() for sched clock:
See commit e5313f1c5404 ("clocksource/drivers/hyper-v: Rework clocksource and sched clock setup"); 2) the common x86 code adjusts TSC similarly: see __restore_processor_state() -> tsc_verify_tsc_adjust(true) and x86_platform.restore_sched_clock_state().
Cc: stable@vger.kernel.org Fixes: 1349401ff1aa ("clocksource/drivers/hyper-v: Suspend/resume Hyper-V clocksource for hibernation") Co-developed-by: Dexuan Cui decui@microsoft.com Signed-off-by: Dexuan Cui decui@microsoft.com Signed-off-by: Naman Jain namjain@linux.microsoft.com
Changes from v2: https://lore.kernel.org/all/20240911045632.3757-1-namjain@linux.microsoft.co... Addressed Michael's comments:
- Changed commit msg to include information on making timestamps continuous
- Changed subject to reflect the new file being changed
- Changed variable name for saving offset/counters
- Moved comment on new function introduced from header file to function definition.
- Removed the equations in comments
- Rebased to latest linux-next tip
Changes from v1: https://lore.kernel.org/all/20240909053923.8512-1-namjain@linux.microsoft.co...
- Reorganized code as per Michael's comment, and moved the logic to x86
specific files, to keep hyperv_timer.c arch independent.
arch/x86/kernel/cpu/mshyperv.c | 58 ++++++++++++++++++++++++++++++ drivers/clocksource/hyperv_timer.c | 14 +++++++- include/clocksource/hyperv_timer.h | 2 ++ 3 files changed, 73 insertions(+), 1 deletion(-)
This all looks good to me now. Thanks for indulging all my comments. :-)
One minor nit: The "Subject:" prefix should not have a dash in "hyper-v". The goal is to be consistent in the prefixes used for a particular file instead of wandering all over the place. If you check the commit history for arch/x86/kernel/cpu/mshyperv.c, you'll see that it is "x86/hyperv", not "x86/hyperv-v".
This is super picky, so don't respin just for this. The maintainer can probably fix it when accepting the patch if there are no other changes.
Reviewed-by: Michael Kelley mhklinux@outlook.com
Thank you Michael for reviewing. I'll keep this in mind while posting any patch in future.
Regards, Naman