On Sun, Mar 16, 2025 at 9:56 PM Thomas Gleixner tglx@linutronix.de wrote:
On Sat, Mar 15 2025 at 16:22, John Stultz wrote:
On Sat, Mar 15, 2025 at 12:23 PM Thomas Gleixner tglx@linutronix.de wrote:
So to fix this, rework the timekeeping_advance() logic a bit so that when we are called from do_adjtimex() and the offset is smaller then cycle_interval, that we call timekeeping_forward(), to first accumulate the sub-interval time into xtime_nsec. Then with no unaccumulated cycles in offset, we can do the mult adjustment without worry of the subtraction having an impact.
It's a smart solution. I briefly pondered something similar, but I'm not really fond of the fact, that it causes a clock_was_set() event for no good reason.
clock_was_set() means that there is a time jump. But that's absolutely not the case with do_adjtimex() changing the frequency for quick adjustments. That does not affect continuity at all.
Oh, good point. I wasn't thinking clearly about the semantics, and was being a little paranoid there (as most calls to timekeeping_forward_now() have clock_was_set() calls that follow). I suspect I can do away with that bit and avoid it. I'll respin later this week.
Actually your patch is not even emitting a clock_was_set() event:
if (offset < real_tk->cycle_interval) {
timekeeping_forward(tk, now);
*clock_set = 1;
return 0;
}
#define TK_CLEAR_NTP (1 << 0) #define TK_CLOCK_WAS_SET (1 << 1)
So it clears NTP instead. Not really what you want either. :)
Hey Thomas, Sorry for the slow reply here. So I agree with you that we don't want to set clock_set above, that was my mistake. But this last bit I don't think is right, as timekeeping advance() just returns a bool (return !!clock_set;), which is used to decide to call clock_was_set() or not - not the argument passed to clock_was_set().
But yes, it simply can forward the clock without side effects.
I think that this should be done for all TICK_ADV_FREQ adjustments. In case of such asynchronous adjustments it does not make any sense to take the old ntp_error value into account and accumlate some more. In fact this simply should clear ntp_error so the new value goes into effect as provided by NTP and not skewed by ntp_error.
These async adjustments (usually very small ones) happen when the current source degrades and chronyd/ntpd switches over to a new server.
Something like the below.
So I finally got a chance to look at the diff between your change and mine, and your changes look good to me. Thanks again for catching the clock_set thinko, and I agree clearing ntp_error looks like the right thing to do. I'm going to do some testing here with them and resubmit shortly.
thanks -john