[LEAPSECS] POSIX? (was Time math libraries, UTC to TAI)

Sat Dec 31 11:31:42 EST 2016

Warner Losh wrote:
> On Fri, Dec 30, 2016 at 9:52 PM, Steve Summit <scs+ls at eskimo.com> wrote:
> > CLOCK_TAI is already in the newest Linux kernels, but I'm not sure how
> > well it works; CLOCK_UTC is, shall we say, emerging.
>
> Other systems don't have this quite yet, but I'd love to see it more
> widely implemented. Is there a spec for this yet, or is it just
> whatever code is noodling around in Linux?

As Zefram already noted, Markus Kuhn specified CLOCK_UTC pretty clearly
at https://www.cl.cam.ac.uk/~mgk25/posix-clocks.html -- and that was
back in 1998!

I don't know of any implementations "in the wild", but if anyone
has heard of any, I'd love to hear about them.

As I've mentioned, I've got a prototype implementation based on
the Linux 4.4.22 kernel which I'm hoping to release very soon.
I'm attaching a rough draft of my own spec.

> To do this, one would need to tell the kernel that a new leap second
> is introduced.

Absolutely.  You can do that either by calling adjtimex and
setting STA_INS (as ntpd does today), or by updating a new
in-kernel table of leap seconds and letting it notice for itself.
(And I've got several avenues in mind for updating that in-kernel
table.)

> That's not too bad, but you'd also need to then run through all
> the timers in the system to adjust the time that the UTC timer
> was going to fire to be a new time...

Well, maybe not every timer in the system, but yes, potentially a
lot of them.  This is the part I'm most scared of (because the
Linux timer code is even more baroque and impenetrable than the
base timekeeping code, but that's another story).  I'm deferring
this part of the problem to "phase 2" of my work.

> But I'm curious how you'd represent a leap second in this scheme?

The same way Warnecke and Kempen, Markus Kuhn, David Madore,
and others have suggested: using nonnormalized struct timespecs.
This is mildly kludgey, but it has the glorious advantage that it
can use existing structures and system calls; all you need to do
("all you need to do") is introduce a new clkid value for
clock_gettime and the other Posix.1b clock and timer functions
to use.
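
To make the nonnormalized representation concrete, here is a minimal
sketch of how a caller might decode the time of day from such a
timespec.  The function name utc_format_tod is hypothetical and not
part of any existing API, and the sketch assumes a nonnegative tv_sec
and the convention that a leap second is reported as the 23:59:59
value of tv_sec with tv_nsec in the range [1e9, 2e9).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical helper: formats the time-of-day portion of a
 * CLOCK_UTC-style timespec as HH:MM:SS.nnnnnnnnn.
 * Assumptions: tv_sec >= 0, and a leap second is carried as the
 * preceding second's tv_sec with tv_nsec in [1e9, 2e9). */
int utc_format_tod(struct timespec ts, char *buf, size_t len)
{
    long sod  = (long)(ts.tv_sec % 86400);  /* second of day, 0..86399 */
    long nsec = ts.tv_nsec;
    int  sec  = (int)(sod % 60);

    if (nsec >= NSEC_PER_SEC) {             /* nonnormalized: leap second */
        nsec -= NSEC_PER_SEC;
        sec  += 1;                          /* 59 becomes 60 */
    }
    return snprintf(buf, len, "%02ld:%02ld:%02d.%09ld",
                    sod / 3600, (sod / 60) % 60, sec, nsec);
}
```

Code that is unaware of the convention and simply normalizes the
timespec would see the leap second as the first second of the next
day, which is why the check on tv_nsec has to come first.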

I'm also partial to an alternate representation using the triple
(days since 1970, seconds within day, nanoseconds), perhaps
augmented with a bit of additional metadata such as the current
day length and DTAI value.  I'm using that internally, because I
thought it'd be cleaner, but it may not be strictly necessary,
and I'm not making it visible at any public interfaces yet,
because of course everyone complains (and rightly so) that there
are too many time formats already.

> I'd like to schedule a timeout in the UTC time during the leap second

Me, too.  I don't have that implemented yet, but my intent is
that eventually CLOCK_UTC and its nonnormalized timespecs will be
meaningfully usable everywhere, not just in clock_gettime, but
also clock_settime (done), timer_create et al., and clock_nanosleep.

> It's especially troublesome if the kernel decides that there
> really isn't going to be a leap second at midnight for some reason
> (like a bad ntp server leaked the wrong data which was later
> corrected).

I'd like to say I'm not going to worry about that sort of case
too much -- clearly, we can't implement proper leap second
handling if we can't trust our time services to report them
properly! -- but based on what I'm seeing on the ntp servers I'm
sampling today, I may have been a bit too optimistic in that
stance. :-\

			*	*	*

The attachment contains a -- rough! -- draft of a spec for the
leapsecond-aware clock and timer code I am prototyping.
-------------- next part --------------
I. Clocks

For the Linux kernel modifications described here, three time
scales are defined:

1. UTC.  UTC is returned by the clock_gettime call with a
   clkid value of CLOCK_UTC.

   For dates on and after 1972-01-01, CLOCK_UTC values are
   defined in terms of UTC years, days, hours, minutes, and
   seconds based on a modified count of seconds since 1970-01-01,
   as follows:

   If S < 60:
	tv_sec = ((f(Y, J) * 24 + H) * 60 + M) * 60 + int(S)
	tv_nsec = frac(S) * 1000000000

   If 60 <= S < 61 (that is, during a leap second):
	tv_sec = ((f(Y, J) * 24 + H) * 60 + M) * 60 + int(S-1)
	tv_nsec = (frac(S) + 1) * 1000000000

   where Y is the year, J is the 0-based day of the year, and H,
   M, and S are UTC hours, minutes, and seconds, and where
   f(Y, J), the number of days since 1970-01-01, is defined
   (using integer division) as:

	f(Y, J) = (Y-1970) * 365 + J +
		(Y-1969)/4 - (Y-1901)/100 + (Y-1601)/400

   For dates before 1972-01-01, CLOCK_UTC values are defined
   in the analogous way in terms of proleptic UTC.  It is
   implementation-defined whether any leap seconds occurred
   prior to 1972.
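
The CLOCK_UTC definition can be sketched in C roughly as follows.
This is an illustration only: the names days_since_1970 and
utc_encode are hypothetical, the sketch reads f(Y, J) as a whole
number of days (an assumption on my part), it takes the integer
second and the nanoseconds as separate arguments, and it handles
only years from 1970 onward.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Days from 1970-01-01 to day J (0-based) of year Y, for Y >= 1970.
 * All divisions are integer divisions. */
static long long days_since_1970(int Y, int J)
{
    return (long long)(Y - 1970) * 365 + J
         + (Y - 1969) / 4 - (Y - 1901) / 100 + (Y - 1601) / 400;
}

/* Hypothetical helper: builds a CLOCK_UTC-style timespec from
 * broken-down UTC.  S is the integer second (0..60) and nsec the
 * fraction in nanoseconds (0..999999999).  A leap second (S == 60)
 * is encoded as second 59 with tv_nsec in [1e9, 2e9). */
struct timespec utc_encode(int Y, int J, int H, int M, int S, long nsec)
{
    struct timespec ts;
    int leap = (S == 60);

    ts.tv_sec  = (time_t)(((days_since_1970(Y, J) * 24 + H) * 60 + M) * 60
                          + (leap ? 59 : S));
    ts.tv_nsec = nsec + (leap ? 1000000000L : 0);
    return ts;
}
```

For example, utc_encode(2016, 365, 23, 59, 60, 0) describes the leap
second at the end of 2016 and yields the same tv_sec as 23:59:59 on
that day, with tv_nsec set to 1000000000.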

2. TAI.  TAI is returned by the clock_gettime call with a
   clkid value of CLOCK_TAI.

   For dates on and after 1972-01-01, CLOCK_TAI values are
   defined in terms of TAI years, days, hours, minutes, and
   seconds based on a count of seconds since 1970-01-01,
   as follows:

	tv_sec = ((f(Y, J) * 24 + H) * 60 + M) * 60 + int(S)
	tv_nsec = frac(S) * 1000000000

   where Y is the year, J is the 0-based day of the year, and H,
   M, and S are TAI hours, minutes, and seconds, and where
   f(Y, J) is defined as above.

   For dates before 1972-01-01, CLOCK_TAI values are
   defined in the analogous way in terms of proleptic TAI.

   At every instant, the difference between CLOCK_TAI and
   CLOCK_UTC values equals DTAI (TAI minus UTC) at that instant.
   (In the current implementation, the difference between
   CLOCK_TAI and CLOCK_UTC values will always be integral.)

3. Posix time.  Posix time is returned by the time and
   gettimeofday system calls, and by the clock_gettime call
   with a clkid value of CLOCK_REALTIME (aka CLOCK_POSIX).

   For dates on and after 1972-01-01, and except in the vicinity
   of a leap second, Posix time is identical to UTC.

   Posix time ignores the existence of leap seconds.  In the
   vicinity of a leap second (where "in the vicinity" is defined
   as below), Posix time is "smeared" with respect to UTC to give
   the illusion of a day with exactly 86400 seconds, without the
   leap second, and without any jumps or discontinuities.

   For a leap second occurring at time t, smearing occurs over a
   time interval beginning at t - o1 and ending at t + o2, for
   implementation-defined values of o1 and o2 such that:

	0 < o1 <= 86400
	0 <= o2 <= 43200

   During the interval from t - o1 to t + o2 (this is the
   definition of "in the vicinity of the leap second"), Posix
   time is derived from UTC as follows:

	t0 = t - o1

	t1 = utc.tv_sec + utc.tv_nsec / 1000000000

	t2 = t0 + (t1 - t0) * (o1 + o2) / (o1 + o2 + L)

	posix.tv_sec = int(t2)
	posix.tv_nsec = int(frac(t2) * 1000000000)

   where L is +1 for a positive leap second, -1 for a negative
   leap second.
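
The smearing formula can be sketched in C as below.  This is a
rough illustration rather than the prototype's code: the name
smear_posix is hypothetical, times are held in doubles for brevity
(which limits resolution to a fraction of a microsecond at current
epoch values), and the function takes the number of SI seconds
elapsed since t0 as its input instead of the raw CLOCK_UTC reading.
The two agree up to the end of the leap second, which covers every
case with o2 = 0.

```c
/* Hypothetical helper: linear leap smear.
 *   t       Posix/UTC value of the instant just after the leap second
 *   o1, o2  smear window before and after t, in seconds
 *   L       +1 for a positive leap second, -1 for a negative one
 *   elapsed SI seconds elapsed since t0 = t - o1
 * Returns the smeared Posix time as a double. */
double smear_posix(double t, double o1, double o2, int L, double elapsed)
{
    double t0   = t - o1;
    double span = o1 + o2;        /* window length in Posix seconds */

    if (elapsed <= 0)             /* before the window: no smear */
        return t0 + elapsed;
    if (elapsed >= span + L)      /* after the window: smear complete */
        return t0 + span + (elapsed - (span + L));
    return t0 + elapsed * span / (span + L);
}
```

With o1 = 86400, o2 = 0 and L = +1, the 86401 SI seconds of the leap
second day map onto exactly 86400 Posix seconds, so the smeared
clock reaches t at the same instant that UTC finishes the leap
second.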

   During the interval from t - o1 to t + o2, Posix seconds
   are slightly longer or shorter than UTC (and therefore SI)
   seconds, and it can therefore be said that during that
   interval, Posix time is using an approximate version of UT1
   instead of UTC.

   Typical values of o1 and o2 are:

	o1 = 1000, o2 = 0: UTC-SLS
	[see https://www.cl.cam.ac.uk/~mgk25/time/utc-sls/ ]

	o1 = 86400, o2 = 0: smear over entirety of UTC day
	preceding leap second (scs preferred)

	o1 = 36000, o2 = 36000: Google public NTP server smear:
	10 hours on either side of leap second 2017-01-01
	[see https://cloudplatform.googleblog.com/2016/11/making-every-leap-second-count-with-our-new-public-NTP-servers.html ]

	o1 = 43200, o2 = 43200: smear over 24 hours symmetrically
	surrounding leap second (Google proposed smearing
	standard; see https://developers.google.com/time/smear )

	o1 = 1, o2 = 0: "Maximum pain for minimum time" or "rip
	the bandage off" approach: tolerate a bigger discontinuity
	for a shorter time, approximating the (ill-defined but
	rather jumpy) legacy leap-second behavior.
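
One way to compare these choices is by how far the smeared clock's
rate departs from SI seconds during the window.  For a linear smear
as defined above, the fractional rate offset works out to
L / (o1 + o2 + L).  The helper below is a hypothetical illustration
of that arithmetic, not part of the prototype.

```c
/* Hypothetical helper: rate offset of a linear smear, in parts per
 * million, for window sizes o1 and o2 (seconds) and leap direction L
 * (+1 or -1).  A positive result means the smeared clock runs slow. */
double smear_rate_ppm(double o1, double o2, int L)
{
    return 1e6 * L / (o1 + o2 + L);
}
```

For a positive leap second this gives roughly 999 ppm for UTC-SLS,
about 11.6 ppm for either 24-hour smear, about 13.9 ppm for the
20-hour smear, and 500000 ppm (a clock running at half speed) for
o1 = 1, o2 = 0.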

II. Timers

My goal is that any alarm- or timer-using code, if it fetches the
time both at the beginning and end of an alarm/timer interval,
will observe a passage of time perfectly consistent with the
alarm/timer interval, *if* both the alarm/timer calls and the
fetch-time calls use consistent time scales (i.e., all true-UTC,
or all Posix-compatible and therefore potentially smeared).
As of this writing, however, the work required to achieve this
goal has not been completed.

There are three timer cases to consider: absolute (that is,
clock_nanosleep with the TIMER_ABSTIME flag set), time_t
relative, and UTC relative.

Absolute timers go off at exactly the caller-specified time, no
matter how long that is from now.  Absolute timers are unaffected
by leap seconds.  However, if at time t1 a caller sets an
absolute timer for time t2, and also computes d=t2-t1, it will
not necessarily be the case that precisely time d goes by until
the timer goes off.  The clock may be reset in the interim; there
may be a new leap second in the interim.  [Footnote: Those are two
significantly different cases, though, and the distinction would
be visible to a carefully-written program that fetched a
leap-second-aware time after the timer went off, and computed a
leap-second-aware difference d2.  d2 would equal d if the clock
had been reset, but not if a leap second had occurred.]

time_t relative timers count time_t (Posix) seconds.  On leap
second days, Posix seconds do not precisely equal SI/UTC seconds.
[Footnote: they will typically be "smeared" using some algorithm,
perhaps so steeply that one of them may seem to last for almost 2
seconds.]  If the duration of a time_t relative timer encompasses
multiple days, and if one (or more?) days are leap second days
while the others are not, a time_t relative timer will count true
seconds on the non-leap-second days and smeared seconds on the
leap second day(s).  Nevertheless, if a calling program fetches
the current (non-leap-second-aware) time, performs a time_t
relative sleep for d seconds, fetches the current (non-leap-
second-aware) time again, and computes the non-leap-second-aware
difference, that difference is (under all circumstances)
guaranteed to be almost exactly d.  (Any discrepancy will reflect
the minute delays between the calls, but it will *not* reflect
any known error(s) due to the difference(s) between Posix and UTC
seconds.)

UTC relative timers count SI/UTC seconds, no matter what.  If a
calling program fetches the current time in a leap-second-aware
way, performs a UTC relative sleep for d seconds, fetches the
leap-second-aware current time again, and computes the
leap-second-aware difference, that difference is (under all
circumstances) guaranteed to be almost exactly d.  (Any
discrepancy will reflect the minute delays between the calls,
but it will *not* reflect any known error(s) due to the
difference(s) between Posix and UTC seconds.)
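
A "leap-second-aware difference" between two CLOCK_UTC readings
could be computed along the following lines.  This is only a
sketch under stated assumptions: the names leap_midnights,
leaps_before and utc_diff are hypothetical, the table lists just
one leap second for illustration, and only positive leap seconds
are handled.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/* Hypothetical table: tv_sec values of the midnights immediately
 * following each positive leap second.  Only one entry is shown;
 * a real table would list them all. */
static const time_t leap_midnights[] = {
    1483228800,   /* 2017-01-01T00:00:00Z */
};

/* Number of tabulated leap seconds fully completed at tv_sec. */
static long leaps_before(time_t sec)
{
    size_t i, n = sizeof leap_midnights / sizeof leap_midnights[0];
    long count = 0;

    for (i = 0; i < n; i++)
        if (sec >= leap_midnights[i])
            count++;
    return count;
}

/* Elapsed SI seconds from a to b, both CLOCK_UTC-style timespecs
 * whose tv_nsec may lie in [1e9, 2e9) during a leap second. */
double utc_diff(struct timespec a, struct timespec b)
{
    long long ds = (long long)(b.tv_sec - a.tv_sec)
                 + (leaps_before(b.tv_sec) - leaps_before(a.tv_sec));

    return (double)ds + (double)(b.tv_nsec - a.tv_nsec) / 1e9;
}
```

With this table, the difference between 2016-12-31 23:59:59 and
2017-01-01 00:00:00 comes out as 2 seconds rather than 1, which is
the interval a UTC relative timer spanning that leap second would
actually count.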

Exceptions (alluded to above):
* Clock jumps may (will) cause relative timers to expire at other
  than the "expected time".
* Clock jumps may cause absolute timers to expire at other
  than the specified time, if the clock jumps to later than the
  expiration time (in which case the timer will seem to go off late).