[LEAPSECS] Time math libraries, UTC to TAI

Warner Losh imp at bsdimp.com
Thu Dec 29 00:08:58 EST 2016


On Wed, Dec 28, 2016 at 2:21 PM, Brooks Harris <brooks at edlmax.com> wrote:
> On 2016-12-28 02:29 PM, Warner Losh wrote:
>>
>> On Wed, Dec 28, 2016 at 10:33 AM, Brooks Harris <brooks at edlmax.com> wrote:
>>>
>>> The YMDhms count progression across the first Leap Second
>>> (1972-06-30T23:59:60 (UTC)) as yielded by POSIX gmtime() is expected to
>>> be
>>
>> Or rather "something like the following," because POSIX doesn't say
>> what happens during the leap second. Some systems replay the 799
>> second rather than the 800 second to avoid starting the day twice...
>> This is also allowed by POSIX, because every last thing dealing with
>> the leap second is implementation defined: it is outside the
>> scope of the standard. FreeBSD does 799 twice, for example. There are
>> other systems that 'freeze' time during the leap second, only
>> incrementing it by a tiny fraction for each gettimeofday call.
>
> Hi Warner,
>
> My understanding, also from David Mills: "Unlike the POSIX conventions, the
> NTP clock is frozen and does not advance during the leap second, so there
> is no need to set it back one second at the end of the leap second."

Except that it does advance... It plays the second twice, once with
the pending leap set, and once with it cleared, I thought. At least for
the time exchanges that happen during the leap second. I have a fuzzy
memory of this changing at some point (from freeze to double tick,
disambiguated by a bit), but have no primary source for this.

> This seems consistent with my understanding of the specs, that POSIX would
> "reset" and NTP would freeze. But POSIX is intentionally vague on its
> definition of "the epoch" to allow some fudge factor for older systems to be
> conformant, and it is somewhat unclear how Leap Seconds are handled.

POSIX is silent on leap seconds. Any leap second behavior is outside
the scope of the standard because POSIX time_t doesn't have leap
seconds at all; any behavior at all is possible during the leap second
since leap seconds cannot exist in POSIX. That's my main beef with the
standard: it works well for local time at the expense of being able to
do something sensible (or a range of sensible behaviors) with leap
seconds, both ticking through them and representing them.
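
For reference, here's why there is no room for them: POSIX defines
"Seconds Since the Epoch" as pure calendar arithmetic on the broken-down
UTC fields, with no leap second term anywhere. A minimal transcription
(from the XBD "Seconds Since the Epoch" section; any transcription slips
are mine):

  #include <stdio.h>
  #include <time.h>

  /*
   * POSIX "Seconds Since the Epoch": a closed-form function of the
   * broken-down UTC fields.  There is no term for leap seconds, so
   * 1972-06-30T23:59:60 simply has no encoding of its own.
   */
  static long long
  posix_epoch_seconds(const struct tm *tm)
  {
      long long y = tm->tm_year;           /* years since 1900 */

      return tm->tm_sec
           + tm->tm_min * 60LL
           + tm->tm_hour * 3600LL
           + tm->tm_yday * 86400LL
           + (y - 70) * 31536000LL
           + ((y - 69) / 4) * 86400LL
           - ((y - 1) / 100) * 86400LL
           + ((y + 299) / 400) * 86400LL;
  }

  int
  main(void)
  {
      /* 1972-06-30T23:59:59 UTC: tm_yday 181 in the leap year 1972. */
      struct tm last = { .tm_year = 72, .tm_yday = 181,
                         .tm_hour = 23, .tm_min = 59, .tm_sec = 59 };

      printf("%lld\n", posix_epoch_seconds(&last));   /* 78796799 */
      return 0;
  }

Plug in 1972-06-30T23:59:59 and 1972-07-01T00:00:00 and you get the
78796799 and 78796800 in Brooks's table further down; 23:59:60 has no
value of its own to map to.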

My memory of kern_ntptime.c is that early versions froze the system
time during the leap second, while later ones double ticked. I recall
reading something to this effect on Dave Mills' NTP web site years
ago, but a quick Google search is not turning it up at the moment.

> Indeed some implementations have made different choices, which is exactly
> the sort of mismatched behavior we'd all like to find a way to overcome,
> right?

There are two problems. One is that time_t is defined in such a way as
to preclude doing anything sensible with a leap second. The other is
that implementations differ on what the sensible thing to do is,
because they must break some symmetry to fix the problem: is time
monotonic, or can you double tick special seconds? Is frequency stable,
or can you fudge it (either hugely by stopping time, or tinily with a
long-term offset)? Must each second have a uniform label? Without
a sensible standard, with defined semantics, progress is impossible
because you must break some fundamental aspect of the passage of time.
Leap seconds break time by making its radix non-uniform in a fussy
and technical way that's poorly communicated to the world. Since the
standard assumes a uniform radix, the impedance mismatch cannot but
cause problems. Couple that with the prevalent "it's just a second, so
who cares" attitude, and I despair of a solution coming.

In other words, I'm of the opinion that defining what the right thing
to do while the ship is sinking isn't really useful if the underlying
model assumes ships can't sink.

>>>   time_t          gmtime()                UTC
>>> 78796799 = 1972-06-30 23:59:59 = 1972-06-30T23:59:59 (UTC)
>>> 78796800 = 1972-07-01 00:00:00 = 1972-06-30T23:59:60 (UTC) << Leap Second
>>> 78796800 = 1972-07-01 00:00:00 = 1972-07-01T00:00:00 (UTC) << time_t
>>> reset
>>> 78796801 = 1972-07-01 00:00:01 = 1972-07-01T00:00:01 (UTC)
>>>
>>> time_t must be reset after the Leap Second to maintain the alignment of
>>> the
>>> POSIX and UTC YMDhms representations. In effect the time_t origin has
>>> become
>>> coincident with "1972-01-01 00:00:00 UTC plus one Leap Second", or
>>> "1972-01-01 00:00:01 UTC". As David Mills says "... In effect, a new
>>> timescale is reestablished after each new leap second."
>>
>> Yes, you must repeat time_t values to be posixly correct. Many
>> extensions to POSIX have been proposed and implemented (and some are
>> quite good) that effectively say time_t ticks in TAI time and to get
>> UTC the host must translate with varying degrees of papering over the
>> old APIs.
>
>
> Seems to me these "some are quite good" extensions should be guides to
> adopting and standardizing common deterministic behavior.

Unfortunately, being "quite good" here is like being a "quite good
politician:" All politicians lie. And they all get caught in some of
their lies and it's impossible to know which ones in advance (or they
wouldn't tell them). Same is true for these APIs. Since time and
timing code has been written by a pack of pathologically independent,
strong-willed people whose certainty about proper code was unburdened
by their actually being right about the code, telling a lie that will
be benign to all this code is impossible. Some code can tolerate
repeated seconds, but can't tolerate frequency changes. Others can
tolerate time standing still (or almost still), but can't tolerate
repeated seconds. Some can tolerate a small frequency shift, but not a
large one. Some can handle the truth, but goes to great lengths to see
what 'lies' the system is telling it to sort out what the truth behind
the lies really is based on certain assumptions that may not be
universal. The APIs are good at telling lies and papering over leap
seconds, but aren't perfect at it because POSIX requires that you make
a choice about what aspect of time you sacrifice when you live on UTC
since its time_t can't represent time_t (and by extension, neither can
struct tm, for reasons that are less widely discussed having to do
with a misfeature (from a leap second correctness perspective) of
adding N to one of the tm_* fields meaning "normalize this time, with
the uniform radix assumptions, and use that for downstream results).
The trouble is that we can't even use flow analysis to determine what
the code "wants to hear" because the assumptions underlying some code
can be quite subtle and defy easy compile-time analysis.
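
To make that struct tm misfeature concrete, here's a toy example (the
dates are mine, and timegm() is a BSD/glibc extension rather than
POSIX): bump tm_sec by 60 just before the 2016-12-31 leap second and
the library normalizes with the uniform radix, landing on 00:00:30
instead of the 00:00:29 that sixty elapsed SI seconds would give:

  #include <stdio.h>
  #include <time.h>

  int
  main(void)
  {
      /* 2016-12-31 23:59:30 UTC, thirty seconds before that night's leap. */
      struct tm tm = {
          .tm_year = 2016 - 1900, .tm_mon = 11, .tm_mday = 31,
          .tm_hour = 23, .tm_min = 59, .tm_sec = 30
      };

      /* "Add a minute" the struct tm way: bump a field and renormalize. */
      tm.tm_sec += 60;

      /* timegm() (BSD/glibc extension) normalizes the fields assuming
       * every minute has 60 seconds, so we land on 2017-01-01 00:00:30
       * rather than the 00:00:29 that 60 elapsed SI seconds across
       * 23:59:60 would give. */
      time_t t = timegm(&tm);

      char buf[64];
      strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", gmtime(&t));
      printf("%s\n", buf);   /* prints 2017-01-01 00:00:30 */
      return 0;
  }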

So while it sounds promising, it is a really bad idea to base progress
on a liar. The state of the art will only change when the only correct
way to write programs requires full knowledge of leap seconds, can
represent them, and offers no shortcut that pretends leap seconds don't
exist.
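
To sketch what "can represent them" might look like (everything here is
illustrative, not a proposal from any standard): keep a count that does
not drop the inserted seconds, plus an explicit leap table, and
23:59:60 gets a value of its own instead of sharing a time_t with
00:00:00:

  #include <stdio.h>
  #include <time.h>

  /* Continuous (leap-inclusive) counts from the 1970 epoch at which the
   * inserted leap seconds fall.  Only the first leap is listed here. */
  static const long long leaps[] = { 78796800LL };   /* 1972-06-30T23:59:60 */
  #define NLEAPS (sizeof(leaps) / sizeof(leaps[0]))

  /* Convert a continuous second count to a UTC label, letting :60 exist. */
  static void
  print_utc(long long n)
  {
      long long posix = n;
      int is_leap = 0;

      for (size_t i = 0; i < NLEAPS; i++) {
          if (n > leaps[i])
              posix--;       /* a previously inserted second has passed */
          else if (n == leaps[i])
              is_leap = 1;   /* we are inside the inserted second itself */
      }
      if (is_leap)
          posix--;           /* label it as :59 below, then bump to :60 */

      time_t t = (time_t)posix;
      struct tm tm = *gmtime(&t);
      printf("%lld -> %04d-%02d-%02dT%02d:%02d:%02d\n", n,
             tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
             tm.tm_hour, tm.tm_min, tm.tm_sec + is_leap);
  }

  int
  main(void)
  {
      for (long long n = 78796799; n <= 78796802; n++)
          print_utc(n);      /* 23:59:59, 23:59:60, 00:00:00, 00:00:01 */
      return 0;
  }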

Sorry to be so pessimistic about this, but until you can solve the
dusty deck problem, and fix the standard's modeling of time, which is
at odds with UTC, there's little hope for progress. This is why I've
said that POSIX has de facto redefined UTC, since that's what most
programs get by default.

Warner

> -Brooks
>
>> But Dave Mills is right: if you are trying to count seconds,
>> your counts are necessarily discontinuous at a leap second in an
>> implementation defined way.
>>
>> Warner

