[LEAPSECS] Leap seconds ain't broken, but most implementations are broken

Wed Jan 4 04:03:50 EST 2017

Tony Finch wrote:
> Warner Losh <imp at bsdimp.com> wrote:
>> We have a
>> specific legacy standard called POSIX that's causing all kinds of
>> issues that pop up when you least expect it
> 
> I haven't mentioned the usual litany of NTP servers getting it wrong,
> including servers run by national time labs. It's pretty embarrassing
> that one of our main time distribution systems routinely screws up leap
> seconds.
> 
> So the blame isn't just due to POSIX, or even mostly. Remember, NTP time
> stamps are equal to POSIX time stamps with a constant offset regardless
> of the number of leap seconds. The difference is that NTP actually
> specifies how to handle leap seconds, and carries leap indicator bits
> alongside the timestamp. 
> 
> Even though NTP can represent current UTC correctly, it often gets leap
> seconds wrong. It does not give confidence that we will be able to
> reduce bugs by teaching more code about leap seconds, when NTP cares
> about time and gets it wrong, and most code cares much less.

I think this statement isn't quite fair.

If a web server delivered a page with broken HTML code you wouldn't
blame the web server daemon, e.g. apache, would you? It's the task of
the web server admin to configure the server correctly and make sure the
original PHP or HTML code is such that the delivered page isn't broken.

IMO this is similar to ntpd. If it's not provided with an updated leap
second file then it may have no idea that a leap second is approaching.
If a faulty GPS receiver passes a leap second warning to ntpd, should
ntpd not trust the GPS receiver since it knows there are some broken
receivers out there?

Current versions of ntpd accept a leap second file if it has been
configured, and the file hasn't expired. If no leap second file can be
used then a leap second announcement from a refclock is used.

For client nodes without their own refclock, either the leap second
file needs to be provided, or a majority of the configured upstream NTP
servers need to send a leap second warning flag before ntpd accepts the
announcement.
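
To make the order of precedence described above concrete, here is a
minimal sketch in Python. It is not ntpd's actual code: the function
name, its parameters, and the strict-majority rule are assumptions made
for illustration. Only the leap indicator values (0 = no warning,
1 = insert, 2 = delete, 3 = unsynchronized) come from the NTP protocol.

```python
# NTP leap indicator (LI) values as carried in the packet header.
LI_NONE = 0      # no warning
LI_INSERT = 1    # last minute of the day has 61 seconds
LI_DELETE = 2    # last minute of the day has 59 seconds
LI_UNSYNC = 3    # clock not synchronized


def leap_insert_pending(now, leapfile_expiry=None, leapfile_announces=False,
                        refclock_li=None, upstream_li=()):
    """Decide whether to arm a leap second insertion (hypothetical helper).

    now, leapfile_expiry: timestamps in the same arbitrary unit.
    leapfile_announces:   True if the leap second file lists a pending leap.
    refclock_li:          LI value from a local refclock, or None if absent.
    upstream_li:          LI values reported by the configured NTP servers.
    """
    # 1. A configured, unexpired leap second file takes precedence.
    if leapfile_expiry is not None and now < leapfile_expiry:
        return leapfile_announces
    # 2. Otherwise fall back to the announcement from a refclock.
    if refclock_li is not None:
        return refclock_li == LI_INSERT
    # 3. Pure client: require a majority of upstream servers to agree
    #    (strict majority is an assumption of this sketch).
    votes = sum(1 for li in upstream_li if li == LI_INSERT)
    return votes * 2 > len(upstream_li)
```

Note that in this sketch a valid leap second file overrides a faulty
GPS receiver, but an expired file leaves the daemon at the mercy of
whatever the refclock or the upstream servers announce.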

This tries to make operation as safe as possible, but even this doesn't
help in every case. Imagine your NTP daemon has a valid leap second file,
handles the leap second correctly, and also passes the leap second
warning to its clients.

If this daemon's time sources (GPS receiver, or upstream NTP server(s))
don't insert the leap second at the same time then our daemon will
observe a sudden 1 s offset after the leap second, and even though it
has itself handled the leap second correctly, it will step the time a
few minutes later to the wrong time provided by its reference time
source(s).
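
A toy simulation of this scenario, as a rough sketch only: the clock
model, the function names, and the polling loop are made up for
illustration, and the threshold values (0.128 s step threshold, 300 s
stepout interval) are assumptions that should be checked against the
documentation of the ntpd version in use.

```python
STEP_THRESHOLD = 0.128   # seconds; assumed step threshold
STEPOUT = 300            # seconds an offset must persist; assumed value


def correct_count(t, leap_at):
    """Seconds count of a daemon that inserted the leap second at leap_at."""
    return t - 1 if t >= leap_at else t


def faulty_source(t):
    """Seconds count of a time source that never inserted the leap second."""
    return t


def simulate(leap_at=1000, poll=64, duration=2000):
    """Return (time of step, resulting error in seconds) for our daemon."""
    first_bad = None
    for t in range(0, duration, poll):
        local = correct_count(t, leap_at)
        offset = faulty_source(t) - local
        if abs(offset) <= STEP_THRESHOLD:
            first_bad = None          # source agrees with us, nothing to do
            continue
        if first_bad is None:
            first_bad = t             # first poll showing the large offset
        elif t - first_bad >= STEPOUT:
            # Offset persisted long enough: step to the source's time.
            stepped = local + offset
            return t, stepped - correct_count(t, leap_at)
    return None, 0


step_time, error = simulate()
print(step_time, error)
```

With these assumed numbers the daemon is correct immediately after the
leap second, then steps away from the correct time 344 s later and ends
up 1 s off, just like its source.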

IMO a basic problem here is that there is no general rule for when
another leap second has to be inserted. Instead, it depends on human
decisions, and time servers need to be provided with the new information
manually.

BTW, this is similar to time zones: if the time zone files aren't
updated after some rules have changed, the DST conversion for certain
zones may not be correct. Would you blame glibc and friends for this?

I would have expected that at least the folks at the time labs take care
to set up their servers correctly, and I agree that it's very
disappointing that this isn't always the case.

Martin