[LEAPSECS] leap seconds in POSIX

Martin Burnicki martin.burnicki at burnicki.net
Mon Jan 27 05:15:47 EST 2020


Hal Murray wrote:
> 
> Does anybody know of a good writeup of how to fix POSIX to know about leap 
> seconds and/or why POSIX hasn't done anything about it yet?

I've made a number of presentations and whitepapers about leap seconds
and problems related to them. However, I'm not aware of an easy, good
solution.

The basic problem is that common API calls used to retrieve the system
time stamp from an OS don't provide a status that could be used to
distinguish between a normal second and an ongoing, inserted leap second.

Without such status a function that converts a timestamp to calendar
date and time doesn't know if the timestamp associated with an inserted
leap second should yield a second with number '60' of the current day,
or a second '00' of the next day.

I think an API that provides a timestamp and an associated status in a
*consistent* way would be too "expensive" with regard to execution time
because some locking mechanism would have to be implemented to prevent
inconsistent timestamp and status information from being returned.

On the other hand, if an application already has a broken down date and
time (e.g. from an external time source, serial string or whatever), it
knows that the time is the leap second if the second number is '60'. So
the '60' is the required status information.

However, if you "normalize" the time of a leap second, e.g. 23:59:60,
then 60 seconds carry over to one minute, 60 minutes to 1 hour, and 24
hours to the next day. So when computing an associated timestamp, the
effective timestamp is the same as for 00:00:00 the next day, and the
information that this second is a leap second is simply lost unless
otherwise preserved in some way.
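
As a small illustration of this carry-over, here is a Python sketch. It
relies only on the standard calendar.timegm() and time.gmtime()
functions; the date of the leap second at the end of 2016 is used as an
example.

```python
import calendar
import time

# Broken-down UTC time of the leap second inserted at the end of 2016.
leap_second = (2016, 12, 31, 23, 59, 60)
# First regular second of the following day.
next_midnight = (2017, 1, 1, 0, 0, 0)

# timegm() simply carries the 60 seconds over into the next minute,
# the 60 minutes into the next hour, and so on.
ts_leap = calendar.timegm(leap_second)
ts_midnight = calendar.timegm(next_midnight)

# Both broken-down times end up as the same POSIX timestamp.
same_timestamp = (ts_leap == ts_midnight)

# Converting back cannot recover the '60' in the seconds field.
round_trip = tuple(time.gmtime(ts_leap)[:6])
```

After the round trip the seconds field is 0 and the date is the next
day, so the leap second status is gone unless it is carried separately.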

On most Unix-like systems the system time is kept as POSIX time, and
when a leap second is inserted, the kernel just steps the system time
back by 1 second.

*This* is what confuses applications that don't expect the system
timestamp ever to go backward.

Many years ago Dave Mills proposed a way to avoid this problem by
stopping the system clock for 1 s, and doing a system time increment by
the smallest unit whenever an application retrieves a system time stamp
while the clock is stopped.

This would be a good workaround since the time returned from the kernel
never steps back, and there are no duplicate timestamps because even
during the leap second the timestamps increment by a small amount.
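
The difference between the two behaviours can be sketched with a toy
model in Python. This is not kernel code: the names step_back_clock and
StoppedClock, the nanosecond tick, and the choice of the value at which
the clock is held are assumptions made for illustration. The input
raw_ns stands for a uniformly running nanosecond count that equals
POSIX time before the leap second.

```python
NS_PER_S = 1_000_000_000


def step_back_clock(raw_ns, leap_start_ns):
    """Clock that is stepped back by 1 s when the leap second begins."""
    if raw_ns < leap_start_ns:
        return raw_ns
    return raw_ns - NS_PER_S


class StoppedClock:
    """Clock that is held for 1 s but still creeps forward on each read."""

    def __init__(self, leap_start_ns, tick_ns=1):
        self.leap_start_ns = leap_start_ns  # raw time at which the leap second begins
        self.tick_ns = tick_ns              # smallest increment handed out per read
        self.last_ns = None                 # last value returned to a caller

    def read(self, raw_ns):
        if raw_ns < self.leap_start_ns:
            value = raw_ns                  # normal operation
        elif raw_ns < self.leap_start_ns + NS_PER_S:
            value = self.leap_start_ns      # clock is stopped (assumed hold value)
        else:
            value = raw_ns - NS_PER_S       # running again, 1 s behind raw time
        # Simplification: this check runs on every read, so a value is
        # never repeated and never goes backward.
        if self.last_ns is not None and value <= self.last_ns:
            value = self.last_ns + self.tick_ns
        self.last_ns = value
        return value
```

Reading both clocks at the same raw times around the leap second shows
the stepped clock going backward once, while the stopped clock returns
strictly increasing values. The extra comparison on each read is the
per-call cost that this approach adds.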

Quite some time ago I asked one of the Linux kernel maintainers why they
don't implement the leap second handling this way, and the answer was
just "because it's too expensive". Whenever an application queries the
current system time, the kernel would have to check whether a leap
second is just being inserted, and if so, some small amount of time
would have to be added to the stopped system time.

And all this just to get it "right" for 1 second in several years. Just
stepping the time back by 1 second at a certain point in time is much
faster, and much easier to implement.

> I assume the basic idea would be something like switch the kernel to use TAI 
> rather than UTC and implement conversion in some library routines.

I think this could be a good approach, but it requires that not only
leap second announcements but also the current TAI offset are supplied
to the OS.

Runtime libraries require the correct TAI offset to be able to convert
the TAI system time to UTC. E.g., the kernel could return a TAI
timestamp, and a runtime library function like gmtime() needs to know
the current TAI offset and leap second date to be able to return a time
with a 60 in the seconds field.
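
A rough Python sketch of such a conversion follows. It assumes that the
kernel timestamp is counted the way CLOCK_TAI is on Linux, i.e. POSIX
time plus the TAI-UTC offset. The function name tai_gmtime and the
two-entry table are made up for the example; a real implementation
would need the complete leap second history.

```python
import bisect
import time

# (TAI-based timestamp of the inserted leap second, TAI-UTC offset before it)
LEAP_TABLE = [
    (1435708835, 35),  # 2015-06-30 23:59:60 UTC
    (1483228836, 36),  # 2016-12-31 23:59:60 UTC
]
LEAP_STAMPS = [stamp for stamp, _ in LEAP_TABLE]


def tai_gmtime(t_tai):
    """Convert whole TAI-based seconds to (Y, M, D, h, m, s) in UTC."""
    i = bisect.bisect_right(LEAP_STAMPS, t_tai)
    if i == 0:
        # Before the first table entry; only valid back to the
        # previous leap second, which this short table omits.
        offset = LEAP_TABLE[0][1]
        in_leap = False
    else:
        leap_tai, offset_before = LEAP_TABLE[i - 1]
        offset = offset_before + 1
        in_leap = (t_tai == leap_tai)
    year, month, day, hour, minute, second = time.gmtime(t_tai - offset)[:6]
    if in_leap:
        # t_tai - offset is 23:59:59 here; label the inserted second as 60.
        second += 1
    return (year, month, day, hour, minute, second)
```

With this table, three consecutive TAI timestamps around the end of
2016 convert to 23:59:59, 23:59:60 and 00:00:00 of the next day, which
plain gmtime() on a POSIX timestamp cannot do.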

I know there are leap second files from IERS or NIST, and also current
versions of the TZ database contain a copy of such a file, but this
requires that this information is continuously updated, i.e. new
versions of the leap second files or the TZ database are supplied.

This should work (I haven't tried it) if you configure the system with
one of the "right" timezones, which gets the TAI information from a
table of the TZ DB.

This may work with current versions of operating systems, where the OS
maintainers or the admin provide the required information, but may not
work reliably for systems that are out of support, or embedded systems
that never get any update. These would use a wrong UTC time after the
next leap second.

In the past, ntpd could also provide its clients with the TAI offset if
autokey was configured. Since autokey is now obsolete and replaced by
NTS, an extension field to the NTP packet could provide the TAI offset.

PTP/IEEE 1588 works based on TAI, and the protocol provides the TAI to
UTC offset, so either PTP or NTP with TAI offset extension field could
be used to adjust the kernel time to TAI.

Both ntpd and PTP clients should be able to write the current TAI offset
and leap second announcements to the kernel. However, AFAIK time
conversion functions only retrieve the TAI information from the TZ DB,
not from the kernel, so the TAI capabilities of NTP and PTP clients
don't really help if the TZ DB isn't updated.

So, IMO, API calls would be required that the runtime library could
use to query the current TAI offset, a TAI timestamp of the next leap
second (if one has been scheduled), and the TAI offset after the next
leap second from the kernel. Then the system could use information
provided via NTP or PTP even without further updates of a leap second
file or the TZ DB.
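
To show that these three values would be enough for a library to pick
the right offset, here is a minimal Python sketch. KernelLeapInfo and
offset_for are hypothetical names; no existing kernel interface is
implied, and the timestamps are assumed to be whole TAI-based seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KernelLeapInfo:
    """Hypothetical record a kernel could expose; not an existing API."""
    tai_offset: int                          # current TAI-UTC offset in seconds
    next_leap_tai: Optional[int] = None      # TAI timestamp of the next inserted second
    tai_offset_after: Optional[int] = None   # TAI-UTC offset once that second has passed


def offset_for(info, t_tai):
    """Return (TAI-UTC offset to apply, True if t_tai is the leap second)."""
    if info.next_leap_tai is None or t_tai < info.next_leap_tai:
        # No leap second scheduled, or it has not been reached yet.
        return info.tai_offset, False
    return info.tai_offset_after, t_tai == info.next_leap_tai
```

A conversion function could subtract the returned offset from the TAI
timestamp and put 60 into the seconds field whenever the flag is set.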

> There is a discussion on the IETF ntp list with typical S/N for this topic.

Above I tried to write a summary from my point of view, and I hope it's
not considered as noise. ;-)

Martin

