[LEAPSECS] Lets get REAL about time.
Poul-Henning Kamp
phk at phk.freebsd.dk
Fri Jan 20 06:29:18 EST 2012
I would like your comments on this API proposal. If we can agree
that it is workable, I am willing to push it, hard, in the UNIX world.
Thanks in advance.
Lets get REAL about time
------------------------
With the leap-seconds still unresolved, it is time that we get real about
time in the computer business.
Our history is littered with failed representations of time.
Counting milliseconds since boot in a 32bit integer means your
operating system runs out of steam after 49 days.
Counting seconds since 1970 in 32 bit will be fun in 2038.
Most if not all of these disasters are caused by skimping on bits, but
some of them are notable for their inherent inanity:
    struct timeval {
        time_t      tv_sec;     /* seconds */
        suseconds_t tv_usec;    /* and microseconds */
    };
How anybody could get the demented idea that a mixed-radix format
for time representation could be a smart move is totally beyond me.
Here is a handy macro to find time differences in the nanosecond
sibling of that format, struct timespec:
    #define timespecsub(vvp, uvp)                       \
        do {                                            \
            (vvp)->tv_sec -= (uvp)->tv_sec;             \
            (vvp)->tv_nsec -= (uvp)->tv_nsec;           \
            if ((vvp)->tv_nsec < 0) {                   \
                (vvp)->tv_sec--;                        \
                (vvp)->tv_nsec += 1000000000;           \
            }                                           \
        } while (0)
Lovely, isn't it ?
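To see the borrow at work, here is a minimal sketch that wraps the same logic in a function. The name ts_sub is hypothetical and not part of any system header, and both inputs are assumed to be normalized (0 <= tv_nsec < 1000000000).

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper with the same logic as the macro: subtract b
 * from a, borrowing one second when the nanosecond field goes negative.
 */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
    /* Assumption: both operands are already normalized. */
    assert(a.tv_nsec >= 0 && a.tv_nsec < 1000000000L);
    assert(b.tv_nsec >= 0 && b.tv_nsec < 1000000000L);

    a.tv_sec -= b.tv_sec;
    a.tv_nsec -= b.tv_nsec;
    if (a.tv_nsec < 0) {        /* the mixed-radix borrow */
        a.tv_sec--;
        a.tv_nsec += 1000000000L;
    }
    return a;
}
```

Subtracting 1.9 s from 2.1 s yields tv_sec = 0 and tv_nsec = 200000000 only because of the borrow step. With a single scalar type the whole helper collapses to a - b.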
So let's get real, and fix this once and for all.
How many bits of resolution do we need ?
We want at least 1000 years range of our new type.
There are approximately 32 billion seconds in 1000 years, so we need
at least 35 bits in front of the binary point.
We also want resolution that can measure all relevant physical and
in particular all relevant computing time intervals with an error less
than 0.1%.
Modern CPUs clock around 4GHz, multiplying by 1000 for resolution we
find that we need 42 bits after the binary point.
77 bits is pretty far from 64 bits, and already today, 42 years
from our epoch, 64bit resolution has 4% error on nanoseconds;
that's not workable.
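The bit budget can be checked with integer arithmetic alone. This is a small sketch; bits_needed is a hypothetical helper, and the year length of 31556952 seconds (the mean Gregorian year) is my assumption where the text rounds to "approximately 32 billion seconds".

```c
#include <assert.h>
#include <stdint.h>

/* Smallest b such that 2^b >= n: bits needed to count n distinct steps. */
static int
bits_needed(uint64_t n)
{
    int b = 0;

    assert(n > 0);
    while (b < 64 && (UINT64_C(1) << b) < n)
        b++;
    return b;
}

/* Bits in front of the binary point for a range of 1000 years. */
static int
range_bits(void)
{
    return bits_needed(UINT64_C(1000) * UINT64_C(31556952));
}

/* Bits behind the binary point: one 4GHz clock period, divided by 1000. */
static int
fraction_bits(void)
{
    return bits_needed(UINT64_C(4000000000) * UINT64_C(1000));
}
```

Under those assumptions range_bits() gives 35 and fraction_bits() gives 42, which is where the 77 comes from.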
We could invent a new data type, "int80_t" or something, but that
would be a lot of work, so let us just bite the bullet and use one
we already have: IEEE binary128 floating point, also known as
quadruple-precision floating point.
With 112 bits of resolution we have enough, I believe that allows
us to date events in the Carboniferous era with 28 bits behind the
nanoseconds.
Some might object that OS kernels traditionally do not do floating
point numbers, but this proposal does not require them to do so. It
is perfectly feasible to have a kernel which does timekeeping with
a 64bit integer and 64bit fraction part, and constructs a binary128
number from these two parts.
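As an illustration of that last step, the sketch below assembles the binary128 bit pattern from a 64bit integer part and a 64bit fraction part using integer operations only. The type and function names are hypothetical. It assumes a non-negative value, truncates instead of rounding when more than 113 significant bits are present, and returns the pattern as two 64bit halves (sign and exponent in the high half).

```c
#include <assert.h>
#include <stdint.h>

/* Raw IEEE binary128 pattern: 1 sign bit, 15 exponent bits, 112 fraction bits. */
struct binary128_bits {
    uint64_t hi;    /* sign, exponent, top 48 fraction bits */
    uint64_t lo;    /* low 64 fraction bits */
};

static struct binary128_bits
fixed_to_binary128(uint64_t sec, uint64_t frac)
{
    struct binary128_bits r = { 0, 0 };
    uint64_t hi = sec, lo = frac, exponent;
    int shift = 0;

    if (hi == 0 && lo == 0)
        return r;               /* +0.0 is all zero bits */

    /* Normalize: move the leading 1 of the 128bit value up to bit 127. */
    while ((hi >> 63) == 0) {
        hi = (hi << 1) | (lo >> 63);
        lo <<= 1;
        shift++;
    }
    assert(shift <= 127);

    /* The leading 1 started at bit (127 - shift); bit 64 has weight 2^0. */
    exponent = (uint64_t)(16383 + 63 - shift);  /* biased exponent */

    /* Drop the implicit leading 1, keep the next 112 bits, truncate 15. */
    r.hi = (exponent << 48) | ((hi >> 15) & UINT64_C(0xFFFFFFFFFFFF));
    r.lo = (hi << 49) | (lo >> 15);
    return r;
}
```

For example, sec = 1 and frac = 0 produce 0x3FFF000000000000 in the high half and zero in the low half, which is the binary128 encoding of 1.0. Truncation only comes into play once the integer part exceeds 2^49 seconds.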
Next comes the question of what we do about leap-seconds, and there
is only one answer that works in practice: We make them a
representation issue.
Our new timescale should run on the TAI timescale which does not
have leap-seconds or any other artifacts, and library functions can
convert that to UTC time, civil time etc, using a leap-second table
which can be updated as and when leap-seconds get announced.
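A leap-second table of this kind could look like the following sketch. The struct and function names are hypothetical, and a plain double stands in for the proposed binary128 type. Timestamps are TAI seconds after 2012-01-20 00:00:00 UTC, the epoch proposed further down, and the rows are the published TAI-UTC values (34 s at that epoch, then the leap seconds at the end of June 2012, June 2015 and December 2016).

```c
#include <assert.h>
#include <stddef.h>

struct leap_entry {
    double since;           /* TAI seconds after 2012-01-20 00:00:00 UTC */
    int    tai_minus_utc;   /* offset in force from that instant on */
};

/* Appending a row is all it takes when a new leap second is announced. */
static const struct leap_entry leap_table[] = {
    {         0.0, 34 },    /* in force since 2009-01-01 */
    {  14083201.0, 35 },    /* 2012-07-01 00:00:00 UTC */
    { 108691202.0, 36 },    /* 2015-07-01 00:00:00 UTC */
    { 156211203.0, 37 },    /* 2017-01-01 00:00:00 UTC */
};

/* Look up TAI-UTC for a timestamp at or after the epoch. */
static int
tai_minus_utc(double t)
{
    size_t i, n = sizeof(leap_table) / sizeof(leap_table[0]);
    int offset = leap_table[0].tai_minus_utc;

    assert(t >= 0.0);
    for (i = 0; i < n; i++)
        if (t >= leap_table[i].since)
            offset = leap_table[i].tai_minus_utc;
    return offset;
}
```

The timestamps themselves never jump. Only the table, and therefore the representation, changes when a leap second is added.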
Here is my proposed API:
]] typedef $compiler_magic realtime_t;
An IEEE binary128 number containing time on the TAI timescale
counted in SI seconds since 2012-01-20 00:00:00 UTC. The exact
epoch is not important, but it is a good idea to make it different
from all other currently used epochs, to make programmer mistakes
clearly visible.
]] int tai_time(realtime_t *timestamp, double *error);
Returns zero on success and timestamp contains a timestamp estimate
as specified above.
If the error parameter is not NULL, it returns the one-sigma error
estimate of the timestamp.
If the error parameter is NULL, and the one sigma error estimate
is larger than 0.1 second, the call returns a negative number and
sets errno to EACCES.
If either of the two pointers is not valid in the current address
space, the call fails and sets errno to EFAULT.
In all other cases the call succeeds.
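The decision logic around the error parameter can be sketched in user space as follows. Everything here is illustrative: struct clock_estimate and tai_time_sketch are hypothetical names, a double stands in for realtime_t, and the clock state is passed in explicitly so the logic can be exercised. The EFAULT check needs kernel help and is replaced by an assert. Leaving the timestamp untouched on failure is my choice; the proposal does not say.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for whatever the system knows about its clock. */
struct clock_estimate {
    double now;     /* TAI seconds since the epoch */
    double sigma;   /* one-sigma error estimate, in seconds */
};

static int
tai_time_sketch(const struct clock_estimate *clk, double *timestamp,
    double *error)
{
    /* A real implementation would fail with EFAULT here instead. */
    assert(clk != NULL && timestamp != NULL);

    if (error == NULL && clk->sigma > 0.1) {
        /* The caller did not ask for the error, and it is too large. */
        errno = EACCES;
        return -1;
    }
    *timestamp = clk->now;
    if (error != NULL)
        *error = clk->sigma;
    return 0;
}
```

A caller that passes an error pointer always gets a timestamp and can judge its quality itself. A caller that passes NULL is protected from silently using a clock that is off by more than 0.1 second.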
]] int run_time(realtime_t *timestamp);
This call returns a monotonically increasing estimate of elapsed
time in SI seconds, since the program started.
This timescale may or may not advance during such periods where
the program is not able to run, for instance during computer "suspend"
or "hibernate" modes, or while images of virtual machines are stored
or transferred between hardware platforms.
In all other cases, this timescale represents elapsed time in SI
seconds, and two successive calls to this function will always
return two timestamps in increasing order.
This call can only fail if the pointer is not valid in the current
address space, in which case it sets errno to EFAULT.
In all other cases the call succeeds and returns zero.
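On a POSIX system something close to this can be approximated on top of clock_gettime(CLOCK_MONOTONIC). The sketch below is only an approximation under stated assumptions: run_time_sketch is a hypothetical name, a double stands in for realtime_t, elapsed time is counted from the first call instead of from program start, it is not thread-safe, and it guarantees non-decreasing values, not strictly increasing ones.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

static int
run_time_sketch(double *timestamp)
{
    static struct timespec start;   /* time of the first call */
    static int started;
    struct timespec now;

    /* A real implementation would fail with EFAULT here instead. */
    assert(timestamp != NULL);

    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
        return -1;
    if (!started) {
        start = now;
        started = 1;
    }
    /* Collapse the mixed-radix pair into one scalar number of seconds. */
    *timestamp = (double)(now.tv_sec - start.tv_sec) +
        (double)(now.tv_nsec - start.tv_nsec) * 1e-9;
    return 0;
}
```

Whether CLOCK_MONOTONIC advances during suspend differs between systems, which matches the latitude the description above leaves open.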
Next we need the UTC and civil time representation functions. Here
we leverage the already defined "struct tm", but add to it a member
for fractional seconds:
]] double tm_frac; /* fractional second [0...1[ */
We treat UTC as another case of the larger class of civil timezones.
]] int realtime_tm(realtime_t timestamp, struct tm *tm, const char *tz);
Converts the timestamp to "broken down time" in the specified
timezone, applying leap-second corrections, daylight savings time etc
as directed by the "tz" parameter.
This call returns zero on success, and a negative number on failure.
The call can fail only in the following ways:
    The tm or tz pointer is invalid in the current address space (errno = EFAULT)
    The timestamp is not a valid floating point number (errno = EINVAL)
    The timestamp is outside the valid range of translation tables
    available to the program (errno = E2BIG)
    The tz parameter describes an unknown or unsupported timescale (errno = ENOENT)
All implementations must support tz = "UTC". It would be smart to
standardize other values as well.
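A minimal sketch of this conversion for tz = "UTC" only is shown below. It is not a reference implementation. The names realtime_tm_sketch and struct tm_ext are hypothetical (the wrapper struct stands in for a struct tm extended with tm_frac), a double stands in for realtime_t, NULL checks stand in for the EFAULT test, and the built-in table knows only the leap second at the end of June 2012, so it declares itself valid up to the next one in June 2015.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <time.h>

struct tm_ext {
    struct tm tm;
    double    tm_frac;      /* fractional second [0...1[ */
};

#define EPOCH_POSIX 1327017600LL    /* 2012-01-20 00:00:00 UTC */
#define LEAP_START  14083200LL      /* TAI second shown as 23:59:60 on 2012-06-30 */
#define VALID_UNTIL 108691201.0     /* start of the 2015-06-30 leap second */

static int
realtime_tm_sketch(double ts, struct tm_ext *out, const char *tz)
{
    long long whole, leaps = 0;
    int in_leap = 0;
    time_t posix;

    if (out == NULL || tz == NULL) { errno = EFAULT; return -1; }
    if (ts != ts) { errno = EINVAL; return -1; }            /* NaN */
    if (strcmp(tz, "UTC") != 0) { errno = ENOENT; return -1; }
    if (ts < 0.0 || ts >= VALID_UNTIL) { errno = E2BIG; return -1; }

    whole = (long long)ts;          /* ts >= 0, so this is floor(ts) */
    if (whole > LEAP_START)
        leaps = 1;                  /* the leap second is behind us */
    else if (whole == LEAP_START)
        in_leap = 1;                /* we are inside 23:59:60 */

    /* Inside the leap second, convert 23:59:59 and patch tm_sec. */
    posix = (time_t)(EPOCH_POSIX + whole - leaps - in_leap);
    if (gmtime_r(&posix, &out->tm) == NULL) { errno = E2BIG; return -1; }
    if (in_leap)
        out->tm.tm_sec = 60;
    out->tm_frac = ts - (double)whole;
    return 0;
}
```

With this table, timestamp 14083200.5 comes out as 2012-06-30 23:59:60.5 and 14083201.0 as 2012-07-01 00:00:00, so the leap second shows up only in the broken-down form.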
We need the reverse function as well:
]] int tm_realtime(struct tm *tm, realtime_t *timestamp, const char *tz);
This call returns zero on success, and a negative number on failure.
The call can fail only in the following ways:
    One of the pointers is invalid in the current address space (errno = EFAULT)
    The struct tm is not a valid timestamp on the specified timescale
    (errno = EINVAL)
    The struct tm is outside the validity of the timescale translations available
    to the program (errno = E2BIG)
    The tz parameter describes an unknown or unsupported timescale (errno = ENOENT)
All implementations must support tz = "UTC". It would be smart to
standardize other values as well.
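The reverse direction can be sketched along the same lines, again for tz = "UTC" only and under the same assumptions: tm_realtime_sketch and struct tm_ext are hypothetical names, a double stands in for realtime_t, NULL checks stand in for the EFAULT test, and the only leap second known to the sketch is the one at the end of June 2012. Field validation is limited to simple range checks, so a date such as February 30 is not rejected here.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <time.h>

struct tm_ext {
    struct tm tm;
    double    tm_frac;      /* fractional second [0...1[ */
};

#define EPOCH_DAY  15359LL      /* 2012-01-20 in days since 1970-01-01 */
#define LEAP_LABEL 14083200LL   /* 2012-07-01 00:00:00 UTC, leap seconds ignored */
#define VALID_END  108691200LL  /* 2015-07-01 00:00:00 UTC, leap seconds ignored */

/* Days since 1970-01-01 for a proleptic Gregorian date (m = 1..12). */
static long long
days_from_civil(long long y, int m, int d)
{
    long long era, yoe, doy, doe;

    y -= (m <= 2);
    era = (y >= 0 ? y : y - 399) / 400;
    yoe = y - era * 400;
    doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

static int
tm_realtime_sketch(const struct tm_ext *in, double *timestamp, const char *tz)
{
    const struct tm *t;
    long long label, effective;
    int sec;

    if (in == NULL || timestamp == NULL || tz == NULL) { errno = EFAULT; return -1; }
    if (strcmp(tz, "UTC") != 0) { errno = ENOENT; return -1; }
    t = &in->tm;
    if (t->tm_mon < 0 || t->tm_mon > 11 || t->tm_mday < 1 || t->tm_mday > 31 ||
        t->tm_hour < 0 || t->tm_hour > 23 || t->tm_min < 0 || t->tm_min > 59 ||
        t->tm_sec < 0 || t->tm_sec > 60 ||
        !(in->tm_frac >= 0.0 && in->tm_frac < 1.0)) { errno = EINVAL; return -1; }

    /* Seconds past the epoch as labelled by UTC, leap seconds ignored. */
    sec = (t->tm_sec == 60) ? 59 : t->tm_sec;
    label = (days_from_civil(t->tm_year + 1900LL, t->tm_mon + 1, t->tm_mday) -
        EPOCH_DAY) * 86400 + t->tm_hour * 3600LL + t->tm_min * 60LL + sec;
    effective = label + (t->tm_sec == 60 ? 1 : 0);

    if (label < 0 || effective >= VALID_END) { errno = E2BIG; return -1; }
    /* 23:59:60 is only valid where the table has a leap second. */
    if (t->tm_sec == 60 && effective != LEAP_LABEL) { errno = EINVAL; return -1; }

    /* Add the leap seconds already inserted before this instant. */
    *timestamp = (double)(effective + (label >= LEAP_LABEL ? 1 : 0)) +
        in->tm_frac;
    return 0;
}
```

Under these assumptions 2012-06-30 23:59:60.5 maps to 14083200.5 and 2012-07-01 00:00:00 to 14083201.0, while 23:59:60 on any other day known to the table is rejected with EINVAL.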
There, done.
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.