[LEAPSECS] Lets get REAL about time.
Zefram
zefram at fysh.org
Sat Jan 21 16:57:37 EST 2012
Poul-Henning Kamp wrote:
>How anybody could get the demented idea that a mixed-radix format
>for time representation could be a smart move is totally beyond me.
struct timeval and struct timespec have an obvious benefit: durations that
are short decimal fractions of the second can be represented exactly.
I think this should not be lightly dismissed. Mixed radix is silly,
sure, but to have the quantum be a decimal, rather than binary, power
of the second is useful. It gets along well with the ways we represent
times and durations for human use, which are all decimal.
If you use binary fractions then you'll have to convert between decimal
and binary fractions a lot, and you'll run into rounding issues. In that
case you should probably work out some conversion rules that round-trip
values correctly to some appropriate precision.
When representing time in a single binary quantity, you could avoid the
conversion pain by counting attoseconds rather than fractional seconds.
This has the inconvenience that pulling out the seconds part requires
performing a division, rather than just ignoring some of the bits.
But you're not encouraging that sort of operation anyway.
I'm personally ambivalent about this. I appreciate the purity of using
binary fractions of the second, but the conversions are so prevalent
that an exact representation for decimal fractions (down to attoseconds)
looks like a worthwhile tradeoff.
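
To make the rounding issue concrete, here is a minimal sketch in C. It is an illustration only, not part of any proposed API: the function names are hypothetical, and a 32-bit binary fraction stands in for the 64-bit one to keep the arithmetic in uint64_t. It converts whole milliseconds to a binary fraction of a second and back. With round-to-nearest in both directions every millisecond value survives the round trip; with truncation on the way back, 1 ms comes back as 0 ms.

```c
#include <stdint.h>

/* Convert milliseconds (0..999) to a 32-bit binary fraction of a second,
 * rounding to nearest.  1 ms is not exactly representable: 2^32 / 1000
 * is 4294967.296, which has to be rounded to an integer. */
static uint32_t ms_to_frac32(uint32_t ms)
{
    return (uint32_t)((((uint64_t)ms << 32) + 500) / 1000);
}

/* Convert a binary fraction back to milliseconds, rounding to nearest.
 * Paired with ms_to_frac32() this round-trips every value in 0..999. */
static uint32_t frac32_to_ms(uint32_t frac)
{
    return (uint32_t)(((uint64_t)frac * 1000 + (UINT64_C(1) << 31)) >> 32);
}

/* The same conversion with truncation instead of rounding.  This is the
 * obvious thing to write, and it does not round-trip. */
static uint32_t frac32_to_ms_trunc(uint32_t frac)
{
    return (uint32_t)(((uint64_t)frac * 1000) >> 32);
}
```

A count of attoseconds (or any decimal subunit) needs none of this: 1 ms is simply the integer 10^15.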
There's an interesting precedent: the conventional 32.184 s difference
between TT(TAI) and TAI appears to have been chosen not merely as
a terminating-decimal number of seconds, but as a terminating-decimal
number of *days* (0.0003725 days). In general a terminating-decimal
number of seconds doesn't have this property, and it's a useful property
since TT is frequently described in MJD terms.
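
The arithmetic behind that observation can be checked in integers. This is a small sketch with made-up macro and function names; it only verifies that 0.0003725 days times 86400 s/day is exactly 32.184 s.

```c
#include <stdint.h>

/* 0.0003725 days written in units of 1e-7 day, so that it is an integer. */
#define TT_MINUS_TAI_E7_DAYS INT64_C(3725)
#define SECONDS_PER_DAY      INT64_C(86400)
#define E7                   INT64_C(10000000)

/* TT(TAI) - TAI in milliseconds: 3725e-7 day * 86400 s/day * 1000 ms/s. */
static int64_t tt_minus_tai_ms(void)
{
    return TT_MINUS_TAI_E7_DAYS * SECONDS_PER_DAY * 1000 / E7;
}

/* Remainder of the division above; zero means the conversion is exact. */
static int64_t tt_minus_tai_ms_remainder(void)
{
    return TT_MINUS_TAI_E7_DAYS * SECONDS_PER_DAY * 1000 % E7;
}
```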
>How many bits of resolution do we need ?
I agree with you that a 128-bit type is appropriate. I came to the same
conclusion myself some time ago, for the same reasons. I had a vague
intention to work up a full API as you have done, but for many years
now my libraries have been mostly for Perl rather than for C.
>We could invent a new data type, "int80_t" or something, but that
>would be a lot of work, so let us just bite the bullet and use one
>we already have: IEEE binary128 floating point, also known as
>quadruple-precision floating point.
On this I firmly disagree. Floating point brings in a huge swath of
complicated behaviour that you don't want. The changes of quantum that
occur at exponent boundaries will hurt, especially since you'll be quite
often subtracting two timestamps that are close together. Then there's
the bizarre behaviour of zeroes, and the exotic infinities and NaNs.
You don't want any of that for this purpose.
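
A short sketch of those behaviours, using double as a stand-in for binary128 (an assumption made so that it compiles everywhere; the effects are the same in kind, only the magnitudes differ). The function names are hypothetical.

```c
#include <stdbool.h>

/* True if adding `step` seconds to timestamp `t` changes nothing because
 * `step` is below the quantum at t's exponent. */
static bool step_is_lost(double t, double step)
{
    volatile double sum = t + step;  /* volatile: force rounding to double */
    return sum == t;
}

/* +0.0 and -0.0 compare equal, yet behave differently in arithmetic
 * (assumes IEEE 754 semantics for division by zero). */
static bool zeroes_equal_but_distinct(void)
{
    volatile double pz = 0.0, nz = -0.0;
    return pz == nz && (1.0 / pz) > 0.0 && (1.0 / nz) < 0.0;
}

/* A NaN is not equal to anything, including itself. */
static bool nan_is_unequal_to_itself(void)
{
    volatile double z = 0.0;
    volatile double n = z / z;       /* 0/0 yields NaN under IEEE 754 */
    return n != n;
}
```

Near t = 1e9 s a double's quantum is about 1.2e-7 s, so a 10 ns step vanishes, while the same step added to t = 1 s is preserved. That is the change of quantum at exponent boundaries in miniature.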
You want a fixed-point format. Either 64.64 with the second as unit,
or 128.0 with the attosecond as unit. True, C doesn't commonly offer
an arithmetic type with this sort of behaviour. (And, of course, it
doesn't offer native non-integer fixed-point support at all.) But that's
a temporary problem. 64-bit ints became commonly available before the
standard changed to mandate them, and 128-bit ints will inevitably go
the same way. You'll kick yourself later if you permanently compromised
on the semantics to get some implementation convenience for the first
five years.
I think this API should be specified so that realtime_t can be an
arithmetic type where the compiler supports it, and can be a struct (of
two 64-bit ints) where necessary. In fact, you're pretty much forced
to do this anyway: IEEE quad-precision floats are not all that widely
available. This does mean that the API user can't rely on being able
to perform native C arithmetic on realtime_t values, and you'll need
a bunch of macros to portably express the operations. But in a decade
realtime_t will be an arithmetic type on everything in significant use,
and programmers will be able to use the native arithmetic operations
with only a small loss of backcompat. The switch from structs to native
types can even be managed to preserve binary compatibility, provided
struct-based implementations think ahead.
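
As a rough sketch of what the struct-based fallback might look like (every name here other than realtime_t is hypothetical, and the field order is an assumption chosen to match the layout of a little-endian 128-bit integer):

```c
#include <stdint.h>

/* 64.64 fixed point: value = sec * 2^64 + frac, in units of 2^-64 s.
 * Low half first, so that on a little-endian machine the struct could
 * later be replaced by a native 128-bit integer with the same bytes. */
typedef struct {
    uint64_t frac;  /* binary fraction of a second */
    int64_t  sec;   /* whole seconds, two's complement */
} realtime_t;

static realtime_t rt_add(realtime_t a, realtime_t b)
{
    realtime_t r;
    r.frac = a.frac + b.frac;              /* wraps modulo 2^64 */
    uint64_t carry = r.frac < a.frac;      /* 1 if the low half overflowed */
    r.sec = (int64_t)((uint64_t)a.sec + (uint64_t)b.sec + carry);
    return r;
}

static realtime_t rt_sub(realtime_t a, realtime_t b)
{
    realtime_t r;
    r.frac = a.frac - b.frac;              /* wraps modulo 2^64 */
    uint64_t borrow = a.frac < b.frac;     /* 1 if the low half underflowed */
    r.sec = (int64_t)((uint64_t)a.sec - (uint64_t)b.sec - borrow);
    return r;
}

/* Portable spellings; with a native 128-bit realtime_t these would
 * become plain (a) + (b) and (a) - (b). */
#define RT_ADD(a, b) rt_add((a), (b))
#define RT_SUB(a, b) rt_sub((a), (b))
```

Code written against RT_ADD and RT_SUB keeps working whichever way realtime_t is defined. For example, 0.25 s minus 0.5 s yields sec = -1 and frac = 0xC000000000000000, which is -0.25 s in two's complement.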
>Our new timescale should run on the TAI timescale which does not
>have leap-seconds or any other artifacts,
Careful here. TAI can be projected back, by retrospective application
of the definition of the atomic second, to mid-1955. But prior to that
there wasn't continuous (or, only slightly earlier, any) operation of atomic clocks,
and hence there is no TAI. I believe you're intending that your time
API be good for use on much older dates. If you want to be able to
process dates prior to 1955, your time scale cannot be unadulterated TAI.
There is, of course, a time scale that ticks pure SI seconds prior to
1955: TT. You could define that your timestamps notionally track TT,
and then you can perfectly well have a Carboniferous-era timestamp.
If you do notionally track TT, there remains a problem with contemporary
high-precision access to it. In practice you'll have access to a local
approximation of TAI. (As someone else pointed out, you can't have
real-time access to true TAI, so let's make that explicit.) TAI trivially
gives you TT(TAI): in fact, if you're not concerned about the epoch
(which you're not if your timestamps are using their own epoch), the
two are identical. So the clock you actually have access to, TAI(h),
yields a TT(TAI)(h) timestamp.
Now that we're being explicit about the approximation involved in
real-time access to these time scales, we can easily say that TT(TAI)(h)
is a sufficiently close approximation of TT for most purposes. Where it's
not, you'll need to label your timestamp with the particular (h), and
then you can retrospectively apply corrections to convert the timestamp
into something more like TT(BIPM15).
Also, mind your epoch. If you're rejecting 1958-01-01 then the exact
point doesn't matter, but its definition does. If you define your epoch
as "2012-01-20 00:00:00 TAI" then you're at the mercy of how well TAI
was being implemented in 2012, and how well it's retrospectively known.
There is a unique point in time that gives better triangulation to TT
and other time scales. TT is defined such that 1977-01-01 00:00:32.184
TT coincides exactly with 1977-01-01 00:00:00 TAI. So define your epoch
as a point on the TT time scale, or (equivalently) as some offset from
1977-01-01 00:00:00 TAI.
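
Expressing an epoch as an offset from the 1977 tie point is straightforward, because every TAI day is exactly 86400 s long. The following is a sketch with hypothetical function names; the day-count routine is a standard proleptic Gregorian calendar algorithm, and MJD 43144 is 1977-01-01.

```c
#include <stdint.h>

/* Modified Julian Date of a proleptic Gregorian calendar date. */
static int64_t mjd_from_civil(int64_t y, unsigned m, unsigned d)
{
    y -= m <= 2;                                   /* year starts in March */
    const int64_t  era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = (unsigned)(y - era * 400);            /* 0..399 */
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    const int64_t  unix_days = era * 146097 + (int64_t)doe - 719468;
    return unix_days + 40587;                      /* 1970-01-01 = MJD 40587 */
}

#define MJD_1977_TIE_POINT INT64_C(43144)          /* 1977-01-01 */

/* Seconds from 1977-01-01 00:00:00 TAI to 00:00:00 TAI on the given date.
 * Valid only because TAI has no leap seconds. */
static int64_t tai_seconds_since_1977(int64_t y, unsigned m, unsigned d)
{
    return (mjd_from_civil(y, m, d) - MJD_1977_TIE_POINT) * 86400;
}
```

With this, "2012-01-20 00:00:00 TAI" becomes "1977-01-01 00:00:00 TAI plus 1106092800 s", which pins the epoch to the point where TT and TAI are tied by definition.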
I delved into these issues a bit when defining a notation for
linear dates, for a clock program. Have a look at the discussion in
<http://www.fysh.org/~zefram/text_rep/linear_date.txt>. (So far I'm
the only user of this notation that I know of, but I thought it worth
the formalisation.)
>convert that to UTC time, civil time etc, using a leap-second table
For this too, you need to have a think about the pre-atomic era. Prior to
1961 there is no UTC. It is possible to make approximate conversions
between TT and UT1 arbitrarily far back (well, back to the solidification
of the Earth's crust, which is a prerequisite for UT1 to be well defined).
For the Victorian era, I wonder whether these conversions are as precise
as how well UT1 (GMT) could be contemporaneously measured. That would
determine whether you can acceptably represent a Victorian precision
timestamp (presumably in astronomical data) as a realtime_t.
You could get a cleaner and more reproducible conversion if you were to
define a retrospective form of UTC. (Need to give it a different name,
of course.) You could take the TT/UT1 data and quantise it by defining
leap seconds, so that each UTCretro day by definition consists of an
integer number of TT seconds. Promulgate table. UTCretro will be an
acceptable approximation of the contemporaneous GMT for most purposes.
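
One possible shape for that quantisation, as a sketch only: the function name, the per-day input format and the threshold parameter are all assumptions, and no real TT/UT1 data is included. Given TT - UT1 for each day, it keeps an integer offset TT - UTCretro and schedules a leap second whenever the residual exceeds the threshold.

```c
#include <stddef.h>
#include <stdint.h>

/* tt_minus_ut1_ms[i]: TT - UT1 on day i, in milliseconds.
 * leap_out[i] is set to +1 (day i has 86401 s), -1 (86399 s) or 0.
 * *final_offset_s receives TT - UTCretro after the last day, in seconds.
 * Returns the number of leap seconds scheduled. */
static size_t quantise_leaps(const int32_t *tt_minus_ut1_ms, size_t ndays,
                             int32_t threshold_ms, int8_t *leap_out,
                             int32_t *final_offset_s)
{
    size_t count = 0;
    int32_t offset_s = 0;

    if (ndays > 0) {
        /* Start from the whole number of seconds nearest the first value. */
        int32_t v = tt_minus_ut1_ms[0];
        offset_s = (v >= 0 ? v + 500 : v - 500) / 1000;
    }
    for (size_t i = 0; i < ndays; i++) {
        int32_t residual = tt_minus_ut1_ms[i] - offset_s * 1000;
        leap_out[i] = 0;
        if (residual > threshold_ms) {          /* UT1 falling behind */
            leap_out[i] = 1;
            offset_s++;
            count++;
        } else if (residual < -threshold_ms) {  /* UT1 running ahead */
            leap_out[i] = -1;
            offset_s--;
            count++;
        }
    }
    *final_offset_s = offset_s;
    return count;
}
```

The resulting table is a pure function of the input data and the threshold, which is what makes the conversion reproducible.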
In the general case, however, you need to be able to represent a
UT1/UTwhatever timestamp on its own, as something different from the
TAI or TT timestamp represented by a realtime_t. The natural format
for leap-free UT time scales such as UT1 is an arithmetic type counting
days from some epoch (such as MJD). Again you have to pick a particular
arithmetic type for API purposes, and you have to decide whether your
base unit will actually be the day or something else such as second
or attosecond. UTC (the real one or UTCretro) requires a day+seconds
struct that you haven't defined: you're using the fully broken-down
struct tm for this purpose, but I think that's excessively cumbersome.
>If the error parameter is not NULL, it returns the one-sigma error
>estimate of the timestamp.
Yes, important to have this in the API.
>If the error parameter is NULL, and the one sigma error estimate
>is larger than 0.1 second, the call returns a negative number and
>sets errno to EACCES.
I'm uncomfortable about the arbitrary number. I think this semantic
would be better as a separate function where one passes in a maximum
acceptable error.
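
A sketch of how such a function could look. Everything here is hypothetical: the names, the return codes, the use of a function pointer for the clock source, and the stub source that exists only so the example is self-contained.

```c
#include <stdint.h>

typedef struct {
    uint64_t frac;
    int64_t  sec;
} realtime_t;

enum { RT_OK = 0, RT_SOURCE_FAILED = -1, RT_TOO_UNCERTAIN = -2 };

/* A clock source fills in the timestamp and its one-sigma error estimate
 * in seconds, returning 0 on success. */
typedef int (*rt_source_fn)(realtime_t *ts, double *sigma_s);

/* Like the plain call, but the caller states the largest acceptable
 * one-sigma error instead of relying on a built-in 0.1 s limit. */
static int realtime_get_within(rt_source_fn src, realtime_t *ts,
                               double max_sigma_s)
{
    double sigma_s;
    if (src(ts, &sigma_s) != 0)
        return RT_SOURCE_FAILED;
    if (sigma_s > max_sigma_s)
        return RT_TOO_UNCERTAIN;
    return RT_OK;
}

/* Stand-in source for demonstration: a fixed time, known to +/- 0.25 s. */
static int demo_source(realtime_t *ts, double *sigma_s)
{
    ts->sec = 42;
    ts->frac = 0;
    *sigma_s = 0.25;
    return 0;
}
```

With this stub, a caller asking for 0.1 s gets RT_TOO_UNCERTAIN, while a caller content with 1 s gets the timestamp.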
>This call returns a monotonically increasing estimate of elapsed
>time in SI seconds, since the program started.
Elapsed in which reference frame? Doesn't make much difference right
now, but it'll become increasingly important as space travel grows.
You might want to have more than one such time scale available: for the
computer hardware, for the best estimate of the user's motion, for TT.
Suspending the apparent passage of time when the computer is off would
not be acceptable for some of these reference frames.
Also, since "monotonically increasing" and "estimate of elapsed time"
are somewhat contradictory goals, you should probably go into more detail
about what kind of steering is expected. A useful reference point is
the steering that TAI undergoes: best current estimate of *frequency*,
no attempt to rectify historical phase errors. What's most useful
depends very much on the application.
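
Frequency-only steering might be sketched as follows. The type and function names are invented for illustration, and the sketch assumes short update intervals and rate corrections far smaller than one part in one. The clock only ever moves forward by a scaled version of the raw interval; a new rate estimate affects future intervals and never rewrites what has already accumulated.

```c
#include <stdint.h>

typedef struct {
    uint64_t elapsed_ns;  /* steered elapsed time; never decreases */
    int32_t  rate_ppb;    /* current frequency correction, parts per 1e9 */
} steered_clock;

/* Account for raw_ns nanoseconds counted by the local oscillator.
 * Assumes raw_ns * |rate_ppb| fits in int64_t (short intervals) and
 * rate_ppb > -1e9, so that the increment is never negative. */
static void steered_advance(steered_clock *c, uint64_t raw_ns)
{
    int64_t corr = (int64_t)raw_ns * c->rate_ppb / 1000000000;
    c->elapsed_ns += raw_ns + (uint64_t)corr;  /* modular add handles corr < 0 */
}

/* Adopt a new best estimate of frequency.  The phase (elapsed_ns) is
 * deliberately left alone: past errors are not rectified. */
static void steered_set_rate(steered_clock *c, int32_t rate_ppb)
{
    c->rate_ppb = rate_ppb;
}
```

A phase-steered clock would instead slew elapsed_ns toward a reference, which gives a better estimate of elapsed time at the cost of distorting measured intervals while the slew is in progress.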
>The timestamp is outside the valid range of translation tables
>available to the program (errno = E2BIG)
Should be EDOM, for domain error. E2BIG is specifically an error mode
for exec, and it's concerned with an in-memory object occupying too much
space, rather than a numerical value being outside an acceptable range.
Same for the other error later that you assigned E2BIG.
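
In code, the range check would look something like this sketch. The function name and the table bounds are placeholders, not values from the proposal.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder bounds of the translation table, in seconds since the
 * timestamp epoch. */
#define TABLE_FIRST_SEC INT64_C(0)
#define TABLE_LAST_SEC  INT64_C(1000000)

/* Returns 0 if the timestamp can be translated, or -1 with errno set to
 * EDOM if it lies outside the domain covered by the table. */
static int check_translatable(int64_t ts_sec)
{
    if (ts_sec < TABLE_FIRST_SEC || ts_sec > TABLE_LAST_SEC) {
        errno = EDOM;
        return -1;
    }
    return 0;
}
```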
>All implementations must support tz = "UTC". It would be smart to
>standardize other values as well.
Presumably tz = "TAI" would be acceptable too, and civil timezones are
not necessarily just round offsets from UTC. This is getting slightly
beyond the current concept of "timezone", but it does put these time
scales in the correct relationship to the underlying timestamps.
So be aware that the `timezone' mechanism will have to be a bit more
sophisticated than the current Olson system.
-zefram