How to preserve nanosecond precision in datetime calculations (for large numbers)

Hi, I am working with datetime variables that carry nanosecond precision. I am trying to maintain that precision throughout my calculations, but I lose it because the numbers exceed what double precision can represent (for large values). For example, 1294871257.002060945 seconds has 3 more significant digits than a double can hold, and I need to preserve them in the calculations. I realized that I cannot use the vpa function on a datetime variable. Are there any suggestions on how to handle this? The example script below shows the loss of precision by outputting the decimal portions of the original and final numbers.
Thanks,
dt=1294871257.002060945; % nanosecond precision for a large number that exceeds double precision.
epoch = datetime(1980,1,6,'TimeZone','UTCLeapSeconds'); % define the epoch reference time in UTC (and has 0 seconds)
DtUTC = epoch + seconds(dt); % add the duration to the reference time
Time_vec=datevec(DtUTC); % converts final datetime value to a 6-element vector (Y,M,D,H,M,S)
sprintf('%.9f',dt-floor(dt)) % decimal part of original number of seconds
sprintf('%.9f',Time_vec(6)-floor(Time_vec(6))) % decimal part of calculated number of seconds
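For context, the precision is already gone on the first assignment: near 1.29e9, adjacent representable doubles are about 2.4e-7 apart, far coarser than the 1e-9 I need. A quick check in Python (whose floats are the same IEEE-754 doubles MATLAB uses) illustrates the spacing:

```python
import math

# Spacing between adjacent representable doubles near 1.29e9 seconds.
dt = 1294871257.002060945
print(math.ulp(dt))   # ~2.4e-07: far coarser than the 1e-9 needed
print(f"{dt:.9f}")    # 1294871257.002060890 -- last digits already changed
```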

 Accepted Answer

Start off with the time as a symbolic object.
dt=sym('1294871257.002060945');
If you don't and just start with dt as a double you've already lost.
eps(double(dt))
ans = 2.3842e-07
fprintf('%0.16f', double(dt))
1294871257.0020608901977539
Now split dt into the integer part and the fractional part symbolically.
frac = dt - floor(dt);
whole = dt - frac;
fprintf('%0.16f', double(frac))
0.0020609450000000
Finally add whole and frac separately to your epoch.
epoch = datetime(1980,1,6,'TimeZone','UTCLeapSeconds');
DtUTC = epoch + seconds(double(whole)) + seconds(double(frac));
DtUTC.Format = 'uuuu-MM-dd''T''HH:mm:ss.SSSSSSSSS''Z'''
DtUTC = datetime
2021-01-16T22:27:19.002060945Z
Time_vec=datevec(DtUTC);
sprintf('%.9f',dt-floor(dt))
ans = '0.002060945'
sprintf('%.9f',Time_vec(6)-floor(Time_vec(6)))
ans = '0.002060945'
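The split works because each piece, taken alone, fits comfortably in a double: the integer part is well below 2^53, and the fractional part is small enough that nine decimal digits survive rounding. A Python sketch of the same splitting idea, working from the text form instead of a symbolic object (illustrative only, not part of the answer above):

```python
s = "1294871257.002060945"
whole_s, frac_s = s.split(".")

whole = float(whole_s)        # integers below 2**53 are exact in a double
frac = float("0." + frac_s)   # small magnitude: 9 decimals survive rounding

print(f"{whole:.0f}")   # 1294871257 (exact)
print(f"{frac:.9f}")    # 0.002060945 (round-trips at nanosecond precision)
```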

13 Comments

Not having the Symbolic TB, I did the same numerically with text manipulation, Steven.
Thank you @Steven Lord, this is great feedback. Thanks for pointing out that I should first start off with the time as a symbolic object. This preserves the necessary precision throughout the calculations.
@Steven Lord, I'll point out that the format definition in my script (using MATLAB R2021a) results in an error. Although it seems to be unnecessary, since I guess the full precision of the fractional and whole parts is preserved within the duration and datetime variables when summed in the previous step. When the result is output using datevec, the seconds element at nanosecond precision only needs 11 significant figures, so double is enough precision at that point.
DtUTC.Format = 'uuuu-MM-dd''T''HH:mm:ss.SSSSSSSSS''Z'''
The date format for UTCLeapSeconds datetimes must be 'uuuu-MM-dd'T'HH:mm:ss.SSS'Z''.
Yeah, I ran into the formatting issue as well... there are still many warts in the (relatively) new datetime and duration classes, and missing abilities beyond just routine calendar use for appointments, etc.
Well, the post evaluated the code in release R2021b and it worked, so obviously something must have changed between R2021a and R2021b. Looking at the Release Notes, the only listed change that might have caused that format to be accepted where it wasn't in previous releases is the startat function now recognizing time zone information. Maybe the requirements on the format of datetime arrays were relaxed as part of that change? I'm not certain. Or maybe that requirement was relaxed and the change was deemed minor enough not to warrant a release note.
I only put that line in the example to show that the fractional seconds matched the appropriate part of the number that you added to your epoch time.
@dpb This is a bit of a tangent from the original question, but what type of functionality did you expect datetime and duration to have that it is missing?
Specifically with duration, I find the limitations on input and output formatting to be excessively strict; for example, in this case you can't parse just an "ss.SSSS" string without prepending at least an "mm:" field to the string. In addition, the input format string then has to include an explicit ".S" suffix; it's not enough that the decimal point in the input string triggers reading such a value.
On the output side, I find it an annoyance that you can't set the format string to anything but a few selected choices -- in particular, you have to have at least "mm:ss.SSS" in order for the ".S" fractional seconds to show at all in the digital format. You can use the "s" format string and get plain seconds, but afaict it is fixed at three decimal digits, and there's no way to get any more precision other than converting to double, at which point one runs into the precision issue again.
I grant it's still relatively new and other functionality is more critical, first, but it seems somewhat limiting in flexibility. I also grant I have not given a lot of thought about implementation issues as far as input syntax...
I've had some other use cases I'll have to go back and think about the details of again to try to add to the list of the functional side.
ADDENDUM:
Another real-world example of having to go through gyrations to work around the limitations in the input formatting for durations is in another thread, at the point it broke; that thread is pretty long, but that comment area is the pertinent-to-this-conversation section.
I've noted this in the enhancement database.
Thanks, Steven. I'll quit adding ammo on that particular nit... :)
I still can't recall, and couldn't find, the question about the other reasonable but somewhat unusual use of duration that I remember requiring a lot of gyrations to work around, unfortunately. It had to do with plotting, but I have not come across it again with any search term I can think of. I also had a machine crash, and the temporary scripts/functions in the working directory I used for Answers piddling weren't in the backup path, so I can't reconstruct it that way either.
Eventually maybe it will come to me.
Just for the record, this error
The date format for UTCLeapSeconds datetimes must be 'uuuu-MM-dd'T'HH:mm:ss.SSS'Z''
was one of the warts that dpb refers to that was removed in R2021b. Formats for UTCLeapSeconds must still be that ISO form, but may now have 0 to 9 fractional seconds digits.
Ah! I recall that one, now, Peter. :)


More Answers (3)

>> t='1294871257.002060945'; % treat long value as string
>> dsec=seconds(str2double(extractBefore(t,'.'))) + ...
seconds(str2double(extractAfter(t,strfind(t,'.')-1))); % combine integer/fractional parts
>> dsec.Format='dd:hh:mm:ss.SSSSSSSSS' % format to show nsec resolution
dsec =
duration
14986:22:27:37.002061035
>>
This is the most straightforward workaround I can think of within the limitations of the (somewhat hampered) duration class, which has limited options for formatting input, and whose helper functions such as seconds are only aware of base numeric classes.
The datetime class is still pretty young; it has much left to be worked out in order to make it fully functional over niche usage such as yours.
You'll probably have to build a wrapper class of functions to hide all the internal machinations required; the above just illustrates that if you can break the required number of seconds into pieces that are each within the precision of a double, the duration object itself can deal with them to that precision -- at least for storing an input number. I've not tested rounding when trying to do arithmetic with the result.

5 Comments

Very interesting idea @dpb. The problem is still that you lose the extra precision when you combine the integer and fractional parts. In Steven's approach, the variables 'whole' and 'frac' are still symbolic, so I believe that when they are combined they maintain the necessary precision in the duration and datetime variables. When they are combined here as you describe, the result is summed as a double without maintaining the necessary precision. Compare the fractional part of the result you show to the fractional part of the original time: they differ in your snippet above. Thanks for the thoughts on this.
I hadn't inspected the trailing digits as closely as I should have in the end result; it works in building the fractional seconds, so it appears to be a bug in the duration object's implementation internals -- namely the overloaded plus operator. Undoubtedly the other math operations will have the same issues.
>> frac=(str2double(extractAfter(t,strfind(t,'.')-1)));
>> fprintf('%.9f\n',frac) % the fractional part is ok as double
0.002060945
>> fracDur=seconds(frac); % so convert it to duration
>> fracDur.Format='dd:hh:mm:ss.SSSSSSSSS' % there's one digit rounding here now
fracDur =
duration
00:00:00.002060944
>> fracDur=fracDur+seconds(str2double(extractBefore(t,'.'))) % and fails here
fracDur =
duration
14986:22:27:37.002061035
>>
I'll file a bug report; TMW didn't implement the internals correctly to keep the precision claimed.
As noted, I thought something of this sort might happen when combining two composites, but I overlooked that it had already occurred.
@dpb I have to say that the concept of overloading operators is a little beyond me. But I did look closely at the differences between your method and the method using symbolic, and now I do not understand why there should be any difference. In fact, I do not fully understand why the symbolic method works either because in both cases the numbers are just converted to double before the seconds function operates on them. I even tried adding trailing zeros to your string t, to see if it had to do with how the double is stored in memory. Like I said, this is a little beyond me, but thanks for following up with this. It seems like it would be a good alternative.
Also, I have noticed that in the symbolic approach, there is a nuance. It must be a symbolic variable, not a symbolic number. In other words:
This works,
dt=sym('1294871257.002060945');
This does not work,
dt=sym(1294871257.002060945);
I believe that as a symbolic variable, it is the only way to maintain the needed accuracy. But I am not clear on the nuance here.
This:
dt=sym(1294871257.002060945);
evaluates 1294871257.002060945 in double precision and converts the resulting double to symbolic. As I stated in my answer, the first of those steps has already caused problems for your approach.
format longg
dt = 1294871257.002060945
dt =
1294871257.00206
% Spacing between dt and the next largest *representable* double
eps(dt)
ans =
2.38418579101562e-07
% This is the closest representable double to the number you entered
fprintf('%0.16f', dt)
1294871257.0020608901977539
This:
dt=sym('1294871257.002060945');
vpa(dt, 20)
ans = 
1294871257.002060945
doesn't go through double at all.
In some cases sym can "recognize" the floating-point result of an operation that should be of a particular form and compensate for roundoff error. See the description of the flag input argument to the sym function, specifically the row dealing with the (default) 'r' flag, for a list of recognized expressions. So as an example even though 1/3 is not exactly one third, it's close enough for sym. But your number doesn't fall into one of those recognized expression categories.
x = sym(1/3)
x =
1/3


This question already has an accepted answer, but I would like to add my observations on this topic.
datetime:
  • Internally holds the value as a complex double. The real part is the number of milliseconds since Modern UTC Epoch = 1970-Jan-01 UTC. The imaginary part is a "correction" to add to maintain precision.
duration:
  • Internally holds the value as a real double representing milliseconds.
You can already see the disconnect. The datetime class is designed to hold values to a higher precision than the duration class. Creating datetime variables with the higher precision is fine, but as soon as you subtract them a duration results and you lose that precision. This is an unfortunate situation and could have been avoided if the duration class held the value the same way that datetime variables do. I have made the suggestion to TMW to change this and make the duration class consistent with the datetime class, but I don't know if they will ever do it.
Example of the problem:
format longg
dt = datetime(2000,1,1,1,1,1.2345678912345)
dt = datetime
01-Jan-2000 01:01:01
dt.Second
ans =
1.2345678912345
You can see that the full precision of the original seconds is there. You can even look at the internals:
sdt = struct(dt)
Warning: Calling STRUCT on an object prevents the object from hiding its implementation details and should thus be avoided. Use DISP or DISPLAY to see the visible public details of an object. See 'help struct' for more information.
sdt = struct with fields:
                  UTCZoneID: 'UTC'
          UTCLeapSecsZoneID: "UTCLeapSeconds"
              ISO8601Format: 'uuuu-MM-dd'T'HH:mm:ss.SSS'Z''
       ISO8601FormatPattern: regexpPattern("uuuu-MM-dd'T'HH:mm:ss(\.S{1,9})?'Z'")
                    epochDN: 719529
               MonthsOfYear: [1x1 struct]
                 DaysOfWeek: [1x1 struct]
                       data: 946688461234.568 + 2.01407499389461e-05i
                        fmt: ''
                         tz: ''
                  dfltPivot: 1974
                 dateFields: [1x1 struct]
noConstructorParamsSupplied: [1x1 struct]
What is the data value? Well, here is a demonstration:
mutc = dt - datetime(1970,1,1)
mutc = duration
262969:01:01
milliseconds(mutc)
ans =
946688461234.568
You can see that the data value in the dt variable is in fact milliseconds since Modern UTC epoch. To see the purpose of the imaginary part, note that the duration calculated from a datetime difference does not retain the seconds accuracy:
[h,m,s] = hms(mutc)
h =
262969
m =
1
s =
1.23456788063049
The trailing digits of the duration seconds do not match the original. Not good.
But the datetime variable, with the correction, is able to get the original seconds accurately. E.g.,
(mod(real(sdt.data),60000) + imag(sdt.data)) / 1000
ans =
1.2345678912345
When calculating the second value, internally the equivalent of the above is done to maintain precision. This is all good as far as it goes, but as soon as you subtract two datetime variables you lose this! Sigh ...
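The hi/lo correction scheme described above is the classic "two-sum" (double-double) trick: store the rounded sum, and separately store the exact rounding error. A Python sketch of the idea with plain IEEE-754 doubles (the variable names and millisecond values are illustrative, mirroring the example above, not datetime's actual internals):

```python
# base: 2000-01-01 01:01:00 in ms since 1970 (exactly representable),
# extra: 1234.5678912345 ms of seconds-within-the-minute.
base = 946688460000.0
extra = 1234.5678912345

hi = base + extra          # rounded sum: trailing digits of 'extra' are lost
lo = extra - (hi - base)   # Fast2Sum: exact rounding error (|base| >= |extra|)

print(hi - base == extra)         # False: hi alone has lost precision
print((hi - base) + lo == extra)  # True: hi + lo recovers it exactly
print((hi % 60000 + lo) / 1000)   # seconds field, recovered at full precision
```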
My advice when you need to retain precision is to AVOID THE DURATION CLASS and to AVOID SUBTRACTING DATETIME VARIABLES. You may have to write your own code to subtract datetime variables piecemeal and maintain the difference in your own format. If you have a string of "extended precision" you need to add to a datetime, separate the string into pieces that can individually be held accurately in doubles, and add them to the datetime sequentially to maintain precision. E.g., see this related thread:

6 Comments

James, this advice seems like throwing out the baby with the bathwater. duration has enough bits to preserve nanosecond precision over a span of +/- 104 days, and microsecond precision over a span of +/- 284 years. I'm sure there are applications that require more than that, but for most people, most of the time, duration will be sufficient.
Your advice relies on the internal representation. That's a bad idea.
Peter, I think James' advice is the only present solution under the condition he gives of "when you need to retain precision" -- as presently implemented the duration doesn't live up to the precision of the datetime from which it may be derived. Why the two weren't designed to be consistent with each other has always puzzled me, too...
@Peter Perkins ".... Your advice relies on the internal representation. That's a bad idea ..."
Respectfully, no it doesn't. The struct stuff above is only for demonstration purposes to show how datetime currently maintains precision. However, my solution to maintaining precision doesn't depend on that at all. If you read through and understand my solution in the other link(s), my solution involves only the datetime public interface functions (e.g., dt.Second).
To your point, though, I would agree that most people don't need to go through these hoops. But if your application does need the extra precision, I provide a way to do it that doesn't rely on datetime internals.
Memory and performance.
If your datetimes are at ms precision, then unless you need ms precision over more than 284,000 years, durations are exact. How many people need more than that? I'm guessing no one.
If your datetimes are at higher precision, durations provide microsecond resolution over 284 years and nanosecond resolution over 104 days. There's some round-off involved, and that might affect some calculations that use exact comparisons, but exact numeric comparisons are not a good idea to begin with. How many people need that resolution over those spans? I'm guessing very, very few.
So there was a choice between doubling the footprint and (not exactly) halving the performance for everyone, versus losing resolution for probably almost no one.
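Those spans can be sanity-checked from the ulp spacing of a double holding milliseconds (a back-of-the-envelope Python check of the arithmetic, not MATLAB's actual implementation):

```python
import math

MS_PER_DAY = 86400 * 1000.0

# A correctly rounded double is off by at most half an ulp.
err_104_days = math.ulp(104 * MS_PER_DAY) / 2             # in ms
err_284_years = math.ulp(284 * 365.25 * MS_PER_DAY) / 2   # in ms

print(err_104_days < 1e-6)    # True: better than 1 ns out at 104 days
print(err_284_years < 1e-3)   # True: better than 1 us out at 284 years
```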
All your points are well taken, but this forum has already seen at least two posts where people needed more precision. What's the big deal with showing them why durations are not appropriate in these cases and also showing them how to do the calculations to maintain precision? I really don't understand the push back here.
If there were only two people on the planet that cared about this issue, me and the original poster, I would be OK with that because I would have helped that person solve their particular problem.
[James, an earlier version of this reply assumed that you were advocating taking the output of that call to struct and working with that internal data. The problem there is that if you dig into the internals of datetime, your code will at some point stop working when those internals change. I can guarantee that this will happen at some point. It almost always does, and we try very very hard to maintain backwards compatibility, but not for code that addresses the private internals of a class. But on rereading your post, I'm not so sure you are advocating for that. Apologies.]
I think the safest solution is what Steve suggested for the OP, i.e. break up 1294871257.002060945 seconds (which is about 41 years) into whole and fractional seconds. But of course, as Steve points out, the instant you type 1294871257.002060945 you're done for, so you have to do something earlier. The OP didn't really provide enough context to give good advice; likely the whole approach could be modified. 41 years before 2021 was the GPS origin, so maybe that's what's going on, but I don't really know how that number came into being. I bet there's some way to avoid such a long high-precision elapsed time. One strategy is to take differences from your smallest datetime, which typically does not span 41 years.
I agree that it would be great to not have to worry about such things. duration is not yet in that world.


I don't know if this helps or just confuses the issue, but there is another way to do what the OP has asked for:
x = int64(1294871257002060945)
x =
  int64
   1294871257002060945
fmt = "dd-MMM-uuuu HH:mm:ss.SSSSSSSSS";
t = datetime(x,ConvertFrom="epochtime",Epoch=datetime(1980,1,6),TicksPerSecond=1e9,Format=fmt)
t = datetime
16-Jan-2021 22:27:37.002060944
That's a 4, not a 5, at the end, but it's due to a display quirk: datetime does not round the displayed value up the way double display would. The seconds value is actually a very tiny bit less than 37.002060945:
format long
dv = datevec(t);
dv(end)
ans =
37.002060944999997
Again, the OP had 1294871257.002060945 and I don't know what form that was in. If it was text, there's some hope of doing what I've shown here.
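The reason the int64 route works: 64-bit integer nanoseconds represent the instant exactly, and the splitting can all be done in integer arithmetic before anything touches a double. A Python sketch of the same bookkeeping (illustrative only):

```python
ticks = 1294871257002060945   # ns since the epoch: exact as a 64-bit integer

whole_s, frac_ns = divmod(ticks, 10**9)
print(whole_s)   # 1294871257
print(frac_ns)   # 2060945 -> 0.002060945 s, with no rounding anywhere

# By contrast, pushing the full tick count through a double loses the tail:
print(int(float(ticks)) == ticks)   # False
```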

Release

R2021a