Date and time pitfalls

I just finished adding a feature to a small tool that lets users define a trigger time in any time zone they wish. It gave me the opportunity to assess the situation regarding date and time classes. The status is appalling: both the C# and Java standard classes simply suck at the exercise, for different reasons, and they lead to the ‘flatland of desolation’ instead of the ‘pit of success’. They will leave you stranded if you need more than ‘local to UTC’ and ‘UTC to local’ conversions. This gives us an opportunity to make a tour d’horizon of the topics of date, time, DST and time zones. Let’s start with some useful definitions…

Quick definitions

  • Calendar: a system for giving names to periods of time. The smallest period represented in a calendar is a day. Most of you are probably aware of at least one of these: Jewish, Muslim, Chinese… Keep in mind: a calendar is just a representation, and many calendars exist.
  • GMT: Greenwich Mean Time. Current time of day as viewed from the Greenwich meridian. Historically used as the standard global time.
  • UTC: Coordinated Universal Time, has replaced GMT. The difference with GMT is that UTC accounts explicitly for leap seconds, whereas GMT does not. As of today, I am not aware of any library that manages leap seconds (see below).
  • TAI: International Atomic Time, the reference time established by averaging the outputs of more than 200 atomic clocks, based on the current definition of the second. The acronym comes from the French: Temps Atomique International. This standard is an absolute that disregards the Earth’s rotation. It is roughly synchronized with UTC; there is currently a 35-second gap.
  • Leap seconds: due to the slowdown of the Earth’s rotation, extra seconds have to be inserted so that the actual time of day still matches its definition. They are added at the end of the day (UTC) as an extra second: after 23:59:59, the clock marks 23:59:60 and then 00:00:00. They are added either on June 30th or December 31st; as of today, 25 leap seconds have been added since 1972, roughly one per year up to 1998 and around one every 4 years since.
  • Time zone: geographical zone where the current time is the same. Originally, time zones were defined by an offset against GMT, but since the introduction of daylight saving time (DST), they have become more complex.
  • Local time: time as seen within a given time zone.
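
To illustrate why a time zone is more than a fixed offset, here is a minimal sketch using java.time (Java 8 and later), which is not one of the classes criticized above. The class name ZoneRulesDemo and the helper offsetAt are hypothetical, chosen only for this example; it assumes the JDK ships the IANA rules for the two zones used.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

final class ZoneRulesDemo {
    // Returns the UTC offset in force in the given zone at the given instant.
    static ZoneOffset offsetAt(String zoneId, Instant instant) {
        return ZoneId.of(zoneId).getRules().getOffset(instant);
    }

    public static void main(String[] args) {
        Instant winter = Instant.parse("2015-01-15T12:00:00Z");
        Instant summer = Instant.parse("2015-07-15T12:00:00Z");

        // Paris observes DST, so its offset depends on the date.
        System.out.println(offsetAt("Europe/Paris", winter));   // +01:00
        System.out.println(offsetAt("Europe/Paris", summer));   // +02:00

        // Hong Kong does not, so its offset is the same all year.
        System.out.println(offsetAt("Asia/Hong_Kong", winter)); // +08:00
        System.out.println(offsetAt("Asia/Hong_Kong", summer)); // +08:00
    }
}
```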

Timezones in the world since September 20, 2011  (Photo credit: Wikipedia)
Now it is time to discuss the pitfalls…

#1 Time vs time : same difference?

This is a big one: when the user or the requirement speaks about time, what is actually meant? Local time or universal time? When dealing with humans, you can assume local time, but turn the implicit into explicit and raise the question. When dealing with machine-to-machine (M2M) exchanges, universal time is your best bet, unless humans are somehow involved.

The next question is: are multiple time zones involved? Probably, as most systems are global in some way. The use case probably needs refining, such as: As a user I want this process to happen at 10 AM Paris time and 4 PM Hong Kong time. Pretty straightforward, don’t you think? Nope! Make the implicit explicit: does it mean simultaneously? Yes, of course: it actually happens once, at 10 AM Paris, which is 4 PM Hong Kong.

Are we there yet? Nope! It turns out that 10 AM Paris is not always equivalent to 4 PM Hong Kong, because Paris has Daylight Saving Time and Hong Kong does not: the gap is 6 hours while Paris is on summer time and 7 hours the rest of the year. So now the question is: is the reference Paris time (10 AM) or Hong Kong time (4 PM)? Or could it be 8 AM UTC? Conclusion: when dealing with time, make sure to properly identify the time zone.
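
The Paris / Hong Kong example can be checked with a short sketch. It uses java.time; the class ParisHongKongDemo and its method are hypothetical names for illustration, and the two dates are arbitrary picks, one during Paris summer time and one outside it.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class ParisHongKongDemo {
    static final ZoneId PARIS = ZoneId.of("Europe/Paris");
    static final ZoneId HONG_KONG = ZoneId.of("Asia/Hong_Kong");

    // Hong Kong wall-clock time at the moment it is 10:00 in Paris on the given date.
    static LocalTime hongKongTimeAt10Paris(LocalDate date) {
        ZonedDateTime paris = ZonedDateTime.of(date, LocalTime.of(10, 0), PARIS);
        return paris.withZoneSameInstant(HONG_KONG).toLocalTime();
    }

    public static void main(String[] args) {
        // Paris on summer time: 6-hour gap.
        System.out.println(hongKongTimeAt10Paris(LocalDate.of(2015, 7, 1)));  // 16:00
        // Paris on standard time: 7-hour gap.
        System.out.println(hongKongTimeAt10Paris(LocalDate.of(2015, 1, 15))); // 17:00
    }
}
```

The same "10 AM Paris" requirement therefore maps to two different Hong Kong times depending on the date, which is why the reference zone has to be stated.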

#2 A date is not defined at midnight

In C# there is no date-only type, therefore the (enforced, see DateTime.Date) usage is to set the time to 00:00: March 23rd is represented as 2015/03/23 00:00:00 when you have no concern for time.

Bad! As soon as you do any kind of transformation (arithmetic, time zone conversion, etc.) you run a significant risk of losing a day in the process. For example, if the initial date is the day DST ends, moving forward by 1 day (i.e. 24 hours) from midnight will stay on the same date, at 11 PM. Search for a date-only class.
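
Here is a minimal sketch of that DST-end case, written with java.time rather than C#. The class MidnightDateDemo and its methods are hypothetical names; the example assumes Europe/Paris, where DST ended on October 25th, 2015.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class MidnightDateDemo {
    static final ZoneId PARIS = ZoneId.of("Europe/Paris");

    // "Date stored as local midnight" approach: add 24 elapsed hours.
    static LocalDate plus24Hours(LocalDate date) {
        ZonedDateTime midnight = date.atStartOfDay(PARIS);
        return midnight.plus(Duration.ofHours(24)).toLocalDate();
    }

    // Date-only approach: add one calendar day, no time of day involved.
    static LocalDate plusOneDay(LocalDate date) {
        return date.plusDays(1);
    }

    public static void main(String[] args) {
        LocalDate dstEnd = LocalDate.of(2015, 10, 25); // this day lasts 25 hours in Paris
        System.out.println(plus24Hours(dstEnd)); // 2015-10-25 (lands at 23:00, same date)
        System.out.println(plusOneDay(dstEnd));  // 2015-10-26
    }
}
```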

#3 Full UTC may not save you

Once you have been bitten by some time zone issues, you may be tempted to go full UTC, i.e. to have all internal dates expressed as UTC datetimes. Alas! As soon as you need to show those to the user, you can display them either as UTC or in his/her current local time. But the latter may lead to unexpected results if the user has changed time zone for some reason, or has entered or exited DST since the value was stored. Conclusion: store the reference time zone whenever you need to store a DateTime or a time (without the date).
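
One possible shape for "UTC plus reference time zone" is sketched below with java.time. StoredEvent and its members are hypothetical names, not an API from any library; the point is only that the pair (instant, zone) lets you recover what the user originally meant, whatever zone the viewer is in.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class StoredEvent {
    final Instant instant;      // absolute moment, zone-independent (UTC timeline)
    final ZoneId referenceZone; // zone the user had in mind when entering the value

    StoredEvent(Instant instant, ZoneId referenceZone) {
        this.instant = instant;
        this.referenceZone = referenceZone;
    }

    // Builds an event from what the user typed, interpreted in the user's zone.
    static StoredEvent fromLocal(LocalDateTime local, ZoneId zone) {
        return new StoredEvent(local.atZone(zone).toInstant(), zone);
    }

    // The value as originally entered.
    ZonedDateTime inReferenceZone() {
        return instant.atZone(referenceZone);
    }

    // The same moment as seen from another zone.
    ZonedDateTime inZone(ZoneId viewerZone) {
        return instant.atZone(viewerZone);
    }

    public static void main(String[] args) {
        StoredEvent e = fromLocal(LocalDateTime.of(2015, 3, 23, 10, 0), ZoneId.of("Europe/Paris"));
        System.out.println(e.instant);                                // 2015-03-23T09:00:00Z
        System.out.println(e.inReferenceZone().toLocalTime());        // 10:00
        System.out.println(e.inZone(ZoneId.of("America/New_York")).toLocalTime()); // 05:00
    }
}
```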

#4 Scheduling is a nightmare

Well, any type of scheduling is more complex than one may think: 99% of the time, it will be implemented by computing the delay before the event occurs and then waiting for that delay to elapse. Maybe someday OSes will natively provide date and time based scheduling, but for now, you need to deal with self-computed wait times and delays. To compute a delay, you need to use the same base, i.e. the same time zone, for both the start time and the scheduled event; the delay can then be used as is with any timer primitive that suits your need.

Remember that a user’s timezone is not a constant attribute: when I travel, I no longer expect my phone to wake me up at 6AM Paris time, it would make me very angry when I am in New York!
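
A minimal sketch of the delay computation, assuming java.time and a daily wall-clock trigger. NextTriggerDemo and delayUntil are hypothetical names; the zone is passed in on every call precisely because it is not a constant attribute of the user.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class NextTriggerDemo {
    // Delay from 'now' until the next occurrence of 'wallClock' in 'zone'.
    static Duration delayUntil(Instant now, LocalTime wallClock, ZoneId zone) {
        ZonedDateTime zonedNow = now.atZone(zone);       // same base for both ends
        ZonedDateTime candidate = zonedNow.with(wallClock);
        if (!candidate.isAfter(zonedNow)) {
            candidate = candidate.plusDays(1);           // calendar day, keeps the wall-clock time
        }
        return Duration.between(now, candidate.toInstant());
    }

    public static void main(String[] args) {
        // 22:00 in Paris on the evening before DST ends.
        Instant now = Instant.parse("2015-10-24T20:00:00Z");
        LocalTime sixAm = LocalTime.of(6, 0);

        // The clock face suggests 8 hours, but the night is one hour longer.
        System.out.println(delayUntil(now, sixAm, ZoneId.of("Europe/Paris")));     // PT9H
        // Same instant, same alarm, but the user is now in New York.
        System.out.println(delayUntil(now, sixAm, ZoneId.of("America/New_York"))); // PT14H
    }
}
```

The resulting Duration can be handed to whatever timer primitive is in use; recomputing it when the user's zone changes is left out of this sketch.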

#5 There is no shared standard for timezone identification

Microsoft OSes were among the first to fully support time zones, i.e. not only supporting the standard time offset but also the daylight saving time period. And they have established their own time zone registry. But due to its proprietary nature, it has never gained any traction outside Windows. To be fair, it exhibits shortcomings, the biggest one being that it has poor ids for the time zones.

But there is a good alternative: the IANA time zone database. It offers nice ids, based on regions and cities, and it is used on Java, iOS, Linux… basically everywhere but Windows. But the naming convention is not practical for storage or communication: it is Area/Location, such as Europe/Amsterdam, which is a bit long to be included with each exchanged date and impacts storage and bandwidth requirements.
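
To make the size trade-off concrete, the sketch below compares the text form of a date carrying an IANA id with the shorter offset-only form, and shows what the shorter form loses. It uses java.time; ZoneIdCostDemo is a hypothetical name, and the week chosen is an assumption picked because Amsterdam switched to summer time on March 29th, 2015.

```java
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class ZoneIdCostDemo {
    static final ZoneId AMSTERDAM = ZoneId.of("Europe/Amsterdam");

    // 10:00 in Amsterdam, a few days before the switch to summer time.
    static ZonedDateTime sample() {
        return ZonedDateTime.of(2015, 3, 23, 10, 0, 0, 0, AMSTERDAM);
    }

    public static void main(String[] args) {
        ZonedDateTime zoned = sample();
        OffsetDateTime offsetOnly = zoned.toOffsetDateTime();

        System.out.println(zoned);      // 2015-03-23T10:00+01:00[Europe/Amsterdam]
        System.out.println(offsetOnly); // 2015-03-23T10:00+01:00

        // One week later: the zoned value follows the DST rules,
        // the offset-only value keeps the stale +01:00.
        System.out.println(zoned.plusDays(7));      // 2015-03-30T10:00+02:00[Europe/Amsterdam]
        System.out.println(offsetOnly.plusDays(7)); // 2015-03-30T10:00+01:00
    }
}
```

The offset-only form is shorter, but the last line is actually 11:00 on an Amsterdam clock, so dropping the id is only safe when no further arithmetic is done on the value.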

#6 There is no adequate framework for unit testing

Not sure what the situation is for Java, but trying to simulate/set a timezone in unit testing is hard, to say the least. And it gets even harder if you need two: one for the server and one for the (mock) user!
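
For the Java side, one approach that seems to work is to inject a java.time.Clock (Java 8 and later) instead of reading the system time and default zone. The sketch below is an illustration under that assumption: LocalTimeReader is a hypothetical class standing in for the code under test, and the fixed instant is arbitrary.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

final class LocalTimeReader {
    private final Clock clock; // supplies both "now" and the zone

    LocalTimeReader(Clock clock) {
        this.clock = clock;
    }

    // Local date-time as seen through the injected clock.
    LocalDateTime localNow() {
        return LocalDateTime.now(clock);
    }

    public static void main(String[] args) {
        Instant frozen = Instant.parse("2015-03-23T23:30:00Z");

        // One clock for the server, one for the (mock) user, same frozen instant.
        Clock serverClock = Clock.fixed(frozen, ZoneOffset.UTC);
        Clock userClock = serverClock.withZone(ZoneId.of("Asia/Hong_Kong"));

        System.out.println(new LocalTimeReader(serverClock).localNow()); // 2015-03-23T23:30
        System.out.println(new LocalTimeReader(userClock).localNow());   // 2015-03-24T07:30
    }
}
```

In production code the same class would receive Clock.systemDefaultZone() or Clock.system(zone); only the tests use Clock.fixed.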

Rules of thumb

  1. Store application events in UTC
  2. Properly identify the user’s time zone
  3. Use a date-only class if you only need dates
  4. Schedule events in the user’s time zone
  5. Use J/NodaTime!

Note:

Edited on October 8th, 2014 to add TAI definition.

Edited on November 1st, 2015 to fix misspellings and improve some wording.