Time Zones in C++

Bill Seymour
2024-03-31




1.  Introduction

This paper describes an open-source C++ class called timezone that could be useful when dealing with, you guessed it, civil time zones.  It’s distributed under the Boost Software License.  (This is not part of Boost; the author just likes their open-source license.)

This class is part of a larger civil time library that the author is developing, but it’s documented separately because there’s no compelling reason why it couldn’t be a stand-alone type in its own right.

There’s also a class called tz_data, an instance of which is intended to be one of timezone’s private data members; it’s documented herein just in case some user wants to live dangerously and use it as a stand-alone type.

There’s a strong assumption that you have access to Zoneinfo’s compiled binary data files.  If you have a POSIX system, including a Mac, you almost certainly do.  On Windows it might be more “interesting”, although a timezone can be constructed from a POSIX-style TZ environment variable (e.g., “CST6CDT,M3.2.0,M11.1.0”) instead of a Zoneinfo Zone name (e.g., “America/Chicago”).  (A separate paper describes one way to get the Zoneinfo binaries on Windows.)

For the sake of brevity elsewhere in this paper, Zoneinfo’s compiled binaries are called “TZif files” because the first four bytes in such files are always “TZif”.  That’s short for “Time Zone Information Format”.

The code requires 64-bit time_ts (fixing the Y2038 issue), and it requires at least a C++11 compiler and standard library with int32_t and int64_t defined in <cstdint>.

The code is distributed in https://www.cstdbill.com/civil-time/timezone.zip which contains:


2.  Synopsis

static_assert(sizeof(std::time_t) * CHAR_BIT >= 64, "64-bit time_t required");

namespace civil_time {

const std::string& get_tz() noexcept;
void set_tz(const char*);
void set_tz(std::string&&);
void set_tz(const std::string&);

const std::string& get_tz_root() noexcept;
void set_tz_root(const char*);
void set_tz_root(std::string&&);
void set_tz_root(const std::string&);

namespace zoneinfo {

typedef std::int32_t tz_int;
typedef std::int64_t tz_time;

struct tz_data final
{
  #pragma pack(4)
    struct header
    {
        char   version[20];
        tz_int tzh_ttisgmtcnt;
        tz_int tzh_ttisstdcnt;
        tz_int tzh_leapcnt;
        tz_int tzh_timecnt;
        tz_int tzh_typecnt;
        tz_int tzh_charcnt;
    };
    header hdr;
  #pragma pack()

    tz_time* trans_times;
    unsigned char* info_idx;

  #pragma pack(2)
    struct ttinfo
    {
        tz_int        tt_gmtoff;
        unsigned char tt_isdst;
        unsigned char tt_abbrind;
    };
  #pragma pack()
    ttinfo* info;
    char* abbrv;

  #pragma pack(4)
    struct leapsecs
    {
        tz_time leap;
        tz_int  cnt;
    };
  #pragma pack()
    leapsecs* leaps;

    char* stdind;
    char* utcind;
    char* tzenv;

    struct tzrule
    {
        int mo, wc, wd, jd, hr, mn, sc;
        tzrule() noexcept;
        tzrule(const std::tm&) noexcept;
        int compare(const tzrule&) const noexcept;
    };
    tzrule* tzrules;

    tz_data() noexcept;
    ~tz_data() noexcept;

    tz_data(const tz_data&);
    tz_data(tz_data&&) noexcept;

    tz_data& operator=(const tz_data&);
    tz_data& operator=(tz_data&&) noexcept;

    void swap(tz_data&) noexcept;

    void clear() noexcept;
    void all_clear() noexcept;

    tz_data& read(const std::string&);
    std::string make_posix(const std::string&);

    static std::string get_posix_tz(const std::string&);
};

using std::swap;
void swap(tz_data&, tz_data&) noexcept;

} // namespace zoneinfo

class timezone
{
public:
    timezone();
    explicit timezone(int, int = 0, int = 0);
    explicit timezone(const std::string&);
    explicit timezone(std::string&&);

    virtual ~timezone();

    timezone(const timezone&);
    timezone(timezone&&);

    timezone& operator=(const timezone&);
    timezone& operator=(timezone&&);

    void swap(timezone&);

    const std::string& name() const noexcept;

    std::string posix_tz_env_var() const noexcept;
    static std::string posix_tz_env_var(const std::string&);

    std::time_t effective_time() const noexcept;

    int std_offset() const noexcept;
    int dst_offset() const noexcept;

    const zoneinfo::tz_data& raw_data() const noexcept;

    int utc_offset() const noexcept;
    bool is_dst() const noexcept;
    const char* abbrv() const noexcept;

    enum class trans_type { unknown = -1, wall, std, utc };
    trans_type transition_type() const noexcept;

    timezone& switch_to_local();
    timezone& switch_to_utc();
    timezone& switch_to_offset(int, int = 0, int = 0);
    timezone& switch_to(const std::string&);
    timezone& switch_to(std::string&&);

    timezone& for_time(std::time_t) noexcept;
};

using std::swap;
void swap(timezone&, timezone&);

} // namespace civil_time


3.  The TZ and TZ_ROOT Environment Variables

These functions are declared in the civil_time namespace.

const std::string& get_tz() noexcept;
void set_tz(const char*);
void set_tz(std::string&&);
void set_tz(const std::string&);
This is a string that mimics the POSIX-style TZ environment variable.  Both the 'M'- and 'J'-style “daylight saving time” (DST) rules are supported.

When constructing a timezone object, if the string begins with a colon, or if it lacks a comma (introducing DST rules), it will be taken to be a Zoneinfo Zone name initially (minus the optional leading colon); but if reading a TZif file fails, the constructor will try again treating the string as a POSIX TZ value.
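
The decision described above can be sketched roughly as follows.  This is only an illustration of the stated rule, not the library’s code; try_zone_name_first() and without_leading_colon() are hypothetical names that don’t appear in the library.

```cpp
#include <string>

// Hypothetical helper (not part of the library's documented interface).
// Returns true when the string should be tried as a Zoneinfo Zone name
// first:  it begins with ':' or it contains no ',' (no DST rules).
bool try_zone_name_first(const std::string& tz)
{
    if (!tz.empty() && tz[0] == ':') {
        return true;
    }
    return tz.find(',') == std::string::npos;
}

// Hypothetical helper:  drops the optional leading colon.
std::string without_leading_colon(const std::string& tz)
{
    return (!tz.empty() && tz[0] == ':') ? tz.substr(1) : tz;
}
```

A string like “CST6CDT” passes the first test even though it’s also a valid POSIX TZ value, which is why the fallback to POSIX parsing matters when no TZif file of that name can be read.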

const std::string& get_tz_root() noexcept;
void set_tz_root(const char*);
void set_tz_root(std::string&&);
void set_tz_root(const std::string&);
This string contains the full path to the directory where the TZif files reside.  It defaults to /usr/share/zoneinfo on POSIX systems; or it can be an empty string if you’re compiling for Windows and haven’t installed the TZif files anywhere.

You don’t actually need to have the TZ and TZ_ROOT variables defined in your environment:  by default, these values are defined as macros in timezone_config.hpp (see Appendix A).  If the defaults are wrong for your system, you can call any of the set_tz() or set_tz_root() functions before you make any use of the timezone or tz_data classes (e.g., at the beginning of main()), or you can explicitly define the CIVIL_TIME_TZ and CIVIL_TIME_TZ_ROOT macros yourself when compiling timezone.cpp.

If you do have these environment variables, you can compile timezone.cpp with the CIVIL_TIME_USE_GETENV macro defined.  This will cause the values to be initialized by calls to std::getenv() before anything in timezone.cpp is used.
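
As a rough idea of what initializing from the environment might look like, here’s a minimal sketch that falls back to a default when a variable isn’t set.  The helper env_or() is hypothetical, and whether the library falls back in exactly this way is an assumption.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper:  the value of an environment variable,
// or the supplied fallback when the variable isn't set.
std::string env_or(const char* name, const std::string& fallback)
{
    const char* value = std::getenv(name);
    return value != nullptr ? std::string(value) : fallback;
}
```

For example, env_or("TZ_ROOT", "/usr/share/zoneinfo") yields the POSIX default directory on a box where TZ_ROOT isn’t defined.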


4.  The tz_data Type

This type is declared in the civil_time::zoneinfo namespace.

An instance of this class holds all the data in a single TZif file.  See Appendix B for a description of such a file.

Note that tz_int is a typedef for std::int32_t and tz_time is a typedef for std::int64_t.  We need to be picky about these because RFC 8536 requires that all the integers in a TZif file be four-octet objects, and all time_ts be eight-octet objects, with negative values having two’s-complement representation.  You’ll also see some #pragma packs below because everything in a TZif file is packed on arbitrary byte boundaries.
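
To illustrate why the exact widths matter, here’s a small sketch of decoding big-endian four- and eight-octet two’s-complement integers from raw bytes.  The function names be32() and be64() are made up for this example; the library’s own decoding code may look quite different.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper:  decode a big-endian four-octet signed integer.
std::int32_t be32(const unsigned char* p)
{
    std::uint32_t u = 0;
    for (int i = 0; i < 4; ++i) {
        u = (u << 8) | p[i];
    }
    std::int32_t v;
    std::memcpy(&v, &u, sizeof v);  // int32_t is always two's complement
    return v;
}

// Hypothetical helper:  decode a big-endian eight-octet signed integer.
std::int64_t be64(const unsigned char* p)
{
    std::uint64_t u = 0;
    for (int i = 0; i < 8; ++i) {
        u = (u << 8) | p[i];
    }
    std::int64_t v;
    std::memcpy(&v, &u, sizeof v);
    return v;
}
```

The bytes FF FF AB A0, for example, decode to -21600, which is six hours west of UTC expressed in seconds.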


4.1.  Data Members

This class is declared to be a struct rather than a class, so basically everything is public (except for a couple of undocumented member functions that manage allocated memory).  It’s intended to be a private data member of the timezone class described below, but it could be a stand-alone class as long as users never do anything except:


4.1.1.  The header

#pragma pack(4)
struct header
{
    char   version[20];
    tz_int tzh_ttisgmtcnt;
    tz_int tzh_ttisstdcnt;
    tz_int tzh_leapcnt;
    tz_int tzh_timecnt;
    tz_int tzh_typecnt;
    tz_int tzh_charcnt;
};
header hdr;
#pragma pack()
This is what the author thinks of as the “header” portion of a TZif file:


4.1.2.  Data Arrays

There are nine pointers that will all point to memory allocated on the free store (a.k.a., “heap”), or they’ll be nullptr.
tz_time* trans_times;
unsigned char* info_idx;
The number of elements in each of these arrays is given by hdr.tzh_timecnt. The idea is that, to find the time that’s observed for a given time_t, search the trans_times array for the largest value not greater than the time_t that you’re interested in.  At that same position in the info_idx array will be an index to the array that follows.

If there’s only one element in the array that follows (e.g., when the object refers to a fixed offset from UTC), hdr.tzh_timecnt can be zero (since there’s never been a transition), and so both trans_times and info_idx can be nullptr.
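
The search just described might be sketched like this.  ttinfo_index_for() is a hypothetical free function, not a member of tz_data; and returning index 0 when there are no transitions, or when the time precedes the first transition, is an assumption made for the sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical helper:  returns the index into the ttinfo array for time t,
// given the parallel arrays trans_times[0..timecnt) (sorted ascending)
// and info_idx[0..timecnt).
std::size_t ttinfo_index_for(std::int64_t t,
                             const std::int64_t* trans_times,
                             const unsigned char* info_idx,
                             std::size_t timecnt)
{
    if (timecnt == 0) {
        return 0;  // no transitions; both pointers may be nullptr
    }
    const std::int64_t* first = trans_times;
    const std::int64_t* last = trans_times + timecnt;

    // upper_bound finds the first transition strictly after t, so the
    // element before it is the largest value not greater than t.
    const std::int64_t* it = std::upper_bound(first, last, t);
    if (it == first) {
        return 0;  // assumption:  t precedes every recorded transition
    }
    return info_idx[(it - first) - 1];
}
```

Because the transition times are sorted, the binary search keeps the lookup logarithmic even for zones with long histories.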

#pragma pack(2)
struct ttinfo
{
    tz_int        tt_gmtoff;
    unsigned char tt_isdst;
    unsigned char tt_abbrind;
};
#pragma pack()
ttinfo* info;
A ttinfo structure contains interesting information about the current time observance. The number of elements in the array of ttinfos that info points to is given by hdr.tzh_typecnt which will always be at least 1, so info will never be nullptr.

Note the #pragma pack(2) business:  the TZif files pack everything on arbitrary byte boundaries, and we need sizeof(ttinfo) to be 6, not 8 as it would be if we allowed the compiler to create a structure with padding bytes.
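
A quick way to check the packing on your own compiler is a static_assert like the one below.  The type name packed_ttinfo is used only for this sketch; and note that #pragma pack isn’t standard C++, although GCC, Clang and Microsoft’s compiler all accept this form.

```cpp
#include <cstdint>

#pragma pack(2)
struct packed_ttinfo  // same members as ttinfo; name is for this sketch only
{
    std::int32_t  tt_gmtoff;
    unsigned char tt_isdst;
    unsigned char tt_abbrind;
};
#pragma pack()

// Fails to compile if the compiler added padding bytes.
static_assert(sizeof(packed_ttinfo) == 6, "ttinfo must be exactly 6 bytes");
```

With the size pinned at 6, an array of tzh_typecnt records has the same layout as the corresponding bytes in the file, apart from byte order.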

char* abbrv;
This contains one or more '\0'-terminated C-style strings packed into a single array of char.  The strings are the usual abbreviations (e.g., “CST”, “CDT”) used to identify the time zone.  The particular abbreviation of interest begins at abbrv + info[info_idx[trans_pos]].tt_abbrind where trans_pos is the position in the trans_times array where the time_t of interest was found.

The total number of chars in the array including the '\0' terminator(s) is given by hdr.tzh_charcnt.  There will always be at least one abbreviation, so abbrv will never be nullptr.
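
Here’s a minimal sketch of pulling one abbreviation out of the packed array.  abbreviation_at() is a hypothetical helper, and the bounds checks are just defensive additions for the example.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper:  abbrv is the packed array, charcnt is
// hdr.tzh_charcnt, and abbrind is some ttinfo's tt_abbrind.
std::string abbreviation_at(const char* abbrv, std::size_t charcnt,
                            unsigned char abbrind)
{
    if (abbrind >= charcnt) {
        return std::string();  // index out of range
    }
    const char* begin = abbrv + abbrind;
    const std::size_t remaining = charcnt - abbrind;

    // Stop at the terminator, but never read past the end of the array.
    const void* nul = std::memchr(begin, '\0', remaining);
    const std::size_t len =
        nul != nullptr ? static_cast<const char*>(nul) - begin : remaining;
    return std::string(begin, len);
}
```

With the packed array "LMT\0CDT\0CST\0", index 4 yields “CDT” and index 8 yields “CST”.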

#pragma pack(4)
struct leapsecs
{
    tz_time leap;
    tz_int  cnt;
};
#pragma pack()
leapsecs* leaps;
leapsecs::leap contains the time() when a leap second happened, and leapsecs::cnt contains the number of leap seconds that have been added prior to leapsecs::leap.  Note that, although it has never happened yet, a leap second can be subtracted instead of added; so it’s possible for leapsecs::cnt to get smaller.  It doesn’t necessarily become greater (and note that Duncan Agnew of Scripps Oceanography at UCSD has predicted that, because of changes to the Earth’s rotation due to global warming, a negative leap second will be required by 2029 at the latest).

The number of leapsecs structures is given by hdr.tzh_leapcnt.

None of the TZif files in the 2024a release has any leap second information (so leaps will always be nullptr and hdr.tzh_leapcnt will always be zero); but this library allows for leap seconds because RFC 8536 does, and so they might show up in the future.

char* stdind;
char* utcind;
These are effectively arrays of booleans (either '\0' or '\1') that together indicate whether the published times of transitions (e.g., winter to summer time) specify the local wall clock time, the local standard time, or UTC.  *stdind is local standard or wall clock time; *utcind is UTC or local time.

The sizes of these arrays are given by hdr.tzh_ttisstdcnt and hdr.tzh_ttisgmtcnt, respectively.  The author observes that both of these have the same value as does hdr.tzh_typecnt for all zones in Zoneinfo’s 2024a release, but he has seen no documentation (or guarantee) of that.
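
One plausible way to combine the two indicators into a single value is sketched below.  The enum mirrors timezone::trans_type from the synopsis; the function classify() is hypothetical, and treating “UTC but not standard” as unknown is an assumption based on RFC 8536’s requirement that a UT indicator of 1 go with a standard indicator of 1.

```cpp
// Mirrors timezone::trans_type from the synopsis.
enum class trans_type { unknown = -1, wall, std, utc };

// Hypothetical helper:  stdind and utcind are one element each from the
// arrays of the same names ('\0' or '\1').
trans_type classify(char stdind, char utcind)
{
    if (utcind) {
        // Assumption:  an inconsistent pair is reported as unknown.
        return stdind ? trans_type::utc : trans_type::unknown;
    }
    return stdind ? trans_type::std : trans_type::wall;
}
```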

char* tzenv;
This is the '\0'-terminated string that’s the value of the POSIX TZ environment variable (e.g., “CST6CDT,M3.2.0,M11.1.0”).  Its length isn’t given by any of the integers in the header:  it’s just an ordinary C-style string.

In the TZif files, this string begins and ends with '\n' characters; but this library doesn’t store those.

struct tzrule
{
    int mo /*month*/, wc /*week count*/, wd /*weekday*/;
    int jd /*Julian day*/;
    int hr, mn, sc; // time of day

    tzrule() noexcept;
    tzrule(const std::tm&) noexcept;
    int compare(const tzrule&) const noexcept;
};
tzrule* tzrules;
This array is created only by the make_posix() member function which initializes *this given a POSIX TZ string instead of by reading a TZif file.  tzrules will be nullptr if the data were read from a TZif file or if the zone doesn’t observe DST.  If it does observe DST, there will be two array elements with tzrules[0] being the rule for moving to DST and tzrules[1] the rule for returning to standard time.

In the tzrules array, in an 'M'-style rule (month, week count and weekday), tzrule::jd will be INT_MIN; if jd != INT_MIN, it’s a 'J'-style rule and mo, wc and wd will be INT_MIN.

In tzrule objects constructed elsewhere from a struct tm, all of mo, wc, wd and jd will be the values taken from the tm.

tzrule::compare() compares a tzrules array element with some other tzrule object.  It returns a value less than zero if the array element represents a time before the passed argument, zero if they’re the same time, or a value greater than zero if it’s after the passed argument.  If the array element’s jd != INT_MIN, jd will be compared; otherwise mo, wc and wd will be compared.

Note that compare() may be called only on a tzrules array element.  The passed argument may be the other array element or a tzrule object constructed from a std::tm.
If both are array elements, both must be 'M'-style rules or both must be 'J'-style rules; and a debug build will assert if they’re different.  We might fix that Real Soon Now.
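
The comparison described above could be sketched as follows.  rule_sketch and compare_rules() are hypothetical stand-ins for tzrule and tzrule::compare(); the sketch ignores the time-of-day members, and whether the real function considers them isn’t stated here.

```cpp
#include <climits>
#include <tuple>

// Hypothetical stand-in for tzrule, without the time-of-day members.
struct rule_sketch
{
    int mo, wc, wd, jd;
};

// elem plays the role of the tzrules array element; other is the argument.
// Returns <0, 0 or >0 in the manner of std::strcmp.
int compare_rules(const rule_sketch& elem, const rule_sketch& other)
{
    if (elem.jd != INT_MIN) {
        // 'J'-style rule:  only the Julian day matters.
        return (elem.jd > other.jd) - (elem.jd < other.jd);
    }
    // 'M'-style rule:  compare month, then week count, then weekday.
    const auto a = std::tie(elem.mo, elem.wc, elem.wd);
    const auto b = std::tie(other.mo, other.wc, other.wd);
    return (b < a) - (a < b);
}
```

For the U.S. rules, comparing M3.2.0 with M11.1.0 yields a negative value because March comes before November.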


4.2.  Member Functions

This class doesn’t have much behavior associated with it beyond reading a TZif file or creating the data from a POSIX-like TZ environment variable.

The copy constructor, the copy assignment operator, and the read() and make_posix() functions all allocate memory and so can throw a bad_alloc exception.  The read() and get_posix_tz() functions can also throw an invalid_argument exception or a runtime_error exception; and make_posix() can throw an invalid_argument.


4.2.1.  Special Member Functions and swap()

tz_data() noexcept;
The default constructor just constructs an object with all the member pointers set to nullptr, and with all the non-pointer data members set to zero.
~tz_data() noexcept;
The destructor is non-trivial.
tz_data(const tz_data&);
tz_data(tz_data&&) noexcept;

tz_data& operator=(const tz_data&);
tz_data& operator=(tz_data&&) noexcept;

void swap(tz_data&) noexcept;
Instances of the tz_data class are freely copyable, moveable, and swappable.  There’s also a non-member swap(tz_data&,tz_data&).


4.2.2.  Cleanup

void clear() noexcept;
void all_clear() noexcept;
clear() frees all allocated memory and sets all the pointers to nullptr.

all_clear() sets the version array and all integers in the header to zero and then calls clear(), effectively wiping out all the information in the object.


4.2.3.  Creating the Data

tz_data& read(const std::string& zone_name);
This is the function that reads a TZif file and initializes all the data members described above.  The argument is Zoneinfo’s name for the time zone (e.g., “America/Chicago”).

In addition to throwing a bad_alloc if memory allocation fails, this function can throw an invalid_argument or a runtime_error exception (see 4.2 above).

This function returns *this.

std::string make_posix(const std::string& tz_env_var);
This function doesn’t try to read a TZif file, but instead tries to infer the data from a string that might be found in a POSIX-like TZ environment variable.  It returns the passed TZ string minus any DST rules (which is probably a reasonable name for the zone).

It will throw an invalid_argument exception if it can’t parse the passed TZ string.
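
The returned name is just the part of the TZ string before the first comma, which might be computed like this.  zone_name_from_tz() is a hypothetical helper showing only that one step; it does none of the parsing that make_posix() has to do.

```cpp
#include <string>

// Hypothetical helper:  everything before the first ',' (the DST rules),
// or the whole string if there are no rules.
std::string zone_name_from_tz(const std::string& tz)
{
    // find() returns npos when there's no comma, and substr(0, npos)
    // then returns the entire string.
    return tz.substr(0, tz.find(','));
}
```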

static std::string get_posix_tz(const std::string& zone_name);
This static member function reads a TZif file without modifying any tz_data data members and returns the POSIX TZ environment variable.  It’s a member of the tz_data class because it shares some code with tz_data::read(), but the intent is that it be called only by timezone’s static posix_tz_env_var(const std::string&).


5.  The timezone Type

This type is declared in the civil_time namespace.

The timezone class represents an offset from UTC and rules for going on and off “daylight saving time” (DST).  Depending on how it’s constructed, it can represent a fixed offset that never changes, an offset that changes only when going on and off DST, or historical offsets read from the Zoneinfo database.  You can change which civil time zone the timezone object represents after construction.

All the special member functions, other constructors, and swap() allocate memory and so can throw a bad_alloc exception.  Those that might read a TZif file or construct a timezone from a POSIX-like TZ environment variable (all except timezone(int,int,int)) can also throw invalid_argument or runtime_error exceptions.

It’s intended that this class be inherited by another class in a larger civil time library, so it’s not declared final and it has a virtual destructor.


5.1.  Special Member Functions and swap()

timezone();
The default constructor attempts to create the local time zone the name of which will be whatever is returned by civil_time::get_tz() (on a POSIX box probably “/etc/localtime”).  Note that this could be a name that looks like a POSIX-style TZ environment variable if you’re not using the TZif files.
virtual ~timezone();
The destructor is non-trivial; and it’s virtual so that the class can be safely inherited.
timezone(const timezone&);
timezone(timezone&&);

timezone& operator=(const timezone&);
timezone& operator=(timezone&&);

void swap(timezone&);
Instances of the timezone class are freely copyable, moveable, and swappable.  There’s also a non-member swap(timezone&,timezone&).


5.2.  Other Constructors

explicit timezone(int hours, int minutes = 0, int seconds = 0);
This creates a timezone with a fixed offset from UTC that doesn’t observe DST.  The sign of the offset will be that of the hours argument.  The signs of the minutes and seconds arguments will be adjusted to match hours if they’re non-zero, so they can both be absolute values in the call.

Both the name and the abbreviation will be “UTC±n” where n is the offset from UTC.  The offset can be just hours, hours:minutes, hours:minutes:seconds, or entirely absent if n would be zero.

The TZ environment variable will follow the Zoneinfo convention of “<n>n” where the offset between the angle brackets is the actual offset and the value after the closing '>' will have the opposite sign (like in a TZ environment variable).  It can also be just “<+0>0” for UTC exactly.
Note that time zones west of the prime meridian have negative offsets from UTC; but in the legacy TZ environment variables, the offsets have the wrong sign.
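
The sign handling might be sketched as below.  Both function names are hypothetical, and treating an hours argument of zero as non-negative is an assumption; the paragraphs above don’t say what happens in that case.

```cpp
#include <cstdlib>

// Hypothetical helper:  total offset from UTC in seconds, with minutes
// and seconds taking the sign of hours.
int fixed_offset_seconds(int hours, int minutes = 0, int seconds = 0)
{
    const int magnitude =
        std::abs(hours) * 3600 + std::abs(minutes) * 60 + std::abs(seconds);
    return hours < 0 ? -magnitude : magnitude;  // assumption:  0 is positive
}

// Hypothetical helper:  the offset as it would appear in a POSIX TZ
// string, which has the opposite sign.
int posix_tz_offset_seconds(int utc_offset_seconds)
{
    return -utc_offset_seconds;
}
```

So timezone(-6) corresponds to -21600 seconds from UTC, while the matching TZ string says “6”, as in “CST6”.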

explicit timezone(const std::string&);
explicit timezone(std::string&&);
The argument can be a Zoneinfo Zone name (e.g., “America/Chicago”) or a POSIX-style TZ string, possibly with DST rules (e.g., “CST6CDT,M3.2.0,M11.1.0”).  If the string doesn’t have the DST rules (e.g., just “CST6CDT”), it will be taken to be a Zoneinfo name initially.  In any event, if reading the TZif file fails for some reason, we’ll try again treating the argument as a TZ string.  If the argument gets treated as a TZ string and has a DST abbreviation but lacks the DST rules, the rules will default to the U.S. rules in effect at the time of this writing (DST is standard time plus one hour changing at 02:00 local wall clock time on the second Sunday in March or the first Sunday in November).
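
To make the 'M'-style rules concrete, here’s a sketch that turns a rule like M3.2.0 into a day of the month for a given year.  None of these functions are part of the library; day_of_week() uses Sakamoto’s well-known method for the Gregorian calendar, and the code treats week 5 as “the last such weekday in the month”, which is what POSIX specifies.

```cpp
// 0 = Sunday ... 6 = Saturday for a Gregorian date (Sakamoto's method).
int day_of_week(int y, int m, int d)
{
    static const int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
    if (m < 3) {
        y -= 1;
    }
    return (y + y / 4 - y / 100 + y / 400 + t[m - 1] + d) % 7;
}

int days_in_month(int y, int m)
{
    static const int len[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    const bool leap = (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    return (m == 2 && leap) ? 29 : len[m - 1];
}

// Day of the month for rule Mmonth.week.weekday in the given year.
int m_rule_day(int year, int month, int week, int weekday)
{
    const int first = day_of_week(year, month, 1);
    int day = 1 + (weekday - first + 7) % 7 + 7 * (week - 1);
    while (day > days_in_month(year, month)) {
        day -= 7;  // week 5 means the last occurrence in the month
    }
    return day;
}
```

For 2024, M3.2.0 works out to March 10 and M11.1.0 to November 3.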

Objects constructed from a POSIX-style TZ string will be correct only for times not far from the present.  You’ll get transitions to and from DST if they apply, but you won’t have any of the history because you won’t have read a TZif file.


5.3.  Observers

Some make sense only relative to a particular time_t value.  Except for the static posix_tz_env_var(), none will ever throw an exception.


5.3.1.  Observers Known Regardless of the Current Time

const std::string& name() const noexcept;
This returns the name of the time zone, which could be the Zoneinfo name (e.g., “America/Chicago”), an offset for which DST rules apply (e.g., “CST6CDT”), or a fixed offset from UTC (e.g., “UTC+1”, “<+1>-1”).
std::string posix_tz_env_var() const noexcept;
static std::string posix_tz_env_var(const std::string&);
This returns what would be the POSIX TZ environment variable, which will probably look something like “CST6CDT,M3.2.0,M11.1.0”, but could also be one of Zoneinfo’s extensions (e.g., “<+07>-7” or maybe even “<+1030>-10:30<+11>-11,M10.1.0,M4.1.0”).

The static version looks up a TZ string given a Zoneinfo Zone name without modifying *this.  It can throw an invalid_argument exception if the argument isn’t a valid Zone name, and it can throw a runtime_error exception if there’s some error reading the TZif file.  (If you’re using the tz_names_xlate.inc file, it’ll do an in-memory search and won’t throw a runtime_error; but it can still throw an invalid_argument if it can’t find the name.)

std::time_t effective_time() const noexcept;
This returns the time_t that was most recently passed to the for_time() function described below.  If for_time() hasn’t been called yet since the object was originally constructed, this function returns the value returned by std::time() when the object was constructed.
int std_offset() const noexcept;
int dst_offset() const noexcept;
These two functions return the offsets in seconds from UTC for standard time and daylight saving time.  They don’t care whether the zone is currently observing DST; but calling for_time() can change the values if the new time_t reflects some significantly different time observance other than just the regular winter/summer transitions.

dst_offset() will return INT_MIN for zones that don’t observe DST.

Note that the difference might not be exactly 3600 seconds (i.e., exactly 1 hour, e.g., on Australia’s Lord Howe Island which currently moves forward in the summer by only half an hour); and the DST offset might even be less than the STD offset (e.g., in the Republic of Ireland which treats summer time as the standard time and winter time as the adjusted one).

const zoneinfo::tz_data& raw_data() const noexcept;
This returns a const reference to the tz_data object that was initialized by reading from a TZif file or inferred from a TZ string.


5.3.2.  Observers That Make Sense Only for a Particular Time

int utc_offset() const noexcept;
This returns the current offset from UTC in seconds including the additional offset for DST if that applies.
bool is_dst() const noexcept;
This returns whether DST is in effect.
const char* abbrv() const noexcept;
This returns the customary abbreviation for the time zone (e.g., “CST”).  This could be a Zoneinfo extension like “<-7>”.
enum class trans_type { unknown = -1, wall, std, utc };
trans_type transition_type() const noexcept;
This returns whether the times that transitions happen (e.g., between summer and winter time) are relative to the local wall clock time, the local standard time, or UTC.


5.4.  Mutators

All of these return *this.


5.4.1.  Changing the Time Zone

timezone& switch_to_local();
This switches to the local time zone.
timezone& switch_to_offset(int hours, int minutes = 0, int seconds = 0);
timezone& switch_to_utc() { return switch_to_offset(0); }
This switches to a time zone with a fixed offset from UTC that never observes DST.
timezone& switch_to(const std::string&);
timezone& switch_to(std::string&&);
This switches to a named time zone.

All of these allocate memory and so can throw a bad_alloc exception.  Those that might read a TZif file or infer the data from a TZ string (all except switch_to_offset()) can also throw invalid_argument or runtime_error.


5.4.2.  Setting the Time

timezone& for_time(std::time_t) noexcept;
This sets the time_t that will be used by those observers that return values relative to a particular point in time.  It will never throw an exception.


Appendix A:  Configuration

The assumption out of the box is that timezone.cpp is being compiled for either POSIX (including a Mac) or Microsoft Windows, and in either case, that there’s no TZ or TZ_ROOT environment variable and that the TZif files are available somewhere.

It’s possible to use this library without the TZif files, but the timezone class will give correct answers only for times not far from the present:  you’ll get the winter-summer transitions, but you won’t have any of the history.  See the separate paper mentioned in the Introduction if you’d like to have the TZif files on your Windows box.

All the macros described in this section are defined in timezone_config.hpp and control only how timezone.cpp gets compiled.


A.1.  Where to Find the Data

There are four macros that you can define to control how timezone.cpp finds time zone information:  CIVIL_TIME_TZ, CIVIL_TIME_TZ_ROOT, CIVIL_TIME_USE_GETENV and CIVIL_TIME_NO_ZONEINFO.

Neither of the last two gets defined by default, but timezone_config.hpp will define the first two if you haven’t defined them already.

If you’re not compiling for Windows, your local time zone will default to “/etc/localtime” and your Zoneinfo directory will default to “/usr/share/zoneinfo”.  This is probably correct for POSIX, but some older systems might use /etc/timezone for the local zone instead.

If you are compiling for Windows, you’ll get defaults that reflect how the author set up his own system, which are probably not what you want:

So if you’re compiling for Windows, you’ll probably want to explicitly define CIVIL_TIME_TZ and either CIVIL_TIME_NO_ZONEINFO or CIVIL_TIME_TZ_ROOT (or maybe CIVIL_TIME_USE_GETENV if you’ve created the TZ and TZ_ROOT environment variables).

These values can be changed programmatically at run time, so you don’t need to worry about this at all if you call one or both of the set_tz() and set_tz_root() functions before you make any use of the timezone class (e.g., at the beginning of main()).

The author observed that actually setting a TZ environment variable on Windows 10 Home build 19045.4170 caused the <ctime> functions to misbehave; so approach statements about CIVIL_TIME_USE_GETENV with some trepidation if compiling for Windows.  It might be best, on Windows, to just set the local time zone programmatically at the beginning of main().


A.2.  The Two .inc Files

There are also four macros any two of which you can define (if you want to) that control whether the two .inc files are used.  On POSIX, you don’t get either by default; on Windows, you get both by default.

tz_names_xlate.inc is a cross reference of Zoneinfo Zone names to POSIX TZ environment variable values.  You can opt in on POSIX by defining CIVIL_TIME_USE_NAME_XLATE or opt out on Windows by defining CIVIL_TIME_NO_NAME_XLATE.  This might be useful on POSIX if you do lots of lookup of TZ environment variables and don’t want to have to read TZif files to find them; it’s necessary on Windows if you don’t have the TZif files installed but still want to allow your software to use the Zoneinfo Zone names.

tz_links_xlate.inc is a cross reference of Zoneinfo Link names to Zone names.  You can opt in by defining CIVIL_TIME_USE_LINK_XLATE or opt out by defining CIVIL_TIME_NO_LINK_XLATE.  It might be useful on POSIX if for some reason you don’t want to follow the filesystem’s symlinks; it’s necessary on Windows if you don’t have the TZif files installed or you haven’t set up Windows symlinks.

Even on Windows, you don’t really need either one if you have the TZif files installed and you’ve created all the Windows symlinks.

Both files are just sorted lists of initializers for char const * const[][2] arrays, such initializers having the form:

{"some-name","translated-name"},
and allow doing an in-memory binary search instead of traversing the filesystem.
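
The in-memory lookup might look something like this sketch.  The table entries are merely illustrative (they weren’t copied from the real .inc files), and links_sketch and translate() are hypothetical names.

```cpp
#include <cstddef>
#include <cstring>

// Illustrative entries only, sorted by the first string of each pair.
static const char* const links_sketch[][2] = {
    {"US/Central", "America/Chicago"},
    {"US/Eastern", "America/New_York"},
    {"US/Pacific", "America/Los_Angeles"},
};

// Hypothetical helper:  binary search for name; returns the translated
// name, or nullptr if the name isn't in the table.
const char* translate(const char* name)
{
    std::size_t lo = 0;
    std::size_t hi = sizeof links_sketch / sizeof links_sketch[0];
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        const int c = std::strcmp(links_sketch[mid][0], name);
        if (c == 0) {
            return links_sketch[mid][1];
        }
        if (c < 0) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return nullptr;
}
```

The search works only if the initializers are sorted in std::strcmp() order, which is why the files are sorted lists.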

They can be generated with the xlate-tz-names program on any box, POSIX or Windows, where the TZif files have been installed.  The one (optional) command line argument is the full path to the Zoneinfo directory; and it defaults to /usr/share/zoneinfo which is probably correct for POSIX.  On the author’s Windows box, he had to supply the command line argument, c:\Users\Public\Zoneinfo which is the directory he created.

These files probably won’t change from one Zoneinfo release to another; but if they do change, and you’re using them, you’ll want to run xlate-tz-names again and recompile timezone.cpp.  The two .inc files supplied with this distribution are correct for Zoneinfo’s 2024a release.


Appendix B:  The Structure of Zoneinfo’s Binary Files

For a complete description, see RFC 8536.

Version 1:
all ints and time_ts are 32 bits, big endian, packed on arbitrary byte boundaries

Type                        Name            Use
char[4]                     header          always "TZif"
char                        version         '\0', '2' or '3'
char[15]                    [future use]    all '\0'
int32_t                     tzh_ttisgmtcnt  count of UTC/local indicators
int32_t                     tzh_ttisstdcnt  count of standard/wall indicators
int32_t                     tzh_leapcnt     count of leap seconds
int32_t                     tzh_timecnt     count of transition times
int32_t                     tzh_typecnt     count of local time types (never zero)
int32_t                     tzh_charcnt     total characters in all abbreviation strings
time_t[tzh_timecnt]                         transition times as returned by std::time()
unsigned char[tzh_timecnt]                  indices into the following ttinfo array
struct ttinfo {
  int32_t       tt_gmtoff;
  unsigned char tt_isdst;
  unsigned char tt_abbrind;
} [tzh_typecnt]
                                            tt_gmtoff is total ISO 8601 offset (std + dst);
                                            tt_isdst is a boolean indicating DST;
                                            tt_abbrind is an index to abbreviations.
char[tzh_charcnt]                           packed '\0'-terminated abbreviations
struct leaps {
  time_t  leap;
  int32_t cnt;
} [tzh_leapcnt];
                                            leaps::leap is time() at a leap second;
                                            leaps::cnt is the number of leap seconds
                                            that have been added before leaps::leap.
char[tzh_ttisstdcnt]                        whether local transitions are standard or wall
char[tzh_ttisgmtcnt]                        whether transition times are UTC or local

Version 2:
additional copy of all the above, but with 64-bit time_ts (the time_t entries above:  the transition times and leaps::leap), plus:

char[]                                      '\n' POSIX TZ environment variable '\n'

Version 3 adds no data, but allows some extensions to the “POSIX TZ environment variable”.

There’s also a “Version 4” that mostly has to do with how the TZif files are compiled from the Zoneinfo sources.  The author has seen no examples of Version 4 binaries and doesn’t believe that they’ve been released yet; and this library treats all of Versions 2, 3 and 4 the same.
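
Given the six counts from a header, the size of the data that follows can be computed as in the sketch below, which is one way to find where the Version 2 copy begins.  counts_sketch and data_block_bytes() are hypothetical names; time_size is 4 for the Version 1 data and 8 for the Version 2 data.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the six counts in the header.
struct counts_sketch
{
    std::int32_t ttisgmtcnt, ttisstdcnt, leapcnt, timecnt, typecnt, charcnt;
};

// Number of bytes between the end of a header and the end of its data.
std::size_t data_block_bytes(const counts_sketch& c, std::size_t time_size)
{
    const std::size_t timecnt = static_cast<std::size_t>(c.timecnt);
    const std::size_t leapcnt = static_cast<std::size_t>(c.leapcnt);
    return timecnt * time_size                           // transition times
         + timecnt                                       // ttinfo indices
         + static_cast<std::size_t>(c.typecnt) * 6       // packed ttinfos
         + static_cast<std::size_t>(c.charcnt)           // abbreviations
         + leapcnt * (time_size + 4)                     // leap second records
         + static_cast<std::size_t>(c.ttisstdcnt)        // standard/wall
         + static_cast<std::size_t>(c.ttisgmtcnt);       // UTC/local
}
```

Since the header itself is 44 bytes (4 + 1 + 15 + 6 × 4), the second header in a Version 2 file starts at offset 44 + data_block_bytes(counts, 4).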


All suggestions and corrections will be welcome; all flames will be amusing.  Mail to was@pobox.com