Using traditional calendars with ICU
PostgreSQL displays and calculates date and times with respect to the Gregorian calendar, and out of the box, does not support traditional calendars in use in some parts of the world. However the ICU library does so, meaning that when Postgres is built with ICU, a large part of the necessary code and data is already there, just needing to be exposed at the SQL level. Adding such interfaces is the goal of the icu_ext extension, which currently provides over 25 functions to tap into ICU.
Starting with the just released 1.8.0 version, icu_ext makes traditional calendars accessible in SQL through a set of custom functions, types, operators and configuration settings.
For instance the day corresponding to “Sep 27, 2023” in the gregorian
calendar is expressed like this in some other calendars, using
the locales 'en@calendar=name-of-calendar'
.
@calendar= | short format | medium format | long format |
---|---|---|---|
buddhist | 9/27/2566 BE | Sep 27, 2566 BE | September 27, 2566 BE |
chinese | 8/13/2023 | Mo8 13, 2023 | Eighth Month 13, 2023(gui-mao) |
coptic | 1/16/1740 ERA1 | Tout 16, 1740 ERA1 | Tout 16, 1740 ERA1 |
ethiopic | 1/16/2016 ERA1 | Meskerem 16, 2016 ERA1 | Meskerem 16, 2016 ERA1 |
hebrew | 12 Tishri 5784 | 12 Tishri 5784 | 12 Tishri 5784 |
indian | 7/5/1945 Saka | Asvina 5, 1945 Saka | Asvina 5, 1945 Saka |
islamic | 3/12/1445 AH | Rab. I 12, 1445 AH | Rabiʻ I 12, 1445 AH |
japanese | 9/27/5 R | Sep 27, 5 Reiwa | September 27, 5 Reiwa |
persian | 7/5/1402 AP | Mehr 5, 1402 AP | Mehr 5, 1402 AP |
This output is with an english locale (the en
part in the
locale). It’s of course possible to choose different languages (see
select * from icu_locales_list()
for what’s available).
Input/Output formats
Independently of the calendar, the text representation for dates and times may be specified by refering to the name of a “basic format” such as described in the CLDR. With icu_ext, these names are enclosed between curly brackets to mark them as a format:
{short}
{medium}
{long}
{full}
Alternatively, a format string can be specified with fields described in Formatting Dates and Times (ICU documentation), which reproduces the Date Field Symbol Table in the CLDR.
These format strings can to be passed to the formatting and parse functions:
- icu_format_date (input date, format text [, locale text] ) → text
-
icu_format_datetime (input timestamptz, format text [, locale text] ) → text
- icu_parse_date (input text, format text [, locale text] ) → date
- icu_parse_datetime (input text, format text [, locale text] ) → timestamptz
Examples:
SELECT icu_format_date(
'2020-12-31'::date,
'{medium}',
'en@calendar=ethiopic'
);
icu_format_date
----------------------
Tahsas 22, 2013 ERA1
SELECT icu_parse_datetime(
'11/Meskerem/2016 14:57:17',
'dd/MMMM/yyyy HH:mm:ss',
'en@calendar=ethiopic'
);
icu_parse_datetime
------------------------
2023-09-22 14:57:17+02
SELECT icu_format_datetime(
now(),
'GGGG dd/MMMM/yyyy HH:mm:ss.SSS z',
'fr@calendar=buddhist'
);
icu_format_datetime
------------------------------------------------
ère bouddhique 22/septembre/2566 14:55:48.133 UTC+2
Data types: icu_date and icu_timestamptz
It may be cumbersome to have to call the format and parse functions every time we need to display or submit a date. Ideally we should be able to get date fields to be displayed in our calendar, locale and format of choice by configuration.
In order to do that, icu_ext exposes
the icu_date
and icu_timestamptz
data types, along
with the settings icu_ext.icu_ext.date_format
,
icu_ext.icu_ext.timestamptz_format
and icu_ext.locale
.
The settings for the formats and the locale take the same values as would be passed if calling the format or parse functions.
For instance:
-- Vietnamese language, buddhist calendar
SET icu_ext.locale TO 'vi@calendar=buddhist';
SET icu_ext.timestamptz_format TO '{long}';
SELECT now()::icu_timestamptz;
+------------------------------------------------+
| now |
+------------------------------------------------+
| Ngày 27 tháng 9 năm 2566 BE lúc 18:30:33 GMT+2 |
+------------------------------------------------+
SET icu_ext.date_format TO 'dd/MM/yyyy';
CREATE TEMPORARY TABLE test(event_date icu_date);
INSERT INTO test VALUES(current_date+10);
SELECT * FROM test;
+------------+
| event_date |
+------------+
| 07/10/2566 |
+------------+
SET icu_ext.date_format TO '{full}';
SELECT * FROM test;
+---------------------------------------+
| event_date |
+---------------------------------------+
| Thứ Bảy, ngày 07 tháng 10 năm 2566 BE |
+---------------------------------------+
-- Switch to english and ethiopic calendar
SET icu_ext.locale TO 'en@calendar=ethiopic';
SELECT * FROM test;
+----------------------------------+
| event_date |
+----------------------------------+
| Saturday, Meskerem 26, 2016 ERA1 |
+----------------------------------+
Internally, the icu_date and icu_timestamptz types store the same values
as the corresponding core types date and timestamptz (that is,
a duration in days or microseconds since 2000, january 1st GMT),
which fortunately allows the extension code to branch to the core
code for most of the operations.
However, their text representation is completely different and is driven
by the locale (selecting language and calendar) and the format string
set in the icu_ext.*
configuration namespace.
By contrast, the text representations for the core types are driven by
the DateStyle
setting, which offers much less flexibility.
Date calculation and the icu_interval type
Besides input and output, dates can be shifted by adding intervals to them.
Unfortunately the interval
built-in data type cannot be used
directly for some traditional calendars, because interval
automatically
converts years into 12 months (internally it doesn’t have a field for
the years), while some calendars have 13 months per year.
Therefore the extension reimplements the interval type into an
icu_interval
data type with a different storage. Along with it, it
offers a specific set of operations to add and subtract intervals to
dates and timestamps with respect to the currently selected calendar
in icu_ext.locale
.
The basic operations are implemented:
- icu_date +/- icu_interval → icu_timestamptz
- icu_timestamptz +/- icu_interval → icu_timestamptz
- icu_interval * int → icu_interval
- icu_interval +/- icu_interval → icu_interval
Examples:
-- Adding 13 months = adding 1 year in the ethiopic calendar
SET icu_ext.locale TO 'en@calendar=ethiopic';
SELECT current_date::icu_date as today,
(current_date + icu_interval '13 months')::icu_date as "plus 13 mon";
+------------------------+------------------------+
| today | plus 13 mon |
+------------------------+------------------------+
| Meskerem 16, 2016 ERA1 | Meskerem 16, 2017 ERA1 |
+------------------------+------------------------+
Time zones
The time zone used by icu_ext is the SQL session time zone (as set by
SET timezone
and returned by SHOW timezone
).
Although ICU has its own list of time zones, icu_ext doesn’t use it. Hopefully the time zone names used by Postgres are compatible with ICU because they both derive from the IANA database.
Before leaving
If you have questions or issues with this brand new code or the other functionalities in icu_ext, feel free to submit them on github, as well as give a ⭐ to the project if you like it!