PostgreSQL uses an internal heuristic parser for all date/time input. Dates and times are input as strings and are broken up into distinct fields, with a preliminary determination of what kind of information each field may contain. Each field is then interpreted and either assigned a numeric value, ignored, or rejected. The parser contains internal lookup tables for all textual fields, including months, days of the week, and time zones.
This appendix includes information on the content of these lookup tables and describes the steps used by the parser to decode dates and times.
The date/time types are all decoded using a common set of routines.
Date/Time Input Interpretation
Break the input string into tokens and categorize each token as a string, time, time zone, or number.
If the numeric token contains a colon (:), this is a time string. Include all subsequent digits and colons.
If the numeric token contains a dash (-), slash (/), or two or more dots (.), this is a date string which may have a text month.
If the token is numeric only, then it is either a single field or an ISO 8601 concatenated date (e.g., 19990113 for January 13, 1999) or time (e.g., 141516 for 14:15:16).
If the token starts with a plus (+) or minus (-), then it is either a time zone or a special field.
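The categorization rules above can be sketched roughly as follows. This is only an illustrative Python sketch, not the parser's actual code: the function names classify_token and tokenize, the category labels, and the whitespace-based splitting are all assumptions made for the example.

```python
def classify_token(token: str) -> str:
    """Give a preliminary category to one non-empty token (hypothetical labels)."""
    if token[0] in "+-":
        # A leading sign means a time zone or a special field.
        return "tz_or_special"
    if token[0].isdigit():
        if ":" in token:
            # Digits with a colon form a time string.
            return "time"
        if "-" in token or "/" in token or token.count(".") >= 2:
            # Dash, slash, or two or more dots: a date string,
            # which may still contain a text month.
            return "date"
        # Single field or concatenated date/time; decided later.
        # (Assumption: anything else starting with a digit also lands here.)
        return "number"
    # Everything else is looked up as a text string.
    return "string"


def tokenize(text: str):
    """Split on whitespace (an assumption) and classify each token."""
    return [(tok, classify_token(tok)) for tok in text.split()]
```

For instance, under these assumptions tokenize("1999-01-13 14:15:16 +02") yields a date token, a time token, and a time zone or special token, in that order.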
If the token is a text string, match it against the possible strings.
Do a binary-search table lookup for the token as either a special string (e.g., today), day (e.g., Thursday), month (e.g., January), or noise word (e.g., at, on).
Set field values and bit mask for fields. For example, set year, month, day for today, and additionally hour, minute, second for now.
If not found, do a similar binary-search table lookup to match the token with a time zone.
If not found, throw an error.
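The two-stage lookup for text tokens might look like the sketch below. The table contents are a tiny hypothetical sample, and the names make_table, binary_lookup, and decode_text, the lowercase normalization, the day numbering, and the time zone offsets in seconds are assumptions for illustration; the real lookup tables are far larger.

```python
from bisect import bisect_left


def make_table(entries):
    """Sort rows by key and keep a parallel key list for binary search."""
    rows = sorted(entries, key=lambda row: row[0])
    return [row[0] for row in rows], rows


# Hypothetical sample of special strings, days, months, and noise words.
DATE_WORDS = make_table([
    ("at", "noise", None),
    ("january", "month", 1),
    ("now", "special", None),
    ("on", "noise", None),
    ("thursday", "day", 4),  # day numbering is illustrative
    ("today", "special", None),
])

# Hypothetical sample of time zones: (name, offset from UTC in seconds).
TIME_ZONES = make_table([
    ("est", -5 * 3600),
    ("utc", 0),
])


def binary_lookup(table, key):
    """Return the row whose key equals `key`, or None if absent."""
    keys, rows = table
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return rows[i]
    return None


def decode_text(token):
    """Try the date-word table first, then the time zone table."""
    key = token.lower()  # assumption: matching ignores case
    hit = binary_lookup(DATE_WORDS, key)
    if hit is not None:
        return hit[1], hit[2]
    hit = binary_lookup(TIME_ZONES, key)
    if hit is not None:
        return "timezone", hit[1]
    raise ValueError(f"unrecognized date/time word: {token!r}")
```

In this sketch decode_text("Thursday") is resolved by the first table, decode_text("EST") falls through to the time zone table, and an unknown word raises an error.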
If the token is a number or number field:
If there are more than 4 digits, and if no other date fields have been previously read, then interpret as a "concatenated date" (e.g., 19990118). Tokens of 8 or 6 digits are interpreted as year, month, and day; tokens of 7 or 5 digits are interpreted as year and day of year.
If the token is three digits and a year has already been decoded, then interpret as day of year.
If four or six digits and a year has already been read, then interpret as a time.
If four or more digits, then interpret as a year.
If in European date mode, and if the day field has not yet been read, and if the value is less than or equal to 31, then interpret as a day.
If the month field has not yet been read, and if the value is less than or equal to 12, then interpret as a month.
If the day field has not yet been read, and if the value is less than or equal to 31, then interpret as a day.
If two digits or four or more digits, then interpret as a year.
Otherwise, throw an error.
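Because the numeric rules are order-dependent, it can help to see them applied in sequence. The following is a minimal Python sketch that follows the rules literally as listed; the function name decode_number, the fields dictionary and its keys, and the HHMM/HHMMSS split for time tokens are assumptions for illustration. Two-digit years are stored unadjusted here, since that adjustment is a later step.

```python
def decode_number(token, fields, european=False):
    """Apply the numeric-token rules in order; updates and returns `fields`."""
    n = len(token)
    value = int(token)
    have_date = any(k in fields for k in ("year", "month", "day", "doy"))

    if n > 4 and not have_date:
        # Concatenated date: the year is whatever precedes MMDD or DDD.
        if n in (8, 6):
            fields["year"] = int(token[:-4])
            fields["month"] = int(token[-4:-2])
            fields["day"] = int(token[-2:])
        elif n in (7, 5):
            fields["year"] = int(token[:-3])
            fields["doy"] = int(token[-3:])
        else:
            raise ValueError(f"cannot decode concatenated date: {token!r}")
    elif n == 3 and "year" in fields:
        fields["doy"] = value
    elif n in (4, 6) and "year" in fields:
        # Assumption: 4 digits mean HHMM and 6 digits mean HHMMSS.
        fields["hour"] = int(token[0:2])
        fields["minute"] = int(token[2:4])
        fields["second"] = int(token[4:6]) if n == 6 else 0
    elif n >= 4:
        fields["year"] = value
    elif european and "day" not in fields and value <= 31:
        fields["day"] = value
    elif "month" not in fields and value <= 12:
        fields["month"] = value
    elif "day" not in fields and value <= 31:
        fields["day"] = value
    elif n == 2 or n >= 4:
        fields["year"] = value
    else:
        raise ValueError(f"cannot interpret numeric field: {token!r}")
    return fields
```

Feeding the tokens 1, 13, 99 through this sketch in the default mode produces month 1, day 13, year 99, while feeding 13, 1, 99 with european=True produces the same date with the day read first.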
If BC has been specified, negate the year and add one for internal storage. (There is no year zero in the Gregorian calendar, so numerically 1 BC becomes year zero.)
If BC was not specified, and if the year field was two digits in length, then adjust the year to 4 digits. If the field was less than 70, then add 2000; otherwise, add 1900.
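The two year adjustments above amount to a small piece of arithmetic, sketched here in Python. The function name adjust_year and its parameters are hypothetical; digits stands for the number of digits in the year field as it was typed.

```python
def adjust_year(year, digits, bc=False):
    """Convert a decoded year field to its internal numeric value."""
    if bc:
        # Negate and add one: 1 BC -> 0, 2 BC -> -1, and so on.
        return -year + 1
    if digits == 2:
        # Two-digit years below 70 land in the 2000s, the rest in the 1900s.
        return year + (2000 if year < 70 else 1900)
    # Years written with other lengths (e.g., 0099) are taken as given.
    return year
```

Under this sketch a two-digit 69 becomes 2069 and a two-digit 70 becomes 1970, while 0099 (four digits) stays AD 99.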
Tip: Gregorian years AD 1-99 may be entered by using 4 digits with leading zeros (e.g., 0099 is AD 99). Previous versions of PostgreSQL accepted years with three digits and with single digits, but as of version 7.0 the rules have been tightened up to reduce the possibility of ambiguity.