Date and Time Datatypes

The datatypes covered in this section are shown in Figure 4-4.

Figure 4-4. Date and time datatypes

The W3C Recommendation, “XML Schema Part 2: Datatypes,” provides new confirmation of how difficult it is to fix time.

The support for date and time datatypes relies entirely on a subset of the ISO 8601 standard, which is the only format supported by W3C XML Schema. The purpose of ISO 8601 is to eliminate the risk of confusion between the various date and time formats used in different countries. In other words, W3C XML Schema does not support these local date and time formats, and imposes the usage of ISO 8601 for any datatype that has the semantic of a date or time. While this is a good thing for interchange formats, this is more questionable when XML is used to define user interfaces, since we will see that ISO 8601 is not very user friendly. The variations using the names of the months or different orders between year, month, and day are not the only victims of this decision: ISO 8601 imposes the usage of the Gregorian (Christian) calendar to the exclusion of calendars used by other cultures or religions.

ISO 8601 describes several formats to define dates, times, periods, and recurring dates, with different levels of precision and indeterminacy. After many discussions, W3C XML Schema selected a subset of these formats and created a primitive datatype for each format that is supported.

The indeterminacy allowed in some of these formats adds a lot of difficulty, especially when comparisons or arithmetic are involved. For instance, it is possible to define a point in time without specifying the time zone, which is then considered undetermined. This undetermined time zone is identical all over the document (and between the schema and the instance documents) and it’s not an issue to compare two datetimes without a time zone. The problem arises when you need to compare two points in time, one with a time zone and the other without. The result of this comparison will be undetermined if these values are too close, since one of them may be between -13 hours and +12 hours of Coordinated Universal Time (UTC). Thus, the support of these datetime datatypes introduces a notion of “partial order relation.”

Another caveat with ISO 8601 is that time zones are only supported through the time difference from UTC, which ignores the notion of daylight saving time. For instance, if an application working on British local time wants to specify the time zone (and we have seen that this is necessary to be able to compare datetimes), the application needs to know whether a date falls in summer (British Summer Time, BST, when the time zone is one hour ahead of UTC) or in winter (when the time zone is UTC). ISO 8601 ignores the “named time zones” that account for daylight saving time (such as PST, BST, or WET) and that we use in our day-to-day life; omitting the time zone can be seen as a somewhat dangerous shortcut to specify that a datetime is on your “local time,” whatever it is.

Point in time: xs:dateTime

The xs:dateTime datatype defines a “specific instant of time.” This is a subset of what ISO 8601 calls a “moment of time.” Its lexical value follows the format “CCYY-MM-DDThh:mm:ss,” in which all the fields must be present and may optionally be preceded by a sign and leading figures, if needed, and followed by fractional digits for the seconds and a time zone. The time zone may be specified using the letter “Z,” which identifies UTC, or by the difference of time with UTC.

Tip

The value space of xs:dateTime is considered to be the moment of time itself. The time zone used to write the value (when there is one) is not considered part of the value. This is a problem for some applications: even though 2002-01-18T12:00:00+00:00 and 2002-01-18T11:00:00-01:00 refer to the same “moment of time,” they carry different time zone information, which, for these applications, should make its way into the value space.

Valid values for xs:dateTime include:

2001-10-26T21:32:52
2001-10-26T21:32:52+02:00
2001-10-26T19:32:52Z
2001-10-26T19:32:52+00:00
-2001-10-26T21:32:52
2001-10-26T21:32:52.12679

The following values are invalid:

2001-10-26 (all the parts must be specified)
2001-10-26T21:32 (all the parts must be specified)
2001-10-26T25:32:52+02:00 (the hours part (25) is out of range)
01-10-26T21:32 (all the parts must be specified)
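The lexical rules illustrated by these examples can be approximated with a regular expression followed by a range check. The following is a minimal sketch in Python, not a complete validator: the names DATETIME_RE and looks_like_xs_datetime are hypothetical, only a leading minus sign is accepted (an assumption of this sketch), and the check ignores the length of each month as well as any limit on the time zone offset.

```python
import re

# Simplified pattern for the xs:dateTime lexical form described above:
# optional minus sign, at least four year digits, MM-DDThh:mm:ss,
# optional fractional seconds, optional time zone (Z or +/-hh:mm).
DATETIME_RE = re.compile(
    r"(-?\d{4,})-(\d{2})-(\d{2})"
    r"T(\d{2}):(\d{2}):(\d{2})(\.\d+)?"
    r"(Z|[+-]\d{2}:\d{2})?"
)


def looks_like_xs_datetime(text: str) -> bool:
    """Return True if text matches the simplified xs:dateTime shape."""
    m = DATETIME_RE.fullmatch(text)
    if m is None:
        return False
    month, day = int(m.group(2)), int(m.group(3))
    hour, minute, second = int(m.group(4)), int(m.group(5)), int(m.group(6))
    # Coarse range checks only: days per month are not verified here.
    return (1 <= month <= 12 and 1 <= day <= 31
            and hour <= 23 and minute <= 59 and second <= 59)
```

With this sketch, the values listed as valid above are accepted, while 2001-10-26T21:32 fails the pattern and 2001-10-26T25:32:52+02:00 fails the range check.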

Among the valid examples given above, three have the same value, since they designate the same moment of time:

2001-10-26T21:32:52+02:00
2001-10-26T19:32:52Z
2001-10-26T19:32:52+00:00
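This equality can be checked with any library that compares zoned datetimes by instant. The following Python sketch uses the standard datetime module; the helper parse_zoned is a hypothetical name, and it rewrites the "Z" suffix only so that the example also runs on Python versions whose fromisoformat does not accept it.

```python
from datetime import datetime


def parse_zoned(text: str) -> datetime:
    """Parse a zoned xs:dateTime-like string with the standard library."""
    # Map the "Z" suffix to an explicit offset for older Python versions.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


a = parse_zoned("2001-10-26T21:32:52+02:00")
b = parse_zoned("2001-10-26T19:32:52Z")
c = parse_zoned("2001-10-26T19:32:52+00:00")

# Zoned datetimes compare by instant, so the three values are equal...
same_instant = (a == b == c)
# ...although Python, unlike the xs:dateTime value space, keeps the offset.
offsets_differ = a.utcoffset() != b.utcoffset()
```

Note that the Python objects still remember the offset they were written with, which is precisely the information that the xs:dateTime value space discards.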

The first one (2001-10-26T21:32:52), which doesn’t include a time zone specification, is considered to have an indeterminate value between 2001-10-26T21:32:52-14:00 and 2001-10-26T21:32:52+14:00. Because of daylight saving time, the range of offsets actually in use is subject to national regulations and may change. The range was between -13:00 and +12:00 when the Recommendation was published, but the Working Group has kept a margin to accommodate possible changes in the regulations.

Despite the indeterminacy of the time zone when none is specified, the W3C XML Schema Recommendation considers that the values of datetimes without time zones implicitly refer to the same undetermined time zone and can be compared with each other. While this is fine for “local” applications that operate in a single time zone, this is a source of potential confusion and errors for worldwide applications, or even for applications that calculate a duration between moments that fall on either side of a daylight saving change within a single time zone.
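The partial order described above can be sketched as follows in Python. This is an illustration under stated assumptions rather than a conforming implementation: the function name compare and the constant MAX_OFFSET are hypothetical, an unzoned value is modeled as a naive datetime, and the ±14:00 margin is the one quoted in the text.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_OFFSET = timedelta(hours=14)  # margin quoted above for unzoned values


def compare(p: datetime, q: datetime) -> Optional[int]:
    """Return -1, 0, or 1, or None when the order is indeterminate."""
    if (p.tzinfo is None) == (q.tzinfo is None):
        # Both zoned or both unzoned: the order is total.
        return (p > q) - (p < q)
    if p.tzinfo is None:
        # Swap the arguments so that the zoned value comes first.
        r = compare(q, p)
        return None if r is None else -r
    # p is zoned, q is not: q may be any instant within the +/-14:00 window.
    earliest = q.replace(tzinfo=timezone(MAX_OFFSET))
    latest = q.replace(tzinfo=timezone(-MAX_OFFSET))
    if p < earliest:
        return -1
    if p > latest:
        return 1
    return None
```

For instance, comparing 2001-10-26T19:32:52Z with the unzoned 2001-10-26T21:32:52 yields None in this sketch, because the two values are less than 14 hours apart.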

Periods of time: xs:date, xs:gYearMonth, and xs:gYear

The lexical space of the xs:date datatype is identical to the date part of xs:dateTime. Like xs:dateTime, it accepts an optional time zone, which should always be specified to be able to compare two dates without ambiguity. As defined by W3C XML Schema, a date is a period of one day in its time zone, “independent of how many hours this day has.” The consequence of this definition is that two dates defined in different time zones cannot be equal unless they designate the same interval (2001-10-26+12:00 and 2001-10-25-12:00, for instance). Another consequence is that, as with xs:dateTime, the order relation between a date with a time zone and a date without a time zone is partial.
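To see why 2001-10-26+12:00 and 2001-10-25-12:00 designate the same interval, one can compute the start and end of each day in UTC. The Python sketch below does so with a hypothetical helper, day_interval; it assumes fixed offsets in whole hours, so every day lasts exactly 24 hours here.

```python
from datetime import datetime, timedelta, timezone


def day_interval(year, month, day, offset_hours):
    """Return the (start, end) instants, in UTC, of a day at a fixed offset."""
    tz = timezone(timedelta(hours=offset_hours))
    start = datetime(year, month, day, tzinfo=tz)
    end = start + timedelta(days=1)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


a = day_interval(2001, 10, 26, +12)  # 2001-10-26+12:00
b = day_interval(2001, 10, 25, -12)  # 2001-10-25-12:00
c = day_interval(2001, 10, 26, 0)    # 2001-10-26Z

same_interval = (a == b)       # both start at 2001-10-25T12:00:00Z
different_interval = (a != c)  # 2001-10-26Z starts twelve hours later
```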

Valid values for xs:date include:

2001-10-26
2001-10-26+02:00
2001-10-26Z
2001-10-26+00:00
-2001-10-26
-20000-04-01

The following values are invalid:

2001-10 (all the parts must be specified)
2001-10-32 (the days part (32) is out of range)
2001-13-26+02:00 (the month part (13) is out of range)
01-10-26 (the century part is missing)

xs:date represents a day identified by a Gregorian calendar date (and could have been called “gYearMonthDay”). xs:gYearMonth (“g” for Gregorian) is a Gregorian calendar month and xs:gYear is a Gregorian calendar year. These three datatypes are fixed periods of time, and an optional time zone may be specified for each of them. The only real differences between them are their length (1 day, 1 month, and 1 year) and their format (i.e., their lexical spaces).

The format of xs:gYearMonth is the format of xs:date without the day part. Valid values for xs:gYearMonth include:

2001-10
2001-10+02:00
2001-10Z
2001-10+00:00
-2001-10
-20000-04

The following values are invalid:

2001 (the month part is missing)
2001-13 (the month part is out of range)
2001-13-26+02:00 (the month part is out of range)
01-10 (the century part is missing)

The format of xs:gYear is the format of xs:gYearMonth without the month part. Valid values for xs:gYear include:

2001
2001+02:00
2001Z
2001+00:00
-2001
-20000

The following values are invalid:

01 (the century part is missing)
2001-13 (a month part is not allowed, and 13 would be out of range in any case)

This support of time periods is very restrictive: these periods can only match the Gregorian calendar day, month, or year, and cannot have an arbitrary length or start time.

Recurring point in time: xs:time

The lexical space of xs:time is identical to the time part of xs:dateTime. An xs:time value represents a point in time that recurs every day; the meaning of 01:20:15 is “the point in time recurring each day at 01:20:15 am.” Like xs:date and xs:dateTime, xs:time accepts an optional time zone definition. The same issue arises when comparing times with and without time zones.

Valid values for xs:time include:

21:32:52
21:32:52+02:00
19:32:52Z
19:32:52+00:00
21:32:52.12679

The following values are invalid:

21:32 (all the parts must be specified)
25:25:10 (the hour part is out of range)
-10:00:00 (the hour part is out of range)
1:20:10 (all the digits must be supplied)

This support of a recurring point in time is also very limited: the recurrence period must be a Gregorian calendar day and cannot be arbitrary.

Recurring period of time: xs:gDay, xs:gMonth, and xs:gMonthDay

We have already seen points in time and periods, as well as recurring points in time. This wouldn’t be complete without a description of recurring periods. W3C XML Schema supports three predefined recurring periods corresponding to Gregorian calendar months (recurring every year) and days (recurring each month or year). The support of recurring periods is restricted both in terms of recurrence (the recurrence period can only be a Gregorian calendar year or month) and period (the start time can only be a Gregorian calendar day or month, and the duration can only be a Gregorian calendar month or day).

xs:gDay is a period of a Gregorian calendar day recurring each Gregorian calendar month. The lexical representation of xs:gDay is ---DD with an optional time zone specification. Valid values for xs:gDay include:

---01
---01Z
---01+02:00
---01-04:00
---15
---31

The following values are invalid:

--30- (the format must be “---DD”)
---35 (the day is out of range)
---5 (all the digits must be supplied)
15 (missing the leading “---”)

The rules of arithmetic between dates and durations apply in this case, and days are “pinned” in the range for each month. For the value ---31, the selected dates will be January 31st, February 28th (or 29th), March 31st, April 30th, etc.
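The “pinning” of days can be illustrated with a few lines of Python. The function name gday_occurrences is hypothetical, and the sketch simply clamps the requested day to the last day of each month of a given year, which is one way to read the rule stated above.

```python
import calendar
from datetime import date
from typing import List


def gday_occurrences(day: int, year: int) -> List[date]:
    """List the dates selected by an xs:gDay value in each month of a year."""
    result = []
    for month in range(1, 13):
        # monthrange returns (weekday of first day, number of days in month).
        last_day = calendar.monthrange(year, month)[1]
        result.append(date(year, month, min(day, last_day)))
    return result
```

Calling gday_occurrences(31, 2001) starts with January 31st, February 28th, March 31st, and April 30th; for a leap year such as 2004, the February entry becomes the 29th.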

xs:gMonthDay is a period of a Gregorian calendar day recurring each Gregorian calendar year. The lexical representation of xs:gMonthDay is --MM-DD with an optional time zone specification. Valid values for xs:gMonthDay include:

--05-01
--11-01Z
--11-01+02:00
--11-01-04:00
--11-15
--02-29

The following values are invalid:

-01-30- (the format must be --MM-DD)
--01-35 (the day part is out of range)
--1-5 (one part is missing)
01-15 (the leading -- is missing)

xs:gMonth is a period of a Gregorian calendar month recurring each Gregorian calendar year. The lexical representation of xs:gMonth defined in the Recommendation is --MM-- with an optional time zone specification. The W3C XML Schema Working Group has acknowledged that this was an error and that the format --MM defined by ISO 8601 should be used instead. It has not been decided yet if the format described in the Recommendation will be forbidden or only deprecated, but it is advised to use the format --MM (assuming that the tools you are using already support it). Valid values for xs:gMonth include:

--05
--11Z
--11+02:00
--11-04:00
--02

The following values are invalid:

-01- (the format must be --MM)
--13 (the month is out of range)
--1 (both digits must be provided)
01 (the leading -- is missing)

xs:duration

Naive programmers who think that the concept of duration is simple should read the Recommendation, which states: “xs:duration is defined as a six-dimensional space!” Mathematicians would object that this is not absolutely true, since most of the axes of these dimensions are parallel, but the fact is that when these programmers say that a development will last one month and 3 days, they define a duration that lasts between 31 and 34 days. The attempt of W3C XML Schema to deal with these issues on top of ISO 8601 has introduced a degree of indeterminacy in the comparisons between durations.

The lexical space of xs:duration is the format defined by ISO 8601 under the form PnYnMnDTnHnMnS, in which each capital letter is a delimiter that can be omitted, together with its number, when the corresponding member is not used. An important difference from the format used for xs:dateTime is that none of these members is mandatory and none of them is restricted to a range. This gives flexibility to choose the units that will be used and to combine several of them: for instance, P1Y2MT123S (1 year, 2 months, and 123 seconds). This flexibility has a price; such a duration is not completely defined: a year may have 365 or 366 days, and a period of two months lasts between 59 and 62 days. Durations cannot always be compared, and the order between durations is partial. We will see, in the next chapter, that user-defined datatypes can be derived from xs:duration, which can restrict the components used to express durations and ensure that these indeterminations do not happen.

Since the value of a duration is fixed as soon as you give it a starting point, the schema Working Group has identified four datetimes:

1696-09-01T00:00:00Z
1697-02-01T00:00:00Z
1903-03-01T00:00:00Z
1903-07-01T00:00:00Z

These cause the greatest deviations when durations mixing day, month, and other components are added. The Working Group has determined that the comparison of durations is undefined if—and only if—the result of the comparison is different when each of these dates is used as a starting point.
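The comparison rule based on these four starting points can be sketched in Python as follows. This is a simplified model, not the algorithm of the Recommendation word for word: a duration is represented by a hypothetical (months, seconds) pair, the helper names add and compare_durations are assumptions, and negative durations are not handled.

```python
import calendar
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

Duration = Tuple[int, float]  # (months, seconds): simplified model

# The four starting points listed above.
REFERENCE_STARTS = [
    datetime(1696, 9, 1, tzinfo=timezone.utc),
    datetime(1697, 2, 1, tzinfo=timezone.utc),
    datetime(1903, 3, 1, tzinfo=timezone.utc),
    datetime(1903, 7, 1, tzinfo=timezone.utc),
]


def add(start: datetime, duration: Duration) -> datetime:
    """Add the months first (pinning the day), then the seconds."""
    months, seconds = duration
    index = start.year * 12 + (start.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    moved = start.replace(year=year, month=month, day=day)
    return moved + timedelta(seconds=seconds)


def compare_durations(a: Duration, b: Duration) -> Optional[int]:
    """Return -1, 0, or 1, or None if the starting points disagree."""
    results = set()
    for start in REFERENCE_STARTS:
        x, y = add(start, a), add(start, b)
        results.add((x > y) - (x < y))
    return results.pop() if len(results) == 1 else None
```

In this model, one month, written (1, 0), and thirty days, written (0, 30 * 86400), cannot be ordered: they are equal when counted from 1696-09-01 but differ when counted from 1697-02-01. One month is, however, always longer than 27 days and always shorter than 32 days.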

Valid values for xs:duration include:

PT1004199059S
PT130S
PT2M10S
P1DT2S
-P1Y
P1Y2M3DT5H20M30.123S

The following values are invalid:

1Y (the leading P is missing)
P1S (the T separator is missing)
P-1Y (all parts must be positive)
P1M2Y (the parts order is significant and Y must precede M)
P1Y-1M (all parts must be positive)
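The lexical constraints shown by these examples (leading P, fixed order of the components, T before any time component, sign allowed only in front of the whole value) can be captured by a regular expression. The Python sketch below uses hypothetical names, DURATION_RE and looks_like_xs_duration, and checks only the lexical form; it says nothing about whether two durations can be compared.

```python
import re

DURATION_RE = re.compile(
    r"-?P(?=\d|T\d)"                            # P must be followed by a component
    r"(\d+Y)?(\d+M)?(\d+D)?"                    # date part: years, months, days
    r"(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?"  # time part, introduced by T
)


def looks_like_xs_duration(text: str) -> bool:
    """Return True if text has the lexical form PnYnMnDTnHnMnS."""
    return DURATION_RE.fullmatch(text) is not None
```

For example, looks_like_xs_duration("P1Y2M3DT5H20M30.123S") is true, while P1S is rejected because the seconds component is not introduced by T, and P1M2Y is rejected because the components are out of order.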