The datatypes covered in this section are shown in Figure 4-4.
The W3C Recommendation, “XML Schema Part 2: Datatypes,” provides new confirmation of how difficult it is to fix time.
The support for date and time datatypes relies entirely on a subset of the ISO 8601 standard, which is the only format supported by W3C XML Schema. The purpose of ISO 8601 is to eliminate the risk of confusion between the various date and time formats used in different countries. In other words, W3C XML Schema does not support these local date and time formats, and imposes the use of ISO 8601 for any datatype that has the semantics of a date or time. While this is a good thing for interchange formats, it is more questionable when XML is used to define user interfaces, since we will see that ISO 8601 is not very user friendly. The variations using the names of the months or different orders of year, month, and day are not the only victims of this decision: ISO 8601 imposes the use of the Gregorian (Christian) calendar to the exclusion of calendars used by other cultures or religions.
ISO 8601 describes several formats to define dates, times, periods, and recurring dates, with different levels of precision and indeterminacy. After many discussions, W3C XML Schema selected a subset of these formats and created a primitive datatype for each format that is supported.
The indeterminacy allowed in some of these formats adds a lot of difficulty, especially when comparisons or arithmetic are involved. For instance, it is possible to define a point in time without specifying the time zone, which is then considered undetermined. This undetermined time zone is identical all over the document (and between the schema and the instance documents) and it’s not an issue to compare two datetimes without a time zone. The problem arises when you need to compare two points in time, one with a time zone and the other without. The result of this comparison will be undetermined if these values are too close, since one of them may be between -13 hours and +12 hours of Coordinated Universal Time (UTC). Thus, the support of these datetime datatypes introduces a notion of “partial order relation.”
Another caveat with ISO 8601 is that time zones are only supported through the time difference from UTC, which ignores the notion of summer time. For instance, if an application working in the United Kingdom, where British Summer Time (BST) applies, wants to specify the time zone (and we have seen that this is necessary to be able to compare datetimes), the application needs to know whether a date is in summer (the time zone will be one hour ahead of UTC) or in winter (the time zone will then be UTC). ISO 8601 ignores the “named time zones” that account for daylight saving time (such as PST, BST, or WET) and that we use in our day-to-day life; omitting the time zone can be seen as a somewhat dangerous shortcut to specify that a datetime is in your “local time,” whatever it is.
xs:dateTime
The xs:dateTime datatype defines a “specific instant of time.” This is a subset of what ISO 8601 calls a “moment of time.” Its lexical value follows the format “CCYY-MM-DDThh:mm:ss,” in which all the fields must be present and may optionally be preceded by a sign and leading figures, if needed, and followed by fractional digits for the seconds and a time zone. The time zone may be specified using the letter “Z,” which identifies UTC, or by the difference of time with UTC.
The value space of xs:dateTime is considered to be the moment of time itself. The time zone that defines the value (when there is one) is considered meaningless, which is a problem for some applications that complain that even though 2002-01-18T12:00:00+00:00 and 2002-01-18T11:00:00-01:00 refer to the same “moment of time,” they carry different time zone information, which should make its way into the value space.
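The point about the value space can be checked with a few lines of Python. This is only a sketch: the helper name parse_datetime is hypothetical, and Python's datetime module stands in for a schema processor. It handles only time-zoned values without fractional seconds.

```python
from datetime import datetime, timezone

def parse_datetime(text):
    """Parse a time-zoned xs:dateTime without fractional seconds into UTC."""
    # Older Python versions do not accept "Z" in fromisoformat, so it is
    # rewritten as an explicit zero offset first.
    parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    return parsed.astimezone(timezone.utc)

first = parse_datetime("2002-01-18T12:00:00+00:00")
second = parse_datetime("2002-01-18T11:00:00-01:00")

# Both lexical forms denote the same instant; the original offset is lost
# once the value is normalized to UTC.
print(first == second)
```

Normalizing to UTC mirrors the behavior described above: the instant survives, but the offset the author wrote does not.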
Valid values for xs:dateTime include:
2001-10-26T21:32:52
2001-10-26T21:32:52+02:00
2001-10-26T19:32:52Z
2001-10-26T19:32:52+00:00
-2001-10-26T21:32:52
2001-10-26T21:32:52.12679
The following values are invalid:
2001-10-26 (all the parts must be specified)
2001-10-26T21:32 (all the parts must be specified)
2001-10-26T25:32:52+02:00 (the hours part (25) is out of range)
01-10-26T21:32 (all the parts must be specified)
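The lexical rules illustrated by these examples can be approximated with a regular expression plus a few range checks. The following is a simplified sketch, not a conforming validator: the name is_valid_datetime is hypothetical, and the number of days in each month and the special case of 24:00:00 are deliberately ignored.

```python
import re

# Optional sign, at least four year digits, fixed-width fields, optional
# fractional seconds, optional time zone (Z or +hh:mm / -hh:mm).
DATETIME_RE = re.compile(
    r"^-?(\d{4,})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(\.\d+)?"
    r"(Z|[+-]\d{2}:\d{2})?$"
)

def is_valid_datetime(text):
    """Return True if text looks like a valid xs:dateTime (simplified check)."""
    match = DATETIME_RE.match(text)
    if match is None:
        return False
    month, day, hour, minute, second = (int(g) for g in match.groups()[1:6])
    # Range checks only; month lengths are not taken into account here.
    return (1 <= month <= 12 and 1 <= day <= 31
            and hour <= 23 and minute <= 59 and second <= 59)

print(is_valid_datetime("2001-10-26T21:32:52+02:00"))
print(is_valid_datetime("2001-10-26T25:32:52+02:00"))
```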
Of the valid examples given above, three represent the same value:
2001-10-26T21:32:52+02:00
2001-10-26T19:32:52Z
2001-10-26T19:32:52+00:00
The first valid example (2001-10-26T21:32:52), which doesn’t include a time zone specification, is considered to have an indeterminate value between 2001-10-26T21:32:52-14:00 and 2001-10-26T21:32:52+14:00. Because of daylight saving time, this range is subject to national regulations and may change. The range was between -13:00 and +12:00 when the Recommendation was published, but the Working Group has kept a margin to accommodate possible changes in the regulations.
Despite the indeterminacy of the time zone when none is specified, the W3C XML Schema Recommendation considers that the values of datetimes without time zones implicitly refer to the same undetermined time zone and can be compared with each other. While this is fine for “local” applications that operate in a single time zone, it is a source of potential confusion and errors for worldwide applications, or even for applications that calculate a duration between moments that fall on either side of a daylight saving time change within a single time zone.
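The partial order described above can be sketched as follows. The function names are hypothetical; naive Python datetimes stand in for values without a time zone, and the 14-hour margin is the one quoted in the text. The function returns None when the order cannot be determined.

```python
from datetime import datetime, timedelta, timezone

MAX_OFFSET = timedelta(hours=14)

def to_utc_range(value):
    """Return the (earliest, latest) UTC instants a datetime may denote."""
    if value.tzinfo is not None:
        instant = value.astimezone(timezone.utc)
        return instant, instant
    as_utc = value.replace(tzinfo=timezone.utc)
    # Local time at +14:00 is the earliest instant, at -14:00 the latest.
    return as_utc - MAX_OFFSET, as_utc + MAX_OFFSET

def compare(a, b):
    """Return -1, 0 or 1, or None when the order is indeterminate."""
    if a.tzinfo is None and b.tzinfo is None:
        # Both share the same unknown zone, so they compare directly.
        return (a > b) - (a < b)
    a_min, a_max = to_utc_range(a)
    b_min, b_max = to_utc_range(b)
    if a_max < b_min:
        return -1
    if a_min > b_max:
        return 1
    if a_min == a_max == b_min == b_max:
        return 0
    return None

zoned = datetime(2001, 10, 26, 19, 32, 52, tzinfo=timezone.utc)
local = datetime(2001, 10, 26, 21, 32, 52)
print(compare(zoned, local))  # None: the two values are too close
```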
xs:date, xs:gYearMonth, and xs:gYear
The lexical space of the xs:date datatype is identical to the date part of xs:dateTime. Like xs:dateTime, it accepts a time zone, which should always be specified to be able to compare two dates without ambiguity. As defined by W3C XML Schema, a date is a period of one day in its time zone, “independent of how many hours this day has.” The consequence of this definition is that two dates defined in different time zones cannot be equal unless they designate the same interval (2001-10-26+12:00 and 2001-10-25-12:00, for instance). Another consequence is that, as with xs:dateTime, the order relation between a date with a time zone and a date without a time zone is partial.
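The example of two dates designating the same interval can be verified with a short sketch. The helper date_interval is hypothetical, and it assumes a day of exactly 24 hours, which is a simplification of the definition quoted above.

```python
from datetime import datetime, timedelta, timezone

def date_interval(year, month, day, offset_hours):
    """Return the (start, end) UTC instants of a time-zoned xs:date."""
    zone = timezone(timedelta(hours=offset_hours))
    start = datetime(year, month, day, tzinfo=zone).astimezone(timezone.utc)
    # Assumption: the day lasts exactly 24 hours.
    return start, start + timedelta(days=1)

a = date_interval(2001, 10, 26, +12)  # 2001-10-26+12:00
b = date_interval(2001, 10, 25, -12)  # 2001-10-25-12:00
print(a == b)  # True: both start at 2001-10-25T12:00:00Z
```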
Valid values for xs:date include:
2001-10-26
2001-10-26+02:00
2001-10-26Z
2001-10-26+00:00
-2001-10-26
-20000-04-01
The following values are invalid:
2001-10 (all the parts must be specified)
2001-10-32 (the days part (32) is out of range)
2001-13-26+02:00 (the month part (13) is out of range)
01-10-26 (the century part is missing)
xs:date represents a day identified by a Gregorian calendar date (and could have been called “gYearMonthDay”). xs:gYearMonth (“g” for Gregorian) is a Gregorian calendar month and xs:gYear is a Gregorian calendar year. These three datatypes are fixed periods of time, and an optional time zone may be specified for each of them. The only real differences between them are their length (1 day, 1 month, and 1 year) and their format (i.e., their lexical spaces).
The format of xs:gYearMonth is the format of xs:date without the day part. Valid values for xs:gYearMonth include:
2001-10
2001-10+02:00
2001-10Z
2001-10+00:00
-2001-10
-20000-04
The following values are invalid:
2001 (the month part is missing)
2001-13 (the month part is out of range)
2001-13-26+02:00 (the month part is out of range)
01-10 (the century part is missing)
The format of xs:gYear is the format of xs:gYearMonth without the month part. Valid values for xs:gYear include:
2001
2001+02:00
2001Z
2001+00:00
-2001
-20000
The following values are invalid:
01 (the century part is missing)
2001-13 (xs:gYear has no month part)
This support of time periods is very restrictive: these periods can only match the Gregorian calendar day, month, or year, and cannot have an arbitrary length or start time.
xs:time
The lexical space of xs:time is identical to the time part of xs:dateTime. An xs:time value represents a point in time that recurs every day; the meaning of 01:20:15 is “the point in time recurring each day at 01:20:15 am.” Like xs:date and xs:dateTime, xs:time accepts an optional time zone definition. The same issue arises when comparing times with and without time zones.
Although 01:20:15 is commonly used to represent a duration of 1 hour, 20 minutes, and 15 seconds, a different format has been chosen to represent durations.
Valid values for xs:time include:
21:32:52
21:32:52+02:00
19:32:52Z
19:32:52+00:00
21:32:52.12679
The following values are invalid:
21:32 (all the parts must be specified)
25:25:10 (the hour part is out of range)
-10:00:00 (the hour part is out of range)
1:20:10 (all the digits must be supplied)
This support of a recurring point in time is also very limited: the recurrence period must be a Gregorian calendar day and cannot be arbitrary.
xs:gDay, xs:gMonth, and xs:gMonthDay
We have already seen points in time and periods, as well as recurring points in time. This wouldn’t be complete without a description of recurring periods. W3C XML Schema supports three predefined recurring periods corresponding to Gregorian calendar months (recurring every year) and days (recurring each month or year). The support of recurring periods is restricted both in terms of recurrence (the recurrence period can only be a Gregorian calendar year or month) and period (the start time can only be a Gregorian calendar day or month, and the duration can only be a Gregorian calendar month or day).
xs:gDay is a period of a Gregorian calendar day recurring each Gregorian calendar month. The lexical representation of xs:gDay is ---DD with an optional time zone specification. Valid values for xs:gDay include:
---01
---01Z
---01+02:00
---01-04:00
---15
---31
The following values are invalid:
--30- (the format must be “---DD”)
---35 (the day is out of range)
---5 (all the digits must be supplied)
15 (missing the leading “---”)
The rules of arithmetic between dates and durations apply in this case, and days are “pinned” in the range for each month. For the value ---31 in our example, the selected dates will be January 31st, February 28th (or 29th), March 31st, April 30th, etc.
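The pinning rule can be illustrated with a small sketch. The function gday_occurrences is hypothetical; it simply clamps the requested day to the last day of each month of a given year.

```python
import calendar
from datetime import date

def gday_occurrences(day, year):
    """Return the twelve dates selected by an xs:gDay in the given year."""
    dates = []
    for month in range(1, 13):
        last_day = calendar.monthrange(year, month)[1]
        # Pin the day to the last day of shorter months.
        dates.append(date(year, month, min(day, last_day)))
    return dates

for occurrence in gday_occurrences(31, 2001)[:4]:
    print(occurrence.isoformat())
```

For the year 2001, the first four dates are January 31st, February 28th, March 31st, and April 30th, matching the list above.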
xs:gMonthDay is a period of a Gregorian calendar day recurring each Gregorian calendar year. The lexical representation of xs:gMonthDay is --MM-DD with an optional time zone specification. Valid values for xs:gMonthDay include:
--05-01
--11-01Z
--11-01+02:00
--11-01-04:00
--11-15
--02-29
The following values are invalid:
-01-30- (the format must be --MM-DD)
--01-35 (the day part is out of range)
--1-5 (all the digits must be supplied)
01-15 (the leading -- is missing)
xs:gMonth is a period of a Gregorian calendar month recurring each Gregorian calendar year. The lexical representation of xs:gMonth defined in the Recommendation is --MM-- with an optional time zone specification. The W3C XML Schema Working Group has acknowledged that this was an error and that the format --MM defined by ISO 8601 should be used instead. It has not yet been decided whether the format described in the Recommendation will be forbidden or only deprecated, but it is advised to use the format --MM (assuming that the tools you are using already support it). Valid values for xs:gMonth include:
--05
--11Z
--11+02:00
--11-04:00
--02
The following values are invalid:
-01- (the format must be --MM)
--13 (the month is out of range)
--1 (both digits must be provided)
01 (the leading -- is missing)
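The three recurring-period formats are close enough that they can be checked with one family of patterns. This is a sketch under stated assumptions: the function names are hypothetical, xs:gMonth is checked against the corrected --MM form rather than --MM--, and the day of an xs:gMonthDay is validated against a leap year so that --02-29 is accepted.

```python
import calendar
import re

TZ = r"(?:Z|[+-]\d{2}:\d{2})?"
G_DAY = re.compile(r"^---(\d{2})" + TZ + "$")
G_MONTH = re.compile(r"^--(\d{2})" + TZ + "$")
G_MONTH_DAY = re.compile(r"^--(\d{2})-(\d{2})" + TZ + "$")

def is_g_day(text):
    match = G_DAY.match(text)
    return match is not None and 1 <= int(match.group(1)) <= 31

def is_g_month(text):
    match = G_MONTH.match(text)
    return match is not None and 1 <= int(match.group(1)) <= 12

def is_g_month_day(text):
    match = G_MONTH_DAY.match(text)
    if match is None:
        return False
    month, day = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        return False
    # Use a leap year (2000) so that --02-29 is accepted.
    return 1 <= day <= calendar.monthrange(2000, month)[1]

print(is_g_day("---31"), is_g_month("--11-04:00"), is_g_month_day("--02-29"))
```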
xs:duration
Naive programmers who think that the concept of duration is simple should read the Recommendation, which states: “xs:duration is defined as a six-dimensional space!” Mathematicians would object that this is not absolutely true since most of the axes of these dimensions are parallel, but the fact is that when these programmers say that a development will last one month and 3 days, they define a duration of between 31 and 34 days. The attempt of W3C XML Schema to deal with these issues on top of ISO 8601 has introduced a degree of indeterminacy in the comparisons between durations.
The lexical space of xs:duration is the format defined by ISO 8601 under the form PnYnMnDTnHnMnS, in which the capital letters are delimiters that can be omitted when the corresponding member is not used. An important difference from the format used for xs:dateTime is that none of these members is mandatory and none of them is restricted to a range. This gives the flexibility to choose the units that will be used and to combine several of them: for instance, P1Y2MT123S (1 year, 2 months, and 123 seconds).
This flexibility has a price; such a duration is not completely defined: a year may have 365 or 366 days, and a period of two months lasts between 59 and 62 days. Durations cannot always be compared and the order between durations is partial. We will see, in the next chapter, that user-defined datatypes can be derived from xs:duration, which can restrict the components used to express durations and ensure that these indeterminacies do not occur.
Since the value of a duration is fixed as soon as you give it a starting point, the schema Working Group has identified four datetimes:
1696-09-01T00:00:00Z
1697-02-01T00:00:00Z
1903-03-01T00:00:00Z
1903-07-01T00:00:00Z
These cause the greatest deviations when durations mixing day, month, and other components are added. The Working Group has determined that the comparison of two durations is undefined if, and only if, the result of the comparison differs depending on which of these dates is used as a starting point.
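The comparison rule can be sketched by adding each duration to the four starting points and checking whether the results agree. Everything named here is an assumption made for illustration: durations are represented as plain dictionaries of non-negative components, add_duration applies months before days and seconds, and negative durations are not handled.

```python
import calendar
from datetime import datetime, timedelta

# The four reference starting points listed above (UTC, kept naive here).
REFERENCES = [
    datetime(1696, 9, 1),
    datetime(1697, 2, 1),
    datetime(1903, 3, 1),
    datetime(1903, 7, 1),
]

def add_duration(start, years=0, months=0, days=0, seconds=0):
    """Add a duration to start: months first, then days and seconds."""
    total = start.year * 12 + (start.month - 1) + years * 12 + months
    year, month = divmod(total, 12)
    month += 1
    # Pin the day to the length of the target month.
    day = min(start.day, calendar.monthrange(year, month)[1])
    shifted = start.replace(year=year, month=month, day=day)
    return shifted + timedelta(days=days, seconds=seconds)

def compare_durations(a, b):
    """Return -1, 0 or 1, or None when the four results disagree."""
    results = set()
    for start in REFERENCES:
        end_a, end_b = add_duration(start, **a), add_duration(start, **b)
        results.add((end_a > end_b) - (end_a < end_b))
    return results.pop() if len(results) == 1 else None

print(compare_durations({"months": 1}, {"days": 30}))  # None: undefined
print(compare_durations({"months": 1}, {"days": 27}))  # 1: always longer
```

One month is equal to 30 days when counted from 1696-09-01 but shorter when counted from 1697-02-01, so the comparison is undefined; against 27 days, one month is longer from every starting point.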
Valid values for xs:duration include:
PT1004199059S
PT130S
PT2M10S
P1DT2S
-P1Y
P1Y2M3DT5H20M30.123S
The following values are invalid:
1Y (the leading P is missing)
P1S (the T separator is missing)
P-1Y (all parts must be positive)
P1M2Y (the parts order is significant and Y must precede M)
P1Y-1M (all parts must be positive)
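The lexical rules for xs:duration shown by these examples can be captured in a short parser. This is a sketch, not a conforming implementation: the name parse_duration and the dictionary it returns are hypothetical choices made for illustration.

```python
import re

DURATION_RE = re.compile(
    r"^(-)?P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?"
    r"(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$"
)
FIELDS = ("years", "months", "days", "hours", "minutes", "seconds")

def parse_duration(text):
    """Return a dict of components, or None if text is not a valid duration."""
    match = DURATION_RE.match(text)
    if match is None:
        return None
    sign, *parts = match.groups()
    # "P" alone and a dangling "T" (as in "P1YT") are not allowed.
    if all(p is None for p in parts) or text.endswith("T"):
        return None
    result = {name: float(p) if name == "seconds" else int(p)
              for name, p in zip(FIELDS, parts) if p is not None}
    result["negative"] = sign == "-"
    return result

print(parse_duration("P1Y2MT123S"))
print(parse_duration("P1M2Y"))  # None: Y must precede M
```

Only the leading sign may be negative, and the fixed order of the optional groups in the pattern is what rejects values such as P1M2Y.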