So far, we have used only predefined datatypes. In this chapter, we will see how to create new simple types, taking advantage of the different derivation mechanisms and facets of derivation by restriction.
W3C XML Schema defines three independent and complementary mechanisms for creating our own custom datatypes, using existing datatypes as starting points. Building a new user datatype upon an existing predefined datatype, or upon another user datatype, is called “derivation.”
The three derivation methods are derivation by restriction (where constraints are added to a datatype without changing its original semantics or meaning), derivation by list (where new datatypes are defined as lists of values belonging to a datatype and take on the semantics of list datatypes), and derivation by union (where new datatypes are defined as allowing values from a set of other datatypes and lose most of their semantics).
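The three methods can be sketched side by side as follows. This is only a minimal illustration: the type names (percentage, percentageList, and percentageOrDate) are hypothetical and are not used elsewhere in this chapter.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Derivation by restriction: integers limited to the range 0..100 -->
  <xs:simpleType name="percentage">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Derivation by list: percentages separated by whitespace, e.g. "10 25 100" -->
  <xs:simpleType name="percentageList">
    <xs:list itemType="percentage"/>
  </xs:simpleType>

  <!-- Derivation by union: a value is valid if it is a percentage or an xs:date -->
  <xs:simpleType name="percentageOrDate">
    <xs:union memberTypes="percentage xs:date"/>
  </xs:simpleType>
</xs:schema>
```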
As with xs:complexType definitions (which we saw in our Russian doll design), xs:simpleType definitions can be either named or anonymous. Despite this similarity, simple and complex types are very different. A simple type is a restriction on the value of an element or an attribute (i.e., a constraint on the content of a set of documents), while a complex type is a definition of a content model (i.e., a constraint on the markup). This is why the derivation methods for simple and complex types are very different, even though W3C XML Schema uses the same element name (xs:restriction) for both. This is a common source of confusion.
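To make the contrast concrete, here is a hedged sketch in which the same xs:restriction element name appears in both roles. The type names (shortString, person, and personWithoutNickname) are hypothetical. In the simple type, the restriction constrains a value through a facet; in the complex type, it constrains markup by restating a narrower content model.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Simple type: xs:restriction holds facets that constrain the value -->
  <xs:simpleType name="shortString">
    <xs:restriction base="xs:string">
      <xs:maxLength value="10"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Complex type used as a base: a name and an optional nickname -->
  <xs:complexType name="person">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="nickname" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Complex type: xs:restriction restates the content model,
       here without the optional nickname element -->
  <xs:complexType name="personWithoutNickname">
    <xs:complexContent>
      <xs:restriction base="person">
        <xs:sequence>
          <xs:element name="name" type="xs:string"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```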
These derivation methods are flexible and powerful. However, the fact that W3C XML Schema needs so many different primitive datatypes can be seen as proof that derivation alone is not sufficient to create a new primitive datatype. The reason is that the derivation methods act only on the value space or on the lexical space (as defined in Chapter 4); they cannot modify the relations between these two spaces, nor create new value or lexical spaces. This subject was debated by the W3C XML Schema Working Group, which did not reach an agreement on a way to define an abstract datatype system that would allow the definition of several lexical representations. The most obvious consequence of this decision is that, despite the protests of the W3C I18N Working Group, W3C XML Schema doesn’t allow the definition of localized decimal or date datatypes.
Restriction is probably the most commonly used and natural derivation method. Datatypes are created by restriction by adding new constraints to the possible values. W3C XML Schema itself uses derivation by restriction to define most of the derived predefined datatypes, such as xs:positiveInteger, which is derived by restriction (indirectly, through xs:nonNegativeInteger) from xs:integer. The restrictions can be defined along different aspects or axes that W3C XML Schema calls “facets.”
A derivation by restriction is done using an xs:restriction element, and each facet is defined using a specific element embedded in the xs:restriction element. The datatype on which the restriction is applied is called the base datatype. It can either be referenced through a base attribute or be defined inline in the xs:restriction element. Here is the first form:
<xs:simpleType name="myInteger">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="-2"/>
    <xs:maxExclusive value="5"/>
  </xs:restriction>
</xs:simpleType>
It can also be defined in two steps using an embedded anonymous xs:simpleType definition:
<xs:simpleType name="myInteger">
  <xs:restriction>
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:maxExclusive value="5"/>
      </xs:restriction>
    </xs:simpleType>
    <xs:minInclusive value="-2"/>
  </xs:restriction>
</xs:simpleType>
The
xs:minInclusive
and
xs:maxExclusive
elements are two
facets that can be applied to an integer datatype. As can be guessed
from their names, they specify the minimum inclusive (i.e., that can
be reached) and maximum exclusive (i.e., that is not allowed) values.
We will introduce the list of facets in the next section. Depending
on the facet, each acts directly either on the value space or on the
lexical space of the datatype, and the same facet may have different
effects depending on the datatype on which it is applied.
Whatever facet is applied to a datatype, the semantics of its primitive type are unchanged, and the list of facets that can be applied cannot be extended. One must therefore be careful to choose, whenever possible, a datatype whose primitive type matches the purpose of the node in which it will be used. For instance, while it is possible to constrain a string datatype to match non-ISO 8601 dates using patterns, this solution should be used only when absolutely required: the datatype would still be considered a string and would lack facets such as xs:minInclusive or xs:maxExclusive, which are defined on date datatypes but have no meaning (for W3C XML Schema) on a string.
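The trade-off can be sketched with two hypothetical datatypes (the names frenchDateString and dateIn2001 and the dd/mm/yyyy format are assumptions made for this illustration). The first only checks the shape of the string; the second is a real date and can be bounded.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Still a string: the pattern checks the shape dd/mm/yyyy only,
       so "31/02/2001" is accepted and no date bounds can be added -->
  <xs:simpleType name="frenchDateString">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{2}/[0-9]{2}/[0-9]{4}"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- A real date: impossible dates are rejected by the base type,
       and the bounding facets are available -->
  <xs:simpleType name="dateIn2001">
    <xs:restriction base="xs:date">
      <xs:minInclusive value="2001-01-01"/>
      <xs:maxExclusive value="2002-01-01"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```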
The impact of the “right” choice of base datatype, with semantics as close as possible to its actual usage in the instance documents, will become more critical when W3C XML Schema-aware applications become available. Such applications will behave differently depending on the datatype information found in the PSVI, and a “wrong” choice will have side effects. For instance, the first drafts of XPath 2.0 propose to interpret values according to predefined datatypes, so that the results of equality tests on values or the sort orders would depend on the datatypes.
Before we start looking at the list of facets, we’ll discuss the way they work. They may be classified into three categories: xs:whiteSpace defines the whitespace processing that happens between the parser and lexical spaces (but can be used only on xs:string and xs:normalizedString); xs:pattern works on the lexical space; and all the other facets constrain the value space. The availability of the facets and their effect depend on the datatype on which they are applied. We will see them in the context of groups of datatypes sharing the same set of facets.
These datatypes share the fact that they are character strings (even though technically W3C XML Schema doesn’t consider all of them as derived from the xs:string datatype) and that whitespaces are collapsed before validation. As defined in the Recommendation, “all occurrences of #x9 (tab), #xA (line feed), and #xD (carriage return) are replaced with #x20 (space) and then, contiguous sequences of #x20s are collapsed to a single #x20, and initial and/or final #x20s are deleted.”
Those datatypes are:
xs:ENTITY
,
xs:ID
,
xs:IDREF
,
xs:language
,
xs:Name
,
xs:NCName
,
xs:NMTOKEN
,
xs:token
,
xs:anyURI
,
xs:base64Binary
,
xs:hexBinary
,
xs:NOTATION
, and
xs:QName
. Their facets are explained in the
next section:
xs:enumeration
allows definition of a list of
possible values. Here’s an example:
<xs:simpleType name="schemaRecommendations"> <xs:restriction base="xs:anyURI"> <xs:enumeration value="http://www.w3.org/TR/xmlschema-0/"/> <xs:enumeration value="http://www.w3.org/TR/xmlschema-1/"/> <xs:enumeration value="http://www.w3.org/TR/xmlschema-2/"/> </xs:restriction> </xs:simpleType>
This facet constrains the value space. For most of the string (and assimilated) datatypes, the lexical and value spaces are identical and this doesn’t make any difference; however, it does make a difference for xs:anyURI, xs:base64Binary, and xs:QName. For instance, "http://dmoz.org/World/Français/" and "http://dmoz.org/World/Fran%c3%a7ais/" would be considered equal for xs:anyURI, the line breaks would be ignored for xs:base64Binary, and the match would be done on the tuples {namespace URI, local name} for xs:QName, ignoring the prefix used in the schema and instance documents.
One should also note that xs:anyURI datatypes are not “absolutized” by W3C XML Schema and do not support xml:base. This means that if the “schemaRecommendations” datatype defined in the previous example is assigned to an XLink href attribute, it will fail to validate the following instance element:
<a xml:base="http://www.w3.org/TR/" href="xmlschema-1/"> XML Schema Part 2: Datatypes </a>
We cannot leave this section without discussing xs:NOTATION. This datatype is the only case of a predefined datatype that cannot be used directly in a schema and must be used through derived types specifying a set of xs:enumeration facets. Even though notations are very seldom used in real-life applications, this book wouldn’t be complete without at least one example of notations. Let’s take the usual example of a picture that uses a notation in an attribute to qualify the content of a binary field:
<?xml version="1.0"?>
<picture type="png">
  iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAIAAAACUFjqAAAABmJLR0QA/wD/AP+gvaeTAAAA
  CXBIWXMAAAsSAAALEgHS3X78AAAAB3RJTUUH0QofESYx2JhwGwAAAFZJREFUeNqlj8ENwDAI
  A6HqGDCWp2QQ2AP2oI9IbaQm/dRPn9EJ7m7a56DPPDgiIoKIzGyBM9Pdx+4ueXabWVUBEJHR
  nLNJVbfuqspMAEOxwO9r/vX3BTEnKRXtqqslAAAAAElFTkSuQmCC
</picture>
The schema might be written as (note how the notations need to be
declared in the schema to be used in an
xs:enumeration
facet):
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:notation name="jpeg" public="image/jpeg" system="file:///usr/bin/xv"/>
  <xs:notation name="gif" public="image/gif" system="file:///usr/bin/xv"/>
  <xs:notation name="png" public="image/png" system="file:///usr/bin/xv"/>
  <xs:notation name="svg" public="image/svg" system="file:///usr/bin/xsmiles"/>
  <xs:notation name="pdf" public="application/pdf" system="file:///usr/bin/acroread"/>
  <xs:simpleType name="graphicalFormat">
    <xs:restriction base="xs:NOTATION">
      <xs:enumeration value="jpeg"/>
      <xs:enumeration value="gif"/>
      <xs:enumeration value="png"/>
      <xs:enumeration value="svg"/>
      <xs:enumeration value="pdf"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="picture">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:base64Binary">
          <xs:attribute name="type" type="graphicalFormat"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
xs:length
defines a fixed length measured in number
of characters (general case) or bytes
(xs:hexBinary
and
xs:base64Binary
):
<xs:simpleType name="standardNotations"> <xs:restriction base="xs:NOTATION"> <xs:length value="8"/> </xs:restriction> </xs:simpleType>
This facet also constrains the value space. For xs:anyURI, the result may be difficult to predict since the length is checked after the character normalization. For xs:QName, this is even worse, since the W3C XML Schema Recommendation does not define the length of an xs:QName tuple. Fortunately, in practice, constraining the length of these datatypes doesn’t seem to be very useful, and it’s a good idea to avoid using these constraints on them. The same caveat applies to the next two facets.
xs:maxLength
defines
a maximum length measured in number of characters (general case) or
bytes (xs:hexBinary
and
xs:base64Binary
):
<xs:simpleType name="binaryImage"> <xs:restriction base="xs:hexBinary"> <xs:maxLength value="1024"/> </xs:restriction> </xs:simpleType>
xs:minLength
defines
a minimum length measured in number of characters (general case) or
bytes (xs:hexBinary and xs:base64Binary):
<xs:simpleType name="longName"> <xs:restriction base="xs:NCName"> <xs:minLength value="6"/> </xs:restriction> </xs:simpleType>
xs:pattern
defines a pattern that must be matched
by the string (we will explore patterns in more detail in the next
chapter):
<xs:simpleType name="httpURI"> <xs:restriction base="xs:anyURI"> <xs:pattern value="http://.*"/> </xs:restriction> </xs:simpleType>
Several pattern facets can be defined in a single derivation step. They are then merged together through a logical “or” (a value will match the restricted datatype if it matches one of the patterns).
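As a short sketch of this “or” behavior, the following hypothetical webURI datatype (the name and the second pattern are assumptions made for this illustration) accepts a URI that matches either of its two patterns:

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Two patterns in the same derivation step: a value is valid
       if it matches the first pattern OR the second one -->
  <xs:simpleType name="webURI">
    <xs:restriction base="xs:anyURI">
      <xs:pattern value="http://.*"/>
      <xs:pattern value="https://.*"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

With this definition, “http://www.w3.org/” and “https://www.w3.org/” are both accepted, while “ftp://ftp.w3.org/” is rejected.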
The
whitespaces of these other strings are
not collapsed before validation, and a new facet (
xs:whiteSpace
) is available, in addition to the
facets just described, to specify the treatment to apply on
whitespaces for the user-defined datatypes derived from them.
Those
datatypes
are:
xs:normalizedString
and
xs:string
.
xs:whiteSpace
defines
the way to handle whitespaces—i.e., #x20 (space), #x9 (tab),
#xA (linefeed), and #xD (carriage return)—for this datatype:
<xs:simpleType name="CapitalizedNameWS"> <xs:restriction base="xs:string"> <xs:whiteSpace value="collapse"/> <xs:pattern value="([A-Z]([a-z]*) ?)+"/> </xs:restriction> </xs:simpleType>
The values of an xs:whiteSpace facet are “preserve” (whitespaces are kept unchanged), “replace” (all the instances of any whitespace are replaced with a space), and “collapse” (leading and trailing whitespaces are removed and all the other sequences of contiguous whitespaces are replaced by a single space). This facet is atypical since it specifies a treatment to be done on a value before applying any validation test on this value. In the earlier example, setting whitespace to “collapse” allows testing for a single space character in the pattern (“ ?”): because the whitespaces are collapsed before the pattern is tested, the pattern will match any number of whitespaces between the words in the instance document.
The whitespace behavior cannot be relaxed during a restriction: if a
datatype has a whitespace set as
“preserve,” its derived datatypes
can have any whitespace behavior; if its whitespace is set as
“replace,” its derived datatypes
can only have whitespace equal to
“replace” or
“collapse”; if its whitespace is
“collapse,” all its derived
datatypes must have the same behavior. This means
xs:string
is the only datatype that can be
used to derive datatypes without any whitespace processing and
xs:string
and
xs:normalizedString
are the only datatypes that can be
used to derive datatypes normalizing the whitespaces.
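The rule can be sketched with one valid and one invalid derivation. The type names are hypothetical, and the invalid definition is kept inside a comment so that the schema as a whole remains valid.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Valid: xs:normalizedString uses "replace", and moving to
       "collapse" is a stricter whitespace processing -->
  <xs:simpleType name="collapsedLine">
    <xs:restriction base="xs:normalizedString">
      <xs:whiteSpace value="collapse"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Invalid: xs:token uses "collapse", and "preserve" would relax it.
  <xs:simpleType name="preservedToken">
    <xs:restriction base="xs:token">
      <xs:whiteSpace value="preserve"/>
    </xs:restriction>
  </xs:simpleType>
  -->
</xs:schema>
```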
In practice, this facet isn’t really useful for user-defined datatypes since the whitespace processing largely dictates the choice of the predefined datatype to use. When we need a datatype that does no whitespace processing, we must use xs:string, with no xs:whiteSpace facet. When we need a datatype that normalizes the whitespaces, instead of using xs:string and applying an xs:whiteSpace facet, we can use xs:normalizedString directly, which has the same effect. When we need a datatype that collapses the whitespaces, we can use xs:token if it’s a string (since, again, xs:token is not a token in the usual meaning of the word but rather a “tokenized string”), as well as any nonstring datatype. The whitespace processing will already be set to “collapse” without any need to use xs:whiteSpace. The previous example is then equivalent to:
<xs:simpleType name="CapitalizedNameWS"> <xs:restriction base="xs:token"> <xs:pattern value="([A-Z]([a-z]*) ?)+"/> </xs:restriction> </xs:simpleType>
Technically speaking, the W3C Working Group hasn’t
“fixed” the
xs:whiteSpace
facet for
xs:token
and its derived datatypes. However,
xs:whiteSpace
has been set to
“collapse” for
xs:token
; since the facet
can’t be relaxed in further restriction, this value
cannot be changed in any datatype derived from these datatypes.
The facets of xs:double and xs:float are described in the next sections.
xs:enumeration
allows definition of a list of
possible values and operates on the value space—for example:
<xs:simpleType name="enumeration"> <xs:restriction base="xs:float"> <xs:enumeration value="-INF"/> <xs:enumeration value="1.618033989"/> <xs:enumeration value="3e3"/> </xs:restriction> </xs:simpleType>
This simple type will match literals such as:
<enumeration> 1.618033989 </enumeration> <enumeration> 3e3 </enumeration> <enumeration> 003000.0000 </enumeration>
This example shows (as we’ve briefly seen with xs:anyURI, xs:QName, and xs:base64Binary) two different lexical representations (“3e3” and “003000.0000”) for the same value. As expected, since both lexical representations map to the same value and that value is one of the enumerated values, both are accepted.
xs:maxExclusive
defines a maximum value that cannot
be reached:
<xs:simpleType name="maxExclusive"> <xs:restriction base="xs:float"> <xs:maxExclusive value="10"/> </xs:restriction> </xs:simpleType>
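This datatype validates “9.999999,” but not “10.” Note that a literal with many more digits, such as “9.999999999999999,” is closer to 10 than single precision can distinguish; it is mapped to the value 10 and is therefore rejected as well.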
The
xs:maxExclusive
facet is
especially useful for datatypes such as
xs:float
,
xs:double
,
xs:decimal
, or even for datetime types that can
cope with infinitesimal values and in which it is not possible to
determine the greatest value that is smaller than a value.
xs:maxInclusive
defines a maximum value that can be
reached:
<xs:simpleType name="thousands"> <xs:restriction base="xs:double"> <xs:maxInclusive value="1e3"/> </xs:restriction> </xs:simpleType>
xs:minExclusive
defines a minimum value that cannot
be reached:
<xs:simpleType name="strictlyPositive"> <xs:restriction base="xs:double"> <xs:minExclusive value="0"/> </xs:restriction> </xs:simpleType>
xs:minInclusive
defines a minimum value that can be
reached:
<xs:simpleType name="positive"> <xs:restriction base="xs:double"> <xs:minInclusive value="0"/> </xs:restriction> </xs:simpleType>
xs:pattern
defines a pattern that must be
matched by the lexical value of the datatype:
<xs:simpleType name="nonScientific">
  <xs:restriction base="xs:float">
    <xs:pattern value="[^eE]*"/>
  </xs:restriction>
</xs:simpleType>
<xs:simpleType name="noLeading0">
  <xs:restriction base="xs:float">
    <xs:pattern value="[^0].*"/>
  </xs:restriction>
</xs:simpleType>
This example shows how a pattern, acting on the lexical value of the
float, can disable the use of scientific notation
(xxxEyyy
) or leading zeros.
The
xs:pattern
is the only facet
that directly acts on the lexical space of the datatype.
These
datatypes are partially ordered, and
bounds can be defined even though some restrictions apply. These
datatypes are:
xs:date
,
xs:dateTime
,
xs:duration
,
xs:gDay
,
xs:gMonth
,
xs:gMonthDay
,
xs:gYear
,
xs:gYearMonth
, and
xs:time
. Their facets are the same as
those of the float datatypes, as shown in the next sections.
xs:enumeration
allows definition of a list of
possible values and works on the value space. For
example:
<xs:simpleType name="ModernSwissHistoricalDates"> <xs:restriction base="xs:gYear"> <xs:enumeration value="1864"/> <xs:enumeration value="1872"/> <xs:enumeration value="1914"/> <xs:enumeration value="1939"/> <xs:enumeration value="1971"/> <xs:enumeration value="1979"/> <xs:enumeration value="1992"/> </xs:restriction> </xs:simpleType>
This simple type will match literals such as:
1939
Since no time zone is specified for the dates in the enumeration, the time zone is undetermined. These dates do not match any date with a time zone specified, such as:
1939Z
or:
1939+10:00
The same issue appears if enumerations include a time zone, such as in:
<xs:simpleType name="wakeUpTime"> <xs:restriction base="xs:time"> <xs:enumeration value="07:00:00-07:00"/> <xs:enumeration value="07:15:00-07:00"/> <xs:enumeration value="07:30:00-07:00"/> <xs:enumeration value="07:45:00-07:00"/> <xs:enumeration value="08:00:00-07:00"/> </xs:restriction> </xs:simpleType>
This new datatype matches:
07:00:00-07:00
as well as:
11:00:00-04:00
and even:
07:15:00-07:15
but will not validate any time without a time zone.
Even though handling both times with and without time zones is problematic and questionable, it is possible to mix enumerations of values with and without time zones, such as:
<xs:simpleType name="sevenOClockPST"> <xs:restriction base="xs:time"> <xs:enumeration value="07:00:00-07:00"/> <xs:enumeration value="07:00:00"/> </xs:restriction> </xs:simpleType>
xs:maxExclusive
defines a maximum value
that cannot be reached:
<xs:simpleType name="beforeY2K"> <xs:restriction base="xs:dateTime"> <xs:maxExclusive value="2000-01-01T00:00:00Z"/> </xs:restriction> </xs:simpleType>
This datatype validates any date strictly less than Y2K UTC, such as:
1999-12-31T23:59:59Z
or:
1999-12-31T23:59:59.999999999999Z
It also validates a datetime expressed in any other time zone, as long as it is strictly earlier, such as:
2000-01-01T11:59:59+12:00
It doesn’t validate:
2000-01-01T00:00:00Z
When the bound is compared to datetimes without a time zone, an interval of indeterminacy of +/-14 hours is applied. The greatest datetime without a time zone that is still accepted (not counting fractions of seconds) is therefore:
1999-12-31T09:59:59
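One possible way to sidestep this indeterminacy is to require a time zone in every value. The following is only a sketch under that assumption: the type name beforeY2KWithTimeZone is hypothetical, and the pattern (patterns are covered in the next chapter) simply requires the lexical value to end with “Z” or with an offset such as “+02:00”.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="beforeY2KWithTimeZone">
    <xs:restriction base="xs:dateTime">
      <!-- The lexical value must end with Z or with a +hh:mm / -hh:mm offset -->
      <xs:pattern value=".+(Z|[+\-][0-9]{2}:[0-9]{2})"/>
      <!-- Every accepted value carries a time zone, so the comparison
           with this bound is always determinate -->
      <xs:maxExclusive value="2000-01-01T00:00:00Z"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

With this definition, “1999-12-31T23:59:59Z” is still accepted, while “1999-12-31T09:59:59” is rejected by the pattern because it has no time zone.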
xs:maxInclusive
defines a maximum value
that can be reached:
<xs:simpleType name="AQuarterOrLess"> <xs:restriction base="xs:duration"> <xs:maxInclusive value="P3M"/> </xs:restriction> </xs:simpleType>
This datatype validates all the durations less than or equal to 3
months. Durations such as P2M
(2 months) or
P3M
(3 months) qualify. If both months and days
are used, P2M30D
(2 months and 30 days) will be
valid, but P2M31D
(2 months and 31 days), or even
P2M30DT1S
(2 months, 30 days, and 1 second), will
be rejected because the actual duration is indeterminate
when parts from year/month on one side and day/hours/minutes/seconds
on the other side are both used.
xs:minExclusive
defines a minimum value
that cannot be reached:
<xs:simpleType name="afterTeaTimeInParisInSummer"> <xs:restriction base="xs:time"> <xs:minExclusive value="17:00:00+02:00"/> </xs:restriction> </xs:simpleType>
xs:minInclusive
defines a minimum value
that can be reached:
<xs:simpleType name="afterOrOnThe20th"> <xs:restriction base="xs:gDay"> <xs:minInclusive value="---20"/> </xs:restriction> </xs:simpleType>
We can also return to our example using durations and define:
<xs:simpleType name="AQuarterOrMore"> <xs:restriction base="xs:duration"> <xs:minInclusive value="P3M"/> </xs:restriction> </xs:simpleType>
This datatype validates all durations that are more than or equal to
3 months. Durations such as P4M
(4 months) or
P3M
(3 months) will qualify. If both months and
days are used, P2M31D
(2 months and 31 days) will
be valid, but P2M30D
(2 months and 30 days), or
even P2M30DT23H59M59S
(2 months, 30 days, 23
hours, 59 minutes, and 59 seconds), will be rejected because the
actual duration is indeterminate.
Because of this indeterminacy, W3C XML Schema in effect considers our third month to have 30 days when we apply xs:maxInclusive (P2M30D is the longest accepted duration of this form), and 31 days when we apply xs:minInclusive (P2M31D is the shortest accepted one). In practice, it may be wise to forbid combinations that allow such an indeterminacy. We will see in the next chapter how to do it with a pattern.
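As a foretaste, and only as a sketch, one possible approach is to allow nothing but the year and month parts, so that every accepted value can be compared with P3M without ambiguity. The type name AQuarterOrLessInMonths is hypothetical, and the pattern shown is an assumption for this illustration rather than the one developed in the next chapter.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="AQuarterOrLessInMonths">
    <xs:restriction base="xs:duration">
      <!-- Only years and months are allowed: no days, no time part,
           and no leading minus sign -->
      <xs:pattern value="P([0-9]+Y)?([0-9]+M)?"/>
      <xs:maxInclusive value="P3M"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

With this definition, P2M and P3M are accepted, P4M and P1Y are rejected by the bound, and P2M30D is rejected by the pattern. A bare “P” matches the pattern but is not a valid xs:duration, so it is rejected by the base datatype.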
xs:pattern
defines a pattern that must be
matched by the lexical value of the datatype. We will see patterns in
detail in the next chapter. To get an idea of what they look like,
look at the following datatype. It forbids usage of a time zone by
an
xs:dateTime
datatype:
<xs:simpleType name="noTimeZone"> <xs:restriction base="xs:dateTime"> <xs:pattern value=".*T[^Z+-]*"/> </xs:restriction> </xs:simpleType>
These
datatypes are:
xs:byte
,
xs:int
,
xs:integer
,
xs:long
,
xs:negativeInteger
,
xs:nonNegativeInteger
,
xs:nonPositiveInteger
,
xs:positiveInteger
,
xs:short
,
xs:unsignedByte
,
xs:unsignedInt
,
xs:unsignedLong
, and
xs:unsignedShort
.
They accept the same facets as the float and datetime datatypes, which we just saw, plus an additional facet to constrain the number of digits, as shown next.
xs:totalDigits
defines the maximum number of decimal
digits:
<xs:simpleType name="totalDigits"> <xs:restriction base="xs:integer"> <xs:totalDigits value="5"/> </xs:restriction> </xs:simpleType>
This datatype accepts only integers with up to five decimal digits.
xs:totalDigits
acts on the value
space, which means that the integer
“000012345,” whose canonical value
is “12345,” matches the datatype
defined previously.
This
single datatype
(
xs:decimal
) accepts all the facets of the
integers and an additional facet to define the number of fractional
digits as shown next.
xs:fractionDigits
specifies the maximum number of decimal
digits in the fractional part (after the dot) :
<xs:simpleType name="fractionDigits"> <xs:restriction base="xs:decimal"> <xs:fractionDigits value="2"/> </xs:restriction> </xs:simpleType>
xs:fractionDigits
acts on the value
space, which means that the decimal
“1.12000,” whose canonical value is
“1.12,” matches the datatype
defined previously.
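Since xs:decimal accepts both digit-related facets, they can be combined in a single restriction. The following price datatype is a hypothetical sketch (the name and the limits are assumptions made for this illustration):

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="price">
    <xs:restriction base="xs:decimal">
      <!-- At most 7 significant digits in total -->
      <xs:totalDigits value="7"/>
      <!-- At most 2 of them after the decimal point -->
      <xs:fractionDigits value="2"/>
      <!-- No negative prices -->
      <xs:minInclusive value="0"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

With this definition, “12345.67” and “0012.50” are accepted, while “1.234” (three fractional digits) and “123456.78” (eight digits in total) are rejected.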
With only one facet allowed, as far as
restriction facets are concerned, the simplest datatype is
xs:boolean
. The value space of
this simple datatype is limited to
“true” and
“false,” but its lexical space also
includes “0” and
“1.” The
xs:pattern
facet can be used to exclude one of
these formats.
The functionality of
xs:pattern
is usually very rich; however, given
the limited number of values of the
xs:boolean
, its only use here appears to be to
fix a format:
<xs:simpleType name="trueOrFalse"> <xs:restriction base="xs:boolean"> <xs:pattern value="true"/> <xs:pattern value="false"/> </xs:restriction> </xs:simpleType>
The available facets for the list datatypes
(
xs:IDREFS
,
xs:ENTITIES
, and
xs:NMTOKENS
) are the facets
available for all the datatypes that are derived by list, as we will
see in the next section.
New restrictions can be applied to datatypes that are already derived by restriction from other types.
When the new restrictions apply to facets that have not yet been constrained, the new facets are simply added to the set of facets already defined. The value and lexical spaces of the new datatype are the intersection of all the restrictions. Things become more complex when the same facet is redefined, and we will see that a “restricting” facet can, in one case, even extend the set of accepted values.
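The simple case, in which a second derivation step adds a facet that was not yet constrained, can be sketched as follows (the type names upTo100 and from1To100 are hypothetical):

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- First step: only an upper bound -->
  <xs:simpleType name="upTo100">
    <xs:restriction base="xs:integer">
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Second step: a lower bound is added; the upper bound is inherited,
       so the accepted values are the integers from 1 to 100 -->
  <xs:simpleType name="from1To100">
    <xs:restriction base="upTo100">
      <xs:minInclusive value="1"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```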
As far as multiple facet definitions are concerned, we can classify the facets into four categories, described in the next sections.
This is the general case.
xs:enumeration
,
xs:fractionDigits
,
xs:maxExclusive
,
xs:maxInclusive
,
xs:maxLength
,
xs:minExclusive
,
xs:minInclusive
,
xs:minLength
, and
xs:totalDigits
fall into this category.
For all these facets, it is forbidden to add a facet that expands the value space of the base datatype. The following examples demonstrate such errors:
<xs:simpleType name="minInclusive">
  <xs:restriction base="xs:float">
    <xs:minInclusive value="10"/>
  </xs:restriction>
</xs:simpleType>
<xs:simpleType name="minInclusive2">
  <xs:restriction base="minInclusive">
    <xs:minInclusive value="0"/>
  </xs:restriction>
</xs:simpleType>
or:
<xs:simpleType name="enumeration">
  <xs:restriction base="xs:float">
    <xs:enumeration value="-INF"/>
    <xs:enumeration value="1.618033989"/>
    <xs:enumeration value="3e3"/>
  </xs:restriction>
</xs:simpleType>
<xs:simpleType name="enumeration2">
  <xs:restriction base="enumeration">
    <xs:enumeration value="0"/>
  </xs:restriction>
</xs:simpleType>
The
xs:length
facet is the only one in this
category. The length of a derived datatype cannot be redefined if the
length of its parent has been defined.
xs:length
can be seen as a
shortcut for assigning an equal value to
xs:maxLength
and
xs:minLength
. This behavior is coherent with what
happens if these two facets are both used with the same value:
further values of
xs:maxLength
must be less than or equal to the length, and further values of
xs:minLength
must be greater than
or equal to the length. Since
xs:minLength
must also be smaller than or equal
to
xs:maxLength
, the only
possibility is that they all need to stay equal to the length as
previously defined.
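The equivalence described above can be sketched with two hypothetical datatypes (code8 and code8bis are names chosen for this illustration) that accept exactly the same values:

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Fixed length expressed with xs:length -->
  <xs:simpleType name="code8">
    <xs:restriction base="xs:token">
      <xs:length value="8"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Same constraint expressed with equal minimum and maximum lengths -->
  <xs:simpleType name="code8bis">
    <xs:restriction base="xs:token">
      <xs:minLength value="8"/>
      <xs:maxLength value="8"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

In both cases, a datatype derived from these types has no room left to change the length: any other value would expand the value space of its base datatype.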
The
xs:pattern
facet is the only facet that can be
applied multiple times. It always restricts the lexical space by
performing a straight intersection of the lexical spaces. The
following noScientificNoLeading0
datatype must match the patterns of both the base datatype and the new
restriction:
<xs:simpleType name="nonScientific">
  <xs:restriction base="xs:float">
    <xs:pattern value="[^eE]*"/>
  </xs:restriction>
</xs:simpleType>
<xs:simpleType name="noScientificNoLeading0">
  <xs:restriction base="nonScientific">
    <xs:pattern value="[^0].*"/>
  </xs:restriction>
</xs:simpleType>
xs:whiteSpace
is a remarkable
exception. This facet defines the whitespace processing and can
actually expand the set of accepted instance documents during a
“restriction,” as shown in the
following example:
<xs:simpleType name="greetings">
  <xs:restriction base="xs:string">
    <xs:whiteSpace value="replace"/>
    <xs:enumeration value="hi"/>
    <xs:enumeration value="hello"/>
    <xs:enumeration value="how do you do?"/>
  </xs:restriction>
</xs:simpleType>
<xs:simpleType name="restricted-greetings">
  <xs:restriction base="greetings">
    <xs:whiteSpace value="collapse"/>
  </xs:restriction>
</xs:simpleType>
While the first datatype (“greetings”) accepts:
how do you do?
but rejects a string such as:
how    do    you    do?
the type issued from the “restriction” accepts both.
Each facet (except
xs:enumeration
and
xs:pattern
) includes a
fixed
attribute which,
when set to true
, disables the possibility of
modifying the facet during further restrictions by derivation.
If we want to make sure that the minimum value of our
minInclusive
cannot be modified, we write:
<xs:simpleType name="minInclusive"> <xs:restriction base="xs:float"> <xs:minInclusive value="10" fixed="true"/> </xs:restriction> </xs:simpleType>
This is the method used by the schema for W3C XML
Schema to fix the value of the facets used to derive
predefined datatypes. For instance, the type
xs:integer
is derived from
xs:decimal
through:
<xs:simpleType name="integer" id="integer"> <xs:restriction base="xs:decimal"> <xs:fractionDigits value="0" fixed="true"/> </xs:restriction> </xs:simpleType>
xs:enumeration and xs:pattern cannot be fixed.
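The effect of the fixed attribute on further derivations can be sketched as follows. The derived type names (from10To20 and atLeast20) are hypothetical, and the rejected definition is kept inside a comment so that the schema as a whole remains valid.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="minInclusive">
    <xs:restriction base="xs:float">
      <xs:minInclusive value="10" fixed="true"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Allowed: xs:maxInclusive was neither set nor fixed on the base type -->
  <xs:simpleType name="from10To20">
    <xs:restriction base="minInclusive">
      <xs:maxInclusive value="20"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Rejected: xs:minInclusive is fixed on the base type, so it cannot be
       given another value, even one that would narrow the value space.
  <xs:simpleType name="atLeast20">
    <xs:restriction base="minInclusive">
      <xs:minInclusive value="20"/>
    </xs:restriction>
  </xs:simpleType>
  -->
</xs:schema>
```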