Derivation by list is the mechanism by which a list datatype can be derived from an atomic datatype. All the items in the list need to have the same datatype.
List datatypes are special cases in which a structure is defined within the content of a single attribute or element. This practice is usually discouraged, since applications cannot reach the individual atomic values through the current XML APIs, XPath expressions, or the Infoset. This situation might change in the future, since these datatypes should be adopted by XPath 2.0, which will likely provide some kind of mechanism to access the items within these lists.
This feature appears to have been introduced to maintain compatibility with SGML and XML DTD IDREFS, but W3C XML Schema has been cautious and doesn’t allow definition of the list separator or complex lists with complex types or heterogeneous members. Among the constructs that can be seen in some XML vocabularies and cannot be described by XML Schema (except by using regular expressions as a partial workaround) are comma-separated lists of values, and lists with heterogeneous members, such as values with units:
<commaSeparated>
  1, 2, 25
</commaSeparated>
<valueWithUnit>
  10 em
</valueWithUnit>
Whitespace-separated lists and split XML elements or attributes are preferred:
<commaSeparated>
  1 2 25
</commaSeparated>
<valueWithUnit unit="em">
  10
</valueWithUnit>
<valueWithUnit>
  10em
</valueWithUnit>
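The regular-expression workaround mentioned above for comma-separated values could be sketched as follows. The type name commaSeparatedIntegers is hypothetical, and the xs prefix is assumed to be bound to the W3C XML Schema namespace. The pattern only checks the overall lexical shape: the schema processor still sees a single string rather than a list of integers, which is why the workaround is only partial.

```xml
<!-- Hypothetical type: a comma-separated list of integers, checked by pattern only -->
<xs:simpleType name="commaSeparatedIntegers">
  <!-- xs:token collapses leading, trailing, and repeated whitespace -->
  <xs:restriction base="xs:token">
    <!-- An optionally signed integer, followed by any number of ", integer" groups -->
    <xs:pattern value="[+\-]?[0-9]+( ?, ?[+\-]?[0-9]+)*"/>
  </xs:restriction>
</xs:simpleType>
```

A value such as “1, 2, 25” would match this pattern, but no facet such as xs:maxInclusive can then be applied to the individual numbers.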
IDREFS, ENTITIES, and NMTOKENS are predefined list datatypes that are derived from atomic types using this method.
As we have seen with these three datatypes, all the list datatypes that can be defined must be whitespace-separated. No other separator is accepted.
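As an illustration of a predefined list datatype, an attribute declared as xs:IDREFS accepts a whitespace-separated list of references. The attribute name, the element name, and the values in this sketch are hypothetical.

```xml
<!-- Hypothetical attribute holding a whitespace-separated list of ID references -->
<xs:attribute name="seeAlso" type="xs:IDREFS"/>

<!-- A matching instance fragment; each item must match an xs:ID value elsewhere in the document -->
<entry seeAlso="ch01 ch05 appA"/>
```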
With this restriction, defining a list is very simple, and W3C XML Schema has defined two syntaxes. Both use an xs:list element, which either refers to an existing type or embeds a type definition (these two syntaxes cannot be mixed).
The definition of a list datatype by reference to an existing type is done through an itemType attribute:
<xs:simpleType name="integerList">
  <xs:list itemType="xs:integer"/>
</xs:simpleType>
This datatype can be used to define attributes or elements that accept a whitespace-separated list of integers such as “1 -25000 1000”.
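A minimal usage sketch for the integerList type follows. The element name scores is hypothetical, and the schema is assumed to have no target namespace, so the type can be referenced without a prefix.

```xml
<!-- Hypothetical element declaration using the integerList type defined above -->
<xs:element name="scores" type="integerList"/>

<!-- Valid instance: three integers separated by whitespace -->
<scores>1 -25000 1000</scores>

<!-- Invalid instance: commas are not accepted as separators, so "1," is not an integer -->
<scores>1, -25000, 1000</scores>
```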
The definition of a list datatype can also be done by embedding an xs:simpleType element:
<xs:simpleType name="myIntegerList">
  <xs:list>
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:maxInclusive value="100"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:list>
</xs:simpleType>
This datatype can be used to define attributes or elements that accept a whitespace-separated list of integers smaller than or equal to 100 such as “1 -25000 100”.
List datatypes have their own value space that can be constrained using a set of specific facets that is common to all of them. These facets are xs:length, xs:maxLength, xs:minLength, xs:enumeration, and xs:whiteSpace. The unit used to measure the length of a list type is always the number of items in the list.
To apply these facets to a user-defined list type, we need to follow two steps. We first define the list datatype, and then define a second datatype that constrains it. The reason for this is that each xs:simpleType accepts only one derivation method, chosen among the three existing methods (restriction, list, or union).
In this process, any derivation by restriction that constrains the individual items has to be done first, since a list datatype loses the facets of its atomic type and keeps only the five facets just described, whose meaning is specific to list types.
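The sequence described above might be sketched as follows. All three type names are hypothetical, and the schema is assumed to have no target namespace. The items are constrained first, the list is derived second, and the list itself is restricted last.

```xml
<!-- Optional first step: constrain the items while the atomic facets are still available -->
<xs:simpleType name="smallInteger">
  <xs:restriction base="xs:integer">
    <xs:maxInclusive value="100"/>
  </xs:restriction>
</xs:simpleType>

<!-- Derivation by list: a whitespace-separated list of smallInteger items -->
<xs:simpleType name="smallIntegerList">
  <xs:list itemType="smallInteger"/>
</xs:simpleType>

<!-- Derivation by restriction of the list: the length is counted in list items -->
<xs:simpleType name="threeToFiveSmallIntegers">
  <xs:restriction base="smallIntegerList">
    <xs:minLength value="3"/>
    <xs:maxLength value="5"/>
  </xs:restriction>
</xs:simpleType>
```

With these definitions, “1 50 100” would be accepted, while “1 50” (too few items) and “1 50 200” (an item above 100) would be rejected.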