A schema assigns a type to each element and attribute it declares. In Example 17-5, the fullName element has a complex type. Elements with complex types may contain nested elements and have attributes. Only elements can have complex types; attributes always have simple types. Because the type is declared using an xs:complexType element embedded directly in the element declaration, it is also an anonymous type rather than a named type.
New types are defined using xs:complexType or xs:simpleType elements. If a new type is declared globally, as a top-level child of the xs:schema element, it must be given a name so that it can be referenced from element and attribute declarations within the schema. If a type is defined inline (inside an element or attribute declaration), it has no name, and so it cannot be referenced by other element or attribute declarations. Large, complex schemas usually need to share data types among several elements, and that reuse requires named types.
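As a minimal sketch of the difference, the fragment below declares one named type at the top level and references it from two element declarations, then declares a third element with an inline anonymous type. The names shortText, nickname, alias, and initials are hypothetical and are not part of the address schema, and the xs:maxLength facet is there only to give each type a body.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://namespaces.oreilly.com/xmlnut/address"
  xmlns:addr="http://namespaces.oreilly.com/xmlnut/address"
  elementFormDefault="qualified">

  <!-- Named type (hypothetical): declared at the top level so it can be reused -->
  <xs:simpleType name="shortText">
    <xs:restriction base="xs:string">
      <xs:maxLength value="40"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Two declarations share the one named type -->
  <xs:element name="nickname" type="addr:shortText"/>
  <xs:element name="alias" type="addr:shortText"/>

  <!-- Anonymous type: defined inline, usable only by this declaration -->
  <xs:element name="initials">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:maxLength value="3"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

Changing the length limit in shortText would affect both nickname and alias at once, while the limit on initials can only be changed in that one declaration.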
To show how named types and complex content interact, let's expand the example schema. A new address element will contain the fullName element, and the person's name will be divided into first-name and last-name components. A typical instance document would look like Example 17-6.
Example 17-6. addressdoc.xml after adding address, first, and last elements
<?xml version="1.0"?>
<addr:address xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://namespaces.oreilly.com/xmlnut/address address-schema.xsd"
  xmlns:addr="http://namespaces.oreilly.com/xmlnut/address"
  addr:language="en">
  <addr:fullName>
    <addr:first>Scott</addr:first>
    <addr:last>Means</addr:last>
  </addr:fullName>
</addr:address>
To accommodate this new format, fairly substantial structural changes to the schema are required, as shown in Example 17-7.
Example 17-7. address-schema.xsd to support address element
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://namespaces.oreilly.com/xmlnut/address"
  xmlns:addr="http://namespaces.oreilly.com/xmlnut/address"
  elementFormDefault="qualified">
  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="fullName">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="first" type="addr:nameComponent"/>
              <xs:element name="last" type="addr:nameComponent"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="nameComponent">
    <xs:simpleContent>
      <xs:extension base="xs:string"/>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>
The first major difference between this schema and the previous version is that the root element name has been changed from fullName to address. The same result could have been accomplished by keeping the top-level fullName declaration and adding a new top-level element declaration for the address element, but that would have opened a loophole: a valid instance document could then contain only a fullName element and nothing else.
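To make the loophole concrete, here is a sketch of the alternative design that paragraph describes, with both fullName and address declared at the top level. It illustrates what the example schema avoids and is not part of it; the content of fullName is simplified to xs:string for brevity (an assumption), and the address declaration points at the global fullName through the ref attribute.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://namespaces.oreilly.com/xmlnut/address"
  xmlns:addr="http://namespaces.oreilly.com/xmlnut/address"
  elementFormDefault="qualified">

  <!-- Global declaration: fullName may now be the root of a valid document -->
  <xs:element name="fullName" type="xs:string"/>

  <!-- Global declaration: address refers to the global fullName -->
  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="addr:fullName"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because any globally declared element may serve as the document root, a document consisting of nothing but an addr:fullName element would validate against this version. Nesting the fullName declaration inside address removes that possibility.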
The address element declaration defines a new anonymous complex type. Unlike the old definition, this complex type is defined to contain complex content using the xs:sequence element. The xs:sequence element tells the schema processor that the contained list of elements must appear in the instance document in exactly the order given. In this case, the sequence contains only one element declaration.
The nested element declaration is for the fullName element, which then repeats the xs:complexType and xs:sequence definition process. Within this nested sequence, two element declarations appear for the first and last elements.
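The ordering rule imposed by the nested sequence can be seen by comparing two instance fragments. These are fragments rather than complete documents, and they assume the addr prefix is bound to the schema's target namespace as in Example 17-6.

```xml
<!-- Valid: first precedes last, as the sequence requires -->
<addr:fullName>
  <addr:first>Scott</addr:first>
  <addr:last>Means</addr:last>
</addr:fullName>

<!-- Invalid: same children, but in the wrong order -->
<addr:fullName>
  <addr:last>Means</addr:last>
  <addr:first>Scott</addr:first>
</addr:fullName>
```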
These two element declarations, unlike all prior element declarations, explicitly reference a new complex type that's declared in the schema: the addr:nameComponent type. The reference is qualified with the addr prefix so that it cannot be confused with a built-in schema data type.
The nameComponent type is declared by the xs:complexType element immediately following the address element declaration. It is identified as a named type by the presence of the name attribute, but in every other way it is constructed exactly as it would have been as an anonymous type.
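For comparison, this is roughly how the first element would look if the same type were written inline as an anonymous type. It is a hypothetical rewrite for illustration, not a change to the example schema.

```xml
<!-- Hypothetical inline form: same content model, but no name to reference -->
<xs:element name="first">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string"/>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
```

The body of the type is identical to that of nameComponent. The cost of the inline form is that the last element would need its own copy of the same definition.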
One feature of schemas that should be welcome to DTD developers is the ability to explicitly set the minimum and maximum number of times an element may occur at a particular point in a document using the minOccurs and maxOccurs attributes of the xs:element element. For example, this declaration adds an optional middle name to the fullName element:
<xs:element name="fullName">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="first" type="addr:nameComponent"/>
      <xs:element name="middle" type="addr:nameComponent" minOccurs="0"/>
      <xs:element name="last" type="addr:nameComponent"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
Notice that the element declaration for the middle element has a minOccurs value of 0. Both minOccurs and maxOccurs default to 1 when they are not provided explicitly. Therefore, setting minOccurs to 0 means that the middle element may appear zero times or once. This is equivalent to using the ? operator in a DTD declaration. Another possible value for the maxOccurs attribute is unbounded, which indicates that the element in question may appear an unlimited number of times. Combined with a minOccurs of 0, it produces the same effect as the * operator in a DTD declaration; with minOccurs left at its default of 1, it matches the + operator. The advantage over DTDs comes when you use values other than 0, 1, or unbounded, letting you specify things like "this element must appear at least twice but no more than four times."
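The correspondence with the DTD operators, and the case that has no single DTD operator, can be sketched in one sequence. The element names note, tag, email, and phone are hypothetical, and xs:string is used only to keep the declarations short.

```xml
<xs:sequence>
  <!-- DTD equivalent: note? (zero or one); maxOccurs defaults to 1 -->
  <xs:element name="note" type="xs:string" minOccurs="0"/>

  <!-- DTD equivalent: tag* (zero or more) -->
  <xs:element name="tag" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>

  <!-- DTD equivalent: email+ (one or more); minOccurs defaults to 1 -->
  <xs:element name="email" type="xs:string" maxOccurs="unbounded"/>

  <!-- No single DTD operator expresses this: at least two, at most four -->
  <xs:element name="phone" type="xs:string" minOccurs="2" maxOccurs="4"/>
</xs:sequence>
```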