Although W3C XML Schema permits mixed content models and describes them better than XML DTDs do, it treats them as an add-on plugged on top of complex content models. The good news is that this allows control of child elements exactly as we’ve just seen for complex content. The bad news is that we abandon any control over the child text nodes, whose values cannot be constrained at all, and, of course, the descriptions of the child elements are subject to the same limitations as in the case of complex content models. The limitations on unordered content models are probably even more unfriendly for mixed content models, which are more “free style,” than they are for complex content models.
This add-on is implemented through a mixed attribute in the xs:complexType element, which is otherwise used exactly as we’ve seen for complex content models. The effect of this attribute when its value is set to "true" is to allow any text nodes within the content model, before, between, and after the child elements. The location, the whitespace processing, and the datatype of these text nodes cannot be restricted in any way.
Let’s go back to the definition of our title element and change it to accept a reduced version of XHTML with the a element for links and an em element to highlight some parts of its text.
The definition, which was previously done by extending a simple type to create a simple content complex type, needs to be rewritten as a complex content definition with a mixed attribute set to "true". The full definition, including the definition of the a element, the definition of a markedText complex type, and its usage to define the title element, could be:
<xs:element name="a">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="href" type="xs:anyURI"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

<xs:complexType name="markedText" mixed="true">
  <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element name="em" type="xs:token"/>
    <xs:element ref="a"/>
  </xs:choice>
  <xs:attribute ref="lang"/>
</xs:complexType>

<xs:element name="title" type="markedText"/>
This definition matches elements such as:
<title lang="en">
  Being a
  <a href="http://dmoz.org/Shopping/Pets/Dogs/">
    Dog
  </a>
  Is a
  <em>
    Full-Time
  </em>
  Job
</title>
Note that the length of the title can no longer be restricted.
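To make the looseness of this content model concrete, here are a few more instance fragments checked by hand against the markedText definition above. They are illustrative sketches rather than examples from the original schema, and the text and href values are invented.

```xml
<!-- Accepted: text only, since the xs:choice has minOccurs="0" -->
<title lang="en">Being a Dog Is a Full-Time Job</title>

<!-- Accepted: no text, no child elements, and no lang attribute
     (an attribute reference is optional unless use="required" is given) -->
<title/>

<!-- Accepted: em and a may repeat and appear in any order,
     with text before, between, and after them -->
<title lang="en"><em>Being</em> a <a href="http://example.org/dog">Dog</a> Is <em>a</em> Job</title>

<!-- Rejected: em is declared with the simple type xs:token,
     so it cannot contain child elements -->
<title lang="en">Being <em>a <a href="http://example.org/dog">Dog</a></em></title>
```

None of these checks says anything about the text nodes themselves: however long or short the text is, the schema has no way to object.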
Mixed content models are derived exactly like the complex content models on which they have been plugged. The semantics of both derivation methods, extension and restriction, stay exactly the same.
Mixed content complex types can be derived by extension from other mixed content complex types, and the meaning will be the same as for complex content. If I want to add a strong element to my markedText mixed content type, I can define the following content model:
<xs:element name="title">
  <xs:complexType mixed="true">
    <xs:complexContent mixed="true">
      <xs:extension base="markedText">
        <xs:choice minOccurs="0" maxOccurs="unbounded">
          <xs:element name="strong" type="xs:string"/>
        </xs:choice>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:element>
One must note, though, that this extension is equivalent to:
<xs:complexType name="resultingType" mixed="true">
  <xs:sequence>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="em" type="xs:token"/>
      <xs:element ref="a"/>
    </xs:choice>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="strong" type="xs:string"/>
    </xs:choice>
  </xs:sequence>
  <xs:attribute ref="lang"/>
</xs:complexType>
This is probably not what we would like to see in practice, since this content model expects all the occurrences of a and em to come before any instance of strong. We will see later, in Chapter 12, that this specific issue can be solved using a feature named “substitution groups” instead of xs:choice.
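The ordering constraint is easiest to see on instance documents. The two fragments below are hand-checked sketches against the extended title definition (equivalent to resultingType); the wording of the titles is invented for illustration.

```xml
<!-- Accepted: every a and em comes before the first strong -->
<title lang="en">Being a <a href="http://dmoz.org/Shopping/Pets/Dogs/">Dog</a> Is a <em>Full-Time</em> <strong>Job</strong></title>

<!-- Rejected: once a strong has appeared, the first xs:choice is closed,
     so the em that follows is not allowed -->
<title lang="en">Being a <strong>Dog</strong> Is a <em>Full-Time</em> Job</title>
```

Text nodes remain free to appear anywhere in both cases; only the relative order of the elements is checked.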
The derivation of mixed content models by restriction is also done using the method defined for complex content models, with the same constraint that each particle must be an explicit derivation of the corresponding particle of the base type. To illustrate the consequences of this constraint, let’s look again at the definition and the use of our markedText type:
<xs:element name="a">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="href" type="xs:anyURI"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

<xs:complexType name="markedText" mixed="true">
  <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element name="em" type="xs:token"/>
    <xs:element ref="a"/>
  </xs:choice>
  <xs:attribute ref="lang"/>
</xs:complexType>

<xs:element name="title" type="markedText"/>
If we want to forbid em elements in our title, force the href attribute to be an absolute http URI, and require the lang attribute to be either en or es, we need to do some refactoring to show that the a element included in our title is an explicit derivation of the general definition of a. We also need to use a global complex type definition for a instead of the previous anonymous definition:
<xs:element name="a" type="link"/>
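The link type referenced by this declaration is not shown here. Presumably it is just the former anonymous definition of a, given a name; under that assumption, a minimal sketch would be:

```xml
<!-- Assumed definition: the earlier anonymous type of a, now named "link" -->
<xs:complexType name="link">
  <xs:simpleContent>
    <xs:extension base="xs:string">
      <xs:attribute name="href" type="xs:anyURI"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>
```

Naming the type is what makes it possible to refer to it as the base of a later restriction.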
We can now either derive a new global complex type from the new link complex type or embed its derivation in the definition of our title element:
<xs:element name="title">
  <xs:complexType mixed="true">
    <xs:complexContent mixed="true">
      <xs:restriction base="markedText">
        <xs:choice minOccurs="0" maxOccurs="unbounded">
          <xs:element name="a">
            <xs:complexType>
              <xs:simpleContent>
                <xs:restriction base="link">
                  <xs:attribute name="href">
                    <xs:simpleType>
                      <xs:restriction base="xs:anyURI">
                        <xs:pattern value="http://.*"/>
                      </xs:restriction>
                    </xs:simpleType>
                  </xs:attribute>
                </xs:restriction>
              </xs:simpleContent>
            </xs:complexType>
          </xs:element>
        </xs:choice>
        <xs:attribute name="lang">
          <xs:simpleType>
            <xs:restriction base="xs:language">
              <xs:enumeration value="en"/>
              <xs:enumeration value="es"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:attribute>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:element>
This example is a caricature. In practice it would be more readable to create an intermediate global type definition to avoid embedding several derivations, but it provides an overview of this derivation process.
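As a sketch of what such intermediate global definitions might look like, the same restriction can be split into named pieces. The names httpURI, httpLink, titleLang, and titleText are hypothetical and introduced here only for illustration; the constraints themselves are the ones used in the embedded version.

```xml
<!-- Hypothetical name: anyURI values that start with http:// -->
<xs:simpleType name="httpURI">
  <xs:restriction base="xs:anyURI">
    <xs:pattern value="http://.*"/>
  </xs:restriction>
</xs:simpleType>

<!-- Hypothetical name: a link whose href must be an httpURI -->
<xs:complexType name="httpLink">
  <xs:simpleContent>
    <xs:restriction base="link">
      <xs:attribute name="href" type="httpURI"/>
    </xs:restriction>
  </xs:simpleContent>
</xs:complexType>

<!-- Hypothetical name: the two languages accepted for titles -->
<xs:simpleType name="titleLang">
  <xs:restriction base="xs:language">
    <xs:enumeration value="en"/>
    <xs:enumeration value="es"/>
  </xs:restriction>
</xs:simpleType>

<!-- Hypothetical name: markedText without em, with the narrowed a and lang -->
<xs:complexType name="titleText" mixed="true">
  <xs:complexContent mixed="true">
    <xs:restriction base="markedText">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element name="a" type="httpLink"/>
      </xs:choice>
      <xs:attribute name="lang" type="titleLang"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>

<xs:element name="title" type="titleText"/>
```

Each derivation step is now one level deep, and the intermediate types can be reused by other elements.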
Since complex and mixed content models are built using the same mechanism, one may wonder what the possibilities are for deriving complex content from mixed content and vice versa. The answer to this question lies in the semantics of the two derivation methods.
Derivation by extension appends new content after the content of the base type, and the structure of the base type is kept unchanged. It is therefore not possible to derive a mixed content model from a complex content model by extension: because the position of the text nodes in a mixed content model cannot be constrained, the derived type would permit text nodes at any location within the content inherited from the base type. For the same reason, it is impossible to extend a mixed content model into a complex content model, because the text nodes that are allowed in the base type would become forbidden.
Derivation by restriction defines a subset of the base type. It is forbidden to derive a mixed content model from a complex content model by restriction: the resulting type would allow text nodes that are forbidden in the base type and would expand rather than restrict the content model. The last combination is the only workable one: a mixed content model can be restricted into a complex content model. Forbidding the text nodes of a mixed content model is a valid restriction and can be done by setting the mixed attribute to "false" in the xs:complexType definition. It is even possible to restrict a mixed content model into a simple content model, since this is, in fact, a restriction removing the sibling elements and keeping the text nodes. This assumes, of course, that the sibling elements are optional; i.e., they have a minOccurs attribute equal to 0.
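The two permitted restrictions can be sketched against the markedText type. The type names markupOnly and plainText and the maxLength value are assumptions made for illustration; the second form relies on the rule that a simple content restriction of a mixed base type must supply an embedded xs:simpleType, and some processors may be stricter or looser about it than the Recommendation.

```xml
<!-- Mixed restricted into complex content: same elements, no text nodes.
     The lang attribute is inherited unchanged from markedText. -->
<xs:complexType name="markupOnly" mixed="false">
  <xs:complexContent>
    <xs:restriction base="markedText">
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:element name="em" type="xs:token"/>
        <xs:element ref="a"/>
      </xs:choice>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>

<!-- Mixed restricted into simple content: the optional elements are dropped
     and the remaining text is given a datatype. -->
<xs:complexType name="plainText">
  <xs:simpleContent>
    <xs:restriction base="markedText">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:maxLength value="255"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:restriction>
  </xs:simpleContent>
</xs:complexType>
```

The second form is the only one of these in which the text can be constrained again, because the content is once more a simple type.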