This chapter covers the basic building blocks of XML: elements. It explains how to use element declarations to assign names and types to elements. It also describes element properties that can be set via element declarations, such as default and fixed values, nillability, and qualified versus unqualified name forms.
Element declarations are used to assign names and types to elements. This is accomplished using an element named element (written xs:element in this chapter's examples). Element declarations can be either global or local.
Global element declarations appear at the top level of the schema document, meaning that their parent must be the schema element. These global element declarations can then be used in multiple complex types, as described in Section 12.4.2 on p. 267. Table 6–1 shows the syntax for a global element declaration.
Table 6–1. XSD Syntax: global element declaration
Example 6–1 shows two global element declarations: name and size. A complex type is then defined which references these element declarations by name using the ref attribute.
The qualified names used by global element declarations must be unique in the schema. This includes not just the schema document in which they appear, but also any other schema documents that are used with it.
The name specified in an element declaration must be an XML non-colonized name, which means that it must start with a letter or underscore (_), and may only contain letters, digits, underscores (_), hyphens (-), and periods (.). The qualified element name consists of the target namespace of the schema document, plus the local name in the declaration. In Example 6–1, the name and size element declarations take on the target namespace http://datypic.com/prod.
Example 6–1. Global element declarations
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">
  <xs:element name="name" type="xs:string"/>
  <xs:element name="size" type="xs:integer"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element ref="name"/>
      <xs:element ref="size" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
Since globally declared element names are qualified by the target namespace of the schema document, it is not legal to include a namespace prefix in the value of the name attribute, as shown in Example 6–2. If you want to declare elements in a different namespace, you must create a separate schema document with that target namespace and import it into the original schema document.
Example 6–2. Illegal attempt to prefix an element name
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:prod="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">
  <xs:element name="name" type="xs:string"/>
  <xs:element name="prod:size" type="xs:integer"/>
</xs:schema>
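As a rough sketch of the import approach, the original schema document might import a second schema document whose target namespace holds the other element declarations. The namespace http://example.com/other, the prefix oth, the file name other.xsd, and the color element are all hypothetical, chosen only for illustration. The imported document might look like this:

```xml
<!-- other.xsd (hypothetical file name and namespace) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/other">
  <xs:element name="color" type="xs:string"/>
</xs:schema>
```

The original schema document could then import it and refer to its global element declaration by prefixed name:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           xmlns:oth="http://example.com/other"
           targetNamespace="http://datypic.com/prod">
  <!-- Make the declarations of the other namespace available -->
  <xs:import namespace="http://example.com/other"
             schemaLocation="other.xsd"/>
  <xs:element name="name" type="xs:string"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element ref="name"/>
      <!-- The prefix appears in ref, never in name -->
      <xs:element ref="oth:color" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```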
Occurrence constraints (minOccurs and maxOccurs) appear in an element reference rather than the global element declaration. This is because they are related to the appearance of an element in a particular content model. Element references are covered in Section 12.4.2 on p. 267.
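The following minimal sketch contrasts the two placements. It reuses the size declaration and ProductType from Example 6–1; the maxOccurs value of 3 is an arbitrary choice made for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">
  <!-- Not allowed: occurrence constraints on a global declaration
  <xs:element name="size" type="xs:integer" minOccurs="0"/>
  -->
  <xs:element name="size" type="xs:integer"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <!-- Allowed: the constraints belong to this particular reference -->
      <xs:element ref="size" minOccurs="0" maxOccurs="3"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

A different complex type could reference the same size declaration with entirely different occurrence constraints.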
Local element declarations, on the other hand, appear entirely within a complex type definition. Local element declarations can only be used in that type definition, never referenced by other complex types or used in a substitution group. Table 6–2 shows the syntax for a local element declaration.
Table 6–2. XSD Syntax: local element declaration
Example 6–3 shows two local element declarations, name and size, which appear entirely within a complex type definition.
Occurrence constraints (minOccurs and maxOccurs) can appear in local element declarations. Some attributes, namely substitutionGroup, final, and abstract, are valid in global element declarations but not in local element declarations. This is because these attributes all relate to substitution groups, in which local element declarations cannot participate.
The name specified in a local element declaration must also be an XML non-colonized name. If its form is qualified, it takes on the target namespace of the schema document. If it is unqualified, it is considered to be in no namespace. See Section 6.3 on p. 98 for more information.
Example 6–3. Local element declarations
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="size" type="xs:integer" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
Names used in local element declarations are scoped to the complex type within which they are declared. You can have two completely different local element declarations with the same element name, as long as they are in different complex types. You can also have two local element declarations with the same element name in the same complex type, provided that they themselves have the same type. This is explained further in Section 12.4.3 on p. 268.
Use global element declarations if:
• The element declaration could ever apply to the root element during validation. Such a declaration should be global so that the schema processor can access it.
• You want to use the exact same element declaration in more than one complex type.
• You want to use the element declaration in a substitution group. Local element declarations cannot participate in substitution groups (see Chapter 16).
Use local element declarations if:
• You want to allow unqualified element names in the instance. In this case, make all of the element declarations local except for the root element declaration. If you mix global and local declarations, and you want the element names in local declarations to be unqualified, you will require your instance authors to know which element declarations are global and which are local. Global element names are always qualified in the instance (see Section 6.3 on p. 98).
• You want to have several element declarations with the same name but different types or other properties. Using local declarations, you can have two element declarations for size: one that is a child of shoe has the type ShoeSizeType, and one that is a child of hat has the type HatSizeType. If the size declaration is global, it can only occur once, and therefore use only one type, in that schema. The same holds true for default and fixed values as well as nillability.
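The shoe and hat case might be sketched as follows. The element and type names come from the discussion above, but the definitions of ShoeSizeType and HatSizeType (their base types and facets) are assumptions made only so that the sketch is complete.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="shoe">
    <xs:complexType>
      <xs:sequence>
        <!-- Local declaration: size as a child of shoe -->
        <xs:element name="size" type="ShoeSizeType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="hat">
    <xs:complexType>
      <xs:sequence>
        <!-- Same name, different type: size as a child of hat -->
        <xs:element name="size" type="HatSizeType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- Hypothetical type definitions -->
  <xs:simpleType name="ShoeSizeType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="20"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="HatSizeType">
    <xs:restriction base="xs:token">
      <xs:enumeration value="S"/>
      <xs:enumeration value="M"/>
      <xs:enumeration value="L"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

Because each size declaration is scoped to its own complex type, the two do not conflict.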
Regardless of whether they are local or global, all element declarations associate an element name with a type, which may be either simple or complex. There are four ways to associate a type with an element name:
1. Reference a named type by specifying the type attribute in the element declaration. This may be either a built-in type or a user-defined type.
2. Define an anonymous type by specifying either a simpleType or a complexType child.
3. Use no particular type, by specifying neither a type attribute nor a simpleType or complexType child. In this case, the actual type is anyType, which allows any children and/or character data content, and any attributes, as long as it is well-formed XML.
4. Define one or more type alternatives using alternative children. This more advanced feature of version 1.1 is described separately in Section 14.2 on p. 375.
Example 6–4 shows four element declarations with different type assignment methods.
Example 6–4. Assigning types to elements
<xs:element name="size" type="SizeType"/>
<xs:element name="name" type="xs:string"/>
<xs:element name="product">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="name"/>
      <xs:element ref="size"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
<xs:element name="anything"/>
The first example uses the type attribute to specify SizeType as the type of size. The second example also uses the type attribute, this time to assign a built-in type string to name. The xs prefix is used because built-in types are part of the XML Schema Namespace. For a complete explanation of the use of prefixes in schema documents, see Section 3.3.5 on p. 52.
The third example uses an in-line anonymous complex type, which is defined entirely within the product declaration. Finally, the fourth element declaration, anything, does not specify a particular type, which means that anything elements can have any well-formed content and any attributes.
For a detailed discussion of using named or anonymous types, see Section 8.2.3 on p. 133.
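Example 6–4 shows an anonymous complex type but not an anonymous simple type. A minimal sketch of the latter follows; the base type and the range of 2 through 18 are assumptions chosen only for illustration.

```xml
<xs:element name="size">
  <!-- Anonymous simple type, defined in-line with a simpleType child -->
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="2"/>
      <xs:maxInclusive value="18"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

As with an anonymous complex type, this type has no name and so cannot be reused by other declarations.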
When an element declaration is local—that is, when it isn’t at the top level of a schema document—you have the choice of putting those element names into the target namespace of the schema or not. Let’s explore the two alternatives.
Example 6–5 shows an instance where all element names are qualified. Every element name has a prefix that maps it to the product namespace.
Example 6–5. Qualified local names
<prod:product xmlns:prod="http://datypic.com/prod">
  <prod:number>557</prod:number>
  <prod:size>10</prod:size>
</prod:product>
Example 6–6, on the other hand, shows an instance where only the root element name, product, is qualified. The other element names have no prefix, and since there is no default namespace declaration, they are not in any namespace.
Example 6–6. Unqualified local names
<prod:product xmlns:prod="http://datypic.com/prod">
  <number>557</number>
  <size>10</size>
</prod:product>
Let’s look at the schemas that would describe these two instances. Example 6–7 shows a schema for the instance in Example 6–5, which has qualified element names.
Example 6–7. Schema for qualified local element names
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod"
           elementFormDefault="qualified">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer"/>
      <xs:element name="size" type="xs:integer"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
The schema document has elementFormDefault set to qualified. As a result, elements conforming to local declarations must use qualified element names in the instance. In this example, the declaration for product is global and the declarations for number and size are local.
To create a schema for the instance in Example 6–6, which has unqualified names, you can simply change the value of elementFormDefault in the schema document to unqualified. Or, since the default value is unqualified, you could simply omit the attribute. In this case, elements conforming to global declarations must still use qualified element names, hence the use of prod:product in the instance.
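A schema document along these lines, sketched here by taking Example 6–7 and omitting elementFormDefault, would describe the instance in Example 6–6:

```xml
<!-- elementFormDefault is omitted, so it defaults to unqualified -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">
  <!-- Global: always qualified in the instance (prod:product) -->
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <!-- Local: unqualified in the instance (number, size) -->
      <xs:element name="number" type="xs:integer"/>
      <xs:element name="size" type="xs:integer"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```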
It is also possible to specify the form on a particular element declaration using a form attribute whose value, like elementFormDefault, is either qualified or unqualified. Example 6–8 shows a revised schema that uses the form attribute on the number element declaration to override elementFormDefault and make it unqualified.
Example 6–8. Using the form attribute
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod"
           elementFormDefault="qualified">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer"
                  form="unqualified"/>
      <xs:element name="size" type="xs:integer"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
A valid instance is shown in Example 6–9.
<prod:product xmlns:prod="http://datypic.com/prod">
  <number>557</number>
  <prod:size>10</prod:size>
</prod:product>
Default namespaces do not mix well with unqualified element names. The instance in Example 6–10 declares the prod namespace as the default namespace. However, this will not work with a schema document where elementFormDefault is set to unqualified, because the default namespace declaration places number and size in the prod namespace, whereas their declarations say they are not in any namespace.
Example 6–10. Invalid mixing of unqualified names and a default namespace
<product xmlns="http://datypic.com/prod">
  <number>557</number>
  <size>10</size>
</product>
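If a default namespace declaration on the root element is unavoidable, one possible (if awkward) workaround is to undeclare the default namespace on each unqualified child with xmlns="". The sketch below assumes a schema in which elementFormDefault is unqualified; using a prefix on the root element only, as in Example 6–6, is usually the more readable choice.

```xml
<product xmlns="http://datypic.com/prod">
  <!-- xmlns="" puts each child back in no namespace -->
  <number xmlns="">557</number>
  <size xmlns="">10</size>
</product>
```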
Although unqualified element names may seem confusing, they do have some advantages when combining multiple namespaces. Section 21.7.3 on p. 575 provides a more complete coverage of the pros and cons of unqualified local names.
Default and fixed values are used to augment an instance by adding values to empty elements. The schema processor will insert a default or fixed value if the element in question is empty. If the element is absent from the instance, it will not be inserted. This is different from the treatment of default and fixed values for attributes.
Default and fixed values are specified by the default and fixed attributes, respectively. Only one of the two attributes (default or fixed) may appear; they are mutually exclusive. Default and fixed values can be specified in element declarations with:
• Simple types
• Complex types with simple content
• Complex types with mixed content, if all children are optional
The default or fixed value must be valid for the type of that element. For example, it is not legal to specify a default value of xyz if the type of the element is integer.
The specification of fixed and default values in element declarations is independent of their occurrence constraints (minOccurs and maxOccurs). Unlike defaulted attributes, a defaulted element may be required (i.e., minOccurs in its declaration may be more than 0). If an element with a default value is required, it may still appear empty and have its default value filled in.
The default value is filled in if the element is empty. Example 6–11 shows the declaration of product with two children, name and size, that have default values specified.
Example 6–11. Specifying an element’s default value
<xs:element name="product">
  <xs:complexType>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="name" type="xs:string" default="N/A"/>
      <xs:element name="size" type="xs:integer" default="12"/>
    </xs:choice>
  </xs:complexType>
</xs:element>
It is important to note that certain types allow an empty value. This includes string, normalizedString, token, and any types derived from them that do not specifically disallow the empty string as a value. Additionally, unrestricted list types allow empty values. For any type that allows an empty string value, the element will never be considered to have that empty string value because the default value will be filled in. However, if an element has the xsi:nil attribute set to true, its default value is not inserted.
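As an illustration, the following instance sketch is based on the declaration in Example 6–11. The comments describe how a schema processor would be expected to treat each child according to the rules above.

```xml
<product>
  <name/>            <!-- empty: treated as "N/A" -->
  <size></size>      <!-- empty: treated as 12 -->
  <size>10</size>    <!-- has a value: the default is not used -->
</product>
```

If product had no children at all, no name or size elements would be added, because defaults are filled in only for elements that are present but empty.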
Table 6–3 describes how element default values are inserted in different situations, based on the declaration in Example 6–11.
Table 6–3. Default value behavior for elements
Fixed values are added in all the same situations as default values. The only difference is that if the element has a value, its value must be equivalent to the fixed value. When the schema processor determines whether the value of the element is in fact equivalent to the fixed value, it takes into account the element’s type.
Table 6–4 shows some valid and invalid instances for elements declared with fixed values. The size element has the type integer, so all forms of the integer “1” are accepted in the instance, including “01”, “+1”, and “ 1 ” surrounded by whitespace. Whitespace around a value is acceptable because the whiteSpace facet value for integer is collapse, meaning that whitespace is stripped before validation takes place. A value that contains only whitespace, like <size> </size>, is not valid because it is not considered empty but also is not equal to 1.
Table 6–4. Elements with fixed values
The name element, on the other hand, has the type string. The string “01” is invalid because it is not considered to be equal to the string “1”. The string “ 1 ” is also invalid because the whiteSpace facet value for string is preserve, meaning that the leading and trailing spaces are kept. For more information on type equality, see Section 11.7 on p. 253.
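The declarations behind this discussion might look like the following sketch. It assumes that both elements are fixed to 1, which the text above implies but does not show.

```xml
<xs:element name="size" type="xs:integer" fixed="1"/>
<xs:element name="name" type="xs:string" fixed="1"/>
```

Under that assumption, a few instance elements would be treated as follows:

```xml
<size>01</size>   <!-- valid: equal to the integer 1 -->
<size/>           <!-- valid: empty, so the fixed value is filled in -->
<size>2</size>    <!-- invalid: not equal to 1 -->
<name>1</name>    <!-- valid: the same string -->
<name>01</name>   <!-- invalid: a different string -->
```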
In some cases, an element may be either absent from an instance or empty (contain no value). The instance shown in Example 6–12 is a purchase order with some absent and empty elements.
<order>
  <giftWrap>ADULT BDAY</giftWrap>
  <customer>
    <name>
      <first>Priscilla</first>
      <middle/>
      <last>Walmsley</last>
    </name>
  </customer>
  <items>
    <shirt>
      <giftWrap/>
      <number>557</number>
      <name>Short-Sleeved Linen Blouse</name>
      <size></size>
    </shirt>
    <umbrella>
      <number>443</number>
      <name>Deluxe Golf Umbrella</name>
      <size></size>
    </umbrella>
  </items>
</order>
There are many possible reasons for an element value to be missing in an instance:
• The information is not applicable: Umbrellas do not come in different sizes.
• We do not know whether the information is applicable: We do not know whether the customer has a middle name.
• It is not relevant to this particular application of the data: The billing application does not care about product sizes.
• It is the default, so it is not specified: The customer’s title should default to “Mr.”
• It actually is present and the value is an empty string: The gift wrap value for the shirt is empty, meaning “none,” which should override the gift wrap value of the order.
• It is erroneously missing because of a user error or technical bug: We should have a size for the shirt.
Different applications treat missing values in different ways. One application might treat an absent element as not applicable, and an empty element as an error. Another might treat an empty element as not applicable, and an absent element as an error. The treatment of missing values may vary within the same schema. In our example, we used a combination of absent and empty elements to signify different reasons for missing values.
XML Schema offers a third method of indicating a missing value: nils. By marking an element as nil, you are telling the processor “I know this element is empty, but I want it to be valid anyway.” The actual reason why it is empty, or what the application should do with it, is entirely up to you. XML Schema does not associate any particular semantics with this absence. It only offers an additional way to express a missing value, with the following benefits:
• You do not have to weaken the type by allowing empty content and/or making attributes optional.
• You are making a deliberate statement that the information does not exist. This is a clearer message than simply omitting the element, which would mean that we do not know if it exists.
• If for some reason an application is relying on that element being there, for example as a placeholder, nil provides a way for it to exist without imparting any additional information.
• You can easily turn off default value processing. The default value for the element will not be added if it is marked as nil.
An approach for our purchase order document is outlined below. It uses nils, derived types, simple type restrictions, and default values to better constrain missing values. The resulting instance is shown in Example 6–13.
Example 6–13. Missing values, revisited
<order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <giftWrap>ADULT BDAY</giftWrap>
  <customer>
    <name>
      <title/> <!--default will be filled in-->
      <first>Priscilla</first>
      <middle xsi:nil="true"/>
      <last>Walmsley</last>
    </name>
  </customer>
  <items>
    <shirt>
      <giftWrap/>
      <number>557</number>
      <name>Short-Sleeved Linen Blouse</name>
      <size></size> <!--INVALID! -->
    </shirt>
    <umbrella>
      <number>443</number>
      <name>Deluxe Golf Umbrella</name>
    </umbrella>
  </items>
</order>
• The information is not applicable: Give shirt and umbrella different types and do not include the size element declaration in UmbrellaType.
• We do not know whether the information is applicable: Make middle nillable and set xsi:nil to true if it is not present.
• It is not relevant to this particular application of the data: Give the billing application a separate schema document and insert a wildcard where the size and other optional element declarations or references may appear.
• It is the default, so it is not specified: Specify a default value of “Mr.” for title.
• It actually is present and the value is an empty string: Allow giftWrap to appear empty.
• It is erroneously missing because of a user error or technical bug: Make size required and make it an integer or other type that does not accept empty values.
This is one of the many reasonable approaches for handling absent values. The important thing is to define a strategy that provides all the information your application needs and ensures that all errors are caught.
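Several of the points above could be expressed with local element declarations along the following lines. This is only a sketch: the declarations are shown out of context, as they might appear inside the relevant complex types, and the choice of xs:string for title, middle, and giftWrap is an assumption.

```xml
<!-- The default is filled in when title appears empty -->
<xs:element name="title" type="xs:string" default="Mr."/>

<!-- May carry xsi:nil="true" when the middle name is unknown -->
<xs:element name="middle" type="xs:string" nillable="true"/>

<!-- xs:string accepts the empty string, so giftWrap may be empty -->
<xs:element name="giftWrap" type="xs:string" minOccurs="0"/>

<!-- Required (minOccurs defaults to 1); an empty value is not a valid integer -->
<xs:element name="size" type="xs:integer"/>
```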
To indicate that the value of an instance element is nil, specify the xsi:nil attribute on that element. Example 6–14 shows five instances of size that use the xsi:nil attribute. The xsi:nil attribute applies to the element in whose tag it appears, not any of the attributes. There is no way to specify that an attribute value is nil.
Example 6–14. xsi:nil in instance elements
<size xsi:nil="true"/>
<size xsi:nil="true"></size>
<size xsi:nil="true" system="US-DRESS"/>
<size xsi:nil="false">10</size>
<size xsi:nil="true">10</size> <!--INVALID! -->
The xsi:nil attribute is in the XML Schema Instance Namespace (http://www.w3.org/2001/XMLSchema-instance). This namespace must be declared in the instance, but it is not necessary to specify a schema location for it. Any schema processor will recognize the xsi:nil attribute of any XML element.
If the xsi:nil attribute appears on an element, and its value is set to true, that element must be empty. It cannot contain any child elements or character data, even if its type requires content. The last instance of the size element in Example 6–14 is invalid because xsi:nil is true but it contains data. However, it is valid for a nil element to have other attributes, as long as they are declared for that type.
In order to allow an element to appear in the instance with the xsi:nil attribute, its element declaration must indicate that it is nillable. Nillability is indicated by setting the nillable attribute in the element declaration to true. Example 6–15 shows an element declaration illustrating this.
Example 6–15. Making size elements nillable
<xs:element name="size" type="xs:integer" nillable="true"/>
Specifying nillable="true" in the declaration allows elements to have the xsi:nil attribute. Otherwise, the xsi:nil attribute cannot appear, even with its value set to false. It is not necessary (or even legal) to separately declare the xsi:nil attribute for the type used in the element declaration. In Example 6–15, we gave size a simple type. Normally this would mean that it cannot have attributes, but the xsi:nil attribute is given special treatment. Elements with either complex or simple types can be nillable.
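A nillable element with a complex type might be declared as in the following sketch. It is modeled on the third instance in Example 6–14, which carries a system attribute; the type xs:token for that attribute is an assumption.

```xml
<xs:element name="size" nillable="true">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:integer">
        <!-- Declared attribute: may appear even when the element is nil -->
        <xs:attribute name="system" type="xs:token"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
```

With this declaration, an instance such as <size xsi:nil="true" system="US-DRESS"/> would be valid, while xsi:nil itself still needs no declaration in the type.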
If nillable is set to true, a fixed value may not be specified in the declaration. However, it is legal to specify a default value. If an element has xsi:nil set to true, the default value is not filled in even though the element is empty.
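A minimal sketch of a nillable declaration that also carries a default follows. The default value of 12 is borrowed from Example 6–11 purely for illustration.

```xml
<xs:element name="size" type="xs:integer" nillable="true" default="12"/>
```

Assuming the xsi prefix is declared in the instance, the two empty elements below would be treated differently:

```xml
<size/>                  <!-- empty and not nil: treated as 12 -->
<size xsi:nil="true"/>   <!-- nil: the default is not filled in -->
```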
Elements should not be declared nillable if they will ever be used as fields in an identity constraint, such as a key or a uniqueness constraint. See Section 17.7.2 on p. 434 for more information on identity constraint fields.