Chapter 6. Element declarations

This chapter covers the basic building blocks of XML: elements. It explains how to use element declarations to assign names and types to elements. It also describes element properties that can be set via element declarations, such as default and fixed values, nillability, and qualified versus unqualified name forms.

6.1. Global and local element declarations

Element declarations are used to assign names and types to elements. This is accomplished using an element element. Element declarations can be either global or local.

6.1.1. Global element declarations

Global element declarations appear at the top level of the schema document, meaning that their parent must be schema. These global element declarations can then be used in multiple complex types, as described in Section 12.4.2 on p. 267. Table 6–1 shows the syntax for a global element declaration.

Table 6–1. XSD Syntax: global element declaration

Example 6–1 shows two global element declarations: name and size. A complex type is then defined which references these element declarations by name using the ref attribute.

The qualified names used by global element declarations must be unique in the schema. This includes not just the schema document in which they appear, but also any other schema documents that are used with it.

The name specified in an element declaration must be an XML non-colonized name, which means that it must start with a letter or underscore (_), and may only contain letters, digits, underscores (_), hyphens (-), and periods (.). The qualified element name consists of the target namespace of the schema document, plus the local name in the declaration. In Example 6–1, the name and size element declarations take on the target namespace http://datypic.com/prod.

Example 6–1. Global element declarations

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">

  <xs:element name="name" type="xs:string"/>
  <xs:element name="size" type="xs:integer"/>

  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element ref="name"/>
      <xs:element ref="size" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>

Since globally declared element names are qualified by the target namespace of the schema document, it is not legal to include a namespace prefix in the value of the name attribute, as shown in Example 6–2. If you want to declare elements in a different namespace, you must create a separate schema document with that target namespace and import it into the original schema document.

Example 6–2. Illegal attempt to prefix an element name

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:prod="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">
  <xs:element name="name" type="xs:string"/>
  <xs:element name="prod:size" type="xs:integer"/>
</xs:schema>
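
The legal alternative, a separate schema document that is imported, might look like the following sketch. The second namespace http://datypic.com/extra, the extra prefix, and the file names extra.xsd and prod.xsd are assumptions made for illustration; they do not appear in the original examples.

```xml
<!-- extra.xsd: hypothetical schema document with its own target namespace -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://datypic.com/extra">
  <xs:element name="size" type="xs:integer"/>
</xs:schema>

<!-- prod.xsd: imports extra.xsd and references its size declaration -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           xmlns:extra="http://datypic.com/extra"
           targetNamespace="http://datypic.com/prod">

  <xs:import namespace="http://datypic.com/extra"
             schemaLocation="extra.xsd"/>

  <xs:element name="name" type="xs:string"/>

  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element ref="name"/>
      <xs:element ref="extra:size" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

The prefix appears only in the ref attribute of the importing document; the name attribute in extra.xsd remains a non-colonized name.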

Occurrence constraints (minOccurs and maxOccurs) appear in an element reference rather than the global element declaration. This is because they are related to the appearance of an element in a particular content model. Element references are covered in Section 12.4.2 on p. 267.

6.1.2. Local element declarations

Local element declarations, on the other hand, appear entirely within a complex type definition. Local element declarations can only be used in that type definition, never referenced by other complex types or used in a substitution group. Table 6–2 shows the syntax for a local element declaration.

Table 6–2. XSD Syntax: local element declaration

Example 6–3 shows two local element declarations, name and size, which appear entirely within a complex type definition.

Occurrence constraints (minOccurs and maxOccurs) can appear in local element declarations. Some attributes, namely substitutionGroup, final, and abstract, are valid in global element declarations but not in local element declarations. This is because these attributes all relate to substitution groups, in which local element declarations cannot participate.

The name specified in a local element declaration must also be an XML non-colonized name. If its form is qualified, it takes on the target namespace of the schema document. If it is unqualified, it is considered to be in no namespace. See Section 6.3 on p. 98 for more information.

Example 6–3. Local element declarations

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="size" type="xs:integer" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Names used in local element declarations are scoped to the complex type within which they are declared. You can have two completely different local element declarations with the same element name, as long as they are in different complex types. You can also have two local element declarations with the same element name in the same complex type, provided that they themselves have the same type. This is explained further in Section 12.4.3 on p. 268.

6.1.3. Design hint: Should I use global or local element declarations?

Use global element declarations if:

• The element declaration could ever apply to the root element during validation. Such a declaration should be global so that the schema processor can access it.

• You want to use the exact same element declaration in more than one complex type.

• You want to use the element declaration in a substitution group. Local element declarations cannot participate in substitution groups (see Chapter 16).

Use local element declarations if:

• You want to allow unqualified element names in the instance. In this case, make all of the element declarations local except for the root element declaration. If you mix global and local declarations, and you want the element names in local declarations to be unqualified, you will require your instance authors to know which element declarations are global and which are local. Global element names are always qualified in the instance (see Section 6.3 on p. 98).

• You want to have several element declarations with the same name but different types or other properties. Using local declarations, you can have two element declarations for size: One that is a child of shoe has the type ShoeSizeType, and one that is a child of hat has the type HatSizeType. If the size declaration is global, it can only occur once, and therefore use only one type, in that schema. The same holds true for default and fixed values as well as nillability.
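
The shoe and hat case in the last point could be sketched as follows. The contents of ShoeSizeType and HatSizeType are assumptions chosen only to make the two types visibly different; the text names the types but does not define them.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">

  <!-- Hypothetical definitions; the restrictions are illustrative only -->
  <xs:simpleType name="ShoeSizeType">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="20"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="HatSizeType">
    <xs:restriction base="xs:token">
      <xs:enumeration value="S"/>
      <xs:enumeration value="M"/>
      <xs:enumeration value="L"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Two local declarations named size, each scoped to its own type -->
  <xs:element name="shoe">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="size" type="ShoeSizeType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="hat">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="size" type="HatSizeType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Had size been declared globally, only one of these two types could have been assigned to it.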

6.2. Declaring the types of elements

Regardless of whether they are local or global, all element declarations associate an element name with a type, which may be either simple or complex. There are four ways to associate a type with an element name:

1. Reference a named type by specifying the type attribute in the element declaration. This may be either a built-in type or a user-defined type.

2. Define an anonymous type by specifying either a simpleType or a complexType child.

3. Use no particular type, by specifying neither a type attribute nor a simpleType or complexType child. In this case, the actual type is anyType, which allows any children and/or character data content, and any attributes, as long as it is well-formed XML.

4. Define one or more type alternatives using alternative children. This more advanced feature of version 1.1 is described separately in Section 14.2 on p. 375.

Example 6–4 shows four element declarations with different type assignment methods.

Example 6–4. Assigning types to elements

<xs:element name="size" type="SizeType"/>

<xs:element name="name" type="xs:string"/>

<xs:element name="product">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="name"/>
      <xs:element ref="size"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="anything"/>

The first example uses the type attribute to specify SizeType as the type of size. The second example also uses the type attribute, this time to assign a built-in type string to name. The xs prefix is used because built-in types are part of the XML Schema Namespace. For a complete explanation of the use of prefixes in schema documents, see Section 3.3.5 on p. 52.

The third example uses an in-line anonymous complex type, which is defined entirely within the product declaration. Finally, the fourth element declaration, anything, does not specify a particular type, which means that anything elements can have any well-formed content and any attributes.
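
An anonymous type can also be a simple type. As a minimal sketch, the declaration below gives size an anonymous simple type by restriction; the bounds 2 and 18 are arbitrary values assumed for illustration.

```xml
<xs:element name="size">
  <xs:simpleType>
    <!-- Illustrative bounds only; not taken from the examples above -->
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="2"/>
      <xs:maxInclusive value="18"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```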

For a detailed discussion of using named or anonymous types, see Section 8.2.3 on p. 133.

6.3. Qualified vs. unqualified forms

When an element declaration is local (that is, when it is not at the top level of a schema document), you can choose whether or not to put its element name into the target namespace of the schema. Let’s explore the two alternatives.

6.3.1. Qualified local names

Example 6–5 shows an instance where all element names are qualified. Every element name has a prefix that maps it to the product namespace.

Example 6–5. Qualified local names

<prod:product xmlns:prod="http://datypic.com/prod">
  <prod:number>557</prod:number>
  <prod:size>10</prod:size>
</prod:product>

6.3.2. Unqualified local names

Example 6–6, on the other hand, shows an instance where only the root element name, product, is qualified. The other element names have no prefix, and since there is no default namespace declaration, they are not in any namespace.

Example 6–6. Unqualified local names

<prod:product xmlns:prod="http://datypic.com/prod">
  <number>557</number>
  <size>10</size>
</prod:product>

6.3.3. Using elementFormDefault

Let’s look at the schemas that would describe these two instances. Example 6–7 shows a schema for the instance in Example 6–5, which has qualified element names.

Example 6–7. Schema for qualified local element names

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod"
           elementFormDefault="qualified">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer"/>
      <xs:element name="size" type="xs:integer"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

The schema document has elementFormDefault set to qualified. As a result, elements conforming to local declarations must use qualified element names in the instance. In this example, the declaration for product is global and the declarations for number and size are local.

To create a schema for the instance in Example 6–6, which has unqualified names, you can simply change the value of elementFormDefault in the schema document to unqualified. Or, since the default value is unqualified, you could simply omit the attribute. In this case, elements conforming to global declarations must still use qualified element names—hence the use of prod:product in the instance.
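
For comparison, a schema for the unqualified instance in Example 6–6 might look like the sketch below. It is the schema of Example 6–7 with the elementFormDefault attribute omitted, so the default value of unqualified applies.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod">
  <!-- Global declaration: product is still qualified in the instance -->
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <!-- Local declarations: number and size are in no namespace -->
      <xs:element name="number" type="xs:integer"/>
      <xs:element name="size" type="xs:integer"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```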

6.3.4. Using form

It is also possible to specify the form on a particular element declaration using a form attribute whose value, like elementFormDefault, is either qualified or unqualified. Example 6–8 shows a revised schema that uses the form attribute on the number element declaration to override elementFormDefault and make it unqualified.

Example 6–8. Using the form attribute

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns="http://datypic.com/prod"
           targetNamespace="http://datypic.com/prod"
           elementFormDefault="qualified">
  <xs:element name="product" type="ProductType"/>
  <xs:complexType name="ProductType">
    <xs:sequence>
      <xs:element name="number" type="xs:integer"
                  form="unqualified"/>
      <xs:element name="size" type="xs:integer"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

A valid instance is shown in Example 6–9.

Example 6–9. Overridden form

<prod:product xmlns:prod="http://datypic.com/prod">
  <number>557</number>
  <prod:size>10</prod:size>
</prod:product>

6.3.5. Default namespaces and unqualified names

Default namespaces do not mix well with unqualified element names. The instance in Example 6–10 declares the prod namespace as the default namespace. However, this will not work with a schema document where elementFormDefault is set to unqualified: the default namespace declaration puts number and size in the prod namespace, so the processor looks unsuccessfully for declarations of those names in that namespace, whereas the local declarations are in fact not in any namespace.

Example 6–10. Invalid mixing of unqualified names and a default namespace

<product xmlns="http://datypic.com/prod">
  <number>557</number>
  <size>10</size>
</product>
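
If a default namespace is wanted on the root element anyway, one possible workaround (not taken from the examples above) is to undeclare it on each child with xmlns="", which the Namespaces in XML recommendation permits. The sketch below should be valid against a schema whose local declarations are unqualified, although it is arguably harder to read than the prefixed form in Example 6–6.

```xml
<product xmlns="http://datypic.com/prod">
  <!-- xmlns="" removes the default namespace for this element,
       so number and size are in no namespace -->
  <number xmlns="">557</number>
  <size xmlns="">10</size>
</product>
```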

Although unqualified element names may seem confusing, they do have some advantages when combining multiple namespaces. Section 21.7.3 on p. 575 provides a more complete coverage of the pros and cons of unqualified local names.

6.4. Default and fixed values

Default and fixed values are used to augment an instance by adding values to empty elements. The schema processor will insert a default or fixed value if the element in question is empty. If the element is absent from the instance, it will not be inserted. This is different from the treatment of default and fixed values for attributes.

Default and fixed values are specified by the default and fixed attributes, respectively. Only one of the two attributes (default or fixed) may appear; they are mutually exclusive. Default and fixed values can be specified in element declarations with:

• Simple types

• Complex types with simple content

• Complex types with mixed content, if all children are optional

The default or fixed value must be valid for the type of that element. For example, it is not legal to specify a default value of xyz if the type of the element is integer.

The specification of fixed and default values in element declarations is independent of their occurrence constraints (minOccurs and maxOccurs). Unlike defaulted attributes, a defaulted element may be required (i.e., minOccurs in its declaration may be more than 0). If an element with a default value is required, it may still appear empty and have its default value filled in.
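
As a minimal sketch of a required element with a default value, consider the following declaration and two instances. The content model is an assumption made for illustration; the value 12 is arbitrary.

```xml
<xs:element name="product">
  <xs:complexType>
    <xs:sequence>
      <!-- minOccurs defaults to 1, so size is required -->
      <xs:element name="size" type="xs:integer" default="12"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<!-- Valid instance: size is present but empty, so the processor
     treats its value as 12 -->
<product>
  <size/>
</product>

<!-- Invalid instance: size is required, and an absent element
     is never inserted -->
<product/>
```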

6.4.1. Default values

The default value is filled in if the element is empty. Example 6–11 shows the declaration of product with two children, name and size, that have default values specified.

Example 6–11. Specifying an element’s default value

<xs:element name="product">
  <xs:complexType>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="name" type="xs:string" default="N/A"/>
      <xs:element name="size" type="xs:integer" default="12"/>
    </xs:choice>
  </xs:complexType>
</xs:element>

It is important to note that certain types allow an empty value. This includes string, normalizedString, token, and any types derived from them that do not specifically disallow the empty string as a value. Additionally, unrestricted list types allow empty values. For any type that allows an empty string value, the element will never be considered to have that empty string value because the default value will be filled in. However, if an element has the xsi:nil attribute set to true, its default value is not inserted.
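
The instance below is a sketch of how these rules might play out for the declaration in Example 6–11. The product name Blouse is an arbitrary value assumed for illustration.

```xml
<product>
  <name/>               <!-- empty: treated as having the value N/A -->
  <size></size>         <!-- empty: treated as having the value 12 -->
  <name>Blouse</name>   <!-- has a value: the default is not used -->
</product>

<!-- No children at all: still valid, because the choice group is
     optional, but no name or size element is inserted -->
<product/>
```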

Table 6–3 describes how element default values are inserted in different situations, based on the declaration in Example 6–11.

Table 6–3. Default value behavior for elements

6.4.2. Fixed values

Fixed values are added in all the same situations as default values. The only difference is that if the element has a value, its value must be equivalent to the fixed value. When the schema processor determines whether the value of the element is in fact equivalent to the fixed value, it takes into account the element’s type.

Table 6–4 shows some valid and invalid instances for elements declared with fixed values. The size element has the type integer, so all forms of the integer “1” are accepted in the instance, including “01”, “+1”, and “ 1 ” surrounded by whitespace. Whitespace around a value is acceptable because the whiteSpace facet value for integer is collapse, meaning that whitespace is stripped before validation takes place. A value that contains only whitespace, like <size> </size>, is not valid because it is not considered empty but also is not equal to 1.

Table 6–4. Elements with fixed values

The name element, on the other hand, has the type string. The string “01” is invalid because it is not considered to be equal to the string “1”. The string “ 1 ” is also invalid because the whiteSpace facet value for string is preserve, meaning that the leading and trailing spaces are kept. For more information on type equality, see Section 11.7 on p. 253.
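
The cases discussed for fixed values can be summarized in instance form. This sketch assumes the two declarations shown in the leading comment, which are consistent with the description above; the value 2 is an extra case added for illustration.

```xml
<!-- Assumed declarations:
     <xs:element name="size" type="xs:integer" fixed="1"/>
     <xs:element name="name" type="xs:string" fixed="1"/> -->

<size>1</size>      <!-- valid -->
<size>01</size>     <!-- valid: equal to the integer 1 -->
<size>+1</size>     <!-- valid: equal to the integer 1 -->
<size> 1 </size>    <!-- valid: whitespace is collapsed for integer -->
<size/>             <!-- valid: empty, so the fixed value is filled in -->
<size> </size>      <!-- invalid: not empty, and not equal to 1 -->
<size>2</size>      <!-- invalid: not equal to 1 -->

<name>1</name>      <!-- valid -->
<name/>             <!-- valid: empty, so the fixed value is filled in -->
<name>01</name>     <!-- invalid: not equal to the string "1" -->
<name> 1 </name>    <!-- invalid: whitespace is preserved for string -->
```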

6.5. Nils and nillability

In some cases, an element may be either absent from an instance or empty (contain no value). The instance shown in Example 6–12 is a purchase order with some absent and empty elements.

Example 6–12. Missing values

<order>
  <giftWrap>ADULT BDAY</giftWrap>
  <customer>
    <name>
      <first>Priscilla</first>
      <middle/>
      <last>Walmsley</last>
    </name>
  </customer>
  <items>
    <shirt>
      <giftWrap/>
      <number>557</number>
      <name>Short-Sleeved Linen Blouse</name>
      <size></size>
    </shirt>
    <umbrella>
      <number>443</number>
      <name>Deluxe Golf Umbrella</name>
      <size></size>
    </umbrella>
  </items>
</order>

There are many possible reasons for an element value to be missing in an instance:

• The information is not applicable: Umbrellas do not come in different sizes.

• We do not know whether the information is applicable: We do not know whether the customer has a middle name.

• It is not relevant to this particular application of the data: The billing application does not care about product sizes.

• It is the default, so it is not specified: The customer’s title should default to “Mr.”

• It actually is present and the value is an empty string: The gift wrap value for the shirt is empty, meaning “none,” which should override the gift wrap value of the order.

• It is erroneously missing because of a user error or technical bug: We should have a size for the shirt.

Different applications treat missing values in different ways. One application might treat an absent element as not applicable, and an empty element as an error. Another might treat an empty element as not applicable, and an absent element as an error. The treatment of missing values may vary within the same schema. In our example, we used a combination of absent and empty elements to signify different reasons for missing values.

XML Schema offers a third method of indicating a missing value: nils. By marking an element as nil, you are telling the processor “I know this element is empty, but I want it to be valid anyway.” The actual reason why it is empty, or what the application should do with it, is entirely up to you. XML Schema does not associate any particular semantics with this absence. It only offers an additional way to express a missing value, with the following benefits:

• You do not have to weaken the type by allowing empty content and/or making attributes optional.

• You are making a deliberate statement that the information does not exist. This is a clearer message than simply omitting the element, which would mean that we do not know if it exists.

• If for some reason an application is relying on that element being there, for example as a placeholder, nil provides a way for it to exist without imparting any additional information.

• You can easily turn off default value processing. The default value for the element will not be added if it is marked as nil.

An approach for our purchase order document is outlined below. It uses nils, derived types, simple type restrictions, and default values to better constrain missing values. The resulting instance is shown in Example 6–13.

Example 6–13. Missing values, revisited

<order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <giftWrap>ADULT BDAY</giftWrap>
  <customer>
    <name>
      <title/>
      <first>Priscilla</first>
      <middle xsi:nil="true"/>
      <last>Walmsley</last>
    </name>
  </customer>
  <items>
    <shirt>
      <giftWrap/>
      <number>557</number>
      <name>Short-Sleeved Linen Blouse</name>
      <size></size>
    </shirt>
    <umbrella>
      <number>443</number>
      <name>Deluxe Golf Umbrella</name>
    </umbrella>
  </items>
</order>

• The information is not applicable: Give shirt and umbrella different types and do not include the size element declaration in UmbrellaType.

• We do not know whether the information is applicable: Make middle nillable and set xsi:nil to true if it is not present.

• It is not relevant to this particular application of the data: Give the billing application a separate schema document and insert a wildcard where the size and other optional element declarations or references may appear.

• It is the default, so it is not specified: Specify a default value of “Mr.” for title.

• It actually is present and the value is an empty string: Allow giftWrap to appear empty.

• It is erroneously missing because of a user error or technical bug: Make size required and make it an integer or other type that does not accept empty values.

This is one of the many reasonable approaches for handling absent values. The important thing is to define a strategy that provides all the information your application needs and ensures that all errors are caught.
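
A partial schema for this strategy might look like the following sketch. The type names NameType and ShirtType, the use of xs:string for most elements, and the absence of a target namespace are assumptions; only UmbrellaType is named in the text. The declarations for order, customer, and items are left out for brevity.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="NameType">
    <xs:sequence>
      <!-- An empty title is treated as "Mr." -->
      <xs:element name="title" type="xs:string" default="Mr."/>
      <xs:element name="first" type="xs:string"/>
      <!-- middle may carry xsi:nil="true" when it is not known -->
      <xs:element name="middle" type="xs:string" nillable="true"/>
      <xs:element name="last" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="ShirtType">
    <xs:sequence>
      <!-- string accepts an empty value, here meaning "none" -->
      <xs:element name="giftWrap" type="xs:string" minOccurs="0"/>
      <xs:element name="number" type="xs:integer"/>
      <xs:element name="name" type="xs:string"/>
      <!-- required, and integer rejects an empty value -->
      <xs:element name="size" type="xs:integer"/>
    </xs:sequence>
  </xs:complexType>

  <!-- No size declaration: it is not applicable to umbrellas -->
  <xs:complexType name="UmbrellaType">
    <xs:sequence>
      <xs:element name="giftWrap" type="xs:string" minOccurs="0"/>
      <xs:element name="number" type="xs:integer"/>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

Under these assumptions, the empty size element of the shirt in Example 6–13 would be reported as an error, which is the intended outcome for a value that is erroneously missing.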

6.5.1. Using xsi:nil in an instance

To indicate that the value of an instance element is nil, specify the xsi:nil attribute on that element. Example 6–14 shows five instances of size that use the xsi:nil attribute. The xsi:nil attribute applies to the element in whose tag it appears, not any of the attributes. There is no way to specify that an attribute value is nil.

Example 6–14. xsi:nil in instance elements

<size xsi:nil="true"/>
<size xsi:nil="true"></size>
<size xsi:nil="true" system="US-DRESS"/>
<size xsi:nil="false">10</size>
<size xsi:nil="true">10</size>

The xsi:nil attribute is in the XML Schema Instance Namespace (http://www.w3.org/2001/XMLSchema-instance). This namespace must be declared in the instance, but it is not necessary to specify a schema location for it. Any schema processor will recognize the xsi:nil attribute of any XML element.

If the xsi:nil attribute appears on an element, and its value is set to true, that element must be empty. It cannot contain any child elements or character data, even if its type requires content. The last instance of the size element in Example 6–14 is invalid because xsi:nil is true but it contains data. However, it is valid for a nil element to have other attributes, as long as they are declared for that type.

6.5.2. Making elements nillable

In order to allow an element to appear in the instance with the xsi:nil attribute, its element declaration must indicate that it is nillable. Nillability is indicated by setting the nillable attribute in the element declaration to true. Example 6–15 shows an element declaration illustrating this.

Example 6–15. Making size elements nillable

<xs:element name="size" type="xs:integer" nillable="true"/>

Specifying nillable="true" in the declaration allows elements to have the xsi:nil attribute. Otherwise, the xsi:nil attribute cannot appear, even with its value set to false. It is not necessary (or even legal) to separately declare the xsi:nil attribute for the type used in the element declaration. In Example 6–15, we gave size a simple type. Normally this would mean that it cannot have attributes, but the xsi:nil attribute is given special treatment. Elements with either complex or simple types can be nillable.

If nillable is set to true, a fixed value may not be specified in the declaration. However, it is legal to specify a default value. If an element has xsi:nil set to true, the default value is not filled in even though the element is empty.
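
The interaction between nillable and default can be sketched as follows. The default value 12 is an arbitrary choice, and the xsi prefix is assumed to be bound to the XML Schema Instance Namespace in the enclosing instance.

```xml
<xs:element name="size" type="xs:integer"
            nillable="true" default="12"/>

<!-- Instances conforming to the declaration above -->
<size/>                  <!-- empty, not nil: the default value 12 applies -->
<size xsi:nil="true"/>   <!-- nil: no default value is filled in -->
<size>10</size>          <!-- has a value: the default is not used -->
```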

Elements should not be declared nillable if they will ever be used as fields in an identity constraint, such as a key or a uniqueness constraint. See Section 17.7.2 on p. 434 for more information on identity constraint fields.