Substitution groups are a flexible way to designate element declarations as substitutes for other element declarations in content models. You can easily designate new element declarations as substitutes, from other schema documents and even other namespaces, without changing the original content model. This chapter describes how to define and use substitution groups.
Substitution groups are useful for simplifying content models and making them more extensible and flexible. Suppose you have a section of a purchase order that lists products of various kinds. You could use repeating product elements, each having an attribute or child element to indicate what kind of a product it is. However, you may also want to allow different content models for different kinds of products. For example, shirts have a mandatory size, while umbrellas are not allowed to have a size specified. Also, you may want to use descriptive element names that indicate the kind of product. Lastly, you may want the definition to be flexible enough to accept new kinds of products without altering the original schema. This is a perfect application for substitution groups.
Each substitution group consists of a head and one or more members. Wherever the head element declaration is referenced in a content model, one of the member element declarations may be substituted in place of the head. For example, the head of your substitution group might be product, with the members being the different kinds of products such as shirt, hat, and umbrella. This hierarchy is depicted in Figure 16–1.
Figure 16–1. Substitution group hierarchy
This means that anywhere product appears in a content model, any of product, shirt, hat, or umbrella may appear in the instance. The members themselves cannot be substituted for each other. For example, if shirt appears in a content model, umbrella cannot be substituted in its place.
Substitution groups form a hierarchy. There can be multiple levels of substitution, and a member of one group may be the head of another group. Other element declarations might have shirt as their substitution group head, as shown in Figure 16–2. In this case, tShirt and blouse may substitute for either product or shirt.
Figure 16–2. Multilevel substitution group hierarchy
Example 16–1 shows the ItemsType complex type that contains a product element declaration. The product element declaration will be the head of the substitution group, although there is nothing special about the product declaration to indicate this. It is significant, however, that it is a global declaration, since only a global element declaration can be the head of a substitution group.
Example 16–1. The head of a substitution group
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="items" type="ItemsType"/>
<xs:complexType name="ItemsType">
<xs:sequence>
<xs:element ref="product" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<xs:element name="product" type="ProductType"/>
<xs:complexType name="ProductType">
<xs:sequence>
<xs:element name="number" type="xs:integer"/>
<xs:element name="name" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
Example 16–2 shows the three element declarations that are members of the substitution group. The product, shirt, hat, and umbrella element declarations can be used interchangeably wherever product appears in any content model. Each of the declarations uses the substitutionGroup attribute to indicate that it is substitutable for product. Members of a substitution group must be globally declared; it is not legal to use the substitutionGroup attribute in local element declarations or element references.
Example 16–2. Members of a substitution group
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="shirt" type="ShirtType"
substitutionGroup="product"/>
<xs:complexType name="ShirtType">
<xs:complexContent>
<xs:extension base="ProductType">
<xs:sequence>
<xs:element name="size" type="ShirtSizeType"/>
<xs:element name="color" type="ColorType"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:element name="hat" substitutionGroup="product">
<xs:complexType>
<xs:complexContent>
<xs:extension base="ProductType">
<xs:sequence>
<xs:element name="size" type="HatSizeType"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
</xs:element>
<xs:element name="umbrella" substitutionGroup="product"/>
<!--...-->
</xs:schema>
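The multilevel hierarchy of Figure 16–2 could be declared along the following lines. This is only a sketch: the figure gives the element names tShirt and blouse but not their types, so the declarations below assume they simply omit the type attribute and take the type of their head, shirt.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- shirt is a member of the product group (as in Example 16–2)
       and, at the same time, the head of its own group -->
  <xs:element name="shirt" type="ShirtType"
              substitutionGroup="product"/>
  <!-- Assumed declarations: with no type attribute, each member
       takes on the type of its head, here ShirtType -->
  <xs:element name="tShirt" substitutionGroup="shirt"/>
  <xs:element name="blouse" substitutionGroup="shirt"/>
  <!--...-->
</xs:schema>
```

With these declarations, tShirt and blouse could appear wherever either shirt or product is referenced.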
Example 16–3 shows a valid instance. Since items can contain an unlimited number of product elements, any combination of product, shirt, hat, and umbrella may appear in items, in any order. Keep in mind that everywhere a reference to the global product element declaration appears in a content model, it can be replaced by these other element declarations because the substitution group is in effect. If there is a content model where you only want product elements to be valid, with no substitution, you can get around this by supplying a local product element declaration in that content model.
Example 16–3. Instance of items
<items>
<product>
<number>999</number>
<name>Special Seasonal</name>
</product>
<shirt>
<number>557</number>
<name>Short-Sleeved Linen Blouse</name>
<size>10</size>
<color value="blue"/>
</shirt>
<hat>
<number>563</number>
<name>Ten-Gallon Hat</name>
<size>L</size>
</hat>
<umbrella>
<number>443</number>
<name>Deluxe Golf Umbrella</name>
</umbrella>
</items>
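The local-declaration workaround mentioned above might look like the following sketch. The type name StrictItemsType is hypothetical and does not appear elsewhere in this chapter. Because product is declared locally with name rather than referenced with ref, the substitution group does not apply, and shirt, hat, and umbrella would not be accepted in this content model.

```xml
<xs:complexType name="StrictItemsType">
  <xs:sequence>
    <!-- Local declaration (name, not ref): only product is allowed here -->
    <xs:element name="product" type="ProductType"
                maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>
```

This sketch assumes a schema with no target namespace, as in the examples above. In a schema with a target namespace, whether a local element is namespace-qualified depends on the form and elementFormDefault settings, so the instance may differ from one that uses the global declaration.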
The substitutionGroup attribute takes a QName as its value. This means that if the head’s element name is in a namespace (i.e., the schema document in which it is declared has a target namespace), you must prefix the element name you specify in the substitutionGroup attribute. You can have an element declaration from a different namespace as the head of your substitution group, provided that the namespace of that element declaration has been imported into your schema document.
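As an illustration of a head in another namespace, consider the sketch below. The namespace names, the prod prefix, and the file name prod.xsd are all assumptions made for this example; it also assumes that prod.xsd declares a global product element and a ProductType complex type in its target namespace.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:prod="http://example.com/prod"
           xmlns="http://example.com/apparel"
           targetNamespace="http://example.com/apparel">
  <!-- The head's namespace must be imported before it can be referenced -->
  <xs:import namespace="http://example.com/prod"
             schemaLocation="prod.xsd"/>
  <!-- The head is named with a prefixed QName -->
  <xs:element name="shirt" type="ShirtType"
              substitutionGroup="prod:product"/>
  <xs:complexType name="ShirtType">
    <xs:complexContent>
      <xs:extension base="prod:ProductType">
        <xs:sequence>
          <xs:element name="size" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```

Note that the original schema document (prod.xsd in this sketch) does not need to change for shirt to become substitutable for its product element.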
In Example 16–2, the complex types of shirt and hat are both derived from the type of product. This is a requirement; members of a substitution group must have types that are either the same as the type of the head, or derived from it by either extension or restriction. They can be directly derived from it, or derived indirectly through multiple levels of restriction and/or extension.
In our example, shirt is assigned a named type, ShirtType, which extends ProductType, while hat has an anonymous type, also an extension of ProductType. The third element declaration, umbrella, does not specify a type. If a substitution group member is specified without a type, it automatically takes on the type of the head of its substitution group. Therefore, in this case, umbrella has the type ProductType.
This type constraint on the members of a substitution group is not as restrictive as it seems. You can make the type of the head very generic, allowing almost anything to be derived from it. In fact, you do not have to specify a type in the head element declaration at all, which gives it the generic anyType. Since all types are derived (directly or indirectly) from anyType, the members of the substitution group in this case can have any types, including simple types.
The previous examples in this chapter use complex types, but substitution groups may also be used for element declarations with simple types. If a member of a substitution group has a simple type, it must be a restriction of (or the same as) the simple type of the head.
Example 16–4 shows a substitution group of element declarations with simple types. Note that the head element declaration, number, does not specify a type, meaning that the members of the substitution group may have any type.
Example 16–4. Substitution group with simple types
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="number"/>
<xs:element name="skuNumber" type="xs:string"
substitutionGroup="number"/>
<xs:element name="productID" type="xs:integer"
substitutionGroup="number"/>
</xs:schema>
In version 1.0, each element declaration can be a member of only one substitution group. In version 1.1, an element declaration can be a member of several substitution groups. This is done by specifying a space-separated list of names as the value of the substitutionGroup attribute, as shown in Example 16–5.
Example 16–5. A member of two substitution groups
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="product" type="ProductType"/>
<xs:element name="discontinuedProduct" type="ProductType"/>
<xs:element name="hat" type="HatType"
substitutionGroup="product"/>
<xs:element name="shirt" type="ShirtType"
substitutionGroup="product"/>
<xs:element name="umbrella" type="UmbrellaType"
substitutionGroup="product discontinuedProduct"/>
<!--...-->
</xs:schema>
In this example, there are two head element declarations: product and discontinuedProduct. The hat and shirt declarations are in just the product substitution group, but umbrella is in both. This means that umbrella can appear anywhere either of these two elements can appear. The type restrictions described in the previous section still apply, so it is generally necessary for the two head element declarations to use the same type (which can be anyType) or types that are related to each other by derivation.
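As a rough sketch of what this means for content models, assume two hypothetical complex types, CatalogType and ArchiveType (neither appears in the examples above), each referencing one of the two heads. Under version 1.1, and given the declarations in Example 16–5, umbrella would be accepted in both, while hat and shirt would be accepted only in the first.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Accepts product, hat, shirt, and umbrella -->
  <xs:complexType name="CatalogType">
    <xs:sequence>
      <xs:element ref="product" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <!-- Accepts discontinuedProduct and umbrella only -->
  <xs:complexType name="ArchiveType">
    <xs:sequence>
      <xs:element ref="discontinuedProduct" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <!--...-->
</xs:schema>
```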
Substitution groups are very useful, but as you may have guessed, there are other methods of achieving similar goals. This section will take a closer look at two of these methods.
The behavior of substitution groups is similar to that of named choice groups. In the previous examples, we said that wherever product can appear, it can really be product, shirt, hat, or umbrella. This choice can also be represented by a named choice group that lists the relevant element declarations. Example 16–6 shows the definition of a named model group that allows a choice of product or shirt or hat or umbrella. This named model group is then referenced in the ItemsType definition.
It is easy to see the list of elements that are allowed, because they are all declared within the named model group. This can be an advantage if the list of member element declarations will not change. On the other hand, if you want to be able to add new element declarations as needed, from a variety of schema documents, using substitution groups is a much better approach. This is because named choice groups are more rigid. While you can use redefining or overriding to extend a named choice group, it is more cumbersome and can only be done in schema documents with the same target namespace.
Example 16–6. Using a choice group
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="items" type="ItemsType"/>
<xs:complexType name="ItemsType">
<xs:group ref="ProductGroup" maxOccurs="unbounded"/>
</xs:complexType>
<xs:group name="ProductGroup">
<xs:choice>
<xs:element name="product" type="ProductType"/>
<xs:element name="shirt" type="ShirtType"/>
<xs:element name="hat" type="HatType"/>
<xs:element name="umbrella" type="ProductType"/>
</xs:choice>
</xs:group>
<!--...-->
</xs:schema>
Another alternative to using substitution groups is to repeat the same element name for all of the items (in this case product), and use xsi:type attributes to distinguish between the different types of products. Using this approach, we would not declare shirt, hat, or umbrella elements at all, just their types, as shown in Example 16–7. Remember, it is acceptable to substitute a derived type in an instance if you specify the xsi:type attribute. This is described in more detail in Section 13.6 on p. 341.
Example 16–7. Defining derived types
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="ShirtType">
<xs:complexContent>
<xs:extension base="ProductType">
<xs:sequence>
<xs:element name="size" type="ShirtSizeType"/>
<xs:element name="color" type="ColorType"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="HatType">
<xs:complexContent>
<xs:extension base="ProductType">
<xs:sequence>
<xs:element name="size" type="HatSizeType"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="UmbrellaType">
<xs:complexContent>
<xs:extension base="ProductType"/>
</xs:complexContent>
</xs:complexType>
<!--...-->
</xs:schema>
Example 16–8 shows a valid instance for this approach. The product element is repeated many times, with the xsi:type attribute distinguishing between the different product types.
The advantage of this approach is that the instance may be easier to process. A Java program or XSLT stylesheet that handles this instance can treat all product types the same based on their element name, but also distinguish between them using the value of xsi:type if necessary.
Example 16–8. Valid instance using derived types
<items xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<product>
<number>999</number>
<name>Special Seasonal</name>
</product>
<product xsi:type="ShirtType">
<number>557</number>
<name>Short-Sleeved Linen Blouse</name>
<size>10</size>
<color value="blue"/>
</product>
<product xsi:type="HatType">
<number>563</number>
<name>Ten-Gallon Hat</name>
<size>L</size>
</product>
<product xsi:type="UmbrellaType">
<number>443</number>
<name>Deluxe Golf Umbrella</name>
</product>
</items>
With substitution groups, by contrast, some XML technologies can retrieve all the products only by selecting them based on their position in the instance (e.g., all children of items) rather than by their element name (product), which could be less reliable. This distinction is less important with schema-aware technologies like XSLT 2.0, in which you can refer generically to schema-element(product), which means “product or any of its substitutes.”
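A schema-aware stylesheet along these lines might use schema-element(product) as shown below. This is a sketch under several assumptions: the file name products.xsd is hypothetical and stands for a schema containing the declarations from Examples 16–1 and 16–2, the output element names are invented, and the XSLT 2.0 processor must be schema-aware and must validate the input document, since schema-element() matches only validated elements.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Hypothetical schema location; no namespace attribute because
       the example schema has no target namespace -->
  <xsl:import-schema schema-location="products.xsd"/>

  <xsl:template match="items">
    <productNames>
      <!-- Selects product and any of its substitutes -->
      <xsl:apply-templates select="schema-element(product)"/>
    </productNames>
  </xsl:template>

  <!-- One template handles product, shirt, hat, and umbrella -->
  <xsl:template match="schema-element(product)">
    <productName><xsl:value-of select="name"/></productName>
  </xsl:template>
</xsl:stylesheet>
```

Without schema awareness, the closest equivalent would be a positional selection such as items/*, which is the less reliable approach described above.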
Type substitution also has some disadvantages. It works fine for schema validation, but it is impossible to write a DTD that would validate this instance to the same degree. Also, it looks slightly more complicated and requires a declaration of the XML Schema Instance Namespace, which adds an extra dependency.
Substitution groups are a powerful tool, and you may want to control their use. Three attributes of element declarations control the creation and use of substitutions.
• The final attribute limits the declaration of substitution groups in schemas.
• The block attribute limits the use of substituted elements in instances.
• The abstract attribute forces element substitution in instances.
These three attributes only apply to global element declarations, since local element declarations can never serve as heads of substitution groups.
You may want to prevent other people from defining schemas that use your element declaration as the head of a substitution group. This is accomplished using the final attribute in the element declaration, which may have one of the following values:
• #all, in version 1.0, prevents any other element declaration from using your element declaration as a substitution group head. In version 1.1, it only prevents element declarations whose types are extensions or restrictions from being in the substitution group, but allows element declarations that have the same type as the head.
• extension prevents extension in substitution group members. An element declaration that uses your element declaration as its substitution group head must have a type that is either the same as, or derived by restriction from, the type of your element declaration.
• restriction prevents restriction in substitution group members. An element declaration that uses your element declaration as its substitution group head must have a type that is either the same as, or derived by extension from, the type of your element declaration.
• extension restriction and restriction extension are values that have the same effect as #all.
• "" (an empty string) means that there are no restrictions. This value is useful for overriding the value of finalDefault, as described below.
If no final attribute is specified, it takes its value from the finalDefault attribute of the schema element. If neither final nor finalDefault is specified, there are no restrictions on substitutions for that element declaration.
Example 16–9 shows four element declarations that control the use of substitution groups. With this declaration of product, the schema shown in Example 16–2 would have been illegal, since it attempts to use the product element declaration as the head of a substitution group.
Example 16–9. Using final to control substitution group declaration
<xs:element name="product" type="ProductType" final="#all"/>
<xs:element name="items" type="ItemsType" final="extension"/>
<xs:element name="color" type="ColorType" final="restriction"/>
<xs:element name="size" type="SizeType" final=""/>
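The interaction between final and finalDefault described above can be sketched as follows. The choice of finalDefault="extension" is arbitrary and made only for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           finalDefault="extension">
  <!-- No final attribute: the value is taken from finalDefault, so
       members of this group may not have types derived by extension -->
  <xs:element name="product" type="ProductType"/>
  <!-- final="" overrides finalDefault: no restrictions on members -->
  <xs:element name="size" type="SizeType" final=""/>
  <!--...-->
</xs:schema>
```

Keep in mind that finalDefault also supplies the default final value for type definitions in the same schema document, so setting it affects more than substitution groups.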
In the previous section, we saw how to prevent a schema from containing an element declaration that uses your element declaration as its substitution group head. There is another way to control element substitutions, this time in the instance. This is accomplished by using the block attribute, and assigning the value substitution (or #all) to it. Example 16–10 shows element declarations that use the block attribute.
With this declaration of product, the schema shown in Example 16–2 would have been legal, but the instance in Example 16–3 would have been illegal. This is the subtle difference between the final and block attributes as they relate to substitution groups: final limits what a schema may declare, while block limits what an instance may contain.
The block attribute also accepts the values extension and restriction, as described in Section 13.7.3 on p. 346. These values can also affect substitution groups, in that they can block members whose types are derived by either extension or restriction. For example, if Example 16–2 were changed to add block="extension" to the product declaration, that would make substituting shirt or hat invalid in the instance, because their types are derived by extension from the type of product.
Example 16–10. Using block to prevent substitution group use
<xs:element name="product" type="ProductType" block="#all"/>
<xs:element name="hat" type="HatType" block="substitution"/>
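A minimal sketch of the block="extension" case just described, assuming the rest of Example 16–2 is unchanged. The comments summarize the expected effect on instances; the remark about umbrella relies on umbrella having the same type as product, as noted earlier in this chapter.

```xml
<!-- Head: blocks members whose types are derived by extension -->
<xs:element name="product" type="ProductType" block="extension"/>

<!-- Still legal to declare, but shirt may not replace product in an
     instance, because ShirtType extends ProductType -->
<xs:element name="shirt" type="ShirtType"
            substitutionGroup="product"/>

<!-- No type, so umbrella has ProductType itself; it is not derived by
     extension and can still replace product in an instance -->
<xs:element name="umbrella" substitutionGroup="product"/>
```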
An element declaration may be abstract, meaning that its sole purpose is to serve as the head of a substitution group. Elements declared as abstract can never appear in instance documents. This is indicated by the abstract attribute in the element declaration. Example 16–11 shows an abstract element declaration for product. With this declaration, Example 16–3 would be invalid because a product element appears in the instance. Instead, only shirt, hat, and umbrella would be able to appear in items.
Example 16–11. An abstract element declaration
<xs:element name="product" type="ProductType" abstract="true"/>
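For illustration, the following variation on Example 16–3 would be valid against the abstract declaration of product, assuming the member declarations from Example 16–2 are unchanged. It simply omits the product element and keeps only members of the substitution group.

```xml
<items>
  <!-- A product element may not appear here: its declaration is abstract -->
  <shirt>
    <number>557</number>
    <name>Short-Sleeved Linen Blouse</name>
    <size>10</size>
    <color value="blue"/>
  </shirt>
  <umbrella>
    <number>443</number>
    <name>Deluxe Golf Umbrella</name>
  </umbrella>
</items>
```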