Controlling Type Derivation

Just as some object-oriented programming languages let the author of a class limit how that class can be extended, the schema language lets schema authors control whether and how new types can be derived from their types by extension and restriction.

The abstract attribute applies to complex type definitions and element declarations. When it is set to true, that element or type cannot be used directly in an instance document. If an element is declared as abstract, a member of a substitution group based on that element must appear in its place. If a type is declared as abstract, an element declared with that type may appear only if it uses the xsi:type attribute to name a non-abstract type derived from it.
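As a minimal sketch of an abstract element, the following schema declares an abstract head element and two concrete members of its substitution group. The namespace URI and the names contactMethod, email, pager, and person are placeholders invented for this illustration; they are not part of the address schema.

```xml
<?xml version="1.0"?>
<!-- Hypothetical schema: the namespace URI and all element names are
     placeholders chosen for illustration. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ex="http://www.example.com/abstract-demo"
    targetNamespace="http://www.example.com/abstract-demo"
    elementFormDefault="qualified">

  <!-- Abstract head element: may not appear in an instance document. -->
  <xs:element name="contactMethod" type="xs:string" abstract="true"/>

  <!-- Concrete members of the substitution group. -->
  <xs:element name="email" type="xs:string"
      substitutionGroup="ex:contactMethod"/>
  <xs:element name="pager" type="xs:string"
      substitutionGroup="ex:contactMethod"/>

  <!-- The content model names the abstract head; an instance must
       supply ex:email or ex:pager in its place. -->
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="ex:contactMethod" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under this sketch, a person element containing ex:email or ex:pager children would validate, while one containing ex:contactMethod itself would not.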

Until now, the schema has placed no restrictions on how other types or elements could be derived from its elements and types. The final attribute can be added to a complex type definition and set to #all, extension, or restriction (or to a list containing both extension and restriction, which is equivalent to #all). On a simple type definition, it can be set to #all or to a list containing any combination of the values list, union, and restriction, in any order. When a type is derived from another type that has the final attribute set, the schema processor verifies that the attempted derivation is permitted. For example, a final attribute could prevent the physicalAddressType type from being extended:

<xs:complexType name="physicalAddressType" final="extension">

Since the main schema in address-schema.xsd attempts to redefine the physicalAddressType in an xs:redefine block, the schema processor generates the following errors when it attempts to validate the instance document:

ComplexType 'physicalAddressType': cos-ct-extends.1.1: Derivation by 
extension is forbidden by either the base type physicalAddressType_redefined 
or the schema.
Attribute "addr:latitude" must be declared for element type "physicalAddress".
Attribute "addr:longitude" must be declared for element type 
"physicalAddress".

The first error is a result of trying to extend a type that has been marked to prevent extension. The next two errors occur because the new, extended type was not parsed and applied to the content in the document. Now that you've seen how this works, removing this particular "feature" from the physicalAddressType definition gets the schema working again.
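For reference, a redefinition of roughly the following shape is the kind of construct that triggers the first error. This is only a sketch: the schemaLocation value and the attribute types are assumptions, and only the latitude and longitude names are taken from the error messages above.

```xml
<!-- Sketch only: the schemaLocation value and the xs:decimal types are
     assumptions made for illustration. -->
<xs:redefine schemaLocation="physical-address.xsd">
  <xs:complexType name="physicalAddressType">
    <xs:complexContent>
      <!-- Extending the type being redefined; this is the step that
           final="extension" on the original definition forbids. -->
      <xs:extension base="addr:physicalAddressType">
        <xs:attribute name="latitude" type="xs:decimal"/>
        <xs:attribute name="longitude" type="xs:decimal"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:redefine>
```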

Similar to the final attribute, the fixed attribute is provided to mark certain facets of simple types as immutable. Facets that have been marked as fixed="true" cannot be overridden in derived types.
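A small sketch may make the effect of a fixed facet concrete. The type names codeType and shortCodeType and the namespace URI are hypothetical. Without fixed="true" on the base type's maxLength facet, the second definition would be a legal restriction; with it, a schema processor should reject the derived type.

```xml
<?xml version="1.0"?>
<!-- Hypothetical schema: names and namespace URI are placeholders. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ex="http://www.example.com/fixed-demo"
    targetNamespace="http://www.example.com/fixed-demo">

  <!-- The maxLength facet is locked at 10 for all derived types. -->
  <xs:simpleType name="codeType">
    <xs:restriction base="xs:string">
      <xs:maxLength value="10" fixed="true"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Not allowed: attempts to give the fixed maxLength facet a
       different value. -->
  <xs:simpleType name="shortCodeType">
    <xs:restriction base="ex:codeType">
      <xs:maxLength value="5"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```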

Perhaps one of the most welcome features of schemas is the ability to express more sophisticated relationships between values in elements and attributes of a document. The limitations of the primitive index capability provided by the XML 1.0 ID and IDREF attributes became readily apparent as documents began to include multiple distinct types of element data with complex data keys. The two facilities for enforcing element uniqueness in schemas are the xs:unique and xs:key elements.

The xs:unique element enforces uniqueness of element and attribute values across a specified set of elements in an instance document. The constraint is constructed in two phases. First, the set of all of the elements to be evaluated is defined using a restricted XPath expression. Next, the precise element and attribute values that must be unique are defined.

To illustrate, let's add logic to the address schema to prevent the same phone number from appearing multiple times within a given contacts element. To add this restriction, the element declaration for contacts includes a uniqueness constraint:

<xs:element name="contacts" type="addr:contactsType" minOccurs="0">
  <xs:unique name="phoneNums">
    <xs:selector xpath="addr:phone"/>
    <xs:field xpath="@addr:number"/>
  </xs:unique>
</xs:element>

Now, if a given contacts element contains two phone elements with the same value for their number attributes, the schema processor will generate an error.

This is the basic algorithm that the schema processor follows to enforce these restrictions:

  1. Use the xpath attribute of the single xs:selector element to build a set of all of the elements to which the restriction will apply.

  2. Logically combine the values referenced by each xs:field element for each selected element. Compare the combinations of values that you get for each of the elements.

  3. Report any conflicts as a validity error.
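To see the steps above applied, consider the following instance fragment. It is a sketch only: the namespace URI bound to the addr prefix is a placeholder, the phone numbers are made up, and any other attributes or content that the phone element might require are omitted.

```xml
<!-- Fragment only: placeholder namespace URI and made-up numbers. -->
<addr:contacts xmlns:addr="http://www.example.com/addr">
  <!-- Step 1: the selector addr:phone picks up all three children. -->
  <addr:phone addr:number="617.555.0100"/>
  <addr:phone addr:number="617.555.0199"/>
  <!-- Steps 2 and 3: this number value repeats the first one, so the
       phoneNums constraint is violated. -->
  <addr:phone addr:number="617.555.0100"/>
</addr:contacts>
```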

The xs:key element is closely related to the xs:unique element and works in much the same way. It uses the xs:selector element to define the set of elements it applies to, then one or more xs:field elements define which values make up the key. The difference is that xs:key requires every selected element to have a value for each of the specified fields, whereas xs:unique tolerates selected elements that lack values for some fields.

Having created a fairly full-featured address element, creating a collection of these elements called addressBook is a good way to show this feature in operation.

First, the new addressBook element is declared, including a key based on the ssn attribute of each address entry:

<xs:element name="addressBook">
  <xs:complexType>
    <xs:sequence maxOccurs="unbounded">
      <xs:element ref="addr:address"/>
    </xs:sequence>
  </xs:complexType>
  <xs:key name="ssnKey">
    <xs:selector xpath="addr:address"/>
    <xs:field xpath="@addr:ssn"/>
  </xs:key>
</xs:element>

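(If some address elements were allowed to omit the ssn attribute, you'd need to use xs:unique rather than xs:key in this example, because xs:key requires every selected element to supply a value for each field.)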
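A sketch of that alternative follows. It would take the place of the xs:key element inside the addressBook declaration; the constraint name ssnUnique is hypothetical.

```xml
<!-- Alternative to the ssnKey constraint (hypothetical name): ssn
     values that are present must still be distinct, but an address
     without an ssn attribute is not an error. -->
<xs:unique name="ssnUnique">
  <xs:selector xpath="addr:address"/>
  <xs:field xpath="@addr:ssn"/>
</xs:unique>
```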

Now that the key is defined, you can add a new element to the address element declaration that connects a particular address record with another record. For example, to list references to the children of a particular person in the address book, add the following declaration for a kids element:

<xs:element name="address">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="fullName">
. . .
      </xs:element>
      <xs:element name="kids" minOccurs="0">
        <xs:complexType>
          <xs:sequence maxOccurs="unbounded">
            <xs:element name="kid">
              <xs:complexType>
                <xs:attribute name="ssn" type="addr:ssn"/>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
. . .
    </xs:sequence>
    <xs:attributeGroup ref="addr:nationality"/>
    <xs:attribute name="ssn" type="addr:ssn"/>
    <xs:anyAttribute namespace="http://www.w3.org/1999/xlink"
        processContents="skip"/>
  </xs:complexType>
</xs:element>

Now, an xs:keyref element in the addressBook element declaration enforces the constraint that the ssn attribute of a particular kid element must match an ssn attribute on an address element in the current document:

<xs:element name="addressBook">
. . .
  <xs:key name="ssnKey">
    <xs:selector xpath="addr:address"/>
    <xs:field xpath="@addr:ssn"/>
  </xs:key>
  <xs:keyref name="kidSSN" refer="addr:ssnKey">
    <xs:selector xpath="addr:address/addr:kids/addr:kid"/>
    <xs:field xpath="@addr:ssn"/>
  </xs:keyref>
</xs:element>

Now, if any kid element in an instance document refers to a nonexistent address record, the schema validator will report an error.
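The following instance fragment sketches one reference that satisfies the kidSSN constraint and one that does not. The namespace URI is a placeholder, the ssn values are made up and may not match the actual format required by the addr:ssn type, and the other children of each address element are elided.

```xml
<!-- Fragment only: placeholder namespace URI and made-up ssn values. -->
<addr:addressBook xmlns:addr="http://www.example.com/addr">
  <addr:address addr:ssn="111-11-1111">
    <!-- fullName and other content elided -->
    <addr:kids>
      <!-- OK: matches the ssn of the second address below. -->
      <addr:kid addr:ssn="222-22-2222"/>
      <!-- Error: no address in this document has this ssn. -->
      <addr:kid addr:ssn="333-33-3333"/>
    </addr:kids>
  </addr:address>
  <addr:address addr:ssn="222-22-2222">
    <!-- fullName and other content elided -->
  </addr:address>
</addr:addressBook>
```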