If we look back with a critical eye at our library, we see we used the following simple datatypes:
<xs:element name="name" type="xs:string"/>
<xs:element name="qualification" type="xs:string"/>
<xs:element name="born" type="xs:date"/>
<xs:element name="dead" type="xs:date"/>
<xs:element name="isbn" type="xs:string"/>
<xs:attribute name="id" type="xs:ID"/>
<xs:attribute name="available" type="xs:boolean"/>
<xs:attribute name="lang" type="xs:language"/>
We are lucky that the elements born and dead are ISO 8601 dates, which is the lexical format xs:date expects. The ISBN is composed of numeric digits and a final character that can be either a digit or the letter "x", and is therefore represented as a string. We also did a good job with the datatypes for the id, available, and lang attributes, but the choice of xs:string for the elements name and qualification is more controversial. They appear in the instance document as:
<name>
  Charles M Schulz
</name>
.../...
<qualification>
  bold, brash and tomboyish
</qualification>
This formatting suggests that whitespace is probably not significant and should be collapsed. This can be done by choosing the datatype xs:token instead of xs:string; the same applies to the title element, which has simple content derived from xs:string and would be better derived from xs:token. This change will not have any impact on validation with our schema, but the document is more precisely described, and future derivations would be more easily built on xs:token than on xs:string. The other datatype that could have been chosen better is that of isbn, which can be represented as xs:NMTOKEN. The new schema would then be:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="name" type="xs:token"/>
  <xs:element name="qualification" type="xs:token"/>
  <xs:element name="born" type="xs:date"/>
  <xs:element name="dead" type="xs:date"/>
  <xs:element name="isbn" type="xs:NMTOKEN"/>
  <xs:attribute name="id" type="xs:ID"/>
  <xs:attribute name="available" type="xs:boolean"/>
  <xs:attribute name="lang" type="xs:language"/>
  <xs:element name="title">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:token">
          <xs:attribute ref="lang"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
  <xs:element name="library">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="book" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="author">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="born"/>
        <xs:element ref="dead" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute ref="id"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="isbn"/>
        <xs:element ref="title"/>
        <xs:element ref="author" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element ref="character" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute ref="id"/>
      <xs:attribute ref="available"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="character">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="born"/>
        <xs:element ref="qualification"/>
      </xs:sequence>
      <xs:attribute ref="id"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
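To make the effect of these datatype choices concrete, here is a small instance fragment that should be valid against the schema above. It is only a sketch: the author name and the qualification text come from the example earlier in this section, while the ISBN, title, dates, id values, and the second character name are illustrative assumptions. The comments note how the whitespace around name and qualification is treated under xs:token and why the isbn value fits xs:NMTOKEN.

<?xml version="1.0"?>
<library>
  <book id="b0836217462" available="true">
    <!-- xs:NMTOKEN: XML name characters only (letters, digits,
         and punctuation such as ".", "_" and ":"), with no
         internal whitespace. -->
    <isbn>0836217462</isbn>
    <title lang="en">Being a Dog Is a Full-Time Job</title>
    <author id="CMS">
      <!-- xs:token: leading, trailing, and repeated whitespace
           is collapsed, so the value is "Charles M Schulz". -->
      <name>
        Charles M Schulz
      </name>
      <born>1922-11-26</born>
      <dead>2000-02-12</dead>
    </author>
    <character id="PP">
      <name>Peppermint Patty</name>
      <born>1966-08-22</born>
      <!-- Same collapsing applies here; with xs:string the line
           breaks and indentation would be part of the value. -->
      <qualification>
        bold, brash and tomboyish
      </qualification>
    </character>
  </book>
</library>

The same document would also have been accepted by the earlier xs:string version of the schema. The difference is in the value an application sees after validation, which is why the change is described as a more precise description and not as a tighter constraint.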