The footer example is about at the limits of what you can comfortably fit in a DTD. In practice, web sites prefer to store repeated content like this in external files and load it into their pages using PHP, server-side includes, or some similar mechanism. XML supports this technique through external general entity references, although in this case the client, rather than the server, is responsible for integrating the different pieces of the document into a coherent whole.
An external parsed general entity is declared in the DTD using an ENTITY declaration. However, instead of the actual replacement text, the SYSTEM keyword and a URL to the replacement text are given. For example:
<!ENTITY footer SYSTEM "http://www.oreilly.com/boilerplate/footer.xml">
Of course, a relative URL will often be used instead. For example:
<!ENTITY footer SYSTEM "/boilerplate/footer.xml">
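In either case, when the general entity reference &footer; is seen in the character data of an element, the parser may replace it with the document found at http://www.oreilly.com/boilerplate/footer.xml. (A relative URL is resolved against the location of the document or entity that contains the declaration, so the second form points to the same file only when that document is itself served from www.oreilly.com.)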
References to external parsed entities are not allowed in attribute values. Most of the time this shouldn't be too big a hassle because attribute values tend to be small enough to be easily included in internal entities.
Notice we wrote that the parser may replace the entity reference with the document at the URL, not that it must. This is an area where parsers have some leeway in just how much of the XML specification they wish to implement. A validating parser must retrieve such an external entity. However, a nonvalidating parser may or may not choose to retrieve the entity.
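The difference between retrieving the entity and skipping it can be sketched with Python's xml.parsers.expat module, a nonvalidating parser that leaves the decision to the application. This is only an illustration under stated assumptions: the in-memory ENTITY_SOURCES table is a hypothetical stand-in for fetching the URL, and the parse function and the sample page document are illustrative names, not part of any standard API.

```python
import xml.parsers.expat

# Hypothetical stand-in for retrieving the entity from its URL.
ENTITY_SOURCES = {
    "/boilerplate/footer.xml":
        b'<?xml encoding="ISO-8859-1"?><p>Copyright 2004</p>',
}

# Illustrative document that declares and references the footer entity.
DOCUMENT = b"""<?xml version="1.0"?>
<!DOCTYPE page [
  <!ENTITY footer SYSTEM "/boilerplate/footer.xml">
]>
<page><body>Hello</body>&footer;</page>"""


def parse(document, resolve_external):
    """Return the names of the elements seen, in document order."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)

    def external_entity_ref(context, base, system_id, public_id):
        if resolve_external:
            # Parse the replacement text with a child parser; it inherits
            # the handlers already set on the parent parser.
            child = parser.ExternalEntityParserCreate(context)
            child.Parse(ENTITY_SOURCES[system_id], True)
        # A nonzero return value tells expat to carry on either way.
        return 1

    parser.ExternalEntityRefHandler = external_entity_ref
    parser.Parse(document, True)
    return names
```

Calling parse(DOCUMENT, True) reports the p element from the footer along with page and body, while parse(DOCUMENT, False) reports only page and body. The second call skips the reference without raising an error, which is the choice a nonvalidating parser is permitted to make.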
Furthermore, not all text files can serve as external entities. In order to be loaded in by a general entity reference, the document must be potentially well-formed when inserted into an existing document. This does not mean the external entity itself must be well-formed. In particular, the external entity might not have a single root element. However, if such a root element were wrapped around the external entity, then the resulting document should be well-formed. This means, for example, that all elements that start inside the entity must finish inside the same entity. They cannot finish inside some other entity. Furthermore, the external entity does not have a prolog and, therefore, cannot have an XML declaration or a document type declaration.
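The wrapping test described above can be approximated in a few lines of Python. This is a rough sketch, not a conformance check: the function name is_potentially_well_formed and the wrapper element name are hypothetical, the regular expression only loosely recognizes a leading text declaration, and an entity that refers to other entities declared in the host document's DTD would be rejected here even though it might be acceptable in context.

```python
import re
import xml.etree.ElementTree as ET

# Loosely matches a leading text declaration: optional version, required encoding.
TEXT_DECLARATION = re.compile(
    r"""^<\?xml(\s+version\s*=\s*("[^"]*"|'[^']*'))?"""
    r"""\s+encoding\s*=\s*("[^"]*"|'[^']*')\s*\?>"""
)


def is_potentially_well_formed(entity_text):
    """Check whether the entity parses once a root element is wrapped around it."""
    # The text declaration is not part of the replacement text, so drop it first.
    body = TEXT_DECLARATION.sub("", entity_text, count=1)
    try:
        # "wrapper" is an arbitrary, hypothetical root element name.
        ET.fromstring("<wrapper>" + body + "</wrapper>")
    except ET.ParseError:
        return False
    return True
```

Under this sketch, an entity with several top-level elements such as `<hr/><p>Copyright 2004</p>` passes, while `<p>starts here but ends elsewhere` fails because the element does not finish inside the entity, and `<!DOCTYPE p><p/>` fails because a document type declaration belongs to a prolog that an external entity cannot have.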
Instead of an XML declaration, an external entity may have a text declaration; this looks a lot like an XML declaration. The main difference is that in a text declaration the encoding declaration is required, while the version attribute is optional. Furthermore, there is no standalone declaration. The main purpose of the text declaration is to tell the parser what character set the entity is encoded in. For example, this is a common text declaration:
<?xml version="1.0" encoding="MacRoman"?>
However, you could also use this text declaration with no version attribute:
<?xml encoding="MacRoman"?>
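To see the text declaration doing its job, the following sketch feeds raw bytes to Python's xml.parsers.expat module and lets the declaration decide how they are decoded. It uses ISO-8859-1 rather than MacRoman because expat understands ISO-8859-1 without extra configuration. The read_entity_text function, the d element, and the e.xml system identifier are hypothetical names chosen for the illustration.

```python
import xml.parsers.expat


def read_entity_text(entity_bytes):
    """Return the character data of an external entity supplied as raw bytes."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append

    def external_entity_ref(context, base, system_id, public_id):
        # Hand the raw bytes to a child parser, which reads the text
        # declaration to decide how to decode them.
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(entity_bytes, True)
        return 1

    parser.ExternalEntityRefHandler = external_entity_ref
    # Hypothetical host document whose only content is the entity reference.
    parser.Parse(b'<!DOCTYPE d [<!ENTITY e SYSTEM "e.xml">]><d>&e;</d>', True)
    return "".join(chunks)


# The byte 0xE9 is "é" in ISO-8859-1, the encoding the text declaration names.
latin1_entity = b'<?xml encoding="ISO-8859-1"?>caf\xe9'
```

Here read_entity_text(latin1_entity) yields the string "café", and the text declaration itself is not reported as content.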
Example 3-5 is a well-formed external entity that could be included from another document using an external general entity reference.
Example 3-5. An external parsed entity
<?xml encoding="ISO-8859-1"?>
<hr size="1" noshade="true"/>
<font CLASS="footer">
  <a href="index.html">O'Reilly Home</a> |
  <a href="sales/bookstores/">O'Reilly Bookstores</a> |
  <a href="order_new/">How to Order</a> |
  <a href="oreilly/contact.html">O'Reilly Contacts</a><br/>
  <a href="http://international.oreilly.com/">International</a> |
  <a href="oreilly/about.html">About O'Reilly</a> |
  <a href="affiliates.html">Affiliated Companies</a>
</font>
<p>
  <font CLASS="copy">
    Copyright 2004, O'Reilly Media, Inc.<br/>
    <a href="mailto:webmaster@oreilly.com">webmaster@oreilly.com</a>
  </font>
</p>