XInclude is a new technology developed at the W3C for combining multiple well-formed and optionally valid documents and fragments thereof into a single document. It's similar in effect to using external entity references to assemble a document from several component pieces. However, XInclude can assemble a document from resources that are themselves fully well-formed documents that include XML declarations and even document type declarations. It can also use XPointers to extract only a piece of an external document, rather than including the entire thing.
XInclude defines two elements, xi:include and xi:fallback, both in the http://www.w3.org/2001/XInclude namespace. An xi:include element has an href attribute that points to a document. An XInclude processor replaces all the xi:include elements in a master document with the documents they point to. These documents can be other XML documents or plain text documents like Java source code. If the xi:include element has an xpointer attribute, then the xi:include element is replaced by only those parts of the remote document that the XPointer indicates. If the processor cannot find the external document the href attribute points to, then it replaces the xi:include element with the contents of the element's xi:fallback child element instead.
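To make these cases concrete, here is a minimal sketch of a master document that exercises each one. The element names (book, listing, para) and the file names are placeholders invented for illustration. The parse="text" attribute and the element() XPointer scheme come from the XInclude and XPointer specifications rather than from the description above, so treat the details as assumptions to check against your processor.

```xml
<?xml version="1.0"?>
<book xmlns:xi="http://www.w3.org/2001/XInclude">

  <!-- Replaced by the complete document found at chapter1.xml -->
  <xi:include href="chapter1.xml"/>

  <!-- Replaced by only part of people.xml: element(/1/2) selects the
       second child element of that document's root element -->
  <xi:include href="people.xml" xpointer="element(/1/2)"/>

  <!-- Plain text, such as Java source code, is requested with
       parse="text" and is inserted as character data -->
  <listing><xi:include href="Hello.java" parse="text"/></listing>

  <!-- If chapter2.xml cannot be found, the contents of xi:fallback
       take the place of the xi:include element -->
  <xi:include href="chapter2.xml">
    <xi:fallback>
      <para>Chapter 2 is not yet available.</para>
    </xi:fallback>
  </xi:include>

</book>
```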
This chapter is based on the April 13, 2004 2nd Candidate Recommendation of XInclude. We think this draft is pretty stable, but it's possible some of the details described here may change before the final release. The most current version of the XInclude specification can be found at http://www.w3.org/TR/xinclude/.
The key component of XInclude is the include element. This must be in the http://www.w3.org/2001/XInclude namespace. The xi or xinclude prefixes are customary, although, as always, the prefix can change as long as the URI remains the same. This element has an href attribute that contains a URL pointing to the document to include. For example, this element includes the document found at the relative URL AlanTuring.xml:
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="AlanTuring.xml"/>
Of course, you can use absolute URLs as well:
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="http://cafeconleche.org/books/xian3/examples/12/AlanTuring.xml" />
Technically, the href attribute contains an IRI rather than a URI or URL. An IRI is like a URI except that it can contain non-ASCII characters such as é. These characters are normally encoded in UTF-8, and then each byte of the UTF-8 sequence is percent-escaped to convert the IRI to a URI before resolving it. If you're working in English, and you're not writing an XInclude processor, you can pretty much ignore this. All standard URLs are legal IRIs. If you are working with non-English, non-ASCII IRIs, this just means you can use them exactly as you'd expect without having to manually percent-escape the non-ASCII characters yourself.
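As a rough illustration of the conversion, the two xi:include elements in the following sketch should identify the same resource. The file name résumé.xml and the chapter element are hypothetical. The character é is encoded in UTF-8 as the two bytes C3 A9, which is where the %C3%A9 escapes come from.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns:xi="http://www.w3.org/2001/XInclude">

  <!-- An IRI written directly with non-ASCII characters -->
  <xi:include href="résumé.xml"/>

  <!-- The same reference after conversion to a URI: each é is
       replaced by its percent-escaped UTF-8 bytes, %C3%A9 -->
  <xi:include href="r%C3%A9sum%C3%A9.xml"/>

</chapter>
```

In practice you would write only the first form and leave the escaping to the processor.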
Normally, the namespace declaration is placed on the root element of the including document, and not repeated on each individual xi:include element. Henceforth in this chapter, we will assume that the namespace prefix xi is bound to the correct namespace URI.
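For comparison, the following sketch declares the namespace once on the root element but binds it to the alternate xinclude prefix instead of xi. The library element and the file names are invented for illustration. Because only the namespace URI matters, an XInclude processor should treat these elements exactly as it would xi:include elements.

```xml
<?xml version="1.0"?>
<!-- The declaration on the root element is in scope for every
     descendant, so it is not repeated on each include element -->
<library xmlns:xinclude="http://www.w3.org/2001/XInclude">
  <xinclude:include href="fiction.xml"/>
  <xinclude:include href="nonfiction.xml"/>
</library>
```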
Example 12-1 shows a document similar to Example 8-1 that contains two xi:include elements. The first one loads the document found at the relative URL AlanTuring.xml. The second loads the document found at the relative URL RichardPFeynman.xml.
Example 12-1. A document that uses XInclude to load two other documents
<?xml version="1.0"?>
<people xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="AlanTuring.xml"/>
  <xi:include href="RichardPFeynman.xml"/>
</people>
When an XInclude processor reads this document, it will parse the XML documents found at the two URLs and insert their contents (except for the XML and document type declarations, if any) into the finished document at the positions indicated by the xi:include elements. The xi:include elements are removed. XInclusion is not done by default, and many XML parsers do not understand or support XInclude. You either need to use a filter that resolves the xi:include elements before processing the documents further, or tell the parser that you want it to perform XInclusion. The exact details vary from one processor to the next. For example, using xmllint from libxml, the --xinclude option tells it to resolve XIncludes:
$ xmllint --xinclude http://cafeconleche.org/books/xian3/examples/12/12-1.xml
<?xml version="1.0"?>
<people xmlns:xi="http://www.w3.org/2001/XInclude">
<person born="1912" died="1954"
xml:base=
"http://cafeconleche.org/books/xian3/examples/12/AlanTuring.xml">
<name>
<first_name>Alan</first_name>
<last_name>Turing</last_name>
</name>
<profession>computer scientist</profession>
<profession>mathematician</profession>
<profession>cryptographer</profession>
</person>
<person born="1918" died="1988"
xml:base=
"http://cafeconleche.org/books/xian3/examples/12/RichardPFeynman.xml">
<name>
<first_name>Richard</first_name>
<middle_initial>P</middle_initial>
<last_name>Feynman</last_name>
</name>
<profession>physicist</profession>
<hobby>Playing the bongoes</hobby>
</person>
</people>
You'll notice that the processor has added xml:base attributes to attempt to preserve the base URIs of the included elements. This is not so important here, where the including document and the two included documents all live in the same directory. However, when assembling a document from different sources on different servers and in different directories, this helps make sure the relative URLs in the included text are properly resolved.
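To see why this matters, consider a hypothetical result in which the included person element came from a different server and contains a relative reference. The www.example.org URL and the photo element are invented for this sketch, and whether a given attribute is actually resolved against xml:base depends on the application consuming the merged document.

```xml
<?xml version="1.0"?>
<people xmlns:xi="http://www.w3.org/2001/XInclude">
  <person born="1912" died="1954"
          xml:base="http://www.example.org/scientists/AlanTuring.xml">
    <!-- Resolved against xml:base, this relative reference means
         http://www.example.org/scientists/images/turing.jpg
         rather than a path relative to the including document -->
    <photo src="images/turing.jpg"/>
  </person>
</people>
```

Without the xml:base attribute, the same relative reference would be resolved against the location of the master document and would probably point nowhere.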
It's also important to note that the inclusion is based on the parsed documents. It's not done as if by copying and pasting the raw text. XML declarations are not copied. Insignificant whitespace inside tags may not be quite the same after inclusion as it was before. Whitespace in the prolog and epilog is not copied at all. Document type declarations are not copied, but any default attribute values they define are copied.
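As a sketch of what this means in practice, suppose the included document looked like the following variant of AlanTuring.xml. The encoding declaration, the internal DTD subset, and the status attribute are all hypothetical additions made for illustration.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE person [
  <!-- Neither the XML declaration above nor this document type
       declaration is copied into the including document. The
       default value declared here is applied by the parser,
       though, so it travels with the included element. -->
  <!ATTLIST person status CDATA "deceased">
]>
<person   born="1912"
          died="1954">
  <name>
    <first_name>Alan</first_name>
    <last_name>Turing</last_name>
  </name>
</person>
```

After inclusion, we would expect the person element to carry a status="deceased" attribute alongside born and died, even though that attribute never appears in the instance markup. The extra spaces and the line break between the attributes inside the start-tag would most likely be normalized away by the serializer.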
libxml includes fairly complete support for XInclude. Xerces-J 2.7 includes incomplete support for XInclude. Other parsers typically have none at all and will require the use of third-party libraries that do support XInclude, such as XOM's nu.xom.xinclude package. This is still fairly bleeding-edge technology.