Namespaces have two purposes in XML:
To distinguish between elements and attributes from different vocabularies with different meanings that happen to share the same name
To group all the related elements and attributes from a single XML application together so that software can easily recognize them
The first purpose is easier to explain and grasp, but the second purpose is more important in practice.
Namespaces are implemented by attaching a prefix to each element and attribute. Each prefix is mapped to a URI by an xmlns:prefix attribute. Default URIs can also be provided for elements that don't have a prefix. Default namespaces are declared by xmlns attributes. Elements and attributes that are attached to the same URI are in the same namespace. Elements from many XML applications are identified by standard URIs.
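The two kinds of declaration described above can be sketched as follows. This is only an illustration: the prefix svg is an arbitrary choice, the URIs are assumed to be the standard XHTML and SVG namespace names, and the document content is made up for the example.

```xml
<?xml version="1.0"?>
<!-- xmlns declares the default namespace for unprefixed elements;
     xmlns:svg maps the prefix "svg" to the SVG namespace URI. -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:svg="http://www.w3.org/2000/svg">
  <head>
    <title>Default and prefixed namespaces</title>
  </head>
  <body>
    <!-- No prefix, so this element is in the default (XHTML) namespace -->
    <p>This paragraph is in the XHTML namespace.</p>
    <!-- The svg prefix places these elements in the SVG namespace -->
    <svg:svg width="100" height="100">
      <svg:circle cx="50" cy="50" r="40"/>
    </svg:svg>
  </body>
</html>
```

The html, head, title, body, and p elements all share one namespace because they are attached to the same URI, while svg:svg and svg:circle share another.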
In an XML 1.1 document, an Internationalized Resource Identifier (IRI) can be used instead of a URI. An IRI is just like a URI except it can contain non-ASCII characters such as é and π. In practice, parsers don't check that namespace names are legal URIs in XML 1.0, so the distinction is mostly academic.
Some documents combine markup from multiple XML applications. For example, an XHTML document may contain both SVG pictures and MathML equations. An XSLT stylesheet will contain both XSLT instructions and elements from the result-tree vocabulary. And XLinks are always symbiotic with the elements of the document in which they appear since XLink itself doesn't define any elements, only attributes.
In some cases, these applications may use the same name to refer to different things. For example, in SVG a set element sets the value of an attribute for a specified duration of time, while in MathML, a set element represents a mathematical set such as the set of all positive even numbers. It's essential to know when you're working with a MathML set and when you're working with an SVG set. Otherwise, validation, rendering, indexing, and many other tasks will get confused and fail.
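As a rough sketch of how the two set elements can coexist once each is tied to its own namespace, consider the document below. The prefixes svg and m are arbitrary, the URIs are assumed to be the standard XHTML, SVG, and MathML namespace names, and the content is invented for illustration.

```xml
<?xml version="1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:svg="http://www.w3.org/2000/svg"
      xmlns:m="http://www.w3.org/1998/Math/MathML">
  <head>
    <title>Two kinds of set</title>
  </head>
  <body>
    <svg:svg width="100" height="100">
      <svg:rect width="100" height="100" fill="blue">
        <!-- SVG set: change the fill attribute to red for five seconds -->
        <svg:set attributeName="fill" to="red" begin="0s" dur="5s"/>
      </svg:rect>
    </svg:svg>
    <m:math>
      <!-- MathML set: the mathematical set containing 2, 4, and 6 -->
      <m:set>
        <m:cn>2</m:cn>
        <m:cn>4</m:cn>
        <m:cn>6</m:cn>
      </m:set>
    </m:math>
  </body>
</html>
```

Both elements have the local name set, but software that compares namespace URIs rather than bare names can tell them apart.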
Consider Example 4-1. This is a simple list of paintings, including the title of each painting, the date each was painted, the artist who painted it, and a description of the painting.
Example 4-1. A list of paintings
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<catalog>
  <painting>
    <title>Memory of the Garden at Etten</title>
    <artist>Vincent Van Gogh</artist>
    <date>November, 1888</date>
    <description>
      Two women look to the left. A third works in her garden.
    </description>
  </painting>
  <painting>
    <title>The Swing</title>
    <artist>Pierre-Auguste Renoir</artist>
    <date>1876</date>
    <description>
      A young girl on a swing. Two men and a toddler watch.
    </description>
  </painting>
  <!-- Many more paintings... -->
</catalog>
Now suppose that Example 4-1 is to be served as a web page and you want to make it accessible to search engines. One possibility is to use the Resource Description Framework (RDF) to embed metadata in the page. This describes the page for any search engines or other robots that might come along. Using the Dublin Core metadata vocabulary (http://purl.oclc.org/dc/), a standard vocabulary for library catalog-style information that can be encoded in XML or other syntaxes, an RDF description of this page might look something like this:
<RDF>
  <Description about="http://www.cafeconleche.org/examples/impressionists.xml">
    <title> Impressionist Paintings </title>
    <creator> Elliotte Rusty Harold </creator>
    <description>
      A list of famous impressionist paintings organized by painter and date
    </description>
    <date>2000-08-22</date>
  </Description>
</RDF>
Here we've used the Description and RDF elements from RDF and the title, creator, description, and date elements from the Dublin Core. We have no choice about these names; they are established by their respective specifications. If we want software that understands RDF and the Dublin Core to understand our documents, then we have to use these names. Example 4-2 combines this description with the actual list of paintings.
Example 4-2. A list of paintings, including catalog information about the list
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<catalog>
  <RDF>
    <Description about="http://www.cafeconleche.org/examples/impressionists.xml">
      <title> Impressionist Paintings </title>
      <creator> Elliotte Rusty Harold </creator>
      <description>
        A list of famous impressionist paintings organized by painter and date
      </description>
      <date>2000-08-22</date>
    </Description>
  </RDF>
  <painting>
    <title>Memory of the Garden at Etten</title>
    <artist>Vincent Van Gogh</artist>
    <date>November, 1888</date>
    <description>
      Two women look to the left. A third works in her garden.
    </description>
  </painting>
  <painting>
    <title>The Swing</title>
    <artist>Pierre-Auguste Renoir</artist>
    <date>1876</date>
    <description>
      A young girl on a swing. Two men and a toddler watch.
    </description>
  </painting>
  <!-- Many more paintings... -->
</catalog>
Now we have a problem. Several elements have been overloaded with different meanings in different parts of the document. The title element is used for both the title of the page and the title of a painting. The date element is used for both the date the page was written and the date the painting was painted. One description element describes pages, while another describes paintings.
This presents all sorts of problems. Validation is difficult because catalog and Dublin Core elements with the same name have different content specifications. Web browsers may want to hide the page description while showing the painting description, but not all stylesheet languages can tell the difference between the two. Processing software may understand the date format used in the Dublin Core date element, but not the more free-form format used in the painting date element.
We could change the names of the elements from our vocabulary, using painting_title instead of title, date_painted instead of date, and so on. However, this is inconvenient if you already have a lot of documents marked up in the old version of the vocabulary. And it may not be possible to do this in all cases, especially if the name collisions occur not because of conflicts between your vocabulary and a standard vocabulary, but because of conflicts between two or more standard vocabularies. For instance, RDF just barely avoids a collision with the Dublin Core over the Description and description elements.
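Namespaces offer a way out that requires no renaming. The fragment below is a minimal sketch of Example 4-2 with the metadata elements qualified by prefixes. The prefixes rdf and dc are arbitrary choices, and the URIs are assumptions: the RDF syntax namespace and the Dublin Core element set 1.1 namespace, which may differ from the versions a given tool expects.

```xml
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<catalog xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:RDF>
    <rdf:Description rdf:about="http://www.cafeconleche.org/examples/impressionists.xml">
      <!-- dc:title and dc:date describe the page -->
      <dc:title>Impressionist Paintings</dc:title>
      <dc:date>2000-08-22</dc:date>
    </rdf:Description>
  </rdf:RDF>
  <painting>
    <!-- The unprefixed title and date still describe the painting -->
    <title>The Swing</title>
    <date>1876</date>
  </painting>
</catalog>
```

The painting elements keep their original names, so existing documents in the catalog vocabulary need no changes beyond the added metadata.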
In other cases, there may not be any name conflicts, but it may still be important for software to determine quickly and decisively which XML application a given element or attribute belongs to. For instance, an XSLT processor needs to distinguish between XSLT instructions and literal result-tree elements.
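The XSLT case can be sketched with a small stylesheet written against the catalog of Example 4-1. The xsl prefix is conventional but arbitrary, the URI is assumed to be the standard XSLT namespace name, and the HTML output structure is invented for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/catalog">
    <!-- Elements outside the XSLT namespace are literal result
         elements; the processor copies them to the output. -->
    <html>
      <body>
        <ul>
          <!-- xsl:for-each and xsl:value-of are in the XSLT
               namespace, so the processor executes them. -->
          <xsl:for-each select="painting">
            <li><xsl:value-of select="title"/></li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

No names collide here, yet the processor still depends on the namespace to decide which elements to execute and which to copy.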