The most useful XPath expression is a location path. A location path identifies a set of nodes in a document. This set may be empty, may contain a single node, or may contain several nodes. These can be element nodes, attribute nodes, namespace nodes, text nodes, comment nodes, processing-instruction nodes, root nodes, or any combination of these. A location path is built out of successive location steps. Each location step is evaluated relative to a particular node in the document called the context node.
The simplest location path is the one that selects the root node of the document. This is simply the forward slash, /. (You'll notice that a lot of XPath syntax is deliberately similar to the syntax used by the Unix shell. Just as / is the root of a Unix filesystem, / is the root node of an XML document.) For example, this XSLT template rule uses the XPath pattern / to match the entire input document tree and wrap it in an html element:
<xsl:template match="/">
  <html><xsl:apply-templates/></html>
</xsl:template>
/ is an absolute location path because no matter what the context node is (that is, no matter where the processor was in the input document when this template rule was applied), it always means the same thing: the root node of the document. It is relative to which document you're processing, but not to anything within that document.
The second simplest location path is a single element name. This path selects all child elements of the context node with the specified name. For example, the XPath profession refers to all profession child elements of the context node. Exactly which elements these are depends on what the context node is, so this is a relative location path. For example, if the context node is the Alan Turing person element in Example 9-1, then the location path profession refers to these three profession child elements of that element:
<profession>computer scientist</profession>
<profession>mathematician</profession>
<profession>cryptographer</profession>
However, if the context node is the Richard Feynman person element in Example 9-1, then the XPath profession refers to its single profession child element:
<profession>physicist</profession>
If the context node is the name child element of either Richard Feynman's or Alan Turing's person element, then this XPath selects nothing at all because neither name element has any profession child elements.
In XSLT, the context node for an XPath expression used in the select attribute of xsl:apply-templates and similar elements is the node that is currently matched. For example, consider the simple stylesheet in Example 9-2. In particular, look at the template rule for the person element. The XSLT processor will activate this rule twice, once for each person node in the document. The first time, the context node is set to Alan Turing's person element. The second time, the context node is set to Richard Feynman's person element. When the same template is instantiated with a different context node, the XPath expression in <xsl:value-of select="name"/> refers to a different element, and the output produced is therefore different.
Example 9-2. A very simple stylesheet for Example 9-1
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="people">
    <xsl:apply-templates select="person"/>
  </xsl:template>

  <xsl:template match="person">
    <xsl:value-of select="name"/>
  </xsl:template>

</xsl:stylesheet>
When XPath is used in other systems, such as XPointer or XForms, other means are provided for determining what the context node is.
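To connect this back to the profession path discussed earlier, here is a minimal sketch of an alternative person rule for Example 9-2. It is not one of this chapter's numbered examples, and the professions wrapper element is a hypothetical name chosen only for illustration. Assuming the input is Example 9-1, the relative path profession should pick up three elements when the rule is instantiated for Alan Turing and one when it is instantiated for Richard Feynman.

```xml
<xsl:template match="person">
  <!-- The context node is the person element this rule was matched against -->
  <professions>
    <!-- Relative path: copies only the profession children of that person -->
    <xsl:copy-of select="profession"/>
  </professions>
</xsl:template>
```

The same select attribute yields different results on each instantiation because only the context node changes, not the expression.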
Attributes are also addressable by XPath. To select a particular attribute of an element, use an @ sign followed by the name of the attribute you want. For example, the XPath expression @born selects the born attribute of the context node. Example 9-3 is a simple XSLT stylesheet that generates an HTML table of names and birth and death dates from documents like Example 9-1.
Example 9-3. An XSLT stylesheet that uses root element, child element, and attribute location steps
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html>
      <xsl:apply-templates select="people"/>
    </html>
  </xsl:template>

  <xsl:template match="people">
    <table>
      <xsl:apply-templates select="person"/>
    </table>
  </xsl:template>

  <xsl:template match="person">
    <tr>
      <td><xsl:value-of select="name"/></td>
      <td><xsl:value-of select="@born"/></td>
      <td><xsl:value-of select="@died"/></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>
The stylesheet in Example 9-3 has three template rules. The first template rule has a match pattern that matches the root node, /. The XSLT processor activates this template rule and sets the context node to the root node. Then it outputs the start-tag <html>. This is followed by an xsl:apply-templates element that selects nodes matching the XPath expression people. If the input document is Example 9-1, then there is exactly one such node, the root element. This is selected and its template rule, the one with the match pattern of people, is applied. The XSLT processor sets the context node to the root people element and then begins processing the people template. It outputs a <table> start-tag and then encounters an xsl:apply-templates element that selects nodes matching the XPath expression person. Two child elements of this context node match the XPath expression person, so they're each processed in turn using the person template rule.
When the XSLT processor begins processing each person element, it sets the context node to that element. It outputs that element's name child element value and born and died attribute values wrapped in a table row and three table cells. The net result is:
<html>
  <table>
    <tr>
      <td>
        Alan Turing
      </td>
      <td>1912</td>
      <td>1954</td>
    </tr>
    <tr>
      <td>
        Richard P Feynman
      </td>
      <td>1918</td>
      <td>1988</td>
    </tr>
  </table>
</html>
Although element, attribute, and root nodes account for 90% or more of what you need to do with XML documents, this still leaves four kinds of nodes that need to be addressed: namespace nodes, text nodes, processing-instruction nodes, and comment nodes. Namespace nodes are rarely handled explicitly. The other three node types have special node tests to match them. These are as follows:
comment( )
text( )
processing-instruction( )
Since comments and text nodes don't have names, the comment( ) and text( ) location steps match any comment or text node child of the context node. Each comment is a separate comment node. Each text node contains the maximum possible contiguous run of text not interrupted by any tag. Entity references and CDATA sections are resolved into text and markup and do not interrupt text nodes.
By default, XSLT stylesheets do process text nodes but do not process comment nodes. You can add a comment template rule to an XSLT stylesheet so it will process comments too. For example, this template rule replaces each comment with the text "Comment Deleted" in italic:
<xsl:template match="comment( )">
  <i>Comment Deleted</i>
</xsl:template>
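The default handling of text nodes can be overridden in the same way. As a minimal sketch (not taken from the numbered examples), the following empty rule matches every text node and outputs nothing, so character data would appear in the result only where another rule asks for it explicitly, for instance with xsl:value-of.

```xml
<!-- Empty rule: any text node processed by this rule contributes nothing to the output -->
<xsl:template match="text()"/>
```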
With no arguments, the processing-instruction( ) location step selects all processing-instruction children of the context node. If it has an argument, then it only selects the processing-instruction children with the specified target. For example, the XPath expression processing-instruction('xml-stylesheet') selects all processing-instruction children of the context node whose target is xml-stylesheet.
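As a sketch of the argument form in a match pattern, the rule below fires only for processing instructions whose target is xml-stylesheet and ignores all others. The surrounding p element and label text are illustrative choices, not something the examples in this chapter prescribe.

```xml
<!-- Matches only processing instructions with the target xml-stylesheet -->
<xsl:template match="processing-instruction('xml-stylesheet')">
  <!-- The value of a processing-instruction node is its data: the text after the target -->
  <p>Stylesheet instruction: <xsl:value-of select="."/></p>
</xsl:template>
```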
Wildcards match different element and node types at the same time. There are three wildcards: *, node( ), and @*.
The asterisk (*) matches any element node regardless of name. For example, this XSLT template rule says that all elements should have their child elements processed but should not result in any output in and of themselves:
<xsl:template match="*">
  <xsl:apply-templates select="*"/>
</xsl:template>
The * does not match attributes, text nodes, comments, or processing-instruction nodes. Thus, in the previous example, output will only come from child elements that have their own template rules that override this one.
You can put a namespace prefix in front of the asterisk. In this case, only elements in the namespace bound to that prefix are matched. For example, svg:* matches all elements whose namespace URI is the one the svg prefix is mapped to. As usual, it's the URI that matters, not the prefix. The prefix can be different in the stylesheet and the source document as long as the namespace URI is the same.
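The following sketch shows where the prefix binding comes from. It assumes the stylesheet maps svg to the standard SVG namespace URI, http://www.w3.org/2000/svg; the rule itself simply recurses into SVG child elements and is only an illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:svg="http://www.w3.org/2000/svg">

  <!-- Matches every element in the SVG namespace, whatever prefix
       (or default namespace declaration) the source document uses -->
  <xsl:template match="svg:*">
    <!-- Continue with child elements in the same namespace only -->
    <xsl:apply-templates select="svg:*"/>
  </xsl:template>

</xsl:stylesheet>
```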
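The node( ) wildcard matches nodes of any type: not only elements but also text nodes, processing-instruction nodes, and comment nodes, as well as the root node, attribute nodes, and namespace nodes. However, when node( ) is used as a location step by itself, it selects only the children of the context node, and attribute and namespace nodes are not children of any node; use @* to reach attributes.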
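As a small sketch of node( ) in a select attribute, the rule below hands every child of a person element to the template machinery, whatever its type. Used as a step like this, node( ) walks the children of the context node, so it behaves the same as xsl:apply-templates with no select attribute at all; attributes would still need @*.

```xml
<xsl:template match="person">
  <!-- Selects all children of the context person element:
       child elements, text nodes, comments, and processing instructions -->
  <xsl:apply-templates select="node()"/>
</xsl:template>
```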
The @* wildcard matches all attribute nodes. For example, this XSLT template rule copies the values of all attributes of a person element in the document into the content of an attributes element in the output:
<xsl:template match="person">
  <attributes><xsl:apply-templates select="@*"/></attributes>
</xsl:template>
As with elements, you can attach a namespace prefix to the wildcard to match attributes in a specific namespace. For instance, @xlink:* matches all XLink attributes provided that the prefix xlink is mapped to the http://www.w3.org/1999/xlink URI. Again, it's the URI that matters, not the actual prefix.
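As a sketch of the namespace-qualified attribute wildcard, the stylesheet below binds the xlink prefix and then selects only the XLink attributes of each matched element. The citation element and the links wrapper are hypothetical names introduced for this illustration; they do not appear in Example 9-1.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xlink="http://www.w3.org/1999/xlink">

  <!-- citation is a hypothetical element assumed to carry XLink attributes -->
  <xsl:template match="citation">
    <links>
      <!-- Selects xlink:type, xlink:href, and any other attribute
           in the XLink namespace; the built-in rule outputs each value -->
      <xsl:apply-templates select="@xlink:*"/>
    </links>
  </xsl:template>

</xsl:stylesheet>
```

Attributes in no namespace, such as a plain id attribute, are not selected by @xlink:*.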
You often want to match more than one type of element or attribute but not all types. For example, you may want an XSLT template that applies to the profession and hobby elements but not to the name, person, or people elements. You can combine location paths and steps with the vertical bar (|) to indicate that you want to match any of the named elements. For instance, profession|hobby matches profession and hobby elements. first_name|middle_initial|last_name matches first_name, middle_initial, and last_name elements. @id|@xlink:type matches id and xlink:type attributes. *|@* matches elements and attributes but does not match text nodes, comment nodes, or processing-instruction nodes. For example, this XSLT template rule applies to all the nonempty leaf elements (elements that don't contain any other elements) of Example 9-1:
<xsl:template match="first_name|last_name|profession|hobby">
  <xsl:value-of select="text( )"/>
</xsl:template>
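The vertical bar works in select attributes as well as in match patterns. As a minimal sketch, the rule below processes only the profession and hobby children of each person and skips everything else; the interests wrapper element is a hypothetical name used for illustration.

```xml
<xsl:template match="person">
  <interests>
    <!-- Union of two relative paths: profession children and hobby children -->
    <xsl:apply-templates select="profession|hobby"/>
  </interests>
</xsl:template>
```

The selected nodes are processed in document order, not in the order the alternatives are written in the expression.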