XPath is a non-XML syntax for expressions that identify particular nodes and groups of nodes in an XML document. It is used by both XPointer and XSLT, as well as by some native XML databases and query languages.
XPath views each XML document as a tree of nodes. Each node has one of seven types: root, element, attribute, text, namespace, processing instruction, or comment.
Each document has exactly one root node, which is the root of the tree. This node contains one comment node child for each comment outside the root element, one processing-instruction node child for each processing instruction outside the root element, and exactly one element node child for the root element. It does not contain any representation of the XML declaration, the document type declaration, or any whitespace that occurs before or after the root element. The root node has no parent node. The root node's value is the value of the root element.
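As a rough illustration of the root node's children, the sketch below uses Python's standard xml.dom.minidom module, which is a DOM implementation rather than an XPath engine. The sample document and variable names are invented for this example. For a document without a document type declaration, the DOM Document node happens to have the same children that XPath assigns to the root node.

```python
from xml.dom import minidom, Node

# Hypothetical sample: a comment and a processing instruction before the
# root element, and one comment after it.
doc_text = (
    '<?xml version="1.0"?>\n'
    '<!-- before -->\n'
    '<?target some data?>\n'
    '<doc>text</doc>\n'
    '<!-- after -->\n'
)
document = minidom.parseString(doc_text)

TYPE_NAMES = {
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
    Node.ELEMENT_NODE: "element",
}

# The XML declaration and the whitespace between the top-level items
# do not show up as children.
child_kinds = [TYPE_NAMES[node.nodeType] for node in document.childNodes]
print(child_kinds)
print(document.parentNode)  # the top of the tree has no parent
```

If the document had a DOCTYPE, minidom would add a DocumentType child, which XPath's model leaves out, so the correspondence is only approximate.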
An element node has a name, a namespace URI, a parent node, and a list of child nodes, which may include other element nodes, comment nodes, processing-instruction nodes, and text nodes. An element node also has a collection of attributes and a collection of in-scope namespaces, none of which are considered to be children of the element. The string-value of an element node is the complete, parsed text between the element's start- and end-tags that remains after all tags, comments, and processing instructions are removed and all entity and character references are resolved.
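The string-value of an element can be approximated with Python's standard xml.etree.ElementTree module, as in the minimal sketch below. ElementTree is not an XPath data model, and the sample markup is invented here, but its default parser discards comments and processing instructions and resolves references, so concatenating all descendant text gives the same result as the definition above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with a child element, a comment, a processing
# instruction, an entity reference, and a character reference.
doc = '<p>Hello, <b>XPath</b><!-- note --> &amp; <?pi data?>friends&#33;</p>'
p = ET.fromstring(doc)

# itertext() walks the element and its descendants in document order,
# yielding only character data.
string_value = "".join(p.itertext())
print(string_value)
```

The tags, the comment, and the processing instruction contribute nothing to the result; only the resolved character data remains.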
An attribute node has a name, a namespace URI, a value, and a parent element. However, although elements are parents of attributes, attributes are not children of their parent elements. The biological metaphor breaks down here. xmlns and xmlns:prefix attributes are not represented as attribute nodes. An attribute node's value is the normalized attribute value.
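The following sketch uses Python's standard xml.etree.ElementTree module to illustrate these points. It is an analogy rather than an XPath implementation, and the document and attribute names are made up for the example. ElementTree likewise keeps attributes apart from child elements, omits namespace declarations from the attribute collection, and reports attribute values after the parser has normalized them.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one namespace declaration, two ordinary attributes
# (one containing a literal newline), and one child element.
doc = (
    '<book xmlns:dc="http://purl.org/dc/elements/1.1/" id="b1" '
    'note="line one\nline two"><chapter/></book>'
)
book = ET.fromstring(doc)

attribute_names = sorted(book.attrib)       # xmlns:dc is not listed
child_tags = [child.tag for child in book]  # attributes are not children
note = book.get("note")                     # newline normalized to a space

print(attribute_names, child_tags, repr(note))
```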
Each text node represents the maximum possible contiguous run of text between tags, processing instructions, and comments. A text node has a parent node but does not have children. A text node's value is the text of the node.
A namespace node represents a namespace in scope on an element. In general, each namespace declaration by an xmlns or xmlns:prefix attribute produces a namespace node on that element and on all of its descendant elements (unless overridden by another namespace declaration). Like attribute nodes, each namespace node has a parent element but is not the child of that parent. The name of a namespace node is the prefix. The value of a namespace node is the namespace URI.
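How declarations propagate to descendants and get overridden can be sketched with the standard xml.etree.ElementTree.iterparse function, which reports namespace declarations as events. The helper name in_scope_namespaces and the sample document are assumptions made for this illustration. The sketch also leaves out the implicit xml prefix binding that XPath places on every element.

```python
import io
import xml.etree.ElementTree as ET

def in_scope_namespaces(xml_text):
    """Return (tag, {prefix: uri}) for every element, in document order."""
    stack = [{}]    # bindings in scope on each currently open element
    pending = {}    # declarations reported just before the next start event
    result = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ("start-ns", "start", "end")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            prefix, uri = payload
            pending[prefix] = uri
        elif event == "start":
            merged = {**stack[-1], **pending}
            # xmlns="" undeclares the default namespace, so drop empty URIs
            scope = {p: u for p, u in merged.items() if u}
            pending = {}
            stack.append(scope)
            result.append((payload.tag, scope))
        else:  # "end": the element's bindings go out of scope
            stack.pop()
    return result

sample = (
    '<a xmlns="urn:default" xmlns:x="urn:x">'
    '<b xmlns:x="urn:x2"><x:c/></b>'
    '<d/>'
    '</a>'
)
scopes = in_scope_namespaces(sample)
for tag, bindings in scopes:
    print(tag, bindings)
```

In this sample, the b element rebinds the x prefix, so b and its child see urn:x2, while the sibling d still sees the binding inherited from a.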
A processing-instruction node has a target, data, a parent node, and no children. The name of a processing-instruction node is its target. The value of a processing-instruction node is the data of the processing instruction, not including any initial whitespace.
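The split between target and data can be seen with Python's standard xml.dom.minidom module, used here only as a convenient stand-in for an XPath tree. The stylesheet processing instruction below is an invented sample.

```python
from xml.dom import minidom

# Hypothetical sample: extra spaces separate the target from the data.
document = minidom.parseString(
    '<?xml-stylesheet   type="text/xsl" href="style.xsl"?><doc/>'
)
pi = document.childNodes[0]

print(pi.target)           # the name of the node
print(pi.data)             # the value, without the leading whitespace
print(pi.hasChildNodes())  # processing instructions have no children
```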
A comment node represents a comment. It has a parent node and no children. The value of a comment is the string content of the comment, not including the <!-- and -->.
The XML declaration and the document type declaration are not included in XPath's view of an XML document. All entity references, character references, and CDATA sections are resolved before the XPath tree is built. The references themselves are not included as a separate part of the tree.
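Most XML parsers perform this resolution before handing data to an application. As a small sketch, Python's standard xml.etree.ElementTree module (not an XPath engine; the sample text is invented) returns one plain string for content that mixes an entity reference, a character reference, and a CDATA section.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: &amp; is an entity reference, &#169; is a character
# reference, and the CDATA section holds literal markup characters.
doc = '<p>AT&amp;T &#169; 2002 <![CDATA[<b>&amp;</b>]]> end</p>'
p = ET.fromstring(doc)

print(p.text)   # a single run of text with everything resolved
print(len(p))   # no child elements: the CDATA content is just text
```

The markup inside the CDATA section stays literal, including its &amp;, because references are not recognized there.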