In HTML, comments are sometimes abused to support
nonstandard extensions. For instance, the contents of the script
element are sometimes enclosed in a
comment to protect them from display by a nonscript-aware browser. The
Apache web server parses comments in .shtml files to recognize server-side
includes. Unfortunately, these documents may not survive being passed
through various HTML editors and processors with their comments and
associated semantics intact. Worse yet, it's possible for an innocent
comment to be misconstrued as input to the application.
XML provides the processing instruction as
an alternative means of passing information to particular applications
that may read the document. A processing instruction begins with <?
and ends with ?>. Immediately following the <? is an XML name called
the target, possibly the name of the application for which this
processing instruction is intended or possibly just an identifier for
this particular processing instruction. The rest of the processing
instruction contains text in a format appropriate for the applications
for which the instruction is intended.
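As a rough illustration of the target/data split, the following Python sketch reads the processing instructions that appear at the top level of a document using the standard library's xml.dom.minidom. The target myapp and its data are invented for this example, and the helper name processing_instructions is hypothetical.

```python
from xml.dom import minidom

# Hypothetical document: the target "myapp" and its data are made up
# for illustration and do not belong to any real application.
DOC = '<?myapp mode="draft" retries=3?><person>Alan Turing</person>'

def processing_instructions(xml_text):
    """Return (target, data) pairs for the document's top-level processing instructions."""
    dom = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in dom.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    ]

print(processing_instructions(DOC))
```

The parser only separates the target from the rest of the text. Interpreting that text is left entirely to the application the instruction is meant for.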
For example, in HTML, a robots META
tag is used to tell search-engine and
other robots whether and how they should index a page. The following
processing instruction has been proposed as an equivalent for XML
documents:
<?robots index="yes" follow="no"?>
The target of this processing instruction is robots. The syntax of
this particular processing instruction is two pseudo-attributes, one
named index and one named follow, whose values are either yes or no.
The semantics of this particular processing instruction are that if
the index attribute has the value yes, then search-engine robots
should index this page. If index has the value no, then robots should
not index the page. Similarly, if follow has the value yes, then links
from this document will be followed; if it has the value no, they
won't be.
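Because pseudo-attributes are only a convention inside the instruction's text, an application has to parse them itself. The following Python sketch shows one way a robot might do that for the robots instruction above. The function names are hypothetical, and reporting a missing pseudo-attribute as None is an assumption of this sketch, since the proposal as described here does not state a default.

```python
import re

# Matches name="value" or name='value' pairs inside processing instruction data.
PSEUDO_ATTR = re.compile(r"""([A-Za-z_][\w.-]*)\s*=\s*(["'])(.*?)\2""")

def parse_pseudo_attributes(data):
    """Turn 'index="yes" follow="no"' into {'index': 'yes', 'follow': 'no'}."""
    return {name: value for name, _quote, value in PSEUDO_ATTR.findall(data)}

def robots_policy(data):
    """Map the index and follow pseudo-attributes to True, False, or None."""
    attrs = parse_pseudo_attributes(data)
    policy = {}
    for name in ("index", "follow"):
        value = attrs.get(name)
        # Assumption: a missing pseudo-attribute is reported as None
        # rather than guessing a default.
        policy[name] = None if value is None else value == "yes"
    return policy

print(robots_policy('index="yes" follow="no"'))
```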
Other processing instructions may have totally different syntaxes and semantics. For instance, processing instructions can contain an effectively unlimited amount of text. PHP includes large programs in processing instructions. For example:
<?php
  mysql_connect("database.unc.edu", "clerk", "password");
  $result = mysql("HR", "SELECT LastName, FirstName FROM Employees
    ORDER BY LastName, FirstName");
  $i = 0;
  while ($i < mysql_numrows($result)) {
    $fields = mysql_fetch_row($result);
    echo "<person>$fields[1] $fields[0] </person>\r\n";
    $i++;
  }
  mysql_close();
?>
Processing instructions are markup, but they're not elements.
Consequently, like comments, processing instructions may appear
anywhere in an XML document outside of a tag, including before or
after the root element. The most common processing instruction,
xml-stylesheet, is used to attach stylesheets to documents. It always
appears before the root element, as Example 2-6 demonstrates. In this
example, the xml-stylesheet processing instruction tells browsers to
apply the CSS stylesheet person.css to this document before showing
it to the reader.
Example 2-6. An XML document with a processing instruction in its prolog
<?xml-stylesheet href="person.css" type="text/css"?>
<person>
  Alan Turing
</person>
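To make the browser's side of this concrete, the following Python sketch looks for xml-stylesheet processing instructions in the prolog of the document from Example 2-6 and extracts their pseudo-attributes. It is a minimal illustration rather than how any particular browser works. The function name stylesheets is hypothetical, and the simple regular expression assumes double-quoted values only.

```python
import re
from xml.dom import minidom

# The document from Example 2-6.
DOC = """<?xml-stylesheet href="person.css" type="text/css"?>
<person>
  Alan Turing
</person>"""

def stylesheets(xml_text):
    """Return the pseudo-attributes of each xml-stylesheet instruction in the prolog."""
    dom = minidom.parseString(xml_text)
    found = []
    for node in dom.childNodes:
        if node is dom.documentElement:
            break  # the prolog ends where the root element starts
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # Assumption: values are double-quoted, as in Example 2-6.
            found.append(dict(re.findall(r'([\w-]+)\s*=\s*"([^"]*)"', node.data)))
    return found

print(stylesheets(DOC))
```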
The processing instruction names xml, XML, XmL, etc., in any
combination of case, are forbidden in order to avoid confusion with
the XML declaration. Otherwise, you're free to pick any legal XML name
for your processing instructions.
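The case-insensitive restriction can be expressed as a one-line check. The following Python sketch uses a hypothetical helper, is_reserved_target, and tests only this one rule. It does not verify that the target is otherwise a legal XML name.

```python
def is_reserved_target(target):
    """Return True if the target is xml in any combination of case."""
    return target.lower() == "xml"

print(is_reserved_target("XmL"))             # forbidden as a target
print(is_reserved_target("xml-stylesheet"))  # allowed: only the exact name is forbidden
print(is_reserved_target("robots"))          # allowed
```

Note that the check compares the whole target, which is why xml-stylesheet remains a legal target even though it begins with the same three letters.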