Example 1-1 shows a simple XML document. This particular XML document might be seen in an inventory-control system or a stock database. It marks up the data with tags and attributes describing the color, size, bar-code number, manufacturer, name of the product, and so on.
Programs that actually try to understand the contents of the XML document—that is, do more than merely treat it as any other text file—will use an XML parser to read the document. The parser is responsible for dividing the document into individual elements, attributes, and other pieces. It passes the contents of the XML document to an application piece by piece. If at any point the parser detects a violation of the well-formedness rules of XML, then it reports the error to the application and stops parsing. In some cases, the parser may read further in the document, past the original error, so that it can detect and report other errors that occur later in the document. However, once it has detected the first well-formedness error, it will no longer pass along the contents of the elements and attributes it encounters.
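The piece-by-piece delivery and the stop-on-first-error behavior described above can be sketched with the SAX interface in Python's standard library. This is only an illustration under stated assumptions: the `EventCollector` class, the `parse_events` function, and the sample product markup are hypothetical names invented here, not part of Example 1-1.

```python
import xml.sax


class EventCollector(xml.sax.ContentHandler):
    """Hypothetical application code: records each piece the parser hands over."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # The parser reports a start-tag together with its attributes.
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))


def parse_events(xml_text):
    """Return (events, error); error is None if the document is well-formed."""
    handler = EventCollector()
    error = None
    try:
        xml.sax.parseString(xml_text.encode("utf-8"), handler)
    except xml.sax.SAXParseException as exc:
        # The default error handler raises on a well-formedness error,
        # so no further content reaches the handler after this point.
        error = exc
    return handler.events, error


# Assumed sample markup; the <color> element is never closed.
MALFORMED = '<product barcode="2394287410"><name>Widget</name><color>blue</product>'
WELL_FORMED = '<product barcode="2394287410"><name>Widget</name><color>blue</color></product>'
```

Calling `parse_events(MALFORMED)` should return the events delivered before the mismatched end-tag along with a non-`None` error, whereas `parse_events(WELL_FORMED)` returns the complete event list and no error.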
Individual XML applications normally dictate more precise rules
about exactly which elements and attributes are allowed where. For
instance, you wouldn't expect to find a G_Clef
element when reading a biology
document. Some of these rules can be precisely specified with a schema
written in any of several languages, including the W3C XML Schema
Language, RELAX NG, and DTDs. A document may contain a URL indicating
where the schema can be found. Some XML parsers will notice this and
compare the document to its schema as they read it to see if the
document satisfies the constraints specified there. Such a parser is
called a validating parser. A violation of those constraints is called a
validity error, and the whole process of checking a document against a
schema is called validation. If a validating parser finds a validity error, it will
report it to the application on whose behalf it's parsing the
document. This application can then decide whether it wishes to
continue parsing the document. However, validity errors are not
necessarily fatal (unlike well-formedness errors), and an application
may choose to ignore them. Not all parsers are validating parsers.
Some merely check for well-formedness.
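The difference between a fatal well-formedness error and a non-fatal validity error can be illustrated with a rough sketch. The parsers in Python's standard library do not validate, so the code below substitutes a hand-written table of allowed child elements for a real schema; it is not DTD, W3C XML Schema, or RELAX NG validation. The `ALLOWED_CHILDREN` table, the `find_validity_errors` function, and the element names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a schema: element name -> allowed child element names.
ALLOWED_CHILDREN = {
    "product": {"name", "color", "size"},
    "name": set(),
    "color": set(),
    "size": set(),
}


def find_validity_errors(xml_text):
    """Return a list of validity-error messages for a well-formed document.

    A document that is not well-formed raises ET.ParseError instead,
    because that kind of error is fatal.
    """
    root = ET.fromstring(xml_text)
    errors = []
    if root.tag not in ALLOWED_CHILDREN:
        errors.append(f"<{root.tag}> is not a known root element")
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                errors.append(f"<{child.tag}> is not allowed inside <{parent.tag}>")
    return errors


# Well-formed, but it contains an element the rules do not permit.
INVALID = "<product><name>Widget</name><G_Clef/></product>"
VALID = "<product><name>Widget</name><color>blue</color></product>"
```

Because the errors come back as an ordinary list rather than an exception, the calling application is free to log them and keep working with the document, or to reject it, which mirrors the choice described above.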
The application that receives data from the parser may be:
A word processor, such as StarOffice Writer, that loads the XML document for editing
A database, such as Microsoft SQL Server, that stores the XML data in a new record
A personal finance program, such as Microsoft Money, that sees the XML as a bank statement
A syndication program that reads the XML document and extracts the headlines for today's news