XML Syntax

For each section of this reference that maps directly to an XML language structure, an informal syntax reference describes that structure's form. The following conventions are used with these syntax blocks:

Format           Meaning

DOCTYPE          Bold text indicates literal characters that must appear as written within the document (e.g., DOCTYPE).

encoding-name    Italicized text indicates that the user must replace the text with real data. The item name indicates what type of data should be inserted (e.g., encoding-name = UTF-8).

|                The vertical bar indicates that only one out of a list of possible values can be selected.

[ ]              Square brackets indicate that a particular portion of the syntax is optional.

Every XML document is divided into two primary sections: the prolog and the document element (also called the root element). A few documents may also have comments or processing instructions that follow the document element in a sort of epilog (an unofficial term). The prolog contains structural information about the particular type of XML document you are writing, including the XML declaration and the document type declaration. The prolog is optional, and if a document does not need to be validated against a DTD, it can be omitted completely. The only required structure in a well-formed XML document is the top-level document element itself.
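As a minimal sketch of this layout, the following Python fragment parses two documents with the standard library's xml.etree.ElementTree module: one with no prolog at all, and one with a prolog and a trailing comment and processing instruction. The sample documents, the greeting.dtd file name, the app-hint target, and the root_tag helper are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: no prolog, only the required document element.
MINIMAL = "<greeting>Hello</greeting>"

# Hypothetical sample: a prolog (XML declaration plus document type
# declaration), the document element, then "epilog" content.
# ElementTree does not fetch or apply the external DTD named here.
FULL = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE greeting SYSTEM "greeting.dtd">
<greeting>Hello</greeting>
<!-- a comment after the document element -->
<?app-hint done?>
"""


def root_tag(xml_text):
    """Parse the text and return the name of the document element."""
    return ET.fromstring(xml_text).tag


print(root_tag(MINIMAL))
print(root_tag(FULL))
```

Both calls report the same document element name, which suggests that the prolog and the epilog are optional wrappers around the one required structure.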

The following syntax structures are common to the entire XML document. Unless otherwise noted within a subsequent reference item, the following structures can appear anywhere within an XML document.

Chapter 2 explained the difference between well-formed and valid documents. Well-formed documents that include and conform to a given DTD are considered valid. Documents that include a DTD but violate the rules of that DTD are invalid. The DTD consists of both the internal subset (declarations contained directly within the document) and the external subset (declarations that are included from outside the main document).
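The distinction can be illustrated with a short Python sketch. It assumes the standard library's xml.etree.ElementTree module, which checks well-formedness only and does not validate against a DTD, so a document that breaks its own DTD still parses. The note documents and the is_well_formed helper are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML.

    ElementTree is a non-validating parser, so DTD rules are not
    enforced here; only well-formedness is checked.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Hypothetical document whose DTD lives entirely in the internal subset.
VALID = """<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><to>Ann</to><body>Hi</body></note>"""

# Still well-formed, but invalid: the required <to> element is missing.
INVALID = VALID.replace("<to>Ann</to>", "")

# Not well-formed: <to> is never closed before </note>.
BROKEN = "<note><to>Ann</note>"

for label, text in [("valid", VALID), ("invalid", INVALID), ("broken", BROKEN)]:
    print(label, is_well_formed(text))
```

Telling the valid document apart from the invalid one requires a validating parser, which the Python standard library does not provide.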

Elements are an XML document's lifeblood. They provide the structure for the character data and attribute values that make up a particular instance of an XML document type definition. The !ELEMENT and !ATTLIST declarations from the DTD restrict the possible contents of an element within a valid XML document. Combining elements or attributes in ways that violate these restrictions generates an error in a validating parser.
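A small Python sketch may help show how elements carry both character data and attribute values. It uses the standard library's xml.etree.ElementTree module, which reads the declarations in the internal subset but does not enforce them. The book document and its id value are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the DTD allows a <book> to contain exactly one
# <title> and requires an "id" attribute on <book>.
DOC = """<!DOCTYPE book [
  <!ELEMENT book (title)>
  <!ATTLIST book id CDATA #REQUIRED>
  <!ELEMENT title (#PCDATA)>
]>
<book id="b1"><title>Sample Title</title></book>"""

book = ET.fromstring(DOC)   # the document element
title = book.find("title")  # its only child element

# Element name, attribute value, and character data, in that order.
summary = (book.tag, book.attrib["id"], title.text)
print(summary)
```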

Although namespace support was not part of the original XML 1.0 Recommendation, Namespaces in XML was approved less than a year later (January 14, 1999). Namespaces are used to distinguish the element and attribute names of a given XML application from those of other applications. See Chapter 4 for more detailed information.
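As a rough illustration of how namespaces keep names distinct, the Python sketch below parses a document that declares a default namespace and a prefixed one. The standard library's xml.etree.ElementTree module reports each qualified name in its own {URI}local-name notation, which is a convention of that module rather than XML syntax. The example.com URIs and the catalog document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a default namespace for elements and a
# prefixed namespace ("inv") for one attribute.
DOC = """<catalog xmlns="http://example.com/catalog"
         xmlns:inv="http://example.com/inventory">
  <item inv:count="3">Widget</item>
</catalog>"""

root = ET.fromstring(DOC)
item = root[0]  # the first (and only) child element

# Unprefixed elements pick up the default namespace.
print(root.tag)
print(item.tag)

# The prefixed attribute is keyed by its own namespace URI.
COUNT = "{http://example.com/inventory}count"
print(item.attrib[COUNT])
```

Note that the prefix itself (inv) does not appear in the parsed names; only the namespace URI it was bound to is significant.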

The following sections describe how namespaces impact the formation and interpretation of element and attribute names within an XML document.