HTML/XHTML Document Elements

Every HTML document should conform to the HTML SGML DTD, the formal Document Type Definition that defines the HTML standard. The DTD defines the tags and syntax used to create an HTML document. You can tell the browser which DTD your document complies with by placing a special Standard Generalized Markup Language (SGML) command on the first line of the document:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">

This cryptic message indicates that your document is intended to comply with the final HTML 4.01 DTD defined by the World Wide Web Consortium (W3C). Other versions of the DTD define more restricted versions of the HTML standard, and not all browsers support all versions of the HTML DTD. In fact, specifying any other <!DOCTYPE> may cause the browser to misinterpret your document when displaying it for the user. It's also unclear which <!DOCTYPE> to use if you include nonstandard, albeit popular, extensions in the HTML document, or even if you write to the deprecated HTML 3.0 standard, for which a DTD was never released.

HTML developers are increasingly including an appropriate SGML DOCTYPE command at the start of their HTML documents. Because versions and standards are easily confused, if you do choose to include a DOCTYPE in your HTML document, choose the one that matches the version of HTML you actually wrote, so that your document is rendered correctly.

For XHTML authors, we do strongly recommend that you include the proper DOCTYPE statement in your XHTML documents, in conformance with XML standards. Read Chapters 15 and 16 for more about DTDs and the XML and XHTML standards.
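To illustrate the recommendation for XHTML authors, here is a minimal sketch of an XHTML document that opens with a DOCTYPE statement. The DOCTYPE and namespace shown are the ones the W3C publishes for XHTML 1.0 Strict; the choice of the Strict DTD, the UTF-8 encoding in the optional XML declaration, and the title and paragraph text are assumptions made only for this example.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- The XML declaration above is optional; UTF-8 is assumed here -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <!-- Placeholder title; substitute your own -->
  <title>Placeholder title</title>
</head>
<body>
  <!-- Placeholder content; substitute your own -->
  <p>Placeholder content.</p>
</body>
</html>
```

Note that, unlike the HTML declaration, the XHTML DOCTYPE names the root element in lowercase (html), because XML is case-sensitive.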

As we saw earlier, the <html> and </html> tags serve to delimit the beginning and end of a document. Since the typical browser can easily infer from the enclosed source that it is an HTML or XHTML document, you don't really need to include the tag in your source HTML document.

That said, it's considered good form to include this tag so that other tools, particularly more mundane text-processing ones, can recognize your document as an HTML document. At the very least, the presence of the beginning and end <html> tags ensures that the beginning or the end of the document has not inadvertently been deleted. Besides, XHTML requires the <html> and </html> tags.
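To make the point about the optional tag concrete, the following sketch shows a short HTML 4.01 document that leaves out the <html>, <head>, and <body> tags altogether. It should still be accepted as HTML 4.01, because that DTD lets the browser infer those elements; the title and paragraph text are placeholders invented for this example. The same shortcut is not available in XHTML, which requires the tags to be written out.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<!-- No <html>, <head>, or <body> tags: HTML 4.01 lets the browser infer them -->
<!-- Placeholder title and text; substitute your own -->
<title>Placeholder title</title>
<p>Placeholder content.
```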

Between <html> and </html> are the document's head and body. Within the head, you'll find tags that identify the document and define its place within a document collection. Within the body is the actual document content, defined by tags that determine the layout and appearance of the document text. As you might expect, the document head is contained within <head> and </head> tags and the body is within <body> and </body> tags, all of which we define in more detail later in this chapter.[*]

By far, the most common form of the <html> tag is simply:

<html>
document head and body content
</html>
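Filled in, that form might look like the following sketch, which combines the DOCTYPE command from the start of this section with a head and a body inside the <html> tag. The title, heading, and paragraph text are placeholders assumed for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
  <!-- Head: identifies the document (placeholder title) -->
  <title>Placeholder title</title>
</head>
<body>
  <!-- Body: the actual document content (placeholder text) -->
  <h1>Placeholder heading</h1>
  <p>Placeholder paragraph.</p>
</body>
</html>
```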


[*] For the special HTML/XHTML frame document, a <frameset> tag replaces the <body> tag; more about this in Chapter 11.