Ultimately, one hopes that browsers will be able to display not just XHTML documents but any XML document as well. Since it's too much to ask that browsers provide semantics for all XML applications both current and yet-to-be-invented, stylesheets will be attached to each document to provide instructions about how each element will be rendered.
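The current major stylesheet languages are:
Cascading Style Sheets Level 1 (CSS1)
Cascading Style Sheets Level 2 (CSS2)
XSL Transformations 1.0 (XSLT 1.0)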
Eventually, there will be more versions of these, including at least CSS 2.1, CSS Level 3, and XSLT 2.0. However, let's begin by looking at how and how well existing style languages are supported by existing browsers.
The stylesheet associated with a document is indicated by an xml-stylesheet processing instruction in the document's prolog, which comes after the XML declaration but before the root element start-tag. This processing instruction uses pseudo-attributes to describe the stylesheet (that is, they look like attributes but are not attributes because xml-stylesheet is a processing instruction and not an element).
There are two required pseudo-attributes for xml-stylesheet processing instructions. The value of the href pseudo-attribute gives the URL, possibly relative, where the stylesheet can be found. The type pseudo-attribute value specifies the MIME media type of the stylesheet: text/css for cascading stylesheets and application/xml for XSLT stylesheets. In Example 7-3, the xml-stylesheet processing instruction tells browsers to apply the CSS stylesheet person.css to this document before showing it to the reader.
Example 7-3. An XML document associated with a stylesheet
<?xml version="1.0"?>
<?xml-stylesheet href="person.css" type="text/css"?>
<person>
  Alan Turing
</person>
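To make the pseudo-attribute idea concrete, here is a minimal Python sketch of how a program might read the xml-stylesheet processing instructions from a document's prolog. The function name stylesheet_instructions and the regular expression are illustrative assumptions, not part of any specification. Because pseudo-attributes are not real attributes, the parser reports the instruction's data as one string, and the sketch splits it by hand. For simplicity it ignores character and entity references inside values.

```python
import re
from xml.dom import minidom

# Pseudo-attributes look like name="value" or name='value', but the XML
# parser does not parse them; it only reports the PI data as one string.
PSEUDO_ATTR = re.compile(r"""([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def stylesheet_instructions(xml_text):
    """Return one dict of pseudo-attributes per xml-stylesheet PI in the prolog."""
    doc = minidom.parseString(xml_text)
    found = []
    for node in doc.childNodes:
        if node.nodeType == node.ELEMENT_NODE:
            break  # the prolog ends where the root element starts
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            attrs = {}
            for name, double_quoted, single_quoted in PSEUDO_ATTR.findall(node.data):
                attrs[name] = double_quoted or single_quoted
            found.append(attrs)
    return found


example = """<?xml version="1.0"?>
<?xml-stylesheet href="person.css" type="text/css"?>
<person>
  Alan Turing
</person>"""

print(stylesheet_instructions(example))
```

Run against Example 7-3, the sketch reports a single stylesheet with its href and type pseudo-attributes.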
Microsoft Internet Explorer uses type="text/xsl" for XSLT stylesheets. However, the text/xsl MIME media type has not been and will not be registered with the IANA. It is a figment of Microsoft's imagination. In the future, application/xslt+xml will be registered to identify XSLT stylesheets specifically.
In addition to these two required pseudo-attributes, there are four optional pseudo-attributes:
media
charset
alternate
title
The media pseudo-attribute contains a short string identifying the medium this stylesheet should be used for, such as paper, onscreen display, or television. It can specify either a single medium or a comma-separated list of media. The recognized values include:
screen
Computer monitors
tty
Teletypes, terminals, xterms, and other monospaced, text-only devices
tv
Televisions, WebTVs, video game consoles, and the like
projection
Slides, transparencies, and direct-from-laptop presentations that will be shown to an audience on a large screen
handheld
PDAs, cell phones, GameBoys, and the like
print
Paper
braille
Tactile feedback devices for the sight-impaired
aural
Screen readers and speech synthesizers
all
All of the previously mentioned plus any that haven't been invented yet
For example, this xml-stylesheet processing instruction says that the CSS stylesheet at http://www.cafeconleche.org/style/titus.css should be used for television, projection, and print:
<?xml-stylesheet href="http://www.cafeconleche.org/style/titus.css" type="text/css" media="tv, projection, print"?>
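As a rough illustration of how a comma-separated media value might be interpreted, the following Python sketch splits the value into individual media and checks whether a given device medium is covered. The helper names media_list and applies_to are hypothetical, and the matching rule (exact name match, with all covering every medium) is a simplifying assumption rather than a statement of how any particular browser behaves.

```python
def media_list(media_value):
    """Split a media pseudo-attribute value into individual medium names."""
    return [name.strip() for name in media_value.split(",") if name.strip()]


def applies_to(media_value, medium):
    """Return True if a stylesheet with this media value covers `medium`."""
    media = media_list(media_value)
    # "all" is taken to cover every medium, including ones not yet invented.
    return "all" in media or medium in media


titus = "tv, projection, print"
print(media_list(titus))
print(applies_to(titus, "print"))
print(applies_to(titus, "screen"))
```

Under these assumptions, the titus.css stylesheet would be used for print but not for an ordinary computer screen.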
The charset pseudo-attribute specifies in which character set the stylesheet is written, using the same values as the encoding declaration. For example, to say that the CSS stylesheet koran.css is written in the ISO-8859-6 character set, you'd use this processing instruction:
<?xml-stylesheet href="koran.css" type="text/css" charset="ISO-8859-6"?>
The alternate pseudo-attribute specifies whether this is the primary stylesheet for its media type or an alternate one for special cases. The default value is no, which indicates that it is the primary stylesheet. If alternate has the value yes, then the browser may (but does not have to) present the user with a choice from among the alternate stylesheets. If it does offer a choice, then it uses the value of the title pseudo-attribute to tell the user how the stylesheets differ. For example, these three xml-stylesheet processing instructions offer the user a choice between large, small, and medium text:
<?xml-stylesheet href="big.css" type="text/css" alternate="yes" title="Large fonts"?>
<?xml-stylesheet href="small.css" type="text/css" alternate="yes" title="Small fonts"?>
<?xml-stylesheet href="medium.css" type="text/css" title="Normal fonts"?>
Browsers that aren't able to ask the user to choose a stylesheet will simply pick the first nonalternate sheet that most closely matches their media type (screen for a typical web browser).
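The fallback rule just described can be sketched in Python as follows. This is only an approximation under stated assumptions: each stylesheet is represented as a plain dict of its pseudo-attributes, a missing media pseudo-attribute is treated as all, and "most closely matches" is reduced to "first sheet whose media list includes the medium". The function names default_stylesheet and alternate_titles are hypothetical.

```python
def covers(sheet, medium):
    """True if the sheet's media list includes `medium` (or is "all")."""
    # Assumption: a sheet with no media pseudo-attribute applies everywhere.
    media = [name.strip() for name in sheet.get("media", "all").split(",")]
    return "all" in media or medium in media


def default_stylesheet(sheets, medium="screen"):
    """Pick the first nonalternate stylesheet that applies to `medium`."""
    for sheet in sheets:
        # alternate defaults to "no", which marks a primary stylesheet.
        if sheet.get("alternate", "no") != "yes" and covers(sheet, medium):
            return sheet
    return None


def alternate_titles(sheets):
    """Titles a browser could show if it lets the user choose an alternate."""
    return [s["title"] for s in sheets
            if s.get("alternate") == "yes" and "title" in s]


sheets = [
    {"href": "big.css", "type": "text/css", "alternate": "yes", "title": "Large fonts"},
    {"href": "small.css", "type": "text/css", "alternate": "yes", "title": "Small fonts"},
    {"href": "medium.css", "type": "text/css", "title": "Normal fonts"},
]

print(default_stylesheet(sheets)["href"])
print(alternate_titles(sheets))
```

With the three processing instructions from the example, a browser that cannot offer a choice would end up with medium.css, while one that can would list the two alternate titles.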
Microsoft Internet Explorer 4.0 (IE4) and later includes an XML parser that can be accessed from VBScript or JavaScript. This is used internally to support channels and the Active Desktop. Your own JavaScript and VBScript programs can use this parser to read XML data and insert it into the web page. However, anything more straightforward, like simply displaying a page of XML from a specified URL, is beyond IE4's capabilities. Furthermore, IE4 doesn't understand any stylesheet language when applied to XML.
Internet Explorer 5 (IE5) and 5.5 (IE 5.5) do understand XML, although their parser is more than a little buggy; it rejects a number of documents it shouldn't reject, most embarrassingly the XML 1.0 specification itself. Internet Explorer 6 (IE6) has improved XML support somewhat, but it is still not completely conformant.
IE5 and later can directly display XML files, with or without an associated stylesheet. If no stylesheet is provided, then IE5 uses a default, built-in XSLT stylesheet that displays the tree structure of the XML document along with a little DHTML to allow the user to collapse and expand nodes in the tree. Figure 7-1 shows IE5 displaying Example 6-1 from the last chapter.
IE5 also supports parts of CSS Level 1 and a little of CSS Level 2. However, the support is spotty and inconsistent. Even some aspects of CSS that work for HTML documents fail when applied to XML documents. IE 5.5 and IE6 slightly improve coverage of CSS but don't support all CSS properties and selectors. In fact, many CSS features that work in IE6 for HTML still don't work when applied to XML documents.
IE5 and IE 5.5 support their own custom version of XSLT, based on a very early working draft of the XSLT specification. They do not support XSLT 1.0. You can tell the difference by looking at the namespace of the stylesheet. A stylesheet written for IE5 uses the http://www.w3.org/TR/WD-xsl namespace, whereas a stylesheet designed for standard-compliant XSLT processors uses the http://www.w3.org/1999/XSL/Transform namespace. Despite superficial similarities, these two languages are not compatible. A stylesheet written for IE5 will not work with any other XSLT processor, and a stylesheet written using standard XSLT 1.0 will not work in IE5. IE6 supports both real XSLT and Microsoft's nonstandard dialect.
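Because the namespace is what distinguishes the two languages, a tool can classify a stylesheet by looking at the namespace of its root element. The Python sketch below does this with the standard library; the function name stylesheet_dialect and the returned labels are illustrative assumptions, and the sketch only inspects the root element, so a stylesheet whose root is not in either namespace is reported as unknown.

```python
import xml.etree.ElementTree as ET

STANDARD_XSLT_NS = "http://www.w3.org/1999/XSL/Transform"
IE5_DRAFT_NS = "http://www.w3.org/TR/WD-xsl"


def stylesheet_dialect(stylesheet_text):
    """Classify a stylesheet by the namespace URI of its root element."""
    root = ET.fromstring(stylesheet_text)
    # ElementTree writes qualified names as "{namespace-uri}local-name".
    if root.tag.startswith("{"):
        namespace = root.tag[1:].split("}", 1)[0]
    else:
        namespace = ""
    if namespace == STANDARD_XSLT_NS:
        return "standard XSLT"
    if namespace == IE5_DRAFT_NS:
        return "IE5 working-draft dialect"
    return "unknown"


standard = ('<xsl:stylesheet version="1.0" '
            'xmlns:xsl="http://www.w3.org/1999/XSL/Transform"/>')
draft = '<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl"/>'

print(stylesheet_dialect(standard))
print(stylesheet_dialect(draft))
```

Note that the xsl prefix itself carries no meaning; only the namespace URI it is bound to determines which language the stylesheet is written in.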
Netscape 4.x and earlier do not provide any significant support for displaying XML in the browser. Netscape 4.0.6 and later do use XML internally for some features such as "What's Related." However, the parser used isn't accessible to the page author, even through JavaScript.
Mozilla 1.0 and Netscape 6.0 and later do fully support display of XML in the browser. CSS Level 2 is almost completely supported, and XSLT support is pretty good too. Mozilla can read an XML web page, download the associated CSS or XSLT stylesheet, apply it to the document, and display the result to the end user, all completely automatically and more or less exactly as XML on the Web was always meant to work. Mozilla also partially supports MathML and SVG. The SVG support is not switched on by default as of Mozilla 1.7, and the MathML support requires some extra fonts with more mathematical symbols; neither of these is hard to add.
Authoring your web pages in XML does not necessarily require serving them in XML. Fourth-generation and earlier browsers that don't support XML in any significant way will be with us for some time to come. Servicing users with these browsers requires standard, ordinary HTML that works in any browser back to Mosaic 1.0.
One popular option is to write the pages in XML but serve them in HTML. When the server receives a request for an XML document, it automatically converts the document to HTML and sends the converted document instead. More sophisticated servers can cache the converted documents. They can also recognize browsers that support XML and send them the raw XML instead.
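The serving strategy described above might be sketched in Python like this. Several things here are assumptions rather than anything the text prescribes: the converter is passed in as an ordinary function (a stand-in for a real XSLT transformation), XML-capable browsers are recognized by looking for an XML media type in the Accept header (a real server might inspect the User-Agent header instead), and the cache is a simple in-memory dict keyed by the source XML.

```python
def accepts_xml(accept_header):
    """Assumed heuristic: the client lists an XML media type explicitly."""
    types = [part.split(";")[0].strip().lower()
             for part in accept_header.split(",")]
    return "application/xml" in types or "text/xml" in types


class XmlOrHtmlResponder:
    """Send raw XML to clients that accept it, cached HTML to the rest."""

    def __init__(self, to_html):
        self.to_html = to_html  # hypothetical XML-to-HTML converter
        self.cache = {}         # source XML text -> converted HTML

    def respond(self, xml_text, accept_header):
        if accepts_xml(accept_header):
            return "application/xml", xml_text
        if xml_text not in self.cache:
            # Convert once; later requests for the same document reuse it.
            self.cache[xml_text] = self.to_html(xml_text)
        return "text/html", self.cache[xml_text]


conversions = []


def fake_to_html(xml_text):
    """Stand-in converter that records each call; not a real transformation."""
    conversions.append(xml_text)
    return "<html><body><p>Alan Turing</p></body></html>"


responder = XmlOrHtmlResponder(fake_to_html)
person = "<person>Alan Turing</person>"
print(responder.respond(person, "text/html, image/gif"))
print(responder.respond(person, "application/xml, text/html"))
```

Keying the cache on the source text keeps the sketch from serving stale HTML after the XML changes; a production server would more likely key on the URL and a modification time.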
The preferred way to perform the conversion is with an XSLT stylesheet and a Java servlet. Indeed, most XSLT engines, such as Xalan-J and SAXON, include servlets that do exactly this. However, other schemes are possible, for instance, using PHP or CGI instead of a servlet. The key is to make sure that browsers only receive what they know how to read and display. We'll talk more about XSLT in the next chapter.