Direct Display of XML in Browsers

Ultimately, one hopes that browsers will be able to display not just XHTML documents but any XML document as well. Since it's too much to ask that browsers provide semantics for all XML applications both current and yet-to-be-invented, stylesheets will be attached to each document to provide instructions about how each element will be rendered.

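The current major stylesheet languages are Cascading Style Sheets Level 1 (CSS1), Cascading Style Sheets Level 2 (CSS2), and XSL Transformations 1.0 (XSLT 1.0).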

Eventually, there will be more versions of these, including at least CSS 2.1, CSS Level 3, and XSLT 2.0. However, let's begin by looking at how and how well existing style languages are supported by existing browsers.

The stylesheet associated with a document is indicated by an xml-stylesheet processing instruction in the document's prolog; that is, it comes after the XML declaration but before the root element start-tag. This processing instruction uses pseudo-attributes, such as type and href, to describe the stylesheet. They look like attributes, but they are not attributes, because xml-stylesheet is a processing instruction and not an element.
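To make the pseudo-attribute idea concrete, here is a minimal Python sketch that parses a small document and pulls the pseudo-attributes out of its xml-stylesheet processing instruction. The sample document, the stylesheet name person.css, and the helper name stylesheet_pseudo_attributes are all assumptions made for illustration. Because pseudo-attributes are not real attributes, the parser returns them as one string, and the sketch splits that string with a simple regular expression rather than a full implementation of the rules.

```python
import re
from xml.dom import minidom

# Hypothetical sample document; the stylesheet name "person.css" is an assumption.
DOCUMENT = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="person.css"?>
<person>
  <name>Alan Turing</name>
</person>
"""

# A parser reports the whole content of a processing instruction as a single
# string, so the name="value" pairs have to be picked apart by hand.
PSEUDO_ATTRIBUTE = re.compile(r"""([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def stylesheet_pseudo_attributes(xml_text):
    """Return one dict of pseudo-attributes per xml-stylesheet
    processing instruction found in the document's prolog."""
    document = minidom.parseString(xml_text)
    results = []
    for node in document.childNodes:
        if node.nodeType == node.ELEMENT_NODE:
            break  # the root element ends the prolog
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            pairs = {}
            for match in PSEUDO_ATTRIBUTE.finditer(node.data):
                name, double_quoted, single_quoted = match.groups()
                pairs[name] = (double_quoted if double_quoted is not None
                               else single_quoted)
            results.append(pairs)
    return results


if __name__ == "__main__":
    print(stylesheet_pseudo_attributes(DOCUMENT))
```

A browser performs roughly the same lookup before it fetches the stylesheet named by href and decides, from type, whether to treat it as CSS or XSLT.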

Microsoft Internet Explorer 4.0 (IE4) and later include an XML parser that can be accessed from VBScript or JavaScript. This parser is used internally to support channels and the Active Desktop. Your own JavaScript and VBScript programs can use it to read XML data and insert it into the web page. However, anything more straightforward, like simply displaying a page of XML from a specified URL, is beyond IE4's capabilities. Furthermore, IE4 doesn't understand any stylesheet language when applied to XML.

Internet Explorer 5 (IE5) and 5.5 (IE 5.5) do understand XML, although their parser is more than a little buggy; it rejects a number of documents it shouldn't reject, most embarrassingly the XML 1.0 specification itself. Internet Explorer 6 (IE6) has improved XML support somewhat, but it is still not completely conformant.

IE5 and later can directly display XML files, with or without an associated stylesheet. If no stylesheet is provided, then IE5 uses a default, built-in XSLT stylesheet that displays the tree structure of the XML document along with a little DHTML to allow the user to collapse and expand nodes in the tree. Figure 7-1 shows IE5 displaying Example 6-1 from the last chapter.

IE5 also supports parts of CSS Level 1 and a little of CSS Level 2. However, the support is spotty and inconsistent. Even some aspects of CSS that work for HTML documents fail when applied to XML documents. IE 5.5 and IE6 slightly improve coverage of CSS but don't support all CSS properties and selectors. In fact, many CSS features that work in IE6 for HTML still don't work when applied to XML documents.

IE5 and IE 5.5 support their own custom version of XSLT, based on a very early working draft of the XSLT specification. They do not support XSLT 1.0. You can tell the difference by looking at the namespace of the stylesheet. A stylesheet written for IE5 uses the http://www.w3.org/TR/WD-xsl namespace, whereas a stylesheet designed for standard-compliant XSLT processors uses the http://www.w3.org/1999/XSL/Transform namespace. Despite superficial similarities, these two languages are not compatible. A stylesheet written for IE5 will not work with any other XSLT processor, and a stylesheet written using standard XSLT 1.0 will not work in IE5. IE6 supports both real XSLT and Microsoft's nonstandard dialect.
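The namespace test described above can be sketched in a few lines of Python. This is only an illustration: the function name xslt_dialect, the labels it returns, and the two sample stylesheets are assumptions, and the sketch looks only at the root element, so it does not handle simplified stylesheets whose root is a literal result element.

```python
import xml.etree.ElementTree as ET

# The two namespace URIs named in the text.
IE5_DRAFT_NS = "http://www.w3.org/TR/WD-xsl"
XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical sample stylesheets, one per dialect.
STANDARD_STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/"><html/></xsl:template>
</xsl:stylesheet>"""

IE5_STYLESHEET = """<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="/"><html/></xsl:template>
</xsl:stylesheet>"""


def xslt_dialect(stylesheet_text):
    """Classify a stylesheet by the namespace of its root element."""
    root = ET.fromstring(stylesheet_text)
    # ElementTree reports qualified names as "{namespace-uri}local-name".
    if root.tag.startswith("{"):
        namespace = root.tag[1:].split("}", 1)[0]
    else:
        namespace = ""
    if namespace == XSLT_NS:
        return "standard XSLT"
    if namespace == IE5_DRAFT_NS:
        return "IE5 working-draft dialect"
    return "unknown"


if __name__ == "__main__":
    print(xslt_dialect(STANDARD_STYLESHEET))
    print(xslt_dialect(IE5_STYLESHEET))
```

The check depends on the namespace URI, not on the xsl prefix, which is only a convention and can be any name the stylesheet author chooses.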

Netscape 4.x and earlier do not provide any significant support for displaying XML in the browser. Netscape 4.0.6 and later do use XML internally for some features such as "What's Related." However, the parser used isn't accessible to the page author, even through JavaScript.

Mozilla 1.0 and Netscape 6.0 and later do fully support display of XML in the browser. CSS Level 2 is almost completely supported, and XSLT support is pretty good too. Mozilla can read an XML web page, download the associated CSS or XSLT stylesheet, apply it to the document, and display the result to the end user, all completely automatically and more or less exactly as XML on the Web was always meant to work. Mozilla also partially supports MathML and SVG. The SVG support is not switched on by default as of Mozilla 1.7, and the MathML support requires some extra fonts with more mathematical symbols; neither of these is hard to add.

Authoring your web pages in XML does not necessarily require serving them in XML. Fourth-generation and earlier browsers that don't support XML in any significant way will be with us for some time to come. Servicing users with these browsers requires standard, ordinary HTML that works in any browser back to Mosaic 1.0.

One popular option is to write the pages in XML but serve them in HTML. When the server receives a request for an XML document, it automatically converts the document to HTML and sends the converted document instead. More sophisticated servers can cache the converted documents. They can also recognize browsers that support XML and send them the raw XML instead.
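The serving logic just described might be sketched as follows. Everything named here is hypothetical: XmlPageServer, placeholder_convert, and naive_supports_xml are invented for illustration, the conversion is a stand-in for a real transformation, and the User-Agent test is a deliberately crude assumption rather than a reliable way to detect XML support.

```python
import html


def placeholder_convert(xml_text):
    # Stand-in for the real XML-to-HTML conversion; a production server
    # would apply a stylesheet here instead of escaping the markup.
    return "<html><body><pre>" + html.escape(xml_text) + "</pre></body></html>"


def naive_supports_xml(user_agent):
    # Illustrative assumption only: treat Gecko-based browsers as XML-capable.
    return "Gecko/" in user_agent


class XmlPageServer:
    """Send raw XML to browsers that can display it, cached HTML to the rest."""

    def __init__(self, convert, supports_xml):
        self._convert = convert
        self._supports_xml = supports_xml
        self._cache = {}       # maps XML source text to converted HTML
        self.conversions = 0   # counts how often the converter actually ran

    def respond(self, xml_text, user_agent):
        """Return a (content type, body) pair for one request."""
        if self._supports_xml(user_agent):
            return "application/xml", xml_text
        if xml_text not in self._cache:
            self._cache[xml_text] = self._convert(xml_text)
            self.conversions += 1
        return "text/html", self._cache[xml_text]


if __name__ == "__main__":
    server = XmlPageServer(placeholder_convert, naive_supports_xml)
    page = "<person><name>Alan Turing</name></person>"
    print(server.respond(page, "Mozilla/4.0 (compatible; MSIE 4.01; Windows 98)"))
```

Keying the cache on the document text keeps the sketch short and means an edited document is converted again automatically; a real server would more likely key on the URL and a modification time.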

The preferred way to perform the conversion is with an XSLT stylesheet and a Java servlet. Indeed, most XSLT engines, such as Xalan-J and SAXON, include servlets that do exactly this. However, other schemes are possible, for instance, using PHP or CGI instead of a servlet. The key is to make sure that browsers only receive what they know how to read and display. We'll talk more about XSLT in the next chapter.