A.1 A Quick Introduction to XML

These days, almost everyone has at least heard of XML, and nearly everyone has used it, though often unknowingly. If you’ve seen the source code of a web page, you already know roughly what XML looks like, because XHTML, the XML-based form of the Web’s markup language, is one of the subspecies of XML (more precisely, an XML vocabulary). SVG is another such vocabulary.

XML is a standardized way to record structured information in plain text. Its biggest advantage is that it is easy for computers to parse and yet quite understandable for humans. Unlike most computer-related concepts, which are usually at least as complex as you think they are, XML is remarkably simple. So simple, in fact, that I will now explain it in a few short paragraphs.

The basic building block of an XML document is called an element. Here is an example of an element containing some text:

<example>Here goes some text.</example>

The text inside the element is called its content, and that content is delimited by tags. Everything between the less-than sign (<) and the greater-than sign (>) is a tag. Here, the opening tag and the closing tag are almost identical, except that the closing one has a forward slash (/) before the element name, which in this case is example.
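To see how a program views the same element, here is a minimal sketch that assumes Python and its standard xml.etree.ElementTree module (neither is mentioned in the text; any XML parser would do). It reads the example above and reports the element name and the content:

```python
import xml.etree.ElementTree as ET

# Parse the one-element example from the text.
element = ET.fromstring("<example>Here goes some text.</example>")

# The element name is taken from the tags; the content is the text between them.
print(element.tag)
print(element.text)
```

The parser discards the angle brackets and the slash; what remains is the name and the content.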

An element may have no content at all:

<example></example>

or

<example/>

These two empty elements are absolutely equivalent; the second one, consisting of a single tag, is just a spelling variant of the first. Note the different position of the forward slash (/) in the single-tag empty element.
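The equivalence can be checked with a parser. The following sketch, again assuming Python's standard xml.etree.ElementTree module, parses both spellings and compares what the parser sees; describe is a hypothetical helper introduced only for this illustration:

```python
import xml.etree.ElementTree as ET

two_tags = ET.fromstring("<example></example>")
one_tag = ET.fromstring("<example/>")

def describe(el):
    # Name, text content (empty string if there is none), number of children.
    return (el.tag, el.text or "", len(el))

# Both spellings yield the same description.
print(describe(two_tags) == describe(one_tag))
```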

In addition to text, elements can also contain other elements, but those elements must lie entirely within the containing element. For example, this is wrong:

<a><b></a></b>

because the element b starts inside a but ends outside it. If an element starts inside another element, it must also end within that element. Here is an example of correct XML:

<a><b/><c/><d><e/></d></a>

One can say that a is the parent element of b, c, and d, while d is the parent of e. An element may have many children, but it has only one parent (except for the root element of the document, which has no parent at all). Figure A-1 is a graphic representation of this XML fragment.

Figure A-1. A tree representing the XML code
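The nesting rule and the parent/child relationships can be explored in code. This is a sketch that assumes Python's standard xml.etree.ElementTree module; the variable names are illustrative only. It shows that the overlapping fragment is rejected by the parser, and then builds a child-to-parent map for the correct fragment:

```python
import xml.etree.ElementTree as ET

# The overlapping example is not well-formed, so the parser refuses it.
try:
    ET.fromstring("<a><b></a></b>")
    overlapping_accepted = True
except ET.ParseError as err:
    overlapping_accepted = False
    print("rejected:", err)

# The correctly nested example parses into a tree.
root = ET.fromstring("<a><b/><c/><d><e/></d></a>")

# Map each child to its single parent; the root (a) never appears as a key.
parent_of = {child.tag: parent.tag for parent in root.iter() for child in parent}
print(parent_of)
```

Every element except a appears exactly once as a key, which is another way of saying that each element has only one parent.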

Note

This explains why groups of objects in Inkscape cannot contain objects from different layers (4.5 Groups). A group is a parent element for its children, but so is a layer. A group may be contained inside a single layer, but you can’t have it scattered across several layers, because that would result in that group having several parents.

Apart from child elements, XML elements may have attributes. Each attribute is a name with an associated value. Attributes are specified in the opening tag of an element, for example:

<text type="important" font-size="10">Here is some text.</text>

Note the use of the equal sign (=) and the quote marks around attribute values. Both are mandatory. The quotes may be double, as here, or single, as long as the opening and closing quotes match.

Note

An element cannot have two attributes with the same name. The order of the attributes in the opening tag is not significant; attributes in XML are unordered.
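These rules about attributes can be illustrated with a short sketch, assuming Python's standard xml.etree.ElementTree module, which exposes attributes as a dictionary of names and values. The reordered and duplicated fragments below are made up for this illustration:

```python
import xml.etree.ElementTree as ET

el = ET.fromstring('<text type="important" font-size="10">Here is some text.</text>')

# Attributes are name/value pairs; the values are always strings.
print(el.attrib)
print(el.get("font-size"))

# Writing the same attributes in a different order changes nothing.
reordered = ET.fromstring('<text font-size="10" type="important">Here is some text.</text>')
print(el.attrib == reordered.attrib)

# Repeating an attribute name makes the document ill-formed.
try:
    ET.fromstring('<text type="a" type="b"/>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False
print(duplicate_accepted)
```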

Finally, the entire XML document is simply a single element (called the root) with some text content and child elements, which in turn can have more content and more children, and so on. An XML document can thus be thought of as a tree growing from a single root. For example, here is a small SVG document whose root element, svg, contains two elements representing a rectangle and a text string:

<svg>
  <rect x="100" y="100" width="300" height="50" fill="blue"/>
  <text x="100" y="150">This is a text string.</text>
</svg>
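To make the tree structure of this document visible, the sketch below parses it and prints an indented outline. It assumes Python's standard xml.etree.ElementTree module; outline is a hypothetical helper written for this example, not part of any library:

```python
import xml.etree.ElementTree as ET

svg_source = """<svg>
  <rect x="100" y="100" width="300" height="50" fill="blue"/>
  <text x="100" y="150">This is a text string.</text>
</svg>"""

root = ET.fromstring(svg_source)

def outline(el, depth=0):
    # One line per element, indented by its depth in the tree.
    lines = ["  " * depth + el.tag]
    for child in el:
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(root)))
```

The outline starts with svg at the left margin, followed by rect and text indented one level, which mirrors the root-and-children picture described above.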