Keys to Writing Good Text Alternatives

Images are used for a variety of purposes, and therefore the requirements for writing the text alternatives depend on the purpose of the image. When penning a text alternative for an image, the first question you should ask yourself is, “What is the purpose of this image?” Is it a graph indicating sales growth of a product? Is it a button that links to the home page? Is it a webcam showing the current weather conditions on campus? Is it a test to determine if the entity interacting with your content is human or a spambot? Is it a piece of art? These are only a few situations in which you might be using a given image. We’ll provide a few examples to help you consider how to write good text alternatives.

Any change from nontext content to text involves some amount of signal loss. Your responsibility as an author is to compensate for that lost data as efficiently as possible. It may help to imagine describing the image you’re seeing to someone on the phone. (Try it with lolcats: “It’s this cat looking really surprised and unhappy, and the text says ‘DO NOT WANT!’”)

There is no right answer, strictly speaking, when it comes to alt text. Good alt text is situational and therefore subjective. For example, Figure 3-2 shows a picture.

How would you describe it? Perhaps something like:

alt="a mosque with five tall minarets, two of which are
under construction"

And nobody would fault you for that. You could probably get away with less, though “mosque” is a bit too terse.

Of course, we’re cheating. We know the name of this mosque and where it’s based. So we could refer to it canonically:

alt="the al-Saleh mosque in Sana'a, Yemen"

And if we were to use this in a blog post about the city, we might say:

alt="al-Saleh mosque viewed from al-Saba'in Park"

Any of these is an adequate alternative for most applications.

If you’re teaching a class on architecture, though, a few words certainly do not do the photo justice. You would want your students to understand many, many details irrelevant to the casual reader. For example, al-Saleh’s 6 minarets (including the one hidden in the photo) are each 328 feet tall. The central dome is 90 feet in diameter, and the main hall occupies over 146,000 square feet. The cream and beige look on the minarets is strongly evocative of the distinctive painted brick highlights found in the old city of Sana’a.

That’s great information, but don’t go shoving it all in the alt attribute. There’s another attribute called longdesc, which is meant to point to long descriptions of the image. Create a file (say, mosque-longdesc.html), and point to it in the longdesc attribute:

<img alt="the al-Saleh mosque in Sana'a, Yemen" 
longdesc="mosque-longdesc.html" />

Warning

One of the most common pitfalls of longdesc is that authors will type the text of the description into the attribute value. Don’t do this! The longdesc attribute always points to a URL.
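
As a rough illustration of the difference, compare the two images below. The file names mosque.jpg and mosque-longdesc.html are placeholders chosen for this example.

```html
<!-- Wrong: the description is typed into longdesc, which expects a URL -->
<img src="mosque.jpg" alt="the al-Saleh mosque in Sana'a, Yemen"
     longdesc="The mosque has six minarets, each 328 feet tall." />

<!-- Right: longdesc points to a page that holds the description -->
<img src="mosque.jpg" alt="the al-Saleh mosque in Sana'a, Yemen"
     longdesc="mosque-longdesc.html" />
```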

Sadly, even after 11 years of longdesc, it’s still not very well supported. To get around this, WCAG 1.0 suggested descriptive links (or “D-links,” so named because it was determined that they should be text links that read “[D]”). It’s an unattractive solution, and people who haven’t discovered D-links before are unlikely to click on them. But where long descriptions are useful, many people still need them.

One way to break the logjam is with script. Here’s one attempt to make longdesc useful to more people, using JavaScript to map the attribute value to the user interface of the image: http://www.malform.no/acidlongdesctest/.
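
To give a sense of what such a script might do, here is a minimal sketch of our own. It is not the code behind the page linked above. It assumes a browser that supports querySelectorAll and insertAdjacentElement and a page served as text/html, and the file names are placeholders. For every image with a longdesc attribute, it adds an ordinary text link to the description right after the image:

```html
<img src="mosque.jpg" alt="the al-Saleh mosque in Sana'a, Yemen"
     longdesc="mosque-longdesc.html" />

<script type="text/javascript">
// Find every image that carries a longdesc attribute.
var images = document.querySelectorAll('img[longdesc]');

for (var i = 0; i < images.length; i++) {
  var img = images[i];

  // Build a plain link that points at the same URL as longdesc.
  var link = document.createElement('a');
  link.href = img.getAttribute('longdesc');

  // Fall back to a generic word if the image has no alt text.
  var subject = img.getAttribute('alt') || 'image';
  link.appendChild(document.createTextNode('Long description of ' + subject));

  // Place the link immediately after the image it describes.
  img.insertAdjacentElement('afterend', link);
}
</script>
```

Because the result is a normal link, it is available to everyone, not only to people whose browser or screen reader exposes longdesc.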

The container for your application plug-in or your web document should contain some basic pieces of metadata to ensure the contents are rendered correctly and consistently.

The document type tells the browser how to render the page and how strictly to follow the rules for rendering. Depending on which doctype is specified, how it is written, and whether it is present at all, browsers choose one of three rendering modes: quirks, standards, and almost standards. Quirks mode is intended to display legacy pages created in 2001 or earlier, before all major browsers had implemented the final standards for HTML 4.01 and CSS Level 1.

On the mobile side of things, the bright line is between WAP 1.0 and WAP 2.0. WAP 1.0 used WML, a markup language that resembles HTML but includes several mobile-specific elements such as “card,” the main unit for individual web pages (as in a deck of cards: smaller, and more likely to fit on a small mobile screen). WAP 2.0 (XHTML-MP) is built on XHTML Basic (but uses the application/vnd.wap.xhtml+xml MIME type). The following are some examples of document type declarations:

// Mobile – XHTML Basic 1.1
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN"
 "http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd">

// XHTML Mobile Profile 1.1
<!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.1//EN"
  "http://www.openmobilealliance.org/tech/DTD/xhtml-mobile11.dtd">

// HTML 4.01 – rendered in Standards or Almost Standards mode
// Note: lack of system identifier renders in quirks mode in IE Mac
// Transitional
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
 "http://www.w3.org/TR/html4/loose.dtd">
// Strict
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" 
"http://www.w3.org/TR/html4/strict.dtd">

// XHTML 1.0 with system identifier and without an XML declaration -
// rendered in Standards or Almost Standards mode
// Transitional
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
// Strict
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Note

Including the system identifier (e.g., “http://www.w3.org/TR/html4/strict.dtd”) can change the mode the browser uses. For a complete list of system identifiers, refer to “Comparison of document types” (http://en.wikipedia.org/wiki/Quirks_mode#Comparison_of_document_types).
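
If you are not sure which mode a given doctype triggers, you can ask the browser. The following sketch relies on the document.compatMode property, which reports "BackCompat" for quirks mode and "CSS1Compat" for both standards and almost standards mode, so it cannot tell the latter two apart. The page title and text are placeholders.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
 "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Rendering mode check</title>
</head>
<body>
<p id="mode">Checking rendering mode...</p>
<script type="text/javascript">
// "BackCompat" means quirks mode; "CSS1Compat" means the doctype
// triggered standards or almost standards mode.
var quirks = (document.compatMode === 'BackCompat');
document.getElementById('mode').firstChild.nodeValue =
  quirks ? 'Quirks mode' : 'Standards or almost standards mode';
</script>
</body>
</html>
```

Swapping in a different doctype, or removing it altogether, and reloading the page is a quick way to see how each declaration affects the result.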

Declaring the language of a document ensures that Braille and synthesized speech will be generated correctly. Declaring the character encoding ensures that the correct characters are displayed in the browser or in captions in a media player. You need to specify both language and character encoding because they indicate different things. German could be the declared language of a website, but its pages can be served in a variety of character encodings, including UTF-8 or ISO-8859-1.

Specifying the language is straightforward—for HTML 4.01, use the lang attribute on the html element, with the ISO language code (“en” for English, “fr” for French, “es” for Spanish, “ja” for Japanese, “de” for German, and so on) as the value:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" 
"http://www.w3.org/TR/html4/strict.dtd">
<html lang="de">

For XHTML served as XML, use the xml:lang attribute:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
   "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="de">

For XHTML served as text/html, use both the lang and xml:lang attributes:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="de" xml:lang="de">

How you declare the character encoding depends on which MIME type your server uses and which document type you are using. The W3C Internationalization Activity’s tutorial “Character sets & encodings in XHTML, HTML and CSS” summarizes the requirements, as shown in Table 3-1.

Table 3-1. Matrix of alternatives for declaring a character encoding[15]

                    HTTP headers   <?xml...    <meta...
HTML                OK             Invalid     Preferred
XHTML (text/html)   OK             OK          Preferred
XHTML (XML)         OK             Preferred   Invalid

The preferred encoding is Unicode (more precisely the Universal Character Set that is defined both by ISO/IEC and Unicode standards, more simply referred to as Unicode). From the I18N tutorial:

With Unicode, there are three encoding formats to choose from: UTF-8, UTF-16, and UTF-32, which differ in the size of the code unit they use (1, 2, and 4 bytes, respectively). UTF-8 and UTF-16 are variable-length, so a single character may take more than one code unit. UTF-8 is the one most typically used. Therefore, for XHTML served as XML, here’s the preferred declaration:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
   "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

For XHTML served as text/html, use:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
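
Putting the doctype, language, and encoding declarations together, a complete page might look like the following sketch. The title and body text are placeholders, and the example assumes the file itself is saved as UTF-8 and served as text/html.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="de" xml:lang="de">
<head>
<!-- Declare the encoding before any text that uses non-ASCII characters -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Willkommen</title>
</head>
<body>
<p>Grüße aus Köln</p>
</body>
</html>
```

Placing the meta element first in the head means the browser learns the encoding before it reaches the title, which may itself contain accented characters.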

We’ve devoted three chapters to making web applications accessible. The primary issues with applications are:

The changes that need to be identified are primarily role and state. Knowing the role that something plays tells you about its capabilities. If it is a checkbox, you know you can check and uncheck it. If it is a button, you should be able to press it and cause something to happen. State, on the other hand, refers to how an object has changed. In the previous examples, the checkbox’s states are “checked” and “unchecked.” The button is “pressed” or “not pressed” or “active” or “disabled.” More on that later.
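
In markup, role and state can be exposed with WAI-ARIA attributes. The following is only a sketch under stated assumptions: the id, the label text, and the toggle function are invented for illustration, the key handling assumes a current browser, and a native checkbox input remains the simpler choice whenever the design allows it.

```html
<!-- role says what the control is; aria-checked says what state it is in -->
<span id="subscribe" role="checkbox" aria-checked="false" tabindex="0">
Send me the newsletter</span>

<script type="text/javascript">
var box = document.getElementById('subscribe');

// Flip the state so assistive technology is told about the change.
function toggle() {
  var checked = box.getAttribute('aria-checked') === 'true';
  box.setAttribute('aria-checked', checked ? 'false' : 'true');
}

box.onclick = toggle;
box.onkeydown = function (event) {
  // The space bar toggles a checkbox; stop the page from scrolling.
  if (event.key === ' ') {
    event.preventDefault();
    toggle();
  }
};
</script>
```

The role never changes for the life of the control, while the state attribute is rewritten every time the user interacts with it.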

Another aspect is identifying errors, which is covered in Chapter 5.

Information about relationships is implied in the document or application structure. Using semantic elements in a sensible order tells a story. For example, putting links in separate unordered lists creates distinct groups of links, and each grouping tells the reader that the links in it are related.

Why is this important? Consider the visual interface. Related objects are placed near each other. If you are unsure of one object, you often look around it to gain a better understanding of what it does. For example, a lone text field doesn’t mean much until paired with its label. A single data cell in a table doesn’t mean much until you compare it with other cells. The relationships between objects create the overall narrative.
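
In HTML, these relationships can be stated explicitly rather than left to visual proximity. A brief sketch follows. The field name, ids, and table data are made up for the example.

```html
<!-- The for attribute ties the label to the field with the matching id -->
<label for="email">Email address</label>
<input type="text" id="email" name="email" />

<!-- Header cells tell each data cell what it is a value of -->
<table>
<caption>Units sold by quarter</caption>
<tr><th scope="col">Quarter</th><th scope="col">Units sold</th></tr>
<tr><th scope="row">Q1</th><td>120</td></tr>
<tr><th scope="row">Q2</th><td>135</td></tr>
</table>
```

With this markup, a screen reader can announce “Email address” when the field receives focus and can read the row and column headers along with each data cell.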

We have devoted a whole chapter to how to choose the appropriate elements and attributes to create meaningful structure, but we want to foreshadow the importance of structure here by pointing out that the elements themselves have semantics and the meaning attached to those elements is metadata.

The link is what makes the Web—allowing us to hop, skip, and jump from information bit to information bit. The text of a link helps us predict where it goes and if we want to go there. User experience guru Jared Spool refers to link “scent” to describe users’ confidence that they are getting closer to the information they seek. The more “scent” a link gives off—leading us to what we are hunting for—the less “click disappointment” we are likely to face.

For people using mobile devices, waiting for unwanted content to download is not only frustrating but also costly. A page full of “click here” links is not going to help the way a search engine ranks your site. (Want proof? Search for “click here” and see what you get.) For people using a screen reader, the links on your page may be the only pieces of information they interact with: many people pull up a list of the links on a page to get a sense of what the page is about and where they can go. In fact, most of the people visiting your site will be affected by the order and placement of links—these create context and convey meaning associated with that link.

In some cases, the design of your site or application may prevent you from making link text descriptive. In that case, the traditional advice has been to add context using the title attribute. Unfortunately, this is of little use to people who need that information most: screen readers need to be configured to read title text, and frequently they aren’t.

A better solution is to make sense of the link with text in a span element. Then, using CSS, you can hide that extra information offscreen. This can be particularly useful when you have a number of “Click here” types of links, which can’t be differentiated by screen readers as they sort through links, but you don’t have free space to expand those links.

<a href="#">Read more<span class="context">
about providing context</span></a>

This CSS will hide the extra text far off the left side of the screen:

.context
{
   position: absolute;  /* left only applies to positioned elements */
   left: -999em;
   width: 1em;
   overflow: hidden;
}

This can be an ideal way to provide links to content in multiple formats:

<ul>
<li><a href="release.html"><span class="context">Press release in
 </span>HTML</a></li>
<li><a href="release.pdf"><span class="context">Press release in 
</span>PDF</a></li>
<li><a href="release.doc"><span class="context">Press release in 
</span>Word</a></li>
</ul>