Keys to Writing Good Text Alternatives

Images are used for a variety of purposes, and therefore the requirements for writing the text alternatives depend on the purpose of the image. When penning a text alternative for an image, the first question you should ask yourself is, “What is the purpose of this image?” Is it a graph indicating sales growth of a product? Is it a button that links to the home page? Is it a webcam showing the current weather conditions on campus? Is it a test to determine if the entity interacting with your content is human or a spambot? Is it a piece of art? These are only a few situations in which you might be using a given image. We’ll provide a few examples to help you consider how to write good text alternatives.

Pictures of Recognizable Objects

Any change from nontext content to text involves some amount of signal loss. Your responsibility as an author is to compensate for that lost data as efficiently as possible. It may help to imagine describing the image you’re seeing to someone on the phone. (Try it with lolcats: “It’s this cat looking really surprised and unhappy, and the text says ‘DO NOT WANT!’”)

There is no right answer, strictly speaking, when it comes to alt text. Good alt text is situational and therefore subjective. For example, Figure 3-2 shows a picture.

Figure 3-2. Photo by Rosalie Town in need of a description

How would you describe it? Perhaps something like:

alt="a mosque with five tall minarets, two of which are
under construction"

And nobody would fault you for that. You could probably get away with less, though “mosque” is a bit too terse.

Of course, we’re cheating. We know the name of this mosque and where it’s based. So we could refer to it canonically:

alt="the al-Saleh mosque in Sana'a, Yemen"

And if we were to use this in a blog post about the city, we might say:

alt="al-Saleh mosque viewed from al-Saba'in Park"

Any of these is an adequate alternative for most applications.

If you’re teaching a class on architecture, though, a few words certainly do not do the photo justice. You would want your students to understand many, many details irrelevant to the casual reader. For example, al-Saleh’s 6 minarets (including the one hidden in the photo) are each 328 feet tall. The central dome is 90 feet in diameter, and the main hall occupies over 146,000 square feet. The cream and beige look on the minarets is strongly evocative of the distinctive painted brick highlights found in the old city of Sana’a.

That’s great information, but don’t go shoving it all in the alt attribute. There’s another attribute called longdesc, which is meant to point to long descriptions of the image. Create a file (say, mosque-longdesc.html), and point to it in the longdesc attribute:

<img alt="the al-Saleh mosque in Sana'a, Yemen" 
longdesc="mosque-longdesc.html" />

Warning

One of the most common pitfalls of longdesc is that authors will type the text of the description into the attribute value. Don’t do this! The longdesc attribute always points to a URL.
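To make the pitfall concrete, here is a minimal sketch contrasting the two forms. The src value (mosque.jpg) is a hypothetical filename used only for illustration; the longdesc target is the file named above.

```html
<!-- Incorrect: the description itself is typed into longdesc -->
<img src="mosque.jpg" alt="the al-Saleh mosque in Sana'a, Yemen"
     longdesc="The mosque has six minarets, each 328 feet tall." />

<!-- Correct: longdesc holds the URL of a page containing the description -->
<!-- (mosque.jpg is a hypothetical image filename) -->
<img src="mosque.jpg" alt="the al-Saleh mosque in Sana'a, Yemen"
     longdesc="mosque-longdesc.html" />
```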

Sadly, even after 11 years of longdesc, it’s still not very well supported. To get around this, WCAG 1.0 suggested descriptive links (or “D-links,” so named because it was determined that they should be text links that read “[D]”). It’s an unattractive solution, and people who haven’t discovered D-links before are unlikely to click on them. But where long descriptions are useful, many people still need them.
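A D-link might look something like the following sketch. It assumes the same long-description file as before and a hypothetical image filename (mosque.jpg); the link is simply placed next to the image it describes.

```html
<!-- mosque.jpg is a hypothetical filename used for illustration -->
<img src="mosque.jpg" alt="the al-Saleh mosque in Sana'a, Yemen"
     longdesc="mosque-longdesc.html" />
<!-- D-link: an ordinary text link to the same long description -->
<a href="mosque-longdesc.html">[D]</a>
```

Because the D-link is a regular link, it works in any browser, which is the one advantage it has over longdesc alone.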

One way to break the logjam is with script. Here’s one attempt to make longdesc useful to more people, using JavaScript to map the attribute value to the user interface of the image: http://www.malform.no/acidlongdesctest/.

Links

Alt text for images in links is somewhat different from that of regular images. The alt text we just described for the average image is usually a noun or some other representation of the image’s content. For a link, though, the alt text should describe the action or destination: where the link will take you. The image may be a rounded green arrow pointing right, but if it links to the next page of an article, that’s what the alt text should say:

<a href="page2.html"><img src="rightarrow.gif" alt="next page" /></a>

Graphs

A graph—be it a pie chart or a 10,000-point histogram—is a visual representation of a number of data points, usually organized to show a trend. Good alt text will describe the purpose of the graph:

alt="Acme stock chart"

Better alt text will briefly describe the trend:

alt="sunspot activity over the last two centuries follows a 
consistent 11-year trend"

But the best you can do with multivariate data is to link to a table of the data from which the graph was constructed:

<a href="stocktable.html"><img src="stockchart.gif"
alt="view stock data" /></a>

Tabular data is covered in Chapter 6.

Logos

Logos are functional equivalents to company or product names. Again, don’t stress over it. If it’s a link, say where the link goes. If it’s not, say what it is: a logo. And if you’re talking about corporate branding (say, describing Coca-Cola’s cursive logotype to a marketing class), make room for the long descriptions.
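As a rough sketch of the two simple cases, assuming a hypothetical company called Acme and a hypothetical image file named acme-logo.png:

```html
<!-- Logo used as a link: say where the link goes -->
<a href="/"><img src="acme-logo.png" alt="Acme home page" /></a>

<!-- Logo that is not a link: say what it is -->
<img src="acme-logo.png" alt="Acme logo" />
```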

Webcams

If you’re broadcasting an image from a webcam, which is updated at a regular interval, all you need to say is what you know will be true of that image. For example, there’s a camera focused on the University of Washington campus quad known as Red Square, which could have alt text such as the following:

alt="Live picture of Red Square"

It’s not necessary to go overboard describing the neo-Gothic façade of Suzzallo Library, the red brick plaza, the Gerberding Hall bell tower, the trees, or the fountain in the background. That’s the kind of thing you’d use longdesc for. In this case, most people are probably just looking to see if it’s raining. If you’re intending for people to know what the weather is, or how many Jolt colas are left in the machine, you should provide some other way to access that information in parallel. But you don’t know what is going to be happening at any given moment on the average live camera, so don’t overthink it.
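One way to provide that parallel access is to publish the information as text alongside the image. The sketch below assumes a hypothetical image file (redsquare-cam.jpg) and a placeholder sentence that would, in practice, be generated from a weather feed rather than from the picture itself.

```html
<!-- redsquare-cam.jpg is a hypothetical filename -->
<img src="redsquare-cam.jpg" alt="Live picture of Red Square" />
<!-- Placeholder text; a real page would fill this in from a data source -->
<p>Current conditions in Red Square: light rain.</p>
```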

CAPTCHA

CAPTCHA images usually don’t have alt text, as their entire existence is based around them not being programmatically readable. So the alt text that should go with a CAPTCHA is pretty simple:

alt="captcha"

However, you’re not done. If you require your users to complete a CAPTCHA, you will have to provide users with some other way to accomplish the task that doesn’t require the ability to cherry-pick text out of a distorted image.

Better yet, don’t use them at all. We’ve already covered the fact that they’re easily defeated, and their by-design inaccessibility only makes it worse. We discuss these and many other issues regarding CAPTCHA in Chapter 5.

Image dimensions

If the image’s height and width are not specified in markup, the browser must download the image before it can finalize the layout. When you provide explicit dimensions, the browser can begin showing content immediately, reserving the allotted space until the image has downloaded. For mobile devices, this avoids the need to rerender the page after all images have downloaded. Technically, this violates our principle of avoiding the use of presentational HTML, but there’s a lot of upside for mobile applications and rendering time, and it doesn’t hurt much to do. We’ll say it again: for many mobile users, time is money.

<img width="200" height="100" src="sample.png" alt="..." />

Document-Level Metadata

The container for your application plug-in or your web document should contain some basic pieces of metadata to ensure the contents are rendered correctly and consistently.

Document type

The document type tells the browser how to render the page and how strictly to follow the rules for rendering. Depending on which doctype is specified, how it is written, or whether it is missing altogether, browsers have three possible rendering modes: quirks, standards, and almost standards. Quirks mode is intended to display legacy pages created in 2001 or earlier, before all major browsers had implemented the final standards for HTML 4.01 and CSS Level 1.

On the mobile side of things, the bright line is between WAP 1.0 and WAP 2.0. WAP 1.0 used WML, an XML-based markup language that resembled HTML but included several mobile-specific elements such as “card,” the main unit for individual web pages (as in a card deck: smaller, and more likely to fit on the smaller, mobile screen). WAP 2.0 (XHTML-MP) is built on XHTML Basic (but uses the application/vnd.wap.xhtml+xml MIME type). The following are some examples of document type declarations:

// Mobile – XHTML Basic 1.1
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN"
 "http://www.w3.org/TR/xhtml-basic/xhtml-basic11.dtd">

// XHTML Mobile Profile 1.1
<!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.1//EN"
  "http://www.openmobilealliance.org/tech/DTD/xhtml-mobile11.dtd">

// HTML 4.01 – rendered in Standards or Almost Standards mode
// Note: lack of system identifier renders in quirks mode in IE Mac
// Transitional
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
 "http://www.w3.org/TR/html4/loose.dtd">
// Strict
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" 
"http://www.w3.org/TR/html4/strict.dtd">

// XHTML 1.0 with system identifier and without an XML declaration -
// rendered in Standards or Almost Standards mode
// Transitional
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
// Strict
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Note

Note that including the system identifier (e.g., “http://www.w3.org/TR/html4/strict.dtd”) can change the mode the browser uses. For a complete list of system identifiers, refer to “Comparison of document types” (http://en.wikipedia.org/wiki/Quirks_mode#Comparison_of_document_types).

Language and character encoding

Declaring the language of a document ensures that Braille and synthesized speech will be generated correctly. Declaring the character encoding ensures that the correct characters are displayed in the browser or in captions in a media player. You need to specify both language and character encoding because they indicate different things. German could be the declared language of a website, but its pages can be served in a variety of character encodings, including UTF-8 or ISO-8859-1.

Specifying the language is straightforward—for HTML 4.01, use the lang attribute on the html element, with the ISO language code (“en” for English, “fr” for French, “es” for Spanish, “ja” for Japanese, “de” for German, and so on) as the value:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" 
"http://www.w3.org/TR/html4/strict.dtd">
<html lang="de">

For XHTML served as XML, use the xml:lang attribute:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
   "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="de">

For XHTML served as text/html, use both the lang and xml:lang attributes:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="de" xml:lang="de">

Character encoding also depends on which MIME type your server uses and which document type you are using. The W3C Internationalization Activity’s Tutorial, “Character sets & encodings in XHTML, HTML and CSS” summarizes the requirements as shown in Table 3-1.

Table 3-1. Matrix of alternatives for declaring a character encoding^[15]

	HTTP headers	<?xml...	<meta...
HTML	OK	Invalid	Preferred
XHTML (text/html)	OK	OK	Preferred
XHTML (XML)	OK	Preferred	Invalid
^[15]This table was taken from http://www.w3.org/International/tutorials/tutorial-char-enc/#Slide0250.

The preferred encoding is Unicode (more precisely the Universal Character Set that is defined both by ISO/IEC and Unicode standards, more simply referred to as Unicode). From the I18N tutorial:

A Unicode encoding can support many languages and can accommodate pages and forms in any mixture of those languages. Its use also eliminates the need for server-side logic to individually determine the character encoding for each page served or each incoming form submission. This significantly reduces the complexity of dealing with a multilingual site or application.

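With Unicode, there are three encoding forms to choose from: UTF-8, UTF-16, and UTF-32, named for the size of their code units (8, 16, and 32 bits, or 1, 2, and 4 bytes, respectively). UTF-8 and UTF-16 are variable-width: a character takes one to four bytes in UTF-8 and two or four bytes in UTF-16, whereas UTF-32 always uses four. UTF-8 is most typically used. Therefore, for XHTML served as XML, here’s the preferred declaration: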

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
   "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

For XHTML served as text/html, use:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
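Table 3-1 also lists the meta element as the preferred declaration for plain HTML. A minimal sketch for HTML 4.01 Strict follows; the title and paragraph text are placeholders chosen only to include non-ASCII characters, and the file itself is assumed to be saved as UTF-8.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html lang="de">
<head>
<!-- HTML 4.01 syntax: no trailing slash on empty elements -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Grüße aus Köln</title>
</head>
<body>
<p>Schöne Grüße aus Köln.</p>
</body>
</html>
```

The declaration only helps if it matches the encoding the file was actually saved in, and an HTTP header that names a different charset will take precedence over it.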

Titles

Use the HTML title element to provide a unique title for each page/application. Titles provide landmarks for people and search engines. People use titles when switching between windows on the desktop (at this moment, I have eight tabs open in my browser). Search engine bots use titles when scanning sites. Mobile device users will use titles when downloading a page to help determine if it is in fact the page they want, and they may stop a download if a title does not match their expectations. Mobile devices may truncate the title, so front-loading the title will ensure that the most meaningful bits are more likely to show up. As the dotMobi guidelines point out, authors commonly use the site name for all pages on the site—not very useful. Put that bit at the end like so:

<title>Unique page title | Site Name</title>

Role and State

We’ve devoted three chapters to making web applications accessible. The primary issues with applications are:

Making them keyboard-accessible
Ensuring that changes caused by user interaction can be detected by a person or their software agent (whether it’s a browser alone or in combination with an assistive technology)

The changes that need to be identified are primarily role and state. Knowing the role that something plays tells you about its capabilities. If it is a checkbox, you know you can check and uncheck it. If it is a button, you should be able to press it and cause something to happen. State, on the other hand, refers to how an object has changed. In the previous examples, the checkbox’s states are “checked” and “unchecked.” The button is “pressed” or “not pressed” or “active” or “disabled.” More on that later.
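As a preview, role and state can be exposed in markup roughly as follows. This is only a sketch: it assumes WAI-ARIA attributes (role and aria-checked) on a custom control, and the visible labels are hypothetical.

```html
<!-- Custom control: role says what it is, aria-checked reports its state -->
<span role="checkbox" aria-checked="false" tabindex="0">Subscribe to newsletter</span>

<!-- Native control: the button role is built in; disabled is its state -->
<button type="button" disabled="disabled">Save</button>
```

The custom checkbox still needs script to toggle aria-checked and to respond to the keyboard; the attributes only report the role and state, they do not create the behavior.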

Another aspect is identifying errors, which is covered in Chapter 5.

Relationships

Information about relationships is implied in the document or application structure. Using semantic elements in a sensible order tells a story. For example, placing links in separate unordered lists creates distinct groups of links, and membership in a group tells the reader that those links are related. These relationships are themselves data.

Why is this important? Consider the visual interface. Related objects are placed near each other. If you are unsure of one object, you often look around it to gain a better understanding of what it does. For example, a lone text field doesn’t mean much until paired with its label. A single data cell in a table doesn’t mean much until you compare it with other cells. The relationships between objects create the overall narrative.
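In markup, these pairings can be made explicit rather than left to visual proximity. The following sketch uses hypothetical field names and placeholder table data: the for attribute ties the label to its field, and the th elements with scope tie each data cell to its column heading.

```html
<!-- The label is bound to the text field through matching for/id values -->
<label for="email">Email address</label>
<input type="text" id="email" name="email" />

<!-- Header cells give each data cell its meaning (placeholder data) -->
<table>
<tr><th scope="col">Quarter</th><th scope="col">Units sold</th></tr>
<tr><td>Q1</td><td>1,200</td></tr>
</table>
```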

We have devoted a whole chapter to how to choose the appropriate elements and attributes to create meaningful structure, but we want to foreshadow the importance of structure here by pointing out that the elements themselves have semantics and the meaning attached to those elements is metadata.

Link Text

Link by link we build paths of understanding across the web of humanity.

—Tim Berners-Lee

The link is what makes the Web—allowing us to hop, skip, and jump from information bit to information bit. The text of a link helps us predict where it goes and if we want to go there. User experience guru Jared Spool refers to link “scent” to describe users’ confidence that they are getting closer to the information they seek. The more “scent” a link gives off—leading us to what we are hunting for—the less “click disappointment” we are likely to face.

For people using mobile devices, waiting for unwanted content to download is not only frustrating but also costly. A page full of “click here” links is not going to help the way a search engine ranks your site. (Want proof? Search for “click here” and see what you get.) For people using a screen reader, the links on your page may be the only pieces of information they interact with: many people pull up a list of the links on a page to get a sense of what the page is about and where they can go. In fact, most of the people visiting your site will be affected by the order and placement of links—these create context and convey meaning associated with that link.

In some cases, the design of your site or application may prevent you from making link text descriptive. In that case, the traditional advice has been to add context using the title attribute. Unfortunately, this is of little use to people who need that information most: screen readers need to be configured to read title text, and frequently they aren’t.

A better solution is to make sense of the link with text in a span element. Then, using CSS, you can hide that extra information offscreen. This can be particularly useful when you have a number of “Click here” types of links, which can’t be differentiated by screen readers as they sort through links, but you don’t have free space to expand those links.

<a href="#">Read more<span class="context">
about providing context</span></a>

This CSS will hide the extra text far off the left side of the screen:

.context
{
   /* left only takes effect on a positioned element */
   position: absolute;
   left: -999em;
   width: 1em;
   overflow: hidden;
}

This can be an ideal way to provide links to content in multiple formats:

<ul>
<li><a href="release.html"><span class="context">Press release in
 </span>HTML</a></li>
<li><a href="release.pdf"><span class="context">Press release in 
</span>PDF</a></li>
<li><a href="release.doc"><span class="context">Press release in 
</span>Word</a></li>
</ul>