As you learned earlier in this chapter, it's best not to format XHTML too heavily. To get maximum control and make it easy to update your Web site's look later on, you should head straight to style sheets (as described in the next chapter). However, a few basic formatting elements are truly useful. You're certain to come across them, and you'll probably want to use them in your own pages. These elements are all inline elements, so you use them inside a block element, like a paragraph, a heading, or a list.
You've already seen the elements for bold (<b>) and italic (<i>) formatting in Chapter 2. They're staples in XHTML, letting you quickly format snippets of text. XHTML also has a <u> element for underlining text, but you can use it only in XHTML 1.0 transitional (see The Document Type Definition). Here's an example that uses all three elements: <i> for italics, <b> for bold, and <u> for underline:
<p> <b>Stop!</b> The mattress label says <u>do not remove under penalty of law</u> and you <i>don't</i> want to mess with mattress companies. </p>
A browser displays it like this:
Stop! The mattress label says do not remove under penalty of law and you don't want to mess with mattress companies.
If you keep your pages clean with XHTML 1.0 strict, you can't use the <u> element. However, you can get exactly the same effect using text decorations in a style sheet. Specifying a Font shows you how.
The <em> element (for emphasized text) is the logical-element equivalent of the physical element <i>. These two elements have the same effect—they both italicize text. Philosophically, the <em> element is a better choice, because it's more generic. When you use <em>, you're simply indicating that you want to emphasize a piece of text, but you aren't saying how to emphasize it. Later on, you can use a style sheet to define just how browsers should emphasize it. Possibilities include making it a different color, a different font, or a different size. If you don't use a style sheet, the text inside the <em> element is set in italics, just as with the <i> element.
Technically, you can use style sheets to redefine the <i> element in the same way. However, it seems confusing to have the <i> element do anything except apply italics. After all, that's its name.
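As a rough sketch of that flexibility, the following page uses an embedded style sheet to show emphasized text in bold red type instead of italics. The color and weight are arbitrary choices made for illustration, and the next chapter covers style sheets properly.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Restyled emphasis</title>
  <style type="text/css">
    /* Illustrative rule: show emphasized text in bold red instead of italics */
    em {
      font-style: normal;
      font-weight: bold;
      color: red;
    }
  </style>
</head>
<body>
  <p>You <em>don't</em> want to mess with mattress companies.</p>
</body>
</html>
```

The markup in the body never changes. Only the style rule decides what "emphasized" looks like.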
The <strong> element is the logical-element equivalent of the physical element <b>. If you aren't using style sheets, it simply applies bold formatting to a piece of text. Overall, Web developers use the <i> and <b> elements more often than <em> and <strong>, but XHTML experts prefer the latter pair because they're more flexible.
Here's the previous example rewritten to use the <em> and <strong> elements:
<p> <strong>Stop!</strong> The mattress label says <u>do not remove under penalty of law</u> and you <em>don't</em> want to mess with mattress companies. </p>
There's no logical-element equivalent for the <u> underline element, although you can always use one of the generic elements discussed earlier, like <span>, in conjunction with the text-decoration style property (see Specifying a Font).
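A minimal sketch of the <span> approach might look like this. The inline style attribute is just one way to attach the rule; a separate style sheet works as well.

```html
<!-- Underline without the <u> element, so the markup stays valid in XHTML 1.0 strict -->
<p>The mattress label says
<span style="text-decoration: underline">do not remove under penalty of law</span>.</p>
```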
You can use the <sub> element for subscript—text that's smaller and placed at the bottom of the current line. The <sup> element is for superscript—smaller text at the top of the current line. Finally, wrapping text in a <strike> element tells a browser to cross it out, but you can use it only in XHTML 1.0 transitional. Figure 5-13 shows an example of all three.
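In markup, the three elements might be used as follows. The sample sentences are invented for illustration.

```html
<!-- Subscript: smaller text at the bottom of the line -->
<p>Water is H<sub>2</sub>O.</p>

<!-- Superscript: smaller text at the top of the line -->
<p>Einstein's equation is E = mc<sup>2</sup>.</p>

<!-- Crossed-out text; <strike> is valid only in XHTML 1.0 transitional -->
<p>The sale price is <strike>$50</strike> $35.</p>
```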
Web designers who want to stay on the right side of XHTML law can still create crossed-out text. One alternative is to use the rare <del> element (which is meant to represent deleted text in a revised document). However, you can't trust that all browsers will format <del> the same way, and you really shouldn't use it for anything other than highlighting changes. A better approach is to use a style rule that applies the right text decoration, as explained in Specifying a Font.
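Both alternatives can be sketched in a couple of lines. The sample text is made up, and the inline style attribute stands in for a rule you might prefer to keep in a style sheet.

```html
<!-- Style-based strikethrough, valid in XHTML 1.0 strict -->
<p>The sale price is <span style="text-decoration: line-through">$50</span> $35.</p>

<!-- <del> marks text removed in a revision; browsers may style it differently -->
<p>The meeting is on <del>Tuesday</del> Wednesday.</p>
```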
Text within a <tt> element appears in a fixed-width (monospaced) font, such as Courier. Programmers sometimes use it for snippets of code in a paragraph.
<p>To solve your problem, use the <tt>Fizzle( )</tt> function.</p>
Which shows up like this:
To solve your problem, use the Fizzle( ) function.
Teletype text (or typewriter text) looks exactly like the text in a <pre> block (see Preformatted Text), but you should place <tt> text inside another block element. Unlike preformatted text, browsers ignore spaces and line breaks in <tt> text, as they do in every other XHTML element.
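To see the difference in how the two elements treat white space, compare these fragments. The extra spaces and line break are there purely to illustrate the point.

```html
<!-- Inside <tt>, the extra spaces and the line break collapse to single spaces -->
<p>Type <tt>Fizzle(   )
   now</tt> to begin.</p>

<!-- Inside <pre>, the same spacing and line break are preserved -->
<pre>Fizzle(   )
   now</pre>
```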
Not all characters are available directly on your keyboard. For example, what if you want to add a copyright symbol (©), a paragraph mark (¶), or an accented e (é)? Good news: XHTML supports them all, along with about 250 relatives, including mathematical symbols and Icelandic letters. To add them, however, you need to use some sleight of hand. The trick is to use XHTML character entities—special codes that browsers recognize as requests for unusual characters. Table 5-2 has some common options, with a sprinkling of accent characters.
Table 5-2. Common special characters
Character | Name of Character | What to Type |
---|---|---|
© | Copyright | &copy; |
® | Registered trademark | &reg; |
¢ | Cent sign | &cent; |
£ | Pound sterling | &pound; |
¥ | Yen sign | &yen; |
€ | Euro sign | &euro; (but &#8364; is better supported) |
° | Degree sign | &deg; |
± | Plus or minus | &plusmn; |
÷ | Division sign | &divide; |
× | Multiply sign | &times; |
μ | Micro sign | &micro; |
¼ | Fraction one-fourth | &frac14; |
½ | Fraction one-half | &frac12; |
¾ | Fraction three-fourths | &frac34; |
¶ | Paragraph sign | &para; |
§ | Section sign | &sect; |
« | Left angle quote, guillemet left | &laquo; |
» | Right angle quote, guillemet right | &raquo; |
¡ | Inverted exclamation | &iexcl; |
¿ | Inverted question mark | &iquest; |
æ | Small ae diphthong (ligature) | &aelig; |
ç | Small c, cedilla | &ccedil; |
è | Small e, grave accent | &egrave; |
é | Small e, acute accent | &eacute; |
ê | Small e, circumflex accent | &ecirc; |
ë | Small e, dieresis or umlaut mark | &euml; |
ö | Small o, dieresis or umlaut mark | &ouml; |
É | Capital E, acute accent | &Eacute; |
The euro symbol is a relative newcomer to XHTML. Although you can use the character entity &euro;, you'll have the best support using the numeric code &#8364; because it works with older browsers.
XHTML character entities aren't just for non-English letters and exotic symbols. You also need them to deal with characters that have a special meaning according to the XHTML standard—namely angle brackets (< >) and the ampersand (&). You shouldn't enter these characters directly into a Web page because the browser will assume you're trying to give it a super-special instruction. Instead, you need to replace these characters with their equivalent character entity, as shown in Table 5-3.
Table 5-3. XHTML character entities
Character | Name of Character | What To Type |
---|---|---|
< | Left angle bracket | &lt; |
> | Right angle bracket | &gt; |
& | Ampersand | &amp; |
" | Double quotation mark | &quot; |
Strictly speaking, you don't need all these entities all of the time. For example, it's safe to insert ordinary quotation marks by typing them in from your keyboard; just don't put them inside attribute values. Similarly, browsers are usually intelligent enough to handle the ampersand (&) character appropriately, but it's better style to use the &amp; code, so that there's no chance a browser will confuse the ampersand with the start of another character entity. Finally, the character entities for the angle brackets are absolutely, utterly necessary.
Here's some flawed text that won't display correctly:
I love the greater than (>) and less than (<) symbols. Problem is, when I type them my browser thinks I'm trying to use a tag.
And here's the corrected version, with XHTML character entities. When a browser processes and displays this text, it replaces the entities with the characters you really want.
I love the greater than (&gt;) and less than (&lt;) symbols. Problem is, when I type them my browser thinks I'm trying to use a tag.
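The same substitution applies to ampersands and to quotation marks inside attribute values. Here's an illustrative sketch; the company name, file names, and link address are all hypothetical.

```html
<!-- &amp; for a literal ampersand, &#8364; for the euro sign, &lt; for a less-than sign -->
<p>Smith &amp; Sons sells widgets for &#8364;25, which is &lt; the usual price.</p>

<!-- Ampersands inside a link address need the entity too -->
<p><a href="search.html?item=widget&amp;color=blue">Search for blue widgets</a></p>

<!-- &quot; keeps a quotation mark from ending the attribute value early -->
<p><img src="sign.gif" alt="A sign that says &quot;Stop&quot;" /></p>
```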
Most Web design tools insert the correct character entities as you type, as long as you're in Design view and not Code view.
To get a more comprehensive list of special characters and see how they look in your browser, check out www.webmonkey.com/reference/Special_Characters.
Although character entities work perfectly well, they can be a bit clumsy if you need to rely on them all the time. For example, consider the famous French phrase "We were given speech to hide our thoughts," shown here:
La parole nous a été donnée pour déguiser notre pensée.
Here's what it looks like with character entities replacing all the accented characters:
La parole nous a &eacute;t&eacute; donn&eacute;e pour d&eacute;guiser notre pens&eacute;e.
French speakers would be unlikely to put up with this for long. Fortunately, there's a solution called Unicode encoding. Essentially, Unicode is a system that converts characters into the bytes that computers understand and can properly render. By using Unicode encoding, you can create accented characters just as easily as if they were keys on your keyboard.
So how does it work? First, you need a way to get the accented characters into your Web page. Here are some options:
Type it in. Many non-English speakers will have the benefit of keyboards that include accented characters.
Use a utility. In Windows, you can run a little utility called charmap (short for Character Map) that lets you pick from a range of special characters and copy your selected character to the clipboard so it's ready for pasting into another program. To run charmap, click Start → Run, type in charmap, and then hit Enter (in Windows Vista, click Start, and then type charmap into the search box).
Use your Web page editor. Some Web page editors include their own built-in symbol pickers. In Expression Web, choose Insert → Symbol (see Figure 5-14). In Dreamweaver, you can use Insert → HTML → Special Characters → Other, but this process only inserts character entities, not Unicode characters. Though the end result is the same, your XHTML markup will still include a clutter of codes.
Figure 5-14. Choose Insert → Symbol to see Expression Web's comprehensive list of special characters. When you pick one, Expression Web inserts the actual character, Unicode-style, not the cryptic character entity.
When using Unicode encoding, you need to make sure you save your Web page correctly. This won't be a problem if you use a professional Web page editor, which is smart enough to get it right the first time. But Unicode can trip up text editors. For example, in Windows Notepad, you need to choose File → Save As, and then pick UTF-8 from the Encoding list (see Figure 5-15). For the Mac's TextEdit, select Format → Make Plain Text, go to Preferences → Open and Save → Plain Text File Encoding → Saving Files, and then select Unicode (UTF-8) from the drop-down list. Every time you re-save your file thereafter, Notepad and TextEdit will encode it correctly.
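Saving the file as UTF-8 is only half of the picture, because browsers also need to know which encoding to expect. One common way to tell them is a meta element in the head section. This goes beyond the saving step described above, so treat it as a sketch; the page title is made up for the example.

```html
<head>
  <!-- Declares that the page's bytes are UTF-8 encoded -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>French Proverbs</title>
</head>
<body>
  <!-- Accented characters typed directly, with no character entities -->
  <p>La parole nous a été donnée pour déguiser notre pensée.</p>
</body>
```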