I’ll start this chapter with suggestions for making web pages accessible to the most widely used webbots: the spiders that download, analyze, and rank web pages for search engines. Designing pages with these spiders in mind is often called search engine optimization (SEO).
I’ll conclude the chapter by explaining the occasional importance of special-purpose web pages, formatted to send data directly to webbots instead of browsers.
The most important thing to remember when designing a web page for SEO is that spiders rely on you, the developer, to provide context for the information they find. This matters because HTML mixes content with display formatting commands. Adding to the difficulty, a spider must examine the words in a page’s content to determine how relevant they are to the page’s main topic. You can improve a spider’s ability to index your web pages, and with it your search ranking, by using a few standard HTML tags predictably. SEO is a vast topic, and many books are dedicated entirely to it. This chapter only scratches the surface, but it should get you on your way.
Search engines generally associate the number of links to a web page with the web page’s popularity and importance. In fact, getting other websites to link to your web page is probably the best way to improve your web page’s search ranking. Regardless of where the links originate, it’s always important to use descriptive hyper-references when making links. Without descriptive links, search engine spiders will know the linked URL, but they won’t know the importance of the link. For example, the first link in Example 29-1 is much more useful to search spiders than the second link.
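To make the point about link descriptions concrete, here is a minimal Python sketch of what a very simple spider can learn from a link. It is not how any particular search engine works. The class name, the function name, and the sample markup (including the example.com URL) are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Collect (href, link text) pairs, roughly as a simple spider might."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> tag currently open, if any
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return every (href, link text) pair found in the HTML."""
    parser = LinkTextParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical markup: the same URL linked two different ways.
SAMPLE = (
    '<a href="http://www.example.com/">A guide to webbots and spiders</a> '
    'For webbot information, <a href="http://www.example.com/">click here</a>.'
)

links = extract_links(SAMPLE)
for href, text in links:
    print(href, "->", text)
```

Both links point to the same URL, but only the first tells the spider what the linked page is about. The second contributes nothing beyond the words "click here".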
Google bombing is an example of how search rankings were affected by the terms used to describe links. Google bombing (also known as spam indexing) was a technique where people conspired to create many links, with identical link descriptions, to a specific web page. As Google (or any other search engine) indexed these web pages, the link descriptions became associated with the targeted web page. As a result, when people entered the link descriptions as search terms, the targeted pages were highly ranked in the results. Google bombing was occasionally used for political purposes to place a targeted politician’s website as the highest ranked result for a derogatory search term. For example, depending on the search engine used, a search for the phrase “miserable failure” may have returned the official biography of George W. Bush as the top result. Similarly, a search for the word “waffles” may have produced the official web page of Senator John Kerry. Although Google now compensates for a few well-known instances of this gamesmanship, Google bombing is still possible, and it remains an unresolved challenge for all search engines.
The HTML title tag helps spiders identify the main topic of a web page. Each web page should have a unique title that describes the general purpose of the page, as shown in Example 29-2.
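One way to check the advice about unique titles is to compare the titles of all the pages on a site. The following Python sketch assumes you already have each page’s HTML in a dictionary keyed by URL. The function names and the sample pages are hypothetical, and pages with no title are simply skipped.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Capture the text of the first <title> tag in a page."""

    def __init__(self):
        super().__init__()
        self.title = None
        self._in_title = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self.title is None:
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            # Collapse runs of whitespace so titles compare cleanly.
            self.title = " ".join("".join(self._parts).split())
            self._in_title = False


def get_title(html):
    """Return the page title, or None if the page has no title tag."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title


def find_duplicate_titles(pages):
    """Map each title used by more than one page to the URLs that use it."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        title = get_title(html)
        if title:  # untitled pages are skipped in this sketch
            by_title[title].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


# Hypothetical pages for illustration only.
PAGES = {
    "/index.html": "<html><head><title>Wire Spools</title></head></html>",
    "/sizes.html": "<html><head><title>Wire Spools</title></head></html>",
    "/contact.html": "<html><head><title>Contact Us</title></head></html>",
}

duplicates = find_duplicate_titles(PAGES)
print(duplicates)
```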
You can think of meta tags as extensions of the title tag. Like title tags, meta tags explain the main topic of the web page. However, unlike title tags, they allow for detailed descriptions of the content on the web page and the search terms people may use to find the page. Example 29-3 shows meta tags that might accompany the title tag used in the previous example.
Example 29-3. Describing a web page in detail with meta tags
<!-- The meta:author defines the author of the web page -->
<meta name="Author" content="Michael Schrenk">
<!-- The meta:description is how search engines describe the page in search results -->
<meta name="Description" content="Official Website: Webbots, Spiders, and Screen Scrapers">
<!-- The meta:keywords are a list of search terms that may lead people to your web page -->
<meta name="Keywords" content="Webbot, Spider, Webbot Development, Spider Development">
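To see these tags from a spider’s point of view, the following Python sketch reads the name and content attributes of each meta tag into a dictionary. It is only an illustration: real search engines treat meta tags in their own, largely undocumented ways. The class and function names are assumptions, and the sample page reuses the tags from Example 29-3.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            # Meta names are compared case-insensitively here.
            self.meta[name.lower()] = content


def read_meta_tags(html):
    """Return a dictionary of meta tag names (lowercased) to their content."""
    parser = MetaTagParser()
    parser.feed(html)
    parser.close()
    return parser.meta


PAGE = """
<!-- The meta:author defines the author of the web page -->
<meta name="Author" content="Michael Schrenk">
<meta name="Description" content="Official Website: Webbots, Spiders, and Screen Scrapers">
<meta name="Keywords" content="Webbot, Spider, Webbot Development, Spider Development">
"""

meta = read_meta_tags(PAGE)
keywords = [k.strip() for k in meta.get("keywords", "").split(",") if k.strip()]

print(meta["description"])
print(keywords)
```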
There are many misconceptions about meta tags. Many people insist on using every conceivable keyword that may apply to a web page, on the theory that “the more, the better.” In reality, you should limit your selection to the six or eight keywords that best describe the content of your web page. It’s important to remember that the keywords represent potential search terms that people may use to find your web page. Moreover, for each additional keyword you use, your web page becomes less specific in the eyes of search engines. As you increase the number of keywords, you also increase the competition for use of those keywords. When this happens, other pages containing the same keywords dilute your position within search rankings. There are also rumors that some search engines ignore web pages that have excessive numbers of keywords as a measure to avoid keyword spamming, or the overuse of keywords. Whether these rumors are true or not, it still makes sense to use fewer, but better quality, keywords. For this reason, there is usually no need to include regular plurals[80] in keywords.
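The keyword guidelines above are simple enough to check automatically. The Python sketch below counts the keywords in a keywords meta tag and flags regular plurals whose singular form is already in the list. The function name is hypothetical, the limit of eight comes from the rule of thumb above rather than from any search engine’s published policy, and the plural test (a trailing "s") is deliberately naive.

```python
def audit_keywords(content, max_keywords=8):
    """Report the keyword count and any redundant regular plurals.

    content is the comma-separated value of a keywords meta tag.
    max_keywords is an assumed limit based on the six-to-eight guideline.
    """
    keywords = [k.strip() for k in content.split(",") if k.strip()]
    lowered = {k.lower() for k in keywords}

    # Naive plural check: "webbots" is redundant if "webbot" is also listed.
    redundant_plurals = [
        k for k in keywords
        if k.lower().endswith("s") and k.lower()[:-1] in lowered
    ]

    return {
        "count": len(keywords),
        "too_many": len(keywords) > max_keywords,
        "redundant_plurals": redundant_plurals,
    }


# Hypothetical keyword list for illustration.
report = audit_keywords("Webbot, Webbots, Spider, Spider Development")
print(report)
```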
The more unique your keywords are, the higher your web page will rank in search results when people use those keywords in web searches. One thing to watch out for is a keyword that is part of another, longer word. For example, I once worked for a company called Entolo. We had difficulty getting decent rankings on search engines because the word Entolo is a substring of the word Scientology (sciENTOLOgy). Since there were many more heavily linked web pages dedicated to Scientology, our website seldom ranked highly with any search service.
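A quick way to catch this kind of collision is to test a proposed keyword against a list of common or competing words before you commit to it. The following Python sketch assumes you supply that word list yourself. The function name and the sample vocabulary are hypothetical.

```python
def words_containing(keyword, vocabulary):
    """Return the longer words in vocabulary that contain keyword.

    The comparison ignores case, and an exact match is not reported.
    """
    needle = keyword.lower()
    return [
        word for word in vocabulary
        if needle in word.lower() and word.lower() != needle
    ]


# Hypothetical vocabulary; in practice this might be a dictionary file
# or a list of heavily searched terms.
VOCABULARY = ["Scientology", "entomology", "Entolo", "technology"]

collisions = words_containing("Entolo", VOCABULARY)
print(collisions)
```

An empty result does not guarantee a good ranking, but a nonempty one is an early warning that your keyword may be competing with a more popular word.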
In addition to making web pages easier to read, header tags help search engines identify and locate important content on web pages. For example, consider Example 29-4.
Example 29-4. Using header tags to identify key content on a web page
<h1 class="main_header">North American Wire Packaging</h1>
In North America, large amounts of wire are commonly shipped on spools...
In the past, web designers shied away from header tags because they offered only a limited selection of fonts. But now, with the wide acceptance of style sheets, there is no reason not to use HTML header tags to describe important sections of your web pages.
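Because header tags mark the important sections of a page, a program can use them to build a rough outline of the content. The Python sketch below shows one way a simple spider might do that. It is an assumption about how such a tool could work, not a description of any search engine. The class and function names are hypothetical, and the h2 tag in the sample markup is added for illustration; only the h1 line comes from Example 29-4.

```python
from html.parser import HTMLParser


class HeaderOutlineParser(HTMLParser):
    """Collect (level, text) pairs for the h1 through h6 tags on a page."""

    HEADER_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None   # level of the header currently open, if any
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADER_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADER_TAGS and self._level is not None:
            text = " ".join("".join(self._parts).split())
            self.outline.append((self._level, text))
            self._level = None


def extract_outline(html):
    """Return the page's headers as a list of (level, text) pairs."""
    parser = HeaderOutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


SAMPLE = """
<h1 class="main_header">North American Wire Packaging</h1>
In North America, large amounts of wire are commonly shipped on spools...
<h2>Spool Sizes</h2>
"""

outline = extract_outline(SAMPLE)
for level, text in outline:
    print("  " * (level - 1) + text)
```

Note that the class attribute has no effect on the outline. The style sheet controls how the header looks, while the tag itself tells the spider that the text is important.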
Long ago, before everyone had graphical browsers, web designers used the alt attribute of the HTML <img> tag to describe images to people with text-based browsers. Today, with the increasing popularity of image search tools, the alt attribute helps search engines interpret the content of images, as shown in Example 29-5.
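If you want to find images that give spiders nothing to work with, a short script can list every image on a page along with its alt text. The following Python sketch is one possible approach. The class name, function name, and sample markup are all hypothetical.

```python
from html.parser import HTMLParser


class ImageAltParser(HTMLParser):
    """Collect (src, alt) pairs for every <img> tag on a page."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            # A missing or valueless alt attribute becomes an empty string.
            alt = (attributes.get("alt") or "").strip()
            self.images.append((attributes.get("src"), alt))


def images_missing_alt(html):
    """Return the src of each image that has no usable alt text."""
    parser = ImageAltParser()
    parser.feed(html)
    parser.close()
    return [src for src, alt in parser.images if not alt]


# Hypothetical markup for illustration only.
SAMPLE = (
    '<img src="spool.jpg" alt="Wooden wire spool on a pallet">'
    '<img src="logo.gif">'
)

missing = images_missing_alt(SAMPLE)
print(missing)
```

Purely decorative images are sometimes given an empty alt attribute on purpose, so treat the result as a list of images to review rather than a list of errors.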