Chapter 2

TCP/IP and the Internet

IN THIS CHAPTER

  • Introducing the Internet
  • Familiarizing yourself with TCP/IP standards
  • Figuring out how TCP/IP lines up with the OSI Reference Model
  • Discovering important TCP/IP applications

Many years ago, Transmission Control Protocol/Internet Protocol (TCP/IP) was known primarily as the protocol of the Internet. The biggest challenge of getting a local area network (LAN) connected to the Internet was figuring out how to mesh TCP/IP with the proprietary protocols that were the basis of the LANs — most notably Internetwork Packet Exchange/Sequenced Packet Exchange (IPX/SPX) and NetBIOS Extended User Interface (NetBEUI).

But then, some years ago, network administrators realized that they could save the trouble of combining TCP/IP with IPX/SPX and NetBEUI by eliminating IPX/SPX and NetBEUI from the equation altogether. As a result, TCP/IP is not just the protocol of the Internet now, but it’s also the protocol on which most LANs are based.

This chapter is a gentle introduction to the Internet in general and the TCP/IP suite of protocols in particular. After I get the introductions out of the way, you’ll be able to focus in more depth on the detailed TCP/IP information given in the remaining chapters of Book 3.

What Is the Internet?

The Goliath of all computer networks, the Internet links hundreds of millions of computer users throughout the world. Strictly speaking, the Internet is a network of networks. It consists of hundreds of thousands of separate computer networks, all interlinked, so that a user on any of those networks can reach out and potentially touch a user on any of the other networks. This network of networks connects more than a billion computers to each other. (That’s right, billion with a b.)

One of the official documents (RFC 2026) of the Internet Engineering Task Force (IETF) defines the Internet as “a loosely organized international collaboration of autonomous, interconnected networks.” Broken down piece by piece, this definition encompasses several key aspects of what the Internet is:

A Little Internet History

The Internet has a fascinating history, if such things interest you. There’s no particular reason why you should be interested in such things, of course, except that a superficial understanding of how the Internet got started may help you to understand and cope with the way this massive computer network exists today. So here goes.

The Internet traces its beginnings back to a small network called ARPANET, built by the Department of Defense in 1969 to link defense installations. ARPANET soon expanded to include not only defense installations but universities as well. In 1983, ARPANET was split into two networks: one for military use (named MILNET) and the original ARPANET (for nonmilitary use). The two networks were connected by means of a protocol called IP (the Internet Protocol), so called because it allowed communication between two networks.

The good folks who designed IP had the foresight to realize that soon, more than two networks would want to be connected. In fact, they left room for tens of thousands of networks to join the game, which is a good thing because it wasn’t long before the Internet began to take off.

By the mid-1980s, ARPANET was beginning to reach the limits of what it could do. Enter the National Science Foundation (NSF), which set up a nationwide network designed to provide access to huge supercomputers, those monolithic computers used to discover new prime numbers and calculate the orbits of distant galaxies. The supercomputers were never put to much use, but the network that was put together to support the supercomputers — NSFNET — was used. In fact, NSFNET replaced ARPANET as the new backbone for the Internet.

Then, out of the blue, it seemed as if the whole world became interested in the Internet. Stories about it appeared in Time and Newsweek. Any company that had “dot com” in its name practically doubled in value every month. Al Gore claimed he invented the Internet. The Net began to grow so fast that even NSFNET couldn’t keep up, so private commercial networks got into the game. The size of the Internet nearly doubled every year for most of the 1990s. Then, in the first few years of the millennium, the growth rate slowed a bit. However, the Internet still seems to be growing at the phenomenal rate of about 30 to 50 percent per year, and who knows how long this dizzying rate of growth will continue.

TCP/IP Standards and RFCs

The TCP/IP protocol standards that define how the Internet works are managed by the IETF. However, the IETF doesn’t impose standards. Instead, it simply oversees the process by which ideas are developed into agreed-upon standards.

An Internet standard is published in a Request for Comments (RFC) document. When a document is accepted for publication, it is assigned an RFC number by the IETF. The RFC is then published. After it’s published, an RFC is never changed. If a standard is enhanced, the enhancement is covered in a separate RFC.
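
The "never changed" rule can be pictured as a read-only record. The following is only a sketch: the Rfc class and its field names are made up for illustration, and the one entry shown is taken from the IGMP row of Table 2-2. An enhancement shows up as a new record that points back at the older RFC numbers rather than as an edit to an existing one.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Tuple


@dataclass(frozen=True)  # frozen=True makes every instance read-only
class Rfc:
    """Hypothetical record for one published RFC."""
    number: int
    title: str
    updates: Tuple[int, ...] = ()  # numbers of the earlier RFCs it enhances


# The IGMP entry from Table 2-2, which enhances two earlier RFCs.
igmp = Rfc(3376, "Internet Group Management Protocol (IGMP)", updates=(2236, 1112))


def try_to_edit(rfc: Rfc) -> bool:
    """Return True if the record could be changed in place (it should not be)."""
    try:
        rfc.title = "Edited title"  # type: ignore[misc]
    except FrozenInstanceError:
        return False
    return True


print(try_to_edit(igmp))  # the published record refuses the change
print(igmp.updates)       # the enhancement points back at the older RFCs
```

The frozen dataclass is just one convenient way to model a publish-once document. It isn't how the IETF actually stores RFCs.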

Thousands of RFCs are available from the IETF website (www.ietf.org). The oldest RFC is RFC 0001, published in April 1969. It describes how the host computers communicated with each other in the original ARPANET. The most recent proposed standard (as of January 2018) is RFC 8311, entitled “Relaxing Restrictions on Explicit Congestion Notification (ECN) Experimentation.”

Not all RFCs represent Internet standards. The following paragraphs summarize the various types of RFC documents:

TABLE 2-1 Maturity Levels for Internet Standards Track RFCs

| Maturity Level | Description |
| --- | --- |
| Proposed Standard | Generally stable, have resolved known design choices, are believed to be well understood, have received significant community review, and appear to enjoy enough community interest to be considered valuable. |
| Draft Standard | Well understood and known to be quite stable. At least two interoperable implementations must exist, developed independently from separate code bases. The specification is believed to be mature and useful. |
| Internet Standard | Have been fully accepted by the Internet community as highly mature and useful standards. |

Table 2-2 summarizes the RFCs that apply to the key Internet standards described in this book.

TABLE 2-2 RFCs for Key Internet Standards

| RFC | Date | Description |
| --- | --- | --- |
| 768 | August 1980 | User Datagram Protocol (UDP) |
| 791 | September 1981 | Internet Protocol (IP) |
| 792 | September 1981 | Internet Control Message Protocol (ICMP) |
| 793 | September 1981 | Transmission Control Protocol (TCP) |
| 826 | November 1982 | Ethernet Address Resolution Protocol (ARP) |
| 950 | August 1985 | Internet Standard Subnetting Procedure |
| 959 | October 1985 | File Transfer Protocol (FTP) |
| 1034 | November 1987 | Domain Names: Concepts and Facilities (DNS) |
| 1035 | November 1987 | Domain Names: Implementation and Specification (DNS) |
| 1939 | May 1996 | Post Office Protocol Version 3 (POP3) |
| 2131 | March 1997 | Dynamic Host Configuration Protocol (DHCP) |
| 3376 | October 2002 | Internet Group Management Protocol (IGMP) (Updates RFC 2236 and 1112) |
| 5321 | October 2008 | Simple Mail Transfer Protocol (SMTP) |
| 7230 through 7235 | June 2014 | Hypertext Transfer Protocol (HTTP/1.1) |

Tip: My favorite RFC is 1149, an experimental specification for the “Transmission of IP datagrams on avian carriers.” The specification calls for IP datagrams to be written in hexadecimal on scrolls of paper and secured to “avian carriers” with duct tape. (Not surprisingly, it’s dated April 1, 1990. Similar RFCs are frequently submitted on April 1.)

The TCP/IP Protocol Framework

Like the seven-layer OSI Reference Model, TCP/IP protocols are based on a layered framework. TCP/IP has four layers, as shown in Figure 2-1. These layers are described in the following sections.

FIGURE 2-1 The four layers of the TCP/IP framework.
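
If it helps to see how the four TCP/IP layers line up against the seven OSI layers, here's a small lookup table. Treat it as a sketch. The network interface and application rows follow this chapter's descriptions, while the one-to-one pairing of the network and transport layers is the conventional mapping and is an assumption here. The names TCPIP_TO_OSI and tcpip_layer_for are made up for the example.

```python
# TCP/IP layers, bottom to top, each paired with the OSI layers it covers.
TCPIP_TO_OSI = {
    "network interface": ["physical", "data link"],
    "network": ["network"],        # assumed: conventional one-to-one pairing
    "transport": ["transport"],    # assumed: conventional one-to-one pairing
    "application": ["session", "presentation", "application"],
}


def tcpip_layer_for(osi_layer: str) -> str:
    """Return the TCP/IP layer that covers the given OSI layer."""
    for tcpip_layer, osi_layers in TCPIP_TO_OSI.items():
        if osi_layer in osi_layers:
            return tcpip_layer
    raise ValueError(f"unknown OSI layer: {osi_layer!r}")


# Flatten the table to confirm that all seven OSI layers are accounted for.
covered = [osi for osi_layers in TCPIP_TO_OSI.values() for osi in osi_layers]
print(len(TCPIP_TO_OSI), "TCP/IP layers cover", len(covered), "OSI layers")
print(tcpip_layer_for("presentation"))
```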

Network interface layer

The lowest level of the TCP/IP architecture is the network interface layer. It corresponds to the OSI physical and data link layers. TCP/IP can run over many different protocols at the network interface layer, including Ethernet and Token Ring for LANs and protocols such as X.25, Frame Relay, and ATM for wide area networks (WANs).

TCP/IP assumes that the network interface layer is unreliable, so the layers above it are responsible for coping with lost or damaged data.

Network layer

The network layer is where data is addressed, packaged, and routed among networks. Several important Internet protocols operate at the network layer:

  • Internet Protocol (IP): A routable protocol that uses IP addresses to deliver packets to network devices. IP is an intentionally unreliable protocol, so it doesn’t guarantee delivery of information.
  • Address Resolution Protocol (ARP): Resolves IP addresses to hardware Media Access Control (MAC) addresses, which uniquely identify hardware devices.
  • Internet Control Message Protocol (ICMP): Sends and receives diagnostic messages. ICMP is the basis of the ubiquitous ping command.
  • Internet Group Management Protocol (IGMP): Manages membership in multicast groups, which allow a single message to be delivered to multiple IP addresses at once.
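
To make IP addresses and multicasting a little more concrete, the following sketch uses Python's standard ipaddress module. It shows that an IPv4 address is really a 32-bit number written in a friendlier form, and that multicast addresses can be told apart from ordinary ones. The address 192.168.1.10 is just an example value, 224.0.0.1 is the well-known all-hosts multicast group, and the helper name is_multicast_address is made up for illustration.

```python
import ipaddress


def is_multicast_address(text: str) -> bool:
    """Return True if the dotted address names a multicast group."""
    return ipaddress.ip_address(text).is_multicast


host = ipaddress.ip_address("192.168.1.10")  # example address, not a real host
print(host.version)      # 4, because this is an IPv4 address
print(int(host))         # the same address as one 32-bit number
print(len(host.packed))  # 4 bytes, which is how it travels in a packet

print(is_multicast_address("192.168.1.10"))  # an ordinary, single-host address
print(is_multicast_address("224.0.0.1"))     # the all-hosts multicast group
```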

Transport layer

The transport layer is where sessions are established and data packets are exchanged between hosts. Two core protocols are found at this layer:

  • Transmission Control Protocol (TCP): Provides reliable connection-oriented transmission between two hosts. TCP establishes a session between hosts, and then ensures delivery of packets between the hosts.
  • User Datagram Protocol (UDP): Provides connectionless, unreliable, one-to-one or one-to-many delivery.
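
The difference between the two transport protocols is easiest to see side by side. The sketch below uses Python's standard socket module and the loopback address, so nothing leaves your computer. The function names are made up for this example, and the code handles only short messages. Notice that the TCP version has to listen, accept, and connect before any data moves, whereas the UDP version simply sends a datagram and hopes for the best.

```python
import socket
import threading

HOST = "127.0.0.1"  # loopback address, so the traffic stays on this machine


def tcp_round_trip(message: bytes) -> bytes:
    """Send a short message over a TCP connection and return the echoed reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((HOST, 0))  # port 0 asks the OS for any free port
        server.listen(1)
        port = server.getsockname()[1]

        def echo_once() -> None:
            conn, _ = server.accept()  # returns once a client has connected
            with conn:
                conn.sendall(conn.recv(1024))

        worker = threading.Thread(target=echo_once)
        worker.start()
        # create_connection establishes the session before any data is sent.
        with socket.create_connection((HOST, port), timeout=5) as client:
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
    return reply


def udp_one_way(message: bytes) -> bytes:
    """Send one UDP datagram and return whatever the receiver picked up."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind((HOST, 0))
        receiver.settimeout(5)  # give up rather than wait forever for a lost datagram
        port = receiver.getsockname()[1]
        sender.sendto(message, (HOST, port))  # no session, no acknowledgment
        data, _ = receiver.recvfrom(1024)
    return data


print(tcp_round_trip(b"hello over TCP"))
print(udp_one_way(b"hello over UDP"))
```

On the loopback interface, the UDP datagram almost always arrives. On a real network, nothing in UDP itself would tell the sender if it didn't.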

Application layer

The application layer of the TCP/IP model corresponds to the session, presentation, and application layers of the OSI Reference Model. A few of the most popular application layer protocols are

  • HyperText Transfer Protocol (HTTP): The core protocol of the World Wide Web.
  • File Transfer Protocol (FTP): A protocol that enables a client to send complete files to a server and receive complete files from it.
  • Telnet: The protocol that lets you connect to another computer on the Internet in a terminal emulation mode.
  • Simple Mail Transfer Protocol (SMTP): One of several key protocols that are used to provide email services.
  • Domain Name System (DNS): The protocol that allows you to refer to other host computers by using names rather than numbers.
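
Application layer protocols ride on top of the transport layer. The following sketch shows this for HTTP: it starts a throwaway web server on the loopback address using Python's standard http.server module, opens an ordinary TCP connection to it, and types out an HTTP request by hand. The handler class, the helper function, and the reply text are all made up for illustration. A real browser and a real web server do a great deal more.

```python
import http.server
import socket
import threading


class HelloHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text reply."""

    def do_GET(self) -> None:
        body = b"hello from the application layer\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass  # keep the example quiet


def fetch_status_line(host: str, port: int) -> str:
    """Send a hand-written HTTP request over TCP; return the first reply line."""
    request = (
        "GET / HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(request)
        response = b""
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # the server closed the connection
                break
            response += chunk
    return response.split(b"\r\n", 1)[0].decode("ascii")


# Port 0 lets the OS pick a free port for the throwaway server.
server = http.server.HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
status = fetch_status_line("127.0.0.1", server.server_address[1])
server.shutdown()
server.server_close()
print(status)
```

The request is nothing more than a few lines of text handed to TCP, which is the layering idea in miniature: HTTP decides what to say, and TCP worries about getting it there.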