Chapter 18. Network Capture

Sooner or later, you will connect your system to a network, whether it is a LAN segment at work, a cable or DSL modem at home, or even a dial-up connection on the road. You will send and receive packets from a variety of computers that you know almost nothing about. Being able to monitor, capture, and analyze those packets can be incredibly useful, either to troubleshoot network performance, debug a problematic networking program, or capture an attack for later analysis or as evidence for prosecution.

This chapter is meant to give you a short introduction to the essential tools of capturing and manipulating traffic. For additional resources, I strongly recommend Wireshark & Ethereal Network Protocol Analyzer Toolkit, by Orebaugh et al. (Syngress) and Network Intrusion Detection, by Steven Northcutt and Judy Novak (SAMS).

tcpdump

tcpdump is a command-line packet sniffer for Unix-based operating systems. In order to capture packets other than those addressed to the host's MAC address, it must enable Promiscuous Mode on the card, which requires superuser/root access. Most versions of Unix will not let you run tcpdump unless you are root, because being able to see packets from other users would violate Unix's security model.

tcpdump was originally written by Van Jacobson, Craig Leres, and Steven McCanne when they worked at Lawrence Berkeley National Laboratory (LBNL) Network Research Group (NRG), just up the hill from the main UC Berkeley campus. Because of this, the filtering language used in tcpdump is known as the Berkeley Packet Filtering (BPF) language.

Basics

Acquiring tcpdump is fairly straightforward; most Linux/Unix/POSIX distributions (or distros, for short) have a simple-to-install package for both tcpdump and libpcap (you need both to capture traffic). If an install package isn't available for your distro, the source is available from http://www.tcpdump.org. Compiling from source is a fairly straightforward operation.

Once installed, log in as root (or use sudo as discussed in Limiting Access) and run it:

 # tcpdump

By default, it picks up the first interface it finds (excluding the loopback, typically eth0 for Linux) and displays each packet it sees on that interface as a single line; for example:

12:55:09.459039 IP 10.10.9.24.3766 > server.ssh: . ack 887032 win 65535
12:55:09.459181 IP server.ssh > 10.10.9.24.3766: P 887280:887396(116) ack 3693 win 9648

Ctrl-C quits the capture. If you did the above capture remotely through SSH, you receive a torrent of packets similar to the ones just shown. You are essentially seeing yourself, then telling yourself you are seeing yourself. This capability is not terribly useful, although sometimes it can be enough to know whether your network is working properly. Note that if tcpdump knows the service name from the /etc/services file, it displays the service name instead of the port number (e.g., .ssh in the example). Also if it knows the reverse lookup for a particular address, it resolves it by default—this can be useful, or it can be bad.

Consider this: if you are doing a covert packet capture between you and a remote target, or between two targets on a port-mirrored switch (or you broke its CAM table), and then you start doing reverse-DNS lookups on the hosts you are collecting against, you are leaking information that someone else may be able to monitor and detect, revealing your otherwise untraceable activities. To disable name lookups, use the -n option:

# tcpdump -n
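If you want to post-process this one-line-per-packet output in a script, a few lines of Python can pull the fields apart. This is only a sketch: the name parse_line and the regular expression are my own assumptions, fitted to the two sample lines shown earlier, and tcpdump's output varies by protocol and version.

```python
import re

# Assumed pattern for lines such as:
#   12:55:09.459039 IP 10.10.9.24.3766 > server.ssh: . ack 887032 win 65535
# The last dot-separated field of each endpoint is the port or service name.
LINE_RE = re.compile(
    r"^(?P<time>\S+) IP "
    r"(?P<src>\S+)\.(?P<sport>[\w-]+) > "
    r"(?P<dst>\S+)\.(?P<dport>[\w-]+): "
    r"(?P<rest>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one IP line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = "12:55:09.459039 IP 10.10.9.24.3766 > server.ssh: . ack 887032 win 65535"
    print(parse_line(sample))
```

Lines that do not fit the pattern (ARP, for instance) simply come back as None, so the caller can skip them.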

What if you want to ssh in on one interface and then monitor on another? A neat trick is to have the second interface simply up, but with no IP address assigned to it. This allows you to monitor a network without being detected on that network or directly attacked. In order to specify which interface you are going to capture on, use the -i option and then the name of an interface; for example:

# tcpdump -i eth1

Say your Unix system is also forwarding traffic, perhaps modifying it on the way or maybe even dropping traffic (due to an inline IDS or a proxy server), and you want to see traffic that it is forwarding; or suppose your machine is multi-homed, or you're connected to multiple span ports, etc. If it is a Linux-based system, you can easily capture traffic to and from any interface with the any interface keyword; for example:

# tcpdump -i any

If you decide to save packets to disk (discussed later) by using this technique, the resulting capture file is typically called a Linux cooked capture. Also keep in mind that this does not set the interfaces to promiscuous mode and only packets destined to or sent from the system are captured.

To get a list of available capture interfaces, use the -D option:

# tcpdump -D
1.eth0
2.eth1
3.eth2
4.eth3
5.any (Pseudo-device that captures on all interfaces)
6.lo

My example shows four Ethernet interfaces, the Linux any interface, and the loopback interface.
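When a script needs to pick an interface, the -D listing is easy to parse. The following is a minimal sketch that assumes the output looks like the listing above (a number, a dot, the name, and an optional description); the function name parse_interface_list is hypothetical, and other tcpdump versions may add more detail to each line.

```python
def parse_interface_list(text):
    """Return interface names from `tcpdump -D` style output."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        # Each entry looks like "5.any (Pseudo-device ...)" or "1.eth0".
        number, _, rest = line.partition(".")
        if not number.isdigit() or not rest.strip():
            continue  # skip blank or unexpected lines
        names.append(rest.split()[0])
    return names


if __name__ == "__main__":
    listing = """1.eth0
2.eth1
3.eth2
4.eth3
5.any (Pseudo-device that captures on all interfaces)
6.lo
"""
    print(parse_interface_list(listing))
```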

Berkeley Packet Filter (BPF)

Up until now, you have been capturing any old packet that comes across the wire. And while this may be handy, it is not always desirable (consider the problem of sshing into a box, then performing a tcpdump and capturing your own SSH session in the process). As mentioned earlier, the gents from Berkeley came up with a filtering language to specify packets of interest, which became known as the Berkeley Packet Filter (BPF) language. It can be as simple or as complex as you like. We will start out with some basics.

Anything that comes after the specified options when tcpdump is run is considered a BPF expression. Expressions consist of one or more primitives, which generally consist of an identifier (either name or number) and a qualifier. There are three different kinds of qualifiers:

Type

Defines what kind of item the identifier refers to. Valid qualifiers include:

host

Defines a specific host to capture. By default, it assumes the protocol setting is ip and can therefore be an IP address or domain name (as long as the name can be resolved to an IP address); for example, tcpdump host 10.2.3.4 or tcpdump host foo.bar.com.

net

Defines a specific network to capture. If one byte is given, it assumes a class A netmask; for two bytes, it assumes a class B netmask; and for three bytes, it assumes a class C netmask. For example, tcpdump net 10.1 would match any packets whose source or destination addresses matched 10.1.x.x.

port

Defines a specific port (TCP and UDP) to capture; for example, tcpdump port 80. Also supports service names if defined in /etc/services; for example, tcpdump port ssh.

portrange

Defines a specific port range (TCP and UDP) to capture; for example, tcpdump portrange 135-139.

Direction

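Defines a direction to filter on. Valid qualifiers include src, dst, src or dst, and src and dst; for example, tcpdump src host 10.2.3.4. If no direction is given, src or dst is assumed.

Proto

Restricts the match to a particular protocol. Valid qualifiers include ether, ip, ip6, arp, rarp, tcp, and udp; for example, tcpdump tcp port 80.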

Here's a simple example: say you want to filter on only IP protocol packets (no ARP or IPX). You could use the ip primitive to create the following filter:

# tcpdump -i eth1 ip

You can take these different primitives and combine them, using Boolean logic (and, not, or); for example:

# tcpdump -i eth1 tcp port 80 and ether host 00:0C:F1:F1:B6:20

This would listen on interface eth1 for HTTP traffic that was sent or received by a system with a MAC address of 00:0C:F1:F1:B6:20.

This next example is very useful:

# tcpdump -i eth2 not tcp port 22

This captures everything except SSH, which is really handy when you have ssh'd in remotely and are commanding the box to capture traffic, but you want to exclude your encrypted (and therefore useless) SSH session.
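The netmask assumption behind the net qualifier described earlier can be made concrete with a short Python sketch. The helper name expand_net is hypothetical; it only mirrors the rule stated in the text (one byte means an 8-bit mask, two bytes 16 bits, three bytes 24 bits) and is not how tcpdump itself is implemented.

```python
import ipaddress


def expand_net(partial):
    """Turn a partial dotted address such as '10.1' into the network it implies."""
    octets = partial.split(".")
    if not 1 <= len(octets) <= 4:
        raise ValueError("expected one to four dotted bytes")
    prefix_len = 8 * len(octets)            # 1 byte -> /8, 2 -> /16, 3 -> /24
    padded = octets + ["0"] * (4 - len(octets))
    return ipaddress.ip_network(".".join(padded) + "/" + str(prefix_len))


if __name__ == "__main__":
    net = expand_net("10.1")
    print(net)                                          # the network 'net 10.1' covers
    print(ipaddress.ip_address("10.1.200.7") in net)    # address inside 10.1.x.x
    print(ipaddress.ip_address("10.2.0.1") in net)      # address outside it
```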

Writing Packets to Disk

Watching packets fly over the wire is interesting, for about 10 seconds. Under normal circumstances, the packets are going by so quickly that you are not going to have the reflexes to see them long enough for it to make much sense. Sure, you can page it with your favorite paging program (e.g., more, less), but it would be so much more useful if you could save these packets to disk.

Fortunately, tcpdump has a switch to do just that—the write option -w. By specifying a filename after the -w option, packets are written to disk instead of displayed:

# tcpdump -i eth2 -w eth2-all-but-ssh.pcap not tcp port 22

By default, older versions of tcpdump examine only the first 68 bytes of each packet (96 with SunOS NIT), which is enough to snag the IP header and an ICMP, TCP, UDP, or similar header, but not much of the payload. This saves memory (packet buffers) and increases performance. Newer releases default to a much larger snapshot length (262144 bytes), but it is safer not to rely on the default. Saving just the packet headers to disk is not useful most of the time. To ensure the entire packet is captured (including payload), use the snaplen option of -s and set it to zero, which captures the entire packet regardless of its length; for example:

# tcpdump -s 0 -i eth2 -w eth2-all-but-ssh.pcap not tcp port 22

These are the most common options I use when I'm creating pcaps for later review.
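To confirm what snaplen and link type a saved capture was written with, you can read the 24-byte header at the start of the file. The sketch below assumes the classic pcap format with microsecond timestamps (magic number 0xa1b2c3d4); the function name read_pcap_header and the small LINKTYPES table are my own, and newer formats such as pcapng are not handled.

```python
import struct

# Two link-layer type codes of interest here; many others exist.
LINKTYPES = {1: "Ethernet", 113: "Linux cooked capture"}


def read_pcap_header(data):
    """Decode the 24-byte global header of a classic pcap file."""
    if len(data) < 24:
        raise ValueError("need at least 24 bytes")
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"      # file written in little-endian byte order
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"      # file written in big-endian byte order
    else:
        raise ValueError("not a classic microsecond pcap file")
    # version major/minor, timezone offset, timestamp accuracy, snaplen, link type
    major, minor, _zone, _sigfigs, snaplen, linktype = struct.unpack(
        endian + "HHiIII", data[4:24]
    )
    return {
        "version": (major, minor),
        "snaplen": snaplen,
        "linktype": linktype,
        "linktype_name": LINKTYPES.get(linktype, "other"),
    }
```

For a real file, pass in the first 24 bytes, for example open("eth2-all-but-ssh.pcap", "rb").read(24). A snaplen of 68 in the result tells you the payloads were truncated at capture time.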

One problem with writing packets to disk is the sheer volume of data that gets written. This can be countered with some good BPF-fu to get more precise captures, but there are times when we cannot do filtering because we do not know what we are looking for, when we need to capture a lot of data before we find something interesting, or when we need to capture for a very long time. All of these scenarios result in very large pcap files. Since most systems can handle 2 GB files these days without any problems, this could be considered a non-issue, except for the fact that eventually you will want to view these pcaps. Viewing a 2 GB pcap file is not something I have ever tried, as I do not have 3-4 GB of RAM, nor the 30-45 minutes of free time to watch it load. tcpdump has a -C (chunk) option that allows you to spread a capture across several files. The -C option takes a file size parameter (in millions of bytes, not megabytes); when that size is exceeded, tcpdump closes the current capture file and starts a new one; for example:

# tcpdump -s 0 -C 100 -i eth2 -w eth2-all-but-ssh.pcap not tcp port 22

This would create several files of pcap data, each as close to 100 million bytes as possible, until you stop it with Ctrl-C. The variance in file size occurs because tcpdump does not split a packet between two capture files. New files are named the same as the -w option filename, with a number appended to the end, starting at 1 and going up from there. Unfortunately, if you name your files correctly and place a .pcap extension on the end, it will do the wrong thing and number them .pcap1, .pcap2, .pcap3, etc.
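The naming and sizing rules just described are simple enough to sketch in Python, which helps when a script has to find the chunks afterward. Both helper names are hypothetical, and the naming follows only what is described here for -C used on its own; combining it with other options may change the numbering.

```python
def chunk_names(base, count):
    """File names for the first `count` chunks written by -C alone."""
    if count < 1:
        return []
    # The first file keeps the -w name; later ones get 1, 2, 3... appended.
    return [base] + [base + str(i) for i in range(1, count)]


def chunk_limit_bytes(c_value):
    """-C counts millions of bytes (1,000,000), not 1,048,576-byte units."""
    return c_value * 1_000_000


if __name__ == "__main__":
    print(chunk_names("eth2-all-but-ssh.pcap", 3))
    print(chunk_limit_bytes(100))        # size limit for -C 100
    print(100 * 1024 * 1024)             # what 100 "binary" megabytes would be
```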

The -C option can be combined with the -W (wrap) option to set a limit on how many files are created. When this number is reached, the oldest file is overwritten, creating a rolling buffer. This can be useful if you are doing a really long-term capture but do not want to crash the process because you ran out of hard drive space. This makes 10 pcap files of 100 million bytes each (for a total of about a gigabyte):

# tcpdump -s 0 -C 100 -W 10 -i eth2 -w eth2-all-but-ssh.pcap not tcp port 22

I'd normally have the -W option higher, say 100 or even 500, to ensure I do not lose data right away.
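A quick back-of-the-envelope calculation shows why a larger -W value matters. The sketch below is plain arithmetic, not anything tcpdump provides; the function names are my own, and the traffic rate is a made-up figure you would replace with your own measurement.

```python
def ring_capacity_bytes(chunk_millions, max_files):
    """Total disk space used by a -C / -W rolling buffer."""
    return chunk_millions * 1_000_000 * max_files


def retention_seconds(chunk_millions, max_files, bytes_per_second):
    """Roughly how much history the buffer holds before it wraps."""
    return ring_capacity_bytes(chunk_millions, max_files) / bytes_per_second


if __name__ == "__main__":
    rate = 1_000_000  # assumed average capture rate: 1 million bytes per second
    for files in (10, 100, 500):
        secs = retention_seconds(100, files, rate)
        print(files, "files:", ring_capacity_bytes(100, files), "bytes,",
              round(secs / 3600, 2), "hours of history")
```

At that assumed rate, 10 files of 100 million bytes wrap in under 17 minutes, which is why I prefer a much deeper buffer.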

Advanced BPF Filtering

Berkeley Packet Filters support some fairly complex and detailed filter options, but it gets ugly fast and is not for the faint of heart. A full, advanced BPF document could fill its own book. A BPF expression is actually compiled into a small program for a virtual machine, which filters packets while tcpdump runs. It is capable of some very complicated relational comparisons, including bit-level operations such as masking and shifting.

Thankfully, many of the commonly used header masks, such as TCP flags or ICMP types, are precalculated. Using the built-in offset/field value of tcpflags and tcp-rst, you can filter on TCP flags; for example:

# tcpdump -i eth0 'tcp[tcpflags] & tcp-rst !=0'

This gives us any packet that has the TCP RST flag set. Note that because we are using the ampersand (&) to perform a bitwise AND, we must enclose the expression in single quotes, or the shell interprets the ampersand in its own way (i.e., as an execute-in-background operation).
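To see what that filter is doing, here is the same test written out in Python against a raw TCP header. It is a sketch for illustration only: tcpflags stands for byte offset 13 of the TCP header and tcp-rst for the bit value 0x04, while the function name has_rst and the sample header are my own.

```python
import struct

TCP_FLAGS_OFFSET = 13   # the byte that the name tcpflags refers to
TCP_RST = 0x04          # the bit that the name tcp-rst refers to


def has_rst(tcp_header):
    """Equivalent of the filter 'tcp[tcpflags] & tcp-rst != 0'."""
    return (tcp_header[TCP_FLAGS_OFFSET] & TCP_RST) != 0


def make_header(flags):
    """Build a minimal 20-byte TCP header with the given flags byte."""
    # src port, dst port, seq, ack, data offset (5 words), flags, window, checksum, urgent
    return struct.pack("!HHIIBBHHH", 3766, 22, 0, 0, 5 << 4, flags, 65535, 0, 0)


if __name__ == "__main__":
    print(has_rst(make_header(0x14)))   # RST plus ACK
    print(has_rst(make_header(0x10)))   # ACK only
```

The mask keeps only the RST bit, so the test succeeds no matter which other flags happen to be set.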

Advanced Dump Display

There are a variety of different options for displaying data. I will cover the most important two: Verbosity and Format.

Verbosity is controlled with the -q and various -v options:

-q (Quiet): Prints minimal protocol information.
-v (Slightly Verbose): IP TTL, IP Identification, IP Length, IP Options are displayed; IP and ICMP checksums are validated. If used with the -w option, instead it displays a packet count every 10 seconds.
-vv (More Verbose): In addition to the above, TCP and UDP checksums are validated and the application-layer data is displayed for common local-LAN protocols, such as NFS, SMB, NBT, and more. This is useful only if the -s option (for example, -s 0) is also used, so that tcpdump examines enough of each packet to reach the application layer.
-vvv (Total Verbosity): In addition to the above, detailed protocol information is displayed, such as Telnet options.

Format controls how the data is displayed. There are a few options for this:

-A (ASCII)

Prints out payload data in ASCII format. It works without -v, although adding -v shows more header detail. Good for printable-text protocols such as HTTP, SMTP, and SIP, but less useful for binary-based protocols such as SMB, ARP, and peer-to-peer or chat.

-x (Hex-Only)

Prints out payload data in hexadecimal format. It works without -v, although adding -v shows more header detail. This can be useful, but only if you are really good at reading hex format.

-xx (Hex-With-Link-Layer)

Same as -x, except it includes the link layer, which shows you neat layer 2 details such as MAC address.

-X (Hex-Plus-ASCII)

This option prints both hex and ASCII side by side, with or without -v. This is a good compromise when viewing binary-based or mixed binary/ASCII protocols.

-XX (Hex-Plus-ASCII-With-Link-Layer)

Same as -X, except that it also includes the link layer.

Generally, I use tcpdump for captures and view them later with Wireshark (see Ethereal/Wireshark). But sometimes you want to check to make sure your capture is working, so data display can be important. If I am playing with a printable protocol, I usually use -A, while if I'm working on a binary or mixed protocol, I use -X. Once I know I have my filter settings set correctly, I switch out the -A or -X for a -w and save to disk for later analysis.

Using tcpdump to Extract Packets

You can write packets to disk with the -w (write) option and do extensive filtering with BPF. Combine those with the -r (read) option, and you can take a capture previously created with -w and extract only the packets you want from it.

Want to extract all HTTP traffic from a packet capture (pcap) and save it to a new file? Try this:

# tcpdump -r capture.pcap -w http.pcap tcp port 80

Your resulting http.pcap file has only the HTTP sessions found in the original pcap.
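If you find yourself carving the same capture several ways, the extraction command is easy to drive from a script. This is a sketch under my own assumptions: the helper name extract_argv is hypothetical, and it only builds the argument list for the -r/-w command shown above rather than running it.

```python
import shlex


def extract_argv(src, dst, bpf):
    """Argument list for: tcpdump -r SRC -w DST BPF."""
    # Passing the filter as one list element avoids shell quoting problems.
    return ["tcpdump", "-r", src, "-w", dst, bpf]


if __name__ == "__main__":
    jobs = {
        "http.pcap": "tcp port 80",
        "no-ssh.pcap": "not tcp port 22",
    }
    for out_name, bpf in jobs.items():
        argv = extract_argv("capture.pcap", out_name, bpf)
        print(shlex.join(argv))  # show the command as you would type it
```

To actually run each command, hand the list to subprocess.run(argv, check=True).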

Got a pcap of a port-scanning event? You will see thousands and thousands of SYNs, but what you really care about are the SYN/ACK packets that your system sent back, indicating an open port. To extract those packets, try this:

# tcpdump -r port-scan.pcap -w snack-packets.pcap 'tcp[tcpflags] = 0x12'

The 0x12 is the value of the ACK bit (0x10, in the high nibble) plus the SYN bit (0x02, in the low nibble). This pattern is exceptionally useful when testing firewall rules. Configure your firewall as you like and then use a port scanner to test it. Use the BPF above to see only which ports respond to the scan as open, then compare to your scanner's output. The resulting snack-packets.pcap contains only those packets that answered a connection request with an open-port reply.
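The arithmetic behind 0x12 is worth spelling out, because the filter uses an exact comparison rather than a mask. The Python sketch below uses the standard TCP flag bit values; the two function names are my own, and the masked variant is shown only for contrast.

```python
# Standard bit values within the TCP flags byte.
TCP_FIN, TCP_SYN, TCP_RST, TCP_PUSH, TCP_ACK, TCP_URG = (
    0x01, 0x02, 0x04, 0x08, 0x10, 0x20
)


def is_syn_ack_exact(flags):
    """Same test as 'tcp[tcpflags] = 0x12': SYN and ACK set, nothing else."""
    return flags == (TCP_SYN | TCP_ACK)


def has_syn_and_ack(flags):
    """Looser masked test: SYN and ACK set, other bits ignored."""
    mask = TCP_SYN | TCP_ACK
    return (flags & mask) == mask


if __name__ == "__main__":
    print(hex(TCP_SYN | TCP_ACK))                  # the value used in the filter
    print(is_syn_ack_exact(0x12))                  # plain SYN/ACK reply
    print(is_syn_ack_exact(TCP_SYN))               # bare SYN from the scanner
    print(is_syn_ack_exact(0x12 | TCP_PUSH))       # extra bit defeats the exact test
    print(has_syn_and_ack(0x12 | TCP_PUSH))        # but passes the masked one
```

The exact comparison is usually what you want for scan replies, but keep in mind that it skips any SYN/ACK that carries additional flag bits.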