Chapter 3. Building a Linux Firewall

In this chapter, you'll learn how to build a Linux iptables firewall from scratch. While the recipes are aimed at DSL and cable Internet users, they also work for T1/E1 customers. In fact, a Linux box with a T1 interface card is a great alternative to expensive commercial routers. If you're a normal business user and not an ISP that needs Buick-sized routers handling routing tables with hundreds of thousands of entries, then Linux on good-quality x86 hardware will serve your needs just fine.

A Linux border firewall can provide security and share an Internet connection for a whole LAN, which can contain Linux, Windows, Mac, and other PCs. A host firewall protects a single PC. There is a multitude of hardware choices for your firewall box, from small single-board computers, to recycled old PCs, to rackmount units. Any Linux distribution contains everything you need to build a sophisticated, configurable, reliable firewall on any hardware.

Definitions and roles get a bit blurry, as an iptables firewall does both packet filtering and routing. You could call it a filtering router.

iptables is the key to making everything work. Having a solid understanding of how iptables works and how to write custom rules will give you mighty network guru powers. Please study Oskar Andreasson's Iptables Tutorial (http://iptables-tutorial.frozentux.net/) and Craig Hunt's TCP/IP Network Administration (O'Reilly) to get a deeper understanding of how iptables and TCP/IP work. Another excellent resource is the Netfilter FAQ (http://www.iptables.org/documentation/index.html). At the least, you should know what headers IP, TCP, UDP, and ICMP packets contain, and the section "Traversing Of Tables and Chains" in the Iptables Tutorial is especially helpful for understanding how packets move through iptables. If you don't understand these things, iptables will always be mysterious.

Firewalls and routers are often combined on the same device, which is often called an Internet gateway. Strictly speaking, a gateway moves traffic between networks that use different protocols, such as NetBEUI and TCP/IP, which is not something we see much anymore. These days, it means any network device that connects networks.

Routers forward traffic between networks. You always need a router between your LAN and other networks. You may also add intrusion detection, traffic control, proxies, secure remote access, DNS/DHCP, and any other services you want, though in my opinion, it's better to limit your firewall to routing, firewalling, and traffic control. Other services should sit on separate boxes behind your Internet firewall, though of course this is up to you. In small shops, it's not uncommon for a single box to host a multitude of services. The risks are that any successful intruder will have a feast of yummy services to exploit, or you may simply overload the box to the point that performance suffers.

Any computer or network device that is exposed to untrusted networks is called a bastion host. Obviously, bastion hosts have special needs: they must be well hardened, must not share authentication services with your LAN hosts, and must have strict access controls.

If you are going to run Internet-accessible services, you need to isolate your public servers from your private LAN. If you are sharing a single Internet connection, the simplest way is to build a tri-homed (three network interfaces) Linux router; one NIC connects to the Internet, the second one connects to your LAN, and the third one connects to your demilitarized zone (DMZ). A demilitarized zone is a neutral zone between two opposing groups. In computer terms, it's a separate subnet where you segregate your public servers from your private LAN hosts, and your DMZ hosts are treated as only slightly less untrustworthy than the big bad Internet.

Simply placing your public servers on a different subnet adds a useful layer of protection. DMZ hosts are not able to initiate connections back into the private network without being explicitly allowed to do so. If a DMZ server is compromised, an attacker should not find a path into your private network.
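As a rough sketch of that isolation on a tri-homed firewall, the forwarding rules might look something like this. The interface names (eth1 for the LAN, eth2 for the DMZ) are assumptions for illustration, and ipt is a hypothetical helper that only prints each command unless you set APPLY=1.

```shell
#!/bin/sh
# Hypothetical helper: print each iptables command; run it for real only when APPLY=1.
ipt() {
    if [ "${APPLY:-0}" = "1" ]; then
        iptables "$@"
    else
        echo "iptables $*"
    fi
}

# Forwarding rules between LAN and DMZ (interface names are placeholders).
dmz_forward_rules() {
    lan_if=$1
    dmz_if=$2
    # Replies that belong to an already established connection may pass.
    ipt -A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
    # LAN hosts may open new connections to the DMZ servers.
    ipt -A FORWARD -i "$lan_if" -o "$dmz_if" -m conntrack --ctstate NEW -j ACCEPT
    # DMZ servers may not open new connections into the LAN.
    ipt -A FORWARD -i "$dmz_if" -o "$lan_if" -m conntrack --ctstate NEW -j DROP
}

dmz_forward_rules eth1 eth2
```

Run as is, the script prints the three commands so you can review them before applying anything.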

It doesn't matter if your DMZ hosts have public or private IP addresses. Never run public services from inside your LAN. The last thing you want to do is introduce a big fat Internet hole into your LAN.

If your servers have public routable IP addresses, then you may elect to connect them directly to the Internet or on a separate Internet connection. Host firewalls are useful for restricting traffic to the server and blocking the zillions of automated attacks that infest the Internet. It is even better to put a standalone firewall in front of your public servers to filter out unwanted traffic before it hits them.

Our Linux-based iptables firewall is going to perform several jobs:

  • Packet filtering

  • Routing

  • Network Address Translation (NAT)

Packet filtering is an extremely powerful, flexible mechanism that lets us perform all manner of mojo even on encrypted transmissions because TCP/IP packet headers are not encrypted. iptables rules filter on addresses, protocols, port numbers, and every other part of a TCP/IP packet header; it does not perform any sort of data inspection or filtering.

Having routing built in is a nice convenience that lets you pack a lot of functionality into a single device and into a few iptables rules.

NAT is the magic that lets you share a single public IP address with a whole private subnet, and to run public servers with private nonroutable addresses. Suppose you have a typical low-cost DSL Internet account. You have only a single public IP address, and a LAN of 25 workstations, laptops, and servers, protected by a nice iptables NAT firewall. Your entire network will appear to the outside world as a single computer. (Canny network gurus can penetrate NAT firewalls, but it isn't easy.) Source NAT (SNAT) rewrites the source addresses of all outgoing packets to the firewall address.

It works the other way as well. While having public routable IP addresses is desirable for public services, like web and mail servers, you can get by on the cheap without them and run public servers on private addresses. Destination NAT (DNAT) rewrites the destination address, which is the firewall address, to the real server addresses, then iptables forwards incoming traffic to these servers.
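Here is a minimal sketch of both directions. It assumes the Internet-facing interface is eth0, the public address is 203.0.113.5, and a web server sits at 192.168.2.10; all three are placeholders. The ipt helper is hypothetical and only prints each command unless you set APPLY=1.

```shell
#!/bin/sh
# Hypothetical helper: print each iptables command; run it for real only when APPLY=1.
ipt() {
    if [ "${APPLY:-0}" = "1" ]; then
        iptables "$@"
    else
        echo "iptables $*"
    fi
}

# SNAT for outgoing traffic, DNAT for an incoming web service.
nat_rules() {
    wan_if=$1
    public_ip=$2
    server=$3
    # SNAT: rewrite the source address of everything leaving the WAN interface.
    ipt -t nat -A POSTROUTING -o "$wan_if" -j SNAT --to-source "$public_ip"
    # DNAT: rewrite web traffic sent to the public address so it reaches the private server.
    ipt -t nat -A PREROUTING -i "$wan_if" -p tcp -d "$public_ip" --dport 80 -j DNAT --to-destination "${server}:80"
    # The rewritten packets still have to be allowed through the FORWARD chain.
    ipt -A FORWARD -i "$wan_if" -p tcp -d "$server" --dport 80 -j ACCEPT
}

nat_rules eth0 203.0.113.5 192.168.2.10
```

If your public address is assigned dynamically, the MASQUERADE target is the usual substitute for SNAT, because it uses whatever address the outgoing interface has at the moment.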

Someday, when IPv6 is widely implemented, we can say good-bye to NAT, except for those times when we really want it. It is useful for stretching the limited pool of IPv4 addresses, and unintentionally provides some security benefits. But it also creates a host of routing problems. Protocols that have to traverse NAT, like FTP, IRC, SMTP, and HTTP, have all kinds of ingenious hacks built into them to make it possible. Peer protocols like BitTorrent, instant messaging, and Session Initiation Protocol (SIP) are especially challenging to get through NAT.

iptables and TCP/IP Headers

iptables reads the fields in packet headers, but not the data payload, so it's no good for content filtering.

When you're studying the different protocols, you'll run into conflicting terminology. To be strictly correct, IP and UDP move datagrams, TCP exchanges segments, and ICMP packets are messages. In the context of iptables, most admins just say "packets," though you run the risk of annoying pedantic network engineers. The important part is understanding that every data transmission is broken into a series of packets that travel independently over the network, often taking different routes. Then, when they arrive at their destination, the TCP protocol reassembles them in the correct order. Each packet contains in its headers all the information necessary for routers to forward it to its destination. IP and UDP are unreliable protocols because they do not have delivery confirmations, but this makes them very fast. TCP takes care of delivery confirmations, sequence numbers, and error-checking, so it incurs a bit of overhead, but gains reliability. TCP/IP together are extremely reliable.

If you have any questions about connecting to the Internet or networking hardware basics, read the Introduction to this book.

Policies are the default actions applied to packets that do not match any rules. There are three built-in tables: filter, NAT, and mangle. You will use the filter table the most, the NAT table a little, and the mangle table perhaps not at all (it is for advanced packet manipulation). Each table contains a number of built-in chains. You may also create custom chains. A chain is a list of rules that defines the actions applied to packets. Rules end with a target specification that tells what to do with the packet. This is done with the jump (-j) option, like this simple example that permits all loopback traffic with the ACCEPT target:

	iptables -A INPUT -i lo -j ACCEPT

Once a packet reaches the ACCEPT target, that is the end of the road, and it does not traverse any more chains. Rules can be run from the command line or put in a script. This is what each part of this rule means:

  • iptables = The iptables command

  • No table is specified, so the default filter table is used

  • -A INPUT = Append this rule to the built-in INPUT chain

  • -i lo = Apply this rule to packets arriving on interface lo

  • -j ACCEPT = Jump to the built-in ACCEPT target, which lets packets continue to their final destinations
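It can help to see the same rule spelled out with the table named explicitly and with long option names, next to a companion rule for the outgoing direction. This is only a sketch; the ipt helper is hypothetical and prints each command unless you set APPLY=1.

```shell
#!/bin/sh
# Hypothetical helper: print each iptables command; run it for real only when APPLY=1.
ipt() {
    if [ "${APPLY:-0}" = "1" ]; then
        iptables "$@"
    else
        echo "iptables $*"
    fi
}

loopback_rules() {
    # Equivalent to: iptables -A INPUT -i lo -j ACCEPT
    ipt --table filter --append INPUT --in-interface lo --jump ACCEPT
    # Companion rule for traffic leaving through the loopback interface.
    ipt --table filter --append OUTPUT --out-interface lo --jump ACCEPT
}

loopback_rules
```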

iptables does stateful packet inspection, which is done via its connection tracking mechanism. In other words, it knows if a packet is attempting to start a new connection or if it belongs to an existing one. Seeing packets in context is very powerful, and makes it possible to do a lot of work with a few rules. If you are running no public services, you can then easily block all outside attempts to create a connection, because they have no legitimate reason to try to connect to you. When you do run services such as SSH, FTP, or a web or mail server, iptables can allow only traffic targeted for the services you are running, and reject all the rest. You might block all outgoing traffic initiated from your servers because they're only supposed to respond to connection attempts from the outside, not initiate them. These things would be difficult to do without stateful packet inspection.
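A small sketch of what this looks like for a host that offers only SSH follows. The port number is passed in as a parameter and the ipt helper is hypothetical; it prints each command unless you set APPLY=1. The rules use the conntrack match (-m conntrack --ctstate); older scripts often use the equivalent -m state --state.

```shell
#!/bin/sh
# Hypothetical helper: print each iptables command; run it for real only when APPLY=1.
ipt() {
    if [ "${APPLY:-0}" = "1" ]; then
        iptables "$@"
    else
        echo "iptables $*"
    fi
}

stateful_input_rules() {
    ssh_port=$1
    # Always allow loopback traffic.
    ipt -A INPUT -i lo -j ACCEPT
    # Allow packets that belong to, or are related to, connections already being tracked.
    ipt -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
    # Allow new connections only to the one service we run.
    ipt -A INPUT -p tcp --dport "$ssh_port" -m conntrack --ctstate NEW -j ACCEPT
    # Set the default policies last, so an existing remote session is not cut off midway.
    ipt -P INPUT DROP
    ipt -P FORWARD DROP
}

stateful_input_rules 22
```

Four rules and two policies cover the whole host, which is the payoff of seeing packets in context.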

iptables is extensible with the addition of custom kernel modules, so iptables features vary by Linux distribution and user modifications. To see what your installation supports, check your /boot/config-* file. If you're not thrilled by the notion of managing a bunch of kernel modules (and iptables can use quite a few), build a custom kernel with the iptables functions you want built-in.
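One way to look through that file is sketched below. The netfilter_options function name is made up for this example, and the pattern is an assumption that covers the common option prefixes (CONFIG_NETFILTER, CONFIG_IP_NF_, CONFIG_NF_) rather than an exhaustive list. In the output, =y means built into the kernel and =m means built as a module.

```shell
#!/bin/sh
# Print the netfilter-related options from a kernel config file.
# The prefix list is an assumption; extend it if your kernel uses other names.
netfilter_options() {
    grep -E '^CONFIG_(NETFILTER|IP_NF|NF_)' "$1"
}

config="/boot/config-$(uname -r)"
if [ -r "$config" ]; then
    netfilter_options "$config" || echo "no netfilter options found in $config" >&2
else
    echo "no readable kernel config at $config" >&2
fi
```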

There are three tables in iptables. Any rules or custom chains that you create will go into one of these tables. The filter table is the default, and is the one you'll use the most. You can think of it as the firewalling portion of iptables. The filter table contains three built-in chains: INPUT, FORWARD, and OUTPUT.

The NAT table is used only to change the packet's Source Address field or Destination Address field. If you have a single public, routable IP address in front of a LAN that uses private addresses, which is common, NAT translates the source IP addresses on outgoing packets to the public address. It doesn't matter if you have a hundred hosts sharing the connection; it will appear that all your traffic is coming from a single host. Conversely, you may use it to enable access to public services with private IPs. The NAT table has these built-in chains: PREROUTING, OUTPUT, and POSTROUTING (recent kernels add INPUT as well).

The mangle table lets you alter packet headers as you like. This has a host of uses that we will not cover in this book, but here are a few ideas for inspiration:

The mangle table has these built-in chains: PREROUTING, INPUT, FORWARD, OUTPUT, and POSTROUTING.

Packets coming into your network must first pass through the mangle table, then the NAT table, and finally, the filter table.
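To see which chains each table holds on your own system, along with their policies and rules, you can list the tables one at a time. This is a small sketch; the ipt helper is hypothetical and prints each command unless you set APPLY=1, and the real listing needs root privileges.

```shell
#!/bin/sh
# Hypothetical helper: print each iptables command; run it for real only when APPLY=1.
ipt() {
    if [ "${APPLY:-0}" = "1" ]; then
        iptables "$@"
    else
        echo "iptables $*"
    fi
}

# List every chain in each table, in the order incoming packets meet the tables.
# -L lists the rules chain by chain; -n skips DNS lookups so the listing is fast.
list_tables() {
    for table in mangle nat filter; do
        ipt -t "$table" -L -n
    done
}

list_tables
```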

User-defined chains can improve performance because packets traverse your rules and chains in the order they are listed. Defining your own chains lets you create shortcuts, so packets can jump directly to the chains you want them to traverse, instead of passing through a bunch of irrelevant rules and chains first. Or, you may save some configuration steps by building a custom chain to use over and over.
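For instance, a custom chain can hold all the rules for one service, so only packets for that service ever traverse them. The sketch below uses a made-up chain name (SSH_IN) and a placeholder admin subnet (192.168.1.0/24); the ipt helper is hypothetical and prints each command unless you set APPLY=1.

```shell
#!/bin/sh
# Hypothetical helper: print each iptables command; run it for real only when APPLY=1.
ipt() {
    if [ "${APPLY:-0}" = "1" ]; then
        iptables "$@"
    else
        echo "iptables $*"
    fi
}

custom_chain_rules() {
    chain=$1
    admin_net=$2
    # Create the user-defined chain.
    ipt -N "$chain"
    # Inside the chain: accept SSH from the admin subnet, drop everything else.
    ipt -A "$chain" -s "$admin_net" -j ACCEPT
    ipt -A "$chain" -j DROP
    # Only SSH packets jump to the chain; all other traffic skips it entirely.
    ipt -A INPUT -p tcp --dport 22 -j "$chain"
}

custom_chain_rules SSH_IN 192.168.1.0/24
```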

While you can customize any Linux distribution any way you like, there are a number of specialized Linux distributions designed to serve as Internet routers and firewalls. They are stripped down to the essentials. Some are small enough to fit on a floppy disk. Typically, these include iptables, DNS/DHCP servers, secure remote access, intrusion detection, logging, port forwarding, and Internet connection sharing. Here are a few of the more popular ones:

Freesco (http://www.freesco.org/)

The name means FREE ciSCO. It is a free replacement for commercial routers. It supports up to 10 Ethernet/arcnet/Token Ring/arlan network cards, and up to 10 modems. It is easy to set up, and can be run from a single write-protected diskette, or from a hard drive, if you want additional functionality.

IPCop (http://www.ipcop.org/)

An excellent prefab Internet gateway. It has a web-based administration interface, supports SSH and console access, and, in addition to the usual gateway services, it supports dial-up networking and DynDNS.

The Sentry Firewall CD (http://www.sentryfirewall.com/)

Sentry runs from a bootable CD, and stores configuration files on a diskette. Set the diskette to read-only, and recovering from an intrusion is as easy as patching the hole and rebooting.

Pyramid Linux (http://pyramid.metrix.net/)

Pyramid Linux, a descendant of the popular Pebble Linux, is maintained by Metrix Communications, and is based on Ubuntu Linux. It is optimized for wireless access points, and serves equally well as a wired-network firewall. The stock installation occupies under 50 MB, so it's perfect for single-board computers without expandable storage. Because it uses stock Ubuntu packages, you can easily add applications by copying the binaries and any dependent libraries from the Ubuntu liveCD.

Bering uClibc (http://leaf.sourceforge.net/bering-uclibc/)

Bering achieves its small size by using modified libraries. Because it is so customized, you have to rely on the Bering package repositories for additional applications. This shouldn't be a problem for most admins, as they offer a large number of additional packages.

Voyage Linux (http://www.voyage.hk/software/voyage.html)

Based on Debian, Voyage can be shrunk to as small as 64 MB, or expanded as desired. It is optimized for wireless access points, routers, and firewalls.

Debian Router (http://gate-bunker.p6.msu.ru/~berk/)

This is a work in progress. It is an interesting Debian implementation that takes a slimmed-down, stock Debian, and adapts it to boot from a flash drive and run entirely in memory.

It is equally important to harden your systems, and a great tool for this is Bastille Linux (http://www.bastille-linux.org/). Bastille is a set of scripts that walk you through a number of steps to harden your entire system. It is designed to be educational and functional. You can run through it a couple of times without actually changing anything, and it also has an undo feature so that you can practice without running the risk of locking yourself out of your system. It examines almost every aspect of your system, including file permissions, PAM settings, services, and remote access.