In this chapter, you'll learn how to build a Linux iptables firewall from scratch. While the recipes are aimed at DSL and cable Internet users, they also work for T1/E1 customers. In fact, a Linux box with a T1 interface card is a great alternative to expensive commercial routers. If you're a normal business user and not an ISP that needs Buick-sized routers handling routing tables with hundreds of thousands of entries, then Linux on good-quality x86 hardware will serve your needs just fine.
A Linux border firewall can provide security and share an Internet connection for a whole LAN, which can contain Linux, Windows, Mac, and other PCs. A host firewall protects a single PC. There is a multitude of hardware choices for your firewall box, from small single-board computers, to recycled old PCs, to rackmount units. Any Linux distribution contains everything you need to build a sophisticated, configurable, reliable firewall on any hardware.
Definitions and roles get a bit blurry, as an iptables firewall does both packet filtering and routing. You could call it a filtering router.
iptables is the key to making everything work. Having a solid understanding of how iptables works and how to write custom rules will give you mighty network guru powers. Please study Oskar Andreasson's Iptables Tutorial (http://iptables-tutorial.frozentux.net/) and Craig Hunt's TCP/IP Network Administration (O'Reilly) to get a deeper understanding of how iptables and TCP/IP work. Another excellent resource is the Netfilter FAQ (http://www.iptables.org/documentation/index.html). At the least, you should know what headers IP, TCP, UDP, and ICMP packets contain, and the section "Traversing Of Tables and Chains" in the Iptables Tutorial is especially helpful for understanding how packets move through iptables. If you don't understand these things, iptables will always be mysterious.
Firewalls and routers are often combined on the same device, which is often called an Internet gateway. Strictly speaking, a gateway moves traffic between networks that use different protocols, such as NetBEUI and TCP/IP, which is not something we see much anymore. These days, the term means any network device that connects networks.
Routers forward traffic between networks. You always need a router between your LAN and other networks. You may also add intrusion detection, traffic control, proxies, secure remote access, DNS/DHCP, and any other services you want, though in my opinion, it's better to limit your firewall to routing, firewalling, and traffic control. Other services should sit on separate boxes behind your Internet firewall, though of course this is up to you. In small shops, it's not uncommon for a single box to host a multitude of services. The risks are that any successful intruder will have a feast of yummy services to exploit, or you may simply overload the box to the point that performance suffers.
Any computer or network device that is exposed to untrusted networks is called a bastion host. Obviously, bastion hosts have special needs: they must be well hardened, must not share authentication services with your LAN hosts, and must have strict access controls.
If you are going to run Internet-accessible services, you need to isolate your public servers from your private LAN. If you are sharing a single Internet connection, the simplest way is to build a tri-homed (three network interfaces) Linux router; one NIC connects to the Internet, the second one connects to your LAN, and the third one connects to your demilitarized zone (DMZ). A demilitarized zone is a neutral zone between two opposing groups. In computer terms, it's a separate subnet where you segregate your public servers from your private LAN hosts, and your DMZ hosts are treated as only slightly less untrustworthy than the big bad Internet.
Simply placing your public servers on a different subnet adds a useful layer of protection. DMZ hosts are not able to initiate connections back into the private network without being explicitly allowed to do so. If a DMZ server is compromised, an attacker should not find a path into your private network.
It doesn't matter if your DMZ hosts have public or private IP addresses. Never run public services from inside your LAN. The last thing you want to do is introduce a big fat Internet hole into your LAN.
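As a rough illustration of the tri-homed layout described above, here is a minimal sketch of the forwarding rules such a router might use. The interface names (eth0 for the Internet, eth1 for the LAN, eth2 for the DMZ) and the DMZ web server address are assumptions made up for this example, not values from any recipe. The script only prints the commands unless you run it as root with IPT=iptables.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Run as root with IPT=iptables to apply them for real.
IPT="${IPT:-echo iptables}"

# Assumed interface names and address; adjust to your own hardware.
WAN_IF="eth0"            # Internet-facing NIC
LAN_IF="eth1"            # private LAN NIC
DMZ_IF="eth2"            # DMZ NIC
DMZ_WEB="192.168.2.10"   # hypothetical web server in the DMZ

dmz_forward_rules() {
    # Drop any forwarded packet that is not explicitly allowed below.
    $IPT -P FORWARD DROP
    # Packets that belong to already-permitted connections may pass.
    $IPT -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT
    # LAN hosts may open connections to the Internet and to the DMZ.
    $IPT -A FORWARD -i "$LAN_IF" -o "$WAN_IF" -m state --state NEW -j ACCEPT
    $IPT -A FORWARD -i "$LAN_IF" -o "$DMZ_IF" -m state --state NEW -j ACCEPT
    # Internet hosts may reach only the DMZ web server, on TCP port 80.
    $IPT -A FORWARD -i "$WAN_IF" -o "$DMZ_IF" -p tcp -d "$DMZ_WEB" --dport 80 -m state --state NEW -j ACCEPT
    # Deliberately absent: any rule letting the DMZ open new
    # connections into the LAN.
}

dmz_forward_rules
```

Because the FORWARD policy is DROP and no rule accepts new connections from eth2 to eth1, a compromised DMZ host has no forwarding path into the LAN in this sketch.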
If your servers have public routable IP addresses, then you may elect to connect them directly to the Internet or on a separate Internet connection. Host firewalls are useful for restricting traffic to the server and blocking the zillions of automated attacks that infest the Internet. A standalone firewall in front of your public servers is also nice to have, because it filters out unwanted traffic before it hits them.
While firewalls are useful, remember to give a lot of attention to your application-level and OS security. Some admins recommend configuring your servers as though you have no firewall, and that is a good strategy. Linux and Unix servers can be hardened to the point where they really don't need a firewall. Windows systems are impossible to harden to this degree. Nor is a firewall a cure-all. A nice strong iptables firewall is a good umbrella to place over Windows hosts, but a firewall will not protect them from email-borne malware, infected web sites, or the increasing hordes of spyware, adware, Trojan horses, and rootkits that come in legitimate commercial software products, or the inability of commercial security products to detect all the bad stuff.
Our Linux-based iptables firewall is going to perform several jobs:
Packet filtering is an extremely powerful, flexible mechanism that lets us perform all manner of mojo even on encrypted transmissions because TCP/IP packet headers are not encrypted. iptables filters on addresses, protocols, port numbers, and every other part of a TCP/IP packet header; it does not perform any sort of data inspection or filtering.
Having routing built in is a nice convenience that lets you pack a lot of functionality into a single device and into a few iptables rules.
NAT is the magic that lets you share a single public IP address with a whole private subnet, and run public servers with private nonroutable addresses. Suppose you have a typical low-cost DSL Internet account. You have only a single public IP address, and a LAN of 25 workstations, laptops, and servers, protected by a nice iptables NAT firewall. Your entire network will appear to the outside world as a single computer. (Canny network gurus can penetrate NAT firewalls, but it isn't easy.) Source NAT (SNAT) rewrites the source addresses of all outgoing packets to the firewall address.
It works the other way as well. While having public routable IP addresses is desirable for public services, like web and mail servers, you can get by on the cheap without them and run public servers on private addresses. Destination NAT (DNAT) rewrites the destination address, which is the firewall address, to the real server addresses, then iptables forwards incoming traffic to these servers.
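To make SNAT and DNAT a little more concrete, here is a minimal sketch of one rule of each kind. The interface name, the public address (taken from a range reserved for documentation), and the private server address are all assumptions for illustration. The script prints the commands by default; run it as root with IPT=iptables to apply them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
IPT="${IPT:-echo iptables}"

# Assumed values; substitute your own.
WAN_IF="eth0"               # Internet-facing NIC
PUBLIC_IP="203.0.113.5"     # the firewall's single public address
WEB_SERVER="192.168.2.10"   # hypothetical server with a private address

nat_rules() {
    # SNAT: rewrite the source address of outgoing packets to the
    # firewall's public address.
    $IPT -t nat -A POSTROUTING -o "$WAN_IF" -j SNAT --to-source "$PUBLIC_IP"
    # DNAT: rewrite the destination of incoming web traffic from the
    # firewall's address to the real server address.
    $IPT -t nat -A PREROUTING -i "$WAN_IF" -p tcp -d "$PUBLIC_IP" --dport 80 -j DNAT --to-destination "$WEB_SERVER"
    # NAT only rewrites addresses; the filter table must still permit
    # the rewritten packets to be forwarded.
    $IPT -A FORWARD -i "$WAN_IF" -p tcp -d "$WEB_SERVER" --dport 80 -j ACCEPT
}

nat_rules
```

None of this forwards anything unless IP forwarding is enabled in the kernel (net.ipv4.ip_forward set to 1). If your public address is assigned dynamically, the MASQUERADE target is the usual substitute for SNAT.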
Someday, when IPv6 is widely implemented, we can say good-bye to NAT, except for those times when we really want it. It is useful for stretching the limited pool of IPv4 addresses, and unintentionally provides some security benefits. But, it also creates a host of routing problems. Protocols that have to traverse NAT, like FTP, IRC, SMTP, and HTTP, have all kinds of ingenious hacks built into them to make it possible. Peer protocols like BitTorrent, instant messaging, and session initiation protocol (SIP) are especially challenging to get through NAT.
iptables reads the fields in packet headers, but not the data payload, so it's no good for content filtering.
When you're studying the different protocols, you'll run into conflicting terminology. To be strictly correct, IP and UDP move datagrams, TCP exchanges segments, and ICMP packets are messages. In the context of iptables, most admins just say "packets," though you run the risk of annoying pedantic network engineers. The important part is understanding that every data transmission is broken into a series of packets that travel independently over the network, often taking different routes. Then, when they arrive at their destination, the TCP protocol reassembles them in the correct order. Each packet contains in its headers all the information necessary for routers to forward it to its destination. IP and UDP are unreliable protocols because they do not have delivery confirmations, but this makes them very fast. TCP takes care of delivery confirmations, sequence numbers, and error-checking, so it incurs a bit of overhead, but gains reliability. TCP/IP together are extremely reliable.
If you have any questions about connecting to the Internet or networking hardware basics, read the Introduction to this book.
Do you even need a firewall? Short answer: if you connect to other networks, yes. Ubuntu Linux, for one famous example, does not include a firewall configurator during installation because it installs with no running services. No services means no points of attack. But, I think this is missing an important point: things change, mistakes happen, and layered defenses are a standard best practice. Why let your hosts be pummeled and your LAN congested by outside attacks, even if they are futile? Head all that junk off at your firewall. Even public services benefit from being firewalled. For example, there's no need to subject your web server to the endless SSH attacks and MS SQL Server worms infesting the Internet, so you can block everything but TCP port 80. The same goes for all of your hosts: reduce the load and potential compromises by diverting unwanted traffic before it hits them.
You can take this a step further and fine-tune exactly where you allow incoming traffic to come from. SSH is the poster child for this—if you're not expecting legitimate connection attempts from far-flung lands, write rules to allow only the address ranges or specific addresses that you know are legitimate, and bitbucket the rest.
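The two ideas above, a web server that accepts only TCP port 80 and SSH restricted to known addresses, might be sketched like this for a host firewall. The allowed SSH sources are hypothetical placeholders from address ranges reserved for documentation. As written, the script only prints the commands; run it as root with IPT=iptables to apply them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
IPT="${IPT:-echo iptables}"

# Hypothetical trusted SSH sources; replace with the networks and
# hosts you actually expect connections from.
SSH_ALLOWED="192.0.2.0/24 198.51.100.17"

web_server_rules() {
    # Anything that matches no rule below goes to the bit bucket.
    $IPT -P INPUT DROP
    # Local processes talking to each other over loopback.
    $IPT -A INPUT -i lo -j ACCEPT
    # The one public service: HTTP from anywhere.
    $IPT -A INPUT -p tcp --dport 80 -j ACCEPT
    # SSH only from the listed sources.
    for src in $SSH_ALLOWED; do
        $IPT -A INPUT -p tcp -s "$src" --dport 22 -j ACCEPT
    done
}

web_server_rules
```

This sketch is deliberately stateless, so replies to connections the server itself starts (DNS lookups, package updates) are dropped too. The usual remedy is a connection-tracking rule that accepts ESTABLISHED and RELATED traffic. Be careful applying a DROP policy over a remote session: make sure your own address is on the allowed list first.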
iptables is part of the Netfilter project. Netfilter is a set of Linux kernel hooks that communicate with the network stack. iptables is a command and the table structure that contains the rulesets that control the packet filtering.
iptables is complex. It filters packets by the fields in IP, TCP, UDP, and ICMP packet headers. A number of different actions can be taken on each packet, so the key to iptables happiness is simplicity. Start with the minimum necessary to get the job done, then add rules as you need them. It's not necessary to build vast iptables edifices, and in fact, it's a bad idea, as they are difficult to maintain and will hurt performance.
Policies are the default actions applied to packets that do not match any rules. There are three built-in tables: filter, NAT, and mangle. You will use the filter table the most, the NAT table a little, and the mangle table perhaps not at all (it is for advanced packet manipulation). Each table contains a number of built-in chains. You may also create custom chains. A chain is a list of rules that defines the actions applied to packets. Rules end with a target specification that tells what to do with the packet. This is done with the jump (-j) command, like this simple example that permits all loopback traffic with the ACCEPT target:
iptables -A INPUT -i lo -j ACCEPT
Once a packet reaches the ACCEPT target, that is the end of the road, and it does not traverse any more chains. Rules can be run from the command line or put in a script. This is what each part of this rule means:
iptables = The iptables command
No table is specified, so the default filter table is used
-A INPUT = Append this rule to the built-in INPUT chain
-i lo = Apply this rule to packets arriving on interface lo
-j ACCEPT = Jump to the built-in ACCEPT target, which lets packets continue to their final destinations
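Policies are set with the -P option. The following minimal sketch sets a policy on each built-in chain of the filter table, then repeats the loopback rule with the table named explicitly through -t filter, which is equivalent to leaving the table out. The choice of DROP and ACCEPT policies here is only an example. The script prints the commands by default; run it as root with IPT=iptables to apply them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
IPT="${IPT:-echo iptables}"

policy_rules() {
    # -P sets the policy: what happens to packets that match no rule
    # in the chain.
    $IPT -P INPUT DROP
    $IPT -P FORWARD DROP
    $IPT -P OUTPUT ACCEPT
    # The loopback rule again, this time naming the default table.
    $IPT -t filter -A INPUT -i lo -j ACCEPT
}

policy_rules
```

With these policies, incoming and forwarded packets are dropped unless a rule accepts them, while locally generated packets leave freely.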
iptables does stateful packet inspection, which is done via its connection tracking mechanism. In other words, it knows if a packet is attempting to start a new connection or if it belongs to an existing one. Seeing packets in context is very powerful, and makes it possible to do a lot of work with a few rules. If you are running no public services, you can then easily block all outside attempts to create a connection, because they have no legitimate reason to try to connect to you. When you do run services such as SSH, FTP, or a web or mail server, iptables can allow only traffic targeted for the services you are running, and reject all the rest. You might block all outgoing traffic initiated from your servers because they're only supposed to respond to connection attempts from the outside, not initiate them. These things would be difficult to do without stateful packet inspection.
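Here is a minimal sketch of how connection states appear in rules, using the state match. It shows two of the scenarios above: a host with no public services that refuses every new connection from outside, and a server that may answer but not start connections. The interface name eth0 is an assumption. The script prints the commands by default; run it as root with IPT=iptables to apply them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
IPT="${IPT:-echo iptables}"

WAN_IF="eth0"   # assumed Internet-facing interface

stateful_rules() {
    # Accept packets that belong to, or are related to, connections
    # that are already being tracked.
    $IPT -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    # No public services: drop every new connection attempt arriving
    # from the outside.
    $IPT -A INPUT -i "$WAN_IF" -m state --state NEW -j DROP
    # A server that should only answer: replies may leave, but the
    # host may not start new outbound connections.
    $IPT -A OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    $IPT -A OUTPUT -o "$WAN_IF" -m state --state NEW -j DROP
}

stateful_rules
```

Two rules per direction cover what would otherwise take a long list of port and address matches. Newer iptables releases also provide the conntrack match (-m conntrack --ctstate), which accepts the same state names.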
iptables is extensible with the addition of custom kernel modules, so iptables features vary by Linux distribution and user modifications. To see what your installation supports, check your /boot/config-* file. If you're not thrilled by the notion of managing a bunch of kernel modules (and iptables can use quite a few), build a custom kernel with the iptables functions you want built-in.
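One way to check the kernel configuration is to search it for Netfilter options. This small sketch assumes your distribution installs the running kernel's config as /boot/config-$(uname -r). The function name and the option prefixes it searches for are choices made for this example, and they do not cover every related option.

```shell
#!/bin/sh
# List Netfilter-related options that are enabled in a kernel config.
# "=y" means built into the kernel, "=m" means built as a module.

netfilter_options() {
    # $1: path to a kernel config file
    grep -E '^CONFIG_(NETFILTER|NF_|IP_NF_)[A-Z0-9_]*=[ym]' "$1"
}

# Assumed location of the running kernel's config file.
CONFIG="/boot/config-$(uname -r)"

if [ -r "$CONFIG" ]; then
    netfilter_options "$CONFIG" || echo "No Netfilter options found in $CONFIG" >&2
else
    echo "Cannot read $CONFIG" >&2
fi
```

Options that are disabled appear in the config file as comments ending in "is not set", so they are skipped by this search.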
There are three tables in iptables. Any rules or custom chains that you create will go into one of these tables. The filter table is the default, and is the one you'll use the most. You can think of it as the firewalling portion of iptables. The filter table contains these built-in chains:
INPUT
Processes incoming packets destined for the local host
FORWARD
Processes packets routed through the host
OUTPUT
Processes outgoing packets generated by the local host
The NAT table is used only to change the packet's Source Address field or Destination Address field. If you have a single public, routable IP address in front of a LAN that uses private addresses, which is common, NAT translates the source IP addresses on outgoing packets to the public address. It doesn't matter if you have a hundred hosts sharing the connection—it will appear that all your traffic is coming from a single host. Conversely, you may use it to enable access to public services with private IPs. The NAT table has these built-in chains:
PREROUTING
Alters incoming packets before routing
OUTPUT
Alters locally-generated packets before routing
POSTROUTING
Alters packets after routing
The mangle table lets you alter packet headers as you like. This has a host of uses that we will not cover in this book, but here are a few ideas for inspiration:
Change the TOS field of packets for QoS (there are now better ways for managing QoS, but there it is)
MARK packets to collect statistics for filtering, logging, or routing
Limit packet rate
It has these built-in chains:
PREROUTING
Alters incoming packets before routing
OUTPUT
Alters locally generated packets before routing
INPUT
Alters packets destined for the local machine
FORWARD
Processes packets routed through the host
POSTROUTING
Alters packets on their way out, after routing
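Purely as an illustration of the first two ideas above, here is a minimal sketch of two mangle-table rules. The ports and the mark value 1 are arbitrary choices for this example. The script prints the commands by default; run it as root with IPT=iptables to apply them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
IPT="${IPT:-echo iptables}"

mangle_rules() {
    # Set the TOS field on incoming SSH traffic to request low latency.
    $IPT -t mangle -A PREROUTING -p tcp --dport 22 -j TOS --set-tos Minimize-Delay
    # Tag incoming web traffic with mark 1 (an arbitrary value). The
    # mark lives only inside this host, where later rules or routing
    # policy can match on it.
    $IPT -t mangle -A PREROUTING -p tcp --dport 80 -j MARK --set-mark 1
}

mangle_rules
```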
Packets coming into your network must first pass through the mangle table, then the NAT table, and finally, the filter table.
User-defined chains can improve performance because packets traverse your rules and chains in the order they are listed. Defining your own chains lets you create shortcuts, so packets can jump directly to the chains you want them to traverse, instead of passing through a bunch of irrelevant rules and chains first. Or, you may save some configuration steps by building a custom chain to use over and over.
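As a small sketch of a custom chain that is built once and used over and over, the following creates a chain that logs a packet and then drops it, and jumps to it from two built-in chains. The chain name LOGDROP, the log prefix, the rate limit, and the choice of Telnet (TCP port 23) as the unwanted traffic are all assumptions for this example. The script prints the commands by default; run it as root with IPT=iptables to apply them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
IPT="${IPT:-echo iptables}"

logdrop_rules() {
    # -N creates a new, empty user-defined chain.
    $IPT -N LOGDROP
    # Log at most five matching packets per minute, so that a flood
    # cannot fill the logs.
    $IPT -A LOGDROP -m limit --limit 5/minute -j LOG --log-prefix "fw-drop:"
    # LOG does not end traversal, so the packet goes on to this rule.
    $IPT -A LOGDROP -j DROP
    # Reuse the chain from more than one built-in chain.
    $IPT -A INPUT -p tcp --dport 23 -j LOGDROP
    $IPT -A FORWARD -p tcp --dport 23 -j LOGDROP
}

logdrop_rules
```

Only packets that match one of the jump rules ever enter LOGDROP; everything else carries on through INPUT or FORWARD without touching it.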
While you can customize any Linux distribution any way you like, there are a number of specialized Linux distributions designed to serve as Internet routers and firewalls. They are stripped-down to the essentials. Some are small enough to fit on a floppy disk. Typically, these include iptables, DNS/DHCP servers, secure remote access, intrusion detection, logging, port forwarding, and Internet connection sharing. Here are a few of the more popular ones:
Freesco: the name means FREE ciSCO. It is a free replacement for commercial routers. It supports up to 10 Ethernet/arcnet/Token Ring/arlan network cards, and up to 10 modems. It is easy to set up, and can be run from a single write-protected diskette, or from a hard drive, if you want additional functionality.
An excellent prefab Internet gateway. It has a web-based administration interface, supports SSH and console access, and, in addition to the usual gateway services, it supports dial-up networking and DynDNS.
Sentry runs from a bootable CD, and stores configuration files on a diskette. Set the diskette to read-only, and recovering from an intrusion is as easy as patching the hole and rebooting.
Pyramid Linux, a descendant of the popular Pebble Linux, is maintained by Metrix Communications, and is based on Ubuntu Linux. It is optimized for wireless access points, and serves equally well as a wired-network firewall. The stock installation occupies under 50 MB, so it's perfect for single-board computers without expandable storage. Because it uses stock Ubuntu packages, you can easily add applications by copying the binaries and any dependent libraries from the Ubuntu liveCD.
Bering achieves its small size by using modified libraries. Because it is so customized, you have to rely on the Bering package repositories for additional applications. This shouldn't be a problem for most admins, as they offer a large number of additional packages.
Based on Debian, Voyage can be shrunk to as small as 64 MB, or expanded as desired. It is optimized for wireless access points, routers, and firewalls.
This is a work in progress. It is an interesting Debian implementation that takes a slimmed-down, stock Debian, and adapts it to boot from a flash drive and run entirely in memory.
It is equally important to harden your systems, and a great tool for this is Bastille Linux (http://www.bastille-linux.org/). Bastille is a set of scripts that walk you through a number of steps to harden your entire system. It is designed to be educational and functional. You can run through it a couple of times without actually changing anything, and it also has an undo feature so that you can practice without running the risk of locking yourself out of your system. It examines almost every aspect of your system, including file permissions, PAM settings, services, and remote access.
I cannot guarantee that the recipes in this chapter are crack-proof, or that they will offer perfect protection. No one can make such a claim. Users clamor for easy, point-and-click security, but there is no such thing. Security is an escalating arms race. The well-armed network administrator studies the relevant RFCs and iptables documentation, and keeps up to date with important security news (e.g., the security bulletins for their particular Linux distribution, the Bugtraq mailing list, securityfocus.com, and Bruce Schneier's Crypto-Gram list).