Limiting Damage

When we look at real-world examples of physical security, the two most visible indications that something has been secured are the limitations on access that are in place and the steps that have been taken to limit potential damage. If you look at the front gate of an Army base, you see a checkpoint, one or more armed guards who check credentials, and possibly some speed bumps or gates. These are all there to limit access. But for the more secure areas, you might also see concrete barriers, a concrete bunker, or a series of sandbags. While these do provide some ability to limit access, they are built from strong materials to limit potential damage; sandbags absorb the energy of bullets, and concrete bunkers can withstand explosions that would destroy wooden buildings. The checkpoint is there to limit access; the bunker is there to limit damage.

Even when you are not trying to protect against an attack, the real world provides examples of how to secure an area to limit the effects of unintentional actions. A Formula 1 driver races on a track surrounded by both hard and soft barriers that protect spectators from fast-moving debris in case of a crash, and walls protect the other drivers in case a car loses control and crosses the track. Airports are designed with buffer space around the runway and barriers to stop debris propelled by powerful jet engines. Your home or office has technology to limit damage in case of a fire: gypsum drywall provides a level of fire resistance, and overhead sprinkler systems douse flames with water or other chemicals to control the extent of a blaze.

On a computer, we can limit the effect of damaging attacks through the access control methods discussed earlier, or we can create or configure subsystems that limit the effectiveness of attacks or accidental damage. One way is to alter how the system handles memory and, in doing so, blunt traditional buffer overflow attacks by implementing stack protection or stack canaries. Another method is to configure the system so that execution is limited to known binaries in known locations (for example, by chrooting an FTP service). At the extreme end, we can wrap an entire subsystem in a virtualized environment that runs entirely in software and can be restored to a prior state if something goes awry.
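
The first of these, stack protection, is often something you can switch on at build time. Here's a minimal sketch using GCC's stack-smashing protector (the myserver.c name is just a placeholder, and the exact flags available depend on your GCC version):

# Compile with a canary added to each function's stack frame, so a classic
# stack-based buffer overflow corrupts the canary and aborts the program
# instead of handing control to the attacker
gcc -fstack-protector-all -o myserver myserver.c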

If you are running a web server or an FTP server, one important attack vector to consider is what users are able to upload to your site. Many web applications contain vulnerabilities that may allow users to upload files to common locations—for example, /tmp. If a user manages to upload a program into /tmp and exploits another mechanism to execute that program, you are looking at a compromise of your web server. Therefore, it's common practice to mount any volumes that users can upload files to as noexec. This does not suddenly make your system bulletproof, but it does protect against simple scripted file-drop attacks and is relatively painless.

Edit your /etc/fstab to look something like this:

/dev/hda3        /tmp    ext3    noexec    0       1

Of course, change the device (here, it's hda3) to match your /tmp partition. If you are running your FTP server in a chrooted environment, you can use the same trick to further harden the chroot jail; note that you need a separate partition for each mount point you want to restrict.

#fstab
/dev/hda9        /opt/jail            ext3    defaults          0    0
/dev/hda6        /opt/jail/etc        ext3    noexec,nosuid,ro  0    0
/dev/hda7        /opt/jail/bin        ext3    ro                0    0
/dev/hda8        /opt/jail/data       ext3    noexec,nosuid     0    0
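
You don't need a reboot for new options to take effect on a filesystem that is already mounted; a remount applies them immediately. For example, for the /tmp partition above:

# Remount /tmp with the new options, then confirm noexec is active
mount -o remount,noexec /tmp
mount | grep /tmp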

There is some more information on mounting noexec at http://www.debian-administration.org/articles/57.

Even though almost everyone interacts with it at some level, most Linux users and administrators rarely, if ever, probe directly into the /proc filesystem. Instead, we use tools such as lsmod, lspci, dmesg, ipchains, and iptables to query or change the state of the running kernel. Generally, this is fine, but if you want to lock down a Linux box, it's best to understand what you are actually changing. Thankfully, most of what we are interested in modifying is somewhat self-explanatory, and all of it is documented either in the kernel source tree or on the Web.

First up, let's get some background on what we're doing. As previously stated, the primary location to make changes to a running Linux kernel is within the virtual directory /proc. This isn't an actual directory structure on your hard disk, taking up space; /proc exists in memory and is created and updated dynamically as the system runs.

Open up a shell and cd into the /proc directory. Let's take a look at what's here. On my Gentoo system, I see this:

choad:# cd /proc/
choad:/proc# ls
1/      19341/  2/      5221/  9/         execdomains  mounts@
10/     19345/  20535/  5457/  9400/      fb           mtrr
11/     19380/  20536/  5458/  96/        filesystems  net/
12496/  19383/  20571/  5459/  97/        fs/          partitions
12613/  19385/  22282/  5460/  98/        ide/         scsi/
12614/  19387/  22287/  6/     asound/    interrupts   self@
12621/  19394/  22302/  7/     buddyinfo  iomem        slabinfo
12622/  19396/  22307/  706/   bus/       ioports      stat
12623/  19399/  22342/  720/   cmdline    irq/         swaps
14/     19401/  2491/   721/   config.gz  kallsyms     sys/
15/     19402/  2493/   724/   cpuinfo    kcore        sysrq-trigger
16/     19405/  2523/   725/   crypto     kmsg         sysvipc/
17358/  19407/  2535/   744/   devices    loadavg      tty/
17371/  19409/  2949/   745/   diskstats  locks        uptime
1740/   19411/  3/      746/   dma        meminfo      version
19324/  19415/  4/      8/     dri/       misc         vmstat
19340/  19417/  5/      858/   driver/    modules      zoneinfo
choad:/proc#

Basically, the /proc virtual filesystem provides an organized way to examine and modify the attributes of the running system. Here, we can see a lot of directories with names consisting entirely of numbers. These directories correspond to the IDs of the processes that were running on the system at the exact moment (give or take) that I ran ls. If I take a peek into the 1 directory, I'll see a few virtual files:

choad:/proc# ls 1/
attr/    cwd@     fd/   mounts      oom_score  smaps  status
auxv     environ  maps  mountstats  root@      stat   task/
cmdline  exe@     mem   oom_adj     seccomp    statm  wchan
choad:/proc#

If I want to take a look at, say, some basic information about whatever this 1 process is, I can get its status:

choad:/proc# cat 1/status
Name:   init
State:  S (sleeping)
SleepAVG:       88%
Tgid:   1
Pid:    1
PPid:   0
TracerPid:      0
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 32
Groups:
VmPeak:     1528 kB
VmSize:     1524 kB
VmLck:         0 kB
VmHWM:       488 kB
VmRSS:       376 kB
VmData:      152 kB
VmStk:        88 kB
VmExe:        36 kB
VmLib:      1220 kB
VmPTE:        12 kB
Threads:        1
SigQ:   0/16372
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: fffffffe57f0d8fc
SigCgt: 00000000280b2603
CapInh: 00000000fffffeff
CapPrm: 00000000ffffffff
CapEff: 00000000fffffeff

Here I can see that the process at PID 1 is in fact init, that right now it's sleeping, and that it's running as UID 0 (a.k.a. root). I also see some memory usage statistics and some signal information. This is fascinating if you are planning to write a replacement for, say, ps. The point is, you can readily find information about what the system is running by going directly into the /proc filesystem and poking around. Let's say you suspect that a process is chewing up memory. You can find out the current state of system memory by looking at the contents of /proc/meminfo:

choad:/proc# cat meminfo
MemTotal:      2074820 kB
MemFree:        301296 kB
Buffers:        150832 kB
Cached:        1406284 kB
SwapCached:        696 kB
Active:        1145384 kB
Inactive:       517372 kB
HighTotal:     1178152 kB
HighFree:         1692 kB
LowTotal:       896668 kB
LowFree:        299604 kB
SwapTotal:     1004052 kB
SwapFree:      1002248 kB
Dirty:               0 kB
Writeback:           0 kB
Mapped:         134928 kB
Slab:            97424 kB
CommitLimit:   2041460 kB
Committed_AS:   304884 kB
PageTables:       1052 kB
VmallocTotal:   114680 kB
VmallocUsed:     40188 kB
VmallocChunk:    69108 kB
choad:/proc#
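
To see which individual processes are responsible, you can pull the resident set size (VmRSS) out of each per-process status file we looked at earlier. Here's a quick sketch that lists the ten largest:

# Print resident memory (in kB) and process name for every PID, largest first;
# kernel threads have no VmRSS line and are silently skipped
for pid in /proc/[0-9]*; do
    awk '/^Name:/ {name=$2} /^VmRSS:/ {print $2, name}' "$pid/status" 2>/dev/null
done | sort -rn | head -10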

Now, let's jump back to the /proc directory proper and take a closer look into the /proc/sys directory. Inside /proc/sys you'll find a number of subdirectories:

choad:# cd /proc/sys
choad:/proc/sys# ls
debug/  dev/  fs/  kernel/  net/  proc/  vm/

The two places we'll be focusing on are the /proc/sys/kernel/ and /proc/sys/net/ directories. The other directories are interesting in their own right, but in the interest of space, let's concentrate on the kernel and net portions.
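
Before we dig in, note that every file under /proc/sys also maps to a sysctl key, so there are two equivalent ways to change a value (assuming the standard sysctl utility from procps is installed). Using the hostname entry as an example:

# Write the new value directly into the virtual file...
echo "choad" > /proc/sys/kernel/hostname

# ...or set the same key through sysctl (dots in place of slashes)
sysctl -w kernel.hostname=choad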

Inside /proc/sys/kernel, we see:

choad:# ls /proc/sys/kernel
bootloader_type  max_lock_depth  panic                   sg-big-buff
cad_pid          modprobe        panic_on_oops           shmall
cap-bound        msgmax          pid_max                 shmmax
core_pattern     msgmnb          printk                  shmmni
core_uses_pid    msgmni          printk_ratelimit        sysrq
ctrl-alt-del     ngroups_max     printk_ratelimit_burst  tainted
domainname       osrelease       pty/                    threads-max
fbsplash         ostype          random/                 unknown_nmi_panic
hostname         overflowgid     randomize_va_space      version
hotplug          overflowuid     sem

The file we care about first is modprobe. This file contains the path to the program the kernel runs (via kmod) when it needs to load a kernel module; by default, that path is /sbin/modprobe. Changing this value to a garbage value effectively turns off the kernel's ability to load modules on demand. One neat way to make rootkit installation difficult is to change this value after the system has booted, because rootkits often depend on loading a kernel module to obfuscate their runtime behavior. Unfortunately, this also means that your own programs won't be able to have modules loaded for them as necessary, which can break things. Many implementations of X (whether you're running Xorg, XFree86, or some other X implementation) dynamically load the video card driver into memory when X starts up. The workaround is to preload any necessary modules into your kernel at system boot, and then change the value of /proc/sys/kernel/modprobe as the last step in the boot process. It's kind of a hack, but it does work. To do this on my Gentoo machine, I have to edit two files: /etc/modules.autoload.d/kernel-2.6 (if you are still running a 2.4 kernel, edit the appropriate file instead) and /etc/conf.d/local.start.

My /etc/modules.autoload.d/kernel-2.6 looks like this:

# /etc/modules.autoload.d/kernel-2.6:  kernel modules to load when system boots.
e1000
i810
ehci_hcd
uhci_hcd
usbhid
snd-intel8x0
nvidia

Notice that the nvidia module, the binary driver for my GeForce card, is explicitly preloaded at boot so that when I want to start up X, it's already in place.

Next, edit whatever file init loads last and make sure the last executed line reads:

echo "/bin/false" > /proc/sys/kernel/modprobe

I say "whatever init loads last" because different systems want to do their own thing when running startup scripts. It's unnecessarily confusing, but that's the state of Linux distributions for you. Table 14-1 shows the last init file loaded for various Linux distributions.

If your Linux distro isn't listed in the table, check the documentation that came with it to see in what order your particular init loads its scripts. Beyond that, I have good luck with a simple Google search for "distro_name init script local."
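
Whichever file your distribution runs last, you can confirm after the next reboot that the change took effect:

choad:# cat /proc/sys/kernel/modprobe
/bin/false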

If you are extremely paranoid about someone installing a rootkit on your Linux system, compile all of the drivers you need directly into the kernel and disable dynamic module loading altogether. Be aware that you won't be able to compile in binary-only drivers, particularly video card drivers provided by vendors such as NVidia and ATI. This isn't as bad as it used to be; my experience has been that the open source drivers provided in the Xorg distribution are actually superior for typical 2D desktop operation, not to mention easier to get running with the Xorg configurator.
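
A rough sketch of what that looks like (paths and menu labels can vary a bit between kernel versions):

# In the kernel source tree, turn off loadable module support entirely and
# build every driver you need directly into the kernel image (=y, not =m)
cd /usr/src/linux
make menuconfig              # uncheck "Loadable module support"
grep CONFIG_MODULES .config  # should now show: # CONFIG_MODULES is not set
make                         # rebuild, then install the new kernel as usual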