Chapter 7. Backups

There are so many enemies of your data. When it comes to disks, it’s not a question of whether your hard drives will fail, it’s a question of when. Beyond hard drive failure you find rm, dd, and a number of other Linux commands that are incredibly efficient at destroying your data. Just ask a good friend of mine who was trying to clean up his MP3 directory. A number of us were helping him perfect a find script that would delete all of the files in his MP3 directory that did not end in .mp3. Despite our warnings to test the script with echo first, he ran the full command: find . -type f ! -name '*.mp3' -exec rm -f {} \;. At first it appeared to be working, until he discovered he hadn’t run the command in his MP3 directory—he ran it in ~, his home directory. True, he had cleaned up his MP3 directory, along with the rest of his files. The bottom line is that the only real way to ensure that your data is safe is to back it up.

There are any number of ways to back up data under Ubuntu, and in this chapter I cover a graphical tool called BackupPC. I also discuss some commonsense backup tips and describe how to create a full image of a drive or partition. I include some special considerations for when you’re backing up a database. By the end of the chapter, if you haven’t set up a backup system yet, I hope you will be encouraged by how easy it is under Ubuntu.

Backup Principles

There are a number of principles that should guide you when you set up your backup strategy. Most of these are common sense but bear repeating:

Back up data to a separate system.

That separate system might be a separate drive, a tape, or ideally a completely separate host. The point is not to back up data on a drive to the same drive. You really want your backups to be as far removed from the system as possible—even for my personal data at home I have a backup system in place to copy my most important files to a server out of state. That way, if my house burned down or serious file system corruption hit my server, my important data would still exist.

Test your backups.

If you haven’t successfully restored from backup, you haven’t truly backed anything up. After you set up a backup system, you must make sure that you can restore from it. It’s a good practice to follow up with tests of your restore process periodically afterward. The worst time to find out a backup didn’t work is when you really need a file.

RAID is not a substitute for backups.

A common mistake among beginner administrators is to treat RAID as a backup. RAID provides you with redundancy for hard disks so that if a particular disk fails, your data still remains safe on the other disks. RAID does not protect you from a user deleting a file or, worse, complete file system corruption. In the case of a RAID mirror, if you write bad data to one drive, that bad data will simply be replicated to the second. On top of this, it’s not unheard of for a RAID controller to die and write bad data to the disks as it goes down. In any of these cases if you did not keep a backup that is separate from your RAID, your data would be gone.

Create full and incremental backup schedules.

The majority of files on a server tend to stay the same, particularly when you are talking about the core OS files. For this reason most administrators opt for a combination of full backups (a complete copy of every file) over a longer period of time, such as every week, and incremental backups (only files that have changed since the last backup) over a shorter period of time, usually daily. Since incremental backups generally involve fewer files, they take up less space and are faster to complete. Just keep in mind that if you restore multiple files, there’s a chance that some of the files aren’t included in the latest incremental backup. The safe approach is to restore from the full backup and then all subsequent incrementals if you aren’t sure every file made it into the last backup.

Decide how often to back up.

A common question one might ask is “How often should I back up?” The basic answer is “How much work can you afford to lose?” Many organizations can stand losing up to a day’s work, so they back up nightly. If you can afford to lose only a few hours of work, then you need to back your data up every few hours.

Archive your backups.

While it would be nice to save backups forever, the reality is that backups can consume an incredible amount of space. You may be able to keep only a month’s worth of backups on your system before you run out of space. Even if that is the case, consider archiving old backups to separate storage like a tape, a USB drive, or even DVDs that you label and store in a vault. Many organizations maintain a month’s worth of backups, and then archive off a full backup every month, every quarter, or every year. That way they have a snapshot of their data at that point so even if the backup server itself were to catch fire, there’s still a version of the data available.
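To make the full-plus-incremental idea concrete, here is a minimal sketch that uses GNU tar's --listed-incremental option on a throwaway directory. It assumes GNU tar (the default tar on Ubuntu); the directory layout and file names are made up for the demonstration and are not part of any tool discussed in this chapter.

```shell
#!/bin/sh
# Sketch: one full backup, one incremental backup, then a restore.
# Everything lives in a scratch directory created by mktemp.
set -eu

work=$(mktemp -d)
mkdir -p "$work/data" "$work/backups" "$work/restore"
echo "version 1" > "$work/data/report.txt"
echo "unchanged" > "$work/data/static.txt"

# Full backup: the snapshot file does not exist yet, so every file is archived.
tar --create --file="$work/backups/full.tar" \
    --listed-incremental="$work/backups/state.snar" \
    --directory="$work" data

# Simulate a day's work: only report.txt changes.
sleep 1
echo "version 2" > "$work/data/report.txt"

# Incremental backup: reusing the snapshot file archives only what changed.
tar --create --file="$work/backups/incr1.tar" \
    --listed-incremental="$work/backups/state.snar" \
    --directory="$work" data

# Restore: the full backup first, then each incremental in order.
for archive in full.tar incr1.tar; do
    tar --extract --file="$work/backups/$archive" \
        --listed-incremental=/dev/null \
        --directory="$work/restore"
done

cat "$work/restore/data/report.txt"
```

The restore loop mirrors the safe approach described above: static.txt comes from the full backup and the newer report.txt comes from the incremental.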

Drive Imaging

An image is a complete bit-for-bit copy of a drive. Once you image a drive, its image should be indistinguishable from the original drive. One of the most guaranteed, if wasteful, methods for backing up a system is to take an image of its drives. Even if you don’t use drive imaging as your backup strategy, you will find a number of other circumstances where drive images come in handy, from cloning a system to file system recovery to forensics.


Warning

When imaging a drive, it’s important that the drive not be in use. If the drive changes while you image it, you will not be able to guarantee that the image is consistent, so be sure that any file systems on a drive are unmounted. The requirement that a drive you image not be in use is yet another reason why most people don’t use imaging as their primary backup strategy.


The classic UNIX imaging tool is dd, and you will find it on just about any Linux system and definitely on any Ubuntu server. This straightforward and blunt tool in its most basic form reads an input file bit by bit and copies it to an output file bit by bit. If you had two drives of identical size, /dev/sda and /dev/sdb, here is the command to image sda to sdb:

$ sudo dd if=/dev/sda of=/dev/sdb

Of course, dd can use any file as its input and output file, so instead of imaging to another drive, you could image to a file. This is particularly handy for forensics, when you might have a number of file system images stored on a single large USB drive. Assuming you have mounted your USB drive at /media/disk1, here is how you could image /dev/sda to a file on that drive:

$ sudo dd if=/dev/sda of=/media/disk1/sda-image.img

To restore from this image, you would just reverse the two arguments. Here are the commands to restore the two previous examples:

$ sudo dd if=/dev/sdb of=/dev/sda
$ sudo dd if=/media/disk1/sda-image.img of=/dev/sda
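Since an image is only useful if it really is identical to the source, it is worth comparing the two after dd finishes. The following sketch uses an ordinary file as a stand-in for /dev/sda so that you can try it without root access or a spare drive; the file names and the 1 MiB block size are my own choices for illustration.

```shell
#!/bin/sh
# Sketch: image a source, then verify the copy byte for byte.
set -eu

work=$(mktemp -d)
src="$work/fake-disk"        # stand-in for /dev/sda
img="$work/sda-image.img"    # stand-in for /media/disk1/sda-image.img

# Create 4 MiB of random data to play the role of the drive.
dd if=/dev/urandom of="$src" bs=1048576 count=4 2>/dev/null

# A block size larger than dd's 512-byte default usually speeds up imaging.
dd if="$src" of="$img" bs=1048576 2>/dev/null

# cmp exits non-zero at the first differing byte, so success means a match.
if cmp "$src" "$img"; then
    echo "image matches source"
else
    echo "image differs from source" >&2
    exit 1
fi
```

On a real system you would point src at the unmounted device and run the commands with sudo.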

You can also image individual partitions. This can be useful since you can easily mount the images loopback and read through them. First let’s image a partition on /dev/sda:

$ sudo dd if=/dev/sda1 of=/media/disk1/sda1-image.img

Now you can create a directory, /mnt/temp, and use the loop mount option to mount this image:

$ sudo mkdir /mnt/temp
$ sudo mount -o loop /media/disk1/sda1-image.img /mnt/temp

This is handy when you need to recover only a few files from an image. You can browse /mnt/temp like any other file system and copy individual files or entire directories from it. To copy this image back to the original drive, reverse the arguments once again:

$ sudo dd if=/media/disk1/sda1-image.img of=/dev/sda1

Another useful trick is imaging over the network. The fact is that with some servers you might not have a separate disk attached that can hold an image. One method might be to set up a remote NFS server with plenty of storage. Then you could mount the NFS share on the local system and create an image file that way. Of course, that requires that you have an NFS server set up. Another method is to pipe dd’s output to SSH. Since most servers will probably have SSH, you won’t have to set up anything special to create this image, and all of the data will be transferred over an encrypted channel.

To transfer /dev/sda from the local machine over the network to 10.1.1.5 and dump the image at /media/disk1/sda-image.img, you would type

$ sudo dd if=/dev/sda | ssh username@10.1.1.5 \
"cat > /media/disk1/sda-image.img"

To restore this image:

$ ssh username@10.1.1.5 "cat /media/disk1/sda-image.img" |
sudo dd of=/dev/sda
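If the network is the bottleneck, one option is to compress the stream before it reaches SSH. The sketch below wraps the two pipelines in functions; the function names and the SSH variable (which only exists so the functions can be tried with a stand-in for ssh) are hypothetical, and the remote file name with a .gz suffix is my own convention.

```shell
#!/bin/sh
# image_over_ssh SOURCE USER@HOST REMOTE_FILE
# Sketch: read the source, compress it, and write it on the remote host.
image_over_ssh() {
    dd if="$1" bs=1048576 | gzip -c | "${SSH:-ssh}" "$2" "cat > '$3'"
}

# restore_over_ssh USER@HOST REMOTE_FILE TARGET
# Sketch: stream the compressed image back and write it to the target.
restore_over_ssh() {
    "${SSH:-ssh}" "$1" "cat '$2'" | gunzip -c | dd of="$3" bs=1048576
}

# Example (run as root so dd can access the raw device):
#   image_over_ssh /dev/sda username@10.1.1.5 /media/disk1/sda-image.img.gz
#   restore_over_ssh username@10.1.1.5 /media/disk1/sda-image.img.gz /dev/sda
```

How much this helps depends on the data; a mostly empty drive compresses well, while a drive full of already compressed media will not.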

Database Backups

For the most part, backing up a system is as easy as making a copy of its files. On a database system, however, things aren’t quite so simple. A database often won’t commit changes to disk immediately, so if you simply make a copy of the database files, the database itself might be in an inconsistent state. When you restore it, you can’t necessarily guarantee that it is an uncorrupted copy.

The solution to this problem is to use tools included with the database to provide a consistent dump of the complete database to a file that you can back up. Next I describe how to use the tools provided for MySQL and PostgreSQL databases under Ubuntu.

MySQL

The tool MySQL uses to create a backup of its database is called mysqldump. This tool dumps an entire database or databases to the screen. Most people then redirect the output to a file or pipe it to a tool like gzip to compress it first. For instance, if your user had a database called wordpress, here is how you would back it up:

$ mysqldump wordpress > wordpress_backup.sql

If you wanted to compress the database as it was dumped, you would put a pipe to gzip in the middle:

$ mysqldump wordpress | gzip > wordpress_backup.sql.gz

Now if you wanted to back up more than one database, there are two main ways to do it. The first way is to use the --databases argument followed by a space-separated list of databases to back up. The other method is to use the --all-databases argument, which backs up everything:

$ mysqldump --all-databases > all_databases_backup.sql
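The --databases form can be sketched as a small helper function. The function name and the MYSQLDUMP variable (an override that makes the function easy to try without a running MySQL server) are hypothetical; the mysqldump option itself is the one described above.

```shell
#!/bin/sh
# backup_databases OUTPUT_FILE DB [DB...]
# Sketch: dump a chosen set of databases into a single file.
backup_databases() {
    out=$1
    shift
    "${MYSQLDUMP:-mysqldump}" --databases "$@" > "$out"
}

# Example: back up the wordpress and test databases together.
#   backup_databases two_databases_backup.sql wordpress test
```

A dump made with --databases includes CREATE DATABASE and USE statements, so you do not name a database when you restore it.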

Of course, if you have set passwords for your database users (as you should), these commands won’t work as written. This is especially true if you want to back up all databases, because at least some are owned by the root user. The solution is to use the -u and -p options to specify the user and password to use:

$ mysqldump --all-databases -u root -pinsecure > \
all_databases_backup.sql

The preceding command would back up all of the databases as the root user using the password insecure. I list this example only to say that while this option works, it is insecure. The reason is that the full list of arguments, including the password, will be visible to all users on the system who run the ps command. A better method is to use -p without specifying a password:

$ mysqldump --all-databases -u root -p > \
all_databases_backup.sql

When you specify -p without a password, mysqldump behaves like the mysql command and will prompt you to enter one. This provides good security, but of course it also means that you have to enter the password manually. Most people who back up their MySQL databases set up a cron job to do it at night. The way that MySQL recommends you solve this problem is to add the password to the client section in the ~/.my.cnf file for the user performing the backups. If you don’t already have a ~/.my.cnf file, create a new one and add the following text:

[client]
password=moresecure

Replace moresecure with the password your user will use to log in. Once you set up this file, you don’t need to specify the -p option anymore because mysqldump will pick up the password from this file. Of course, the downside here is that this password is in a plain-text file on the system, so you will want to set its permissions so that only your user can see it:

$ chmod 400 ~/.my.cnf
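If you create the file and only afterward run chmod, there is a brief window in which other users may be able to read it. One way to close that window is to set a restrictive umask before the file is written. This is a sketch only: the CNF variable is my own addition, and it defaults to a scratch location so the example can be run safely; set CNF="$HOME/.my.cnf" for real use.

```shell
#!/bin/sh
# Sketch: create the MySQL option file with tight permissions from the start.
set -eu

CNF="${CNF:-$(mktemp -d)/my.cnf}"

if [ -e "$CNF" ]; then
    echo "$CNF already exists; edit it by hand instead" >&2
else
    # The umask applies only inside this subshell, so the file is created
    # with no access for group or other users.
    (
        umask 077
        printf '[client]\npassword=%s\n' "moresecure" > "$CNF"
    )
    # Make it read-only for the owner, as in the chmod command above.
    chmod 400 "$CNF"
fi

ls -l "$CNF"
```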

Restore MySQL Backups

A backup isn’t much use if you can’t restore from it. To restore a backup on MySQL, use the mysql command-line tool and point it at your backup. For instance, to restore the test database from test_backup.sql, you would type

$ mysql test < test_backup.sql

If instead you were restoring a dump that contains a number of databases, just type

$ mysql < multiple_database_backup.sql

To restore all databases, you need to log in as the root user. Of course, you are a secure MySQL administrator and have set a root password, so you must use the -p option (unless you set up a .my.cnf file, in which case you can leave out -p):

$ mysql -u root -p < all_databases.sql

MySQL Backup Cron Job

Since most people want to perform a MySQL backup at least once a day, here’s a quick and simple way to set up the cron job. First choose the location where you will store your backups. In this example I store the backups in /root because I know only root can read that directory, but you will probably want to store them somewhere else with more space.

The main thing to consider is how many backups you want to keep. If you have some sort of other backup system in place to back up all of your files, you may need to keep only one database backup file on the system, since older versions will be stored on your remote backup server. If you want to store, say, a week’s worth of backups, you can use a simple shell trick. The date command with no arguments can be used to output the current date, but you can add some arguments to it so that it outputs, for instance, only the current day of the week:

$ date +%A
Friday

When you run mysqldump, you can enclose the date command in backticks inside the output filename, and the shell will replace that section of the command with the output of date. So if you were to write

$ mysqldump -u root --all-databases > \
/root/all_databases_backup-`date +%A`.sql

the shell would actually save the database to /root/all_databases_backup-Friday.sql. That means the next day it runs the command it will name it Saturday, and so on. After a week, the new backup will automatically overwrite the one from the previous week without your having to write in any extra shell logic. To make this command run every night, you just have to create a file as root called /etc/cron.daily/mysqlbackup containing the following script:

#!/bin/sh

mysqldump -u root --all-databases > \
/root/all_databases_backup-`date +%A`.sql

Then you would type chmod a+x /etc/cron.daily/mysqlbackup so that the script is executable.

Finally, if you set up a root password for MySQL, you must create a /root/.my.cnf file with the password in it, as discussed earlier. Now every night when the cron.daily scripts run, this script will run as well. If you want to change how many backups you keep, it’s as easy as changing the date command within the backticks. If you want only one backup, you can just save to an ordinary file. If you want to keep a month’s worth of backups, for instance, just replace %A with %d, which lists the day of the month starting with 01.
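To see how the date format decides how many backups accumulate, you can print the file names without running any backup at all. This sketch uses a hypothetical backup_name function and the $(...) form of command substitution, which is equivalent to backticks; the %H line is my own extra example for a job that runs hourly rather than from cron.daily.

```shell
#!/bin/sh
# backup_name FORMAT
# Sketch: build the backup file name for a given date format.
backup_name() {
    echo "/root/all_databases_backup-$(date +"$1").sql"
}

backup_name %A   # weekday name: at most 7 files, overwritten weekly
backup_name %d   # day of month, 01 to 31: about a month's worth
backup_name %H   # hour, 00 to 23: a day's worth of hourly backups
```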

PostgreSQL

PostgreSQL uses a backup mechanism similar to MySQL’s in that it provides a command-line dump tool called pg_dump that dumps a database to standard output. In its simplest form it behaves a lot like the mysqldump command. To back up a database named test, created by your user, you could type

$ pg_dump test > test_backup.sql

The main database user for PostgreSQL is the postgres user, so you are more likely to do backups as that user:

$ sudo -u postgres pg_dump test > test_backup.sql

To back up all PostgreSQL databases, use the pg_dumpall command instead:

$ sudo -u postgres pg_dumpall > all_databases_backup.sql

Restoring PostgreSQL databases works much like MySQL except you use the psql tool. Here is how you would restore each of the backups I did previously:

$ psql test < test_backup.sql
$ sudo -u postgres psql test < test_backup.sql
$ sudo -u postgres psql < all_databases_backup.sql

PostgreSQL Backup Cron Job

The cron job to back up PostgreSQL is very similar to the one for MySQL, except in this case there’s no need to set up any /root/.my.cnf files. You just need to create a new file called /etc/cron.daily/postgresqlbackup containing the following:

#!/bin/sh

/usr/bin/sudo -u postgres /usr/bin/pg_dumpall > \
/root/all_databases_backup-`date +%A`.sql

Then you would make the script executable with chmod a+x /etc/cron.daily/postgresqlbackup. Now every night when the cron.daily scripts run, this script will run as well. Changing how many backups you keep is as easy as changing the date command within the backticks. If you want only one backup, you can just remove the backticks and everything between them. If you want to keep a month’s worth of backups, for instance, just replace %A with %d, which will output the day of the month starting with 01.

BackupPC

One of the simpler but still powerful backup programs for Ubuntu is called BackupPC. BackupPC is written in Perl and can make use of tar and rsync to back up Linux and UNIX hosts, and it can mount and back up SMB shares. Unlike many other backup programs, BackupPC does not necessarily back up a particular machine at the same time every day. This software was designed with networks of desktops that power off at the end of the day in mind, so as you add hosts, it probes them to see if they are up. If BackupPC is able to back them up during the evening backup window it will, but if it can’t, it will attempt to back up the host during the day.

A nice feature of BackupPC is that it not only compresses files it has backed up, it also scans through all of the files daily and, where it sees duplicates, creates a hard link. Since a lot of servers tend to have the same system files, this method means that you can squeeze a lot more data on a lot less disk. BackupPC is packaged for Ubuntu, so you can use your package manager to install it:

$ sudo apt-get install backuppc

BackupPC includes a Web-based interface you can use to manage backup jobs, view logs, and restore files, so it will include the Apache packages it needs if they aren’t already installed. During the install process you will be prompted to select a Web server for BackupPC. Unless you set up your own Web server ahead of time and know what you are doing, select apache2 here. BackupPC uses Apache htpasswd accounts to password-protect the page, and the installer creates a backuppc user and outputs a random password to the screen, so be sure to jot it down. If you forget to do that or forget the password later on, you can use the htpasswd command against the /etc/backuppc/htpasswd file.

After the installer completes, open a Web browser and point it to the /backuppc directory on that host, so if your host was 10.1.1.7, you would point it to http://10.1.1.7/backuppc/. You will be prompted for login credentials, so use the login and password you were given during the install. Once you are logged in, you will see the default BackupPC admin page as shown in Figure 7-1.

Figure 7-1 Default BackupPC Web interface

BackupPC Storage

As with any other backup server, BackupPC needs a lot of storage. All of the backups are ultimately stored under /var/lib/backuppc, so if you have a separate disk (or set of disks in a RAID) for BackupPC, this is a good place to mount it. Let’s assume you have a second SCSI partition at /dev/sdb1 that you want to use for BackupPC. First move the old directory out of the way, mount the new drive, and copy over the current /var/lib/backuppc files. BackupPC must be stopped while you do this so that it doesn’t write to that directory while you’re changing it:

$ sudo service backuppc stop
$ sudo mv /var/lib/backuppc /var/lib/backuppc.orig
$ sudo mkdir /var/lib/backuppc
$ sudo chown backuppc:backuppc /var/lib/backuppc
$ sudo mount /dev/sdb1 /var/lib/backuppc
$ sudo rsync -av /var/lib/backuppc.orig/ /var/lib/backuppc/
$ sudo service backuppc start

Finally don’t forget to add the new /dev/sdb1 mount point into /etc/fstab so it will mount automatically the next time the system boots.
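As a sketch, the /etc/fstab entry might look like the following. The ext4 file system type and the defaults option are assumptions on my part; use whatever file system you actually created on /dev/sdb1.

```
# <device>   <mount point>        <type>  <options>  <dump>  <pass>
/dev/sdb1    /var/lib/backuppc    ext4    defaults   0       2
```

Device names can change between boots when disks are added or removed, so you may prefer to reference the partition by the UUID that sudo blkid /dev/sdb1 reports.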

Default BackupPC Configuration

Of course, the default Web interface isn’t very useful until you add a host. BackupPC’s default behavior is defined in /etc/backuppc/config.pl, its core configuration file. If you are unfamiliar with Perl, this file may seem a bit daunting at first as all of the options are configured in Perl data structures. I walk you through adding an Ubuntu host that you will back up with rsync, and as you will see, once you get the core configuration file set, it is relatively simple to add hosts.

The config.pl file defines the default settings for all hosts BackupPC backs up, such as how often to back up, what directories to back up, whether to use smb, rsync, or tar to back up, and even what arguments to pass to those commands. What you want to do is generate a config.pl that works for the majority of your hosts and then create host-specific configuration files when a host needs special options. Any host-specific configuration goes into a .pl file under /etc/backuppc/ named after the host. So if you had a host named web1 and wanted to change some settings just for it, you would copy those specific options from /etc/backuppc/config.pl to /etc/backuppc/web1.pl. Any options you set in web1.pl will override anything in config.pl when web1 is being backed up.

In this example I assume a network mostly made up of other Ubuntu servers and use rsync for the backup. This matters because by default BackupPC logs in over SSH as root. On a network of Ubuntu servers this wouldn’t work, because the root account is disabled by default, so we need to change some settings.

There are two different ways to edit the BackupPC configuration. The first (and easier) is through the Web interface; with the second you just open /etc/backuppc/config.pl with a text editor and locate and tweak settings directly.

Web-Based Configuration

You can actually change all of the options you need directly from the Web interface. From the BackupPC home page click the Edit Config link in the left pane and then click the Xfer link along the top of the right pane. You will then see a configuration screen like the one in Figure 7-2. On the XferMethod drop-down menu change from smb to rsync. Then scroll down until you see the RsyncClientCmd and RsyncClientRestoreCmd options. Change both of them from

$sshPath -q -x -l root $host $rsyncPath $argList+

to

$sshPath -q -x -l backuppc $host sudo $rsyncPath $argList+

Figure 7-2 BackupPC Xfer configuration options

Then scroll back up to the top of the page and click the Save button. Once you are done with all of your changes, click the Admin Options link in the left sidebar and then click Reload so BackupPC can read your new settings.

Command-Line Configuration

While the Web interface provides an easy way to configure BackupPC, some people prefer doing it all through the command line. If you are one of those people, open the /etc/backuppc/config.pl file in your preferred text editor and then search for the line that matches this one:

$Conf{XferMethod} = 'smb';

This option defines the default method BackupPC uses to transfer files. SMB might work well for a network of Windows machines, but since we have Ubuntu hosts, we change this to rsync:

$Conf{XferMethod} = 'rsync';

Next, we need to set BackupPC so that it logs in to each machine as a regular user and then uses sudo to become root. We create a backuppc user on each host along with a secure sudo role so that BackupPC can log in and back up the machine. First locate the following line:

$Conf{RsyncClientCmd} = '$sshPath -q -x -l root $host $rsyncPath $argList+';

This defines what command BackupPC uses when it backs up with rsync. Change it to

$Conf{RsyncClientCmd} = '$sshPath -q -x -l backuppc $host sudo $rsyncPath $argList+';

We need to do the same thing for the command BackupPC uses when it restores to a host, so find the line that matches

$Conf{RsyncClientRestoreCmd} = '$sshPath -q -x -l root $host $rsyncPath $argList+';

and change that along the same lines as the previous option:

$Conf{RsyncClientRestoreCmd} = '$sshPath -q -x -l backuppc $host sudo $rsyncPath $argList+';

By default, BackupPC backs up the entire root file system along with all mounted file systems. I’m leaving this setting alone for now because I do want to back up all of the files on the host, but I discuss how to change it later in the chapter.

Configure the Client Machine

Now that you have changed the config.pl option, you are ready to set up BackupPC so that it can log in and back up your client. In this example we call our client web1, so where you see web1 listed in the example, replace it with your client’s hostname.

Configure SSH Keys

Since BackupPC needs to be able to log in to hosts without interaction, you must set up passwordless SSH keys for the backuppc user. On the BackupPC server type

$ sudo -u backuppc ssh-keygen -t rsa

Hit Enter at each of the prompts to accept the defaults. The public and private keys will be stored at /var/lib/backuppc/.ssh/.

Set Up the Client

Now log in to your client and create a backuppc user (hit Enter when prompted for the name and room number and other information about the user) and create a .ssh directory for the same user:

$ sudo adduser backuppc --disabled-password
$ sudo mkdir /home/backuppc/.ssh
$ sudo chown backuppc /home/backuppc/.ssh


Tip

If your client does not yet have an SSH server running, then run sudo apt-get install openssh-server.


Now you need to copy the contents of the /var/lib/backuppc/.ssh/id_rsa.pub file from your BackupPC server to the /home/backuppc/.ssh/authorized_keys file. One way to do this is to log in to both machines on separate terminals, open both files, and then use your mouse to copy and paste between them. Another method is to use scp on the server to copy the file to the /tmp directory on the client and then log in to the client and copy it from there.

On the server:

$ sudo scp /var/lib/backuppc/.ssh/id_rsa.pub user@web1:/tmp/

Replace user@web1 with the username and hostname on the client. Then on the client:

$ sudo sh -c "cat /tmp/id_rsa.pub >> \
/home/backuppc/.ssh/authorized_keys"

Now you should be able to go to the BackupPC server and log in to the client as the backuppc user without a password:

$ sudo -u backuppc ssh web1

Configure sudo

Now we need to configure sudo on the client machine so that the backuppc user can run rsync as root without a password. To do this, run sudo visudo on the client and add the following line to the /etc/sudoers file:

backuppc ALL=(root) NOPASSWD:/usr/bin/rsync

Add the Client to BackupPC

Now that BackupPC can log in to the client and run rsync as root, we are ready to add it to the list of hosts BackupPC backs up. All of the hosts are defined in /etc/backuppc/hosts, and you can add hosts either by editing the file directly or via the Web interface.

Web Interface

To add a host in the Web interface, click Edit Config in the left sidebar and then click the Hosts link on the top of the right pane. Click the Add button to add a new host, and once you are finished, click the Save button. Finally, click Admin Options in the left sidebar and then the Reload button so BackupPC will reload the changes.

Command Line

To add a host via the command line, open /etc/backuppc/hosts in a text editor and add the following line at the bottom:

web1 0 backuppc

Change web1 to the hostname of the server you want to back up. The 0 tells BackupPC that this host has a static IP address, and the backuppc at the end sets what user can manage this host on the Web interface. I just used a single space in this example, but you can separate the columns with multiple spaces so everything lines up and looks nicer. If you wanted other users to also be able to back up and restore this host from the Web interface, you could add a fourth column to this line and list those users separated by commas. So if I had two users, allan and jorge, that I wanted to be able to manage web1, the line would read

web1 0 backuppc allan,jorge

Once you save the changes to /etc/backuppc/hosts, tell BackupPC to reload its configuration with sudo service backuppc reload or sudo /etc/init.d/backuppc reload.

Start the First Backup Job

Once the BackupPC program reloads its configuration, go back to the BackupPC Web interface and reload the page. You should be able to see your host in the “Select a host...” drop-down menu on the left side of the page. Select that option and you will see the default host page as shown in Figure 7-3. To test that everything is set up correctly, click Start Full Backup to initiate the first backup for the host. Then you can click Status in the left sidebar to go to the main status page and see that your backup job has started. It should look something like Figure 7-4. To stop a job for a particular host, go to that host’s page and then click the Stop/Dequeue Backup button.

Figure 7-3 BackupPC host management Web page

Figure 7-4 BackupPC full backup running

rsync Tweaks

While most of the default rsync options should be fine for the average user, there are a few extra options you might want to enable depending on your environment.

Checksum Seed

If your host is relatively new, it should have a version of rsync greater than or equal to 2.6.3 (if you aren’t sure what version you have, type rsync --version). If so, you can take advantage of the --checksum-seed option, which can cache rsync’s checksums and overall speed up the rsync process. To do this you need to add that option to the RsyncArgs and RsyncRestoreArgs option in your BackupPC configuration.
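If you have many hosts to check, the version comparison can be scripted. The following is a rough sketch: the function names are hypothetical, and the sed expression assumes that the first line of rsync --version has the form "rsync  version X.Y.Z" (optionally with a leading v before the number), which may not hold for every build.

```shell
#!/bin/sh
# version_ge A B
# Sketch: succeed when dotted version A is >= B (compares up to three fields).
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" |
        sort -t . -k 1,1n -k 2,2n -k 3,3n | head -n 1)
    [ "$lowest" = "$2" ]
}

# Pull the version number out of the first line of `rsync --version`.
rsync_version() {
    rsync --version |
        sed -n '1s/^rsync  *version v\{0,1\}\([0-9][0-9.]*\).*/\1/p'
}

# Only report when rsync is actually installed on this machine.
if command -v rsync >/dev/null 2>&1; then
    if version_ge "$(rsync_version)" 2.6.3; then
        echo "rsync $(rsync_version): --checksum-seed is available"
    else
        echo "rsync $(rsync_version): too old for --checksum-seed"
    fi
fi
```

Remember that the check needs to run on each client you back up, since that is where BackupPC invokes rsync.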

To add this setting from the Web, click Edit Config in the sidebar, then the Xfer link. Then scroll down to the RsyncArgs option where you can see each individual option on its own line. At the end of these options is an Add button. Click that and add --checksum-seed=32761. Then scroll down and add the same option to the RsyncRestoreArgs section.

To make the same change on the command line, open /etc/backuppc/config.pl in a text editor and find the line that starts with $Conf{RsyncArgs}. Each rsync option is on its own line, but if you are still using the config.pl that came with the package, you will see this option commented out:

#'--checksum-seed=32761',

Just remove the # from the beginning of that option. If you don’t see the commented-out option, just add a line below the last RsyncArgs option that reads

'--checksum-seed=32761',

Then move down to the RsyncRestoreArgs section (generally it’s the next option) and do the same thing for this option.

One File System

By default, BackupPC traverses all of the file systems on the host and backs up absolutely everything. There are circumstances when you might not want that to be the default behavior, especially in a cluster when you can easily replace the main system files by rebuilding the host. In these circumstances what you want to do instead is tell rsync to stick to one file system at a time, and then specify which mount points BackupPC should back up.

The first step is relatively simple because this option goes in the same place as the --checksum-seed option. Follow the steps I described earlier to add the --checksum-seed option to RsyncArgs and RsyncRestoreArgs, in either the Web interface or the command line. This time, though, the option you add is

--one-file-system

Once you set that option, you must define each file system that BackupPC will back up. In this example, let’s assume that you have /home and /var on separate partitions and want to back up only them. The option you will change is called RsyncShareName. In the Web interface return to the Xfer configuration screen you used to add --one-file-system to RsyncArgs. Above that section you will see the section named RsyncShareName. Each share is on its own line, as with RsyncArgs. First you change the first option from / to /home. Then you click the Add button and add a new share named /var.

To change the same option on the command line, open config.pl, find the line that looks like

$Conf{RsyncShareName} = '/';

and change that to

$Conf{RsyncShareName} = ['/home', '/var'];

Exclude Directories

You might find that you typically want to back up all of the files on the host apart from a few different directories. For instance, in the preceding example, you might want to back up /home and /var, but perhaps you want to skip /var/spool/mail and /var/tmp. To do this you go back to the Xfer configuration screen on the Web interface and scroll down to the BackupFilesExclude option. Then you type in /var/tmp under the New Key field and click Add. Once the screen refreshes, you can scroll down and add /var/spool/mail the same way.

If you want to change this on the command line instead, you search through the file until you see a line that looks like this:

$Conf{BackupFilesExclude} = undef;

By default no files are excluded. To add the two directories you change that option to

$Conf{BackupFilesExclude} = ['/var/tmp', '/var/spool/mail'];

As with any configuration changes, once you have changed everything, don’t forget to save and then reload BackupPC so the changes take effect.
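BackupPC also accepts a hash for BackupFilesExclude, keyed by share name, which can be handy once RsyncShareName lists more than one share. The following is a minimal sketch of that form for the /var share from the earlier example. I have written the paths relative to the share on the assumption that this is how BackupPC hands them to rsync, so check the comments in your own config.pl before relying on it:

```perl
$Conf{BackupFilesExclude} = {
    # The key is a share name from RsyncShareName.
    # Assumption: paths are relative to that share, so '/tmp' here
    # is meant to match /var/tmp on the host. Verify on your version.
    '/var' => ['/tmp', '/spool/mail'],
};
```

Shares without a key of their own, such as /home here, would have nothing excluded.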

Host-Specific Tweaks

While you can set up a default BackupPC config that works for most of your hosts, you will likely run into a few machines that need something slightly different from the default. With BackupPC it’s particularly easy to branch off from the default config and customize options. For instance, you might want to apply the --one-file-system option or exclude directories only on one host.

To add custom options for a particular host, copy those options from the /etc/backuppc/config.pl file into /etc/backuppc/hostname.pl, where hostname is the name of the host you want to change. So if, for instance, you wanted to back up only /home and /var and not traverse file systems on the host called web1, you would copy the entire RsyncArgs and RsyncRestoreArgs options into a file named /etc/backuppc/web1.pl along with the RsyncShareName option. The result would look something like this:

$Conf{RsyncShareName} = ['/home', '/var'];
$Conf{RsyncArgs} = [
     #
     # Do not edit these!
     #
     '--numeric-ids',
     '--perms',
     '--owner',
     '--group',
     '-D',
     '--links',
     '--hard-links',
     '--times',
     '--block-size=2048',
     '--recursive',

     # my custom options
     '--one-file-system',
];
$Conf{RsyncRestoreArgs} = [
     #
     # Do not edit these!
     #
     '--numeric-ids',
     '--perms',
     '--owner',
     '--group',
     '-D',
     '--links',
     '--hard-links',
     '--times',
     '--block-size=2048',
     '--recursive',

     # my custom options
     '--one-file-system',
];
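The same approach works for the other case mentioned above, where only one host needs to skip a few directories. As a rough sketch, and reusing the two directories from the Exclude Directories section, a host file that overrides nothing but the exclude list could be as short as this:

```perl
# /etc/backuppc/web1.pl: host-specific overrides for the host web1.
# Any option not set here falls back to /etc/backuppc/config.pl.
$Conf{BackupFilesExclude} = ['/var/tmp', '/var/spool/mail'];
```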

Change the Backup Schedule

By default, BackupPC takes one full backup per week and in between takes an incremental backup of every host. BackupPC keeps one full backup and six incremental backups before it deletes anything. Finally, BackupPC will not start up new jobs for hosts that are always on the network between 7:00 a.m. and 7:30 p.m. during the week. These defaults are not suitable for everyone. For instance, you might be required to keep a month’s worth of full backups, or everyone might be out of the office by 6:00 p.m. so you can start backups then. All of these options are easy to change in the BackupPC Web interface.

To start, click the Edit Config link in the left sidebar and then click the Schedule link on the top right-hand side of the screen. You will see a schedule-editing screen as shown in Figure 7-5. The first set of options lets you schedule your full backups. Each of the options is hyperlinked to a manual page so you can read about what they change.

Figure 7-5 BackupPC schedule configuration screen

The FullPeriod option defines how much time should pass (in days) before a new full backup should be scheduled. This option is normally set slightly below a whole number, because the time the backup itself takes pushes the actual interval a little longer. With the default of 6.97, a full backup is scheduled every seven days. The FullKeepCnt and FullKeepCntMin options configure how many full backups to keep and the minimum number to keep, respectively, and FullAgeMax defines the maximum number of days before an old full backup is deleted. Incremental backups can save backup resources as they back up only what has changed since the last full backup. They take most of the same options as the full backups, and by default they are run every day and the last six are saved.
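If you prefer to work in config.pl, the same schedule options appear there as ordinary $Conf entries. The sketch below shows how they look with the defaults described above. The FullKeepCntMin, FullAgeMax, and IncrPeriod values are what I recall the package shipping with, so treat them as assumptions and compare against your own file:

```perl
# Full backups
$Conf{FullPeriod}     = 6.97;  # days between full backups (just under 7)
$Conf{FullKeepCnt}    = 1;     # how many full backups to keep
$Conf{FullKeepCntMin} = 1;     # assumed default: never keep fewer than this
$Conf{FullAgeMax}     = 90;    # assumed default: days before an old full is deleted

# Incremental backups
$Conf{IncrPeriod}     = 0.97;  # assumed default: just under one day
$Conf{IncrKeepCnt}    = 6;     # keep the last six incrementals
```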

The Blackouts section of this page lets you define BackupPC’s blackout period. The blackout period is the range of time during which BackupPC will not attempt to back up hosts that are always on the network. This way, if you have desktops that might be powered off in the evening, BackupPC will back them up during the day, but for servers that are always on, BackupPC knows it can wait until the evening when they are presumably under less load. When you add a new host to BackupPC, it will try to ping it periodically to determine whether it is always on the network. If it is, BackupPC will back it up only outside the blackout period. The hourBegin option defines what hour or fraction of an hour the blackout period begins, and the hourEnd option sets when it ends. The weekDays option sets which days of the week the blackout period is in effect. By default the blackout period is between 7:00 a.m. and 7:30 p.m. Monday through Friday.

To demonstrate how you would change these options, I define a different backup policy that might be used in an organization. In this organization we want weekly full backups and daily incremental backups, but we want to save the full backups up to a month before discarding them. We also want to keep the last two weeks of incremental backups. Finally, everyone in the office leaves by 6:00 p.m., so we want to start backups then. To make these changes I need to change only the following values:

FullKeepCnt = 4
IncrKeepCnt = 12
hourEnd = 18
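For reference, here is roughly how that policy would look if you made the changes in config.pl instead of the Web interface. This is a sketch rather than a copy of any shipped file: the blackout settings live inside the BlackoutPeriods option, and I am assuming hourBegin and weekDays are left at the defaults described earlier (7:00 a.m., Monday through Friday, with days numbered from 0 for Sunday):

```perl
$Conf{FullKeepCnt} = 4;    # about a month of weekly full backups
$Conf{IncrKeepCnt} = 12;   # two weeks of daily incrementals between fulls

$Conf{BlackoutPeriods} = [
    {
        hourBegin => 7.0,              # unchanged default
        hourEnd   => 18.0,             # blackout now ends at 6:00 p.m.
        weekDays  => [1, 2, 3, 4, 5],  # Monday through Friday
    },
];
```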

Restore Files

Backing up files is all well and good, but it isn’t too useful unless you can restore them. One of the best features of BackupPC in my opinion is its easy-to-use Web-based restore. If you set up additional Web accounts for backuppc (use htpasswd /etc/backuppc/htpasswd username; leave off the -c flag, which creates a new file and would overwrite the existing accounts), you can add those accounts to specific hosts in the /etc/backuppc/hosts file (or on the Web interface) and those users can log in to BackupPC and restore their own files.

To restore files or directories for a host, first go to the host’s home page on BackupPC (select the host in the drop-down menu in the left sidebar). On the host’s main page you will see a list of completed backups in a table that lists whether the backup was full or incremental, when the backup started, how long it took, and where those files are stored on the file system. Click the hyperlinked backup number at the beginning of a particular row to restore from that backup. You will then see the entire directory structure of your host on the page with checkboxes next to files and directories (see Figure 7-6). The interface is like that of most file managers, and you can click on an individual directory to expand it. Once you have selected all of the files you wish to restore, click the “Restore selected files” button at the top or bottom of the page.

Figure 7-6 BackupPC restore file browser

BackupPC provides you with three different restore options:

Image Direct restore

In a direct restore, BackupPC restores the files or directories directly to a host. You can actually choose which host to restore to, as BackupPC lets you choose any host it has configured. By default BackupPC restores to the share and directory below the share that the files originally came from, so if you want to overwrite or replace what is currently there, you can just click the Start Restore button. You can even completely overwrite the entire / directory on the remote host with the full backup if you need to. Instead of restoring the files to their original directory, you could also restore to a different directory if you wanted to compare the two files.

Image Download zip archive

This option is very useful when you back up Windows desktops and allow your users to restore their own files. Instead of restoring a file directly to a host, you can instead generate a .zip file that contains all of your restored files and directories and download it to your current computer.

Image Download tar archive

This and the zip archive option are essentially the same, except in this case you get a tar archive instead, which is more useful for Linux desktops.

Ubuntu BackupPC Conventions

The following list details the common directories and files BackupPC uses, including where it stores configuration files and where it logs.

Image /etc/backuppc

This directory contains all of the configuration files for BackupPC, including its Apache config and any host-specific configuration files.

Image /etc/backuppc/config.pl

All of the default BackupPC options are set in the config.pl file. The version of config.pl that comes with the package by default is full of documentation that explains each option and gives examples. Any host-specific configuration goes into a separate file named after the host and ending in .pl.

Image /etc/backuppc/hosts

All of the hosts that BackupPC will back up are defined here.

Image /etc/backuppc/htpasswd

Web users along with their passwords are set in this file by default. This is a standard Apache password file, so you can use the htpasswd command to make changes (type man htpasswd for details on how to use the program).

Image /etc/backuppc/apache.conf

This file defines all of the virtual host settings for the BackupPC Web administration page. BackupPC creates a symlink from /etc/apache2/conf.d/backuppc.conf to this file, although these days it would fit better under /etc/apache2/sites-available.

Image /etc/init.d/backuppc

Here is BackupPC’s init script. BackupPC starts the service by default once it is installed and automatically starts at boot time.

Image /var/lib/backuppc

This directory contains all of the files that BackupPC backs up, so you should consider putting this directory on a separate large mount point or at least make sure it has plenty of free space.

Image /var/lib/backuppc/log

All of the logs for each backup can be found under this directory. You can also access the logs from the Web interface.

Image /var/lib/backuppc/pc

Each host has its own directory here that contains its latest set of files. BackupPC pools together identical files from multiple hosts with hard links, but if you have removed a host from BackupPC and want to delete its files as well, delete the host’s directory under here first.