9. Periodic Processes

Scripting and automation are the keys to consistency and reliability. For example, an adduser program can add new users faster than you can, with a smaller chance of making mistakes. Almost any task can be encoded in a Perl or Python script.

It’s often useful to have a script or command executed without any human intervention. For example, you might want to have a script verify (say, every half-hour) that your network routers and switches are working correctly, and have the script send you email when problems are discovered.¹

9.1 CRON: Schedule Commands

The cron daemon is the standard tool for running commands on a predetermined schedule. It starts when the system boots and runs as long as the system is up.

cron reads configuration files that contain lists of command lines and the times at which they are to be invoked. The command lines are executed by sh, so almost anything you can do by hand from the shell can also be done with cron.²

A cron configuration file is called a “crontab,” short for “cron table.” Crontabs for individual users are stored under /var/spool/cron. There is (at most) one crontab file per user: one for root, one for jsmith, and so on. Crontab files are named with the login names of the users to whom they belong, and cron uses these filenames to figure out which UID to use when running the commands contained in each file. The crontab command transfers crontab files to and from this directory.

Although the exact implementations vary, all versions of cron try to minimize the time they spend reparsing configuration files and making time calculations. The crontab command helps maintain cron’s efficiency by notifying cron when the crontabs change. Ergo, you shouldn’t edit crontab files directly, since this may result in cron not noticing your changes. If you do get into a situation where cron doesn’t seem to acknowledge a modified crontab, a HUP signal will force it to reload on most systems.

See Chapter 11 for more information about syslog.

cron normally does its work silently, but most versions can keep a log file (usually /var/cron/log or /var/adm/cron/log) that lists the commands that were executed and the times at which they ran. See Table 9.2 on page 287 for logging defaults.

On some systems, creating the log file enables logging, and removing the log file turns logging off. On other systems, the log is turned on or off in a configuration file. Yet another variation is for cron to use syslog. The log file grows quickly and is rarely useful; leave logging turned off unless you’re debugging a specific problem or have specific auditing requirements.

9.2 The Format of Crontab Files

All the crontab files on a system share a similar format. Comments are introduced with a pound sign (#) in the first column of a line. Each noncomment line contains six fields and represents one command:

minute hour dom month weekday command

The first five fields tell cron when to run the command. They’re separated by whitespace, but within the command field, whitespace is passed along to the shell. The fields in the time specification are interpreted as shown in Table 9.1.

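Table 9.1. Crontab time specifications

Field     Description          Range
minute    Minute of the hour   0 to 59
hour      Hour of the day      0 to 23
dom       Day of the month     1 to 31
month     Month of the year    1 to 12
weekday   Day of the week      0 to 6 (0 = Sunday)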

Each of the time-related fields may contain

• A star, which matches everything

• A single integer, which matches exactly

• Two integers separated by a dash, matching a range of values

• A range followed by a slash and a step value, e.g., 1-10/2 (Linux only)

• A comma-separated list of integers or ranges, matching any value

For example, the time specification

45 10 * * 1-5

means “10:45 a.m., Monday through Friday.” A hint: never put a star in the first field unless you want the command to be run every minute.
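The matching rules listed above can be sketched as a small shell function. This is an illustration only, not how any real cron parses its tables. The name cron_field_matches and its optional MIN argument are made up for the example, and the function does no input validation and accepts no month or day names. It also accepts a step after a star, which Vixie-cron permits; that form counts from the field's minimum value, which is why MIN exists.

```shell
#!/bin/sh
# cron_field_matches FIELD VALUE [MIN]
# Succeeds (exit status 0) if the integer VALUE satisfies the crontab time
# field FIELD. MIN is the smallest legal value for the field (default 0);
# it matters only for the "*/step" form.
cron_field_matches() {
    field=$1
    value=$2
    min=${3:-0}

    # Split the field on commas. Globbing is switched off so that a bare
    # star is not expanded into a list of filenames.
    old_ifs=$IFS
    IFS=,
    set -f
    set -- $field
    set +f
    IFS=$old_ifs

    for item in "$@"; do
        step=1
        case $item in
            */*) step=${item##*/}; item=${item%/*} ;;   # peel off "/step"
        esac
        case $item in
            '*') low=$min;        high=$value ;;        # star: everything
            *-*) low=${item%-*};  high=${item#*-} ;;    # range low-high
            *)   low=$item;       high=$item ;;         # single integer
        esac
        if [ "$value" -ge "$low" ] && [ "$value" -le "$high" ] &&
           [ $(( (value - low) % step )) -eq 0 ]; then
            return 0
        fi
    done
    return 1
}
```

For instance, cron_field_matches '1-5' 3 succeeds and cron_field_matches '1-10/2' 4 fails, since the stepped range selects only 1, 3, 5, 7, and 9.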

There is a potential ambiguity to watch out for with the weekday and dom fields. Every day is both a day of the week and a day of the month. If both weekday and dom are specified, a day need satisfy only one of the two conditions in order to be selected. For example,

0,30 * 13 * 5

means “every half-hour on Friday, and every half-hour on the 13th of the month,” not “every half-hour on Friday the 13th.”
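When the intent really is “Friday the 13th,” a common workaround is to let cron select the day of the month and have the command itself test the day of the week. The entry below is a sketch under a few assumptions: the command path is a placeholder, date is assumed to support the %u format, and the percent sign is escaped with a backslash because cron otherwise treats % specially in the command field (Vixie-cron and most other versions honor the escape).

```
# Every half-hour on the 13th, but only when the 13th is a Friday.
# date +%u prints the weekday as 1-7 with Monday as 1, so Friday is 5.
0,30 * 13 * * [ "$(date +\%u)" -eq 5 ] && /usr/local/bin/checkservers
```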

The command is the sh command line to be executed. It can be any valid shell command and should not be quoted. The command is considered to continue to the end of the line and may contain blanks or tabs.

Although sh is involved in executing the command, the shell does not act as a login shell and does not read the contents of ~/.profile or ~/.bash_profile. As a result, the command’s environment variables may be set up somewhat differently from what you expect. If a command seems to work fine when executed from the shell but fails when introduced into a crontab file, the environment is the likely culprit. If need be, you can always wrap your command into a script that sets up the appropriate environment variables.
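Both halves of this advice can be sketched in shell. The first function runs a command line with nearly everything stripped from the environment, which is a quick way to see whether a job depends on settings from your login shell. The second shows the wrapper idea. The function names and the particular variable values are assumptions made for the example; real crons differ in exactly what they put in a job's environment.

```shell
#!/bin/sh
# as_cron 'COMMAND LINE'
# Run a command line through sh with a sparse environment, roughly the way
# cron would. The four variables set here are illustrative guesses.
as_cron() {
    env -i HOME="$HOME" LOGNAME="${LOGNAME:-$(id -un)}" \
        PATH=/usr/bin:/bin SHELL=/bin/sh \
        /bin/sh -c "$1"
}

# with_env COMMAND [ARG...]
# The wrapper approach: set up what the job needs, then run it. The
# subshell keeps the changes from leaking into the caller.
with_env() {
    (
        PATH=/usr/local/bin:/usr/bin:/bin
        LANG=C
        export PATH LANG
        "$@"
    )
}
```

If a job works from your prompt but fails under as_cron, the difference between the two environments is the first place to look.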

Percent signs (%) indicate newlines within the command field. Only the text up to the first percent sign is included in the actual command. The remaining lines are given to the command as standard input.
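The effect of the percent signs can be approximated in a few lines of shell. The function below is a rough emulation for experimenting at the prompt, not cron's actual code: cron_run is a made-up name, and the sketch makes no attempt to honor percent signs escaped with a backslash.

```shell
#!/bin/sh
# cron_run 'COMMAND FIELD'
# Approximate what cron does with a crontab command field: the text before
# the first percent sign is the command, and the rest becomes standard
# input with each remaining percent sign turned into a newline.
cron_run() {
    field=$1
    case $field in
        *%*)
            cmd=${field%%\%*}      # everything before the first %
            input=${field#*%}      # everything after the first %
            printf '%s\n' "$input" | tr '%' '\n' | sh -c "$cmd"
            ;;
        *)
            sh -c "$field" </dev/null
            ;;
    esac
}
```

For example, cron_run 'cat%first line%second line' runs cat and feeds it two lines of input.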

Here are some examples of legal crontab commands:

And below are some additional examples of complete crontab entries:

30 2 * * 1 (cd /home/joe/project; make)

This entry runs make in the directory /home/joe/project every Monday morning at 2:30 a.m. An entry like this might be used to start a long compilation at a time when other users would not be using the system. Usually, any output produced by a cron command is mailed to the owner of the crontab.³

20 1 * * * find /tmp -atime +3 -type f -exec rm -f {} ';'

This command runs at 1:20 each morning. It removes all files in the /tmp directory that have not been accessed in 3 days.

55 23 * * 0-3,6 /staff/trent/bin/checkservers

This line runs checkservers at 11:55 p.m. every day except Thursdays and Fridays.

cron does not try to compensate for commands that are missed while the system is down. However, the Linux and HP-UX crons are smart about small time adjustments such as shifts into and out of daylight saving time. Other versions of cron may skip commands or run them twice if they are scheduled during the transition period (usually between 1:00 and 3:00 a.m. in the United States, for example).⁴

9.3 Crontab Management

crontab filename installs filename as your crontab, replacing any previous version. crontab -e checks out a copy of your crontab, invokes your editor on it (as specified by the EDITOR environment variable), and then resubmits it to the crontab directory. crontab -l lists the contents of your crontab to standard output, and crontab -r removes it, leaving you with no crontab file at all.

Root can supply a username argument to edit or view other users’ crontabs. For example, crontab -r jsmith erases the crontab belonging to the user jsmith, and crontab -e jsmith edits it. Linux allows both a username and a filename argument in the same command, so the username must be prefixed with -u to disambiguate (e.g., crontab -u jsmith crontab.new).

Without command-line arguments, most versions of crontab will try to read a crontab from standard input. If you enter this mode by accident, don’t try to exit with <Control-D>; doing so will erase your entire crontab. Use <Control-C> instead. Linux requires you to supply a dash as the filename argument if you want to make crontab pay attention to its standard input. Smart.
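Reading from standard input also makes it possible to script crontab changes. The sketch below adds one entry to the invoking user's crontab without opening an editor. It assumes the Linux convention of a dash for standard input; append_cron_entry is a hypothetical helper, not part of any crontab implementation.

```shell
#!/bin/sh
# append_cron_entry 'ENTRY'
# Add one line to the invoking user's crontab unless an identical line is
# already present. Assumes "crontab -" reads a crontab from standard input.
append_cron_entry() {
    entry=$1
    current=$(crontab -l 2>/dev/null)

    # -F: fixed string, -x: whole line, -q: quiet
    if printf '%s\n' "$current" | grep -Fxq -- "$entry"; then
        return 0
    fi

    {
        [ -n "$current" ] && printf '%s\n' "$current"
        printf '%s\n' "$entry"
    } | crontab -
}

# Example (not run here):
#   append_cron_entry '45 10 * * 1-5 /staff/trent/bin/checkservers'
```

Because the whole crontab is resubmitted through the crontab command, cron is notified of the change in the usual way.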

Two config files, cron.deny and cron.allow, specify which users may submit crontab files. They’re located in a different directory on every system; see Table 9.2 for a summary.

Table 9.2 Locations of cron permission and log files

If the allow file exists, then it contains a list of all users that may submit crontabs, one per line. No unlisted person can invoke the crontab command. If the allow file doesn’t exist, then the deny file is checked. It, too, is just a list of users, but the meaning is reversed: everyone except the listed users is allowed access.

If neither the allow file nor the deny file exists, systems default (apparently at random, there being no dominant convention) to allowing all users to submit crontabs or to limiting crontab access to root. In practice, a starter cron.allow or cron.deny file is often included in the default OS installation, so the question of how crontab behaves without configuration files is moot. Among our example systems, only HP-UX defaults to blocking crontab access for unprivileged users.
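The decision procedure described above is short enough to write down as a shell function. This is a sketch of the logic only, not the code any crontab command actually uses. The function name is invented, the file locations are passed in as arguments because they vary from system to system, and the fallback when neither file exists is a parameter because systems disagree about it.

```shell
#!/bin/sh
# may_use_crontab USER ALLOW_FILE DENY_FILE [DEFAULT]
# Succeeds if USER would be permitted to submit a crontab.
# DEFAULT is "allow" or "deny" (the default here) and applies only when
# neither file exists.
may_use_crontab() {
    user=$1
    allow=$2
    deny=$3
    default=${4:-deny}

    if [ -f "$allow" ]; then
        # The allow file wins: only listed users get access.
        grep -Fxq -- "$user" "$allow"
    elif [ -f "$deny" ]; then
        # Otherwise everyone except the listed users gets access.
        ! grep -Fxq -- "$user" "$deny"
    else
        [ "$default" = allow ]
    fi
}
```

Note that the deny file is never consulted when an allow file exists, so a user listed in both files is still allowed.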

It’s important to note that on most systems, access control is implemented by crontab, not by cron. If a user is able to sneak a crontab file into the appropriate directory by other means, cron will blindly execute the commands it contains.

Solaris is a bit different in this regard. Its cron daemon checks to be sure that the user’s account hasn’t been locked with an *LK* in /etc/shadow. If it has, cron won’t run the user’s jobs. The rationale is to prevent disabled users from running jobs, whether inadvertently or maliciously. If you want a user to have a valid account from cron’s perspective but not a valid password, run passwd -N user.

9.4 Linux and Vixie-Cron Extensions

The version of cron included on Linux distributions (including our three examples) is usually the one known as ISC cron or “Vixie-cron,” named after its author, Paul Vixie. It’s a modern rewrite that provides a bit of added functionality with less mess.

A primary difference is that in addition to looking for user-specific crontabs, Vixie-cron also obeys system crontab entries found in /etc/crontab and in the /etc/cron.d directory. These files have a slightly different format from the per-user crontab files in that they allow commands to be run as an arbitrary user. An extra username field comes before the command name. The username field is not present in garden-variety crontab files because the crontab’s filename provides this same information (even on Linux systems).

cron treats the /etc/crontab and /etc/cron.d entries in exactly the same way. In general, /etc/crontab is intended as a file for system administrators to maintain by hand, whereas /etc/cron.d is provided as a depot into which software packages can install any crontab entries they might need. Files in /etc/cron.d are by convention named after the packages that install them, but cron doesn’t care about or enforce this convention.
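As an illustration of the system crontab format, here is what a small file in /etc/cron.d might look like. The file name and the choice of the user trent for the second entry are assumptions; the commands themselves are the ones used as examples earlier in this chapter, with the username field added.

```
# /etc/cron.d/local-chores (hypothetical file)
# minute hour dom month weekday user command
20 1 * * * root find /tmp -atime +3 -type f -exec rm -f {} ';'
55 23 * * 0-3,6 trent /staff/trent/bin/checkservers
```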

Time ranges in Vixie-cron crontabs can include a step value. For example, the series 0,3,6,9,12,15,18 can be written more concisely as 0-18/3. You can also use three-letter text mnemonics for the names of months and days, but not in combination with ranges. As far as we know, this feature works only with English names.

You can specify environment variables and their values in a Vixie-cron crontab file. See the crontab(5) man page for more details.
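A per-user crontab that uses these extensions might look something like the sketch below. The script names are placeholders. SHELL, PATH, and MAILTO are variables that Vixie-cron itself pays attention to; the values shown are only examples.

```
# Settings apply to the entries that follow them
SHELL=/bin/sh
PATH=/usr/local/bin:/usr/bin:/bin
MAILTO=jsmith

# On the hour, every three hours from midnight through 6:00 p.m.
0 0-18/3 * * * $HOME/bin/poll-sensors

# 9:00 a.m. every Monday in January
0 9 * jan mon $HOME/bin/new-year-report
```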

Vixie-cron logs its activities through syslog using the facility “cron,” with most messages submitted at level “info.” Default syslog configurations generally send cron log data to its own file.

For reasons that are unclear, cron has been renamed crond on Red Hat. But it is still the same Vixie-cron we all know and love.

9.5 Some Common Uses for Cron

A number of standard tasks are especially suited for invocation by cron, and these usually make up the bulk of the material in root’s crontab. In this section we look at a few common chores and the crontab lines used to implement them.

Systems often come with crontab entries preinstalled. If you want to deactivate the standard entries, comment them out by inserting a pound sign (#) at the beginning of each line. Don’t delete them; you might want to refer to them later.

In addition to the /etc/cron.d mechanism, Linux distributions also preinstall crontab entries that run the scripts in a set of well-known directories, thereby providing another way for software packages to install periodic jobs without any editing of a crontab file. For example, scripts in /etc/cron.daily are run once a day, and scripts in /etc/cron.weekly are run once a week. You can put files in these directories by hand as well.
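The idea behind these directories is simple enough to sketch in shell: run every executable file in a directory, in name order. The function below is a simplified stand-in written for illustration, not the program that distributions actually use for this purpose, which typically has its own rules about acceptable filenames and error reporting.

```shell
#!/bin/sh
# run_dir DIRECTORY
# Run each executable regular file in DIRECTORY, in the order in which the
# shell expands the glob (sorted by name). Other files are skipped.
run_dir() {
    dir=$1
    for script in "$dir"/*; do
        [ -f "$script" ] && [ -x "$script" ] || continue
        "$script"
    done
}

# Example (not run here):
#   run_dir /etc/cron.daily
```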

Many sites have experienced subtle but recurrent network glitches that occur because administrators have configured cron to run the same command on hundreds of machines at exactly the same time. Clock synchronization with NTP exacerbates the problem. The problem is easy to fix with a random delay script or config file adjustment, but it can be tricky to diagnose because the symptoms resolve so quickly and completely.
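One way to get such a delay is to derive it from the hostname rather than from a random number generator, so that each machine always waits the same amount of time but different machines wait different amounts. The sketch below takes that approach; it is one possible technique, not the only one, and host_delay is a name invented for the example.

```shell
#!/bin/sh
# host_delay NAME MAX
# Print a number of seconds in the range 0 to MAX-1 derived from a checksum
# of NAME. The same name always yields the same delay.
host_delay() {
    name=$1
    max=$2
    sum=$(printf '%s' "$name" | cksum | cut -d ' ' -f 1)
    echo $(( sum % max ))
}

# Example, at the top of a script run from cron (not run here):
#   sleep "$(host_delay "$(hostname)" 600)"
```

Seeding a random number generator from the clock is less helpful here, since machines synchronized with NTP may all end up with the same seed.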

Simple reminders

It’s not going to put Google Calendar out of business, but cron can be quite useful in its own geeky way for simple reminders: birthdays, due dates, recurrent tasks, etc. That’s especially true when the reminder process has to integrate with other home-grown software such as a trouble ticket manager.

The following crontab entry implements a simple email reminder. (Lines have been folded to fit the page. In reality, this is one long line.)

Note the use of the % character both to separate the command from the input text and to mark line endings within the input. This entry sends email once on the 25th day of each month.
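For illustration, a reminder entry of this general shape might look like the following. The address, the subject, the time of day, and the path to mail are all made up for the example, and the entry is a single line in the crontab.

```
30 4 25 * * /usr/bin/mail -s "Monthly report due" admin@example.com%The monthly report is due at the end of the month.%%Sincerely,%cron
```

The double percent sign produces a blank line in the body of the message.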

Filesystem cleanup

Some of the files on any system are worthless junk (no, not the system files). For example, when a program crashes, the kernel may write out a file (usually named core, core.pid, or program.core) that contains an image of the program’s address space. Core files are useful for developers, but for administrators they are usually a waste of space. Users often don’t know about core files, so they tend not to delete them on their own.⁵

NFSv3 is another source of extra files. Because NFSv3 servers are stateless, they have to use a special convention to preserve files that have been deleted locally but are still in use by a remote machine. Most implementations rename such files to .nfsxxx, where xxx is a number. Various situations can result in these files being forgotten and left around after they are supposed to have been deleted.

NFS, the Network File System, is described in Chapter 18.

Many programs create temporary files in /tmp or /var/tmp that aren’t erased for one reason or another. Some programs, especially editors, like to make a backup copy of each file they work with.

A partial solution to the junk file problem is to institute some sort of nightly disk space reclamation out of cron. Modern systems usually come with something of this sort set up for you, but it’s a good idea to review your system’s default behavior to make sure it’s appropriate for your situation.

Below are several common idioms implemented with the find command.
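A command of the kind the next two paragraphs describe can be sketched as follows. It is wrapped in a function that takes the starting directory as an argument so that it can be tried on a scratch directory first. The function name, the exact name patterns, and the default of seven days are assumptions based on the description in the text.

```shell
#!/bin/sh
# remove_old_cores DIR [DAYS]
# Delete core files under DIR that have not been accessed in more than
# DAYS days (default 7), without crossing into other filesystems.
remove_old_cores() {
    find "$1" -xdev -type f \
        '(' -name core -o -name 'core.[0-9]*' -o -name '*.core' ')' \
        -atime +"${2:-7}" -exec rm -f {} ';'
}

# Possible crontab entry calling a script that wraps this function
# (hypothetical path):
#   10 2 * * * /usr/local/sbin/remove-old-cores /
```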

This command removes core images that have not been accessed in a week. The -xdev argument makes sure that find won’t cross over to filesystems other than the root; this restraint is important on networks where many filesystems may be cross-mounted.⁶ If you want to clean up more than one filesystem, use a separate command for each. (Note that /var is typically a separate filesystem.)

The -type f argument is important because the Linux kernel source contains a directory called core. You wouldn’t want to be deleting that, would you?⁷
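The next idiom, described in the paragraph that follows, targets temporary and backup files by name. Again this is a sketch with an invented function name and a directory argument for safe experimentation; the -type f test is our own precaution and may not be part of every version of this idiom.

```shell
#!/bin/sh
# remove_editor_junk DIR [DAYS]
# Delete files under DIR that match common temporary and backup name
# patterns and have not been accessed in more than DAYS days (default 3).
remove_editor_junk() {
    find "$1" -xdev -type f -atime +"${2:-3}" \
        '(' -name '#*' -o -name '.#*' -o -name '.nfs*' \
            -o -name '*~' -o -name '*.CKP' ')' \
        -exec rm -f {} ';'
}
```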

This command deletes files that have not been accessed in three days and that begin with # or .# or .nfs or end with ~ or .CKP. These patterns are typical of various sorts of temporary and editor backup files.

See page 143 for more information about mount options.

For performance reasons, some administrators use the noatime mount option to prevent the filesystem from maintaining access time stamps. That configuration will confuse both of the find commands shown above because the files will appear to have been unreferenced even if they were recently active. Unfortunately, the failure mode is to delete the files; be sure you are maintaining access times before using these commands as shown.
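The last idiom, described next, works on directories rather than files and looks at modification times instead of access times. The sketch below uses an invented function name and takes the directory as an argument. The -prune test is our addition; it keeps find from trying to descend into a directory that has just been removed.

```shell
#!/bin/sh
# remove_stale_tmp_dirs DIR
# Recursively remove subdirectories of DIR that have not been modified in
# more than three days, leaving DIR itself and lost+found alone.
remove_stale_tmp_dirs() {
    (
        cd "$1" || exit 1
        find . ! -name . ! -name lost+found -type d -mtime +3 \
            -prune -exec rm -rf {} ';'
    )
}

# Example (not run here):
#   remove_stale_tmp_dirs /tmp
```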

This command recursively removes all subdirectories of /tmp not modified in 72 hours. On most systems, plain files in /tmp are removed at boot time by the system startup scripts. However, some systems do not remove directories. If a directory named lost+found exists, it is treated specially and is not removed. This is important if /tmp is a separate filesystem. See page 260 for more information about lost+found.

If you use any of these commands, make sure that users are aware of your cleanup policies before disaster strikes!

Network distribution of configuration files

See Chapter 19 for more information about sharing configuration files.

If you are running a network of machines, it’s often convenient to maintain a single, network-wide version of configuration files such as the mail aliases database. Usually, the underlying sharing mechanism is some form of polling or periodic distribution, so this is an ideal task for cron. Master versions of system files can be distributed every night with rsync or rdist.

Sometimes, postprocessing is required. For example, you might need to run the newaliases command to convert a file of mail aliases to the hashed format used by sendmail because the AutoRebuildAliases option isn’t set in your sendmail.cf file. You might also need to load files into an administrative database such as NIS.
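A nightly pull of the aliases file followed by the postprocessing step might be set up along these lines. The host name master, the location of the aliases file, and the time of day are assumptions, and the entry presumes that rsync can reach the master host without prompting for a password.

```
# root's crontab on each client machine
# Fetch the master aliases file, then rebuild the hashed database.
15 3 * * * rsync -a master:/etc/mail/aliases /etc/mail/aliases && newaliases
```

Chaining the two commands with && keeps newaliases from running when the copy fails.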

Log file rotation

Systems vary in the quality of their default log file management, and you will probably need to adjust the defaults to conform to your local policies. To “rotate” a log file means to divide it into segments by size or by date, keeping several older versions of the log available at all times. Since log rotation is a recurrent and regularly scheduled event, it’s an ideal task for cron. See Chapter 11, Syslog and Log Files, for more details.
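A bare-bones rotation scheme with numbered suffixes can be sketched in a few lines of shell. The function name is hypothetical, and the sketch leaves out things a production setup needs, such as compressing old segments, preserving ownership and permissions, and telling the logging program to reopen its file.

```shell
#!/bin/sh
# rotate_log FILE [KEEP]
# Shift FILE.1 to FILE.2, FILE.2 to FILE.3, and so on, keeping at most KEEP
# old segments (default 7). Then move FILE to FILE.1 and start a new,
# empty FILE.
rotate_log() {
    log=$1
    keep=${2:-7}
    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$(( i - 1 ))
        [ -f "$log.$prev" ] && mv "$log.$prev" "$log.$i"
        i=$prev
    done
    [ -f "$log" ] && mv "$log" "$log.1"
    : > "$log"
}

# Possible crontab entry calling a script that wraps this function
# (hypothetical paths), run early every Sunday morning:
#   0 3 * * 0 /usr/local/sbin/rotate-log /var/log/myapp.log 4
```

The oldest segment is overwritten on each run, so disk usage stays bounded by KEEP segments plus the live log.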

9.6 Exercises

E9.1 A local user has been abusing his crontab privileges by running expensive tasks at frequent intervals. After asking him to stop several times, you are forced to revoke his privileges. List the steps needed to delete his current crontab and make sure he can’t add a new one.

E9.2 Think of three tasks (other than those mentioned in this chapter) that might need to be run periodically. Write crontab entries for each task and specify where they should go on your system.

E9.3 Choose three entries from your system’s crontab files. Decode each one and describe when it runs, what it does, and why you think the entry is needed. (Requires root access.)

E9.4 Write a script that keeps your startup files (~/.[a-z]*) synchronized among all the machines on which you have an account. Schedule this script to run regularly from cron. (Is it safe to blindly copy every file whose name starts with a dot? How will you handle directories? Should files being replaced on the destination machines be backed up before they are overwritten?)