For the final chapter, I'll describe some annoyances associated with computer administration. This chapter includes a number of topics (gateways, remote logins, logfile management, automated scripts) that don't quite fit with annoyances in other chapters. Perhaps the most important annoyance is the first, which deals with how every computer on a network downloads identical copies of the same updates, overloading your Internet connection.
If all you administer is one or two Linux computers, updates are a straightforward process. All you need to do is configure updates from the most appropriate mirror on the Internet. If desired, you can automate downloads and installations of updates using a cron job. For more information on how to configure updates from yum and apt-based mirrors, see Chapter 8.
However, when you administer a large number of Linux computers, the updates can easily overload standard high-speed Internet connections. For example, if you're downloading updates to the OpenOffice.org suite, you could be downloading hundreds of megabytes of packages. If you're downloading these packages on 100 computers simultaneously, that may be too much for your Internet connection, especially when other jobs are pending.
In this annoyance, I'll show you how you can create a local mirror of your favorite update server. You can then share the appropriate directory and configure your updates locally.
Where possible, I'll show you how you can limit what you mirror to updates. For example, Fedora Linux includes dedicated update directories. Most downloads are associated with updates, so it's appropriate to limit what you mirror to such packages.
One other approach is to download just the packages and create the repository metadata yourself. For example, the createrepo command reads the header from each RPM and builds the database that helps the yum command find the dependencies associated with every package.
I assume you have the hard disk space you need on your mirror server. Repositories can be very demanding with respect to disk space; be aware, if you're synchronizing repositories for multiple architectures and distributions, that downloaded mirrors can easily take up hundreds of gigabytes of space.
There are a number of ways to download the files associated with a mirror. The most common standard is based on the rsync command. With rsync, you can synchronize your mirrors as needed, downloading only those parts of those packages that are new or have otherwise changed. I'll show you how you can use rsync in this annoyance.
There are a number of other tools available. Naturally, you can use any FTP client to download mirrors to local directories. Commands such as wget and curl do an excellent job with large downloads. If you're working with an apt repository, the apt-mirror project provides another excellent alternative (http://freshmeat.net/projects/apt-mirror/).
To create your mirror, you can take these steps, which I'll detail in the following subsections:
Find an appropriate update mirror, specifically the one that gives you the best performance for individual updates. Some trial and error may be required. While the best update mirror is usually geographically close to you, that may not always be the case.
Make room for the updates. Several gigabytes may be required, especially if you're making room for updates for multiple distributions and/or versions. You may even consider using a dedicated partition or drive.
Synchronize the mirror locally. The first time you download a mirror, you may be downloading gigabytes of data.
If required, make your local mirror usable through your preferred update system.
Test a local update after you've downloaded a mirror to make sure it works.
Automate the synchronization process.
Point your clients to the local mirror.
The best update mirror may not be the one that is physically closest to your network. Some mirrors have faster connections to the Internet. Others have less traffic. Some mirror administrators may discourage full mirror downloads or even limit the number of simultaneous connections. And many public mirrors don't support rsync connections.
Our selected distributions maintain "official" lists of update mirrors, but more mirrors may be available than those lists suggest. A mirror that carries a Fedora repository often carries a SUSE repository as well. For example, while the University of Mississippi is not (currently) on the official list of mirrors for SUSE Linux, updates are available from its server at http://mirror.phy.olemiss.edu/mirror/suse/suse/. Here's where to find the "official" list of mirrors for our selected distributions:
http://fedora.redhat.com/download/mirrors.html includes a list of mirrors accessible through the rsync protocol; don't limit yourself to those specified, as others may also work with rsync.
Official mirrors of the open source SUSE distribution can be found at http://en.opensuse.org/Mirrors_Released_Version. Trial and error is required to find rsync-capable mirrors.
Official Debian mirrors can be found at http://www.debian.org/mirror/list. Many support a limited number of architectures. Trial and error is required to find rsync-capable mirrors.
To see if a mirror works with the rsync protocol, run the rsync command with the URL in question. For example, if you want to check the mirror specified in the Debian Mirror List from the University of Southern California, run the following command (and don't forget the double colon at the end):
rsync mirrors.usc.edu::
When I ran this command, I saw a long list of directories, clearly associated with various Linux distributions, including SUSE, Fedora, and others. If there is no rsync server at your desired site, the rsync command will time out, or you'll have to press Ctrl-C to return to the command line.
Finding the best update mirror is somewhat subjective. Yes, you could go by objective measures, such as the time required for the download. But conditions change. Internet traffic can slow down in certain geographic areas. Servers do go down. Some trial and error may be required.
Updates can consume gigabytes of space. The choices you make can make a significant difference in the space you need. Key factors include:
Every architecture that you maintain locally can multiply the space you need. For example, if you're rolling out both 64-bit and 32-bit workstations, you'll need at least double the space.
If you're maintaining mirrors for more than one distribution, your space requirements increase accordingly.
If you're maintaining mirrors for more than one version of a distribution (such as for Fedora Core 4 and 5), your space requirements can multiply.
Many administrators find it convenient to include a copy of the installation trees in the update repository partition. This increases the space required by the size of the installation CDs/DVDs.
You may want to create a dedicated partition for your update repositories. That way, you can be sure that the space required by the repository does not crowd out the rest of your system.
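If you do set aside a dedicated partition, mounting it at the repository directory keeps the layout simple. Here's a minimal sketch of an /etc/fstab entry; the device name (/dev/sdb1) and mount point (/var/www/html/yum, matching the Apache-based example later in this annoyance) are assumptions you'd adjust for your own system:
/dev/sdb1    /var/www/html/yum    ext3    defaults    1 2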
Along with perhaps most of the world of Linux, I like the rsync command. With appropriate switches, it's easy to use this command to copy the files and directories that you want. Once you've set up a mirror, you can use the rsync command as needed to keep your local mirror up-to-date.
The rsync command is straightforward; I use it to back up the home directory from my laptop computer with the following command:
rsync -a -e ssh michael@laptop.example.com:/home/michael/* /backup
If you've set the environment variable RSYNC_RSH=ssh, you don't need the -e ssh option. For more information on rsync, see the "I'm Afraid of Losing Data" annoyance in Chapter 2.
In the following subsections, I illustrate some simple examples of how you can create your own rsync mirror on our selected distributions. This assumes you're using an appropriate directory, possibly configured on a separate disk or partition.
For this exercise, assume you want to synchronize your local update mirror with the one available from kernel.org. The entry in the list of Fedora mirrors is a little deceiving. When you see the following:
rsync://mirrors.kernel.org/fedora/core/
you'll need to run the following command to confirm that rsync works on that server, as well as to view the available directories (don't forget the trailing forward slash):
rsync mirrors.kernel.org::fedora/core/
When I ran this command, I saw the result shown here:
MOTD:  Welcome to the Linux Kernel Archive.
MOTD:
MOTD:  Due to U.S. Exports Regulations, all cryptographic software on this
MOTD:  site is subject to the following legal notice:
MOTD:
MOTD:  This site includes publicly available encryption source code
MOTD:  which, together with object code resulting from the compiling of
MOTD:  publicly available source code, may be exported from the United
MOTD:  States under License Exception "TSU" pursuant to 15 C.F.R. Section
MOTD:  740.13(e).
MOTD:
MOTD:  This legal notice applies to cryptographic software only.
MOTD:  Please see the Bureau of Industry and Security,
MOTD:  http://www.bis.doc.gov/ for more information about current
MOTD:  U.S. regulations.
MOTD:
drwxr-xr-x        4096 2005/06/09 09:40:43 .
drwxr-xr-x        4096 2004/03/01 08:39:30 1
drwxr-xr-x        4096 2004/05/14 04:18:24 2
drwxr-xr-x        4096 2004/11/03 15:00:14 3
drwxr-xr-x        4096 2005/06/09 09:41:47 4
drwxrwsr-x        4096 2005/12/16 23:49:44 development
drwxr-xr-x        4096 2005/11/22 06:14:23 test
drwxrwsr-x        4096 2005/06/07 08:29:19 updates
[michael@FedoraCore4 rhn]$
Naturally, Fedora Core production releases (which should also be available on the installation CDs/DVDs) are associated with the numbered directories. The focus in this annoyance, though, is on the updates directory, the last one listed on the server, which divides updates by Fedora Core release.
To make sure this server includes the updates I need, I ran the following command:
rsync mirrors.kernel.org::fedora/core/updates/
I continued the process until I confirmed that this server included the update RPMs that I wanted to mirror. I wanted to create an Apache-based repository, so I mirrored the RPMs to the /var/www/html/yum/Fedora/Core/updates/4/i386 directory.
By default, the DocumentRoot associated with the default Fedora Apache configuration points to the /var/www/html directory; if I configure a local Apache server, I can use the Fedora/Core/updates/4/ subdirectory.
Then, to synchronize the local and remote update directories, I ran the following command:
rsync -a mirrors.kernel.org::fedora/core/updates/4/i386/. \
    /var/www/html/yum/Fedora/Core/updates/4/i386
Because the SUSE list of mirrors doesn't specify which are rsync servers, some trial and error is required. For this exercise, I attempted to synchronize my local update mirror with that available from the University of Utah. The listing that I saw in the SUSE mirror list as of this writing was:
suse.cs.utah.edu/pub/
I tried the following command, which led to an error message:
rsync suse.cs.utah.edu::pub/
@ERROR: Unknown module 'pub'
rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(359)
So I tried the top-level directory and found the SUSE repositories at the top of the list:
rsync suse.cs.utah.edu::
suse            The full /pub/suse directory from ftp.suse.com.
people          The full /pub/people directory from ftp.suse.com.
projects        The full /pub/projects directory from ftp.suse.com.
And, with a little browsing, as described in the previous section, I found the SUSE update directories with the following command:
rsync suse.cs.utah.edu::suse/i386/update/10.0/
I wanted to download updates associated with SUSE 10.0 to the following directory:
/var/lib/YaST2/you/mnt/i386/update/10.0/
I could run the following command to synchronize all updates from the update directory at the University of Utah (the -v uses verbose mode, and the -z compresses the transferred data):
rsync -avz suse.cs.utah.edu::suse/i386/update/10.0/. \
    /var/lib/YaST2/you/mnt/i386/update/10.0/
But that might transfer more than you need. If you explore a bit further, you'll find source packages as well as packages built for 64-bit and PPC CPU systems. If you have only 32-bit workstations, you don't need all this extra data. You can use the --exclude switch to avoid transferring these packages:
rsync -avz --exclude='*.src.rpm' --exclude='*.ppc' --exclude='*x86_64*' \
    suse.cs.utah.edu::suse/i386/update/10.0/. \
    /var/lib/YaST2/you/mnt/i386/update/10.0/
Debian mirrors are somewhat different. Besides the different package format, Debian mirrors do not include any separate update servers. Therefore, if you want to mirror a Debian update server, you'll have to mirror all the packages on the server (except any that you specifically exclude).
Because the Debian list of mirrors does not specify rsync servers, some trial and error may be required. For this exercise, I wanted to synchronize my local update mirror with the one available from the University of California at Berkeley. Here's the command I ran, followed by the module listing it returned:
rsync linux.csua.berkeley.edu::
debian
debian-non-US
debian-cd
In other words, this revealed the directories associated with Debian CDs as well as non-U.S. packages. For now, I assume that you want to mirror the regular Debian repositories. I found them with the following command:
rsync linux.csua.berkeley.edu::debian/dists/Debian3.1r0/main/
But as you can see from the output shown below, there are a number of directories full of packages that you may not need, unless you want to include the installers, as well as the binary packages associated with the full Debian range of architectures:
drwxr-sr-x        4096 2005/06/04 10:20:54 .
drwxr-sr-x        4096 2005/12/17 00:33:29 binary-alpha
drwxr-sr-x        4096 2005/12/17 00:39:50 binary-arm
drwxr-sr-x        4096 2005/12/17 00:48:56 binary-hppa
drwxr-sr-x        4096 2005/12/17 00:55:50 binary-i386
drwxr-sr-x        4096 2005/12/17 01:01:22 binary-ia64
drwxr-sr-x        4096 2005/12/17 01:07:29 binary-m68k
drwxr-sr-x        4096 2005/12/17 01:15:06 binary-mips
drwxr-sr-x        4096 2005/12/17 01:23:07 binary-mipsel
drwxr-sr-x        4096 2005/12/17 01:29:11 binary-powerpc
drwxr-sr-x        4096 2005/12/17 01:35:33 binary-s390
drwxr-sr-x        4096 2005/12/17 01:41:44 binary-sparc
drwxr-sr-x        4096 2004/01/04 11:47:29 debian-installer
drwxr-sr-x        4096 2005/03/24 00:22:16 installer-alpha
drwxr-sr-x        4096 2005/03/24 00:22:16 installer-arm
drwxr-sr-x        4096 2005/03/24 00:22:17 installer-hppa
drwxr-sr-x        4096 2005/03/24 00:22:17 installer-i386
drwxr-sr-x        4096 2005/03/24 00:22:17 installer-ia64
drwxr-sr-x        4096 2005/03/24 00:22:17 installer-m68k
drwxr-sr-x        4096 2005/03/24 00:22:17 installer-mips
drwxr-sr-x        4096 2005/03/24 00:22:17 installer-mipsel
drwxr-sr-x        4096 2005/03/24 00:22:17 installer-powerpc
drwxr-sr-x        4096 2005/03/24 00:22:17 installer-s390
drwxr-sr-x        4096 2005/03/24 00:22:17 installer-sparc
drwxr-sr-x        4096 2005/12/17 01:45:08 source
drwxr-sr-x        4096 2005/06/04 11:40:37 upgrade-kernel
To download just the directories that you need, you can go into the appropriate subdirectory, or you can make extensive use of the --exclude switch. Debian recommends the latter. For example, if all of your workstations include Intel Itanium CPUs, you can run a command that excludes all files and directories not associated with the IA64 architecture. Debian also recommends that you include the --recursive, --times, --links, --hard-links, and --delete switches, which do the following:
Recursively download and synchronize files from all subdirectories
Preserve the date and time associated with each file
Re-create any existing symlinks
Include any hard-linked files
Delete any files that no longer exist on the mirror
If I wanted to limit the downloads to the ia64 directory, I would include the following switches:
rsync -avz --recursive --times --links --hard-links --delete \
    --exclude binary-alpha/ --exclude '*_alpha.deb' \
    --exclude binary-arm/ --exclude '*_arm.deb' \
    --exclude binary-hppa/ --exclude '*_hppa.deb' \
    --exclude binary-i386/ --exclude '*_i386.deb' \
    --exclude binary-m68k/ --exclude '*_m68k.deb' \
    --exclude binary-mips/ --exclude '*_mips.deb' \
    --exclude binary-mipsel/ --exclude '*_mipsel.deb' \
    --exclude binary-powerpc/ --exclude '*_powerpc.deb' \
    --exclude binary-s390/ --exclude '*_s390.deb' \
    --exclude binary-sparc/ --exclude '*_sparc.deb'
But things are beginning to get complicated. Debian provides a script that can help. All you'll need to do before running the script is to specify a few directives, including the rsync server, directory, and architectures to exclude. To see the script, navigate to http://www.debian.org/mirror/anonftpsync. For additional discussion of this rsync script, see http://www.debian.org/mirror/ftpmirror.
Now that you have a local mirror of Linux updates, you'll need to make sure it's usable through your update system. For our selected distributions, I'm assuming that you're using yum for Fedora, apt for Debian, or YaST for SUSE Linux. This step involves creating the repository database that the packaging system on each client consults to find available packages and resolve their dependencies.
I also assume that you've shared the update directory using a standard sharing service, such as FTP, HTTP, or NFS. I've described the basic methods associated with yum and apt updates in Chapter 8. If you're connecting to a shared NFS directory, substitute file:/// (with three forward slashes) for http:// or ftp://.
Generally, when you use rsync to copy and synchronize to local mirrors, you've also downloaded the directories that support the apt or yum databases.
If you're using apt for updates, such as for Debian Linux, you may already have the key database files: Packages.gz for regular binary packages and Sources.gz for source packages. Based on the Debian mirror described earlier, you can find these files in the following directories:
linux.csua.berkeley.edu/debian/dists/Debian3.1r0/main/binary-i386/
linux.csua.berkeley.edu/debian/dists/Debian3.1r0/main/source/
If you need to create your own versions of these database files, navigate to the directory with the binary packages and run the following command:
dpkg-scanpackages . /dev/null | gzip -9c > Packages.gz
And for the database of source packages, navigate to the directory with those packages and run the following command:
dpkg-scansources . /dev/null | gzip -9c > Sources.gz
For more information on this process, see http://www.interq.or.jp/libra/oohara/apt-gettable/.
There are two ways to create a yum repository database. Through Fedora Core 3, the standard was the yum-arch command, which is included in the yum RPM. Since that time, the standard has become the createrepo command, based on a package of the same name. For the older Fedora distributions (as well as the rebuild distributions of Red Hat Enterprise Linux 3 and 4, which use yum for updates), you can create your own yum repository database by navigating to the package directory and running the following command:
yum-arch .
As yum "digests" the package headers, it collects them in a headers/ subdirectory.
For later Fedora distributions, assuming the packages are in the directory described earlier for Fedora updates, you'd run the following command:
createrepo /var/www/html/yum/Fedora/Core/updates/4/i386
This command creates an XML database in the repodata/ subdirectory. If your mirror process already copied either of these directories, you don't need to create it.
Now you'll want to test a local update. I described some of the update systems in Chapter 8. To summarize, for any of our three distributions, you'll need to make some configuration changes to point the package manager to the update server you created on your local network:
If you're updating yum for Fedora, you'll want to update the appropriate configuration files in the /etc/yum.repos.d directory. If your local mirror consists of Fedora updates, the file is fedora-updates.repo. For example, if you've shared the directory described in the previous section via NFS and have mounted the appropriate directory, you would substitute the following for the default baseurl directive:
baseurl=file:///var/www/html/yum/Fedora/Core/updates/4/i386/
If you're updating YaST for SUSE Linux, you'll need to point the update server to the shared local directory. In the appropriate YaST menu, you can configure a connection to any of several servers, including FTP, HTTP, or NFS servers from the local network. For example, if I've created an FTP server that points to the SUSE repository directory created earlier, I'd select FTP, cite the name of the server, and point to the following directory on that server: /var/lib/YaST2/you/mnt/i386/update/10.0/
If you're updating apt for Debian Linux, you'll want to update the appropriate URLs configured in /etc/apt/sources.list. For example, if you've mirrored a repository for Debian Sarge and created an HTTP server on your local network, on a computer named debianrep, in the web server's /repo subdirectory, you'd add the following line to each client's sources.list file:
deb http://debianrep/repo sarge main
Once you change the appropriate configuration file, you can test updates from the local server that you created.
When you're satisfied that the local update server meets your needs, you'll want to automate the synchronization process. To do so, insert the rsync command(s) that you used in a cron job file. If you had to create yum or apt database files, you'll want to add those commands described earlier to the cron job.
Even after the first time you create a mirror, the downloads for updates can be extensive. For example, updates to the OpenOffice.org suite alone can occupy several hundred megabytes.
Therefore, you'll want to schedule the cron job for a time when few or no other jobs are running. And that depends on the schedule of other cron jobs, as well as any other jobs (such as database processing) that may happen during off-hours.
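As a sketch of what such a cron job might look like, the following script reuses the Fedora paths and the kernel.org mirror from earlier in this annoyance; the filename, schedule, and paths are assumptions you'd adapt to your own mirror:
#!/bin/sh
# /etc/cron.weekly/mirror-sync: synchronize the local Fedora update mirror
# Paths and server match the examples earlier in this annoyance.
rsync -a mirrors.kernel.org::fedora/core/updates/4/i386/. \
    /var/www/html/yum/Fedora/Core/updates/4/i386
# Rebuild the yum metadata if the mirror did not bring its own repodata/
createrepo /var/www/html/yum/Fedora/Core/updates/4/i386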
Once you've tested your local mirror, and then configured regular updates to that mirror, you're ready to connect your local workstations to it. You'll need to modify the same files as described earlier in the "Test a Local Update" section.
If you want to configure automatic updates on your workstations from your local repositories, you'll need to configure cron jobs on each host.
Remember, updates always carry some degree of risk. But when you update the system with the local repository, you're testing at least some of the updates. You have to decide if you want to do more testing or allow automatic updates to the production systems on your network. You can always create a script to log in to and update each of the production systems when you're ready.
Some distributions support GUI configuration of automated updates; SUSE supports it directly via YaST (which is saved to /etc/cron.d/yast2-online-update).
If you've installed the latest version of yum on Fedora Core, there's a cron job already configured in /etc/cron.daily/yum.cron. To let it run, you'll need to activate the yum service in the /etc/init.d directory.
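One way to do so, assuming the standard Red Hat chkconfig and service utilities, is:
chkconfig yum on     # enable the yum service at boot
service yum start    # create the lock file that the daily yum.cron job checks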
Creating an update script is a straightforward process, with the following general steps:
Create a cron job in the appropriate directory. If you want a weekly update, add it to /etc/cron.weekly.
Make sure the script checks for the latest version of the update-management command. For example, if you're updating with apt, make sure it's up-to-date with the following command:
apt-get install apt
I use apt-get install and not apt-get upgrade, so I don't have to worry about pending updates to other packages. If the package is already installed, it is automatically upgraded.
If you're running apt, you'll need to make sure the local cache of packages is up-to-date:
apt-get update
Finally, apply the update command that you need, such as the following:
apt-get dist-upgrade
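Putting these steps together, here's a sketch of a weekly update script; the -y switch (which answers prompts automatically) is my addition for unattended runs, and the filename is just an example:
#!/bin/sh
# /etc/cron.weekly/auto-update: keep a Debian system current unattended
apt-get -y install apt       # make sure the update tool itself is current
apt-get update               # refresh the local cache of package lists
apt-get -y dist-upgrade      # apply all pending updates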
As distributions evolve, developers make changes. Sometimes, the developers behind a distribution choose to drop services. Sometimes the service that you're most comfortable with was never built for your distribution. Sometimes people convert from distributions or allied systems, such as HP-UX or Sun Solaris, where different services are available. In any of these situations, you'll have to look beyond the distribution repositories to install the service you want.
For example, while the WU-FTP server is the default on Sun Solaris 10, it has been dropped from Red Hat and Fedora distributions. It isn't even available in the Fedora Extras repository. Nevertheless, if a company is converting from Solaris to Red Hat Linux, the administrators would naturally look to install WU-FTP on Red Hat Enterprise Linux. (In my opinion, that would be a mistake, but we'll explore that issue in more detail in this annoyance.)
The developers behind your favorite service may have built what you want for your distribution. If they have, that is your best option, as it ensures that:
Configuration files are installed in appropriate locations.
The package becomes part of your database.
The developers are motivated to help you if there are distribution-specific problems.
If the developers behind a service have built their software, and have customized a package for a specific distribution, they have an interest in making sure that it works on that distribution.
However, if the service is not built for your distribution, don't immediately try building or compiling the service from its source code. While that might be your best option (especially if you're customizing the service), I believe there are alternatives that should be explored first.
One of the joys associated with open source software is choice. Rarely is there only one option for a service. For example, there are a wide variety of FTP servers that you can install on Linux systems. They include ProFTP, vsFTP, Muddleftp, glFTP, Pure-FTP, and WU-FTP. I've left out a few, including those built on Java.
But if it's a major service, your distribution should have at least one natively configured option for that service. For example, Red Hat Enterprise Linux includes vsFTP as the only FTP server. That's quite an expression of faith from the leading Linux distribution, enough to make many geeks take a closer look at vsFTP.
You can also explore alternative software for your service. You may be able to find alternatives in the Linux application libraries, described in "So Many Options for Applications" in Chapter 4. You may be able to find other options in the third-party repositories described in the next section. You may also be able to find alternatives online, perhaps with a search through Wikipedia (http://www.wikipedia.org) or Google.
In other words, if you find that your preferred server software is not available for your distribution, you should look for alternatives. That means:
Trying the software provided by your distribution for the service
Looking for alternatives from third parties who may have built similar software for the desired service
Examining other alternatives that can be installed on your system
If you can't find the software you want included with your distribution, you can look to third parties to help. These developers generally take the source code from original developers and build appropriate RPM or DEB packages suitable for the distribution of your choice.
There are a number of third-party repositories available for Linux distributions. They generally include software not available from the main repositories. For example, in the "I Need a Movie Viewer" annoyance in Chapter 4, I described some third-party repositories that included the libdvdcss package needed to view commercial DVDs.
The drawback of a third-party repository is that its packages may not be fully tested, especially with respect to the libraries that you might install on your distribution. In fact, there are reports of geeks who have run into incompatible libraries when they use more than one third-party repository.
You can get direct access to a third-party repository through your update software. Specifically, you can point yum, apt, and YaST systems directly to the appropriate URLs for the third-party repositories of your choice.
Generally, third-party repositories include instructions on how to include them in your update software and/or configuration files. For our preferred distributions, you can find a list of third-party repositories in the following locations.
Individuals within the Fedora project help integrate connections with a number of third-party repositories. While the focus is on Fedora Core, most of these repositories include separate URLs you can use for Red Hat Enterprise Linux (as well as rebuild distributions based on Red Hat Enterprise Linux source code). Instructions are usually available on the web page for each third-party repository. As of this writing, the status for the major Fedora repositories can be found online at http://fedoranews.org.
SUSE has traditionally included a lot of extra software with its DVDs. And more is available from third parties. Several are listed for your information at http://en.opensuse.org/YaST_package_repository. You can include them as an installation source in YaST. However, SUSE warns that "YaST fully trusts installation sources and does not perform any kind of authenticity verification on the contained packages." In other words, SUSE's third-party repositories might not include a GPG key, as you see with Fedora's repositories.
Many third-party repositories for SUSE distributions do have GPG keys. One central location for many of these repositories can be found at ftp://ftp.gwdg.de/pub/linux/misc/apt4rpm/rpmkeys.
The repositories associated with Debian Linux are extensive, which is natural for a community-based distribution. Be careful with the list at http://www.apt-get.org/main/; many of the repositories are dedicated to specific versions of Debian such as Potato, which has been obsolete since 2002.
By their very nature, these lists of third-party repositories may not be complete. And as the developers behind these repositories may not coordinate their efforts, including more than one third-party repository on your Linux system may lead to unpredictable results.
If a package used to be built for your distribution, it may still work for the newer version. For example, if you absolutely need the WU-FTP server on Red Hat Enterprise Linux (RHEL) 4, there are ways to get old versions.
For the purpose of this annoyance, I installed the latest available version of WU-FTP built for Red Hat. It's available from the Fedora Legacy project, at http://fedoralegacy.org, from the updates repository associated with Red Hat Linux 7.3. When I tried to install it on RHEL 4, I got a message suggesting that I install the OpenSSL toolkit, which addresses the security vulnerabilities associated with WU-FTP, at least as of its release in 2004.
Because of the security issues associated with it, I do not recommend WU-FTP. However, it may be helpful in a transition from a different operating system where WU-FTP is the default, such as Solaris. The security issues can be managed behind firewalls until your transition is complete.
Once the appropriate packages were installed, I was able to get WU-FTP running on RHEL 4. While using old versions is not recommended as a general solution, the installation of familiar software and services can ease transitions, even for organizations moving just from one version of Linux (or Unix) to another.
In some cases, the appropriate service is available as a source code package, customized for the desired distribution. This option is most common for the "rebuild" distributions associated with RHEL.
For RHEL, Red Hat complies with the GNU General Public License by releasing its source code. As Red Hat has released the source code in Source RPM packages, you can try to install those packages on any RPM-based distribution. These packages are publicly available from Red Hat at ftp.redhat.com, in the pub/redhat/linux/enterprise/4/en/os/i386/SRPMS/ subdirectory.
If you're running RHEL Workstation, you don't have the server packages included with the RHEL Server distributions. One example is the vsFTP server. It goes almost without saying that if you install a package available only on RHEL Server on a RHEL Workstation, you should not expect support for that package from Red Hat. I've downloaded the RHEL 4 Source RPM for the vsFTP server on my RHEL Workstation. Once downloaded, I can install it using the following steps:
Run the following command to unpack the source code from the vsFTP server to the /usr/src/redhat directory:
rpm -ivh vsftpd-2.0.1-5.src.rpm
The source code is unpacked to a .spec file in the /usr/src/redhat/SPECS directory, as well as various source and patch files in the /usr/src/redhat/SOURCES directory.
Navigate to the directory containing the .spec file.
Build the binary RPM (as well as source information) with the following command:
rpmbuild -bb vsftpd.spec
In this particular case, the .spec file creates two binary RPMs and stores them in the /usr/src/redhat/RPMS/i386 directory.
Install the binary RPMs just like any other Red Hat RPMs.
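For example, the following command would install both packages in one step; the exact filenames depend on the versions you built:
rpm -ivh /usr/src/redhat/RPMS/i386/vsftpd-*.rpm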
This process doesn't always work. For example, you generally can't use the kernel source code released by Red Hat on a RHEL rebuild distribution, because the rebuild was created by a different team of developers, using different build tools.
You can always install a Linux service (or any other Linux software) from the original source code. Generally, it's available only as a compressed tar archive. Once you download the archive, you'll want to decompress it. The command you use depends on the compression format, which is normally associated with the archive extension. For example, if the archive has a .tar.gz or .tgz extension, such as archive.tar.gz, you can decompress it with the following command:
tar xzvf archive.tar.gz
Alternatively, archives with a .tar.bz or .tar.bz2 extension can be decompressed with the tar xjvf command. Normally, archived files packaged for a service are decompressed to a separate subdirectory, with the name of the archive.
The methods for installing from source code vary widely. Detailed instructions are normally made available in a text file in the decompressed archive.
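That said, many (though by no means all) projects follow the familiar GNU build convention. Here's a sketch of that common case, using the hypothetical archive name from above; always check the project's README or INSTALL file first:
tar xzvf archive.tar.gz     # unpacks to an archive/ subdirectory
cd archive
./configure                 # adapt the build to your system (if the script exists)
make                        # compile the software
make install                # install it; normally requires root privileges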
One popular use for Linux is as a gateway between networks. The software associated with the gateway is fairly simple. In fact, it can be loaded from permanent media, such as a CD. That technique keeps crackers who break into the gateway from permanently modifying it and thus defeating the security barrier, or firewall, commonly configured between networks.
Configuring a Linux gateway normally requires three basic administrative steps:
Configuring your system to forward IP traffic.
Setting up masquerading.
Creating a firewall between networks.
The only thing you absolutely need to do is configure IP forwarding. It is disabled by default. For this annoyance, I assume you're configuring a computer with two network cards, and each card is connected to a different network.
There are many excellent firewall configuration tools, but this annoyance shows you how to configure the system by hand. If you use the tools, you'll overwrite the configuration files that you may create as you review this annoyance.
Linux normally disables IP forwarding between network cards, and it is disabled in the default configurations of our preferred distributions. The way you activate IP forwarding depends on whether you've configured an IPv4 or IPv6 network.
Here, I assume that your system supports the /proc filesystem with kernel settings, along with the sysctl program to access kernel switches. Your system meets these requirements if you have a /proc directory and an /etc/sysctl.conf file.
If there are problems, you'll want to make sure the appropriate settings are active in your kernel. Specifically, you should see the following settings in the active config-* file in the /boot directory:
CONFIG_PROC_FS=y
CONFIG_SYSCTL=y
If these settings don't reflect what you need, you can't just edit this configuration file. In that case, you'll need to recompile the kernel, as described in the "Recompiling the Kernel" annoyance in Chapter 7.
To activate forwarding on an IPv4 network, you'll need to toggle the ip_forward setting in the appropriate kernel configuration directory. The simplest way to do so is with the following command:
echo "1" > /proc/sys/net/ipv4/ip_forward
To make sure forwarding is turned on the next time you boot your computer, open /etc/sysctl.conf and add the following directive:
net.ipv4.ip_forward = 1
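If you prefer not to echo values into /proc directly, the sysctl command makes the same change, and sysctl -p rereads /etc/sysctl.conf so you can test your edit immediately:
sysctl -w net.ipv4.ip_forward=1     # apply the setting right now
sysctl -p                           # reread /etc/sysctl.conf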
To activate forwarding on an IPv6 network, you'll need to toggle the forwarding setting in the appropriate kernel configuration directory. The simplest way to do so is with the following command:
echo "1" > /proc/sys/net/ipv6/conf/all/forwarding
To make sure forwarding is turned on the next time you boot your computer, open /etc/sysctl.conf and add the following directive:
net.ipv6.conf.all.forwarding = 1
This assumes you've installed all other components required to configure an IPv6 network. For more information, see the related HOWTO written by Peter Bieringer at http://www.tldp.org/HOWTO/Linux+IPv6-HOWTO/.
When you have one IP address on the Internet for your network, you need a way to share it with all the computers on your network. The standard approach is IP masquerading. Once configured, your gateway substitutes the IP address of the network interface card it uses to reach the Internet for the address of any computer on your network that requests data from the Internet.
Naturally, IP masquerading assumes you've activated IP forwarding, as I described in the previous section.
The current standard for configuring IP address translation on a gateway is iptables, the same command used to erect firewalls. Here you use it to alter network packets with Network Address Translation, specifically with the iptables -t nat command.
As an example, if your Internet connection uses a device named wlan0 and your LAN uses IP addresses on the 10.11.12.0/16 private network, the command you need is:
iptables -t nat -A POSTROUTING -s 10.11.12.0/16 -o wlan0 -j MASQUERADE
As described earlier, this command uses Network Address Translation. It adds (-A) the rule to the end of the iptables chain. It modifies network packets as they leave the network (POSTROUTING). It specifies (-s) source IP addresses to be those from your LAN (10.11.12.0/16). It points to wlan0 as the output interface (-o). For all data that meets these standards, computers on your LAN MASQUERADE on the external network with the IP address assigned to wlan0.
To save this command, you'll need to run iptables-save and send the result to a file with a command such as:
iptables-save >> firewall
You could save the iptables commands to the standard configuration file for the distribution, but that would risk conflicts with settings written by tools such as Red Hat's Firewall Configuration tool. If you want to make these commands part of your firewall, you'll have to modify those files manually.
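To reapply the saved rules later, perhaps from a local startup script, feed the file back to iptables-restore. This sketch assumes the rules were saved to a file named firewall, as above:
iptables-restore < firewall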
Detailed instructions on creating a firewall are beyond the scope of this book. However, the gateway between networks is the best place to create a firewall, so I'll mention some of the considerations for doing so.
Both Red Hat/Fedora and SUSE Linux have their own firewall configuration tools. These tools are excellent and can be used to create a fairly simple firewall. You can build upon the firewall created by these tools as needed.
You can start the standard Red Hat/Fedora Firewall Configuration tool with the system-config-securitylevel command. Results are saved to /etc/sysconfig/iptables.
You can open the SUSE firewall tool in YaST. Results are saved to /etc/sysconfig/SUSEfirewall2.
There is no standard firewall tool available for Debian. However, there are a substantial number of available options, including several excellent GUI tools.
In addition, a number of third-party firewall generators are available online. As is standard with open source software, neither I nor O'Reilly endorses any of these systems (or anything else in the book).
For more information, see the related annoyance "My Firewall Blocks My Internet Access," in Chapter 8.
There are two reasons why you may want remote access. First, the computer you want to use may be too far away. Second, the computer, as with many servers, may not even have a monitor.
There are several ways to configure remote access to a Linux server. As described in Chapter 9, in the "Users Are Still Demanding Telnet" annoyance, Telnet is one method. While Telnet is insecure, I described methods you can use to encrypt and further secure Telnet communications in that chapter.
Perhaps the best way to configure secure access to a remote Linux system is through the Secure Shell (SSH). Connections through SSH are encrypted. You can even set up encryption keys and password phrases that are not transferred during logins. As described in the next annoyance, you can even use SSH to access GUI applications remotely.
What I describe in this annoyance just covers the basics associated with creating an SSH server and connection. For more information, see SSH, The Secure Shell: The Definitive Guide by Daniel J. Barrett et al. (O'Reilly).
Security is provided through the Secure Shell, and access can be configured through the appropriate SSH configuration file. You'll find two configuration files in the /etc/ssh directory, sshd_config and ssh_config. You can configure both files: sshd_config on the server, and ssh_config on each client. You can also use some of the switches available with the ssh command or customize a client for an individual user with a file in the appropriate home directory.
One possible security issue with SSH is related to user keys, which are stored in ~/.ssh/ under their home directories. If your workstations use NFS to mount home directories from a central server, your key files will be transmitted over the network in the clear. Anyone who intercepts this traffic can eventually decrypt those keys. If this describes your configuration, consult SSH, The Secure Shell: The Definitive Guide by Daniel J. Barrett et al. (O'Reilly) for an alternative configuration.
Generally, when you configure SSH, it's mostly done on the server. Any configuration you do on the client, through /etc/ssh/ssh_config, is secondary.
After you make any changes to the configuration files, remember that you'll have to restart the SSH server. On Debian Linux, you can do so with the following command:
/etc/init.d/ssh restart
On SUSE and Red Hat/Fedora Linux, the command is slightly different:
/etc/init.d/sshd restart
The SSH server configuration file, /etc/ssh/sshd_config, allows access by all users by default. You can limit access by user, by group, and by network. If you're supporting access through a firewall, you'll need to provide appropriate access through that barrier.
You can limit access by user with the AllowUsers directive. If there is no such directive in the /etc/ssh/sshd_config configuration file, all users are allowed on the SSH server (unless otherwise prohibited via Pluggable Authentication Modules, as described in Chapter 10).
For example, if I want to allow only donna to access this server via SSH, I can add the following directive:
AllowUsers donna
You can add AllowUsers directives for all users for whom you want to authorize access via SSH. For example, I could add the following directives to limit access to three users:
AllowUsers donna
AllowUsers nancy
AllowUsers randy
Alternatively, you can use the DenyUsers directive to prohibit access to certain accounts.
You may want to deny access to the most privileged user. This requires a different directive:
PermitRootLogin no
SSH allows root logins by default. So if you want to minimize the risk to the administrative account, this directive is important.
You can further refine the AllowUsers directive. For example, you can limit access from users on the remote computer named enterprise4a to donna's account:
AllowUsers donna@enterprise4a
Don't let the @ confuse you. This directive does not specify an email address. It specifies a local account and a remote computer from where users are allowed to log in to that account. You can substitute an FQDN for enterprise4a.
Some wildcards are supported. For instance, if you want to support access from the 192.168.0.0/24 network to all local accounts, use the following directive:
AllowUsers *@192.168.0.*
Just as the AllowUsers and DenyUsers directives can help you regulate access via SSH to accounts on the local server, the AllowGroups and DenyGroups directives can do the same, based on group accounts as defined in /etc/group.
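Pulling these directives together, an /etc/ssh/sshd_config excerpt might look like the following sketch. The user, host, network, and group names are examples only; also note that if you specify both AllowUsers and AllowGroups, a login must satisfy both tests, so most configurations stick to one or the other:
PermitRootLogin no
AllowUsers donna@enterprise4a nancy *@192.168.0.*
#AllowGroups admins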
If you have a firewall between desired SSH clients and servers, you'll need to make sure that the firewall allows SSH connections. For your convenience, allowing SSH connections is a standard option with the Red Hat/Fedora and SUSE firewall configuration tools. If you're configuring your firewall manually, you'll have to make sure to allow TCP connections through port 22.
Sending passwords over a network can be a problem. While SSH communications are encrypted, if a cracker can determine when you send your password and intercept it over your network, he can eventually decrypt it.
The SSH system supports the use of passphrases, which can be more complex than regular passwords (you can even use complete sentences such as "I live 40 feet from the North Pole."). Commands such as ssh-keygen allow you to create a private and public key based on the passphrase. The standard is 1024-bit encryption, which makes the passphrase (or the associated keys) much more difficult to crack.
Once the public key is transferred to the remote system, you'll be able to use SSH to log in to the remote system. The passphrase activates the private key. If matched to the public key on the remote system, an SSH login is allowed.
Create and transfer the private and public keys as follows:
Choose an encryption algorithm (I've arbitrarily selected DSA) and generate a private and public key in your home directory (I use /home/michael/.ssh here) with a command like:
ssh-keygen -t dsa -b 1024 -f /home/michael/.ssh/enterprise-key
When prompted, enter a passphrase. Passphrases are different from standard passwords. They can include whole sentences, such as:
I like O'Reilly's ice cream
This particular ssh-keygen command generates two keys, putting them in the enterprise-key and enterprise-key.pub files in the /home/michael/.ssh/ directory. You can (and probably should) choose a passphrase that differs from your regular login password.
Next, transmit the public key that you've created to the remote computer. The following command uses the Secure Copy command (scp) to copy the file to donna's home directory on the computer named debian:
scp .ssh/enterprise-key.pub donna@debian:/home/donna/
Now log in to donna's account on the remote computer. Assuming the Secure Shell service is enabled on debian, you can do so with the following command:
ssh donna@debian
You'll have to specify donna's password because you have not yet set up passphrase protection. You should now be in donna's home directory, /home/donna, on the debian computer.
If it doesn't already exist, you'll need to create an .ssh/ subdirectory. You'll also want to make sure it has appropriate permissions with the following commands:
mkdir /home/donna/.ssh
chmod 700 /home/donna/.ssh
Create the authorized_keys file in the .ssh/ subdirectory:
touch .ssh/authorized_keys
Now take the contents of the public SSH key that you created and put it in the authorized_keys file:
cat enterprise-key.pub >> .ssh/authorized_keys
Note that I used the append sign (>>) because I want to keep all previous keys that might be in the file; it can contain the keys for all the remote hosts from which you want to log in.
Log out of donna's account. The next time you log in, you'll be prompted for the passphrase as follows.
Enter passphrase for key '/home/michael/.ssh/enterprise-key':
Now you can connect securely, using SSH, without having to enter your password or a password on the remote system. With the other measures described earlier in this annoyance, you can also protect your SSH server by user, protect it by group, make sure SSH communications come from a permitted network, and allow SSH through firewalls.
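One detail worth noting: because the key pair created above doesn't use the default filename (id_dsa), you may need to point the ssh client at it explicitly, either with the -i switch or with an IdentityFile line in ~/.ssh/config:
ssh -i /home/michael/.ssh/enterprise-key donna@debian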
The first time you use SSH to log in to a remote system, you may see the following message, which means the remote host's public key is not yet in your known_hosts file:
The authenticity of host 'debian (10.168.0.15)' can't be established.
RSA key fingerprint is 18:d2:73:ec:53:ce:52:4f:2d:43:55:fb:0c:14:49:1e.
Are you sure you want to continue connecting (yes/no)?
Once you enter yes, you'll see the following message:
Warning: Permanently added 'debian,10.168.0.15' (RSA) to the list of known hosts.
Then you're prompted for the password on the remote system.
If you've configured passphrase-protected keys as described earlier, you'll be prompted for the passphrase rather than the remote account's password.
In either case, the remote system sends your client a public key, which is added to the user's ~/.ssh/known_hosts file. If the name or IP address of the remote system changes, you'll see an error, which you can address only by editing or deleting the known_hosts file.
Sometimes you need to run a GUI application but can't get to your computer. You may want to support users who need remote access to their applications.
I'll assume that you've already set up Secure Shell (SSH) or VNC clients for these users. In this annoyance, I'll show you how you can configure secure remote access to your GUI applications. While you can use VNC, SSH is preferred, as it provides strong encryption, making it more difficult for a cracker to track your keystrokes. An SSH configuration means that you're networking only the GUI application that you happen to be running remotely, as opposed to a whole GUI desktop environment.
There are relatively secure versions of VNC available; you can even tunnel VNC through an SSH connection. For more information on the wide variety of VNC servers and clients, Wikipedia provides an excellent starting point at http://en.wikipedia.org/wiki/VNC. If you don't like VNC, explore the increasingly popular FreeNX (which uses SSH) at http://freenx.berlios.de/.
If you absolutely need remote access for GUI applications, keep it behind a firewall. If at all possible, don't open the firewall to external clients on the SSH ports. If you do, use the directives described in the following sections (and the previous annoyance) to minimize your risks.
The configuration file for the SSH server is /etc/ssh/sshd_config. While it offers a substantial number of directives, most of the defaults configured on our target distributions don't need to be changed for SSH to work. However, these defaults may not be secure. Depending on your distribution, you may need to make a few changes. I suggest you pay particular attention to the following directives:
X11Forwarding yes
As the object of this annoyance is how to safely configure remote access of GUI applications, I assume you'll use this directive to enable remote access.
Protocol 2
Specifies the use of the SSH2 protocol, which is currently being maintained and updated for any security problems. Without this directive, the SSH server can also take logins from SSH1 clients, which are less secure.
ListenAddress
Allows you to specify the IP address of the network card that takes SSH connections, such as ListenAddress 192.168.0.12. This assumes you have more than one network card on this computer.
LoginGraceTime
Helps thwart crackers who try to break into an account by guessing passwords. The default is 120 seconds, after which the connection is dropped if the login has not completed. I would set a shorter period, such as LoginGraceTime 30.
PermitRootLogin no
The default is yes. In my opinion, you should never permit logins by the root user. Even if encrypted, root logins are a risk. If the login is intercepted, the root password may eventually be decrypted. In contrast, if you use the su or sudo commands after logging in via SSH, it's much more difficult for a cracker to determine which bits contain the root password.
Alternatively, you can create encryption keys as described in the previous annoyance. Once configured, SSH login passwords don't get sent over the network.
AllowUsers
By default, all users are allowed to log in via SSH. It's best to limit this as much as possible. You can limit logins by users, or even by users on specific systems. For example, if you wanted to limit SSH access to two users, you might use one of the following directives:
AllowUsers michael donna
AllowUsers michael@debian.example.com donna@suse.example.com
In the second directive, SSH logins to the local michael account are allowed only from the remote debian.example.com system, and logins to the local donna account only from suse.example.com.
After saving changes to the SSH server configuration file, you'll need to restart the associated daemon. The name of the daemon may vary slightly by distribution; you can use the following command for Red Hat/Fedora and SUSE Linux:
/etc/init.d/sshd restart
The appropriate command on Debian Linux is slightly different:
/etc/init.d/ssh restart
There are three ways to configure the SSH client to support networking of GUI tools and applications:
Directly, via switches and options to the ssh connection command
For all users on a client, via the /etc/ssh/ssh_config file
For a single user on a client, via the ~/.ssh/config file
By default, any authorized user can log in to an SSH server, specifying access to GUI applications with the -X switch, e.g.:
ssh -X michael@debian.example.com
But GUI access may not be secure. The most secure approach is to limit X access for all users on a client and then enable it for only the desired users. To do so, open /etc/ssh/ssh_config and set the following directives:
ForwardX11Trusted no
The default for this directive varies by distribution. This setting minimizes risks to other clients.
ForwardX11 no
Although this should be disabled by default on all Linux distributions, it doesn't hurt to make sure.
Next, in the ~/.ssh/config file for the user that you want to authorize, include:
ForwardX11 yes
This directive supersedes any default settings in the /etc/ssh/ssh_config file, and allows remote GUI access to the applications of that user's choice.
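If you'd rather enable forwarding only toward a particular server instead of globally for that user, you can scope the directive with a Host block in the same ~/.ssh/config file; the hostname here is just an example:
Host debian.example.com
    ForwardX11 yes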
Once configured, you can access remote GUI applications from the command line. To this end, you'll need to know the text commands that start GUI applications, such as /usr/bin/oowriter. Unless you're running a network with gigabit-level speeds, expect a bit of a delay as the application opens (and as its display refreshes on your workstation).
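For example, assuming michael has an account on debian.example.com and OpenOffice.org Writer is installed there, the following starts the remote application in a window on the local display (the -X switch is unnecessary if ForwardX11 is already enabled in the client configuration):
ssh -X michael@debian.example.com /usr/bin/oowriter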
I believe it's helpful for any Linux user to review her own logs on a regular basis. Familiarity can help any geek learn the value of logs. For one thing, logs record failed logins, which may indicate that a cracker is trying to break into your system. But logs can do much more. For example, web logs can give you a feel for where your customers are coming from geographically; which links associated with web ads they click; how long they stay on your web site; and more.
As a Linux administrator, chances are good that you're administering a substantial number of Linux computers. It may be useful to consolidate the logs on a single system. If a server goes down, you'll have the logs from that server available on the central log server. When there are problems, such as "critical" error messages, you may want an email sent to your address. You may need tools to help you go through all of these logs.
Logs on our selected distributions are governed by the system and kernel log daemons. While Red Hat/Fedora and Debian combine these daemons into a single package (sysklogd), SUSE includes them in separate packages (syslogd and klogd). While there are minor variations in how they're configured, they're all governed by a configuration file in the same location, /etc/syslog.conf.
If you want to dedicate a specific system as your central log server, first make sure you have enough space on that system. It may help to configure logs, such as those in the /var/log directory, on a separate partition so that they can't fill up critical system partitions if they get too big. For more information, see the next annoyance.
On the system that you're configuring as a central log server, you'll have to configure the system log daemon (syslogd) to accept remote connections. The simplest way to do so is to stop the daemon and then start it again with the -rm 0 switches (-r accepts log messages from remote systems; -m 0 disables the periodic -- MARK -- timestamp entries). The way you implement this varies slightly by distribution:
The Red Hat/Fedora distributions let you configure switches for the system log daemon in /etc/sysconfig/syslog. The key directive is SYSLOGD_OPTIONS. To support remote log reception, change this directive to:
SYSLOGD_OPTIONS="-rm 0"
SUSE Linux handles standard options for the system log daemon in a similar fashion. The daemon log options are listed in /etc/sysconfig/syslog. To support remote log reception, change the SYSLOGD_PARAMS directive to:
SYSLOGD_PARAMS="-rm 0"
Debian Linux does not provide any /etc/sysconfig files for daemon configuration. However, you can configure the system log daemon directly in the associated start script, /etc/init.d/sysklogd. To support remote log reception, change the SYSLOGD directive to:
SYSLOGD="-rm 0"
Once you've made the configuration changes, you can implement them by restarting the system log daemon on each computer with the following command:
/etc/init.d/syslog restart
On Debian Linux, the script's location is slightly different:
/etc/init.d/sysklogd restart
Naturally, if there's a firewall between the log server and log clients, you'll need to make provisions in that firewall to allow traffic through UDP port 514. As you can see in /etc/services, that's the standard port for system log communications. To make sure your system log service now receives from remote computers, check your /var/log/syslog (or, if that file doesn't exist, /var/log/messages) for the following entry (the version number may vary):
syslogd 1.4.1: restart (remote reception).
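If your firewall is based on iptables, a rule along the following lines would accept system log traffic from a local network; the 192.168.0.0/24 subnet is only an example, so substitute your own:
iptables -A INPUT -p udp --dport 514 -s 192.168.0.0/24 -j ACCEPT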
Now that the central server is ready, you can configure your other Linux systems to send copies of their logs to that computer. The log configuration file on each of our preferred distributions is /etc/syslog.conf, and the key directive is straightforward. If you want a copy of all logs sent to the logmaster.example.com computer, all you need in that file is:
*.* @logmaster
Unfortunately, the system log service can't handle fully qualified domain names, so logs on the central server from your remote systems will be tagged only with the short hostnames.
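The short name does need to resolve on each client, though. One simple way to guarantee that is an /etc/hosts entry on every log client; the address shown here is hypothetical:
192.168.0.10   logmaster.example.com   logmaster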
If you're just concerned with kernel-based issues, to help diagnose shutdowns, you can send just the kernel messages to the remote log server:
kern.* @logmaster
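Remember that these remote entries are additions, not replacements; the local rules in /etc/syslog.conf keep working. As a sketch, a client could retain its distribution's usual local rule and add a remote copy of kernel messages (the local rule shown is typical of Red Hat/Fedora and may differ on your system):
*.info;mail.none;authpriv.none   /var/log/messages
kern.*                           @logmaster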
There are many excellent tools for monitoring logfiles. Many geeks even create their own scripts for this purpose. One excellent source for different monitoring tools and scripts is Automating Unix and Linux Administration by Kirk Bauer (Apress).
One of the major standards for log monitoring is known as Logwatch. It's available from both the Debian and Red Hat/Fedora Linux repositories. A logwatch RPM that works on SUSE is available from the Logwatch home page at http://www.logwatch.org.
Logwatch is organized into three groups of files. The overall configuration file is logwatch.conf. Configuration files for many individual services are organized in a services/ subdirectory. The logfiles that Logwatch parses are defined in groups, based on configuration files in a logfiles/ subdirectory. The actual directory varies by distribution, or by the release that you may have installed from http://www.logwatch.org.
As the directories associated with Logwatch vary so widely by distribution, I generally do not use full directory paths in this annoyance. If you're uncertain about the location of a file, you'll have to do your own searching with commands such as locate, rpm -ql logwatch, or dpkg -L logwatch.
Before I show you how to configure the basic Logwatch configuration file, I need to review its location on your system. If you've downloaded the latest version from http://www.logwatch.org, you'll need to make sure key settings are compatible with the scripts and configuration directories for your distribution.
The standard Logwatch configuration file is logwatch.conf. You can find it in the /etc/log.d/conf or /etc/logwatch/conf directories. As described earlier, there is no standard SUSE Logwatch package.
If you've downloaded the latest version from the Logwatch home page, you'll find key configuration files in different locations. The logwatch.conf configuration file (as well as default services and configuration logfiles) is stored in /usr/share/logwatch/default.conf; detailed configuration changes can be added to files in the /etc/logwatch/conf directory.
Administrators are now encouraged to add changes to Logwatch settings to override.conf and patterns that Logwatch should drop to ignore.conf. But those are advanced settings beyond what I can cover in this annoyance. Refer to the Logwatch web site for the latest information.
Logwatch's standard directives include:
LogDir
The standard for logging directories on our preferred distributions is /var/log.
MailTo
While the default is MailTo = root, you're free to change this to the email address of your choice, assuming you have a working outgoing email server on this system.
Print
The Print directive is unrelated to printers; it determines whether reports are sent to standard output, which is normally the screen. The usual default is Print = no. Change this directive to yes to view output in real time on your console.
TmpDir
UseMkTemp
MkTemp
These three directives all configure the use of temporary files. By default, the TmpDir directive points to the /tmp directory. In the latest version of Logwatch, this directive points to the /var/cache/logwatch directory.
If your TmpDir is /tmp, make sure the UseMkTemp directive is active. This uses the MkTemp directive to point to the mktemp utility for changing the name and permissions of temporary logfiles to keep them secure while they're stored in the /tmp directory.
If you've activated UseMkTemp, you need to point the MkTemp directive to the full path of the mktemp utility, normally /bin/mktemp.
Range
The Range directive specifies the timeline for the report. The standard is Range = yesterday; it's consistent with a log report, processed by the cron daemon, sometime after midnight.
Detail
The Detail directive associated with a report specifies the amount of information you get. Detail = Low limits information to security and other critical service issues. A High level of Detail creates very verbose reports, especially if you're collecting information from multiple computers.
Service
The Service directive gives you an opportunity to limit the services on which Logwatch prepares reports. While the default is Service = All, you can specify individual services with a directive such as Service = pam, or specify all except an individual service with two directives, such as:
Service = All
Service = -ftpd-messages
Mailer
The Mailer directive specifies the command-line utility associated with text emails. Depending on your distribution, it should be set either to /usr/bin/mail or /bin/mail.
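Putting several of these directives together, a minimal logwatch.conf for a central log server might look like the following sketch; the email address is a placeholder, and the TmpDir value assumes a recent Logwatch release:
LogDir = /var/log
TmpDir = /var/cache/logwatch
MailTo = admin@example.com
Mailer = /usr/bin/mail
Print = no
Range = yesterday
Detail = Low
Service = All
Service = -ftpd-messages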
The service configuration files associated with Logwatch are stored in a services/ subdirectory. While the list of files may seem extensive, don't worry about configuring each file. The defaults are generally fine, unless you want to specify a special file group. One example where you may want to specify a special group is with the Clam AntiVirus software (www.clamav.net). The following is based on the package downloaded from the Clam AV web site.
These configuration files differ from those installed with specific services. For example, the clamav.conf file cited in this section, on a Debian system, is in the /etc/logwatch/conf/services directory. It configures Clam AntiVirus software interactions with the logwatch system. It is not a substitute for the main Clam AntiVirus configuration file, normally /etc/clamav/clamd.conf.
By default, the following directive in the clamav.conf file (along with the LogDir = /var/log directive in logwatch.conf) sends logs from this service to the standard /var/log/messages:
LogFile = messages
As it may be inconvenient to have so much traffic in /var/log/messages, you could send logs to a different file, such as /var/log/clam-update, with the following directive:
LogFile = clam-update
Logs can grow quickly, especially for services with a lot of activity. Logs for commercial web sites can easily add several hundred megabytes of files every day.
Unmanaged, this kind of growth can overwhelm your system, taking space needed by your users, consuming the free space required to run a GUI, and, in the worst case, making it impossible to boot your system.
If your logs grow quickly, you should consider creating dedicated partitions. Even with dedicated partitions and search scripts of dazzling sophistication, it can take quite a while to search through large logfiles for the data you may need. Therefore, you may consider configuring your system to start new logfiles more often, perhaps daily or even hourly.
Even large, dedicated partitions may not be good enough. The demands of logfiles can grow so large that you may need to move logfiles to different systems.
The associated cron jobs are run in alphabetical order; files starting with numbers come first. For example, the 00logwatch script in /etc/cron.daily is run before others.
If you have more than one log-management service installed, such as logrotate or logwatch, the associated jobs may not be fully compatible.
Logs can become quite large, and can easily grow by hundreds of megabytes of space (or more) every day. There are two basic options in this regard:
With a dedicated log partition, the space taken by a service or kernel log doesn't overwhelm the space required to run a Linux system. If you use a standard Linux distribution, the way to set this up is to mount the /var/log directory on a dedicated partition.
Even if you configure a dedicated server to collect logs from other Linux systems, I still recommend a dedicated log partition.
For most organizations, the data associated with logs isn't nearly as critical as, say, that associated with user home directories. Because logs grow quickly, one method to manage this growth is a RAID 0 volume with daily backups.
RAID 0 is the fastest of the standard RAID levels for large files and may be suitable for a log server. With appropriate controllers, it allows you to add more disks as logs grow.
Your management may have different feelings about the importance of logfiles. Perhaps you'll want to protect them as evidence, to help you track the activity of certain users, to establish patterns of visits to your web sites, or possibly even as evidence usable in a court of law. If logfiles are that important, you may want to use a more robust data storage system, such as RAID 5, or even back them up to stable archives such as DVDs.
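If you do choose RAID 0, a sketch of the setup with the mdadm utility might look like the following; the device names are hypothetical, and you'd still want the daily backups mentioned earlier:
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb1 /dev/sdc1
mkfs.ext3 /dev/md0
mount /dev/md0 /var/log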
Log rotation means starting new files to contain incoming log messages, so that old logs can easily be backed up. Rotation also involves removing old messages to return needed disk space to the system. In Linux, log rotation is configured through the logrotate configuration files. In our preferred distributions, the master file is /etc/logrotate.conf, supplemented by per-service files in the /etc/logrotate.d directory. To understand how this process works, it's useful to analyze it in detail.
Every day, on a schedule defined by your /etc/crontab configuration file, Linux runs the /etc/cron.daily/logrotate script. It includes the following command, which runs the logrotate service, based on the settings in /etc/logrotate.conf:
/usr/sbin/logrotate /etc/logrotate.conf
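If you want to see what logrotate would do without actually rotating anything, you can run it by hand in debug mode; the -d switch reports its decisions without changing any files:
/usr/sbin/logrotate -d /etc/logrotate.conf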
To see how rotation works, we can analyze /etc/logrotate.conf. The first four default commands in this file are identical in our preferred distributions:
weekly
Logfiles are rotated by default on a weekly basis. You can set this to daily or monthly, or specify a maximum log size after which rotation occurs with a directive such as:
size=100k
rotate 4
Linux distributions normally store four weeks of backlogs for each service.
create
A new, empty file is created in place of the logfile that is now rotated.
include /etc/logrotate.d
Configuration files in the /etc/logrotate.d directory are included in the rotation process.
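For example, a small file in /etc/logrotate.d can apply its own schedule to a single log. The following sketch handles the /var/log/clam-update file suggested earlier in this chapter, rotating it daily, keeping seven compressed copies, and tolerating its absence; all of the directives shown are standard logrotate options:
/var/log/clam-update {
    daily
    rotate 7
    compress
    missingok
    create
}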
In some cases, the distribution developers have configured rotation of the wtmp and btmp login records in /var/log directly in this file, as they are not associated with any specific package, nor are they maintained by any of the /etc/logrotate.d configuration files.
If you add the following directive, you can enable compression of your logfiles, saving even more room:
compress
Compression still allows access by some dedicated logfile viewers and editors, including vim (through its gzip plugin). There are a substantial number of options available; SourceForge.net includes several hundred log-management suites, many of which can even search through directories of compressed logfiles.
Logs should normally be deleted automatically. However, if you see logs more than five weeks old, that suggests a problem with your logrotate script, or perhaps that your cron jobs aren't being run as scheduled.
For example, on my newer laptop computer, I haven't configured my winmodem to allow external logins by modem. I have no modem getty (mgetty) logs in my /var/log directory. When I run the daily cron logrotate script, I get a related error:
# /etc/cron.daily/logrotate
error: error accessing /var/log/mgetty: No such file or directory
error: mgetty:1 glob failed for /var/log/mgetty/*.log
There are several ways to address this issue. I could configure mgetty, but that would be a waste of time. I could delete the mgetty configuration file in /etc/logrotate.d, but that would cause more problems if I choose to configure it in the future. The option I chose was to create the /var/log/mgetty directory, as the root user. After creating that directory, I ran the logrotate script again, without errors.
I also ran touch .placeholder in that directory, to make sure the directory wouldn't get deleted at the next update.
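In other words, the entire fix amounted to the following commands, run as the root user; afterward, the script ran cleanly:
mkdir /var/log/mgetty
touch /var/log/mgetty/.placeholder
/etc/cron.daily/logrotate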
If you've made some of the changes suggested in "So Many Server Logs," earlier in this chapter, you may have already sent your logs to remote systems. In short, you'll need to configure the System Log daemon on the log server to receive remote logs, and configure the other computers to send their logs to that log server. See the previous annoyance for details.
The first time you run a job, it's helpful to do it manually. The more you do a job yourself, the more you learn about that job.
However, once you've run a job a few times, there's little more that you can learn about that job, as least in your current environment. At that point, it's best to automate the process. Linux already has a service that runs automated jobs on a regular basis, whether it be hourly, daily, weekly, or monthly.
Another reason why you want to automate tasks is so you can go home. With appropriate logs, you can make sure the job was properly executed when you return to work. Thus, you can configure a database job to run once per year, so you don't have to be at work on New Year's Eve.
Finally, when you administer a group of systems, the number of things you have to do can be overwhelming. Automation is often the only way to keep up with what you need to do. This is why you need to learn to manage the cron service.
It's easy to learn the workings of the cron service. Every Linux system includes numerous examples of cron jobs. The cron daemon wakes up every minute Linux is running, to see if there's a script scheduled to be run at that time.
Standard administrative jobs are run as scheduled in /etc/crontab. Red Hat and Debian configure this file in straightforward ways, with different command scripts for hourly, daily, weekly, and monthly jobs. The format starts with five time-based columns, followed by the user and the command:
minute / hour / day of month / month / day of week / user / command
Take a look at your own version of this file. While it varies by distribution, all use a variation of the same first two directives, SHELL and PATH.
SHELL=/bin/sh
SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
PATH=/sbin:/bin:/usr/sbin:/usr/bin
PATH=/usr/bin:/usr/sbin:/sbin:/bin:/usr/lib/news/bin
Both SHELL directives point to different names for the default bash shell. The PATH directives provide the baseline for other scripts executed from the embedded directories by the cron daemon. The simplest version of this script is associated with Red Hat/Fedora distributions:
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly
These directives point to the run-parts command, which runs the scripts in the noted directories, as the root user. While you could use the full path to the command (/usr/bin/run-parts), that's not necessary because /usr/bin is in the PATH, as cited at the beginning of this file.
In this case, hourly scripts are run at one minute past every hour, daily scripts are run at 4:02 A.M. every day, weekly scripts are run at 4:22 A.M. every Sunday, and monthly scripts are run on the first day of each month, at 4:42 A.M.
While Debian and SUSE run more complex versions of this script, the effect is essentially the same. On our preferred Linux distributions, the cron daemon runs the scripts in the /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly directories. Many scripts in these directories use the full path to all commands, despite the PATH directive in /etc/crontab.
You can create a cron job in any of the aforementioned directories, and it will be run at the intervals established in /etc/crontab. To help you understand how all this works, I'll create a yearly cron job, with the following steps:
As SUSE's /etc/crontab calls the /usr/lib/cron/run-crons script, the final steps here (in particular, the /etc/crontab modification) won't work in that distribution.
Log in as the root user. (Alternatively, if your regular account is in the /etc/sudoers file, you can log in as a regular user and use the sudo command to invoke the commands in this section.)
Create a /etc/cron.yearly directory, with the same ownership and permissions as the other related directories. As those directories are owned by root and have 755 permissions, they happen to be compatible with standard root permissions for new directories. So all that is required is:
sudo mkdir /etc/cron.yearly
Create a new script in the /etc/cron.yearly directory; I'll call it happynewyear. Include the following commands in that script (which saves the files from user donna's home directory in user randy's home directory):
#!/bin/sh
/usr/bin/rsync -aHvz /home/donna /home/randy/
Save the file. Make sure the script is executable with the following command:
chmod 755 /etc/cron.yearly/happynewyear
Test the script. Run it using the full path to the script:
/etc/cron.yearly/happynewyear
Now make sure it runs at the next new year. Open your /etc/crontab and make a copy of the directive that runs the monthly cron scripts in /etc/cron.monthly. Change the directory to /etc/cron.yearly, and modify the time the script is run to something appropriate. For example, I use the following line in my Red Hat Enterprise Linux 4 system:
2 0 1 1 * root run-parts /etc/cron.yearly
This directive runs the script at two minutes past midnight on January 1. As the day of the week associated with New Year's Day varies, the last time entry has to be a wildcard. I chose two minutes past midnight because the scripts in the /etc/cron.hourly directory are run at one minute past the hour.
Save your /etc/crontab configuration file.
Any output from a cron job is sent to the user as an email. Most standard cron jobs you'll find in the directories discussed here are carefully designed not to create any output, so you won't see email from them. cron jobs suppress such output by redirecting both standard output and standard error to files (or to /dev/null).
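If a script such as happynewyear turns out to be chatty, you can apply the same trick in the /etc/crontab line itself. This variant of the yearly entry discards both standard output and standard error:
2 0 1 1 * root run-parts /etc/cron.yearly > /dev/null 2>&1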
Users can create and schedule their own cron jobs. As a regular user, you can open a cron file for your account with the following command:
crontab -e
Use the steps described in the previous section to create your own cron job. With the appropriate SHELL, PATH, and commands, you can run the scripts of your choice at the regular times of your choosing. To review your account's crontab configuration, run the following command:
crontab -l
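Note that per-user crontab entries omit the user column found in /etc/crontab. As a sketch, the following entry would run a hypothetical ~/bin/backup script at 11:30 P.M. every weekday:
30 23 * * 1-5 /home/michael/bin/backup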
Naturally, most regular users won't understand how to create their own cron jobs. As the administrator, you'll have to create the jobs for them. For example, if you want to create a job for user nancy and have administrative privileges, run the following command:
crontab -u nancy -e
However, for any user to access individual cron jobs, he needs permission. There are several ways to configure permissions to use cron:
If there's an empty /etc/cron.deny file (and no /etc/cron.allow file), all users are allowed to have individual cron jobs.
If there are no /etc/cron.deny or /etc/cron.allow files, only the root user is allowed to have cron jobs.
If there are specific users in /etc/cron.deny, they're not allowed to use cron jobs, and the root user isn't allowed to create a cron job for them; all others are allowed to use cron jobs.
If /etc/cron.deny includes ALL (representing all users), and specific users are listed in /etc/cron.allow, only those users listed in the latter file are allowed to have cron jobs.
What you do depends on whether some of your users need to create cron jobs, and whether they are capable and trusted to do their own cron jobs (or whether you're willing to create cron jobs for your users).
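For example, if you decide that only users donna and randy should have individual cron jobs, you could set that up along the lines just described; the usernames are simply the examples used earlier in this chapter:
echo ALL > /etc/cron.deny
echo donna > /etc/cron.allow
echo randy >> /etc/cron.allow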
Depending on how much work a cron job performs, it can noticeably increase the load on the system. If you create a user-specific cron job, try to schedule it for times when other cron jobs aren't also running. If you've authorized users to create their own cron jobs, give them times when they're authorized to run them. Audit their jobs. You can review a user's cron jobs with the crontab -u username -l command.
If you want to see how cron jobs are configured, check them out for all users in your spool directory. The actual directory varies slightly by distribution. Red Hat/Fedora uses /var/spool/cron/username, SUSE uses /var/spool/cron/tabs/username, and Debian uses /var/spool/cron/crontabs/username.
Not all jobs have to be run on a regular basis. People who crunch statistical data may need to run scripts at different times. Weathermen who are trying to model future trends may want to try some scripts just once or maybe twice. What they run often takes all of the resources on your systems. The only time they can run their jobs is in the middle of the night. They have families and like to sleep at night, so they may ask you to run that job. Well, perhaps you also have a family and like to sleep at night. But you don't want to create a cron job for this purpose because it's a one-time task.
For this purpose, Linux has the batch-job system, governed by the at daemon. To schedule a batch job at a given time, you can use the at command.
When you run the at command to create a batch job, you have to specify a time when the job is to be run. You're then taken to an at> prompt, where you can specify the commands or scripts to be executed.
Users who configure their own scripts can place them in their own ~/bin directory. Scripts in these directories (with executable permission), such as ~/bin/fatdata, can be run without specifying the full path. Debian Linux doesn't add ~/bin to the PATH unless the directory exists.
For example, if you're about to leave for the day and have already configured the fatdata script in your home directory's command bin (~/bin), take the following steps to run the script in one hour:
Run the following command:
at now + 1 hour
The at> prompt is open. If you're in SUSE or Debian Linux, you'll see a note that reflects the default bash shell. (After these steps, I'll describe some alternative ways to specify the time you need.)
At the at> prompt, enter the commands that you want to run at the specified time. In my case, that would be the single command:
/home/michael/fatdata
When you're done with the commands that you want run, press Ctrl-D.
If you want to review pending at jobs, use the atq command.
If you want to cancel a job, you can use the atrm command, based on the queue number shown in the output from atq. For example, if you know that your job will be run at 10:30 P.M. tonight, you'll see something similar to the following output from atq, which notes that this is job 7:
7 2006-01-22 22:30 a michael
You can then cancel the job with the atrm 7 command.
As with cron jobs, any output from at jobs is sent via email to the user for whom the job ran.
The at command offers a rich syntax for configuring the job at the time of your choice. While you can specify a certain amount of time in the future, such as:
at now + 12 hour
you can also set a specific time, such as 1:00 A.M. tomorrow morning:
at 1 AM tomorrow
Alternatively, you can specify a date:
at 2 AM March 15
You'll need to make sure the at daemon is running. The following command shows whether it's running:
ps aux | grep atd
If it isn't running, make sure it's installed (it's the at RPM or DEB package on our preferred distributions) and configured to run at your default runlevel.
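The exact command to enable the daemon at your default runlevels varies by distribution; the following are the usual forms for Red Hat/Fedora or SUSE (chkconfig) and for Debian (update-rc.d), followed by a command to start the daemon immediately:
chkconfig atd on
update-rc.d atd defaults
/etc/init.d/atd start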
If you want to see how at jobs will be run, you can check them out in your spool. The actual directory varies slightly by distribution: Red Hat/Fedora and Debian use /var/spool/cron/atjobs; SUSE uses /var/spool/atjobs. If you also have batch jobs that use the batch command (see the next section), you'll note that the spool files associated with regular at jobs start with an a, while spool files associated with batch jobs start with a b.
A batch job, in contrast to an at job, runs as soon as the CPU has time for it. All you need to do to create a batch job is to use the batch command. With the batch command, Linux won't run the job unless the load average on the CPU is less than a certain threshold, which depends on the distribution. If you're running Red Hat/Fedora or SUSE Linux, the threshold is .8, or 80 percent of the capacity of a single CPU. If you're running Debian Linux, the threshold is 1.5, or 150 percent of the capacity of a single CPU. Naturally, you'll want to vary this threshold depending on the CPUs on your system.
Except for the aforementioned CPU limits, the batch command works in the same way as the at command; both set you up with an at> prompt. If you want to change the parameters associated with batch jobs, you can do so with the help of the atd command. For example, if your system includes four CPUs, you may find it useful to run batch jobs unless more than three CPUs are loaded:
atd -l 3
If your batch jobs are intense, you may want to increase the time between such jobs. By default, they're run in 60-second intervals. The following command increases the interval to one hour:
atd -b 3600
For any user to access individual batch or at jobs, she needs permission. There are several ways to configure these permissions:
If there's an empty /etc/at.deny file (and no /etc/at.allow file), all users are allowed to have individual batch or at jobs.
If there are no /etc/at.deny or /etc/at.allow files, only the root user is allowed to have batch or at jobs.
If there are specific users in /etc/at.deny, they're not allowed to use batch or at jobs.
If /etc/at.deny includes ALL (representing all users) and individual users are listed in /etc/at.allow, only those users listed in the latter file are allowed to have batch or at jobs.
What you do depends on whether some of your users need to create at jobs, and whether they are capable and trusted to do them on their own (or whether you're willing to create them for your users).