The Kernel Needs an Upgrade

You don't need to change the kernel on your system often, but any Linux geek should be comfortable installing a new kernel without having to upgrade the entire distribution. The difficulty of the job depends on the reason you're upgrading. If a security flaw or other bug is found that merits an upgrade (which doesn't happen often), you may be able to simply download a package from your distributor with the new kernel. Support for new hardware may require the compilation of a new module, but usually not the entire kernel. The one major job you may run into—recompiling the whole kernel—is usually necessary only when you want to add some major feature, such as a new networking protocol, and even that isn't so hard.

A new kernel may have unexpected effects on your system, and you may accidentally leave out a feature that was in the previous kernel, so make sure an upgrade is absolutely necessary before you go whole hog. Any time you upgrade a kernel, you're taking a risk. Kernel upgrades can change the way your operating system works with your hardware, services, and more. Be prepared to back off and return to the previous kernel until you have run the new one for quite some time and are sure it's robust.

When you upgrade the Linux kernel, you have several other decisions to make: how to add the new kernel to your system, which source to use, and whether to patch the existing kernel. These options require you to consider a number of factors:

Upgrading from a package: Do you use the binary package provided by your distribution? If not, are you aware of the possible risks, such as changes in configuration defaults, incompatibilities with other software, or differences in default directories?
Selecting a source: Should you work from the source code provided by your distribution, or generic source code from kernel.org?
Patching the kernel: When should you patch the kernel? When should you avoid kernel patches?
Sharpen your tools: Do you have the tools you need to modify or recompile your kernel?

I explain each of these factors in detail in the following sections.

Upgrading from a Package

Red Hat, SUSE, and Debian all create binary kernel packages to incorporate security updates, add new features, address problems documented in bug reports, and more. When you install an updated binary kernel package, it automatically updates your active bootloader. You'll then have two kernels available, side by side, in your /boot directory.

Installing a new kernel is simple. For Red Hat and SUSE, use the -i or --install option associated with the rpm command:

rpm -i kernel-newversion

Alternatively, you can use the update tool associated with each distribution (up2date, yum, YaST); when you select an updated kernel with those tools, by default they apply the install (and not the upgrade) option.

For Debian, connect to the network and use apt-get install to retrieve and install the new kernel:

apt-get install kernel-image-newversion

Tip

Debian is moving from kernel-image to linux-image as the name of their kernel packages.

Red Hat, SUSE, and Debian design their current binary kernel packages to add appropriate stanzas to GRUB or LILO bootloaders.

Warning

Use the install option instead of the upgrade option when you add binary kernel packages to your system. If you use the upgrade option, you'll overwrite your existing working kernel. In that case, you'll burn your bridges behind you. You'll be gambling that your system will work immediately with a new kernel, but that's very risky. There might be a hardware issue or other aspects of your particular system configuration that cause the kernel to fail. If your new kernel does not work, you may have to restore your system from backup. In contrast, if you save the old kernel by using the install option, you can simply choose the old one at the boot prompt and then try to fix the new kernel and recompile it.
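
After an install (as opposed to an upgrade), both the old and new kernels should sit side by side in /boot. The following is a minimal sketch for confirming that; the function name list_kernels is hypothetical, and it assumes your distribution names its kernel images vmlinuz-version:

```shell
# List the kernel images in a boot directory, one per line.
# list_kernels is an illustrative helper, not a standard command.
list_kernels() {
    boot_dir=${1:-/boot}
    for image in "$boot_dir"/vmlinuz-*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$image" ] || continue
        basename "$image"
    done
}
```

Run list_kernels with no arguments to inspect /boot. If you see only one image after adding a new kernel, the old kernel was probably overwritten. On Red Hat and SUSE, rpm -q kernel also lists every installed kernel package.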

Selecting a Source

There are a couple of situations when you can't depend on a binary package but have to download and compile source code.

The first situation is when the feature you need, such as the right modules for your hardware, is not available in your current kernel. You may need to download the modules as a part of another package. Especially if the package is available as a tarball, you may need to apply the instructions in any embedded script. Typical script names include Makefile and INSTALL, and may be detailed in a README file. One example of how this works is the "My Wireless Card Works on Another Operating System, but Not Linux" annoyance in Chapter 5.

The other situation is where the driver is already available as part of the kernel, either embedded in the kernel itself, or available as a module in the appropriate /lib/modules/`uname -r` directory. In that case, you'll need to download the source code for your kernel, use a menu to activate the appropriate features, and then recompile that kernel.

If at all possible, use the kernel source code developed specifically for your Linux distribution. While you may want to use the latest features available in the latest "stock" Linux kernel from http://www.kernel.org, be careful. Each distribution compiles kernels with specific features that can interact and impact one another in subtle ways.

Several distributions work with special kernel sources. Some even include "backports" from more advanced kernels. So if you download a new version from http://www.kernel.org, you could lose the backports with other features that you need. It's therefore best to work with kernel sources provided for your distribution. For example, Red Hat Enterprise Linux 3 uses a specially configured Linux kernel that's nominally built from version 2.4.20, but includes backports from version 2.6. So if you install version 2.4.21 from http://www.kernel.org, you may actually lose features that you need.

Most distributions make the source code available in easily downloadable packages with names such as kernel-source. As we'll see shortly, though, Red Hat changed this convention starting with Fedora Core 3 and RHEL 4.

Patching the Kernel

If you download the original source code for a kernel from http://www.kernel.org, you can also use patches from that site. For example, if you've compiled and installed kernel version 2.6.15, you can upgrade to version 2.6.16 by downloading and applying patch-2.6.16, and then compiling the combined source code.

Tip

At www.kernel.org, you can see patches labeled ac and mm. These are developmental patches released for general testing. The ac patches are released by Alan Cox; the mm patches are released by Andrew Morton. These patches include features not yet accepted by Linus Torvalds for the stable kernel.

If you choose to patch a kernel, the process may be more complex than you expect. Patches are applied to the kernel source code. Installing the patch is not enough. You still have to compile the combined source code into a new kernel.

Patches incorporate improvements between minor kernel revisions. In this section, I'll offer directions that work with most generic patches from http://www.kernel.org as well as patches provided by most distributions. However, the directories are different for Fedora Core 3 or later (we'll discuss this in the next annoyance). In general, here's what you do:

Install the source code for the current kernel (in this case, for version 2.6.15) in the /usr/src/linux-2.6.15 directory.
Download the desired patch (in this case, patch-2.6.16.gz or patch-2.6.16.bz2); make sure the patch is in the /usr/src directory.
Unpack the desired patch with the appropriate command (gunzip or bunzip2).
Navigate to the current kernel source directory—in this case, /usr/src/linux-2.6.15.
Apply the desired patch, using in this case the command:
```
patch -p1 < ../patch-2.6.16
```
Minor problems can arise when you try to apply a patch. Watch the output carefully for error messages. You may be able to diagnose a problem such as a patch applied from the wrong directory or against the wrong kernel version.
It may help to back up a kernel when you apply a patch, by adding the --backup option:
```
patch --backup -p1 < ../patch-2.6.16
```
You can also create a log of error messages; the following command saves errors in the patch.log file:
```
patch --backup -p1 < ../patch-2.6.16 > patch.log 2>&1
```
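
The patch steps above can be strung together as a small function. This is a sketch, not a definitive procedure: the name apply_kernel_patch is hypothetical, and it assumes GNU patch, whose --dry-run option tests a patch without changing any files. The patch file must already be unpacked, and its path must be absolute (or relative to the source directory):

```shell
# Test a patch with --dry-run, then apply it with backups and a log.
# apply_kernel_patch is an illustrative name; GNU patch is assumed.
apply_kernel_patch() (
    src_dir=$1                  # for example, /usr/src/linux-2.6.15
    patch_file=$2               # for example, /usr/src/patch-2.6.16
    log_file=${3:-patch.log}    # relative paths land in the source directory

    cd "$src_dir" || exit 1

    # First pass: change nothing, just report whether every hunk applies.
    if ! patch --dry-run -p1 < "$patch_file" > "$log_file" 2>&1; then
        echo "Dry run failed; see $log_file" >&2
        exit 1
    fi

    # Second pass: apply for real, keeping .orig copies of changed files.
    patch --backup -p1 < "$patch_file" >> "$log_file" 2>&1
)
```

Because the dry run happens first, a patch that doesn't fit leaves the source tree untouched, which saves you from cleaning up a half-patched kernel.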

Tip

Some of you may know more efficient ways to run these commands, such as combining them on one line with pipes, and so forth. However, I think it's important to break out commands on something as important as kernel patches in detail.

Now you're almost ready to configure and recompile your new kernel. However, you need to make sure your tools are suitable for the software you're compiling.

Tip

Because Fedora Core 3 is the development platform for Red Hat Enterprise Linux 4, new features that you see for Fedora Core 3 normally also apply to Red Hat Enterprise Linux 4. This includes new conventions for kernel sources discussed in this chapter. Fedora Core 5 is reportedly the development platform for Red Hat Enterprise Linux 5.

Sharpen Your Tools

If you need to recompile your kernel, the source code is not enough. You also need the right tools. First, there are the tools that do the actual work, such as the GNU C Compiler. They vary by distribution and by major kernel version (2.4 and 2.6). More information is available in the Changes file, in the Documentation subdirectory of your kernel source directory.

Tip

Linux kernel 2.4 is still in common use on the Red Hat Enterprise Linux 3 series of distributions. Although that may seem dated, Red Hat has included appropriate "backports" of various kernel 2.6 features, and they have committed to support their customized kernel through the year 2008.

Then there are the packages that allow you to customize the kernel using graphical menus. Generally, these include the ncurses development library (the ncurses-devel package on Red Hat and SUSE, or the libncurses5-dev package on Debian) for the lower-end menuconfig, and the TCL and TK libraries for the higher-end xconfig and kconfig menus. We'll look at these menus briefly in the next annoyance.

The ncurses-devel library is a good choice for a system that isn't running the X Window System, which is the case with many servers. Alternatively, there are menu configuration systems that take advantage of the X Window System GUI. If you use the GNOME desktop, find and install the GTK+ 2.0, Glib 2.0, and libglade 2.0 development libraries. The actual packages and their names vary by distribution. Alternatively, if you use the KDE desktop, download the Qt development libraries.

Recompiling the Kernel

The thought of recompiling the kernel strikes fear into far too many Linux geeks. While you have to perform each step in order and wait for the process to finish, and while mistakes can force you to try again or backtrack to an earlier kernel, the process is not as bad as it seems. Assuming that you have the source code and tools installed, as described in the previous annoyance, you can just follow the basic steps I describe here. Variations are possible, depending on your kernel version and distribution. For more information, see the README file in the directory with your source code, usually /usr/src/linux. (For Fedora Core 3 and above, the standard source code directory is a subdirectory of /usr/src/redhat/BUILD.)

Configuring the Kernel

The following are general steps associated with getting to the kernel customization menu. Depending on your configuration, there may be variations:

Navigate to the directory with the kernel source. If you've compiled your kernel in the past, you should have a .config file in this directory. If you want to start over, you can delete this .config file and make sure that the source code is clean with the following command:
```
make mrproper
```
Create a .config file in the local directory. If you're recompiling the current kernel, just copy it from the config-`uname -r` file in the /boot directory. For example, if your current kernel is version 2.6.11, run the following command:
```
cp /boot/config-2.6.11 /usr/src/linux/.config
```
Tip
When you include the `uname -r` string in another command, it embeds the version number of the current kernel. It's useful, as many kernel-related files and directories use the version number of the current kernel.
Customize the Makefile in the directory with your source code. Modified correctly, this helps identify the new kernels that you create. The key is the fourth variable in the Makefile, EXTRAVERSION, which gets appended to the end of the new kernel files. On my Debian computer, I've set EXTRAVERSION to -mj1; when I modified my 2.6.8 Linux kernel, the new kernel was named vmlinuz-2.6.8-mj1. If I recompile this kernel again for different features, I'd change EXTRAVERSION to -mj2; that kernel would be named vmlinuz-2.6.8-mj2.
Now you can modify the kernel as needed. With the thousands of options available, it's more efficient to make modifications with a graphical interface. If you have the proper ncurses development libraries installed, you can start a menu similar to Figure 7-1 with the make menuconfig command.
Figure 7-1. The ncurses kernel menu
If you're running GNOME and have one of the libraries described in the previous section installed, you can start a menu similar to that shown in Figure 7-2 with the make gconfig command.
Figure 7-2. The gconfig kernel menu
If you use the KDE desktop and have the Qt libraries, you can open the qconf kernel configuration menu shown in Figure 7-3 with the make xconfig command. (The xconfig menu had a substantially different look and feel for kernel version 2.4.)
Figure 7-3. The xconfig kernel 2.6 menu
Tip
With SUSE, you need to set the environment and display defaults to correspond to the root account before opening a kernel configuration tool with the make xconfig or make gconfig commands. If you've logged in to the SUSE GUI with a normal account, open a command-line terminal interface and log in to the root account with the following command, which transfers your account's X environment directives to root for this session:
```
sux - root
```
Now you can customize the kernel with the settings of your choice. Unless you're working with an embedded device, you should almost always enable loadable module support. You then have three choices for many features: exclude the feature, compile it into the core kernel, or compile it as a module.

The variety of options in kernel configuration is annoying, and unfortunately beyond the scope of this book. An advantage of reusing the /boot/config-`uname -r` file is that you can probably get away with just adding a single module or making a few limited changes (hopefully well documented by a README file when you downloaded the kernel or patch). When you save your settings, you're creating a new .config file. Once compiled, the result should be saved to the /boot directory.
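
The EXTRAVERSION step described earlier can also be scripted. The following is a minimal sketch; the function names set_extraversion and kernel_release are hypothetical, and it assumes the plain "NAME = value" layout found at the top of the kernel Makefile, plus a suffix that contains no slash or ampersand characters:

```shell
# Set EXTRAVERSION in a kernel Makefile (illustrative helper).
set_extraversion() {
    makefile=$1
    suffix=$2       # for example, -mj1
    tmp_file=$(mktemp) || return 1
    # Rewrite only the EXTRAVERSION line; write back through cat so the
    # Makefile keeps its original ownership and permissions.
    sed "s/^EXTRAVERSION[[:space:]]*=.*/EXTRAVERSION = $suffix/" "$makefile" \
        > "$tmp_file" && cat "$tmp_file" > "$makefile"
    rm -f "$tmp_file"
}

# Print the version string that the Makefile will produce.
kernel_release() {
    awk -F'[ \t]*=[ \t]*' '
        $1 == "VERSION"      { v = $2 }
        $1 == "PATCHLEVEL"   { p = $2 }
        $1 == "SUBLEVEL"     { s = $2 }
        $1 == "EXTRAVERSION" { e = $2 }
        END { print v "." p "." s e }
    ' "$1"
}
```

For example, running set_extraversion /usr/src/linux/Makefile -mj2 and then kernel_release /usr/src/linux/Makefile lets you confirm the name of the kernel you're about to build before you spend an hour compiling it.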

This is where the Debian process diverges from the commands you can use on Red Hat/Fedora and SUSE; I'll describe both processes in the following sections. In either case, the commands required to compile the kernel can take several minutes, or even hours, depending on your hardware.

Preparing the Source for Fedora/Red Hat

Starting with Fedora Core 3, Red Hat has eliminated the kernel-source package. But you can build the source code from the source RPM, which is called kernel-`uname -r`.src.rpm in the SRPMS directory.

For example, if you download the original kernel source .src.rpm for Fedora Core 3 (version 2.6.9-1.667), you need to take the following steps to get to the kernel customization menu:

Install the source RPM with the rpm -ivh kernel-2.6.9-1.667.src.rpm command.
Navigate to the directory with the kernel SPEC file—normally, /usr/src/redhat/SPECS.
Build the source code. The following command builds it in an appropriate directory. The `uname -m` command substitution supplies the architecture associated with your system:
```
rpmbuild -bp --target=`uname -m` /usr/src/redhat/SPECS/kernel-2.6.spec
```
Install the development tools. The easiest way to do so is by installing the Development Tools package group via the system-config-packages or pirut utility.
Install the graphical tools required for the kernel configuration tool of your choice. Generally, if you use the GNOME Desktop Environment, run make gconfig, which requires the GNOME Development Tools package group. If you use the KDE Desktop Environment, run make xconfig, which requires the KDE Development Tools package group.

Tip

Whatever distribution you use, many packages assume that your source code is in the /usr/src/linux directory. If your distribution uses a different directory, it can help to create a symbolic link from /usr/src/linux to the directory with the kernel source code.
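
Creating that link takes one ln -s command, but it's worth checking that you aren't about to replace a real directory. A minimal sketch follows; the function name link_kernel_source is hypothetical:

```shell
# Point /usr/src/linux (or another link path) at a kernel source tree.
# link_kernel_source is an illustrative helper, not a standard command.
link_kernel_source() {
    source_dir=$1
    link_path=${2:-/usr/src/linux}

    # Refuse to touch a real file or directory; only replace an old link.
    if [ -e "$link_path" ] && [ ! -L "$link_path" ]; then
        echo "$link_path exists and is not a symbolic link; leaving it alone" >&2
        return 1
    fi

    rm -f "$link_path"
    ln -s "$source_dir" "$link_path"
}
```

Run it as root with the path to your distribution's kernel source directory as the first argument.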

Processing a Red Hat/Fedora or SUSE Kernel

Once you've customized your .config kernel configuration file, you can start the build process. The following are basic steps associated with recompiling the kernel. Depending on your configuration, there may be variations:

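The following command checks your .config settings against any dependencies and processes your kernel source code accordingly. This step applies to 2.4 kernels; the 2.6 build system tracks dependencies on its own, so you can skip it there: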
```
make dep
```
Before you start compiling the source code, it's a good idea to make sure no files exist from previous builds with the following command. Otherwise, they might get mixed in with your new build and cause havoc:
```
make clean
```
Now you can compile your source code into a binary kernel. The following command creates a binary kernel in the arch/i386/boot subdirectory. If you have something other than a 32-bit CPU, the directory name will contain a string denoting that processor in place of i386:
```
make bzImage
```
Assuming you've configured loadable modules, you'll need to process and install the modular drivers associated with your new kernel:
```
make modules
make modules_install
```
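You should find your newly customized modules in a subdirectory of /lib/modules named for the new kernel version (for example, /lib/modules/2.6.8-mj1). Note that `uname -r` reports the running kernel, so it matches the new directory only after you boot the new kernel or if the version string is unchanged.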
Finally, you're ready to process your new kernel. The following command creates an Initial RAM disk, copies it and the new kernel to your /boot directory, and updates the active bootloader:
```
make install
```
Naturally, you should check the result in the /boot (and in some cases, the top-level root) directory, as well as in your bootloader configuration file. If the bootloader configuration was not updated, you can do it yourself using the information in Chapter 6.
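
The build steps above can be wrapped in a function that stops at the first failure, so a broken compile never reaches make install. This is a sketch under stated assumptions: build_kernel is a hypothetical name, and the MAKE variable is an addition of mine that lets you substitute another command for make (handy for a dry rehearsal):

```shell
# Run the kernel build targets in order, stopping at the first failure.
# build_kernel and the MAKE override are illustrative, not standard tools.
build_kernel() (
    src_dir=${1:-/usr/src/linux}
    cd "$src_dir" || exit 1

    # "dep" matters only for 2.4 kernels; drop it from the list for 2.6.
    for target in dep clean bzImage modules modules_install install; do
        echo "==> make $target"
        ${MAKE:-make} "$target" || {
            echo "make $target failed; stopping" >&2
            exit 1
        }
    done
)
```

Running the targets one at a time, as described in the steps, is still the better choice the first time through, because you can read the output of each stage before moving on.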

Processing a Debian Kernel

The Debian kernel build process is relatively simple because it involves making your own custom Debian package; most of the grunt work is done for you automatically by this package. This section of the chapter assumes you've modified your .config file with one of the kernel configuration tools described earlier. The following are basic steps associated with recompiling a Debian kernel:

Install the Debian package known as kernel-package, which includes a number of scripts that can help process the Debian kernel. Then run the following command to create a Debian kernel package:
```
make-kpkg buildpackage -rev mine kernel-image
```
If this is successful, you'll find your package with a .deb extension in your /usr/src directory. In my case, I found the kernel-image-2.6.8-mj1_mine_i386.deb package in that directory. (Debian Etch uses the linux-image package name.)
Install the kernel package with the following command:
```
dpkg -i kernel-image-2.6.8-mj1_mine_i386.deb
```
Make sure this command updates your active bootloader in the appropriate configuration file. Alternatively, you can update the bootloader configuration file yourself using the information in "Rooting Out the Bootloader" in Chapter 6.
Make sure this command updates the soft links from the top-level root directory (/) for the Initial RAM disk and the kernel (vmlinuz). As an example, the following lines are excerpts from an ls -l command on my Debian computer (dates have been removed due to line space constraints):
```
lrwxrwxrwx  1 root root 22 vmlinuz -> boot/vmlinuz-2.6.8-mj1
lrwxrwxrwx  1 root root 24 vmlinuz.old -> boot/vmlinuz-2.6.8-1-386
```
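
Rather than reading ls -l output by eye, you can test whether each link resolves to a real file. The following is a minimal sketch; check_boot_links is a hypothetical name, and the list of link names (vmlinuz, vmlinuz.old, initrd.img, initrd.img.old) is an assumption about a typical Debian layout:

```shell
# Report whether the kernel and initrd links in a root directory resolve.
# check_boot_links is an illustrative helper; the link names are assumed.
check_boot_links() {
    root_dir=${1:-/}
    status=0
    for name in vmlinuz vmlinuz.old initrd.img initrd.img.old; do
        link=$root_dir/$name
        [ -L "$link" ] || continue          # ignore links that don't exist
        if [ -e "$link" ]; then             # -e follows the link to its target
            echo "ok: $name -> $(readlink "$link")"
        else
            echo "BROKEN: $name -> $(readlink "$link")"
            status=1
        fi
    done
    return $status
}
```

A link reported as ok can still point to the wrong kernel, so compare the targets against the version you just installed.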

If any of these files point to the wrong kernel, that can lead to one of the most annoying problems of all, a kernel panic. This is the subject of the following annoyance.

I Can't Boot Because of a Kernel Panic

One of the most feared problems in the world of Unix or Linux is the kernel panic, when the system stops completely during the boot process. The computer won't respond to any input, save the power switch. This is where your backups, rescue modes, or rescue media can be a lifesaver—see Chapter 6 for how you can prepare for this situation.

A number of problems can cause a kernel panic, many of which occur when you try to recompile or install a new kernel. During the boot process, if Linux can't find the hard drive, the partitions, or initial RAM disk files, you'll get a kernel panic. But kernel panics aren't limited to these issues.

Unless there's corruption on your disk or some problem with your hardware, kernel panics generally come from some recent change to key components in the boot sequence, driver problems, or boot issues, such as:

The bootloader (GRUB or LILO)
A recompiled kernel
A new Initial RAM disk
Partition changes associated with the root (/) or /boot directories
Power problems
Troublesome drivers, especially those created for other systems

Record the messages that the console displays immediately before your kernel panic. Review what you did just before the kernel panic, especially with respect to the preceding list. These actions can give you hints to your problems. If you still can't figure out the problem, use these messages as keywords for a search for similar problems with search engines such as http://www.yahoo.com or http://groups.google.com.

Sample Panic Messages and Their Possible Meanings

Here's a typical example of a kernel panic:

VFS: Cannot open root device "hda6" or unknown-block(0,0)
Please append a correct "root=" boot option
Kernel panic: VFS: Unable to mount root fs on unknown-block(0,0)

This problem is caused by an error in the bootloader configuration file. The Virtual File System (VFS) could not find some filesystem such as root (/) or /boot.

One possible cause is the confusing nature of the GRUB configuration file. For example, if you see the following directive in /boot/grub/grub.conf or /boot/grub/menu.lst:

root (hd0,5)

You might think this points to the /boot directory on /dev/hda5. But as you should know from "Rooting Out the Bootloader" in Chapter 6, this directive actually tells your computer to look for the /boot directory on /dev/hda6.
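
Because GRUB counts both disks and partitions from zero, the translation is easy to get wrong by one. The following sketch does the arithmetic; grub_to_dev is a hypothetical name, and it assumes IDE drives named hda, hdb, and so on, detected in the same order that GRUB sees them:

```shell
# Translate a GRUB root directive such as (hd0,5) into a Linux device name.
# grub_to_dev is an illustrative helper; IDE (hd) device naming is assumed.
grub_to_dev() {
    spec=$1
    prefix=${2:-hd}             # pass "sd" as a second argument for SCSI disks
    inner=${spec#"(hd"}         # "(hd0,5)" becomes "0,5)"
    inner=${inner%")"}          # "0,5)" becomes "0,5"
    disk=${inner%%,*}           # zero-based disk number
    part=${inner#*,}            # zero-based partition number
    # Disk 0 is "a", disk 1 is "b", and so on.
    letter=$(printf 'abcdefghijklmnopqrstuvwxyz' | cut -c $((disk + 1)))
    echo "/dev/${prefix}${letter}$((part + 1))"
}
```

Running grub_to_dev '(hd0,5)' prints /dev/hda6, which matches the directive discussed above.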

Another example shown here is slightly misleading. This error message might suggest that there is a problem with the /sbin/init command, which is the first process (process 1) always run by the system:

Warning: unable to open an initial console
Kernel panic - not syncing: No init found.
Try passing init= option to kernel

In fact, this problem is not directly related to init. My computer could not find init because the bootloader pointed to the wrong partition for the top-level root (/) directory. The root directory on my system was on /dev/hda7, but the bootloader configuration file pointed to /dev/hda6, as shown here.

kernel  /vmlinuz-2.6.8-mj1 root=/dev/hda6

If you have a separate partition for the /boot directory, a mislocated partition could lead to a similar kernel panic message.

Another possible cause of panics in Debian is the links from the /vmlinuz and /initrd.img files. Debian links these files from the top-level root (/) directory. If the links are broken or point to the wrong locations, you might get the following message:

pivot_root: No such file or directory
/sbin/init: 426: cannot open dev/console: No such file
Kernel panic: Attempted to kill init!

Naturally, you can address this problem either by linking the noted files from the top-level root (/) directory to the right locations in the /boot directory or by revising the menu.lst configuration file to point directly to /boot.

Another panic is related to the following message:

Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(3,7)

While this appears similar to previous messages related to misplaced partitions, it actually is based on a missing Initial RAM disk file. Look at your menu.lst file. It should point you to an initrd file in the /boot directory. If you don't find the cited initrd file, you may need to re-create it with the mkinitrd command.
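
You can automate that check from a rescue environment. The following is a minimal sketch; check_grub_files is a hypothetical name, and the second argument is the directory where the partition named in the GRUB root directive is mounted (/boot if you have a separate /boot partition, the top-level root directory otherwise):

```shell
# Confirm that every kernel and initrd named in a GRUB menu file exists.
# check_grub_files is an illustrative helper, not part of GRUB.
check_grub_files() {
    menu_file=$1        # for example, /boot/grub/menu.lst
    grub_root=$2        # mount point of the partition GRUB treats as root
    status=0
    # The second field of each kernel or initrd line is the file path.
    for path in $(awk '$1 == "kernel" || $1 == "initrd" { print $2 }' "$menu_file"); do
        if [ -f "$grub_root$path" ]; then
            echo "found:   $path"
        else
            echo "MISSING: $path"
            status=1
        fi
    done
    return $status
}
```

Any line marked MISSING identifies a file to restore, relink, or re-create before you reboot.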

From these examples, we see that the cause may not be directly related to the error message. If you have some experience, you may recognize some of these messages. Otherwise, the best approach is to analyze the files and directories associated with the boot process, with the help of books such as this one.

Tip

If you haven't made any recent changes to your kernel, check your power supply and fans. Hardware doesn't last forever, and the lack of sufficient ventilation could cause your system to stop with a kernel panic. Dust can also affect ventilation and heat transfer, especially in non-clean room environments.

Reviewing the Rescue Process After a Panic

"Dual-Boot Recovery" in Chapter 6 describes how to use a rescue CD or other medium to boot a system; after a system panic, the process is straightforward. Try each of the following steps to boot a system. They're ordered by increasing levels of difficulty:

If you have more than one kernel configured in your bootloader, try them all. If a different kernel works, you may have a corrupt kernel, initial RAM disk, or an error in the bootloader configuration file.
If you have a rescue disk or CD customized for your system, try that next. Such disks are designed to boot your system in your current configuration. At that point, you can connect to any backups that you might have to recover a previously working configuration.
Use the rescue mode customized for your distribution. If you have the Red Hat/Fedora installation CD, it searches for and mounts your existing partitions. SUSE's installation disk and the Debian from Scratch CD install familiar tools that can help you mount and diagnose any problems you may have.
Boot with a CD-based Linux distribution such as Knoppix. It includes a complete Linux distribution, including specialized tools designed to help you rescue a system.

"I Lost the Root Password" in Chapter 6 describes booting into single user mode. Unfortunately, if you have a kernel panic, your system has usually stopped before it could boot into this useful runlevel.

Rescuing from a Kernel Panic

Once you've started your system using some emergency or rescue disk, review what you've done since your last successful boot. If you've changed a kernel, revised a bootloader, created a new initial RAM disk, or revised the partition associated with your root (/) or /boot directories, that could be the cause of your kernel panic.

The cure, then, is to reverse what you've done recently. If applicable, restore the original kernel, initial RAM disk, /boot or root (/) partition, or bootloader. Alternatively, restore the key parts of your system from a backup. Once you've gone back to your previous working configuration, test the result.

Tip

If you revise a key part of the boot system, you should test the result as soon as possible by rebooting Linux. If there are problems, you can restore your system while your memory of the changes you've made is fresh. You should also document the changes you've made.

I Can't Boot Because of Some "File Not Found" Error

One example of a "File not found" error is the following, where the root (hd0,0) directive points to the wrong partition. It's pretty clear what the problem is in this case, because the filesystem is listed as FAT:

root (hd0,0)
 Filesystem type is fat, partition type 0xb
kernel /vmlinuz-2.4.21-15.EL ro root=LABEL=/ hdd=ide-scsi

Error 15: File not found

The noted partition is (hd0,0), also known as /dev/hda1. It's formatted to the FAT filesystem. Your Linux boot files are almost certainly not located on a partition formatted to a Microsoft filesystem. You may need to do some searching to find the partition with your Linux boot files. Then you can revise your bootloader to point to the right partition.

In the following case, the problem is less clear. While the message is almost identical to the previous error, the problem is actually a missing Linux kernel:

root (hd0,0)
 Filesystem type is ext2fs, partition type 0x83
kernel /vmlinuz-2.4.21-15.EL ro root=LABEL=/ hdd=ide-scsi

Error 15: File not found

The solution may be as simple as correcting a typo in the name of the file you specified as the kernel.

The following error message illustrates a boot process that proceeded a bit further. As the messages stop at the Initial RAM disk message, you might (correctly) conclude that the problem is related to a missing, mislinked, or misnamed Initial RAM disk file:

root (hd0,0)
 Filesystem type is ext2fs, partition type 0x83
kernel /vmlinuz-2.4.21-15.EL ro root=LABEL=/ hdd=ide-scsi
 [Linux-bzImage, setup=0x1400, size=0x12dae6]
initrd /initrd-2.4.21-15.EL.img

Error 15: File not found

The following error message says that Linux can't find /etc/inittab:

INIT: No inittab file found

Another error message indicates that the /boot filesystem is mislabeled:

Couldn't find matching filesystem: LABEL=/bot

A closely related error may indicate that the runlevel field has been omitted or incorrectly specified in the id directive in /etc/inittab:

INIT: /etc/inittab[20]: fault unknown action field
Enter runlevel:

At the prompt just shown, if you can enter the number associated with your preferred runlevel, there may be a problem with the id directive in /etc/inittab. Otherwise, there may be a different problem with the /etc/inittab file.
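
To check the id directive without reading the whole file, you can extract the default runlevel directly. This is a sketch, assuming the conventional entry of the form id:N:initdefault: (the label id is a convention, not a requirement); default_runlevel is a hypothetical name:

```shell
# Print the default runlevel from an inittab file, or fail if it's missing.
# default_runlevel is an illustrative helper; the "id" label is assumed.
default_runlevel() {
    inittab=${1:-/etc/inittab}
    # Capture the runlevel field of the first initdefault entry.
    level=$(sed -n 's/^id:\([0-9Ss]*\):initdefault:.*/\1/p' "$inittab" | head -n 1)
    case $level in
        [0-6Ss])
            echo "$level"
            ;;
        *)
            echo "No valid initdefault entry in $inittab" >&2
            return 1
            ;;
    esac
}
```

From a rescue disk, point the function at the inittab on your mounted root partition, such as /mnt/sysimage/etc/inittab on a Red Hat rescue system.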

Finding That File

If you see a "File not found" message, focus on the filename associated with the message. Usually, a file or directory specified in /etc/inittab or /etc/fstab is the source of the problem. A Linux guru should know these files well—or at least be able to refer to them on other Linux computers as models.

sh-3.00#

A missing mount command, on the other hand, leads to unexpected errors. In SUSE, you'll see a number of "failed to mount" errors. In Red Hat, you'll see something simpler:

Is /proc mounted?

With Red Hat and SUSE, you can see whether one of these files became corrupted by checking it against the associated RPM:

rpm -Vf /bin/mount

If you don't see any output, the file is verified as the original. Otherwise, you may have a problem, depending on the output. The options you might see are shown in Table 7-1.

Table 7-1. rpm output for files that are unverified

Output	Description
`S`	Mismatched file size
`M`	Different permissions or file type (Mode)
`5`	Incorrect MD5 checksum
`L`	Bad symbolic link
`D`	Wrong device number
`U`	Ownership (user) wrong
`G`	Ownership (group) wrong
`T`	Mismatched file modification time
`c`	Identifies a configuration file
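
The codes in Table 7-1 appear together in a single string at the start of each line of rpm -V output, with a dot standing in for each test that passed. The following sketch expands such a string into the descriptions from the table; explain_rpm_verify is a hypothetical name, and any character not in the table is simply skipped:

```shell
# Expand an rpm -V flag string such as S.5....T into readable descriptions.
# explain_rpm_verify is an illustrative helper, not part of rpm.
explain_rpm_verify() {
    flags=$1
    while [ -n "$flags" ]; do
        rest=${flags#?}             # everything after the first character
        char=${flags%"$rest"}       # the first character on its own
        case $char in
            S) echo "S: mismatched file size" ;;
            M) echo "M: different permissions or file type (mode)" ;;
            5) echo "5: incorrect MD5 checksum" ;;
            L) echo "L: bad symbolic link" ;;
            D) echo "D: wrong device number" ;;
            U) echo "U: ownership (user) wrong" ;;
            G) echo "G: ownership (group) wrong" ;;
            T) echo "T: mismatched file modification time" ;;
            *) ;;                   # dots and unknown characters are ignored
        esac
        flags=$rest
    done
}
```

A string such as S.5....T on a binary like /bin/mount means the size, checksum, and modification time have all changed, which deserves a closer look.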

If you're checking files on Debian-based packages, use the debsums command. It uses the MD5 checksum for each file to verify its authenticity. This requires you to get the Debian package from a CD or download. I've downloaded and installed my Debian packages with the apt-get command, and they're stored in the /var/cache/apt/archives directory. Thus, if I wanted to check the MD5 checksums of the files from the smbclient package, I'd run the following command:

debsums -ag /var/cache/apt/archives/smbclient*

Any MD5 checksum for a specific file that does not match returns a message such as:

usr/bin/smbtar                 FAILED

For a more comprehensive and preemptive approach to detecting the malicious replacement of critical system files, install Tripwire on your system as soon as you install the operating system (and before connecting to a network). Tripwire works on most modern Linux distributions. For more information on the open source version of this tool, refer to http://www.tripwire.org.

I Need to Add a Custom Kernel Module

Sometimes, the module associated with an item of your hardware is not included with your distribution. That's an annoyance every Linux geek should know how to fix.

In "My Wireless Card Works on Another Operating System, but Not Linux" in Chapter 5, we stressed the importance of knowing the make, the model, and, in many cases, the chipset associated with each hardware component. With this information, you can identify available Linux modules associated with your hardware.

To find the right hardware module, try the following in order:

Check loaded modules. If the lsmod command shows a module for your hardware, it has been detected and installed by your distribution.
Check compiled modules. If you're lucky, the right module for your hardware is already available in your /lib/modules/`uname -r` directory.
Check the kernel source code. If it contains the module for your hardware, the kernel probably supports it and you can compile and install the module yourself.
Check your hardware manufacturer. Increasing numbers of hardware vendors support Linux. You may be able to download drivers direct from the manufacturer's web site, just as you might download Microsoft Windows drivers.
Check for experimental drivers. As discussed in the annoyances in Chapter 5, experimental Linux drivers are often available for testing. While these are not "production-ready," they may work well enough for your needs.

I'll describe what you do if one of the first four steps yields results. Refer to "My Wireless Card Works on Another Operating System, but Not Linux" in Chapter 5 if you have to resort to developmental (alpha or beta) drivers.

Check Installed Modules

The module that you need may already be properly installed; in this case, the operating system is ready and you may just have to do something hardware-specific to activate the hardware. (It may be as simple as checking that the cable is plugged in.)

As an example, assume you're looking for the Linksys Tulip driver. Based on the module name that you've found in your documentation or an online search, identifying the right driver is as simple as:

lsmod | grep -i tulip

Tip

Sometimes the case of a driver varies; the grep -i tulip command searches for tulip in upper- and lowercase.

If the driver turns up in your list of modules, the problem may be with the hardware. If you're sure the hardware is working (perhaps because you've tried it under another operating system), you may have a defective module; you may try recompiling from a fresh copy of the driver software, using the techniques we've described in "Recompiling the Kernel," earlier in this chapter.

Check Compiled Modules

If you're fortunate, the module for your new hardware is part of your distribution's module directory. The right compiled module should be in the /lib/modules directory associated with your active kernel (you might have more than one kernel). This directory is defined by:

/lib/modules/`uname -r`

If the module is available only in a different directory, try copying it to the directory associated with the active kernel. It might work if you're fortunate. Otherwise, use the source code to recompile the module for your running kernel.
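
To search the module tree without browsing it by hand, you could wrap find in a short function. This is a sketch under assumptions: find_module is a hypothetical name, and it looks only for the usual suffixes (.ko on 2.6 kernels, .o on 2.4 kernels), so compressed or oddly named modules would be missed.

```shell
# find_module NAME [DIR] is a hypothetical helper. It searches DIR (by default
# the module tree of the running kernel) for a compiled module called NAME.
# Assumes the usual suffixes: .ko on 2.6 kernels, .o on 2.4 kernels.
find_module() {
    name=$1
    dir=${2:-/lib/modules/$(uname -r)}
    find "$dir" -type f \( -name "$name.ko" -o -name "$name.o" \) -print
}

# Possible usage:
#   find_module tulip
```

No output means the module is not compiled for that kernel. Passing a second argument lets you check the tree of a different installed kernel.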

Check the Kernel Source Code

When you recompile your kernel, you can incorporate drivers of your choice. Assuming you've enabled loadable modules, there are three options for most drivers during kernel configuration:

Incorporate the driver directly into the kernel (the y option).
Configure the driver as a module, which will be loaded as needed (the m option).
Leave the driver out of your kernel altogether (the n option).

The notes associated with a driver are often quite specific. For example, the HCI UART driver shown in Figure 7-4 is directly associated with several specific Bluetooth cards.

Figure 7-4. A Bluetooth driver in the kernel

The particular tool shown in this figure lists modules with dots, and notes drivers that are compiled directly into the kernel with check marks. Other tools may be more straightforward. For example, with the ncurses-based make menuconfig tool, y incorporates the driver into the kernel, m configures the driver as a module, and n excludes the driver from your kernel and list of modules.
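
Whichever tool you use, the choice ends up in the .config file in the kernel source directory, where you can check it directly. The function below is a sketch: config_state is a hypothetical name, and it handles only options set to y or m, treating anything else (including options that are absent or marked "is not set") as n.

```shell
# config_state SYMBOL [FILE] is a hypothetical helper. It prints y, m, or n
# for one option in a kernel .config file (default: ./.config).
config_state() {
    sym=$1
    file=${2:-.config}
    # Keep only the y or m from a line such as CONFIG_TULIP=m
    value=$(sed -n "s/^$sym=\([ym]\)\$/\1/p" "$file")
    echo "${value:-n}"
}

# Possible usage, run from the kernel source directory:
#   config_state CONFIG_TULIP
```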

Check Your Hardware Manufacturer

An increasing number of hardware manufacturers provide Linux drivers. If you haven't found a module for your component yet, try their web sites. You might be pleasantly surprised. Unfortunately, loading a driver is a bit more difficult than downloading and running an executable file.

While details vary, there are five basic steps associated with setting up a downloaded driver for Linux:

Download the driver. Process it per the vendor's instructions. If they suggest that you don't need to recompile your kernel, and they include a driver customized for your specific kernel, skip to the next section of this annoyance.
Place the driver's source code file (if it is not compiled, it normally has a .c extension) in the appropriate kernel source code directory. The vendor instructions may suggest a directory.
Navigate to your kernel directory. Use the kernel configuration tool of your choice, such as make menuconfig, make xconfig, or make gconfig. Configure the driver either as a module or directly into the kernel.
If the only change that you've made is to add the driver, you can try just reinstalling the modules with the make modules and make modules_install commands. But the safer option is to recompile the whole kernel. The documentation from your driver vendor may provide more information.
Reboot and start Linux using the new kernel. Find the appropriate module in the /lib/modules directory. If found, you may be able to install it, using the options described in the next section.

Making Sure Your Kernel Is Loaded

Once you have a working driver, Linux might automatically detect it the next time you reboot. If not, make sure the driver module is available from the correct directory, specifically a subdirectory of /lib/modules/`uname -r`. See if you can load it with the appropriate insmod or modprobe commands.

Assuming that works, you'll need to add the driver, with configuration options, to a module configuration file. The next time you boot Linux, driver modules are loaded based on instructions in this file. For distributions associated with Linux 2.4 series kernels, the file is /etc/modules.conf. For Linux 2.6 series kernels, the file name varies by distribution.

For example, Debian includes some generic drivers in /etc/modules. Specialty drivers are loaded via scripts in the /etc/modutils directory. Standard module aliases are listed in different files in the /etc/modprobe.d directory.

In SUSE and Fedora Core 3, standard modules are listed in /etc/modprobe.conf. On SUSE, user-defined modules can be loaded via /etc/modules.conf. Specific options can be added to appropriate files in the /etc/modprobe.d directory.

My Files Are on That Other Computer

What's the point of a network if you can't share files? Linux provides several different protocols for file sharing. Two of the most common are the Network File System (NFS) and Samba.

NFS is the most efficient way to share directories between Linux and Unix computers. Unfortunately, if you have connection problems, shared NFS directories can hang up a client computer.

Samba supports sharing between Linux/Unix and Microsoft Windows computers. It's almost up-to-date with the latest developments in the Server Message Block/Common Internet File System (SMB/CIFS) protocols. While you can configure a Samba server as a Primary Domain Controller (PDC) or as a member server of an Active Directory (AD) network, Samba cannot yet act as a Domain Controller on an AD network.

The process of configuring NFS and Samba shares is a complex topic; a complete discussion is beyond the scope of this book. We assume in this section that you know the basics of installing and activating the appropriate NFS and Samba packages on your system. For more information, see Managing NFS and NIS by Hal Stern and Using Samba by Jay Ts et al. (both published by O'Reilly).

The annoyances that we'll deal with are:

Configuring NFS directories risks hangs when the network is down.
Connecting to shared Samba directories normally requires root account access.

Connecting with NFS

The process of configuring a regular NFS share is elementary for the Linux geek. However, if you want to minimize the risk of hangs when a client tries to mount an NFS directory, you'll need to follow these instructions carefully.

Under NFS, the system on which a shared directory physically resides permits other systems to mount the directory by listing it in /etc/exports. Access can be further limited with the /etc/hosts.allow and /etc/hosts.deny files. Given the NFS security issues, such as no provision for encryption over the network, root access is prohibited by default.

It's common to configure a shared directory good for all users, such as /home, from a central server. For example, you might see the following line in /etc/exports:

/home    192.168.0.0/24(rw,sync)

This line exports the directory /home to all systems on an internal Class C network, allowing read/write access and making the application that writes a file wait until the data is stored on the remote disk.

On the client side, there are two basic ways to mount a shared NFS directory: with a hard mount or a soft mount. A hard mount is more resistant to dropped connections, which can corrupt your data. In contrast, a soft mount can keep your system from hanging if there's a network problem when your system attempts to connect to an NFS directory. So if you need reliability when writing to a shared NFS directory, consider a hard mount. If you have trouble connecting to NFS server systems, consider a soft mount.

Default mounts through NFS are hard mounts. So if you choose a hard mount, you can configure it with a line such as this one in the client computer's /etc/fstab:

192.168.0.10:/home    /server     nfs    nfsvers=2    0   0

The /home directory on 192.168.0.10 will be hard-mounted on /server. To use a soft mount, configure it explicitly on the client:

192.168.0.10:/home    /server     nfs    soft,nfsvers=2    0   0
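
Because a hard mount is the default, nothing in the options field says so explicitly, and it's easy to lose track of which shares can hang a client. The following sketch reports the mount type for each NFS entry. The name nfs_mount_kind is hypothetical, and the function assumes the standard six-field /etc/fstab layout with nfs in the third field.

```shell
# nfs_mount_kind is a hypothetical filter: it reads fstab-style lines on
# standard input and prints each NFS mount point followed by hard or soft.
nfs_mount_kind() {
    awk '$1 !~ /^#/ && $3 == "nfs" {
        kind = "hard"                      # hard is the NFS default
        n = split($4, opt, ",")            # options are comma-separated
        for (i = 1; i <= n; i++)
            if (opt[i] == "soft") kind = "soft"
        print $2, kind
    }'
}

# Possible usage:
#   nfs_mount_kind < /etc/fstab
```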

Naturally, this isn't the only way you can configure a mount from a client. You can mount the remote /home directory on the local /home directory. You can also use the fully qualified domain name of the NFS server, as long as forward and reverse DNS pointers are available for that server.

There is one more way to mount a shared NFS directory: the automounter, which we'll describe at the end of this annoyance.

Connecting with Samba

One of the annoying things about Samba is that you have to be the root user to connect to a shared Samba or Microsoft directory—at least under the defaults for some distributions. Regular users need to connect to shared directories all the time. I'll show you some ways around this problem in this section.

To support access, users need an account on the Samba server or on the corresponding PDC. Just as you can configure a single database of Linux/Unix usernames and passwords on a NIS or LDAP server, you can configure a single database of Microsoft usernames and passwords on a Domain Controller. You can configure a Microsoft Windows server or a Linux server configured with Samba as a PDC.

Let's start with the simplest solution to the user account problem—but the solution that requires the most manual work for you. This solution is to configure a user account for every user to whom you want to give access to a share on your Linux system. This works fine in a small environment where only one server offers files or printers.

For example, I've configured my own account with the following command, which prompts for a password.

smbpasswd -a michael

If you're using a Samba 2.x system, the corresponding command is smbadduser.

To browse the shares from a Samba (or a Microsoft Windows) server, smbclient can help. For example, if you want to view the shares on a computer named sunshine, run the following command:

smbclient -L sunshine

If you're familiar with Microsoft operating systems, you may recognize the output. It's similar to what you see through the Network Neighborhood or My Network Places tools, or from a net view \\sunshine command.

Most of the latest Linux GUI desktop environments make it easy to browse a Samba/CIFS network. Current GNOME desktops support network browsing with Nautilus, while KDE desktops can access the network via Konqueror. Just enter the following in the address bar of one of these tools:

smb:///

If you don't see the address bar in either browser, press Ctrl-L (Nautilus) or Ctrl-O (Konqueror). You can then enter smb:/// in the Location text box.

The standard Samba configuration file, smb.conf, supports access by regular users to their home directories. The following commands in the file serve up the home directory for each user:

[homes]
        comment = Home Directories
        valid users = %S

But therein lies a problem. Regular users aren't normally allowed to use the mount command. On many Linux distributions, they aren't even allowed to use the Samba mount commands such as smbmount.

Debian alleviates this problem by configuring two key commands as SUID root, supporting access by regular users: smbmnt and smbumount. If you're running Samba on another distribution, you can set SUID permissions with the following commands:

chmod u+s /usr/bin/smbmnt
chmod u+s /usr/bin/smbumount

Non-root users can now use smbmount and smbumount to access their home directories on other Linux computers. The first command mentioned in the previous sentence is not a misprint; because smbmount uses the smbmnt command, regular users can work with either one, as long as they have set the noted SUID permissions.

It's possible that other distributions will adopt Debian's settings for smbmount and smbumount. At that point, user connections to remote home directories shared via Samba will be less of an annoyance.

Tip

As described shortly, you can use the automounter to mount shared Samba directories. However, if the share is password-protected, that means you'll have to add a password in clear text to an automounter configuration file.

Regular users now have access to their home directories. In this configuration, as user michael, I can access my home directories on other computers. For example, to mount the /home/michael directory on a directory named test/ on my SUSE laptop, I'd run the following command. Commands such as smbmount prompt for passwords as required (the password on the server or Microsoft Domain Controller, not the user's local system), so I'll need to enter the password that I created on the Samba server when prompted:

smbmount //suse1/michael test

Then I can unmount this share with the following command:

smbumount test

One user can also log in as a different user on the Samba server. Assuming Samba passwords have been assigned, I could log in to Donna's account with the following command:

smbmount //suse1/donna test -o username=donna

Once again, I'm prompted for Donna's password. I could add the password to the command line, but it would appear in clear text on the terminal, where a "shoulder surfer" (someone who looks over your shoulder for information) might read the password.

Automating Mounts with the Automounter

If your network is less than reliable, network mounts can cause trouble. Mount attempts to inaccessible NFS servers can even cause your computer to hang.

This is where the automounter can help. It is invoked by the kernel when someone accesses a remote directory in any way—such as by issuing an ls command or opening a file in that directory in a text editor—and performs the necessary mount over NFS or SMB/CIFS. The behavior is impressively fast and really makes networking seamless.

The automounter is available by default on most modern Linux systems. It requires the autofs service and is configured through the /etc/auto.master configuration file. In most cases, the file you configure is /etc/auto.misc. Most distributions include commented sample commands that you can use. We'll show simple examples here. For more information, see http://www.tldp.org/HOWTO/Automount.html.

NFS automounter share

In this example, I've set up an NFS share of the /home directory from the suse1 server. I've also configured the following command in my /etc/auto.misc file:

linux     -rw,soft,intr      suse1:/home

I can then access the NFS share as a regular user with the following command:

ls /misc/linux

Samba automounter share

In this example, I can use the standard Samba share of my /home/michael directory from the suse1 server. All I need is the following command in my /etc/auto.misc file:

michael -fstype=smbfs,username=michael,password=michael ://suse1/michael

I can then access the Samba share with the following command:

ls /misc/michael

Notice that configuring the automounter this way stores your password in clear text, which is a big security hazard because this file is readable by default by all users. If you choose to share Samba directories in this way, limit access to this configuration file to the root user:

chmod 600 /etc/auto.misc
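
It's worth confirming that the permissions really are as tight as you intended. The function below is a sketch: is_private is a hypothetical name, and it simply reads the permission string printed by ls -ld, succeeding only when the group and other users have no access at all.

```shell
# is_private FILE is a hypothetical helper. It succeeds only when neither the
# group nor other users have any permission bits set on FILE (e.g. mode 600).
is_private() {
    perms=$(ls -ld "$1" | cut -c5-10)   # the six group/other characters
    [ "$perms" = "------" ]
}

# Possible usage:
#   is_private /etc/auto.misc || echo "other users can still read the password"
```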

Regular Users Can't Mount the CD/DVD Drive

The CD/DVD drive is a critical part of modern personal computers, allowing users to access music, movies, backups, and more. While you may find it best to lock the CD/DVD drive on Linux servers, you need to support it on users' systems. And that is precisely where Linux makes trouble: it generally allows only the root user to mount a filesystem. When a user inserts a CD or DVD, the filesystem associated with the CD/DVD drive has to be mounted before the user has access.

There are two options for giving users access to their drives: running the automounter, or making revisions to the mount command and /etc/fstab file to let non-root users mount the pertinent directories.

Tip

Closely related is the "My CD/DVD Is Locked" annoyance in Chapter 1, which includes instructions on how to disable automatic CD/DVD mounting via KDE's autorun and GNOME's gnome-volume-properties.

Configuring the Automounter

The automounter runs as root when the kernel detects that a new filesystem has appeared and mounts it as root so that it magically appears for the user. As discussed in the previous annoyance, the automounter relies on the /etc/auto.master and /etc/auto.misc configuration files. When you activate the following directive in /etc/auto.master, you'll get a link to /etc/auto.misc:

/misc     /etc/auto.misc    --timeout=60

Open the /etc/auto.misc file. You'll see a default directive for the CD/DVD drive, such as:

cd       -fstype=iso9660,ro,nosuid,nodev :/dev/cdrom

Make sure the automounter reads your revised configuration files by restarting the associated service:

/etc/init.d/autofs restart

Put the two files together. You've configured the /misc directory in /etc/auto.master and the cd subdirectory in /etc/auto.misc. Test the result. Insert a disk in the CD/DVD drive and run the following command:

ls /misc/cd

Setting Up Mounts by Regular Users

The alternative to the automounter is to configure your /etc/fstab configuration file to allow regular users to mount your CD/DVD drive. Because there is a single group to represent all users in SUSE, the following line in /etc/fstab works on that distribution. The third field, auto, tells mount to detect the filesystem type on the disk:

/dev/hdc  /media/cdrecorder  auto  users,umask=000    0 0

The users group is available on most distributions, and you can assign the users of your choice to that group. The default Linux Group ID for users is 100.

You'll also need to configure the mount command with SUID permissions, to support access by regular users. It's already configured that way in SUSE and Fedora Core.

I'm Having Trouble Connecting to an Existing Network

Unfortunately, there are so many ways networks can go wrong that they're hard even to categorize, much less describe and solve. With the development of wireless networks, potential problems have multiplied.

When diagnosing network problems, the first thing to remember is that most problems are physical. If you rush to change your networking software when the problem is just a hub without power, you could make things worse. As wireless networks have their own physical and software issues, we discuss this issue separately at the end of this annoyance.

Tip

This annoyance assumes you're using TCP/IP networking, which is the standard on the Internet. Because Unix was developed concurrently with the foundations of the Internet, and Linux is in many ways a clone of Unix, Linux is built for TCP/IP.

Some companies use other networking protocols to promote security or for legacy reasons. Linux can support other networking protocols, such as AppleTalk and IPX/SPX. For more information, see the applicable HOWTOs at http://www.anders.com/projects/netatalk and http://www.tldp.org/HOWTO/IPX-HOWTO.html.

After you fix a network problem, you may need to revise a configuration file to keep the problem from happening again the next time you boot. Generally, most modern distributions store these configuration files in the /etc/sysconfig/network or similar directories. If you have trouble finding the right file, Red Hat/Fedora, SUSE, and Debian all have excellent GUI utilities that can help you configure basic network interfaces.

Tip

While this is a relatively long section, I still cover only a few of the basic networking issues. Unfortunately, a complete list of annoyances and solutions is beyond the scope of this book. For more information, start with the Networking HOWTO at http://www.tldp.org/HOWTO/Net-HOWTO.

Isolating the Problem

Chances are good that you already have a working LAN. But trouble is sure to happen from time to time. Cables can fray or become loose. Heat can cause network cards to work their way out of their slots. Power may cycle on your hub, switch, or router. And the first symptom you see may be network trouble on your Linux system.

In this section, I list potential problems to check step by step. As you gain experience, you may be able to isolate the problem more quickly.

Basic loopback connections

Check whether networking is operational by looking at the status of your loopback connection, one that doesn't depend at all on networking hardware. Output from the ifconfig command should list active network adapters. As long as network software is installed, you should see output at least from your loopback adapter, similar to:

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:533496 errors:0 dropped:0 overruns:0 frame:0
          TX packets:533496 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:75134145 (71.6 MiB)  TX bytes:75134145 (71.6 MiB)

If you don't see a loopback interface, it may be down. You can try to activate it with the following command:

/sbin/ifconfig lo up

You should also be able to verify the loopback interface with the following command:

ping 127.0.0.1

Now try the ifconfig command again. If this doesn't work, the problems are deeper than I can address in this annoyance.
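
If you check interfaces often, you can let a script read the ifconfig output for you. This is a sketch: iface_is_up is a hypothetical name, and it assumes the classic ifconfig layout shown above, where each interface block starts with an unindented name and active interfaces carry the word UP on their flags line. Newer versions of ifconfig print a different layout that this function would not understand.

```shell
# iface_is_up NAME is a hypothetical filter: it reads classic ifconfig output
# on standard input and succeeds if interface NAME is listed with the UP flag.
iface_is_up() {
    awk -v ifc="$1" '
        /^[^ \t]/ { cur = $1 }                           # unindented line: new interface block
        cur == ifc && /(^|[ \t])UP([ \t]|$)/ { up = 1 }  # flags line of the wanted interface
        END { exit !up }
    '
}

# Possible usage:
#   /sbin/ifconfig | iface_is_up lo || echo "loopback is down"
```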

Checking network interfaces

Assuming you have network adapters on your system, you should also see their output from ifconfig. If you don't, try activating the associated interfaces. Assuming they're Ethernet or wireless adapters, try the following commands:

/sbin/ifconfig eth0 up
/sbin/ifconfig wlan0 up

Then run ifconfig again. You should see output such as:

eth0      Link encap:Ethernet  HWaddr 00:0D:9D:86:36:A0
          inet6 addr: fe80::20d:9dff:fe86:36a0/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:297 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:99454 (97.1 KiB)
          Interrupt:10 Base address:0xe000

If you still don't see your network cards, you may have a physical problem with the card or connection; read ahead for more information. But even if the card appears, as in the sample output, there's still a problem if you're on an IPv4 network: the output lists no IPv4 address, and you need one.

Tip

An IPv6 address is shown in the settings associated with a network card. This is a link-local address, which Linux derives automatically from the card's manufacturer-assigned hardware address, and it is probably not suitable even if you're configuring an IPv6 network. If you use IPv6, you probably derive addresses from a hierarchy of authorities, as with IPv4.

If there's a DHCP server for your network, check it with your DHCP client command. Different distributions use client commands such as dhcpcd, dhclient, and pump to ask for an address from that server. If you have multiple interfaces, you should specify one; for instance, the following command asks for DHCP service for the Ethernet card on my computer:

dhclient eth0

Ideally, you'll now see something similar to the following IPv4 address information in the output to ifconfig:

  inet addr:192.168.0.11  Bcast:192.168.0.255  Mask:255.255.255.0
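
To pull just the IPv4 address out of the ifconfig output, for example to feed it to ping, a short awk filter is enough. This is a sketch: ipv4_of is a hypothetical name, and it assumes the classic ifconfig layout with an "inet addr:" label, as in the line above.

```shell
# ipv4_of NAME is a hypothetical filter: it reads classic ifconfig output on
# standard input and prints the IPv4 address of interface NAME, if it has one.
ipv4_of() {
    awk -v ifc="$1" '
        /^[^ \t]/ { cur = $1 }             # unindented line: new interface block
        cur == ifc && /inet addr:/ {
            sub(/.*inet addr:/, "")        # drop everything up to the address
            print $1
        }
    '
}

# Possible usage:
#   /sbin/ifconfig | ipv4_of eth0
```

Empty output means the interface has no IPv4 address yet, which is the situation the DHCP client command is meant to fix.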

Now you can test the connection between your computer and the network card. In this case, you can do so with the following command:

ping 192.168.0.11

You'll need to stop the output by pressing Ctrl-C. Alternatively, you could use the -c 4 switch to limit the output to four pings, i.e.:

ping -c 4 192.168.0.11

Checking connectivity

Now you can check connectivity to the rest of the network. The first step is to check the connectivity to a neighboring computer. If you're a Linux administrator for the network, you should be able to find these addresses through /etc/hosts or from the local DNS server whose IP address is listed in /etc/resolv.conf.

Tip

If you don't know how to determine what IP addresses are on your network, refer to the IP Sub-Networking mini-HOWTO at http://www.tldp.org/HOWTO/IP-Subnetworking.html.

For example, the following command verifies connectivity to my Internet gateway router:

ping 192.168.0.1

Next, you can check connectivity to your network's IP address on the Internet. It's available through the other network interface on your gateway computer or router. If the gateway is a Linux or Unix system, you can find the interface's address with an ifconfig command on that computer. If it's another operating system, consult appropriate documentation.

For example, if the Internet address on my network gateway is 11.12.13.14, I'd try:

ping 11.12.13.14

Now, unless you know a specific IP address on the Internet, that's as far as you can go with just IP addresses.
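
The checks so far work outward: loopback, your own address, the gateway, and then the gateway's Internet address. You could run them in that order with a small loop that stops at the first address that doesn't answer. This is a sketch: first_failure is a hypothetical name, and PROBE is a hook I've added so that the probe command can be replaced; by default it runs ping -c 1.

```shell
# first_failure HOST... is a hypothetical helper. It probes each host in the
# order given and prints the first one that does not answer.
# PROBE is an optional hook naming the probe command (default: ping -c 1).
first_failure() {
    for host in "$@"; do
        if ! ${PROBE:-ping -c 1} "$host" >/dev/null 2>&1; then
            echo "$host"
            return 1
        fi
    done
    return 0
}

# Possible usage, with the example addresses from this section:
#   first_failure 127.0.0.1 192.168.0.11 192.168.0.1 11.12.13.14
```

The address that's printed tells you where to start looking: everything before it in the list answered.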

Checking names on your LAN

As we don't normally connect to the Internet with IP addresses in our browsers, we also need to check connectivity through computer names. If you've configured static IP addresses, you should be able to find the computer names on your network in /etc/hosts. Alternatively, if you have a DNS server for your network, you should be able to find the list with the appropriate host command. For example, if you use example.com as your private network domain, you'd run:

host -l example.com

The host -la command may be required for later versions of DNS.

Tip

I had previously configured example.com on my network; one of the results was enterprise3d.example.com, which I could then ping from another computer on my network.

ping -c 4 enterprise3d.example.com

Checking names on the Internet

If you're connected to the Internet, you can check name connectivity in a wider setting. Run the ping command to your favorite web site:

% ping -c 1 www.oreilly.com
PING www.oreilly.com (208.201.239.36) 56(84) bytes of data.
64 bytes from www.oreillynet.com (208.201.239.36): icmp_seq=1 ttl=45 time=40.1 ms

This response verifies that the DNS servers that you use for Internet addresses are working properly. If you have a problem here, you should check your connection to your ISP's DNS servers. If your gateway computer runs on Linux, you'll find their addresses in that computer's /etc/resolv.conf.

Alternatively, your gateway computer may not know where to route requests. The following shows that your system knows where to route requests to two internal networks. But if the IP address is associated with a different network, your system doesn't know where to route the request:

% netstat -r
Kernel IP routing table
Destination  Gateway      Genmask       Flags MSS Window  irtt Iface
192.168.0.0  *            255.255.255.0 U       0 0          0 eth1
192.168.1.0  *            255.255.255.0 U       0 0          0 eth0

What you need is a default route, which applies to IP addresses not otherwise specified. Assuming your network is connected to the Internet and the interface on the gateway that receives data from your system is 192.168.0.1, this command should solve your routing problem:

% route add default gw 192.168.0.1

And the next time you run netstat -r, you'll see the following output.

default      192.168.0.1  0.0.0.0       UG      0 0          0 eth1

The 0.0.0.0 in the output refers to the network mask; it means that all addresses go through 192.168.0.1. Sometimes, the output also lists default as the destination address in place of 0.0.0.0; the two are synonymous when it comes to IPv4 addressing. In some cases, you may even see the fully qualified domain name (FQDN) of the gateway.
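
A script can make the same check by looking for a default entry in the routing table. This is a sketch: has_default_route is a hypothetical name, and it accepts either default or 0.0.0.0 in the first column, since the destination appears as 0.0.0.0 when the table is printed numerically.

```shell
# has_default_route is a hypothetical filter: it reads a routing table on
# standard input and succeeds if one entry is the default route.
has_default_route() {
    awk '$1 == "default" || $1 == "0.0.0.0" { found = 1 } END { exit !found }'
}

# Possible usage:
#   netstat -r | has_default_route || echo "no default route configured"
```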

Firewalls

When you're able to connect to other computers on your LAN but not to an external network such as the Internet, you may have a firewall that is too restrictive. For example, the firewall could allow you to ping web sites on the Internet but not connect to those sites using TCP to get a web page.

In many cases, the only computer configured with a firewall is the gateway computer or router between your network and an external network, such as the Internet.

On the gateway computer, if you trust internal users (a big if), you may disable firewalls on the network card associated with the internal LAN. Unless you're working in a location such as an Internet café, crackers normally come from outside the network.

You might want to create defenses within your network as well. For example, you might configure outgoing email servers to stop internal users from sending out an excessive number of emails, which might qualify as spam. Or you might want to create firewalls within your network to further protect critical areas within your enterprise from external and internal users.

Tip

If your company wants you to block access to certain sites on the Internet, one alternative is a proxy server. For more information on the Squid proxy server, see Squid: The Definitive Guide by Duane Wessels (O'Reilly).

If you need a firewall to regulate traffic within your LAN, you'll probably need a number of open ports to support services such as Samba, NFS, and SSH. All these open ports are difficult to configure and complex to maintain, and they make internal firewalls less valuable.

Because many Linux distributions configure a firewall by default, that may prevent some types of network communication within your LAN. To check the operation of, and then disable, an iptables firewall, run the following commands:

iptables -L
iptables -F

The first command lists all rules currently being used to filter traffic, and the second flushes the rules so no filtering is done.
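
One detail to watch in the listing is the default policy of each chain. Flushing removes the rules but leaves the policies alone, so a chain whose policy is DROP still discards traffic after the flush. The following sketch prints the policy of each built-in chain; chain_policies is a hypothetical name, and the function assumes the usual "Chain INPUT (policy ACCEPT)" header lines from iptables -L.

```shell
# chain_policies is a hypothetical filter: it reads iptables -L output on
# standard input and prints each built-in chain with its default policy.
chain_policies() {
    awk '/^Chain .*\(policy / {
        policy = $4
        sub(/\)$/, "", policy)     # strip the closing parenthesis
        print $2, policy
    }'
}

# Possible usage:
#   iptables -L | chain_policies
```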

To make sure Linux doesn't reactivate the firewall the next time you reboot, you'll need to disable or delete the appropriate command file in the /etc/sysconfig directory. The file varies by distribution. SUSE encourages users to disable firewalls using YaST.

A detailed discussion of firewalls is beyond the scope of this book. For more information on firewalls, see Linux iptables Pocket Reference by Gregor Purdy (O'Reilly).

Physical Network Troubleshooting

Loose cables, problematic network cards, and—given that so many of us now run wireless networks—the presence of too many walls between a wireless card and an access point are the most common reasons network connections fail. I assume that you already understand the basic functionality of network hardware; I summarize the components in Table 7-2.

Table 7-2. Network components

Component	Potential physical problem
Network Interface Card (NIC)	Not seated in the motherboard slot or PCMCIA socket. Check lights to confirm connectivity.
Wireless NIC	Too distant from access point; too many walls blocking signal; interference from devices in similar frequencies—e.g., handheld telephones.
Cable	Wrong cable type, such as the incorrect use of a crossover cable between a PC and a hub. Severe bends can affect performance.
Hub/Switch/Router	Lack of power. Check lights to confirm connectivity. Make sure lights are active for all connections.
Gateway/Router	Lack of power. Check lights to confirm connectivity. Requires at least two NICs on the computer gateway.

When there is a solid connection between a Network Interface Card (NIC) or Hub/Switch/Router and a cable, you should see lights on each component. Generally, a solid light means you have power or connectivity; a blinking light is a sign of network activity.

Troubleshooting Network Services

If you've verified your physical network connections and still have problems, check your network services. Because some of these services are associated with other annoyances, this discussion is limited to general principles.

Make sure the service is active. You can test it by starting the associated script from /etc/init.d. Once you find the service is operational, make sure the service starts in appropriate runlevels the next time you boot Linux.
Unless you need to secure your systems from inside attack, deactivate firewalls on computers internal to your LAN.
Check the appropriate configuration file for your service. You must list each directory in the configuration file (or use the generic terms homes and printers in Samba) in order for others to access it over the network.
See if you can access the shared network directory on the local computer. If you can't get to a shared directory locally, you probably can't get to it from other computers on your LAN.

Wireless Network Issues

The advance of wireless networks led to additional annoyances. We've briefly addressed interference with other wireless devices. Worst of all, an unsecured wireless network makes it easy for outsiders to break in. In general, we assume that you're configuring a connection to an access point, such as a gateway router. However, it's also possible to connect wirelessly to a peer, such as a wireless card attached to a different computer.

Tip

This section contains just a brief overview of what you can do to avoid wireless annoyances. An excellent option for more in-depth coverage is Linux Unwired by Roger Weeks et al. (O'Reilly).

As described in "My Wireless Card Works on Another Operating System, but Not Linux" in Chapter 5, a working wireless NIC will show up in the output of ifconfig -a.

To manage a wireless network on Linux, you need the commands associated with the wireless-tools package. (At least, that's the name of the package on Red Hat/Fedora, SUSE, and Debian.)

If your wireless card fits into a PCMCIA slot, you'll also need separate configuration files in the /etc/pcmcia directory. The package that installs these files varies by distribution and by major kernel version. Table 7-3 lists some sample names under which you can find the package.

Table 7-3. Wireless package

Distribution	Major kernel version	Package
Debian	2.4/2.6	pcmcia-cs
SUSE (older versions)	2.4	kernel-pcmcia-cs
SUSE 9.x/10.x	2.6	pcmcia
Red Hat Enterprise Linux 3	2.4	kernel-pcmcia-cs
Fedora Core 3/4, Red Hat Enterprise Linux 4	2.6	pcmcia-cs

These configuration files may not work with special wireless tools or drivers installed from third-party sources such as Linuxant (http://www.linuxant.com) or SourceForge (http://sf.net), which we discussed in more detail in Chapter 5.

Once you have the right packages installed, you can configure your wireless card from the command-line interface. The key commands are iwconfig, iwevent, iwgetid, and iwlist. Once your wireless network operates to your satisfaction, you'll need to modify the appropriate configuration files with your desired settings. The commands are described in the following subsections.

iwevent

The iwevent command can help you monitor the wireless network. You can run this command in the background:

iwevent &

Once started, the command can help you monitor major changes to your wireless network, such as hardware, speeds, and more. Even when the command runs in the background, its output goes to the command console.

iwgetid

The iwgetid command identifies the name of the wireless network to which you're connected. For example, I might see the following output, which reflects the ESSID of my wireless network:

wlan0:     ESSID:"randynancy"

If this isn't the network you want, you can do something about it with the iwlist and iwconfig commands.
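If you want a script to make that check for you, the ESSID can be pulled out of the iwgetid output with sed. This is a minimal sketch that assumes the output format shown above; the function names essid_from_iwgetid and on_network are hypothetical.

```shell
#!/bin/sh
# Extract the ESSID from iwgetid-style output read on standard input.
essid_from_iwgetid() {
    sed -n 's/.*ESSID:"\(.*\)".*/\1/p'
}

# Succeed only if the ESSID read on standard input matches the name in $1.
on_network() {
    [ "$(essid_from_iwgetid)" = "$1" ]
}
```

For example, iwgetid wlan0 | on_network randynancy || echo "wrong network" prints a warning when the card has associated with some other network.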

iwlist

The iwlist command is powerful. It can help you scan available networks, manage transmission power, check available communication channels, list access points, and more. You can run it in the following format:

iwlist [device] option

Generally, it's more efficient to specify the network device when you run this command. For example, if your wireless interface is wlan0, and you want to scan available wireless network ESSIDs, run the following command:

iwlist wlan0 scanning

Key information from this output includes the frequency (channel), ESSIDs, available bit rates, signal strengths, and access modes.

Other key iwlist command options are listed in Table 7-4. In several cases, two options, such as rate and bitrate, produce the same result.

Table 7-4. iwlist command options

Option	Description
scanning	Scans for available access points, returning the ESSIDs, transmission frequencies, bit rates (from the access points), signal strengths, and access modes
frequency, channel	Lists available channels and their corresponding reception frequencies available to your network card
bitrate, rate	Lists available bit rates for your network card
encryption, key	Lists encryption keys for your network card
power	Lists the power-management modes for your network card
txpower	Reports the transmission power from your network card
ap, accesspoints	Reports detected access points
peers	Lists detected access points and configured peers

iwconfig

In the same way you can configure a regular network card with the ifconfig command, you can configure a wireless network card with the iwconfig command: you can change access points, set bit rates, adjust transmission power, and more. Just remember that, once you've verified that your changes work, you'll need to revise the applicable configuration files or scripts for your wireless device so they take effect each time the system boots.

Running iwconfig without options returns the wireless characteristics of each wireless network device:

wlan0  IEEE 802.11-DS  ESSID:"randynancy"  Nickname:"unknown"
      Mode:Managed  Frequency:2.412 GHz  Access Point: 00:09:5B:FA:BB:76
      Bit Rate=5.5 Mb/s   Tx-Power=20 dBm
      RTS thr:off   Fragment thr:off
      Encryption key:off
      Power Management:off
      Link Quality=38/100  Signal level=-62 dBm  Noise level=-154 dBm
      Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
      Tx excessive retries:0  Invalid misc:0   Missed beacon:0
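When you're chasing a weak signal, it can help to pull just the link figures out of this output. The following sketch assumes the "Link Quality" and "Signal level" labels appear as in the sample above (some driver versions print a colon instead of an equals sign, which the patterns also accept); the function names are hypothetical.

```shell
#!/bin/sh
# Print link quality as "current maximum", e.g. "38 100".
link_quality() {
    sed -n 's/.*Link Quality[=:]\([0-9]*\)\/\([0-9]*\).*/\1 \2/p'
}

# Print the signal level in dBm, e.g. "-62".
signal_level() {
    sed -n 's/.*Signal level[=:]\(-*[0-9][0-9]*\) dBm.*/\1/p'
}
```

Both functions read standard input, so you could run iwconfig wlan0 | link_quality while you move a laptop around to see where reception drops off.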

When you specify the wireless device, you can change its configuration. For example, you may be able to connect to more than one wireless network:

# iwlist wlan0 scanning | grep ESSID
ESSID:"randynancy"
ESSID:"default"

You might have trouble connecting to your preferred network. In my case, I want to make sure that I connect to my home network (instead of my neighbor's network). Thus, I specify the network to which I connect as follows:

# iwconfig wlan0 essid randynancy
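Before setting the ESSID in a script, you may want to confirm that the network is actually in range. This is a minimal sketch that assumes scan output in the ESSID:"name" form shown above; the essid_visible function is a hypothetical helper.

```shell
#!/bin/sh
# Succeed if the ESSID named in $1 appears in iwlist scan output
# read on standard input.
essid_visible() {
    grep -qF "ESSID:\"$1\""
}
```

You could then combine the two steps as iwlist wlan0 scanning | essid_visible randynancy && iwconfig wlan0 essid randynancy, so the card is reconfigured only when the preferred network shows up in the scan.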

There are a number of other wireless characteristics that you can configure with the iwconfig command. Using the format shown in the previous example, you can change the settings described in Table 7-5.

Table 7-5. iwconfig options

Option	Function
essid	Sets the wireless network to which your device connects.
channel	Specifies the channel where your wireless card communicates. It's best if it matches the transmission channel configured at your access point.
mode	Changes the operating mode to either centralized communication with an access point or ad hoc communication with other wireless peers. Options include Ad-Hoc (no access point), Managed (with access points), Master (this network card is the access point), Repeater (forwards from access points), Secondary (a backup repeater), and Monitor (the card only receives data).
ap	Defines a specific access point.
rate	Specifies a communication rate in bits per second.
key	Sets an encryption key.
txpower	Specifies the transmission power.
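The channel option takes a channel number, while iwconfig reports a frequency. In the 2.4 GHz band, channels 1 through 13 sit 5 MHz apart starting at 2.412 GHz, and channel 14 is at 2.484 GHz. The sketch below converts one to the other; freq_to_channel is a hypothetical helper, and it handles only that band.

```shell
#!/bin/sh
# Convert a 2.4 GHz band frequency (in GHz) to its channel number.
# Returns a nonzero status for frequencies outside channels 1 through 14.
freq_to_channel() {
    awk -v ghz="$1" 'BEGIN {
        mhz = int(ghz * 1000 + 0.5)          # round to whole megahertz
        if (mhz == 2484) { print 14; exit 0 }
        if (mhz >= 2412 && mhz <= 2472 && (mhz - 2407) % 5 == 0) {
            print (mhz - 2407) / 5
            exit 0
        }
        exit 1
    }'
}
```

With the frequency from the earlier iwconfig output, freq_to_channel 2.412 reports channel 1, which is the value you would pass to iwconfig wlan0 channel.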

I Need to Work with Microsoft-Formatted Partitions

Users who continue to run Microsoft Windows on dual-boot systems with Linux need access to Windows filesystems from Linux. Even users in the process of converting to Linux may retain important files on Microsoft-formatted partitions and want to read or write them from Linux. Naturally, you'll want to encourage users to run Linux whenever possible. Therefore, you'll need to help your users access Microsoft partitions from Linux on a local computer. Samba is no help in this case because it offers access to filesystems on running operating systems, not on alternative operating systems that haven't been booted.

Linux has no problems with local partitions formatted as one of the various File Allocation Table (FAT) filesystems. You can read and write files to any partition with this format. If the FAT partition is available on a local hard drive, you can mount that partition like any Linux partition on that computer. Read and write access to such partitions is enabled by default in current Linux kernels.

Unfortunately, Linux does not work as well with the various Microsoft New Technology File Systems (NTFS). It's easy enough to mount an NTFS partition. Current Linux kernels allow you to read and copy files from such partitions. However, writing to an NTFS partition with current Linux distributions puts all the files on that partition at risk of corruption. But there is another option based on Jan Kratochvil's Captive NTFS system.

Tip

Users with dual-boot systems can also access their Linux files while running Microsoft Windows. There are several ways to read Linux ext2/ext3 formatted filesystems. For more information, see the SourceForge Ext2 package (http://sourceforge.net/projects/winext2fsd/) or Explore2fs (http://uranus.it.swin.edu.au/~jn/linux/explore2fs.htm).

Mounting Microsoft Partitions

The following is just a brief overview of how to configure access to Microsoft-formatted partitions on a local computer. If you need more information, refer to the SourceForge NTFS Project (http://linux-ntfs.sourceforge.net/).

Current distributions don't always include software to mount NTFS partitions, even in read-only mode. However, if you're running Red Hat/Fedora distributions, you may be able to get RPMs for this purpose from the NTFS Project.

If your first IDE hard drive partition is formatted as a VFAT file system, you can mount it locally with the following command (assuming the /mnt/vfat directory exists):

# mount -t vfat /dev/hda1 /mnt/vfat

If you want to configure permanent access to this partition, configure the mount in your /etc/fstab. The following entry allows the root user to mount, read, and write to the noted partition.

/dev/hda1  /mnt/vfat   vfat   defaults    0 0

But root-only access to Microsoft data can be annoying. To configure regular user access to the partition, you'll need to specify user and/or group IDs. For example, because all regular users on a SUSE computer are members of the users group, the following entry in /etc/fstab enables read access for all regular SUSE users:

/dev/hda1  /mnt/vfat   vfat   defaults,users    0 0

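By default, however, write and execute access may be forbidden. To permit these, you'll need to set an appropriate umask. The following /etc/fstab entry grants read and write permissions to all users. Because the users option implies noexec, add exec after it if users must also run programs from the partition: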

/dev/hda1  /mnt/vfat   vfat   users,gid=users,umask=000    0 0
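The umask option lists the permission bits to withhold, so the permissions that users actually see are 777 with those bits removed. The sketch below does that arithmetic for any umask you're considering; mode_for_umask is a hypothetical helper, and it assumes the umask is given in octal, as in /etc/fstab.

```shell
#!/bin/sh
# Print the octal permissions that result from a given umask:
# every bit set in the umask is cleared from 777.
mode_for_umask() {
    printf '%03o\n' $(( 0777 & ~0$1 ))
}
```

So mode_for_umask 000 reports 777 (everything allowed, as in the entry above), while mode_for_umask 022 reports 755, which leaves the partition writable only by its owner.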

For an NTFS system, you'll probably want to limit access to read-only. Otherwise, users may try to use experimental writing tools that could corrupt the partition. Thus, if /dev/hda2 is formatted to NTFS, you might include the following line in /etc/fstab:

/dev/hda2  /mnt/ntfs   ntfs   ro,users,gid=users,umask=000    0 0

If you're on a distribution without a group for all users, you can create one. Alternatively, you can substitute a specific user ID. This could be sufficient on a workstation dedicated to a single user.
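If you maintain several workstations, you may want a quick check that no NTFS entry has lost its ro option. This is a minimal sketch that assumes the standard whitespace-separated /etc/fstab layout used in the entries above; ntfs_writable_mounts is a hypothetical name.

```shell
#!/bin/sh
# Read fstab-format lines on standard input and print the mount point of
# every ntfs entry whose option list does not include "ro".
ntfs_writable_mounts() {
    awk '$1 !~ /^#/ && $3 == "ntfs" {
        n = split($4, opt, ",")
        ro = 0
        for (i = 1; i <= n; i++) if (opt[i] == "ro") ro = 1
        if (!ro) print $2
    }'
}
```

Running ntfs_writable_mounts < /etc/fstab prints nothing when every NTFS entry is read-only. Entries of type captive-ntfs are deliberately ignored, because those are meant to be writable.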

Configuring Captive NTFS

The Captive NTFS system searches through and configures connections to partitions formatted to that filesystem. As of this writing, it uses NTFS drivers available on a local partition. If you have an NTFS partition, you should already have a licensed version of Microsoft Windows with the needed drivers.

The drawback of Captive NTFS is speed. A simple transfer of a 15 MB file to a Captive NTFS mounted filesystem took about six minutes in one test I ran. A similar transfer to a Microsoft VFAT partition took a couple of seconds.

Tip

While I've had good success with Captive NTFS, the slow speed of data transfer makes it difficult to test extensively for corruption. The speed also makes Captive NTFS impractical for many applications; network transfers to NTFS partitions are much faster.

The Captive NTFS package is available as a tarball and an RPM from the associated home page at http://www.jankratochvil.net/project/captive/. Once Captive NTFS is installed, check your /etc/fstab configuration file. If there's a current NTFS partition on your hard disks, Captive NTFS should have detected it and added a mount entry to that file. For instance, it added the following to my Debian /etc/fstab:

/dev/hda1 /mnt/captive-noname captive-ntfs defaults,noauto 0 0

Next, you'll need to find and copy the appropriate NTFS system files from your Microsoft Windows installation. Captive NTFS includes its own search tool for this purpose, which you can start with the captive-install-acquire command. It's a slow process; it took all of the resources on my laptop with 768 MB of RAM for nearly an hour. It searches and then copies critical NTFS files to the /var/lib/captive directory, as shown in Figure 7-5.

Figure 7-5. Searching for NTFS system files

Tip

There's a way to shortcut this process. Mount the NTFS partition that contains Microsoft Windows. Copy the following critical files to the /var/lib/captive directory: cdfs.sys, fastfat.sys, ntfs.sys, and ntoskrnl.exe. If you've installed a Windows XP service pack, you can find the latest version of these files in the \WINNT\ServicePackFiles directory.

Now you can mount NTFS partitions in read/write mode. Using the configuration line added to my /etc/fstab, I can mount my NTFS partition with the following command:

mount /dev/hda1

Unmounting the NTFS partition is a critical part of the process. It may seem to take a long time on your computer. That's because it's syncing any changes that you've made to the NTFS partition with the data on the actual hard disk.

Because it may be annoying to have to remember to unmount a directory, you may wish to use the automounter for this purpose. We've addressed the basic configuration of this system earlier in this chapter.