Chapter 5. Redirecting I/O

Many Unix programs read input (such as a file) and write output. In this chapter, we discuss Unix programs that handle their input and output in a standard way. This lets them work with each other.

This chapter generally doesn’t apply to full-screen programs, such as the Pico editor, that take control of your whole terminal window. (The pager programs, less, more, and pg, do work together in this way.) It also doesn’t apply to graphical programs, such as StarOffice or Netscape, that open their own windows on your screen.

Standard Input and Standard Output

What happens if you don’t give a filename argument on a command line? Most programs will take their input from your keyboard instead (after you press the first RETURN to start the program running, that is). Your terminal keyboard is the program’s standard input.

As a program runs, the results are usually displayed on your terminal screen. The terminal screen is the program’s standard output.

So, by default, each of these programs takes its input from the standard input and sends the results to the standard output.

These two default cases of input/output (I/O) can be varied. This is called I/O redirection.

If a program doesn’t normally read from files, but reads from its standard input, you can give a filename by using the < (less-than symbol) operator. For example, the mail program (see Section 6.5.2 in Chapter 6) normally reads the message to send from your keyboard. Here’s how to use the input redirection operator to mail the contents of the file to_do to bigboss@corp.xyz:

$ mail bigboss@corp.xyz < to_do
$
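To see what the < operator changes, it can help to compare it with a filename argument. The following is a minimal sketch using wc -l, a standard line-counting utility that isn't part of the example above. The scratch directory (made with mktemp) and the sample contents of to_do are assumptions for illustration only.

```shell
# Work in a scratch directory (an assumption of this sketch) so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample file, standing in for the to_do file used in the text.
printf 'Finish report by noon\nLunch with Xannie\nSwim at 5:30\n' > to_do

# Filename argument: wc opens the file itself, so it can report the name.
wc -l to_do

# Input redirection: the shell opens to_do and connects it to wc's standard
# input. wc never sees a filename, so it prints only the count.
wc -l < to_do
```

In both cases wc counts the same three lines; only the way the file reaches the program differs.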

If a program writes to its standard output, which is normally the screen, you can make it write to a file instead by using the greater-than symbol (>) operator. The pipe operator (|) sends the standard output of one program to the standard input of another program. Input/output redirection is one of the most powerful and flexible Unix features. We’ll take a closer look at it soon.

Instead of always letting a program’s output come to the screen, you can redirect output into a file. This is useful when you’d like to save program output or when you put files together to make a bigger file.

When you add “> filename” to the end of a command line, the program’s output is diverted from the standard output to the named file. The > symbol is called the output redirection operator.

For example, let’s use cat with this operator. The file contents that you’d normally see on the screen (from the standard output) are diverted into another file, which we’ll then read using cat (without any redirection!):

$ cat /etc/passwd > password
$ cat password
root:x&k8KP30f;(:0:0:Root:/:
daemon:*:1:1:Admin:/:
	.
	.
	.
john::128:50:John Doe:/usr/john:/bin/sh
$

An earlier example (in Section 5.1.1.1) showed how cat /etc/passwd displays the file /etc/passwd on the screen. The example here adds the > operator, so the output of cat goes to a file called password in the working directory. Displaying the file password shows that its contents are the same as the file /etc/passwd (the effect is the same as the copy command cp /etc/passwd password).
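The claim that cat with > has the same effect as cp can be checked with cmp, a standard utility that compares two files byte by byte. This is only a sketch: the scratch directory and the file named sample (a made-up stand-in for /etc/passwd) are assumptions, not part of the example above.

```shell
# Scratch directory and sample data are assumptions made for this sketch.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'root:x:0:0:Root:/:\ndaemon:*:1:1:Admin:/:\n' > sample

# Copy the file two ways: by redirecting cat's output, and with cp.
cat sample > copy_by_cat
cp sample copy_by_cp

# cmp prints nothing and returns success when the two files are identical.
cmp copy_by_cat copy_by_cp && echo "identical"
```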

You can use the > redirection operator with any program that sends text to its standard output—not just with cat. For example:

$ who > users
$ date > today
$ ls
password   today   users   ...

We’ve sent the output of who to a file called users and the output of date to the file named today. Listing the directory shows the two new files. Let’s look at the output from the who and date programs by reading these two files with cat:

$ cat users
tim     tty1    Aug 12  07:30
john    tty4    Aug 12  08:26
$ cat today
Tue Aug 12 08:36:09 EDT 2001
$

You can also use the cat program and the > operator to make a small text file. We told you earlier to type CTRL-D if you accidentally enter cat without a filename. This is because the cat program alone takes whatever you type on the keyboard as input. Thus, the command:

cat > filename

takes input from the keyboard and redirects it to a file. Try the following example:

$ cat > to_do
Finish report by noon
Lunch with Xannie
Swim at 5:30
^D
$

cat takes the text that you typed as input (in this example, the three lines that begin with Finish, Lunch, and Swim), and the > operator redirects it to a file called to_do. Type CTRL-D once, on a new line by itself, to signal the end of the text. You should get a shell prompt.

You can also create a bigger file from smaller files with the cat command and the > operator. The form:

cat file1 file2 > newfile

creates a file newfile, consisting of file1 followed by file2.

$ cat today to_do > diary
$ cat diary
Tue Aug 12 08:36:09 EDT 2001
Finish report by noon
Lunch with Xannie
Swim at 5:30
$
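As a rough sketch of the same idea, the following builds a combined file in one step with >, then builds an identical file in two steps with the append operator >> (which adds to the end of a file instead of replacing its contents). The scratch directory and the name diary2 are assumptions for illustration; the file contents are copied from the examples above.

```shell
# Scratch directory is an assumption of this sketch.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'Tue Aug 12 08:36:09 EDT 2001\n' > today
printf 'Finish report by noon\nLunch with Xannie\nSwim at 5:30\n' > to_do

# One step: cat writes today followed by to_do; > sends the result to diary.
cat today to_do > diary

# Two steps: > creates (or overwrites) diary2, then >> appends to it.
cat today > diary2
cat to_do >> diary2

# Both approaches produce the same four lines.
cmp diary diary2 && wc -l < diary
```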

We’ve seen how to redirect input from a file and output to a file. You can also connect two programs together so that the output from one program becomes the input of the next program. Two or more programs connected in this way form a pipe. To make a pipe, put a vertical bar (|) on the command line between two commands. When a pipe is set up between two commands, the standard output of the command to the left of the pipe symbol becomes the standard input of the command to the right of the pipe symbol. Any two commands can form a pipe as long as the first program writes to standard output and the second program reads from standard input.
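One way to think about a pipe is as a shortcut for saving one program's output in a file and then redirecting that file into the next program. The sketch below shows both routes giving the same result. It assumes a scratch directory and uses cat on a sample users file as a stand-in for who; the name tmp_out is made up for illustration.

```shell
# Scratch directory and sample data are assumptions made for this sketch.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'tim     tty1    Aug 12  07:30\njohn    tty4    Aug 12  08:26\n' > users

# Without a pipe: save the first program's output, then redirect it
# into the second program's standard input.
cat users > tmp_out
sort < tmp_out

# With a pipe: the shell connects cat's standard output directly to
# sort's standard input, so no intermediate file is needed.
cat users | sort
```

Both commands print the john line first, then the tim line.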

When a program takes its input from another program, performs some operation on that input, and writes the result to the standard output (which may be piped to yet another program), it is referred to as a filter. A common use of filters is to modify output. Just as a common filter culls unwanted items, Unix filters can restructure output.

Most Unix programs can be used to form pipes. Some programs that are commonly used as filters are described in the next sections. Note that these programs aren’t used only as filters or parts of pipes. They’re also useful on their own.

The grep program searches a file or files for lines that have a certain pattern. The syntax is:

grep "pattern" file(s)

The name “grep” derives from the ed (a Unix line editor) command g/re/p, which means “globally search for a regular expression and print all lines containing it.” A regular expression is made up of plain text (a word, for example), special characters used for pattern matching, or both. When you learn more about regular expressions, you can use them to specify complex patterns of text.

The simplest use of grep is to look for a pattern consisting of a single word. It can be used in a pipe so that only those lines of the input files containing a given string are sent to the standard output. But let’s start with an example reading from files: searching all files in the working directory for a word—say, Unix. We’ll use the wildcard * to quickly give grep all filenames in the directory.

$ grep "Unix" *
ch01:Unix is a flexible and powerful operating system
ch01:When the Unix designers started work, little did
ch05:What can we do with Unix?
$

When grep searches multiple files, it shows the filename where it finds each matching line of text. Alternatively, if you don’t give grep a filename to read, it reads its standard input; that’s the way all filter programs work:

$ ls -l | grep "Aug"
-rw-rw-rw-   1 john  doc     11008 Aug  6 14:10 ch02
-rw-rw-rw-   1 john  doc      8515 Aug  6 15:30 ch07
-rw-rw-r--   1 john  doc      2488 Aug 15 10:51 intro
-rw-rw-r--   1 carol doc      1605 Aug 23 07:35 macros
$

First, the example runs ls -l to list your directory. The standard output of ls -l is piped to grep, which only outputs lines that contain the string Aug (that is, files that were last modified in August). Because the standard output of grep isn’t redirected, those lines go to the terminal screen.
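The two ways of feeding grep can be put side by side. In this sketch, the scratch directory and the second line of ch01 are assumptions made for illustration; the other lines come from the earlier example.

```shell
# Scratch directory and sample files are assumptions made for this sketch.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'Unix is a flexible and powerful operating system\nAnother line\n' > ch01
printf 'What can we do with Unix?\n' > ch05

# Several filename arguments: each matching line is prefixed with "filename:".
grep "Unix" ch01 ch05

# No filename argument: grep filters its standard input, so there is
# no filename to show.
cat ch01 ch05 | grep "Unix"
```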

grep options let you modify the search. Table 5-1 lists some of the options.

Next, let’s use a regular expression that tells grep to find lines with carol, followed by zero or more other characters (abbreviated in a regular expression as ".*"),[15] then followed by Aug:

$ ls -l | grep "carol.*Aug"
-rw-rw-r--   1 carol doc      1605 Aug 23 07:35 macros
$

For more about regular expressions, see the references in Section 8.1 (Chapter 8).
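To make the difference between ".*" and a bare "*" concrete, here is a small sketch that runs both patterns over a saved listing. The scratch directory, the file name listing, and the third line (notes) are assumptions added for illustration.

```shell
# Scratch directory and sample listing are assumptions made for this sketch.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' \
  '-rw-rw-r--   1 john  doc      2488 Aug 15 10:51 intro' \
  '-rw-rw-r--   1 carol doc      1605 Aug 23 07:35 macros' \
  '-rw-rw-r--   1 carol doc       900 Jul  2 09:12 notes' > listing

# "carol.*Aug": the text carol, then any run of characters, then Aug.
grep "carol.*Aug" listing

# In a regular expression, * means "zero or more of the preceding character",
# so "carol*Aug" asks for caro, any number of l's, and then Aug right away.
# No line in the listing looks like that, so grep prints nothing.
grep "carol*Aug" listing || echo "no match"
```

Only the macros line matches the first pattern; the second prints "no match".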

The sort program arranges lines of text alphabetically or numerically. The following example sorts the lines in the food file (from Section 4.5 in Chapter 4) alphabetically. sort doesn’t modify the file itself; it reads the file and writes the sorted text to the standard output.

$ sort food
Afghani Cuisine
Bangkok Wok
Big Apple Deli
Isle of Java
Mandalay
Sushi and Sashimi
Sweet Tooth
Tio Pepe's Peppers

By default, sort arranges lines of text alphabetically. Many options control the sorting, and Table 5-2 lists some of them.
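The difference between alphabetical and numeric order shows up as soon as the lines are numbers. The sketch below uses the standard -n (numeric) option; the scratch directory and the files sizes and sizes.orig are assumptions made for illustration.

```shell
# Scratch directory and sample data are assumptions made for this sketch.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '1605\n11008\n2488\n8515\n' > sizes
cp sizes sizes.orig

# Default (alphabetical) order compares character by character,
# so 11008 sorts before 1605.
sort sizes

# -n compares the lines as numbers, so 11008 comes last.
sort -n sizes

# The input file is unchanged; sort only wrote to its standard output.
cmp sizes sizes.orig && echo "sizes is unchanged"
```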

More than two commands may be linked up into a pipe. Taking a previous pipe example using grep, we can further sort the files modified in August by order of size. The following pipe uses the commands ls, grep, and sort:

$ ls -l | grep "Aug" | sort +4n
-rw-rw-r--  1 carol doc      1605 Aug 23 07:35 macros
-rw-rw-r--  1 john  doc      2488 Aug 15 10:51 intro
-rw-rw-rw-  1 john  doc      8515 Aug  6 15:30 ch07
-rw-rw-rw-  1 john  doc     11008 Aug  6 14:10 ch02
$

This pipe sorts all files in your directory modified in August by order of size, and prints them to the terminal screen. The sort option +4n skips four fields (fields are separated by blanks), then sorts the lines in numeric order. (The +4n form is the traditional spelling. POSIX-conforming versions of sort write the same key as -k 5n, and some newer versions no longer accept the + form.) So, the output of ls, filtered by grep, is sorted by the file size (this is the fifth column, starting with 1605). Both grep and sort are used here as filters to modify the output of the ls -l command. If you wanted to email this listing to someone, you could add a final pipe to the mail program. Or you could print the listing by piping the sort output to your printer command (either lp or lpr).
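If your version of sort rejects +4n, the same three-stage filter can be tried with the POSIX key option -k. The sketch below is run against a saved copy of the listing so that the result doesn't depend on your own directory; the scratch directory and the file name listing are assumptions made for illustration.

```shell
# Scratch directory and saved listing are assumptions made for this sketch.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' \
  '-rw-rw-rw-   1 john  doc     11008 Aug  6 14:10 ch02' \
  '-rw-rw-rw-   1 john  doc      8515 Aug  6 15:30 ch07' \
  '-rw-rw-r--   1 john  doc      2488 Aug 15 10:51 intro' \
  '-rw-rw-r--   1 carol doc      1605 Aug 23 07:35 macros' > listing

# -k 5n: the sort key starts at field 5 (the file size), and n compares
# it numerically. This corresponds to the older +4n form.
grep "Aug" listing | sort -k 5n
```

The lines come out smallest file first: macros, intro, ch07, ch02.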

The less program, which you saw in Section 3.2 in Chapter 3, can also be used as a filter. A long output normally zips by you on the screen, but if you run text through less, the display stops after each screenful of text.

Let’s assume that you have a long directory listing. (If you want to try this example and need a directory with lots of files, use cd first to change to a system directory such as /bin or /usr/bin.) To make it easier to read the sorted listing, pipe the output through less:

$ ls -l | grep "Aug" | sort +4n | less
-rw-rw-r--  1 carol doc      1605 Aug 23 07:35 macros
-rw-rw-r--  1 john  doc      2488 Aug 15 10:51 intro
-rw-rw-rw-  1 john  doc      8515 Aug  6 15:30 ch07
-rw-rw-r--  1 john  doc     14827 Aug  9 12:40 ch03
	.
	.
	.
-rw-rw-rw-  1 john  doc     16867 Aug  6 15:56 ch05
:

less reads a screenful of text from the pipe (consisting of lines sorted by order of file size), then prints a colon (:) prompt. At the prompt, you can type a less command to move through the sorted text. less reads more text from the pipe and shows it to you, as well as saves a copy of what it has read, so you can go backwards to reread previous text if you want to. (The simpler pager programs more and pg generally can’t back up while reading from a pipe.) When you’re done seeing the sorted text, the q command quits less.

In the following exercises you redirect output, create a simple pipe, and use filters to modify output.

Redirect output to a file.

Enter who > users

Email that file to yourself. (Replace username with your own username.)

Enter mail username < users

Sort output of a program.

Enter who | sort

Append sorted output to a file.

Enter who | sort >> users

Display output to screen.

Enter less users (or more users or pg users)

Display long output to screen.

Enter ls -l /bin | less (or more or pg)

Format and print a file with pr.

Enter pr users | lp or pr users | lpr



[14] This example could be shortened by combining the two cat commands into one, giving both filenames as arguments to a single cat command. That wouldn’t work, though, if you were making a real diary with a command other than cat users.

[15] Note that the regular expression for “zero or more characters,” ".*", is different from the corresponding filename wildcard "*". See Section 4.2 in Chapter 4. We can’t cover regular expressions in enough depth here to explain the difference, though more-detailed books do. As a rule of thumb, remember that the first argument to grep is a regular expression; other arguments, if any, are filenames that can use wildcards.