Using pipes

In some cases, we may store the output of one command in a file, intending to use it as the input to another. Consider this script, which accepts a list of ASCII words on its standard input, converts any uppercase letters to lowercase with tr, sorts the words, and then prints a count of how often each word occurs, sorted by frequency:

#!/bin/bash

# Convert all capital letters in the input to lowercase
tr A-Z a-z > words.lowercase

# Sort all the lowercase words in order
sort words.lowercase > words.sorted

# Print counts of how many times each word occurs
uniq -c words.sorted > words.frequency

# Sort that list by frequency, descending
sort -k1,1nr words.frequency

This sort of script involving many commands in sequence to filter and aggregate data can be very useful for analyzing large amounts of raw text, such as log files. However, there's a problem; when we run this, it leaves some files lying around:

$ ls words.*
words.frequency  words.lowercase  words.sorted

Of course, we could clean them up with rm at the end of the script:

rm words.frequency words.lowercase words.sorted

It would be preferable not to involve these temporary files in the first place, and instead to feed the output of each program directly into the input of the next one. This is done with the pipe operator (|). We can reduce the whole script to just one pipeline:

tr A-Z a-z | sort | uniq -c | sort -k1,1nr

Visually, we could represent it like this:

[Figure: the pipeline, with the output of each command feeding the input of the next]

This pipeline accomplishes the same thing as the script does. Try running it, typing a few words separated by newlines, and then pressing Ctrl-D to signal end-of-file:

$ tr A-Z a-z | sort | uniq -c | sort -k1,1nr
bash
bash
script
user
script
Bash
SCRIPT

After we press Ctrl-D to finish the list, we get this output:

3 bash
3 script
1 user

You can think of the pipe operator as performing output redirection (>) for the program on its left and input redirection (<) for the program on its right at the same time, connecting the two like a physical pipe, without involving any intermediate storage such as a file.
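To make that comparison concrete, here is a minimal sketch that runs the same two steps twice: once through an intermediate file, and once through a pipe. The scratch directory from mktemp and the variable names via_file and via_pipe are only illustrative assumptions, not part of the original script.

```shell
#!/bin/bash

# Scratch directory and sample input, used only for this illustration
dir=$(mktemp -d)
printf '%s\n' user Bash SCRIPT > "$dir/input"

# Two steps joined by an intermediate file, using > and then <
tr A-Z a-z < "$dir/input" > "$dir/lowercase"
via_file=$(sort < "$dir/lowercase")

# The same two steps joined by a pipe, with no intermediate file
via_pipe=$(tr A-Z a-z < "$dir/input" | sort)

printf '%s\n' "$via_pipe"

# Remove the scratch directory again
rm -r -- "$dir"
```

Both variables should end up holding the same sorted, lowercased list; only the pipe version avoids creating the extra file.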

Note that only the standard output of each command is redirected into the standard input of the next; pipes do not affect where error messages written to standard error are sent.
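A small experiment can show this, sketched here under a few assumptions: emit is a hypothetical function invented for the demonstration, and the 2>&1 redirection in the second half is one common way to send errors down the pipe as well, shown only for contrast.

```shell
#!/bin/bash

# Hypothetical helper: writes one line to standard output
# and one line to standard error.
emit() {
    printf 'to stdout\n'
    printf 'to stderr\n' >&2
}

# Only the standard output line travels through the pipe to tr.
# The standard error line skips the pipe and appears unchanged
# wherever errors are currently being sent (usually the terminal).
piped=$(emit | tr a-z A-Z)
printf '%s\n' "$piped"

# For contrast: 2>&1 merges standard error into standard output
# before the pipe, so tr receives both lines.
merged=$(emit 2>&1 | tr a-z A-Z)
printf '%s\n' "$merged"
```

In the first pipeline, the lowercase error line shows that tr never saw it; in the second, both lines arrive uppercased.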

A convenient property of the pipe operator is that it can be used at the end of a line; we could put the preceding pipeline into a script spanning multiple lines, as follows:

#!/bin/bash
tr A-Z a-z |
sort |
uniq -c |
sort -k1,1nr

You may find this more readable, particularly if some of your commands are long.
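Splitting the pipeline across lines also leaves room for a comment above each stage, since the shell skips comment lines while it waits for the next command after a trailing pipe. The following is a sketch along those lines; wrapping the pipeline in a function, and the name word_frequency, are assumptions made here for illustration.

```shell
#!/bin/bash

# word_frequency is a hypothetical name for the pipeline.
# It reads words from standard input, one per line.
word_frequency() {
    # Convert all capital letters in the input to lowercase
    tr A-Z a-z |
    # Sort all the lowercase words in order
    sort |
    # Print counts of how many times each word occurs
    uniq -c |
    # Sort that list by frequency, descending
    sort -k1,1nr
}

# Feed in the same sample words as the interactive session
printf '%s\n' bash bash script user script Bash SCRIPT | word_frequency
```

This keeps the explanatory comments from the original multi-file script while still avoiding the temporary files.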