Cutting Columns or Fields

A nifty command called cut lets you select a list of columns or fields from one or more files.

You must specify either the -c option to cut by column (character position) or -f to cut by fields. Fields are separated by tabs unless you specify a different field separator with -d. Use quotes (Section 27.12) if you want a space or other special character as the delimiter.
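As a minimal sketch of the field separator rules, the commands below run cut on three made-up one-line records. The variable names and the sample text are only for illustration.

```shell
# Default separator: fields are split on tabs.
tabbed=$(printf 'alpha\tbeta\tgamma\n' | cut -f2)

# A different separator, given with -d.
colon=$(printf 'alpha:beta:gamma\n' | cut -d: -f2)

# A space as the separator has to be quoted so the shell passes it through.
spaced=$(printf 'alpha beta gamma\n' | cut -d' ' -f3)

printf '%s\n' "$tabbed" "$colon" "$spaced"
# beta
# beta
# gamma
```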

In some versions of cut, the column(s) or field(s) to cut must follow the option immediately, without any space. Use a comma between separate values and a hyphen to specify a range (e.g., 1-10,15 or 20,23 or 50-).
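To see how a list of values and ranges is read, here is a small sketch that cuts columns from a made-up line holding the 26 lowercase letters. The list is attached directly to -c, with no space, so it should work even with the stricter versions of cut.

```shell
# Hypothetical sample line: the alphabet, so column N holds the Nth letter.
line=abcdefghijklmnopqrstuvwxyz

# Columns 1 through 3, column 5, and column 24 through the end of the line.
picked=$(printf '%s\n' "$line" | cut -c1-3,5,24-)

printf '%s\n' "$picked"
# abcexyz
```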

The order in which you list the columns or fields is ignored; the characters in each line are always output from first to last, in the order they're read from the input. For example, cut -f1,2,4 produces exactly the same output as cut -f4,2,1. If this isn't what you want, try perl (Section 41.1) or awk (Section 20.10), which let you output fields in any order.
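The following sketch checks this on a made-up tab-separated row: both field lists give the same result from cut, while awk prints the fields in the order requested. Note that awk joins the fields with spaces here, not tabs.

```shell
# Hypothetical sample row with four tab-separated fields.
row=$(printf 'one\ttwo\tthree\tfour\n')

# cut ignores the order of the list, so these two results are identical.
forward=$(printf '%s\n' "$row" | cut -f1,2,4)
backward=$(printf '%s\n' "$row" | cut -f4,2,1)

# awk outputs the fields in whatever order you name them.
reordered=$(printf '%s\n' "$row" | awk -F'\t' '{print $4, $2, $1}')

printf '%s\n' "$reordered"
# four two one
```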

cut is incredibly handy. Here are some examples:
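The sketch below shows two typical uses on made-up data in /etc/passwd format. The two sample accounts are invented for illustration; on a real system you would name the file as an argument to cut.

```shell
# Made-up sample in /etc/passwd format: seven colon-separated fields per line.
sample='root:x:0:0:Super User:/root:/bin/sh
jdoe:x:1001:100:Jane Doe:/home/jdoe:/bin/ksh'

# Login name (field 1) and full name (field 5), split on colons.
names=$(printf '%s\n' "$sample" | cut -d: -f1,5)
printf '%s\n' "$names"
# root:Super User
# jdoe:Jane Doe

# The first four characters of every line.
prefixes=$(printf '%s\n' "$sample" | cut -c1-4)
printf '%s\n' "$prefixes"
# root
# jdoe
```

Notice that cut joins the selected fields with the same delimiter it used to split them.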

Section 21.18 covers the cut counterpart, paste.

As mentioned, you can use awk or perl to extract columns of text. To extract the fifth and first fields of /etc/passwd, in that order, you can use awk:

% awk -F: '{print $5, "=>", $1}' /etc/passwd

An often forgotten command-line option for perl is -a, which turns on autosplit mode, perl's counterpart to awk's automatic field splitting. In other words, you can get the same field-splitting behavior right from the command line:

% perl -F: -lane 'print "$F[4] => $F[0]"' /etc/passwd

In the line above, perl is told about the field separator in the same way awk is, with the -F flag. The remaining four options (-l, -a, -n, and -e) are fairly common. The -l option strips the trailing newline from each input line and appends a newline to every print statement. This is a real space saver for "one-line wonders," like the one above. The -a flag tells perl to split each line on the indicated field separator. If no field separator is indicated, the line is split on whitespace. Each field is stored in the global array @F. Remember that the first index in a Perl array is zero. The -n option wraps the Perl code given with -e in a loop that reads one line at a time from the named files or, if none are given, from standard input. This little Perl snippet is useful if you need to do some additional processing with the contents of each field.

—TOR, DG, and JJ