Reading data line by line

We mentioned earlier in this chapter that for loops are often misused to read data line by line from files, or from the output of command pipelines. The correct way to do this is a very common idiom: a while loop that repeatedly runs the read command as its test, always with the -r option.

Consider a file named fcs, with the names of four famous computer scientists, one on each line:

Ken Thompson
Dennis Ritchie
John McCarthy
Larry Wall

We'd like a program to print out this list, but also to print a link to each person's article on the English-language Wikipedia after their name. There is no dedicated UNIX tool to do this, so we will write our own in Bash.

We might implement a rough version something like this:

#!/bin/bash
while read -r name ; do
    printf '%s\n' "$name"
    printf 'https://en.wikipedia.org/wiki/%s\n' "${name// /_}"
done < fcs

There's quite a lot to take in there, so let's break it down.

The while loop's test command is read -r name. read is a builtin command that accepts a line from standard input, and saves its contents into one or more variables. In this case, we will save the entirety of each line into one variable, called name. When read reaches the end of the input, it exits with a nonzero status, and that is what ends the loop.
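To illustrate the "one or more variables" part, here is a minimal sketch in which read is given two variable names. It feeds read from a here-string (<<<), a Bash feature that supplies a string as standard input, which we use here only to keep the example self-contained. The three-word name and the variable names are made up for illustration and are not part of the fcs file.

```bash
#!/bin/bash
# Two words, two variables: each variable receives one word
read -r first last <<< 'Ken Thompson'

# Three words, two variables: the last variable receives the rest of the line
# (hypothetical three-word name, not in the fcs file)
read -r given rest <<< 'Edsger Wybe Dijkstra'

printf 'first=%s last=%s\n' "$first" "$last"
printf 'given=%s rest=%s\n' "$given" "$rest"
```

When there are more words than variables, the final variable collects everything left over, which is why a single variable such as name receives the whole line.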

We use the -r option to stop read from treating backslashes in the input as escape characters, which could silently alter the data. We don't have any backslashes in this particular data, but it's still a good habit to get into.
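A small sketch can make the difference visible. The sample line below is hypothetical, chosen only because it contains backslashes. The here-string (<<<) is just a convenient way to hand read one line of input.

```bash
#!/bin/bash
# Hypothetical input line containing backslashes (not from the fcs file)
line='C:\temp\new'

# Without -r, read treats each backslash as an escape character and drops it
read plain <<< "$line"

# With -r, the backslashes are kept exactly as they appeared in the input
read -r raw <<< "$line"

printf 'without -r: %s\n' "$plain"
printf 'with -r:    %s\n' "$raw"
```

Run under Bash, this should print C:tempnew for the first line and C:\temp\new for the second: only the -r version preserves the data as written.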

The input source for the loop, the fcs file, is specified at the end of the loop, after done. Recall from Chapter 4, Input, Output, and Redirection, that a compound command can have redirections for input and output applied to it, just like a simple command. A while loop is a compound command. In this instance, we're specifying the standard input for the loop, and hence for each read -r command.
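As a rough sketch of redirections applied to a whole loop, the following example redirects both the input and the output of a while loop. It uses temporary files created with mktemp in place of fcs so that it can be run anywhere. The file variables and the Name: prefix are our own inventions for the demonstration.

```bash
#!/bin/bash
# Temporary input and output files stand in for real ones
infile=$(mktemp)
outfile=$(mktemp)
printf '%s\n' 'Ken Thompson' 'Dennis Ritchie' > "$infile"

# Both redirections belong to the loop as a whole: every read in it takes
# its input from $infile, and every printf in it writes to $outfile
while read -r name ; do
    printf 'Name: %s\n' "$name"
done < "$infile" > "$outfile"

# Show what the loop wrote, then clean up
result=$(cat "$outfile")
printf '%s\n' "$result"
rm -f -- "$infile" "$outfile"
```

Nothing inside the loop body mentions either file. The redirections after done are enough.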

Within the loop, there are two printf statements. The first prints the name just as we read it. The second prints a link to the Wikipedia page, after substituting an underscore for every space using parameter expansion.
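The doubled slash in "${name// /_}" matters. As a minimal sketch, the following compares it with the single-slash form, which replaces only the first match. The three-word name is a hypothetical example, since every name in fcs has just one space.

```bash
#!/bin/bash
# Hypothetical three-word name, so the two forms give different results
name='Edsger Wybe Dijkstra'

# ${name/ /_} replaces only the first space
first_only=${name/ /_}

# ${name// /_} replaces every space, as the script does
all=${name// /_}

printf '%s\n' "$first_only"
printf '%s\n' "$all"
```

This should print Edsger_Wybe Dijkstra followed by Edsger_Wybe_Dijkstra. Neither expansion changes the original value of name.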

If we save the script in a file named fcs-wiki.bash, and run it with bash fcs-wiki.bash in the same directory as the fcs file, we can see the output we wanted:

$ bash fcs-wiki.bash
Ken Thompson
https://en.wikipedia.org/wiki/Ken_Thompson
Dennis Ritchie
https://en.wikipedia.org/wiki/Dennis_Ritchie
John McCarthy
https://en.wikipedia.org/wiki/John_McCarthy
Larry Wall
https://en.wikipedia.org/wiki/Larry_Wall
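
To see why a for loop is the wrong tool for this job, consider the following sketch, which counts how many items each kind of loop sees. It writes two of the names to a temporary file, standing in for fcs, so that it is self-contained. The counter variable names are ours.

```bash
#!/bin/bash
# Temporary stand-in for the fcs file, holding two of its lines
file=$(mktemp)
printf '%s\n' 'Ken Thompson' 'Dennis Ritchie' > "$file"

# Misuse: the unquoted command substitution is split on all whitespace,
# so the loop iterates over words rather than lines
for_count=0
for name in $(cat "$file") ; do
    for_count=$((for_count + 1))
done

# Idiom: read -r takes one whole line per iteration
while_count=0
while read -r name ; do
    while_count=$((while_count + 1))
done < "$file"

printf 'for: %d items, while: %d items\n' "$for_count" "$while_count"
rm -f -- "$file"
```

The for loop sees four separate words, while the while loop sees the two lines we actually wanted.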