The awk command is a mainstay of the command-line toolset on both UNIX and Linux. It was first developed at Bell Labs in the 1970s and is named after the surnames of its main authors: Alfred Aho, Peter Weinberger, and Brian Kernighan. The awk command provides access to the AWK programming language, which is designed to process data in text streams.
There are many implementations of AWK:
- gawk: Also known as GNU AWK, it is a free implementation of AWK that is used by many developers; we will use it in this book.
- mawk: Another implementation, written by Mike Brennan. It includes only a few of the gawk features and was designed for speed and performance.
- tawk: Also known as Thompson AWK, it is an implementation that works on Solaris, DOS, and Windows.
- BWK awk: Also known as nawk, it is used by OpenBSD and macOS.
Note that the awk interpreter we will use in this book is gawk, but it is also reachable through a symbolic link named awk. On such a system, awk and gawk are the same command.
You can verify this by listing the awk binary to see where it points:
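One way to run that check is sketched below. It assumes a Linux system with awk on the PATH and a readlink that supports -f (as in GNU coreutils); the variable name awk_path is ours. The output differs between distributions: on Debian-based systems, for example, the link typically passes through /etc/alternatives before reaching the real binary.

```shell
# Locate the awk executable that the shell finds first on the PATH
awk_path=$(command -v awk)

# Show the directory entry; a symbolic link is displayed as "name -> target"
ls -l "$awk_path"

# Follow the whole chain of symbolic links to the real binary
# (assumption: readlink supports -f, as GNU coreutils does)
readlink -f "$awk_path"
```

If the final path ends in gawk, then awk and gawk are the same program on that machine; if it ends in mawk or another name, a different implementation is installed as the default.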
To demonstrate the programming language that is provided with awk, we should create a Hello World program. We know this is compulsory for all languages:
$ awk 'BEGIN { print "Hello World!" }'
Not only does this code print the ubiquitous hello message, it also shows that we can generate header information with the BEGIN block. Later, we will see that we can create summary information with an END code block, with a main code block between the two that processes the input.
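As a preview of that three-part structure, the following is a minimal sketch rather than an example taken from later chapters. The sample input (alpha, beta, gamma), the header and footer text, and the variable name result are all illustrative assumptions.

```shell
# Feed three sample lines to awk; the data is purely illustrative
result=$(printf 'alpha\nbeta\ngamma\n' | awk '
BEGIN { print "Lines read:" }   # runs once, before any input is read
      { print NR ": " $0 }      # main block: runs for every input line
END   { print "Total: " NR }    # runs once, after the last line
')
echo "$result"
```

The BEGIN block prints the header, the main block prints each line prefixed with its record number (NR), and the END block prints the count of lines read as a summary.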
We can see the output of the basic Hello World command in the following screenshot: