The character class

We saw how to match any character using the dot character. What if you want to match a specific set of characters only?

Place the characters you want to match between square brackets ([]). This construct is called a character class, and it matches any single character listed inside the brackets.

Let's take the following file as an example:

I love bash scripting.
I hope it works without a crash.
Or I'll smash it.

Let's see how the character class works:

$ awk '/[mbr]ash/{print $0}' myfile
$ sed -n '/[mbr]ash/p' myfile

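The character class [mbr] matches any one of the listed characters, and the pattern requires that character to be followed by ash. The sample file contains bash, crash (through rash), and smash (through mash), so all three lines match.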
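To try this end to end, here is a minimal sketch that recreates the sample file and runs both commands. The temporary path from mktemp and the variable names (myfile, matches, sed_matches) are assumptions made for illustration, not part of the original example.

```shell
# Recreate the sample file in a temporary location (path chosen by mktemp).
myfile=$(mktemp)
printf '%s\n' \
  'I love bash scripting.' \
  'I hope it works without a crash.' \
  "Or I'll smash it." > "$myfile"

# Print lines where one of m, b, or r is immediately followed by "ash".
matches=$(awk '/[mbr]ash/{print $0}' "$myfile")
echo "$matches"

# sed -n suppresses default output; the p command prints matching lines only.
sed_matches=$(sed -n '/[mbr]ash/p' "$myfile")

rm -f "$myfile"
```

Both commands should select the same lines, since awk and sed interpret this bracket expression the same way.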

A practical use is accepting either an uppercase or a lowercase letter at a given position:

$ echo "Welcome to shell scripting" | awk '/^[Ww]elcome/{print $0}'
$ echo "welcome to shell scripting" | awk '/^[Ww]elcome/{print $0}'
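Note that the class covers only the first letter. As a small sketch (the third, all-uppercase sample line and the variable name count are assumptions added for illustration), the following counts how many lines the pattern accepts:

```shell
# Feed three sample lines to awk and count those that begin with
# "Welcome" or "welcome". The class [Ww] applies to one position only.
count=$(printf '%s\n' \
  'Welcome to shell scripting' \
  'welcome to shell scripting' \
  'WELCOME to shell scripting' |
  awk '/^[Ww]elcome/{n++} END{print n+0}')
echo "$count"
```

The all-uppercase line is not counted, because the rest of the pattern (elcome) is still matched case-sensitively.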

Placing a caret (^) as the first character inside the brackets negates the class:

$ awk '/[^br]ash/{print $0}' myfile

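Here, we match any line that contains ash preceded by a character that is neither b nor r. In the sample file, only the third line matches, because the m in smash satisfies the negated class, while bash and crash do not.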
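A quick way to check the negated class is to run it against the sample file. This is a sketch under the same assumptions as before: the temporary file path and the variable names (negated, bare) are illustrative only.

```shell
# Recreate the sample file in a temporary location.
myfile=$(mktemp)
printf '%s\n' \
  'I love bash scripting.' \
  'I hope it works without a crash.' \
  "Or I'll smash it." > "$myfile"

# Keep lines where "ash" is preceded by a character other than b or r.
negated=$(awk '/[^br]ash/{print $0}' "$myfile")
echo "$negated"

# A negated class still has to consume one character, so a line that
# consists only of "ash" has nothing in front of it to match.
bare=$(echo "ash" | awk '/[^br]ash/{print $0}')

rm -f "$myfile"
```

The second command illustrates that [^br] means "one character that is not b or r", not "no character at all".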

Remember that using the caret character (^) outside the square brackets means the beginning of a line.
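The two meanings of the caret can appear in the same pattern. In this sketch (the two input lines are borrowed from the sample file, and the variable name starts is an assumption), the first caret anchors the match to the start of the line and the second negates the class:

```shell
# ^      outside the brackets: anchor at the beginning of the line
# [^I]   inside the brackets: any single character except I
starts=$(printf '%s\n' \
  'I love bash scripting.' \
  "Or I'll smash it." |
  awk '/^[^I]/{print $0}')
echo "$starts"
```

Only the line whose first character is not I is printed.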

With a character class, you list each character explicitly. What if you need to match a long range of characters?