Chapter 5

Boosting up a Step

IN THIS CHAPTER

  • Using RegEx to parse strings
  • Using Tokenizer to break strings into tokens
  • Converting numbers to other data types
  • Using Foreach to create improved loops
  • Using Filesystem to access the operating system

The Boost library is vast. It’s doubtful that a typical developer will ever use everything that Boost has to offer. Of course, before you can pick and choose what you want to use, you need to know it exists. Browsing through the help file can reveal classes that you need to add to your toolkit to produce good applications. This chapter helps by taking you on a whirlwind tour of the major Boost categories. Don’t expect this chapter to discuss everything — Boost is simply too large for that. If you want to see a list of what Boost has to offer, check out

Tip In addition to reviewing the examples in this chapter and looking through the Help file, it also pays to browse the Boost directory for examples. For example, if you look at the \CodeBlocks\boost_1_73_0\libs\regex\example directory, you find three examples of how to use RegEx, one of which is demonstrated in the “Testing the installation” section of Book 7, Chapter 4. Every example directory contains a Jamfile.v2 that you can use to build the examples using Boost.Build. If you still haven’t found the example you need, check online for more examples — Boost is extremely popular. Even Microsoft has gotten into the act by providing examples at https://devblogs.microsoft.com/cppblog/using-c-coroutines-with-boost-c-libraries/, https://marketplace.visualstudio.com/items?itemName=AdamWulkiewicz.GraphicalDebugging, and https://docs.microsoft.com/en-us/visualstudio/test/how-to-use-boost-test-for-cpp?view=vs-2019.

Before you begin working through the examples in this chapter, make sure you know how to configure your development environment to use Boost. The “Testing the installation” and “Building Your First Boost Application Using Date Time” sections of Book 7, Chapter 4 tell how to configure Code::Blocks to use Boost. The “Building Your First Boost Application Using Date Time” section also provides you with a simple example that gets you started working with Boost.

Remember You don’t have to type the source code for this chapter manually. In fact, using the downloadable source is a lot easier. You can find the source for this chapter in the \CPP_AIO4\BookVII\Chapter05 folder of the downloadable source. See the Introduction for details on how to find these source files.

Parsing Strings Using RegEx

Regular expressions are an important part of today’s computing environment. You use them to perform pattern matching, where the application finds a series of matching characters in a string. For example, if you want the user to enter values from 0 through 9 and nothing else, you can create a pattern that prevents the user from entering anything else. Using patterns in the form of regular expressions serves a number of important purposes:

  • Ensures that your application receives precisely the right kind of input
  • Enforces a particular data input format (such as the way you input a telephone number)
  • Reduces security risks (for example, a user can’t input a script in place of the data you wanted)

Warning Some developers make the mistake of thinking that a regular expression can prevent every sort of data input error. However, regular expressions are only one tool in an arsenal you must build against errant input. For example, a regular expression is a poor tool for range checking. If you want values between 101 and 250, a simple regular expression can ensure that the user enters three digits; however, you must use range checking to prevent the user from entering a value such as 100 or 999.
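As a minimal sketch of combining the two checks, the following function first confirms that the input is exactly three digits and then applies the 101-through-250 range check. The function name IsValidEntry is hypothetical. The sketch also assumes std::regex, the Standard Library class that was modeled on Boost RegEx and offers the same regex_match() call, so that it compiles without linking the Boost library; substituting boost::regex should work the same way.

```cpp
#include <cstdlib>
#include <regex>
#include <string>

// Hypothetical helper: accepts only three-digit input in the range 101-250.
// Assumption: std::regex stands in for boost::regex here.
bool IsValidEntry(const std::string& input) {
    // Step 1: the pattern check guarantees exactly three digits.
    static const std::regex threeDigits("[0-9][0-9][0-9]");
    if (!std::regex_match(input, threeDigits))
        return false;

    // Step 2: the range check catches values the pattern can't exclude.
    // The conversion is safe because the input is known to be three digits.
    const int value = std::atoi(input.c_str());
    return value >= 101 && value <= 250;
}
```

Calling IsValidEntry("100") returns false even though the input passes the pattern check, which is exactly the gap that the range check closes.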

Defining the pattern for a regular expression can prove time consuming. However, after you create the pattern, you can use it every time you must check for a particular input pattern. The following sections describe how to work with the RegEx (regular expressions) library.

Adding the RegEx library

Most of the Boost library works just fine by adding headers to your application code. However, a few components, such as RegEx, require a library. Before you can use a library, you must build it. The instructions for performing this task appear in the “Building the libraries” section of Book 7, Chapter 4. After you build the library, you must add it to your application.

Two techniques exist for adding the required headers and libraries to an application. The first technique is to add them to the compiler settings, as you do for the “Testing the installation” section of Book 7, Chapter 4. The second technique is to add the settings to a specific project. You use the first technique when you work with Boost for a large number of projects and require access to all libraries. The second technique is best when you use Boost only for specific projects and require access only to specific libraries. The following steps show you how to perform the project-specific setup for any library, not just the RegEx library:

  1. Use the Project wizard to create a new project.

    Nothing has changed from the beginning of this book; every application begins with a new project. The next section discusses the RegEx example, and you can use that project name as a starting point here.

  2. Choose Project ⇒ Build Options.

    Code::Blocks displays the Project Build Options dialog box.

  3. Select the project name, such as RegEx, in the left pane.
  4. Select the Linker Settings tab.

    You see a number of linker settings, including a Link Libraries list, which will be blank.

  5. Click Add.

    Code::Blocks displays the Add Library dialog box, shown in Figure 5-1.

  6. Click the Browse button — the button sporting an opening file folder.

    You see the Choose Library to Link dialog box.

    FIGURE 5-1: Select the library you want to add.

  7. Using the dialog box, navigate to the library of your choice, such as libboost_regex-mgw6-mt-x64-1_73.a (the release version of the library), select the library, and then click OK.

    The Boost library files are typically located in the \CodeBlocks\boost_1_73_0\bin.v2\libs\ directory. When you click OK, you see a dialog box that asks whether you want to keep this as a relative path.

    Remember Relative paths specify a location using the current location as a starting point. The alternative is an absolute path, which specifies a location based on the root directory of your hard drive. In most cases, absolute paths are less likely to get broken.

  8. Click No.

    You see the absolute path for the selected library, such as libboost_regex-mgw6-mt-x64-1_73.a, added to the File field of the Add Library dialog box.

  9. Click OK.

    After you click OK, you see the absolute path for the library added to the Linker Settings, as shown in Figure 5-2.

    FIGURE 5-2: Add the library to the application.

  10. Click the Search Directories tab.

    You see three subtabs: Compiler, Linker, and Resource Compiler.

  11. Click Add in the Compiler subtab.

    You see an Add Directory dialog box like the one shown in Figure 5-3.

    FIGURE 5-3: Add appropriate search directories for Boost header and library files.

  12. Type the location of the Boost header files in the Directory field.

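    As an alternative, you can click the Browse button to use a Browse for Folder dialog box to find them. The header files are normally located in the \CodeBlocks\boost_1_73_0\boost folder. Because the listings in this chapter include headers using paths such as boost/regex.hpp, the directory the compiler searches must be the parent folder, \CodeBlocks\boost_1_73_0, so that it can resolve the boost/ prefix.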

  13. Click OK.

    You see the search folder added to the Compiler tab, as shown in Figure 5-4.

    FIGURE 5-4: The search location for any compiler, linker, or resource compiler.

  14. Click Add in the Linker subtab.

    You see yet another Add Directory dialog box (refer to Figure 5-3).

  15. Type the location of the Boost library files in the Directory field and then click OK.

    The Boost library files are typically located in the \CodeBlocks\boost_1_73_0\bin.v2\libs directory. After you click OK, you see the directory added to the Linker tab.

  16. Click OK.

    The selected library is ready for inclusion in your application.

Creating the RegEx code

Using a regular expression is relatively straightforward. All you do is create the expression and then use it with a function to perform specific kinds of pattern matches. The function you choose is important because each function performs the pattern matching differently. The RegEx example code, shown in Listing 5-1, demonstrates how to create a regular expression and then use it in two different ways to determine whether user input is correct.

LISTING 5-1: Performing Matches and Searches Using RegEx

#include <iostream>
#include "boost/regex.hpp"

using namespace std;
using namespace boost;

int main() {
    char MyNumber[80];
    cout << "Type a three-digit number: ";
    // Limit the read to the buffer size to avoid an overflow.
    cin.width(sizeof(MyNumber));
    cin >> MyNumber;

    // The pattern describes three consecutive digits.
    regex Expression("[0-9][0-9][0-9]");
    cmatch Matches;

    // Perform a matching check.
    if (regex_match(MyNumber, Matches, Expression)) {
        cout << "You typed: " << Matches << endl;
    } else {
        cout << "Not a three-digit number!" << endl;
    }

    // Perform a search check.
    if (regex_search(MyNumber, Matches, Expression)) {
        cout << "Found: " << Matches << endl;
    } else {
        cout << "No three-digit number found!" << endl;
    }
    return 0;
}

In this case, the code begins by adding the proper header, regex.hpp, and the proper namespace, boost. In many cases, you can get by without doing much more than performing these two steps in your code. It then performs three steps:

  1. Get some user input. Even though the prompt tells the user to enter a three-digit number, C++ doesn’t enforce this requirement.
  2. Create the regular expression. This example needs a set of three ranges for numbers: [0-9][0-9][0-9]. Using ranges works well for a number of tasks, and you use them often when creating a regular expression.
  3. Perform the pattern match. The example uses regex_match(), which succeeds only when the entire input matches the pattern, and regex_search(), which looks for the pattern anywhere in the input. Both functions require three input values: the value you want to check, an output variable of type cmatch that receives the details of the match, and the regular expression.

To see how this code works, you must perform a series of three tests. First, run the application and type 0 as the input. Because 0 isn’t a three-digit number, both checks fail and you see this output:

Not a three-digit number!
No three-digit number found!

Run the application again and type 123 as the input to see

You typed: 123
Found: 123

So far, there isn’t much difference between the two functions, which is why you need the third test. Run the application and type ABC123XYZ as the input to see:

Not a three-digit number!
Found: 123

This final test shows that the regex_search() function finds the three-digit value in the string. Obviously, the regex_search() function is great when you need to locate information but not good when you need to validate it. When you need a precise pattern match, use regex_match() instead.
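When you search rather than match, you often want to know where the pattern was found as well as what was found. The following sketch shows one way to pull both details out of the match results. The DigitHit structure and FindThreeDigits() function are hypothetical names, the pattern uses the {3} repeat count as a shorter way to write [0-9][0-9][0-9], and the code assumes std::regex (the Standard Library counterpart of Boost RegEx) so that it builds without the Boost library.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical result type: what was found and where.
struct DigitHit {
    bool found;          // true when three consecutive digits exist
    std::size_t offset;  // zero-based position of the first digit
    std::string text;    // the matched digits
};

// Assumption: std::regex stands in for boost::regex here.
DigitHit FindThreeDigits(const std::string& input) {
    static const std::regex expression("[0-9]{3}");
    std::smatch matches;  // the std::string equivalent of cmatch
    DigitHit hit = {false, 0, ""};

    // regex_search() looks for the pattern anywhere in the input.
    if (std::regex_search(input, matches, expression)) {
        hit.found = true;
        hit.offset = static_cast<std::size_t>(matches.position(0));
        hit.text = matches.str(0);
    }
    return hit;
}
```

For the input ABC123XYZ, the sketch reports the text 123 at offset 3. An input such as 0 leaves found set to false.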

Breaking Strings into Tokens Using Tokenizer

Humans view strings as a sentence or at least a phrase. Mixtures of words create meaning that we can see in a moment.

Remember Computers, on the other hand, understand nothing. A computer can perform pattern matching and do math, but it can’t understand Kipling (read more about this fascinating author at https://www.poetryfoundation.org/poets/rudyard-kipling). It’s because of this lack of understanding that you must tokenize text for the computer. A computer can perform comparisons on individual tokens, usually single words or symbols, and create output based on those comparisons.

The compiler you use relies on a tokenizer, an application component that breaks text into tokens, to turn the text you type into machine code the computer can execute. However, the tokenizer appears in all sorts of applications. For example, when you perform a spelling check on a document, the word processing application breaks the text into individual words using a tokenizer, and then compares those words to words in its internal dictionary.

The Tokens example, shown in Listing 5-2, shows a method for creating tokens from strings. This basic technique works with any phrase, string, or series of strings. You’ll normally process the tokens after you finish creating them.

LISTING 5-2: Creating Tokens from Strings

#include <iostream>
#include <string>
#include "boost/tokenizer.hpp"

using namespace std;
using namespace boost;

int main() {
    string MyString = "This is a test string!";
    tokenizer<> Tokens(MyString);

    // Display each token on screen.
    tokenizer<>::iterator Iterate;
    for (Iterate = Tokens.begin(); Iterate != Tokens.end();
         ++Iterate)
        cout << *Iterate << endl;
    return 0;
}

The tokenizer template places the tokenized form of MyString in Tokens. The application now has a set of tokens with which to work. To see the tokens, you must iterate through them by creating a tokenizer<>::iterator, Iterate. The application uses Iterate to output the individual tokens. By default, the tokenizer treats both white space and punctuation as delimiters and discards them, which is why the exclamation mark doesn’t appear in the output. When you run this application, you see the following output:

This
is
a
test
string

Tip This example shows a basic routine that you can use for just about any need. However, you might need some of the extended capabilities of the tokenizer class. Check out the materials at https://www.boost.org/doc/libs/1_73_0/libs/tokenizer/doc/index.html for more information about both the tokenizer and the tokenizer<>::iterator.

Performing Numeric Conversion

Numeric conversion isn’t hard to perform — it’s accurate numeric conversion that’s hard to perform. Getting the right result as you move from one type of number to another is essential. Sure, you probably won’t notice too much if your game score is off by a point or two, but you’ll definitely notice the missing dollars from your savings account. Worse yet, when taking a trip into space, a rounding error can definitely ruin your day as you head off toward the sun rather than Planet Earth.

The Boost library includes the converter template, which makes converting from one kind of number to another relatively easy. The converter template includes all kinds of flexibility. The Convert example, shown in Listing 5-3, presents two different levels of converter template usage.

LISTING 5-3: Converting from double to int

#include <iostream>
#include "boost/numeric/conversion/converter.hpp"

using namespace std;
using namespace boost;
using namespace boost::numeric;

int main() {
    typedef converter<int, double> Double2Int;
    double MyDouble = 2.1;
    int MyInt = Double2Int::convert(MyDouble);

    cout << "The double value is: " << MyDouble << endl;
    cout << "The int value is: " << MyInt << endl;

    // See what happens with a larger value.
    MyDouble = 3.8;
    MyInt = Double2Int::convert(MyDouble);
    cout << "The double value is: " << MyDouble << endl;
    cout << "The int value is: " << MyInt << endl;

    // Round instead of truncate.
    typedef conversion_traits<int, double> Traits;
    typedef converter<int, double, Traits,
        def_overflow_handler, RoundEven<double> >
        Double2Rounded;
    MyInt = Double2Rounded::convert(MyDouble);
    cout << "The int value is: " << MyInt << endl;
    return 0;
}

The example begins by creating a converter type, Double2Int, using typedef. This first type shows the minimum information that you can provide: the target (int) and source (double) types. The default setting truncates floating-point values (float and double among them) to obtain an int value. To perform a conversion, the code relies on the static convert method, which requires a value of the source type as an argument.

Remember The converter template includes support for four kinds of rounding. You must use the correct kind of rounding to match your application requirements. Imagine what would happen to calculations if you used truncation when rounding is really the required operation. The following list describes all four kinds of rounding that converter supports:

  • Trunc: Removes the decimal portion of the value (rounds toward 0)
  • RoundEven: Rounds to the nearest integer; when the value lies exactly halfway between two integers, rounds to the even one (also called banker’s rounding). Consequently, 1.5 rounds up to 2, while 2.5 rounds down to 2.
  • Ceil: Rounds the value up toward positive infinity when the decimal portion is greater than 0
  • Floor: Rounds the value down toward negative infinity when the decimal portion is greater than 0

The second converter type, Double2Rounded, shows the template arguments required to choose the kind of rounding that the converter performs. In this case, you supply five arguments to the template (the converter template accepts up to seven arguments; see https://www.boost.org/doc/libs/1_73_0/libs/numeric/conversion/doc/html/boost_numericconversion/converter___function_object.html):

  • Target
  • Source
  • conversion_traits, which include the target and source types as a minimum
  • Overflow handler, which determines how the object handles conversions that result in an overflow (the default is def_overflow_handler)
  • Rounding template object (which includes the rounding source type)

The process for using the extended form of the converter template is the same as the simple form shown earlier in the example. However, you must now create a conversion_traits type (Traits in this case) and provide the required input information. (See more examples of using conversion_traits at https://www.boost.org/doc/libs/1_73_0/libs/numeric/conversion/doc/html/boost_numericconversion/conversion_traits___traits_class.html.) As before, you rely on the convert method to perform the conversion process. Here’s the application output:

The double value is: 2.1
The int value is: 2
The double value is: 3.8
The int value is: 3
The int value is: 4

The last two lines show the difference in rounding the value 3.8 using Trunc and RoundEven. See https://www.boost.org/doc/libs/1_73_0/libs/numeric/conversion/doc/html/index.html for more about numeric conversion.

Creating Improved Loops Using Foreach

Writing efficient loops is a requirement if you want your application to perform optimally. Interestingly enough, many loops use a certain amount of boilerplate code (code that is essentially the same every time you write it, but with small nuances).

Remember Templates and other methodologies described in this book provide a means to overcome the boredom of writing essentially the same code. However, none of the examples to date has shown a tried-and-true method: macros. A macro is essentially a substitution technique that replaces a keyword with the boilerplate code you’d normally write. Macros normally appear in uppercase, such as BOOST_FOREACH, which is the macro used in this section of the chapter. Instead of typing all the code associated with a macro, you simply type the macro name and the compiler does the rest of the work for you.

Technical stuff The magic behind the BOOST_FOREACH macro is that it creates all the iteration code you normally create by hand. In other words, you aren’t providing any less code to the compiler; you simply let the macro write it for you. The macro expands into an ordinary iterator-based loop, much like the one you’d otherwise write yourself or hand off to the Standard Library for_each algorithm, so you avoid typing that code. See https://www.boost.org/doc/libs/1_73_0/doc/html/foreach.html for more about the BOOST_FOREACH macro. The ForEach example, in Listing 5-4, shows how to use a BOOST_FOREACH loop to iterate through a vector.

LISTING 5-4: Creating a BOOST_FOREACH Loop

#include <iostream>
#include <string>
#include <vector>
#include "boost/foreach.hpp"

using namespace std;
using namespace boost;

int main() {
    vector<string> names;
    names.push_back("Tom");
    names.push_back("Dick");
    names.push_back("Harry");
    names.push_back("April");
    names.push_back("May");
    names.push_back("June");

    // Visit each name from first to last.
    BOOST_FOREACH(string Name, names)
        cout << Name << endl;

    // Visit each name from last to first.
    cout << endl << "Backward:" << endl;
    BOOST_REVERSE_FOREACH(string Name, names)
        cout << Name << endl;
    return 0;
}

This example begins by creating a vector. In fact, it’s the same vector as the one used for the Vectors example in Book 5, Chapter 6, Listing 6-1. In this case, the example then creates a BOOST_FOREACH loop that iterates through names. Each iteration places a single value from names into Name. The code then prints the single name.

An interesting feature of the Boost library is that you can reverse the order of iteration. In this case, the code uses a BOOST_REVERSE_FOREACH loop to go in the opposite direction — from end to beginning. The technique is precisely the same as going forward. Here’s the application output:

Tom
Dick
Harry
April
May
June

Backward:
June
May
April
Harry
Dick
Tom

As you can see, iterating forward and backward works precisely as you expect. The BOOST_FOREACH and BOOST_REVERSE_FOREACH macros support a number of container types:

  • Any Standard Template Library (STL) container
  • Arrays
  • Null-terminated strings (char and wchar_t)
  • STL iterator pair (essentially a range)
  • boost::iterator_range<> and boost::sub_range<>

Tip The macros’ support for STL containers is generalized. Any object type that meets these two requirements will work:

  • Nested iterator and const_iterator types
  • begin() and end() methods

Accessing the Operating System Using Filesystem

Working with files and directories is an important part of any application you create. Book 6 shows some standard techniques you use to work with both files and directories. However, these methods can become cumbersome and somewhat limited. Boost augments your ability to work with the file system using the Filesystem library. Creating and deleting both files and directories becomes a single call process. You can also perform tasks such as moving and renaming both files and directories.

The most important addition that Boost makes is defining a method to obtain error information from the operating system. This feature is found in the System library, which you must include as part of your application. Among other capabilities, the System library enables you to convert a numeric error that the operating system returns into a human-readable form. This chapter uses the System library only indirectly, through Filesystem, and doesn’t demonstrate it in any great detail.

For the example to work, you must add references to the libboost_filesystem-mgw6-mt-x32-1_73.a and libboost_system-mgw6-mt-x32-1_73.a files using the technique found in the “Adding the RegEx library” section, earlier in this chapter. You may need to change the library settings in the project file to match your system. When you set up this application properly, you should see two libraries on the Linker Settings tab of the Project Build Options dialog box, as shown in Figure 5-5.

FIGURE 5-5: Using the Filesystem library requires the System library as well.

Remember The OS example in Listing 5-5 shows only a modicum of the capabilities of the Filesystem library. The big thing to remember when using this example is that it requires both Filesystem and System libraries because the System library provides error-handling support. The example begins by creating a directory and a file. It then adds data to the file, reads the file back in and displays it, and then deletes both file and directory.

LISTING 5-5: Interacting with the File System Using Boost

#include <iostream>
#include <string>
#include "boost/filesystem.hpp"
#include "boost/filesystem/fstream.hpp"

using namespace boost::filesystem;
using namespace std;

int main() {
    // Create the directory only when it doesn't already exist.
    if (! exists("Test")) {
        create_directory(path("Test"));
        cout << "Created Directory Test" << endl;
    } else
        cout << "Directory Test Exists" << endl;

    // Create the file and write some text to it.
    if (! exists("Test/Data.txt")) {
        boost::filesystem::ofstream File("Test/Data.txt");
        File << "This is a test!";
        File.close();
        cout << "Created File Data.txt" << endl;
    } else
        cout << "File Data.txt Exists" << endl;

    // Report the file size and display the content.
    if (exists("Test/Data.txt")) {
        cout << "Data.txt contains "
             << file_size("Test/Data.txt")
             << " bytes." << endl;
        boost::filesystem::ifstream File("Test/Data.txt");
        string Data;
        // Read one word at a time until a read fails.
        while (File >> Data)
            cout << Data << " ";
        cout << endl;
        File.close();
    } else
        cout << "File Data.txt Doesn't Exist!" << endl;

    // Delete the file first and then the empty directory.
    if (exists("Test/Data.txt")) {
        remove(path("Test/Data.txt"));
        cout << "Deleted Data.txt" << endl;
    }

    if (exists("Test")) {
        remove(path("Test"));
        cout << "Deleted Test" << endl;
    }

    return 0;
}

The first feature you should notice about this example is that it constantly checks to verify that the file or directory exists using the exists() function. Your applications should follow this pattern because you can’t know that a file or directory will exist when you need to work with it, even if your application created it. A user or external application can easily delete the file or directory between the time you create it and when you need to work with it again.

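To create a directory, you use create_directory(), which accepts a path as input. You create a path object using path(). Many of the other Filesystem library calls require a path object as well. For example, when you want to remove (delete) either a file or directory, you must supply a path object to remove(). Interestingly enough, remove() does remove a file when you pass a plain string instead of a path object, but it won’t remove a directory. The likely reason is that, with both the std and boost::filesystem namespaces in scope, a string literal is an exact match for the C library’s remove() function, so the compiler calls that function rather than the Boost version. Wrapping the name in path() ensures that you get the Boost version. The inconsistent behavior can make an application that incorrectly uses remove() devilishly difficult to debug.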

Notice that the example uses the boost::filesystem::ofstream and boost::filesystem::ifstream classes. If you try to compile the application without using the fully qualified name of the classes, you get an ambiguous reference error from the compiler because the std namespace, which is also in scope, contains classes with the same names. Using the Boost version of the classes ensures maximum compatibility and fewer errors. Here is what you see when you run this application:

Created Directory Test
Created File Data.txt
Data.txt contains 15 bytes.
This is a test!
Deleted Data.txt
Deleted Test

One final element to look at in this example is file_size(), which reports the size of the file in bytes. The Filesystem library provides a number of helpful statistics that you can use to make your applications robust and reliable. As previously mentioned, you want to spend time working with this library because it contains so many helpful additions to the standard capabilities that C++ provides.