In our application, we are going to create two worker threads that will increment a shared counter, and let them run for a specific amount of time.
As a first step, we define two global atomic variables, running and counter:
std::atomic<bool> running{true};
std::atomic<int> counter{0};
The running variable is a Boolean flag. While it is set to true, the worker threads keep running; once it changes to false, they should terminate.
The counter variable is our shared counter, which the worker threads increment concurrently. To do so, we use the fetch_add method, which atomically increments a variable and which we already used in the Using atomic variables recipe. This time, we pass an additional argument, std::memory_order_relaxed, to it:
counter.fetch_add(1, std::memory_order_relaxed);
This argument specifies the memory ordering constraints for the operation. The atomicity of the increment itself is always guaranteed, and so is the modification order of the counter variable, but for a simple counter, the order of this operation relative to other memory accesses in concurrent threads does not matter. std::memory_order_relaxed expresses exactly this kind of memory access for atomic variables: atomicity without ordering constraints on the surrounding reads and writes. Passing it into the fetch_add method allows the implementation to avoid unneeded synchronization, such as memory fences, on a particular target platform, which could otherwise affect performance.
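For reference, the worker function can be as simple as a loop that increments the counter while the flag is set. Its exact body is not reproduced in this section, so the following is only a minimal sketch; the loop structure is an assumption:

void worker() {
    // Keep incrementing until the main thread clears the flag.
    while (running) {
        counter.fetch_add(1, std::memory_order_relaxed);
    }
}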
In the main function, we create two worker threads:
std::thread worker1(worker);
std::thread worker2(worker);
Then, the main thread pauses for 1 second. After the pause, it sets the value of the running variable to false, signaling the worker threads to terminate:
running = false;
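The pause that precedes this assignment is not shown above; a common way to implement it, assuming the standard &lt;thread&gt; and &lt;chrono&gt; facilities are available, is the following:

// Let the workers run for 1 second before clearing the flag.
std::this_thread::sleep_for(std::chrono::seconds(1));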
After the worker threads terminate, we print the value of the counter.
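A minimal sketch of this final step, assuming the threads are joined first and the value is written to standard output with &lt;iostream&gt;:

// Wait for both workers to observe the cleared flag and exit.
worker1.join();
worker2.join();
std::cout << "Counter: " << counter << std::endl;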
The resulting counter value is determined by the timeout intervals passed to the worker functions. Changing the memory order in the fetch_add method does not noticeably change the resulting value in our example. However, it can improve the performance of highly concurrent applications that use atomic variables, because the compiler and the CPU are free to reorder operations in concurrent threads without breaking the application logic. This kind of optimization depends heavily on the developer's intent and cannot be inferred automatically without hints from the developer.
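For comparison, calling fetch_add without the second argument uses the default and strictest ordering, std::memory_order_seq_cst, which may emit additional memory fences on some platforms:

// Equivalent increment with the default sequentially consistent ordering:
counter.fetch_add(1); // same as counter.fetch_add(1, std::memory_order_seq_cst)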