Exploiting parallelism for undirected graph coloring

Parallel graph coloring algorithms are based on the observation that if we split the graph into multiple independent sets of vertices, we can color those in parallel without introducing any conflict.

An independent set is defined as the collection or set of vertices where no two vertices share an edge.

To develop a parallelized version of the sequential greedy graph coloring algorithm from the previous section, we will be relying on a simple, yet effective algorithm proposed by Jones and Plassmann [6]. Before diving into the implementation details, let's take a few minutes to explain how the algorithm generates independent sets and how can we guarantee that our compute function will avoid data races while accessing the graph.

At initialization time, each vertex is assigned a random token. During each super-step, every vertex that hasn't been colored yet compares its own token value to the value of every uncolored neighbor. In the highly unlikely case that two neighboring vertices have been assigned the same random token, we can use the vertex ID as an extra comparison predicate to break the tie. The vertex with the highest token value gets to choose the next color using the same steps as the sequential algorithm while the neighboring vertices remain idle, waiting for their turn.

The concept of using tokens to enforce a coloring order guarantees that, at every super-step, we only color exactly one vertex from each independent set. At the same time, since connected vertices wait for their turn before they can pick a color, no data races can occur.

Just like we did with the shortest path implementation, we will begin by defining a type for holding the state of each vertex and a type that describes the messages that are exchanged between neighboring vertices:

The vertexState struct keeps track of the token and colors that are assigned to the vertex. Furthermore, the usedColors map tracks colors that have already been assigned to the vertex neighbors. Vertices broadcast their state to each of their neighbors by exchanging VertexStateMessage instances. In addition to the token and color value, these messages also include a vertex ID. As we mentioned previously, the vertex ID is required for breaking ties while comparing token values.

Now, let's break down the compute function for this algorithm into small chunks and examine each chunk in more detail:

First things first, each super-step iteration begins by marking all the vertices as inactive. This way, vertices will only be reactivated in subsequent super-steps when a neighbor gets to pick a color and broadcast its selection. Consequently, once the last remaining vertex has been colored, no further messages will be exchanged and all the vertices will be marked as inactive. This observation will serve as the termination condition for the algorithm.

Super-step 0 serves as an initialization step. During this step, we assign random tokens to all the vertices and have them all announce their initial state to their neighbors. If any of the vertices are pre-colored, their assigned colors will also be included in the broadcasted state update message. The vertexState type defines a handy helper method called asMessage which generates a VertexStateMessage, that can be sent to neighbors via the graph's BroadcastToNeighbors method:

Of course, the input graph may potentially include vertices with no neighbors. If these vertices have not been already pre-colored, we simply assign them the first available color, which in our particular implementation is color 1.

The next block of code processes state announcements from the vertex neighbors. Before iterating each state message, each vertex sets its local pickNextColor variable to true. Then, the message list is iterated and the following occurs:

Once all state announcements have been processed, we check the value of the pickNextColor variable. If the vertex is not allowed to pick the next color, it simply broadcasts its current state and waits for the next super-step, as follows:

Otherwise, the vertex gets to select the next color to be assigned to it:

Since some of the vertices in the graph might be pre-colored, our goal is to pick the smallest not used color for this vertex. To do this, we initialize a counter with the smallest allowed value and enter a loop: at each step, we check whether usedColors contains an entry for the nextColor value. If so, we increment the counter and try again. Otherwise, we assign nextColor to the vertex and broadcast our updated state to the neighbors.

In case you are having concerns about the space requirements for keeping track of the used colors on a per-vertex basis, we can actually do much better if we don't need to support potentially pre-colored graphs. If that happens to be the case, each vertex only needs to keep track of the maximum color value that's assigned to its neighbors. At color selection time, the vertex picks maxColor + 1 as its own color.

The full source code and tests for the graph coloring implementation from this section can be found in this book's GitHub repository in the Chapter08/color folder.