Little boxes on the hillside, little boxes made of ticky tacky,
Little boxes on the hillside, little boxes all the same.
There’s a green one and a pink one, and a blue one and a yellow one,
And they’re all made out of ticky tacky, and they all look just the same.
“Little Boxes” by Malvina Reynolds (1900–1978), American folk/blues singer-songwriter and political activist.
In this chapter we begin our detailed study of fractal dimensions of a network . There are two approaches to calculating a fractal dimension of . One approach, applicable if is a spatially embedded network, is to treat as a geometric object and apply techniques, such as box counting or modelling by an IFS, applicable to geometric objects. For example, in Sect. 5.3 we showed how to calculate the similarity dimension d S of a tree described by an IFS. Also, several studies have computed d B for networks embedded in by applying box counting to the geometric figure. This was the approach taken to calculate d B in [Benguigui 92] for four railroad networks, in [Arlt 03] for the microvascular network of chicks, in [Lu 04] for road networks in cities in Texas, and in [Hou 15] for electric power networks; later in this book we will consider those studies in more detail. These approaches to computing d B make no use of network properties. The focus of this chapter is how to compute a fractal dimension for using methods that treat as a network and not simply as a geometric object.
the box counting dimension (this chapter), based on the box-counting method for geometric fractals, describes how the number of boxes of size s needed to cover varies with s,
the correlation dimension (Chapter 11), which describes how the number of nodes within r hops of a random node varies with r, extends to networks the correlation dimension of geometric objects (Chapter 9),
the information dimension (Chapter 15) which extends to networks the information dimension of geometric objects (Chapter 14),
the generalized dimensions (Chapter 17) which extends to networks the generalized dimensions of geometric objects (Chapter 16).
As discussed in the beginning of Chap. 3 and in Sect. 5.5, several definitions of “fractal” have been proposed for a geometric object Ω. None of them has proved entirely satisfactory. As we saw in Sect. 6.5, applications of d B of Ω compare d B to the dimension of similar objects (e.g., healthy versus unhealthy lungs), or examine how d B changes over time. Although focusing on the definition of “fractality” for Ω does not concern us, in later chapters we will need to focus on the definition of “fractality” for . The reason this is necessary for is that, while the terms “fractality” and “self-similarity” are often used interchangeably for Ω, they have been given very different meanings for . With this caveat on terminology, in this chapter and in the next chapter we study the box counting dimension of .
Analogous to computing d B of a geometric fractal, computing d B of requires covering by a set of “boxes”, where a box is a subnetwork of . The methods for computing a covering of are also called box counting methods, and several methods have been proposed. Most methods are for unweighted networks for which “distance” means chemical distance (i.e., hop count); a few methods are applicable to unweighted networks, e.g., spatially embedded networks for which “distance” means Euclidean distance. Most of this chapter will focus on unweighted networks, and we begin there. Recall that throughout this book, except where otherwise indicated, we assume that is a connected, unweighted, and undirected network.
7.1 Node Coverings and Arc Coverings
In this section we consider what it means to cover a network of N nodes and A arcs.
Definition 7.1 The network B is a subnetwork of if B can be obtained from by deleting nodes and arcs. A box is a subnetwork of . A box is disconnected if some nodes in the box cannot be connected by arcs in the box. □
Let be a collection of boxes. Two types of coverings of have been proposed: node coverings and arc coverings. Let s be a positive integer.
Definition 7.2 (i) The set is a node s-covering of the network if for each j we have diam(B j) < s and if each node in is contained in exactly one B j. (ii) The set is an arc s-covering of if for each j we have diam(B j) < s and if each arc in is contained in exactly one B j. □
If B j is a box in a node or arc s-covering of , then the requirement diam(B j) < s in Definition 7.2 implies that B j is connected. However, as discussed in Sect. 8.6, this requirement, which is a standard assumption in defining the box counting dimension of [Gallos 07b, Kim 07a, Kim 07b, Rosenberg 17b, Song 07], may for good reasons frequently be violated in some methods for determining the fractal dimensions of .
It is possible to define a node covering of to allow a node to be contained in more than one box; coverings with overlapping boxes are used in [Furuya 11, Sun 14]. The great advantage of non-overlapping boxes is that they immediately yield a probability distribution, as discussed in Chap. 15. The probability distribution obtained from a non-overlapping node covering of is the basis for computing the information dimension d I and the generalized dimensions D q of (Chap. 17). Therefore, in this book each covering of is assumed to use non-overlapping boxes, as specified in Definition 7.2.
Definition 7.3 (i) An arc s-covering is minimal if for any other arc s-covering , we have J ≤ J ′. Let B A(s) be the number of boxes in a minimal arc s-covering of . (ii) A node s-covering is minimal if for any other node s-covering , we have J ≤ J ′. Let B N(s) be the number of boxes in a minimal node s-covering of . □
That is, a covering is minimal if it uses the fewest possible number of boxes. For s > Δ, the minimal node or arc s-covering consists of a single box, which is itself. Thus B A(s) = B N(s) = 1 for s > Δ.
How large can the ratio B A(s)∕B N(s) be? For s = 2, the ratio can be arbitrarily large. For consider , the complete graph on N nodes, which has N(N − 1)∕2 arcs. For s = 2, we have B N(2) = ⌈N∕2⌉, since each box in the node 2-covering contains one arc, which covers two nodes. We have B A(2) = N(N − 1)∕2, since each box in the arc 2-covering contains one arc. Thus B A(2)∕B N(2) is .
This network has 2N + 2 nodes and 3N arcs. We have B N(3) = 2; the two boxes in the minimal node cover are shown on the left. We have B A(3) = N + 1, as shown on the right. Thus B A(3)∕B N(3) is .
Exercise 7.1 ⋆ For each integer s ≥ 2, is there a graph with N nodes such that B A(s)∕B N(s) is ? □
Virtually all research on network coverings has considered node coverings; only a few studies, e.g., [Jalan 17, Zhou 07], use arc coverings. The arc covering approach of [Jalan 17] uses the N × N adjacency matrix M (Sect. 2.1). For a given integer s between 2 and N∕2, the method partitions M into non-overlapping square s × s boxes, and then counts the number m(s) of boxes containing at least one arc. The network has fractal dimension d if m(s) ∼ s −d. Since m(s) depends in general on the order in which the nodes are labelled, m(s) is averaged over many shufflings of the node order.
The reason arc coverings are rarely used is that, in practice, computing a fractal dimension of a geometric object typically starts with a given set of points in (the points are then covered by boxes, or the distance between each pair of points is computed, as discussed in Chap. 9), and nodes in a network are analogous to points in . Having contrasted arc coverings and node coverings for a network, we now abandon arc coverings; henceforth, all coverings of are node coverings, and by covering we mean covering the nodes of . Also, henceforth by an s-covering we mean a node s-covering, and by a covering of size s we mean an s-covering.
7.2 Diameter-Based and Radius-Based Boxes
There are two main approaches used to define boxes for use in covering : diameter-based boxes and radius-based boxes.
Definition 7.4 (i) A radius-based box with center node and radius r is the subnetwork of containing all nodes whose distance to n does not exceed r. Let B R(r) be the minimal number of radius-based boxes of radius at most r needed to cover . (ii) A diameter-based box of size s is a subnetwork of of diameter s − 1. Let B D(s) denote the minimal number of diameter-based boxes of size at most s needed to cover . □
The subscript “R” in B R(n, r) denotes “radius”. Thus the node set of is . Radius-based boxes are used in the Maximum Excluded Mass Burning and Random Sequential Node Burning methods which we will study in Chap. 8. Interestingly, the above definition of a radius-based box may frequently be violated in the Maximum Excluded Mass Burning and Random Sequential Node Burning methods. In particular, some radius-based boxes created by those methods may be disconnected, or some boxes may contain only some of the nodes within distance r of the center node n.
For a given s there is in general not a unique ; e.g., for a long chain network and s small, there are many diameter-based boxes of size s. A diameter-based box is not defined in terms of a center node; instead, for nodes we require dist(x, y) < s.1 Diameter-based boxes are used in the Greedy Node Coloring, Box Burning, and Compact Box Burning heuristics described in Chap. 8. The above definition of a diameter-based box also may frequently be violated in the Box Burning and Compact Box Burning methods. Also, since each node in must belong to exactly one B j in an s-covering using diameter-based boxes, then in general we will not have diam(B j) = s − 1 for all j. To see this, consider a chain of 3 nodes (call them x, y, and z), and let s = 2. The minimal 2-covering using diameter-based boxes requires two boxes, B 1 and B 2. If B 1 covers x and y, then B 2 covers only z, so the diameter of B 2 is 0.
The only nodes adjacent to n are x and z, so and B R(1) = 2. Yet the diameter of is 2, so it can be covered by a single diameter-based box of size 3, namely itself, so B D(3) = 1. Thus B R(r) and B D(2r + 1) are not in general equal. Nonetheless, for the C. elegans and Internet backbone networks studied in [Song 07], the calculated fractal dimension was the same whether radius-based or diameter-based boxes were used. Similarly, both radius-based and diameter-based boxes yielded a fractal dimension of approximately 4.1 for the WWW (the World Wide Web) [Kim 07a].
As applied to a network , the term box counting refers to computing a minimal s-covering of for a range of values of s, using either radius-based boxes or diameter-based boxes. Conceivably, other types of boxes might be used to cover . Once we have computed B D(s) for various values of s (or B R(r) for various values of r) we can try to compute d B.
Computing any fractal dimension for a finite network is necessarily open to the same criticism that can be directed to computing a fractal dimension of a real-world geometric object, namely that a fractal dimension is defined only for a fractal (a theoretical construct with structure at infinitely many levels), and is not defined for a pre-fractal. This criticism has not impeded the usefulness of d B of real-world geometric objects. Applying the same reasoning to networks, we can try to compute d B for a finite network, and even for a small finite network.
In the literature on fractal dimensions of networks, d B is often (e.g., [Song 07]) informally defined by the scaling . The drawback of this definition is that the scaling relation “∼” is not well defined, e.g., often in the statistical physics literature “∼” indicates a scaling as s → 0 (Sect. 1.2), which is not meaningful for a network. In the context of network fractal dimensions, the symbol “∼”, is usually interpreted to mean “approximately behaves like”. Definition 7.5 below provides a more computationally useful definition of d B. Recall that Δ is the diameter of .
Alternatively and equivalently, sometimes (7.1) is written as . If has box counting dimension d B, then over some range of s we have for some constant α. Since each box in the covering need not have the same size, it could be argued that this fractal dimension of should instead be called the Hausdorff dimension of ; a different definition of the Hausdorff dimension of [Rosenberg 18b] is studied in Chap. 18. Calling “fractal” if d B exists is the terminology used in [Gallos 07b].
This figure suggests that d B should be estimated over the range 2 ≤ s ≤ 6. Applying regression to the values over this range of s yields the estimate d B = 2.38. □
Exercise 7.2 ⋆ In Sect. 5.4 we defined the packing dimension of a geometric object. Can we define a packing dimension of a network? □
7.3 A Closer Look at Network Coverings
In describing their box counting method (which we present in Chap. 8), [Kim 07a] wrote “The particular definition of box size has proved to be inessential for fractal scaling.” On the contrary, it can make a great deal of difference which box definition is being used when computing d B. To see this, and recalling Definition 7.4, suppose we create two new definitions: (i) a wide box of size s is a subnetwork of of diameter s, and (ii) let denote the minimal number of wide boxes of size at most s needed to cover . With these two new definitions, the difference between a wide box of size s and a diameter-based box of size s is that the diameter of is s and the diameter of is s − 1.
plots versus for K = 15 and this range of s. The curve is concave, not linear, and a trend line for these points has a slope of − 0.79, rather than a straight line with a slope of − 1.
Rule 7.1 When using diameter-based boxes of size s, the box counting dimension d B of should be calculated using the ordered pairs . □
Rule 7.2 When using radius-based boxes with radius r, the box counting dimension d B of should be calculated using the ordered pairs . □
Exercise 7.3 Since B D(s) = 1 when s > Δ, does it follow that B R(r) = 1 when 2r + 1 > Δ? □
We will return to rectilinear networks in Sect. 11.4, where we study the correlation dimension of a rectilinear grid.
Exercise 7.5 Consider an infinite rectilinear grid in . Show that, when s is odd, a box with the maximal number of nodes has a center node, and the maximal number of nodes in a box of size s is (s 2 + 1)∕2. □
Exercise 7.6 Let be the box counting dimension of network . Suppose is a subnetwork of . Does the strict inequality hold? □
Exercise 7.7 As discussed in Sect. 5.5, a geometric fractal is sometimes defined as an object whose dimension is not an integer. Is this a useful definition of a “fractal network”? □
Exercise 7.8 Consider an infinite rectilinear lattice in . What does a diameter-based box of size s look like when s is even? □
Exercise 7.9 Let be a 6 × 6 square rectilinear lattice in . Compute B D(s) for s = 2, 3, 4, 5. Compute B R(r) for r = 1, 2, 3. □
7.4 The Origin of Fractal Networks
where the solid circles are measurements from a fractal network, and the hollow circles are from a non-fractal network.
Various researchers have proposed models to explain why networks exhibit fractal scaling. One proposed cause [Song 06] is the disassortative correlation between the degrees of neighboring nodes. A disassortative network is a network in which nodes of low degree are more likely to connect with nodes of high degree, and so there is a strong disassortativity (i.e., repulsion) between hubs. This can be modelled by Coulomb’s Law [Zhang 16].