1 Introduction
Decision diagrams (DDs) are a popular data structure to encode large sets of structured data, for example vectors whose elements take values over finite domains, but it is well-known [10] that the size of the DD strongly depends on how the structure of the data (its “variables”) is mapped to the structure of the DD (its “levels”). The problem of determining the association of variable(s) to levels is the “variable ordering problem” and it is known that finding an optimal order is an NP-complete problem [9] for any DD class, including binary DDs (BDDs [10]) and multiway DDs (MDDs [19]). This has given rise to a variety of metrics (to compare the effectiveness of two orders without actually building the corresponding DDs) and of heuristics (to compute sub-optimal orders, often by attempting to optimize a given metric). DDs play a central role in many system verification tools [4, 11, 14, 20, 22], where they typically support state space exploration. Tools often make use of general-purpose DD libraries [8, 18, 26, 27]. Libraries typically support dynamic reordering to improve the current order at run-time, while the definition of an initial order (static ordering) is typically up to the verification tool, which can rely on domain knowledge. The two problems are synergistic: reordering works better if the initial order is at least fairly good.
Our research seeks to find good variable order metrics and good variable order heuristics for MDDs encoding sets of fixed-size vectors, when these vectors satisfy some linear invariants. We want to answer whether it is possible to leverage invariant information to define effective metrics and heuristics for variable order. Two applications where this is important are encoding the state space of a discrete-event discrete state system and encoding all solutions to a set of integer constraints. In this paper, we concentrate on the first problem, but also address a special case of the second. Specifically, we study the relationship between MDDs and linear invariants with integer coefficients, and define two new metrics, PF and i, and associated heuristic and meta-heuristic. PF and i
exploit the constraint imposed by the invariants. Our evaluation shows that i
is superior to any other metric we consider, in all experiments we performed.
We do not discuss the state-of-the art on heuristics, see [7] for a full survey, but only metrics and on how metric optimization can guide a meta-heuristic. After the necessary background in Sect. 2, Sects. 3 and 4 define the metrics PF and i, based on a number of observation and propositions on the relation between MDD and invariants. Section 5 experimentally evaluates the two metrics against several other metrics on 40 different models, considering thousands of variable orders. Section 6 summarizes our results and discusses future work.
2 Background
Let ,
and
denote the set of booleans, natural numbers, and integers, respectively. All other sets are denoted by calligraphic letters, e.g.,
.
2.1 Discrete-Event Discrete-State System and Their State Space
- (1)
The set of potential states
, defining the type of system states. We assume
; i.e., a state
is a valuation of a finite set
of natural variables.
- (2)
The state
, describing the initial state of the system.
- (3)
The relation
, describing the state-to-state transitions; if
, the system can move from
to
in one step. We assume that
is defined by a finite set of events
and a function
, specifying the unique state
reached if event e occurs in state
, none if
(e is disabled in
). We write
iff
.
The reachable states are and, for such a system, an invariant is a boolean function
with the property that it evaluates to
in all reachable states:
, while it may be either
or
in the unreachable states
.
We specify DEDSs as Petri nets, because of their widespread use and the large body of literature on Petri net invariants. In Petri net terminology, the evaluation of variables describes the number of tokens in the set of places (thus the state, or marking, is a vector in
), the events
correspond to the transitions
, while two
matrices
and
define the system evolution.
(transition firing) iff
, otherwise
, i.e., t is disabled in
, where
is interpreted component-wise. The incidence matrix
is the net change to the marking caused by firing transition t is
. Figure 1 shows two Petri nets used as running example. Places are shown as circles, transitions as bars, and
(
) as incoming (outgoing) arcs for transitions with the corresponding value in
(
) shown on the arc (omitted if 1). The incidence matrix and initial marking are next to the nets.
A p-flow is a vector such that
, and its support is
. A p-flow
implies a linear invariant of the form
, where
is obviously a constant value, the token count of the invariant, which depends only on
. If clear from the context,
may refer to either a p-flow or the implied invariant.
P-flows with no negative entries are called p-semiflows. Let be the set of p-flows,
the set of p-semiflows, and
the p-flows that are not p-semiflows. Since multiplying a p-flow by a non-zero integer results in a p-flow, these sets are either empty or infinite. Figure 1 shows the minimal p-flows (defined later) as column vectors, with the token count below the vector.
![$$\varvec{\pi }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq66.png)
![$$\mathbf {m}[v] \le Tc (\varvec{\pi }) / \varvec{\pi }[v]$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq67.png)
![$$v$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq68.png)
![$$\varvec{\pi }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq69.png)
![$$\mathbf {m}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq70.png)
![$$\begin{aligned} f_1: \ \ \forall \mathbf {m}\in \mathcal {S}_{ rch }, ~\mathbf {m}[P_0] + \mathbf {m}[P_1b] + \mathbf {m}[P_2b] + \mathbf {m}[P_3b]= & {} 2 \\ f_2: \ \ \forall \mathbf {m}\in \mathcal {S}_{ rch }, ~\mathbf {m}[P_0] + \mathbf {m}[P_1a] + \mathbf {m}[P_2a] + \mathbf {m}[P_3a]= & {} 2. \end{aligned}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_Equ1.png)
![$$v$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq71.png)
![$$n_{v} \in \mathbb {N}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq72.png)
![$$\mathcal {S}_{ pot }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq73.png)
![../images/483135_1_En_16_Chapter/483135_1_En_16_IEq76_HTML.gif](../images/483135_1_En_16_Chapter/483135_1_En_16_IEq76_HTML.gif)
![$$\mathcal {S}_{ rch }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq75.png)
![$$\mathcal {F}^+$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq76.png)
![$$\mathcal {F}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq77.png)
![$$\mathcal {F}^+$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq78.png)
![$$\mathcal {F}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq79.png)
![$$\mathbf {C}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq80.png)
![$$|\mathcal {P}|$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq81.png)
![$$|\mathcal {T}|$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq82.png)
![$$\mathcal {F}^+$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq83.png)
![$$\mathcal {F}^+_{min}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq84.png)
![$$\mathcal {F}^+_{min}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq85.png)
![$$|\mathcal {P}|$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq86.png)
![$$n+1$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq87.png)
![$$2n+1$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq88.png)
![$$\mathcal {F}^+_{min}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq89.png)
![$$2^n$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq90.png)
![$$n=3$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq91.png)
![$$\mathcal {F}^+_{min}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq92.png)
![../images/483135_1_En_16_Chapter/483135_1_En_16_Fig1_HTML.png](../images/483135_1_En_16_Chapter/483135_1_En_16_Fig1_HTML.png)
Two Petri nets, their incidence matrices, and their p-flows.
In addition to , we can define
. Obviously
. We let
refer to either when the distinction is not relevant. Note that
is a superset of the linearized reachability set [21]
, used in Petri net theory to devise a semi-decidable procedure for safety properties.
2.2 Multiway Decision Diagrams
Definition 1
(MDD). Given a global domain , where each local domain
is of the form
for some
, an (ordered, quasi-reduced) MDD over
is a directed acyclic graph with exactly two terminal nodes,
and
, at level 0 (we write
), with each non-terminal node p at some level
having one outgoing edge for each
, pointing to a node p[i] at level
or to
, and with no duplicates (there cannot be nodes p and q at level k with
for all
) or redundant nodes (node p at level k is redundant if
for all
) pointing to
. The function
encoded by an MDD node p is recursively defined as
if
, and
if
. Interpreting
as an indicator function, p also encodes the set
, defined as
. This is the set of variable assignments compatible with the paths from p to
.
MDDs are a canonical representation of subsets of : given MDD nodes p and q at the same level,
iff
. We observe that quasi-reduced MDDs differ from the more common fully-reduced MDDs, which allow edges to skip levels by eliminating all redundant nodes, not just those encoding
. As it will be clear, though, the quasi-reduced MDD encoding the state space of a Petri net covered by invariants cannot contain redundant nodes, thus coincides with the fully-reduced MDD for such models. When drawing MDDs, edges point down and we omit node
, edges pointing to it, and the corresponding cells in the originating node, so that, if node p at level k with
is drawn as
, it means that
. We also omit node
and edges pointing to it, but not the corresponding cell in the originating node.
![$$\mathcal {S}_{ rch }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq135.png)
![../images/483135_1_En_16_Chapter/483135_1_En_16_IEq138_HTML.gif](../images/483135_1_En_16_Chapter/483135_1_En_16_IEq138_HTML.gif)
![$$\mathcal {S}_{ sat }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq137.png)
![$$|\mathcal {F}| - 1$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq138.png)
![../images/483135_1_En_16_Chapter/483135_1_En_16_Fig2_HTML.png](../images/483135_1_En_16_Chapter/483135_1_En_16_Fig2_HTML.png)
P-semiflows and MDD for two variable orders for the net in Fig. 1(A).
Since we focus on the size of the MDD encoding , we only consider MDDs with a single root node r, so that
. Letting
be the set of MDD nodes al level k, we characterize the MDD size in terms of its nonterminal nodes
, i.e.,
(although, unlike for BDDs [10] where nodes have exactly two outgoing edges, the number of MDD edges
could also be a meaningful measure of size). The first step to generate
is to map the places
of the Petri net to the L levels of the MDD. We limit ourselves to mapping each place to a different level, i.e., requiring a variable order
, where
. It is known that the choice of
can exponentially affect the size of MDD and finding an optimal mapping is NP-complete [9]. We stress that we consider only the final size of the MDD. In reality, the fixpoint iterations to compute
or the intersections to compute
can lead to an intermediate size of the MDD (peak size) that is normally much larger than the final size. However, our work to reduce the final MDD size is largely orthogonal to other strategies (like saturation [13] for
construction) aimed at reducing the peak size, thus both can be employed to improve efficiency.
The MDDs in Fig. 2 encode for the Petri net of Fig. 1(A), for two different variable orders. More precisely Fig. 2 shows, left to right, and for each order, the variable order (with level L at the top), the place bounds, the p-semiflows
(with the token count at the bottom), and the corresponding MDDs. The variable order in (B) is poor, resulting in an MDD with 40 nodes, while that in (A) requires only 19 nodes.
2.3 Metrics for Variable Orders
A metric M is a perfect predictor of MDD size if implies
for any variable orders
and
, where
is the number of nodes in the MDD for
when using variable order
; no efficiently-computable perfect predictor is known. Metrics have been defined based on the span of events in the incidence matrix
, on the bandwidth of
, on the center of gravity of events, and on p-semiflows. Metrics that consider the span of each event t (distance between the top and bottom nonzero in
or
for the given variable order) are the Normalized sum of Event Span (NES), the Weighted NES (WES), Sum of Span (SOS) [24], Sum of Tops (SOT) [11] and Sum of Unique and Productive Spans (SOUPS) [25]. Classic bandwidth reduction techniques from linear algebra were applied to variable order computation for the first time in [23]. The corresponding metrics are Bandwidth (BW), Profile (PROF), or Wavefront (WF), computed on a squared matrix derived from the incidence matrix
. Point-transition spans (PTS) is the metric used as a convergence criterion by the widely used heuristic Force [3], an algorithm for multi-dimensional clustering of graphs that has been adapted to variable order generation. A center of gravity for the variables is defined and the orders are measured in terms of hyperdistance of the variable from the center of gravity. PTS
[6] is a variation of PTS to consider also the effect of p-semiflows in the PTS variable clustering. Finally, the p-semiflow span (PSF) is the metric optimized by the heuristic defined in [5], which works by ordering the variables according to p-semiflows. PSF is a measure of the proximity of places that belong to the same p-semiflow.
An overview of these metrics can be found in [6], which also studies their coefficient of correlation to determine the predictive power of each metric over a large set of models and of orders. All models in the study are Petri nets, mostly conservative. We now provide some details for SOUPS and PSF which, together with PTS, have been reported as valuable predictors [6].
SOUPS modifies the sum of transition spans (SOS) metric [24] by considering only once the maximal common portion of multiple transition spans having the same effect on the marking and avoids counting the bottom portion of a transition span if it checks but does not change the marking of the corresponding places. SOUPS performs particularly well in conjunction with saturation [13], as it tends to result in even smaller peak MDDs. SOS and SOUPS, just like WES and NES [24] or SOT [11], are easily computed from the matrices and
.
![$$ \text {PSF}(\lambda ) = \sum \nolimits _{\varvec{\pi }\in \mathcal {F}^+_{min}} \Bigl ( \max \bigr \{\lambda (v) : \varvec{\pi }[v] \ne 0 \bigl \} - \min \bigr \{\lambda (v) : \varvec{\pi }[v] \ne 0 \bigl \} + 1 \Bigr ). $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_Equ2.png)
![$$\varvec{\pi }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq171.png)
![$$\varvec{\pi }[v]$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq172.png)
![$$v$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq173.png)
![$$ Supp (\varvec{\pi })$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq174.png)
![$$\mathcal {F}^+_{min}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq175.png)
There has been a proposal [16] to use of p-semiflows to eliminate some state variables (decision diagram levels) through a greedy heuristic, but later work [12] observed that this leads to a loss of locality in the MDD representation of the transition relation, and suggested instead to use p-semiflows to merge variables, proving that this always reduces the MDD size. The same paper [12] also proposed to modify the sum-of-transition-tops (SOT) metric so that it considers also a set of linearly independent p-semiflows, but provided no hints about the relative weight given to transitions vs. p-semiflows when computing the metric.
3 MDD and Invariants: The PF Metric
We now begin investigating the relationship between p-flows and the shape of the MDD encoding and
, and introduce the new metric PF.
P-flows and information remembered at level k. The invariant corresponding to a p-flow imposes a constraint on the reachable markings, since it implies a constant weighted sum of the tokens in the places belonging to
. Thus, the MDD must “remember” (using distinct nodes at level k) the possible partial weighted sums corresponding to places in the invariant support that are above level k, as long as the invariant is active, i.e., its support contains places mapped to levels k or below, and this is true even if the place mapped to level k is not in the support. Thus, intuitively, places in the support should be mapped to levels close to each other. This can be easily seen seen in Fig. 2(B), where the places in the support of the two p-semiflows in
are not in consecutive levels, resulting in more nodes: the level for
has 9 nodes, since the MDD must remember the partial sum of tokens of the places in the two branches of the Petri net of Fig. 1(A), and each of them can range from 0 to 2. In the order of Fig. 2(A), all places in the top branch are instead above the level of place
, which is in turn above all places in the bottom branch. Thus, level
has only three possible values to remember: whether in the top (and thus in the bottom) branch there are 0, 1, or 2 tokens (and therefore
has 2, 1, or 0 tokens, respectively). This dependence is captured by the metric PSF of Sect. 2.3. The PSF value for order (B) is 13, while it is 8 for order (A), consistent with the intuition that a smaller value of PSF results in a smaller MDD.
![$$\varvec{\pi }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq185.png)
![$$\varvec{\pi }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq186.png)
![$$ Supp (\varvec{\pi })$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq187.png)
![$$P_0$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq188.png)
![$$P_0$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq189.png)
![$$P_3a$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq190.png)
![$$\mathcal {F}^-$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq191.png)
![$$ \varvec{\pi }_3: \ \ \mathbf {m}[P_1a] + \mathbf {m}[P_2a] + \mathbf {m}[P_3a] - \mathbf {m}[P_1b] - \mathbf {m}[P_2b] - \mathbf {m}[P_3b] = 0 , $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_Equ3.png)
![$$\mathcal {F}^-$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq192.png)
![$$\mathcal {F}^+$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq193.png)
![$$\mathcal {F}_{min}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq194.png)
![$$ \text {PF}(\lambda ) = \sum \nolimits _{\varvec{\pi }\,\in \, \mathcal {F}_{min}} \Bigl ( \max \bigr \{\lambda (p) : \varvec{\pi }(p) \ne 0 \bigl \} - \min \bigr \{\lambda (p) : \varvec{\pi }(p) \ne 0 \bigl \} + 1 \Bigr ), $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_Equ4.png)
![$$\mathcal {F}_{min}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq195.png)
![$$\mathcal {F}_{min}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq196.png)
![$$\mathcal {F}^+_{min}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq197.png)
Continuing the analogy with PSF, we define as the set of minimal p-flows, i.e., the g.c.d. of their entries is 1 and their support does not strictly include the support of any other p-flow; in addition, to avoid considering both a p-flow and its negative, we assume an arbitrary place order (unrelated to the MDD variable order) and require the first nonzero entry to be positive. We now prove that this set
is unique and that it can generate a multiple of any p-flow. In the figures, the set
is shown partitioned into
and
.
Theorem 1
Set is unique, and it spans all p-flow directions, i.e., given
, for some
,
equals a linear combination of elements in
.
Proof
To prove uniqueness, it suffices to show that there can be at most one minimal p-flow with a given support. Assume by contradiction that there are two distinct minimal p-flows and
with
, and let
and
be the coefficients in
and
corresponding to the first place
, respectively. Then, define
, so that
.
If , then
but
, thus
and
cannot be minimal p-flows since their support strictly contains the support of
, a contradiction.
If , then
, which implies
, since the g.c.d. of both
and
is 1. But then,
, again a contradiction.
To prove that spans all p-flow directions, consider
. There must exist
with
; pick
and let
and
, so that
, with
. Either
, or it is a p-flow, in which case we can repeat the process to obtain
, and so on. Eventually, we must reach the case
, i.e.,
, at which point we can write
, where
, i.e., we can express a multiple of
as a linear combination of elements of
.
![../images/483135_1_En_16_Chapter/483135_1_En_16_Fig3_HTML.png](../images/483135_1_En_16_Chapter/483135_1_En_16_Fig3_HTML.png)
Three variable orders for the Petri net of Fig. 1(B), and the resulting MDDs.
We observe that the size of , like that of
, is at most exponential in
, since the proof of Theorem 1 shows that the elements of
must have uncomparable supports.
4 MDD and Invariants: The i
Metric
As we shall see in Sect. 5, both PSF and PF exhibit significant correlation with the MDD size. However, there are cases where they do not perform well, especially when is large. Consider for example the Petri net of Fig. 1(B), and the three MDDs corresponding to different variable orders in Fig. 3. This Petri net has many minimal p-flows,
and
. The three p-flows in
relate the places inside each fork-and-join subnet (
, for
), while the eight p-semiflows in
relate the tokens in the three fork-and-join subnets with those in place
. The order in Fig. 3(B) produces the smallest MDD size (37 nodes against the 49 nodes of order (A) and 69 of (C)), but it is the one with the worst (highest) value of PSF. On the other side also PF fails to chose the order with the smallest MDD: the smallest value for PF is 55 for order (A), which is only the second best for MDD size. One reason is that, when
contains many related, dependent constraints affecting a given MDD level, counting all of them may confuse the metric. On the other hand, we have seen that considering instead a basis depends strongly on the choice of vectors included in the basis, with a meaningless metric in the worst case.
We then propose i, a new variable order metric which, like PSF and PF, is based on linear invariants but, unlike PSF and PF, is unaffected by redundant minimal p-flows and is independent of the choice of the specific p-flows being considered, as long as they constitute a generator set. i
focuses on the number
of linearly independent partial p-flows that are still active at level k. The definition of i
requires a deeper understanding of the relationships among the MDD structure and the p-flows, as illustrated next.
![$$p, q \ne \mathtt {\bot }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq266.png)
![$$p .lvl = k$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq267.png)
![$$q .lvl = h$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq268.png)
![$$\mathcal {A}_p$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq269.png)
![$$\mathcal {B}_p$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq270.png)
![$$\mathtt {\top }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq271.png)
![$$\mathcal {C}_{p,q}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq272.png)
![$$\begin{aligned} \mathcal {A}_p= & {} \{(i_L,\ldots ,i_{k+1}) : r[i_L]\cdots [i_{k+1}] = p\} \\ \mathcal {B}_p= & {} \{(i_k,\ldots ,i_1) : p[i_k]\cdots [i_1] = \mathtt {\top }\} \\ \mathcal {C}_{p,q}= & {} \{(i_k,\ldots ,i_{h+1}) : p[i_k]\cdots [i_{h+1}] = q\} , \end{aligned}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_Equ5.png)
![$$\mathcal {A}_r \!=\! \mathcal {B}_{\mathtt {\top }} \!=\! \{()\}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq273.png)
![$$\mathcal {A}_{\mathtt {\top }} \!=\! \mathcal {B}_{r} \!=\! \mathcal {S}_r$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq274.png)
![$$\mathcal {A}_p \!=\! \mathcal {C}_{r,p}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq275.png)
![$$\mathcal {B}_p \!=\! \mathcal {C}_{p,\mathtt {\top }}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq276.png)
![$$\mathcal {C}_{p,q} \!=\! \emptyset $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq277.png)
![$$q .lvl \!\ge \! p .lvl $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq278.png)
![$$\mathcal {S}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq279.png)
![$$\lambda $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq280.png)
![$$\mathcal {A}_p$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq281.png)
![$$\mathcal {B}_p$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq282.png)
![$$\mathcal {C}_{p,q}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq283.png)
![$$i_k$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq284.png)
![$$v= \lambda ^{-1}(k)$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq285.png)
Theorem 2
[28]. The nodes at level k can be used to define a partition of :
![$$\mathcal {C}_{p,q}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq288.png)
![$$\mathcal {A}_p$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq289.png)
![$$p .lvl = k > q .lvl =h$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq290.png)
![$$\sigma = (i_k,\ldots ,i_{h+1}) \in \mathcal {C}_{p, q}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq291.png)
![$$\varvec{\pi }\in \mathcal {F}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq292.png)
![$$\sigma $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq293.png)
![$$\varvec{\pi }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq294.png)
![$$ Sum (p, q, \sigma , \varvec{\pi }) = \sum \nolimits _{p .lvl \ge j > q .lvl } i_j \cdot \varvec{\pi }[\lambda ^{-1}(j)]. $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_Equ6.png)
![$$\sigma \in \mathcal {C}_{r,\mathtt {\top }} = \mathcal {S}_{ rch }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq295.png)
![$$ Sum (r, \mathtt {\top }, \sigma , \varvec{\pi }) = Tc (\varvec{\pi })$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq296.png)
We can now introduce two fundamental properties enjoyed by an MDD encoding a state space subject to a set of p-flows , which will pave the way to the definition of our new metric called i
.
Theorem 3
Assume a set of states subject to the set of p-flows
is encoded by an MDD rooted at r. Then, all paths between a given pair of nodes have the same partial sum for any given invariant:
,
,
. We can therefore write
.
Proof
Consider two nodes p and q, with , two paths
and
from p to q, and any
and
, so that both
and
describe markings in
. Then, for any p-flow
, we have that
. However,
, thus we must have
.
An even stronger property holds if the MDD encodes : then, every node in the MDD is completely identified by a unique pattern of partial p-flow sums.
Theorem 4
![$$\mathcal {S}_{ sat }$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq319.png)
![$$ \forall p,p' \in \mathcal {N}_k, ~ p \ne p' ~ \Rightarrow ~ \exists \varvec{\pi }\in \mathcal {F}, ~ Sum (r,p,\varvec{\pi }) \ne Sum (r,p',\varvec{\pi }) . $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_Equ7.png)
Proof
Remember that . Assume that distinct nodes
satisfy
. Since the MDD is canonical, p and
must encode different sets, thus there must be a
in
or in
(w.l.o.g. assume the former case). Then, considering any
and
, we have
and
. But
implies
and, since
, and this holds for any
in
, we should also have
, a contradiction.
![../images/483135_1_En_16_Chapter/483135_1_En_16_Fig4_HTML.png](../images/483135_1_En_16_Chapter/483135_1_En_16_Fig4_HTML.png)
Computations of the rank weights from a matrix with
.
Theorem 4 implies that every node in the MDD encoding is completely identified by a unique pattern of partial p-flow sums. However, not every p-flow is relevant at a given level k of the MDD, and, more importantly, the portions from level L to level k of different p-flows may encode the same information, i.e., may be linearly dependent, yet these redundant portions contribute to the computation of the PF metric. i
, then, attempts to estimate the number of possible combinations of partial path sums that may actually be found in the nodes at level k of the MDD, taking into account these linear dependencies.
![$$|\mathcal {P}| \times |\mathcal {F}_{min}|$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq342.png)
![$$\mathbf {F}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq343.png)
![$$\lambda $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq344.png)
![$$\mathcal {F}_{min}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq345.png)
![$$\rho _{ up }(k)$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq346.png)
![$$ \rho _{ up }(k) = \mathrm {rank}\bigl ( \mathbf {F}[L:k+1,\cdot ] \bigr ), $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_Equ8.png)
![$$\mathbf {F}[L:k+1,\cdot ]$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq347.png)
![$$\mathbf {F}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq348.png)
![$$k+1$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq349.png)
![$$\rho _{ up }(k)$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq350.png)
![$$\rho _{ down }(k)$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq351.png)
![$$\mathbf {F}[k:1,\cdot ]$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq352.png)
![$$\mathbf {F}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq353.png)
![$$ \rho _{ down }(k) = \mathrm {rank}(\mathbf {F}) - \mathrm {rank}\bigl ( \mathbf {F}[k:1,\cdot ] \bigr ). $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_Equ9.png)
![$$ \rho (k) = \rho _{ up }(k) - \rho _{ down }(k) . $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_Equ10.png)
![$$\rho _{ up }(k)$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq354.png)
![$$\rho _{ down }(k)$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq355.png)
![$$\mathbf {F}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq356.png)
![$$\rho _{ up }(k)$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq357.png)
![$$\rho _{ down }(k)$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq358.png)
![$$\rho (k)$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq359.png)
![$$_\mathrm {Rank}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq360.png)
![$$\rho (k)$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq361.png)
![$$ \text {i}_\mathrm {Rank} = \sum \nolimits _{1 \le k \le L} \rho (k), $$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_Equ11.png)
![$$_\mathrm {Rank}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq362.png)
![$$_\mathrm {Rank}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq363.png)
![$$\mathbf {F}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq364.png)
![$$\mathbf {F}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq365.png)
![$$\mathbf {F}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq366.png)
![$$\mathbf {F}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq367.png)
![$$O(\min \{\mathcal {P},\mathcal {T}\}^3)$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq368.png)
5 Experimental Assessment of the Metrics
We now experimentally assess the efficacy of PF and i: since the relationship between p-flows and MDD nodes is stronger for
than for
(Theorem 4), we expect higher correlation when the MDD encodes the former. We also seek to determine whether these metrics can be used to drive iterative heuristics or meta-heuristics that compute variable orders. All experiments are on different sets of orders for 40 models taken from the Petri Net Repository [2]. The experiments have been conducted using the GreatSPN tool [1, 4], which uses the Meddly library [8]. All MDDs generated had fewer than one million nodes. We follow the evaluation procedure of [6] and compute the Spearman coefficient of correlation (CC), whose interpretation is: [1, 0.8] means very strong correlation, [0.6, 0.8] strong correlation, [0.4, 0.6] moderate correlation, and so on decreasing. Negative values indicate anti-correlation.
Figure 5 compares the correlation of i and PF to that of the metrics of Sect. 2.3. Although all experiments have been performed, for sake of space only 6 metrics are considered in the tables. We have chosen to include PSF, PF and i
(for obvious reasons), plus the best among the
span metrics (SOUPS), and two versions of PTS (PTS and PTS
, without and with p-flow) since PTS is the metrics implicitly optimized by the widely used Force heuristic. No bandwidth metric is reported since they all exhibit at best a moderate correlation. Each row represents a metric, columns report the CC of the metrics with the MDD encoding
(columns [A] and [B]) and
(columns [C] and [D]) for two different sets of orders. The CC of a single model for a single metric is computed from the bivariate series relating, for each variable order
, the MDD size built using
with the value of the metric for that
. ICC is the CC computed over the set of orders
in
and BCC is computed over
. The sets
and
are built from 1,000 initial random orders by generating sequences of increasingly better orders (in terms of MDD final size) until a convergence criterion is satisfied;
retains all orders while
retains only the last orders in each sequence (thus exactly 1,000 orders). This construction is explained in [6], where it was observed that
tends to contain mostly good orders, and
a mixture of good and bad orders. The above sets have been built for each of the 40 models. For each combination, we report the mean CC (over all models) and the CC distribution for the 40 models; the x axis is partitioned into 20 bins, so the y axis indicates the number of models whose CC falls into each bin. All plots have the same scale on the y axis. and the height of the bar at 0 is fixed at 36 for all rows.
The results of Fig. 5 indicate that i has the highest correlation for both ICC and BCC and for both
and
. i
is better than the second best by
(ICC on
) and up to
(BCC on
). The comparison with PTS (the metric used as a convergence criteria by the widely used Force heuristic) is even more striking. It is also evident that in none of the four cases PF performs better than PSF, supporting our observation that considering more p-flows is not always (or even usually) a good idea. Figure 5 also indicates that all metrics have better CC when the MDD encodes
(column [A] vs. [C], and column [B] vs. [D]). This is not surprising for i
, given Theorem 4, but it also holds for all other metrics. This could be due to the fact that, since
, the MDD for
encodes additional constraints not captured by any of the metrics.
![$$\mathcal {V}_\mathrm {IMPR}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq402.png)
![$$\mathcal {V}_\mathrm {BEST}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq403.png)
![$$\mathcal {V}_\mathrm {BEST}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq404.png)
![../images/483135_1_En_16_Chapter/483135_1_En_16_Fig5_HTML.png](../images/483135_1_En_16_Chapter/483135_1_En_16_Fig5_HTML.png)
Two correlation coefficients for different metrics for and
The experiments reported in Fig. 6 serve to evaluate whether the metrics can be used as an objective function inside a simulated annealing procedure (columns [A] and [C]) or as a meta-heuristic to select one among the orders produced during the simulated annealing (columns [B] and [D]). Given an initial variable order and a metric m the procedure searches an “optimal” order through a simulated annealing procedure [17], aimed at minimizing the value of m. We employ a standard simulated annealing procedure, described in [6]. Unlike the construction of the set of orders used for the computation of ICC and BCC in Fig. 5, no MDD is built during the construction of the candidate variable order. For each metric, the simulated annealing procedure is run 1,000 times, from different initial orders, and Fig. 6 reports, in columns [A] and [C], the mean and distribution of the “score” of the MDDs built using the 1,000 orders produced by the 1,000 runs of the simulated annealing for each metric m, for the 40 models. The score is the distance from the size of the smallest MDD built, normalized on the distance between the smallest and the largest MDD size built (see [6], Eq. 5), obviously computed separately for each model. A value of 1 for order for a given model indicates that the smallest MDD seen for that model was built using
. A value of 0 indicates the worst order. Column [A] refers to the MDDs storing
, while column [C] refers to
. Again, i
performs better than any other metrics in both cases.
![$$_\mathrm {Rank}$$](../images/483135_1_En_16_Chapter/483135_1_En_16_Chapter_TeX_IEq412.png)
![../images/483135_1_En_16_Chapter/483135_1_En_16_Fig6_HTML.png](../images/483135_1_En_16_Chapter/483135_1_En_16_Fig6_HTML.png)
Evaluation of metrics on simulated annealing produced orders
![../images/483135_1_En_16_Chapter/483135_1_En_16_Fig7_HTML.png](../images/483135_1_En_16_Chapter/483135_1_En_16_Fig7_HTML.png)
Evaluation of metrics on Force-produced orders.
Figure 7 shows the evaluation of a meta-heuristic also defined in [6], based on Force. Each metric m is used to drive the selection of the “best” variable order among a set of variable orders produced using Force from an initial set of 1,000 random orders. This is done for each of the 40 models. The last row is the baseline (1,000 points, all computed using Force), while all other histograms are built out of 40 MDD sizes, one per model. A mean value greater than the baseline mean indicates that the metric selects the best orders among the ones computed by Force. A mean below the baseline indicates otherwise. Again, when we employ i
to select the order to use, we get a better score than with any other metric for both
(left column) and
(right column).
6 Conclusion and Future Work
We considered the problem of defining and evaluating variable orders for MDDs encoding either the reachable states of a DEDS () or the states satisfying a set of linear invariants (
). We studied the relation between the MDD size and structure and the linear invariants, and proposed two new metrics: PF, a trivial extension to PSF; and i
. Through a set of experiments, metrics have been evaluated both as predictors of the MDD size and as drivers for two heuristics (and associated meta-heuristics). The experiments follow the procedure proposed in [6], as defining a good and fair procedure to compare metrics and MDD sizes for a set of models is a nontrivial task. The results show that i
is better than any other metrics we found in the literature.
The definition of i, and PF, assumes that linear invariants are available. For DEDSs specified as Petri nets, the linear invariants are derived from the p-flows, the left annullers of the incidence matrix, an integer matrix describing how an event modifies a state. Clearly, whenever a DEDS can be specified through a similar matrix, the application of our method is straightforward, as in the case of various formalisms used in system modeling and verification. For other formalisms, this may be less immediate, but our method only assumes a set of linear invariants on the state space, regardless of how they are computed.
In our experiments, we considered only conservative Petri nets, where each place appears in at least one invariant. This allowed us to compare with previously defined metrics that exploit linear invariants generated from p-semiflows. If no invariants are available, or if most places are not part of any invariant, PF and i could perform very poorly. If a net is not conservative, a subset of places may “lose” tokens, “gain tokens”, or both. The last two cases cause
to be infinite, but the first case can still be managed by our approach, thanks to p-flows. As an example, consider the net obtained from the net in Fig. 1(B) by removing the arc from transition
back to
Such a net does not have any p-semiflow, but all the places between each pair of fork-and-join belong to a p-flow, allowing us to apply our method. A further extension could consider invariants where the weighted sum of tokens in a subset of places is less than or equal a constant (instead of just equal).
Several directions for additional exploration remain. First, i does not consider the initial state of the DEDS, but the number of nodes at a given level depends on the token count of the p-flows, and this may be especially important when the p-flows have significantly different token counts. Then, the efficient computation of i
is obviously important, as heuristics using it could probably evaluate it many times. The computation could be expensive since it involves matrix rank computations.
Finally, we are interested in extending i to more general constraints, which can still provide hints on good variable orders; for example, a constraint “if
then
” imposes no limitations on C along paths where
, (assuming A is above B and B is above C in the MDD), but, requires to remember the value of B until reaching C along paths where
.
![Creative Commons](../css/cc-by.png)
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.