36 Dynamic Trees*

Camil Demetrescu

Sapienza University of Rome

Irene Finocchi

Sapienza University of Rome

Giuseppe F. Italiano

University of Rome Tor Vergata

36.1 Introduction

36.2 Linking and Cutting Trees

Using Operations on Vertex-Disjoint Paths • Implementing Operations on Vertex-Disjoint Paths

36.3 Topology Trees

Construction • Updates • Applications

36.4 Top Trees

Updates • Representation and Applications

36.5 ET Trees

Updates • Applications

36.6 Reachability Trees

36.7 Conclusions

Acknowledgments

References

36.1 Introduction

In this chapter we consider the problem of maintaining properties of a collection of vertex-disjoint trees that change over time as edges are added or deleted. The trees can be rooted or free, and vertices and edges may be associated with real-valued costs that may change as well. A straightforward solution would be to store explicitly with each vertex its parent and cost, if any: with this representation each update would cost only O(1) time, but answering a query would typically take time proportional to the size or to the depth of the tree, which may be linear in the worst case. By representing the structure of the trees implicitly, one can reduce the query time while only slightly increasing the update time. The bounds typically achieved are logarithmic in the number of vertices of the forest, either in the worst case or amortized over a sequence of operations.

While the basic tree update operations are edge insertions, edge deletions, and possibly vertex/edge cost changes, many properties of dynamically changing trees have been considered in the literature. The basic query operation is tree membership: while the forest of trees is dynamically changing, we would like to know at any time which tree contains a given vertex, or whether two vertices are in the same tree. Dynamic tree membership is a special case of dynamic connectivity in undirected graphs, and indeed in Chapter 37 we will see that some of the data structures developed here for trees are used to solve the more general problem on graphs. We remark that, if only edge insertions are allowed, the tree membership problem is equivalent to maintaining disjoint sets under union operations, and thus the well-known set union data structures can solve it [1]. In this chapter we instead consider the problem in a fully dynamic setting, in which edge deletions are also allowed, and present efficient data structures such as the linking and cutting trees of Sleator and Tarjan [2] and the topology trees of Frederickson [3].

Other properties that have been considered include finding the least common ancestor of two vertices, the center, the median, or the diameter of a tree [2,4,5]. When costs are associated either with vertices or with edges, one could also ask for the minimum or maximum cost on a given path. A variant of topology trees, known as top trees [4], is especially well suited to maintaining this kind of path information.

ET trees, first introduced in [6] and widely used in [7], make it easy to deal with forests whose vertices are associated with weighted or unweighted keys, supporting, for example, minkey queries, which return a key of minimum weight in the tree containing a given vertex. Reachability trees, introduced by Even and Shiloach in [8], support instead distance and shortest path queries and have been widely used to solve dynamic path problems on directed graphs (see, e.g., [7,9]).

36.2 Linking and Cutting Trees

In this section we present a data structure due to Sleator and Tarjan [2] for maintaining a collection of rooted trees, each of whose vertices has a real-valued cost, under an arbitrary sequence of the following operations:

maketree(v): initialize a new tree consisting of a single vertex v with cost zero.

findroot(v): return the root of the tree containing vertex v.

findcost(v): return a vertex of minimum cost in the path from v to findroot(v).

addcost(v, δ): add the real number δ to the cost of every vertex in the path from v to findroot(v).

link(v, w): merge the trees containing vertices v and w by inserting edge (v, w). This operation assumes that v and w are in different trees and that v is a tree root.

cut(v): delete the edge from v to its parent, thus splitting the tree containing vertex v into two trees. This operation assumes that v is not a tree root.
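Before turning to the efficient implicit representation, the straightforward explicit representation mentioned in the introduction can be sketched in a few lines of Python (an illustrative baseline of our own, not the structure of [2]): parent pointers and costs are stored directly, so maketree, link, and cut take O(1) time, while findroot, findcost, and addcost walk the path to the root and can take time linear in the depth of the tree.

```python
class NaiveForest:
    """Explicit representation: O(1) updates, O(depth) path queries."""

    def __init__(self):
        self.parent = {}   # vertex -> parent vertex (None for roots)
        self.cost = {}     # vertex -> real-valued cost

    def maketree(self, v):
        self.parent[v] = None
        self.cost[v] = 0.0

    def findroot(self, v):
        while self.parent[v] is not None:   # walk up to the root
            v = self.parent[v]
        return v

    def findcost(self, v):
        # a vertex of minimum cost on the path from v to findroot(v)
        best = v
        while v is not None:
            if self.cost[v] < self.cost[best]:
                best = v
            v = self.parent[v]
        return best

    def addcost(self, v, delta):
        # add delta to every vertex on the path from v to the root
        while v is not None:
            self.cost[v] += delta
            v = self.parent[v]

    def link(self, v, w):
        # assumes v is a root and v, w are in different trees
        self.parent[v] = w

    def cut(self, v):
        # assumes v is not a root
        self.parent[v] = None
```

For instance, after linking c under b under a, findroot(c) returns a in time proportional to the depth of c, which is exactly the cost that the implicit representation of this section avoids.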

The data structure of Sleator and Tarjan is known as linking and cutting trees and supports all the above operations in O(log n) time, by representing the structure of the trees implicitly. Other operations that can be supported within the same time bound are changing the root of a tree, finding the parent of a vertex, and finding the least common ancestor of two vertices. In particular, the possibility of making a given vertex v root of a tree makes the data structure powerful enough to handle problems requiring linking and cutting of free (i.e., unrooted) trees. Furthermore, the same time bounds can be obtained when real costs are associated with edges rather than with vertices.

The rest of this section is organized as follows. In Section 36.2.1 we show how to implement the operations given above using simpler primitives defined on paths (rather than on trees), and in Section 36.2.2 we describe the implementation of these primitives on paths. For simplicity, we only describe a solution that achieves O(log n) amortized (rather than worst-case) time per operation. Details of all these results may be found in [2].

36.2.1 Using Operations on Vertex-Disjoint Paths

In this section we show how to reduce the operations on trees to a suitable collection of operations on vertex-disjoint paths. Assume we know how to perform the following operations:

makepath(v): initialize a new path consisting of a single vertex v with cost zero;

findpath(v): return the path containing vertex v;

findpathtail(p): return the tail (last vertex) of path p;

findpathcost(p): return a vertex of minimum cost in path p;

addpathcost(p, δ): add the real value δ to the cost of every vertex in path p;

join(p, v, q): merge path p, vertex v, and path q into a single path by inserting one edge from the tail of p to v, and one edge from v to the head (first vertex) of q, and return the new path. Either p or q can be empty;

split(v): divide the path containing vertex v into at most three paths by deleting the edges incident to v. Return the two new paths p (containing all the vertices before v) and q (containing all the vertices after v). Again, either p or q can be empty.

In order to solve the problem of linking and cutting trees, we partition each tree into a set of vertex-disjoint paths. Each tree operation will be defined in terms of one or more path operations. This partition is defined by allowing each tree edge to be either solid or dashed and by maintaining the invariant that at most one solid edge enters each vertex (we consider an edge oriented from a child to its parent). Removing the dashed edges therefore partitions the tree into vertex-disjoint solid paths. Dashed edges are represented implicitly: we associate with each path p its successor, that is, the vertex entered by the dashed edge leaving the tail of p. If the tail of p is a root, successor(p) is null. Each path is represented by a vertex on it (an empty path being represented by null). In order to convert dashed edges to solid (and vice versa) we use the following operation:

expose(v): make the tree path from v to findroot(v) solid. This is done by converting dashed edges in the path to solid, and solid edges incident to the path to dashed. Return the resulting solid path.

Now we describe how to implement tree operations in terms of path operations:

a maketree(v) is done by a makepath(v) followed by setting successor(v) to null;

a findroot(v) is a findpathtail(expose(v));

a findcost(v) is a findpathcost(expose(v));

an addcost(v, δ) is an addpathcost(expose(v), δ);

a link(v, w) is implemented by performing first an expose(v) that makes v into a one-vertex solid path, then an expose(w) that makes the path from w to its root solid, and then by joining these two solid paths: in short, this means assigning null to successor(join(null,expose(v),expose(w)));

to perform a cut(v), we first perform an expose(v), which leaves v with no entering solid edge. We then perform a split(v), which returns paths p and q: since v is the head of its solid path and is not a tree root, p will be empty, while q will be non-empty. We now complete the operation by setting both successor(v) and successor(q) to null.

To conclude, we need to show how to perform an expose, that is, how to convert all the dashed edges in the path from a given vertex to the tree root to solid, maintaining the invariant that at most one solid edge enters each vertex. Let x be a vertex of this path such that the edge from x to its parent w is dashed (and thus w = successor(x)). What we would like to do is to convert edge (x, w) into solid, and to convert the solid edge previously entering w (if any) into dashed. We call this operation a splice. The pseudocode in Figure 36.1 implements expose(v) as a sequence of splices. Path p, initialized to be empty, at the end of the execution will contain the solid path from v to findroot(v). Each iteration of the while loop performs a splice at v by converting to solid the edge from the tail of p to v (if p ≠ null) and to dashed the edge from the tail of q to v (if q ≠ null). A step-by-step execution of expose on a running example is shown in Figure 36.2.

fig36_1.jpg

Figure 36.1 Implementation of expose(v).

fig36_2.jpg

Figure 36.2 Effect of expose(v). (a) The original decomposition into solid paths; (b–e) vertices and paths after the execution of line 5 in the four consecutive iterations of the while loop; (f) the decomposition into solid paths after expose(v).
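To make the splice loop concrete, here is a runnable toy model of expose (our own simplification, not the pseudocode of Figure 36.1): each solid path is a plain Python list ordered from head (deepest vertex) to tail (closest to the root), path_of maps each vertex to its path, and successor maps a path's tail to the vertex entered by the dashed edge leaving it (None at a tree root). Every path operation here takes linear time; only the splice logic is faithful.

```python
def expose(v, path_of, successor):
    """Make the path from v to its tree root solid, splice by splice."""
    p = []                                     # solid path built so far
    while v is not None:
        path = path_of[v]
        i = path.index(v)
        before, after = path[:i], path[i + 1:]  # split(v)
        w = successor.get(path[-1])             # next dashed edge upward
        if before:                              # edge tail(before) -> v
            for u in before:                    # ... becomes dashed
                path_of[u] = before
            successor[before[-1]] = v
        p = p + [v] + after                     # join(p, v, after): splice
        for u in p:
            path_of[u] = p
        v = w
    successor[p[-1]] = None                     # tail of p is the root
    return p
```

For example, with solid paths [x, y, r] and [u, v] and a dashed edge from v to y, expose(u) returns the solid path [u, v, y, r] and turns the edge (x, y) dashed, so that findroot(u) is simply the tail r of the exposed path.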

From the description above, each tree operation takes O(1) path operations and at most one expose. Each splice within an expose requires O(1) path operations. Hence, in order to compute the running time, we need first to count the number of splices per expose and then to show how to implement the path operations. With respect to the former point, Sleator and Tarjan prove that a sequence of m tree operations causes O(m log n) splices, and thus O(log n) splices amortized per expose.

Theorem 36.1

[2] Any sequence of m tree operations (including n maketree) requires O(m) path operations and at most m exposes. The exposes can be implemented with O(m log n) splices, each of which requires O(1) path operations.

36.2.2 Implementing Operations on Vertex-Disjoint Paths

We now describe how to represent solid paths in order to implement tree operations efficiently. Each solid path is represented by a binary tree whose nodes, in symmetric order, are the vertices of the path; each node x contains pointers to its parent p(x), to its left child l(x), and to its right child r(x). We call the tree representing a solid path a solid tree. The vertex representing a solid path is the root of the corresponding solid tree, and thus the root of the solid tree contains a pointer to the successor of the path in the dynamic tree.

Vertex costs are represented as follows. Let cost(x) be the cost of vertex x, and let mincost(x) be the minimum cost among the descendants of x in its solid tree. Rather than storing these two values, in order to implement addcost operations efficiently, we store at x the incremental quantities Δcost(x) and Δmin(x) defined as follows:

Δcost(x) = cost(x) − mincost(x)

(36.1)

Δmin(x) = mincost(x), if x is a solid tree root; Δmin(x) = mincost(x) − mincost(p(x)), otherwise.

(36.2)

An example of this representation is given in Figure 36.3. Given Δcost and Δmin, we can compute mincost(x) by summing up Δmin for all vertices on the solid tree path from the root to x, and cost(x) as mincost(x) + Δcost(x). Moreover, note that Δcost(x) = 0 if and only if x is a minimum cost node in the subtree rooted at x. If this is not the case and the minimum cost node is in the right subtree, then Δmin(r(x)) = 0; otherwise Δmin(l(x)) = 0. With this representation, rotations can still be implemented in O(1) time. The path operations can be carried out as follows:

makepath(v): initialize a binary tree of one vertex v with Δmin(v) = 0 and Δcost(v) = 0.

findpath(v): starting from v, follow parent pointers in v’s solid tree until a node with no parent is found. Return this node.

findpathtail(p): assuming that p is the root of a solid tree, follow right pointers and return the rightmost node in the solid tree.

findpathcost(p): initialize v to p and repeat the following step until Δcost(v) = 0: if v has a right child and Δmin(r(v)) = 0, replace v by r(v); otherwise, replace v by l(v). At the end, return v.

addpathcost(p, δ): add δ to Δmin(p).

join(p, v, q): join the solid trees with roots p, v and q.

split(v): split the solid tree containing node v.

fig36_3.jpg

Figure 36.3 Representing solid paths with binary trees. (a) A solid path and its vertex costs; (b) solid tree with explicit costs (the bold value is cost(v) and the italic value is mincost(v)); (c) corresponding solid tree with incremental costs (the bold value is Δcost(v) and the italic value is Δmin(v)).
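The incremental representation can be exercised with a small sketch (a toy illustration of the fields above, not the self-adjusting implementation of [2]): mincost(x) is recovered by summing Δmin from the solid-tree root down to x, cost(x) follows from Equation 36.1, and findpathcost descends exactly as described in the list above.

```python
class Node:
    """A solid-tree node storing only the incremental fields."""

    def __init__(self, dmin, dcost, left=None, right=None):
        self.dmin, self.dcost = dmin, dcost
        self.left, self.right = left, right
        self.parent = None
        for c in (left, right):
            if c is not None:
                c.parent = self

def mincost(x):
    # sum Delta-min on the solid-tree path from the root down to x
    total = x.dmin
    while x.parent is not None:
        x = x.parent
        total += x.dmin
    return total

def cost(x):
    # Equation 36.1 rearranged: cost(x) = mincost(x) + Delta-cost(x)
    return mincost(x) + x.dcost

def findpathcost(p):
    # descend from the solid-tree root p toward a minimum-cost vertex
    v = p
    while v.dcost != 0:
        if v.right is not None and v.right.dmin == 0:
            v = v.right     # a minimum lies in the right subtree
        else:
            v = v.left      # otherwise it lies in the left subtree
    return v
```

For instance, a three-vertex path with costs 3, 2, 1 stored at the left child, root, and right child of the solid tree gives Δmin = 2, 1, 0 and Δcost = 0, 1, 0, and findpathcost returns the right child, the vertex of cost 1.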

We observe that the operations findpath, findpathtail, and findpathcost are essentially look-ups in a search tree, while split and join are exactly the same operations on search trees. If we represent solid paths by means of balanced search trees, the time per path operation becomes O(log n), and by Theorem 36.1 any sequence of m tree operations can be supported in O(m log^2 n) time. Using self-adjusting binary search trees [10] to represent solid paths, together with a more careful analysis, yields a better bound:

Theorem 36.2

[2] Any sequence of m tree operations (including n maketree) requires O(m log n) time.

Insights on the use of self-adjusting binary search trees in the implementation of path operations are given in Chapter 13. Using biased search trees [11], the O(log n) amortized bound given in Theorem 36.2 can be made worst-case. Details can be found in [2].

36.3 Topology Trees

Topology trees were introduced by Frederickson [3] in order to maintain information about trees subject to insertions and deletions of edges and to answer efficiently, for example, tree membership queries. Similarly to the linking and cutting trees of Sleator and Tarjan [2] that we discussed in Section 36.2, topology trees follow the idea of partitioning a tree into a set of vertex-disjoint paths. However, they are very different in how this partition is chosen, and in the data structures used to represent the paths inside the partition. Indeed, Sleator and Tarjan [2] use a simple partition of the trees, based upon a careful choice of sophisticated data structures to represent paths. On the contrary, Frederickson [3] uses a more sophisticated partition that is based upon the topology of the tree; this implies more complicated algorithms but simpler data structures for representing paths.

The basic idea is to partition the tree into a suitable collection of subtrees, called clusters, and to implement updates such that only a small number of such clusters is involved. The decomposition defined by the clusters is applied recursively to get faster update and query times.

In order to illustrate how such a recursive decomposition is computed, we assume that T has maximum vertex degree 3: this is without loss of generality, since a standard transformation can be applied if this is not the case [12]. Namely, each vertex v of degree d > 3 is replaced by new vertices v0,…, vd − 1; for each neighbor ui of vertex v, 0 ≤ id − 1, edge (v, ui) is replaced by (vi, ui), and a new edge (vi, vi + 1) is created if i < d − 1.
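A possible encoding of this degree-reduction transformation is sketched below; the pair-based names (v, i) for the new vertices are just one arbitrary choice of encoding.

```python
def reduce_degree(edges):
    """edges: list of (u, v) pairs of a tree. Returns a new edge list
    in which every vertex has degree at most 3: a vertex v of degree
    d > 3 is expanded into chain vertices (v, 0), ..., (v, d - 1)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)

    def port(v, u):
        # endpoint used on v's side for the edge to neighbor u_i:
        # v itself if deg(v) <= 3, otherwise the chain vertex (v, i)
        return v if len(adj[v]) <= 3 else (v, adj[v].index(u))

    new_edges = [(port(a, b), port(b, a)) for a, b in edges]
    for v, nbrs in adj.items():
        d = len(nbrs)
        if d > 3:   # connect the chain (v, 0) - (v, 1) - ... - (v, d-1)
            new_edges += [((v, i), (v, i + 1)) for i in range(d - 1)]
    return new_edges
```

On a star with center c and four leaves, the center is expanded into a chain of four vertices, each of degree at most 3, and the result is again a tree (8 vertices, 7 edges).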

Given a tree T of maximum degree 3, a cluster is any connected subgraph of T. The cardinality and the external degree of a cluster are the number of its vertices and the number of tree edges incident to it, respectively. We now define a partition of the vertices of T such that the resulting clusters possess certain useful properties. Let z be a positive integer.

Definition 36.1

A restricted partition of order z w.r.t. T is a partition of the vertex set V into clusters of external degree at most 3 such that:

1.Each cluster of external degree 3 has cardinality 1.

2.Each cluster of external degree <3 has cardinality at most z.

3.No two adjacent clusters can be combined and still satisfy the above.

A restricted partition of order 2 of a tree T is shown in Figure 36.4. There can be several restricted partitions for a given tree T, based upon different choices of the vertices to be unioned. For instance, vertex 8 in Figure 36.4 could be unioned with 7, instead of 11, and the partition would still be valid. It can be proved that a restricted partition of order z has Θ(n/z) clusters [3,13].

fig36_4.jpg

Figure 36.4 A restricted partition of order 2 of a tree T.

We now show that the partition defined above can be applied recursively for Θ(log n) levels. Such a recursive application yields a restricted multilevel partition [3,13], from which the topology tree can be finally obtained.

Definition 36.2

A topology tree is a hierarchical representation of a tree T such that each level of the topology tree partitions the vertices of T into clusters. Clusters at level 0 contain one vertex each. Clusters at level ℓ ≥ 1 form a restricted partition of order 2 of the vertices of the tree T obtained by shrinking each cluster at level ℓ − 1 into a single vertex.

As shown in Figure 36.5, level ℓ of the restricted multilevel partition is obtained by computing a restricted partition of order 2 with respect to the tree resulting from viewing each cluster at level ℓ − 1 as a single vertex. Figure 36.5 also shows the topology tree corresponding to the restricted multilevel partition. Call a cluster of level ℓ − 1 matched if it is unioned with another cluster to give a cluster of level ℓ: unmatched clusters have a unique child in the topology tree. It can be proved that, for any level ℓ > 0 of a restricted multilevel partition, the number of matched clusters at level ℓ − 1 is at least 1/3 of the total number of clusters at level ℓ − 1. Since each pair of matched clusters is replaced by their union at level ℓ, the number of clusters at level ℓ is at most 5/6 of the number of clusters at level ℓ − 1. The number of levels of the topology tree is therefore Θ(log n).

fig36_5.jpg

Figure 36.5 Restricted multilevel partition and corresponding topology tree.

36.3.1 Construction

It is sufficient to show how to compute a restricted partition: the levels of the topology tree can then be built in a bottom-up fashion by repeatedly applying the clustering algorithm as suggested by Definition 36.2. Because of property (3) in Definition 36.1, it is natural to compute a restricted partition according to a locally greedy heuristic, which does not always obtain the minimum number of clusters, but has the advantage of requiring only local adjustments during updates. The tree is first rooted at any vertex of degree 1 and the procedure cluster is called with the root as argument. At a generic step, procedure cluster(v) works as follows. It initializes the cluster C(v) containing vertex v as C(v) = {v}. Then, for each child w of v, it recursively calls cluster(w), which computes C(w): if C(w) can be unioned with C(v) without violating the size and degree bounds in Definition 36.1, C(v) is updated as C(v) ∪ C(w); otherwise C(w) is output as a cluster. As an example, the restricted partition shown in Figure 36.4 is obtained by running procedure cluster on the tree rooted at vertex 7.
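The recursion shape of procedure cluster can be sketched as follows; for brevity this simplified version enforces only the cardinality bound of condition (2) in Definition 36.1, whereas the actual procedure also checks the external-degree bounds.

```python
def cluster(v, children, z, out):
    """Greedy clustering of the subtree rooted at v. children maps a
    vertex to its list of children; clusters that cannot be absorbed
    are appended to out; the (still open) cluster C(v) is returned."""
    C = [v]
    for w in children.get(v, []):
        Cw = cluster(w, children, z, out)
        if len(C) + len(Cw) <= z:   # union C(v) and C(w)
            C += Cw
        else:                       # emit C(w) as a final cluster
            out.append(Cw)
    return C

# clustering the path 1 - 2 - 3 - 4 - 5, rooted at vertex 1, with z = 2
children = {1: [2], 2: [3], 3: [4], 4: [5]}
clusters = []
clusters.append(cluster(1, children, 2, clusters))
```

On this path the heuristic produces the clusters {4, 5}, {2, 3}, and the singleton {1}: the greedy union of 4 with 5 forces 3 into a cluster with 2, leaving the root alone.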

36.3.2 Updates

We first describe how to update the clusters of a restricted partition when an edge e is inserted in or deleted from the dynamic tree T: this operation is the crux of the update of the entire topology tree.

Update of a restricted partition. We start from edge deletion. First, removing an edge e splits T into two trees, say T1 and T2, which inherit all of the clusters of T, possibly with the following exceptions.

1.Edge e is entirely contained in a cluster: this cluster is no longer connected and therefore must be split. After the split, we must check whether each of the two resulting clusters is adjacent to a cluster of tree degree at most 2, and if these two adjacent clusters together have cardinality ≤ 2. If so, we combine these two clusters in order to maintain condition (3).

2.Edge e is between two clusters: in this case no split is needed. However, since the tree degree of the clusters containing the endpoints of e has been decreased, we must check if each cluster should be combined with an adjacent cluster, again because of condition (3).

Similar local manipulations can be applied to restore invariants (1)–(3) in Definition 36.1 in case of edge insertions. We now come to the update of the topology tree.

Update of the topology tree. Each level can be updated upon insertions and deletions of edges in tree T by applying a few locally greedy adjustments similar to the ones described above. In particular, a constant number of basic clusters (corresponding to leaves in the topology tree) are examined: the changes in these basic clusters percolate up in the topology tree, possibly causing vertex clusters to be regrouped in different ways. The fact that only a constant amount of work has to be done on O(log n) topology tree nodes implies a logarithmic bound on the update time.

Theorem 36.3

[3,13] The update of a topology tree because of an edge insertion or deletion in the dynamic tree T can be supported in O(log n) time, where n is the number of vertices of T.

36.3.3 Applications

In the fully dynamic tree membership problem we would like to maintain a forest of unrooted trees under insertion of edges (which merge two trees into one), deletion of edges (which split one tree into two), and membership queries. Typical queries return the name of the tree containing a given vertex, or ask whether two vertices are in the same tree. Most of the solutions presented in the literature root each tree arbitrarily at one of its vertices; by keeping extra information at the root (such as the name of the tree), membership queries reduce to finding the root of the tree containing a given vertex.

The dynamic tree clustering techniques of Frederickson have also found wide application in dynamic graph algorithms. Namely, topology trees were originally designed in order to solve the fully dynamic minimum spanning tree problem [3], in which we wish to maintain a minimum spanning tree of a dynamic weighted undirected graph upon insertions/deletions of edges and edge cost changes. Let G = (V, E) be the dynamic graph and let S be a designated spanning tree of G. As G is updated, edges in the spanning tree S may change: for example, if the cost of an edge e is increased in G and e is in the spanning tree, we need to check for the existence of a replacement edge e′ of smaller cost, and swap e′ with e in S. The clustering approach proposed in [3,13] consists of partitioning the vertex set V into subtrees connected in S, so that each subtree is only adjacent to a few other subtrees. A topology tree is used for representing this recursive partition of the spanning tree S. A generalization of topology trees, called 2-dimensional topology trees, is also formed from pairs of nodes in the topology tree in order to maintain information about the edges in E − S [3]. Fully dynamic algorithms based on a single level of clustering typically obtain time bounds of the order of O(m^{2/3}) (see for instance [14,15]), where m is the number of edges of the graph. When the partition is applied recursively, better O(m^{1/2}) time bounds can be achieved by using 2-dimensional topology trees: we refer the interested reader to [3,13] for details. As we will see in Section 36.5.2, Frederickson's algorithm is not optimal: the fully dynamic minimum spanning tree problem was later solved in polylogarithmic time [16].

With the same technique, an O(m^{1/2}) time bound can also be obtained for fully dynamic connectivity and 2-edge connectivity [3,13]. For instance, [13] shows that edges and vertices can be inserted into or deleted from an undirected graph in O(m^{1/2}) time, and a query as to whether two vertices are in the same 2-edge-connected component can be answered in O(log n) time, n being the number of vertices. This result is based on the use of ambivalent data structures [13], a refinement of the clustering technique in which edges can belong to multiple groups, only one of which is actually selected, depending on the topology of the given spanning tree.

36.4 Top Trees

Top trees were introduced by Alstrup et al. [4] to maintain efficiently information about paths in trees, such as, for example, the maximum weight on the path between any pair of vertices in a tree. The basic idea is taken from Frederickson's topology trees, but instead of partitioning vertices, top trees work by partitioning edges: the same vertex can then appear in more than one cluster. Top trees can also be seen as a natural generalization of standard balanced binary trees over dynamic collections of lists that may be concatenated and split, where each node of the balanced binary tree represents a segment of a list. As we will see, in the terminology of top trees this is just a special case of a cluster.

We follow here the presentation in [5]. Similarly to [3,13], a cluster is a connected subtree of the dynamic tree T, with the additional constraint that at most two vertices, called boundary vertices, have edges out of the subtree. We denote the boundary of a cluster C by δC. If the boundary contains two vertices u and v, we call the unique path between u and v in T the cluster path of C, and we denote it by π(C). If |δC| < 2, then π(C) = ∅. Two clusters C1 and C2 are called neighbors if their intersection contains exactly one vertex: since clusters are connected and have no edges in common, the intersection vertex must be in δC1 ∩ δC2. It is also possible to define a boundary δT, consisting of one or two vertices, for the entire tree T: we call such vertices, if any, external boundary vertices. If external boundary vertices are defined, we have to extend the notion of boundary of a cluster: namely, if a cluster C contains an external boundary vertex v, then v ∈ δC even if v has no edge out of C.

Definition 36.3

A top tree T over a pair (T, δT) is a binary tree such that:

The leaves of T are the edges of T.

The internal nodes of T are clusters of T.

The subtree represented by each internal node is the union of the subtrees represented by its two children, which must be neighbors.

The root of T represents the entire tree T.

The height of T is O(log n), where n is the number of vertices of T.

A tree with a single node has an empty top tree. Figure 36.6 shows the clusters at different levels of the recursive partition and the corresponding top tree. Note that a vertex can appear in many clusters, as many as Θ(n) in the worst case. However, it can be a non-boundary vertex in only O(log n) clusters. Indeed, for each vertex v which is neither an external boundary vertex nor a leaf in T, there exists a unique cluster C with children A and B such that v ∈ δA, v ∈ δB, and v ∉ δC. Then v is a non-boundary vertex only in cluster C and in all its ancestors in the top tree.

fig36_6.jpg

Figure 36.6 Clusters and top tree of a tree T. Edges with the same color, thickness, and pattern are in the same cluster. Boundary vertices are grey. External boundary vertices are squared.

A locally greedy approach similar to the one described in Section 36.3.1 for topology trees can be used to build a top tree. The only modifications are to reason in terms of edges instead of vertices, and to check the condition on the cardinality of the boundary before unioning any two neighboring clusters.

36.4.1 Updates

Given a dynamic forest, top trees over the trees of the forest are maintained under the following operations:

link(u, v), where u and v are in different trees Tu and Tv of the forest: link trees Tu and Tv by adding edge (u, v);

cut(e): remove edge e from the forest;

expose(u, v), where u and v are in the same tree T of the forest: make u and v the external boundary vertices of T and return the new root cluster of the top tree over T.

Top trees can be maintained under these operations by making use of two basic merge and split primitives:

merge: takes two top trees whose roots are neighbor clusters and joins them to form a single top tree;

split: this is the reverse operation, deleting the root of a given top tree.

The implementation of the update operations starts with a sequence of splits of all the ancestor clusters of the edges whose boundary changes, and finishes with a sequence of merges. In the case of insertions and deletions, since an endpoint v of an edge must already be a boundary vertex of the edge if v is not a leaf, an update can change the boundary of at most two edges, excluding the edge being inserted or deleted. From [4,5] we have:

Theorem 36.4

[4,5] For a dynamic forest we can maintain top trees of height O(log n) supporting each link, cut, and expose operation with a sequence of O(log n) splits and merges. The sequence itself is identified in O(log n) time. The space usage of top trees is linear in the size of the dynamic forest.

36.4.2 Representation and Applications

Top trees are represented as standard binary trees, with pointers to parent and children for each node. With each leaf is associated the corresponding edge in T and with each internal node the at most two boundary vertices of the corresponding cluster. In addition, each vertex v of T has a pointer to the deepest cluster for which v is a non-boundary vertex, or to the root cluster containing v if v is an external boundary vertex. Given this representation, top trees can be used as a black box for maintaining different kinds of information. Typically, the user needs to attach extra information to the top tree nodes and to show how such information can be maintained upon merge and split operations.

A careful choice of the extra information makes it possible to maintain easily path properties of trees, such as the minimum weight of an edge on the path between any two vertices. In this example, the extra information wC associated with a cluster C is the weight of the lightest edge on the cluster path π(C). Before showing how to maintain it, note that if cluster A is a child of cluster C in the top tree and A contains an edge from π(C), then π(A) ⊆ π(C): we call A a path child of C. When a cluster is created by a merge, we store as extra information the minimum weight stored at its path children. In case of a split, we just discard the information. Now, in order to find the minimum weight between any two vertices u and v, we compute the root cluster C of the top tree in which u and v are external boundary vertices by calling expose(u,v). Then π(C) is the path between u and v and wC is exactly the value we are looking for.
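The bookkeeping of this example can be sketched as follows; the cluster dictionaries and the is_path_child predicate (which the top tree machinery would have to supply) are hypothetical stand-ins for illustration.

```python
import math

def merge_w(children, is_path_child):
    """Extra information for lightest-edge-on-path queries: each child
    cluster carries w, the minimum weight on its cluster path (infinite
    when the cluster path is empty). The merged cluster stores the
    minimum of w over its path children."""
    ws = [c["w"] for c in children if is_path_child(c)]
    return min(ws) if ws else math.inf

def split_w(cluster):
    # on a split the extra information is simply discarded
    pass
```

If both children lie on the parent's cluster path, the parent stores the smaller of the two weights; a child off the cluster path contributes nothing.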

Top trees can be used quite easily if the property we wish to maintain is a local property, that is, one which, if satisfied by a vertex/edge in a tree, is also satisfied in all the subtrees containing that vertex/edge. Non-local properties appear to be more challenging. For general non-local searching the user has to supply a function select that can be used to guide a binary search towards a desired edge: given the root of a top tree, the function selects one of the two children according to the property to be maintained. Since the property is non-local, in general it is not possible to recurse directly on the selected child as is. However, Alstrup et al. [5] show that the top tree can be temporarily modified by means of a few merge operations so that select can be provided with the "right" input in the recursive call and guide the search to a correct solution.

Lemma 36.1

Given a top tree, after O(log n) calls to select, merge, and split, there is a unique edge (u, v) contained in all clusters chosen by select, and then (u, v) is returned.

We refer the interested reader to [5] for the proof of Lemma 36.1, which shows how to modify the top tree in order to facilitate calls to select. Here we limit ourselves to using the search as a black box in order to show how to dynamically maintain the center of a tree (i.e., a vertex that maximizes the distance from any other vertex) using top trees.

The extra information maintained for a cluster C with boundary vertices a and b is: the distance dist(C) between a and b, and the lengths ℓa(C) and ℓb(C) of a longest path in C departing from a and b, respectively. We now show how to compute the extra information for a cluster C obtained by merging two neighboring clusters A and B. Let c be a boundary vertex of cluster C and, w.l.o.g., let c ∈ δA. The longest path from c to a vertex in A has length ℓc(A). Instead, in order to get from c to a vertex in B, we must traverse a vertex, say x, such that x ∈ δA ∩ δB: thus, the longest path from c to a vertex in B has length dist(c, x) + ℓx(B). This is equal to ℓc(B) if c ∈ δB (i.e., if x = c), or to dist(A) + ℓx(B) if c ∉ δB. Hence, we set ℓc(C) = max{ℓc(A), dist(c, x) + ℓx(B)}. We can compute dist(C) similarly. Finally, function select can be implemented as follows. Given a cluster C with children A and B, let u be the vertex in the intersection of A and B: if ℓu(A) ≥ ℓu(B), select picks A, otherwise it picks B. The correctness of select relies on the fact that if ℓu(A) ≥ ℓu(B), then A contains all the centers of C. Using Lemma 36.1 we can finally conclude that the center of a tree can be maintained dynamically in O(log n) time.
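The merge rule just described can be sketched, for the simple case in which A and B are path clusters joined end-to-end at a shared boundary vertex x, as follows (an illustrative fragment; the dictionary layout and function names are our own):

```python
def merge_info(A, B, x):
    """Merge path clusters A (boundary a, x) and B (boundary x, b) sharing x.
    Each cluster is a dict with 'boundary' (its two boundary vertices),
    'dist' (distance between them), and 'l' (for each boundary vertex,
    the length of a longest path in the cluster starting there)."""
    a = next(v for v in A['boundary'] if v != x)
    b = next(v for v in B['boundary'] if v != x)
    return {
        'boundary': (a, b),
        'dist': A['dist'] + B['dist'],
        'l': {
            # from a: either stay inside A, or cross x and continue into B
            a: max(A['l'][a], A['dist'] + B['l'][x]),
            # from b: symmetric
            b: max(B['l'][b], B['dist'] + A['l'][x]),
        },
    }

def select(A, B, u):
    """Guide the binary search for the center: pick the child with the
    longer path leaving the shared boundary vertex u."""
    return A if A['l'][u] >= B['l'][u] else B
```

For instance, merging a path of length 2 with boundary (a, x) and a path of length 3 with boundary (x, b) yields dist(C) = 5 and longest paths of length 5 from both a and b.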

We refer the interested reader to [4,5] for other sample applications of top trees, such as maintaining the median or the diameter of dynamic trees. As we will recall in Section 36.5.2, top trees are fundamental in the design of the fastest algorithm for the fully dynamic minimum spanning tree problem [16].

36.5ET Trees

ET trees, first introduced in [6], have later been used to design algorithms for a variety of problems on dynamic graphs, such as fully dynamic connectivity and bipartiteness (see, e.g., [7,17]). They provide an implicit representation of dynamic forests whose vertices are associated with weighted or unweighted keys. In addition to arbitrary edge insertions and deletions, updates allow adding or removing the weighted key associated with a vertex. The supported queries are the following:

connected(u, v): tells whether vertices u and v are in the same tree.

size(v): returns the number of vertices in the tree that contains v.

minkey(v): returns a key of minimum weight in the tree that contains v; if keys are unweighted, an arbitrary key is returned.

The main concept for the definition of ET trees is that of Euler tour.

Definition 36.4

An Euler tour of a tree T is a maximal closed walk over the graph obtained by replacing each edge of T by two directed edges with opposite direction.

The Euler tour of a tree can be easily computed in linear time by rooting the tree at an arbitrary vertex and running a depth-first visit [18]. Each time a vertex v is encountered on the walk, we call this an occurrence of v and we denote it by o(v). A vertex of degree Δ has exactly Δ occurrences, except for the root, which is visited Δ + 1 times. Furthermore, the walk traverses each directed edge exactly once; hence, if T has n vertices, the Euler tour traverses 2n − 2 edges and consists of 2n − 1 vertex occurrences. Given an n-vertex tree T, we encode it with the sequence of 2n − 1 symbols given by an Euler tour. We refer to this encoding as ET(T). For instance, the encoding of the tree given in Figure 36.7, derived from the Euler tour shown below the tree itself, is ET(T) = a b c b d b g f g e g b a.

fig36_7.jpg

Figure 36.7A tree, an Euler tour, and the corresponding ET tree with d = 2.
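The linear-time construction of ET(T) is a plain depth-first visit. The sketch below (our own illustrative code; the adjacency-list encoding is an assumption) reproduces the encoding of the tree of Figure 36.7:

```python
def euler_tour(adj, root):
    """Return the Euler-tour encoding ET(T) of the tree adj rooted at root."""
    tour = []
    def dfs(u, parent):
        tour.append(u)                # an occurrence of u
        for v in adj[u]:
            if v != parent:
                dfs(v, u)
                tour.append(u)        # back at u after visiting child v
    dfs(root, None)
    return tour

# The tree of Figure 36.7: edges a-b, b-c, b-d, b-g, g-f, g-e.
adj = {'a': ['b'], 'b': ['a', 'c', 'd', 'g'], 'c': ['b'], 'd': ['b'],
       'g': ['b', 'f', 'e'], 'f': ['g'], 'e': ['g']}
# euler_tour(adj, 'a') yields a b c b d b g f g e g b a  (2n - 1 = 13 symbols)
```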

Definition 36.5

An ET tree is a dynamic balanced d-ary tree over some Euler tour around T. Namely, the leaves of the ET tree are the nodes of the Euler tour, in the same order in which they appear (see Figure 36.7).

An ET tree has O(n) nodes due to the linear length of the Euler tour and to properties of d-ary trees. However, since each vertex of T may occur several times in the Euler tour, an arbitrary occurrence is marked as representative of the vertex.

36.5.1Updates

We first analyze how to update the encoding ET(T) when T is subject to dynamic edge operations. If an edge e = (u, v) is deleted from T, denote by Tu and Tv the two trees obtained after the deletion, with u ∈ Tu and v ∈ Tv. Let o1(u), o1(v), o2(u), and o2(v) be the occurrences of u and v encountered during the two traversals of edge (u, v). Without loss of generality assume that o1(u) < o1(v) < o2(v) < o2(u), so that ET(T) = αo1(u)βo1(v)γo2(v)δo2(u)ε. Then ET(Tv) is given by the interval o1(v)γo2(v), and ET(Tu) can be obtained by splicing this interval out of ET(T), that is, ET(Tu) = αo1(u)βδo2(u)ε.

If two trees T1 and T2 are joined in a new tree T because of a new edge e = (u, v), with u ∈ T1 and v ∈ T2, we first reroot T2 at v. Now, given ET(T1) = αo1(u)β and the computed encoding ET(T2) = o1(v)γo2(v), we compute ET(T) = αo1(u)o1(v)γo2(v)o(u)β, where o(u) is a newly created occurrence of vertex u. To complete the description, we need to show how to change the root of a tree T from r to another vertex s. Let ET(T) = o(r)αo1(s)β, where o1(s) is any occurrence of s. Then the new encoding is o1(s)βαo(s), where o(s) is a newly created occurrence of s appended at the end of the new sequence.
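These manipulations can be sketched on plain Python lists as follows. This is an illustrative fragment only: occurrences are identified by their positions, and sequences are copied rather than spliced, whereas in the actual data structure each operation is performed on a balanced search tree. After a cut, the two occurrences of u that surrounded the deleted edge become adjacent, and one of them is removed (a single-occurrence deletion).

```python
def cut(ET, i, j):
    """Delete the edge (u, v) whose subtree interval is ET[i..j]: given
    ET = alpha o1(u) beta o1(v) gamma o2(v) delta o2(u) eps, i and j are
    the positions of o1(v) and o2(v). Returns (ET(Tu), ET(Tv))."""
    ETv = ET[i:j + 1]                 # o1(v) gamma o2(v)
    ETu = ET[:i] + ET[j + 1:]         # splice the interval out
    if 0 < i < len(ETu) and ETu[i] == ETu[i - 1]:
        ETu.pop(i)                    # collapse two adjacent occurrences of u
    return ETu, ETv

def reroot(ET, k):
    """Change the root to the vertex occurring at position k > 0."""
    return ET[k:] + ET[1:k] + [ET[k]]   # drop o(r), rotate, close the tour

def link(ETu, i, ETv):
    """Join by a new edge (u, v): ETu = alpha o1(u) beta with o1(u) at
    position i, and ETv is already rooted at v."""
    return ETu[:i + 1] + ETv + [ETu[i]] + ETu[i + 1:]
```

For example, cutting edge (b, g) from the encoding of Figure 36.7 (positions 6 and 10 hold the bracketing occurrences of g) yields ET(Tv) = g f g e g and ET(Tu) = a b c b d b a.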

In summary, if trees in the forest are linked or cut, a constant number of the following operations is required: (i) splicing out an interval from the encoding, (ii) inserting an interval into the encoding, (iii) inserting or (iv) deleting a single occurrence from the encoding. If the encoding ET(T) is stored in a balanced search tree of degree d, then one may perform each operation in time O(d log_d n) while maintaining the balance of the tree.

36.5.2Applications

The query connected(u, v) can be easily supported in time O(log_d n) by finding the roots of the ET trees containing u and v and checking whether they coincide. The same time suffices to check whether one element precedes another in the ordering.

To support size and minkey queries, each node q of the ET tree maintains two additional values: the number s(q) of representatives below it and the minimum weight key k(q) attached to a representative below it. These values can be easily maintained during updates and allow queries of the form size(v) and minkey(v) to be answered in O(log_d n) time for any vertex v of the forest: the root r of the ET tree containing v is found and the values s(r) and k(r), respectively, are returned. We refer the interested reader to [7] for additional details of the method.
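The maintenance of s(q) and k(q) can be sketched as follows for a binary (d = 2) ET tree (our own illustrative fragment; rebalancing, and the recomputation of aggregates along the root-to-leaf paths touched by an update, are omitted):

```python
import math

class ETNode:
    """A node of a binary ET tree storing the aggregates s(q) and k(q)."""
    def __init__(self, left=None, right=None, key=None, representative=False):
        self.left, self.right = left, right
        if left is None:
            # leaf: one occurrence; it counts only if it is the representative
            self.s = 1 if representative else 0
            self.k = key if (representative and key is not None) else math.inf
        else:
            # internal node: combine the children's aggregates
            self.s = left.s + right.s
            self.k = min(left.k, right.k)

def size(root):
    return root.s          # number of vertices (representatives) in the tree

def minkey(root):
    return root.k          # minimum weight key, or inf if there is none
```

Answering size(v) or minkey(v) then amounts to walking from any occurrence of v up to the root and reading the aggregate stored there.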

In Chapter 37 we will see how ET trees have been used [7,16] to design a polylogarithmic algorithm for fully dynamic connectivity. Here we merely observe that trees in a spanning forest of the dynamic graph are maintained using the Euler tour data structure, and therefore updates and connectivity queries within the forest can be supported in logarithmic time. The use of randomization and of a logarithmic decomposition makes it possible to maintain also non-tree edges in polylogarithmic time upon changes in the graph.

36.6Reachability Trees

In this section we consider a tree data structure that has been widely used to solve dynamic path problems on directed graphs.

The first appearance of this tool dates back to 1981, when Even and Shiloach showed how to maintain a breadth-first tree of an undirected graph under any sequence of edge deletions [8]; they used this as a kernel for decremental connectivity on undirected graphs. Later on, Henzinger and King [7] showed how to adapt this data structure to fully dynamic transitive closure in directed graphs. King [9] designed an extension of this tree data structure to weighted directed graphs for solving fully dynamic transitive closure (see Section 37.6.1) and all pairs shortest paths (see Section 37.7.1).

In the unweighted directed version, the goal is to maintain information about breadth-first search (BFS) on a directed graph G undergoing deletions of edges. In particular, in the context of dynamic path problems, we are interested in maintaining BFS trees of depth up to d, with dn. Given a directed graph G = (V, E) and a vertex rV, we would like to support any intermixed sequence of the following operations:

Delete(x, y): delete edge (x, y) from G.

Level(u): return the level of vertex u in the BFS tree of depth d rooted at r (return +∞ if u is not reachable from r within distance d).
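A simplified sketch of such a structure is the following (our own illustrative code: it keeps the levels correct by letting a vertex with no in-neighbor one level above it "sink", but it does not include the bookkeeping needed to match the amortized bounds of [9]):

```python
from collections import deque

INF = float('inf')

class DecrementalBFS:
    """Maintain BFS levels up to depth d from root r under edge deletions."""
    def __init__(self, n, edges, r, d):
        self.d = d
        self.out = [set() for _ in range(n)]
        self.inn = [set() for _ in range(n)]
        for x, y in edges:
            self.out[x].add(y)
            self.inn[y].add(x)
        self.level = [INF] * n
        self.level[r] = 0
        q = deque([r])                        # initial BFS up to depth d
        while q:
            u = q.popleft()
            if self.level[u] == d:
                continue
            for v in self.out[u]:
                if self.level[v] == INF:
                    self.level[v] = self.level[u] + 1
                    q.append(v)

    def _supported(self, v):
        """True if some in-neighbor lies exactly one level above v."""
        return any(self.level[x] == self.level[v] - 1 for x in self.inn[v])

    def delete(self, x, y):
        self.out[x].discard(y)
        self.inn[y].discard(x)
        q = deque([y])
        while q:
            v = q.popleft()
            if self.level[v] in (0, INF) or self._supported(v):
                continue
            self.level[v] += 1                # v sinks one level
            if self.level[v] > self.d:
                self.level[v] = INF           # v drops out of the BFS tree
            for w in self.out[v]:             # v's successors may lose support
                q.append(w)
            if self.level[v] != INF:
                q.append(v)                   # v itself may have to sink again

    def Level(self, u):
        return self.level[u]
```

Each vertex can sink at most d times before dropping out, which is the intuition behind the O(md) total cost over all deletions.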

In [9] it is shown how to maintain the BFS levels efficiently, supporting any Level operation in constant time and any sequence of Delete operations in O(md) overall time:

Lemma 36.2

(King [9]) Maintaining BFS levels up to depth d from a given root requires O(md) time in the worst case throughout any sequence of edge deletions in a directed graph with m initial edges.

Lemma 36.2 implies that maintaining BFS levels requires d times the time needed for constructing them. Since dn, we obtain a total bound of O(mn) if there are no limits on the depth of the BFS levels.

As it was shown in [7,9], it is possible to extend the BFS data structure presented in this section to deal with weighted directed graphs. In this case, a shortest path tree is maintained in place of BFS levels: after each edge deletion or edge weight increase, the tree is reconnected by essentially mimicking Dijkstra’s algorithm rather than BFS. Details can be found in [9].

36.7Conclusions

In this chapter we have surveyed data structures for maintaining properties of dynamically changing trees, focusing our attention on linking and cutting trees [2], topology trees [3], top trees [4], ET trees [6,7], and reachability trees [8]. We have shown that these data structures typically support updates and many different kinds of queries in logarithmic (amortized or worst-case) time. Hence, problems such as tree membership, maintaining the center or diameter of a tree, and finding the minimum cost on a given path can be solved in O(log n) time in a fully dynamic setting.

All the data structures for maintaining properties of dynamically changing trees are not only important and interesting on their own, but are often used as building blocks by many dynamic graph algorithms, as we will see in Chapter 37. Some of these data structures, such as the union-find data structures and the linking and cutting trees of Sleator and Tarjan, have a myriad of applications in other problems, such as implementing property grammars, solving computational geometry problems, testing equivalence of finite state machines, computing the longest common subsequence of two sequences, performing unification in logic programming and theorem proving, finding minimum spanning trees, and computing maximum flows. Since all these problems are outside the scope of this survey, we have not mentioned these applications and have restricted our attention to the applications of these data structures to dynamic tree and graph algorithms only.

Acknowledgments

This work has been supported in part by the IST Programmes of the EU under contract numbers IST-1999-14186 (ALCOM-FT) and IST-2001-33555 (COSIN), and by the Italian Ministry of University and Scientific Research (Project “ALINWEB: Algorithmics for Internet and the Web”).

References

1.R. E. Tarjan and J. van Leeuwen. Worst-case analysis of set union algorithms. J. Assoc. Comput. Mach., 31:245–281, 1984.

2.D. D. Sleator and R. E. Tarjan. A data structure for dynamic trees. J. Comp. Syst. Sci., 24:362–381, 1983.

3.G. N. Frederickson. Data structures for on-line updating of minimum spanning trees. SIAM J. Comput., 14:781–798, 1985.

4.S. Alstrup, J. Holm, K. de Lichtenberg, and M. Thorup. Minimizing diameters of dynamic trees. In Proc. 24th Int. Colloquium on Automata, Languages and Programming (ICALP 97), LNCS 1256, pages 270–280, 1997.

5.S. Alstrup, J. Holm, and M. Thorup. Maintaining center and median in dynamic trees. In Proc. 7th Scandinavian Workshop on Algorithm Theory (SWAT 00), pages 46–56, 2000.

6.R. E. Tarjan and U. Vishkin. An efficient parallel biconnectivity algorithm. SIAM J. on Computing, 14:862–874, 1985.

7.M. R. Henzinger and V. King. Randomized fully dynamic graph algorithms with polylogarithmic time per operation. J. Assoc. Comput. Mach., 46(4):502–536, 1999.

8.S. Even and Y. Shiloach. An on-line edge deletion problem. J. Assoc. Comput. Mach., 28:1–4, 1981.

9.V. King. Fully dynamic algorithms for maintaining all-pairs shortest paths and transitive closure in digraphs. In Proc. 40th Symposium on Foundations of Computer Science (FOCS 99), 1999.

10.D. D. Sleator and R. E. Tarjan. Self-adjusting binary search trees. J. Assoc. Comput. Mach., 32:652–686, 1985.

11.S. W. Bent, D. D. Sleator, and R. E. Tarjan. Biased search trees. SIAM J. Comput., 14(3):545–568, 1985.

12.F. Harary. Graph Theory. Addison-Wesley, Reading, MA, 1969.

13.G. N. Frederickson. Ambivalent data structures for dynamic 2-edge-connectivity and k smallest spanning trees. SIAM J. Comput., 26(2):484–538, 1997.

14.Z. Galil and G. F. Italiano. Fully-dynamic algorithms for 2-edge connectivity. SIAM J. Comput., 21:1047–1069, 1992.

15.M. Rauch. Fully dynamic biconnectivity in graphs. Algorithmica, 13:503–538, 1995.

16.J. Holm, K. de Lichtenberg, and M. Thorup. Poly-logarithmic deterministic fully-dynamic algorithms for connectivity, minimum spanning tree, 2-edge, and biconnectivity. J. Assoc. Comput. Mach., 48(4):723–760, 2001.

17.P. B. Miltersen, S. Subramanian, J. S. Vitter, and R. Tamassia. Complexity models for incremental computation. Theoret. Comput. Science, 130:203–236, 1994.

18.T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms, Second Edition. The MIT Press, Cambridge, MA, 2001.

*This chapter has been reprinted from first edition of this Handbook, without any content updates.
