Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 4 Splitting Method for Counting and Optimization

4.1 Background

This chapter deals with the splitting method for counting, combinatorial optimization, and rare-event estimation. Before turning to the splitting method we present some background on counting using randomized (or Monte Carlo) algorithms.

To date, very little is known about how to construct efficient algorithms for hard counting problems. This means that exact solutions to these problems cannot be obtained in polynomial time, and therefore our work focuses on approximation algorithms, and, in particular, approximation algorithms based on randomization. The basic procedure for counting is outlined below.

1. Formulate the counting problem as estimating the cardinality $c04-math-0001$ of some set $c04-math-0002$ , such as the one in (1.1).

2. Find a sequence of decreasing sets $c04-math-0003$ such that

4.1 $c04-math-0004$

and $c04-math-0005$ is known.

3. Write $c04-math-0006$ as

4.2 $c04-math-0007$

where

4.3 $c04-math-0008$

Note that $c04-math-0009$ is typically very small, like $c04-math-0010$ , while each ratio

4.4 $c04-math-0011$

should be not too small, like $c04-math-0012$ or bigger. Clearly, estimating $c04-math-0013$ directly while sampling in $c04-math-0014$ is meaningless, but estimating each $c04-math-0015$ separately seems to be a good alternative.

4. Develop an efficient estimator $c04-math-0016$ for each $c04-math-0017$ and estimate $c04-math-0018$ by

4.5 $c04-math-0019$

where $c04-math-0020$ .

Using (4.1)(4.2)(4.3)(4.4)(4.5), one can design a sampling plan according to which the “difficult” counting problem defined on the set $c04-math-0021$ is decomposed into a number of “easy” ones associated with a sequence of related sets $c04-math-0022$ and such that $c04-math-0023$ . Typically, the Monte Carlo algorithms based on (4.1)(4.2)(4.3)(4.4)(4.5) explore the connection between counting and sampling and in particular the reduction from approximate counting of a discrete set to approximate sampling of the elements of this set [108].

To deliver a meaningful estimator of $c04-math-0024$ , we must solve the following two major problems:

i. Construct the sequence $c04-math-0025$ such that each $c04-math-0026$ is not a rare-event probability.

ii. Obtain a low variance unbiased estimator $c04-math-0027$ of each $c04-math-0028$ .

Hence, once both tasks (i) and (ii) are resolved, one can obtain an efficient estimator for $c04-math-0029$ given in (4.5).

Task (i) might be typically simply resolved in many cases. Let us consider a concrete example:

Suppose that each variable of the set (1.1) is bounded, say $c04-math-0030$ $c04-math-0030a$ ( $c04-math-0031$ ), then we may take $c04-math-0032$ , with $c04-math-0033$ .
Consider the function $c04-math-0034$ to be the number of constraints of the integer program (1.1); hence, $c04-math-0035$ .
Define finally the desired subregions $c04-math-0036$ by

Task (ii) is quite complicated. It is associated with uniform sampling separately at each subregion $c04-math-0038$ and will be addressed in the subsequent text.

The rest of this chapter is organized as follows. Section 4.2 presents a quick look at the splitting method. We show that, similarly to the conventional classic randomized algorithms (see [88]), it uses a sequential sampling plan to decompose a “difficult” problem into a sequence of “easy” ones. Sections 4.3 and 4.4 present two different splitting algorithms for counting; the first based on a fixed-level set and the second on an adaptive level set. Both algorithms employ a combination of a Markov chain Monte Carlo algorithm, such as the Gibbs sampler, with a specially designed cloning mechanism. The latter runs in parallel multiple Markov chains by making sure that all of them run in steady-state at each iteration. Section 4.5 shows how, using the splitting method, one can generate a sequence of points $c04-math-0039$ uniformly distributed on a discrete set $c04-math-0040$ . By uniformity we mean that the sample $c04-math-0041$ on $c04-math-0042$ passes the standard Chi-squared test [91]. We support numerically the uniformity of generated samples based on the Chi-squared test.

In Section 4.6 we show that the splitting method is suitable for solving combinatorial optimization problems, such as the maximal cut or traveling salesman problems, and thus can be considered as an alternative to the standard cross-entropy and MinxEnt methods. Sections 4.7.1 and 4.7.2 deal with two enhancements of theadaptive splitting method for counting. The first is called the direct splitting estimator and is based on the direct counting of $c04-math-0043$ , while the second is called the capture-recapture estimator of $c04-math-0044$ . It has its origin in the well-known capture-recapture method in statistics [114]. Both are used to obtain low-variance estimators for counting on complex sets as compared to the adaptive splitting algorithm. Section 4.8 shows how the splitting algorithm can be efficiently used for estimating the reliability of complex static networks. Finally, Section 4.9 presents supportive numerical results for counting, rare events, and optimization.

4.2 Quick Glance at the Splitting Method

In this section we will take a quick look at our sampling mechanism using splitting. Consider the counting problem (4.2) with $c04-math-0045$ subsets, that is,

equation

where $c04-math-0047$ . We assume that the subsets $c04-math-0048$ are associated with levels $c04-math-0049$ and can be written as

where $c04-math-0051$ is the sample performance function and $c04-math-0052$ . We set for convenience $c04-math-0053$ . The other levels are either fixed in advance, or their values are determined adaptively by the splitting algorithm during the course of simulation. Observe that $c04-math-0054$ in (4.3) can be interpreted as

4.6 $c04-math-0055$

where $c04-math-0056$ , the uniform distribution on $c04-math-0057$ . We denote this by $c04-math-0058$ . Furthermore, we shall require a uniform pdf on each subset $c04-math-0059$ , which is denoted by $c04-math-0060$ .Clearly,

4.7 $c04-math-0061$

where $c04-math-0062$ is the normalization constant. Generating points uniformly distributed on $c04-math-0063$ using the splitting method will be addressed in Section 4.5. With these notations, we can consider $c04-math-0064$ of (4.4) to be a conditional expectation defined as

4.8 $c04-math-0065$

Suppose that sampling from $c04-math-0066$ becomes available, and that we generated $c04-math-0067$ samples in $c04-math-0068$ , then the final estimators of $c04-math-0069$ and $c04-math-0070$ can be written as

4.9 $c04-math-0071$

and

4.10 $c04-math-0072$

respectively. Here we used

4.11 $c04-math-0073$

as estimators of their true unknown conditional expectations. Furthermore, $c04-math-0074$ , with $c04-math-0075$ and $c04-math-0076$ . We call $c04-math-0077$ the elite sample size, associated with the elite samples.

To provide more insight into the splitting method, consider a toy example of the integer problem (1.1), pictured in Figure 4.1. The solution set is a six-sided polytope in the two-dimensional plane, obtained by $c04-math-0078$ inequality constraints. Its boundary is shown by the bold lines in the figure. The bounding set is the unit square $c04-math-0079$ . We generate $c04-math-0080$ points uniformly in $c04-math-0081$ (see the dots in the figure). The corresponding five performance values $c04-math-0082$ , $c04-math-0083$ are

Figure 4.1 Iteration 1: five uniform samples on $c04-math-0085$ .

As for elite samples, we choose the two largest values corresponding to $c04-math-0086$ and $c04-math-0087$ . We thus have $c04-math-0088$ , and $c04-math-0089$ . Consider the first elite sample $c04-math-0090$ with $c04-math-0091$ . Its associated subregion of points satisfying exactly the same five constraints is given by the shaded area in Figure 4.2.

Figure 4.2 The subregion corresponding to the elite point $c04-math-0092$ with $c04-math-0093$ .

Similarly, Figure 4.3 shows the elite point $c04-math-0094$ (with $c04-math-0095$ ) and its associated subregion of points satisfying the same four constraints. However, notice that there are more subregions of points satisfying at least four constraints. The union of all these subregions is the set $c04-math-0096$ ; that is,

as shown in Figure 4.4.

Figure 4.3 The subregion corresponding to the elite point $c04-math-0098$ with $c04-math-0099$ .

Figure 4.4 The subregion $c04-math-0100$ containing all points with $c04-math-0101$ .

Starting from the two elite points $c04-math-0102$ , we apply a Markov chain Monte Carlo algorithm that generates points in $c04-math-0103$ such that these points are uniformly distributed. Figure 4.5 shows $c04-math-0104$ resulting points. The five performance values are

Figure 4.5 Iteration 2: five uniform points on $c04-math-0106$ .

Next, we repeat the same procedure as in the first iteration. As elite samples we choose the two largest points $c04-math-0107$ . Thus, $c04-math-0108$ , and $c04-math-0109$ . This gives us the set of all points satisfying at least five constraints,

see Figure 4.6.

Figure 4.6 The subregion $c04-math-0111$ containing all points with $c04-math-0112$ .

We use a Markov chain Monte Carlo algorithm to generate from the two elite points five new points in $c04-math-0114$ , uniformly distributed; see Figure 4.7. Note that two of the new points hit the desired polytope. At this stage, we stop the iterations and conclude that the algorithm has converged. The purpose of Figures 4.1–4.7 is to demonstrate that the splitting method can be viewed as a global search in the sense that, at each iteration, the generated random points $c04-math-0115$ are randomly distributed inside their corresponding subregion. The main issue remaining in the forthcoming sections is to provide exact algorithms that implement the splitting method.

Figure 4.7 Iteration 3: five uniform points on $c04-math-0113$ .

Before doing so, we show how to cast the problem of counting the number of what we call multiple events into the framework (4.6)–(4.8). As we shall see below, counting the number of feasible points $c04-math-0116$ of the set (1.1) follows as a particular case of it.

Example 4.1 Counting Multiple Events

Consider counting the cardinality of the set

4.12 $c04-math-0117$

where $c04-math-0118$ are arbitrary functions. In this case, we can associate with (4.12) the following multiple-event probability estimation problem:

4.13 $c04-math-0119$

Note that (4.13) extends (4.6) in the sense that it involves an intersection of $c04-math-0120$ events $c04-math-0121$ , that is, multiple events rather than a single one $c04-math-0122$ . Some of the constraints may be equality constraints, $c04-math-0123$ .

As before, we assume that each individual indicator event in (4.13) is not rare, that is, each probability of the type $c04-math-0124$ is not a rare-event probability, say $c04-math-0125$ but the event intersection forms a rare-event probability $c04-math-0126$ .

It is clear that in this case $c04-math-0127$ in (4.13) can be rewritten as

4.14 $c04-math-0128$

where

4.15 $c04-math-0129$

For the cardinality of the set $c04-math-0130$ one applies (4.10), which means that one only needs to estimate the rare-event probability $c04-math-0131$ in (4.14) according to (4.9).

4.3 Splitting Algorithm with Fixed Levels

In this section we assume that the levels $c04-math-0132$ are fixed in advance, with $c04-math-0133$ . This can be done using a pilot run with the splitting method as in [73]. Consider the problem of estimating the conditional probability

Assume further that we have generated a point $c04-math-0135$ uniformly distributed on $c04-math-0136$ , that is, $c04-math-0137$ . Recall from our toy example in the previous section that, given this information of the current point, we should be able to generate a new point in $c04-math-0138$ , which is again uniformly distributed. Typically, this might be obtained by applying a Markov chain Monte Carlo technique. For now we just assume that we dispose of a mapping $c04-math-0139$ , which preserves uniformity, that is,

4.16 $c04-math-0140$

We call this mapping the uniform mutation mapping. Based on that we can derive an unbiased estimator of $c04-math-0141$ by using the followingprocedure:

1. Generate a sample $c04-math-0142$ of points uniformly distributed on $c04-math-0143$ .

2. Apply the uniform mutation mapping to each point, and determine the cardinality $c04-math-0144$ of the set of points $c04-math-0145$ . This set is called the elite set.

3. Deliver

equation

as an unbiased estimator of $c04-math-0147$ .

The following algorithm applies this procedure iteratively. Moreover, in each iteration all $c04-math-0148$ elite points are reproduced (cloned) $c04-math-0149$ times to form a new sample of $c04-math-0150$ points. The parameter $c04-math-0151$ is called the splitting parameter.

Algorithm 4.1 Fixed-Level Splitting Algorithm for Counting

Input: the counting problem (1.1); the fixed levels $c04-math-0152$ ; $c04-math-0153$ ; $c04-math-0154$ ; the splitting parameters $c04-math-0155$ ; the initial sample size $c04-math-0156$ .
Output: estimators (4.9) and (4.10).

1. Initialize: Set a counter $c04-math-0157$ . Generate a sample $c04-math-0158$ uniformly on $c04-math-0159$ .

2. Select: Determine the elite sample, that is, the points for which $c04-math-0160$ . Suppose that $c04-math-0161$ is the size of this subset, and denote these points by $c04-math-0162$ ; thus,

Note that each elite $c04-math-0164$ .

3. Estimate $c04-math-0165$ : Take

equation

as an estimator of $c04-math-0167$ .

4. Stop: If $c04-math-0168$ ( $c04-math-0169$ ), deliver the estimators (4.9) and (4.10). Else, continue with the next step.

5. Splitting: Reproduce (clone) $c04-math-0170$ times each sample point $c04-math-0171$ of the elite sample. Set $c04-math-0172$ .

6. Mutation: To each of the cloned points apply the uniform mutation mapping $c04-math-0173$ . Denote the new sample by $c04-math-0174$ . Note that each point in the sample is distributed uniformly on $c04-math-0175$ .

7. Iterate: Increase the counter $c04-math-0176$ , and repeat from step 2.

A key ingredient in the Algorithm 4.1 is the uniform mutation mapping $c04-math-0177$ . Typically, this is obtained by a Markov chain kernel $c04-math-0178$ on $c04-math-0179$ with the uniform distribution as its invariant distribution. This means that we assume:

i. There is some aperiodic irreducible Markov chain $c04-math-0180$ on $c04-math-0181$ with transition probabilities

for measurable subsets $c04-math-0183$ .

ii. The invariant probability distribution $c04-math-0184$ exists; that is,

iii. $c04-math-0186$ is the uniform distribution on $c04-math-0187$ .

Under these assumptions, we can generate from any point $c04-math-0188$ a path of points $c04-math-0189$ in $c04-math-0190$ , by repeatedly applying the Markov kernel $c04-math-0191$ . If the initial point is random uniformly distributed, then each consecutive point of the path preserves this property (assumption (iii)). Otherwise, we use the property that the limiting distribution of Markov chain is its invariant distribution. In implementations we often resort to the Gibbs sampler for constructing the Markov kernel and generating an associated path of points (see Appendix 4.10 for details). All these considerations are summarized in just mentioning “apply the uniform mutation mapping.”

We next show that the estimator $c04-math-0192$ is unbiased. Indeed taking into account that

equation

we obtain from the last two equalities that

equation

from which $c04-math-0195$ readily follows. We next calculate the variance of $c04-math-0196$ . In order to accomplish this, we do the following:

1. Find for each point $c04-math-0197$ in the first elite set its corresponding number $c04-math-0198$ of offspring in the final subset $c04-math-0199$ .

2. Denote by $c04-math-0200$ the number of offspring generated from the $c04-math-0201$ -th elite point. Then clearly,

equation

and $c04-math-0203$ are iid.

3. Calculate $c04-math-0204$ as

4.17 $c04-math-0205$

where

and

Thus, to estimate the variance of $c04-math-0208$ it suffices to estimate $c04-math-0209$ , which in turn can be calculated as

equation

where

equation

and $c04-math-0212$ are iid samples.

It is challenging to find good splitting parameters $c04-math-0213$ in the fixed-level splitting version. When these parameters are not chosen correctly, the number of samples will either die out or explode, leading in both cases to large variances of $c04-math-0214$ . The minimal variance choice is $c04-math-0215$ , see [48] and [55]. Thus, given the levels $c04-math-0216$ , we can execute pilot runs to find rough estimates of the $c04-math-0217$ 's, denoted by $c04-math-0218$ . At iteration $c04-math-0219$ ( $c04-math-0220$ ) of the algorithm we set the splitting parameter of the $c04-math-0221$ -th elite point to be a random variable defined as

where the variable $c04-math-0223$ is a Bernoulli random variable with success probability $c04-math-0224$ , and $c04-math-0225$ means rounding down to the nearest integer. Thus, the splitting parameters satisfy $c04-math-0226$ . Note that the sample size in the next iteration is also a random variable defined as

equation

In the following section we propose an adaptive splitting method considering two issues that can cause difficulties in the fixed-level approach:

An objection against cloning is that it introduces correlation. An alternative approach is to sample $c04-math-0228$ “new” points in the current subset by running a Markov chain using a Markov chain Monte Carlo method.
Finding “good” locations of the levels $c04-math-0229$ is crucial but difficult. We propose to select the levels adaptively.

4.4 Adaptive Splitting Algorithm

In both counting and optimization problems the proposed adaptive splitting algorithms generate sequences of pairs

4.18 $c04-math-0230$

where, as before, $c04-math-0231$ . This is in contrast to the cross-entropy method [107], where one generates a sequence of pairs

4.19 $c04-math-0232$

Here $c04-math-0233$ is a sequence of parameters in the parametric family of distributions $c04-math-0234$ . The crucial difference is, of course, that in the splitting method, $c04-math-0235$ is a sequence of (uniform) nonparametric distributions rather than of parametric ones $c04-math-0236$ . Otherwise the cross-entropy and the splitting algorithms are very similar, apart from the fact that cross-entropy has the desirable property that the samples are independent, whereas in splitting they are not. The advantage of the splitting method is that it permits approximate sampling from the uniform pdf $c04-math-0237$ and permits updating the parameters $c04-math-0238$ and $c04-math-0239$ adaptively. It is shown in [104] that for rare-event estimation and counting the splitting method typically outperforms its cross-entropy counterpart, while for combinatorial optimization they perform similarly. The main reason is that for counting the sampling from a sequence of nonparametric probability density function $c04-math-0240$ is more beneficial (more informative) than sampling from a sequence of a parametric family like $c04-math-0241$ .

Below we present the main steps of the adaptive splitting algorithm. However, first, we describe the main ideas for which we refer to [13] and [40], where the adaptive $c04-math-0242$ -thresholds satisfy the requirements $c04-math-0243$ . Here, as before, the parameters $c04-math-0244$ , called the rarity parameters, are fixed in advance and they must be not too small, say $c04-math-0245$ .

We start by generating a sample of $c04-math-0246$ points uniformly on $c04-math-0247$ , denoted by $c04-math-0248$ . Consider the sample set at the $c04-math-0249$ -th iteration: $c04-math-0250$ of $c04-math-0251$ random points in $c04-math-0252$ . All these sampled points are uniformly distributed on $c04-math-0253$ . Let $c04-math-0254$ be the $c04-math-0255$ -th quantile of the ordered statistics values of the values $c04-math-0256$ . The elite set $c04-math-0257$ consists of those points of the sample set for which $c04-math-0258$ . Let $c04-math-0259$ be the size of the elite set. If all values $c04-math-0260$ would be distinct, it would follow that the number of elites $c04-math-0261$ , where $c04-math-0262$ denotes rounding up to the nearest integer. However, when we deal with a discrete finite space, typically we will find many duplicate sampled pointswith $c04-math-0263$ . All these are included in the elite set. Level $c04-math-0264$ is used to define the next level subset $c04-math-0265$ . Finally, note that it easily follows from (4.7) that the elite points are distributed uniformly on this level subset.

Regarding the elite sample $c04-math-0266$ , we do three things. First, we use its relative size $c04-math-0267$ as an estimate of the conditional probability $c04-math-0268$ . Next, we delete duplicates. We call this screening, and it might be modeled formally by introducing the identity map $c04-math-0269$ ; that is, $c04-math-0270$ for all points $c04-math-0271$ . Apply the identity map $c04-math-0272$ to the elite set, resulting in a set of $c04-math-0273$ distinct points (called the screened elites), that we denote by

4.20 $c04-math-0274$

Thirdly, each screened elite is the starting point of $c04-math-0275$ trajectories in $c04-math-0276$ generated by a Markov chain simulation using a transition probability matrix with $c04-math-0277$ as its stationary distribution. Because the starting point is uniformly distributed, all consecutive points on the sample paths are uniformly distributed on $c04-math-0278$ . Therefore, we may use all these points in the next iteration. If the splitting parameter $c04-math-0279$ , all trajectories are independent.

Thus, we simulate $c04-math-0280$ trajectories, each trajectory for $c04-math-0281$ steps. This produces a total of $c04-math-0282$ uniform points in $c04-math-0283$ . To continue with the next iteration again with a sample set of size $c04-math-0284$ , we choose randomly $c04-math-0285$ of the generated trajectories (without replacement) and apply one more Markov chain transition to the end points of these trajectories. Denote the new sample set by $c04-math-0286$ , and repeat the same procedure as above. Summarizing the above:

4.21 $c04-math-0287$

The algorithm iterates until it converges to $c04-math-0288$ , say at iteration $c04-math-0289$ , at which stage we stop and deliver the estimators (4.9) and (4.10) of $c04-math-0290$ and $c04-math-0291$ , respectively.

Figure 4.8 represents a typical dynamics of the zero-th iteration of the adaptive splitting algorithm below for counting. Initially $c04-math-0292$ points were generated randomly in the two-dimensional space $c04-math-0293$ (some are shown as black dots). Because $c04-math-0294$ , the elite sample consists of $c04-math-0295$ points (shown as the points numbered $c04-math-0296$ ) with values $c04-math-0297$ . After renumbering, these five points are $c04-math-0298$ . From each of these elite points $c04-math-0299$ Markov chain paths of length $c04-math-0300$ were generated resulting in again $c04-math-0301$ points in the space $c04-math-0302$ . In the picture we see these paths from two points $c04-math-0303$ and $c04-math-0304$ . Three points (the in the upper region) are shown reaching the next-level $c04-math-0305$ .

Algorithm 4.2 Adaptive Splitting Algorithm for Counting

Input: The counting problem (4.2); the rarity parameters $c04-math-0306$ ; the splitting parameters $c04-math-0307$ ; the sample size $c04-math-0308$ .
Output: Estimators (4.9) and (4.10).

1. Initialize: Set a counter $c04-math-0309$ . Generate a sample $c04-math-0310$ uniformly on $c04-math-0311$ .

2. Select: Compute level $c04-math-0312$ as the $c04-math-0313$ quantile of the ordered statistics values of $c04-math-0314$ . Determine the elite sample, that is, the largest subset of $c04-math-0315$ consisting of points for which $c04-math-0316$ . Suppose that $c04-math-0317$ is the size of this subset, and denote its points by $c04-math-0318$ ; thus,

Note that each elite point $c04-math-0320$ , the uniform distribution on the set $c04-math-0321$ .

3. Estimating $c04-math-0322$ : Take

equation

as an estimator of $c04-math-0324$ .

4. Stop: If $c04-math-0325$ , deliver the estimators (4.9) and (4.10). Else continue with the next step.

5. Screening: Screen the elite set; that is, remove duplicates, to obtain $c04-math-0326$ distinct points $c04-math-0327$ uniformly distributed in $c04-math-0328$ .

6. Splitting: Reproduce (clone) $c04-math-0329$ times each screened elite $c04-math-0330$ of the screened elite sample; that is, take $c04-math-0331$ identical copies of each point.

7. Sampling: To each of the cloned points apply a Markov chain sampler of length $c04-math-0332$ in $c04-math-0333$ with $c04-math-0334$ as its stationary distribution. The length $c04-math-0335$ is given by $c04-math-0336$ . Choose randomly (without replacement) $c04-math-0337$ of these paths and extend these with one point by applying an extra transition of the Markov chain sampler. Denote the new entire sample by $c04-math-0338$ . Note that each point in the sample is distributed uniformly on the set $c04-math-0339$ .

8. Iterate: Increase the counter $c04-math-0340$ , and repeat from step 2.

Figure 4.8 Iteration of the splitting algorithm.

As an illustrative example, one may consider a sum of Bernoulli random variables. Suppose we have $c04-math-0341$ for $c04-math-0342$ and we are interested in calculating a probability of the event $c04-math-0343$ for fixed $c04-math-0344$ , $c04-math-0345$ . One should note that even for moderate $c04-math-0346$ and, say, $c04-math-0347$ we have a rare-event scenario. Let now $c04-math-0348$ be the performance function. The Gibbs sampler can be defined as follows:

Algorithm 4.3 Gibbs Sampler for the Sum of Bernoulli Random Variables

Input: A random Bernoulli vector $c04-math-0349$ where $c04-math-0350$
Output: A uniformly chosen random neighbor $c04-math-0351$ with the same property.

For each $c04-math-0352$ do:

1. Generate $c04-math-0353$ ; set $c04-math-0354$ .

2. If $c04-math-0355$ , set $c04-math-0356$ ; otherwise $c04-math-0357$ .

From this simple example, it is clear that if we repeatedly apply this algorithm on any vector $c04-math-0358$ , it will eventually approach the desired event space $c04-math-0359$ $c04-math-0359a$ .

We conclude with a few remarks.

Remark 4.1

One of the important features of Algorithm 4.2 is that it produces points distributed nearly uniformly on each subregion $c04-math-0360$ . Indeed, the assumption is that in the initialization step we can generate samples that are exactly uniformly distributed on the entire space $c04-math-0361$ . At the subsequent iterations we generate in parallel a substantial number of independent sequences of nearly uniform points. We also make sure that they have sufficient lengths. By doing so we guarantee that our Gibbs sampler mixes rapidly and the generated points at each $c04-math-0362$ are indeed distributed approximately uniform, see also a relevant discussion in Gelman and Rubin [51]. We found numerically that this holds by choosing the sample size about 10–100 times larger than dimension $c04-math-0363$ and the splitting parameters $c04-math-0364$ not too small, say $c04-math-0365$ . For more details on uniform sampling, see Section 4.5.

The following figures supply evidence that supports the rapid mixing properties of the Gibbs sampler. We applied the splitting algorithm to a 3-SAT problem from the SATLIB benchmark problems, consisting of $c04-math-0366$ literals and $c04-math-0367$ clauses. In each iteration we chose arbitrarily one of the elite points as a starting point of the Gibbs sampler of length $c04-math-0368$ . From this sequence of points $c04-math-0369$ in the subset $c04-math-0370$ we constructed a time series $c04-math-0371$ by summating the coordinates of each point: $c04-math-0372$ , $c04-math-0373$ . Finally, we computed the autocorrelation function of this time series. Figure 4.9 shows these autocorrelation functions of the first four iterations, up to lag 20. We clearly see that in each iteration the correlation vanishes rapidly.

Figure 4.9 Autocorrelations of the Gibbs sampler.

Remark 4.2 Adaptive Choice of $c04-math-0374$

Suppose that we want to fix in advance the rarity parameters $c04-math-0375$ . Typically it is not so simple, especially for combinatorial problems. The reason is that if, for example, we consider the integer constraints (1.1), then a majority of generated points will result at the same level $c04-math-0376$ . Note that all these points are added to the elite set and as result the number of elites blowsup to $c04-math-0377$ . Such a policy may lead to a situation where $c04-math-0378$ . As result we obtain that $c04-math-0379$ and the Algorithm 4.2 will be locked.

To eliminate such undesirable phenomena we propose the following adaptive procedure involving a fixed interval $c04-math-0380$ for $c04-math-0381$ . Initially we choose quite an arbitrary $c04-math-0382$ , say $c04-math-0383$ , called the proposal rarity parameter; let $c04-math-0384$ be the largest elite value of the ordered sample $c04-math-0385$ , corresponding to the proposal $c04-math-0386$ . The adaptive choice of $c04-math-0387$ , where, say $c04-math-0388$ , is executed as follows:

Include in the elite sample all additional points satisfying $c04-math-0389$ , provided that the number of elite samples $c04-math-0390$ .
Remove from the elite sample all points $c04-math-0391$ , provided the number of elite samples $c04-math-0392$ . Note that by doing so we switch from a lower elite level to a higher one. If $c04-math-0393$ , and thus $c04-math-0394$ , accept $c04-math-0395$ as the size of the elites sample. If $c04-math-0396$ , proceed sampling until $c04-math-0397$ , that is until at least $c04-math-0398$ elite samples are obtained. The above guarantees that $c04-math-0399$ .

It follows that such adaptive $c04-math-0400$ satisfying $c04-math-0401$ prevents Algorithm 4.2 of being stopped before reaching the target level $c04-math-0402$ .

Remark 4.3 Alternative Choice of $c04-math-0403$

As an alternative to Remark 4.2 we can define $c04-math-0404$ based on the maximum value of the ordered sample $c04-math-0405$ , denoted as $c04-math-0406$ , rather than on the $c04-math-0407$ quantile $c04-math-0408$ . This is so, since the sample size $c04-math-0409$ is typically larger than the number of levels $c04-math-0410$ in the model. Thus, we expect that the number of $c04-math-0411$ values, denoted as $c04-math-0412$ will be greater than 1. The adaptive $c04-math-0413$ in this case will correspond to $c04-math-0414$ .

As an example, consider the following two sample scenarios for the ordered values of $c04-math-0415$ :

i. $c04-math-0416$ ;

ii. $c04-math-0417$ .

We have in (i) $c04-math-0418$ and in (ii) $c04-math-0419$ . Note that because in case (ii) we have $c04-math-0420$ (that is, a single value of $c04-math-0421$ ), we can include in the elite sample all additional smaller values corresponding (in this case) to $c04-math-0422$ . By doing so we obtain $c04-math-0423$ instead of $c04-math-0424$ .

In summary, in the alternative $c04-math-0425$ approach we first select some target $c04-math-0426$ value, say $c04-math-0427$ , and then accumulate all largest values of $c04-math-0428$ until we obtain

4.5 Sampling Uniformly on Discrete Regions

Here we demonstrate how, using the splitting method, one can generate a sequence of points $c04-math-0430$ nearly uniformly distributed on a discrete set $c04-math-0431$ , such as the set (1.1). By near uniformity we mean that the sample $c04-math-0432$ on $c04-math-0433$ passes the standard Chi-square test. Thus, the goal of Algorithm 4.2 is twofold: counting the points on the set $c04-math-0434$ and simultaneously generating points uniformly distributed on that set $c04-math-0435$ .

Note that generating a sample of points uniformly distributed on a continuous set $c04-math-0436$ using a Markov chain Monte Carlo algorithm has been thoroughly studied in the literature. One of the most popular methods is the hit-and-run of Smith [53]. For recent applications of hit-and-run for convex optimization, see the work of Gryazina and Polyak [58].

We will show empirically that a simple modification of the splitting Algorithm 4.2 allows one to generate points uniformly distributed on sets like (1.1). The modification of the Algorithm 4.2 is very simple. Once it reaches the desired level $c04-math-0437$ we perform several more iterations, say $c04-math-0438$ iterations, with the corresponding elite samples. As for previous iterations, we use here the screening, splitting, and sampling steps. Such policy will only improve the uniformity of the samples on $c04-math-0439$ . The number of additional steps $c04-math-0440$ needed for the resulting sample to pass the Chi-square test for uniformity, depends on the quality of the original elite sample at level $c04-math-0441$ , which in turn depends on the selected values $c04-math-0442$ and $c04-math-0443$ . We found numerically that the closer $c04-math-0444$ is to 1, the more uniform is the sample. But, running the splitting Algorithm 4.2 at $c04-math-0445$ close to 1 is clearly time consuming. Thus, there exists a trade-off between the quality (uniformity) of the sample and the number of additional iterations $c04-math-0446$ required.

As an example, consider a $c04-math-0447$ -SAT problem consisting of $c04-math-0448$ literals and $c04-math-0449$ clauses and $c04-math-0450$ (see also Section 4.9 with the numerical results). We applied the Chi-square test for uniformity of $c04-math-0451$ samples for the following cases:

$c04-math-0452$ and no additional iterations ( $c04-math-0453$ );
$c04-math-0454$ and no additional iterations ( $c04-math-0455$ );
$c04-math-0456$ and one additional iteration ( $c04-math-0457$ ).

All three experiments passed the Chi-square test at level $c04-math-0458$ : the observed test values were 13.709, 9.016, and 8.434, respectively, against critical value $c04-math-0459$ . The histogram corresponding to the last experiment ( $c04-math-0460$ ) is shown in Figure 4.10.

Figure 4.10 Histogram of 5,000 samples for the 3-SAT problem.

In this way, we tested many more SAT problems for uniformity and found that typically (about 95%) the resulting samples pass the Chi-square test, provided we perform two or three additional iterations ( $c04-math-0461$ ) on the corresponding elite sample after the algorithm has reached the desired level $c04-math-0462$ .

4.6 Splitting Algorithm for Combinatorial Optimization

In this section, we show that the splitting method is suitable for solving combinatorial optimization problems, such as the maximum cut or traveling salesman problems, and, thus, can be considered as an alternative to the standard cross-entropy and MinxEnt methods. The splitting algorithm for combinatorial optimization can be considered as a particular case of the counting Algorithm 4.2 with $c04-math-0463$ . In that case, we do not need to calculate the product estimators $c04-math-0464$ ; thus, Step 3 of Algorithm 4.2 is omitted. The main difference reflects the stopping step.

Algorithm 4.4 Splitting Algorithm for Optimization

Given the rarity parameter $c04-math-0465$ , say $c04-math-0466$ and the sample size $c04-math-0467$ , execute the following steps:

1. Initialize: The same as in Algorithm 4.2.

2. Select: The same as in Algorithm 4.2 using a single $c04-math-0468$ in all iterations.

3. Stop: If for some given small integer $c04-math-0469$ , say $c04-math-0470$ ,

4.22 $c04-math-0471$

Deliver $c04-math-0472$ as the estimator of the optimal solution. Else go to the next step.

4. Screening: The same as in Algorithm 4.2.

5. Splitting: The same as in Algorithm 4.2.

6. Sampling: The same as in Algorithm 4.2.

7. Iterate: The same as in Algorithm 4.2.

4.7 Enhanced Splitting Method for Counting

Here we consider two enhanced estimators for the adaptive splitting method for counting. They are called the direct and capture-recapture estimators, respectively.

4.7.1 Counting with the Direct Estimator

This estimator is based on direct counting of the number of screened samples obtained immediately after crossing the level $c04-math-0473$ . This is the reason that it is called the direct estimator. It is denoted $c04-math-0474$ , and is associated with the uniform distribution $c04-math-0475$ on $c04-math-0476$ . We found numerically that $c04-math-0477$ is extremely useful and very accurate as compared to $c04-math-0478$ of the Algorithm 4.2, but is applicable only for counting where $c04-math-0479$ is not too large. Specifically, $c04-math-0480$ should be less than the sample size $c04-math-0481$ . Note, however, that counting problems with small values $c04-math-0482$ are the most difficult ones and in many counting problems one is interested in the cases where $c04-math-0483$ does not exceed some fixed quantity, say $c04-math-0484$ . Clearly, this is possible only if $c04-math-0485$ . It is important to note that $c04-math-0486$ is typically much more accurate than its counterpart, the standard estimator $c04-math-0487$ based on the splitting Algorithm 4.2. The reason is that $c04-math-0488$ is obtained directly by counting all distinct values of $c04-math-0489$ satisfying $c04-math-0490$ , that is, it can be written as

4.23 $c04-math-0491$

where $c04-math-0492$ is the sample obtained by screening the elites when Algorithm 4.2 actually satisfied the stopping criterion at Step 4. Again, the estimating Step 3 is omitted; and conveniently Step 4 (stopping) and Step 5 (screening) are swapped.

Algorithm 4.5 Direct Algorithm for Counting

Input: The counting problem (4.2); the splitting parameters $c04-math-0493$ ; the rarity parameters $c04-math-0494$ ; the parameters $c04-math-0495$ and $c04-math-0496$ , say $c04-math-0497$ and $c04-math-0498$ , such that $c04-math-0499$ as in Remark 4.2; the sample size $c04-math-0500$ .
Output: Estimator (4.23).

1. Initialization: Same as in Algorithm 4.2.

2. Select: Same as in Algorithm 4.2.

3. Screening: Same as in Algorithm 4.2.

4. Stop: If $c04-math-0501$ , deliver (4.23) as an estimator of $c04-math-0502$ . Else continue with next step.

5. Splitting: Same as in Algorithm 4.2.

6. Sampling: Same as in Algorithm 4.2.

7. Iterate: Same as in Algorithm 4.2.

To increase further the accuracy of $c04-math-0503$ we might execute one more iteration with a larger sample (more clones in Step 5 and/or longer Markov paths in Step 6).

Note that there is no need to calculate $c04-math-0504$ in Algorithm 4.5 at any iteration. Note also that the counting Algorithm 4.5 can be readily modified for combinatorial optimization, since an optimization problem can be viewed as a particular case of counting, where the counted quantity $c04-math-0505$ .

4.7.2 Counting with the Capture–Recapture Method

In this section, we discuss how to combine the classic capture-recapture (CAP-RECAP) method with the splitting Algorithm 4.2. First we present the classical capture-recapture algorithm in the literature.

4.7.2.1 Capture-Recapture Method

Originally, the capture-recapture method was used to estimate the size, say $c04-math-0506$ , of some unknown population on the basis of two independent samples, each taken without replacement. To see how the CAP-RECAP method works, consider an urn model with a total of $c04-math-0507$ identical balls. Denote by $c04-math-0508$ and $c04-math-0509$ the sample sizes taken at the first and second draws, respectively. Assume, in addition, that

The second draw takes place after all $c04-math-0510$ balls have been returned to the urn.
Before returning the $c04-math-0511$ balls, each is marked, say, painted with a different color.

Denote by $c04-math-0512$ the number of balls from the first draw that reappear in the second. Then a biased estimate $c04-math-0513$ of $c04-math-0514$ is

4.24 $c04-math-0515$

This is based on the observation that $c04-math-0516$ . Note that the name capture-recapture was borrowed from a problem of estimating the animal population size in a particular area on the basis of two visits. In this case, $c04-math-0517$ denotes the number of animals captured on the first visit and recaptured on the second.

A slightly less biased estimator of $c04-math-0518$ is

4.25 $c04-math-0519$

See Seber [113] for an analysis of its bias. Furthermore, defining the statistic

Seber [113] shows that

where

so that $c04-math-0523$ is an approximately unbiased estimator of the variance of $c04-math-0524$ .

4.7.2.2 Splitting Algorithm Combined with Capture–Recapture

Application of the CAP-RECAP to counting problems is straightforward. The target is to estimate the unknown size $c04-math-0525$ . Consider the last iteration $c04-math-0526$ of the adaptive splitting Algorithm 4.2; that is, when we have generated in Step 2 the set $c04-math-0527$ of elite points at level $c04-math-0528$ (points in the target set $c04-math-0529$ ).

Then, we perform the following steps:

1. Skip the stopping Step 4 (of Algorithm 4.2).

2. Screen the elites (Step 5).

3. Execute splitting Step 6 and sampling Step 7 (of Algorithm 4.2) with a sample size $c04-math-0530$ .

4. Remove the duplicates.

5. Record the resulting set of $c04-math-0531$ distinct points.

6. Execute Steps 2–5 above a second time, starting from the same elite set $c04-math-0532$ ; the sample size in Step 5 is taken to be $c04-math-0533$ . Let $c04-math-0534$ be the number of distinct points.

7. Count the number $c04-math-0535$ of distinct points that occur in both runs and deliver either estimator the (4.24) or (4.25).

The resulting algorithm can be written as

Algorithm 4.6 Capture-Recapture Algorithm for Counting

Input: the counting problem (4.2); sample sizes $c04-math-0536$ ; rarity parameters $c04-math-0537$ ; set automatically the splitting parameters $c04-math-0538$ .
Output: counting estimator (4.24) or (4.25).

1. Initialize: Set a counter $c04-math-0539$ . Generate a sample $c04-math-0540$ uniformly on $c04-math-0541$ .

2. Select: Compute level $c04-math-0542$ as the $c04-math-0543$ quantile of the ordered statistics values of $c04-math-0544$ . Determine the elite sample, that is, the largest subset of $c04-math-0545$ consisting of points for which $c04-math-0546$ . Suppose that $c04-math-0547$ is the size of this subset, and denote its points by $c04-math-0548$ ; thus,

3. Screening: Screen the elite set; that is, remove duplicates to obtain $c04-math-0550$ distinct points $c04-math-0551$ , uniformly distributed in $c04-math-0552$ .

4. Stop: If $c04-math-0553$ , set $c04-math-0554$ and go to Step 7; otherwise continue with the next step.

5. Sampling: To each of the screened elite points, apply a Markov chain sampler of length $c04-math-0555$ in $c04-math-0556$ with $c04-math-0557$ as its stationary distribution. The length $c04-math-0558$ is given by $c04-math-0559$ . Choose randomly (without replacement) $c04-math-0560$ of these paths and extendthese with one point by applying an extra transition of the Markov chain sampler. Denote the new entire sample by $c04-math-0561$ .

6. Iterate: Increase the counter $c04-math-0562$ , and repeat from step 2.

7. Capture: Let $c04-math-0563$ . For all $c04-math-0564$ , starting at the $c04-math-0565$ -th screened elite point, run a Markov chain of length $c04-math-0566$ in $c04-math-0567$ with $c04-math-0568$ as its stationary distribution. Extend $c04-math-0569$ randomly chosen sample paths with one point. Screen out duplicates to obtain a set $c04-math-0570$ of $c04-math-0571$ distinct points in $c04-math-0572$ .

8. Recapture: Repeat step 7 with $c04-math-0573$ . After screening out, the remaining set is $c04-math-0574$ of $c04-math-0575$ distinct points in $c04-math-0576$ .

9. Capture-Recapture: Compute the number $c04-math-0577$ of points in $c04-math-0578$ .

10. Estimating: Deliver estimator (4.24) or (4.25).

In Section 4.9, we compare the numerical results of the adaptive splitting Algorithm 4.2 with the capture–recapture Algorithm 4.6. As a general observation we found that the performances of the two counting estimators depend on the size of the target set $c04-math-0579$ . In particular, when we keep the sample $c04-math-0580$ limited to 10,000, then for $c04-math-0581$ sizes up to $c04-math-0582$ the CAP-RECAP estimator (4.25), denoted as $c04-math-0583$ is more accurate than the adaptive splitting estimator $c04-math-0584$ in (4.10), that is,

However, for larger target sets, say with $c04-math-0586$ , we propose using the splitting algorithm because the capture–recapture method performs poorly.

4.8 Application of Splitting to Reliability Models

4.8.1 Introduction

Consider a connected undirected graph $c04-math-0587$ with $c04-math-0588$ vertices and $c04-math-0589$ edges, where each edge has some probability $c04-math-0590$ of failing. What is the probability that $c04-math-0591$ becomes disconnected under random, independent edge failures? This problem can be generalized by allowing a different failure probability for each edge. It is well known that this problem is #P-hard, even in the special case $c04-math-0592$ .

Although approximation [18] and bounding [5] [6] methods are available, their accuracy and scope are very much dependent on the properties (such as size and topology) of the network. For large networks, estimating the reliability via simulation techniques becomes desirable. Due to the computational complexity of the exact determination, a Monte Carlo approach is justified. However, in highly reliable systems such as modern communication networks, the probability of network failure is a rare-event probability and naive methods such as the Crude Monte Carlo are impractical. Various variance reduction techniques have been developed to produce better estimates. Examples include a control variate method [47], importance sampling [22] [59], and graph decomposition approaches such as the recursive variance reduction algorithm and its refinements [21] [23]. For a survey of these methods, see [20].

For network unreliability, Karger [64] presents an algorithm and shows that its complexity belongs to the so-called fully polynomial randomized approximation schemes (see Appendix section A.3.2 for a definition). His approach can be described as follows.

Assume for simplicity that each edge fails with equal probability $c04-math-0593$ . Let $c04-math-0594$ be the size of a minimum cut in $c04-math-0595$ ; that is, the smallest cut in the graph having exactly $c04-math-0596$ edges. An important observation of Karger is that if all edges of the cut fail, then the graph becomes disconnected and the network unreliability (failure probability), denoted by $c04-math-0597$ , satisfies $c04-math-0598$ . If $c04-math-0599$ is bounded by some reasonable polynomial, say $c04-math-0600$ , then the naive Monte Carlo will do the job. Otherwise, $c04-math-0601$ will be very small and, thus, we are forced to face a rare event situation. Karger discusses the following issues:

The failure probability of a cut decreases exponentially with the number of edges in the cut (so he suggests considering only relatively small cuts).
There exists a polynomial number of such relatively small cuts, and they can be enumerated in polynomial time.
To estimate $c04-math-0602$ to within $c04-math-0603$ with high probability, it is sufficient to perform $c04-math-0604$ trials.
Combining the well-known DNF counting algorithm of Karp and Luby [65], one can construct easily a polynomial-sized DNF formula whose probability of being satisfied is precisely the desired one. The algorithm of Karger estimates the failure probability of the graph to within $c04-math-0605$ in $c04-math-0606$ time.

The permutation Monte Carlo (PMC) and the turnip methods of Elperin, Gertsbakh, and Lomonosov [41] [42] are the most successful breakthrough methods for networks reliability. Both methods are based on the so-called basic Monte Carlo (BMC) algorithm for computing the sample performance $c04-math-0607$ (see [52] and Algorithm 4.7). It is shown in [41] [42] that the complexity of both PMC and turnip is $c04-math-0608$ , where $c04-math-0609$ is the number of edges in the network.

Botev et al. [14] introduce a fast splitting algorithm based on efficient implementation of data structures using minimal spanning trees. This allows one to speed-up substantially the Gibbs sampling, while estimating the sample reliability function $c04-math-0610$ . The Gibbs sampler is implemented online into the BMC Algorithm 4.7 (see Section 4.8.3). Although numerically the splitting method in [14] performs exceptionally well compared with its standard counterpart (see Algorithms 16.7–16.9 in Kroese et al. [73]), there is no formal proof regarding the speed-up of the former relative to the latter. Extensive numerical results support the conjecture that its complexity is the same as in [41] [42], that is, $c04-math-0611$ .

The recent work of Rubinstein et al. [109] deals with further investigation of networks' reliability. In particular, they consider the following three well-known methods: cross-entropy, splitting, and PMC. They show that combining them with the so-called hanging edges Monte Carlo (HMC) algorithm of Lomonosov [43] instead of the conventional BMC algorithm obtains a speed-up of order $c04-math-0612$ . The speed-up is due to the fast computation of the sample reliability function $c04-math-0613$ . As a consequence, the earlier suggested cross-entropy algorithm in Kroese et al. [59], the conventional splitting one in Kroese et al. [73], and the PMC algorithm of Elperin et al. [41] for network reliability (all based on the BMC Algorithm 4.7) can be sped up by a factor of $c04-math-0614$ . In particular, the following is shown:

The theoretical complexity of the new HMC-based PMC algorithm becomes $c04-math-0615$ instead of $c04-math-0616$ .
The empirical complexity of the new HMC-based cross-entropy algorithm becomes $c04-math-0617$ instead of $c04-math-0618$ .
The empirical complexity of the new HMC-based splitting algorithm becomes $c04-math-0619$ instead of $c04-math-0620$ .

An essential feature of all three algorithms in [109] is that they retain their original form, because the improved counterparts merely implement the HMC as a subroutine.

In spite of the nice theoretical and empirical complexities of the new HMC-based PMC and cross-entropy algorithms, they have limited application. The reason is that both cross-entropy and PMC algorithms are numerically unstable for large $c04-math-0621$ . Consequently, the cross-entropy and PMC algorithms are able to generate stable reliability estimators for $c04-math-0622$ up to several hundred nodes, while the splitting and the turnip algorithms (in [14] [109], and [41] [42], respectively, all with complexities $c04-math-0623$ ) are able to generate accurate reliability estimators for tens of thousands of nodes.

Here we present a splitting algorithm based on the BMC Algorithm 4.7, as an alternative to the methods of [73] and [109]. Similar to [14] its empirical complexity is $c04-math-0624$ . We believe that the advantage of the current version as compared to the one in [14] is the simplicity of implementation of the Gibbs sampler for generating $c04-math-0625$ .

4.8.2 Static Graph Reliability Problem

Consider an undirected graph $c04-math-0626$ with a set of nodes $c04-math-0627$ and a set of edges $c04-math-0628$ . Each edge can be either operational or failed. The configuration of the system can be denoted by $c04-math-0629$ , where $c04-math-0630$ is the number of edges, $c04-math-0631$ = 1 if edge $c04-math-0632$ is operational, and $c04-math-0633$ = 0 if it is failed. A subset of nodes $c04-math-0634$ is selected a priori and the system (or graph) is said to be operational in the configuration $c04-math-0635$ if all nodes in $c04-math-0636$ are connected to each other by at least one path of operational edges only. We denote this by defining the structure function,

by $c04-math-0638$ when the system is operational in configuration $c04-math-0639$ , and $c04-math-0640$ otherwise.

Suppose that edge $c04-math-0641$ is operational with probability $c04-math-0642$ , so that the system configuration is a random vector $c04-math-0643$ , where $c04-math-0644$ = $c04-math-0645$ = $c04-math-0646$ and the $c04-math-0647$ 's are independent. The goal is to estimate $c04-math-0648$ , the reliability of thesystem.

The crude Monte Carlo reliability estimator is given as

4.26 $c04-math-0649$

where $c04-math-0650$ are iid realizations from a multivariate Bernoulli distribution such that $c04-math-0651$ for all $c04-math-0652$ . For highly reliable systems, the unreliability $c04-math-0653$ is very small, so that we have a rare-event situation, and $c04-math-0654$ has to be prohibitively large to get a meaningful estimator. Such rare-event situations call for efficient sampling strategies. In this section, we shall use the ones based on cross-entropy, splitting, and the PMC method, all of which rely on the graph evolution approach described next.

4.8.2.1 Dynamic Graph Reliability Problem Formulation

In our context, a rare event occurs when the network is nonoperational. We consider an artificial state that contains more information than the binary vector based on [41]. The idea is to transform the static model into a dynamic one by assuming that all edges start in the failed state at time 0 and that the $c04-math-0655$ -th edge is repaired after a random time $c04-math-0656$ , whose distribution is chosen so that the probability of repair at time 1 is the reliability of the edge. In particular, we assume that the $c04-math-0657$ 's have a continuous cumulative distribution function $c04-math-0658$ for $c04-math-0659$ , with $c04-math-0660$ and $c04-math-0661$ . The reliability of the graph is the probability that it is operational at time 1, and the crude Monte Carlo algorithm can be reformulated as follows:

Generate the vector of repair times and check if the graph is operational at time 1; repeat this $c04-math-0662$ times independently, and estimate the reliability by the proportion of those graphs that are operational.

This formulation has the advantage that, given the vector $c04-math-0663$ , we can compute at what time the network becomes operational and use this real number as a measure of our closeness to the rare event.

Assuming that the $c04-math-0664$ 's are independent, they are easy to generate by inverting the $c04-math-0665$ 's. We define random variables $c04-math-0666$ for all $c04-math-0667$ , where $c04-math-0668$ is the indicator function, and the associated random configuration $c04-math-0669$ . These variables give the status of edge $c04-math-0670$ at time $c04-math-0671$ , namely, if $c04-math-0672$ , the edge is operational at time $c04-math-0673$ . Note that for the original Bernoulli variable indicatingwhether edge $c04-math-0674$ is operational, we have $c04-math-0675$ . In this way we get also a random dynamic structure function $c04-math-0676$ , $c04-math-0677$ , for which $c04-math-0678$ . The interpretation is that the edges become operational one by one at random times and we are interested in the reliability at time 1. This is referred to as the construction process in [110]. Define the performance function

as the first time when the graph becomes operational. Hence, the value equals one of the repair times $c04-math-0680$ , namely the one that swaps the graph from non-operational to operational. Another point of view is to consider $c04-math-0681$ as a measure of closeness of $c04-math-0682$ to reliability: $c04-math-0683$ means that the system is operational at time 1, thus $c04-math-0684$ measures the distance from it. Both the cross-entropy [59] and the splitting method [73] exploit this feature.

These methods are designed to estimate the nonoperational probability (or unreliability) $c04-math-0685$ . Ideally, we would like to sample $c04-math-0686$ (repeatedly) from its distribution conditional on $c04-math-0687$ . To achieve this, we first partition the interval [0, 1] in fixed levels

and use them as intermediate stepping stones on the way to sampling $c04-math-0689$ conditional on $c04-math-0690$ as follows.

At stage $c04-math-0691$ of the algorithm, for $c04-math-0692$ , we have a collection of states $c04-math-0693$ from their distribution conditional on $c04-math-0694$ . For each of them, we construct a Markov chain that starts with that state and evolves by resampling some of the edge repair times under their distribution conditional on the value $c04-math-0695$ remaining larger than $c04-math-0696$ . While running all those chains for a given number of steps, we collect all the visited states $c04-math-0697$ whose value is above $c04-math-0698$ , and use them for the next stage. By discarding the repair times below $c04-math-0699$ and starting new chains from those above $c04-math-0700$ , and repeating this at each stage, we favor the longer repair times and eventually obtain a sufficiently large number of states with values above 1. Based on the number of steps at each stage, and the proportion of repair times that reach the final level, we obtain an unbiased estimator of $c04-math-0701$ , which is the graph unreliability, see [41].

This reformulation with the repair-time variables $c04-math-0702$ enables us to consider the reliability at any time $c04-math-0703$ : the configuration is $c04-math-0704$ and the graph is operational if and only if $c04-math-0705$ . Once the repair-time variables $c04-math-0706$ are generated, we can compute from them an estimator of the reliability for any $c04-math-0707$ .

In Elperin et al. [41] [42] and Gertsbakh and Shpungin [52], each repair distribution $c04-math-0708$ is taken as exponential with rate $c04-math-0709$ (denoted $c04-math-0710$ ). Then

and $c04-math-0712$ . We denote the corresponding joint pdf by $c04-math-0713$ .

4.8.3 BMC Algorithm for Computing $c04-math-0714$

Recall that, in terms of the vector $c04-math-0715$ , the graph unreliability can be written as $c04-math-0716$ and using the crude Monte Carlo, we would estimate $c04-math-0717$ via

equation

It is well known that direct evaluation of $c04-math-0719$ by using the minimal cuts and minimal path sets of the network is impractical. Instead, it is common to resort to the following simple algorithm [52] [59, 73], which uses a breadth-first search procedure [32].

Algorithm 4.7 Basic Monte Carlo Algorithm for Computing $c04-math-0720$

Given $c04-math-0721$ and a set of terminal nodes of size $c04-math-0722$ , set $c04-math-0723$ and execute the following steps.

1. Let $c04-math-0724$ be the permutation of the edges $c04-math-0725$ obtained from $c04-math-0726$ such that

2. Consider the network $c04-math-0728$ in which the edges $c04-math-0729$ are operational and the rest, $c04-math-0730$ , are failed.

3. Use the breadth-first search algorithm to check if the network is operational. If so, denote by $c04-math-0731$ the final number $c04-math-0732$ corresponding to the operational network, stop and go to Step 4; otherwise, increment $c04-math-0733$ and repeat from Step 2.

4. Output $c04-math-0734$ as the time at which the network becomes operational.

The stopping random value of $c04-math-0735$ corresponding to the operational network is called the critical number.

Remark 4.4

In the original BMC version [52], the terminal connectivity (see Step $c04-math-0736$ of Algorithm 4.7) was established using the Kruskal algorithm with complexity $c04-math-0737$ . The corresponding complexity of the entire algorithm was $c04-math-0738$ [52]. One can speed up Algorithm 4.7 by using the bisection search for finding the critical number $c04-math-0739$ instead of searching it sequentially starting from $c04-math-0740$ . By doing so, the complexity of this enhanced breadth-first search version reduces to $c04-math-0741$ and that of Algorithm 4.7 reduces to $c04-math-0742$ .

4.8.4 Gibbs Sampler

To estimate the unreliability $c04-math-0743$ we use the standard splitting algorithm on a sequence of adaptive levels

As usual, for each level $c04-math-0745$ we employ a Gibbs sampler. Clearly, the complexity of the splitting algorithm depends on the complexity of the Gibbs sampler. We present here two version of the Gibbs sampler, the standard one, which was used earlier for different estimation and counting problems [105], and the new faster one, called an enhanced Gibbs sampler, which is specially designed for reliability models.

Algorithm 4.8 Standard Gibbs Sampler for Network Reliability

Input: A time $c04-math-0746$ ; a graph $c04-math-0747$ , with $c04-math-0748$ ; repair times $c04-math-0749$ such that $c04-math-0750$ .
Output: A random neighbor $c04-math-0751$ such that $c04-math-0752$ .

For each $c04-math-0753$ do:

1. Generate $c04-math-0754$ ; set $c04-math-0755$ .

2. If $c04-math-0756$ , set $c04-math-0757$ .

This algorithm exploits two properties, which ensures its correctness. The first property is that, given any repair times $c04-math-0758$ , the value $c04-math-0759$ , where $c04-math-0760$ is the smallest of all repair times that make the system operational. Suppose that $c04-math-0761$ , and choose an arbitrary edge $c04-math-0762$ . Suppose that we may alter its repair time randomly to set it $c04-math-0763$ . Then there are two possible scenarios.

a. Either edge $c04-math-0764$ is noncritical in the sense that, whatever the new repair time $c04-math-0765$ will be, the perfomance value remains larger than $c04-math-0766$ .

b. Or edge $c04-math-0767$ is critical in the sense that setting $c04-math-0768$ too small, the value becomes less than $c04-math-0769$ . In the latter case, surely when $c04-math-0770$ , the value also remains larger. Then we call the second property, which is the memoryless property of the exponential distribution.

The Gibbs sampler Algorithm 4.8 is quite time consuming because the evaluation of $c04-math-0771$ is so. Below we propose a faster option, called the enhanced Gibbs sampler, which avoids explicit calculation of $c04-math-0772$ during Gibbs sampler execution. It is based on similar observations made above. Consider again the situation that $c04-math-0773$ for some repair time $c04-math-0774$ and that we change the repair time of edge $c04-math-0775$ randomly.

If already $c04-math-0776$ , then we conclude that edge $c04-math-0777$ is noncritical, so we can generate any repair time $c04-math-0778$ .
If $c04-math-0779$ , then we run a breadth-first search procedure [32] in order to verify whether edge $c04-math-0780$ is critical or noncritical. The breadth-first search procedure will operate on a subgraph $c04-math-0781$ induced by edges operational before time $c04-math-0782$ . If $c04-math-0783$ is indeed critical, we will generate $c04-math-0784$ .

The enhanced Gibbs sampler can be written as follows.

Algorithm 4.9 Enhanced Gibbs Sampler for Network Reliability

Input: A time $c04-math-0785$ ; a graph $c04-math-0786$ , with $c04-math-0787$ ; repair times $c04-math-0788$ $c04-math-0788a$ such that $c04-math-0789$ .
Output: A random neighbor $c04-math-0790$ such that $c04-math-0791$ .

For each $c04-math-0792$ do:

1. Generate $c04-math-0793$ ; set $c04-math-0794$ .

2. If $c04-math-0795$ :

a. Determine $c04-math-0796$ and $c04-math-0797$ .

b. Check terminal connectivity by applying the breadth-first search procedure to the subgraph $c04-math-0798$ .

c. If the terminals are connected set $c04-math-0799$ .

Taking into account that for each random variable $c04-math-0800$ the most expensive operation is breadth-first search procedure on $c04-math-0801$ , which takes at most $c04-math-0802$ units of time because $c04-math-0803$ is clearly a subgraph of the original $c04-math-0804$ and so $c04-math-0805$ , we conclude that the overall complexity of the enhanced Gibbs sampler Algorithm 4.9 is $c04-math-0806$ .

4.9 Numerical Results with the Splitting Algorithms

We present here numerical results with the proposed algorithms for counting and optimization. A large collection of instances (including real-world) is available on the following sites:

http://people.brunel.ac.uk/∼mastjjb/jeb/orlib/scpinfo.html
http://www.nlsde.buaa.edu.cn/∼kexu/benchmarks/set-benchmarks.htm
http://www.mat.univie.ac.at/∼neum/glopt/test.html
http://www.caam.rice.edu/∼bixby/miplib/miplib.html
Multiple-knapsack are given on
SAT problems are given on the SATLIB website, www.satlib.org, and
- http://www.satcompetition.org
- http://fmv.jku.at/sat-race-2006/downloads.html

4.9.1 Counting

4.9.1.1 SAT Problem

We present data from experiments with two different 3-SAT models:

Small size; $c04-math-0807$ (meaning $c04-math-0808$ variables and $c04-math-0809$ clauses); exact count $c04-math-0810$ .
Moderate size; $c04-math-0811$ ; exact count $c04-math-0812$ .

We shall use the following notation.

Notation 4.1

For iteration $c04-math-0813$

$c04-math-0814$ and $c04-math-0815$ denote the actual number of elites and the number after screening, respectively;
$c04-math-0816$ and $c04-math-0817$ denote the upper and the lower elite levels reached, respectively (the $c04-math-0818$ levels are the same as the $c04-math-0819$ levels in the description of the algorithm);
$c04-math-0820$ are the adaptive rarity, splitting, and burn-in parameters, respectively;
$c04-math-0821$ is the estimator of the $c04-math-0822$ -th conditional probability;
(intermediate) product estimator after the $c04-math-0823$ -th iteration $c04-math-0824$ ;
(intermediate) direct estimator after the $c04-math-0825$ -th iteration $c04-math-0826$ , which is obtained by counting directly the number of distinct points satisfying all clauses.
RE denotes the estimated relative errors defined in (2.3) in Chapter 2.
CPU report computing time of the algorithm. All computations were executed on the same machine (PC Pentium E5200 4GB RAM).

4.9.1.2 First Model: $c04-math-0827$ -SAT with Instance Matrix $c04-math-0828$

This instance of the $c04-math-0829$ -SAT problem consists of $c04-math-0830$ variables and $c04-math-0831$ clauses. The exact count is $c04-math-0832$ . A typical dynamics of a run of the adaptive splitting Algorithm 4.2 with $c04-math-0833$ , $c04-math-0834$ , and $c04-math-0835$ (single burn-in) is given in Table 4.1.

Table 4.1 Dynamics of Algorithm 4.2 for 3-SAT $c04-math-0836$ model

We ran the algorithm 10 times with $c04-math-0845$ , $c04-math-0846$ , and $c04-math-0847$ . The average performance was

The direct estimator $c04-math-0849$ gave always the exact result $c04-math-0850$ .

4.9.1.3 Second Model: $c04-math-0851$ -SAT with Instance Matrix $c04-math-0852$

Our next model is a $c04-math-0853$ -SAT with an instance matrix $c04-math-0854$ taken from www.satlib.org. The exact value is $c04-math-0855$ . Table 4.2 presents the dynamics of the adaptive Algorithm 4.2 for this problem.

Table 4.2 Dynamics of Algorithm 4.2 for the $c04-math-0868$ -SAT with matrix $c04-math-0869$

We ran the splitting algorithm 10 times, using sample size $c04-math-0856$ and parameters $c04-math-0857$ and $c04-math-0858$ for all iterations. We obtained

We also implemented the direct Algorithm 4.5 after the last iteration of the adaptive splitting using $c04-math-0860$ . We obtained

Table 4.3 presents a comparison of the performances of the adaptive splitting estimator $c04-math-0862$ and its counterpart, the CAP-RECAP estimator, denoted by $c04-math-0863$ obtained from the capture-recapture Algorithm 4.6. The comparison was done for the same random 3-SAT model with the instance matrix $c04-math-0864$ . We set $c04-math-0865$ , $c04-math-0866$ , and $c04-math-0867$ .

Table 4.3 Comparison of the Performance of the Adaptive Estimator $c04-math-0878$ and Its CAP-RECAP counterpart $c04-math-0879$ for the 3-SAT $c04-math-0880$ model

Note that the sample $c04-math-0888$ was obtained as soon as soon as Algorithm 4.6 reaches the final level $c04-math-0889$ , and $c04-math-0890$ was obtained while running it for one more iteration at the same level $c04-math-0891$ . The actual sample sizes $c04-math-0892$ and $c04-math-0893$ were chosen according to the following rule: sample until Algorithm 4.6 screens out 50% of the samples and then stop. It follows from Table 4.3 that for model $c04-math-0894$ this corresponds to $c04-math-0895$ . It also follows that the relative error of $c04-math-0896$ is about 100 times smaller as compared to $c04-math-0897$ .

4.9.1.4 Random Graphs with Prescribed Degrees

Random graphs with given vertex degrees is a model for real-world complex networks, including the World Wide Web, social networks, and biological networks. The problem is to find a graph $c04-math-0898$ with $c04-math-0899$ vertices, given the degree sequence $c04-math-0900$ with some nonnegative integers. Following [9], a finite sequence $c04-math-0901$ of non-negative integers is called graphical if there is a labeled simple graph with vertex set $c04-math-0902$ in which vertex $c04-math-0903$ has degree $c04-math-0904$ . Such a graph is called a realization of the degree sequence $c04-math-0905$ . We are interested in the total number of such realizations for a given degree sequence, hence $c04-math-0906$ denotes the set of all graphs $c04-math-0907$ with the degree sequence $c04-math-0908$ .

In order to perform this estimation, we transform the problem into a counting problem by considering the complete graph $c04-math-0909$ of $c04-math-0910$ vertices, in which each vertex is connected with all other vertices. Thus, the total number of edges in $c04-math-0911$ is $c04-math-0912$ , labeled $c04-math-0913$ . The random graph problem with prescribed degrees is translated to the problem of choosing those edges of $c04-math-0914$ such that the resulting graph $c04-math-0915$ matches the given sequence $c04-math-0916$ . We set $c04-math-0917$ when $c04-math-0918$ is chosen, and $c04-math-0919$ , otherwise, $c04-math-0920$ . So that such an assignment $c04-math-0921$ matches the given degree sequence $c04-math-0922$ , it holds necessarily that $c04-math-0923$ , since this is the total number of edges. In other words, the configuration space is

equation

Let $c04-math-0925$ be the incidence matrix of $c04-math-0926$ with entries

equation

It is easy to see that whenever a configuration $c04-math-0928$ satisfies $c04-math-0929$ , the associated graph has degree sequence $c04-math-0930$ . We conclude that the set $c04-math-0931$ is represented by

We first present a small example as illustration. Let $c04-math-0933$ with $c04-math-0934$ , and $c04-math-0935$ . After ordering the edges of $c04-math-0936$ lexicographically, the corresponding incidence matrix is given as

equation

It can be seen that $c04-math-0938$ and $c04-math-0939$ $c04-math-0940$ present two solutions of this example.

For the random graph problem we define the performance function $c04-math-0941$ by

equation

where $c04-math-0943$ is the degree of vertex $c04-math-0944$ under the configuration $c04-math-0945$ . Each configuration that satisfies the degree sequence will have a performance function equal to $c04-math-0946$ .

The implementation of the Gibbs sampler for this problem is slightly different than for the $c04-math-0947$ -SAT one, in as much as we keep the number of edges in each realization fixed to $c04-math-0948$ . The algorithm below takes care of this requirement and generates a random $c04-math-0949$ .

Algorithm 4.10 A Random Graph with Prescribed Degrees

Input: The prescribed degrees sequence $c04-math-0950$ .
Output: A random $c04-math-0951$ .

1. Generate a random permutation of $c04-math-0952$ .

2. Choose the first $c04-math-0953$ places in this permutation and deliver a vector $c04-math-0954$ having one's in those places.

The adaptive thresholds in the splitting algorithm are negative, increasing to 0:

The resulting Gibbs sampler (in Step 3 of the splitting algorithm starting with a configuration $c04-math-0956$ for which $c04-math-0957$ ) can be written as follows.

Algorithm 4.11 Gibbs Algorithm for Random Graph Sampling

Input: A configuration $c04-math-0958$ with $c04-math-0959$ .
Output: A uniformly chosen random neighbor with the same property.

For each edge $c04-math-0960$ , while keeping all other edges fixed, do:

1. Remove $c04-math-0961$ from $c04-math-0962$ ; that is, make $c04-math-0963$ .

2. Check all possible placements for the edge resulting a new vector, denoted by $c04-math-0964$ conditioning on the performance function $c04-math-0965$ .

3. With uniform probability choose one of the valid realizations.

We will apply the splitting algorithm to two problems taken from [9].

4.9.1.5 A Small Problem

For this small problem we have the degree sequence

The solution can be obtained analytically and is already given in [9], from which we quote:

To count the number of labeled graphs with this degree sequence, note that there are $c04-math-0967$ such graphs with vertex 1 not joined to vertex 2 by an edge (these graphs look like two separate stars), and there are $c04-math-0968$ such graphs with an edge between vertices 1 and 2 (these look like two joined stars with an isolated edge left over). Thus, the total number of realizations of d is $c04-math-0969$ .

As we will see, the adaptive splitting Algorithm 4.2 easily handles the problem. First we present a typical dynamics (Table 4.4).

Table 4.4 Typical Dynamics of the Splitting Algorithm 4.2 for a small problem using $c04-math-0970$ and $c04-math-0971$

We ran the algorithm 10 times, independently, using sample size $c04-math-0979$ and rarity parameter $c04-math-0980$ . The average performance was

4.9.1.6 A Large Problem

A much harder instance (see [9]) is defined by

In [9] the number of such graphs is estimated to be about $c04-math-0983$ . The average performance with the adaptive splitting Algorithm 4.2, based on 10 experiments, using sample size $c04-math-0984$ and rarity parameter $c04-math-0985$ , was

4.9.1.7 Binary Contingency Tables

Given are two vectors of positive integers $c04-math-0987$ and $c04-math-0988$ such that $c04-math-0989$ for all $c04-math-0990$ , $c04-math-0991$ for all $c04-math-0992$ , and $c04-math-0993$ . A binary contingency table with row sums $c04-math-0994$ and column sums $c04-math-0995$ is a $c04-math-0996$ matrix $c04-math-0997$ of zero-one entries $c04-math-0998$ satisfying $c04-math-0999$ for every row $c04-math-1000$ and $c04-math-1001$ for every column $c04-math-1002$ . The problem is to count all contingency tables.

The extension of the proposed Gibbs sampler for counting the contingency tables follows immediately. We define the configuration space $c04-math-1003$ as the space where all column or row sums are satisfied:

equation

Clearly, sampling uniformly on $c04-math-1005$ is straightforward. The sample function $c04-math-1006$ is defined by

equation

that is, the difference of the row sums $c04-math-1008$ with the target $c04-math-1009$ if the column sums are right, and vice versa.

The Gibbs sampler is similar to random graphs with prescribed degrees.

Algorithm 4.12 Gibbs Algorithm for Random Contingency Tables Sampling

Input: A matrix realization $c04-math-1010$ with performance value $c04-math-1011$ .
Output: A uniformly chosen random neighbor with the same property.

For each column $c04-math-1012$ and for each 1-entry in this column ( $c04-math-1013$ ) do:

1. Remove this $c04-math-1014$ , that is, set $c04-math-1015$ .

2. Check all possible placements for this $c04-math-1016$ in the given column $c04-math-1017$ conditioning on the performance function $c04-math-1018$ , where $c04-math-1019$ is the matrix resulting by setting $c04-math-1020$ , $c04-math-1021$ for some $c04-math-1022$ , and all other entries remain unchanged.

3. Suppose that the set of valid realization is $c04-math-1023$ . This set also contains the original realization $c04-math-1024$ . Then, with probability $c04-math-1025$ , pick any realization at random and continue with step 1.

In this way we keep the column sums correct. Similarly, when we started with a matrix configuration with all row sums correct, we execute these steps for each row and swap 1 and 0 per row.

4.9.1.8 Model 1

The data are $c04-math-1026$ with row and column sums

The true count value is known to be $c04-math-1028$ (reported in [28]). Table 4.5 presents a typical dynamics of the adaptive splitting Algorithm 4.2 for Model 1 using $c04-math-1029$ and $c04-math-1030$ .

Table 4.5 Typical Dynamics of the Splitting Algorithm 4.2 for model 1 using $c04-math-1031$ and $c04-math-1032$

The average performance based on 10 experiments using sample size $c04-math-1040$ and rarity parameter $c04-math-1041$ was

4.9.1.9 Model 2

Consider the problem of counting tables with Darwin's finch data as marginal sums given in Chen et al. [28]. The data are $c04-math-1043$ with row and columns sums

The true count value in [28] is $c04-math-1045$ . The average performance based on 10 independent experiments using sample size $c04-math-1046$ and rarity parameter $c04-math-1047$ was

4.9.1.10 Vertex Coloring

Vertex coloring is a problem of coloring the vertices of a graph $c04-math-1049$ consisting of $c04-math-1050$ nodes and $c04-math-1051$ edges, such that neighboring vertices have different colors. The set of colors has a size $c04-math-1052$ . For more details on vertex coloring see Section A.1.1.

Algorithm 4.13 Gibbs Algorithm for Vertex Coloring

Input: A graph $c04-math-1053$ with proper coloring.
Output: A uniformly chosen random neighbor with the same property.

For each vertex $c04-math-1054$ do:

1. Choose a color for $c04-math-1055$ according to the uniform distribution over the set of colors that are not assigned at any of its neighbors.

2. Accept the new color with probability $c04-math-1056$ and leave all other vertices unchanged.

We consider here two coloring models: a small one and one of moderate size.

4.9.1.11 First Model: $c04-math-1057$ Vertices, $c04-math-1058$ Edges, $c04-math-1059$ Colors

The graph $c04-math-1060$ has the following $c04-math-1061$ adjacency matrix.

equation

The performance is based on 10 independent experiments using sample size $c04-math-1063$ and rarity parameter $c04-math-1064$ , was

Note that the splitting procedure starts with the set $c04-math-1066$ consisting of all feasible colorings of the graph $c04-math-1067$ (see Section A.1.1). Thus, $c04-math-1068$ . We obtain, in this case, that the probability equals

Because $c04-math-1070$ is not a rareevent, the crude Monte Carlo method can also be used here. The performance of the CMC estimator based on 10 independent runs with $c04-math-1071$ and $c04-math-1072$ was

It follows the RE of the CMC is about 3 times larger than its counterpart, the splitting Algorithm 4.2.

4.9.1.12 Second Model: $c04-math-1074$ Vertices, $c04-math-1075$ Edges, $c04-math-1076$

We next consider a larger and sparser coloring model, one with $c04-math-1077$ vertices, $c04-math-1078$ edges, and $c04-math-1079$ .

For this model, we applied the adaptive splitting Algorithm 4.2 with sample size $c04-math-1080$ , rarity parameter $c04-math-1081$ , and burn-in parameter $c04-math-1082$ . The direct algorithm was implemented immediately after the last iteration of the adaptive splitting, using again $c04-math-1083$ . Table 4.6 presents the dynamics for a single run of the splitting Algorithm 4.2.

Table 4.6 Dynamics of Algorithm 4.2 for 4-Coloring graph with $c04-math-1084$ nodes

The average performance based on 10 independent experiments was

The direct splitting algorithm results in $c04-math-1130$ with RE $c04-math-1131$ .

4.9.1.13 Permanent

The permanent of a general $c04-math-1132$ matrix $c04-math-1133$ is defined as

4.27 $c04-math-1134$

where $c04-math-1135$ is the set of all permutations $c04-math-1136$ of $c04-math-1137$ .

Example 4.2

Let $c04-math-1138$ be a $c04-math-1139$ matrix given as

4.28 $c04-math-1140$

Note that, in contrast to the Hamiltonian cycle problem, the diagonal elements of the matrix $c04-math-1141$ need not to be zeros. The $c04-math-1142$ permutations with the corresponding product values $c04-math-1143$ are given in Table 4.7. The permanent is $c04-math-1144$ . The values of the performance function $c04-math-1145$ with all six possible permutation vectors $c04-math-1146$ are given in Table 4.8.

Table 4.7 Six permutations and their product values of the matrix $c04-math-1147$ in (4.28)

Table 4.8 The $c04-math-1156$ values corresponding to the permutations for the matrix $c04-math-1157$ in (4.28)

$c04-math-1158$

$c04-math-1159$

$c04-math-1160$

$c04-math-1161$

$c04-math-1162$

$c04-math-1163$

$c04-math-1164$

To apply the Gibbs sampler for the permanent problem, we adopt the concept of “neighboring” elements (see, for example, Chapter 10 of Ross [99]) and in particular the concept of 2-opt heuristic [108]. More precisely, given a point (tour) $c04-math-1165$ of length $c04-math-1166$ generated by the pdf $c04-math-1167$ , the conditional Gibbs sampling updates the existing tour $c04-math-1168$ to $c04-math-1169$ , where $c04-math-1170$ is generated from $c04-math-1171$ and where $c04-math-1172$ is the same as $c04-math-1173$ with the exception that the points $c04-math-1174$ and $c04-math-1175$ in $c04-math-1176$ are interchanged. We accept the tour $c04-math-1177$ if $c04-math-1178$ , otherwise, we leave the tour $c04-math-1179$ the same. If $c04-math-1180$ is accepted, we update the cost function $c04-math-1181$ (for $c04-math-1182$ ) accordingly. We found empirically that, in order for the tours $c04-math-1183$ to be distributed approximately uniformly (at each subspace $c04-math-1184$ ), we used $c04-math-1185$ trials.

The Gibbs sampler may be summarized as follows:

Algorithm 4.14 Gibbs Algorithm for the Permanent

Input: Given a permutation $c04-math-1186$ and $c04-math-1187$ .
Output: A uniformly chosen random neighbor with the same property.

For each $c04-math-1188$ and $c04-math-1189$ $c04-math-1190$ do:

1. $c04-math-1191$ ;

2. If $c04-math-1192$ , set $c04-math-1193$ .

We applied the adaptive splitting Algorithm 4.2 to a problem with an instance matrix $c04-math-1194$ matrix. We took a sample size $c04-math-1195$ and set the rarity parameter $c04-math-1196$ and burn-in parameter $c04-math-1197$ . The average performance based on 10 experiments was

The direct estimator was implemented immediately after the last iteration of the adaptive splitting, using again $c04-math-1199$ . The average performance of the direct algorithm was $c04-math-1200$ with RE $c04-math-1201$ .

4.9.1.14 Hamiltonian Cycles

We solve the Hamiltonian cycles problem by applying the 2-opt heuristic used for the permanent. In counting Hamiltonian cycles we simulated an associated permanent problem by making use of the following observations:

The set of Hamiltonian cycles of length $c04-math-1202$ represents a subset of the associated permanent set of trajectories.
The latter set is formed from cycles of length $c04-math-1203$ .

It follows that, in order to count the number of Hamiltonian cycles for a given matrix $c04-math-1204$ , one can use the following simple procedure:

1. Run Algorithm 4.2 and calculate the estimator of $c04-math-1205$ of the associated permanent using the product formula (4.9). Denote such a permanent estimator by $c04-math-1206$ .

2. Proceed with one more iteration of Algorithm 4.2 and calculate the ratio of the number of screened Hamiltonian elite cycles (cycles of length $c04-math-1207$ ) to that of screened elite samples (samples of length $c04-math-1208$ ) in the permanent. Denote the ratio by $c04-math-1209$ .

3. Deliver $c04-math-1210$ as an estimator of the number Hamiltonian cycles $c04-math-1211$ .

Below we present simulation results for two $c04-math-1212$ models. We choose $c04-math-1213$ , $c04-math-1214$ and $c04-math-1215$ . The results are based on the averages of 10 experiments. The adaptive splitting algorithm results in

while the direct algorithm results in

4.9.1.15 Volume of a Polytope

Here we present numerical results with the splitting Algorithm 4.2, on the volume of a polytope defined as $c04-math-1218$ . We assume in addition that all components $c04-math-1219$ . By doing so the polytope $c04-math-1220$ will be inside the unit $c04-math-1221$ -dimensional cube and therefore its volume is very small (rare-event probability) relative to that of the unit cube. In general, however, before implementing the Gibbs sampler, one must find the minimal $c04-math-1222$ -dimensional box, which contains the polytope, that is, for each $c04-math-1223$ , one must find the values $c04-math-1224$ of the minimal box satisfying $c04-math-1225$ . This can be done by solving for each $c04-math-1226$ the following two linear programs:

4.29 $c04-math-1227$

4.30 $c04-math-1228$

and

4.31 $c04-math-1229$

4.32 $c04-math-1230$

We consider two models.

4.9.1.16 First Model

4.33 $c04-math-1231$

Table 4.9 presents the dynamics of a single run of the splitting Algorithm 4.2 with $c04-math-1232$ and $c04-math-1233$ . The results are self-explanatory.

Table 4.9 Dynamics of single run of splitting Algorithm 4.2

In this model, we also executed the crude Monte Carlo, for which we took a sample $c04-math-1240$ . The results of 10 experiments are

equation

4.9.1.17 Second Model

This problem involves a $c04-math-1242$ matrix, whose main part, $c04-math-1243$ , comprises the first 60 column vectors of matrix $c04-math-1244$ with 0-1 entries and the rightmost column corresponds to the vector $c04-math-1245$ . Running the splitting Algorithm 4.2 for 10 independent runs, each with $c04-math-1246$ and $c04-math-1247$ , we obtained average performance

4.9.2 Combinatorial Optimization

4.9.2.1 Traveling Salesman Problem

We now present some numerical results obtained with splitting Algorithm 4.2 while using the 2-opt heuristics. The models are taken from http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/atsp.

All our results are based on 10 independent runs with the random Gibbs sampler. Below, we consider the ftv33 model with $c04-math-1249$ nodes and the best-known solution, equal to 1286. We used in our simulation studies $c04-math-1250$ and $c04-math-1251$ . We found that the accuracy with $c04-math-1252$ is higher than for $c04-math-1253$ , but the CPU time in the latter increased by about 20%. We also found that the accuracy of the splitting Algorithm 4.2 is quite sensitive to $c04-math-1254$ , $c04-math-1255$ , and $c04-math-1256$ . Below are some results:

$c04-math-1257$ , $c04-math-1258$ , and $c04-math-1259$ . The results are

The CPU was 25 sec for a run.
$c04-math-1261$ , $c04-math-1262$ , and $c04-math-1263$ . In this case, the 2-opt heuristic results in the best-known solution is $c04-math-1264$ in all 10 runs.
$c04-math-1265$ , $c04-math-1266$ , and $c04-math-1267$ . The results are
$c04-math-1269$ , $c04-math-1270$ , and $c04-math-1271$ . The results are
$c04-math-1273$ , $c04-math-1274$ , and $c04-math-1275$ . The results are (This is the worst case, with relative error about 7%.)

4.9.2.2 Knapsack Problem

Consider the Sento1.dat knapsack problem given in http://people.brunel.ac.uk/mastjjb/jeb/orlib/files/mknap2.txt. The problem has 30 constraints and 60 variables. We ran the splitting Algorithm 4.4 for 10 independent runs with $c04-math-1277$ , $c04-math-1278$ , $c04-math-1279$ , and we found that it discovers the best-known solution. We also ran it for various integer problems and always found that the results coincided with the best-known solution.

4.9.3 Reliability Models

To estimate the system unreliability parameter $c04-math-1280$ , we use in all our numerical experiments the exponential pdf $c04-math-1281$ with the rates $c04-math-1282$ , where $c04-math-1283$ is the underlying Bernoulli vector.

Our numerical results are given for a dodecahedron graph in Figure 4.11 and a lattice graph. Note that an $c04-math-1284$ graph has $c04-math-1285$ rows containing $c04-math-1286$ nodes each, arranged in a rectangular lattice. Each node is connected to its neighbors on the left, right, above, and below, where they exist. As an illustration, a $c04-math-1287$ lattice graph is presented in Figure 4.12. In all our numerical experiments, we take the terminals to be the upper-left and the lower-right nodes of the corresponding lattice graph.

Figure 4.11 The dodecahedron graph.

Figure 4.12 A $c04-math-1288$ lattice graph, with 40 nodes and 67 edges.

To estimate the unreliability parameter $c04-math-1289$ we run both models for 10 independent scenarios.

4.9.3.1 Dodecahedron Graph with Terminal Nodes $c04-math-1290$

We considered the following two cases:

Case 1 with $c04-math-1291$ . We set $c04-math-1292$ and $c04-math-1293$ . We obtained the following 10 estimators of $c04-math-1294$ :

The average CPU time was $c04-math-1296$ seconds and the relative error was $c04-math-1297$ .

Case 2 with $c04-math-1298$ . We set $c04-math-1299$ and $c04-math-1300$ . We obtained the following 10 estimators of $c04-math-1301$ :

The average CPU time was $c04-math-1303$ seconds and the relative error was about $c04-math-1304$ .

4.9.3.2 Lattice $c04-math-1305$ Graph with $c04-math-1306$

We considered the following two cases:

Case 1 with $c04-math-1307$ . We set $c04-math-1308$ and $c04-math-1309$ . We obtained the following 10 estimators of $c04-math-1310$ :

The average CPU time was $c04-math-1312$ seconds and the relative error was $c04-math-1313$ .

Case 2 with $c04-math-1314$ , $c04-math-1315$ , and $c04-math-1316$ . We obtained the following 10 estimators of $c04-math-1317$ :

The average CPU time was $c04-math-1319$ seconds and the relative error was $c04-math-1320$ .

4.10 Appendix: Gibbs Sampler

In this appendix, we show how to sample from a given joint pdf $c04-math-1321$ using the Gibbs sampler, where instead of proceeding with $c04-math-1322$ directly, which might be very difficult, one samples from one-dimensional conditional pdfs $c04-math-1323$ , which is typically much simpler.

Two basic versions of the Gibbs sampler [108] are available: systematic and random. In the former, the components of the vector $c04-math-1324$ are updated in a fixed, say increasing, order, while in the latter they are chosen randomly according to a discrete uniform $c04-math-1325$ -point pdf. Below we present the systematic version, where for a given vector $c04-math-1326$ , one generates a new vector $c04-math-1327$ with the same distribution $c04-math-1328$ as follows:

Algorithm 4.15 Systematic Gibbs Sampler

1. Draw $c04-math-1329$ from the conditional pdf $c04-math-1330$ .

2. Draw $c04-math-1331$ from the conditional pdf $c04-math-1332$

3. Draw $c04-math-1333$ from the conditional pdf $c04-math-1334$

Iterating with Algorithm 4.15, the Gibbs sampler generates (subject to some mild conditions, see [108]), a sample distributed $c04-math-1335$ .

We denote for convenience each conditional pdf $c04-math-1336$ by $c04-math-1337$ , where $c04-math-1338$ denotes conditioning on all random variables except the $c04-math-1339$ -th component.

We next present a random Gibbs sampler taken from (Ross, 2006) for estimating each $c04-math-1340$ according to (4.11), that is,

equation

Algorithm 4.16 Ross' Acceptance-Rejection Algorithm for Estimating $c04-math-1342$

1. Set $c04-math-1343$ =0.

2. Choose a vector $c04-math-1344$ such that $c04-math-1345$ .

3. Generate a random number $c04-math-1346$ and set $c04-math-1347$ = $c04-math-1348$ .

4. If $c04-math-1349$ , generate $c04-math-1350$ from the conditional one-dimensional distribution $c04-math-1351$ (see Algorithm 4.15).

5. If $c04-math-1352$ , return to 4.

6. Set $c04-math-1353$ and $c04-math-1354$ .

7. If $c04-math-1355$ , then $c04-math-1356$ .

8. Go to 3.

9. Estimate $c04-math-1357$ as $c04-math-1358$ .

For many rare-event and counting problems, generation from the conditional pdf $c04-math-1359$ can be achieved directly, that is, skipping Step 5. This should obviously speed things up.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Chapter 4: Splitting Method for Counting and Optimization

Create new playlist

Sign In

Sign Up

Chapter 4

Splitting Method for Counting and Optimization

4.1 Background

4.2 Quick Glance at the Splitting Method

4.3 Splitting Algorithm with Fixed Levels

4.4 Adaptive Splitting Algorithm

4.5 Sampling Uniformly on Discrete Regions

4.6 Splitting Algorithm for Combinatorial Optimization

4.7 Enhanced Splitting Method for Counting

4.7.1 Counting with the Direct Estimator

4.7.2 Counting with the Capture–Recapture Method

4.7.2.1 Capture-Recapture Method

4.7.2.2 Splitting Algorithm Combined with Capture–Recapture

4.8 Application of Splitting to Reliability Models

4.8.1 Introduction

4.8.2 Static Graph Reliability Problem

4.8.2.1 Dynamic Graph Reliability Problem Formulation

4.8.3 BMC Algorithm for Computing

4.8.4 Gibbs Sampler

4.9 Numerical Results with the Splitting Algorithms

4.9.1 Counting

4.9.1.1 SAT Problem

4.9.1.2 First Model: -SAT with Instance Matrix

4.9.1.3 Second Model: -SAT with Instance Matrix

4.9.1.4 Random Graphs with Prescribed Degrees

4.9.1.5 A Small Problem

4.9.1.6 A Large Problem

4.9.1.7 Binary Contingency Tables

4.9.1.8 Model 1

4.9.1.9 Model 2

4.9.1.10 Vertex Coloring

4.9.1.11 First Model: Vertices, Edges, Colors

4.9.1.12 Second Model: Vertices, Edges,

4.9.1.13 Permanent

4.9.1.14 Hamiltonian Cycles

4.9.1.15 Volume of a Polytope

4.9.1.16 First Model

4.9.1.17 Second Model

4.9.2 Combinatorial Optimization

4.9.2.1 Traveling Salesman Problem

4.9.2.2 Knapsack Problem

4.9.3 Reliability Models

4.9.3.1 Dodecahedron Graph with Terminal Nodes

4.9.3.2 Lattice Graph with

4.10 Appendix: Gibbs Sampler

Table of Contents for
Chapter 4: Splitting Method for Counting and Optimization

4.8.3 BMC Algorithm for Computing $c04-math-0714$

4.9.1.2 First Model: $c04-math-0827$ -SAT with Instance Matrix $c04-math-0828$

4.9.1.3 Second Model: $c04-math-0851$ -SAT with Instance Matrix $c04-math-0852$

4.9.1.11 First Model: $c04-math-1057$ Vertices, $c04-math-1058$ Edges, $c04-math-1059$ Colors

4.9.1.12 Second Model: $c04-math-1074$ Vertices, $c04-math-1075$ Edges, $c04-math-1076$

4.9.3.1 Dodecahedron Graph with Terminal Nodes $c04-math-1290$

4.9.3.2 Lattice $c04-math-1305$ Graph with $c04-math-1306$