Chapter 3

Minimum Cross-Entropy Method

This chapter deals with the minimum cross-entropy method, also known as the MinxEnt method, for combinatorial optimization problems and rare-event probability estimation. The main idea of MinxEnt is to associate with each original optimization problem an auxiliary single-constrained convex optimization program in terms of probability density functions. The beauty is that this auxiliary program has a closed-form solution, which becomes the optimal zero-variance solution, provided the "temperature" parameter is set to minus infinity. In addition, the pdf based on the product of marginals obtained from the joint optimal zero-variance pdf coincides with the parametric pdf of the cross-entropy (CE) method. Thus, we obtain a strong connection between CE and MinxEnt, providing CE with solid mathematical foundations.

3.1 Introduction

Let $S(\mathbf{x})$ be a continuous function defined on some closed bounded $n$-dimensional domain $\mathcal{X} \subset \mathbb{R}^n$. Assume that $\mathbf{x}^*$ is a unique minimum point of $S$ over $\mathcal{X}$. The following theorem is due to Pincus [94].

Theorem 3.1
Let $S(\mathbf{x})$ be a real-valued continuous function over a closed bounded $n$-dimensional domain $\mathcal{X} \subset \mathbb{R}^n$. Further assume that $S(\mathbf{x})$ attains its minimum over $\mathcal{X}$ at a unique point $\mathbf{x}^*$ (there is no restriction on the number of local minima). Then the coordinates $x_i^*$ of $\mathbf{x}^*$ are given by

3.1
$$x_i^* = \lim_{\lambda \to \infty} \frac{\int_{\mathcal{X}} x_i \exp\{-\lambda S(\mathbf{x})\}\, \mathrm{d}\mathbf{x}}{\int_{\mathcal{X}} \exp\{-\lambda S(\mathbf{x})\}\, \mathrm{d}\mathbf{x}}, \qquad i = 1, \dots, n.$$

The proof of the theorem is based on Laplace's formula, which for sufficiently large $\lambda$ can be written as

$$\int_{\mathcal{X}} \exp\{-\lambda S(\mathbf{x})\}\, \mathrm{d}\mathbf{x} \approx \exp\{-\lambda S(\mathbf{x}^*)\}\, c(\lambda),$$

where the factor $c(\lambda)$ grows at most polynomially in $\lambda$. This is because, for large $\lambda$, the major contribution to the integrals appearing in (3.1) comes from a small neighborhood of the minimizer $\mathbf{x}^*$.

Pincus' theorem holds for discrete optimization as well, assuming a finite domain $\mathcal{X}$. In this case, the integrals should be replaced by the corresponding sums.

There are many Monte Carlo methods for evaluating the coordinates of $\mathbf{x}^*$, that is, for approximating the ratio appearing in (3.1). Among them is the celebrated simulated annealing method, which is based on a Markov chain Monte Carlo technique, also called Metropolis' sampling procedure. The idea of the method is to sample from the Boltzmann density

3.2
$$g(\mathbf{x}, \lambda) = \frac{\exp\{-\lambda S(\mathbf{x})\}}{\int_{\mathcal{X}} \exp\{-\lambda S(\mathbf{y})\}\, \mathrm{d}\mathbf{y}}$$

without resorting to calculation of the integral in the denominator. For details see [108].
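As a sketch of this idea (not taken from the source; the toy objective, neighborhood structure, and parameter values are our own illustrative assumptions), a Metropolis sampler for a Boltzmann pmf on a small discrete domain only ever uses the ratio $g(\mathbf{y}, \lambda)/g(\mathbf{x}, \lambda)$, so the normalizing constant never needs to be computed:

```python
import math
import random

def metropolis_boltzmann(S, neighbors, x0, lam, n_steps, seed=0):
    """Metropolis sampler for the Boltzmann pmf g(x) proportional to
    exp(-lam * S(x)) on a discrete domain. Only the ratio g(y)/g(x) is used,
    so the normalizing sum (the denominator in (3.2)) is never evaluated."""
    rng = random.Random(seed)
    x = x0
    for _ in range(n_steps):
        y = rng.choice(neighbors(x))  # symmetric proposal: uniform neighbor
        accept_prob = min(1.0, math.exp(-lam * (S(y) - S(x))))
        if rng.random() < accept_prob:
            x = y
    return x

# Toy example: minimize S(x) = (x - 7)^2 over the integers 0..9.
S = lambda x: (x - 7) ** 2
neighbors = lambda x: [max(0, x - 1), min(9, x + 1)]
x_final = metropolis_boltzmann(S, neighbors, x0=0, lam=2.0, n_steps=2000)
```

For a moderately large $\lambda$ the chain spends nearly all its time near the minimizer, in line with Pincus' theorem.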

It is important to note that, in general, sampling from the complex multidimensional pdf $g(\mathbf{x}, \lambda)$ is a formidable task. If, however, the function $S(\mathbf{x})$ is separable, that is, it can be represented as

$$S(\mathbf{x}) = \sum_{i=1}^{n} S_i(x_i),$$

then the pdf $g(\mathbf{x}, \lambda)$ in (3.2) decomposes as the product of its marginal pdfs, that is, it can be written as

3.3
$$g(\mathbf{x}, \lambda) = \prod_{i=1}^{n} g_i(x_i, \lambda), \qquad g_i(x_i, \lambda) = \frac{\exp\{-\lambda S_i(x_i)\}}{\int \exp\{-\lambda S_i(y_i)\}\, \mathrm{d}y_i}.$$

Clearly, for a decomposable function $S(\mathbf{x})$, sampling from the one-dimensional marginal pdfs of $g(\mathbf{x}, \lambda)$ is fast.

Consider the application of the simulated annealing method to combinatorial optimization problems. As an example, consider the traveling salesman problem with $n$ cities. For this case, [1] shows how simulated annealing runs a Markov chain $\{\mathbf{X}_t\}$ over the set of tours, with $S(\mathbf{x})$ denoting the length of tour $\mathbf{x}$. As $\lambda \to \infty$, the stationary distribution of $\{\mathbf{X}_t\}$ becomes degenerate; that is, it converges to the point mass at the optimal solution $\mathbf{x}^*$ (the shortest tour in the case of the traveling salesman problem). It can be proved [1] that, in the case of multiple optimal solutions, say $r$ of them, the stationary distribution of $\{\mathbf{X}_t\}$ converges, as $\lambda \to \infty$, to the uniform distribution on the set of the $r$ optimal solutions.

The main drawback of simulated annealing is that it is slow and that $\lambda$, called the annealing temperature, must be chosen heuristically.

We present a different Monte Carlo method, which we call MinxEnt. It is also associated with the Boltzmann distribution; however, it is obtained by solving a MinxEnt program of a special type, and it is suitable for rare-event probability estimation and for approximating the optimal solutions of a broad class of NP-hard linear integer and combinatorial optimization problems.

The main idea of the MinxEnt approach is to associate with each original optimization problem an auxiliary single-constrained convex MinxEnt program of a special type, which has a closed-form solution.

The rest of this chapter is organized as follows. In Section 3.2, we present some background on the classic MinxEnt program. In Section 3.3, we establish connections between rare-event probability estimation and MinxEnt and present what is called the basic MinxEnt (BME) program. Section 3.4 presents a new MinxEnt method, which involves indicator functions and is called the indicator MinxEnt (IME) method. We prove that the optimal pdf obtained from the solution of the IME program is a zero-variance one, provided the temperature parameter is set to minus infinity. This is quite a remarkable result. In addition, we prove that the parametric pdf based on the product of marginals obtained from the optimal zero-variance pdf coincides with the parametric pdf of the standard CE method, which we covered in the previous chapter. A remarkable consequence is that we obtain a strong mathematical foundation for CE. In Section 3.5, we present the IME algorithm for optimization.

3.2 Classic MinxEnt Method

The classic MinxEnt program reads as

3.4
$$\min_{g}\ \mathcal{D}(g, h) = \min_{g} \mathbb{E}_{g}\left[\ln \frac{g(\mathbf{X})}{h(\mathbf{X})}\right] \quad \text{s.t.}\ \mathbb{E}_{g}[S_i(\mathbf{X})] = b_i,\ i = 1, \dots, m, \quad \int_{\mathcal{X}} g(\mathbf{x})\, \mathrm{d}\mathbf{x} = 1.$$

Here, $g$ and $h$ are $n$-dimensional pdfs; $S_i(\mathbf{x})$, $i = 1, \dots, m$, are given functions; and $\mathbf{b} = (b_1, \dots, b_m)$ is an $m$-dimensional vector. The pdf $h$ is assumed to be known and is called the prior pdf. The program (3.4) is called the classic minimum cross-entropy or simply the MinxEnt program. If the prior $h$ is constant, then $\mathcal{D}(g, h) = \int g \ln g\, \mathrm{d}\mathbf{x} + \text{const}$, so that the minimization of $\mathcal{D}(g, h)$ in (3.4) can be replaced with the maximization of

3.5
$$\mathcal{H}(g) = -\int_{\mathcal{X}} g(\mathbf{x}) \ln g(\mathbf{x})\, \mathrm{d}\mathbf{x} = -\mathbb{E}_{g}[\ln g(\mathbf{X})],$$

where $\mathcal{H}(g)$ is the Shannon entropy [63]. The corresponding program is called Jaynes' MaxEnt program. Note that the former minimizes the Kullback-Leibler cross-entropy, while the latter maximizes the Shannon entropy [63]. For a good paper on the generalization of MinxEnt, see [16].

The MinxEnt program, which under mild conditions [8] presents a convex constrained functional optimization problem, can be solved via Lagrange multipliers. The solution is given by [8]

3.6
$$g(\mathbf{x}) = \frac{h(\mathbf{x}) \exp\left\{\sum_{i=1}^{m} \lambda_i S_i(\mathbf{x})\right\}}{\mathbb{E}_{h}\left[\exp\left\{\sum_{i=1}^{m} \lambda_i S_i(\mathbf{X})\right\}\right]},$$

where the Lagrange multipliers $\lambda_1, \dots, \lambda_m$ are obtained from the solution of the following system of equations:

3.7
$$\frac{\mathbb{E}_{h}\left[S_j(\mathbf{X}) \exp\left\{\sum_{i=1}^{m} \lambda_i S_i(\mathbf{X})\right\}\right]}{\mathbb{E}_{h}\left[\exp\left\{\sum_{i=1}^{m} \lambda_i S_i(\mathbf{X})\right\}\right]} = b_j, \qquad j = 1, \dots, m.$$

The MinxEnt solution $g(\mathbf{x})$ can be written as

3.8
$$g(\mathbf{x}) = C^{-1}\, h(\mathbf{x}) \exp\left\{\sum_{i=1}^{m} \lambda_i S_i(\mathbf{x})\right\},$$

where

3.9
$$C = \mathbb{E}_{h}\left[\exp\left\{\sum_{i=1}^{m} \lambda_i S_i(\mathbf{X})\right\}\right]$$

is the normalization constant.

In the particular case where each $S_i(\mathbf{x})$ is coordinate-wise separable, that is,

3.10
$$S_i(\mathbf{x}) = \sum_{j=1}^{n} S_{ij}(x_j), \qquad i = 1, \dots, m,$$

and the components $X_1, \dots, X_n$ of the random vector $\mathbf{X}$ are independent under the prior $h$, the joint pdf $g(\mathbf{x})$ in (3.6) reduces to the product of its marginal pdfs. In such a case, we say that $g(\mathbf{x})$ is decomposable.

In particular, the $j$-th marginal of $g(\mathbf{x})$ can be written as

3.11
$$g_j(x_j) = \frac{h_j(x_j) \exp\left\{\sum_{i=1}^{m} \lambda_i S_{ij}(x_j)\right\}}{\mathbb{E}_{h_j}\left[\exp\left\{\sum_{i=1}^{m} \lambda_i S_{ij}(X_j)\right\}\right]}, \qquad j = 1, \dots, n.$$

Remark 3.1
It is well known [34] that the optimal solution of the single-dimensional, single-constrained MinxEnt program

3.12
$$\min_{g}\ \mathcal{D}(g, h) \quad \text{s.t.}\ \mathbb{E}_{g}[S(\mathbf{X})] = b, \quad \int_{\mathcal{X}} g(\mathbf{x})\, \mathrm{d}\mathbf{x} = 1$$

coincides with the celebrated optimal exponential change of measure (ECM). Note that typically in a multidimensional ECM one twists each component separately, using possibly different twisting parameters. In contrast, the optimal solution to the MinxEnt program is parameterized by a single scalar $\lambda$, so in the multidimensional case ECM differs from MinxEnt.

Example 3.1 Die Tossing
To obtain better insight into the MinxEnt program, consider a particular case of (3.12) associated with die tossing. We take $S(X) = X$, the face value, and let $\mathbf{q} = (q_1, \dots, q_6)$ be a given discrete distribution (the prior) over the six faces of the die, where $\mathbf{q}$ denotes the nominal parameter vector. We restrict ourselves to densities that are parameterized in this way; that is, $\mathbf{p} = (p_1, \dots, p_6)$ with $p_j \ge 0$ for all $j$, such that $\sum_{j=1}^{6} p_j = 1$. Then we denote the Kullback-Leibler cross-entropy by $\mathcal{D}(\mathbf{p}, \mathbf{q})$, which stands for $\sum_{j=1}^{6} p_j \ln(p_j / q_j)$. Similarly, the Shannon entropy is abbreviated to $\mathcal{H}(\mathbf{p})$.
In this die problem it is readily seen that the functional program (3.12) leads to the following parametric one:

3.13
$$\min_{\mathbf{p}}\ \sum_{j=1}^{6} p_j \ln \frac{p_j}{q_j} \quad \text{s.t.}\ \sum_{j=1}^{6} j\, p_j = b, \quad \sum_{j=1}^{6} p_j = 1.$$

The optimal parameter vector $\mathbf{p}^* = (p_1^*, \dots, p_6^*)$, derived from the solution of (3.13), can be written component-wise as

3.14
$$p_j^* = \frac{q_j\, \mathrm{e}^{\lambda j}}{\sum_{k=1}^{6} q_k\, \mathrm{e}^{\lambda k}}, \qquad j = 1, \dots, 6,$$

where $\lambda$ is derived from the numerical solution of

3.15
$$\frac{\sum_{j=1}^{6} j\, q_j\, \mathrm{e}^{\lambda j}}{\sum_{k=1}^{6} q_k\, \mathrm{e}^{\lambda k}} = b.$$

Table 3.1, which is an exact replica of Table 4.1 of [63], presents $\lambda$, $\mathbf{p}^*$, and the entropy $\mathcal{H}(\mathbf{p}^*)$ as functions of $b$ for a fair die, that is, with the prior $q_j = 1/6$, $j = 1, \dots, 6$. The table is self-explanatory.

Table 3.1 c03-math-0096, c03-math-0097, and c03-math-0098 as Function of c03-math-0099 for a Fair Die
c03-tab-0001
Note that
  • If $b = 3.5$, then $\lambda = 0$ and, thus, $\mathbf{p}^* = \mathbf{q}$.
  • $\mathcal{H}(\mathbf{p}^*)$ is strictly concave in $b$, and the maximal entropy equals $\ln 6 \approx 1.79$.
  • For the extreme values of $b$, that is, for $b = 1$ and $b = 6$, the corresponding optimal solutions are $\mathbf{p}^* = (1, 0, 0, 0, 0, 0)$ and $\mathbf{p}^* = (0, 0, 0, 0, 0, 1)$, respectively; that is, the pdf $\mathbf{p}^*$ becomes degenerate. For these cases:
    1. The entropy is $\mathcal{H}(\mathbf{p}^*) = 0$, and thus there is no uncertainty (for both degenerate vectors).
    2. For $b = 1$ and $b = 6$ we have $\lambda = -\infty$ and $\lambda = \infty$, respectively. This important observation is in the spirit of Pincus' Theorem 3.1 [94] and will play an important role below.
    3. It can also be readily shown that $\mathbf{p}^*$ is degenerate at $b = 1$ and $b = 6$ regardless of the prior $\mathbf{q}$.squ

The above observations for the die example can be readily extended to the case where, instead of $S(X) = X$, one considers a linear performance function with arbitrary coefficients.
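For illustration, equation (3.15) can be solved numerically by bisection, since the tilted mean on its left-hand side is increasing in $\lambda$. The following sketch (function name, bracketing interval, and tolerance are our own choices, not from the source) recovers the tilted die distribution (3.14) for a given level $b$:

```python
import math

def minxent_die(b, q=None, lo=-50.0, hi=50.0, tol=1e-10):
    """Solve the die MinxEnt program (3.13) numerically: find lambda such that
    sum_j j*q_j*e^(lam*j) / sum_k q_k*e^(lam*k) = b, then return
    (lambda, tilted distribution p*)."""
    faces = range(1, 7)
    if q is None:
        q = [1.0 / 6] * 6                     # fair-die prior

    def tilted_mean(lam):
        w = [q[j - 1] * math.exp(lam * j) for j in faces]
        z = sum(w)
        return sum(j * wj for j, wj in zip(faces, w)) / z

    # The tilted mean is strictly increasing in lambda, so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if tilted_mean(mid) < b:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    w = [q[j - 1] * math.exp(lam * j) for j in faces]
    z = sum(w)
    return lam, [wj / z for wj in w]

lam, p = minxent_die(5.0)   # tilt a fair die so that its mean face value is 5
```

For $b = 3.5$ the routine returns $\lambda \approx 0$ and the uniform distribution, and as $b$ approaches 6 the distribution degenerates on the sixth face, matching the observations above.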

Remark 3.2
Taking into account that MaxEnt (with the objective function $\mathcal{H}(g)$; see (3.5)) can be viewed as a particular case of MinxEnt with constant prior $h(\mathbf{x})$, we can rewrite the basic MinxEnt formulas (3.6) and (3.7) as

3.16
$$g(\mathbf{x}) = \frac{\exp\left\{\sum_{i=1}^{m} \lambda_i S_i(\mathbf{x})\right\}}{\int_{\mathcal{X}} \exp\left\{\sum_{i=1}^{m} \lambda_i S_i(\mathbf{y})\right\} \mathrm{d}\mathbf{y}},$$

where $\lambda_1, \dots, \lambda_m$ are obtained from the solution of the following system of equations:

3.17
$$\frac{\int_{\mathcal{X}} S_j(\mathbf{x}) \exp\left\{\sum_{i=1}^{m} \lambda_i S_i(\mathbf{x})\right\} \mathrm{d}\mathbf{x}}{\int_{\mathcal{X}} \exp\left\{\sum_{i=1}^{m} \lambda_i S_i(\mathbf{y})\right\} \mathrm{d}\mathbf{y}} = b_j, \qquad j = 1, \dots, m.$$

We extend next the MinxEnt program (3.4) to both equality and inequality constraints; that is, we consider the following general MinxEnt program:

3.18
$$\min_{g}\ \mathcal{D}(g, h) \quad \text{s.t.}\ \mathbb{E}_{g}[S_i(\mathbf{X})] \ge b_i,\ i = 1, \dots, k, \quad \mathbb{E}_{g}[S_i(\mathbf{X})] = b_i,\ i = k + 1, \dots, m, \quad \int_{\mathcal{X}} g(\mathbf{x})\, \mathrm{d}\mathbf{x} = 1.$$

In this case, applying the Kuhn-Tucker conditions [75] to the program (3.18), we readily obtain that $g(\mathbf{x})$ remains the same as in (3.6), while the multipliers $\lambda_1, \dots, \lambda_m$ are found from the solution of the following convex program:

3.19
$$\min_{\boldsymbol{\lambda}}\ \left\{\ln \mathbb{E}_{h}\left[\exp\left\{\sum_{i=1}^{m} \lambda_i S_i(\mathbf{X})\right\}\right] - \sum_{i=1}^{m} \lambda_i b_i\right\} \quad \text{s.t.}\ \lambda_i \ge 0,\ i = 1, \dots, k.$$

3.3 Rare Events and MinxEnt

To establish the connection between MinxEnt and rare events, we consider the problem of computing a rare-event probability of the form

3.20
$$\ell = \mathbb{P}_{f}(S(\mathbf{X}) \ge b) = \mathbb{E}_{f}\left[I_{\{S(\mathbf{X}) \ge b\}}\right].$$

Here, $\mathbf{X}$ is a random vector with pdf $f$, $S(\mathbf{x})$ is a performance function, and the level $b$ is such that $\{S(\mathbf{X}) \ge b\}$ is a rare event.

Suppose that we estimate $\ell$ by the importance sampling estimator

3.21
$$\widehat{\ell} = \frac{1}{N} \sum_{k=1}^{N} I_{\{S(\mathbf{X}_k) \ge b\}}\, \frac{f(\mathbf{X}_k)}{g(\mathbf{X}_k)}, \qquad \mathbf{X}_1, \dots, \mathbf{X}_N \sim g,$$

using the importance sampling pdf $g$ (see Section 2.2). Then the MinxEnt program that determines $g$ has the following single-constrained formulation:

3.22
$$\min_{g}\ \mathcal{D}(g, f) \quad \text{s.t.}\ \mathbb{E}_{g}[S(\mathbf{X})] = b, \quad \int_{\mathcal{X}} g(\mathbf{x})\, \mathrm{d}\mathbf{x} = 1.$$

The solution is (cf. (3.6))

3.23
$$g(\mathbf{x}) = \frac{f(\mathbf{x})\, \mathrm{e}^{\lambda S(\mathbf{x})}}{\mathbb{E}_{f}\left[\mathrm{e}^{\lambda S(\mathbf{X})}\right]},$$

and $\lambda$ solves

3.24
$$\frac{\mathbb{E}_{f}\left[S(\mathbf{X})\, \mathrm{e}^{\lambda S(\mathbf{X})}\right]}{\mathbb{E}_{f}\left[\mathrm{e}^{\lambda S(\mathbf{X})}\right]} = b.$$

We call (3.22) the basic MinxEnt (BME) program and (3.23), (3.24) its solution. This is the nonparametric setting, in which the solution $g$ is generally a difficult pdf to sample from. As a typical example, suppose that the prior $f$ is the uniform pdf on a bounded state space $\mathcal{X}$, that is, $f(\mathbf{x}) = 1/|\mathcal{X}|$, where $|\mathcal{X}|$ denotes the total area of the set $\mathcal{X}$. The rare event is $\{S(\mathbf{X}) \ge b\}$. The importance sampling estimator (3.21) using the MinxEnt solution (3.23) becomes

$$\widehat{\ell} = \frac{1}{N} \sum_{k=1}^{N} I_{\{S(\mathbf{X}_k) \ge b\}}\, \mathbb{E}_{f}\left[\mathrm{e}^{\lambda S(\mathbf{X})}\right] \mathrm{e}^{-\lambda S(\mathbf{X}_k)},$$

where $\mathbf{X}_1, \dots, \mathbf{X}_N$ is a random sample from $g$.

We give two simple examples; the first one leads to a simple simulation procedure from c03-math-0174, the second one does not.

Example 3.2
Suppose that $\mathcal{X} = [0, 1]^2$, $S(\mathbf{x}) = x_1 + x_2$, and $1 < b < 2$. In this case $f(\mathbf{x}) = 1$ on $[0, 1]^2$, and

$$\mathbb{E}_{f}\left[\mathrm{e}^{\lambda S(\mathbf{X})}\right] = \mathbb{E}\left[\mathrm{e}^{\lambda (X_1 + X_2)}\right] = \left(\mathbb{E}\left[\mathrm{e}^{\lambda X_1}\right]\right)^2,$$

where $X_1$ and $X_2$ are independent uniform $(0, 1)$ random variables. Thus,

$$\mathbb{E}_{f}\left[\mathrm{e}^{\lambda S(\mathbf{X})}\right] = \left(\frac{\mathrm{e}^{\lambda} - 1}{\lambda}\right)^2.$$

The parameter $\lambda$ solves

$$2\left(\frac{\mathrm{e}^{\lambda}}{\mathrm{e}^{\lambda} - 1} - \frac{1}{\lambda}\right) = b.$$

This is easily solved numerically for any given $b$. The importance sampling pdf becomes

$$g(\mathbf{x}) = \left(\frac{\lambda}{\mathrm{e}^{\lambda} - 1}\right)^2 \mathrm{e}^{\lambda (x_1 + x_2)}$$

on $[0, 1]^2$, otherwise $g(\mathbf{x}) = 0$. Hence, under the importance sampling pdf $g$, $X_1$ and $X_2$ are again independent but have a conditional exponential distribution with parameter $\lambda$, conditioned on the interval $[0, 1]$ (i.e., density proportional to $\mathrm{e}^{\lambda x}$ on $[0, 1]$). Generating samples from $g$ is straightforward.squ
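A minimal numerical sketch of this example (the level $b = 1.9$, the sample size, and the seed are our own illustrative choices): solve for $\lambda$ by bisection, then estimate $\ell = \mathbb{P}(X_1 + X_2 \ge b)$ by importance sampling from the tilted marginals, using inverse-CDF sampling for the truncated exponential density:

```python
import math
import random

def solve_lambda(b, lo=1e-9, hi=100.0, tol=1e-12):
    """Solve 2*(e^l/(e^l - 1) - 1/l) = b for l by bisection (1 < b < 2);
    the tilted mean is increasing in l."""
    def tilted_mean(l):
        return 2.0 * (math.exp(l) / (math.exp(l) - 1.0) - 1.0 / l)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if tilted_mean(mid) < b:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def is_estimate(b, n=200_000, seed=1):
    """IS estimate of P(X1 + X2 >= b) for X uniform on [0,1]^2, sampling each
    coordinate from the tilted density proportional to e^(lam*x) on [0,1]."""
    lam = solve_lambda(b)
    rng = random.Random(seed)
    c = ((math.exp(lam) - 1.0) / lam) ** 2  # normalizing constant E_f[e^(lam*S)]
    total = 0.0
    for _ in range(n):
        # inverse-CDF sampling from g_i(x) = lam*e^(lam*x)/(e^lam - 1) on [0,1]
        x1 = math.log(1.0 + rng.random() * (math.exp(lam) - 1.0)) / lam
        x2 = math.log(1.0 + rng.random() * (math.exp(lam) - 1.0)) / lam
        if x1 + x2 >= b:
            total += c * math.exp(-lam * (x1 + x2))
    return total / n

est = is_estimate(1.9)
```

Here the exact value is the area of the corner triangle, $(2 - b)^2 / 2 = 0.005$ for $b = 1.9$, so the estimate can be checked directly.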

Example 3.3
Suppose that $\mathcal{X}$ is the unit disk $\{\mathbf{x} : x_1^2 + x_2^2 \le 1\}$, $S(\mathbf{x}) = x_1 + x_2$, and $0 < b < \sqrt{2}$. Thus, $f(\mathbf{x}) = 1/\pi$ on $\mathcal{X}$, and

$$\ell = \frac{|\mathcal{A}_b|}{\pi},$$

where $\mathcal{A}_b$ is the circle segment above the chord $x_1 + x_2 = b$. Clearly, $\ell \to 0$ when $b \to \sqrt{2}$. Note that $X_1$ and $X_2$ are dependent. Generating samples from $g$ is easy and can be done, for example, by acceptance-rejection from uniform samples in the square $[-1, 1]^2$. In this example we get explicit expressions neither for the normalizing constant $\mathbb{E}_{f}[\mathrm{e}^{\lambda S(\mathbf{X})}]$ nor for the importance sampling pdf $g$.squ

As we already mentioned in Chapter 2, the cross-entropy method assumes that the prior pdf belongs to a parameterized family $\mathcal{F} = \{f(\cdot; \mathbf{v})\}$ of pdfs (2.6), where $\mathbf{u}$ denotes the nominal parameter vector of the prior $f(\cdot; \mathbf{u})$. The solution $g$ to the MinxEnt program as given in (3.23) is generally not a pdf in that parameterized family. In some cases it is possible to find an approximating pdf $f(\cdot; \mathbf{v})$, as in the following setting.

Definition 3.1 Marginal Expectation Parameterization
We say that the parameterized family $\mathcal{F} = \{f(\cdot; \mathbf{v})\}$ of pdfs satisfies the marginal expectation parameterization when

$$\mathbb{E}_{\mathbf{v}}[X_j] = v_j, \qquad j = 1, \dots, n.$$

Let $f(\cdot; \mathbf{u}) \in \mathcal{F}$ be the prior pdf. The MinxEnt solution for this prior is

3.25
$$g(\mathbf{x}) = \frac{f(\mathbf{x}; \mathbf{u})\, \mathrm{e}^{\lambda S(\mathbf{x})}}{\mathbb{E}_{\mathbf{u}}\left[\mathrm{e}^{\lambda S(\mathbf{X})}\right]}.$$

Suppose that the family $\mathcal{F}$ satisfies the marginal expectation parameterization. Then the above analysis suggests carrying out the importance sampling with a parameter vector $\mathbf{v}$ whose components are the marginal expectations of the pdf $g$ given in (3.25). Hence, we approximate the MinxEnt solution $g$ by $f(\cdot; \mathbf{v})$ with

3.26
$$v_j = \mathbb{E}_{g}[X_j] = \frac{\mathbb{E}_{\mathbf{u}}\left[X_j\, \mathrm{e}^{\lambda S(\mathbf{X})}\right]}{\mathbb{E}_{\mathbf{u}}\left[\mathrm{e}^{\lambda S(\mathbf{X})}\right]}, \qquad j = 1, \dots, n.$$

Note that formula (3.26) is similar to the corresponding cross-entropy solution [107]

3.27
$$v_j = \frac{\mathbb{E}_{\mathbf{u}}\left[X_j\, I_{\{S(\mathbf{X}) \ge b\}}\right]}{\mathbb{E}_{\mathbf{u}}\left[I_{\{S(\mathbf{X}) \ge b\}}\right]}, \qquad j = 1, \dots, n,$$

with one main difference: the indicator function $I_{\{S(\mathbf{X}) \ge b\}}$ in the CE formula is replaced by the exponential $\mathrm{e}^{\lambda S(\mathbf{X})}$ in MinxEnt.

Example 3.4
Let $\mathbf{X} = (X_1, \dots, X_n)$ be a binary random vector with independent components and probabilities

$$\mathbb{P}(X_j = 1) = u_j, \qquad j = 1, \dots, n.$$

In other words,

$$f(\mathbf{x}; \mathbf{u}) = \prod_{j=1}^{n} u_j^{x_j} (1 - u_j)^{1 - x_j}.$$

Assume further that the performance function is $S(\mathbf{x}) = \sum_{j=1}^{n} x_j$, and we want to find the optimal parameter vector $\mathbf{v}$ to calculate the probability of the event $\{S(\mathbf{X}) \ge b\}$.

Because $S(\mathbf{x})$ is decomposable, the MinxEnt program results in a product form

$$g(\mathbf{x}) = \prod_{j=1}^{n} v_j^{x_j} (1 - v_j)^{1 - x_j}$$

with

$$v_j = \frac{u_j\, \mathrm{e}^{\lambda}}{1 - u_j + u_j\, \mathrm{e}^{\lambda}}, \qquad j = 1, \dots, n.$$

The parameter $\lambda$ is obtained by solving

$$\sum_{j=1}^{n} \frac{u_j\, \mathrm{e}^{\lambda}}{1 - u_j + u_j\, \mathrm{e}^{\lambda}} = b.$$

This equation is solved numerically, resulting in c03-math-0242 and

equation

The CE solution is given in (3.27), which is estimated by the adaptive CE algorithm. Note that in this simple example the MinxEnt approach does not need to learn the optimal parameters adaptively; it only requires a numerical procedure to find them. We executed 100 simulation experiments of the CE and BME methods with different seeds and measured their average statistical and computing performance. The CE algorithm used a sample size c03-math-0244 and rarity parameter c03-math-0245 at each iteration; the Lagrange parameter c03-math-0246 was found by bisection with an error of c03-math-0247. In the final importance sampling estimator, we used 2,000,000 samples. We found that under these assumptions the computing times were about the same, but that the BME variance is smaller than the CE variance: the reduction is 72%, where

equation

The gain is

equation

squ
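The Bernoulli tilting in this example can be sketched as follows (the instance values $n = 10$, $u_j = 0.25$, $b = 8$, and the sample size are our own illustrative choices, not the elided values used in the experiment above):

```python
import math
import random

def tilt(u, lam):
    """Marginal MinxEnt/ECM tilting of Bernoulli(u_j) probabilities."""
    return [uj * math.exp(lam) / (1.0 - uj + uj * math.exp(lam)) for uj in u]

def solve_lambda(u, b, lo=-50.0, hi=50.0, tol=1e-12):
    """Find lam such that the tilted means sum to b (bisection; the sum of
    tilted means is increasing in lam)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if sum(tilt(u, mid)) < b:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def bme_estimate(u, b, n_samples=100_000, seed=7):
    """Estimate P(sum_j X_j >= b) by importance sampling from the tilted pmf."""
    lam = solve_lambda(u, b)
    v = tilt(u, lam)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = [1 if rng.random() < vj else 0 for vj in v]
        if sum(x) >= b:
            w = 1.0  # likelihood ratio f(x;u)/f(x;v)
            for xj, uj, vj in zip(x, u, v):
                w *= (uj / vj) if xj else ((1 - uj) / (1 - vj))
            total += w
    return total / n_samples

u = [0.25] * 10
est = bme_estimate(u, b=8)
```

Since the components are iid here, the result can be checked against the exact binomial tail probability $\mathbb{P}(\mathrm{Bin}(10, 0.25) \ge 8)$.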

3.4 Indicator MinxEnt Method

Consider the set

3.28
$$\mathcal{X}^* = \left\{\mathbf{x} \in \mathcal{X} : S_i(\mathbf{x}) \ge b_i,\ i = 1, \dots, m\right\},$$

where $S_i(\mathbf{x})$, $i = 1, \dots, m$, are arbitrary functions. As before, we assume that the random vector $\mathbf{X}$ is distributed according to a prior pdf $h(\mathbf{x})$.

Hence, we associate with (3.28) the following multiple-event probability:

3.29
$$\ell = \mathbb{P}_{h}\left(\bigcap_{i=1}^{m} \{S_i(\mathbf{X}) \ge b_i\}\right) = \mathbb{E}_{h}\left[\prod_{i=1}^{m} I_{\{S_i(\mathbf{X}) \ge b_i\}}\right].$$

Note that (3.29) extends (3.20) in the sense that it involves an intersection of $m$ events $\{S_i(\mathbf{X}) \ge b_i\}$, that is, multiple events rather than a single one. Note also that some of the constraints may be equalities, that is, $S_i(\mathbf{x}) = b_i$. Note finally that (3.29) has some interesting applications in rare-event simulation. For example, in a queueing model one might be interested in estimating the probability of the simultaneous occurrence of two events, one associated with buffer overflow (the number of customers in the system is at least some level) and the other associated with the sojourn time (the waiting time of the customers in the queueing system is at least some level).

We assume that each individual event $\{S_i(\mathbf{X}) \ge b_i\}$ is not rare, that is, each probability $\mathbb{P}_h(S_i(\mathbf{X}) \ge b_i)$ is not a rare-event probability, say c03-math-0267, but their intersection forms a rare-event probability $\ell$. Similar to the single-event case in (3.20), we are interested in the efficient estimation of $\ell$ as defined in (3.29).

The main idea of the indicator MinxEnt approach is to design an importance sampling pdf $g$ such that under $g$ all constraints $S_i(\mathbf{x}) \ge b_i$, $i = 1, \dots, m$, are fulfilled with certainty. For that purpose, we define

3.30
$$C(\mathbf{x}) = \sum_{i=1}^{m} I_{\{S_i(\mathbf{x}) \ge b_i\}},$$

the number of constraints satisfied at $\mathbf{x}$. Then the indicator MinxEnt program is defined by the following single-constrained MinxEnt program:

3.31
$$\min_{g}\ \mathcal{D}(g, h) \quad \text{s.t.}\ \mathbb{E}_{g}[C(\mathbf{X})] = m, \quad \int_{\mathcal{X}} g(\mathbf{x})\, \mathrm{d}\mathbf{x} = 1.$$

Its solution (see Section 3.2) is

3.32
$$g(\mathbf{x}) = \frac{h(\mathbf{x})\, \mathrm{e}^{\lambda C(\mathbf{x})}}{\mathbb{E}_{h}\left[\mathrm{e}^{\lambda C(\mathbf{X})}\right]},$$

where $\lambda$ is obtained from the solution of the following equation:

3.33
$$\frac{\mathbb{E}_{h}\left[C(\mathbf{X})\, \mathrm{e}^{\lambda C(\mathbf{X})}\right]}{\mathbb{E}_{h}\left[\mathrm{e}^{\lambda C(\mathbf{X})}\right]} = m.$$

We call (3.31) the indicator MinxEnt (IME) program and (3.32), (3.33) its solution.

When (3.33) has no solution, then, in fact, the set c03-math-0280 is empty.

We will use $g$ to estimate the rare-event probability $\ell$ given in (3.29). In the forthcoming Lemmas 3.1 and 3.2 we will show that $g$ coincides with the zero-variance optimal importance sampling pdf, that is, with

3.34
$$g^*(\mathbf{x}) = \frac{h(\mathbf{x}) \prod_{i=1}^{m} I_{\{S_i(\mathbf{x}) \ge b_i\}}}{\ell}.$$

Specifically, this gives

3.35
$$g^*(\mathbf{x}) = \frac{h(\mathbf{x})\, I_{\{C(\mathbf{x}) = m\}}}{\ell}.$$

Note that the classic multiconstrained MinxEnt program (3.4) involves expectations of the functions $S_i(\mathbf{X})$, while the proposed single-constrained program (3.31) is based on the expectation of the sum of the indicators of the events $\{S_i(\mathbf{X}) \ge b_i\}$; hence the name indicator MinxEnt program, or simply IME program.

For $m = 1$ the IME program (3.31) reduces to

3.36
$$\min_{g}\ \mathcal{D}(g, h) \quad \text{s.t.}\ \mathbb{E}_{g}\left[I_{\{S(\mathbf{X}) \ge b\}}\right] = 1, \quad \int_{\mathcal{X}} g(\mathbf{x})\, \mathrm{d}\mathbf{x} = 1,$$

where $C(\mathbf{x}) = I_{\{S(\mathbf{x}) \ge b\}}$.

Observe also that, in this case, the single-constrained programs (3.36) and (3.22) do not coincide: in the former we use an expectation of the indicator of the event $\{S(\mathbf{X}) \ge b\}$, that is, $\mathbb{E}_{g}[I_{\{S(\mathbf{X}) \ge b\}}]$, while in the latter we use an expectation of $S(\mathbf{X})$ itself, that is, $\mathbb{E}_{g}[S(\mathbf{X})]$. We shall treat the program (3.36) in more detail in Section 3.5.

Lemma 3.1
The optimal c03-math-0296 of the IME program (3.31) satisfying (3.33) is c03-math-0297.
Proof
The proof is given for a discrete domain c03-math-0298. For a continuous domain we replace the summations by integrations.
To prove that the optimal c03-math-0299 of the IME program (3.31) is c03-math-0300, we proceed as follows. Denoting, as before, c03-math-0301 we can write (3.33) as
equation

where we used in equality (i) that c03-math-0303 when c03-math-0304 because c03-math-0305.squ

Lemma 3.2
For c03-math-0306 the optimal IME pdf c03-math-0307 in (3.32) corresponds to the zero-variance importance sampling pdf (3.34).
Proof
The proof is given for a discrete domain c03-math-0308. For a continuous domain we replace the summations by integrations. By Lemma 3.1, c03-math-0309 and, denoting as before c03-math-0310, we can write (3.32) as
equation

squ

Observe again that generating samples from a multidimensional Boltzmann pdf, like $g$ in (3.32), is a difficult task. One might apply Markov chain Monte Carlo algorithms [108], but these are time-consuming.

Assume further that the prior $h$ is given as a parameterized pdf $f(\mathbf{x}; \mathbf{u})$. Then, similar to [103], we shall approximate $g$ in (3.32) by the product of the marginal pdfs of a parameterized pdf $f(\mathbf{x}; \mathbf{p})$. That is, $g(\mathbf{x}) \approx f(\mathbf{x}; \mathbf{p})$, with

3.37
$$p_j = \frac{\mathbb{E}_{\mathbf{u}}\left[X_j\, \mathrm{e}^{\lambda C(\mathbf{X})}\right]}{\mathbb{E}_{\mathbf{u}}\left[\mathrm{e}^{\lambda C(\mathbf{X})}\right]}, \qquad j = 1, \dots, n,$$

which coincides with (3.26) up to the notation. A further restriction assumes that each component $X_j$ of $\mathbf{X}$ is a discrete random variable with values in a finite set, say $\{0, 1, \dots, r\}$. Then (3.37) extends to

3.38
$$p_{j, y} = \frac{\mathbb{E}_{\mathbf{u}}\left[I_{\{X_j = y\}}\, \mathrm{e}^{\lambda C(\mathbf{X})}\right]}{\mathbb{E}_{\mathbf{u}}\left[\mathrm{e}^{\lambda C(\mathbf{X})}\right]}, \qquad j = 1, \dots, n, \quad y = 0, 1, \dots, r.$$

Remark 3.3 The Standard Cross-Entropy Method
Similar to (3.37) (see also (3.27)), we can define the following CE updating formula:

3.39
$$p_j = \frac{\mathbb{E}_{\mathbf{u}}\left[X_j \prod_{i=1}^{m} I_{\{S_i(\mathbf{X}) \ge b_i\}}\right]}{\mathbb{E}_{\mathbf{u}}\left[\prod_{i=1}^{m} I_{\{S_i(\mathbf{X}) \ge b_i\}}\right]}, \qquad j = 1, \dots, n.$$

3.4.1 Connection between CE and IME

To establish the connection between CE and IME we need the following result:

Theorem 3.2
For c03-math-0323 the optimal parameter vector c03-math-0324 in (3.37) of the marginal pdfs of the optimal c03-math-0325 in (3.32) coincides with the c03-math-0326 in (3.39) for the CE method.
Proof
The proof is very similar to Lemma 3.2 and is omitted.squ

Theorem 3.2 is crucial for the foundations of the CE method. Although CE was designed originally in [101] as a heuristic for rare-event estimation and combinatorial optimization problems, Theorem 3.2 states that it has strong connections with the IME program (3.31). The main reason is that the optimal parametric pdf $f(\mathbf{x}; \mathbf{p})$ (with $\mathbf{p}$ as in (3.37)) and the CE pdf (with $\mathbf{p}$ as in (3.39)) obtained heuristically from the solution of the following cross-entropy program

$$\max_{\mathbf{p}}\ \mathbb{E}_{\mathbf{u}}\left[\prod_{i=1}^{m} I_{\{S_i(\mathbf{X}) \ge b_i\}} \ln f(\mathbf{X}; \mathbf{p})\right]$$

are the same, provided $g^*$ is the zero-variance importance sampling pdf.

The crucial difference between the proposed IME method and its CE counterpart lies in their simulation-based versions: the latter always requires generating a sequence of tuples $\{(\gamma_t, \mathbf{p}_t)\}$, while in the former we can fix the temperature parameter $\lambda$ in advance (setting it to a large negative number) and then generate a sequence of parameter vectors $\{\mathbf{p}_t\}$ based on (3.37) alone. In addition, in contrast to CE, neither the elite sample nor the rarity parameter is involved in IME. As a result, the proposed IME algorithm is typically simpler, faster, and at least as accurate as the standard CE method based on (3.39).

3.5 IME Method for Combinatorial Optimization

We consider here unconstrained and constrained optimization, respectively.

3.5.1 Unconstrained Combinatorial Optimization

Consider the following nonsmooth (continuous or discrete) unconstrained optimization program:

$$\min_{\mathbf{x} \in \mathcal{X}}\ S(\mathbf{x}).$$

Before proceeding with optimization, recall that:

1. The basic MinxEnt method of Section 3.3 is based on program (3.22), while the indicator MinxEnt method of Section 3.4 is based on program (3.36).
2. The programs (3.22) and (3.36) are different in the sense that the former imposes the condition $\mathbb{E}_{g}[S(\mathbf{X})] = b$, while the latter imposes the condition $\mathbb{E}_{g}[I_{\{S(\mathbf{X}) \ge b\}}] = 1$.
3. The parameter vector in the basic MinxEnt is updated according to (3.26), where $\lambda$ is obtained from (3.7), while in the indicator MinxEnt it is updated according to (3.37) (with $\lambda$ fixed).
4. The crucial difference between the two methods is that, in the basic MinxEnt, a sequence of triplets $\{(\gamma_t, \lambda_t, \mathbf{p}_t)\}$ is generated [103], while in the indicator MinxEnt only a sequence of tuples $\{(\gamma_t, \mathbf{p}_t)\}$ is generated, with $\lambda$ fixed at a large negative number. The levels $\gamma_t$ are again chosen by considering the $\varrho$ quantile of the sample performance values, similar to the cross-entropy method.

If not stated otherwise, we shall use the indicator MinxEnt.

To proceed, denote by $\gamma^* = \min_{\mathbf{x} \in \mathcal{X}} S(\mathbf{x})$ the optimal function value. In this case, the indicator MinxEnt program becomes

3.40
$$\min_{g}\ \mathcal{D}(g, h) \quad \text{s.t.}\ \mathbb{E}_{g}\left[I_{\{S(\mathbf{X}) > \gamma\}}\right] = 0, \quad \int_{\mathcal{X}} g(\mathbf{x})\, \mathrm{d}\mathbf{x} = 1,$$

with solution

$$g(\mathbf{x}) = \frac{h(\mathbf{x}) \exp\left\{\lambda\, I_{\{S(\mathbf{x}) > \gamma\}}\right\}}{\mathbb{E}_{h}\left[\exp\left\{\lambda\, I_{\{S(\mathbf{X}) > \gamma\}}\right\}\right]}.$$

The corresponding updating of the components of the vector $\mathbf{p}_t$ can be written as

3.41
$$\widehat{p}_{t, j} = \frac{\sum_{k=1}^{N} X_{kj} \exp\left\{\lambda\, I_{\{S(\mathbf{X}_k) > \widehat{\gamma}_t\}}\right\}}{\sum_{k=1}^{N} \exp\left\{\lambda\, I_{\{S(\mathbf{X}_k) > \widehat{\gamma}_t\}}\right\}}, \qquad j = 1, \dots, n,$$

where $\lambda$ is a big negative number.

To motivate the program (3.40), consider again the die rolling example.

Example 3.5 The Die Rolling Example Using Program (3.40)
Table 3.2 presents data similar to Table 3.1 for the die rolling example, now using the MinxEnt program (3.40) with $\lambda$ fixed at a large negative value. In particular, it presents the parameter vector $\mathbf{p}$ (updated according to (3.37)) and the entropy $\mathcal{H}(\mathbf{p})$ as functions of $b$ for a fair die, with the indicator of the event $\{X \ge b\}$. One can see from the comparison of Table 3.2 with Table 3.1 that the entropy in the latter, which is based on the MinxEnt program (3.22), is smaller than in the former.squ
Table 3.2 c03-math-0354 and c03-math-0355 as function of b for a Fair Die while Calculating c03-math-0356
c03-tab-0002
Algorithm 3.1 IME Algorithm for Unconstrained Optimization
1. Initialize: Choose the initial parameter vector $\widehat{\mathbf{p}}_0$, for example, uniformly distributed over $\mathcal{X}$. Set $\lambda$ to a large negative number. Set $t = 1$ (iteration = level counter).
2. Draw: Generate a sample $\mathbf{X}_1, \dots, \mathbf{X}_N$ from the density $f(\mathbf{x}; \widehat{\mathbf{p}}_{t-1})$.
3. Select: Compute the elite sampling value $\widehat{\gamma}_t$ of $S(\mathbf{X}_1), \dots, S(\mathbf{X}_N)$ (similar to CE).
4. Update: Use the same sample $\mathbf{X}_1, \dots, \mathbf{X}_N$ and compute $\widetilde{\mathbf{p}}_t$ according to (3.41).
5. Smooth: Smooth out the vector $\widetilde{\mathbf{p}}_t$ according to

3.42
$$\widehat{\mathbf{p}}_t = \alpha\, \widetilde{\mathbf{p}}_t + (1 - \alpha)\, \widehat{\mathbf{p}}_{t-1},$$

where $\alpha$ $(0 < \alpha < 1)$ is called the smoothing parameter.
6. Iterate: If the stopping criterion is met, stop; otherwise, set $t = t + 1$ and return to Step 2.

As a stopping criterion one can use, for example, the following: if for some fixed small integer $d$,

3.43
$$\widehat{\gamma}_t = \widehat{\gamma}_{t-1} = \cdots = \widehat{\gamma}_{t-d},$$

then stop.
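Putting the steps above together, a toy sketch of Algorithm 3.1 for minimizing a linear function over binary vectors might read as follows (all problem data and parameter values are our own illustrative assumptions; the reweighting is our reading of (3.41) with $\lambda$ large negative, under which elite samples, those with $S(\mathbf{x}) \le \widehat{\gamma}_t$, get weight 1 and the rest get weight $\mathrm{e}^{\lambda} \approx 0$):

```python
import math
import random

def ime_minimize(S, n, N=500, rho=0.1, lam=-100.0, alpha=0.7,
                 d=3, max_iter=200, seed=3):
    """IME sketch for minimizing S over binary vectors in {0,1}^n.
    lam is fixed in advance to a large negative number."""
    rng = random.Random(seed)
    p = [0.5] * n                               # Step 1: uniform prior
    gammas = []
    for _ in range(max_iter):
        xs = [[1 if rng.random() < pj else 0 for pj in p]
              for _ in range(N)]                # Step 2: draw the sample
        scores = [S(x) for x in xs]
        gamma = sorted(scores)[int(rho * N)]    # Step 3: elite level (rho-quantile)
        w = [1.0 if s <= gamma else math.exp(lam) for s in scores]
        wsum = sum(w)                           # Step 4: update per (3.41)
        p_tilde = [sum(wk * x[j] for wk, x in zip(w, xs)) / wsum
                   for j in range(n)]
        p = [alpha * a + (1 - alpha) * b
             for a, b in zip(p_tilde, p)]       # Step 5: smooth per (3.42)
        gammas.append(gamma)                    # Step 6: stop when gamma stabilizes
        if len(gammas) >= d + 1 and len(set(gammas[-(d + 1):])) == 1:
            break
    return [round(pj) for pj in p]

# Toy instance: minimize S(x) = sum_j c_j x_j; the optimum sets x_j = 1 iff c_j < 0.
c = [3, -2, 5, -1, 4, -3]
x_best = ime_minimize(lambda x: sum(cj * xj for cj, xj in zip(c, x)), n=6)
```

The parameter vector degenerates toward an indicator of a (near-)optimal solution, which is then read off by rounding.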

Remark 3.4
Note that, as a modification of Algorithm 3.1, one can adopt a version in which Step 3 is eliminated, that is, all samples are involved in updating $\widehat{\mathbf{p}}_t$ according to (3.41). The reason is that, for a large negative $\lambda$, it makes little difference whether we use the elite sample (as in Step 3) or the entire sample (without Step 3).

3.5.2 Constrained Combinatorial Optimization: The Penalty Function Approach

Consider the following constrained combinatorial optimization problem (with inequality constraints):

3.44
$$\max_{\mathbf{x} \in \{0, 1\}^n}\ S(\mathbf{x}) \quad \text{s.t.}\ \sum_{j=1}^{n} a_{ij} x_j \le b_i, \quad i = 1, \dots, m.$$

Assume in addition that the vector $\mathbf{x}$ is binary and that all coefficients $a_{ij}$ and $b_i$ are positive numbers. Using the penalty method approach, we can reduce the original constrained problem (3.44) to the following unconstrained one:

3.45
$$\max_{\mathbf{x} \in \{0, 1\}^n}\ \widetilde{S}(\mathbf{x}) = S(\mathbf{x}) + M(\mathbf{x}),$$

where the penalty function is defined as

3.46
$$M(\mathbf{x}) = \beta\, \big(Q(\mathbf{x}) - m\big).$$

Here,

3.47
$$Q(\mathbf{x}) = \sum_{i=1}^{m} I_{\left\{\sum_{j=1}^{n} a_{ij} x_j \le b_i\right\}}$$

counts the constraints that $\mathbf{x}$ satisfies, and $\beta$ is a positive number. Note that $M(\mathbf{x})$ is an increasing function of the number of the constraints that $\mathbf{x}$ satisfies, and the penalty parameter $\beta$ is chosen such that $\widetilde{S}(\mathbf{x})$ is negative when and only when at least one of the constraints is not satisfied. If $\mathbf{x}$ satisfies all the constraints in (3.44), then $M(\mathbf{x}) = 0$ and the $\widetilde{S}(\mathbf{x})$ value is equal to the value of the original performance function in (3.44). Clearly, the optimization program (3.44) can again be associated with a rare-event probability estimation problem, where $\mathbf{X}$ is a vector of iid symmetric Bernoulli components.
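A small sketch of this penalty construction (a hypothetical knapsack-style instance; the problem data, and the choice of $\beta$ larger than the maximal objective value, are our own assumptions):

```python
def make_penalized(S, constraints, beta):
    """Return S_tilde(x) = S(x) + beta * (satisfied(x) - m), where
    satisfied(x) counts how many of the m constraints hold at x. With beta
    larger than max S, S_tilde is negative exactly when a constraint fails."""
    m = len(constraints)
    def S_tilde(x):
        satisfied = sum(1 for c in constraints if c(x))
        return S(x) + beta * (satisfied - m)
    return S_tilde

# Toy knapsack: maximize S(x) = 4*x0 + 3*x1 + 5*x2
# subject to 2*x0 + 3*x1 + 4*x2 <= 5 (a single constraint, m = 1).
S = lambda x: 4 * x[0] + 3 * x[1] + 5 * x[2]
constraints = [lambda x: 2 * x[0] + 3 * x[1] + 4 * x[2] <= 5]
S_tilde = make_penalized(S, constraints, beta=13)  # beta > max S = 12

# Exhaustive check on this tiny instance: the penalized optimum is feasible.
best = max(((x0, x1, x2) for x0 in (0, 1) for x1 in (0, 1) for x2 in (0, 1)),
           key=S_tilde)
```

Maximizing $\widetilde{S}$ (e.g., with Algorithm 3.1 adapted to maximization) then recovers the best feasible solution, since every infeasible vector has a negative penalized value.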

Our numerical results with MinxEnt Algorithm 3.1 for different unconstrained and constrained combinatorial optimization problems show that it is comparable with its CE counterpart in both accuracy and CPU time.
