6.4. The Comparison between Statistical Heuristic Search and the A* Algorithm

6.4.1. Comparison to A*

Are the constraints given in Hypothesis I, including the relaxed cases, too strict? To answer this question, we compare them with the well-known A* search.

Definition 6.8

Assume that image is an admissible estimate of image, i.e. image. If for image, we have image, where image is the distance between image and image, then image is called reasonable, i.e., monotonic and admissible.

Proposition 6.2

Assume that G is a uniform m-ary tree, image is an estimate with a typical error of order image [Pea84], image is reasonable, and image is i.i.d. with a finite fourth moment. If algorithm A, using image as its evaluation function, has polynomial mean complexity, then SA search, using image as its statistic, also has polynomial mean complexity.

Proof

Assume that image, image and image. Since image is reasonable, we have image.
Since image is an admissible estimate with a typical error of order image and image searches g with polynomial complexity, from Property 6.1 in Section 6.1.1 we have that for image, there exists image (ε is independent of p) such that

image

There is no loss of generality in assuming that image. We now estimate the difference between image and image.
For the part satisfying image, we have:

image

image

image

Thus, we have:

image

image

Let image, where image. We have:

image(6.19)

Letting image be the local statistic of nodes, we have that when image, image holds. From Formula (6.19), when image, for subtree image, if the 'mean' of the local statistics is used as its global statistic, then the global statistic from image, image, is image, and the global statistic from image, image, is image, i.e., the latter is larger than the former. Moreover, image is equivalent to image, which belongs to the mixed cases discussed in Section 6.3.2. The mean complexity of the corresponding SA is polynomial.
The proposition shows that if image is monotonic, then whenever image is convergent, SA is also convergent.
From Theorem 6.6, it is known that the proposition still holds when image is not monotonic.

Proposition 6.3

Assume that image is a lower estimate of image with a typical error of order image. Let

image

If image has a finite fourth moment, and image and image, image, where d is a constant, then, letting image be a local statistic of nodes, SA search is convergent.

Proof

Assume that image and image. We have

image

image

image

Then,

image

image

image(6.20)

where c is a constant.
Let image. When image, we have:

image

From Theorem 6.6, the corresponding SA is convergent.
The proposition shows that for every lower estimate image of image that makes image search convergent, if image is used as a local statistic, we can always extract a proper global statistic from image such that the corresponding SA search is convergent.
We will show below that the converse is not true, i.e., we can provide a large class of estimates image such that the corresponding SA is convergent but the corresponding image is divergent.

Proposition 6.4

Assume that image is a lower estimate of image and is an image random variable with a finite fourth moment. For image, image, where c is a constant and image. If image is the local statistic of node n, then the corresponding SA is convergent.

Proof

From image, image, we have image. Letting image and image, we have

image

image

From Theorem 6.6, the corresponding SA is convergent.

Corollary 6.6

Assume that image is a lower estimate of image with a typical error of order N and image has a finite fourth moment. For image, image, where c is a constant and image. Letting image be a local statistic of node n, the corresponding SA is convergent.

Proof

Since image is a lower estimate with a typical error of order N, by definition there exists image such that image, image.
We have:

image

i.e., image. From Proposition 6.3, we have the corollary.
According to Pearl's result, under the conditions on image in Corollary 6.6 the corresponding image is exponential, but from Corollary 6.6 SA is convergent. Therefore, the condition for SA search to be convergent is weaker than that for image search.

Corollary 6.7

If image is a lower estimate, image has a finite fourth moment, for image, image are equal, and there exist image and image such that image, then Corollary 6.6 holds.

Corollary 6.8

If image is a lower estimate of image with a typical error of order image, image has a finite fourth moment, and image, image, where d is a constant, then the corresponding image is convergent. Letting image be the local statistic of node n, the corresponding SA is also convergent.

Proof

From Proposition 6.2 and Pearl's result given in Property 6.4, the corollary is obtained.
From the above propositions, it is easy to see that the condition that makes SA convergent is weaker than that of image. On the other hand, the convergence of image is related to estimates with a typical error of order image as defined by Formulas (6.1) and (6.2). It is very difficult to confirm the two constants image and image in those formulas, so the convergence condition of image is difficult to verify. In SA, however, the only requirement for convergence is that the statistics image are independent and have positive variances and finite fourth moments. In general, the distribution of image satisfies these conditions.

6.4.2. Comparison to Other Weighted Techniques

Statistical inference methods can also be applied to weighted heuristic search. Weighted techniques in heuristic search have been investigated by several researchers (Nilsson, 1980; Field et al., 1984). They introduced weighted components into the evaluation function image, so that the relative weights of g(n) and h(n) in the evaluation function can be controlled by:

image

image

image

where image is a weight.
In statically weighted systems, a fixed weight is added to the evaluation functions of all nodes. For example, Pearl investigated a statically weighted system image and showed that the optimal weight is image (for the definition of image and more details, see Pearl (1984b)). But even if the optimal weight is adopted, the exponential complexity still cannot be overcome.
In dynamic weighting, the weight image may vary with the depth of a node in the search tree, for example, image, where e is a constant and N is the depth at which the goal is located. But dynamic weighting still fails to differentiate between the nodes that are on the solution path (image) and those that are not (image). Thus, neither static nor dynamic weighting can improve the search efficiency significantly.
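As a minimal illustration of the two schemes (a sketch only: the exact weighted forms used above are not reproduced here, so the common formulation f(n) = g(n) + w·h(n) with a fixed weight, and a Pohl-style depth-dependent weight, are assumptions):

def f_static(g, h, w):
    # Statically weighted evaluation: the same fixed weight w is applied at every node.
    return g + w * h

def f_dynamic(g, h, depth, N, e=1.0):
    # Dynamically weighted evaluation (assumed Pohl-style form): the weight on h
    # shrinks as the node's depth approaches N, the depth at which the goal lies.
    return g + (1.0 + e * (1.0 - depth / N)) * h

Neither scheme uses any information about whether a node actually lies on the solution path, which is exactly the shortcoming noted above.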
As stated in Section 6.1.1, under certain conditions we may regard heuristic search as a random sampling process. By using statistical inference methods, we can judge whether a path looks more promising than the others, and this information can be used to guide the weighting.
For example, the Wald sequential probability ratio test (SPRT) is used for hypothesis testing in SA search. If, at some search stage, the hypothesis that a subtree T of the search tree contains the solution path is rejected (or simply, subtree T is rejected), then the probability that the subtree contains the goal is low. Rather than pruning T as in SA, a fixed weight image is added to the evaluation function of the nodes in T, i.e., the evaluation function is increased by image, image. If the hypothesis that a subtree T′ contains the goal is accepted, the same weight is added to the evaluation functions of all nodes in the brother-subtrees of T′, i.e., those whose roots are the brothers of the root of T′. If no decision can be made by the statistical inference method, the search continues as in SA. We call this new algorithm the weighted SA search, or WSA. Owing to the weighting, the search is likely to focus on the most promising path. We will show that WSA can improve the search efficiency significantly.
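The sketch below (in Python) renders this idea schematically. It is illustrative only: the data structures, the test_subtree interface standing in for the SPRT of Section 6.2, the point at which the test is invoked, and the way the penalty is propagated along a path are assumptions, not the authors' implementation.

import heapq, itertools

def wsa_search(start, expand, h, is_goal, test_subtree, delta):
    # expand(n) yields (child, step_cost); h(n) is the heuristic estimate;
    # test_subtree(n) returns 'reject', 'accept', or None, standing in for the SPRT
    # applied to statistics gathered in the subtree rooted at n (assumed interface);
    # delta is the fixed weight added to the evaluation of penalised subtrees.
    tie = itertools.count()                                  # tie-breaker so the heap never compares nodes
    open_list = [(h(start), next(tie), 0.0, start, 0.0)]     # (f, tie, g, node, inherited penalty)
    while open_list:
        _, _, g, n, penalty = heapq.heappop(open_list)
        if is_goal(n):
            return n
        children = [(c, cost, test_subtree(c)) for c, cost in expand(n)]
        accepted = any(v == 'accept' for _, _, v in children)
        for child, cost, verdict in children:
            extra = penalty
            if verdict == 'reject':
                extra += delta        # rejected subtree: raise its evaluation rather than prune it
            elif accepted and verdict != 'accept':
                extra += delta        # a brother subtree was accepted: penalise this one
            g_child = g + cost
            heapq.heappush(open_list,
                           (g_child + h(child) + extra, next(tie), g_child, child, extra))
    return None                       # no goal found

Note that taking delta to infinity effectively prunes the penalised subtrees (the SA behaviour), while delta = 0 degenerates to ordinary heuristic search; this is consistent with the discussion in Section 6.4.3.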
For clarity and brevity, in the following discussion we assume that the search space is a uniform 2-ary tree, image. The SPRT (or ASM) is used for hypothesis testing and the given significance level is image, where image. The complexity of an algorithm is defined as the expected number of nodes expanded by the algorithm when a goal is found.
ƒ(n) is an arbitrary global statistic (a subtree evaluation function) constructed from heuristic information that satisfies Hypothesis I.
1. Weighting Methods
There are two cases.
image, image is a known constant and image is the complexity of the original algorithm A (e.g., A) when searching the space. N is the depth at which the goal is located.
Formula image means there exist constants D and E such that

image

when N is large enough. So there is no loss of generality in assuming that image.
In the weighted SA, the weighting method is

image(6.21)

image

The weighting method is

image(6.22)

2. Optimal Weights and the Mean Complexity
image, image is a known constant
The weighted function is image, image is a constant.

Definition 6.9

A subtree is called completely weighted if all its subtrees have been judged to be either rejected or accepted.
The subtree image shown in Fig. 6.8 is completely weighted (the rejected subtrees are marked with the sign 'image'), but subtree image is not completely weighted.
We imagine that if a subtree is not completely weighted, the hypothesis testing is continued until it becomes completely weighted. Obviously, a completely weighted subtree has more expanded nodes than an incompletely weighted one. Thus, if an upper estimate of the mean complexity of the completely weighted subtree is obtained, it is certainly an upper estimate of the mean complexity in general.
image
Figure 6.8 A Completely Weighted and Incompletely Weighted Subtree
We now discuss this upper estimate.
Let T be a completely weighted 2-ary tree and image be the set of nodes at depth d. For image, there exists a unique path from the initial node image to n consisting of d arcs. If, among these d arcs, there are i arcs marked by 'image', node n is referred to as an i-type node, or i-node.
So image can be divided into the following subsets:
0-node: there is only one such node.
1-node: the number of such nodes is image
….
i-node: the number of such nodes is image
d-node: the number of such nodes is image
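The counts above are not reproduced in this extraction; the following sketch only illustrates the counting under the assumption, consistent with the 2-ary structure and with the fact that there is exactly one 0-node, that each of the d arcs on a path is either marked or not, so that the number of i-nodes at depth d is the binomial coefficient C(d, i).

from math import comb

def i_node_counts(d):
    # Assumed counts: if each of the d arcs on the path to a depth-d node is either
    # marked 'rejected' or not, the number of i-nodes at depth d is C(d, i).
    return [comb(d, i) for i in range(d + 1)]

# Sanity check: summed over all types, the counts recover the 2**d nodes at depth d.
assert sum(i_node_counts(10)) == 2 ** 10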
In considering the complexity of finding a goal, we first ignore the cost of the statistical inference. Assume that the goal of the search tree is a 0-node, so that its evaluation is image, where N is the depth at which the goal is located. From algorithm image, it is known that every node for which image must be expanded during the search.
If node n is an i-node, its evaluation function is image. All nodes whose evaluations satisfy the following inequality will be expanded.

image

From image, it’s known that the complexity corresponding to the evaluation function image is image. The mean complexity of each i-node (the probability that an i-node is expanded) is

image

On the other hand, the mean complexity for finding a goal at depth N is at least N. Thus the mean complexity of each i-node is

image

image

When the goal is a 0-node, the upper bound of the mean complexity for computing all nodes at depth d is the following (ignoring the complexity of making the statistical inference).

image

image

image

On the other hand, when image is a constant, it is known from Section 6.2 that the mean computational cost of the SPRT for making the statistical inference at a node is a constant Q. When the goal is a 0-node, accounting for this cost, the mean complexity for computing all nodes at depth d is

image

Similarly, if the goal is an i-node, its evaluation is image, and the computational complexity of each node in the search tree is increased by a factor of image. Thus, when the goal is an i-node, the mean complexity for computing all nodes at depth d is

image

From algorithm SA, the probability that the goal falls into an i-node is image if the given significance level is image, image. At depth N there are image i-nodes, so the probability that the goal is an i-node is

image

Accounting for all possible cases of the goal node, the mean complexity for computing all nodes at depth d is

image

image

Let image. There is an optimal weight image such that image is minimal. The optimal weight image and image.
The upper bound of mean complexity of WSA is

image

Letting image, we have

image(6.23)

where image is a constant.

Theorem 6.10

Assume image, image. There exists an optimal weight image such that

image

The complexity of WSA by using the optimal weight is

image

Proof

Let image, i.e., image. From image, we have

image

We obtain

image

Let image.
If image, for any image, we have

image(6.24)

If image, then Formula (6.24) holds as long as image.
Substituting (6.24) into (6.22), we have

image

Thus, image.
From the theorem we can see that the total number of nodes in a 2-ary tree of depth N is image, image. Therefore, when image, as long as image, we have image.

Theorem 6.11

If image, letting image and using the weighted function image, then image.

Proof

Similar to Theorem 6.10, we have

image

Let image. There exists an optimum image such that image is minimal. Substituting image into the above formula, we have

image

Letting image, we have image.
Thus, when image, image.
Finally, image.
3. Discussion
The estimation of c and a: Generally, it is not known whether image is of type image or image. Even if the type of function is known, the parameters c and a are still unknown.
We propose the following method for estimating c and a.
Assume that at level 2k the number of expanded nodes is image. Then image can be used to estimate c. If c does not change much with k, then image may be regarded as being of type image, where image.
If c approaches zero as k increases, then we consider image. If image is essentially unchanged, then image is regarded as being of type image and image.
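The formulas of this estimation procedure are not reproduced above. As one plausible, assumed way to operationalize the same idea (not the authors' exact estimator), the observed numbers of expanded nodes can be fitted against both an exponential model c·2^(a·k) and a polynomial model c·k^a, and the better-fitting type retained:

import math

def classify_growth(levels, counts):
    # levels: depths k at which the search was observed; counts: expanded nodes at those depths.
    # Fit log(count) linearly against k*log(2) (model c*2**(a*k)) and against log(k)
    # (model c*k**a); return the model with the smaller residual, together with a and c.
    logc = [math.log(c) for c in counts]

    def fit(xs):
        n = len(xs)
        mx, my = sum(xs) / n, sum(logc) / n
        a = sum((x - mx) * (y - my) for x, y in zip(xs, logc)) / \
            sum((x - mx) ** 2 for x in xs)
        b = my - a * mx
        resid = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, logc))
        return a, math.exp(b), resid

    a_exp, c_exp, r_exp = fit([k * math.log(2) for k in levels])
    a_pol, c_pol, r_pol = fit([math.log(k) for k in levels])
    if r_exp < r_pol:
        return 'exponential', a_exp, c_exp
    return 'polynomial', a_pol, c_pol

For example, counts that grow roughly like 3·2^(0.5k) would be classified as exponential with a ≈ 0.5 and c ≈ 3.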
Alterable significance levels: Assume that image is the optimal weight for image. The value image is generally unknown. First, letting image, we have image. Letting image, we have

image

image

Thus,

image

In order to have image, b should be sufficiently small such that

image(6.25)

Since c is unknown, b still cannot be determined from Formula (6.25). To overcome this difficulty, we perform the statistical inference as follows. For testing i-subtrees, the significance level used is image, where image is a sequence that decreases monotonically to zero with i. Thus, when image is sufficiently small, Formula (6.25) will hold. For example, let image, where b is a constant and image is the significance level for testing i-subtrees.
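As a small illustration (the text's exact formula for the schedule is not reproduced, so the geometric decay below is only one assumed example of a sequence that decreases monotonically to zero):

def alpha_i(i, b=0.05):
    # Assumed example schedule: the significance level used for testing i-subtrees
    # halves with each i, so it eventually becomes small enough for (6.25) to hold.
    return b / 2 ** i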
From Section 6.2, it’s known that when using significance level image the mean complexity of the statistical inference is

image

where c is a constant.
Thus, replacing Q by image in Formula image, when significance level is image we have

image

Theorem 6.12

If image, image is a constant, letting the significance level for i-subtrees be image, where b is a constant, then

image

Alterable weighting: When the type of the order of image is unknown, we uniformly adopt the weight function image. Next, we show what happens to the complexity of WSA when image, image, if the weight function image is used.
Since image, the 'additive' weight s can be regarded as a 'multiplicative' weight image, but image is no longer a constant. So we call this alterable weighting.
Assume that image. When a node is a 0-node, image. Consider any node image and assume it is an i-node. The evaluation function of image after weighting is image, where image are the weights along the path from the starting node image to node image.
According to the heuristic search rules, when image, i.e., image, node image will be expanded.
It is known that the goal is located at level N, so the evaluation image may be adopted.
Thus, when s is fixed, image satisfies image.
Let image and image. We have image.
Since image, the mean complexity for testing each i-tree is

image

When the goal is a t-node, image. The mean complexity for testing each i-node is

image

Similar to Theorem 6.10, we have

image(6.26)

Let image and

image

image

When image, image. Thus, when image is sufficiently small, the asymptotic estimation of the above formula can be represented as

image

image

image

Let image. When image is sufficiently small, from the above formula we have image. Substituting image into Formula (6.26), we have

image

Then, we have the following theorem.

Theorem 6.13

If image and the same weighted function image is used, then the order of the mean complexity of WSA is at most image, i.e., at most the same as the order of image.
The theorem shows that when the type of image is unknown, we may adopt image as the weighted evaluation function.

6.4.3. Comparison to Other Methods

1. The Relation Among WSA, SA and Heuristic Search A
If the weighted evaluation function image is adopted, then when image, WSA reduces to the common heuristic search A. If the weighted evaluation function is image, then when image, WSA reduces to A as well.
In the above weighted evaluation functions, if image or image, then WSA reduces to SA, since image or image is equivalent to pruning the corresponding subtrees. Therefore, the SA and A algorithms are two extreme cases of the WSA algorithm. We have also shown that there exist optimal weights image and image for WSA, so the performance of WSA is generally better than that of SA and A.
2. Human Problem-Solving Behavior
SA algorithms are closer to human problem-solving behavior.
Global view: In SA algorithms, statistical inference methods are used as a global judgment tool, so global information can be exploited in the search. This embodies the global view in human problem solving, whereas most computer algorithms, such as search and path planning, use only local information. This inspires us to use mathematical tools that investigate global properties, such as calculus of variations in the large, bifurcation theory, the fixed point principle, and statistical inference, to improve computer problem-solving capacity.
SA algorithms can also be regarded as an application of statistical inference methods to quotient space theory, or as a multi-granular computing strategy that uses both global and local information.
Learning from experience: In successive SA algorithms, the 'successive operation' is similar to learning from previous experience, so that performance can be improved. But the successive operation builds on the following basis: the mean computation of SA in one pass is convergent. Unlike the SA algorithms, image (or BF) generally does not have this property, so the successive operation cannot be used in that algorithm.
Difference or precision: As we know, SA builds on the difference between two statistics, one from paths containing the goal and one from paths not containing the goal. Algorithm image builds on the estimated precision of evaluation functions from different paths. Precise estimates of the statistics can certainly reflect their difference, but estimates that reflect the difference between statistics are not necessarily precise. So it is easy to see that the convergence condition of SA is weaker than that of image.
Judging criteria: In SA, the judgment is based on the difference between the means of statistics from a set of nodes, whereas in image the judgment is based on the difference between single nodes. So the judgment in SA is more reliable than that in image. Moreover, in image the unexpanded nodes must be saved for further comparison, which increases the memory requirement.
Performance: W. Zhang (1988) used the 8-puzzle as an example to compare the performance of WSA and A. The results are shown in Table 6.1. There are 81 instances in total. The performance of SA is superior to A in 60 instances; conversely, A is superior to SA in only 21 instances. The computational costs saved are 23.4% and 20.8%, respectively. As we know, the 8-puzzle is a small problem; its longest solution path contains only 35 moves. We can expect that when the problem size becomes larger, SA will demonstrate greater superiority.
Here, the computational cost is the number of moves, image is the significance level, image is the weight, d is the depth at which WSA makes the statistical inference, and image.

Table 6.1

The Comparison of Performances between WSA and A

WSA Algorithm: image = 0.01, image = 2, d = 3; total number of instances: 81.
image: number of instances 60 (74.1%), computational cost saved 23.4%.
image: number of instances 21 (25.9%), computational cost increased 20.8%.

image
