6.2.1. Enforced Hill-Climbing
Hill-climbing is a greedy search strategy that selects the best successor node under the evaluation function h and commits the search to it. The successor then becomes the current node, and the search continues. Of course, hill-climbing does not necessarily find optimal solutions. Moreover, it can become trapped in state space problem graphs with dead-ends. On the other hand, the method proves extremely efficient for some problems, especially easy ones.
A more stable version is enforced hill-climbing. It commits to a successor node only if that node has a strictly better evaluation than the current node. Since such a node might not be in the immediate neighborhood of the current node, enforced hill-climbing searches for it in a breadth-first manner. The pseudo code for the driver is depicted in Algorithm 6.11; the BFS procedure is shown in Algorithm 6.12. We assume a proper heuristic with h(t) = 0 if and only if t is a goal. An example is provided in Figure 6.5.
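To make the interplay of driver and BFS procedure concrete, here is a minimal Python sketch of enforced hill-climbing. It is an illustration, not the book's pseudo code: the names successors and bfs_for_improvement are assumed, successors(u) is taken to yield the neighbors of u, and h(t) = 0 exactly for goal nodes t.

```python
from collections import deque

def enforced_hill_climbing(start, successors, h):
    """Driver (sketch in the spirit of Alg. 6.11): repeat BFS for a
    strictly better node until a goal (h == 0) is reached."""
    u = start
    while h(u) > 0:
        v = bfs_for_improvement(u, successors, h)
        if v is None:          # no better node reachable: dead-end
            return None
        u = v
    return u

def bfs_for_improvement(u, successors, h):
    """BFS from u for the first node with strictly smaller h-value
    (sketch in the spirit of Alg. 6.12)."""
    queue = deque([u])
    visited = {u}
    while queue:
        v = queue.popleft()
        if h(v) < h(u):
            return v
        for w in successors(v):
            if w not in visited:
                visited.add(w)
                queue.append(w)
    return None
```

On a simple path graph with h equal to the goal distance, the driver walks straight to the goal; on a graph with no improving node it reports failure.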
Theorem 6.1
(Completeness of Enforced Hill-Climbing) If the state space graph contains no dead-ends, then Algorithm 6.12 will find a solution.
Proof
There is only one case in which the algorithm does not find a solution: for some intermediate node u, no node v with better evaluation can be found. Since BFS is a complete search method, however, it will find a node on a solution path with better evaluation. In fact, if BFS were terminated not in the case h(v) < h(u) but in the case h(v) = 0, it would find a full solution path.
If we have an undirected problem graph, then it contains no dead-ends. Moreover, any complete algorithm can be used instead of BFS. However, there is no performance guarantee on the solution path obtained. An illustration of the search plateaus generated by the enforced hill-climbing algorithm is provided in Figure 6.6. The plateaus do not have to be disjoint, as intermediate nodes in one layer can exceed the h-value for which the BFS search was invoked.
Besides being incomplete in directed graphs, the algorithm has other drawbacks. There is evidence that when the heuristic estimate is not very accurate, enforced hill-climbing easily suffers from stagnation or is led astray.
6.2.2. Weighted A*
We often have that the heuristic h drastically underestimates the true distance, so that we can obtain a more realistic estimate by scaling up its influence with regard to some parameter. Although this compromises optimality, it can lead to a significant speedup; this is an appropriate choice when searching under time or space constraints.
If we parameterize the evaluation function as f_l(u) = (1 − l) · g(u) + l · h(u) with l ∈ [0, 1], we obtain a continuous range of best-first search variants A_l, also denoted as weighted A*. For l = 0, we simulate a breadth-first traversal of the problem space; for l = 1, we have greedy best-first search. Algorithm A_0.5 selects nodes in the same order as original A*.
If we choose l appropriately, the monotonicity of f is preserved.
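As a hedged illustration (not the book's pseudo code), a compact Python version of A_l that uses the priority f_l(u) = (1 − l)·g(u) + l·h(u) might look as follows; the name successors is assumed and is taken to yield (node, edge-weight) pairs.

```python
import heapq

def weighted_astar(start, goal, successors, h, l=0.5):
    """Best-first search ordered by f_l(u) = (1 - l)*g(u) + l*h(u).

    l = 1 behaves like greedy best-first search; l = 0.5 expands
    nodes in the same order as original A*. Returns the cost of the
    path on which the goal is first selected (possibly suboptimal
    for l > 0.5)."""
    g = {start: 0}
    open_heap = [(l * h(start), start)]   # f_l(start), g(start) = 0
    while open_heap:
        _, u = heapq.heappop(open_heap)
        if u == goal:
            return g[u]
        for v, w in successors(u):
            if g[u] + w < g.get(v, float('inf')):   # better path to v found
                g[v] = g[u] + w
                heapq.heappush(open_heap, ((1 - l) * g[v] + l * h(v), v))
    return None
```

Varying l between 0 and 1 on the same problem instance shows the trade-off between expansion effort and solution quality discussed above.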
Lemma 6.1
For l ≤ 0.5 and a consistent estimate h, f_l is monotone.
Proof
Since h is consistent, f is monotone; that is, f(v) ≥ f(u) for all edges (u, v) on a solution path. Consequently,

f_l(v) = (1 − l) · g(v) + l · h(v) = (1 − l) · (g(u) + w(u, v)) + l · h(v) ≥ (1 − l) · g(u) + (1 − l) · w(u, v) + l · (h(u) − w(u, v)) = (1 − l) · g(u) + l · h(u) + (1 − 2l) · w(u, v) ≥ f_l(u),

since l ≤ 0.5 implies 1 − 2l ≥ 0.
Let us now relax the restrictions on l to obtain more efficient, though nonadmissible, algorithms. The quality of the solution can still be bounded in the following sense.
Definition 6.1
(ϵ-Optimality) A search algorithm is ϵ-optimal if it terminates with a solution of cost at most (1 + ϵ) · δ(s, T), with ϵ denoting an arbitrarily small positive constant.
Lemma 6.2
A* with the scaled evaluation function f(u) = g(u) + (1 + ϵ) · h(u) for an admissible estimate h is ϵ-optimal.
Proof
For nodes u in Open that satisfy invariant (I) (Lemma 2.2) we have g(u) = δ(s, u) and h(u) ≤ δ(u, T) due to the reweighting process. Therefore, f(u) = g(u) + (1 + ϵ) · h(u) ≤ δ(s, u) + (1 + ϵ) · δ(u, T) ≤ (1 + ϵ) · (δ(s, u) + δ(u, T)) = (1 + ϵ) · δ(s, T). Thus, if a node t ∈ T is selected we have f(t) = g(t) ≤ (1 + ϵ) · δ(s, T).
ϵ-Optimality allows for more liberal selection of nodes for expansion.
Lemma 6.3
Let Focal = {u ∈ Open | f(u) ≤ (1 + ϵ) · min_{v ∈ Open} f(v)}. Then any selection of a node in Focal yields an ϵ-optimal algorithm.
Proof
Let u be the node in invariant (I) (Lemma 2.2) with g(u) = δ(s, u), and let v be the node with the minimal f-value in Open. Then f(v) ≤ f(u), and for a goal t selected from Focal we have f(t) = g(t) ≤ (1 + ϵ) · f(v) ≤ (1 + ϵ) · f(u) ≤ (1 + ϵ) · δ(s, T).
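For illustration, the Focal set of Lemma 6.3 can be computed from Open in a few lines of Python; this is a sketch with assumed names (f is taken to be a callable giving each node's f-value), not code from the book.

```python
def focal(open_list, f, eps):
    """Focal set: all nodes in Open whose f-value is within a factor
    of (1 + eps) of the smallest f-value in Open. Any node from this
    set may be expanded next while preserving eps-optimality."""
    f_min = min(f(u) for u in open_list)
    return [u for u in open_list if f(u) <= (1 + eps) * f_min]
```

A secondary criterion (for instance, preferring nodes of large depth) can then be used to pick one node from the returned list.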
6.2.3. Overconsistent A*
We can reformulate A* search to reuse the search results of its previous executions. To do this, we define the notion of an inconsistent node and then formulate A* search as the repeated expansion of inconsistent nodes. This formulation can reuse the results of previous executions simply by identifying all the nodes that are inconsistent.
We will first introduce a new variable, called i. Intuitively, these i-values will be estimates of the start distances, just like the g-values. However, although g(u) is always the cost of the best path found so far from s to u, i(u) will always be equal to the cost of the best path from s to u found at the time of the last expansion of u. If u has never been expanded, then i(u) is set to ∞. Thus, every i-value is initially set to ∞ and is then reset to the g-value of the node whenever the node is expanded. The pseudo code of A* that maintains these i-values is given in Algorithm 6.13.
Since we set i(u) to g(u) at the beginning of the expansion of u and we assume nonnegative edge costs, i(u) remains equal to g(u) while u is being expanded. Therefore, setting g(v) to g(u) + w(u, v) is equivalent to setting g(v) to i(u) + w(u, v). As a result, one benefit of introducing i-values is the following invariant that A* always maintains: for every node v ∈ S with v ≠ s,

g(v) = min_{u ∈ pred(v)} (i(u) + w(u, v)).    (6.1)
More importantly, however, it turns out that Open contains exactly all the nodes u visited by the search for which i(u) ≠ g(u). This is the case initially, when all nodes except s have both i- and g-values infinite and Open only contains s, which has i(s) = ∞ and g(s) = 0. Afterward, every time a node is selected for expansion it is removed from Open and its i-value is set to its g-value on the very next line. Finally, whenever the g-value of any node is modified, it has been decreased and is thus strictly less than the corresponding i-value. After each modification of the g-value, the node is made sure to be in Open.
Let us call a node u with i(u) ≠ g(u) inconsistent and a node with i(u) = g(u) consistent. Thus, Open always contains exactly those nodes that are inconsistent. Consequently, since all the nodes for expansion are chosen from Open, A* search expands only inconsistent nodes.
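One possible Python rendering of A* with i-value bookkeeping, in the spirit of Algorithm 6.13, is sketched below. The function names are assumed, successors(u) yields (node, edge-weight) pairs, and stale heap entries are skipped by checking consistency at pop time rather than by a decrease-key operation.

```python
import heapq

def astar_with_i_values(s, successors, h):
    """A* maintaining i-values: i[u] is g[u] at u's last expansion
    (infinity before the first one). The heap holds exactly the
    inconsistent nodes, i.e., those with i[u] != g[u]. Exhausts the
    graph and returns the final g- and i-value maps."""
    INF = float('inf')
    g, i = {s: 0}, {}
    open_heap = [(h(s), s)]
    while open_heap:
        _, u = heapq.heappop(open_heap)
        if i.get(u, INF) == g[u]:
            continue               # stale entry: u is already consistent
        i[u] = g[u]                # expansion makes u consistent
        for v, w in successors(u):
            if i[u] + w < g.get(v, INF):   # g decreases: v inconsistent
                g[v] = i[u] + w
                heapq.heappush(open_heap, (g[v] + h(v), v))
    return g, i
```

After the loop terminates every visited node is consistent, matching the invariant that Open holds exactly the inconsistent nodes.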
An intuitive explanation of the operation of A* in terms of inconsistent node expansions is as follows. Since at the time of expansion a node is made consistent by setting its i-value equal to its g-value, a node becomes inconsistent as soon as its g-value is decreased, and it remains inconsistent until the next time it is expanded. That is, suppose that a consistent node u is the best predecessor for some node v: g(v) = min_{u′ ∈ pred(v)} (i(u′) + w(u′, v)) = i(u) + w(u, v) = g(u) + w(u, v). Then, if g(u) decreases, we get g(v) > g(u) + w(u, v), and therefore g(v) > min_{u′ ∈ pred(v)} (g(u′) + w(u′, v)). In other words, the decrease in g(u) introduces an inconsistency between the g-value of u and the g-values of its successors. Whenever u is expanded, on the other hand, this inconsistency is corrected by reevaluating the g-values of the successors of u. This in turn makes the successors of u inconsistent. In this way the inconsistency is propagated to the children of u via a series of expansions. Eventually the children no longer rely on u, none of their g-values are lowered, and none of them are inserted into the Open list.
The operation of this new formulation of A* search is identical to that of the original version. The variable i just makes it easy to identify all the nodes that are inconsistent: these are exactly the nodes u with i(u) ≠ g(u). In fact, in this version of the subroutine the g-values only decrease, and since the i-values are initially infinite, all inconsistent nodes have i(u) > g(u). We will call such nodes overconsistent, and nodes u with i(u) < g(u) underconsistent.
In the version of A* just presented, all nodes had their g-values and i-values initialized at the outset: the i-values of all nodes were set to infinity, the g-values of all nodes except s were set to infinity, and g(s) was set to 0. We now remove this initialization step; the only restrictions we make are that no node is underconsistent and that all g-values satisfy Equation 6.1, except for g(s), which is equal to zero. This arbitrary overconsistent initialization will allow us to reuse previous search results when running multiple searches.
The pseudo code under this initialization is shown in Algorithm 6.14. The only change necessary for the arbitrary overconsistent initialization is the terminating condition of the while loop. For the sake of simplicity, we assume a single goal node t ∈ T. The loop now terminates as soon as f(t) becomes less than or equal to the key of the node to be expanded next; that is, the smallest key in Open (we assume that the min operator over an empty set returns ∞). The reason for this addition is that under the new initialization t may never be expanded if it was already correctly initialized. For instance, if all nodes are initialized in such a way that all of them are consistent, then Open is initially empty and the search terminates without a single expansion. This is correct, because when all nodes are consistent and g(s) = 0, then for every node u ≠ s we have g(u) = min_{u′ ∈ pred(u)} (g(u′) + w(u′, u)), which means that the g-values are equal to the corresponding start distances and no search is necessary: the solution path from s to t is an optimal solution.
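The modified terminating condition can be stated as a tiny predicate. This is illustrative Python with assumed names: the Open list is taken to be a binary heap of (key, node) pairs, so its smallest key sits at index 0.

```python
def should_terminate(f_t, open_heap):
    """Overconsistent A* stops once f(t) is no larger than the smallest
    key in Open; the min over an empty Open is taken to be infinity,
    so an empty Open always terminates the loop."""
    min_key = open_heap[0][0] if open_heap else float('inf')
    return f_t <= min_key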
Just like in the original A* search with consistent heuristics, at the time the test of the while loop is executed, it holds for any node u with h(u) < ∞ and f(u) ≤ f(v) for all v in Open that the extracted path from s to u is optimal.
Given this property and the terminating condition of the algorithm, it is clear that after the algorithm terminates the returned solution is optimal. In addition, this property leads to the fact that no node is expanded more than once if heuristics are consistent: Once a node is expanded its g-value is optimal and can therefore never decrease afterward. Consequently, the node is never inserted into Open again. These properties are all similar to the ones A* search maintains. Different from A* search, however, the Improve function does not expand all the necessary nodes relevant to the computation of the shortest path. It does not expand the nodes of which the i-values were already equal to the corresponding start distances. This property can result in substantial computational savings when using it for repeated search. A node u is expanded at most once during the execution of the Improve function and only if i(u) was not equal to the start distance of u before invocation.
6.2.4. Anytime Repairing A*
In many domains A* search with inflated heuristics (i.e., A* search with f-values equal to g plus ϵ · h-values for ϵ ≥ 1) can drastically reduce the number of nodes it has to examine before it produces a solution. While the path the search returns can be suboptimal, the search also provides a bound on the suboptimality, namely the ϵ by which the heuristic is inflated. Thus, setting ϵ to 1 results in standard A* with an uninflated heuristic, and the resulting path is guaranteed to be optimal. For ϵ > 1 the length of the found path is no larger than ϵ times the length of the optimal path, while the search can often be much faster than its version with uninflated heuristics.
To construct an anytime algorithm with suboptimality bounds, we could run a succession of these A* searches with decreasing inflation factors. This naive approach results in a series of solutions, each one with a suboptimality factor equal to the corresponding inflation factor. This approach has control over the suboptimality bound, but wastes a lot of computation since each search iteration duplicates most of the efforts of the previous searches. In the following we explain the ARA* (anytime repairing A*) algorithm, which is an efficient anytime heuristic search that also runs A* with inflated heuristics in succession but reuses search efforts from previous executions in such a way that the suboptimality bounds are still satisfied. As a result, a substantial speedup is achieved by not recomputing the node values that have been correctly computed in the previous iterations.
Anytime repairing A* works by executing A* multiple times starting with a large ϵ and decreasing ϵ prior to each execution until ϵ = 1. As a result, after each search a solution is guaranteed to be within a factor ϵ of optimal. ARA* uses the overconsistent A* subroutine to reuse the results of the previous searches and therefore can be drastically more efficient.
The only complication is that the heuristics inflated with ϵ may no longer be consistent. It turns out that the same subroutine applies equally well when heuristics are inconsistent. Moreover, in general, when consistent heuristics are inflated, nodes in A* search may be reexpanded multiple times. However, if we restrict the expansions to no more than one per node, then A* search is still complete and possesses ϵ-suboptimality: the cost of the found solution is no worse than ϵ times the cost of an optimal solution.
The same holds true for the subroutine as well. We therefore restrict the expansions in the Improve function (see Alg. 6.15) using the set Closed: initially, Closed is empty; afterward, every node that is expanded is added to it, and no node that is already in Closed is inserted into Open to be considered for expansion. Although this restricts the expansions to no more than one per node, Open may no longer contain all inconsistent nodes. In fact, Open contains only the inconsistent nodes that have not yet been expanded. We need, however, to keep track of all inconsistent nodes, since they will be the starting points for the inconsistency propagation in future search iterations. We do this by maintaining the set Incons of all the inconsistent nodes that are not in Open in Algorithm 6.15. Thus, the union of Incons and Open is exactly the set of all inconsistent nodes, and it can be used to reset Open before each call to the subroutine.
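The bookkeeping just described can be sketched in a few lines of Python; the container names are assumed, and Algorithm 6.15 itself differs in detail.

```python
def route_inconsistent(v, open_set, closed, incons):
    """A node whose g-value just decreased is inconsistent: it goes
    back into Open unless it was already expanded in this iteration
    (i.e., it is in Closed), in which case it is parked in Incons so
    the next iteration can start propagation from it."""
    if v in closed:
        incons.add(v)
    else:
        open_set.add(v)
```

Because every inconsistent node lands in exactly one of the two sets, Open ∪ Incons is the full set of inconsistent nodes at the end of an iteration, which is what the main loop uses to rebuild Open.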
The main function of ARA* (see Alg. 6.16) performs a series of search iterations. It does initialization, including setting ϵ to a large value ϵ0, and then repeatedly calls the subroutine with a series of decreasing values of ϵ. Before each call to the subroutine, however, a new Open list is constructed by moving into it the contents of the set Incons. Consequently, Open contains all inconsistent nodes before each call to the subroutine. Since the Open list has to be sorted by the current f-values of the nodes, it is reordered. After each call to the subroutine, ARA* publishes a solution that is suboptimal by at most a factor of ϵ. More generally, for any node u with an f-value smaller than or equal to the minimum f-value in Open, we have computed a solution path from s to u that is within a factor of ϵ of optimal.
Each execution of the subroutine terminates when f(t) is no larger than the minimum key in Open. This means that the extracted solution path is within a factor ϵ of optimal. Since before each iteration ϵ is decreased, ARA* gradually decreases the suboptimality bound and finds new solutions to satisfy the bound.
Figure 6.7 shows the operation of ARA* in a simple maze example. Nodes that are inconsistent at the end of an iteration are shown with an asterisk. Although the first call (ϵ = 2.5) is identical to the weighted A* call with the same ϵ, the second call (ϵ = 1.5) expands only one cell. This is in contrast to the large number of cells expanded by A* search with the same ϵ. For both searches the suboptimality factor ϵ decreases from 2.5 to 1.5. Finally, the third call with ϵ set to 1 expands only nine cells.
ARA* gains efficiency due to the following two properties. First, each search iteration is guaranteed not to expand nodes more than once. Second, it does not expand nodes whose i-values prior to a subroutine call have already been correctly computed by some previous search iteration.
Each solution ARA* publishes comes with a suboptimality bound equal to ϵ. In addition, an often tighter suboptimality bound can be computed as the ratio between g(t), which gives an upper bound on the cost of an optimal solution, and the minimum unweighted f-value of an inconsistent node, which gives a lower bound on the cost of an optimal solution. This is a valid suboptimality bound as long as the ratio is larger than or equal to one; otherwise, g(t) is already equal to the cost of an optimal solution. Thus, the actual suboptimality bound ϵ′ for each solution ARA* publishes can be computed as the minimum of ϵ and this new bound:

ϵ′ = min(ϵ, g(t) / min_{u ∈ Open ∪ Incons} (g(u) + h(u))).
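In Python, this published bound might be computed as follows; the names are assumed for illustration, with g and h giving the current g-values and unweighted heuristic values of the inconsistent nodes (Open ∪ Incons).

```python
def suboptimality_bound(eps, g_t, inconsistent, g, h):
    """Tighter published bound: the ratio of g(t) (upper bound on the
    optimal cost) to the minimum unweighted f-value g(u) + h(u) over
    all inconsistent nodes u (lower bound on the optimal cost). A
    ratio below one means g(t) is already optimal, so the bound is
    clamped to 1 before taking the minimum with eps."""
    lower = min(g[u] + h(u) for u in inconsistent)
    return min(eps, max(1.0, g_t / lower))
```

In practice this ratio often certifies a much better bound than the current inflation factor ϵ, letting ARA* report solution quality more accurately.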
The anytime behavior of ARA* strongly relies on the properties of the heuristic it uses. In particular, it relies on the assumption that a sufficiently large inflation factor
ϵ substantially expedites the search process. Although in many domains this assumption is true, this is not guaranteed. In fact, it is possible to construct pathological examples where the best-first nature of searching with a large
ϵ can result in much longer processing times. In general, the key to obtaining anytime behavior in ARA* is finding heuristics for which the difference between the heuristic values and the true distances
these heuristics estimate is a function with only shallow local minima. Note that this is not the same as just the magnitude of the differences between the heuristic values and the true distances. Instead, the difference will have shallow local minima if the heuristic function has a shape similar to the shape of the true distance function. For example, in the case of robot navigation a local minimum can be a U-shaped obstacle placed on the straight line connecting a robot to its goal (assuming the heuristic function is Euclidean distance). The size of the obstacle determines how many nodes weighted A*, and consequently ARA*, will have to visit before getting out of the minimum. The conclusion is that with ARA*, the task of developing an anytime planner for various hard search domains becomes the problem of designing a heuristic function that results in shallow local minima. In many cases (although certainly not always) the design of such a heuristic function can be a much simpler task than the task of designing a whole new anytime search algorithm for solving the problem at hand.
6.2.5. k-Best-First Search
A very different nonoptimal search strategy modifies the selection condition in the priority-queue data structure by considering larger sets of nodes without destroying its internal f-order. The algorithm k-best-first search is a generalization of best-first search in which each cycle expands the best k nodes from Open instead of only the first best node. Successors are not examined until the rest of the previous k best nodes have been expanded. A pseudo-code implementation is provided in Algorithm 6.17.
In light of this algorithm, best-first search can be regarded as 1-best-first search, and breadth-first search as ∞-best-first search, since in each expansion cycle all nodes in Open are expanded.
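A compact Python sketch of the expansion cycle follows; it is illustrative rather than Algorithm 6.17 verbatim, with assumed names, successors(u) yielding neighbor nodes, and f the (possibly nonadmissible) evaluation function.

```python
import heapq

def k_best_first(start, goal, successors, f, k):
    """k-best-first search: each cycle pops the k best nodes from Open
    and expands all of them before any of their successors are
    considered. With k = 1 this degenerates to best-first search."""
    open_heap = [(f(start), start)]
    closed = set()
    while open_heap:
        # take the k best nodes of this cycle (fewer if Open is smaller)
        batch = [heapq.heappop(open_heap)[1]
                 for _ in range(min(k, len(open_heap)))]
        for u in batch:
            if u == goal:
                return u
            if u in closed:
                continue
            closed.add(u)
            for v in successors(u):
                if v not in closed:
                    heapq.heappush(open_heap, (f(v), v))
    return None
```

Because successors are only pushed, never examined, until the whole batch is expanded, a single misleadingly low f-value cannot monopolize the search direction for an entire cycle.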
The rationale of the algorithm is that as the level of impreciseness in a nonadmissible heuristic function increases, k-best-first search avoids running in the wrong direction and temporarily abandoning optimal solution paths that are overestimated. It has been shown to outperform best-first search in a number of domains.
On the other hand, it will not be advantageous in conjunction with admissible, monotonic heuristics, since in this case all nodes that have a cost less than the optimal solution must be expanded anyway. However, when suboptimal solutions are affordable, k-best-first search can be a simple yet sufficiently powerful choice. From this point of view, k-best-first search is a natural competitor not for A*, but for weighted A* with l > 0.5.
6.2.7. Partial A* and Partial IDA*
During the study of partial hash functions such as bit-state hashing, double bit-state hashing, and hash compact (see Ch. 3), we have seen that the sizes of the hash tables can be decreased considerably. This is paid for by giving up search optimality, since some states can no longer be disambiguated. As we have seen, partial hashing is a compromise on the space requirements of full state storage algorithms and can be cast as a nonadmissible simplification of traditional heuristic search algorithms. In the extreme case, partial search algorithms are not even complete, since they can miss an existing goal state due to wrong pruning. The probability of this can be reduced either by enlarging the number of bits in the remaining vector or by reinvoking the algorithm with different hash functions (see Fig. 6.8, right).
Partial A* applies bit-state hashing to A*'s Closed list. The hash table degenerates to a bit array without any collision strategy (we write Closed[i] to highlight the difference). Note that partial A* is applied without reopening, even if the estimate is not admissible, since the resulting algorithm cannot guarantee optimal solutions anyway. The effect of partial state storage is illustrated in Figure 6.9: if only parts of states are stored, more states fit into main memory.
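A minimal Python sketch of such a bit-state Closed structure is given below. The hash function and table size are placeholders; a real implementation would use a tuned hash function, and, as noted above, distinct states that collide are indistinguishable, which is exactly the source of incompleteness.

```python
def bitstate_closed(m_bits):
    """Closed 'list' as a plain bit array with no collision strategy.

    Returns a function that reports whether a state's bit was already
    set and sets it otherwise. A collision wrongly prunes a state that
    was never actually visited."""
    table = bytearray(m_bits // 8)
    def seen_or_mark(state):
        idx = hash(state) % m_bits         # placeholder hash function
        byte, bit = idx // 8, idx % 8
        if table[byte] & (1 << bit):
            return True                    # seen before, or a collision
        table[byte] |= 1 << bit
        return False
    return seen_or_mark
```

The expansion loop of partial A* would call seen_or_mark on each generated state and discard those for which it returns True.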
To analyze the consequences of applying nonreversible compression methods, we concentrate on bit-state hashing. Our focus on this technique is also motivated by the fact that bit-state hashing compresses states drastically down to one or a few bits emphasizing the advantages of depth-first search algorithms.
Algorithm 6.19 depicts the A* search algorithm with (single) bit-state hashing.
Given M bits of memory, single bit-state hashing is able to store M states. This saves memory by a factor of lg |S|, since the space requirements for an explicit state are at least lg |S| bits. For large state spaces and less efficient state encodings, the gains in state space coverage for bit-state hashing are considerable.
However, bit-state hashing faces several obstacles in A*. First of all, states in the search frontier can hardly be compressed. Second, it is often necessary to keep track of the path that leads to each state. An additional observation is that many heuristic functions and algorithms must access the length (or cost) of the optimal path through which the state was reached.
There are two solutions to these problems: either the information is recomputed by traversing the path that leads to the state, or it is stored together with the state. The first, so-called state reconstruction method increases the time complexity, and the second one increases the memory requirements. Still, state reconstruction needs to store a predecessor link, which on a W-bit processor typically requires W bits.
It is not trivial to analyze the amount of information needed to store the set Open, especially considering that problem graphs are not regular. However, experimental results show that the search frontier frequently grows exponentially with the search depth, such that compressing the set of closed states does not help much. Hence, applying bit-state hashing for search algorithms such as BFS is not as effective as it is in DFS.
Nonadmissible bit-state hashing can also be used in combination with linear-space IDA* search. The implementation of partial IDA* is shown in Algorithm 6.20. Bit-state hashing can be combined with transposition table updates propagating f-values or h-values back to the root, but since the pruning technique is incomplete and annotating any information at a partially stored state is memory intensive, it is simpler to initialize the hash table anew in each iteration.
Refreshing large bitvector tables is fast in practice, but for shallow searches with a small number of expanded nodes this scheme can be improved by invoking ordinary IDA* with transposition table updates for smaller thresholds and by applying bitvector exploration in large depths only.