
When using A* (or any other optimal path-finding algorithm), we say that the heuristic used should be admissible; that is, it should never overestimate the length (or number of moves) of the actual solution path.

How does an admissible heuristic ensure an optimal solution? I am preferably looking for an intuitive explanation.

If you want you can explain using the Manhattan distance heuristic of the 8-puzzle.

Ashwin

3 Answers


If the heuristic function is not admissible, then we can have an estimate that is bigger than the actual path cost from some node to a goal node. If this overestimate falls on the least-cost path (the one we are searching for), the algorithm will not explore that path and may find another (not least-cost) path to the goal.

Look at this simple example.

[Figure: a small graph with start node $A$, goal node $G$, and two intermediate nodes $B$ and $C$; the edge costs and heuristic values are given below.]

Let $A$ and $G$ be respectively the starting and goal nodes. Let $h(N)$ be an estimate of the path's length from node $N$ to $G$, $\forall N$ in the graph. Moreover, let $c(N, X_{i})$ be the step cost function from node $N$ to its neighbour $X_i$, $\forall N$ and $i=1..m$, where $m$ is the number of neighbours of $N$ (i.e., a function that returns the cost of the edge between node $N$ and one of its neighbours).

Let the heuristics be

  • $h(B) = 3$

  • $h(C) = 4$

This heuristic function $h$ is not admissible, because $$h(C) = 4 > c(C, G) = 2$$

If the $A^*$ algorithm starts from node $A$, it will select node $B$ for expansion next and, after that, it will reach node $G$ from there. The resulting path will be $A \rightarrow B \rightarrow G$ with cost $4$, instead of $A \rightarrow C \rightarrow G$ with cost $3$. If the heuristic function were admissible, this would not have happened.
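To make this concrete, here is a minimal A* sketch in Python run on this graph. Since the figure is missing, the edge costs $c(A, B) = 1$ and $c(B, G) = 3$ are assumptions chosen only to be consistent with the path costs stated above ($A \rightarrow B \rightarrow G$ costs $4$, $A \rightarrow C \rightarrow G$ costs $3$):

```python
import heapq

def astar(graph, h, start, goal):
    """Plain A*: always pop the node with the smallest f = g + h;
    stop only when the goal itself is popped (expanded)."""
    open_heap = [(h[start], 0, start, [start])]
    closed = set()
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path, g
        if node in closed:
            continue
        closed.add(node)
        for nbr, cost in graph[node]:
            heapq.heappush(open_heap,
                           (g + cost + h[nbr], g + cost, nbr, path + [nbr]))
    return None, float("inf")

# Edge costs c(A,B)=1 and c(B,G)=3 are assumed (figure missing),
# matching the stated path costs: A->B->G = 4, A->C->G = 3.
graph = {"A": [("B", 1), ("C", 1)], "B": [("G", 3)], "C": [("G", 2)], "G": []}
h_bad = {"A": 0, "B": 3, "C": 4, "G": 0}   # inadmissible: h(C)=4 > c(C,G)=2
h_good = {"A": 0, "B": 3, "C": 2, "G": 0}  # admissible

print(astar(graph, h_bad, "A", "G"))   # suboptimal: (['A', 'B', 'G'], 4)
print(astar(graph, h_good, "A", "G"))  # optimal:    (['A', 'C', 'G'], 3)
```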

Anton

While Anton's answer is absolutely perfect, let me try to provide an alternative answer: being admissible means that the heuristic does not overestimate the effort to reach the goal, i.e., $h(n) \leq h^*(n)$ for all $n$ in the state space (in the 8-puzzle, this means for every permutation of the tiles relative to the goal you are currently considering), where $h^*(n)$ is the optimal cost to reach the target.
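As a concrete illustration of an admissible heuristic, here is a sketch of the Manhattan-distance heuristic for the 8-puzzle (the 9-tuple board encoding with 0 for the blank is my own convention, not from the question). It never overestimates because each tile must travel at least its Manhattan distance, and each move displaces exactly one tile by one cell:

```python
def manhattan(state, goal):
    """Sum of Manhattan distances of each tile from its goal cell.
    Boards are 9-tuples read row by row; 0 is the blank and is not counted."""
    pos = {tile: divmod(i, 3) for i, tile in enumerate(goal)}  # tile -> (row, col)
    return sum(abs(i // 3 - pos[t][0]) + abs(i % 3 - pos[t][1])
               for i, t in enumerate(state) if t != 0)

goal = (1, 2, 3, 4, 5, 6, 7, 8, 0)
start = (1, 2, 3, 4, 5, 6, 7, 0, 8)   # exactly one move from the goal
print(manhattan(start, goal))          # 1, which is <= the true cost of 1
```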

I think the clearest way to see why $A^*$ provides optimal solutions when $h(n)$ is admissible is that it sorts all nodes in OPEN in ascending order of $f(n)=g(n)+h(n)$ and, also, that it does not stop when generating the goal but when expanding it:

  1. Since nodes are expanded in ascending order of $f(n)$, you know that no other node is more promising than the current one. Remember: $h(n)$ is admissible, so having the lowest $f(n)$ means the node may still reach the goal through a cheaper path than any other node in OPEN. And this holds until you prove the opposite, i.e., by expanding the current node.
  2. Since $A^*$ stops only when it proceeds to expand the goal node (as opposed to stopping when generating it), you are sure (from the first point above) that no other node leads to it through a cheaper path.

And this is, essentially, all you will find in the original proof by Hart, Nilsson and Raphael.
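The second point is the subtle one. Here is a toy sketch of why stopping when the goal is *generated* (rather than expanded) loses optimality; the graph and the `stop_on_generation` flag are purely illustrative, with $h(n) = 0$ everywhere, which is trivially admissible:

```python
import heapq

def astar(graph, start, goal, stop_on_generation=False):
    # h = 0 everywhere (trivially admissible), so f = g here.
    open_heap = [(0, start, [start])]
    while open_heap:
        g, node, path = heapq.heappop(open_heap)
        if node == goal:                        # stop when EXPANDING the goal
            return path, g
        for nbr, cost in graph[node]:
            if stop_on_generation and nbr == goal:
                return path + [nbr], g + cost   # premature stop on generation
            heapq.heappush(open_heap, (g + cost, nbr, path + [nbr]))
    return None, float("inf")

# Illustrative graph: a direct but expensive edge to G, and a cheap detour via B.
graph = {"A": [("G", 10), ("B", 1)], "B": [("G", 1)], "G": []}
print(astar(graph, "A", "G", stop_on_generation=True))  # (['A', 'G'], 10): suboptimal
print(astar(graph, "A", "G"))                           # (['A', 'B', 'G'], 2): optimal
```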

Hope this helps,

Carlos Linares López

I'd like to expand upon Anton's comment in his answer and provide an explicit answer to the situation posed by Ashwin in the comments. I think it'll be helpful in answering the primary question.

Let's consider this situation: [Figure: a graph with start node A, goal node G, and intermediate nodes B and C.]

Where A is the starting node and G is the goal node. The numbers on the nodes are the heuristic costs, while the numbers on the edges are the costs to travel between those two nodes.

We can see that the heuristic function is admissible (i.e., it doesn't overestimate the cost of reaching the goal) and consistent (i.e., for every edge, h(n) <= c(n, n') + h(n'), so the estimate never drops by more than the cost of the edge crossed).

Here's how the A* algorithm would find the optimal solution for this graph:

Iteration 1: It checks the f(x) for all of A's neighbours
  • f(B) = x + x = 2x
  • f(C) = 1.1x + 0.5x = 1.6x

Since f(C) is smaller, it picks C as the next node.

Iteration 2: Check f(x) for C's neighbours and the existing paths
  • f(B) = x + x = 2x (A->B)
  • f(G) = 2.1x + 0 = 2.1x (C->G)

Clearly, the path from A->B is cheaper and thus A* will shift its focus to that path.

Iteration 3: Check f(x) for B's neighbours and the existing paths
  • f(G) = 2x + 0 = 2x (B->G)
  • f(G) = 2.1x + 0 = 2.1x (C->G)

Since B->G is cheaper here and we are now expanding the goal node (having considered all the candidate paths), the algorithm ends with the (optimal) solution A -> B -> G.
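The iterations above can be reproduced with a short Python sketch. Setting x = 1, the edge costs and heuristic values below are inferred from the f-values in the walkthrough (the original figure is not shown, so they are assumptions consistent with it):

```python
import heapq

# Inferred from the walkthrough with x = 1:
# c(A,B)=1.0, c(A,C)=1.1, c(B,G)=1.0, c(C,G)=1.0; h(B)=1.0, h(C)=0.5.
graph = {"A": [("B", 1.0), ("C", 1.1)], "B": [("G", 1.0)], "C": [("G", 1.0)], "G": []}
h = {"A": 0.0, "B": 1.0, "C": 0.5, "G": 0.0}

open_heap = [(h["A"], 0.0, "A", ["A"])]
expanded = []
while open_heap:
    f, g, node, path = heapq.heappop(open_heap)  # lowest f = g + h first
    expanded.append((node, round(f, 2)))
    if node == "G":   # stop on expanding the goal, not on generating it
        break
    for nbr, cost in graph[node]:
        heapq.heappush(open_heap, (g + cost + h[nbr], g + cost, nbr, path + [nbr]))

print("expansion order:", expanded)          # A, then C (f=1.6), then B (f=2.0), then G
print("optimal path:", path, "cost:", g)     # ['A', 'B', 'G'] at cost 2.0
```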