LAO* is a heuristic search algorithm that can find optimal solutions for MDPs without evaluating the entire state space, and takes the form of a cyclic graph,that is, a solution with loops.

LAO* is a simple generalization of AO* (a classic algorithm that can only solve problems that have acyclic solutions because the backwards induction algorithm used in its cost revision step assumes an acyclic solution) that can find solutions withloops. (The “L” in the name LAO* stands for “loop”.).

The key step in generalizing AO* to create LAO* is to recognize that the cost revision step of AO* is a dynamic programming algorithm, and to generalize this step appropriately. Instead of using backwards induction, state costs can be updated by using a dynamic programming algorithm for indefinite-horizon MDPs, such as policy iteration or value iteration.

The algorithm LAO* is summarized as follows: