Editing A* search algorithm (section)

==Description==

[[File:Astarpathfinding.gif|thumb|A* pathfinding algorithm navigating around a randomly-generated maze]]
[[File:A Star Algorithm.webm|thumb|Illustration of A* search for finding a path between two points on a graph. From left to right, a heuristic that prefers points closer to the goal is used increasingly. ]]

A* is an [[informed search algorithm]], or a [[best-first search]], meaning that it is formulated in terms of [[weighted graph]]s: starting from a specific starting [[node (graph theory)|node]] of a graph, it aims to find a path to the given goal node having the smallest cost (least distance travelled, shortest time, etc.).  It does this by maintaining a [[tree (data structure)|tree]] of paths originating at the start node and extending those paths one edge at a time until the goal node is reached.

At each iteration of its main loop, A* needs to determine which of its paths to extend. It does so based on the cost of the path and an estimate of the cost required to extend the path all the way to the goal. Specifically, A* selects the path that minimizes

:<math>f(n) = g(n) + h(n)</math>

where {{mvar|n}} is the next node on the path, {{math|''g''(''n'')}} is the cost of the path from the start node to {{mvar|n}}, and {{math|''h''(''n'')}} is a [[heuristic]] function that estimates the cost of the cheapest path from {{mvar|n}} to the goal. The heuristic function is problem-specific. If the heuristic function is [[admissible heuristic|admissible]] – meaning that it never overestimates the actual cost to get to the goal – A* is guaranteed to return a least-cost path from start to goal.

Typical implementations of A* use a [[priority queue]] to perform the repeated selection of minimum (estimated) cost nodes to expand. This priority queue is known as the ''open set'', ''fringe'' or ''frontier''. At each step of the algorithm, the node with the lowest {{math|''f''(''x'')}} value is removed from the queue, the {{mvar|f}} and {{mvar|g}} values of its neighbors are updated accordingly, and these neighbors are added to the queue. The algorithm continues until a removed node (thus the node with the lowest {{mvar|f}} value out of all fringe nodes) is a goal node.{{efn|Goal nodes may be passed over multiple times if there remain other nodes with lower {{mvar|f}} values, as they may lead to a shorter path to a goal.}} The {{mvar|f}} value of that goal is then also the cost of the shortest path, since {{mvar|h}} at the goal is zero in an admissible heuristic.

The algorithm described so far only gives the length of the shortest path. To find the actual sequence of steps, the algorithm can be easily revised so that each node on the path keeps track of its predecessor. After this algorithm is run, the ending node will point to its predecessor, and so on, until some node's predecessor is the start node.

As an example, when searching for the shortest route on a map, {{math|''h''(''x'')}} might represent the [[Euclidean distance|straight-line distance]] to the goal, since that is physically the smallest possible distance between any two points. For a grid map from a video game, using the [[Taxicab distance]] or the [[Chebyshev distance]] becomes better depending on the set of movements available (4-way or 8-way).

If the heuristic {{mvar|h}} satisfies the additional condition {{math|''h''(''x'') ≤ ''d''(''x'', ''y'') + ''h''(''y'')}} for every edge {{math|(''x'', ''y'')}} of the graph (where {{mvar|d}} denotes the length of that edge), then {{mvar|h}} is called [[Consistent heuristic|monotone, or consistent]]. With a consistent heuristic, A* is guaranteed to find an optimal path without processing any node more than once and A* is equivalent to running [[Dijkstra's algorithm]] with the [[reduced cost]] {{math|''d'''(''x'', ''y'') {{=}} ''d''(''x'', ''y'') + ''h''(''y'') − ''h''(''x'')}}.<ref>{{cite journal
 | last1 = Nannicini | first1 = Giacomo
 | last2 = Delling | first2 = Daniel
 | last3 = Schultes | first3 = Dominik
 | last4 = Liberti | first4 = Leo
 | doi = 10.1002/NET.20438
 | issue = 2
 | journal = Networks
 | pages = 240–251
 | title = Bidirectional A* search on time-dependent road networks
 | url = https://www.lix.polytechnique.fr/~liberti/bidirtimedepj.pdf
 | volume = 59
 | year = 2012}}</ref>

===Pseudocode===
The following [[pseudocode]] describes the algorithm:

<syntaxhighlight lang="pascal" start="1">
function reconstruct_path(cameFrom, current)
    total_path := {current}
    while current in cameFrom.Keys:
        current := cameFrom[current]
        total_path.prepend(current)
    return total_path

// A* finds a path from start to goal.
// h is the heuristic function. h(n) estimates the cost to reach goal from node n.
function A_Star(start, goal, h)
    // The set of discovered nodes that may need to be (re-)expanded.
    // Initially, only the start node is known.
    // This is usually implemented as a min-heap or priority queue rather than a hash-set.
    openSet := {start}

    // For node n, cameFrom[n] is the node immediately preceding it on the cheapest path from the start
    // to n currently known.
    cameFrom := an empty map

    // For node n, gScore[n] is the currently known cost of the cheapest path from start to n.
    gScore := map with default value of Infinity
    gScore[start] := 0

    // For node n, fScore[n] := gScore[n] + h(n). fScore[n] represents our current best guess as to
    // how cheap a path could be from start to finish if it goes through n.
    fScore := map with default value of Infinity
    fScore[start] := h(start)

    while openSet is not empty
        // This operation can occur in O(Log(N)) time if openSet is a min-heap or a priority queue
        current := the node in openSet having the lowest fScore[] value
        if current = goal
            return reconstruct_path(cameFrom, current)

        openSet.Remove(current)
        for each neighbor of current
            // d(current,neighbor) is the weight of the edge from current to neighbor
            // tentative_gScore is the distance from start to the neighbor through current
            tentative_gScore := gScore[current] + d(current, neighbor)
            if tentative_gScore < gScore[neighbor]
                // This path to neighbor is better than any previous one. Record it!
                cameFrom[neighbor] := current
                gScore[neighbor] := tentative_gScore
                fScore[neighbor] := tentative_gScore + h(neighbor)
                if neighbor not in openSet
                    openSet.add(neighbor)

    // Open set is empty but goal was never reached
    return failure
</syntaxhighlight>

'''Remark:''' In this pseudocode, if a node is reached by one path, removed from openSet, and subsequently reached by a cheaper path, it will be added to openSet again. This is essential to guarantee that the path returned is optimal if the heuristic function is [[admissible heuristic|admissible]] but not  [[Consistent heuristic|consistent]].   If the heuristic is consistent, when a node is removed from openSet the path to it is guaranteed to be optimal so the test ‘<code>tentative_gScore < gScore[neighbor]</code>’ will always fail if the node is reached again. The pseudocode implemented here is sometimes called the ''graph-search'' version of A*.<ref>{{Cite book|title=Artificial intelligence: A modern approach |last1=Russell |first1=Stuart J.|author-link1=Stuart J. Russell|date=2009|last2=Norvig|first2=Peter|author-link2=Peter Norvig|publisher=Pearson|isbn=978-0136042594|edition= 3rd|location=Boston|page=95}}</ref> This is in contrast with the version without the ‘<code>tentative_gScore < gScore[neighbor]</code>’ test to add nodes back to openSet, which is sometimes called the ''tree-search'' version of A* and require a consistent heuristic to guarantee optimality.

[[Image:Astar progress animation.gif|thumb|Illustration of A* search for finding path from a start node to a goal node in a [[robotics|robot]] [[motion planning]] problem. The empty circles represent the nodes in the ''open set'', i.e., those that remain to be explored, and the filled ones are in the closed set. Color on each closed node indicates the distance from the goal: the greener, the closer. One can first see the A* moving in a straight line in the direction of the goal, then when hitting the obstacle, it explores alternative routes through the nodes from the open set. {{see also|Dijkstra's algorithm}}]]

===Example===
An example of an A* algorithm in action where nodes are cities connected with roads and h(x) is the     straight-line distance to the target point:

[[File:AstarExampleEn.gif|An example of A* algorithm in action (nodes are cities connected with roads, h(x) is the straight-line distance to the target point) Green: Start, Blue: Target, Orange: Visited]]

'''Key:''' green: start; blue: goal; orange: visited

The A* algorithm has real-world applications. In this example, edges are railroads and h(x) is the [[great-circle distance]] (the shortest possible distance on a sphere) to the target. The algorithm is searching for a path between Washington, D.C., and Los Angeles.

[[File:A* Search Example on North American Freight Train Network.gif|The A* algorithm finding a path of railroads between Washington, D.C. and Los Angeles.]]

===Implementation details===
There are a number of simple optimizations or implementation details that can significantly affect the performance of an A* implementation.  The first detail to note is that the way the priority queue handles ties can have a significant effect on performance in some situations.  If ties are broken so the queue behaves in a [[LIFO (computing)|LIFO]] manner, A* will behave like [[depth-first search]] among equal cost paths (avoiding exploring more than one equally optimal solution).

When a path is required at the end of the search, it is common to keep with each node a reference to that node's parent.  At the end of the search, these references can be used to recover the optimal path.  If these references are being kept then it can be important that the same node doesn't appear in the priority queue more than once (each entry corresponding to a different path to the node, and each with a different cost).  A standard approach here is to check if a node about to be added already appears in the priority queue.  If it does, then the priority and parent pointers are changed to correspond to the lower-cost path. A standard [[binary heap]] based priority queue does not directly support the operation of searching for one of its elements, but it can be augmented with a [[hash table]] that maps elements to their position in the heap, allowing this decrease-priority operation to be performed in logarithmic time. Alternatively, a [[Fibonacci heap]] can perform the same decrease-priority operations in constant [[amortized time]].

===Special cases===
[[Dijkstra's algorithm]], as another example of a uniform-cost search algorithm, can be viewed as a special case of A* where {{tmath|1=h(x) = 0}} for all ''x''.<ref name="geospatial">{{citation|title=Geospatial Analysis: A Comprehensive Guide to Principles, Techniques and Software Tools|first1=Michael John|last1=De Smith|first2=Michael F.|last2=Goodchild|first3=Paul|last3=Longley|publisher=Troubadour Publishing Ltd|year=2007|isbn=9781905886609|page=344|url=https://books.google.com/books?id=SULMdT8qPwEC&pg=PA344}}.</ref><ref name="pythalgs">{{citation |last=Hetland |first=Magnus Lie |title=Python Algorithms: Mastering Basic Algorithms in the Python Language |page=214 |year=2010 |url=https://books.google.com/books?id=9_AXCmGDiz8C&pg=PA214 |archive-url=https://web.archive.org/web/20220215222823/https://books.google.com/books?id=9_AXCmGDiz8C&pg=PA214 |archive-date=15 February 2022 |publisher=Apress |isbn=9781430232377}}.</ref> General [[depth-first search]] can be implemented using A* by considering that there is a global counter ''C'' initialized with a very large value. Every time we process a node we assign ''C'' to all of its newly discovered neighbors. After every single assignment, we decrease the counter ''C'' by one. Thus the earlier a node is discovered, the higher its {{tmath|h(x)}} value. Both Dijkstra's algorithm and depth-first search can be implemented more efficiently without including an {{tmath|h(x)}} value at each node.