
== Combinatorial game theory ==
In [[combinatorial game theory]], a minimax algorithm is used to find solutions to games.

A '''simple''' version of the minimax ''algorithm'', stated below, deals with games such as [[tic-tac-toe]], where each player can win, lose, or draw. If player&nbsp;A ''can'' win in one move, their best move is that winning move. If player&nbsp;B knows that one move will lead to a situation where player&nbsp;A ''can'' win in one move, while another move will lead to a situation where player&nbsp;A can, at best, draw, then player&nbsp;B's best move is the one leading to a draw. Late in the game, it is easy to see what the "best" move is. The minimax algorithm finds the best move earlier in the game by working backwards from the end of the game. At each step it assumes that player&nbsp;A is trying to '''maximize''' the chances of A winning, while on the next turn player&nbsp;B is trying to '''minimize''' the chances of A winning (i.e., to maximize B's own chances of winning).
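
As an illustration of working backwards from the end of the game, the following Python sketch solves tic-tac-toe exhaustively. It is a minimal example rather than a reference implementation: the board encoding (a 9-character string), the scores +1, 0 and −1, and the names <code>winner</code> and <code>solve</code> are assumptions made for this example only.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board whose cells are indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def solve(board, to_move):
    """Value of the position for X (player A): +1 win, 0 draw, -1 loss."""
    w = winner(board)
    if w is not None:
        return 1 if w == 'X' else -1
    if ' ' not in board:
        return 0  # board is full and nobody has won: a draw
    nxt = 'O' if to_move == 'X' else 'X'
    values = [solve(board[:i] + to_move + board[i + 1:], nxt)
              for i, cell in enumerate(board) if cell == ' ']
    # X (player A) maximizes the value, O (player B) minimizes it.
    return max(values) if to_move == 'X' else min(values)

print(solve(' ' * 9, 'X'))  # 0: with best play from both sides the game is a draw
```

The cache only avoids re-solving positions that are reached by different move orders; it does not change the values that are computed.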

=== Minimax algorithm with alternate moves ===<!-- This section is linked from [[alpha–beta pruning]]. -->
A '''minimax algorithm'''<ref>{{Cite book
| first1 = Stuart J. | last1 = Russell | author1-link = Stuart J. Russell
| first2 = Peter     | last2 = Norvig  | author2-link = Peter Norvig
| title=[[Artificial Intelligence: A Modern Approach]]
| year = 2021
| edition = 4th 
| isbn = 9780134610993 
| lccn = 20190474
| publisher = Pearson | location = Hoboken
| pages = 149–150
}}
</ref> is a recursive [[algorithm]] for choosing the next move in an n-player [[game theory|game]], usually a two-player game. A value is associated with each position or state of the game. This value is computed by means of a [[evaluation function|position evaluation function]] and it indicates how good it would be for a player to reach that position. The player then makes the move that maximizes the minimum value of the position resulting from the opponent's possible following moves. If it is '''A'''<nowiki/>'s turn to move, '''A''' gives a value to each of their legal moves.

One possible allocation method assigns the value +1 to a certain win for '''A''' and −1 to a certain win for '''B'''. This leads to [[combinatorial game theory]] as developed by [[John Horton Conway|John H. Conway]]. An alternative is to use the rule that a move resulting in an immediate win for '''A''' is assigned positive infinity, and a move resulting in an immediate win for '''B''' is assigned negative infinity. The value to '''A''' of any other move is the minimum of the values resulting from each of '''B'''<nowiki/>'s possible replies, and '''A''' chooses the move whose value is the maximum. For this reason, '''A''' is called the ''maximizing player'' and '''B''' is called the ''minimizing player'', hence the name ''minimax algorithm''. Applied all the way to the end of the game, this procedure assigns a value of positive or negative infinity to every position, since the value of every position is the value of some final winning or losing position. In complicated games such as [[chess]] or [[Go (board game)|go]], however, this is generally possible only near the very end of the game, because from earlier positions it is not computationally feasible to look ahead as far as the completion of the game. Instead, positions are given finite values as estimates of the degree of belief that they will lead to a win for one player or the other.

This can be extended if we can supply a [[heuristic]] evaluation function which gives values to non-final game states without considering all possible following complete sequences. We can then limit the minimax algorithm to look only at a certain number of moves ahead. This number is called the "look-ahead", measured in "[[Ply (chess)|plies]]". For example, the chess computer [[IBM Deep Blue|Deep Blue]] (the first one to beat a reigning world champion, [[Garry Kasparov]] at that time) looked ahead at least 12&nbsp;plies, then applied a heuristic evaluation function.<ref>
{{cite journal 
 | last = Hsu | first = Feng-Hsiung
 | year = 1999
 | title = IBM's Deep Blue chess grandmaster chips
 | journal = IEEE Micro
 | volume = 19  | issue = 2  | pages = 70–81
 | location = Los Alamitos, CA, USA
 | publisher = IEEE Computer Society
 | doi = 10.1109/40.755469
 | quote = During the 1997 match, the software search extended the search to about 40&nbsp;plies along the forcing lines, even though the non-extended search reached only about 12&nbsp;plies.
}}
</ref>

The algorithm can be thought of as exploring the [[node (computer science)|node]]s of a ''[[game tree]]''. The ''effective [[branching factor]]'' of the tree is the average number of [[child node|children]] of each node (i.e., the average number of legal moves in a position). The number of nodes to be explored usually [[exponential growth|increases exponentially]] with the number of plies (it is less than exponential if evaluating [[forced move]]s or repeated positions): it is approximately the branching factor raised to the power of the number of plies. It is therefore [[Computational complexity theory#Intractability|impractical]] to completely analyze games such as chess using the minimax algorithm.
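
The growth can be made concrete with a short Python sketch that counts the nodes of an idealized tree in which every position has the same number of legal moves. The function name and the branching factor of 35 used in the loop are illustrative assumptions, not figures taken from the sources cited here.

```python
def nodes_in_uniform_tree(branching_factor, plies):
    """Total nodes in a tree where every node has the same number of children.

    Level k contains branching_factor ** k nodes, so the total is the
    geometric sum over levels 0..plies, dominated by its last term.
    """
    return sum(branching_factor ** k for k in range(plies + 1))

print(nodes_in_uniform_tree(2, 4))  # 31 = 1 + 2 + 4 + 8 + 16

# Assumed branching factor of 35, chosen only to show how quickly the
# count grows as the look-ahead deepens.
for plies in (2, 4, 6, 8):
    print(plies, nodes_in_uniform_tree(35, plies))
```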

The performance of the naïve minimax algorithm may be improved dramatically, without affecting the result, by the use of [[alpha–beta pruning]]. Other heuristic pruning methods can also be used, but not all of them are guaranteed to give the same result as the unpruned search.
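
The claim that pruning leaves the result unchanged can be checked on a small example. The Python sketch below is an assumption-laden illustration, not a reference implementation: the game tree is a hypothetical nested list whose leaves are scores for the maximizing player, and the <code>visited</code> list exists only to show which leaves were evaluated.

```python
import math

def minimax(node, maximizing):
    """Plain minimax over a nested-list tree; leaves are numbers."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax with alpha-beta pruning; records evaluated leaves in visited."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing player will never allow this branch
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing player already has a better option
    return value

tree = [[3, 5], [2, 9], [0, 1]]  # hypothetical tree, maximizing player to move
seen = []
print(minimax(tree, True))                   # 3
print(alphabeta(tree, True, visited=seen))   # 3
print(seen)                                  # [3, 5, 2, 0]
```

In this example the pruned search evaluates four of the six leaves and returns the same value as the full search.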

A naïve minimax algorithm may be trivially modified to additionally return an entire [[Variation (game tree)#Principal variation|principal variation]] along with a minimax score.
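
One way to make this modification is sketched below in Python. The nested-list tree, the function name <code>minimax_pv</code>, and the choice to represent the principal variation as a list of child indices are all assumptions made for this example.

```python
def minimax_pv(node, maximizing):
    """Return (value, path) for a nested-list tree whose leaves are numbers.

    path lists the child index chosen at each level along the principal
    variation, i.e. the line of play when both sides choose optimally.
    """
    if not isinstance(node, list):
        return node, []
    best_value, best_path = None, None
    for index, child in enumerate(node):
        value, path = minimax_pv(child, not maximizing)
        if best_value is None:
            better = True
        elif maximizing:
            better = value > best_value
        else:
            better = value < best_value
        if better:
            # Keep the first best child found; ties do not replace it.
            best_value, best_path = value, [index] + path
    return best_value, best_path

tree = [[3, 5], [2, 9], [0, 1]]  # hypothetical tree, maximizing player to move
print(minimax_pv(tree, True))    # (3, [0, 0])
```

Here the maximizing player picks child 0 and the minimizing player replies with that node's child 0, giving the score 3.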

=== Pseudocode ===
The [[pseudocode]] for the depth-limited minimax algorithm is given below.

 '''function''' minimax(node, depth, maximizingPlayer) '''is'''
     '''if''' depth = 0 '''or''' node is a terminal node '''then'''
         '''return''' the heuristic value of node
     '''if''' maximizingPlayer '''then'''
         value := −∞
         '''for each''' child of node '''do'''
             value := max(value, minimax(child, depth − 1, FALSE))
         '''return''' value
     '''else''' ''(* minimizing player *)''
         value := +∞
         '''for each''' child of node '''do'''
             value := min(value, minimax(child, depth − 1, TRUE))
         '''return''' value

 ''(* Initial call *)''
 minimax(origin, depth, TRUE)

The minimax function returns a heuristic value for [[leaf nodes]] (terminal nodes and nodes at the maximum search depth). Non-leaf nodes inherit their value from a descendant leaf node. The heuristic value is a score measuring the favorability of the node for the maximizing player. Hence nodes resulting in a favorable outcome, such as a win, for the maximizing player have higher scores than nodes more favorable for the minimizing player. The heuristic values for terminal (game-ending) leaf nodes are scores corresponding to a win, loss, or draw for the maximizing player. For non-terminal leaf nodes at the maximum search depth, an evaluation function estimates a heuristic value for the node. The quality of this estimate and the search depth determine the quality and accuracy of the final minimax result.
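
The pseudocode can be translated into Python almost line for line. In the sketch below, the <code>children</code> and <code>evaluate</code> callbacks, the dictionary-based toy tree, and all scores are hypothetical and serve only to show how the depth limit and the evaluation function interact.

```python
import math

def minimax(node, depth, maximizing_player, children, evaluate):
    """Depth-limited minimax; children and evaluate are caller-supplied."""
    kids = children(node)
    if depth == 0 or not kids:  # depth limit reached or terminal node
        return evaluate(node)
    if maximizing_player:
        value = -math.inf
        for child in kids:
            value = max(value, minimax(child, depth - 1, False, children, evaluate))
        return value
    value = math.inf
    for child in kids:
        value = min(value, minimax(child, depth - 1, True, children, evaluate))
    return value

# Hypothetical toy game: node names, tree shape and scores are made up.
TREE = {'root': ['a', 'b'], 'a': ['a1', 'a2'], 'b': ['b1', 'b2']}
SCORE = {'root': 0, 'a': 4, 'b': 6, 'a1': 3, 'a2': 5, 'b1': 2, 'b2': 9}

def children(node):
    return TREE.get(node, [])

def evaluate(node):
    return SCORE[node]

print(minimax('root', 2, True, children, evaluate))  # 3: full search of the toy tree
print(minimax('root', 1, True, children, evaluate))  # 6: relies on estimates for a and b
```

With a look-ahead of one ply the search trusts the estimate for node <code>b</code>, while the two-ply search finds that the minimizing player can hold <code>b</code> to 2, so the two calls return different values.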

Minimax treats the two players (the maximizing player and the minimizing player) separately in its code. Based on the observation that <math>\ \max(a,b) = -\min(-a,-b)\ ,</math> minimax may often be simplified into the [[negamax]] algorithm.
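
A minimal Python sketch of this simplification follows. The nested-list tree and the <code>color</code> parameter (+1 when the maximizing player is to move, −1 otherwise) are assumptions of this example; a plain minimax function is included only so that the two results can be compared.

```python
def minimax(node, maximizing):
    """Plain minimax over a nested-list tree; leaves are numbers."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def negamax(node, color):
    """Value of node from the viewpoint of the player to move.

    Leaf scores are stored from the maximizing player's viewpoint, so they
    are multiplied by color. A single max over negated child values covers
    both players, because max(a, b) = -min(-a, -b).
    """
    if not isinstance(node, list):
        return color * node
    return max(-negamax(child, -color) for child in node)

tree = [[3, 5], [2, 9], [0, 1]]  # hypothetical tree
print(minimax(tree, True))   # 3
print(negamax(tree, 1))      # 3
```

When the minimizing player is to move at the root, the minimax value is recovered as <code>-negamax(tree, -1)</code>.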

=== Example ===
[[Image:Minimax.svg|thumb|400px|A minimax tree example]]
[[File:Plminmax.gif|thumb|400px|An animated pedagogical example that attempts to be human-friendly by substituting initial infinite (or arbitrarily large) values for emptiness and by avoiding using the [[negamax]] coding simplifications.]]

Suppose the game being played has at most two possible moves per player each turn. The algorithm generates the [[game tree|tree]] on the right, where the circles represent the moves of the player running the algorithm (''maximizing player''), and the squares represent the moves of the opponent (''minimizing player''). Because computational resources are limited, as explained above, the tree is limited to a ''look-ahead'' of 4&nbsp;moves.

The algorithm evaluates each ''[[leaf node]]'' using a heuristic evaluation function, obtaining the values shown. Moves where the ''maximizing player'' wins are assigned positive infinity, while moves that lead to a win for the ''minimizing player'' are assigned negative infinity. At level&nbsp;3, the algorithm chooses, for each node, the '''smallest''' of the ''[[child node]]'' values and assigns it to that node (e.g. the node on the left takes the minimum of "10" and "+∞", and is therefore assigned the value "10"). The next step, at level&nbsp;2, consists of choosing for each node the '''largest''' of the ''child node'' values. Once again, the values are assigned to each ''[[parent node]]''. The algorithm continues taking the maximum and minimum of the child node values alternately until it reaches the ''[[root node]]'', where it chooses the move with the largest value (represented in the figure with a blue arrow). This is the move that the player should make in order to ''minimize'' the ''maximum'' possible [[loss function|loss]].