== Game theory ==

=== In general games ===
The '''maximin value''' is the highest value that the player can be sure to get without knowing the actions of the other players; equivalently, it is the lowest value the other players can force the player to receive when they know the player's action. Its formal definition is:<ref name=ZMS2013>{{cite book |author1=Maschler, Michael |author2=Solan, Eilon |author2-link=Eilon Solan |author3=Zamir, Shmuel |year=2013 |title=Game Theory |publisher=[[Cambridge University Press]] |isbn=9781107005488 |pages=176–180}}</ref>

:<math>\underline{v_i} = \max_{a_i} \min_{a_{-i}} {v_i(a_i,a_{-i})}</math>

Where:

* {{mvar|i}} is the index of the player of interest.
* <math>-i</math> denotes all other players except player {{mvar|i}}.
* <math>a_i</math> is the action taken by player {{mvar|i}}.
* <math>a_{-i}</math> denotes the actions taken by all other players.
* <math>v_i</math> is the value function of player {{mvar|i}}.

Calculating the maximin value of a player is done in a worst-case approach: for each possible action of the player, we check all possible actions of the other players and determine the worst possible combination of actions – the one that gives player {{mvar|i}} the smallest value. Then, we determine which action player {{mvar|i}} can take in order to make sure that this smallest value is the highest possible.

For example, consider the following game for two players, where the first player ("row player") may choose any of three moves, labelled {{mvar|T}}, {{mvar|M}}, or {{mvar|B}}, and the second player ("column player") may choose either of two moves, {{mvar|L}} or {{mvar|R}}. The result of the combination of both moves is expressed in a payoff table:

:<math>\begin{array}{c|cc}
\hline
&
L &
R \\
\hline
T &
3,1 &
2,-20
\\
M &
5,0 &
-10,1
\\
B &
-100,2 &
4,4 \\
\hline
\end{array}</math>
(where the first number in each cell is the payoff of the row player and the second number is the payoff of the column player).

For the sake of example, we consider only [[Strategy (game theory)#Pure and mixed strategies|pure strategies]]. Check each player in turn:
* The row player can play {{mvar|T}}, which guarantees them a payoff of at least {{val|2}} (playing {{mvar|B}} is risky since it can lead to payoff {{val|-100}}, and playing {{mvar|M}} can result in a payoff of {{val|-10}}). Hence: <math>\underline{v_{row}} = 2</math>.
* The column player can play {{mvar|L}} and secure a payoff of at least {{val|0}} (playing {{mvar|R}} puts them at risk of getting {{val|-20}}). Hence: <math>\underline{v_{col}} = 0</math>.

If both players play their respective maximin strategies <math>(T,L)</math>, the payoff vector is <math>(3,1)</math>.
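
The worst-case procedure described above can be sketched in a few lines of Python. This is only an illustration for the payoff table of this example; the names <code>payoffs</code>, <code>row_maximin</code> and <code>col_maximin</code> are hypothetical and do not come from the cited source.

```python
# Payoff table of the example: payoffs[(row move, column move)] = (row payoff, column payoff).
# All names here are illustrative assumptions, not part of any library.
payoffs = {
    ("T", "L"): (3, 1),    ("T", "R"): (2, -20),
    ("M", "L"): (5, 0),    ("M", "R"): (-10, 1),
    ("B", "L"): (-100, 2), ("B", "R"): (4, 4),
}
ROW_MOVES = ["T", "M", "B"]
COL_MOVES = ["L", "R"]


def row_maximin():
    """Return (move, value) for the row player's pure maximin strategy."""
    # Worst payoff the row player can receive after each of their own moves.
    worst = {r: min(payoffs[(r, c)][0] for c in COL_MOVES) for r in ROW_MOVES}
    best = max(worst, key=worst.get)  # move whose worst case is largest
    return best, worst[best]


def col_maximin():
    """Return (move, value) for the column player's pure maximin strategy."""
    worst = {c: min(payoffs[(r, c)][1] for r in ROW_MOVES) for c in COL_MOVES}
    best = max(worst, key=worst.get)
    return best, worst[best]


print(row_maximin())  # ('T', 2)
print(col_maximin())  # ('L', 0)
```

The sketch enumerates pure strategies only, matching the restriction made in the example.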

The '''minimax value''' of a player is the smallest value that the other players can force the player to receive, without knowing the player's actions; equivalently, it is the largest value the player can be sure to get when they ''know'' the actions of the other players. Its formal definition is:<ref name=ZMS2013/>

:<math>\overline{v_i} = \min_{a_{-i}} \max_{a_i} {v_i(a_i,a_{-i})}</math>

The definition is very similar to that of the maximin value – only the order of the maximum and minimum operators is reversed. In the above example:
* The row player can get a maximum value of {{val|4}} (if the other player plays {{mvar|R}}) or {{val|5}} (if the other player plays {{mvar|L}}), so: <math>\overline{v_{row}} = 4\ .</math>
* The column player can get a maximum value of {{val|1}} (if the other player plays {{mvar|T}}), {{val|1}} (if {{mvar|M}}) or {{val|4}} (if {{mvar|B}}). Hence: <math>\overline{v_{col}} = 1\ .</math>
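
The same two values can be reproduced with a short Python sketch that swaps the order of the two operators: first take each player's best reply to every opposing move, then take the smallest of those best replies. The function names are hypothetical and the table is the one from this example.

```python
# Payoff table of the example: payoffs[(row move, column move)] = (row payoff, column payoff).
# All names here are illustrative assumptions.
payoffs = {
    ("T", "L"): (3, 1),    ("T", "R"): (2, -20),
    ("M", "L"): (5, 0),    ("M", "R"): (-10, 1),
    ("B", "L"): (-100, 2), ("B", "R"): (4, 4),
}
ROW_MOVES = ["T", "M", "B"]
COL_MOVES = ["L", "R"]


def row_minimax():
    """Smallest payoff the column player can hold the row player to (pure strategies)."""
    # Inner max: the row player's best reply to each column move.
    best_reply = {c: max(payoffs[(r, c)][0] for r in ROW_MOVES) for c in COL_MOVES}
    # Outer min: the column move that makes that best reply as small as possible.
    return min(best_reply.values())


def col_minimax():
    """Smallest payoff the row player can hold the column player to (pure strategies)."""
    best_reply = {r: max(payoffs[(r, c)][1] for c in COL_MOVES) for r in ROW_MOVES}
    return min(best_reply.values())


print(row_minimax())  # 4
print(col_minimax())  # 1
```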

For every player {{mvar|i}}, the maximin is at most the minimax:
:<math>\underline{v_i} \leq \overline{v_i}</math>
Intuitively, in maximin the maximization is the outer operation, so player {{mvar|i}} must commit to an action before knowing what the others will do, and the others can then respond with the most damaging choice; in minimax the maximization is the inner operation, so player {{mvar|i}} is in a much better position: they maximize their value knowing what the others did.

Another way to understand the ''notation'' is by reading from right to left: When we write
:<math>\overline{v_i} = \min_{a_{-i}} \max_{a_i} {v_i(a_i,a_{-i})} = \min_{a_{-i}} \Big( \max_{a_i} {v_i(a_i,a_{-i})} \Big) </math>
the initial set of outcomes <math>\ v_i(a_i,a_{-i})\ </math> depends on both <math>\ {a_{i}}\ </math> and <math>\ {a_{-i}}\ .</math> We first ''marginalize away'' <math>{a_{i}}</math> from <math>v_i(a_i,a_{-i})</math>, by maximizing over <math>\ {a_{i}}\ </math> (for every possible value of <math>{a_{-i}}</math>) to yield a set of marginal outcomes <math>\ v'_i(a_{-i})\,,</math> which depends only on <math>\ {a_{-i}}\ .</math> We then minimize these marginal outcomes over <math>\ {a_{-i}}\ .</math> (For maximin the two steps are swapped: first minimize over <math>{a_{-i}}</math>, then maximize over <math>{a_{i}}</math>.)

Although it is always the case that <math>\ \underline{v_{row}} \leq \overline{v_{row}}\ </math> and <math>\ \underline{v_{col}} \leq \overline{v_{col}}\,,</math> the payoff vector resulting from both players playing their minimax strategies, <math>\ (2,-20)\ </math> in the case of <math>\ (T,R)\ </math> or <math>(-10,1)</math> in the case of <math>\ (M,R)\,,</math> cannot similarly be ranked against the payoff vector <math>\ (3,1)\ </math> resulting from both players playing their maximin strategies.

=== In zero-sum games ===
<span id='Minimax theorem'></span><!-- added label in order not to break incoming links -->
In two-player [[zero-sum game]]s, the minimax solution is the same as the [[Nash equilibrium]].

In the context of zero-sum games, the [[minimax theorem]] is equivalent to:<ref name=Osborne>{{cite book |author1=Osborne, Martin J. |author2=Rubinstein, A. |author2-link=Ariel Rubinstein |year=1994 |title=A Course in Game Theory |place=Cambridge, MA |publisher=MIT Press |edition=print |isbn=9780262150415}}</ref>{{Failed verification|date=February 2015}}

<blockquote>For every two-person [[zero-sum]] game with finitely many strategies, there exists a value {{mvar|V}} and a mixed strategy for each player, such that
:(a) Given Player&nbsp;2's strategy, the best payoff possible for Player&nbsp;1 is {{mvar|V}}, and
:(b) Given Player&nbsp;1's strategy, the best payoff possible for Player 2 is −{{mvar|V}}.
</blockquote>
Equivalently, Player&nbsp;1's strategy guarantees them a payoff of {{mvar|V}} regardless of Player&nbsp;2's strategy, and similarly Player&nbsp;2 can guarantee themselves a payoff of −{{mvar|V}}. The name ''minimax'' arises because each player minimizes the maximum payoff possible for the other – since the game is zero-sum, they also minimize their own maximum loss (i.e., maximize their minimum payoff).
See also [[example of a game without a value]].

=== Example ===
{| class="wikitable" style="text-align:center; float:right; margin-left:1em"
 |+  align="bottom" style="caption-side: bottom" | Payoff matrix for player A
 ! 
 ! B chooses B1
 ! B chooses B2
 ! B chooses B3
 |-
 ! A chooses A1
 |  +3
 |  −2
 |  +2
 |-
 ! A chooses A2
 |  −1
 |  {{0|+}}0
 |  +4
 |-
 ! A chooses A3
 |  −4
 |  −3
 |  +1
|}
The following example of a zero-sum game, where '''A''' and '''B''' make simultaneous moves, illustrates ''maximin'' solutions. Suppose each player has three choices and consider the [[payoff matrix]] for '''A''' displayed in the table ("Payoff matrix for player&nbsp;A"). Assume the payoff matrix for '''B''' is the same matrix with the signs reversed (i.e., if the choices are A1 and B1 then '''B''' pays&nbsp;3 to '''A'''). Then, the maximin choice for '''A''' is A2 since the worst possible result is then having to pay&nbsp;1, while the simple maximin choice for '''B''' is B2 since the worst possible result is then no payment. However, this solution is not stable: if '''B''' believes '''A''' will choose A2 then '''B''' will choose B1 to gain&nbsp;1; if '''A''' in turn believes '''B''' will choose B1 then '''A''' will choose A1 to gain&nbsp;3; '''B''' will then choose B2; and so on, with no pair of pure choices that both players are content to keep. So a more stable strategy is needed.
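
The instability can be checked numerically. In a zero-sum game given by a single matrix, a pair of pure choices is stable only if the largest row minimum equals the smallest column maximum (a saddle point). The following Python sketch, with hypothetical variable names, shows that the two numbers differ for this matrix.

```python
# Payoff matrix for player A (rows A1..A3, columns B1..B3); B receives the negatives.
# Variable names are illustrative assumptions.
A = [
    [3, -2, 2],
    [-1, 0, 4],
    [-4, -3, 1],
]

# Worst result for A in each row, and worst result for B (largest payment) in each column.
row_worst = [min(row) for row in A]        # [-2, -1, -4]
col_worst = [max(col) for col in zip(*A)]  # [3, 0, 4]

maximin_A = max(row_worst)  # -1, reached by A2
minimax_B = min(col_worst)  # 0, reached by B2

print(maximin_A, minimax_B)    # -1 0
print(maximin_A == minimax_B)  # False: no saddle point in pure strategies
```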

Some choices are ''dominated'' by others and can be eliminated: '''A''' will not choose A3 since either A1 or A2 will produce a better result, no matter what '''B''' chooses; '''B''' will not choose B3 since some mixtures of B1 and B2 will produce a better result, no matter what '''A''' chooses.
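
Both dominance claims can be verified directly. The sketch below is an illustration only: the function names are hypothetical, and the mixture of {{sfrac|1|3}} B1 and {{sfrac|2|3}} B2 is just one assumed example of a mixture that dominates B3.

```python
from fractions import Fraction as F

# Payoff matrix for player A (rows A1..A3, columns B1..B3); B receives the negatives.
A = [
    [3, -2, 2],
    [-1, 0, 4],
    [-4, -3, 1],
]


def row_strictly_dominates(i, j):
    """True if row i pays A strictly more than row j against every column."""
    return all(a > b for a, b in zip(A[i], A[j]))


def mixture_dominates_B3(p):
    """B plays B1 with probability p and B2 with probability 1 - p.

    True if, against every row, A receives strictly less than under B3,
    i.e. the mixture is strictly better for B than B3.
    """
    return all(p * row[0] + (1 - p) * row[1] < row[2] for row in A)


print(row_strictly_dominates(0, 2))   # True: A1 dominates A3
print(row_strictly_dominates(1, 2))   # True: A2 dominates A3
print(mixture_dominates_B3(F(1, 3)))  # True: this mixture dominates B3
print(mixture_dominates_B3(F(1)))     # False: B1 alone does not dominate B3
```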

Player '''A''' can avoid having to make an expected payment of more than {{sfrac|1| 3 }} by choosing A1 with probability {{sfrac|1| 6 }} and A2 with probability {{nobr| {{sfrac|5| 6 }}:}} The expected payoff for '''A''' would be {{nobr|  3 × {{sfrac|1| 6 }} − 1 × {{sfrac|5| 6 }} {{=}} {{sfrac|−|1| 3 }}  }} in case '''B''' chose B1 and {{nobr|  −2 × {{sfrac|1|6 }} + 0 × {{sfrac|5| 6 }} {{=}} {{sfrac|−|1| 3 }}  }} in case '''B''' chose B2. Similarly, '''B''' can ensure an expected gain of at least {{sfrac|1| 3 }}, no matter what '''A''' chooses, by using a randomized strategy of choosing B1 with probability {{sfrac|1| 3 }} and B2 with probability {{sfrac|2| 3 }}. These [[mixed strategy|mixed]] minimax strategies cannot be improved and are now stable.
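
The arithmetic above can be reproduced with exact fractions. This Python sketch (helper names are hypothetical) evaluates each mixed strategy against every pure choice of the opponent and confirms that '''A''' can guarantee at least −{{sfrac|1|3}} while '''B''' can hold '''A''' to at most −{{sfrac|1|3}}.

```python
from fractions import Fraction as F

# Payoff matrix for player A (rows A1..A3, columns B1..B3); B receives the negatives.
A = [
    [3, -2, 2],
    [-1, 0, 4],
    [-4, -3, 1],
]

p = [F(1, 6), F(5, 6), F(0)]  # A's mixed strategy over A1, A2, A3
q = [F(1, 3), F(2, 3), F(0)]  # B's mixed strategy over B1, B2, B3


def payoff_vs_columns(p):
    """Expected payoff to A when A mixes with p and B plays each pure column."""
    return [sum(p[i] * A[i][j] for i in range(3)) for j in range(3)]


def payoff_vs_rows(q):
    """Expected payoff to A when B mixes with q and A plays each pure row."""
    return [sum(q[j] * A[i][j] for j in range(3)) for i in range(3)]


a_guarantee = min(payoff_vs_columns(p))  # worst case for A under p
b_guarantee = max(payoff_vs_rows(q))     # best A can achieve against q

print(payoff_vs_columns(p))      # [Fraction(-1, 3), Fraction(-1, 3), Fraction(11, 3)]
print(payoff_vs_rows(q))         # [Fraction(-1, 3), Fraction(-1, 3), Fraction(-10, 3)]
print(a_guarantee, b_guarantee)  # -1/3 -1/3
```

Because the two guarantees coincide, neither player can do better by deviating unilaterally, which is what makes the pair of mixed strategies stable.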

=== Maximin ===
Frequently, in game theory, '''maximin''' is distinct from minimax. Minimax is used in zero-sum games to denote minimizing the opponent's maximum payoff. In a [[zero-sum game]], this is identical to minimizing one's own maximum loss, and to maximizing one's own minimum gain.

"Maximin" is a term commonly used for non-zero-sum games to describe the strategy which maximizes one's own minimum payoff. In non-zero-sum games, this is not generally the same as minimizing the opponent's maximum gain, nor the same as the [[Nash equilibrium]] strategy.

=== In repeated games ===
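The minimax values are a key ingredient in the theory of [[repeated games]]. One of the central theorems in this theory, the [[folk theorem (game theory)|folk theorem]], relies on them: roughly, it characterizes the payoffs that can be sustained in equilibrium in terms of each player receiving at least their minimax value.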