==Overview==
The [[thermodynamic state|state]] ''s'' of some [[physical system]]s, and the function ''E''(''s'') to be minimized, is analogous to the [[internal energy]] of the system in that state. The goal is to bring the system, from an arbitrary ''initial state'', to a state with the minimum possible energy.

[[File:Hill Climbing with Simulated Annealing.gif|center|thumb|500px|Simulated annealing searching for a maximum. The objective here is to get to the highest point. In this example, it is not enough to use a simple [[hill climbing|hill climb algorithm]], as there are many [[Hill climbing algorithm#Local maxima|local maxima]]. By cooling the temperature slowly, the global maximum is found.]]

===The basic iteration===
At each step, the simulated annealing heuristic considers some neighboring state ''s*'' of the current state ''s'', and [[probabilistic]]ally decides between moving the system to state ''s*'' or staying in state ''s''. These probabilities ultimately lead the system to move to states of lower energy. Typically this step is repeated until the system reaches a state that is good enough for the application, or until a given computation budget has been exhausted.

===The neighbors of a state===
Optimization of a solution involves evaluating the neighbors of a state of the problem, which are new states produced by conservatively altering a given state. For example, in the [[traveling salesman problem]] each state is typically defined as a [[permutation]] of the cities to be visited, and the neighbors of any state are the set of permutations produced by swapping any two of these cities. The well-defined way in which the states are altered to produce neighboring states is called a "move", and different moves give different sets of neighboring states. These moves usually result in minimal alterations of the last state, in an attempt to progressively improve the solution by iteratively improving its parts (such as the city connections in the traveling salesman problem). An even better move is to reverse the order of an interval of cities: this is a smaller move, since swapping two cities can be achieved by reversing an interval twice.

Simple [[heuristic]]s like [[hill climbing]], which move from one better neighbor to the next and stop once they reach a solution with no better neighbors, cannot guarantee to find any of the better solutions that exist{{snd}} their outcome may easily be just a [[local optimum]], while the actual best solution would be a [[global optimum]] that could be different. [[Metaheuristic]]s use the neighbors of a solution as a way to explore the solution space, and although they prefer better neighbors, they also accept worse neighbors in order to avoid getting stuck in local optima; they can find the global optimum if run for long enough.
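As an illustration, the interval-reversal move described above can be implemented in a few lines. The following is a minimal sketch, assuming a tour represented as a Python list of city indices; the function names are illustrative:

<syntaxhighlight lang="python">
import random

def reverse_segment(tour, i, j):
    """Return a neighboring tour obtained by reversing the segment tour[i..j]."""
    neighbor = tour[:]  # copy, so the current state is left unchanged
    neighbor[i:j + 1] = reversed(neighbor[i:j + 1])
    return neighbor

def random_neighbor(tour):
    """Generate one neighboring state by reversing a randomly chosen interval."""
    i, j = sorted(random.sample(range(len(tour)), 2))
    return reverse_segment(tour, i, j)
</syntaxhighlight>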
===Acceptance probabilities===
The probability of making the [[state transition|transition]] from the current state <math>s</math> to a candidate new state <math>s_\mathrm{new}</math> is specified by an ''acceptance probability function'' <math>P(e, e_\mathrm{new}, T)</math>, which depends on the energies <math>e = E(s)</math> and <math>e_\mathrm{new} = E(s_\mathrm{new})</math> of the two states, and on a global time-varying parameter <math>T</math> called the ''temperature''. States with a smaller energy are better than those with a greater energy.

The probability function <math>P</math> must be positive even when <math>e_\mathrm{new}</math> is greater than <math>e</math>. This feature prevents the method from becoming stuck at a local minimum that is worse than the global one. When <math>T</math> tends to zero, the probability <math>P(e, e_\mathrm{new}, T)</math> must tend to zero if <math>e_\mathrm{new} > e</math> and to a positive value otherwise. For sufficiently small values of <math>T</math>, the system will then increasingly favor moves that go "downhill" (i.e., to lower energy values), and avoid those that go "uphill". With <math>T = 0</math> the procedure reduces to the [[greedy algorithm]], which makes only the downhill transitions.

In the original description of simulated annealing, the probability <math>P(e, e_\mathrm{new}, T)</math> was equal to 1 when <math>e_\mathrm{new} < e</math>; that is, the procedure always moved downhill when it found a way to do so, irrespective of the temperature. Many descriptions and implementations of simulated annealing still take this condition as part of the method's definition. However, this condition is not essential for the method to work.

The <math>P</math> function is usually chosen so that the probability of accepting a move decreases as the difference <math>e_\mathrm{new} - e</math> increases; that is, small uphill moves are more likely to be accepted than large ones. However, this requirement is not strictly necessary, provided that the requirements above are met.

Given these properties, the temperature <math>T</math> plays a crucial role in controlling the evolution of the state <math>s</math> of the system with regard to its sensitivity to variations in the system energies. To be precise, for a large <math>T</math>, the evolution of <math>s</math> is sensitive only to coarser energy variations, while it becomes sensitive to finer energy variations when <math>T</math> is small.
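A common choice satisfying all of these properties is the Metropolis acceptance criterion: accept with probability 1 when <math>e_\mathrm{new} < e</math>, and with probability <math>\exp(-(e_\mathrm{new} - e)/T)</math> otherwise. A minimal sketch in Python (the function names are illustrative, and the zero-temperature guard is an implementation convenience):

<syntaxhighlight lang="python">
import math
import random

def acceptance_probability(e, e_new, t):
    """Metropolis criterion: always accept downhill moves, and accept
    uphill moves with probability exp(-(e_new - e) / t)."""
    if e_new < e:
        return 1.0
    if t <= 0:
        return 0.0  # at zero temperature, reject all uphill moves (greedy behavior)
    return math.exp(-(e_new - e) / t)

def accept(e, e_new, t):
    """Decide probabilistically whether to move to the candidate state."""
    return random.random() < acceptance_probability(e, e_new, t)
</syntaxhighlight>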
===The annealing schedule===
{{multiple image
 | align = right
 | total_width = 400
 | image_gap = 1
 | image1 = SimulatedAnnealingFast.jpg
 | alt1 = Fast
 | caption1 = Fast
 | image2 = SimulatedAnnealingSlow.jpg
 | alt2 = Slow
 | caption2 = Slow
 | footer = Example illustrating the effect of the cooling schedule on the performance of simulated annealing. The problem is to rearrange the [[pixel]]s of an image so as to minimize a certain [[potential energy]] function, which causes similar [[color]]s to attract at short range and repel at a slightly larger distance. The elementary moves swap two adjacent pixels. These images were obtained with a fast cooling schedule (left) and a slow cooling schedule (right), producing results similar to [[amorphous]] and [[crystalline solid]]s, respectively.
}}
The name and inspiration of the algorithm call for a distinctive feature related to the temperature variation to be embedded in its operational characteristics: the temperature must be gradually reduced as the simulation proceeds. The algorithm starts with <math>T</math> set to a high value (or infinity), which is then decreased at each step following some ''annealing schedule''; the schedule may be specified by the user but must end with <math>T = 0</math> towards the end of the allotted time budget. In this way, the system is expected to wander initially towards a broad region of the search space containing good solutions, ignoring small features of the energy function; then drift towards low-energy regions that become narrower and narrower; and finally move downhill according to the [[steepest descent]] heuristic.

For any given finite problem, the probability that the simulated annealing algorithm terminates with a [[global optimum|globally optimal]] solution approaches 1 as the annealing schedule is extended.<ref>{{cite journal |doi=10.1109/34.295910 |title=Simulated annealing: A proof of convergence |year=1994 |last1=Granville |first1=V. |last2=Krivanek |first2=M. |last3=Rasson |first3=J.-P. |journal=IEEE Transactions on Pattern Analysis and Machine Intelligence |volume=16 |issue=6 |pages=652–656}}</ref> This theoretical result, however, is not particularly helpful, since the time required to ensure a significant probability of success will usually exceed the time required for a [[Brute-force search|complete search]] of the [[solution space]].<ref>{{Citation |last1=Nolte |first1=Andreas |last2=Schrader |first2=Rainer |title=A Note on the Finite Time Behaviour of Simulated Annealing |date=1997 |work=Operations Research Proceedings 1996 |volume=1996 |pages=175–180 |place=Berlin, Heidelberg |publisher=Springer Berlin Heidelberg |doi=10.1007/978-3-642-60744-8_32 |isbn=978-3-540-62630-5 |url=http://link.springer.com/10.1007/978-3-642-60744-8_32 |access-date=2023-02-06 |url-access=subscription}}</ref>
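Putting these pieces together, the overall procedure can be sketched compactly. The following is a minimal sketch, assuming a geometric cooling schedule <math>T_{k+1} = \alpha T_k</math>; the schedule, the parameter values, and the function names are illustrative choices rather than part of the method's definition:

<syntaxhighlight lang="python">
import math
import random

def simulated_annealing(initial_state, energy, random_neighbor,
                        t_initial=1.0, alpha=0.995, steps=100_000):
    """Minimize `energy` starting from `initial_state`, using the
    caller-supplied `random_neighbor` move and a geometric cooling
    schedule (an illustrative choice of annealing schedule)."""
    s = initial_state
    e = energy(s)
    best_s, best_e = s, e
    t = t_initial
    for _ in range(steps):
        s_new = random_neighbor(s)
        e_new = energy(s_new)
        # Metropolis criterion: always accept downhill moves, sometimes uphill ones.
        if e_new < e or (t > 0 and random.random() < math.exp(-(e_new - e) / t)):
            s, e = s_new, e_new
            if e < best_e:
                best_s, best_e = s, e
        t *= alpha  # cool down gradually as the computation budget is spent
    return best_s, best_e
</syntaxhighlight>

For the traveling salesman example above, <code>random_neighbor</code> could be the interval-reversal move sketched earlier, with <code>energy</code> returning the total length of the tour.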