Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Genetic algorithm
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
== Methodology == === Optimization problems === In a genetic algorithm, a [[population]] of [[candidate solution]]s (called individuals, creatures, organisms, or [[phenotype]]s) to an optimization problem is evolved toward better solutions. Each candidate solution has a set of properties (its [[chromosome]]s or [[genotype]]) which can be mutated and altered; traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible.{{sfn|Whitley|1994|p=66}} The evolution usually starts from a population of randomly generated individuals, and is an [[Iteration|iterative process]], with the population in each iteration called a ''generation''. In each generation, the [[fitness (biology)|fitness]] of every individual in the population is evaluated; the fitness is usually the value of the [[objective function]] in the optimization problem being solved. The more fit individuals are [[Stochastics|stochastically]] selected from the current population, and each individual's genome is modified ([[Crossover (genetic algorithm)|recombined]] and possibly randomly mutated) to form a new generation. The new generation of candidate solutions is then used in the next iteration of the [[algorithm]]. Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population. A typical genetic algorithm requires: # a [[genetic representation]] of the solution domain, # a [[fitness function]] to evaluate the solution domain. A standard representation of each candidate solution is as an [[bit array|array of bits]] (also called ''bit set'' or ''bit string'').{{sfn|Whitley|1994|p=66}} Arrays of other types and structures can be used in essentially the same way. The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size, which facilitates simple [[Crossover (genetic algorithm)|crossover]] operations. Variable length representations may also be used, but crossover implementation is more complex in this case. Tree-like representations are explored in [[genetic programming]] and graph-form representations are explored in [[evolutionary programming]]; a mix of both linear chromosomes and trees is explored in [[gene expression programming]]. Once the genetic representation and the fitness function are defined, a GA proceeds to initialize a population of solutions and then to improve it through repetitive application of the mutation, crossover, inversion and selection operators. ==== Initialization ==== The population size depends on the nature of the problem, but typically contains hundreds or thousands of possible solutions. Often, the initial population is generated randomly, allowing the entire range of possible solutions (the [[Feasible region|search space]]). Occasionally, the solutions may be "seeded" in areas where optimal solutions are likely to be found or the distribution of the sampling probability tuned to focus in those areas of greater interest.<ref>{{cite journal |last1=Luque-Rodriguez |first1=Maria |last2=Molina-Baena |first2=Jose |last3=Jimenez-Vilchez |first3=Alfonso |last4=Arauzo-Azofra |first4=Antonio |title=Initialization of Feature Selection Search for Classification (sec. 3) |journal=Journal of Artificial Intelligence Research |date=2022 |volume=75 |pages=953β983 |doi=10.1613/jair.1.14015 |url=https://www.jair.org/index.php/jair/article/view/14015|doi-access=free }}</ref> ==== Selection ==== {{Main|Selection (genetic algorithm)}} During each successive generation, a portion of the existing population is [[selection (genetic algorithm)|selected]] to reproduce for a new generation. Individual solutions are selected through a ''fitness-based'' process, where [[Fitness (biology)|fitter]] solutions (as measured by a [[fitness function]]) are typically more likely to be selected. Certain selection methods rate the fitness of each solution and preferentially select the best solutions. Other methods rate only a random sample of the population, as the former process may be very time-consuming. The fitness function is defined over the genetic representation and measures the ''quality'' of the represented solution. The fitness function is always problem-dependent. For instance, in the [[knapsack problem]] one wants to maximize the total value of objects that can be put in a knapsack of some fixed capacity. A representation of a solution might be an array of bits, where each bit represents a different object, and the value of the bit (0 or 1) represents whether or not the object is in the knapsack. Not every such representation is valid, as the size of objects may exceed the capacity of the knapsack. The ''fitness'' of the solution is the sum of values of all objects in the knapsack if the representation is valid, or 0 otherwise. In some problems, it is hard or even impossible to define the fitness expression; in these cases, a [[Computer simulation|simulation]] may be used to determine the fitness function value of a [[phenotype]] (e.g. [[computational fluid dynamics]] is used to determine the air resistance of a vehicle whose shape is encoded as the phenotype), or even [[Interactive evolutionary computation|interactive genetic algorithms]] are used. ==== Genetic operators ==== {{Main|Crossover (genetic algorithm)|Mutation (genetic algorithm)}} The next step is to generate a second generation population of solutions from those selected, through a combination of [[genetic operator]]s: [[crossover (genetic algorithm)|crossover]] (also called recombination), and [[mutation (genetic algorithm)|mutation]]. For each new solution to be produced, a pair of "parent" solutions is selected for breeding from the pool selected previously. By producing a "child" solution using the above methods of crossover and mutation, a new solution is created which typically shares many of the characteristics of its "parents". New parents are selected for each new child, and the process continues until a new population of solutions of appropriate size is generated. Although reproduction methods that are based on the use of two parents are more "biology inspired", some research<ref>Eiben, A. E. et al (1994). "Genetic algorithms with multi-parent recombination". PPSN III: Proceedings of the International Conference on Evolutionary Computation. The Third Conference on Parallel Problem Solving from Nature: 78–87. {{ISBN|3-540-58484-6}}.</ref><ref>Ting, Chuan-Kang (2005). "On the Mean Convergence Time of Multi-parent Genetic Algorithms Without Selection". Advances in Artificial Life: 403–412. {{ISBN|978-3-540-28848-0}}.</ref> suggests that more than two "parents" generate higher quality chromosomes. These processes ultimately result in the next generation population of chromosomes that is different from the initial generation. Generally, the average fitness will have increased by this procedure for the population, since only the best organisms from the first generation are selected for breeding, along with a small proportion of less fit solutions. These less fit solutions ensure genetic diversity within the genetic pool of the parents and therefore ensure the genetic diversity of the subsequent generation of children. Opinion is divided over the importance of crossover versus mutation. There are many references in [[David B. Fogel|Fogel]] (2006) that support the importance of mutation-based search. Although crossover and mutation are known as the main genetic operators, it is possible to use other operators such as regrouping, colonization-extinction, or migration in genetic algorithms.{{citation needed|date=November 2019}} It is worth tuning parameters such as the [[Mutation (genetic algorithm)|mutation]] probability, [[Crossover (genetic algorithm)|crossover]] probability and population size to find reasonable settings for the problem's [[complexity class]] being worked on. A very small mutation rate may lead to [[genetic drift]] (which is non-[[Ergodicity|ergodic]] in nature). A recombination rate that is too high may lead to premature convergence of the genetic algorithm. A mutation rate that is too high may lead to loss of good solutions, unless [[#Elitism|elitist selection]] is employed. An adequate population size ensures sufficient genetic diversity for the problem at hand, but can lead to a waste of computational resources if set to a value larger than required. ==== Heuristics ==== In addition to the main operators above, other [[heuristic]]s may be employed to make the calculation faster or more robust. The ''speciation'' heuristic penalizes crossover between candidate solutions that are too similar; this encourages population diversity and helps prevent premature convergence to a less optimal solution.<ref>{{Cite book|title=Handbook of Evolutionary Computation|last1=Deb|first1=Kalyanmoy|last2=Spears|first2=William M.|s2cid=3547258|publisher=Institute of Physics Publishing|year=1997|chapter=C6.2: Speciation methods}}</ref><ref>{{Cite book|title=Handbook of Natural Computing|last=Shir|first=Ofer M.|date=2012|publisher=Springer Berlin Heidelberg|isbn=9783540929093|editor-last=Rozenberg|editor-first=Grzegorz|pages=1035β1069|language=en|chapter=Niching in Evolutionary Algorithms|doi=10.1007/978-3-540-92910-9_32|editor-last2=BΓ€ck|editor-first2=Thomas|editor-last3=Kok|editor-first3=Joost N.}}</ref> ==== Termination ==== This generational process is repeated until a termination condition has been reached. Common terminating conditions are: * A solution is found that satisfies minimum criteria * Fixed number of generations reached * Allocated budget (computation time/money) reached * The highest ranking solution's fitness is reaching or has reached a plateau such that successive iterations no longer produce better results * Manual inspection * Combinations of the above
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)