== History ==

=== Early years ===

[[John Henry Holland]] was best known for his work popularizing [[genetic algorithm]]s (GA), through his ground-breaking book ''Adaptation in Natural and Artificial Systems''<ref>{{Cite book|title=Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence|last=Holland|first=John|publisher=Michigan Press|year=1975|isbn=9780262581110|url=https://books.google.com/books?id=5EgGaBkwvWcC}}</ref> in 1975 and his formalization of [[Holland's schema theorem]]. In 1976, Holland conceptualized an extension of the GA concept to what he called a "cognitive system",<ref>Holland JH (1976) Adaptation. In: Rosen R, Snell F (eds) Progress in theoretical biology, vol 4. Academic Press, New York, pp 263–293</ref> and provided the first detailed description of what would become known as the first learning classifier system in the paper "Cognitive Systems based on Adaptive Algorithms".<ref name=":2">Holland JH, Reitman JS (1978) Cognitive systems based on adaptive algorithms. Reprinted in: Evolutionary computation. The fossil record. In: David BF (ed) IEEE Press, New York 1998. {{ISBN|0-7803-3481-7}}</ref> This first system, named '''Cognitive System One (CS-1)''', was conceived as a modeling tool, designed to model a real system (i.e. an ''environment'') with unknown underlying dynamics using a population of human-readable rules. The goal was for a set of rules to perform [[online machine learning]] to adapt to the environment based on infrequent payoff/reward (i.e. reinforcement learning) and apply these rules to generate a behavior that matched the real system.
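The classic encoding for such human-readable rules, adopted throughout later Michigan-style systems, is a fixed-length condition over the ternary alphabet {0, 1, #}, where '#' is a wildcard ("don't care") symbol. The following minimal sketch illustrates rule matching and the match set under this representation (the rule strings and function names are illustrative, not taken from CS-1 itself):

```python
def matches(condition, state):
    """True if a classifier's condition matches a binary input state.

    Conditions use the classic ternary alphabet {'0', '1', '#'},
    where '#' is a wildcard that matches either input value.
    """
    return all(c == '#' or c == s for c, s in zip(condition, state))

# Rules are (condition, action) pairs; the match set [M] holds every
# rule in the population [P] whose condition matches the current state.
population = [("01##", 1), ("0#10", 0), ("11##", 1)]
state = "0110"
match_set = [rule for rule in population if matches(rule[0], state)]
print(match_set)  # -> [('01##', 1), ('0#10', 0)]
```

Wildcards are what allow a single rule to generalize over many distinct input states.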
This early, ambitious implementation was later regarded as overly complex, yielding inconsistent results.<ref name=":1" /><ref name=":3">{{Cite journal|last=Lanzi|first=Pier Luca|date=2008-02-08|title=Learning classifier systems: then and now|journal=Evolutionary Intelligence|language=en|volume=1|issue=1|pages=63–82|doi=10.1007/s12065-007-0003-3|s2cid=27153843|issn=1864-5909}}</ref> Beginning in 1980, [[Kenneth A De Jong|Kenneth de Jong]] and his student Stephen Smith took a different approach to rule-based machine learning with '''LS-1''', where learning was viewed as an offline optimization process rather than an online adaptation process.<ref>Smith S (1980) A learning system based on genetic adaptive algorithms. Ph.D. thesis, Department of Computer Science, University of Pittsburgh</ref><ref>Smith S (1983) [https://www.researchgate.net/profile/Stephen_Smith14/publication/220815785_Flexible_Learning_of_Problem_Solving_Heuristics_Through_Adaptive_Search/links/0deec52c18dbd0dd53000000.pdf Flexible learning of problem solving heuristics through adaptive search]. In: Eighth international joint conference on artificial intelligence. Morgan Kaufmann, Los Altos, pp 421–425</ref><ref>De Jong KA (1988) Learning with genetic algorithms: an overview. Mach Learn 3:121–138</ref> This new approach was more similar to a standard genetic algorithm but evolved independent sets of rules. Since that time, LCS methods inspired by the online learning framework introduced by Holland at the University of Michigan have been referred to as '''Michigan-style LCS''', and those inspired by Smith and De Jong at the University of Pittsburgh have been referred to as '''Pittsburgh-style LCS'''.<ref name=":1" /><ref name=":3" /> In 1986, Holland developed what would be considered the standard Michigan-style LCS for the next decade.<ref name=":4">[http://dl.acm.org/citation.cfm?id=216016 Holland, John H. "Escaping brittleness: the possibilities of general purpose learning algorithms applied to parallel rule-based systems." ''Machine Learning'' (1986): 593–623.]</ref> Other important concepts that emerged in the early days of LCS research included (1) the formalization of a ''bucket brigade algorithm'' (BBA) for credit assignment/learning,<ref>{{Cite book|last=Holland|first=John H.|date=1985-01-01|title=Properties of the Bucket Brigade|url=http://dl.acm.org/citation.cfm?id=645511.657087|journal=Proceedings of the 1st International Conference on Genetic Algorithms|location=Hillsdale, NJ, USA|publisher=L. Erlbaum Associates Inc.|pages=1–7|isbn=978-0805804263}}</ref> (2) selection of parent rules from a common 'environmental niche' (i.e. the ''match set'' [M]) rather than from the whole ''population'' [P],<ref>{{Cite thesis|last=Booker|first=L|title=Intelligent Behavior as an Adaptation to the Task Environment|date=1982-01-01|publisher=University of Michigan|url=http://www.citeulike.org/group/664/article/431772}}</ref> (3) ''covering'', first introduced as a ''create'' operator,<ref name=":5">Wilson, S. W. "[http://www.cs.sfu.ca/~vaughan/teaching/415/papers/wilson_animat.pdf Knowledge growth in an artificial animal]." Proceedings of the First International Conference on Genetic Algorithms and their Applications
(1985).</ref> (4) the formalization of an ''action set'' [A],<ref name=":5" /> (5) a simplified algorithm architecture,<ref name=":5" /> (6) ''strength-based fitness'',<ref name=":4" /> (7) consideration of single-step or supervised learning problems<ref>{{Cite journal|last=Wilson|first=Stewart W.|title=Classifier systems and the animat problem|journal=Machine Learning|language=en|volume=2|issue=3|pages=199–228|doi=10.1007/BF00058679|issn=0885-6125|year=1987|doi-access=free}}</ref> and the introduction of the ''correct set'' [C],<ref>{{Cite book|last1=Bonelli|first1=Pierre|last2=Parodi|first2=Alexandre|last3=Sen|first3=Sandip|last4=Wilson|first4=Stewart|date=1990-01-01|title=NEWBOOLE: A Fast GBML System|url=https://archive.org/details/machinelearningp0000inte/page/153|journal=Proceedings of the Seventh International Conference (1990) on Machine Learning|location=San Francisco, CA, USA|publisher=Morgan Kaufmann Publishers Inc.|pages=[https://archive.org/details/machinelearningp0000inte/page/153 153–159]|isbn=978-1558601413|url-access=registration}}</ref> (8) ''accuracy-based fitness'',<ref>{{Cite journal|last1=Frey|first1=Peter W.|last2=Slate|first2=David J.|title=Letter recognition using Holland-style adaptive classifiers|journal=Machine Learning|language=en|volume=6|issue=2|pages=161–182|doi=10.1007/BF00114162|issn=0885-6125|year=1991|doi-access=free}}</ref> (9) the combination of fuzzy logic with LCS<ref>Valenzuela-Rendón, Manuel. "[http://sci2s.ugr.es/sites/default/files/files/TematicWebSites/GeneticFuzzySystems/(1991)_Valenzuela-Rendon.pdf The Fuzzy Classifier System: A Classifier System for Continuously Varying Variables]." In ''ICGA'', pp. 346-353. 1991.</ref> (which later spawned a lineage of ''fuzzy LCS algorithms''), (10) encouraging ''long action chains'' and ''default hierarchies'' for improving performance on multi-step problems,<ref>{{Cite thesis|last=Riolo|first=Rick L.|title=Empirical Studies of Default Hierarchies and Sequences of Rules in Learning Classifier Systems|date=1988-01-01|publisher=University of Michigan|url=http://dl.acm.org/citation.cfm?id=914945|place=Ann Arbor, MI, USA}}</ref><ref>{{Cite journal|last=Riolo|first=R.L.|date=1987-01-01|title=Bucket brigade performance. I. Long sequences of classifiers|url=http://agris.fao.org/agris-search/search.do?recordID=US201301782174|journal=Genetic Algorithms and Their Applications: Proceedings of the Second International Conference on Genetic Algorithms: July 28–31, 1987 at the Massachusetts Institute of Technology, Cambridge, MA|language=en}}</ref><ref>{{Cite journal|last=Riolo|first=R.L.|date=1987-01-01|title=Bucket brigade performance. II. Default hierarchies|url=http://agris.fao.org/agris-search/search.do?recordID=US201301782175|journal=Genetic Algorithms and Their Applications: Proceedings of the Second International Conference on Genetic Algorithms: July 28–31, 1987 at the Massachusetts Institute of Technology, Cambridge, MA|language=en}}</ref> (11) examining [[latent learning]] (which later inspired a new branch of ''anticipatory classifier systems'' (ACS)<ref name=":7">W. Stolzmann, "Anticipatory classifier systems," in Proceedings of the 3rd Annual Genetic Programming Conference, pp. 658–664, 1998.</ref>), and (12) the introduction of the first [[Q-learning]]-like credit assignment technique.<ref>{{Cite book|last=Riolo|first=Rick L.|date=1990-01-01|title=Lookahead Planning and Latent Learning in a Classifier System|url=http://dl.acm.org/citation.cfm?id=116517.116553|journal=Proceedings of the First International Conference on Simulation of Adaptive Behavior on from Animals to Animats|location=Cambridge, MA, USA|publisher=MIT Press|pages=316–326|isbn=978-0262631389}}</ref> While not all of these concepts are applied in modern LCS algorithms, each was a landmark in the development of the LCS paradigm.

=== The revolution ===

Interest in learning classifier systems was reinvigorated in the mid-1990s largely due to two events: the development of the [[Q-learning|Q-Learning]] algorithm<ref>Watkins, Christopher John Cornish Hellaby. "Learning from delayed rewards." PhD diss., University of Cambridge, 1989.</ref> for [[reinforcement learning]], and the introduction of significantly simplified Michigan-style LCS architectures by Stewart Wilson.<ref name=":10">{{Cite journal|last=Wilson|first=Stewart W.|date=1995-06-01|title=Classifier Fitness Based on Accuracy|journal=Evol. Comput.|volume=3|issue=2|pages=149–175|doi=10.1162/evco.1995.3.2.149|issn=1063-6560|citeseerx=10.1.1.363.2210|s2cid=18341635}}</ref><ref name=":6">{{Cite journal|last=Wilson|first=Stewart W.|date=1994-03-01|title=ZCS: A Zeroth Level Classifier System|journal=Evolutionary Computation|volume=2|issue=1|pages=1–18|doi=10.1162/evco.1994.2.1.1|issn=1063-6560|citeseerx=10.1.1.363.798|s2cid=17680778}}</ref> Wilson's '''Zeroth-level Classifier System (ZCS)'''<ref name=":6" /> focused on increasing algorithmic understandability relative to Holland's standard LCS implementation.<ref name=":4" /> This was done, in part, by removing rule-bidding and the internal message list, both essential to the original BBA credit assignment, and replacing them with a hybrid BBA/[[Q-learning|Q-Learning]] strategy.
ZCS demonstrated that a much simpler LCS architecture could perform as well as the original, more complex implementations. However, ZCS still suffered from performance drawbacks, including the proliferation of over-general classifiers. In 1995, Wilson published his landmark paper, "Classifier fitness based on accuracy", in which he introduced the classifier system '''XCS'''.<ref name=":10" /> XCS took the simplified architecture of ZCS and added an accuracy-based fitness, a niche GA (acting in the action set [A]), an explicit generalization mechanism called ''subsumption'', and an adaptation of the [[Q-learning|Q-Learning]] credit assignment. XCS was popularized by its ability to reach optimal performance while evolving accurate and maximally general classifiers, as well as by its impressive problem flexibility (able to perform both [[reinforcement learning]] and [[supervised learning]]). XCS later became the best known and most studied LCS algorithm and defined a new family of ''accuracy-based LCS''. ZCS alternatively became synonymous with ''strength-based LCS''. XCS is also important because it successfully bridged the gap between LCS and the field of [[reinforcement learning]]. Following the success of XCS, LCS were later described as reinforcement learning systems endowed with a generalization capability.<ref>{{Cite journal|last=Lanzi|first=P. L.|title=Learning classifier systems from a reinforcement learning perspective|journal=Soft Computing|language=en|volume=6|issue=3–4|pages=162–170|doi=10.1007/s005000100113|issn=1432-7643|year=2002|s2cid=39103390}}</ref> [[Reinforcement learning]] typically seeks to learn a value function that maps out a complete representation of the state/action space. Similarly, the design of XCS drives it to form an all-inclusive and accurate representation of the problem space (i.e. a ''complete map'') rather than focusing on high-payoff niches in the environment (as was the case with strength-based LCS). Conceptually, a complete map captures not only what to do (what is correct) but also what not to do (what is incorrect). In contrast, most strength-based LCSs, and exclusively supervised-learning LCSs, seek a rule set of efficient generalizations in the form of a ''best action map'' (or a ''partial map''). Comparisons between strength- vs. accuracy-based fitness and complete vs. best action maps have since been examined in greater detail.<ref>Kovacs, Timothy Michael Douglas. ''A Comparison of Strength and Accuracy-based Fitness in Learning and Classifier Systems''. 2002.</ref><ref>{{cite book | chapter-url=https://link.springer.com/chapter/10.1007/3-540-48104-4_6 | doi=10.1007/3-540-48104-4_6 | chapter=Two Views of Classifier Systems | title=Advances in Learning Classifier Systems | series=Lecture Notes in Computer Science | date=2002 | last1=Kovacs | first1=Tim | volume=2321 | pages=74–87 | isbn=978-3-540-43793-2 }}</ref>

=== In the wake of XCS ===

XCS inspired the development of a whole new generation of LCS algorithms and applications. In 1995, Congdon was the first to apply LCS to real-world [[Epidemiology|epidemiological]] investigations of disease,<ref name=":8" /> followed closely by Holmes, who developed '''BOOLE++''',<ref>{{Cite journal|last=Holmes|first=John H.|date=1996-01-01|title=A Genetics-Based Machine Learning Approach to Knowledge Discovery in Clinical Data|journal=Proceedings of the AMIA Annual Fall Symposium|pages=883|issn=1091-8280|pmc=2233061}}</ref> '''EpiCS''',<ref>Holmes, John H. "[https://web.archive.org/web/20180820234915/https://pdfs.semanticscholar.org/71e4/eb6c630dee4b762e74b2970f6dc638a351ab.pdf Discovering Risk of Disease with a Learning Classifier System]." In ''ICGA'', pp. 426-433. 1997.</ref> and later '''EpiXCS'''<ref>Holmes, John H., and Jennifer A. Sager. "[https://link.springer.com/10.1007%2F11527770_60 Rule discovery in epidemiologic surveillance data using EpiXCS: an evolutionary computation approach]."
In ''Conference on Artificial Intelligence in Medicine in Europe'', pp. 444-452. Springer Berlin Heidelberg, 2005.</ref> for [[Epidemiology|epidemiological]] classification. These early works inspired later interest in applying LCS algorithms to complex and large-scale [[data mining]] tasks epitomized by [[bioinformatics]] applications. In 1998, Stolzmann introduced '''anticipatory classifier systems (ACS)''', which included rules in the form of 'condition-action-effect', rather than the classic 'condition-action' representation.<ref name=":7" /> ACS was designed to predict the perceptual consequences of an action in all possible situations in an environment. In other words, the system evolves a model that specifies not only what to do in a given situation, but also provides information about what will happen after a specific action is executed. This family of LCS algorithms is best suited to multi-step problems, planning, speeding up learning, or disambiguating perceptual aliasing (i.e. where the same observation is obtained in distinct states but requires different actions). Butz later pursued this anticipatory family of LCS, developing a number of improvements to the original method.<ref>Butz, Martin V. "[https://web.archive.org/web/20180820234943/https://pdfs.semanticscholar.org/3572/7a56fcce7a73ccc43e5bfa19389780e6d436.pdf Biasing exploration in an anticipatory learning classifier system]." In ''International Workshop on Learning Classifier Systems'', pp. 3-22.
Springer Berlin Heidelberg, 2001.</ref> In 2002, Wilson introduced '''XCSF''', adding a computed prediction in order to perform function approximation.<ref>{{Cite journal|last=Wilson|first=Stewart W.|title=Classifiers that approximate functions|journal=Natural Computing|language=en|volume=1|issue=2–3|pages=211–234|doi=10.1023/A:1016535925043|issn=1567-7818|year=2002|s2cid=23032802}}</ref> In 2003, Bernadó-Mansilla introduced a '''sUpervised Classifier System (UCS)''', which specialized the XCS algorithm to the task of [[supervised learning]], single-step problems, and forming a best action set. UCS removed the [[reinforcement learning]] strategy in favor of a simple, accuracy-based rule fitness, and also dropped the explore/exploit learning phases characteristic of many reinforcement learners. Bull introduced a simple accuracy-based LCS, '''YCS''',<ref>Bull, Larry. "[https://web.archive.org/web/20180820234941/https://pdfs.semanticscholar.org/120c/8f5057995c36ee60ec320c2263b20af05444.pdf A simple accuracy-based learning classifier system]." ''Learning Classifier Systems Group Technical Report UWELCSG03-005, University of the West of England, Bristol, UK'' (2003).</ref> and a simple strength-based LCS, the '''Minimal Classifier System (MCS)''',<ref>Bull, Larry. "[https://link.springer.com/chapter/10.1007/978-3-540-30217-9_104 A simple payoff-based learning classifier system]." In ''International Conference on Parallel Problem Solving from Nature'', pp. 1032-1041. Springer Berlin Heidelberg, 2004.</ref> in order to develop a better theoretical understanding of the LCS framework. Bacardit introduced '''GAssist'''<ref>Peñarroya, Jaume Bacardit. "Pittsburgh genetic-based machine learning in the data mining era: representations, generalization, and run-time."
PhD diss., Universitat Ramon Llull, 2004.</ref> and '''BioHEL''',<ref>{{Cite journal|last1=Bacardit|first1=Jaume|last2=Burke|first2=Edmund K.|last3=Krasnogor|first3=Natalio|date=2008-12-12|title=Improving the scalability of rule-based evolutionary learning|journal=Memetic Computing|language=en|volume=1|issue=1|pages=55–67|doi=10.1007/s12293-008-0005-4|s2cid=775199|issn=1865-9284}}</ref> Pittsburgh-style LCSs designed for [[data mining]] and [[scalability]] to large datasets in [[bioinformatics]] applications. In 2008, Drugowitsch published the book ''Design and Analysis of Learning Classifier Systems'', including some theoretical examination of LCS algorithms.<ref>{{Cite book|title=Design and Analysis of Learning Classifier Systems|volume = 139|doi=10.1007/978-3-540-79866-8|series = Studies in Computational Intelligence|year = 2008|isbn = 978-3-540-79865-1|last1 = Drugowitsch|first1 = Jan}}</ref> Butz introduced the first visualization of online rule learning within a [[Graphical user interface|GUI]] for XCSF<ref name=":9" /> (see the image at the top of this page). Urbanowicz extended the UCS framework and introduced '''ExSTraCS''', explicitly designed for [[supervised learning]] in noisy problem domains (e.g. epidemiology and bioinformatics).<ref>Urbanowicz, Ryan J., Gediminas Bertasius, and Jason H. Moore. "[http://www.seas.upenn.edu/~gberta/uploads/3/1/4/8/31486883/urbanowicz_2014_exstracs_algorithm.pdf An extended michigan-style learning classifier system for flexible supervised learning, classification, and data mining]." In ''International Conference on Parallel Problem Solving from Nature'', pp. 211-221. Springer International Publishing, 2014.</ref> ExSTraCS integrated (1) expert knowledge to drive covering and the genetic algorithm toward important features in the data,<ref>Urbanowicz, Ryan J., Delaney Granizo-Mackenzie, and Jason H. Moore. "[https://web.archive.org/web/20180820234834/https://pdfs.semanticscholar.org/b407/8f8bb6aa9e39e84b0b20874662a6ed8b7df1.pdf Using expert knowledge to guide covering and mutation in a michigan style learning classifier system to detect epistasis and heterogeneity]." In ''International Conference on Parallel Problem Solving from Nature'', pp. 266-275. Springer Berlin Heidelberg, 2012.</ref> (2) a form of long-term memory referred to as attribute tracking,<ref>{{Cite book|last1=Urbanowicz|first1=Ryan|last2=Granizo-Mackenzie|first2=Ambrose|last3=Moore|first3=Jason|title=Proceedings of the 14th annual conference on Genetic and evolutionary computation |chapter=Instance-linked attribute tracking and feedback for michigan-style supervised learning classifier systems |date=2012-01-01|series=GECCO '12|location=New York, NY, USA|publisher=ACM|pages=927–934|doi=10.1145/2330163.2330291|isbn=9781450311779|s2cid=142534}}</ref> allowing for more efficient learning and the characterization of heterogeneous data patterns, and (3) a flexible rule representation similar to Bacardit's mixed discrete-continuous attribute list representation.<ref>{{Cite book|last1=Bacardit|first1=Jaume|last2=Krasnogor|first2=Natalio|title=Proceedings of the 11th Annual conference on Genetic and evolutionary computation |chapter=A mixed discrete-continuous attribute list representation for large scale classification domains |date=2009-01-01|series=GECCO '09|location=New York, NY, USA|publisher=ACM|pages=1155–1162|doi=10.1145/1569901.1570057|isbn=9781605583259|citeseerx=10.1.1.158.7314|s2cid=10906515}}</ref> Both Bacardit and Urbanowicz explored statistical and visualization strategies to interpret LCS rules and perform knowledge discovery for data mining.<ref name=":11">{{Cite journal|last1=Urbanowicz|first1=R. J.|last2=Granizo-Mackenzie|first2=A.|last3=Moore|first3=J. H.|date=2012-11-01|title=An analysis pipeline with statistical and visualization-guided knowledge discovery for Michigan-style learning classifier systems|journal=IEEE Computational Intelligence Magazine|volume=7|issue=4|pages=35–45|doi=10.1109/MCI.2012.2215124|issn=1556-603X|pmc=4244006|pmid=25431544}}</ref><ref name=":12">{{cite journal | last1 = Bacardit | first1 = Jaume | last2 = Llorà | first2 = Xavier | year = 2013 | title = Large-scale data mining using genetics-based machine learning | journal = Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery | volume = 3 | issue = 1| pages = 37–61 | doi=10.1002/widm.1078| s2cid = 43062613 }}</ref> Browne and Iqbal explored the concept of reusing building blocks in the form of code fragments and were the first to solve the 135-bit multiplexer benchmark problem by first learning useful building blocks from simpler multiplexer problems.<ref>{{Cite journal|last1=Iqbal|first1=Muhammad|last2=Browne|first2=Will N.|last3=Zhang|first3=Mengjie|date=2014-08-01|title=Reusing Building Blocks of Extracted Knowledge to Solve Complex, Large-Scale Boolean Problems|journal=IEEE Transactions on Evolutionary Computation|pages=465–480|volume=18|issue=4|doi=10.1109/tevc.2013.2281537|s2cid=525358}}</ref> '''ExSTraCS 2.0''' was later introduced to improve Michigan-style LCS scalability, successfully solving the 135-bit multiplexer benchmark problem directly for the first time.<ref name=":0"/> The n-bit [[multiplexer]] problem is highly [[Epistasis|epistatic]] and [[Homogeneity and heterogeneity|heterogeneous]], making it a very challenging [[machine learning]] task.
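To make this benchmark concrete: in the ''n''-bit multiplexer, the first ''k'' bits encode an address that selects one of the remaining 2<sup>''k''</sup> data bits (so ''n'' = ''k'' + 2<sup>''k''</sup>; the 135-bit version uses 7 address bits and 128 data bits). No single bit is individually predictive of the output, which is what makes the problem epistatic. A minimal sketch of the target function (the function and variable names here are illustrative):

```python
def multiplexer(bits, k):
    """Target output of the n-bit multiplexer, where n = k + 2**k.

    The first k bits form a binary address selecting one of the
    2**k data bits that follow; the selected data bit is the answer.
    """
    assert len(bits) == k + 2 ** k
    address = int("".join(map(str, bits[:k])), 2)  # address bits as an integer
    return bits[k + address]

# 6-bit multiplexer (k = 2): address bits "01" select data bit 1.
print(multiplexer([0, 1, 1, 0, 1, 1], 2))  # data bits are [1, 0, 1, 1] -> 0
```

A classifier system solving this problem must discover rules that wildcard the data bits not selected by the address, which is why maximally general, accurate rules are essential at the 135-bit scale.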