==Optimality criteria==
Choosing an optimality criterion is difficult because a feature selection task involves multiple objectives. Many common criteria incorporate a measure of accuracy, penalised by the number of features selected. Examples include the [[Akaike information criterion]] (AIC) and [[Mallows's Cp|Mallows's ''C<sub>p</sub>'']], which have a penalty of 2 for each added feature. AIC is based on [[information theory]], and is effectively derived via the [[maximum entropy principle]].<ref>{{Citation | first=H. |last=Akaike |author-link=Hirotugu Akaike | contribution = Prediction and entropy | pages=1–24 | title= A Celebration of Statistics | editor1-first= A. C. | editor1-last= Atkinson | editor2-first= S. E. | editor2-last= Fienberg | editor2-link= Stephen Fienberg | year = 1985 | publisher= Springer|url=https://apps.dtic.mil/dtic/tr/fulltext/u2/a120956.pdf|archive-url=https://web.archive.org/web/20190830132141/https://apps.dtic.mil/dtic/tr/fulltext/u2/a120956.pdf|url-status=live|archive-date=August 30, 2019}}.</ref><ref>{{Citation |last1=Burnham |first1=K. P. |last2=Anderson |first2=D. R. |year=2002 |title=Model Selection and Multimodel Inference: A practical information-theoretic approach |edition=2nd |publisher= [[Springer-Verlag]] |url=https://books.google.com/books?id=fT1Iu-h6E-oC|isbn=9780387953649 }}.</ref> Other criteria are the [[Bayesian information criterion]] (BIC), which uses a penalty of <math>\sqrt{\log{n}}</math> for each added feature; [[minimum description length]] (MDL), which asymptotically uses <math>\sqrt{\log{n}}</math>; [[Bonferroni correction|Bonferroni]] / RIC, which use <math>\sqrt{2\log{p}}</math>; maximum dependency feature selection; and a variety of newer criteria motivated by the [[false discovery rate]] (FDR), which use something close to <math>\sqrt{2\log{\frac{p}{q}}}</math>. A maximum [[entropy rate]] criterion may also be used to select the most relevant subset of features.<ref>{{cite journal |last1=Einicke |first1=G. A. |title=Maximum-Entropy Rate Selection of Features for Classifying Changes in Knee and Ankle Dynamics During Running |journal=IEEE Journal of Biomedical and Health Informatics |volume=28 |issue=4 |pages=1097–1103 |year=2018 |doi= 10.1109/JBHI.2017.2711487 |pmid=29969403 |hdl=10810/68978 |s2cid=49555941 |hdl-access=free }}</ref>
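As an illustrative sketch (not drawn from the cited sources), the following Python code enumerates feature subsets for an ordinary least squares fit and scores each subset with the Gaussian-likelihood forms of AIC and BIC, in which the penalty grows with the number of fitted parameters. The synthetic data, the helper names, and the exact criterion forms are assumptions made for demonstration only.

<syntaxhighlight lang="python">
# Minimal sketch of penalised best-subset selection with AIC and BIC.
# Assumptions: OLS regression, Gaussian errors, so the fit term of both
# criteria reduces to n*log(RSS/n) up to additive constants.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 6
X = rng.normal(size=(n, p))
# In this toy data, only features 0 and 3 truly influence the response.
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.5, size=n)

def rss(X_sub, y):
    """Residual sum of squares of an OLS fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X_sub])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def aic(rss_val, n, k):
    # Fit term plus a penalty of 2 per fitted parameter.
    return n * np.log(rss_val / n) + 2 * k

def bic(rss_val, n, k):
    # Same fit term, but the per-parameter penalty grows with log(n).
    return n * np.log(rss_val / n) + k * np.log(n)

best = {"AIC": (np.inf, None), "BIC": (np.inf, None)}
for size in range(p + 1):
    for subset in itertools.combinations(range(p), size):
        r = rss(X[:, subset], y)
        k = size + 1  # +1 for the intercept
        for name, crit in (("AIC", aic), ("BIC", bic)):
            score = crit(r, n, k)
            if score < best[name][0]:
                best[name] = (score, subset)

for name, (score, subset) in best.items():
    print(f"{name}: best subset {subset}, score {score:.1f}")
</syntaxhighlight>

Because BIC's penalty exceeds AIC's once <math>\log{n} > 2</math>, BIC tends to select the same or a smaller subset than AIC on the same data; exhaustive enumeration as above is only feasible for small ''p''.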