===Main principles===
Feature selection methods are typically presented in three classes, based on how they combine the selection algorithm with the model building.

====Filter method====
[[File:Filter Methode.png|thumb|300px|Filter method for feature selection]]
Filter methods select variables independently of any model, relying only on general characteristics of the data, such as each variable's correlation with the variable to be predicted. Filter methods discard the least interesting variables; the remaining variables then serve as input to a classification or regression model used to classify or predict data. These methods are computationally efficient and robust to overfitting.<ref name="ReferenceA">{{cite thesis |first=Julie |last=Hamon |title=Optimisation combinatoire pour la sélection de variables en régression en grande dimension: Application en génétique animale |url=https://tel.archives-ouvertes.fr/tel-00920205 |date=November 2013 |publisher=[[Lille University of Science and Technology]] |language=fr }}</ref> Filter methods tend to select redundant variables when they do not account for the relationships between variables. More elaborate filter methods mitigate this problem by removing variables that are highly correlated with each other, as in the Fast Correlation Based Filter (FCBF) algorithm.<ref>{{Cite journal |first1=Lei |last1=Yu |first2=Huan |last2=Liu |title=Feature selection for high-dimensional data: a fast correlation-based filter solution |journal=ICML'03: Proceedings of the Twentieth International Conference on International Conference on Machine Learning |date=August 2003 |pages=856–863 |url=https://www.aaai.org/Papers/ICML/2003/ICML03-111.pdf }}</ref>

====Wrapper method====
[[File:Feature selection Wrapper Method.png|thumb|300px|Wrapper method for feature selection]]
Wrapper methods evaluate subsets of variables, which, unlike filter approaches, makes it possible to detect interactions among variables.<ref name="M. Phuong, Z pages 301-309">T. M. Phuong, Z. Lin and R. B. Altman. [http://htsnp.stanford.edu/FSFS/TaggingSNP.pdf Choosing SNPs using feature selection.] {{Webarchive|url=https://web.archive.org/web/20160913211229/http://htsnp.stanford.edu/FSFS/TaggingSNP.pdf |date=2016-09-13 }} Proceedings / IEEE Computational Systems Bioinformatics Conference, CSB. IEEE Computational Systems Bioinformatics Conference, pages 301-309, 2005. {{PMID|16447987}}.</ref> The two main disadvantages of these methods are:
* An increased risk of overfitting when the number of observations is insufficient.
* A significant computation time when the number of variables is large.

====Embedded method====
[[File:Feature selection Embedded Method.png|thumb|300px|Embedded method for feature selection]]
Embedded methods, proposed more recently, try to combine the advantages of both previous approaches: the learning algorithm uses its own variable selection process and thus performs feature selection and classification simultaneously, as in the FRMT algorithm.<ref>{{cite journal |last1=Saghapour |first1=E. |last2=Kermani |first2=S. |last3=Sehhati |first3=M. |year=2017 |title=A novel feature ranking method for prediction of cancer stages using proteomics data |journal=[[PLOS ONE]] |volume=12 |issue=9 |pages=e0184203 |doi=10.1371/journal.pone.0184203 |pmid=28934234 |pmc=5608217 |bibcode=2017PLoSO..1284203S |doi-access=free }}</ref>
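To make the filter class concrete, the following is a minimal sketch (not taken from the cited sources) of the simplest kind of filter method: each feature is scored by the absolute value of its Pearson correlation with the target, independently of any model, and only the top-ranked features are kept. The function names, toy data, and choice of score are illustrative assumptions, not part of any of the algorithms named above.

```python
# Illustrative sketch of a univariate filter method: score each feature by
# |Pearson correlation with the target| and keep the k best-ranked features.
# All names and data here are invented for the example.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def filter_select(features, target, k):
    """Rank feature columns by |correlation with target|; return the best k names."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy data: x1 tracks the target closely, x2 is mostly noise,
# and x3 is strongly anti-correlated with the target.
features = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.0, 1.0, 4.0, 1.0, 3.0],
    "x3": [5.0, 4.0, 3.0, 2.0, 1.0],
}
target = [1.1, 2.0, 2.9, 4.2, 5.0]

print(filter_select(features, target, 2))
```

Because the score is computed per feature, this selector cannot see interactions between variables, and it would happily keep both x1 and x3 even though they are perfectly redundant with each other; that is exactly the weakness that methods such as FCBF, and the wrapper and embedded classes, are designed to address.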