Statistical classification
{{Short description|Categorization of data using statistics}}

When [[classification]] is performed by a computer, statistical methods are normally used to develop the algorithm.

Often, the individual observations are analyzed into a set of quantifiable properties, known variously as [[explanatory variables]] or ''features''. These properties may variously be [[categorical data|categorical]] (e.g. "A", "B", "AB" or "O", for [[blood type]]), [[ordinal data|ordinal]] (e.g. "large", "medium" or "small"), [[integer|integer-valued]] (e.g. the number of occurrences of a particular word in an [[email]]) or [[real number|real-valued]] (e.g. a measurement of [[blood pressure]]). Other classifiers work by comparing observations to previous observations by means of a [[similarity function|similarity]] or [[metric (mathematics)|distance]] function.

An [[algorithm]] that implements classification, especially in a concrete implementation, is known as a '''classifier'''. The term "classifier" sometimes also refers to the mathematical [[function (mathematics)|function]], implemented by a classification algorithm, that maps input data to a category.

Terminology across fields is quite varied. In [[statistics]], where classification is often done with [[logistic regression]] or a similar procedure, the properties of observations are termed [[explanatory variable]]s (or [[independent variable]]s, regressors, etc.), and the categories to be predicted are known as outcomes, which are considered to be possible values of the [[dependent variable]]. In [[machine learning]], the observations are often known as ''instances'', the explanatory variables are termed ''features'' (grouped into a [[feature vector]]), and the possible categories to be predicted are ''classes''. Other fields may use different terminology: e.g. in [[community ecology]], the term "classification" normally refers to [[cluster analysis]].
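
The ideas above can be illustrated with a minimal sketch, which is not a reference implementation. It encodes one observation with a categorical, an ordinal, an integer-valued and a real-valued feature into a feature vector, then classifies a new instance by its distance to previous observations (a 1-nearest-neighbour rule). The function names <code>encode</code>, <code>euclidean</code> and <code>classify</code>, the class labels, and the training values are all hypothetical and chosen only for illustration. The features are not rescaled, so the real-valued feature dominates the distance; a practical classifier would normally normalize them first.

```python
import math

# Assumed encodings, for illustration only.
BLOOD_TYPES = ["A", "B", "AB", "O"]             # categorical: no natural order
SIZES = {"small": 0, "medium": 1, "large": 2}   # ordinal: order is meaningful


def encode(blood_type, size, word_count, blood_pressure):
    """Turn one observation into a numeric feature vector."""
    # One-hot encode the categorical feature so no order is implied.
    one_hot = [1.0 if blood_type == t else 0.0 for t in BLOOD_TYPES]
    return one_hot + [float(SIZES[size]), float(word_count), float(blood_pressure)]


def euclidean(u, v):
    """Distance function used to compare two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def classify(x, training):
    """1-nearest-neighbour: return the class of the closest training instance."""
    _, label = min(training, key=lambda pair: euclidean(pair[0], x))
    return label


# Hypothetical training instances: (feature vector, class).
training = [
    (encode("A", "small", 0, 118.0), "low risk"),
    (encode("O", "large", 3, 152.5), "high risk"),
]

print(classify(encode("A", "medium", 1, 121.0), training))
```

Here <code>classify</code> plays the role of the classifier in the second sense described above: a function that maps input data to a category.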