Categorical variable
{{Short description|Variable capable of taking on a limited number of possible values}}
{{More footnotes needed|date=July 2024}}

In [[statistics]], a '''categorical variable''' (also called a '''qualitative variable''') is a [[variable (research)|variable]] that can take on one of a limited, and usually fixed, number of possible values, assigning each individual or other unit of observation to a particular group or [[nominal category]] on the basis of some [[qualitative property]].<ref name="yates">{{cite book | last1 = Yates | first1 = Daniel S. | last2 = Moore | first2 = David S. | last3 = Starnes | first3 = Daren S. | year = 2003 | title = The Practice of Statistics | edition = 2nd | publisher = [[W. H. Freeman and Company|Freeman]] | location = New York | url = http://bcs.whfreeman.com/yates2e/ | isbn = 978-0-7167-4773-4 | access-date = 2014-09-28 | archive-url = https://web.archive.org/web/20050209001108/http://bcs.whfreeman.com/yates2e/ | archive-date = 2005-02-09 | url-status = dead }}</ref> In computer science and some branches of mathematics, categorical variables are referred to as [[enumerations]] or [[enumerated types]]. Commonly (though not in this article), each of the possible values of a categorical variable is referred to as a '''level'''. The [[probability distribution]] associated with a [[random variable|random]] categorical variable is called a [[categorical distribution]].

'''Categorical data''' is the [[statistical data type]] consisting of categorical variables or of data that has been converted into that form, for example as [[grouped data]]. More specifically, categorical data may derive from observations made of [[qualitative data]] that are summarised as counts or [[cross tabulation]]s, or from observations of [[quantitative data]] grouped within given intervals. Often, purely categorical data are summarised in the form of a [[contingency table]].
However, particularly when considering data analysis, it is common to use the term "categorical data" to apply to data sets that, while containing some categorical variables, may also contain non-categorical variables. [[Ordinal data|Ordinal variables]] have a meaningful ordering, while [[Nominal variable|nominal variables]] have no meaningful ordering.

A categorical variable that can take on exactly two values is termed a ''[[binary variable]]'' or a '''dichotomous variable'''; an important special case is the [[Bernoulli variable]]. Categorical variables with more than two possible values are called '''polytomous variables'''; categorical variables are often assumed to be polytomous unless otherwise specified.

[[Discretization]] is the treatment of [[continuous function|continuous data]] as if they were categorical. [[Dichotomization]] is the treatment of continuous data or polytomous variables as if they were binary variables. [[Regression analysis]] often represents category membership with one or more quantitative [[dummy variable (statistics)|dummy variables]].
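
The three operations just described (discretization, dichotomization, and dummy coding) can be illustrated with a minimal Python sketch. The function names, the age cut-offs, and the category labels below are hypothetical and chosen only for illustration. The dummy coding shown uses one common convention, in which a variable with ''k'' categories is represented by ''k'' − 1 indicator values and the first category serves as the reference.

```python
from bisect import bisect_right


def discretize(value, edges, labels):
    """Map a continuous value to a category label.

    `edges` must be sorted ascending and len(labels) == len(edges) + 1.
    A value equal to an edge falls into the interval starting at that edge.
    """
    return labels[bisect_right(edges, value)]


def dichotomize(value, threshold):
    """Reduce a continuous value to a binary (0/1) variable."""
    return 1 if value >= threshold else 0


def dummy_code(value, categories):
    """Encode a category as k-1 dummy variables.

    The first entry of `categories` is the reference category and is
    encoded as all zeros (an illustrative convention, not the only one).
    """
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories[1:]]


# Hypothetical data: ages in years, with illustrative cut-offs at 18 and 65.
ages = [15.0, 34.5, 70.2]
groups = ["minor", "adult", "senior"]

age_groups = [discretize(a, [18, 65], groups) for a in ages]
is_adult = [dichotomize(a, 18) for a in ages]
dummies = [dummy_code(g, groups) for g in age_groups]

print(age_groups)  # ['minor', 'adult', 'senior']
print(is_adult)    # [0, 1, 1]
print(dummies)     # [[0, 0], [1, 0], [0, 1]]
```

Dropping one category avoids redundancy among the indicators, since membership in the reference category is implied when all the others are zero. Encodings that keep all ''k'' indicators are also used in practice.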