{{Short description|Study of collection and analysis of data}} {{hatnote group| {{about|the study of data|a value derived from a sample|Statistic}} {{Other uses|Statistics (disambiguation)}} }} [[File:Standard Normal Distribution.png|thumb|upright=1.3|right|The [[normal distribution]], a very common [[Probability density function|probability density]], is used extensively in [[inferential statistics]].]] [[File:Iris Pairs Plot.svg|thumb|upright=1.3|right|[[Scatter plot]]s and [[line chart]]s are used in [[descriptive statistics]] to show the observed relationships between different variables, here using the [[Iris flower data set]].]] {{StatsTopicTOC}} {{Math topics TOC}} <!--PLEASE DO NOT EDIT THE OPENING SENTENCE WITHOUT FIRST PROPOSING YOUR CHANGE AT THE TALK PAGE.--> '''Statistics''' (from [[German language|German]]: ''{{linktext|lang=de|Statistik}}'', {{Abbr|orig.|originally}} "description of a [[State (polity)|state]], a country"<ref name=":1">{{multiref | {{Cite OED|statistics}} | {{Cite encyclopedia |encyclopedia=Digitales Wörterbuch der deutschen Sprache |url=https://www.dwds.de/?q=Statistik |title=Statistik | date=August 2024 |publisher=Berlin-Brandenburgischen Akademie der Wissenschaften |lang=de}} }}</ref>) is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of [[data]].<ref name="def">{{multiref | {{Cite encyclopedia |encyclopedia=Oxford Reference |title=Statistics |year=2008 |publisher=Oxford University Press |isbn=978-0-19-954145-4 |url=https://www.oxfordreference.com/view/10.1093/acref/9780199541454.001.0001/acref-9780199541454-e-1566?rskey=nxhBLl&result=1979}} | {{Cite encyclopedia |first=Jan-Willem |last=Romijn |year=2014 |title=Philosophy of statistics |encyclopedia=Stanford Encyclopedia of Philosophy |url=http://plato.stanford.edu/entries/statistics/}} | {{Cite dictionary | dictionary=Cambridge Dictionary |title=Statistics |url=https://dictionary.cambridge.org/dictionary/english/statistics}} 
}}</ref>

In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a [[statistical population]] or a [[statistical model]] to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of [[statistical survey|surveys]] and [[experimental design|experiments]].<ref name="Dodge">{{Cite book |last=Dodge |first=Yadolah |title=The Oxford Dictionary of Statistical Terms |publisher=Oxford University Press |year=2003 |isbn=0-19-920613-9}}</ref>

When [[census]] data (comprising every member of the target population) cannot be collected, [[statistician]]s collect data by developing specific experiment designs and survey [[sample (statistics)|samples]]. Representative sampling ensures that inferences and conclusions can reasonably extend from the sample to the population as a whole. An [[experimental study]] involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine whether the manipulation has modified the values of the measurements. In contrast, an [[observational study]] does not involve experimental manipulation.

Two main statistical methods are used in [[data analysis]]: [[descriptive statistics]], which summarize data from a sample using [[Index (statistics)|indexes]] such as the [[mean]] or [[standard deviation]], and [[statistical inference|inferential statistics]], which draw conclusions from data that are subject to random variation (e.g., observational errors, sampling variation).<ref name=LundResearchLtd>{{cite web |last=Lund Research Ltd. 
|url=https://statistics.laerd.com/statistical-guides/descriptive-inferential-statistics.php |title=Descriptive and Inferential Statistics |publisher=statistics.laerd.com |access-date=2014-03-23 |archive-date=2020-10-26 |archive-url=https://web.archive.org/web/20201026075549/https://statistics.laerd.com/statistical-guides/descriptive-inferential-statistics.php |url-status=live }}</ref> Descriptive statistics are most often concerned with two sets of properties of a ''distribution'' (sample or population): ''[[central tendency]]'' (or ''location'') seeks to characterize the distribution's central or typical value, while ''[[statistical dispersion|dispersion]]'' (or ''variability'') characterizes the extent to which members of the distribution depart from its center and each other. Inferences made using [[mathematical statistics]] employ the framework of [[probability theory]], which deals with the analysis of random phenomena.

A standard statistical procedure involves the collection of data leading to a [[statistical hypothesis testing|test of the relationship]] between two statistical data sets, or a data set and synthetic data drawn from an idealized model. A hypothesis is proposed for the statistical relationship between the two data sets, as an [[alternative hypothesis|alternative]] to an idealized [[null hypothesis]] of no relationship between the two data sets. The null hypothesis is rejected using statistical tests that quantify how strongly the data used in the test contradict it. Working from a null hypothesis, two basic forms of error are recognized: [[Type I error]]s (the null hypothesis is rejected when it is in fact true, giving a "false positive") and [[Type II error]]s (the null hypothesis fails to be rejected when it is in fact false, giving a "false negative"). 
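
The two error types can be illustrated with a small simulation. The following is a minimal sketch rather than a standard reference implementation: it assumes normally distributed data with a known standard deviation, uses a two-sided z-test at a 5% significance level, and the function names, sample size, and effect size are chosen only for illustration.

```python
import random
from statistics import NormalDist, mean


def z_test_rejects(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0, with known sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=30, trials=4000, seed=0):
    """Fraction of simulated samples for which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        rejections += z_test_rejects(sample, mu0, sigma)
    return rejections / trials


# H0 is true here, so every rejection is a false positive (Type I error).
type_i_rate = rejection_rate(true_mean=0.0)

# H0 is false here, so every failure to reject is a false negative (Type II error).
type_ii_rate = 1 - rejection_rate(true_mean=0.5)

print(f"estimated Type I rate:  {type_i_rate:.3f}")
print(f"estimated Type II rate: {type_ii_rate:.3f}")
```

Under these assumptions the estimated Type I rate should be close to the chosen significance level, while the Type II rate depends on the sample size and on how far the true mean lies from the hypothesized one.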
Multiple problems have come to be associated with this framework, ranging from obtaining a sufficient sample size to specifying an adequate null hypothesis.<ref name="LundResearchLtd" />

Statistical measurement processes are also prone to error with regard to the data that they generate. Many of these errors are classified as random (noise) or systematic ([[Bias (statistics)|bias]]), but other types of errors (e.g., blunders, such as when an analyst reports incorrect units) can also occur. The presence of [[missing data]] or [[censoring (statistics)|censoring]] may result in biased estimates, and specific techniques have been developed to address these problems.

{{TOC limit|3}}
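
The difference between random and systematic error can be sketched numerically. The example below is illustrative only: the true value, the size of the bias, the noise level, and the helper name <code>mean_error</code> are all assumptions made for the demonstration. It simulates repeated measurements and reports how far the sample mean lies from the true value.

```python
import random
from statistics import mean

TRUE_VALUE = 10.0  # hypothetical quantity being measured


def mean_error(bias, n, noise_sd=2.0, seed=1):
    """Error of the sample mean for n measurements with a fixed bias and Gaussian noise."""
    rng = random.Random(seed)
    readings = [TRUE_VALUE + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return mean(readings) - TRUE_VALUE


for n in (10, 1_000, 100_000):
    noise_only = mean_error(bias=0.0, n=n)
    with_bias = mean_error(bias=0.5, n=n)
    print(f"n={n:>6}  noise only: {noise_only:+.3f}  noise plus bias: {with_bias:+.3f}")
```

As the number of measurements grows, the error caused by noise alone tends to shrink toward zero, whereas the error of the biased measurements settles near the bias itself. Averaging more data therefore does not remove a systematic error.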