== Classics ==
Several classic data sets have been used extensively in the [[statistical]] literature:
* [[Iris flower data set]] – Multivariate data set introduced by [[Ronald Fisher]] (1936).<ref name="fisher36">{{cite journal|author=Fisher, R.A.|title=The Use of Multiple Measurements in Taxonomic Problems|journal=[[Annals of Eugenics]]|volume=7|pages=179–188|year=1936|issue=2|url=http://digital.library.adelaide.edu.au/coll/special//fisher/138.pdf|doi=10.1111/j.1469-1809.1936.tb02137.x|hdl=2440/15227|hdl-access=free|access-date=2007-05-22|archive-date=2011-09-28|archive-url=https://web.archive.org/web/20110928044802/http://digital.library.adelaide.edu.au/coll/special//fisher/138.pdf|url-status=dead}}</ref> [https://archive.ics.uci.edu/ml/datasets/Iris Provided online by University of California-Irvine Machine Learning Repository].<ref>{{cite web |url=https://archive.ics.uci.edu/ml/datasets/Iris |title=UCI Machine Learning Repository: Iris Data Set |access-date=2023-05-02 |url-status=live |archive-url=https://web.archive.org/web/20230426065109/https://archive.ics.uci.edu/ml/datasets/Iris |archive-date=2023-04-26}}</ref>
* [[MNIST database]] – Images of handwritten digits commonly used to test classification, clustering, and [[Digital image processing|image processing]] algorithms.
* ''[[Categorical data analysis]]'' – Data sets used in the book ''An Introduction to Categorical Data Analysis'', [https://stats.oarc.ucla.edu/other/examples/icda/ provided online] by UCLA Advanced Research Computing.<ref>{{cite web |url=https://stats.oarc.ucla.edu/other/examples/icda/ |title=Textbook Examples An Introduction to Categorical Data Analysis by Alan Agresti |access-date=2023-05-02 |url-status=live |archive-url=https://web.archive.org/web/20230131013107/https://stats.oarc.ucla.edu/other/examples/icda/ |archive-date=2023-01-31}}</ref>
* ''[[Robust statistics]]'' – Data sets used in ''[[Robust Regression and Outlier Detection]]'' ([[Peter Rousseeuw|Rousseeuw]] and Leroy, 1987). [https://web.archive.org/web/20050207032959/http://www.uni-koeln.de/themen/statistik/data/rousseeuw/ Provided online] at the University of Cologne.<ref>{{cite web |url=http://www.uni-koeln.de/themen/statistik/data/rousseeuw/ |title=The ROUSSEEUW datasets |url-status=dead |archive-url=https://web.archive.org/web/20050207032959/http://www.uni-koeln.de/themen/statistik/data/rousseeuw/ |archive-date=2005-02-07}}</ref>
* ''[[Time series]]'' – Data used in Chatfield's book ''The Analysis of Time Series'' are [https://web.archive.org/web/20110102201323/http://lib.stat.cmu.edu/modules.php?op=modload&name=PostWrap&file=index&page=datasets/ provided online] by StatLib.<ref>{{cite web |url=http://lib.stat.cmu.edu/modules.php?op=modload&name=PostWrap&file=index&page=datasets/ |title=StatLib :: Data, Software and News from the Statistics Community |url-status=dead |archive-url=https://web.archive.org/web/20110102201323/http://lib.stat.cmu.edu/modules.php?op=modload&name=PostWrap&file=index&page=datasets/ |archive-date=2011-01-02}}</ref>
* ''Extreme values'' – Data used in the book ''An Introduction to the Statistical Modeling of Extreme Values'' are available as [https://web.archive.org/web/20060910161517/http://homes.stat.unipd.it/coles/public_html/ismev/ismev.dat a snapshot of the data as provided online by Stuart Coles], the book's author.
* ''Bayesian Data Analysis'' – Data used in the book are [http://www.stat.columbia.edu/~gelman/book/data/ provided online] ([https://web.archive.org/web/20230122121643/http://www.stat.columbia.edu/~gelman/book/data/ archive link]) by [[Andrew Gelman]], one of the book's authors.
* The [https://web.archive.org/web/20171023174701/http://ftp.ics.uci.edu:80/pub/machine-learning-databases/liver-disorders/ Bupa liver data] – Used in several papers in the [[machine learning]] (data mining) literature.
* [[Anscombe's quartet]] – Small data set illustrating the importance of graphing the data to avoid statistical fallacies.
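
The point made by Anscombe's quartet can be checked numerically. The following is a minimal sketch rather than part of any cited source: it hard-codes the four published x/y series and computes the mean of x, the mean of y, and the Pearson correlation for each. The names <code>QUARTET</code>, <code>pearson</code>, <code>summarize</code> and <code>SUMMARIES</code> are illustrative choices made for this example.

```python
from math import sqrt
from statistics import mean

# Anscombe's quartet (Anscombe, 1973). Sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Return (mean of x, mean of y, Pearson r) for one data set."""
    return mean(xs), mean(ys), pearson(xs, ys)


# One summary tuple per data set, keyed by its Roman-numeral label.
SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, (mx, my, r) in SUMMARIES.items():
        print(f"{name:>3}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

The four summaries agree to about two decimal places even though scatter plots of the four sets look very different, which is why the quartet is used to argue for plotting data before relying on summary statistics.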