Data mining
==Background==
The manual extraction of patterns from [[data]] has occurred for centuries. Early methods of identifying patterns in data include [[Bayes' theorem]] (1700s) and [[regression analysis]] (1800s).<ref>{{Cite journal|last=Coenen|first=Frans|date=2011-02-07|title=Data mining: past, present and future|url=https://www.cambridge.org/core/product/identifier/S0269888910000378/type/journal_article|journal=The Knowledge Engineering Review|language=en|volume=26|issue=1|pages=25–29|doi=10.1017/S0269888910000378|s2cid=6487637|issn=0269-8889|access-date=2021-09-04|archive-date=2023-07-02|archive-url=https://web.archive.org/web/20230702140030/https://www.cambridge.org/core/journals/knowledge-engineering-review/article/abs/data-mining-past-present-and-future/EE2E494D98BCE76EBE3FE07897540C43|url-status=live}}</ref> The proliferation, ubiquity and increasing power of computer technology have dramatically increased the ability to collect, store, and manipulate data. As [[data set]]s have grown in size and complexity, direct "hands-on" data analysis has increasingly been augmented with indirect, automated data processing, aided by other discoveries in computer science, especially in the field of machine learning, such as [[Artificial neural network|neural networks]], [[cluster analysis]], [[genetic algorithms]] (1950s), [[decision tree learning|decision trees]] and [[decision rules]] (1960s), and [[support vector machines]] (1990s). Data mining is the process of applying these methods with the intention of uncovering hidden patterns in large data sets.<ref name="Kantardzic">{{cite book |last=Kantardzic |first=Mehmed |title=Data Mining: Concepts, Models, Methods, and Algorithms |year=2003 |publisher=John Wiley & Sons |isbn=978-0-471-22852-3 |oclc=50055336 |url-access=registration |url=https://archive.org/details/dataminingconcep0000kant }}</ref>
It bridges the gap from [[applied statistics]] and artificial intelligence (which usually provide the mathematical background) to [[database management]] by exploiting the way data is stored and indexed in databases to execute the actual learning and discovery algorithms more efficiently, allowing such methods to be applied to ever-larger data sets.