Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Machine learning
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
=== Data mining=== Machine learning and [[data mining]] often employ the same methods and overlap significantly, but while machine learning focuses on prediction, based on ''known'' properties learned from the training data, data mining focuses on the [[discovery (observation)|discovery]] of (previously) ''unknown'' properties in the data (this is the analysis step of [[knowledge discovery]] in databases). Data mining uses many machine learning methods, but with different goals; on the other hand, machine learning also employs data mining methods as "[[unsupervised learning]]" or as a preprocessing step to improve learner accuracy. Much of the confusion between these two research communities (which do often have separate conferences and separate journals, [[ECML PKDD]] being a major exception) comes from the basic assumptions they work with: in machine learning, performance is usually evaluated with respect to the ability to ''reproduce known'' knowledge, while in knowledge discovery and data mining (KDD) the key task is the discovery of previously ''unknown'' knowledge. Evaluated with respect to known knowledge, an uninformed (unsupervised) method will easily be outperformed by other supervised methods, while in a typical KDD task, supervised methods cannot be used due to the unavailability of training data. Machine learning also has intimate ties to [[optimisation]]: Many learning problems are formulated as minimisation of some [[loss function]] on a training set of examples. Loss functions express the discrepancy between the predictions of the model being trained and the actual problem instances (for example, in classification, one wants to assign a [[Labeled data|label]] to instances, and models are trained to correctly predict the preassigned labels of a set of examples).<ref>{{cite encyclopedia |last1=Le Roux |first1=Nicolas |first2=Yoshua |last2=Bengio |first3=Andrew |last3=Fitzgibbon |title=Improving First and Second-Order Methods by Modeling Uncertainty |encyclopedia=Optimization for Machine Learning |year=2012 |page=404 |editor1-last=Sra |editor1-first=Suvrit |editor2-first=Sebastian |editor2-last=Nowozin |editor3-first=Stephen J. |editor3-last=Wright |publisher=MIT Press |url=https://books.google.com/books?id=JPQx7s2L1A8C&q=%22Improving+First+and+Second-Order+Methods+by+Modeling+Uncertainty&pg=PA403 |isbn=9780262016469 |access-date=12 November 2020 |archive-date=17 January 2023 |archive-url=https://web.archive.org/web/20230117053335/https://books.google.com/books?id=JPQx7s2L1A8C&q=%22Improving+First+and+Second-Order+Methods+by+Modeling+Uncertainty&pg=PA403 |url-status=live }}</ref>
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)