==Regularization==
[[File:Overfitting on Training Set Data.pdf|thumb|An example of overfitting in machine learning. The red dots represent training set data. The green line represents the true functional relationship, while the blue line shows the learned function, which has been overfitted to the training set data.]]

A major problem that arises in machine learning is [[overfitting]]. Because learning is a prediction problem, the goal is not to find a function that most closely fits the (previously observed) data, but to find one that will most accurately predict output from future input. [[Empirical risk minimization]] runs this risk of overfitting: finding a function that matches the data exactly but does not predict future output well.

Overfitting is symptomatic of unstable solutions; a small perturbation in the training set data would cause a large variation in the learned function. It can be shown that if the stability of the solution can be guaranteed, generalization and consistency are guaranteed as well.<ref>Vapnik, V.N. and Chervonenkis, A.Y. 1971. [http://ai2-s2-pdfs.s3.amazonaws.com/a36b/028d024bf358c4af1a5e1dc3ca0aed23b553.pdf On the uniform convergence of relative frequencies of events to their probabilities]. ''Theory of Probability and Its Applications'' Vol 16, pp 264-280.</ref><ref>Mukherjee, S., Niyogi, P., Poggio, T., and Rifkin, R. 2006. [https://link.springer.com/article/10.1007/s10444-004-7634-z Learning theory: stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization]. ''Advances in Computational Mathematics''. Vol 25, pp 161-193.</ref> [[Regularization (mathematics)|Regularization]] can solve the overfitting problem and give the problem stability.

Regularization can be accomplished by restricting the hypothesis space <math>\mathcal{H}</math>. A common example is restricting <math>\mathcal{H}</math> to linear functions: this can be seen as a reduction to the standard problem of [[linear regression]]. <math>\mathcal{H}</math> could also be restricted to polynomials of degree <math>p</math>, exponentials, or bounded functions on [[Lp space|L1]]. Restricting the hypothesis space avoids overfitting because the form of the potential functions is limited, and so does not allow for the choice of a function that gives empirical risk arbitrarily close to zero.

One example of regularization is [[Tikhonov regularization]], which consists of minimizing
<math display="block">\frac{1}{n} \sum_{i=1}^n V(f(\mathbf{x}_i),y_i) + \gamma \left\|f\right\|_{\mathcal{H}}^2</math>
where <math>\gamma</math> is a fixed, positive parameter called the regularization parameter. Tikhonov regularization ensures existence, uniqueness, and stability of the solution.<ref>Tomaso Poggio, Lorenzo Rosasco, et al. ''Statistical Learning Theory and Applications'', 2012, [https://www.mit.edu/~9.520/spring12/slides/class02/class02.pdf Class 2]</ref>

{{clear}}
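To make the minimization above concrete, the following is a minimal sketch of Tikhonov regularization for a linear hypothesis space with the square loss, i.e. ridge regression; in that setting the minimizer has a closed form. The function name <code>fit_ridge</code> and the synthetic data are illustrative assumptions, not part of the sources cited above.

<syntaxhighlight lang="python">
# Minimal sketch of Tikhonov (ridge) regularization, assuming a linear
# hypothesis space f(x) = w . x and the square loss V(f(x), y) = (f(x) - y)^2.
import numpy as np

def fit_ridge(X, y, gamma):
    """Minimize (1/n) * sum_i (w . x_i - y_i)^2 + gamma * ||w||^2.

    Setting the gradient to zero gives the closed-form solution
    w = (X^T X / n + gamma * I)^{-1} (X^T y / n).
    """
    n, d = X.shape
    A = X.T @ X / n + gamma * np.eye(d)
    b = X.T @ y / n
    return np.linalg.solve(A, b)

# Illustrative usage on synthetic data (hypothetical): a larger gamma yields a
# smaller-norm, more stable solution at the cost of a higher empirical risk.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ w_true + 0.1 * rng.normal(size=50)
w_hat = fit_ridge(X, y, gamma=0.1)
</syntaxhighlight>

In this sketch the penalty <math>\gamma \left\|w\right\|^2</math> plays the role of <math>\gamma \left\|f\right\|_{\mathcal{H}}^2</math>: it rules out functions of arbitrarily large norm, which is what stabilizes the solution against small perturbations of the training set.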