===Noise in the output values===
A fourth issue is the degree of noise in the desired output values (the supervisory [[target variable]]s). If the desired output values are often incorrect (because of human error or sensor errors), then the learning algorithm should not attempt to find a function that exactly matches the training examples. Attempting to fit the data too carefully leads to [[overfitting]]. Overfitting can occur even when there are no measurement errors (stochastic noise) if the function being learned is too complex for the learning model. In such a situation, the part of the target function that cannot be modeled "corrupts" the training data; this phenomenon has been called [[deterministic noise]]. When either type of noise is present, it is better to use a higher-bias, lower-variance estimator.

In practice, there are several approaches to alleviate noise in the output values, such as [[early stopping]] to prevent overfitting, as well as [[anomaly detection|detecting]] and removing the noisy training examples prior to training the supervised learning algorithm. Several algorithms identify noisy training examples, and removing the suspected noisy examples prior to training has been shown to decrease [[generalization error]] with [[statistical significance]].<ref>{{cite journal |author=C.E. Brodely and M.A. Friedl |title=Identifying and Eliminating Mislabeled Training Instances |journal=Journal of Artificial Intelligence Research |volume=11 |pages=131–167 |year=1999 |url=http://jair.org/media/606/live-606-1803-jair.pdf}}</ref><ref>{{cite conference |author=M.R. Smith and T. Martinez |title=Improving Classification Accuracy by Identifying and Removing Instances that Should Be Misclassified |book-title=Proceedings of International Joint Conference on Neural Networks (IJCNN 2011) |pages=2690–2697 |year=2011 |doi=10.1109/IJCNN.2011.6033571 |citeseerx=10.1.1.221.1371 }}</ref>
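For illustration, the following is a minimal sketch of one such filtering scheme, assuming the [[scikit-learn]] library; it is a generic cross-validation filter, not the specific algorithms of the cited studies. Examples whose given labels disagree with out-of-sample predictions are treated as suspected noisy and dropped before the final model is fit:

<syntaxhighlight lang="python">
# Sketch of filtering suspected noisy labels before training.
# Assumes scikit-learn; the model choice and threshold are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict

# Synthetic data with deliberately noisy labels (flip_y flips ~10% of them).
X, y = make_classification(n_samples=2000, n_features=20,
                           flip_y=0.10, random_state=0)

# Cross-validated predictions: each example is predicted by a model
# that never saw it during training.
filter_model = RandomForestClassifier(n_estimators=100, random_state=0)
y_pred = cross_val_predict(filter_model, X, y, cv=5)

# Keep only examples whose given label agrees with the out-of-sample
# prediction; the rest are treated as suspected noisy and removed.
keep = (y_pred == y)
X_clean, y_clean = X[keep], y[keep]

# Train the final model on the filtered training set.
final_model = RandomForestClassifier(n_estimators=100, random_state=0)
final_model.fit(X_clean, y_clean)
print(f"Removed {np.sum(~keep)} of {len(y)} suspected noisy examples")
</syntaxhighlight>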