==Underfitting==
[[Image:Underfitted Model.png|thumb|300px|Figure 5. The red line represents an underfitted model of the data points shown in blue. We would expect to see a parabola-shaped line to represent the curvature of the data points.]]
[[Image:Underfitting fitted model.png|thumb|300px|Figure 6. The blue line represents a well-fitted model of the data points shown in green.]]
Underfitting is the inverse of overfitting: the statistical model or machine learning algorithm is too simplistic to accurately capture the patterns in the data. A sign of underfitting is high bias and low variance in the model or algorithm (the inverse of overfitting: low [[bias]] and high [[variance]]). This can be diagnosed with the [[bias–variance tradeoff]], which decomposes a model's expected error into bias error, variance error, and irreducible error. With high bias and low variance, the model misrepresents the data points and is therefore unable to adequately predict future results (see [[Generalization error]]). As shown in Figure 5, the straight line cannot represent all the given data points because it does not follow their curvature; we would instead expect a parabola-shaped line, as shown in Figure 6 and Figure 1. If we were to use Figure 5 for analysis, we would get false predictive results, contrary to the results obtained from Figure 6. Burnham & Anderson state the following:<ref name="BA2002"/>{{rp|32}}

{{quote|text= ... an underfitted model would ignore some important replicable (i.e., conceptually replicable in most other samples) structure in the data and thus fail to identify effects that were actually supported by the data. In this case, bias in the parameter estimators is often substantial, and the sampling variance is underestimated, both factors resulting in poor confidence interval coverage. Underfitted models tend to miss important treatment effects in experimental settings.}}

===Resolving underfitting===
There are multiple ways to deal with underfitting:
# Increase the complexity of the model: If the model is too simple, it may be necessary to increase its complexity by adding more features, increasing the number of parameters, or using a more flexible model. However, this should be done carefully to avoid overfitting; see the first sketch after this list.<ref name=":0">{{Cite web |date=2017-11-23 |title=ML {{!}} Underfitting and Overfitting |url=https://www.geeksforgeeks.org/underfitting-and-overfitting-in-machine-learning/ |access-date=2023-02-27 |website=GeeksforGeeks |language=en-us}}</ref>
# Use a different algorithm: If the current algorithm cannot capture the patterns in the data, it may be necessary to try a different one. For example, a neural network may be more effective than a linear regression model for some types of data.<ref name=":0" />
# Increase the amount of training data: If the model is underfitting because it has too little data to learn from, increasing the amount of training data may help it capture the underlying patterns.<ref name=":0" />
# Regularization: Regularization is a technique used to prevent overfitting by adding a penalty term to the loss function that discourages large parameter values. It can also be used to prevent underfitting by controlling the complexity of the model; the first sketch after this list pairs a regularization penalty with added model complexity.<ref>{{Cite journal |last1=Nusrat |first1=Ismoilov |last2=Jang |first2=Sung-Bong |date=November 2018 |title=A Comparison of Regularization Techniques in Deep Neural Networks |journal=Symmetry |language=en |volume=10 |issue=11 |pages=648 |doi=10.3390/sym10110648 |bibcode=2018Symm...10..648N |issn=2073-8994 |doi-access=free }}</ref>
# [[Ensemble learning|Ensemble methods]]: Ensemble methods combine multiple models to create a more accurate prediction. This can help reduce underfitting by allowing multiple models to work together to capture the underlying patterns in the data; see the second sketch after this list.
# [[Feature engineering]]: Feature engineering involves creating new model features from the existing ones that may be more relevant to the problem at hand. This can help improve the accuracy of the model and prevent underfitting.<ref name=":0" />
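
Points 1 and 4 can be illustrated together with a minimal sketch, assuming Python with [[scikit-learn]]; the synthetic data and the parameter choices (such as the polynomial degree and the ridge penalty <code>alpha</code>) are illustrative assumptions, not taken from the cited sources. A straight-line model underfits parabola-shaped data such as that in Figure 5, while adding polynomial features increases model complexity and a ridge (L2) penalty keeps the added flexibility from tipping into overfitting.

<syntaxhighlight lang="python">
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic parabola-shaped data with noise (cf. Figures 5 and 6).
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 100).reshape(-1, 1)
y = x.ravel() ** 2 + rng.normal(scale=0.5, size=100)

# A straight line underfits: high bias, low variance.
underfit = LinearRegression().fit(x, y)

# Polynomial features raise the model's complexity (point 1), while the
# ridge penalty alpha discourages large coefficients (point 4), guarding
# the now more flexible model against overfitting.
fitted = make_pipeline(PolynomialFeatures(degree=2),
                       Ridge(alpha=1.0)).fit(x, y)

print(underfit.score(x, y))  # R^2 near 0: the curvature is missed
print(fitted.score(x, y))    # R^2 near 1: the curvature is captured
</syntaxhighlight>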
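
Point 5 can be illustrated in the same setting. The sketch below is again an assumed scikit-learn illustration rather than anything from the cited sources: a single depth-one decision tree is too simple to follow the parabola, while a boosted ensemble of many such weak learners captures it.

<syntaxhighlight lang="python">
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor

# The same synthetic parabola-shaped data as in the previous sketch.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 100).reshape(-1, 1)
y = x.ravel() ** 2 + rng.normal(scale=0.5, size=100)

# A lone depth-1 tree (a "stump") can place only a single step in the
# data, so it underfits.
stump = DecisionTreeRegressor(max_depth=1).fit(x, y)

# Gradient boosting combines hundreds of such weak learners; the
# ensemble captures structure that no single member can.
boosted = GradientBoostingRegressor(max_depth=1,
                                    n_estimators=200).fit(x, y)

print(stump.score(x, y))    # poor fit: one split cannot follow a parabola
print(boosted.score(x, y))  # much higher R^2: the ensemble fits the curve
</syntaxhighlight>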