Editing
Regression analysis
(section)
==Prediction (interpolation and extrapolation) {{anchor|Prediction|Interpolation|Extrapolation|Interpolation and extrapolation}}==
{{further|Predicted response|Prediction interval}}
[[File:CurveWeightHeight.png|thumb|upright=1.5|In the middle, the fitted straight line represents the best balance between the points above and below this line. The dotted straight lines represent the two extreme lines, considering only the variation in the slope. The inner curves represent the estimated range of values considering the variation in both slope and intercept. The outer curves represent a prediction for a new measurement.<ref>{{cite book |last=Rouaud |first=Mathieu |title=Probability, Statistics and Estimation|year=2013 |page=60 |url=http://www.incertitudes.fr/book.pdf }}</ref>]]
Regression models '''''predict''''' a value of the ''Y'' variable given known values of the ''X'' variables. Prediction {{em|within}} the range of values in the dataset used for model-fitting is known informally as ''[[interpolation]]''. Prediction {{em|outside}} this range of the data is known as ''[[extrapolation]]''. Extrapolation relies strongly on the regression assumptions. The further the extrapolation goes outside the data, the more room there is for the model to fail due to differences between the assumptions and the sample data or the true values.

A ''[[prediction interval]]'' that represents the uncertainty may accompany the point prediction. Such intervals tend to expand rapidly as the values of the independent variable(s) move outside the range covered by the observed data. For these and other reasons, some authors advise against extrapolation.<ref>Chiang, C.L. (2003) ''Statistical methods of analysis'', World Scientific.
{{isbn|981-238-310-7}} - [https://books.google.com/books?id=BuPNIbaN5v4C&dq=regression+extrapolation&pg=PA274 page 274 section 9.7.4 "interpolation vs extrapolation"]</ref>

===Model selection===
{{Further|Model selection}}
The assumption of a particular form for the relation between ''Y'' and ''X'' is another source of uncertainty. A properly conducted regression analysis will include an assessment of how well the assumed form is matched by the observed data, but it can only do so within the range of values of the independent variables actually available. This means that any extrapolation is particularly reliant on the assumptions being made about the structural form of the regression relationship. If prior knowledge establishes that the dependent variable cannot go outside a certain range of values, this can be used in selecting the model, even if the observed dataset has no values particularly near such bounds. The choice of functional form can therefore have large consequences when extrapolation is considered. At a minimum, it can ensure that any extrapolation arising from a fitted model is "realistic" (or in accord with what is known).
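
The two points made above (prediction uncertainty grows outside the observed range, and a bounded functional form keeps extrapolations realistic) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are invented, the helper names (<code>fit_line</code>, <code>prediction_se</code>) are hypothetical, and the logit transform is only one possible way to respect a dependent variable known to lie between 0 and 1. The standard error used is the textbook one for simple linear regression with independent, equal-variance errors; a prediction interval would multiply it by a quantile of the ''t'' distribution.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def prediction_se(xs, ys, x0):
    """Standard error of a new observation at x0 (simple linear regression,
    assuming independent errors with equal variance)."""
    n = len(xs)
    a, b = fit_line(xs, ys)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - 2)  # residual variance estimate
    return math.sqrt(s2 * (1 + 1 / n + (x0 - x_bar) ** 2 / sxx))


def logit(p):
    return math.log(p / (1 - p))


def inv_logit(z):
    return 1 / (1 + math.exp(-z))


# Invented data: y is a proportion, so it is known to lie strictly in (0, 1).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.12, 0.25, 0.45, 0.62, 0.80]

x_inside, x_outside = 3.0, 10.0  # 3 is inside the observed range, 10 is not

# Uncertainty of a new observation grows with distance from the mean of x.
se_inside = prediction_se(xs, ys, x_inside)
se_outside = prediction_se(xs, ys, x_outside)

# Straight-line model: nothing stops the extrapolation leaving (0, 1).
a_lin, b_lin = fit_line(xs, ys)
linear_pred = a_lin + b_lin * x_outside

# Line fitted on the logit scale: back-transformed predictions stay in (0, 1).
a_log, b_log = fit_line(xs, [logit(y) for y in ys])
logistic_pred = inv_logit(a_log + b_log * x_outside)

print("prediction SE inside / outside:", se_inside, se_outside)
print("linear extrapolation:", linear_pred)
print("logit-scale extrapolation:", logistic_pred)
```

Within the observed range the two fits give similar predictions for these data; they diverge only under extrapolation, which is where the choice of functional form matters most.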