==Statistics==
In [[statistics]], prediction is a part of [[statistical inference]]. One particular approach to such inference is known as [[predictive inference]], but prediction can be undertaken within any of the several approaches to statistical inference. Indeed, one possible description of statistics is that it provides a means of transferring knowledge about a sample of a population to the whole population, and to other related populations, which is not necessarily the same as prediction over time. When information is transferred across time, often to specific points in time, the process is known as [[forecasting]].<ref>{{cite book |last=Cox |first=D. R. |year=2006 |title=Principles of Statistical Inference |publisher=Cambridge University Press |isbn=978-0-521-68567-2 }}</ref>{{Failed verification|date=November 2017|reason=Reference does not mention "forecasting" at all.}}

Forecasting usually requires [[time series]] methods, while prediction is often performed on [[cross-sectional data]]. Statistical techniques used for prediction include [[Regression analysis#Prediction|regression]] and its various sub-categories such as [[linear regression]], [[generalized linear model]]s ([[logistic regression]], [[Poisson regression]], [[probit regression]]), etc. In the case of forecasting, [[autoregressive moving average model]]s and [[vector autoregression]] models can be used. When these or related generalized regression or [[machine learning]] methods are deployed commercially, the field is known as [[predictive analytics]].<ref>{{cite book |last=Siegel |first=Eric |year=2013 |title=Predictive Analysis: The Power to Predict Who Will Click, Buy, Lie, or Die |publisher=John Wiley & Sons |location=Hoboken, NJ |isbn=978-1-118-35685-2 }}</ref>

In many applications, such as time series analysis, it is possible to estimate the models that generate the observations.
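The distinction between forecasting and cross-sectional prediction can be illustrated with a minimal sketch of the simplest autoregressive model, an AR(1), fitted by least squares. All data and coefficients here are synthetic and illustrative, not drawn from any real application:

```python
import numpy as np

# Simulate an AR(1) process: y[t] = phi * y[t-1] + noise.
rng = np.random.default_rng(0)
n = 300
phi_true = 0.7
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi_true * y[t - 1] + rng.normal(0.0, 1.0)

# Estimate the autoregressive coefficient by least squares on lagged values.
phi_hat = np.dot(y[:-1], y[1:]) / np.dot(y[:-1], y[:-1])

# Forecast: the one-step-ahead prediction of the next, unobserved value.
forecast = phi_hat * y[-1]
```

Here the model is estimated from the series' own past, and the "prediction" is a statement about a future time point, which is what distinguishes forecasting from prediction on cross-sectional data.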
If models can be expressed as [[transfer function]]s or in terms of state-space parameters, then smoothed, filtered and predicted data estimates can be calculated.{{Citation needed|date=December 2019|reason=removed citation to predatory publisher content}} If the underlying generating models are linear, then a minimum-variance [[Kalman filter]] and a minimum-variance smoother may be used to recover data of interest from noisy measurements. These techniques rely on one-step-ahead predictors (which minimise the variance of the [[prediction error]]). When the generating models are nonlinear, stepwise linearizations may be applied within [[Extended Kalman Filter]] and smoother recursions. However, in nonlinear cases, optimum minimum-variance performance guarantees no longer apply.<ref>{{cite journal |last1=Julier |first1=S. J. |last2=Uhlmann |first2=J. K. |year=2004 |title=Unscented filtering and nonlinear estimation |journal=Proceedings of the IEEE |volume=92 |issue=3 |pages=401–422 |doi=10.1109/jproc.2003.823141 |s2cid=9614092 |citeseerx=10.1.1.136.6539 }}</ref>

To use regression analysis for prediction, data are collected on the variable that is to be predicted, called the [[dependent variable]] or response variable, and on one or more variables whose values are [[hypothesis|hypothesized]] to influence it, called [[independent variable]]s or explanatory variables. A [[Function (mathematics)#Real function|functional form]], often linear, is hypothesized for the postulated causal relationship, and the [[parameter]]s of the function are [[estimation|estimated]] from the data; that is, they are chosen so as to optimize in some way the [[goodness of fit|fit]] of the function, thus parameterized, to the data. That is the estimation step.
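Returning to the state-space setting above, the Kalman recursion (one-step-ahead prediction followed by a measurement update) can be sketched for a scalar linear model. The model, coefficients and noise variances below are a hypothetical toy example, chosen only to make the recursion concrete:

```python
import numpy as np

# Toy scalar state-space model (all numbers illustrative):
#   x[k+1] = a * x[k] + w[k],  w ~ N(0, q)   (state transition)
#   y[k]   = x[k] + v[k],      v ~ N(0, r)   (noisy measurement)
a, q, r = 0.9, 0.1, 1.0

rng = np.random.default_rng(1)
n = 200
x = np.zeros(n)  # true hidden state
y = np.zeros(n)  # noisy measurements
for k in range(1, n):
    x[k] = a * x[k - 1] + rng.normal(0.0, np.sqrt(q))
    y[k] = x[k] + rng.normal(0.0, np.sqrt(r))

# Kalman filter: alternate one-step-ahead prediction and measurement update.
x_hat, p = 0.0, 1.0          # state estimate and its error variance
estimates = []
for k in range(n):
    # Predict the state and its variance one step ahead.
    x_pred = a * x_hat
    p_pred = a * a * p + q
    # Correct with the new measurement, weighted by the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_hat = x_pred + gain * (y[k] - x_pred)
    p = (1.0 - gain) * p_pred
    estimates.append(x_hat)
estimates = np.asarray(estimates)
```

Because the model is linear with Gaussian noise, the filtered estimates track the hidden state with lower mean-squared error than the raw measurements, which is the minimum-variance property described above.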
For the prediction step, explanatory variable values that are deemed relevant to future (or current but not yet observed) values of the dependent variable are input to the parameterized function to generate predictions for the dependent variable.<ref>{{cite book |last=Fox |first=John |year=2016 |title=Applied Regression Analysis and Generalized Linear Models |publisher=Sage |location=London |edition=Third |isbn=978-1-4522-0566-3 }}</ref> An unbiased estimate of a model's predictive performance can be obtained on [[Hold-out cross-validation|hold-out test sets]]. The predictions can be compared visually to the ground truth in a [[parity plot]].
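The estimation step, the prediction step, and hold-out evaluation can be sketched together for a linear regression. The data, coefficients and split are synthetic and illustrative:

```python
import numpy as np

# Hypothetical cross-sectional data: response y driven by one explanatory x.
rng = np.random.default_rng(2)
x = rng.uniform(-5.0, 5.0, size=100)
y = 3.0 * x - 2.0 + rng.normal(0.0, 1.0, size=100)  # true slope 3, intercept -2

# Split into a training set and a hold-out test set.
idx = rng.permutation(100)
train, test = idx[:80], idx[80:]

# Estimation step: fit the linear functional form on training data only.
X_train = np.column_stack([x[train], np.ones(train.size)])
coef, *_ = np.linalg.lstsq(X_train, y[train], rcond=None)

# Prediction step: apply the parameterized function to held-out explanatory values.
y_pred = coef[0] * x[test] + coef[1]

# Hold-out evaluation: error on data the model never saw during estimation.
test_mse = np.mean((y_pred - y[test]) ** 2)
```

Because the test observations played no role in estimating the parameters, `test_mse` estimates out-of-sample performance rather than the (optimistic) fit on the training data.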