{{short description|Statistical model validation technique}} {{More citations needed|date=August 2017}} [[File:Confusion matrix.png|thumb|250x250px|Comparing the cross-validation accuracy and percent of false negatives (overestimation) of five classification models. The size of the bubbles represents the standard deviation of cross-validation accuracy (tenfold).<ref name=":1">{{cite journal |last1=Piryonesi |first1=S. Madeh |last2=El-Diraby |first2=Tamer E. |title=Data Analytics in Asset Management: Cost-Effective Prediction of the Pavement Condition Index |journal=Journal of Infrastructure Systems |date=March 2020 |volume=26 |issue=1 |doi=10.1061/(ASCE)IS.1943-555X.0000512 }}</ref>]] [[File:K-fold cross validation EN.svg|thumb|250px|right|Diagram of k-fold cross-validation]] '''Cross-validation''',<ref>{{cite journal |last=Allen |first=David M |year=1974 |title=The Relationship between Variable Selection and Data Agumentation and a Method for Prediction |journal=Technometrics |volume=16 |issue=1 |pages=125–127 |doi=10.2307/1267500 |jstor=1267500 }}</ref><ref>{{cite journal |last1=Stone |first1=M. |title=Cross-Validatory Choice and Assessment of Statistical Predictions |journal=Journal of the Royal Statistical Society Series B: Statistical Methodology |date=1974 |volume=36 |issue=2 |pages=111–133 |doi=10.1111/j.2517-6161.1974.tb00994.x }}</ref><ref>{{cite journal |last=Stone |first=M |year=1977 |title=An Asymptotic Equivalence of Choice of Model by Cross-Validation and Akaike's Criterion |journal=Journal of the Royal Statistical Society, Series B (Methodological) |volume=39 |issue=1 |pages=44–47 |doi=10.1111/j.2517-6161.1977.tb01603.x |jstor=2984877 }}</ref> sometimes called '''rotation estimation'''<ref>{{cite book |last=Geisser |first=Seymour |year=1993 |title=Predictive Inference |publisher=Chapman and Hall |location=New York, NY |isbn=978-0-412-03471-8 }}{{pn|date=November 2024}}</ref><ref name="Kohavi95">{{cite book |last1=Kohavi |first1=Ron |chapter=A study of cross-validation and bootstrap for accuracy estimation and model selection |pages=1137–1143 |chapter-url=https://www.ijcai.org/Proceedings/95-2/Papers/016.pdf |title=Proceedings of the 14th international joint conference on Artificial intelligence |volume=2 |date=20 August 1995 |publisher=Morgan Kaufmann Publishers |isbn=978-1-55860-363-9 }}</ref><ref name="Devijver82">{{cite book |last1=Devijver |first1=Pierre A. |last2=Kittler |first2=Josef |title=Pattern Recognition: A Statistical Approach |publisher=Prentice-Hall |location=London, GB |date=1982 |isbn=978-0-13-654236-0 }}{{pn|date=November 2024}}</ref> or '''out-of-sample testing''', is any of various similar [[model validation]] techniques for assessing how the results of a [[statistics|statistical]] analysis will [[Generalization error|generalize]] to an independent data set. Cross-validation includes [[Resampling (statistics)|resampling]] and sample splitting methods that use different portions of the data to test and train a model on different iterations. It is often used in settings where the goal is prediction, and one wants to estimate how [[accuracy|accurately]] a [[predictive modelling|predictive model]] will perform in practice. It can also be used to assess the quality of a fitted model and the stability of its parameters.
In a prediction problem, a model is usually given a dataset of ''known data'' on which training is run (''training dataset''), and a dataset of ''unknown data'' (or ''first seen'' data) against which the model is tested (called the [[validation set|validation dataset]] or ''testing set'').<ref>{{cite web |title=What is the difference between test set and validation set? |url=https://stats.stackexchange.com/q/19051 |website=Cross Validated |publisher=Stack Exchange |first=Alexander |last=Galkin |date=November 28, 2011 |access-date=10 October 2018}}</ref><ref name="Newbie question: Confused about train, validation and test data!">{{cite web|url=http://www.heatonresearch.com/node/1823 |title=Newbie question: Confused about train, validation and test data! |date=December 2010 |website=Heaton Research |access-date=2013-11-14 |url-status=dead |archive-url=https://web.archive.org/web/20150314221014/http://www.heatonresearch.com/node/1823 |archive-date=2015-03-14 }}{{self-published inline|date=November 2024}}</ref> The goal of cross-validation is to test the model's ability to predict new data that was not used in estimating it, in order to flag problems like [[overfitting]] or [[selection bias]]<ref>{{cite journal |last1=Cawley |first1=Gavin C. |last2=Talbot |first2=Nicola L. C. |title=On Over-fitting in Model Selection and Subsequent Selection Bias in Performance Evaluation |journal=Journal of Machine Learning Research |volume=11 |date=2010 |pages=2079–2107 |url=https://www.jmlr.org/papers/volume11/cawley10a/cawley10a.pdf }}</ref> and to give insight into how the model will generalize to an independent dataset (i.e., an unknown dataset, for instance from a real problem). One round of cross-validation involves [[partition of a set|partitioning]] a [[statistical sample|sample]] of [[data]] into [[Complement (set theory)|complementary]] subsets, performing the analysis on one subset (called the ''training set''), and validating the analysis on the other subset (called the ''validation set'' or ''testing set''). To reduce [[variance|variability]], most methods perform multiple rounds of cross-validation using different partitions, and the validation results are combined (e.g. averaged) over the rounds to give an estimate of the model's predictive performance. In summary, cross-validation combines (averages) measures of [[Goodness of fit|fitness]] in prediction to derive a more accurate estimate of model prediction performance.<ref>{{cite journal |last1=Seni |first1=Giovanni |last2=Elder |first2=John F. |title=Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions |journal=Synthesis Lectures on Data Mining and Knowledge Discovery |date=January 2010 |volume=2 |issue=1 |pages=1–126 |doi=10.2200/S00240ED1V01Y200912DMK002 }}</ref>
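As a minimal illustrative sketch (not drawn from the cited sources; the choice of scikit-learn, its bundled Iris dataset, and logistic regression here are assumptions made purely for the example), the procedure above can be expressed as: split the data into ''k'' folds, train on ''k'' − 1 of them, score on the held-out fold, and average the per-fold scores.

<syntaxhighlight lang="python">
# Sketch of k-fold cross-validation (illustrative only; assumes NumPy and scikit-learn).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold

X, y = load_iris(return_X_y=True)          # example dataset, chosen arbitrarily
kf = KFold(n_splits=10, shuffle=True, random_state=0)  # one partition into 10 folds

scores = []
for train_idx, test_idx in kf.split(X):
    # One round: fit on the training subset, validate on the complementary subset.
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[test_idx], model.predict(X[test_idx])))

# Combine (average) the per-fold measures of fitness to estimate predictive performance.
print(f"mean accuracy: {np.mean(scores):.3f} (std {np.std(scores):.3f})")
</syntaxhighlight>

The averaged score, rather than any single fold's score, serves as the estimate of how the model would perform on independent data; the standard deviation across folds indicates the variability of that estimate.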