Cross-validation (statistics)
==Using prior information==
When users apply cross-validation to select a good configuration <math>\lambda</math>, they might want to balance the cross-validated choice with their own estimate of the configuration. In this way, they can attempt to counter the volatility of cross-validation when the sample size is small and include relevant information from previous research. In a forecasting combination exercise, for instance, cross-validation can be applied to estimate the weights that are assigned to each forecast. Since a simple equal-weighted forecast is difficult to beat, a penalty can be added for deviating from equal weights.<ref name="Hoornweg2018SUS">{{cite book |last1=Hoornweg |first1=Victor |title=Science: Under Submission | date=2018 |publisher=Hoornweg Press |isbn=978-90-829188-0-9 |url=https://victorhoornweg.com/docs/Hoornweg%202018%20Science%20Under%20Submission.pdf }}{{pn|date=November 2024}}{{self-published inline|date=November 2024}}</ref> Likewise, if cross-validation is applied to assign individual weights to observations, one can penalize deviations from equal weights to avoid discarding potentially relevant information.<ref name = "Hoornweg2018SUS" />

Hoornweg (2018) shows how a tuning parameter <math>\gamma</math> can be defined so that a user can intuitively balance between the accuracy of cross-validation and the simplicity of sticking to a reference parameter <math>\lambda_R</math> that is defined by the user. If <math>\lambda_i</math> denotes the <math>i^{th}</math> candidate configuration that might be selected, then the [[Loss function#Statistics|loss function]] that is to be minimized can be defined as

: <math> L_{\lambda_i} = (1-\gamma) \mbox{ Relative Accuracy}_i + \gamma \mbox{ Relative Simplicity}_i. </math>

Relative accuracy can be quantified as <math>\mbox{MSE}(\lambda_i)/\mbox{MSE}(\lambda_R)</math>, so that the mean squared error of a candidate <math>\lambda_i</math> is made relative to that of the user-specified <math>\lambda_R</math>. The relative simplicity term measures the amount that <math>\lambda_i</math> deviates from <math>\lambda_R</math>, relative to the maximum amount of deviation from <math>\lambda_R</math>. Accordingly, relative simplicity can be specified as <math>\frac{(\lambda_i-\lambda_R)^2}{(\lambda_{\max}-\lambda_R)^2}</math>, where <math>\lambda_{\max}</math> corresponds to the <math>\lambda</math> value with the highest permissible deviation from <math>\lambda_R</math>. With <math>\gamma\in[0,1]</math>, the user determines how much influence the reference parameter has relative to cross-validation.

One can add relative simplicity terms for multiple configurations <math>c=1,2,...,C</math> by specifying the loss function as

: <math> L_{\lambda_i} = \mbox{ Relative Accuracy}_i + \sum_{c=1}^C \frac{\gamma_c}{1-\gamma_c} \mbox{ Relative Simplicity}_{i,c}. </math>

Hoornweg (2018) shows that a loss function with such an accuracy-simplicity tradeoff can also be used to intuitively define [[shrinkage estimator]]s like the (adaptive) lasso and [[Bayesian regression|Bayesian]] / [[ridge regression]].<ref name = "Hoornweg2018SUS" /> See the [[Lasso (statistics)#Interpretations of lasso|lasso]] for an example.
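The single-configuration tradeoff above can be sketched in a few lines of Python. This is an illustrative sketch, not Hoornweg's implementation: the function <code>mse</code> stands in for whatever cross-validated mean-squared-error routine the user has, and the candidate grid, reference value <math>\lambda_R</math>, and <math>\gamma</math> below are hypothetical choices.

```python
def select_lambda(candidates, mse, lambda_ref, gamma):
    """Pick the candidate minimising
    (1 - gamma) * relative accuracy + gamma * relative simplicity.

    candidates : list of candidate configuration values (grid of lambdas)
    mse        : callable returning the cross-validated MSE of a lambda
                 (stand-in for the user's own CV routine)
    lambda_ref : user-specified reference configuration lambda_R
    gamma      : tradeoff weight in [0, 1]
    """
    mse_ref = mse(lambda_ref)
    # lambda_max: the candidate with the largest squared deviation from
    # lambda_R, used to normalise the simplicity term to [0, 1].
    lam_max = max(candidates, key=lambda lam: (lam - lambda_ref) ** 2)
    max_dev = (lam_max - lambda_ref) ** 2

    def loss(lam):
        rel_acc = mse(lam) / mse_ref
        rel_simp = (lam - lambda_ref) ** 2 / max_dev
        return (1 - gamma) * rel_acc + gamma * rel_simp

    return min(candidates, key=loss)
```

Setting <math>\gamma=0</math> recovers the purely cross-validated choice, while <math>\gamma=1</math> returns the reference parameter itself (its simplicity penalty is zero); intermediate values shrink the selection toward <math>\lambda_R</math>.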