Logistic regression
==="Rule of ten"===
{{main|One in ten rule}}
The widely used "[[one in ten rule]]" states that logistic regression models give stable coefficient estimates for the explanatory variables if based on a minimum of about 10 events per explanatory variable (EPV), where ''event'' denotes the cases belonging to the less frequent category in the dependent variable. Thus a study designed to use <math>k</math> explanatory variables for an event (e.g. [[myocardial infarction]]) expected to occur in a proportion <math>p</math> of participants in the study will require a total of <math>10k/p</math> participants. However, there is considerable debate about the reliability of this rule, which is based on simulation studies and lacks a secure theoretical underpinning.<ref>{{cite journal|pmid=27881078|pmc=5122171|year=2016|last1=Van Smeden|first1=M.|title=No rationale for 1 variable per 10 events criterion for binary logistic regression analysis|journal=BMC Medical Research Methodology|volume=16|issue=1|page=163|last2=De Groot|first2=J. A.|last3=Moons|first3=K. G.|last4=Collins|first4=G. S.|last5=Altman|first5=D. G.|last6=Eijkemans|first6=M. J.|last7=Reitsma|first7=J. B.|doi=10.1186/s12874-016-0267-3 |doi-access=free }}</ref> According to some authors<ref>{{cite journal|last=Peduzzi|first=P|author2=Concato, J |author3=Kemper, E |author4=Holford, TR |author5=Feinstein, AR |title=A simulation study of the number of events per variable in logistic regression analysis|journal=[[Journal of Clinical Epidemiology]]|date=December 1996|volume=49|issue=12|pages=1373–9|pmid=8970487|doi=10.1016/s0895-4356(96)00236-3|doi-access=free}}</ref> the rule is overly conservative in some circumstances, with the authors stating, "If we (somewhat subjectively) regard confidence interval coverage less than 93 percent, type I error greater than 7 percent, or relative bias greater than 15 percent as problematic, our results indicate that problems are fairly frequent with 2–4 EPV, uncommon with 5–9 EPV, and still observed with 10–16 EPV. The worst instances of each problem were not severe with 5–9 EPV and usually comparable to those with 10–16 EPV".<ref>{{cite journal|last1=Vittinghoff|first1=E.|last2=McCulloch|first2=C. E.|title=Relaxing the Rule of Ten Events per Variable in Logistic and Cox Regression|journal=American Journal of Epidemiology|date=12 January 2007|volume=165|issue=6|pages=710–718|doi=10.1093/aje/kwk052|pmid=17182981|doi-access=free}}</ref>

Others have found results that are not consistent with the above, using different criteria. A useful criterion is whether the fitted model will be expected to achieve the same predictive discrimination in a new sample as it appeared to achieve in the model development sample. For that criterion, 20 events per candidate variable may be required.<ref name=plo14mod/> Also, one can argue that 96 observations are needed just to estimate the model's intercept precisely enough that the margin of error in predicted probabilities is ±0.1 at a 0.95 confidence level.<ref name=rms/>
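
The arithmetic behind these sample-size figures can be sketched in a few lines of Python. This is an illustrative sketch only: the function names <code>participants_for_epv</code> and <code>intercept_only_n</code> are hypothetical, and the second function assumes that the figure of 96 comes from the usual normal-approximation formula for the margin of error of a proportion, evaluated at the worst case of a true probability of 0.5. That derivation is an assumption made here and is not spelled out in the text above.

```python
import math
from statistics import NormalDist


def participants_for_epv(k, p, epv=10):
    """Total participants needed so that the expected number of events
    (cases in the less frequent outcome category) is epv * k."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not 0 < p <= 0.5:
        raise ValueError("p must be in (0, 0.5]: events are the rarer category")
    return math.ceil(epv * k / p)


def intercept_only_n(margin=0.1, confidence=0.95, p=0.5):
    """Approximate n at which an estimated probability p has the given
    margin of error (assumed normal approximation; p = 0.5 is the worst case)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return (z / margin) ** 2 * p * (1 - p)


if __name__ == "__main__":
    # Five explanatory variables, event expected in 25% of participants.
    print(participants_for_epv(k=5, p=0.25))          # 10 EPV -> 200
    print(participants_for_epv(k=5, p=0.25, epv=20))  # 20 EPV -> 400
    print(round(intercept_only_n()))                  # -> 96
```

Setting <code>epv=20</code> corresponds to the stricter criterion of 20 events per candidate variable mentioned above. The result of <code>participants_for_epv</code> scales inversely with <code>p</code>, so rare events drive the required sample size up quickly.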