Imputation (statistics)
== Listwise (complete case) deletion ==
{{Main|Listwise deletion}}
By far the most common means of dealing with missing data is listwise deletion (also known as complete-case analysis), in which every case with a missing value is deleted. If the data are [[missing completely at random]], then listwise deletion does not add any bias, but it does decrease the [[Power (statistics)|power]] of the analysis by decreasing the effective sample size. For example, if 1000 cases are collected but 80 have missing values, the effective sample size after listwise deletion is 920. If the data are not missing completely at random, then listwise deletion will introduce bias, because the sub-sample of complete cases that remains is not representative of the original sample (and if the original sample was itself a representative sample of a population, the complete cases are not representative of that population either).<ref name="cambridge.org">{{Cite journal|last1=Lall|first1=Ranjit|date=2016|title=How Multiple Imputation Makes a Difference|url=https://www.cambridge.org/core/journals/political-analysis/article/how-multiple-imputation-makes-a-difference/8C6616B679EF8F3EB0041B1BC88EEBB9|journal=Political Analysis|language=en|volume=24|issue=4|pages=414–433|doi=10.1093/pan/mpw020|doi-access=free}}</ref> While listwise deletion is unbiased when the data are missing completely at random, this is rarely the case in practice.<ref>{{Cite journal|last=Kenward|first=Michael G|date=2013-02-26|title=The handling of missing data in clinical trials|journal=Clinical Investigation|volume=3|issue=3|pages=241–250|doi=10.4155/cli.13.7|doi-broken-date=2024-11-11 |issn=2041-6792|url=https://semanticscholar.org/paper/964403060982c44cc10842084105de256876b8c6}}</ref>

Pairwise deletion (or "available case analysis") involves deleting a case when it is missing a variable required for a particular analysis, but including that case in analyses for which all required variables are present. 
When pairwise deletion is used, the total N for analysis will not be consistent across parameter estimates. Because different estimates are computed from different subsets of cases, pairwise deletion can produce mathematically impossible results, such as correlations that are over 100%.<ref name="enders2010">{{cite book |last=Enders |first=C. K. |year=2010 |title=Applied Missing Data Analysis |location=New York |publisher=Guilford Press |isbn=978-1-60623-639-0 }}</ref> The one advantage complete-case deletion has over other methods is that it is straightforward and easy to implement. This is a large part of the reason why complete-case analysis is the most popular method of handling missing data, in spite of its many disadvantages.
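
The difference between the two approaches can be illustrated with a minimal Python sketch. The data set, the variable names (<code>age</code>, <code>income</code>, <code>score</code>) and the helper functions <code>listwise_delete</code> and <code>pairwise_n</code> are hypothetical and serve only as an illustration; missing values are assumed to be encoded as <code>None</code>.

```python
from itertools import combinations

# Hypothetical toy data set; None marks a missing value.
rows = [
    {"age": 34, "income": 52000, "score": 7.1},
    {"age": 29, "income": None, "score": 6.4},
    {"age": None, "income": 61000, "score": 8.0},
    {"age": 45, "income": 58000, "score": None},
    {"age": 51, "income": 75000, "score": 9.2},
    {"age": 38, "income": None, "score": 5.5},
]


def listwise_delete(rows):
    """Listwise deletion: keep only cases observed on every variable."""
    return [r for r in rows if all(v is not None for v in r.values())]


def pairwise_n(rows, variables):
    """Pairwise deletion: for each pair of variables, count the cases
    observed on both, i.e. the N available for that pair's analysis."""
    return {
        (a, b): sum(1 for r in rows if r[a] is not None and r[b] is not None)
        for a, b in combinations(variables, 2)
    }


complete = listwise_delete(rows)
counts = pairwise_n(rows, ["age", "income", "score"])

print(len(rows), len(complete))  # 6 cases collected, 2 complete cases
for pair, n in counts.items():
    print(pair, n)
```

In this toy example listwise deletion retains only two of the six cases, whereas pairwise deletion uses three or four cases depending on which pair of variables is analysed, so the N differs from one estimate to the next.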