=== Difference family: Effect sizes based on differences between means === The raw effect size pertaining to a comparison of two groups is the difference between the two means. However, to facilitate interpretation it is common to standardise the effect size; various conventions for statistical standardisation are presented below. ==== Standardized mean difference ==== [[File:Cohens d 4panel.svg|thumb|Plots of Gaussian densities illustrating various values of Cohen's d.]] A (population) effect size ''θ'' based on means usually considers the standardized mean difference (SMD) between two populations<ref name="HedgesL1985Statistical">{{Cite book | author = [[Larry V. Hedges]] & [[Ingram Olkin]] | title = Statistical Methods for Meta-Analysis | publisher = [[Academic Press]] | year = 1985 | location = Orlando | isbn = 978-0-12-336380-0 }}</ref>{{Rp|p=78|date=November 2012}} <math display="block">\theta = \frac{\mu_1 - \mu_2}{\sigma},</math> where ''μ''<sub>1</sub> is the mean for one population, ''μ''<sub>2</sub> is the mean for the other population, and ''σ'' is a [[standard deviation]] based on either or both populations. In practice the population values are typically not known and must be estimated from sample statistics. The several versions of effect sizes based on means differ with respect to which statistics are used. This form for the effect size resembles the computation for a [[t-test|''t''-test]] statistic, with the critical difference that the ''t''-test statistic includes a factor of <math>\sqrt{n}</math>. This means that for a given effect size, the significance level increases with the sample size. Unlike the ''t''-test statistic, the effect size aims to estimate a population [[parameter]] and is not affected by the sample size. 
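To make the <math>\sqrt{n}</math> factor concrete, the following sketch (the function name is hypothetical, not from the cited sources) shows how a fixed standardized effect implies a growing ''t''-statistic as the group sizes increase, even though the effect size itself is unchanged:

```python
import math

def t_from_smd(d, n1, n2):
    """t-statistic implied by a fixed standardized mean difference d
    in an equal-variance two-sample comparison: t = d * sqrt(n1*n2/(n1+n2))."""
    return d * math.sqrt(n1 * n2 / (n1 + n2))

# The same d = 0.5 yields a larger t (hence a smaller p-value) as n grows.
t_small = t_from_smd(0.5, 20, 20)    # ~1.58
t_large = t_from_smd(0.5, 200, 200)  # 5.0
```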
SMD values of 0.2 to 0.5 are considered small, 0.5 to 0.8 are considered medium, and greater than 0.8 are considered large.<ref name="Andrade2020">{{cite journal | last1 = Andrade | first1 = Chittaranjan | title = Mean Difference, Standardized Mean Difference (SMD), and Their Use in Meta-Analysis | journal = The Journal of Clinical Psychiatry | date = 22 September 2020 | volume = 81 | issue = 5 | eissn = 1555-2101 | doi = 10.4088/JCP.20f13681 | pmid = 32965803 | s2cid = 221865130 | url = | quote = SMD values of 0.2-0.5 are considered small, values of 0.5-0.8 are considered medium, and values > 0.8 are considered large. In psychopharmacology studies that compare independent groups, SMDs that are statistically significant are almost always in the small to medium range. It is rare for large SMDs to be obtained.| doi-access = free }}</ref> ====Cohen's ''d'' {{anchor|Cohen's d}}==== Cohen's ''d'' is defined as the difference between two means divided by a standard deviation for the data, i.e. <math display="block">d = \frac{\bar{x}_1 - \bar{x}_2} s.</math> [[Jacob Cohen (statistician)|Jacob Cohen]] defined ''s'', the [[pooled standard deviation]], as (for two independent samples):<ref name="CohenJ1988Statistical">{{cite book |last=Cohen |first=Jacob |author-link=Jacob Cohen (statistician) |url=https://books.google.com/books?id=2v9zDAsLvA0C&pg=PP1 |title=Statistical Power Analysis for the Behavioral Sciences |publisher=Routledge |year=1988 |isbn=978-1-134-74270-7 |pages=}}</ref>{{Rp|p=67|date=July 2014|chapter-url = http://www.utstat.toronto.edu/~brunner/oldclass/378f16/readings/CohenPower.pdf#page=66}} <math display="block">s = \sqrt{\frac{(n_1-1)s^2_1 + (n_2-1)s^2_2}{n_1+n_2 - 2}}</math> where the variance for one of the groups is defined as <math display="block">s_1^2 = \frac 1 {n_1-1} \sum_{i=1}^{n_1} (x_{1,i} - \bar{x}_1)^2,</math> and similarly for the other group. 
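As a minimal illustration of the definition above (not taken from the cited sources), Cohen's ''d'' with the pooled standard deviation can be computed as:

```python
import statistics

def cohens_d(x1, x2):
    """Cohen's d for two independent samples, using the pooled standard
    deviation with the (n1 + n2 - 2) denominator as defined by Cohen."""
    n1, n2 = len(x1), len(x2)
    # statistics.variance uses the (n - 1) denominator, matching s_1^2 above.
    s1, s2 = statistics.variance(x1), statistics.variance(x2)
    pooled = (((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(x1) - statistics.mean(x2)) / pooled
```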
Other authors choose a slightly different computation of the standard deviation when referring to "Cohen's ''d''", in which the denominator omits the "−2" term<ref>{{Cite journal | author1 = Robert E. McGrath | author2 = Gregory J. Meyer | title = When Effect Sizes Disagree: The Case of r and d | journal = [[Psychological Methods]] | volume = 11 | issue = 4 | pages = 386–401 | year = 2006 | url = http://www.bobmcgrath.org/Pubs/When_effect_sizes_disagree.pdf | doi = 10.1037/1082-989x.11.4.386 | pmid = 17154753 | citeseerx = 10.1.1.503.754 | access-date = 2014-07-30 | archive-url = https://web.archive.org/web/20131008171400/http://www.bobmcgrath.org/Pubs/When_effect_sizes_disagree.pdf | archive-date = 2013-10-08 | url-status=dead }}</ref><ref>{{cite book | last1=Hartung|first1=Joachim | last2=Knapp|first2=Guido | last3=Sinha|first3=Bimal K. | title=Statistical Meta-Analysis with Applications | url=https://books.google.com/books?id=JEoNB_2NONQC&pg=PP1|year=2008|publisher=John Wiley & Sons | isbn=978-1-118-21096-3}}</ref>{{Rp|p=14|date=November 2012}} <math display="block">s = \sqrt{\frac{(n_1-1)s^2_1 + (n_2-1)s^2_2}{n_1+n_2}}.</math> This definition of "Cohen's ''d''" is termed the [[maximum likelihood]] estimator by Hedges and Olkin,<ref name="HedgesL1985Statistical" /> and it is related to Hedges' ''g'' by a scaling factor (see below). With two paired samples, an approach is to look at the distribution of the difference scores. In that case, ''s'' is the standard deviation of this distribution of difference scores (of note, the standard deviation of difference scores depends on the correlation between the paired samples). 
This creates the following relationship between the ''t''-statistic for testing a difference in the means of the two paired groups and Cohen's ''d''′ (computed with difference scores): <math display="block">t = \frac{\bar{X}_1 - \bar{X}_2}{\text{SE}_\text{diff}} = \frac{\bar{X}_1 - \bar{X}_2}{\frac{\text{SD}_\text{diff}}{\sqrt N}} = \frac{\sqrt{N} (\bar{X}_1 - \bar{X}_2)}{\text{SD}_\text{diff}}</math> and <math display="block">d' = \frac{\bar{X}_1 - \bar{X}_2}{\text{SD}_\text{diff}} = \frac t {\sqrt N}.</math> However, for paired samples, Cohen states that ''d''′ does not provide the correct estimate to obtain the power of the test for ''d'', and that before looking up the values in the tables provided for ''d'', it should be corrected for ''r'' as in the following formula:{{sfn|Cohen|1988|p=49}} <math display="block">\frac{d'}{\sqrt{1 - r}},</math> where ''r'' is the correlation between the paired measurements. Given the same sample size, the higher ''r'' is, the higher the power for a test of the paired difference. Since ''d''′ depends on ''r'', it is difficult to interpret as a measure of effect size; therefore, in the context of paired analyses, since it is possible to compute either ''d''′ or ''d'' (estimated with a pooled standard deviation or that of a group or time point), it is necessary to indicate explicitly which one is being reported. As a measure of effect size, ''d'' (estimated with a pooled standard deviation or that of a group or time point) is more appropriate, for instance in meta-analysis.<ref name=":0">{{Cite journal |last=Dunlap |first=William P. |last2=Cortina |first2=Jose M. |last3=Vaslow |first3=Joel B. |last4=Burke |first4=Michael J. |date=1996 |title=Meta-analysis of experiments with matched groups or repeated measures designs. |url=http://doi.apa.org/getdoi.cfm?doi=10.1037/1082-989X.1.2.170 |journal=Psychological Methods |language=en |volume=1 |issue=2 |pages=170–177 |doi=10.1037/1082-989X.1.2.170 |issn=1082-989X}}</ref> Cohen's ''d'' is frequently used in [[estimating sample sizes]] for statistical testing. 
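The paired-samples quantities above can be sketched as follows (function names are illustrative, not standard library or source terminology):

```python
import math

def d_prime(diffs):
    """d' for paired data: mean of the difference scores divided by
    the sample standard deviation of the difference scores."""
    n = len(diffs)
    mean = sum(diffs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in diffs) / (n - 1))
    return mean / sd

def power_corrected(dp, r):
    """Cohen's correction d' / sqrt(1 - r), applied before using the
    power tables for d, where r is the correlation between the
    paired measurements."""
    return dp / math.sqrt(1 - r)
```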
A lower Cohen's ''d'' indicates that larger sample sizes are needed, and vice versa; the required sample size can subsequently be determined together with the additional parameters of the desired [[significance level]] and [[statistical power]].<ref>{{cite book|last=Kenny|first=David A.|title=Statistics for the Social and Behavioral Sciences|url=https://books.google.com/books?id=EdqhQgAACAAJ&pg=PP1|year=1987|publisher=Little, Brown|isbn=978-0-316-48915-7|chapter=Chapter 13|chapter-url=http://davidakenny.net/doc/statbook/chapter_13.pdf}}</ref> ==== Glass' Δ ==== In 1976, [[Gene V. Glass]] proposed an estimator of the effect size that uses only the standard deviation of the second group<ref name="HedgesL1985Statistical"/>{{Rp|p=78|date=November 2012}} <math display="block">\Delta = \frac{\bar{x}_1 - \bar{x}_2}{s_2}.</math> The second group may be regarded as a control group, and Glass argued that if several treatments were compared to the control group it would be better to use just the standard deviation computed from the control group, so that effect sizes would not differ under equal means and different variances. Under a correct assumption of equal population variances, a pooled estimate for ''σ'' is more precise. ==== Hedges' ''g'' ==== Hedges' ''g'', suggested by [[Larry Hedges]] in 1981,<ref>{{Cite journal | author = Larry V. Hedges | title = Distribution theory for Glass' estimator of effect size and related estimators | journal = [[Journal of Educational Statistics]] | volume = 6 | issue = 2 | pages = 107–128 | year = 1981 | doi = 10.3102/10769986006002107 | s2cid = 121719955 | author-link = Larry V. Hedges }}</ref> is like the other measures based on a standardized difference<ref name="HedgesL1985Statistical"/>{{Rp|p=79|date=November 2012}} <math display="block">g = \frac{\bar{x}_1 - \bar{x}_2}{s^*}</math> where the pooled standard deviation <math>s^*</math> is computed as:<!---there is something missing here... otherwise it is identical with Cohen's d... 
--> <math display="block">s^* = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}.</math> However, as an [[estimator]] for the population effect size ''θ'' it is [[Bias of an estimator|bias]]ed. Nevertheless, this bias can be approximately corrected through multiplication by a factor <math display="block">g^* = J(n_1+n_2-2) \,\, g \, \approx \, \left(1-\frac{3}{4(n_1+n_2)-9}\right) \,\, g</math> Hedges and Olkin refer to this less-biased estimator <math>g^*</math> as ''d'',<ref name="HedgesL1985Statistical" /> but it is not the same as Cohen's ''d''. The exact form for the correction factor ''J''() involves the [[gamma function]]<ref name="HedgesL1985Statistical"/>{{Rp|p=104|date=November 2012}} <math display="block">J(a) = \frac{\Gamma(a/2)}{\sqrt{a/2 \,}\,\Gamma((a-1)/2)}.</math> <!-- In the above 'height' example, Hedges' ''g''* effect size equals 1.76 (95% confidence intervals: 1.70–1.82). Notice how the large sample size has increased the effect size from Cohen's ''d''? If, instead, the available data were from only 90 men and 80 women Hedges' ''g''* provides a more conservative estimate of effect size: 1.70 (with larger 95% confidence intervals: 1.35–2.05). --> There are also multilevel variants of Hedges' ''g'', e.g., for use in cluster randomised controlled trials (CRTs).<ref>Hedges, L. V. (2011). Effect sizes in three-level cluster-randomized experiments. Journal of Educational and Behavioral Statistics, 36(3), 346–380.</ref> CRTs involve randomising clusters, such as schools or classrooms, to different conditions and are frequently used in education research. ====Ψ, root-mean-square standardized effect==== A similar effect size estimator for multiple comparisons (e.g., [[ANOVA]]) is the Ψ root-mean-square standardized effect:<ref name="Steiger2004"/> <math display="block">\Psi = \sqrt{ \frac{1}{k-1} \cdot \sum_{j=1}^k \left(\frac{\mu_j-\mu}{\sigma}\right)^2}</math> where ''k'' is the number of groups in the comparison. 
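As an illustrative sketch of the Ψ formula (a hypothetical helper assuming the population group means and a common σ are known):

```python
import math

def psi(group_means, sigma):
    """Root-mean-square standardized effect for k group means with a
    common standard deviation sigma: sqrt(sum(((m - grand)/sigma)^2) / (k - 1))."""
    k = len(group_means)
    grand = sum(group_means) / k  # the grand mean mu
    return math.sqrt(
        sum(((m - grand) / sigma) ** 2 for m in group_means) / (k - 1)
    )
```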
This essentially presents the omnibus difference of the entire model adjusted by the root mean square, analogous to ''d'' or ''g''. In addition, a generalization for multi-factorial designs has been provided.<ref name="Steiger2004"/> ==== Distribution of effect sizes based on means ==== Provided that the data are [[Gaussian distribution|Gaussian]] distributed, a scaled Hedges' ''g'', <math display="inline">\sqrt{n_1 n_2/(n_1+n_2)}\,g</math>, follows a [[noncentral t-distribution|noncentral ''t''-distribution]] with the [[noncentrality parameter]] <math display="inline">\sqrt{n_1 n_2/(n_1+n_2)}\,\theta</math> and {{math|(''n''<sub>1</sub> + ''n''<sub>2</sub> − 2)}} degrees of freedom. Likewise, the scaled Glass' Δ follows a noncentral ''t''-distribution with {{math|''n''<sub>2</sub> − 1}} degrees of freedom. From the distribution it is possible to compute the [[Expected value|expectation]] and variance of the effect sizes. In some cases large-sample approximations for the variance are used. One suggestion for the variance of Hedges' unbiased estimator is<ref name="HedgesL1985Statistical"/>{{Rp|p=86|date=November 2012}} <math display="block">\hat{\sigma}^2(g^*) = \frac{n_1+n_2}{n_1 n_2} + \frac{(g^*)^2}{2(n_1 + n_2)}.</math> ==== Strictly standardized mean difference (SSMD) ==== {{main|Strictly standardized mean difference}} As a statistical parameter, SSMD (denoted <math>\beta</math>) is defined as the ratio of the [[mean]] to the [[standard deviation]] of the difference of two random values drawn respectively from two groups. Assume that one group with random values has [[mean]] <math>\mu_1</math> and [[variance]] <math>\sigma_1^2</math> and another group has [[mean]] <math>\mu_2</math> and [[variance]] <math>\sigma_2^2</math>. 
The [[covariance]] between the two groups is <math>\sigma_{12}.</math> Then, the SSMD for the comparison of these two groups is defined as<ref name="ZhangGenomics2007">{{Cite journal |last=Zhang |first=XHD |year=2007 |title=A pair of new statistical parameters for quality control in RNA interference high-throughput screening assays |journal=Genomics |volume=89 |issue=4 |pages=552–61 |doi=10.1016/j.ygeno.2006.12.014 |pmid=17276655}}</ref> <math display="block">\beta = \frac{\mu_1 - \mu_2}{\sqrt{\sigma_1^2 + \sigma_2^2 - 2\sigma_{12}}}.</math> If the two groups are independent, <math display="block">\beta = \frac{\mu_1 - \mu_2}{\sqrt{\sigma_1^2 + \sigma_2^2}}.</math> If the two independent groups have equal [[variance]]s <math>\sigma^2</math>, <math display="block">\beta = \frac{\mu_1 - \mu_2}{\sqrt{2}\,\sigma}.</math> ==== Other metrics ==== [[Mahalanobis distance]] (''D'') is a multivariate generalization of Cohen's ''d'' which takes into account the relationships between the variables.<ref>{{Cite journal | last=Del Giudice | first=Marco | date=2013-07-18 | title=Multivariate Misgivings: Is D a Valid Measure of Group and Sex Differences? | journal=Evolutionary Psychology | language=en | volume=11 | issue=5 | pages=1067–1076 | doi=10.1177/147470491301100511 | doi-access=free | pmid=24333840 | pmc=10434404 }}</ref>
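A minimal sketch of the SSMD definitions above, with the population parameters taken as known (function and parameter names are illustrative):

```python
import math

def ssmd(mu1, var1, mu2, var2, cov12):
    """SSMD for two (possibly correlated) groups with covariance cov12:
    beta = (mu1 - mu2) / sqrt(var1 + var2 - 2*cov12)."""
    return (mu1 - mu2) / math.sqrt(var1 + var2 - 2 * cov12)

def ssmd_independent(mu1, var1, mu2, var2):
    """SSMD for two independent groups (covariance is zero)."""
    return ssmd(mu1, var1, mu2, var2, 0.0)
```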