Coefficient of variation
==Estimation==
When only a sample of data from a population is available, the population CV can be estimated using the ratio of the [[Standard deviation#Estimation|sample standard deviation]] <math>s \,</math> to the sample mean <math>\bar{x}</math>:
:<math>\widehat{c_{\rm v}} = \frac{s}{\bar{x}}</math>
However, when applied to a small or moderately sized sample, this estimator tends to be too low: it is a [[biased estimator]]. For [[normally distributed]] data, an unbiased estimator<ref>Sokal RR & Rohlf FJ. ''Biometry'' (3rd Ed). New York: Freeman, 1995. p. 58. {{ISBN|0-7167-2411-1}}</ref> for a sample of size <math>n</math> is:
:<math>\widehat{c_{\rm v}}^*=\bigg(1+\frac{1}{4n}\bigg)\widehat{c_{\rm v}}</math>

===Log-normal data===
Many datasets follow an approximately log-normal distribution.<ref>{{cite journal |doi=10.1641/0006-3568(2001)051[0341:LNDATS]2.0.CO;2 |title=Log-normal Distributions across the Sciences: Keys and Clues |year=2001 |last1=Limpert |first1=Eckhard |last2=Stahel |first2=Werner A. |last3=Abbt |first3=Markus |journal=BioScience |volume=51 |issue=5 |pages=341–352|doi-access=free }}</ref> In such cases, a more accurate estimate, derived from the properties of the [[log-normal distribution]],<ref>{{cite journal |doi=10.1093/biomet/51.1-2.25 |title=Confidence intervals for the coefficient of variation for the normal and log normal distributions |year=1964 |last1=Koopmans |first1=L. H. |last2=Owen |first2=D. B. |last3=Rosenblatt |first3=J. I. |journal=Biometrika |volume=51 |issue=1–2 |pages=25–32}}</ref><ref>{{cite journal |pmid=1601532 |year=1992 |last1=Diletti |first1=E |last2=Hauschke |first2=D |last3=Steinijans |first3=VW |title=Sample size determination for bioequivalence assessment by means of confidence intervals |volume=30 |pages=S51–8 |journal=International Journal of Clinical Pharmacology, Therapy, and Toxicology|issue=Suppl 1 }}</ref><ref>{{cite journal |doi=10.1081/BIP-100101013 |title=Why Are Pharmacokinetic Data Summarized by Arithmetic Means? |year=2000 |last1=Julious |first1=Steven A. |last2=Debarnot |first2=Camille A. M. |journal=Journal of Biopharmaceutical Statistics |volume=10 |pages=55–71 |pmid=10709801 |issue=1|s2cid=2805094 }}</ref> is defined as:
:<math>\widehat{cv}_{\rm raw} = \sqrt{\mathrm{e}^{s_{\ln}^2}-1}</math>
where <math>{s_{\ln}} \,</math> is the sample standard deviation of the data after a [[natural log]] transformation. (If measurements are recorded using any other logarithmic base <math>b</math>, their standard deviation <math>s_b \,</math> is converted to base e using <math>s_{\ln} = s_b \ln(b) \,</math>, and the formula for <math>\widehat{cv}_{\rm raw} \,</math> remains the same.<ref>{{cite journal | last1 = Reed | first1 = JF | last2 = Lynn | first2 = F | last3 = Meade | first3 = BD | year = 2002 | title = Use of Coefficient of Variation in Assessing Variability of Quantitative Assays | journal = Clin Diagn Lab Immunol | volume = 9 | issue = 6| pages = 1235–1239 | doi = 10.1128/CDLI.9.6.1235-1239.2002 | pmid = 12414755 | pmc = 130103 }}</ref>)

This estimate is sometimes referred to as the "geometric CV" (GCV)<ref>Sawant, S.; Mohan, N. (2011) [http://pharmasug.org/proceedings/2011/PO/PharmaSUG-2011-PO08.pdf "FAQ: Issues with Efficacy Analysis of Clinical Trial Data Using SAS"] {{webarchive|url=https://web.archive.org/web/20110824094357/http://pharmasug.org/proceedings/2011/PO/PharmaSUG-2011-PO08.pdf |date=24 August 2011 }}, ''PharmaSUG2011'', Paper PO08</ref><ref>{{cite journal | last1 = Schiff | first1 = MH | display-authors = etal | year = 2014 | title = Head-to-head, randomised, crossover study of oral versus subcutaneous methotrexate in patients with rheumatoid arthritis: drug-exposure limitations of oral methotrexate at doses >=15 mg may be overcome with subcutaneous administration| journal = Ann Rheum Dis | volume = 73| issue = 8| pages = 1–3 | doi = 10.1136/annrheumdis-2014-205228 | pmid = 24728329 | pmc = 4112421}}</ref> in order to distinguish it from the simple estimate above. However, "geometric coefficient of variation" has also been defined by Kirkwood<ref>{{cite journal |last1=Kirkwood |first1=TBL |title=Geometric means and measures of dispersion |journal=Biometrics |year=1979 |volume=35 |issue=4 |pages=908–9 |jstor=2530139 }}</ref> as:
:<math>\mathrm{GCV_K} = {\mathrm{e}^{s_{\ln}}\!\!-1}</math>
This term was intended to be ''analogous'' to the coefficient of variation, for describing multiplicative variation in log-normal data, but this definition of GCV has no theoretical basis as an estimate of <math>c_{\rm v} \,</math> itself.

For many practical purposes (such as [[sample size determination]] and calculation of [[confidence intervals]]) it is <math>s_{\ln} \,</math> that is of most use in the context of log-normally distributed data. If necessary, this can be derived from an estimate of <math>c_{\rm v} \,</math> or GCV by inverting the corresponding formula.
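
As an illustrative sketch only (it is not taken from the cited sources), the estimators in this section and the inversions mentioned in the last paragraph could be computed in Python with the standard library. All function names below are hypothetical and chosen for this example; the data are assumed to be strictly positive wherever logarithms are taken.

```python
import math
import statistics


def cv_simple(data):
    """Simple estimate: sample standard deviation divided by sample mean."""
    return statistics.stdev(data) / statistics.mean(data)


def cv_small_sample(data):
    """Simple estimate multiplied by the (1 + 1/(4n)) correction factor."""
    n = len(data)
    return (1 + 1 / (4 * n)) * cv_simple(data)


def s_ln_from_data(data):
    """Sample standard deviation of the natural-log-transformed data."""
    return statistics.stdev([math.log(x) for x in data])


def s_ln_from_base(s_b, b):
    """Convert a standard deviation measured in log base b to base e."""
    return s_b * math.log(b)


def cv_lognormal(s_ln):
    """Log-normal estimate: sqrt(exp(s_ln^2) - 1)."""
    return math.sqrt(math.exp(s_ln ** 2) - 1)


def gcv_kirkwood(s_ln):
    """Kirkwood's geometric coefficient of variation: exp(s_ln) - 1."""
    return math.exp(s_ln) - 1


def s_ln_from_cv(cv):
    """Invert the log-normal estimate to recover s_ln."""
    return math.sqrt(math.log(cv ** 2 + 1))


def s_ln_from_gcv_kirkwood(gcv):
    """Invert Kirkwood's definition to recover s_ln."""
    return math.log(gcv + 1)
```

For example, <code>s_ln_from_cv(cv_lognormal(0.3))</code> returns approximately 0.3, which illustrates that either summary can be converted back to <math>s_{\ln} \,</math> when only a reported CV or GCV is available.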