===Overdispersed Poisson===
The negative binomial distribution, especially in its alternative parameterization described above, can be used as an alternative to the Poisson distribution. It is especially useful for discrete data over an unbounded positive range whose sample [[variance]] exceeds the sample [[mean]]. In such cases, the observations are [[Overdispersion|overdispersed]] with respect to a Poisson distribution, for which the mean is equal to the variance. Hence a Poisson distribution is not an appropriate model. Since the negative binomial distribution has one more parameter than the Poisson, the second parameter can be used to adjust the variance independently of the mean. See [[Cumulant#Cumulants of some discrete probability distributions|Cumulants of some discrete probability distributions]].

An application of this is to annual counts of [[tropical cyclone]]s in the [[Atlantic Ocean|North Atlantic]] or to monthly to 6-monthly counts of wintertime [[extratropical cyclone]]s over Europe, for which the variance is greater than the mean.<ref>{{cite journal|last=Villarini |first=G. |author2=Vecchi, G.A. |author3=Smith, J.A.|year=2010 |title=Modeling of the dependence of tropical storm counts in the North Atlantic Basin on climate indices |journal=[[Monthly Weather Review]] |volume=138 |issue=7 |pages=2681–2705 |doi=10.1175/2010MWR3315.1 |bibcode=2010MWRv..138.2681V |doi-access=free }}</ref><ref>{{cite journal|last=Mailier |first=P.J. |author2=Stephenson, D.B. |author3=Ferro, C.A.T. |author4= Hodges, K.I. |year=2006 |title=Serial Clustering of Extratropical Cyclones |journal=[[Monthly Weather Review]] |volume=134 |issue=8 |pages=2224–2240 |doi=10.1175/MWR3160.1 |bibcode=2006MWRv..134.2224M |doi-access=free }}</ref><ref>{{cite journal|last=Vitolo |first=R. |author2=Stephenson, D.B. |author3=Cook, Ian M. |author4= Mitchell-Wallace, K. 
|year=2009 |title=Serial clustering of intense European storms |journal=[[Meteorologische Zeitschrift]] |volume=18 |issue=4 |pages=411–424 |doi=10.1127/0941-2948/2009/0393 |bibcode=2009MetZe..18..411V |s2cid=67845213 }}</ref> In the case of modest overdispersion, this may produce substantially similar results to an overdispersed Poisson distribution.<ref>{{cite book | last = McCullagh | first = Peter | author-link= Peter McCullagh |author2=Nelder, John |author-link2=John Nelder | title = Generalized Linear Models |edition=Second | publisher = Boca Raton: Chapman and Hall/CRC | year = 1989 | isbn = 978-0-412-31760-6 |ref=McCullagh1989}}</ref><ref>{{cite book | last = Cameron | first = Adrian C. | author2 = Trivedi, Pravin K. | title = Regression analysis of count data | publisher = Cambridge University Press | year = 1998 | isbn = 978-0-521-63567-7 | ref = Cameron1998 | url-access = registration | url = https://archive.org/details/regressionanalys00came }}</ref>

Negative binomial modeling is widely employed in ecology and biodiversity research for analyzing count data, where overdispersion is very common. This is because overdispersion is indicative of biological aggregation, such as species or communities forming clusters. Ignoring overdispersion can lead to significantly inflated model parameters, resulting in misleading statistical inferences. The negative binomial distribution addresses overdispersed counts by permitting the variance to vary quadratically with the mean. An additional dispersion parameter governs the coefficient of the quadratic term and thereby determines the severity of overdispersion. The model's quadratic mean-variance relationship is a realistic approach for handling overdispersion, as supported by empirical evidence from many studies. 
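As an illustration, assuming the parameterization by mean <math>m</math> and dispersion parameter <math>r</math> (the symbols are chosen here for illustration and are not taken from the passage above), the quadratic relationship can be written as
<math display="block">\operatorname{Var}(X) = m + \frac{m^2}{r},</math>
so that the Poisson case <math>\operatorname{Var}(X) = m</math> is recovered in the limit <math>r \to \infty</math>, while smaller values of <math>r</math> correspond to stronger overdispersion.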
Overall, the NB model offers two attractive features: (1) the convenient interpretation of the dispersion parameter as an index of clustering or aggregation, and (2) its tractable form, featuring a closed expression for the probability mass function.<ref>{{cite journal|last=Stoklosa |first=J. |author2=Blakey, R.V. |author3=Hui, F.K.C. |year=2022 |title=An Overview of Modern Applications of Negative Binomial Modelling in Ecology and Biodiversity |journal=[[Diversity (journal)|Diversity]] |volume=14 |issue=5 |pages=320 |doi=10.3390/d14050320 |doi-access=free |bibcode=2022Diver..14..320S }}</ref>

In genetics, the negative binomial distribution is commonly used to model data in the form of discrete sequence read counts from high-throughput RNA and DNA sequencing experiments.<ref>{{cite journal|last=Robinson |first=M.D. |author2=Smyth, G.K. |year=2007 |title=Moderated statistical tests for assessing differences in tag abundance. |journal=[[Bioinformatics]] |volume=23 |issue=21 |pages=2881–2887 |doi=10.1093/bioinformatics/btm453 |pmid=17881408|doi-access=free }}</ref><ref>{{cite web |url=http://www.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.pdf |title=Differential analysis of count data – the}}</ref><ref>{{cite conference|last=Airoldi |first=E. M. |author2=Cohen, W. W. |author3=Fienberg, S. E. |date=June 2005 |title=Bayesian Models for Frequent Terms in Text |book-title=Proceedings of the Classification Society of North America and INTERFACE Annual Meetings |volume=990 |pages=991 |location=St. 
Louis, MO, USA }}</ref><ref>{{cite web |url=http://www.bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf |title=edgeR: differential expression analysis of digital gene expression data |last1=Chen |first1=Yunshun |last2=McCarthy |first2=Davis |date=September 25, 2014 |access-date=October 14, 2014}}</ref>

In the epidemiology of infectious diseases, the negative binomial distribution has been used in preference to the Poisson distribution to model overdispersed counts of secondary infections from one infected case (super-spreading events).<ref>{{cite journal|last=Lloyd-Smith|first=J. O. |author2= Schreiber, S. J. |author3= Kopp, P. E. |author4= Getz, W. M. |year=2005 |title=Superspreading and the effect of individual variation on disease emergence |journal=[[Nature (journal)|Nature]] |volume=438 |issue=7066 |pages=355–359 |doi=10.1038/nature04153|pmid=16292310 |pmc=7094981 |bibcode=2005Natur.438..355L }}</ref>
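
The overdispersion check described in this section (sample variance exceeding sample mean) can be sketched in a few lines of Python. This is a minimal illustration rather than a recommended fitting procedure: it assumes the variance takes the form <code>m + m**2 / r</code>, uses a simple method-of-moments estimate instead of maximum likelihood, and the function name <code>nb_moment_fit</code> and the example counts are hypothetical.

```python
import math
import statistics


def nb_moment_fit(counts):
    """Method-of-moments estimate of (mean m, dispersion r).

    Assumes the negative binomial variance has the form m + m**2 / r
    (an assumption of this sketch). Returns r = inf when the sample
    shows no overdispersion, i.e. variance <= mean (Poisson-like data).
    """
    m = statistics.mean(counts)
    v = statistics.variance(counts)  # unbiased sample variance
    if v <= m:
        return m, math.inf
    # Solve v = m + m**2 / r for r.
    r = m ** 2 / (v - m)
    return m, r


# Hypothetical, made-up counts used only to exercise the function.
example_counts = [0, 1, 1, 2, 3, 5, 8, 13]
m_hat, r_hat = nb_moment_fit(example_counts)
print(f"mean = {m_hat:.3f}, dispersion r = {r_hat:.3f}")
```

A finite estimate of <code>r</code> signals that the sample variance exceeds the mean; the smaller the value, the stronger the overdispersion relative to a Poisson model.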