Ewens's sampling formula
{{Short description|Sampling formula which describes the probabilities of alleles in a sample}}
{{inline|date=August 2011}}
In [[population genetics]], '''Ewens's sampling formula''' describes the [[probabilities]] associated with counts of how many different [[allele]]s are observed a given number of times in a [[sample (statistics)|sample]].

==Definition==
Ewens's sampling formula, introduced by [[Warren Ewens]], states that under certain conditions (specified below), if a random sample of ''n'' [[gamete]]s is taken from a population and classified according to the [[gene]] at a particular [[locus (genetics)|locus]], then the [[probability]] that there are ''a''<sub>1</sub> [[allele]]s represented once in the sample, ''a''<sub>2</sub> alleles represented twice, and so on, is

:<math>\operatorname{Pr}(a_1,\dots,a_n; \theta)={n! \over \theta(\theta+1)\cdots(\theta+n-1)}\prod_{j=1}^n{\theta^{a_j} \over j^{a_j} a_j!},</math>

for some positive number ''θ'' representing the [[population mutation rate]], whenever <math>a_1, \ldots, a_n</math> is a sequence of nonnegative integers such that

:<math>a_1+2a_2+3a_3+\cdots+na_n=\sum_{i=1}^{n} i a_i = n.\,</math>

The phrase "under certain conditions" used above is made precise by the following assumptions:
* The sample size ''n'' is small compared with the size of the whole population;
* The population is in statistical equilibrium under [[mutation]] and [[genetic drift]], and the role of selection at the locus in question is negligible; and
* Every mutant allele is novel.
{{See also|Infinite-alleles model}}

The formula defines a [[probability distribution]] on the set of all [[integer partition|partitions of the integer]] ''n''. Among probabilists and statisticians it is often called the '''multivariate Ewens distribution'''.

==Mathematical properties==
In the limit ''θ'' → 0, the probability that all ''n'' genes are the same approaches 1.
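
The formula in the Definition section can be checked numerically. The following is a minimal Python sketch, not taken from any of the cited sources; the function names <code>ewens_probability</code> and <code>allelic_partitions</code> are hypothetical and chosen only for illustration. It evaluates the formula with exact rational arithmetic, enumerates every admissible sequence ''a''<sub>1</sub>, ..., ''a''<sub>''n''</sub>, and confirms that the probabilities sum to 1.

```python
from fractions import Fraction
from math import factorial


def ewens_probability(a, theta):
    """Evaluate Ewens's sampling formula for counts a = (a_1, ..., a_n).

    a[j-1] is the number of alleles seen exactly j times; theta > 0.
    The name and signature are illustrative assumptions.
    """
    n = len(a)
    if sum(j * a_j for j, a_j in enumerate(a, start=1)) != n:
        raise ValueError("counts must satisfy a_1 + 2*a_2 + ... + n*a_n = n")
    theta = Fraction(theta)
    # Rising factorial theta * (theta + 1) * ... * (theta + n - 1).
    rising = Fraction(1)
    for k in range(n):
        rising *= theta + k
    prob = Fraction(factorial(n)) / rising
    # Product over j of theta^{a_j} / (j^{a_j} * a_j!).
    for j, a_j in enumerate(a, start=1):
        prob *= theta ** a_j / (Fraction(j) ** a_j * factorial(a_j))
    return prob


def allelic_partitions(n):
    """Yield every tuple (a_1, ..., a_n) with a_1 + 2*a_2 + ... + n*a_n = n."""
    def parts(remaining, largest):
        # Enumerate integer partitions as non-increasing lists of parts.
        if remaining == 0:
            yield []
            return
        for part in range(min(remaining, largest), 0, -1):
            for rest in parts(remaining - part, part):
                yield [part] + rest

    for partition in parts(n, n):
        a = [0] * n
        for part in partition:
            a[part - 1] += 1
        yield tuple(a)


# The probabilities over all partitions of n = 5 sum to exactly 1.
total = sum(ewens_probability(a, Fraction(3, 2)) for a in allelic_partitions(5))
print(total)

# For very small theta, almost all mass sits on "all five genes identical".
print(float(ewens_probability((0, 0, 0, 0, 1), Fraction(1, 1000))))
```

Exact fractions are used here so that the sum-to-one check is not blurred by floating-point rounding; for large ''n'' a floating-point or log-space evaluation would be more practical.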
When ''θ'' = 1, the distribution is precisely that of the integer partition given by the cycle lengths of a uniformly distributed [[random permutation]]. As ''θ'' → ∞, the probability that no two of the ''n'' genes are the same approaches 1.

This family of probability distributions is consistent under subsampling: if, after the sample of ''n'' is taken, ''m'' of the ''n'' gametes are chosen without replacement, then the resulting probability distribution on the set of all partitions of the smaller integer ''m'' is exactly what the formula above gives with ''m'' in place of ''n''.

The Ewens distribution arises naturally from the [[Chinese restaurant process]].

==See also==
* [[Chinese restaurant table distribution]]
* [[Coalescent theory]]
* [[Unified neutral theory of biodiversity]]
* [[Biomathematics]]

==Notes==
* W. J. Ewens (1972) "The sampling theory of selectively neutral alleles", ''Theoretical Population Biology'', volume 3, pages 87–112.
* H. Crane (2016) "[https://www.researchgate.net/publication/280311472_The_Ubiquitous_Ewens_Sampling_Formula The Ubiquitous Ewens Sampling Formula]", ''Statistical Science'', 31:1 (Feb 2016). This article introduces a series of seven articles about Ewens sampling in a special issue of the journal.
* J. F. C. Kingman (1978) "Random partitions in population genetics", ''Proceedings of the Royal Society of London, Series A, Mathematical and Physical Sciences'', volume 361, number 1704.
* S. Tavaré and W. J. Ewens (1997) "The Multivariate Ewens distribution", Chapter 41 of the reference below.
* N. L. Johnson, S. Kotz, and N. Balakrishnan (1997) ''Discrete Multivariate Distributions'', Wiley. {{ISBN|0-471-12844-9}}.

{{ProbDistributions|multivariate}}

{{DEFAULTSORT:Ewens's Sampling Formula}}
[[Category:Theory of probability distributions]]
[[Category:Population genetics]]
[[Category:Discrete distributions]]