
== History of theory ==
[[Ronald Fisher]] and [[Sewall Wright]] originally defined effective population size as "the number of breeding individuals in an [[idealised population]] that would show the same amount of dispersion of [[allele frequency|allele frequencies]] under random [[genetic drift]] or the same amount of [[inbreeding]] as the population under consideration". This implied two potentially different effective population sizes, based either on the one-generation increase in variance across replicate populations '''(variance effective population size)''', or on the one-generation change in the inbreeding coefficient '''(inbreeding effective population size)'''. These two are closely linked, and derived from [[F-statistics]], but they are not identical.<ref>{{cite journal |title=Wright and Fisher on Inbreeding and Random Drift |journal=Genetics |author=James F. Crow |author-link=James F. Crow |year=2010 |volume=184 |issue=3 |pages=609–611 |doi=10.1534/genetics.109.110023 |pmc=2845331 |pmid=20332416}}</ref>

Today, the effective population size is usually estimated empirically with respect to the amount of within-species [[nucleotide diversity|genetic diversity]] divided by the [[mutation rate]], yielding a '''coalescent effective population size''' that reflects the cumulative effects of genetic drift, background selection, and genetic hitchhiking over longer time periods.<ref name="Lynch 2003">{{cite journal |author=Lynch, M. |author2=Conery, J.S. |title=The origins of genome complexity |journal=Science|year=2003|volume=302|issue=5649 |pages=1401–1404 |doi=10.1126/science.1089370 |pmid=14631042|bibcode=2003Sci...302.1401L |citeseerx=10.1.1.135.974 |s2cid=11246091 }}</ref> Another important effective population size is the '''selection effective population size''' 1/s<sub>critical</sub>, where s<sub>critical</sub> is the critical value of the [[selection coefficient]] at which selection becomes more important than [[genetic drift]].<ref name="Neher 2011">{{Cite journal| volume = 188| pages = 975–996|author1=R.A. Neher |author2=B.I. Shraiman | title = Genetic Draft and Quasi-Neutrality in Large Facultatively Sexual Populations| journal = Genetics| year = 2011| doi = 10.1534/genetics.111.128876| issue = 4| pmid = 21625002| pmc = 3176096| arxiv = 1108.1635}}</ref>

=== Variance effective size ===
In the [[Idealized population|Wright-Fisher idealized population model]], the [[conditional variance]] of the allele frequency <math>p'</math>, given the [[allele frequency]] <math>p</math> in the previous generation, is

:<math>\operatorname{var}(p' \mid p)= {p(1-p) \over 2N}.</math>

Let <math>\widehat{\operatorname{var}}(p'\mid p)</math> denote the corresponding, typically larger, variance in the actual population under consideration. The variance effective population size <math>N_e^{(v)}</math> is defined as the size of an idealized population with the same variance. It is found by substituting <math>\widehat{\operatorname{var}}(p'\mid p)</math> for <math>\operatorname{var}(p'\mid p)</math> and solving for <math>N</math>, which gives

:<math>N_e^{(v)} = {p(1-p) \over 2 \widehat{\operatorname{var}}(p' \mid p)}.</math>
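The definition above can be illustrated with a minimal Python sketch. The function names, the seed, and the simulation settings (100 diploid individuals, 2000 replicate populations, starting frequency 0.3) are assumptions chosen for illustration and do not come from the cited sources. The sketch draws one generation of Wright-Fisher sampling in each replicate, estimates the variance of the new allele frequency around ''p'', and converts it to a variance effective size.

```python
import random


def variance_effective_size(p, observed_var):
    """Return N_e^(v) = p(1 - p) / (2 * observed_var)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if observed_var <= 0.0:
        raise ValueError("observed_var must be positive")
    return p * (1.0 - p) / (2.0 * observed_var)


def simulate_next_frequencies(p, n_individuals, replicates, rng):
    """Sample 2N gene copies per replicate and return the new frequencies."""
    n_copies = 2 * n_individuals
    return [
        sum(rng.random() < p for _ in range(n_copies)) / n_copies
        for _ in range(replicates)
    ]


# Illustrative settings (assumptions, not taken from the article).
rng = random.Random(1)
p = 0.3
next_p = simulate_next_frequencies(p, n_individuals=100, replicates=2000, rng=rng)

# Mean squared deviation from the known starting frequency p.
observed_var = sum((x - p) ** 2 for x in next_p) / len(next_p)
print(variance_effective_size(p, observed_var))
```

Because the simulated population is itself idealized, the printed estimate should fall near the census size of 100, up to sampling error.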

In the following examples, one or more of the assumptions of a strictly idealised population are relaxed, while other assumptions are retained. The variance effective population size of the more relaxed population model is then calculated with respect to the strict model.

==== Variations in population size ====

Population size varies over time. If there are ''t'' non-overlapping [[generation]]s with population sizes ''N''<sub>''i''</sub>, the effective population size is given by the [[harmonic mean]] of the population sizes:<ref>{{Cite journal|last=Karlin|first=Samuel|date=1968-09-01|title=Rates of Approach to Homozygosity for Finite Stochastic Models with Variable Population Size|journal=The American Naturalist|volume=102|issue=927|pages=443–455|doi=10.1086/282557|bibcode=1968ANat..102..443K |s2cid=83824294|issn=0003-0147}}</ref>

:<math>{1 \over N_e} = {1 \over t} \sum_{i=1}^t {1 \over N_i}</math>

For example, say the population size was ''N'' = 10, 100, 50, 80, 20, 500 for six generations (''t'' = 6).  Then the effective population size is the [[harmonic mean]] of these, giving:

:{|
|-
|<math>{1 \over N_e}</math>
|<math>= {\begin{matrix} \frac{1}{10} \end{matrix} + \begin{matrix} \frac{1}{100} \end{matrix} + \begin{matrix} \frac{1}{50} \end{matrix} + \begin{matrix} \frac{1}{80} \end{matrix} + \begin{matrix} \frac{1}{20} \end{matrix} + \begin{matrix} \frac{1}{500} \end{matrix} \over 6} </math>
|-
|
|<math>= {0.1945 \over 6}</math>
|-
|
|<math>= 0.032416667</math>
|-
|<math>N_e</math>
|<math>= 30.8</math>
|}

Note this is less than the [[arithmetic mean]] of the population size, which in this example is 126.7. The harmonic mean tends to be dominated by the smallest [[population bottleneck|bottleneck]] that the population goes through.
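The worked example can be reproduced with a short Python sketch using the standard library's <code>statistics</code> module. The variable names are illustrative.

```python
from statistics import harmonic_mean, mean

# Population sizes over six non-overlapping generations (from the example).
sizes = [10, 100, 50, 80, 20, 500]

n_e = harmonic_mean(sizes)        # effective population size
arithmetic = mean(sizes)          # arithmetic mean, for comparison

print(round(n_e, 1))              # 30.8
print(round(arithmetic, 1))       # 126.7
```

Replacing the smallest value (10) with a larger one raises the harmonic mean far more than changing the largest value does, which is the sense in which bottlenecks dominate.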

==== Dioeciousness ====

If a population is [[dioecious]], i.e. there is no [[self-fertilisation]], then

:<math>N_e = N + \begin{matrix} \frac{1}{2} \end{matrix}</math>

or more generally,

:<math>N_e = N + \begin{matrix} \frac{D}{2} \end{matrix}</math>

where ''D'' represents dioeciousness and takes the value 0 for a population that is not dioecious or 1 for a dioecious one.

When ''N'' is large, ''N''<sub>''e''</sub> approximately equals ''N'', so this correction is usually negligible and often ignored:

:<math>N_e = N + \begin{matrix} \frac{1}{2} \end{matrix} \approx N</math>

==== Variance in reproductive success ====

If population size is to remain constant, each individual must contribute on average two [[gamete]]s to the next generation. An idealized population assumes that this follows a [[Poisson distribution]], so that the [[variance]] of the number of gametes contributed, ''k'', is equal to the [[mean]] number contributed, i.e. 2:

:<math>\operatorname{var}(k) = \bar{k} = 2.</math>

However, in natural populations the variance is often larger than this. The vast majority of individuals may have no offspring, and the next generation stems only from a small number of individuals, so

:<math>\operatorname{var}(k) > 2.</math>

The effective population size is then smaller, and given by:

:<math>N_e^{(v)} = {4 N - 2D \over 2 + \operatorname{var}(k)}</math>

Note that if the variance of ''k'' is less than 2, ''N''<sub>''e''</sub> is greater than ''N''. In the extreme case of no variation in family size, for example in a laboratory population in which the number of offspring is artificially controlled, <math>\operatorname{var}(k) = 0</math> and ''N''<sub>''e''</sub> = 2''N''&nbsp;&minus;&nbsp;''D'', i.e. approximately 2''N''.
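As a minimal sketch, the formula above can be evaluated in Python. The function and parameter names are illustrative assumptions; <code>dioecious</code> plays the role of ''D''.

```python
def ne_from_family_size_variance(n, var_k, dioecious=False):
    """Return (4N - 2D) / (2 + var(k)) for census size n."""
    if var_k < 0:
        raise ValueError("var_k must be non-negative")
    d = 1 if dioecious else 0
    return (4 * n - 2 * d) / (2 + var_k)


# Poisson family sizes (var(k) = 2) recover the census size.
print(ne_from_family_size_variance(100, 2))   # 100.0
# Equal family sizes (var(k) = 0) double it.
print(ne_from_family_size_variance(100, 0))   # 200.0
# Highly skewed reproductive success reduces it.
print(ne_from_family_size_variance(100, 6))   # 50.0
```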

==== Non-Fisherian sex-ratios ====

When the [[sex ratio]] of a population departs from the [[Ronald Fisher|Fisherian]] 1:1 ratio, the effective population size is given by:

:<math>N_e^{(v)} = N_e^{(F)} = {4 N_m N_f \over N_m + N_f}</math>

where ''N''<sub>''m''</sub> is the number of males and ''N''<sub>''f''</sub> the number of females. For example, with 80 males and 20 females (an absolute population size of 100):
:{|
|-
|<math>N_e</math>
|<math>= {4 \times 80 \times 20 \over 80 + 20}</math>
|-
|
|<math>={6400 \over 100}</math>
|-
|
|<math>= 64</math>
|}

Again, this results in ''N''<sub>''e''</sub> being less than ''N''.
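The sex-ratio formula can be checked with a few lines of Python. This is a sketch with an illustrative function name, not code from the cited sources.

```python
def ne_from_sex_ratio(n_males, n_females):
    """Return 4 * N_m * N_f / (N_m + N_f)."""
    if n_males <= 0 or n_females <= 0:
        raise ValueError("both sexes must be present")
    return 4 * n_males * n_females / (n_males + n_females)


print(ne_from_sex_ratio(80, 20))   # 64.0, as in the worked example
print(ne_from_sex_ratio(50, 50))   # 100.0, the 1:1 case recovers N
```

The rarer sex limits the result: with one sex fixed at a small number, ''N''<sub>''e''</sub> stays below four times that number however many individuals of the other sex there are.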

===Inbreeding effective size===

Alternatively, the effective population size may be defined by noting how the average [[inbreeding coefficient]] changes from one generation to the next, and then defining ''N''<sub>''e''</sub> as the size of the idealized population that has the same change in average inbreeding coefficient as the population under consideration.  The presentation follows Kempthorne (1957).<ref>{{cite book |author=Kempthorne O |year=1957 |title=An Introduction to Genetic Statistics |publisher=Iowa State University Press}}</ref>

For the idealized population, the inbreeding coefficients follow the recurrence equation

:<math>F_t = \frac{1}{N}\left(\frac{1+F_{t-2}}{2}\right)+\left(1-\frac{1}{N}\right)F_{t-1}.</math>

Using the panmictic index ''P''&nbsp;=&nbsp;1&nbsp;&minus;&nbsp;''F'' instead of the inbreeding coefficient, we get the approximate solution

:<math>1-F_t = P_t = P_0\left(1-\frac{1}{2N}\right)^t. </math>

The ratio between successive generations is

:<math>\frac{P_{t+1}}{P_t} = 1-\frac{1}{2N}. </math>

The inbreeding effective size can be found by solving

:<math>\frac{P_{t+1}}{P_t} = 1-\frac{1}{2N_e^{(F)}}. </math>

This is

:<math>N_e^{(F)} = \frac{1}{2\left(1-\frac{P_{t+1}}{P_t}\right)}. </math>
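The steps above can be sketched in Python by iterating the recurrence for ''F''<sub>''t''</sub> and then applying the final formula to two successive panmictic indices. The function names, the starting values ''F''<sub>0</sub>&nbsp;=&nbsp;''F''<sub>1</sub>&nbsp;=&nbsp;0, and the choice of ''N''&nbsp;=&nbsp;50 over 100 generations are assumptions made for illustration.

```python
def inbreeding_coefficients(n, generations, f0=0.0, f1=0.0):
    """Iterate F_t = (1/N)(1 + F_{t-2})/2 + (1 - 1/N) F_{t-1}.

    Returns the list [F_0, F_1, ..., F_generations].
    """
    f = [f0, f1]
    for _ in range(generations - 1):
        f.append((1 + f[-2]) / (2 * n) + (1 - 1 / n) * f[-1])
    return f


def inbreeding_effective_size(p_t, p_next):
    """Return 1 / (2 * (1 - P_{t+1} / P_t))."""
    return 1 / (2 * (1 - p_next / p_t))


f = inbreeding_coefficients(n=50, generations=100)
p = [1 - x for x in f]                      # panmictic index P_t = 1 - F_t
n_e = inbreeding_effective_size(p[-2], p[-1])
print(n_e)
```

Since the recurrence describes the idealized population itself, the recovered value should lie close to the ''N'' that was put in.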

==== Theory of overlapping generations and age-structured populations ====

When organisms live longer than one breeding season, effective population sizes have to take into account the [[life table]]s for the species.

===== Haploid =====
Assume a haploid population with discrete age structure.  An example might be an organism that can survive several discrete breeding seasons.  Further, define the following age structure characteristics:

: <math>v_i = </math> [[Fisher's reproductive value]] for age <math>i</math>,

: <math>\ell_i = </math> the probability that an individual survives to age <math>i</math>, and

: <math>N_0 = </math> the number of newborn individuals per breeding season.

The [[generation time]] is calculated as

: <math>T = \sum_{i=0}^\infty \ell_i v_i = </math> average age of a reproducing individual

Then, the inbreeding effective population size is<ref>{{cite journal |author=Felsenstein J |year=1971 |title=Inbreeding and variance effective numbers in populations with overlapping generations | journal= [[Genetics (journal)|Genetics]]|volume= 68|issue=4 |pages=581–597|doi=10.1093/genetics/68.4.581 |pmid=5166069 |pmc=1212678 }}</ref>

:<math>N_e^{(F)} = \frac{N_0T}{1 + \sum_i\ell_{i+1}^2v_{i+1}^2(\frac{1}{\ell_{i+1}}-\frac{1}{\ell_i})}.</math>
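A minimal Python sketch of this formula follows. It assumes that the life table is supplied as two equal-length lists indexed from age 0, that every listed survivorship is positive, and that the sums stop at the last listed age class. The function and parameter names are hypothetical.

```python
def haploid_inbreeding_ne(n0, survival, repro_value):
    """Inbreeding effective size for a haploid age-structured population.

    n0          -- newborns per breeding season (N_0)
    survival    -- [l_0, l_1, ...], probability of surviving to each age
    repro_value -- [v_0, v_1, ...], reproductive value at each age
    """
    if len(survival) != len(repro_value):
        raise ValueError("survival and repro_value must have equal length")
    if any(l <= 0 for l in survival):
        raise ValueError("listed survivorships must be positive")

    # Generation time T = sum_i l_i * v_i.
    t = sum(l * v for l, v in zip(survival, repro_value))

    # sum_i l_{i+1}^2 v_{i+1}^2 (1/l_{i+1} - 1/l_i)
    correction = sum(
        (survival[i + 1] * repro_value[i + 1]) ** 2
        * (1 / survival[i + 1] - 1 / survival[i])
        for i in range(len(survival) - 1)
    )
    return n0 * t / (1 + correction)


# A single age class reduces to discrete generations: N_e equals N_0.
print(haploid_inbreeding_ne(100, [1.0], [1.0]))   # 100.0
```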

===== Diploid =====
Similarly, the inbreeding effective number can be calculated for a diploid population with discrete age structure. This result was first given by Johnson,<ref>{{cite journal |author=Johnson DL |year=1977 |title=Inbreeding in populations with overlapping generations |journal=[[Genetics (journal)|Genetics]] |volume=87 |issue=3 |pages=581–591|doi=10.1093/genetics/87.3.581 |pmid=17248780 |pmc=1213763 }}</ref> but the notation used here more closely follows that of Emigh and Pollak.<ref>{{cite journal |doi=10.1016/0040-5809(79)90028-5 |vauthors=Emigh TH, Pollak E |year=1979 |title=Fixation probabilities and effective population numbers in diploid populations with overlapping generations |journal=Theoretical Population Biology |volume=15 |issue=1 |pages=86–107|bibcode=1979TPBio..15...86E }}</ref>

Assume the same basic parameters for the life table as given for the haploid case, but distinguishing between male and female, such as ''N''<sub>0</sub><sup>''ƒ''</sup> and ''N''<sub>0</sub><sup>''m''</sup> for the number of newborn females and males, respectively (notice lower case ''ƒ'' for females, compared to upper case ''F'' for inbreeding).

The inbreeding effective number is

:<math>
\begin{align}
\frac{1}{N_e^{(F)}} = \frac{1}{4T}\left\{\frac{1}{N_0^f}+\frac{1}{N_0^m} + \sum_i\left(\ell_{i+1}^f\right)^2\left(v_{i+1}^f\right)^2\left(\frac{1}{\ell_{i+1}^f}-\frac{1}{\ell_i^f}\right)\right. \,\,\,\,\,\,\,\, & \\
 \left. {} + \sum_i\left(\ell_{i+1}^m\right)^2\left(v_{i+1}^m\right)^2\left(\frac{1}{\ell_{i+1}^m}-\frac{1}{\ell_i^m}\right) \right\}. &
\end{align}
</math>
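The diploid formula can be sketched in Python in the same way. Because the text does not state how ''T'' is obtained for the two sexes, the sketch takes the generation time as an input. The list layout (indexed from age 0, positive survivorships) and all names are assumptions for illustration.

```python
def age_structure_term(survival, repro_value):
    """Return sum_i l_{i+1}^2 v_{i+1}^2 (1/l_{i+1} - 1/l_i) for one sex."""
    if len(survival) != len(repro_value):
        raise ValueError("survival and repro_value must have equal length")
    if any(l <= 0 for l in survival):
        raise ValueError("listed survivorships must be positive")
    return sum(
        (survival[i + 1] * repro_value[i + 1]) ** 2
        * (1 / survival[i + 1] - 1 / survival[i])
        for i in range(len(survival) - 1)
    )


def diploid_inbreeding_ne(n0_f, n0_m, t, surv_f, v_f, surv_m, v_m):
    """Inbreeding effective size for a diploid age-structured population.

    n0_f, n0_m -- newborn females and males per breeding season
    t          -- generation time T (supplied by the caller)
    """
    inverse_ne = (
        1 / n0_f
        + 1 / n0_m
        + age_structure_term(surv_f, v_f)
        + age_structure_term(surv_m, v_m)
    ) / (4 * t)
    return 1 / inverse_ne


# With a single age class and T = 1 the age terms vanish, leaving
# 1/N_e = (1/N_f + 1/N_m)/4, i.e. the sex-ratio formula.
print(diploid_inbreeding_ne(20, 80, 1.0, [1.0], [1.0], [1.0], [1.0]))
```

With 20 newborn females and 80 newborn males this single-age-class case gives the same value as the sex-ratio example for 80 males and 20 females.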