Quantitative genetics
==Allele and genotype frequencies== To obtain means, variances and other statistics, both ''quantities'' and their ''occurrences'' are required. The gene effects (above) provide the framework for ''quantities'': and the ''frequencies'' of the contrasting alleles in the fertilization gamete-pool provide the information on ''occurrences''. [[File:Sexual-Repro-simpl.jpg|thumb|left | 400px | Analysis of sexual reproduction.]] Commonly, the frequency of the allele causing "more" in the phenotype (including dominance) is given the symbol '''''p''''', while the frequency of the contrasting allele is '''''q'''''. An initial assumption made when establishing the algebra was that the parental population was infinite and random mating, which was made simply to facilitate the derivation. The subsequent mathematical development also implied that the frequency distribution within the effective gamete-pool was uniform: there were no local perturbations where '''''p''''' and '''''q''''' varied. Looking at the diagrammatic analysis of sexual reproduction, this is the same as declaring that '''''p<sub>P</sub>''''' = '''''p<sub>g</sub>''''' = '''''p'''''; and similarly for '''''q'''''.<ref name="Falconer 1996"/> This mating system, dependent upon these assumptions, became known as "panmixia". Panmixia rarely actually occurs in nature,<ref name="Richards 1986">{{cite book|last1=Richards|first1=A. 
J.|title=Plant breeding systems.|date=1986|publisher=George Allen & Unwin|location=Boston|isbn=0-04-581020-6|url-access=registration|url=https://archive.org/details/plantbreedingsys0000rich}}</ref>{{rp|152–180}}<ref name="Chimpanzees 2014">{{cite web|last1=Jane Goodall Institute|title=Social structure of chimpanzees.|url=http://www.janegoodall.com/chimp_central/chimpanzees/behavior/social.asp|website=Chimp Central|access-date=20 August 2014|archive-url=https://web.archive.org/web/20080703193942/http://www.janegoodall.com/chimp_central/chimpanzees/behavior/social.asp|archive-date=3 July 2008|url-status=dead|df=dmy-all}}</ref> as gamete distribution may be limited, for example by dispersal restrictions or by behaviour, or by chance sampling (those local perturbations mentioned above). It is well known that there is a huge wastage of gametes in Nature, which is why the diagram depicts a ''potential'' gamete-pool separately to the ''actual'' gamete-pool. Only the latter sets the definitive frequencies for the zygotes: this is the true "gamodeme" ("gamo" refers to the gametes, and "deme" derives from Greek for "population"). But, under Fisher's assumptions, the ''gamodeme'' can be effectively extended back to the ''potential'' gamete-pool, and even back to the parental base-population (the "source" population). The random sampling arising when small "actual" gamete-pools are sampled from a large "potential" gamete-pool is known as ''[[genetic drift]]'', and is considered subsequently. While panmixia may not be widely extant, the ''potential'' for it does occur, although it may be only ephemeral because of those local perturbations. 
It has been shown, for example, that the F2 derived from ''random fertilization of F1 individuals'' (an ''allogamous'' F2), following hybridization, is an ''origin'' of a new ''potentially'' panmictic population.<ref name="Gordon 2000">{{cite journal|last1=Gordon|first1=Ian L.|title=Quantitative genetics of allogamous F2: an origin of randomly fertiliized populations.|journal=Heredity|date=2000|volume=85|pages=43–52|doi=10.1046/j.1365-2540.2000.00716.x|pmid=10971690|doi-access=free}}</ref><ref>An F2 derived by ''self fertilizing F1 individuals'' (an ''autogamous'' F2), however, is not an origin of a randomly fertilized population structure. See Gordon (2001).</ref> It has also been shown that if panmictic random fertilization occurred continually, it would maintain the same allele and genotype frequencies across each successive panmictic sexual generation—this being the ''Hardy Weinberg'' equilibrium.<ref name="Crow & Kimura"/>{{rp|34–39}}<ref name="Castle 1903">{{cite journal|last1=Castle|first1=W. E.|title=The law of heredity of Galton and Mendel and some laws governing race improvement by selection.|journal=Proceedings of the American Academy of Arts and Sciences|date=1903|volume=39|issue=8|pages=233–242|doi=10.2307/20021870|jstor=20021870|hdl=2027/hvd.32044106445109|hdl-access=free}}</ref><ref name="Hardy 1908">{{cite journal|last1=Hardy|first1=G. H.|title=Mendelian proportions in a mixed population.|journal=Science|date=1908|volume=28|pages=49–50|doi=10.1126/science.28.706.49|issue=706|pmid=17779291|pmc=2582692|bibcode=1908Sci....28...49H}}</ref><ref name="Weinberg 1908">{{cite journal|last1=Weinberg|first1=W.|title=Über den Nachweis der Verebung beim Menschen.|journal=Jahresh. Verein F. Vaterl. Naturk, Württem.|date=1908|volume=64|pages=368–382}}</ref><ref>Usually in science ethics, a discovery is named after the earliest person to propose it. 
Castle, however, seems to have been overlooked: and later when re-found, the title "Hardy Weinberg" was so ubiquitous it seemed too late to update it. Perhaps the "Castle Hardy Weinberg" equilibrium would be a good compromise?</ref> However, as soon as genetic drift was initiated by local random sampling of gametes, the equilibrium would cease. === Random fertilization=== Male and female gametes within the actual fertilizing pool are considered usually to have the same frequencies for their corresponding alleles. (Exceptions have been considered.) This means that when '''''p''''' male gametes carrying the '''''A''''' allele randomly fertilize '''''p''''' female gametes carrying that same allele, the resulting zygote has genotype '''''AA''''', and, under random fertilization, the combination occurs with a frequency of '''''p''''' x '''''p''''' (= '''''p<sup>2</sup>'''''). Similarly, the zygote '''''aa''''' occurs with a frequency of '''''q<sup>2</sup>'''''. Heterozygotes ('''''Aa''''') can arise in two ways: when '''''p''''' male ('''''A''''' allele) randomly fertilize '''''q''''' female ('''''a''''' allele) gametes, and ''vice versa''. The resulting frequency for the heterozygous zygotes is thus '''''2pq'''''.<ref name="Crow & Kimura"/>{{rp|32}} Notice that such a population is never more than half heterozygous, this maximum occurring when '''p'''='''q'''= 0.5. In summary then, under random fertilization, the zygote (genotype) frequencies are the quadratic expansion of the gametic (allelic) frequencies: <math display="inline"> (p+q)^2 = p^2 + 2pq + q^2 = 1 </math>. (The "=1" states that the frequencies are in fraction form, not percentages; and that there are no omissions within the framework proposed.) Notice that "random fertilization" and "panmixia" are ''not'' synonyms. 
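The quadratic expansion above can be checked numerically. The following is a minimal sketch, not part of the cited sources; the function name <code>random_fertilization_frequencies</code> is a hypothetical label chosen for illustration.

```python
def random_fertilization_frequencies(p):
    """Return the zygote frequencies (AA, Aa, aa) under random fertilization.

    p is the frequency of allele A in the fertilizing gamete-pool;
    q = 1 - p is the frequency of the contrasting allele a.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


# Tabulate the three genotype frequencies for a few allele frequencies.
for p in (0.1, 0.5, 0.75):
    freq_AA, freq_Aa, freq_aa = random_fertilization_frequencies(p)
    print(p, round(freq_AA, 4), round(freq_Aa, 4), round(freq_aa, 4))
```

Scanning p from 0 to 1 with this function shows the heterozygote frequency peaking at one half when p = q = 0.5, in line with the remark above that such a population is never more than half heterozygous.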
=== Mendel's research cross – a contrast=== Mendel's pea experiments were constructed by establishing true-breeding parents with "opposite" phenotypes for each attribute.<ref name="Mendel Bateson"/> This meant that each opposite parent was homozygous for its respective allele only. In our example, "tall ''vs'' dwarf", the tall parent would be genotype '''''TT''''' with '''''p''''' = '''1''' (and '''''q''''' = '''0'''); while the dwarf parent would be genotype '''''tt''''' with '''''q''''' = '''1''' (and '''''p''''' = '''0'''). After controlled crossing, their hybrid is '''''Tt''''', with '''''p''''' = '''''q''''' = '''{{sfrac|1|2}}'''. However, the frequency of this heterozygote = '''1''', because this is the F1 of an artificial cross: it has not arisen through random fertilization.<ref name="Gordon 1999">{{cite journal|last1=Gordon|first1=Ian L.|title=Quantitative genetics of intraspecies hybrids.|journal=Heredity|date=1999|volume=83|issue=6|pages=757–764|doi=10.1046/j.1365-2540.1999.00634.x|pmid=10651921|doi-access=free}}</ref> The F2 generation was produced by natural self-pollination of the F1 (with monitoring against insect contamination), resulting in '''''p''''' = '''''q''''' = '''{{sfrac|1|2}}''' being maintained. Such an F2 is said to be "autogamous". However, the genotype frequencies (0.25 '''''TT''''', 0.5 '''''Tt''''', 0.25 '''''tt''''') have arisen through a mating system very different from random fertilization, and therefore the use of the quadratic expansion has been avoided. 
The numerical values obtained were the same as those for random fertilization only because this is the special case of having originally crossed homozygous opposite parents.<ref name="Gordon 2001">{{cite journal|last1=Gordon|first1=Ian L.|title=Quantitative genetics of autogamous F2.|journal=Hereditas|date=2001|volume=134|pages=255–262|doi=10.1111/j.1601-5223.2001.00255.x|pmid=11833289|issue=3|doi-access=free}}</ref> We can notice that, because of the dominance of '''''T-''''' [frequency (0.25 + 0.5)] over '''''tt''''' [frequency 0.25], the 3:1 ratio is still obtained. A cross such as Mendel's, where true-breeding (largely homozygous) opposite parents are crossed in a controlled way to produce an F1, is a special case of hybrid structure. The F1 is often regarded as "entirely heterozygous" for the gene under consideration. However, this is an over-simplification and does not apply generally—for example when individual parents are not homozygous, or when ''populations'' inter-hybridise to form ''hybrid swarms''.<ref name="Gordon 1999"/> The general properties of intra-species hybrids (F1) and F2 (both "autogamous" and "allogamous") are considered in a later section. === Self fertilization – an alternative=== Having noticed that the pea is naturally self-pollinated, we cannot continue to use it as an example for illustrating random fertilization properties. Self-fertilization ("selfing") is a major alternative to random fertilization, especially within Plants. Most of the Earth's cereals are naturally self-pollinated (rice, wheat, barley, for example), as well as the pulses. Considering the millions of individuals of each of these on Earth at any time, it is obvious that self-fertilization is at least as significant as random fertilization. Self-fertilization is the most intensive form of ''inbreeding'', which arises whenever there is restricted independence in the genetical origins of gametes. 
Such reduction in independence arises if parents are already related, and/or from genetic drift or other spatial restrictions on gamete dispersal. Path analysis demonstrates that these are tantamount to the same thing.<ref name="Wright 1917">{{cite journal|last1=Wright|first1=S.|title=The average correlation within subgroups of a population.|journal=J. Wash. Acad. Sci.|date=1917|volume=7|pages=532–535}}</ref><ref name="Wright 1921 a">{{cite journal|last1=Wright|first1=S.|title=Systems of mating. I. The biometric relations between parent and offspring.|journal=Genetics|date=1921|volume=6|issue=2|pages=111–123|doi=10.1093/genetics/6.2.111|pmc=1200501|pmid=17245958}}</ref> Arising from this background, the ''inbreeding coefficient'' (often symbolized as '''F''' or '''''f''''') quantifies the effect of inbreeding from whatever cause. There are several formal definitions of '''''f''''', and some of these are considered in later sections. For the present, note that for a long-term self-fertilized species '''''f''''' = '''1'''. Natural self-fertilized populations are not single "''pure lines''", however, but mixtures of such lines. This becomes particularly obvious when considering more than one gene at a time. Therefore, allele frequencies ('''''p''''' and '''''q''''') other than '''1''' or '''0''' are still relevant in these cases (refer back to the Mendel Cross section). The genotype frequencies take a different form, however. In general, the genotype frequencies become <math display="inline">[p^2(1-f)+pf]</math> for '''AA''', <math display="inline">2pq(1-f)</math> for '''Aa''', and <math display="inline">[q^2(1-f)+qf]</math> for '''aa'''.<ref name="Crow & Kimura" />{{rp|65}} Notice that the frequency of the heterozygote declines linearly as '''''f''''' increases. When '''''f'' = 1''', these three frequencies become respectively '''p''', '''0''' and '''q'''. Conversely, when '''''f'' = 0''', they reduce to the random-fertilization quadratic expansion shown previously.
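The two limiting cases just described can be verified with a short sketch. It is illustrative only; the function name <code>inbred_genotype_frequencies</code> is a hypothetical label, and the formulas are those quoted above.

```python
def inbred_genotype_frequencies(p, f):
    """Return (AA, Aa, aa) frequencies for allele frequency p and inbreeding coefficient f."""
    q = 1.0 - p
    freq_AA = p * p * (1.0 - f) + p * f
    freq_Aa = 2.0 * p * q * (1.0 - f)
    freq_aa = q * q * (1.0 - f) + q * f
    return freq_AA, freq_Aa, freq_aa


p = 0.6
# f = 0 gives the random-fertilization expansion (p^2, 2pq, q^2).
print(inbred_genotype_frequencies(p, 0.0))
# f = 1 gives (p, 0, q): no heterozygotes remain after long-term selfing.
print(inbred_genotype_frequencies(p, 1.0))
# An intermediate f lies between the two; the frequencies still sum to 1.
print(sum(inbred_genotype_frequencies(p, 0.25)))
```

The three frequencies sum to one for any '''''f''''', because the terms collect to <math display="inline">(1-f)(p+q)^2 + f(p+q)</math>.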
===Population mean=== The population mean shifts the central reference point from the homozygote midpoint ('''mp''') to the mean of a sexually reproduced population. This is important not only to relocate the focus into the natural world, but also to use a measure of ''central tendency'' used by Statistics/Biometrics. In particular, the square of this mean is the Correction Factor, which is used to obtain the genotypic variances later.<ref name="S & T"/> [[File:G mean.jpg|thumb|300px|right|Population mean across all values of p, for various d effects.]] For each genotype in turn, its allele effect is multiplied by its genotype frequency; and the products are accumulated across all genotypes in the model. Some algebraic simplification usually follows to reach a succinct result. ==== The mean after random fertilization==== The contribution of '''AA''' is <math display="inline">p^2 (+)a</math>, that of '''Aa''' is <math display="inline">2pq d</math>, and that of '''aa''' is <math display="inline">q^2 (-)a</math>. Gathering together the two '''a''' terms and accumulating over all, the result is: <math display="inline"> a(p^2-q^2) + 2pq d</math>. Simplification is achieved by noting that <math display="inline"> (p^2-q^2) = (p-q)(p+q)</math>, and by recalling that <math display="inline"> (p+q) = 1</math>, thereby reducing the right-hand term to <math display="inline">(p-q)</math>. The succinct result is therefore <math display="inline"> G = a(p-q) + 2pqd</math>.<ref name="Falconer 1996"/> {{rp|110}} This defines the population mean as an "offset" from the homozygote midpoint (recall '''a''' and '''d''' are defined as ''deviations'' from that midpoint). The Figure depicts '''G''' across all values of '''p''' for several values of '''d''', including one case of slight over-dominance. Notice that '''G''' is often negative, thereby emphasizing that it is itself a ''deviation'' (from '''mp'''). 
Finally, to obtain the ''actual'' Population Mean in "phenotypic space", the midpoint value is added to this offset: <math display="inline"> P = G + mp</math>. An example arises from data on ear length in maize.<ref name="Sinnott Dunn & Dobzhansky">{{cite book|last1=Sinnott|first1=Edmund W.|last2=Dunn|first2=L. C.|last3=Dobzhansky|first3=Theodosius|title=Principles of genetics.|url=https://archive.org/details/principlesofgene00sinn|url-access=registration|date=1958|publisher=McGraw-Hill|location=New York}}</ref>{{rp|103}} Assuming for now that one gene only is represented, '''a''' = 5.45 cm, '''d''' = 0.12 cm [virtually "0", really], '''mp''' = 12.05 cm. Further assuming that '''p''' = 0.6 and '''q''' = 0.4 in this example population, then: '''G''' = 5.45 (0.6 − 0.4) + (0.48)0.12 = '''1.15 cm''' (rounded); and '''P''' = 1.15 + 12.05 = '''13.20 cm''' (rounded).
==== The mean after long-term self-fertilization ====
The contribution of '''AA''' is <math display="inline"> p (+a)</math>, while that of '''aa''' is <math display="inline"> q (-a)</math>. [See above for the frequencies.] Gathering these two '''a''' terms together leads immediately to a very simple final result: <math display="inline"> G_{(f=1)} = a(p-q)</math>. As before, <math display="inline"> P = G + mp</math>. Often, "G<sub>(f=1)</sub>" is abbreviated to "G<sub>1</sub>". Mendel's peas can provide us with the allele effects and midpoint (see previously); and a mixed self-pollinated population with '''p''' = 0.6 and '''q''' = 0.4 provides example frequencies. Thus: '''G<sub>(f=1)</sub>''' = 82 (0.6 − 0.4) = 16.4 cm; and '''P<sub>(f=1)</sub>''' = 16.4 + 116 = 132.4 cm.
==== The mean – generalized fertilization ====
A general formula incorporates the inbreeding coefficient '''''f''''', and can then accommodate any situation. The procedure is exactly the same as before, using the weighted genotype frequencies given earlier.
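The procedure (multiply each genotype's effect by its frequency, then sum) can be sketched numerically before it is simplified algebraically. This is an illustrative sketch only: the function name <code>population_mean_offset</code> is hypothetical, and the inputs are the maize values quoted above.

```python
def population_mean_offset(p, a, d, f=0.0):
    """Return G, the population mean as a deviation from the homozygote midpoint.

    Genotype effects are +a (AA), d (Aa) and -a (aa); each is weighted by
    its genotype frequency for inbreeding coefficient f.
    """
    q = 1.0 - p
    freq_AA = p * p * (1.0 - f) + p * f
    freq_Aa = 2.0 * p * q * (1.0 - f)
    freq_aa = q * q * (1.0 - f) + q * f
    return freq_AA * a + freq_Aa * d + freq_aa * (-a)


# Maize ear-length values quoted in the text (cm).
a, d, mp = 5.45, 0.12, 12.05
p = 0.6

g_random = population_mean_offset(p, a, d, f=0.0)  # random fertilization
g_selfed = population_mean_offset(p, a, d, f=1.0)  # long-term selfing
print(round(g_random, 2), round(g_random + mp, 2))
print(round(g_selfed, 2), round(g_selfed + mp, 2))
```

With '''''f''''' = 0 the sum agrees with <math display="inline">a(p-q) + 2pqd</math>, and with '''''f''''' = 1 it agrees with <math display="inline">a(p-q)</math>, the two special cases derived above.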
After translation into our symbols, and further rearrangement:<ref name="Crow & Kimura"/>{{rp|77–78}} <math display="block"> \begin{align} G_{f} & = a (p-q) + [2pqd-f(2pqd)] \\ & = a(p-q) + (1-f) 2pqd \\ & = G_{0} - f\ 2pqd \end{align} </math> Here, '''G<sub>0</sub>''' is '''G''', which was given earlier. (Often, when dealing with inbreeding, "G<sub>0</sub>" is preferred to "G".) Supposing that the maize example [given earlier] had been constrained on a holme (a narrow riparian meadow), and had partial inbreeding to the extent of '''''f''''' = '''0.25''', then, using the third version (above) of '''G<sub>''f''</sub>''': '''G<sub>''0.25''</sub>''' = 1.15 − 0.25 (0.48) 0.12 = 1.136 cm (rounded), with '''P<sub>0.25</sub>''' = 1.136 + 12.05 = 13.186 cm (rounded). There is hardly any effect from inbreeding in this example, which arises because there was virtually no dominance in this attribute ('''d''' → 0). Examination of all three versions of '''G<sub>''f''</sub>''' reveals that this would lead to trivial change in the Population mean. Where dominance was notable, however, there would be considerable change.
===Genetic drift===
Genetic drift was introduced when discussing the likelihood of panmixia being widely extant as a natural fertilization pattern. [See section on Allele and Genotype frequencies.] Here the sampling of gametes from the ''potential'' gamodeme is discussed in more detail. The sampling involves random fertilization between pairs of random gametes, each of which may contain either an '''A''' or an '''a''' allele. The sampling is therefore binomial sampling.<ref name="Crow & Kimura"/>{{rp|382–395}}<ref name="Falconer 1996"/>{{rp|49–63}}<ref name="Fisher 1999"/>{{rp|35}}<ref name="Cochran 1977">{{cite book|last1=Cochran|first1=William G.|title=Sampling techniques.|date=1977|publisher=John Wiley & Sons|location=New York|edition=Third}}</ref>{{rp|55}} Each sampling "packet" involves '''2N''' alleles, and produces '''N''' zygotes (a "progeny" or a "line") as a result.
During the course of the reproductive period, this sampling is repeated over and over, so that the final result is a mixture of sample progenies. The result is ''dispersed random fertilization'' <math> \left( \bigodot \right) </math> These events, and the overall end-result, are examined here with an illustrative example. The "base" allele frequencies of the example are those of the ''potential gamodeme'': the frequency of '''A''' is '''p<sub>g</sub> = 0.75''', while the frequency of '''a''' is '''q<sub>g</sub> = 0.25'''. [''White label'' "'''1'''" in the diagram.] Five example actual gamodemes are binomially sampled out of this base ('''s''' = the number of samples = 5), and each sample is designated with an "index" '''k''': with '''k = 1 .... s''' sequentially. (These are the sampling "packets" referred to in the previous paragraph.) The number of gametes involved in fertilization varies from sample to sample, and is given as '''2N<sub>k</sub>''' [at ''white label'' "'''2'''" in the diagram]. The total (Σ) number of gametes sampled overall is 52 [''white label'' "'''3'''" in the diagram]. Because each sample has its own size, ''weights'' are needed to obtain averages (and other statistics) when obtaining the overall results. These are <math display="inline"> \omega_k = 2N_k / (\sum_{k}^s 2N_k) </math>, and are given at ''white label'' "'''4'''" in the diagram. [[File:Genetic Drift example B3.jpg|thumb|400px|right|Genetic drift example analysis.]] ==== The sample gamodemes – genetic drift==== Following completion of these five binomial sampling events, the resultant actual gamodemes each contained different allele frequencies—('''p<sub>k</sub>''' and '''q<sub>k</sub>'''). [These are given at ''white label'' "'''5'''" in the diagram.] This outcome is actually the genetic drift itself. Notice that two samples (k = 1 and 5) happen to have the same frequencies as the ''base'' (''potential'') gamodeme. Another (k = 3) happens to have the ''p'' and ''q'' "reversed". 
Sample (k = 2) happens to be an "extreme" case, with '''p<sub>k</sub> = 0.9''' and '''q<sub>k</sub> = 0.1'''; while the remaining sample (k = 4) is "middle of the range" in its allele frequencies. All of these results have arisen only by "chance", through binomial sampling. Having occurred, however, they set in place all the downstream properties of the progenies. Because sampling involves chance, the ''probabilities'' ( {{math|<var>∫</var>}}<sub>k</sub> ) of obtaining each of these samples become of interest. These binomial probabilities depend on the starting frequencies ('''p<sub>g</sub>''' and '''q<sub>g</sub>''') and the sample size ('''2N<sub>k</sub>'''). They are tedious to obtain,<ref name="Crow & Kimura"/>{{rp|382–395}}<ref name="Cochran 1977"/>{{rp|55}} but are of considerable interest. [See ''white label'' "'''6'''" in the diagram.] The two samples (k = 1, 5), with the allele frequencies the same as in the ''potential gamodeme'', had higher "chances" of occurring than the other samples. Their binomial probabilities did differ, however, because of their different sample sizes (2N<sub>k</sub>). The "reversal" sample (k = 3) had a very low Probability of occurring, confirming perhaps what might be expected. The "extreme" allele frequency gamodeme (k = 2) was not "rare", however; and the "middle of the range" sample (k=4) ''was'' rare. These same Probabilities apply also to the progeny of these fertilizations. Here, some ''summarizing'' can begin. The ''overall allele frequencies'' in the progenies bulk are supplied by weighted averages of the appropriate frequencies of the individual samples. That is: <math display="inline"> p_{\centerdot} = \sum_{k}^s \omega_{k} \ p_{k} </math> and <math display="inline"> q_{\centerdot} = \sum_{k}^s \omega_{k} \ q_{k} </math>. 
(Notice that '''k''' is replaced by '''•''' for the overall result—a common practice.)<ref name="S & T"/> The results for the example are '''p<sub>•</sub>''' = 0.631 and '''q<sub>•</sub>''' = 0.369 [''black label'' "'''5'''" in the diagram]. These values are quite different to the starting ones ('''p<sub>g</sub>''' and '''q<sub>g</sub>''') [''white label'' "'''1'''"]. The sample allele frequencies also have variance as well as an average. This has been obtained using the ''sum of squares (SS)'' method <ref>This is outlined subsequently in the genotypic variances section.</ref> [See to the right of ''black label'' "'''5'''" in the diagram]. [Further discussion on this variance occurs in the section below on Extensive genetic drift.] ==== The progeny lines – dispersion==== The ''genotype frequencies'' of the five sample progenies are obtained from the usual quadratic expansion of their respective allele frequencies (''random fertilization''). The results are given at the diagram's ''white label'' "'''7'''" for the homozygotes, and at ''white label'' "'''8'''" for the heterozygotes. Re-arrangement in this manner prepares the way for monitoring inbreeding levels. This can be done either by examining the level of ''total'' homozygosis [('''p<sup>2</sup><sub>k</sub> + q<sup>2</sup><sub>k</sub>''') = ('''1 − 2p<sub>k</sub>q<sub>k</sub>''')], or by examining the level of heterozygosis ('''2p<sub>k</sub>q<sub>k</sub>'''), as they are complementary.<ref>Both are used commonly.</ref> Notice that samples ''k= 1, 3, 5'' all had the same level of heterozygosis, despite one being the "mirror image" of the others with respect to allele frequencies. The "extreme" allele-frequency case (k= ''2'') had the most homozygosis (least heterozygosis) of any sample. The "middle of the range" case (k= ''4'') had the least homozygosity (most heterozygosity): they were each equal at 0.50, in fact. 
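The per-sample quantities described so far can be reproduced numerically. The sketch below is illustrative and rests on two stated assumptions, since the diagram is not reproduced here: the sample allele frequencies are taken from the verbal description (0.75 for k = 1 and 5, 0.9 for k = 2, the "reversed" 0.25 for k = 3, and 0.5 for k = 4), and the sample sizes 2N<sub>k</sub> = 10, 12, 10, 12, 8 are inferred from the reciprocals 1/(2N<sub>k</sub>) quoted in the inbreeding discussion (they total 52). All variable names are hypothetical.

```python
# Assumed example data, reconstructed from the text (see lead-in).
gametes_per_sample = [10, 12, 10, 12, 8]   # 2N_k for k = 1..5
p_samples = [0.75, 0.9, 0.25, 0.5, 0.75]   # p_k for k = 1..5

total_gametes = sum(gametes_per_sample)    # 52 in the example
weights = [n / total_gametes for n in gametes_per_sample]

# Weighted average allele frequencies in the progeny bulk.
p_bulk = sum(w * p for w, p in zip(weights, p_samples))
q_bulk = 1.0 - p_bulk

# Heterozygosity 2 p_k q_k and total homozygosis 1 - 2 p_k q_k per sample.
heterozygosity = [2.0 * p * (1.0 - p) for p in p_samples]
homozygosis = [1.0 - h for h in heterozygosity]

print(round(p_bulk, 3), round(q_bulk, 3))
print([round(h, 3) for h in heterozygosity])
```

Under these assumptions the weighted averages round to the 0.631 and 0.369 quoted above, samples k = 1, 3 and 5 share the same heterozygosis, and k = 4 sits at 0.50 for both measures.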
The ''overall summary'' can continue by obtaining the ''weighted average'' of the respective genotype frequencies for the progeny bulk. Thus, for '''AA''', it is <math display="inline"> p^2_\centerdot = \sum_k^s \omega_k \ p_k^2 </math>, for '''Aa''', it is <math display="inline"> 2p_\centerdot q_\centerdot = \sum_k^s \omega_k \ 2 p_k q_k </math> and for '''aa''', it is <math display="inline"> q_\centerdot^2 = \sum_k^s \omega_k \ q_k^2 </math>. The example results are given at ''black label'' "'''7'''" for the homozygotes, and at ''black label'' "'''8'''" for the heterozygote. Note that the heterozygosity mean is ''0.3588'', which the next section uses to examine inbreeding resulting from this genetic drift. The next focus of interest is the dispersion itself, which refers to the "spreading apart" of the progenies' ''population means''. These are obtained as <math display="inline"> G_k = a (p_k - q_k) + 2p_k q_k d </math> [see section on the Population mean], for each sample progeny in turn, using the example gene effects given at ''white label'' "'''9'''" in the diagram. Then, each <math display="inline"> P_k = G_k + mp </math> is obtained also [at ''white label'' "'''10'''" in the diagram]. Notice that the "best" line (k = 2) had the ''highest'' allele frequency for the "more" allele ('''A''') (it also had the highest level of homozygosity). The ''worst'' progeny (k = 3) had the highest frequency for the "less" allele ('''a'''), which accounted for its poor performance. This "poor" line was less homozygous than the "best" line; and it shared the same level of homozygosity, in fact, as the two ''second-best'' lines (k = 1, 5). The progeny line with both the "more" and the "less" alleles present in equal frequency (k = 4) had a mean below the ''overall average'' (see next paragraph), and had the lowest level of homozygosity. 
These results reveal the fact that the alleles most prevalent in the "gene-pool" (also called the "germplasm") determine performance, not the level of homozygosity per se. Binomial sampling alone effects this dispersion. The ''overall summary'' can now be concluded by obtaining <math display="inline"> G_{\centerdot} = \sum_k^s \omega_k \ G_k </math> and <math display="inline"> P_{\centerdot} = \sum_k^s \omega_k \ P_k </math>. The example result for '''P<sub>•</sub>''' is 36.94 (''black label'' "'''10'''" in the diagram). This is used later to quantify ''inbreeding depression'' overall, from the gamete sampling. [See the next section.] However, recall that some "non-depressed" progeny means have been identified already (k = 1, 2, 5). This is an enigma of inbreeding: while there may be "depression" overall, there are usually superior lines among the gamodeme samplings.
==== The equivalent post-dispersion panmictic – inbreeding ====
Included in the ''overall summary'' were the average allele frequencies in the mixture of progeny lines ('''p<sub>•</sub>''' and '''q<sub>•</sub>'''). These can now be used to construct a hypothetical panmictic equivalent.<ref name="Crow & Kimura"/>{{rp|382–395}}<ref name="Falconer 1996"/>{{rp|49–63}}<ref name="Fisher 1999"/>{{rp|35}} This can be regarded as a "reference" to assess the changes wrought by the gamete sampling. The example appends such a panmictic to the right of the Diagram. The frequency of '''AA''' is therefore '''(p<sub>•</sub>)<sup>2</sup>''' = 0.3979. This is less than that found in the dispersed bulk (0.4513 at ''black label'' "'''7'''"). Similarly, for '''aa''', '''(q<sub>•</sub>)<sup>2</sup>''' = 0.1363, again less than the equivalent in the progenies bulk (0.1898). Clearly, ''genetic drift'' has increased the overall level of homozygosis by the amount (0.6411 − 0.5342) = 0.1069. In a complementary approach, the heterozygosity could be used instead.
The panmictic equivalent for '''Aa''' is '''2 p<sub>•</sub> q<sub>•</sub>''' = 0.4658, which is ''higher'' than that in the sampled bulk (0.3588) [''black label'' "'''8'''"]. The sampling has caused the heterozygosity to decrease by 0.1070, which differs trivially from the earlier estimate because of rounding errors. The '''inbreeding coefficient''' ('''''f''''') was introduced in the early section on Self Fertilization. Here, a formal definition of it is considered: '''''f''''' is the probability that two "same" alleles (that is '''A''' and '''A''', or '''a''' and '''a'''), which fertilize together are of common ancestral origin—or (more formally) '''''f''''' is the probability that two homologous alleles are autozygous.<ref name="Falconer 1996"/><ref name="Wright 1921 a"/> Consider any random gamete in the ''potential'' gamodeme that has its syngamy partner restricted by binomial sampling. The probability that that second gamete is homologous autozygous to the first is '''1/(2N)''', the reciprocal of the gamodeme size. For the five example progenies, these quantities are 0.1, 0.0833, 0.1, 0.0833 and 0.125 respectively, and their weighted average is '''0.0961'''. This is the ''inbreeding coefficient'' of the example progenies bulk, provided it is ''unbiased'' with respect to the full binomial distribution. An example based upon ''s = 5'' is likely to be biased, however, when compared to an appropriate entire binomial distribution based upon the sample number (''s'') approaching infinity (''s → ∞''). Another derived definition of '''''f''''' for the full Distribution is that '''''f''''' also equals the rise in homozygosity, which equals the fall in heterozygosity.<ref>See the earlier citations.</ref> For the example, these frequency changes are ''0.1069'' and ''0.1070'', respectively. This result is different to the above, indicating that bias with respect to the full underlying distribution is present in the example. 
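The two estimates of '''''f''''' compared here can be recomputed side by side. The sketch is illustrative only and uses the same assumed reconstruction of the example as the text implies: sample sizes 2N<sub>k</sub> = 10, 12, 10, 12, 8 (the reciprocals quoted above) and sample frequencies p<sub>k</sub> = 0.75, 0.9, 0.25, 0.5, 0.75 (from the verbal description of the five samples). Variable names are hypothetical.

```python
# Assumed example data, reconstructed from the text (see lead-in).
gametes_per_sample = [10, 12, 10, 12, 8]   # 2N_k
p_samples = [0.75, 0.9, 0.25, 0.5, 0.75]   # p_k

total_gametes = sum(gametes_per_sample)
weights = [n / total_gametes for n in gametes_per_sample]

# Estimate 1: weighted average of 1/(2N_k), the chance that the syngamy
# partner is autozygous under binomial sampling.
f_from_sample_size = sum(w / n for w, n in zip(weights, gametes_per_sample))

# Estimate 2: fall in heterozygosity relative to the panmictic equivalent.
p_bulk = sum(w * p for w, p in zip(weights, p_samples))
het_bulk = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, p_samples))
het_panmictic = 2.0 * p_bulk * (1.0 - p_bulk)
f_from_heterozygosity = het_panmictic - het_bulk

print(round(f_from_sample_size, 4))
print(round(het_bulk, 4), round(het_panmictic, 4), round(f_from_heterozygosity, 4))
```

Computed without intermediate rounding, the rise in homozygosis and the fall in heterozygosis are the same quantity (the frequencies sum to one in both the bulk and the panmictic equivalent), which is why the 0.1069 and 0.1070 above differ only through rounding.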
For the example ''itself'', these latter values are the better ones to use, namely '''''f<sub>•</sub>''''' = '''0.10695'''. The ''population mean'' of the equivalent panmictic is found as ''[a (p<sub>•</sub>-q<sub>•</sub>) + 2 p<sub>•</sub>q<sub>•</sub> d] + mp''. Using the example ''gene effects'' (''white label'' "'''9'''" in the diagram), this mean is <math display="inline"> P_{\centerdot} = </math> 37.87. The equivalent mean in the dispersed bulk is 36.94 (''black label'' "'''10'''"), which is depressed by the amount ''0.93''. This is the ''inbreeding depression'' from this Genetic Drift. However, as noted previously, three progenies were ''not'' depressed (k = 1, 2, 5), and had means even greater than that of the panmictic equivalent. These are the lines a plant breeder looks for in a line selection programme.<ref name="Allard 1960">{{cite book|last1=Allard|first1=R. W.|title=Principles of plant breeding|url=https://archive.org/details/principlesofplan0000alla|url-access=registration|date=1960|publisher=John Wiley & Sons|location=New York}}</ref> ==== Extensive binomial sampling – is panmixia restored?==== If the number of binomial samples is large ('''s → ∞''' ), then '''p<sub>•</sub> → p<sub>g</sub>''' and '''q<sub>•</sub> → q<sub>g</sub>'''. It might be queried whether panmixia would effectively re-appear under these circumstances. However, the '''sampling of allele frequencies''' has ''still occurred'', with the result that '''σ<sup>2</sup><sub>p, q</sub>''' ≠ '''0'''.<ref name="varpq">This is read as "σ <sup>2</sup><sub>p</sub> and/or σ <sup>2</sup><sub>q</sub>". 
As ''p'' and ''q'' are complementary, σ <sup>2</sup><sub>p</sub> ≡ σ <sup>2</sup><sub>q</sub> and σ <sup>2</sup><sub>p</sub> = σ <sup>2</sup><sub>q</sub>.</ref> In fact, as '''s → ∞''', the <math display="inline"> \sigma^2_{p,\ q} \to \tfrac{p_g q_g} {2N} </math>, which is the ''variance'' of the ''whole binomial distribution''.<ref name="Crow & Kimura"/>{{rp|382–395}}<ref name="Falconer 1996"/>{{rp|49–63}} Furthermore, the "Wahlund equations" show that the progeny-bulk ''homozygote'' frequencies can be obtained as the sums of their respective average values ('''p<sup>2</sup><sub>•</sub>''' or '''q<sup>2</sup><sub>•</sub>''') ''plus'' '''σ<sup>2</sup><sub>p, q</sub>'''.<ref name="Crow & Kimura"/>{{rp|382–395}} Likewise, the bulk ''heterozygote'' frequency is '''(2 p<sub>•</sub> q<sub>•</sub>)''' ''minus'' '''twice''' the '''σ<sup>2</sup><sub>p, q</sub>'''. The variance arising from the binomial sampling is conspicuously present. Thus, even when '''s → ∞''', the progeny-bulk ''genotype'' frequencies still reveal ''increased homozygosis'', and ''decreased heterozygosis'', there is still ''dispersion of progeny means'', and still ''inbreeding'' and ''inbreeding depression''. That is, panmixia is ''not'' re-attained once lost because of genetic drift (binomial sampling). However, a new ''potential'' panmixia can be initiated via an allogamous F2 following hybridization.<ref name="Gordon 2003"/> ==== Continued genetic drift – increased dispersion and inbreeding==== Previous discussion on genetic drift examined just one cycle (generation) of the process. When the sampling continues over successive generations, conspicuous changes occur in '''σ<sup>2</sup>'''<sub>'''p''', '''q'''</sub> and '''''f'''''. Furthermore, another "index" is needed to keep track of "time": '''t''' = '''1 .... y''' where '''y''' = the number of "years" (generations) considered. 
The methodology is often to add the current binomial increment ('''Δ''' = "''de novo''") to what has occurred previously.<ref name="Crow & Kimura"/> The entire binomial distribution is examined here. [There is no further benefit to be had from an abbreviated example.]

===== Dispersion via σ<sup>2</sup><sub>p,q</sub>=====
Earlier this variance (σ <sup>2</sup><sub>p,q</sub><ref name="varpq"/>) was seen to be:
<math display="block"> \begin{align} \sigma ^2_{p,q} & = p_g q_g \ / \ 2N \\ & = p_g q_g \left( \frac{1}{2N} \right) \\ & = p_g q_g \ f \\ & = p_g q_g \ \Delta f \ \scriptstyle \text{when used in recursive equations} \end{align} </math>
With the extension over time, this is also the result of the ''first'' cycle, and so is <math display="inline"> \sigma^2_1 </math> (for brevity). At cycle 2, this variance is generated yet again, this time becoming the ''de novo'' variance (<math display="inline"> \Delta \sigma^2 </math>), and is added to what was present already, the "carry-over" variance. The ''second'' cycle variance ('''<math display="inline"> \sigma^2_2 </math>''') is the weighted sum of these two components, the weights being <math display="inline"> 1 </math> for the ''de novo'' and <math display="inline"> \left( 1 - \tfrac{1}{2N} \right) </math> = <math display="inline"> \left( 1 - \Delta f \right) </math> for the "carry-over". Thus,
{{NumBlk|:| <math> \sigma^2_2 = \left( 1 \right) \ \Delta \sigma^2 + \left( 1- \Delta f \right) \sigma^2_1 </math>| {{EquationRef|1}}}}
The extension to generalize to any time ''t'', after considerable simplification, becomes:<ref name="Crow & Kimura"/>{{rp|328}}
{{NumBlk|:| <math> \sigma^2_t = p_g q_g \left[ 1 - \left( 1 - \Delta f \right)^t \right] </math>| {{EquationRef|2}}}}
Because it was this variation in allele frequencies that caused the "spreading apart" of the progenies' means (''dispersion''), the change in '''σ<sup>2</sup><sub>t</sub>''' over the generations indicates the change in the level of the ''dispersion''.
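The accumulation described by Equation (1), repeated cycle after cycle, can be checked numerically against the closed form of Equation (2). The following is a minimal sketch only: the function names are hypothetical, and the values ''p<sub>g</sub>'' = 0.6 and ''N'' = 10 are illustrative assumptions that do not come from the worked example.

```python
def drift_variance_recursive(p_g, n_individuals, generations):
    """Accumulate the allele-frequency variance cycle by cycle.

    Each cycle adds the de novo variance (weight 1) to the carry-over
    variance (weight 1 - delta_f), following the pattern of Equation (1).
    """
    q_g = 1.0 - p_g
    delta_f = 1.0 / (2.0 * n_individuals)   # de novo inbreeding, 1/(2N)
    delta_var = p_g * q_g * delta_f         # de novo variance per cycle
    variance = 0.0                          # no dispersion before cycle 1
    series = []
    for _ in range(generations):
        variance = delta_var + (1.0 - delta_f) * variance
        series.append(variance)
    return series


def drift_variance_closed(p_g, n_individuals, t):
    """Closed form of Equation (2) for the variance after t cycles."""
    q_g = 1.0 - p_g
    delta_f = 1.0 / (2.0 * n_individuals)
    return p_g * q_g * (1.0 - (1.0 - delta_f) ** t)


# Illustrative (assumed) values: p_g = 0.6, N = 10, five generations.
for t, v in enumerate(drift_variance_recursive(0.6, 10, 5), start=1):
    print(t, round(v, 6), round(drift_variance_closed(0.6, 10, t), 6))
```

Under these assumptions the two columns printed for each generation should agree, and the variance should rise towards its ceiling of ''p<sub>g</sub> q<sub>g</sub>''.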
=====Dispersion via ''f''=====
The method for examining the inbreeding coefficient is similar to that used for ''σ <sup>2</sup><sub>p,q</sub>''. The same weights as before are used respectively for ''de novo f'' ( '''Δ f''' ) [recall this is '''1/(2N)''' ] and ''carry-over f''. Therefore, <math display="inline"> f_2 = \left( 1 \right) \Delta f + \left( 1 - \Delta f \right) f_1 </math>, which is similar to '''Equation (1)''' in the previous sub-section.
[[File:RF Inbreeding.jpg|thumb|300px|left|Inbreeding resulting from genetic drift in random fertilization.]]
In general, after rearrangement,<ref name="Crow & Kimura"/>
<math display="block"> \begin{align} f_t & = \Delta f + \left( 1 - \Delta f \right) f_{t-1} \\ & = \Delta f \left( 1 - f_{t-1} \right) + f_{t-1} \end{align} </math>
The graphs to the left show levels of inbreeding over twenty generations arising from genetic drift for various ''actual gamodeme'' sizes (2N).

Still further rearrangements of this general equation reveal some interesting relationships.

'''(A)''' After some simplification,<ref name="Crow & Kimura"/> <math display="inline"> \left( f_t - f_{t-1} \right) = \Delta f \left( 1 - f_{t-1} \right) = \delta f_t </math>. The left-hand side is the difference between the current and previous levels of inbreeding: the ''change in inbreeding'' ('''δf<sub>t</sub>'''). Notice that this ''change in inbreeding'' ('''δf<sub>t</sub>''') is equal to the ''de novo inbreeding'' ('''Δf''') only for the first cycle, when f<sub>t-1</sub> is ''zero''.

'''(B)''' An item of note is the '''(1-f<sub>t-1</sub>)''', which is an "index of ''non-inbreeding''". It is known as the ''panmictic index'':<ref name="Crow & Kimura"/><ref name="Falconer 1996"/> <math display="inline"> P_{t-1} = \left( 1 - f_{t-1} \right) </math>.
'''(C)''' Further useful relationships emerge involving the ''panmictic index''.<ref name="Crow & Kimura"/><ref name="Falconer 1996"/>
<math display="block"> \begin{align} \Delta f & = \frac {\delta f_t} {P_{t-1}} \\ & = 1 - \frac {P_t} {P_{t-1}} \end{align} </math>

'''(D)''' A key link emerges between ''σ <sup>2</sup><sub>p,q</sub>'' and ''f''. Firstly,<ref name="Crow & Kimura"/>
<math display="block"> \begin{align} f_t & = 1 - \left( 1 - \Delta f \right) ^t \left( 1 - f_0 \right) \end{align} </math>
Secondly, presuming that '''f<sub>0</sub>''' = '''0''', the right-hand side of this equation reduces to the section within the brackets of '''Equation (2)''' at the end of the last sub-section. That is, if initially there is no inbreeding, <math display="inline"> \sigma^2_t = p_g q_g f_t </math>. Furthermore, if this then is rearranged, <math display="inline"> f_t = \tfrac {\sigma^2_t} {p_g q_g} </math>. That is, when initial inbreeding is zero, the two principal viewpoints of ''binomial gamete sampling'' (genetic drift) are directly inter-convertible.

==== Selfing within random fertilization====
[[File:RF Inbreeding B.jpg|thumb|300px|right|Random fertilization compared to cross-fertilization]]
It is easy to overlook that ''random fertilization'' includes self-fertilization. Sewall Wright showed that a proportion '''1/N''' of ''random fertilizations'' is actually ''self fertilization'' <math> \left( \bigotimes \right) </math>, with the remainder '''(N-1)/N''' being ''cross fertilization'' <math> \left( \mathsf{X} \right) </math>.
Following path analysis and simplification, the new view of ''random fertilization inbreeding'' was found to be: <math display="inline"> f_t = \Delta f \left( 1 + f_{t-1} \right) + \tfrac {N-1}{N} f_{t-1} </math>.<ref name="Wright 1921 a"/><ref name="Wright 1951">{{cite journal|last1=Wright|first1=Sewall|title=The genetical structure of populations.|journal=Annals of Eugenics|date=1951|volume=15|issue=4|pages=323–354|doi=10.1111/j.1469-1809.1949.tb02451.x|pmid=24540312}}</ref> Upon further rearrangement, the earlier results from the binomial sampling were confirmed, along with some new arrangements. Two of these were potentially very useful, namely: '''(A)''' <math display="inline"> f_t = \Delta f \left[ 1 + f_{t-1} \left( 2N-1 \right) \right] </math>; and '''(B)''' <math display="inline"> f_t = \Delta f \left( 1 - f_{t-1} \right) + f_{t-1} </math>.

The recognition that selfing may ''intrinsically be a part of'' random fertilization raises some issues about the use of the previous ''random fertilization'' 'inbreeding coefficient'. Clearly, it is inappropriate for any species incapable of ''self fertilization'', which includes plants with self-incompatibility mechanisms, dioecious plants, and [[bisexual animals]]. The equation of Wright was modified later to provide a version of random fertilization that involved only ''cross fertilization'' with no ''self fertilization''. The proportion '''1/N''' formerly due to ''selfing'' now defined the ''carry-over'' gene-drift inbreeding arising from the previous cycle. The new version is:<ref name="Crow & Kimura"/>{{rp|166}}
<math display="block"> f_{\mathsf{X}_{t}} = f_{t-1} + \Delta f \left( 1 + f_{t-2} - 2 f_{t-1} \right) </math>
The graphs to the right depict the differences between standard ''random fertilization'' '''RF''', and random fertilization adjusted for "cross fertilization alone" '''CF'''. As can be seen, the issue is non-trivial for small gamodeme sample sizes.
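The two recursions can be compared numerically. The sketch below is illustrative only: the function names are hypothetical, ''N'' = 5 is an assumed value, and the starting conditions (zero inbreeding in the base generation and, for the cross-fertilization form, also in the generation before it) are assumptions rather than values taken from the sources or from the graphs.

```python
def rf_inbreeding(n_individuals, generations, f_0=0.0):
    """Random fertilization (selfing included): f_t = df*(1 - f_{t-1}) + f_{t-1}."""
    delta_f = 1.0 / (2.0 * n_individuals)
    f_prev = f_0
    series = []
    for _ in range(generations):
        f_prev = delta_f * (1.0 - f_prev) + f_prev
        series.append(f_prev)
    return series


def cf_inbreeding(n_individuals, generations, f_prev2=0.0, f_prev1=0.0):
    """Cross fertilization only: f_t = f_{t-1} + df*(1 + f_{t-2} - 2*f_{t-1}).

    f_prev2 and f_prev1 are the assumed inbreeding levels of the two
    generations preceding the first one computed.
    """
    delta_f = 1.0 / (2.0 * n_individuals)
    series = []
    for _ in range(generations):
        f_now = f_prev1 + delta_f * (1.0 + f_prev2 - 2.0 * f_prev1)
        series.append(f_now)
        f_prev2, f_prev1 = f_prev1, f_now
    return series


# Illustrative (assumed) gamodeme: N = 5, followed for ten generations.
for t, (rf, cf) in enumerate(zip(rf_inbreeding(5, 10), cf_inbreeding(5, 10)), start=1):
    print(t, round(rf, 4), round(cf, 4))
```

With these assumed starting conditions the cross-fertilization values never exceed the random-fertilization values, and the gap between them is largest when ''N'' is small.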
"Panmixia' is ''not'' synonymous with 'random fertilization,' nor is "random fertilization" synonymous with 'cross fertilization'.{{citation needed|date=January 2025}} ==== Homozygosity and heterozygosity==== In the sub-section on "The sample gamodemes – Genetic drift", a series of gamete samplings was followed, an outcome of which was an increase in homozygosity at the expense of heterozygosity. From this viewpoint, the rise in homozygosity was due to the gamete samplings. Levels of homozygosity can be viewed also according to whether homozygotes arose allozygously or autozygously. Recall that autozygous alleles have the same allelic origin, the likelihood (frequency) of which '''''is''''' the '''inbreeding coefficient (''f'')''' by definition. The proportion arising ''allozygously'' is therefore '''(1-f)'''. For the '''A'''-bearing gametes, which are present with a general frequency of '''p''', the overall frequency of those that are autozygous is therefore ('''f''' ''p''). Similarly, for '''a'''-bearing gametes, the autozygous frequency is ('''f''' ''q'').<ref>Remember that the issue of auto/allo -zygosity can arise only for ''homologous'' alleles (that is ''A'' and ''A'', or ''a'' and ''a''), and not for ''non-homologous'' alleles (''A'' and ''a''), which cannot possibly have the ''same allelic origin''.</ref> These two viewpoints regarding genotype frequencies must be connected to establish consistency. Following firstly the ''auto/allo'' viewpoint, consider the ''allozygous'' component. This occurs with the frequency of '''(1-f)''', and the alleles unite according to the ''random fertilization'' quadratic expansion. Thus: <math display="block"> \left( 1-f \right) \left[ p_0 + q_0 \right] ^2 = \left( 1-f \right) \left[ p_0^2 + q_0^2 \right] + \left( 1-f \right) \left[ 2 p_0 q_0 \right] </math> Consider next the ''autozygous'' component. 
As these alleles '''are''' ''autozygous'', they are effectively '''selfings''', and produce either '''AA''' or '''aa''' genotypes, but no heterozygotes. They therefore produce <math display="inline"> f p_0 </math> ''"AA"'' homozygotes plus <math display="inline"> f q_0 </math> ''"aa"'' homozygotes. Adding these two components together results in: <math display="inline"> \left[ \left( 1-f \right) p_0^2 + f p_0 \right] </math> for the '''AA''' homozygote; <math display="inline"> \left[ \left( 1-f \right) q_0^2 + f q_0 \right] </math> for the '''aa''' homozygote; and <math display="inline"> \left( 1-f \right) 2 p_0 q_0 </math> for the '''Aa''' heterozygote.<ref name="Crow & Kimura"/>{{rp|65}}<ref name="Falconer 1996"/> This is the same equation as that presented earlier in the section on "Self fertilization – an alternative". The reason for the decline in heterozygosity is made clear here. Heterozygotes can arise '''''only''''' from the allozygous component, and its frequency in the sample bulk is just '''(1-f)''': hence this must also be the factor controlling the frequency of the heterozygotes. Secondly, the ''sampling'' viewpoint is re-examined. Previously, it was noted that the decline in heterozygotes was <math display="inline"> f \left( 2 p_0 q_0 \right)</math>. This decline is distributed equally towards each homozygote; and is added to their basic ''random fertilization'' expectations. Therefore, the genotype frequencies are: <math display="inline"> \left( p_0^2 + f p_0 q_0 \right) </math> for the ''"AA"'' homozygote; <math display="inline"> \left( q_0^2 + f p_0 q_0 \right) </math> for the ''"aa"'' homozygote; and <math display="inline"> 2 p_0 q_0 - f \left( 2 p_0 q_0 \right) </math> for the heterozygote. Thirdly, the ''consistency'' between the two previous viewpoints needs establishing. It is apparent at once [from the corresponding equations above] that the heterozygote frequency is the same in both viewpoints. 
However, such a straightforward result is not immediately apparent for the homozygotes. Begin by considering the '''AA''' homozygote's final equation in the ''auto/allo'' paragraph above:- <math display="inline"> \left[ \left( 1-f \right) p_0^2 + f p_0 \right] </math>. Expand the brackets, and follow by re-gathering [within the resultant] the two new terms with the common-factor ''f'' in them. The result is: <math display="inline"> p_0^2 - f \left( p_0^2 - p_0 \right) </math>. Next, for the parenthesized " ''p<sup>2</sup><sub>0</sub>'' ", a ''(1-q)'' is substituted for a ''p'', the result becoming <math display="inline"> p_0^2 - f \left[ p_0 \left( 1-q_0 \right) - p_0 \right] </math>. Following that substitution, it is a straightforward matter of multiplying-out, simplifying and watching signs. The end result is <math display="inline"> p_0^2 + f p_0 q_0 </math>, which is exactly the result for '''AA''' in the ''sampling'' paragraph. The two viewpoints are therefore ''consistent'' for the '''AA''' homozygote. In a like manner, the consistency of the '''aa''' viewpoints can also be shown. The two viewpoints are consistent for all classes of genotypes.
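The consistency of the two viewpoints can also be confirmed numerically. The sketch below uses hypothetical function names; the value ''f'' = 0.10695 is borrowed from the earlier example, while ''p<sub>0</sub>'' = 0.6 is an arbitrary illustrative assumption.

```python
def genotype_freqs_auto_allo(p_0, f):
    """Genotype frequencies (AA, Aa, aa) from the auto/allo-zygous viewpoint."""
    q_0 = 1.0 - p_0
    freq_AA = (1.0 - f) * p_0 ** 2 + f * p_0   # allozygous AA + autozygous AA
    freq_Aa = (1.0 - f) * 2.0 * p_0 * q_0      # heterozygotes: allozygous only
    freq_aa = (1.0 - f) * q_0 ** 2 + f * q_0   # allozygous aa + autozygous aa
    return freq_AA, freq_Aa, freq_aa


def genotype_freqs_sampling(p_0, f):
    """Genotype frequencies (AA, Aa, aa) from the gamete-sampling viewpoint."""
    q_0 = 1.0 - p_0
    loss = f * 2.0 * p_0 * q_0                 # decline in heterozygotes
    freq_AA = p_0 ** 2 + loss / 2.0            # half of the decline goes to AA
    freq_Aa = 2.0 * p_0 * q_0 - loss
    freq_aa = q_0 ** 2 + loss / 2.0            # the other half goes to aa
    return freq_AA, freq_Aa, freq_aa


# Assumed p_0 = 0.6; f = 0.10695 is the example value quoted earlier.
print(genotype_freqs_auto_allo(0.6, 0.10695))
print(genotype_freqs_sampling(0.6, 0.10695))
```

Both functions should return the same three frequencies, which in turn sum to one.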