Gene duplication
==Identifying duplications in sequenced genomes==

===Criteria and single genome scans===
The two genes that exist after a gene duplication event are called [[Paralog#Orthology and paralogy|paralogs]] and usually code for [[protein]]s with a similar function and/or structure. By contrast, [[Paralog#Orthology and paralogy|orthologous]] genes are present in different species and are each derived from the same ancestral sequence (see [[Homology (biology)#Sequence homology|Homology of sequences in genetics]]).

It is important (but often difficult) to differentiate between paralogs and orthologs in biological research. Experiments on human gene function can often be carried out on other [[species]] if a homolog to a human gene can be found in the genome of that species, but only if the homolog is orthologous. If the genes are paralogs that resulted from a gene duplication event, their functions are likely to be too different. One or more copies of the duplicated genes that constitute a gene family may be affected by the insertion of [[transposable elements]], which causes significant variation between their sequences and may eventually contribute to [[divergent evolution]]. The resulting loss of sequence similarity may also reduce the chance and the rate of [[gene conversion]] between the duplicates.

Paralogs can be identified in single genomes through a sequence comparison of all annotated gene models to one another. Such a comparison can be performed on translated amino acid sequences (e.g. BLASTp, tBLASTx) to identify ancient duplications or on DNA nucleotide sequences (e.g. BLASTn, megablast) to identify more recent duplications.
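
As a rough illustration of such an all-against-all scan, the sketch below reduces a list of pairwise hits to gene pairs that are each other's best-scoring match. The tuple layout (query, subject, score), the gene names, and the function names are assumptions made for illustration; they do not correspond to the output format or API of any particular tool.

```python
def best_hits(hits):
    """Map each query gene to its best-scoring non-self subject.

    `hits` is an iterable of (query, subject, score) tuples, an assumed
    layout for illustration. Ties keep the first hit encountered.
    """
    best = {}
    for query, subject, score in hits:
        if query == subject:
            continue  # every gene matches itself; self-hits carry no signal
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(hits):
    """Return sorted gene pairs that are each other's single best hit."""
    best = best_hits(hits)
    pairs = set()
    for gene_a, gene_b in best.items():
        if best.get(gene_b) == gene_a:
            pairs.add(tuple(sorted((gene_a, gene_b))))
    return sorted(pairs)


# Hypothetical scores for three genes from one genome.
hits = [
    ("geneA", "geneA", 500.0),
    ("geneA", "geneB", 310.0),
    ("geneA", "geneC", 120.0),
    ("geneB", "geneA", 305.0),
    ("geneB", "geneC", 90.0),
    ("geneC", "geneA", 118.0),
    ("geneC", "geneB", 95.0),
]

print(reciprocal_best_hits(hits))  # [('geneA', 'geneB')]
```

Here geneC's best match is geneA, but geneA's best match is geneB, so only the geneA/geneB pair is kept as a candidate paralog pair.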
Most studies to identify gene duplications require reciprocal-best-hits or fuzzy reciprocal-best-hits, where each paralog must be the other's single best match in a sequence comparison.<ref name= Hahn>{{cite journal | vauthors = Hahn MW, Han MV, Han SG | title = Gene family evolution across 12 Drosophila genomes | journal = PLOS Genetics | volume = 3 | issue = 11 | pages = e197 | date = November 2007 | pmid = 17997610 | pmc = 2065885 | doi = 10.1371/journal.pgen.0030197 | doi-access = free }}</ref>

Most gene duplications exist as [[low copy repeats]] (LCRs), rather than as highly repetitive sequences like transposable elements. They are mostly found in [[Chromosome regions|pericentromeric]], [[subtelomeric]] and [[Chromosome regions|interstitial]] regions of a chromosome. Many LCRs, due to their size (>1 kb), similarity, and orientation, are highly susceptible to duplications and deletions.

===Genomic microarrays detect duplications===
Technologies such as genomic [[microarrays]], also called array comparative [[genomic]] hybridization (array CGH), are used to detect chromosomal abnormalities, such as microduplications, in a high throughput fashion from genomic DNA samples.
In particular, DNA [[microarray]] technology can simultaneously monitor the [[gene expression|expression]] levels of thousands of genes across many treatments or experimental conditions, greatly facilitating the evolutionary studies of [[gene regulation]] after gene duplication or [[speciation]].<ref>{{cite journal | vauthors = Mao R, Pevsner J | title = The use of genomic microarrays to study chromosomal abnormalities in mental retardation | journal = Mental Retardation and Developmental Disabilities Research Reviews | volume = 11 | issue = 4 | pages = 279–85 | year = 2005 | pmid = 16240409 | doi = 10.1002/mrdd.20082 }}</ref><ref>{{cite journal | vauthors = Gu X, Zhang Z, Huang W | title = Rapid evolution of expression and regulatory divergences after yeast gene duplication | journal = Proceedings of the National Academy of Sciences of the United States of America | volume = 102 | issue = 3 | pages = 707–12 | date = January 2005 | pmid = 15647348 | pmc = 545572 | doi = 10.1073/pnas.0409186102 | bibcode = 2005PNAS..102..707G | doi-access = free }}</ref>

===Next generation sequencing===
Gene duplications can also be identified through the use of next-generation sequencing platforms. The simplest means to identify duplications in genomic resequencing data is through the use of paired-end sequencing reads. Tandem duplications are indicated by sequencing read pairs which map in abnormal orientations. Through a combination of increased sequence coverage and abnormal mapping orientation, it is possible to identify duplications in genomic sequencing data.
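
The combination of orientation and coverage evidence can be sketched as follows. This is a simplified illustration, not a production variant caller. It assumes a library in which concordant pairs face inward (left read on the forward strand, right read on the reverse strand), so that pairs spanning a tandem duplication junction map facing outward. The `ReadPair` type, the function names, and the thresholds `min_pairs` and `min_ratio` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    left_pos: int      # reference coordinate of the leftmost read
    left_strand: str   # '+' or '-'
    right_pos: int     # reference coordinate of the rightmost read
    right_strand: str  # '+' or '-'


def is_everted(pair):
    """True if the two reads point away from each other (-/+).

    Assumes concordant pairs map inward (+/-), as stated in the lead-in.
    """
    return pair.left_strand == "-" and pair.right_strand == "+"


def candidate_duplication(pairs, region_depth, genome_depth,
                          min_pairs=2, min_ratio=1.3):
    """Return (start, end, depth_ratio) if both signals agree, else None.

    min_pairs and min_ratio are illustrative thresholds, not recommended
    values.
    """
    everted = [p for p in pairs if is_everted(p)]
    if len(everted) < min_pairs:
        return None  # too few abnormally oriented pairs
    ratio = region_depth / genome_depth
    if ratio < min_ratio:
        return None  # coverage is not elevated enough
    # Everted pairs roughly bracket the duplicated interval.
    start = min(p.left_pos for p in everted)
    end = max(p.right_pos for p in everted)
    return (start, end, ratio)


# Hypothetical data: two everted pairs and one concordant pair.
pairs = [
    ReadPair(1000, "-", 5000, "+"),
    ReadPair(1020, "-", 4980, "+"),
    ReadPair(2000, "+", 2300, "-"),
]

print(candidate_duplication(pairs, region_depth=45.0, genome_depth=30.0))
# (1000, 5000, 1.5)
```

Requiring both signals is a deliberate choice in this sketch: abnormal orientation alone can arise from mapping artefacts, and elevated coverage alone does not locate the duplication boundaries.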