CpG site
== CpG islands ==
[[Image:Cytosine becomes thymine.png|thumbnail|350px|right|How methylation of CpG sites followed by spontaneous deamination leads to a lack of CpG sites in methylated DNA. As a result, residual CpG islands are left in areas where methylation is rare and CpG sites persist (or where C-to-T mutation is highly detrimental).]]
'''CpG islands''' (or CG islands) are regions with a high frequency of CpG sites. Though objective definitions for CpG islands are limited, the usual formal definition is a region of at least 200 [[base pair|bp]] with a GC percentage greater than 50% and an observed-to-expected CpG ratio greater than 60%. The observed-to-expected CpG ratio is derived by taking the observed value as <math>(\text{number of CpGs})</math> and the expected value as <math>(\text{number of C} \times \text{number of G}) / \text{length of sequence}</math><ref name="Gardiner-Garden1987">{{Cite journal| volume = 196| issue = 2| pages = 261–282| vauthors = Gardiner-Garden M, Frommer M| title = CpG islands in vertebrate genomes| journal = Journal of Molecular Biology| date = 1987| doi= 10.1016/0022-2836(87)90689-9| pmid=3656447}}</ref> or <math>((\text{number of C} + \text{number of G})/2)^2 / \text{length of sequence}</math>.<ref name="Saxonov2006"/>

Many genes in mammalian genomes have CpG islands associated with the start of the gene<ref>{{cite book |title=Genetics: Analysis of Genes and Genomes |vauthors=Hartl DL, Jones EW |page=[https://archive.org/details/genetics00dani/page/477 477] |edition=6th |year=2005 |publisher=Jones & Bartlett, Canada |location=Mississauga |isbn=978-0-7637-1511-3 |url-access=registration |url=https://archive.org/details/genetics00dani/page/477 }}</ref> ([[Promoter (genetics)|promoter regions]]). Because of this, the presence of a CpG island is used to help in the prediction and annotation of genes.
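
The observed-to-expected CpG ratio defined above can be computed directly from a sequence string. The following is a minimal sketch rather than a reference implementation: the function name <code>cpg_observed_expected</code> and the dictionary keys are hypothetical, and the sketch assumes a single-stranded sequence written with the letters A, C, G and T.

```python
def cpg_observed_expected(seq: str) -> dict:
    """Return the GC fraction and two observed-to-expected CpG ratios.

    Illustrative helper (name and return keys are assumptions, not a
    standard API). Expects a DNA string over the letters A, C, G, T.
    """
    s = seq.upper()
    n = len(s)
    if n == 0:
        raise ValueError("sequence must not be empty")

    c = s.count("C")
    g = s.count("G")
    # "CG" cannot overlap with itself, so str.count gives the exact
    # number of CpG dinucleotides.
    observed = s.count("CG")

    # Expected CpG count as in Gardiner-Garden & Frommer (1987):
    # (number of C * number of G) / length of sequence
    expected_ggf = c * g / n
    # Expected CpG count as in Saxonov et al. (2006):
    # ((number of C + number of G) / 2)^2 / length of sequence
    expected_sax = ((c + g) / 2) ** 2 / n

    return {
        "gc_fraction": (c + g) / n,
        # A ratio of 0.0 is reported when the expected count is zero
        # (no C or no G), which is a convention chosen for this sketch.
        "obs_exp_ggf": observed / expected_ggf if expected_ggf else 0.0,
        "obs_exp_saxonov": observed / expected_sax if expected_sax else 0.0,
    }
```

For the ten-base string <code>CGCGCGCGAT</code> (four C, four G, four CpG) both formulas give an expected count of 1.6 and therefore a ratio of 2.5. The two formulas agree whenever the numbers of C and G are equal and diverge as the C and G counts become unbalanced.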
In mammalian genomes, CpG islands are typically 300–3,000 base pairs in length, and have been found in or near approximately 40% of [[promoter (biology)|promoter]]s of mammalian genes.<ref name="Fatemi2005">{{cite journal |vauthors=Fatemi M, Pao MM, Jeong S, Gal-Yam EN, Egger G, Weisenberger DJ, Jones PA |display-authors=6|title=Footprinting of mammalian promoters: use of a CpG DNA methyltransferase revealing nucleosome positions at a single molecule level |journal=Nucleic Acids Res |volume=33 |issue=20 |pages=e176 |year=2005 |pmid=16314307 |pmc=1292996 |doi=10.1093/nar/gni180}}</ref> Over 60% of human genes and almost all [[Housekeeping gene|house-keeping genes]] have their promoters embedded in CpG islands.<ref>{{Cite book|last=Alberts |first=Bruce|author-link=Bruce Alberts|title=Molecular biology of the cell|date=18 November 2014|isbn=978-0-8153-4432-2|edition=Sixth|publisher=Garland Science |location=New York, NY|pages=406|oclc=887605755|url=https://archive.org/details/molecularbiology0006edalbe/page/406/mode/2up}}</ref> Given the frequency of GC two-nucleotide sequences, the number of CpG dinucleotides is much lower than would be expected.<ref name="Saxonov2006">{{cite journal |vauthors=Saxonov S, Berg P, Brutlag DL |title=A genome-wide analysis of CpG dinucleotides in the human genome distinguishes two distinct classes of promoters |journal=Proc Natl Acad Sci USA |volume=103 |issue=5 |pages=1412–1417 |year=2006 |pmid=16432200 |pmc=1345710 |doi=10.1073/pnas.0510310103|bibcode=2006PNAS..103.1412S |doi-access=free }}</ref>

A 2002 study revised the rules of CpG island prediction to exclude other GC-rich genomic sequences such as [[Alu sequence|Alu repeats]].
Based on an extensive search of the complete sequences of human chromosomes 21 and 22, DNA regions greater than 500 bp were found more likely to be the "true" CpG islands associated with the 5' regions of genes if they had a GC content greater than 55% and an observed-to-expected CpG ratio of at least 65%.<ref name="Takai2002">{{cite journal |vauthors=Takai D, Jones PA |title=Comprehensive analysis of CpG islands in human chromosomes 21 and 22. |journal=Proc Natl Acad Sci USA |volume=99 |issue=6 |pages=3740–5 |year=2002 |pmid=11891299 |doi=10.1073/pnas.052410099 |pmc=122594|bibcode=2002PNAS...99.3740T |doi-access=free }}</ref>

CpG islands are characterized by CpG dinucleotide content of at least 60% of that which would be statistically expected (~4–6%), whereas the rest of the genome has much lower CpG frequency (~1%), a phenomenon called [[CG suppression]]. Unlike CpG sites in the [[coding region]] of a gene, in most instances the CpG sites in the CpG islands of promoters are unmethylated if the genes are expressed. This observation led to the speculation that [[methylation]] of CpG sites in the promoter of a gene may inhibit gene expression.
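
The two sets of thresholds described above (the usual formal definition and the stricter 2002 revision) can be compared on the same sequence. The sketch below is illustrative only: the names <code>IslandCriteria</code>, <code>GARDINER_GARDEN_1987</code>, <code>TAKAI_JONES_2002</code> and <code>meets_criteria</code> are hypothetical, the expected CpG count uses the (number of C × number of G) / length formula, and treating every threshold as inclusive is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IslandCriteria:
    """Thresholds for calling a region a CpG island (illustrative)."""
    min_length: int      # minimum region length in bp
    min_gc: float        # minimum GC fraction (0..1)
    min_obs_exp: float   # minimum observed-to-expected CpG ratio


# Values taken from the text; boundary handling (>= everywhere) is an
# assumption made for this sketch.
GARDINER_GARDEN_1987 = IslandCriteria(min_length=200, min_gc=0.50, min_obs_exp=0.60)
TAKAI_JONES_2002 = IslandCriteria(min_length=500, min_gc=0.55, min_obs_exp=0.65)


def meets_criteria(seq: str, crit: IslandCriteria) -> bool:
    """Check one whole region against one set of thresholds."""
    s = seq.upper()
    n = len(s)
    if n < crit.min_length:
        return False
    c, g = s.count("C"), s.count("G")
    if c == 0 or g == 0:
        # Expected CpG count would be zero; such a region is not an island.
        return False
    gc_fraction = (c + g) / n
    # observed / expected, with expected = (c * g) / n
    obs_exp = s.count("CG") * n / (c * g)
    return gc_fraction >= crit.min_gc and obs_exp >= crit.min_obs_exp


if __name__ == "__main__":
    short_island = "CG" * 150          # 300 bp, CpG-dense
    gc_rich_cpg_poor = "GGGGCCCC" * 75  # 600 bp, GC-rich but few CpGs
    for name, region in [("short_island", short_island),
                         ("gc_rich_cpg_poor", gc_rich_cpg_poor)]:
        print(name,
              meets_criteria(region, GARDINER_GARDEN_1987),
              meets_criteria(region, TAKAI_JONES_2002))
```

In this toy input, the 300 bp CpG-dense region satisfies the 200 bp definition but is too short for the 2002 criteria, while the GC-rich but CpG-poor region fails both because its observed-to-expected ratio is below either threshold. Real island finders typically scan a genome with a sliding window rather than testing one fixed region.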
Methylation, along with [[histone]] modification, is central to [[Genomic imprinting|imprinting]].<ref name="Feil2007">{{cite journal |vauthors=Feil R, Berger F |title=Convergent evolution of genomic imprinting in plants and mammals |journal=Trends Genet |volume=23 |issue=4 |pages=192–199 |year=2007 |pmid=17316885 |doi=10.1016/j.tig.2007.02.004}}</ref> Most of the methylation differences between tissues, or between normal and cancer samples, occur a short distance from the CpG islands (at "CpG island shores") rather than in the islands themselves.<ref>{{cite journal | vauthors=[[Rafael Irizarry (scientist)|Irizarry RA]], Ladd-Acosta C, Wen B, Wu Z, Montano C, Onyango P, Cui H, Gabo K, Rongione M, Webster M, Ji H, Potash JB, Sabunciyan S, Feinberg AP |display-authors=6| title=The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores | journal=[[Nature Genetics]] | volume=41 | issue=2 | year=2009 | pages=178–186 | pmid=19151715 | pmc=2729128 | doi=10.1038/ng.298}}</ref>

CpG islands typically occur at or near the transcription start site of genes, particularly [[housekeeping gene]]s, in vertebrates.<ref name="Saxonov2006"/> A C (cytosine) base followed immediately by a G (guanine) base (a CpG) is rare in vertebrate DNA because the cytosines in such an arrangement tend to be methylated. This methylation helps distinguish the newly synthesized DNA strand from the parent strand, which aids in the final stages of DNA proofreading after duplication. However, over time methylated cytosines tend to turn into [[thymine]]s because of spontaneous [[deamination]]. A dedicated enzyme in humans ([[Thymine-DNA glycosylase]], or TDG) specifically removes thymines from T/G mismatches. Given how rare CpGs have become, however, TDG is theorised to be insufficiently effective at preventing a possibly rapid mutation of these dinucleotides.
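
The depletion mechanism described above (methylated CpG cytosines deaminating to thymine) can be illustrated with a toy model. This is a deliberately simplified sketch, not a model from the cited literature: the function name <code>deaminate_methylated_cpgs</code>, the per-round <code>rate</code> parameter and the explicit set of methylated positions are all hypothetical, only one strand is modelled, and repair by TDG is ignored.

```python
import random
from typing import AbstractSet


def deaminate_methylated_cpgs(seq: str,
                              methylated: AbstractSet[int],
                              rate: float,
                              rng: random.Random) -> str:
    """Apply one round of deamination to a single DNA strand.

    Every cytosine that (a) is immediately followed by G and (b) sits at
    an index listed in `methylated` is converted to T with probability
    `rate`. Unmethylated CpGs and all other bases are left unchanged.
    """
    bases = list(seq.upper())
    for i in range(len(bases) - 1):
        is_cpg = bases[i] == "C" and bases[i + 1] == "G"
        if is_cpg and i in methylated and rng.random() < rate:
            bases[i] = "T"  # 5-methylcytosine deaminates to thymine
    return "".join(bases)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    seq = "ACGTCGACG"
    # Index 1 and 7 are methylated; the CpG at index 4 is unmethylated,
    # standing in for a site inside a CpG island.
    after = deaminate_methylated_cpgs(seq, {1, 7}, rate=1.0, rng=rng)
    print(seq.count("CG"), after.count("CG"), after)
```

With <code>rate=1.0</code> every methylated CpG is lost in one round, while the unmethylated site survives, which mirrors the idea that CpG sites persist where methylation is rare.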
The existence of CpG islands is usually explained by the existence of selective forces for relatively high CpG content, or low levels of methylation in that genomic area, perhaps having to do with the regulation of gene expression. A 2011 study showed that most CpG islands are a result of non-selective forces.<ref name="Tanay2011">{{cite journal |vauthors= Cohen N, Kenigsberg E, Tanay A|title=Primate CpG Islands Are Maintained by Heterogeneous Evolutionary Regimes Involving Minimal Selection |journal=Cell |volume=145 |issue=5 |pages=773–786 |year=2011 |pmid=21620139 |doi=10.1016/j.cell.2011.04.024|s2cid=14856605 |doi-access=free }}</ref>