CpG site
{{distinguish|CpG oligodeoxynucleotide}}
{{short description|Region of often-methylated DNA with a cytosine followed by a guanine}}
[[File:CpG vs C-G bp.svg|300px|thumb|A CpG site, ''i.e.'', the "5'–C–phosphate–G–3'" sequence of nucleotides, is indicated on one DNA strand (in yellow). On the reverse DNA strand (in blue), the complementary 5'–CpG–3' site is shown. A C-G base-pairing between the two DNA strands is also indicated (right).]]
The '''CpG sites''' or '''CG sites''' are regions of [[DNA]] where a [[cytosine]] [[nucleotide]] is followed by a [[guanine]] nucleotide in the linear [[DNA sequence|sequence]] of [[Base pair|base]]s along its [[Directionality (molecular biology)|5' → 3' direction]]. CpG sites occur with high frequency in genomic regions called [[CpG site#CpG islands|CpG islands]].

Cytosines in CpG dinucleotides can be [[DNA methylation|methylated]] to form [[5-methylcytosine]]s. [[Enzyme]]s that add a [[methyl group]] are called [[DNA methyltransferase]]s. In mammals, 70% to 80% of CpG cytosines are methylated.<ref name="Jabbari2004">{{cite journal |vauthors=Jabbari K, Bernardi G |title=Cytosine methylation and CpG, TpG (CpA) and TpA frequencies |journal=Gene |volume=333 |pages=143–9 |date=May 2004 |pmid=15177689 |doi=10.1016/j.gene.2004.02.043 }}</ref> Methylating the cytosine within a gene can change its expression, a mechanism that belongs to [[epigenetics]], a larger field of science studying gene regulation. Methylated cytosines often mutate to [[Thymine|thymines]].

In humans, about 70% of [[Promoter (genetics)|promoters]] located near the [[Transcription (genetics)|transcription start site]] of a gene (proximal promoters) contain a CpG island.<ref name="pmid16432200">{{cite journal |vauthors=Saxonov S, Berg P, Brutlag DL |title=A genome-wide analysis of CpG dinucleotides in the human genome distinguishes two distinct classes of promoters |journal=Proc. Natl. Acad. Sci. U.S.A.
|volume=103 |issue=5 |pages=1412–7 |year=2006 |pmid=16432200 |pmc=1345710 |doi=10.1073/pnas.0510310103 |bibcode=2006PNAS..103.1412S |doi-access=free }}</ref><ref name="pmid21576262">{{cite journal |vauthors=Deaton AM, [[Adrian Bird|Bird A]] |title=CpG islands and the regulation of transcription |journal=Genes Dev. |volume=25 |issue=10 |pages=1010–22 |year=2011 |pmid=21576262 |pmc=3093116 |doi=10.1101/gad.2037511 }}</ref>

== CpG characteristics ==

=== Definition ===
''CpG'' is shorthand for ''5'–C–phosphate–G–3' '', that is, cytosine and guanine separated by only one [[phosphate]] group; phosphate links any two adjacent [[nucleoside]]s together in DNA. The ''CpG'' notation is used to distinguish this single-stranded linear sequence from the ''CG'' [[base pair|base-pairing]] of cytosine and guanine for double-stranded sequences. The CpG notation is therefore to be interpreted as the cytosine being [[Directionality (molecular biology)#5′-end|5 prime]] to the guanine base. ''CpG'' should not be confused with ''GpC'', the latter meaning that a guanine is followed by a cytosine in the 5' → 3' direction of a single-stranded sequence.

=== Under-representation caused by high mutation rate ===
CpG dinucleotides have long been observed to occur with a much lower frequency in the sequence of vertebrate genomes than would be expected by chance. For example, in the human genome, which has a 42% [[GC content]],<ref name="Lander2001">{{Cite journal|last1=Lander|first1=Eric S.|author-link1 =Eric Lander|last2=Linton|first2=Lauren M.
|last3=Birren|first3=Bruce |last4=Nusbaum|first4=Chad |last5=Zody|first5=Michael C.|last6=Baldwin |first6=Jennifer|last7=Devon |first7=Keri|last8=Dewar |first8=Ken |last9=Doyle|first9=Michael|date=15 February 2001|title=Initial sequencing and analysis of the human genome|journal=Nature|language=En|volume=409|issue=6822|pages=860–921|doi=10.1038/35057062|pmid=11237011|issn=1476-4687|bibcode=2001Natur.409..860L|doi-access=free|hdl=2027.42/62798|hdl-access=free}}</ref> a pair of [[nucleotide]]s consisting of cytosine followed by guanine would be expected to occur <math>0.21 \times 0.21 = 4.41 \%</math> of the time, since cytosine and guanine each make up about 21% of the bases. The frequency of CpG dinucleotides in human genomes is less than one-fifth of the expected frequency.<ref>{{Cite journal|last=International Human Genome Sequencing Consortium|date=2001-02-15|title=Initial sequencing and analysis of the human genome|journal=Nature|language=en|volume=409|issue=6822|pages=860–921|doi=10.1038/35057062|pmid=11237011|bibcode=2001Natur.409..860L |issn=0028-0836|doi-access=free|hdl=2027.42/62798|hdl-access=free}}</ref>

This underrepresentation is a consequence of the high [[mutation rate]] of methylated CpG sites: the spontaneously occurring [[deamination]] of a methylated cytosine results in a [[thymine]], and the resulting G:T mismatched bases are often improperly resolved to A:T; whereas the deamination of unmethylated cytosine results in a [[uracil]], which as a foreign base is quickly replaced by a cytosine by the [[base excision repair]] mechanism. The C to T [[transition (genetics)|transition]] rate at methylated CpG sites is about 10-fold higher than at unmethylated sites.<ref>{{cite journal|vauthors=Hwang DG, [[Philip Palmer Green|Green P]]|title=Bayesian Markov chain Monte Carlo sequence analysis reveals varying neutral substitution patterns in mammalian evolution.
|journal=Proc Natl Acad Sci U S A |volume=101 | issue=39 |pages=13994–4001 |year=2004 |pmid= 15292512 |doi=10.1073/pnas.0404142101 |pmc=521089|bibcode=2004PNAS..10113994H |doi-access=free }}</ref><ref>{{cite book|vauthors=Walsh CP, Xu GL |chapter=Cytosine Methylation and DNA Repair |title=DNA Methylation: Basic Mechanisms |volume=301 |pages=283–315 |year=2006 |pmid=16570853 |doi=10.1007/3-540-31390-7_11|series=Current Topics in Microbiology and Immunology |isbn=3-540-29114-8 }}</ref><ref>{{cite journal|vauthors=[[Norman Arnheim|Arnheim N]], Calabrese P |title=Understanding what determines the frequency and pattern of human germline mutations. |journal=Nat Rev Genet |volume=10 | issue=7 |pages=478–488 |year=2009 |pmid= 19488047 |doi=10.1038/nrg2529 |pmc=2744436}}</ref><ref>{{cite journal|vauthors=Ségurel L, Wyman MJ, Przeworski M |title=Determinants of Mutation Rate Variation in the Human Germline |journal=Annu Rev Genom Hum Genet |volume=15 |pages=47–70 |year=2014 |pmid= 25000986 |doi=10.1146/annurev-genom-031714-125740|doi-access=free }}</ref>

=== Genomic distribution ===
{| class="wikitable floatright" style="margin-left=15px: auto; margin-right: auto; border: none;"
|+
|-
! scope="col" style="background-color:#ffe75f; width: 250px;" | CpG sites
! scope="col" style="width: 250px;" | GpC sites
|-
| [[File:APRT-CpG.svg|250px]] || [[File:APRT-GpC.svg|250px]]
|-
| colspan="2" style="width: 250px;" | Distribution of CpG sites (left: in red) and GpC sites (right: in green) in the human [[Adenine phosphoribosyltransferase|APRT]] gene. CpG are more abundant in the upstream region of the gene, where they form a [[CpG island]], whereas GpC are more evenly distributed. The 5 [[exon]]s of the APRT gene are indicated (blue), and the start (ATG) and stop (TGA) codons are emphasized (bold blue).
|}
CpG dinucleotides frequently occur in CpG islands (see definition of CpG islands, below).
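The expected-frequency arithmetic from the under-representation discussion above can be sketched in Python. This is a minimal illustration, assuming that bases occur independently of one another and that cytosine and guanine each account for half of the GC content. The function names and the toy sequence are hypothetical and are not taken from any cited source.

```python
def expected_cpg_frequency(gc_content: float) -> float:
    """Expected CpG dinucleotide frequency if bases were independent.

    Assumption: C and G each account for half of the GC content.
    """
    p_c = gc_content / 2
    p_g = gc_content / 2
    return p_c * p_g


def dinucleotide_frequency(seq: str, dinucleotide: str) -> float:
    """Fraction of adjacent base pairs in seq that equal the given dinucleotide."""
    seq = seq.upper()
    pairs = len(seq) - 1
    if pairs <= 0:
        return 0.0
    hits = sum(1 for i in range(pairs) if seq[i:i + 2] == dinucleotide)
    return hits / pairs


# Human genome figures quoted in the text: 42% GC content.
expected = expected_cpg_frequency(0.42)   # 0.21 * 0.21
observed_upper_bound = expected / 5       # "less than one-fifth of the expected frequency"

# Hypothetical toy sequence, for illustration only.
toy = "ACGTTGCACGGA"
cpg_freq = dinucleotide_frequency(toy, "CG")
gpc_freq = dinucleotide_frequency(toy, "GC")
```

Counting CpG and GpC with the same function mirrors the comparison in the APRT figure: both dinucleotides have the same expected frequency under independence, so a difference between them reflects something other than base composition.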
There are 28,890 CpG islands in the human genome, (50,267 if one includes CpG islands in repeat sequences).<ref name="pmid11237011">{{cite journal |vauthors=[[Eric Lander|Lander ES]], Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann Y, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, [[George Weinstock|Weinstock GM]], Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, 
Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blöcker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, [[Christopher Burge|Burge CB]], Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowki J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ, Szustakowki J |display-authors=6|title=Initial sequencing and analysis of the human genome |journal=Nature |volume=409 |issue=6822 |pages=860–921 |date=February 2001 |pmid=11237011 |doi=10.1038/35057062 |bibcode=2001Natur.409..860L |doi-access=free |hdl=2027.42/62798 |hdl-access=free }}</ref> This is in agreement with the 28,519 CpG islands found by [[Craig Venter|Venter]] et al.<ref name="pmid11181995">{{cite journal |vauthors=[[Craig Venter|Venter JC]], Adams MD, [[Eugene Myers|Myers EW]], Li PW, Mural RJ, Sutton GG, [[Hamilton O.
Smith|Smith HO]], Yandell M, Evans CA, Holt RA, Gocayne JD, Amanatides P, Ballew RM, Huson DH, Wortman JR, Zhang Q, Kodira CD, Zheng XH, Chen L, Skupski M, Subramanian G, Thomas PD, Zhang J, Gabor Miklos GL, Nelson C, Broder S, Clark AG, Nadeau J, McKusick VA, Zinder N, Levine AJ, Roberts RJ, Simon M, Slayman C, Hunkapiller M, Bolanos R, Delcher A, Dew I, Fasulo D, Flanigan M, Florea L, Halpern A, Hannenhalli S, Kravitz S, Levy S, Mobarry C, Reinert K, Remington K, Abu-Threideh J, Beasley E, Biddick K, Bonazzi V, Brandon R, Cargill M, Chandramouliswaran I, Charlab R, Chaturvedi K, Deng Z, Di Francesco V, Dunn P, Eilbeck K, Evangelista C, Gabrielian AE, Gan W, Ge W, Gong F, Gu Z, Guan P, Heiman TJ, Higgins ME, Ji RR, Ke Z, Ketchum KA, Lai Z, Lei Y, Li Z, Li J, Liang Y, Lin X, Lu F, Merkulov GV, Milshina N, Moore HM, Naik AK, Narayan VA, Neelam B, Nusskern D, Rusch DB, Salzberg S, Shao W, Shue B, Sun J, Wang Z, Wang A, Wang X, Wang J, Wei M, Wides R, Xiao C, Yan C, Yao A, Ye J, Zhan M, Zhang W, Zhang H, Zhao Q, Zheng L, Zhong F, Zhong W, Zhu S, Zhao S, Gilbert D, Baumhueter S, Spier G, Carter C, Cravchik A, Woodage T, Ali F, An H, Awe A, Baldwin D, Baden H, Barnstead M, Barrow I, Beeson K, Busam D, Carver A, Center A, Cheng ML, Curry L, Danaher S, Davenport L, Desilets R, Dietz S, Dodson K, Doup L, Ferriera S, Garg N, Gluecksmann A, Hart B, Haynes J, Haynes C, Heiner C, Hladun S, Hostin D, Houck J, Howland T, Ibegwam C, Johnson J, Kalush F, Kline L, Koduru S, Love A, Mann F, May D, McCawley S, McIntosh T, McMullen I, Moy M, Moy L, Murphy B, Nelson K, Pfannkoch C, Pratts E, Puri V, Qureshi H, Reardon M, Rodriguez R, Rogers YH, Romblad D, Ruhfel B, Scott R, Sitter C, Smallwood M, Stewart E, Strong R, Suh E, Thomas R, Tint NN, Tse S, Vech C, Wang G, Wetter J, Williams S, Williams M, Windsor S, Winn-Deen E, Wolfe K, Zaveri J, Zaveri K, Abril JF, Guigó R, Campbell MJ, Sjolander KV, Karlak B, Kejariwal A, Mi H, Lazareva B, Hatton T, Narechania A, Diemer K, Muruganujan A,
Guo N, Sato S, Bafna V, Istrail S, Lippert R, Schwartz R, Walenz B, Yooseph S, Allen D, Basu A, Baxendale J, Blick L, Caminha M, Carnes-Stine J, Caulk P, Chiang YH, Coyne M, Dahlke C, Mays A, Dombroski M, Donnelly M, Ely D, Esparham S, Fosler C, Gire H, Glanowski S, Glasser K, Glodek A, Gorokhov M, Graham K, Gropman B, Harris M, Heil J, Henderson S, Hoover J, Jennings D, Jordan C, Jordan J, Kasha J, Kagan L, Kraft C, Levitsky A, Lewis M, Liu X, Lopez J, Ma D, Majoros W, McDaniel J, Murphy S, Newman M, Nguyen T, Nguyen N, Nodell M, Pan S, Peck J, Peterson M, Rowe W, Sanders R, Scott J, Simpson M, Smith T, Sprague A, Stockwell T, Turner R, Venter E, Wang M, Wen M, Wu D, Wu M, Xia A, Zandieh A, Zhu X |display-authors=6|title=The sequence of the human genome |journal=Science |volume=291 |issue=5507 |pages=1304–51 |date=February 2001 |pmid=11181995 |doi=10.1126/science.1058040 |bibcode=2001Sci...291.1304V |doi-access=free }}</ref> since the Venter et al. genome sequence did not include the interiors of highly similar repetitive elements and the extremely dense repeat regions near the centromeres.<ref name="pmid11904395">{{cite journal |vauthors=[[Eugene Myers|Myers EW]], Sutton GG, [[Hamilton O. Smith|Smith HO]], Adams MD, [[Craig Venter|Venter JC]] |title=On the sequencing and assembly of the human genome |journal=Proc. Natl. Acad. Sci. U.S.A. |volume=99 |issue=7 |pages=4145–6 |date=April 2002 |pmid=11904395 |pmc=123615 |doi=10.1073/pnas.092136699 |bibcode=2002PNAS...99.4145M |doi-access=free }}</ref> Since CpG islands contain multiple CpG dinucleotide sequences, there appear to be more than 20 million CpG dinucleotides in the human genome.

== CpG islands ==
[[Image:Cytosine becomes thymine.png|thumbnail|350px|right|How methylation of CpG sites followed by spontaneous deamination leads to a lack of CpG sites in methylated DNA.
As a result, residual CpG islands are left in areas where methylation is rare and CpG sites persist (or where C to T mutation is highly detrimental).]]
'''CpG islands''' (or CG islands) are regions with a high frequency of CpG sites. Though objective definitions for CpG islands are limited, the usual formal definition is a region of at least 200 [[base pair|bp]] with a GC percentage greater than 50% and an observed-to-expected CpG ratio greater than 60%. In the "observed-to-expected CpG ratio", the observed value is <math>\text{number of CpGs}</math> and the expected value is calculated as <math>(\text{number of C} \times \text{number of G}) / \text{length of sequence}</math><ref name="Gardiner-Garden1987">{{Cite journal| volume = 196| issue = 2| pages = 261–282| vauthors = Gardiner-Garden M, Frommer M| title = CpG islands in vertebrate genomes| journal = Journal of Molecular Biology| date = 1987| doi= 10.1016/0022-2836(87)90689-9| pmid=3656447}}</ref> or <math>((\text{number of C} + \text{number of G})/2)^2 / \text{length of sequence}</math>.<ref name="Saxonov2006"/>

Many genes in mammalian genomes have CpG islands associated with the start of the gene<ref>{{cite book |title=Genetics: Analysis of Genes and Genomes |vauthors=Hartl DL, Jones EW |page=[https://archive.org/details/genetics00dani/page/477 477] |edition=6th |year=2005 |publisher=Jones & Bartlett, Canada |location=Mississauga |isbn=978-0-7637-1511-3 |url-access=registration |url=https://archive.org/details/genetics00dani/page/477 }}</ref> ([[Promoter (genetics)|promoter regions]]). Because of this, the presence of a CpG island is used to help in the prediction and annotation of genes.
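The formal definition above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, the thresholds are the ones quoted in the text (at least 200 bp, GC fraction above 50%, observed-to-expected ratio above 60%), and both expected-count formulas are offered through a <code>method</code> argument introduced here for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are C or G."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("C") + seq.count("G")) / len(seq)


def cpg_obs_exp(seq: str, method: str = "product") -> float:
    """Observed-to-expected CpG ratio of seq.

    method "product": expected = (number of C * number of G) / length
    method "mean":    expected = ((number of C + number of G) / 2) ** 2 / length
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0
    c, g = seq.count("C"), seq.count("G")
    observed = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if method == "product":
        expected = c * g / n
    elif method == "mean":
        expected = ((c + g) / 2) ** 2 / n
    else:
        raise ValueError(f"unknown method: {method}")
    return observed / expected if expected else 0.0


def meets_island_criteria(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_ratio: float = 0.6) -> bool:
    """True if seq satisfies the usual formal CpG island definition."""
    return (len(seq) >= min_len
            and gc_fraction(seq) > min_gc
            and cpg_obs_exp(seq) > min_ratio)
```

The two expected-count formulas agree when a sequence contains equal numbers of C and G and diverge otherwise; for the four-base string <code>CCCG</code>, for instance, the first gives an expected count of 0.75 and the second gives 1.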
In mammalian genomes, CpG islands are typically 300–3,000 base pairs in length, and have been found in or near approximately 40% of [[promoter (biology)|promoter]]s of mammalian genes.<ref name="Fatemi2005">{{cite journal |vauthors=Fatemi M, Pao MM, Jeong S, Gal-Yam EN, Egger G, Weisenberger DJ, Jones PA |display-authors=6|title=Footprinting of mammalian promoters: use of a CpG DNA methyltransferase revealing nucleosome positions at a single molecule level |journal=Nucleic Acids Res |volume=33 |issue=20 |pages=e176 |year=2005 |pmid=16314307 |pmc=1292996 |doi=10.1093/nar/gni180}}</ref> Over 60% of human genes and almost all [[Housekeeping gene|house-keeping genes]] have their promoters embedded in CpG islands.<ref>{{Cite book|last=Alberts |first=Bruce|author-link=Bruce Alberts|title=Molecular biology of the cell|date=18 November 2014|isbn=978-0-8153-4432-2|edition=Sixth|publisher=Garland Science |location=New York, NY|pages=406|oclc=887605755|url=https://archive.org/details/molecularbiology0006edalbe/page/406/mode/2up}}</ref> Given the frequency of GC two-nucleotide sequences, the number of CpG dinucleotides is much lower than would be expected.<ref name="Saxonov2006">{{cite journal |vauthors=Saxonov S, Berg P, Brutlag DL |title=A genome-wide analysis of CpG dinucleotides in the human genome distinguishes two distinct classes of promoters |journal=Proc Natl Acad Sci USA |volume=103 |issue=5 |pages=1412–1417 |year=2006 |pmid=16432200 |pmc=1345710 |doi=10.1073/pnas.0510310103|bibcode=2006PNAS..103.1412S |doi-access=free }}</ref>

A 2002 study revised the rules of CpG island prediction to exclude other GC-rich genomic sequences such as [[Alu sequence|Alu repeats]].
Based on an extensive search of the complete sequences of human chromosomes 21 and 22, DNA regions greater than 500 bp were found more likely to be the "true" CpG islands associated with the 5' regions of genes if they had a GC content greater than 55% and an observed-to-expected CpG ratio of 65%.<ref name="Takai2002">{{cite journal |vauthors=Takai D, Jones PA |title=Comprehensive analysis of CpG islands in human chromosomes 21 and 22. |journal=Proc Natl Acad Sci USA |volume=99 |issue=6 |pages=3740–5 |year=2002 |pmid=11891299 |doi=10.1073/pnas.052410099 |pmc=122594|bibcode=2002PNAS...99.3740T |doi-access=free }}</ref>

CpG islands are characterized by CpG dinucleotide content of at least 60% of that which would be statistically expected (~4–6%), whereas the rest of the genome has much lower CpG frequency (~1%), a phenomenon called [[CG suppression]]. Unlike CpG sites in the [[coding region]] of a gene, in most instances the CpG sites in the CpG islands of promoters are unmethylated if the genes are expressed. This observation led to the speculation that [[methylation]] of CpG sites in the promoter of a gene may inhibit gene expression.
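A window-based scan using the stricter thresholds described above can be sketched as follows. This is a simplified illustration and not the algorithm of the 2002 study: it only tests fixed-size windows and does not extend or merge them. The function names, the fixed 500 bp window, the 100 bp step, and the handling of values that fall exactly on a threshold are all assumptions made for this example.

```python
from typing import Iterator, Tuple


def window_stats(window: str) -> Tuple[float, float]:
    """Return (GC fraction, observed-to-expected CpG ratio) for one window."""
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    gc = (c + g) / n
    expected = c * g / n
    ratio = window.count("CG") / expected if expected else 0.0
    return gc, ratio


def candidate_windows(seq: str, window: int = 500, step: int = 100,
                      min_gc: float = 0.55,
                      min_ratio: float = 0.65) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) of fixed-size windows that pass both thresholds.

    Assumption: GC fraction must exceed min_gc, and the CpG ratio must be
    at least min_ratio.
    """
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        gc, ratio = window_stats(seq[start:start + window])
        if gc > min_gc and ratio >= min_ratio:
            yield start, start + window


# Hypothetical test sequence: a 500 bp CpG-rich block flanked by AT-rich DNA.
toy = "AT" * 300 + "CG" * 250 + "AT" * 300
hits = list(candidate_windows(toy))
```

Overlapping windows that pass would normally be merged into a single region; that step is left out to keep the sketch short.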
Methylation, along with [[histone]] modification, is central to [[Genomic imprinting|imprinting]].<ref name="Feil2007">{{cite journal |vauthors=Feil R, Berger F |title=Convergent evolution of genomic imprinting in plants and mammals |journal=Trends Genet |volume=23 |issue=4 |pages=192–199 |year=2007 |pmid=17316885 |doi=10.1016/j.tig.2007.02.004}}</ref> Most of the methylation differences between tissues, or between normal and cancer samples, occur a short distance from the CpG islands (at "CpG island shores") rather than in the islands themselves.<ref>{{cite journal | vauthors=[[Rafael Irizarry (scientist)|Irizarry RA]], Ladd-Acosta C, Wen B, Wu Z, Montano C, Onyango P, Cui H, Gabo K, Rongione M, Webster M, Ji H, Potash JB, Sabunciyan S, Feinberg AP |display-authors=6| title=The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores | journal=[[Nature Genetics]] | volume=41 | issue=2 | year=2009 | pages=178–186 | pmid=19151715 | pmc=2729128 | doi=10.1038/ng.298}}</ref>

CpG islands typically occur at or near the transcription start site of genes, particularly [[housekeeping gene]]s, in vertebrates.<ref name="Saxonov2006"/> A C (cytosine) base followed immediately by a G (guanine) base (a CpG) is rare in vertebrate DNA because the cytosines in such an arrangement tend to be methylated. This methylation helps distinguish the newly synthesized DNA strand from the parent strand, which aids in the final stages of DNA proofreading after duplication. However, over time methylated cytosines tend to turn into [[thymine]]s because of spontaneous [[deamination]]. Humans have a dedicated enzyme ([[Thymine-DNA glycosylase]], or TDG) that specifically replaces the T in T/G mismatches. However, due to the rarity of CpGs, it is theorised to be insufficiently effective in preventing a possibly rapid mutation of the dinucleotides.
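The loss of a CpG site through deamination, as described above, can be illustrated with a toy model in Python. The sketch assumes that a methylated cytosine has been deaminated to thymine and that the mismatch was not repaired, so the change is fixed in the sequence. If the cytosine on the strand shown mutates, the CpG reads as TpG; if the cytosine on the complementary strand mutates, the same site reads as CpA on the strand shown. The function name and the example sequence are hypothetical.

```python
def deaminate_cpg(seq: str, cpg_index: int, strand: str = "+") -> str:
    """Return seq after a fixed C-to-T transition at the CpG starting at cpg_index.

    strand "+": the cytosine on the given strand mutates (CpG -> TpG).
    strand "-": the cytosine on the complementary strand, which pairs with the
                G of the CpG, mutates; on the given strand this reads CpG -> CpA.
    """
    if seq[cpg_index:cpg_index + 2].upper() != "CG":
        raise ValueError("no CpG at this position")
    if strand == "+":
        return seq[:cpg_index] + "T" + seq[cpg_index + 1:]
    if strand == "-":
        return seq[:cpg_index + 1] + "A" + seq[cpg_index + 2:]
    raise ValueError("strand must be '+' or '-'")


original = "AACGTT"  # hypothetical sequence with one CpG at index 2
top_strand_event = deaminate_cpg(original, 2, "+")
bottom_strand_event = deaminate_cpg(original, 2, "-")
```

Either event removes the CpG from the sequence, which is how repeated deamination over evolutionary time depletes CpG sites in methylated regions.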
The existence of CpG islands is usually explained by the existence of selective forces for relatively high CpG content, or low levels of methylation in that genomic area, perhaps having to do with the regulation of gene expression. A 2011 study showed that most CpG islands are a result of non-selective forces.<ref name="Tanay2011">{{cite journal |vauthors= Cohen N, Kenigsberg E, Tanay A|title=Primate CpG Islands Are Maintained by Heterogeneous Evolutionary Regimes Involving Minimal Selection |journal=Cell |volume=145 |issue=5 |pages=773–786 |year=2011 |pmid=21620139 |doi=10.1016/j.cell.2011.04.024|s2cid=14856605 |doi-access=free }}</ref>

== Methylation, silencing, cancer, and aging ==
[[File:Cpg island evolution.svg|thumb|500px|An image showing a hypothetical evolutionary mechanism behind CpG island formation.]]
{{Main|DNA methylation}}

=== CpG islands in promoters ===
In humans, about 70% of [[Promoter (genetics)|promoters]] located near the [[transcription start site]] of a gene (proximal promoters) contain a [[#CpG islands|CpG island]].<ref name="pmid16432200"/><ref name="pmid21576262"/>

[[Distal promoter]] elements also frequently contain CpG islands. An example is the DNA repair gene ''[[ERCC1]]'', where the CpG island-containing element is located about 5,400 nucleotides upstream of the [[Transcription (biology)|transcription start site]] of the ''ERCC1'' gene.<ref name="pmid19626585">{{cite journal |vauthors=Chen HY, Shao CJ, Chen FR, Kwan AL, Chen ZP |title=Role of ERCC1 promoter hypermethylation in drug resistance to cisplatin in human gliomas |journal=Int. J.
Cancer |volume=126 |issue=8 |pages=1944–54 |year=2010 |pmid=19626585 |doi=10.1002/ijc.24772 |doi-access=free }}</ref> CpG islands also occur frequently in promoters for [[Noncoding DNA#Noncoding functional RNA|functional noncoding RNAs]] such as [[microRNA]]s.<ref name="pmid27573897">{{cite book |vauthors=Kaur S, Lotsari-Salomaa JE, Seppänen-Kaijansinkko R, Peltomäki P |title=Non-coding RNAs in Colorectal Cancer |chapter=MicroRNA Methylation in Colorectal Cancer |volume=937 |pages=109–22 |year=2016 |pmid=27573897 |doi=10.1007/978-3-319-42059-2_6 |series=Advances in Experimental Medicine and Biology |isbn=978-3-319-42057-8 }}</ref>

=== Methylation of CpG islands stably silences genes ===
In humans, DNA methylation occurs at the 5 position of the pyrimidine ring of the cytosine residues within CpG sites to form [[5-methylcytosine]]s. The presence of multiple methylated CpG sites in CpG islands of promoters causes stable silencing of genes.<ref name=Bird>{{cite journal |vauthors=[[Adrian Bird|Bird A]] |title=DNA methylation patterns and epigenetic memory |journal=Genes Dev. |volume=16 |issue=1 |pages=6–21 |year=2002 |pmid=11782440 |doi=10.1101/gad.947102 |doi-access=free }}</ref> Silencing of a gene may be initiated by other mechanisms, but this is often followed by methylation of CpG sites in the promoter CpG island, which makes the silencing stable.<ref name=Bird />

=== Promoter CpG hyper/hypo-methylation in cancer ===
In cancers, loss of expression of genes occurs about 10 times more frequently by hypermethylation of promoter CpG islands than by mutations.
For example, in a colorectal cancer there are usually about 3 to 6 [[Somatic evolution in cancer#Glossary|driver]] mutations and 33 to 66 [[Genetic hitchhiking|hitchhiker]] or passenger mutations.<ref name="pmid23539594">{{cite journal |vauthors=[[Bert Vogelstein|Vogelstein B]], Papadopoulos N, Velculescu VE, Zhou S, Diaz LA, Kinzler KW |title=Cancer genome landscapes |journal=Science |volume=339 |issue=6127 |pages=1546–58 |year=2013 |pmid=23539594 |pmc=3749880 |doi=10.1126/science.1235122 |bibcode=2013Sci...339.1546V }}</ref> In contrast, in one study of colon tumors compared to adjacent normal-appearing colonic mucosa, 1,734 CpG islands were heavily methylated in tumors but were not methylated in the adjacent mucosa.<ref name=Illingworth>{{cite journal |vauthors=Illingworth RS, Gruenewald-Schneider U, Webb S, Kerr AR, James KD, Turner DJ, Smith C, Harrison DJ, Andrews R, [[Adrian Bird|Bird AP]] |title=Orphan CpG islands identify numerous conserved promoters in the mammalian genome |journal=PLOS Genet. |volume=6 |issue=9 |pages=e1001134 |year=2010 |pmid=20885785 |pmc=2944787 |doi=10.1371/journal.pgen.1001134 |doi-access=free }}</ref> Half of the CpG islands were in promoters of annotated protein coding genes,<ref name=Illingworth /> suggesting that about 867 genes in a colon tumor have lost expression due to CpG island methylation. A separate study found an average of 1,549 differentially methylated regions (hypermethylated or hypomethylated) in the genomes of six colon cancers (compared to adjacent mucosa), of which 629 were in known promoter regions of genes.<ref name="pmid27493446">{{cite journal |vauthors=Wei J, Li G, Dang S, Zhou Y, Zeng K, Liu M |title=Discovery and Validation of Hypermethylated Markers for Colorectal Cancer |journal=Dis.
Markers |volume=2016 |pages=1–7 |year=2016 |pmid=27493446 |pmc=4963574 |doi=10.1155/2016/2192853 |doi-access=free }}</ref> A third study found more than 2,000 genes differentially methylated between colon cancers and adjacent mucosa. Using [[gene set enrichment]] analysis, 569 out of 938 [[gene set enrichment|gene sets]] were hypermethylated and 369 were hypomethylated in cancers.<ref name="pmid23096130">{{cite journal |vauthors=Beggs AD, Jones A, El-Bahrawy M, El-Bahwary M, Abulafi M, Hodgson SV, Tomlinson IP |display-authors=6|title=Whole-genome methylation analysis of benign and malignant colorectal tumours |journal=J. Pathol. |volume=229 |issue=5 |pages=697–704 |year=2013 |pmid=23096130 |pmc=3619233 |doi=10.1002/path.4132 }}</ref> Hypomethylation of CpG islands in promoters results in overexpression of the genes or gene sets affected.

One 2012 study<ref name="pmid22389639">{{cite journal |vauthors=Schnekenburger M, Diederich M |title=Epigenetics Offer New Horizons for Colorectal Cancer Prevention |journal=Curr Colorectal Cancer Rep |volume=8 |issue=1 |pages=66–81 |year=2012 |pmid=22389639 |pmc=3277709 |doi=10.1007/s11888-011-0116-z }}</ref> listed 147 specific genes with colon cancer-associated hypermethylated promoters, along with the frequency with which these hypermethylations were found in colon cancers. At least 10 of those genes had hypermethylated promoters in nearly 100% of colon cancers. The same study also listed 11 [[microRNA]]s whose promoters were hypermethylated in colon cancers at frequencies between 50% and 100% of cancers. MicroRNAs (miRNAs) are small endogenous RNAs that pair with sequences in [[messenger RNA]]s to direct post-transcriptional repression. On average, each microRNA represses several hundred target genes.<ref name="pmid18955434">{{cite journal |vauthors=Friedman RC, Farh KK, [[Christopher Burge|Burge CB]], [[David Bartel|Bartel DP]] |title=Most mammalian mRNAs are conserved targets of microRNAs |journal=Genome Res.
|volume=19 |issue=1 |pages=92–105 |year=2009 |pmid=18955434 |pmc=2612969 |doi=10.1101/gr.082701.108 }}</ref> Thus microRNAs with hypermethylated promoters may allow over-expression of hundreds to thousands of genes in a cancer.

Taken together, these findings indicate that, in cancers, promoter CpG hyper/hypo-methylation of genes and of microRNAs causes loss of expression (or sometimes increased expression) of far more genes than does mutation.

=== DNA repair genes with hyper/hypo-methylated promoters in cancers ===
DNA repair genes are frequently repressed in cancers due to hypermethylation of CpG islands within their promoters. In [[Head and neck cancer|head and neck squamous cell carcinomas]] at least 15 DNA repair genes have frequently hypermethylated promoters; these genes are ''[[XRCC1]], [[MLH3]], [[PMS1]], [[RAD51B]], [[XRCC3]], [[RAD54B]], [[BRCA1]], [[SHFM1]], [[GEN1]], [[FANCE]], [[FAAP20]], [[SPRTN]], [[SETMAR]], [[HUS1]],'' and ''[[PER1]]''.<ref name="pmid27683114">{{cite journal |vauthors=Rieke DT, Ochsenreither S, Klinghammer K, Seiwert TY, Klauschen F, Tinhofer I, Keilholz U |display-authors=6|title=Methylation of RAD51B, XRCC3 and other homologous recombination genes is associated with expression of immune checkpoints and an inflammatory signature in squamous cell carcinoma of the head and neck, lung and cervix |journal=Oncotarget |volume= 7|issue= 46|pages= 75379–75393|year=2016 |pmid=27683114 |pmc=5342748 |doi=10.18632/oncotarget.12211 }}</ref> About seventeen types of cancer are frequently deficient in one or more DNA repair genes due to hypermethylation of their promoters.<ref name="pmid22956494">{{cite book |vauthors=Jin B, Robertson KD |title=Epigenetic Alterations in Oncogenesis |chapter=DNA Methyltransferases, DNA Damage Repair, and Cancer |volume=754 |pages=3–29 |year=2013 |pmid=22956494 |pmc=3707278 |doi=10.1007/978-1-4419-9967-2_1 |series=Advances in Experimental Medicine and Biology |isbn=978-1-4419-9966-5 }}</ref> As an example, promoter
hypermethylation of the DNA repair gene ''[[O-6-methylguanine-DNA methyltransferase|MGMT]]'' occurs in 93% of bladder cancers, 88% of stomach cancers, 74% of thyroid cancers, 40%–90% of colorectal cancers and 50% of brain cancers. Promoter hypermethylation of ''[[LIG4]]'' occurs in 82% of colorectal cancers. Promoter hypermethylation of ''[[NEIL1]]'' occurs in 62% of [[head and neck cancer]]s and in 42% of [[non-small-cell lung carcinoma|non-small-cell lung cancer]]s. Promoter hypermethylation of ''[[Ataxia telangiectasia mutated|ATM]]'' occurs in 47% of [[non-small-cell lung carcinoma|non-small-cell lung cancer]]s. Promoter hypermethylation of ''[[MLH1]]'' occurs in 48% of [[non-small-cell lung carcinoma|non-small-cell lung cancer]] squamous cell carcinomas. Promoter hypermethylation of ''[[FANCB]]'' occurs in 46% of [[head and neck cancer]]s.

On the other hand, the promoters of two genes, ''[[PARP1]]'' and ''[[FEN1]]'', were hypomethylated and these genes were over-expressed in numerous cancers. ''PARP1'' and ''FEN1'' are essential genes in the error-prone and mutagenic DNA repair pathway [[microhomology-mediated end joining]]. If this pathway is over-expressed, the excess mutations it causes can lead to cancer. [[PARP1]] is over-expressed in tyrosine kinase-activated leukemias,<ref name="pmid25828893">{{cite journal |vauthors=Muvarak N, Kelley S, Robert C, Baer MR, Perrotti D, Gambacorti-Passerini C, Civin C, Scheibner K, Rassool FV |display-authors=6|title=c-MYC Generates Repair Errors via Increased Transcription of Alternative-NHEJ Factors, LIG3 and PARP1, in Tyrosine Kinase-Activated Leukemias |journal=Mol. Cancer Res.
|volume=13 |issue=4 |pages=699–712 |year=2015 |pmid=25828893 |doi=10.1158/1541-7786.MCR-14-0422 |pmc=4398615}}</ref> in neuroblastoma,<ref name="pmid25563294">{{cite journal |vauthors=Newman EA, Lu F, Bashllari D, Wang L, Opipari AW, Castle VP |title=Alternative NHEJ Pathway Components Are Therapeutic Targets in High-Risk Neuroblastoma |journal=Mol. Cancer Res. |volume=13 |issue=3 |pages=470–82 |year=2015 |pmid=25563294 |doi=10.1158/1541-7786.MCR-14-0337 |doi-access=free }}</ref> in testicular and other germ cell tumors,<ref name="pmid23486608">{{cite journal |vauthors=Mego M, Cierna Z, Svetlovska D, Macak D, Machalekova K, Miskovska V, Chovanec M, Usakova V, Obertova J, Babal P, Mardiak J |display-authors=6|title=PARP expression in germ cell tumours |journal=J. Clin. Pathol. |volume=66 |issue=7 |pages=607–12 |year=2013 |pmid=23486608 |doi=10.1136/jclinpath-2012-201088 |s2cid=535704}}</ref> and in Ewing's sarcoma.<ref name="pmid11956622">{{cite journal |vauthors=Newman RE, Soldatenkov VA, Dritschilo A, Notario V |title=Poly(ADP-ribose) polymerase turnover alterations do not contribute to PARP overexpression in Ewing's sarcoma cells |journal=Oncol. Rep. |volume=9 |issue=3 |pages=529–32 |year=2002 |pmid=11956622 |doi= 10.3892/or.9.3.529}}</ref> [[FEN1]] is over-expressed in the majority of cancers of the breast,<ref name=Singh>{{cite journal |vauthors=Singh P, Yang M, Dai H, Yu D, Huang Q, Tan W, Kernstine KH, Lin D, Shen B |title=Overexpression and hypomethylation of flap endonuclease 1 gene in breast and other cancers |journal=Mol. Cancer Res. |volume=6 |issue=11 |pages=1710–7 |year=2008 |pmid=19010819 |pmc=2948671 |doi=10.1158/1541-7786.MCR-08-0269 }}</ref> prostate,<ref name="pmid16879693">{{cite journal |vauthors=Lam JS, Seligson DB, Yu H, Li A, Eeva M, Pantuck AJ, Zeng G, [[Steve Horvath|Horvath S]], [[Arie Belldegrun|Belldegrun AS]] |title=Flap endonuclease 1 is overexpressed in prostate cancer and is associated with a high Gleason score |journal=BJU Int. 
|volume=98 |issue=2 |pages=445–51 |year=2006 |pmid=16879693 |doi=10.1111/j.1464-410X.2006.06224.x |s2cid=22165252 }}</ref> stomach,<ref name="pmid15701830">{{cite journal |vauthors=Kim JM, Sohn HY, Yoon SY, Oh JH, Yang JO, Kim JH, Song KS, Rho SM, Yoo HS, Yoo HS, Kim YS, Kim JG, Kim NS |display-authors=6|title=Identification of gastric cancer-related genes using a cDNA microarray containing novel expressed sequence tags expressed in gastric cancer cells |journal=Clin. Cancer Res. |volume=11 |issue=2 Pt 1 |pages=473–82 |year=2005 |doi=10.1158/1078-0432.473.11.2|pmid=15701830 |doi-access=free }}</ref><ref name="pmid24590400">{{cite journal |vauthors=Wang K, Xie C, Chen D |title=Flap endonuclease 1 is a promising candidate biomarker in gastric cancer and is involved in cell proliferation and apoptosis |journal=Int. J. Mol. Med. |volume=33 |issue=5 |pages=1268–74 |year=2014 |pmid=24590400 |doi=10.3892/ijmm.2014.1682 |doi-access=free }}</ref> neuroblastomas,<ref name="pmid15922863">{{cite journal |vauthors=Krause A, Combaret V, Iacono I, Lacroix B, Compagnon C, Bergeron C, Valsesia-Wittmann S, Leissner P, Mougin B, Puisieux A |display-authors=6|title=Genome-wide analysis of gene expression in neuroblastomas detected by mass screening |journal=Cancer Lett. |volume=225 |issue=1 |pages=111–20 |year=2005 |pmid=15922863 |doi=10.1016/j.canlet.2004.10.035 |s2cid=44644467|url=http://hal.archives-ouvertes.fr/docs/00/15/79/17/PDF/Cancer_Letters_2004.pdf}}</ref> pancreatic,<ref name="pmid12651607">{{cite journal |vauthors=Iacobuzio-Donahue CA, Maitra A, Olsen M, Lowe AW, van Heek NT, Rosty C, Walter K, Sato N, Parker A, Ashfaq R, Jaffee E, Ryu B, Jones J, Eshleman JR, Yeo CJ, Cameron JL, Kern SE, Hruban RH, Brown PO, Goggins M |display-authors=6|title=Exploration of global gene expression patterns in pancreatic adenocarcinoma using cDNA microarrays |journal=Am. J. Pathol. 
|volume=162 |issue=4 |pages=1151–62 |year=2003 |pmid=12651607 |pmc=1851213 |doi=10.1016/S0002-9440(10)63911-9 }}</ref> and lung.<ref name="pmid19596913">{{cite journal |vauthors=Nikolova T, Christmann M, Kaina B |title=FEN1 is overexpressed in testis, lung and brain tumors |journal=Anticancer Res. |volume=29 |issue=7 |pages=2453–9 |year=2009 |pmid=19596913 }}</ref> DNA damage appears to be the primary underlying cause of cancer.<ref name="pmid18403632">{{cite journal |vauthors=[[Michael B. Kastan|Kastan MB]] |title=DNA damage responses: mechanisms and roles in human disease: 2007 G.H.A. Clowes Memorial Award Lecture |journal=Mol. Cancer Res. |volume=6 |issue=4 |pages=517–24 |year=2008 |pmid=18403632 |doi=10.1158/1541-7786.MCR-08-0020 |doi-access=free }}</ref><ref name=BernsteinPrasad>{{cite book |last1= Bernstein |first1=C |last2=Prasad |first2=AR |last3=Nfonsam |first3=V |last4=Bernstein |first4=H. |year=2013 |chapter= Chapter 16: DNA Damage, DNA Repair and Cancer |title= New Research Directions in DNA Repair |editor-first=Clark |editor-last=Chen |isbn=978-953-51-1114-6|page=413|publisher=BoD – Books on Demand }}</ref> If accurate DNA repair is deficient, DNA damage tends to accumulate. Such excess DNA damage can increase [[mutation]]al errors during [[DNA replication]] due to error-prone [[DNA repair#Translesion synthesis|translesion synthesis]]. 
Excess DNA damage can also increase [[Epigenetics|epigenetic]] alterations due to errors during DNA repair.<ref name=Hagan>{{cite journal |vauthors=O'Hagan HM, Mohammad HP, Baylin SB |title=Double strand breaks can initiate gene silencing and SIRT1-dependent onset of DNA methylation in an exogenous promoter CpG island |journal=PLOS Genetics |volume=4 |issue=8 |pages=e1000155 |year=2008 |pmid=18704159 |pmc=2491723 |doi=10.1371/journal.pgen.1000155 |doi-access=free }}</ref><ref name=Cuozzo>{{cite journal |vauthors=Cuozzo C, Porcellini A, Angrisano T |title=DNA damage, homology-directed repair, and DNA methylation |journal=PLOS Genetics |volume=3 |issue=7 |pages=e110 | date=July 2007 |pmid=17616978 |pmc=1913100 |doi=10.1371/journal.pgen.0030110|display-authors=etal |doi-access=free }}</ref> Such mutations and epigenetic alterations can give rise to [[cancer]] (see [[Neoplasm#Malignant neoplasms|malignant neoplasms]]). Thus, CpG island hyper/hypo-methylation in the promoters of DNA repair genes is likely central to progression to cancer. === Methylation of CpG sites with age === Since age has a strong effect on DNA methylation levels at tens of thousands of CpG sites, one can define a highly accurate [[Biological clock (aging)|biological clock]] (referred to as [[epigenetic clock]] or [[Biological clock (aging)|DNA methylation age]]) in humans and chimpanzees.<ref>{{cite journal |last1=Field |first1=Adam E. |last2=Robertson |first2=Neil A. |last3=Wang |first3=Tina |last4=Havas |first4=Aaron |last5=Ideker |first5=Trey |last6=Adams |first6=Peter D. 
|title=DNA Methylation Clocks in Aging: Categories, Causes, and Consequences |journal=Molecular Cell |date=September 2018 |volume=71 |issue=6 |pages=882–895 |doi=10.1016/j.molcel.2018.08.008|pmc=6520108 }}</ref> === Unmethylated sites === Unmethylated CpG dinucleotide sites can be detected by Toll-like receptor 9 ([[TLR 9]])<ref>{{Cite journal |vauthors=Ramirez-Ortiz ZG, Specht CA, Wang JP, Lee CK, Bartholomeu DC, Gazzinelli RT, Levitz SM |title=Toll-like receptor 9-dependent immune activation by unmethylated CpG motifs in Aspergillus fumigatus DNA |journal=Infect. Immun. |year=2008 |volume=76 |issue=5 |pages=2123–2129 |pmid=18332208 |doi=10.1128/IAI.00047-08 |pmc=2346696}}</ref> on [[plasmacytoid dendritic cell]]s, [[monocyte]]s, [[Natural killer cell|natural killer (NK) cells]], and [[B cell]]s in humans. This is used to detect intracellular viral infection. == Role of CpG sites in memory == In mammals, [[DNA methyltransferase]]s (which add [[methyl group]]s to DNA bases) exhibit a sequence preference for cytosines within CpG sites.<ref name=Ziller>{{cite journal |vauthors=Ziller MJ, Müller F, Liao J, Zhang Y, Gu H, Bock C, Boyle P, Epstein CB, Bernstein BE, Lengauer T, Gnirke A, Meissner A |display-authors=6|title=Genomic distribution and inter-sample variation of non-CpG methylation across human cell types |journal=PLOS Genet. 
|volume=7 |issue=12 |pages=e1002389 |date=December 2011 |pmid=22174693 |pmc=3234221 |doi=10.1371/journal.pgen.1002389 |doi-access=free}}</ref> In the mouse brain, 4.2% of all cytosines are methylated, primarily in the context of CpG sites, forming 5mCpG.<ref name=Fasolino>{{cite journal |vauthors=Fasolino M, Zhou Z |title=The Crucial Role of DNA Methylation and MeCP2 in Neuronal Function |journal=Genes (Basel) |volume=8 |issue=5 |pages= 141|date=May 2017 |pmid=28505093 |pmc=5448015 |doi=10.3390/genes8050141 |doi-access=free }}</ref> Most hypermethylated 5mCpG sites increase the repression of associated genes.<ref name=Fasolino /> As reviewed by Duke et al., neuron DNA methylation (repressing expression of particular genes) is altered by neuronal activity. Neuron DNA methylation is required for [[synaptic plasticity]] and is modified by experiences, and active DNA methylation and demethylation are required for memory formation and maintenance.<ref name="pmid28620075">{{cite journal |vauthors=Duke CG, Kennedy AJ, Gavin CF, Day JJ, Sweatt JD |title=Experience-dependent epigenomic reorganization in the hippocampus |journal=Learn. Mem. |volume=24 |issue=7 |pages=278–288 |date=July 2017 |pmid=28620075 |pmc=5473107 |doi=10.1101/lm.045112.117 }}</ref> In 2016 Halder et al.<ref name="pmid26656643">{{cite journal |vauthors=Halder R, Hennion M, Vidal RO, Shomroni O, Rahman RU, Rajput A, Centeno TP, van Bebber F, Capece V, Garcia Vizcaino JC, Schuetz AL, Burkhardt S, Benito E, Navarro Sala M, Javan SB, Haass C, Schmid B, Fischer A, Bonn S |display-authors=6|title=DNA methylation changes in plasticity genes accompany the formation and maintenance of memory |journal=Nat. Neurosci. 
|volume=19 |issue=1 |pages=102–10 |date=January 2016 |pmid=26656643 |doi=10.1038/nn.4194 |pmc=4700510 }}</ref> using mice, and in 2017 Duke et al.<ref name="pmid28620075"/> using rats, subjected the rodents to contextual [[fear conditioning]], causing an especially strong [[Memory#Long-term memory|long-term memory]] to form. At 24 hours after the conditioning, in the [[hippocampus]] brain region of rats, the expression of 1,048 genes was down-regulated (usually associated with [[#Methylation of CpG islands stably silences genes|5mCpG]] in [[Promoter (genetics)|gene promoters]]) and the expression of 564 genes was up-regulated (often associated with hypomethylation of CpG sites in gene promoters). At 24 hours after training, 9.2% of the genes in the rat genome of [[hippocampus]] neurons were differentially methylated. However, while the hippocampus is essential for learning new information, it does not store information itself. In the mouse experiments of Halder, 1,206 differentially methylated genes were seen in the hippocampus one hour after contextual fear conditioning, but these altered methylations were reversed and not seen after four weeks. In contrast with the absence of long-term CpG methylation changes in the hippocampus, substantial differential CpG methylation could be detected in [[Anterior cingulate cortex|cortical]] neurons during memory maintenance. There were 1,223 differentially methylated genes in the anterior cingulate cortex of mice four weeks after contextual fear conditioning. === Demethylation at CpG sites requires ROS activity === [[File:Initiation of DNA demethylation at a CpG site.svg|thumb|250 px|Initiation of [[DNA demethylation]] at a CpG site.]] In adult somatic cells, DNA methylation typically occurs in the context of CpG dinucleotides ([[CpG sites]]), forming [[5-methylcytosine]]-pG, or 5mCpG. 
Reactive oxygen species (ROS) may attack guanine at the dinucleotide site, forming [[8-oxo-2'-deoxyguanosine|8-hydroxy-2'-deoxyguanosine]] (8-OHdG), and resulting in a 5mCp-8-OHdG dinucleotide site. The [[base excision repair]] enzyme [[oxoguanine glycosylase|OGG1]] targets 8-OHdG and binds to the lesion without immediate excision. OGG1, present at a 5mCp-8-OHdG site, recruits [[Tet methylcytosine dioxygenase 1|TET1]], and TET1 oxidizes the 5mC adjacent to the 8-OHdG. This initiates demethylation of 5mC.<ref name=Zhou>{{cite journal |vauthors=Zhou X, Zhuang Z, Wang W, He L, Wu H, Cao Y, Pan F, Zhao J, Hu Z, Sekhar C, Guo Z |title=OGG1 is essential in oxidative stress induced DNA demethylation |journal=Cell. Signal. |volume=28 |issue=9 |pages=1163–71 |date=September 2016 |pmid=27251462 |doi=10.1016/j.cellsig.2016.05.021 }}</ref> [[File:Demethylation of 5-methylcytosine.svg|thumb|250 px|Demethylation of [[5-Methylcytosine]] (5mC) in neuron DNA.]] As reviewed in 2018,<ref name="pmid29875631">{{cite journal |vauthors=Bayraktar G, Kreutz MR |title=The Role of Activity-Dependent DNA Demethylation in the Adult Brain and in Neurological Disorders |journal=Front Mol Neurosci |volume=11 |pages=169 |date=2018 |pmid=29875631 |pmc=5975432 |doi=10.3389/fnmol.2018.00169 |doi-access=free }}</ref> in brain neurons, 5mC is oxidized by the ten-eleven translocation (TET) family of dioxygenases ([[Tet methylcytosine dioxygenase 1|TET1]], [[Tet methylcytosine dioxygenase 2|TET2]], [[Tet methylcytosine dioxygenase 3|TET3]]) to generate [[5-hydroxymethylcytosine]] (5hmC). In successive steps, TET enzymes further oxidize 5hmC to generate 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). [[Thymine-DNA glycosylase]] (TDG) recognizes the intermediate bases 5fC and 5caC and cleaves the [[glycosidic bond]], resulting in an apyrimidinic site ([[AP site]]). 
In an alternative oxidative deamination pathway, 5hmC can be oxidatively deaminated by activation-induced cytidine deaminase/apolipoprotein B mRNA editing complex [[APOBEC3G|(AID/APOBEC)]] deaminases to form 5-hydroxymethyluracil (5hmU), or 5mC can be converted to [[thymine]] (Thy). 5hmU can be cleaved by TDG, single-strand-selective monofunctional uracil-DNA glycosylase 1 ([[SMUG1]]), Nei-Like DNA Glycosylase 1 ([[NEIL1]]), or methyl-CpG binding protein 4 ([[MBD4]]). AP sites and T:G mismatches are then repaired by base excision repair (BER) enzymes to yield [[cytosine]] (Cyt). Two reviews<ref name="pmid20649473">{{cite journal |vauthors=Massaad CA, Klann E |title=Reactive oxygen species in the regulation of synaptic plasticity and memory |journal=Antioxid. Redox Signal. |volume=14 |issue=10 |pages=2013–54 |date=May 2011 |pmid=20649473 |pmc=3078504 |doi=10.1089/ars.2010.3208 }}</ref><ref name="pmid27625575">{{cite journal |vauthors=Beckhauser TF, Francis-Oliveira J, De Pasquale R |title=Reactive Oxygen Species: Physiological and Physiopathological Effects on Synaptic Plasticity |journal=J Exp Neurosci |volume=10 |issue=Suppl 1 |pages=23–48 |date=2016 |pmid=27625575 |pmc=5012454 |doi=10.4137/JEN.S39887 }}</ref> summarize the large body of evidence for the critical and essential role of [[reactive oxygen species|ROS]] in [[memory]] formation. The [[DNA demethylation]] of thousands of CpG sites during memory formation depends on initiation by ROS. In 2016, Zhou et al.<ref name=Zhou/> showed that ROS have a central role in [[DNA demethylation]]. [[Tet methylcytosine dioxygenase 1|TET1]] is a key enzyme involved in demethylating 5mCpG. 
However, TET1 is only able to act on 5mCpG if an ROS has first acted on the guanine to form [[8-oxo-2'-deoxyguanosine|8-hydroxy-2'-deoxyguanosine]] (8-OHdG), resulting in a 5mCp-8-OHdG dinucleotide (see first figure in this section).<ref name=Zhou /> After formation of 5mCp-8-OHdG, the [[base excision repair]] enzyme [[oxoguanine glycosylase|OGG1]] binds to the 8-OHdG lesion without immediate excision. Adherence of OGG1 to the 5mCp-8-OHdG site recruits [[Tet methylcytosine dioxygenase 1|TET1]], allowing TET1 to oxidize the 5mC adjacent to 8-OHdG, as shown in the first figure in this section. This initiates the demethylation pathway shown in the second figure in this section. Altered protein expression in neurons, controlled by ROS-dependent demethylation of CpG sites in gene promoters within neuron DNA, is central to memory formation.<ref name="pmid20975755">{{cite journal |vauthors=Day JJ, Sweatt JD |title=DNA methylation and memory formation |journal=Nat. Neurosci. |volume=13 |issue=11 |pages=1319–23 |date=November 2010 |pmid=20975755 |pmc=3130618 |doi=10.1038/nn.2666 }}</ref> == CpG loss == CpG depletion has been observed in connection with DNA methylation of [[Transposable element|transposable elements]] (TEs): TEs are responsible not only for genome expansion but also for CpG loss in host DNA. TEs can act as "methylation centers": once a TE is inserted into host DNA, methylation spreads from the TE into the flanking DNA. This spreading may subsequently result in CpG loss over evolutionary time. Older TE insertions show greater CpG loss in the flanking DNA than younger ones. DNA methylation can therefore eventually lead to a noticeable loss of CpG sites in neighboring DNA. 
<ref name=":0">{{Cite journal|last1=Zhou|first1=Wanding|last2=Liang|first2=Gangning|last3=Molloy|first3=Peter L.|last4=Jones|first4=Peter A.|date= 11 August 2020|title=DNA methylation enables transposable element-driven genome expansion|journal=Proceedings of the National Academy of Sciences of the United States of America|volume=117|issue=32|pages=19359–19366|doi=10.1073/pnas.1921719117|issn=1091-6490|pmc=7431005|pmid=32719115|bibcode=2020PNAS..11719359Z |doi-access=free }}</ref> === Genome size and CpG ratio are negatively correlated === [[File:Genome Expansion.png|thumb|CpG methylation contributes to genome expansion and consequently to CpG depletion. This picture shows a genome with no TEs and unmethylated CpG sites; the insertion and transposition of a TE then lead to methylation and silencing of the TE. CpG methylation in turn leads to a decrease in CpG sites.<ref>{{cite journal|last1=Zhou|first1=Wanding|last2=Liang|first2=Gangning|last3=Molloy|first3=Peter L.|last4=Jones|first4=Peter A.|date=11 August 2020|title=DNA methylation enables transposable element-driven genome expansion|journal=Proceedings of the National Academy of Sciences of the United States of America|volume=117|issue=32|pages=19359–19366|doi=10.1073/pnas.1921719117|issn=1091-6490|pmc=7431005|pmid=32719115|bibcode=2020PNAS..11719359Z |doi-access=free }}</ref>]] There is generally an inverse correlation between genome size and number of CpG islands, as larger genomes typically have a greater number of transposable elements. Selective pressure against TEs is substantially reduced if their expression is suppressed via methylation; furthermore, TEs can act as "methylation centres", facilitating methylation of flanking DNA. Since methylation reduces selective pressure on nucleotide sequence, long-term methylation of CpG sites increases the accumulation of spontaneous cytosine-to-thymine transitions, thereby resulting in a loss of CpG sites. 
<ref name=":0" /> ==== Alu elements as promoters of CpG loss ==== Alu elements are known to be the most abundant type of transposable element. Some studies have used Alu elements as a way to study the factors responsible for genome expansion. Unlike LINEs and ERVs, Alu elements are CpG-rich over a longer stretch of sequence. Alus can act as methylation centers: insertion into host DNA can induce DNA methylation that spreads into the flanking DNA. This spreading accounts for considerable CpG loss and genome expansion.<ref name=":0" /> The effect accumulates over time: older Alu elements show more CpG loss in neighboring DNA than younger ones. == See also == *[[TLR9]], detector of unmethylated CpG sites *[[Biological clock (aging)|DNA methylation age]] == References == {{Reflist|30em}} {{Portal bar|Biology}} [[Category:Molecular genetics]] [[Category:DNA]]