Genomics
== History ==

=== Etymology ===
From the Greek ΓΕΝ<ref name = "Liddell_2013" /> ''gen'', "gene" (gamma, epsilon, nu) meaning "become, create, creation, birth", and subsequent variants: genealogy, genesis, genetics, genic, genomere, genotype, genus etc. While the word ''genome'' (from the [[German language|German]] ''Genom'', attributed to [[Hans Winkler]]) was in use in [[English language|English]] as early as 1926,<ref name = "OED_Genome"/> the term ''genomics'' was coined by Tom Roderick, a [[geneticist]] at the [[Jackson Laboratory]] ([[Bar Harbor, Maine]]), over beers with [[James E. Womack]], Tom Shows and [[Stephen J. O'Brien|Stephen O’Brien]] at a meeting held in [[Maryland]] on the mapping of the human genome in 1986.<ref name = "Yadav_2007"/> It was used first as the name for a [[Genomics (journal)|new journal]] and then for a whole new scientific discipline.<ref name = "O'Brien_2022" />

=== Early sequencing efforts ===
Following [[Rosalind Franklin]]'s confirmation of the helical structure of DNA, [[James D. Watson]] and [[Francis Crick]]'s publication of the structure of DNA in 1953 and [[Fred Sanger]]'s publication of the [[amino acid]] sequence of insulin in 1955, nucleic acid sequencing became a major target of early [[molecular biology|molecular biologists]].<ref name = "Ankeny_2003"/> In 1964, [[Robert W. Holley]] and colleagues published the first nucleic acid sequence ever determined, the [[ribonucleic acid|ribonucleotide]] sequence of [[alanine]] [[tRNA|transfer RNA]].<ref name = "Holley_1965a"/><ref name = "Holley_1965b"/> Extending this work, [[Marshall Nirenberg]] and [[Philip Leder]] revealed the triplet nature of the [[genetic code]] and were able to determine the sequences of 54 out of 64 [[codons]] in their experiments.<ref name = "Nuremberg_1965"/> In 1972, [[Walter Fiers]] and his team at the Laboratory of Molecular Biology of the [[University of Ghent]] ([[Ghent]], [[Belgium]]) were the first to determine the sequence of a gene: the gene for [[Bacteriophage MS2]] coat protein.<ref name = "Min_1972"/> Fiers' group expanded on their MS2 coat protein work, determining the complete nucleotide sequences of bacteriophage MS2-RNA (whose genome encodes just four genes in 3569 [[base pair]]s [bp]) and [[SV40|Simian virus 40]] in 1976 and 1978, respectively.<ref name = "Fiers_1976"/><ref name = "Fiers_1978"/>

=== DNA-sequencing technology developed ===
{{multiple image
 | width = 125
 | image1 = Frederick Sanger2.jpg
 | caption1 = Frederick Sanger
 | image2 = WalterGilbert2.jpg
 | caption2 = Walter Gilbert
 | footer = [[Frederick Sanger]] and [[Walter Gilbert]] shared half of the 1980 Nobel Prize in Chemistry for independently developing methods for the sequencing of DNA.
}}
In addition to his seminal work on the amino acid sequence of insulin, [[Frederick Sanger]] and his colleagues played a key role in the development of DNA sequencing techniques that enabled the establishment of comprehensive genome sequencing projects.<ref name = "Pevsner_2009"/> In 1975, he and Alan Coulson published a sequencing procedure using DNA polymerase with radiolabelled nucleotides that he called the ''Plus and Minus technique''.<ref name = "Tamarin_2004"/><ref name = "Sanger_1980"/> This involved two closely related methods that generated short oligonucleotides with defined 3' termini.
These could be fractionated by [[electrophoresis]] on a [[polyacrylamide]] gel (called polyacrylamide gel electrophoresis) and visualised using autoradiography. The procedure could sequence up to 80 nucleotides in a single run and was a major improvement, but it remained very laborious. Nevertheless, in 1977 his group was able to sequence most of the 5,386 nucleotides of the single-stranded [[bacteriophage]] [[Phi X 174|φX174]], completing the first fully sequenced DNA-based genome.<ref name = "Sanger_1977"/> The refinement of the ''Plus and Minus'' method resulted in the chain-termination, or [[Sanger method]] (see [[#Shotgun sequencing|below]]), which formed the basis of the techniques of DNA sequencing, genome mapping, data storage, and bioinformatic analysis most widely used in the following quarter-century of research.<ref name = "Kaiser_2003"/><ref name = "Sanger_1977a"/> In the same year, [[Walter Gilbert]] and [[Allan Maxam]] of [[Harvard University]] independently developed the [[Maxam-Gilbert sequencing|Maxam-Gilbert]] method (also known as the ''chemical method'') of DNA sequencing, a less efficient method that involves the preferential cleavage of DNA at known bases.<ref name = "Gilbert_1977"/><ref name = "Darden_2010"/> For their groundbreaking work in the sequencing of nucleic acids, Gilbert and Sanger shared one half of the 1980 [[Nobel Prize]] in chemistry; the other half went to [[Paul Berg]] ([[recombinant DNA]]).

=== Complete genomes ===
The advent of these technologies resulted in a rapid intensification in the scope and speed of completion of [[Genome project|genome sequencing projects]].
The first complete genome sequence of a [[eukaryotic organelle]], the human [[mitochondrion]] (16,568 bp, about 16.6 kb [kilobase]), was reported in 1981,<ref name = "Anderson_1981"/> and the first [[chloroplast]] genomes followed in 1986.<ref name = "Shinozaki_1986"/><ref name = "Ohyama_1986"/> In 1992, the first eukaryotic [[chromosome]], chromosome III of brewer's yeast ''[[Saccharomyces cerevisiae]]'' (315 kb), was sequenced.<ref name = "Oliver_1992"/> The first free-living organism to have its genome sequenced was ''[[Haemophilus influenzae]]'' (1.8 Mb [megabase]) in 1995.<ref name = "Fleischmann_1995"/> The following year, a consortium of researchers from laboratories across [[North America]], [[Europe]], and [[Japan]] announced the first complete genome sequence of a eukaryote, ''[[Saccharomyces cerevisiae|S. cerevisiae]]'' (12.1 Mb), and since then genomes have continued to be sequenced at an exponentially growing pace.<ref name = "Goffeau_1996"/> {{As of|2011|October}}, complete sequences are available for 2,719 [[virus]]es, 1,115 [[archaea]] and [[bacteria]], and 36 [[eukaryote]]s, of which about half are [[fungi]].<ref name = "Viruses_2011"/><ref name = "Entrez_2011"/>

[[File:Number of prokaryotic genomes and sequencing costs.svg|left|thumb|300px|The number of genome projects has increased as technological improvements continue to lower the cost of sequencing. '''(A)''' Exponential growth of genome sequence databases since 1995. '''(B)''' The cost in US Dollars (USD) to sequence one million bases. '''(C)''' The cost in USD to sequence a 3,000 Mb (human-sized) genome on a log-transformed scale.|alt="Hockey stick" graph showing the exponential growth of public sequence databases.]]

Most of the microorganisms whose genomes have been completely sequenced are problematic [[pathogen]]s, such as ''[[Haemophilus influenzae]]'', which has resulted in a pronounced bias in their phylogenetic distribution compared to the breadth of microbial diversity.<ref name = "Zimmer_2009a"/><ref name = "Geba_2009"/> Of the other sequenced species, most were chosen because they were well-studied model organisms or promised to become good models. Yeast (''[[Saccharomyces cerevisiae]]'') has long been an important [[model organism]] for the [[eukaryotic cell]], while the fruit fly ''[[Drosophila melanogaster]]'' has been a very important tool (notably in early pre-molecular [[genetics]]). The worm ''[[Caenorhabditis elegans]]'' is an often-used simple model for [[multicellular organism]]s. The zebrafish ''[[Brachydanio rerio]]'' is used for many developmental studies on the molecular level, and the plant ''[[Arabidopsis thaliana]]'' is a model organism for flowering plants.
The [[Japanese pufferfish]] (''[[Takifugu rubripes]]'') and the [[Dichotomyctere nigroviridis|spotted green pufferfish]] (''[[Tetraodon nigroviridis]]'') are interesting because of their small and compact genomes, which contain very little [[noncoding DNA]] compared to most species.<ref name = "BBC_2004"/><ref name = "Yue_2001"/> The mammals dog (''[[Canis familiaris]]''),<ref name = "nhgriDog2004"/> brown rat (''[[Rattus norvegicus]]''), mouse (''[[Mus musculus]]''), and chimpanzee (''[[Pan troglodytes]]'') are all important model animals in medical research.<ref name = "Darden_2010"/>

A rough draft of the [[human genome]] was completed by the [[Human Genome Project]] in early 2001, to much fanfare.<ref name = "McElheny_2010"/> This project, completed in 2003, sequenced the entire genome of one specific person, and by 2007 this sequence was declared "finished" (less than one error in 20,000 bases and all chromosomes assembled).<ref name = "McElheny_2010"/> In the years since then, the genomes of many other individuals have been sequenced, partly under the auspices of the [[1000 Genomes Project]], which announced the sequencing of 1,092 genomes in October 2012.<ref name = "1000genomes2012"/> Completion of this project was made possible by the development of dramatically more efficient sequencing technologies and required the commitment of significant [[bioinformatics]] resources from a large international collaboration.<ref name = "Neilsen_2012"/> The continued analysis of human genomic data has profound political and social repercussions for human societies.<ref name = "Barnes_2008"/>

=== The "omics" revolution ===
[[Image:Metabolomics schema.png|thumb|General schema showing the relationships of the [[genome]], [[transcriptome]], [[proteome]], and [[metabolome]] ([[lipidome]])]]
{{Main|Omics|Human proteome project}}
The English-language [[neologism]] '''omics''' informally refers to a field of study in biology ending in ''-omics'', such as genomics, [[proteomics]] or [[metabolomics]]. The related suffix '''-ome''' is used to address the objects of study of such fields, such as the [[genome]], [[proteome]], or [[metabolome]] ([[lipidome]]) respectively. The suffix ''-ome'' as used in molecular biology refers to a ''totality'' of some sort; similarly, '''omics''' has come to refer generally to the study of large, comprehensive biological data sets. While the growth in the use of the term has led some scientists ([[Jonathan Eisen]], among others<ref name = "Eisen_2012"/>) to claim that it has been oversold,<ref name = "wsj_2012"/> it reflects the change in orientation towards the quantitative analysis of the complete or near-complete assortment of all the constituents of a system.<ref name = "Scudellari_2011" /> In the study of [[Symbiosis|symbioses]], for example, researchers who were once limited to the study of a single gene product can now simultaneously compare the total complement of several types of biological molecules.<ref name = "Chaston_2012"/><ref name = "McCutcheon_2011"/>