Gene prediction
{{Short description|Process in computational biology}}
[[File:Gene structure.svg|thumbnail|350px|Structure of a [[eukaryotic]] gene]]
In [[computational biology]], '''gene prediction''' or '''gene finding''' is the process of identifying the regions of genomic DNA that encode [[genes]]. This includes protein-coding [[gene]]s as well as [[RNA gene]]s, and may also include the prediction of other functional elements such as [[regulatory regions]]. Gene finding is one of the first and most important steps in understanding the genome of a species once it has been [[Sequencing|sequenced]].

In its earliest days, "gene finding" was based on painstaking experimentation on living cells and organisms. Statistical analysis of the rates of [[homologous recombination]] of several different genes could determine their order on a given [[chromosome]], and information from many such experiments could be combined to create a [[genetic map]] specifying the rough location of known genes relative to each other. Today, with comprehensive genome sequences and powerful computational resources at the disposal of the research community, gene finding has been redefined as a largely computational problem.

Determining that a sequence is functional should be distinguished from determining [[Protein function prediction|the function]] of the gene or its product.
Predicting the function of a gene and confirming that the gene prediction is accurate still demands ''[[in vivo]]'' experimentation<ref name="Sleator2010">{{cite journal | vauthors = Sleator RD | title = An overview of the current status of eukaryote gene prediction strategies | journal = Gene | volume = 461 | issue = 1–2 | pages = 1–4 | date = August 2010 | pmid = 20430068 | doi = 10.1016/j.gene.2010.04.008 }}</ref> through [[gene knockout]] and other assays, although frontiers of [[bioinformatics]] research<ref>{{Cite journal|last1=Ejigu|first1=Girum Fitihamlak|last2=Jung|first2=Jaehee|date=2020-09-18|title=Review on the Computational Genome Annotation of Sequences Obtained by Next-Generation Sequencing|journal=Biology|volume=9|issue=9|page=295|doi=10.3390/biology9090295|issn=2079-7737|pmc=7565776|pmid=32962098|doi-access=free }}</ref> are making it increasingly possible to predict the function of a gene based on its sequence alone.

Gene prediction is one of the key steps in [[genome annotation]], following [[sequence assembly]], the filtering of non-coding regions and repeat masking.<ref name="Yandell2012">{{cite journal | vauthors = Yandell M, Ence D | title = A beginner's guide to eukaryotic genome annotation | journal = Nature Reviews. Genetics | volume = 13 | issue = 5 | pages = 329–42 | date = April 2012 | pmid = 22510764 | doi = 10.1038/nrg3174 | s2cid = 3352427 }}</ref>

Gene prediction is closely related to the so-called 'target search problem', which investigates how [[DNA-binding proteins]] ([[transcription factors]]) locate specific [[binding sites]] within the [[genome]].<ref name=redding2013>{{cite journal | vauthors = Redding S, Greene EC | title = How do proteins locate specific targets in DNA? | journal = Chemical Physics Letters | volume = 570 | pages = 1–11 | date = May 2013 | pmid = 24187380 | pmc = 3810971 | doi = 10.1016/j.cplett.2013.03.035 | bibcode = 2013CPL...570....1R }}</ref><ref name=sokolov2005>{{cite journal | vauthors = Sokolov IM, Metzler R, Pant K, Williams MC | title = Target search of N sliding proteins on a DNA | journal = Biophysical Journal | volume = 89 | issue = 2 | pages = 895–902 | date = August 2005 | pmid = 15908574 | pmc = 1366639 | doi = 10.1529/biophysj.104.057612 | bibcode = 2005BpJ....89..895S }}</ref>

Many aspects of structural gene prediction are based on current understanding of underlying [[Biochemistry|biochemical]] processes in the [[Cell (biology)|cell]] such as gene [[transcription (genetics)|transcription]], [[translation (biology)|translation]], [[protein–protein interaction]]s and [[Regulation of gene expression|regulation processes]], which are the subject of active research in the various [[omics]] fields such as [[transcriptomics]], [[proteomics]], [[metabolomics]], and more generally [[Structural genomics|structural]] and [[functional genomics]].
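
To make the idea of gene finding as a computational problem concrete, the following is a minimal sketch of an open-reading-frame (ORF) scan, one of the simplest signals used to locate candidate protein-coding regions. It is an illustration only, not the method of any particular gene finder: it assumes the standard start codon ATG and the stop codons TAA, TAG and TGA, scans only the forward strand, and ignores introns, so it is at best a rough approximation for compact prokaryote-like sequences. The names <code>find_orfs</code> and <code>min_codons</code> are hypothetical and chosen for this example.

```python
# Toy ORF scan: an illustrative sketch, not a real gene predictor.
# Assumptions: forward strand only, ATG start, standard stop codons, no introns.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(sequence, min_codons=3):
    """Return a list of (start, end, frame) tuples for candidate ORFs.

    start is the 0-based index of the first base of the start codon,
    end is the exclusive index just past the stop codon, and
    frame is the reading frame (0, 1 or 2).
    min_codons is the minimum number of codons before the stop codon,
    counting the start codon itself.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open start codon, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close this ORF and look for the next start
    return orfs


if __name__ == "__main__":
    dna = "CCATGAAATTTGGGTAACC"
    for start, end, frame in find_orfs(dna):
        print(start, end, frame, dna[start:end])
```

For the short example sequence in the script, the scan reports a single candidate in reading frame 2 spanning <code>ATGAAATTTGGGTAA</code>. Practical gene finders combine signals like this with statistical models, evidence from both strands, and, in eukaryotes, models of exon and intron structure.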