Gene prediction
== Metagenomic gene prediction ==
[[Metagenomics]] is the study of genetic material recovered from the environment, resulting in sequence information from a pool of organisms. Predicting genes is useful for [[Metagenomics#Comparative metagenomics|comparative metagenomics]].

Like single-organism tools, metagenomic gene prediction tools fall into two basic categories: sequence similarity approaches (such as MEGAN4) and ab initio techniques (such as Glimmer-MG).

Glimmer-MG<ref name="Kelley2012">{{cite journal | vauthors = Kelley DR, Liu B, Delcher AL, Pop M, Salzberg SL | title = Gene prediction with Glimmer for metagenomic sequences augmented by classification and clustering | journal = Nucleic Acids Research | volume = 40 | issue = 1 | pages = e9 | date = January 2012 | pmid = 22102569 | pmc = 3245904 | doi = 10.1093/nar/gkr1067 }}</ref> is an extension to [[GLIMMER]] that relies mostly on an ab initio approach for gene finding, using training sets from related organisms. The prediction strategy is augmented by classifying and clustering the gene data sets before applying ab initio gene prediction methods. The data is clustered by species. This classification step leverages techniques from metagenomic phylogenetic classification. Examples of software for this purpose are Phymm, which uses interpolated Markov models, and PhymmBL, which integrates BLAST into the classification routines.

MEGAN4<ref name="Huson2011">{{cite journal | vauthors = Huson DH, Mitra S, Ruscheweyh HJ, Weber N, Schuster SC | title = Integrative analysis of environmental sequences using MEGAN4 | journal = Genome Research | volume = 21 | issue = 9 | pages = 1552–60 | date = September 2011 | pmid = 21690186 | pmc = 3166839 | doi = 10.1101/gr.120618.111 }}</ref> uses a sequence similarity approach, performing local alignment against databases of known sequences, but also attempts to classify sequences using additional information on functional roles, biological pathways and enzymes.
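
The idea of classifying reads with interpolated Markov models, as mentioned above for Phymm, can be illustrated with a minimal Python sketch. It is not Phymm's implementation: the function names, the add-one smoothing, and the count-based weighting constant are assumptions made only for this illustration. One model is trained per reference group, and a read is assigned to the model that gives it the highest log-likelihood.

```python
import math
from collections import defaultdict

BASES = "ACGT"


def train_counts(sequences, max_order):
    """For every context of length 0..max_order, count which base follows it."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i, base in enumerate(seq):
            for k in range(max_order + 1):
                if i - k < 0:
                    break
                counts[seq[i - k:i]][base] += 1
    return counts


def interpolated_prob(counts, context, base, pseudo=1.0, damping=10.0):
    """Blend estimates from order 0 up to len(context).

    Longer contexts get more weight the more often they were observed.
    The weighting scheme (total / (total + damping)) is a hypothetical
    choice for this sketch, not the scheme used by any published tool.
    """
    zero = counts.get("", {})
    total = sum(zero.values())
    # Order-0 estimate with pseudocounts, so the probability is never zero.
    prob = (zero.get(base, 0) + pseudo) / (total + pseudo * len(BASES))
    for k in range(1, len(context) + 1):
        ctx_counts = counts.get(context[len(context) - k:], {})
        total = sum(ctx_counts.values())
        if total == 0:
            break  # context never seen: keep the lower-order estimate
        weight = total / (total + damping)
        prob = weight * (ctx_counts.get(base, 0) / total) + (1 - weight) * prob
    return prob


def log_likelihood(counts, read, max_order):
    """Sum of log-probabilities of each base given its preceding context."""
    score = 0.0
    for i, base in enumerate(read):
        context = read[max(0, i - max_order):i]
        score += math.log(interpolated_prob(counts, context, base))
    return score


def classify(read, models, max_order):
    """Return the name of the model under which the read is most likely."""
    return max(models, key=lambda name: log_likelihood(models[name], read, max_order))


if __name__ == "__main__":
    # Toy training data standing in for reference genomes of two groups.
    models = {
        "gc_rich": train_counts(["GCGCGGCCGCGCGGGCCGCG"], max_order=3),
        "at_rich": train_counts(["ATATTAATATTTAAATATTA"], max_order=3),
    }
    print(classify("GCGGCC", models, max_order=3))
```

In this sketch the blending step is what makes the model "interpolated": a rarely seen long context contributes little, so the estimate falls back toward shorter contexts instead of relying on sparse counts.
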
As in single-organism gene prediction, sequence similarity approaches are limited by the size of the database. FragGeneScan and MetaGeneAnnotator are popular gene prediction programs based on [[hidden Markov model]]s. These predictors account for sequencing errors and partial genes, and work on short reads. Another fast and accurate tool for gene prediction in metagenomes is MetaGeneMark.<ref name="Zhu2010">{{cite journal | vauthors = Zhu W, Lomsadze A, Borodovsky M | title = Ab initio gene identification in metagenomic sequences | journal = Nucleic Acids Research | volume = 38 | issue = 12 | pages = e132 | date = July 2010 | pmid = 20403810 | pmc = 2896542 | doi = 10.1093/nar/gkq275 }}</ref> This tool is used by the DOE Joint Genome Institute to annotate IMG/M, the largest metagenome collection to date.
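
How a hidden Markov model can label a sequence as coding or noncoding can be sketched with a toy two-state model decoded by the Viterbi algorithm. The states, transition probabilities, and emission probabilities below are hypothetical values chosen for illustration; tools such as FragGeneScan and MetaGeneAnnotator use far richer models (for example, codon structure and sequencing-error handling) that this sketch does not attempt to reproduce.

```python
import math

STATES = ("noncoding", "coding")

# Hypothetical parameters, for illustration only.
START = {"noncoding": 0.5, "coding": 0.5}
TRANS = {
    "noncoding": {"noncoding": 0.9, "coding": 0.1},
    "coding": {"noncoding": 0.1, "coding": 0.9},
}
EMIT = {
    "noncoding": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
    "coding": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
}


def viterbi(seq):
    """Return the most probable state path for seq under the toy model."""
    if not seq:
        return []
    # Log-scores of the best path ending in each state at the first base.
    scores = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for base in seq[1:]:
        new_scores = {}
        pointers = {}
        for s in STATES:
            # Best predecessor state for arriving in state s at this base.
            best_prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][base])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    for base, state in zip("ATATATATGCGCGCGCGC", viterbi("ATATATATGCGCGCGCGC")):
        print(base, state)
```

Working in log space avoids numerical underflow on long sequences, and the high self-transition probabilities discourage the path from switching state on every base.
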