Functional genomics
== Bioinformatics methods for Functional genomics ==
Because of the large quantity of data produced by these techniques and the desire to find biologically meaningful patterns, [[bioinformatics]] is crucial to the analysis of functional genomics data. Examples of techniques in this class are [[data clustering]] or [[principal component analysis]] for unsupervised [[machine learning]] (class detection) as well as [[artificial neural network]]s or [[support vector machine]]s for supervised machine learning (class prediction, [[Statistical classification|classification]]). Functional enrichment analysis is used to determine the extent of over- or under-expression (positive or negative regulators in the case of RNAi screens) of functional categories relative to a background set. [[Gene ontology]]-based enrichment analysis is provided by [[Gene set enrichment analysis#DAVID|DAVID]] and [[gene set enrichment analysis]] (GSEA),<ref>{{cite journal | vauthors = Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP | title = Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles | journal = Proceedings of the National Academy of Sciences of the United States of America | volume = 102 | issue = 43 | pages = 15545–50 | date = October 2005 | pmid = 16199517 | pmc = 1239896 | doi = 10.1073/pnas.0506580102 | bibcode = 2005PNAS..10215545S | doi-access = free }}</ref> pathway-based analysis by Ingenuity<ref>{{cite web |url=http://www.ingenuity.com/ |archive-url=https://web.archive.org/web/19990125100225/http://www.ingenuity.com/ |url-status=dead |archive-date=1999-01-25 |title=Ingenuity Systems |access-date=2007-12-31 }}</ref> and Pathway Studio,<ref>{{cite web |url=http://www.ariadnegenomics.com/products/pathway-studio/ |title=Ariadne Genomics: Pathway Studio |access-date=2007-12-31 |archive-url=https://web.archive.org/web/20071230035556/http://www.ariadnegenomics.com/products/pathway-studio |archive-date=2007-12-30 |url-status=dead }}</ref> and protein complex-based analysis by COMPLEAT.<ref>{{cite journal | vauthors = Vinayagam A, Hu Y, Kulkarni M, Roesel C, Sopko R, Mohr SE, Perrimon N | title = Protein complex-based analysis framework for high-throughput data sets | journal = Science Signaling | volume = 6 | issue = 264 | pages = rs5 | date = February 2013 | pmid = 23443684 | pmc = 3756668 | doi = 10.1126/scisignal.2003629 | url = http://www.flyrnai.org/compleat/ }}</ref>

[[File:Phydms.jpg|thumb|An overview of a phydms workflow]]
New computational methods have been developed for understanding the results of a deep mutational scanning experiment. 'phydms' compares the result of a deep mutational scanning experiment to a phylogenetic tree.<ref name="Hilton_2017">{{cite journal | vauthors = Hilton SK, Doud MB, Bloom JD | title = phydms: software for phylogenetic analyses informed by deep mutational scanning | journal = PeerJ | volume = 5 | pages = e3657 | date = 2017 | pmid = 28785526 | pmc = 5541924 | doi = 10.7717/peerj.3657 | doi-access = free }}</ref> This allows the user to infer whether natural selection applies constraints on a protein similar to those indicated by the deep mutational scan. This may allow an experimenter to choose between different experimental conditions based on how well they reflect nature.

Deep mutational scanning has also been used to infer protein-protein interactions.<ref>{{cite journal | vauthors = Diss G, Lehner B | title = The genetic landscape of a physical interaction | journal = eLife | volume = 7 | date = April 2018 | pmid = 29638215 | pmc = 5896888 | doi = 10.7554/eLife.32472 | doi-access = free }}</ref> The authors used a thermodynamic model to predict the effects of mutations in different parts of a dimer. Deep mutational scanning can also be used to infer protein structure.
Strong positive epistasis between two mutations in a deep mutational scan can indicate that the two mutated positions are close to each other in 3-D space. This information can then be used to infer protein structure. A proof of principle of this approach was shown by two groups using the protein GB1.<ref>{{cite journal | vauthors = Schmiedel JM, Lehner B | title = Determining protein structures using deep mutagenesis | journal = Nature Genetics | volume = 51 | issue = 7 | pages = 1177–1186 | date = July 2019 | pmid = 31209395 | pmc = 7610650 | doi = 10.1038/s41588-019-0431-x | doi-access = free }}</ref><ref>{{cite journal | vauthors = Rollins NJ, Brock KP, Poelwijk FJ, Stiffler MA, Gauthier NP, Sander C, Marks DS | title = Inferring protein 3D structure from deep mutation scans | journal = Nature Genetics | volume = 51 | issue = 7 | pages = 1170–1176 | date = July 2019 | pmid = 31209393 | pmc = 7295002 | doi = 10.1038/s41588-019-0432-9 }}</ref>

Interpreting the results of massively parallel reporter assay (MPRA) experiments has required machine learning approaches. A gapped k-mer SVM model has been used to infer the k-mers that are enriched within cis-regulatory sequences with high activity compared to sequences with lower activity.<ref>{{cite journal | vauthors = Ghandi M, Lee D, Mohammad-Noori M, Beer MA | title = Enhanced regulatory sequence prediction using gapped k-mer features | journal = PLOS Computational Biology | volume = 10 | issue = 7 | pages = e1003711 | date = July 2014 | pmid = 25033408 | pmc = 4102394 | doi = 10.1371/journal.pcbi.1003711 | doi-access = free | bibcode = 2014PLSCB..10E3711G }}</ref> These models provide high predictive power.
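The idea of gapped k-mer features can be illustrated with a minimal Python sketch. It is not the published gkm-SVM implementation, which uses a normalised kernel and far more efficient algorithms. The function names, the default window length and the use of <code>-</code> as a gap symbol are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def gapped_kmer_counts(sequence, word_length=4, informative=3, gap_char="-"):
    """Count gapped k-mers in a DNA sequence.

    Each window of `word_length` bases contributes one feature for every
    way of keeping `informative` positions and masking the rest with
    `gap_char`. Parameter names and defaults are illustrative assumptions.
    """
    if not 0 < informative <= word_length:
        raise ValueError("informative must be between 1 and word_length")
    sequence = sequence.upper()
    # Every choice of positions within a window that stay informative.
    keep_sets = [set(c) for c in combinations(range(word_length), informative)]
    counts = Counter()
    for start in range(len(sequence) - word_length + 1):
        window = sequence[start:start + word_length]
        for keep in keep_sets:
            feature = "".join(
                base if i in keep else gap_char
                for i, base in enumerate(window)
            )
            counts[feature] += 1
    return counts


def gapped_kmer_kernel(seq_a, seq_b, **kwargs):
    """Unnormalised similarity: dot product of two gapped k-mer count vectors."""
    counts_a = gapped_kmer_counts(seq_a, **kwargs)
    counts_b = gapped_kmer_counts(seq_b, **kwargs)
    return sum(n * counts_b[feature] for feature, n in counts_a.items())
```

A support vector machine trained on such a kernel can then weight the gapped k-mers that distinguish high-activity from low-activity sequences. Tolerating mismatches at the masked positions is what lets these features capture degenerate binding-site motifs.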
Deep learning and random forest approaches have also been used to interpret the results of these high-dimensional experiments.<ref>{{cite journal | vauthors = Li Y, Shi W, Wasserman WW | title = Genome-wide prediction of cis-regulatory regions using supervised deep learning methods | journal = BMC Bioinformatics | volume = 19 | issue = 1 | pages = 202 | date = May 2018 | pmid = 29855387 | pmc = 5984344 | doi = 10.1186/s12859-018-2187-1 | doi-access = free }}</ref> These models are beginning to improve understanding of the role of [[non-coding DNA]] in gene regulation.
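As a rough illustration of the functional enrichment analysis described in this section, the following Python sketch computes a one-sided hypergeometric p-value for the over-representation of a functional category among screen hits. This is one common choice of test, assumed here for illustration. It is not the specific statistic implemented by DAVID or GSEA, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(hits_in_category, hits_total, category_size, background_size):
    """One-sided hypergeometric p-value for over-representation.

    Probability of observing at least `hits_in_category` category members
    when `hits_total` genes are drawn without replacement from a background
    of `background_size` genes, `category_size` of which belong to the
    category. All names are illustrative assumptions.
    """
    max_overlap = min(hits_total, category_size)
    total = comb(background_size, hits_total)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(category_size, k)
        * comb(background_size - category_size, hits_total - k)
        for k in range(hits_in_category, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 8 of 50 hits fall in a 100-gene category,
# against a background of 20,000 genes.
p = enrichment_p_value(8, 50, 100, 20000)
```

In practice, many categories are tested at once, so the resulting p-values would normally be corrected for multiple testing before interpretation.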