Structural alignment
===Mammoth===
MAMMOTH<ref name="Mammoth">{{ cite journal | pmid=12381844 | first= AR | last = Ortiz | author2 = Strauss CE | author3 = Olmea O. | title=MAMMOTH (matching molecular models obtained from theory): an automated method for model comparison. | journal=Protein Science | year=2002 |volume=11 | issue=11 |pages=2606–2621 |doi=10.1110/ps.0215902 | pmc= 2373724 |doi-access=free }}</ref> approaches the alignment problem with a different objective from almost all other methods. Rather than seeking the alignment that superimposes the largest number of residues, it seeks the subset of the structural alignment that is least likely to occur by chance. To do this, it flags the residues of a local motif alignment that simultaneously satisfy four more stringent criteria: (1) local structure overlap, (2) regular secondary structure, (3) 3D superposition, and (4) the same ordering in the primary sequence. From the number of residues with high-confidence matches and the size of the protein, it computes an expectation value (E-value) for obtaining the outcome by chance. It excels at matching remote homologs, particularly structures generated by ab initio structure prediction against structure families such as SCOP, because it emphasizes extracting a statistically reliable sub-alignment rather than achieving the maximal sequence alignment or maximal 3D superposition.<ref name="Malmstrom" /><ref name="robetta">{{cite journal |journal=Nucleic Acids Research |year= 2004 |volume= 32 |doi= 10.1093/nar/gkh468 |pmid= 15215442 |title=Protein structure prediction and analysis using the Robetta server |author1=David E. Kim |author2=Dylan Chivian |author3=David Baker |issue= Web Server issue |pages= W526–W531 |pmc= 441606 |doi-access= free }}</ref>

For every overlapping window of 7 consecutive residues, it computes the set of displacement-direction unit vectors between adjacent C-alpha atoms.
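As a rough illustration of this local descriptor, the following Python sketch (using NumPy) builds the six unit vectors of each 7-residue window and compares two windows by a unit-vector RMS (URMS), the measure used in the motif comparison step. The function names, the input layout (an array of C-alpha coordinates), and the use of the raw URMS value rather than any derived similarity score are assumptions made for illustration; this is not MAMMOTH's own code.

```python
import numpy as np

WINDOW = 7  # residues per local motif, as stated in the text


def window_unit_vectors(ca, window=WINDOW):
    """Unit vectors between adjacent C-alpha atoms for every overlapping window.

    ca: (n, 3) array-like of C-alpha coordinates (hypothetical input layout).
    Returns an array of shape (n - window + 1, window - 1, 3).
    """
    ca = np.asarray(ca, dtype=float)
    if len(ca) < window:
        raise ValueError("chain is shorter than one window")
    steps = np.diff(ca, axis=0)  # displacement between consecutive residues
    units = steps / np.linalg.norm(steps, axis=1, keepdims=True)
    n_windows = len(ca) - window + 1
    return np.stack([units[i:i + window - 1] for i in range(n_windows)])


def urms(u, v):
    """RMS distance between two unit-vector sets after the best rotation.

    The vectors are treated as points on the unit sphere, so only a rotation
    about the origin is optimized (no translation).
    """
    h = u.T @ v  # 3x3 correlation matrix of the two sets
    a, s, bt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(a @ bt))  # exclude improper rotations
    overlap = s[0] + s[1] + d * s[2]  # largest achievable sum of dot products
    msd = (np.sum(u * u) + np.sum(v * v) - 2.0 * overlap) / len(u)
    return float(np.sqrt(max(msd, 0.0)))
```

A chain of ''n'' residues yields ''n'' − 6 such descriptors, and evaluating <code>urms</code> for every pair of windows from two proteins fills an all-against-all matrix. How MAMMOTH converts URMS values into alignment scores is not reproduced here.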
All-against-all local motifs are compared using the URMS (unit-vector root mean square) score. These values become the pair alignment score entries for dynamic programming, which produces a seed pairwise residue alignment. The second phase uses a modified MaxSub algorithm: a single aligned pair of 7-residue segments, one from each protein, is used to orient the two full-length protein structures so as to maximally superimpose just these 7 C-alpha atoms; in this orientation it then scans for any additional aligned pairs that are close in 3D. It re-orients the structures to superimpose this expanded set and iterates until no more pairs coincide in 3D. This process is restarted for every 7-residue window in the seed alignment. The output is the maximal number of atoms found from any of these initial seeds. This statistic is converted to a calibrated E-value for the similarity of the proteins.

MAMMOTH makes no attempt to re-iterate the initial alignment or to extend the high-quality subset. Therefore, the seed alignment it displays cannot be fairly compared to the alignments of DALI or TM-align, as it was formed simply as a heuristic to prune the search space. (It can be used if one wants an alignment based solely on local structure-motif similarity, agnostic of long-range rigid-body atomic alignment.) Because of that same parsimony, it is well over ten times faster than DALI, CE and TM-align.<ref name="foldclass">{{cite journal |title=Efficient SCOP-fold classification and retrieval using index-based protein substructure alignments |author1=Pin-Hao Chi |author2=Bin Pang |author3=Dmitry Korkin |author4=Chi-Ren Shyu |journal=Bioinformatics |volume=25 | issue=19 |year=2009 |pages=2559–2565 |doi=10.1093/bioinformatics/btp474 |pmid=19667079 |doi-access=free }}</ref> It is often used in conjunction with these slower tools to pre-screen large databases, extracting just the structures with the best E-values for more exhaustive superposition or expensive calculations.
<ref name="grishin04">{{cite journal |journal=BMC Bioinformatics |year= 2004 |volume= 5 |issue= 197 | doi=10.1186/1471-2105-5-197 |pmid= 15598351 |title=SCOPmap: Automated assignment of protein structures to evolutionary superfamilies |author1=Sara Cheek |author2=Yuan Qi |author3=Sri Krishna |author4=Lisa N Kinch |author5=Nick V Grishin |page= 197 |pmc= 544345 |doi-access=free }}</ref><ref name="fssa">{{cite journal |title=FSSA: a novel method for identifying functional signatures from structural alignments |author1=Kai Wang |author2=Ram Samudrala |journal=Bioinformatics |year=2005 |volume=21 |issue=13 |pages=2969–2977 |doi=10.1093/bioinformatics/bti471 |pmid=15860561 |doi-access=free }}</ref> It has been particularly successful at analyzing "decoy" structures from ab initio structure prediction.<ref name="casp11">{{cite journal |vauthors=Kryshtafovych A, Monastyrskyy B, Fidelis K |title=CASP11 statistics and the prediction center evaluation system |journal=Proteins |year= 2016 |volume=84 |issue=Suppl 1 |pages=15–19 | doi=10.1002/prot.25005 |pmid=26857434 |pmc=5479680 |doi-access=free }}</ref><ref name="Malmstrom" /><ref name="robetta" /> These decoys are notorious for getting the local fragment motif structure correct and forming some kernels of correct 3D tertiary structure, while getting the full-length tertiary structure wrong. In this twilight regime of remote homology, MAMMOTH's E-values for the CASP<ref name="casp11" /> protein structure prediction evaluation have been shown to be significantly more correlated with human ranking than the scores of SSAP or DALI.<ref name="Mammoth" /> MAMMOTH's ability to extract multi-criteria partial overlaps with proteins of known structure and rank them with proper E-values, combined with its speed, facilitates scanning vast numbers of decoy models against the PDB database to identify the most likely correct decoys based on their remote homology to known proteins.
<ref name="Malmstrom">{{cite journal |title=Superfamily Assignments for the Yeast Proteome through Integration of Structure Prediction with the Gene Ontology |author1=Lars Malmström |author2=Michael Riffle |author3=Charlie EM Strauss |author4=Dylan Chivian |author5=Trisha N Davis |author6=Richard Bonneau |author7=David Baker |year=2007 |journal=PLOS Biol | volume=5 |issue=4 |pages= e76 |doi=10.1371/journal.pbio.0050076 | pmid=17373854 | pmc=1828141 |doi-access=free }}</ref>
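The second, MaxSub-like phase described in this section can be sketched in Python with NumPy as follows. This is a simplified illustration under stated assumptions, not MAMMOTH's implementation: the 4.0 Å distance cutoff, the function names, the representation of the seed alignment as a list of index pairs, and the rule that the matched set only grows between iterations are all choices made for this example.

```python
import numpy as np


def kabsch(p, q):
    """Rotation r and translation t minimizing the RMSD of (p @ r.T + t) to q."""
    pc, qc = p.mean(axis=0), q.mean(axis=0)
    u, _, vt = np.linalg.svd((p - pc).T @ (q - qc))
    d = np.sign(np.linalg.det(vt.T @ u.T))  # guard against reflections
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return r, qc - r @ pc


def extend_seed(a, b, pairs, start, window=7, cutoff=4.0, max_iter=100):
    """Grow a matched set from one 7-pair window of the seed alignment.

    a, b: (n, 3) and (m, 3) C-alpha coordinates of the two proteins.
    pairs: aligned residue index pairs (i, j) from the seed alignment.
    start: index into pairs where the seed window begins.
    cutoff: assumed distance threshold in angstroms (hypothetical value).
    Returns a boolean mask over pairs marking the matched set.
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    pairs = np.asarray(pairs)
    active = np.zeros(len(pairs), dtype=bool)
    active[start:start + window] = True
    for _ in range(max_iter):
        # Superimpose on the current set, then look for pairs that coincide.
        r, t = kabsch(a[pairs[active, 0]], b[pairs[active, 1]])
        moved = a[pairs[:, 0]] @ r.T + t
        dist = np.linalg.norm(moved - b[pairs[:, 1]], axis=1)
        grown = active | (dist <= cutoff)
        if np.array_equal(grown, active):  # no new pairs: converged
            break
        active = grown
    return active


def best_core_size(a, b, pairs, window=7, cutoff=4.0):
    """Largest matched set over all seed windows (needs len(pairs) >= window)."""
    sizes = [int(extend_seed(a, b, pairs, s, window, cutoff).sum())
             for s in range(len(pairs) - window + 1)]
    return max(sizes)
```

Restarting from every window keeps the search cheap, since each restart needs only a few superpositions of small point sets. The conversion of the resulting count into a calibrated E-value is specific to MAMMOTH and is not modeled here.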