==Structural alignment==
{{Main|Structural alignment}}
Structural alignments, which are usually specific to protein and sometimes RNA sequences, use information about the [[secondary structure|secondary]] and [[tertiary structure]] of the protein or RNA molecule to aid in aligning the sequences. These methods can be used for two or more sequences and typically produce local alignments; however, because they depend on the availability of structural information, they can only be used for sequences whose corresponding structures are known (usually through [[X-ray crystallography]] or [[NMR spectroscopy]]). Because both protein and RNA structures are more evolutionarily conserved than sequence,<ref name=chothia>{{cite journal | journal=EMBO J | volume=5 | issue=4 | pages=823–6 |date=April 1986 |author1=Chothia C |author2=Lesk AM. | title=The relation between the divergence of sequence and structure in proteins | pmid=3709526 |pmc=1166865 | doi=10.1002/j.1460-2075.1986.tb04288.x }}</ref> structural alignments can be more reliable between sequences that are very distantly related and that have diverged so extensively that sequence comparison cannot reliably detect their similarity.

Structural alignments are used as the "gold standard" in evaluating alignments for homology-based [[protein structure prediction]]<ref name=skolnick>{{cite journal | journal=Proc Natl Acad Sci USA | volume=102 | pages=1029–34 | year=2005 |author1=Zhang Y |author2=Skolnick J. | title=The protein structure prediction problem could be solved using the current PDB library | pmid=15653774 | doi = 10.1073/pnas.0407152101 | issue=4 | pmc=545829 | bibcode=2005PNAS..102.1029Z | doi-access=free }}</ref> because they explicitly align regions of the protein sequence that are structurally similar rather than relying exclusively on sequence information. However, structural alignments cannot themselves be used in structure prediction, because at least one sequence in the query set is the target to be modeled, whose structure is not known. It has been shown that, given the structural alignment between a target and a template sequence, highly accurate models of the target protein sequence can be produced; a major stumbling block in homology-based structure prediction is the production of structurally accurate alignments given only sequence information.<ref name=skolnick/>

===DALI===
The DALI method, or [[distance matrix]] alignment, is a fragment-based method for constructing structural alignments based on contact similarity patterns between successive hexapeptides in the query sequences.<ref name=holm>{{cite journal | journal=Science | volume=273 | pages=595–603 | year=1996 |author1=Holm L |author2=Sander C | title=Mapping the protein universe | pmid=8662544 | doi = 10.1126/science.273.5275.595 | issue=5275 | bibcode=1996Sci...273..595H | s2cid=7509134 }}</ref> It can generate pairwise or multiple alignments and identify a query sequence's structural neighbors in the [[Protein Data Bank]] (PDB). It has been used to construct the [[Families of structurally similar proteins|FSSP]] structural alignment database (Fold classification based on Structure-Structure alignment of Proteins, or Families of Structurally Similar Proteins). A DALI webserver can be accessed at [https://web.archive.org/web/20090301064750/http://ekhidna.biocenter.helsinki.fi/dali_server/start DALI] and the FSSP is located at [https://web.archive.org/web/20051125045348/http://ekhidna.biocenter.helsinki.fi/dali/start The Dali Database].

===SSAP===
SSAP (sequential structure alignment program) is a dynamic programming-based method of structural alignment that uses atom-to-atom vectors in structure space as comparison points.
It has been extended since its original description to include multiple as well as pairwise alignments,<ref name=taylor>{{cite journal|journal=Protein Sci |volume=3 |pages=1858–70 |year=1994 |author1=Taylor WR |author2=Flores TP |author3=Orengo CA. |title=Multiple protein structure alignment |pmid=7849601 |doi=10.1002/pro.5560031025 |issue=10 |pmc=2142613 }}</ref> and has been used in the construction of the [[CATH]] (Class, Architecture, Topology, Homology) hierarchical database classification of protein folds.<ref name=orengo>{{cite journal | journal=Structure | volume=5 | pages=1093–108 | year=1997 |author1=Orengo CA |author2=Michie AD |author3=Jones S |author4=Jones DT |author5=Swindells MB |author6=Thornton JM | title=CATH--a hierarchic classification of protein domain structures | pmid=9309224 | doi=10.1016/S0969-2126(97)00260-8 | issue=8 | doi-access=free }}</ref> The CATH database can be accessed at [http://www.cathdb.info/ CATH Protein Structure Classification].

===Combinatorial extension===
The combinatorial extension method of structural alignment generates a pairwise structural alignment by using local geometry to align short fragments of the two proteins being analyzed and then assembling these fragments into a larger alignment.<ref name=shindyalov>{{cite journal | journal=Protein Eng | volume=11 | pages=739–47 | year=1998 |author1=Shindyalov IN |author2=Bourne PE. | title=Protein structure alignment by incremental combinatorial extension (CE) of the optimal path | pmid=9796821 | doi = 10.1093/protein/11.9.739 | issue=9 | doi-access=free }}</ref> Based on measures such as rigid-body [[Root mean square deviation (bioinformatics)|root mean square deviation]], residue distances, local secondary structure, and surrounding environmental features such as residue neighbor [[hydrophobic]]ity, local alignments called "aligned fragment pairs" are generated and used to build a similarity matrix representing all possible structural alignments within predefined cutoff criteria. A path from one protein structure state to the other is then traced through the matrix by extending the growing alignment one fragment at a time. The optimal such path defines the combinatorial-extension alignment. A web-based server implementing the method and providing a database of pairwise alignments of structures in the Protein Data Bank is located at the [https://web.archive.org/web/19981203071023/http://cl.sdsc.edu/ Combinatorial Extension] website.
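
The idea of scoring aligned fragment pairs from residue distances can be illustrated with a simplified sketch. It is not the published combinatorial extension scoring function: the function names, the fragment length of 8 residues, the 1.0 cutoff, and the use of a mean absolute difference of intra-fragment distances are all assumptions made for illustration. Each structure is represented as a list of (x, y, z) coordinates, for example one per alpha-carbon atom.

```python
import math


def pairwise_distances(coords):
    """Return the full matrix of Euclidean distances between all points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def fragment_pair_scores(coords_a, coords_b, frag_len=8):
    """Score every fragment pair by comparing internal distances.

    scores[i][j] is the mean absolute difference between the internal
    distances of the fragment starting at residue i of structure A and
    those of the fragment starting at residue j of structure B.
    Lower values mean more similar local geometry. frag_len=8 is an
    illustrative assumption, not a value taken from the article.
    """
    if frag_len < 2:
        raise ValueError("frag_len must be at least 2")
    n_a = len(coords_a) - frag_len + 1
    n_b = len(coords_b) - frag_len + 1
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both structures must have at least frag_len residues")

    dist_a = pairwise_distances(coords_a)
    dist_b = pairwise_distances(coords_b)
    n_pairs = frag_len * (frag_len - 1) / 2  # residue pairs inside one fragment

    scores = []
    for i in range(n_a):
        row = []
        for j in range(n_b):
            total = 0.0
            for k in range(frag_len):
                for m in range(k + 1, frag_len):
                    total += abs(dist_a[i + k][i + m] - dist_b[j + k][j + m])
            row.append(total / n_pairs)
        scores.append(row)
    return scores


def aligned_fragment_pairs(scores, cutoff=1.0):
    """Return (i, j) start positions whose score is within the cutoff.

    The cutoff is a hypothetical threshold in the same units as the
    input coordinates.
    """
    return [
        (i, j)
        for i, row in enumerate(scores)
        for j, score in enumerate(row)
        if score <= cutoff
    ]
```

Because only distances within each fragment are compared, the score does not change when either structure is rotated or translated, so no superposition step is needed at this stage. Tracing an optimal path through the resulting matrix, as the method describes, is a separate step that this sketch does not attempt.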