Distance matrix
=== [[Sequence alignment]] ===
An alignment of two sequences is formed by inserting spaces at arbitrary positions in the sequences so that both end up with the same length and no position holds a space in both augmented sequences.<ref name=":0">{{Cite book |last=Sung |first=Wing-Kin |title=Algorithms in bioinformatics: A practical introduction |publisher=Chapman & Hall |year=2010 |isbn=978-1-4200-7033-0 |pages=29}}</ref> One of the primary methods for sequence alignment is [[dynamic programming]], which is used to fill the distance matrix and then obtain the alignment from it. In typical usage, a scoring matrix assigns scores to amino-acid matches and mismatches, and a gap penalty is applied when an amino acid in one sequence is matched with a gap in the other.

==== Global alignment ====
The [[Needleman–Wunsch algorithm]], used to calculate a global alignment, uses dynamic programming to obtain the distance matrix.

==== Local alignment ====
The [[Smith–Waterman algorithm]] is also based on dynamic programming: it obtains the distance matrix and then derives the local alignment from it.

==== Multiple sequence alignment ====
[[Multiple sequence alignment]] (MSA) is an extension of pairwise alignment that aligns several sequences at a time. Different MSA methods are based on the same idea of the distance matrix as global and local alignments.
* Center star method. This method defines a center sequence {{Math|''S''<sub>c</sub>}} that minimizes the sum of the distances between {{Math|''S''<sub>c</sub>}} and every other sequence {{Math|''S''<sub>i</sub>}}. It then generates a multiple alignment {{Math|''M''}} for the set of sequences {{Math|''S''}} such that, for every {{Math|''S''<sub>i</sub>}}, the alignment distance {{Math|''d''<sub>''M''</sub>(''S''<sub>c</sub>,''S''<sub>i</sub>)}} equals the optimal pairwise alignment distance. The sum-of-pairs distance of the alignment computed for {{Math|''S''}} is at most twice that of the optimal multiple alignment.
* Progressive alignment method. This heuristic method for creating an MSA first aligns the two most closely related sequences and then progressively aligns the next two most closely related sequences until all sequences are aligned.
Several popular methods are implemented as dedicated programs, including:
* [[Clustal|ClustalW]]
* [[MUSCLE (alignment software)|MUSCLE]]
* [[MAFFT]]
* MANGO

===== MAFFT =====
Multiple alignment using fast Fourier transform (MAFFT) is a program whose algorithm is based on progressive alignment, and it offers various multiple alignment strategies. First, MAFFT constructs a distance matrix based on the number of shared 6-tuples. Second, it builds a guide tree from this matrix. Third, it clusters the sequences with the help of the [[fast Fourier transform]] and starts the alignment. Based on the new alignment, it reconstructs the guide tree and aligns again.
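The first MAFFT step, building a distance matrix from the number of shared 6-tuples, can be sketched roughly as follows. This is a minimal illustration and not MAFFT's actual implementation: the function names <code>kmer_counts</code>, <code>shared_kmers</code> and <code>kmer_distance_matrix</code> are hypothetical, and normalizing by the smaller 6-tuple count is an assumption made for this sketch.

```python
from collections import Counter


def kmer_counts(seq, k=6):
    """Count every overlapping k-tuple in seq (empty if seq is shorter than k)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def shared_kmers(a, b):
    """Number of k-tuples shared by two count tables (multiset intersection)."""
    return sum((a & b).values())


def kmer_distance_matrix(seqs, k=6):
    """Return a symmetric matrix of k-tuple-based distances in [0, 1].

    Assumption for this sketch: distance = 1 - shared / smaller tuple count.
    """
    counts = [kmer_counts(s, k) for s in seqs]
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = shared_kmers(counts[i], counts[j])
            smaller = min(sum(counts[i].values()), sum(counts[j].values()))
            # Sequences too short to contain any k-tuple are treated as maximally distant.
            d = 1.0 - shared / smaller if smaller else 1.0
            dist[i][j] = dist[j][i] = d
    return dist


if __name__ == "__main__":
    example = ["ACDEFGHIK", "ACDEFGHIK", "LMNPQRSTV"]
    for row in kmer_distance_matrix(example):
        print(row)
```

In this sketch, identical sequences get distance 0.0 and sequences with no 6-tuple in common get distance 1.0. A guide tree could then be built from the resulting matrix with any standard clustering method.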