Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Sequence alignment
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
===Dynamic programming=== The technique of [[dynamic programming]] can be applied to produce global alignments via the [[Needleman-Wunsch algorithm]], and local alignments via the [[Smith-Waterman algorithm]]. In typical usage, protein alignments use a [[substitution matrix]] to assign scores to amino-acid matches or mismatches, and a [[gap penalty]] for matching an amino acid in one sequence to a gap in the other. DNA and RNA alignments may use a scoring matrix, but in practice often simply assign a positive match score, a negative mismatch score, and a negative gap penalty. (In standard dynamic programming, the score of each amino acid position is independent of the identity of its neighbors, and therefore [[base stacking]] effects are not taken into account. However, it is possible to account for such effects by modifying the algorithm.){{citation needed|date=April 2024}} A common extension to standard linear gap costs are affine gap costs. Here two different gap penalties are applied for opening a gap and for extending a gap. Typically the former is much larger than the latter, e.g. -10 for gap open and -2 for gap extension. This results in fewer gaps in an alignment and residues and gaps are kept together, traits more representative of biological sequences. The Gotoh algorithm implements affine gap costs by using three matrices.<ref>{{Cite journal |last=Gotoh |first=Osamu |date=1982-12-15 |title=An improved algorithm for matching biological sequences |url=https://linkinghub.elsevier.com/retrieve/pii/0022283682903989 |journal=Journal of Molecular Biology |volume=162 |issue=3 |pages=705β708 |doi=10.1016/0022-2836(82)90398-9 |pmid=7166760 |issn=0022-2836|url-access=subscription }}</ref><ref>{{Cite journal |last=Gotoh |first=Osamu |date=1999-01-01 |title=Multiple sequence alignment: Algorithms and applications |url=https://linkinghub.elsevier.com/retrieve/pii/S0065227X99800070 |journal=Advances in Biophysics |volume=36 |pages=159β206 |doi=10.1016/S0065-227X(99)80007-0 |pmid=10463075 |issn=0065-227X|url-access=subscription }}</ref> Dynamic programming can be useful in aligning nucleotide to protein sequences, a task complicated by the need to take into account [[frameshift]] mutations (usually insertions or deletions). The framesearch method produces a series of global or local pairwise alignments between a query nucleotide sequence and a search set of protein sequences, or vice versa. Its ability to evaluate frameshifts offset by an arbitrary number of nucleotides makes the method useful for sequences containing large numbers of indels, which can be very difficult to align with more efficient heuristic methods. In practice, the method requires large amounts of computing power or a system whose architecture is specialized for dynamic programming. The [[BLAST (biotechnology)|BLAST]] and [[EMBOSS]] suites provide basic tools for creating translated alignments (though some of these approaches take advantage of side-effects of sequence searching capabilities of the tools). More general methods are available from [[open-source software]] such as [http://www.ebi.ac.uk/Tools/psa/genewise/ GeneWise].{{citation needed|date=April 2024}} The dynamic programming method is guaranteed to find an optimal alignment given a particular scoring function; however, identifying a good scoring function is often an empirical rather than a theoretical matter. Although dynamic programming is extensible to more than two sequences, it is prohibitively slow for large numbers of sequences or extremely long sequences.{{citation needed|date=April 2024}}
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)