Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Needleman–Wunsch algorithm
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
==Advanced presentation of algorithm== Scores for aligned characters are specified by a [[similarity matrix]]. Here, {{math|''S''(''a'', ''b'')}} is the similarity of characters ''a'' and ''b''. It uses a linear [[gap penalty]], here called {{mvar|d}}. For example, if the similarity matrix was {| class="wikitable" ! scope="col" | ! scope="col" | A ! scope="col" | G ! scope="col" | C ! scope="col" | T |- style="text-align: right;" ! scope="row" | A | 10 || −1 || −3 || −4 |- style="text-align: right;" ! scope="row" | G | −1 || 7 || −5 || −3 |- style="text-align: right;" ! scope="row" | C | −3 || −5 || 9 || 0 |- style="text-align: right;" ! scope="row" | T | −4 || −3 || 0 || 8 |} then the alignment: AGACTAGTTAC CGA---GACGT with a gap penalty of −5, would have the following score: :{{math|''S''(A,C) + ''S''(G,G) + ''S''(A,A) + (3 × ''d'') + ''S''(G,G) + ''S''(T,A) + ''S''(T,C) + ''S''(A,G) + ''S''(C,T)}} := −3 + 7 + 10 − (3 × 5) + 7 + (−4) + 0 + (−1) + 0 = 1 To find the alignment with the highest score, a two-dimensional [[Array data structure|array]] (or [[Matrix (mathematics)|matrix]]) ''F'' is allocated. The entry in row ''i'' and column ''j'' is denoted here by <math>F_{ij}</math>. There is one row for each character in sequence ''A'', and one column for each character in sequence ''B''. Thus, if aligning sequences of sizes ''n'' and ''m'', the amount of memory used is in <math>O(nm)</math>. [[Hirschberg's algorithm]] only holds a subset of the array in memory and uses <math>\Theta(\min \{n,m\})</math> space, but is otherwise similar to Needleman-Wunsch (and still requires <math>O(nm)</math> time). As the algorithm progresses, the <math>F_{ij}</math> will be assigned to be the optimal score for the alignment of the first <math>i=0,\dotsc,n</math> characters in ''A'' and the first <math>j=0,\dotsc,m</math> characters in ''B''. The [[Bellman equation#Bellman's principle of optimality|principle of optimality]] is then applied as follows: * Basis: :<math>F_{0j} = d*j</math> :<math>F_{i0} = d*i</math> * Recursion, based on the principle of optimality: :<math>F_{ij} = \max(F_{i-1,j-1} + S(A_{i}, B_{j}), \; F_{i,j-1} + d, \; F_{i-1,j} + d)</math> The pseudo-code for the algorithm to compute the F matrix therefore looks like this: d ← Gap penalty score '''for''' i = 0 '''to''' '''length'''(A) F(i,0) ← d * i '''for''' j = 0 '''to''' '''length'''(B) F(0,j) ← d * j '''for''' i = 1 '''to''' '''length'''(A) '''for''' j = 1 '''to''' '''length'''(B) { Match ← F(i−1, j−1) + S(A<sub>i</sub>, B<sub>j</sub>) Delete ← F(i−1, j) + d Insert ← F(i, j−1) + d F(i,j) ← '''max'''(Match, Insert, Delete) } Once the ''F'' matrix is computed, the entry <math>F_{nm}</math> gives the maximum score among all possible alignments. To compute an alignment that actually gives this score, you start from the bottom right cell, and compare the value with the three possible sources (Match, Insert, and Delete above) to see which it came from. If Match, then <math>A_i</math> and <math>B_j</math> are aligned, if Delete, then <math>A_i</math> is aligned with a gap, and if Insert, then <math>B_j</math> is aligned with a gap. (In general, more than one choice may have the same value, leading to alternative optimal alignments.) AlignmentA ← "" AlignmentB ← "" i ← '''length'''(A) j ← '''length'''(B) '''while''' (i > 0 '''or''' j > 0) { '''if''' (i > 0 '''and''' j > 0 '''and''' F(i, j) == F(i−1, j−1) + S(A<sub>i</sub>, B<sub>j</sub>)) { AlignmentA ← A<sub>i</sub> + AlignmentA AlignmentB ← B<sub>j</sub> + AlignmentB i ← i − 1 j ← j − 1 } '''else''' '''if''' (i > 0 '''and''' F(i, j) == F(i−1, j) + d) { AlignmentA ← A<sub>i</sub> + AlignmentA AlignmentB ← "−" + AlignmentB i ← i − 1 } '''else''' { AlignmentA ← "−" + AlignmentA AlignmentB ← B<sub>j</sub> + AlignmentB j ← j − 1 } }
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)