Sequence database
== Current issues ==

=== Storage & redundancy ===

Records in sequence databases are deposited from a wide range of sources, from individual researchers to large genome sequencing centers. As a result, the sequences themselves, and especially the biological annotations attached to them, may vary in quality. There is also much redundancy, as multiple labs may submit numerous sequences that are identical, or nearly identical, to others already in the databases.<ref name="Sikic-2010">{{Cite journal | last1 = Sikic | first1 = K. | last2 = Carugo | first2 = O. | title = Protein sequence redundancy reduction: comparison of various method | journal = Bioinformation | volume = 5 | issue = 6 | pages = 234–9 | year = 2010 | doi = 10.6026/97320630005234| pmid = 21364823 | pmc=3055704}}</ref>

Many annotations are based not on laboratory experiments but on the results of sequence similarity searches against previously annotated sequences. Once a sequence has been annotated based on similarity to others and deposited in the database, it can itself become the basis for future annotations. This can lead to a ''transitive annotation problem'', because several such annotation transfers by sequence similarity may lie between a particular database record and actual [[wet lab]] experimental information.<ref name="Iliopoulos-2003">{{Cite journal | last1 = Iliopoulos | first1 = I. | last2 = Tsoka | first2 = S. | last3 = Andrade | first3 = MA. | last4 = Enright | first4 = AJ. | last5 = Carroll | first5 = M. | last6 = Poullet | first6 = P. | last7 = Promponas | first7 = V. | last8 = Liakopoulos | first8 = T. | last9 = Palaios | first9 = G. | last10 = Pasquier | first10 = C | last11 = Hamodrakas | first11 = S | last12 = Tamames | first12 = J | last13 = Yagnik | first13 = A. T. | last14 = Tramontano | first14 = A | last15 = Devos | first15 = D | last16 = Blaschke | first16 = C | last17 = Valencia | first17 = A | last18 = Brett | first18 = D | last19 = Martin | first19 = D | last20 = Leroy | first20 = C | last21 = Rigoutsos | first21 = I | last22 = Sander | first22 = C | last23 = Ouzounis | first23 = C. A. | title = Evaluation of annotation strategies using an entire genome sequence | journal = Bioinformatics | volume = 19 | issue = 6 | pages = 717–26 |date=April 2003 | doi = 10.1093/bioinformatics/btg077| pmid = 12691983 | display-authors = 8 | doi-access = free }}</ref> Therefore, care must be taken when interpreting annotation data from sequence databases.

=== Scoring methods ===

Most current database search algorithms rank alignments by a score, which usually depends on a particular scoring system.<ref name="Altschul-1994">{{cite journal |title=Issues in searching molecular sequence databases|last1=Altschul |first1=Stephen |last2=Boguski |first2=Mark |last3=Gish |first3=Warren |last4=Wootton |first4=John |journal=Nature Genetics |year=1994 |volume=6 |issue=2 |pages=119–129 |url=https://www.nature.com/articles/ng0294-119.pdf |publisher=Nature Publishing Group|doi=10.1038/ng0294-119 |pmid=8162065 |s2cid=270160 }}</ref> One way to address this dependence is to make a variety of scoring systems available, so that one can be chosen to suit the specific problem.

=== Alignment statistics ===

A search algorithm typically returns a list of hits ordered by score, but a high rank in that list does not by itself imply biological significance.<ref name="Altschul-1994" />
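
As an illustration of why an ordered hit list needs a statistical interpretation, the following minimal Python sketch converts raw alignment scores into expected chance-hit counts (E-values) using the Karlin–Altschul relation <math>E = K m n e^{-\lambda S}</math>, where <math>m</math> and <math>n</math> are the query and database lengths and <math>S</math> is the raw score. The parameter values, hit names, scores, and the function names <code>e_value</code>, <code>bit_score</code>, and <code>significant_hits</code> are assumptions chosen for this example, not values taken from any particular database or search tool.

```python
import math

# Illustrative (assumed) statistical parameters; real values depend on the
# scoring system and the residue composition of the sequences being compared.
LAMBDA = 0.318
K = 0.134


def e_value(raw_score, query_len, db_len, lam=LAMBDA, k=K):
    """Expected number of chance alignments scoring at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def bit_score(raw_score, lam=LAMBDA, k=K):
    """Normalized score that can be compared across scoring systems."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def significant_hits(hits, query_len, db_len, threshold=1e-3):
    """Keep hits whose E-value is at or below threshold, best hit first."""
    scored = [(name, score, e_value(score, query_len, db_len))
              for name, score in hits]
    kept = [hit for hit in scored if hit[2] <= threshold]
    return sorted(kept, key=lambda hit: hit[2])


if __name__ == "__main__":
    # Hypothetical ranked hit list: (name, raw alignment score).
    hits = [("hit_C", 120), ("hit_B", 60), ("hit_A", 30)]
    for name, score, e in significant_hits(hits, query_len=250, db_len=10**8):
        print(f"{name}: raw={score} bits={bit_score(score):.1f} E={e:.2e}")
```

In this sketch all three hits appear in the ranked list, but only the highest-scoring one has an E-value below the chosen threshold; the other two would be expected to arise by chance in a database of this size.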