Sequence profiling tool
==Introduction and usage==
<!-- Deleted image removed: [[Image:NCBI.JPG|thumb|350px| A typical example of a keyword profiling tool - [http://www.ncbi.nlm.nih.gov/gquery/gquery.fcgi?itool=toolbar Entrez]]] -->
The "post-[[genomics]]" era has given rise to a range of web-based tools and software that compile, organize, and deliver large amounts of [[primary sequence]] information, [[tertiary structure|protein structures]], gene annotations, and [[sequence alignment]]s, and that support other common bioinformatics tasks. In general, there exist three types of databases and service providers. The first comprises the popular public-domain or open-access databases supported by funding and grants, such as [[National Center for Biotechnology Information|NCBI]], [[ExPASy]], [[Ensembl]], and [[Protein Data Bank|PDB]]. The second comprises smaller or more specific databases organized and compiled by individual research groups. Examples include the [http://www.yeastgenome.org/ Yeast Genome Database] and the [http://www.rnabase.org/ RNA database]. The third comprises private corporate or institutional databases that require payment or institutional affiliation to access. Such examples are rare given the globalization of public databases, unless the service is still in development or the end point of the analysis is of commercial value.

A profiling approach is most relevant to the first two groups, where researchers commonly wish to combine information derived from several sources about a single query or target sequence. For example, users might use the sequence alignment and search tool [[BLAST (biotechnology)|BLAST]] to identify [[homology (biology)|homologs]] of their gene of interest in other species, and then use these results to locate a solved protein structure for one of the homologs.
Similarly, they might also want to know the likely [[secondary structure]] of the [[mRNA]] encoding the gene of interest, or whether a company sells a [[DNA construct]] containing the gene. Sequence profiling tools automate and integrate the search for such disparate information by making the querying of several different external databases transparent to the user.

Many public databases are already extensively linked so that complementary information in another database is easily accessible; for example, [[Genbank|GenBank]] and the [[Protein Data Bank|PDB]] are closely intertwined. However, specialized tools organized and hosted by specific research groups can be difficult to integrate into this linkage effort because they are narrowly focused, are frequently modified, or use custom versions of common file formats. Advantages of sequence profiling tools include the ability to use several of these specialized tools in a single query and present the output through a common interface, the ability to direct the output of one set of tools or database searches into the input of another, and the capacity to distribute hosting and compilation responsibilities across a network of research groups and institutions rather than a single centralized repository.
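
The two ideas described above, a common output format and the chaining of one tool's output into another tool's input, can be illustrated with a minimal Python sketch. It is not the implementation of any particular profiling tool. The names <code>Hit</code>, <code>homolog_search</code>, <code>structure_lookup</code>, and <code>profile</code> are hypothetical, and the identifiers are made-up placeholders that stand in for the results a real service would return over the network.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Hit:
    """One result row in the common format shared by every tool."""
    source: str      # name of the tool or database that produced the hit
    identifier: str  # accession-like ID (placeholder values in this sketch)
    note: str        # short human-readable description


# Every tool, whatever it wraps, is exposed as: query string -> list of hits.
Tool = Callable[[str], List[Hit]]

# Canned, made-up data standing in for remote databases (assumption).
_HOMOLOGS: Dict[str, List[str]] = {"MKTAYIAK": ["HOM_A", "HOM_B"]}
_STRUCTURES: Dict[str, List[str]] = {"HOM_B": ["STRUCT_1"]}


def homolog_search(sequence: str) -> List[Hit]:
    """Stand-in for a homology search (the role BLAST plays in the text)."""
    return [Hit("homolog_search", h, "homolog of query sequence")
            for h in _HOMOLOGS.get(sequence, [])]


def structure_lookup(identifier: str) -> List[Hit]:
    """Stand-in for a structure database lookup keyed by homolog ID."""
    return [Hit("structure_lookup", s, f"solved structure for {identifier}")
            for s in _STRUCTURES.get(identifier, [])]


def profile(query: str, first_stage: List[Tool],
            chained: List[Tool]) -> List[Hit]:
    """Run each first-stage tool on the query, then feed the identifier of
    every resulting hit into each chained tool, collecting one report."""
    report: List[Hit] = []
    for tool in first_stage:
        hits = tool(query)
        report.extend(hits)
        for hit in hits:
            for follow_up in chained:
                report.extend(follow_up(hit.identifier))
    return report


if __name__ == "__main__":
    for hit in profile("MKTAYIAK", [homolog_search], [structure_lookup]):
        print(f"{hit.source}\t{hit.identifier}\t{hit.note}")
```

Because every tool is reduced to the same signature, adding a further specialized source in this sketch only requires writing one more wrapper function; the <code>profile</code> routine and the report format stay unchanged.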