Probabilistic context-free grammar
===Protein sequence analysis===
Whereas PCFGs have proved to be powerful tools for predicting RNA secondary structure, their use in protein sequence analysis has been limited. The size of the [[amino acid]] alphabet and the variety of interactions seen in proteins make grammar inference much more challenging.<ref name="Searls D 2013">{{cite journal | last1 = Searls | first1 = D | year = 2013 | title = Review: A primer in macromolecular linguistics | journal = Biopolymers | volume = 99 | issue = 3| pages = 203–217 | doi=10.1002/bip.22101| pmid = 23034580 | s2cid = 12676925 }}</ref> As a consequence, most applications of [[formal language theory]] to protein analysis have been restricted to grammars of lower expressive power that model simple functional patterns based on local interactions.<ref>{{cite journal | last1 = Krogh | first1 = A | last2 = Brown | first2 = M | last3 = Mian | first3 = I | last4 = Sjolander | first4 = K | last5 = Haussler | first5 = D | year = 1994 | title = Hidden Markov models in computational biology: Applications to protein modeling | journal = J Mol Biol | volume = 235 | issue = 5| pages = 1501–1531 |doi=10.1006/jmbi.1994.1104 | pmid=8107089| s2cid = 2160404 }}</ref><ref>{{cite journal | last1 = Sigrist | first1 = C | last2 = Cerutti | first2 = L | last3 = Hulo | first3 = N | last4 = Gattiker | first4 = A | last5 = Falquet | first5 = L | last6 = Pagni | first6 = M | last7 = Bairoch | first7 = A | last8 = Bucher | first8 = P | year = 2002 | title = PROSITE: a documented database using patterns and profiles as motif descriptors | journal = Brief Bioinform | volume = 3 | issue = 3| pages = 265–274 | doi=10.1093/bib/3.3.265 | pmid=12230035| doi-access = free }}</ref> Because protein structures commonly display higher-order dependencies, including nested and crossing relationships, they exceed the capabilities of any CFG.<ref name="Searls D 2013"/> Still, PCFGs can express some of those dependencies and can therefore model a wider range of protein patterns.