Probabilistic context-free grammar
====Algorithms====
Several algorithms address different aspects of PCFG-based probabilistic models in RNA structure prediction, notably the inside-outside algorithm and the CYK algorithm. The inside-outside algorithm is a recursive dynamic programming scoring algorithm that can follow [[Expectation-maximization algorithm|expectation-maximization]] paradigms. It computes the total probability of all derivations that are consistent with a given sequence, based on some PCFG. The inside part scores the subtrees of a parse tree and therefore the probabilities of subsequences given a PCFG. The outside part scores the probability of the complete parse tree for a full sequence.<ref name="Lari and Young 1990" /><ref name="Lari and Young 1991" /> The CYK algorithm modifies the inside scoring by taking the maximum over derivations instead of their sum. Note that the term 'CYK algorithm' here describes the CYK variant of the inside algorithm that finds an optimal parse tree for a sequence using a PCFG. It extends the actual [[CYK algorithm]] used in non-probabilistic CFGs.<ref name="Durbin 1998" />

The inside algorithm calculates the probabilities <math>\alpha(i,j,v)</math>, for all <math>i, j, v</math>, of a parse subtree rooted at <math>W_v</math> for the subsequence <math>x_i,\ldots,x_j</math>. The outside algorithm calculates the probabilities <math>\beta(i,j,v)</math> of a complete parse tree for sequence {{mvar|x}} from the root, excluding the part that generates <math>x_i,\ldots,x_j</math>. The variables {{mvar|α}} and {{mvar|β}} are used to refine the estimates of the probability parameters of a PCFG. The PCFG parameters can be reestimated by finding the expected number of times a state is used in a derivation, which is obtained by summing all the products of {{mvar|α}} and {{mvar|β}} and dividing by the probability of the sequence {{mvar|x}} given the model, <math>P(x|\theta)</math>.
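
The expected state count described above can be written out as a formula. This is a sketch under assumptions not fixed in this section: the nonterminals are <math>W_1,\ldots,W_M</math> with <math>W_1</math> as the start nonterminal, the sequence {{mvar|x}} has length {{mvar|L}}, and <math>c(v)</math> is a label introduced here for the expected number of times <math>W_v</math> is used:

<math display="block">c(v) = \frac{1}{P(x|\theta)} \sum_{i=1}^{L} \sum_{j=i}^{L} \alpha(i,j,v)\,\beta(i,j,v), \qquad P(x|\theta) = \alpha(1,L,1)</math>
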
The expected number of times a production rule is used can likewise be found in an expectation-maximization procedure that uses the values of {{mvar|α}} and {{mvar|β}}.<ref name="Lari and Young 1990" /><ref name="Lari and Young 1991" /> The CYK algorithm calculates <math>\gamma(i, j, v)</math> to find the most probable parse tree <math>\hat{\pi}</math> and yields <math>\log P(x, \hat{\pi}|\theta)</math>.<ref name="Durbin 1998" />

Memory and time complexity for general PCFG algorithms in RNA structure prediction are <math>O(L^2M)</math> and <math>O(L^3M^3)</math> respectively, where {{mvar|L}} is the sequence length and {{mvar|M}} is the number of nonterminals. Restricting a PCFG may alter these requirements, as is the case with database search methods.
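
The relationship between the inside and CYK recursions can be illustrated with a minimal sketch. It assumes a grammar in Chomsky normal form, with rules <math>W_v \to W_y W_z</math> and <math>W_v \to a</math> only, and uses 0-based inclusive indices. The function names <code>inside</code> and <code>cyk</code>, the <code>emit</code> and <code>trans</code> data layout, and the toy one-nonterminal grammar are hypothetical choices made for illustration; they are not taken from the cited sources.

```python
import math


def inside(x, emit, trans, n_states):
    """Return alpha[i][j][v]: total probability that W_v derives x[i..j].

    emit[v] maps a terminal symbol to P(W_v -> symbol).
    trans maps (v, y, z) to P(W_v -> W_y W_z).
    """
    L = len(x)
    alpha = [[[0.0] * n_states for _ in range(L)] for _ in range(L)]
    # Base case: subsequences of length 1 are produced by emission rules.
    for i, symbol in enumerate(x):
        for v in range(n_states):
            alpha[i][i][v] = emit[v].get(symbol, 0.0)
    # Recursion: sum over every rule and every split point k.
    for span in range(2, L + 1):
        for i in range(L - span + 1):
            j = i + span - 1
            for (v, y, z), p in trans.items():
                for k in range(i, j):
                    alpha[i][j][v] += alpha[i][k][y] * alpha[k + 1][j][z] * p
    return alpha


def cyk(x, emit, trans, n_states):
    """Return gamma[i][j][v]: log probability of the best subtree rooted at W_v.

    Same recursion as inside(), but in log space and with max instead of sum.
    Assumes every probability stored in trans is strictly positive.
    """
    L = len(x)
    neg_inf = -math.inf
    gamma = [[[neg_inf] * n_states for _ in range(L)] for _ in range(L)]
    for i, symbol in enumerate(x):
        for v in range(n_states):
            p = emit[v].get(symbol, 0.0)
            gamma[i][i][v] = math.log(p) if p > 0 else neg_inf
    for span in range(2, L + 1):
        for i in range(L - span + 1):
            j = i + span - 1
            for (v, y, z), p in trans.items():
                log_p = math.log(p)
                for k in range(i, j):
                    score = gamma[i][k][y] + gamma[k + 1][j][z] + log_p
                    if score > gamma[i][j][v]:
                        gamma[i][j][v] = score
    return gamma


# Toy grammar (an assumption for illustration): one nonterminal W_0 with
# W_0 -> W_0 W_0 (0.5), W_0 -> 'a' (0.3), W_0 -> 'b' (0.2).
toy_emit = [{"a": 0.3, "b": 0.2}]
toy_trans = {(0, 0, 0): 0.5}

alpha = inside("aba", toy_emit, toy_trans, n_states=1)
gamma = cyk("aba", toy_emit, toy_trans, n_states=1)
print("P(x | theta)       =", alpha[0][2][0])
print("log P(x, best tree) =", gamma[0][2][0])
```

For this toy grammar the sequence <code>aba</code> has two parse trees, each with probability 0.5 × 0.5 × 0.3 × 0.2 × 0.3 = 0.0045, so the inside score is about 0.009 while the CYK score is the logarithm of 0.0045. The traceback needed to recover the tree <math>\hat{\pi}</math> itself is omitted from the sketch.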