====Algorithms====
Several algorithms address aspects of PCFG-based probabilistic models in RNA structure prediction, for instance the inside-outside algorithm and the CYK algorithm. The inside-outside algorithm is a recursive dynamic programming scoring algorithm that can follow [[Expectation-maximization algorithm|expectation-maximization]] paradigms. It computes the total probability of all derivations that are consistent with a given sequence, based on some PCFG. The inside part scores the subtrees of a parse tree and therefore the probabilities of subsequences given a PCFG. The outside part scores the probability of the complete parse tree for a full sequence, excluding the subtree under consideration.<ref name="Lari and Young 1990" /><ref name="Lari and Young 1991" /> The CYK algorithm modifies the inside scoring by taking the maximum over derivations instead of their sum. Note that the term 'CYK algorithm' here describes the CYK variant of the inside algorithm that finds an optimal parse tree for a sequence using a PCFG. It extends the actual [[CYK algorithm]] used in non-probabilistic CFGs.<ref name="Durbin 1998" />
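
As a minimal sketch of the inside part, the following Python code assumes a PCFG in Chomsky normal form. This is a simplification, since grammars used for RNA typically also contain pair-emitting rules. The names <code>inside</code>, <code>binary</code> and <code>emit</code>, the 0-based indexing, and the toy grammar are illustrative assumptions and are not taken from the cited sources.

```python
def inside(seq, n_states, binary, emit):
    """Inside algorithm for a PCFG in Chomsky normal form (illustrative sketch).

    seq      -- non-empty sequence of terminal symbols
    n_states -- number of nonterminals; nonterminal 0 is assumed to be the start
    binary   -- dict {(v, y, z): prob} for rules W_v -> W_y W_z
    emit     -- dict {(v, symbol): prob} for rules W_v -> symbol
    Returns (alpha, P(seq | grammar)).
    """
    L = len(seq)
    if L == 0:
        raise ValueError("sequence must be non-empty")
    # alpha[i][j][v]: probability that W_v derives seq[i..j] (inclusive)
    alpha = [[[0.0] * n_states for _ in range(L)] for _ in range(L)]
    # Base case: subsequences of length 1 come from emission rules.
    for i, sym in enumerate(seq):
        for v in range(n_states):
            alpha[i][i][v] = emit.get((v, sym), 0.0)
    # Recursion: build longer spans from shorter ones.
    for span in range(2, L + 1):
        for i in range(L - span + 1):
            j = i + span - 1
            for (v, y, z), p in binary.items():
                total = 0.0
                for k in range(i, j):  # split point between the two children
                    total += alpha[i][k][y] * alpha[k + 1][j][z]
                alpha[i][j][v] += p * total
    return alpha, alpha[0][L - 1][0]


# Hypothetical toy grammar: S -> S S (0.4) | 'a' (0.6)
toy_binary = {(0, 0, 0): 0.4}
toy_emit = {(0, "a"): 0.6}
alpha, prob = inside("aaa", 1, toy_binary, toy_emit)
```

For the toy grammar, the string <code>aaa</code> has two parse trees, each with probability 0.4<sup>2</sup> &times; 0.6<sup>3</sup>, and <code>prob</code> is their sum.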

The inside algorithm calculates <math>\alpha(i,j,v)</math> probabilities for all <math>i, j, v</math> of a parse subtree rooted at <math>W_v</math> for subsequence <math>x_i,...,x_j</math>. The outside algorithm calculates <math>\beta(i,j,v)</math> probabilities of a complete parse tree for sequence {{mvar|x}} from the root, excluding the calculation of <math>x_i,...,x_j</math>. The variables {{mvar|&alpha;}} and {{mvar|&beta;}} refine the estimation of the probability parameters of a PCFG. It is possible to re-estimate the PCFG parameters by finding the expected number of times a state is used in a derivation, which is obtained by summing all the products of {{mvar|&alpha;}} and {{mvar|&beta;}} and dividing by the probability of the sequence {{mvar|x}} given the model, <math>P(x|\theta)</math>. It is also possible to find the expected number of times a production rule is used by an expectation-maximization procedure that utilizes the values of {{mvar|&alpha;}} and {{mvar|&beta;}}.<ref name="Lari and Young 1990" /><ref name="Lari and Young 1991" /> The CYK algorithm calculates <math>\gamma(i, j, v)</math> to find the most probable parse tree <math>\hat{\pi}</math> and yields <math>\log P(x, \hat{\pi}|\theta)</math>.<ref name="Durbin 1998" />
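
The <math>\gamma(i, j, v)</math> recursion can be sketched in the same spirit. The code below again assumes a grammar in Chomsky normal form, works in log space, and keeps back-pointers so that one most probable parse tree can be recovered. The function names <code>cyk</code> and <code>traceback</code>, the dictionary-based grammar encoding, and the toy grammar are hypothetical choices made for illustration only.

```python
import math


def cyk(seq, n_states, binary, emit):
    """CYK variant of the inside algorithm (illustrative sketch, CNF grammar).

    binary -- dict {(v, y, z): prob} for rules W_v -> W_y W_z
    emit   -- dict {(v, symbol): prob} for rules W_v -> symbol
    Returns (log-probability of the best parse rooted at nonterminal 0, back-pointers).
    """
    L = len(seq)
    if L == 0:
        raise ValueError("sequence must be non-empty")
    neg_inf = float("-inf")
    # gamma[i][j][v]: log-probability of the best subtree rooted at W_v over seq[i..j]
    gamma = [[[neg_inf] * n_states for _ in range(L)] for _ in range(L)]
    back = {}
    for i, sym in enumerate(seq):
        for v in range(n_states):
            p = emit.get((v, sym), 0.0)
            if p > 0.0:
                gamma[i][i][v] = math.log(p)
    for span in range(2, L + 1):
        for i in range(L - span + 1):
            j = i + span - 1
            for (v, y, z), p in binary.items():
                if p <= 0.0:
                    continue
                for k in range(i, j):
                    # Same recursion as the inside algorithm, with max replacing sum.
                    score = math.log(p) + gamma[i][k][y] + gamma[k + 1][j][z]
                    if score > gamma[i][j][v]:
                        gamma[i][j][v] = score
                        back[(i, j, v)] = (k, y, z)
    return gamma[0][L - 1][0], back


def traceback(back, i, j, v):
    """Rebuild one best parse as nested tuples; raises KeyError if no parse exists."""
    if i == j:
        return (v, i)  # leaf: nonterminal v emits the symbol at position i
    k, y, z = back[(i, j, v)]
    return (v, traceback(back, i, k, y), traceback(back, k + 1, j, z))


# Hypothetical toy grammar: S -> S S (0.4) | 'a' (0.6)
best_logp, back = cyk("aaa", 1, {(0, 0, 0): 0.4}, {(0, "a"): 0.6})
tree = traceback(back, 0, 2, 0)
```

In this sketch the table holds one entry per span and nonterminal, and the innermost loops visit every rule and split point for every span, which for a grammar with up to <math>M^3</math> binary rules corresponds to <math>O(L^2M)</math> memory and <math>O(L^3M^3)</math> time.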

Memory and time complexity for general PCFG algorithms in RNA structure prediction are <math>O(L^2M)</math> and <math>O(L^3M^3)</math> respectively, where {{mvar|L}} is the sequence length and {{mvar|M}} is the number of nonterminals. Restricting a PCFG may alter these requirements, as is the case with database search methods.