Editing LL parser (section)

== Constructing an LL(1) parsing table ==

In order to fill the parsing table, we have to establish what grammar rule the parser should choose if it sees a nonterminal ''A'' on the top of its stack and a symbol ''a'' on its input stream.
It is easy to see that such a rule should be of the form ''A'' → ''w'' and that the language corresponding to ''w'' should have at least one string starting with ''a''.
For this purpose we define the ''First-set'' of ''w'', written here as '''Fi'''(''w''), as the set of terminals that can be found at the start of some string in ''w'', plus ε if the empty string also belongs to ''w''.
Given a grammar with the rules ''A''<sub>1</sub> → ''w''<sub>1</sub>, ..., ''A''<sub>''n''</sub> → ''w''<sub>''n''</sub>, we can compute the '''Fi'''(''w''<sub>''i''</sub>) and '''Fi'''(''A''<sub>''i''</sub>) for every rule as follows:
# initialize every '''Fi'''(''A''<sub>''i''</sub>) with the empty set
# add Fi(''w''<sub>''i''</sub>) to '''Fi'''(''A''<sub>''i''</sub>) for every rule ''A''<sub>''i''</sub> → ''w''<sub>''i''</sub>, where Fi is defined as follows:
#* {{math|1=Fi(''a{{prime|w}}'') = {{mset| ''a'' }}}} for every terminal ''a''
#* {{math|1=Fi(''A{{prime|w}}'') = '''Fi'''(''A'')}} for every nonterminal ''A'' with ε not in '''Fi'''(''A'')
#* {{math|1=Fi(''A{{prime|w}}'') = ('''Fi'''(''A'') &setminus; { ε }) ∪ Fi(''w' '')}} for every nonterminal ''A'' with ε in '''Fi'''(''A'')
#* Fi(ε) = { ε }
# add Fi(''w''<sub>''i''</sub>) to '''Fi'''(''A''<sub>''i''</sub>) for every rule ''A''<sub>''i''</sub> → ''w''<sub>''i''</sub>
# do steps 2 and 3 until all '''Fi''' sets stay the same.
The result is the least fixed point solution to the following system:
* {{math|1='''Fi'''(''A'') ⊇ '''Fi'''(''w'')}} for each rule A → ''w''
* {{math|1='''Fi'''(''a'') ⊇ {{mset| ''a'' }},}} for each terminal ''a''
* {{math|1='''Fi'''(''w''<sub>0</sub> ''w''<sub>1</sub>) ⊇ '''Fi'''(''w''<sub>0</sub>)·'''Fi'''(''w''<sub>1</sub>),}} for all words ''w''<sub>0</sub> and ''w''<sub>1</sub>
* {{math|1='''Fi'''(''ε'') ⊇ {{mset|''ε''}}}}
where, for sets of words ''U'' and ''V'', the truncated product is defined by <math>U \cdot V = \{ (uv):1 \mid u \in U, v \in V \}</math>, and w:1 denotes the initial length-1 prefix of words w of length 2 or more, or ''w'', itself, if w has length 0 or 1.

Unfortunately, the First-sets are not sufficient to compute the parsing table.
This is because a right-hand side ''w'' of a rule might ultimately be rewritten to the empty string.
So the parser should also use the rule ''A'' → ''w'' if ''ε'' is in '''Fi'''(''w'') and it sees on the input stream a symbol that could follow ''A''. Therefore, we also need the ''Follow-set'' of ''A'', written as '''Fo'''(''A'') here, which is defined as the set of terminals ''a'' such that there is a string of symbols ''αAaβ'' that can be derived from the start symbol. We use '''$''' as a special terminal indicating end of input stream, and ''S'' as start symbol.

Computing the Follow-sets for the nonterminals in a grammar can be done as follows:
# initialize '''Fo'''(''S'') with { '''$''' } and every other '''Fo'''(''A''<sub>''i''</sub>) with the empty set
# if there is a rule of the form ''A''<sub>''j''</sub> → ''wA<sub>i</sub>{{prime|w}}'', then
#* if the terminal ''a'' is in '''Fi'''(''{{prime|w}}''), then add ''a'' to '''Fo'''(''A''<sub>''i''</sub>)
#* if ε is in '''Fi'''(''{{prime|w}}''), then add '''Fo'''(''A''<sub>''j''</sub>) to '''Fo'''(''A''<sub>''i''</sub>)
#* if ''w' '' has length 0, then add '''Fo'''(''A''<sub>''j''</sub>) to '''Fo'''(''A''<sub>''i''</sub>)
# repeat step 2 until all '''Fo''' sets stay the same.
This provides the least fixed point solution to the following system:
* {{math|'''Fo'''(''S'') ⊇ {{mset|'''$'''}}}}
* {{math|'''Fo'''(''A'') ⊇ '''Fi'''(''w'')·'''Fo'''(''B'')}} for each rule of the form {{tmath|B \to \dots A w}}
 
Now we can define exactly which rules will appear where in the parsing table.
If ''T''[''A'', ''a''] denotes the entry in the table for nonterminal ''A'' and terminal ''a'', then
: ''T''[''A'',''a''] contains the rule ''A'' → ''w'' if and only if
:: ''a'' is in '''Fi'''(''w'') or
:: ε is in '''Fi'''(''w'') and ''a'' is in '''Fo'''(''A'').
Equivalently: ''T''[''A'', ''a''] contains the rule ''A'' → ''w'' for each {{math|''a'' ∈ '''Fi'''(''w'')·'''Fo'''(''A'').}}

If the table contains at most one rule in every one of its cells, then the parser will always know which rule it has to use and can therefore parse strings without backtracking.
It is in precisely this case that the grammar is called an ''LL(1) grammar''.