Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
LL parser
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
== Constructing an LL(1) parsing table == In order to fill the parsing table, we have to establish what grammar rule the parser should choose if it sees a nonterminal ''A'' on the top of its stack and a symbol ''a'' on its input stream. It is easy to see that such a rule should be of the form ''A'' β ''w'' and that the language corresponding to ''w'' should have at least one string starting with ''a''. For this purpose we define the ''First-set'' of ''w'', written here as '''Fi'''(''w''), as the set of terminals that can be found at the start of some string in ''w'', plus Ξ΅ if the empty string also belongs to ''w''. Given a grammar with the rules ''A''<sub>1</sub> β ''w''<sub>1</sub>, ..., ''A''<sub>''n''</sub> β ''w''<sub>''n''</sub>, we can compute the '''Fi'''(''w''<sub>''i''</sub>) and '''Fi'''(''A''<sub>''i''</sub>) for every rule as follows: # initialize every '''Fi'''(''A''<sub>''i''</sub>) with the empty set # add Fi(''w''<sub>''i''</sub>) to '''Fi'''(''A''<sub>''i''</sub>) for every rule ''A''<sub>''i''</sub> β ''w''<sub>''i''</sub>, where Fi is defined as follows: #* {{math|1=Fi(''a{{prime|w}}'') = {{mset| ''a'' }}}} for every terminal ''a'' #* {{math|1=Fi(''A{{prime|w}}'') = '''Fi'''(''A'')}} for every nonterminal ''A'' with Ξ΅ not in '''Fi'''(''A'') #* {{math|1=Fi(''A{{prime|w}}'') = ('''Fi'''(''A'') ∖ { Ξ΅ }) βͺ Fi(''w' '')}} for every nonterminal ''A'' with Ξ΅ in '''Fi'''(''A'') #* Fi(Ξ΅) = { Ξ΅ } # add Fi(''w''<sub>''i''</sub>) to '''Fi'''(''A''<sub>''i''</sub>) for every rule ''A''<sub>''i''</sub> β ''w''<sub>''i''</sub> # do steps 2 and 3 until all '''Fi''' sets stay the same. The result is the least fixed point solution to the following system: * {{math|1='''Fi'''(''A'') β '''Fi'''(''w'')}} for each rule A β ''w'' * {{math|1='''Fi'''(''a'') β {{mset| ''a'' }},}} for each terminal ''a'' * {{math|1='''Fi'''(''w''<sub>0</sub> ''w''<sub>1</sub>) β '''Fi'''(''w''<sub>0</sub>)Β·'''Fi'''(''w''<sub>1</sub>),}} for all words ''w''<sub>0</sub> and ''w''<sub>1</sub> * {{math|1='''Fi'''(''Ξ΅'') β {{mset|''Ξ΅''}}}} where, for sets of words ''U'' and ''V'', the truncated product is defined by <math>U \cdot V = \{ (uv):1 \mid u \in U, v \in V \}</math>, and w:1 denotes the initial length-1 prefix of words w of length 2 or more, or ''w'', itself, if w has length 0 or 1. Unfortunately, the First-sets are not sufficient to compute the parsing table. This is because a right-hand side ''w'' of a rule might ultimately be rewritten to the empty string. So the parser should also use the rule ''A'' β ''w'' if ''Ξ΅'' is in '''Fi'''(''w'') and it sees on the input stream a symbol that could follow ''A''. Therefore, we also need the ''Follow-set'' of ''A'', written as '''Fo'''(''A'') here, which is defined as the set of terminals ''a'' such that there is a string of symbols ''Ξ±AaΞ²'' that can be derived from the start symbol. We use '''$''' as a special terminal indicating end of input stream, and ''S'' as start symbol. Computing the Follow-sets for the nonterminals in a grammar can be done as follows: # initialize '''Fo'''(''S'') with { '''$''' } and every other '''Fo'''(''A''<sub>''i''</sub>) with the empty set # if there is a rule of the form ''A''<sub>''j''</sub> β ''wA<sub>i</sub>{{prime|w}}'', then #* if the terminal ''a'' is in '''Fi'''(''{{prime|w}}''), then add ''a'' to '''Fo'''(''A''<sub>''i''</sub>) #* if Ξ΅ is in '''Fi'''(''{{prime|w}}''), then add '''Fo'''(''A''<sub>''j''</sub>) to '''Fo'''(''A''<sub>''i''</sub>) #* if ''w' '' has length 0, then add '''Fo'''(''A''<sub>''j''</sub>) to '''Fo'''(''A''<sub>''i''</sub>) # repeat step 2 until all '''Fo''' sets stay the same. This provides the least fixed point solution to the following system: * {{math|'''Fo'''(''S'') β {{mset|'''$'''}}}} * {{math|'''Fo'''(''A'') β '''Fi'''(''w'')Β·'''Fo'''(''B'')}} for each rule of the form {{tmath|B \to \dots A w}} Now we can define exactly which rules will appear where in the parsing table. If ''T''[''A'', ''a''] denotes the entry in the table for nonterminal ''A'' and terminal ''a'', then : ''T''[''A'',''a''] contains the rule ''A'' β ''w'' if and only if :: ''a'' is in '''Fi'''(''w'') or :: Ξ΅ is in '''Fi'''(''w'') and ''a'' is in '''Fo'''(''A''). Equivalently: ''T''[''A'', ''a''] contains the rule ''A'' β ''w'' for each {{math|''a'' β '''Fi'''(''w'')Β·'''Fo'''(''A'').}} If the table contains at most one rule in every one of its cells, then the parser will always know which rule it has to use and can therefore parse strings without backtracking. It is in precisely this case that the grammar is called an ''LL(1) grammar''.
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)