LL parser
== Overview ==
For a given [[context-free grammar]], the parser attempts to find the [[Context-free grammar#Derivations and syntax trees|leftmost derivation]]. Given an example grammar ''G'':

# <math>S \to E</math>
# <math>E \to ( E + E )</math>
# <math>E \to i</math>

the leftmost derivation for <math>w = ((i+i)+i)</math> is:

: <math>S\ \overset{(1)}{\Rightarrow}\ E\ \overset{(2)}{\Rightarrow}\ (E+E)\ \overset{(2)}{\Rightarrow}\ ((E+E)+E)\ \overset{(3)}{\Rightarrow}\ ((i+E)+E)\ \overset{(3)}{\Rightarrow}\ ((i+i)+E)\ \overset{(3)}{\Rightarrow}\ ((i+i)+i)</math>

Generally, there are multiple possibilities when selecting a rule to expand the leftmost non-terminal. In step 2 of the previous example, the parser must choose whether to apply rule 2 or rule 3:

: <math>S\ \overset{(1)}{\Rightarrow}\ E\ \overset{(?)}{\Rightarrow}\ ?</math>

To be efficient, the parser must be able to make this choice deterministically when possible, without backtracking. For some grammars, it can do this by peeking at the unread input (without consuming it). In our example, if the parser knows that the next unread symbol is '''(''', the only correct rule that can be used is 2.

Generally, an LL(''k'') parser can look ahead at ''k'' symbols. However, given a grammar, the problem of determining whether there exists an LL(''k'') parser for some ''k'' that recognizes it is undecidable. For each ''k'', there is a language that cannot be recognized by an LL(''k'') parser, but can be recognized by an {{nowrap|LL(''k'' + 1)}} parser.

We can use the above analysis to give the following formal definition:

Let ''G'' be a context-free grammar and {{nowrap|''k'' ≥ 1}}.
We say that ''G'' is LL(''k'') if and only if for any two leftmost derivations:

# <math>S\ \Rightarrow\ \cdots\ \Rightarrow\ wA\alpha\ \Rightarrow\ w\beta\alpha\ \Rightarrow\ \cdots\ \Rightarrow\ wu</math>
# <math>S\ \Rightarrow\ \cdots\ \Rightarrow\ wA\alpha\ \Rightarrow\ w\gamma\alpha\ \Rightarrow\ \cdots\ \Rightarrow\ wv</math>

the following condition holds: if the prefix of the string <math>u</math> of length <math>k</math> equals the prefix of the string <math>v</math> of length <math>k</math>, then <math>\beta = \gamma</math>.

In this definition, <math>S</math> is the start symbol and <math>A</math> is any non-terminal. The already derived input <math>w</math> and the yet unread <math>u</math> and <math>v</math> are strings of terminals. The Greek letters <math>\alpha</math>, <math>\beta</math> and <math>\gamma</math> represent any string of both terminals and non-terminals (possibly empty); <math>\beta</math> and <math>\gamma</math> are the right-hand sides of the rules applied to <math>A</math> in the two derivations. The prefix length corresponds to the size of the lookahead buffer, and the definition says that ''k'' symbols of lookahead are enough to determine which rule must be applied to <math>A</math>.
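
The choice of rule by lookahead can be illustrated with a minimal Python sketch for the example grammar ''G'', assuming one symbol of lookahead is sufficient for it. The names <code>RULES</code>, <code>TABLE</code> and <code>leftmost_derivation</code>, the hand-built table, and the end marker <code>$</code> are assumptions made for this illustration and do not come from the text above. The function returns the numbers of the rules applied, in the order of the leftmost derivation.

```python
# Grammar G from the overview; rule numbers match the text.
RULES = {
    1: ("S", ["E"]),
    2: ("E", ["(", "E", "+", "E", ")"]),
    3: ("E", ["i"]),
}

# Hypothetical hand-built table for this sketch:
# (non-terminal, next unread symbol) -> rule number to apply.
TABLE = {
    ("S", "("): 1,
    ("S", "i"): 1,
    ("E", "("): 2,
    ("E", "i"): 3,
}

NONTERMINALS = {"S", "E"}
END = "$"  # assumed end-of-input marker, not part of G


def leftmost_derivation(text):
    """Return the rule numbers of the leftmost derivation of text.

    Raises ValueError if text is not in the language of G.
    """
    tokens = list(text) + [END]
    stack = [END, "S"]  # top of the stack is the end of the list
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        look = tokens[pos]  # peek at the next unread symbol
        if top in NONTERMINALS:
            rule = TABLE.get((top, look))
            if rule is None:
                raise ValueError(f"no rule for {top!r} on lookahead {look!r}")
            applied.append(rule)
            # Push the right-hand side so its first symbol ends up on top.
            stack.extend(reversed(RULES[rule][1]))
        elif top == look:
            pos += 1  # terminal matches: consume it
        else:
            raise ValueError(f"expected {top!r}, found {look!r}")
    if pos != len(tokens):
        raise ValueError("unread input remains")
    return applied


print(leftmost_derivation("((i+i)+i)"))  # [1, 2, 2, 3, 3, 3]
```

For the input <code>((i+i)+i)</code> the sketch applies rules 1, 2, 2, 3, 3, 3, the same sequence as in the derivation shown earlier, and it never backtracks because each table entry holds at most one rule.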