Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Canonical LR parser
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
== Overview == The LR(1) parser is a [[deterministic automaton]] and as such its operation is based on static [[state transition table]]s. These codify the grammar of the language it recognizes and are typically called "parsing tables". The parsing tables of the LR(1) parser are parameterized with a lookahead terminal. Simple parsing tables, like those used by the [[LR(0)]] parser represent grammar rules of the form : A1 β A B which means that if we go have input ''A'' followed by ''B'' then we will reduce the pair to ''A1'' regardless of what follows. After parameterizing such a rule with a lookahead we have: : A1 β A B, a which means that the reduction will now be performed only if the lookahead terminal is ''a''. This allows for richer languages where a simple rule can have different meanings depending on the lookahead context. For example, in a LR(1) grammar, all of the following rules perform a different reduction in spite of being based on the same state sequence. : A1 β A B, a : A2 β A B, b : A3 β A B, c : A4 β A B, d The same would not be true if a lookahead terminal was not being taken into account. Parsing errors can be identified without the parser having to read the whole input by declaring some rules as errors. For example, : E1 β B C, d can be declared an error, causing the parser to stop. This means that the lookahead information can also be used to catch errors, as in the following example: : A1 β A B, a : A1 β A B, b : A1 β A B, c : E1 β A B, d In this case A B will be reduced to A1 when the lookahead is a, b or c and an error will be reported when the lookahead is d. The lookahead can also be helpful in deciding when to reduce a rule. The lookahead can help avoid reducing a specific rule if the lookahead is not valid, which would probably mean that the current state should be combined with the following instead of the previous state. That means in the following example * Input sequence: A B C * Rules: :: A1 β A B :: A2 β B C the sequence can be reduced to : A A2 instead of : A1 C if the lookahead after the parser went to state B wasn't acceptable, i.e. no transition rule existed. Reductions can be produced directly from a terminal as in : X β y which allows for multiple sequences to appear. LR(1) parsers have the requirement that each rule should be expressed in a complete LR(1) manner, i.e. a sequence of two states with a specific lookahead. That makes simple rules such as : X β y requiring a great many artificial rules that essentially enumerate the combinations of all the possible states and lookahead terminals that can follow. A similar problem appears for implementing non-lookahead rules such as : A1 β A B where all the possible lookaheads must be enumerated. That is the reason why LR(1) parsers cannot be practically implemented without significant memory optimizations.<ref name="Pager 1977"/>
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)