Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Packrat parser
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
== Memoization technique == Memoization is an [[Program optimization|optimization]] technique in computing that aims to speed up programs by storing the results of expensive function calls. This technique essentially works by [[Cache (computing)|caching]] the results so that when the same inputs occur again, the cached result is simply returned, thus avoiding the time-consuming process of re-computing.<ref>{{Cite journal |last=Norvig |first=Peter |date=1991-03-01 |title=Techniques for automatic memoization with applications to context-free parsing |url=https://dl.acm.org/doi/abs/10.5555/971738.971743 |journal=Computational Linguistics |volume=17 |issue=1 |pages=91β98 |doi= |issn=0891-2017}}</ref> When using packrat parsing and memoization, it's noteworthy that the parsing function for each nonterminal is solely based on the input string. It does not depend on any information gathered during the parsing process. Essentially, memoization table entries do not affect or rely on the parser's specific state at any given time.<ref>{{Cite book |last1=Dubroy |first1=Patrick |last2=Warth |first2=Alessandro |title=Proceedings of the 10th ACM SIGPLAN International Conference on Software Language Engineering |chapter=Incremental packrat parsing |date=2017-10-23 |chapter-url=https://doi.org/10.1145/3136014.3136022 |series=SLE 2017 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=14β25 |doi=10.1145/3136014.3136022 |isbn=978-1-4503-5525-4|s2cid=13047585 }}</ref> Packrat parsing stores results in a matrix or similar data structure that allows for quick look-ups and insertions. When a production is encountered, the matrix is checked to see if it has already occurred. If it has, the result is retrieved from the matrix. If not, the production is evaluated, the result is inserted into the matrix, and then returned.<ref name=":4">{{Cite journal |last1=Science |first1=International Journal of Scientific Research in |last2=Ijsrset |first2=Engineering and Technology |title=A Survey of Packrat Parser |url=https://www.academia.edu/17779983 |journal=A Survey of Packrat Parser}}</ref> When evaluating the entire <math>m*n</math> matrix in a tabular approach, it would require <math>\Theta(mn)</math> space.<ref name=":4" /> Here, <math>m</math> represents the number of nonterminals, and <math>n</math> represents the input string size. In a naΓ―ve implementation, the entire table can be derived from the input string starting from the end of the string. The Packrat parser can be improved to update only the necessary cells in the matrix through a depth-first visit of each subexpression tree. Consequently, using a matrix with dimensions of <math>m*n</math> is often wasteful, as most entries would remain empty.<ref name=":2" /> These cells are linked to the input string, not to the nonterminals of the grammar. This means that increasing the input string size would always increase memory consumption, while the number of parsing rules changes only the worst space complexity.<ref name=":3" /> === Cut operator === Another operator called ''cut'' has been introduced to Packrat to reduce its average space complexity even further. This operator utilizes the formal structures of many programming languages to eliminate impossible derivations. For instance, control statements parsing in a standard programming language is mutually exclusive from the first recognized token, e.g.,<math>\{\mathtt{if, do, while, switch}\}</math>.<ref name=":5">{{Cite book |last1=Mizushima |first1=Kota |last2=Maeda |first2=Atusi |last3=Yamaguchi |first3=Yoshinori |title=Proceedings of the 9th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering |chapter=Packrat parsers can handle practical grammars in mostly constant space |date=2010-05-06 |chapter-url=https://doi.org/10.1145/1806672.1806679 |series=PASTE '10 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=29β36 |doi=10.1145/1806672.1806679 |isbn=978-1-4503-0082-7|s2cid=14498865 }}</ref> {| class="wikitable" !Operator !Semantics |- |Cut <math>\begin{array}{l} \alpha \uparrow \beta / \gamma \\ (\alpha \uparrow \beta)* \end{array} </math> |if <math>\alpha </math> is recognized but <math>\beta </math> is not, skip the evaluation of the alternative. In the first case don't evaluate <math>\gamma </math> if <math>\alpha </math> was recognized The second rule is can be rewritten as <math>N \rightarrow \alpha \uparrow \beta N / \epsilon </math> and the same rules can be applied. |} When a Packrat parser uses cut operators, it effectively clears its backtracking stack. This is because a cut operator reduces the number of possible alternatives in an ordered choice. By adding cut operators in the right places in a grammar's definition, the resulting Packrat parser only needs a nearly constant amount of space for memoization.<ref name=":5" />
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)