Top-down parsing language
== Definition of a TDPL grammar ==
Formally, a '''TDPL grammar''' ''G'' is a quadruple consisting of the following components:
* A finite set ''N'' of ''nonterminal symbols''.
* A finite set Σ of ''terminal symbols'' that is disjoint from ''N''.
* A finite set ''P'' of ''[[Production rule (formal languages)|production rule]]s'', where a rule has one of the following forms:
** ''A'' ← ε, where ''A'' is a nonterminal and ε is the empty string.
** ''A'' ← ''f'', where ''f'' is a distinguished symbol representing ''unconditional failure''.
** ''A'' ← ''a'', where ''a'' is any terminal symbol.
** ''A'' ← ''BC/D'', where ''B'', ''C'', and ''D'' are nonterminals.

=== Interpretation of a grammar ===
A TDPL grammar can be viewed as an extremely minimalistic formal representation of a [[recursive descent parser]], in which each of the nonterminals schematically represents a parsing [[function (programming)|function]]. Each of these nonterminal-functions takes as its input argument a string to be recognized, and yields one of two possible outcomes:
* ''success'', in which case the function may optionally move forward or ''consume'' one or more characters of the input string supplied to it, or
* ''failure'', in which case no input is consumed.

Note that a nonterminal-function may succeed without actually consuming any input, and this is considered an outcome distinct from failure.

A nonterminal ''A'' defined by a rule of the form ''A'' ← ε always succeeds without consuming any input, regardless of the input string provided. Conversely, a nonterminal defined by a rule of the form ''A'' ← ''f'' always fails regardless of input. A nonterminal defined by a rule of the form ''A'' ← ''a'' succeeds if the next character in the input string is the terminal ''a'', in which case it consumes that one terminal; if the next input character does not match (or there is no next character), then the nonterminal fails.

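The three simple rule forms can be sketched as small Python functions. This is only an illustration, not part of the formal definition: the names <code>empty</code>, <code>fail</code> and <code>terminal</code> are hypothetical, and the convention of returning the new input position on success and <code>None</code> on failure is an assumption made here to keep "success without consuming input" distinct from "failure".

```python
from typing import Callable, Optional

# Assumed convention: a parsing function takes the input text and a position,
# and returns the position after the match, or None on failure.
ParseFn = Callable[[str, int], Optional[int]]


def empty(text: str, pos: int) -> Optional[int]:
    """A ← ε: always succeeds and consumes nothing."""
    return pos


def fail(text: str, pos: int) -> Optional[int]:
    """A ← f: always fails, whatever the input."""
    return None


def terminal(ch: str) -> ParseFn:
    """A ← a: succeeds only if the next character is exactly `ch`."""
    def match(text: str, pos: int) -> Optional[int]:
        if pos < len(text) and text[pos] == ch:
            return pos + 1  # consume that one terminal
        return None  # mismatch, or no next character
    return match
```

Under this convention, <code>empty("abc", 1)</code> returns the unchanged position <code>1</code>, which is a success, whereas <code>fail("abc", 1)</code> returns <code>None</code>.
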
A nonterminal ''A'' defined by a rule of the form ''A'' ← ''BC/D'' first [[recursion|recursively]] invokes nonterminal ''B'', and if ''B'' succeeds, invokes ''C'' on the remainder of the input string left unconsumed by ''B''. If both ''B'' and ''C'' succeed, then ''A'' in turn succeeds and consumes the same total number of input characters that ''B'' and ''C'' together did. If either ''B'' or ''C'' fails, however, then ''A'' [[backtracking|backtracks]] to the original point in the input string where it was first invoked, and then invokes ''D'' on that original input string, returning whatever result ''D'' produces.

=== Examples ===
The following TDPL grammar describes the [[regular language]] consisting of an arbitrary-length sequence of a's and b's:
: ''S'' ← ''AS/T''
: ''T'' ← ''BS/E''
: ''A'' ← a
: ''B'' ← b
: ''E'' ← ε

The following grammar describes the [[context-free language|context-free]] [[Dyck language]] consisting of arbitrary-length strings of matched braces, such as <nowiki>'{}', '{{}{{}}}',</nowiki> etc.:
: ''S'' ← ''OT/E''
: ''T'' ← ''SU/F''
: ''U'' ← ''CS/F''
: ''O'' ← {
: ''C'' ← }
: ''E'' ← ε
: ''F'' ← ''f''

The above examples can be represented equivalently but much more succinctly in [[parsing expression grammar]] notation as {{code|2=peg|S ← (a/b)*}} and {{code|2=peg|S ← ({S})*}}, respectively.
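
The interpretation of the four rule forms, applied to the two example grammars, can be sketched as a small table-driven recognizer in Python. This is a minimal sketch under stated assumptions rather than a reference implementation: the tuple encoding of rules, the function names <code>parse</code> and <code>recognizes</code>, and the convention of returning the end position (or <code>None</code> on failure) are all hypothetical choices made for illustration. The sketch uses plain recursion without memoization, so it is only suitable for short inputs.

```python
from typing import Dict, Optional, Tuple

# Hypothetical encoding of the four rule forms:
#   ("eps",)               A ← ε
#   ("fail",)              A ← f
#   ("term", "a")          A ← a
#   ("seq", "B", "C", "D") A ← BC/D
Rule = Tuple[str, ...]
Grammar = Dict[str, Rule]


def parse(grammar: Grammar, name: str, text: str, pos: int = 0) -> Optional[int]:
    """Invoke nonterminal `name` at `pos`; return the end position or None."""
    rule = grammar[name]
    kind = rule[0]
    if kind == "eps":
        return pos  # succeed without consuming input
    if kind == "fail":
        return None  # unconditional failure
    if kind == "term":
        if pos < len(text) and text[pos] == rule[1]:
            return pos + 1
        return None
    if kind == "seq":
        _, b, c, d = rule
        mid = parse(grammar, b, text, pos)
        if mid is not None:
            end = parse(grammar, c, text, mid)
            if end is not None:
                return end  # both B and C succeeded
        # B or C failed: backtrack to the original position and try D.
        return parse(grammar, d, text, pos)
    raise ValueError(f"unknown rule form: {rule!r}")


def recognizes(grammar: Grammar, start: str, text: str) -> bool:
    """True if `start` succeeds and consumes the whole of `text`."""
    return parse(grammar, start, text) == len(text)


# Sequences of a's and b's.
AB: Grammar = {
    "S": ("seq", "A", "S", "T"),
    "T": ("seq", "B", "S", "E"),
    "A": ("term", "a"),
    "B": ("term", "b"),
    "E": ("eps",),
}

# Strings of matched braces.
DYCK: Grammar = {
    "S": ("seq", "O", "T", "E"),
    "T": ("seq", "S", "U", "F"),
    "U": ("seq", "C", "S", "F"),
    "O": ("term", "{"),
    "C": ("term", "}"),
    "E": ("eps",),
    "F": ("fail",),
}
```

Because a nonterminal may succeed on a prefix of the input, the helper <code>recognizes</code> additionally checks that the whole string was consumed. For instance, in this sketch <code>parse(DYCK, "S", "{")</code> succeeds with position <code>0</code>, since ''S'' backtracks and falls back to ''E'', but <code>recognizes(DYCK, "S", "{")</code> is false.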