Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Ambiguous grammar
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
{{Short description|Type of a context-free grammar}} In [[computer science]], an '''ambiguous grammar''' is a [[context-free grammar]] for which there exists a [[string (computer science)|string]] that can have more than one [[leftmost derivation]] or [[parse tree]].<ref name="Levelt2008">{{cite book|author=Willem J. M. Levelt|title=An Introduction to the Theory of Formal Languages and Automata|url=https://books.google.com/books?id=tFvtwGYNe7kC&q=%22leftmost+derivation%22+%22ambiguous%22|year=2008|publisher=John Benjamins Publishing|isbn=978-90-272-3250-2}}</ref>{{sfn|Hopcroft|Motwani|Ullman|2006|p=217}} Every non-empty [[context-free language]] admits an ambiguous grammar by introducing e.g. a duplicate rule. A language that only admits ambiguous grammars is called an [[#Inherently ambiguous languages|inherently ambiguous language]]. [[Deterministic context-free grammar]]s are always unambiguous, and are an important subclass of unambiguous grammars; there are non-deterministic unambiguous grammars, however. For computer [[programming language]]s, the reference grammar is often ambiguous, due to issues such as the [[dangling else]] problem. If present, these ambiguities are generally resolved by adding precedence rules or other [[context-sensitive grammar|context-sensitive]] parsing rules, so the overall phrase grammar is unambiguous.{{citation needed|date=January 2018}} Some parsing algorithms (such as [[Earley parser|Earley]]<ref>{{cite journal|last1=Scott|first1=Elizabeth|title=SPPF-Style Parsing From Earley Recognizers|journal=Electronic Notes in Theoretical Computer Science|date=April 1, 2008|volume=203|issue=2|pages=53β67|doi=10.1016/j.entcs.2008.03.044|doi-access=free}}</ref> or [[Generalized LR parser|GLR]] parsers) can generate sets of parse trees (or "parse forests") from strings that are [[syntactically ambiguous]].<ref>Tomita, Masaru. "[http://anthology.aclweb.org/J/J87/J87-1004.pdf An efficient augmented-context-free parsing algorithm]." Computational linguistics 13.1-2 (1987): 31-46.</ref> == Examples == === Trivial language === The simplest example is the following ambiguous grammar (with start symbol A) for the trivial language that consists of only the empty string: : A β A | Ξ΅ ... meaning that the nonterminal A can be derived to either itself again, or to the empty string. Thus the empty string has leftmost derivations of length 1, 2, 3, and indeed of any length, depending on how many times the rule A β A is used. This language also has an unambiguous grammar, consisting of a single [[Production rule (formal languages)|production rule]]: : A β Ξ΅ ... meaning that the unique production can produce only the empty string, which is the unique string in the language. In the same way, any grammar for a non-empty language can be made ambiguous by adding duplicates. === Unary string === The [[regular language]] of unary strings of a given character, say <code>'a'</code> (the regular expression <code>a*</code>), has the unambiguous grammar: : A β aA | Ξ΅ ... but also has the ambiguous grammar: : A β aA | Aa | Ξ΅ These correspond to producing a [[right-associative]] tree (for the unambiguous grammar) or allowing both left- and right- association. This is elaborated below. === Addition and subtraction === The [[context free grammar]] : A β A + A | A β A | a is ambiguous since there are two leftmost derivations for the string a + a + a: {| border="0" |----- | || A || β A + A | | | A || β A + A |----- | || || β a + A | | | || β A + A + A (First A is replaced by A+A. Replacement of the second A would yield a similar derivation.) |----- | || || β a + A + A | | | || β a + A + A |----- | || || β a + a + A | | | || β a + a + A |----- | || || β a + a + a | | | || β a + a + a |} As another example, the grammar is ambiguous since there are two [[parse tree]]s for the string a + a − a: : [[Image:Leftmostderivations jaredwf.svg|Leftmostderivations jaredwf.svg|400px|class=skin-invert]] The language that it generates, however, is not inherently ambiguous; the following is a non-ambiguous grammar generating the same language: : A β A + a | A β a | a === Dangling else === {{main|Dangling else}} A common example of ambiguity in computer programming languages is the [[dangling else]] problem. In many languages, the <code>else</code> in an [[Conditional (computer programming)#Ifβthen(βelse)|Ifβthen(βelse)]] statement is optional, which results in nested conditionals having multiple ways of being recognized in terms of the context-free grammar. Concretely, in many languages one may write conditionals in two valid forms: the if-then form, and the if-then-else form β in effect, making the else clause optional. In a grammar containing the rules{{efn|The following example uses [[Pascal (programming language)|Pascal]] syntax.}} Statement β '''if''' Condition '''then''' Statement | '''if''' Condition '''then''' Statement '''else''' Statement | ... Condition β ... some ambiguous phrase structures can appear. The expression '''if''' a '''then''' '''if''' b '''then''' s '''else''' s2 can be parsed as either '''if''' a '''then''' '''begin''' '''if''' b '''then''' s '''end''' '''else''' s2 or as '''if''' a '''then''' '''begin''' '''if''' b '''then''' s '''else''' s2 '''end''' depending on whether the <code>else</code> is associated with the first <code>if</code> or second <code>if</code>. This is resolved in various ways in different languages. Sometimes the grammar is modified so that it is unambiguous, such as by requiring an <code>endif</code> statement or making <code>else</code> mandatory. In other cases the grammar is left ambiguous, but the ambiguity is resolved by making the overall phrase grammar context-sensitive, such as by associating an <code>else</code> with the nearest <code>if</code>. In this latter case the grammar is unambiguous, but the context-free grammar is ambiguous.{{clarify|reason=There is no such thing as an 'overall phrase grammar'. The 'nearest-if' rule can possibly be implemented by using a slightly modified, but still context-free grammar.|date=January 2017}} === An unambiguous grammar with multiple derivations === The existence of multiple derivations of the same string does not suffice to indicate that the grammar is ambiguous; only multiple ''leftmost'' derivations (or, equivalently, multiple parse trees) indicate ambiguity. For example, the simple grammar S β A + A A β 0 | 1 is an unambiguous grammar for the language { 0+0, 0+1, 1+0, 1+1 }. While each of these four strings has only one leftmost derivation, it has two different derivations, for example S [[Context-free grammar#Rule application|β]] A + A β 0 + A β 0 + 0 and S β A + A β A + 0 β 0 + 0 Only the former derivation is a leftmost one. == Recognizing ambiguous grammars == The [[decision problem]] of whether an arbitrary grammar is ambiguous is [[Undecidable problem|undecidable]] because it can be shown that it is equivalent to the [[Post correspondence problem]].{{sfn|Hopcroft|Motwani|Ullman|2006|loc=Theorem 9.20|p=415}} At least, there are tools implementing some [[semi-decidable|semi-decision procedure]] for detecting ambiguity of context-free grammars.<ref>{{cite conference | last1 = Axelsson | first1 = Roland | last2 = Heljanko | first2 = Keijo | last3 = Lange | first3 = Martin | year = 2008 | title = Analyzing Context-Free Grammars Using an Incremental SAT Solver | book-title = Proceedings of the 35th [[International Colloquium on Automata, Languages and Programming]] (ICALP'08), Reykjavik, Iceland | series = [[Lecture Notes in Computer Science]] | volume = 5126 | pages = 410β422 | publisher = Springer-Verlag | doi = 10.1007/978-3-540-70583-3_34 | isbn = 978-3-540-70582-6 | url = http://www.tcs.hut.fi/~kepa/publications/AxelssonHeljankoLange-ICALP08.pdf }}</ref> The efficiency of parsing a context-free grammar is determined by the automaton that accepts it. [[Deterministic context-free grammar]]s are accepted by [[deterministic pushdown automata]] and can be parsed in linear time, for example by an [[LR parser]].<ref>{{Cite journal | last1 = Knuth | first1 = D. E. | authorlink = Donald Knuth | title = On the translation of languages from left to right | doi = 10.1016/S0019-9958(65)90426-2 | journal = Information and Control | volume = 8 | issue = 6 | pages = 607β639 | date = July 1965 | doi-access = }}</ref> They are a strict subset of the [[context-free grammars]], which are accepted by [[pushdown automata]] and can be parsed in polynomial time, for example by the [[CYK algorithm]]. Unambiguous context-free grammars can be nondeterministic. For example, the language of even-length [[palindrome]]s on the alphabet of 0 and 1 has the unambiguous context-free grammar S β 0S0 | 1S1 | Ξ΅. An arbitrary string of this language cannot be parsed without reading all its symbols first, which means that a pushdown automaton has to try alternative state transitions to accommodate for the different possible lengths of a semi-parsed string.{{sfn|Hopcroft|Motwani|Ullman|2006|pp=254-6}} Nevertheless, removing grammar ambiguity may produce a deterministic context-free grammar and thus allow for more efficient parsing. Compiler generators such as [[YACC]] include features for resolving some kinds of ambiguity, such as by using the precedence and associativity constraints. == Inherently ambiguous languages == While some context-free languages (the set of strings that can be generated by a grammar) have both ambiguous and unambiguous grammars, there exist context-free languages for which no unambiguous context-free grammar exists. Such languages are called '''inherently ambiguous'''. There are no inherently ambiguous regular languages.<ref>{{Cite journal |last1=Book |first1=R. |last2=Even |first2=S. |last3=Greibach |first3=S. |last4=Ott |first4=G. |date=Feb 1971 |title=Ambiguity in Graphs and Expressions |url=http://dx.doi.org/10.1109/t-c.1971.223204 |journal=IEEE Transactions on Computers |volume=C-20 |issue=2 |pages=149β153 |doi=10.1109/t-c.1971.223204 |s2cid=20676251 |issn=0018-9340|url-access=subscription }}</ref><ref>{{Cite web |title=formal languages - Can regular expressions be made unambiguous? |url=https://mathoverflow.net/questions/45149/can-regular-expressions-be-made-unambiguous |access-date=2023-02-23 |website=MathOverflow |language=en}}</ref> The existence of inherently ambiguous context-free languages was proven with [[Parikh's theorem]] in 1961 by [[Rohit Jivanlal Parikh|Rohit Parikh]] in an MIT research report.<ref>{{cite book | last = Parikh | first = Rohit | title = Language-generating devices | publisher = Quarterly Progress Report, Research Laboratory of Electronics, MIT | date = January 1961}}</ref> The language <math>\{x | x=a^n b^m a^{n^{\prime}} b^m \text { or } x=a^n b^m a^n b^{m^{\prime}}, \text { where } n, n', m, m' \geq 1\}</math> is inherently ambiguous.<ref>{{Cite journal |last=Parikh |first=Rohit J. |date=1966-10-01 |title=On Context-Free Languages |journal=Journal of the ACM |volume=13 |issue=4 |pages=570β581 |doi=10.1145/321356.321364 |s2cid=12263468 |issn=0004-5411|doi-access=free }} Here: Theorem 3.</ref> [[Ogden's lemma]]<ref>{{Cite journal |last=Ogden |first=William |date=Sep 1968 |title=A helpful result for proving inherent ambiguity |url=http://dx.doi.org/10.1007/bf01694004 |journal=Mathematical Systems Theory |volume=2 |issue=3 |pages=191β194 |doi=10.1007/bf01694004 |s2cid=13197551 |issn=0025-5661|url-access=subscription }}</ref> can be used to prove that certain context-free languages, such as <math>\{a^nb^mc^m | m, n \geq 1\} \cup \{a^mb^mc^n | m, n \geq 1\}</math>, are inherently ambiguous. See ''{{slink|Ogden's lemma#Inherent ambiguity}}'' for a proof. The union of <math>\{a^n b^n c^m d^m \mid n, m > 0\}</math> with <math>\{a^n b^m c^m d^n \mid n, m > 0\}</math> is inherently ambiguous. This set is context-free, since the union of two context-free languages is always context-free. But {{harvtxt|Hopcroft|Ullman|1979}} give a proof that no context-free grammar for this union language can unambiguously parse strings of form <math>a^n b^n c^n d^n, (n > 0)</math>.{{sfn|Hopcroft | Ullman|1979|p=99-103|loc=Sect. 4.7}} More examples, and a general review of techniques for proving inherent ambiguity of context-free languages, are found given by Bassino and Nicaud (2011).<ref>{{Cite web |author=FredΓ©rique Bassino and Cyril Nicaud |date=December 16, 2011 |title=Philippe Flajolet & Analytic Combinatorics: Inherent Ambiguity of Context-Free Languages |url=https://algo.inria.fr/pfac/PFAC/Program_files/nicaud.pdf |url-status=live |archive-url=https://web.archive.org/web/20220925135141/http://algo.inria.fr/pfac/PFAC/Program_files/nicaud.pdf |archive-date=2022-09-25}}</ref> == See also == * [[GLR parser]], a type of parser for ambiguous and nondeterministic grammars * [[Chart parser]], another type of parser for ambiguous grammars * [[Syntactic ambiguity]] == Citations == === Notes === {{notelist}} === References === {{sfn whitelist|CITEREFHopcroftUllman1979|CITEREFHopcroftMotwaniUllman2006}} {{reflist}} * {{Hopcroft and Ullman 1979|author-link=no|title-link=no}} * {{Hopcroft, Motwani, and Ullman 2006}} === Further reading === * {{cite journal |last1=Brabrand |first1=Claus |last2=Giegerich |first2=Robert |last3=MΓΈller |first3=Anders |date=March 2010 |title=Analyzing Ambiguity of Context-Free Grammars |journal=Science of Computer Programming |volume=75 |issue=3 |pages=176β191 |publisher=Elsevier |doi=10.1016/j.scico.2009.11.002 |citeseerx=10.1.1.86.3118 }} * {{cite journal | last = Gross | first = Maurice | title = Inherent ambiguity of minimal linear grammars | journal = Information and Control | volume = 7 | issue = 3 | pages = 366β368 | date = September 1964 | doi = 10.1016/S0019-9958(64)90422-X | doi-access = free }} * {{cite book | last = Harrison | first = Michael | author-link = Michael A. Harrison | title = Introduction to Formal Language Theory | publisher = Addison-Wesley | year = 1978 | isbn = 0201029553 }} == External links == * [http://www.brics.dk/grammar dk.brics.grammar] - a grammar ambiguity analyzer. * [https://web.archive.org/web/20110719055512/http://www2.tcs.ifi.lmu.de/~mlange/cfganalyzer/index.html CFGAnalyzer] - tool for analyzing context-free grammars with respect to language universality, ambiguity, and similar properties. [[Category:Formal languages]] [[Category:Ambiguity]]
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)
Pages transcluded onto the current version of this page
(
help
)
:
Template:Citation needed
(
edit
)
Template:Cite book
(
edit
)
Template:Cite conference
(
edit
)
Template:Cite journal
(
edit
)
Template:Cite web
(
edit
)
Template:Clarify
(
edit
)
Template:Efn
(
edit
)
Template:Harvtxt
(
edit
)
Template:Hopcroft, Motwani, and Ullman 2006
(
edit
)
Template:Hopcroft and Ullman 1979
(
edit
)
Template:Main
(
edit
)
Template:Notelist
(
edit
)
Template:Reflist
(
edit
)
Template:Sfn
(
edit
)
Template:Sfn whitelist
(
edit
)
Template:Short description
(
edit
)
Template:Slink
(
edit
)