Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Dangling else
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
{{Short description|Problem in computer programming}} The '''dangling else''' is a problem in [[computer programming|programming]] of [[parser generator]]s in which an optional else clause in an [[conditional (computer programming)#If–then(–else)|if–then(–else)]] statement can make nested conditional statements ambiguous. Formally, the reference [[context-free grammar]] of the language is [[ambiguous grammar|ambiguous]], meaning there is more than one correct [[parse tree]]. ==Description== In many [[programming language]]s, one may write conditionally executed code in two forms: the if-then form, or the if-then-else form. (The else clause is optional.): '''if''' a '''then''' s '''if''' b '''then''' s1 '''else''' s2 Ambiguous interpretation becomes possible when there are nested statements; specifically when an if-then-else form replaces the statement <code>s</code> inside the above if-then construct: '''if''' a '''then''' '''if''' b '''then''' s1 '''else''' s2 In this example, <code>s1</code> gets executed if and only if <code>a</code> is true ''and'' <code>b</code> is true. But what about <code>s2</code>? One person might be sure that <code>s2</code> gets executed whenever <code>a</code> is false (by attaching the ''else'' to the first ''if''), while another person might be sure that <code>s2</code> gets executed only when <code>a</code> is true ''and'' <code>b</code> is false (by attaching the ''else'' to the second ''if''). In other words, someone could interpret the previous statement as being equivalent to either of the following unambiguous statements: '''if''' a '''then''' { '''if''' b '''then''' s1 } '''else''' s2 '''if''' a '''then''' { '''if''' b '''then''' s1 '''else''' s2 } The dangling-else problem dates back to [[ALGOL 60]],<ref>{{cite journal |last1=Abrahams |first1=P. W. |title=A final solution to the Dangling else of ALGOL 60 and related languages |journal=Communications of the ACM |volume=9 |issue=9 |pages=679–682 |year=1966 |doi=10.1145/365813.365821 |doi-access=free |s2cid=6777841}}</ref> and subsequent languages have resolved it in various ways. In [[LR parser]]s, the dangling else is the archetypal example of a [[shift-reduce conflict]]. ==Examples== Concrete examples follow. ===C=== In [[C (programming language)|C]], the grammar reads, in part: <pre> statement = ... | selection-statement selection-statement = ... | IF ( expression ) statement | IF ( expression ) statement ELSE statement</pre> Thus, without further rules, the statement <syntaxhighlight lang="C"> if (a) if (b) s; else s2; </syntaxhighlight> could ambiguously be parsed as if it were either: <syntaxhighlight lang="C"> if (a) { if (b) s; else s2; } </syntaxhighlight> or: <syntaxhighlight lang="C"> if (a) { if (b) s; } else s2; </syntaxhighlight> The C standard clarifies that an <code>else</code> block is associated with the nearest <code>if</code>.<ref name="c-standard"/> Therefore, the first tree is chosen. ==Approaches to avoid the issue== ===Avoiding ambiguity while keeping the syntax=== This problem often comes up in [[compiler design|compiler construction]], especially [[scannerless parsing]]. The convention when dealing with the dangling else is to attach the else to the nearby if statement,<ref name="Bison Manual">{{cite book |title=Bison 3.8.1 |section=5.2 Shift/Reduce Conflicts |publisher=GNU |section-url=https://www.gnu.org/software/bison/manual/html_node/Shift_002fReduce.html#Shift_002fReduce |url=https://www.gnu.org/software/bison/manual/html_node/ |access-date=2021-08-07}}</ref> allowing for unambiguous context-free grammars, in particular. Programming languages like Pascal,<ref>ISO 7185:1990 (Pascal) 6.8.3.4: ''An if-statement without an else-part shall not be immediately followed by the token else.''</ref> C,<ref name="c-standard">[https://www.open-std.org/jtc1/sc22/wg14/www/standards ISO 9899]:1999 (C): 6.8.4.1(3): "An else is associated with the lexically nearest preceding if that is allowed by the syntax.", available at [https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf WG14 N1256], p. 134</ref> and Java<ref>{{cite web |title=The Java Language Specification, Java SE 9 Edition, 14.5. Statements |ref=jls-14.5 |url=https://docs.oracle.com/javase/specs/jls/se9/html/jls-14.html}}</ref> follow this convention, so there is no ambiguity in the semantics of the ''language'', though the use of a parser generator may lead to ambiguous ''grammars''. In these cases alternative grouping is accomplished by explicit blocks, such as <code>begin...end</code> in Pascal<ref>{{cite book |last1=Dale |first1=Nell B. |last2=Weems |first2=Chip |title=Introduction to Pascal and Structured Design | section=Dangling Else |pages=160–161 |date=November 1996 |publisher=Jones & Bartlett Learning |isbn=9780763703974 |url=https://books.google.com/books?id=5x2k4vWwn1wC&pg=PA160}}</ref> and <code>{...}</code> in C. Depending on the compiler construction approach, one may take different corrective actions to avoid ambiguity: *If the parser is produced by an SLR, LR(1), or LALR [[LR parser]] generator, the programmer will often rely on the generated parser feature of preferring shift over reduce whenever there is a conflict.<ref name="Bison Manual"/> Alternatively, the grammar can be rewritten to remove the conflict, at the expense of an increase in grammar size (see [[#Avoiding the conflict in LR parsers|below]]). *If the parser is hand-written, the programmer may use a ''non-ambiguous'' context-free grammar. Alternatively, one may rely on a non-context-free grammar or a [[parsing expression grammar]]. ===Avoiding ambiguity by changing the syntax=== The problem can also be solved by making explicit the link between an else and its if, within the syntax. This usually helps avoid human errors.<ref>[https://code.google.com/archive/p/copute/issues/21#c2 Ambiguity of dangling else: non-context-free grammars are semantically opaque]</ref> Possible solutions are: * Having an "end if" symbol delimiting the end of the if construct. Examples of such languages are [[ALGOL 68]], [[Ada (programming language)|Ada]], [[Eiffel (programming language)|Eiffel]], [[PL/SQL]], [[Visual Basic]], [[Modula-2]], and [[AppleScript]]. * Disallowing the statement following a "then" to be an "if" itself (it may however be a pair of statement brackets containing only an if-then-clause). Some implementations of [[ALGOL 60]] follow this approach.<ref>''4.5.1 Conditional Statements — Syntax'' in P. Nauer (ed.), ''Revised Report on the Algorithmic Language ALGOL 60'', CACM 6,1, 1963 pp. 1-17</ref> * Requiring braces (parentheses) when an "else" follows an "if".<ref>[https://code.google.com/archive/p/copute/issues/21#c1 Ambiguity of dangling else: require braces when else follows if]</ref> * Requiring every "if" to be paired with an "else". To avoid a similar problem concerning semantics rather than syntax, [[Racket (programming language)|Racket]] deviates from [[Scheme (programming language)|Scheme]] by considering an <code>if</code> without a fallback clause to be an error, effectively distinguishing conditional ''expressions'' (i.e <code>if</code>) from conditional ''statements'' (i.e <code>when</code> and <code>unless</code>, which do not have fallback clauses). * Using different keywords for the one-alternative and two-alternative "if" statements. [[S-algol]] uses <code>if e do s</code> for the one-alternative case and <code>if e1 then e2 else e3</code> for the general case.<ref>{{citation |last=Davie |first=Antony J. T. |author2=Ronald Morrison |editor=Brian Meek |title=Recursive Descent Compiling |series=Ellis Horwood series in computers and their applications |year=1981 |publisher=Ellis Horwood |location=Chichester, West Sussex |isbn=0-470-27270-8 |page=20}}</ref> * Requiring braces unconditionally, like [[Swift (programming language)|Swift]] and [[Rust (programming language)|Rust]]. This is effectively true in [[Python (programming language)|Python]] as its indentation rules delimit every block, not just those in "if" statements. ===Avoiding the conflict in LR parsers=== The above example could be rewritten in the following way to remove the ambiguity : <pre> statement: open_statement | closed_statement ; open_statement: IF '(' expression ')' statement | IF '(' expression ')' closed_statement ELSE open_statement ; closed_statement: non_if_statement | IF '(' expression ')' closed_statement ELSE closed_statement ; non_if_statement: ... ; </pre> Any other statement-related grammar rules may also have to be duplicated in this way if they may directly or indirectly end with a <code>statement</code> or <code>selection-statement</code> non-terminal. However, we give grammar that includes both of if and while statements. <pre> statement: open_statement | closed_statement ; open_statement: IF '(' expression ')' statement | IF '(' expression ')' closed_statement ELSE open_statement | WHILE '(' expression ')' open_statement ; closed_statement: simple_statement | IF '(' expression ')' closed_statement ELSE closed_statement | WHILE '(' expression ')' closed_statement ; simple_statement: ... ; </pre> Finally, we give the grammar that forbids ambiguous IF statements. <pre> statement: open_statement | closed_statement ; open_statement: IF '(' expression ')' statement | IF '(' expression ')' closed_statement ELSE open_statement | WHILE '(' expression ')' open_statement ; closed_statement: simple_statement | IF '(' expression ')' closed_statement ELSE closed_statement | WHILE '(' expression ')' closed_statement ; simple_statement: ... ; </pre> With this grammar the statement <code>if (a) if (b) c else d</code> can only be parsed one way, because the other interpretation (<code>if (a) {if (b) c} else d</code>) is produced as <pre> statement open_statement IF '(' expression ')' closed_statement ELSE open_statement 'if' '(' 'a' ')' closed_statement 'else' 'd' </pre> and then the parsing fails trying to match <code>closed_statement</code> to "if (b) c". An attempt with <code>closed_statement</code> fails in the same way. The other parse, <code>if (a) {if (b) c else d}</code>) succeeds: <pre> statement open_statement IF '(' expression ')' statement IF '(' expression ')' closed_statement IF '(' a ')' (IF '(' expression ')' closed_statement ELSE closed_statement) IF '(' a ')' (IF '(' b ')' c ELSE 'd') </pre> ==See also== *[[Lexer hack]] *[[Most vexing parse]] ==References== {{Reflist}} [[Category:Ambiguity]] [[Category:Computer programming]] [[Category:Conditional constructs]] [[Category:Parsing]]
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)
Pages transcluded onto the current version of this page
(
help
)
:
Template:Citation
(
edit
)
Template:Cite book
(
edit
)
Template:Cite journal
(
edit
)
Template:Cite web
(
edit
)
Template:Reflist
(
edit
)
Template:Short description
(
edit
)