Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
GNU Bison
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
{{Short description|Yacc-compatible parser generator program}} {{Infobox software | name = GNU Bison | logo = Heckert GNU white.svg | logo size = 100px | author = Robert Corbett | developer = The [[GNU Project]] | released = {{Start date and age|1985|06}}<ref name="Corbett85"/> | discontinued = <!-- Set to yes if software is discontinued, otherwise omit. --> | ver layout = <!-- simple (default) or stacked --> | latest release version = {{wikidata|property|preferred|references|edit|Q1071969|P348|P548=Q2804309}} | latest release date = {{wikidata|qualifier|preferred|single|Q1071969|P348|P548=Q2804309|P577}} | latest preview version = {{wikidata|property|preferred|references|edit|Q1071969|P348|P548=Q51930650}} | latest preview date = {{wikidata|qualifier|preferred|single|Q1071969|P348|P548=Q51930650|P577}} | programming language = [[C (programming language)|C]] and [[m4 (computer language)|m4]] | operating system = [[Unix-like]] | platform = | genre = [[Parser generator]] | license = [[GNU General Public License|GPL]] }} '''GNU Bison''', commonly known as '''Bison''', is a [[parser generator]] that is part of the [[GNU Project]]. Bison reads a specification in Bison syntax (described as "machine-readable [[Backus–Naur form|BNF]]"<ref>{{Cite web|title=Language and Grammar (Bison 3.8.1)|url=https://www.gnu.org/software/bison/manual/html_node/Language-and-Grammar.html|access-date=2021-12-26|website=www.gnu.org}}</ref>), warns about any [[parsing]] ambiguities, and generates a parser that reads sequences of [[Lexical analysis#Token|token]]s and decides whether the sequence conforms to the syntax specified by the grammar. The generated parsers are portable: they do not require any specific compilers. Bison by default generates [[LALR parser|LALR(1) parsers]] but it can also generate [[canonical LR parser|canonical LR]], IELR(1) and [[GLR parser|GLR]] parsers.<ref>[https://www.gnu.org/software/bison/manual/html_node/Introduction.html Bison Manual: Introduction.]</ref> In [[POSIX]] mode, Bison is compatible with [[Yacc]], but also has several extensions over this earlier program, including * Generation of counterexamples for conflicts * Location tracking (e.g., file, line, column) * Rich and internationalizable syntax error messages in the generated parsers * Customizable syntax error generation, * Reentrant parsers * Push parsers, with autocompletion * Support for named references * Several types of reports (graphical, XML) on the generated parser * Support for several programming languages ([[C (programming language)|C]], [[C++]], [[D (programming language)|D]], or [[Java (programming language)|Java]]) [[Flex (lexical analyser generator)|Flex]], an automatic [[Lexical analysis|lexical analyser]], is often used with Bison, to tokenise input data and provide Bison with tokens.<ref>{{Cite book | last=Levine | first=John | author-link=John R. Levine | title=flex & bison | publisher = O'Reilly Media | date=August 2009 | url=http://oreilly.com/catalog/9780596155988 | isbn=978-0-596-15597-1}}</ref> Bison was originally written by Robert Corbett in 1985.<ref name="Corbett85">{{cite thesis |last1=Corbett |first1=Robert Paul |date=June 1985 |title=Static Semantics and Compiler Error Recovery |type=Ph.D. |publisher=[[University of California, Berkeley]] |id=[[Defense Technical Information Center|DTIC]] [https://web.archive.org/web/20170608112235/http://www.dtic.mil/docs/citations/ADA611756 ADA611756]{{cbignore|bot=medic}}}}</ref> Later, in 1989, Robert Corbett released another parser generator named [[Berkeley Yacc]]. Bison was made Yacc-compatible by [[Richard Stallman]].<ref name="authors">{{cite web |title=AUTHORS |url=http://git.savannah.gnu.org/cgit/bison.git/tree/AUTHORS |department=bison.git |work=[[GNU Savannah]] |access-date=2017-08-26}}</ref> Bison is [[free software]] and is available under the [[GNU General Public License]], with an exception (discussed below) allowing its generated code to be used without triggering the [[copyleft]] requirements of the licence. ==Features== ===Counterexample generation=== One delicate issue with LR parser generators is the resolution of conflicts (shift/reduce and reduce/reduce conflicts). With many LR parser generators, resolving conflicts requires the analysis of the parser automaton, which demands some expertise from the user. To aid the user in understanding conflicts more intuitively, Bison can instead automatically generate counterexamples. For [[ambiguous grammar]]s, Bison often can even produce counterexamples that show the grammar is ambiguous. For instance, on a grammar suffering from the infamous [[dangling else]] problem, Bison reports {{Pre| doc/if-then-else.y: {{font color|purple|warning}}: shift/reduce conflict on token "else" [-{{font color|purple|Wcounterexamples}}] Example: {{font color|orange|"if" expr "then"}} {{font color|blue|"if" expr "then" stmt}} {{font color|red|•}} {{font color|blue|"else" stmt}} Shift derivation {{font color|orange|if_stmt}} {{font color|orange|↳ "if" expr "then"}} {{font color|green|stmt}} {{font color|green|↳}} {{font color|blue|if_stmt}} {{font color|blue|↳ "if" expr "then" stmt}} {{font color|red|•}} {{font color|blue|"else" stmt}} Example: {{font color|orange|"if" expr "then"}} {{font color|blue|"if" expr "then" stmt}} {{font color|red|•}} {{font color|orange|"else" stmt}} Reduce derivation {{font color|orange|if_stmt}} {{font color|orange|↳ "if" expr "then"}} {{font color|green|stmt}} {{font color|orange|"else" stmt}} {{font color|green|↳}} {{font color|blue|if_stmt}} {{font color|blue|↳ "if" expr "then" stmt}} {{font color|red|•}} }} ===Reentrancy=== Reentrancy is a feature which has been added to Bison and does not exist in Yacc. Normally, Bison generates a parser which is not [[Reentrant (subroutine)|reentrant]]. In order to achieve reentrancy the declaration <code>%define api.pure</code> must be used. More details on Bison reentrancy can be found in the Bison manual.<ref>[https://www.gnu.org/software/bison/manual/bison.html#Pure-Decl Bison Manual: A Pure (Reentrant) Parser]</ref> ===Output languages=== Bison can generate code for [[C (programming language)|C]], [[C++]], [[D (programming language)|D]] and [[Java (programming language)|Java]].<ref>[https://www.gnu.org/software/bison/manual/html_node/Decl-Summary.html Bison Manual: Bison Declaration Summary]</ref> For using the Bison-generated parser from other languages a [[language binding]] tool such as [[SWIG]] can be used. ==License and distribution of generated code== Because Bison generates source code that in turn gets added to the source code of other software projects, it raises some simple but interesting copyright questions. ===A GPL-compatible license is not required=== The code generated by Bison includes significant amounts of code from the Bison project itself. The Bison package is distributed under the terms of the [[GNU General Public License]] (GPL) but an exception has been added so that the GPL does not apply to output.<ref>[https://www.gnu.org/software/bison/manual/html_node/Conditions.html Bison Manual: Conditions for Using Bison]</ref><ref>[http://git.savannah.gnu.org/cgit/bison.git/tree/src/parse-gram.c A source code file, parse-gram.c, which includes the exception]</ref> Earlier releases of Bison stipulated that parts of its output were also licensed under the GPL, due to the inclusion of the yyparse() function from the original source code in the output. ===Distribution of packages using Bison=== Free software projects that use Bison may have a choice of whether to distribute the source code which their project feeds into Bison, or the resulting C code made output by Bison. Both are sufficient for a recipient to be able to compile the project source code. However, distributing only the input carries the minor inconvenience that the recipients must have a compatible copy of Bison installed so that they can generate the necessary C code when compiling the project. And distributing only the C code in output, creates the problem of making it very difficult for the recipients to modify the parser since this code was written neither ''by'' a human nor ''for'' humans - its purpose is to be fed directly into a C compiler. These problems can be avoided by distributing both the input files and the generated code. Most people will compile using the generated code, no different from any other software package, but anyone who wants to modify the parser component can modify the input files first and re-generate the generated files before compiling. Projects distributing both usually do not have the generated files in their [[version control]] systems. The files are only generated when making a release. Some licenses, such as the [[GNU General Public License|GPL]], require that the source code be in "''the preferred form of the work for making modifications to it''". GPL'd projects using Bison must thus distribute the files which are the input for Bison. Of course, they can also include the generated files. ==Use== Because Bison was written as a replacement for Yacc, and is largely compatible, the code from a lot of projects using Bison could equally be fed into Yacc. This makes it difficult to determine if a project "uses" Bison-specific source code or not. In many cases, the "use" of Bison could be trivially replaced by the equivalent use of Yacc or one of its other derivatives. Bison has features not found in Yacc, so some projects can be truly said to "use" Bison, since Yacc would not suffice. The following list is of projects which are known to "use" Bison in the looser sense, that they use free software development tools and distribute code which is intended to be fed into Bison or a Bison-compatible package. * [[Bash (Unix shell)|Bash]] shell uses a yacc grammar for parsing the command input. * Bison's own grammar parser is generated by Bison.<ref name="parse-gram.y">{{cite web |title=parse-gram.y |url=http://git.savannah.gnu.org/cgit/bison.git/tree/src/parse-gram.y |department=bison.git |work=[[GNU Savannah]] |access-date=2020-07-29}}</ref> * [[CMake]] uses several Bison grammars.<ref>{{cite web|url=https://github.com/Kitware/CMake/tree/master/Source/LexerParser|title=LexerParser in CMake|website=github.com}}</ref> * [[GNU Compiler Collection|GCC]] started out using Bison, but switched to a hand-written [[Recursive descent parser|recursive-descent parser]] for C++ in 2004 (version 3.4),<ref>[https://gcc.gnu.org/gcc-3.4/changes.html GCC 3.4 Release Series Changes, New Features, and Fixes]</ref> and for C and Objective-C in 2006 (version 4.1)<ref>[https://gcc.gnu.org/gcc-4.1/changes.html GCC 4.1 Release Series Changes, New Features, and Fixes]</ref> * The [[Go (programming language)|Go]] programming language (GC) used Bison, but switched to a hand-written scanner and parser in version 1.5.<ref>[https://forum.golangbridge.org/t/golang-grammar-definition/14473/3 Golang grammar definition]</ref> * [[LilyPond]] requires Bison to generate its parser.<ref>{{Cite web|url=https://git.savannah.gnu.org/cgit/lilypond.git/tree/lily/parser.yy|title=Parser.yy - GNU LilyPond Git Repository|website=git.savannah.gnu.org}}</ref> * [[MySQL]]<ref>{{Cite web|url=https://www.safaribooksonline.com/library/view/flex-bison/9780596805418/ch04.html|title=4. Parsing SQL - flex & bison [Book]}}</ref> * [[GNU Octave]] uses a Bison-generated parser.<ref>{{Cite web|url=http://octave.org/doxygen/4.0/d5/d60/oct-parse_8cc_source.html|title = GNU Octave: Libinterp/Parse-tree/Oct-parse.cc Source File}}</ref> * [[Perl 5]] uses a Bison-generated parser starting in 5.10.<ref>{{cite web|url=http://perldoc.perl.org/perl5100delta.html|title=What is new for perl 5.10.0?|publisher=perl.org}}</ref> * The [[PHP]] programming language (Zend Parser). * [[PostgreSQL]]<ref>{{cite web|url=https://www.postgresql.org/docs/current/parser-stage.html|publisher=postgresql.org|title=The Parser Stage|date=30 September 2021}}</ref> * [[Ruby MRI]], the reference implementation of the [[Ruby (programming language)|Ruby]] programming language, relies on a Bison grammar.<ref>{{cite web|url=https://github.com/ruby/ruby/blob/master/parse.y|website=github.com |title=Ruby MRI Parser}}</ref> * [[syslog-ng]] uses several Bison grammars assembled together.<ref>{{cite web|url=https://github.com/syslog-ng/syslog-ng/blob/2d594f664a5d8c01081c96c966ac3d05ef8675d4/modules/xml/xml-grammar.ym|website=github.com |title=syslog-ng's XML Parser|date=14 October 2021 }}</ref> ==A complete reentrant parser example== {{how-to|section|date=September 2016}} The following example shows how to use Bison and flex to write a simple calculator program (only addition and multiplication) and a program for creating an [[abstract syntax tree]]. The next two files provide definition and implementation of the syntax tree functions. <syntaxhighlight lang="c"> /* * Expression.h * Definition of the structure used to build the syntax tree. */ #ifndef __EXPRESSION_H__ #define __EXPRESSION_H__ /** * @brief The operation type */ typedef enum tagEOperationType { eVALUE, eMULTIPLY, eADD } EOperationType; /** * @brief The expression structure */ typedef struct tagSExpression { EOperationType type; /* /< type of operation */ int value; /* /< valid only when type is eVALUE */ struct tagSExpression *left; /* /< left side of the tree */ struct tagSExpression *right; /* /< right side of the tree */ } SExpression; /** * @brief It creates an identifier * @param value The number value * @return The expression or NULL in case of no memory */ SExpression *createNumber(int value); /** * @brief It creates an operation * @param type The operation type * @param left The left operand * @param right The right operand * @return The expression or NULL in case of no memory */ SExpression *createOperation(EOperationType type, SExpression *left, SExpression *right); /** * @brief Deletes a expression * @param b The expression */ void deleteExpression(SExpression *b); #endif /* __EXPRESSION_H__ */ </syntaxhighlight> <syntaxhighlight lang="c"> /* * Expression.c * Implementation of functions used to build the syntax tree. */ #include "Expression.h" #include <stdlib.h> /** * @brief Allocates space for expression * @return The expression or NULL if not enough memory */ static SExpression *allocateExpression() { SExpression *b = (SExpression *)malloc(sizeof(SExpression)); if (b == NULL) return NULL; b->type = eVALUE; b->value = 0; b->left = NULL; b->right = NULL; return b; } SExpression *createNumber(int value) { SExpression *b = allocateExpression(); if (b == NULL) return NULL; b->type = eVALUE; b->value = value; return b; } SExpression *createOperation(EOperationType type, SExpression *left, SExpression *right) { SExpression *b = allocateExpression(); if (b == NULL) return NULL; b->type = type; b->left = left; b->right = right; return b; } void deleteExpression(SExpression *b) { if (b == NULL) return; deleteExpression(b->left); deleteExpression(b->right); free(b); } </syntaxhighlight> The tokens needed by the Bison parser will be generated using flex. <syntaxhighlight lang="c"> %{ /* * Lexer.l file * To generate the lexical analyzer run: "flex Lexer.l" */ #include "Expression.h" #include "Parser.h" #include <stdio.h> %} %option outfile="Lexer.c" header-file="Lexer.h" %option warn nodefault %option reentrant noyywrap never-interactive nounistd %option bison-bridge %% [ \r\n\t]* { continue; /* Skip blanks. */ } [0-9]+ { sscanf(yytext, "%d", &yylval->value); return TOKEN_NUMBER; } "*" { return TOKEN_STAR; } "+" { return TOKEN_PLUS; } "(" { return TOKEN_LPAREN; } ")" { return TOKEN_RPAREN; } . { continue; /* Ignore unexpected characters. */} %% int yyerror(SExpression **expression, yyscan_t scanner, const char *msg) { fprintf(stderr, "Error: %s\n", msg); return 0; } </syntaxhighlight> The names of the tokens are typically neutral: "TOKEN_PLUS" and "TOKEN_STAR", not "TOKEN_ADD" and "TOKEN_MULTIPLY". For instance if we were to support the unary "+" (as in "+1"), it would be wrong to name this "+" "TOKEN_ADD". In a language such as C, "int *ptr" denotes the definition of a pointer, not a product: it would be wrong to name this "*" "TOKEN_MULTIPLY". Since the tokens are provided by flex we must provide the means to communicate between the [[Lexical analysis|parser and the lexer]].<ref name="flex-bison-bridge">[http://flex.sourceforge.net/manual/Bison-Bridge.html Flex Manual: C Scanners with Bison Parsers] {{webarchive|url=https://web.archive.org/web/20101217042916/http://flex.sourceforge.net/manual/Bison-Bridge.html |date=2010-12-17 }}</ref> The data type used for communication, ''YYSTYPE'', is set using Bison ''%union'' declaration. Since in this sample we use the reentrant version of both flex and yacc we are forced to provide parameters for the ''yylex'' function, when called from ''yyparse''.<ref name="flex-bison-bridge" /> This is done through Bison ''%lex-param'' and ''%parse-param'' declarations.<ref name="pure-calling-conventions">[https://www.gnu.org/software/bison/manual/html_node/Pure-Calling.html Bison Manual: Calling Conventions for Pure Parsers]</ref> <syntaxhighlight lang="c"> %{ /* * Parser.y file * To generate the parser run: "bison Parser.y" */ #include "Expression.h" #include "Parser.h" #include "Lexer.h" // reference the implementation provided in Lexer.l int yyerror(SExpression **expression, yyscan_t scanner, const char *msg); %} %code requires { typedef void* yyscan_t; } %output "Parser.c" %defines "Parser.h" %define api.pure %lex-param { yyscan_t scanner } %parse-param { SExpression **expression } %parse-param { yyscan_t scanner } %union { int value; SExpression *expression; } %token TOKEN_LPAREN "(" %token TOKEN_RPAREN ")" %token TOKEN_PLUS "+" %token TOKEN_STAR "*" %token <value> TOKEN_NUMBER "number" %type <expression> expr /* Precedence (increasing) and associativity: a+b+c is (a+b)+c: left associativity a+b*c is a+(b*c): the precedence of "*" is higher than that of "+". */ %left "+" %left "*" %% input : expr { *expression = $1; } ; expr : expr[L] "+" expr[R] { $$ = createOperation( eADD, $L, $R ); } | expr[L] "*" expr[R] { $$ = createOperation( eMULTIPLY, $L, $R ); } | "(" expr[E] ")" { $$ = $E; } | "number" { $$ = createNumber($1); } ; %% </syntaxhighlight> The code needed to obtain the syntax tree using the parser generated by Bison and the scanner generated by flex is the following. <syntaxhighlight lang="c"> /* * main.c file */ #include "Expression.h" #include "Parser.h" #include "Lexer.h" #include <stdio.h> int yyparse(SExpression **expression, yyscan_t scanner); SExpression *getAST(const char *expr) { SExpression *expression; yyscan_t scanner; YY_BUFFER_STATE state; if (yylex_init(&scanner)) { /* could not initialize */ return NULL; } state = yy_scan_string(expr, scanner); if (yyparse(&expression, scanner)) { /* error parsing */ return NULL; } yy_delete_buffer(state, scanner); yylex_destroy(scanner); return expression; } int evaluate(SExpression *e) { switch (e->type) { case eVALUE: return e->value; case eMULTIPLY: return evaluate(e->left) * evaluate(e->right); case eADD: return evaluate(e->left) + evaluate(e->right); default: /* should not be here */ return 0; } } int main(void) { char test[] = " 4 + 2*10 + 3*( 5 + 1 )"; SExpression *e = getAST(test); int result = evaluate(e); printf("Result of '%s' is %d\n", test, result); deleteExpression(e); return 0; } </syntaxhighlight> A simple makefile to build the project is the following. <syntaxhighlight lang="make"> # Makefile FILES = Lexer.c Parser.c Expression.c main.c CC = g++ CFLAGS = -g -ansi test: $(FILES) $(CC) $(CFLAGS) $(FILES) -o test Lexer.c: Lexer.l flex Lexer.l Parser.c: Parser.y Lexer.c bison Parser.y clean: rm -f *.o *~ Lexer.c Lexer.h Parser.c Parser.h test </syntaxhighlight> ==See also== {{Portal|Free and open-source software}} * [[Berkeley Yacc]] (byacc){{snd}} another free software Yacc replacement sharing the same author as GNU Bison * [[ANTLR]] ANother Tool for Language Recognition, another open-source parser generator ==References== {{Reflist}} ==Further reading== * {{Cite book | last = Levine | first = John | author-link = John R. Levine | title = flex & bison | publisher = O'Reilly Media | date = August 2009 | url = http://oreilly.com/catalog/9780596155988 | isbn = 978-0-596-15597-1}} ==External links== * [https://www.gnu.org/software/bison/ Website in the GNU Project] ** [https://www.gnu.org/software/bison/manual/ Manual] * [https://savannah.gnu.org/projects/bison/ Bison] project at [[GNU Savannah]] * [http://directory.fsf.org/bison.html Entry in the Free Software Directory] * [https://www.cs.uic.edu/~spopuri/cparser.html Internals of C parsers generated by GNU Bison] * [http://www.geeksww.com/tutorials/miscellaneous/bison_gnu_parser_generator/installation/installing_bison_gnu_parser_generator_ubuntu_linux.php How to download and install Bison (GNU Parser Generator) on Linux] * [http://gnuwin32.sourceforge.net/packages/bison.htm Win32 binaries by GnuWin32] (version 2.4.1) {{GNU}} {{DEFAULTSORT:Gnu Bison}} [[Category:Compiling tools]] [[Category:Cross-platform software]] [[Category:GNU Project software|Bison]] [[Category:Parser generators|Bison]]
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)
Pages transcluded onto the current version of this page
(
help
)
:
Template:Cite book
(
edit
)
Template:Cite thesis
(
edit
)
Template:Cite web
(
edit
)
Template:GNU
(
edit
)
Template:How-to
(
edit
)
Template:Infobox
(
edit
)
Template:Infobox software
(
edit
)
Template:Main other
(
edit
)
Template:Portal
(
edit
)
Template:Pre
(
edit
)
Template:Reflist
(
edit
)
Template:Short description
(
edit
)
Template:Snd
(
edit
)
Template:Template other
(
edit
)
Template:Webarchive
(
edit
)