== Lexer generator ==
{{See also|Parser generator}}
Lexers are often generated by a ''lexer generator'', analogous to [[parser generator]]s, and such tools often come together. The most established is [[Lex (software)|lex]], paired with the [[yacc]] parser generator, or rather some of their many reimplementations, like [[Flex (lexical analyser generator)|flex]] (often paired with [[GNU Bison]]). These generators are a form of [[domain-specific language]], taking in a lexical specification – generally regular expressions with some markup – and emitting a lexer.

These tools speed up development considerably, which matters most early on, both to get a working lexer quickly and because a language specification may change often. Further, they often provide advanced features, such as pre- and post-conditions, which are hard to program by hand. However, an automatically generated lexer may lack flexibility, and thus may require some manual modification, or an entirely hand-written lexer.

Lexer performance is a concern, and optimizing is worthwhile, more so in stable languages where the lexer runs very often (such as C or HTML). lex/flex-generated lexers are reasonably fast, but improvements of two to three times are possible using more tuned generators. Hand-written lexers are sometimes used, but modern lexer generators produce faster lexers than most hand-coded ones. The lex/flex family of generators uses a table-driven approach which is much less efficient than the directly coded approach.{{Dubious|table-driven vs directly coded|date=May 2010}}<!-- The table-driven approach is not the problem: see 'control table' article – flex appears to be inefficient and is not using a [[trivial hash function]]. --> With the latter approach the generator produces an engine that directly jumps to follow-up states via goto statements. Tools like [[re2c]]<ref>{{Cite journal |last1= Bumbulis |first1= P. |last2= Cowan |first2= D. D. |doi= 10.1145/176454.176487 |title= RE2C: A more versatile scanner generator |journal= ACM Letters on Programming Languages and Systems |volume= 2 |issue= 1–4 |pages= 70–84 |date= Mar–Dec 1993|s2cid= 14814637 |doi-access= free }}</ref> have proven to produce engines that are between two and three times faster than flex-produced engines.{{Citation needed|date=April 2008}} It is in general difficult to hand-write analyzers that perform better than engines generated by these latter tools.
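
The difference between the table-driven and directly coded approaches can be illustrated with a minimal sketch. It assumes a hypothetical toy language of ASCII identifiers and integers separated by whitespace; the names <code>lex_table</code>, <code>lex_direct</code>, <code>TRANSITIONS</code> and <code>char_class</code> are invented for this illustration and are not taken from flex or re2c. Python has no goto statement, so in the second function ordinary branches and loops stand in for the labelled states that a directly coded generator would emit in C. The sketch shows only the structural difference and makes no claim about relative speed.

```python
# Toy scanner for identifiers and integers, written two ways.
# Everything here is illustrative; it is not the output of any real generator.

def char_class(c):
    """Map a character to the input class used by both scanners."""
    if "a" <= c <= "z" or "A" <= c <= "Z" or c == "_":
        return "letter"
    if "0" <= c <= "9":
        return "digit"
    return "other"

# Table-driven: the automaton is data, and one generic loop interprets it.
START, IN_IDENT, IN_INT = 0, 1, 2
TRANSITIONS = {
    (START, "letter"): IN_IDENT,
    (START, "digit"): IN_INT,
    (IN_IDENT, "letter"): IN_IDENT,
    (IN_IDENT, "digit"): IN_IDENT,
    (IN_INT, "digit"): IN_INT,
}
ACCEPTING = {IN_IDENT: "IDENT", IN_INT: "INT"}

def lex_table(text):
    tokens, i, n = [], 0, len(text)
    while i < n:
        if text[i].isspace():
            i += 1
            continue
        state, start = START, i
        # Follow table entries for as long as one exists (longest match).
        while i < n and (state, char_class(text[i])) in TRANSITIONS:
            state = TRANSITIONS[(state, char_class(text[i]))]
            i += 1
        if state not in ACCEPTING:
            raise ValueError(f"unexpected character {text[i]!r} at offset {i}")
        tokens.append((ACCEPTING[state], text[start:i]))
    return tokens

# Directly coded: each state is its own piece of code, with no table lookup.
def lex_direct(text):
    tokens, i, n = [], 0, len(text)
    while i < n:
        c = text[i]
        if c.isspace():
            i += 1
        elif char_class(c) == "letter":      # code for the IN_IDENT state
            start = i
            i += 1
            while i < n and char_class(text[i]) in ("letter", "digit"):
                i += 1
            tokens.append(("IDENT", text[start:i]))
        elif char_class(c) == "digit":       # code for the IN_INT state
            start = i
            i += 1
            while i < n and char_class(text[i]) == "digit":
                i += 1
            tokens.append(("INT", text[start:i]))
        else:
            raise ValueError(f"unexpected character {c!r} at offset {i}")
    return tokens
```

Both functions accept the same inputs and return the same list of <code>(kind, text)</code> pairs. In the first, changing the language means editing the table; in the second, it means editing or regenerating the code itself, which is the trade-off a generator hides from the user.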