==History==
[[File:Kleene.jpg|thumb|upright|[[Stephen Cole Kleene]], who introduced the concept]]
Regular expressions originated in 1951, when mathematician [[Stephen Cole Kleene]] described [[regular language]]s using his mathematical notation called ''regular events''.{{sfn|Kleene|1951}}<ref name="Leung, New Mexico State University, 2010" >{{cite web |url=https://www.cs.nmsu.edu/historical-projects/Projects/kleene.9.16.10.pdf |title=Regular Languages and Finite Automata |access-date=13 August 2019 |last=Leung |first=Hing |date=16 September 2010 |website=[[New Mexico State University]] |quote=The concept of regular events was introduced by Kleene via the definition of regular expressions. |archive-url=https://web.archive.org/web/20131205193130/https://www.cs.nmsu.edu/historical-projects/Projects/kleene.9.16.10.pdf |archive-date=5 December 2013 |df=dmy-all}}</ref> These arose in [[theoretical computer science]], in the subfields of [[automata theory]] (models of computation) and the description and classification of [[formal language]]s, motivated by Kleene's attempt to describe early [[perceptron|artificial neural networks]]. (Kleene introduced the term as an alternative to [[McCulloch-Pitts neuron|McCulloch & Pitts's]] "prehensible", but admitted "We would welcome any suggestions as to a more descriptive term."<ref>Kleene 1951, [https://www.rand.org/content/dam/rand/pubs/research_memoranda/2008/RM704.pdf#page=49 p. 46]</ref>) Other early implementations of [[pattern matching]] include the [[SNOBOL]] language, which did not use regular expressions, but instead used its own pattern-matching constructs.

Regular expressions entered popular use from 1968 in two applications: pattern matching in a text editor{{sfn|Thompson|1968}} and lexical analysis in a compiler.{{sfn|Johnson|Porter|Ackley|Ross|1968}} One of the first appearances of regular expressions in program form came when [[Ken Thompson]] built Kleene's notation into the editor [[QED (text editor)|QED]] as a means to match patterns in [[text file]]s.{{sfn|Thompson|1968}}<ref name="Beautiful Code Kernighan">{{cite book |last1=Kernighan |first1=Brian |author-link1=Brian Kernighan |title=Beautiful Code |chapter=A Regular Expressions Matcher |publisher=[[O'Reilly Media]] |pages=1–2 |chapter-url=http://www.cs.princeton.edu/courses/archive/spr09/cos333/beautiful.html |access-date=2013-05-15 |isbn=978-0-596-51004-6 |date=2007-08-08 |archive-date=2020-10-07 |archive-url=https://web.archive.org/web/20201007183137/https://www.cs.princeton.edu/courses/archive/spr09/cos333/beautiful.html |url-status=live}}</ref><ref>{{cite web |url=http://cm.bell-labs.com/who/dmr/qed.html |title=An incomplete history of the QED Text Editor |last1=Ritchie |first1=Dennis M. |access-date=9 October 2013 |archive-url=https://web.archive.org/web/19990221023422/http://cm.bell-labs.com/who/dmr/qed.html |archive-date=1999-02-21}}</ref>{{sfn|Aho|Ullman|1992|loc=10.11 Bibliographic Notes for Chapter 10, p. 589}} For speed, Thompson implemented regular expression matching by [[just-in-time compilation]] (JIT) to [[IBM 7094]] code on the [[Compatible Time-Sharing System]], an important early example of JIT compilation.{{sfn|Aycock|2003|p=98}} He later added this capability to the Unix editor [[ed (text editor)|ed]], which eventually led to the popular search tool [[grep]]'s use of regular expressions ("grep" is a word derived from the command for regular expression searching in the ed editor: <code>g/''re''/p</code> meaning "Global search for Regular Expression and Print matching lines").<ref>{{cite web |url=http://catb.org/jargon/html/G/grep.html |title=Jargon File 4.4.7: grep |author=[[Eric S. Raymond|Raymond, Eric S.]] citing [[Dennis Ritchie]] |date=2003 |access-date=2009-02-17 |archive-date=2011-06-05 |archive-url=https://web.archive.org/web/20110605165512/http://www.catb.org/jargon/html/G/grep.html}}</ref> Around the same time that Thompson developed QED, a group of researchers including [[Douglas T. Ross]] implemented a tool based on regular expressions for lexical analysis in [[compiler]] design.{{sfn|Johnson|Porter|Ackley|Ross|1968}}

Many variations of these original forms of regular expressions were used in [[Unix]]{{sfn|Aho|Ullman|1992|loc=10.11 Bibliographic Notes for Chapter 10, p. 589}} programs at [[Bell Labs]] in the 1970s, including [[Lex programming tool|lex]], [[sed]], [[AWK]], and [[expr]], and in other programs such as [[Vi (text editor)|vi]] and [[Emacs]] (which has its own, incompatible syntax and behavior). Regexes were subsequently adopted by a wide range of programs, with these early forms standardized in the [[POSIX.2]] standard in 1992.

In the 1980s, more complicated regexes arose in [[Perl]], originally derived from a regex library written by [[Henry Spencer]] (1986), who later wrote an implementation for [[Tcl]] called ''Advanced Regular Expressions''.<ref>{{cite web |url=http://www.tcl.tk/doc/howto/regexp81.html |title=New Regular Expression Features in Tcl 8.1 |access-date=2013-10-11 |archive-date=2020-10-07 |archive-url=https://web.archive.org/web/20201007183137/http://www.tcl.tk/doc/howto/regexp81.html |url-status=live}}</ref> The Tcl library is a hybrid [[nondeterministic finite automaton|NFA]]/[[deterministic finite automaton|DFA]] implementation with improved performance characteristics. Software projects that have adopted Spencer's Tcl regular expression implementation include [[PostgreSQL]].<ref>{{cite web |url=http://www.postgresql.org/docs/9.3/interactive/functions-matching.html |website=PostgreSQL |title=Documentation: 9.3: Pattern Matching |access-date=2013-10-12 |archive-date=2020-10-07 |archive-url=https://web.archive.org/web/20201007183140/https://www.postgresql.org/docs/9.3/functions-matching.html |url-status=live}}</ref> Perl later expanded on Spencer's original library to add many new features.<ref>{{cite web |url=http://perldoc.perl.org/perlre.html |title=Perl Regular Expressions |website=perlre |author-link=Larry Wall |author=Wall, Larry |date=2006 |access-date=2006-10-10 |archive-date=2009-12-31 |archive-url=https://web.archive.org/web/20091231010052/http://perldoc.perl.org/perlre.html |url-status=live}}</ref> Part of the effort in the design of [[Raku (programming language)|Raku]] (formerly named Perl 6) is to improve Perl's regex integration and to increase the scope and capabilities of regexes to allow the definition of [[parsing expression grammar]]s.<ref name="Apocalypse5">{{harvtxt|Wall|2002}}</ref> The result is a [[mini-language]] called [[Raku rules]], which are used to define Raku grammar as well as provide a tool to programmers in the language. These rules maintain existing features of Perl 5.x regexes, but also allow [[Backus–Naur form|BNF]]-style definition of a [[recursive descent parser]] via sub-rules.

The use of regexes in structured information standards for document and database modeling started in the 1960s and expanded in the 1980s when industry standards like [[Standard Generalized Markup Language|ISO SGML]] (preceded by ANSI "GCA 101-1983") consolidated. The kernel of the [[XML schema#Validation|structure specification language]] standards consists of regexes. Their use is evident in the [[Document Type Definition|DTD]] element group syntax.

Prior to the use of regular expressions, many search languages allowed simple wildcards, for example "*" to match any sequence of characters, and "?" to match a single character. Relics of this can be found today in the [[glob (programming)|glob]] syntax for filenames, and in the [[SQL]] <code>LIKE</code> operator.

Starting in 1997, [[Philip Hazel]] developed [[Perl Compatible Regular Expressions|PCRE]] (Perl Compatible Regular Expressions), which attempts to closely mimic Perl's regex functionality and is used by many modern tools including [[PHP]] and [[Apache HTTP Server]].<ref>{{cite web |title=PCRE - Perl Compatible Regular Expressions |url=https://www.pcre.org/ |access-date=2024-04-07 |website=www.pcre.org}}</ref>

Today, regexes are widely supported in programming languages, text processing programs (particularly [[lexer]]s), advanced text editors, and some other programs. Regex support is part of the [[standard library]] of many programming languages, including [[Java (programming language)|Java]] and [[Python (programming language)|Python]], and is built into the syntax of others, including Perl and [[ECMAScript]].

In the late 2010s, several companies started to offer hardware, [[FPGA]],<ref>{{cite web |url=https://grovf.com/products/gregex |title=GRegex – Faster Analytics for Unstructured Text Data |website=grovf.com |access-date=2019-10-22 |archive-date=2020-10-07 |archive-url=https://web.archive.org/web/20201007183139/https://grovf.com/products/gregex |url-status=live}}</ref> and [[GPU]]<ref>{{cite web |url=http://bkase.github.io/CUDA-grep/finalreport.html |title=CUDA grep |website=bkase.github.io |access-date=2019-10-22 |archive-date=2020-10-07 |archive-url=https://web.archive.org/web/20201007183138/http://bkase.github.io/CUDA-grep/finalreport.html |url-status=live}}</ref> implementations of [[PCRE]]-compatible regex engines that are faster than [[central processing unit|CPU]] implementations.