Pattern matching
==Usage==
===Filtering data with patterns===
Pattern matching can be used to filter data of a certain structure. For instance, in Haskell a [[list comprehension]] could be used for this kind of filtering:
<syntaxhighlight lang="haskell">
-- assumes a type with constructors A and B, e.g. data T = A Int | B Int
[A x | A x <- [A 1, B 1, A 2, B 2]]
</syntaxhighlight>
evaluates to <code>[A 1, A 2]</code>.

===Pattern matching in Mathematica===
In [[Mathematica]], the only structure that exists is the [[Tree (data structure)|tree]], which is populated by symbols. In the [[Haskell]] syntax used thus far, this could be defined as
<syntaxhighlight lang="haskell">
data SymbolTree = Symbol String [SymbolTree]
</syntaxhighlight>
An example tree could then look like
<syntaxhighlight lang="haskell">
Symbol "a" [Symbol "b" [], Symbol "c" []]
</syntaxhighlight>
In the traditional, more suitable syntax, the symbols are written as they are and the levels of the tree are represented using <code>[]</code>, so that for instance <code>a[b,c]</code> is a tree with <code>a</code> as the parent, and <code>b</code> and <code>c</code> as the children.

A pattern in Mathematica involves putting <code>_</code> at positions in that tree. For instance, the pattern <code>A[_]</code> will match elements such as <code>A[1]</code>, <code>A[2]</code>, or more generally <code>A[''x'']</code> where ''x'' is any entity. In this case, <code>A</code> is the concrete element, while <code>_</code> denotes the piece of tree that can be varied. A symbol prepended to <code>_</code> (as in <code>x_</code>) binds the match to that variable name, while a symbol appended to <code>_</code> (as in <code>_Integer</code>) restricts the matches to nodes of that symbol. Note that even blanks themselves are internally represented as <code>Blank[]</code> for <code>_</code> and <code>Blank[x]</code> for <code>_x</code>.
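
The idea of blanks as ordinary tree nodes can be sketched in the Haskell notation used above. This is only an illustration, not a description of how Mathematica is implemented: the <code>Pattern</code> type and the functions <code>matches</code> and <code>keepMatching</code> are hypothetical names introduced here, and atoms such as <code>1</code> are assumed to be stored as leaf symbols.
<syntaxhighlight lang="haskell">
-- Symbol trees as defined above; Eq and Show are derived for convenience.
data SymbolTree = Symbol String [SymbolTree]
  deriving (Eq, Show)

-- Hypothetical pattern type (an assumption of this sketch):
--   Blank         plays the role of  _
--   BlankOf h     plays the role of  _h
--   PSymbol s ps  is a concrete node s whose children are patterns
data Pattern
  = Blank
  | BlankOf String
  | PSymbol String [Pattern]

-- matches p t is True when the tree t has the shape described by p.
matches :: Pattern -> SymbolTree -> Bool
matches Blank _ = True
matches (BlankOf h) (Symbol s _) = h == s
matches (PSymbol p ps) (Symbol s ts) =
  p == s && length ps == length ts && and (zipWith matches ps ts)

-- keepMatching ts p retains the trees in ts that match p.
keepMatching :: [SymbolTree] -> Pattern -> [SymbolTree]
keepMatching ts p = filter (matches p) ts
</syntaxhighlight>
With these definitions, <code>keepMatching ts (PSymbol "a" [Blank])</code> retains exactly those trees in <code>ts</code> whose root is <code>a</code> and which have a single child.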
The Mathematica function <code>Cases</code> filters elements of the first argument that match the pattern in the second argument:<ref>{{Cite web|title=Cases—Wolfram Language Documentation|url=https://reference.wolfram.com/language/ref/Cases.html.en|access-date=2020-11-17|website=reference.wolfram.com}}</ref>
<syntaxhighlight lang="mathematica">
Cases[{a[1], b[1], a[2], b[2]}, a[_]]
</syntaxhighlight>
evaluates to
<syntaxhighlight lang="mathematica">
{a[1], a[2]}
</syntaxhighlight>
Pattern matching applies to the ''structure'' of expressions. In the example below,
<syntaxhighlight lang="mathematica">
Cases[{a[b], a[b, c], a[b[c], d], a[b[c], d[e]], a[b[c], d, e]}, a[b[_], _]]
</syntaxhighlight>
returns
<syntaxhighlight lang="mathematica">
{a[b[c],d], a[b[c],d[e]]}
</syntaxhighlight>
because only these elements match the pattern <code>a[b[_],_]</code>: each has exactly two children, the first of which is a <code>b</code> node with a single child.

In Mathematica, it is also possible to extract structures as they are created in the course of computation, regardless of how or where they appear. The function <code>Trace</code> can be used to monitor a computation and return the intermediate expressions that match a pattern. For example, we can define the [[Fibonacci number|Fibonacci sequence]] as
<syntaxhighlight lang="mathematica">
fib[0|1] := 1
fib[n_] := fib[n-1] + fib[n-2]
</syntaxhighlight>
Then, we can ask the question: given <code>fib[3]</code>, what is the sequence of recursive Fibonacci calls?
<syntaxhighlight lang="mathematica">
Trace[fib[3], fib[_]]
</syntaxhighlight>
returns a structure that represents the occurrences of the pattern <code>fib[_]</code> in the computational structure:
<syntaxhighlight lang="mathematica">
{fib[3],{fib[2],{fib[1]},{fib[0]}},{fib[1]}}
</syntaxhighlight>

====Declarative programming====
In symbolic programming languages, it is easy to have patterns as arguments to functions or as elements of data structures.
A consequence of this is the ability to use patterns to declaratively make statements about pieces of data and to flexibly instruct functions how to operate.

For instance, the [[Mathematica]] function <code>Compile</code> can be used to make more efficient versions of the code. In the following example the details do not particularly matter; what matters is that the subexpression <code><nowiki>{{com[_], Integer}}</nowiki></code> instructs <code>Compile</code> that expressions of the form <code>com[_]</code> can be assumed to be [[integer]]s for the purposes of compilation:
<syntaxhighlight lang="mathematica">
com[i_] := Binomial[2i, i]
Compile[{x, {i, _Integer}}, x^com[i], {{com[_], Integer}}]
</syntaxhighlight>
Mailboxes in [[Erlang (programming language)|Erlang]] also work this way: a <code>receive</code> expression selects messages from the mailbox by matching them against patterns.

The [[Curry–Howard correspondence]] between proofs and programs relates [[ML (programming language)|ML]]-style pattern matching to [[Proof by cases|case analysis]] and [[proof by exhaustion]].

===Pattern matching and strings===
By far the most common form of pattern matching involves strings of characters. In many programming languages, a particular syntax of strings is used to represent regular expressions, which are patterns describing sequences of characters.

However, it is possible to perform some string pattern matching within the same framework that has been discussed throughout this article.

====Tree patterns for strings====
In Mathematica, strings are represented as trees with root <code>StringExpression</code> and all the characters in order as children of the root. Thus, to match "any amount of trailing characters", a new wildcard <code>___</code> is needed, in contrast to <code>_</code>, which would match only a single character.

In Haskell and [[functional programming]] languages in general, strings are represented as functional [[List (computing)|lists]] of characters. A functional list is defined as an empty list, or an element constructed on an existing list.
In Haskell syntax:
<syntaxhighlight lang="haskell">
[]   -- an empty list
x:xs -- an element x constructed on a list xs
</syntaxhighlight>
The structure for a list with some elements is thus <code>element:list</code>. When pattern matching, we assert that a certain piece of data is equal to a certain pattern. For example, in the function:
<syntaxhighlight lang="haskell">
head (element:list) = element
</syntaxhighlight>
we assert that the first element of <code>head</code>'s argument is called <code>element</code>, and the function returns this. We know that this is the first element because of the way lists are defined: a single element constructed onto a list. This single element must be the first. The empty list would not match the pattern at all, as an empty list does not have a head (the first element that is constructed).

In the example, we have no use for <code>list</code>, so we can disregard it, and thus write the function:
<syntaxhighlight lang="haskell">
head (element:_) = element
</syntaxhighlight>
The equivalent Mathematica transformation, for a list argument, is expressed as <code>head[{element_, ___}] := element</code>.

====Example string patterns====
In Mathematica, for instance,
<syntaxhighlight lang="mathematica">
StringExpression["a",_]
</syntaxhighlight>
will match a string that has two characters and begins with "a".

The same pattern in Haskell:
<syntaxhighlight lang="haskell">
['a', _]
</syntaxhighlight>
Symbolic entities can be introduced to represent many different classes of relevant features of a string. For instance, <code>StringExpression[LetterCharacter, DigitCharacter]</code> will match a string that consists of a letter first, and then a digit.
In Haskell, [[Guard (computer science)|guards]] could be used to achieve the same matches:
<syntaxhighlight lang="haskell">
-- isAlpha and isDigit are provided by Data.Char
[letter, digit] | isAlpha letter && isDigit digit
</syntaxhighlight>
The main advantage of symbolic string manipulation is that it can be completely integrated with the rest of the programming language, rather than being a separate, special-purpose subunit. The entire power of the language can be leveraged to build up the patterns themselves or to analyze and transform the programs that contain them.

====SNOBOL====
{{Main|SNOBOL}}
SNOBOL (''StriNg Oriented and symBOlic Language'') is a computer programming language developed between 1962 and 1967 at [[AT&T Corporation|AT&T]] [[Bell Laboratories]] by [[David J. Farber]], [[Ralph E. Griswold]] and Ivan P. Polonsky.

SNOBOL4 stands apart from most programming languages by having patterns as a [[first-class object|first-class data type]] (''i.e.'' a data type whose values can be manipulated in all ways permitted to any other data type in the programming language) and by providing operators for pattern [[concatenation]] and [[alternation (formal language theory)|alternation]]. Strings generated during execution can be treated as programs and executed.

SNOBOL was quite widely taught in larger US universities in the late 1960s and early 1970s and was widely used in the 1970s and 1980s as a text manipulation language in the [[humanities]].

Since SNOBOL's creation, newer languages such as [[AWK]] and [[Perl]] have made string manipulation by means of [[regular expression]]s fashionable. SNOBOL4 patterns, however, subsume [[Backus–Naur form]] (BNF) grammars, which are equivalent to [[context-free grammar]]s and more powerful than [[regular expression]]s.<ref>Gimpel, J. F. 1973. A theory of discrete patterns and their implementation in SNOBOL4. Commun. ACM 16, 2 (Feb. 1973), 91–100. DOI=http://doi.acm.org/10.1145/361952.361960.</ref>
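
What it means for patterns to be first-class values with concatenation and alternation can be sketched in Haskell. This is not SNOBOL4 code and does not reproduce SNOBOL4's matching algorithm; the type <code>Pat</code> and the functions <code>lit</code>, <code>cat</code>, <code>alt</code>, <code>nil</code>, <code>balanced</code> and <code>matchesAll</code> are names invented for this illustration. Because such patterns can refer to themselves, they can express a BNF-like rule for balanced parentheses, a language that no regular expression describes.
<syntaxhighlight lang="haskell">
import Data.List (stripPrefix)

-- Hypothetical pattern type: given an input string, a pattern returns
-- every remainder that can be left over after matching some prefix.
newtype Pat = Pat { runPat :: String -> [String] }

-- lit s matches exactly the literal string s.
lit :: String -> Pat
lit s = Pat $ \input -> case stripPrefix s input of
  Just rest -> [rest]
  Nothing   -> []

-- Concatenation: match p, then match q on whatever p left over.
cat :: Pat -> Pat -> Pat
cat p q = Pat $ \input -> concatMap (runPat q) (runPat p input)

-- Alternation: try p and q on the same input and keep both results.
alt :: Pat -> Pat -> Pat
alt p q = Pat $ \input -> runPat p input ++ runPat q input

-- The empty pattern succeeds without consuming any input.
nil :: Pat
nil = Pat $ \input -> [input]

-- BNF-like rule:  balanced ::= "(" balanced ")" balanced | empty
balanced :: Pat
balanced =
  (lit "(" `cat` balanced `cat` lit ")" `cat` balanced) `alt` nil

-- A string matches when at least one way of matching consumes all of it.
matchesAll :: Pat -> String -> Bool
matchesAll p s = any null (runPat p s)
</syntaxhighlight>
Under these definitions, <code>matchesAll balanced "(()())"</code> is <code>True</code> and <code>matchesAll balanced "(()"</code> is <code>False</code>. Patterns here are ordinary values, so they can be stored in data structures, passed to functions, and combined at run time.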