==Formal definition==

A ''unification problem'' is a finite set {{math|1=''E''={{mset| ''l''<sub>1</sub> ≐ ''r''<sub>1</sub>, ..., ''l''<sub>''n''</sub> ≐ ''r''<sub>''n''</sub> }}}} of equations to solve, where {{math|''l''<sub>''i''</sub>, ''r''<sub>''i''</sub>}} are in the set <math>T</math> of ''terms'' or ''expressions''. Depending on which expressions or terms are allowed to occur in an equation set or unification problem, and which expressions are considered equal, several frameworks of unification are distinguished. If higher-order variables, that is, variables representing [[function (mathematics)|function]]s, are allowed in an expression, the process is called '''higher-order unification''', otherwise ''first-order unification''. If a solution is required to make both sides of each equation literally equal, the process is called '''syntactic''' or ''free'' '''unification''', otherwise ''semantic'' or ''equational unification'', or '''E-unification''', or ''unification modulo theory''.

If the right side of each equation is closed (no free variables), the problem is called (pattern) ''matching''. The left side (with variables) of each equation is called the ''pattern''.<ref>{{cite book |last1=Dowek |first1=Gilles |title=Handbook of automated reasoning |date=1 January 2001 |publisher=Elsevier Science Publishers B. V. |isbn=978-0-444-50812-6 |pages=1009–1062 |url=http://www.lsv.fr/~dowek/Publi/unification.ps |chapter=Higher-order unification and matching |access-date=15 May 2019 |archive-date=15 May 2019 |archive-url=https://web.archive.org/web/20190515113555/http://www.lsv.fr/~dowek/Publi/unification.ps |url-status=dead }}</ref>
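
For the syntactic first-order case, matching can be sketched in a few lines of Python. The tuple encoding of terms and the function name <code>match</code> are assumptions made for this illustration; they do not come from the cited source.

```python
# Assumed encoding (not from the article): a variable is a str, an
# application f(t1, ..., tn) is the tuple ("f", t1, ..., tn), and a
# constant a is the 1-tuple ("a",).

def match(pattern, target, subst=None):
    """Return s with pattern·s == target, or None if no such s exists.

    target is assumed to be closed (variable-free).
    """
    subst = {} if subst is None else subst
    if isinstance(pattern, str):              # pattern variable
        if pattern in subst:                  # bound earlier: must agree
            return subst if subst[pattern] == target else None
        return {**subst, pattern: target}
    if pattern[0] != target[0] or len(pattern) != len(target):
        return None                           # symbol or arity clash
    for p_arg, t_arg in zip(pattern[1:], target[1:]):
        subst = match(p_arg, t_arg, subst)
        if subst is None:
            return None
    return subst

a, b = ("a",), ("b",)
solved = match(("f", "x", ("g", "y")), ("f", a, ("g", b)))  # x ↦ a, y ↦ b
failed = match(("f", "x", "x"), ("f", a, b))                # x cannot be both a and b
```

Because the right side contains no variables, bindings only ever flow from the pattern to the target, which is what makes matching simpler than general unification.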

===Prerequisites===
Formally, a unification approach presupposes
* An infinite set <math>V</math> of ''variables''. For higher-order unification, it is convenient to choose <math>V</math> disjoint from the set of [[lambda-term bound variables]].
* A set <math>T</math> of ''terms'' such that <math>V \subseteq T</math>. For first-order unification, <math>T</math> is usually the set of [[first-order terms]] (terms built from variable and function symbols). For higher-order unification, <math>T</math> consists of first-order terms and [[lambda terms]] (terms containing some higher-order variables).
* A mapping <math>\text{vars}\colon T \rightarrow</math> [[power set|<math>\mathbb{P}</math>]]<math>(V)</math>, assigning to each term <math>t</math> the set <math>\text{vars}(t) \subsetneq V</math> of ''free variables'' occurring in <math>t</math>.
* A theory or [[equivalence relation]] <math>\equiv</math> on <math>T</math>, indicating which terms are considered equal. For first-order E-unification, <math>\equiv</math> reflects the background knowledge about certain function symbols; for example, if <math>\oplus</math> is considered commutative, <math>t\equiv u</math> if <math>u</math> results from <math>t</math> by swapping the arguments of <math>\oplus</math> at some (possibly all) occurrences.<ref group=note>E.g. ''a'' ⊕ (''b'' ⊕ ''f''(''x'')) ≡ ''a'' ⊕ (''f''(''x'') ⊕ ''b'') ≡ (''b'' ⊕ ''f''(''x'')) ⊕ ''a'' ≡ (''f''(''x'') ⊕ ''b'') ⊕ ''a''</ref> In the most typical case, where there is no background knowledge at all, only literally, or syntactically, identical terms are considered equal. In this case, ≡ is called the ''[[free theory]]'' (because it is a [[free object]]), the ''[[empty theory]]'' (because the set of equational [[sentence (mathematical logic)|sentences]], or the background knowledge, is empty), the ''theory of [[uninterpreted function]]s'' (because unification is done on uninterpreted [[term (logic)|terms]]), or the ''theory of [[Algebraic specification|constructors]]'' (because all function symbols just build up data terms, rather than operating on them). For higher-order unification, usually <math>t\equiv u</math> if <math>t</math> and <math>u</math> are [[alpha equivalent]].
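
For first-order terms, the mapping <math>\text{vars}</math> from the list above can be sketched as a short recursive function. The tuple encoding and the name <code>term_vars</code> are assumptions of this illustration only.

```python
# Assumed encoding (not from the article): a variable is a str, an
# application f(t1, ..., tn) is the tuple ("f", t1, ..., tn), and a
# constant a is the 1-tuple ("a",).

def term_vars(t):
    """Return the finite set of variables occurring in the first-order term t."""
    if isinstance(t, str):          # a variable contributes itself
        return {t}
    found = set()
    for arg in t[1:]:               # t[0] is the function symbol
        found |= term_vars(arg)
    return found

# f(x, a, g(z), y) contains exactly the variables x, y and z
example_vars = term_vars(("f", "x", ("a",), ("g", "z"), "y"))
```

Since every term is finite, the returned set is always finite and therefore a proper subset of the infinite set <math>V</math>.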

As an example of how the set of terms and the theory affect the set of solutions, the syntactic first-order unification problem { ''y'' = ''cons''(2,''y'') } has no solution over the set of [[finite terms]]. However, it has the single solution { ''y'' ↦ ''cons''(2,''cons''(2,''cons''(2,...))) } over the set of [[Tree (set theory)|infinite tree]] terms. Similarly, the semantic first-order unification problem { ''a''⋅''x'' = ''x''⋅''a'' } has each substitution of the form { ''x'' ↦ ''a''⋅...⋅''a'' } as a solution in a [[semigroup]], i.e. if (⋅) is considered [[associative]]. But the same problem, viewed in an [[abelian group]], where (⋅) is also considered [[commutative]], has any substitution at all as a solution.

As an example of higher-order unification, the singleton set { ''a'' = ''y''(''x'') } is a syntactic second-order unification problem, since ''y'' is a function variable. One solution is { ''x'' ↦ ''a'', ''y'' ↦ ([[identity function]]) }; another one is { ''y'' ↦ ([[constant function]] mapping each value to ''a''), ''x'' ↦ ''(any value)'' }.

===Substitution===
{{main|Substitution (logic)}}
A ''substitution'' is a mapping <math>\sigma: V\rightarrow T</math> from variables to terms; the notation <math> \{x_1\mapsto t_1, ..., x_k \mapsto t_k\}</math> refers to a substitution mapping each variable <math>x_i</math> to the term <math>t_i</math>, for <math>i=1,...,k</math>, and every other variable to itself; the <math>x_i</math> must be pairwise distinct. ''Applying'' that substitution to a term <math>t</math> is written in [[postfix notation]] as <math>t \{x_1 \mapsto t_1, ..., x_k \mapsto t_k\}</math>; it means to (simultaneously) replace every occurrence of each variable <math>x_i</math> in the term <math>t</math> by <math>t_i</math>. The result <math>t\tau</math> of applying a substitution <math>\tau</math> to a term <math>t</math> is called an ''instance'' of that term <math>t</math>.
As a first-order example, applying the substitution {{math|{{mset| ''x'' ↦ ''h''(''a'',''y''), ''z'' ↦ ''b'' }}}} to the term 
{|
|-
|
| <math>f(</math>
| align="center" | <math>\textbf{x}</math>
| <math>, a, g(</math>
| <math>\textbf{z}</math>
| <math> ), y)</math>
|-
| yields &nbsp;
|-
|
| <math>f(</math>
| <math>\textbf{h}(\textbf{a}, \textbf{y})</math>
| <math>, a, g(</math>
| <math>\textbf{b}</math>
| <math>), y).</math>
|}
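
The same first-order example can be reproduced with a minimal Python sketch. The tuple encoding of terms and the name <code>apply</code> are assumptions made for illustration.

```python
# Assumed encoding (not from the article): a variable is a str, an
# application f(t1, ..., tn) is the tuple ("f", t1, ..., tn), and a
# constant a is the 1-tuple ("a",).

def apply(t, subst):
    """Simultaneously replace every variable of t that subst binds."""
    if isinstance(t, str):
        return subst.get(t, t)      # unbound variables map to themselves
    return (t[0],) + tuple(apply(arg, subst) for arg in t[1:])

a, b = ("a",), ("b",)
term = ("f", "x", a, ("g", "z"), "y")          # f(x, a, g(z), y)
sigma = {"x": ("h", a, "y"), "z": b}           # {x ↦ h(a, y), z ↦ b}
instance = apply(term, sigma)                  # f(h(a, y), a, g(b), y)
```

The replacement is simultaneous: the ''y'' introduced by ''h''(''a'',''y'') is not substituted again, because <code>apply</code> never revisits the terms it inserts.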

===Generalization, specialization===
If a term <math>t</math> has an instance equivalent to a term <math>u</math>, that is, if <math>t\sigma \equiv u</math> for some substitution <math>\sigma</math>, then <math>t</math> is called ''more general'' than <math>u</math>, and <math>u</math> is called ''more special'' than, or ''subsumed'' by, <math>t</math>. For example, <math>x\oplus a</math> is more general than <math>a\oplus b</math> if ⊕ is [[Commutative property|commutative]], since then <math>(x\oplus a) \{x\mapsto b\} = b\oplus a\equiv a\oplus b</math>.

If ≡ is literal (syntactic) identity of terms, a term may be both more general and more special than another one only if both terms differ just in their variable names, not in their syntactic structure; such terms are called ''variants'', or ''renamings'' of each other.
For example, 
<math>f(x_1, a, g(z_1), y_1)</math>
is a variant of 
<math>f(x_2, a, g(z_2), y_2)</math>,
since
<math display="block">f(x_1, a, g(z_1), y_1) \{x_1 \mapsto x_2, y_1 \mapsto y_2, z_1 \mapsto z_2\} = f(x_2, a, g(z_2), y_2) </math>
and
<math display="block">f(x_2, a, g(z_2), y_2) \{x_2 \mapsto x_1, y_2 \mapsto y_1, z_2 \mapsto z_1\} = f(x_1, a, g(z_1), y_1).</math>
However, <math>f(x_1, a, g(z_1), y_1)</math> is ''not'' a variant of  <math>f(x_2, a, g(x_2), x_2)</math>, since no substitution can transform the latter term into the former one.
The latter term is therefore properly more special than the former one.
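
For syntactic identity, the variant test above can be sketched in Python as two one-way instance checks. The tuple encoding and the names <code>instance_subst</code> and <code>is_variant</code> are assumptions of this illustration.

```python
# Assumed encoding (not from the article): a variable is a str, an
# application f(t1, ..., tn) is the tuple ("f", t1, ..., tn), and a
# constant a is the 1-tuple ("a",).

def instance_subst(general, special, subst=None):
    """Return s with general·s == special (syntactically), or None."""
    subst = {} if subst is None else subst
    if isinstance(general, str):
        # bind the variable, or check agreement with an earlier binding
        return subst if subst.setdefault(general, special) == special else None
    if (isinstance(special, str) or general[0] != special[0]
            or len(general) != len(special)):
        return None
    for g_arg, s_arg in zip(general[1:], special[1:]):
        if instance_subst(g_arg, s_arg, subst) is None:
            return None
    return subst

def is_variant(t, u):
    """t and u are variants if each is an instance of the other."""
    return (instance_subst(t, u) is not None
            and instance_subst(u, t) is not None)

a = ("a",)
t1 = ("f", "x1", a, ("g", "z1"), "y1")
t2 = ("f", "x2", a, ("g", "z2"), "y2")
t3 = ("f", "x2", a, ("g", "x2"), "x2")
```

With these definitions, <code>is_variant(t1, t2)</code> holds, while <code>t3</code> is an instance of <code>t1</code> but not the other way round, which is the "properly more special" situation described above.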

For arbitrary <math>\equiv</math>, a term may be both more general and more special than a structurally different term.
For example, if ⊕ is [[idempotent]], that is, if always <math>x \oplus x \equiv x</math>, then the term <math>x\oplus y</math> is more general than <math>z</math>,<ref group=note>since <math>(x\oplus y) \{x\mapsto z, y \mapsto z\} = z\oplus z \equiv z</math></ref> and vice versa,<ref group=note>since {{math|1=''z'' {{mset|''z'' ↦ ''x'' ⊕ ''y''}} = ''x'' ⊕ ''y''}}</ref> although <math>x\oplus y</math> and <math>z</math> are of different structure.

A substitution <math>\sigma</math> is ''more special'' than, or ''subsumed'' by, a substitution <math>\tau</math> if <math>t\sigma</math> is subsumed by <math>t\tau</math> for each term <math>t</math>. We also say that <math>\tau</math> is more general than <math>\sigma</math>. More formally, take an infinite set <math>V</math> of auxiliary variables such that no equation <math>l_i \doteq r_i</math> in the unification problem contains variables from <math>V</math>. Then a substitution <math>\sigma</math> is subsumed by another substitution <math>\tau</math> if there is a substitution <math>\theta</math> such that for all terms <math>X\notin V</math>, <math>X\sigma \equiv X\tau\theta</math>.<ref name=Vukmirovic/>
For instance <math> \{x \mapsto a, y \mapsto a \}</math> is subsumed by <math>\tau = \{x\mapsto y\}</math>, using <math>\theta=\{y\mapsto a\}</math>, but 
<math>\sigma = \{x\mapsto a\}</math> is not subsumed by <math>\tau = \{x\mapsto y\}</math>, as <math>f(x, y)\sigma = f(a, y)</math> is not an instance of
<math>f(x, y) \tau = f(y, y)</math>.<ref>{{cite book |last1=Apt |first1=Krzysztof R. |title=From logic programming to Prolog |date=1997 |publisher=Prentice Hall |location=London Munich |isbn=013230368X |edition=1. publ |url=https://homepages.cwi.nl/~apt/book.ps|page=24}}</ref>
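
Both halves of this example can be replayed with a small Python sketch for the syntactic case. The tuple encoding and the helper names <code>apply</code> and <code>compose</code> are assumptions, and the final check only tries a handful of candidate substitutions, so it illustrates the claim rather than proving it.

```python
# Assumed encoding (not from the article): a variable is a str, an
# application f(t1, ..., tn) is the tuple ("f", t1, ..., tn), and a
# constant a is the 1-tuple ("a",).

def apply(t, subst):
    """Simultaneously replace every variable of t that subst binds."""
    if isinstance(t, str):
        return subst.get(t, t)
    return (t[0],) + tuple(apply(arg, subst) for arg in t[1:])

def compose(tau, theta):
    """Substitution sending each variable X to (X·tau)·theta."""
    combined = {v: apply(t, theta) for v, t in tau.items()}
    for v, t in theta.items():
        combined.setdefault(v, t)     # variables that tau leaves alone
    return combined

a = ("a",)
tau, theta = {"x": "y"}, {"y": a}
composed = compose(tau, theta)        # {x ↦ a, y ↦ a}

sigma = {"x": a}
lhs = apply(("f", "x", "y"), sigma)   # f(a, y)
rhs = apply(("f", "x", "y"), tau)     # f(y, y)
# Any theta maps both arguments of f(y, y) to the same term, so f(a, y)
# is out of reach; a few sample candidates confirm this.
candidates = [{}, {"y": a}, {"y": "x"}, {"y": ("g", "y")}]
none_works = all(apply(rhs, th) != lhs for th in candidates)
```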

===Solution set===

A substitution σ is a ''solution'' of the unification problem ''E'' if {{math|''l''<sub>''i''</sub>σ ≡ ''r''<sub>''i''</sub>σ}} for <math>i = 1, ..., n</math>. Such a substitution is also called a ''unifier'' of ''E''.
For example, if ⊕ is [[associative]], the unification problem { ''x'' ⊕ ''a'' ≐ ''a'' ⊕ ''x'' } has the solutions {''x'' ↦ ''a''}, {''x'' ↦ ''a'' ⊕ ''a''}, {''x'' ↦ ''a'' ⊕ ''a'' ⊕ ''a''}, etc., while the problem { ''x'' ⊕ ''a'' ≐ ''a'' } has no solution.
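
When ⊕ is only associative, a term built from ⊕ is determined by its sequence of leaves, so candidate solutions can be checked by flattening. The following Python sketch relies on that assumption; the encoding of a ⊕-term as a tuple of atoms and the names <code>flatten</code> and <code>solves</code> are hypothetical. A substitution maps a variable to a nonempty tuple of atoms, since a semigroup has no empty element.

```python
def flatten(word, subst):
    """Apply subst to a ⊕-term given as a tuple of atoms, then flatten."""
    out = []
    for atom in word:
        out.extend(subst.get(atom, (atom,)))   # constants stay as they are
    return tuple(out)

def solves(equations, subst):
    """True if subst makes both sides of every equation equal modulo associativity."""
    return all(flatten(l, subst) == flatten(r, subst) for l, r in equations)

commuting = [(("x", "a"), ("a", "x"))]   # x ⊕ a ≐ a ⊕ x
absorbing = [(("x", "a"), ("a",))]       # x ⊕ a ≐ a

powers_of_a = [solves(commuting, {"x": ("a",) * n}) for n in (1, 2, 3)]
with_b = solves(commuting, {"x": ("b",)})
short_tries = [solves(absorbing, {"x": w}) for w in (("a",), ("b",), ("a", "a"))]
```

For the second problem, the left side always flattens to at least two atoms while the right side has one, which is why no substitution can succeed.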

For a given unification problem ''E'', a set ''S'' of unifiers is called ''complete'' if each solution substitution is subsumed by some substitution in ''S''. A complete substitution set always exists (e.g. the set of all solutions), but in some frameworks (such as unrestricted higher-order unification) the problem of determining whether any solution exists (i.e., whether the complete substitution set is nonempty) is undecidable.

The set ''S'' is called ''minimal'' if none of its members subsumes another one. Depending on the framework, a complete and minimal substitution set may have zero, one, finitely many, or infinitely many members, or may not exist at all due to an infinite chain of redundant members.<ref>{{cite journal|first1=François|last1=Fages|first2=Gérard|last2=Huet|title=Complete Sets of Unifiers and Matchers in Equational Theories|journal=Theoretical Computer Science|volume=43|pages=189–200|year=1986|doi=10.1016/0304-3975(86)90175-1|doi-access=free}}</ref> Thus, in general, unification algorithms compute a finite approximation of the complete set, which may or may not be minimal, although most algorithms avoid redundant unifiers when possible.<ref name=Vukmirovic/> For first-order syntactical unification, Martelli and Montanari<ref name="Martelli.Montanari.1982">{{cite journal|first1=Alberto|last1=Martelli|first2=Ugo|last2=Montanari|title=An Efficient Unification Algorithm|journal=ACM Trans. Program. Lang. Syst.|volume=4|number=2|pages=258–282|date=Apr 1982|doi=10.1145/357162.357169|s2cid=10921306}}</ref> gave an algorithm that reports unsolvability or computes a single unifier that by itself forms a complete and minimal substitution set, called the '''most general unifier'''.
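
As an illustration of what computing a most general unifier involves, the following Python sketch solves syntactic first-order problems with a naive worklist and an occurs check. It is not the efficient algorithm of Martelli and Montanari; the tuple encoding and the names <code>walk</code>, <code>occurs</code>, <code>unify</code> and <code>resolve</code> are assumptions of this sketch.

```python
# Assumed encoding (not from the article): a variable is a str, an
# application f(t1, ..., tn) is the tuple ("f", t1, ..., tn), and a
# constant a is the 1-tuple ("a",).

def walk(t, subst):
    """Follow variable bindings until an unbound variable or an application."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """True if variable v occurs in t once current bindings are followed."""
    t = walk(t, subst)
    if isinstance(t, str):
        return t == v
    return any(occurs(v, arg, subst) for arg in t[1:])

def unify(equations):
    """Return a unifier as a dict of bindings, or None if unsolvable."""
    subst, todo = {}, list(equations)
    while todo:
        l, r = todo.pop()
        l, r = walk(l, subst), walk(r, subst)
        if l == r:
            continue                          # trivial equation
        if isinstance(l, str):
            if occurs(l, r, subst):
                return None                   # e.g. y ≐ cons(2, y)
            subst[l] = r                      # bind the variable
        elif isinstance(r, str):
            todo.append((r, l))               # put the variable on the left
        elif l[0] == r[0] and len(l) == len(r):
            todo.extend(zip(l[1:], r[1:]))    # compare arguments pairwise
        else:
            return None                       # symbol or arity clash
    return subst

def resolve(t, subst):
    """Apply the bindings in subst to t exhaustively."""
    t = walk(t, subst)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(resolve(arg, subst) for arg in t[1:])

left = ("f", "x", ("g", "y"))                 # f(x, g(y))
right = ("f", ("g", "z"), "x")                # f(g(z), x)
mgu = unify([(left, right)])
cyclic = unify([("y", ("cons", ("2",), "y"))])
```

The returned dictionary may bind a variable to a term that itself contains bound variables, so <code>resolve</code> is used to read off the fully substituted terms. The occurs check is what rejects <code>y ≐ cons(2, y)</code> over finite terms.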