Editing Free monoid

{{Short description|Concept in mathematics}}
In [[abstract algebra]], the '''free monoid''' on a [[Set (mathematics)|set]] is the [[monoid]] whose elements are all the [[finite sequence]]s (or strings) of zero or more elements from that set, with [[string concatenation]] as the monoid operation and with the unique sequence of zero elements, often called the [[empty string]] and denoted by ε or λ, as the [[identity element]]. The free monoid on a set ''A'' is usually denoted ''A''<sup>&lowast;</sup>. The '''free semigroup''' on ''A'' is the sub[[semigroup]] of ''A''<sup>&lowast;</sup> containing all elements except the empty string. It is usually denoted ''A''<sup>+</sup>.<ref name="Lot23">{{harvtxt|Lothaire|1997|pp=2–3}}, [https://books.google.com/books?id=eATLTZzwW-sC&pg=PA2]</ref><ref name=PF2>{{harvtxt|Pytheas Fogg|2002|p=2}}</ref>

More generally, an abstract monoid (or semigroup) ''S'' is described as '''free''' if it is [[isomorphic]] to the free monoid (or semigroup) on some set.<ref name=Lot5>{{harvtxt|Lothaire|1997|p=5}}</ref>

As the name implies, free monoids and semigroups are those objects which satisfy the usual [[universal property]] defining [[free object]]s, in the respective [[category (mathematics)|categories]] of monoids and semigroups. It follows that every monoid (or semigroup) arises as a homomorphic image of a free monoid (or semigroup). The study of semigroups as images of free semigroups is called combinatorial semigroup theory.

Free monoids (and monoids in general) are [[associative]], by definition; that is, they are written without any parenthesis to show grouping or order of operation. The non-associative equivalent is the [[free magma]].

== Examples ==

=== Natural numbers ===
The monoid ('''N<sub>0</sub>''',+) of [[natural numbers]] (including zero) under addition is a free monoid on a singleton free generator, in this case, the natural number 1. 
According to the formal definition, this monoid consists of all sequences like "1", "1+1", "1+1+1", "1+1+1+1", and so on, including the empty sequence. 
Mapping each such sequence to its evaluation result<ref>Since addition of natural numbers is associative, the result doesn't depend on the order of evaluation, thus ensuring the mapping to be well-defined.</ref>
and the empty sequence to zero establishes an isomorphism from the set of such sequences to '''N<sub>0</sub>'''.
This isomorphism is compatible with "+", that is, for any two sequences ''s'' and ''t'', if ''s'' is mapped (i.e. evaluated) to a number ''m'' and ''t'' to ''n'', then their concatenation ''s''+''t'' is mapped to the sum ''m''+''n''.

=== Kleene star ===
In [[formal language]] theory, usually a finite set of "symbols" A (sometimes called the [[alphabet (formal languages)|alphabet]]) is considered. A finite sequence of symbols is called a "word over ''A''", and the free monoid ''A''<sup>&lowast;</sup> is called the "[[Kleene star]] of ''A''".
Thus, the abstract study of formal languages can be thought of as the study of subsets of finitely generated free monoids. 

For example, assuming an alphabet ''A'' = {''a'', ''b'', ''c''}, its Kleene star ''A''<sup>&lowast;</sup> contains all concatenations of ''a'', ''b'', and ''c'':
:{ε, ''a'', ''ab'', ''ba'', ''caa'', ''{{not a typo|cccbabbc}}'', ...}.

If ''A'' is any set, the ''word length'' function on ''A''<sup>&lowast;</sup> is the unique [[monoid homomorphism]] from ''A''<sup>&lowast;</sup> to ('''N<sub>0</sub>''',+) that maps each element of ''A'' to 1.  A free monoid is thus a '''graded monoid'''.<ref name=Sak382>Sakarovitch (2009) p.382</ref> (A graded monoid <math>M</math> is a monoid that can be written as <math>M=M_0\oplus M_1\oplus M_2 \cdots</math>. Each <math>M_n</math> is a grade; the grading here is just the length of the string. That is, <math>M_n</math> contains those strings of length <math>n.</math> The <math>\oplus</math> symbol here can be taken to mean "set union"; it is used instead of the symbol <math>\cup</math> because, in general, set unions might not be monoids, and so a distinct symbol is used. By convention, gradations are always written with the <math>\oplus</math> symbol.)

There are deep connections between the theory of [[semigroup]]s and that of [[automata theory|automata]]. For example, every formal language has a [[syntactic monoid]] that recognizes that language. For the case of a [[regular language]], that monoid is isomorphic to the [[transition monoid]] associated to the [[semiautomaton]] of some [[deterministic finite automaton]] that recognizes that language. The regular languages over an alphabet A are the closure of the finite subsets of A*, the free monoid over A, under union, product, and generation of submonoid.<ref>
{{Cite book|url=https://books.google.com/books?id=C5QbCAAAQBAJ|title=Groups, Languages, Algorithms: AMS-ASL Joint Special Session on Interactions Between Logic, Group Theory, and Computer Science, January 16-19, 2003, Baltimore, Maryland|last=Borovik|first=Alexandre|date=2005-01-01|publisher=American Mathematical Soc.|isbn=9780821836187|language=en}}
</ref> 

For the case of [[concurrent computation]], that is, systems with [[lock (computer science)|locks]], [[mutex]]es or [[thread join]]s, the computation can be described with [[history monoid]]s and [[trace monoid]]s. Roughly speaking, elements of the monoid can commute, (e.g. different threads can execute in any order), but only up to a lock or mutex, which prevent further commutation (e.g. serialize thread access to some object).

==Conjugate words==
[[File:Example of strings equidivisibility.gif|thumb|Example for 1st case of equidivisibility: m="UNCLE", n="ANLY", p="UN", q="CLEANLY", and s="CLE"]]
We define a pair of words in ''A''<sup>&lowast;</sup> of the form ''uv'' and ''vu'' as '''conjugate''': the conjugates of a word are thus its [[circular shift]]s.<ref name=Sak27>Sakarovitch (2009) p.27</ref>  Two words are conjugate in this sense if they are [[Conjugation (group theory)|conjugate in the sense of group theory]] as elements of the [[free group]] generated by ''A''.<ref name=PF297>{{harvtxt|Pytheas Fogg|2002|p=297}}</ref>

===Equidivisibility===
A free monoid is '''equidivisible''': if the equation ''mn'' = ''pq'' holds, then there exists an ''s'' such that either ''m'' = ''ps'', ''sn'' = ''q'' (example see image) or ''ms'' = ''p'', ''n'' = ''sq''.<ref name=Sak26>Sakarovitch (2009) p.26</ref> This result is also known as [[Levi's lemma]].<ref name="LucaVarricchio2011">{{cite book|author1=Aldo de Luca|author2=Stefano Varricchio|title=Finiteness and Regularity in Semigroups and Formal Languages|year=1999|publisher=Springer Berlin Heidelberg|isbn=978-3-642-64150-3|page=2}}</ref>

A monoid is free if and only if it is [[Graded monoid|graded]] (in the strong sense that only the identity has gradation 0) and equidivisible.<ref name=Sak26/>

== Free generators and rank ==
The members of a set ''A'' are called the '''free generators''' for ''A''<sup>&lowast;</sup> and ''A''<sup>+</sup>. The superscript * is then commonly understood to be the [[Kleene star]]. More generally, if ''S'' is an abstract free monoid (semigroup), then a set of elements which maps onto the set of single-letter words under an isomorphism to a monoid ''A''<sup>&lowast;</sup> (semigroup ''A<sup>+</sup>'') is called a ''set of free generators'' for ''S''.

Each free monoid (or semigroup) ''S'' has exactly one set of free generators, the [[cardinality]] of which is called the ''rank'' of ''S''.

Two free monoids or semigroups are isomorphic if and only if they have the same rank. In fact, ''every'' [[Monoid#Generators|set of generators]] for a free monoid or semigroup ''S'' contains the free generators, since a free generator has word length 1 and hence can only be generated by itself. It follows that a free semigroup or monoid is finitely generated if and only if it has finite rank.

A [[submonoid]] ''N'' of ''A''<sup>&lowast;</sup> is '''stable''' if ''u'', ''v'', ''ux'', ''xv'' in ''N'' together imply ''x'' in ''N''.<ref name=BPR61>{{harvtxt|Berstel|Perrin|Reutenauer|2010|p=61}}</ref>  A submonoid of ''A''<sup>&lowast;</sup> is stable if and only if it is free.<ref name=BPR62>{{harvtxt|Berstel|Perrin|Reutenauer|2010|p=62}}</ref>
For example, using the set of [[bit]]s { "0", "1" } as ''A'', the set ''N'' of all bit strings containing an even number of "1"s is a stable submonoid because if ''u'' contains an even number of "1"s, and ''ux'' as well, then ''x'' must contain an even number of "1"s, too. While ''N'' cannot be freely generated by any set of single bits, it ''can'' be freely generated by the set of bit strings { "0", "11", "101", "1001", "10001", ... } – the set of strings of the form "10<sup>''n''</sup>1" for some nonnegative integer ''n'' (along with the string "0").

===Codes===
A set of free generators for a free monoid ''P'' is referred to as a '''basis''' for '''P''': a set of words ''C'' is a '''code''' if ''C''* is a free monoid and ''C'' is a basis.<ref name=Lot5/>  A set ''X'' of words in ''A''<sup>&lowast;</sup> is a '''prefix''', or has the '''prefix property''', if it does not contain a proper [[prefix (computer science)|(string) prefix]] of any of its elements<!--- deleted, since "≤" as shorthand for "string prefix of" has not been defined yet, and is not used elsewhere in the article: that is, ''x'',''y'' in ''X'' with ''x'' ≤ ''y'' implies ''x'' = ''y''--->.  Every prefix in ''A''<sup>+</sup> is a code, indeed a [[prefix code]].<ref name=Lot5/><ref name=BPR58>{{harvtxt|Berstel|Perrin|Reutenauer|2010|p=58}}</ref>

A submonoid ''N'' of ''A''<sup>&lowast;</sup> is '''right unitary''' if ''x'', ''xy'' in ''N'' implies ''y'' in ''N''.  A submonoid is generated by a prefix if and only if it is right unitary.<ref name=Lot15>{{harvtxt|Lothaire|1997|p=15}}</ref>

==Factorization==
{{main |Monoid factorisation}}
A factorization of a free monoid is a sequence of subsets of words with the property that every word in the free monoid can be written as a concatenation of elements drawn from the subsets. The [[Chen–Fox–Lyndon theorem]] states that the [[Lyndon word]]s furnish a factorization. More generally, [[Hall word]]s provide a factorization; the Lyndon words are a special case of the Hall words.

==Free hull==
The intersection of free submonoids of a free monoid ''A''<sup>&lowast;</sup> is again free.<ref name=Lot6>{{harvtxt|Lothaire|1997|p=6}}</ref><ref name=LotII204>{{harvtxt|Lothaire|2011|p=204}}</ref>  If ''S'' is a subset of a free monoid ''A''* then the intersection of all free submonoids of ''A''* containing ''S'' is well-defined, since ''A''* itself is free, and contains ''S''; it is a free monoid and called the '''free hull''' of ''S''. A basis for this intersection is a code.

The '''defect theorem'''<ref name=Lot6/><ref name=LotII204/><ref name=BPR66>{{harvtxt|Berstel|Perrin|Reutenauer|2010|p=66}}</ref> states that if ''X'' is finite and ''C'' is the basis of the free hull of ''X'', then either ''X'' is a code and ''C'' = ''X'', or

:|''C''| ≤ |''X''| − 1 .

==Morphisms==
A [[monoid morphism]] ''f'' from a free monoid ''B''<sup>&lowast;</sup> to a monoid ''M'' is a map such that ''f''(''xy'') = ''f''(''x'')⋅''f''(''y'') for words ''x'',''y'' and ''f''(ε) = ι, where ε and ι denote the identity elements of ''B''<sup>&lowast;</sup> and ''M'', respectively.  The morphism ''f'' is determined by its values on the letters of ''B'' and conversely any map from ''B'' to ''M'' extends to a morphism.  A morphism is '''non-erasing'''<ref name=Lot7>{{harvtxt|Lothaire|1997|p=7}}</ref> or '''continuous'''<ref name=Sak25>{{harvtxt|Sakarovitch|2009|p=25}}</ref> if no letter of ''B'' maps to ι and '''trivial''' if every letter of ''B'' maps to ι.<ref name=Lot164>{{harvtxt|Lothaire|1997|p=164}}</ref>

A morphism ''f'' from a free monoid ''B''<sup>&lowast;</sup> to a free monoid ''A''<sup>&lowast;</sup> is '''total''' if every letter of ''A'' occurs in some word in the image of ''f''; '''cyclic'''<ref name=Lot164/> or '''periodic'''<ref name=Sal77>{{harvtxt|Salomaa|1981|p=77}}</ref> if the image of ''f'' is contained in {''w''}<sup>&lowast;</sup> for some word ''w'' of ''A''<sup>&lowast;</sup>.  A morphism ''f'' is '''''k''-uniform''' if the length |''f''(''a'')| is constant and equal to ''k'' for all ''a'' in ''A''.<ref name=ApCoW522>{{harvtxt|Lothaire|2005|p=522}}</ref><ref name=BR103>{{cite book | last1=Berstel | first1=Jean | last2=Reutenauer | first2=Christophe | title=Noncommutative rational series with applications | series=Encyclopedia of Mathematics and Its Applications | volume=137 | location=Cambridge | publisher=[[Cambridge University Press]] | year=2011 | isbn=978-0-521-19022-0 | zbl=1250.68007 | page=103 }}</ref>  A 1-uniform morphism is '''strictly alphabetic'''<ref name=Sak25/> or a '''coding'''.<ref name=AS9>{{harvtxt|Allouche|Shallit|2003|p=9}}</ref>

A morphism ''f'' from a free monoid ''B''<sup>&lowast;</sup> to a free monoid ''A''<sup>&lowast;</sup> is '''simplifiable''' if there is an alphabet ''C'' of cardinality less than that of ''B'' such the morphism ''f'' factors through ''C''<sup>&lowast;</sup>, that is, it is the composition of a morphism from ''B''<sup>&lowast;</sup> to ''C''<sup>&lowast;</sup> and a morphism from that to ''A''<sup>&lowast;</sup>; otherwise ''f'' is '''elementary'''.  The morphism ''f'' is called a '''code''' if the image of the alphabet ''B'' under ''f'' is a code. Every elementary morphism is a code.<ref name=Sal72>{{harvtxt|Salomaa|1981|p=72}}</ref>

===Test sets===
For  ''L'' a subset of ''B''<sup>&lowast;</sup>, a finite subset ''T'' of ''L'' is a ''test set'' for ''L'' if morphisms ''f'' and ''g'' on ''B''<sup>&lowast;</sup> agree on ''L'' if and only if they agree on ''T''.  The '''Ehrenfeucht conjecture''' is that any subset ''L'' has a test set:<ref name=Lot1789>{{harvtxt|Lothaire|1997|pp=178–179}}</ref> it has been proved<ref name=LotII451>{{harvtxt|Lothaire|2011|p=451}}</ref> independently by Albert and Lawrence; McNaughton; and Guba.  The proofs rely on [[Hilbert's basis theorem]].<ref>{{cite journal | first=A. | last=Salomaa | authorlink=Arto Salomaa | title=The Ehrenfeucht conjecture: A proof for language theorists | journal=Bulletin of the EATCS  | number=27 | date=October 1985 | pages=71–82 }}</ref>

===Map and fold===
{{unreferenced section|date=February 2015}}
The computational embodiment of a monoid morphism is a [[Map (higher-order function)|map]] followed by a [[Fold (higher-order function)|fold]]. In this setting, the free monoid on a set ''A'' corresponds to [[List (computing)|lists]] of elements from ''A'' with concatenation as the binary operation.  A monoid homomorphism from the free monoid to any other monoid (''M'',•) is a function ''f'' such that
* ''f''(''x''<sub>1</sub>...''x''<sub>''n''</sub>) = ''f''(''x''<sub>1</sub>) • ... • ''f''(''x''<sub>''n''</sub>)
* ''f''() = ''e''
where ''e'' is the identity on ''M''.  Computationally, every such homomorphism corresponds to a [[Map (higher-order function)|map]] operation applying ''f'' to all the elements of a list, followed by a [[Fold (higher-order function)|fold]] operation which combines the results using the binary operator •. This [[squiggol|computational paradigm]] (which can be generalized to non-associative binary operators) has inspired the [[MapReduce]] software framework.{{citation needed|date=February 2015}}

==Endomorphisms==
An '''[[endomorphism]]''' of ''A''<sup>&lowast;</sup> is a morphism from ''A''<sup>&lowast;</sup> to itself.<ref name=LotII450>{{harvtxt|Lothaire|2011|p=450}}</ref>  The [[identity map]] ''I'' is an endomorphism of ''A''<sup>&lowast;</sup>, and the endomorphisms form a [[monoid]] under [[composition of functions]].

An endomorphism ''f'' is '''prolongable''' if there is a letter ''a'' such that ''f''(''a'') = ''as'' for a non-empty string ''s''.<ref name=AS10>Allouche & Shallit (2003) p.10</ref>

===String projection===
The operation of [[String operations#String projection|string projection]] is an endomorphism. That is, given a letter ''a'' &isin; &Sigma; and a string ''s'' &isin; &Sigma;<sup>&lowast;</sup>, the string projection ''p''<sub>a</sub>(''s'') removes every occurrence of ''a'' from ''s''; it is formally defined by

:<math>p_a(s) = \begin{cases} 
\varepsilon & \text{if } s=\varepsilon, \text{ the empty string} \\
p_a(t) & \text{if } s=ta  \\ 
p_a(t)b & \text{if } s=tb \text{ and } b\ne a.   
\end{cases}</math>

Note that string projection is well-defined even if the rank of the monoid is infinite, as the above recursive definition works for all strings of finite length. String projection is a [[morphism]] in the category of free monoids, so that

:<math>p_a\left(\Sigma^*\right)= \left(\Sigma-a\right)^*</math>

where <math>p_a\left(\Sigma^*\right)</math> is understood to be the free monoid of all finite strings that don't contain the letter ''a''.  Projection commutes with the operation of string concatenation, so that <math>p_a(st)=p_a(s)p_a(t)</math> for all strings ''s'' and ''t''. There are many right inverses to string projection, and thus it is a [[split epimorphism]].

The identity morphism is <math>p_\varepsilon,</math> defined as <math>p_\varepsilon(s)=s</math> for all strings ''s'', and <math>p_\varepsilon(\varepsilon)=\varepsilon</math>. 

String projection is commutative, as clearly

:<math>p_a(p_b(s))=p_b(p_a(s)).</math>

For free monoids of finite rank, this follows from the fact that free monoids of the same rank are isomorphic, as projection reduces the rank of the monoid by one.

String projection is [[idempotent]], as

:<math>p_a(p_a(s))=p_a(s)</math>

for all strings ''s''. Thus, projection is an idempotent, commutative operation, and so it forms a bounded [[semilattice]] or a commutative [[band (algebra)|band]].

==The free commutative monoid==
Given a set ''A'', the '''free [[commutative monoid]]''' on ''A'' is the set of all finite [[multiset]]s with elements drawn from ''A'', with the monoid operation being multiset sum and the monoid unit being the empty multiset.

For example, if ''A'' = {''a'', ''b'', ''c''}, elements of the free commutative monoid on ''A'' are of the form
:{ε, ''a'', ''ab'', ''a''<sup>2</sup>''b'', ''ab''<sup>3</sup>''c''<sup>4</sup>, ...}.

The [[fundamental theorem of arithmetic]] states that the monoid of positive integers under multiplication is a free commutative monoid on an infinite set of generators, the [[prime number]]s.

The '''free commutative semigroup''' is the subset of the free commutative monoid that contains all multisets with elements drawn from ''A'' except the empty multiset.

The [[free partially commutative monoid]], or ''[[trace monoid]]'', is a generalization that encompasses both the free and free commutative monoids as instances. This generalization finds applications in [[combinatorics]] and in the study of [[Parallel computing|parallelism]] in [[computer science]].

==See also==
* [[String operations]]

==Notes==
{{reflist|30em}}

==References==
* {{citation | last1 = Allouche | first1 = Jean-Paul | last2 = Shallit | first2 = Jeffrey | author2-link = Jeffrey Shallit | isbn = 978-0-521-82332-6 | publisher = [[Cambridge University Press]] | title = Automatic Sequences: Theory, Applications, Generalizations | year = 2003 | zbl=1086.11015 }}
* {{citation | last1=Berstel | first1=Jean | author-link1=Jean Berstel | last2=Perrin | first2=Dominique | author-link2=Dominique Perrin | last3=Reutenauer | first3=Christophe | title=Codes and automata | series=Encyclopedia of Mathematics and its Applications | volume=129 | location=Cambridge | publisher=[[Cambridge University Press]] | year=2010 | isbn=978-0-521-88831-8 | zbl=1187.94001 }}
*{{citation | last=Lothaire | first=M. | authorlink=M. Lothaire | others=Contributors: Perrin, D.; Reutenauer, C.; Berstel, J.; Pin, J. E.; Pirillo, G.; Foata, D.; Sakarovitch, J.; Simon, I.; Schützenberger, M. P.; Choffrut, C.; Cori, R. Series editors: Lyndon, Roger; Rota, Gian-Carlo. Foreword by Roger Lyndon | editor= <!-- none, bad meta-data--> | title=Combinatorics on words | edition=2nd | series=Cambridge Mathematical Library | volume=17 | publisher=[[Cambridge University Press]] | year=1997 | doi = 10.1017/CBO9780511566097 | isbn=0-521-59924-5 | mr = 1475463  | zbl=0874.20040 }}
* {{citation | last=Lothaire | first=M. | authorlink=M. Lothaire | title=Algebraic combinatorics on words | others=With preface by Jean Berstel and Dominique Perrin | edition=Reprint of the 2002 hardback | series=Encyclopedia of Mathematics and Its Applications | volume=90| publisher=[[Cambridge University Press]] | year=2011 | isbn=978-0-521-18071-9 | zbl=1221.68183 }}
*{{citation | last=Lothaire | first=M. | authorlink=M. Lothaire | title=Applied combinatorics on words | others=A collective work by Jean Berstel, Dominique Perrin, Maxime Crochemore, Eric Laporte, Mehryar Mohri, Nadia Pisanti, Marie-France Sagot, [[Gesine Reinert]], [[Sophie Schbath]], Michael Waterman, Philippe Jacquet, [[Wojciech Szpankowski]], Dominique Poulalhon, Gilles Schaeffer, Roman Kolpakov, Gregory Koucherov, Jean-Paul Allouche and [[Valérie Berthé]] | series=Encyclopedia of Mathematics and Its Applications | volume=105 | location=Cambridge | publisher=[[Cambridge University Press]] | year=2005 | isbn=0-521-84802-4 | zbl=1133.68067 | url-access=registration | url=https://archive.org/details/appliedcombinato0000loth }}
* {{citation | last=Pytheas Fogg | first=N. | editor1=Berthé, Valérie|editor1-link=Valérie Berthé|editor2=Ferenczi, Sébastien|editor3=Mauduit, Christian|editor4=Siegel, A. | title=Substitutions in dynamics, arithmetics and combinatorics | series=Lecture Notes in Mathematics | volume=1794 | location=Berlin | publisher=[[Springer-Verlag]] | year=2002 | isbn=3-540-44141-7 | zbl=1014.11015 }}
* {{citation | last=Sakarovitch | first=Jacques | title=Elements of automata theory | others=Translated from the French by Reuben Thomas | location=Cambridge | publisher=[[Cambridge University Press]] | year=2009 | isbn=978-0-521-84425-3 | zbl=1188.68177 }}
* {{citation | first=Arto | last=Salomaa | authorlink=Arto Salomaa | title=Jewels of Formal Language Theory | publisher=Pitman Publishing | isbn=0-273-08522-0 | year=1981 | zbl=0487.68064 }}

==External links==
*{{Commons category-inline}}

[[Category:Semigroup theory]]
[[Category:Formal languages]]
[[Category:Free algebraic structures]]
[[Category:Combinatorics on words]]