== Parsing algorithm ==
Parsing is performed in analogy to assembling a [[jigsaw puzzle]] (representing the parsed sentence) from puzzle pieces (representing individual words).<ref>{{cite arXiv |author1=Daniel D. K. Sleator |author2=Davy Temperley |title=Parsing English with a Link Grammar |year=1991 |eprint=cmp-lg/9508004}}</ref><ref name="intro">[http://www.abisource.com/projects/link-grammar/dict/introduction.html An Introduction to the Link Grammar Parser]</ref> A language is represented by means of a dictionary or [[lexis (linguistics)|lexis]], which consists of words and the set of allowed "jigsaw puzzle shapes" that each word can have. Each shape is indicated by a "connector", which consists of a link type and a direction indicator, '''+''' or '''-''', indicating right or left. Thus, for example, a [[transitive verb]] may have the connectors '''S- & O+''', indicating that the verb may form a subject ("'''S'''") connection to its left ("'''-'''") and an object ("'''O'''") connection to its right ("'''+'''"). Similarly, a [[common noun]] may have the connectors '''D- & S+''', indicating that it may connect to a [[determiner]] on the left ("'''D-'''") and act as a subject when connecting to a verb on the right ("'''S+'''"). The act of parsing is then to identify that the '''S+''' connector can attach to the '''S-''' connector, forming an "'''S'''" link between the two words. Parsing completes when all connectors have been connected.

A given word may have dozens or even hundreds of allowed puzzle shapes (termed "disjuncts"): for example, many verbs are optionally transitive, making the '''O+''' connector optional; such verbs might also take adverbial modifiers ('''E''' connectors), which are inherently optional. More complex verbs may have additional connectors for indirect objects, or for [[grammatical particle|particles]] or [[preposition]]s.
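The connector-matching step described above can be sketched in a few lines of code. This is a minimal illustration with a hypothetical mini-dictionary; it is not the actual link-grammar parser or its API, and the words, disjuncts, and function names are invented for the example:

```python
# Hypothetical mini-dictionary (illustrative only, not the real link-grammar
# lexis). Each word maps to a list of disjuncts; each disjunct is a list of
# connectors such as "S-" (link leftward with type S) or "O+" (rightward, O).
DICTIONARY = {
    "the":    [["D+"]],
    "dog":    [["D-", "S+"]],
    "chased": [["S-", "O+"], ["S-"]],  # optionally transitive: two disjuncts
    "cat":    [["D-", "O-"]],
}

def matches(left_connector, right_connector):
    """A '+' connector on a word links to a '-' connector on a word to its
    right, provided the link types agree."""
    return (left_connector.endswith("+")
            and right_connector.endswith("-")
            and left_connector[:-1] == right_connector[:-1])

# The S+ of "dog" can attach to the S- of "chased", forming an S link:
print(matches("S+", "S-"))   # True
print(matches("O+", "S-"))   # False: link types differ
```

A full parse would repeat this matching until every connector of the chosen disjunct for each word is satisfied.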
Thus, a part of parsing also involves picking a single, unique disjunct for each word; the final parse must [[satisfiability|satisfy]] (connect) ''all'' connectors of that disjunct.<ref>{{cite conference |author1=Dennis Grinberg |author2=John Lafferty |author3=Daniel Sleator |title=A Robust Parsing Algorithm for Link Grammar |conference=Proceedings of the Fourth International Workshop on Parsing Technologies, Prague |year=1995 |url=https://www.cs.cmu.edu/afs/cs.cmu.edu/project/link/pub/www/papers/ps/tr95-125.pdf |access-date=2023-08-28}}</ref>

=== Dependency ===
Connectors may also include head-dependent indicators '''h''' and '''d'''. In this case, a connector containing a head indicator is only allowed to connect to a connector containing the dependent indicator (or to a connector without any h-d indicators on it). When these indicators are used, the link is decorated with arrows to indicate the link direction.<ref name="intro"/>

A recent extension simplifies the specification of connectors for languages that place few or no restrictions on word order, such as [[Lithuanian language|Lithuanian]]. There are also extensions that make it easier to support languages with concatenative [[morphology (linguistics)|morphologies]].

=== Planarity ===
The parsing algorithm also requires that the final graph be a [[planar graph]], i.e. that no links cross.<ref name="intro"/> This constraint is based on empirical psycholinguistic evidence that, for most languages, in nearly all situations, dependency links really do not cross.<ref>{{cite conference |author=J. Havelka |year=2007 |title=Beyond projectivity: multilingual evaluation of constraints and measures on non-projective structures |conference=Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics |pages=608–615 |place=Prague, Czech Republic |publisher=Association for Computational Linguistics}}</ref><ref>{{cite journal |author=R. Ferrer i Cancho |title=Why do syntactic links not cross? |journal=EPL |volume=76 |number=6 |year=2006 |pages=1228–1234 |doi=10.1209/epl/i2006-10406-0 |bibcode=2006EL.....76.1228C |hdl=2117/180367 |hdl-access=free}}</ref> There are rare exceptions, e.g. in Finnish, and even in English; such cases can be parsed by link grammar only by introducing more complex and selective connector types to capture them.

=== Costs and selection ===
Connectors can carry an optional [[floating-point]] cost markup, so that some are "cheaper" to use than others, giving preference to certain parses over others.<ref name="intro"/> That is, the total cost of a parse is the sum of the individual costs of the connectors used; the cheapest parse indicates the most likely parse. This is used to rank multiple ambiguous parses. Because the costs are local to the connectors, rather than a global property of the algorithm, they are essentially [[Markov property|Markovian]] in nature.<ref>{{cite conference |author1=John Lafferty |author2=Daniel Sleator |author3=Davy Temperley |title=Grammatical Trigrams: a Probabilistic Model of Link Grammar |conference=Proceedings of the AAAI Conference on Probabilistic Approaches to Natural Language |year=1992 |url=https://www.cs.cmu.edu/afs/cs.cmu.edu/project/link/pub/www/papers/ps/gram3gram.pdf}}</ref><ref>{{cite arXiv |author=Ramon Ferrer-i-Cancho |year=2013 |title=Hubiness, length, crossings and their relationships in dependency trees |class=cs.CL |eprint=1304.4086}}</ref><ref>{{cite journal |author=D. Temperley |year=2008 |title=Dependency length minimization in natural and artificial languages |journal=Journal of Quantitative Linguistics |volume=15 |number=3 |pages=256–282 |doi=10.1080/09296170802159512}}</ref><ref>{{cite book |author=E. Gibson |year=2000 |chapter=The dependency locality theory: A distance-based theory of linguistic complexity |editor1=Marantz, A. |editor2=Miyashita, Y. |editor3=O'Neil, W. |title=Image, Language, Brain: Papers from the first Mind Articulation Project Symposium |publisher=MIT Press |place=Cambridge, MA}}</ref><ref>{{cite journal |author=Haitao Liu |url=http://www.lingviko.net/JCS.pdf |title=Dependency distance as a metric of language comprehension difficulty |year=2008 |journal=Journal of Cognitive Science |volume=9 |number=2 |pages=159–191 |doi=10.17791/jcs.2008.9.2.159}}</ref><ref>{{cite journal |author1=Richard Futrell |author2=Kyle Mahowald |author3=Edward Gibson |title=Large-scale evidence of dependency length minimization in 37 languages |year=2015 |journal=PNAS |volume=112 |number=33 |pages=10336–10341 |doi=10.1073/pnas.1502134112 |doi-access=free |pmid=26240370 |pmc=4547262 |bibcode=2015PNAS..11210336F}}</ref>

The assignment of a log-likelihood to linkages allows link grammar to implement the [[Selection (linguistics)|semantic selection]] of predicate-argument relationships: certain constructions, although syntactically valid, are extremely unlikely. In this way, link grammar embodies some of the ideas present in [[operator grammar]]. Because the costs are additive, they behave like the logarithm of the probability (since log-likelihoods are additive), or equivalently, somewhat like the [[Entropy (information theory)|entropy]] (since entropies are additive). This makes link grammar compatible with machine learning techniques such as [[hidden Markov model]]s and the [[Viterbi algorithm]], because the link costs correspond to the link weights in [[Markov network]]s or [[Bayesian network]]s.
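Taken together, the planarity constraint and the cost markup suggest a simple selection procedure: discard candidate linkages whose links cross, then rank the survivors by total cost. The sketch below illustrates this under simplified assumptions: each link is a hypothetical `(left, right, cost)` triple of word positions, which is not the real parser's data structure:

```python
# Illustrative sketch only: planarity filtering plus cost-based ranking.
# A link is a triple (i, j, cost) with word positions i < j.

def is_planar(links):
    """True if no two links cross, i.e. there is no pair of links
    (i, j) and (k, l) with i < k < j < l."""
    for (i, j, _) in links:
        for (k, l, _) in links:
            if i < k < j < l:
                return False
    return True

def rank(linkages):
    """Keep only planar linkages and sort them by summed connector cost,
    so the cheapest (most likely) parse comes first."""
    planar = [lk for lk in linkages if is_planar(lk)]
    return sorted(planar, key=lambda lk: sum(c for (_, _, c) in lk))

# Two hypothetical linkages over a four-word sentence:
crossing = [(0, 2, 0.0), (1, 3, 0.0)]             # links cross: rejected
cheap    = [(0, 1, 0.0), (1, 2, 0.5), (2, 3, 0.0)]
print(is_planar(crossing))              # False
print(rank([crossing, cheap]) == [cheap])  # True
```

Since costs behave like log-likelihoods, sorting by summed cost here plays the same role as picking the maximum-probability parse.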
=== Type theory ===
The link types of link grammar can be understood as types in the sense of [[type theory]].<ref name="intro"/><ref>{{cite conference |author1=Daniel Sleator |author2=Davy Temperley |title=Parsing English with a Link Grammar |conference=Third International Workshop on Parsing Technologies |year=1993 |url=https://www.cs.cmu.edu/afs/cs.cmu.edu/project/link/pub/www/papers/ps/LG-IWPT93.pdf}} (See section 6 on categorial grammar.)</ref> In effect, link grammar can be used to model the [[internal language]] of certain (non-symmetric) [[compact closed category|compact closed categories]], such as [[pregroup grammar]]s. In this sense, link grammar appears to be isomorphic or homomorphic to some [[categorial grammar]]s. For example, in a categorial grammar the noun phrase "''the bad boy''" may be written as
:<math> {\text{the} \atop \text{NP/N,}} {\text{bad} \atop \text{N/N,}} {\text{boy} \atop \text{N}} </math>
whereas the corresponding disjuncts in link grammar would be
 the: D+;
 bad: A+;
 boy: D- & A-;
The contraction rules (inference rules) of the [[Categorial_grammar#Lambek_calculus|Lambek calculus]] can be mapped to the connecting of connectors in link grammar. The '''+''' and '''-''' directional indicators correspond to the forward and backward slashes of the categorial grammar. Finally, the single-letter names '''A''' and '''D''' can be understood as labels or "easy-to-read" mnemonic names for the rather more verbose types ''NP/N'', etc.

The primary distinction is then that categorial grammars have two [[type constructor]]s, the forward and backward slashes, that can be used to create new types (such as ''NP/N'') from base types (such as ''NP'' and ''N''). Link grammar omits type constructors, opting instead to define a much larger set of base types with compact, easy-to-remember mnemonics.
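The correspondence between the two notations can be laid out side by side. The following sketch is purely illustrative: the disjuncts are taken from the "''the bad boy''" example above, and the slash-direction rule is stated as code; none of this belongs to any real implementation:

```python
# Illustrative table relating the link-grammar disjuncts for "the bad boy"
# to the categorial-grammar types from the text (example data only).
disjuncts = {
    "the": ["D+"],          # looks rightward for a noun
    "bad": ["A+"],          # adjective, modifies a noun to its right
    "boy": ["D-", "A-"],    # accepts a determiner and adjective on the left
}

categorial = {
    "the": "NP/N",   # forward slash: argument N sought to the right
    "bad": "N/N",    # forward slash: N to the right, yielding N
    "boy": "N",      # base type, no slashes
}

def direction(connector):
    """A '+' connector corresponds to a forward slash (argument to the
    right); a '-' connector to a backward slash (argument to the left)."""
    return "forward" if connector.endswith("+") else "backward"

print(direction("D+"))   # forward
print(direction("A-"))   # backward
```

The single-letter link types thus act as fixed mnemonics for slash types; the trade-off is a larger inventory of base types in place of the two slash constructors.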