Editing Context-sensitive grammar (section)

== Properties and uses ==
{{see also|context-sensitive language}}
{{more citations needed section|date=August 2014}}

=== Equivalence to linear bounded automaton ===
A formal language can be described by a context-sensitive grammar if and only if it is accepted by some [[linear bounded automaton]] (LBA).<ref>(Hopcroft, Ullman, 1979); Theorem 9.5, 9.6, p.&nbsp;225–226</ref> In some textbooks this result is attributed solely to Landweber and [[S.-Y. Kuroda|Kuroda]].<ref name="DavisSigal1994b"/> Others call it the [[John Myhill|Myhill]]–Landweber–Kuroda theorem.<ref name="flac">{{Cite web |first=Klaus |last=Sutner |url=https://www.cs.cmu.edu/~flac/pdf/ContSens.pdf |title=Context Sensitive Grammars |publisher=[[Carnegie Mellon University]] |date=Spring 2016 |access-date=2019-08-29 |archive-date=2017-02-03 |archive-url=https://web.archive.org/web/20170203081505/http://www.cs.cmu.edu/~flac/pdf/ContSens.pdf |url-status=dead }}</ref> (Myhill introduced the concept of deterministic LBA in 1960. Peter S. Landweber published in 1963 that the language accepted by a deterministic LBA is context sensitive.<ref>{{cite journal | author=P.S. Landweber | title=Three Theorems on Phrase Structure Grammars of Type 1 | journal=[[Information and Control]] | volume=6 | number=2 | pages=131&ndash;136 | year=1963 |doi=10.1016/s0019-9958(63)90169-4 | doi-access=free }}</ref> Kuroda introduced the notion of non-deterministic LBA and the equivalence between LBA and CSGs in 1964.<ref>{{cite book|first=Alexander|last=Meduna|author-link=Alexander Meduna|title=Automata and Languages: Theory and Applications|url=https://books.google.com/books?id=s7gEErax71cC&pg=PA755|year=2000|publisher=Springer Science & Business Media|isbn=978-1-85233-074-3|page=755}}</ref><ref>{{cite book|first=Willem J. M.|last=Levelt|author-link=Willem Levelt|title=An Introduction to the Theory of Formal Languages and Automata|url=https://books.google.com/books?id=tFvtwGYNe7kC&pg=PA126|year=2008|publisher=John Benjamins Publishing|isbn=978-90-272-3250-2|pages=126–127}}</ref>)

{{As of|2010}}{{update inline|date=May 2023}} it is still an open question whether every context-sensitive language can be accepted by a ''deterministic'' LBA.<ref>{{cite book|last=Martin|first=John C.|title=Introduction to Languages and the Theory of Computation|year=2010|publisher=McGraw-Hill|location=New York, NY|isbn=9780073191461|edition=4th|page=283}}</ref>

=== Closure properties ===
Context-sensitive languages are closed under [[complement (set theory)|complement]]. This 1988 result is known as the [[Immerman–Szelepcsényi theorem]].<ref name="flac"/>
Moreover, they are closed under [[union (set theory)|union]], [[intersection (set theory)|intersection]], [[concatenation#Concatenation of sets of strings|concatenation]], [[String substitution|substitution]],<ref group=note>more formally: if ''L'' ⊆ Σ<sup>*</sup> is a context-sensitive language and ''f'' maps each ''a''∈Σ to a context-sensitive language ''f''(''a''), the ''f''(''L'') is again a context-sensitive language</ref> [[inverse string homomorphism|inverse homomorphism]], and [[Kleene plus]].<ref>(Hopcroft, Ullman, 1979); Exercise S9.10, p.&nbsp;230–231</ref>

Every [[recursively enumerable language]] ''L'' can be written as ''h''(''L'') for some context-sensitive language ''L'' and some [[string homomorphism]] ''h''.<ref>(Hopcroft, Ullman, 1979); Exercise S9.14, p.&nbsp;230–232. ''h'' maps each symbol to itself or to the empty string.</ref>

=== Computational problems ===
The [[decision problem]] that asks whether a certain string ''s'' belongs to the language of a given context-sensitive grammar ''G'', is [[PSPACE-complete]]. Moreover, there are context-sensitive grammars whose languages are PSPACE-complete. In other words, there is a context-sensitive grammar ''G'' such that deciding whether a certain string ''s'' belongs to the language of ''G'' is PSPACE-complete (so ''G'' is fixed and only ''s'' is part of the input of the problem).<ref>An example of such a grammar, designed to solve the [[QSAT]] problem, is given in {{Cite book|last=Lita|first=C. V.|title=2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC) |chapter=On Complexity of the Detection Problem for Bounded Length Polymorphic Viruses |date=2016-09-01|pages=371–378|doi=10.1109/SYNASC.2016.064|isbn=978-1-5090-5707-8|s2cid=18067130}}</ref>

The [[emptiness problem]] for context-sensitive grammars (given a context-sensitive grammar ''G'', is ''L''(''G'')=∅ ?) is [[Undecidable language|undecidable]].<ref>(Hopcroft, Ullman, 1979); Exercise S9.13, p.&nbsp;230–231</ref><ref group=note>This also follows from (1) [[#top|context-free languages being also context-sensitive]], (2) [[#Closure properties|context-sensitive language being closed under intersection]], but (3) [[Context-free grammar#Language disjointness|disjointness of context-free languages being undecidable]].</ref>

=== As model of natural languages ===
Savitch has [[mathematical proof|proven]] the following theoretical result, on which he bases his criticism of CSGs as basis for natural language: for any [[recursively enumerable]] set ''R'', there exists a context-sensitive language/grammar ''G'' which can be used as a sort of proxy to test membership in ''R'' in the following way: given a string ''s'', ''s'' is in ''R'' if and only if there exists a positive integer ''n'' for which ''sc<sup>n</sup>'' is in G, where ''c'' is an arbitrary symbol not part of ''R''.<ref name="Vide1999"/>

It has been shown that nearly all [[natural language]]s may in general be characterized by context-sensitive grammars, but the whole class of CSGs seems to be much bigger than natural languages.{{citation needed|date=November 2011}}  Worse yet, since the aforementioned decision problem for CSGs is PSPACE-complete, that makes them totally unworkable for practical use, as a polynomial-time algorithm for a PSPACE-complete problem would imply [[P=NP problem|P=NP]].

It was proven that some natural languages are not context-free, based on identifying so-called [[cross-serial dependency|cross-serial dependencies]] and [[unbounded scrambling]] phenomena.{{cn|date=December 2022}} However this does not necessarily imply that the class of CSGs is necessary to capture "context sensitivity" in the colloquial sense of these terms in natural languages. For example, [[linear context-free rewriting system]]s (LCFRSs) are strictly weaker than CSGs but can account for the phenomenon of cross-serial dependencies; one can write a LCFRS grammar for {''a<sup>n</sup>b<sup>n</sup>c<sup>n</sup>d<sup>n</sup>'' | ''n'' ≥ 1} for example.<ref>{{cite web|url=http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4nl-cfg.pdf |archive-url=https://web.archive.org/web/20140819085139/http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4nl-cfg.pdf |archive-date=2014-08-19 |url-status=live|title=Mildly Context-Sensitive Grammar Formalisms: Natural Languages are not Context-Free|first=Laura|last=Kallmeyer|year=2011}}</ref><ref>{{cite web|url=http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4lcfrs-intro.pdf |archive-url=https://web.archive.org/web/20140819085928/http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4lcfrs-intro.pdf |archive-date=2014-08-19 |url-status=live|title=Mildly Context-Sensitive Grammar Formalisms: Linear Context-Free Rewriting Systems|first=Laura|last=Kallmeyer|year=2011}}</ref><ref name="Kallmeyer2010">{{cite book|first=Laura|last=Kallmeyer|title=Parsing Beyond Context-Free Grammars|year=2010|publisher=Springer Science & Business Media|isbn=978-3-642-14846-0|pages=1–5}}</ref>

Ongoing research on [[computational linguistics]] has focused on formulating other classes of languages that are "[[mildly context-sensitive language|mildly context-sensitive]]" whose decision problems are feasible, such as [[tree-adjoining grammar]]s, [[combinatory categorial grammar]]s, [[coupled context-free language]]s, and [[Generalized context-free grammar#Linear Context-free Rewriting Systems (LCFRSs)|linear context-free rewriting system]]s.  The languages generated by these formalisms properly lie between the context-free and context-sensitive languages.

More recently, the class [[PTIME]] has been identified with [[range concatenation grammar]]s, which are now considered to be the most expressive of the mild-context sensitive language classes.<ref name="Kallmeyer2010"/>