Editing Context-sensitive grammar

{{Short description|Type of a formal grammar}}
A '''context-sensitive grammar''' ('''CSG''') is a [[formal grammar]] in which the left-hand sides and right-hand sides of any [[Production (computer science)|production rules]] may be surrounded by a context of [[terminal symbol|terminal]] and [[nonterminal symbol]]s. Context-sensitive grammars are more general than [[context-free grammar]]s, in the sense that there are languages that can be described by a CSG but not by a context-free grammar. Context-sensitive grammars are less general (in the same sense) than [[unrestricted grammar]]s. Thus, CSGs are positioned between context-free and unrestricted grammars in the [[Chomsky hierarchy]].<ref>(Hopcroft, Ullman, 1979); Sect.9.4, p.227</ref>

A [[formal language]] that can be described by a context-sensitive grammar, or, equivalently, by a [[noncontracting grammar]] or a [[linear bounded automaton]], is called a [[context-sensitive language]]. Some textbooks actually define CSGs as non-contracting,<ref name="Linz2011">{{cite book|first=Peter|last=Linz|title=An Introduction to Formal Languages and Automata|url=https://books.google.com/books?id=hsxDiWvVdBcC&pg=PA291|year=2011|publisher=Jones & Bartlett Publishers|isbn=978-1-4496-1552-9|page=291}}</ref><ref name="Meduna2000">{{cite book|first=Alexander|last=Meduna|author-link=Alexander Meduna|title=Automata and Languages: Theory and Applications|url=https://books.google.com/books?id=s7gEErax71cC&pg=PA730|year=2000|publisher=Springer Science & Business Media|isbn=978-1-85233-074-3|page=730}}</ref><ref name="DavisSigal1994">{{cite book|first1=Martin|last1=Davis|author-link1=Martin Davis (mathematician)|first2=Ron|last2=Sigal|first3=Elaine J.|last3=Weyuker|author-link3=Elaine Weyuker|title=Computability, Complexity, and Languages: Fundamentals of Theoretical Computer Science|url=https://books.google.com/books?id=dSHIIx0uGx0C&pg=PA189|year=1994|publisher=Morgan Kaufmann|isbn=978-0-08-050246-5|page=189|edition=2nd}}</ref><ref>{{cite book|last=Martin|first=John C.|title=Introduction to Languages and the Theory of Computation|year=2010|publisher=McGraw-Hill|location=New York, NY|isbn=9780073191461|edition=4th|page=277}}</ref> although this is not how [[Noam Chomsky]] defined them in 1959.<ref name="Levelt2008">{{cite book|first=Willem J. M.|last=Levelt|author-link=Willem Levelt|title=An Introduction to the Theory of Formal Languages and Automata|url=https://books.google.com/books?id=tFvtwGYNe7kC&pg=PA26|year=2008|publisher=John Benjamins Publishing|isbn=978-90-272-3250-2|page=26}}</ref><ref name="DavisSigal1994b">{{cite book|first1=Martin|last1=Davis|author-link1=Martin Davis (mathematician)|first2=Ron |last2=Sigal|first3=Elaine J.|last3=Weyuker|author-link3=Elaine Weyuker|title=Computability, Complexity, and Languages: Fundamentals of Theoretical Computer Science|url=https://books.google.com/books?id=dSHIIx0uGx0C&pg=PA330|year=1994|publisher=Morgan Kaufmann|isbn=978-0-08-050246-5|pages=330–331|edition=2nd}}</ref> This choice of definition makes no difference in terms of the languages generated (i.e. the two definitions are [[weak equivalence (formal languages)|weakly equivalent]]), but it does make a difference in terms of what grammars are structurally considered context-sensitive; the latter issue was analyzed by Chomsky in 1963.<ref>{{cite book |last=Chomsky |first=N. |author-link=Noam Chomsky|year=1963|chapter=Formal properties of grammar|title=Handbook of Mathematical Psychology|editor1-first=R. D.|editor1-last=Luce|editor2-first=R. R.|editor2-last=Bush|editor3-first=E.|editor3-last=Galanter|location=New York|publisher=Wiley|pages=360–363|chapter-url=https://archive.org/details/handbookofmathem017893mbp}}</ref><ref name="Levelt2008-126">{{cite book|first=Willem J. M.|last=Levelt|author-link=Willem Levelt|title=An Introduction to the Theory of Formal Languages and Automata|url=https://books.google.com/books?id=tFvtwGYNe7kC&pg=PA125|year=2008|publisher=John Benjamins Publishing|isbn=978-90-272-3250-2|pages=125–126}}</ref>

Chomsky introduced context-sensitive grammars as a way to describe the syntax of [[natural language]] where it is often the case that a word may or may not be appropriate in a certain place depending on the context. [[Walter Savitch]] has criticized the terminology "context-sensitive" as misleading and proposed "non-erasing" as better explaining the distinction between a CSG and an [[unrestricted grammar]].<ref name="Vide1999">{{cite book|editor=Carlos Martín Vide|title=Issues in Mathematical Linguistics: Workshop on Mathematical Linguistics, State College, Pa., April 1998|url=https://books.google.com/books?id=BW2QNSvH2CoC&pg=PA186|year=1999|publisher=John Benjamins Publishing|isbn=90-272-1556-1|pages=186–187}}</ref>

Although it is well known that certain features of languages (e.g. [[cross-serial dependency]]) are not context-free, it is an [[open problem|open question]] how much of CSGs' expressive power is needed to capture the context sensitivity found in natural languages. Subsequent research in this area has focused on the more computationally tractable [[mildly context-sensitive language]]s.{{citation needed|date=March 2018}} The syntaxes of some [[visual programming language]]s can be described by context-sensitive [[graph grammar]]s.<ref>Zhang, Da-Qian, Kang Zhang, and Jiannong Cao. "[https://web.archive.org/web/20180323220143/https://pdfs.semanticscholar.org/5d3d/217d73e0f6bbeefa3749c16fbc7b2e00ec0b.pdf A context-sensitive graph grammar formalism for the specification of visual languages]." [[The Computer Journal]] 44.3 (2001): 186–200.</ref>

==Formal definition==

=== Formal grammar ===
Let us notate a [[formal grammar]] as <math>G = (N, \Sigma, P, S)</math>, with <math>N</math> a set of nonterminal symbols, <math>\Sigma</math> a set of terminal symbols, <math>P</math> a set of production rules, and <math>S \in N</math> the start symbol.

A string <math>u \in (N \cup \Sigma)^*</math> ''directly yields'', or ''directly derives to'', a string <math>v \in (N \cup \Sigma)^*</math>, denoted as <math>u \Rightarrow v</math>, if ''v'' can be obtained from ''u'' by an application of some production rule in ''P'', that is, if <math>u = \gamma L \delta</math> and <math>v = \gamma R \delta</math>, where <math>(L \to R) \in P</math> is a production rule, and <math>\gamma, \delta \in (N \cup \Sigma)^*</math> is the unaffected left and right part of the string, respectively.
More generally, ''u'' is said to ''yield'', or ''derive to'', ''v'', denoted as <math>u \Rightarrow^* v</math>, if ''v'' can be obtained from ''u'' by repeated application of production rules, that is, if <math>u = u_0 \Rightarrow ... \Rightarrow u_n = v</math> for some ''n'' ≥ 0 and some strings <math>u_1, ..., u_{n-1} \in (N \cup \Sigma)^*</math>. In other words, the relation <math>\Rightarrow^*</math> is the [[reflexive transitive closure]] of the relation <math>\Rightarrow</math>.

The '''language''' of the grammar ''G'' is the set of all terminal-symbol strings derivable from its start symbol, formally: <math>L(G) = \{ w \in \Sigma^* \mid S \Rightarrow^* w \}</math>.
Derivations that do not end in a string composed of terminal symbols only are possible, but do not contribute to ''L''(''G'').

=== Context-sensitive grammar ===
A formal grammar is '''context-sensitive''' if each rule in ''P'' is either of the form <math>S \to \varepsilon</math> where <math>\varepsilon</math> is the [[empty string]], or of the form
: α''A''β → αγβ
with ''A'' ∈ ''N'',<ref group="note">i.e., ''A'' a single [[nonterminal]]</ref> <math>\alpha, \beta\in (N \cup \Sigma \setminus\{S\})^*</math>,<ref group="note">i.e., α and β strings of nonterminals (except for the start symbol) and [[Terminal symbol|terminals]]</ref> and <math>\gamma\in (N \cup \Sigma \setminus\{S\})^+</math>.<ref group="note">i.e., γ is a nonempty string of nonterminals (except for the start symbol) and terminals</ref>

The name ''context-sensitive'' is explained by the α and β that form the context of ''A'' and determine whether ''A'' can be replaced with γ or not.
By contrast, in a [[context-free grammar]], no context is present: the left hand side of every production rule is just a nonterminal.

The string γ is not allowed to be empty.  Without this restriction, the resulting grammars become equal in power to [[unrestricted grammar]]s.<ref name="Vide1999" />

=== (Weakly) equivalent definitions ===
A [[noncontracting grammar]] is a grammar in which for any production rule, of the form ''u'' → ''v'', the length of ''u'' is less than or equal to the length of ''v''.

Every context-sensitive grammar is noncontracting, while every noncontracting grammar can be converted into an equivalent context-sensitive grammar; the two classes are [[weak equivalence (formal languages)|weakly equivalent]].<ref>{{cite book |last1=Hopcroft |first1=John E. |url=https://archive.org/details/introductiontoau00hopc |title=Introduction to Automata Theory, Languages, and Computation |last2=Ullman |first2=Jeffrey D. |publisher=Addison-Wesley |year=1979 |isbn=9780201029888 |author-link1=John Hopcroft |author-link2=Jeffrey Ullman |url-access=registration}}; p.&nbsp;223–224; Exercise 9, p.&nbsp;230. In the 2003 edition, the chapter on CSGs has been omitted.</ref>

Some authors use the term ''context-sensitive grammar'' to refer to noncontracting grammars in general.

The '''left-context'''- and '''right-context'''-sensitive grammars are defined by restricting the rules to just the form α''A'' → αγ and to just ''A''β → γβ, respectively. The languages generated by these grammars are also the full class of context-sensitive languages.<ref name="Hazewinkel1989">{{cite book |last=Hazewinkel |first=Michiel |url=https://books.google.com/books?id=s9F71NJxwzoC&pg=PA297 |title=Encyclopaedia of Mathematics |publisher=Springer Science & Business Media |year=1989 |isbn=978-1-55608-003-6 |volume=4 |page=297 |author-link=Michiel Hazewinkel}} also at https://www.encyclopediaofmath.org/index.php/Grammar,_context-sensitive</ref> The equivalence was established by [[Penttonen normal form]].<ref name="ItoKobayashi2010">{{cite book |last1=Ito |first1=Masami |url=https://books.google.com/books?id=xuaR2bJq0rcC&pg=PA183 |title=Automata, Formal Languages and Algebraic Systems: Proceedings of AFLAS 2008, Kyoto, Japan, 20–22 September 2008 |last2=Kobayashi |first2=Yūji |last3=Shoji |first3=Kunitaka |publisher=World Scientific |year=2010 |isbn=978-981-4317-60-3 |page=183}} citing {{Cite journal |last1=Penttonen |first1=Martti |date=Aug 1974 |title=One-sided and two-sided context in formal grammars |journal=[[Information and Control]] |volume=25 |issue=4 |pages=371–392 |doi=10.1016/S0019-9958(74)91049-3 |doi-access=free}}</ref>

== Examples ==
=== ''a<sup>n</sup>b<sup>n</sup>c<sup>n</sup>'' ===
The following context-sensitive grammar, with start symbol ''S'', generates the canonical non-[[context-free language]] { ''a<sup>n</sup>b<sup>n</sup>c<sup>n</sup>'' | ''n'' ≥ 1 } :{{cn|date=January 2022}}

{|
|- 
| 1. &nbsp; &nbsp; &nbsp; || || ''S'' &nbsp; &nbsp; || → &nbsp; &nbsp; || ''a'' || ''B'' || ''C'' 
|-
| 2. || || ''S'' || → || ''a'' || ''S'' || ''B'' || ''C''
|-
| 3. || ''C'' || ''B'' || → || ''C'' || ''Z''
|-
| 4. || ''C'' || ''Z'' || → || ''W'' || ''Z''
|-
| 5. || ''W'' || ''Z'' || → || ''W'' || ''C''
|-
| 6. || ''W'' || ''C'' || → || ''B'' || ''C''
|-
| 7. || ''a'' || ''B'' || → || ''a'' || ''b'' 
|-
| 8. || ''b'' || ''B'' || → || ''b'' || ''b''
|
|-
| 9. || ''b'' || ''C'' || → || ''b'' || ''c''
|-
|10. || ''c'' || ''C'' || → || ''c'' || ''c''
|}
<!---
# ''S'' → ''aSBC''
# ''S'' → ''aBC''
# ''CB'' → ''HB''
# ''HB'' → ''HC''
# ''HC'' → ''BC''
# ''aB'' → ''ab''
# ''bB'' → ''bb''
# ''bC'' → ''bc''
# ''cC'' → ''cc''
--->

Rules 1 and 2 allow for blowing-up ''S'' to ''a''<sup>''n''</sup>''BC''(''BC'')<sup>''n''−1</sup>; rules 3 to 6 allow for successively exchanging each ''CB'' to ''BC'' ([[Revesz' trick|four rules]] are needed for that since a rule ''CB'' → ''BC'' wouldn't fit into the scheme α''A''β → αγβ); rules 7–10 allow replacing a non-terminal ''B'' or ''C'' with its corresponding terminal ''b'' or ''c'', respectively, provided it is in the right place.
A generation chain for ''{{not a typo|aaabbbccc}}'' is:
: ''S''
: →<sub>2</sub> {{not a typo|'''aSBC'''}}
: →<sub>2</sub> {{not a typo|''a'''aSBC'''BC''}}
: →<sub>1</sub> {{not a typo|''aa'''aBC'''BCBC''}}
: →<sub>3</sub> {{not a typo|''aaaB'''CZ'''CBC''}}
: →<sub>4</sub> {{not a typo|''aaaB'''WZ'''CBC''}}
: →<sub>5</sub> {{not a typo|''aaaB'''WC'''CBC''}}
: →<sub>6</sub> {{not a typo|''aaaB'''BC'''CBC''}}
: →<sub>3</sub> {{not a typo|''aaaBBC'''CZ'''C''}}
: →<sub>4</sub> {{not a typo|''aaaBBC'''WZ'''C''}}
: →<sub>5</sub> {{not a typo|''aaaBBC'''WC'''C''}}
: →<sub>6</sub> {{not a typo|''aaaBBC'''BC'''C''}}
: →<sub>3</sub> {{not a typo|''aaaBB'''CZ'''CC''}}
: →<sub>4</sub> {{not a typo|''aaaBB'''WZ'''CC''}}
: →<sub>5</sub> {{not a typo|''aaaBB'''WC'''CC''}}
: →<sub>6</sub> {{not a typo|''aaaBB'''BC'''CC''}}
: →<sub>7</sub> {{not a typo|''aa'''ab'''BBCCC''}}
: →<sub>8</sub> {{not a typo|''aaa'''bb'''BCCC''}}
: →<sub>8</sub> {{not a typo|''aaab'''bb'''CCC''}}
: →<sub>9</sub> {{not a typo|''aaabb'''bc'''CC''}}
: →<sub>10</sub> {{not a typo|''aaabbb'''cc'''C''}}
: →<sub>10</sub> {{not a typo|''aaabbbc'''cc'''''}}
:

=== ''a<sup>n</sup>b<sup>n</sup>c<sup>n</sup>d<sup>n</sup>'', etc. ===
More complicated grammars can be used to parse { ''a<sup>n</sup>b<sup>n</sup>c<sup>n</sup>d<sup>n</sup>'' | ''n'' ≥ 1 }, and other languages with even more letters. Here we show a simpler approach using non-contracting grammars:{{cn|date=January 2022}}
Start with a kernel of regular productions generating the sentential forms
<math>(ABCD)^{n}abcd</math> and then include the non contracting productions
<math>p_{Da} : Da\rightarrow aD</math>,
<math>p_{Db} : Db\rightarrow bD</math>,
<math>p_{Dc} : Dc\rightarrow cD</math>,
<math>p_{Dd} : Dd\rightarrow dd</math>,
<math>p_{Ca} : Ca\rightarrow aC</math>,
<math>p_{Cb} : Cb\rightarrow bC</math>,
<math>p_{Cc} : Cc\rightarrow cc</math>,
<math>p_{Ba} : Ba\rightarrow aB</math>,
<math>p_{Bb} : Bb\rightarrow bb</math>,
<math>p_{Aa} : Aa\rightarrow aa</math>.

=== ''a<sup>m</sup>b<sup>n</sup>c<sup>m</sup>d<sup>n</sup>'' ===
A non contracting grammar (for which there is an equivalent CSG) for the language <math>L_{Cross} = \{ a^mb^nc^{m}d^{n} \mid m \ge 1, n \ge 1 \}</math> is defined by
:<math>p_0 : S \rightarrow RT</math>,
:<math>p_1 : R\rightarrow aRC | aC</math>,
:<math>p_3 : T\rightarrow BTd | Bd</math>,
:<math>p_5 : CB\rightarrow BC</math>,
:<math>p_6 : aB\rightarrow ab</math>, 
:<math>p_7 : bB\rightarrow bb</math>, 
:<math>p_8 : Cd\rightarrow cd</math>, and
:<math>p_9 : Cc\rightarrow cc</math>.

With these definitions, a derivation for <math>a^3b^2c^3d^2</math> is:
<math>S
\Rightarrow_{p_0} RT
\Rightarrow_{p^{2}_{1}p_{2}} a^3C^3T
\Rightarrow_{p_{3}p_{4}    } a^3C^3B^2d^2
\Rightarrow_{p^{6}_{5}     } a^3B^2C^3d^2
\Rightarrow_{p_{6}p_{7}    } a^3b^2C^3d^2
\Rightarrow_{p_{8}p^{2}_{9}} a^3b^2c^3d^2
</math>.{{citation needed|date=November 2018}}

=== ''a''<sup>2<sup>''i''</sup></sup> ===
A noncontracting grammar for the language { ''a''<sup>2<sup>''i''</sup></sup> | ''i'' ≥ 1 } is constructed in Example 9.5 (p.&nbsp;224) of (Hopcroft, Ullman, 1979):<ref>They obtained the grammar by systematic transformation of an [[unrestricted grammar]], given in Exm. 9.4, viz.:
# <math>S\rightarrow ACaB</math>,
# <math>Ca\rightarrow aaC</math>,
# <math>CB\rightarrow DB</math>,
# <math>CB\rightarrow E</math>,
# <math>aD\rightarrow Da</math>,
# <math>AD\rightarrow AC</math>,
# <math>aE\rightarrow Ea</math>,
# <math>AE\rightarrow \varepsilon</math>.
In the context-sensitive grammar, a string enclosed in square brackets, like <math>[ACaB]</math>, is considered a single symbol (similar to e.g. <code><name-part></code> in [[Backus–Naur form#Example|Backus–Naur form]]). The symbol names are chosen to resemble the unrestricted grammar. Likewise, rule groups in the context-sensitive grammar are numbered by the unrestricted-grammar rule they originated from.</ref>

# <math>S\rightarrow [ACaB]</math>
# <math>\begin{cases}
\ [Ca]a\rightarrow aa[Ca] \\ 
\ [Ca][aB]\rightarrow aa[CaB] \\ 
\ [ACa]a\rightarrow [Aa]a[Ca] \\ 
\ [ACa][aB]\rightarrow [Aa]a[CaB] \\ 
\ [ACaB]\rightarrow [Aa][aCB] \\ 
\ [CaB]\rightarrow a[aCB]
\end{cases}</math>
# <math>[aCB]\rightarrow [aDB]</math>
# <math>[aCB]\rightarrow [aE]</math>
# <math>\begin{cases}
\ a[Da]\rightarrow [Da]a \\
\ [aDB]\rightarrow [DaB] \\ 
\ [Aa][Da]\rightarrow [ADa]a \\ 
\ a[DaB]\rightarrow [Da][aB] \\
\ [Aa][DaB]\rightarrow [ADa][aB]
\end{cases}</math>
# <math>[ADa]\rightarrow [ACa]</math>
# <math>\begin{cases}
\ a[Ea]\rightarrow [Ea]a \\
\ [aE]\rightarrow [Ea] \\
\ [Aa][Ea]\rightarrow [AEa]a
\end{cases}</math>
# <math>[AEa]\rightarrow a</math>

== Kuroda normal form ==
Every context-sensitive grammar which does not generate the empty string can be transformed into a [[weak equivalence (formal languages)|weakly equivalent]] one in [[Kuroda normal form]]. "Weakly equivalent" here means that the two grammars generate the same language. The normal form will not in general be context-sensitive, but will be a [[noncontracting grammar]].<ref>{{cite journal | first=Sige-Yuki | last=Kuroda | author-link=S.-Y. Kuroda | title=Classes of languages and linear-bounded automata | journal=Information and Control | volume=7 | number=2 | pages=207–223 | date=June 1964 | doi=10.1016/s0019-9958(64)90120-2| doi-access=free }}</ref><ref>{{cite book |last1=Mateescu | first1=Alexandru |last2=Salomaa|first2=Arto |author-link2=Arto Salomaa|editor1-first=Grzegorz| editor1-last=Rozenberg|editor-link1=Grzegorz Rozenberg |editor2-first=Arto| editor2-last=Salomaa |editor-link2=Arto Salomaa|title=Handbook of Formal Languages. Volume I: Word, language, grammar |publisher=Springer-Verlag |year=1997 |pages=175–252 |chapter=Chapter 4: Aspects of Classical Language Theory |isbn=3-540-61486-9}}, Here: Theorem 2.2, p. 190</ref>

The Kuroda normal form is an actual normal form for non-contracting grammars.

== Properties and uses ==
{{see also|context-sensitive language}}
{{more citations needed section|date=August 2014}}

=== Equivalence to linear bounded automaton ===
A formal language can be described by a context-sensitive grammar if and only if it is accepted by some [[linear bounded automaton]] (LBA).<ref>(Hopcroft, Ullman, 1979); Theorem 9.5, 9.6, p.&nbsp;225–226</ref> In some textbooks this result is attributed solely to Landweber and [[S.-Y. Kuroda|Kuroda]].<ref name="DavisSigal1994b"/> Others call it the [[John Myhill|Myhill]]–Landweber–Kuroda theorem.<ref name="flac">{{Cite web |first=Klaus |last=Sutner |url=https://www.cs.cmu.edu/~flac/pdf/ContSens.pdf |title=Context Sensitive Grammars |publisher=[[Carnegie Mellon University]] |date=Spring 2016 |access-date=2019-08-29 |archive-date=2017-02-03 |archive-url=https://web.archive.org/web/20170203081505/http://www.cs.cmu.edu/~flac/pdf/ContSens.pdf |url-status=dead }}</ref> (Myhill introduced the concept of deterministic LBA in 1960. Peter S. Landweber published in 1963 that the language accepted by a deterministic LBA is context sensitive.<ref>{{cite journal | author=P.S. Landweber | title=Three Theorems on Phrase Structure Grammars of Type 1 | journal=[[Information and Control]] | volume=6 | number=2 | pages=131&ndash;136 | year=1963 |doi=10.1016/s0019-9958(63)90169-4 | doi-access=free }}</ref> Kuroda introduced the notion of non-deterministic LBA and the equivalence between LBA and CSGs in 1964.<ref>{{cite book|first=Alexander|last=Meduna|author-link=Alexander Meduna|title=Automata and Languages: Theory and Applications|url=https://books.google.com/books?id=s7gEErax71cC&pg=PA755|year=2000|publisher=Springer Science & Business Media|isbn=978-1-85233-074-3|page=755}}</ref><ref>{{cite book|first=Willem J. M.|last=Levelt|author-link=Willem Levelt|title=An Introduction to the Theory of Formal Languages and Automata|url=https://books.google.com/books?id=tFvtwGYNe7kC&pg=PA126|year=2008|publisher=John Benjamins Publishing|isbn=978-90-272-3250-2|pages=126–127}}</ref>)

{{As of|2010}}{{update inline|date=May 2023}} it is still an open question whether every context-sensitive language can be accepted by a ''deterministic'' LBA.<ref>{{cite book|last=Martin|first=John C.|title=Introduction to Languages and the Theory of Computation|year=2010|publisher=McGraw-Hill|location=New York, NY|isbn=9780073191461|edition=4th|page=283}}</ref>

=== Closure properties ===
Context-sensitive languages are closed under [[complement (set theory)|complement]]. This 1988 result is known as the [[Immerman–Szelepcsényi theorem]].<ref name="flac"/>
Moreover, they are closed under [[union (set theory)|union]], [[intersection (set theory)|intersection]], [[concatenation#Concatenation of sets of strings|concatenation]], [[String substitution|substitution]],<ref group=note>more formally: if ''L'' ⊆ Σ<sup>*</sup> is a context-sensitive language and ''f'' maps each ''a''∈Σ to a context-sensitive language ''f''(''a''), the ''f''(''L'') is again a context-sensitive language</ref> [[inverse string homomorphism|inverse homomorphism]], and [[Kleene plus]].<ref>(Hopcroft, Ullman, 1979); Exercise S9.10, p.&nbsp;230–231</ref>

Every [[recursively enumerable language]] ''L'' can be written as ''h''(''L'') for some context-sensitive language ''L'' and some [[string homomorphism]] ''h''.<ref>(Hopcroft, Ullman, 1979); Exercise S9.14, p.&nbsp;230–232. ''h'' maps each symbol to itself or to the empty string.</ref>

=== Computational problems ===
The [[decision problem]] that asks whether a certain string ''s'' belongs to the language of a given context-sensitive grammar ''G'', is [[PSPACE-complete]]. Moreover, there are context-sensitive grammars whose languages are PSPACE-complete. In other words, there is a context-sensitive grammar ''G'' such that deciding whether a certain string ''s'' belongs to the language of ''G'' is PSPACE-complete (so ''G'' is fixed and only ''s'' is part of the input of the problem).<ref>An example of such a grammar, designed to solve the [[QSAT]] problem, is given in {{Cite book|last=Lita|first=C. V.|title=2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC) |chapter=On Complexity of the Detection Problem for Bounded Length Polymorphic Viruses |date=2016-09-01|pages=371–378|doi=10.1109/SYNASC.2016.064|isbn=978-1-5090-5707-8|s2cid=18067130}}</ref>

The [[emptiness problem]] for context-sensitive grammars (given a context-sensitive grammar ''G'', is ''L''(''G'')=∅ ?) is [[Undecidable language|undecidable]].<ref>(Hopcroft, Ullman, 1979); Exercise S9.13, p.&nbsp;230–231</ref><ref group=note>This also follows from (1) [[#top|context-free languages being also context-sensitive]], (2) [[#Closure properties|context-sensitive language being closed under intersection]], but (3) [[Context-free grammar#Language disjointness|disjointness of context-free languages being undecidable]].</ref>

=== As model of natural languages ===
Savitch has [[mathematical proof|proven]] the following theoretical result, on which he bases his criticism of CSGs as basis for natural language: for any [[recursively enumerable]] set ''R'', there exists a context-sensitive language/grammar ''G'' which can be used as a sort of proxy to test membership in ''R'' in the following way: given a string ''s'', ''s'' is in ''R'' if and only if there exists a positive integer ''n'' for which ''sc<sup>n</sup>'' is in G, where ''c'' is an arbitrary symbol not part of ''R''.<ref name="Vide1999"/>

It has been shown that nearly all [[natural language]]s may in general be characterized by context-sensitive grammars, but the whole class of CSGs seems to be much bigger than natural languages.{{citation needed|date=November 2011}}  Worse yet, since the aforementioned decision problem for CSGs is PSPACE-complete, that makes them totally unworkable for practical use, as a polynomial-time algorithm for a PSPACE-complete problem would imply [[P=NP problem|P=NP]].

It was proven that some natural languages are not context-free, based on identifying so-called [[cross-serial dependency|cross-serial dependencies]] and [[unbounded scrambling]] phenomena.{{cn|date=December 2022}} However this does not necessarily imply that the class of CSGs is necessary to capture "context sensitivity" in the colloquial sense of these terms in natural languages. For example, [[linear context-free rewriting system]]s (LCFRSs) are strictly weaker than CSGs but can account for the phenomenon of cross-serial dependencies; one can write a LCFRS grammar for {''a<sup>n</sup>b<sup>n</sup>c<sup>n</sup>d<sup>n</sup>'' | ''n'' ≥ 1} for example.<ref>{{cite web|url=http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4nl-cfg.pdf |archive-url=https://web.archive.org/web/20140819085139/http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4nl-cfg.pdf |archive-date=2014-08-19 |url-status=live|title=Mildly Context-Sensitive Grammar Formalisms: Natural Languages are not Context-Free|first=Laura|last=Kallmeyer|year=2011}}</ref><ref>{{cite web|url=http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4lcfrs-intro.pdf |archive-url=https://web.archive.org/web/20140819085928/http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4lcfrs-intro.pdf |archive-date=2014-08-19 |url-status=live|title=Mildly Context-Sensitive Grammar Formalisms: Linear Context-Free Rewriting Systems|first=Laura|last=Kallmeyer|year=2011}}</ref><ref name="Kallmeyer2010">{{cite book|first=Laura|last=Kallmeyer|title=Parsing Beyond Context-Free Grammars|year=2010|publisher=Springer Science & Business Media|isbn=978-3-642-14846-0|pages=1–5}}</ref>

Ongoing research on [[computational linguistics]] has focused on formulating other classes of languages that are "[[mildly context-sensitive language|mildly context-sensitive]]" whose decision problems are feasible, such as [[tree-adjoining grammar]]s, [[combinatory categorial grammar]]s, [[coupled context-free language]]s, and [[Generalized context-free grammar#Linear Context-free Rewriting Systems (LCFRSs)|linear context-free rewriting system]]s.  The languages generated by these formalisms properly lie between the context-free and context-sensitive languages.

More recently, the class [[PTIME]] has been identified with [[range concatenation grammar]]s, which are now considered to be the most expressive of the mild-context sensitive language classes.<ref name="Kallmeyer2010"/>

==See also==
* [[Chomsky hierarchy]]
* [[Growing context-sensitive grammar]]
* [[Definite clause grammar#Non-context-free grammars]]
* [[List of parser generators for context-sensitive grammars]]

==Notes==
{{reflist|group=note}}

==References==
{{reflist}}

==Further reading==
* {{cite book|first1=Alexander|last1=Meduna|author-link1=Alexander Meduna|first2=Martin|last2=Švec|title=Grammars with Context Conditions and Their Applications|year=2005|publisher=John Wiley & Sons|isbn=978-0-471-73655-4}}

==External links==
* [https://web.archive.org/web/20110708224600/https://danielmattosroberts.com/earley/context-sensitive-earley.pdf Earley Parsing for Context-Sensitive Grammars]

{{Formal languages and grammars}}

{{DEFAULTSORT:Context-Sensitive Grammar}}
[[Category:Formal languages]]
[[Category:Grammar frameworks]]