Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Context-sensitive grammar
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
{{Short description|Type of a formal grammar}} A '''context-sensitive grammar''' ('''CSG''') is a [[formal grammar]] in which the left-hand sides and right-hand sides of any [[Production (computer science)|production rules]] may be surrounded by a context of [[terminal symbol|terminal]] and [[nonterminal symbol]]s. Context-sensitive grammars are more general than [[context-free grammar]]s, in the sense that there are languages that can be described by a CSG but not by a context-free grammar. Context-sensitive grammars are less general (in the same sense) than [[unrestricted grammar]]s. Thus, CSGs are positioned between context-free and unrestricted grammars in the [[Chomsky hierarchy]].<ref>(Hopcroft, Ullman, 1979); Sect.9.4, p.227</ref> A [[formal language]] that can be described by a context-sensitive grammar, or, equivalently, by a [[noncontracting grammar]] or a [[linear bounded automaton]], is called a [[context-sensitive language]]. Some textbooks actually define CSGs as non-contracting,<ref name="Linz2011">{{cite book|first=Peter|last=Linz|title=An Introduction to Formal Languages and Automata|url=https://books.google.com/books?id=hsxDiWvVdBcC&pg=PA291|year=2011|publisher=Jones & Bartlett Publishers|isbn=978-1-4496-1552-9|page=291}}</ref><ref name="Meduna2000">{{cite book|first=Alexander|last=Meduna|author-link=Alexander Meduna|title=Automata and Languages: Theory and Applications|url=https://books.google.com/books?id=s7gEErax71cC&pg=PA730|year=2000|publisher=Springer Science & Business Media|isbn=978-1-85233-074-3|page=730}}</ref><ref name="DavisSigal1994">{{cite book|first1=Martin|last1=Davis|author-link1=Martin Davis (mathematician)|first2=Ron|last2=Sigal|first3=Elaine J.|last3=Weyuker|author-link3=Elaine Weyuker|title=Computability, Complexity, and Languages: Fundamentals of Theoretical Computer Science|url=https://books.google.com/books?id=dSHIIx0uGx0C&pg=PA189|year=1994|publisher=Morgan Kaufmann|isbn=978-0-08-050246-5|page=189|edition=2nd}}</ref><ref>{{cite book|last=Martin|first=John C.|title=Introduction to Languages and the Theory of Computation|year=2010|publisher=McGraw-Hill|location=New York, NY|isbn=9780073191461|edition=4th|page=277}}</ref> although this is not how [[Noam Chomsky]] defined them in 1959.<ref name="Levelt2008">{{cite book|first=Willem J. M.|last=Levelt|author-link=Willem Levelt|title=An Introduction to the Theory of Formal Languages and Automata|url=https://books.google.com/books?id=tFvtwGYNe7kC&pg=PA26|year=2008|publisher=John Benjamins Publishing|isbn=978-90-272-3250-2|page=26}}</ref><ref name="DavisSigal1994b">{{cite book|first1=Martin|last1=Davis|author-link1=Martin Davis (mathematician)|first2=Ron |last2=Sigal|first3=Elaine J.|last3=Weyuker|author-link3=Elaine Weyuker|title=Computability, Complexity, and Languages: Fundamentals of Theoretical Computer Science|url=https://books.google.com/books?id=dSHIIx0uGx0C&pg=PA330|year=1994|publisher=Morgan Kaufmann|isbn=978-0-08-050246-5|pages=330–331|edition=2nd}}</ref> This choice of definition makes no difference in terms of the languages generated (i.e. the two definitions are [[weak equivalence (formal languages)|weakly equivalent]]), but it does make a difference in terms of what grammars are structurally considered context-sensitive; the latter issue was analyzed by Chomsky in 1963.<ref>{{cite book |last=Chomsky |first=N. |author-link=Noam Chomsky|year=1963|chapter=Formal properties of grammar|title=Handbook of Mathematical Psychology|editor1-first=R. D.|editor1-last=Luce|editor2-first=R. R.|editor2-last=Bush|editor3-first=E.|editor3-last=Galanter|location=New York|publisher=Wiley|pages=360–363|chapter-url=https://archive.org/details/handbookofmathem017893mbp}}</ref><ref name="Levelt2008-126">{{cite book|first=Willem J. M.|last=Levelt|author-link=Willem Levelt|title=An Introduction to the Theory of Formal Languages and Automata|url=https://books.google.com/books?id=tFvtwGYNe7kC&pg=PA125|year=2008|publisher=John Benjamins Publishing|isbn=978-90-272-3250-2|pages=125–126}}</ref> Chomsky introduced context-sensitive grammars as a way to describe the syntax of [[natural language]] where it is often the case that a word may or may not be appropriate in a certain place depending on the context. [[Walter Savitch]] has criticized the terminology "context-sensitive" as misleading and proposed "non-erasing" as better explaining the distinction between a CSG and an [[unrestricted grammar]].<ref name="Vide1999">{{cite book|editor=Carlos Martín Vide|title=Issues in Mathematical Linguistics: Workshop on Mathematical Linguistics, State College, Pa., April 1998|url=https://books.google.com/books?id=BW2QNSvH2CoC&pg=PA186|year=1999|publisher=John Benjamins Publishing|isbn=90-272-1556-1|pages=186–187}}</ref> Although it is well known that certain features of languages (e.g. [[cross-serial dependency]]) are not context-free, it is an [[open problem|open question]] how much of CSGs' expressive power is needed to capture the context sensitivity found in natural languages. Subsequent research in this area has focused on the more computationally tractable [[mildly context-sensitive language]]s.{{citation needed|date=March 2018}} The syntaxes of some [[visual programming language]]s can be described by context-sensitive [[graph grammar]]s.<ref>Zhang, Da-Qian, Kang Zhang, and Jiannong Cao. "[https://web.archive.org/web/20180323220143/https://pdfs.semanticscholar.org/5d3d/217d73e0f6bbeefa3749c16fbc7b2e00ec0b.pdf A context-sensitive graph grammar formalism for the specification of visual languages]." [[The Computer Journal]] 44.3 (2001): 186–200.</ref> ==Formal definition== === Formal grammar === Let us notate a [[formal grammar]] as <math>G = (N, \Sigma, P, S)</math>, with <math>N</math> a set of nonterminal symbols, <math>\Sigma</math> a set of terminal symbols, <math>P</math> a set of production rules, and <math>S \in N</math> the start symbol. A string <math>u \in (N \cup \Sigma)^*</math> ''directly yields'', or ''directly derives to'', a string <math>v \in (N \cup \Sigma)^*</math>, denoted as <math>u \Rightarrow v</math>, if ''v'' can be obtained from ''u'' by an application of some production rule in ''P'', that is, if <math>u = \gamma L \delta</math> and <math>v = \gamma R \delta</math>, where <math>(L \to R) \in P</math> is a production rule, and <math>\gamma, \delta \in (N \cup \Sigma)^*</math> is the unaffected left and right part of the string, respectively. More generally, ''u'' is said to ''yield'', or ''derive to'', ''v'', denoted as <math>u \Rightarrow^* v</math>, if ''v'' can be obtained from ''u'' by repeated application of production rules, that is, if <math>u = u_0 \Rightarrow ... \Rightarrow u_n = v</math> for some ''n'' ≥ 0 and some strings <math>u_1, ..., u_{n-1} \in (N \cup \Sigma)^*</math>. In other words, the relation <math>\Rightarrow^*</math> is the [[reflexive transitive closure]] of the relation <math>\Rightarrow</math>. The '''language''' of the grammar ''G'' is the set of all terminal-symbol strings derivable from its start symbol, formally: <math>L(G) = \{ w \in \Sigma^* \mid S \Rightarrow^* w \}</math>. Derivations that do not end in a string composed of terminal symbols only are possible, but do not contribute to ''L''(''G''). === Context-sensitive grammar === A formal grammar is '''context-sensitive''' if each rule in ''P'' is either of the form <math>S \to \varepsilon</math> where <math>\varepsilon</math> is the [[empty string]], or of the form : α''A''β → αγβ with ''A'' ∈ ''N'',<ref group="note">i.e., ''A'' a single [[nonterminal]]</ref> <math>\alpha, \beta\in (N \cup \Sigma \setminus\{S\})^*</math>,<ref group="note">i.e., α and β strings of nonterminals (except for the start symbol) and [[Terminal symbol|terminals]]</ref> and <math>\gamma\in (N \cup \Sigma \setminus\{S\})^+</math>.<ref group="note">i.e., γ is a nonempty string of nonterminals (except for the start symbol) and terminals</ref> The name ''context-sensitive'' is explained by the α and β that form the context of ''A'' and determine whether ''A'' can be replaced with γ or not. By contrast, in a [[context-free grammar]], no context is present: the left hand side of every production rule is just a nonterminal. The string γ is not allowed to be empty. Without this restriction, the resulting grammars become equal in power to [[unrestricted grammar]]s.<ref name="Vide1999" /> === (Weakly) equivalent definitions === A [[noncontracting grammar]] is a grammar in which for any production rule, of the form ''u'' → ''v'', the length of ''u'' is less than or equal to the length of ''v''. Every context-sensitive grammar is noncontracting, while every noncontracting grammar can be converted into an equivalent context-sensitive grammar; the two classes are [[weak equivalence (formal languages)|weakly equivalent]].<ref>{{cite book |last1=Hopcroft |first1=John E. |url=https://archive.org/details/introductiontoau00hopc |title=Introduction to Automata Theory, Languages, and Computation |last2=Ullman |first2=Jeffrey D. |publisher=Addison-Wesley |year=1979 |isbn=9780201029888 |author-link1=John Hopcroft |author-link2=Jeffrey Ullman |url-access=registration}}; p. 223–224; Exercise 9, p. 230. In the 2003 edition, the chapter on CSGs has been omitted.</ref> Some authors use the term ''context-sensitive grammar'' to refer to noncontracting grammars in general. The '''left-context'''- and '''right-context'''-sensitive grammars are defined by restricting the rules to just the form α''A'' → αγ and to just ''A''β → γβ, respectively. The languages generated by these grammars are also the full class of context-sensitive languages.<ref name="Hazewinkel1989">{{cite book |last=Hazewinkel |first=Michiel |url=https://books.google.com/books?id=s9F71NJxwzoC&pg=PA297 |title=Encyclopaedia of Mathematics |publisher=Springer Science & Business Media |year=1989 |isbn=978-1-55608-003-6 |volume=4 |page=297 |author-link=Michiel Hazewinkel}} also at https://www.encyclopediaofmath.org/index.php/Grammar,_context-sensitive</ref> The equivalence was established by [[Penttonen normal form]].<ref name="ItoKobayashi2010">{{cite book |last1=Ito |first1=Masami |url=https://books.google.com/books?id=xuaR2bJq0rcC&pg=PA183 |title=Automata, Formal Languages and Algebraic Systems: Proceedings of AFLAS 2008, Kyoto, Japan, 20–22 September 2008 |last2=Kobayashi |first2=Yūji |last3=Shoji |first3=Kunitaka |publisher=World Scientific |year=2010 |isbn=978-981-4317-60-3 |page=183}} citing {{Cite journal |last1=Penttonen |first1=Martti |date=Aug 1974 |title=One-sided and two-sided context in formal grammars |journal=[[Information and Control]] |volume=25 |issue=4 |pages=371–392 |doi=10.1016/S0019-9958(74)91049-3 |doi-access=free}}</ref> == Examples == === ''a<sup>n</sup>b<sup>n</sup>c<sup>n</sup>'' === The following context-sensitive grammar, with start symbol ''S'', generates the canonical non-[[context-free language]] { ''a<sup>n</sup>b<sup>n</sup>c<sup>n</sup>'' | ''n'' ≥ 1 } :{{cn|date=January 2022}} {| |- | 1. || || ''S'' || → || ''a'' || ''B'' || ''C'' |- | 2. || || ''S'' || → || ''a'' || ''S'' || ''B'' || ''C'' |- | 3. || ''C'' || ''B'' || → || ''C'' || ''Z'' |- | 4. || ''C'' || ''Z'' || → || ''W'' || ''Z'' |- | 5. || ''W'' || ''Z'' || → || ''W'' || ''C'' |- | 6. || ''W'' || ''C'' || → || ''B'' || ''C'' |- | 7. || ''a'' || ''B'' || → || ''a'' || ''b'' |- | 8. || ''b'' || ''B'' || → || ''b'' || ''b'' | |- | 9. || ''b'' || ''C'' || → || ''b'' || ''c'' |- |10. || ''c'' || ''C'' || → || ''c'' || ''c'' |} <!--- # ''S'' → ''aSBC'' # ''S'' → ''aBC'' # ''CB'' → ''HB'' # ''HB'' → ''HC'' # ''HC'' → ''BC'' # ''aB'' → ''ab'' # ''bB'' → ''bb'' # ''bC'' → ''bc'' # ''cC'' → ''cc'' ---> Rules 1 and 2 allow for blowing-up ''S'' to ''a''<sup>''n''</sup>''BC''(''BC'')<sup>''n''−1</sup>; rules 3 to 6 allow for successively exchanging each ''CB'' to ''BC'' ([[Revesz' trick|four rules]] are needed for that since a rule ''CB'' → ''BC'' wouldn't fit into the scheme α''A''β → αγβ); rules 7–10 allow replacing a non-terminal ''B'' or ''C'' with its corresponding terminal ''b'' or ''c'', respectively, provided it is in the right place. A generation chain for ''{{not a typo|aaabbbccc}}'' is: : ''S'' : →<sub>2</sub> {{not a typo|'''aSBC'''}} : →<sub>2</sub> {{not a typo|''a'''aSBC'''BC''}} : →<sub>1</sub> {{not a typo|''aa'''aBC'''BCBC''}} : →<sub>3</sub> {{not a typo|''aaaB'''CZ'''CBC''}} : →<sub>4</sub> {{not a typo|''aaaB'''WZ'''CBC''}} : →<sub>5</sub> {{not a typo|''aaaB'''WC'''CBC''}} : →<sub>6</sub> {{not a typo|''aaaB'''BC'''CBC''}} : →<sub>3</sub> {{not a typo|''aaaBBC'''CZ'''C''}} : →<sub>4</sub> {{not a typo|''aaaBBC'''WZ'''C''}} : →<sub>5</sub> {{not a typo|''aaaBBC'''WC'''C''}} : →<sub>6</sub> {{not a typo|''aaaBBC'''BC'''C''}} : →<sub>3</sub> {{not a typo|''aaaBB'''CZ'''CC''}} : →<sub>4</sub> {{not a typo|''aaaBB'''WZ'''CC''}} : →<sub>5</sub> {{not a typo|''aaaBB'''WC'''CC''}} : →<sub>6</sub> {{not a typo|''aaaBB'''BC'''CC''}} : →<sub>7</sub> {{not a typo|''aa'''ab'''BBCCC''}} : →<sub>8</sub> {{not a typo|''aaa'''bb'''BCCC''}} : →<sub>8</sub> {{not a typo|''aaab'''bb'''CCC''}} : →<sub>9</sub> {{not a typo|''aaabb'''bc'''CC''}} : →<sub>10</sub> {{not a typo|''aaabbb'''cc'''C''}} : →<sub>10</sub> {{not a typo|''aaabbbc'''cc'''''}} : === ''a<sup>n</sup>b<sup>n</sup>c<sup>n</sup>d<sup>n</sup>'', etc. === More complicated grammars can be used to parse { ''a<sup>n</sup>b<sup>n</sup>c<sup>n</sup>d<sup>n</sup>'' | ''n'' ≥ 1 }, and other languages with even more letters. Here we show a simpler approach using non-contracting grammars:{{cn|date=January 2022}} Start with a kernel of regular productions generating the sentential forms <math>(ABCD)^{n}abcd</math> and then include the non contracting productions <math>p_{Da} : Da\rightarrow aD</math>, <math>p_{Db} : Db\rightarrow bD</math>, <math>p_{Dc} : Dc\rightarrow cD</math>, <math>p_{Dd} : Dd\rightarrow dd</math>, <math>p_{Ca} : Ca\rightarrow aC</math>, <math>p_{Cb} : Cb\rightarrow bC</math>, <math>p_{Cc} : Cc\rightarrow cc</math>, <math>p_{Ba} : Ba\rightarrow aB</math>, <math>p_{Bb} : Bb\rightarrow bb</math>, <math>p_{Aa} : Aa\rightarrow aa</math>. === ''a<sup>m</sup>b<sup>n</sup>c<sup>m</sup>d<sup>n</sup>'' === A non contracting grammar (for which there is an equivalent CSG) for the language <math>L_{Cross} = \{ a^mb^nc^{m}d^{n} \mid m \ge 1, n \ge 1 \}</math> is defined by :<math>p_0 : S \rightarrow RT</math>, :<math>p_1 : R\rightarrow aRC | aC</math>, :<math>p_3 : T\rightarrow BTd | Bd</math>, :<math>p_5 : CB\rightarrow BC</math>, :<math>p_6 : aB\rightarrow ab</math>, :<math>p_7 : bB\rightarrow bb</math>, :<math>p_8 : Cd\rightarrow cd</math>, and :<math>p_9 : Cc\rightarrow cc</math>. With these definitions, a derivation for <math>a^3b^2c^3d^2</math> is: <math>S \Rightarrow_{p_0} RT \Rightarrow_{p^{2}_{1}p_{2}} a^3C^3T \Rightarrow_{p_{3}p_{4} } a^3C^3B^2d^2 \Rightarrow_{p^{6}_{5} } a^3B^2C^3d^2 \Rightarrow_{p_{6}p_{7} } a^3b^2C^3d^2 \Rightarrow_{p_{8}p^{2}_{9}} a^3b^2c^3d^2 </math>.{{citation needed|date=November 2018}} === ''a''<sup>2<sup>''i''</sup></sup> === A noncontracting grammar for the language { ''a''<sup>2<sup>''i''</sup></sup> | ''i'' ≥ 1 } is constructed in Example 9.5 (p. 224) of (Hopcroft, Ullman, 1979):<ref>They obtained the grammar by systematic transformation of an [[unrestricted grammar]], given in Exm. 9.4, viz.: # <math>S\rightarrow ACaB</math>, # <math>Ca\rightarrow aaC</math>, # <math>CB\rightarrow DB</math>, # <math>CB\rightarrow E</math>, # <math>aD\rightarrow Da</math>, # <math>AD\rightarrow AC</math>, # <math>aE\rightarrow Ea</math>, # <math>AE\rightarrow \varepsilon</math>. In the context-sensitive grammar, a string enclosed in square brackets, like <math>[ACaB]</math>, is considered a single symbol (similar to e.g. <code><name-part></code> in [[Backus–Naur form#Example|Backus–Naur form]]). The symbol names are chosen to resemble the unrestricted grammar. Likewise, rule groups in the context-sensitive grammar are numbered by the unrestricted-grammar rule they originated from.</ref> # <math>S\rightarrow [ACaB]</math> # <math>\begin{cases} \ [Ca]a\rightarrow aa[Ca] \\ \ [Ca][aB]\rightarrow aa[CaB] \\ \ [ACa]a\rightarrow [Aa]a[Ca] \\ \ [ACa][aB]\rightarrow [Aa]a[CaB] \\ \ [ACaB]\rightarrow [Aa][aCB] \\ \ [CaB]\rightarrow a[aCB] \end{cases}</math> # <math>[aCB]\rightarrow [aDB]</math> # <math>[aCB]\rightarrow [aE]</math> # <math>\begin{cases} \ a[Da]\rightarrow [Da]a \\ \ [aDB]\rightarrow [DaB] \\ \ [Aa][Da]\rightarrow [ADa]a \\ \ a[DaB]\rightarrow [Da][aB] \\ \ [Aa][DaB]\rightarrow [ADa][aB] \end{cases}</math> # <math>[ADa]\rightarrow [ACa]</math> # <math>\begin{cases} \ a[Ea]\rightarrow [Ea]a \\ \ [aE]\rightarrow [Ea] \\ \ [Aa][Ea]\rightarrow [AEa]a \end{cases}</math> # <math>[AEa]\rightarrow a</math> == Kuroda normal form == Every context-sensitive grammar which does not generate the empty string can be transformed into a [[weak equivalence (formal languages)|weakly equivalent]] one in [[Kuroda normal form]]. "Weakly equivalent" here means that the two grammars generate the same language. The normal form will not in general be context-sensitive, but will be a [[noncontracting grammar]].<ref>{{cite journal | first=Sige-Yuki | last=Kuroda | author-link=S.-Y. Kuroda | title=Classes of languages and linear-bounded automata | journal=Information and Control | volume=7 | number=2 | pages=207–223 | date=June 1964 | doi=10.1016/s0019-9958(64)90120-2| doi-access=free }}</ref><ref>{{cite book |last1=Mateescu | first1=Alexandru |last2=Salomaa|first2=Arto |author-link2=Arto Salomaa|editor1-first=Grzegorz| editor1-last=Rozenberg|editor-link1=Grzegorz Rozenberg |editor2-first=Arto| editor2-last=Salomaa |editor-link2=Arto Salomaa|title=Handbook of Formal Languages. Volume I: Word, language, grammar |publisher=Springer-Verlag |year=1997 |pages=175–252 |chapter=Chapter 4: Aspects of Classical Language Theory |isbn=3-540-61486-9}}, Here: Theorem 2.2, p. 190</ref> The Kuroda normal form is an actual normal form for non-contracting grammars. == Properties and uses == {{see also|context-sensitive language}} {{more citations needed section|date=August 2014}} === Equivalence to linear bounded automaton === A formal language can be described by a context-sensitive grammar if and only if it is accepted by some [[linear bounded automaton]] (LBA).<ref>(Hopcroft, Ullman, 1979); Theorem 9.5, 9.6, p. 225–226</ref> In some textbooks this result is attributed solely to Landweber and [[S.-Y. Kuroda|Kuroda]].<ref name="DavisSigal1994b"/> Others call it the [[John Myhill|Myhill]]–Landweber–Kuroda theorem.<ref name="flac">{{Cite web |first=Klaus |last=Sutner |url=https://www.cs.cmu.edu/~flac/pdf/ContSens.pdf |title=Context Sensitive Grammars |publisher=[[Carnegie Mellon University]] |date=Spring 2016 |access-date=2019-08-29 |archive-date=2017-02-03 |archive-url=https://web.archive.org/web/20170203081505/http://www.cs.cmu.edu/~flac/pdf/ContSens.pdf |url-status=dead }}</ref> (Myhill introduced the concept of deterministic LBA in 1960. Peter S. Landweber published in 1963 that the language accepted by a deterministic LBA is context sensitive.<ref>{{cite journal | author=P.S. Landweber | title=Three Theorems on Phrase Structure Grammars of Type 1 | journal=[[Information and Control]] | volume=6 | number=2 | pages=131–136 | year=1963 |doi=10.1016/s0019-9958(63)90169-4 | doi-access=free }}</ref> Kuroda introduced the notion of non-deterministic LBA and the equivalence between LBA and CSGs in 1964.<ref>{{cite book|first=Alexander|last=Meduna|author-link=Alexander Meduna|title=Automata and Languages: Theory and Applications|url=https://books.google.com/books?id=s7gEErax71cC&pg=PA755|year=2000|publisher=Springer Science & Business Media|isbn=978-1-85233-074-3|page=755}}</ref><ref>{{cite book|first=Willem J. M.|last=Levelt|author-link=Willem Levelt|title=An Introduction to the Theory of Formal Languages and Automata|url=https://books.google.com/books?id=tFvtwGYNe7kC&pg=PA126|year=2008|publisher=John Benjamins Publishing|isbn=978-90-272-3250-2|pages=126–127}}</ref>) {{As of|2010}}{{update inline|date=May 2023}} it is still an open question whether every context-sensitive language can be accepted by a ''deterministic'' LBA.<ref>{{cite book|last=Martin|first=John C.|title=Introduction to Languages and the Theory of Computation|year=2010|publisher=McGraw-Hill|location=New York, NY|isbn=9780073191461|edition=4th|page=283}}</ref> === Closure properties === Context-sensitive languages are closed under [[complement (set theory)|complement]]. This 1988 result is known as the [[Immerman–Szelepcsényi theorem]].<ref name="flac"/> Moreover, they are closed under [[union (set theory)|union]], [[intersection (set theory)|intersection]], [[concatenation#Concatenation of sets of strings|concatenation]], [[String substitution|substitution]],<ref group=note>more formally: if ''L'' ⊆ Σ<sup>*</sup> is a context-sensitive language and ''f'' maps each ''a''∈Σ to a context-sensitive language ''f''(''a''), the ''f''(''L'') is again a context-sensitive language</ref> [[inverse string homomorphism|inverse homomorphism]], and [[Kleene plus]].<ref>(Hopcroft, Ullman, 1979); Exercise S9.10, p. 230–231</ref> Every [[recursively enumerable language]] ''L'' can be written as ''h''(''L'') for some context-sensitive language ''L'' and some [[string homomorphism]] ''h''.<ref>(Hopcroft, Ullman, 1979); Exercise S9.14, p. 230–232. ''h'' maps each symbol to itself or to the empty string.</ref> === Computational problems === The [[decision problem]] that asks whether a certain string ''s'' belongs to the language of a given context-sensitive grammar ''G'', is [[PSPACE-complete]]. Moreover, there are context-sensitive grammars whose languages are PSPACE-complete. In other words, there is a context-sensitive grammar ''G'' such that deciding whether a certain string ''s'' belongs to the language of ''G'' is PSPACE-complete (so ''G'' is fixed and only ''s'' is part of the input of the problem).<ref>An example of such a grammar, designed to solve the [[QSAT]] problem, is given in {{Cite book|last=Lita|first=C. V.|title=2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC) |chapter=On Complexity of the Detection Problem for Bounded Length Polymorphic Viruses |date=2016-09-01|pages=371–378|doi=10.1109/SYNASC.2016.064|isbn=978-1-5090-5707-8|s2cid=18067130}}</ref> The [[emptiness problem]] for context-sensitive grammars (given a context-sensitive grammar ''G'', is ''L''(''G'')=∅ ?) is [[Undecidable language|undecidable]].<ref>(Hopcroft, Ullman, 1979); Exercise S9.13, p. 230–231</ref><ref group=note>This also follows from (1) [[#top|context-free languages being also context-sensitive]], (2) [[#Closure properties|context-sensitive language being closed under intersection]], but (3) [[Context-free grammar#Language disjointness|disjointness of context-free languages being undecidable]].</ref> === As model of natural languages === Savitch has [[mathematical proof|proven]] the following theoretical result, on which he bases his criticism of CSGs as basis for natural language: for any [[recursively enumerable]] set ''R'', there exists a context-sensitive language/grammar ''G'' which can be used as a sort of proxy to test membership in ''R'' in the following way: given a string ''s'', ''s'' is in ''R'' if and only if there exists a positive integer ''n'' for which ''sc<sup>n</sup>'' is in G, where ''c'' is an arbitrary symbol not part of ''R''.<ref name="Vide1999"/> It has been shown that nearly all [[natural language]]s may in general be characterized by context-sensitive grammars, but the whole class of CSGs seems to be much bigger than natural languages.{{citation needed|date=November 2011}} Worse yet, since the aforementioned decision problem for CSGs is PSPACE-complete, that makes them totally unworkable for practical use, as a polynomial-time algorithm for a PSPACE-complete problem would imply [[P=NP problem|P=NP]]. It was proven that some natural languages are not context-free, based on identifying so-called [[cross-serial dependency|cross-serial dependencies]] and [[unbounded scrambling]] phenomena.{{cn|date=December 2022}} However this does not necessarily imply that the class of CSGs is necessary to capture "context sensitivity" in the colloquial sense of these terms in natural languages. For example, [[linear context-free rewriting system]]s (LCFRSs) are strictly weaker than CSGs but can account for the phenomenon of cross-serial dependencies; one can write a LCFRS grammar for {''a<sup>n</sup>b<sup>n</sup>c<sup>n</sup>d<sup>n</sup>'' | ''n'' ≥ 1} for example.<ref>{{cite web|url=http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4nl-cfg.pdf |archive-url=https://web.archive.org/web/20140819085139/http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4nl-cfg.pdf |archive-date=2014-08-19 |url-status=live|title=Mildly Context-Sensitive Grammar Formalisms: Natural Languages are not Context-Free|first=Laura|last=Kallmeyer|year=2011}}</ref><ref>{{cite web|url=http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4lcfrs-intro.pdf |archive-url=https://web.archive.org/web/20140819085928/http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4lcfrs-intro.pdf |archive-date=2014-08-19 |url-status=live|title=Mildly Context-Sensitive Grammar Formalisms: Linear Context-Free Rewriting Systems|first=Laura|last=Kallmeyer|year=2011}}</ref><ref name="Kallmeyer2010">{{cite book|first=Laura|last=Kallmeyer|title=Parsing Beyond Context-Free Grammars|year=2010|publisher=Springer Science & Business Media|isbn=978-3-642-14846-0|pages=1–5}}</ref> Ongoing research on [[computational linguistics]] has focused on formulating other classes of languages that are "[[mildly context-sensitive language|mildly context-sensitive]]" whose decision problems are feasible, such as [[tree-adjoining grammar]]s, [[combinatory categorial grammar]]s, [[coupled context-free language]]s, and [[Generalized context-free grammar#Linear Context-free Rewriting Systems (LCFRSs)|linear context-free rewriting system]]s. The languages generated by these formalisms properly lie between the context-free and context-sensitive languages. More recently, the class [[PTIME]] has been identified with [[range concatenation grammar]]s, which are now considered to be the most expressive of the mild-context sensitive language classes.<ref name="Kallmeyer2010"/> ==See also== * [[Chomsky hierarchy]] * [[Growing context-sensitive grammar]] * [[Definite clause grammar#Non-context-free grammars]] * [[List of parser generators for context-sensitive grammars]] ==Notes== {{reflist|group=note}} ==References== {{reflist}} ==Further reading== * {{cite book|first1=Alexander|last1=Meduna|author-link1=Alexander Meduna|first2=Martin|last2=Švec|title=Grammars with Context Conditions and Their Applications|year=2005|publisher=John Wiley & Sons|isbn=978-0-471-73655-4}} ==External links== * [https://web.archive.org/web/20110708224600/https://danielmattosroberts.com/earley/context-sensitive-earley.pdf Earley Parsing for Context-Sensitive Grammars] {{Formal languages and grammars}} {{DEFAULTSORT:Context-Sensitive Grammar}} [[Category:Formal languages]] [[Category:Grammar frameworks]]
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)
Pages transcluded onto the current version of this page
(
help
)
:
Template:As of
(
edit
)
Template:Citation needed
(
edit
)
Template:Cite book
(
edit
)
Template:Cite journal
(
edit
)
Template:Cite web
(
edit
)
Template:Cn
(
edit
)
Template:Formal languages and grammars
(
edit
)
Template:More citations needed section
(
edit
)
Template:Not a typo
(
edit
)
Template:Reflist
(
edit
)
Template:See also
(
edit
)
Template:Short description
(
edit
)
Template:Update inline
(
edit
)