Context-sensitive language
Template:Short description Template:Redirect In formal language theory, a context-sensitive language is a language that can be defined by a context-sensitive grammar (and equivalently by a noncontracting grammar). Context-sensitive is known as type-1 in the Chomsky hierarchy of formal languages.
Computational propertiesEdit
Computationally, a context-sensitive language is equivalent to a linear bounded nondeterministic Turing machine, also called a linear bounded automaton. That is a non-deterministic Turing machine with a tape of only <math>kn</math> cells, where <math>n</math> is the size of the input and <math>k</math> is a constant associated with the machine. This means that every formal language that can be decided by such a machine is a context-sensitive language, and every context-sensitive language can be decided by such a machine.
This set of languages is also known as NLINSPACE or NSPACE(O(n)), because they can be accepted using linear space on a non-deterministic Turing machine.<ref>Template:Citation.</ref> The class LINSPACE (or DSPACE(O(n))) is defined the same, except using a deterministic Turing machine. Clearly LINSPACE is a subset of NLINSPACE, but it is not known whether LINSPACE = NLINSPACE.<ref>Template:Citation.</ref>
ExamplesEdit
One of the simplest context-sensitive but not context-free languages is <math>L = \{ a^nb^nc^n : n \ge 1 \}</math>: the language of all strings consisting of Template:Mvar occurrences of the symbol "a", then Template:Mvar "b"s, then Template:Mvar "c"s (abc, Template:Not a typo, Template:Not a typo, etc.). A superset of this language, called the Bach language,<ref>Template:Cite conference</ref> is defined as the set of all strings where "a", "b" and "c" (or any other set of three symbols) occurs equally often (Template:Not a typo, Template:Not a typo, etc.) and is also context-sensitive.<ref>Bach, E. (1981). "Discontinuous constituents in generalized categorial grammars" Template:Webarchive. NELS, vol. 11, pp. 1–12.</ref><ref>Joshi, A.; Vijay-Shanker, K.; and Weir, D. (1991). "The convergence of mildly context-sensitive grammar formalisms". In: Sells, P., Shieber, S.M. and Wasow, T. (Editors). Foundational Issues in Natural Language Processing. Cambridge MA: Bradford.</ref>
Template:Mvar can be shown to be a context-sensitive language by constructing a linear bounded automaton which accepts Template:Mvar. The language can easily be shown to be neither regular nor context-free by applying the respective pumping lemmas for each of the language classes to Template:Mvar.
Similarly:
<math>L_\textit{Cross} = \{ a^mb^nc^{m}d^{n} : m \ge 1, n \ge 1 \}</math> is another context-sensitive language; the corresponding context-sensitive grammar can be easily projected starting with two context-free grammars generating sentential forms in the formats <math>a^mC^m</math> and <math>B^nd^n</math> and then supplementing them with a permutation production like <math>CB\rightarrow BC</math>, a new starting symbol and standard syntactic sugar.
<math>L_{MUL3} = \{ a^mb^nc^{mn} : m \ge 1, n \ge 1 \}</math> is another context-sensitive language (the "3" in the name of this language is intended to mean a ternary alphabet); that is, the "product" operation defines a context-sensitive language (but the "sum" defines only a context-free language as the grammar <math>S\rightarrow aSc|R</math> and <math>R\rightarrow bRc|bc</math> shows). Because of the commutative property of the product, the most intuitive grammar for <math>L_\textit{MUL3}</math> is ambiguous. This problem can be avoided considering a somehow more restrictive definition of the language, e.g. <math>L_\textit{ORDMUL3} = \{ a^mb^nc^{mn} : 1 < m < n \}</math>. This can be specialized to <math>L_\textit{MUL1} = \{ a^{mn} : m > 1, n > 1 \}</math> and, from this, to <math>L_{m^2} = \{ a^{m^2} : m > 1 \}</math>, <math>L_{m^3} = \{ a^{m^3} : m > 1 \}</math>, etc.
<math>L_{REP} = \{ w^{|w|} : w \in \Sigma^* \}</math> is a context-sensitive language. The corresponding context-sensitive grammar can be obtained as a generalization of the context-sensitive grammars for <math>L_\textit{Square} = \{ w^2 : w \in \Sigma^* \}</math>, <math>L_\textit{Cube} = \{ w^3 : w \in \Sigma^* \}</math>, etc.
<math>L_\textit{EXP} = \{ a^{2^n} : n \ge 1 \}</math> is a context-sensitive language.<ref>Example 9.5 (p. 224) of Hopcroft, John E.; Ullman, Jeffrey D. (1979). Introduction to Automata Theory, Languages, and Computation. Addison-Wesley</ref>
<math>L_\textit{PRIMES2} = \{ w : |w| \mbox { is prime } \}</math> is a context-sensitive language (the "2" in the name of this language is intended to mean a binary alphabet). This was proved by Hartmanis using pumping lemmas for regular and context-free languages over a binary alphabet and, after that, sketching a linear bounded multitape automaton accepting <math>L_{PRIMES2}</math>.<ref>Template:Cite journal</ref>
<math>L_\textit{PRIMES1} = \{ a^p : p \mbox { is prime } \}</math> is a context-sensitive language (the "1" in the name of this language is intended to mean a unary alphabet). This was credited by A. Salomaa to Matti Soittola by means of a linear bounded automaton over a unary alphabet<ref>Salomaa, Arto (1969), Theory of Automata, Template:ISBN, Pergamon, 276 pages. {{#invoke:doi|main}}</ref> (pages 213–214, exercise 6.8) and also to Marti Penttonen by means of a context-sensitive grammar also over a unary alphabet (See: Formal Languages by A. Salomaa, page 14, Example 2.5).
An example of recursive language that is not context-sensitive is any recursive language whose decision is an EXPSPACE-hard problem, say, the set of pairs of equivalent regular expressions with exponentiation.
Properties of context-sensitive languagesEdit
- The union, intersection, concatenation of two context-sensitive languages is context-sensitive, also the Kleene plus of a context-sensitive language is context-sensitive.<ref>Template:Cite book; Exercise 9.10, p.230. In the 2000 edition, the chapter on context-sensitive languages has been omitted.</ref>
- The complement of a context-sensitive language is itself context-sensitive<ref>Template:Cite journal</ref> a result known as the Immerman–Szelepcsényi theorem.
- Membership of a string in a language defined by an arbitrary context-sensitive grammar, or by an arbitrary deterministic context-sensitive grammar, is a PSPACE-complete problem.
See alsoEdit
- Linear bounded automaton
- List of parser generators for context-sensitive languages
- Chomsky hierarchy
- Indexed languages – a strict subset of the context-sensitive languages
- Weir hierarchy
ReferencesEdit
- Sipser, M. (1996), Introduction to the Theory of Computation, PWS Publishing Co.