Cook–Levin theorem

{{short description|Theorem that Boolean satisfiability is NP-complete and therefore that NP-complete problems exist}}
In [[computational complexity theory]], the '''Cook–Levin theorem''', also known as '''Cook's theorem''', states that the [[Boolean satisfiability problem]] is [[NP-completeness|NP-complete]].  That is, it is in [[NP (complexity)|NP]], and any problem in NP can be [[reduction (complexity)|reduced]] in [[polynomial time]] by a [[deterministic Turing machine]] to the Boolean satisfiability problem.

The theorem is named after [[Stephen Cook]] and [[Leonid Levin]]. The proof is due to [[Richard Karp]], based on an earlier proof (using a different notion of reducibility) by Cook.<ref name="Karp"/>

An important consequence of this theorem is that if there exists a deterministic polynomial-time algorithm for solving Boolean satisfiability, then every [[NP (complexity)|NP]] problem can be solved by a deterministic polynomial-time algorithm.  The question of whether such an algorithm for Boolean satisfiability exists is thus equivalent to the [[P versus NP problem]], which is still widely considered the most important unsolved problem in [[theoretical computer science]].

==Contributions==
The concept of [[NP-completeness]] was developed in the late 1960s and early 1970s in parallel by researchers in North America and the [[Soviet Union]].
In 1971, [[Stephen Cook]] published his paper "The complexity of theorem proving procedures"<ref>{{cite book|title=Proceedings of the Third Annual ACM Symposium on Theory of Computing|last=Cook|first=Stephen|year=1971|pages=151–158|chapter=The complexity of theorem proving procedures|doi=10.1145/800157.805047|isbn=9781450374644|s2cid=7573663|author-link=Stephen Cook|chapter-url=http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=805047}}</ref> in conference proceedings of the newly founded ACM [[Symposium on Theory of Computing]].
[[Richard Karp]]'s subsequent paper, "Reducibility among combinatorial problems",<ref name="Karp"/> generated renewed interest in Cook's paper by providing a [[Karp's 21 NP-complete problems|list of 21 NP-complete problems]]. Karp also introduced the notion of completeness used in the current definition of NP-completeness (i.e., by [[polynomial-time many-one reduction]]). Cook and Karp each received a [[Turing Award]] for this work.

<!--
THIS IS POORLY WORDED:

Another important contribution of Karp was to notice
that many NP-complete problems, while seemingly intractable in the
worst case, are actually easy for almost all random instances.{{Citation needed|date=March 2008}} In 1984, Levin
extended this result by proving that some NP problems
are complete in the average case as well.<ref>http://www.claymath.org/millennium/P_vs_NP/pvsnp.pdf {{Dead link|date=February 2022}}</ref>

-->
The theoretical interest in NP-completeness was also enhanced by the work of Theodore P. Baker, John Gill, and [[Robert Solovay]] who showed, in 1975, that solving NP-problems in certain [[oracle machine]] models requires exponential time. That is, there exists an oracle ''A'' such that, for all subexponential deterministic-time complexity classes T, the relativized complexity class NP<sup>''A''</sup> is not a subset of T<sup>''A''</sup>. In particular, for this oracle, P<sup>''A''</sup>&nbsp;≠&nbsp;NP<sup>''A''</sup>.<ref>{{cite journal|author = T. P. Baker|author2=J. Gill |author3=R. Solovay |title = Relativizations of the P = NP question|journal = SIAM Journal on Computing |volume = 4|issue = 4|pages = 431–442|year = 1975|doi = 10.1137/0204037}}</ref>

In the USSR, a result equivalent to Baker, Gill, and Solovay's was published in 1969 by M. Dekhtiar.<ref>{{cite journal|last=Dekhtiar|first=M.|title = On the impossibility of eliminating exhaustive search in computing a function relative to its graph|journal = [[Proceedings of the USSR Academy of Sciences]]|volume = 14|pages = 1146–1148|year = 1969|language=ru}}</ref> Later [[Leonid Levin]]'s paper, "Universal search problems",<ref>{{cite journal|last=Levin|first=Leonid|author-link=Leonid Levin|trans-title = Universal search problems|language=ru|title= Универсальные задачи перебора|journal = Problems of Information Transmission |volume = 9|issue = 3|pages = 115–116|year = 1973|url=http://www.mathnet.ru/eng/ppi914}} Translated into English by {{cite journal|last=Trakhtenbrot|first=B. A.|author-link=Boris Trakhtenbrot | title = A survey of Russian approaches to ''perebor'' (brute-force searches) algorithms|journal = [[Annals of the History of Computing]] |volume = 6|issue = 4|pages = 384–400|year = 1984|doi=10.1109/MAHC.1984.10036|s2cid=950581}} Translation see appendix, p.399-400.</ref> was published in 1973, although it was mentioned in talks and submitted for publication a few years earlier.

Levin's approach was slightly different from Cook's and Karp's in that he considered [[search problem]]s, which require finding solutions rather than simply determining existence. He provided six such NP-complete search problems, or ''universal problems''.
Additionally, he found for each of these problems an algorithm that solves it in optimal time (in particular, these algorithms run in polynomial time if and only if [[P versus NP problem|P = NP]]).

==Definitions==
A [[decision problem]] is ''in [[NP (complexity)|NP]]'' if it can be decided by a [[nondeterministic Turing machine]] in [[polynomial time]].

An ''instance of the Boolean satisfiability problem'' is a [[Boolean expression]] that combines [[Boolean variable]]s using [[Logical connective|Boolean operator]]s.
Such an expression is ''satisfiable'' if there is some assignment of [[truth value]]s to the variables that makes the entire expression true.
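
As an illustration of this definition, the following minimal Python sketch decides satisfiability by trying every assignment. The nested-tuple encoding of expressions and the function names are assumptions made for this example only; the exhaustive search takes exponential time in the number of variables and is not an efficient algorithm.

```python
from itertools import product

def evaluate(expr, assignment):
    """Evaluate a nested-tuple Boolean expression under a dict of truth values.

    A variable is a string; a compound expression is ("not", e),
    ("and", e1, e2, ...) or ("or", e1, e2, ...). This encoding is hypothetical.
    """
    if isinstance(expr, str):
        return assignment[expr]
    op, *args = expr
    if op == "not":
        return not evaluate(args[0], assignment)
    if op == "and":
        return all(evaluate(a, assignment) for a in args)
    if op == "or":
        return any(evaluate(a, assignment) for a in args)
    raise ValueError(f"unknown operator: {op}")

def variables(expr):
    """Collect the names of all variables occurring in the expression."""
    if isinstance(expr, str):
        return {expr}
    return set().union(*(variables(a) for a in expr[1:]))

def find_satisfying_assignment(expr):
    """Return a satisfying assignment, or None if the expression is unsatisfiable."""
    names = sorted(variables(expr))
    for values in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, values))
        if evaluate(expr, assignment):
            return assignment
    return None
```

For instance, <code>("and", ("or", "x", "y"), ("not", "x"))</code> is satisfiable (take ''x'' false and ''y'' true), whereas <code>("and", "x", ("not", "x"))</code> is not.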

==Idea==
Given any decision problem in NP, construct a non-deterministic machine that solves it in polynomial time. Then, for each input to that machine, build a Boolean expression that is true exactly when the machine, run on that specific input, follows its transition rules correctly, halts, and answers "yes". The expression can be satisfied if and only if there is a way for the machine to run correctly and answer "yes", so deciding the satisfiability of the constructed expression is equivalent to deciding whether the machine will answer "yes".

==Proof==
''This proof is based on the one given by {{harvnb|Garey|Johnson|1979|loc=Section 2.6|pp=38–44}}.''

There are two parts to proving that the Boolean satisfiability problem (SAT) is NP-complete. One is to show that SAT is an NP problem. The other is to show that every NP problem can be reduced to an instance of SAT by a [[polynomial-time many-one reduction]].

SAT is in NP because any assignment of Boolean values to Boolean variables that is claimed to satisfy the given expression can be ''verified'' in polynomial time by a deterministic Turing machine. (The statements '''''verifiable''' in polynomial time by a '''deterministic''' Turing machine'' and '''''solvable''' in polynomial time by a '''non-deterministic''' Turing machine'' are equivalent, and the proof can be found in many textbooks, for example Sipser's ''Introduction to the Theory of Computation'', section 7.3, as well as [[NP (complexity)#Equivalence of definitions|in the Wikipedia article on NP]].)
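
The verification step can be sketched in a few lines of Python. The sketch assumes, for brevity, that the expression is given in conjunctive normal form with literals written as signed integers (a common convention, not one used by this article); the function name <code>verify</code> is likewise hypothetical. A single pass over the clauses suffices, so the running time is linear in the size of the formula.

```python
def verify(cnf, certificate):
    """Check a claimed satisfying assignment in one pass over the formula.

    cnf:         list of clauses; each clause is a list of nonzero ints,
                 where k stands for variable k and -k for its negation.
    certificate: dict mapping each variable number to True or False.
    """
    return all(
        # A clause holds if at least one of its literals is true.
        any(certificate[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )
```

With the formula <code>[[1, 2], [-1, 3]]</code>, the certificate <code>{1: True, 2: False, 3: True}</code> is accepted, while <code>{1: True, 2: False, 3: False}</code> is rejected because the second clause fails.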

[[File:CookLevinCommDiag svg.svg|thumb|[[Commutative diagram]] showing Cook's reduction of <math>M</math> to SAT. Data sizes and program runtimes are colored in {{color|#cc6600|orange}} and {{color|#006600|green}}, respectively.]]
[[File:CookLevin_svg.svg|thumb|Schematized accepting computation by the machine <math>M</math>.]]
Now suppose that a given problem in NP can be solved by the [[nondeterministic Turing machine]] <math>M = (Q, \Sigma, s, F, \delta)</math>, where <math>Q</math> is the set of states, <math>\Sigma</math> is the alphabet of tape symbols, <math>s \in Q</math> is the initial state, <math>F \subseteq Q</math> is the set of accepting states, and <math>\delta \subseteq ((Q \setminus F) \times \Sigma) \times (Q \times \Sigma \times \{-1, +1\})</math> is the transition relation. Suppose further that <math>M</math> accepts or rejects an instance of the problem after at most <math>p(n)</math> computation steps, where <math>n</math> is the size of the instance and <math>p</math> is a polynomial function.

For each input, <math>I</math>, specify a Boolean expression <math>B</math> that is satisfiable [[if and only if]] the machine <math>M</math> accepts <math>I</math>.

The Boolean expression uses the variables set out in the following table. Here, <math>q \in Q</math> is a machine state, <math>-p(n) \leq i \leq p(n)</math> is a tape position, <math>j \in \Sigma</math> is a tape symbol, and <math>0 \leq k \leq p(n)</math> is the number of a computation step.

{| class="wikitable"
!Variables
!Intended interpretation
!How many?<ref>This column uses the [[big O notation]].</ref>
|-
|<math>T_{i,j,k}</math>
|True if tape cell <math>i</math> contains symbol <math>j</math> at step <math>k</math> of the computation.
|<math>O(p(n)^2)</math>
|-
|<math>H_{i,k}</math>
|True if <math>M</math>'s read/write head is at tape cell <math>i</math> at step <math>k</math> of the computation.
|<math>O(p(n)^2)</math>
|-
|<math>Q_{q,k}</math>
|True if <math>M</math> is in state <math>q</math> at step <math>k</math> of the computation.
|<math>O(p(n))</math>
|}

Define the Boolean expression <math>B</math> to be the [[Logical conjunction|conjunction]] of the sub-expressions in the following table, for all <math>-p(n) \leq i \leq p(n)</math> and <math>0 \leq k \leq p(n)</math>:

{| class="wikitable"
!Expression
!Conditions
!Interpretation
!How many?
|-
|<math>T_{i,j,0}</math>
|Tape cell <math>i</math> initially contains symbol <math>j</math>
|Initial contents of the tape.  For <math>i > n-1</math> and <math>i < 0</math>, outside of the actual input <math>I</math>, the initial symbol is the special default/blank symbol.
|<math>O(p(n))</math>
|-
|<math>Q_{s,0}</math>
| 
|Initial state of <math>M</math>.
|1
|-
|<math>H_{0,0}</math>
| 
|Initial position of read/write head.
|1
|-
|<math>\neg T_{i,j,k} \lor \neg T_{i,j',k}</math>
|<math>j \neq j'</math>
|At most one symbol per tape cell.
|<math>O(p(n)^2)</math>
|-
|<math>\bigvee_{j \in \Sigma} T_{i,j,k}</math>
|
|At least one symbol per tape cell.
|<math>O(p(n)^2)</math>
|-
|<math>T_{i,j,k} \land T_{i,j',k+1} \rightarrow H_{i,k}</math>
|<math>j \neq j'</math>
|Tape remains unchanged unless written by head.
|<math>O(p(n)^2)</math>
|-
|<math>\lnot Q_{q,k} \lor \lnot Q_{q',k}</math>
|<math>q \neq q'</math>
|At most one state at a time.
|<math>O(p(n))</math>
|-
|<math>\bigvee_{q \in Q} Q_{q, k}</math>
|
|At least one state at a time.
|<math>O(p(n))</math>
|-
|<math>\lnot H_{i,k} \lor \lnot H_{i',k}</math>
|<math>i \neq i'</math>
|At most one head position at a time.
|<math>O(p(n)^3)</math>
|-
|<math>\bigvee_{-p(n) \le i \le p(n)} H_{i, k}</math>
|
|At least one head position at a time.
|<math>O(p(n))</math>
|-
|<math>\begin{array}{l}
(H_{i,k} \land Q_{q,k} \land T_{i,\sigma,k}) \to \\
\bigvee_{((q, \sigma), (q', \sigma', d)) \in \delta} (H_{i+d,\ k+1} \land Q_{q',\ k+1} \land T_{i,\ \sigma',\ k+1})
\end{array}</math>
|<math>k < p(n)</math>
|Possible transitions at computation step <math>k</math> when head is at position <math>i</math>.
|<math>O(p(n)^2)</math>
|-
|<math>\bigvee_{0 \le k \le p(n)} \bigvee_{f \in F} Q_{f,k}</math>
|
|Must finish in an accepting state, not later than in step <math>p(n)</math>.
|1
|}

If there is an accepting computation for <math>M</math> on input <math>I</math>, then <math>B</math> is satisfiable by assigning <math>T_{i,j,k}</math>, <math>H_{i,k}</math> and <math>Q_{q,k}</math> their intended interpretations. On the other hand, if <math>B</math> is satisfiable, then there is an accepting computation for <math>M</math> on input <math>I</math> that follows the steps indicated by the assignments to the variables.
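
The first direction of this argument can be illustrated with a small Python sketch. Everything in it is an assumption made for the example: the toy machine (which scans right over 1s and accepts on the first 0), the names <code>intended_assignment</code> and <code>check_rows</code>, and the choice of a step bound at which the run accepts exactly, so that no transition out of an accepting state is needed. The sketch runs the machine, records which variables are true under their intended interpretation, and then checks the table rows against that assignment (the "at most one" and "at least one" rows are checked together as "exactly one").

```python
from itertools import product

# Hypothetical toy machine, not taken from the article.
STATES = {"s", "acc"}
SIGMA = {"0", "1", "_"}          # "_" plays the role of the blank symbol
START, ACCEPT = "s", {"acc"}
DELTA = {(("s", "1"), ("s", "1", +1)), (("s", "0"), ("acc", "0", +1))}

def intended_assignment(word, p):
    """Run the machine for p steps; return the sets of true T, H and Q variables."""
    tape = {i: "_" for i in range(-p, p + 1)}
    for i, ch in enumerate(word):
        tape[i] = ch
    head, state = 0, START
    T, H, Q = set(), set(), set()
    for k in range(p + 1):
        T |= {(i, tape[i], k) for i in tape}
        H.add((head, k))
        Q.add((state, k))
        if k == p:
            break
        moves = [rhs for lhs, rhs in DELTA if lhs == (state, tape[head])]
        if not moves:
            raise ValueError("machine is stuck before step p")
        state, tape[head], d = moves[0]   # follow one nondeterministic choice
        head += d
    return T, H, Q

def check_rows(word, p, T, H, Q):
    """Evaluate each group of table rows under the given assignment."""
    cells, steps, n = range(-p, p + 1), range(p + 1), len(word)
    ok = {}
    ok["init"] = (all((i, word[i] if 0 <= i < n else "_", 0) in T for i in cells)
                  and (START, 0) in Q and (0, 0) in H)
    ok["one symbol"] = all(sum((i, j, k) in T for j in SIGMA) == 1
                           for i in cells for k in steps)
    ok["one state"] = all(sum((q, k) in Q for q in STATES) == 1 for k in steps)
    ok["one head"] = all(sum((i, k) in H for i in cells) == 1 for k in steps)
    # T_{i,j,k} and T_{i,j',k+1} with j != j' must imply H_{i,k}.
    ok["frame"] = all((i, k) in H
                      for i in cells for k in range(p)
                      for j, j2 in product(SIGMA, repeat=2)
                      if j != j2 and (i, j, k) in T and (i, j2, k + 1) in T)
    # Wherever head, state and symbol are known, some transition must be followed.
    ok["transition"] = all(
        any((i + d, k + 1) in H and (q2, k + 1) in Q and (i, s2, k + 1) in T
            for (q1, s1), (q2, s2, d) in DELTA if (q1, s1) == (q, sym))
        for i in cells for k in range(p) for q in STATES for sym in SIGMA
        if (i, k) in H and (q, k) in Q and (i, sym, k) in T)
    ok["accept"] = any((f, k) in Q for f in ACCEPT for k in steps)
    return ok
```

On the accepted input <code>"10"</code> with step bound 2, every row group evaluates to true. On the input <code>"11"</code> with the same bound, all row groups hold except the final acceptance row, reflecting that this particular run does not reach an accepting state.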

There are <math>O(p(n)^2)</math> Boolean variables, each encodable in space <math>O(\log p(n))</math>. The number of clauses is <math>O(p(n)^3)</math><ref>The number of literals in each clause does not depend on <math>n</math>, except for the "at least one head position" row and the last table row, whose clauses have <math>O(p(n))</math> literals each.</ref> so the size of <math>B</math> is <math>O(\log(p(n)) p(n)^3)</math>. Thus the transformation is certainly a polynomial-time many-one reduction, as required.
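
A rough tally behind these bounds, under the assumption (implicit in the big O estimates) that <math>|\Sigma|</math> and <math>|Q|</math> are constants independent of <math>n</math>, might read as follows. The variables are counted by family:

<math display="block">\underbrace{|\Sigma|\,(2p(n)+1)(p(n)+1)}_{T_{i,j,k}} + \underbrace{(2p(n)+1)(p(n)+1)}_{H_{i,k}} + \underbrace{|Q|\,(p(n)+1)}_{Q_{q,k}} = O(p(n)^2).</math>

The cubic clause count is dominated by the "at most one head position" row, which ranges over unordered pairs of distinct tape cells and over all steps:

<math display="block">\binom{2p(n)+1}{2}\,(p(n)+1) = p(n)\,(2p(n)+1)(p(n)+1) = O(p(n)^3).</math>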

Only the first table row (<math>T_{i,j,0}</math>) actually depends on the input string <math>I</math>. The remaining rows depend only on the input length <math>n</math> and on the machine <math>M</math>; they formalize a generic computation of <math>M</math> for up to <math>p(n)</math> steps.

The transformation makes extensive use of the polynomial <math>p(n)</math>. As a consequence, the above proof is not [[constructive proof|constructive]]: even if <math>M</math> is known, [[witness (mathematics)|witness]]ing the membership of the given problem in NP, the transformation cannot be effectively computed, unless an upper bound <math>p(n)</math> of <math>M</math>'s time complexity is also known.

==Complexity==
While the above method encodes a non-deterministic Turing machine in complexity <math>O(\log(p(n))p(n)^3)</math>, the literature describes more sophisticated approaches in complexity <math>O(p(n)\log(p(n)))</math>.<ref>{{cite journal | url=https://www.ccs.neu.edu/home/viola/classes/papers/Schnorr-SatNQL.pdf | author=Claus-Peter Schnorr | title=Satisfiability is quasilinear complete in NQL | journal=Journal of the ACM | volume=25 | number=1 | pages=136&ndash;145 | date=Jan 1978 | doi=10.1145/322047.322060 | s2cid=1929802 }}</ref><ref>{{cite journal | url=https://www.ccs.neu.edu/home/viola/classes/papers/PippengerF-Oblivious.pdf | author=Nicholas Pippenger and Michael J. Fischer | title=Relations among complexity measures | journal=Journal of the ACM | volume=26 | number=2 | pages=361&ndash;381 | date=Apr 1979 | doi=10.1145/322123.322138 | s2cid=2432526 }}</ref><ref>{{cite conference | url= | author=John Michael Robson | title=A new proof of the NP completeness of satisfiability | editor= | work=Proceedings of the 2nd Australian Computer Science Conference | publisher= | pages=62–70 | date=Feb 1979 }}</ref><ref>{{cite journal | author=John Michael Robson | title=An <math>O(T \log T)</math> reduction from RAM computations to satisfiability | journal=Theoretical Computer Science | volume=82 | number=1 | pages=141&ndash;149 | date=May 1991 | doi=10.1016/0304-3975(91)90177-4 | doi-access=free }}</ref><ref>{{cite journal | url=https://www.ccs.neu.edu/home/viola/classes/papers/Cook1988.pdf | author=Stephen A. Cook | title=Short propositional formulas represent nondeterministic computations | journal=Information Processing Letters | volume=26 | number=5 | pages=269&ndash;270 | date=Jan 1988 | doi=10.1016/0020-0190(88)90152-4 }}</ref> The quasilinear result first appeared seven years after Cook's original publication.

The use of SAT to prove the existence of an NP-complete problem can be extended to other computational problems in logic, and to completeness for other [[complexity class]]es.
The [[quantified Boolean formula]] problem (QBF) involves Boolean formulas extended to include nested [[universal quantifier]]s and [[existential quantifier]]s for its variables. The QBF problem can be used to encode computation with a Turing machine limited to [[PSPACE|polynomial space complexity]], proving that there exists a problem (the recognition of true quantified Boolean formulas) that is [[PSPACE-complete]]. Analogously, dependency quantified boolean formulas encode computation with a Turing machine limited to [[NL (complexity)|logarithmic space complexity]], proving that there exists a problem that is [[NL-complete]].<ref>{{cite conference | url=https://ieeexplore.ieee.org/document/4568030 | author1=Gary L. Peterson |author2= John H. Reif | title=Multiple-person alternation | editor1=Ronald V. Book |editor2= Paul Young | book-title=Proc. 20th Annual [[Symposium on Foundations of Computer Science]] (SFCS) | publisher=IEEE | pages=348–363 |  year=1979 }}</ref><ref>{{cite journal | author1=Gary Peterson |author2= John Reif |author3= Salman Azhar | title=Lower bounds for multiplayer noncooperative games of incomplete information | journal=Computers & Mathematics with Applications | volume=41 | number=7&ndash;8 | pages=957&ndash;992 | date=Apr 2001 | doi=10.1016/S0898-1221(00)00333-3 | doi-access=free }}</ref>

==Consequences==
The proof shows that every problem in NP can be reduced in polynomial time (in fact, [[logarithmic space]] suffices) to an instance of the Boolean satisfiability problem. This means that if the Boolean satisfiability problem could be solved in polynomial time by a [[deterministic Turing machine]], then all problems in NP could be solved in polynomial time, and so the [[complexity class]] NP would be equal to the complexity class P.

The significance of NP-completeness was made clear by the publication in 1972 of [[Richard Karp]]'s landmark paper, "Reducibility among combinatorial problems", in which he showed that [[Karp's 21 NP-complete problems|21 diverse combinatorial and graph theoretical problems]], each infamous for its intractability, are NP-complete.<ref name="Karp">{{cite book |last=Karp |first=Richard M. |title=Complexity of Computer Computations |publisher=Plenum |year=1972 |isbn=0-306-30707-3 |editor=Raymond E. Miller |location=New York |pages=85–103 |chapter=Reducibility Among Combinatorial Problems |author-link=Richard Karp |editor2=James W. Thatcher |chapter-url=}}</ref>

Karp showed each of his problems to be NP-complete by reducing another problem (already shown to be NP-complete) to that problem.  For example, he showed the problem 3SAT (the [[Boolean satisfiability problem]] for expressions in [[conjunctive normal form]] (CNF) with exactly three variables or negations of variables per clause) to be NP-complete by showing how to reduce (in polynomial time) any instance of SAT to an equivalent instance of 3SAT.<ref>First modify the proof of the Cook–Levin theorem, so that the resulting formula is in conjunctive normal form, then introduce new variables to split clauses with more than 3 atoms.  For example, the clause <math>(A \lor B \lor C \lor D)</math> can be replaced by the conjunction of clauses <math>(A \lor B \lor Z) \land (\lnot Z \lor C \lor D)</math>, where <math>Z</math> is a new variable that will not be used anywhere else in the expression.  Clauses with fewer than three atoms can be padded; for example, <math>(A \lor B)</math> can be replaced by <math>(A \lor B \lor B)</math>.</ref>
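
The clause-splitting and padding steps described in the footnote can be sketched in Python. The sketch assumes the formula is already in conjunctive normal form, with literals written as signed integers and variables numbered from 1; the function name <code>to_3sat</code> is hypothetical, and this is only one possible way to carry out the transformation.

```python
from itertools import count

def to_3sat(clauses, num_vars):
    """Rewrite a CNF formula so that every clause has exactly three literals.

    clauses:  list of non-empty clauses, each a list of nonzero ints
              (k stands for variable k, -k for its negation).
    num_vars: number of variables in use; fresh ones are numbered after it.
    Returns (new_clauses, new_num_vars).
    """
    fresh = count(num_vars + 1)
    out = []
    for clause in clauses:
        clause = list(clause)
        if not clause:
            raise ValueError("empty clauses are not handled in this sketch")
        # Split long clauses: (A or B or rest) becomes (A or B or Z), (not Z or rest).
        while len(clause) > 3:
            z = next(fresh)
            out.append(clause[:2] + [z])
            clause = [-z] + clause[2:]
        # Pad short clauses by repeating their last literal.
        while len(clause) < 3:
            clause.append(clause[-1])
        out.append(clause)
    return out, next(fresh) - 1
```

Applied to the single clause <code>[1, 2, 3, 4]</code> with four variables, the sketch yields <code>[[1, 2, 5], [-5, 3, 4]]</code> and five variables, matching the footnote's example with variable 5 in the role of <math>Z</math>.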

Garey and Johnson presented more than 300 NP-complete problems in their book ''Computers and Intractability: A Guide to the Theory of NP-Completeness'',<ref>{{Garey-Johnson}}</ref> and new problems are still being discovered to be within that complexity class.

Although many practical instances of SAT can be [[Boolean satisfiability problem#Algorithms for solving SAT|solved by heuristic methods]], the question of whether there is a deterministic polynomial-time algorithm for SAT (and consequently all other NP-complete problems) is still a famous unsolved problem, despite decades of intense effort by complexity theorists, [[mathematical logician]]s, and others.  For more details, see the article [[P versus NP problem]].

==References==
{{reflist}}

{{DEFAULTSORT:Cook-Levin theorem}}
[[Category:Theorems in computational complexity theory]]
[[Category:Articles containing proofs]]