I know that CNF-SAT is in NP (and is also NP-complete) because SAT is in NP and NP-complete, but I don't understand why. Can anyone explain this?
3 Answers
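CNF-SAT is in NP since you can verify a satisfying assignment in polynomial time. NP-hardness takes a little more care: CNF-SAT is a special case of SAT, not the other way around, and that alone only shows that CNF-SAT is no harder than SAT. To show that CNF-SAT is NP-hard we need a polynomial-time reduction from the NP-hard problem SAT to CNF-SAT. One exists because any formula can be converted in polynomial time into an equisatisfiable CNF formula (the Tseitin transformation); alternatively, the Cook-Levin construction already produces a formula in CNF. Since CNF-SAT is both in NP and NP-hard, we conclude that it is NP-complete.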
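To make the "verify in polynomial time" step concrete, here is a minimal sketch of such a verifier. The representation (clauses as lists of signed integers, as in the DIMACS convention) and the function name `satisfies` are choices made for this illustration only.

```python
from typing import Dict, List

Clause = List[int]  # signed literals: 3 means x3, -3 means NOT x3


def satisfies(cnf: List[Clause], assignment: Dict[int, bool]) -> bool:
    """Return True if `assignment` makes every clause of `cnf` true."""
    # A single pass over all literals, so the check is linear in the formula size.
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
formula = [[1, -2], [2, 3], [-1, -3]]
good = {1: True, 2: True, 3: False}
bad = {1: True, 2: False, 3: True}

print(satisfies(formula, good))  # True
print(satisfies(formula, bad))   # False: the last clause fails
```

The certificate is just the assignment, whose size is bounded by the number of variables, which is all that membership in NP requires.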
Try reading up on the Cook-Levin theorem. SAT was the first problem proven NP-complete. A high-level sketch of the proof: simulate the computation of a nondeterministic polynomial-time Turing machine (TM) with a cascading circuit that computes the successive TM steps, and convert that circuit to a SAT instance. The key point is that the size of the circuit is polynomial in the length of the input.
Note: This is a repeat of an answer for a previous question, but the answer works better here than it did there.
There were two independent proofs that SAT is NP-hard, one by Stephen Cook in 1971 [1] and the other by Leonid Levin in 1973. We now know it as the Cook-Levin theorem. You can read the paper, or consult the Wikipedia article [2] for details, but I'll give a brief outline of the basic idea here.
Let's look at the recogniser problem. There is a language $L$, and a nondeterministic TM $M$ which recognises $L$ in polynomial time. Let $w$ be a string. The idea is to construct a boolean formula $A(w)$ in conjunctive normal form, where the number of clauses and the number of proposition symbols are polynomial in the length of $w$, the size of $M$ and the size of the alphabet, and which is satisfiable if $M$ accepts $w$ and unsatisfiable if $M$ rejects $w$.
The proof depends on the fact that $M$ takes polynomial time in the length of the string. Suppose that the maximum number of steps that $M$ can take for a string of length $n$ is $Q(n)$. Since the tape head moves at most one square per step, this is also an upper bound on the amount of tape that $M$ can use. We can trivially modify $M$ so that all computations take at least this time. We could, for example, modify $M$ so that it loops forever once it reaches an accepting or rejecting state. We then let it run for $Q(n)$ steps and see which state it is in.
We introduce the following proposition symbols:
- $P_{i,s,t}$ is true if and only if the tape square $s$ contains symbol $i$ at time step $t$.
- $Q_{i,t}$ is true if and only if the machine is in state $i$ at time step $t$.
- $S_{s,t}$ is true if and only if tape square $s$ is scanned by the tape head at time step $t$.
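To see that the number of proposition symbols stays polynomial, here is a rough sketch that numbers them the way a SAT solver would expect. The function `make_variables` and its parameters are hypothetical names for this illustration; `steps` stands in for $Q(n)$, and $s$ in $S_{s,t}$ is taken to index a tape square.

```python
from itertools import count
from typing import Dict, Tuple


def make_variables(num_symbols: int, num_states: int, steps: int) -> Dict[Tuple, int]:
    """Assign a distinct positive integer to every proposition symbol.

    `steps` plays the role of Q(n): it bounds both the time steps and the
    tape squares that M can reach.
    """
    ids = count(1)
    var: Dict[Tuple, int] = {}
    for t in range(1, steps + 1):
        for s in range(1, steps + 1):
            for i in range(num_symbols):
                var[("P", i, s, t)] = next(ids)  # square s holds symbol i at time t
            var[("S", s, t)] = next(ids)         # square s is scanned at time t
        for i in range(num_states):
            var[("Q", i, t)] = next(ids)         # machine is in state i at time t
    return var


variables = make_variables(num_symbols=3, num_states=4, steps=10)
print(len(variables))  # 3*10*10 + 10*10 + 4*10 = 440
```

In general this gives `(num_symbols + 1) * steps**2 + num_states * steps` variables, which is polynomial in $n$ whenever $Q(n)$ is.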
Next, we construct formulas which model the actions of $M$ and test whether or not $w$ is accepted. We can do this using only the above proposition symbols, and in conjunctive normal form.
I encourage you to think through the details yourself by working out what the formulas might look like. The ones that Cook used are:
- At each time step $t$, one and only one square is scanned.
- At each time step $t$ and tape square $s$, there is one and only one symbol.
- At each time step $t$, $M$ is in one and only one state.
- At time step $1$, $M$ is in its start state and the tape contains exactly $w$ followed by "blank" symbols.
- At each time step transition, the $P$, $Q$ and $S$ propositions are updated correctly, according to the transition function of $M$. Remember $M$ is nondeterministic, so you need to include all possible transitions. (If you're playing along at home, use three formulas for this.)
And the final, most important, formula states that:
- $M$ enters the "accept" state at some time.
Then the conjunction of all of these formulas is satisfiable if and only if $M$ accepts $w$. Solve using your favourite SAT solver, and you're done.
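As an illustration of how the "one and only one" formulas above can be written in conjunctive normal form, here is a minimal sketch of one common encoding: a single clause saying at least one variable is true, plus one clause per pair saying they are not both true. The helper name `exactly_one` is made up for this example, and the variables are plain integers with a negative sign meaning negation.

```python
from itertools import combinations, product
from typing import List


def exactly_one(variables: List[int]) -> List[List[int]]:
    """CNF clauses that hold if and only if exactly one of `variables` is true."""
    at_least_one = [list(variables)]
    at_most_one = [[-a, -b] for a, b in combinations(variables, 2)]
    return at_least_one + at_most_one


clauses = exactly_one([1, 2, 3])
print(clauses)  # [[1, 2, 3], [-1, -2], [-1, -3], [-2, -3]]

# Brute-force sanity check over all 8 assignments of three variables.
for bits in product([False, True], repeat=3):
    value = dict(zip([1, 2, 3], bits))
    holds = all(any(value[abs(lit)] == (lit > 0) for lit in c) for c in clauses)
    assert holds == (sum(bits) == 1)
```

For $k$ variables this produces $1 + k(k-1)/2$ clauses, so applying it at every time step and tape square keeps the whole formula polynomial in size.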
- [1] Stephen A. Cook, The Complexity of Theorem-Proving Procedures (1971).
- [2] Cook-Levin theorem on Wikipedia.