34.5 NP-complete problems

NP-complete problems arise in diverse domains: boolean logic, graphs, arithmetic, network design, sets and partitions, storage and retrieval, sequencing and scheduling, mathematical programming, algebra and number theory, games and puzzles, automata and language theory, program optimization, biology, chemistry, physics, and more. In this section, we shall use the reduction methodology to provide NP-completeness proofs for a variety of problems drawn from graph theory and set partitioning.

Figure 34.13 outlines the structure of the NP-completeness proofs in this section and Section 34.4. Each language in the figure is proved NP-complete by reduction from the language that points to it. At the root is CIRCUIT-SAT, which we proved NP-complete in Theorem 34.7.

Figure 34.13: The structure of NP-completeness proofs in Sections 34.4 and 34.5. All proofs ultimately follow by reduction from the NP-completeness of CIRCUIT-SAT.

34.5.1 The clique problem

A clique in an undirected graph G = (V, E) is a subset V' ⊆ V of vertices, each pair of which is connected by an edge in E. In other words, a clique is a complete subgraph of G. The size of a clique is the number of vertices it contains. The clique problem is the optimization problem of finding a clique of maximum size in a graph. As a decision problem, we ask simply whether a clique of a given size k exists in the graph. The formal definition is

CLIQUE = {<G, k> : G is a graph with a clique of size k}.

A naive algorithm for determining whether a graph G = (V, E) with |V| vertices has a clique of size k is to list all k-subsets of V , and check each one to see whether it forms a clique. The running time of this algorithm is , which is polynomial if k is a constant. In general, however, k could be near |V| /2, in which case the algorithm runs in superpolynomial time. As one might suspect, an efficient algorithm for the clique problem is unlikely to exist. k

Theorem 34.11

The clique problem is NP-complete.

Proof To show that CLIQUE NP, for a given graph G = (V, E), we use the set V' ⊆ V of vertices in the clique as a certificate for G. Checking whether V' is a clique can be accomplished in polynomial time by checking whether, for each pair u, v ∈ V', the edge (u, v) belongs to E.

We next prove that 3-CNF-SAT ≤_P CLIQUE, which shows that the clique problem is NP-hard. That we should be able to prove this result is somewhat surprising, since on the surface logical formulas seem to have little to do with graphs.

The reduction algorithm begins with an instance of 3-CNF-SAT. Let φ = C₁ ∧ C₂ ∧ ··· ∧ C_k be a boolean formula in 3-CNF with k clauses. For r = 1, 2,..., k, each clause C_r has exactly three distinct literals , and . We shall construct a graph G such that φ is satisfiable if and only if G has a clique of size k.

The graph G = (V, E) is constructed as follows. For each clause in φ, we place a triple of vertices , , and into V. We put an edge between two vertices and if both of the following hold:

, and and are in different triples, that is, r ≠ s, and
their corresponding literals are consistent, that is, is not the negation of .

This graph can easily be computed from φ in polynomial time. As an example of this construction, if we have

φ = (x₁ ∨ ¬x₂ ∨ ¬x₃) ∧ (¬x₁ ∨ x₂ ∨ x₃) ∧ (x₁ ∨ x₂ ∨ x₃),

then G is the graph shown in Figure 34.14.

Figure 34.14: The graph G derived from the 3-CNF formula φ = C₁ ∧ C₂ ∧ C₃, where C₁ = (x₁ ∨ ¬x₂ ∨ ¬x₃), C₂ = (¬x₁ ∧ x₂ ∧ x₃), and C₃ = (x₁ ∧ x₂ ∧ x₃), in reducing 3-CNF-SAT to CLIQUE. A satisfying assignment of the formula has x₂ = 0, x₃ = 1, and x₁ may be either 0 or 1. This assignment satisfies C₁ with ¬x₂, and it satisfies C₂ and C₃ with x₃, corresponding to the clique with lightly shaded vertices.

We must show that this transformation of φ into G is a reduction. First, suppose that φ has a satisfying assignment. Then each clause C_r contains at least one literal that is assigned 1, and each such literal corresponds to a vertex . Picking one such "true" literal from each clause yields a set V' of k vertices. We claim that V' is a clique. For any two vertices , where r ≠ s, both corresponding literals and are mapped to 1 by the given satisfying assignment, and thus the literals cannot be complements. Thus, by the construction of G, the edge belongs to E.

Conversely, suppose that G has a clique V' of size k. No edges in G connect vertices in the same triple, and so V' contains exactly one vertex per triple. We can assign 1 to each literal such that without fear of assigning 1 to both a literal and its complement, since G contains no edges between inconsistent literals. Each clause is satisfied, and so φ is satisfied. (Any variables that do not correspond to a vertex in the clique may be set arbitrarily.)

In the example of Figure 34.14, a satisfying assignment of φ has x₂ = 0 and x₃ = 1. A corresponding clique of size k = 3 consists of the vertices corresponding to ¬x₂ from the first clause, x₃ from the second clause, and x₃ from the third clause. Because the clique contains no vertices corresponding to either x₁ or ¬x₁, we can set x₁ to either 0 or 1 in this satisfying assignment.

Observe that in the proof of Theorem 34.11, we reduced an arbitrary instance of 3-CNF-SAT to an instance of CLIQUE with a particular structure. It might seem that we have shown only that CLIQUE is NP-hard in graphs in which the vertices are restricted to occur in triples and in which there are no edges between vertices in the same triple. Indeed, we have shown that CLIQUE is NP-hard only in this restricted case, but this proof suffices to show that CLIQUE is NP-hard in general graphs. Why? If we had a polynomial-time algorithm that solved CLIQUE on general graphs, it would also solve CLIQUE on restricted graphs.

It would not have been sufficient, however, to reduce instances of 3-CNF-SAT with a special structure to general instances of CLIQUE. Why? It might have been the case that the instances of 3-CNF-SAT we chose to reduce from were "easy," and so we would not have reduced an NP-hard problem to CLIQUE.

Observe also that the reduction used the instance of 3-CNF-SAT but not the solution. It would have been a mistake for the polynomial-time reduction to have been based on knowing whether the formula φ is satisfiable, since we do not know how to determine this information in polynomial time.

34.5.2 The vertex-cover problem

A vertex cover of an undirected graph G = (V, E) is a subset V' ⊆ V such that if (u, v) ∈ E, then u ∈ V' or v ∈ V' (or both). That is, each vertex "covers" its incident edges, and a vertex cover for G is a set of vertices that covers all the edges in E. The size of a vertex cover is the number of vertices in it. For example, the graph in Figure 34.15(b) has a vertex cover {w, z} of size 2.

Figure 34.15: Reducing CLIQUE to VERTEX-COVER. (a) An undirected graph G = (V, E) with clique V' = {u, v, x, y}. (b) The graph

produced by the reduction algorithm that has vertex cover V - V' = {w, z}.

The vertex-cover problem is to find a vertex cover of minimum size in a given graph. Restating this optimization problem as a decision problem, we wish to determine whether a graph has a vertex cover of a given size k. As a language, we define

VERTEX-COVER = {<G, k> : graph G has a vertex cover of size k}.

The following theorem shows that this problem is NP-complete.

Theorem 34.12

The vertex-cover problem is NP-complete.

Proof We first show that VERTEX-COVER ∈ NP. Suppose we are given a graph G = (V, E) and an integer k. The certificate we choose is the vertex cover V' ⊆ V itself. The verification algorithm affirms that |V'| = k, and then it checks, for each edge (u, v) ∈ E, that u ∈ V' or v ∈ V'. This verification can be performed straightforwardly in polynomial time.

We prove that the vertex-cover problem is NP-hard by showing that CLIQUE ≤_P VERTEX-COVER. This reduction is based on the notion of the "complement" of a graph. Given an undirected graph G = (V, E), we define the complement of G as , where , and (u, v) ∉ E}. In other words, is the graph containing exactly those edges that are not in G. Figure 34.15 shows a graph and its complement and illustrates the reduction from CLIQUE to VERTEX-COVER.

The reduction algorithm takes as input an instance <G, k> of the clique problem. It computes the complement , which is easily done in polynomial time. The output of the reduction algorithm is the instance , of the vertex-cover problem. To complete the proof, we show that this transformation is indeed a reduction: the graph G has a clique of size k if and only if the graph has a vertex cover of size |V | - k.

Suppose that G has a clique V' ⊆ V with |V'| = k. We claim that V - V' is a vertex cover in . Let (u, v) be any edge in E. Then, (u, v) ∉ E, which implies that at least one of u or v does not belong to V', since every pair of vertices in V' is connected by an edge of E. Equivalently, at least one of u or v is in V - V', which means that edge (u, v) is covered by V - V'. Since (u, v) was chosen arbitrarily from E, every edge of E is covered by a vertex in V - V'. Hence, the set V - V', which has size |V | - k, forms a vertex cover for .

Conversely, suppose that has a vertex cover V' ⊆ V, where |V'| = |V| - k. Then, for all u, v ∈ V, if (u, v) ∈ E, then u ∈ V' or v ∈ V' or both. The contrapositive of this implication is that for all u, v ∈ V, if u ∉ V' and v ∉ V', then (u, v) E. In other words, V -V' is a clique, and it has size |V |-|V'| = k.

Since VERTEX-COVER is NP-complete, we don't expect to find a polynomial-time algorithm for finding a minimum-size vertex cover. Section 35.1 presents a polynomial-time "approximation algorithm," however, which produces "approxi mate" solutions for the vertex-cover problem. The size of a vertex cover produced by the algorithm is at most twice the minimum size of a vertex cover.

Thus, we shouldn't give up hope just because a problem is NP-complete. There may be a polynomial-time approximation algorithm that obtains near-optimal solutions, even though finding an optimal solution is NP-complete. Chapter 35 gives several approximation algorithms for NP-complete problems.

34.5.3 The hamiltonian-cycle problem

We now return to the hamiltonian-cycle problem defined in Section 34.2.

Theorem 34.13

The hamiltonian cycle problem is NP-complete.

Proof We first show that HAM-CYCLE belongs to NP. Given a graph G = (V, E), our certificate is the sequence of |V| vertices that makes up the hamiltonian cycle. The verification algorithm checks that this sequence contains each vertex in V exactly once and that with the first vertex repeated at the end, it forms a cycle in G. That is, it checks that there is an edge between each pair of consecutive vertices and between the first and last vertices. This verification can be performed in polynomial time.

We now prove that VERTEX-COVER ≤_P HAM-CYCLE, which shows that HAM-CYCLE is NP-complete. Given an undirected graph G = (V, E) and an integer k, we construct an undirected graph G' = (V', E') that has a hamiltonian cycle if and only if G has a vertex cover of size k.

Our construction is based on a widget, which is a piece of a graph that enforces certain properties. Figure 34.16(a) shows the widget we use. For each edge (u, v) ∈ E, the graph G' that we construct will contain one copy of this widget,which we denote by W_uv. We denote each vertex in W_uv by [u, v, i] or [v, u, i], where 1 ≤ i ≤ 6, so that each widget W_uv contains 12 vertices. Widget W_uv also contains the 14 edges shown in Figure 34.16(a).

Figure 34.16: The widget used in reducing the vertex-cover problem to the hamiltonian-cycle problem. An edge (u, v) of graph G corresponds to widget W_uv in the graph G' created in the reduction. (a) The widget, with individual vertices labeled. (b)-(d) The shaded paths are the only possible ones through the widget that include all vertices, assuming that the only connections from the widget to the remainder of G' are through vertices [u, v, 1], [u, v, 6], [v, u, 1], and [v, u, 6].

Along with the internal structure of the widget, we enforce the properties we want by limiting the connections between the widget and the remainder of the graph G' that we construct. In particular, only vertices [u, v, 1], [u, v, 6], [v, u, 1], and [v, u, 6] will have edges incident from outside W_uv. Any hamiltonian cycle of G' will have to traverse the edges of W_uv in one of the three ways shown in Figures 34.16(b)-(d). If the cycle enters through vertex [u, v, 1], it must exit through vertex [u, v, 6], and it either visits all 12 of the widget's vertices (Figure 34.16(b)) or the six vertices [u, v, 1] through [u, v, 6] (Figure 34.16(c)). In the latter case, the cycle will have to reenter the widget to visit vertices [v, u, 1] through [v, u, 6]. Similarly, if the cycle enters through vertex [v, u, 1], it must exit through vertex [v, u, 6], and it either visits all 12 of the widget's vertices (Figure 34.16(d)) or the six vertices [v, u, 1] through [v, u, 6] (Figure 34.16(c)). No other paths through the widget that visit all 12 vertices are possible. In particular, it is impossible to construct two vertex-disjoint paths, one of which connects [u, v, 1] to [v, u, 6] and the other of which connects [v, u, 1] to [u, v, 6], such that the union of the two paths contain all of the widget's vertices.

The only other vertices in V' other than those of widgets are selector vertices s₁, s₂,..., s_k. We use edges incident on selector vertices in G' to select the k vertices of the cover in G.

In addition to the edges in widgets, there are two other types of edges in E', which Figure 34.17 shows. First, for each vertex u ∈ V, we add edges to join pairs of widgets in order to form a path containing all widgets corresponding to edges incident on u in G. We arbitrarily order the vertices adjacent to each vertex u ∈ V as u⁽¹⁾, u⁽²⁾,..., u^(degree(u)), where degree(u) is the number of vertices adjacent to u. We create a path in G' through all the widgets corresponding to edges incident on u by adding to E' the edges {([u, u⁽ⁱ⁾, 6], [u, u⁽ⁱ⁺¹⁾, 1]) : 1 ≤ i ≤ degree(u) - 1}. In Figure 34.17, for example, we order the vertices adjacent to w as x, y, z, and so graph G' in part (b) of the figure includes the edges ([w, x, 6], [w, y, 1]) and ([w, y, 6], [w, z, 1]). For each vertex u ∈ V, these edges in G' fill in a path containing all widgets corresponding to edges incident on u in G.

Figure 34.17: The reduction of an instance of the vertex-cover problem to an instance of the hamiltonian-cycle problem. (a) An undirected graph G with a vertex cover of size 2, consisting of the lightly shaded vertices w and y. (b) The undirected graph G' produced by the reduction, with the hamiltonian path corresponding to the vertex cover shaded. The vertex cover {w, y} corresponds to edges (s₁, [w, x, 1]) and (s₂, [y, x, 1]) appearing in the hamiltonian cycle.

The intuition behind these edges is that if we choose a vertex u ∈ V in the vertex cover of G, we can construct a path from [u, u⁽¹⁾, 1] to [u, u^(degree(u)), 6] in G' that "covers" all widgets corresponding to edges incident on u. That is, for each of these widgets, say , the path either includes all 12 vertices (if u is in the vertex cover but u⁽ⁱ⁾ is not) or just the six vertices [u, u⁽ⁱ⁾, 1], [u, u⁽ⁱ⁾, 2],..., [u, u⁽ⁱ⁾, 6] (if both u and u⁽ⁱ⁾ are in the vertex cover).

The final type of edge in E' joins the first vertex [u, u⁽¹⁾, 1] and the last vertex [u, u^(degree(u)), 6] of each of these paths to each of the selector vertices. That is, we include the edges

{(s_j, [u, u⁽¹⁾, 1]) : u ∈ V and 1 ≤ j ≤ k}
                 ∪ {(s_j, [u, u^(degree(u)), 6]) : u ∈ V and 1 ≤ j ≤ k}.

Next, we show that the size of G' is polynomial in the size of G, and hence we can construct G' in time polynomial in the size of G. The vertices of G' are those in the widgets, plus the selector vertices. Each widget contains 12 vertices, and there are k ≤ |V| selector vertices, for a total of

\|V'\|	=	12 \|E\| + k
	≤	12 \|E\| + \|V\|

vertices. The edges of G' are those in the widgets, those that go between widgets, and those connecting selector vertices to widgets. There are 14 edges in each widget, or 14 |E| in all widgets. For each vertex u ∈ V, there are degree(u) - 1 edges between widgets, so that summed over all vertices in V, there are

edges between widgets. Finally, there are two edges for each pair consisting of a selector vertex and a vertex of V, or 2k |V| such edges. The total number of edges of G' is therefore

\|E'\|	=	(14 \|E\|) + (2 \|E\| - \|V\|) + (2k\|V\|)
	=	16 \|E\| + (2k - 1)\|V\|
	≤	16 \|E\| + (2 \|V\| - 1)\|V\|.

Now we show that the transformation from graph G to G' is a reduction. That is, we must show that G has a vertex cover of size k if and only if G' has a hamiltonian cycle.

Suppose that G = (V, E) has a vertex cover V* ⊆ V of size k. Let V* = {u₁, u₂,..., u_k}. As Figure 34.17 shows, we form a hamiltonian cycle in G by including the following edges^[8] for each vertex u_j ∈ V*. Include edges , degree(u_j), which connect all widgets corresponding to edges incident on u_j. We also include the edges within these widgets as Figures 34.16(b)-(d) show, depending on whether the edge is covered by one or two vertices in V*. The hamiltonian cycle also includes the edges

By inspecting Figure 34.17, the reader can verify that these edges form a cycle. The cycle starts at s₁, visits all widgets corresponding to edges incident on u₁, then visits s₂, visits all widgets corresponding to edges incident on u₂, and so on, until it returns to s₁. Each widget is visited either once or twice, depending on whether one or two vertices of V* cover its corresponding edge. Because V* is a vertex cover for G, each edge in E is incident on some vertex in V*, and so the cycle visits each vertex in each widget of G'. Because the cycle also visits every selector vertex, it is hamiltonian.

Conversely, suppose that G' = (V', E') has a hamiltonian cycle C ⊆ E'. We claim that the set

(34.4)

is a vertex cover for G. To see why, partition C into maximal paths that start at some selector vertex s_i, traverse an edge (s_i, [u, u⁽¹⁾, 1]) for some u ∈ V, and end at a selector vertex s_j without passing through any other selector vertex. Let us call each such path a "cover path." From how G' is constructed, each cover path must start at some s_i, take the edge (s_i, [u, u⁽¹⁾, 1]) for some vertex u ∈ V, pass through all the widgets corresponding to edges in E incident on u, and then end at some selector vertex s_j. We refer to this cover path as p_u, and by equation (34.4), we put u into V*. Each widget visited by p_u must be W_uv or W_vu for some v ∈ V. For each widget visited by p_u, its vertices are visited by either one or two cover paths. If they are visited by one cover path, then edge (u, v) ∈ E is covered in G by vertex u. If two cover paths visit the widget, then the other cover path must be p_v, which implies that v ∈ V*, and edge (u, v) ∈ E is covered by both u and v. Because each vertex in each widget is visited by some cover path, we see that each edge in E is covered by some vertex in V*.

34.5.4 The traveling-salesman problem

In the traveling-salesman problem, which is closely related to the hamiltonian-cycle problem, a salesman must visit n cities. Modeling the problem as a complete graph with n vertices, we can say that the salesman wishes to make a tour, or hamiltonian cycle, visiting each city exactly once and finishing at the city he starts from. There is an integer cost c(i, j) to travel from city i to city j, and the salesman wishes to make the tour whose total cost is minimum, where the total cost is the sum of the individual costs along the edges of the tour. For example, in Figure 34.18, a minimum-cost tour is <u, w, v, x, u>, with cost 7. The formal language for the corresponding decision problem is

TSP = {<G, c, k> : G = (V, E) is a complete graph, c is a function from V × V → Z, k ∈ Z, and G has a traveling-salesman tour with cost at most k}.

Figure 34.18: An instance of the traveling-salesman problem. Shaded edges represent a minimum-cost tour, with cost 7.

The following theorem shows that a fast algorithm for the traveling-salesman problem is unlikely to exist.

Theorem 34.14

The traveling-salesman problem is NP-complete.

Proof We first show that TSP belongs to NP. Given an instance of the problem, we use as a certificate the sequence of n vertices in the tour. The verification algorithm checks that this sequence contains each vertex exactly once, sums up the edge costs, and checks whether the sum is at most k. This process can certainly be done in polynomial time.

To prove that TSP is NP-hard, we show that HAM-CYCLE ≤_P TSP. Let G = (V, E) be an instance of HAM-CYCLE. We construct an instance of TSP as follows. We form the complete graph G' = (V, E'), where E' = {(i, j) : i, j ∈ V and i ≠ j}, and we define the cost function c by

(Note that because G is undirected, it has no self-loops, and so c(v, v) = 1 for all vertices v ∈ V.) The instance of TSP is then (G', c, 0), which is easily formed in polynomial time.

We now show that graph G has a hamiltonian cycle if and only if graph G' has a tour of cost at most 0. Suppose that graph G has a hamiltonian cycle h. Each edge in h belongs to E and thus has cost 0 in G'. Thus, h is a tour in G' with cost 0. Conversely, suppose that graph G' has a tour h' of cost at most 0. Since the costs of the edges in E' are 0 and 1, the cost of tour h' is exactly 0 and each edge on the tour must have cost 0. Therefore, h' contains only edges in E. We conclude that h' is a hamiltonian cycle in graph G.

34.5.5 The subset-sum problem

The next NP-complete problem we consider is an arithmetic one. In the subset-sum problem, we are given a finite set S ⊂ N and a target t ∈ N. We ask whether there is a subset S' ⊆ S whose elements sum to t. For example, if S = {1, 2, 7, 14, 49, 98, 343, 686, 2409, 2793, 16808, 17206, 117705, 117993} and t = 138457, then the subset S' = {1, 2, 7, 98, 343, 686, 2409, 17206, 117705} is a solution.

As usual, we define the problem as a language:

SUBSET-SUM = {<S, t> : there exists a subset S ⊆ S such that t = Σ_s∈S' S}.

As with any arithmetic problem, it is important to recall that our standard encoding assumes that the input integers are coded in binary. With this assumption in mind, we can show that the subset-sum problem is unlikely to have a fast algorithm.

Theorem 34.15

The subset-sum problem is NP-complete.

Proof To show that SUBSET-SUM is in NP, for an instance <S, t> of the problem, we let the subset S' be the certificate. Checking whether t = Σ_s∈S' S can be accomplished by a verification algorithm in polynomial time.

We now show that 3-CNF-SAT ≤_P SUBSET-SUM. Given a 3-CNF formula φ over variables x₁, x₂,..., x_n with clauses C₁, C₂,..., C_k, each containing exactly three distinct literals, the reduction algorithm constructs an instance <S, t> of the subset-sum problem such that φ is satisfiable if and only if there is a subset of S whose sum is exactly t. Without loss of generality, we make two simplifying assumptions about the formula φ. First, no clause contains both a variable and its negation, for such a clause is automatically satisfied by any assignment of values to the variables. Second, each variable appears in at least one clause, for otherwise it does not matter what value is assigned to the variable.

The reduction creates two numbers in set S for each variable x_i and two numbers in S for each clause C_j. We shall create numbers in base 10, where each number contains n +k digits and each digit corresponds to either one variable or one clause. Base 10 (and other bases, as we shall see) has the property we need of preventing carries from lower digits to higher digits.

As Figure 34.19 shows, we construct set S and target t as follows. We label each digit position by either a variable or a clause. The least significant k digits are labeled by the clauses, and the most significant n digits are labeled by variables.

The target t has a 1 in each digit labeled by a variable and a 4 in each digit labeled by a clause.
For each variable x_i, there are two integers, v_i and , in S. Each has a 1 in the digit labeled by x_i and 0's in the other variable digits. If literal x_i appears in clause C_j, then the digit labeled by C_j in v_i contains a 1. If literal ¬x_i appears in clause C_j, then the digit labeled by C_j in contains a 1. All other digits labeled by clauses in v_i and are 0.

All v_i and values in set S are unique. Why? For l ≠ i, no v_l or values can equal v_i and in the most significant n digits. Furthermore, by our simplifying assumptions above, no v_i and can be equal in all k least significant digits. If v_i and were equal, then x_i and ¬x_i would have to appear in exactly the same set of clauses. But we assume that no clause contains both x_i and ¬x_i and that either x_i or ¬x_i appears in some clause, and so there must be some clause C_j for which v_i and differ.
For each clause C_j, there are two integers, s_j and in S. Each has 0's in all digits other than the one labeled by C_j. For s_j, there is a 1 in the C_j digit, and has a 2 in this digit. These integers are "slack variables," which we use to get each clause-labeled digit position to add to the target value of 4.

Simple inspection of Figure 34.19 demonstrates that all s_j and values in S are unique in set S.

Figure 34.19: The reduction of 3-CNF-SAT to SUBSET-SUM. The formula in 3-CNF is φ = C₁ ∧ C₂ ∧ C₃ ∧ C₄, where C₁ = (x₁ ∨ ¬x₂ ∨¬x₃), C₂ = (¬x₁ ∨¬x₂ ∨¬x₃), C₃ = (¬x₁ ∨¬x₂ ∨x₃), and C₄ = (x₁ x₂ x₃). A satisfying assignment of φ is <x₁ = 0, x₂ = 0, x₃ = 1>. The set S produced by the reduction consists of the base-10 numbers shown; reading from top to bottom, S = {1001001, 1000110, 100001, 101110, 10011, 11100, 1000, 2000, 100, 200, 10, 20, 1, 2}. The target t is 1114444. The subset S' ⊆ S is lightly shaded, and it contains , and , corresponding to the satisfying assignment. It also contains slack variables , and to achieve the target value of 4 in the digits labeled by C₁ through C₄.

Note that the greatest sum of digits in any one digit position is 6, which occurs in the digits labeled by clauses (three 1's from the v_i and values, plus 1 and 2 from the s_j and values). Interpreting these numbers in base 10, therefore, no carries can occur from lower digits to higher digits.^[9]

The reduction can be performed in polynomial time. The set S contains 2n + 2k values, each of which has n + k digits, and the time to produce each digit is poly-nomial in n + k. The target t has n + k digits, and the reduction produces each in constant time.

We now show that the 3-CNF formula φ is satisfiable if and only if there is a subset S ⊆ S whose sum is t. First, suppose that φ has a satisfying assignment. For i = 1, 2,..., n, if x_i = 1 in this assignment, then include v_i in S'. Otherwise, include . In other words, we include in S' exactly the v_i and values that correspond to literals with the value 1 in the satisfying assignment. Having included either v_i or , but not both, for all i, and having put 0 in the digits labeled by variables in all s_j and , we see that for each variable-labeled digit, the sum of the values of S' must be 1, which matches those digits of the target t. Because each clause is satisfied, there is some literal in the clause with the value 1. Therefore, each digit labeled by a clause has at least one 1 contributed to its sum by a v_i or value in S'. In fact, 1, 2, or 3 literals may be 1 in each clause, and so each clause-labeled digit has a sum of 1, 2, or 3 from the v_i and values in S'. (In Figure 34.19 for example, literals ¬x₁, ¬x₂, and x₃ have the value 1 in a satisfying assignment. Each of clauses C₁ and C₄ contain exactly one of these literals, and so together , and v₃ contribute 1 to the sum in the digits for C₁ and C₄. Clause C₂ contains two of these literals, and , and v₃ contribute 2 to the sum in the digit for C₂. Clause C₃ contains all three of these literals, and , and v₃ contribute 3 to the sum in the digit for C₃.) We achieve the target of 4 in each digit labeled by clause C_j by including in S' the appropriate nonempty subset of slack variables {s_j , }. (In Figure 34.19, S' includes and .) Since we have matched the target in all digits of the sum, and no carries can occur, the values of S' sum to t.x

Now, suppose that there is a subset S' ⊆ S that sums to t. The subset S' must include exactly one of v_i and for each i = 1, 2,..., n, for otherwise the digits labeled by variables would not sum to 1. If v_i ∈ S', we set x_i = 1. Otherwise, ∈ S', and we set x_i = 0. We claim that every clause C_j, for j = 1, 2,..., k, is satisfied by this assignment. To prove this claim, note that to achieve a sum of 4 in the digit labeled by C_j, the subset S' must include at least one v_i or value that has a 1 in the digit labeled by C_j, since the contributions of the slack variables s_j and together sum to at most 3. If S' includes a v_i that has a 1 in that position, then the literal x_i appears in clause C_j. Since we have set x_i = 1 when v_i ∈ S', clause C_j is satisfied. If S' includes a that has a 1 in that position, then the literal ¬x_i appears in C_j. Since we have set x_i = 0 when ∈ S', clause C_j is again satisfied. Thus, all clauses of φ are satisfied, which completes the proof.

Exercises 34.5-1

The subgraph-isomorphism problem takes two graphs G₁ and G₂ and asks whether G₁ is isomorphic to a subgraph of G₂. Show that the subgraph-isomorphism problem is NP-complete.

Exercises 34.5-2

Given an integer m-by-n matrix A and an integer m-vector b, the 0-1 integer-programming problem asks whether there is an integer n-vector x with elements in the set {0, 1} such that Ax ≤ b. Prove that 0-1 integer programming is NP-complete. (Hint: Reduce from 3-CNF-SAT.)

Exercises 34.5-3

The integer linear-programming problem is like the 0-1 integer-programming problem given in Exercise 34.5-2, except that the values of the vector x may be any integers rather than just 0 or 1. Assuming that the 0-1 integer-programming problem is NP-hard, show that the integer linear-programming problem is NP-complete.

Exercises 34.5-4

Show that the subset-sum problem is solvable in polynomial time if the target value t is expressed in unary.

Exercises 34.5-5

The set-partition problem takes as input a set S of numbers. The question is whether the numbers can be partitioned into two sets A and = S - A such that Σ_x∈x. Show that the set-partition problem is NP-complete.

Exercises 34.5-6

Show that the hamiltonian-path problem is NP-complete.

Exercises 34.5-7

The longest-simple-cycle problem is the problem of determining a simple cycle (no repeated vertices) of maximum length in a graph. Show that this problem is NP-complete.

Exercises 34.5-8

In the half 3-CNF satisfiability problem, we are given a 3-CNF formula φ with n variables and m clauses, where m is even. We wish to determine whether there exists a truth assignment to the variables of φ such that exactly half the clauses evaluate to 0 and exactly half the clauses evaluate to 1. Prove that the half 3-CNF satisfiability problem is NP-complete.

Problems 34-1: Independent set

An independent set of a graph G = (V, E) is a subset V' ⊇ V of vertices such that each edge in E is incident on at most one vertex in V'. The independent-set problem is to find a maximum-size independent set in G.

Formulate a related decision problem for the independent-set problem, and prove that it is NP-complete. (Hint: Reduce from the clique problem.)
Suppose that you are given a "black-box" subroutine to solve the decision problem you defined in part (a). Give an algorithm to find an independent set of maximum size. The running time of your algorithm should be polynomial in |V| and |E|, where queries to the black box are counted as a single step.

Although the independent-set decision problem is NP-complete, certain special cases are polynomial-time solvable.

Give an efficient algorithm to solve the independent-set problem when each vertex in G has degree 2. Analyze the running time, and prove that your algorithm works correctly.
Give an efficient algorithm to solve the independent-set problem when G is bipartite. Analyze the running time, and prove that your algorithm works correctly. (Hint: Use the results of Section 26.3.)

Problems 34-2: Bonnie and Clyde

Bonnie and Clyde have just robbed a bank. They have a bag of money and want to divide it up. For each of the following scenarios, either give a polynomial-time algorithm, or prove that the problem is NP-complete. The input in each case is a list of the n items in the bag, along with the value of each.

There are n coins, but only 2 different denominations: some coins are worth x dollars, and some are worth y dollars. They wish to divide the money exactly evenly.
There are n coins, with an arbitrary number of different denominations, but each denomination is a nonnegative integer power of 2, i.e., the possible denominations are 1 dollar, 2 dollars, 4 dollars, etc. They wish to divide the money exactly evenly.
There are n checks, which are, in an amazing coincidence, made out to "Bonnie or Clyde." They wish to divide the checks so that they each get the exact same amount of money.
There are n checks as in part (c), but this time they are willing to accept a split in which the difference is no larger than 100 dollars.

Problems 34-3: Graph coloring

A k-coloring of an undirected graph G = (V, E) is a function c : V → {1, 2,..., k} such that c(u) ≠ c(v) for every edge (u, v) ∈ E. In other words, the numbers 1, 2,..., k represent the k colors, and adjacent vertices must have different colors. The graph-coloring problem is to determine the minimum number of colors needed to color a given graph.

Give an efficient algorithm to determine a 2-coloring of a graph if one exists.
Cast the graph-coloring problem as a decision problem. Show that your decision problem is solvable in polynomial time if and only if the graph-coloring problem is solvable in polynomial time.
Let the language 3-COLOR be the set of graphs that can be 3-colored. Show that if 3-COLOR is NP-complete, then your decision problem from part (b) is NP-complete.

To prove that 3-COLOR is NP-complete, we use a reduction from 3-CNF-SAT. Given a formula φ of m clauses on n variables x₁, x₂,..., x_n, we construct a graph G = (V, E) as follows. The set V consists of a vertex for each variable, a vertex for the negation of each variable, 5 vertices for each clause, and 3 special vertices: TRUE, FALSE, and RED. The edges of the graph are of two types: "literal" edges that are independent of the clauses and "clause" edges that depend on the clauses. The literal edges form a triangle on the special vertices and also form a triangle on x_i, ¬x_i, and RED for i = 1, 2,..., n.

Argue that in any 3-coloring c of a graph containing the literal edges, exactly one of a variable and its negation is colored c(TRUE) and the other is colored c(FALSE). Argue that for any truth assignment for φ, there is a 3-coloring of the graph containing just the literal edges.

The widget shown in Figure 34.20 is used to enforce the condition corresponding to a clause (x ∨ y ∨ z). Each clause requires a unique copy of the 5 vertices that are heavily shaded in the figure; they connect as shown to the literals of the clause and the special vertex TRUE.

Argue that if each of x, y, and z is colored c(TRUE) or c(FALSE), then the widget is 3-colorable if and only if at least one of x, y, or z is colored c(TRUE).
Complete the proof that 3-COLOR is NP-complete.

Figure 34.20: The widget corresponding to a clause (x ∨ y ∨ z), used in Problem 34-3.

Problems 34-4: Scheduling with profits and deadlines

Suppose you have one machine and a set of n tasks a₁, a₂,..., a_n. Each task a_j has a processing time t_j, a profit p_j, and a deadline d_j. The machine can process only one task at a time, and task a_j must run uninterruptedly for t_j consecutive time units. If you complete task a_j by its deadline d_j, you receive a profit p_j, but if you complete it after its deadline, you receive no profit. As an optimization problem, you are given the processing times, profits, and deadlines for a set of n tasks, and you wish to find a schedule that completes all the tasks and returns the greatest amount of profit.

State this problem as a decision problem.
Show that the decision problem is NP-complete.
Give a polynomial-time algorithm for the decision problem, assuming that all processing times are integers from 1 to n. (Hint: Use dynamic programming.)
Give a polyomial-time algorithm for the optimization problem, assuming that all processing times are integers from 1 to n.

^[8]Technically, we define a cycle in terms of vertices rather than edges (see Section B.4). In the interest of clarity, we abuse notation here and define the hamiltonian cycle in terms of edges.

^[9]In fact, any base b, where b ≥ 7, would work. The instance at the beginning of this subsection is the set S and target t in Figure 34.19 interpreted in base 7, with S listed in sorted order.

\|E'\|	=	(14 \|E\|) + (2 \|E\| - \|V\|) + (2k\|V\|)
	=	16 \|E\| + (2k - 1)\|V\|
	≤	16 \|E\| + (2 \|V\| - 1)\|V\|.