In
mathematicsMathematics is the science and study of quantity, structure, space, and change. Mathematicians seek out patterns, formulate new conjectures, and establish truth by rigorous deduction from appropriately chosen axioms and definitions....
, the
independent set problem (
Independent Set or
IS) is a well-known problem in
graph theoryIn mathematics and computer science, graph theory is the study of graphs: mathematical structures used to model pairwise relations between objects from a certain collection. A "graph" in this context refers to a collection of vertices or 'nodes' and a collection of edges that connect pairs of...
. The independent set problem is known to be
NP-completeIn computational complexity theory, the complexity class NP-complete , is a class of problems having two properties...
. It is closely related to the
clique problemIn computational complexity theory, the clique problem is a graph-theoretic NP-complete problem. The problem was one of Richard Karp's original 21 problems shown NP-complete in his 1972 paper "Reducibility Among Combinatorial Problems"...
.
In a
simple graphIn mathematics, a graph is an abstract representation of a set of objects where some pairs of the objects are connected by links. The interconnected objects are represented by mathematical abstractions called vertices, and the links that connect some pairs of vertices are called edges...
G, an
independent set is a subset of its vertices that are pairwise not adjacent. In other words, the subgraph induced by these vertices has no edges, only isolated vertices.
In
mathematicsMathematics is the science and study of quantity, structure, space, and change. Mathematicians seek out patterns, formulate new conjectures, and establish truth by rigorous deduction from appropriately chosen axioms and definitions....
, the
independent set problem (
Independent Set or
IS) is a well-known problem in
graph theoryIn mathematics and computer science, graph theory is the study of graphs: mathematical structures used to model pairwise relations between objects from a certain collection. A "graph" in this context refers to a collection of vertices or 'nodes' and a collection of edges that connect pairs of...
. The independent set problem is known to be
NP-completeIn computational complexity theory, the complexity class NP-complete , is a class of problems having two properties...
. It is closely related to the
clique problemIn computational complexity theory, the clique problem is a graph-theoretic NP-complete problem. The problem was one of Richard Karp's original 21 problems shown NP-complete in his 1972 paper "Reducibility Among Combinatorial Problems"...
.
Description
In a
simple graphIn mathematics, a graph is an abstract representation of a set of objects where some pairs of the objects are connected by links. The interconnected objects are represented by mathematical abstractions called vertices, and the links that connect some pairs of vertices are called edges...
G, an
independent set is a subset of its vertices that are pairwise not adjacent. In other words, the subgraph induced by these vertices has no edges, only isolated vertices. Then, the independent set problem asks: given a graph
G and a
positive integerIn mathematics, there are two conventions for the set of natural numbers: it is either the set of positive integers {, , , ...} according to the traditional definition or the set of non-negative integers {, 1, 2, ...} according to...
k, does
G have an independent set of
cardinalityIn mathematics, the cardinality of a set is a measure of the "number of elements of the set". For example, the set A = {2, 4, 6} contains 3 elements, and therefore A has a cardinality of 3...
at least
k?
The corresponding
optimization problemIn mathematics and computer science, an optimization problem is the problem of finding the best solution from all feasible solutions. More formally, an optimization problem is a quadruple , where* is a set of instances;...
is the
maximum independent set problem (
Max Independent Set), which attempts to find the largest independent set in a graph. Given a solution to the decision problem, binary search can be used to solve the optimization problem with O(log
|V|) invocations of the decision problem's solution. The optimization problem is known to have no constant-factor
approximation algorithmIn computer science and operations research, approximation algorithms are algorithms used to find approximate solutions to optimization problems. Approximation algorithms are often associated with NP-hard problems; since it is unlikely that there can ever be efficient polynomial time exact...
if P≠NP.
Independent set problems and
cliqueIn graph theory, a clique in an undirected graph G = is a subset of the vertex set C ⊆ V, such that for every two vertices in C, there exists an edge connecting the two. This is equivalent to saying that the subgraph induced by C is complete...
problems may be easily translated into each other: an independent set in a graph
G is a clique in the
complementIn graph theory, the complement or inverse of a graph G is a graph H on the same vertices such that two vertices of H are adjacent if and only if they are not adjacent in G. That is to find the complement of a graph, you fill in all the missing edges to get a complete graph, and remove all the...
of
G, and vice versa.
Algorithms
The simplest
brute forceBrute force may refer to:* Brute-force search, a trivial computer problem-solving technique* Brute force attack, a method of defeating a cryptographic scheme by trying a large number of possibilities...
algorithm for Independent Set simply examines every vertex subset of size at least
k and checks whether it is an independent set. This requires time proportional to n!/(n - k)!k! where n is the order of the graph.
A much easier problem to solve is that of finding a
maximal independent setIn graph theory, a maximal independent set or maximal stable set is an independent set that is not a subset of any other independent set. That is, it is a set S such that every edge of the graph has at least one endpoint not in S and every vertex not in S has at least one neighbor in S...
, which is an independent set not contained in any other independent set. We begin with a single vertex. We find a vertex not adjacent to it and add it, then find a vertex adjacent to neither of these and add it, and so on until we can find no more vertices to add. At this time the set is maximal independent. More complex algorithms are known for listing all maximal independent sets, but the number of such sets can be exponential in the number of vertices. This can be seen by constructing a graph containing
n disconnected triangles. The maximum independent set is obviously of order
n, taking one vertex from each triangle. The number of maximal independent sets is therefore 3
n .
Proof of NP-completeness
It's easy to see that the problem is in NP, since if we have a subset of vertices, we can check to make sure there are no edges between any two of them in polynomial time. To show the problem is
NP-hardNP-hard , in computational complexity theory, is a class of problems that are, informally, "at least as hard as the hardest problems in NP". A problem H is NP-hard if and only if there is an NP-complete problem L that is polynomial time Turing-reducible to H...
, we will use a reduction from another NP-complete problem.
Assume we already know Cook's result that the
boolean satisfiability problemSatisfiability is the problem of determining if the variables of a givenBoolean formula can be assigned in such a way as to make the formulaevaluate to TRUE. Equally important is to determine whether no such assignments...
is NP-complete. One can efficiently reduce any boolean formula to
conjunctive normal formIn boolean logic, a formula is in conjunctive normal form if it is a conjunction of clauses, where a clause is a disjunction of literals. As a normal form, it is useful in automated theorem proving...
(CNF). In conjunctive normal form:
- The formula is a conjunction (and) of clauses.
- Each clause is a disjunction (or) of literals.
- Each literal is either a variable or its negation.
For example, the following formula is in CNF form, where ~ denotes negation:
and (
x1 or
x2 or
x4)
Such a formula is
satisfiable if we can assign true/false values to each variable in such a way that at least one literal in every clause is true. For example, any assignment with
x2 false and
x4 true satisfies the above formula. The problem of determining whether a formula in CNF form is satisfiable is also NP-complete and is called CNF-SAT.
Now, we describe a polynomial-time many-one reduction from CNF-SAT to the independent set problem. First, create a vertex for every literal in the formula; include duplicate vertices for those occurring more than once. Put an edge between:
- Any two literals which are negations of one another.
- Any two literals which are in the same clause.
For example, in our example above,
x2 would be adjacent to ~
x2, the first
x1 would be adjacent to ~
x3, and the second
x1 would be adjacent to
x4.
We argue now that this graph has an independent set of size at least
k, where
k is the number of clauses, if and only if the original formula was satisfiable.
Suppose we have an assignment satisfying the original formula. Then we can choose one literal from each clause which is made true by this assignment. This set is independent, because it only includes one literal from each clause (no edges of type 2), and because no assignment makes both a literal and its negation true (no edges of type 1).
On the other hand, suppose we have an independent set of size
k or greater. It can't contain any two literals in the same clause, since these are pairwise adjacent. But then, since there are at least
k vertices and
k clauses, we must have at least one in each clause (in fact exactly one). It also can't contain both a literal and its negation, because there are edges between these. That means it's easy to choose an assignment that makes these
k clauses all true, and this assignment will satisfy the original formula.
What makes this reduction to independent set so simple is the capacity of the edges in the graph to express
constraints, such as the necessity of never choosing both a literal and its negation. The graph coloring problem also benefits from this useful property.
External links