Summary of Complexity Analysis for Discrete Optimization

Discrete Optimization

A discrete optimization problem involves the selection of a best alternative from among a finite set of feasible solutions as defined by an objective function. Typically, these problems involve resource allocation. For example, one might want to pack a set of boxes into a minimum number of trucks, expend a limited budget adding edges to a network or increasing edge capacities so as to maximize the overall [Btransport capacity of a network, or schedule a set of operations on a set of parallel machines so as to minimize the last completion time.

Computational Complexity

Complexity analysis helps to set our expectations by defining what is and is not possible computationally in the worst case. Suppose we wished to construct a black box that would accept any input instance for a discrete optimization problem and would output and optimal answer. Computational complexity is the analysis of how long you can expect this computation to run, as a function of input size, and how good an answer you can expect. As a technical point, usually optimization problems are phrased as decision problems: "is there a solution with value at most B?" If one can efficiently answer the decision problem, then one can answer the optimization question using binary search. Though feasible sets are finite, they are not in general small. They are almost always at least exponential in the size of the problem input. Checking all possible solutions will not in general be a viable option. For example, consider the knapsack problem where we are given a set of objects each with a weight and a value. We wish to pack a knapsack with a weight capacity with as valuable a set of objects as possible. If we could generate and check a billion possible solutions per second, then naively generating all possible choices for a set of 65 items would require a thousand years. Knapsack problems can be solved very quickly in practice using more sophisticated techniques.

For practical purposes, there are two main classes of problems: P and NP-complete.

The Class P

The class P stands for "polynomial-time" solvable. The black box for these problems can give an optimal answer in time polynomial in the input size. This is generally considered to be the class of tractable problems. Examples of useful optimization problems solvable in P are: minimum spanning tree, maximum matching, maximum (network) flow, shortest paths, and linear programming.

The Class NP and NP-Completeness

The class NP stands for "Nondeterministic Polynomial" time. It means that if you can guess a solution, then you can verify it is polynomial time (i.e. efficiently). Here is were the decision/optimization point is important. Given a hypothetical solution to a decision problem, if one can efficiently check that all constraints are met (i.e. the solution if feasible) and compute the objective function to compare with the bound, then the problem is in NP. Now generating the solution is a different question entirely, much like the distinction between generating a mathematical proof and verifying it's correctness. The class NP is generally believed to be much bigger than the class P (i.e. P is a strict subset of NP), though it has not yet been proven formally. It is arguably the most important open problem in theoretical computer science.

A problem is NP-complete if it can be used as a black box to efficiently solve any other problem in NP. In essence, and NP-complete problem is as hard as any problem in NP. Right now, the best known solution for any NP-complete problem requires exponential time (as a function of input size) in the worst case. That is, it is not much better than generating all possible solutions and checking each one.

Thus NP-completeness corresponds to formal intractability: any algorithm that optimally solves and NP-complete problem for all instances will require an unacceptable amount of compute time in the worst case (provided P is a strict subset of NP)

The items in boldface in this statement are the loopholes. In practice, to get theoretically good algorithms, one must concentration on approximation algorithms, algorithms which assume input structure of interest (e.g. graph algorithms restricted to planar graphs, graphs that can be drawn in the plane without edge crossings), average-case behavior, or getting optimal solutions to particular problems of interest.

A Heirarchy of Approximability

When considering approximation algorithms for NP-hard problems, there is a heirarchy of what is theoretically possible:

Fully-Polynomial-Time Approximation Schemes

The best case is a fully-polynomial-time approximation scheme (FPTAS) Given an input instance and an error parameter e, such an algorithm will return a solution guaranteed to be within a factor of (1 + e) times optimal and run in time that is polynomial in the input size and 1/e. Thus there is a graceful tradeoff between running time and solution quality. Such problems are only weakly NP-complete. Frequently there exist pseudopolynomial-time algorithms for such problems. These are algorithms that guarantee an optimal answer and run in time polynomial in the unput size and the largest number input. Thus if all the numbers input are small, these algorithms are polynomial-time. Example: knapsack. If all the weights are small (polynomial in the number of items), this can be solved in polynomial time. For large input numbers, there is a FPTAS.

Polynomial-Time Approximation Schemes

The next best possibility is a polynomial-time approximation scheme (PTAS), which is the same as a FPTAS except that the running time can be exponential in 1/e, where e is the error parameter. Example: bin packing.

Approximation (Only) Within A Constant Factor

The next level are those problems that cannot be approximated within a certain constant, but can be approximated within a higher contant. There is a general class called max-SNP-complete problems, which is similar to NP-completeness in the sense that all problems within the class have a certain lower bound on the difficulty of approximation. The best constant known is still pretty small, however. For specific problems, one can prove stronger nonapproximability results. For example, assigning a set of jobs to unrelated parallel machines cannot be approximated to within a factor better than 3/2 (unless P = NP), but can be approximated within a factor of 2.

Approximation (Only) Within a Logarithmic Factor

This is similar to the previous class except that a problem in this class cannot be approximated to within c log n for some constant c, where n is the input size, and can be approximated within c'log n for a different, larger constant c'. Example: set cover.

Not Approximable within a Polynomial Factor

This class is essentially not approximable. Example: clique.

References

The definitive reference for NP-completeness is still:
M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-completeness, W.H. Freeman and Company, New York, 1979.
Any more modern introductory algorithms text will contain a more succinct discussion. See for example:
T.H. Cormen, C.E. Leiserson, and R.L. Rivest, Introduction to Algorithms, MIT Press, Cambridge, MA, and McGraw-Hill, New York, 1990.
For a more detailed description of complexity classes for approximability, references, and descriptions of all the example problems given above, and much more, see Crescenzi and Kann's Compendium of NP optimization problems.

This page maintained by Cynthia Phillips caphill@cs.sandia.gov Last Modified: October 21, 1997