On the Threshold
Where the Hard Problems Aren't
Survey propagation is really a family of algorithms, which could be applied in many different realms. So far, the method has been tested on two specific problems. The first of these is Boolean satisfiability, or SAT, where the aim is to solve a large formula in symbolic logic, assigning values of true or false to all the variables in such a way that the entire formula evaluates to true. The second problem is k-coloring. Because I have written about satisfiability on an earlier occasion, I shall adopt k-coloring as the main example here. I focus on three-coloring, where the palette of available colors has just three entries.
Three-coloring is a hard problem, but not an impossible one. The question "Is this graph three-colorable?" can always be answered, at least in principle. Since each vertex can be assigned any of three colors, and there are n vertices, there must be exactly 3 n ways of coloring the graph. To decide whether a specific graph is three-colorable, just work through all the combinations one by one. If you find an assignment that satisfies the constraint—that is, where no edges yoke together like-colored vertices—then the answer to the question is yes. If you exhaust all the possibilities without finding a proper coloring, you can be certain that none exists.
This algorithm is simple and sure. Unfortunately, it's also useless, because enumerating 3 n colorings is beyond the realm of practicality for any n larger than 15 or 20. Some more-sophisticated procedures can retain the guarantee of an exact and exhaustive search while reducing the number of operations to fewer than 1.5 n . This is a dramatic improvement, but it is still an exponential function, and it merely raises the limit to n=50 or so. For large graphs, with thousands of vertices, all such brute-force methods are hopeless.
On the other hand, if you could somehow peek at the solution to a large three-coloring problem, you could check its correctness with much less labor. All you would have to do is go through the list of edges, verifying that the vertices at the ends of each edge carry different colors. The number of edges in a graph cannot be greater than n 2, which is a polynomial rather than an exponential function and which therefore grows much more slowly.
Problems with answers that are hard to find but easy to check are the characteristic signature of the class called NP (which stands for "nondeterministic polynomial"). Three-coloring is a charter member of NP and also belongs to the more-elite group of problems described as NP-complete; the same is true of satisfiability. Barring a miracle, there will be no polynomial-time algorithms for NP-complete problems.
Having thus established the credentials of three-coloring as a certifiably hard problem, it is now time to reveal that most three-coloring problems on random graphs are actually quite easy. Given a typical graph, you have a good chance of quickly finding a three-coloring or proving that none exists. There is no real paradox in this curious situation. The classification of three-coloring as NP-complete is based on a worst-case analysis. It could be overturned only by an algorithm that is guaranteed to produce the correct answer and to run in polynomial time on every possible graph. No one has discovered such an algorithm. But there are many algorithms that run quickly most of the time, if you are willing to tolerate an occasional failure.
One popular strategy for graph-coloring algorithms is backtracking. It is similar to the way most people would attack the problem if they were to try coloring a graph by hand. You start by assigning an arbitrary color to an arbitrary vertex, then go on to the neighboring vertices, giving them any colors that do not cause a conflict. Continuing in this way, you may eventually reach a vertex where no color is legal; at that point you must back up, undoing some of your previous choices, and try again.
Showing that a graph cannot be three-colored calls for another kind of algorithm. The basic approach is to search for a small cluster of vertices that—even in isolation from the rest of the graph—cannot be three-colored. For example, a "clique" made up of four vertices that are all linked to one another has this property. If you can find just one such cluster, it settles the question for the entire graph.
Algorithms like these are very different from the brute-force, exhaustive-search methods. The simple enumeration of all 3 n colorings may be impossibly slow, but at least it's consistent; the running time is the same on all graphs of the same size. This is not true for backtracking and other inexact or incomplete algorithms; their performance varies widely depending on the nature of the graph. In particular, the algorithms are sensitive to the value of , the ratio of edges to vertices, which again is the parameter that controls the transition between colorable and uncolorable phases. Well below the critical value of , where edges are sparse, there are so many ways to color the graph successfully that any reasonable strategy is likely to stumble onto one of them. At the opposite extreme, far above the threshold, graphs are densely interconnected, and it's easy to find a subgraph that spoils the chance of a three-coloring. The troublesome region is between these poles, near the threshold. In that middle ground there may be just a few proper colorings, or there may be none at all. Distinguishing between these two situations can require checking almost every possible assignment.