On the Threshold
Where the Solutions Are
The critical value of is about 2.35. In other words, if a random graph with n vertices has fewer than 2.35n edges, it can almost surely be three-colored; if it has more than 2.35n edges, a three-coloring is unlikely. Moreover, the transition between these two regimes is known to be sharp; it is a true discontinuity, a sudden jump rather than a smooth gradation. To put this idea more formally, the width of the transitional region tends to zero as n tends to infinity.
The sharpness of the phase transition could be taken as encouraging news. If algorithms for deciding colorability bog down only in the transitional region, and if that region is vanishingly narrow, then the probability of encountering a hard-to-classify graph is correspondingly small. But it seems the universe has another bug (or feature). In the first place, the sharpness of the colorability transition is assured only for infinitely large graphs; at finite n, the corners of the transition curve are rounded. And there is another disrupting factor, which has been recognized only recently. It has to do not with the structure of the graph itself but with the structure of the set of all solutions to the coloring problem.
Although the uncolorable phase does not begin until ~ 2.35, experiments have shown that algorithms begin slowing down somewhat earlier, at values of around 2.2. The discrepancy may seem inconsequential, but it is too large to be explained merely by the blurring of the phase transition at finite n. Something else is going on.
To understand the cause, it helps to think of all the possible three-colorings of a graph spread out over a surface. The height of the surface at any point represents the number of conflicts in the corresponding coloring. Thus the perfect colorings (those with zero conflicts) all lie at sea level, while the worst colorings create high-altitude peaks or plateaus. Of course the topography of this landscape depends on the particular graph. Consider how the surface evolves as gradually increases. At low values of there are broad basins and valleys, representing the many ways to color the graph perfectly. At high the landscape is alpine, and even the lowest point is well above sea level, implying a complete absence of perfect colorings. The transitional value ~ 2.35 marks the moment when the last extensive areas of land at sea level disappear.
What happens in this "solution space" at ~ 2.2? It turns out this is the moment when a broad expanse of bottomland begins breaking up into smaller isolated basins. Below 2.2, nearly all the perfect colorings form a single giant connected cluster. They are connected in the sense that you can convert one solution into another by making relatively few changes, and without introducing too many conflicts in any of the intermediate stages. Above 2.2, each basin represents an isolated cluster of solutions. Colorings that lie in separate basins are substantially different, and converting one into another would require climbing over a ridge formed by colorings that have large numbers of conflicts. Algorithms that work by conducting a local search are unlikely to cross such ridge lines, and so they remain confined for long periods to whichever basin they first wander into. As increases above 2.2, the number of perfect colorings within any one basin dwindles away to zero, and so the algorithms may fail to find a solution, even though many proper colorings still exist elsewhere on the solution surface.
This vision of solutions spread out over an undulating landscape is a familiar conceptual device in many areas of physics. Often the landscape is interpreted as an energy surface, and physical systems are assumed to run downhill toward states of minimum energy. This analogy can be pursued further, setting up a direct correspondence between the k-coloring of graphs and a model of magnetic materials.