COMPUTING SCIENCE

# The Easiest Hard Problem

# Boiling and Freezing Numbers

An answer to this question has emerged in the past few years from a campaign of research that spans at least three disciplines—physics, mathematics and computer science. It turns out that the spectrum of partitioning problems has both hard and easy regions, with a sharp boundary between them. On crossing that frontier, the problem undergoes a phase transition, analogous to the boiling or freezing of water.

The standard classification of computing problems as P or NP and so on assumes that difficulty increases as a function of problem size. In the case of number partitioning, two factors determine the size of a problem instance: how many numbers are in the set, and how big they are. Specifically, the size of an instance is the number of bits needed to represent it, and this depends both on the number of integers, *n*, and on the number of bits, *m*, in a typical integer. Thus a set of 100 integers in a range near 2^{100} has *n*=100 and *m*=100 and a problem size of 10,000 bits.

Given this measure of size, one might expect that partitioning problems would get harder as the product of *n* and *m* increases. This conclusion is not entirely wrong; algorithms do have to labor longer over bigger problems, if only to read the input. Yet, surprisingly, the product *nm* is not the best predictor of difficulty in number partitioning. Instead it's the ratio *m*/*n*.

Some simple reasoning about extreme cases takes the mystery out of this assertion. Suppose the ratio of *m* to *n* is very small, say with *m*=1 and *n*=1,000. The task, then, is to partition a set of 1,000 numbers, each of which can be represented by a single bit. This is trivially easy: A one-bit positive integer must be equal to 1, and so the input to the problem is a list of a thousand 1s. Finding a perfect partition is just a matter of counting. At the opposite extreme, consider the case of *m*=1,000 and *n*=2 (the smallest *n* that makes sense in a partitioning problem). Here the separation into subsets is easy—how many ways can you partition a set of two items?—but the likelihood of a perfect partition is extremely low. It's just the probability that two randomly selected 1,000-bit numbers will be equal.

Now it becomes clear why in my baby-boom neighborhood we so easily formed ourselves into well-matched teams. Among the dozen or more kids who would gather for a game, athletic abilities may have differed by a factor of 10, but surely not by a factor of 1,000. The parameter *m* is the base-2 logarithm of this range of talents, and so it was no greater than 3 or 4. Thus *m*/*n* was rather small—less than 1—and we had many acceptable solutions to choose from. Perhaps if we had had a young Michael Jordan or Mia Hamm in the neighborhood, I would take a different view of the number-partitioning problem today.

EMAIL TO A FRIEND :

**Of Possible Interest**

**Computing Science**: Clarity in Climate Modeling

**Feature Article**: Candy Crush's Puzzling Mathematics

**Computing Science**: Belles lettres Meets Big Data