MY AMERICAN SCIENTIST
SEARCH

HOME > PAST ISSUE > Article Detail

COMPUTING SCIENCE

# Unwed Numbers

The mathematics of Sudoku, a puzzle that boasts "No math required!"

NP or Not NP, That Is the Question

Computer science has an elaborate hierarchy for classifying problems according to difficulty, and the question of where Sudoku fits into this scheme has elicited some controversy and confusion. It is widely reported that Sudoku belongs in the class NP, a set of notoriously difficult problems; meanwhile, however, many computer programs effortlessly solve any order-3 Sudoku puzzle. There is actually no contradiction in these facts, but there is also not much help in dispelling the confusion.

Complexity classes such as NP do not measure the difficulty of any specific problem instance but rather describe the rate at which difficulty grows as a function of problem size. If we can solve an order-n Sudoku, how much harder will we have to work to solve a puzzle of order n + 1? For problems in NP, the effort needed grows exponentially.

Most discussions of the complexity of Sudoku refer to the work of Takayuki Yato and Takahiro Seta of the University of Tokyo, whose analysis relates the task of solving Sudoku to the similar problem of completing a partially specified Latin square. The latter problem in turn has been connected with others that are already known to be in NP. This process of "reduction" from one problem to another is the standard way of establishing the complexity classes of computational problems. Yato and Seta employ an unusual form of reduction that addresses the difficulty of finding an additional solution after a first solution is already known. In Sudoku, of course, well-formed puzzles are expected to have only one solution. Yato and Seta say their result applies nonetheless. I don't quite follow their reasoning on this point, but the literature of complexity theory is vast and technical, and the fault is likely my own.

When you lay down your pencil on a completed Sudoku, the thought that you've just dispatched a problem in the class NP may boost your psychological wellbeing, but the NP label doesn't say anything about the relative difficulty of individual Sudoku puzzles. For that, a different kind of hierarchy is needed.

Many publishers rank their Sudoku on a scale from easy to hard (or from gentle to diabolical). The criteria for these ratings are not stated, and it's a common experience to breeze through a "very hard" puzzle and then get stuck on a "medium."

One easily measured factor that might be expected to influence difficulty is the number of givens. In general, having fewer cells specified at the outset ought to make for a harder puzzle. At the extremes of the range, it's clear that having all the cells filled in makes a puzzle very easy indeed, and having none filled in leaves the problem under-specified. What is the minimum number of givens that can ensure a unique solution? For an order-n grid, there is a lower bound of n 2 - 1. For example, on an order-3 grid with fewer than eight givens, there must be at least two numbers that appear nowhere among the givens. With no constraints on those symbols, there are at least two solutions in which their roles are interchanged.

Can the n 2 - 1 bound be achieved in practice? For n = 1 the answer is yes. On the order-2 grid there are uniquely solvable puzzles with four givens but not, I think, with three. (Finding the arrangements with just four givens is itself a pleasant puzzle.) For order 3, the minimum number of givens is unknown. Gordon Royle of the University of Western Australia has collected more than 24,000 examples of uniquely solvable grids with 17 givens, and he has found none with fewer than 17, but a proof is lacking.

Published puzzles generally have between 25 and 30 givens. Within this range, the correlation between number of givens and difficulty rating is weak. In one book, I found that the "gentle" puzzles averaged 28.3 givens and the "diabolical" ones 28.0.