The mathematics of Sudoku, a puzzle that boasts "No math required!"
In the search for general principles, a first step is to generalize
the puzzle itself. The standard 81-cell Sudoku grid is not the only
possibility. For any positive integer n, we can draw an
order-n Sudoku grid with n 2 rows,
n 2 columns and n 2 blocks;
the grid has a total of n 4 cells, which are to
be filled with numbers in the range from 1 to n
2. The standard grid with 81 cells is of order 3. Some
publishers produce puzzles of order 4 (256 cells) and order 5 (625
cells). On the smaller side, there's not much to say about the
order-1 puzzle. The order-2 Sudoku (with 4 rows, columns and blocks,
and 16 cells in all) is no challenge as a puzzle, but it does serve
as a useful test case for studying concepts and algorithms.
How many Sudoku solutions exist for each n? To put the
question another way: Starting from a blank grid—with no
givens at all—how many ways can the pattern be completed while
obeying the Sudoku constraints? As a first approximation, we can
simplify the problem by ignoring the blocks in the Sudoku grid,
allowing any solution in which each column and each row has exactly
one instance of each number. A pattern of this kind is known as a
Latin square, and it was already familiar to Leonhard Euler more
than 200 years ago.
Consider the 4 x 4 Latin square (which corresponds to the order-2
Sudoku). Euler counted them: There are exactly 576 ways of arranging
the numbers 1, 2, 3 and 4 in a square array with no duplications in
any row or column. It follows that 576 is an upper limit on the
number of order-2 Sudoku. (Every Sudoku solution is necessarily a
Latin square, but not every Latin square is a valid Sudoku.) In a
series of postings on the Sudoku Programmers Forum, Frazer Jarvis of
the University of Sheffield showed that exactly half the 4 x 4 Latin
squares are Sudoku solutions; that is, there are 288 valid
arrangements. (The method of counting is summarized in the
illustration on the next page.)
Moving to higher-order Sudoku and larger Latin squares, the counting
gets harder in a hurry. Euler got only as far as the 5 x 5 case, and
the 9 x 9 Latin squares were not enumerated until 1975; the tally is
5,524,751,496,156,892,842,531,225,600, or about 6 x 1027.
The order-3 Sudoku must be a subset of these squares. They were
counted in June 2005 by Bertram Felgenhauer of the Technical
University of Dresden in collaboration with Jarvis. The total they
computed is 6,670,903,752,021,072,936,960, or 7 x 1021.
Thus, among all the 9 x 9 Latin squares, a little more than one in a
million are also Sudoku grids.
It's a matter of definition, however, whether all those patterns are
really different. The Sudoku grid has many symmetries. If
you take any solution and rotate it by a multiple of 90 degrees, you
get another valid grid; in the tabulations above, these variants are
counted as separate entries. Beyond the obvious rotations and
reflections, you can permute the rows within a horizontal band of
blocks or the columns within a vertical stack of blocks, and you can
also freely shuffle the bands and stacks themselves. Furthermore,
the numerals in the cells are arbitrary markers, which can also be
permuted; for example, if you switch all the 5s and 6s in a puzzle,
you get another valid puzzle.
When all these symmetries are taken into account, the number of
essentially different Sudoku patterns is reduced substantially. In
the case of the order-2 Sudoku, it turns out there are actually only
two distinct grids! All the rest of the 288 patterns can all be
generated from these two by applying various symmetry operations. In
the order-3 case, the reduction is also dramatic, although it still
leaves an impressive number of genuinely different solutions:
3,546,146,300,288, or 4 x 1012.
Does the large number of order-3 Sudoku grids tell us anything about
the difficulty of solving the puzzle? Maybe. If we set out to solve
it by some kind of search algorithm, then the number of patterns to
be considered is a relevant factor. But any strategy that involves
generating all 6,670,903,752,021,072,936,960 grids is probably not
the best way to go about solving the puzzle.