MY AMERICAN SCIENTIST
SEARCH

HOME > PAST ISSUE > Article Detail

TECHNOLOGUE

# The Quest for Randomness

Determining whether numbers truly can have no pattern has implications for quantum mechanics, not to mention the stock market and data security.

# Fire Up the Statistics

The obvious way to check whether a sequence is random is just to throw some statistical tests at it. For example, suppose we were interested in the randomness of the following sequence: 314159265358979323846264338327950288419716...

As a first step, we could check whether the digits 0 through 9 appeared with approximately equal frequency, among, say, the first million digits. Passing such a test is clearly necessary for randomness, in the sense that, if the digits really were drawn completely randomly, and we looked at a million of them, then with overwhelming probability we would see roughly 10 percent zeros, 10 percent ones, and so forth. On the other hand, passing this test is not sufficient for randomness—because, for example, the decidedly nonrandom sequence 012345678901234567890123456789... also has the right digit frequencies.

So for an improved test, we could check whether the 100 consecutive two-digit pairs—that is, 00 through 99—occur with frequency close to 1 percent. A truly random sequence would pass this test almost surely. And it’s also more effective than looking at single-digit frequencies alone. But again, once we know that the test involves looking at two-digit frequencies, it’s easy to fool the test. For example, consider the following sequence, which was proposed in the 1930s by David Champernowne (then an undergraduate classmate of computing pioneer Alan Turing) and which simply consists of all the positive integers written out together: 123456789101112131415161718192021222324...

You might enjoy proving that not only does Champernowne’s sequence have the same two-digit frequencies as a random sequence, it also has the same three-digit frequencies, the same four-digit frequencies, and so on. Yet clearly the sequence isn’t random.

A moment’s thought will reveal that the problem is completely general. For example, the first sequence above seems to pass virtually every statistical test for randomness ever devised—until, that is, you notice that it is the digits of pi. With that single realization, the sequence flips from completely unpredictable to completely predictable. The same would be true for, say, the digits of the square root of two, or any other random-looking sequence that we can nevertheless specify by a rule.

Indeed, let me now argue that no matter how clever you are in devising a statistical test for randomness, an adversary will always be able to find a deterministic sequence that passes your test. If nothing else, such an adversary could say the following:

Let S be the set of all sequences of, say, one million digits that are declared to be random by your test. Presumably S is nonempty, because something had better pass your test! So for my sequence, I’ll simply pick the first sequence in S, when the sequences are viewed as numbers and arranged in ascending order. By hypothesis, that sequence passes your test, yet I just specified it deterministically!