Random numbers have a peculiar power, even when they are only pseudo- or quasirandom
The Curse of Dimensionality
Randomness has a conspicuous role in this description of the Monte Carlo method. In particular, the appeal to the law of large numbers requires that the sample points be chosen randomly. And it’s easy to see—just by looking at a picture—why random sampling works well: It scatters points everywhere. What’s not so easy to see is why other kinds of sample-point arrangements would not also serve the purpose. After all, one could measure the leaf area by laying down a simple grid of points in the square and counting the hits. I tried this experiment with my leaf image, placing 1,024 points in a 32×32 grid. I got an area estimate of 0.4209—not quite as good as my lucky random run, but still within 0.6 percent of the true area.
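The comparison between random and grid sampling can be tried in a few lines of code. What follows is only a minimal sketch under stated assumptions: the leaf image is not available here, so a disk of radius 0.4 centered in the unit square stands in for it, and the function names (`in_disk`, `estimate_random`, `estimate_grid`) are invented for illustration. Grid points are placed at the centers of the cells, which is one reasonable choice among several.

```python
import math
import random

def in_disk(x, y):
    """Stand-in for the leaf: a disk of radius 0.4 centered at (0.5, 0.5)."""
    return (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.4 ** 2

def estimate_random(n, inside, seed=0):
    """Fraction of n pseudorandom points in the unit square that hit the shape."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(inside(rng.random(), rng.random()) for _ in range(n))
    return hits / n

def estimate_grid(side, inside):
    """Fraction of hits on a side-by-side grid, one point per cell center."""
    hits = 0
    for i in range(side):
        for j in range(side):
            x = (i + 0.5) / side
            y = (j + 0.5) / side
            hits += inside(x, y)
    return hits / (side * side)

true_area = math.pi * 0.4 ** 2  # exact area of the stand-in disk
print("true area:  ", true_area)
print("random 1024:", estimate_random(1024, in_disk))
print("grid 32x32: ", estimate_grid(32, in_disk))
```

Both estimators use 1,024 points, matching the experiment described above; only the placement of the points differs.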
I also tried measuring the leaf with 1,024 quasirandom sample points, whose arrangement is in some sense intermediate between total chaos and total order. (For an explanation of how the quasirandom pattern is constructed, see “A Recipe for Quasirandom Numbers.”) The estimate from counting hits with quasirandom points was 0.4141, giving an error of 1 percent.
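For readers who want to experiment with a quasirandom pattern, here is a hedged sketch. It uses the Halton sequence (bases 2 and 3), a common textbook construction; that choice is an assumption, and it may or may not be the pattern described in “A Recipe for Quasirandom Numbers.” The disk again stands in for the leaf, and all function names are illustrative.

```python
import math

def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` across the radix point.

    For example, 3 is 11 in binary, which becomes 0.11 (binary) = 0.75.
    """
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result

def halton_2d(n):
    """First n points of the two-dimensional Halton sequence (bases 2 and 3)."""
    return [(radical_inverse(i, 2), radical_inverse(i, 3))
            for i in range(1, n + 1)]

def in_disk(x, y):
    """Stand-in for the leaf: a disk of radius 0.4 centered at (0.5, 0.5)."""
    return (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.4 ** 2

def estimate_quasirandom(n, inside):
    """Fraction of the first n Halton points that hit the shape."""
    points = halton_2d(n)
    return sum(inside(x, y) for x, y in points) / n

print("true area:       ", math.pi * 0.4 ** 2)
print("quasirandom 1024:", estimate_quasirandom(1024, in_disk))
```

The sequence is deterministic, so repeated runs give the same estimate; the points avoid both the clumps of a random scatter and the rigid rows of a grid.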
All three of these procedures give quite respectable results. Does that mean they are all equally powerful? No, I think it means that measuring the area of a leaf in two dimensions is an easy problem.
The task gets much harder in higher dimensions. Understanding why calls for an exercise in multidimensional thinking. Imagine a d-dimensional “cube” with edges of length 1, and a smaller cube inside it, with edge lengths along each dimension equal to 1/2. When d=1, a cube is just a line segment, and volume is equivalent to length; thus the smaller cube has half the volume of the large one. For d=2, the cube is a square, and volume is area; the small cube has volume 1/4. The case d=3 corresponds to an ordinary cube, and the volume filled by the small cube is now just 1/8. The progression continues. By the time we reach dimension d=20, the smaller cube—still with edge length 1/2 along each dimension—occupies only a millionth of the total volume. The mathematician Richard Bellman called this phenomenon “the curse of dimensionality.”
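The shrinking of the small cube is simple arithmetic: its volume is the edge length raised to the power d. A short sketch (the function name is made up for illustration) tabulates the progression for the dimensions mentioned above.

```python
def small_cube_fraction(d, edge=0.5):
    """Volume of a d-dimensional cube of the given edge inside the unit cube."""
    return edge ** d

for d in (1, 2, 3, 10, 20):
    fraction = small_cube_fraction(d)
    # 1 / fraction is the number of small cubes needed to tile the unit cube
    print(f"d = {d:2d}   volume = {fraction:.3e}   1 in {round(1 / fraction):,}")
```

At d = 20 the fraction is 1 in 1,048,576, which is the "millionth" quoted in the text.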
If we want to measure the volume of the one-in-a-million small cube, or even just detect its presence, we need enough sampling points to get at least one sample from the small cube’s interior. In other words, when we’re counting hits, we need to count at least one. For a 20-dimensional grid pattern, that means we need a million points (or, more precisely, 2^20 = 1,048,576). With random sampling, the size requirement is probabilistic and hence a little fuzzy, but the number of points needed if we want to have an expectation of a single hit is again 2^20. The analysis for quasirandom sampling comes out the same. Indeed, if you are groping blindly for an object of volume 1/2^d, it hardly matters how the search pattern is arranged; you will have to look in 2^d places.
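The "fuzzy" requirement for random sampling can be made concrete. The sketch below assumes independent, uniformly distributed sample points, which is the standard model but is an assumption added here; the function names are illustrative. Each point lands in the small cube with probability p = 1/2^d, so n points yield n × p hits on average, and the chance of at least one hit is 1 − (1 − p)^n.

```python
import math

def expected_hits(n_points, d):
    """Average number of uniform random points that land in the small cube."""
    return n_points * 0.5 ** d

def prob_at_least_one_hit(n_points, d):
    """Chance that at least one of n independent uniform points hits the cube.

    Computes 1 - (1 - p)**n with log1p/expm1 to stay accurate for tiny p.
    """
    p = 0.5 ** d
    return -math.expm1(n_points * math.log1p(-p))

d = 20
n = 2 ** d
print("points:             ", n)
print("expected hits:      ", expected_hits(n, d))
print("P(at least one hit):", prob_at_least_one_hit(n, d))
```

With 2^20 points in 20 dimensions the expected number of hits is exactly 1, yet the probability of scoring at least one hit is only about 1 − 1/e, or roughly 63 percent. An expectation of a single hit is therefore a bare minimum, not a guarantee.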
If real-world problems were as hard as this one, the situation would be bleak. There would be no hope at all of dealing with a 360-dimensional integral. But we know some problems of that scale do yield to Monte Carlo techniques; a reasonable guess is that the solvable problems have some internal structure that speeds the search. Furthermore, the choice of sampling pattern does seem to make a difference, so there is a meaningful distinction to be made among all the gradations of true, pseudo-, quasi- and nonrandomness.