COMPUTING SCIENCE
Collective Wisdom
Brian Hayes
Unparalleled Parallelism
Factors, primes, codes, rulers—some of these projects sound like they might belong in the Guinness Book of World Records. They're not frivolous, but they're not quite in the mainstream either.
There are plenty of other areas of science and engineering that could benefit from cheap and abundant computing. The traditional big consumers of CPU cycles include the analysis of seismic data, simulations of many-body systems, studies of protein folding and other kinds of computational chemistry, studies of turbulent fluid flow, and lattice models of quantum field theories. Could such tasks be shared over the Net?
When viewed as a massively parallel computer, the Internet has a peculiar architecture. It is extraordinarily rich in raw computing capacity, with tens of millions of processors. But the bandwidth for communication between the processors is severely constrained. The 9,216 Pentiums of the Janus computer can talk to one another at a rate of 6.4 billion bits per second; for a node connected to the Internet by modem, the channel is slower by a factor of 100,000.
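To put those two figures side by side, here is a minimal sketch in Python. The link speed and the factor of 100,000 come from the text above; the one-megabyte payload and the function names are assumptions chosen only for illustration.

```python
# Figures quoted in the text.
JANUS_BITS_PER_SECOND = 6.4e9   # processor-to-processor rate inside Janus
SLOWDOWN_FACTOR = 100_000       # how much slower a modem-connected node is


def modem_bits_per_second(fast_rate=JANUS_BITS_PER_SECOND,
                          factor=SLOWDOWN_FACTOR):
    """Channel rate implied for a modem-connected node."""
    return fast_rate / factor


def transfer_seconds(n_bytes, bits_per_second):
    """Time to move n_bytes over a channel, ignoring latency and overhead."""
    return n_bytes * 8 / bits_per_second


if __name__ == "__main__":
    payload = 1_000_000  # hypothetical 1-megabyte message (an assumption)
    slow = modem_bits_per_second()
    print("modem rate (bits/s):", slow)
    print("Janus transfer (s): ", transfer_seconds(payload, JANUS_BITS_PER_SECOND))
    print("modem transfer (s): ", transfer_seconds(payload, slow))
```

Under these assumptions, a megabyte that crosses Janus in about a millisecond needs roughly two minutes by modem.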
The limits on bandwidth determine what kinds of algorithms run smoothly when spread out over the Net. Consider the case of an n-body simulation, which describes the motion of particles in a force field, such as stars in a galaxy or atoms in a fluid. One parallel n-body algorithm assigns each particle to its own processor, which then tracks the particle's path in space. The trouble is, each processor needs to consult all the other processors to calculate the forces acting on the particle, and so the volume of communication goes up as n². That won't fly on the Net.
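The quadratic growth is easy to see by counting. The sketch below assumes the simplest possible scheme, in which every processor sends one request to every other processor at each time step; the function name is hypothetical, and a real code might batch or broadcast messages differently.

```python
def all_pairs_messages(n):
    """Requests per time step when each of n processors asks every other one.

    Each processor contacts the n - 1 others, so the total is n * (n - 1),
    which grows as n squared.
    """
    return n * (n - 1)


if __name__ == "__main__":
    for n in (10, 100, 1_000, 10_000):
        print(n, "particles:", all_pairs_messages(n), "messages per step")
```

Doubling the number of particles roughly quadruples the traffic, which is why a slow channel becomes the bottleneck long before the processors run out of work.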
Yet n-body problems are not necessarily unsuited to network computing. There are other n-body algorithms, and other ways of partitioning the problem. In particular, "tree codes" organize the computation hierarchically. At the bottom of the hierarchy a processor calculates motions inside a small cluster of particles, without reference to the rest of the universe. At the next level several clusters are combined, ignoring their internal dynamics and looking only at the motions of their centers of mass. Then clusters of clusters are formed, and so on. Tree codes are popular for n-body computations, but whether they can be adapted to Internet computing remains to be seen.
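The hierarchical bookkeeping can be sketched in a few lines of Python. This is only an illustration of the summarizing step, not a full tree code: it assumes that particles adjacent in the input list are also near one another in space (real tree codes partition by position), and every name in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """A particle or a cluster, reduced to total mass and center of mass."""
    mass: float
    x: float
    y: float


def combine(members):
    """Merge particles or clusters into one summary, ignoring internal detail."""
    total = sum(m.mass for m in members)
    x = sum(m.mass * m.x for m in members) / total
    y = sum(m.mass * m.y for m in members) / total
    return Summary(total, x, y)


def build_levels(particles, group_size=2):
    """Return the hierarchy: particles, clusters, clusters of clusters, ...

    Assumes neighbors in the list are neighbors in space (an assumption
    made for brevity; it is not how production tree codes group particles).
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    levels = [list(particles)]
    while len(levels[-1]) > 1:
        current = levels[-1]
        groups = [current[i:i + group_size]
                  for i in range(0, len(current), group_size)]
        levels.append([combine(g) for g in groups])
    return levels


if __name__ == "__main__":
    stars = [Summary(1, 0, 0), Summary(1, 2, 0),
             Summary(2, 10, 0), Summary(2, 12, 0)]
    for depth, level in enumerate(build_levels(stars)):
        print("level", depth, level)
```

The point of the design is that each level passes upward only a mass and a center of mass per cluster, so the amount of data exchanged shrinks at every step of the hierarchy.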
Memory capacity is another serious constraint. Computer owners who are willing to give away CPU cycles may be less eager to let someone fill up their machine's memory or disk drive. Both the bandwidth and the memory limits will be difficult hurdles for programs that operate on large volumes of data, as in seismic analysis or weather prediction.
And yet there are powerful incentives for clearing those hurdles. In round orders of magnitude, a typical personal computer will soon execute 100 million instructions per second; it will have 100 megabytes of memory and a gigabyte of disk storage; it will consume 100 watts of electricity and cost $1,000; 100 million of these machines will be attached to the Internet. Multiply it out: 10 quadrillion instructions per second, 10 billion megabytes of memory, 100 million gigabytes of disk storage, 10 gigawatts of electric-power demand, a price tag of $100 billion. It's probably worth rewriting your software to gain access to such a machine.
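The multiplication can be checked with a short Python sketch. The per-machine figures and the machine count are the round numbers given in the text; the dictionary keys and the function name are merely illustrative.

```python
# Round-number figures from the text.
MACHINES = 100_000_000  # personal computers attached to the Internet

PER_MACHINE = {
    "instructions_per_second": 100_000_000,
    "memory_megabytes": 100,
    "disk_gigabytes": 1,
    "watts": 100,
    "dollars": 1_000,
}


def aggregate(per_machine=PER_MACHINE, machines=MACHINES):
    """Scale each per-machine figure by the number of machines."""
    return {name: value * machines for name, value in per_machine.items()}


if __name__ == "__main__":
    for name, total in aggregate().items():
        print(f"{name}: {total:.0e}")
```

The totals come to 10^16 instructions per second, 10^10 megabytes, 10^8 gigabytes, 10^10 watts and 10^11 dollars, matching the figures quoted above.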