Computing in a Parallel Universe
Multicore chips could bring about the biggest change in computing since the microprocessor
A Notorious Hangout
Parallel processing is hardly a new idea in computer science. Machines with multiple processors were built as early as the 1960s, when it was already widely believed that some form of "massive parallelism" was the way of the future. By the 1980s that future was at hand. David Gelernter of Yale University wrote that "parallel computing, long a notorious hangout for utopians, theorists, and backyard tinkerers, has almost arrived and is definitely for sale."
Throughout that decade and into the early 1990s novel parallel architectures became a wonderful playground for computer designers. For example, W. Daniel Hillis developed the Connection Machine, which had 216 single-bit processors (and 212 blinking red lights). Another notable project was the Transputer, created by the British semiconductor firm Inmos. Transputer chips were single processors designed for interconnection, with built-in communications links and facilities for managing parallel programs.
Software innovators were also drawn to the challenges of parallelism. The Occam programming language was devised for the Transputer, and languages called *Lisp and C* were written for the Connection Machine. Gelernter introduced the Linda programming system, in which multiple processors pluck tasks from a cloud called "tuple space."
What became of all these ventures? They were flattened by the steamroller of mass-market technology and economics. Special-purpose, limited-production designs are hard to justify when the same investment will buy hundreds or thousands of commodity PCs, which you can mount in racks and link together in a loose federation via Ethernet. Such clusters and "server farms" soon came to dominate large-scale computing, especially in the sciences. The vendors of supercomputers eventually gave in and began selling systems built on the same principle. All of the fastest supercomputers are now elaborations of this concept. In other words, parallelism wasn't defeated; it was co-opted.
It's also important to note that parallelism of a different kind insinuated itself into mainstream processor designs. The impressive performance of recent CPU chips comes not only from gigahertz clock rates but also from doing more during each clock cycle. The processors "pipeline" their instructions, decoding one while executing another and storing results from a third. Whenever possible, two or more instructions are executed simultaneously. Through such "instruction-level parallelism" a single CPU can have a throughput of more than one instruction per cycle, on average.