

Terabyte Territory

Brian Hayes

Information Wants to Be Free

I have some further questions about life in the terabyte era. Except for video, it's not clear how to get all those trillions of bytes onto a disk in the first place. No one is going to type it, or copy it from 180,000 CD-ROMs. Suppose it comes over the Internet. With a T1 connection, running steadily at top speed, it would take nearly 20 years to fill up 120 terabytes. Of course a decade from now everyone may have a link much faster than a T1 line, but such an increase in bandwidth cuts both ways. With better communication, there is less need to keep local copies of information. For the very reason that you can download anything, you don't need to.
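The arithmetic behind those two figures is easy to check. The sketch below is only a back-of-envelope calculation under assumptions that the text does not state: a nominal T1 rate of 1.544 megabits per second with no protocol overhead, a CD-ROM capacity of 650 megabytes, and decimal terabytes. The function names are illustrative.

```python
import math

# Assumptions (not given in the text): nominal T1 line rate with no
# protocol overhead, a 650 MB CD-ROM, and decimal (10^12-byte) terabytes.
T1_BITS_PER_SECOND = 1.544e6
CD_ROM_BYTES = 650e6
TARGET_BYTES = 120e12

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_transfer(total_bytes, bits_per_second):
    """Years needed to move total_bytes over a link running flat out."""
    seconds = total_bytes * 8 / bits_per_second
    return seconds / SECONDS_PER_YEAR


def discs_needed(total_bytes, disc_bytes):
    """Whole discs required to hold total_bytes."""
    return math.ceil(total_bytes / disc_bytes)


years = years_to_transfer(TARGET_BYTES, T1_BITS_PER_SECOND)
discs = discs_needed(TARGET_BYTES, CD_ROM_BYTES)

print(f"T1 transfer time: {years:.1f} years")
print(f"CD-ROMs required: {discs:,}")
```

Under these assumptions the transfer comes out a little under 20 years and the disc count a little over 180,000, in line with the figures quoted above. A faster link shortens the time in direct proportion.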

The economic implications are also perplexing. Suppose you have identified 120 terabytes of data that you would like to have on your laptop, and you have a physical means of transferring the files. How will you pay for it all? At current prices, buying 120 million books or 40 million songs or 30,000 movies would put a strain on most family budgets. Thus the real limit on practical disk-drive capacity may have nothing to do with superparamagnetism; it may simply be the cost of content.

On the other hand, it's also possible that the economic lever will act in the other direction. Recent controversies over intellectual property rights suggest that restricting the flow of bits by either legal or technical means is going to be very difficult in a world of abundant digital storage and bandwidth. Setting the price of information far above the cost of its physical medium is at best a metastable situation; it probably cannot last indefinitely. A musician may well resent the idea that the economic value of her work is determined by something so remote and arcane as the dimensions of bit cells on plated glass disks, but this is hardly the first time that recording and communications technologies have altered the economics of the creative arts; consider the phonograph and the radio.

Still another nagging question is how anyone will be able to organize and make sense of a personal archive amounting to 120 terabytes. Computer file systems and the human interface to them are already creaking under the strain of managing a few gigabytes; using the same tools to index the Library of Congress is unthinkable. Perhaps this is the other side of the economic equation: Information itself becomes free (or do I mean worthless?), but metadata—the means of organizing information—is priceless.

The notion that we may soon have a surplus of disk capacity is profoundly counterintuitive. A well-known corollary of Parkinson's Law says that data, like everything else, always expands to fill the volume allotted to it. Shortage of storage space has been a constant of human history; I have never met anyone who had a hard time filling up closets or bookshelves or file cabinets. But closets and bookshelves and file cabinets don't double in size every year. Now it seems we face a curious Malthusian catastrophe of the information economy: The products of human creativity grow only arithmetically, whereas the capacity to store and distribute them increases geometrically. The human imagination can't keep up.

Or maybe it's only my imagination that can't keep up.

© Brian Hayes


