American Scientist, May-June 2011

COMPUTING SCIENCE

Bit Lit

With digitized text from five million books, one is never at a loss for words

Brian Hayes

Books are being blown to bits. New ones are “born digital”; millions of old ones are being assimilated into the mind of the machine.

Some people question the wisdom of this transition to digital reading matter. Paper and ink have served us pretty well for a thousand years or more. Is it prudent to store everything we know in tiny smudges of electric charge we can’t see or touch? Critics also worry about who will wind up owning our cultural heritage. And then there are the sentimentalists, who say it’s just not the same curling up by the fireside with a good Kindle.

Well, I for one welcome our new computer overlords. And I would like to point out that books are not only for reading. There are other things we (and our computers) can do with the words in books. We can count them, sort them, make comparisons among them, search for patterns in their distribution, classify them, catalog them, analyze them. Yes, these are nerdy, mechanical, reductionist assaults on literature—but they are also methods of extracting meaning from text, just as reading is. And they scale better.



