Kurt Bollacker's article, Avoiding a Digital Dark Age, presents a somewhat disconcerting argument that all human knowledge can become extinct due to deterioration of the media on which it is stored or of the mechanism for decoding the media. If one thinks about it, the lifespan of most human knowledge, without constant input of energy to maintain it in a good storage environment, is at most a few hundred years given the typical formats on which it is stored. I once saw a National Geographic special that showed that if humans suddenly disappeared, all of their city buildings would collapse within 200 years due to wear and tear or oxidation of the concrete, steel and glass supporting the buildings. Imagine what would happen to records stored on paper or fragile computerized records. Examination of data longevity shows few examples of data that survived without maintenance for over two thousand years. Some stone tablets, or some papyri such as the Oxyrhynchus papyri that were preserved in Egypt's dry desert sand, away from the flooding of the Nile River, are examples.
The author points out that the Rosetta project preserves information by etching it onto nickel wafers. This is a good idea given that only a microscope is needed to view the information. It took several thousand years of civilization to pass before scientists invented the microscope, but at least this technology (which is the data reading device for the nickel slabs) can be systematically created by hand-grinding glass lenses. However, we might also take note of how nature has at times preserved information not just for thousands but millions of years. Specifically, rocks can survive for millions of years without environmental maintenance, particularly if protected from erosion by being buried. Amber can protect living biomass for millions of years. There are ways of fusing amber pieces together. Why not take a rock, hollow it out, encase the micro-etched nickel plate within it, pour some molten glass over it to seal the nickel within the stone, and encase the entire thing in amber? This could last millions of years. One can also micro-etch onto the nickel slab a picture atlas of the language in which the data is encoded. So both the data and a visual atlas of pictures, with the vocabulary associated with each picture, can be enclosed on the slab.
These amber/nickel/stone data clusters could last millions of years just being buried in the ground or even tossed around, just like fossilized dinosaur eggs.
posted by John Mamoun
February 19, 2010 @ 6:55 AM
One idea to ensure preservation of decoding techniques is to use progressively more complicated (denser) coding algorithms, starting with plain English text written on paper and finishing with compressed digital text on a hard disk. The plain text on paper could describe how to read the next level of encoded instructions on a denser medium than paper. This next medium would describe how to decode the medium after it, and so on.
The text could even explain how to build machines to decipher the encoded data, all the way up to a computer to read the hard disk!
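The layered idea above can be loosely sketched in software. This is only an analogy under stated assumptions: the three encodings ("plain", "base64", "zlib"), the "next: ..." header line, and the function names are all hypothetical choices made for illustration, not part of any real preservation scheme. Each layer, once decoded, names the encoding of the layer after it, so a reader only needs to know in advance how to read the first, plainest layer.

```python
import base64
import zlib

# Hypothetical encoders/decoders, ordered from simplest to densest.
ENCODERS = {
    "plain": lambda data: data,
    "base64": base64.b64encode,
    "zlib": zlib.compress,
}
DECODERS = {
    "plain": lambda data: data,
    "base64": base64.b64decode,
    "zlib": zlib.decompress,
}


def build_chain(payload, encodings):
    """Store one layer per encoding; each layer announces the next encoding.

    The last layer carries the payload; earlier layers carry only a
    placeholder standing in for human-readable decoding instructions.
    """
    layers = []
    for i, encoding in enumerate(encodings):
        is_last = i + 1 == len(encodings)
        next_encoding = "none" if is_last else encodings[i + 1]
        body = payload if is_last else b"(instructions for reading the next layer)"
        content = ("next: " + next_encoding + "\n").encode("ascii") + body
        layers.append(ENCODERS[encoding](content))
    return layers


def read_chain(layers):
    """Decode layer by layer, knowing in advance only that layer 0 is plain."""
    encoding = "plain"
    body = b""
    for blob in layers:
        decoded = DECODERS[encoding](blob)
        header, _, body = decoded.partition(b"\n")
        encoding = header.decode("ascii").split(": ", 1)[1]
    return body


if __name__ == "__main__":
    chain = build_chain(b"the archived data", ["plain", "base64", "zlib"])
    print(read_chain(chain))
```

In a physical version, the placeholder text in each early layer would be the actual instructions for reading (or building a machine to read) the next medium.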
If there is a worry of even forgetting the language used to describe the decoding mechanism, then we could simply start with a mathematical representation teaching us the English language. Mathematics will never be forgotten or superseded, as it is universal.
Just a thought!
posted by Walid Tabar
February 20, 2010 @ 5:29 PM
Important considerations not addressed in the article:
- Ensuring that working copies of data are not corrupted or modified. If working copies are not read-only, there is a risk that they will be modified by software that accesses them.
- Ensuring that copies are accurate digital copies of the original, using data verification techniques such as checksums. Otherwise a single mistake in copying the data from one medium to another means the data will be lost, and further efforts will only copy damaged data.
- Appropriate cataloguing so that data is easily accessed
I've run into these problems preserving family photos. I have photos dating back to the 1990s but most of my collection starts around 2000. To the best of my knowledge I have not lost a single digital photo yet, but I have been surprised by programs modifying tags etc. when I recently started recording and comparing checksums between copies of my photos.
I intend for my photos to outlive me at the very least.
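The checksum comparison described above could look something like the following. This is a minimal sketch, not the tool actually used: the choice of SHA-256 and the function names (`sha256_of`, `build_manifest`, `compare_manifests`) are assumptions made for illustration. It records a checksum for every file under a folder and reports which files are missing from, extra in, or changed in a second copy.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(original, copy):
    """Report files missing from, extra in, or changed in the copy."""
    shared = set(original) & set(copy)
    return {
        "missing": sorted(set(original) - set(copy)),
        "extra": sorted(set(copy) - set(original)),
        "changed": sorted(name for name in shared if original[name] != copy[name]),
    }


if __name__ == "__main__":
    import sys

    # Usage: python compare_copies.py ORIGINAL_DIR COPY_DIR
    report = compare_manifests(build_manifest(sys.argv[1]), build_manifest(sys.argv[2]))
    for kind, names in report.items():
        print(kind, names)
```

A file whose tags were rewritten by a program would show up under "changed" even though the image itself looks the same, because the checksum covers every byte of the file.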
posted by Sammy Yousef
February 23, 2010 @ 7:22 PM
You should have cited Jeff Rothenberg, who covered most of these points 15 years ago in his article "Ensuring the Longevity of Digital Documents" in the January 1995 issue of Scientific American.
posted by David Rosenthal
February 23, 2010 @ 7:52 PM
This is a very interesting article that raises many interesting questions about making sure you can get back to the bits you stored. Many organisations now have organised repositories of information to mitigate this risk, but scientists can be lax at using these.
The problem, however, does not end there. When you get the bits back in 10 years, the file format may be obsolete and the data useless. An additional layer on top of "bit preservation" is format preservation. National archives are leading the way in research into this problem, including the "Active Preservation" approach devised by the UK National Archives and explored by the EU PLANETS project. The NSF DataNet project in the US is also looking at this for scientific data.
posted by Jon Tilbury
February 25, 2010 @ 5:13 AM
The risk of losing electronic data at the hands of a terrorist with an EMP device frightens me. Couldn't some well-placed, well-timed electromagnetic waves destroy some of our most valuable (digital) intellectual assets?
posted by Scott Schablow
September 2, 2010 @ 12:57 AM