This Article From Issue

May-June 2010

Volume 98, Number 3
Page 181

DOI: 10.1511/2010.84.181

To the Editors:

I was quite intrigued by Kurt D. Bollacker’s recent column, “Avoiding a Digital Dark Age” (March–April). The article pointed out two substantial shortcomings in storing information digitally. First, a single error in a digital recording can have rather dramatic effects. Thus, one needs not only to continually back up and copy digital information but also to do this in an error-correcting fashion. Second, when copying digital information, one has to be mindful that computer formats keep changing. It is important to keep current with formats and remain backwards compatible.

I would like to propose a somewhat whimsical suggestion for overcoming these two issues. Perhaps we could encode digital information that we wish to preserve in the DNA of bacteria, more specifically in the regions between genes. Note, first of all, that the information would be recorded in a most fundamental and universal format, the natural DNA code of A, G, C and T. Secondly, because bacteria naturally replicate, backups of the information would be made in an error-correcting fashion using DNA polymerase. Although this copying is subject to some random mutation that evades correction, we could increase the fidelity by averaging the “readout” over a population of bacteria, rather than taking it from a single individual.

Mark Gerstein
Yale University
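
The "averaging" that Gerstein describes can be read as a position-by-position majority vote across many copies of the same sequence. The Python sketch below is only an illustration of that idea, not a method proposed in the letter. The two-bits-per-base mapping, the function names, the 1 percent substitution rate and the population of 101 copies are all assumptions chosen for the example. It also assumes that mutations are independent point substitutions; insertions and deletions, or mutations inherited by a whole lineage, would defeat a simple positional vote.

```python
import random
from collections import Counter
from typing import List

# Assumed, purely illustrative mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Write each byte as four bases, two bits per base, high bits first."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def decode(dna: str) -> bytes:
    """Inverse of encode(); expects a length that is a multiple of four."""
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def mutate(dna: str, rate: float, rng: random.Random) -> str:
    """Replace each base by a different one with probability `rate`.

    Simplifying assumption: point substitutions only, no insertions
    or deletions, and each copy mutates independently.
    """
    return "".join(
        rng.choice([b for b in BASES if b != base]) if rng.random() < rate else base
        for base in dna
    )


def consensus(population: List[str]) -> str:
    """Take the most common base at each position across equal-length copies."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*population)
    )


# Hypothetical usage: 101 independently mutated copies of one message.
rng = random.Random(0)
message = b"Avoiding a Digital Dark Age"
original = encode(message)
population = [mutate(original, 0.01, rng) for _ in range(101)]
recovered = decode(consensus(population))
print(recovered == message)
```

Under these assumptions a single copy will usually carry an error or two, while the consensus recovers the message, because it is very unlikely that most copies mutate at the same position.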

To the Editors:

Kurt Bollacker’s problem with restoring his file backups is an excellent example of the importance of free and open-source software (FOSS). Dr. Bollacker lost the ability to recover his files because he had lost his copy of the proprietary software that created the backup. Also, the company that created the software no longer exists and he wasn’t able to locate the software on the Internet.

While much FOSS comes without a price tag, the term “free” in this context refers not to the cost but to the principle that the software can be copied, studied, changed and improved. The availability of source code avoids the common problem that the type of machine that ran the original software no longer exists. The source code can be recompiled or rewritten for another machine.

Devlin Gualtieri
Ledgewood, N.J.

Dr. Bollacker responds:

I believe that open-source software is a necessary part of solving the problem of digital data preservation. FOSS has many benefits for software development and data handling: its mutability, lack of licensing cost and didactic qualities make our world a better place. But these benefits are not directly related to digital data preservation. The main virtue of open-source software in data preservation is that it can be thought of as highly precise (if not easy to read) documentation for digital formats. We don’t need a manual if the software can be recompiled to read old data and, ideally, convert it into new formats. Even if ancient software can no longer be run, the source can allow its functionality to be duplicated in new code. The need for such documentation, and thus the need to preserve source code, is diminished if the data formats are very simple or highly self-documenting (as are some XML schemas). Ideally, the formats would be self-documenting and the source code would still be around, giving us a nice bit of redundant protection.
