To Be Free, or Not To Be

Roger Harris

Imagine walking into your downtown library and finding that you can't check out a book without paying a fee. What you took for granted as a free service, you now have to pay for.

A similar situation may soon face biologists who study biodiversity, the variety and number of species. In the 1990s, gene-sequence data and genetically modified organisms were commercialized, raising new questions about the privatization of scientific data. Today, biodiversity databases are growing and struggling for funds—which may come in the form of private investment that could transform what is now an open, public resource reliant on government and nonprofit funding.

The word "biodiversity" was popularized in the late 1980s. Scientists have since established that current rates of biodiversity loss rank with those of geological mass extinctions. It's no surprise that biologists are anxious to catalog species as quickly as possible.

One major effort rapidly moving ahead is Species 2000. In March 2005, this consortium of databases announced that it had digitized information on half a million species, perhaps a quarter of all those described. (The planet is thought to have upwards of 5 million species in all.)

The aim is to gather descriptions of all known species and their associated information into a single database. But an electronic collection of the biological equivalent of baseball cards is not enough. The data must be organized and made accessible in a way useful for conservationists, natural resource managers, taxonomists, the biotechnology research community and the public. These are the users of biodiversity informatics, or information about species—their descriptions, locations, population sizes and trends, taxonomic relationships, museum specimens, field observations and images. According to Mark Schaefer, president of NatureServe, a leading organization involved in compiling biodiversity databases, "Without this information, it will be difficult to make informed decisions about conservation and sustainable development."

Biodiversity databases, each with its own way to codify, organize and search data, have proliferated as experts in various taxonomic groups have built catalogs to meet their specific needs. (An example is the well-known FishBase.)

The Catalogue of Life Programme is the biggest and boldest attempt to integrate these databases. It is a joint agreement between Species 2000 (acting as a coordinating umbrella organization), the Global Biodiversity Information Facility (GBIF) and the Integrated Taxonomic Information System (ITIS). ITIS, the main U.S. contributor, is in turn a partnership of federal agencies and nonprofit organizations (themselves collaborations!) including NatureServe, the U.S. Geological Survey, the Smithsonian Institution and the National Biological Information Infrastructure. The organizational layers illustrate the complexity and cost of developing gigantic data sets as well as the extent of public-agency involvement.

Scientists at universities, museums and libraries have compiled much of the information in the catalogs with government support. Perhaps because public organizations play such a dominant role in biodiversity informatics, most of the data are readily accessible at no cost.

As Peter Raven, director of the Missouri Botanical Garden and a leader in biodiversity research and conservation, points out, "It is traditional to have lists of species and material associated with those lists in the public domain."

Stuart Pimm of the Nicholas School of the Environment and Earth Sciences at Duke University agrees: "So many of the data are collected by state and federal agencies, there is enormous public pressure to keep access open."

A hint of private interest in the growing databases came in January 2004, when Thomson Scientific, the world's largest information corporation, acquired Biosis, known for indexing and abstracting life-sciences journals. Biosis managed the Zoological Record, whose computers had hosted the Species 2000 project to that point. With the acquisition, Thomson now hosted the Species 2000 database, an arrangement that continues.

Although Jim Pringle, Thomson's vice president of development, says the company does not have definite plans to privatize biodiversity data, Thomson promptly applied to become a member of Species 2000. Frank Bisby, executive director of Species 2000, said the Species 2000 directors "took advice … and decided that [it] was not appropriate for a subsidiary of a major multinational."

"Technically," Bisby said, "their application is on hold, and we have approached them to set up some other sort of relationship." Bisby points out that Species 2000 "does not own the contributory databases, which remain the totally controlled property of their original custodians."

Raven, also a past president of Sigma Xi, sees some parallels between the biodiversity catalogs and gene-sequence databases, but he expects that "the information about biodiversity will have a wider and more diverse appeal, if not the commercial importance."

However, Pimm wonders whether biodiversity data have commercial value that can be returned to investors. "Many people feel environmental stewardship is a burden rather than an opportunity," Pimm said. "I just don't see what the market is." The various organizations compiling data "would commercialize if they thought they could," Pimm said. "They need to survive and are not flush with money."

NatureServe's Schaefer says biodiversity informatics workers need "to identify new sources of funds to generate high-quality, consistent biodiversity information.... A greater investment is needed by both the public and private sectors in the development of data.

"Commercial markets will focus increasingly on value-added products that put biodiversity information in a broader context by linking it with physical and socioeconomic information," Schaefer said. "Revenues from these products will help support the generation and maintenance of biodiversity data."

This may prove difficult. Pimm notes that potentially valuable data, such as those used in biodiversity prospecting (the search for new chemical compounds that may have commercial value), have been "traded informally." Failed business ventures such as Shaman Pharmaceuticals (which planned to survey native healers for knowledge of the curative properties of local plants and animals) "illustrate the difficulty of making money from biodiversity data."

Thomson's Pringle says the company is reviewing how to work biodiversity informatics into Thomson's vision for its Web of Science, but it's "hard to predict the impact of open-access research." He is careful to reassure scientists that Thomson wants to "work out good relationships with organizations who are providing free resources."

Raven says the big question is how underfunded biodiversity researchers can benefit from and retain access to data they helped compile, particularly in developing countries. Although the Species 2000 consortium is for the moment independent of Thomson, the company could find ways to add value to data. For example, developing new system architectures could enhance the efficiency of data mining and analysis. "The advantages," Raven said, "could be great."

Raven suggests that governments or foundations could subsidize access to proprietary data architectures by people in poorer countries. Alternatively, a commercial venture "could build [developing-country access] into the pricing structure," as some journal publishers do. In either case, he says, "ways must be found to subsidize such efforts if they are to work over the long run." If they don't work, the world's biodiversity data and the life it represents may end up as little more than an electronic collection of biological baseball cards.
