SCIENCE OBSERVER
To Be Free, or Not To Be
Roger Harris
Imagine walking into your downtown library and finding that you can't
check out a book without paying a fee. What you took for granted as
a free service, you now have to pay for.
A similar situation may soon face biologists who study biodiversity,
the variety and number of species. In the 1990s, gene-sequence data
and genetically modified organisms were commercialized, raising new
questions about the privatization of scientific data. Today,
biodiversity databases are growing and struggling for
funds—which may come in the form of private investment that
could transform what is now an open, public resource reliant on
government and nonprofit funding.
The word "biodiversity" was popularized in the late 1980s.
Scientists have since established that current rates of biodiversity
loss rank with those of geological mass extinctions. It's no
surprise that biologists are anxious to catalog species as quickly
as possible.
One major effort rapidly moving ahead is Species 2000. In March
2005, this consortium of databases announced that it had digitized
information on half a million species, perhaps a quarter of all
those described. (The planet is thought to have upwards of 5 million
species in all.)
The aim is to gather descriptions of all known species and their
associated information into a single database. But an electronic
collection of the biological equivalent of baseball cards is not
enough. The data must be organized and made accessible in a way
useful for conservationists, natural resource managers, taxonomists,
the biotechnology research community and the public. These are the
users of biodiversity informatics, or information about
species—their descriptions, locations, population sizes and
trends, taxonomic relationships, museum specimens, field
observations and images. According to Mark Schaefer, president of
NatureServe, a leading organization involved in compiling
biodiversity databases, "Without this information, it will be
difficult to make informed decisions about conservation and
sustainable development."
Biodiversity databases, each with its own way to codify, organize
and search data, have proliferated as experts in various taxonomic
groups have built catalogs to meet their specific needs. (An example
is the well-known FishBase.)
The Catalogue of Life Programme is the biggest and boldest attempt
to integrate these databases. It is a joint agreement between
Species 2000 (acting as a coordinating umbrella organization), the
Global Biodiversity Information Facility (GBIF) and the Integrated
Taxonomic Information System (ITIS). ITIS, the main U.S.
contributor, is in turn a partnership of federal agencies and
nonprofit organizations (themselves collaborations!) including
NatureServe, the U.S. Geological Survey, the Smithsonian Institution
and the National Biological Information Infrastructure. The
organizational layers illustrate the complexity and cost of
developing gigantic data sets as well as the extent of public-agency involvement.
Scientists at universities, museums and libraries have compiled much
of the information in the catalogs with government support. Perhaps
because public organizations play such a dominant role in
biodiversity informatics, most of the data are readily accessible at
no cost.
As Peter Raven, director of the Missouri Botanical Garden and a
leader in biodiversity research and conservation, points out,
"It is traditional to have lists of species and material
associated with those lists in the public domain."
Stuart Pimm of the Nicholas School of the Environment and Earth
Sciences at Duke University agrees: "So many of the data are
collected by state and federal agencies, there is enormous public
pressure to keep access open."
A hint of private interest in the growing databases came in January
2004, when Thomson Scientific, the world's largest information
corporation, acquired Biosis, known for indexing and abstracting
life-sciences journals. Biosis managed the Zoological Record, whose
computers had hosted the Species 2000 project to that point. With
the acquisition, Thomson became the host of the Species 2000
database, an arrangement that continues.
Although Jim Pringle, Thomson's vice president of development, says
the company does not have definite plans to privatize biodiversity
data, Thomson promptly applied to become a member of Species 2000.
Frank Bisby, executive director of Species 2000, said the Species
2000 directors "took advice … and decided that [it] was
not appropriate for a subsidiary of a major multinational."
"Technically," Bisby said, "their application is on
hold, and we have approached them to set up some other sort of
relationship." Bisby points out that Species 2000 "does
not own the contributory databases, which remain the totally
controlled property of their original custodians."
Raven, also a past president of Sigma Xi, sees some parallels
between the biodiversity catalogs and gene-sequence databases, but
he expects that "the information about biodiversity will have a
wider and more diverse appeal, if not the commercial importance."
However, Pimm wonders whether biodiversity data have commercial
value that can be returned to investors. "Many people feel
environmental stewardship is a burden rather than an
opportunity," Pimm said. "I just don't see what the market
is." The various organizations compiling data "would
commercialize if they thought they could," Pimm said.
"They need to survive and are not flush with money."
NatureServe's Schaefer says biodiversity informatics workers need
"to identify new sources of funds to generate high-quality,
consistent biodiversity information.... A greater investment is
needed by both the public and private sectors in the development of data.
"Commercial markets will focus increasingly on value-added
products that put biodiversity information in a broader context by
linking it with physical and socioeconomic information,"
Schaefer said. "Revenues from these products will help support
the generation and maintenance of biodiversity data."
Generating such revenues may prove difficult. Pimm notes that potentially valuable data,
such as those used in biodiversity prospecting, the search for new
chemical compounds that may have commercial value, have been
"traded informally." Failed business ventures such as
Shaman Pharmaceuticals (which planned to survey native healers for
knowledge of the curative properties of local plants and animals)
"illustrate the difficulty of making money from biodiversity data."
Thomson's Pringle says the company is reviewing how to work
biodiversity informatics into Thomson's vision for its Web of
Science, but it's "hard to predict the impact of open-access
research." He is careful to reassure scientists that Thomson
wants to "work out good relationships with organizations who
are providing free resources."
Raven says the big question is how underfunded biodiversity
researchers can benefit from and retain access to data they helped
compile, particularly in developing countries. Although the Species
2000 consortium is for the moment independent of Thomson, the
company could find ways to add value to data. For example,
developing new system architectures could enhance the efficiency of
data mining and analysis. "The advantages," Raven said,
"could be great."
Raven suggests that governments or foundations could subsidize
access to proprietary data architectures by people in poorer
countries. Alternatively, a commercial venture "could build
[developing-country access] into the pricing structure," as
some journal publishers do. In either case, he says, "ways must
be found to subsidize such efforts if they are to work over the long
run." If they don't work, the world's biodiversity data and the
life it represents may end up as little more than an electronic
collection of biological baseball cards.