DNA Research Commons Scaled Back
Concerned about violating DNA donor confidentiality, U.S. research agency sticks with decision to limit data exchange
The U.S. government’s heralded plan to help researchers freely share some genetic research data online to speed up disease research is now a dream deferred.
The National Human Genome Research Institute is sticking with a decision, made last summer, to remove free-access, pooled genomics data it started posting on the Internet in 2006. Other high-profile research organizations, including the Broad Institute, are doing the same.
The retreat began after investigators at Arizona’s Translational Genomics Research Institute and colleagues discovered how to detect individual genetic profiles in pools of 1,000 or more DNA donors. Their bioinformatics tools are so brawny that they produced positive donor IDs even from averaged data alone, which were all that the institute was sharing freely.
Geneticists, like all scientists, usually champion data sharing. But people who donate DNA for research studies typically are assured that their identities will remain confidential. Government officials were no longer certain they could keep that promise.
“We must protect the rights of the individuals participating in the studies,” said Laura Lyman Rodriguez, senior advisor to the genome research institute director.
Rodriguez acknowledges that the risk of revealing a donor’s identity is very small. A detailed genomic profile of a person would be required to make a positive ID in a research database using the new bioinformatics approach.
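To see why averaged data alone can give a donor away, consider a toy simulation. It is only a sketch under stated assumptions, not the researchers' actual software: it uses a simple distance-based test that asks, SNP by SNP, whether a known profile sits closer to the pool's average than to the average of an unrelated reference group. The function names, the simulated genotypes, and the pool of 50 people (far smaller than the 1,000-donor pools mentioned above, to keep the run short) are all hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev


def membership_score(profile, pool_freqs, reference_freqs):
    """t-like statistic; large positive values suggest the profile is in the pool."""
    # Per SNP: is this person closer to the pool average than to the reference?
    diffs = [abs(y - ref) - abs(y - pool)
             for y, pool, ref in zip(profile, pool_freqs, reference_freqs)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def simulate_person(allele_freqs, rng):
    """Hypothetical genotype: fraction of 'A' alleles (0, 0.5 or 1) at each SNP."""
    return [((rng.random() < f) + (rng.random() < f)) / 2 for f in allele_freqs]


def average_profile(people):
    """Pooled data: the mean allele fraction at each SNP across a group."""
    return [sum(column) / len(column) for column in zip(*people)]


# Assumed toy setup: 20,000 independent SNPs, 50 donors, 50 reference people.
rng = random.Random(0)
n_snps, group_size = 20_000, 50
allele_freqs = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]

pool = [simulate_person(allele_freqs, rng) for _ in range(group_size)]
reference = [simulate_person(allele_freqs, rng) for _ in range(group_size)]
outsider = simulate_person(allele_freqs, rng)

pool_freqs = average_profile(pool)            # the only pool data "published"
reference_freqs = average_profile(reference)

member_score = membership_score(pool[0], pool_freqs, reference_freqs)
outsider_score = membership_score(outsider, pool_freqs, reference_freqs)
print(f"donor in pool: {member_score:.1f}   outsider: {outsider_score:.1f}")
```

In this toy setup the donor's score should stand well clear of the outsider's, which hovers near zero. The test only works because the person running it already holds the individual's detailed profile, which is the condition Rodriguez describes.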
In fact, the risk of slowing disease research might be greater. A changed protocol requiring NIH approval to use the once-free-access data could delay or maybe even prevent some studies, some scientists say.
“If you are a run-of-the-mill bioinformatics person trying to find correlations, this is going to hinder you,” said Brad Malin, an assistant professor at Vanderbilt University School of Medicine, who evaluates genetic-data privacy approaches.
In 2006, National Institutes of Health officials hailed the inclusion of open-access data in the now-modified Database of Genotypes and Phenotypes (dbGaP) as a significant stride toward making the most of disparate genetic studies. Access to individual sequences, scrubbed of any identifying information, always required authorization from various institute review panels. But averaged data were to be available simply for the taking.
Most illnesses have multiple genetic roots. Of great interest are patterns of variation in single nucleotide polymorphisms (SNPs), which are single-point mutations in a genome. Pooled results of large genetic studies are good places to hunt for genetic patterns within diseases.
At the time the data were pulled last summer, the open-access portion of dbGaP data included information on genetic variation observed in people with asthma, prostate cancer, cardiovascular diseases, diabetes, Parkinson’s disease and amyotrophic lateral sclerosis, among other serious ailments.
Scientists had downloaded data 491 times before access was limited, Rodriguez said. The most popular genetic data were part of the sizable Framingham Heart Study.
The desire to make the most of such genetic studies remains very much alive, said David Altshuler, the director of the program in medical and population genetics at the Broad Institute in Cambridge, Massachusetts. But the ground is shifting fast beneath geneticists’ feet, and they must respond, he said.
For one, said Altshuler and others, the number of retail genomics companies willing to sequence people’s DNA for fees is growing. No one truly knows into whose hands such data could fall and how they might be used. One could imagine nothing good coming of an enemy’s discovery that a person enrolled in a study on drug or alcohol addiction, say, or on the genetic profile of a degenerative disease.
At the same time, law enforcement is expanding the way it uses DNA to track suspects, even drawing on DNA from innocent relatives of suspects in criminal cases. In one case, investigators secretly obtained from a medical clinic a DNA sample that belonged to the daughter of a suspected serial killer to hasten their detective work. That sort of thing could conceivably happen one day with genetics research data.
“There is social good in sharing data, but it doesn’t trump all other things,” Altshuler said.
David Karp, a physician and genetics researcher at the University of Texas Southwestern Medical Center, said he suspects that people participating in genetic studies might be more willing to accept these sorts of risks than the people deciding how to share data believe. One less-paternalistic way to deal with emerging threats to confidentiality, he said, is to inform donors that the risks exist and allow them to decide whether they are willing to live with them.
“Going forward, the way to solve this is to consent people in more contemplative ways,” Karp said.
Rodriguez, the NIH official, said that improved consent models might be one route to making pooled data more freely available again. But figuring out how to do that responsibly will take time. Multiple questions must first be answered.
“How much do you tell people without scaring them? How do you communicate the level of risk? What level of risks are people willing to tolerate? How frequently do they want to be asked?” Rodriguez said, reciting some of them.
And the list only goes on from there. —Catherine Clabby