MIT team seeks to protect privacy, boost access to genetic data using misinformation

Researchers at MIT have developed a framework to ensure genetic data can be shared freely without putting individuals’ privacy at risk. The approach, which builds on a privacy technique also used by Apple ($AAPL), adds a form of controlled misinformation, statistical noise, to the answers to search queries, preventing the identification of individuals while still delivering results that are accurate enough for research purposes.

Two researchers from the Computer Science and Artificial Intelligence Laboratory at MIT worked on the approach with a collaborator at Indiana University, leading to a publication in Cell Systems. The paper describes the adaptation of differential privacy to genome-wide association studies (GWAS). Differential privacy, a mathematical privacy framework adopted by Apple, uses techniques including the injection of statistical noise into query results to ensure a researcher can learn as much as possible about a dataset as a whole without learning anything about the individuals in it.
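To make the idea concrete, here is a minimal sketch of the core mechanism behind differential privacy: the Laplace mechanism, which perturbs an aggregate statistic (such as an allele count in a GWAS cohort) with noise calibrated to a privacy parameter epsilon. This is an illustrative example of the general technique, not the authors' specific algorithm; the function names and parameters are our own.

```python
import math
import random

def laplace_noise(scale):
    """Draw a sample from the Laplace(0, scale) distribution via inverse CDF."""
    u = random.random() - 0.5  # uniform on [-0.5, 0.5)
    # Inverse CDF of the Laplace distribution: -scale * sgn(u) * ln(1 - 2|u|)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(true_count, epsilon, sensitivity=1.0):
    """Return a differentially private version of an aggregate count.

    sensitivity is the most one individual can change the statistic
    (1 for a simple count); smaller epsilon means stronger privacy
    but noisier answers.
    """
    return true_count + laplace_noise(sensitivity / epsilon)
```

A querier asking how many participants carry a given variant would receive `private_count(n, epsilon)` rather than the exact count `n`. Averaged over many studies the noise washes out, but no single answer reveals whether any one person is in the dataset.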

This concept holds obvious appeal for GWAS, a field torn between demands to protect the privacy of individuals and the need to realize the health benefits of widespread, large-scale analyses. As it stands, this balance is struck through anonymization, a method that carries the risk of being reversed, and through the use of gatekeepers. As the Cell Systems authors see it, the gatekeeper approach is slowing down the rate at which science advances.


“Right now, what a lot of people do, including the NIH, for a long time, is take all their data--including, often, aggregate data, the statistics we’re interested in protecting--and put them into repositories,” Sean Simmons, an MIT postdoc and first author on the paper, said in a statement. “And you have to go through a time-consuming process to get access to them.” MIT’s Bonnie Berger, a coauthor of the paper, said it can take months to gain access to a repository.

Whether differential privacy is the answer to these problems remains to be seen. The authors see it as being suitable for use in situations “in which privacy concerns would make alternative approaches cumbersome or impossible.” Figuring out which situations the approach is best suited to will require experimentation, something at least one researcher not involved with the paper is keen to see happen.

“Hopefully, this will encourage the biomedical community to test this promising approach at large scale and, if it’s successful, define best practices and develop related tools,” Jean-Pierre Hubaux, a professor of computer science at the École Polytechnique Fédérale de Lausanne, said.

- read the release
- and the abstract


