Researchers join forces to build database of 50M genetic variants

By Nick Paul Taylor Oct 19, 2014 9:01pm


The genetic research community is about to get access to a lot of data. On October 20, two public databases generated by whole-genome and exome sequencing at multiple research institutions will be unveiled.

Jonathan Marchini of the University of Oxford and a small group of collaborators began work on the whole-genome database last year, Nature News reports. By pulling in data from multiple sources, the collaborators hoped to build a bigger database than any one research institution could create. Such scale could enable researchers to link patterns of genetic variations to diseases. As it stands, the database houses 50 million genetic variants gathered by 23 research centers.

The scale of the project is a testament to the growing recognition that, even though the whole-genome databases generated by individual research institutions are bigger than ever, they may still not be big enough. Pooling resources may be the fastest way to spot patterns in the data. "There is a lot of goodwill between the people in the field; they all understand the benefits of doing this and have worked hard to make their data available," Marchini said.

On the same day that Marchini unveils the Haplotype Reference Consortium database, a separate database of exome data will also be shown to the world. The Exome Aggregation Consortium has collated exome data from 63,000 human genomes originally analyzed by an array of researchers. Again, the team behind the consortium hopes the scale of the database will yield insights beyond the reach of individual centers.

- read the Nature News feature