Big data has won a $200 million endorsement from the White House, and the National Institutes of Health (NIH) stands to gain a sizable share of the funding. The NIH revealed on March 29 that it is one of six federal agencies in line to benefit from the initiative, which aims to find solutions for taming huge datasets.
The NIH and the National Science Foundation will jointly award up to $25 million from the initiative for 15 to 20 projects in science and engineering fields. In the biomedical arena, for instance, the NIH wants to fund projects that could enable the crunching of massive amounts of data to aid scientific investigations. The agency also announced that digital data from the international 1000 Genomes Project will be hosted in the cloud by Amazon Web Services and made available for free.
President Obama's big data bet follows criticism from the scientific community that, despite the billions of dollars invested in genomic research and molecular biology studies, a relative pittance has gone into supporting the gigantic datasets that have resulted from those efforts. For example, cheap and fast DNA sequencing has motivated federally funded labs to explore sequencing, yet few of them have the internal computing power to manage and analyze the massive genomic datasets.
How massive? The 1000 Genomes Project eats up 200 terabytes of storage, the equivalent of some 16 million filing cabinets stuffed with text, or more than 30,000 DVDs, according to the NIH.
"Improving access to data from this important project will accelerate the ability of researchers to understand human genetic variation and its contribution to health and disease," stated Dr. Eric Green, director of the NIH's National Human Genome Research Institute, which is one of the backers of the 1000 Genomes Project.
- here's the NIH's release
- check out InformationWeek's article