New fast and cheap DNA sequencers have given rise to the granddaddy of Big Data in biotech. How cheap and fast? Well, we've come a long way since the Human Genome Project, which wrapped up in 2003 after taking more than a decade and a total budget in the billions to identify more than 20,000 genes and sequence all 3 billion bases of the genome. According to the NIH's National Human Genome Research Institute, decoding an entire genome cost around $10 million in early 2007. Since then, driven largely by the emergence of next-gen sequencers, that price tag has plummeted faster than Moore's Law would predict and is now quickly approaching $1,000 per genome. And Illumina ($ILMN) and others expect to sequence entire genomes in hours.
Now technologists are planning for a world where hundreds of millions of people have their genomes sequenced. Data storage aside, the challenge becomes interpreting these genomes efficiently to enable discoveries about diseases and human health. A bevy of software companies has emerged over the past several years to tackle the interpretation challenge in genomics, including DNAnexus, Knome and NextBio. At NextBio, for example, tech whizzes have used frameworks from Google ($GOOG), built for pulling off massive computing tasks, to analyze biological data. And the company recently teamed up with computer chip giant Intel ($INTC) to improve Hadoop applications for Big Data analysis in genomics.
"We're very excited and very encouraged by the pace of data becoming available," Satnam Alag, chief technology officer and vice president of engineering at NextBio, said in an interview with FierceBiotechIT. "NextBio being a data-driven engine, it feeds on the consumption of large amounts of Big Data."