Broad, with Intel’s support, makes GATK open source again, commits to model for all data science software

The Broad Institute is set to release version 4 of its Genome Analysis Toolkit (GATK) under an open source software license, ending a five-year experiment with a hybrid model.

Broad switched to a mixed open and closed-source model for its variant discovery and genotyping toolkit in 2012. The switch was presented as a way for Broad, in partnership with Appistry, to better serve researchers at for-profit organizations. But the move initially created confusion, was criticized in some quarters and ultimately drove some users to other genetic variant detectors, such as the open-source FreeBayes.

Now, Broad has returned to the open-source model for version 4 of GATK and for all software that emerges from its Data Sciences Platform, ending the need for users to buy commercial licenses. Broad attributed the decision to shortcomings with the vendor support model, the detrimental effect that closing off the code had on development and the impact of the sector-wide move toward software-as-a-service.

GATK’s move to open source was partly facilitated by the $25 million collaboration Broad entered into with Intel in November. Together, Broad and Intel have developed a genomics stack they claim can run the GATK4 Best Practices pipeline five times faster than older versions. Intel attributed the performance gains to the use of its computing chips, networking technology and solid-state drives.

The genomics community welcomed the return to open source.

“Open sourcing the GATK is a big deal for open genomics, and for open science in general,” Jeremy Freeman, manager of computational biology at the Chan Zuckerberg Initiative, said in a statement. “Not only does it make this critical tool available to as broad as possible an audience for use, reuse, inspection, and contribution, it provides a powerful example to the community for how an existing project can embrace open source.”

Broad further increased the breadth of the potential GATK user base through an agreement with Chinese giants BGI and Alibaba. Sequencing powerhouse BGI is making GATK4 available to users of its cloud genomics data analysis platform.