Making CRISPR-Cas9 gene editing safer with artificial intelligence

CRISPR gene editing was launched into the spotlight this week when Chinese scientist He Jiankui claimed to have made the world’s first genome-edited babies using the technology. The resulting ethical debate about manipulating the human germline was important, to be sure, but it overshadowed a more immediate concern: Before CRISPR research can be safely translated into therapies, scientists will need better methods for avoiding the technology’s potentially damaging off-target effects.

The problem, in a nutshell, is that after the CRISPR-Cas9 editing tool cuts double-stranded DNA, the cell repairs the break but sometimes introduces mutations in the process. Scientists believe these errors depend on several factors, including the targeted sequence and the guide RNA (gRNA), yet they also seem to follow a reproducible pattern.

Now, researchers at the Wellcome Sanger Institute say they have used machine learning to develop a tool that can predict which mutations CRISPR will introduce into a cell. They believe the technology could boost the efficiency of CRISPR research and ease the process of translating it into safe and effective treatments. 

For the study, which was published in the journal Nature Biotechnology, the team synthesized a library of 41,630 pairs of different gRNA and target DNA sequences. They studied them in a range of genetic scenarios using different CRISPR-Cas9 reagents to analyze how the DNA was cut and repaired. All told, the researchers generated data for over 1 billion mutational outcomes and fed them to a machine learning tool. The result is a program called FORECasT (favored outcomes of repair events at Cas9 targets), which can predict the outcome of the repair, be it single-base insertions or small deletions of genetic material.
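To make the two outcome classes named above concrete, here is a minimal Python sketch that enumerates single-base insertions and small deletions around a cut site. It is purely illustrative and is not how FORECasT works: the function name `candidate_outcomes`, the `cut` index convention and the `max_del` limit are all assumptions made for this example, and the sketch only lists possible outcomes without predicting how likely each one is.

```python
def candidate_outcomes(target: str, cut: int, max_del: int = 3) -> list[tuple[str, str]]:
    """List simple repair outcomes for a cut between target[cut-1] and target[cut].

    Hypothetical helper for illustration only; it enumerates outcomes
    and assigns no probabilities.
    """
    outcomes = []

    # Single-base insertions at the cut site, one per nucleotide.
    for base in "ACGT":
        outcomes.append((f"ins_{base}", target[:cut] + base + target[cut:]))

    # Small deletions that touch or span the cut site.
    # Different deletions can yield the same edited sequence when the
    # flanking bases repeat, so duplicates are reported only once.
    seen = set()
    for size in range(1, max_del + 1):
        for start in range(cut - size, cut + 1):
            if start < 0 or start + size > len(target):
                continue
            edited = target[:start] + target[start + size:]
            if edited not in seen:
                seen.add(edited)
                outcomes.append((f"del_{size}_at_{start}", edited))

    return outcomes


if __name__ == "__main__":
    # Toy target sequence with an assumed cut in the middle.
    for label, seq in candidate_outcomes("ACGTACGT", cut=4, max_del=2):
        print(label, seq)
```

A predictive tool would go one step further and assign a probability to each such outcome based on the surrounding sequence, which is the part the researchers trained on their measured data.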

“Our resource can predict the exact mutations resulting from CRISPR-Cas9 gene editing, just from the sequence of the target DNA. It will save time and resources for future CRISPR-Cas9 applications,” Felicity Allen, the study’s co-first author, said in a statement. The paper’s senior author, Leopold Parts, added that it “allows better design of editing experiments, and may lead to future therapeutic applications.”

The Wellcome Sanger researchers aren’t the only ones using machine learning to help understand CRISPR-Cas9 errors. Scientists at Brigham and Women’s Hospital, the Broad Institute and Massachusetts Institute of Technology described a similar effort in a recent Nature study. Using a library of 2,000 gRNA-target pairs, they developed a machine-learning model that can predict insertions and deletions more than 50% of the time in patient-derived cell lines of three human diseases.

Scientists have yet to fully understand CRISPR’s off-target effects and are still searching for ways to minimize unintended harm. One idea is to pair CRISPR with a cutting enzyme other than Cas9.

The Wellcome Sanger researchers wrote in the study that their tool could be used “for genome-wide screens and custom edits.” They noted that diseases caused by expansions of short tandem repeats, such as Huntington’s, could be targeted with “microhomology-mediated repair,” which would disable certain nucleotide bases.

Wellcome Sanger has made the tool publicly available for use by all researchers.