DeepMind AI breaks through decades-old protein folding puzzle

It’s a fundamental question that’s gone unanswered for decades: How do proteins, one of the building blocks of life itself, correctly fold themselves into the proper shape time after time? 

Researchers at DeepMind, Google’s artificial-intelligence-focused sibling, believe they’ve now cracked the code, with a computer model that can translate a chain of amino acids into a 3D structure—providing a better understanding of the biological functions that follow form and delivering a breakthrough that promises to change the path of medical research.

“We have been stuck on this one problem—how do proteins fold up—for nearly 50 years,” said John Moult, a professor at the University of Maryland and co-founder and chair of a biennial computer modeling competition focused on tackling the protein-folding question, known as CASP.

“To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts, wondering if we’d ever get there, is a very special moment,” Moult said.

CASP, short for Critical Assessment of protein Structure Prediction, first launched in 1994 and was devised as a global experiment where participants could test their prediction methods on a standardized playing field.

DeepMind, with its AlphaFold project, took home the top prize in 2018 as a first-time entrant. The AI program was able to predict the most accurate final shape, from scratch, for 25 of the contest’s 43 proteins.

This year, the company returned with the updated AlphaFold2 and posted a median score of 92.4 on the global distance test, a 0-to-100 measure of how closely the AI’s predictions line up with the known structures of CASP’s protein targets. Two-thirds of its scores were over 90.

This means the models were only off by about the width of a single atom, according to DeepMind. AlphaFold previously set an accuracy record in 2018, but it was unable to break a median score of 60.
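For readers curious how a global distance test score is put together, the following is a minimal sketch, not CASP's or DeepMind's actual scoring code. It assumes the commonly described GDT_TS form: the average, over cutoffs of 1, 2, 4 and 8 angstroms, of the percentage of residues whose predicted position falls within that cutoff of the experimental one. It also assumes the two structures have already been superimposed; real scoring tools search over many superpositions, which this sketch skips. The function name and the sample distances are hypothetical.

```python
from statistics import median
from typing import Sequence

# Distance cutoffs in angstroms used by the GDT_TS variant of the score.
CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts(distances: Sequence[float]) -> float:
    """Score one prediction on a 0-100 scale.

    `distances` holds, for each residue, the distance in angstroms between
    the predicted and experimental alpha-carbon positions. Assumption: the
    structures are already superimposed (no superposition search here).
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    # Fraction of residues that land within each cutoff.
    fractions = [sum(d <= cutoff for d in distances) / n for cutoff in CUTOFFS]
    # Average the four fractions and express the result as a percentage.
    return 100.0 * sum(fractions) / len(CUTOFFS)


# Hypothetical per-residue distances for two made-up targets.
near_perfect = [0.5, 0.5, 0.5, 0.5]
rougher = [0.5, 1.5, 3.0, 10.0]

scores = [gdt_ts(near_perfect), gdt_ts(rougher)]
# A contest-wide figure such as the article's median is the median of per-target scores.
overall = median(scores)
```

Under this definition, a prediction only scores above 90 when nearly every residue sits within an angstrom or two of its experimentally determined position.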

“This computational work represents a stunning advance on the protein-folding problem,” said Venki Ramakrishnan, president of the Royal Society of London, who shared the 2009 Nobel Prize in Chemistry for his work on the cell’s protein-synthesizing ribosomes.

“It has occurred decades before many people in the field would have predicted,” Ramakrishnan said. “It will be exciting to see the many ways in which it will fundamentally change biological research.”

To better understand a protein’s shape, and how its delicate origami interacts with cells, affecting both life and disease, scientists have used technologies ranging from X-ray crystallography to nuclear magnetic resonance to cryo-electron microscopy. But the ability to reach an answer computationally promises to unlock much more information across a variety of research fields, both inside and outside of healthcare.

“Proteins are extremely complicated molecules, and their precise three-dimensional structure is key to the many roles they perform, for example the insulin that regulates sugar levels in our blood and the antibodies that help us fight infections,” said Moult.

“Even tiny rearrangements of these vital molecules can have catastrophic effects on our health, so one of the most efficient ways to understand disease and find new treatments is to study the proteins involved,” he said. “There are tens of thousands of human proteins and many billions in other species, including bacteria and viruses, but working out the shape of just one requires expensive equipment and can take years.”

Still, AlphaFold’s efforts require a large amount of computing power. Trained on public data covering about 170,000 protein structures, the AI model uses 128 of Google’s high-end cloud computing cores, running them constantly for a number of weeks.

“AlphaFold is a once in a generation advance, predicting protein structures with incredible speed and precision,” said Arthur Levinson, Ph.D., founder and CEO of Calico, which, like Google and DeepMind, is a subsidiary of Alphabet.

“This leap forward demonstrates how computational methods are poised to transform research in biology and hold much promise for accelerating the drug discovery process,” added Levinson, who previously served as chairman and CEO of Genentech.

The AlphaFold program has already helped solve a protein structure that may lead to a better understanding of how signals are transmitted across cell membranes—a puzzle that has been frustrating biomedical researchers for nearly a decade, according to Andrei Lupas, director of the Max Planck Institute for Developmental Biology and a CASP judge.

The DeepMind team also hopes to help identify proteins that misfold, leading to malfunctions that cause disease, and to deliver computer models that may speed up drug discovery and development.