Once upon a time—back in the mid-2000s, a mere era ago in computing—researchers at the University of Washington wanted to predict how proteins fold themselves. How are loose strings of amino acids fashioned into functional, three-dimensional structures, capable of causing, preventing or curing a range of diseases?
It’s an endlessly complex, decades-old problem to get right. With each link in the chain, the number of possible interactions and final structures is tremendous. But in our bodies, these proteins find a way to fold into the same shape nearly every time. And understanding how they turn out is an important step toward targeting them with therapies.
The researchers at UW knew the processing power they needed wasn’t accessible at the time. Distributed computing approaches had been taken before—dividing the work among volunteers’ home PCs and allowing those machines to crunch the numbers in their downtime. But they took it a step further.
In 2008, they launched Foldit, an online, competitive game that asked players to try their hand at folding a protein into its best possible shape. It tapped the human intuition of tens of thousands of players by presenting each amino acid chain as a puzzle with a simple set of chemical rules: solve it correctly, and you wouldn’t just rise to the top of the leaderboard, but you’d also be making an important potential contribution to scientific research.
A 2010 paper published in Nature included more than 57,000 global Foldit players as co-authors, showing that they were able to outperform computer models in predicting certain proteins’ native shapes—using just a couple of hundred megabytes of code and a lot of people’s free time.
Today, developments in artificial intelligence, cloud computing and big data are changing the landscape—but that sense of gamification remains. It’s still humans and machines, and machines are leaping ahead.
DeepMind Technologies, the London-based AI firm under the umbrella of Google and Alphabet, was once best known for building AlphaGo, the first machine learning program able to defeat a world champion at Go, which has been dubbed the world’s most complex strategy game.
Now DeepMind has AlphaFold, which in December 2018 took the top prize at the Critical Assessment of protein Structure Prediction, or CASP, a biennial competition that since 1994 has focused on one of the toughest challenges in science.
As a first-time entrant, AlphaFold was able to predict the most accurate final shape, from scratch, for 25 of the contest’s 43 proteins. Some of those wins were closer than others, though the second-place finisher won only three of the 43 challenges.
Protein structures can also be determined in the lab, through cryo-electron microscopy, nuclear magnetic resonance or X-ray crystallography, but these processes can be expensive, slow and tricky to get right. Using AI to model proteins virtually, based on vast genomic data sets, can make the task much less laborious, DeepMind says.
AlphaFold employs two methods relying on deep neural networks: one trained to predict the distances between pairs of amino acids, and another that evaluates the possible angles of the chemical bonds between them. Those networks return a score estimating how likely a proposed structure is to match a compilation of known protein fragments, while another neural network works to generate completely new pieces.
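To make the idea of scoring a structure against predicted distances concrete, here is a toy sketch in Python. It is not DeepMind's code or its actual scoring method: the three-residue chain, the `predicted` distances and the mean-squared-gap score are all assumptions invented for illustration. It shows only the general principle that a candidate shape can be ranked by how closely its pairwise distances agree with the predicted ones.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return {(i, j): distance} for every pair of residues with i < j."""
    return {
        (i, j): math.dist(coords[i], coords[j])
        for i, j in combinations(range(len(coords)), 2)
    }


def distance_score(coords, predicted):
    """Mean squared gap between a candidate's distances and the predicted ones.

    Lower is better. This is an illustrative stand-in, not AlphaFold's score.
    """
    actual = pairwise_distances(coords)
    gaps = [(actual[pair] - d) ** 2 for pair, d in predicted.items()]
    return sum(gaps) / len(gaps)


# Hypothetical predicted distances (in angstroms) for a three-residue toy chain.
predicted = {(0, 1): 3.8, (1, 2): 3.8, (0, 2): 5.0}

# Two candidate shapes: a fully stretched chain and one bent at a right angle.
stretched = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]

for name, shape in [("stretched", stretched), ("bent", bent)]:
    print(name, round(distance_score(shape, predicted), 3))
```

In this toy case the bent chain scores lower, and therefore better, than the stretched one, because its end-to-end distance sits closer to the predicted 5.0 angstroms. A real system has to search across an enormous number of candidate shapes rather than compare two.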
While these computer models are not yet accurate enough to be used in drug design, AI’s proof of concept is there, and proof of its value may not be far behind. Even Big Tech’s giants, Amazon, Microsoft and Facebook among them, are starting to get into the game, hiring protein experts and biomedical data scientists, according to a report from STAT.
They have also been working with academic laboratories, including the Institute for Protein Design at UW, the current overseer of Foldit, whose work continues today. Whether these Big Tech companies will compete at the next CASP get-together in 2020 remains to be seen, but it will still be a game to watch.