DeepMind’s AlphaFold Is Close To Solving One Of Biology’s Greatest Challenges


One of DeepMind’s powerful AI algorithms, called AlphaFold, used its deep learning prowess to predict a protein’s three-dimensional shape, down to the width of an atom.

It’s a challenge that has mystified biologists for 50 years and counting, so much so that the search for a breakthrough in computer-based protein structure prediction has inspired crowd-sourcing games, global competitions, and Nobel Prize-winning research.

AlphaFold triumphed over roughly 100 other teams in a long-running challenge called Critical Assessment of Structure Prediction, or CASP, with a knockout, jaw-dropping performance. The win marks AI-based protein structure prediction’s dazzling debut in the real world, one that nixes negative punditry on the value of AI for real-life quandaries. And DeepMind isn’t the only contender in the protein folding game.

By tactically changing the genes of a complicated protein assembly and observing the outcome, a separate team was able to build an algorithm that reconstructs the assembly’s structure with extremely high accuracy.

A central tenet in biology is “Structure explains function.” The discovery of the double helix shape of DNA, for example, dramatically advanced our understanding of how genetic information is copied and stored. Protein structures arguably contain as much, if not more, information.

If we know a protein’s structure, we can make educated guesses about its function. By mapping thousands of protein structures, we can begin to decipher the biology of life, and find ways to manipulate it.

One major breakthrough was to map the structure of “spike” proteins on the surface of a virus, which the virus relies on to invade our cells. It’s not surprising that DeepMind’s AlphaFold went after these spike protein structures in March, just as Covid-19 cases began skyrocketing across the world.

The classic “gold standard” for uncovering protein structures relies on an extremely tedious and difficult lab technique called X-ray crystallography. Scientists essentially “freeze” proteins into delicate crystal structures and use a combination of X-rays, microscopes, and maths to figure out their shapes. Not all proteins can be “flash-frozen” to be analyzed, leaving a Grand Canyon-sized gap for decoding biology. The instructions for building a 3D protein are inherently embedded inside its 1D amino acid sequence, a discovery that won the Nobel Prize.

The CASP challenge crowd-sourced predictions of protein structures that had already been identified using X-ray crystallography but were not yet available to the public. Remember: amino acid sequences, the building blocks of proteins, contain data about a protein’s final 3D shape, which seems perfect for a deep learning approach. AlphaFold’s neural network, trained on protein data banks of roughly 170,000 protein structures, could then interpret a protein’s structure as a “3D map” and analyze any buried relationships or patterns.
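To make the “3D map” idea a little more concrete, here is a minimal Python sketch of one common way to turn a structure into a map: a table of distances between every pair of residues, from which close “contacts” can be read off. This is an illustration only, not AlphaFold’s method. The function names, the toy coordinates, and the 8-angstrom cutoff are all assumptions made for the example.

```python
import math

def distance_map(coords):
    """Return the symmetric table of pairwise distances between residues."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def contacts(dmap, threshold=8.0, min_separation=2):
    """List residue pairs (i, j) closer than `threshold`,
    skipping pairs that sit next to each other along the chain."""
    n = len(dmap)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
        if dmap[i][j] < threshold
    ]

# Toy coordinates (in angstroms) for a hypothetical 4-residue chain; not real data.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (3.8, 3.8, 0.0),
    (0.0, 3.8, 0.0),
]

dmap = distance_map(toy_coords)
print(contacts(dmap))
```

A map like this depends only on the distances between residues, so it stays the same however the protein is rotated or shifted in space, which is one reason such maps are a convenient way to compare a predicted shape with an experimental one.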

Nearly all of our drugs are designed to dock onto a protein, like keys to a lock. Having an AI-based method to decode protein structure could make it possible to rapidly screen tens of thousands of new drug targets.

AlphaFold struggled with deciphering protein complexes, mega-structures of multiple individual 3D building blocks that form into a collective functional entity. This week, a team took a separate approach to analyzing protein complexes in living cells, something AlphaFold hasn’t yet dominated.

Their approach to the vexing problem went back to genes, the blueprints that guide the construction of amino acid chains and so contain information on 3D protein folding. The team found that they could quickly screen through thousands of mutations for a gene that makes a protein in living cells.

By observing the structure of resulting protein complexes, they could then use AI-based methods to map out how one mutation affects another, and in turn reveal the “rules” behind how these mega-structures form by just looking at their underlying genetic instructions.
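As a rough illustration of what “how one mutation affects another” can mean in numbers, here is a minimal Python sketch of a simple interaction score. It assumes a multiplicative model, in which two independent mutations would leave a cell with the product of their individual fitness values, and it measures how far the observed double mutant departs from that expectation. The function name and the toy fitness values are hypothetical, and this is not presented as the team’s actual method.

```python
def interaction_score(fitness_a, fitness_b, fitness_ab):
    """Observed double-mutant fitness minus the value expected
    if the two mutations acted independently (multiplicative model)."""
    expected = fitness_a * fitness_b
    return fitness_ab - expected

# Toy fitness values relative to an unmutated cell (1.0 = no effect); not real data.
independent = interaction_score(0.8, 0.5, 0.4)   # matches the expectation
aggravating = interaction_score(0.8, 0.5, 0.1)   # worse than expected
buffering = interaction_score(0.8, 0.5, 0.5)     # no worse than the more severe single mutant

for name, score in [("independent", independent),
                    ("aggravating", aggravating),
                    ("buffering", buffering)]:
    print(f"{name}: {score:+.2f}")
```

A score near zero suggests the two mutations act independently, while a strongly negative or positive score hints that the mutated sites influence each other, for example because they sit close together in the folded complex.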

Similar to AlphaFold, the technology, called “integrative modeling,” isn’t yet ready to replace the gold standard of protein mapping.

From single proteins to multi-protein complexes, we now have faster, simpler, cheaper ways to accurately visualize their structures. With AI and biology working in tandem, protein folding may just be the first major breakthrough for medicine in our generation.