 rosettafold


A Model-Centric Review of Deep Learning for Protein Design

arXiv.org Artificial Intelligence

Deep learning has transformed protein design, enabling accurate structure prediction, sequence optimization, and de novo protein generation. Advances in single-chain protein structure prediction via AlphaFold2, RoseTTAFold, ESMFold, and others have achieved near-experimental accuracy, inspiring successive work extended to biomolecular complexes via AlphaFold Multimer, RoseTTAFold All-Atom, AlphaFold 3, Chai-1, Boltz-1 and others. Generative models such as ProtGPT2, ProteinMPNN, and RFdiffusion have enabled sequence and backbone design beyond natural evolution-based limitations. More recently, joint sequence-structure co-design models, including ESM3, have integrated both modalities into a unified framework, resulting in improved designability. Despite these advances, challenges still exist pertaining to modeling sequence-structure-function relationships and ensuring robust generalization beyond the regions of protein space spanned by the training data. Future advances will likely focus on joint sequence-structure-function co-design frameworks that are able to model the fitness landscape more effectively than models that treat these modalities independently. Current capabilities, coupled with the dizzying rate of progress, suggest that the field will soon enable rapid, rational design of proteins with tailored structures and functions that transcend the limitations imposed by natural evolution. In this review, we discuss the current capabilities of deep learning methods for protein design, focusing on some of the most revolutionary and capable models with respect to their functionality and the applications that they enable, leading up to the current challenges of the field and the optimal path forward.


Beating the Best: Improving on AlphaFold2 at Protein Structure Prediction

arXiv.org Artificial Intelligence

The goal of the protein structure prediction (PSP) problem is to predict a protein's 3D structure (conformation) from its amino acid sequence. The problem has been a 'holy grail' of science since the Nobel Prize-winning work of Anfinsen demonstrated that protein conformation was determined by sequence. A recent and important step towards this goal was the development of AlphaFold2, currently the best PSP method. AlphaFold2 is probably the highest-profile application of AI to science. Both AlphaFold2 and RoseTTAFold (another impressive PSP method) have been published and placed in the public domain (code and models). Stacking is a form of ensemble machine learning (ML) in which multiple base models are first learnt, then a meta-model is learnt using the outputs of the base models to form a model that outperforms them. Stacking has been successful in many applications. We developed the ARStack PSP method by stacking AlphaFold2 and RoseTTAFold. ARStack significantly outperforms AlphaFold2. We rigorously demonstrate this using two sets of non-homologous proteins, and a test set of protein structures published after that of AlphaFold2 and RoseTTAFold. As more high-quality prediction methods are published, it is likely that ensemble methods will increasingly outperform any single method.
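
The stacking idea in this abstract can be illustrated with a small sketch. This is not the ARStack method, whose details are not given here: it assumes two hypothetical base predictors (`base_a`, `base_b`) that each output a noisy estimate of a scalar target, and a deliberately simple meta-model that learns a single blending weight by least squares. All names and the synthetic data are assumptions made for illustration.

```python
import random


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def fit_blend_weight(p1, p2, y):
    """Least-squares weight w for the meta-model w*p1 + (1 - w)*p2.

    Minimising sum((y - p2 - w*(p1 - p2))**2) over w gives the closed form
    below. If the base predictions are identical, any w works; return 0.5.
    """
    num = sum((a - b) * (t - b) for a, b, t in zip(p1, p2, y))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    return num / den if den else 0.5


def blend(p1, p2, w):
    """Apply the meta-model to two lists of base predictions."""
    return [w * a + (1 - w) * b for a, b in zip(p1, p2)]


# Synthetic stand-ins for two base models (hypothetical, not real predictors):
# each returns the true value plus independent Gaussian noise.
rng = random.Random(0)
y = [rng.uniform(0, 10) for _ in range(200)]
base_a = [t + rng.gauss(0, 1.0) for t in y]
base_b = [t + rng.gauss(0, 1.5) for t in y]

# Learn the meta-model on the first half, evaluate on the held-out second half.
half = len(y) // 2
w = fit_blend_weight(base_a[:half], base_b[:half], y[:half])
stacked = blend(base_a[half:], base_b[half:], w)

print("learned weight on base_a:", round(w, 3))
print("held-out MSE base_a :", round(mse(base_a[half:], y[half:]), 3))
print("held-out MSE base_b :", round(mse(base_b[half:], y[half:]), 3))
print("held-out MSE stacked:", round(mse(stacked, y[half:]), 3))
```

On the data used to fit the weight, the blended error can never exceed that of either base model, because `w = 0` and `w = 1` recover the base models. Any improvement on held-out data depends on the base models making at least partly independent errors.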


Thanks to DALL-E, the Race to Make Artificial Protein Drugs Is On

#artificialintelligence

Remember when predicting protein shapes using AI was the breakthrough of the year? Having solved nearly all protein structures known to biology, AI is now turning to a new challenge: designing proteins from scratch. Far from an academic pursuit, the endeavor is a potential game-changer for drug discovery. Having the ability to draw up protein drugs for any given target inside the body--such as those triggering cancer growth and spread--could launch a new universe of medicines to tackle our worst medical foes. It's no wonder multiple AI powerhouses are answering the challenge.


In Its Greatest Biology Feat Yet, AI Unlocks the Complex Proteins Guarding Our DNA

#artificialintelligence

AI has done it again. After solving one of the grandest mysteries in biology--predicting protein structure--it decoded how proteins link up into complexes, and dreamed up novel protein structures that may ultimately be turned into drugs to control our basic biology, health, and life. Yet when faced with enormous protein complexes, AI faltered. In a mind-bending feat, a new algorithm deciphered the structure at the heart of inheritance--a massive complex of roughly 1,000 proteins that helps channel DNA instructions to the rest of the cell. The AI model is built on AlphaFold by DeepMind and RoseTTAFold from Dr. David Baker's lab at the University of Washington, both of which were released to the public for further experimentation.


What's next for AlphaFold and the AI protein-folding revolution

#artificialintelligence

For more than a decade, molecular biologist Martin Beck and his colleagues have been trying to piece together one of the world's hardest jigsaw puzzles: a detailed model of the largest molecular machine in human cells. This behemoth, called the nuclear pore complex, controls the flow of molecules in and out of the nucleus of the cell, where the genome sits. Hundreds of these complexes exist in every cell. Each is made up of more than 1,000 proteins that together form rings around a hole through the nuclear membrane. These 1,000 puzzle pieces are drawn from more than 30 protein building blocks that interlace in myriad ways. Making the puzzle even harder, the experimentally determined 3D shapes of these building blocks are a potpourri of structures gathered from many species, so they don't always mesh together well. And the picture on the puzzle's box -- a low-resolution 3D view of the nuclear pore complex -- lacks sufficient detail to know how many of the pieces precisely fit together. In 2016, a team led by Beck, who is based at the Max Planck Institute of Biophysics (MPIBP) in Frankfurt, Germany, reported a model that covered about 30% of the nuclear pore complex and around half of the 30 building blocks, called Nup proteins.


What's Up after AlphaFold on ML for Structural Biology?

#artificialintelligence

AlphaFold 2, the AI-based program developed by Google's DeepMind to crack the problem of predicting protein structures, made a strike in late 2020 when it "won" the 14th edition of CASP (Critical Assessment of Structure Prediction), a biennial "contest" on protein structure prediction. It then made a second strike half a year later when DeepMind published a peer-reviewed article in the journal Nature describing how AlphaFold 2 works, and released its code openly on GitHub and as a Google Colab notebook that everybody could use. The hype kept growing as scientists developed even better notebooks from it, and as they found the many applications that AlphaFold had, even beyond its original aim. This hype grew even further when DeepMind released a new version of AlphaFold better suited to modeling the complexes made by multiple proteins when they interact. It grew again when DeepMind joined forces with the European Bioinformatics Institute to release a database of 3D models for all known proteins.


Without Code for DeepMind's Protein AI, One Lab Wrote Its Own

WIRED

For biologists who study the structure of proteins, the recent history of their field is divided into two epochs: before CASP14, the 14th biennial round of the Critical Assessment of Protein Structure Prediction conference, and after. In the decades before, scientists had spent years slowly chipping away at the problem of how to predict the structure of a protein from the sequence of amino acids that it comprises. After CASP14, which took place in December 2020, the problem had effectively been solved, by researchers at the Google subsidiary DeepMind. A research company focused on a branch of artificial intelligence known as "deep learning," DeepMind had previously made headlines by building an AI system that beat the Go world champion. But their success at protein structure prediction, which they achieved using a neural network they call AlphaFold2, represented the first time they had built a model that could solve a problem of real scientific relevance.


Artificial Intelligence Accurately Predicts Protein Folding

#artificialintelligence

Posted on July 27th, 2021 by Dr. Francis Collins. Proteins are the workhorses of the cell. Mapping the precise shapes of the most important of these workhorses helps to unlock their life-supporting functions or, in the case of disease, potential for dysfunction. While the amino acid sequence of a protein provides the basis for its 3D structure, deducing the atom-by-atom map from principles of quantum mechanics has been beyond the ability of computer programs--until now. In a recent study in the journal Science, researchers reported they have developed artificial intelligence approaches for predicting the three-dimensional structure of proteins in record time, based solely on their one-dimensional amino acid sequences [1]. This groundbreaking approach will not only aid researchers in the lab, but guide drug developers in coming up with safer and more effective ways to treat and prevent disease.


DeepMind's AI for protein structure is coming to the masses

#artificialintelligence

The structure of human interleukin-12 protein bound to its receptor, as predicted by machine-learning software. Credit: Ian Haydon, UW Medicine Institute for Protein Design. Software that accurately determines the 3D shape of proteins is set to become widely available to scientists. On 15 July, the London-based company DeepMind released an open-source version of its deep-learning neural network AlphaFold 2 and described its approach in a paper in Nature. The network dominated a protein-structure prediction competition last year. Meanwhile, an academic team has developed its own protein-prediction tool inspired by AlphaFold 2, which is already gaining popularity with scientists. That system, called RoseTTAFold, performs nearly as well as AlphaFold 2, and is described in a Science paper also published on 15 July.


Advanced New Artificial Intelligence Software Can Compute Protein Structures in 10 Minutes

#artificialintelligence

Protein design researchers used artificial intelligence to generate hundreds of new protein structures, including this 3D view of human interleukin-12 bound to its receptor. Scientists have waited months for access to highly accurate protein structure prediction since DeepMind presented remarkable progress in this area at the 2020 Critical Assessment of Structure Prediction, or CASP14, conference. The wait is now over. Researchers at the Institute for Protein Design at the University of Washington School of Medicine in Seattle have largely recreated the performance achieved by DeepMind on this important task. These results were published online by the journal Science on July 15, 2021.