Proteins are the building blocks of all living things, providing structure and managing processes in cells. Understanding how these molecules fold into specific 3D shapes is key to understanding their function, but determining those shapes experimentally requires expensive equipment and a great deal of time, which limits the progress of research and development. A new artificial intelligence programme called AlphaFold has been shown to accurately predict protein structure in minutes, solving a decades-old challenge. Its success is built on the availability of thousands of experimentally determined protein structures, a result of long-term research funding, infrastructure investment and data-sharing policies. DeepMind, the developers of AlphaFold, have made the AlphaFold code and protein structure predictions openly available to the global scientific community.
The human mediator complex has long been one of the most challenging multi-protein systems for structural biologists to understand. Credit: Yuan He
The human genome holds the instructions for more than 20,000 proteins, but only about one-third of those have had their 3D structures determined experimentally, and in many cases those structures are only partially known. Now, a transformative artificial intelligence (AI) tool called AlphaFold, developed by Google's sister company DeepMind in London, has predicted the structure of nearly the entire human proteome (the full complement of proteins expressed by an organism). In addition, the tool has predicted almost complete proteomes for various other organisms, ranging from mice and maize (corn) to the malaria parasite (see 'Folding options').
DeepMind and EMBL's European Bioinformatics Institute (EMBL-EBI), a life sciences lab based in Hinxton, England, today announced the launch of what they claim is the most complete and accurate database of structures for proteins expressed by the human genome. In a joint press conference hosted by the journal Nature, the two organizations said that the database, the AlphaFold Protein Structure Database, which was created using DeepMind's AlphaFold 2 system, will be made available to the scientific community in the coming weeks. The recipes for proteins -- large molecules consisting of amino acids that are the fundamental building blocks of tissues, muscles, hair, enzymes, antibodies, and other essential parts of living organisms -- are encoded in DNA. It's these genetic definitions that circumscribe their three-dimensional structures, which in turn determine their capabilities.
Last month, DeepMind published the much-anticipated, detailed methodology underlying the latest version of AlphaFold – the UK-based science company's powerful AI system that blew away its rivals in the latest major competition to predict the 3D structure of proteins. AlphaFold's machine learning methodology has been applied to predict structures for almost 99% of human proteins, and those predictions have now been made publicly available. In this long read, I reflect on the significance of these developments for fundamental research and drug discovery. I wrote this as the ICR celebrates the 10th anniversary of its AI-enabled drug discovery knowledgebase canSAR – which features multiple approaches to predicting 'druggability' as an aid to selecting drug targets and accelerating drug discovery. The coronavirus pandemic has, understandably, soaked up a lot of bandwidth when it comes to science news – but one particular non-Covid science story was able to cut through and hit the headlines in the UK and around the world. On 30 November 2020 it was announced that DeepMind – a subsidiary of Google's parent company Alphabet focusing on artificial intelligence – had made what was hailed as a huge leap towards solving one of biology's greatest remaining challenges: the ability to predict the correct three-dimensional structures of proteins from their constituent one-dimensional amino acid sequences. The announcement attracted huge interest, but the expert community was still waiting for the peer-reviewed publication. The AI methodology has now been published in the leading journal Nature, and this was followed rapidly by a second Nature paper from DeepMind and collaborators at the European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), which reports the application of the most recent AlphaFold machine learning system to predict 3D structures at scale for almost the entire human proteome – 98.5% of human proteins.
Proteins are large, complex molecules made up of chains of amino acids, and what a protein does largely depends on its unique 3D structure. Figuring out what shapes proteins fold into is known as the "protein folding problem", and it has stood as a grand challenge in biology for the past 50 years. In a major scientific advance, the latest version of our AI system AlphaFold has been recognised as a solution to this grand challenge by the organisers of the biennial Critical Assessment of protein Structure Prediction (CASP). This breakthrough demonstrates the impact AI can have on scientific discovery and its potential to dramatically accelerate progress in some of the most fundamental fields that explain and shape our world.