[Image: Protein structures representing the data obtained via AlphaFold.]
DeepMind and EMBL release the most complete database of predicted 3D structures of human proteins. The partners are using AlphaFold, the AI system recognized last year as a solution to the protein structure prediction problem, to release more than 350,000 protein structure predictions, including the entire human proteome, to the scientific community. DeepMind today announced its partnership with the European Molecular Biology Laboratory (EMBL), Europe's flagship laboratory for the life sciences, to make the most complete and accurate database yet of predicted protein structure models for the human proteome. The database will cover all of the roughly 20,000 proteins expressed by the human genome, and the data will be freely and openly available to the scientific community.
[Image: The human mediator complex has long been one of the most challenging multi-protein systems for structural biologists to understand. Credit: Yuan He]
The human genome holds the instructions for more than 20,000 proteins. But only about one-third of those have had their 3D structures determined experimentally. And in many cases, those structures are only partially known. Now, a transformative artificial intelligence (AI) tool called AlphaFold, which has been developed by Google's sister company DeepMind in London, has predicted the structure of nearly the entire human proteome (the full complement of proteins expressed by an organism). In addition, the tool has predicted almost complete proteomes for various other organisms, ranging from mice and maize (corn) to the malaria parasite (see 'Folding options').
DeepMind and the European Bioinformatics Institute (EMBL-EBI), a life sciences lab based in Hinxton, England, today announced the launch of what they claim is the most complete and accurate database of structures for proteins expressed by the human genome. In a joint press conference hosted by the journal Nature, the two organizations said that the database, the AlphaFold Protein Structure Database, which was created using DeepMind's AlphaFold 2 system, will be made available to the scientific community in the coming weeks. The recipes for proteins (large molecules consisting of amino acids that are the fundamental building blocks of tissues, muscles, hair, enzymes, antibodies, and other essential parts of living organisms) are encoded in DNA. It is these genetic definitions that dictate the proteins' three-dimensional structures, which in turn determine their capabilities.
Last month, DeepMind published the much-anticipated, detailed methodology underlying the latest version of AlphaFold – the UK-based science company's powerful AI system that blew away its rivals in the latest major competition to predict the 3D structure of proteins. AlphaFold's machine learning methodology has been applied to predict structures for almost 99% of human proteins, and these predictions have now been made publicly available. In this long read, I reflect on the significance of these developments for fundamental research and drug discovery. I wrote this as the ICR celebrates the 10th anniversary of its AI-enabled drug discovery knowledgebase canSAR – which features multiple approaches to predicting 'druggability' as an aid to selecting drug targets and accelerating drug discovery. The coronavirus pandemic has, understandably, soaked up a lot of bandwidth when it comes to science news – but one particular non-Covid science story was able to cut through and hit the headlines in the UK and around the world. On 30 November 2020 it was announced that DeepMind – a subsidiary of Google's parent company Alphabet focusing on artificial intelligence – had made what was hailed as a huge leap towards solving one of biology's greatest remaining challenges: the ability to predict the correct, three-dimensional structures of proteins based on their constituent, one-dimensional amino acid sequences. The announcement attracted huge interest, but the expert community has been waiting for the peer-reviewed science publication. The AI methodology has now been published in the leading journal Nature, and this was followed rapidly by a second Nature paper from DeepMind and collaborators at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), which reports the application of the most recent AlphaFold machine learning system to predict 3D structures at scale for almost the entire human proteome – 98.5% of human proteins.
DeepMind, the British artificial intelligence (AI) company owned by Google's parent company, Alphabet, has solved a 50-year-old problem in biology. DeepMind's AI system, AlphaFold, cracked the so-called 'protein folding problem' – figuring out how a protein's amino acid sequence dictates its 3D atomic structure. A protein's structure is closely linked with its function, and the ability to predict its structure unlocks a greater understanding of what it does and how it works. AlphaFold's neural network was trained with 170,000 known protein sequences and their structures. The system registered a median accuracy score of 92.4 out of 100 for predicting protein structure, and a score of 87 in the category for the most challenging proteins. Because almost all diseases, including cancer and Covid-19, involve proteins whose behaviour depends on their 3D structure, the AI could pave the way for faster development of treatments and drug discoveries by determining the structure of previously unknown proteins.
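For readers wondering what a score "out of 100" means here: the figures quoted above are, as far as we understand, on the scale of the Global Distance Test (GDT_TS) used in the CASP assessment. Roughly, it asks what fraction of a protein's residues land within 1, 2, 4 and 8 angstroms of their experimentally determined positions, and averages the four fractions. The Python sketch below illustrates that idea only. It assumes the two structures have already been superimposed (the official procedure searches for the best superposition at each cutoff), and the function name `gdt_ts` and the toy distances are hypothetical, not taken from DeepMind or CASP software.

```python
from typing import Sequence

# Distance cutoffs (in angstroms) averaged by the GDT_TS variant of the score.
CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts(distances: Sequence[float]) -> float:
    """Return a 0-100 score from per-residue C-alpha distances in angstroms.

    Simplifying assumption: the predicted and experimental structures are
    already superimposed, so each distance is taken as given.
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    # For each cutoff, the fraction of residues predicted within that distance.
    fractions = [
        sum(1 for d in distances if d <= cutoff) / len(distances)
        for cutoff in CUTOFFS
    ]
    # Average the four fractions and rescale to 0-100.
    return 100.0 * sum(fractions) / len(CUTOFFS)


# Hypothetical distances for a five-residue toy protein.
example = [0.5, 1.5, 3.0, 6.0, 10.0]
score = gdt_ts(example)
print(round(score, 1))
```

In this toy case one, two, three and four of the five residues fall within the successive cutoffs, so the score comes out at about 50. A prediction in which every residue is within 1 angstrom would score 100.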