Last month, DeepMind published the much-anticipated detailed methodology underlying the latest version of AlphaFold – the UK-based science company's powerful AI system that blew away its rivals in the latest major competition to predict the 3D structure of proteins. AlphaFold's machine learning methodology has been applied to predict structures for almost 99% of human proteins, and these predictions have now been made publicly available. In this long read, I reflect on the significance of these developments for fundamental research and drug discovery. I wrote this as the ICR celebrates the 10th anniversary of its AI-enabled drug discovery knowledgebase canSAR – which features multiple approaches to predicting 'druggability' as an aid to selecting drug targets and accelerating drug discovery.

The coronavirus pandemic has, understandably, soaked up a lot of bandwidth when it comes to science news – but one particular non-Covid science story was able to cut through and hit the headlines in the UK and around the world. On 30 November 2020 it was announced that DeepMind – a subsidiary of Google's parent company Alphabet focusing on artificial intelligence – had made what was hailed as a huge leap towards solving one of biology's greatest remaining challenges: the ability to predict the correct three-dimensional structures of proteins from their constituent one-dimensional amino acid sequences.

The announcement attracted huge interest, but the expert community had been waiting for the peer-reviewed scientific publication. The AI methodology has now been published in the leading journal Nature. This was followed rapidly by a second Nature paper from DeepMind and collaborators at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), which reports the application of the most recent AlphaFold machine learning system to predict 3D structures at scale for almost the entire human proteome – 98.5% of human proteins.

DeepMind, the British artificial intelligence (AI) company owned by Google's parent company Alphabet, has solved a 50-year-old problem in biology. DeepMind's AI system, AlphaFold, cracked the so-called 'protein folding problem' – figuring out how a protein's amino acid sequence dictates its 3D atomic structure. A protein's structure is closely linked with its function, so the ability to predict that structure unlocks a greater understanding of what the protein does and how it works. AlphaFold's neural network was trained on around 170,000 proteins of known sequence and structure. The system registered a median accuracy score of 92.4 out of 100 for predicting protein structure, and a score of 87 in the category for the most challenging proteins. Because almost all diseases, including cancer and Covid-19, are linked to the 3D structures of proteins, the AI could pave the way for faster drug discovery and development of treatments by determining the structures of proteins whose shapes were previously unknown.