 missense mutation


ALPHAGMUT: A Rationale-Guided Alpha Shape Graph Neural Network to Evaluate Mutation Effects

Wang, Boshen, Ye, Bowei, Xu, Lin, Liang, Jie

arXiv.org Artificial Intelligence

In silico methods for evaluating the effects of missense mutations provide an important approach for understanding mutations in personal genomes and identifying disease-relevant biomarkers. However, existing methods, including deep learning methods, rely heavily on sequence-aware information and do not fully leverage the available 3D structural information. In addition, these methods may be unable to predict mutations in domains for which sequence-based embeddings are difficult to formulate. In this study, we introduce AlphaGMut, a novel rationale-guided graph neural network that evaluates mutation effects and distinguishes pathogenic mutations from neutral mutations. We compute the alpha shapes of protein structures to obtain atomic-resolution edge connectivities and map them to an accurate residue-level graph representation. We then compute structural, topological, biophysical, and sequence properties of the mutation sites, which are assigned as node attributes in the graph. These node attributes can effectively guide the graph neural network to learn the difference between pathogenic and neutral mutations using k-hop message passing with a short training period. We demonstrate that AlphaGMut outperforms state-of-the-art methods, including DeepMind's AlphaMissense, on many performance metrics. In addition, AlphaGMut performs well in alignment-free settings, which provides broader prediction coverage and better generalization than current methods that require deep sequence-aware information.
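To make the idea of k-hop message passing on a residue-level graph concrete, here is a minimal sketch. It is not AlphaGMut's implementation: the four-residue path graph, the single scalar node attribute, and the helper name `k_hop_mean_aggregate` are all hypothetical, and plain mean aggregation stands in for whatever learned update the model actually uses.

```python
def k_hop_mean_aggregate(neighbors, features, k):
    """Average each node's features with its neighbours', repeated k times.

    neighbors: list where neighbors[i] holds the indices adjacent to node i
    features:  list of per-node feature vectors (lists of floats)
    """
    h = [[float(v) for v in row] for row in features]
    for _ in range(k):
        updated = []
        for node, nbrs in enumerate(neighbors):
            group = [node] + list(nbrs)  # include the node itself (self-loop)
            dim = len(h[node])
            updated.append(
                [sum(h[j][d] for j in group) / len(group) for d in range(dim)]
            )
        h = updated
    return h


# Hypothetical residue graph: a path 0 - 1 - 2 - 3.
neighbors = [[1], [0, 2], [1, 3], [2]]
# Hypothetical node attribute: only residue 0 carries a signal.
features = [[1.0], [0.0], [0.0], [0.0]]

one_hop = k_hop_mean_aggregate(neighbors, features, 1)
two_hop = k_hop_mean_aggregate(neighbors, features, 2)
print(one_hop)
print(two_hop)
```

After one round, only residues 0 and 1 hold a non-zero value. After two rounds the signal reaches residue 2 but not residue 3, which illustrates how k bounds the neighbourhood that informs each node.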


DeepMind AI can predict if DNA mutations are likely to be harmful

New Scientist

Google DeepMind's AlphaMissense AI can predict whether mutations will affect how proteins such as haemoglobin subunit beta (left) or cystic fibrosis transmembrane conductance regulator (right) will function.

Artificial intelligence firm Google DeepMind has adapted its AlphaFold system for predicting protein structure to assess whether a huge number of simple mutations are harmful. The adapted system, called AlphaMissense, has done this for 71 million possible mutations of a kind called missense mutations in the 20,000 human proteins, and the results have been made freely available. "We think this is very helpful for clinicians and human geneticists," says Jun Cheng at Google DeepMind. "Hopefully, this can help them to pinpoint the cause of genetic disease." Almost everyone is born with between about 50 and 100 mutations not found in their parents, resulting in a huge amount of genetic variation between individuals.


Google DeepMind AI tool assesses DNA mutations for harm potential

The Guardian

Scientists at Google DeepMind have built an artificial intelligence program that can predict whether millions of genetic mutations are either harmless or likely to cause disease, in an effort to speed up research and the diagnosis of rare disorders. The program makes predictions about so-called missense mutations, where a single letter is misspelt in the DNA code. Such mutations are often harmless but they can disrupt how proteins work and cause diseases from cystic fibrosis and sickle-cell anaemia to cancer and problems with brain development. The researchers used AlphaMissense to assess all 71m single-letter mutations that could affect human proteins. When they set the program's precision to 90%, it predicted that 57% of missense mutations were probably harmless and 32% were probably harmful. It was uncertain about the impact of the rest.
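The share left as uncertain follows from the two quoted percentages. The short calculation below works it out, assuming the rounded figures in the article (71 million mutations, 57% and 32%); the resulting counts are therefore approximate, and the variable names are illustrative only.

```python
TOTAL_MUTATIONS = 71_000_000  # rounded figure quoted in the article
PCT_HARMLESS = 57             # "probably harmless"
PCT_HARMFUL = 32              # "probably harmful"

pct_uncertain = 100 - PCT_HARMLESS - PCT_HARMFUL

# Integer arithmetic avoids floating-point rounding in the counts.
n_harmless = TOTAL_MUTATIONS * PCT_HARMLESS // 100
n_harmful = TOTAL_MUTATIONS * PCT_HARMFUL // 100
n_uncertain = TOTAL_MUTATIONS * pct_uncertain // 100

print(pct_uncertain)  # 11
print(n_harmless, n_harmful, n_uncertain)  # 40470000 22720000 7810000
```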


Deciphering the Language of Nature: A transformer-based language model for deleterious mutations in proteins

Jiang, Theodore, Fang, Li, Wang, Kai

arXiv.org Artificial Intelligence

Various machine-learning models, including deep neural network models, have already been developed to predict the deleteriousness of missense (non-synonymous) mutations. Potential improvements to the current state of the art, however, may still benefit from a fresh look at the biological problem using more sophisticated self-adaptive machine-learning approaches. Recent advances in the natural language processing field show transformer models (a type of deep neural network) to be particularly powerful at modeling sequence information with context dependence. In this study, we introduce MutFormer, a transformer-based model for the prediction of deleterious missense mutations, which uses reference and mutated protein sequences from the human genome as the primary features. MutFormer takes advantage of a combination of self-attention layers and convolutional layers to learn both long-range and short-range dependencies between amino acid mutations in a protein sequence. We first pre-trained MutFormer on reference protein sequences and mutated protein sequences resulting from common genetic variants observed in human populations. We next examined different fine-tuning methods to successfully apply the model to deleteriousness prediction of missense mutations. Finally, we evaluated MutFormer's performance on multiple testing data sets. We found that MutFormer showed similar or improved performance over a variety of existing tools, including those that used conventional machine-learning approaches. We conclude that MutFormer successfully considers sequence features that are not explored in previous studies and could potentially complement existing computational predictions or empirically generated functional scores to improve our understanding of disease variants.
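The contrast between long-range and short-range dependencies can be illustrated with a toy sketch. This is not MutFormer's architecture: the functions `self_attention` and `conv1d_same`, the three-position embedding, and the kernel are hypothetical, and the attention has no learned projections (queries, keys, and values are all the input itself).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention without learned weights.

    Every position attends to every other position, so the mixing
    is not limited by distance along the sequence.
    """
    d = len(x[0])
    weights = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights.append(softmax(scores))
    out = [[sum(w * v[c] for w, v in zip(row, x)) for c in range(d)]
           for row in weights]
    return out, weights


def conv1d_same(x, kernel):
    """Per-channel 1D convolution with zero padding and an odd-length kernel.

    Each output position only sees a local window of the sequence.
    """
    half = len(kernel) // 2
    n, d = len(x), len(x[0])
    out = []
    for i in range(n):
        row = []
        for c in range(d):
            acc = 0.0
            for offset, w in enumerate(kernel):
                j = i + offset - half
                if 0 <= j < n:
                    acc += w * x[j][c]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 3-position, 2-channel sequence embedding.
embedding = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn_weights = self_attention(embedding)
local = conv1d_same(embedding, [0.0, 1.0, 0.0])  # identity kernel
```

Each row of `attn_weights` sums to one and spans the whole sequence, whereas the convolution output at a position depends only on inputs within the kernel window.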


An International Collaborative Effort Assessed The Applications Of AlphaFold2

#artificialintelligence

Teams of researchers across 18 institutes spread over 11 countries have worked together to assess the utility of AlphaFold2 (AF2) predictions in the analysis of distinctive structural elements, the impact of missense variants, the prediction of function and ligand binding sites, the modeling of interactions, and the modeling of experimental structural data. Proteins are biological macromolecules involved in every cellular activity. Their structure is crucial for understanding interactions, protein function, and how missense mutations (point mutations where a single nucleotide change results in a codon that codes for a different amino acid) can affect a protein's functionality. Through protein folding, the primary sequence of amino acids forms a three-dimensional tertiary or quaternary structure. Although the experimental methods for determining protein structures have advanced tremendously, most of the protein universe has remained uncharacterized.
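The definition of a missense mutation given above can be expressed as a small check on codons. The sketch below assumes the standard genetic code and DNA codons written with T rather than U; the function name `classify_point_mutation` and its category labels are illustrative, not taken from the article.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(ref_codon, alt_codon):
    """Classify a single-nucleotide change within one codon."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    differences = sum(a != b for a, b in zip(ref_codon, alt_codon))
    if len(ref_codon) != 3 or len(alt_codon) != 3 or differences != 1:
        raise ValueError("expected two codons differing at exactly one base")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid, no protein change
    if alt_aa == "*":
        return "nonsense"     # premature stop codon
    if ref_aa == "*":
        return "stop-loss"    # stop codon replaced by an amino acid
    return "missense"         # different amino acid


# GAG (glutamate) to GTG (valine) is the change behind sickle-cell disease.
print(classify_point_mutation("GAG", "GTG"))  # missense
print(classify_point_mutation("GAA", "GAG"))  # synonymous
```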