AITopics | amino acid

Collaborating Authors

amino acid

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Scientists develop new method to generate protein datasets for training AI

AIHubJul-1-2026, 14:54:06 GMT

Protein engineering is a field primed for artificial intelligence research. Each protein is made up of amino acids; to optimize a protein function, researchers modify proteins by switching out one of 20 different amino acids for another. For a protein that is just 50 amino acids in length, this leads to approximately 1.13 10 potential combinations to test. This number of potential combinations, impossible to test in the lab, makes protein engineering an ideal challenge for AI. Modeling which of these combinations will give the best results is a perfect problem for the technology's massive computing power.

artificial intelligence, machine learning, protein, (15 more...)

AIHub

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Education (0.93)

Technology:

Information Technology > Communications > Social Media (0.73)
Information Technology > Artificial Intelligence > Machine Learning (0.52)

Add feedback

Venus-MAXWELL: Efficient Learning of Protein-Mutation Stability Landscapes using Protein Language Models

Neural Information Processing SystemsJun-22-2026, 17:57:29 GMT

In-silico prediction of protein mutant stability, measured by the difference in Gibbs free energy change ( G), is fundamental for protein engineering. Current sequence-to-label methods typically employ the two-stage pipeline: (i) encoding mutant sequences using neural networks (e.g., transformers), followed by (ii) the G regression from the latent representations. Although these methods have demonstrated promising performance, their dependence on specialized neural network encoders significantly increases the complexity. Additionally, the requirement to individually compute latent representations for each mutant site negatively impacts computational efficiency and poses the risk of overfitting. This work proposes the Venus-MAXWELL framework, which reformulates mutation G prediction as a sequence-to-landscape task. In Venus-MAXWELL, mutations of a protein and their corresponding Gvalues are organized into a landscape matrix, allowing our framework to learn the G landscape of a protein with a single forward and backward pass during training. Besides, to facilitate future works, we also curated a large-scale G dataset with strict controls on data leakage and redundancy to ensure robust evaluation. Venus-MAXWELL is compatible with multiple protein language models and enables these models for accurate and efficient G prediction. For example, when integrated with the ESM-IF, Venus-MAXWELL achieves higher accuracy than ThermoMPNN with 10 faster in inference speed (despite having 50 more parameters than ThermoMPNN).

artificial intelligence, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Unified all-atom molecule generation with neural fields

Neural Information Processing SystemsJun-17-2026, 21:53:21 GMT

Generative models for structure-based drug design are often limited to a specific modality, restricting their broader applicability. To address this challenge, we introduce FuncBind, a framework based on computer vision to generate targetconditioned, all-atom molecules across atomic systems. FuncBind uses neural fields to represent molecules as continuous atomic densities and employs scorebased generative models with modern architectures adapted from the computer vision literature. This modality-agnostic representation allows a single unified model to be trained on diverse atomic systems, from small to large molecules, and handle variable atom/residue counts, including non-canonical amino acids. FuncBind achieves competitive in silico performance in generating small molecules, macrocyclic peptides, and antibody complementarity-determining region loops, conditioned on target structures. FuncBind also generated in vitro novel antibody binders via de novo redesign of the complementarity-determining region H3 loop of two chosen co-crystal structures. As a final contribution, we introduce a new dataset and benchmark for structure-conditioned macrocyclic peptide generation*.

cit, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.56)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Equi-mRNA: Protein Translation Equivariant Encoding for mRNALanguage Models

Neural Information Processing SystemsJun-15-2026, 18:20:47 GMT

The growing importance of mRNA therapeutics and synthetic biology highlights the need for models that capture the latent structure of synonymous codon (different triplets encoding the same amino acid) usage, which subtly modulates translation efficiency and gene expression. While recent efforts incorporate codon-level inductive biases through auxiliary objectives, they often fall short of explicitly modeling the structured relationships that arise from the genetic code's inherent symmetries. We introduce Equi-mRNA, the first codon-level equivariant mRNA language model that explicitly encodes synonymous codon symmetries as cyclic subgroups of 2D Special Orthogonal matrix (SO(2)). By combining group-theoretic priors with an auxiliary equivariance loss and symmetry-aware pooling, Equi-mRNA learns biologically grounded representations that outperform vanilla baselines across multiple axes. On downstream property-prediction tasks including expression, stability, and riboswitch switching Equi-mRNA delivers up to 10% improvements in accuracy. In sequence generation, it produces mRNA constructs that are up to 4 more realistic under Fréchet BioDistance metrics and 28% better preserve functional properties compared to vanilla baseline. Interpretability analyses further reveal that learned codon-rotation distributions recapitulate known GC-content biases and tRNA abundance patterns, offering novel insights into codon usage. Equi-mRNA establishes a new biologically principled paradigm for mRNA modeling.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.92)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Add feedback

Functional-Group-Based Diffusion for Pocket-Specific Molecule Generation and Elaboration

Neural Information Processing SystemsApr-28-2026, 12:41:18 GMT

In recent years, AI-assisted drug design methods have been proposed to generate molecules given the pockets' structures of target proteins. Most of them are atomlevel-based methods, which consider atoms as basic components and generate atom positions and types. In this way, however, it is hard to generate realistic fragments with complicated structures. To solve this, we propose D3FG, a functional-groupbased diffusion model for pocket-specific molecule generation and elaboration. D3FG decomposes molecules into two categories of components: functional groups defined as rigid bodies and linkers as mass points. And the two kinds of components can together form complicated fragments that enhance ligand-protein interactions. To be specific, in the diffusion process, D3FG diffuses the data distribution of the positions, orientations, and types of the components into a prior distribution; In the generative process, the noise is gradually removed from the three variables by denoisers parameterized with designed equivariant graph neural networks. In the experiments, our method can generate molecules with more realistic 3D structures, competitive affinities toward the protein targets, and better drug properties. Besides, D3FG as a solution to a new task of molecule elaboration, could generate molecules with high affinities based on existing ligands and the hotspots of target proteins.

artificial intelligence, functional group, machine learning, (17 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

PROSPECT: Labeled Tandem Mass Spectrometry Dataset for Machine Learning in Proteomics

Neural Information Processing SystemsApr-27-2026, 23:06:00 GMT

Proteomics is the interdisciplinary field focusing on the large-scale study of proteins. Proteins essentially organize and execute all functions within organisms. Today, the bottom-up analysis approach is the most commonly used workflow, where proteins are digested into peptides and subsequently analyzed using Tandem Mass Spectrometry (MS/MS). MS-based proteomics has transformed various fields in life sciences, such as drug discovery and biomarker identification. Today, proteomics is entering a phase where it is helpful for clinical decision-making. Computational methods are vital in turning large amounts of acquired raw MS data into information and, ultimately, knowledge.

bioinformatics, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > Germany (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

0fb7c02d420c993385c7de44c2b5bf01-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 22:04:47 GMT

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.69)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Protein contact prediction from amino acid co-evolution using convolutional networks for graph-valued images

Vladimir Golkov, Marcin J. Skwark, Antonij Golkov, Alexey Dosovitskiy, Thomas Brox, Jens Meiler, Daniel Cremers

Neural Information Processing SystemsMar-23-2026, 05:10:10 GMT

Proteins are responsible for most of the functions in life, and thus are the central focus of many areas of biomedicine. Protein structure is strongly related to protein function, but is difficult to elucidate experimentally, therefore computational structure prediction is a crucial task on the way to solve many biological questions. A contact map is a compact representation of the three-dimensional structure of a protein via the pairwise contacts between the amino acids constituting the protein. We use a convolutional network to calculate protein contact maps from detailed evolutionary coupling statistics between positions in the protein sequence. The input to the network has an image-like structure amenable to convolutions, but every "pixel" instead of color channels contains a bipartite undirected edge-weighted graph. We propose several methods for treating such "graph-valued images" in a convolutional network. The proposed method outperforms state-of-the-art methods by a considerable margin.

artificial intelligence, machine learning, protein, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > Germany (0.28)

Genre: Research Report (0.35)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Education > Health & Safety > School Nutrition (0.67)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

PROSPECT PTMs: Rich Labeled Tandem Mass Spectrometry Dataset of Modified Peptides for Machine Learning in Proteomics

Neural Information Processing SystemsMar-22-2026, 19:43:33 GMT

Post-Translational Modifications (PTMs) are changes that occur in proteins after synthesis, influencing their structure, function, and cellular behavior. PTMs are essential in cell biology; they regulate protein function and stability, are involved in various cellular processes, and are linked to numerous diseases. A particularly interesting class of PTMs are chemical modifications such as phosphorylation introduced on amino acid side chains because they can drastically alter the physicochemical properties of the peptides once they are present. One or more PTMs can be attached to each amino acid of the peptide sequence. The most commonly applied technique to detect PTMs on proteins is bottom-up Mass Spectrometry-based proteomics (MS), where proteins are digested into peptides and subsequently analyzed using Tandem Mass Spectrometry (MS/MS).

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.85)

Add feedback

AdaNovo: Towards Robust \emph{De Novo} Peptide Sequencing in Proteomics against Data Biases

Neural Information Processing SystemsMar-17-2026, 20:27:42 GMT

Tandem mass spectrometry has played a pivotal role in advancing proteomics, enabling the high-throughput analysis of protein composition in biological tissues. Despite the development of several deep learning methods for predicting amino acid sequences (peptides) responsible for generating the observed mass spectra, training data biases hinder further advancements of \emph{de novo} peptide sequencing. Firstly, prior methods struggle to identify amino acids with Post-Translational Modifications (PTMs) due to their lower frequency in training data compared to canonical amino acids, further resulting in unsatisfactory peptide sequencing performance. Secondly, various noise and missing peaks in mass spectra reduce the reliability of training data (Peptide-Spectrum Matches, PSMs). To address these challenges, we propose AdaNovo, a novel and domain knowledge-inspired framework that calculates Conditional Mutual Information (CMI) between the mass spectra and amino acids or peptides, using CMI for robust training against above biases. Extensive experiments indicate that AdaNovo outperforms previous competitors on the widely-used 9-species benchmark, meanwhile yielding 3.6\% - 9.4\% improvements in PTMs identification. The supplements contain the code.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Add feedback