AITopics | chemical reaction

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

Neural Information Processing Systems, Dec-23-2025, 18:51:19 GMT

CARE: a Benchmark Suite for the Classification and Retrieval of Enzymes

Enzymes are important proteins that catalyze chemical reactions. In recent years, machine learning methods have emerged to predict enzyme function from sequence; however, there are no standardized benchmarks to evaluate these methods. We introduce CARE, a benchmark and dataset suite for the Classification And Retrieval of Enzymes. CARE centers on two tasks: (1) classification of a protein sequence by its enzyme commission (EC) number and (2) retrieval of an EC number given a chemical reaction. For each task, we design train-test splits to evaluate different kinds of out-of-distribution generalization that are relevant to real use cases. For the classification task, we provide baselines for state-of-the-art methods. Because the retrieval task has not been previously formalized, we propose a method called Contrastive Reaction-EnzymE Pretraining (CREEP) as one of the first baselines for this task and compare it to the recent method CLIPZyme. CARE is available at https://github.com/jsunn-y/CARE/.
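EC numbers are four-level hierarchical identifiers (for example, 1.1.1.1 is alcohol dehydrogenase), so a predicted label can be scored by how many leading levels agree with the true one. The following is a minimal sketch of that idea only; the function names and the toy label pairs are assumptions made for illustration and are not taken from the CARE code base or its evaluation protocol.

```python
def parse_ec(ec):
    """Split an EC number such as '1.1.1.1' into its four levels."""
    parts = tuple(ec.strip().split("."))
    if len(parts) != 4:
        raise ValueError(f"expected four levels, got {ec!r}")
    return parts


def matching_levels(true_ec, pred_ec):
    """Count how many leading EC levels agree (0 to 4)."""
    depth = 0
    for t, p in zip(parse_ec(true_ec), parse_ec(pred_ec)):
        if t != p:
            break
        depth += 1
    return depth


def accuracy_at_level(pairs, level):
    """Fraction of (true, predicted) pairs agreeing on the first `level` levels."""
    hits = sum(matching_levels(t, p) >= level for t, p in pairs)
    return hits / len(pairs)


# Toy (true, predicted) pairs, invented for illustration.
pairs = [
    ("1.1.1.1", "1.1.1.1"),
    ("1.1.1.1", "1.1.1.2"),
    ("2.7.11.1", "2.7.1.1"),
    ("3.4.21.4", "4.1.1.1"),
]

for level in (1, 2, 3, 4):
    print(level, accuracy_at_level(pairs, level))
```

Reporting accuracy per level makes it visible when a model gets the broad reaction class right but misses the specific substrate level.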

artificial intelligence, machine learning, proceedings, (7 more...)

Country: Europe > France (0.07)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Nov-24-2025

Generating transition states of chemical reactions via distance-geometry-based flow matching

Luo, Yufei, Gu, Xiang, Sun, Jian

Transition states (TSs) are crucial for understanding reaction mechanisms, yet their exploration is limited by the complexity of experimental and computational approaches. Here we propose TS-DFM, a flow matching framework that predicts TSs from reactants and products. By operating in molecular distance geometry space, TS-DFM explicitly captures the dynamic changes of interatomic distances in chemical reactions. A network structure named TSDVNet is designed to learn the velocity field for generating TS geometries accurately. On the benchmark dataset Transition1X, TS-DFM outperforms the previous state-of-the-art method React-OT by 30% in structural accuracy. These predicted TSs provide high-quality initial structures, accelerating the convergence of CI-NEB optimization. Additionally, TS-DFM can identify alternative reaction paths: in our experiments, it even discovers a more favorable TS with a lower energy barrier. Further tests on the RGD1 dataset confirm its strong generalization ability on unseen molecules and reaction types, highlighting its potential for facilitating reaction exploration.
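To make the phrase "flow matching in distance geometry space" concrete, the sketch below builds interatomic distance matrices for two geometries, interpolates linearly between them, and computes the constant velocity of that straight-line path, which is the regression target in the simplest form of flow matching. This is an illustrative assumption, not the TS-DFM training objective or the TSDVNet architecture; the three-atom coordinates are invented.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between 3D atom positions."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def interpolate(d0, d1, t):
    """Point on the straight-line path between two distance matrices."""
    return [[(1 - t) * a + t * b for a, b in zip(r0, r1)] for r0, r1 in zip(d0, d1)]


def target_velocity(d0, d1):
    """Velocity of the straight-line path; constant in t."""
    return [[b - a for a, b in zip(r0, r1)] for r0, r1 in zip(d0, d1)]


# Hypothetical collinear system: atom B moves from A towards C.
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
end = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]

d_start = distance_matrix(start)
d_end = distance_matrix(end)
d_mid = interpolate(d_start, d_end, 0.5)
velocity = target_velocity(d_start, d_end)

print(d_mid[0][1], d_mid[1][2])        # A-B and B-C distances at t = 0.5
print(velocity[0][1], velocity[1][2])  # A-B lengthens, B-C shortens
```

Working with distances rather than Cartesian coordinates makes the representation invariant to rotation and translation, which is one commonly cited motivation for distance-geometry approaches.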

artificial intelligence, machine learning, reaction, (18 more...)

2511.17229

Country:

North America (0.14)
Asia > China > Shaanxi Province > Xi'an (0.05)
Europe > United Kingdom > North Sea > Southern North Sea (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Neural Information Processing Systems, Nov-14-2025, 14:45:11 GMT

792c7b5aae4a79e78aaeda80516ae2ac-Supplemental.pdf

artificial intelligence, fraction, machine learning, (17 more...)

Country:

Asia > China > Hubei Province > Wuhan (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)

Genre: Research Report (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Neural Information Processing Systems, Nov-14-2025, 06:11:48 GMT

Reinforced Genetic Algorithm for Structure-based Drug Design

The neural models take the 3D structure of the targets and ligands as inputs and are pre-trained using native complex structures to utilize the knowledge of the shared binding physics from different targets and then fine-tuned during optimization.

evolutionary algorithm, machine learning, reinforcement learning, (21 more...)

Country:

North America > United States > Illinois (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Materials (0.92)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)

Neural Information Processing Systems, Nov-13-2025, 12:23:22 GMT

AI for Interpretable Chemistry: Predicting Radical Mechanistic Pathways via Contrastive Learning

These models often struggle to predict balanced reactions and lack crucial information regarding intermediate byproducts. Moreover, they fall short in offering explanations regarding the underlying chemistry responsible for the predicted products.

artificial intelligence, machine learning, natural language, (22 more...)

Country:

North America > United States > California > Orange County > Irvine (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > France (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
Information Technology > Artificial Intelligence > Cognitive Science (0.93)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

arXiv.org Artificial Intelligence, Nov-11-2025

ChemBOMAS: Accelerated BO in Chemistry with LLM-Enhanced Multi-Agent System

Han, Dong, Ai, Zhehong, Cai, Pengxiang, Lu, Shanya, Chen, Jianpeng, Ye, Zihao, Sun, Shuzhou, Gao, Ben, Ge, Lingli, Wang, Weida, Zhou, Xiangxin, Liu, Xihui, Su, Mao, Ouyang, Wanli, Bai, Lei, Zhou, Dongzhan, Xu, Tao, Li, Yuqiang, Zhang, Shufei

Bayesian optimization (BO) is a powerful tool for scientific discovery in chemistry, yet its efficiency is often hampered by sparse experimental data and a vast search space. Here, we introduce ChemBOMAS: a large language model (LLM)-enhanced multi-agent system that accelerates BO through synergistic data- and knowledge-driven strategies. First, the data-driven strategy involves an 8B-scale LLM regressor, fine-tuned on a mere 1% of labeled samples, that generates pseudo-data to robustly initialize the optimization process. Second, the knowledge-driven strategy employs a hybrid Retrieval-Augmented Generation approach to guide the LLM in dividing the search space while mitigating LLM hallucinations. An Upper Confidence Bound algorithm then identifies high-potential subspaces within this established partition. Operating within the LLM-refined subspaces and supported by LLM-generated data, BO becomes both more effective and more efficient. Comprehensive evaluations across multiple scientific benchmarks demonstrate that ChemBOMAS sets a new state of the art, accelerating optimization efficiency by up to 5-fold compared to baseline methods.
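The subspace-selection step can be pictured as a multi-armed bandit in which each subspace of the partition is an arm. The sketch below uses the textbook UCB1 rule; the abstract does not say which UCB variant ChemBOMAS uses, so the rule, the exploration constant, and the fixed per-subspace rewards are all assumptions made for illustration.

```python
import math


def ucb_select(counts, means, c=math.sqrt(2)):
    """Pick the arm with the highest upper confidence bound (UCB1)."""
    for arm, n in enumerate(counts):
        if n == 0:  # try every arm once before comparing bounds
            return arm
    total = sum(counts)
    scores = [m + c * math.sqrt(math.log(total) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def update(counts, means, arm, reward):
    """Incrementally update the running mean reward of one arm."""
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]


# Hypothetical deterministic payoff of three subspaces (e.g. best yield found).
subspace_reward = [0.2, 0.9, 0.4]
counts = [0, 0, 0]
means = [0.0, 0.0, 0.0]

for _ in range(200):
    arm = ucb_select(counts, means)
    update(counts, means, arm, subspace_reward[arm])

print(counts)  # the second subspace receives most of the evaluations
```

The square-root bonus shrinks as a subspace is sampled more often, so evaluations concentrate on promising subspaces without abandoning the rarely visited ones entirely.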

large language model, machine learning, natural language, (19 more...)

2509.08736

Country:

Africa > South Sudan > Greater Upper Nile > Greater Pibor Administrative Area > Boma (0.40)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Hong Kong (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Materials > Chemicals (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

arXiv.org Artificial Intelligence, Oct-13-2025

Prime Implicant Explanations for Reaction Feasibility Prediction

Weinbauer, Klaus, Phan, Tieu-Long, Stadler, Peter F., Gärtner, Thomas, Malhotra, Sagar

Machine learning models that predict the feasibility of chemical reactions have become central to automated synthesis planning. Despite their predictive success, these models often lack transparency and interpretability. We introduce a novel formulation of prime implicant explanations--also known as minimally sufficient reasons--tailored to this domain, and propose an algorithm for computing such explanations in small-scale reaction prediction tasks. Preliminary experiments demonstrate that our notion of prime implicant explanations conservatively captures the ground truth explanations. That is, such explanations often contain redundant bonds and atoms but consistently capture the molecular attributes that are essential for predicting reaction feasibility.
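A minimally sufficient reason for a prediction is a subset of the input features that, when held fixed, forces the same prediction no matter how the remaining features are set, and from which no feature can be dropped. The sketch below computes one such subset by brute force for a toy Boolean classifier; the classifier, the feature encoding, and the function names are hypothetical and do not reproduce the paper's formulation for reaction graphs or its algorithm.

```python
from itertools import product


def is_sufficient(f, x, kept):
    """True if fixing the features in `kept` to their values in x forces f(x)."""
    free = [i for i in range(len(x)) if i not in kept]
    target = f(x)
    for values in product((0, 1), repeat=len(free)):
        y = list(x)
        for i, v in zip(free, values):
            y[i] = v
        if f(tuple(y)) != target:
            return False
    return True


def minimal_sufficient_reason(f, x):
    """Greedily drop features while the remainder stays sufficient.

    Sufficiency is monotone (any superset of a sufficient set is sufficient),
    so a single pass yields a subset-minimal result.
    """
    kept = set(range(len(x)))
    for i in range(len(x)):
        trial = kept - {i}
        if is_sufficient(f, x, trial):
            kept = trial
    return kept


# Hypothetical feasibility rule over four binary attributes.
def feasible(x):
    return int((x[0] and x[1]) or x[2])


instance = (1, 1, 0, 1)
reason = minimal_sufficient_reason(feasible, instance)
print(sorted(reason))  # indices of the attributes that explain the prediction
```

The exhaustive check is exponential in the number of free features, which is consistent with the abstract restricting the approach to small-scale tasks.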

explanation, machine learning, natural language, (15 more...)

2510.09226

Country:

Europe > Austria > Vienna (0.14)
Europe > Germany > Saxony > Leipzig (0.04)
South America > Colombia > Bogotá D.C. > Bogotá (0.04)
(5 more...)

Genre: Research Report (0.41)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Neural Information Processing Systems, Oct-9-2025, 15:40:02 GMT

4fe1859112230a032c7143a9adc3be78-Supplemental-Conference.pdf

artificial intelligence, machine learning, reinforcement learning, (20 more...)

Country:

North America > United States > Illinois (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Materials (0.92)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

arXiv.org Artificial Intelligence, Oct-9-2025

Chem-NMF: Multi-layer $α$-divergence Non-Negative Matrix Factorization for Cardiorespiratory Disease Clustering, with Improved Convergence Inspired by Chemical Catalysts and Rigorous Asymptotic Analysis

Torabi, Yasaman, Shirani, Shahram, Reilly, James P.

Non-Negative Matrix Factorization (NMF) is an unsupervised learning method offering low-rank representations across various domains such as audio processing, biomedical signal analysis, and image recognition. The incorporation of $α$-divergence in NMF formulations enhances flexibility in optimization, yet extending these methods to multi-layer architectures presents challenges in ensuring convergence. To address this, we introduce a novel approach to theoretical convergence analysis, inspired by the Boltzmann probability of the energy barriers in chemical reactions. The resulting method, called Chem-NMF, includes a bounding factor that stabilizes convergence. To our knowledge, this is the first study to apply a physical chemistry perspective to rigorously analyze the convergence behaviour of the NMF algorithm. We start from mathematically proven asymptotic convergence results and then show how they apply to real data. Experimental results demonstrate that the proposed algorithm improves clustering accuracy by 5.6% $\pm$ 2.7% on biomedical signals and 11.1% $\pm$ 7.2% on face images (mean $\pm$ std).
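For orientation, the sketch below implements the standard single-layer multiplicative update for NMF under the $α$-divergence, in plain Python on a small matrix. It is not Chem-NMF: it has one layer only and omits the bounding factor, whose form the abstract does not give. The test matrix, the default $α$ = 0.5, and all function names are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def alpha_divergence(v, q, alpha):
    """Alpha-divergence D_alpha(V || Q), valid for alpha not in {0, 1}."""
    total = 0.0
    for row_v, row_q in zip(v, q):
        for x, y in zip(row_v, row_q):
            total += x ** alpha * y ** (1 - alpha) - alpha * x + (alpha - 1) * y
    return total / (alpha * (alpha - 1))


def power_ratio(v, q, alpha):
    """Elementwise (V / Q) ** alpha."""
    return [[(x / y) ** alpha for x, y in zip(rv, rq)] for rv, rq in zip(v, q)]


def alpha_nmf(v, rank, alpha=0.5, iters=200, seed=0):
    """Factor a strictly positive matrix V into W @ H with multiplicative updates."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(m)]
    h = [[rng.uniform(0.1, 1.0) for _ in range(n)] for _ in range(rank)]
    history = [alpha_divergence(v, matmul(w, h), alpha)]
    for _ in range(iters):
        # Update H with W fixed.
        r = power_ratio(v, matmul(w, h), alpha)
        for k in range(rank):
            col_sum = sum(w[i][k] for i in range(m))
            for j in range(n):
                num = sum(w[i][k] * r[i][j] for i in range(m))
                h[k][j] *= (num / col_sum) ** (1 / alpha)
        # Update W with the new H fixed.
        r = power_ratio(v, matmul(w, h), alpha)
        for k in range(rank):
            row_sum = sum(h[k])
            for i in range(m):
                num = sum(r[i][j] * h[k][j] for j in range(n))
                w[i][k] *= (num / row_sum) ** (1 / alpha)
        history.append(alpha_divergence(v, matmul(w, h), alpha))
    return w, h, history


# Small positive matrix of rank 2, invented for illustration.
V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0], [3.0, 5.0, 7.0]]
W, H, history = alpha_nmf(V, rank=2)
print(history[0], history[-1])  # divergence before and after fitting
```

Because every update multiplies by a positive factor, W and H stay non-negative without any explicit projection step.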

artificial intelligence, machine learning, pattern recognition, (15 more...)

2510.06632

Country:

North America > Canada > Ontario > Hamilton (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Iran (0.04)

Genre: Research Report > Promising Solution (0.54)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.66)