

From Regression to Classification: Exploring the Benefits of Categorical Representations of Energy in MLIPs

Ali, Ahmad

arXiv.org Artificial Intelligence

Density Functional Theory (DFT) is a widely used computational method for estimating the energy and behavior of molecules. Machine Learning Interatomic Potentials (MLIPs) are models trained to approximate DFT-level energies and forces at dramatically lower computational cost. Many modern MLIPs rely on a scalar regression formulation; given information about a molecule, they predict a single energy value and corresponding forces while minimizing the absolute error with respect to DFT calculations. In this work, we explore a multi-class classification formulation that predicts a categorical distribution over energy/force values, providing richer supervision through multiple targets. Most importantly, this approach offers a principled way to quantify model uncertainty. In particular, our method predicts a histogram of the energy/force distribution, converts scalar targets into histograms, and trains the model using cross-entropy loss. Our results demonstrate that this categorical formulation can achieve absolute error performance comparable to regression baselines. Furthermore, this representation enables the quantification of epistemic uncertainty through the entropy of the predicted distribution, offering a measure of model confidence absent in scalar regression approaches.
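The pipeline the abstract describes can be sketched in a few lines; the bin grid, the two-bin linear interpolation of scalar targets, and all function names here are illustrative assumptions, not the authors' implementation.

```python
import math

# Sketch of the categorical-energy idea (assumed details, not the paper's code):
# a scalar DFT energy is spread over a fixed grid of bins, the model is trained
# with cross-entropy against that histogram, and the entropy of the predicted
# distribution serves as an uncertainty measure.

def energy_to_histogram(energy, bin_centers):
    """Spread a scalar target over the two nearest bins (linear interpolation)."""
    probs = [0.0] * len(bin_centers)
    if energy <= bin_centers[0]:
        probs[0] = 1.0
        return probs
    if energy >= bin_centers[-1]:
        probs[-1] = 1.0
        return probs
    for i in range(len(bin_centers) - 1):
        lo, hi = bin_centers[i], bin_centers[i + 1]
        if lo <= energy <= hi:
            w = (energy - lo) / (hi - lo)
            probs[i], probs[i + 1] = 1.0 - w, w
            return probs

def cross_entropy(target, predicted, eps=1e-12):
    """Training loss between the target histogram and the model's distribution."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))

def entropy(predicted, eps=1e-12):
    """Entropy of the predicted distribution: a proxy for model confidence."""
    return -sum(p * math.log(p + eps) for p in predicted if p > 0)
```

A sharply peaked prediction has low entropy (high confidence), while a near-uniform one has high entropy, which is the confidence signal the scalar regression setup lacks.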


LeMat-Traj: A Scalable and Unified Dataset of Materials Trajectories for Atomistic Modeling

Ramlaoui, Ali, Siron, Martin, Djafar, Inel, Musielewicz, Joseph, Rossello, Amandine, Schmidt, Victor, Duval, Alexandre

arXiv.org Artificial Intelligence

The development of accurate machine learning interatomic potentials (MLIPs) is limited by the fragmented availability and inconsistent formatting of quantum mechanical trajectory datasets derived from Density Functional Theory (DFT). These datasets are expensive to generate yet difficult to combine due to variations in format, metadata, and accessibility. To address this, we introduce LeMat-Traj, a curated dataset comprising over 120 million atomic configurations aggregated from large-scale repositories, including the Materials Project, Alexandria, and OQMD. LeMat-Traj standardizes data representation, harmonizes results, and filters for high-quality configurations across widely used DFT functionals (PBE, PBESol, SCAN, r2SCAN). It significantly lowers the barrier for training transferable and accurate MLIPs. LeMat-Traj spans both relaxed low-energy states and high-energy, high-force structures, complementing molecular dynamics and active learning datasets. By fine-tuning models pre-trained on high-force data with LeMat-Traj, we achieve a significant reduction in force prediction errors on relaxation tasks. We also present LeMaterial-Fetcher, a modular and extensible open-source library developed for this work, designed to provide a reproducible framework for the community to easily incorporate new data sources and ensure the continued evolution of large-scale materials datasets. LeMat-Traj and LeMaterial-Fetcher are publicly available at https://huggingface.co/datasets/LeMaterial/LeMat-Traj and https://github.com/LeMaterial/lematerial-fetcher.
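The harmonization-and-filtering step described above can be illustrated with a toy in-memory version; the record schema, field names, and filtering function below are assumptions for illustration, not the actual LeMat-Traj format.

```python
# Toy sketch (assumed schema, not the real LeMat-Traj layout) of selecting
# aggregated configurations by DFT functional, one of the harmonization steps
# the dataset performs across its source repositories.

SUPPORTED_FUNCTIONALS = {"PBE", "PBESol", "SCAN", "r2SCAN"}

def filter_by_functional(records, functional):
    """Keep only configurations computed with the requested functional."""
    if functional not in SUPPORTED_FUNCTIONALS:
        raise ValueError(f"unsupported functional: {functional}")
    return [r for r in records if r["functional"] == functional]

# Hypothetical records mimicking entries aggregated from different sources.
records = [
    {"id": "mp-1", "functional": "PBE", "energy": -3.74},
    {"id": "alex-9", "functional": "r2SCAN", "energy": -2.10},
    {"id": "oqmd-4", "functional": "PBE", "energy": -5.02},
]
```

Keeping functionals separated matters because energies from different functionals are not directly comparable; mixing them in one training target would corrupt the labels.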


Transformers Discover Molecular Structure Without Graph Priors

Kreiman, Tobias, Bai, Yutong, Atieh, Fadi, Weaver, Elizabeth, Qu, Eric, Krishnapriyan, Aditi S.

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) are the dominant architecture for molecular machine learning, particularly for molecular property prediction and machine learning interatomic potentials (MLIPs). GNNs perform message passing on predefined graphs often induced by a fixed radius cutoff or k-nearest neighbor scheme. While this design aligns with the locality present in many molecular tasks, a hard-coded graph can limit expressivity due to the fixed receptive field and slows down inference with sparse graph operations. In this work, we investigate whether pure, unmodified Transformers trained directly on Cartesian coordinates, without predefined graphs or physical priors, can approximate molecular energies and forces. As a starting point for our analysis, we demonstrate how to train a Transformer to competitive energy and force mean absolute errors under a matched training compute budget, relative to a state-of-the-art equivariant GNN on the OMol25 dataset. We discover that the Transformer learns physically consistent patterns, such as attention weights that decay inversely with interatomic distance, and flexibly adapts them across different molecular environments due to the absence of hard-coded biases. The use of a standard Transformer also unlocks predictable improvements with respect to scaling training resources, consistent with empirical scaling laws observed in other domains. Our results demonstrate that many favorable properties of GNNs can emerge adaptively in Transformers, challenging the necessity of hard-coded graph inductive biases and pointing toward standardized, scalable architectures for molecular modeling.
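The core architectural contrast here is that plain scaled dot-product attention lets every atom attend to every other atom, with no cutoff graph; any locality must be learned in the weights. A minimal sketch (toy sizes and features, not the paper's model):

```python
import math

# Vanilla scaled dot-product attention over per-atom tokens: no radius cutoff,
# no neighbor list, every atom attends to every atom. All inputs are
# illustrative; a real model would derive q/k/v from learned embeddings of
# Cartesian coordinates and atom types.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """For each query token, return the attention-weighted mix of values."""
    d = len(queries[0])
    out = []
    for q in queries:
        logits = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        w = softmax(logits)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out
```

Because the attention matrix is dense, the paper's observation that learned weights decay with interatomic distance is an emergent behavior rather than a hard-coded constraint.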


RoMoCo: Robotic Motion Control Toolbox for Reduced-Order Model-Based Locomotion on Bipedal and Humanoid Robots

Dai, Min, Ames, Aaron D.

arXiv.org Artificial Intelligence

RoMoCo leverages reduced-order models for platform-agnostic gait generation, enabling flexible controller design across diverse robots. We demonstrate its versatility and performance through extensive simulations on the Cassie, Unitree H1, and G1 robots, and validate its real-world efficacy with hardware experiments on the Cassie and G1 humanoids.


UMA: A Family of Universal Models for Atoms

Wood, Brandon M., Dzamba, Misko, Fu, Xiang, Gao, Meng, Shuaibi, Muhammed, Barroso-Luque, Luis, Abdelmaqsoud, Kareem, Gharakhanyan, Vahe, Kitchin, John R., Levine, Daniel S., Michel, Kyle, Sriram, Anuroop, Cohen, Taco, Das, Abhishek, Rizvi, Ammar, Sahoo, Sushree Jagriti, Ulissi, Zachary W., Zitnick, C. Lawrence

arXiv.org Artificial Intelligence

The ability to quickly and accurately compute properties from atomic simulations is critical for advancing a large number of applications in chemistry and materials science including drug discovery, energy storage, and semiconductor manufacturing. To address this need, Meta FAIR presents a family of Universal Models for Atoms (UMA), designed to push the frontier of speed, accuracy, and generalization. UMA models are trained on half a billion unique 3D atomic structures (the largest training runs to date) by compiling data across multiple chemical domains, e.g. molecules, materials, and catalysts. We develop empirical scaling laws to help understand how to increase model capacity alongside dataset size to achieve the best accuracy. The UMA small and medium models utilize a novel architectural design we refer to as mixture of linear experts that enables increasing model capacity without sacrificing speed. For example, UMA-medium has 1.4B parameters but only ~50M active parameters per atomic structure. We evaluate UMA models on a diverse set of applications across multiple domains and find that, remarkably, a single model without any fine-tuning can perform similarly or better than specialized models. We are releasing the UMA code, weights, and associated data to accelerate computational workflows and enable the community to continue to build increasingly capable AI models.
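The mixture-of-linear-experts idea (large total capacity, small active parameter count per structure) can be sketched as follows; the gating scheme, shapes, and function names are assumptions for illustration, not the UMA implementation.

```python
import math

# Sketch (assumed details, not UMA's code) of a mixture of linear experts:
# a per-structure gate mixes E weight matrices into one effective linear map,
# so total capacity grows with the number of experts while the per-structure
# compute stays that of a single linear layer.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def mixture_of_linear_experts(x, expert_weights, gate_logits):
    """expert_weights: list of E matrices; gate_logits: one logit per expert."""
    gates = softmax(gate_logits)
    rows, cols = len(expert_weights[0]), len(expert_weights[0][0])
    # Collapse the experts into one effective matrix, then apply it once.
    W = [[sum(g * We[i][j] for g, We in zip(gates, expert_weights))
          for j in range(cols)] for i in range(rows)]
    return [sum(W[i][j] * x[j] for j in range(cols)) for i in range(rows)]
```

This is how a model can hold 1.4B parameters in total while only a small fraction (~50M in UMA-medium's case) is active for any given atomic structure.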


Fast, Modular, and Differentiable Framework for Machine Learning-Enhanced Molecular Simulations

Christiansen, Henrik, Maruyama, Takashi, Errica, Federico, Zaverkin, Viktor, Takamoto, Makoto, Alesiani, Francesco

arXiv.org Artificial Intelligence

NEC Laboratories Europe GmbH, Kurfürsten-Anlage 36, 69115 Heidelberg, Germany (dated March 27, 2025). We present an end-to-end differentiable molecular simulation framework (DIMOS) for molecular dynamics and Monte Carlo simulations. DIMOS easily integrates machine-learning-based interatomic potentials and implements classical force fields, including particle-mesh Ewald electrostatics. Thanks to its modularity, classical and machine-learning-based approaches can be easily combined into a hybrid description of the system (ML/MM). The framework's performance and versatility are demonstrated in different benchmarks and applications, with speed-up factors of up to 170x. The advantage of differentiability is demonstrated by an end-to-end optimization of the proposal distribution in a Markov Chain Monte Carlo simulation based on Hamiltonian Monte Carlo; using these optimized simulation parameters, a 3x acceleration is observed in comparison to ad-hoc chosen parameters. Molecular simulations are a cornerstone of modern computational physics, chemistry, and biology, enabling researchers to understand complex properties of a system [1]. Traditional molecular dynamics (MD) and Markov Chain Monte Carlo (MCMC) simulations rely on pre-defined force fields and specialized software to achieve large timescales and efficient sampling of rugged free-energy landscapes [2].
However, conventional MD and MCMC simulation packages generally lack the flexibility and modularity to easily incorporate cutting-edge computational techniques such as machine learning (ML) based enhancements: advances in machine learning interatomic potentials (MLIPs) promise improved accuracy for MD simulations [3], yet integrating these techniques into a scalable and user-friendly framework remains a major challenge, especially when developing novel approaches [4]. Here we present DIMOS, implemented in PyTorch [5], a popular library for ML research. DIMOS implements essential algorithms to perform MD and MCMC simulations, providing an easy-to-use way to interface MLIPs and an efficient implementation of classical force-field components, in addition to implementations of common integrators and barostats. Additional components are the efficient calculation of neighbor lists and constraint algorithms that allow for larger timesteps of the numerical integrator. By relying on PyTorch, DIMOS inherits many advances achieved by the ML community: fast execution speed on diverse hardware platforms, combined with a simple-to-use, modular interface implemented in Python.
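The modularity described above hinges on the integrator seeing only a force callable, so a classical force field, an MLIP, or an ML/MM hybrid can be swapped in behind the same interface. A minimal sketch (not DIMOS itself; the integrator and the stand-in force are illustrative):

```python
# Sketch of a modular MD loop: velocity Verlet parameterized by an arbitrary
# force function. In a framework like the one described, the same interface
# would accept a classical force field, an MLIP, or a hybrid of both.

def velocity_verlet(positions, velocities, force_fn, masses, dt, steps):
    """Integrate Newton's equations with the symplectic velocity Verlet scheme."""
    forces = force_fn(positions)
    for _ in range(steps):
        velocities = [v + 0.5 * dt * f / m
                      for v, f, m in zip(velocities, forces, masses)]
        positions = [x + dt * v for x, v in zip(positions, velocities)]
        forces = force_fn(positions)
        velocities = [v + 0.5 * dt * f / m
                      for v, f, m in zip(velocities, forces, masses)]
    return positions, velocities

def harmonic_force(k=1.0):
    """Toy stand-in 'force field'; an MLIP would plug in through the same slot."""
    return lambda xs: [-k * x for x in xs]
```

Velocity Verlet is symplectic, so total energy stays near-constant over long runs, which is a quick sanity check for any force implementation plugged into the loop.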


Equivariant Machine Learning Interatomic Potentials with Global Charge Redistribution

Maruf, Moin Uddin, Kim, Sungmin, Ahmad, Zeeshan

arXiv.org Artificial Intelligence

Machine learning interatomic potentials (MLIPs) provide a computationally efficient alternative to quantum mechanical simulations for predicting material properties. Message-passing graph neural networks, commonly used in these MLIPs, rely on local descriptor-based symmetry functions to model atomic interactions. However, such local descriptor-based approaches struggle with systems exhibiting long-range interactions, charge transfer, and compositional heterogeneity. In this work, we develop a new equivariant MLIP incorporating long-range Coulomb interactions through explicit treatment of electronic degrees of freedom, specifically global charge distribution within the system. This is achieved using a charge equilibration scheme based on predicted atomic electronegativities. We systematically evaluate our model across a range of benchmark periodic and non-periodic datasets, demonstrating that it outperforms both short-range equivariant and long-range invariant MLIPs in energy and force predictions. Our approach enables more accurate and efficient simulations of systems with long-range interactions and charge heterogeneity, expanding the applicability of MLIPs in computational materials science.
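A charge-equilibration step of the kind described admits a compact closed form when only per-atom electronegativity and hardness are kept; the sketch below makes that simplification (the actual model also couples charges through Coulomb interactions) and all names are illustrative.

```python
# Simplified charge equilibration (diagonal hardness only, no off-diagonal
# Coulomb terms): given predicted electronegativities chi_i and hardnesses J_i,
# find charges q_i minimizing sum_i (chi_i*q_i + 0.5*J_i*q_i**2) subject to
# sum_i q_i = Q. Stationarity gives chi_i + J_i*q_i = mu for a shared chemical
# potential mu, fixed by the total-charge constraint.

def equilibrate_charges(chi, hardness, total_charge=0.0):
    inv_j = [1.0 / j for j in hardness]
    mu = (total_charge + sum(c / j for c, j in zip(chi, hardness))) / sum(inv_j)
    return [(mu - c) / j for c, j in zip(chi, hardness)]
```

The key qualitative behavior survives the simplification: the atom with the higher predicted electronegativity ends up with the more negative charge, and the charges redistribute globally whenever any chi changes, which is what a purely local descriptor cannot capture.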


Ensemble Knowledge Distillation for Machine Learning Interatomic Potentials

Matin, Sakib, Shinkle, Emily, Pimonova, Yulia, Craven, Galen T., Pachalieva, Aleksandra, Li, Ying Wai, Barros, Kipton, Lubbers, Nicholas

arXiv.org Artificial Intelligence

Machine learning interatomic potentials (MLIPs) are a promising tool to accelerate atomistic simulations and molecular property prediction. The quality of MLIPs strongly depends on the quantity of available training data as well as the quantum chemistry (QC) level of theory used to generate that data. Datasets generated with high-fidelity QC methods, such as coupled cluster, are typically restricted to small molecules and may be missing energy gradients. With this limited quantity of data, it is often difficult to train good MLIP models. We present an ensemble knowledge distillation (EKD) method to improve MLIP accuracy when trained to energy-only datasets. In our EKD approach, first, multiple teacher models are trained to QC energies and then used to generate atomic forces for all configurations in the dataset. Next, a student MLIP is trained to both QC energies and to ensemble-averaged forces generated by the teacher models. We apply this workflow on the ANI-1ccx dataset which consists of organic molecules with configuration energies computed at the coupled cluster level of theory. The resulting student MLIPs achieve new state-of-the-art accuracy on the out-of-sample COMP6 benchmark and improved stability for molecular dynamics simulations. The EKD approach for MLIP is broadly applicable for chemical, biomolecular and materials science simulations.
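The two-stage recipe above (teachers supply forces, the student fits energies plus ensemble-averaged forces) can be sketched as follows; the loss weighting and function names are illustrative assumptions, not the paper's implementation.

```python
# Sketch of the ensemble knowledge distillation (EKD) targets: average the
# per-atom forces produced by several teacher models, then train the student
# against both the quantum-chemistry energy and those distilled forces.
# Forces are scalars here for brevity; real forces are 3-vectors per atom.

def ensemble_average_forces(teacher_forces):
    """teacher_forces: one per-atom force list per teacher model."""
    n_teachers = len(teacher_forces)
    n_atoms = len(teacher_forces[0])
    return [sum(t[i] for t in teacher_forces) / n_teachers
            for i in range(n_atoms)]

def student_loss(pred_energy, qc_energy, pred_forces, distilled_forces,
                 w_force=1.0):
    """Combined objective: QC energy error plus distilled-force error."""
    e_term = (pred_energy - qc_energy) ** 2
    f_term = sum((p - d) ** 2 for p, d in zip(pred_forces, distilled_forces))
    return e_term + w_force * f_term
```

The point of the averaging is that individual teacher forces are noisy estimates; the ensemble mean is a smoother target, which is what lets an energy-only dataset supply a usable force signal to the student.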


Accurate, transferable, and verifiable machine-learned interatomic potentials for layered materials

Georgaras, Johnathan D., Ramdas, Akash, Shan, Chung Hsuan, Halsted, Elena, Berwyn, Li, Tianshu, da Jornada, Felipe H.

arXiv.org Artificial Intelligence

Twisted layered van-der-Waals materials often exhibit unique electronic and optical properties absent in their non-twisted counterparts. Unfortunately, predicting such properties is hindered by the difficulty in determining the atomic structure in materials displaying large moiré domains. Here, we introduce a split machine-learned interatomic potential and dataset curation approach that separates intralayer and interlayer interactions and significantly improves model accuracy, with a tenfold increase in energy and force prediction accuracy relative to conventional models. We further demonstrate that traditional MLIP validation metrics (force and energy errors) are inadequate for moiré structures and develop a more holistic, physically-motivated metric based on the distribution of stacking configurations. This metric effectively compares the entirety of large-scale moiré domains between two structures instead of relying on conventional measures evaluated on smaller commensurate cells. Finally, we establish that one-dimensional instead of two-dimensional moiré structures can serve as efficient surrogate systems for validating MLIPs, allowing for a practical model validation protocol against explicit DFT calculations. Applying our framework to HfS2/GaS bilayers reveals that accurate structural predictions directly translate into reliable electronic properties. Our model-agnostic approach integrates seamlessly with various intralayer and interlayer interaction models, enabling computationally tractable relaxation of moiré materials, from bilayer to complex multilayers, with rigorously validated accuracy.


A practical guide to machine learning interatomic potentials -- Status and future

Jacobs, Ryan, Morgan, Dane, Attarian, Siamak, Meng, Jun, Shen, Chen, Wu, Zhenghao, Xie, Clare Yijia, Yang, Julia H., Artrith, Nongnuch, Blaiszik, Ben, Ceder, Gerbrand, Choudhary, Kamal, Csanyi, Gabor, Cubuk, Ekin Dogus, Deng, Bowen, Drautz, Ralf, Fu, Xiang, Godwin, Jonathan, Honavar, Vasant, Isayev, Olexandr, Johansson, Anders, Kozinsky, Boris, Martiniani, Stefano, Ong, Shyue Ping, Poltavsky, Igor, Schmidt, KJ, Takamoto, So, Thompson, Aidan, Westermayr, Julia, Wood, Brandon M.

arXiv.org Artificial Intelligence

The rapid development and large body of literature on machine learning interatomic potentials (MLIPs) can make it difficult to know how to proceed for researchers who are not experts but wish to use these tools. The spirit of this review is to help such researchers by serving as a practical, accessible guide to the state-of-the-art in MLIPs. This review paper covers a broad range of topics related to MLIPs, including (i) central aspects of how and why MLIPs are enablers of many exciting advancements in molecular modeling, (ii) the main underpinnings of different types of MLIPs, including their basic structure and formalism, (iii) the potentially transformative impact of universal MLIPs for both organic and inorganic systems, including an overview of the most recent advances, capabilities, downsides, and potential applications of this nascent class of MLIPs, (iv) a practical guide for estimating and understanding the execution speed of MLIPs, including guidance for users based on hardware availability, type of MLIP used, and prospective simulation size and time, (v) a manual for which MLIP a user should choose for a given application by considering hardware resources, speed requirements, and energy and force accuracy requirements, as well as guidance for choosing pre-trained potentials or fitting a new potential from scratch, (vi) discussion around MLIP infrastructure, including sources of training data, pre-trained potentials, and hardware resources for training, (vii) a summary of some key limitations of present MLIPs and current approaches to mitigate such limitations, including methods of including long-range interactions, handling magnetic systems, and treatment of excited states, and finally (viii) some more speculative thoughts on what the future holds for the development and application of MLIPs over the next 3-10+ years.