Goto

Collaborating Authors

 energy difference


Enhanced Diffusion Sampling: Efficient Rare Event Sampling and Free Energy Calculation with Diffusion Models

Xie, Yu, Winkler, Ludwig, Sun, Lixin, Lewis, Sarah, Foster, Adam E., Luna, José Jiménez, Hempel, Tim, Gastegger, Michael, Chen, Yaoyi, Zaporozhets, Iryna, Clementi, Cecilia, Bishop, Christopher M., Noé, Frank

arXiv.org Machine Learning

The rare-event sampling problem has long been the central limiting factor in molecular dynamics (MD), especially in biomolecular simulation. Recently, diffusion models such as BioEmu have emerged as powerful equilibrium samplers that generate independent samples from complex molecular distributions, eliminating the cost of sampling rare transition events. However, a sampling problem remains when computing observables that rely on states which are rare in equilibrium, for example folding free energies. Here, we introduce enhanced diffusion sampling, enabling efficient exploration of rare-event regions while preserving unbiased thermodynamic estimators. The key idea is to perform quantitatively accurate steering protocols to generate biased ensembles and subsequently recover equilibrium statistics via exact reweighting. We instantiate our framework in three algorithms: UmbrellaDiff (umbrella sampling with diffusion models), $Δ$G-Diff (free-energy differences via tilted ensembles), and MetaDiff (a batchwise analogue for metadynamics). Across toy systems, protein folding landscapes and folding free energies, our methods achieve fast, accurate, and scalable estimation of equilibrium properties within GPU-minutes to hours per system -- closing the rare-event sampling gap that remained after the advent of diffusion-model equilibrium samplers.


EP-GAT: Energy-based Parallel Graph Attention Neural Network for Stock Trend Classification

Jiang, Zhuodong, Zhang, Pengju, Martin, Peter

arXiv.org Artificial Intelligence

Graph neural networks have shown remarkable performance in forecasting stock movements, which arises from learning complex inter-dependencies between stocks and intra-dynamics of stocks. Existing approaches based on graph neural networks typically rely on static or manually defined factors to model changing inter-dependencies between stocks. Furthermore, these works often struggle to preserve hierarchical features within stocks. To bridge these gaps, this work presents the Energy-based Parallel Graph Attention Neural Network, a novel approach for predicting future movements for multiple stocks. First, it generates a dynamic stock graph with the energy difference between stocks and Boltzmann distribution, capturing evolving inter-dependencies between stocks. Then, a parallel graph attention mechanism is proposed to preserve the hierarchical intra-stock dynamics. Extensive experiments on five real-world datasets are conducted to validate the proposed approach, spanning from the US stock markets (NASDAQ, NYSE, SP) and UK stock markets (FTSE, LSE). The experimental results demonstrate that EP-GAT consistently outperforms competitive five baselines on test periods across various metrics. The ablation studies and hyperparameter sensitivity analysis further validate the effectiveness of each module in the proposed method.


Uncovering Conceptual Blindspots in Generative Image Models Using Sparse Autoencoders

Bohacek, Matyas, Fel, Thomas, Agrawala, Maneesh, Lubana, Ekdeep Singh

arXiv.org Artificial Intelligence

Despite their impressive performance, generative image models trained on large-scale datasets frequently fail to produce images with seemingly simple concepts -- e.g., human hands or objects appearing in groups of four -- that are reasonably expected to appear in the training data. These failure modes have largely been documented anecdotally, leaving open the question of whether they reflect idiosyncratic anomalies or more structural limitations of these models. To address this, we introduce a systematic approach for identifying and characterizing "conceptual blindspots" -- concepts present in the training data but absent or misrepresented in a model's generations. Our method leverages sparse autoencoders (SAEs) to extract interpretable concept embeddings, enabling a quantitative comparison of concept prevalence between real and generated images. We train an archetypal SAE (RA-SAE) on DINOv2 features with 32,000 concepts -- the largest such SAE to date -- enabling fine-grained analysis of conceptual disparities. Applied to four popular generative models (Stable Diffusion 1.5/2.1, PixArt, and Kandinsky), our approach reveals specific suppressed blindspots (e.g., bird feeders, DVD discs, and whitespaces on documents) and exaggerated blindspots (e.g., wood background texture and palm trees). At the individual datapoint level, we further isolate memorization artifacts -- instances where models reproduce highly specific visual templates seen during training. Overall, we propose a theoretically grounded framework for systematically identifying conceptual blindspots in generative models by assessing their conceptual fidelity with respect to the underlying data-generating process.


Cross-functional transferability in universal machine learning interatomic potentials

Huang, Xu, Deng, Bowen, Zhong, Peichen, Kaplan, Aaron D., Persson, Kristin A., Ceder, Gerbrand

arXiv.org Artificial Intelligence

The rapid development of universal machine learning interatomic potentials (uMLIPs) has demonstrated the possibility for generalizable learning of the universal potential energy surface. In principle, the accuracy of uMLIPs can be further improved by bridging the model from lower-fidelity datasets to high-fidelity ones. In this work, we analyze the challenge of this transfer learning problem within the CHGNet framework. We show that significant energy scale shifts and poor correlations between GGA and r$^2$SCAN pose challenges to cross-functional data transferability in uMLIPs. By benchmarking different transfer learning approaches on the MP-r$^2$SCAN dataset of 0.24 million structures, we demonstrate the importance of elemental energy referencing in the transfer learning of uMLIPs. By comparing the scaling law with and without the pre-training on a low-fidelity dataset, we show that significant data efficiency can still be achieved through transfer learning, even with a target dataset of sub-million structures. We highlight the importance of proper transfer learning and multi-fidelity learning in creating next-generation uMLIPs on high-fidelity data.


An Efficient Learning Method to Connect Observables

Yu, Hang, Miyagi, Takayuki

arXiv.org Machine Learning

Constructing fast and accurate surrogate models is a key ingredient for making robust predictions in many topics. We introduce a new model, the Multiparameter Eigenvalue Problem (MEP) emulator. The new method connects emulators and can make predictions directly from observables to observables. We present that the MEP emulator can be trained with data from Eigenvector Continuation (EC) and Parametric Matrix Model (PMM) emulators. A simple simulation on a one-dimensional lattice confirms the performance of the MEP emulator. Using $^{28}$O as an example, we also demonstrate that the predictive probability distribution of the target observables can be easily obtained through the new emulator.


Riemannian Denoising Score Matching for Molecular Structure Optimization with Accurate Energy

Woo, Jeheon, Kim, Seonghwan, Kim, Jun Hyeong, Kim, Woo Youn

arXiv.org Artificial Intelligence

This study introduces a modified score matching method aimed at generating molecular structures with high energy accuracy. The denoising process of score matching or diffusion models mirrors molecular structure optimization, where scores act like physical force fields that guide particles toward equilibrium states. To achieve energetically accurate structures, it can be advantageous to have the score closely approximate the gradient of the actual potential energy surface. Unlike conventional methods that simply design the target score based on structural differences in Euclidean space, we propose a Riemannian score matching approach. This method represents molecular structures on a manifold defined by physics-informed internal coordinates to efficiently mimic the energy landscape, and performs noising and denoising within this space. Our method has been evaluated by refining several types of starting structures on the QM9 and GEOM datasets, demonstrating that the proposed Riemannian score matching method significantly improves the accuracy of the generated molecular structures, attaining chemical accuracy. The implications of this study extend to various applications in computational chemistry, offering a robust tool for accurate molecular structure prediction.


How Susceptible are LLMs to Influence in Prompts?

Anagnostidis, Sotiris, Bulian, Jannis

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are highly sensitive to prompts, including additional context provided therein. As LLMs grow in capability, understanding their prompt-sensitivity becomes increasingly crucial for ensuring reliable and robust performance, particularly since evaluating these models becomes more challenging. In this work, we investigate how current models (Llama, Mixtral, Falcon) respond when presented with additional input from another model, mimicking a scenario where a more capable model -- or a system with access to more external information -- provides supplementary information to the target model. Across a diverse spectrum of question-answering tasks, we study how an LLM's response to multiple-choice questions changes when the prompt includes a prediction and explanation from another model. Specifically, we explore the influence of the presence of an explanation, the stated authoritativeness of the source, and the stated confidence of the supplementary input. Our findings reveal that models are strongly influenced, and when explanations are provided they are swayed irrespective of the quality of the explanation. The models are more likely to be swayed if the input is presented as being authoritative or confident, but the effect is small in size. This study underscores the significant prompt-sensitivity of LLMs and highlights the potential risks of incorporating outputs from external sources without thorough scrutiny and further validation. As LLMs continue to advance, understanding and mitigating such sensitivities will be crucial for their reliable and trustworthy deployment.


Characterization of Locality in Spin States and Forced Moves for Optimizations

Sato, Yoshiki, Konoshima, Makiko, Tamura, Hirotaka, Ohkubo, Jun

arXiv.org Artificial Intelligence

Ising formulations are widely utilized to solve combinatorial optimization problems, and a variety of quantum or semiconductor-based hardware has recently been made available. In combinatorial optimization problems, the existence of local minima in energy landscapes is problematic to use to seek the global minimum. We note that the aim of the optimization is not to obtain exact samplings from the Boltzmann distribution, and there is thus no need to satisfy detailed balance conditions. In light of this fact, we develop an algorithm to get out of the local minima efficiently while it does not yield the exact samplings. For this purpose, we utilize a feature that characterizes locality in the current state, which is easy to obtain with a type of specialized hardware. Furthermore, as the proposed algorithm is based on a rejection-free algorithm, the computational cost is low. In this work, after presenting the details of the proposed algorithm, we report the results of numerical experiments that demonstrate the effectiveness of the proposed feature and algorithm.


DP-Fast MH: Private, Fast, and Accurate Metropolis-Hastings for Large-Scale Bayesian Inference

Zhang, Wanrong, Zhang, Ruqi

arXiv.org Machine Learning

Bayesian inference provides a principled framework for learning from complex data and reasoning under uncertainty. It has been widely applied in machine learning tasks such as medical diagnosis, drug design, and policymaking. In these common applications, data can be highly sensitive. Differential privacy (DP) offers data analysis tools with powerful worst-case privacy guarantees and has been developed as the leading approach in privacy-preserving data analysis. In this paper, we study Metropolis-Hastings (MH), one of the most fundamental MCMC methods, for large-scale Bayesian inference under differential privacy. While most existing private MCMC algorithms sacrifice accuracy and efficiency to obtain privacy, we provide the first exact and fast DP MH algorithm, using only a minibatch of data in most iterations. We further reveal, for the first time, a three-way trade-off among privacy, scalability (i.e. the batch size), and efficiency (i.e. the convergence rate), theoretically characterizing how privacy affects the utility and computational cost in Bayesian inference. We empirically demonstrate the effectiveness and efficiency of our algorithm in various experiments.


Learned Mappings for Targeted Free Energy Perturbation between Peptide Conformations

Willow, Soohaeng Yoo, Kang, Lulu, Minh, David D. L.

arXiv.org Machine Learning

Targeted free energy perturbation uses an invertible mapping to promote configuration space overlap and the convergence of free energy estimates. However, developing suitable mappings can be challenging. Wirnsberger et al. (2020) demonstrated the use of machine learning to train deep neural networks that map between Boltzmann distributions for different thermodynamic states. Here, we adapt their approach to free energy differences of a flexible bonded molecule, deca-alanine, with harmonic biases with different spring centers. When the neural network is trained until ``early stopping'' - when the loss value of the test set increases - we calculate accurate free energy differences between thermodynamic states with spring centers separated by 1 \r{A} and sometimes 2 \r{A}. For more distant thermodynamic states, the mapping does not produce structures representative of the target state and the method does not reproduce reference calculations.