Goto

Collaborating Authors

 conv2d




Optimized Machine Learning Methods for Studying the Thermodynamic Behavior of Complex Spin Systems

Kapitan, Dmitrii, Ovchinnikov, Pavel, Soldatov, Konstantin, Andriushchenko, Petr, Kapitan, Vitalii

arXiv.org Artificial Intelligence

This paper presents a systematic study of the application of convolutional neural networks (CNNs) as an efficient and versatile tool for the analysis of critical and low-temperature phase states in spin system models. The problem of calculating the dependence of the average energy on the spatial distribution of exchange integrals for the Edwards-Anderson model on a square lattice with frustrated interactions is considered. We further construct a single convolutional classifier of phase states of the ferromagnetic Ising model on square, triangular, honeycomb, and kagome lattices, trained on configurations generated by the Swendsen-Wang cluster algorithm. Computed temperature profiles of the averaged posterior probability of the high-temperature phase form clear S-shaped curves that intersect in the vicinity of the theoretical critical temperatures and allow one to determine the critical temperature for the kagome lattice without additional retraining. It is shown that convolutional models substantially reduce the root-mean-square error (RMSE) compared with fully connected architectures and efficiently capture complex correlations between thermodynamic characteristics and the structure of magnetic correlated systems.


Understanding Diffusion Models via Code Execution

Yu, Cheng

arXiv.org Artificial Intelligence

Diffusion models have achieved remarkable performance in generative modeling, yet their theoretical foundations are often intricate, and the gap between mathematical formulations in papers and practical open-source implementations can be difficult to bridge. Existing tutorials primarily focus on deriving equations, offering limited guidance on how diffusion models actually operate in code. To address this, we present a concise implementation of approximately 300 lines that explains diffusion models from a code-execution perspective. Our minimal example preserves the essential components -- including forward diffusion, reverse sampling, the noise-prediction network, and the training loop -- while removing unnecessary engineering details. This technical report aims to provide researchers with a clear, implementation-first understanding of how diffusion models work in practice and how code and theory correspond. Our code and pre-trained models are available at: https://github.com/disanda/GM/tree/main/DDPM-DDIM-ClassifierFree.


Improved Training Mechanism for Reinforcement Learning via Online Model Selection

Afshar, Aida, Pacchiano, Aldo

arXiv.org Artificial Intelligence

We study the problem of online model selection in reinforcement learning, where the selector has access to a class of reinforcement learning agents and learns to adaptively select the agent with the right configuration. Our goal is to establish the improved efficiency and performance gains achieved by integrating online model selection methods into reinforcement learning training procedures. We examine the theoretical characterizations that are effective for identifying the right configuration in practice, and address three practical criteria from a theoretical perspective: 1) Efficient resource allocation, 2) Adaptation under non-stationary dynamics, and 3) Training stability across different seeds. Our theoretical results are accompanied by empirical evidence from various model selection tasks in reinforcement learning, including neural architecture selection, step-size selection, and self model selection.


Differentiable Physics-Neural Models enable Learning of Non-Markovian Closures for Accelerated Coarse-Grained Physics Simulations

Xue, Tingkai, Ooi, Chin Chun, Ge, Zhengwei, Leong, Fong Yew, Li, Hongying, Kang, Chang Wei

arXiv.org Artificial Intelligence

Numerical simulations provide key insights into many physical, real-world problems. However, while these simulations are solved on a full 3D domain, most analysis only require a reduced set of metrics (e.g. plane-level concentrations). This work presents a hybrid physics-neural model that predicts scalar transport in a complex domain orders of magnitude faster than the 3D simulation (from hours to less than 1 min). This end-to-end differentiable framework jointly learns the physical model parameterization (i.e. orthotropic diffusivity) and a non-Markovian neural closure model to capture unresolved, 'coarse-grained' effects, thereby enabling stable, long time horizon rollouts. This proposed model is data-efficient (learning with 26 training data), and can be flexibly extended to an out-of-distribution scenario (with a moving source), achieving a Spearman correlation coefficient of 0.96 at the final simulation time. Overall results show that this differentiable physics-neural framework enables fast, accurate, and generalizable coarse-grained surrogates for physical phenomena.



Supplementary Material: Learning Distilled Collaboration Graph for Multi-Agent Perception

Neural Information Processing Systems

V ehicles are spawned in CARLA via SUMO, and managed by the Traffic Manager. We employ the dataset format of the nuScenes and extend it to multi-agent scenarios, seen in Fig. IV. Each log file can produce 100 scenes, and each scene includes 100 frames. The input BEV map's dimension is (c, w,h) = (13, 256, 256). II.1 Architecture of student/teacher encoder We describe the architecture of the encoder below.