autoregressive network
Estimation of the reduced density matrix and entanglement entropies using autoregressive networks
Białas, Piotr, Korcyl, Piotr, Stebel, Tomasz, Zapolski, Dawid
We present an application of autoregressive neural networks to Monte Carlo simulations of quantum spin chains using the correspondence with classical two-dimensional spin systems. We use a hierarchy of neural networks capable of estimating conditional probabilities of consecutive spins to evaluate elements of reduced density matrices directly. Using the Ising chain as an example, we calculate the continuum limit of the ground state's von Neumann and Rényi bipartite entanglement entropies of an interval built of up to 5 spins. We demonstrate that our architecture is able to estimate all the needed matrix elements with just a single training for a fixed time discretization and lattice volume. Our method can be applied to other types of spin chains, possibly with defects, as well as to estimating entanglement entropies of thermal states at non-zero temperature.
Hierarchical autoregressive neural networks in three-dimensional statistical system
Białas, Piotr, Chahar, Vaibhav, Korcyl, Piotr, Stebel, Tomasz, Winiarski, Mateusz, Zapolski, Dawid
Autoregressive Neural Networks (ANN) have been recently proposed as a mechanism to improve the efficiency of Monte Carlo algorithms for several spin systems. The idea relies on the fact that the total probability of a configuration can be factorized into conditional probabilities of each spin, which in turn can be approximated by a neural network. Once trained, the ANNs can be used to sample configurations from the approximated probability distribution and to evaluate explicitly this probability for a given configuration. It has also been observed that such conditional probabilities give access to information-theoretic observables such as mutual information or entanglement entropy. So far, these methods have been applied to two-dimensional statistical systems or one-dimensional quantum systems. In this paper, we describe a generalization of the hierarchical algorithm to three spatial dimensions and study its performance on the example of the Ising model. We discuss the efficiency of the training and also describe the scaling with the system's dimensionality by comparing results for two- and three-dimensional Ising models with the same number of spins. Finally, we provide estimates of thermodynamical observables for the three-dimensional Ising model, such as the entropy and free energy in a range of temperatures across the phase transition.
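The factorization described in this abstract can be illustrated with a minimal sketch. It is not the hierarchical architecture of the paper: the class `ToyAutoregressiveModel`, its random (untrained) weights, and the helper `total_probability` are hypothetical names chosen for illustration. Each conditional q(s_i = +1 | s_1, ..., s_{i-1}) is a logistic function of the earlier spins, so the same object can both draw a configuration and return its exact probability.

```python
import itertools
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ToyAutoregressiveModel:
    """q(s) = prod_i q(s_i | s_1..s_{i-1}) for spins s_i in {-1, +1}."""

    def __init__(self, n_spins, seed=0):
        rng = random.Random(seed)
        self.n = n_spins
        # weights[i] has i entries: spin i depends only on spins j < i.
        self.weights = [[rng.gauss(0.0, 0.5) for _ in range(i)]
                        for i in range(n_spins)]
        self.biases = [rng.gauss(0.0, 0.5) for _ in range(n_spins)]

    def cond_prob_up(self, spins, i):
        """q(s_i = +1 | s_<i); zip stops after the first i spins."""
        field = self.biases[i] + sum(
            w * s for w, s in zip(self.weights[i], spins))
        return sigmoid(field)

    def log_prob(self, spins):
        """Exact ln q(s), the sum of the logs of the conditionals."""
        total = 0.0
        for i, s in enumerate(spins):
            p_up = self.cond_prob_up(spins, i)
            total += math.log(p_up if s == 1 else 1.0 - p_up)
        return total

    def sample(self, rng):
        """Draw spins one at a time, each conditioned on the previous ones."""
        spins = []
        for i in range(self.n):
            p_up = self.cond_prob_up(spins, i)
            spins.append(1 if rng.random() < p_up else -1)
        return spins, self.log_prob(spins)


def total_probability(model):
    """Sum q(s) over all 2^n configurations; should equal 1 by construction."""
    return sum(math.exp(model.log_prob(list(c)))
               for c in itertools.product([-1, 1], repeat=model.n))
```

Because every conditional is normalized on its own, the product is normalized for any choice of weights, which is why no partition function is needed to evaluate q(s).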
Statistical Mechanics Calculations Using Variational Autoregressive Networks and Quantum Annealing
Tamura, Yuta, Ohzeki, Masayuki
In statistical mechanics, computing the partition function is generally difficult. An approximation method using a variational autoregressive network (VAN) has been proposed recently. This approach offers the advantage of directly calculating the generation probabilities while obtaining a large number of samples. The present study introduces a novel approximation method that combines VAN with samples drawn from quantum annealing machines; these samples are empirically assumed to follow the Gibbs-Boltzmann distribution. When applied to the finite-size Sherrington-Kirkpatrick model, the proposed method demonstrates enhanced accuracy compared to the traditional VAN approach and other approximate methods, such as the widely used naive mean field.
Forecasting Vehicle Pitch of a Lightweight Underwater Vehicle Manipulator System with Recurrent Neural Networks
Kolano, Hannah, Davidson, Joseph R.
As Underwater Vehicle Manipulator Systems (UVMSs) have gotten smaller and lighter over the past years, it is becoming increasingly important to consider the coupling forces between the manipulator and the vehicle when planning and controlling the system. However, typical methods of handling these forces require an exact hydrodynamic model of the vehicle and access to low-level torque control on the manipulator, both of which are uncommon in the field. Therefore, many UVMS control methods are kinematics-based, which cannot inherently account for these effects. Our work bridges the gap between kinematic control and dynamics by training a recurrent neural network on simulated UVMS data to predict the pitch of the vehicle in the future based on the system's previous states. Kinematic planners and controllers can use this metric to incorporate dynamic knowledge without a computationally expensive model, improving their ability to perform underwater manipulation tasks.
Gradient estimators for normalising flows
Bialas, Piotr, Korcyl, Piotr, Stebel, Tomasz
First expressed as an algorithm for a simple classical statistical mechanics problem by Metropolis et al. [1], the Monte Carlo method is ubiquitous as a tool for dealing with complicated probability distributions (see for example [2]). In many cases one resorts to the construction of an associated Markov chain of consecutive proposals, which provides a mathematically grounded way of generating samples from a given distribution even when its normalization is not known. The only limiting factor of the approach is the statistical uncertainty, which depends directly on the number of statistically independent configurations. Hence, the effectiveness of any such simulation algorithm can be linked to its autocorrelation time, which quantifies how many configurations are produced before a new, statistically independent configuration appears. For systems close to phase transitions, the growth of autocorrelation times, a phenomenon called critical slowing down, is usually the main factor limiting the statistical precision of the results. The recent interest in machine learning techniques has offered possible ways of dealing with this problem. Ref. [3] and later Ref. [4] proposed autoregressive neural networks as a mechanism for generating independent configurations that can be used as proposals in the construction of the Markov chain. The new algorithm was hence called Neural Markov Chain Monte Carlo (NMCMC). Once the neural network is sufficiently well trained, one indeed finds that autocorrelation times are significantly reduced, as was demonstrated for the two-dimensional Ising model in Ref. [5].
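A minimal sketch of the accept/reject step behind NMCMC follows, under loudly stated assumptions: the function `propose` is a hypothetical stand-in for a trained network (it draws independent spins and returns their exact log-probability), and the target is a small open Ising chain chosen only for illustration. Since proposals do not depend on the current state, the acceptance probability is min(1, p(s') q(s) / (p(s) q(s'))), where p is the unnormalized Boltzmann weight.

```python
import math
import random


def ising_chain_energy(spins, coupling=1.0):
    """Energy of an open 1D Ising chain: H = -J * sum_i s_i * s_{i+1}."""
    return -coupling * sum(a * b for a, b in zip(spins, spins[1:]))


def propose(rng, n_spins, p_up=0.5):
    """Hypothetical stand-in for a trained network: returns (s, ln q(s))."""
    spins = [1 if rng.random() < p_up else -1 for _ in range(n_spins)]
    n_up = spins.count(1)
    log_q = n_up * math.log(p_up) + (n_spins - n_up) * math.log(1.0 - p_up)
    return spins, log_q


def neural_mcmc(n_steps, n_spins, beta, seed=0):
    """Metropolis-Hastings chain with state-independent proposals."""
    rng = random.Random(seed)
    current, log_q_cur = propose(rng, n_spins)
    log_p_cur = -beta * ising_chain_energy(current)  # unnormalized log target
    chain, n_accepted = [], 0
    for _ in range(n_steps):
        candidate, log_q_new = propose(rng, n_spins)
        log_p_new = -beta * ising_chain_energy(candidate)
        # ln of p(new) q(cur) / (p(cur) q(new)); the partition function cancels.
        log_ratio = (log_p_new - log_q_new) - (log_p_cur - log_q_cur)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            current, log_p_cur, log_q_cur = candidate, log_p_new, log_q_new
            n_accepted += 1
        chain.append(current)  # a rejected step repeats the current state
    return chain, n_accepted / n_steps
```

The closer q is to the Boltzmann distribution, the closer the acceptance rate is to one and the shorter the autocorrelation time; with a poor proposal the chain repeats the same configuration many times.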
Compact Autoregressive Network
Wang, Di, Huang, Feiqing, Zhao, Jingyu, Li, Guodong, Tian, Guangjian
Recurrent neural networks (RNN) and their variants, such as Long-Short Term Memory (Hochreiter and Schmidhuber, 1997) and Gated Recurrent Unit (Cho et al., 2014), are commonly used as the default architecture, or even as a synonym, for sequence modeling by deep learning practitioners (Goodfellow et al., 2016). Meanwhile, especially for high-dimensional time series, we may also consider autoregressive modeling or multi-task learning, $\hat{y}_t = f(y_{t-1}, y_{t-2}, \ldots, y_{t-P})$, (1) where the output $\hat{y}_t$ and each input $y_{t-i}$ are $N$-dimensional, and the lag $P$ can be very large to accommodate sequential dependence. Some non-recurrent feed-forward networks with convolutional or other architectures have been proposed recently for sequence modeling, and are shown to have state-of-the-art accuracy. For example, some autoregressive networks, such as PixelCNN (Van den Oord et al., 2016b) and WaveNet (Van den Oord et al., 2016a) for image and audio sequence modeling, are compelling alternatives to the recurrent networks. This paper aims at the autoregressive model (1) with a large number of sequences. This problem can be implemented by a fully connected network with $NP$ inputs and $N$ outputs.
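To make the shape of model (1) concrete, here is a minimal sketch that uses a single linear layer as the simplest fully connected network with NP inputs and N outputs. The function names `stack_lags` and `predict` and all numbers are illustrative assumptions, not taken from the paper.

```python
def stack_lags(series, t, n_lags):
    """Concatenate y_{t-1}, ..., y_{t-P} into a single input of length N * P."""
    if t < n_lags:
        raise ValueError("need at least n_lags observations before time t")
    stacked = []
    for lag in range(1, n_lags + 1):
        stacked.extend(series[t - lag])
    return stacked


def predict(weights, bias, series, t, n_lags):
    """One fully connected layer: N outputs, each a weighted sum of N * P inputs."""
    x = stack_lags(series, t, n_lags)
    return [b + sum(w * v for w, v in zip(row, x))
            for row, b in zip(weights, bias)]


# Toy data with N = 2 series and P = 2 lags (all numbers are illustrative).
series = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights = [[1.0, 0.0, 0.0, 0.0],   # output 0 copies component 0 of y_{t-1}
           [0.0, 0.0, 0.0, 1.0]]   # output 1 copies component 1 of y_{t-2}
bias = [0.5, -0.5]
forecast = predict(weights, bias, series, t=2, n_lags=2)
```

Even this single layer carries N x NP weights, so the parameter count grows quickly when both the number of series and the lag are large.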
Neural-network based general method for statistical mechanics on sparse systems
Pan, Feng, Zhou, Hai-Jun, Zhang, Pan
School of Physical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China (Dated: June 27, 2019) We propose a general method for solving statistical mechanics problems defined on sparse graphs, such as random graphs, real-world networks, and low-dimensional lattices. Our approach extracts a small feedback vertex set of the sparse graph, converting the sparse system to a strongly correlated system with many-body and dense interactions on the feedback set, then solves it using a variational method based on neural networks to estimate free energy and observables, and to generate unbiased samples via direct sampling. Extensive experiments show that our approach is more accurate than existing approaches for sparse spin glass systems. On random graphs and real-world networks, our approach significantly outperforms the standard methods for sparse systems such as belief propagation; on structured sparse systems such as two-dimensional lattices, our approach is significantly faster and more accurate than recently proposed variational autoregressive networks using convolutional neural networks. On dense systems, VAN uses multilayered ... Many systems in science and technology are sparse.
Solving Statistical Mechanics using Variational Autoregressive Networks
Wu, Dian, Wang, Lei, Zhang, Pan
We propose a general framework for solving statistical mechanics of systems with a finite size. The approach extends the celebrated variational mean-field approaches using autoregressive neural networks, which support direct sampling and exact calculation of the normalized probability of configurations. The network computes the variational free energy, estimates physical quantities such as entropy, magnetizations and correlations, and generates uncorrelated samples all at once. Training of the network employs the policy gradient approach from reinforcement learning, which gives an unbiased estimate of the gradient with respect to the variational parameters. We apply our approach to several classical systems, including 2-d Ising models, the Hopfield model, Sherrington-Kirkpatrick spin glasses, and the inverse Ising model, to demonstrate its advantages over existing variational mean-field methods. Our approach sheds light on solving statistical physics problems using modern deep generative neural networks.
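The variational free energy and its policy-gradient estimate can be sketched as follows. This is a hedged illustration, not the paper's implementation: a product (mean-field) ansatz with q(s_i = +1) = sigmoid(theta_i) replaces the autoregressive network, the target is a small open Ising chain, and every function name is hypothetical. The quantity F_q = (1/beta) E_q[ln q(s) + beta E(s)] is an upper bound on the true free energy, and its gradient is (1/beta) E_q[(R(s) - b) d ln q(s)/d theta] with R(s) = ln q(s) + beta E(s). The baseline b is a common variance-reduction choice assumed here.

```python
import itertools
import math
import random


def energy(spins, coupling=1.0):
    """Open 1D Ising chain, H = -J * sum_i s_i * s_{i+1} (toy target system)."""
    return -coupling * sum(a * b for a, b in zip(spins, spins[1:]))


def exact_free_energy(n_spins, beta):
    """F = -(1/beta) ln Z by brute-force enumeration; feasible only for tiny n."""
    z = sum(math.exp(-beta * energy(s))
            for s in itertools.product([-1, 1], repeat=n_spins))
    return -math.log(z) / beta


def prob_up(theta_i):
    return 1.0 / (1.0 + math.exp(-theta_i))


def sample_with_log_q(theta, rng):
    """Draw s from the product ansatz q and return (s, ln q(s))."""
    spins, log_q = [], 0.0
    for theta_i in theta:
        p = prob_up(theta_i)
        s = 1 if rng.random() < p else -1
        spins.append(s)
        log_q += math.log(p if s == 1 else 1.0 - p)
    return spins, log_q


def free_energy_and_gradient(theta, beta, n_samples, rng):
    """Monte Carlo estimates of F_q and of dF_q/dtheta (score-function form)."""
    samples = [sample_with_log_q(theta, rng) for _ in range(n_samples)]
    # R(s) = ln q(s) + beta * E(s); F_q is the mean of R divided by beta.
    rewards = [log_q + beta * energy(s) for s, log_q in samples]
    baseline = sum(rewards) / n_samples  # sample-mean baseline (assumption)
    grad = [0.0] * len(theta)
    for (spins, _), r in zip(samples, rewards):
        for i, (s, theta_i) in enumerate(zip(spins, theta)):
            # d ln q / d theta_i = 1 - p for s = +1 and -p for s = -1.
            score = (1.0 if s == 1 else 0.0) - prob_up(theta_i)
            grad[i] += (r - baseline) * score
    return baseline / beta, [g / (n_samples * beta) for g in grad]
```

A training loop would repeatedly subtract a small multiple of the returned gradient from `theta`. With an autoregressive network in place of the product ansatz, the same estimator applies, with the score computed by backpropagation through ln q.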
LogitBoost autoregressive networks
Multivariate binary distributions can be decomposed into products of univariate conditional distributions. Recently popular approaches have modeled these conditionals through neural networks with sophisticated weight-sharing structures. It is shown that state-of-the-art performance on several standard benchmark datasets can actually be achieved by training separate probability estimators for each dimension. In that case, model training can be trivially parallelized over data dimensions. On the other hand, complexity control has to be performed for each learned conditional distribution. Three possible methods are considered and experimentally compared. The estimator that is employed for each conditional is LogitBoost. Similarities and differences between the proposed approach and autoregressive models based on neural networks are discussed in detail.