It is well established that neural networks with deep architectures perform better than shallow networks for many tasks in machine learning. In statistical physics, while there has been recent interest in representing physical data with generative modelling, the focus has been on shallow neural networks. A natural question to ask is whether deep neural networks hold any advantage over shallow networks in representing such data. We investigate this question by using unsupervised, generative graphical models to learn the probability distribution of a two-dimensional Ising system. Deep Boltzmann machines, deep belief networks, and deep restricted Boltzmann networks are trained on thermal spin configurations from this system, and compared to the shallow architecture of the restricted Boltzmann machine. We benchmark the models, focussing on the accuracy of generating energetic observables near the phase transition, where these quantities are most difficult to approximate. Interestingly, after training the generative networks, we observe that the accuracy essentially depends only on the number of neurons in the first hidden layer of the network, and not on other model details such as network depth or model type. This is evidence that shallow networks are more efficient than deep networks at representing physical probability distributions associated with Ising systems near criticality.
We employ unsupervised machine learning techniques to learn latent parameters which best describe states of the two-dimensional Ising model and the three-dimensional XY model. These methods range from principal component analysis to artificial neural network based variational autoencoders. The states are sampled using a Monte-Carlo simulation above and below the critical temperature. We find that the predicted latent parameters correspond to the known order parameters. The latent representation of the states of the models in question are clustered, which makes it possible to identify phases without prior knowledge of their existence or the underlying Hamiltonian. Furthermore, we find that the reconstruction loss function can be used as a universal identifier for phase transitions.
Generative modeling, which learns joint probability distribution from training data and generates samples according to it, is an important task in machine learning and artificial intelligence. Inspired by probabilistic interpretation of quantum physics, we propose a generative model using matrix product states, which is a tensor network originally proposed for describing (particularly one-dimensional) entangled quantum states. Our model enjoys efficient learning by utilizing the density matrix renormalization group method which allows dynamic adjusting dimensions of the tensors, and offers an efficient direct sampling approach, Zipper, for generative tasks. We apply our method to generative modeling of several standard datasets including the principled Bars and Stripes, random binary patterns and the MNIST handwritten digits, to illustrate ability of our model, and discuss features as well as drawbacks of our model over popular generative models such as Hopfield model, Boltzmann machines and generative adversarial networks. Our work shed light on many interesting directions for future exploration on the development of quantum-inspired algorithms for unsupervised machine learning, which is of possibility of being realized by a quantum device.
We discuss the relevance of the recent Machine Learning (ML) literature for economics and econometrics. First we discuss the differences in goals, methods and settings between the ML literature and the traditional econometrics and statistics literatures. Then we discuss some specific methods from the machine learning literature that we view as important for empirical researchers in economics. These include supervised learning methods for regression and classification, unsupervised learning methods, as well as matrix completion methods. Finally, we highlight newly developed methods at the intersection of ML and econometrics, methods that typically perform better than either off-the-shelf ML or more traditional econometric methods when applied to particular classes of problems, problems that include causal inference for average treatment effects, optimal policy estimation, and estimation of the counterfactual effect of price changes in consumer choice models.